I. Introduction and Purpose
This series of blog posts is an adaptation of a set of classes on unit-testing that I teach at ProQuest (a leading academic and corporate search engine for ‘paywalled’ articles and data). ProQuest has been kind enough to allow me to publish this series for anyone to read and learn from.
This is the first in a five-part series on unit-testing in Java. The complete series covers:
- What is unit-testing? Why do we do it, and what do we gain?
- An exercise in “TDD”, test-driven development.
- Writing tests with “EasyMock” mock objects.
- Patterns for cleaning up existing (messy) unit-tests.
- Advanced unit-testing. (Possibly including taking a big ugly mess of legacy code, and making it testable.)
The goal of this series is to improve developers’ skill and comfort level with unit-tests, and thereby to improve the overall number and quality of unit-tests in any Java application.
II. What is unit-testing?
Mike Cohn breaks automated software testing into three layers of a triangle, based on the “Three Little Pigs” story:
- Straw is the end-to-end, GUI testing, usually driven from a framework like Selenium. It’s the easiest and quickest to build, and covers a lot of ground. But it’s terribly brittle, falls over easily, and is expensive to maintain.
- Sticks are “integration” testing. This means testing high-level (but not GUI) classes against real (often external) services and dependencies. Integration tests are often driven from Junit.
- Bricks are “unit-testing”. They are solid, expensive to build, but cheap to maintain. They test small “units” of code: the smaller, the better! They are totally self-contained. In fact, I frequently run all of our unit-tests without a network connection, just to make certain!
As an aside, Cohn recommends that the investment into testing match the areas of the three different parts of the triangle. All three levels are valuable, and for different reasons. And manual testing is still useful too, but that’s another story.
(I stole the triangle from Patrick Wilson-Welsh’s excellent article about flipping the testing triangle, which is well worth reading separately. Plus it has cool pictures.)
A Unit-test Exercise
OK, but what does a unit-test actually look like? What does it test?
When I interview Java developer candidates, I give them a simple coding exercise to do in-person. (A practice which I borrowed from Menlo.) It’s just a simple “reverse the words in text” exercise, but I require that they write unit-tests along the way. Here’s the exercise class itself, under src/main/java:
Then, under src/test/java, there’s a matching unit-test class:
We use the Junit framework to run the tests. In Eclipse, this is as simple as right-clicking in the test class, and selecting “run as Junit Test”. Every method annotated with “@Test” gets run. The assertEquals() method (and a whole host of other assertions) is provided by Junit. If the assert succeeds, the test passes, and Eclipse shows a green bar. If any assert fails, we get a red bar, and Eclipse tells us what failed and why.
So a unit-test is exercising a small chunk of code, with defined inputs and outputs. A nicely contained black-box, if you will. Sometimes those boxes can get pretty big… but more about that later. But in essence, a unit-test is an experimental framework, wrapped around the “black-box” we want to test, that tells us if the box does everything we think it should do.
There’s a key phrase right there: everything we think it should do. When I’m interviewing a candidate, I’m looking for:
- Could they actually write the code? Did they write a method that passed shouldReverseHelloWorld()? (You may think that’s trivial, but do it under pressure with an interviewer watching you.)
- Did they write other tests? Tests with more text and spaces. Text with multiple lines.
- Did they consider edge cases? A null input. An empty line. White-space characters other than “space”? (E.g. tabs, weird Unicode spaces, etc.)
- Did their tests actually exercise all of the code that they wrote? (We’ll demonstrate how to see that in a little bit.)
This is all about “did it do everything we think it should do”. Not just the happy paths, but edge cases, null inputs, and so on. We have to think all the way around the box, in order to be sure. A good set of unit-tests covers all of those cases, while managing to avoid exponential explosion. (We don’t really need to test the entire works of Shakespeare.)
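To make “thinking all the way around the box” concrete, here is a minimal sketch of one possible solution, with a few checks for it. The names WordReverser and reverseWords, the choice to return an empty string for null, and every check name other than shouldReverseHelloWorld() are assumptions for illustration, not the original exercise code. The checks are plain Java so the sketch compiles without Junit on the classpath; in a real test class each one would be an @Test method calling assertEquals().

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the exercise class; names are assumptions.
final class WordReverser {
    // With UNICODE_CHARACTER_CLASS, \s matches Unicode white space
    // (tabs, line breaks, em spaces, ...) and not only ASCII white space.
    private static final Pattern WHITE_SPACE =
            Pattern.compile("\\s+", Pattern.UNICODE_CHARACTER_CLASS);

    // Assumption: null and blank input both yield an empty string.
    static String reverseWords(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder reversed = new StringBuilder();
        for (String word : WHITE_SPACE.split(text)) {
            if (word.isEmpty()) {
                continue; // leading white space produces one empty token
            }
            if (reversed.length() > 0) {
                reversed.insert(0, ' ');
            }
            reversed.insert(0, word);
        }
        return reversed.toString();
    }
}

// Plain-Java stand-ins for Junit tests: one check per case we thought of.
final class WordReverserChecks {
    static void shouldReverseHelloWorld() {
        check("world hello", WordReverser.reverseWords("hello world"));
    }

    static void shouldTreatNullAndEmptyAsEmpty() {
        check("", WordReverser.reverseWords(null));
        check("", WordReverser.reverseWords(""));
        check("", WordReverser.reverseWords("   "));
    }

    static void shouldSplitOnTabsAndUnicodeSpaces() {
        // \u2003 is an em space
        check("c b a", WordReverser.reverseWords("a\tb\u2003c"));
    }

    static void shouldReverseAcrossLines() {
        check("four three two one", WordReverser.reverseWords("one two\nthree four"));
    }

    static void runAll() {
        shouldReverseHelloWorld();
        shouldTreatNullAndEmptyAsEmpty();
        shouldSplitOnTabsAndUnicodeSpaces();
        shouldReverseAcrossLines();
    }

    // Junit equivalent: assertEquals(expected, actual)
    private static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }
}
```

Four small checks cover the happy path and the edge cases from the list above, without needing the entire works of Shakespeare.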
So try it for yourselves. If you’ve already checked out the CodeExercise project from git, it’s in the package com.proquest.codeexercise. Alternately you can check it out via svn from svn://caucus.com/CodeExercise/trunk.
More about Junit
Junit has several other method annotations that may be used in a test class:
| Annotation | Effect |
|---|---|
| @Before | Execute this method before every @Test method. Put common initialization/preparation code in this method. |
| @After | Execute this method after every @Test method. Put common cleanup code here. (Rarely needed.) |
| @Ignore | Ignore (don’t run) an @Test method. If you do this, add a comment explaining why! |
| @BeforeClass | Run this method once, before executing any tests. Used mostly to set up statics. Rare (because statics are evil!) |
| @AfterClass | Run this method once, after executing all tests. Used mostly to clean up statics. (I use it to kill threads accidentally started by some tests.) |
| @Test(expected=Exception.class) | Test fails if the method does not throw the named exception. |
Junit also has a suite of assert methods. The most commonly used ones include:
- assertEquals(a, b)
- assertTrue(a)
- fail() (If you get to this point in the code, the test has failed! Often used with try/catch.)
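The try/catch-plus-fail() pattern is easiest to see in code. Here is a minimal sketch, assuming a hypothetical Percentage class that rejects out-of-range values. So the sketch compiles without Junit on the classpath, an explicit `throw new AssertionError(...)` stands in for fail() (which itself throws an AssertionError); in a real test the method would be annotated with @Test and call fail() at that point.

```java
// Hypothetical class under test; not part of any real project.
final class Percentage {
    private final int value;

    Percentage(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("percentage out of range: " + value);
        }
        this.value = value;
    }

    int value() {
        return value;
    }
}

final class PercentageChecks {
    static void shouldRejectNegativeValue() {
        try {
            new Percentage(-1);
            // Reaching this line means no exception was thrown.
            // Junit equivalent: fail("expected IllegalArgumentException");
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            // Unlike @Test(expected=...), this form lets us inspect the exception.
            if (!expected.getMessage().contains("-1")) {
                throw new AssertionError("message should name the bad value");
            }
        }
    }
}
```

Because an AssertionError is not an IllegalArgumentException, the catch block cannot accidentally swallow the failure.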
More detailed information can be found at:
- http://www.vogella.com/articles/JUnit/article.html (a good tutorial)
- https://github.com/junit-team/junit/wiki (The official Junit Wiki)
III. Why Unit-test?
Why do we write unit-tests? What’s the benefit? After all, if we’re writing the code correctly in the first place, why do we need the extra cost of writing tests?
The simple answer is, we’re all human. We make mistakes. Tests help protect us from our mistakes. Remember the Mars Climate Orbiter, which was destroyed because the on-board software used metric units, and the ground-control software used English units. Keeping the tests as part of the project means we can run all of them at any time, and raise our confidence that we haven’t made any new mistakes.
But the real answer is much deeper than that. Over the course of several years, I found that writing unit-tests forced me to write better code. Not just code with fewer mistakes, but literally better code. Like many things, this is a lesson perhaps better learned through experience. But I can sum it up in these bullet points:
- Fewer mistakes. OK, that’s the easy one. Code that has tests typically has fewer mistakes.
- Change == Surprise. Unit-tests save us from the Law of Unintended Consequences. A unit-test can save us from future bugs that are “accidentally” inserted, especially by “other” people. That’s why, when we’re about to merge our changes into a project, we run all tests. Not just the ones concerned with the code we were working on.
- No Fear. Take #2 a little further. When we don’t have good unit-test coverage, there’s always a risk in making changes to existing, (presumably) working, code. (And if you’re not afraid of this case… well, you should be!) When you do have good test coverage, you have a safety net. You can make changes, fix bugs, improve existing code… and at least have some confidence that if you break anything in a really bad way, the existing tests will catch you. Entirely too much bad code gets written because someone hacked around existing code, fearful of changing its behavior, when they would have been much better off rewriting the (small) relevant part. (I’ve written an unabashedly geeky article, just about how programmers misuse this fear, called the Junit / Green Lantern Oath.)
- Tests are part of the code. They’re not just bolted on afterwards. This forces us to think about how we’ll test a method, as we’re writing it. There’s a methodology called “TDD” (Test-Driven Development), where tests are written first: the design of a method is built up, piece by piece, by adding tests… writing just enough code to pass each test in turn. I’m not religious about TDD. But tests should be written along with the production code, in some form. Use the tests as a kind of scaffolding, supporting the structure as we’re building it.
- Tests document the code. Good tests essentially become the specifications for the code! Even better: add commented links, in each test, to acceptance criteria in the original story. This gives us an easily-travelled highway, from story to code and back again. Writing the tests often exposes ambiguities in the original story.
- Tests force smaller methods. Long methods are simply harder to test. If I know I’m going to be writing tests in the next few minutes, I’m going to make it easy on myself and break the new code into smaller pieces, each of which can be tested independently. And while I’m doing that, I now have the opportunity to name the (new, smaller) methods in a way that is more self-documenting than a longer, single method. These smaller methods with better names in turn invite me to break up the flow of the code into more abstract concepts, that can become a sort of higher-level “domain” language about the problem that is being solved. All because I wanted to write tests!
- Tests force de-coupling. Classes that have tight coupling to other classes are hard to work with. They’re even harder to test. The worst cases are classes that are tied to specific implementations of other classes. These are most often the result of using static methods from another class (see Why Static Classes Are Evil), or from over-using ‘new’ (constructors) of other complicated objects. If tight-coupling gets in the way of testing, then it’s getting in the way of good design, period. I’d rather have a class with a few more arguments in its constructors, than a class that always depends on a specific implementation of another class. When writing tests becomes painful, it’s an early warning that my code is starting to smell.
- Tests force fewer side-effects. The easiest method to test is one that takes some arguments, and returns a (single) result, with no changes to the object’s internal state, or any other side-effects. That’s not often practical: frequently a method has to change its object’s state, by design. In that case the change of state should be testable! (E.g. there should be a getter method that lets you examine the new state.) If the method changes the state of some other object… now we’re sliding down the slippery slope of side-effects. Once again, if it’s hard to test, it’s hard to use, and suggests that the design itself is flawed.
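The last two points, de-coupling and testable state changes, can be sketched together. All of the names below (TimeSource, Session, FakeTimeSource) are hypothetical and exist only for illustration. The idea is that Session receives its time source through the constructor rather than calling a static method such as System.currentTimeMillis() directly, and that the state it changes is visible through a getter.

```java
// Hypothetical abstraction over "what time is it?".
interface TimeSource {
    long currentMillis();
}

final class Session {
    private final TimeSource timeSource;
    private final long timeoutMillis;
    private long lastTouched;

    // The dependency arrives through the constructor, so a test
    // can hand in its own implementation.
    Session(TimeSource timeSource, long timeoutMillis) {
        this.timeSource = timeSource;
        this.timeoutMillis = timeoutMillis;
        this.lastTouched = timeSource.currentMillis();
    }

    // Changes the object's state, by design ...
    void touch() {
        lastTouched = timeSource.currentMillis();
    }

    // ... and the new state can be examined, so the change is testable.
    long getLastTouched() {
        return lastTouched;
    }

    boolean isExpired() {
        return timeSource.currentMillis() - lastTouched > timeoutMillis;
    }
}

// A hand-written fake that a test controls completely.
final class FakeTimeSource implements TimeSource {
    private long now;

    @Override
    public long currentMillis() {
        return now;
    }

    void advance(long millis) {
        now += millis;
    }
}
```

A test can now create a FakeTimeSource, build a Session with a 1000 ms timeout, advance the fake by 1001 ms, and assert that isExpired() is true, all without sleeping or touching the real clock. The price is one extra constructor argument.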
IV. Emma Code Coverage
Once you have unit-tests, it’s really nice to know how much testing you have in place. Enter EMMA, the unit-test writer’s friend. Emma works hand-in-hand with Junit to measure how many lines (or instructions, or blocks, etc.) of your code are actually exercised by your tests. Emma can produce a statistical report, e.g. such-and-such a percentage of the lines of code in project X, or package Y, or class Z, are covered (or not covered) by tests.
With the EclEmma Eclipse-plugin, you can also see visually how much of a class is covered. When the plugin is installed (Help, Eclipse marketplace, find “emma”. Install EclEmma, done), right-click on a test class or an entire project, and choose “coverage as… Junit”. You’ll see lines covered by tests in green, lines not covered in red, and lines that are partly covered (e.g. an if with only one of the two possible outcomes covered) in yellow.
Code coverage isn’t everything, however. It’s possible to write a bad test… that exercises a method, but never actually makes any useful assertions about the result of the test. The best use of Emma is to find the “red” code, paths that are definitely not tested, and add tests that both cover it and verify that the results of the code are correct.
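Here is a small sketch of the difference between covering code and verifying it, using a hypothetical Shipping class. As before, the checks are plain Java stand-ins for @Test methods, so that the sketch compiles without Junit.

```java
// Hypothetical class under test.
final class Shipping {
    // Orders of 50 or more ship free; everything else pays a flat fee of 5.
    static int costFor(int orderTotal) {
        if (orderTotal >= 50) {
            return 0;
        }
        return 5;
    }
}

final class ShippingChecks {
    // Executes costFor, so some lines turn green, but asserts nothing:
    // this check would still pass if costFor returned a wrong value.
    static void coversButDoesNotVerify() {
        Shipping.costFor(60);
    }

    // These two exercise both outcomes of the if, right at the boundary,
    // and verify the results.
    static void shouldChargeFlatFeeBelowThreshold() {
        check(5, Shipping.costFor(49));
    }

    static void shouldShipFreeAtThreshold() {
        check(0, Shipping.costFor(50));
    }

    // Junit equivalent: assertEquals(expected, actual)
    private static void check(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }
}
```

With only coversButDoesNotVerify() in place, a coverage view would be expected to show the `if` line in yellow (one outcome taken), `return 0` in green, and `return 5` in red. The two boundary checks turn everything green and would also catch a wrong threshold.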
V. A Chess Exercise
The rest of this module is devoted to working an actual exercise, writing both tests and code, that involves chess. I’ve found that chess programming problems are small enough to work with, but sufficiently complicated (and object-oriented enough) to make useful exercises. I’ve built the underpinnings of a very simple chess board in the com.proquest.chess package in the CodeExercise project mentioned earlier. It includes some tests, to give you a feel for the pattern.
The goal of this exercise is to add a Rook class that correctly identifies all of the possible moves that a rook can make. For our purposes, ignore whether a move leaves (or puts) a King in check: this is just about where a Rook can move, or capture, depending on its location and other pieces around it.
The fundamental classes are:
- Color. An enum: Black or White.
- Type. Another enum. K, Q, R, B, N, or P. Or Nothing.
- Position. Just an (x,y) pair that represents a spot on the board. Positions are either valid or invalid: an invalid position is “off the board”.
- Piece. An abstract class. All specific pieces (e.g. King, Knight, Rook) extend Piece. A piece has a Color, a Type, and a Position.
- Move. Each move has a position (where a piece ends up after making a move), the type of promotion (for pawns only), and the piece captured by the move (if any).
- Board. The actual board where everything happens, plus collections of the white and black pieces currently on the board. A new Board is empty. The two most useful methods in Board are placePiece (puts a piece on the Board) and print (displays the entire board in an easy-to-see format).
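To give a feel for how small these building blocks are, here is a rough sketch of what a Position-like class could look like. It is a hypothetical stand-in, not the class from com.proquest.chess: the 0..7 coordinate range, the offset() helper, and the method names are all assumptions.

```java
import java.util.Objects;

// Hypothetical stand-in; the real Position in the project may differ.
final class Position {
    static final int BOARD_SIZE = 8;

    private final int x;
    private final int y;

    Position(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() {
        return x;
    }

    int getY() {
        return y;
    }

    // Assumption: coordinates run from 0 to 7; anything else is "off the board".
    boolean isValid() {
        return x >= 0 && x < BOARD_SIZE && y >= 0 && y < BOARD_SIZE;
    }

    // Returns a new Position; the result may be invalid, and callers check isValid().
    Position offset(int dx, int dy) {
        return new Position(x + dx, y + dy);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Position)) {
            return false;
        }
        Position that = (Position) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

A class like this has no side-effects and no dependencies, which makes it about as easy to unit-test as code gets: the corners of the board and the squares just beyond them are the obvious first test cases.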
Take some time to wander through the classes, and their tests (notably BoardTest), to see how they operate. They’re not complete by any means, but they provide enough infrastructure to write the Rook class. We’re not concerned with efficiency here — although if we write the tests correctly, we could improve efficiency later without changing the tests!
Once you feel familiar with the infrastructure, start by creating a Rook class that extends Piece, and returns an empty List of Moves. Then write a test for just that much.
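That first step might look roughly like the sketch below. The Color, Board, Move and Piece types here are empty hypothetical stand-ins so that the sketch compiles on its own; the real classes in com.proquest.chess have more to them, and the getMoves(Board) signature in particular is an assumption, so check the real Piece for the actual method. The check is a plain-Java stand-in for an @Test method.

```java
import java.util.Collections;
import java.util.List;

// Minimal hypothetical stand-ins; the project's real classes differ.
enum Color { WHITE, BLACK }

final class Board { }

final class Move { }

abstract class Piece {
    private final Color color;

    Piece(Color color) {
        this.color = color;
    }

    Color getColor() {
        return color;
    }

    // Assumed signature, for illustration only.
    abstract List<Move> getMoves(Board board);
}

// Step one of the exercise: a Rook that exists but cannot move yet.
final class Rook extends Piece {
    Rook(Color color) {
        super(color);
    }

    @Override
    List<Move> getMoves(Board board) {
        return Collections.emptyList();
    }
}

final class RookChecks {
    // Junit equivalent: @Test, with assertTrue(moves.isEmpty()).
    static void newRookShouldHaveNoMovesYet() {
        Rook rook = new Rook(Color.WHITE);
        List<Move> moves = rook.getMoves(new Board());
        if (!moves.isEmpty()) {
            throw new AssertionError("expected no moves, got " + moves.size());
        }
    }
}
```

It looks almost too trivial to test, but it proves that the class compiles, plugs into the hierarchy, and can be run from a test. Every later test builds on that.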
After that, start working on generating a list of Moves for your Rook. When you write the tests, you may find the ChessTestUtils class helpful.
When you are confident that your Rook is working properly, you can add my unit-tests to the project, and see if they also run green. (Or you can just merge in the branch ‘rooktest’, which is probably easier.) But don’t do this until you are completely confident in your own code (and your own tests!).
VI. Other Resources
- Shaun Abram has a very good article about Software Quality via Unit Testing, where he makes a strong case for the economic value of unit-testing. The first few pages are an excellent introduction for managers who may not be familiar with the ins and outs of coding.
- A good book for total beginners is Pragmatic Unit Testing in Java with Junit. (It’s somewhat dated, as it’s based on an older version of Junit, but still worth the read.)
- I believe every developer should own a copy of “Uncle” Bob Martin’s Clean Code. It’s only partially about unit-testing, but this one book captures the essence of how to write clean code, including clean unit-tests!
- Michael Feathers’ book Working Effectively With Legacy Code offers many, many strategies for how to add tests to a project that doesn’t have any. This is a slow but very valuable read — I found myself just reading a few pages at a time, and pondering them for a day or two before moving on.
- A short but valuable article by Juri Strumpflohner on descriptive assert messages.