November 22, 2010

The todo Test Category

Originally published 26 Jan 2007

Recently, Elliotte Rusty Harold blogged about committing test cases for bugs to the build right away, before the bugs are fixed. He enumerated several good reasons for doing this, such as the fix being difficult and time-consuming. His point was that breaking the build (where the build includes passing all the unit tests) is okay in some cases.

I disagree with this stance, but I think Harold has a really good idea there, which I'll get to in a moment. I think the build does include passing all the unit tests, and running them should be part of the continuous integration process. I've always liked the idea of the tests serving as a safety net when I add new functionality or refactor existing code. The tests not only specify the desired behavior but also act as a regression suite.

So what's the idea? I first learned of the concept of test categorization from Cedric Beust's TestNG framework and the writings of disco king Andrew Glover. Essentially, the concept is to break tests into groups such as unit, component, and system and run them at different intervals. Unit tests are run many times a day as code is developed, component tests less frequently, and system tests the least frequently of all. You can categorize tests in several different ways: by file naming conventions, by placing them in different directories, or by using Java annotations.
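To make the annotation option concrete, here is a minimal sketch that uses only the JDK. The `@Category` annotation, the `PriceTests` class, and the `select` helper are all hypothetical names invented for this illustration; they are not TestNG's API (TestNG expresses the same idea through the `groups` attribute of its own `@Test` annotation). The sketch only shows how a runner could pick out the tests that belong to one category.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CategoryDemo {
    // Hypothetical marker annotation for this sketch. TestNG's real mechanism
    // is the groups attribute of its own @Test annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Category {
        String[] value();
    }

    // Hypothetical tests, one per category; the bodies are omitted on purpose.
    static class PriceTests {
        @Category("unit")
        public void addsTwoPrices() {}

        @Category("component")
        public void loadsPricesFromCatalog() {}

        @Category("system")
        public void checkoutEndToEnd() {}
    }

    // Returns the sorted names of the methods tagged with the given category.
    static List<String> select(Class<?> testClass, String category) {
        List<String> names = new ArrayList<>();
        for (Method method : testClass.getDeclaredMethods()) {
            Category tag = method.getAnnotation(Category.class);
            if (tag != null && Arrays.asList(tag.value()).contains(category)) {
                names.add(method.getName());
            }
        }
        // getDeclaredMethods() guarantees no particular order, so sort.
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        for (String category : new String[] {"unit", "component", "system"}) {
            System.out.println(category + ": " + select(PriceTests.class, category));
        }
    }
}
```

A build could then call something like `select(PriceTests.class, "unit")` on every commit and leave the component and system selections for less frequent runs.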

Harold brings up a new category: the todo category. (I was originally thinking of calling it the fixme category, but I thought todo could serve more purposes.) The todo category contains all tests that serve as a reminder of work to do. They obviously fail, or there would be nothing to do. Once the work is done and the test passes, it moves out of the todo category and into the unit, component, or system category as appropriate. Although you could have a todo category for each of the original regression categories (todoUnit, todoComponent, and todoSystem), I think that would be overkill. Notice that I slipped the word "regression" into that description. I'm advocating that the original categories should pass at all times: they should break the build if they fail (well, at a minimum the unit category should). Once tests get into the regression categories, they never go back to todo.
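The policy above can be sketched in a few lines of plain Java. Everything here is an assumption made for illustration: the `TodoBuildPolicy` class, its `TestCase` and `Report` types, and the `parseQuantity` method under test are hypothetical, and a real project would more likely get the same effect from its test framework's grouping feature. The point is only the rule itself: a failure in a regression category breaks the build, while a failure in the todo category is reported as a reminder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TodoBuildPolicy {
    // The regression categories from the post, plus TODO for reminder tests.
    enum Category { UNIT, COMPONENT, SYSTEM, TODO }

    // A named test body tagged with one category (hypothetical type).
    static final class TestCase {
        final String name;
        final Category category;
        final Runnable body;

        TestCase(String name, Category category, Runnable body) {
            this.name = name;
            this.category = category;
            this.body = body;
        }
    }

    // Failures are collected separately so that only regressions break the build.
    static final class Report {
        final List<String> regressionFailures = new ArrayList<>();
        final List<String> todoFailures = new ArrayList<>();

        boolean buildPasses() {
            return regressionFailures.isEmpty();
        }
    }

    // Hypothetical code under test: it does not handle "1,000" yet.
    static int parseQuantity(String text) {
        return Integer.parseInt(text);
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("check failed");
        }
    }

    static List<TestCase> sampleTests() {
        return Arrays.asList(
            new TestCase("parsesPlainNumber", Category.UNIT,
                () -> check(parseQuantity("42") == 42)),
            // Reminder of work to do: fails until separators are supported.
            new TestCase("parsesThousandsSeparator", Category.TODO,
                () -> check(parseQuantity("1,000") == 1000)));
    }

    static Report run(List<TestCase> tests) {
        Report report = new Report();
        for (TestCase test : tests) {
            boolean passed;
            try {
                test.body.run();
                passed = true;
            } catch (RuntimeException | AssertionError e) {
                passed = false;
            }
            if (passed) {
                continue;
            }
            if (test.category == Category.TODO) {
                report.todoFailures.add(test.name);       // reminder only
            } else {
                report.regressionFailures.add(test.name); // breaks the build
            }
        }
        return report;
    }

    public static void main(String[] args) {
        Report report = run(sampleTests());
        System.out.println("todo (reminders): " + report.todoFailures);
        System.out.println("regression failures: " + report.regressionFailures);
        if (!report.buildPasses()) {
            System.exit(1); // a non-zero exit code is what fails the build
        }
    }
}
```

When the separator handling is eventually written, moving `parsesThousandsSeparator` from `TODO` to `UNIT` is a one-word change, and from then on a failure in that test would break the build like any other regression.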

The obvious downside of this approach, and of test categorization in general, is the added complexity. Maintaining categories and, in this case, moving tests between them is more work than a simple one-category-fits-all approach. There's also more work (at least initially) to set up the running of those categories. As always, I recommend weighing the options and deciding what's best for your particular situation. See Glover's writings for more benefits of test categorization.
