Originally published 16 Oct 2007
Over time, how do you keep your build under 10 minutes? Holding to that limit is an important agile practice for continuous integration.
It's inevitable. As more and more features are added to an application, the code base grows. The build has more to do: more compiling, more tests and metrics to run, more reports to generate. We've done numerous things to keep the build time down. In this entry, I'd like to describe one of them: running standard JUnit tests concurrently.
A colleague of mine came up with this idea. He first did a preliminary search to see if anything already existed. Pretty much everything he found was intrusive, e.g., requiring tests to extend a specialized TestCase. He then attempted to combine Ant's Parallel and JUnit tasks. That is, he could simply nest N JUnit tasks inside a Parallel task. The problem here was coming up with a good way to divide the tests into parts of roughly equal running time beforehand. My colleague quickly scrapped this idea in favor of a custom Ant task that wrapped the JUnit task.
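To see why dividing the tests up beforehand is awkward, here is a minimal sketch. It is not my colleague's code: the class name, the round-robin split, and the test durations are all made up for illustration. It assigns tests to a fixed number of groups before anything runs, which is what nesting N JUnit tasks inside a Parallel task would require, and then reports how long the slowest group takes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these names come from Ant or JUnit.
final class StaticPartitionDemo {

    // Assign tests to groups round-robin, before any test has run.
    static List<List<Integer>> partition(List<Integer> durations, int groups) {
        List<List<Integer>> result = new ArrayList<>();
        for (int g = 0; g < groups; g++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < durations.size(); i++) {
            result.get(i % groups).add(durations.get(i));
        }
        return result;
    }

    // The parallel run finishes only when the slowest group finishes.
    static int wallClock(List<List<Integer>> groups) {
        int worst = 0;
        for (List<Integer> group : groups) {
            int total = 0;
            for (int d : group) {
                total += d;
            }
            worst = Math.max(worst, total);
        }
        return worst;
    }

    public static void main(String[] args) {
        // Made-up durations in seconds: slow and fast tests alternate.
        List<Integer> durations = Arrays.asList(60, 1, 50, 1, 40, 1, 30, 1);
        List<List<Integer>> groups = partition(durations, 2);
        System.out.println("groups = " + groups);
        System.out.println("wall clock = " + wallClock(groups) + "s");
    }
}
```

With these numbers one group gets all the slow tests (180 seconds) and the other finishes in 4, even though a perfect split would take 92. You only get an even split if you already know how long each test takes, and that changes as the code base grows.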
The thought was to leverage the JUnit task as much as possible while fronting it with the ability to run tests in parallel. Conceptually, the design looks like this:
There are N TestExecutors. Each TestExecutor runs in its own JVM, so there are N JVMs. Say we have a dedicated, 4-processor integration machine running builds. We may choose to configure our custom task with 5 TestExecutors (we'll add one on the assumption that the tests aren't 100% compute-bound). Note: if you're familiar with the JUnit task, you may be wondering how its fork/forkmode attributes fit in. The answer is that they are eliminated in favor of this task's jvm-count attribute.
There's a single TestProvider. Its job is to gather all the tests and hand them out one at a time to whichever TestExecutor is ready. A simple protocol exists to let a TestExecutor tell the TestProvider that it's done and ready for another test. Because no TestExecutor sits idle while tests remain, this should maximize parallelism.
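The hand-out scheme can be sketched roughly as follows. This is only an illustration under big assumptions: threads stand in for the separate JVMs, a shared in-memory queue stands in for the protocol between TestProvider and TestExecutor, and "running" a test just records its name. The class and method names are hypothetical, not those of our actual task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: threads stand in for the separate JVMs of the real task.
final class PullModelSketch {

    // Stand-in for the TestProvider: holds every test and hands out one per request.
    static final class TestProvider {
        private final ConcurrentLinkedQueue<String> pending;

        TestProvider(List<String> testNames) {
            this.pending = new ConcurrentLinkedQueue<>(testNames);
        }

        // Returns the next test name, or null when nothing is left.
        String nextTest() {
            return pending.poll();
        }
    }

    // Stand-in for a TestExecutor: asks for a test, "runs" it, asks again.
    static final class TestExecutor implements Runnable {
        private final String name;
        private final TestProvider provider;
        private final List<String> log;

        TestExecutor(String name, TestProvider provider, List<String> log) {
            this.name = name;
            this.provider = provider;
            this.log = log;
        }

        @Override
        public void run() {
            String test;
            while ((test = provider.nextTest()) != null) {
                // A real executor would run the JUnit test class here.
                log.add(name + " ran " + test);
            }
        }
    }

    // Starts the executors, waits for all of them, and returns what each one ran.
    static List<String> runAll(List<String> tests, int executorCount) {
        TestProvider provider = new TestProvider(tests);
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 1; i <= executorCount; i++) {
            Thread t = new Thread(new TestExecutor("executor-" + i, provider, log));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for executors", e);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> tests = Arrays.asList(
                "FooTest", "BarTest", "BazTest", "QuxTest", "QuuxTest", "CorgeTest", "GraultTest");
        for (String line : runAll(tests, 5)) {
            System.out.println(line);
        }
    }
}
```

Which executor ends up running which test varies from run to run. That's the point: nobody decides the split beforehand, and an executor stuck on a slow test simply takes fewer tests overall.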
The biggest negative with this approach is that since all tests run in a fixed number of JVMs, there's some loss of isolation. Class-level or global state set by an earlier test could affect later tests in the same JVM. We're willing to trade this for increased performance.
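Here's a contrived example of the kind of leak I mean. The names are hypothetical and the "tests" are plain methods rather than JUnit tests, to keep the sketch self-contained. One test changes a static field and never restores it, and a later test in the same JVM quietly depends on the default value.

```java
// Hypothetical illustration of class-level state leaking between tests in one JVM.
final class SharedStateSketch {

    // Static state like this lives as long as the JVM does.
    static final class Config {
        static String locale = "en";
    }

    // Changes the global setting and never puts it back.
    static boolean testFrenchFormatting() {
        Config.locale = "fr";
        return Config.locale.equals("fr");
    }

    // Silently assumes nobody has touched the default.
    static boolean testDefaultLocale() {
        return Config.locale.equals("en");
    }

    public static void main(String[] args) {
        System.out.println("default, run first:        " + testDefaultLocale());    // true
        System.out.println("french:                    " + testFrenchFormatting()); // true
        System.out.println("default, run after french: " + testDefaultLocale());    // false
    }
}
```

With a fresh JVM per test, both tests pass in any order. When JVMs are reused, the outcome depends on which tests happened to run earlier in the same JVM, and with tests handed out on demand that can change from build to build.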
The other possible downside is the use of a non-standard Ant task.
Using this approach we've successfully reduced our build time to usable levels.