Saturday, February 20, 2010

Parallelizing JUnit test runs

Original post at the kaChing engineering blog.

Test runs should be as fast as possible to allow a lean development cycle. One application of fast test runs is Continuous Deployment (see Lean Startup).

Running tests on powerful multi-core machines is not enough, since most unit tests use a single thread per test. Apart from reducing IO to a minimum, running tests in parallel is required to squeeze all the juice out of the machine. An additional benefit of parallelizing JUnit test runs is that it helps ensure the tests have no dependencies on each other.

We implemented parallel test runs by creating an Ant target with a parallel task containing N JUnit tasks, where N == number of cores. For example:

<parallel>
      <junit printsummary="yes" haltonfailure="yes" fork="true" maxmemory="${maxmemory}" showoutput="yes">
        <jvmarg value="-XX:MaxPermSize=${permMem}"/>
        <jvmarg value="-Xms${minMem}"/>
        <jvmarg value="-Xmx${maxmemory}"/>
        <classpath refid="classpath.test" />
        <formatter type="xml" usefile="true" />
        <test name="com.kaching.GroupedTests$GroupA" todir="${testTargetJunit}" unless="testcase"/>
      </junit>
      <junit printsummary="yes" haltonfailure="yes" fork="true" maxmemory="${maxmemory}" showoutput="yes">
        <jvmarg value="-XX:MaxPermSize=${permMem}"/>
        <jvmarg value="-Xms${minMem}"/>
        <jvmarg value="-Xmx${maxmemory}"/>
        <classpath refid="classpath.test" />
        <formatter type="xml" usefile="true" />
        <test name="com.kaching.GroupedTests$GroupB" todir="${testTargetJunit}" unless="testcase"/>
      </junit>
      ...
</parallel>
The GroupedTests$GroupX classes extend TestCase and expose a public static Test suite() method. The suite() method creates a JUnit test suite on the fly by loading all the tests in scope and filtering them. For example, with four cores you would have a group of four suites, each running a fourth of the tests. The suites use a java.util.Random seeded with the commit revision number. Using the Random object, we decide which suite each test case is placed in:
random.nextInt(numOfTestSuites) == testSuiteId
Therefore each test case goes to a single suite, and the test cases are distributed roughly evenly between suites. Randomness in assigning test cases to suites (and therefore to processes) gives us some reassurance that there are no dependencies between tests.
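
The filtering step can be sketched roughly as follows. This is a minimal illustration, not the actual GroupedTests code: the class name SuitePartitioner, the select method, and the use of plain test class names as strings are all hypothetical. It also assumes every suite walks the tests in the same (sorted) order, so that the same seed produces the same sequence of draws in every process.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper: picks the tests that belong to one of N suites. */
public class SuitePartitioner {

    /**
     * Returns the test names assigned to the suite with the given 0-based id.
     * The revision number seeds the Random, so every suite (each running in
     * its own forked JVM) sees the same sequence of draws.
     */
    public static List<String> select(List<String> allTests, long revision,
                                      int numOfTestSuites, int testSuiteId) {
        // Assumption: sort first so that all suites iterate in the same order.
        List<String> sorted = new ArrayList<>(allTests);
        Collections.sort(sorted);

        Random random = new Random(revision);
        List<String> selected = new ArrayList<>();
        for (String test : sorted) {
            // One draw per test; only the matching suite keeps the test.
            if (random.nextInt(numOfTestSuites) == testSuiteId) {
                selected.add(test);
            }
        }
        return selected;
    }
}
```

With four suites, calling select with testSuiteId 0 through 3 and the same revision yields four disjoint lists that together cover every test. A new revision reshuffles the assignment, which is what exposes hidden dependencies between tests over time.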

As a result, the tests run about twice as fast as they did before: roughly 2 min 20 sec for about 4.6k tests for one of our components, including code fetch from the source repository, clean, build, and test suite setup time. This gives us a nice 100% test machine utilization and a faster commit-to-production deployment cycle (about four minutes).

Monday, February 01, 2010

Benchmarking Voldemort on SSD with production data

It's no secret that kaChing is using Voldemort in production. As our data set grows larger, we want to figure out what gives the best bang for the buck in terms of hardware investment. Would it be migrating our 32-bit machines to 64-bit ones with more memory available to them, or using SSDs? There are good reasons why one may be better than the other, but the only way to know is to test. Yes, we believe that Test Driven Development (TDD) applies to building systems as well as to coding.

The time John invested in creating the tool (written in Scala, btw) to parse the logs and produce the graphs quickly paid for itself. In John's post Voldemort in the Wild, he wrote about the initial settings he created to test the changes before we actually made them, and about some of the initial results. The results are unique to our own system, which is the best thing about them. You may get different results for a different load, usage pattern, data size, or cluster setup.

There are more tunings we played with; those will wait for John's next post.

Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.