Saturday, July 24, 2010

Continuous Deployment at kaChing

On my last visit to Israel I visited IBM Research, Outbrain and others. Over there we discussed kaChing's culture, working methodologies and tools. The talk focused on continuous deployment and lean startup and was based on Pascal's and David's posts on the kaChing Eng Blog.
Here is the presentation, using the fantastic Prezi tool.

Sunday, June 20, 2010

Scala on Eclipse without the plugin

IDE support for Scala is one of the few pain points for developers who want to start using Scala in their Java projects. On an existing long-term project developed by a team, it's hard to step in and introduce a new language that the existing IDE does not support. One way to go about it is to hide the fact that you use Scala from the Java world by using one-way dependency injection. Still, if you wish to truly absorb Scala into your existing Java environment, you'll soon introduce cross-language dependencies.

Most of our team uses Eclipse as the main IDE; its incremental Java compilation and tight JUnit integration are great for fast TDD programming. Unfortunately the Eclipse Scala plugin is not there yet: it may hang the IDE and mess up Java compilation, especially in large (more than 1000 source files) Java/Scala projects. Though the plugin is getting better over time, some developers will find it a major drag on their productivity.
For developers who do not write Scala at all, or who prefer to edit Scala with other editors, there is an alternate path that lets them work on their Java or Scala code without messing with the plugin.

Compile Scala Eclipse project without Scala Plugin 
1. Add a compilation script to your project.
The following script uses the Fast Scala Compiler (fsc). fsc is a compilation server that keeps running in the background, like a warm scalac that is always ready to receive new work. It reduces compilation time dramatically.
The classpath for compilation is taken from the Eclipse project .classpath file. You may take the source directories from there as well if you wish (an exercise for the reader).
The params are not passed to fsc on the command line since in my project's case the line is too long for the OS to handle. The alternative is to put them into a file and let fsc read it for you.
# Create a classpath from the eclipse .classpath file
lib=`grep classpathentry .classpath | grep "kind=\"lib\"" | tr -d '\t' | sed 's/.*path=\"\(.*[^src]\.jar\)\".*/\1/' | grep -v classpathentry`
CLASSPATH=`echo ${lib} | sed 's/ /:/g' `  

# point SCALA_HOME to Scala home (might want to add it to your project as well)  
export SCALA_HOME=lib-tools/scala-2.8.0

# java opts for your compilation server
export JAVA_OPTS="-client -Xmx1024M -Xms256M -XX:PermSize=128m -Xss2M -XX:MaxPermSize=256m -Xverify:none -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled"

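# DEST (scalac output directory) and SCALAC_PROPERTIES (argument file) must be set before this point
mkdir -p "$DEST"
echo "-classpath $CLASSPATH:target/bin" > "$SCALAC_PROPERTIES"
echo "-d $DEST -deprecation" >> "$SCALAC_PROPERTIES"
# quote the patterns so the shell does not expand them before find sees them
find src srctest \( -name '*.java' -o -name '*.scala' \) >> "$SCALAC_PROPERTIES"
# hand the argument file to the compilation server
$SCALA_HOME/bin/fsc @"$SCALAC_PROPERTIES"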

2. Eclipse classpath
Add the scalac destination directory ($DEST) to the Eclipse .classpath file.

3. Add the fsc builder to your project
Open the project properties and add the script above to the list of builders.
Set the working directory to be your project root directory.

4. Set the builder properties
Configure the builder to run in the background when needed.
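Going back to step 1, the script pulls the "lib" entries out of .classpath with grep and sed. For anyone who would rather read the file as XML, here is a minimal Java sketch of the same idea. The class and method names are made up for illustration, and unlike the script it does not filter out any jars; it only relies on the standard classpathentry elements with their kind and path attributes.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: reads classpathentry paths of one kind ("lib", "src", ...) from an Eclipse .classpath file. */
class EclipseClasspath {

  /** Returns the path attribute of every classpathentry whose kind attribute equals {@code kind}. */
  static List<String> pathsOfKind(InputStream classpathXml, String kind) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(classpathXml);
    } catch (Exception e) {
      throw new IllegalStateException("could not parse .classpath", e);
    }
    NodeList entries = doc.getElementsByTagName("classpathentry");
    List<String> paths = new ArrayList<>();
    for (int i = 0; i < entries.getLength(); i++) {
      Element entry = (Element) entries.item(i);
      if (kind.equals(entry.getAttribute("kind"))) {
        paths.add(entry.getAttribute("path"));
      }
    }
    return paths;
  }
}
```

Joining the result of pathsOfKind(in, "lib") with ":" gives a classpath string, and pathsOfKind(in, "src") gives the source directories that the script currently hard-codes.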

Add Scala syntax highlighting editor for Eclipse 
Even without the Scala plugin's refactoring and debugging capabilities, you can write some decent Scala code with the help of syntax highlighting for better readability. There are two options:
  • The Color Editor, which uses jEdit modes. Daniel Spiewak added a Scala mode to it; unfortunately the plugin doesn't seem to work with Eclipse 3.6.
  • The Colorer take5 project which works fine on my Eclipse on the Mac. It does not have a Scala mode yet but you can grab one at 
It gives you a lightweight way (with its pros and cons) to write Scala in Eclipse.

Friday, May 28, 2010

InfoQ talk "Absorbing Scala in the Java Ecosystem"

About six months after the event, InfoQ released the video of the "Absorbing Scala in the Java Ecosystem" talk I gave at QCon San Francisco 2009.
The event was great, though it feels funny watching myself talk.

Thursday, April 15, 2010

Findbugs, Hudson and Pizza Driven Development (PDD)

Original post at the kaChing Eng Blog

As you may know, kaChing is a test-driven engineering organization. Test driven is not an option, it's a must. We move fast and push code to production a few dozen times a day in a five-minute release cycle, so we must have high confidence in our code.
In complex systems there is no end to testing. Each test system is another line of defense that eventually gets breached, but the more you have, the lower the chance that bugs reach production. We do not have a QA team and do not want one; the reasoning is that when a human is involved in testing there is a higher chance of missing things, and you simply can't test the whole site dozens of times a day.

Lately we decided to add yet another line of defense: static code analysis (e.g. Findbugs, PMD and CPD). We decided to start with Findbugs, which has a great ANT task and Hudson plugin (we use both). The problem with these tools is that they produce tons of warnings, and most organizations end up ignoring them since they're too noisy to deal with.

David V. recommended "Pizza Driven Development" (aka PDD), which works as follows:
Step one: Order pizza. Most engineers will commit to doing something if it gets them pizza. For the minority that would not be seduced by pizza, good old-fashioned violence will do.
Step two: Give each member of the team two cards. Go over the list of rules with the team and have them vote on them. Voting is done using the cards where:
  • No cards: I think the rule is stupid and we should filter it out in the findbugsExclude.xml
  • One card: The rule is important but not critical.
  • Two cards: The rule is super important and we should fix it right away.
The test sheriff then creates three lists from the votes, and the team works on fixing the "must fix" list, which should have no more than a few dozen issues so it can all be fixed in a couple of hours. Once that is done, the Findbugs build is green and we're ready to go.

The next step is having Hudson run Findbugs after every commit, so the build is considered broken if a new Findbugs issue is introduced. The engineer who introduced the issue must either filter the class from that rule in the XML file or fix the bug as a first priority. Since engineers get a notification a few minutes after the commit, they are probably still working on the code and it's easy for them to fix it on the spot.

Over the next few weeks we are adding a new rule from the "not critical" list every few days. The goal is to have all the rules we think are important without the common "it's too noisy, let's ignore it" approach. Only after we're done with that will we add the next static analysis tool to the build. The good thing about these tools and Hudson is that you can run them in parallel with the unit/integration tests, on another machine, so they won't slow down the overall release cycle.

Testing with Hibernate

Original post at the kaChing Eng Blog

Some of the unit tests I wrote lately involved setting up data in the DB and testing the code that uses and mutates it. To be clear, the DB is an in-memory DB, so the setup is super fast without IO slowdowns. It's very easy to do the setup using Hibernate, but the problem comes when you set up a large collection of objects and a "java.sql.BatchUpdateException: failed batch" is thrown. The frustrating point is that Hibernate won't let you know what exactly went wrong, even if you set your logger to "trace" level.

In order to solve it you can add the following to the system properties of that test:

It will execute the statements one by one and will give more precise details of what went wrong.
Note: you do NOT want to use these properties in production or in the continuous integration (CI) environment.
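As an illustration only, here is a minimal Java sketch of setting such a property from a test. It assumes the property in question is hibernate.jdbc.batch_size, a standard Hibernate setting where 0 turns JDBC batching off; whether that matches the exact list used for this post is an assumption, and the class and method names are hypothetical.

```java
/** Sketch: switches JDBC batching off through a system property before Hibernate starts. */
class HibernateBatchDebug {

  // Real Hibernate setting; that it is the one meant in the post is an assumption.
  static final String BATCH_SIZE = "hibernate.jdbc.batch_size";

  /** Call from the test's setup, before Hibernate is first loaded in this JVM. */
  static void disableJdbcBatching() {
    System.setProperty(BATCH_SIZE, "0"); // 0 = statements are sent one by one
  }

  /** Call from the test's teardown so other tests keep their normal settings. */
  static void restoreDefault() {
    System.clearProperty(BATCH_SIZE);
  }
}
```

Passing -Dhibernate.jdbc.batch_size=0 to the test JVM should have the same effect and avoids any ordering issue with class loading.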

Saturday, February 20, 2010

Parallelizing JUnit test runs

Original post at the kaChing engineering blog.

Test runs should be as fast as possible in order to allow a lean development cycle. One application of this is Continuous Deployment (see Lean Startup).

Using strong multi-core machines to run tests is not enough, since most unit tests use a single thread per test. Apart from reducing IO to a minimum, running tests in parallel is required to squeeze the juice out of the machine. An additional benefit of parallelizing JUnit test runs is ensuring that tests have no dependencies on each other.

The way we implemented parallel test runs is by creating an ANT target with a parallel task containing N JUnit tasks, where N == number of cores. For example:

    <parallel>
      <!-- one forked JUnit task per core; each one runs its own group of tests -->
      <junit printsummary="yes" haltonfailure="yes" fork="true" maxmemory="${maxmemory}" showoutput="yes">
        <jvmarg value="-XX:MaxPermSize=${permMem}"/>
        <jvmarg value="-Xms${minMem}"/>
        <jvmarg value="-Xmx${maxmemory}"/>
        <classpath refid="classpath.test" />
        <formatter type="xml" usefile="true" />
        <test name="com.kaching.GroupedTests$GroupA" todir="${testTargetJunit}" unless="testcase"/>
      </junit>
      <junit printsummary="yes" haltonfailure="yes" fork="true" maxmemory="${maxmemory}" showoutput="yes">
        <jvmarg value="-XX:MaxPermSize=${permMem}"/>
        <jvmarg value="-Xms${minMem}"/>
        <jvmarg value="-Xmx${maxmemory}"/>
        <classpath refid="classpath.test" />
        <formatter type="xml" usefile="true" />
        <test name="com.kaching.GroupedTests$GroupB" todir="${testTargetJunit}" unless="testcase"/>
      </junit>
    </parallel>
The GroupedTests$GroupX classes extend TestCase and have a public static Test suite() method. The suite() method creates a JUnit test suite on the fly by loading all the tests in scope and filtering them. If, for example, there are four cores, you would like to have a group of four suites, each running a fourth of the tests. The suites use a java.util.Random seeded with the commit revision number. Using the Random object we decide which suite each test case is placed in:
random.nextInt(numOfTestSuites) == testSuiteId
Therefore a testcase goes to a single suite and they are evenly distributed between suites. Randomness in assigning test cases to suites (and therefore to processes) gives us some reassurance that there are no dependencies between tests.
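
The assignment described above can be sketched roughly as follows. This is not the actual GroupedTests code: the class and method names are hypothetical, tests are represented by their names only, and it assumes every suite walks the same list of tests in the same order (e.g. sorted) so that all suites see the same sequence of random draws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch: deterministically spreads test names over N suites using a seeded Random. */
class SuitePartitioner {

  /**
   * Returns the tests that belong to suite {@code testSuiteId}.
   * Every suite re-creates the same Random from the same seed and draws once per
   * test in the same order, so each test lands in exactly one suite.
   */
  static List<String> testsForSuite(
      List<String> allTests, long revision, int numOfTestSuites, int testSuiteId) {
    Random random = new Random(revision); // seeded with the commit revision number
    List<String> selected = new ArrayList<>();
    for (String test : allTests) {
      if (random.nextInt(numOfTestSuites) == testSuiteId) {
        selected.add(test);
      }
    }
    return selected;
  }
}
```

Because the seed changes with every commit, the grouping changes too, so a hidden dependency between two tests is likely to show up sooner or later as a failure. The split is even only on average; a given revision may produce suites of somewhat different sizes.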

As a result we have the tests running about twice as fast as they did before, in the range of 2 min, 20 sec for about 4.6k tests for one of our components, including code fetch from source repository, clean, build, and test suite setup time. It gives us a nice 100% test machine utilization and faster commit to production deployment cycle (about four minutes).

Monday, February 01, 2010

Benchmarking Voldemort on SSD with production data

It's no secret that kaChing is using Voldemort in production. As our data set grows larger we wish to figure out what gives the best bang for the buck in terms of hardware investment. Would it be migrating our 32-bit machines to 64-bit ones with more memory available to them, or using SSDs? There are good reasons why one may be better than the other, but the only way to know is to test. Yes, we believe that Test Driven Development (TDD) applies to building systems as well as to coding.

The time John invested in creating the tool (written in Scala, btw) to parse the logs and produce the graphs quickly paid for itself. In his post Voldemort in the Wild, John wrote about the initial setup he created to test the changes before we actually made them, along with some of the initial results. The results are unique to our own system, which is the best thing about them. You may get different results for a different load, usage pattern, data size or cluster setup.

There are more tunings we played with; stay tuned for John's next post.

Saturday, January 09, 2010

Complement TDD with MDA

Original post at the kaChing Eng Blog.

Test Driven Development (aka TDD) is on the rise. Good developers understand that code with no proper testing is dead code. You can't trust it to do what you want and it's hard to change.
I'm a strong believer in Dijkstra's observation that "Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence."

Dijkstra's statement doesn't contradict TDD. A test exercises a limited state machine. We do hope it will cover the bloody battlefield of production, confronting live data from users, but when we find that users did something unexpected that broke our software, we add a test emulating the users' behavior and fix the problem.

Introducing Monitoring Driven Architecture (aka MDA)!
MDA is a second line of defense for TDD. MDA means that you bake monitoring into your architecture. Once you have MDA and the software is written in a monitorable (it is a word) way, you get faster detection of problems and automatic rollback of faulty code. Without it, it is not uncommon for a small number of users to suffer from a problem that manifests itself as an NPE thrown in one of the logs once in a blue moon, with the operations team finding out about it only after a long while.

This is why I'm so excited about John's new Flexible Log Monitoring with Scribe, Esper, and Nagios deployment. It means that when we do find a problem we'll of course fix it, but in addition we'll make sure it is expressed in the logs and have our monitoring tools pick it up and send alerts about its existence, without counting on anyone to look at the logs manually.

Wednesday, January 06, 2010

Outbrain's Password Recovery Mail

I forgot my Outbrain password and got this nice recovery email in return:
From: Outbrain CEO <****>
Hi eishay,

Click on the following link to reset your password:****
You will be prompted to choose a new password for your account.

Feel free to drop me a note with any questions you have.

Yaron Galai,
outbrain CEO
This is the first time I've noticed a "system" email coming from a human that you can actually reply to, and from the CEO of the company no less. It's an interesting concept, especially since I'm not a paying customer. Yaron explained their take on it in this tweet:
@eishay Cool!... we have a rule here @Outbrain - all system emails must go out from an email of a human being. I hate all the info@ BS...

I found the approach very appealing, but can it scale? What is the max # of users it can support?
I assume it depends on the type of users as well.

Reflection on 2009

This post is only for myself. Nothing interesting, just a few personal notes.

The last year was very interesting!
Working at both LinkedIn and kaChing was/is a great experience. These are two fantastic companies with very bright futures, leading in their own fields and with amazingly talented engineers.

The SiliconValley CodeCamp '09 was a lot of fun. I had a full room, got high speaker evaluations and met some candidates for kaChing (yes, we're still hiring).

The other talk was at QCon, which was exciting. There were so many people that we had to move to a larger room, which was packed as well :-)
Got some nice feedback via Twitter.

It was great to do three sessions with the Reversim podcast (in Hebrew) about Scala, scalability and startups.

The open source serialization comparison project has grown significantly. It now has a dozen committers who contributed something at some point in time and are still helping to update the project.
Overall I had lots of fun, and feel like 2010 is going to be even better!

Saturday, January 02, 2010

Subversion Backup

Posted on the kaChing Eng Blog.

Yes, we're using Subversion. I know that distributed version control systems (e.g. Git) are cool and we might get there sometime, but for misc reasons we're still using SVN. For the record, some of us are using git-svn, and we're working and releasing from trunk (part of the lean startup methodology), so branching and merging is less of an issue.
I did some work to migrate our repository and spent some time setting up our SVN repo. Here are some bits and pieces I collected from scattered sites or made up myself to facilitate the SVN backup. Hope it will help anyone starting from scratch.

For the backup I'm using the great svnbackup script. Here are parts of our script (launched by crontab):

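now=$(date +%F)
# $SVNBACKUP stands for the path to the svnbackup script (variable name assumed)
"$SVNBACKUP" --out-dir $OUT_DIR --file-name $FILE_NAME -v $REPO_LOCATION
RETVAL=$?
if [ $RETVAL -ne 0 ]; then
    echo "svnbackup exited with status $RETVAL" | mail -s "ERROR: SVN backup on $now" $KACHING_OPS
    exit 1
fi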
Then the script syncs the backup directory with S3 and verifies that the content of the last_saved file matches the last revision from SVN, which it gets using
last_revision=$(svn -q --limit=1 log | head -2 | tail -1 | cut -c 2-6)
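The comparison itself is simple enough to sketch outside the shell. The following Java snippet is only an illustration with hypothetical names: it assumes the last_saved file holds nothing but the revision number, and it parses the "r1234 | author | date" line that svn log -q prints. Note that cut -c 2-6 keeps at most five characters, while the regular expression below has no such limit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: checks that the backup's last saved revision matches the repository head. */
class BackupRevisionCheck {

  // Matches the "r1234 | author | date" line printed by `svn log -q`.
  private static final Pattern REVISION_LINE =
      Pattern.compile("^r(\\d+) \\|", Pattern.MULTILINE);

  /** Extracts the revision number from the output of `svn -q --limit=1 log`. */
  static long lastRevision(String svnLogOutput) {
    Matcher m = REVISION_LINE.matcher(svnLogOutput);
    if (!m.find()) {
      throw new IllegalArgumentException("no revision line in svn log output");
    }
    return Long.parseLong(m.group(1));
  }

  /** Assumes last_saved holds just the revision number, possibly with whitespace around it. */
  static boolean backupIsCurrent(String lastSavedContent, String svnLogOutput) {
    return Long.parseLong(lastSavedContent.trim()) == lastRevision(svnLogOutput);
  }
}
```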
Backup is not enough; we must constantly test that when the time comes we'll be able to use it. Therefore we added a script, triggered by Nagios, that runs on another machine and tries to do a full repo rebuild from scratch.
The first thing the script does is a brute-force cleanup of the repo:
rm -rf $SVN_REPO
svnadmin create $SVN_REPO
Then it does an S3 sync to get all the backup files and loads them into the svn repo in the right order:
for file in $(ls $SVN_BACKUP_FILES_DIR/*.bzip2 | sort -t '-' -k 4 -n)
do
  bzip2 -dc $file | svnadmin load $SVN_REPO
done
The next step is getting a few revisions and checking that their attributes (e.g. comments) match in both the live and the backup test repos.

Just because I'm paranoid, we also have an svn sync to an SVN slave server in our second data center, where every commit is backed up on the fly, and some of our systems (e.g. WebSVN) read from it.

Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.