Sunday, November 30, 2008

Inconsistencies when moving from Java to Scala

While writing the Scala serialization benchmark I made some calls from Java into Scala. It is very simple, just like using yet another Java library: you only need to add the Scala library jar to your classpath and you're ready to go.
Alas, there were two small but nasty quirks:

  • Scala's List is not a Java Collection list. Actually, I could not create a Scala list from Java since it's abstract. Scala uses lists heavily and its list is very powerful. Still, it sucks that you can't easily create a Scala list in Java and set it in a Scala object. Obviously there are easy ways around this problem.
  • The second is that Scala's enum is not a Java enum. This has many implications when trying to use a library written in Scala.

Scala is faster than Java until you hit Object Serialization

Adding Scala to the serialization benchmarking parade along with Java, StAX, Thrift and Protobuf.
Scala is actually close to Java; as far as the serialization engine is concerned it IS Java, since it compiles to Java classes. Since you can use Scala in a Java environment like yet another jar file, it is worth checking the serialization cost, especially if you're using RMI, Spring remoting or another protocol based on Java serialization.
The surprising part of the Scala comparison is that Scala is actually faster at creating objects. To be fair, I created the exact same objects in Java and Scala, and created all the Scala objects from Java code! Creating Scala objects from Scala code might be even faster.
In the chart below, size is the size of the object's serialized byte array and time is measured in nanoseconds.



Does anyone have an explanation for this?
Here are my assumptions:

  • Since the compiler creates two Java classes for each class in Scala, the serialized form needs to contain twice the metadata.
  • Scala's Enumeration does not translate to a Java enum. I assume that the serialized representation of an enum object in Java is more compact than that of a regular object. Scala loses that.
  • As a new language Scala could do a better job on performance and still leverage the JIT. But they didn't really care too much about serialization. Actually, there is a good reason: if one cares about serialization performance, one should pick Protobuf or Thrift.
By the way, it's really fun writing Scala code. It's much smaller and nicer.

Friday, November 28, 2008

Avoid large scale Java serialization

Oh, it makes so much sense reading Ted's post Don't Serialize Java Objects... about large scale Java serialization and performance. A good reinforcement of my earlier conclusion.

What This Cost In Space And Time

First, the Java serialization space overhead. On a toy example of this object, serialization to a byte array used 953 bytes. Properly writing out the instance variables consumed 296 bytes. In production, doing it the right way shrunk a 1,600-record SequenceFile from 1.4GB to 825MB.

Time savings were great, too. In the same toy example, it took my JVM 7.2 milliseconds to serialize the object and 1.7 milliseconds to unserialize. Doing it with stream I/O only took 76,000 nanoseconds to serialize, 58,000 nanoseconds to unserialize.
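To get a feel for the space overhead described in the quote, here is a minimal sketch that serializes the same values once with default Java serialization and once by writing the fields out by hand. The Record class and its fields are hypothetical and are not the object from Ted's post, so the byte counts will differ from his.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SizeComparison {
    // Hypothetical record type, used only for illustration.
    static class Record implements Serializable {
        private static final long serialVersionUID = 1L;
        final String uri;
        final long duration;
        final int bitrate;

        Record(String uri, long duration, int bitrate) {
            this.uri = uri;
            this.duration = duration;
            this.bitrate = bitrate;
        }
    }

    // Default Java serialization: stream header, class metadata and field values.
    static byte[] javaSerialize(Record r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Hand-written stream I/O: only the field values are written.
    static byte[] manualSerialize(Record r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(r.uri);       // 2 length bytes + the encoded string
            out.writeLong(r.duration); // 8 bytes
            out.writeInt(r.bitrate);   // 4 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Record r = new Record("http://javaone.com/keynote.mpg", 1234567L, 123);
        System.out.println("java serialization: " + javaSerialize(r).length + " bytes");
        System.out.println("manual stream I/O:  " + manualSerialize(r).length + " bytes");
    }
}
```

The price of the manual form is that the reader must know the field order and types in advance, which is the job a schema does in Protobuf or Thrift.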

Wednesday, November 26, 2008

Using Spring RPC for Protobuf transport?

I tried playing around with some code to make Spring RPC be protobuf's transport and have it coexist with other Spring RPC services.
To give credit to Spring, they make it very easy to extend their framework:

Simple, right?
Well, if you want to use the Spring RPC transparent way of remoting services then you must override getProxyForService of RemoteExporter to route Protobuf service calls (i.e. mimic the protobuf service stub). Alas, you can't do it, since the Protobuf service does not have an interface with the method signatures, so you can't make a proxy out of it. And anyway, the methods the protobuf service generates are not intended to look POJO-like at all, with their RpcCallback and RpcController arguments. It sucks a bit if you wish to integrate protobuf into an existing environment.
One can always wrap the generated service so it will look like an innocent POJO, but on the other hand there is a good point in the protobuf way.
With Spring RPC it is way too easy to forget you are calling a remote service. You make your business objects Serializable and just move them around, and having most of the Java data containers serializable just makes it easier. The next thing you know, your API has objects serialized back and forth without justification. So Spring RPC is a big gun to shoot yourself in the foot with when you forget about the eight Fallacies of Distributed Computing.
My conclusion (for now) is that if you have to use Spring RPC for transport and have protobuf objects floating around, you'd better use a protobuf Java serialization wrapper.
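The "no interface, no proxy" problem can be seen with plain JDK dynamic proxies. The sketch below uses them as a stand-in for what a remoting framework does; it is not Spring's code. Greeter and GeneratedStyleService are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyNeedsInterface {
    // Hypothetical POJO-style service interface: this can be proxied.
    interface Greeter {
        String greet(String name);
    }

    // Hypothetical stand-in for a generated service class that has no interface.
    static class GeneratedStyleService {
        String greet(String name) {
            return "hello " + name;
        }
    }

    // Builds a client-side stub; a real one would send the call over the wire.
    static Greeter remoteStub() {
        InvocationHandler handler = (proxy, method, args) ->
                "remote:" + method.getName() + "(" + args[0] + ")";
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[] {Greeter.class}, handler);
    }

    // Returns false when the JDK refuses to build a proxy for the given type.
    static boolean canProxy(Class<?> type) {
        try {
            Proxy.newProxyInstance(
                    type.getClassLoader(), new Class<?>[] {type}, (p, m, a) -> null);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // the type is a class, not an interface
        }
    }

    public static void main(String[] args) {
        System.out.println(remoteStub().greet("world"));
        System.out.println("interface proxied: " + canProxy(Greeter.class));
        System.out.println("class proxied:     " + canProxy(GeneratedStyleService.class));
    }
}
```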

Tuesday, November 25, 2008

Protobuf vs Spring RPC

Did some RPC tests over the wire and got some interesting results. I hope someone can do something similar and verify them.
I ran three RPC client/server combinations (see links to source code):

  1. Protobuf as the protocol over simple TCP/IP client/server Java sockets.
  2. Protobuf as the protocol over HTTP, where the client uses Apache HTTP client (reusing the connection) and the server part is a servlet in a war on Jetty v6. The server side (protowar) implementation is very basic and meant only for benchmarking.
  3. Spring RPC, where the war container is Jetty v6. I'm not posting the source code for this one, but it's basically the same as in the protobuf example, using Java serialization and Spring.
I did a warmup for each client/service combination to get the JIT going, and took the minimum of many iterations to try and dodge the GC. I ran all the tests on the same single (localhost) machine, a MacBook Pro with 4GB of RAM and an Intel Core 2 Duo. Surely it's not a fast machine and you'll get much better results on a decent server; still, I would expect the ratios to stay the same. The results:
  1. protobuf on plain socket: 0.228 milli/roundtrip
  2. protobuf on HTTP: 2.08 milli/roundtrip
  3. Spring/RPC: 106.8 milli/roundtrip
Note that the chart below is in logarithmic scale!
I.e. you lose about an order of magnitude by using HTTP (2.08 vs. 0.228 milli). But say you would like to keep it around and use the benefits of web containers, HTTP VIPs etc., then a couple of millis (less on a real server) might be worth it. But jumping to Spring RPC will cost you another factor of about fifty. Sure, you have tons of benefits there, just make sure you know the tradeoff.

Thursday, November 20, 2008

Some perspective

On the same machine, creating a plain object with a default constructor, not setting any fields or invoking methods, dodging the GC, on a single thread, and subtracting the time of any other code, takes about 7 nanoseconds per object (tested over 100k iterations).
Of course, in real life you have more stuff involved in object life cycle.

Tuesday, November 18, 2008

Thrift vs Protobuf object creation patterns, or: the builder pattern

The protobuf object creation API is beautiful. Whoever designed it did a good job. Not that the Thrift one is bad; it's actually simple, straightforward and much more flexible, but this flexibility is a loaded gun that one can shoot oneself in the foot with. It's a bit strange, since it seems that Thrift is heavily inspired by protobuf, yet it did not adopt a few key elements from it.

The protobuf API successfully confronted conflicting needs:

  • Make business objects immutable (thread safety etc., Scala and Erlang took it to the extreme)
  • Make it flexible to create a business object. For example, if I have ten fields in the object, all optional, I want to allow some of the fields not to be set. Alas! I also want it to be an immutable object. Do I need factory methods/constructors for all the permutations?!
  • Don't have factory methods/constructors with many parameters. There will be numerous bugs which the compiler won't catch when the developer mixes up values. For example, when there are seven integers in a row and the order of the product and member ids gets mixed up.
The protobuf solution is a nice builder pattern. The builder is a write-only interface, like a Java bean with only setters. After one finishes setting all the values, an immutable object is built (see sample code). It seems to cover all the above requirements.
The only caveat is performance, though it's probably very marginal. The time to create a sample business object is: Protobuf 0.00085 milli > Thrift 0.00051 milli > POJO 0.00032 milli.
The builder pattern does have its cost, but consider that it's less than a microsecond, and the number of protobuf objects created in a transaction is probably a few orders of magnitude smaller than the number of POJOs.

Protobuf

MediaContent content = MediaContent.newBuilder()
    .setMedia(
        Media.newBuilder()
            .setUri("http://javaone.com/keynote.mpg")
            .setFormat("video/mpg4")
            .setTitle("Javaone Keynote")
            .setDuration(1234567)
            .setBitrate(123)
            .addPerson("Bill Gates")
            .addPerson("Steve Jobs")
            .setPlayer(Player.JAVA)
            .build())
    .addImage(
        Image.newBuilder()
            .setUri("http://javaone.com/keynote_large.jpg")
            .setSize(Size.LARGE)
            .setTitle("Javaone Keynote")
            .build())
    .addImage(
        Image.newBuilder()
            .setUri("http://javaone.com/keynote_thumbnail.jpg")
            .setSize(Size.SMALL)
            .setTitle("Javaone Keynote")
            .build())
    .build();
Thrift

Media media = new Media();
media.setUri("http://javaone.com/keynote.mpg");
media.setFormat("video/mpg4");
media.setTitle("Javaone Keynote");
media.setDuration(1234567);
media.setBitrate(123);
media.addToPerson("Bill Gates");
media.addToPerson("Steve Jobs");
media.setPlayer(Player.JAVA);
Image image1 = new Image();
image1.setUri("http://javaone.com/keynote_large.jpg");
image1.setSize(Size.LARGE);
image1.setTitle("Javaone Keynote");
Image image2 = new Image("http://javaone.com/keynote_thumbnail.jpg",
"Javaone Keynote", -1, -1, Size.SMALL);
MediaContent content = new MediaContent();
content.setMedia(media);
content.addToImage(image1);
content.addToImage(image2);
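
The builder idea behind the Protobuf sample can be sketched in plain Java without any generated code. This is a hand-written illustration with hypothetical names (MediaSketch and its fields), not what the protobuf compiler actually generates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Immutable value object: no setters, all fields final.
final class MediaSketch {
    private final String uri;            // required
    private final String title;          // optional, may be null
    private final List<String> persons;  // repeated

    private MediaSketch(Builder b) {
        this.uri = b.uri;
        this.title = b.title;
        // Defensive copy so later changes to the builder don't leak in.
        this.persons = Collections.unmodifiableList(new ArrayList<>(b.persons));
    }

    String getUri() { return uri; }
    String getTitle() { return title; }
    List<String> getPersons() { return persons; }

    static Builder newBuilder() { return new Builder(); }

    // Write-only helper: named setters instead of a long constructor.
    static final class Builder {
        private String uri;
        private String title;
        private final List<String> persons = new ArrayList<>();

        Builder setUri(String uri) { this.uri = uri; return this; }
        Builder setTitle(String title) { this.title = title; return this; }
        Builder addPerson(String person) { persons.add(person); return this; }

        MediaSketch build() {
            if (uri == null) {
                throw new IllegalStateException("uri is required");
            }
            return new MediaSketch(this);
        }
    }
}
```

Each optional field is simply left unset, so there is no need for one constructor per permutation, and the named setters make it hard to swap two values of the same type.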

Protobuf with option optimize for SPEED

Jon Skeet pointed out to me that I forgot to include
option optimize_for = SPEED
Thanks Jon!

I added it and it does make a lot of difference. I wonder why it's not the default. It appears that without the flag protobuf uses Java reflection, which is very expensive. With the speed optimization protobuf is faster than Thrift, though not by far.
The serialization speed differences between them are probably not meaningful for transactions that take a few milliseconds to perform.




Monday, November 17, 2008

Java, StAX, Protobuf and Thrift

Another option for serializing objects is the XML format. It has a lot of advantages, but it doesn't perform well. In some cases this performance aspect is only a very small part of the transaction, but since there are so many SOAP protocols floating around, in some cases they should be reconsidered. The fastest Java XML library that I know of is StAX. I've created an XML format matching the Thrift and Protobuf schemas, limiting the tag names to only two chars. That makes the XML not too readable, but limits its total size.
Here are the performance charts comparing StAX with plain Java serialization, Thrift and Protobuf.




Here is the Thrift object description:

namespace java serializers.thrift
typedef i32 int
typedef i64 long
enum Size {
SMALL = 1,
LARGE = 2,
}

enum Player {
JAVA = 0,
FLASH = 1,
}

/**
* Some comment...
*/
struct Image {
1: string uri, //url to the images
2: optional string title,
3: optional int width,
4: optional int height,
5: optional Size size,
}

struct Media {
1: string uri, //url to the thumbnail
2: optional string title,
3: optional int width,
4: optional int height,
5: optional string format,
6: optional long duration,
7: optional long size,
8: optional int bitrate,
9: optional list<string> person,
10: optional Player player,
11: optional string copyright,
}

struct MediaContent {
1: optional list<Image> image,
2: optional Media media,
}
Protobuf:
// See README.txt for information and build instructions.

package serializers.protobuf;

option java_package = "serializers.protobuf";
option java_outer_classname = "MediaContentHolder";

message Image {
required string uri = 1; //url to the thumbnail
optional string title = 2; //used in the html ALT
optional int32 width = 3; // of the image
optional int32 height = 4; // of the image
enum Size {
SMALL = 0;
LARGE = 1;
}
optional Size size = 5; // of the image (in relative terms, provided by cnbc for example)
}

message Media {
required string uri = 1; //uri to the video, may not be an actual URL
optional string title = 2; //used in the html ALT
optional int32 width = 3; // of the video
optional int32 height = 4; // of the video
optional string format = 5; //avi, jpg, youtube, cnbc, audio/mpeg formats ...
optional int64 duration = 6; //time in miliseconds
optional int64 size = 7; //file size
optional int32 bitrate = 8; //video
repeated string person = 9; //name of a person featured in the video
enum Player {
JAVA = 0;
FLASH = 1;
}
optional Player player = 10; //in case of a player specific media
optional string copyright = 11;//media copyright
}

message MediaContent {
repeated Image image = 1;
optional Media media = 2;
}
The generated XML looks like this:

Sunday, November 16, 2008

Serialization: Protobuf vs Thrift vs Java

Here is a new project on the google code site named thrift-protobuf-compare that tries to compare Protobuf, Thrift and Java POJO serialization.
The results turned out to be very interesting.

The method: create three sets of objects and run the tests on them using the respective tools.
In each test the code first runs 200k iterations of the test without taking time (to let the JIT warm up), asks the system to invoke the GC, and then runs another 200k iterations with time measuring.

Note that the project does not do a full comparison yet, and there are a few ways to do each of the actions. Moreover, there is much more to it than the numbers (features like versioning, object merging and the RPC mechanism). More investigation will follow.
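
The measuring loop described above can be sketched roughly as follows. This is a simplified illustration, not the project's actual harness; TimingSketch, nanosPerCall and sink are made-up names.

```java
class TimingSketch {
    // Results are stored here so the JIT cannot discard the work as dead code.
    static volatile Object sink;

    // Runs the task untimed first, hints a GC, then times the same number of runs.
    static double nanosPerCall(Runnable task, int iterations) {
        for (int i = 0; i < iterations; i++) {
            task.run(); // warm-up, lets the JIT compile the hot path
        }
        System.gc(); // only a hint; the JVM may ignore it
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        double ns = nanosPerCall(() -> sink = new Object(), 200_000);
        System.out.println("object creation: " + ns + " ns/call");
    }
}
```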

The numeric results are in the project page, here are the graphs:
Milliseconds to create an object, smaller is better. The Protobuf result is not a mistake! The object was created with the builder pattern.

Milliseconds to serialize an object to a byte array, smaller is better.
Milliseconds to deserialize an object from byte array, smaller is better.
Size of the byte array of a serialized object, smaller is better.

Results, as you see, are mixed, and with a different implementation they might look different. But it seems that for now, Thrift performs much better in terms of CPU speed, while the protobuf buffer size is about 30% smaller than Thrift's.

Please look at the code and post suggestions to add/change test cases.

Friday, November 14, 2008

Installing Thrift on Mac OSX 10.5


There are a few dependencies in the way of installing Thrift. I'll list the ones I found and the way to install them. You may already have some of the dependencies and so can skip a step or two.

  • X11. Doesn't sound like something Thrift would care about. It is actually a dependency of Python and you need that for Boost and you need that for thrift :-) Download it, double click to install and reboot.
  • Fink.
  • Boost. From the command line, type 'fink install boost1.33'
  • Download and unzip Thrift.
  • Run the following from the command line:
$ cd {Thrift dir}
$ cp /usr/X11/share/aclocal/pkg.m4 aclocal/
$ ./bootstrap.sh
$ ./configure --with-boost=/sw/
$ make
$ sudo make install
That's it!
Now run the Thrift tutorial script and make sure you read the file:
$ cd tutorial/
$ ./tutorial.thrift

Thursday, November 13, 2008

protobuf java serialization pains


The implementation in the last post is actually pretty ugly, since it's not generic and one needs to create a new serialization wrapper for every protobuf object.
The natural solution would be to use generics and get it over with, but it seems that protobuf makes that difficult.
The key problem with the generics approach is that the method "parseFrom" that deserializes the generated protobuf object is static and is not declared in an interface or in the GeneratedMessage class it inherits from. This means that one has to have the concrete class at hand to do the deserialization, but avoiding that is the whole point! I want a general serializer for any protobuf object.

So here is my solution, looks a bit hacky with the introspection and byte writing, but it works fine.

import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.lang.reflect.Method;

import com.google.protobuf.Message;

/**
 * Manually serializing Protobuf objects.
 * The serialized form is: an int with the length of the class name,
 * the class name bytes, an int SIZE, and then SIZE bytes of the
 * serialized protobuf object.
 */
class ProtobufSerializer<T extends Message> implements Externalizable {
    /**
     * Object to serialize
     */
    private transient T _proto;
    private transient String _className;

    public ProtobufSerializer() {}

    public ProtobufSerializer(T proto) {
        _proto = proto;
        if (null != _proto) _className = _proto.getClass().getName();
    }

    public T get() { return _proto; }

    /**
     * If the first int (the class name length) is zero, the object is null
     */
    @SuppressWarnings("unchecked")
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        int size = in.readInt();
        if (0 == size) return;
        byte[] array = new byte[size];
        in.readFully(array, 0, size);
        _className = new String(array, "UTF-8");
        size = in.readInt();
        array = new byte[size];
        in.readFully(array, 0, size);
        try {
            Class<?> clazz = getClass().getClassLoader().loadClass(_className);
            // parseFrom(byte[]) is a static method of the generated class
            Method parseMethod = clazz.getMethod("parseFrom", byte[].class);
            _proto = (T) parseMethod.invoke(null, (Object) array);
        }
        catch (Exception e) {
            // keep the original failure as the cause
            throw new IOException("could not load class " + _className, e);
        }
    }

    /**
     * If the object is null then the int zero is written to the stream
     */
    public void writeExternal(ObjectOutput out) throws IOException {
        if (null == _proto) {
            out.writeInt(0);
            return;
        }
        byte[] name = _className.getBytes("UTF-8");
        out.writeInt(name.length);
        out.write(name);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        _proto.writeTo(baos);
        baos.close();
        byte[] array = baos.toByteArray();
        out.writeInt(array.length);
        out.write(array);
    }
}
And here is how to use it:

private ProtobufSerializer<NewsMediaContent> _mediaHolder;

public NewsMediaContent getNewsMediaContent()
{
    return null == _mediaHolder ? null : _mediaHolder.get();
}

public void setNewsMediaContent(NewsMediaContent media)
{
    _mediaHolder = new ProtobufSerializer<NewsMediaContent>(media);
}
By the way, I noticed that Hadoop has a similar issue with Thrift.
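
The reflective lookup of a static parseFrom(byte[]) can be tried in isolation, without protobuf on the classpath. In this sketch FakeMessage is a hypothetical stand-in for a generated message class, and parse is a made-up helper name.

```java
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;

class ReflectiveParse {
    // Hypothetical stand-in for a generated class with a static parseFrom(byte[]).
    public static class FakeMessage {
        final String text;

        private FakeMessage(String text) {
            this.text = text;
        }

        public static FakeMessage parseFrom(byte[] data) {
            return new FakeMessage(new String(data, StandardCharsets.UTF_8));
        }
    }

    // Finds the static parseFrom(byte[]) by class name and calls it.
    static Object parse(String className, byte[] data) {
        try {
            Class<?> clazz = Class.forName(className);
            Method parseFrom = clazz.getMethod("parseFrom", byte[].class);
            // The target is null because the method is static; the cast keeps
            // the byte[] a single argument of the varargs call.
            return parseFrom.invoke(null, (Object) data);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not parse " + className, e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        FakeMessage m = (FakeMessage) parse(FakeMessage.class.getName(), data);
        System.out.println(m.text);
    }
}
```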

Wednesday, November 12, 2008

Protobuf not Serializable ?!

I'm starting to use Google's protobuf for serialization and deserialization of objects. It looks great, though I'm still going to check out Facebook's Thrift, just to make sure I'm not missing something.
The code is very easy to use and much better than any of the XML to object mappings, or for that matter, XML as a data structure.
Still, there is one big gap in protobuf. The generated objects do not implement Serializable. I hope there is a good reason for that, though I didn't find one. Without knowing any better, it looks very simple to have the generated objects implement the interface.

With a large legacy codebase it is not feasible to change the BL transport layer to use protobuf in one fell swoop. Therefore you're stuck if you have protobuf objects floating around and you need to Java-serialize them.

If you can embed the protobuf object in another one then it's easy (though unpleasant) to solve the problem. I solved it by having an Externalizable class that wraps the protobuf object and deals with the Java serialization for it.

For example, assuming NewsMediaContent is the protobuf object:

public class NewsMediaContentSerializer implements Externalizable {
    private NewsMediaContent _media;

    public NewsMediaContentSerializer() {}

    public NewsMediaContentSerializer(NewsMediaContent media) { _media = media; }

    public NewsMediaContent getMedia() { return _media; }

    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        int size = in.readInt();
        if (0 == size) return;
        byte[] array = new byte[size];
        in.readFully(array);
        _media = NewsMediaContent.parseFrom(array);
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        if (null == _media) { out.writeInt(0); return; }
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        _media.writeTo(baos);
        baos.close();
        byte[] array = baos.toByteArray();
        out.writeInt(array.length);
        out.write(array);
    }
}


And use it like this:

public NewsMediaContent getNewsMediaContent() {
    return null == _mediaHolder ? null : _mediaHolder.getMedia();
}

public void setNewsMediaContent(NewsMediaContent media) {
    _mediaHolder = new NewsMediaContentSerializer(media);
}
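
To check that such a wrapper really survives Java serialization, here is a self-contained round trip that does not need protobuf. Payload is a hypothetical stand-in for a generated message (not Serializable, but convertible to and from bytes), and PayloadHolder plays the role of the wrapper.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;

class ExternalizableRoundTrip {
    // Hypothetical stand-in for a generated message: deliberately not Serializable.
    static final class Payload {
        final String text;
        Payload(String text) { this.text = text; }
        byte[] toBytes() { return text.getBytes(StandardCharsets.UTF_8); }
        static Payload fromBytes(byte[] b) { return new Payload(new String(b, StandardCharsets.UTF_8)); }
    }

    // Externalizable needs a public no-arg constructor for deserialization.
    public static final class PayloadHolder implements Externalizable {
        private Payload payload;

        public PayloadHolder() {}
        public PayloadHolder(Payload payload) { this.payload = payload; }
        public Payload get() { return payload; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            if (payload == null) { out.writeInt(-1); return; } // -1 marks null
            byte[] bytes = payload.toBytes();
            out.writeInt(bytes.length);
            out.write(bytes);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            int size = in.readInt();
            if (size < 0) return; // null payload
            byte[] bytes = new byte[size];
            in.readFully(bytes);
            payload = Payload.fromBytes(bytes);
        }
    }

    // Serializes the holder with plain Java serialization and reads it back.
    static PayloadHolder roundTrip(PayloadHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PayloadHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new PayloadHolder(new Payload("keynote"))).get().text);
    }
}
```

The sketch marks null with -1 rather than 0, because a payload can legitimately serialize to zero bytes. A protobuf message with no fields set also serializes to an empty array, so a zero marker cannot tell the two cases apart.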

Monday, November 10, 2008

PMD


Lately I came across PMD. It looks like a nice tool to add to the toolbox, next to FindBugs and Clover. Many of the things PMD does for you Eclipse already does, and most rules are not so useful. Still, I did find a few things that made it worthwhile to use. The killer feature, I think, is the 'duplicated code finder'. It's very good, and it found a few surprises in a very large code base. I did need about 4 GB of heap to run it.

Sunday, November 09, 2008

Scala (!)


In the last couple of days I went to a few sessions at the Silicon Valley Code Camp. The sessions were nice, but the one which impressed me the most was about Scala. I had heard about the language a few times, and the Java Posse are constantly talking about it.
So I went to David Pollak's talk about Scala and the Lift web framework.

The more I learned about the language, the more I was impressed. Until now I thought that Groovy was the next big thing on the JVM, and since LinkedIn is using Groovy I got a somewhat closer view of it. I must admit, I was not too impressed. Scala, on the other hand, definitely looks like the way to go. It is fully interoperable with legacy Java code and provides type safety, the lack of which is a big drawback for Groovy, especially when working with huge amounts of code that need to be refactored once in a while. Twitter moving its core server code to Scala, and good NetBeans and Eclipse plugins, are further good indicators.

Now I need to look for a nail to test this new hammer :-)

Monday, November 03, 2008

CrossOver Chromium / Chrome on Mac - better wait for the real thing

Just installed CrossOver on my Mac. Actually, I have nothing to do with it, it was just to check it out. The only Windows-only application I would like to check out is Chrome. As expected, the emulation works fine, though the UI is choppy. It definitely feels like an emulation and is no match for the slickness of Firefox.
I guess that if you have to have it then it's better than nothing, but for day-to-day usage CrossOver would be my last resort.

Here are some snapshots of how it looks on the Mac.

Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.