Friday, November 28, 2008

Avoid large scale Java serialization

Oh, it makes so much sense reading Ted's post "Don't Serialize Java Objects..." about large-scale Java serialization and performance. It is a good reinforcement of an earlier conclusion.

What This Cost In Space And Time

First, the Java serialization space overhead. On a toy example of this object, serialization to a byte array used 953 bytes. Properly writing out the instance variables consumed 296 bytes. In production, doing it the right way shrank a 1,600-record SequenceFile from 1.4GB to 825MB.
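To make the space comparison concrete, here is a minimal sketch of the two approaches side by side. The Point class and its fields are hypothetical, invented for illustration; they are not the object from Ted's post, so the byte counts will differ from the 953 and 296 quoted above. The sketch only shows where the overhead comes from: ObjectOutputStream writes a stream header and a class descriptor along with the data, while DataOutputStream writes just the field values.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSize {

    // Hypothetical record type, made up for this example.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final double x;
        final double y;
        final String label;

        Point(int id, double x, double y, String label) {
            this.id = id;
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    // Standard Java serialization: stream header, class descriptor, then data.
    static byte[] javaSerialize(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        return bytes.toByteArray();
    }

    // Field-by-field writing: only the instance variables reach the stream.
    static byte[] writeFields(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.id);      // 4 bytes
            out.writeDouble(p.x);    // 8 bytes
            out.writeDouble(p.y);    // 8 bytes
            out.writeUTF(p.label);   // 2-byte length prefix + encoded characters
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Point p = new Point(42, 1.5, -2.5, "origin");
        System.out.println("Java serialization: " + javaSerialize(p).length + " bytes");
        System.out.println("Field-by-field:     " + writeFields(p).length + " bytes");
    }
}
```

The trade-off is that the field-by-field format carries no type information, so the reader has to know the field order and types in advance.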

Time savings were great, too. In the same toy example, it took my JVM 7.2 milliseconds to serialize the object and 1.7 milliseconds to unserialize. Doing it with stream I/O took only 76,000 nanoseconds (0.076 ms) to serialize and 58,000 nanoseconds (0.058 ms) to unserialize.
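For the time side, a rough sketch of a stream I/O round trip with System.nanoTime around each step might look like the following. The Reading class is hypothetical, and this is not the benchmark from Ted's post. A single timed run like this is noisy and dominated by JIT warm-up, so treat any numbers it prints as indicative at best.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class StreamRoundTrip {

    // Hypothetical record type, made up for this example.
    static final class Reading {
        final long timestamp;
        final double value;
        final String sensor;

        Reading(long timestamp, double value, String sensor) {
            this.timestamp = timestamp;
            this.value = value;
            this.sensor = sensor;
        }
    }

    // Write the instance variables in a fixed order.
    static byte[] write(Reading r) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(r.timestamp);
            out.writeDouble(r.value);
            out.writeUTF(r.sensor);
        }
        return bytes.toByteArray();
    }

    // Read the fields back in exactly the order they were written.
    static Reading read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long timestamp = in.readLong();
            double value = in.readDouble();
            String sensor = in.readUTF();
            return new Reading(timestamp, value, sensor);
        }
    }

    public static void main(String[] args) throws IOException {
        Reading original = new Reading(1227830400000L, 21.5, "sensor-7");

        long start = System.nanoTime();
        byte[] data = write(original);
        long afterWrite = System.nanoTime();
        Reading copy = read(data);
        long afterRead = System.nanoTime();

        System.out.println("write: " + (afterWrite - start) + " ns");
        System.out.println("read:  " + (afterRead - afterWrite) + " ns");
        System.out.println("round trip ok: " + copy.sensor.equals(original.sensor));
    }
}
```

For a trustworthy comparison against ObjectOutputStream, each path would need many warmed-up iterations, ideally under a benchmark harness rather than a hand-rolled timer.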


Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.