Thursday, July 02, 2009

Microbenchmarking Scala vs Java

Following Nick's post I came up to check the numbers and investigate a possible improvement using Scala 2.8 @specialized annotation.
Final numbers are at the end of the post.
I added few changes:
First I thought it would be better if both Scala and Java would sort the same size of array ;-) In Nick's example Java sorted 10,000 elements and Scala sorted 100,000.
The other was to do few iterations int he same JVM run to let JIT kick in. So now the Scala code looks like this:

package quicksort
import java.lang.Long.MAX_VALUE

object QuicksortScala {

def quicksort(xs: Array[Int]) {

def swap(i: Int, j: Int) {
val t = xs(i); xs(i) = xs(j); xs(j) = t
}

def sort1(l: Int, r: Int) {
val pivot = xs((l + r) / 2)
var i = l;
var j = r
while (i <= j) {
while (xs(i) < pivot) i += 1
while (xs(j) > pivot) j -= 1
if (i <= j) {
swap(i, j)
i += 1
j -= 1
}
}
if (l < j) sort1(l, j)
if (j < r) sort1(i, r)
}
sort1(0, xs.length - 1)
}

def main(args : Array[String]) {
var time = MAX_VALUE
for(i <- 0 to 100) (time = Math.min(time, doSort))
println("Scala time = " + time)
}

def doSort() = {
var a : Array[Int] = new Array[Int](10000000)
var i : Int = 0
for (e <- a) {
a(i) = i*3/2+1;
if (i%3==0) a(i) = -a(i);
i = i+1
}
val t1 = System.currentTimeMillis();
quicksort (a)
val t2 = System.currentTimeMillis();
t2 - t1
}
}
And here is the Java code:
package quicksort;

public class QuicksortJava {

public void swap(int[] a, int i, int j) {
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}

public void quicksort(int[] a, int L, int R) {
int m = a[(L + R) / 2];
int i = L;
int j = R;
while (i <= j) {
while (a[i] < m)
i++;
while (a[j] > m)
j--;
if (i <= j) {
swap(a, i, j);
i++;
j--;
}
}
if (L < j)
quicksort(a, L, j);
if (R > i)
quicksort(a, i, R);
}

public void quicksort(int[] a) {
quicksort(a, 0, a.length - 1);
}

public static void main(String[] args) {
QuicksortJava sorter = new QuicksortJava();
long time = Long.MAX_VALUE;
for(int i = 0; i < 100; i++)
time = Math.min(time, sorter.doSort());

System.out.println("java time = " + time);
}

private long doSort(){
// Sample data
int[] a = new int[10000000];
for (int i = 0; i < a.length; i++) {
a[i] = i * 3 / 2 + 1;
if (i % 3 == 0)
a[i] = -a[i];
}

long t1 = System.currentTimeMillis();
quicksort(a);
long t2 = System.currentTimeMillis();
return t2 - t1;
}
}
Decompiling the Scala code shows that there is no benefit of using the @specialized annotation since the Scala compiler compiles all the ints to primitives. Here is the Scala class decompiled code:
package quicksort;
import java.rmi.RemoteException;
import scala.*;
import scala.runtime.*;
public final class QuicksortScala$ implements ScalaObject{
public QuicksortScala$(){}
private final void sort1$1(int l, int r, int ai[]) {
do{
int pivot = ai[(l + r) / 2];
int i = l;
int j = r;
do{
if(i > j) break;
for(; ai[i] < pivot; i++);
for(; ai[j] > pivot; j--);
if(i <= j) {
swap$1(i, j, ai);
i++;
j--;
}
} while(true);
if(l < j) sort1$1(l, j, ai);
if(j < r)l = i;
else return;
} while(true);
}
private final void swap$1(int i, int j, int ai[]){
int t = ai[i];
ai[i] = ai[j];
ai[j] = t;
}
public long doSort(){
ObjectRef a$1 = new ObjectRef(new int[0x989680]);
IntRef i$1 = new IntRef(0);
(new BoxedIntArray((int[])a$1.elem)).foreach(new anonfun.doSort._cls1(a$1, i$1));
long t1 = System.currentTimeMillis();
quicksort((int[])a$1.elem);
long t2 = System.currentTimeMillis();
return t2 - t1;
}
public void main(String args[]){
LongRef time$1 = new LongRef(0xffffffffL);
Predef$.MODULE$.intWrapper(0).to(100).foreach(new anonfun.main._cls1(time$1));
Predef$.MODULE$.println((new StringBuilder()).append("Scala time = ").append(BoxesRunTime.boxToLong(time$1.elem)).toString());
}
public void quicksort(int xs$1[]){
sort1$1(0, xs$1.length - 1, xs$1);
}
public int $tag() throws RemoteException {
return scala.ScalaObject.class.$tag(this);
}
public static final QuicksortScala$ MODULE$ = this;
static { new QuicksortScala$(); }
}
I tried to compile and run the code in Scala 2.8 anyway and the results where exactly the same as with running it under Scala 2.7.5.
The numbers:
Scala: 1355ms
Java: 1265ms

==> Java was 7% faster then Scala (Closer gap then the 33% in the previous benchmark).

UPDATE
Ismael found a bug in the Scala Code, I updated Nick as well. The line
      if (j < r) sort1(i, r)
should have been
      if (i < r) sort1(i, r)
Changing the line made Java and Scala be equal in performance.
==> The Scala code above is no slower or faster then Java (as should be in this specific case).

11 comments:

Nick Wiedenbrueck July 2, 2009 at 10:14 PM  

Okay, that was just a typo in my published code segment. I really compared 100.000.

I also tried to figure out the influence of HotSpot, but I couldn't see any remarkable improvements. And also, in my benchmark Java is 30% faster.

Ismael Juma July 3, 2009 at 12:34 AM  

Hi Eishay,

It's not surprising that @specialized doesn't help much here. As far as I can see, while loops and Array[Int] are used almost everywhere (I only found one exception).

Scala is as fast as Java if you write it like Java (this means using while loops since Scala doesn't have an equivalent of Java's for loops).

The point of @specialized is to help cases where the code is higher level and you use anonymous functions extensively as well as higher-level collections. In the latter case, @specialized has the potential to allow Scala to do better than Java when using collections (not arrays) from the standard library (due to boxing on the Java side). Note that I said _potential_. :)

Best,
Ismael

Eishay Smith July 3, 2009 at 9:59 AM  

@Nick I figured its a typo, just teasing ;-)

@Ismael did you mean you need to cripple Java so it won't use for loops in order to make it as slow as Scala? Is there a way to make similar Scala code as fast as Java in spite of the lack of for loops?

Ismael Juma July 3, 2009 at 10:25 AM  

Of course not, Eishay. That would be ridiculous.

I said that you have to use a while loop in Scala. A while loop (in Scala or Java) performs as well as a for loop in Java.

Ismael

Eishay Smith July 3, 2009 at 10:58 AM  

But Ismael, the Scala code in the example does use a while loop (compiled down to a for loop).
I'm trying to change the Scala code to close the performance gap between it and the Java code. Any ideas?

Ismael Juma July 3, 2009 at 12:44 PM  

Eishay,

I understand that while loops are used in this benchmark. That is why I said that specialized would not help.

What does help is using the same code for both versions. :)

Here are some suggestions (they make the Scala code closer to the Java one and made Scala's version faster when I tested):

- Replace the inner methods with normal ones.

- Change Scala's version to use a class instead of object and create an object that simply creates the class and calls main on it.

- The line that starts with "if (j < r)" is different from the Java code. The Java code is "if (R > i)".

Does that make any difference?

Ismael

Eishay Smith July 3, 2009 at 11:15 PM  

You got it Ismael!

Only the last suggestion (the bug you found) made any difference. After solving it Scala and Java matched up. Actually Scala was 2ms but considering the non-realtime OS, 0.1% performance difference is meaningless. I'll update the post.

Anonymous July 6, 2009 at 1:10 PM  

Is there additional information available regarding the @specialized annotation? From what I understand it would have to be used when defining the generic type. Currently I can't find any usage in the scala 2.8 library. Who can tell more about that interesting thing?

Yours,
Stefan

Eishay Smith July 11, 2009 at 11:21 PM  

Hi Stefan, check out Dean's post where he discuss @specialized.

Eishay Smith July 11, 2009 at 11:23 PM  

Sorry, here is the link: http://blog.objectmentor.com/articles/2009/06/05/bay-area-scala-enthusiasts-base-meeting-whats-new-in-scala-2-8

phil February 6, 2010 at 6:20 PM  

Can you compare it with machine code?

I always wonder if I programmed it in assembly, if it would be very much faster or not.

Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.