Monday, May 04, 2009

Beware of Scala’s type inference!

Scala's type inference can be unpleasant, creating problems that do not exist in Java (no inference) or in dynamically typed languages.

To be clear, Scala's type inference is awesome! Yet with great power comes great responsibility, and the library writer, and to some extent the consumer too, should know which rough edges to beware of.

Examine this library method, which returns a Map:

class MyLib {
  def getMap = Map("a" -> "b")
}

And the client code that reads the map:

class MyClient {
  def useLib(): Unit = {
    val lib = new MyLib
    val map = lib.getMap
    useMap(map)
  }
  def useMap(map: Map[String, String]) = println("useMap " + map)
}

Looks nice and definitely works. After a while the library switches to a mutable HashMap for some imperative reason:

import scala.collection.mutable.HashMap

class MyLib {
  def getMap = {
    val map = new HashMap[String, String]()
    map += ("a" -> "b")
    map += ("c" -> "d")
    map
  }
}

Looks harmless enough, but when we run the client we get:

java.lang.NoSuchMethodError: MyLib.getMap()Lscala/collection/immutable/Map;
    MyClient.useLib(MyClient.scala)

What happened? The client used the Map trait, but its code was compiled with a dependency on the underlying implementation. Here is the library's Java representation, as shown by javap:

public class MyLib extends java.lang.Object implements scala.ScalaObject{
  public test1.MyLib();
  public scala.collection.mutable.HashMap getMap();
  public int $tag() throws java.rmi.RemoteException;
}

Decompiling the client with JAD, on the other hand, gives us:

import java.rmi.RemoteException;
import scala.*;
import scala.collection.Map;

public class MyClient implements ScalaObject{
  public MyClient(){}
  public void useMap(Map map){
    Predef$.MODULE$.println((new StringBuilder()).append("useMap ").append(map).toString());
  }
  public void useLib() {
    MyLib lib = new MyLib();
    scala.collection.immutable.Map map = lib.getMap();
    useMap(map);
  }
  public int $tag() throws RemoteException{
    return scala.ScalaObject.class.$tag(this);
  }
}

So although the useMap method uses scala.collection.Map, the compiled useLib references scala.collection.immutable.Map. It seems that the compiler could have used scala.collection.Map there and thereby dodged some of the problem.
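
The failure comes from the fact that the JVM links a call by method name plus the full descriptor, return type included. As a rough illustration, the sketch below asks java.lang.invoke for a getMap with one exact return type, which approximates what the stale client requests at link time. ChangedLib, DescriptorDemo and links are hypothetical names introduced for this sketch; they are not part of the example above.

```scala
import java.lang.invoke.{MethodHandles, MethodType}
import scala.collection.mutable

// Hypothetical stand-in for the changed library: the return type is inferred
// from the body, so the compiled signature returns mutable.HashMap.
class ChangedLib {
  def getMap = {
    val map = new mutable.HashMap[String, String]()
    map += ("a" -> "b")
    map
  }
}

object DescriptorDemo {
  // True if a no-argument getMap with exactly this return type can be found.
  // The lookup matches on the exact method type; a supertype does not qualify.
  def links(returnType: Class[_]): Boolean =
    try {
      MethodHandles.lookup()
        .findVirtual(classOf[ChangedLib], "getMap", MethodType.methodType(returnType))
      true
    } catch {
      case _: NoSuchMethodException => false
    }

  def main(args: Array[String]): Unit = {
    // What the old client was compiled against: not found any more.
    println(links(classOf[scala.collection.immutable.Map[_, _]]))
    // What the library now actually exposes: found.
    println(links(classOf[mutable.HashMap[_, _]]))
  }
}
```

The first lookup fails even though nothing in the client's source mentions the immutable Map type, which mirrors the NoSuchMethodError above.
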
Suppose we change the library to declare its return type explicitly:

import scala.collection.mutable.HashMap
import scala.collection.Map

class MyLib {
  def getMap: Map[String, String] = {
    val map = new HashMap[String, String]()
    map += ("a" -> "b")
    map += ("c" -> "d")
    map
  }
}

This provides the interface (using javap):

public class test1.MyLib extends java.lang.Object implements scala.ScalaObject{
  public test1.MyLib();
  public scala.collection.Map getMap();
  public int $tag() throws java.rmi.RemoteException;
}

This gives the library writer the flexibility to change the Map implementation without breaking the client's code. Another solution could be a defensive client:

import scala.collection.Map

class MyClient {
  def useLib(): Unit = {
    val lib = new MyLib
    val map: Map[String, String] = lib.getMap
    useMap(map)
  }
  def useMap(map: Map[String, String]) = println("useMap " + map)
}

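It looks ugly, but the client's source keeps compiling unchanged when the library changes the map implementation. Keep in mind that the compiled call site still records the return type the library declared when the client was built, so a client that is not recompiled can still fail with NoSuchMethodError; the annotation helps at the source level only.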

Coding conventions should mandate explicit return types for external APIs. Obviously it's also a good practice for readability and self-documenting code. This issue is especially relevant for large projects that break up to binary dependent modules.
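
One way to keep such a convention honest is a small check in the library's own tests that pins the return type recorded in the compiled signature. The following is a minimal sketch using Java reflection; InferredLib, ExplicitLib and SignatureCheck are hypothetical names introduced for illustration.

```scala
import scala.collection.mutable

// Hypothetical library variant relying on inference:
// the compiled signature follows whatever the body returns.
class InferredLib {
  def getMap = mutable.HashMap("a" -> "b")
}

// Hypothetical library variant with the return type spelled out:
// the compiled signature stays scala.collection.Map whatever the body builds.
class ExplicitLib {
  def getMap: scala.collection.Map[String, String] = mutable.HashMap("a" -> "b")
}

object SignatureCheck {
  // Name of the erased return type stored in the class file for getMap.
  def returnTypeOf(cls: Class[_]): String =
    cls.getMethod("getMap").getReturnType.getName

  def main(args: Array[String]): Unit = {
    println(returnTypeOf(classOf[InferredLib])) // scala.collection.mutable.HashMap
    println(returnTypeOf(classOf[ExplicitLib])) // scala.collection.Map
  }
}
```

A test asserting on returnTypeOf would fail as soon as an implementation change leaks into the public signature, before any client jar is affected.
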

Addendum

In view of the comments I need to add a clarification:
Java will also fail at build time (and at runtime). To be clearer: in Java, changing the source of one class does not change the compiled output of other classes (though it could break their compilation). I don't know much about dynamic languages, but I believe the case is similar there (please correct me if I'm wrong). In Scala, on the other hand, if you rely on inferred types all around (types that are defined implicitly, nothing to do with Scala's implicits construct), then changing the source code of a library class does change the compiled output of the consumer class, and this is the dangerous part.

In many projects there are binary dependencies between libraries / modules. In fact, most projects depend on some sort of external library and link to its jar. Compiled versions of the modules are kept in a repository, and unless their code changes they will not be rebuilt. Let's say I wish to upgrade a module from v1.0 to v2.0, and that module is used by a client at v1.0. I will build it with all the others and run some tests, and if all goes well I will assume that the module is backward compatible and add it to the binary repository. Now the folks who build the product put all the jar files together, and they break at runtime!

Now, this is not a big deal in itself. The bigger deal is that the module is not really backward compatible, and you may not have all the sources handy to rebuild everything from scratch, as in the unfortunate case of using a library that is not open source. Even if you do have the sources, it's a pain.

9 comments:

Ismael Juma May 4, 2009 at 11:18 PM  

Hi Eishay,

I agree that using explicit result types is a good idea for public APIs.

As you say, with implicit result types, it's too easy to change the signature of the method without noticing, causing a binary-incompatible change as a result.

It should be noted that if the client recompiled the code, a compiler error would occur (instead of the runtime error). That is probably what you had in mind when you said:

"This issue is especially relevant for large projects that break up to binary dependent modules."

Eishay Smith May 4, 2009 at 11:27 PM  

Right Ismael,

I should have added that this and some of my other posts are aimed at using Java in large code bases where you may not own all the code or be able to recompile it all (some of it may be in a Maven-like repository).

Anonymous May 6, 2009 at 9:21 AM  

This is no different than in Java. If you change the return type in a third party library and attempt to run it, Java too will fail at runtime.

Eishay Smith May 6, 2009 at 10:28 AM  

Java will fail at build time (and runtime). To be clearer: in Java, changing the source of one class does not change the compiled output of other classes (though it could break their compilation). I don't know much about dynamic languages, but I believe the case is similar there (please correct me if I'm wrong). In Scala, on the other hand, if you use implicit all around, then changing the source code of a library class does change the compiled output of the consumer class, and this is the dangerous part.
In many large projects there are binary dependencies between libraries / modules. Compiled versions of the modules are kept in a repository, and unless their code changes they will not be rebuilt. Let's say I wish to upgrade a module from v1.0 to v2.0, and that module is used by a client at v1.0. I will build it with all the others and run some tests, and if all goes well I will assume that the module is backward compatible and add it to the binary repository. Now the folks who build the product put all the jar files together, and they break at runtime!
Now, this is not a big deal in itself. The bigger deal is that the module is not really backward compatible, and you may not have all the sources handy to rebuild everything from scratch, as in the unfortunate case of using a library that is not open source. Even if you do have the sources, it's a pain.

Anonymous May 6, 2009 at 12:37 PM  

"Java will fail at build time (and runtime)"
Scala will fail at the same points under the same conditions. There is absolutely no difference.
Also, there's no mention of using implicits, which could lead to unexpected behavior, though not of the kind mentioned here.

Eishay Smith May 6, 2009 at 1:09 PM  

mmm, not sure we understand each other. When a library changes an implicit return type, the client code does not necessarily have to change (as in the example in the post), and neither the Java nor the Scala compiler should fail when recompiling the client. But in this case the client's compiled bytecode actually changes, which does not happen in Java since there are no implicits. Having the client bytecode change without the source code being modified is not typical of Java code (unless you switch compilers) and, as in the example, if you don't replace the client binary in your runtime then it might break there.

Anonymous May 7, 2009 at 6:00 AM  

Your code example does not use implicits. It infers the type, which is no different than in Java, where you enter it manually.
In both cases, if the library changes the type, the dependent code will fail. Same for Scala as for Java.

Eishay Smith May 7, 2009 at 9:06 AM  

You're right, there are no implicits involved. Implicit in Scala is an overloaded word; I shouldn't have used this term. When you're not explicitly defining a return type (using type inference), the return type of the method signature is implicitly defined by the actual return type, which has nothing to do with Scala's implicits construct.
In the example, the client uses the library's return value as a Map (not as the mutable HashMap that is returned). If it had explicitly declared val map as {val map : Map[String, String]}, as you would do in Java, i.e. not relying on type inference, then there would not be any problem if the library changed the implementation from immutable to mutable. In other words, if you do not clearly define the value you receive from the library at the point where the client receives it, and you don’t recompile your client when the library changes (though you don’t change your client code), then they will break at runtime.

Anonymous May 8, 2009 at 6:52 AM  

There are two options here, for both Java and Scala:
1. You change your return type on your library, but not the dependent code, i.e. no recompile.
2. You change your return type and recompile your dependent code against the new return type.

Can we agree that for 1., both Scala and Java will throw exception at runtime?

For 2, you will get compile errors in both cases, hence my argument that Scala is no different than Java.
Scala will reinfer the type and find an unknown method call. It will complain about this at compile time.
Java is the same, except you'll first get a type error, which you then fix manually and then the unknown method call.

If you still disagree, I would urge you to actually try it out.

Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.