Saturday, June 20, 2009

Google App Engine Data Store API is definitely beta

I was very surprised to see that when you query the data store service for a key that does not exist, the service throws an exception instead of returning null or indicating the absence of a value in some other way. It definitely needs a code review from Josh Bloch. Josh wrote in his Effective Java: "Exceptions ... should never be used for ordinary control flow".
In Scala this makes things very ugly. Instead of doing:

ds.get(queryKey) match {
  case null => {
    logger.info("did not get it: " + queryKey.toString)
    ...
  }
  // A type pattern never matches null, so no extra null guard is needed.
  case entity: Entity => {
    logger.info("got it: " + entity.toString)
    ...
  }
}
I have to go with something like:
try {
  val entity = ds.get(queryKey)
  logger.info("got it: " + entity.toString)
  ...
}
catch {
  // The "not found" case arrives as an exception rather than a value.
  case ex: EntityNotFoundException => {
    logger.info("did not get it: " + queryKey.toString)
    ...
  }
}
This can totally disrupt a nice chain of pattern matching.
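One way to get the pattern matching back is to hide the try/catch behind a small helper that returns an Option. The following is only a minimal sketch: FakeStore and EntityNotFound are hypothetical stand-ins I made up so the code runs without the App Engine SDK, and they only mimic the "throws when the key is missing" behavior described above.

```scala
// Hypothetical stand-in for the SDK's "not found" exception.
class EntityNotFound(key: String) extends Exception("No entity for key: " + key)

// Hypothetical stand-in for the data store: throws instead of returning null.
class FakeStore(data: Map[String, String]) {
  def get(key: String): String =
    data.getOrElse(key, throw new EntityNotFound(key))
}

object OptionWrapper {
  // Turn the "not found" exception into None; any other exception still propagates.
  def lookup(store: FakeStore, key: String): Option[String] =
    try Some(store.get(key))
    catch { case _: EntityNotFound => None }

  // The caller is back to a plain pattern match.
  def describe(store: FakeStore, key: String): String =
    lookup(store, key) match {
      case Some(entity) => "got it: " + entity
      case None         => "did not get it: " + key
    }
}
```

The exception is still thrown under the hood, so this only moves the ugliness into one place; it does not remove the cost of using exceptions for control flow.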
Keeping an eye on Issue 1961

Sunday, June 14, 2009

No Scala for GWT

I'm working through my buzzword-compliant application, which naturally includes cloud computing (Google App Engine), GWT, Scala and the rest of the social Web 2.0 hyped BS. Wow, that was a keyword-loaded sentence ;-)

My experience so far has been that you can replace any Java code with Scala, but GWT is a nasty exception. When you try to implement a GWT EntryPoint in Scala, you get this GWT compilation error:

Checking rule <generate-with class='com.google.gwt.user.rebind.ui.ImageBundleGenerator'>
Checking if all subconditions are true <all>
<when-assignable class='com.google.gwt.user.client.ui.ImageBundle'>
[ERROR] Unable to find type 'com.newspipes.client.Newspipes'
[ERROR] Hint: Check that the type name 'com.newspipes.client.Newspipes' is really what you meant
[ERROR] Hint: Check that your classpath includes all required source roots
The error is a bit confusing since the class is on the classpath and you can see the compiled *.class file under WEB-INF/classes. The GWT compiler compiles Java source directly to JavaScript, and it checks that the sources it compiles are *.java files. So instead of "Unable to find type..." it actually means "Unable to find the Java source file of type...".

Tuesday, June 09, 2009

Unexpected repeated execution in Scala

The following Scala Easter egg is one of the most dangerous "features" of the language.

I know there might be flaming involved, but be sure it's not my intention. I like Scala a lot and I'm happily using it in production code. This post is a result of the following discussions: Seq repeated execution unexpected behavior and Strange behavior of map function depending on first argument

The following code is using a method that returns a Seq[Int] containing random numbers and acts on them (prints the first one).

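// scala.util.Random is assumed here; java.util.Random would behave the same way.
import scala.util.Random

// Assumes Scala 2.7.x, where Seq has first.
object SeqTest {
  def main(args: Array[String]) {
    val randomInts : Seq[Int] = mkRandomInts()
    println(randomInts.first)
    println(randomInts.first)

    // Copying into a List evaluates each element once.
    val randomIntsList = randomInts.toList
    println(randomIntsList.first)
    println(randomIntsList.first)
  }

  def mkRandomInts() = {
    val randInts = for {
      i <- 1 to 3
      val rand = i + (new Random).nextInt
    } yield rand
    randInts
  }
}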
As you can see from the output below, the sequence is reevaluated each time it is accessed, returning a different random number each time. Once the code transforms the sequence into a List, the results are stable.
-1867060800
312920158
-133186413
-133186413
Decompiling the code shows that mkRandomInts returns a scala.RandomAccessSeq.Projection, which is a result of the Range we created using
i <- 1 to 3
As Jorge explained it: the "gotcha" is that, because of the way Scala's collections work, Range's laziness is "contagious" when you use functional for-comprehensions (for ... yield ...) and a Range is the first thing in the comprehension.
I'm not sure the Scaladoc explains that so well.
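To make the repeated execution visible without random numbers, here is a minimal sketch that counts how many times the mapping function runs. It is not the 2.7 code above: it assumes Scala 2.8 or later and asks for the laziness explicitly with view, and the names RepeatedEvaluation and run are made up for the illustration.

```scala
object RepeatedEvaluation {
  // Returns (evaluations triggered by two reads through the view,
  //          evaluations triggered by copying to a List and reading twice).
  def run(): (Int, Int) = {
    var calls = 0
    // With a view, the mapping function runs only when an element is read.
    val lazySeq = (1 to 3).view.map { i => calls += 1; i * 10 }

    val a = lazySeq.head
    val b = lazySeq.head
    val viaView = calls

    calls = 0
    val strict = lazySeq.toList // evaluates each element exactly once
    val c = strict.head
    val d = strict.head
    assert(a == b && c == d)
    (viaView, calls)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Each read through the view runs the function again, so the first count grows with every access, while the List pays for its three elements once and never again.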

Martin and others explained that this is done to conserve memory, since we don't really want to hold a list of 100000 elements when doing (0 to 100000). This is all nice and true, but it is not a good reason to have repeated execution each time the code accesses a data structure derived from iterating over a range.

The example above is not about clean/nice/efficient code; it's about the principle of least surprise (POLS). This behavior will cause (and in my case did cause) very hard-to-track bugs. It is also undocumented behavior, in spite of appearing in a very common pattern equivalent to Java's
for (int i = 0; i < 100000; i++)
The implicit side effects of such an optimization may be unacceptable. For example, the code executed in the for loop might change state in a DB or the file system, or be CPU intensive. In the latter case, it would be very hard to understand why the application is so slow when repeatedly accessing elements of a sequence.

If Seq had a force() method, defensive code could call it in such cases, but it does not have one (only RandomAccessSeq.Projection does).
For example, every Java InputStream has close(), and Java programmers are accustomed to closing any stream they use in a finally block "just in case", even if for some of them (e.g. StringBufferInputStream) close() does nothing. But if we teach programmers to force a Seq anywhere they see one, we create more boilerplate mess, and we don't want that in Scala :-)

Having such a case forces the programmer to know about it and be ready for it => more potential bugs => an even steeper learning curve.

To conclude, the problem is twofold:
First, a chunk of code gets executed (without explicit instruction) only when its derived output is accessed.
Second, even if lazy is cool and expected, repeated execution is not. We have lazy data structures all around, but they usually cache the data once they fetch it, or, in the case of lazy iterators (like in JDBC), you need to explicitly recreate them.
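For contrast, this is the kind of laziness I would expect: evaluated on first access and cached afterwards. Scala's own lazy val already works this way. The sketch below uses a made-up class, Report, with a counter so the single evaluation can be observed.

```scala
class Report {
  var evaluations = 0

  // The body runs on the first access only; later accesses reuse the cached List.
  lazy val values: List[Int] = {
    evaluations += 1
    (1 to 3).toList.map(_ * 10)
  }
}

object CachedLazyDemo {
  def main(args: Array[String]): Unit = {
    val report = new Report
    println(report.values.head)
    println(report.values.head)
    println("evaluations: " + report.evaluations)
  }
}
```

Nothing is computed until values is first read, and reading it again does not run the body a second time.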

This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.