Saturday, May 09, 2009

Iterating over Map with Scala

Here is a summary of the discussion on iterating over a Scala Map, started by Ikai on the Scala mailing list.
Here is the Map we wish to iterate over.

scala> val m1 = Map[Int, Int](1->2, 2->3, 4->50)                
m1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 3, 4 -> 50)
The options are:
Accessing the Tuple2 object structure.
scala> m1 foreach ( (t2) => println (t2._1 + "-->" + t2._2))
Using case on the Tuple2
scala> m1 foreach {case (key, value) => println (key + "-->" + value)}
Using a for loop
scala> for ((key, value) <- m1) println (key + "-->" + value) 
Advanced: using unapply to expand the Tuple and use it in the match:
scala> object -> { def unapply[A,B](x : (A,B)) : Option[(A,B)] = Some(x) }
defined module $minus$greater

scala> m1 foreach {case k->v => println (k + "-->" + v)}
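For reference, the four options above can be collected into one compilable sketch. The object name MapIterationSketch and the helper names are hypothetical, and the sketch builds strings with interpolation (not available when this post was written) instead of printing, so the results can be compared with each other.

```scala
object MapIterationSketch {
  val m1: Map[Int, Int] = Map(1 -> 2, 2 -> 3, 4 -> 50)

  // Extractor mirroring the `->` object defined in the post.
  object -> {
    def unapply[A, B](x: (A, B)): Option[(A, B)] = Some(x)
  }

  // 1. Accessing the Tuple2 fields directly.
  def viaFields: List[String] =
    m1.toList.map(t2 => s"${t2._1}-->${t2._2}")

  // 2. Destructuring the Tuple2 with a case clause.
  def viaCase: List[String] =
    m1.toList.map { case (key, value) => s"$key-->$value" }

  // 3. A for comprehension with a tuple pattern.
  def viaFor: List[String] =
    (for ((key, value) <- m1) yield s"$key-->$value").toList

  // 4. Destructuring through the custom `->` extractor.
  def viaArrow: List[String] =
    m1.toList.map { case k -> v => s"$k-->$v" }

  def main(args: Array[String]): Unit =
    viaCase.foreach(println)
}
```

All four helpers should produce the same entries; which one to use is mostly a matter of taste.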
I don't know why the next one does not work, any idea?
scala> m1 foreach (println (_._1 + "-->" + _._2))           
:6: error: missing parameter type for expanded function ((x$1, x$2) => x$1._1.$plus("-->").$plus(x$2._2))
m1 foreach (println (_._1 + "-->" + _._2))
If you know of other options, please add.


David R. MacIver May 10, 2009 at 4:38 AM  

The last one doesn't work because it translates to m1.foreach((x, y) => x._1 + "-->" + y._2). The two _s each introduce a parameter.
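A small sketch of the point that each underscore introduces its own parameter. The object and value names here are made up for illustration.

```scala
object UnderscoreArity {
  // Two underscores are shorthand for a two-parameter function:
  // (a, b) => a + b
  val sum: (Int, Int) => Int = _ + _

  // reduceLeft expects a two-parameter function, so `_ + _` fits.
  val total: Int = List(1, 2, 3).reduceLeft(_ + _)

  // A Map hands each entry over as ONE Tuple2 argument, so the
  // parameter has to be named once and used twice.
  val m1: Map[Int, Int] = Map(1 -> 2, 2 -> 3, 4 -> 50)
  val lines: List[String] = m1.toList.map(t => s"${t._1}-->${t._2}")
}
```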

mccv May 10, 2009 at 7:30 AM  

Dr. MacIver... can you clarify that a bit? Let's say I reduce it to a single argument, with m1 foreach (println(_.productArity)). This fails with the same error. Is something going on under the covers with the tuple that requires adding the parameter statement?

Eishay Smith May 10, 2009 at 8:10 AM  

I also didn't get it. Isn't (x, y) a Tuple2 ?
These are the areas where Scala's type system is less than intuitive.

David R. MacIver May 10, 2009 at 2:51 PM  

Even if (x, y) were a tuple (it's not here - it's a parameter list of two elements. It's a shame that these aren't the same thing, but not that confusing), it would be a tuple of the wrong sort because you're accessing elements x._1 and y._2, so it would be a tuple of tuples.

The problem with println(_.productArity) isn't actually the same thing. The problem is that the expansion happens within the next level out of brackets. Imagine if you'd written foo(println(_.productArity)) instead. Would you expect this to evaluate as foo(t => println(t.productArity)) or foo(println(t => t.productArity))? The latter is the only one that makes sense, and is what's happening here (the fact that it then can't infer a decent type parameter for t is a bit annoying, but separate).

David R. MacIver May 10, 2009 at 2:52 PM  

Also, I'm not a Dr. I'm David R. MacIver. Initials, not title.

mccv May 10, 2009 at 7:10 PM  

Ah, that makes sense. Tricky with the argument list looking like a tuple...

Eishay Smith May 10, 2009 at 9:06 PM  

David, here t2 creates a Tuple2:
m1 foreach ( (t2) => println (t2._1 + "-->" + t2._2))
I thought that this is the case here too and I could at least do this (which I can't):
m1 foreach (println ("-->" + _._2))
So what is _ in the line above?

David R. MacIver May 11, 2009 at 3:14 AM  

The problem is that

m1 foreach (println ("-->" + _._2))

expands as

m1 foreach (println (x => "-->" + x._2))

rather than

m1 foreach (x => println ("-->" + x._2))

as you want it to.
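A minimal sketch of the form that does work: name the parameter explicitly so the function wraps the whole call. The render helper is a hypothetical stand-in for println that returns its text, so the result can be inspected.

```scala
object ExplicitParameter {
  val m1: Map[Int, Int] = Map(1 -> 2, 2 -> 3, 4 -> 50)

  // Hypothetical stand-in for println: accepts Any, returns the text.
  def render(x: Any): String = x.toString

  // The explicit parameter x makes the lambda enclose the call to render.
  val viaLambda: List[String] =
    m1.toList.map(x => render("-->" + x._2))

  // A case clause gives the same result and names the value directly.
  val viaCase: List[String] =
    m1.toList.map { case (_, value) => render("-->" + value) }
}
```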

Daniel May 11, 2009 at 7:21 AM  

This kind of thing has eluded me for a bit too, until I realized I can't put a wildcard inside parentheses.

This will work:

List(1,2,3).foreach(Console println _+1)

While this will not:

List(1,2,3).foreach(Console println (_+1))

Curiously, this will work too:

List(1,2,3).foreach(Console println (_))
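The same three cases, sketched with a hypothetical show helper in place of Console.println so the values can be checked. The rejected form is left as a comment, since it does not compile in the REPL session above.

```scala
object PlaceholderScope {
  val xs: List[Int] = List(1, 2, 3)

  // Hypothetical stand-in for Console.println: accepts Any, returns the text.
  def show(x: Any): String = x.toString

  // Underscore in an unparenthesized infix expression:
  // the whole expression becomes x => show(x + 1).
  val infix: List[String] = xs.map(this show _ + 1)

  // Underscore as the entire argument: the call becomes x => show(x).
  val whole: List[String] = xs.map(show(_))

  // Rejected, as in the comment above: the parentheses make `_ + 1`
  // its own function, whose parameter type cannot be inferred.
  //   xs.map(show(_ + 1))

  // Naming the parameter avoids the problem.
  val named: List[String] = xs.map(x => show(x + 1))
}
```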

Daniel May 11, 2009 at 7:40 AM  

Let me expand a bit. I might be wrong -- I'm definitely not an expert in Scala, but still...

When you use "_" inside parentheses, it turns the expression inside those parentheses into a function.

For example, (1+1) is an expression, but (_+1) is a function receiving one parameter. Or, in other words, (_+1) is ((x$1) => x$1.$plus(1)).

Now, the type of (1+1) is Int. The type of ((x$1 : Int) => x$1.$plus(1)) is (Int) => Int. The type of ((x$1) => x$1.$plus(1)) must be inferred, though.

So, let's go back to List(1,2,3).foreach(Console println(_ + 1)). You are passing a function to println, for which one type must be inferred.

Now, foreach, in this example, takes an argument of type (Int) => Unit. So, we can expand that to:

List(1,2,3).foreach((x$1 : Int) => Console println(_ + 1))

Notice that the x$1 does not replace the "_" inside the parenthesis, because THAT "_" is reserved for the expansion that will happen for println, not the expansion that happened for foreach.

We'll now expand the second parenthesis, using "y" instead of "x", for clarity:

List(1,2,3).foreach((x$1 : Int) => Console.println((y$1) => y$1.$plus(1)))

Now we have to infer the type of y$1. Let's start with the type of println, see if that helps us:

def println(x : Any) : Unit

Ok, so the result of y$1.$plus(1) must be of type Any. Knowing that, can we know what's the type of y$1? Obviously, not.

But what happens with the type we DO know, that of x$1? It's simple... x$1 is not getting used inside println.

This is all a bit confusing, but I hope it helps understand what's happening. And if I'm wrong in my explanation, I *sure* hope someone more knowledgeable comes and corrects me! :-)
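As a rough illustration of the inference point, here are two ways the placeholder's type can be supplied: through the expected type, or by annotating the underscore itself. The names are invented for this sketch.

```scala
object PlaceholderTypes {
  // The declared type Int => Int tells the compiler what `_` is.
  val inc: Int => Int = _ + 1

  // With no expected type, the placeholder can carry its own annotation.
  val incAnnotated = (_: Int) + 1

  // Either function can then be passed where an Int => Int is expected.
  val bumped: List[Int] = List(1, 2, 3).map(inc)
}
```

When the expected type is Any, as with println, neither source of type information is present, which matches the error discussed above.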

Eishay Smith May 11, 2009 at 9:10 AM  

Thanks Daniel & David, that's the type of explanation I was looking for.

Anonymous August 12, 2013 at 7:34 PM  

Great explanation, especially when I compare this to traditional ways of Map iteration in Java.

Creative Commons License This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.