Saturday, May 09, 2009

Iterating over Map with Scala

Here is a summary of the discussion on iterating over a Scala Map, started by Ikai on the Scala mailing list.
Here is the Map we wish to iterate over.

scala> val m1 = Map[Int, Int](1->2, 2->3, 4->50)                
m1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 3, 4 -> 50)
The options are:
Accessing the Tuple2 object structure.
scala> m1 foreach ( (t2) => println (t2._1 + "-->" + t2._2))
1-->2
2-->3
4-->50
Using case on the Tuple2
scala> m1 foreach {case (key, value) => println (key + "-->" + value)}
1-->2
2-->3
4-->50
Using a for loop
scala> for ((key, value) <- m1) println (key + "-->" + value) 
1-->2
2-->3
4-->50
Advanced: using unapply to expand the Tuple and use it in the match:
scala> object -> { def unapply[A,B](x : (A,B)) : Option[(A,B)] = Some(x) }
defined module $minus$greater

scala> m1 foreach {case k->v => println (k + "-->" + v)}
1-->2
2-->3
4-->50
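For comparison, here is a minimal sketch that gathers the four options above into one compilable object. It is only an illustration under a few assumptions: the name MapIterationSketch and the show helper are made up for this example, map is used instead of foreach so the results can be compared, and string interpolation replaces the + concatenation.

```scala
object MapIterationSketch {
  val m1 = Map[Int, Int](1 -> 2, 2 -> 3, 4 -> 50)

  // Hypothetical helper: formats an entry the way the post prints it.
  def show(key: Int, value: Int): String = s"$key-->$value"

  // Same extractor as in the post: it hands the pair back unchanged,
  // which lets a pattern be written as `k -> v`.
  object -> { def unapply[A, B](x: (A, B)): Option[(A, B)] = Some(x) }

  // 1. Accessing the Tuple2 fields directly.
  val viaAccessors: List[String] = m1.toList.map(t2 => show(t2._1, t2._2))

  // 2. Pattern-matching anonymous function on the Tuple2.
  val viaCase: List[String] = m1.toList.map { case (key, value) => show(key, value) }

  // 3. for comprehension with a tuple pattern.
  val viaFor: List[String] = for ((key, value) <- m1.toList) yield show(key, value)

  // 4. The custom -> extractor used in the pattern.
  val viaExtractor: List[String] = m1.toList.map { case k -> v => show(k, v) }
}
```

All four values should hold the same three strings, since each option only differs in how the key and value are pulled out of the pair.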
I don't know why the next one does not work, any idea?
scala> m1 foreach (println (_._1 + "-->" + _._2))           
<console>:6: error: missing parameter type for expanded function ((x$1, x$2) => x$1._1.$plus("-->").$plus(x$2._2))
m1 foreach (println (_._1 + "-->" + _._2))
^
If you know of other options, please add them.

11 comments:

David R. MacIver May 10, 2009 at 4:38 AM  

The last one doesn't work because it translates to m1.foreach((x, y) => x._1 + "-->" + y._2). The two _s each introduce a parameter.

mccv May 10, 2009 at 7:30 AM  

Dr. MacIver... can you clarify that a bit? Let's say I reduce it to a single argument, with m1 foreach (println(_.productArity)). This fails with the same error. Is something going on under the covers with the tuple that requires adding the parameter statement?

Eishay Smith May 10, 2009 at 8:10 AM  

I also didn't get it. Isn't (x, y) a Tuple2 ?
These are the areas where Scala's type system is less than intuitive.

David R. MacIver May 10, 2009 at 2:51 PM  

Even if (x, y) were a tuple (it's not here - it's a parameter list of two elements. It's a shame that these aren't the same thing, but not that confusing), it would be a tuple of the wrong sort because you're accessing elements x._1 and y._2, so it would be a tuple of tuples.
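The difference between a two-element parameter list and a single tuple parameter can be sketched roughly as follows. This assumes the Scala 2 behavior discussed in this thread; the object and value names are invented for the illustration.

```scala
object ParamListVsTupleSketch {
  // Two parameters: this is a Function2.
  val twoParams: (Int, Int) => String = (x, y) => s"$x-->$y"

  // One parameter that happens to be a Tuple2: this is a Function1.
  val oneTuple: ((Int, Int)) => String = t => s"${t._1}-->${t._2}"

  // Function2#tupled adapts the two-parameter form to the one-tuple form.
  val adapted: ((Int, Int)) => String = twoParams.tupled

  val m1 = Map(1 -> 2, 2 -> 3, 4 -> 50)

  // A Map hands each entry over as one Tuple2, so the Function1 forms fit.
  val viaTuple: List[String] = m1.toList.map(oneTuple)
  val viaAdapted: List[String] = m1.toList.map(adapted)
}
```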

The problem with println(_.productArity) isn't actually the same thing. The problem is that the expansion happens within the next level out of brackets. Imagine if you'd written foo(x.map(_.productArity)) instead. Would you expect this to evaluate as foo(t => x.map(t.productArity)) or foo(x.map(t => t.productArity))? The latter is the only one that makes sense, and is what's happening here (the fact that it then can't infer a parameter type for t is a bit annoying, but separate).

David R. MacIver May 10, 2009 at 2:52 PM  

Also, I'm not a Dr. I'm David R. MacIver. Initials, not title.

mccv May 10, 2009 at 7:10 PM  

Ah, that makes sense. Tricky with the argument list looking like a tuple...

Eishay Smith May 10, 2009 at 9:06 PM  

David, here t2 creates a Tuple2:
m1 foreach ( (t2) => println (t2._1 + "-->" + t2._2))
I thought that this is the case here too and I could at least do this (which I can't):
m1 foreach (println ("-->" + _._2))
So what is _ in the line above?

David R. MacIver May 11, 2009 at 3:14 AM  

The problem is that

m1 foreach (println ("-->" + _._2))

expands as

m1 foreach (println (x => "-->" + x._2))

Not

m1 foreach (x => println ("-->" + x._2))

as you want it to.
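A small sketch of the forms that do compile, for contrast. The record method is a hypothetical stand-in for println that stores its argument so the effect can be inspected; the object name is also made up.

```scala
import scala.collection.mutable.ListBuffer

object PlaceholderScopeSketch {
  val m1 = Map(1 -> 2, 2 -> 3, 4 -> 50)
  val out = ListBuffer.empty[String]

  // Hypothetical stand-in for println: records instead of printing.
  def record(x: Any): Unit = { out += x.toString; () }

  // Explicit parameter: the function is the whole argument to foreach.
  m1.foreach(x => record("-->" + x._2))

  // The placeholder is the entire argument of record, so record(_)
  // means x => record(x) and each entry is passed through as a tuple.
  m1.foreach(record(_))

  // Would not compile: the placeholder binds inside record's argument,
  // giving record(x => "-->" + x._2) with no type known for x.
  // m1.foreach(record("-->" + _._2))
}
```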

Daniel May 11, 2009 at 7:21 AM  

This kind of thing has eluded me for a bit too, until I realized I can't put a wildcard inside parentheses.

This will work:

List(1,2,3).foreach(Console println _+1)

While this will not:

List(1,2,3).foreach(Console println (_+1))

Curiously, this will work too:

List(1,2,3).foreach(Console println (_))
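The same three cases can be sketched with a recording object in place of Console, so the outcome is visible as data. Sink and put are hypothetical names standing in for Console and println.

```scala
import scala.collection.mutable.ListBuffer

object InfixPlaceholderSketch {
  // Hypothetical stand-in for Console: put plays the role of println
  // but records its argument instead of printing it.
  object Sink {
    val seen = ListBuffer.empty[String]
    def put(x: Any): Unit = { seen += x.toString; () }
  }

  // No parentheses: the whole infix expression becomes x => Sink.put(x + 1).
  List(1, 2, 3).foreach(Sink put _ + 1)

  // A bare placeholder in parentheses: x => Sink.put(x).
  List(1, 2, 3).foreach(Sink put (_))

  // Would not compile: (_ + 1) becomes a function of unknown parameter
  // type that is handed to put.
  // List(1, 2, 3).foreach(Sink put (_ + 1))
}
```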

Daniel May 11, 2009 at 7:40 AM  

Let me expand a bit. I might be wrong -- I'm definitely not an expert in Scala, but still...

When you use "_" inside parentheses, it turns the expression inside those parentheses into a function.

For example, (1+1) is an expression, but (_+1) is a function receiving one parameter. Or, in other words, (_+1) is ((x$1) => x$1.$plus(1)).

Now, the type of (1+1) is Int. The type of ((x$1 : Int) => x$1.$plus(1)) is (Int) => Int. The type of ((x$1) => x$1.$plus(1)) must be inferred, though.

So, let's go back to List(1,2,3).foreach(Console println(_ + 1)). You are passing a function to println, for which one type must be inferred.

Now, foreach, in this example, has type (Int) => Unit. So, we can expand that to:

List(1,2,3).foreach((x$1 : Int) => Console println(_ + 1))

Notice that the x$1 does not replace the "_" inside the parenthesis, because THAT "_" is reserved for the expansion that will happen for println, not the expansion that happened for foreach.

We'll now expand the second parenthesis, using "y" instead of "x", for clarity:

List(1,2,3).foreach((x$1 : Int) => Console.println((y$1) => y$1.$plus(1)))

Now we have to infer the type of y$1. Let's start with the type of println, see if that helps us:

def println(x : Any) : Unit

Ok, so the result of y$1.$plus(1) must be of type Any. Knowing that, can we know what's the type of y$1? Obviously, not.

But what happens with the type we DO know, that of x$1? It's simple... x$1 is not getting used inside println.
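To make the "passing a function to println" point concrete, here is a rough sketch. The describe method is a hypothetical stand-in for println(x : Any) that reports what kind of value it was given; the parameter type is written out by hand so that the placeholder expression compiles at all.

```scala
object FunctionAsArgumentSketch {
  // Hypothetical stand-in for println(x: Any): reports what it received.
  def describe(x: Any): String = x match {
    case _: Function1[_, _] => "a function"
    case other              => other.toString
  }

  // With the parameter type spelled out, the placeholder expression
  // compiles, and describe receives the function itself, not a number.
  val passedFunction: String = describe((_: Int) + 1)

  // Applying the function first is what yields the number.
  val addOne: Int => Int = _ + 1
  val passedResult: String = describe(addOne(1))
}
```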

This is all a bit confusing, but I hope it helps you understand what's happening. And if I'm wrong in my explanation, I *sure* hope someone more knowledgeable comes and corrects me! :-)

Eishay Smith May 11, 2009 at 9:10 AM  

Thanks Daniel & David, that's the type of explanation I was looking for.

This work by Eishay Smith is licensed under a Creative Commons Attribution 3.0 Unported License.