Sunday, December 21, 2008

Using Scala First Class Functions By Name

WARNING: Today I changed this code all around, and I'll be reposting it soon. This post is still cool, and nothing is wrong with it, but, now the code is wicked awesome!!! So, look for it to be refreshed soon!

It occurred to me that while I knew really well how to pass anonymous functions around in Scala, I had very little experience passing around functions by name. I'm not sure why this is, but I think there isn't all that much out there written about it. It's important to understand that anywhere you could use an anonymous function, you could also use an existing function. Passing functions by name really helps clarify the intent of your code, and for DSL's in general. I'm going to try to demonstrate how to do so, for those who might be in the same boat as me.

I wrote a little app that searches through /usr/dict/words to find words matching the letters that I have in my Scrabble tray. Yes, I could have easily done this with "cat /usr/dict/words | grep someRegEx", but, the point was to learn more about functions.

(quick note...this post is really really long, but i think its full of good information)

First, I needed to be able to determine the following information about a word:
  • Does the word contain ALL the letters in my tray? (but maybe some other letters)
  • Does the word contain ONLY the letters in my tray? (no other letters allowed)
  • Does the word contain ANY of the letters in my tray?
  • Does the word contain NONE of the letters in my tray?
  • Lastly, does the word contain ALL of the letters in my tray, and ONLY the letters in my tray?

To do this I added a function to String that takes the characters to search for, and then any number of additional criteria. The criteria are functions that I reference by name. Lets see the client code, then I'll explain in more detail. Bear with me, I know its frustrating not seeing the actual library code, but I do think it helps to understand how its used before reading it.

1 class StringSearchTest {
2
3 import org.testng.annotations.Test
4 import PimpedString._
5 import Equalizer._
6
7 @Test
8 def testSearch {
9 var word = "ketoid"
10 word.searchFor("ke", word.includes_all) mustBe true
11 word.searchFor("ke", word.includes_none) mustBe false
12 word.searchFor("xzy", word.includes_none) mustBe true
13 word.searchFor("ke", word.includes_only) mustBe false
14
15 word = "keyyek"
16 word.searchFor("key", word.includes_only) mustBe true
17 word.searchFor("xyz", word.includes_any) mustBe true
18 word.searchFor("abc", word.includes_any) mustBe false
19 word.searchFor("key", word.includes_only, word.includes_all) mustBe true
20 word.searchFor("key", word.includes_all, word.includes_none) mustBe false
21 }
22}

Let's break down line a few lines:

10 word.searchFor("ke", word.includes_all) mustBe true
  • "word" is the word we are searching.
  • "ke" is what we are searching for.
  • "word.includes_all" is the criteria. This particular criteria states that 'word' must include all the letters we are searching for. In this case, word must contain both k and e.
  • The statement itself (word.searchFor("ke", word.includes_all)) returns a Boolean. True if the word matches the search criteria, false otherwise.

11 word.searchFor("ke", word.includes_none) mustBe false
12 word.searchFor("xzy", word.includes_none) mustBe true

includes_none is very much the opposite of includes all. It ensures that the word contains NONE of the things we are looking for. The statement in line 11 evaluates to false because the word does contain k (and e for that matter). Line 12 evaluates to true because they word does not contain x, y, or z.

16 word.searchFor("key", word.includes_only) mustBe true

This is nearly the same as the line 10, but the criteria is more restrictive. includes_only states that the word must include only letters that we are searching for. If we were searching for "key" in the word "hey", this would fail, because our word contains an "h".

17 word.searchFor("xyz", word.includes_any) mustBe true

Also very similar to above, but includes_any evaluates to true if the word contains ANY of the things we are searching for.

20 word.searchFor("key", word.includes_only, word.includes_all) mustBe true

Finally the moment of truth, line 20 is the most important of all. It states that the word must contain ONLY the letters in my tray, and ALL of the letters in my tray. This is the easiest way to find Bingo words. (It doesn't yet work perfectly, if my tray contained "key" and the word was "keyyyy", it would still evaluate to true. I need to fix this.)

Finally, before we see the actual library code, lets take a second to understand what "word.includes_all" actually is. "word.includes_all" is a higher order function that takes an Array of Char, and returns a function that takes a Char and returns a Boolean. A higher order function is a function that takes a function as an argument, returns a function, or both. includes_all does both. Higher order functions have been explained by everyone and their brother since 1960, so I won't be doing that here. I'm just explaining how to pass functions by name.

Ok, lets take a look at the library code and I'll explain in more detail.

1 object PimpedString {
2 implicit def pimpMyString(s: String): PimpedString =
3 new PimpedString(s)
4 implicit def convertToCharArray(s: String): Array[Char] =
5 s.toCharArray
6 }
7
8 class PimpedString(s: String) {
9
10 def includes_all(chars: Array[Char]) =
11 fold(chars)(s contains _)
12 def includes_none(chars: Array[Char]) =
13 fold(chars)(!s.contains(_))
14 def includes_any(chars: Array[Char]) =
15 chars.foldLeft(false)((b,c) => b || (s contains c))
16 def includes_only(chars: Array[Char]) =
17 includes_none(inverse(chars))
18
19 private val letters = "abcdefghijklmnopqrstuvwxyz".toCharArray
20 private def fold(chars: Array[Char])(f: Char => Boolean) =
21 chars.foldLeft(true)(_ && f(_))
22 private def inverse( chars: Array[Char] ) =
23 letters.filter(! chars.contains(_))
24 def searchFor(chars: Array[Char],
25 criteria: Function1[Array[Char], Boolean]*): Boolean = {
26 criteria.foldLeft(true){(b, f) => b && f(chars)}
27 }
28 }

(quick note: I've written about implicits here, and here, and I've written about foldLeft here, and I won't be covering them in any detail here.)

Let's just skip right ahead to line 10:

10 def includes_all(chars: Array[Char]) = fold(chars)(s contains _)

As advertised, includes_all is a higher order function that takes an Array of Char, and returns a function that takes a Char and returns a Boolean. Let's take a look back at the original code to see how it was used:

10 word.searchFor("ke", word.includes_all) mustBe true

Now, I could have just used an anonymous function here for my criteria. Assuming that the fold method wasn't private, it would look like this:

10 word.searchFor("ke", word.fold(chars)(s contains _)) mustBe true

The call to fold is basically saying look at each of the search characters and make sure I contain it. fold also returns a function that is executed in search. I could do this, but since includes_all is so common, so reusable, that instead of passing it anonymously, I give it a name, and reuse it all over. And, giving it a name makes it much, much more clear as to what is going on. The second version of line 10 is almost indecipherable.

That basically sums up how to reference functions by name, and pass them around. I'm going to finish the post, showing the actual Scrabble code. However, I don't think there are any ideas that I haven't already covered. Let's start by reviewing the original requirements, since its been a while:

Find all possible words where:
  • The word contains ALL the letters in my tray. (but maybe some other letters)
  • The word contains ONLY the letters in my tray. (no other letters allowed)
  • The word contain ANY of the letters in my tray. (other letters allowed)
  • The word contains ALL of the letters in my tray, and ONLY the letters in my tray.

Now to be consistent, lets look at the test/client code, because that's where you should always start.

1 object ScrabbleTest extends Application{
2 val tray = System.getProperty("tray").toString
3 val scrabble = Scrabble(tray)
4
5 println(scrabble.find(scrabble.all_letters_in_tray))
6 println(scrabble.find(scrabble.only_letters_in_tray))
7 println(scrabble.find(scrabble.any_letters_in_tray))
8 println(scrabble.find(
9 scrabble.only_letters_in_tray,
10 scrabble.all_letters_in_tray))
11 }

I think I have to take some pride in this. The code reads very, very close to the actual requirements. So much that I'm not going to explain it, except to point out again that scrabble.all_letters_in_tray is another example of passing a function by name.

Let's just go right to the library code.

1 case class Scrabble(tray: Array[Char]) {
2 val lines = Source.fromFile("/usr/share/dict/words").getLines
3 val words = lines.toList.map( _.toLowerCase )
4
5 def only_letters_in_tray(s: String): Function1[Array[Char], Boolean] =
6 s.includes_only
7 def all_letters_in_tray(s: String): Function1[Array[Char], Boolean] =
8 s.includes_all
9 def any_letters_in_tray(s: String): Function1[Array[Char], Boolean] =
10 s.includes_any
11
12 def find(criteria: Function1[String, Function1[Array[Char], Boolean]]*) = {
13 words.filter( word => {
14 word.searchFor(tray, criteria.map( _(word) ):_* )
15 }).map(_.trim)
16 }
17 }

Scrabble defines its own criteria functions - only_letters_in_tray, all_letters_in_tray, and any_letters_in_tray. But they are essentially just wrappers around the String criteria functions. This is a testament to how nice and reusable those original functions were, and why I luck out and don't have to explain them :)

The find here simply takes N criteria, and passes wrapped functions onto the String searchFor method. Line 14 is pretty insane, especially criteria.map( _(word) ):_*. It certainly takes some experience before being able to read (or write that line). Fortunately, as a user, the code is absurdly simple, so it doesn't really matter.

That's all for now. I hope this helped.

Anyone want to play me in Scrabble?

1 comment:

  1. Anyone want to play me in Scrabble?

    Sure!

    (Far more scala than scrabble these days though.)

    ReplyDelete