Jack Cough on Software: November 2008

Friday, November 28, 2008

foldLeft in Scala, Little Schemer style

I'm going to write up a Scala version of foldLeft in the style of The Little Schemer.

What is 0 + (1 + 2 + 3)?

>6

What is 0 + (1 + 2 + 3)?

>0 + 1 + (2 + 3)

What is 0 + 1?

>1

What is 1 + (2 + 3) ?

>1 + 2 + (3)

What is 1 + 2?

>3

What is 3 + (3)?

>3 + 3

What is 3 + 3?

>6

Are we done?

>No, we still haven't found out what foldLeft is.

What is foldLeft?

def foldLeft[A, B](as: List[A], z: B)(f: (B, A) => B): B = {
    as match {
      case Nil => z
      case x :: xs => foldLeft( xs, f(z, x) )( f )
    }
  }

But that doesn't tell us much does it? Let's walk through an example.

What is List( 1, 2, 3 )?

>List( 1, 2, 3 )

What is foldLeft(a, 0){ (x,y) => x + y } when a is List( 1, 2, 3 )

>6 ... but why? Let's step through it together.

What is the first question foldLeft asks about the list?

>case Nil

Is a Nil?

>No, it's List( 1, 2, 3 ), so we ask the next question.

What is the next question?

>case x :: xs

What is the value of case x :: xs?

>true, because the list is not Nil, it contains one element x which is 1, followed by a list xs which is List(2,3).

So what is the next step?

> foldLeft( xs, f(z, x) )( f )

What is f( z, x )?

> Well, remember, we called foldLeft like so: foldLeft(a, 0){ (x,y) => x + y }

And what is f?

> { (x,y) => x + y }

Again, what is f( z, x )?

> f( 0, 1 )
> (0,1) => 0 + 1
> 1

So what has foldLeft( xs, f(z, x) )( f ) become?

> foldLeft( List(2,3), 1 )( f )

Now we recur into foldLeft. What is the first question foldLeft asks about the list?

>case Nil

Is a Nil?

>No, it's List( 2, 3 ), so we ask the next question.

What is the next question?

>case x :: xs

What is the value of case x :: xs?

>true, because the list is not Nil, it contains one element x which is 2, followed by a list xs which is List(3).

So what is the next step?

> foldLeft( xs, f(z, x) )( f )

What is f( z, x )? Remember that z is now 1.

> f( 1, 2 )
> (1,2) => 1 + 3
> 3

So what has foldLeft( xs, f(z, x) )( f ) become?

> foldLeft( List(3), 3 )( f )

Now we recur into foldLeft. What is the first question foldLeft asks about the list?

>case Nil

Is a Nil?

>No, it's List(3), so we ask the next question.

What is the next question?

>case x :: xs

What is the value of case x :: xs?

>true, because the list is not Nil, it contains one element x which is 3, followed by a list xs which is Nil.

So what is the next step?

> foldLeft( xs, f(z, x) )( f )

What is f( z, x )? Remember that z is now 3.

> f( 3, 3 )
> (3,3) => 3 + 3
> 6

So what has foldLeft( xs, f(z, x) )( f ) become?

> foldLeft( Nil, 6 )( f )

Now we recur into foldLeft. What is the first question foldLeft asks about the list?

>case Nil

Is a Nil?

>Yes

So what do we do?

> case Nil => z

And what is z?

6!

So what is foldLeft(a, 0){ (x,y) => x + y } when a is List( 1, 2, 3 )

6!

Are we done?

Yes! Now foldLeft a taco right into your mouth.

Monday, November 24, 2008

Refactoring Imperative Code To Functional Code

I've been refactoring Scala literally for days. It's fantastic how much I've learned over the last year. I knew a little bit about functional programming from doing Lisp in college, but a year and a half ago I couldn't have given you the definition of it.

I decided to tackle some of the most obvious (and ugly) imperative/procedural code in my CPU simulator and turn it into an elegant, functional style.

I'll paste the original code here, then explain it a bit, then show the refactorings one step at a time. In explaining, I will assume that you know at least a bit about how anonymous functions work in Scala.

class EightBitAdder( a: EightBitNumber,
b: EightBitNumber,
carryIn: PowerSource ) {

private val fullAdders: Array[FullAdder] = createFullAdders( a, b, carryIn )
private val output: EightBitNumber = createOutput()

def getOutput: EightBitNumber = output
def getCarryOut: PowerSource = fullAdders(7).getCarryOut

private def createFullAdders( a: EightBitNumber,
b: EightBitNumber,
carryIn: PowerSource ): Array[FullAdder] = {
val fullAdders = new Array[FullAdder](8)
fullAdders(0) = new FullAdder( a(0), b(0), carryIn );
for( i <- 1 until 8 ){
fullAdders(i) = new FullAdder(a(i),b(i), fullAdders(i-1).getCarryOut);
}
fullAdders
}

private def createOutput(): EightBitNumber = {
val out = new Array[PowerSource](8)

var count = 0;
fullAdders.foreach( p => {
out(count) = p.getSumOut
count = count + 1
}
)
new EightBitNumber(out)
}
}

This code was designed to build an 8 bit adder out of two 8 bit numbers (numbers in this particular case aren't much more than arrays of bits) and a carry in bit. This is your typical, ordinary, everyday 8 bit adder that you much have seen in 1950. The adders job is simple, output the result of the two input numbers added up. As with any 8 bit adder, there are 9 outputs, 8 standard output bits, and the carry out/overflow bit.

The adder accomplishes addition in a standard fashion, by chaining 8 full adders (one bit adders) together. The first full adder in the chain uses the given carry in bit as its carry in bit, and subsequent adders in the chain use the carry out of the previous full adder as the carry in. The carry out of the entire adder is simply the carry out of the last full adder in the chain. Here is a picture of one:

Now that we have all the background out of the way, I'll start the refactorings. I'm actually going to go a little bit in reverse order, refactoring the createOutput() method first, because it is substantially easier to refactor than the createFullAdders method.

Simple Refactoring

Lets take another look at createOutput:

private def createOutput(): EightBitNumber = {
val out = new Array[PowerSource](8)

var count = 0;
fullAdders.foreach( p => {
out(count) = p.getSumOut
count = count + 1
}
)
new EightBitNumber(out)
}

This method is returning an EightBitNumber, which is really just an array holding the 8 output bits. The code is terribly imperative, and terribly terrible. The sad thing is, I actually wrote this code, and I'm not just making up a bad example :( Anyway, its overall strategy is pretty clear.

Create an empty, length 8 array
Create a counter object for indexing into the array
Loop over all the full adders (those are already create by the time this method gets called, and well see that in a bit)
For each adder: Put each full adders sum out into the array, and increment the counter.
Finally, create an 8 bit number object using the array.

The first 4 steps above are there simply to create the array to pass to the EightBitNumber object. Now lets take a look at the refactored code:

private val output = new EightBitNumber(fullAdders.map(fa => fa.getSumOut))

Wow! That looks a lot easier - 12 lines down to 1! But...some people might not know what it does, so I'll do my best to explain. The map method, which is a method on all Seq objects (short for Sequence; Array is a subclass), "Returns the list resulting from applying the given function f to each element of this list." An example should help.

Given a=List(1,2,3,4,5) then a.map( i => i * 10 ) returns List(10,20,30,40,50). i * 10 was applied to each element "i" in the list.

In the cpu simulator code above, the call to map has built an Array containing the sumOut of each full adder using the function fa => fa.getSumOut.

Slightly More Difficult Refactoring

With the easier part out of the way, I tackled the more difficult createFullAdders. Let's review the original implementation again.

private def createFullAdders( a: EightBitNumber,
b: EightBitNumber,
carryIn: PowerSource ): Array[FullAdder] = {
val fullAdders = new Array[FullAdder](8)
fullAdders(0) = new FullAdder( a(0), b(0), carryIn );
for( i <- 1 until 8 ){
fullAdders(i) = new FullAdder(a(i),b(i), fullAdders(i-1).getCarryOut);
}
fullAdders
}

This method is responsible for creating all the full adders, and chaining them together. The createOutputs method was only responsible for getting all the outputs off of the full adders created here.

Similar to the last method, this method uses an imperative style, creating an array, populating it, and finally returning it. It's quite a bit trickier though because of the chaining. You can't simply use the map function because there's no context in map. Here, each new full adder needs to know about the preceding full adder. This guy is going to be a bear to explain, but let me just go ahead and dump the code on you:

val (fullAdders, carryOut) =
(as zip bs).foldLeft((List[FullAdder](),carryIn)){
case ((current, carry), (a, b)) =>
val adder = new FullAdder(a, b, carry)
(current ::: List(adder),adder.carryOut)
}

With James Iry's help, we've made this code about as simple as possible. With the first pass, I wasn't sure if the functional code was more readable than the imperative, but after he helped me clean it up, I'm positive. Now, if you aren't familiar with some of the concepts contained in that code, you might be thinking, "What are you $%^&ing nuts?" But, I'm convinced that after you get used to reading this, it's so much easier to read, and so much less error prone, and so much more natural, that you'll never go back. Honestly, after leaving this project alone for almost a year and coming back to it and finding the imperative code, I almost threw up in my mouth a little.

Okay, now I'll try to explain these concepts, and most likely fail miserably.

First, and simplest, is zip. This one is pretty easy. Taken right from the Scaladoc - zip:
"Returns a list formed from this list and the specified list 'that' by associating each element of the former with the element at the same position in the latter. If one of the two lists is longer than the other, its remaining elements are ignored."

I think a few examples will explain perfectly.

Given a=List(1,2,3) and b=List(a,b,c) then a.zip(b) will return List((1,a), (2,b), (3,c)).
Given a=List(1,2,3) and b=List(a,b,c,x,y,x) then a.zip(b) will return List((1,a), (2,b), (3,c)), as the remaining elements in b are ignored.

The code in the cpu simulator zips as and bs, which associates the appropriate input bits together. Take a quick look back at the picture to see that this returns ((a0,b0),(a1,b1),(a2,b2),(a3,b3),(a4,b4),(a5,b5),(a6,b6),(a7,b7)).

Next is foldLeft, which is a bit more complicated. Once again, from the Scaladoc - foldLeft:
Combines the elements of this list together using the binary function f, from left to right, and starting with the value z.

This one I've written up separately, because it was so long. You can find it at http://jackcoughonsoftware.blogspot.com/2008/11/foldleft-in-scala-little-schemer-style.html

Next is pattern matching, but I have to go to bed again! At least I've made some progress :)

Revised Code

Here is the finshed product, which no longer uses 8, but instead creates adder chains of N, depending on the length of the inputs. Overall, I think it's a vast improvement over the original.

class AdderChain(as: Number, bs: Number, carryIn: PowerSource) {

if( as.size != bs.size ) error("numbers must be the same size")

val (fullAdders, carryOut) =
(as zip bs).foldLeft((List[FullAdder](),carryIn)){
case ((current, carry), (a, b)) =>
val adder = new FullAdder(a, b, carry)
(current ::: List(adder),adder.carryOut)
}

val output = new Number(fullAdders.map(fa => fa.sumOut))
}

Using Scala Implicits to Replace Mindless Delegation

I'm still refactoring my Scala CPU Simulator code, as I keep finding ways to just lop off piles of unneeded code. In this latest example, I was using pretty standard Java style delegation - having my class implement an interface, having a member variable of that interface, and delegating all method calls on that interface to the member variable.

There are several problems with this:

The main problem: if I add a method to the interface I then have to add it to everyone implementing the interface. But, what If I'm not actually writing the implementation of the interface? Client code won't compile. Adding an implicit doesn't remove this problem entirely, but it certainly works for plain old delegates.

It's error prone. Maybe the method was a void, and my IDE added the signature for the method, so the code compiles, but I forgot to actually delegate to the member.
It's just plain wordy and ugly.

To remove the boilerplate, all I needed to do was add an implicit conversion from my class to the interface, using the member variable. I'll show this below.

First, here is an example of the old, more wordy style:

trait PowerSource{
def connect( p: PowerSource ): Unit
def disconnect( p: PowerSource ): Unit
def reconnect( p: PowerSource ): Unit
}

class DelegateToPowerSource( in: PowerSource ) extends PowerSource {
def connect( p: PowerSource ) = in connect p
def disconnect( p: PowerSource ) = in disconnect p
def reconnect( p: PowerSource ) = in reconnect p
}

DelegateToPowerSource has a member, "p", and implements the PowerSource interface by delegating to that member for each of the methods on the PowerSource interface. The more methods that PowerSource has, the longer DelegateToPowerSource gets, and yet the code is just boilerplate. Even if the IDE does this for you, its still wordy and potentially error prone.

Now for the new version:

trait PowerSource{
def connect( p: PowerSource ): Unit
def disconnect( p: PowerSource ): Unit
def reconnect( p: PowerSource ): Unit
}

object DelegateToPowerSource{
implicit def delegateToPowerSource( d: DelegateToPowerSource ) = d.p
}

class DelegateToPowerSource( p: PowerSource )

That's it. Now, any time I add a method to the PowerSource interface (I should probably start calling it trait), DelegateToPowerSource simply has that method; via the conversion. I never have to change DelegateToPowerSource because of a change to PowerSource. DelegateToPowerSource can simply have whatever code in it that it was originally intended to have, obviously augmenting PowerSource in some way.

For completeness, I'll post the actual code where I did exactly this. But, it's pretty much the same, so if you get the point, no need to keep reading.


trait LogicGate extends PowerSource{
  val inputA: PowerSource
  val inputB: PowerSource
  val output: PowerSource
}

abstract class BaseLogicGate(val inputA: PowerSource, inputB: PowerSource) extends LogicGate {

  def state = output.state
  
  def -->( p: PowerSource ): PowerSource = output --> p 
  def <--( p: PowerSource ): PowerSource = output <-- p 
  
  def disconnectFrom( p: PowerSource ): PowerSource = output disconnectFrom p 
  def disconnectedFrom( p: PowerSource ): PowerSource = output disconnectedFrom p 

  def handleStateChanged( p: PowerSource ) = {}
  def notifyConnections = {} 
}

class AndGate(a: PowerSource, b: PowerSource) 
  extends BaseLogicGate(a: PowerSource, b: PowerSource){
  val output = new Relay(a, new Relay(b))
}

Of course there were several other LogicGates (or, nor, nand, etc). All of this was condensed down to:


object LogicGate{
  implicit def logicGateToPowerSource( lg: LogicGate ): PowerSource = lg.output
}

trait LogicGate{
  val inputA: PowerSource
  val inputB: PowerSource
  val output: PowerSource
}

class AndGate(val inputA: PowerSource, val inputB: PowerSource) extends LogicGate{
  val output = new Relay(inputA, new Relay(inputB))
}

BaseLogicGate is now removed entirely. This is great because BaseLogicGate was really weird, it didn't even use its inputs. The only reason it was there was to decouple the delegation logic from the actual trait.

Now things have become MUCH more clear. AndGate is a LogicGate with inputs A and B, and one output, made from Relays.

Tuesday, November 18, 2008

Equalizer

I wrote a cool Equalizer class in Scala that allows me to do assertions that are more readable than traditional assertions.

Quick side note: (Equalizer is really an idea shamelessly stolen from Bill Venners Equalizer in ScalaTest. Hopefully we'll add my methods in as well.)

My Equalizer lets me do things like:

val x = 5
x mustBe 5

This replaces the more common assertEquals( x, 5 ), and I think it does so nicely.

You can also do a few more sophisticated things with it, like so:

val x = 5
x mustBe ( 3 or 4 or 5 )

I probably could have used "in" here, such as x mustBe in( 3, 4, 5 ) ... what do you think?

Additionally:

val x = 5
x cantBe 6

For whatever this one is worth (I've actually found use cases for it):

val x = false
x canBeNullOr false

val y = 6
y canBeNullOr ( 5 or 6 )

This works by implicitly converting Any to Equalizer, which contains the methods above. The code can be found below.


import org.testng.Assert._

case object Equalizer {
  implicit def anyToCompare(a: Any) = new Equalizer(a)
}

class Equalizer(a: Any) {

  def mustBe(bs: Any*): Unit = {
    bs(0) match {
      case x:MyTuple => x mustBe a
      case _ => {
        val message = "In Equalizer: expecting one of=" + bs + "\nIn Equalizer: actual  =" + a
        println(message)
        bs size match {
          case 1 => assertEquals(a, bs(0), message)
          case _ => assertTrue(bs.contains(a), message)
        }
      }
    }
  }

  def canBeNullOr(bs: Any*) = {
    if (a != null) mustBe(bs: _*)
    else println("In Equalizer: expecting one of=" + bs + " or null\nIn Equalizer: actual  =" + a)
  }

  def cantBe(b: Any) = assertFalse(a == b)

  def is(b: Any) = a equals b

  def or(b: Any) = {
    a match {
      case x:MyTuple => x + b
      case _ => MyTuple2(a, b)
    }
  }

  import Equalizer._

  trait MyTuple{
    def +(b: Any): MyTuple
    def mustBe(a: Any): Unit
  }
  case class MyTuple2(y: Any, z: Any) extends MyTuple{
    def +(b: Any): MyTuple = MyTuple3(y, z, b)
    def mustBe(a: Any): Unit = a mustBe (y,z)
  }
  case class MyTuple3(x: Any, y: Any, z: Any) extends MyTuple{
    def +(b: Any): MyTuple = MyTuple4(x, y, z, b)
    def mustBe(a: Any): Unit = a mustBe (x,y,z)
  }
  case class MyTuple4(w: Any, x: Any, y: Any, z: Any) extends MyTuple{
    def +(b: Any): MyTuple = MyTuple5(w, x, y, z, b)
    def mustBe(a: Any): Unit = a mustBe (w,x,y,z)
  }
  case class MyTuple5(v: Any, w: Any, x: Any, y: Any, z: Any) extends MyTuple{
    def +(b: Any): MyTuple = MyTuple6(v, w, x, y, z, b)
    def mustBe(a: Any): Unit = a mustBe (v,w,x,y,z)
  }
  case class MyTuple6(u: Any, v: Any, w: Any, x: Any, y: Any, z: Any) extends MyTuple{
    def +(b: Any): MyTuple = throw new IllegalArgumentException("too many ors")
    def mustBe(a: Any): Unit = a mustBe (u,v,w,x,y,z)
  }
}

Sorry about MyTuple...I wanted people to be able to use Tuples as arguments in their mustBe statements, so I had to make sure I didn't pass in a Tuple into mustBe, via the "or" method. Anyway, Tuples don't even appear to be working so....ugh... If anyone can help me clean this up, that would be awesome.

Regardless, what do you think? I really like using the code in tests, even if the Equalizer class itself is a bit hairy.

Monday, November 17, 2008

Refactoring Scala: My CPU Simulator Revisited

I'm preparing to learn Ruby (though I've already tinkered with it some), and to do so, I'm going to revisit my CPU Simulator project that helped me learn Scala so well. I'm going to write it again in Ruby.

So today, I took up revisiting it, with the goal in mind of writing some Ruby. Turned out that didn't actually happen. What did happen was a lot of refactoring of the original Scala code. I hadn't touched that code in several months, and in between then and now I've written a LOT of Scala. Here's some of the things I've learned (and refactored).

I used to use way too much redundant type information. This was a natural habit from writing Java for so long (in fact, pretty much all my bad habits stem from too much Java). I rigorously removed ALL the obviously redundant type declarations. I did find that in some cases it helped to have it there, that the code wasn't exactly clear without it, and so I left it. I'm very interested to see what happens here when I slide over to Ruby.
Out of habit, I used a lot of unneeded semi-colons. I removed all of them. I hate semi-colons. It's official.
I got a lot better at functional programming. There were a lot of areas where I should have used functions, and I refactored the code to do so. My line of code cound is down considerably.
Finally, I removed all the Hamcrest matchers. This one I'll explain in more detail in my post tomorrow. I'm basically using my own assert library that allows me to say things like: x mustBe 5 I prefer this style to any other assertions I've come across.

Sunday, November 16, 2008

Back

For the past several months I've been giving my life to a cause that I now believe will fail miserably. I've decided to leave the NYSE. I wasn't being challenged technically, and management was simply not up to snuff. They actually asked me to, "just drink the kool-aid". I swear.

I haven't been writing, and I haven't even been coding much, all to work on this Death March project. Basically, its been terrible, and I feel like I went into a coma and I've just come out. I'm going to start writing again passionately.

I've accepted a new position where I'll be writing IPhone applications, Facebook apps with Rails, other web work with Rails, etc. I'm very excited about it. I get to learn two languages that I have little experience with (Ruby and Objective C). I get to work with a team that is much more in line with my way of thinking - adopting new technology, open-minded, agile, lots of tests, curious. It feels like a giant weight has been lifted off of my back. I'm very excited.

Additionally, I won't have to work in Java anymore. I'm convinced at this point that java is a dying language. I heard Neal Gafter left Google to work on languages at Microsoft.

I feel his situation parallels mine. He worked tirelessly on closures for Java 7, which he truly believed (as do many) were right for the language. But, the JCP is so damned conservative, that not only will his closures probably not make it into Java 7, but come on...will Java 7 ever actually come out?

I worked tirelessly for NYSE, trying to introduce new technology and exciting ideas. I tried to bring in Scala, and was shot down. I tried to bring in a test first attitude and stress that we really needed time to work on the testing framework there in general, only to be shot down at every suggestion. I felt the same as I imagine Neal felt, "This is impossible".

I believe NYSE ATS software projects will continue to fail until management is replaced.

This was a bit of a rant, but it was a long time coming.