Jack Cough on Software: 2010

Tuesday, December 14, 2010

Code Styles and Readability

Today at work we went on a quest to make a piece of code as readable as possible. I'm not fully convinced that we've succeeded entirely, but we learned a few things along the way.

The problem we're trying to solve is pretty simple. Given an activation record which represents the last call on the call stack, generate a List of Strings to be used in building the stack trace. This will require going all the way up to the first call on the stack by following parent pointers, grabbing the name of each procedure.

Before we begin, we need to understand a few things about the system.
One could imagine these simple classes:


case class Procedure_V1(name: String)
case class Activation_V1(procedure: Procedure, parent: Activation)

If that's all there was, there probably wouldn't be much to write about. I'll give a small bit more detail. In our world, a Procedure holds a List of Commands that it executes one by one until it is done. Any Command in this list could be a primitive, or a call to another procedure. If the Command is a primitive, it can just be executed on the spot. One example of a primitive is +. If the command is a call to another procedure then upon execution of this Command, a new Activation gets created. So the objects now look like this:


case class Command_V2(name:String)
case class Procedure_V2(name: String, commands: List[Command])
case class Activation_V2(procedure: Procedure, parent: Activation)

There's one minor catch that makes this whole thing interesting: any particular Activation might need to put two different entries into the stack trace. 1) Its own procedures name, and possibly 2) The name of its parent's current command. I wish I had time and space enough to explain why, but that's simply out of scope. For now, I'll just explain some simple rules.

A command should show up in the stack trace only if it calls other code. Activations have a return address which an index into it's parents commands List. It points to currently running command in the parent Activation. These rules allows us to write the final versions of the classes:


case class Command(name:String, callsOtherCode: Boolean)
case class Procedure(name: String, commands: List[Command])
case class Activation(procedure: Procedure, parent: Activation, returnAddress:Int)

Lets also say that these classes are sealed (or that we can't modify them). With these, we can take a stab at the original problem.


def callStack_V1(act: Activation): List[String] = {
  val parentCommandMaybe: Option[String] = Option(act.parent).flatMap(p => {
    val c = p.procedure.commands(act.returnAddress)
    if(c.callsOtherCode) Some(c.name) else None
  })
  List(Some(act.procedure.name), parentCommandMaybe).flatten :::
          callStack_V1(act.parent)
}

This was the first version I wrote. I liked it then, I still like it now. But, I wanted to see if I could make it better. First, lets see if we can clean up that last line. The list creation is a little confusing.


def callStack_V2(act: Activation): List[String] = {
  val parentCommandMaybe: Option[String] = Option(act.parent).flatMap(p => {
    val c = p.procedure.commands(act.returnAddress)
    if(c.callsOtherCode) Some(c.name) else None
  })
  act.procedure.name :: parentCommandMaybe.toList ::: callStack_V1(act.parent)
}

Only slightly better. Maybe if we remove the recursion things will be nicer.


def callStack_V3(act: Activation): List[String] = {
  Iterator.iterate(act)(_.parent).takeWhile(_ != null).toList.flatMap { a =>
    val proc = act.procedure.name
    val parentCommandMaybe: Option[String] = Option(a.parent).flatMap(p => {
      val c = p.procedure.commands(a.returnAddress)
      if (c.callsOtherCode) Some(c.name) else None
    })
    act.procedure.name :: parentCommandMaybe.toList
  }
}

That first bit before the first flatMap turns the stack into a List[Activation]. While that idea is simple enough, the code to do it really isn't. And, the remainder of the code is really hard to follow too. I still manage to struggle to fully understand a single flatMap sometimes, let alone two. Sure the type annotations help explain what is happening, but I still felt I could do better. Let's keep the same idea, but refactor a bit and see if that helps.


def unfold(act: Activation) =
  Iterator.iterate(act)(_.parent).takeWhile(_ != null).toList

def callStack_V4(act: Activation): List[String] = {
  unfold(act).flatMap { a =>
    val parentCommandMaybe: Option[String] = Option(a.parent).flatMap(p => {
      val c = p.procedure.commands(a.returnAddress)
      if (c.callsOtherCode) Some(c.name) else None
    })
    act.procedure.name :: parentCommandMaybe.toList
  }
}

Maybe thats ok, it gets some of that clutter out of the way and let's us think. But lets take a complete step back and try an imperative version. All the flattening is still hurting my head.


def callStackImperative(act: Activation): Seq[String] = {
  val buf = collection.mutable.ListBuffer[String]()
  for(a <- unfold(act)) {
    buf += a.procedure.name
    if (a.parent != null) {
      val c = a.parent.procedure.commands(a.returnAddress)
      if (c.callsOtherCode) buf += c.name
    }
  }
  buf
}

This is a case where an imperative style actually does quite well. I usually try to avoid this style like the plague, but in this case theres little doubt that this code is simple.

I still thought I could do better, so I continued. This time with a Racket friend.


def callStackSchemer(act: Activation): List[String] = {
  def callStackInner(act: Activation): List[String] = {
    if (act == null) Nil
    else {
      val rest = callStackInner(act.parent)
      def parCmd = act.parent.procedure.commands(act.returnAddress)
      act.procedure.name ::
        (if (act.parent != null && parCmd.callsOtherCode) parCmd.name :: rest
         else rest)
    }
  }
  callStackInner(act)
}

The advantage of this approach is that it could be read by a 10 year old. Ok maybe not 10. But, there are no for comprehensions, options, or flattening bits here. However, I do think there is a little bit of mixing up what is happening with how its happening. That is, the result building is mixed up with the structural recusion on the Activation. For this little function, it almost certainly doesn't matter that much. For something larger, it might.

Let explore a different topic for a bit. Sometimes, even with inner defs, pulling something outside of the body of a function helps clear out the clutter and makes it more readable. We did this already with unfold. There are tradeoffs of course, such as you then have a function in the top level that is only being used in one place. But, lets try it again anyway and see where it gets us..


def parentCommandMaybe_V1(act: Activation): Option[String] =
  Option(act.parent).flatMap(p => {
    val c = p.procedure.commands(act.returnAddress)
    if (c.callsOtherCode) Some(c.name) else None
  })

Pulling that out also helps me realize just how complicated that bit is. Maybe we can do better just on this part. Here are two attempts.


def parentCommandMaybe_V2(act: Activation): Option[String] =
  if(act.parent == null) None
  else{
    val command = act.parent.procedure.commands(act.returnAddress)
    if(command.callsOtherCode) Some(command.name) else None
  }

def parentCommandMaybe_V3(act: Activation): Option[String] =
  for{p <- Option(act.parent)
      c = p.procedure.commands(act.returnAddress)
      if c.callsOtherCode}
  yield c.name

To be quite honest, none of these solutions are glorious. I'm tempted to use V2 because any knuckehead could read it, and its of comparable size. I could have easily decided on V1 or V3 because we don't have any knuckeheads on our team. We do however, occasionally have students on the team and we are attempting to open source our code. Is that a good reason to dumb down the code? No, certainly not. But, if a particular function is of comparable size, then maybe it is a good idea. Or maybe not, maybe leaving the more advanced constructs (like for comprehensions) would make code readers learn more about the language that they are using. I'm undecided on this. Given the lack of wanting to make a decision here, I'll just choose V3.

Better solutions to this are most welcome.

Now that we've pulled that out, we can rewrite our function.


def callStack_V7(act: Activation): List[String] = {
  unfold(act).flatMap { a => 
     act.procedure.name :: parentCommandMaybe_V3(act).toList
  }
}

I think this is a really nice version. But...now that we've cleaned up some of the helper methods, lets see what things look like if we put them back in.


def callStack_V8(act: Activation): List[String] = {
  def unfold(act: Activation) =
    Iterator.iterate(act)(_.parent).takeWhile(_ != null).toList
  def parentCommandMaybe_V3(act: Activation): Option[String] =
    for{p <- Option(act.parent)
        c = p.procedure.commands(act.returnAddress)
        if c.callsOtherCode}
    yield c.name
  unfold(act).flatMap { a => 
    a.procedure.name :: parentCommandMaybe(a).toList
  }
}

Not bad. It may be a few lines longer than version 1, and it took quite a while to get here, but I do think this version is the best we can do. 'We' being my team. Finally, I'll add in some comments and call it a day. It would be nice if the code was so blatently obvious that this wasn't needed, but it's ok. Comments are helpful.


/**
 * Return a List representing the call stack.
 * The list includes procedure names and any commands
 * that call other code.
 * @param act - the final call on the stack.
 **/
def callStack(act: Activation): List[String] = {
  // unfold turns an activation into
  // List(act, act.parent, act.parent.parent ... )
  def unfold(act: Activation) =
    Iterator.iterate(act)(_.parent).takeWhile(_ != null).toList
  // if an activations parent is executing a command that can trigger
  // other code, it also needs to be added to the stack trace.
  def parentCommandMaybe(act: Activation): Option[String] =
    for{p <- Option(act.parent)
        c = p.procedure.commands(act.returnAddress)
        if c.callsOtherCode}
    yield c.name
  // flatMap because each activation can result in 1 or 2 entries
  unfold(act).flatMap { a => 
    a.procedure.name :: parentCommandMaybe(a).toList
  }
}

If you can write this better, please do. Send it to me and I'll learn something.

Saturday, November 27, 2010

App Inventor Review (plus some functional programming)

I App Invented all day today. Half of this was because it was fun, the other half was because it took all day to get anything reasonable done. Before I get into this at all, let me say that I had fun. I think App Inventor could be a cool tool in the future, and it was relatively bug free and nice to use. I think they have done a good job on it and I'm very happy for them. I don't know what their plans are for adding more language features in the future, but even though it was fun, I won't be using it again until they add a few things. I'll explain.

But lastly, before I do, I want to say that I hope I simply haven't overlooked anything. These are my impressions after using it for a day, and they are impressions from my language designers perspective. If I've managed to overlook some capabilities because of failure to read everything I was supposed to read, then I apologize. I don't think this is the case though.

After growing up and becoming a decent functional programmer (and/or growing out of Java), whenever I try to learn a new language that doesn't have certain functions that I've grown accustomed to (like head, tail, list-to-string, string-to-list, filter, map, flatten, and more) I find that I build them immediately and then start actually building something. Guy Steele put this so elegantly in his talk Growing a Language.

So, I set out to build them. But I found out right away that the blocks in App Inventor were severely limited. You do have the ability to create your own functions, but there are no higher order functions so right away you lose the ability to write map and filter (without repeating the mapping and filtering logic every time). Both map and filter could be accomplished with data abstraction (albeit slightly unnaturally), but that doesn't exist either. Ok, I'll be honest, I didn't expect this higher order functions, so this is ok. I didn't really expect data abstraction either. So I'll let that slide too. But, it sure would be nice if we could have a blocks language that does have them.

App Inventor does have Lists though, and Strings. But, Strings are not Lists, and no functions are provided to convert between the two. This bit was quite frustrating. Having to write string-to-list and list-to-string seemed wrong, especially when those are two of the main data types in the system. Lists did seem to implicitly convert to Strings in certain situations, but wrapped in parens. If I simply wanted to convert [a, b, c, d, c] to "abcde", I had to use my function.

So I set out to build both of those functions, and I started with string-to-list. This is when I found out the really bad news, the one fundamental flaw with App Inventor. You can't have local variables in your procedures... Worse yet, every time you need to name something, it becomes a global variable.

When you need a temp, create a global.
Globals are event created for every procedure argument.

The last one is particularly bad, because if you have two operations that work on lists, you have to pick a new name for the argument to each of the functions. For example, imagine if you had to do this in Scala:

def head[A](aList:List[A]): A = ...
def tail[A](anotherList:List[A]): List[A] = ...
def toString(yetAnotherList:List[_]) = ...

And imagine further that these names followed you around wherever you go.

You might be thinking, 'Well, maybe its not too big of a problem. After all, it's probably the case that App Inventor was designed for small things, and for teaching.' Well, the global thing is a real problem. Having to explain this to beginners trying to learn would be a nightmare. Thankfully, the App Inventor team does recognize this. Hal Abelson said on this thread,

"We're planning on adding local variables (no ETA). We agree that these are important for teaching programming."

Phew. But then again, that was August 7th and we're now pushing into December. (But I know how this goes as I've been working on a couple of my own language features for over a year now.)

Now back to the second point; having every procedure stomp all over the global namespace did turn out to be a problem in a relatively small amount of code. Yes I wrote all day, but I did't really churn out that much. Part of this was because I was slowed down mucking through the UI trying to find my procedure call blocks. I ended up with so much stuff in the My Definitions thingy, that it was a nightmare to find anything. This could probably be cleaned up pretty easily if they added the ability to sort it alphabetically. But even so, the globals business is bad for other reasons too. It just sort of continues encouraging imperative programming in a very unsatisfying way. I feel dirty after it. Oh well, hopefully they can get it fixed. Like I said, I did have fun, so I'm not trying to be negative, just blatantly honest.

Ok, finally back to string-to-list. Right away I noticed that there wasn't a charAt function. There is a 'segment' function which is basically substring, but I wanted charAt. So I wrote it, and fortunately it was one of the easier functions to write. Here it is:

The first time I wrote the string_to_list function, I wrote it using globals in typical imperative style. Here it is:

Hopefully that doesn't come out too difficult to read. If it does, you can just click on it to view it bigger. I just looped until the end of the String, adding to an accumulator. The accumulator was a global, along with the loop index. This is pretty gross, but whatever, it works. I tried it without the globals, but this turned out to be even worse, I think. Well, maybe not worse, but bigger, and uglier:

This one might actually be too small to read, and I screwed up the screen shot, but you can click it. I think most of what you need is there. Because I am not using a global, I have to use recursion. But this means I have to have the index and acc parameters passed in. Because I don't want users to have to pass this in, I just used two functions. The actual meat of the function is roughly the same size as the other. Fine. We'll they are both pretty huge, so not fine...but fine.

Ok, 1/2 the day is gone and I have string-to-list and charAt written (maybe an exaggeration, I can't really remember). At this point in writing this post I went to look for list_to_string. I couldn't find it. I swear I wrote it. Didn't I? Did I not need it? This just goes to show the problems with finding things in App Inventor. It's a big pain.

So maybe I didn't write it. But I did write some others I was going to need. I've explained most of the problems that I encountered, so most of the rest of the functions can be displayed without ceremony.

Here are head and tail:

Head was fine. Tail was kind of disgusting because I had to mutate a global so that I didn't have to mutate the argument. Try explaining any of that to a beginner.

Zip suffered from problems already discussed:

Unzip also demonstrates nothing new:

Filter was problematic as I explained earlier. You can't pass lambdas around, so any time you want to filter you just have to reproduce the logic. There's no way around it. Map too (though I didn't write map, but I guess unzip is just a map). Here is a simple filter. I didn't really test it because I used a much more complicated one. But, I think it should work:

Finally, I wrote flatten. This one proved to be tricky and fun. I haven't actually used it anywhere yet, but thats ok. It was tough to figure out at first, but I got it. See if you can understand it:

Some other remarks and random comments:

Testing was not too bad. When I had a function that didn't quite work right on the phone the first time, I could easily set up a call to it in the IDE, and see the results. That was nice. It would be much nicer to have a way to define actually test cases on these things, but no one ever seems to listen to me on that.
Code organization was a huge problem. As procedures grew, I'd have to shift everything around. It only gives you so much vertical space. Horizontal space seemed unlimited, but it was tricky to make it grow. They have an auto-organize feature, but it did some things that I didn't want it to do, such as moving globals away from the only procedure that they were used in.
When I named something incorrectly (I often put a dash, forgetting that it is illegal), the system yelled at me and says that it is an illegal name, but didn't say why. Then...it reverts back to whatever name was there before. So, if I had typed in a long name, I'm forced to retype it all again. that was super aggravating.
Sometimes it would say there was an error in a procedure, but it wouldnt tell me what the error was.
Things are slow. Yes, I know, I should expect them to be nearly as fast as typing. But, I found myself getting frustrated with just how slow things seemed. certainly, beginners won't be working as fast as myself, but there are times when I could imagine them getting frustrated too. Removing many of the globals would help. A quick-find for procedures would help too.
Message passing, Scratch style, would be very nice. I could easily imagine different GUI components passing messages to each other. My 9 year old son uses that style in Scratch all the time and it works great.
When there was a runtime error, it was difficult to tell what the error was. Additionally, the application would just halt. No option to continue. That seemed odd. Why not just let it continue in its current state and let the user decide to restart?
I was pretty easily able to find things in the documentation the few times that I needed to look things up. That was nice.

That's just about it. I guess I'll conclude. I had fun, I like App Inventor. By the end of the day, I had a working game going (not described here). I think App Inventor could be made pretty awesome by fixing some of the things I mentioned above. I'm not sure I'll continue to teach it to my son or not. On one side of the coin, it can do some pretty fun things with the phone (some of which I haven't tried, and none of which I've written up here.) On the other side, it's got severe limitations and I might as well move my son to Build Your Own Blocks. I probably actually have more comments, but I'm getting pretty tired at this point. Goodnight Moon.

Wednesday, February 24, 2010

SBT and Test Arguments

If you want to pass arguments to tests in SBT, you've come to the right place. First know this - it only works in certain situations, it's a bit crufty, and its subject to change. But I'll try to keep this page up to date if it does change.

Another important note, if you want to pass arguments at the sbt command line, you can only do it for test-quick, and test-only (I think). It doesn't work for the default "test" command. This is a bit unfortunate, but there is a workaround (not a very good one, but at least it exists).

A final note: these examples are for ScalaTest and ScalaCheck. I don't yet have examples for Specs, but Eric has implemented argument handling (in 1.6.2.1-SNAPSHOT for scala 2.7.7 and 1.6.4-SNAPSHOT for scala 2.8.0.Beta1).

Passing Args to ScalaTest from the Command Line

Let's say you have this little ScalaTest class that uses both params and tags:


import org.scalatest.fixture.FixtureFunSuite
import org.scalatest.Tag

class WickedCoolTest extends FixtureFunSuite{
 type FixtureParam = Map[String,Any]
 override def withFixture(test: OneArgTest) {
   test(test.configMap)
 }
 test("1", Tag("dood1")){ conf => println("dood1: " + conf) }
 test("2", Tag("dood2")){ conf => println("dood2: " + conf) }
 test("3", Tag("dood3")){ conf => println("dood3: " + conf) }
}

To run all tests in in this class use:


test-only WickedCoolTest

To run only tests tagged with "dood1" use:


test-only WickedCoolTest -- -n dood1

To run tests tagged with dood1 or dood2, use:


test-only WickedCoolTest -- -n "dood1 dood2"

To run all tests except for tests tagged with dood2, run:


test-only WickedCoolTest -- -l dood2

To run all tests except for tests tagged with dood2 dood3, run:


test-only WickedCoolTest -- -l "dood2 dood3"

To pass configuration parameters to a test, use:


test-only WickedCoolTest -- -Danswer=42

To pass many config params, add more -D's:


test-only WickedCoolTest -- -Danswer=42 -DrealAnswer=54

Tags and parameters can be used in combination, like so:


test-only WickedCoolTest -- -n "dood dood2" -Dhey=you -Dm=f

This runs only tests tagged with dood or dood2 and produces the following output:

dood: Map(hey -> you, m -> f)
dood2: Map(hey -> you, m -> f)

Passing Args to ScalaCheck from the Command Line

I'll spare you the entire example and just get to the point here.

To set the minimum number of successful tests, use '-s'


> test-only *IntParallelArrayCheck -- -s 5000
...
[info] == scala.collection.parallel.mutable.IntParallelArrayCheck ==
[info] OK, passed 5000 tests.

The rest of the examples are very similar. Consult ScalaCheck itself for further documentation.

-s (minSuccessfulTests): Number of tests that must succeed in order to pass a property
-d (maxDiscardedTests): Number of tests that can be discarded before ScalaCheck stops testing a property
-n (minSize): Minimum data generation size
-x (maxSize): Maximum data generation size
-w (workers): Number of threads to execute in parallel for testing
-z (wrkSize): Amount of work each thread should do at a time

Default Arguments

Maybe you want to pass default arguments to your test framework of choice. That is, you want to pass the same arguments every time you run your tests (and still be able to pass more dynamically, if you wish). You can do this too, by adding elements to 'testOptions'. Doing so will take effect when you run sbt test, sbt test-only, sbt test-quick, etc.

Let's say you want the minimum number of ScalaCheck passing tests to be 5000 every time you run ScalaCheck. Here is how you do that:


override def testOptions = 
  super.testOptions ++ 
  Seq(TestArgument(TestFrameworks.ScalaCheck, "-s", "5000"))

Or maybe you only ever want to run your fast ScalaTest tests.


override def testOptions = 
  super.testOptions ++ 
  Seq(TestArgument(TestFrameworks.ScalaTest, "-n", "fast"))

Custom Test Tasks

Maybe you have some test that you run all the time. At the sbt command line, instead of saying > test-quick blah.blah.Blah, you just want to say: > blah.

You can do that too, and you can pass args to it dynamically, and you get the default arguments as well. You'll have to take the following code and put it into your sbt file. It's ugly, I know, but the results are nice.


lazy val blah = singleTestTask("blah.blah.Blah")
 
private def singleTestTask(className: String) = task { args =>
  defaultTestTask(TestFilter(_ == className) :: 
   testOptions.toList ::: ScalaTestArgs(args))
}

private def newScalaTestArg(l: String*) = 
  TestArgument(TestFrameworks.ScalaTest, l:_*)

private def ScalaTestArgs(args: Seq[String]): List[TestArgument] = {
  def KVArgs(args: Seq[String]): TestArgument = 
    newScalaTestArg(args.map("-D" + _):_*)
  def tagsFromArgs(tags: Seq[String]): List[TestArgument] = {
    if (tags.isEmpty) Nil else 
      List(newScalaTestArg("-n", tags.mkString(" ")))
  }
  val (kvs, tags) = args.partition(_.contains("="))
  KVArgs(kvs.toSeq) :: tagsFromArgs(tags.toSeq)
}

Subclasses

Finally, maybe you want to set up tasks like 'test-fast' and 'test-slow' which only run your fast and slow tests. Let's imagine that you've created a trait called blah.blah.SlowTest and all of your slow tests extend that trait. Tests that don't extend SlowTest are considered to be in the fast group. With the code below, at the sbt command line you'll be able to say > test-fast, and > test-slow.

Again, it's a bit ugly to put this stuff in your sbt project file, but it works for now, until I come up with a better plan :)


lazy val testSlow = 
  runSubclassesOf("org.nlogo.util.SlowTest")
lazy val testFast = 
  runEverythingButSubclassesOf("org.nlogo.util.SlowTest")

private def runSubclassesOf(className: String) = {
  val subclass: Boolean => Boolean = x => x
  subclassTest(className, subclass)
}
  
private def runEverythingButSubclassesOf(className: String) = {
  val notSubclass: Boolean => Boolean = x => ! x
  subclassTest(className, notSubclass)
}

private def subclassTest(className: String, 
                         subclassCheck: Boolean => Boolean) = task { 
  args =>
  lazy val jars = 
    testClasspath.get.toList.map(_.asURL).toArray[java.net.URL]
  lazy val loader = 
    new java.net.URLClassLoader(jars,buildScalaInstance.loader)
  def clazz(name: String) = Class.forName(name, false, loader)
  lazy val superClass = clazz(className)
  def filter = 
    TestFilter(c => 
      subclassCheck(superClass.isAssignableFrom(clazz(c))))
  defaultTestTask
    (filter :: testOptions.toList ::: ScalaTestArgs(args))
}

Yeah...

Let me know if you have any issues with any of this, I'll be happy to help. I put it together pretty quickly. If there are any glaring errors or omissions, please tell.