Wednesday, June 24, 2009

Pimp vs Just Plain Fix my Library

I assume most people familiar with Scala are familiar with Pimp My Library. It's just a fun and useful thing to be able to add a missing method onto an API, or to sometimes be able to treat an object like something else.

As fun as it is (especially with the word Pimp), I kind of want to take the fun out of it a little bit. I want to say that it's not just about adding that one great feature. Let me make this boring and annoying assertion: Pimping is most useful for fixing crappy, terrible, miserable APIs. And while that's cool, and useful, it kind of sucks. There's so much terrible code out there that is a nightmare to work with. I feel like I shouldn't have to be fixing up other people's crap, but at least I can.

Now for an example. Ever dealt with Java's ThreadGroup? Ever want to just get the Threads in a ThreadGroup? Sounds reasonable enough right? Holy Mother.... Couldn't be more wrong. Check out this gem that I lovingly stole straight from the Javadoc:


public int enumerate(Thread[] list)
Copies into the specified array every active thread in this
thread group and its subgroups.

First, the checkAccess method of this thread group is
called with no arguments; this may result in a security exception.

An application might use the activeCount method to
get an estimate of how big the array should be, however if the
array is too short to hold all the threads, the extra threads are
silently ignored.
If it is critical to obtain every active
thread in this thread group and its subgroups, the caller should
verify that the returned int value is strictly less than the length
of list.

Due to the inherent race condition in this method, it is recommended
that the method only be used for informational purposes.

Parameters: list - an array into which to place the list of threads.
Returns: the number of threads put into the array.

WHAT?!? This API is just plain broken. It desperately needs fixing. All I want to do is get all the Threads in the Group...why should I have to deal with creating my own Array (and guessing the size), having it work entirely off side effects...And does it really return the number of threads it put into the Array? I'm afraid so. Broken. And does it really SILENTLY IGNORE things when there are too many threads to fit in the array!?! Horrible disaster.
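To make the complaint concrete, here is a minimal sketch of the silent truncation. The group name, thread names, and the EnumerateDemo object are made up for illustration; only ThreadGroup.enumerate is the real API under discussion.

```scala
import java.util.concurrent.CountDownLatch

object EnumerateDemo {
  /** Starts three threads in a fresh group, then enumerates them into an
    * array that only has room for two. Returns (reported count, filled slots). */
  def run(): (Int, Int) = {
    val group = new ThreadGroup("demo")
    val latch = new CountDownLatch(1)
    // Each thread parks on the latch so it is still alive when we enumerate.
    val threads = (1 to 3).map { i =>
      new Thread(group, new Runnable { def run(): Unit = latch.await() }, "t" + i)
    }
    threads.foreach(_.start())
    try {
      val tooSmall = new Array[Thread](2)
      // No exception and no warning: the third thread is simply left out.
      val copied = group.enumerate(tooSmall)
      (copied, tooSmall.count(_ != null))
    } finally {
      latch.countDown()
      threads.foreach(_.join())
    }
  }
}
```

The only hint that something was dropped is that the returned count equals the array length, which is exactly the check the Javadoc tells you to write yourself.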

It's not even that fun to fix, to be honest, but the resulting code is far more manageable. Here's a solution.

import java.lang.Thread.State
import java.lang.Thread.State._

object PimpedThreadGroup {
  implicit def threadGroupToPimpedThreadGroup(tg: ThreadGroup): PimpedThreadGroup =
    new PimpedThreadGroup(tg)
}

class PimpedThreadGroup(threadGroup: ThreadGroup) {

  def getThreads: List[Thread] = getThreads(true)

  def getThreads(recursive: Boolean): List[Thread] = {
    // enumerate silently drops threads that don't fit, so if the array
    // comes back full, assume we missed some and retry with a bigger one.
    def collect(sizeEstimate: Int): Seq[Thread] = {
      val ths = new Array[Thread](sizeEstimate)
      if (threadGroup.enumerate(ths, recursive) == sizeEstimate)
        collect(sizeEstimate + 10)
      else for (t <- ths; if (t != null)) yield t
    }
    collect(threadGroup.activeCount() + 10).toList
  }

  def filter(state: State): List[Thread] = {
    getThreads.filter(_.getState == state)
  }

  def exists(state: State): Boolean = {
    getThreads.exists(_.getState == state)
  }

  def any_threads_alive_? = {
    getThreads.exists(t => t.getState != NEW && t.getState != TERMINATED)
  }

  def any_threads_running_? = {
    getThreads.exists(_.getState == RUNNABLE)
  }

  def any_threads_in_timed_waiting_? = {
    getThreads.exists(_.getState == TIMED_WAITING)
  }
}

Most important: def getThreads: List[Thread]. Now I can simply call threadGroup.getThreads and get back a List[Thread]. That's all I ever wanted. Is that too much to ask?

I can also add something to simply treat a ThreadGroup as a List[Thread], if I want. I'm not sure I like this because it could get me into some trouble (and it always does recursive search), but I do like the power it gives - I can call any method on List directly on the ThreadGroup.

implicit def ThreadGroupToList(tg: ThreadGroup): List[Thread] = {
  new PimpedThreadGroup(tg).getThreads
}
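Here's a rough, self-contained sketch of what calling code can look like with both conversions in scope. To keep it short it uses a cut-down stand-in for PimpedThreadGroup (recursive search only); the ThreadGroupDemo object, the "workers" group, and the worker thread names are invented for the example.

```scala
import java.util.concurrent.CountDownLatch
import scala.language.implicitConversions

object ThreadGroupDemo {
  // Cut-down stand-in for PimpedThreadGroup: recursive search only.
  class PimpedThreadGroup(threadGroup: ThreadGroup) {
    def getThreads: List[Thread] = {
      def collect(sizeEstimate: Int): List[Thread] = {
        val ths = new Array[Thread](sizeEstimate)
        if (threadGroup.enumerate(ths, true) == sizeEstimate) collect(sizeEstimate + 10)
        else ths.toList.filter(_ != null)
      }
      collect(threadGroup.activeCount() + 10)
    }
  }

  implicit def threadGroupToPimpedThreadGroup(tg: ThreadGroup): PimpedThreadGroup =
    new PimpedThreadGroup(tg)

  implicit def threadGroupToList(tg: ThreadGroup): List[Thread] =
    new PimpedThreadGroup(tg).getThreads

  /** Returns the sorted thread names and the number of live threads. */
  def demo(): (List[String], Int) = {
    val group = new ThreadGroup("workers")
    val latch = new CountDownLatch(1)
    val workers = (1 to 3).map { i =>
      new Thread(group, new Runnable { def run(): Unit = latch.await() }, "worker-" + i)
    }
    workers.foreach(_.start())
    try {
      // getThreads is not on ThreadGroup: the first conversion kicks in.
      val names = group.getThreads.map(_.getName).sorted
      // count is a List method: the second conversion kicks in.
      val alive = group.count(_.isAlive)
      (names, alive)
    } finally {
      latch.countDown()
      workers.foreach(_.join())
    }
  }
}
```

The sketch deliberately avoids method names that exist on both PimpedThreadGroup and List (filter and exists, for instance), since having two conversions that both supply the same name is the kind of trouble mentioned above.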

By the way, I'm not going to explain it. I'm assuming you already know how it works (I might add a few more examples of how it can be used though, but I'm sure most people already understand). If you don't, you can click the Pimp My Library link above, or Google it. There's plenty out there.

In conclusion, was this a bit of a rant? I guess. Here's what we should take away from this though: Pimp My Library is a very effective tool not just for adding nice things to APIs, but for fixing broken ones. If it's our duty to refactor our broken code, to always make our code better when we have to modify it, then it's just as much our duty to fix up APIs. In this case, we just don't have the original source. Refactoring without source code. It's actually pretty awesome.

Sunday, June 14, 2009

Scala MultithreadedTC Port

I haven't posted in quite a while now, but only because I've been working really hard. I asked Doug Lea if I could help him on his upcoming Scala concurrent data structures endeavor, especially on testing, and to my surprise he sounded quite happy about the idea. He's going to get started sometime in July, and I figured from now until then would be a great time to get caught up on testing concurrent data structures, and to have a nice framework in place before any code actually gets written. To that end, I've decided to port MultithreadedTC to Scala.

The reasons for doing so are quite simple: we can take advantage of Scala's flexible syntax, higher-order functions (HOFs), yada yada...I don't think that needs to be explained yet again. And I don't think this particular example is one where those features are especially valuable. The resulting code is definitely nicer than the original Java code, and that's simply all.

Over the next few days I'll probably update this post with more examples. I'll start with one tonight. You can also learn more about how it all works by clicking the link above.

I'll start with an example from the original Java code from the MultithreadedTC source, and I'll try to give a reasonable explanation of what is going on.

1 class MTCTimedOffer extends MultithreadedTestCase {
2   ArrayBlockingQueue<Object> q;
4   @Override public void initialize() {
5     q = new ArrayBlockingQueue<Object>(2);
6   }
8   public void thread1() {
9     try {
10       q.put(new Object());
11       q.put(new Object());
13       freezeClock();
14       assertFalse(q.offer(new Object(), 25, TimeUnit.MILLISECONDS));
15       unfreezeClock();
17       q.offer(new Object(), 2500, TimeUnit.MILLISECONDS);
18       fail("should throw exception");
19     } catch (InterruptedException success){ assertTick(1); }
20   }
22   public void thread2() {
23     waitForTick(1);
24     getThread(1).interrupt();
25   }
26 }

This code tests the interactions of two threads. The second thread interrupts the first thread (line 24) as it's attempting to offer an object to the queue on line 17. One challenge that the library means to solve is: how can we be sure that thread2 calls interrupt at the appropriate time? It does so by maintaining an internal metronome (or clock). The clock ticks forward only when all threads are blocked and at least one thread is waiting for the clock to tick.
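As a toy model only (this is not the framework's code), the ticking rule can be written as a pure function. The ClockRule object and its parameter names are hypothetical, and treating BLOCKED, WAITING, and TIMED_WAITING as "blocked" is an assumption about what the framework checks.

```scala
import java.lang.Thread.State
import java.lang.Thread.State._

object ClockRule {
  // Assumption: these are the thread states that count as "blocked".
  private val blockedStates: Set[State] = Set(BLOCKED, WAITING, TIMED_WAITING)

  /** The clock may advance only if it is not frozen, every test thread is
    * blocked, and at least one of them is waiting for a tick. */
  def shouldTick(states: Seq[State], someoneWaitingForTick: Boolean, frozen: Boolean): Boolean =
    !frozen &&
      someoneWaitingForTick &&
      states.nonEmpty &&
      states.forall(blockedStates.contains)
}
```

In the test above, both threads are blocked during the timed offer on line 14, but the clock is frozen, so this rule says no tick. On line 17 the same situation arises with the clock unfrozen, and the tick is allowed.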

On line 23 thread2 waits for the clock to tick. Remember, the clock doesn't tick until all threads are blocked. So when does thread1 become blocked? Well, since the queue can only contain two elements, it becomes blocked on line 14, when it tries to offer a third object. However, in this case, we've frozen the clock, so the clock will not tick, and thread2 will continue to wait.

Finally on line 17 thread1 becomes blocked offering a third object to the queue, and the clock is not frozen. Behind the scenes the framework sees that all threads are blocked, and indeed someone is waiting for the clock to tick (thread2). The clock ticks, and thread2 advances. He then interrupts thread1 using the getThread method. thread1 enters its catch block on line 19, checks to see that the clock has indeed reached 1, and everyone is happy.

Sort of...

There are a few things we can do better. Most importantly, of course, we can get rid of a lot of semicolons! Ok, maybe that's not most important, but I really hate semicolons. Here's a more serious list of the deficiencies in the code above (oh and, it's really not that bad at all).

  1. The call to getThread is not type safe at all, and breaks on refactoring if you decide to rename a thread. How does that work anyway? Answer: the threads are named by the method names. On line 8 we have "public void thread1" and so we have a thread named thread1. This was originally done for a good reason: creating threads in Java is very verbose. This cleans that up a lot, at the minor price of some type safety.
  2. When we freeze the clock we could easily forget to unfreeze it, leading to quite a bit of headache.

There are more, and I will address them here, but let us focus on those two for now. I'll give the Scala equivalent first, then explain how I've addressed those issues.

1 class MTCTimedOffer extends MultiThreadedSuite {
3   val q = new ArrayBlockingQueue[String](2)
5   val producer = thread{
6     q put "w"
7     q put "x"
9     withClockFrozen {
10       q.offer("y", 25, TimeUnit.MILLISECONDS) mustBe false
11     }
13     intercept[InterruptedException] {
14       q.offer("z", 2500, TimeUnit.MILLISECONDS)
15     }
17     tick mustBe 1
18   }
20   thread{
21     waitForTick(1)
22     producer.interrupt()
23   }
24 }

Notice that the code isn't much smaller, 24 vs 26 lines. I'm not touting a giant absurd improvement in any way. However, let's look at the issues above.

  1. On line 5 ( val producer = thread{ ) an actual Thread object is returned, and that thread can be referenced by any of the other threads in the system by name, in an intuitive, type safe way. I do this simply, by having the thread method take a HOF and wrapping that HOF in a Thread. Easy. Notice on line 22 the second thread references the producer thread directly.

    Also, if I don't feel like it, I don't have to assign the thread to a val, and I don't have to reference it anywhere. The Thread created on line 20 isn't really needed by anyone else, so it remains anonymous (behind the scenes it actually gets the name thread2, and can still be gotten using the getThread method seen above).
  2. The second issue was a very simple one, where we could forget to unfreeze the clock. Simple is ok, because all these simple fixes add up big in the long run, IMO. This issue is solved by the withClockFrozen method, which appears on line 9. This is also a method that takes a function. It does the work of freezing the clock for you, running the function, then unfreezing the clock. Simple, but effective.
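Both fixes boil down to methods that take a block of code. Here's a minimal sketch of the idea, not the actual suite code: the MiniSuite object and its frozen flag are stand-ins I made up, and starting the thread immediately is an assumption made to keep the sketch short.

```scala
object MiniSuite {
  // Stand-in for the real clock: just a flag saying whether it is frozen.
  @volatile private var frozen = false
  def isClockFrozen: Boolean = frozen

  /** Wraps the block in a Thread, starts it, and hands the Thread back so
    * other test threads can refer to it through an ordinary val. */
  def thread(body: => Unit): Thread = {
    val t = new Thread(new Runnable { def run(): Unit = body })
    t.start()
    t
  }

  /** Freezes the clock, runs the block, and always unfreezes afterwards,
    * even if the block throws. */
  def withClockFrozen[T](body: => T): T = {
    frozen = true
    try body finally frozen = false
  }
}
```

Because the unfreeze sits in a finally, forgetting it is no longer possible, and because thread returns a plain Thread, something like producer.interrupt() is checked by the compiler instead of looked up by name at runtime.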

Ok, it's really late, and I have to work tomorrow. But expect this post to be updated regularly in the near future. I'll likely be working with Bill Venners and Eric T. on getting this stuff into ScalaTest and/or Specs, with DL on improving it, and hopefully with Bill Pugh, the original author. Any ideas, comments, improvements, etc. would be greatly appreciated! I'll have the source code somewhere where people can see it very soon. Bye!

Monday, June 01, 2009

Comedy: Scala vs. Ruby vs. Objective C

Please submit your favorite (or least favorite) language!

Added: Java, Clojure, JavaScript


Scala

this.computers = Array(
Map("Name" -> "MacBook", "Color"->"White"),
Map("Name" -> "MacBook Pro", "Color"->"Silver"),
Map("Name" -> "iMac", "Color"->"White"))


Ruby

self.computers = [
{:Name=>"MacBook", :Color=>"White"},
{:Name=>"MacBook Pro", :Color=>"Silver"},
{:Name=>"iMac", :Color=>"White"}]

Objective C

NSDictionary *row1 =
[[NSDictionary alloc] initWithObjectsAndKeys:
@"MacBook", @"Name", @"White", @"Color", nil];
NSDictionary *row2 =
[[NSDictionary alloc] initWithObjectsAndKeys:
@"MacBook Pro", @"Name", @"Silver", @"Color", nil];
NSDictionary *row3 =
[[NSDictionary alloc] initWithObjectsAndKeys:
@"iMac", @"Name", @"White", @"Color", nil];

NSArray *array =
[[NSArray alloc] initWithObjects: row1, row2, row3, nil];

self.computers = array;

[row1 release];
[row2 release];
[row3 release];
[array release];


Java

Assuming this field:

Map<String, String>[] computers;


Map[] maps = {
    new HashMap(){{
        put("Name", "MacBook");
        put("Color", "White");
    }},
    new HashMap(){{
        put("Name", "MacBook Pro");
        put("Color", "Silver");
    }},
    new HashMap(){{
        put("Name", "iMac");
        put("Color", "White");
    }}
};
this.computers = maps;

Notice you can't use generics with the array creation, and, 4 space indenting! Yeah!


Clojure

(def computers [
  {:Name "MacBook" :Color "White"}
  {:Name "MacBook Pro" :Color "Silver"}
  {:Name "iMac" :Color "White"}])


JavaScript

this.computers =
[{'name':'MacBook', 'color':'White'},
{'name':'MacBook Pro', 'color':'Silver'},
{'name':'iMac', 'color':'White'}];