I've been evaluating the JUnit test generation product by Agitar in my spare time and want to spit out a few thoughts on it. Most of this will be criticism, but I'd like to first say that I think it has the potential to be an absolutely great product. If you don't know much about Agitar, they have a product that will automatically generate JUnit tests for your code. You can check out
http://www.agitar.com/ and
http://www.junitfactory.com for more information.
OK, on to the important content.
Many of the tests created were lacking meaning
This is OK, however, because you can provide test helper classes that are read at test creation time to help create more meaningful tests. This is pretty easy during regular development, and it makes sense; the program can't know everything about your code. Unfortunately, though, for large legacy code bases this is damn near impossible.
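To make the helper idea concrete before getting to the legacy problem, here's a minimal sketch of what a test helper might look like. The Invoice class and the helper convention are entirely made up for illustration; Agitar's actual helper contract may differ.

import java.math.BigDecimal;

// Hypothetical domain class, standing in for one of yours.
class Invoice {
    private final String customer;
    private BigDecimal total = BigDecimal.ZERO;

    Invoice(String customer) { this.customer = customer; }

    void addLineItem(int qty, BigDecimal unitPrice) {
        total = total.add(unitPrice.multiply(BigDecimal.valueOf(qty)));
    }

    BigDecimal getTotal() { return total; }
}

// Hypothetical helper: hands the generator a valid, meaningful Invoice
// instead of letting it guess at constructor arguments and setters.
public class InvoiceTestHelper {
    public static Invoice createInvoice() {
        Invoice invoice = new Invoice("ACME Corp");
        invoice.addLineItem(3, new BigDecimal("9.99"));
        return invoice;
    }
}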
For example, let's say you have a code base with 5000 classes. 5000 classes means 5000 test classes. Let's give Agitar the benefit of the doubt (there isn't much doubt, since I've done this many times, but let's give it to them anyway) and say that 80% of the generated test classes are meaningful (again, in practice the number is smaller). 80% meaningful coverage means 1000 test classes that aren't useful. This is a problem because:
- It's likely to be your most important or most complex classes whose tests lack meaning.
- How do you really know which ones are lacking meaning? Do you have to look through 5000 test classes?
- Do you have to write test helpers for every test class?
- What happens if you have no idea what your own classes are doing? (Think this is crazy? It's not. What if you inherit the code? What if you wrote the class four years ago? What if your team is large and people just hack things together?) How could you even write test helpers for this?
- What if writing test helpers requires instantiating dependencies that are difficult to instantiate, which is the very reason you haven't bothered writing tests to begin with? (A sketch of this situation follows this list.)
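Here's a hypothetical example of that last point. The names are invented, but the shape is common in legacy Java: a constructor that reaches straight into the container.

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Invented legacy class, but a common shape. The constructor reaches
// into JNDI, so neither you nor a test generator can instantiate it
// outside the container, which is exactly why it has no tests.
public class OrderProcessor {
    private final DataSource dataSource;

    public OrderProcessor() throws NamingException {
        // Hard-wired container lookup: blows up in a plain JUnit run.
        InitialContext ctx = new InitialContext();
        this.dataSource = (DataSource) ctx.lookup("java:comp/env/jdbc/OrdersDB");
    }

    public void process(long orderId) {
        // ... queries dataSource, updates rows, and so on ...
    }
}

A test helper for this class would have to fake the whole JNDI environment before it could even call the constructor.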
You see, this quickly becomes an overhead nightmare. This is especially true for a team that has a large untested code base. They have an untested code base because they don't know how to test. The only way to get them to start testing is with very low overhead, and they don't want to wade through thousands of new classes.
Agitar is going to say to this, "But you have 80% test coverage at this point, which is far superior to what you had before." They might even say don't bother looking through the test code (I'm not sure if they'd really say this, but it's possible). And maybe this works. Or maybe it only works for suckers. I'm not sure.
My feeling is that the product is simply not ready for legacy code bases. And in fact, by the nature of these problems, I'm not quite sure a product could EVER solve them. When I think back to when I first heard about the product, that was indeed my first thought. But, after seeing some of the videos I was very optimistic, especially with Kent Beck involved. I approached it with an open mind and was very hopeful. I'm not sure now.
The generated tests were difficult to read
I really should provide some examples here. What I'll likely do is finish this blog, get my examples at a later date and post them in a comment. So don't trash me because I haven't given examples. I don't have them on me.
Some of the code was pretty hard to read. Some of it was nice and easy to read. That's OK; it's what I expected, and it's livable on new projects. If a generated test is hard to read, it's likely that your code isn't as clean as it needs to be.
The problem with this is pretty clear, though. It's likely that most of the tests generated for large legacy code bases will be difficult to read. This isn't the fault of Agitar; garbage in, garbage out. But it just adds fuel to the maintenance fire. Not only do developers have to wade through garbage code, but now they have to decipher difficult-to-read test code that they've never seen before, just to make sure it's a valid test. They might be better off just writing a test themselves, except that they might not know how.
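Since I don't have my real examples on me (see above), here's a hand-written imitation of the flavor. To be clear, this is NOT actual Agitar output; the pad() utility is made up, and the point is the style: magic inputs and assertions that record incidental behavior.

import junit.framework.TestCase;

// Hand-written imitation of a generated-style test; NOT real Agitar output.
public class PadGeneratedStyleTest extends TestCase {

    // Stand-in for a legacy utility under test.
    static String pad(String s, int width, char c) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < width) {
            sb.append(c);
        }
        return sb.toString();
    }

    public void testPadWithNegativeWidthReturnsInputUnchanged() {
        // Magic inputs with no domain meaning.
        String result = pad("\u0000\u0000", -1, 'A');
        // Records whatever the code happened to return, not what it
        // should return; a reviewer learns nothing about intent.
        assertEquals("\u0000\u0000", result);
    }
}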
Brief Use Case Example
I'm going to give a simple little example that demonstrates how test generation might be used, and some problems with it. Let's say a developer changes some code somewhere in a difficult-to-read class. He then runs the Agitar unit tests and some of them fail. Should he be worried? Should he look into it and fix the problem, or should he ignore it and regenerate the Agitar tests? If he looks into it, he might have to wade through hard-to-read code. If he doesn't, he might be ignoring a real problem. Is it possible that he broke something in an area that he doesn't even seem to be working in? That could be frustrating. Legacy code is frustrating to work with. Period.
There is an approach in line with Fowler's Refactoring that does help a bit, though: only read the tests in the area that you're working in. You might have to read through a couple of hard-to-read test classes ... but it's only a couple. Additionally, it'll give you some examples, and incentive to clean up the code in that area.
If hand-written tests are failing in any area, fix them. If Agitar tests are failing in your current area, read them and try to figure out why; you might have a legitimate bug. After you've looked at the generated tests in your area and fixed up your code, go ahead and regenerate all the tests.
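To make that concrete, here's a made-up sketch of the kind of failure that forces this judgment call. Suppose a generated test had pinned down the literal exception message, and a later cleanup improved that message:

import junit.framework.TestCase;

// All names here are invented; the point is the judgment call, not the API.
public class RegistryGeneratedStyleTest extends TestCase {

    // Stand-in legacy class. The message used to be "bad key"; a recent
    // cleanup changed it to include the offending key.
    static class Registry {
        static Object lookup(String key) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
    }

    public void testLookupThrowsOnUnknownKey() {
        try {
            Registry.lookup("nosuchkey");
            fail("Expected IllegalArgumentException");
        } catch (IllegalArgumentException e) {
            // Now fails. Real regression, or a stale generated assertion?
            // You have to read both the test and the code to know.
            assertEquals("bad key", e.getMessage());
        }
    }
}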
Questions
Their website says something like "Reduce the drag from fragile Java code by 50% or more." What does that really mean? Does that tie into what I was saying before about the 80% meaningful tests? Does it mean you'll get your work done 50% faster? What is code drag? Yes, yes, it's just a marketing slogan, and I couldn't do much better, but what does it mean?
It also says 80% test coverage, guaranteed. That's a bold claim, IMO. I'm very curious to see what kind of awful code bases they've worked with, and what coverage they got. What about something like a hideous EJB2 project that can only be tested in the container?
What about new projects?
Once again, I want to reinforce that this criticism is meant as constructive criticism to make the product better. I'm not trying to bash the product at all. I do think the product could work great with new TDD projects. You write some tests, write some code to make your tests pass, use AgitarOne to generate some extra tests to get some thoughts about your code, refactor, and repeat. I think it's a great supplement to the faulty human mind, which can't see everything. The problems that exist with legacy code simply don't exist with green code. Of course you have to write test helpers, but that's easy when you have a nice clean code base and you're writing new code.
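As a rough sketch of that loop, here's the kind of intention-revealing test you'd write by hand first. Money is a class I made up for illustration; the generated tests would then probe the edges around it (nulls, boundary amounts, and so on) once the code exists.

import junit.framework.TestCase;

// Money is made up; the point is the division of labor. You write the
// intention-revealing test first; the generator probes the edges after.
public class MoneyTest extends TestCase {

    static class Money {
        final int amount;
        final String currency;
        Money(int amount, String currency) {
            this.amount = amount;
            this.currency = currency;
        }
        Money add(Money other) {
            return new Money(amount + other.amount, currency);
        }
    }

    public void testAddingTwoAmounts() {
        Money ten = new Money(5, "USD").add(new Money(5, "USD"));
        assertEquals(10, ten.amount);
    }
}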
I would recommend this product to NYSE for new development, but we don't have much new development. I'd certainly try to use it for any open source project that I'm working on.
Product Ideas
There are a number of small issues I have in the area of new development too.
- Why not use JUnit 4?
- Why not TestNG?
There are also a number of ideas that I have for the product:
- How about annotations in the code that provide useful hints to the test generator?
- How about annotations to tell the test generator not to generate tests for a class or a method?
- How about tying into, or influencing, JSR 305? This JSR is working to define annotations for bug detection, and is primarily being run by the FindBugs and JetBrains folks. Agitar should certainly get involved. For example, @Nullable (or whatever name they settle on) marks a method that might return null. Little hints like this could certainly assist in test generation. (A rough sketch of these ideas follows this list.)
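None of these annotations exist in Agitar's product as far as I know; this is just a sketch of what the ideas above might look like in code. @NoGeneratedTests and @TestHint are names I made up, and the hint text is purely illustrative.

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Made-up annotation: opt a class or method out of test generation.
@Retention(RetentionPolicy.CLASS)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface NoGeneratedTests { }

// Made-up annotation: a free-form hint for the generator.
@Retention(RetentionPolicy.CLASS)
@Target(ElementType.METHOD)
@interface TestHint {
    String value();
}

class ReportFormatter {

    @TestHint("width is always >= 1; don't bother probing negatives")
    String rightAlign(String s, int width) {
        return String.format("%" + width + "s", s);
    }

    @NoGeneratedTests // trivial accessor; a generated test adds nothing
    String name() {
        return "formatter";
    }
}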
Debate
I'd really like to talk some more with the Agitar guys on this subject, and with Kent Beck himself.
I openly invite anyone from Agitar to comment on this blog demonstrating how I'm wrong. I want to be wrong here. I want to have 80% meaningful test coverage on bad code. That would make my life so much easier. Please explain how this doesn't turn into an overhead nightmare when working with large legacy code bases. Please tell me how this helps get developers who've unfortunately never written tests on board.