Jack Cough on Software

Tuesday, October 30, 2007

Closures

Can anyone tell me how closures are implemented?

Scala has closures, and it compiles to Java bytecode, but Java doesn't have closures. This is not to say that it can't be done, of course. I'm sure the implementation is probably not that hard. I'm just really curious to know how it's done... I could go look at the source code for the Scala compiler, of course. But I have been working on a number of other things.

I wrote a library for remote management of CruiseControl that seems to be much easier to use than the one they are using in the new Dashboard. I'm going to try to get that submitted here shortly. I'm starting to feel pretty confident in my development skills. Finally. I'm also starting to feel confident in my code reading skills. I was able to look through the CC Java code with ease. Of course it helps that it's mostly nice clean code, but still, I'm doing well.

Well enough, I suppose, that I could figure out how closures are implemented. And, since no one reads my blog anyway, I think I'm on my own.
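One common way to get closures onto a platform that lacks them is to turn each closure into a small class: captured variables become fields and the closure body becomes a method. Here is a hand-written Java sketch of that idea for something like `x => x + n`. It is not the Scala compiler's actual output, and every name in it (`IntFunction1`, `AdderClosure`, `IntBox`, `CounterClosure`, `ClosureSketch`) is made up for illustration.

```java
// Hypothetical hand-written equivalent of a closure such as: x => x + n
interface IntFunction1 {
    int apply(int x);
}

// A captured variable that the closure only reads can be copied into a field.
final class AdderClosure implements IntFunction1 {
    private final int n; // captured value

    AdderClosure(int n) {
        this.n = n;
    }

    public int apply(int x) {
        return x + n;
    }
}

// A captured variable that the closure mutates needs a shared heap cell,
// so the enclosing method and the closure both see the same value.
final class IntBox {
    int value;
}

final class CounterClosure implements IntFunction1 {
    private final IntBox count; // shared, mutable capture

    CounterClosure(IntBox count) {
        this.count = count;
    }

    public int apply(int step) {
        count.value += step;
        return count.value;
    }
}

final class ClosureSketch {
    // Roughly what "val addN = x => x + n; addN(x)" could turn into.
    static int addViaClosure(int n, int x) {
        IntFunction1 addN = new AdderClosure(n);
        return addN.apply(x);
    }

    // The enclosing scope observes updates made inside the closure.
    static int countTwice(int step) {
        IntBox count = new IntBox();
        IntFunction1 bump = new CounterClosure(count);
        bump.apply(step);
        bump.apply(step);
        return count.value;
    }
}
```

The part I'd expect to be fiddly in a real compiler is the second case: deciding which captured variables are mutated and therefore need the extra box.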

Thursday, October 18, 2007

But That's How It Was When I Got Here

Companies typically let their build system go to shit.

Why is that? Does anyone have an answer to this? The first thing I ever do on new projects is make sure I have a consistent build process. ThoughtWorks was the same way. The first thing I look at when joining a new company is the build process, and how to fix it. Why don't most companies do this themselves? I have some ideas. Here they are:

They don't know how.
They somehow (wrongly) think it's not as valuable as developing the next features
They get comfortable with the fact that it takes an hour to build, because it works
They suffer from "But That's How It Was When I Got Here" syndrome

All of these are BAD BAD BAD reasons, and all make me very offended. Those might not be the only reasons, and they might not be the best reasons, but regardless, I'm going to tear them apart. My intention is not to make my coworkers feel bad, but to wake them up to the reality that there's a lot of new technology out there that they need to be learning and leveraging.

They don't know how.

If you don't know how to do something entirely modern, then you need to start learning. Everyone knows this, but so few people do it. Why? (Topic for a whole new blog post.) If you don't do it, you're going to be passed by. This will occur first on the individual level: people will start to pass you, making more money, and you'll wonder why. Well, knowledge is why.

Worse though, if not learning new technology is somehow your corporate culture, eventually your whole company is going to stagnate. I refuse to let this happen in my group. Fortunately we have some great people who are eager to learn.

They somehow (wrongly) think it's not as valuable as developing the next features

If you think it's not valuable, you're dead wrong. If you don't have a repeatable build process, someone will end up making a mistake and you'll deliver something broken to a client. Maybe that never happens, but you WILL end up slowly adding to your awful build process until you get something that is a total pile of nonsense. With this pile, any change takes days or weeks to figure out. You have to figure out all the side effects, and you have to somehow verify that an updated build process produces the same results as the old process.

They get comfortable with the fact that it takes an hour to build, because it works

Believe it or not, if you are comfortable with your build process, this is actually a sign that something might be wrong. I am never comfortable with my build process. I'm always tweaking it, trying to make it run faster, trying to remove duplication from it, trying this, trying that... I think a good build should take less than 5 minutes. Some people say ten. I don't agree. If you have an hour-long build, you're likely doing a bunch of manual steps and will run into the problems I pointed out in the last section. If you have an hour-long build and it's completely automated, then you have other issues that aren't quite as serious, but are still very bad.

But That's How It Was When I Got Here

This one is troublesome to me. I HATE legacy software. I feel like, I don't know, some kind of vigilante on a mission to KILL it. Unfortunately, even I find myself saying, "But that's how it was when I got here" from time to time. This is not a good excuse. If you're saying that now, you'll probably go on saying this until you end up in a situation like the last two items. This is almost an excuse for all the other items rolled up into one. It's like saying, "Yeah I know it's bad, but what can you do?" And on top of that, it's like a way out, a way of saying, "I'm not going to deal with that problem." Well, guess what? You are going to deal with it, the hard way.

So is this all avoidable? Of course. "But how?" One simple way: READ. DAMN YOU. READ MORE. That's it. Really. Just read what people are writing and you'll learn how to do things right. Now, if only we could get everyone to read. I smell a post.

Agitar Examples

I finally have a chance to post some examples of code generated by Agitar. I promised them in comments to the original post, but it turns out this is easier. First, I'll give a little background on the code that I wanted to test, and then I'll show the tests generated by Agitar. This example is designed to show the readability of Agitar. If a test fails after a change and it's difficult to read, what approach should a developer take - read the test and see why it failed, or regenerate the test?

I wrote a simple interface for returning all the files under a directory:

(By the way I lost all my formatting and I apologize. Formatting on Blogger is a PITA.)

public interface ListFilesStrategy {
    // Returns every regular file found under dir.
    List<File> listFiles(File dir);
}

I have two classes implementing this interface, RecursiveListFilesStrategy and NonRecursiveListFilesStrategy, which both extend AbstractListFilesStrategy. Here are the listings for AbstractListFilesStrategy and NonRecursiveListFilesStrategy:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public abstract class AbstractListFilesStrategy {

    // Rejects null and non-directory arguments up front.
    public void assertFileIsDirectory(File dir) {
        if (dir == null)
            throw new IllegalArgumentException("directory is null");

        if (!dir.isDirectory())
            throw new IllegalArgumentException("file is not directory");
    }

}

public class NonRecursiveListFilesStrategy extends AbstractListFilesStrategy
        implements ListFilesStrategy {

    // Walks the tree with a work list of directories instead of recursion.
    public List<File> listFiles(File dir) {
        assertFileIsDirectory(dir);

        List<File> files = new ArrayList<File>();

        List<File> directories = new ArrayList<File>();
        directories.add(dir);

        while (!directories.isEmpty()) {
            File currentDir = directories.remove(0);
            File[] children = currentDir.listFiles();
            if (children == null) continue; // listFiles() returns null on an I/O error
            for (File f : children) {
                if (f.isFile()) files.add(f);
                else directories.add(f);
            }
        }
        return files;
    }
}

Now the test Agitar generated that was difficult to read. This test tested the listFiles method on my NonRecursiveListFilesStrategy:

public void testListFilesWithAggressiveMocks1() throws Throwable {
    NonRecursiveListFilesStrategy nonRecursiveListFilesStrategy = new NonRecursiveListFilesStrategy();
    File file = (File) Mockingbird.getProxyObject(File.class);
    File file2 = (File) Mockingbird.getProxyObject(File.class);
    File[] files = new File[0];
    File file3 = (File) Mockingbird.getProxyObject(File.class);
    File[] files2 = new File[2];
    File file4 = (File) Mockingbird.getProxyObject(File.class);
    File file5 = (File) Mockingbird.getProxyObject(File.class);
    files2[0] = file4;
    files2[1] = file5;
    Mockingbird.enterRecordingMode();
    Boolean boolean2 = Boolean.TRUE;
    Mockingbird.setReturnValue(false, file, "isDirectory", "()boolean", new Object[] {}, boolean2, 1);
    ArrayList arrayList = (ArrayList) Mockingbird.getProxyObject(ArrayList.class);
    Mockingbird.replaceObjectForRecording(ArrayList.class, "()", arrayList);
    ArrayList arrayList2 = (ArrayList) Mockingbird.getProxyObject(ArrayList.class);
    Mockingbird.replaceObjectForRecording(ArrayList.class, "()", arrayList2);
    Boolean boolean3 = Boolean.FALSE;
    Mockingbird.setReturnValue(false, arrayList2, "add", "(java.lang.Object)boolean", new Object[] {file}, boolean3, 1);
    Mockingbird.setReturnValue(arrayList2.isEmpty(), false);
    Mockingbird.setReturnValue(arrayList2.remove(0), file2);
    Mockingbird.setReturnValue(false, file2, "listFiles", "()java.io.File[]", new Object[] {}, files, 1);
    Mockingbird.setReturnValue(arrayList2.isEmpty(), false);
    Mockingbird.setReturnValue(arrayList2.remove(0), file3);
    Mockingbird.setReturnValue(false, file3, "listFiles", "()java.io.File[]", new Object[] {}, files2, 1);
    Mockingbird.setReturnValue(false, file4, "isFile", "()boolean", boolean2, 1);
    Mockingbird.setReturnValue(false, arrayList, "add", "(java.lang.Object)boolean", boolean3, 1);
    Mockingbird.setReturnValue(false, file5, "isFile", "()boolean", boolean3, 1);
    Mockingbird.setReturnValue(false, arrayList2, "add", "(java.lang.Object)boolean", boolean3, 1);
    Mockingbird.setReturnValue(arrayList2.isEmpty(), true);
    Mockingbird.enterTestMode(NonRecursiveListFilesStrategy.class);
    List result = nonRecursiveListFilesStrategy.listFiles(file);
    assertNotNull("result", result);
}

This is certainly difficult to read. I'm pretty sure I could explain what it's doing, but I'm better read in testing and mocks than most developers. There are a few things they could do to clean it up, however. They could try to be more in line with Behavior Driven Development, and they have a few options in doing so. The method doesn't have an intent-revealing name: I know what method it's testing, but I don't know exactly what it's doing. They could put in "given, when, then" comments. They could use the Extract Method refactoring to separate out the setup, the action, and the assertions, or at least the setup portion.

All these things could be done, but even so there is a lot of setup happening here. Most developers aren't ready, or aren't willing, to read through it, and will likely start to ignore Agitar errors.
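For comparison, here is roughly what I mean by an intent-revealing name, "given, when, then" comments, and extracted setup. This is a hand-written sketch, not anything Agitar generates. It is plain Java rather than JUnit so it stands alone, it uses java.nio.file and UncheckedIOException (newer JDK APIs, an assumption about your environment), and `FileLister` and `ListFilesBehavior` are invented names; `FileLister` just repeats the listing logic from above so the example compiles by itself.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the listFiles logic above, repeated so the sketch is self-contained.
final class FileLister {
    static List<File> listFiles(File dir) {
        if (dir == null || !dir.isDirectory())
            throw new IllegalArgumentException("file is not directory");
        List<File> files = new ArrayList<File>();
        List<File> directories = new ArrayList<File>();
        directories.add(dir);
        while (!directories.isEmpty()) {
            File[] children = directories.remove(0).listFiles();
            if (children == null) continue; // unreadable directory
            for (File f : children) {
                if (f.isFile()) files.add(f);
                else directories.add(f);
            }
        }
        return files;
    }
}

final class ListFilesBehavior {
    // The name says which behavior is checked, not just which method is called.
    static void findsFilesInNestedDirectoriesButNotTheDirectoriesThemselves() {
        // given: a directory with one file at the top and one in a subdirectory
        File root = givenDirectoryWithNestedFile();

        // when: we list the files under it
        List<File> result = FileLister.listFiles(root);

        // then: both files come back, and the subdirectory itself does not
        thenContainsExactly(result, "top.txt", "nested.txt");
    }

    // Extracted setup: builds a real temporary directory tree (not cleaned up here).
    private static File givenDirectoryWithNestedFile() {
        try {
            Path root = Files.createTempDirectory("listfiles");
            Files.createFile(root.resolve("top.txt"));
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.createFile(sub.resolve("nested.txt"));
            return root.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Extracted assertion: compares file names, ignoring order.
    private static void thenContainsExactly(List<File> files, String... names) {
        List<String> actual = new ArrayList<String>();
        for (File f : files) actual.add(f.getName());
        if (actual.size() != names.length)
            throw new AssertionError("expected " + names.length + " files, got " + actual);
        for (String name : names)
            if (!actual.contains(name))
                throw new AssertionError("missing " + name + " in " + actual);
    }
}
```

This version touches the real file system instead of mocking File and ArrayList, which is a different trade-off than the generated test makes, but a developer can read it in one pass.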

Now, I was told by Barry at Agitar that I could put in a test helper class, which would basically provide the setup portion of the test, and then it would generate more meaningful results. Here is an example:

public class FileTestHelper implements ScopedTestHelper {

    public static File createFile() {
        return new File("./src/main");
    }
}

Should result in something like this …

public void testListFiles() throws Throwable {
    ArrayList result = (ArrayList) new NonRecursiveListFilesStrategy().listFiles(FileTestHelper.createFile());
    assertEquals("result.size()", 5, result.size());
}

Indeed, I could do something like this. But, there is a big problem with this. We have 5000+ classes that we want to generate tests for! How do we know which tests are meaningful, which ones need test helpers, yada yada?

I do promise more examples of simple methods that don't seem to be giving meaningful results. Mostly they just check to make sure methods catch NullPointerException. I do think the product will work wonderfully for green development, but will likely be ignored during legacy development. Please let me know your thoughts.

Sunday, October 14, 2007

IDEs vs vi

I just told my friend that instant IDE support for new languages is the place to be (I truly believe this can happen). In response he told me, "I love vi."

I know vi, and I can get around in it OK, but I'm not a hardcore vi guy. I know there are some things you can do in vi that are really nice and allow you to do some things very quickly and powerfully. But I don't know what they are. The real question here is, can vi do things that modern IDEs can't do? Can someone explain this to me?

I do think that vi is ancient technology, and I'm sure that's old news anyway. I don't know very many people who use vi or emacs anymore. I'm sure a lot of people would look at my friend funny for that comment. But I'm a little different. I'm curious.

What features do retired editors have that modern IDEs lack?

Questions on Blogger

So I like the look and feel of Blogger, but I have some problems with it.

I really need to be able to upload files as attachments to my blog.

Is there a way to do this?
Is there another site that allows you to do this?
If I host my own site, is there an OSS blog that I can put up that allows me to upload files?
Should I just use a wiki?
Should I just host my own site and link to files from Blogger to my site?

What's the best way to go about all this? Seems like the easiest is to go with the last option. But I don't want to host at home anymore. Does anyone know the easiest way to do hosting these days? I used to host at home a few years back, but I'm now out of touch with hosting.

Friday, October 12, 2007

A Day of JavaCC!

In a dream come true, I got to write a bunch of stuff using JavaCC at work today. It was fantastic. I was able to identify the fact that we needed JavaCC for a particular problem, and sell people on it as well - also very exciting points.

NOTE: I tried to write this post before but it came out horrible. I'm trimming it down. The original post is in the comments. What remains is mostly notes and tips on how to do development using JavaCC. Not a tutorial, just notes.

Here are the steps we took on the syntactical analysis side:

We took some examples of the input language and started writing the grammar for it. This took a while because I really had to get myself familiar with JavaCC all over again.
We tried to generate the parser by running JavaCC and failed with a Left Recursion error.
We looked up how to solve this and after a while figured out that we were simply missing a set of parentheses.
We generated the parser from the grammar and ran the parser against some complex input. It failed.
We repeated this process for a while until we figured out that we needed to make our grammar AND our input simpler to start, and work up. This was a good lesson. Don't try to get everything right in one pass; the errors will be overwhelming. Start small, work up.
We finally worked up slowly to the point where we could parse our original complex input.

After we got it parsing, we started adding our objects into the grammar file so that they can be produced at parse time. The same rule applies here. Start small - at the very bottom of your parse tree, the leaves, and have those return small objects that can be passed up the tree one level at a time. Slowly, you'll be able to build nice full objects. It's probably possible to start at the top as well, but I think it's more complicated. Depending on whether objects have to be constructed complete, it might even be impossible.
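The "start at the leaves" advice can be sketched with a tiny hand-written recursive descent parser. This is plain Java, not a JavaCC grammar and not our work code; the grammar and the names `Expr`, `Num`, `Add` and `TinyParser` are made up for illustration. Each production is a method that returns an object, so leaves (`Num`) get built first and are handed up to the rule that called them. The grammar is also written with repetition rather than left recursion, which is the usual way to keep this style of parser from looping.

```java
// Hypothetical grammar, written without left recursion:
//   expr := term ( "+" term )*
//   term := NUMBER | "(" expr ")"

abstract class Expr {
    abstract int eval();
}

// Leaf of the parse tree: the smallest object, built first.
final class Num extends Expr {
    final int value;
    Num(int value) { this.value = value; }
    int eval() { return value; }
}

// Interior node: assembled from objects returned by lower rules.
final class Add extends Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    int eval() { return left.eval() + right.eval(); }
}

final class TinyParser {
    private final String input;
    private int pos = 0;

    private TinyParser(String input) {
        this.input = input.replace(" ", ""); // no tokenizer, just drop spaces
    }

    static Expr parse(String input) {
        TinyParser p = new TinyParser(input);
        Expr e = p.expr();
        if (p.pos != p.input.length())
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        return e;
    }

    // expr := term ( "+" term )*
    private Expr expr() {
        Expr result = term();
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;
            result = new Add(result, term()); // pass the smaller objects up one level
        }
        return result;
    }

    // term := NUMBER | "(" expr ")"
    private Expr term() {
        if (pos < input.length() && input.charAt(pos) == '(') {
            pos++;
            Expr inner = expr();
            expect(')');
            return inner;
        }
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        if (start == pos)
            throw new IllegalArgumentException("number expected at " + pos);
        return new Num(Integer.parseInt(input.substring(start, pos)));
    }

    private void expect(char c) {
        if (pos >= input.length() || input.charAt(pos) != c)
            throw new IllegalArgumentException("'" + c + "' expected at " + pos);
        pos++;
    }
}
```

Calling `TinyParser.parse("1 + (2 + 3)")` builds the two inner `Num` leaves before the `Add` that contains them, which is the same order we ended up using when adding objects to the JavaCC grammar.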

Wednesday, October 10, 2007

Compiler 2000

So I've been sidetracked a bit. I've been sick - dizzy for almost a week. The ThoughtWorkers, Ben and I have a release in two days, and I'm supposed to present on web frameworks next week with an emphasis on testing. I still haven't had a chance to get my Agitar examples to post either. But, that won't stop me from learning and writing about compilers.

So, I figured out my compiler from 2000! I'm really excited about this. It only took a few minutes. I didn't figure it out in huge detail but I want to give the gist of what I did figure out. First I'll start with a quick overview of the steps:

1. Define the input language grammar
2. Run javacc on the grammar to produce a parser/generator
3. Write a file in the input language
4. Run the parser/generator on the file created in step 3 to produce Java byte code as text
5. Run jasmin to convert the human readable Java byte code to a Java class file
6. Run the class generated by jasmin on the JVM.

There are certainly some details missing here. How do you go from a grammar to a parser/generator? It's pretty common to go from the grammar to a parser, but the generator as well? Hmm. This one I might actually have trouble explaining, but I do know how it works.

The grammar I built for JavaCC has references to classes that I wrote that do the generation. JavaCC generates the parser, which creates the AST out of the classes I wrote. These classes have the generation logic built in. Once the parser builds the tree, it can just ask the root node to start generation, and all the nodes get visited in the proper order, generating code.
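To make "the nodes have the generation logic built in" concrete, here is a minimal sketch of the shape of it. These are not the classes from my 2000 compiler (I haven't posted those); `Node`, `IntLiteral`, `Plus`, `PrintStatement` and `CodeGenSketch` are invented names, and the output is only the instruction lines in Jasmin-style text, without the class and method directives a real Jasmin file needs around them.

```java
// Every AST node knows how to emit its own code.
interface Node {
    void generate(StringBuilder out);
}

final class IntLiteral implements Node {
    private final int value;

    IntLiteral(int value) { this.value = value; }

    public void generate(StringBuilder out) {
        out.append("ldc ").append(value).append('\n'); // push the constant
    }
}

final class Plus implements Node {
    private final Node left, right;

    Plus(Node left, Node right) { this.left = left; this.right = right; }

    public void generate(StringBuilder out) {
        left.generate(out);   // operands first...
        right.generate(out);
        out.append("iadd\n"); // ...then the operator, as a stack machine expects
    }
}

final class PrintStatement implements Node {
    private final Node expr;

    PrintStatement(Node expr) { this.expr = expr; }

    public void generate(StringBuilder out) {
        out.append("getstatic java/lang/System/out Ljava/io/PrintStream;\n");
        expr.generate(out);
        out.append("invokevirtual java/io/PrintStream/println(I)V\n");
    }
}

final class CodeGenSketch {
    // The parser would hand back the root; asking it to generate visits the whole tree.
    static String generate(Node root) {
        StringBuilder out = new StringBuilder();
        root.generate(out);
        return out.toString();
    }
}
```

Generating from `new PrintStatement(new Plus(new IntLiteral(1), new IntLiteral(2)))` walks the children in order, so the two pushes come out before the add. No separate visitor is needed, because each node drives its own children.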

I realize this all makes me sound naive about, well, everything, but it was literally 5 minutes of reading old code, one and a half chapters in the Dragon Book, and that's about it. I have a lot of catching up to do, but I'm making progress already. Now that I know I have a complete working model that I built, I should be able to tinker with it quite a bit and post more.

Thursday, October 04, 2007

Agitar Evaluation

I've been evaluating the JUnit test generation product by Agitar in my spare time and want to spit out a few thoughts on it. Most of this will be criticism, but I'd like to first say that I think it has the potential to be an absolutely great product. If you don't know much about Agitar, they have a product that will automatically generate JUnit tests for your code. You can check out http://www.agitar.com/ and http://www.junitfactory.com for more information.

OK onto the important content.

Many of the tests created were lacking meaning.

This is OK however, because you can provide test helper classes that are read at test creation time to help create more meaningful tests. This is pretty easy during regular development, and it makes sense. The program can't know everything about your code. Unfortunately though, for large legacy code bases this is damn near impossible.

For example, let's say you have a code base with 5000 classes. 5000 classes means 5000 test classes. Let's give Agitar the benefit of the doubt (there is little doubt, however, since I've done it many times, but let's give it to them anyway) and say that 80% of the test classes are meaningful. (Once again, the number is smaller in practice.) 80% meaningful coverage means 1000 test classes that aren't useful. This is a problem because:

It's likely to be your most important, or most complex, classes whose tests are lacking meaning.
How do you really know which ones are lacking meaning? Do you have to look through 5000 test classes?
Do you have to write test helpers for every test class?
What happens if you have no idea what your own classes are doing? (Think this is crazy? It's not. What if you inherit the code? What if you wrote the class four years ago? What if your team is large and people just hack things together?) How could you even write test helpers for this?
What if writing test helpers requires instantiating dependencies that are difficult to instantiate, which is why you haven't bothered writing tests to begin with?

You see, this is quickly becoming an overhead nightmare. This is especially true for a team who has a large untested code base. They have an untested code base because they don't know how to test. The only way to get them to start testing is with very low overhead. They don't want to have to wade through thousands of new classes.

Agitar is going to say to this, "But you have 80% test coverage at this point, which is far superior to what you had before". They might even say don't bother looking through the test code (I'm not sure if they'd really say this, but it's possible). And maybe this works. Or maybe it only works for suckers. I'm not sure.

My feeling is that the product is simply not ready for legacy code bases. And in fact, by the nature of these problems, I'm not quite sure a product could EVER solve them. When I think back to when I first heard about the product, that was indeed my first thought. But, after seeing some of the videos I was very optimistic, especially with Kent Beck involved. I approached it with an open mind and was very hopeful. I'm not sure now.

The generated tests were difficult to read

I really should provide some examples here. What I'll likely do is finish this blog, get my examples at a later date and post them in a comment. So don't trash me because I haven't given examples. I don't have them on me.

Some of the code was pretty hard to read. Some of it was nice and easy to read. That is OK. It's what I expected and it's livable on new projects. If a generated test is hard to read, it's likely that your code isn't as clean as it needs to be.

The problem with this is pretty clear though. It's likely that most of the tests generated for large legacy code bases will be difficult to read. This isn't the fault of Agitar: garbage in, garbage out. But it just adds fuel to the maintenance fire. Not only do developers have to wade through garbage code, but now they have to decipher difficult-to-read test code that they've never seen before, just to make sure it's a valid test. They might be better off just writing a test themselves, except that they might not know how.

Brief Use Case Example

I'm going to give a simple little example that demonstrates how test generation might be used, and some problems with it. Let's say a developer changes some code somewhere in a difficult-to-read class. He then runs the Agitar unit tests and some of them fail. Should he be worried? Should he look into it and fix the problem, or should he ignore it and regenerate the Agitar tests? If he does look into it, then he might have to wade through hard-to-read code. If he doesn't, then he might be ignoring a potential problem. Is it possible that he breaks something in an area that he doesn't even seem to be working in? That could be frustrating. Legacy code is frustrating to work with. Period.

There is an approach in line with Fowler's Refactoring that does help a bit though. Only read the tests in the area that you are working in. You might have to read through a couple of hard-to-read test classes ... but it's only a couple. Additionally, it'll give you some examples and incentive to clean up the code in that area.

If hand written tests are failing in any areas, fix them. If Agitar tests are failing in your current area, read them and try to figure out why. You might have a legitimate bug. After you've looked at the generated tests in your area and fixed up your code go ahead and regenerate all the tests.

Questions

Their website says something like "Reduce the drag from fragile Java code by 50% or more." What does that mean really? Does that tie into what I was saying before about the 80% meaningful tests? Does it mean you'll get your work done 50% faster? What is code drag? Yes, yes, it's just a marketing slogan, and I couldn't do much better, but what does it mean?

It also says 80% test coverage guaranteed. This is a bold statement IMO. I'm very curious to see what kind of awful code bases they've worked with, and what coverage they got. What about on something like a Hideous EJB2 project that can only be tested in the container?

What about new projects?

Once again I want to reinforce that this criticism is designed as constructive criticism in order to make the product better. I'm not trying to bash the product at all. I do think the product could work great with new TDD projects. You write some tests, write some code to make your tests pass, use AgitarOne to generate some extra tests to get some thoughts about your code, refactor, and repeat. I think it's a great supplement to the faulty human mind, which can't see everything. The problems that exist with legacy code simply don't exist with green code. Of course you have to write test helpers, but it's easy when you have a nice clean codebase and you're writing new code.

I would recommend this product to NYSE for new development, but we don't have much new development. I'd certainly try to use it for any open source project that I'm working on.

Product Ideas

There are a number of small issues I have in the area of new development too.

Why not use JUnit 4?
Why not TestNG?

There are also a number of ideas that I have for the product:

How about annotations in the code that provide useful hints to the test generator?
How about annotations to tell the test generator not to generate tests for a class or a method?
How about tying in or influencing JSR 305? This JSR is working to define annotations for Bugs, and is primarily being run by FindBugs and JetBrains. Agitar should certainly get involved. For example, @Nullable or whatever name they give a method that might return null. Little tips like this could certainly assist in test generation.
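Here is a rough sketch of what the annotation ideas above could look like. To be clear, these annotations are hypothetical: `@SkipTestGeneration` and `@MayReturnNull` exist neither in Agitar's product nor in JSR 305, and `AccountLookup` and `GeneratorHints` are made-up names. The sketch only shows that a generator could pick up such hints with ordinary reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical hint: do not generate tests for this class or method.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface SkipTestGeneration {}

// Hypothetical hint: callers (and generated tests) should expect null.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MayReturnNull {}

// Example class carrying the hints.
class AccountLookup {
    @MayReturnNull
    public String findOwner(String accountId) { return null; }

    @SkipTestGeneration
    public void dumpDebugState() {}

    public int count() { return 0; }
}

final class GeneratorHints {
    // Names of the declared methods a generator would still produce tests for.
    static List<String> methodsToTest(Class<?> type) {
        List<String> names = new ArrayList<String>();
        if (type.isAnnotationPresent(SkipTestGeneration.class)) return names;
        for (Method m : type.getDeclaredMethods()) {
            if (m.isSynthetic()) continue; // ignore compiler/tool-generated methods
            if (!m.isAnnotationPresent(SkipTestGeneration.class)) names.add(m.getName());
        }
        return names;
    }

    // True if any declared method with this name carries the null hint.
    static boolean mayReturnNull(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods())
            if (m.getName().equals(methodName) && m.isAnnotationPresent(MayReturnNull.class))
                return true;
        return false;
    }
}
```

With hints like these in the source, a generator would not need a separate test helper just to learn that `findOwner` can hand back null or that `dumpDebugState` isn't worth testing.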

Debate

I'd really like to talk to the Agitar guys some more on this subject, and Kent Beck himself.

I openly invite anyone from Agitar to comment on this blog demonstrating how I'm wrong. I want to be wrong here. I want to have 80% meaningful test coverage on bad code. That would make my life so much easier. Please explain how this doesn't turn into an overhead nightmare when working with large legacy code bases. Please tell me how this helps get developers who've unfortunately never written tests on board.

Saturday, September 29, 2007

Compiler Mania

So I know I talk a whole lot of shit about what I want to do, but I believe that I have it figured out now. All the work and reading I've done lately on the future and history of programming languages has led me to compilers. I'm going to start reading and reading and reading about compilers, and write some. I've done it before, but I'm going to kick it up a notch, BAM.

In 2000 I took a compiler course from Doug Lea and we compiled a simple language into Java Byte Code. I still have the code around for that course, and tomorrow I'm going to find it and review it like a madman. I don't think I have the language syntax definition any longer, however. It was on Doug's site, but now it's been replaced by MiniJava. I'll likely be learning a lot about MiniJava as well.

I'll be reading two books: Compilers: Principles, Techniques, and Tools (2nd Edition) and Modern Compiler Implementation in Java. The first is the classic Dragon Book. I have the first edition, but hopefully I'll get to pick up the second edition soon. The second, despite its poor reviews on Amazon and the fact that it's slightly outdated, was recommended by Doug Lea, so I'm thinking it's a safe bet.

For some reason I just feel compilers are the place to be. I know I can compile to JBC, and I know so many new languages are going to be compiling to it to run on the JVM. I want to be a part of that. I have a lot of interest in new languages. I know I don't want to be stuck in the Java world forever. I want to be a guy leading the new language revolution. I want all languages to have instant IDE support. I want to make it so you don't have to do much more to create a new language than just define your syntax and plug in to an IDE.

I can do it. Follow me on my ride as I detail as much as possible of what I learn.

Thursday, September 27, 2007

Pair Programming Problems

Today I got to pair with the ThoughtWorkers in our group. Since I'm a former ThoughtWorker, we worked really well together. You could say we were on the same page, and level. When you're on the same level, and the same page, work just flows. So that leads into what I want to write about - what happens when you have to pair with someone who's either on a different page, or a different level than you. I'm not an expert at pairing, but I've certainly noticed that these two problems occur, and I've thought through them a bit.

Let's start with someone being on a different level. This is the obvious one and the easy one to explain. Simply put, for the person who's more senior, it can be frustrating. You have two choices: slow down and teach the other person, or leave them behind and forge ahead doing work while they watch. Of course, you can only do the latter when you're at the keyboard.

The first approach is better since you bring up the level of the other person; the second approach isn't even really pairing. But, the first approach is problematic. Eventually, you get burned out of teaching. Especially if the person just seems slow, or stubborn. You could get rid of those people of course. There is a way to solve this though, switch pairs frequently. That isn't something that hasn't been said and done before of course. I'm just reiterating it because we don't switch nearly enough at NYSE.

In deadline situations, however, it might just be better to take the reins and plow ahead.

Onto the next problem, pairing with someone who's not on the same page as you. This could be in any way imaginable. Here are some examples:

They are old and lame, and you're young and cool
They like using Debuggers (lame), you like writing Unit Tests (cool)
They like checked exceptions, you like unchecked
They chew gum with their mouth open and you just want to smash them
A bunch of others

What can you do about this? It's a bit harder to deal with than the first problem. Of course switching pairs more helps, but you still have to be productive while you're pairing with this person. So, you have to try to find some middle ground on some things. For instance, if they like the debugger, and you like unit tests, write a test, and run the debugger in the context of the test. In order to find this middle ground, you're going to have to communicate well, and constantly. Pairing in this case is a bit like a relationship, or a compromise. Both sides are usually unhappy, but in general things work out.

There is another problem I've noticed with pairing and legacy code. Much of the time, you have to look through the code slowly to figure it out. Each person might want to look at different parts of the code. It's frustrating to have ideas about what the code is doing when you're not at the keyboard and can't look at that area. It seems that it's almost better to split up, review the code, then come back together to do the work. This presents other problems though. Sometimes while trying to figure out the code, you'll want to write tests, and refactor. But that is actually contributing to the work. I'm not articulating my thoughts exactly as I want to. If you split up and do work you aren't pairing. That's not good. Ugh. I guess the whole point of that is that it's difficult to pair and read legacy code.

Let's quickly review how to fix the problems:

Switch Pairs regularly
Communicate a lot
Teach people to bring them up a level
Don't worry about plowing ahead with a huge deadline looming

Tuesday, September 25, 2007

Language Explosion

I've said it before, and I'm not saying anything that people a lot smarter than me aren't already saying better, but I'll say it again anyway.

There is about to be a language explosion.

In many ways I have No Fucking Idea what I'm talking about...but something weird says that in five years time I'll have made the right moves by just attempting to talk about this today. I may be repeating myself, but each time I do, I get more ideas.

Two important questions need answering.

What can make this possible?

I've hinted on this before.

I can't back this up at all. But certainly there's JRuby, Scala, and other projects. The JVM is a great tool. I'm so lucky to have written a compiler in college that produced Java Byte Code. I will be reviewing that code soon. I think many, many languages will be compiling to JBC. It's so simple to do so. Boom, instant, multi-architecture runtime.

IDE Support

As IDEs mature, they will be able to take on new languages just with a compiler plugin. Maybe at first they will somehow only take on languages that compile to JBC, I'm not entirely sure. But I assure you, it's going to happen. Boom, instant IDE.

Library Support

Languages that utilize libraries built in other languages are in better shape than those that don't. That much can't be denied. There are a few interesting comments that I have on this.

Scala, for instance, has compile time access to Java classes (which, if you reference the last point, has IDE access as well). Someone had to write a Scala compiler to allow this to happen. Those points in and of themselves are not too interesting. But they do lead into something much greater.

Once Scala compiles down to JBC, it's accessible to Java code. Someone writing Java in their favorite IDE can drop in a Scala jar file, which is really just a Java jar file, and have full access to it. I'm very curious as to how this works. What does the JBC look like? How can you link to the source code? Do they line up? How can they?

Anyway, we still aren't at the truly interesting part. Any new languages compiling to JBC, you guessed it, Boom! Instantly accessible to all other new languages also compiling to JBC. OK, maybe not instantly, you still have to write the compiler. That is the interesting point. New languages, designed for different purposes, designed to make different aspects of development easier, all with accessibility to each other. All tied into the IDE. Oh man.

So let me summarize all that.

Choose a syntax that fits your problem.
Write a compiler that understands Java.
Compile to Java Byte Code
Write a compiler plugin for the IDE, or write your compiler to fit some special IDE compiler spec.
Plug in.

Maybe I'm nuts and that's so far out of whack, or so far off in the future, but I can see this happening in 5 years.

Is there a way to capitalize on this?

As I see it, there is a whole bunch of work that needs to be done, but it will get done.

IDEs need a way to understand new languages. This could be through a compiler plugin, an Abstract Syntax Tree plugin, or who knows, something else entirely.
People still have to write compilers.

Guess what, it's that last part that's going to be the bomb, the explosion, the Oh Mama. How?

Writing compilers is hard work. No doubt about it. There need to be generic compiler libraries that a compiler writer can use to easily create compilers that do all this stuff. I'm not just talking LEX and YACC. I'm talking easy APIs to do the following:

Give access to all Java code.
Provide the hooks to the IDE
Compile to Java Byte Code
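To make the "compile to Java Byte Code" step a bit more concrete, here is a minimal sketch of what the code-generation half of such a library might do. Everything in it is an assumption for illustration: the toy language (integer literals, addition, multiplication), the class names `Num`, `Add`, `Mul` and `ToyStackMachine`, and the textual output. It emits mnemonics modeled on the real JVM instructions `ldc`, `iadd` and `imul`, not an actual class file, and it runs them on a tiny hand-written stack machine rather than on the JVM.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy language: each AST node knows how to emit stack code for itself.
interface Expr {
    void emit(List<String> out);
}

class Num implements Expr {
    final int value;
    Num(int value) { this.value = value; }
    // Push a constant onto the operand stack.
    public void emit(List<String> out) { out.add("ldc " + value); }
}

class Add implements Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    // Emit both operands first, then the instruction that combines them.
    public void emit(List<String> out) {
        left.emit(out);
        right.emit(out);
        out.add("iadd");
    }
}

class Mul implements Expr {
    final Expr left, right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
    public void emit(List<String> out) {
        left.emit(out);
        right.emit(out);
        out.add("imul");
    }
}

// Stand-in for the JVM: interprets the three mnemonics emitted above.
class ToyStackMachine {
    static int run(List<String> code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instruction : code) {
            if (instruction.startsWith("ldc ")) {
                stack.push(Integer.parseInt(instruction.substring(4)));
            } else if (instruction.equals("iadd")) {
                stack.push(stack.pop() + stack.pop());
            } else if (instruction.equals("imul")) {
                stack.push(stack.pop() * stack.pop());
            } else {
                throw new IllegalArgumentException("unknown instruction: " + instruction);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        Expr program = new Mul(new Add(new Num(2), new Num(3)), new Num(4));
        List<String> code = new ArrayList<>();
        program.emit(code);
        System.out.println(code);      // [ldc 2, ldc 3, iadd, ldc 4, imul]
        System.out.println(run(code)); // 20
    }
}
```

A real library would also have to write the class file format, resolve references to existing Java classes, and expose hooks for the IDE, which is where most of the hard work would be.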

The compiler writing is what's going to take the most time in this language explosion. Tools to make this thing easier are going to be used like mad. Maybe I'm crazy. Maybe it's already been done. Maybe I'm just shooting my mouth off. But, I do know I'm learning. I do know I came up with all of this all on my own. I do know I'd love to have someone give me some ideas on it. I do know it's just sort of flying out of my mouth and isn't completely well written. Sorry. I'm just excited.

Thursday, September 20, 2007

Progress on Goals

So admittedly, I haven't made much progress on my goals the past few weeks. I've been concentrating on CruiseControl. Unfortunately I think I'm going to have to revise them. The problem I'm having is that I'm so random. One day I'll be reading Spring and the next I'll be reading the History of Programming Languages. So what I think I should do is just write down anything and everything that I'm interested in. Here they are in no particular order:

History of Programming Languages
Scala

Scala Eclipse Plugin

Smalltalk

Squeak

Spring
Unit Testing

TestNG vs. JUnit

Unit Test Creation

Agitar

Concurrency

Testing Concurrency

FindBugs and the idea of FindBugs annotations
Developing new programming languages

Problems developing them

Library support
IDE/Refactoring support
Automated Unit Test creation support

Web Frameworks

Seam
Struts 2
Spring whatever

JBehave

Companies

ThoughtWorks

NYSE

How Stock Exchanges work

CruiseControl

Why CC Java Sucks Eggs

Why the new Dashboard isn't useful

Build Process
Extreme Programming
Dependency Injection
Aspect Oriented Programming
Compilers

Compiling to Java Byte Code
Compiling in IDEs
Building Compilers
AST

Operating Systems
Virtual Machines
OO

Design Patterns

Refactoring
Future of OO

Java

Annotations
New JSRs
Backwards Compatibility vs Forward Mobility

People

Doug Lea
Kent Beck
Martin Fowler

Legacy Code
Teaching

Agile Enablement
Giving Presentations

Constructors Considered Harmful
OSS Projects

I really need to get on one

Technology Evaluation
IDEs

Building Refactorings

OK GOD SOMEONE STOP ME.

Legacy Software

The main problem with moving forward with legacy software is that it takes so much damn time to get classes under test. So what ends up happening is, you load the entire system just to test one tiny little bit of it, and you end up with all sorts of different testing issues, further contributing to your original legacy code problem.

Why is it so hard to get a class under test?

Idiots.

That's all. That's the only reason. I feel like this should be the end of the post, but I suppose I'll trudge on, I haven't posted in a while.

Classes have complex dependencies that interact with each other constantly, handing each other sets of complex objects, making anything difficult to stub or mock out. I'm reading Working Effectively With Legacy Code and it's helping, but these things take time. I think I'll get there, but maybe not for another year or so.
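As a rough illustration of what getting a class under test can look like once a dependency is broken, here is a minimal sketch. All of the names (`RateSource`, `InvoiceCalculator`, `FixedRateSource`) are hypothetical and not from any real codebase. It assumes the class originally constructed its collaborator internally, and shows the collaborator pulled out behind an interface and passed in through the constructor, so a hand-written stub can replace it in a test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Step 1: an interface extracted from the dependency the class used to build itself.
interface RateSource {
    double rateFor(String currency);
}

// Step 2: the class under test receives the dependency through its constructor
// instead of calling "new" on a concrete class somewhere deep inside.
class InvoiceCalculator {
    private final RateSource rates;

    InvoiceCalculator(RateSource rates) {
        this.rates = rates;
    }

    double totalIn(String currency, List<Double> amounts) {
        double sum = 0.0;
        for (double amount : amounts) {
            sum += amount;
        }
        return sum * rates.rateFor(currency);
    }
}

// Step 3: a hand-written stub that returns a fixed answer and records what was asked.
class FixedRateSource implements RateSource {
    final List<String> requested = new ArrayList<>();

    public double rateFor(String currency) {
        requested.add(currency);
        return 2.0;
    }
}

class LegacySeamDemo {
    public static void main(String[] args) {
        FixedRateSource stub = new FixedRateSource();
        InvoiceCalculator calculator = new InvoiceCalculator(stub);
        System.out.println(calculator.totalIn("EUR", Arrays.asList(1.0, 2.5))); // 7.0
        System.out.println(stub.requested);                                     // [EUR]
    }
}
```

The test no longer needs the whole system loaded, only the one class and a stub. The catch in real legacy code is that step 1 and step 2 have to be done without tests in place, which is exactly why it takes so long.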

Brown Bag Lunch: Results

I guess a week went by since I wrote last, about the presentation I was going to give on CruiseControl. It didn't seem like a week at all. I was so busy getting ready for it, putting together slides, writing, and practicing, that the week totally flew by.

Everything went over great. I'm pretty certain everyone is going to start using CruiseControl, I got positive feedback on the implementation, I definitely won over a few people who want to help, and I may have gotten it into everyone's heads that we should all be giving presentations. I'm not sure which of those things is the most important. I guess it doesn't matter.

Thursday, September 13, 2007

Brown Bag Lunch: CruiseControl

As a direct result of my own brainstorming on technology evaluation, I've decided to follow my own advice. I'm going to give a follow-up presentation to my boss's presentation on Continuous Integration. Mine is going to be directly oriented towards setting up CruiseControl, whereas his was just a high level overview. I'm going to brainstorm some ideas right here on the spot, in outline format.

Intro

Brief overview of what this presentation is about

Follow up on Mike Roberts' Continuous Integration presentation
Brief refresher on what CruiseControl is
Will go over the internals of CruiseControl

Part 1: Overview

Overview of CruiseControl capabilities

Update from CVS
Run Ant
Show test reports
Scheduled Builds

Overview of what projects are currently in CruiseControl
Internals of CruiseControl

config.xml

Part 2: To do List

Determine what artifacts need to be published
Get performance tests running in CruiseControl
Get genversion running daily in CruiseControl
Get Unit Tests working through Ant
Get Emma and FindBugs reports integrated into CruiseControl

Part 3: What is required of the team

Install CC Tray
Write tests
Commit frequently
Don't check in broken code
If you break the build, everyone will know, so fix it.

Outro

Where to get more information

Challenges

Challenge others to give 30 minute presentations

Challenge Dave Litner to make others give 30 minute presentations

Mention Technology Evaluation

Questions

Questions can be asked throughout, but try to get more questions at the end.

I might come back and edit this, but it's a damn good start.

Wednesday, September 12, 2007

Approaches to Technology Evaluation

I'm going to keep a running tab for Approaches to Technology Evaluation in small companies with small budgets, and tight schedules. This list should be full of ideas to do it and keep it affordable. This list doesn't really even have to be about software, it should apply anywhere.

Brown Bag Lunches

Pick a day each week, any day, and have everyone eat lunch together for an hour while someone does a presentation on a new technology. Rotate the presenter, so even on a team as small as five everyone has five weeks to prepare. This is plenty of time. I think this can be a very effective, inspiring method for evaluation, and it has the side effects of getting everyone involved, letting everyone learn, and building better morale.

Dedicate at least some actual work time to it.

Dedicate someone to it one day a week. Obviously this is a slightly more expensive approach. The company is actually paying for it. But, it guarantees that you get some quality time into it. The previous approach could easily break down if people don't want to spend much time on it at home. You have to judge your group.

Partition Work

If you partition your work properly, into independent modules, then you should be able to choose any module and build it using entirely new technologies, or just one new technology, keeping others around. This shouldn't cause dependency issues because the modules are separated.

This will have costs. At the least you need to do a minimal amount of evaluation, and spend some time up front learning new technologies chosen via evaluation. You may also find that the new technology is no good after you build, and have to rebuild using your older technology. If you're working with more than one new technology, you may run into a situation where you're unclear which technology is bad, and might mistakenly decide all of them are bad.

Force out old technologies

Choose a number like 5. Any technologies more than five years old are considered legacy technologies, and under no circumstances should you continue any work with them. This approach is far more expensive than the others listed, but it promises to ward off stagnation.

Force in new technologies

Slightly similar to the last approach is forcing in new technologies. This doesn't mean forcing out old ones however. It means you maintain the legacy technology, and proceed with development on the new technology. This can lead to maintenance issues, but I tend to think that developing further with the old technology leads to more issues.

Neither of the last two approaches is really an evaluation approach; they need to be used in conjunction with one. But, having either policy will certainly stimulate the evaluation.

More Ideas To Come!

Innovation vs. Masturbation

Here's a thought that I'm sure many other people have had: Technology companies that are not dedicated to technology evaluation and adoption fall behind, and get bogged down with legacy software.

Let's say a piece of your system is horrendous, and it uses 8-10 year old technology. Let's also say that you have a new project coming up. My theory is this: If you don't evaluate new technology because it would introduce another maintenance variable into your system, then the new project will fail. Why? Because the old technology IS a maintenance variable! Even if you keep a close eye on the development of it, it's still going to get out of hand. Why? Because it's not good technology. When you have something difficult to do and the technology doesn't quite support it, you do a "Workaround", thinking, "Well, we'll fix that up later". Well, THERE IS NO LATER.

Most companies have their so-called "valid" reasons not to do it yet. But I think for the most part none of them are valid. At the very least you need someone looking at new technologies. Yeah, it's expensive, but you're not going to make significant advances in productivity without it. Is there some risk involved? Absolutely. Is the risk greater if you don't do it? My bets are on yes.

This is all similar to Innovate vs. Litigate, where a large, aging company (e.g. Microsoft) sues everyone in an attempt to hang on to dying technology, instead of focusing on creating new, great technology.

I prefer to call it by a slightly different name however: Innovation vs. Masturbation. I could explain, but I think you get the idea.

Sunday, September 09, 2007

New Language Adoption

I have some thoughts on language adoption, having done some preliminary work learning Scala, and having learned a bit of Ruby and a lot of .NET, all in the past 6 months. I'm going to focus here on Scala and Ruby.

One unfortunate problem with adopting a new language is the lack of quality libraries, or the total absence of libraries, and another is the lack of IDE support, especially refactoring.

Libraries

It's probably the case that many languages suffer for a long time and never make it because they lack libraries. Ruby seems to have gotten over it somehow, despite starting from scratch. It took 10-15 years for it to catch on, however. Clearly starting from scratch was Hard. If a language can make it from scratch, you know it's special. (I should look into JRuby because I wonder if that somehow gets Ruby access to Java code. I don't think that's the case; I think it's just compiling Ruby into class files to run on the JVM.)

Scala gets over this problem by having full access to all Java libraries. However, I suspect that in doing so you have to sacrifice some potentially high quality features. For example, Scala code is allowed to call Java constructors, and as I've mentioned a few times now, there's something fundamentally flawed about construction in general. DIFs (dependency injection frameworks) exist simply to solve that deficiency in Java, but in the end, the language is still flawed. Does that mean the flaw carries over into Scala? I suspect at least somewhat. I'm going to continue researching this.

Either choice, from scratch or library inheritance, poses its issues. I'm not sure which way I'm leaning at this point.

IDE Support

Both languages suffered from lack of IDE support. Both had to take the start from scratch approach on this issue. I downloaded the Eclipse plugin for Scala, and it's pretty minimal. No refactoring support, and many other features are missing. I remember doing the same for Ruby when I was applying to ThoughtWorks.

This leads to an obvious question: Is it possible to avoid this approach? That is, is it possible to create a new language that leverages existing IDE support? I suspect that it is, but I'm certain some research needs to be done on it.

IDEs are essentially compiler based. And Eclipse in particular is completely pluggable. What if you could plug in the compiler? Is it possible to write refactorings generic enough that, given the correct compiler, the refactoring could work across any language? Possible I assume, but difficult. Many Eclipse Java refactorings, for example, simply print out Java code and then have the compiler recompile. These refactorings certainly would not work on other languages.

It might not be possible at all, but it's certainly worth looking into. Until then early adopters will have to suffer through lack of support, until someone (quite likely them) builds that support. Language adoption will continue to take some time.

Wouldn't it be nice if anyone (well, anyone that can write a compiler) could create their own language and have it automatically fully integrated into an IDE with full refactoring support and all the other features? Then all we'd need is support in the IDE to easily create new languages and compilers. Then we'd be looking at thousands of languages that we could possibly adopt. Domain specific languages would be so much easier to create. So many possibilities. I should really just stop talking.

Saturday, September 08, 2007

Next Generation IDE Technology

Certainly IDEs have automated refactorings, and they are so nice, but I'm thinking the next generation software is going to:

examine your code
determine what variables are related to what responsibilities
identify classes with more than one responsibility
identify classes in need of dependency breaking refactorings
refactor the code with minor intervention
complement the refactorings with a full suite of unit tests that ensure equivalent behavior

Additionally, as you type new code, you'll get real time warning messages such as, "It looks like you're adding unrelated behavior to this class, I recommend you do such and such..."

Following soon after will be languages that enforce this at compile time. "Compile Error: Too much responsibility in this class". Additionally, "Compile Error: No unit tests for methods X, and Y". The IDEs will then tie into unit test generation tools like Agitar, and write legitimate unit tests for your code. You might think that's funny, and it might be annoying sometimes, but I bet it makes for cleaner code.
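The "No unit tests for methods X, and Y" check doesn't need a new language to prototype. Here is a minimal sketch using plain Java reflection. The classes `Account` and `AccountTest`, the helper `TestCoverageCheck`, and the naming convention (a public method `foo` counts as covered if the test class has a method called `testFoo`) are all assumptions made up for this example; a real tool would need something far smarter than name matching.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical production class.
class Account {
    private int balance;
    public void deposit(int amount) { balance += amount; }
    public void withdraw(int amount) { balance -= amount; }
    public int balance() { return balance; }
}

// Hypothetical test class; note there is no testWithdraw.
class AccountTest {
    public void testDeposit() {}
    public void testBalance() {}
}

class TestCoverageCheck {
    // Returns the names of public methods that have no matching test<Name> method.
    static List<String> untestedMethods(Class<?> production, Class<?> tests) {
        Set<String> testNames = new HashSet<>();
        for (Method m : tests.getDeclaredMethods()) {
            testNames.add(m.getName());
        }
        List<String> missing = new ArrayList<>();
        for (Method m : production.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers()) || m.isSynthetic()) {
                continue; // only public, hand-written methods are checked
            }
            String name = m.getName();
            String expected = "test" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            if (!testNames.contains(expected)) {
                missing.add(name);
            }
        }
        Collections.sort(missing); // reflection order is unspecified, so sort for stable output
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = untestedMethods(Account.class, AccountTest.class);
        if (!missing.isEmpty()) {
            System.out.println("Compile Error: No unit tests for methods " + missing);
        }
    }
}
```

Run as is, this reports `withdraw` as the only method without a test. It only proves a test method exists, not that it tests anything, which is the part the generated tests would have to handle.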

Then, people who haven't read a million books, and don't have 10 years of schooling, can actually write code without subtracting value. They will be forced to do things correctly, and it shouldn't sacrifice creativity like a strict methodology.

Goals Revisited

For a few reasons, I think I'm going to drop everything and learn AOP, and some DIFs, including Spring and Guice.

I need to do this for the following reasons:

There is something entirely unnatural about constructing objects in Java.
I need to learn how to break dependencies better.
I need a fresh view for looking at Unit Testing legacy code.

Stay tuned for more info on all of those items.