Friday, November 16, 2007

CPU Simulator Library in Progress

I've been working on a lot of stuff lately and not writing. For that I apologize. I want to keep everyone updated on what I'm doing, but mostly I want to keep myself updated for tracking progress and whatnot.

I'm still wanting to design languages, write compilers, write IDE thingies, and all of that. If you recall, I decided that in order to do that I should become a compiler master. In reading a lot about compilers I decided I should take yet another step back to refresh my memory on Computer Architecture.

So I've been reading about that a LOT. Really low level, as low as possible. When I say as low as possible I even mean Quantum Physics. But I don't intend to become a master in Quantum Physics. It's so fascinating, but it also really almost drove me to insanity about 6 years ago. A simple refresher was all I needed.

After that I read a bunch about the history of computers, mixed in with some history of philosophy, and mathematics. All really great stuff. Once again I don't intend to become an expert in it but I want solid foundations in all this stuff, which I feel I don't quite have. I'm close.

So here's where the fun part begins. I started writing this cool little CPU simulator in Java. I don't have much yet but I'll keep the world informed. What I plan to do is write classes that represent pretty much everything you'll find in computer hardware. So far I have a MemoryModule class and things like that. You can imagine this pretty easily.

We can start with a Computer class which has CPU, Memory, IO, and a Bus. We can then break the CPU down into ControlUnit, ALU, Registers, and maybe Connections. The ControlUnit can be broken down further into SequencingLogic, Registers, Decoders, ControlMemory, etc...
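
To make that breakdown concrete, here is a rough sketch of how the top-level classes might hang together. The class names come from the description above, but every field, method, and constructor here is an assumption made for illustration, not the simulator's actual code.

```java
// Sketch only: class names follow the post; every field and method is an assumption.
class MemoryModule {
    private final int[] cells;

    MemoryModule(int size) { cells = new int[size]; }

    int read(int address) { return cells[address]; }

    void write(int address, int value) { cells[address] = value; }
}

// The bus is the only path between the CPU and memory in this sketch.
class Bus {
    private final MemoryModule memory;

    Bus(MemoryModule memory) { this.memory = memory; }

    int read(int address) { return memory.read(address); }

    void write(int address, int value) { memory.write(address, value); }
}

class IO { /* placeholder: devices would attach here */ }

class Registers {
    int programCounter;
    int accumulator;
}

class ALU {
    int add(int a, int b) { return a + b; }
}

class ControlUnit { /* placeholder for SequencingLogic, Decoders, ControlMemory */ }

class CPU {
    final ControlUnit controlUnit = new ControlUnit();
    final ALU alu = new ALU();
    final Registers registers = new Registers();
    final Bus bus;

    CPU(Bus bus) { this.bus = bus; }
}

class Computer {
    final MemoryModule memory;
    final Bus bus;
    final IO io = new IO();
    final CPU cpu;

    Computer(int memorySize) {
        memory = new MemoryModule(memorySize);
        bus = new Bus(memory);
        cpu = new CPU(bus);
    }
}
```

With something like this, `new Computer(16)` gives a machine whose CPU reaches memory only through the bus, and each placeholder class is a spot where the drilling down can continue.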

This process of drilling down can probably go down to the atomic level. I'm not sure how low I'd go, where I'd draw the line. But the point is that doing all this, actually writing the code, ensures that I understand each part. I could take some parts down further than others, and in doing so I'd learn more about that component. This is fine. I probably don't need to know things down to the atomic level on, say, video. But going down to FullAdders and HalfAdders in the ALU, or even down to LogicGates, is probably a useful thing to do.
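
As an illustration of what drilling down to gates could look like, here is a minimal sketch that builds a half adder from gates, a full adder from two half adders, and a ripple-carry adder from full adders. The names `LogicGates`, `HalfAdder`, and `FullAdder` echo the paragraph above; `AdderResult`, `RippleCarryAdder`, and all method signatures are hypothetical.

```java
// Sketch only: AdderResult, RippleCarryAdder and all signatures are assumptions.
final class LogicGates {
    static boolean and(boolean a, boolean b) { return a && b; }
    static boolean or(boolean a, boolean b) { return a || b; }
    static boolean xor(boolean a, boolean b) { return a != b; }
}

// One output bit plus the carry out of that bit position.
final class AdderResult {
    final boolean sum;
    final boolean carry;

    AdderResult(boolean sum, boolean carry) {
        this.sum = sum;
        this.carry = carry;
    }
}

final class HalfAdder {
    // sum = a XOR b, carry = a AND b
    static AdderResult add(boolean a, boolean b) {
        return new AdderResult(LogicGates.xor(a, b), LogicGates.and(a, b));
    }
}

final class FullAdder {
    // Two half adders chained together, with their carries ORed.
    static AdderResult add(boolean a, boolean b, boolean carryIn) {
        AdderResult first = HalfAdder.add(a, b);
        AdderResult second = HalfAdder.add(first.sum, carryIn);
        return new AdderResult(second.sum, LogicGates.or(first.carry, second.carry));
    }
}

final class RippleCarryAdder {
    // Adds the low `width` bits of a and b; the final carry out is dropped.
    static int add(int a, int b, int width) {
        int result = 0;
        boolean carry = false;
        for (int i = 0; i < width; i++) {
            boolean bitA = ((a >> i) & 1) == 1;
            boolean bitB = ((b >> i) & 1) == 1;
            AdderResult r = FullAdder.add(bitA, bitB, carry);
            if (r.sum) {
                result |= (1 << i);
            }
            carry = r.carry;
        }
        return result;
    }
}
```

For example, `RippleCarryAdder.add(5, 3, 4)` walks four full adders and yields 8, while `RippleCarryAdder.add(15, 1, 4)` wraps around to 0 because the carry out of the top bit is discarded.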

Maybe I can invent my own instruction set for my "Hardware", or reuse an existing one. Either way, once all of this is done it would be nice to write a little OperatingSystem class that makes use of that instruction set. I would learn a lot in doing so I imagine. Then once my OperatingSystem is in a reasonable state (I don't have any idea what that would look like yet), I could attempt to write a little compiler that compiles some random language down to an Assembly Language that runs on my hardware.
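
To give a feel for what an invented instruction set might involve, here is a tiny sketch of a fetch-decode-execute loop. Everything in it is made up for illustration: the `TinyMachine` class, the four opcodes, and the two-cell instruction format (opcode followed by an operand address) are assumptions, not a design I've settled on.

```java
// Sketch only: a hypothetical four-instruction accumulator machine.
final class TinyMachine {
    // Hypothetical opcodes. Each instruction occupies two cells: opcode, operand address.
    static final int HALT = 0;
    static final int LOAD = 1;   // acc = memory[operand]
    static final int ADD = 2;    // acc += memory[operand]
    static final int STORE = 3;  // memory[operand] = acc

    final int[] memory = new int[32];
    int pc;   // program counter
    int acc;  // accumulator

    void run() {
        while (true) {
            // Fetch the instruction and advance the program counter.
            int opcode = memory[pc];
            int operand = memory[pc + 1];
            pc += 2;

            // Decode and execute.
            switch (opcode) {
                case LOAD:
                    acc = memory[operand];
                    break;
                case ADD:
                    acc += memory[operand];
                    break;
                case STORE:
                    memory[operand] = acc;
                    break;
                case HALT:
                    return;
                default:
                    throw new IllegalStateException("Unknown opcode " + opcode);
            }
        }
    }
}

final class TinyMachineDemo {
    public static void main(String[] args) {
        TinyMachine machine = new TinyMachine();
        int[] program = {
            TinyMachine.LOAD, 20,
            TinyMachine.ADD, 21,
            TinyMachine.STORE, 22,
            TinyMachine.HALT, 0
        };
        System.arraycopy(program, 0, machine.memory, 0, program.length);
        machine.memory[20] = 2;
        machine.memory[21] = 3;
        machine.run();
        System.out.println(machine.memory[22]);
    }
}
```

The program loads the value at address 20, adds the value at address 21, and stores the result at address 22. A compiler targeting this "hardware" would emit exactly this kind of opcode and operand sequence.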

This is a long way off but I think it's a damn good goal. Yes, it's reproducing things that people have done 50 years ago in a way that most certainly won't be reusable, but whatever, it will get me towards my goal of language writing.


  1. > Yes, it's reproducing things that people have done 50 years ago in a way that most certainly won't be reusable, but whatever, it will get me towards my goal of language writing.


    Most language designers haven't done all of this. Acquire the knowledge, sure, but I'm not seeing the need to build this giant simulator.

    If you feel you need to get really close to the hardware, why not use something like OSKit and run your stuff in a VM?

  2. On the OSKit homepage it says:

    "For language researchers and enthusiasts, the OSKit lets them concentrate on the real issues raised by using advanced languages inside operating systems, such as Java, Lisp, Scheme, or ML--- instead of spending six months or years groveling inside ugly code and hardware. With the recent addition of extensive multithreading and sophisticated scheduling support, the OSKit also provides a modular platform for embedded applications, as well as a novel component-based approach to constructing entire operating systems."

    This sounds really appealing to me, and I intend to take a longer look. Maybe it's something I would use after I do some more work on my own stuff.

    I feel the need to do it on my own first because I feel that writing anything entirely from scratch on your own is the best way to learn, as long as it's complemented with a lot of reading.

    Also, you ask "How?" Well, all this work on hardware has been done. I'm simply reproducing it in software, likely in a way no one would ever use it. And, I'm sure most language designers haven't done this. I'm just weird.

    I do suppose if you can show me a way that I can completely learn what happens at a low level in hardware by using OSKit, and at the same time get more knowledge of Language Design than I would doing it on my own, and at the same time contribute something useful to the world, then sure, I'll do it. I'll read the page more. As for now though, I do plan to continue on.

    If you know of any really good Computer Architecture books please let me know.

  3. The "how" question was meant to apply to the quote - how will reproducing things that people have spent the last 50 years doing get you towards your goal of language writing?

    At this point, I think it would be hard for you to know what you would want to add to your new language - in other words, where existing languages fail. There's a whole bunch of research/study just to get to where existing languages are, and then to identify new things that are simply beyond them. In the mainstream of non-mainstream languages, Haskell, Erlang, and Common Lisp are all good examples - languages. Outside the mainstream there are hundreds of little languages that do something well - take a look at Manuel Serrano's Hop language for web programming. His demos are unreal and the source code will make anyone used to Struts fall over in their chair.

    Even for C-family languages like Java, the ability to take an AST and transform it to an efficient sequence of cpu/os instructions is a black art worthy of a lifetime's specialization. It's certainly not necessary for you to fully grok that to do language design. It may not be necessary to even design a language per se, as Hop demonstrates - Liskell is another example of this.

  4. To answer your question, 'The "how" question was meant to apply to the quote - how will reproducing things that people have spent the last 50 years doing get you towards your goal of language writing?'

    Maybe it isn't immediately clear how this helps. The short answer is that I need better fundamentals.

    Now for the long answer.

    Let me fill you in on a little background. First, I come from a school that starts you off with Java from Day 1. Second (and unfortunately), I had sub-par teachers for my Programming Languages and Computer Architecture courses. Because of this, I lack in some computer science fundamentals. Not in some others of course. I might be good in Agile, Testing, Java, some other crap, but underneath there is a big hole.

    I believe that most language designers, and most good computer scientists/software developers, need to have strong fundamentals, and so I plan to fix that with this work. Additionally, in doing this, I should be able to get a good feel for the obstacles the original language writers encountered.

    After that I can turn my focus to the history of languages, the evolution of languages up to our time, compilers and such. With strong fundamentals and a good concept of the history of language evolution, I can then begin studying more modern languages to decide what features I'd want in my own.

    This isn't a huge multi-year project or anything. But I think it's well worth dedicating a few months to.

    I found a few people who also seem to think this is very useful, and are doing exactly the same thing. Check these links out. I'm really excited.

    I figure I can get that book, and basically take the course from home. Hopefully I'll even be able to teach it someday.

    Anyway I could go on forever, but I'm gonna stop.