Build Your Own CPU? That’s the Easy Part!

You want to build your own CPU? That’s great fun, but you might find it isn’t as hard as you think. I’ve done several CPUs over the years, and there’s no shortage of other custom CPUs out there ranging from pretty serious attempts to computers made out of discrete chips to computers made with relays. Not to trivialize the attempt, but the real problem isn’t the CPU. It is the infrastructure.

What Kind of Infrastructure?

I suppose the holy grail would be to bootstrap your custom CPU into a full-blown Linux system. That’s a big enough job that I haven’t done it. Although you might be more productive than I am, you probably need a certain amount of sleep, and so you may want to consider if you can really get it all done in a reasonable time. Many custom CPUs, for example, don’t run interactive operating systems (or any operating system, for that matter). In extreme cases, custom CPUs don’t have any infrastructure and you program them in straight machine code.

Machine code is error-prone, so you really need an assembler. If you are working on a big machine, you might even want a linker. Assembly language coding gets tedious after a while, so maybe you want a C compiler (or some other language). A debugger? What about an operating system?

Each one of those things is a pretty serious project all by itself (on top of the project of making a fairly capable CPU). Unless you have a lot of free time on your hands or a big team, you are going to have to consider how to hack some shortcuts.

Getting Infrastructure?

The easiest way to get infrastructure is to steal it. But that means your CPU has to be compatible with some other available CPU (like OpenSPARC or OpenRISC) and what fun is that? Still, the Internet is full of clone CPUs that work this way. What good is a clone CPU? Presumably, the designer wants to use that particular processor, but wants to integrate it with other items to produce a system on a chip. Of course, sometimes, people just want to emulate an old machine, and that can be fun too.

In general, though, the appeal of developing your own CPU is to make it your own. Maybe you want to experiment with strange instruction set architectures. Perhaps you have an idea about how to minimize processor stalls. Or you could be like me and just want a computer that models the way you think better than any commercial alternative. If so, what do you do? You could try porting infrastructure. This is about midway between stealing and building from scratch.

Portable Options

There are quite a few options for portable assemblers. Assuming your processor doesn’t look too strange and you don’t mind the usual assembler conventions for labels and symbols, you might consider TDASM or TASM. I have my own variation on this, AXASM, and I’ll talk about it more in the near future.
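To make the idea concrete, here is a minimal sketch of how a table-driven, two-pass assembler can work. It is not how TDASM, TASM, or AXASM are implemented; the instruction table, opcodes, and syntax below are invented for an imaginary 8-bit CPU, and retargeting amounts to swapping out the table.

```python
# Hypothetical instruction table for an invented 8-bit CPU:
# mnemonic -> (opcode byte, number of operand bytes). All values are made up.
ISA = {
    "NOP": (0x00, 0),
    "LDI": (0x10, 1),  # load immediate
    "ADD": (0x20, 1),  # add the byte at an address
    "JMP": (0x30, 1),  # jump to an address
    "HLT": (0xFF, 0),
}


def parse(source):
    """Yield (label, mnemonic, operand) for every non-empty source line."""
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comments
        if not line:
            continue
        label = None
        if ":" in line:
            label, line = (part.strip() for part in line.split(":", 1))
        fields = line.split()
        mnemonic = fields[0].upper() if fields else None
        operand = fields[1] if len(fields) > 1 else None
        yield label, mnemonic, operand


def assemble(source):
    lines = list(parse(source))

    # Pass 1: walk the program once to learn the address of every label.
    symbols, address = {}, 0
    for label, mnemonic, _ in lines:
        if label:
            symbols[label] = address
        if mnemonic:
            address += 1 + ISA[mnemonic][1]

    # Pass 2: emit bytes, resolving labels (including forward references).
    out = bytearray()
    for _, mnemonic, operand in lines:
        if not mnemonic:
            continue
        opcode, operand_bytes = ISA[mnemonic]
        out.append(opcode)
        if operand_bytes:
            value = symbols[operand] if operand in symbols else int(operand, 0)
            out.append(value & 0xFF)
    return bytes(out)


PROGRAM = """
start:  LDI 5        ; load 5
        ADD 0x40
loop:   JMP loop     ; spin forever
        HLT
"""

print(assemble(PROGRAM).hex(" "))
```

The two passes are what let `JMP loop` refer to a label before or after its definition. A real portable assembler adds expressions, directives, and error reporting on top of the same basic structure.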

Assembly language is fine, but you really want a high level language. Of course, your first thought will be to port gcc, which is a great C and C++ compiler (among other things). There’s good news, bad news, and worse news. The good news is that gcc is made to be portable as long as your architecture fits some predefined notions (for example, at least 32-bit integers and a flat address space). The bad news is that it is fairly difficult to do a port. The worse news is that there is only a limited amount of documentation and a lot of it is very out of date.

Still, it is possible. There are only three things you have to create to produce a cross compiler:

  • A machine description
  • A machine header
  • Some machine-specific functions

However, building these is fairly complex and uses a Lisp-like notation that isn’t always intuitive. If you want to tackle it, there are several documents of interest. There’s a very good slide show overview, very out of date official documentation, and some guy’s master’s thesis. However, be prepared to read a lot of source code and experiment, too. Then you’ll probably also want to port gdb, which is also non-trivial (see the video below).

There are other C compilers. The LLVM project has clang, which you might find slightly easier to port, although it is still not what I would consider trivial. The lcc compiler started out as a book in 1995. Its code generator is built with lburg, a variant of iburg, and tools of that kind might be useful with some other retargeting projects as well. Although the vbcc compiler isn’t frequently updated, the documentation of its backend looks very good and it appears to be one of the easier compilers to port. There is a portable C compiler, PCC, that is quite venerable. I’ve seen people port some of the “small C” variants to a different CPU, although since they aren’t standard C, that is only of limited use.

Keep in mind, there’s more to doing a gcc port than just the C compiler. You’ll need to define your ABI (Application Binary Interface; basically how memory is organized and arguments passed). You’ll also need to provide at least some bootstrap C library, although you may be able to repurpose a lot of the standard library after you get the compiler working.

So maybe the C compiler is a bit much. There are other ways to get a high level language going. Producing a workable JVM (or other virtual machine) would allow you to cross compile Java and is probably less work overall. Still not easy, though, and the performance of your JVM will probably not be even close to a compiled program. I have found that versions of Forth are easy to get going. Jones on Forth is a good place to start if you can find a backup copy of it.
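To give a feel for why Forth is so quick to bring up, here is a minimal sketch of a Forth-style interpreter: a data stack, a handful of primitives, and colon definitions built from existing words. It is written in Python purely for illustration and is not taken from Jones on Forth; on a real CPU the primitives would be a few machine instructions each, and the word names supported here are only a tiny, assumed subset.

```python
def make_forth():
    """Return a run(text) function for a tiny Forth-like language (a sketch)."""
    stack, words, output = [], {}, []

    def binary(op):
        # Forth convention: "a b -" computes a - b, with b on top of the stack.
        def word():
            b, a = stack.pop(), stack.pop()
            stack.append(op(a, b))
        return word

    primitives = {
        "+": binary(lambda a, b: a + b),
        "-": binary(lambda a, b: a - b),
        "*": binary(lambda a, b: a * b),
        "dup": lambda: stack.append(stack[-1]),
        "drop": lambda: stack.pop(),
        ".": lambda: output.append(stack.pop()),  # "print" into a list
    }

    def execute(tokens):
        it = iter(tokens)
        for tok in it:
            if tok == ":":
                # Colon definition:  : name body... ;
                name = next(it)
                body = []
                for t in it:
                    if t == ";":
                        break
                    body.append(t)
                words[name] = body
            elif tok in words:
                execute(words[tok])      # user-defined word
            elif tok in primitives:
                primitives[tok]()        # built-in word
            else:
                stack.append(int(tok))   # anything else is a number

    def run(text):
        execute(text.lower().split())
        return stack, output

    return run


run = make_forth()
run(": square dup * ;")
stack, printed = run("7 square 1 - .")
print(printed)
```

Once the primitives exist, everything else (here, `square`) is defined in the language itself, which is the bootstrapping property that makes Forth attractive on a brand-new CPU.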

If you do bite the bullet and build a C compiler, the operating system is the next hurdle. Most Linux builds assume you have advanced features like a memory management unit (MMU). There is a version, uClinux, that runs without one and might be slightly easier to port. You might be better off looking at something like Contiki or FreeRTOS.

A Shot of Realism

Building a new CPU isn’t for the fainthearted and probably not your best bet for a first FPGA project. Sometimes, just getting a decent front panel can be a challenge (see video below), never mind compilers and operating systems.

Bootstrapping a new system to a full Linux-running monster would be a lot of work for one hacker. It might be more appropriate for a team of hackers. Maybe you can host your project on

Still, just because you can’t whip up the next 128-bit superscalar CPU on a weekend doesn’t mean you shouldn’t try your hand at building a CPU. You’ll learn a lot and, who knows, you might even invent something new.

37 thoughts on “Build Your Own CPU? That’s the Easy Part!”

  1. Good write-up. I like the link to OpenCores as well; there are lots of cool things in there.

    I never made my own processor but know people who have.

    It used to be that most versions of Linux needed a Memory Management Unit, which is easier than a CPU, unless you’re troubleshooting corrupted memory. >:)

    1. Yeah as far as I know uClinux is the only real choice (other than totally roll your own) that can be happy without an MMU. As long as you have one processor, the MMU isn’t bad. Gets ugly as you add cores though (depending on your memory bus architecture).

      1. I find getting the MMU working right to consistently be the biggest pain on any SoC. The mainline Linux kernel can be configured to work without one though and there’s support in devicetree for some ARM chips with no MMU. Seen several pop up on the mailing list just recently.

        If not intending to run Linux then I would start by trimming down an existing simple ISA like MIPS 2k and port a simple compiler which already has support for it like lcc. If the target is bare metal then you could get by without binutils and libc which also saves a lot of effort. Should be doable in a few evenings.

  2. A recommendation: try Forth. A Forth system is small enough that one can program one without using an assembler. The easiest way is to do the absolute minimum in machine code and then use the Forth system itself to do the rest.

    1. Absolutely. I put a link to a backup copy of Jones on Forth since I couldn’t find the original. That is a really good example of bootstrapping. I also did Forth (as a cross compiler) for my OneDer CPU. A great language, especially for this kind of thing.

    2. I have trouble taking Forth seriously because I first heard of it via stumbling on Forth on the Atari (Learning by Using).

      Like, I know it’s a legit language, but whenever I hear its name I think of Captain McBoner, Mistress Underboob, and the giant keyboard of doom.

  3. I remember using Small-C under CP/M eons ago. It’s a subset of C, but sufficient to get work done. 16-bit integers and 8-bit shorts, I think. Couldn’t have been any bigger. May be a good starting point – if nothing else, write an 8080-to-your-machine converter and then run the results through an assembler.

  4. I remember working with the then ‘perfect’ chip set to roll your own processor – the AMD2900 series – I did several hard disk controllers with them, using the 2901 and 2909 and then the 2903 and 2910 chips – a bank of [then fast] 50-100nsec RAM as a ROM emulator and micro-coding the instructions for the controllers and writing a compiler in a high level language [PL1].

      1. It was a subset of PL/1 first implemented in the mid 70’s on an 8080, then a Z80 system [Q1 Corp], with 1 floppy drive, 8K RAM, and 6K ROM [Intel bragged in those days that they needed only 64K RAM and 2 floppy drives to do the same thing].

  5. Hi Al,

    Thanks for the article. I wanted to build a cpu so that’s what my first fpga project was. You’re right; it may not have been easy, but it certainly wasn’t the only part. Two seconds after I was finished (and convinced it worked) I realized it wasn’t a computer system. Nine and a half years later, here’s where things stand.
    – Wrote a micro code assembler
    – “Wrote” a forth like instruction set in microcode
    – Wrote an assembler
    – Built serial ports, timer, interrupt controller, static ram interface, spi ports…
    And then the madness started.
    – Hand wrote a Pascal compiler (using Jack Crenshaw’s compiler series from the 1980’s as a start). He never finished that series but I got enough from him to finish it. Procedures, functions, parameters, records, pointers…
    – Wrote a time-based functional simulator, with debugging support
    – Ported a substantial version of early Tanenbaum’s Minix – Got I/O and time based multitasking working and can read (not yet write) early Linux file systems. I’m at about 10K lines and counting.

    Have been working on the file system code for a while. Even though it’s “only” a port, it’s proved quite hard.

    P.S. I have one of your CPLD boards.

  6. Yeow! Al Williams I remember you from the BASIC Stamp days.

    Yes, lots of good advice back there regarding a DIY project of building your own CPU. It happens that I’ve communicated with both the chap behind the BMOW project, and the Home Brew CPU project. (That’s the guy who rolled the lcc compiler into one for his.)

    And incidentally Small-C is available for both I8080 and I8086.

  7. Yes the core of a CPU can be described as simply as (¬p ∧r) ∨ (q⊕r)
    This is rule 110, the simplest known Universal Turing Machine and that core can be implemented with a handful of NAND gates.

      1. Handful as in number of fingers, 5 gates total, GATES, so just 2 7400 modules to make a CPU core. One NAND is used as an inverter, so the part count at the transistor level is even lower than in the 2 modules.
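For anyone who wants to check the expression quoted above, here is a short sketch that evaluates (¬p ∧ r) ∨ (q ⊕ r) over all eight neighborhoods and confirms it encodes rule number 110. It reads p, q, r as the left, center, and right cells, and the `step` helper assumes a wrap-around row of cells; both are assumptions made for this example, not something stated in the thread. It does not model the NAND-gate circuit itself.

```python
def rule110(p, q, r):
    """Next state of the center cell: (not p and r) or (q xor r), on 0/1 values."""
    return ((1 - p) & r) | (q ^ r)


def rule_number(rule):
    """Pack a 3-input rule into its Wolfram number (bit index = p*4 + q*2 + r)."""
    number = 0
    for p in (0, 1):
        for q in (0, 1):
            for r in (0, 1):
                number |= rule(p, q, r) << (p * 4 + q * 2 + r)
    return number


def step(cells):
    """Advance one generation, treating the row as a ring (an assumption)."""
    n = len(cells)
    return [rule110(cells[i - 1], cells[i], cells[(i + 1) % n]) for i in range(n)]


print(rule_number(rule110))

# Print a few generations starting from a single live cell on the right.
row = [0] * 15 + [1]
for _ in range(8):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```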

    1. I’ve heard of “subtract, branch if negative” as the core of a 1-instruction Turing-compatible machine. Not quite sure what you’ve written, never got into advanced maths.

      I did do a bit of experimenting with CA though. I’d love to know how you’d program a Rule 110 machine.
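As a rough sketch of the “subtract, branch if negative” idea, a one-instruction machine fits in a few lines. The encoding here is an assumption made for illustration: each instruction is three memory cells (a, b, c), meaning “mem[b] -= mem[a]; if the result is negative, jump to c, otherwise fall through”, and a negative program counter halts. The memory layout and sample program are invented.

```python
def run_sbn(mem, max_steps=10_000):
    """Run a 'subtract and branch if negative' program in place and return memory.

    Assumed encoding: instruction = three cells (a, b, c).
    mem[b] -= mem[a]; if mem[b] < 0, jump to c, else continue at pc + 3.
    A negative pc halts the machine.
    """
    pc = 0
    for _ in range(max_steps):
        if pc < 0:
            return mem
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] < 0 else pc + 3
    raise RuntimeError("step limit reached")


# Sample program: compute Y = X + Y using only subtraction.
# Data cells: ONE at 9, T at 10, X at 11, Y at 12, Z at 13.
program = [
    11, 13, 3,    # Z -= X   (Z becomes -X)
    13, 12, 6,    # Y -= Z   (Y becomes Y + X)
    9, 10, -1,    # T -= ONE (result is negative, so jump to -1 and halt)
    1,            # ONE
    0,            # T
    7,            # X
    5,            # Y
    0,            # Z
]

result = run_sbn(list(program))
print(result[12])
```

Addition, moves, and unconditional jumps all have to be composed from this single operation, which is what makes the machine interesting and also what makes it slow.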

      1. Cyclic Tag Systems are the next layer. You just have more layers of abstraction to get to something more familiar. That rule 110 core can be applied in a massively parallel manner too but each slice needs more than one core. If you had a million of those tiny slices that is how many cells get evaluated every 3 clock cycles. Or you can have just one core and feed a shift register through it. This may be better suited to a completely photonic processor that uses an optical delay line; then what you lost in all those layers of abstraction being processed serially you make up for by operating in the THz speed range.

  8. Linux is a ridiculously high and unnecessary bar to aim for to demonstrate a useful CPU. Much more reasonable is to do what actual early and useful computers did before computers became so complex:
    * Tiny Basic, runs well and surprisingly useful in under 4K RAM
    * Tokenized Basic, add strings and floating point functions to taste; can be built up a function at a time and hand coded
    * Forth, as mentioned already, very easy to build a complex system incrementally
    * CP/M. 8080 compatibility and 64K RAM can run an office with a huge library of free legacy software.

        1. It is functional, what it isn’t is fast. The principles are sound, even the simplest universal Turing machine can emulate any other more complex system that is why it is considered universal.

          As I pointed out elsewhere on this page, you may only be able to implement a tiny CPU in some technologies, but if it can run a thousand times faster than silicon there is a valid case for using it.

    1. It all depends on what your goal is. When I built my CPU, and all that followed, it was just to see if I could. I agree the Linux bar is high, but I also think the bar was quite high in the 1970’s and early 80’s when people were bootstrapping the systems you mentioned. Almost all of that code was written in assembly. No debuggers; no simulators…

  9. Take a gander at Niklaus Wirth’s Project Oberon. In the book he documents the creation of the CPU(32 bit), Oberon OS and Compiler.

    It’s all well documented and the sources for everything are there. It’s currently running on Digilent and Saanlima Pipistrello boards.

    If you want to see how it’s all put together, check it out.

  10. You could also prototype your processors using ArchC (or similar tools).
    “…goal in designing ArchC is to provide architecture designers with a tool that allows them to rapidly evaluate new ideas in areas such as: processor and ISA design, memory hierarchy, and other aspects of computer architecture research.”

    . describe the instruction set in a DSL (domain specific language), something like C with some hardware abstractions. Eventually this could be used to synthesize Verilog/VHDL RTL code.

    . the tool generates part of toolchain: simulators, binutils-based assemblers and linkers, C and C++ Clang-based frontend and LLVM Backends for new models.

    . there are some ISA samples on their site: ARM, PowerPC, MIPS, SparcV8
