Forth: The Hacker’s Language

Let’s start right off with a controversial claim: Forth is the hacker’s programming language. Coding in Forth is a little bit like writing assembly language, interactively, for a strange CPU architecture that doesn’t exist. Forth is a virtual machine, an interpreted command-line, and a compiler all in one. And all of this is simple enough that it’s easily capable of running in a few kilobytes of memory. When your Forth code is right, it reads just like a natural-language sentence but getting there involves a bit of puzzle solving.

From Thinking FORTH (PDF)

Forth is what you’d get if Python slept with Assembly Language: interactive, expressive, and without syntactical baggage, but still very close to the metal. Is it a high-level language or a low-level language? Yes! Or rather, it’s the shortest path from one to the other. You can, and must, peek and poke directly into memory in Forth, but you can also build up a body of higher-level code fast enough that you won’t mind. In my opinion, this combination of live coding and proximity to the hardware makes Forth great for exploring new microcontrollers or working them into your projects. It’s a fun language to write a hardware abstraction layer in.

But Forth is also like a high-wire act; if C gives you enough rope to hang yourself, Forth is a flamethrower crawling with cobras. There is no type checking, no scope, and no separation of data and code. You can do horrible things like redefine 2 as a function that will return seven, and forever after your math won’t work. (But why would you?) You can easily jump off into bad sections of memory and crash the system. You will develop a good mental model of what data is on the stack at any given time, or you will suffer. If you want a compiler to worry about code safety for you, go see Rust, Ada, or Java. You will not find it here. Forth is about simplicity and flexibility.
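To see how little stands in your way, here is a minimal sketch of the "redefine 2" stunt. It assumes a desktop Forth such as Gforth that provides the optional marker word; the name forget-bad-2 is made up for this example. It works because the interpreter looks a token up in the dictionary before it tries to parse it as a number.

```forth
marker forget-bad-2      \ remember the dictionary state so this can be undone
: 2 ( -- n ) 7 ;         \ from here on, the token 2 finds this word first
2 2 + .                  \ prints 14
forget-bad-2             \ discard the definition; 2 is a number again
2 2 + .                  \ prints 4
```

Without the marker, the only cure is to restart the Forth system.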

Being simple and flexible also means being extensible. Almost nothing is included with most Forth systems by default. If you like object-oriented style programming, for instance, Gforth comes with no fewer than three different object frameworks, and you get to choose whichever suits your problem or your style best. You can modify the Forth compiler or interpreter itself, so if you want type checking, you can add it. Some Forth implementations are written with just twenty or thirty functions in native assembly or C, and the rest is bootstrapped in Forth. Faster Forths are implemented entirely in assembly language, with some compile-time optimizations that make it run about as fast as anything else, even though it’s compiled as you type, on the target microcontroller itself.

Forth is probably not the language you want to learn if you are designing an enterprise banking backend. On the other hand, a string of blinky LEDs running a physics simulation isn’t an “enterprise” anything. Forth is a hacker’s language, in both the laudatory and the pejorative senses. I’m not sure that it’s going to help you get “real work” done at all, and you engineer types might want to walk away now. But if you want to tweak at the language itself, or use it to push your hardware around, or just play, Forth is fantastic. The hacker in me thinks that’s a lot of fun, and it’s a great match for smaller microcontroller projects.

Forth Crash Course: Theory Section

Forth is the simplest language after assembly language. Forth is procedural in the extreme — a Forth program is really just a chain of subroutines, called “words” in the Forth jargon. There’s no syntax, and all words are separated by a space and are parsed left to right. With a few exceptions for compiling, all words run right now so the Forth interpreter doesn’t have to look ahead to the next word. run this code! is valid Forth if you’ve already defined the words run, this, and code! and it calls each of the three words in the order that you’d expect.
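As a quick sketch of that last claim, the three words below are placeholder definitions (each just prints its own name) so that the phrase has something to execute. The colon syntax used to define them is explained further down.

```forth
\ Hypothetical definitions; any three words would do.
: run    ( -- ) ." run " ;
: this   ( -- ) ." this " ;
: code!  ( -- ) ." code! " ;

run this code!   \ executes the three words left to right, printing their names in order
```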

Simplicity and Reverse Polish Notation

The corollary of this simple setup is that Forth uses the so-called Reverse Polish Notation (RPN). If the Forth interpreter is going to execute the + word right now, it has to have the two numbers that are going to get added already on the stack. Otherwise, it would have to wait for the next number to come along before it could finish the addition. Read that again, and let it sink in. It’s different from what you’re used to, and it’s important.

Any other ordering or syntax is unnecessarily complicated. The way that you think of as “natural” to write down math is crazy, in the sense that the look-ahead nature of the + operator requires either parentheses or an “order of operations” to be unambiguous. Look at 2 + 3 * 4. There is nothing natural about getting 14 here at all; it’s the result of a convoluted syntax with implicit rules. You have to read the whole “sentence” first, find the *, remember that it has priority over +, evaluate it first, and then go back to the addition to finish up, even though + is the second word in the sentence. A computer doesn’t want to know about “order of operations”, it just wants to add two numbers, preferably ones that are already sitting in ALU registers. Don’t believe me? Read the machine code.

Forth, and RPN, write this as 2 3 4 * + or 3 4 * 2 +. Either way, the operator works on whatever numbers are already available. If you don’t think of this as being “reverse” or “polish” or even a “notation”, you’ll be on the right track. You’re simply writing things down in the order in which they should be executed. (How crazy is that!?!)
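Both orderings can be typed straight into an interactive Forth. The word . (dot) pops the top of the stack and prints it; the comments trace what is on the stack at each step.

```forth
2 3 4 * + .   \ 3 4 * leaves 12 above the 2, then + adds them: prints 14
3 4 * 2 + .   \ 12 first, then 2, then +: prints 14
```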

I like to think of RPN as the computing equivalent of mise en place; before you start cooking, you get all your ingredients lined up. This is the way Forth code works: get.broccoli chop.broccoli get.beef slice.beef get.oyster-sauce stir.fry. Some elements are naturally interchangeable (you could get and slice the beef before the broccoli), but the overall order is important to the procedure, and you really don’t want to be going back to slice the broccoli while the beef is in the wok. But this is exactly what you’re doing when you insist on writing 3 + 4 instead of 3 4 +.

Compiling and Running

New words are defined and compiled with a : to enter compilation mode and a ; to exit. Compilation, such as it is, takes place immediately. Under the hood, the Forth interpreter is looking up each word that you use in the definition and simply stringing them together. One exception to this rule is the function name itself, as you’ll see now.

Starting FORTH

Compile your first Forth word: : seven 3 4 + ;. It’s not very useful, but it creates a word called seven that will put a 3 and a 4 on the stack and then run the addition word. The “result” is that whatever you had on the stack before, you’ll have a 7 on top of it now. An optimizing Forth compiler will just push a 7 onto the stack.
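Here is that definition in a short session, as a sketch. The parenthesized stack-effect comment and the starting value 10 are additions for illustration; .s shows the stack without changing it, and its exact output format differs between Forth systems.

```forth
: seven ( -- 7 ) 3 4 + ;
10 seven   \ the 10 is untouched; a 7 now sits on top of it
.s         \ display the stack non-destructively
+ .        \ add the two items and print the result: 17
```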

All programming is about breaking complicated tasks down into reasonable-sized chunks. What constitutes “reasonable” depends a bit on the language, a bit on the programmer’s own style, and a bit on the cultural zeitgeist. Executing a word in Forth requires a lot less overhead than a function call in C, and following long chains of logic can get convoluted even for experienced Forthers, so the definitions of Forth words tend to be extremely short, typically one-liners including comments. You’ll be compiling quite often.

The Stack

The heart and soul of Forth is the data stack, henceforth “the stack”. Forth is a stack-based language, and until you’ve coded in Forth for a while, you can’t appreciate what this really means and how thoughts about the stack come to dominate your coding life. Forth words don’t take arguments or return values; instead, they operate on whatever data is on the stack when they’re called.
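Since nothing in the language declares what a word consumes or leaves behind, Forth programmers conventionally document it in a stack-effect comment. A small sketch, using two made-up words, square and sum-of-squares:

```forth
: square ( n -- n*n ) dup * ;
: sum-of-squares ( a b -- a*a+b*b ) square swap square + ;

3 4 sum-of-squares .   \ 3 16, then 16 3, then 16 9, then 25: prints 25
```

The comments in parentheses are ignored by the compiler; they are there only for the human reader.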

Starting FORTH

The stack is the best and worst part of Forth. When your stack contents line up right with the words that operate on them, the result is code of a beauty and efficiency that can’t be beat. When they misalign, you find yourself wondering if the object-oriented folks aren’t right, and that coupling data with methods might be a good idea after all.

The gimmick in Forth programming is figuring out what’s needed on the stack by one word, and making sure that the word used just beforehand leaves that on the stack. In this sense, the Forth programmer defines words, but needs to think in phrases, where the contents of the stack start out empty and end that way again. This is like the way that C uses the stack within a function scope, keeping the local variables only until it’s done with the function, and then overwriting them.

As a concrete example of this chaining, imagine a word, gpio-set, that sets a GPIO pin high. It will probably need both a port and a pin number to get the job done. A particularly Forthy way to implement this is to define a word for each pin on the part that you’re going to use: : PA3 PORTA 3 ;. Then you can light the LED on pin A3 with PA3 gpio-set. In C, you’d first define a structure that includes a port and a pin number, then define gpio-set to take a structure of that type. In Forth, this is implicit: you make sure that the pin words push a port and pin number onto the stack, and that the pin-handling words expect them. It’s not safe, but it’s simple.
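To try the idea on a desktop Forth, here is a stand-in sketch. The port address is invented, and this gpio-set only prints what it would do; a real one would write to a hardware register on the target chip. The $ prefix for hex numbers is assumed to be supported, as it is in Gforth.

```forth
\ Hypothetical port address, for illustration only.
$40010800 constant PORTA

: PA3 ( -- port pin ) PORTA 3 ;

\ Stand-in for the real thing: report the request instead of touching hardware.
: gpio-set ( port pin -- )
  ." set pin " .  ." of port " u. ;

PA3 gpio-set
```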

Stack Juggling

Swap-swap, the Forth Mascot, from Starting FORTH

The absolutely worst part of Forth is stack manipulations. People usually start learning Forth by learning about the stack manipulations, and indeed they are important, but they’re trivial. Words like swap, drop, and dup let you move items around on the stack, but too many “stack juggling” manipulations in a given word is probably a sign of bad Forth code, rather than good. You’ll use these words for sure, but that’s not where the bodies are buried.
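For reference, this is all that the three words named above do. The comments show the stack after each line, with the top of the stack on the right.

```forth
1 2 swap   \ stack: 2 1   (exchange the top two items)
drop       \ stack: 2     (discard the top item)
dup        \ stack: 2 2   (copy the top item)
* .        \ multiply and print: 4
```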

Rather, the stack is where the absolute minimal amount of data wants to sit between processing steps. The number of items needs to be small enough to avoid running out of finite stack space, of course. But a second reason is that it’s simply hard to keep too many stack items straight in your mind. As you get used to Forth, the amount of stack that you can internalize might go up from three items to five or seven, but you’re going to get confused if you let the stack grow unpruned for long.

So Forth is a disaster. It’s a language where you have to manage the stack yourself, and the only hope of not getting caught in a tangled web of endless stack juggling is to keep things as simple as possible. Forth proponents would claim that this is also its greatest virtue — there’s always pressure to keep things simple and straightforward because doing anything else will embed un-figure-outable bugs so deep into your code that you’ll go mad. I don’t know if it’s Stockholm syndrome, whether I’m a minimalist at heart, or whether I like the challenge, but this is actually one of the reasons that Forth is so neat.

Other programming languages allow you to juggle tens of variables because the compiler keeps track of different scopes for you, keeping track of how deep each (local) variable is on the stack. Forth doesn’t. And that means that you need to think about the order in which things run. Forth forces you to internalize a bit of optimization that an ideal compiler would do for you. This can end up in very tight code, or in headaches. It depends on the programmer and the problem.

In my experience, simple syntax, computer-friendly ordering, and the resulting emphasis on transparency and simplicity make it actually surprisingly easy to get stuff right in Forth.

The Sweet Spot

But enough philosophical crap about Forth. If you want that, you can read it elsewhere in copious quantities. (See the glossary below.) There are three reasons that Forth is more interesting to the hardware hacker right now than ever before. The first reason is that Forth was developed for the computers of the late 1970s and early 1980s, and this level of power and sophistication is just about what you find in every $3 microcontroller on the market right now. The other two reasons are intertwined, but revolve around one particular Forth implementation.


There are a million Forths, and each one is a special snowflake. We’ve covered Forth for the AVRs, Forth on ARMs, and most recently Forth on an ESP8266. The joke goes that if you’ve seen one Forth implementation, you’ve seen one Forth implementation. But I think that’s too cynical — Forth is as much a way of thinking about the computer programming problem as it is a particular implementation. Once you learn one, you’ll be on good footing to learn any other.

Anyway, a few years ago, a physics graduate student [Matthias Koch] wrote a Forth for the MSP430 because he needed a microcontroller to collect analog data for an experiment. That’s “Mecrisp”. Later on, he needed more speed and re-wrote it for the ARM Cortex M family of chips, and we got “Mecrisp-Stellaris”.

And that’s where things got awesome. [Jean-Claude Wippler], the “J” in JeeLabs, decided that he was going to implement his new system of distributed electrical and environmental sensors in Forth, at least under the hood. To do this, he needed a hardware abstraction layer. The combination of Mecrisp-Stellaris with the JeeLabs libraries is a tremendous Forth ecosystem for a whole bunch of ARM microcontrollers, and it has all been developed within the last two years by a small core of thoughtful hackers. Combining the two provides a very pleasant Forth microcontroller hacking experience, like a weird interactive Arduino.

Your Homework

So if you want to follow along down a very strange rabbit hole, learn a bit about the real hacker’s programming language, or just fool around, stay tuned. In a couple of weeks, I’ll publish a hands-on guide to getting started with Mecrisp-Stellaris on the STM32 family ARM chips.

These are the droids you're looking for.

A good minimum Mecrisp-Stellaris development environment is going to consist of a cheap STM32F103 development board, an ST-Link v2 (or better) programmer for it, and a USB-TTL serial adapter. Chances are good that you’ve got at least the latter kicking around already. If so, you could be set up with the rest for around $10, €10, or £10 depending on where you live. “STM32F103” and “ST-link” should get you set up on eBay.

So order some parts right now. While you’re waiting for delivery, work through this very good online tutorial. Run through that, and you’ll be set for next time.

If you want to go further, you can download a Forth for your desktop computer and start working through some other introductions. Brodie’s “Starting FORTH” is the canonical Forth introduction, but it’s a bit dated (and feel free to skip Chapter 3 on disk access). If you prefer brevity over humor, check out J.V. Noble’s Beginner’s Guide or the Gforth tutorial.

All images from Starting FORTH used by permission from FORTH, Inc. The originals are black and white, rather than Hackaday dark charcoal and yellow, naturally.

235 thoughts on “Forth: The Hacker’s Language”

  1. Sigh. Does anyone have any suggestions for debugging a z80 figforth (v1.3) implementation? Somehow, each word executed seems to be putting an extra word or two on the operand stack, so that “NOOP NOOP” typed on the terminal results in several words worth of stack used up. various other words ( like .) don’t seem to work from the terminal, although they work ok when called from the internal threaded code. Since I have new hardware, new IO code, no OS or debugger, had to translate from one assembler to another (manually), and don’t have THAT much faith that 1.3 was ever a working version, I’m getting pretty frustrated. I don’t understand how it can have so much working (VLIST, for example) and still have so much broken. When I put trace code at NEXT, It *looks* like all the words are treating the stack properly, but… each time it gets to EXECUTE I have a couple words deeper….

    1. Found it! Turns out that the assembler originally used, and the assembler that I am using, do not treat
      .DW BRAN, TARGET-$
      the same. In particular, my assembler doesn’t update $ for subsequent words, causing the offset to be different than expected (and NOT correct for the Forth-ish code.)
      Changing them all to
      .DW BRAN
      .DW TARGET-$
      Makes everything work much better. Sigh. I would have found it sooner if I had focused on the words I had noticed weren’t working at all, instead of trying to track down the mysterious growing stack…

        1. yeah, right. More like assembly language and ancient “how to interpret a core dump” skills. I instrumented NEXT (in assembly) to dump IP and SP in hex, used EMACS to do symbol lookups, and then processed ERROR crashing:
          NEXT: SEMIS.6F08
          NEXT: SPSTO.6F08
          NEXT: BLK.6F0C
          NEXT: AT.6F0A
          NEXT: DDUP.6F0A
          NEXT: DUP.6F0A
          NEXT: ZBRAN.6F08
          NEXT: SEMIS.6F0A
          NEXT: ZBRAN.6F0A
          NEXT: 4983.6F0C
          To realize “4983?! THAT’S not right…” (note the earlier 0BRANCH working just fine…)
          I did learn more Forth than I used to know while debugging this…

  2. FYI – ‘I’m not sure that it’s going to help you get “real work” done at all, and you engineer types might want to walk away now.’ Can’t let this stand, there might be kids watching. :)

    One reason there are so many versions of FORTH is that real engineers need it and write their own custom version for the target at hand, to get real work done. Many don’t choose to talk about it due to the religious flame wars of the past several decades. E.g., at least two of the 6+ versions of FORTH for the Parallax Propeller chip are in constant use by their authors for primary income. And yet the followers of each still argue about “which is faster” or which is “better” and other moot discussions. Bear in mind, the author makes a very good living, typically with very timing-specific embedded control systems.

    FORTH is a very specific tool for a very specific job. We still need a tiny monitor on the hardware (during initial or low-level development); often there is no other solution available. When this is a key requirement, then Forth is justified, some might say mandatory. When a tiny monitor on the hardware is not required, Forth is not required, and some might say contraindicated.

    Generally speaking (and no offense meant), software engineers use a high-level language such as some form of C, and electrical engineers use solder and assembly language. Software engineers use FORTH because it might be fun to try “how small can you write it”, etc.; electrical engineers might use Forth when, for example, they don’t want to bother with assembler. Very rarely are there individuals with such a deep understanding of both hardware and software that they determine, for the task at hand, FORTH is the only logical option. Me, I like to glom on to folks that are smarter than me; often a guy that can write and use a custom Forth can teach me a LOT about hardware, software, and problem solving in general. Also, FORTH is fun to program.

    FORTH is not for everyone, and it’s not always the best for a given application. FORTH is a handy tool to have in cases where we need a monitor running on the hardware; interactive development; assembler speed execution; custom assembler routing; and deterministic, time critical event handling.

    And if you like that sort of thing, FORTH is just plain fun.

    1. I do not understand this religious war about ANS. The commercial suppliers need it for compatibility. So you can say it is according to the Standard. We all know that optimized is better than Standard in ANY area. But at least with ANS you can state where you differ and for what reason – if anybody is interested. In Forth you can as it is inherent – try the same in any other language and you are probably doomed or the compiler disagrees. As MPE has free versions for MSP430G2553 and for many ARMs there should not be an issue to use commercial trial tools for learning. And for implementations for other CPUs there should be people out there who have adapted it for other controllers – and could be asked to share an image for download. And everybody wins.

  3. I’ve lived my life in APL but when I learned of Forth ( ~ when Tron came out and Byte had a special issue ) my immediate thought was the vocabulary worth constructing is APL .

    See , my APL level computing , in fact general note-keeping , environment open ” from the chip to the math” , now up on GitHub and supported via YouTube .
    Go Forth ; branch & merge .

  4. I was into Forth way back, when the Byte magazine special issue came out. At the time, all I had was an Ohio Scientific Challenger IIp, based on a 6502 with 8K RAM. No hard disks or floppy disk, no internet, just cassette audio tapes on which to store software and data. I was fascinated with Forth, how it could be so small and work on a limited 8-bit machine. But I didn’t like certain details of the language, so I invented my own version, using special character available on the OSCIIp, designing a nicer looking syntax. I came up with a lot of clever stack manipulation and execution tricks. Educational, fun, but I never actually got it running.

    Anyone sufficiently bored with nothing better to do can read about it at

  5. Enjoyed the article and appreciate the Mecrisp recommendation. OpenFirmware architect WM Bradley stated for Forth users the hardest part about Forth usually involves explaining why it works for cross-platform application development like bootloaders or Hackaday. ( Admittedly the recent non-smoker vegetarian analogy is pretty funny. ) Forth haters will typically cite Latin, RPN/readability, maintenance, or some hoary project like Valdocs ruined by the language. Forth Inc and MPE have professional grade (docs, support, updates) cross platform development products that support new custom designs as well as Maker inspired Hackaday platforms like Arduino, Raspberry PI, Launchpad, etc. I say FORTH GO.

  6. I caught the FORTH addiction about a year ago for all the reasons described. Ultimately it comes down to what you’d want when you are hacking away with micro controllers and you want something that works like command line – you want to try something without recording / reflashing. I am writing a tiny 8bit FORTH for 12F683 via bitbang UART. This thing is like a throwaway screwdriver for testing all sorts of stuff on the breadboard.

  7. I tried to learn Forth on an emulator of the Jupiter ACE. The Jupiter ACE was an 8-bit home computer similar to the famous ZX Spectrum, but with FORTH instead of BASIC. The user guide for the Jupiter ACE can be found on the web (there are many mistakes in that manual, probably created when the manual was digitized). Forth for the Jupiter ACE was based on FIG Forth. That confuses me, because modern Forth (like gforth) is ANSI Forth. I was surprised that when I tried examples for the Jupiter ACE in gforth, I received different results, even for several simple examples. That is dangerous! For example, PICK is different in FIG Forth (index from 1) and in ANSI Forth (index from 0). In FIG Forth, 1 PICK is equivalent to DUP and 2 PICK is equivalent to OVER. In ANSI Forth, 0 PICK is equivalent to DUP and 1 PICK is equivalent to OVER. The PICK instruction is just one of several examples. I remember that some loops were implemented in a different way too.
