You would assume that any programming language available back in the 1960s would be small enough to easily implement on today’s computers. That’s not always true though, since old languages sometimes used multiple passes. But in some cases, you can implement what would have been a full language decades ago in a tiny footprint. A case in point is a pretty good implementation of Lisp — including garbage collection — in 436 bytes.
SectorLISP claims to be the tiniest real language, beaten only by toy languages that are not really very useful. If you want to, you can try it in your browser, but that version has better error messages and persistent bindings, so it hogs up a whole 509 bytes.
Of course, LISP can be an acquired taste, but it is elegant. Some joke that it is an acronym for “lots of irritating spurious parentheses,” but the structure does make it easy to parse.
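To give a feel for why the parenthesized structure is easy to parse, here is a minimal sketch in Python. It is not SectorLISP's reader (which is 8086 assembly); the function names `tokenize`, `parse`, and `read` are made up for illustration, and atoms are simply kept as strings.

```python
def tokenize(text):
    """Split source text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        # Keep reading sub-expressions until the matching ")".
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # drop the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected )")
    return token  # an atom, kept as a plain string in this sketch


def read(text):
    """Parse the first expression found in text."""
    return parse(tokenize(text))


print(read("(CAR (QUOTE (A B C)))"))
```

The whole grammar is "an atom, or a parenthesized list of expressions," so one short recursive function covers it. A language with infix operators and precedence rules would need noticeably more machinery.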
Coincidentally, Forth is also easy to parse and SectorForth is one of the slightly larger languages that SectorLISP compares itself to. These tools are meant to live in the small boot sector of a floppy, but who knows where you might want to cram in a tiny scripting language? The fact that SectorLISP takes 436 bytes and the IBM 7090 LISP 1.5 took 32K is probably partly due to the efficiency of the x86 instruction set and partly due to the fact that the 7090 had a much larger environment to live in.
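For a sense of how tight a boot sector is, here is a rough Python sketch of the size budget. A classic BIOS boot sector is 512 bytes and ends with the signature bytes 0x55 0xAA, which leaves 510 bytes for everything else. The helper name `pad_boot_sector` is hypothetical, and treating the 436 and 509 byte figures as code that excludes the signature is an assumption, not something the project states here.

```python
SECTOR_SIZE = 512            # size of a classic BIOS boot sector
SIGNATURE = b"\x55\xaa"      # boot signature in the last two bytes


def pad_boot_sector(code: bytes) -> bytes:
    """Zero-pad code to a full sector and append the boot signature."""
    room = SECTOR_SIZE - len(SIGNATURE)
    if len(code) > room:
        raise ValueError(f"{len(code)} bytes will not fit in {room}")
    return code + bytes(room - len(code)) + SIGNATURE


# Placeholder payloads of the two sizes quoted in the article
# (assumption: those sizes do not include the signature).
for size in (436, 509):
    image = pad_boot_sector(bytes(size))
    spare = len(image) - len(SIGNATURE) - size
    print(f"{size} bytes of code: {spare} bytes to spare")
```

Under that assumption, the friendlier 509 byte build would have a single byte left over.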
This isn’t the first small LISP we’ve seen. Forth, of course, is a staple.
Change it to a .COM file and you can ditch the first six bytes, since a .COM program can assume cs=ds=es=ss.
Your welcome.
Can we have a proper asm size-optimization competition on this?
Winner gets all the glory if they can get it down to 256 bytes…
* you’re
He was saving two bytes
The first question I found myself asking was 436 bytes of what? I of course assumed it would be PDP-11 machine code, but I was wrong, it is 8086 machine code. Interesting, but hardly a readily available platform in this day and age. Let’s see what someone can do with ARM machine code.
It’s likely to be significantly larger, though the compact Thumb instruction set helps. The whole reason that CISC architectures were invented is because they allowed very compact code to be hand-crafted, back in the days when memory was very expensive. But I would love to see it so we can compare!
8086 is still readily available. All x86-64 CPUs start in Real Mode, which is 8086 compatible (That’s why SectorLISP is written in 8086 after all). Even after setting the right registers to jump to Protected Mode (32-bit), you can still run 8086 programs in Virtual Mode.
In many ways the instruction count tells us more than the byte count. So recoding this for the ARM might suffer significantly from the fact that all instructions are 32 bits (I am ignoring thumb mode), but a reduced “lines of code” count would measure both how clever the person coding was, as well as how agile the architecture is.
Whatever the case, fitting an entire language in a “handful” of bytes is impressive.
The byte count is a bit misleading as there is the BIOS that abstracts the I/O, boot code, memory and peripheral initialization etc. Old BIOS ranges from 256K compressed to 16 or even 32MB these days.
I am more interested in the size of a bare metal implementation.
I believe 80286-80486 era BIOSes did fit in the UMA and thus were much smaller than that. About 64KB total, I think, so they did fit into 2×27256 EPROMs or a single 27512 EPROM. The CMOS Setup Utility and the actual BIOS are/were separate entities, too, afaik. Many users don’t remember this. The PC BIOS in the PC/XT class system didn’t offer a CMOS, for example, because there was no Real Time Clock with built-in CMOS RAM by default. DIP switches or jumpers were used to configure the motherboard. RTCs were optional devices at the time and could be installed in the form of expansion cards. Some required DOS drivers, also, because they were original designs. The PC/AT introduced a standard RTC for the first time, afaik. The CMOS Utility was loaded from a diagnostics diskette. Compaq continued that practice, too. Compaq BIOSes were located on diskette or a hidden DOS partition of a fixed disk.
Agree. Like on an ESP32 with enough ‘BIOS’ to support WiFi, I2C, SPI, UART etc for useable connectivity. Possibly such already exists. Then a wrinkle would be libraries for common IO devices and sensors.
Beating FORTH is quite impressive.
What can we do with a 436 byte language…? Maybe embed it in a gif file so we can write exploits in an easier language than NSO did? :P
Make a BIOS out of it instead of FORTH
I remember “Lost In Silly Parentheses”
Awesome, let’s create a nice icon for it. For your favorite GUI, it’ll be larger.
Isn’t Emacs written in Lisp? So how big would Emacs be in this version of Lisp?
Still larger than vi.
B^)
Most of the Emacs source is written in C, but it has a built-in lisp interpreter.
OK, now here is an interesting juxtaposition of articles. This one, and “FlyBrainLab: Google Earth But For A Drosophila Fly’s Brain”. So, coding FORTH into a fruit fly…
A Beowulf cluster of programmable sensor arrays…
bla-blaaaaaaaaaaa … where is the code ?
some good examples ??
“Where is the code?”
It is easy to miss, it is all contained in the period at the end of the 3rd sentence.
B^)
We can’t cut the first six bytes, since they are the literal strings for false (NIL) and true (T), and also part of the code (we execute them on start). We also can’t drop the far jump since we need false to be at offset 0.
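To make the six bytes concrete, here is a small Python sketch. It assumes the two atom names are stored as NUL-terminated strings, which is what makes the total come to six; the variable names are made up for illustration.

```python
# Assumption: "NIL" and "T" are each stored with a trailing NUL byte.
header = b"NIL\x00" + b"T\x00"

NIL_OFFSET = header.index(b"NIL")  # false sits at offset 0
T_OFFSET = header.index(b"T")      # true follows right after

print(len(header))        # 6
print(header.hex(" "))    # 4e 49 4c 00 54 00
print(NIL_OFFSET, T_OFFSET)
```

Those same six byte values are what the CPU decodes as instructions at startup, so removing them would move NIL away from offset 0.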
This is beautiful :-)
Regardless of the CPU and BIOS details, being able to create a full language with such a small footprint is amazing, and it might be a fun project to port over to a small microcontroller such as an Arduino/Atmel.
Not only that but this is so small it could be embedded in the hardware