Xcc700: Self-Hosted C Compiler For The ESP32/Xtensa

With two cores at 240 MHz and about 8.5 MB of non-banked RAM if you’re using the right ESP32-S3 version, this MCU seems, at least in terms of specifications, to be quite the mini PC. Obviously this means that it should be capable of self-hosting its compiler, which is exactly what [Valentyn Danylchuk] did with the xcc700 C compiler project.

Targeting the Xtensa LX7 ISA of the ESP32-S3, this is a minimal C compiler that outputs relocatable ELF binaries. These binaries can subsequently be run with, for example, the ESP-IDF-based elf_loader component. Obviously, this is best done on an ESP32 platform that has PSRAM, unless your binary fits within the few hundred kB that are left after all the housekeeping and communication stacks are loaded.
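As a rough illustration of what "an ELF binary for Xtensa" means at the byte level, the sketch below inspects the fixed part of an ELF header. The field offsets and constants (magic bytes, ELFCLASS32 = 1, ELFDATA2LSB = 1, EM_XTENSA = 94) come from the ELF specification; the function names are made up for this example and are not part of xcc700 or elf_loader. Which `e_type` value xcc700 actually emits is not stated in the article, so the sketch only reports it.

```c
#include <stddef.h>
#include <string.h>

#define EM_XTENSA_MACHINE 94 /* e_machine value for Tensilica Xtensa */

/* Read a 16-bit little-endian value from a byte buffer. */
static unsigned read_u16le(const unsigned char *p) {
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Returns 1 if buf starts with a 32-bit little-endian ELF header
 * whose e_machine field says Xtensa, 0 otherwise. */
int is_xtensa_elf32(const unsigned char *buf, size_t len) {
    if (len < 20) return 0;                        /* need e_ident, e_type, e_machine */
    if (memcmp(buf, "\x7f" "ELF", 4) != 0) return 0; /* magic bytes */
    if (buf[4] != 1) return 0;                     /* EI_CLASS: ELFCLASS32 */
    if (buf[5] != 1) return 0;                     /* EI_DATA: ELFDATA2LSB */
    return read_u16le(buf + 18) == EM_XTENSA_MACHINE;
}

/* Returns the raw e_type field (1 = ET_REL, 2 = ET_EXEC, 3 = ET_DYN).
 * The caller must have checked that at least 18 bytes are available. */
unsigned elf_file_type(const unsigned char *buf) {
    return read_u16le(buf + 16);
}
```

Running a check like this on a file before handing it to a loader would reject, for instance, a binary built for one of the RISC-V based ESP32 parts.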

The xcc700 compiler is currently very minimalistic, omitting more complex loop types as well as long and floating point types, for starters. There’s no optimization of the final code either, but considering that it’s 700 lines of code just for a PoC, there still seems to be plenty of room for improvement.

21 thoughts on “Xcc700: Self-Hosted C Compiler For The ESP32/Xtensa”

      1. Tiny CC. https://github.com/TinyCC/tinycc. Not quite a cross-compiler, but several people have rewritten the back end to target other architectures. It’s a far more complete implementation of C than this project, but still has some way to go, and in its original implementation by Fabrice Bellard, it targets x86, x86_64, arm, aarch64 or riscv64. I haven’t seen an ESP32 version, but I know SOMEBODY is working on a Cortex-M targeted version. And once you have a version that targets the right architecture, all it takes to make it a cross-compiler is to build tinycc itself on the system you want to compile from.

  1. I could not fathom this phrase “omitting more complex loop types”, because C loops are extremely simple. But, lo! the README says it supports ‘while’ but not ‘do’ or ‘for’.

    I’ve many times thought of using languages mostly to write a self-hosting compiler and for no other purpose (i.e., not to compile an existing body of code). FORTH obviously. Never thought of using a “C–” approach for that. By the time I really looked at, specifically, the 1990ish C– for DOS, its limitations (iirc, no nesting) were too much to inspire me. But if it’s nested, ‘while’ really is expressive enough; I’d be willing to write a compiler in a language that lacked ‘for’.

    I’d miss ‘struct’, though.

    1. Loops-wise, all compilers resolve to about the same assembler code for all three (while, for and do); there is no difference, and it comes down to a JUMP assembler instruction to complete the loop.

      (disclaimer – it’s been decades since I wrote Assembler code, but there are no real “loops” in Assembler to speak of: some kind of code increments/decrements/shifts a register value, executes another piece of code, returns, inc/decr/shift – until some kind of condition is met, then jumps to a different piece of code).

      Meaning, it is no loss, really.
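The rewrite discussed in this thread can be sketched in C as follows. This is an illustrative example, not code from the xcc700 project, and the function names are invented: each `for` or `do` loop is paired with a `while`-only version that computes the same result.

```c
/* Sum of 0..n-1 written with a for loop. */
int sum_for(int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* The same computation using only while: the init statement moves
 * before the loop and the increment moves to the end of the body. */
int sum_while(int n) {
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}

/* Count decimal digits with do/while; the body runs at least once,
 * so 0 is reported as having one digit. */
int digits_do(int v) {
    int count = 0;
    do {
        count++;
        v /= 10;
    } while (v != 0);
    return count;
}

/* The same computation using only while: a flag forces the first pass. */
int digits_while(int v) {
    int count = 0;
    int first = 1;
    while (first || v != 0) {
        first = 0;
        count++;
        v /= 10;
    }
    return count;
}
```

One caveat with the mechanical `for` rewrite: a `continue` inside the body would skip the trailing increment in the `while` version, so such loops need the increment duplicated before each `continue`.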

    2. struct is a far greater challenge than do..for or any other flow control mechanism. And much more important. Without struct, you have to keep track of both the size of each data structure and the offset to each individual property. Which is the major advantage you get in moving up from an assembler into C. This is a hard “no” for me, although I’ll look at the code anyway to see how much trouble it would take to add this essential feature. Leaving out struct is like leaving out float.
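To make the bookkeeping burden concrete, here is a small sketch contrasting the two styles. The record layout, names and offsets are hypothetical, chosen only for illustration: with `struct` the compiler computes field offsets, and without it the programmer maintains them by hand.

```c
#include <stddef.h>
#include <string.h>

/* With struct: the compiler tracks the size and each field's offset. */
struct reading {
    int sensor_id;
    int temperature;
    int humidity;
};

int get_humidity_struct(const struct reading *r) {
    return r->humidity;
}

/* Without struct: the layout is maintained by hand. Adding or
 * reordering a field means updating every constant below. */
enum {
    OFF_SENSOR_ID   = 0,
    OFF_TEMPERATURE = 1 * sizeof(int),
    OFF_HUMIDITY    = 2 * sizeof(int),
    READING_SIZE    = 3 * sizeof(int)
};

int get_humidity_manual(const char *buf) {
    int value;
    /* memcpy avoids alignment and aliasing problems with raw buffers. */
    memcpy(&value, buf + OFF_HUMIDITY, sizeof value);
    return value;
}
```

Both accessors read the same bytes when the buffer holds a copy of the struct, but only the first one keeps working unchanged if a field is inserted ahead of `humidity`.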

  2. I have a bunch of ESP32s that I keep meaning to use, but so far everything I want to do with a microcontroller is doable with an ESP8266. But so far I have not been interested in capturing images, just weather data and turning things on and off.

  3. Self hosting a C-compiler is nice, but it has to live in an environment too. I see “ESP32-DOS” on the TFT, and there is an older article on Hackaday for that: https://hackaday.com/2021/07/28/emulating-the-ibm-pc-on-an-esp32/ You’d also need a text editor to input the source code, and a generic (preferably multitasking) OS to start your programs would then be a logical choice. What’s up with that “elf loader”? If this works over a terminal (emulator (telnet / netcat)) that would be nice too, and drivers for the network are normally an OS function too, so multiple programs can use the network stack.

    I’d also really like to have a unix / linux like CLI. I’ve forgotten most of the old DOS commands, and they were extremely limited too (batch files did not even have a proper loop, pipes were never implemented properly). Cygwin to get command line unix utilities may be adequate. ncurses for simple menus.

    With such a system as a solid base, I guess / hope it would attract more people from the retro computing niches too, for example with Z80 and C64 emulators. ESP32 can also already run DOOM. https://www.youtube.com/watch?v=s9bV4q9rWs0

    With such a base it would in time also get extended into generic handheld devices, from (retro) game consoles to dedicated text entry (“distract-less editors”), diaries, calendar applications, programmable remote control and such. I’ve never had a flipper, but maybe that could benefit from this too, by combining the standard applications with custom scripts that can be modified on the fly on the device itself (Add a bluetooth keyboard?)

    Most of such things can also be done on a phone, but with an ESP32 you can do it on EUR 20 of hardware, and a battery life of weeks or months. (Or years even, if the “wake up duty cycle” is very low).

    1. elf_loader is a relocating loader that Espressif provides as a way to load elf files created by their esp-idf development framework and stored on SD card or on-board flash memory to be loaded and run on a standalone ESP32 system. Espressif’s intention is that you develop programs to run on the ESP32 series chips using their tools on a “real” computer, and elf_loader is something they give you so if you want to be able to run multiple programs and can provide a user interface to choose among them, you can do that. The step that this project adds is a compiler that can run on an ESP32-S3 to generate the elf files. Of course, since the latest ESP32 chips, the ESP32-Cx and ESP32-P4, and a few others are based on the RISC-V architecture, this compiler won’t generate code that will run on those. Just the -S3. But presumably, you could compile elf_loader for any ESP32, and it could be used to load elf files created for that chip.

      Writing a “terminal emulator” for ESP32 is relatively easy. I put that in quotes because what we’re talking about isn’t an emulator at all. If you program a microcontroller to do a task, like accepting input from a keyboard and displaying characters on a screen, it’s not a terminal emulator, but a TERMINAL. It is true that you need all of the other things – a text editor, a command line interpreter, and a program loader, along with libraries to access an SD card or USB thumb drive, and read and write files on a FAT filesystem, but each of these is its own project, and all of them and a few other things are required before you have a self-contained operating system. Of all of these, a C compiler is probably the most complex. But it’s also one of the three essential pieces you need that has to be written specifically for the target instruction set, the other two being an assembler and a disassembler/debugger. Which means all of these have to be written for or adapted to all of the instruction set architectures you want to support. Which in the ESP32 world means Tensilica Xtensa LX6, LX7, and RISC-V. But of course you can restrict your OS to support just one of these. It’s just that in Espressif’s opinion you’re better off developing your code using their tools on a PC and then building it separately for each specific chip you want to run it on. This is not an ideal situation, which is why you see Micro Python being ported to every MCU with enough memory to handle it. But I would be interested in exploring this compiler to see how much work it would be to adapt it to the RISC-V architecture. I think that Espressif will continue transitioning to RISC-V, and it won’t be too long before the plain ESP32 and the ESP32-S3 will be considered obsolete. This is already close to true for the original ESP32.

      Having grown up with the microcomputer, and spending the 1980s programming Z-80, 6502, and 6809 systems as well as a couple mainframes and Unix BSD, I don’t see multitasking as an essential. Multitasking is what forced us into systems with gigabytes of memory, terabytes of storage, and gigahertz of clock speed. If you’re going to embrace architectures like ESP32, you’re going to have to be a bit more realistic with your expectations. People have already done this with their cell phones – how many people do you see running multiple applications at the same time on their phones? It really isn’t necessary. If you really need the services provided by Unix, there are plenty of Unix machines out there, including ones you can wear on your wrist. Do you want to reinvent Unix? This isn’t and never will be that. Thankfully.

    2. This is a cool vision and might fit the spirit of the old C.H.I.P. single-board computer https://en.wikipedia.org/wiki/CHIP_(computer) — especially the ESP32-P4, which has two cores running at 400 MHz each. Maybe the fatal flaw of C.H.I.P. is that it was stewarded by one scrappy little startup that couldn’t keep the lights on — an open-source, come-one-come-all bazaar approach might have more success.

      For myself, though, I see an immediate need for this that doesn’t require an entire OS behind it. In my spare time I’ve been working on a set of plug-and-play process automation devices for small food producers without much tech experience. It’s built on a flow-based-ish library that gives strong typing all through the pipeline. The devices will have a GUI that allows users to assemble flow networks without writing code — but at some point their diagram will have to become executable. I’d assumed I’d have to use the Builder pattern on a data structure that holds the net on device bootup, but suddenly it might be possible to just convert the structure into C that gets compiled and hot-loaded! This is really exciting!
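A minimal sketch of that idea, under heavy assumptions: a linear pipeline stands in for the flow network, every stage is a function taking and returning `int`, and the names `struct stage`, `emit_pipeline` and `run_pipeline` are invented for this example. The generator turns the in-memory description into C source text, which could then be handed to an on-device compiler.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pipeline stage: the name of a function that takes
 * the previous stage's int output and returns an int. */
struct stage {
    const char *fn;
};

/* Writes C source for `int run_pipeline(int x)` into out.
 * Returns 0 on success, -1 if the buffer is too small. */
int emit_pipeline(const struct stage *stages, int count, char *out, size_t cap) {
    size_t used;
    int i = 0;
    int n = snprintf(out, cap, "int run_pipeline(int x) {\n");
    if (n < 0 || (size_t)n >= cap) return -1;
    used = (size_t)n;
    while (i < count) {
        /* One generated line per stage, chaining the value through x. */
        n = snprintf(out + used, cap - used, "    x = %s(x);\n", stages[i].fn);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
        i++;
    }
    n = snprintf(out + used, cap - used, "    return x;\n}\n");
    if (n < 0 || (size_t)n >= cap - used) return -1;
    return 0;
}
```

For two stages named `scale` and `clamp`, the emitted text is a five-line function that calls them in order. The generated code sticks to `int` values and plain function calls, which seems a reasonable fit for a deliberately minimal compiler, though whether xcc700 accepts it as-is is untested here.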

  4. Turbo Pascal ran fine on primitive 386 PCs of the era with full IDE, debugger and graphics library. Just goes to show how inefficient ANSI C is (not to mention C++ with its horrible STL, generics and exceptions). Also, in C malloc maps directly to a simple kernel call while in C++ new and delete are implementation-specific. Microsoft does it their way, GCC another and Segger Embedded Studio theirs. It’s a goddamn mess.

    1. I got into embedded programming thanks in part to Turbo C and an after-market product that let you use the Turbo C debugger against a x86 target running their debugging monitor (kind of like gdb server). I was able to code my project in Turbo C and target it to a NEC V30. Good times.

    2. Um, actually, C compilers also were available for 386. I built Windows applications in MS-C on a 386 machine. Obviously it was possible, because that’s how most applications were created in the early 90s. Pascal as a language was every bit as complex as C; it’s just decades of cruft that have made modern C compilers into resource hogs. The fact that this never happened to Turbo Pascal relates more to how quickly it got replaced by C and fell out of use in most places. How does this “go to show how inefficient ANSI C is”? Furthermore, since C++ was originally a pre-processor for C, new ALSO directly called malloc().

    1. matlab may be a bit tough, since it’s a commercial application, but there’s an open-source replacement called Octave. I don’t know if it can import matlab files. I think there’s also an open-source replacement of TeX, as well. LaTeX, I think. Once you have the source, the rest is just work.
