Have you ever felt the urge to make your own private binary format for use in Linux? Perhaps you have looked at creating the smallest possible binary when compiling a project, and felt disgusted with how bloated the ELF format is? If you are like [Brian Raiter], then this has led you down many rabbit holes, with the conclusion being that flat binary formats are the way to go if you want sleek, streamlined binaries. These are formats like COM, which many know from MS-DOS, but which was already around in the CP/M days. Here ‘flat’ means that the entire binary is loaded into RAM without any fuss or foreplay.
Although Linux does not (yet) support this binary format, the good news is that you can learn how to write kernel modules by implementing COM support for the Linux kernel. In the article [Brian] takes us down this COM rabbit hole, which involves setting up a kernel module development environment and exploring how to implement a binary file format. This leads us past familiar paths for those who have looked at e.g. how the Linux kernel handles the shebang (#!) and ‘misc’ formats.
On MS-DOS, the operating system identifies the COM file by its extension, after which it gives it 640 kB and an interrupt table to play with. The kernel module does pretty much the same, which still involves a lot of code.
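To make ‘flat’ concrete, here is a minimal userspace sketch of the classic DOS convention: the whole file is copied, unmodified, to offset 0x100 of a single 64 KiB segment, and execution starts at that offset. This is not taken from [Brian]'s kernel module; the helper name `com_load_flat` and both constants are assumptions made up for illustration, and a real loader would also have to set up registers, a stack and the 256-byte prefix area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COM_SEGMENT_SIZE 0x10000u /* one 64 KiB real-mode segment (assumed) */
#define COM_LOAD_OFFSET  0x0100u  /* DOS loads the image after the 256-byte PSP */

/*
 * Hypothetical helper: copy a flat image into a zeroed 64 KiB segment.
 * Returns the entry offset (0x100) on success, or -1 if the image is
 * empty or too large to fit behind the load offset.
 */
static long com_load_flat(uint8_t *segment, const uint8_t *image, size_t len)
{
    assert(segment != NULL && image != NULL);

    if (len == 0 || len > COM_SEGMENT_SIZE - COM_LOAD_OFFSET)
        return -1;

    /* No headers, no sections, no relocations: clear and copy. */
    memset(segment, 0, COM_SEGMENT_SIZE);
    memcpy(segment + COM_LOAD_OFFSET, image, len);

    return (long)COM_LOAD_OFFSET;
}
```

Calling it with a two-byte image such as `{ 0xCD, 0x20 }` (`int 20h`, the traditional DOS program-terminate call) places those bytes at offsets 0x100 and 0x101 of the segment. The absence of any parsing step is the whole appeal of the format.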
Of course, this particular rabbit hole wasn’t deep enough yet, so the COM format was extended into the .♚ (Unicode U+265A) format, because this is 2025 and we have to use all those Unicode glyphs for something. This format extension allows for amazing things like automatically exiting after finishing execution (like crashing).
At the end of all these efforts we have not only learned how to write kernel modules and add new binary file formats to Linux, we have also learned to embrace the freedom of accepting the richness of the Unicode glyph space, rather than remain confined by ASCII. All of which is perfectly fine.
Top image: Illustration of [Brian Raiter] surveying the fruits of his labor by [Bomberanian]
Hm... yes! I thought so when Linus switched from a.out to ELF, but Linus did not listen to me. :-D
Olaf
ELF is no worse than PE but COM is crazy because it’s just raw instructions. However, the whole “.♚” business reminds me of the Atari Mega ST marking deleted files using a bomb character. It kinda makes me think that they should have used “.💣” instead. :)
Yay, COM files are back baby. 0xCD 0x21 😁
Actually, games written in COM can be orders of magnitude faster than currently available crap that needs LATEST GE FORCE 1000 GIGA RTX SSD FULL SPEED HD RAY TRACING MULTI CORE PENTIUM.
There are “flat” binfmts already in the kernel, popular for NOMMU builds. The article mentions these are bloated but last time I checked they were really not.
tangentially…
ELF is great because you really want all of its features. it isn’t bloated at all. the only way you would perceive it as bloated is if you didn’t want those features.
but you do. you really really want those features. ELF is great.
what’s bloated, fwiw, is i18n / l10n. regardless of binary format, every unix had a reasonably-sized libc until those features were added and then every libc became giant without exception. and that’s a feature you may want, or may not want, or may actively want to not have. sigh.
Without the ‘.’ in the extension, I was afraid some misguided soul was attempting to add Component Object Module support for some reason.
The horror, the horror…
Ah, yes. You mean Component Object Model.
Well, that thing still drives Windows at its core. But granted, it has no place in Linux.
Fun fact: Linux actually has COM. It’s called d-bus and uses glib. It’s part of Gtk and allows writing IPC mechanisms that expose and consume interfaces and services.
I wonder if there are security concerns with having the com files back? But I suppose it could perhaps save a lot of space for the smallest tools/commands. But I suppose we already have busybox for that.
The post seems to be several years old, the module is built against kernel 4.15. Does the example code even work on anything recent?
Maya, I tried to contact you. I need to read the prog_firehose_sdx6x.elf or the full ROM from a Nighthawk MR6450, but your homepage gave me a sending error. Thanks
Klaus
com, wow, that’s a flashback. I do miss the 256 byte demo competitions, that only work with .com executables. ( https://hackaday.com/2020/04/21/a-jaw-dropping-demo-in-only-256-bytes/ )
I think com did not even do pointer fixup, just loads into a known address. Or it had to be compiled to run relative to one of the base registers. No more than 64kb perhaps? 16-bit for sure….
Can you even have Address Space Layout Randomization (ASLR) work with a com format?
The load address was not fixed, but could be inferred from CPU registers (certainly from cs, since real mode was segmented, which also means pointer fixup was not necessary for programs that could fit into a 64k segment). Thus, aslr would probably work, although it might decrease usable memory.
I learned x86 assembly by hand assembling DOS .COM files in DEBUG.EXE