How Small Can You Make A C Executable?

It’s well known that the difference in executable size between a compiled binary and one hand-written in optimized assembler will be significant. The compiler brings in all manner of boilerplate whether it needs all of it or not, which is responsible for the extra space. [Weineng] has fallen down the rabbit hole of trying to make the smallest possible gcc-compiled C executable, and the resulting write-up is a fascinating read.

Surprisingly the smallest C program isn’t “Hello World”, but one which simply does nothing but return 0. This results in a binary weighing in at a hefty 15,816 bytes, something which surely could be improved. There follows a set of clever compiler flags and bits of code manipulation to remove some debugging information, and strip out unnecessary stuff executed before main().

At 13,632 bytes it’s still a little on the chunky side, so it’s time to examine what libraries it brings in. More compiler flags get it down to 8,704 bytes. Removing a code comment section and error handling with more flags takes it to 4,320 bytes. Then there’s code which dictates how memory is allocated, which brings it down to 400 bytes. That’s an impressive reduction!
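To put those figures side by side, here is a small sketch that tabulates the sizes quoted above. Only the byte counts come from the write-up; the names `stage_bytes`, `percent_saved` and `print_stage_table` are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Binary sizes in bytes, in the order the article quotes them. */
static const long stage_bytes[] = { 15816, 13632, 8704, 4320, 400 };
#define STAGE_COUNT ((int)(sizeof stage_bytes / sizeof stage_bytes[0]))

/* Percentage of the starting size that has been removed by stage i. */
double percent_saved(int i)
{
    assert(i >= 0 && i < STAGE_COUNT);
    return 100.0 * (double)(stage_bytes[0] - stage_bytes[i])
                 / (double)stage_bytes[0];
}

/* Print each stage with the bytes removed in that step and the running saving. */
void print_stage_table(void)
{
    for (int i = 0; i < STAGE_COUNT; i++) {
        long step = (i == 0) ? 0 : stage_bytes[i - 1] - stage_bytes[i];
        printf("%6ld bytes  (%5ld removed this step, %5.1f%% saved overall)\n",
               stage_bytes[i], step, percent_saved(i));
    }
}
```

Calling `print_stage_table()` from a `main` shows that the final step alone removes 3,920 bytes, and that the end result is about 2.5% of the starting size.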

As hardware people, we may not have the elite knowledge of compiler flags it takes to manage something like this. But we’ve all at times had to reduce the size of a bit of software, so we’re sure some of the techniques used are going to be interesting to quite a few readers.

After all, even hardware people need to trim the fat at times.

19 thoughts on “How Small Can You Make A C Executable?”

      1. But it does show that a program doing something similar can be 7 bytes instead of more than 2000 times as large.
        Although if I must believe Gemini, this prints “a” instead of returning 0.

        1. Indeed, 61. The a was intentional.
          But that was the point. It’s amazing how much bloat compilers can introduce into even the simplest of programs.
          That being said, those seven bytes were intended to be run as a DOS com file. It doesn’t take many more bytes to make it an exe.
          Interacting with modern operating systems, especially when there’s a GUI like windows as an example, can require quite a few hoops to jump through to even set things up for the bare minimum.
          No doubt if something like windows was written from the ground up, in bare metal asm, it would be substantially smaller and faster.
          I guess that’s one of the hopes for using AI during the programming process. Although personally I’m not convinced the drawbacks outweigh the benefits overall.

          1. “No doubt if something like windows was written from the ground up, in bare metal asm, it would be substantially smaller and faster.”

            Not at all. Windows is slow and bloated because of complexity. ASM is the worst possible language for managing complexity.

            All of the bloat in this do-nothing program is downstream of attempts to manage complexity. If you’re not trying to do something complex, those attempts are all wasted. But if you are trying to do something complex, they pay off in spades.

            Windows is slow because Microsoft’s corporate culture is systematically bad at handling complexity.

  1. Importantly, this is about compiling to the smallest a.out (old fashioned linux style) executable. I assume this was chosen because the ELF (modern linux style executable) header is slightly larger and ELF is somewhat more complex. Either way, you’re dealing with

    The article now has a link to a reddit post from someone who managed to get it down to 84 bytes, with (I think) 32 bytes required to be devoted to the header. truly impressive.. although using inline assembly in your C program feels a bit like cheating…

    You could presumably go even smaller if you were targeting something without a header; presumably this includes most microcontrollers, but also classic MS-DOS (or CP/M), which have an executable format (“.com”) which has no header at all.

    also…

    Surprisingly the smallest C program isn’t “Hello World”

    I’m not usually one to critique hyperbole or complain about something that might be intended as sarcasm.. but surely nobody is surprised that a binary that contains the text “Hello World” and the machine code to display that text is not the smallest one a compiler can generate.

    1. Either way, you’re dealing with

      my inability to complete a sentence before hitting post?
      meant to say that either way, you’re dealing with the need to create a valid header in addition to program code.

    2. DOS Turbo C and MS C could produce .com files which are essentially just binary images. With the right linker script you can get something similar on GCC or Clang – you can use either for hobby OS dev of boot sectors or similar where there is no dynamic loader. The issue here is not the C language but the lack of a light enough binary format in modern OSes.
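To put a rough number on the header overhead discussed in this thread, here is a sketch that asks the compiler for the nominal ELF header sizes. It assumes a Linux system whose C library ships `<elf.h>`, and it counts one file header plus one program header with no overlapping tricks. The function names are made up for illustration, and nothing here describes how any particular tiny binary was actually laid out.

```c
#include <assert.h>
#include <elf.h>      /* ELF structure definitions shipped with Linux C libraries */
#include <stddef.h>

/* Nominal header bytes for an ELF file with one file header and one
   program header, with no overlapping tricks applied. */
size_t elf64_header_bytes(void) { return sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr); }
size_t elf32_header_bytes(void) { return sizeof(Elf32_Ehdr) + sizeof(Elf32_Phdr); }

/* Bytes left over for code and data once the headers are accounted for. */
size_t payload_budget(size_t file_bytes, size_t header_bytes)
{
    assert(file_bytes >= header_bytes);
    return file_bytes - header_bytes;
}
```

These come out at 64 + 56 = 120 bytes for 64-bit ELF and 52 + 32 = 84 bytes for 32-bit ELF. If a 400-byte file were a 64-bit ELF with a single program header (an assumption, not something the write-up states), `payload_budget(400, elf64_header_bytes())` would leave 280 bytes for everything else. A DOS .com file, having no header, gets the whole file as payload.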

  2. I used to have a demo program (I must have it somewhere, just can’t find it) that displays a starburst, with an overlaid blown-up text from a command line argument and plays a midi version of Bronski Beat’s Smalltown Boy, all in under 2200 bytes, which was amazing to me at the time.

  3. This is mildly off topic, but fun to note. On ye olde IBM 360, the smallest program was the IEFBR14 utility. It consisted of two instructions and weighed in at exactly 4 bytes: 1B FF 07 FE. It sets a condition code of 0 and returns. It was commonly used in JCL decks where all of the action for a job step happened in the JCL cards preceding the EXEC IEFBR14 card.

      1. Challenge accepted. Wow, just wow. For a program that’s essentially a NOP, it sure carries along an amazing amount of baggage. A comparison of “true” and IEFBR14 stands as a good example of why while machines get faster and more capable, software performing an equivalent task gets bigger and slower. And keep in mind that of OS/360 it was said, “It sure is complicated, but it makes up for it by being slow.”

    1. You might be pleased to know when I put “what is 4A 65 73 75 73 20 49 73 20 4C 6F 72 64 in Unicode” into the local Gemma 4 E2B it tried to generate a table and broke itself. 🤣
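The four IEFBR14 bytes quoted earlier in this thread can be decoded by hand, since both instructions use the two-byte System/360 RR format: one opcode byte followed by two 4-bit fields. A minimal sketch, with hypothetical names (`rr_insn`, `decode_rr`, `rr_mnemonic`) and a mnemonic table that covers only the two opcodes needed here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One RR-format instruction: 8-bit opcode and two 4-bit operand fields. */
struct rr_insn {
    uint8_t opcode;
    uint8_t r1;   /* first field: a register, or a condition mask for BCR */
    uint8_t r2;   /* second field: a register */
};

/* Split two instruction bytes into opcode and operand fields. */
struct rr_insn decode_rr(const uint8_t bytes[2])
{
    assert(bytes != NULL);
    struct rr_insn insn = {
        bytes[0],
        (uint8_t)(bytes[1] >> 4),
        (uint8_t)(bytes[1] & 0x0F)
    };
    return insn;
}

/* Mnemonics for only the two opcodes this example needs. */
const char *rr_mnemonic(uint8_t opcode)
{
    switch (opcode) {
    case 0x1B: return "SR";   /* subtract register */
    case 0x07: return "BCR";  /* branch on condition, register form */
    default:   return "?";
    }
}
```

Fed `1B FF` and `07 FE`, this gives SR 15,15 (register 15 minus itself, leaving zero) and BCR 15,14 (an unconditional branch to the address in register 14, usually written BR 14), which is where the BR14 in the name comes from.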

  4. When you just make an empty ‘main’ program you’re ignoring the loader and setup code (the module often known as ‘crt0.asm’) that calls the function ‘main’. If you’re a ‘C’ programmer then you’ll know the difference between something that’s compiled for an OS environment and something that’s, well, just compiled (for example, as part of a bare metal embedded program). A ‘C’ programmer will understand implicitly all the setups, libraries and segment usage. To them assembly usage, or more likely non-usage, is a matter of convenience and choice.

    (Incidentally, this extends to C++ as well.)
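A program built without the usual startup module generally can’t lean on the C library either, so it ends up talking to the kernel directly. The following is a sketch of what that looks like, assuming x86-64 Linux and GCC or Clang inline assembly. `raw_write` is a hypothetical helper, not something taken from the write-up, and on any other target it simply reports failure.

```c
#include <assert.h>
#include <stddef.h>

/* Write len bytes from buf to file descriptor fd without calling libc.
   Returns the byte count on success or a negative value on failure. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86-64 Linux convention: syscall number in rax (1 is write),
       arguments in rdi, rsi, rdx; the kernel clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;   /* negative errno value on error */
#else
    (void)fd; (void)buf; (void)len;
    return -1;    /* not implemented for this target in this sketch */
#endif
}
```

In a hosted program this is only a curiosity, since `write()` does the same job. It starts to matter once the startup code and libc are gone and something still has to produce output or exit.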

  5. These days a utility to set Addressable RGB or your mouse DPI can easily near a gigabyte, whereas the functional portion would be 1 to 10MB in years past (even a couple hundred kilobytes when streamlined).

    I assume most of the bloat is including a full web browser and integration with social media platforms. Seems a lot of cruft. Annoying.
