If there’s one constant with software developers, it is that sometimes they get bored. At these times, they tend to think dangerous thoughts, usually starting with ‘What if…’. The next thing you know, they have gone down a dark and winding rabbit hole and found themselves staring at something so amazing that the only natural conclusion is that, while educational, it serves no immediate purpose.
Snipping the <stdio.h> header, and the printf() function that it provides, out of a C program is definitely a good example of this. Starting from the typical Hello World example in C, [Old Man Yells at Code] over at YouTube first takes us from the standard dynamically linked binary at a bloated 16 kB to the statically linked version at an eye-popping 767 kB.
To remove any such dynamic linkages, and to keep file sizes somewhat sane, he then proceeds to first use the write() function from the <unistd.h> header, which does indeed cut out the <stdio.h> include, before doing the reasonable thing and removing all includes by rewriting the code in x86 assembly.
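As a rough sketch of that intermediate step, the snippet below prints the greeting with write() instead of printf(). The helper name hello_write and the exact message text are illustrative assumptions, not taken from the video.

```c
#include <unistd.h>   /* write(), ssize_t, STDOUT_FILENO */

/* Hypothetical helper: writes the greeting to the given file descriptor
 * and returns the number of bytes written, or -1 on error. */
ssize_t hello_write(int fd)
{
    static const char msg[] = "Hello, World!\n";

    /* sizeof msg counts the terminating NUL, which we do not want to print. */
    return write(fd, msg, sizeof msg - 1);
}
```

Calling hello_write(STDOUT_FILENO) from main() prints the message. The program still links against libc at this stage, since write() is itself a libc wrapper around the system call.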
While this gets the final binary size down to 9 kB and needs no libraries to link with, it still performs a syscall, after setting appropriate register values, to hand control back to the kernel for doing the actual printing. If you try doing something similar with syscall(), you have to link in libc, so it might very well be that this is the real way to do Hello World without includes or linking in libraries. Plus the asm keyword is supported by the major C compilers, if only as a common extension rather than as part of ISO C, although one could argue that at this point you could just as well write everything in x86 ASM.
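For illustration, here is a minimal sketch of issuing the write system call from C through inline assembly, with no headers at all. It assumes x86-64 Linux and a GCC-compatible compiler. The function name raw_write is hypothetical, and the code is not taken from the video.

```c
/* Assumption: x86-64 Linux, GCC or Clang. No #include lines needed.
 * __asm__ is the GCC spelling of asm that also works in strict ISO modes. */
static long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)            /* rax: return value (bytes written or -errno) */
        : "a"(1L),             /* rax: syscall number 1 is write on x86-64 Linux */
          "D"((long)fd),       /* rdi: file descriptor */
          "S"(buf),            /* rsi: pointer to the data */
          "d"(len)             /* rdx: number of bytes */
        : "rcx", "r11", "memory"  /* the syscall instruction clobbers rcx and r11 */
    );

    return ret;
}
```

Calling raw_write(1, "Hello, World!\n", 14) sends the text to standard output. A fully library-free binary would additionally need its own entry point and an exit system call, since nothing would be left to call main() or to clean up after it.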
Of course, one cannot argue that this experience isn’t incredibly educational, or that it doesn’t decidedly answer the original ‘What if…’ question.
I half expected an asm call, seems like the easy way to do it.
I’m kind of surprised that GCC doesn’t have a built-in function for syscall instruction.
Obligatory older non-video link:
https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
It is always a trade-off between quick and dirty vs highly optimized code. We have an embarrassment of riches in terms of memory and processing power today. Is it better to use a bunch of RAM or programming time? If the code runs fast and terminates, like a utility, it’s probably not worth it to optimize (like etcher). If it runs continuously, like a spreadsheet, a browser, or a database, it makes more sense to tune it more. It would be nice if it were easier to rip out just the functions you need from a library rather than including the entire bundle.
Last time I checked (in gcc-avr, not x86), that’s exactly what happens during the linking stage of building an executable: all unused functions are removed from libraries and from your own code. It’s called garbage collection, if I’m not mistaken.
Removing unused functions is a form of link-time optimization, or LTO.
The advantage of an MCU is there is no OS if you code bare metal. Prior to the main function it just needs to set up the C runtime environment by initializing variables. And there is nothing to return to after main.
Is there nothing akin to a COM file and some basic interrupts? To get this down to double digit bytes?
not anymore, a.out format was deprecated
time for another muppetlabs link lol
https://www.muppetlabs.com/~breadbox/txt/mopb.html
Is it still linking against crt1 or something?
Copy and paste the glibc or musl fprintf function
Calls to uncertified dependencies now a serious issue?
Not too far from the recent article about the person who wrote his “hello world” as a boot loader. No longer “shackled” by that pesky OS… ;).
“one could argue that at this point you could just as well write everything in x86 ASM”
Well yes, you summarized it all there!