If there’s one constant with software developers, it is that sometimes they get bored. At these times, they tend to think dangerous thoughts, usually starting with ‘What if…’. Next thing you know, they have gone down a dark and winding rabbit hole and found themselves staring at something so amazing that the only natural conclusion is that, while educational, it serves no immediate purpose.
Snipping the <stdio.h> header, and the printf() function that it provides, out of a C program is definitely a good example. Starting from the typical Hello World example in C, [Old Man Yells at Code] over at YouTube first takes us from the standard dynamically linked binary at a bloated 16 kB to the statically linked version at an eye-popping 767 kB.
To remove any such dynamic linkages, and to keep file sizes somewhat sane, he first switches to the write() function from the <unistd.h> header, which does indeed cut out the <stdio.h> include, before doing the reasonable thing and removing all includes by rewriting the code in x86 assembly.
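For reference, the write() step might look something like the minimal sketch below. This is not the code from the video; the function name hello_write is made up for illustration, and it assumes a POSIX system where <unistd.h> is available.

```c
#include <unistd.h>   /* write(), STDOUT_FILENO, ssize_t */

/* Print the greeting with write() instead of printf().
   Returns the number of bytes written, or -1 on error.
   hello_write is a hypothetical name used only for this sketch. */
ssize_t hello_write(void)
{
    static const char msg[] = "Hello, World!\n";

    /* sizeof counts the trailing NUL byte, so subtract one. */
    return write(STDOUT_FILENO, msg, sizeof msg - 1);
}
```

Calling hello_write() from main() gives a Hello World without <stdio.h>, although the binary still links against libc for the write() wrapper itself.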
While this gets the final binary size down to 9 kB and needs no libraries to link with, it still performs a syscall, after setting the appropriate register values, to hand control to the kernel for the actual printing. If you try doing something similar with syscall(), you have to link in libc, so it might very well be that this is the real way to do Hello World without includes or linking in libraries. Plus the asm keyword is available in C, at least as a common compiler extension, although one could argue that at this point you could just as well write everything in x86 ASM.
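As a rough illustration of the register setup involved, here is a sketch of a write syscall issued through inline assembly with no includes at all. It is not taken from the video and assumes x86-64 Linux with a GCC-compatible compiler; the names raw_write and raw_exit are hypothetical. On that platform the syscall number goes in rax (1 for write, 60 for exit) and the first three arguments in rdi, rsi and rdx.

```c
/* Assumes x86-64 Linux and a GCC-compatible compiler; no headers needed. */

/* Issue the write syscall directly. Returns the number of bytes
   written, or a negative errno value on failure. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)                     /* result comes back in rax     */
        : "a"(1L),                      /* rax = 1, the write syscall   */
          "D"((long)fd),                /* rdi = file descriptor        */
          "S"(buf),                     /* rsi = buffer address         */
          "d"(len)                      /* rdx = byte count             */
        : "rcx", "r11", "memory"        /* syscall clobbers rcx and r11 */
    );
    return ret;
}

/* Issue the exit syscall directly; this never returns. */
void raw_exit(int code)
{
    __asm__ volatile (
        "syscall"
        :
        : "a"(60L),                     /* rax = 60, the exit syscall   */
          "D"((long)code)               /* rdi = exit status            */
        : "rcx", "r11", "memory"
    );
    __builtin_unreachable();
}
```

A truly library-free build would also need its own _start entry point that calls these two functions, compiled with something like -nostdlib -static, since there is no C runtime left to call main() or to exit afterwards.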
Of course, one cannot argue that this experience isn’t incredibly educational, or that it doesn’t decidedly answer the original ‘What if…’ question.
 
Here’s a link to my own software library that includes serial output functions, and a “hello world” program that prints out a heartbeat and echoes typed chars in hex and is 9K bytes long… everything included, including the interrupt table, serial port setup, and ISR for sending characters.
https://hackaday.io/project/177652-arduino-libraries-and-test-programs
I use the code extensively in my projects and for clients, specifically because it’s tiny compared to the STDIO library, it has a tiny memory footprint, and all code is available for verification. (FAA certification does not allow code that will never be called, such as the %p formatting function).
There’s no huge stdio buffer: if you queue up more than 16 characters, the system will block until the queue drains. That’s usually not a problem, since you usually only use serial output for debugging, and it’s more important to use almost no memory than it is to avoid blocking.
At some point in the process the code has to be able to send a single char out the serial port, and that can be either an OS function call, a mapped I/O register that the OS allows access to, or (in the case of Arduino) the I/O register itself.
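The blocking 16-character queue idea can be sketched roughly as below. This is not the code from the linked library; every name here (tx_putc, tx_isr_stub, fake_uart_data and so on) is hypothetical, and the transmit interrupt is replaced by a plain function so the sketch runs on a desktop machine.

```c
#include <stdint.h>

#define TX_QUEUE_SIZE 16u            /* the 16-character limit mentioned above */

uint8_t tx_queue[TX_QUEUE_SIZE];     /* ring buffer storage */
volatile uint8_t tx_head;            /* index of the next free slot */
volatile uint8_t tx_tail;            /* index of the oldest queued byte */
volatile uint8_t tx_count;           /* bytes currently queued */

volatile uint8_t fake_uart_data;     /* stand-in for the UART data register */
unsigned tx_sent;                    /* total bytes "transmitted" so far */

/* Stand-in for the transmit-ready ISR: move one byte to the "UART".
   Returns 1 if a byte was sent, 0 if the queue was empty. */
int tx_isr_stub(void)
{
    if (tx_count == 0)
        return 0;
    fake_uart_data = tx_queue[tx_tail];
    tx_tail = (uint8_t)((tx_tail + 1u) % TX_QUEUE_SIZE);
    tx_count--;
    tx_sent++;
    return 1;
}

/* Queue one character, blocking while the queue is full. */
void tx_putc(char c)
{
    while (tx_count == TX_QUEUE_SIZE) {
        /* On real hardware this loop would simply spin until the ISR
           frees a slot; here the stub is called so the sketch runs. */
        tx_isr_stub();
    }
    tx_queue[tx_head] = (uint8_t)c;
    tx_head = (uint8_t)((tx_head + 1u) % TX_QUEUE_SIZE);
    tx_count++;   /* on an MCU, protect this update from the ISR */
}
```

The whole transmit path costs 16 bytes of buffer plus a few bytes of bookkeeping, which is the trade being described: a caller that gets ahead of the serial port waits instead of consuming more RAM.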
Beyond that, I don’t see how “Hello World” takes up even 16KB.
Heh, or why Balena Etcher is 450 megs… To write images to SD cards. Think of the electricity that this sort of bloat and its overhead must be responsible for.
No doubt they wanted to write it in JavaScript, so had to include an entire embedded browser.
It’s a bloated Electron app.
For Windows development, I normally use Winforms or WPF. A year or so back, not wanting to be a luddite, I tentatively essayed my first WinUI3 project. However, I didn’t want to distribute it via the Microsoft Store, so generated a local build.
A simple Hello World level program resulted in dozens of files (IIRC) with a combined size of (again, IIRC…) north of 130MB.
I have gone back to WPF/WinForms.
To be fair they didn’t exactly hide the fact it’s bloated. You might have had a point if they called it the Tardigrade Etcher.
A video about a piece of code that could be explained by simply showing the source is like that 450 MB etcher written in Electron.
There’s no code at that project, or I’m blind
As an optician, you’re not blind.
There’s a GitHub link in the project that points to the code.
https://github.com/ToolChainGang/AtmegaLib
The project has a GitHub link for the code.
I have a similar thing here,
https://github.com/T3sl4co1l/Reverb/blob/master/console.c
This project is among my “do it yourself and understand why before using libraries” sorts of projects, so it makes some peculiar choices at times, I’m sure; this version is also rather old (not that the core console / command / bufserial trio has changed much over the years). Anyway, these basically implement a primitive interactive shell over serial (the relevant parts of bufserial.c and .h can be changed for most any platform), which I use for bespoke “printf debugging” and basic dev testing (e.g. direct port read/write, SPI and I2C drivers for testing external peripherals, etc.).
I half expected an asm call, seems like the easy way to do it.
I’m kind of surprised that GCC doesn’t have a built-in function for syscall instruction.
Obligatory older non-video link:
https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
It is always a trade-off between quick and dirty vs. highly optimized code. We have an embarrassment of riches in terms of memory and processing power today. Is it better to use a bunch of RAM or programming time? If the code runs fast and terminates, like a utility, it’s probably not worth it to optimize (like etcher). If it runs continuously like a spreadsheet, a browser, or a database, it makes more sense to tune it more. It would be nice if it were easier to rip out just the functions you need from a library rather than including the entire bundle.
Last time I checked (in gcc-avr, not x86), that’s exactly what happens during the linking stage of building an executable: all unused functions are removed from libraries and from your own code. It’s called garbage collection, if I’m not mistaken.
Removing unused functions is a form of link-time optimization, or LTO.
The advantage of an MCU is there is no OS if you code bare metal. Prior to the main function it just needs to set up the C runtime environment by initializing variables. And there is nothing to return to after main.
Is there nothing akin to a COM file and some basic interrupts? To get this down to double digit bytes?
not anymore, a.out format was deprecated
time for another muppetlabs link lol
https://www.muppetlabs.com/~breadbox/txt/mopb.html
Is it still linking against crt1 or something?
Copy and paste the glibc or musl fprintf function
Calls to uncertified dependencies now a serious issue?
Not too far from the recent article about the person who wrote his “hello world” as a boot loader. No longer “shackled” by that pesky OS… ;).