The Linux X86 Journey To Main()

November 5, 2021

Have you ever had a program crash before your main function executes? It is rare, but it can happen. When it does, you need to understand what happens behind the scenes between the time the operating system starts your program and the time your first line of code in main executes. Luckily, [Patrick Horgan] has a very detailed tutorial on the subject. It doesn’t cover statically linked libraries but, as he points out, if you understand what he does cover, that’s easy to figure out on your own.

The operating system, it turns out, knows nothing about main. It does, however, know about a symbol called _start, which your runtime library provides. That code does some stack manipulation and eventually calls __libc_start_main, which is also provided by the library.

From there, you wind up with some trickery to manage the program’s environment, and more library calls such as __libc_init_first and __libc_init do additional setup work. You’d think that would get you close, but there’s plenty more to do, including setting up for atexit and thunking for position-independent code, not to mention dynamically linked libraries.

This is one of those topics that seems like something you don’t really need until you do. Even if you use another language to generate executables, they all have to follow these steps somewhere. Granted, for many languages the startup is static and unlikely to require you to debug it, but it is still good to know what’s going on under the hood.

If you want a quick Linux assembly tutorial, have at it. If you prefer to shovel your assembly into a C source code file, you can do that, too.

7 thoughts on “The Linux X86 Journey To Main()”

Ghent the Slicer says:

November 5, 2021 at 7:33 pm

The info is gcc/glibc specific; if you are using a different compiler or runtime library, the mechanics are not the same. Much of this is dictated by the glibc ABI.

The ELF loader is a piece of code that gets executed first in your process for dynamically linked ELF binaries. It is responsible for loading all dependent “shared objects,” mapping them into memory, and resolving linker relocations, that is, adapting your code to the virtual address it was loaded at. The loader also calls an array of “initializer” functions specified in each binary, and finally it passes control to the main binary’s “entry point”. The loader binary is specified in the ELF file itself, typically something like /lib/ld-linux.so.2 for 32-bit binaries.

What is described is the mechanics of the code at the entry point of a binary compiled with gcc and the glibc runtime library. The described code itself is part of the glibc library. If you are using newlib for embedded systems or bionic for Android, the mechanics are slightly different even though the gcc compiler is the same.

1. M says:
  
  November 5, 2021 at 8:38 pm
  
  Would the mechanics be different then if one were compiling against musl? Are there any good explanations of that out there?
  
  1. Ghent the Slicer says:
    
    November 6, 2021 at 3:13 am
    
    Yes, musl would be slightly different. For example, look at the code for the exit function:
    https://git.musl-libc.org/cgit/musl/tree/src/exit/exit.c
    There is a call to __stdio_exit(), which was not in the diagram in the article.
    
2. let's rape children says:
  
  November 6, 2021 at 2:06 am
  
  > implying that other compilers are anywhere near GCC quality of generated code.
  
  It became a standard compiler for a good reason, bro.
  
Ehud Gavron says:

November 6, 2021 at 1:44 pm

Great analysis by the original author and the commenters. Gentlemen and ladies, great job.
I don’t think ANYONE has EVER put this kind of an analysis of program startup and initialization
together since VAX/VMS. (My apologies to the SNA fans…)

1. animal717 says:
  
  November 6, 2021 at 3:08 pm
  
  About a decade ago, this site, https://opensecuritytraining.info/Training.html, had a class called Life of Binaries and many more classes. Now they are back with updated/updating content at https://ost2.fyi/. I added spaces in case links can’t be uploaded here, as I don’t know if my original post went through. I make no money from them, nor am I affiliated with them (the links). I’m only sharing these links to pass on what had been so kindly provided to me before.
  Just throwing it out there, ya know: receive and give back in kind. Thanks to hackaday.com also for being a great site to check every day.
  
Brad says:

November 6, 2021 at 4:44 pm

Imagine if the file system from which you paged in the first page of your executable, the one which exec*() was looking at, suddenly got very, very slow… before you paged in all your shared libraries.

It’s a dark tunnel, at night, and suddenly you are driving on jello.


Hackaday
