Linux Kernel From First Principles

Want to learn the internals of the Linux kernel? Version 6.5-rc5 has about 36 million lines of code in it, so good luck! [Seiya] has a different approach. Go back to the beginning and examine the 0.01 version of the kernel. Now you are talking about 10,000 lines and, removing comments and blanks, way less.

Sure, some things have changed, but the core ideas are the same. [Seiya] reports, “Reading V0.01 was really [fun] for me. It was like visiting Computer History Museum in Mountain View…”

There were only 66 system calls in that antique kernel. Some important features like mounting did not work yet. The sys_mount call simply returns -ENOSYS, for example. Some functions like the built-in strcpy were hardcoded for i386 CPUs — obviously, that’s changed today.
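To make the "returns -ENOSYS" idea concrete, here is a minimal user-space sketch in C of a system call table where one entry is a placeholder. This is not code from the 0.01 sources: the names `syscall_fn`, `demo_table`, `do_syscall`, and the two slot numbers are made up for illustration, and real handlers take arguments that are omitted here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type; real kernel handlers receive arguments. */
typedef int (*syscall_fn)(void);

/* Illustrative slot numbers, not the real 0.01 numbering. */
enum { DEMO_NR_GETPID = 0, DEMO_NR_MOUNT = 1 };

/* A handler that does real (if trivial) work. */
static int demo_getpid(void)
{
    return 1; /* pretend we are always process 1 */
}

/* A placeholder: the call exists in the table but is not implemented. */
static int demo_mount(void)
{
    return -ENOSYS; /* "function not implemented" */
}

static const syscall_fn demo_table[] = {
    [DEMO_NR_GETPID] = demo_getpid,
    [DEMO_NR_MOUNT]  = demo_mount,
};

#define DEMO_NR_SYSCALLS (sizeof(demo_table) / sizeof(demo_table[0]))

/* Dispatch by number; out-of-range numbers also report -ENOSYS. */
static int do_syscall(size_t nr)
{
    if (nr >= DEMO_NR_SYSCALLS)
        return -ENOSYS;
    return demo_table[nr]();
}
```

Calling `do_syscall(DEMO_NR_MOUNT)` yields `-ENOSYS`, the same answer as asking for a number that is not in the table at all. That is the appeal of a stub like this: callers get a well-defined error instead of a crash, and the slot can be filled in later without changing the numbering.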

The kernel supports a small number of devices, including an ATA disk controller, a PS/2 keyboard, a VGA display in text mode, and some system clocks and timers. No need to worry about running a GUI like X or Wayland on this kernel!

Some of the comments are amusing in retrospect. For example: “schedule() is … GOOD CODE! There probably won’t be any reason to change this,…” Of course, there were lots of reasons to change it, and now there are many options for different use cases.

This turned out to be a read-only endeavor because — surprise — the kernel code isn’t able to compile with modern compilers. It didn’t seem worth the effort to modify the source, but reading it is certainly an interesting exercise.

We are big believers in learning things by going back to first principles. Works for Doppler radar. Multirotor drones, too.

13 thoughts on “Linux Kernel From First Principles”

  1. Linux used in 737 MAX software?

    How many modules? How many lines of code in each module?

    F-35 53% C and 35% C++, we read.

    Same questions for Navy F-35?

    Software modules no greater than one page of code for compliance with Boeing hardware engineers software standards.

    Standards in place before 1966 and in 1980.

    C/C++ industries caused maximum software module length to change?

    1. Old coding standard. I’d say today (and maybe in the past too) that demanding modules be less than a page would actually be harmful. It would spread cohesive functionality that should really be kept together across multiple modules, hurting maintainability.

      Better to train engineers on refactoring browsers than to over-factor the code.

      As to 737 MAX, come on. My understanding is the failures were application functionality related.

      Are you calling a function a module or a code file a module? I think even functions can acceptably be longer than a page. It’s a red flag, but that just means a reason to take a harder look at it to make sure it’s semantically cohesive and wouldn’t benefit from refactoring.

  2. Wasn’t there a paperback book that went through some early (but not that early) version of the kernel source code and explained what each piece did? I seem to remember seeing that in a bookstore, really wanting it but knowing I would never have the time to go through it.

    1. You’re presumably thinking of “Lions’ Commentary on Unix” by John Lions. It’s a book with a colorful history, beginning as a college text on the V6 kernel and then being banished by AT&T for spilling trade secrets before resurfacing commercially.

    2. My guess is it was Tanenbaum’s MINIX book. I had fun with that, building new kernels on a twin 360k floppy PC clone. Or maybe it was Comer’s book on XINU, for an LSI-11/03, which pre-dates Tanenbaum for the idea of a small OS for students to build and study. If it was Lions’ commentary you saw, it was probably a photocopy.

  3. Interesting analysis by Seiya Nuta. But keep in mind that today, Linux is compatible with almost any hardware that exists and implements very complex software technologies designed for large computing centers: from file systems like ZFS, to support for virtualization via system calls (KVM), to mitigating CPU exploits such as those that spawned the Spectre and Meltdown attacks. That primitive embryo only sought to reproduce the basic functionality of Unix in a limited scenario. It was almost a proof of concept. It is also true that it is very interesting to see the basic concepts in that code. You learn a lot. But for embedded development, it’s probably better to think in terms of a microkernel instead of a monolithic kernel.
