Linux as a Library: Unikernels are Coming

If you think about it, an operating system kernel is really just a shared library that offers services to many programs. Of course, it is a very powerful library, but still, its main purpose is to provide services to programs. Your program probably doesn’t use all of the myriad services the kernel provides. Even a typical system might not use everything a typical kernel contains. Red Hat has a new initiative to bring a technology called unikernels to the forefront. A unikernel is a single application linked with just enough of the kernel for it to execute. As you might expect, this can result in a smaller system and better security.
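To make "linked with just enough of the kernel" concrete, here is a toy sketch of how a build might work out what to include: start from the services the application calls directly and follow their dependencies. The subsystem names and the dependency graph are invented for illustration; they do not come from Red Hat's project or from the real Linux source tree.

```python
# Hypothetical kernel subsystems and what each one depends on.
# These names and edges are made up for illustration only.
KERNEL_DEPS = {
    "tcp": {"ip", "timers"},
    "ip": {"netdev"},
    "netdev": {"interrupts"},
    "timers": {"interrupts"},
    "interrupts": set(),
    "ext4": {"block"},
    "block": {"interrupts"},
    "usb": {"interrupts"},
    "sound": {"interrupts", "timers"},
}


def services_to_link(app_needs, deps=KERNEL_DEPS):
    """Return every subsystem reachable from what the app uses directly."""
    linked = set()
    pending = list(app_needs)
    while pending:
        service = pending.pop()
        if service in linked:
            continue  # already pulled in through another path
        linked.add(service)
        pending.extend(deps[service])
    return linked


# A hypothetical web server that only ever talks TCP.
web_server = services_to_link({"tcp"})
print(sorted(web_server))
# ['interrupts', 'ip', 'netdev', 'tcp', 'timers']
print(sorted(set(KERNEL_DEPS) - web_server))
# ['block', 'ext4', 'sound', 'usb']
```

Everything in the second list is code the image never has to carry, which is where the size and attack-surface savings would come from.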

It can also lead to better performance. The unikernel doesn’t have to maintain devices and services that are not used. Also, the kernel and the application can run in the same privilege ring. That may seem like a security hole, but if you think about it, the only reason a regular kernel runs at a higher privilege is to protect itself from a malicious application modifying the kernel to do something bad to another application. In this case, there is no other application.

This isn’t a new idea. Embedded operating systems have long built the application in with the kernel. However, Red Hat wants to bring Linux and an open community into the unikernel landscape. The idea is that, unlike other projects, this one will be based on a Linux that is actively developed and maintained. According to Red Hat, previous systems either didn’t use Linux or mutated Linux to the point that it no longer benefited from the Linux community’s development efforts.

Linux has wormed its way into many embedded systems and it is easy to see how a unikernel would be handy for that or for some network appliances. Of course, you could always use a classic RTOS. For some applications, you might even consider just a basic framework like Mongoose.

47 thoughts on “Linux as a Library: Unikernels are Coming”

    1. It need not be statically linked. If you are bringing the Linux kernel in, you could pretty well bring a dynamic linker/module loader in one form or another as well. It is not something usually done in a typical RTOS or embedded application, but there we are talking about tiny apps compared to the size of the Linux kernel.

      They could also provide a licensing exception similar to how the kernel handles the distinction between kernel and user space – ultimately it is all code and function/system calls, so Linus made it explicit that the kernel GPL doesn’t extend beyond the kernel API, regardless of how that API is invoked. If you aren’t using the kernel symbols, you aren’t covered by the kernel license. So something like packaging a kernel with an application where the two talk only via the system call API (as they would do in a normal system) would likely be fine.

      I am sure the RedHat folks aren’t dumb and they do realize that GPL is a non-starter for this type of embedded code.

      1. I think the rumpkernel project is dead. No activity since May and I’ve heard the author has gone full time into brewing beer.

        So for academic purposes, it would be handy to have a heavyweight Unikernel with good Linux compatibility. Not sure how practical it is in the real world.

  1. Another crap idea from RedHat. Satellite 6? Horrible performing management tool which constantly has bugs (due to their “sauce” on top of foreman/katello). OpenShift? (Or openShit as I like to call it) Being sold as the latest, greatest thing.. but you end up being the paying RH beta tester. Atomic hosts? Great! Hope you didn’t require any supported, decent monitoring tools in your corporate environment. OpenLDAP? Let’s “deprecate” that in favor of – what did you guess – our paying IDM. Red Hat is becoming more and more like MS.

    1. And you didn’t even mention the much reviled ‘SystemD’

      Interesting idea, on the one hand coupling application with only *parts* of the OS, while on the other we have camp LP striving to ossify as much as possible into one giant monolithic block. Unikernels! Systemderp! Ready!? FIGHT!!

      We just need more people putting mustard on our ice cream telling us we’re change averse because this is New so it’s Better lalala can’t hear you.

    2. Firstly, OpenShift is fantastic. I’ve had nothing but success with it for my hybrid deployments. Also, IdM is free. FreeIPA. It’s in CentOS and Fedora :). They chose 389DS over OpenLDAP because OpenLDAP does not have the flexibility and maturity of 389. I will never go back to OpenLDAP until the upstream developers pull their heads out of their ass and the IdM folks decide it’s time to port over. Thirdly, Satellite is garbage. It’d be better if they stuck 100% upstream and not nitpicked versions.

      Google. Research. Not difficult.

      1. No it isn’t. RH is staying as a “separate” entity with intact management and engineering. There were multiple announcements to the public, vendors and employees about that. It is even part of the published agreement. RH does not have much in terms of IP and if anything like you suggest happens, core people will start leaving (that was also explained to IBM very clearly) and all that will be left will be a 34bn logo and buildings. If IBM upsets upstream projects, they will stop taking patches and you are in trouble again. IBM people aren’t stupid and they know this. I am actually cautiously optimistic here.

  2. Why not just use the already existing and well-used kernel build tools to turn off and remove kernel features your application doesn’t need? That’s already SOP for embedded systems which use Linux. Putting the kernel into the same application space seems extreme and unnecessary. But I guess if they are talking about single-threaded applications run in a VM cloud setup then maybe. Seems pretty specific though.

    1. Spectre and Meltdown mitigations are destroying VM performance. Get rid of kernel context switching entirely and the application in your VM is much faster. This isn’t going to compete with RTOS, but instead complement VM container systems.

      Everything RedHat does these days is for cloud services.

  3. I’m not sure I understand how this would be used in practice. Forgive me, operating systems aren’t my thing, so this question might be stupid, but:

    As I understand it, the need for privilege separation comes from having several applications, or rather, tasks, that shouldn’t be able to overwrite each other. That’s easy enough to understand that my Firefox tab running code from Google shouldn’t be able to overwrite my VLC process playing a video. Okay, simple enough, and the argument is that for a single-purpose embedded box, there’s only ever one thing happening at a time?

    But even embedded systems usually have networking now, don’t they? Standalone stuff hardly benefits from Linux in the first place. And don’t we want to separate the networking stack, which might be tricked into running code from heaven-knows-where, from the main application?

    I feel like I’m missing something.

    1. Yes, how do you run two or more such things at once? That bothered me too. The classic kernel isolates process memory spaces so they can share the system. This unikernel thing seems to be a kick-the-can, in that, within the unikernel bubble the application is perfectly isolated – but try to put two on the system at the same time, and suddenly you have to isolate the unikernel bubbles from one another.
      Maybe there’s some nuance we’re missing here, like maybe hypervisors are better at memory isolation (just speculation, IDK how it could be), so unikernels hand that chore off to a hypervisor? But if that were the case, didn’t we end up with the same picture we started with, we just moved all the labels around?
      Or maybe it’s just another smoke screen marketing type thing. Something to cloud the real issues. I used to like RH but lately not so much.

      1. My impression is that this is for things where you’re figuring out which combo of virtualization and container you might use for your singleton service. If your goal is to run a single thing on a “server” in the cloud, there are multiple layers of protection, and the top layers are effectively protecting yourself from yourself (because there’s nobody else in the container). The system-call interface is pretty wide, because it’s covering a ton of stuff like user and group capabilities, while the vm interface is potentially much narrower since it doesn’t care about many of those things. So a unikernel lets you strip away a lot of the overhead of that high-level interface.

        Hmm. The underlying cloud compute systems need to treat everything above the vended hardware interface as potentially malicious regardless of how you structure it. In a traditional multi-user/service Unix system, the kernel also treats everything above the syscall interface as potentially malicious. But in a Unix system running a single service, once that service is compromised, it is compromised; the syscall interface doesn’t provide the service much protection from _itself_. So the kernel’s protections might not be worth their cost.

        It is true that you can structure a service to carefully arrange its processes and interfaces to keep compromises contained. For instance, if a process cannot write to disk, then it probably cannot be compromised persistently across a reboot. But very few systems make any pretense of attempting this level of craft, and very few engineers are capable of really carrying it off.

    2. Process separation is mostly done with virtual memory, and has little to do with privileges. The separation works fine even at kernel level. Of course, having full privilege means that you can overwrite another process if you really want to, but that’s actually true even for unprivileged processes, using different methods. The added security of user and kernel privilege is that you can’t access the kernel stuff from user mode.

      Also, bundling the app and the kernel together doesn’t mean you can’t have several processes. The kernel certainly needs contexts anyway for interrupts and all, so adding threading and multi-processing doesn’t need much more. And usually the network stack itself is inside the kernel and usually not vulnerable. Not that it can’t be, but for example the TCP/IP stack is so common (and not that complex) you can consider it mostly secure. The vulnerable networking part is almost always inside the main app.
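The point about virtual memory doing the separating can be seen on an ordinary Unix system with a small experiment. This is a minimal sketch, assuming a platform where `os.fork()` is available; the function name is made up for the example. Parent and child run at the same unprivileged level, yet a write in the child never shows up in the parent, because each process has its own address space.

```python
import os


def child_write_is_invisible_to_parent():
    """Fork, change a value in the child, and report what each side sees."""
    data = ["parent value"]
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this write only touches the child's copy of the page.
        os.close(read_fd)
        data[0] = "child value"
        os.write(write_fd, data[0].encode())
        os._exit(0)

    # Parent: learn what the child saw, then look at our own copy.
    os.close(write_fd)
    child_view = os.read(read_fd, 1024).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data[0], child_view


if __name__ == "__main__":
    parent_view, child_view = child_write_is_invisible_to_parent()
    print("parent sees:", parent_view)
    print("child saw:  ", child_view)
```

No privilege ring is involved in keeping the two copies apart here; the page tables do that work, which is why the same separation could in principle survive inside a unikernel.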

  4. So they watched Google make Chrome OS and are now thinking they could do it programmatically for any application, not just Chrome…

    I mean we did this in the long long ago…. back in the olden days of DOS having a boot disk for a particularly resource intensive game that only loaded the components needed for that game wasn’t that weird…

    1. > So they watched Google make Chrome OS and are now thinking they could do it programmatically for any application, not just Chrome…

      they also decided that they should run the application as root, because why not give Chrome or whatever access to vulnerable firmware interfaces to flash malware onto the hardware?

    2. “I mean we did this in the long long ago…. back in the olden days of DOS having a boot disk for a particularly resource intensive game that only loaded the components needed for that game wasn’t that weird…”

      That’s the first thing I thought of as well. The thing is, is this even necessary for the mainstream? Sure, it will simplify building embedded systems, but there are good reasons the Linux kernel has remained monolithic through the years.

  5. > Also, the kernel and the application can run in the same privilege ring. That may seem like a security hole, but if you think about it, the only reason a regular kernel runs at a higher privilege is to protect itself from a malicious application modifying the kernel to do something bad to another application.

    no, that’s not the only reason. the application could do bad things to hardware or firmware, possibly bricking the device or installing impossible-to-remove malware.

    1. But what you miss is that unikernels aren’t meant for hardware; hell, they aren’t even meant for paravirtualization. The idea is that full virtualization costs can be drastically reduced. And on the licensing front, this will likely be software that organizations deem “internal”. It may support getting your medical records to show up in a browser, but you’ll never know what they do in their walled garden, or whether they chose a Linux unikernel, OSv, or one of the other implementations, so have fun pushing for source releases of GPL components. Make no mistake: this is driven by companies running hundreds of thousands or millions of VMs or containers today.

        1. Not CoreOS. The problem is that CoreOS and container-centric platforms give false promises of segregated process space. Perhaps if you were looking forward instead of trying to poke holes at evolution of managed applications you’d see the clear differences. Also I don’t see a licensing issue with the GNU/Linux kernel, as the kernel isn’t a build step per se but rather calls the process as its init process, though often that means additional modules may need to exist to handle network stacks and protocols such as DHCP or any advertising steps needed at a “host” level. This is what containers were often marketed as: micro-VMs that have 150 MB of disk and where almost all the mem/CPU belongs to the app, as there is no ssh, no mutable filesystems, no not, no services to speak of. And Red Hat, like always, is just late to the party, but in this case no other Linux backers are at the party either.

          1. “Perhaps if you were looking forward instead of trying to poke holes at evolution of managed applications you’d see the clear differences.”

            Phew! Dodged a bullet there.

          2. Docker in production environments is a mess from a security perspective, and CoreOS is no exception. I’ve done plenty of gigs proving to clients that they need to still be very mindful of what they put into the same k8s clusters running on CoreOS. The best I’ve found for isolation so far has been SmartDataCenter/Triton from Joyent.

      1. BTW this is intended for the open source world, as that is where the OS code comes from, and the distribution would no doubt also require the disclosure of the application code.
        Are you saying that the “many pairs of eyes” security model we have heard about so many times doesn’t work here?

        1. People make mistakes everywhere, which is why you layer security, meaning one mistake does not completely compromise the system. Removing a layer of security increases this risk, and thus shouldn’t be done lightly.

      2. What kingdom? You gave up one VM without ssh or userland utilities? And thanks to the immutable image strategy, you deploy a new version and this compromised image is no longer an entry point. There is no kingdom in this model without attacking the hypervisor, and since many of these use your application as PID 1, you don’t have full OS installs. It feels like that’s being lost on many here. If you want to see a functional implementation based on the fbsd kernel go look at Unik as a packager.

  6. Seems to me like reductionism to the point at which you’re really just running applications in a VM like Java.
    But instead we have a kernel that wasn’t designed to work that way. In my view they would be better off proposing doing it using microkernels such as L4. There are both FOSS versions and commercial versions easily enough, doesn’t even need an RTOS.

    1. There are plenty of variants out there that focus on containers (boot2docker, CoreOS, RancherOS, Project Atomic) but what containers don’t give you is true process segregation. This is meant to be run as an app in a tiny VM, and your app is basically injected into an existing kernel effectively as an embedded init process. This assumes your hypervisor handles guest segregation properly and doesn’t have issues with neighbor memory snooping, etc. I’ve been using unikernels for some of my native applications but none of my JVM work has really felt stable on them yet (big limitations surface around large thread counts and trying to spawn new processes, so many web server applications are a bit of a miss). I have enjoyed the experience with both vSphere and KVM/Qemu management of VMs for unikernel applications.

  7. Hmm… so what’ll manage the memory space for these unikernel+application things?

    Though, having a Kernel+application rolled into one to me sounds like a bare-metal application: i.e. an application with a single set of purposes.
    Take for example an Atmel running firmware to get UART signal from an ESP32 and spit out the response… memory managed by coder as it’s hard coded except the firmware manages a few buffers for remembering UART command, where the dials are if any and so on.
    Hardware is managed by the firmware for GPIO connected stuff…

    On the other hand… an operating system that manages the memory space and hardware somewhat but can be taken over by an app… feels like something I’ve heard was exploited by Amiga A1200 demos, I think. Taking over the machine gave the demoscene the bare-metal access it needed to cram so much into those floppies. I may be wrong and confusing it with the A500… however, the OS was designed without memory protection, allowing for such hardware or OS abuse and thus amazing demos. I may still be incorrect; I’m catching up on the Amiga scene at the moment.

    The unikernel idea sounds like it’ll break, or not be able to truly implement, memory protection in this case, but it may be good for tightly integrated designs that’ll never see rogue data in their normal installation, like the semi-isolated Atmel central-controller idea above.

    1. To be fair, memory protection wasn’t really a thing back when the A500s and A2000s were being made, and both used the 68000. That said, IIRC, the full versions of the 68020, 030, 040 and 060 had an MMU but those wouldn’t find their way into Amigas for a while.

  8. No.. just no.

    An operating system is not a library. It’s a program that creates and maintains a hardware abstraction layer so userland code doesn’t have to rely on a thicket of overlapping but incompatible libraries.

    This whole article can be paraphrased, “Hey, you know what was awesome? Early 1990s Windows 3.1 DLL hell.”

  9. An operating system is a “library” in the sense that it provides the functions. For many a small computer, the Microsoft BASIC in ROM had needed I/O, but couldn’t be used by other programs. So when I bought a disk assembler for my Radio Shack Color Computer, I got the assembler, and a simple operating system, which I doubt was used by other software. So early programs included the fundamentals of an operating system, and each application, running on bare metal, had to code those functions by themselves.

    That said, is this new? I thought with Unix, installing meant compiling a kernel for the system. Even in the early days of Linux, it wasn’t a necessity, but seen as a good thing. Why include drivers for hardware you didn’t have, or functions you didn’t need? I think I did it once. But even as the kernel bloats to handle endless hardware, and functions not everyone uses, it becomes a less common thing, since most have enough memory to not notice the kernel bloat. Or modules mean it’s all readily available, but not loaded until needed.

    Even with Microware OS-9 in the eighties, users were told to strip out the modules not needed.

    There are some Linux HOWTOs about dedicated things like web browsers for public places, which boot up and launch the browser, no intermediate step for the user. I recall they did strip out things, but I can’t remember, and likely it didn’t deal with the kernel.

    Michael
