It’s All In The Libs – Building A Plugin System Using Dynamic Loading

Shared libraries are our best friends to extend the functionality of C programs without reinventing the wheel. They offer a collection of exported functions, variables, and other symbols that we can use inside our own program as if the content of the shared library were a direct part of our code. The usual way to use such libraries is to simply link against them at compile time, and let the linker resolve all external symbols and make sure everything is in place when creating our executable file. Whenever we then run our executable, the loader, a part of the operating system, will again try to resolve all the symbols, and load every required library into memory, along with our executable itself.

But what if we didn’t want to add libraries at compile time, but instead load them ourselves as needed during runtime? Instead of a predefined dependency on a library, we could make its presence optional and adjust our program’s functionality accordingly. Well, we can do just that with the concept of dynamic loading. In this article, we will look into dynamic loading, how to use it, and what to do with it — including building our own plugin system. But first, we will have a closer look at shared libraries and create one ourselves.

Note that some details may vary on different architectures, and all examples in here are focusing on x86_64 Linux, although the main principles should be identical on other systems, including Linux on ARM (Raspberry Pi) and other Unix-like systems.

Building A Shared Library

The first step toward dynamically loadable libraries is the normal shared library. Shared libraries are just a collection of program code and data, and there is nothing too mysterious about them. They are ELF files just like a regular executable, except they usually don’t have a main() function as entry point, and their symbols are arranged in a way that any other program or library can use them as needed in their own context. To arrange them that way, we use gcc with the -fPIC option to generate position-independent code. Take the following code and place it in a file libfunction.c.

int double_me(int value)
{
    return value + value;
}

Yes, that’s all there is going to be: a simple function double_me() that will double a given value and return it. To turn this into our own shared library, we first compile the C file as a position-independent object file, and then link it as a shared library:

$ gcc -c -fPIC libfunction.c
$ gcc -shared -o libfunction.so libfunction.o

Of course, we can combine it into a single call to gcc and avoid the intermediate object files. Note that you might want to add a soname with the -Wl,-soname,<name> option, and add some versioning to the output file, but for simplicity, we leave that out now.

$ gcc -shared -fPIC -o libfunction.so libfunction.c

Either way, we now have our own shared library, so let’s go right ahead and use it.

// file main.c
#include <stdio.h>

// declare the function, ideally the library has a .h file for this
int double_me(int);

int main(void) {
    int i;
    for (i = 1; i <= 10; i++) {
        // call our library function
        printf("%d doubled is %d\n", i, double_me(i));
    }
    return 0;
}
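The comment in main.c notes that the library would ideally ship its own header file. A minimal sketch of what that could look like follows; the file name libfunction.h and the include guard name are assumptions for illustration, and both parts are shown in one listing only to keep the example short.

```c
// file libfunction.h (hypothetical name, not part of the original example)
#ifndef LIBFUNCTION_H
#define LIBFUNCTION_H

// returns twice the given value
int double_me(int value);

#endif // LIBFUNCTION_H

// libfunction.c would include the header as well, so the compiler can
// check that the declaration and the definition agree
int double_me(int value)
{
    return value + value;
}
```

With such a header in place, main.c would include it instead of declaring double_me() by hand.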

Now we just have to remember to link against our library when we compile the file, and add our current working directory to the list of paths gcc should search for libraries. Keep in mind that library file names are expected to be in the form of liblibrary_name.so and are then linked via -llibrary_name, so our libfunction.so becomes -lfunction.

$ gcc -o main main.c -L. -lfunction

This should keep the linker happy and give us our main executable. But what about the loader? Will it automatically find our library?

$ ./main
./main: error while loading shared libraries: libfunction.so: cannot open shared object file: No such file or directory

Well that’s a big nope, and it shows that telling the linker (part of the compiler suite) about our library won’t make the loader (part of the OS) magically know about its location. To find out which libraries are required, and whether the loader can currently resolve each of them, we can use the ldd command. To get some more debug output from the loader, we can set the LD_DEBUG=all environment variable when calling our executable.

So in order to make the loader find our library, we have to tell it where to look, either by adding the correct directory to the LD_LIBRARY_PATH environment variable, or by adding it to the ldconfig search paths configured under /etc/. Let’s try it with the environment variable for now.

$ LD_LIBRARY_PATH=. ./main
1 doubled is 2
2 doubled is 4
...
10 doubled is 20

Yes, the loader will now find our library and successfully run the executable.

Dynamically Loading A Shared Library

For our next trick, we will use dynamic loading to read the library into our code at runtime. Once loaded, we can search for symbols in it and extract them to pointers, and then use them as if the library was linked in the first place. Unix and Unix-like systems provide libdl for this. Let’s have a look at how we can call our double_me() function this way.

// dynload.c
#include <stdio.h>
#include <dlfcn.h>

int main(void) {
    // handle for dynamic loading functions
    void *handle;

    // function pointer for the library's double_me() function
    int (*double_me)(int);

    // just a counter
    int i;

    // open our library ..hopefully
    if ( (handle = dlopen("./libfunction.so", RTLD_LAZY)) == NULL) {
        return 1;
    }

    // try to extract "double_me" symbol from the library
    double_me = dlsym(handle, "double_me");
    if (dlerror() != NULL) {
        dlclose(handle);
        return 2;
    }

    // use double_me() just like with a regularly linked library
    for (i = 1; i <= 10; i++) {
        printf("%d doubled is %d\n", i, double_me(i));
    }

    // release the library again
    dlclose(handle);
    return 0;
}

We try to load our library using the dlopen() function, which returns a generic pointer handle on success. We can then find and extract the double_me symbol from our library with dlsym(), passing the previously returned handle to it. If the symbol is found, dlsym() returns its address as void *, which can be assigned to a (preferably matching) pointer type representing the symbol. In our case, a function pointer that takes an int as parameter, and returns int, just like our double_me() function. If all succeeded, we can call the freshly extracted double_me() function as if it was there from the very beginning, and the output will be just the same. Just remember to link against libdl when compiling.

$ gcc -o dynload dynload.c -ldl
$ ./dynload
1 doubled is 2
2 doubled is 4
...
10 doubled is 20
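One detail worth spelling out is how dlsym() failures are detected. A symbol's value may legitimately be NULL, so the usual pattern is to clear the error state with dlerror() first and to check it again after the lookup. The following is a minimal sketch of that pattern; lookup_symbol() is a hypothetical helper name and not part of libdl.

```c
#include <dlfcn.h>
#include <stdio.h>

// Hypothetical helper (not part of libdl): look up `name` in an already
// opened handle and report a failed lookup on stderr.
void *lookup_symbol(void *handle, const char *name)
{
    void *sym;
    const char *err;

    dlerror();                  // clear any stale error message
    sym = dlsym(handle, name);
    err = dlerror();            // non-NULL only if dlsym() failed
    if (err != NULL) {
        fprintf(stderr, "cannot resolve %s: %s\n", name, err);
        return NULL;
    }
    return sym;
}
```

In dynload.c, the lookup could then read double_me = lookup_symbol(handle, "double_me"); followed by a check for NULL, under the assumption that a NULL-valued double_me symbol is not something we would want to call anyway.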

There we go, instead of linking at compile time, we’ve now loaded our library at runtime, and after extracting our symbols, we can use it just as before. Admittedly, using dynamic loading solely as a replacement for the linker isn’t too useful on its own. A more common use for dynamic loading is to extend a program’s core functionality by integrating a plugin system that allows the users to add external components as they need them. A prime example is the Apache webserver that has an extensive list of modules to add individually as one pleases. Of course, we will focus on a much simpler approach here.

Building Your Own Plugin System

Take the good old kids’ game Telephone (or Chinese Whispers, Whisper Mail, Broken Phone, etc). Someone starts with a message and it gets whispered around, and the last child says what the initial message was supposed to be. Well, this sounds like something a bunch of plugins could do by passing a message from one to the other, slightly messing up the data as we go. We’ll write the code to run the telephone system, and anyone can contribute a kid/plugin.

As an API, let’s say that the plugin takes a pointer to the message and a length as parameters and alters the message directly in memory. Let’s simply call it process, so the function would look like void process(char **, int). This is what a plugin with a process() function that sets every second character to uppercase could look like:

// file plugin-uppercase.c
#include <ctype.h>

void process(char **message, int len)
{
    int i;
    char *msg = *message;

    // start at index 1 and touch every second character
    for (i = 1; i < len; i += 2) {
        msg[i] = toupper((unsigned char) msg[i]);
    }
}

Let’s turn it right away into an uppercase.plugin file, and assume we have two more plugins, increase.plugin that increases each digit it finds, and leet.plugin that makes our message just that: l337.

$ gcc -shared -fPIC -o uppercase.plugin plugin-uppercase.c
$ gcc -shared -fPIC -o increase.plugin plugin-increase.c
$ gcc -shared -fPIC -o leet.plugin leet-replace.c
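The source of the other two plugins is not shown here. As an illustration of how little a plugin needs, this is a sketch of what plugin-increase.c might contain, based only on the description "increases each digit it finds". How a 9 is handled is an assumption: this version wraps it around to 0.

```c
// file plugin-increase.c (sketch; the actual implementation may differ)
#include <ctype.h>

void process(char **message, int len)
{
    int i;
    char *msg = *message;

    for (i = 0; i < len; i++) {
        // the <ctype.h> functions expect an unsigned char value
        if (isdigit((unsigned char) msg[i])) {
            // assumption: '9' wraps around to '0'
            msg[i] = (msg[i] == '9') ? '0' : msg[i] + 1;
        }
    }
}
```

Since it exports the same process() symbol with the same signature as the uppercase plugin, the main program does not need to know which of the two it is talking to.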

Our main program would then take a message as first argument, and an arbitrary number of plugin files as the rest of the argument list. It will load the plugins one by one, pass the message along from one plugin to the other through their process() functions, and then print out the result. (For focus, we’re pretending that we live in a perfect little world where errors do not happen.)

// file telephone.c
#include <stdio.h>
#include <string.h>
#include <dlfcn.h>

int main(int argc, char **argv) {
    void *handle;
    void (*process)(char **, int);
    int index;

    if (argc < 3) {
        printf("usage: %s <message> <plugin> [,<plugin>,...]\n", argv[0]);
        return 1;
    }

    // argv[1] is the message, start from index 2 for the plugin list
    for (index = 2; index < argc; index++) {
        // open next plugin
        handle = dlopen(argv[index], RTLD_NOW);
        // extract the process() function
        process = dlsym(handle, "process");
        // call the process function, modifying argv[1] directly
        process(&argv[1], strlen(argv[1]));
        // close the plugin
        dlclose(handle);
    }

    // print the resulting message
    printf("%s\n", argv[1]);
    return 0;
}

Just like before, we load the plugin file (a dynamically loaded shared library), extract the function we need, and execute it — only this time in a loop. So let’s compile and test it.

$ gcc -o telephone telephone.c -ldl
$ ./telephone "hello hackaday" ./uppercase.plugin
hElLo hAcKaDaY
$ ./telephone "hello hackaday" ./uppercase.plugin ./leet.plugin 
h3lL0 h4cK4D4Y
$ ./telephone "hello hackaday" ./uppercase.plugin ./leet.plugin ./increase.plugin 
h4lL1 h5cK5D5Y

As expected, with each plugin altering the input message in its own way, the number and order of plugins given as parameters to our main program affect the final message. Now this may not count for much as a data processing example, but the same concept can of course be used for more useful scenarios. If you’re curious about the full implementation, you can find it on GitHub. Also note that our main program has never changed, and if we decide to make adjustments to one of the plugins, we only have to recompile that one plugin. We could even add mechanisms to the main program to reload the plugins, and we wouldn’t even have to restart the main program itself.
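The telephone program deliberately skips all error handling. Outside of that perfect little world, a missing plugin file makes dlopen() return NULL, and the program would crash on the next call. A hedged sketch of how the loop body could be wrapped with checks is shown below; run_plugin() is a hypothetical helper, not code from the GitHub repository.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

// Load one plugin, run its process() on the message, and unload it again.
// Returns 0 on success and -1 if the plugin could not be used.
// (run_plugin is a hypothetical helper name.)
int run_plugin(const char *path, char **message)
{
    void *handle;
    void (*process)(char **, int);

    handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "cannot load %s: %s\n", path, dlerror());
        return -1;
    }

    process = (void (*)(char **, int)) dlsym(handle, "process");
    if (process == NULL) {
        fprintf(stderr, "%s has no process() function\n", path);
        dlclose(handle);
        return -1;
    }

    process(message, (int) strlen(*message));
    dlclose(handle);
    return 0;
}
```

The loop in main() would then shrink to a call like run_plugin(argv[index], &argv[1]), and the program could decide whether a broken plugin is skipped or ends the game.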

Raspberry Pi GPIO Monitor

One of those more useful scenarios that would follow the same principles could be a program that monitors the GPIO pins on a Raspberry Pi. We’d have different plugins that can all handle any information our main program reads from the GPIOs. Each plugin would have a set of basic functions it can implement: a function for the plugin setup phase, one to handle each GPIO’s state change, and one to tear down the plugin when it’s not used anymore. One plugin could handle input change on one pin to change the state of an output pin, another one could perform some tasks when one specific input pin gets high, and a third one could just write all state changes to a log file.
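To make the three-function idea a bit more concrete, here is one way the main program could keep track of what each plugin offers. All names in this sketch (gpio_plugin, dispatch_change, count_rising) are made up for illustration and are not taken from the actual GPIO monitor. In a real monitor, the three pointers would be filled in with dlsym() after loading each plugin, and a plugin that doesn't implement a function would simply leave that pointer NULL.

```c
#include <stddef.h>

// Hypothetical plugin interface; all names here are illustrative.
struct gpio_plugin {
    int  (*setup)(void);                  // called once after loading
    void (*handle)(int pin, int state);   // called on every state change
    void (*teardown)(void);               // called before unloading
};

// Pass one state change on to every plugin that implements a handler.
void dispatch_change(const struct gpio_plugin *plugins, size_t count,
                     int pin, int state)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (plugins[i].handle != NULL) {
            plugins[i].handle(pin, state);
        }
    }
}

// Stand-in for a real plugin's handler: counts pins going high.
int rising_edges;

void count_rising(int pin, int state)
{
    (void) pin;     // this handler reacts to every pin alike
    if (state == 1) {
        rising_edges++;
    }
}
```

The main loop would call dispatch_change() whenever it notices a pin changing, without caring what the individual plugins do with that information.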

In the end, the dynamic loading part won’t be much different than in the previous example, and going into the details of such a GPIO monitor would go beyond the scope of this article. However, we wouldn’t mention it if we hadn’t implemented it, so a basic GPIO monitor can also be found on GitHub.

Where To Go From Here

With dynamic loading, we have seen an alternative approach to compile-time linking that makes it easier to extend our program’s main functionality with external libraries. While it adds a bit of complexity to extract the symbols from a library, the main principles are rather simple and straightforward: you open a library, you extract symbols from it, and you close it.
However, this simplicity also has its shortcoming: in order to extend a program’s functionality through dynamic loading, we need to know beforehand what we can find in the loaded library or plugin. We cannot simply add a completely new function and hope that our program will magically know about it on the fly. But if you design your core program with these limitations in mind, dynamic loading will give you a flexible way to extend functionality as needed.
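Designing with these limitations in mind can be as simple as agreeing on a symbol name and having a fallback ready when the library or the symbol is missing. The sketch below does that for the double_me() function; resolve_double_me(), double_fn, and the fallback are hypothetical names introduced for this example.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef int (*double_fn)(int);

// Built-in fallback, used when the optional library is not available.
static int double_me_fallback(int value)
{
    return 2 * value;
}

// Return double_me() from the library at `path` if it can be loaded,
// otherwise the built-in fallback. On success the handle is deliberately
// kept open so the returned pointer stays valid while the program runs.
double_fn resolve_double_me(const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY);
    double_fn fn;

    if (handle == NULL) {
        return double_me_fallback;
    }

    fn = (double_fn) dlsym(handle, "double_me");
    if (fn == NULL) {
        dlclose(handle);
        return double_me_fallback;
    }
    return fn;
}
```

The caller gets a usable function pointer either way, which is the optional dependency described in the introduction.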

Note that we’ve opened up a Pandora’s box of security issues. If arbitrary external functions can run within our main code, it’s only as secure as the libraries that it dynamically links to. Abusing this trust is the basis of DLL injection attacks or DLL hijacking. If an attacker can fool the operating system into feeding the calling program their dynamically loadable library, they’ve won.
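A few basic checks before calling dlopen() can at least narrow the problem, although they are nowhere near a complete defense. The sketch below is one possible pre-check and not an established recipe: plugin_path_ok() is a hypothetical helper that insists on an explicit path (a name without a slash makes dlopen() search the library paths) and rejects files that any user on the system could overwrite.

```c
#include <string.h>
#include <sys/stat.h>

// Hypothetical pre-check before handing a plugin path to dlopen().
// Returns 1 if the path passes these basic checks, 0 otherwise.
int plugin_path_ok(const char *path)
{
    struct stat st;

    // Without a slash, dlopen() searches the library paths instead of
    // opening exactly the file we were given.
    if (strchr(path, '/') == NULL) {
        return 0;
    }
    // The file has to exist and be a regular file ...
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode)) {
        return 0;
    }
    // ... that is not writable by everyone on the system.
    if (st.st_mode & S_IWOTH) {
        return 0;
    }
    return 1;
}
```

Note that the file could still be swapped between this check and the actual dlopen() call, so proper vetting of where plugins come from remains necessary.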

Dynamic loading needs support from the operating system, so this isn’t really anything for an 8-bit microcontroller environment. You will always have function pointers though.

Some Words On Looking Up Symbols

You may be thinking that if dlsym() can resolve symbols in a dynamically loaded file, there must be a way to also find the available symbols in the first place, maybe get a whole list of them. Well, yes, common binutils tools such as readelf or nm do just that; nm does so with the help of the Binary File Descriptor library libbfd, while readelf parses the ELF file on its own. Also, the GNU extension of our dynamic loading library libdl offers the dlinfo() function to obtain further information about the loaded file. Some further reading about the ELF file format is recommended before you go down that rabbit hole.

15 thoughts on “It’s All In The Libs – Building A Plugin System Using Dynamic Loading”

  1. Huh. I thought arbitrary code execution was a CWE, not a feature. In all seriousness, if you do this with your software please be aware of (and mitigate) the risk.

  2. “Note that we’ve opened up a Pandora’s box of security issues. If arbitrary external functions can run within our main code, it’s only as secure as the libraries that it dynamically links to. ”

    Vetting, and signing, but then it’s no longer “arbitrary”.

  3. This and statically linked C libraries on portable applications are very important.
    I’ve seen many “portable” apps that are dynamically linked to a one-off shared C library… and it won’t work on any other OS except the version (say Ubuntu 10.2) of OS that it was compiled on: LOL.
    I tried a portable Firefox with ALSA support… that failed because I didn’t have that (exact) version of libc (for example)

    1. Not true as long as the syscall interface to the kernel hasn’t changed. If you replicate the dependent library content on a different host system and either re-point LD_LIBRARY_PATH or chroot the execution environment, it will work the same as if the executable were statically linked.

      1. It is strange as the Firefox build with ALSA support complained about glibc on both a 3.2.65 kernel and on a 4.12 kernel…
        yet I found out roughly the era of ubuntu the portable Firefox was built on by searching repository information (debian) for the expected glibc and found the version of ubuntu that ran the portable app,
        yet palemoon (firefox fork) runs fine on any distro with the various kernel versions and glibc versions:
        I mainly use 3.2.65 due to how long it took me to tweak my OS for maximum battery,
        io linux (kernel 4x) for video editing and a tablet PC with Manjaro (for portability, kernel 4.16, though a reinstall is smoother and less glitchy than an update).

  4. dlopen() isn’t any more or less secure than the OS’s dynamic linker that sets up the initial virtual memory space. Both look at default system search paths (eg. /etc/ld.config) overridden by LD_LIBRARY_PATH to resolve where library objects are resolved. And both assume that such path targets are protected by a file-system ACL.

  5. Dynamic Loading Libraries (and programs) are one of the most important features of any OS. And you can make it happen on MCUs.

    For true data and code sharing potential you may benefit from single address space OS (SASOS) architecture. (on most systems without MMU (like most MCUs) you share the same address space as there isn’t any virtual to physical translation.)
    (SASOS can be as secure as the unix like virtual address space OSs). On SASOS processes/tasks get space from the same virtual address space, but that doesn’t mean, that they can see each others data or code. But there is the potential, that they can share with each other whatever they want.

    On unix like virtual address space systems you just can’t share whatever you want (e.g. every program starts on the same virtual address), you can’t share pointers etc.

    There are of course pros and cons for each architecture. The most famous SASOS you may have heard of is AmigaOS.

    One disadvantage of SASOS may be that you need to relocate programs on (load or) runtime, but with that you get free ASLR, also your OS can work on systems without MMU (on MCUs), and you don’t need to make PIC code to be able to load into memory (you need relocation on load or runtime).

    We are developing an OS (Threos) (mainly for our embedded needs), that is a multitasking microkernel with single virtual address space (with paging and true virtual address space protection) architecture.

    Threos has all sort of code and data sharing features, (Dynamically linked/loaded libraries). We are using on demand loading and relocation of programs, and it’s possible to get on larger MCUs (with a little more memory, but without MMU).

    Of course without MMU you lose the memory protection, e.g. the MPU in cortex M series is just too limited to get proper memory protection.

  6. (1) You don’t need to build a library to be able to access a compiled object file via dlopen(). Just beware that if you need to load multiple object files that reference objects (functions, external variables etc) in each other, that you use “lazy binding” otherwise dlopen() could fail (because objects you are referencing have not been loaded yet).

    (2) I am 99.99% certain (a very long time since I last did any of this) that the object does not need to be compiled as position independent (PIC) as dlopen() dynamically links it during load.

    (3) the object file can contain external references back into the executable that invokes dlopen() (e.g. errno, stdin etc) and those references will be fixed by dlopen().

    (4) there is no reason why the executable must know about a function that is to be called. The name of the function could easily be provided as user input while the executable is running. This opens up the door to the executable generating code which gets compiled by an external compiler (e.g. GCC or CLANG) and then gets loaded by the executable as an extension to itself. I did this for a tool called ZMech many years ago.

    1. Well, gcc, for example, is 100% certain that it won’t let you link something as a sharable, if it’s not compiled as PIC. I don’t know why; it seems like the main function of a loader is to handle relocation. It may have something to do with the fact that when you call a function that’s part of a shared library, that library is loaded into its own address space, independent of the address space of the executables that call it, so it still has to be able to function when it’s running at a different address than the one it was loaded at. I don’t really know – that’s just a guess. I’m just saying that I’ve gotten that complaint from gcc, and this is a possible explanation.

      1. I’ve just checked my old code and I can confirm that I do NOT use the PIC setting when compiling with GCC. I think something else must be going on with your build.

        I have used PIC when compiling modules that I loaded myself into arrays (without the aid of dlopen()) and executed them directly without any relocation – but that’s another story.

  7. Nostalgia; The 16-bit IBM 1130 had a dynamic load feature called LOCAL (load on call). It worked just as you describe.

    The 1130 had a maximum 32K of RAM,and a single removable hard disk with a whopping 512K.
    But, incredibly enough, its FORTRAN compiler could run and compile code with as little as 4K RAM.
    The LOCAL facility was one of several tricks they used to stretch the computer’s capabilities.

    On our company’s 16K machine, I was routinely running lunar trajectory simulations. Amazing.

    1. Code bloat is certainly an ever-growing beast. When people have no choice but to make something run in 4KB of memory, they find a way. If you give them 16 GB RAM and multiple TB of disk, they don’t bother to be efficient, and figure they’re doing fine if their executable is only 100 MB. Same goes for executable speed. Magically, we were able to watch full-screen DVD video on 286 machines, but somehow this is still a challenge for Core 2 Duos. It’s not that the task is too difficult for a Core 2 Duo; it’s that if you develop something on a Core i7, you may never notice that it’s so fat and slow, it won’t run on a lesser CPU.

      The reasons for this are many, but I ran into an example just last night. I’m developing an animation application and also a live streaming application that both need a plugin mechanism for adding new A/V file types for sources and add-on image and audio filters. I just discovered how to use plugins in an application a few months ago, and so of course now I want to do everything as plugins! The mechanism for the plugins is pure C, which is typical, and allows plugins to be pretty light, but the problem is that if a plugin needs to be configured by the user, then there needs to be a user interface for that, and if the plugin takes care of it, this may mean linking in a GUI library other than the one used by the main application. Also, one of my objectives is to NOT be locked into a specific GUI library, since this has caused me great trouble in the past, and this means not only that I would require that plugin developers use my choice of GUI system, but also that over time, even plugins I make myself may not be consistent in their use of GUI libraries. This means that the application could easily end up loading four or five different GUI libraries, just to handle plugins. I’m working on a way to minimize this, by allowing simple control panels – those needing only standard controls – to use a high-level description for a panel layout that the application interprets, so that plugin designers don’t have to use a GUI library unless they really want to. I’ve seen something like this done elsewhere – I think that Cubase VST audio plugin developers have the option of using a control panel mechanism provided by the application, or developing their own. And if you look at how Thingiverse provides a user interface for specifying parameters for creating STL files from OpenSCAD models, this is another example of a control panel that the developers of models don’t have to do a lot of work to take advantage of.

      It seems like user interface has always been the difficult part. Your lunar trajectory simulations probably didn’t use drag-and-drop or dialog boxes, I’m guessing. I once interviewed at a company that made small measuring instruments, and was surprised to hear that some of these used PowerPC CPUs, for what were clearly simple instruments not needing any kind of sophisticated processing. I asked why, and the interviewer said, “because we’re using (xxx embedded OS) for the user interface”.
