Accelerate Your Large Builds Locally With Distcc

The motto of Sun Microsystems back in the day was “The Network Is The Computer”, a slogan that made some sense when CPUs were slower, single-core affairs; lately, to get a faster compile, you’d simply throw more cores and memory at the problem. The thing is, most of us don’t do huge compilations all that often; we can’t remember the last time we even attempted a Linux kernel build. However, if you do find yourself with a sudden need to do so, and have access to a pile of machines hooked to a network, then why not check out distcc: the fast distributed C/C++ compiler? We’ve seen a few mentions in the comments and a HaD links article referencing it, but never explicitly covered the tool. So here we go.

To call distcc a compiler is a bit misleading; it is really a compiler frontend. Each machine you want to borrow runs the distcc daemon process and sits there awaiting instructions. To start a parallel compile, you invoke your normal build command through the ‘pump’ script, enable parallel build mode, set the compiler to ‘distcc’, and let the tool do the rest. A really nice feature is that the compile hosts need not share a filesystem or have synchronised clocks. They can even run different operating systems and CPU architectures, with appropriate cross-compilers in place, so leveraging existing hardware without dedicating it to the task is much easier.
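In practice it boils down to something like this. A minimal sketch, assuming two helper machines named box1 and box2 (the host names and job count are placeholders), with the ,cpp,lzo host options that pump mode expects:

    # On the machine driving the build; host names are placeholders.
    # The ,cpp,lzo options enable the distributed preprocessing and
    # compression that pump mode relies on.
    export DISTCC_HOSTS='localhost box1,cpp,lzo box2,cpp,lzo'
    pump make -j16 CC=distcc CXX='distcc g++'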

Distcc compiler support is focussed upon GCC, but it supports LLVM as well. OS support is primarily aimed at Linux, but it seems to run just fine on various BSD derivatives and even Cygwin on Windows. You can imagine a neat scenario whereby, whilst working on your laptop, you come home and kick off a new build, and your machine picks up the other machines in your vicinity and automatically taps into their CPU power. And you only need to set it up once!
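Setting up a donor machine is similarly brief. A sketch of the server side, assuming the helpers sit on a 192.168.1.0/24 subnet (a placeholder range) and that distcc was built with Avahi for the optional zeroconf advertising:

    # On each machine donating CPU time: accept jobs from the local
    # subnet and advertise the service via zeroconf (the --zeroconf
    # flag requires a distcc built with Avahi support).
    distccd --daemon --allow 192.168.1.0/24 --jobs 8 --zeroconf

A client built the same way can then set DISTCC_HOSTS='+zeroconf' to find the donors automatically.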

Whilst this is distributed computing for your needs, on your network, we have lately seen many good uses of distributed computing across the whole internet, like the one about a certain pesky coronavirus.

19 thoughts on “Accelerate Your Large Builds Locally With Distcc”

  1. distcc!?! You have to configure the machines you’re going to use; last I checked, they don’t discover each other. https://github.com/icecc/icecream is like that but does discovery, so that machines can come and go from the network and the compile jobs just keep chugging along (see the sketch at the end of this comment). I don’t understand why Gentoo doesn’t switch to that. At Qt, we use it extensively in the offices, and I’ve also used it at home when my collection of machines were all getting a bit long in the tooth.

    And both of these are rather primitive compared to the general-purpose clusters that we should have by now (any process, not only a compiler, should be able to migrate to a less-busy machine); but somehow those are still not mainstream. I played with openMosix around 25 years ago, and it seemed cool back then. I’ve been waiting for the other shoe to drop ever since. Let’s see, https://en.wikipedia.org/wiki/OpenSSI was supposed to have picked up where it left off, but the article says 2010 was the newest release, so that doesn’t sound very hopeful. I figured that since Plan 9 makes it easy to cross-compile for multiple architectures, and easy to mount all resources of remote machines locally, maybe a scheduler could be built to do roughly that sort of thing; but so far I haven’t tried. (It takes a while to get used to using Plan 9 at all, and then you have to write mostly C, perhaps some Go if you are willing to do without the full complement of platforms; so it’s a bit weird to do much general computing on it, for me so far anyway.) Haven’t tried Kerrighed either; I wonder if it’s worth trying.
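    For reference, the icecream setup I mean is roughly this (which box plays which role is up to you; one runs the scheduler, everything else just joins):

        # one machine on the LAN runs the scheduler
        icecc-scheduler -d
        # every machine, scheduler box included, runs the daemon,
        # which finds the scheduler by broadcast
        iceccd -d
        # then build as usual, routed through icecc
        make -j32 CC=icecc CXX=icecc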

  2. My issue is often the linking, not the compiling…
    Compiling with a lot of CPU threads and fast NVMe drives has got pretty fast even without leaving the machine. But all the linkers seem to be single-threaded…

  3. I have lately worked on optimizing C/C++ builds. Isn’t just spawning the processes and reading the source files much of the work? I have looked up how distcc works: it sends the preprocessed version of the source code over the network, so loading all the files and preprocessing them cannot be avoided.
    What if we ran a FUSE filesystem on the server that holds all the sources and headers? It could be updated using something like rsync, which only sends a repeatedly used header once at the beginning of the process, or on demand. And it is also practical to only send the changes. This way we could further reduce the load on the client machine, and at least in theory it could scale even better.
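    Roughly what I have in mind for the sync half, with a made-up host and path just for illustration:

        # push only the changed sources/headers to the build box
        # before dispatching jobs (host and path are invented)
        rsync -az --delete ./ builder:/var/cache/buildsrc/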

    1. So your optimisation would be… file caching? There’s already a common caching system used with distcc: ccache. I understand it’s particularly useful with partial builds.
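      The usual way to stack the two, if memory serves, is to let ccache call distcc for whatever it can’t answer from its cache:

        # ccache satisfies repeat compiles locally; cache misses
        # are forwarded to distcc for remote compilation
        export CCACHE_PREFIX=distcc
        make -j16 CC='ccache gcc' CXX='ccache g++'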

  4. Hmm, why not just do make -j 32 locally? No overhead of file transfer, etc. The Linux kernel compiles in around 80–90 seconds or so.

    With a decent SSD/NVMe, it would beat even the largest clusters. By the time the files have finished transferring, I’m done compiling.

    1. This isn’t for just one kernel, one time, billy bob clippy gates.
      What if you do the full Linux From Scratch, with the Beyond LFS book, and do some full system tests right after, and recompile stuff as well?

    2. WTF? What kind of monster computer do you have that it can compile the kernel in 80 or 90 seconds? Or is it a REALLY cut down kernel?

      Now I am curious… how long to compile Firefox and/or LibreOffice. Those are the really long builds on my machine.

      1. I built quite some stuff for my DEC 3000 system (Alpha AXP) running DEC Unix, i.e. I replaced most of the proprietary stuff with GNU, and having a 21164SX @ 533 MHz on the network was a big helper. But this is talking day vs. week :D . Setting this up was a bit painful, but once it works you’re quite fine – especially on custom systems.
        Please note this was done years ago; distcc was fine for me and helped a lot.
        It’s not perfect for big applications like Firefox or LibreOffice or “make world”, but on vintage machines it helps a lot – especially if you’re not “cross” but “generation” compiling.
        Consider the man pages ;)

  5. I avoid complex and large compilations. In 2006, when I began using Linux, I had more luck compiling, but nowadays there are too many flavors of build toolsets that I can’t feasibly debug anymore. Python and LTS distributions are a lifesaver for me.

  6. While troubleshooting/tuning a large (200+ clients), long (~12 hours) distcc build on Sun hardware nearly three decades ago, the largest difference in build times between similar machines (same MHz, same gigabit NICs) was traced back to the size of the L2 cache in the CPUs and the order of libraries in LD_LIBRARY_PATH.

  7. This is a common problem in Yocto or OpenEmbedded Linux projects, even small ones. The people suggesting vertical scaling are obviously a different crowd, but in Yocto, after vertically scaling CPU, IOPS, RAM, etc., the thing to do is still (a minimal configuration sketch follows):
    https://www.openembedded.org/wiki/Using_IceCC
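    From memory, enabling it is only a couple of lines in the build configuration (variable names as I recall them; verify against icecc.bbclass and the wiki above):

        # append to conf/local.conf in an initialized build directory
        echo 'INHERIT += "icecc"' >> conf/local.conf
        echo 'ICECC_PARALLEL_MAKE = "-j 16"' >> conf/local.conf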

    As the first commenter mentioned, distcc/icecc is quite long in the tooth, and I’ll add that it is not extensible to the heterogeneous build jobs now common in any modern software project, e.g. minimally C and Rust, which have different toolchains and different “recipes” for building.

    Managing such diverse build workloads is exactly what OE’s bitbake was designed for. Although bitbake supports many build-time optimization features, aside from network download and shared-state (sstate) caching they are all vertical; they all assume a single build host.

    One technology that was designed from the ground up to describe and orchestrate any workload a computer can do, from a container anyway, is Kubernetes. A few years ago, I stepped away from embedded Linux systems development to get more experience with the K8s ecosystem and tools.

    I recently returned to the embedded space and am now looking for the best way to horizontally scale bitbake: how to best leverage clusters of build machines to do all these jobs. Icecc was/is a massive improvement over distcc, but it has only rudimentary logic for determining which node to dispatch jobs to. I’m imagining a scheduler that knows how to, e.g., dispatch the ld command to the node with the fastest single-core frequency. Otherwise, it’s not that different from what icecc is doing, except that it would provide a containerized abstraction of a “build job” that can be easily modified to support the growing amount of Rust and other languages and task types.

    I’ve found kas, which is basically just a way of configuring and building Yocto with modern standards like YAML and containers. It’s not distributed in any way – it’s still doing a huge monolithic build in a single container – but to me it feels like a good starting place for building the build system I’ve been dreaming of all these years (sketch below).
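    If anyone wants to try it, the workflow is pleasantly small (the YAML file name is a placeholder for your own project description):

        # kas reads a YAML project description and drives the whole
        # fetch/configure/bitbake cycle, optionally in a container
        pip install kas
        kas build kas-project.yml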
