Learning The Ropes With A Raspberry Pi Mandelbrot Cluster

You’ve probably heard it said that clustering a bunch of Raspberry Pis together to make a “supercomputer” doesn’t make much sense, as even a middle-of-the-road desktop could blow it away in terms of performance. While that may be true, the reason most people make Pi clusters isn’t for raw power, it’s so they can build experience with parallel computing without breaking the bank.

So while there was probably a “better” way to produce the Mandelbrot video seen below, creator [Michael Kohn] still learned a lot about putting together a robust parallel processing environment using industry standard tools like Kubernetes and Docker. Luckily for us, he was kind enough to document the whole process for anyone else who might be interested in following in his footsteps. Whatever your parallel task is, and whatever platform it happens to be running on, some of the notes here are likely to help you get it going.
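If you just want a feel for why the Mandelbrot set is such a popular first parallel workload, the sketch below (our own illustration in Python, not [Michael Kohn]’s code) splits a frame into independent row bands and hands them to worker processes; on the cluster, the same bands would instead go to containers scheduled across the Pis. The resolution, view window, and band size here are arbitrary.

```python
# Illustrative only -- a stand-in for the cluster, not [Michael Kohn]'s code.
# Each row band is an independent job: no worker ever needs data from
# another, which is exactly what makes this a friendly first parallel task.
from multiprocessing import Pool

WIDTH, HEIGHT, MAX_ITER = 800, 600, 255
X_MIN, X_MAX, Y_MIN, Y_MAX = -2.5, 1.0, -1.25, 1.25
BAND = 50  # rows per job

def escape_count(cx, cy):
    """Classic escape-time iteration for a single pixel."""
    zx = zy = 0.0
    for i in range(MAX_ITER):
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
        if zx * zx + zy * zy > 4.0:
            return i
    return MAX_ITER

def render_band(band):
    """Compute rows [start, stop); returns (start, rows) so bands can be reassembled."""
    start, stop = band
    rows = []
    for py in range(start, stop):
        cy = Y_MIN + (Y_MAX - Y_MIN) * py / (HEIGHT - 1)
        rows.append([escape_count(X_MIN + (X_MAX - X_MIN) * px / (WIDTH - 1), cy)
                     for px in range(WIDTH)])
    return start, rows

if __name__ == "__main__":
    jobs = [(r, min(r + BAND, HEIGHT)) for r in range(0, HEIGHT, BAND)]
    with Pool() as pool:                  # here: local processes; on the cluster: Pi nodes
        done = dict(pool.map(render_band, jobs))
    image = [row for start in sorted(done) for row in done[start]]
    print(f"rendered {len(image)} rows of {len(image[0])} pixels")
```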

It’s not the biggest Raspberry Pi cluster we’ve ever seen, but the four Pi 4s and the RGB LED festooned enclosure they live in make for an affordable and space-saving cluster to hone your skills on. Whether you’re practicing for the future of software development and deployment, or just looking for something new to play around with, building one of these small-scale clusters is a great way to get in on the action.

10 thoughts on “Learning The Ropes With A Raspberry Pi Mandelbrot Cluster”

  1. Software ray tracing might be one of the other things that would work surprisingly well with just a lot of cores. The algorithms scale like a tree search, so if you push the detail high enough it can outperform rasterization. But they’re branch (and branch-miss) heavy, so they don’t perform great on GPU compute.

    1. The whole point of nVidia RTX cards (and AMD/ATI equivalents) was to have hardware capable of efficient ray tracing computation. And it works great for both gaming and 3D rendering. For gaming there are some corners cut, like limiting the detail level and using machine learning to improve the result, done in such a way that the game can manage at least 30 renders per second. There are some other tricks game engines use that render engines don’t. Still, ray tracing in hardware is the future. The exception is if you can get some older servers with 64-core CPUs and lots of RAM and convert them into a render farm; that would outperform any single GPU.

  2. It’s always fun to build hacks around, but a cluster of four RPi 4s isn’t cheap at all. I’d rather spend the same amount of money on a cloud-managed Kubernetes cluster instead…

  3. This is reminiscent of the way massively parallel Transputer racks (of up to 1000 cores) were used to generate Mandelbrot sets in real time in the 1980s. Transputers used four point-to-point 10 MHz synchronous serial links to adjacent Transputers to communicate data between processors. Hackaday covered them in 2019:

    https://hackaday.com/2019/04/19/retrotechtacular-transputer/

    I had the good fortune to be able to use Transputers to simulate a neural net for my dissertation project in 1988/1989. Transputers were astounding tools for formalising ideas about parallel processing.

    1. Yes, we used loads of Transputers in the ’80s/’90s. Ironically, one of the first things we did with them was Mandelbrot generation, outputting to a GKS system. The performance compared to the rather rudimentary PCs of the time was great.

  4. The standard image took hours on a C64, unless you skipped the two main circles and exploited symmetry; then a nibble under 40 minutes was achievable (a sketch of those shortcuts appears just after this comment). Today it’s done in seconds even on cheap microcontrollers. Massive parallelism beyond a handful of Pis is within reach without going broke.

    But now for something not really completely different:

    What if we calculated each pixel exactly, as quotients of limitless BigInts (coordinate → current iteration depth, current result coordinate)? I know how damn slow that would be; I’ve played with it. But being exact (as opposed to using lossy floats), each pixel would only ever need to be calculated once and could be cached forever! It could even be revisited later to increase the iteration count without recalculating the known part (see the second sketch after this comment).

    Not using the MP3-style coding of numbers (floats) could pay off when we store the results, and maybe that could even be used as proof-of-work in blockchain or spam-prevention contexts (hashcash-like?).

    Would having cached results, turning Mandelbrot images into a database query, take the fun out of the topic?

    I don’t think so. I mainly use drawing a Mandelbrot image as a coding example. Finding a nice region, or even planning an animated zoom into one, is a different dimension of creativity. I’d still happily code the algorithm in 4711 languages just for fun.

    Silly idea?
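For anyone wondering what “skipping the two main circles” looks like in code, here’s a small Python sketch (our own illustration, not the commenter’s C64 routine) of the two standard interior tests, the main cardioid and the period-2 bulb, plus the mirror-the-top-half trick that comes from the set’s symmetry about the real axis. The resolution and view window are arbitrary, and the odd default height just ensures the centre row lands exactly on cy = 0.

```python
# Our own sketch of the shortcuts mentioned above, not the original C64 code.

def in_cardioid_or_bulb(cx, cy):
    """True if c lies in the main cardioid or the period-2 bulb,
    so the escape-time loop can be skipped entirely."""
    q = (cx - 0.25) ** 2 + cy * cy
    if q * (q + (cx - 0.25)) <= 0.25 * cy * cy:    # main cardioid test
        return True
    return (cx + 1.0) ** 2 + cy * cy <= 0.0625     # period-2 bulb test

def escape_count(cx, cy, max_iter=255):
    if in_cardioid_or_bulb(cx, cy):
        return max_iter                            # known interior point, no iteration
    zx = zy = 0.0
    for i in range(max_iter):
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
        if zx * zx + zy * zy > 4.0:
            return i
    return max_iter

def render(width=320, height=201, x_min=-2.5, x_max=1.0, y_max=1.25):
    """Render only the cy >= 0 half and mirror it -- roughly half the work."""
    rows_top = height // 2 + 1                     # rows covering cy in [0, y_max]
    top = []
    for py in range(rows_top):
        cy = y_max * (rows_top - 1 - py) / (rows_top - 1)
        top.append([escape_count(x_min + (x_max - x_min) * px / (width - 1), cy)
                    for px in range(width)])
    return top + top[-2::-1]                       # mirror all but the centre row

if __name__ == "__main__":
    img = render()
    print(len(img), "rows,", len(img[0]), "columns")
```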
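And here’s a rough sketch of the exact-arithmetic idea, again our own reading of it rather than the commenter’s code, using Python’s fractions.Fraction as the “limitless” rational type. Each pixel’s iteration state can be cached and resumed later with a higher iteration cap, with no recomputation of the part already known; the catch, as noted, is that the denominators roughly double in digit count every step, which is exactly why it is so slow.

```python
# Our own sketch of exact (rational) per-pixel iteration, not the commenter's code.
from fractions import Fraction

def iterate_exact(cx, cy, max_iter, state=None):
    """cx, cy are Fractions. Returns (escaped, iterations_done, zx, zy).
    Pass a previous undecided result back in as `state` to resume it."""
    zx, zy, done = state if state else (Fraction(0), Fraction(0), 0)
    for i in range(done, max_iter):
        zx, zy = zx * zx - zy * zy + cx, 2 * zx * zy + cy
        if zx * zx + zy * zy > 4:
            return True, i, zx, zy        # escaped: this pixel's result is final
    return False, max_iter, zx, zy        # undecided: cache and revisit later

# One pixel near the "neck" at -0.75, revisited with a deeper limit --
# the first 8 iterations are not recomputed, only iterations 8..11 are added.
c = (Fraction(-3, 4), Fraction(1, 10))
escaped, n, zx, zy = iterate_exact(*c, 8)
escaped, n, zx, zy = iterate_exact(*c, 12, state=(zx, zy, n))
print(escaped, "after", n, "iterations;",
      "denominator already has", len(str(zx.denominator)), "digits")
```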
