Hot Swappable Raspberry Pi Rack

The Raspberry Pi has inspired many a hacker to take the inexpensive (~$35) microcomputer to the enterprise level. From bitcoin miners to clusters, the Raspberry Pi has found itself at the heart of many large-scale projects.

[Dave] served up his own contribution with his Raspberry Pi Rack. Inspired by enterprise blade servers, he wanted to house multiple Raspberry Pi boards in a single enclosure that provides power and Ethernet. The spacing between the blades and the open sides allows each Pi to cool without the additional power and cost of fans.

Starting with an ATX power supply and Ethernet switch, [Dave] created a base that housed the components that would be shared by all the Pis. Using a 3D model of a Pi he found online, he began working on the hot-swap enclosures. After “dozens of iterations” he created a sled that holds a Pi in place with clips rather than screws and slides into his rack to connect to power and Ethernet.

Like most projects, some mistakes were made along the way. In his write-up, [Dave] describes how, after printing the bottom plate, he realized he hadn’t accounted for the holes for the Ethernet cable runs. Instead, the cables run along the back wall in a way he now prefers.

You can find all the details and download the 3D models on his project page.

29 thoughts on “Hot Swappable Raspberry Pi Rack”

      1. That doesn’t really matter.

        Let’s say we have a budget of $1k.
        We can afford either of those setups:

        40 x RPi 1 (2.5W per node at 100%)
        25 x RPi 2 (2W per node at 100%)
        2 x Intel i7 4770 (100W per node at 100%)

        There are numerous applications for a cluster, but let’s focus on one that is easily verifiable (a lot of people do such tests).
        Originally I thought about adding a web server benchmark, but I couldn’t find two or more with the same test environment setup.
        Let’s use Linpack benchmark, which is a de facto standard for measuring computational performance (that’s what TOP500 list uses)

        Following figures are for DP floating ops, single node:

        RPi 1: 40 Mflops
        RPi 2: 372 MFlops
        i7: 160 000 MFlops

        Which gives us following economy figures:

        Performance/cost [Mflops/$]:
        RPi 1: 40 * 40 / 1000 = 1.6 [MFlops/$]
        RPi 2: 25 * 372 / 1000 = 9.3 [MFlops/$]
        i7: 2 * 160000 / 1000 = 320 [MFlops/$]

        Performance/power [Mflops/W]:
        RPi 1: 40 / 2.5 = 16 [MFlops/W]
        RPi 2: 372 / 2 = 186 [MFlops/W]
        i7: 160000 / 100 = 1600 [MFlops/W]
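
The arithmetic above can be reproduced with a short Python sketch. It only uses the figures quoted in this comment (a $1k budget, the node counts, and the per-node Linpack Mflops and wattage); nothing is measured here, and the names `SETUPS` and `economy` are made up for illustration.

```python
# Figures copied from the comment above; none of them are measured here.
BUDGET_USD = 1000

# name: (nodes affordable within the budget, Mflops per node, watts per node at 100%)
SETUPS = {
    "RPi 1": (40, 40, 2.5),
    "RPi 2": (25, 372, 2.0),
    "i7": (2, 160_000, 100.0),
}


def economy(nodes, mflops_per_node, watts_per_node, budget=BUDGET_USD):
    """Return (Mflops per dollar, Mflops per watt) for one setup."""
    per_dollar = nodes * mflops_per_node / budget
    per_watt = mflops_per_node / watts_per_node
    return per_dollar, per_watt


for name, spec in SETUPS.items():
    per_dollar, per_watt = economy(*spec)
    print(f"{name}: {per_dollar:.1f} Mflops/$, {per_watt:.0f} Mflops/W")
```

Changing the budget or swapping in other node prices only requires editing `SETUPS`; note that Mflops/W is a per-node figure, so it does not depend on how many nodes the budget buys.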

        The recent trend is to use general-purpose GPU processing for computation, so if I were to use GPUs for comparison, the results would be even more dramatic.
        Please don’t fall into the temptation that when all you have is a hammer, everything looks like a nail. Performance hardware is for computing, RPi is for embedded stuff, application specific HW is for magic ;)

        By the way, there are some _real_ ARM clusters which look nowhere close to what’s been presented. They pack 96 x 2.5GHz cores with 160 Gbit I/O throughput into 1/2 rack unit!
        Power requirements are roughly 50% lower than Intel Xeon servers, while performance stays the same.

        1. A few interesting facts to add:

          The most efficient cluster at the moment achieves over 5000 MFlops/W. Guess what architecture it uses ;)

          In terms of performance/cost, clusters are generally built for long-term use, so initial cost is a much smaller factor than power efficiency. Also, consumer hardware is rarely used, for reliability reasons. A rough calculation done on TOP500 setups yields about 50-200 Mflops/$.

          1. Very interesting. I recall back in the 80’s and 90’s how every budget season various science groups would beg for “supercomputer” money and make promises of the amazing problems they would solve. The computers got faster by orders of magnitude and there have been some nice computational visualizations. But I have seen almost none of the promised solutions. There must be a rule of thumb that a big science problem needs big budget increases for approximately one career length. Then funding for upgrade and rebuilding (and a retirement party).

            The best thing so far was probably the movie “Cars” which used desktops and blades that would have been supercomputers a few years earlier. Certainly more meaningful than finding the Higgs.

          2. Time on computing clusters is still highly sought after, but some kinds of workloads have always been easier than others. E.g. rendering.

        2. And also, doing this, we don’t have any choice other than programming a complicated, huge number of threads instead of a very few but faster ones on a Core i7, for example.

          btw: if some hackaday admin reads this, please add a confirmation dialog to the “Report this comment” link. I clicked on it by mistake.

        3. Not to mention that two of the i7s can be fitted in a single motherboard, which makes the efficiency and performance even better. Now the cluster can be used for non-cluster tasks, as it acts just like a larger multi-core cpu with a slower (still blazing fast compared to any ethernet solution) bus between half of it.

          1. I’m going to say that only 1 i7 can be run in any given board. They do not have QPI? (not sure what it’s called off the top of my head). Xeons, on the other hand, do, so they can be run in multiprocessor mode. Xeons are also bloody expensive. I personally prefer Opterons due to their significantly cheaper price.

        4. @Matt There is no doubt the i7 is a FP powerhouse, although to achieve that 160,000 MFlops you are looking at optimised pipelined operation to crank out 160 DP FP operations every ns. For numerical supercomputing, you could probably do better with GPUs, as you said. 160,000 MFlops is not a reasonable estimate for more general software. For clusters using more general software applications, that 160,000 MFlops will collapse quite a bit.

          There are applications such as build farms, multiuser cloud server applications, redis clusters, render farms, Spark. For many of these cluster applications you need to build prototypes with maybe 50 nodes. It’s a lot cheaper to use $35 ARM boards than i7 machines.

          There are other differences as well. A 20 RPi cluster gives you 80 cores which can each be dedicated to a single user app. The i7 needs to swap processes. The 20 RPi nodes have 20GB RAM with effectively 20 32 bit buses in parallel. The i7 shares its memory between processes. The i7 is very well suited to running VMs, where the RPi cluster will be unsuitable for VMs.

          Out of interest, we tried to get a benchmark of 4 RPi 2 nodes versus an i7 using Blender render farm software. The i7 was just a 3.2GHz office machine with 8GB RAM. The RPi 2 nodes were headless on a Gbit switch. Blender can use 2 detail settings (the higher one uses more FP, I think). For some fairly simple 150 frame renders, we found the following:

          1x i7 machine = 7 – 15 x Rpi 2 nodes (15 was for the higher detail setting)

          This puts the performance/cost at about the same and the performance/watt of the RPi 2 more than 3x better.

          Someone can check this, but it seems that for real world applications the massive FP capabilities of the i7 are not actually used effectively. The ARM-is-a-toy meme may not be warranted for many cluster applications.
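
As a rough check of the cost and power claim, here is a small Python sketch that plugs the 7 to 15 node equivalence into the price and power figures quoted earlier in this thread ($1k buys 25 RPi 2 nodes or 2 i7 machines; 2W and 100W per node). Those prices and wattages are assumptions borrowed from the earlier comment, not measurements from the Blender test, and `compare` is a hypothetical helper.

```python
# Assumed figures, borrowed from the earlier comment in this thread
# (not measured during the Blender test):
RPI2_COST_USD = 1000 / 25  # $1k buys 25 RPi 2 nodes
I7_COST_USD = 1000 / 2     # $1k buys 2 i7 machines
RPI2_WATTS = 2.0           # per node at 100% load
I7_WATTS = 100.0           # per node at 100% load


def compare(rpi_nodes_per_i7):
    """Compare one i7 against the group of RPi 2 nodes that matches its render speed.

    Returns (cost_ratio, power_ratio). A ratio above 1 means the RPi 2 group
    is cheaper, or draws less power, than the single i7.
    """
    cost_ratio = I7_COST_USD / (rpi_nodes_per_i7 * RPI2_COST_USD)
    power_ratio = I7_WATTS / (rpi_nodes_per_i7 * RPI2_WATTS)
    return cost_ratio, power_ratio


for nodes in (7, 15):
    cost_ratio, power_ratio = compare(nodes)
    print(f"{nodes} nodes: cost ratio {cost_ratio:.2f}, power ratio {power_ratio:.2f}")
```

Under these assumed numbers the power ratio stays above 3 across the whole 7 to 15 node range, while the cost ratio lands on either side of 1 depending on the detail setting, which is consistent with the rough claim above.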

  1. Nifty implementation, but I’m concerned about the power connector at the bottom; 0.1″ pin headers and hot glue likely wouldn’t survive repeated insertions. A proper BMI plug mounted with a little wiggle room would be more robust.

    1. I used some pretty beefy epoxy to secure those. I’m not really worried about them falling out, but I know the connectors themselves won’t last forever. They’re m-to-f breadboard jumpers that I had on hand. Although like any piece of computer guts, they’re not really designed to be plugged and unplugged daily.

  2. I’ve been wanting to do something like this, just for the hell of it. it’s not practical, there are better ways (etc) but it IS cool to do.

    what stopped me was that I wanted it all. I wanted all the i/o on a backplane, and whoever does that well will ‘win’ a lot of cred, that’s for sure. you need video, serial (out of band console, suppose you bring the ethernet down, etc), usb. to make it more fun, make the cluster box use sleds that can support pi’s and beaglebones.

    see, that’s the hard part. not all have hdmi. not all have composite video. and what about storage? I’d like some easy way to pick what gets used for booting, somehow.

    after thinking about all that, I never built a thing for the ‘pi cluster’. it always seemed like it would be such a partial solution and that would never make me want to use it.

    when things go wrong, you need to grab its video and its console. probably even eject the card, fix things and reboot with a different card. the cluster and sled idea should support that, to be most useful.

    multi-vlan would also be needed (for me, i’d want this to be a network test bed as much as a compute test bed).

    well, I have a bunch of PI’s and when I need them connected, they take up a table, all spread out ;) but at least I can get to every port and jack and do what’s needed when it’s needed.

    1. I was thinking about how I could do usb connections (for more storage) and HDMI. The usb is conveniently right next to the Ethernet, so it would be pretty easy to add USB receptors to the enclosure. However, what those actually hook up to would be the hacky/hard part, I suppose it’d be possible to integrate some usb hard drives.

      1. For HDMI what you could have is a hdmi switch (4 inputs 1 output). I have one of these at the moment, there is a button on the top to change the source.

        If you then use a “port saver” cable to put the connector at the top (near the ethernet cables) you could then put another “port saver” and male-male hdmi connector that is “built in” to the caddy. When you put the pi in the caddy you plug in one end of the hdmi, put it into the rack and have it connect to the switch.

        The only problem is you now have 4 devices/Cables (switch, 2 cables, 1 port converter) instead of a cable and also you need to find somewhere to put the excess cable…unless you can find someone to make really short ones for the pi-to-caddy cable.

        Hope that makes sense.

  3. It may not be the best of breed, but honestly, if you had that, you wouldn’t be posting here.

    What it is though, is cute! Thumbs up to the guy playing with a cluster in a little box. Looks like fun!

  4. One of the projects on my perpetual back-burner is a “two headed” USB Ethernet dongle. It would have a standard USB-Ethernet chip in it along with a 3 port switch. The purpose would be to allow daisy chaining them instead of star-wiring for clusters.

        1. If you’ve got a cluster of devices in a row or a matrix, daisy chaining is going to be inherently less messy (particularly with the correct length cables) than star wiring.
