Setting up a cluster of computers used to be a high-end trick used in big data centers and labs. After all, buying a bunch of, say, VAX computers runs into money pretty quickly (not even counting the operating expense). Today, though, most of us have a slew of Raspberry Pi computers.
Because the Pi runs Linux (or, at least, can run Linux), there is a wealth of tools out there for doing just about anything. The trick is figuring out how to install them. Clustering several Linux boxes isn’t necessarily difficult, but it does take a lot of work unless you use a special tool. One of those tools is Docker, particularly Docker Swarm Mode. [Alex Ellis] has a good video (see below) showing the details of a 28-CPU cluster.
It is easy to set up a swarm using the instructions on the Docker website. If you aren’t familiar with Docker, it is almost (but not quite) a lightweight virtual machine manager. A true virtual machine manager pretends to be a piece of hardware so that Linux (or another operating system) can boot on it and run applications. Docker is a container manager, which means it doesn’t pretend to be a piece of hardware; it pretends to be a running operating system. Programs see their own file system and other resources, but in reality, there is only one kernel running on the host hardware.
The idea is similar to running something in a chroot jail. The program can make changes to its file system without upsetting the rest of the system. Docker also provides other kinds of isolation. The real draw, though, is that it can automatically load images of predefined environments. This allows developers to provide packages that are essentially preinstalled in their own private operating system.
That’s important because it means that a service can run on any node in a cluster. That lets you do tricks like balancing load across multiple nodes. You can also do rolling updates and dynamically add or remove computers from the cluster.
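As a rough illustration of the rolling-update idea, here is a minimal Python sketch. It is a toy model, not Docker’s implementation: the task names, the image tags, and the `rolling_update` helper are all made up for the example, and the `parallelism` parameter is only loosely modelled on the `--update-parallelism` option of `docker service update`.

```python
def rolling_update(tasks, new_image, parallelism=1):
    """Yield a snapshot of the service after each batch of tasks is updated.

    tasks maps a (hypothetical) task name to the image it currently runs.
    """
    current = dict(tasks)      # work on a copy, leave the caller's dict alone
    names = sorted(current)    # deterministic update order
    for start in range(0, len(names), parallelism):
        for name in names[start:start + parallelism]:
            current[name] = new_image  # "restart" this task on the new image
        yield dict(current)    # tasks not yet reached still run the old image


# Hypothetical five-replica service running version 1 of an image.
tasks = {f"web.{i}": "app:v1" for i in range(1, 6)}
steps = list(rolling_update(tasks, "app:v2", parallelism=2))

for number, snapshot in enumerate(steps, start=1):
    updated = sum(image == "app:v2" for image in snapshot.values())
    print(f"step {number}: {updated} of {len(snapshot)} tasks on app:v2")
```

The point of the batches is that at every step some tasks are still running, so the service as a whole never goes away while the new version rolls out.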
We’ve seen clusters before, of course. Even the little Pi Zero can get in on the act. If you want to understand more about Docker, you can always read the official documentation. Given the prevalence of Linux in embedded systems, it might be an interesting way to deploy preconfigured applications.
Here come the Raspberry Pi Cluster Haters™. I can hear them tripping over the rug in a rush to comment.
I know almost nothing about this topic, so I have my popcorn ready.
This is a Raspberry-Pi-Cluster something.
Perhaps, in the interests of decency, we should simply leave it there.
I do a deep dive on the RPi Docker Swarm along with all the details over at http://blog.alexellis.io/live-deep-dive-pi-swarm/
Really interested in the comments on this one. Thinking of starting to acquire some pi zeros to let me run a “simulated” production setup and do some experiments with horizontal scalability.
Provided I can get all the software running of course. Out of curiosity does the load balancing mentioned above assume you’re working with a largely homogeneous set of services and just mirrors what’s there? Or can it offload a single process to its own underlying machine?
The load balancing in swarm mode works by creating a service and deciding how many replicas to have. Each replica (or task) gets a round-robin distribution of work. The neat thing is that if you publish a service on port 3000 – you can point to any node and request port 3000 and the swarm will route and balance for you. More @ https://docs.docker.com/engine/swarm/
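To make the round-robin idea concrete, here is a minimal Python sketch. It only models the hand-out order and is not how the swarm is implemented internally; the replica names and the `round_robin` helper are hypothetical, chosen for illustration.

```python
from collections import Counter
from itertools import cycle


def round_robin(replicas, requests):
    """Pair each request with the next replica in turn, wrapping around."""
    ring = cycle(replicas)  # endless iterator: r1, r2, r3, r1, ...
    return [(request, next(ring)) for request in requests]


# Hypothetical service with three replicas receiving seven requests.
replicas = ["web.1", "web.2", "web.3"]
assignments = round_robin(replicas, range(7))

# Count how many requests each replica ended up handling.
counts = Counter(replica for _, replica in assignments)
for replica in replicas:
    print(replica, counts[replica])
```

With seven requests over three replicas, no replica ends up more than one request ahead of any other, which is the property that makes round-robin a reasonable default when the replicas are identical.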
Using containers, you don’t actually need separate physical hardware. lxc (not familiar with Docker internals, but likely the same) allows you to split up a single NIC into an arbitrary number of virtual network cards. You can run a set of services on a single host, each with its own independent RAM, file system, CPU, and networking.
If you need a large cluster of cores, and if CUDA or FPGAs are not an option, a cluster of Pis might be a fine solution.
It would be nice to see a cluster that uses the GPIO so you don’t need to power up the hub/Ethernet controller.
USB OTG is also an option for Raspberry Pi Zero, I’ve got an 8 core swarm working here: https://twitter.com/alexellisuk/status/764518552154042369 with Docker. Each Pi Zero has a single USB cable for power and data.
I really wouldn’t use a Pi Zero in a cluster environment if you can avoid it. For MPI clustering it’s just about usable (you can run old versions of Apache Spark in a pinch, but basic OpenMPI or MPICH work fine). You can run PHP apps on there mostly fine, albeit a little slowly for some values of apps. If you’re getting into Redis, Node.js, etc. in a Docker swarm, you probably can run it on the Zero, but with a single core and 512 MB of RAM each, 4 Pi Zeros will be sub-optimal compared to running the swarm on a single Pi 2.
In all cases, running a Pi cluster should never be about performance: it’s slow and it’s expensive (per FLOP) compared to more efficient options. I have a ClusterHAT with a Pi 2 and 4 Zeros that I love to bits, but it’s a fraction of the speed of the desktop next to it.
For me, a Raspberry Pi cluster is more about learning how to deal with clusters than doing real work. I wish the Raspberry Pi had been available when I was in college, 10-15 years ago; it would have been a game changer.
I’ll ask here, because I have nowhere else to ask.
GPU computing: how can we do it efficiently and platform-independently?
Let’s say I want to have a go at some Project Euler problems and don’t mind going at them naively and brute-forcing them.
What language should I use, and what libraries? CUDA etc. is NVIDIA-specific, so out of the question.
Is C/C++ and OpenCL really the best shot?
Because I’m pretty tired of C/C++…
Haskell would be ideal since I’m trying to learn Haskell.
I’m asking in this thread because I think it’s a better option than RPi clusters for parallel computing
And also I should mention I was looking for an ergonomic solution.
You should definitely find a better place to ask that question. Your question is (contrary to what you might believe) totally unrelated to the article above.
No one said that GPUs are worse for parallel computing, but this article is, again, about an RPi cluster, not how best to set up a powerful cluster.
Well, there is not really a good place to ask, but I will keep asking. But HaD attracts all walks and I assume people interested in RPi clusters will also be interested in computation on GPU.
A site like stackexchange may have the right place for you to ask – Maybe http://serverfault.com/ or one of the programming orientated ones.
Don’t worry about it. Learn what to expect from the Raspberry Pi people and laugh about it, a lot!
This is the typical response one gets from the Raspberry Pi people; ANY Raspberry Pi people.
If you ask a question not gushing with praise for the RPi once, you’re told that you’re not in the right place, so your question/observation isn’t relevant and won’t be answered (clever weasel tactic, right?).
If you ask it another way, you’re told by the moderator that your question has been addressed before, so stop asking.
If you ask a third time, your post/comment/question is simply deleted, without a trace.
Can’t, or doesn’t, happen? Three times for me on the forums; twice on the Raspberry Pi website blogs. Don’t waste your valuable time.
@Julian–your Raspberry Pi brainwashing is showing. It ain’t pretty.
I only play an expert on the interweb, but this is what I come across most often for platform independent GPU computing … https://mathema.tician.de/software/pyopencl/
Thanks, I’ve been looking into Python OpenCL bindings.
Really, I was hoping for a language designed around it, because I’m not finding any OpenCL bindings particularly ergonomic.
Have you heard about Cray’s Chapel language? http://chapel.cray.com/tutorials/SC15/Chapel-SC15-1-Background.pdf
This is *exactly* what I was looking for. The presentation looks perfect.
I wonder how well it would run on a cluster of Orange Pi computers.
I already have 3 Orange Pis running and 2 more on the way.
If anyone has tried it, let me know. I am thinking of doing a small cluster, with the price of the Orange Pi computers running around $20 with delivery. They have almost the same specs as a Pi 3, and that is a lot cheaper than the Pi Zeros.
And as for the first comment: I’m not a Raspberry Pi hater, it’s just that they are out of my price range for the things I’m doing.
And please let me know if anyone has tried it on any of the Pi clones.
The process shouldn’t be any different: Pi clones still run Linux on the ARM architecture, and you can even run modified versions of Raspbian. If I had the hardware to try it out, I would. The results should be as expected.
Thank you, that is what I thought. Now I just need the time to do it.
I think the next order I put in I will be getting 3-5 more Orange Pis.
They have been great so far. I have one as a server with 40 TB of hard drives (22 USB-connected older hard drives), and yes, it is slow as f*&^k.
But it is great for what it is doing, and for the price of running it. I still have a problem when turning the power off and on to save power, with some of the drives not knowing what to do or where to go. I’m still working on that.
Thanks again.
Be careful when ordering a Raspberry Pi clone – the main reason is that a certain Kernel version is needed to run Docker. I had an issue ordering two Orange Pis with the H3 chip. The Pi Zero costs 5 USD and I took a real world IoT swarm with sensors to Dockercon. It really depends on your workload as to whether the Zero is powerful enough.
Ever since the Zero came out I have never seen it for less than an Orange with shipping.
I am very aware of the flaws of the Oranges though, and try my best to work around them.
And if I can’t, then I will go to the Raspberry Pi.
They are far from perfect but so far I have been able to work with them.
I do have 2 Raspberry Pi B+’s and a Raspberry Pi 3, among other weird computers I’m not too sure about,
or how to get working or program. I really wish I could get my hands on some of those little Intel computers.
But you know what I would like the most? That there was a Hackaday club (meeting place) close to me.
Toronto is over an hour away; I’ve been there twice and loved it there.
Thank you
Hackers? Getting the most bang for the buck? NOT using an Odroid C2?
Oh, I get it: we want to see if we can’t use a device that’s half as powerful with half the memory, does not have 1GB Ethernet nor any form of high-capacity storage–such as up to 64GB of eMMC–, runs all kinds of OSs including Android, which the RPi group has said they’d NEVER do, beats the RPi3 hollow in every test you can throw at them to compare the two, costs essentially the same…and we’re doing ALL of this because, uh, somehow we’ve been convinced that the Raspberry Pi, which cannot even be programmed in Assembly Language to do real-time tasks and which is literally trounced by the Arduino in this respect, is THE answer.
Real hackers use the best tool for the job. Prove it.
Oh, I forgot: you can run Windows on the Raspberry Pi. This one fact is all that most people need to make an argument, so use IT.
are you for real? Or is this just a jab at people who make comments like this?
Imagine a Beowulf cluster of these !…
ooops, sorry, wrong blog :)
Enters confusingly: Ok, Raspberry Pi might not answer any questions for me.
As a learning tool, it appears great. As a functional tool, very good success stories.
Question: what is the best tool to build a hive mind with and how does it compare to the pi? I want to build a world of Ricks. My goal is to enslave the Morty’s and put them to work building a better Rickverse.
But to do this I need a hive Rick. Oh yeah, I plugged a bunch of wires into my pie and it didn’t do anything. Maybe the bakery ripped me off. I asked the baker if this had a good CPU. He checked the computer and said it WaS 2 dollars off! Maybe I will try blueberry next time! The baker said apple was the best!
I’m always curious about Pi clusters, and maybe it’s due to having a plethora of old desktops and running a number of VMs regularly, but what’s the point of running a cluster of Pis versus a cluster of VMs on a couple old scraps of AMD or Intel hardware? You can set up a Docker cluster on a more modern, but used, quad core (AMD and Intel both use virtualization for making that seem like there’s 8 cores), along with 8+ gigs of RAM for roughly the cost of 6 RPi units with needed peripherals.
I chose the Pi because I’m fascinated by them and have already bought a bunch of them (Zeros and Pi 2/3). I also know the kernel version is new enough to support Docker and swarm mode, so rather than being an uphill battle (as it may have been 2 years ago) it’s now a pleasant experience to set up a swarm; just check out the video. If anyone fancies building a cluster with an RPi clone, I’d suggest doing your research and making sure that the kernel is up to date. If someone wants to send me some Odroids I’d happily take them off your hands!
My 7 RPis and a switch are all drawing less than 12W combined. They also take up around the space of a shoebox. Both of these are very attractive plus points for me.
ARM and the Pi in particular have always been popular on Twitter etc – so I think a lot of people share that fascination. Scaleway built a farm of ARM servers to rent out to people and they were so successful that they ran out of stock.
We’re unlikely to run a production load on my set of Raspberry Pis and we’d probably be just as unlikely to run a production work-load on a bunch of dusty PCs in someone’s home – but both are really exciting for learning and exploring the tech. I wish I’d had access to all this 15 years ago.
Just to be clear: is it possible, with this Docker option, to distribute the computation of a script that was not designed for it?
Example: I have the OpenDroneMap mapping software, which is meant to run on one machine. Can I “run” this script with swarm and multiply the cores it uses?
Anyone know how this kind of cluster could be achieved on the ClusterHAT running on an RPi 3B? I’m currently trying to obtain another 2 Pi Zeros to fill the slots, but I would like to see it configurable for delegating tasks such as DNS management or web hosting to specific nodes. I know this can be achieved by manually installing the desired programs on the right Pi Zero, then plugging them in and booting the script. But that’s not what I’m after.
If anyone knows a bit more about how the ClusterHAT is configured to communicate with the host computer, please share, as the documentation provided is quite vague. All attempts to contact the developer of the ClusterHAT have proved fruitless.