Can An 8 Node Raspberry Pi Cluster Web Server Survive Hackaday?

Plenty of folks have used their Raspberry Pi as a web server. [Steve], however, has built the first 8-node load-balanced Pi cluster server we’ve run into. While we have seen Pi clusters before, they’ve never been pressed into service as a public-facing web server. [Steve] has created a really nice, informative website about the Raspberry Pi and Linux in general. As his page views have increased, he’s had to add nodes to the server. Currently [Steve] sees about 45,000 page views per month.

At first glance it would seem that the load balancing system would be the weak link in the chain. However, [Steve] did realize that he needed more than a Pi to handle this task. He built the load balancer using an old PC with 512MB of RAM and a 2.7GHz x86 CPU. The most important thing about the balancer is its dual network interfaces: one side facing the internet, the other facing the Pi cluster. The balancer isn’t a router, though; only HTTP requests are forwarded. The Pi nodes themselves live on their own subnet. [Steve] has run some basic testing with siege; however, nothing beats a real-world test. We figured a couple of links in from Hackaday would be enough to acid test the system.
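To make the "only HTTP requests are forwarded" idea concrete, here is a minimal sketch of how a balancer might hand requests to eight nodes in rotation. It is not [Steve]'s configuration: the round-robin policy, the `10.0.0.x` node addresses, and the names `RoundRobinBalancer` and `route` are all assumptions made for illustration.

```python
from itertools import cycle

# Hypothetical private addresses for the eight Pi nodes (not given in the article).
PI_NODES = [f"10.0.0.{i}" for i in range(1, 9)]


class RoundRobinBalancer:
    """Hands out backend nodes in strict rotation (an assumed policy)."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one backend node is required")
        self._rotation = cycle(nodes)

    def pick(self):
        """Return the next node in the rotation."""
        return next(self._rotation)


def route(balancer, protocol):
    """Return the node that should serve a request, or None if it is not forwarded.

    Only HTTP is passed to the cluster; anything else (ICMP ping, SSH, ...)
    stops at the balancer, which is why it acts as a proxy and not a router.
    """
    if protocol.lower() != "http":
        return None
    return balancer.pick()


if __name__ == "__main__":
    balancer = RoundRobinBalancer(PI_NODES)
    for _ in range(3):
        print(route(balancer, "http"))
    print(route(balancer, "icmp"))
```

After eight HTTP requests the rotation wraps back to the first node, while a non-HTTP request never reaches the cluster's subnet at all.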

50 thoughts on “Can An 8 Node Raspberry Pi Cluster Web Server Survive Hackaday?”

    1. Agreed. It kind of negates the point of using the Pi, in my opinion. Might as well just scrap the whole lot and use a low-power Intel Atom server. I’m using one that eats up just 10 watts. Lots of RAM and a fast SATA drive.

  1. If you really wanted to test it, you should have said it was built using an Arduino for the purposes of automatically modifying the throttled firmware on O’scopes and FLIR cameras for full functionality so they could be used to rebuild a tractor… while being “wearable” through implanted 3D-printed magnets under the skin.

    …then just sit back and graph the FPS(flames per second) and the TRI(troll rage index)…

  2. Yes it can, a 386 can. It’s the internet pipe that is the bottleneck.
    Now a slashdotting… That is something to behold, as it has taken down core routers, or it used to. I don’t think that place is as popular as it used to be.

    1. I agree fully with both your points. 45,000 seemed reasonable until I noticed it said per month. Years ago I ran a tiny turn-based scalable game akin to Planetarion. The whole thing fit on a 750 MB hard drive with a 486DX. It could handle just over a thousand users logged in at any time (though it never got close). Per-month static page visits were consistently over 250,000. If I were to count dynamic pages it was probably an order of magnitude higher.

      And slashdot..yeah I spend plenty of time there but it seems to be filled with marketers, people who think they know tech, and elitist old programmers that apparently browse slashdot during their entire workday. It has not had quality discussions or stories for some time now, but people still go because nobody else comes close to taking its place.

  3. Nice project for trying out cluster stuff. I have the feeling that putting some spare RAM into the load balancer and not using the Pis would likely perform just as well, if not better, though. Ever tried it?

    The content of the page seems quite static so having everything relevant in a cache with hardly anything dynamic going on seems feasible.

    1. 60 ms is not across the world. At least not very far (about 5000 miles max). And latency is mostly the effect of speed of light limit in long optic fibers, and interleaving in modems, not CPU processing delays.

      1. Doubly wrong. Fun Fact: light travels about 186 miles in one millisecond. That means in 60ms, in optimal conditions, the maximum distance will be just shy of half the circumference at the equator. Even still, 5000 miles IS on the other side of the earth (and often more so) at most latitudes.

        1. Light in a vacuum does, yes. Light in a fiber is only about 60% that speed, so it travels some 10,800 km in 60ms. That’s a quarter around a great circle; halfway around would be what is usually taken to mean “on the other side of the world”. You could build two polar stations a stone’s throw from each other at 0 and 180 degrees longitude but it would be quite a stretch to say they were on the other side of the world from each other.
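The arithmetic in this exchange is easy to check. The sketch below uses the defined speed of light in a vacuum and takes the 60% fiber figure from the comment above as given; the equatorial circumference is rounded, and the function name `distance_km` is just an illustrative choice.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in a vacuum
FIBER_FRACTION = 0.6             # the commenter's figure, taken as given
KM_PER_MILE = 1.609344
EQUATOR_KM = 40_075.0            # approximate equatorial circumference


def distance_km(time_ms, fraction_of_c=1.0):
    """Distance covered in time_ms at the given fraction of the speed of light."""
    return C_VACUUM_KM_PER_S * fraction_of_c * time_ms / 1000.0


if __name__ == "__main__":
    per_ms_miles = distance_km(1) / KM_PER_MILE
    vacuum_60 = distance_km(60)
    fiber_60 = distance_km(60, FIBER_FRACTION)
    print(f"vacuum, 1 ms:  {per_ms_miles:.1f} miles")
    print(f"vacuum, 60 ms: {vacuum_60:.0f} km ({vacuum_60 / EQUATOR_KM:.2f} of the equator)")
    print(f"fiber, 60 ms:  {fiber_60:.0f} km ({fiber_60 / EQUATOR_KM:.2f} of the equator)")
```

Both commenters treat the 60ms as one-way travel time. A ping reports round-trip time, so the one-way reach for a 60ms ping would be half of these distances.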

      1. A router connects two or more networks.
        The Raspberry Pi cluster has its own subnet.
        The load balancer connects the external network and subnet.
        Therefore, the load balancer is a router.

        Really I don’t know why they wrote that. He knows that Linux with forwarding is a router complete with its own routing table.

        1. Hi, see the next explanation from mints. Of course most Linux systems can forward and connect network traffic. But in this case it is described as a proxy, and a proxy forwards only HTTP (maybe HTTPS) to the Raspberry Pis. Ping is not HTTP, and will not reach the cluster.

    2. I don’t think the pings go to the R-Pis. His webpage states that the Pis only get the HTTP requests passed, as it’s Apache on the 2.7GHz CPU that’s doing the load balancing, not the IP or MAC layer. So, pinging the load balancer at the dot com address probably never gets to the Pis, and is handled either by the gateway (the interface was on a local network; what home gateways pass pings nowadays? And the IP resolves to a VM-owned home cable IP, not his domain name. So, maybe a dyndns-type situation?) or at worst the ‘balancer’ (highly doubtful, since the balancer looks to have a 192.168.0.x on the outward-facing side).

      As for “across the world”, the IP whois goes to a Great Britain identity protection service, and a tracert goes to a home cable connection on Virgin Media. The router outside his gateway looks to be in Surrey, but I am not really sure and don’t want to be too nosy and snoop more. Why yes, my youth was misspent by doing crap like this with friends in grey-hat games of ‘break into my computer, I dare you.’ Open SMTP servers were always the best way to get a file installed, because when god@heaven tells you to check an email, you often think twice. Same for anyone @something.invalid, .example, .test, or best was .localhost! Those look like something from a local service, since the TLDs aren’t valid.

      Lastly, ping time. I don’t know how your ping time is for other sites, but for me to get out of the Verizon 10.x.x.x (I have a globally visible IP that geoips to the wrong part of the country, but tracert goes through a 10.x.x.x before the VZ regional router) takes 32ms on DSL. Regional to VZ border is another 28ms on top of that (probably just the time the border is taking to respond due to other traffic). Then the alter.net and gblx routers drop back down to 39ms on tracert until the cable crosses from the USA to what looks like a London gblx router, where I sit in the 120ms range. That 60ms is just unavoidable due to fiber over the ocean. From gblx to Virgin Media is about 10ms after that (from VM border to VM core to Surrey local router looks like no delay at all, good on Virgin Media), and then the link to the guy’s home connection is 20-30ms since it jumps through cable (not FiOS or DSL or a data-center-like connection). So if you are getting just 60ms, either you are in Europe (less fiber crossing oceans) or you have a blazing fast dedicated pipe out to an ocean-crossing fiber link.

  4. Adam, thanks for the tip! In addition to being a cluster test study the site itself is an awesome rpi resource!

    By the way, being featured on Hackaday is a great way to test a web server. When articles about my projects were posted, I saw traffic spikes of 10+ times the normal level. I’m sure after today Steve will have a much better idea of how his cluster behaves under heavy load :)

  5. Great experimental setup. The result is very good. But as said before, a Pi should be used as the load balancer, too. IMHO the load balancer itself could do the whole job (maybe with a 512 MB upgrade).

    1. Good question. Though from experience with a mid-volume (>300,000 users) site, NFS works absolutely fine for such things. And if all else fails you can do rsync ;)

      BTW for mass-operations on many nodes, Ansible is a great open source tool. Really cool if you need to quickly roll out stuff to 200 machines, or run scripts on 200 machines and get some form of collated run report back.

  6. Neat idea! Though I have to say, using Apache for loadbalancing is rather hopeless. A much sleeker (and more performant) solution is haproxy. This is not to say that the LB is the bottleneck here, I doubt that’s the case, but still…
    Oh and the monitoring and health check facilities are superior too, haproxy can replay requests to another node if the first one dies during the request, AND you can run haproxy in tcp mode to loadbalance anything that uses tcp, not just http.

    Alright I’ll shut up now ;)
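The "replay the request on another node" behaviour this comment describes can be illustrated in a few lines. This is a sketch of the general failover idea only, not haproxy's implementation or configuration; the backends are modelled as plain Python callables, and `send_with_failover`, `dead_node`, and `healthy_node` are hypothetical names.

```python
def send_with_failover(request, backends):
    """Try each backend in turn and return the first successful response.

    Each backend is a callable that takes the request and either returns a
    response or raises ConnectionError if the node died while handling it.
    """
    last_error = None
    for backend in backends:
        try:
            return backend(request)
        except ConnectionError as exc:
            # The node failed mid-request: remember why, then replay the
            # same request on the next node in the list.
            last_error = exc
    raise RuntimeError("all backends failed") from last_error


def dead_node(request):
    """Stand-in for a node that drops the connection."""
    raise ConnectionError("node went away during the request")


def healthy_node(request):
    """Stand-in for a node that answers normally."""
    return f"healthy_node served {request}"


if __name__ == "__main__":
    print(send_with_failover("GET /", [dead_node, healthy_node]))
```

Replaying is only safe for requests that can be repeated without side effects, which is one reason a real balancer has to be careful about which requests it retries.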

    1. Indeed. It’s like showing you can deliver a pizza with a fleet of motorbikes that had their frames filled with uranium if you drag them along with a Hummer. Never mind a single motorbike would have been fine if used properly. Or a single Raspi with Gatling or even lighttpd.

  7. Hi there,
    I was looking for some other stuff and landed upon this page. I was wondering if a Raspberry Pi cluster would make a decent setup for load testing a web app on a local server. Any ideas or opinions welcome (but no trolls please).
