Can An 8 Node Raspberry Pi Cluster Web Server Survive Hackaday?


Plenty of folks have used their Raspberry Pi as a web server. [Steve]’s, however, is the first 8 node load balanced Pi cluster server we’ve run into. While we have seen Pi clusters before, they’ve never been pressed into service as a public facing web server. [Steve] has created a really nice, informative website about the Raspberry Pi and Linux in general. As his page views have increased, he’s had to add nodes to the server. Currently [Steve] sees about 45,000 page views per month.

At first glance it would seem that the load balancer would be the weak link in the chain. However, [Steve] realized that he needed more than a Pi to handle this task. He built the load balancer using an old PC with 512MB of RAM and a 2.7GHz x86 CPU. The most important thing about the balancer is its dual network interfaces: one side faces the internet, the other faces the Pi cluster. The balancer isn’t a router, though: only HTTP requests are forwarded. The Pi nodes themselves live on their own subnet. [Steve] has run some basic testing with siege; however, nothing beats a real world test. We figured a couple of links in from Hackaday would be enough to acid test the system.
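
As a rough illustration of what a balancer in front of eight nodes does with each incoming HTTP request, here is a minimal Python sketch of round-robin backend selection. It is not [Steve]’s configuration: the round-robin policy, the `RoundRobinBalancer` class, and the `10.0.0.x` addresses are all assumptions made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend addresses in rotation, skipping nodes marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._rotation = cycle(self.backends)

    def mark_down(self, backend):
        # Stop sending requests to a node that failed a health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Put a known node back into service.
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call so a fully
        # failed cluster raises instead of looping forever.
        for _ in range(len(self.backends)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


# Hypothetical private addresses for the eight Pi nodes.
PI_NODES = [f"10.0.0.{i}" for i in range(1, 9)]

balancer = RoundRobinBalancer(PI_NODES)
first_ten = [balancer.next_backend() for _ in range(10)]
```

With eight healthy nodes, the ninth request wraps around to the first node again. A real balancer would also proxy the request and response and run its own health checks; this sketch only covers the selection step.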

Comments

  1. Evan says:

    Serves up pages quickly for me!
    Nice build, Steve, and thanks for all the info. I’ve bookmarked it for reading later.

  2. isama says:

    Nice build, though I feel using something other than a Raspberry as a load balancer feels like cheating :P

    I’d try using LVS on all the Pis sharing an IP. Never tried it, but it seems like a nice way to get redundancy and load balancing across the Pis
    http://www.linuxvirtualserver.org/

  3. Trui says:

    If you use this at a typical home, I’d guess the weakest link is the internet connection, unless you have very complicated dynamic page content.

  4. bluewraith says:

    So who will be the one to post to reddit and lead it into slashdot? :D

  5. Tien Gow says:

    Good, fast response so far. Nice site.

  6. Mystick says:

    If you really wanted to test it, you should have said it was built using an Arduino for the purposes of automatically modifying the throttled firmware on O’scopes and FLIR cameras for full functionality so they could be used to rebuild a tractor… while being “wearable” through implanted 3D-printed magnets under the skin.

    …then just sit back and graph the FPS(flames per second) and the TRI(troll rage index)…

  7. fartface says:

    Yes it can, a 386 can. It’s the internet pipe that is the bottleneck.
    Now a slashdotting… That is something to behold, as it has taken down core routers, or it used to. I don’t think that place is as popular as it used to be.

    • mints says:

      I agree fully with both your points. 45,000 seemed reasonable until I noticed it said per month. Years ago I ran a tiny turn-based scalable game akin to Planetarion. The whole thing fit on a 750MB hard drive with a 486DX. It could handle just over a thousand users logged in at any time (though it never got close). Static page visits were consistently over 250,000 per month. If I were to count dynamic pages, it was probably an order of magnitude higher.

      And slashdot..yeah I spend plenty of time there but it seems to be filled with marketers, people who think they know tech, and elitist old programmers that apparently browse slashdot during their entire workday. It has not had quality discussions or stories for some time now, but people still go because nobody else comes close to taking its place.

  8. Nonano says:

    Nice project for trying out cluster stuff. Have the feeling putting some spare RAM into the load balancer and not using the PI’s would likely perform just as well if not better though. Ever tried it?

    The content of the page seems quite static so having everything relevant in a cache with hardly anything dynamic going on seems feasible.

  9. Tom the Brat says:

    7:15 a.m. central time usa: Responds nice and fast for me.

  10. Andrew says:

    Hm, so it takes 8 raspberries to survive HaD effect?
    Well, 1 odroid-x2 at 2GHz armed with Debian, lighttpd, WordPress & mysqld did survive for me… Twice.

  11. DosX says:

    Still fresh, give it time and we will see how it survives.

    If it does, hoorah! If not, still awesome.

  12. dynamodan says:

    Still fast for me here. Way fast.

  13. Sasha says:

    Wow, I’m amazed at the speed! Plus his ping is 60ms from my residential connection, over wifi, from across the world. That’s… awesome.

    • Trui says:

      60 ms is not across the world. At least not very far (about 5000 miles max). And latency is mostly the effect of speed of light limit in long optic fibers, and interleaving in modems, not CPU processing delays.

      • mints says:

        Doubly wrong. Fun Fact: light travels about 186 miles in one millisecond. That means in 60ms, in optimal conditions, the maximum distance will be just shy of half the circumference at the equator. Even still, 5000 miles IS on the other side of the earth (and often more so) at most latitudes.

        • Acido says:

          Light in a vacuum does, yes. Light in a fiber is only about 60% that speed, so it travels some 10,800 km in 60ms. That’s a quarter around a great circle; halfway around would be what is usually taken to mean “on the other side of the world”. You could build two polar stations a stone’s throw from each other at 0 and 180 degrees longitude but it would be quite a stretch to say they were on the other side of the world from each other.
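
The arithmetic in this thread is easy to check. The sketch below is only a back-of-the-envelope calculation: the 60% fiber figure is the one quoted in the comment above, the Earth circumference is the equatorial value, and the function name is made up for the example. It also adds one point the thread leaves out: a ping time is a round trip, so the one-way travel time is half of it.

```python
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in a vacuum
FIBER_FRACTION = 0.6              # figure quoted in the thread (assumed, not measured)
EARTH_CIRCUMFERENCE_KM = 40_075   # equatorial circumference
KM_PER_MILE = 1.609344


def max_distance_km(time_ms, fraction_of_c=1.0):
    """Upper bound on the distance a signal covers in time_ms milliseconds."""
    return C_VACUUM_KM_PER_S * fraction_of_c * time_ms / 1000.0


# Treating the full 60 ms as one-way travel time, as the thread does.
vacuum_60ms_km = max_distance_km(60)
fiber_60ms_km = max_distance_km(60, FIBER_FRACTION)

# A ping measures the round trip, so only 30 ms is available each way.
fiber_ping_km = max_distance_km(30, FIBER_FRACTION)

miles_per_ms_vacuum = max_distance_km(1) / KM_PER_MILE
fraction_of_circumference = fiber_60ms_km / EARTH_CIRCUMFERENCE_KM

print(f"vacuum, 60 ms one way: {vacuum_60ms_km:,.0f} km")
print(f"fiber,  60 ms one way: {fiber_60ms_km:,.0f} km")
print(f"fiber,  60 ms ping:    {fiber_ping_km:,.0f} km")
```

Under these assumptions the fiber figure comes out at roughly a quarter of the way around the equator, and the distance a 60 ms ping can actually span is about half of that again. Real routes are longer than great circles and add switching delay, so these are upper bounds only.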

    • franky says:

      As he said, the load balancer is not a router. If you ping, no Raspberry is involved in the test…

      • mints says:

        A router connects two or more networks.
        The Raspberry Pi cluster has its own subnet.
        The load balancer connects the external network and subnet.
        Therefore, the load balancer is a router.

        Really I don’t know why they wrote that. He knows that Linux with forwarding is a router complete with its own routing table.

        • franky says:

          Hi, see next explanation from mints. Of course most Linux systems can forward and connect network traffic. But in this case it is described as a proxy, and a proxy forwards HTTP (maybe HTTPS) only to the Raspberries. Ping is not HTTP, and will not reach the cluster.

    • Quin says:

      I don’t think the pings go to the R-Pis. His webpage states that the Pis only get the HTTP requests passed, as it’s Apache on the 2.7GHz CPU that’s doing the load balancing, not the IP or MAC layer. So, pinging the load balancer at the dot com address probably never gets to the Pis, and is handled either by the gateway (the interface was on a local network; what home gateways pass pings nowadays? And the IP resolves to a VM owned home cable IP, not his domain name. So, maybe a dyndns type situation?) or at worst the ‘balancer’ (highly doubtful, since the balancer looks to have a 192.168.0.x on the outward facing side).

      As for “across the world”, the IP whois goes to a Great Britain identity protection service, and a tracert goes to a home cable connection on Virgin Media. The router outside his gateway looks to be in Surrey, but I am not really sure and don’t want to be too nosy and snoop more. Why yes, my youth was misspent by doing crap like this with friends in grey-hat games of ‘break into my computer, I dare you.’ Open SMTP servers were always the best way to get a file installed, because when god@heaven tells you to check an email, you often think twice. Same for anyone @something.invalid, .example, .test, or best was .localhost! Those look like something from a local service, since the TLDs aren’t valid.

      Lastly, ping time. I don’t know how your ping time is for other sites, but for me to get out of the Verizon 10.x.x.x (I have a globally visible IP that geoips to the wrong part of the country, but tracert goes through a 10.x.x.x before the VZ regional router) takes 32ms on DSL. Regional to VZ border is another 28ms on top of that (probably just the time the border is taking to respond due to other traffic). Then the alter.net and gblx routers drop back down to 39ms on tracert until the cable crosses from the USA to what looks like a London gblx router, where I sit in the 120ms range. That 60ms is just unavoidable, due to fiber over the ocean. From gblx to Virgin Media is about 10ms after that (from VM border to VM core to Surrey local router looks like no delay at all, good on Virgin Media), and then the link to the guy’s home connection is 20-30ms since it jumps through cable (not FiOS or DSL or a data center like connection). So if you are getting just 60ms, either you are in Europe (less fiber crossing oceans) or you have a blazing fast dedicated pipe out to an ocean crossing fiber link.

  14. John says:

    I was sure to click as many ads as possible to help out a fellow hacker ;)

  15. neonmaus says:

    He could have even more performance if he would use nginx instead of Apache.

  16. Anthony Pearia says:
  17. ElectroNick says:

    Adam, thanks for the tip! In addition to being a cluster test study the site itself is an awesome rpi resource!

    By the way, being featured on Hackaday is a great way to test a web server. When articles about my projects were posted, I saw traffic spikes of 10+ times the normal level. I’m sure after today Steve will have a much better idea of how his cluster behaves under heavy load :)

  18. Anthony Pearia says:
  19. franky says:

    Great experimental setup. The result is very good. But as said before, a Pi should be used as the load balancer, too. IMHO the load balancer itself could do the whole job (maybe with a 512MB upgrade).

  20. jordan says:

    *Clicks on ALL the links on his page* haha i feel so evil.

  21. DSouth says:

    Did anyone see where he talked about what kind of cluster filesystem is in use? Where does he store his site files, NFS? Or some other cluster filesystem?

    • Good question. Though from experience with a mid-volume (>300.000 users) site NFS works absolutely fine for such things. And if all else fails you can do rsync ;)

      BTW for mass-operations on many nodes, Ansible is a great open source tool. Really cool if you need to quickly roll out stuff to 200 machines, or run scripts on 200 machines and get some form of collated run report back.

  22. Neat idea! Though I have to say, using Apache for loadbalancing is rather hopeless. A much sleeker (and more performant) solution is haproxy. This is not to say that the LB is the bottleneck here, I doubt that’s the case, but still…
    Oh and the monitoring and health check facilities are superior too, haproxy can replay requests to another node if the first one dies during the request, AND you can run haproxy in tcp mode to loadbalance anything that uses tcp, not just http.

    Alright I’ll shut up now ;)

  23. Mekael Hannigan says:

    Maybe it’s just me, but I’m not quite seeing the point of this.

    • Acido says:

      Indeed. It’s like showing you can deliver a pizza with a fleet of motorbikes that had their frames filled with uranium if you drag them along with a Hummer. Never mind a single motorbike would have been fine if used properly. Or a single Raspi with Gatling or even lighttpd.

  24. Should’ve used something like bond-ucarp to load-balance. The computer at the front is just overkill.
