A RISC-V Supercluster For Very Low Cost

ARM continues to make inroads in the personal computing space thanks to its more modern, streamlined instruction set architecture (ISA) and its reduced power demands compared to x86 machines, but the main reason it keeps spreading is how easy it is to get a license to make chips using the ISA. It’s still not a fully open-source instruction set, though, so if you want something even more accessible than ARM, you’ll need to look to something like these chips running the fully open-source RISC-V ISA – and possibly put them to work in a custom supercluster.

[bitluni] recently acquired a large number of CH32V003 microcontrollers and managed to configure them all to work together in a cluster. The array of chips itself costs only about $2 (not including all of the other components attached to the board), so a cluster of arbitrary size is potentially possible. For this project, [bitluni] built a four-layer PCB with an 8-bit bus over which the microcontrollers communicate with each other. Each chip has its own ADC and I/O wired to a set of GPIO pins on the sides of the board, and the build is rounded out with a USB interface for programming and power.
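
For a feel of the scheme, here’s a purely conceptual Python sketch of how work might be fanned out over a shared 8-bit bus to addressed nodes. The addressing, frame format, and “work” are all hypothetical, and this is emphatically not [bitluni]’s firmware:

```python
# Conceptual sketch of a shared byte-wide bus: every node sees every transfer,
# but only the addressed node latches the data. All details are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    address: int
    received: list = field(default_factory=list)

    def on_bus_write(self, addr_byte: int, data_byte: int) -> None:
        # Each node watches the bus and keeps only frames addressed to it.
        if addr_byte == self.address:
            self.received.append(data_byte)

class EightBitBus:
    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, addr_byte: int, data_byte: int) -> None:
        # One shared 8-bit bus: the transfer is visible to every node.
        for node in self.nodes:
            node.on_bus_write(addr_byte & 0xFF, data_byte & 0xFF)

if __name__ == "__main__":
    nodes = [Node(address=i) for i in range(16)]
    bus = EightBitBus(nodes)
    for i, work in enumerate(range(64)):      # round-robin 64 work bytes
        bus.write(i % len(nodes), work)
    print([len(n.received) for n in nodes])   # -> 4 work bytes per node
```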

There were a few quirks to get this supercluster up and running, including some issues with the way the reset and debug pins work on these specific microcontrollers. With bugs like these out of the way, the entire cluster is up and running, and [bitluni] hints that his design could easily be interfaced with even larger RISC-V superclusters. As for a use for this build, sometimes clusters like these are built just for the sake of building them, but since the I/O and ADCs are accessible, in theory this cluster could do anything a larger microcontroller might be able to do, only at a much lower price.


[Image: the miniJen demo stand on a presentation desk]

A Jenkins Demo Stand For Modern Times

Once you’re working on large-scale software projects, automation is a lifesaver, and Jenkins is a strong player in open-source automation – be it software builds, automated testing, or deploying onto your servers. Naturally, it’s historically been developed with x86 infrastructure in mind, and let’s be fair, x86 is getting old. [poddingue], a hacker and a Jenkins contributor, demonstrates that Jenkins keeps up with the times with a hardware demo stand called miniJen, which has Jenkins running on three non-x86 architectures – armv8 (aarch64), armv7l and RISC-V.

There are four SBCs of different architectures involved: three act as Jenkins agents executing tasks, and one acts as the controller, all powered by a big desktop PSU from Pine64. The controller’s got a somewhat beefier CPU for a reason – at FOSDEM, we’ve seen it drive a separate display with a Jenkins dashboard. It’s very much a complete demo for its purpose, and definitely an eyecatcher for FOSDEM attendees passing by the desk! As a bonus, there’s also a fascinating blog post about how [poddingue] got Jenkins running on RISC-V in particular.
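
If you wanted to poke at a setup like this yourself, even a simple query against the controller’s REST API will show which agents (and architectures) are online. A minimal sketch, with the URL and credentials as placeholders rather than anything from miniJen:

```python
# List Jenkins agents and their reported architectures via the controller's
# REST API. The address and credentials below are hypothetical placeholders.
import requests

JENKINS_URL = "http://minijen.local:8080"   # hypothetical controller address
AUTH = ("admin", "my-api-token")            # hypothetical credentials

resp = requests.get(f"{JENKINS_URL}/computer/api/json", auth=AUTH, timeout=10)
resp.raise_for_status()

for node in resp.json().get("computer", []):
    # The architecture monitor reports strings like "Linux (aarch64)".
    arch = (node.get("monitorData", {}) or {}).get(
        "hudson.node_monitors.ArchitectureMonitor") or "unknown"
    state = "offline" if node.get("offline") else "online"
    print(f"{node.get('displayName')}: {arch} ({state})")
```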

Even software demonstrations get better with hardware, and this one stood out, no doubt! Looking to build a similar demo, or wondering how it came together? [poddingue] has blog posts on the demo’s structure, a repo with OpenSCAD files, and a trove of videos demonstrating the planning, design, and setup process. As it goes with continuous integration, we’ve generally seen hackers and Jenkins collide when it comes to build failure alerts, from rotating warning lights to stack lights to a Christmas tree; however, we’ve also seen a hacker use it to keep their firmware size under control between code changes. And, if you’re wondering what continuous integration holds for you, here’s our hacker-oriented deep dive.

The 13.5 Million Core Computer

Having a dual- or quad-core CPU is not very exotic these days, and CPUs with 12 or even 16 cores aren’t that rare. The Andromeda from Cerebras, by contrast, is a supercomputer with 13.5 million cores. The company claims it is one of the largest AI supercomputers ever built (but not the largest) and that it can perform 120 petaflops of “dense compute.”

We aren’t sure about the methodology, but they also claim more than one exaflop of “AI computing.” The computer has a fabric backplane that can handle 96.8 terabits per second between nodes. According to a post on Extreme Tech, the core technology is a 3-plane wafer processor, WSE-2. One plane is for communications, one holds 40 GB of static RAM, and the math plane has 850,000 independent cores and 3.4 million floating point units.

The data is sent to the cores and collected by a bank of 64-core AMD EPYC 3 processors. Andromeda is optimized to handle sparse matrix computations. The company claims that the performance scales “almost linearly.” That is, as you double the number of cores used, you roughly halve the total run time.
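
To put a number on “almost linearly,” here’s a quick Python illustration using Amdahl’s law; the 2% serial fraction is purely an assumption of ours, not a Cerebras figure:

```python
def amdahl_speedup(cores: int, serial_fraction: float) -> float:
    # Amdahl's law: the serial part never speeds up, the rest scales with cores.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

SERIAL = 0.02   # assumed fraction of the job that cannot be parallelized

for cores in (1, 2, 4, 8, 16, 32, 64):
    s = amdahl_speedup(cores, SERIAL)
    print(f"{cores:>3} cores: {s:6.2f}x speedup (ideal would be {cores}x)")
```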

The machine is available for remote use and cost about $35 million to build. Since it uses 500 kW at peak run times, it isn’t free to operate, either. Extreme Tech notes that the Frontier computer at Oak Ridge National Labs is both larger and more precise, but it cost $600 million, so you’d expect it to be more capable.
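
As a back-of-the-envelope check on that, at a hypothetical $0.12/kWh the peak draw alone works out to north of a thousand dollars a day:

```python
# Rough daily electricity cost at the quoted 500 kW peak draw.
PEAK_POWER_KW = 500          # quoted peak draw
RATE_USD_PER_KWH = 0.12      # assumed electricity rate -- substitute your own
HOURS_PER_DAY = 24

daily_cost = PEAK_POWER_KW * HOURS_PER_DAY * RATE_USD_PER_KWH
print(f"~${daily_cost:,.0f} per day at sustained peak draw")   # ~$1,440
```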

Most homebrew “supercomputers” we see are more for learning how to work with clusters than for trying to hit this sort of performance. Of course, if you have a modern graphics card, OpenCL and CUDA will let you do some of this too, though at a much smaller scale.

Turing Pi 2: The Low Power Cluster

We’re not in the habit of recommending Kickstarter projects here at Hackaday, but when prototype hardware shows up on our desk, we just can’t help but play with it and write it up for the readers. And that is exactly where we find ourselves with the Turing Pi 2. You may be familiar with the original Turing Pi, the carrier board that runs seven Raspberry Pi Compute boards at once. That one supports the Compute Module versions 1 and 3, but a new design was clearly needed for the Compute Module 4. Not content with just supporting the CM4, the developers at Turing Machines have designed a 4-slot carrier board based on the NVIDIA Jetson pinout. The entire line of Jetson devices is supported, and a simple adapter makes the CM4 work. There’s even a brand new module planned around the RK3588, which should be quite impressive.

One of the design decisions of the TP2 is to use the mini-ITX form-factor and 24-pin ATX power connection, giving us the option to install the TP2 in a small computer case. There’s even a custom rack-mountable case being planned by the folks over at My Electronics. So if you want 4 or 8 Raspberry Pis in a rack mount, this one’s for you.

Learning The Ropes With A Raspberry Pi Mandelbrot Cluster

You’ve probably heard it said that clustering a bunch of Raspberry Pis up to make a “supercomputer” doesn’t make much sense, as even a middle-of-the-road desktop could blow it away in terms of performance. While that may be true, the reason most people make Pi clusters isn’t for raw power, it’s so they can build experience with parallel computing without breaking the bank.

So while there was probably a “better” way to produce the Mandelbrot video seen below, creator [Michael Kohn] still learned a lot about putting together a robust parallel processing environment using industry standard tools like Kubernetes and Docker. Luckily for us, he was kind enough to document the whole process for anyone else who might be interested in following in his footsteps. Whatever your parallel task is, and whatever platform it happens to be running on, some of the notes here are likely to help you get it going.
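
The reason the Mandelbrot set is such a natural fit for this kind of rig is that every row (or tile) of the image is independent, so the work splits cleanly across workers – or, in this case, across Pis. Here’s a minimal single-machine sketch of that idea using Python’s multiprocessing; it’s an illustration of the concept, not [Michael Kohn]’s actual code:

```python
from multiprocessing import Pool

WIDTH, HEIGHT, MAX_ITER = 120, 40, 100

def mandelbrot_row(y: int) -> str:
    # Each row depends only on its own coordinates, so rows can be computed
    # completely independently -- the property that makes clustering easy.
    chars = []
    for x in range(WIDTH):
        c = complex(-2.5 + 3.5 * x / WIDTH, -1.25 + 2.5 * y / HEIGHT)
        z, n = 0j, 0
        while abs(z) <= 2 and n < MAX_ITER:
            z = z * z + c
            n += 1
        chars.append("#" if n == MAX_ITER else ".")
    return "".join(chars)

if __name__ == "__main__":
    with Pool() as pool:                         # one worker per local core
        rows = pool.map(mandelbrot_row, range(HEIGHT))
    print("\n".join(rows))
```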

It’s not the biggest Raspberry Pi cluster we’ve ever seen, but the four Pi 4s and the RGB LED festooned enclosure they live in make for an affordable and space-saving cluster to hone your skills on. Whether you’re practicing for the future of software development and deployment, or just looking for something new to play around with, building one of these small-scale clusters is a great way to get in on the action.


An Epic Tale Of Reset Line Detective Work

The Pine64 folks have given us so many tasty pieces of hardware over the last few years, but it’s fair to say that their products are for experimenters rather than consumers, and can thus be a little rough around the edges at times. Their Clusterboard, for example, is a Mini-ITX PCB which takes up to seven of their SOPINE A64 compute modules and networks them for use as a cluster by means of an onboard Gigabit Ethernet switch. It’s a veritable powerhouse, but it has an annoying bug: it appears reluctant to restart when told to. [Eric Draken] embarked upon a quest to fix this problem, and while he got there in the end, his progress makes for a long and engrossing read.

We journey through the guts of the board and along the way discover a lot about how reset signals are generated. The eventual culprit is back-EMF generated through the reset distribution logic itself, which keeps the pulled-low line from ever quite descending into logic 0 territory once it has been pulled high; the solution is an extremely simple application of a diode. For anyone who wishes to learn about logic-level detective work, it’s well worth a look. Meanwhile the board itself, with its 28 ARM cores, appears to have plenty of potential. It’s even a board we’ve mentioned before, in a personal supercomputer project.
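
If you’re wondering what that kind of logic-level sleuthing looks like in numbers, here’s a toy calculation: current back-fed into a line being pulled low lifts the node to I × R above ground, and if that lands above the chip’s V_IL threshold, the line never reads as a valid logic 0. All the values below are made-up stand-ins, not measurements from the Clusterboard:

```python
# Toy logic-level check: does a back-fed current keep the line above V_IL?
INJECTED_CURRENT_A = 0.0005    # 0.5 mA back-fed into the line (made-up value)
PATH_RESISTANCE_OHM = 2200     # effective resistance to ground (made-up value)
V_IL_MAX = 0.8                 # highest voltage still read as logic 0 (typical LVCMOS)

node_voltage = INJECTED_CURRENT_A * PATH_RESISTANCE_OHM
verdict = "a valid" if node_voltage <= V_IL_MAX else "NOT a valid"
print(f"Line sits at {node_voltage:.2f} V -- that's {verdict} logic low")
```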

Building A Cheap Kubernetes Cluster From Old Laptops

Cluster computing is a popular choice for heavy-duty computing applications. At the base level, there are hobby clusters often built with Raspberry Pis, while the industrial level involves data centers crammed with servers running at full tilt. [greg] wanted something cheap, but with x86 support – so he set about building a rig his own way.

The ingenious part of [greg]’s build comes in the source computers. He identified that replacement laptop motherboards were a great source of computing power on the cheap, with a board packing an i7 CPU and 16 GB of RAM available from eBay for around £100, and i5 models going for even less. With four laptop motherboards on hand, he set about stacking them in a case, powering them, and hooking them up with the bare minimum required to get them working. With everything housed in an old server case and some 3D-printed parts holding it all together, he was able to get a 4-node Kubernetes cluster up and running for an absolute bargain price.
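
Once a cluster like that is alive, a few lines with the official Kubernetes Python client are enough to sanity-check that all four nodes showed up and see what each brings to the table – a minimal sketch (not [greg]’s tooling), assuming a working kubeconfig on hand:

```python
# List the nodes in the cluster along with their capacity and architecture.
from kubernetes import client, config

config.load_kube_config()        # uses the local ~/.kube/config
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    cpu = node.status.capacity.get("cpu")
    mem = node.status.capacity.get("memory")
    arch = node.status.node_info.architecture
    print(f"{node.metadata.name}: {cpu} CPUs, {mem} memory, arch={arch}")
```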

We haven’t seen spare laptop motherboards used in such a way before, but we could definitely see this becoming more of a thing going forward. The possibilities of a crate full of deprecated motherboards are enticing for those building clusters on the cheap. Of course, more nodes is more better, so check out this 120 Pi cluster to satiate your thirst for raw FLOPs.