Don’t Believe Everything You Read: The Great Electric Toaster Hoax

We’ve all looked up things on Wikipedia and, generally, it is usually correct information. However, the fact that anyone can edit it leads to abuse and makes it somewhat unreliable. Case in point? The BBC’s [Marco Silva] has the story of the great online toaster hoax which erroneously identified the inventor of the toaster with great impact.

You should read the original story, but in case you want a synopsis, here goes: Until recently, the Wikipedia entry for toasters stated that a Scottish man named Alan MacMasters invented the electric toaster in the 1800s. Sounds plausible. Even more so because several books had picked it up along with the Scottish government’s Brand Scottland website. At least one school had a day memorializing the inventor and a TV show also honored him with a special dessert named for Alan MacMasters, the supposed inventor. Continue reading “Don’t Believe Everything You Read: The Great Electric Toaster Hoax”

The 13.5 Million Core Computer

Having a dual- or quad-core CPU is not very exotic these days and CPUs with 12 or even 16 cores aren’t that rare. The Andromeda from Cerebras is a supercomputer with 13.5 million cores. The company claims it is one of the largest AI supercomputers ever built (but not the largest) and can perform 120 Petaflops of “dense compute.”

We aren’t sure about the methodology, but they also claim more than one exaflop of “AI computing.” The computer has a fabric backplane that can handle 96.8 terabits per second between nodes. According to a post on Extreme Tech, the core technology is a 3-plane wafer processor, WSE-2. One plane is for communications, one holds 40 GB of static RAM, and the math plane has 850,000 independent cores and 3.4 million floating point units.

The data is sent to the cores and collected by a bank of 64-core AMD EPYC 3 processors. Andromeda is optimized to handle sparse matrix computations. The company claims that the performance scales “almost linearly.” That is, as you double the number of cores used, you roughly half the total run time.

The machine is available for remote use and cost about $35 million to build. Since it uses 500 kW at peak run times, it isn’t free to operate, either. Extreme Tech notes that the Frontier computer at Oak Ridge National Labs is both larger and more precise, but it cost $600 million, so you’d expect it to be more capable.

Most homebrew “supercomputers” we see are more for learning how to work with clusters than trying to hit this sort of performance. Of course, if you have a modern graphics card, OpenCL and CUDA will let you do some of this, too, but at a much lesser scale.

The Fastest Fourier Transform In The West

An interesting aspect of time-varying waveforms is that by using a trick called a Fourier Transform (FT), they can be represented as the sum of their underlying frequencies. This mathematical insight is extremely helpful when processing signals digitally, and allows a simpler way to implement frequency-dependent filtration in a digital system. [klafyvel] needed this capability for a project, so started researching the best method that would fit into an Arduino Uno. In an effort to understand exactly what was going on they have significantly improved on the code size, execution time and accuracy of the previous crown-wearer.

A complete real-time Fourier Transform is a resource-heavy operation that needs more than an Arduino Uno can offer, so faster approximations have been developed over the years that exchange absolute precision for speed and size. These are known as Fast Fourier Transforms (FFTs). [klafyvel] set upon diving deep into the mathematics involved, as well as some low-level programming techniques to figure out if the trade-offs offered in the existing solutions had been optimized. The results are impressive.

Fastest FFT code benchmarking results in ms
Benchmarking results showing speed of implementation versus the competition (ApproxFFT)

Not content with producing one new award-winning algorithm, what is documented on the blog is a masterclass in really understanding a problem and there are no less than four algorithms to choose from depending on how you rank the importance of execution speed, accuracy, code size or array size.

Along the way, we are treated to some great diversions into how to approximate floats by their exponents (French text), how to control, program and gather data from an Arduino using Julia, how to massively improve the speed of the code by using trigonometric identities and how to deal with overflows when the variables get too large. There is a lot to digest in here, but the explanations are very clear and peppered with code snippets to make it easier and if you have the time to read through, you’re sure to learn a lot!  The code is on GitHub here.

If you’re interested in FFTs, we’ve seen them before around these parts. Fill your boots with this link of tagged projects.