Making Floating Point Calculations Less Cursed When Accuracy Matters

Inverting the earlier exponentiation to reduce floating point arithmetic error. (Credit: exozy)

An unfortunate reality of trying to represent continuous real numbers in a fixed space (e.g. with a limited number of bits) is that this comes with an inevitable loss of both precision and accuracy. Although floating point arithmetic standards like the commonly used IEEE 754 seek to minimize this error, some loss of precision across the range of a floating point variable simply cannot be avoided. This is what [exozy] demonstrates by showing just how big the error can get when performing a simple division of the exponential of an input value by the original value. The result is an error of over 10%, which leads to the question of how best to fix this.

Obviously, if you have the option, you can simply increase the precision of the floating point variable, from 32-bit to 64-bit or even 256-bit, but this only gets you so far. The solution [exozy] shows here involves redundant computation: inverting the result of e^x. In a demonstration using Python code (which uses IEEE 754 double precision internally), this almost eradicates the error. Other than proving that floating point arithmetic is cursed, this also raises the question of why this works.
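
The write-up demonstrates the problem and the fix in Python. A minimal sketch of the same idea, assuming the troublesome calculation is the classic (e^x - 1)/x case, looks roughly like this; the function names are ours, and math.expm1 is used only as a reference for comparison:

```python
import math

def naive(x):
    # Straightforward evaluation of (e^x - 1) / x: for tiny x, exp(x) rounds
    # to a value barely above 1.0 and the subtraction cancels away most of
    # the significant bits, so the relative error can exceed 10%.
    return (math.exp(x) - 1.0) / x

def inverted(x):
    # The "redundant computation" trick: compute y = exp(x), then divide by
    # log(y) instead of by x. The rounding error in y appears in both the
    # numerator and the denominator, so it largely cancels out.
    y = math.exp(x)
    if y == 1.0:  # x so small that exp(x) rounded to exactly 1.0
        return 1.0
    return (y - 1.0) / math.log(y)

for x in (1e-5, 1e-8, 1e-12, 1e-15):
    reference = math.expm1(x) / x  # library routine, used only for comparison
    print(f"x={x:.0e}  naive={naive(x):.16f}  "
          f"inverted={inverted(x):.16f}  reference={reference:.16f}")
```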

Continue reading “Making Floating Point Calculations Less Cursed When Accuracy Matters”

Looking For Pi In The 8087 Math Coprocessor Chip

Even with ten fingers to work with, math can be hard. Microprocessors, with the silicon equivalent of just two fingers, can have an even harder time with calculations, often taking multiple machine cycles to figure out something as simple as pi. And so 40 years ago, Intel decided to give its fledgling microprocessors a break by introducing the 8087 floating-point coprocessor.

If you’ve ever wondered what was going on inside the 8087, wonder no more. [Ken Shirriff] has decapped an 8087 to reveal its inner structure, which turns out to be closely related to its function. After a quick tour of the general layout of the die, including locating the microcode engine and ROM, and a quick review of the NMOS architecture of the four-decade-old technology, [Ken] dug into the meat of the coprocessor and the reason it could speed up certain floating-point calculations by up to 100-fold. A generous portion of the complex die is devoted to a ROM that does nothing but store constants needed for its calculation algorithms. By carefully examining the pattern of NMOS transistors in the ROM area and making some educated guesses, he was able to see the binary representation of constants such as pi and the square root of two. There’s also an extensive series of arctangent and log2 constants, used for the CORDIC algorithm, which reduces otherwise complex transcendental calculations to a few quick and easy bitwise shifts and adds.
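
The 8087’s actual microcode is baked into the die, but the flavor of CORDIC is easy to sketch. Below is a generic rotation-mode CORDIC for sine and cosine in Python, with the arctangent table and gain constant computed up front rather than read out of a ROM; it illustrates the algorithm [Ken] describes, not the 8087’s own implementation:

```python
import math

ITERATIONS = 32
# The table of arctangent constants mentioned above: atan(2^-i) for each step.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
# Each pseudo-rotation stretches the vector slightly; the combined gain is a
# constant, so its reciprocal can be pre-computed (the 8087 keeps constants
# like this in ROM instead of computing them).
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(angle):
    """Rotate the vector (GAIN, 0) toward `angle` using only add/subtract and
    scaling by powers of two (which in fixed-point hardware are bit shifts)."""
    x, y, z = GAIN, 0.0, angle
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0          # rotate toward the target angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return y, x                                # (sin, cos)

s, c = cordic_sin_cos(0.5)
print(s, math.sin(0.5))   # agree to roughly ten decimal places
print(c, math.cos(0.5))
```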

[Ken] has popped the hood on a lot of chips before, finding butterflies in an op-amp and reverse-engineering a Sinclair scientific calculator. But there’s something about seeing constants hard-coded in silicon that really fascinates us.

The New Xbox: Just How Fast Is 12 TeraFLOPS?

Microsoft’s new Xbox Series X, formerly known as Project Scarlett, is slated for release in the holiday period of 2020. Like any new console release, it promises better graphics, more immersive gameplay, and all manner of other superlatives in the press releases. In a sharp change from previous generations, however, suddenly everybody is talking about FLOPS. Let’s dive in and explore what this means, and what bearing it has on performance.
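
As a quick sanity check on where a number like 12 TFLOPS comes from: peak GPU throughput is usually quoted as the ALU count times two operations per cycle (a fused multiply-add counts as a multiply plus an add) times the clock speed. Plugging in the shader configuration widely reported for the Series X (treat these figures as reported specs, not something verified here):

```python
# Widely reported Xbox Series X GPU configuration (treat as assumptions):
compute_units = 52        # active RDNA 2 compute units
alus_per_cu = 64          # shader ALUs per compute unit
ops_per_alu_cycle = 2     # one fused multiply-add = 2 floating point operations
clock_hz = 1.825e9        # GPU clock in Hz

peak_flops = compute_units * alus_per_cu * ops_per_alu_cycle * clock_hz
print(f"{peak_flops / 1e12:.2f} TFLOPS")   # ~12.15 TFLOPS
```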

Continue reading “The New Xbox: Just How Fast Is 12 TeraFLOPS?”

Creating Black Holes: Division By Zero In Practice

Dividing by zero — the fundamental no-can-do of arithmetic. It is somewhat surrounded by mystery, and is a constant source of internet humor, whether it involves exploding microcontrollers, the collapse of the universe, or crashing your own world by having Siri tell you that you have no friends.

It’s also one of the few things gcc will warn you about by default, which caused a rather lively discussion with interesting insights when I recently wrote about compiler warnings. And if you’re running a modern operating system, it might even send you a signal that something’s gone wrong and let you handle it in your code. Dividing by zero is more than theoretical, and serves as a great introduction to signals, so let’s have a closer look at it.

Chances are, the first time you heard about division itself back in elementary school, you were taught that dividing by zero is strictly forbidden — and obviously you didn’t want your teacher to call the cops on you, so you obeyed and refrained from it. But as with many other things in life, the older you get, the less restrictive the rules become, and dividing by zero eventually turned from forbidden into simply being impossible and yielding an undefined result.

And indeed, if a = b/0, it would mean in reverse that a×0 = b. If b itself was zero, the equation would be true for every single number there is, making it impossible to define a concrete value for a. And if b was any other value, no single value multiplied by zero could result in anything non-zero. Once we move into the realms of calculus, we will learn that infinity appears to be the answer, but that’s in the end just replacing one abstract, mind-boggling concept with another one. And it won’t answer one question: how does all this play out in a processor?
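
For floating point values, IEEE 754 actually has an answer: a nonzero value divided by zero is infinity, and zero divided by zero is NaN; it’s integer division that has nowhere to go. A quick Python illustration (numpy is used because plain Python floats raise an exception instead of exposing the raw IEEE 754 behavior):

```python
import numpy as np

a = np.float64(1.0)
b = np.float64(0.0)

with np.errstate(divide="ignore", invalid="ignore"):
    print(a / b)   # inf: IEEE 754 defines x/0 as +/- infinity for nonzero x
    print(b / b)   # nan: 0/0 has no sensible value, so it becomes Not-a-Number

# Integer division is different: there is no "integer infinity", so Python
# checks the divisor up front and raises instead of letting the CPU trap.
try:
    1 // 0
except ZeroDivisionError as error:
    print("integer division:", error)
```

Continue reading “Creating Black Holes: Division By Zero In Practice”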

Understanding Floating Point Numbers

People learn in different ways, but sometimes the establishment fixates on explaining a concept in one way. If that’s not your way, you might be out of luck. If you have trouble internalizing floating point number representations, the Internet is your friend. [Fabien Sanglard] (author of Game Engine Black Book: Wolfenstein 3D) didn’t like the traditional presentation of floating point numbers, so he decided to explain them a different way.
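
His article builds the representation up from scratch; as a quick reference point, here is the standard sign/exponent/mantissa split of a 32-bit IEEE 754 float, pulled apart with Python’s struct module (a generic illustration, not code from the article):

```python
import struct

def float32_fields(value):
    """Return the three bit fields IEEE 754 packs into a 32-bit float."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # the implicit leading 1 is not stored
    return sign, exponent, mantissa

s, e, m = float32_fields(6.5)
print(f"sign={s}  exponent={e} (unbiased {e - 127})  mantissa={m:#08x}")
# 6.5 = 1.625 * 2**2, so the unbiased exponent is 2 and the stored
# mantissa encodes the fractional 0.625.
```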

Continue reading “Understanding Floating Point Numbers”

Embed With Elliot: Keeping It Integral

If there’s one thing that a lot of small microcontrollers hate (and that includes the AVR-based Arduini), it’s floating-point numbers. And if there’s another thing they hate, it’s division. For instance, dividing 72.3 by 12.9 on an Arduino UNO takes around 32 microseconds and 500 bytes, while dividing 72 by 13 takes 14 microseconds and 86 bytes. Multiplying 72 by 12 takes a bit under 2.2 microseconds. So roughly speaking, dividing floats is twice as slow as dividing (16-bit) integers, and dividing at all is five to seven times slower than multiplying.

A lot of the time, you just don’t care about speed. For instance, if you’re doing a calculation that only runs infrequently, it doesn’t matter whether you’re using floats or slow division routines. But if you ever find yourself in a tight loop that’s using floating-point math and/or doing division, and you need to get a bit more speed, I’ve got some tips for you.

Some of these tips (in particular the integer division tricks at the end) are arcane wizardry — only to be used when the situation really calls for it. But if you’re doing the same calculations repeatedly, you can often gain a lot just by giving the microcontroller numbers in the format it natively understands. Have a little sympathy for the poor little silicon beasties trapped inside!
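
To make “numbers in the format it natively understands” concrete, here’s a small sketch of 8.8 fixed-point multiplication. It’s written in Python for readability; on the AVR itself the same arithmetic would be int16_t/int32_t C code, and the format choice here is purely illustrative rather than something taken from the article:

```python
SCALE_BITS = 8
SCALE = 1 << SCALE_BITS   # "8.8" fixed point: 8 integer bits, 8 fractional bits

def to_fixed(value):
    return int(round(value * SCALE))

def fixed_mul(a, b):
    # The product of two 8.8 numbers carries 16 fractional bits;
    # shift it back down to 8. Integer multiplies and shifts only.
    return (a * b) >> SCALE_BITS

def to_float(a):
    return a / SCALE

a = to_fixed(72.3)   # 18509
b = to_fixed(12.9)   # 3302
print(to_float(fixed_mul(a, b)))   # ~932.57 vs. the exact 932.67: the price
                                   # of an 8-bit fraction is about 0.01% error
```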

Continue reading “Embed With Elliot: Keeping It Integral”

An Improvement To Floating Point Numbers

On February 25, 1991, in the closing days of the Gulf War, a Scud missile fired from Iraqi positions hit a US Army barracks in Dhahran, Saudi Arabia. A defense was available – Patriot missiles had intercepted Iraqi Scuds earlier in the year, but not on this day.

The computer controlling the Patriot battery in Dhahran had been operating for over 100 hours by the time of the attack. Its internal clock counted time in tenths of a second; to convert the count to seconds, it was multiplied by 1/10th and shoved into a 24-bit register. The binary representation of 1/10th is non-terminating, and chopping it down to fit the register introduced a small error. This error accumulated with every tick, and after 100 hours the system clock of the Patriot missile system was 0.34 seconds off.
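
Those figures are easy to reproduce. Here is a short Python check of the drift, using the commonly cited truncation error in the stored constant (different write-ups describe the register layout slightly differently, so the 23-bit figure below is an assumption chosen to match the usual numbers):

```python
from fractions import Fraction

FRACTION_BITS = 23   # assumption: bits of 1/10 that survived the chopping
stored_tenth = Fraction(int(Fraction(1, 10) * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = Fraction(1, 10) - stored_tenth   # time lost per 0.1 s count

ticks = 100 * 3600 * 10          # tenth-of-a-second ticks in 100 hours
drift = error_per_tick * ticks

print(float(error_per_tick))     # ~9.5e-08 seconds per tick
print(float(drift))              # ~0.34 seconds after 100 hours
print(float(drift) * 1600)       # ~550 meters at Scud speed
```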

A Scud missile travels at about 1,600 meters per second. In a third of a second it covers over half a kilometer, well outside the “range gate” that the Patriot tracked. And so on February 25, 1991, the Patriot failed to intercept the Scud launched at the US Army barracks, killing 28 and wounding around 100 others. It was the first time a floating point error had killed a person, and it certainly won’t be the last.

Continue reading “An Improvement To Floating Point Numbers”