Using Integer Addition To Approximate Float Multiplication

Once the domain of esoteric scientific and business computing, floating point calculations are now practically everywhere. From video games to large language models and kin, it would seem that a processor without floating point capabilities is pretty much a brick at this point. Yet the truth is that integer-based approximations can be good enough to hit the required accuracy. One example is approximating floating point multiplication with integer addition, something [Malte Skarupke] recently had a poke at, based on the integer-addition-only LLM approach proposed by [Hongyin Luo] and [Wei Sun].

As for the way this works, it does pretty much what it says on the tin: add the two floating point inputs as integer values, then adjust the exponent. That adjustment factor is what gets you close to the answer, but as the article and the comments on it illustrate, there are plenty of issues and edge cases to concern yourself with. These include under- and overflow, but also special floating point inputs like zeroes, subnormals, and NaNs.
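
To get a rough feel for the core trick, here is a minimal Python sketch. It assumes positive, normal 32-bit floats and skips all of the edge-case handling discussed above: reinterpret both inputs as integer bit patterns, add them, and subtract the exponent bias so the exponent isn’t counted twice.

```python
import struct

BIAS = 127 << 23  # exponent bias of an IEEE 754 single, shifted into the exponent field

def f2i(x: float) -> int:
    """Reinterpret a float32's bits as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def i2f(n: int) -> float:
    """Reinterpret an unsigned 32-bit integer as a float32."""
    return struct.unpack("<f", struct.pack("<I", n & 0xFFFFFFFF))[0]

def approx_mul(a: float, b: float) -> float:
    """Approximate a*b for positive, normal floats by adding their bit patterns."""
    return i2f(f2i(a) + f2i(b) - BIAS)

for a, b in [(1.5, 2.25), (3.14159, 2.71828), (0.125, 80.0)]:
    approx, exact = approx_mul(a, b), a * b
    print(f"{a} * {b}: approx {approx:.5f}, exact {exact:.5f}, "
          f"error {abs(approx - exact) / exact:.2%}")
```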

Unlike in scientific calculations, where even minor inaccuracies tend to propagate and cause much larger errors down the line, graphics and LLMs do not care that much about floating point precision, so the roughly 7.5% maximum error of the integer approach is good enough. The question is whether it’s truly more efficient, as the paper suggests, rather than merely a fallback, as seen with e.g. integer-only audio decoders for platforms without an FPU.

Since one of the nice things about FP-focused vector processors like GPUs and their derivatives (tensor, ‘neural’, etc.) is that they can churn through a lot of data very efficiently, shifting this work onto a CPU’s ALU and expecting (energy) improvements seems rather optimistic.

How Hard Is It To Write A Calculator App?

How hard can it be to write a simple four-function calculator program? After all, computers are good at math, and making a calculator isn’t exactly blazing a new trail, right? But [Chad Nauseam] will tell you that it is harder than you probably think. His post starts with a screenshot of the iOS calculator app with a mildly complex equation. The app’s answer is wrong. Android’s calculator does better on the same problem.

What follows is a bit of a history lesson and a bit of a math lesson combined. As you might realize, the inherent problem with computers and math isn’t that they aren’t good at it. Floating point numbers have finite precision, and this leads to problems, especially in operations that combine very large and very small numbers.
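
A couple of lines of Python (whose floats are IEEE 754 doubles) show the kind of surprise this causes:

```python
# Adding a small number to a huge one silently loses the small one entirely:
big = 1e16
print(big + 1 - big)         # 0.0, not 1.0: the +1 falls below the spacing of doubles near 1e16

# And familiar decimal fractions are not exactly representable in binary at all:
print(0.1 + 0.2 == 0.3)      # False
print(f"{0.1 + 0.2:.17f}")   # 0.30000000000000004
```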

Indeed, any floating point format can represent only a finite set of values, so there are infinitely more real numbers it can’t represent than ones it can. But the same is true of a calculator. Think about how many digits you are willing to type in, and how many digits you want out. All you really need is for each of those digits to be correct, and that’s a much smaller set of numbers.

Continue reading “How Hard Is It To Write A Calculator App?”

A Die-Level Look At The Pentium FDIV Bug

The early 1990s were an interesting time in the PC world, mainly because PCs were entering the zeitgeist for the first time. This was fueled in part by companies like Intel and AMD going head-to-head in the marketplace with massive ad campaigns to build brand recognition; remember “Intel Inside”?

In 1993, Intel was making some headway in that regard. The splashy launch of their new Pentium chip was a huge event, at least until an esoteric bug in its floating-point division module came to the public’s attention. [Ken Shirriff]’s excellent account of that kerfuffle goes into great detail about the discovery of the bug by [Dr. Thomas R. Nicely], who stumbled across it while searching for prime numbers. It’s a bit of an understatement to say this bug created a mess for Intel. The really interesting stuff, though, is how the so-called FDIV bug, named after the affected floating-point division instruction, actually manifested in silicon.

We won’t presume to explain it better than [Professor Ken] does, but the gist is that floating-point division in the Pentium relied on a lookup table implemented in a programmable logic array on the chip. The bug was caused by five missing table entries, and [Ken] was able to find the corresponding PLA defects on a decapped Pentium. What’s more, his analysis suggests that Intel’s characterization of the bug as a transcription error is a bit misleading; the pattern of the missing entries in the lookup table is more consistent with a mathematical error in the program that generated the table.
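
For reference, the canonical test case that exposed the bug can be typed into any language with IEEE 754 doubles. On a correct FPU the residual below is zero; a flawed Pentium famously returned 256.

```python
# 4195835 / 3145727 is the classic FDIV check: correct hardware gives a quotient
# of about 1.333820449 and a residual of 0.0, while the buggy chip gave 256.
x, y = 4195835.0, 3145727.0
print(x / y)
print(x - (x / y) * y)
```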

The Pentium bug was a big deal at the time, and in some ways a master class in how not to handle a complex technical problem. To be fair, this was the first time something like this had happened on a global scale, so Intel didn’t really have a playbook to go by. [Ken]’s account of the bug and the dustup surrounding it is first-rate, and if you ever wanted to really understand how floating-point math works in silicon, this is one article you won’t want to miss.

Fixed Point Math Exposed

If you are used to writing software for modern machines, you probably don’t think much about computing something like one divided by three. Modern computers handle floating point quite well. However, in constrained systems, there is a trap you should be aware of. While modern compilers are happy to let you use and abuse floating point numbers, the hardware is often woefully slow. It also tends to eat up lots of resources. So what do you do? Well, as [Low Byte Productions] explains, you can opt for fixed-point math.

In theory, the idea is simple. Just put an arbitrary decimal point in your integers. So, for example, if we have two numbers, say 123 and 456, we could remember that we really mean 1.23 and 4.56. Adding, then, becomes trivial since 123+456=579, which is, of course, 5.79.
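
Here’s a minimal sketch of that scheme in Python, with a decimal scale factor of 100 (two implied fraction digits) and positive values only; a real embedded implementation would more likely pick a binary scale factor so the rescaling becomes a shift.

```python
SCALE = 100  # two implied decimal fraction digits

def to_fixed(s: str) -> int:
    """Parse a decimal string like '1.23' into a scaled integer (123)."""
    whole, _, frac = s.partition(".")
    return int(whole) * SCALE + int(frac.ljust(2, "0")[:2])

def fixed_to_str(n: int) -> str:
    """Format a scaled integer back into a decimal string, integers only."""
    return f"{n // SCALE}.{n % SCALE:02d}"

a, b = to_fixed("1.23"), to_fixed("4.56")   # 123 and 456
print(fixed_to_str(a + b))                  # 5.79, addition is plain integer addition
# Multiplication picks up an extra factor of SCALE, so divide it back out:
print(fixed_to_str(a * b // SCALE))         # 5.60 (1.23 * 4.56 = 5.6088, truncated)
```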

Continue reading “Fixed Point Math Exposed”

Making Floating Point Calculations Less Cursed When Accuracy Matters

Inverting the earlier exponentiation to reduce floating point arithmetic error. (Credit: exozy)

An unfortunate reality of trying to represent continuous real numbers in a fixed space (e.g. with a limited number of bits) is that this comes with an inevitable loss of both precision and accuracy. Although floating point arithmetic standards – like the commonly used IEEE 754 – seek to minimize this error, some loss of precision across the range of a floating point variable is unavoidable. This is what [exozy] demonstrates, by showing just how big the error can get when dividing the exponential of an input value by the original value. The result is a startling error of over 10%, which leads to the question of how best to fix this.

Obviously, if you have the option, you can simply increase the precision of the floating point variable, from 32-bit to 64- or even 256-bit, but this only gets you so far. The solution [exozy] shows here instead relies on redundant computation, inverting the result of e^x. In a demonstration using Python code (which uses IEEE 754 double precision internally), this almost eradicates the error. Other than proving that floating point arithmetic is cursed, this also raises the question of why this works.
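
The exact expression from [exozy]’s demo isn’t reproduced here, but the same class of problem, and the same style of fix, shows up in a textbook example: evaluating (e^x - 1)/x for small x loses most of its digits to cancellation, while the algebraically equivalent library routine math.expm1 does not.

```python
import math

x = 1e-9
naive = (math.exp(x) - 1.0) / x    # subtracting two nearly equal numbers: cancellation
better = math.expm1(x) / x         # expm1 computes e^x - 1 without the cancellation
print(naive)    # 1.000000082740371  (wrong already in the 8th digit)
print(better)   # 1.0000000005       (correct to full double precision)
```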

Continue reading “Making Floating Point Calculations Less Cursed When Accuracy Matters”

Looking For Pi In The 8087 Math Coprocessor Chip

Even with ten fingers to work with, math can be hard. Microprocessors, with the silicon equivalent of just two fingers, can have an even harder time with calculations, often taking multiple machine cycles to figure out something as simple as pi. And so 40 years ago, Intel decided to give its fledgling microprocessors a break by introducing the 8087 floating-point coprocessor.

If you’ve ever wondered what was going on inside the 8087, wonder no more. [Ken Shirriff] has decapped an 8087 to reveal its inner structure, which turns out to be closely related to its function. After a quick tour of the general layout of the die, including locating the microcode engine and ROM, and a quick review of the NMOS architecture of the four-decade-old technology, [Ken] dug into the meat of the coprocessor and the reason it could speed up certain floating-point calculations by up to 100-fold. A generous portion of the complex die is devoted to a ROM that does nothing but store constants needed for its calculation algorithms. By carefully examining the pattern of NMOS transistors in the ROM area and making some educated guesses, he was able to see the binary representation of constants such as pi and the square root of two. There’s also an extensive series of arctangent and log2 constants, used for the CORDIC algorithm, which reduces otherwise complex transcendental calculations to a few quick and easy bitwise shifts and adds.
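
To get a feel for what those arctangent constants are for, here is a minimal CORDIC rotation-mode sketch in Python. It is emphatically not the 8087’s microcode, just the textbook algorithm those constants serve: a table of arctan(2^-i) values plus additions and halvings, which hardware implements as right shifts on fixed-point values.

```python
import math

N = 32
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N)]   # the hard-coded constants
K = 1.0
for i in range(N):                                       # cumulative gain of all the micro-rotations
    K *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_sin_cos(angle: float) -> tuple[float, float]:
    """Return (sin, cos) of an angle in [-pi/2, pi/2] using only table lookups,
    additions, and multiplications by 2^-i (right shifts in fixed-point hardware)."""
    x, y, z = K, 0.0, angle
    for i in range(N):
        d = 1.0 if z >= 0.0 else -1.0        # rotate toward a zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return y, x

s, c = cordic_sin_cos(math.pi / 6)
print(s, c)   # ~0.5 and ~0.866
```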

[Ken] has popped the hood on a lot of chips before, finding butterflies in an op-amp and reverse-engineering a Sinclair scientific calculator. But there’s something about seeing constants hard-coded in silicon that really fascinates us.

The New Xbox: Just How Fast Is 12 TeraFLOPS?

Microsoft’s new Xbox Series X, formerly known as Project Scarlet, is slated for release in the holiday period of 2020. Like any new console release, it promises better graphics, more immersive gameplay, and all manner of other superlatives in the press releases. In a sharp change from previous generations, however, suddenly everybody is talking about FLOPS. Let’s dive in and explore what this means, and what bearing it has on performance.
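
For a sense of where a headline figure like 12 TFLOPS comes from, here’s the usual back-of-the-envelope arithmetic. The numbers below are the commonly quoted Xbox Series X GPU specs (52 compute units of 64 shader ALUs at 1.825 GHz), not figures from the article itself, with each fused multiply-add counted as two floating point operations.

```python
compute_units = 52
alus_per_cu = 64
clock_hz = 1.825e9
ops_per_alu_per_clock = 2          # a fused multiply-add counts as a multiply plus an add

flops = compute_units * alus_per_cu * clock_hz * ops_per_alu_per_clock
print(f"{flops / 1e12:.2f} TFLOPS")   # ~12.15 TFLOPS
```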

Continue reading “The New Xbox: Just How Fast Is 12 TeraFLOPS?”