# Embed with Elliot: Keeping it Integral

If there’s one thing that a lot of small microcontrollers hate (and that includes the AVR-based Arduini), it’s floating-point numbers. And if there’s another thing they hate it’s division. For instance, dividing 72.3 by 12.9 on an Arduino UNO takes around 32 microseconds and 500 bytes, while dividing 72 by 13 takes 14 microseconds and 86 bytes. Multiplying 72 by 12 takes a bit under 2.2 microseconds. So roughly speaking, dividing floats is twice as slow as dividing (16-bit) integers, and dividing at all is five to seven times slower than multiplying.

There’s a whole lot of the time that you just don’t care about speed. For instance, if you’re doing a calculation that only runs infrequently, it doesn’t matter if you’re using floats or slow division routines. But if you ever find yourself in a tight loop that’s using floating-point math and/or doing division, and you need to get a bit more speed, I’ve got some tips for you.

Some of these tips (in particular the integer division tricks at the end) are arcane wizardry — only to be used when the situation really calls for it. But if you’re doing the same calculations repeatedly, you can often gain a lot just by giving the microcontroller numbers in the format it natively understands. Have a little sympathy for the poor little silicon beasties trapped inside!

# An Improvement to Floating Point Numbers

On February 25, 1991, during the eve of the of an Iraqi invasion of Saudi Arabia, a Scud missile fired from Iraqi positions hit a US Army barracks in Dhahran, Saudi Arabia. A defense was available – Patriot missiles had intercepted Iraqi Scuds earlier in the year, but not on this day.

The computer controlling the Patriot missile in Dhahran had been operating for over 100 hours when it was launched. The internal clock of this computer was multiplied by 1/10th, and then shoved into a 24-bit register. The binary representation of 1/10th is non-terminating, and after chopping this down to 24 bits, a small error was introduced. This error increased slightly every second, and after 100 hours, the system clock of the Patriot missile system was 0.34 seconds off.

A Scud missile travels at about 1,600 meters per second. In one third of a second, it travels half a kilometer, and well outside the “range gate” that the Patriot tracked. On February 25, 1991, a Patriot missile would fail to intercept a Scud launched at a US Army barracks, killing 28 and wounding 100 others. It was the first time a floating point error had killed a person, and it certainly won’t be the last.

Punch cards were a standard form of program and data storage for decades, but you’d never know it by looking around today. Card punches and even readers are becoming rare and expensive. Sometimes it takes a bit of hacking [YouTube link] to get that old iron running again!

[Antiquekid3] managed to score an old punch card reader on Ebay, but didn’t have a way to interface with it. The reader turned out to be a Documation M-1000-L. After a bit of searching, [Antiquekid3] managed to find the manual [PDF link] on BitSavers. It turns out that the Documation reader used a discrete output for each row of data. One would think the Documation reader would be a perfect fit for the PDP-8 lurking in the background of [Antiquekid3’s] video, but unfortunately the ‘8 lacks the necessary OMNIBUS card to interface with a reader.

Undaunted, [Antiquekid3] threw some modern hardware into the mix, and used an Arduino Uno as a Documation to Serial interface. The Arduino had plenty of I/O to wire up with the card reader’s interface. It also had a serial interface which made outputting data a snap. The ATmega328 even had enough power to translate each card from one of IBM’s many keypunch formats to serial.

[Antiquekid3’s] test deck of cards turned out to be a floating point data set. Plotting the data with a spreadsheet results in a nice linear set of data points. Of course, no one knows what the data is supposed to mean! Want more punch card goodness? Check out this tweeting punch card reader, or this Arduino based reader which uses LEGO and a digital camera to coax the data from the paper.

# A detailed tutorial on speeding up AVR division

[Alan Burlison] is working on an Arduino project with an accelerometer and a few LEDs. Having the LEDs light up as his board is tilted to one side or another is an easy enough project a computer cowboy could whip out in an hour, but [Alan] – ever the perfectionist – decided to optimize his code so his accelerometer-controlled LEDs don’t jitter. The result is a spectacular blog post chronicling the pitfalls of floating point math and division on an AVR.

To remove the jitter from his LEDs, [Alan] used a smoothing algorithm known as an exponential moving average. This algorithm uses multiplication and is usually implemented using floating point arithmetic. Unfortunately, AVRs don’t have floating point arithmetic so [Alan] used fixed point arithmetic – a system similar to balancing your checkbook in cents rather than dollars.

With a clever use of bit shifting to calculate the average with scaling, [Alan] was able to make the fixed point version nearly six times faster than  the floating point algorithm implementation. After digging into the assembly of his fixed point algorithm, he was able to speed it up to 10 times faster than floating point arithmetic.

The takeaway from [Alan]’s adventures in arithmetic is that division on an AVR is slow. Not very surprising after you realize the AVR doesn’t have a division instruction. Of course, sometimes you can’t get around having to divide so multiplying by the reciprocal and using fixed point arithmetic is the way to go if speed is an issue.

Sure, squeezing every last cycle out of an 8 bit microcontroller is a bit excessive if you’re just using an Arduino as a switch. If you’re doing something with graphics or need very fast response times, [Alan] gives a lot of really useful tips.