These Bit Twiddling Tricks Will Make Your Coworkers Hate You

In the embedded world, twiddling a few bits is expected behavior. Firmware is far enough down the stack that the author may care about the number of bits and bytes used, or needs to work with registers directly to make the machine dance. Usually these operations are confined to the typical shifting and masking but sometimes a problem calls for more exotic solutions. If you need to descend down these dark depths you invariably come across the classic Bit Twiddling Hacks collected by [Sean Eron Anderson]. Here be dragons.

Discussions of bit math are great opportunities to revisit Wikipedia’s superb illustrations

Bit Twiddling Hacks is exactly as described; a page full of snippets and suggestions for how to perform all manner of bit math in convenient or efficient ways. To our surprise upon reading the disclaimer at the top of the page, [Sean] observes that so many people have used the contents of the page that it’s effectively all been thoroughly tested. Considering how esoteric some of the snippets are we’d love to know how the darkest corners found use.

The page contains a variety of nifty tricks. Interview content like counting set bits makes an early appearance.  There’s more esoteric content like this trick for interleaving the bits in two u16’s into a single u32, or rounding up to the next power of two by casting floats. This author has only been forced to turn to Bit Twiddling Hacks on one occasion: to sign extend the output from an unfortunately designed sensor with unusual length registers.

Next time you need to perform an operation with bitmatch, check out Bit Twiddling Hacks. Have you ever needed it in production? How did you use it? We’d love to hear about it in the comments.

32 thoughts on “These Bit Twiddling Tricks Will Make Your Coworkers Hate You

  1. I feel like a huge number of these are “barrel shifters are awesome.” Which of course means if you’re on an architecture that doesn’t have one, many of those trucks will actually be pretty bad.

    1. If you scour the Java JDK source code you find many references to HD (“Hacker’s Delight”) pages. Although the mentioned website contains many even more obscure yet useful tricks like conditionally doing something without branches.

      1. An e-Reader can be a good alternative. At best that A4 sized one of a coworker. But that was really expensive, I think around € 700,-
        Books were fine for their time, but they also have drawbacks: They can be heavy and lack a proper search function.

        1. If you worked in COMPASS (the assembler for the CDC), you’d get plenty of bit-twiddling practice packing an unpacking 6-bit characters (12-bit if they were lower case) from those 60-bit words. You’d also experience 1’s complement arithmetic, with its distinct positive and negative zeros. You’d practice writing code where some registers can be used for fetching, and different ones for storing, but there is neither a fetch nor store instruction. You’d understand why it is sometimes useful to set an address register to the contents of itself!

          It almost seems like they were trying to break all the rules, until you realize the machine was developed before the rules of modern byte-addressable machines became so widespread.

  2. in the early 1990s I discovered what was later named the “Carry Rippler”: a bit trick to iterate over all “subsets” of one-bits in a word.

    void subsets(int v)
    int u=0;
    do printf(“%d\n”,u); while(u=u-v&v);

    I once used it in a program to generate all addresses in a sparse table.

    It is also used in the Stockfish chess engine:

  3. Counting set bits was an opcode on the old CDC Cyber 73 I first learned on. I never understood what it was for until much later, when I learned that NSA code breakers used it for cryptanalysis.

  4. That’s a good read if you want your brain to hurt (or the one of your coworkers), but it remembers me this phrase “Do not optimize too early” or something. This does not mean “Do not optimize at all and use 100 frameworks”!
    But if you like (or need to…) writing code for low-computing-power microcontrollers those things from the link might be useful.

    1. Except for a low-computing-power microcontroller, half of those tricks result in *worse* code. Many for instance, don’t have a barrel shifter, so the “compute the sign of an integer” or “absolute value without branching” tricks are incredibly bad if the compiler’s stupid, although even in those cases if you cast it as a rotate rather than a shift down, it’d be faster. That is, instead of (mask = (v >> 7)) you do (mask = (v <>7) ). Sometimes the compiler’s smart enough to recognize some of these tricks. Most of the time not so much.

      It’s unfortunate, because those tricks seem to be optimized for x86 – and optimization at this level for x86 has very little benefit outside of a few applications. And sadly there’s not a lot of explanation for many of them.

  5. We haven’t used anywhere nearly all of those but we’ve used a chunk, particularly ways of calculating parity very quickly for generating packets for communicating with the chips we designed. (Fun: programming a bitbang SPI interface using a digital pattern generator on automated test equipment.) I separately figured out how to calculate absolute values without branching while playing Human Resource Machine, a cute programming game on Steam.

  6. PSA: if you do any gpu programming in any form, this page is incredibly useful! Especially the conditionally do something without branching. These aren’t dragons, they’re magic tunes to be used properly!

  7. My fave trick that I didn’t see there but expected to: endian-swapping. Say you want byte k from a word of The Wrong Endianness, you can XOR the address with (byte address granularity – 1), i.e. 3 for 32-bit or 7 for 64-bit:

    int input;
    int *ip=&input;
    unsigned char *cpi=(unsigned char *) ip;

    unsigned char ch=cpi[k ^ 3];

    There’s probably a better way but this has always amused me.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.