Picking A CRC

You send a file, but how do you know it arrived intact? In other words, how do you know that it didn’t get cut off, garbled, or changed somehow? Simplistically, you could just add up all the bytes in the file — a checksum — and send that along with the file. You compute the checksum when you know the file is good, and the receiver can compare the checksum to see if they match.

However, a simple addition doesn’t catch certain classes of errors, which is why there are better checksum algorithms that, for example, wrap the carry bit around or otherwise modify files with common errors so they produce different checksums. There are two problems with checksums. First, no matter how much you modify the algorithm, the chances that two files produce the same checksum are pretty high. Especially with common error patterns.

For example, assume a very simple algorithm that simply adds the bytes and discards any carry. If a file contains 0x80, 0x80, those numbers essentially cancel each other out. If you replace them with 0, 0, you’ll get the same checksum. To some degree, using anything other than a second copy of the entire file will have this problem — some corruption goes undetected — but you want to minimize the number of times that happens.

The other problem is that a checksum by itself doesn’t let you correct anything. You know the data is bad, but you don’t know why. If you think about it, the simplest checksum is a parity bit on a byte: odd parity is simply summing all the bits together. If the parity bit doesn’t match, you know the byte is bad, but you don’t know why. Any even number of errors goes undetected, but I am sure one-, three-, five-, or seven-bit errors will get caught.

People invent better error-checking codes by devising schemes that can promise they can detect a certain number of bit flips and, at least in some cases, correct them. One of these is the cyclic redundancy check (CRC). It is easy to think of the CRC as a “strong checksum,” but it actually works differently. What’s more, there isn’t just a single CRC algorithm. You have to select or design a particular algorithm based on your needs. Most people pick a “named” implementation like CCITT or Ethernet and assume it must be the best. It probably isn’t.

A CRC is a checksum in the broad sense: you feed it a message, and it gives you a small value that you append, store, or compare later. But unlike a simple additive checksum, a CRC is based on polynomial division over GF(2), which is a fancy way of saying “divide using XOR instead of carries.” That detail matters. It gives CRCs very strong guarantees against common classes of errors, provided you choose the right polynomial for the job. That’s the key. You must choose the right polynomial.

The Polynomial Machine

A CRC treats your message as a long binary polynomial. For example, the byte stream is interpreted as a sequence of coefficients: each bit is either present or absent. The CRC algorithm divides the message polynomial, after shifting it by the CRC width, by a generator polynomial. The remainder is the CRC.

In normal arithmetic, division involves subtraction and carries. In CRC arithmetic, subtraction is XOR. That is why CRC code often looks like this:


if (crc & topbit)
crc = (crc << 1) ^ poly;
else
crc <<= 1;

That loop is implementing polynomial long division, one bit at a time. The generator polynomial is the magic number. For a 16-bit CRC, the polynomial has degree 16. For a 32-bit CRC, degree 32. You will usually see it written as a hex constant, such as 0x1021 for CRC-16/CCITT or 0x04C11DB7 for the classic Ethernet/ZIP/PNG CRC-32. But the polynomial is not just an arbitrary constant. It determines what error patterns the CRC is guaranteed to detect.

What CRCs Catch

A well-chosen CRC can guarantee detection of all single-bit errors, many multi-bit errors, all burst errors up to a certain length, and a very high percentage of longer random errors. The key metric is Hamming distance, often abbreviated HD. If a CRC has HD=4 for messages up to a certain length, it detects all 1-, 2-, and 3-bit errors in messages of that length.

That last qualifier is important. CRC strength is not just “16-bit CRC good, 32-bit CRC better.” It depends on the maximum message length. A polynomial that is excellent for 32-byte embedded packets may be mediocre for kilobyte-size messages. A polynomial standardized decades ago may be familiar but not optimal.

[Philip Koopman’s] work at Carnegie Mellon is the go-to reference here. [Koopman] and [Chakravarty’s] paper on CRC polynomial selection for embedded networks specifically looked for good CRC polynomials for short embedded messages, and [Koopman’s] “Best CRC Polynomials” tables list polynomials by width and Hamming-distance performance. The important takeaway is that many standard polynomials were chosen for historical reasons, not because they are mathematically best for your packet size.

There are plenty of videos that explain CRC, but if you are going to watch a video, you might as well pick one of the many from [Phil Koopman] himself, like the one below.

Famous Does Not Mean Optimal

Take CRC-16/CCITT, polynomial 0x1021. It is found everywhere: telecom, embedded examples, and bootloaders. It is not a terrible polynomial, but it is not automatically the best 16-bit choice. [Koopman’s] tables include other 16-bit polynomials with better Hamming-distance behavior over useful embedded-message lengths.

Likewise, classic CRC-32 using polynomial 0x04C11DB7 is deeply entrenched because of Ethernet, ZIP, gzip, and PNG. But CRC-32C, the Castagnoli polynomial, is often a better general-purpose choice. It has excellent error detection properties over common message lengths and is also supported by hardware instructions on many CPUs. Intel added CRC32 instructions with SSE4.2, and ARM AArch64 also includes CRC acceleration for CRC-32 and CRC-32C.

Of course, standards matter if you have to meet the standard. But if you are designing a new private protocol between your sensor board and your controller, blindly copying the first CRC-16 example from the Internet is not engineering. Pick a polynomial based on your packet length and your risk model.

The Practical Embedded View

For very small messages, even an 8-bit CRC may be adequate. For moderate packets, a good 16-bit CRC is often enough. For firmware images or large records, 32 bits is more reasonable. The point is not to use the biggest CRC you can tolerate. The point is to choose a CRC width and polynomial that give the desired detection strength for your longest protected message.

Also, remember what a CRC does not do. It is not cryptographic. It does not stop malicious tampering. The point of a CRC is to detect accidental corruption, not protect against sophisticated hacking attempts.

Real-world CRC definitions also include bit reflection, initial value, final XOR value, and sometimes byte order conventions. Two CRCs can use the same polynomial and still produce different answers because those parameters differ. That is a common embedded debugging trap. Someone says “CRC-16,” and both sides implement different CRC-16s. CRC-16/IBM, CRC-16/CCITT-FALSE, CRC-16/XMODEM, CRC-16/KERMIT, and CRC-16/MODBUS are not interchangeable.

If you specify a CRC in a protocol document, include at least the width, the polynomial (which can be represented in different formats, by the way), the initial value, if you use reflection on the input or output, and any value to XOR the output with. It is also a great idea to include the computed checksum for ASCII “123456789” so anyone can compare their algorithm to yours.

If you are working with Linux systems, be sure to look at the cksum program which can use several CRC algorithms or other methods like sha1 and other digest-style methods.

Efficiency

Computing CRCs a bit at a time is compact, but it costs eight loop iterations per byte. In some cases, that’s ok, but for performance, you want a table if you can afford the memory. For a 16-bit CRC, the table is only 512 bytes and can be generated at compile time, if desired.

Many CPUs have CRC peripherals. Use them, but read the fine print to make sure they can handle your desired CRC. It is often a good idea to check a hardware implementation against a known-good software implementation before you send it out into the wild. You can do many CRC tests using an online tool. Of course, there are several out there.

Choosing a CRC Today

For a new embedded protocol, define the maximum length of data you need to check. Then decide how many bits of overhead you can afford. Then head to Koopman’s tables to pick a polynomial with good Hamming-distance performance for that length.

The CRC has been around for a long time. But it isn’t just something you grab off the shelf. You need to plan and understand the ramifications of picking different polynomials.

CRCs aren’t the only game in town. Credit card numbers, for example, use check digits. There are other ways you can identify and, in some cases, zap bit errors, too.

5 thoughts on “Picking A CRC

  1. I think today, when in doubt, you can actually use a cryptographic hash function, like sha1, drop bytes until you’re happy with the length.

    When zfs came out I thought it was amateurish overkill, still feels amateurish “Why could they not just pick a good crc?”… but there is so much “compute” now and hardware support for sha and aes, why be frugal?

  2. I was pleasantly surprised when I discovered that my STM32 uC had hardware support for CRC, but then I quickly discovered that it only supports a very specific CRC, and I was already using another one for my RS485 based network, so that ended quickly.

    But if you are free to choose, then using a CRC hat has hardware support for your uC is an obvious advantage.

    When I did some research into CRC’s I found a bunch of algorithms. Some are very quick but use a big LUT (or partial LUT’s), other algorithms can be very small in code size, but execute slowly. In the end I found a very heavily optimized algorithm that was both small and quick (but not extreme), and it ran wel on my Atmega processor, so I kept on using that 16 bit CRC. It’s now hidden deeply in some network stack software library I’ve been using for 20 years, and I can’t even remember what the CRC is. (I can look it up though, I still use the library).

  3. BitTorrent comes to mind where the files are chopped into chunks and a hash created for each chunk and an overall hash for all the chunks. So that if one chunk is corrupt it can be requested again. And after all chunks are transfered the overall hash calculated and compared.

  4. Even on embedded devices I take my message and do a sha-256 hash on it and then tag those bytes onto the end. Most processors can do SHA-256 using specialized parts of the chip, and are very very fast. Some protocols like the old CANBUS don’t have enough space for this sort of thing, but CAN-FD has room for 64bytes, and most of my messages fit in a tiny struct. This way, there is pretty much a 100% that a given instruction is the correct instruction.
    Rarely is this overhead coming at any real cost in performance.

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.