From time to time, I’ve had to implement various CRC calculations for customers for different data communications systems. I’ve even had to reverse engineer unknown CRC computations in order to interoperate with existing systems, which is tricky but doable.

CRC stands for Cyclic Redundancy Check. It is an error-detecting code used to determine if a block of data has been corrupted. CRCs are ubiquitous. They are present in many of the link layers that TCP/IP is used over. For instance, Ethernet and Wi-Fi packets both contain CRCs.

The mathematics behind CRCs initially appears daunting, but it doesn’t have to be. The idea is given a block of N bits, let’s compute a checksum of a sort to see if the N bits were damaged in some way, for instance by transit over a network. The extra data we transmit with this checksum is the “Redundancy” part of CRC, and the second C just means this is a “Check” to see if the data are corrupted (as opposed to an ECC code, which both detects and corrects errors). The Cyclic part means it uses cyclic codes, which is where people tend to get lost in the math. If you look at the Wikipedia page for cyclic codes, it starts by discussing Galois field arithmetic and goes uphill from there.

I’d like to present an alternative explanation here that is useful to the software implementor of CRCs.

Error correction goes way back in computing. One of the first things I ran into while understanding the ASCII character set was parity. The seven-bit ASCII character set in some computers used the 8th bit as a check bit to see if the character had been transmitted correctly. If the character had an odd number of ones in the lower 7 bits, the 8th bit was set to a 1, otherwise 0.

So if the receiving device (modem, computer terminal, etc.) got an 8-bit quantity where the number of ones in it was not even, it knew that character had been corrupted. This is a simplistic scheme, but some important concepts shared with the CRC are here:

1. We added the check data to the transmitted data (a redundancy), and
2. It isn’t perfect — it only detects certain errors that it was designed to check.  (Specifically, the parity bit successfully detects all one-bit errors in each 7-bit block of data, but potentially can fail if worse data corruption occurs.)

What might a better scheme be? One reasonably obvious answer is a modular sum. This is the scheme used by the IPV4 TCP checksum. The document that describes this is RFC 793, the Transmission Control Protocol RFC. The checksum is defined as the 16-bit quantity obtained by doing a one’s-complement sum of all the 16-bit quantities in a TCP packet (header and data), with the checksum field taken to have value 0 in the computation. The resulting sum is complemented (1s flipped to 0s, and vice versa) and stored in the checksum field of the header.

Simple to compute, and more robust than parity. But why isn’t it good enough? Well, for one thing, we’d like to know more than just that the right number of one and zero bits arrived. We’d also like to know if they are in the right order. As an example, let’s compute an 8-bit checksum for the following packet:

AA BB CC DD EE FF

We can just sum up the values, say with “bc” on unix:

% bc -l
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type warranty'.
obase=16
ibase=16
AA + BB + CC + DD + EE + FF
4FB

So our 8-bit checksum after discarding the high nibble would be FB. But say our packet had some stuff scrambled in byte order:

AA BB EE CC DD FF

We get the same checksum! AA + BB + EE + CC + DD + FF = 4FB

CRCs do not have this property. Now we get to why CRCs are used rather than simple parity or modular sum schemes. A CRC can detect not just bit corruption, but also bit order corruption. This is where most papers start discussing polynomial arithmetic. But let’s hold off on that.

The idea behind a CRC is, we treat the message as a big number, we choose a special number to divide the message by (referred to as the “CRC polynomial” in the literature), and the remainder of the division is the CRC. Intuitively, it should be obvious that we can detect more than single bit errors with this scheme. Additionally, I think it is obvious that some divisors are better than others at detecting errors.

Let’s work an example in base 10. Let’s say our packet when treated as a base 10 number works out to be:

103209357

And let’s say the divisor we selected is 7:

103209357 (message) % 7 (CRC polynomial) = 6 (remainder)

That’s the basics of the math behind CRCs. No Galois fields mentioned, and no polynomials, other than I said one number is called the polynomial.

I stated it should be obvious that some CRC polynomials are better than others. Let’s consider what happened if our CRC polynomial was 2 instead of 7. Any odd value will result in a remainder of 1, any even value will result in a remainder of 2. That is, say our message was dramatically corrupted to from 103209357 to 197867.

Before transmission: 103209357 % 2 = 1. After reciept: 197867 % 2 = 1.

Wow, 2 wasn’t a good choice, was it? If we’d used 7, we would have noted 197867 % 7 = 5, and we could have detected the error. But note that even 7 isn’t a very good divisor to use. If our message was corrupted to the value 13, 13 % 7 = 6, which is the same as the original number!

More on what makes a good choice as a divisor below. First, let’s dive in a little deeper. Computers typically use base 2 math, rather than base 10. So one thing to think about is dividing things is more computationally complex than addition on most processor architectures. And treating the message as one big number opens us up to the wonderful world of arbitrary precision arithmetic — doing math on numbers larger than the instruction set of the processor provides for.

Well, what all that talk about Galois fields and polynomial arithmetic is really about is saying that we want to do a special kind of math to do a CRC. We want to do binary arithmetic without any carries or borrows. If you dig into what the oracle Knuth has to say in The Art of Computer Programming, Volume 2, Seminumerical Algorithms, and read section 4.6, on polynomial arithmetic, he notes that you can do binary polynomial arithmetic simply by ignoring carries.

For example, let’s say we want to add 0101b + 1110b. With carries we see:

        11
         0101b
        +1110b
        ------
         10011b

Let’s just do the binary math without carries and see the difference:

         0101b
        +1110b
        ------
         1011b`

Let’s play along with the literature and say we treated the 1s and 0s as coefficients of a polynomial in x, and added the two polynomials with modulo 2 arithmetic:

0x^{3}+1x^{2}+0x+1

1x^{3}+1x^{2}+1x+1

In this case, we add like terms, can’t combine unlike terms (no carries!) and we get:

1x^{3}+2x^{2}+1x+2

And furthermore, we use modulus 2 arithmetic on the coefficients:

(1\%2)x^{3}+(2\%2)x^{2}+(1\%2)x+1=1x^{3}+0x^{2}+1x+1=1011b

So, in the end, all that polynomial and Galois field stuff is just saying “we want to divide the message with a number using binary math with no carries or borrows.”

This is now looking like something I can implement without writing a symbolic math package. In fact, in this system of mathematical rules, both addition and subtraction may be implemented with XOR — because addition and subtraction have the same effect. One nice thing about XOR is that it’s an efficient operation on a computer.

In the real world, messages are usually longer than four bits. So the division you will need to do in your CRC computation requires arbitrary precision arithmetic. But no worries, the fact that there are no carries or borrows is going to make this fairly cheap to implement. You’ll use shifts and XORs and possibly table lookups instead of complex division.

As I mentioned above, the choice of the CRC polynomial is key to the error-detecting scheme. Some values are much better than others. Most people just choose one of the commonly used CRC polynomial values more or less at random. The polynomials have common nicknames, such as CRC32, CCITT-16, etc. But no CRC is going to protect you against all possible data corruption — it’s a probabilistic problem.

If you really want to be sure your data are protected, you need to do some analysis. The Art of Error Correcting Coding by Morelos-Zaragoza is a good place to start. The things you need to determine beforehand are: How many bits are you going to be computing the CRC over? And what kinds of errors are likely in your communication channel? Individual bits? Burst errors? Are runs of leading zeroes likely in your data? (Consider that the CRC division is not affected by these). People have come up with variations of the basic CRC computation to handle typical corruption scenarios for a given application. If certain errors are more likely than others, your CRC computation should be more sensitive to those kinds of errors.

Finally, despite CRC calculations being around since the 60s, I didn’t find a paper that quantitatively evaluated all possible CRC polynomials for common cases until this one in 2004. The authors, Koopman and Chakravarty, discovered that, in many cases, previously unknown CRC polynomials were better than previously known and traditionally used polynomials. This is something to research when choosing a CRC for a given application.

CRC computations are complexified by considerations of things like bit order, and byte order within words. Data transmission hardware often receive data bit by bit in msb-first order, whereas we software engineers typically do math on binary numbers in a processor in lsb-first order. So you may have to interoperate with a hardware implementation that treats the message bits in “reverse” order. Then there’s the whole big-endian-vs-little-endian issue of how we treat the words we are doing math on.

There is an attempt by Ross Williams to completely parameterize the CRC computation and generate code for a given set of parameters. His toolbox generates both easily understood and high-performance variations for a given set of CRC parameters. This is a great paper — it is very accessible — and I strongly recommend reading it.

In particular, chapter 9 of Williams gives the classical implementation of a shift-register/XOR-based CRC computation, and chapter 10 gives the classical table lookup version which most real software implementations use. Unfortunately, however, I have found that his attempt to completely parameterize CRC computations fails because when I’ve had to interoperate with “real-world” systems, I always end up having some weird swizzle in the algorithm, such as a DSP that can’t address individual bytes, or the message bit order is msb first but the polynomial is lsb first. So I’ve never been able to use his code unmodified. But it is a great starting point for implementing and understanding a table-driven CRC implementation. Basically the partial computations for byte- or word-at-a-time arbitrary precision division are stored in a table and looked up rather than computed. There is an obvious set of speed vs. space tradeoffs here depending upon the underlying word size you use for the division, so you can use different table sizes and enjoy higher performance based on how much memory you want to devote to the tables.