CRC Networking and How To Understand the Cyclic Redundancy Check

CRC stands for Cyclic Redundancy Check (CRC) — an error-detecting code used to determine if a block of data has been corrupted. The mathematics behind CRCs may initially appear daunting, but don’t have to be. In this post, I present an alternative explanation useful to the software implementor of CRCs.

CRCnetworking error detection

From time to time, I’ve had to implement various CRC calculations for customers for different data communications systems. I’ve even had to reverse engineer unknown CRC computations to interoperate with existing systems, which is tricky but doable.

However, that raises several questions: What is CRC? How does CRC work? And what is CRC in networking? This “CRC explained” blog post will answer those questions and more, helping you understand the cyclic redundancy check, CRC networking, CRC error detection and everything you need to know about CRC.

Contents

What is Cyclic Redundancy Check?

CRC stands for Cyclic Redundancy Check. It is an error-detecting code used to determine if a block of data has been corrupted. Simple to implement in hardware, CRC is a great technique for detecting common transmission errors.

The mathematics behind CRCs initially appear daunting, but don’t have to be. The idea is, given a block of N bits, we can compute a CRC checksum of a sort to see if the N bits were damaged in some way, by transit over a network for example.

The extra data we transmit with this checksum is the “Redundancy” part of CRC, and the second C just means this is a “Check” to see if the data are corrupted (as opposed to an ECC code, which both detects and corrects errors). The “cyclic” part means it uses cyclic codes, which is where people tend to get lost in the math. (If you look at the Wikipedia page for cyclic codes, it starts by discussing Galois field arithmetic and goes uphill from there.)

What is CRC in Networking?

Commonly used in digital networks to detect accidental changes to digital data, a CRC in networking produces a fixed-length data set based on the build of a file or larger data set. CRCs are ubiquitous and present in many of the link layers that TCP/IP is used over. For instance, Ethernet and Wi-Fi packets both contain CRC code.

I’d like to present an alternative explanation of CRC here that is useful to anyone involved in CRC networking.

How Does CRC Work?

Error correction goes way back in computing. One of the first things I ran into while understanding the ASCII character set was parity. The seven-bit ASCII character set in some computers used the eighth bit as a check bit to see if the character had been transmitted correctly. If the character had an odd number of ones in the lower-seven bits, the eighth bit was set to a one, otherwise zero.

So, if the receiving device (modem, computer terminal, etc.) got an 8-bit quantity where the number of ones in it was not even, it knew that character had been corrupted.

This is a simplistic scheme, but some important concepts shared with the CRC are here:

We added the check data to the transmitted data (a redundancy), and
It isn’t perfect — it only detects certain errors that it was designed to check.

Specifically, the parity bit successfully detects all one-bit errors in each seven-bit block of data but potentially can fail if worse data corruption occurs.

A Better Data Scheme for a Cyclic Redundancy Check

What might a better data scheme be for a CRC? One reasonably obvious answer is a modular sum. This is the scheme used by the IPV4 TCP checksum. The document that describes this is RFC 793, the Transmission Control Protocol RFC.

The checksum is defined as the 16-bit quantity obtained by doing a one’s-complement sum of all the 16-bit quantities in a TCP packet (header and data), with the checksum field taken to have value zero in the computation. The resulting sum is complemented (ones flipped to zeros, and vice versa) and stored in the checksum field of the header.

Simple to compute, and more robust than parity, but why isn’t it good enough? Well, for one thing, we’d like to know more than just that the right number of one and zero bits arrived. We’d also like to know if they are in the right order.

Example: Compute an 8-bit Checksum

As an example, let’s compute an 8-bit checksum for the following packet:

AA BB CC DD EE FF

We can just sum up the values, say with “bc” on unix:

% bc -l

bc 1.06

Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.

This is free software with ABSOLUTELY NO WARRANTY.

For details type `warranty'.

obase=16

ibase=16

AA + BB + CC + DD + EE + FF

4FB

So, our 8-bit checksum after discarding the high nibble would be FB. But say our packet had some stuff scrambled in byte order:

AA BB EE CC DD FF

We get the same checksum! AA + BB + EE + CC + DD + FF = 4FB

Why are CRCs Used?

CRCs do not have the properties of the 8-bit checksum example above. Now we get to why CRCs are used rather than simple parity or modular sum schemes. A cyclic redundancy check can detect not just bit corruption but also bit order corruption. (This is where most articles start discussing polynomial arithmetic. But let’s hold off on that.)

The idea behind a CRC is: We treat the message as a big number, choosing a special number to divide the message by (referred to as the “CRC polynomial” in the literature) and the remainder of the division is the cyclic redundancy check. Intuitively, it should be obvious that we can detect more than single-bit errors with this scheme. Additionally, I think it is obvious that some divisors are better than others at detecting errors.

Example: Compute a Packet in Base 10

Let’s work an example in base 10. Let’s say our packet when treated as a base 10 number works out to be:

103209357

And let’s say the divisor we selected is 7:

103209357 (message) % 7 (CRC polynomial) = 6 (remainder)

That’s the basics of the math behind CRCs. No Galois fields mentioned, and no polynomials, other than I said one number is called the polynomial.

CRC Polynomials

I stated it should be obvious that some CRC polynomials are better than others. Let’s consider what happened if our CRC polynomial was 2 instead of 7. Any odd value will result in a remainder of 1, any even value will result in a remainder of 2. That is, say our message was dramatically corrupted to from 103209357 to 197867.

Before transmission: 103209357 % 2 = 1. After receipt: 197867 % 2 = 1.

Wow, 2 wasn’t a good choice, was it? If we’d used 7, we would have noted 197867 % 7 = 5, and we could have detected the error. But note that even 7 isn’t a very good divisor to use. If our message was corrupted to the value 13, 13 % 7 = 6, which is the same as the original number! More on what makes a good choice as a divisor below.

First, let’s dive in a little deeper. Computers typically use base 2 math, rather than base 10. So, one thing to think about is dividing things is more computationally complex than addition on most processor architectures. And treating the message as one big number opens us up to the wonderful world of arbitrary precision arithmetic — doing math on numbers larger than the instruction set of the processor provides for.

Well, what all that talk about Galois fields and polynomial arithmetic is really about is saying that we want to do a special kind of math to do a CRC. We want to do binary arithmetic without any carries or borrows. If you dig into what the oracle Knuth has to say in The Art of Computer Programming, Volume 2, Seminumerical Algorithms, and read section 4.6, on polynomial arithmetic, he notes that you can do binary polynomial arithmetic simply by ignoring carries.

Example: Polynomial Arithmetic

For example, let’s say we want to add 0101b + 1110b. With carries we see:

         0101b

        +1110b

        ------

10011b

Let’s just do the binary math without carries and see the difference:

         0101b

        +1110b

        ------

         1011b

Let’s play along with the literature and say we treated the 1s and 0s as coefficients of a polynomial in x, and added the two polynomials with modulo 2 arithmetic:

$0x^{3}+1x^{2}+0x+1$

$1x^{3}+1x^{2}+1x+1$

In this case, we add like terms, can’t combine unlike terms (no carries!) and we get:

$1x^{3}+2x^{2}+1x+2$

And furthermore, we use modulus 2 arithmetic on the coefficients:

$(1\%2)x^{3}+(2\%2)x^{2}+(1\%2)x+1=1x^{3}+0x^{2}+1x+1=1011b$

So, in the end, all that polynomial and Galois field stuff is just saying “we want to divide the message with a number using binary math with no carries or borrows.”

This is now looking like something I can implement without writing a symbolic math package. In fact, in this system of mathematical rules, both addition and subtraction may be implemented with XOR or “exclusive or” — because addition and subtraction have the same effect. One nice thing about XOR is that it’s an efficient operation on a computer.

In the real world, messages are usually longer than four bits. So, the division you will need to do in your CRC computation requires arbitrary precision arithmetic. But no worries, the fact that there are no carries or borrows is going to make this fairly cheap to implement. You’ll use shifts and XORs and possibly table lookups instead of complex division.

As I mentioned above, the choice of the CRC polynomial is key to the error-detecting scheme. Some values are much better than others. Most people just choose one of the commonly used CRC polynomial values more or less at random. The polynomials have common nicknames, such as CRC32, CCITT-16, etc. But no cyclic redundancy check is going to protect you against all possible data corruption — it’s a probabilistic problem.

Use the Cyclic Redundancy Check to Ensure Your Data is Protected

If you really want to ensure data protection, you need to do some analysis. The Art of Error Correcting Coding by Morelos-Zaragoza is a good place to start. The things you need to determine beforehand are:

How many bits are you going to be computing the CRC over?
What kinds of errors are likely in your communication channel?
Individual bits?
Burst errors?
Are runs of leading zeroes likely in your data? Consider that the CRC division is not affected by these.

People have come up with variations of the basic CRC computation to handle typical corruption scenarios for a given application. If certain errors are more likely than others, your CRC computation should be more sensitive to those kinds of errors.

CRC References and Recommendations

Finally, despite cyclic redundancy check calculations being around since the 1960s, I didn’t find a paper that quantitatively evaluated all possible CRC polynomials for common cases until this one in 2004. The authors, Koopman and Chakravarty, discovered that, in many cases, previously unknown CRC polynomials were better than previously known and traditionally used polynomials. This is something to research when choosing a CRC for a given application.

CRC computations are complexified by considerations of things like bit order, and byte order within words. Data transmission hardware often receive data bit by bit in msb-first order, whereas we software engineers typically do math on binary numbers in a processor in lsb-first order. So, you may have to interoperate with a hardware implementation that treats the message bits in “reverse” order. Then there is the whole big-endian-vs-little-endian issue of how we treat the words we are doing math on.

There is an attempt by Ross Williams to completely parameterize the CRC computation and generate code for a given set of parameters. His toolbox generates both easily understood and high-performance variations for a given set of CRC parameters. This is a great paper — it is very accessible — and I strongly recommend reading it.

In particular, chapter 9 of Williams gives the classical implementation of a shift-register/XOR-based CRC computation.
Chapter 10 gives the classical table lookup version which most real software implementations use.

Unfortunately, however, I have found that Williams’ attempts to completely parameterize CRC computations fails because when I’ve had to interoperate with “real-world” systems, I always end up having some weird swizzle in the algorithm, such as a DSP that can’t address individual bytes, or the message bit order is msb-first but the polynomial is lsb-first.

So, I’ve never been able to use his code unmodified, but it is a great starting point for implementing and understanding a table-driven CRC implementation. Basically, the partial computations for byte- or word-at-a-time arbitrary precision division are stored in a table and looked up rather than computed. There is an obvious set of speed vs. space tradeoffs here depending upon the underlying word size you use for the division, so you can use different table sizes and enjoy higher performance based on how much memory you want to devote to the tables.

Conclusion
Cardinal Peak offers product engineering services that include CRC networking to identify corrupt data, validate data and protect your product’s valuable information. Reach out to our expert software engineers to learn more and discuss your product development.

If you enjoyed this blog and want to see new ones, click below to subscribe to our product engineering newsletter.

What is CRC Networking? Understanding the Cyclic Redundancy Check