I’m at the DSI conference in Las Vegas today, presenting a primer for law enforcement investigators on how video compression works and trying to answer the question of why “lossy” compression should be considered reliable for use in courtrooms. (My slides are available here, and I welcome comments on them.) I think I was invited to speak because of our CaseCracker product, which is used to record custodial interrogations, although what I’m discussing is only slightly related.
The lack of trust in digital media compression in a forensic setting is primarily a PR issue for the media compression industry, if such an industry can be said to exist. We use terms like “lossy compression” and “predicted blocks,” which have relatively precise technical meanings. But these terms also have a slightly different meaning to laymen, and that everyday meaning isn’t exactly reassuring if you’re a judge relying on testimony compressed using a lossy compression algorithm. So it’s important for lawyers and investigators working in the criminal justice system to understand how video compression works.
The technical meaning of “lossy compression” is that the process of encoding followed by the process of decoding doesn’t output the exact same file as the source file you started with.
When we say the output file isn’t the same as the source file, what we mean is that a byte-for-byte comparison of the two files will fail—not that a guy protesting his innocence will be turned into a different guy admitting his guilt. In fact, with a well-implemented codec, the mathematical lossiness shouldn’t be subjectively noticeable at all. Intuitively, everyone knows that: Nobody worries about using lossy media compression for recording videos of their kids’ birthdays or pictures of their vacations.
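To make the byte-for-byte point concrete, here is a minimal sketch in Python. It is not a real codec: `lossy_round_trip` is a hypothetical stand-in that simply rounds each sample to a nearby value, and the `source` bytes are made-up grayscale samples rather than actual video. The lossless path uses the standard `zlib` module for contrast.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the output matches the input exactly."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(data: bytes, step: int = 4) -> bytes:
    """Toy stand-in for a lossy codec (an assumption, not a real algorithm):
    round every sample to the nearest multiple of `step`, clamped to 255."""
    return bytes(min(255, ((b + step // 2) // step) * step) for b in data)


# Made-up 8-bit grayscale samples covering every possible value.
source = bytes(range(256)) * 4

lossless = lossless_round_trip(source)
lossy = lossy_round_trip(source)

# Largest difference between any source sample and its lossy counterpart.
max_error = max(abs(a - b) for a, b in zip(source, lossy))

print("lossless identical:", lossless == source)
print("lossy identical:   ", lossy == source)
print("largest sample error:", max_error, "out of 255")
```

The lossy output fails the equality check, yet no sample moves by more than a couple of levels out of 255. That is the sense in which the files “aren’t the same.”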
Still, it’s worth asking how we can state with confidence that lossy compression algorithms should be considered reliable for courtroom use.
In preparing for this talk, I tried to think of all the ways that video compression is lossy. I came up with four independent sub-processes that each contribute to a codec’s overall lossiness:
- Resolution reduction: Often the video resolution is reduced prior to encoding, because this can dramatically cut the number of bits to encode. The result is that the output is softer and shows less fine detail.
- Color sub-sampling: The human eye is less sensitive to changes in chrominance (color) than to changes in luminance (brightness), so chroma is normally subsampled. With the common 4:2:0 scheme, this reduces the color information in the picture by a factor of 4 and the total uncompressed size of the picture by a factor of 2. The color sub-sampling is not usually perceptible except in test patterns explicitly designed to expose it.
- Noise reduction and other pre-filtering: Sometimes video encoders, particularly expensive ones, will filter the image prior to encoding in order to remove noise and otherwise make the image easier to compress. This might result in a softer image in certain cases, but again it normally won’t make any subjective difference in the output.
- Quantization: This is a technical term that loosely translates to “rounding”. The basic idea is that the human eye can’t usually discern small differences in intensity. So why waste a lot of bits faithfully preserving the difference between a 66% gray block and a 69% gray block, when the viewer will perceive them as the same thing anyway? By quantizing both blocks to an average value—say, 67% gray—the encoder is able to dramatically reduce the amount of information it needs to send. (The same concept applies to high frequencies in the image.) Quantization is responsible for the majority of lossiness in video compression, but again, its use is normally not perceptible except in the lab.
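Two of the items above involve simple arithmetic, so here is a rough sketch of both in Python. Everything in it is illustrative: the function names, the 640x480 frame size, and the quantizer step of 5 percentage points are my own assumptions, and real encoders quantize transform coefficients rather than raw gray percentages.

```python
def frame_bytes(width: int, height: int, luma_per_chroma: int) -> int:
    """Uncompressed size in bytes of one 8-bit frame (one luma plane, two chroma planes).

    luma_per_chroma is how many luma samples share each chroma sample:
    1 means no sub-sampling, 4 means each chroma sample covers a 2x2 block.
    """
    luma = width * height
    chroma = 2 * (width * height // luma_per_chroma)
    return luma + chroma


def quantize(percent_gray: int, step: int) -> int:
    """Map an intensity to the index of the bin it falls into."""
    return percent_gray // step


def dequantize(level: int, step: int) -> int:
    """Reconstruct an intensity from a bin index, using the middle of the bin."""
    return level * step + step // 2


# Color sub-sampling: chroma shrinks by 4x, the whole frame by 2x.
full = frame_bytes(640, 480, luma_per_chroma=1)
subsampled = frame_bytes(640, 480, luma_per_chroma=4)
print(full, subsampled)  # 921600 460800

# Quantization: 66% and 69% gray land in the same bin.
STEP = 5  # hypothetical step size, in percentage points of gray
for gray in (66, 69):
    level = quantize(gray, STEP)
    print(gray, "->", level, "->", dequantize(level, STEP))  # both reconstruct to 67
```

The encoder only has to send the bin index, which takes fewer bits than the original value, and the decoder reconstructs 67% gray for both blocks.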
I’m not a lawyer, thank heaven, but I’m pretty sure the relevant legal issue is whether a piece of video evidence accurately reproduces the event it purports to record. And so in a law enforcement setting, the ultimate answer is that someone who is trusted needs to be able to testify that a particular video clip faithfully represents what happened.