The Importance of Encrypting Video Over IP

I just read this report of a new IP security vulnerability being demonstrated today at the DefCon hacker’s conference in Las Vegas. The new hack has two components:

The attackers are able to view video being streamed across a network, and
The attackers are able to use a man-in-the-middle attack to substitute video of their own choosing for the real stream on its way to a video decoder somewhere on the network.

The linked video shows viscerally how an attacker could foil a security/surveillance video system – a modern-day Thomas Crown Affair. But the underlying problem goes beyond the surveillance market and could conceivably affect a wide range of industries using video over IP. This is a big deal, and vendors of any form of network-connected IP video device – whether a camera, encoder, or decoder – should take note.

In fact, the security researchers who are demonstrating the hack are also helpfully releasing open source software to exploit the vulnerability. So what started out as a vulnerability that was only open to bad guys with a reasonably deep technical understanding has just become widely accessible. Thanks, guys.

At Cardinal Peak, we’ve built a large number of these systems, so I feel I have a reasonably good understanding of why vendors of IP video solutions do what they do. Today most IP video is not encrypted when it is transmitted across the network. That’s bad, and it’s all about cost. (What’s even worse, many products’ user interfaces offer faux security options, like bogus “password-protection,” that might lead enterprise customers to think they’ve got more security than they do.)

The reason that video is sent unencrypted is a corollary to the First Law of Video:

Video – even video compressed using state-of-the-art codecs like H.264 – is BIG.

It takes a lot of bits to send motion imagery across a network. If you want to encrypt that video, you’ll have to encrypt those bits. Encrypting a lot of bits consumes nontrivial computing power – which means you either need a beefier CPU in your embedded video encoder device, or you need dedicated hardware like an FPGA. Either way, adding encryption to your product is going to add to your cost of goods sold.

But wait, it’s worse. Adding more processing power to an embedded device means more heat to dissipate, which increases the need for moving parts like fans, which in turn lower reliability. So in addition to cost, you take a hit on complexity, reliability, and power dissipation.

Even if you somehow get around that, there are more problems. To display the video, you still need to decrypt it, which means you’re going to consume CPU power on the decode side, as well. On modern computers, that probably isn’t a huge problem if all you want to do is display video from a single camera. On the other hand, if you’re trying to display a 16-up display of live video from 16 cameras – well, time to buy some more Intel stock.
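To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 4 Mbit/s per-stream bitrate is my assumption for illustration, not a measurement, and the function name is made up for this example. All it does is compute how many bytes per second have to pass through the cipher for a single stream versus a 16-up display.

```python
def crypto_load_bytes_per_sec(bitrate_mbps: float, streams: int = 1) -> float:
    """Bytes per second that must be encrypted (or decrypted) for the given streams."""
    bits_per_sec = bitrate_mbps * 1_000_000 * streams
    return bits_per_sec / 8


# Assumption: 4 Mbit/s per H.264 stream. Substitute your own camera's bitrate.
ASSUMED_BITRATE_MBPS = 4.0

single = crypto_load_bytes_per_sec(ASSUMED_BITRATE_MBPS)
wall = crypto_load_bytes_per_sec(ASSUMED_BITRATE_MBPS, streams=16)

print(f"one camera: {single:,.0f} bytes/s")  # one camera: 500,000 bytes/s
print(f"16-up wall: {wall:,.0f} bytes/s")    # 16-up wall: 8,000,000 bytes/s
```

Every one of those bytes has to be encrypted once at the encoder and decrypted once at each viewer, on top of the codec work itself.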

And finally: adding security features to a system is always at cross-purposes with making that system easy to use. So solving this problem places a burden on every system integrator and IT administrator.

What a pain!

For standards-based MPEG-4 or H.264 systems, there is a standard called Secure RTP (SRTP, specified in RFC 3711 if you’re looking for some light reading) that, if implemented widely, would basically prevent the hack. Unfortunately, as far as I’m aware, very few encoders, decoders, or network recorders implement SRTP. That may be about to change, assuming news of the hack causes customers to complain to their vendors.
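To give a flavor of how authentication shuts down the video-insertion half of the hack, here is a minimal Python sketch. It is not SRTP and should not be mistaken for it: it borrows only the idea of appending a truncated HMAC-SHA1 tag (80 bits, which is SRTP's default tag length) to each packet, and it omits encryption, key derivation, and replay protection entirely. The function names and the packet contents are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = 10  # 80-bit tag, matching SRTP's default HMAC-SHA1 tag length


def protect(key: bytes, packet: bytes) -> bytes:
    """Append a truncated HMAC-SHA1 tag so the receiver can detect forgeries."""
    tag = hmac.new(key, packet, hashlib.sha1).digest()[:TAG_LEN]
    return packet + tag


def verify(key: bytes, protected: bytes) -> Optional[bytes]:
    """Return the original packet if the tag checks out, otherwise None."""
    if len(protected) < TAG_LEN:
        return None
    packet, tag = protected[:-TAG_LEN], protected[-TAG_LEN:]
    expected = hmac.new(key, packet, hashlib.sha1).digest()[:TAG_LEN]
    # compare_digest avoids leaking how many tag bytes matched via timing
    return packet if hmac.compare_digest(tag, expected) else None


# Illustrative fixed keys; a real system would negotiate random session keys.
shared_key = b"k" * 20
attacker_key = b"j" * 20

genuine = protect(shared_key, b"frame-0001")
forged = protect(attacker_key, b"looped-empty-hallway")

print(verify(shared_key, genuine))  # the original payload comes back
print(verify(shared_key, forged))   # None: the decoder drops the injected packet
```

A decoder that drops every packet failing this check cannot be fed the attacker's video without the key. Stopping the other half of the hack, passive viewing, still requires encrypting the payload, and that is where the per-byte cost discussed above comes in.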

I’m not aware of a standards-based way to encrypt MPEG-2 video over IP, although at first blush you wouldn’t think it would be too hard to come up with one. But crypto in general seems difficult to get right – witness the difficulties that have cropped up with ssh, which was designed to be secure from the ground up.