Introduction to AES67

AES67 is a standard for transporting high-performance audio over IP networks. High performance, as AES67 defines it, means a sampling frequency of at least 44.1 kHz, a resolution of at least 16 bits and latency of less than 10 ms. AES67 targets professional audio: broadcast, production, live audio, and commercial and residential applications.

Rather than invent something new, AES67 specifies how to use other well-established standards in cooperation for audio networking. In contrast to systems and products that provide a full-featured user experience for audio networking, AES67 is focused on providing the basic requirements of interoperability. Providing a full-featured user experience is the product developer’s responsibility, not something dictated by the standard.

The core functionality of AES67 can be divided into three components:

Synchronization and media clocking — Audio devices operating as a system must be accurately synchronized with one another. This ensures that all devices on the network create and reproduce audio samples at exactly the same rate and time. Synchronization capability distinguishes AES67 (and other professional media networks) from systems such as Internet radio, VoIP and AirPlay, which, in the best case, synchronize devices to the originator of a selected digital audio source. AES67 uses the IEEE 1588 Precision Time Protocol (PTP) for synchronization.
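The offset calculation at the heart of PTP's delay request-response exchange can be sketched in a few lines. This is a minimal illustration, not the IEEE 1588 or AES67 reference algorithm: the function name and the use of integer nanosecond timestamps are assumptions made for the example, and the calculation assumes a symmetric network path.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and path delay from one PTP exchange.

    t1: master clock time when Sync was sent
    t2: local (slave) clock time when Sync was received
    t3: local (slave) clock time when Delay_Req was sent
    t4: master clock time when Delay_Req was received
    All values are in nanoseconds (an assumption for this sketch).
    """
    master_to_slave = t2 - t1  # path delay plus clock offset
    slave_to_master = t4 - t3  # path delay minus clock offset
    mean_path_delay = (master_to_slave + slave_to_master) / 2
    offset_from_master = (master_to_slave - slave_to_master) / 2
    return offset_from_master, mean_path_delay


if __name__ == "__main__":
    # Hypothetical exchange: local clock runs 500 ns ahead, path delay is 1000 ns.
    offset, delay = ptp_offset_and_delay(t1=0, t2=1500, t3=5000, t4=5500)
    print(offset, delay)
```

A device would subtract the estimated offset from its local clock (usually through a filter or servo rather than in one step) so that its media clock tracks the master.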

Transport, encoding and streaming — Audio networking works by transferring audio samples in network data packets. The packets are transmitted at regular intervals. In AES67, data packets are IP packets formatted according to the Real-time Transport Protocol (RTP). The RTP standards define packet formats for numerous types of audio and video. Although other parameters are supported, AES67 interoperability is focused on 24-bit uncompressed audio at 48 kHz sampling, with packets transmitted at 1 ms intervals.
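The packet arithmetic implied by these parameters can be made concrete with a short sketch. At 48 kHz and 1 ms per packet, each packet carries 48 samples per channel, and each 24-bit sample occupies 3 bytes. The code below builds a basic 12-byte RTP header followed by interleaved big-endian 24-bit samples. The function names, the payload type 96 and the example values are assumptions for illustration only; real streams negotiate the payload type and carry UDP/IP headers on top.

```python
import struct

SAMPLE_RATE_HZ = 48_000
PACKET_TIME_MS = 1
BYTES_PER_SAMPLE = 3   # 24-bit linear PCM
RTP_HEADER_BYTES = 12  # fixed RTP header without CSRC entries or extensions


def samples_per_packet(sample_rate_hz=SAMPLE_RATE_HZ, packet_time_ms=PACKET_TIME_MS):
    """Number of samples per channel carried in one packet."""
    return sample_rate_hz * packet_time_ms // 1000


def build_rtp_l24_packet(frames, seq, timestamp, ssrc, payload_type=96):
    """Build one RTP packet of interleaved 24-bit audio.

    frames: sequence of tuples, one signed integer sample per channel.
    payload_type 96 is a hypothetical dynamic payload type.
    """
    header = struct.pack(
        "!BBHII",
        0x80,                    # version 2, no padding, no extension, no CSRCs
        payload_type & 0x7F,     # marker bit clear
        seq & 0xFFFF,            # sequence number wraps at 16 bits
        timestamp & 0xFFFFFFFF,  # timestamp counts samples, wraps at 32 bits
        ssrc & 0xFFFFFFFF,
    )
    payload = bytearray()
    for frame in frames:
        for sample in frame:
            # Two's-complement, big-endian (network order), 3 bytes per sample.
            payload += (sample & 0xFFFFFF).to_bytes(BYTES_PER_SAMPLE, "big")
    return header + bytes(payload)


if __name__ == "__main__":
    stereo_frames = [(0, -1)] * samples_per_packet()
    packet = build_rtp_l24_packet(stereo_frames, seq=1, timestamp=0, ssrc=0x1234)
    print(len(packet))  # 12-byte header + 48 frames * 2 channels * 3 bytes
```

For a stereo stream this gives a 288-byte audio payload per packet, and the RTP timestamp advances by 48 from one packet to the next.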

Connection management — A principal advantage of audio networking over point-to-point analog or digital connections is routing flexibility. A network allows any device on the network to send audio to any other device or devices on the network. Establishing these connections requires a connection management protocol. AES67 uses SDP to describe streams and SIP to manage connections. SIP identifies potential connections by a SIP URI, which looks like an email address. SIP is a proven protocol that enjoys wide acceptance in internet telephony.
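To give a feel for what a stream description contains, the sketch below assembles an SDP description for a 24-bit, 48 kHz multicast stream with 1 ms packets. It is an illustration under stated assumptions rather than a conformant AES67 description: the function name, addresses, port, payload type and PTP grandmaster identity are all placeholders, and the clock-reference attributes (ts-refclk, mediaclk) follow RFC 7273 as commonly used with AES67. Check the standard for the full set of required fields.

```python
def build_sdp(session_name, origin_ip, multicast_ip, port, channels,
              ptp_grandmaster_id, ptp_domain=0, payload_type=96):
    """Return an SDP description as a string (illustrative, not normative)."""
    lines = [
        "v=0",
        f"o=- 1 1 IN IP4 {origin_ip}",           # session id and version are placeholders
        f"s={session_name}",
        f"c=IN IP4 {multicast_ip}/32",           # multicast address with TTL 32
        "t=0 0",                                 # unbounded session
        f"m=audio {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} L24/48000/{channels}",
        "a=ptime:1",                             # 1 ms of audio per packet
        f"a=ts-refclk:ptp=IEEE1588-2008:{ptp_grandmaster_id}:{ptp_domain}",
        "a=mediaclk:direct=0",
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # All values below are hypothetical.
    print(build_sdp(
        session_name="Stage left stereo",
        origin_ip="192.168.1.10",
        multicast_ip="239.69.1.1",
        port=5004,
        channels=2,
        ptp_grandmaster_id="00-1D-C1-FF-FE-00-00-01",
    ))
```

A receiver that obtains this description, for example in the body of a SIP message, knows the address and port to listen on, how to decode the payload and which PTP clock the RTP timestamps refer to.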

AES67 distinguishes itself through its focus on interoperability: it defines a common interoperability mode designed to be easy to add to devices that already support these existing protocols.

For media networking to achieve maximum usefulness, we need as many options as possible for interconnecting network-enabled components. AES67 is an open, standards-based approach to media networking interoperability. It augments existing media networking offerings such as Dante, AVB, Q-LAN, RAVENNA and Livewire with the opportunity to interconnect with other systems. Through the network effect, each additional interconnection opportunity has a strong positive effect on the usefulness of a network.

Kevin Gross is an independent consultant to AV equipment manufacturers and systems designers who frequently collaborates on projects with Cardinal Peak. As an AES fellow, he is a recognized expert at the intersection of real-time media and networking. Kevin has worked in multiple standards bodies, including IEEE, where he participated in AVB development; IETF, where he has authored several RFCs; and AES, where he led the group that produced the AES67 standard. Kevin conceived and developed CobraNet, helped build the first configurable audio DSP system and developed early digital audio workstations.