AES3-2003 Excerpt
REVISED
AES standard for digital audio —
Digital input-output interfacing —
Serial transmission format for two-
channel linearly represented digital
audio data
Published by
Audio Engineering Society, Inc.
Copyright © 2003 by the Audio Engineering Society
Abstract
The format provides for the serial digital transmission of two channels of periodically sampled and uniformly
quantized audio signals on a single shielded twisted wire pair. The transmission rate is such that samples of
audio data, one from each channel, are transmitted in time division multiplex in one sample period. Provision is
made for the transmission of both user and interface related data as well as of timing related data, which may be
used for editing and other purposes. It is expected that the format will be used to convey audio data that have
been sampled at any of the sampling frequencies recognized by the AES5, Recommended Practice for
Professional Digital Audio Applications Employing Pulse-Code Modulation — Preferred Sampling
Frequencies.
An AES standard implies a consensus of those directly and materially affected by its scope and provisions and
is intended as a guide to aid the manufacturer, the consumer, and the general public. The existence of an AES
standard does not in any respect preclude anyone, whether or not he or she has approved the document, from
manufacturing, marketing, purchasing, or using products, processes, or procedures not in agreement with the
standard. Prior to approval, all parties were provided opportunities to comment or object to any provision.
Attention is drawn to the possibility that some of the elements of this AES standard or information document
may be the subject of patent rights. AES shall not be held responsible for identifying any or all such patents.
Approval does not assume any liability to any patent owner, nor does it assume any obligation whatever to
parties adopting the standards document. This document is subject to periodic review and users are cautioned to
obtain the latest printing. Recipients of this document are invited to submit, with their comments, notification of
any relevant patent rights of which they are aware and to provide supporting documentation.
2003-09-09 printing
AES3-2003
Contents
Foreword
Foreword to second edition
Foreword to third edition
1 Scope
2 Normative references
3 Definitions and abbreviations
4 Interface format
4.1 Structure of format
4.2 Channel coding
4.3 Preambles
4.4 Validity bit
5 User data format
6 Channel status format
7 Interface format implementation
7.1 General
7.2 Transmitter
7.3 Receivers
8 Electrical requirements
8.1 General characteristics
8.2 Line driver characteristics
8.3 Line receiver characteristics
8.4 Connectors
Annex A
Annex B
Annex C
Annex D
1 Scope
This document specifies a recommended interface for the serial digital transmission of two channels of
periodically sampled and linearly represented digital audio data from one transmitter to one receiver.
It is expected that the format will be used to convey audio data that have been sampled at any of the sampling
frequencies recognized by the AES5 Recommended Practice for Professional Digital Audio Applications
Employing Pulse-Code Modulation — Preferred Sampling Frequencies. Note that conformance with this
interface specification does not require equipment to utilise these rates. The capability of the interface to
indicate other sample rates does not imply that it is recommended that equipment supports these rates. To
eliminate doubt, equipment specifications should define supported sampling frequencies.
The format is intended for use with shielded twisted-pair cable of conventional design over distances of up to
100 m without transmission equalization or any special equalization at the receiver and at frame rates of up to
50 kHz. Longer cable lengths and higher frame rates may be used, but with a rapidly increasing requirement for
care in cable selection and possible receiver equalization or the use of active repeaters, or both.
The document does not cover connection to any common carrier equipment, nor does it specifically address any
questions about the synchronizing of large systems, although by its nature the format permits easy
synchronization of receiving devices to the transmitting device.
Specific synchronization issues are covered in AES11, AES recommended practice for digital audio engineering -
Synchronization of digital audio equipment in studio operations. An engineering guideline document to
accompany this interface specification has been published as AES-2id, AES information document for digital
audio engineering - Guidelines for the use of the AES3 interface.
In this interface specification, mention is made of an interface for consumer use. The two interfaces are not
identical.
2 Normative references
The following standards contain provisions which, through reference in this text, constitute provisions of this
document. At the time of publication, the editions indicated were valid. All standards are subject to revision,
and parties to agreements based on this document are encouraged to investigate the possibility of applying the
most recent editions of the indicated standards.
AES11, AES recommended practice for digital audio engineering—Synchronization of digital audio equipment
in studio operations, Audio Engineering Society, New York, NY, USA.
AES18, AES recommended practice for digital audio engineering—Format for the user data channel of the
AES digital audio interface, Audio Engineering Society, New York, NY, USA.
ITU-T Recommendation V.11: Electrical characteristics for balanced double-current interchange circuits
operating at data signalling rates up to 10 Mbit/s, International Telecommunication Union, Geneva,
Switzerland.
IEC 60268-12, Sound system equipment, part 12: Application of connectors for broadcast and similar use,
International Electrotechnical Commission, Geneva, Switzerland.
IEC 60958-3, Digital audio interface - Part 3: Consumer applications, International Electrotechnical
Commission, Geneva, Switzerland.
ISO 646, Information processing—ISO 7-bit coded character set for information interchange, International
Organization for Standardization, Geneva, Switzerland.
3 Definitions and abbreviations
3.1
sampling frequency
frequency of the samples representing an audio signal
NOTE When more than one audio signal is transmitted through the same interface, the sampling
frequencies are identical.
3.2
audio sample word
amplitude of a digital audio sample
NOTE Representation is linear in 2’s complement binary form. Positive numbers correspond to
positive analog voltages at the input of the analog-to-digital converter (ADC). The number of bits per
word can be specified from 16 to 24 in two coding ranges, less than or equal to 20 bits and less than or
equal to 24 bits.
3.3
auxiliary sample bits
four least significant bits (LSBs) which can be assigned as auxiliary sample bits and used for auxiliary
information when the number of audio sample bits is less than or equal to 20
3.4
validity bit
bit indicating whether the audio sample bits in the same subframe are suitable for conversion to an analog audio
signal
3.5
channel status
bits carrying, in a fixed format derived from the block (see 3.11), information associated with each audio
channel which is decodable by any interface user
3.6
user data
channel provided to carry any other information
3.7
parity bit
bit provided to permit the detection of an odd number of errors resulting from malfunctions in the interface
3.8
preambles
specific patterns used for synchronization. See 4.3.
3.9
subframe
fixed structure used to carry the information described in 4.1.1 and 4.1.2
3.10
frame
sequence of two successive and associated subframes, see 4.1.2
3.11
block
group of 192 consecutive frames
NOTE The start of a block is designated by a special subframe preamble. See 4.3.
3.12
channel coding
coding describing the method by which the binary digits are represented for transmission through the interface
3.13
unit interval
UI
shortest nominal time interval in the coding scheme
3.14
interface jitter
deviation in timing of interface data transitions (zero crossings) when measured with respect to an ideal clock
3.15
intrinsic jitter
output interface jitter of a device that is either free-running or is synchronized to a jitter-free reference
3.16
jitter gain
ratio, expressed in decibels, of the amplitude of jitter at the synchronization input of a device to the resultant
jitter at the output of the device
3.17
frame rate
rate of transmission of frames
4 Interface format
4.1 Structure of format
Each subframe is divided into 32 time slots, numbered from 0 to 31. See figure 1.
Time slots 0 to 3, the preambles, carry one of the three permitted preambles. See 4.1.2 and 4.3; also see figure
2.
Time slots 4 to 27, the audio sample word, carry the audio sample word in linear 2’s complement
representation. The most significant bit (MSB) is carried by time slot 27.
When a 24-bit coding range is used, the LSB is in time slot 4. See figure 1(a).
When a 20-bit coding range is sufficient, time slots 8 to 27 carry the audio sample word with the LSB in time
slot 8. Time slots 4 to 7 may be used for other applications. Under these circumstances, the bits in time slots 4
to 7 are designated auxiliary sample bits. See figure 1(b).
If the source provides fewer bits than the interface allows, either 20 or 24, the unused LSBs are set to logic 0.
Time slot 28, the validity bit, carries the validity bit associated with the audio sample word. See 4.4.
Time slot 29, the user data bit, carries 1 bit of the user data channel associated with the audio channel
transmitted in the same subframe. See clause 5.
Time slot 30, the channel status bit, carries 1 bit of the channel status information associated with the audio
channel transmitted in the same subframe. See clause 6.
Time slot 31, the parity bit, carries a parity bit such that time slots 4 to 31 inclusive will carry an even number
of ones and an even number of zeros (even parity).
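As an informative illustration only, the time-slot assignment above can be sketched in Python. The function name and arguments are hypothetical, and the order of the auxiliary sample bits within time slots 4 to 7 is an assumption, since the text above does not define it.

```python
def pack_subframe_slots(sample, validity=0, user=0, status=0, word_bits=24, aux=0):
    """Return time slots 4 to 31 of one subframe as a list of 28 bits.

    Index 0 of the result is time slot 4 and index 27 is time slot 31.
    The preamble (time slots 0 to 3) is not a sequence of data bits and
    is left out of this sketch.
    """
    if word_bits not in (20, 24):
        raise ValueError("coding range must be 20 or 24 bits")
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    if not lowest <= sample <= highest:
        raise ValueError("sample does not fit the coding range")

    word = sample & ((1 << word_bits) - 1)  # 2's complement representation
    bits = []
    if word_bits == 20:
        # Time slots 4 to 7: auxiliary sample bits.
        # ASSUMPTION: least significant auxiliary bit first.
        bits += [(aux >> i) & 1 for i in range(4)]
    # Audio sample word, LSB first, so that the MSB lands in time slot 27.
    bits += [(word >> i) & 1 for i in range(word_bits)]
    # Time slots 28, 29 and 30: validity, user data and channel status bits.
    bits += [validity & 1, user & 1, status & 1]
    # Time slot 31: parity bit chosen so that slots 4 to 31 hold an even number of ones.
    bits.append(sum(bits) & 1)
    return bits
```

For example, `pack_subframe_slots(1)` sets time slot 4 (the LSB) and therefore also sets the parity bit in time slot 31.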
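Figure 1, subframe format, given here as time-slot layouts:
(a) 24-bit coding range
    Time slots 0 to 3     Preamble
    Time slots 4 to 27    24-bit audio sample word (LSB in time slot 4, MSB in time slot 27)
    Time slot 28          V
    Time slot 29          U
    Time slot 30          C
    Time slot 31          P
(b) 20-bit coding range
    Time slots 0 to 3     Preamble
    Time slots 4 to 7     AUX
    Time slots 8 to 27    20-bit audio sample word (LSB in time slot 8, MSB in time slot 27)
    Time slot 28          V
    Time slot 29          U
    Time slot 30          C
    Time slot 31          P
Key:
V    Validity bit
U    User data bit
C    Channel status bit
P    Parity bit
AUX  Auxiliary sample bits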
The first subframe normally starts with preamble X. However, the preamble changes to preamble Z once every
192 frames. This defines the block structure used to organize the channel status information. The second
subframe always starts with preamble Y.
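The preamble sequence described above can be sketched as follows. This is an illustration only; the function name is hypothetical, and counting frames from 0 at the start of a block is an assumption made for the example.

```python
FRAMES_PER_BLOCK = 192  # block length given in 3.11


def preamble_for(frame_index, subframe):
    """Return 'X', 'Y' or 'Z' for the given frame and subframe (1 or 2).

    ASSUMPTION: frame_index counts from 0, and frame 0 is the first frame
    of a block.
    """
    if subframe == 2:
        return "Y"  # the second subframe always starts with preamble Y
    if subframe == 1:
        # Preamble Z replaces preamble X once every 192 frames.
        return "Z" if frame_index % FRAMES_PER_BLOCK == 0 else "X"
    raise ValueError("subframe must be 1 or 2")
```

A receiver could use the arrival of preamble Z to reset its channel status bit counter, since Z marks the start of a block.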
The modes of transmission are signaled by setting bits 0 to 3 of byte 1 of channel status. Examples include:
Two-channel mode: In two-channel mode, the samples from both channels are transmitted in
consecutive subframes. Channel 1 is in subframe 1, and channel 2 is in subframe 2.
Stereophonic mode: In stereophonic mode, the interface is used to transmit stereophonic audio in
which the two channels are presumed to have been simultaneously sampled. The left, or A, channel is
in subframe 1, and the right, or B, channel is in subframe 2.
Single-channel mode (monophonic): In monophonic mode, the transmitted bit rate remains at the
normal two-channel rate and the audio sample word is placed in subframe 1. Time slots 4 to 31 of
subframe 2 either carry the bits identical to subframe 1 or are set to logic 0. A receiver normally
defaults to channel 1 unless manual override is provided.
Primary-secondary mode: In some applications requiring two channels where one of the channels is
the main or primary channel while the other is a secondary channel, the primary channel is in subframe
1, and the secondary channel is in subframe 2.
[Figure 2: frame format, showing subframe 1 followed by subframe 2.]
4.2 Channel coding
Each bit to be transmitted is represented by a symbol comprising two consecutive binary states. The first state
of a symbol is always different from the second state of the previous symbol. The second state of the symbol is
identical to the first if the bit to be transmitted is logic 0. However, it is different if the bit is logic 1. See
figure 3.
[Figure 3: channel coding. Traces shown: clock (2 times bit rate), source coding (levels 1 and 0), and channel
coding (biphase mark, levels 1 and 0).]
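The biphase-mark rule stated in the text can be sketched as a small encoder and decoder. This is an illustration under the stated rule only; the function names are hypothetical and the states are modelled as the integers 0 and 1.

```python
def biphase_mark_encode(bits, previous_state=0):
    """Encode data bits into biphase-mark states, two states per bit.

    previous_state is the second state of the symbol that precedes the
    first bit in `bits`.
    """
    states = []
    state = previous_state
    for bit in bits:
        first = state ^ 1                      # always differs from the previous state
        second = first ^ 1 if bit else first   # extra transition only for logic 1
        states += [first, second]
        state = second
    return states


def biphase_mark_decode(states):
    """Decode symbol-aligned states: a bit is 1 when its two states differ."""
    return [a ^ b for a, b in zip(states[0::2], states[1::2])]
```

Because decoding looks only at whether the two states of a symbol differ, the result is the same if every state is inverted, which is why a polarity reversal in the connection does not change the decoded data.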
4.3 Preambles
Preambles are specific patterns providing synchronization and identification of the subframes and blocks.
To achieve synchronization within one sampling period and to make this process completely reliable, these
patterns violate the biphase-mark code rules, thereby avoiding the possibility of data imitating the preambles.
A set of three preambles is used. These preambles are transmitted in the time allocated to four time slots at the
start of each subframe, time slots 0 to 3, and are represented by eight successive states. The first state of the
preamble is always different from the second state of the previous symbol, which represents the parity bit.
Depending on that preceding state, the preambles are:
Preamble   Channel coding                              Use
           Preceding state 0    Preceding state 1
X          11100010             00011101               Subframe 1
Y          11100100             00011011               Subframe 2
Z          11101000             00010111               Subframe 1 and block start
Like biphase code, these preambles are d.c. free and provide clock recovery. They differ in at least two states
from any valid biphase sequence.
NOTE Owing to the even-parity bit in time slot 31, all preambles will start with a transition in the
same direction. See 4.1.1. Thus only one of these sets of preambles will, in practice, be transmitted
through the interface. However, it is necessary for either set to be decodable because a polarity reversal
might occur in the connection.
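The properties claimed for the preambles can be checked with a short sketch. The names below are hypothetical. The run-length check relies on the coding rule in 4.2: biphase-mark data changes state at every symbol boundary, so it never holds the same state for more than two consecutive states.

```python
# Preamble state patterns for a preceding state of 0, as tabulated in 4.3.
PREAMBLES = {"X": "11100010", "Y": "11100100", "Z": "11101000"}


def invert(states):
    """Return the pattern for the opposite preceding state (or reversed polarity)."""
    return "".join("1" if s == "0" else "0" for s in states)


def identify_preamble(states):
    """Return 'X', 'Y' or 'Z' for an 8-state string in either polarity, else None."""
    for name, pattern in PREAMBLES.items():
        if states in (pattern, invert(pattern)):
            return name
    return None


def longest_run(states):
    """Length of the longest run of identical consecutive states."""
    best = run = 1
    for a, b in zip(states, states[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best


def is_dc_free(states):
    """True when the pattern holds as many ones as zeros."""
    return states.count("1") == states.count("0")
```

Each tabulated pattern contains a run of three identical states, which is what lets a receiver tell a preamble from biphase-mark data, and `identify_preamble` accepts both columns of the table so that a polarity reversal is tolerated.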
[Figure 4: preamble timing. Labels shown: clock, the state sequence 1 1 1 0 0 0 1 0, parity, and LSB.]
5 User data format
Possible formats for the user data channel are indicated by the channel status byte 1, bits 4 to 7.
6 Channel status format
Channel status information is organized in 192-bit blocks, subdivided into 24 bytes. See figure 5. The first bit of
each block is carried in the frame with preamble Z.
The specific organization follows, wherein the suffix 0 designates the first byte or bit. Where multiple bit states
represent a counting number, tables are arranged with most significant bit (MSB) first, except where noted as
LSB first.
Byte       Fields, in order from bit 0 to bit 7
0          a b c d e
1          f g
2          h i j
3          k, then n = 0
3          l m, then n = 1
4          o p q r
5          s
6 to 9     Alphanumeric channel origin data
10 to 13   Alphanumeric channel destination data
14 to 17   Local sample address code (32-bit binary)
18 to 21   Time-of-day sample address code (32-bit binary)
22         Reliability flags
23         Cyclic redundancy check character
Key:
a  use of channel status block
b  linear PCM identification
c  audio signal pre-emphasis
d  lock indication
e  sampling frequency
f  channel mode
g  user bits management
h  use of auxiliary sample bits
i  source word length
j  indication of alignment level
k  channel number
l  channel number
m  multichannel mode number
n  multichannel mode
o  digital audio reference signal
p  reserved but undefined
q  sampling frequency
r  sampling frequency scaling flag
s  reserved but undefined
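As an illustration of the block organization, the sketch below collects the 192 channel status bits of one channel into 24 bytes. The function names are hypothetical. Storing bit 0 of each byte in the least significant position of a Python integer is a convention chosen for this sketch, not something the format requires.

```python
def assemble_channel_status(c_bits):
    """Pack 192 channel status bits, in order of reception, into 24 bytes.

    c_bits[0] is the bit carried in the frame with preamble Z.
    ASSUMPTION (storage convention of this sketch): bit n of a byte is
    stored as integer bit value 1 << n.
    """
    if len(c_bits) != 192:
        raise ValueError("a channel status block holds exactly 192 bits")
    block = bytearray(24)
    for n, bit in enumerate(c_bits):
        if bit:
            block[n // 8] |= 1 << (n % 8)
    return bytes(block)


def field(block, byte, first_bit, last_bit):
    """Return bits first_bit to last_bit of one byte, listed in bit-number order."""
    return [(block[byte] >> i) & 1 for i in range(first_bit, last_bit + 1)]


def is_professional(block):
    """Byte 0, bit 0: logic 1 indicates professional use of the channel status block."""
    return field(block, 0, 0, 0) == [1]
```

With this convention, `field(block, 1, 0, 3)` returns the channel mode bits and `field(block, 1, 4, 7)` the user bits management field, each in bit-number order.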
Byte 0
bit 0     Use of channel status block
state 0   Consumer use of channel status block (see NOTE 1).
state 1   Professional use of channel status block.
NOTE 1 The significance of byte 0, bit 0 is such that a transmission from an interface conforming to
IEC 60958-3 consumer use can be identified, and a receiver conforming only to IEC 60958-3
consumer use will correctly identify a transmission from a professional-use interface as defined in this
standard. Connection of a professional-use transmitter with a consumer-use receiver or vice versa
might result in unpredictable operation. Thus the following byte definitions only apply when bit 0 =
logic 1 (professional use of the channel status block).
NOTE 2 The indication of sampling frequency, or the use of one of the sampling frequencies that can
be indicated in this byte, is not a requirement for operation of the interface. The 00 state of bits 6 to 7
may be used if the transmitter does not support the indication of sampling frequency, the sampling
frequency is unknown, or the sample frequency is not one of those that can be indicated in this byte.
In the latter case for some sampling frequencies byte 4 may be used to indicate the correct value.
NOTE 3 When byte 1, bits 1 to 3 indicate single channel double sampling frequency mode then the
sampling frequency of the audio signal is twice that indicated by bits 6 to 7 of byte 0.
Byte 1
Byte 2
NOTE 2 The default state of bits 3 to 5 indicates that the number of active bits within the 20-bit or 24-bit
coding range is not specified by the transmitter. The receiver should default to the maximum number of bits
specified by the coding range and enable manual override or automatic set.
NOTE 3 The nondefault states of bits 3 to 5 indicate the number of bits within the 20-bit or 24-bit coding range
which might be active. This is also an indirect expression of the number of LSBs that are certain to be inactive,
which is equal to 20 or 24 minus the number corresponding to the bit state. The receiver should disable manual
override and auto set for these bit states.
NOTE 4 Irrespective of the audio sample word length as indicated by any of the states of bits 3 to 5, the MSB
is in time slot 27 of the transmitted subframe as specified in 4.1.1.
Byte 3
bit 7     Multichannel mode
state 0   Undefined multichannel mode (default).
state 1   Defined multichannel modes.
The definition of the remaining bit states depends on the state of bit 7.
OR,
NOTE 1 The defined multichannel modes identify mappings between channel numbers and function. The
standard mappings are under consideration. Some mappings may involve groupings of up to 32 channels by
combining two modes.
NOTE 2 For compatibility with equipment that is only sensitive to the channel status data in one subframe the
channel carried by subframe 2 may indicate the same channel number as channel 1. In that case it is implicit
that the second channel has a number one higher than the channel of subframe 1 except in single channel double
sampling frequency mode.
NOTE 3 When bit 7 is 1 the 4 bit channel number can be mapped to the channel numbering in bits 20 to 23 of
the consumer mode channel status defined in IEC 60958-3. In this case channel A of consumer mode maps to
channel 2, channel B maps to channel 3 and so on.
Byte 4
bit 2 Reserved
NOTE 1 The sampling frequency indicated in byte 4 is not dependent on the channel mode indicated in byte 1.
NOTE 2 The indication of sampling frequency, or the use of one of the sampling frequencies that can be
indicated in this byte, is not a requirement for operation of the interface. The 0000 state of bits 3 to 6 may be
used if the transmitter does not support the indication of sampling frequency in this byte, the sampling
frequency is unknown, or the sample frequency is not one of those that can be indicated in this byte. In the latter
case, for some sampling frequencies, byte 0 may be used to indicate the correct value.
NOTE 3 The reserved states of bits 3 to 6 of byte 4 are intended for later definition such that bit 6 is set to
define rates related to 44,1 kHz, except for state 1000, and clear to define rates related to 48 kHz. They
should not be used until further defined.
Byte 5
bits 0 to 7 Reserved
value Set to logic 0 until further defined
Bytes 6 to 9
bits 0 to 7         Alphanumeric channel origin data.
value (each byte)   7-bit data with no parity bit complying with ISO 646, International Reference
                    Version (IRV). LSBs are transmitted first with logic 0 in bit 7.
                    First character in message is byte 6.
                    Nonprinted control characters, hexadecimal codes 01 to 1F and 7F, are not permitted.
                    Default value is logic 0 (hexadecimal code 00).
NOTE ISO 646, IRV, is commonly identified as 7-bit ASCII.
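A minimal sketch of how a four-character label might be prepared for bytes 6 to 9 is given below. The function name is hypothetical, and Python's "ascii" codec is used as a stand-in for ISO 646 IRV on the basis of the note above; that substitution is an assumption of this example.

```python
def encode_channel_label(text, length=4):
    """Return `length` bytes for an alphanumeric channel origin field.

    ASSUMPTION: 7-bit ASCII stands in for ISO 646 IRV. Bit 7 of every byte
    is 0 because ASCII codes are below 0x80. Unused trailing bytes take
    the default value 0x00.
    """
    if len(text) > length:
        raise ValueError("label is longer than the field")
    data = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text
    for code in data:
        # Nonprinted control characters 0x01 to 0x1F and 0x7F are not permitted.
        if 0x01 <= code <= 0x1F or code == 0x7F:
            raise ValueError("control characters are not permitted")
    return data.ljust(length, b"\x00")
```

The first character of the label goes in the first byte of the field, so `encode_channel_label("ST1")` places "S" in byte 6.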
Bytes 10 to 13
bits 0 to 7         Alphanumeric channel destination data.
value (each byte)   7-bit data with no parity bit complying with ISO 646, International Reference
                    Version (IRV). LSBs are transmitted first with logic 0 in bit 7.
                    First character in message is byte 10.
                    Nonprinted control characters, hexadecimal codes 01 to 1F and 7F, are not permitted.
                    Default value is logic 0 (hexadecimal code 00).
Bytes 14 to 17
bits 0 to 7         Local sample address code
value (each byte)   32-bit binary value representing first sample of current block.
                    LSBs are transmitted first. Default value is logic 0.
NOTE This has the same function as a recording index counter.
Bytes 18 to 21
bits 0 to 7         Time-of-day sample address code
value (each byte)   32-bit binary value representing first sample of current block.
                    LSBs are transmitted first. Default value is logic 0.
NOTE This is the time of day laid down during the source encoding of the signal and remains
unchanged during subsequent operations. A value of all zeros for the binary sample address code is, for
transcoding to real time, or to time codes in particular, to be taken as midnight (that is, 00 h, 00 min, 00
s, 00 frame). Transcoding of the binary number to any conventional time code requires accurate
sample frequency information to provide a sample accurate time.
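The transcoding described in the note can be sketched as follows. The function names are hypothetical. Treating byte 18 as the least significant byte of the 32-bit value is an assumption drawn from "LSBs are transmitted first", and the sampling frequency is assumed to be an exact integer number of samples per second.

```python
def address_from_bytes(four_bytes):
    """Combine bytes 18 to 21 (or 14 to 17) into a 32-bit sample address.

    ASSUMPTION: the first byte is the least significant byte.
    """
    if len(four_bytes) != 4:
        raise ValueError("a sample address code occupies four bytes")
    return int.from_bytes(four_bytes, "little")


def sample_address_to_time(address, sample_rate_hz):
    """Return (hours, minutes, seconds, samples) counted from midnight.

    An address of 0 corresponds to 00 h, 00 min, 00 s.
    """
    total_seconds, samples = divmod(address, sample_rate_hz)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds, samples
```

The result is only as accurate as the sampling frequency passed in, which is the point the note makes about needing accurate sample frequency information.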
Byte 22
bits 0 to 7   Reliability flags
              Flags used to identify whether the information carried by the channel status data is
              reliable. If reliable, the appropriate bits are set to logic 0 (default); if unreliable, the
              bits are set to logic 1.
bits 0 to 3   Reserved and are set to logic 0 until further defined.
bit 4         Bytes 0 to 5.
bit 5         Bytes 6 to 13.
bit 6         Bytes 14 to 17.
bit 7         Bytes 18 to 21.
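The reliability flags can be read with a lookup such as the one below. The names are hypothetical, and representing bit n of byte 22 as the integer value 1 << n is a storage convention assumed for this sketch.

```python
# Flag bit of byte 22 mapped to the channel status bytes it covers.
RELIABILITY_FLAGS = {
    4: range(0, 6),    # bytes 0 to 5
    5: range(6, 14),   # bytes 6 to 13
    6: range(14, 18),  # bytes 14 to 17
    7: range(18, 22),  # bytes 18 to 21
}


def unreliable_bytes(byte22):
    """Return the channel status byte numbers flagged as unreliable.

    ASSUMPTION: bit n of byte 22 is held as integer bit value 1 << n.
    A flag at logic 1 marks the corresponding bytes as unreliable.
    """
    flagged = []
    for bit, covered in RELIABILITY_FLAGS.items():
        if (byte22 >> bit) & 1:
            flagged.extend(covered)
    return flagged
```

A receiver might use the returned list to ignore, for example, the time-of-day sample address code when bit 7 is set.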
Byte 23
bits 0 to 7   Channel status data cyclic redundancy check character (CRCC).
value         Generating polynomial is G(x) = x^8 + x^4 + x^3 + x^2 + 1.
              The CRCC conveys information to test valid reception of the entire channel status
              data block (bytes 0 to 22 inclusive). For serial implementations the initial condition
              of all ones should be used in generating the check bits with the LSB transmitted
              first. Default value is logic 0 for minimum implementation of channel status only.
              See 7.2.1.
NOTE Annex B includes a diagram of the shift register circuit used to generate the code, two
examples of channel status data, and the corresponding CRCC.
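A bit-serial sketch of a check character generator using the stated polynomial and the all-ones initial condition is given below. The function names are hypothetical. This is not the Annex B circuit: the mapping of register stages onto the bits of byte 23, and hence the exact transmitted bit order, is an assumption here and should be confirmed against the Annex B examples before use.

```python
POLY_TAPS = 0x1D  # x^4 + x^3 + x^2 + 1; the x^8 term is implicit in the 8-bit register


def block_bits(block_bytes):
    """List the bits of each byte in order from bit 0 to bit 7.

    ASSUMPTION: bit n of a byte is held as integer bit value 1 << n.
    """
    return [(b >> i) & 1 for b in block_bytes for i in range(8)]


def crcc_register(bits, register=0xFF):
    """Run a serial CRC over `bits`, starting from all ones, and return the register."""
    for bit in bits:
        feedback = ((register >> 7) & 1) ^ (bit & 1)
        register = (register << 1) & 0xFF
        if feedback:
            register ^= POLY_TAPS
    return register


def check_bits(register):
    """Check bits ordered so that running them through crcc_register leaves zero.

    ASSUMPTION: this ordering is a property of the sketch, not a statement
    about the order defined for byte 23.
    """
    return [(register >> i) & 1 for i in range(7, -1, -1)]
```

With this ordering a receiver can run the same register over the 184 data bits followed by the 8 check bits and test for a zero result, which is a general property of a CRC computed without a final inversion.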
7 Interface format implementation
7.1 General
To promote compatible operation between items of equipment built to this specification it is necessary to
establish which information bits and operational bits need to be encoded and sent by a transmitter and decoded
by an interface receiver.
Documentation shall be provided describing the channel status features supported by the interface transmitters
and receivers.
7.2 Transmitter
Transmitters shall follow all the formatting and channel coding rules established in earlier sections of this
specification including all notes therein. Along with the audio sample word, all transmitters shall correctly
encode and transmit the validity bit, user bit, parity bit, and the three preambles. The channel status shall be
encoded to one of the implementations given in 7.2.1, 7.2.2, and 7.2.3.
The following three implementations are defined: minimum, standard, and enhanced. These terms are used to
communicate in a simple manner the level of implementation of the interface transmitter involving the many
features of channel status. Irrespective of the level of implementation, all reserved states of bits defined in 6
shall remain unchanged.
If additional bytes of channel status, which do not fully comply with the standard implementation, are
implemented as required by an application, the interface transmitter shall be classified as a minimum
implementation of channel status. See 7.2.2.
It should be noted that the minimum implementation imposes severe operational restrictions on some receiving
devices which may be connected to it. For example, receivers implementing byte 23 will normally show a
cyclic redundancy check error when the default value of logic 0 is received as the CRCC. Also, reception of the
default value for byte 0 bits 6 to 7 might cause improper operation in receiving devices not supporting manual
override or auto set capabilities.
7.3 Receivers
Implementation in receivers is highly dependent on the application. Proper documentation shall be provided on
the level of implementation of the interface receiver for decoding the transmitted information (validity, user,
channel status, parity) and on whatever subsequent action is taken by the equipment of which it is a part.
8 Electrical requirements
8.1 General characteristics
The electrical parameters of the interface are based on those defined in ITU-T recommendation V.11 which
allow transmission of balanced-voltage digital signals up to a few hundred meters in length.
A circuit conforming to the general configuration shown in figure 6 may be used.
Although equalization may be used at the receiver, there shall be no equalization before transmission.
The frequency range used to qualify the interface electrical parameters is dependent on the maximum data rate
supported. The upper frequency is 128 times the maximum frame rate.
The interconnecting cable shall be balanced and screened (shielded) with a nominal characteristic impedance of
110 Ω at frequencies from 100 kHz to 128 times the maximum frame rate.
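The factor of 128 follows from the format: 2 subframes per frame, 32 time slots per subframe, and 2 channel-coding states per time slot. A short sketch of the arithmetic is given below. The function names are hypothetical, and equating the unit interval with the duration of one channel-coding state is an inference from 3.13 and 4.2 rather than a statement quoted from the text.

```python
SUBFRAMES_PER_FRAME = 2
SLOTS_PER_SUBFRAME = 32
STATES_PER_SLOT = 2  # biphase-mark coding uses two states per bit


def bit_rate(frame_rate_hz):
    """Time slots per second on the interface: 64 times the frame rate."""
    return SUBFRAMES_PER_FRAME * SLOTS_PER_SUBFRAME * frame_rate_hz


def upper_qualifying_frequency_hz(max_frame_rate_hz):
    """Upper frequency used to qualify electrical parameters: 128 times the frame rate."""
    return bit_rate(max_frame_rate_hz) * STATES_PER_SLOT


def unit_interval_s(frame_rate_hz):
    """ASSUMPTION: one UI is the duration of one channel-coding state."""
    return 1.0 / upper_qualifying_frequency_hz(frame_rate_hz)
```

At a frame rate of 48 kHz this gives 3,072 Mbit/s, an upper frequency of 6,144 MHz, and a unit interval of about 163 ns.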
[Figure 6: general circuit configuration. Labels shown: driving network; termination and isolation network.]
NOTE 1 Holding closer tolerances for the characteristic impedance of the cable, and for the driving and
terminating impedances, can increase the cable lengths for reliable transmission and for higher data rates.
NOTE 2 Closer tolerances for the balance of the driving impedance, the terminating impedance, and for the
cable itself can reduce both electromagnetic susceptibility and emissions.
NOTE 3 Using cable having lower loss at higher frequencies can improve the reliability of transmission for
greater distances and higher data rates.
8.2.3 Balance
Any common-mode component at the output terminals shall be more than 30 dB below the signal at frequencies
from d.c. to 128 times the maximum frame rate.
NOTE Operation toward the lower limit of 5 ns may improve the received signal eye pattern, but may
increase EMI at the transmitter. Equipment must meet local regulations regarding EMI.
NOTE 1 This jitter may be strongly asymmetric in character and the deviation from the ideal timing
should meet the specification in either direction.
NOTE 2 This requirement applies both when the equipment is locked to an effectively jitter-free
timing reference, which may be a modulated digital audio signal, and when the equipment is free-
running.
[Figure: gain in dB (scale 10 to -30) against jitter frequency (10 Hz to 10^7 Hz). Marked points: 700 Hz, -3 dB;
70 Hz, -20 dB.]
NOTE If jitter attenuation is provided and it is such that the sinusoidal jitter gain falls below the jitter
transfer function mask of figure 8 then the equipment specification should state that the equipment
jitter attenuation is within this specification. The mask imposes no additional limit on low-frequency
jitter gain. The limit starts at the input-jitter frequency of 500 Hz where it is 0 dB, and falls to –6 dB at
and above 1 kHz.
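The limit described in the note can be sketched as a function of jitter frequency. The function names are hypothetical. The note gives only the end points (0 dB at 500 Hz, -6 dB at and above 1 kHz); the straight line on a logarithmic frequency axis used between them is an assumption of this sketch.

```python
import math


def jitter_transfer_mask_db(frequency_hz):
    """Return the mask limit in dB, or None where the mask imposes no limit.

    ASSUMPTION: between 500 Hz and 1 kHz the limit falls linearly in dB
    against the logarithm of frequency.
    """
    if frequency_hz < 500:
        return None  # no additional limit on low-frequency jitter gain
    if frequency_hz >= 1000:
        return -6.0
    fraction = math.log10(frequency_hz / 500) / math.log10(2)
    return -6.0 * fraction


def within_mask(frequency_hz, gain_db):
    """True when a measured sinusoidal jitter gain does not exceed the mask."""
    limit = jitter_transfer_mask_db(frequency_hz)
    return limit is None or gain_db <= limit
```

For example, a measured gain of -10 dB at 2 kHz is within the mask, while -3 dB at the same frequency is not.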
[Figure 8: jitter transfer function mask. Gain in dB against frequency (10 Hz to 10^7 Hz). Marked points:
500 Hz, 0 dB; 1000 Hz, -6 dB.]
The receiver shall correctly interpret the data when connected directly to a line driver working between the
extreme voltage limits specified in 8.2.2.
NOTE The AES3-1985 specification for line driver signal amplitude was 10 V peak to peak
maximum.