Advanced Audio Coding-LC
Filename extension: .m4a, .m4b, .m4p, .m4v, .m4r, .3gp, .mp4, .aac
Internet media type: audio/aac, audio/aacp
Type of format: Lossy compression
Standard(s): ISO/IEC 13818-7:2003
Advanced Audio Coding (AAC) is a standardized, lossy compression and encoding scheme for digital
audio. Designed to be the successor of the MP3 format, AAC generally achieves better sound quality than
MP3 at many bit rates.
AAC has been standardized by ISO and IEC as part of the MPEG-2 and MPEG-4 specifications. The
MPEG-2 standard contains several audio coding methods, including the MP3 coding scheme. AAC can
carry 48 full-bandwidth (up to 96 kHz) audio channels in one stream, plus 16 low frequency
enhancement (LFE, limited to 120 Hz) channels and up to 16 data streams. AAC can achieve audio
quality indistinguishable from the source at data rates of 320 kbit/s (64 kbit/s per channel) for five
channels. Quality is also close to CD at 96 kbit/s (48 kbit/s per channel) for stereo.
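The per-channel figures quoted above follow from simple division, and comparing them with uncompressed CD audio gives a rough sense of the compression involved. The sketch below is only illustrative arithmetic; the function names are made up for this example, and the CD rate assumes the usual 44.1 kHz, 16-bit, two-channel PCM.

```python
# Uncompressed CD audio: 44.1 kHz sampling, 16 bits per sample, 2 channels.
CD_KBITS = 44_100 * 16 * 2 / 1000  # 1411.2 kbit/s


def per_channel_kbits(total_kbits: float, channels: int) -> float:
    """Average bit rate available to each channel."""
    return total_kbits / channels


def compression_ratio(total_kbits: float, pcm_kbits: float = CD_KBITS) -> float:
    """How many times smaller the coded stream is than the PCM stream."""
    return pcm_kbits / total_kbits


print(per_channel_kbits(320, 5))         # 64.0
print(per_channel_kbits(96, 2))          # 48.0
print(round(compression_ratio(96), 1))   # 14.7
```

In other words, a 96 kbit/s stereo stream is roughly one fifteenth the size of the CD audio it approximates.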
AAC's best known use is as the default audio format of Apple's iPhone, iPod, iTunes, and the format used
for all iTunes Store audio (with extensions for proprietary digital rights management).
AAC is also the standard audio format for Sony’s PlayStation 3, the latest generation of Sony Walkman,
the Sony Ericsson Walkman Phone, Nintendo's Wii (with the Photo Channel 1.1 update installed for Wii
consoles purchased before late 2007) and the MPEG-4 video standard. HE-AAC is part of digital radio
standards like DAB+ and Digital Radio Mondiale.
Contents
• 1 History
o 1.1 Standardization
o 1.2 AAC’s improvements over MP3
• 2 How AAC works
o 2.1 Modular encoding
o 2.2 AAC error protection toolkit
o 2.3 Error Resilient (ER) AAC
o 2.4 AAC Low Delay
• 3 Promoting aspects
o 3.1 Licensing and patents
• 4 Products that support AAC
o 4.1 HDTV Standards
4.1.1 Japanese ISDB-T
4.1.2 International ISDB-Tb
o 4.2 Hardware
4.2.1 iTunes and iPod
4.2.2 Other Portable Players
4.2.3 Mobile phones
4.2.4 Other devices
o 4.3 Software
4.3.1 Other software media players
4.3.2 Nero Digital Audio
4.3.3 FAAC and FAAD2
• 5 Extensions and improvements
• 6 Container formats
• 7 See also
• 8 Notes
• 9 External links
History
AAC was developed with the cooperation and contributions of companies including Fraunhofer IIS, AT&T
Bell Laboratories, Dolby, Sony Corporation and Nokia, and was officially declared an international
standard by the Moving Picture Experts Group in April 1997. The MPEG-2 AAC LC profile consists of a
base format very much like AT&T's PAC coding format [1] [2] [3], with the addition of TNS[4], the Dolby
Kaiser window described below, a nonuniform quantizer, and a reworking of the bitstream format to
handle up to 16 stereo, 16 mono, 16 LFE, and 16 commentary channels in one bitstream. The Main profile
adds a set of recursive predictors that are calculated on each tap of the filterbank. The SSR profile uses a
4-band PQMF filterbank, with four shorter filterbanks following, in order to allow for scalable sampling
rates.
Standardization
AAC is specified both as Part 7 of the MPEG-2 standard and as Part 3 of the MPEG-4 standard. As such,
it can be referred to as MPEG-2 Part 7 or MPEG-4 Part 3 depending on its implementation; however, it is
most often referred to as MPEG-4 AAC, or AAC for short.
AAC was first specified in the standard MPEG-2 Part 7 (known formally as ISO/IEC 13818-7:1997) in
1997 as a new "part" (distinct from ISO/IEC 13818-3) in the MPEG-2 family of international standards.
It was updated in MPEG-4 Part 3 (known formally as ISO/IEC 14496-3:1999) in 1999. The reference
software is specified in MPEG-4 Part 4 and the conformance bit-streams are specified in MPEG-4 Part 5. A
notable addition in this version of the standard is Perceptual Noise Substitution (PNS).
HE-AAC (AAC with SBR) was first standardized in ISO/IEC 14496-3:2001/Amd.1. HE-AAC v2 (AAC
with Parametric Stereo) was first specified in ISO/IEC 14496-3:2001/Amd.4. [5]
The current version of the AAC standard is ISO/IEC 14496-3:2005 (with 14496-3:2005/Amd.2 for
HE-AAC v2).[6]
The MPEG-4 standard also contains other methods of compressing sound. These operate at low bit rates
and are generally used for speech.
AAC’s improvements over MP3
AAC was designed to fix many of the serious performance flaws in the MP3 format, which was specified
by ISO/IEC in MPEG-1 and MPEG-2 as 11172-3 and 13818-3 respectively.
Improvements include:
• More sample frequencies (from 8 kHz to 96 kHz) than MP3 (16 kHz to 48 kHz)
• Up to 48 channels (MP3 supports up to two channels in MPEG-1 mode and up to 5.1 channels in
MPEG-2 mode)
• Arbitrary bit-rates and variable frame length. Standardized constant bit rate with bit reservoir.
• Higher efficiency and simpler filterbank (rather than MP3's hybrid coding, AAC uses a pure
MDCT)
• Higher coding efficiency for stationary signals (AAC uses a blocksize of 1024 samples, allowing
more efficient coding than MP3's 576 sample blocks)
• Higher coding accuracy for transient signals (AAC uses a blocksize of 128 samples, allowing
more accurate coding than MP3's 192 sample blocks)
• Can use Kaiser-Bessel derived window function to eliminate spectral leakage at the expense of
widening the main lobe
• Much better handling of audio frequencies above 16 kHz
• More flexible joint stereo (separate for every scale band)
• Additional modules (tools) to increase compression efficiency: TNS, Backwards Prediction,
PNS, and others. These modules can be combined to constitute different encoding profiles.
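The Kaiser-Bessel derived (KBD) window mentioned in the list can be built from a Kaiser window by cumulative summation. The following is a minimal sketch using only the standard library; the function names and the `alpha` value used in the example are illustrative assumptions, not values taken from this article. The check at the end confirms the property that makes such a window usable with an MDCT: the squared window and its half-shifted copy sum to one.

```python
import math


def bessel_i0(x: float, terms: int = 50) -> float:
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kbd_window(length: int, alpha: float) -> list:
    """Kaiser-Bessel derived window of even `length` with shape parameter `alpha`."""
    half = length // 2
    # Kaiser kernel with half + 1 points.
    kernel = [
        bessel_i0(math.pi * alpha * math.sqrt(max(0.0, 1.0 - (2.0 * n / half - 1.0) ** 2)))
        for n in range(half + 1)
    ]
    total = sum(kernel)
    rising, running = [], 0.0
    for n in range(half):
        running += kernel[n]
        rising.append(math.sqrt(running / total))
    # The second half mirrors the first.
    return rising + rising[::-1]


# Example: a 256-point window (for 128-coefficient short blocks), alpha assumed.
w = kbd_window(256, alpha=6.0)
worst = max(abs(w[n] ** 2 + w[n + 128] ** 2 - 1.0) for n in range(128))
print(worst < 1e-9)  # True: the power-complementary condition holds
```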
Overall, the AAC format allows developers more flexibility to design codecs than MP3 does, and corrects
many of the unfortunate design choices made in the original MPEG-1 audio specification. This increased
flexibility often leads to more concurrent encoding strategies and, as a result, to more efficient
compression. However, the advantages of AAC over MP3 are not entirely decisive, and the MP3
specification, while outdated, has proven surprisingly robust in spite of considerable flaws. AAC and
HE-AAC are widely accepted as better than MP3 at low bitrates (typically less than 128 kbit/s). This is
especially true at very low bitrates, where the superior stereo coding, pure MDCT, and more optimal
transform window sizes leave MP3 unable to compete. However, as bitrate increases, the efficiency of an
audio format becomes less important relative to the efficiency of the encoder's implementation, and the
intrinsic advantage AAC holds over MP3 no longer dominates audio quality.
How AAC works
AAC is a wideband audio coding algorithm that exploits two primary coding strategies to dramatically
reduce the amount of data needed to represent high-quality digital audio: signal components that are
perceptually irrelevant are discarded, and redundancies in the coded audio signal are eliminated.
The encoding process includes the following steps:
• The signal is processed by a modified discrete cosine transform (MDCT) according to its
complexity;
• Internal error correction codes are added;
• The signal is stored or transmitted;
• To prevent corrupt samples, a modern implementation of the Luhn mod N algorithm is
applied to each frame.
The MPEG-4 audio standard does not define a single or small set of highly efficient compression schemes
but rather a complex toolbox to perform a wide range of operations from low bitrate speech coding to high-
quality audio coding and music synthesis.
• The MPEG-4 audio coding algorithm family spans the range from low bitrate speech encoding
(down to 2 kbit/s) to high-quality audio coding (at 64 kbit/s per channel and higher).
• AAC offers sampling frequencies between 8 kHz and 96 kHz and any number of channels
between 1 and 48.
• In contrast to MP3's hybrid filter bank, AAC uses the modified discrete cosine transform (MDCT)
together with the increased window lengths of 1024 points.
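To make the MDCT mentioned above concrete, here is a minimal, deliberately slow sketch of the transform and its inverse with 50% overlapping blocks. It uses a plain sine window and a tiny block size for readability; real AAC encoders use 1024 or 128 coefficients per block and fast algorithms, so treat the function names and sizes here as assumptions for illustration only.

```python
import math


def sine_window(size: int) -> list:
    """Sine window; its squares at n and n + size/2 sum to one."""
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(block: list) -> list:
    """Direct MDCT: 2N input samples -> N coefficients."""
    n_coeffs = len(block) // 2
    n0 = 0.5 + n_coeffs / 2.0
    return [
        sum(block[n] * math.cos(math.pi / n_coeffs * (n + n0) * (k + 0.5))
            for n in range(2 * n_coeffs))
        for k in range(n_coeffs)
    ]


def imdct(coeffs: list) -> list:
    """Inverse MDCT: N coefficients -> 2N time-aliased samples."""
    n_coeffs = len(coeffs)
    n0 = 0.5 + n_coeffs / 2.0
    return [
        (2.0 / n_coeffs) * sum(coeffs[k] * math.cos(math.pi / n_coeffs * (n + n0) * (k + 0.5))
                               for k in range(n_coeffs))
        for n in range(2 * n_coeffs)
    ]


def analysis_synthesis(signal: list, n_coeffs: int) -> list:
    """Window, transform, invert, window again and overlap-add."""
    window = sine_window(2 * n_coeffs)
    tail = (-len(signal)) % n_coeffs
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * (tail + n_coeffs)
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coeffs + 1, n_coeffs):
        block = [padded[start + i] * window[i] for i in range(2 * n_coeffs)]
        rebuilt = imdct(mdct(block))
        for i in range(2 * n_coeffs):
            output[start + i] += rebuilt[i] * window[i]
    return output[n_coeffs:n_coeffs + len(signal)]


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
restored = analysis_synthesis(signal, n_coeffs=8)
print(max(abs(a - b) for a, b in zip(signal, restored)) < 1e-9)  # True
```

Each block yields only half as many coefficients as input samples, yet the overlap-add step cancels the resulting aliasing, which is why no data rate is wasted on the overlap.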
AAC encoders can switch dynamically between a single MDCT block of length 1024 points or 8 blocks of
128 points.
• If a signal change or a transient occurs, 8 shorter windows of 128 points each are chosen for their
better temporal resolution.
• Otherwise, the longer 1024-point window is used by default, because its increased frequency
resolution allows for a more sophisticated psychoacoustic model, resulting in improved coding
efficiency.
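The trade-off between the two block sizes can be put in numbers, and the switching decision can be caricatured with a simple energy comparison. The sketch below is an assumption-laden illustration: the 44.1 kHz sample rate, the energy-ratio threshold and the function names are all hypothetical, and real encoders base the decision on a psychoacoustic model rather than on raw energy.

```python
import math


def block_resolution(block_size: int, sample_rate: int):
    """Return (time span in ms of one block's new samples, coefficient spacing in Hz)."""
    duration_ms = 1000.0 * block_size / sample_rate
    spacing_hz = sample_rate / (2.0 * block_size)
    return duration_ms, spacing_hz


def choose_block_type(frame: list, ratio_threshold: float = 8.0) -> str:
    """Pick 'short' if energy jumps sharply between 128-sample segments, else 'long'."""
    if len(frame) != 1024:
        raise ValueError("expected a 1024-sample frame")
    energies = [sum(s * s for s in frame[i:i + 128]) for i in range(0, 1024, 128)]
    for previous, current in zip(energies, energies[1:]):
        if current > ratio_threshold * (previous + 1e-12):
            return "short"  # 8 blocks of 128 points
    return "long"           # 1 block of 1024 points


for size in (1024, 128):
    ms, hz = block_resolution(size, 44_100)
    print(size, round(ms, 1), round(hz, 1))

steady = [math.sin(2 * math.pi * 440 * n / 44_100) for n in range(1024)]
attack = [0.0] * 512 + [1.0] * 512
print(choose_block_type(steady), choose_block_type(attack))  # long short
```

At 44.1 kHz the long block spans about 23 ms with coefficients about 21.5 Hz apart, while a short block spans about 2.9 ms with coefficients about 172 Hz apart.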
Modular encoding
AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded,
the desired performance and the acceptable output, implementers may create profiles to define which of a
specific set of tools they want to use for a particular application. The standard offers four default profiles:
• Low Complexity (LC) - the simplest and most widely used and supported;
• Main Profile (MAIN) - like the LC profile, with the addition of backwards prediction;
• Sample-Rate Scalable (SRS), a.k.a. Scalable Sample Rate (MPEG-4 AAC-SSR);
• Long Term Prediction (LTP); added in the MPEG-4 standard – an improvement of the MAIN
profile using a forward predictor with lower computational complexity.
Depending on the AAC profile and the MP3 encoder, 96 kbit/s AAC can give nearly the same or better
perceptual quality as 128 kbit/s MP3.[7]
AAC error protection toolkit
Applying error protection enables error correction up to a certain extent. Error correcting codes are usually
applied equally to the whole payload. However, since different parts of an AAC payload show different
sensitivity to transmission errors, this would not be a very efficient approach.
The AAC payload can be subdivided into parts with different error sensitivities.
• Independent error correcting codes can be applied to any of these parts using the Error Protection
(EP) tool defined in MPEG-4 Audio standard.
• This toolkit provides the error correcting capability to the most sensitive parts of the payload in
order to keep the additional overhead low.
• The toolkit is backward compatible with simpler, pre-existing AAC decoders. A great deal of the
toolkit's error correction functions are based around spreading information about the audio signal
more evenly in the datastream.
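The idea of protecting only the most sensitive part of a payload can be illustrated with a toy example. The sketch below is not the MPEG-4 EP tool: it swaps in a simple three-fold repetition code with a bitwise majority vote, and the split into a "sensitive" and a "robust" part, along with all function names, is hypothetical. It only shows why unequal protection keeps overhead low.

```python
def majority_byte(values: list) -> int:
    """Bitwise majority vote across several copies of one byte."""
    result = 0
    for bit in range(8):
        ones = sum((value >> bit) & 1 for value in values)
        if ones * 2 > len(values):
            result |= 1 << bit
    return result


def protect(sensitive: bytes, robust: bytes, repeats: int = 3) -> bytes:
    """Repeat only the sensitive part; the robust part is sent once."""
    return sensitive * repeats + robust


def recover(stream: bytes, sensitive_len: int, repeats: int = 3):
    """Rebuild the sensitive part by majority vote and return both parts."""
    copies = [stream[i * sensitive_len:(i + 1) * sensitive_len] for i in range(repeats)]
    sensitive = bytes(
        majority_byte([copy[j] for copy in copies]) for j in range(sensitive_len)
    )
    return sensitive, stream[repeats * sensitive_len:]


sensitive, robust = b"\x12\x34", b"less sensitive spectral data"
sent = bytearray(protect(sensitive, robust))
sent[0] ^= 0xFF  # simulate a transmission error in the first protected copy
print(recover(bytes(sent), len(sensitive)) == (sensitive, robust))  # True

# Overhead of protecting everything versus only the sensitive part.
print(3 * (len(sensitive) + len(robust)), len(protect(sensitive, robust)))
```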
Error Resilient (ER) AAC
Error Resilience (ER) techniques can be used to make the coding scheme itself more robust against errors.
For AAC, three custom-tailored methods were developed and defined in MPEG-4 Audio:
• Huffman Codeword Reordering (HCR) to avoid error propagation within spectral data;
• Virtual Codebooks (VCB11) to detect serious errors within spectral data;
• Reversible Variable Length Code (RVLC) to reduce error propagation within scale factor data.
AAC Low Delay
The MPEG-4 Low Delay Audio Coder (AAC-LD) is designed to combine the advantages of perceptual
audio coding with the low delay necessary for two-way communication. It is closely derived from the
MPEG-2 Advanced Audio Coding (AAC) format.
Licensing and patents
No licenses or payments are required to be able to stream or distribute content in AAC format.[8] This
reason alone makes AAC a much more attractive format to distribute content than MP3, particularly for
streaming content (such as Internet radio).
However, a patent license is required for all manufacturers or developers of AAC codecs.[9] For this
reason, FOSS implementations such as FAAC and FAAD are distributed in source form only, in order to
avoid patent infringement. (See below under Products that support AAC, Software.)
Unlike Ogg Vorbis, AAC uses proprietary technology, and thus requires a patent license. Contrary to
popular belief, it is not the property of a single company, having been developed in a standards-making
organization.
Products that support AAC
HDTV Standards
In December 2003, Japan started terrestrial DTV broadcasting using the ISDB-T standard, which
implements MPEG-2 video and MPEG-2 AAC audio. In April 2006, Japan started broadcasting the ISDB-T
mobile sub-program, called 1Seg, which was the first implementation of H.264/AVC video with HE-AAC
audio in a terrestrial HDTV broadcasting service anywhere in the world.
In December 2007, Brazil started terrestrial DTV broadcasting using the standard called International
ISDB-Tb, which implements H.264/AVC video with AAC-LC audio on the main program (single or multi)
and H.264/AVC video with HE-AAC v2 audio in the 1Seg mobile sub-program.
Hardware
In April 2003, Apple Computer brought mainstream attention to AAC by announcing that its iTunes and
iPod products would support songs in MPEG-4 AAC format (via a firmware update for older iPods).
Customers could download music in a proprietary Digital Rights Management (DRM)-restricted form of
AAC (see FairPlay) via the iTunes Store or create files without DRM from their own CDs using iTunes. In
later years, Apple began offering music videos and movies, which also use AAC for audio encoding.
On May 29, 2007, Apple began selling songs and music videos free of DRM from participating record
labels. These files mostly adhere to the AAC standard and are playable on many non-Apple products, but
they do include custom iTunes information such as album artwork and a purchase receipt, so as to identify
the customer in case the file is leaked onto peer-to-peer networks. It is possible, however, to remove
these custom tags to restore interoperability with players that conform strictly to the AAC specification.
iTunes supports a "Variable bit rate" (VBR) encoding option which encodes AAC tracks in an "Average bit
rate" (ABR) scheme. As of October 2007, Apple has not added to iTunes support for either HE-AAC
(which is fully part of the MPEG-4 standard) or true VBR encoding.
Mobile phones
For a number of years, many mobile phones from manufacturers such as Nokia, Motorola, Samsung, Sony
Ericsson, BenQ-Siemens and Philips have supported AAC playback. The first such phone was the Nokia
5510, released in 2002, which also plays MP3s. However, this phone was a commercial failure, and such
phones with integrated music players did not gain mainstream popularity until 2005, when the trend of
having AAC as well as MP3 support continued. Most new smartphones and music-themed phones support
playback of these formats.
• Sony Ericsson phones support various AAC formats in the MP4 container. AAC-LC is supported in
all phones beginning with the K700; phones beginning with the W550 support HE-AAC. The
latest devices, such as the P990, K610, W890i and later, support HE-AAC v2.
• Nokia XpressMusic and other new-generation Nokia multimedia phones also support the AAC
format.
• BlackBerry: RIM’s latest series of smartphones, such as the 8100 ("Pearl") and 8800, support
AAC.
• Apple's iPhone supports AAC and FairPlay protected AAC files used as the default encoding
format in the iTunes store.
• Palm OS PDAs: Many Palm OS based PDAs and smartphones can play AAC and HE-AAC with
the 3rd party software Pocket Tunes. Version 4.0, released in December 2006, added support for
native AAC and HE-AAC files. The AAC codec for TCPMP, a popular video player, was
withdrawn after version 0.66 due to patent issues, but can still be downloaded from sites other
than corecodec.org. CorePlayer, the commercial follow-on to TCPMP, includes AAC support.
Other PalmOS programs supporting AAC include Kinoma Player and AeroPlayer.
• Microsoft Windows Mobile platforms support AAC either through the native Windows Media Player
or through third-party products (TCPMP, CorePlayer).
Other devices
• Epson supports AAC playback in the P-2000 and P-4000 Multimedia/Photo Storage Viewers. This
support is not available with their older models, however.
• Vosonic supports AAC recording and playback in the VP8350, VP8360 and VP8390 MultiMedia
Viewers.
• The Sony Reader portable eBook plays M4A files containing AAC, and displays metadata created
by iTunes. Other Sony products, including the A and E series Network Walkmans, support AAC
with firmware updates (released May 2006) while the S series supports it out of the box.
• Nearly every major car stereo manufacturer offers models that will play back .m4a files recorded
onto CD in a data format. This includes Pioneer, Sony, Alpine, Kenwood, Clarion, Panasonic, and
JVC.
• The Roku SoundBridge network audio player supports playback of AAC encoded files.
• The Squeezebox network audio player (made by Slim Devices, a Logitech company) supports
playback of AAC files.
• The Xbox 360 supports streaming of AAC through the Zune software, and from supported iPods
connected through the USB port.
• The Wii video game console supports AAC files through version 1.1 of the Photo Channel as of
December 11, 2007. All AAC profiles and bitrates are supported as long as the file uses the .m4a
extension. This update removed MP3 compatibility, but users who have installed it may freely
downgrade to the old version if they wish.[10]
Software
The Rockbox open source firmware (available for multiple portable players) also offers support for AAC
to varying degrees, depending on the model of player and the AAC profile.
Optional iPod Support (playback of unprotected AAC files) for the Xbox 360 is available as a free
download from Xbox Live.[11]
Other software media players
Almost all current computer media players include built-in decoders for AAC, or can utilize a library to
decode it. On Microsoft Windows, DirectShow can be utilized this way with the corresponding filters to
enable AAC playback in any DirectShow based player. Software player applications of particular note
include:
• Easy CD-DA Extractor for Windows, CD Ripper and audio converter, which includes an AAC
encoder that supports LC and HE AAC.
• ffdshow is a free open source DirectShow filter for Microsoft Windows operating systems that
uses FAAD2 to support AAC decoding.
• foobar2000 is a freeware audio player for Windows that supports LC and HE AAC.
• Jetaudio is a free media player for Microsoft Windows that plays a large array of formats,
including AAC.
• The KMPlayer also supports AAC.
• KSP Sound Player also supports AAC.
• Media Player Classic
• MPlayer or xine are often used as AAC decoders on Linux.
• RealPlayer includes RealNetworks’s RealAudio 10 AAC encoder.
• Songbird for Windows, Linux and Apple Macintosh supports AAC, including the DRM
encoding used for purchased music from the iTunes Store, with a plug-in.
• Sony SonicStage also supports AAC.
• VLC media player supports playback of MP4 and AAC files.
• Winamp for Windows, which includes an AAC encoder that supports LC and HE AAC.
• Rhapsody, another Real product, supports the RealAudio AAC codec, in addition to offering
subscription tracks encoded with AAC.
• XBMC (Xbox Media Center) supports AAC (both LC and HE) on modified Xbox game consoles.
• XMMS supports MP4 playback using a plugin provided by the faad2 library.
• ConvertDirect.com converts YouTube videos to AAC files.
Some of these players (e.g., foobar2000, Winamp, and VLC) also support the decoding of raw or MP4-
contained AAC streamed over HTTP using the SHOUTcast protocol. Plug-ins for Winamp and foobar2000
enable the creation of such streams.
Nero Digital Audio
In May 2006, Nero AG released an AAC encoding tool free of charge, Nero Digital Audio, which is
capable of encoding LC-AAC, HE-AAC and HE-AAC v2 streams. The tool is command-line only, and a
separate utility is included to decode to PCM WAV.
Various tools including the foobar2000 audio player and MeGUI can provide a GUI for the encoder.
FAAC and FAAD2
FAAC and FAAD2 stand for Freeware Advanced Audio Coder and Decoder 2 respectively, and collectively
make up an open source implementation of AAC.
Extensions and improvements
• Perceptual Noise Substitution (PNS) – added in MPEG-4. It allows the coding of noise as
pseudorandom data;
• MPEG-4 Scalable To Lossless (SLS) – can supplement an AAC stream to provide a lossless
decoding option, such as in Fraunhofer IIS's "HD-AAC" product;
• High Efficiency AAC (HE-AAC), a.k.a. aacPlus v1 or AAC+ – the combination of SBR (Spectral
Band Replication) and AAC; used for low bitrates;
• HE-AAC v2, a.k.a. aacPlus v2 or eAAC+ – the combination of Parametric Stereo (PS) and HE-
AAC; used for even lower bitrates;
• Long Term Predictor (LTP) – added in MPEG-4.
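The idea behind Perceptual Noise Substitution in the list above can be sketched in a few lines: instead of coding every coefficient of a noise-like band, only the band's energy is kept, and the decoder fills the band with pseudorandom values scaled to that energy. The function names, the uniform noise source and the seed below are assumptions for illustration; the decision about which bands count as noise-like is left out entirely.

```python
import math
import random


def pns_encode_band(coefficients: list) -> float:
    """Keep only the total energy of a noise-like band."""
    return sum(c * c for c in coefficients)


def pns_decode_band(energy: float, width: int, seed: int = 0) -> list:
    """Regenerate a band of pseudorandom values carrying the stored energy."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    noise_energy = sum(v * v for v in noise)
    scale = math.sqrt(energy / noise_energy) if noise_energy > 0 else 0.0
    return [v * scale for v in noise]


band = [0.30, -0.10, 0.25, -0.40, 0.05, 0.20, -0.35, 0.15]
energy = pns_encode_band(band)
rebuilt = pns_decode_band(energy, len(band))
# One number replaces eight; the waveform differs but the energy matches.
print(abs(pns_encode_band(rebuilt) - energy) < 1e-9)  # True
```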
Container formats
In addition to the MP4 container format for storage, AAC audio data may be packaged in a more basic
format called Audio Data Interchange Format (ADIF),[12] consisting of a single header followed by the raw
AAC audio data blocks.[13] Alternatively, it may be packaged in a streaming format called Audio Data
Transport Stream (ADTS), consisting of a series of frames, each frame having a header followed by the
AAC audio data.[12] Both formats are defined in MPEG-2 part 7, but are only considered informative by
MPEG-4, so an MPEG-4 decoder does not need to support either format.[12] Two more formats are defined
in MPEG-4 part 3: Low-overhead MPEG-4 Audio Transport Multiplex (LATM), which provides a way to
combine separate audio payloads, and Low Overhead Audio Stream (LOAS), a self-synchronizing
streaming format.[12]
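Since each ADTS frame starts with its own header, a reader can inspect a stream by decoding the first seven bytes of a frame. The sketch below follows the commonly documented ADTS header bit layout (12-bit syncword, version, layer, protection flag, profile, sampling frequency index, channel configuration, 13-bit frame length); that layout is not given in this article, so treat it, along with the function and key names, as assumptions to be checked against the standard.

```python
SAMPLE_RATES = (96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350)
PROFILES = ("Main", "LC", "SSR", "LTP")


def parse_adts_header(data: bytes) -> dict:
    """Decode the main fields of a 7-byte ADTS frame header."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword (0xFFF) not found")
    sf_index = (data[2] >> 2) & 0x0F
    if sf_index >= len(SAMPLE_RATES):
        raise ValueError("reserved sampling frequency index")
    protection_absent = data[1] & 0x01
    return {
        "mpeg_version": 2 if (data[1] >> 3) & 0x01 else 4,
        "has_crc": protection_absent == 0,
        "profile": PROFILES[(data[2] >> 6) & 0x03],
        "sample_rate": SAMPLE_RATES[sf_index],
        "channel_configuration": ((data[2] & 0x01) << 2) | (data[3] >> 6),
        # Frame length counts the header itself plus the AAC payload.
        "frame_length": ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5),
        "raw_data_blocks": (data[6] & 0x03) + 1,
    }


# Hand-built example header: LC profile, 44.1 kHz, two channels.
header = bytes.fromhex("fff150802f7ffc")
info = parse_adts_header(header)
print(info["profile"], info["sample_rate"], info["frame_length"])  # LC 44100 379
```

A player can use `frame_length` to skip to the next header, which is what makes the format suitable for streaming: decoding can start at any frame boundary.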
Notes
1. ^ Johnston, J. D. and Ferreira, A. J., “Sum-difference stereo transform coding,” ICASSP '92,
March, 1992, pp. II-569-572.
2. ^ Sinha, D. and Johnston, J. D., “Audio compression at low bit rates using a signal adaptive
switched filterbank,” IEEE ASSP, 1996, pp. 1053-1057.
3. ^ Johnston, J. D., Sinha, D., Dorward, S. and Quackenbush, S., “AT&T perceptual audio coder
(PAC)” in Collected Papers on Digital Audio Bit-Rate Reduction, Gilchrist, N. and Grewin, C.
(Ed.), Audio Engineering Society, 1996.
4. ^ Herre, J. and Johnston, J. D., “Enhancing the performance of perceptual audio coders by using
temporal noise shaping,” AES 101st Convention, no. preprint 4384, 1996
5. ^ a b http://www.codingtechnologies.com/products/assets/CT_aacPlus_whitepaper.pdf
6. ^ ISO/IEC 14496-3:2005/Amd.2
7. ^ Apple - QuickTime - Technologies - AAC Audio
8. ^ Via Licensing. "MPEG-4 Audio Licensing FAQ Q6".
9. ^ Via Licensing. "MPEG-4 AAC License Fees".
10. ^ Nintendo - Customer Service | Wii - Photo Channel
11. ^ Xbox.com | System Use - Use an Apple iPod with Xbox 360
12. ^ a b c d Wolters, Martin; Kristofer Kjorling, Daniel Homm, Heiko Purnhagen. "A closer look into
MPEG-4 High Efficiency AAC" (PDF). Retrieved on 2008-07-31. Presented at the 115th
Convention of the Audio Engineering Society, 10-13 October 2003.
13. ^ "Advanced Audio Coding (MPEG-2), Audio Data Interchange Format". Library of Congress /
National Digital Information Infrastructure and Preservation Program (2007-03-07). Retrieved on
2008-07-31.