An Analytical Comparison of Encoding Technologies

Sean McGrath

October 18, 2005

Executive Summary

With the recent popularity of the Internet and the use of personal computers as media devices, digital audio, especially digital music, has become a common component of most people’s lives. As the number of uses for digital audio has grown, so has the number of different ways to store and encode audio in a digital manner.

This report covers a critical analysis and comparison of three of the main methods used for encoding digital audio: MPEG-1 Layer-3, MPEG-4 Advanced Audio Coding, and Vorbis I. The majority of this analysis will focus on the technical features of these methods and the approaches they use in their encoding algorithms. Each of these encoding technologies has its own unique features, benefits and drawbacks, and this report will outline them in detail.

Contents

1 Introduction

2 Digital Audio

2.1 Sampling

2.2 Bit Rate

2.3 Audio Bandwidth

2.4 Encoding and Decoding

3 Determining Audio Quality

4 Audio Codecs

4.1 MPEG-1 Layer-3

4.2 MPEG-4 Advanced Audio Coding

4.3 Vorbis I

5 Performance Comparison

6 Recommendations

7 Conclusion

1 Introduction

In an age where the distribution of digital content is beginning to surpass distribution of physical media, the manner in which this content is represented digitally can play a huge role in the acceptance of the content by end users. This is especially true for digital forms of media (i.e. images, audio and video) since they often require large amounts of data to be represented accurately.

This report focuses on the digital storage of audio and several of the more popular methods for encoding it. This is a topic that has become a heated discussion in the last few years as the worldwide music industry has started distributing its content over the internet. When choosing how you encode digital music, you are in fact choosing the quality of the audio (how closely it resembles the original source), the amount of disk space that will be required to store the audio at that level of quality, how compatible this audio will be with players and portable devices, and how limited you will be in the use of that audio. Limiting the end user’s use of audio is accomplished through digital rights management (DRM), something that varies from encoding to encoding.

DRM is often the deciding factor for the music industry when choosing how to distribute their music digitally. We will see that by doing this they are in fact drastically limiting the quality and portability of the music.

This report covers the features and specifications of the MPEG-1 Layer-3 (MP3), MPEG-4 Advanced Audio Coding (AAC) and Vorbis I (Ogg Vorbis) encoding/decoding schemes (or codecs) and the benefits that each has over the others. These are three of the most popular codecs in use today and were chosen for discussion in this report for that reason, along with the fact that they are supported on multiple platforms. Another large player, especially with the music industry, is Windows Media 9, which was left out of this discussion because it is only supported on the Windows platform.

2 Digital Audio

The primary goal of a digital audio codec is to take an existing digital audio stream, compress it (also called encoding) and store it in a new format. To play this encoded audio stream, the codec must first decode it. Before we can make sense of the encoding/decoding process it is necessary to explain how audio is represented digitally.

2.1 Sampling

Sampling of digital audio refers to the process of digitally storing the amplitude of the sound wave at any given time. Each time a sample is taken, the amplitude is stored as typically a 2 byte (16-bit) or 3 byte (24-bit) value that is capable of measuring even subtle differences in volume. When encoding digital audio one of the key choices to be made is how often to sample the original sound source. This is known as the sampling frequency, with 44.1 kHz (44,100 samples taken per second) and higher being desired for high quality audio. Compact discs (CDs) use a sampling rate of 44.1 kHz and store each sample using 16 bits. The Nyquist Theorem states that in order to prevent abnormal audio signals in the representation, a sampling frequency of at least twice the highest recorded frequency is needed [8]. The highest audio frequency that the human ear can hear is 20 kHz so a sampling frequency of 44.1 kHz is over the minimum frequency required to avoid these abnormalities.

Uncompressed digital audio, such as that found on CDs or in WAV files, is stored using what is called the Pulse Code Modulation (PCM) format, which uses this method of sampling and provides a very accurate reflection of the original sound.
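The Nyquist condition described above can be checked numerically. The following is a minimal sketch rather than part of the original analysis; the function names are illustrative, and the 30 kHz test tone is an arbitrary choice. It shows one form the "abnormal audio signals" can take, commonly called aliasing: a tone above half the sampling frequency produces exactly the same samples as a lower tone.

```python
import math


def nyquist_rate(highest_frequency_hz):
    """Minimum sampling frequency needed for a given highest frequency."""
    return 2 * highest_frequency_hz


def sample_cosine(tone_hz, sampling_hz, count):
    """Return `count` samples of a unit-amplitude cosine tone."""
    return [math.cos(2 * math.pi * tone_hz * n / sampling_hz) for n in range(count)]


CD_SAMPLING_HZ = 44_100
HIGHEST_AUDIBLE_HZ = 20_000

# 44.1 kHz clears the 40 kHz minimum required for a 20 kHz signal.
meets_nyquist = CD_SAMPLING_HZ >= nyquist_rate(HIGHEST_AUDIBLE_HZ)

# A 30 kHz tone lies above half the sampling frequency (22.05 kHz), so its
# samples coincide with those of a 14.1 kHz tone (44.1 kHz - 30 kHz).
too_high = sample_cosine(30_000, CD_SAMPLING_HZ, 32)
alias = sample_cosine(CD_SAMPLING_HZ - 30_000, CD_SAMPLING_HZ, 32)
largest_difference = max(abs(a - b) for a, b in zip(too_high, alias))

print(meets_nyquist, largest_difference)
```

Once sampled, the two tones cannot be told apart, which is why frequencies above half the sampling frequency have to be kept out of the signal before sampling.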

2.2 Bit Rate

Another measurement that plays an important role in audio encoding is the bit rate, or the number of bits used to store a segment of audio. Bit rates are typically measured in kilobits per second (kbps), and range from 8 kbps to 1411 kbps (the bit rate used on compact discs). Lower bit rates are often associated with lower quality, and the aim of some encoders is to overcome this and maximize audio quality at lower bit rates.
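The compact disc figure quoted above follows directly from the sampling parameters in Section 2.1. A small sketch, assuming two channels (stereo), which the paragraph above does not state explicitly:

```python
def pcm_bit_rate(sampling_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sampling_hz * bits_per_sample * channels


# CD audio: 44.1 kHz, 16 bits per sample, 2 channels (stereo is an assumption here).
cd_bits_per_second = pcm_bit_rate(44_100, 16, 2)
cd_kbps = cd_bits_per_second / 1000

print(cd_bits_per_second, cd_kbps)
```

The result, 1,411,200 bits per second, is the source of the rounded 1411 kbps figure.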

There are three different methods used to manage the bit rate while encoding an audio stream. The first is constant bit rate (CBR), which uses the same number of bits to store each sample. An average bit rate (ABR), by contrast, stores each second of the audio stream with the same number of bits, but the number of bits for each sample may vary. The final type that is commonly used is a variable bit rate (VBR). With a variable bit rate, the encoder chooses the best bit rate for a segment of audio depending on its characteristics in order to keep quality high but save on disk space.
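The difference between fixed and content-dependent allocation can be illustrated with a toy example. This is only a sketch under stated assumptions: the per-segment "complexity" scores are hypothetical, and real encoders derive their decisions from a psychoacoustic model rather than a simple proportion. The variable allocation is scaled to the same total as the constant one purely to make the comparison easy; a real VBR encoder is not required to do so.

```python
def constant_allocation(complexities, bits_per_segment):
    """CBR-style: every segment receives the same number of bits."""
    return [bits_per_segment for _ in complexities]


def variable_allocation(complexities, bits_per_segment):
    """VBR-style: bits follow each segment's (hypothetical) complexity score.

    The total is kept equal to the constant case for comparison only.
    """
    total_bits = bits_per_segment * len(complexities)
    total_complexity = sum(complexities)
    return [round(total_bits * c / total_complexity) for c in complexities]


# Hypothetical scores: the third segment is the hardest to encode.
segment_complexity = [1, 1, 4, 2]
cbr = constant_allocation(segment_complexity, 1000)
vbr = variable_allocation(segment_complexity, 1000)

print(cbr)
print(vbr)
```

With these scores the simple segments give up bits that the demanding segment can use, which is the intuition behind keeping quality high without increasing the overall size.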

2.3 Audio Bandwidth

The audio bandwidth of an audio source refers to the frequency range of that source. The higher the audio bandwidth of a signal the more accurate it is. The highest bandwidth required when producing signals used by the human ear ranges from 20 Hz to 20 kHz (the audible frequencies to the human ear) [17]. The importance of audio bandwidth will become apparent in the later sections of this report when we look at how encoders attempt to minimize disk storage.

2.4 Encoding and Decoding

By doing a few simple calculations on the uncompressed digital audio that is stored on a CD we see that the disk space required to store audio in this form is an issue. An audio CD uses a bit rate of approximately 1411 kbps, or 1,411,000 bits per second. This works out to be roughly 172 KB per second of audio, or 10 MB per minute. Now imagine if someone wanted to store their entire CD collection consisting of 200 discs at 40 minutes apiece. This would require roughly 80 GB of storage, which even with today’s large inexpensive hard drives is a bit impractical.
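The storage figures above can be reproduced in a few lines. This sketch assumes binary units (1 KB = 1024 bytes and so on), which is an assumption about how the rounded figures were obtained:

```python
CD_BIT_RATE = 1_411_000  # bits per second, as quoted above
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3

bytes_per_second = CD_BIT_RATE / 8
kb_per_second = bytes_per_second / KB
mb_per_minute = bytes_per_second * 60 / MB

# 200 discs at 40 minutes each.
collection_minutes = 200 * 40
collection_gb = bytes_per_second * 60 * collection_minutes / GB

print(round(kb_per_second), round(mb_per_minute), round(collection_gb, 1))
```

Under this assumption the collection comes to a little under 79 GB, in line with the rough 80 GB estimate.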

By compressing this audio into lower bit rates, we can effectively reduce the file size of an audio file with very little loss in quality. Studies have shown that under optimal listening conditions, even expert listeners are unable to distinguish uncompressed audio from compressed audio (stereo, 16 bit samples, 256 kbps, 48 kHz sampling frequency) at a sixth of the original size [10]. Using these compression settings we would be able to shrink the CD library mentioned above to 14 GB, where it could then be stored on a portable device. So if compressing audio to a bit rate of 256 kbps is enough to reduce the size to a sixth, why do encoders bother encoding at levels such as 128 kbps and 64 kbps? The answer is simply that we can decrease storage even more, to sizes that are more attractive for use on the internet and portable devices, by slightly decreasing the quality of the audio. We will see in the later sections that the three codecs discussed focus heavily on providing high quality audio at these lower bit rates.
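Because file size scales with bit rate for a fixed duration, the effect of each target bit rate on the 80 GB library can be estimated directly. A minimal sketch, assuming the uncompressed source is 1411 kbps CD audio and ignoring container overhead:

```python
def compressed_size(original_size, original_kbps, target_kbps):
    """Scale a size by the ratio of target to original bit rate."""
    return original_size * target_kbps / original_kbps


LIBRARY_GB = 80   # rough library size from the calculation above
SOURCE_KBPS = 1411

library_sizes_gb = {
    kbps: compressed_size(LIBRARY_GB, SOURCE_KBPS, kbps)
    for kbps in (256, 128, 64)
}

for kbps, size in library_sizes_gb.items():
    print(kbps, "kbps:", round(size, 1), "GB")
```

At 256 kbps the library comes to roughly 14.5 GB, and each halving of the bit rate halves the size again.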

The compression of audio signals differs greatly from the compression of regular data files such as text files and executables. With these basic file types, compression must be non-destructive in the sense that once they are uncompressed you have the exact same file, bit for bit, as the original. Audio compression, or encoding, is based on a psychoacoustic model that eliminates sounds in the input signal that are not perceived by the human ear. This results in the encoded signal sounding the same to humans, but being represented much differently, on a bit for bit basis, once decoded (uncompressed). Audio can be lightly compressed without destroying information when lossless encoding is used; otherwise, if the decoding process doesn’t produce a bit for bit replica of the original, lossy encoding is being used.

The psychoacoustic models that are used by the codecs covered in this report succeed by using the limitations of the human ear to remove unnecessary noises in the audio signal, a technique called perceptual coding.

The human ear has a minimum hearing threshold that determines how loud a sound at a particular frequency must be before the ear can detect it. This threshold varies across frequencies due to the sensitivity variation of our ear. The ear is most sensitive to sounds in the range of 1 kHz to 5 kHz and less sensitive at higher and lower frequencies [11]. Audio codecs can therefore eliminate any signals that fall below this threshold.
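The shape of this threshold can be sketched with a widely used approximation of the threshold in quiet, usually attributed to Terhardt. That formula is an assumption brought in for illustration; it is not taken from this report or from any of the codec standards, and the component levels below are hypothetical values in dB SPL.

```python
import math


def threshold_in_quiet_db(frequency_hz):
    """Approximate hearing threshold in dB SPL (Terhardt-style formula)."""
    khz = frequency_hz / 1000.0
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


def audible_components(components):
    """Keep only (frequency_hz, level_db) pairs at or above the threshold."""
    return [
        (frequency, level)
        for frequency, level in components
        if level >= threshold_in_quiet_db(frequency)
    ]


# Three hypothetical components, all at the same 10 dB SPL level.
spectrum = [(100, 10.0), (3000, 10.0), (15000, 10.0)]
kept = audible_components(spectrum)

print(kept)
```

Only the 3 kHz component survives: at the same level, the 100 Hz and 15 kHz components fall below the approximate threshold and could be discarded by an encoder.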

Similar to the minimum hearing threshold is the concept of amplitude masking. Amplitude masking occurs when a loud tone (masker) prevents a quiet tone (maskee), that is played at roughly the same time, from being heard [11]. Based on the sensitivities of the ear, two different tones with the same amplitude will appear to have different volumes, and masking can occur [6].

3 Determining Audio Quality

Although the motivation behind encoding audio is to reduce file sizes, many people remain concerned with the loss of quality when using lower bit rates. When an encoder is unable to properly compress an audio signal at a low bit rate, the results can include artifacts, lost bandwidth, pre-echoes and “double speak”.

Artifacts are abnormal noises added to the audio signal during the encoding process and are commonly found when the bit rate is too low. These noises differ in their nature and include everything from digital “blips” to distortions.

It is very easy for audio bandwidth to be lost during the encoding phase when the encoder uses up all of the bits for a segment of audio that are specified by the bit rate [2]. When the encoder uses up all of its allocated bits, it simply drops sections of the audio stream resulting in the deletion of some audio frequencies.

Sometimes it is possible for a noise signal to be generated in the encoded audio stream before the audio that causes the noise even occurs [2]. These are referred to as pre-echoes. Similarly, the encoding process can also trigger a slight reverberation effect known as “double speak” when the time signature of the audio interferes with the sampling frequency.

4 Audio Codecs

4.1 MPEG-1 Layer-3

Introduction

Most experts will agree that the MP3 codec single-handedly set the standard for today’s codecs and is also the file format that has driven much of the broadband internet and CD-RW burner sales in the last several years. Shortly after the International Organization for Standardization (ISO) formed the Moving Picture Experts Group (MPEG), work began on its first audio codec. This work started in 1987 and continued until 1992, when the codec was finally standardized (ISO/IEC standard IS 11172). MPEG-1 is an open standard that is available to anyone wishing to use it, provided they pay the fee to license the technology and any royalties that may arise.

Encoding Process

The MP3 codec, like all audio codecs, aims to reduce file size by lowering the bit rate through perceptual coding. This process can be seen in Figure 1. The encoding algorithm begins with input from a PCM audio stream, typically a CD or a WAV file, and breaks each frame (determined by the sampling frequency) of the stream into 32 frequency sub-bands. This is accomplished through the use of a polyphase filterbank, and is done to aid in the removal of redundant information from the audio stream [2]. These 32 sub-bands are then broken up further into a total of 576 sub-bands (18 per original sub-band, each with a bandwidth of 41.75 Hz) through a lossless Modified Discrete Cosine Transformation (MDCT) [11].
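The claim that the MDCT step is lossless can be demonstrated with a direct, unoptimized implementation. The sketch below is not the MP3 hybrid filterbank: it assumes a plain sine window and 50 percent overlapping blocks, omits the polyphase stage and any window switching, and uses illustrative function names. It shows that overlapping blocks of 36 samples, each reduced to 18 coefficients, can be added back together to recover the input.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2 * n_coeffs (satisfies w[n]^2 + w[n+N]^2 = 1)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_coeffs)) for n in range(2 * n_coeffs)]


def mdct(block, window):
    """Transform 2N windowed samples into N coefficients."""
    n_coeffs = len(block) // 2
    shift = 0.5 + n_coeffs / 2
    return [
        sum(
            window[n] * block[n]
            * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for n in range(2 * n_coeffs)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs, window):
    """Transform N coefficients back into 2N windowed, aliased samples."""
    n_coeffs = len(coeffs)
    shift = 0.5 + n_coeffs / 2
    return [
        window[n] * (2.0 / n_coeffs) * sum(
            coeffs[k] * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for k in range(n_coeffs)
        )
        for n in range(2 * n_coeffs)
    ]


def analyse_and_resynthesise(signal, n_coeffs):
    """Run MDCT then IMDCT on 50% overlapping blocks and overlap-add the result."""
    window = sine_window(n_coeffs)
    output = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_coeffs + 1, n_coeffs):
        block = signal[start:start + 2 * n_coeffs]
        for offset, value in enumerate(imdct(mdct(block, window), window)):
            output[start + offset] += value
    return output


N = 18  # coefficients per block, matching the 18-way split described above
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * N)]
rebuilt = analyse_and_resynthesise(signal, N)

# Only samples covered by two overlapping blocks are fully reconstructed.
worst_error = max(abs(signal[n] - rebuilt[n]) for n in range(N, 3 * N))
print(worst_error)
```

The reconstruction error in the overlapped region is at the level of floating-point rounding, so any loss in an MDCT-based codec comes from the quantization that follows, not from the transform itself.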

At the same time this breakdown is occurring, the audio stream is passed through a psychoacoustic model which gathers statistics such as the masking threshold for the different audio bands. The information provided by this process is used in the next phase of encoding to eliminate the audible noise created by the encoding process.

[Figure 1: The MP3 Encoding Scheme. Block diagram: PCM Audio Stream, Polyphase Filterbank (32 bands), MDCT (576 bands), Quantization and Coding, Bit Stream Formation, Encoded Bit Stream, with the Psychoacoustic Model as a parallel path.]

The next step is to quantize the 576 sub-bands produced by the MDCT. The results from the psychoacoustic model are used in this step to ensure that any noise created from this process is kept below the masking threshold, and is therefore inaudible. Huffman coding is then used on the quantized values in order to compress them further [18]. The values are compressed repeatedly until the number of bits in each audio frame matches the chosen bit rate of compression [5]. Variable bit rates are made possible through the use of a bit reservoir, which stores bits that are not used up and can be applied to other audio frames that require additional bits.
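The benefit of Huffman coding on quantized values can be seen with a small example. This is a generic sketch: it builds a code from the symbol counts of a hypothetical list of quantized values, whereas MP3 encoders choose from predefined code tables, so the table-building step here is illustrative only.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, index, {symbol: 0})
            for index, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        count_a, _, depths_a = heapq.heappop(heap)
        count_b, _, depths_b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**depths_a, **depths_b}.items()}
        heapq.heappush(heap, (count_a + count_b, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]


# Hypothetical quantized values: small magnitudes dominate.
quantized = [0] * 10 + [1] * 4 + [-1] * 4 + [2] + [-2]
lengths = huffman_code_lengths(quantized)

huffman_bits = sum(lengths[value] for value in quantized)
fixed_bits = 3 * len(quantized)  # 5 distinct values need 3 bits each

print(huffman_bits, fixed_bits)
```

Because the most frequent value receives the shortest code, the 20 values need 38 bits instead of the 60 bits a fixed 3-bit code would use, and no information is lost.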

The final step is to take the resulting quantized/coded samples and restructure them back into a bit stream along with error correction codes that becomes the final encoded bit stream [6].

Features

The MP3 codec supports four operating modes: mono, dual channel, stereo and joint stereo. This means that at most it only supports the encoding of 2 channels. It also supports bit rates in the range of 32 kbps to 320 kbps as well as variable bit rates. Bit rates below 128 kbps (64 kbps per channel) are considered to be less than CD quality and on roughly the same level as AM/FM radio, short wave radio and telephone sound [5]. The sampling frequencies of MP3 encoding include 32 kHz, 44.1 kHz, and 48 kHz.

Limitations

Despite the large amount of success the MP3 codec has had, it doesn’t come without its limitations. To begin with, not all encoders produce the same results or the same level of quality. The implementation of the encoder is left entirely up to the developer, and isn’t specified in the standard [2]. Another factor that may cause variations between encoders is that the MP3 standard does not specify the psychoacoustic model that should be used [7].

With a maximum bit rate of 320 kbps, there is an upper limit on the number of bits available to each frame, which may be limiting in cases where a particular frame needs more storage in order to accurately represent the audio signal.
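The per-frame budget implied by a given bit rate can be estimated as follows. The sketch assumes 1152 samples per frame for MPEG-1 Layer-3, a figure that is not stated in this report, and ignores padding and header overhead.

```python
SAMPLES_PER_FRAME = 1152  # assumption: MPEG-1 Layer-3 frame length


def frame_duration_ms(sampling_hz):
    """Duration of one frame in milliseconds."""
    return 1000 * SAMPLES_PER_FRAME / sampling_hz


def frame_bits(bit_rate_bps, sampling_hz):
    """Average number of bits available to one frame at a given bit rate."""
    return bit_rate_bps * SAMPLES_PER_FRAME / sampling_hz


bits_at_320 = frame_bits(320_000, 44_100)
bits_at_128 = frame_bits(128_000, 44_100)

print(round(frame_duration_ms(44_100), 2), int(bits_at_320), int(bits_at_128))
```

Under these assumptions a frame lasts about 26 ms at 44.1 kHz, with roughly 8,359 bits available at 320 kbps and about 3,343 bits at 128 kbps.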

As previously mentioned, the PCM audio is partitioned twice: first by the polyphase filterbank, a lossy operation, which creates 32 sub-bands (in order to maintain backwards compatibility with lower MPEG-1 layers), and then by a lossless MDCT that partitions these sub-bands further. An approach that performed the MDCT at the beginning would produce higher quality results with more transparent encoding [19].

4.2 MPEG-4 Advanced Audio Coding

Introduction

Shortly after the standardization of the MPEG-1 format, work began on a new higher efficiency codec that would outperform the MP3 codec. Finally in 1997, MPEG-2 Advanced Audio Coding (AAC) was standardized (IS 13818-7). This codec is the core of the MPEG-4 AAC codec which was first released in 1998 and is one of the state-of-the-art audio codecs in use today in services such as Apple Computer’s iTunes Music Store and XM Satellite Radio. MPEG-4 AAC extends the functionality of MPEG-2 AAC by adding better support for low bit rate encodings and other tools to enhance the usefulness of the codec [20].

Improvements to MPEG-1

In order to outline the differences between AAC and MP3, we will focus on a few of the major improvements made in the MPEG-2 AAC encoding process. The encoding process, in general, is far more complex than the relatively simple MP3 encoding process, but there are a few key concepts that will be discussed.

One of the limitations of the MP3 codec was that it used a lossless MDCT operation immediately following a lossy operation for the sake of retaining backwards compatibility. The MPEG-2 AAC codec wasn’t designed to be backwards compatible with anything, so it skips the lossy partitioning of the polyphase filterbank and performs a more complex MDCT. The MDCT in the MP3 codec partitioned the audio signal into 576 sub-bands, but the MPEG-2 AAC codec produces 1024 sub-bands that offer a higher resolution and increase the efficiency of the coding process [4].
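The gain in frequency resolution can be quantified by dividing the representable bandwidth (half the sampling frequency) evenly among the sub-bands. This sketch assumes a 48 kHz sampling frequency; the widths scale with whatever sampling frequency is actually used.

```python
def sub_band_width_hz(sampling_hz, sub_bands):
    """Width of each sub-band when half the sampling frequency is split evenly."""
    return (sampling_hz / 2) / sub_bands


SAMPLING_HZ = 48_000  # assumed for illustration

mp3_width = sub_band_width_hz(SAMPLING_HZ, 576)
aac_width = sub_band_width_hz(SAMPLING_HZ, 1024)

print(round(mp3_width, 2), round(aac_width, 2))
```

At 48 kHz the 576 MP3 sub-bands are each about 41.67 Hz wide, close to the bandwidth quoted in Section 4.1, while the 1024 AAC sub-bands are about 23.44 Hz wide.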

MPEG-2 AAC also improves on the joint stereo and Huffman codings to enhance the coding efficiency. The Huffman coding used by the MP3 codec worked on one frequency band at a time, whereas the AAC codec performs the coding on quadruples of frequency bands [1]. MPEG-2 AAC also adds what is called Temporal Noise Shaping (TNS), which is used to combat pre-echoes and improve the removal of quantization noise [12].

The MPEG-4 AAC codec builds onto the enhancements in the MPEG-2 AAC codec with the primary goal of adding functionality. MPEG-4 AAC focuses a lot on coding at very low bit rates such as 2 kbps for applications such as spoken word. The MPEG-4 AAC coding phase replaces the standard Huffman coding with the Transform-Domain Weighted Interleave Vector Quantization (TwinVQ) when using low bit rates ranging from 2 kbps to 16 kbps in order to achieve better results [13].

Experts have shown that, with these improvements, the AAC format offers the same quality as MP3 at only 70 percent of the bit rate [1].

The MPEG-4 AAC codec extends the sampling frequency of MPEG-1 to include rates between 8 kHz and 96 kHz (inclusive). The bit rates that are supported go as low as 2 kbps and as high as 320 kbps and there is support for up to 48 full frequency channels [21].

Other tools that have been implemented in the MPEG-4 AAC standard differ considerably from tools in other codecs, and show the motivation for using this codec in MPEG-4 video. These tools include support for multilingualism, digital audio effects such as reverberation, selective channel selection, and text-to-speech applications [14, 15].

4.3 Vorbis I

Introduction

Vorbis I is an audio codec that originated in 1998 from research by the Xiph.Org Foundation in an attempt to combat the problems that began arising with the licensing and royalty fees of MP3. It was developed as an open source codec and remains patent free, so everyone is free to use it as they wish without the need to pay a licensing fee or other royalties. Since its release it has been slowly gaining popularity, mainly with game developers and so-called “audiophiles” who are concerned with audio quality. It is currently considered a high-end “second generation” audio codec in the same league as MPEG-4 AAC.

It is important to distinguish between the terms Vorbis and Ogg Vorbis. Vorbis refers to the audio codec, and Ogg Vorbis refers to a file format that contains audio compressed by the Vorbis codec [8].

Uniqueness Over MP3 and AAC

One aspect of the Vorbis codec that immediately sets it apart from the AAC and MP3 codecs is that its specification is defined in terms of the decoder, and not the encoder. It assumes that any encoder that produces a bit stream that is readable by a decoder is a valid Vorbis encoder [22]. It supports multiple bit rates and sampling frequencies ranging from 48 kbps to 350 kbps and 8 kHz to 192 kHz respectively. It is also important to note that all of the bit rates supported by the Vorbis codec are variable bit rates only.

The Ogg Vorbis community is less concerned with bit rates than the MP3 and AAC communities, and more concerned with audio quality. This is why the majority of Ogg Vorbis encoders don’t mention bit rates, but instead let the user choose a given level of quality between -1 and 10, with 10 being the highest. A quality level of 2 typically results in a file of the same quality as an MP3 encoded at 128 kbps, but one that is 25 percent smaller [8].

A unique feature that is potentially supported in Ogg Vorbis files is the concept of “bit rate peeling”. Bit rate peeling allows lower bit rates to be extracted from an Ogg Vorbis file without re-encoding by simply chopping off the tails of each audio frame in the file [9].

Instead of using joint stereo coding like the other two codecs, Vorbis uses what is called lossy channel coupling which is very similar to joint stereo coding and distorts the audio signal’s stereo image by combining redundant tones in different channels in order to save space [8].

There is one area where Vorbis pulls ahead of even AAC, and that is with respect to the number of distinct audio channels that are supported. Where MP3 supports 2 and AAC supports 48, Vorbis can support up to 255 channels.

5 Performance Comparison

To compare the relationships between bit rates and file size for the three codecs mentioned above, a test plan was created that tested 2 (acoustically different) songs at 4 different bit rates. The tests were performed in a Mac OS X environment using iTunes to encode MP3 and AAC files, and Ogg Drop X to encode Ogg Vorbis files. The bit rates used were 64 kbps, 128 kbps, 256 kbps, and a VBR of 128 kbps. The VBR of 128 kbps was not used for AAC due to lack of encoder support, and all Ogg Vorbis files were VBR. The songs were encoded at a sampling frequency of 44.1 kHz in stereo.

The songs that were encoded were Nashville Skyline by Dishwalla, a slow quiet song, and Accidents by Alexisonfire, a faster song with lots happening sonically. The songs were encoded directly from CD.

Bit rate          MP3       MPEG-4 AAC   Ogg Vorbis
64 kbps           2.1 MB    2.1 MB       2 MB
128 kbps          4.2 MB    4.3 MB       4.2 MB
256 kbps          8.5 MB    8.6 MB       8.4 MB
128 kbps (VBR)    4.5 MB    -            4.2 MB

Figure 2: File sizes for Nashville Skyline

Bit rate          MP3       MPEG-4 AAC   Ogg Vorbis
64 kbps           1.9 MB    1.9 MB       2 MB
128 kbps          3.8 MB    3.8 MB       3.7 MB
256 kbps          7.6 MB    7.6 MB       7.6 MB
128 kbps (VBR)    4.1 MB    -            3.7 MB

Figure 3: File sizes for Accidents

The results of this test for Nashville Skyline are shown in Figure 2. Uncompressed, this song file was 47.1 MB with a bit rate of 1411 kbps and a sampling frequency of 44.1 kHz. The Ogg Vorbis files are generally smaller than all others, and the AAC files are the largest.

The results of the test for Accidents are shown in Figure 3, and again the Ogg Vorbis files are generally smaller than or equal to all the others, and the AAC files are the largest. Uncompressed, the song file was 41.9 MB with a bit rate of 1411 kbps and a sampling frequency of 44.1 kHz.
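As a sanity check on these measurements, the size expected at each constant bit rate can be estimated from the uncompressed size alone, since both files cover the same duration. This sketch uses only the figures reported above and ignores file headers and metadata; the dictionary and function names are illustrative.

```python
UNCOMPRESSED_MB = {"Nashville Skyline": 47.1, "Accidents": 41.9}
OGG_64_KBPS_MB = {"Nashville Skyline": 2.0, "Accidents": 2.0}
SOURCE_KBPS = 1411


def expected_size_mb(uncompressed_mb, target_kbps):
    """Estimate encoded size by scaling with the bit rate ratio."""
    return uncompressed_mb * target_kbps / SOURCE_KBPS


expected = {
    song: {kbps: round(expected_size_mb(size, kbps), 1) for kbps in (64, 128, 256)}
    for song, size in UNCOMPRESSED_MB.items()
}

# Measured compression ratio of the 64 kbps Ogg Vorbis files.
ogg_ratio = {
    song: UNCOMPRESSED_MB[song] / OGG_64_KBPS_MB[song]
    for song in UNCOMPRESSED_MB
}

print(expected)
print(ogg_ratio)
```

The estimates (for example about 4.3 MB and 3.8 MB at 128 kbps) sit within 0.1 MB of the measured values, and the 64 kbps Ogg Vorbis files correspond to compression ratios of roughly 21:1 to 24:1.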

It should be noted that the audio quality of the 64 kbps MP3s was far below that of the 64 kbps AAC and Ogg Vorbis, and would be considered intolerable by most people.

Others have performed far more detailed experiments using these codecs that measure more complex attributes of the encoded file. The test outlined above is a basic test that outlines file sizes for given bit rates, something that nearly anyone can do on their own. In experiments that measure quality, listening tests are often performed with listeners ranging from expert listeners to ordinary people without trained ears.

Results from tests performed by Dolby Labs stated that the AAC format was the only codec given the rank of “Excellent” at 64 kbps [16]. MP3 was also included in this test, but Ogg Vorbis was not. Another test performed by ExtremeTech.com that compared MP3, AAC and Ogg Vorbis produced results that indicated Ogg Vorbis was the best sounding codec, especially at a bit rate of 64 kbps [3].

6 Recommendations

When trying to determine which codec is the ‘best’, there are several factors that need to be considered. These include platform support (present and future), portable devices support, quality-to-size ratio, and appeal to commercial endeavors and developers.

In terms of support, MP3 is way out in front of AAC and Ogg Vorbis, due mainly to the fact that it has been around a lot longer. Nearly every operating system and an abundance of portable devices (music players, cell phones, PDAs, etc.) support the MP3 format and have free players/software available. AAC and Ogg Vorbis have little support on portable devices, and software for them is harder to find for certain operating systems.

Support is one factor in choosing what codec is ideal for portable devices, but the quality-to-size ratio is a good indicator of what codec should be supported on these devices. Storage is often an issue on portable devices and most users of these devices are concerned with audio quality, so AAC and especially Ogg Vorbis are the best choice for the audio codec that should be supported. This also shows that AAC and Ogg Vorbis are excellent choices if storage is at all an issue, not just on portable devices.

From a business standpoint, such as that of an online music store or a software development company, MP3 and AAC prove to be less attractive because of their licensing fees and royalties. Ogg Vorbis is the best choice here, since it is entirely patent and royalty free for developers to use.

The future will likely see AAC become the leading codec over MP3 and Ogg Vorbis since it was developed to outperform MP3 and it has the support of various industries (iTunes Music Store and MPEG-4 Video).

7 Conclusion

With the wide range of digital audio codecs that are available today it is quite feasible to transfer entire CD collections into digital form so they can then be stored on a computer or even a small portable device no bigger than a deck of cards. Using low bit rates and a sophisticated codec such as Vorbis I, it is easy to achieve a compression ratio of approximately 21:1 that causes very little loss of audio quality. Although it is hard to state which codec is the best, in terms of audio quality and performance at low bit rates AAC and Ogg Vorbis are above and beyond MP3.

As we see digital audio come to play a larger role in people’s lives and technological advances lower the cost of storage and internet bandwidth, audio codecs are still going to play an important role, and a codec that offers high quality with small file sizes will still be desired by many.

References

[1] K. Brandenburg. MP3 and AAC Explained. AES 17th International Conference on High Quality Audio Coding, 1999.

[2] K. Brandenburg and H. Popp. An Introduction to MPEG Layer-3. Technical report, Fraunhofer Institut Integrierte Schaltungen (IIS), 2000.

[3] Jason Cross. Audio Codec Quality Shootout. http://www.findarticles.com/p/articles/mi zdext/is 200404/ai ziff123493/pg 4.

[4] Peter Doliwa. MPEG-4 Advanced Audio Coding.

[5] Fraunhofer. Audio and Multimedia MPEG Audio Layer-3. http://www.iis.fraunhofer.de/amm/techinf/layer3/.

[6] Christopher Hoult. A Comparison of the ATRAC and MPEG-1 Layer 3 Audio Compression Algorithms. Technical report, University of Southampton, 2002.

[7] Chris A. Lanciani. Auditory Perception and the MPEG Audio Standard. PhD thesis, Georgia Institute of Technology, 1995.

[8] Graham Mitchell. An Introduction to Compressed Audio with Ogg Vorbis. http://grahammitchell.com/writings/vorbis intro.html, 2004.

[9] Jack Moffitt. Ogg Vorbis–Open, Free Audio–Set Your Media Free. http://www.linuxjournal.com/article/4416.

[10] Davis Pan. A Tutorial on MPEG/Audio Compression. IEEE MultiMedia, 2(2):60–74, 1995.

[11] Ken C. Pohlmann. Principles of Digital Audio. McGraw-Hill Professional, 4th edition, 2000.

[12] Luckose Poondikulam, Suyog Moogi, Rahul Kumar, Kalyan Chakravarthy, Vaibhav Mathur, and Barker Nalanikuzhi. Efficient Implementation of Transform Based Audio Coders Using SIMD Paradigm and Multifunction Computations.

[13] Heiko Purnhagen. An Overview Of Mpeg-4 Audio Version 2. AES 17th International Conference on High Quality Audio Coding, 1999.

[14] Ramapriya Rangachar. Analysis and Improvement of the MPEG-1 Audio Layer III Algorithm at Low Bit-Rates. Master’s thesis, Arizona State University, December 2001.

[15] Eric D. Scheirer. Structured Audio and Effects Processing in the MPEG-4 Multimedia Standard. Multimedia Systems, 7(1):11–22, 1999.

[16] G. Stoll and F. Kozamernik. EBU listening tests on Internet audio codecs. Technical report, EBU Project Group, 2000.

[17] Glossary of Audio/Video and Production Terminology. http://www.ardenwoodsnd-dvd.com/glossary/glossary a.html.

[18] Huffman Coding. http://en.wikipedia.org/wiki/Huffman_coding.

[19] Mp3 limitations. http://www.mp3-tech.org/content/?Mp3%20Limitations.

[20] MPEG-4 AAC Standard. http://www.vialicensing.com/products/mpeg4aac/standard.html.

[21] MPEG-4 Audio: AAC. http://www.apple.com/mpeg4/aac/.

[22] Ogg Vorbis I Format Specification. http://www.xiph.org/ogg/vorbis/doc/vorbis-spec-intro.html, 2002.