An Analytical Comparison of Digital Audio Encoding Technologies

Sean McGrath
October 18, 2005

Executive Summary

With the recent popularity of the Internet and the use of personal computers as media devices, digital audio, especially digital music, has become a common component of most people's lives. As the number of uses for digital audio has grown, so has the number of different ways to store and encode audio digitally. This report presents a critical analysis and comparison of three of the main methods used for encoding digital audio: MPEG-1 Layer-3, MPEG-4 Advanced Audio Coding, and Vorbis I. The majority of this analysis focuses on the technical features of these methods and the approaches they use in their encoding algorithms. Each of these encoding technologies has its own unique features, benefits and drawbacks, and this report outlines them in detail.

Contents

1 Introduction
2 Digital Audio
  2.1 Sampling
  2.2 Bit Rate
  2.3 Audio Bandwidth
  2.4 Encoding and Decoding
3 Determining Audio Quality
4 Audio Codecs
  4.1 MPEG-1 Layer-3
  4.2 MPEG-4 Advanced Audio Coding
  4.3 Vorbis I
5 Performance Comparison
6 Recommendations
7 Conclusion

1 Introduction

In an age where the distribution of digital content is beginning to surpass the distribution of physical media, the manner in which this content is represented digitally can play a huge role in its acceptance by end users. This is especially true for digital forms of media (i.e. images, audio and video), since they often require large amounts of data to be represented accurately.

This report focuses on the digital storage of audio and several of the more popular methods for encoding it. The topic has become a subject of heated discussion in the last few years as the worldwide music industry has started distributing its content over the internet. When choosing how to encode digital music, you are in fact choosing the quality of the audio (how closely it resembles the original source), the amount of disk space required to store the audio at that level of quality, how compatible the audio will be with players and portable devices, and how limited you will be in the use of that audio. Limiting the end user's use of audio is accomplished through digital rights management (DRM), something that varies from encoding to encoding. DRM is often the deciding factor for the music industry when choosing how to distribute its music digitally; we will see that by making it the deciding factor, the industry is in fact drastically limiting the quality and portability of the music.

This report covers the features and specifications of the MPEG-1 Layer-3 (MP3), MPEG-4 Advanced Audio Coding (AAC) and Vorbis I (Ogg Vorbis) encoding/decoding schemes (or codecs) and the benefits that each has over the others. These are three of the most popular codecs in use today and were chosen for discussion in this report for that reason, along with the fact that they are supported on multiple platforms. Another large player in the codec world, especially with the music industry, is the Windows Media 9 audio codec, which was left out of this discussion because it is only supported on the Microsoft Windows platform.

2 Digital Audio

The primary goal of a digital audio codec is to take an existing digital audio stream, compress it (a process also called encoding) and store it in a new format. To play this encoded audio stream back, the codec must decode it. Before we can make sense of the encoding/decoding process, it is necessary to explain how audio is represented digitally.
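As a small illustration of this raw representation, the following Python sketch reads the header of an uncompressed PCM WAV file (a hypothetical input.wav) and reports the parameters introduced in the next subsections:

```python
import wave

# Inspect a (hypothetical) uncompressed PCM WAV file.
with wave.open("input.wav", "rb") as wav:
    channels = wav.getnchannels()        # e.g. 2 for stereo
    sample_width = wav.getsampwidth()    # bytes per sample, e.g. 2 for 16-bit audio
    sample_rate = wav.getframerate()     # samples per second, e.g. 44100
    frames = wav.getnframes()            # total number of samples per channel

print(f"{channels} channel(s), {8 * sample_width}-bit samples at {sample_rate} Hz")
print(f"duration: {frames / sample_rate:.1f} seconds")
```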
2.1 Sampling

Sampling of digital audio refers to the process of digitally storing the amplitude of the sound wave at a given instant in time. Each time a sample is taken, the amplitude is typically stored as a 2-byte (16-bit) or 3-byte (24-bit) value, which is capable of capturing even subtle differences in volume. When encoding digital audio, one of the key choices to be made is how often to sample the original sound source. This is known as the sampling frequency, with 44.1 kHz (44,100 samples taken per second) and higher being desired for high quality audio. Compact discs (CDs) use a sampling rate of 44.1 kHz and store each sample using 16 bits.

The Nyquist theorem states that, in order to prevent abnormal audio signals (aliasing) in the representation, a sampling frequency of at least twice the highest recorded frequency is needed [8]. The highest audio frequency that the human ear can hear is 20 kHz, so a sampling frequency of 44.1 kHz is above the minimum required to avoid these abnormalities.

Uncompressed digital audio, such as that found on CDs or in WAV files, is stored using what is called the Pulse Code Modulation (PCM) format, which uses this method of sampling and provides a very accurate reflection of the original sound.

2.2 Bit Rate

Another measurement that plays an important role in audio encoding is the bit rate, the number of bits used to store a segment of audio. Bit rates are typically measured in kilobits per second (kbps) and range from 8 kbps to 1411 kbps (the bit rate used on compact discs). Lower bit rates are often associated with lower quality, and the aim of some encoders is to overcome this and maximize audio quality at lower bit rates.

There are three bit-rate modes in common use when encoding an audio stream. The first is constant bit rate (CBR), which uses the same number of bits to store each segment of the audio stream. The second is average bit rate (ABR), which stores each second of the audio stream with roughly the same number of bits but allows the number of bits used for individual samples to vary. The final mode is variable bit rate (VBR), in which the encoder chooses the best bit rate for each segment of audio depending on its characteristics, in order to keep quality high while saving disk space.

2.3 Audio Bandwidth

The audio bandwidth of an audio source refers to the frequency range of that source; the higher the audio bandwidth of a signal, the more accurately it reproduces the original. The full bandwidth required to reproduce the range of frequencies audible to the human ear runs from 20 Hz to 20 kHz [17]. The importance of audio bandwidth will become apparent in the later sections of this report when we look at how encoders attempt to minimize disk storage.

2.4 Encoding and Decoding

A few simple calculations on the uncompressed digital audio stored on a CD show that the disk space required to store audio in this form is an issue. An audio CD uses a bit rate of approximately 1411 kbps, or 1,411,000 bits per second. This works out to roughly 172 KB per second of audio, or about 10 MB per minute.
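As a quick sanity check on these figures, the arithmetic can be sketched as follows, using the CD parameters given above:

```python
# CD-quality PCM: 44,100 samples/s, 16 bits/sample, 2 channels
sample_rate = 44_100      # Hz
bits_per_sample = 16
channels = 2

bit_rate = sample_rate * bits_per_sample * channels     # 1,411,200 bits/s (~1411 kbps)
bytes_per_second = bit_rate / 8                         # 176,400 bytes per second
kib_per_second = bytes_per_second / 1024                # ~172 KB per second
mib_per_minute = bytes_per_second * 60 / (1024 ** 2)    # ~10 MB per minute

print(f"{bit_rate / 1000:.0f} kbps, {kib_per_second:.0f} KB/s, {mib_per_minute:.1f} MB/min")
```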
Now imagine someone who wants to store their entire CD collection of 200 discs at 40 minutes apiece. This would require roughly 80 GB of storage, which even with today's large, inexpensive hard drives is impractical. By compressing this audio to lower bit rates, we can effectively reduce the size of an audio file with very little loss in quality. Studies have shown that under optimal listening conditions, even expert listeners are unable to distinguish uncompressed audio from audio compressed to about a sixth of its original size (stereo, 16-bit samples, 256 kbps, 48 kHz sampling frequency) [10]. Using these compression settings we would be able to shrink the CD library mentioned above to roughly 14 GB, at which point it could be stored on a portable device.

So if compressing audio to a bit rate of 256 kbps is enough to reduce the size to a sixth, why do encoders bother encoding at levels such as 128 kbps and 64 kbps? The answer is simply that we can decrease storage even further, to sizes that are more attractive for use on the internet and on portable devices, by slightly decreasing the quality of the audio. We will see in the later sections that the three codecs discussed focus heavily on providing high quality audio at these lower bit rates.

The compression of audio signals differs greatly from the compression of regular data files such as text files and executables. With these basic file types, compression must be non-destructive, in the sense that once they are uncompressed you have the exact same file, bit for bit, as the original. Audio compression, or encoding, is instead based on a psychoacoustic model that eliminates sounds in the input signal that are not perceived by the human ear. This results in the encoded signal sounding the same to humans, but being represented much differently, on a bit-for-bit basis, once decoded (uncompressed). Audio can be lightly compressed without destroying information when lossless compression is used; if the decoding process does not produce a bit-for-bit replica of the original, lossy encoding is being used. The psychoacoustic models used by the codecs covered in this report succeed by exploiting the limitations of the human ear to remove unnecessary components of the audio signal, a technique called perceptual coding.
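The lossless/lossy distinction can be made concrete with a small sketch. Here zlib, a general-purpose lossless compressor, merely stands in for a lossless audio codec: any lossless scheme must survive a bit-for-bit round trip, whereas a lossy perceptual codec, by design, will not.

```python
import zlib

# Pretend this is a block of raw 16-bit PCM samples.
pcm = bytes(range(256)) * 100

# Lossless compression: decoding must reproduce the input bit for bit.
compressed = zlib.compress(pcm)
restored = zlib.decompress(compressed)
assert restored == pcm   # holds for any lossless scheme

# A lossy perceptual codec (MP3, AAC, Vorbis) would fail this test:
# the decoded audio sounds the same to a listener, but it is not a
# bit-for-bit copy of the original samples.
print(f"original: {len(pcm)} bytes, losslessly compressed: {len(compressed)} bytes")
```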