White Paper: the AAC Audio Coding Family for Broadcast and Cable TV
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Digital Speech Processing— Lecture 17
Digital Speech Processing— Lecture 17 Speech Coding Methods Based on Speech Models 1 Waveform Coding versus Block Processing • Waveform coding – sample-by-sample matching of waveforms – coding quality measured using SNR • Source modeling (block processing) – block processing of signal => vector of outputs every block – overlapped blocks Block 1 Block 2 Block 3 2 Model-Based Speech Coding • we’ve carried waveform coding based on optimizing and maximizing SNR about as far as possible – achieved bit rate reductions on the order of 4:1 (i.e., from 128 Kbps PCM to 32 Kbps ADPCM) at the same time achieving toll quality SNR for telephone-bandwidth speech • to lower bit rate further without reducing speech quality, we need to exploit features of the speech production model, including: – source modeling – spectrum modeling – use of codebook methods for coding efficiency • we also need a new way of comparing performance of different waveform and model-based coding methods – an objective measure, like SNR, isn’t an appropriate measure for model- based coders since they operate on blocks of speech and don’t follow the waveform on a sample-by-sample basis – new subjective measures need to be used that measure user-perceived quality, intelligibility, and robustness to multiple factors 3 Topics Covered in this Lecture • Enhancements for ADPCM Coders – pitch prediction – noise shaping • Analysis-by-Synthesis Speech Coders – multipulse linear prediction coder (MPLPC) – code-excited linear prediction (CELP) • Open-Loop Speech Coders – two-state excitation -
Audio Coding for Digital Broadcasting
Recommendation ITU-R BS.1196-7 (01/2019) Audio coding for digital broadcasting BS Series Broadcasting service (sound) ii Rec. ITU-R BS.1196-7 Foreword The role of the Radiocommunication Sector is to ensure the rational, equitable, efficient and economical use of the radio- frequency spectrum by all radiocommunication services, including satellite services, and carry out studies without limit of frequency range on the basis of which Recommendations are adopted. The regulatory and policy functions of the Radiocommunication Sector are performed by World and Regional Radiocommunication Conferences and Radiocommunication Assemblies supported by Study Groups. Policy on Intellectual Property Right (IPR) ITU-R policy on IPR is described in the Common Patent Policy for ITU-T/ITU-R/ISO/IEC referenced in Resolution ITU-R 1. Forms to be used for the submission of patent statements and licensing declarations by patent holders are available from http://www.itu.int/ITU-R/go/patents/en where the Guidelines for Implementation of the Common Patent Policy for ITU-T/ITU-R/ISO/IEC and the ITU-R patent information database can also be found. Series of ITU-R Recommendations (Also available online at http://www.itu.int/publ/R-REC/en) Series Title BO Satellite delivery BR Recording for production, archival and play-out; film for television BS Broadcasting service (sound) BT Broadcasting service (television) F Fixed service M Mobile, radiodetermination, amateur and related satellite services P Radiowave propagation RA Radio astronomy RS Remote sensing systems S Fixed-satellite service SA Space applications and meteorology SF Frequency sharing and coordination between fixed-satellite and fixed service systems SM Spectrum management SNG Satellite news gathering TF Time signals and frequency standards emissions V Vocabulary and related subjects Note: This ITU-R Recommendation was approved in English under the procedure detailed in Resolution ITU-R 1. -
Preview - Click Here to Buy the Full Publication
This is a preview - click here to buy the full publication IEC 62481-2 ® Edition 2.0 2013-09 INTERNATIONAL STANDARD colour inside Digital living network alliance (DLNA) home networked device interoperability guidelines – Part 2: DLNA media formats INTERNATIONAL ELECTROTECHNICAL COMMISSION PRICE CODE XH ICS 35.100.05; 35.110; 33.160 ISBN 978-2-8322-0937-0 Warning! Make sure that you obtained this publication from an authorized distributor. ® Registered trademark of the International Electrotechnical Commission This is a preview - click here to buy the full publication – 2 – 62481-2 © IEC:2013(E) CONTENTS FOREWORD ......................................................................................................................... 20 INTRODUCTION ................................................................................................................... 22 1 Scope ............................................................................................................................. 23 2 Normative references ..................................................................................................... 23 3 Terms, definitions and abbreviated terms ....................................................................... 30 3.1 Terms and definitions ............................................................................................ 30 3.2 Abbreviated terms ................................................................................................. 34 3.4 Conventions ......................................................................................................... -
Subjective Assessment of Audio Quality – the Means and Methods Within the EBU
Subjective assessment of audio quality – the means and methods within the EBU W. Hoeg (Deutsch Telekom Berkom) L. Christensen (Danmarks Radio) R. Walker (BBC) This article presents a number of useful means and methods for the subjective quality assessment of audio programme material in radio and television, developed and verified by EBU Project Group, P/LIST. The methods defined in several new 1. Introduction EBU Recommendations and Technical Documents are suitable for The existing EBU Recommendation, R22 [1], both operational and training states that “the amount of sound programme ma- purposes in broadcasting terial which is exchanged between EBU Members, organizations. and between EBU Members and other production organizations, continues to increase” and that “the only sufficient method of assessing the bal- ance of features which contribute to the quality of An essential prerequisite for ensuring a uniform the sound in a programme is by listening to it.” high quality of sound programmes is to standard- ize the means and methods required for their as- Therefore, “listening” is an integral part of all sessment. The subjective assessment of sound sound and television programme-making opera- quality has for a long time been carried out by tions. Despite the very significant advances of international organizations such as the (former) modern sound monitoring and measurement OIRT [2][3][4], the (former) CCIR (now ITU-R) technology, these essentially objective solutions [5] and the Member organizations of the EBU remain unable to tell us what the programme will itself. It became increasingly important that com- really sound like to the listener at home. -
TMS320C6000 U-Law and A-Law Companding with Software Or The
Application Report SPRA634 - April 2000 TMS320C6000t µ-Law and A-Law Companding with Software or the McBSP Mark A. Castellano Digital Signal Processing Solutions Todd Hiers Rebecca Ma ABSTRACT This document describes how to perform data companding with the TMS320C6000 digital signal processors (DSP). Companding refers to the compression and expansion of transfer data before and after transmission, respectively. The multichannel buffered serial port (McBSP) in the TMS320C6000 supports two companding formats: µ-Law and A-Law. Both companding formats are specified in the CCITT G.711 recommendation [1]. This application report discusses how to use the McBSP to perform data companding in both the µ-Law and A-Law formats. This document also discusses how to perform companding of data not coming into the device but instead located in a buffer in processor memory. Sample McBSP setup codes are provided for reference. In addition, this application report discusses µ-Law and A-Law companding in software. The appendix provides software companding codes and a brief discussion on the companding theory. Contents 1 Design Problem . 2 2 Overview . 2 3 Companding With the McBSP. 3 3.1 µ-Law Companding . 3 3.1.1 McBSP Register Configuration for µ-Law Companding. 4 3.2 A-Law Companding. 4 3.2.1 McBSP Register Configuration for A-Law Companding. 4 3.3 Companding Internal Data. 5 3.3.1 Non-DLB Mode. 5 3.3.2 DLB Mode . 6 3.4 Sample C Functions. 6 4 Companding With Software. 6 5 Conclusion . 7 6 References . 7 TMS320C6000 is a trademark of Texas Instruments Incorporated. -
Equalization
CHAPTER 3 Equalization 35 The most intuitive effect — OK EQ is EZ U need a Hi EQ IQ MIX SMART QUICK START: Equalization GOALS ■ Fix spectral problems such as rumble, hum and buzz, pops and wind, proximity, and hiss. ■ Fit things into the mix by leveraging of any spectral openings available and through complementary cuts and boosts on spectrally competitive tracks. ■ Feature those aspects of each instrument that players and music fans like most. GEAR ■ Master the user controls for parametric, semiparametric, program, graphic, shelving, high-pass and low-pass filters. ■ Choose the equalizer with the capabilities you need, focusing particularly on the parameters, slopes, and number of bands available. Home stereos have tone controls. We in the studio get equalizers. Perhaps bet- ter described as a “spectral modifier” or “frequency-specific amplitude adjuster,” the equalizer allows the mix engineer to increase or decrease the level of specific frequency ranges within a signal. Having an equalizer is like having multiple volume knobs for a single track. Unlike the volume knob that attenuates or boosts an entire signal, the equalizer is the tool used to turn down or turn up specific frequency portions of any audio track . Out of all signal-processing tools that we use while mixing and tracking, EQ is probably the easiest, most intuitive to use—at first. Advanced applications of equalization, however, demand a deep understanding of the effect and all its Mix Smart. © 2011 Elsevier Inc. All rights reserved. 36 Mix Smart possibilities. Don't underestimate the intellectual challenge and creative poten- tial of this essential mix processor. -
Bit Rate Adaptation Using Linear Quadratic Optimization for Mobile Video Streaming
applied sciences Article Bit Rate Adaptation Using Linear Quadratic Optimization for Mobile Video Streaming Young-myoung Kang 1 and Yeon-sup Lim 2,* 1 Network Business, Samsung Electronics, Suwon 16677, Korea; [email protected] 2 Department of Convergence Security Engineering, Sungshin Women’s University, Seoul 02844, Korea * Correspondence: [email protected]; Tel.: +82-2-920-7144 Abstract: Video streaming application such as Youtube is one of the most popular mobile applications. To adjust the quality of video for available network bandwidth, a streaming server provides multiple representations of video of which bit rate has different bandwidth requirements. A streaming client utilizes an adaptive bit rate (ABR) scheme to select a proper representation that the network can support. However, in mobile environments, incorrect decisions of an ABR scheme often cause playback stalls that significantly degrade the quality of user experience, which can easily happen due to network dynamics. In this work, we propose a control theory (Linear Quadratic Optimization)- based ABR scheme to enhance the quality of experience in mobile video streaming. Our simulation study shows that our proposed ABR scheme successfully mitigates and shortens playback stalls while preserving the similar quality of streaming video compared to the state-of-the-art ABR schemes. Keywords: dynamic adaptive streaming over HTTP; adaptive bit rate scheme; control theory 1. Introduction Video streaming constitutes a large fraction of current Internet traffic. In particular, video streaming accounts for more than 65% of world-wide mobile downstream traffic [1]. Commercial streaming services such as YouTube and Netflix are implemented based on Citation: Kang, Y.-m.; Lim, Y.-s. -
4. MPEG Layer-3 Audio Encoding
MPEG Layer-3 An introduction to MPEG Layer-3 K. Brandenburg and H. Popp Fraunhofer Institut für Integrierte Schaltungen (IIS) MPEG Layer-3, otherwise known as MP3, has generated a phenomenal interest among Internet users, or at least among those who want to download highly-compressed digital audio files at near-CD quality. This article provides an introduction to the work of the MPEG group which was, and still is, responsible for bringing this open (i.e. non-proprietary) compression standard to the forefront of Internet audio downloads. 1. Introduction The audio coding scheme MPEG Layer-3 will soon celebrate its 10th birthday, having been standardized in 1991. In its first years, the scheme was mainly used within DSP- based codecs for studio applications, allowing professionals to use ISDN phone lines as cost-effective music links with high sound quality. In 1995, MPEG Layer-3 was selected as the audio format for the digital satellite broadcasting system developed by World- Space. This was its first step into the mass market. Its second step soon followed, due to the use of the Internet for the electronic distribution of music. Here, the proliferation of audio material – coded with MPEG Layer-3 (aka MP3) – has shown an exponential growth since 1995. By early 1999, “.mp3” had become the most popular search term on the Web (according to http://www.searchterms.com). In 1998, the “MPMAN” (by Saehan Information Systems, South Korea) was the first portable MP3 player, pioneering the road for numerous other manufacturers of consumer electronics. “MP3” has been featured in many articles, mostly on the business pages of newspapers and periodicals, due to its enormous impact on the recording industry. -
A Predictive H.263 Bit-Rate Control Scheme Based on Scene
A PRJ3DICTIVE H.263 BIT-RATE CONTROL SCHEME BASED ON SCENE INFORMATION Pohsiang Hsu and K. J. Ray Liu Department of Electrical and Computer Engineering University of Maryland at College Park College Park, Maryland, USA ABSTRACT is typically constant bit rate (e.g. PSTN). Under this cir- cumstance, the encoder’s output bit-rate must be regulated For many real-time visual communication applications such to meet the transmission bandwidth in order to guaran- as video telephony, the transmission channel in used is typ- tee a constant frame rate and delay. Given the channel ically constant bit rate (e.g. PSTN). Under this circum- rate and a target frame rate, we derive the target bits per stance, the encoder’s output bit-rate must be regulated to frame by distribute the channel rate equally to each hme. meet the transmission bandwidth in order to guarantee a Ffate control is typically based on varying the quantization constant frame rate and delay. In this work, we propose a paraneter to meet the target bit budget for each frame. new predictive rate control scheme based on the activity of Many rate control scheme for the block-based DCT/motion the scene to be coded for H.263 encoder compensation video encoders have been proposed in the past. These rate control algorithms are typically focused on MPEG video coding standards. Rate-Distortion based rate 1. INTRODUCTION control for MPEG using Lagrangian optimization [4] have been proposed in the literature. These methods require ex- In the past few years, emerging demands for digital video tensive,rate-distortion measurements and are unsuitable for communication applications such as video conferencing, video real time applications. -
Lossy Audio Compression Identification
2018 26th European Signal Processing Conference (EUSIPCO) Lossy Audio Compression Identification Bongjun Kim Zafar Rafii Northwestern University Gracenote Evanston, USA Emeryville, USA [email protected] zafar.rafi[email protected] Abstract—We propose a system which can estimate from an compression parameters from an audio signal, based on AAC, audio recording that has previously undergone lossy compression was presented in [3]. The first implementation of that work, the parameters used for the encoding, and therefore identify the based on MP3, was then proposed in [4]. The idea was to corresponding lossy coding format. The system analyzes the audio signal and searches for the compression parameters and framing search for the compression parameters and framing conditions conditions which match those used for the encoding. In particular, which match those used for the encoding, by measuring traces we propose a new metric for measuring traces of compression of compression in the audio signal, which typically correspond which is robust to variations in the audio content and a new to time-frequency coefficients quantized to zero. method for combining the estimates from multiple audio blocks The first work to investigate alterations, such as deletion, in- which can refine the results. We evaluated this system with audio excerpts from songs and movies, compressed into various coding sertion, or substitution, in audio signals which have undergone formats, using different bit rates, and captured digitally as well lossy compression, namely MP3, was presented in [5]. The as through analog transfer. Results showed that our system can idea was to measure traces of compression in the signal along identify the correct format in almost all cases, even at high bit time and detect discontinuities in the estimated framing. -
EBU Evaluations of Multichannel Audio Codecs
EBU – TECH 3324 EBU Evaluations of Multichannel Audio Codecs Status: Report Source: D/MAE Geneva September 2007 1 Page intentionally left blank. This document is paginated for recto-verso printing Tech 3324 EBU evaluations of multichannel audio codecs Contents 1. Introduction ................................................................................................... 5 2. Participating Test Sites ..................................................................................... 6 3. Selected Codecs for Testing ............................................................................... 6 3.1 Phase 1 ....................................................................................................... 9 3.2 Phase 2 ...................................................................................................... 10 4. Codec Parameters...........................................................................................10 5. Test Sequences ..............................................................................................10 5.1 Phase 1 ...................................................................................................... 10 5.2 Phase 2 ...................................................................................................... 11 6. Encoding Process ............................................................................................12 6.1 Codecs....................................................................................................... 12 6.2 Verification of bit-rate -
CT-Aacplus — a State-Of-The-Art Audio Coding Scheme
AUDIO CODING CT-aacPlus — a state-of-the-art Audio coding scheme Martin Dietz and Stefan Meltzer Coding Technologies, Germany CT-aacPlus is a combination of Spectral Band Replication (SBR) technology – a bandwidth-extension tool developed by Coding Technologies (CT) in Germany – with the MPEG Advanced Audio Coding (AAC) technology which, to date, has been one of the most efficient traditional perceptual audio-coding schemes. CT-aacPlus is able to deliver high-quality audio signals at bit-rates down to 24 kbit/s for mono and 48 kbit/s for stereo signals. The forthcoming Digital Radio Mondiale (DRM) broadcasting system, among others, will use CT-aacPlus for its audio-coding scheme. CT-aacPlus will enable DRM to deliver an audio quality, in the frequency range below 30 MHz, that is equivalent to – or even better than – that offered by today’s analogue FM services. This article describes the principles of traditional audio coders – and their limitations when used for low bit-rate applications. The second part describes the basic idea of SBR technology and demonstrates the improvements achieved through the combination of SBR technology with traditional audio coders such as AAC and MP3. Advanced Audio Coding (AAC) has so far been one of the most efficient traditional perceptual audio-coding algorithms. In combination with the bandwidth-extension technology, Spectral Band Replication (SBR), the coding efficiency of AAC can be even further improved by at least 30%, thus providing the same audio quality at a 30% lower bit-rate. The combination of AAC and SBR – referred to as CT-aacPlus – will be used by the Digital Radio Mondiale transmission system [1] in the frequency bands below 30 MHz and will provide near- FM sound quality at bit-rates of around 20 kbit/s per audio channel.