White Paper

Reducing the complexity of sub-band ADPCM coding to enable high-quality audio streaming from mobile devices

A technical white paper by Neil Smyth and David Trainor, APTX
www.aptx.com


ABSTRACT

The number of consumer audio applications demanding high quality audio compression and communication across networks continues to grow. While consumers increasingly demand higher audio quality, devices such as portable media players and wireless headsets also require low computational complexity, low power dissipation and practical transmission bit rates to help conserve battery life. This paper discusses research undertaken to lower the complexity of existing high-quality sub-band ADPCM coding schemes to better satisfy these conflicting criteria.

INTRODUCTION

The widespread popularity of portable media devices has allowed consumers to enjoy audio/video entertainment, games and online content at their convenience. The only limitation is the awkward necessity of wires to connect these devices to displays, earphones or car/home entertainment systems. Wireless audio streaming removes this constraint, isolating the speaker device and allowing users to keep their mobile devices in a safe or unobtrusive location during playback.

A major obstacle to wireless audio streaming is that the mobile device and/or speaker system are typically power restricted. If the application involves video playback there is also a requirement for low latency in order to provide acceptable lip-synch. These requirements have led to the use of ADPCM coders rather than perceptual coders in wireless audio/video devices, due to their relatively low complexity and latency. As the storage capacities of these devices continue to grow, consumers increasingly exploit the quality benefits of higher bit rates when compressing their audio content. As consumers become accustomed to high quality wired audio playback they will demand that this quality is maintained when enjoying the convenience of wireless streaming. All of these factors lead to the opposing design constraints of high quality and low complexity.

A number of audio compression algorithms are used for wireless audio streaming, including SBC [1] and Enhanced apt-X [2]. The Philips-developed SBC algorithm was selected as a mandatory codec for use in Bluetooth to ensure interoperability between products. Bluetooth SBC is a frame-based variable rate APCM codec with low complexity processing overhead, an adaptive quantization step size and lower latency than optional Bluetooth codecs such as MP3 and AAC.


Enhanced apt-X is a frameless ADPCM codec (see Figure 1 for a basic overview) with a fixed compression ratio of 4:1, an adaptive quantization step size and predictive differential coding. The use of prediction and differential coding in such an ADPCM codec provides reduced levels of quantization noise and thus quality improvements over an APCM codec. However, a significant obstacle to deployment of ADPCM codecs in comparison to APCM is the additional computational cost associated with the prediction process.

Fig. 1: ADPCM sub-band codec block diagram: (a) ADPCM sub-band codec, (b) ADPCM encoder, (c) ADPCM decoder

The difference in latency between the SBC and Enhanced apt-X codecs should also be considered. The frameless structure of Enhanced apt-X provides extremely low latencies. In contrast, the framed SBC codec requires buffering of frame data prior to encoding, thereby incurring additional algorithmic delay. When considering the impact of framed and frameless coding, the buffering and transmission delays incurred in the wireless transmission system should also be considered. If the wireless transmission system is designed to accommodate low latency operation, the low latencies associated with a frameless streaming codec such as Enhanced apt-X will be apparent and desirable.

The contrasting algorithmic structures of SBC and Enhanced apt-X have led these two algorithms to be adopted as the basis for research into lowering the complexity of high-quality sub-band ADPCM coding.

The research goals of lowering the complexity of ADPCM coding are as follows:

• Reduce the power dissipation
• Lower the computational complexity
• Enhance the audio quality
• Reduce memory storage/access requirements

This paper outlines the results achieved thus far in developing a high quality audio codec for use in wireless applications. Here we describe how careful selection of the sub-band filter bank can reduce processing requirements while maintaining audio quality. It is also explained how the additional complexity associated with ADPCM coding, as opposed to APCM coding, can be mitigated by careful application of adaptive prediction techniques.

Additional methods for reducing the computational complexity are also described, highlighting their advantages and disadvantages. These methods include utilizing stereo intensity coding to discard perceptually unimportant side channel information, using large frame lengths to improve processing efficiency and introducing favourable conditions for entropy coding in order to achieve significant bit-rate and computational savings.

SUB-BANDING FILTER BANK ANALYSIS

The sub-band filter bank is a major component of an APCM/ADPCM codec in terms of computational complexity and audio quality. The primary purpose of the sub-band filter bank is to translate the input PCM samples into a number of sub-bands that can be independently coded. This enables the frequency-dependent psychoacoustic properties of the human hearing system to be exploited.

A high quality and low complexity sub-band filter bank design is obtained by balancing these two competing requirements. The full spectrum of target hardware must be considered (RISC microprocessor, DSP, FPGA, ASIC) to account for the wide variety of applications and scenarios in which such a codec could be deployed. This requires some general measures of complexity to be used for comparative purposes. Perhaps the most important measure in terms of power dissipation is the execution time of the algorithm. The faster the filter bank can complete its task, the greater the possibility of the hardware platform entering a low power state. If the filter bank architecture and hardware platform allow parallel processing to occur, it is desirable for the implementation to employ these features. For software implementations, however, it is known that lowering the execution time is preferable to incurring the increased power dissipation associated with parallelism [3, 4, 5].

A filter bank that requires a relatively low number of memory accesses has two possible advantages: (a) lower power dissipation attributed to less switching of the data and address buses, and (b) faster execution time attributed to fewer memory access delays.

The following analysis considers four different filter bank architectures and discusses their relative advantages and disadvantages.

QUASI-LINEAR PHASE IIR

An IIR filter can be used to create a quasi-linear phase halfband filter that maintains approximately linear phase within the passband. Quasi-linear phase IIR (QLPIIR) filters are an interesting alternative to the FIR filters used in the QMF due to the reduced number of coefficients required to achieve the same transition roll-off and stopband attenuation characteristics. In this research MATLAB has been used to design a halfband quasi-linear phase IIR filter with a stopband attenuation of 70 dB. The analysis IIR filter prototype is described in Figure 2.

The allpass filters are defined as follows:
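In generic form (a standard structure, shown here as an assumption rather than the specific 70 dB MATLAB design), a quasi-linear phase halfband IIR pairs a pure delay branch with a cascade of allpass sections:

```latex
% Standard two-branch halfband IIR structure (generic form, not the
% paper's specific design): replacing one polyphase allpass branch
% with a pure delay z^{-M} (M odd) is what yields the approximately
% linear passband phase.
H_{\mathrm{LP}}(z) = \tfrac{1}{2}\!\left(z^{-M} + A(z^{2})\right),
\qquad
H_{\mathrm{HP}}(z) = \tfrac{1}{2}\!\left(z^{-M} - A(z^{2})\right),
\qquad
A(z^{2}) = \prod_{k} \frac{c_{k} + z^{-2}}{1 + c_{k} z^{-2}}
```

Because the two branches share the allpass outputs, the low-pass and high-pass halves are obtained with one addition and one subtraction, which is part of the coefficient saving over an FIR QMF.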


Fig. 2: QLP IIR prototype analysis filter

WAVELET PACKET DECOMPOSITION

Similarly to the QMF and quasi-linear phase IIR, the high-pass and low-pass components of the discrete wavelet transform (DWT) can be used in the construction of a multi-level wavelet packet decomposition (WPD) by means of a network of halfband filters.

DAUBECHIES 4 AND 6

The Daubechies 4 and Daubechies 6 DWTs were evaluated. They possess an irregular processing structure with a relatively high number of coefficients compared to CDF 9/7 (see below), but they offer a lower latency.

COHEN-DAUBECHIES-FEAUVEAU (CDF) 9/7

The Cohen-Daubechies-Feauveau 9/7 wavelet belongs to a family of biorthogonal wavelets that offer an invertible and symmetric structure. Lifting decomposition of CDF 9/7 provides the polyphase matrix representation of Equation 4, also described by Figure 3. The inverse CDF 9/7 wavelet transform is implemented by simple inversion of the forward transform. The lifting scheme coefficients of CDF 9/7 are all derived from an irrational number. In a fixed-point implementation the use of an irrational number results in rounding differences and quantization distortion between the forward and inverse wavelet transforms. This distortion can be reduced by modifying the underlying irrational number [6].


Fig. 3: CDF 9/7 discrete wavelet transform using lifting decomposition
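To make the lifting structure concrete, the sketch below implements the forward transform in floating-point C++ using the widely published lifting constants. It is a generic illustration, not the paper's fixed-point Kalimba/ARM implementation, and the boundary handling and final scaling convention are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Minimal floating-point sketch of the forward CDF 9/7 transform via
// lifting, using the widely published Daubechies-Sweldens lifting
// coefficients. The paper's fixed-point variant and the rounding
// refinement of [6] are not reproduced; boundary handling is
// simplified to edge clamping rather than full symmetric extension.
void cdf97_forward(std::vector<double>& x)
{
    const double alpha = -1.586134342;   // first predict step
    const double beta  = -0.052980119;   // first update step
    const double gamma =  0.882911076;   // second predict step
    const double delta =  0.443506852;   // second update step
    const double zeta  =  1.149604398;   // normalization

    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());

    // Clamped access stands in for symmetric boundary extension.
    auto at = [&](std::ptrdiff_t i) {
        if (i < 0) i = 0;
        if (i >= n) i = n - 1;
        return x[static_cast<std::size_t>(i)];
    };

    for (std::ptrdiff_t i = 1; i < n; i += 2)   // predict 1 (odd samples)
        x[i] += alpha * (at(i - 1) + at(i + 1));
    for (std::ptrdiff_t i = 0; i < n; i += 2)   // update 1 (even samples)
        x[i] += beta * (at(i - 1) + at(i + 1));
    for (std::ptrdiff_t i = 1; i < n; i += 2)   // predict 2
        x[i] += gamma * (at(i - 1) + at(i + 1));
    for (std::ptrdiff_t i = 0; i < n; i += 2)   // update 2
        x[i] += delta * (at(i - 1) + at(i + 1));

    // Scaling conventions differ between references; one common choice:
    for (std::ptrdiff_t i = 0; i < n; ++i)
        x[i] = (i % 2 == 0) ? x[i] * zeta : x[i] / zeta;
}
```

The inverse transform simply runs the steps in reverse order with negated coefficients, matching the "simple inversion" of the forward transform noted above.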

COSINE MODULATED FILTER BANKS (CMFB)

An N-band cosine modulated filter bank is constructed from a prototype low-pass filter that possesses a cut-off frequency of Fs/(4N). Cosine functions are then used to modulate the low-pass filter and form N band-pass filters, each with a bandwidth of Fs/(2N). Various methods can be used to efficiently construct this filter bank. For the purposes of this research the 4 and 8 sub-band variants of the Bluetooth SBC filter bank were used.
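For context, the generic pseudo-QMF modulation takes the form below; the exact phase convention used by the SBC filter bank differs in detail, so this should be read as the textbook form rather than the SBC specification:

```latex
% Generic cosine modulation of a length-L prototype p[n] into N
% band-pass analysis filters (textbook pseudo-QMF form; the SBC
% standard uses its own phase convention).
h_{k}[n] = p[n]\,
\cos\!\left[\frac{\pi}{N}\left(k + \tfrac{1}{2}\right)
\left(n - \frac{L-1}{2}\right) + (-1)^{k}\frac{\pi}{4}\right],
\quad k = 0,\ldots,N-1
```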

QUADRATURE MIRROR FILTER (QMF)

For the purposes of this research the 4 sub-band QMF of the Enhanced apt-X algorithm is analysed. A linear phase QMF filter is used to produce a symmetric high-pass and low-pass filter with near-perfect reconstruction. This halfband filter can be arranged into a tree structure in which each branch downsamples and splits the signal into two sub-bands.
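The defining QMF relation (a standard result, stated here for context rather than taken from the paper) constrains the high-pass filter to be the mirror of the low-pass prototype, which cancels aliasing between the two branches:

```latex
% Standard two-channel QMF relations (Esteban-Galand): the high-pass
% analysis filter mirrors the low-pass prototype, and the synthesis
% choices below cancel the aliasing term exactly, leaving only the
% distortion transfer function T(z).
H_{1}(z) = H_{0}(-z) \;\Longleftrightarrow\; h_{1}[n] = (-1)^{n} h_{0}[n],
\qquad
F_{0}(z) = H_{0}(z),\; F_{1}(z) = -H_{1}(z)
\;\Rightarrow\;
T(z) = \tfrac{1}{2}\left(H_{0}^{2}(z) - H_{0}^{2}(-z)\right)
```

With a linear phase FIR prototype (apart from trivial two-tap cases), T(z) cannot be a pure delay, hence the near-perfect rather than perfect reconstruction noted above.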

EVALUATION OF SUB-BAND FILTER BANKS

The filter bank architectures were evaluated in two stages. In the first stage, modelling of the algorithms was performed using MathWorks MATLAB and floating-point arithmetic. The Enhanced apt-X ADPCM algorithm was used as the basis for this model, with modifications to the analysis-synthesis filter bank. This provided a practical evaluation of the filter banks where the signal is subjected to quantization noise, albeit using floating-point arithmetic.

The model provided measurements of latency and THD+N, as shown in Table 1 and Figure 4 respectively. The quasi-linear phase IIR closely matches the THD+N achieved using the QMF filter while offering lower latency. Floating-point implementations of the wavelet filters are shown to offer the best quality while also providing the lowest latency.


Table 1: Filter bank latency comparison for 4 bands

The second stage of evaluation involved transferring the most promising filter bank structures from MATLAB to fixed-point realizations. For this purpose both assembly implementations on the Cambridge Silicon Radio (CSR) Kalimba DSP and C++ implementations on the ARM9E platform were created.

Fig. 4: THD+N comparison of various filter banks when used in Enhanced apt-X



Table 2 describes the execution time of the different filter bank structures using the CSR Bluetunes 2 development boards [7]. For comparative purposes the number of MAC instructions per sample is also shown. Once DSP evaluation was concluded the designs were migrated to the ARM9E RISC microprocessor using fixed-point C++. Table 3 indicates the execution time achieved using the RealView tools with an ADPCM and APCM codec.

Table 2: Sub-band filter bank execution time comparison using a modified 4 sub-band Enhanced apt-X encoder and the CSR Kalimba DSP

Table 3: Encoder ARM9E execution time comparison using RealView real-time system model

Notable results here indicate that while the QMF utilizes a large number of MAC operations, the concurrent load/store capabilities of the DSP allow for relatively fast operation in comparison to the general purpose RISC. It is interesting to note that while the arithmetic complexity of the quasi-linear phase IIR is low, matching that of SBC, the irregular structure and numerous delay lines in this architecture incur significant performance penalties on both software platforms. The markedly lower performance of the frame-based ADPCM codec can be attributed to code scheduling techniques being particularly suitable for the APCM codec and the other filter bank types, but not the QLPIIR filter bank. The superior design on the DSP platform is the CDF 9/7 wavelet filter bank, its efficiency leading us to discard the D4 and D6 DWTs. The SBC CMFB is shown to be an efficient solution, but the CDF 9/7 filter is shown to offer the fastest execution time on both software platforms.

ADAPTIVE PREDICTION

A prediction filter in an ADPCM codec is used to generate predicted signal values from which differential signals are obtained (see Figure 1). It is these difference signals that are coded and transmitted in the compressed bitstream. If the residual tends to zero the quantization noise will be reduced, as the adaptive quantization step size in an ADPCM codec will also tend to zero. To achieve optimally low levels of quantization noise it is important to use adaptive prediction with a convergence rate that tracks the audio signal as accurately as possible.
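As a sketch in conventional notation (the generic pole-zero ADPCM predictor form, consistent with the two-pole, variable-zero configuration described below but not reproduced from the paper), the difference signal and prediction can be written as:

```latex
% Conventional pole-zero ADPCM predictor (generic form): d[n] is the
% difference signal that is quantized and transmitted; r[n] is the
% reconstructed sample available to both encoder and decoder.
d[n] = x[n] - \hat{x}[n], \qquad
\hat{x}[n] = \sum_{i=1}^{2} a_{i}\, r[n-i]
           + \sum_{j=1}^{Z} b_{j}\, d_{q}[n-j], \qquad
r[n] = \hat{x}[n] + d_{q}[n]
```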

The inverse quantized sample values are typically used to update the filter coefficients as this data is available to both encoder and decoder and ensures that they are synchronized. Adaptive prediction filters operate independently within each sub-band. For the purposes of this research a sign-sign LMS filter was used to provide a basis for the prediction filter. This is a low complexity filter that incrementally updates its coefficients using a sign correlation between the current input signal and previous input signals. This LMS filter is configured to use two pole coefficients per sub-band with a variable number of zero coefficients.
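A minimal C++ sketch of such a predictor follows. The two-pole, variable-zero structure and the use of sign-only increments match the description above; the struct name, step sizes and floating-point arithmetic are illustrative assumptions (the real codecs use fixed-point).

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Sketch of a sign-sign LMS pole-zero predictor for one sub-band,
// with two pole taps and a configurable number of zero taps.
// Step sizes mu_a and mu_b are illustrative, not apt-X values.
struct SubbandPredictor {
    std::array<double, 2> a{};       // pole coefficients
    std::vector<double>   b;         // zero coefficients
    std::array<double, 2> r_hist{};  // past reconstructed samples
    std::vector<double>   d_hist;    // past inverse-quantized differences
    double mu_a = 1.0 / 4096.0;      // illustrative pole step size
    double mu_b = 1.0 / 1024.0;      // illustrative zero step size

    explicit SubbandPredictor(std::size_t zeros)
        : b(zeros, 0.0), d_hist(zeros, 0.0) {}

    static double sgn(double v) { return (v > 0.0) - (v < 0.0); }

    // Prediction from past reconstructed samples and past differences.
    double predict() const {
        double p = a[0] * r_hist[0] + a[1] * r_hist[1];
        for (std::size_t j = 0; j < b.size(); ++j)
            p += b[j] * d_hist[j];
        return p;
    }

    // Called with the inverse-quantized difference dq after predict();
    // dq is available to encoder and decoder alike, which keeps the
    // two predictors synchronized.
    void update(double dq) {
        const double rec = predict() + dq;  // reconstructed sample
        const double s = sgn(dq);
        // Sign-sign increments: only the signs of the error and of the
        // past signals are used; in a fixed-point realization these
        // reduce to conditional adds of the step size.
        for (std::size_t j = 0; j < b.size(); ++j)
            b[j] += mu_b * s * sgn(d_hist[j]);
        a[0] += mu_a * s * sgn(r_hist[0]);
        a[1] += mu_a * s * sgn(r_hist[1]);
        // Advance the delay lines.
        d_hist.insert(d_hist.begin(), dq);
        d_hist.pop_back();
        r_hist[1] = r_hist[0];
        r_hist[0] = rec;
    }
};
```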

NUMBER OF ZERO COEFFICIENTS

Each zero coefficient is used as a filter tap in the prediction convolution and must be updated, leading to increased complexity as the number of zero coefficients increases. Increasing the number of zero coefficients lowers the mean prediction error but raises the computational complexity. Therefore the number of zero coefficients should be chosen based on the perceptual importance of a sub-band and an acceptable computational cost.


Fig. 5: Mean prediction error and relative complexity of the lowest frequency sub-band of an ADPCM encoder for a variable number of zero coefficients

Figure 5 shows that using fewer than six coefficients results in a steep increase in mean prediction error, while using more than six results in rapidly diminishing improvement. The conclusion can therefore be drawn that six coefficients should be used as a minimum, with more coefficients applied, where tolerable, to those sub-bands of perceptually higher importance.

ADAPTIVE UPDATE RATE

While varying the number of zero coefficients provides a proportional change in complexity, the convergence rate and audio quality vary with it. We therefore examined the coefficient update process and determined that it can be applied selectively, dependent on the variance of the audio material. Initial investigations using MATLAB and a variant of the Enhanced apt-X algorithm showed that monitoring the variance of the prediction error provides an accurate means of determining when the prediction update should be performed. The principle behind this concept is that if the prediction error is sufficiently small, the zero coefficients are already adequately converged and need not be updated.

Mathematical modelling of this concept indicated that an average 33% reduction in the number of multiply-accumulate operations associated with zero coefficient updates can be achieved with no significant perceptual loss in audio quality. Refinement of this concept led to an adaptive scheme where the instantaneous prediction error is classified to lie within a finite number of ranges. Within each of these ranges a suitable pre-determined rate for the number of zero and pole coefficient updates is chosen within each sub-band. This rate relates the number of times coefficients are updated to the number of prediction filtering iterations for which they are used, e.g. silence uses a rate of 1.25% while a transient signal uses a rate of 50%. Care is taken with transient signals, such that any significant increase in prediction error causes an immediate coefficient update to occur. This adaptive scheme exploits the redundancies found in invariant signals while maintaining the perceived audio quality.

Fig. 6: Signal variance in a sub-band and its corresponding update rate
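The sketch below illustrates one way such a classifier could be arranged. The smoothing factor, variance boundaries and transient-detection ratio are illustrative assumptions; only the 1.25% and 50% example rates are taken from the text.

```cpp
// Sketch of the prediction-error variance classification described
// above. Constants are illustrative placeholders.
struct UpdateRateController {
    double mean    = 0.0;   // running mean of the prediction error
    double err_var = 0.0;   // running variance estimate
    unsigned phase = 0;     // prediction iteration counter

    // Returns true when the LMS coefficients should be updated on
    // this prediction filtering iteration.
    bool should_update(double pred_error) {
        const double w = 1.0 / 64.0;            // illustrative smoothing
        mean += w * (pred_error - mean);
        const double dev = pred_error - mean;
        err_var += w * (dev * dev - err_var);   // EW variance estimate

        // Any significant jump in error forces an immediate update,
        // protecting transients (and aiding recovery after bit errors).
        if (dev * dev > 16.0 * err_var) return true;

        // Otherwise map the variance range to an update interval,
        // e.g. near-silence updates 1 iteration in 80 (1.25%).
        unsigned interval;
        if      (err_var < 1e-6) interval = 80;  // ~silence: 1.25%
        else if (err_var < 1e-3) interval = 8;   // steady-state material
        else                     interval = 2;   // transient-like: 50%
        return (++phase % interval) == 0;
    }
};
```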

A major concern with such a scheme is the effect of a loss of synchronization between the encoder and decoder coefficient update rates. The devised scheme is backwards adaptive, requiring no side information to convey the coefficient update rate. It is therefore important that the decoder can recover from errors and that the predicted signal values will converge. In such a situation the prediction filter itself provides protection against the packet loss and bit errors which can occur over a wireless transmission medium. Transmission errors will result in an increase in the variance of the prediction error, instigating an increase in the coefficient update rate and subsequently recovery from the error.

STEREO INTENSITY CODING

Stereo intensity coding is a well understood tool [8] that expands upon the premise of joint stereo coding. The joint stereo coding scheme involves the selective translation of the left/right stereo signal into a mid/side (or sum/difference) signal and is best employed when an adaptive bit allocation scheme is used. Optimal use of joint stereo coding is achieved when (a) the left and right channels of a stereo signal are highly correlated, (b) the side signal contains only small differences that can be aggressively compressed without loss in perceptual quality and (c) the mid signal can accurately represent both the left and right channels.


As the equal-loudness curves indicate, the threshold of hearing varies across frequency ranges. While the human ear is relatively insensitive to the loudness of frequencies below 200 Hz, at 3-4 kHz human hearing is considered to be at its most sensitive. The frequency range of 100-1500 Hz is the region in which the human ear is most sensitive to phase differences, i.e. stereo depth is perceptually important. At frequencies above this range the wavelength is comparable to the distance between the ears and the perceived phase becomes ambiguous.

Stereo intensity coding expands upon the concept of joint stereo coding by analyzing the side channel in terms of the level and frequency range associated with each sub-band. The psychoacoustic test performed to determine whether stereo depth is perceptually important involves determining if the mean side channel level is greater than a predefined threshold for each sub-band. This results in highly correlated stereo signals being coded as a mono signal. The level threshold of each sub-band is set such that the probability of reducing the stereo depth increases exponentially with frequency. This psychoacoustic test is deliberately simple so as to reduce complexity. If low complexity were not a design requirement, more elaborate psychoacoustic analysis could be used. As an example, the masking relationship between neighbouring sub-band frequencies and their levels could also be exploited.
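A sketch of this per-band test is given below; the function name and threshold handling are illustrative, with only the mean-side-level comparison taken from the description above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of the per-sub-band psychoacoustic test: the left/right pair
// is translated to mid/side, and the side signal is kept only if its
// mean level exceeds the band's threshold. The thresholds themselves
// (rising with frequency) are assumptions, not the tuned figures from
// this research. Assumes left and right are the same, non-zero length.
bool keep_side_signal(const std::vector<double>& left,
                      const std::vector<double>& right,
                      double band_threshold)
{
    double side_level = 0.0;
    for (std::size_t i = 0; i < left.size(); ++i) {
        const double side = 0.5 * (left[i] - right[i]); // side (difference)
        side_level += std::fabs(side);
    }
    side_level /= static_cast<double>(left.size());     // mean side level

    // Below the threshold, stereo depth is judged perceptually
    // unimportant and the band collapses to the mid (sum) signal only.
    return side_level > band_threshold;
}
```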

Fig. 7: Performance benefits of intensity coding

Figure 7 describes the reduction in execution times obtained using a modified version of SBC and intensity coding. As the number of sub-bands increases there is a proportional reduction in the processing requirements and bit rate. While there is a measurable reduction in SNR as the number of sub-bands increases, this distortion is perceptually indistinguishable from the original. When a side sub-band signal is intensity coded it is omitted from the coded bitstream. The bits saved can be re-allocated to other sub-bands to improve their quantization noise. Alternatively the bit rate reduction can be used by a Variable Bit Rate (VBR) scheme to improve audio quality. VBR rate control will reduce the bit rate of undemanding sections of audio content while maintaining a constant audio quality; the resulting lower bit rates and processing complexity will in turn reduce power dissipation.

THROUGHPUT AND COMPLEXITY REDUCTION

ADAPTIVE FRAME LENGTH

A modified version of Bluetooth SBC has been created which utilizes non-standard block sizes of 2^N samples, where N = 2 to 9. In addition to significant bit rate reductions, larger frame lengths enable greater leverage of code scheduling and vectorization techniques. For example, critical code sections such as sub-band filtering loops or quantization will consume less processing overhead as they will be called on fewer occasions and will process data more efficiently when those data sets are larger. Figure 8 describes the significant complexity reductions that were achieved with this SBC variant and the ARM9E platform.

Fig. 8: Bit rate and execution time improvements for 4-band joint stereo SBC with a bitpool of 29 using large non-standard block sizes

Unfortunately this efficiency improvement introduces two significant issues: distortion and increased latency. Latency increases proportionately with frame length. While this is irrelevant in wireless applications involving only audio sources, with a video source the lip-synch misalignment will become apparent. Therefore, increased frame lengths and lossless coding should not be used in joint audio/video applications.


Distortion artifacts are introduced if the quantization is not appropriate over the length of the frame. Gibbs phenomenon [9] can be observed when a large attack signal is preceded by a stationary region, as the quantization noise is smeared forwards and backwards in time across the transient region. The human ear is more sensitive to backwards smearing as there is a smaller period of time before a transient signal in which noise will not be perceived. This is referred to as pre-echo and it leads to increased quantization noise in the stationary regions preceding the attack signal, which will not be masked by the human hearing system to the same extent as any noise which follows.

In order to achieve complexity reductions while ensuring low distortion levels a compromise solution has been developed. This solution adaptively changes the frame length to achieve the optimal temporal and spectral resolution that minimizes the quantization noise. The scheme employed in this research monitors the variance of the wideband signal energy rather than that of each individual sub-band in an effort to reduce computation. When the energy variance begins to increase significantly the encoder will reduce the frame length in an attempt to maintain a low signal variance within each frame and therefore reduce quantization noise. If the signal energy decreases significantly the frame length is not reduced, as the human ear will mask the resulting quantization noise.
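A sketch of this heuristic follows. The decision thresholds, the power-of-two range (taken from the 2^N, N = 2 to 9 block sizes above) and the use of a simple frame-energy ratio in place of a full variance measure are all illustrative assumptions.

```cpp
// Sketch of the wideband-energy frame length heuristic described
// above: the frame length is halved when energy rises sharply and
// allowed to grow back otherwise. Constants are illustrative.
struct FrameLengthController {
    unsigned frame_len = 512;             // current frame length (2^N)
    static constexpr unsigned kMin = 4;   // 2^2 samples
    static constexpr unsigned kMax = 512; // 2^9 samples
    double prev_energy = 0.0;

    unsigned next_frame_length(double frame_energy) {
        const double ratio = (prev_energy > 0.0)
                           ? frame_energy / prev_energy : 1.0;
        prev_energy = frame_energy;

        if (ratio > 4.0 && frame_len > kMin) {
            frame_len /= 2;   // attack: improve temporal resolution
        } else if (ratio >= 0.25 && frame_len < kMax) {
            frame_len *= 2;   // steady signal: favour coding efficiency
        }
        // A large energy *drop* (ratio < 0.25) leaves the frame length
        // unchanged, since post-masking hides the quantization noise.
        return frame_len;
    }
};
```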

ENTROPY CODING

The probability density function (PDF) of the samples within each sub-band approximates a Gaussian distribution. When the sub-band samples are distributed in this manner they become highly suitable for lossless coding. Higher degrees of redundancy are available to lossless coding schemes when the same quantization level is applied to larger sets of data. Increased processing requirements are associated with the addition of lossless coding. In order to achieve the goal of reduced complexity, a number of entropy coding schemes with well documented low complexity characteristics were investigated. These schemes included Exponential-Golomb, Golomb-Rice, Huffman and arithmetic coders. While the addition of any lossless coding will introduce additional processing requirements, as shown in Figure 8 the efficiency gains introduced by larger frame lengths are significant. Table 4 describes the coding efficiencies introduced to Bluetooth SBC when an adaptive frame length scheme and Golomb-Rice coding of the quantized samples are introduced.
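Since Golomb-Rice coding was ultimately paired with the adaptive frame length scheme, a minimal sketch of the Rice variant (power-of-two divisor) is shown below. The bit container and the zig-zag fold for signed quantizer indices are illustrative choices, not details from this research.

```cpp
#include <cstdint>
#include <vector>

// Sketch of Golomb-Rice coding of a non-negative quantizer index with
// divisor 2^k: the quotient is sent in unary, the remainder in k raw
// bits. Bit-packing is simplified to a vector<bool>.
void golomb_rice_encode(std::uint32_t value, unsigned k,
                        std::vector<bool>& bits)
{
    const std::uint32_t q = value >> k;            // quotient -> unary
    for (std::uint32_t i = 0; i < q; ++i) bits.push_back(true);
    bits.push_back(false);                         // unary terminator
    for (unsigned b = k; b-- > 0; )                // remainder -> k bits
        bits.push_back(((value >> b) & 1u) != 0);
}

// Signed quantizer indices can first be folded to non-negative values
// (zig-zag mapping): 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
std::uint32_t zigzag(std::int32_t v)
{
    return (v >= 0) ? 2u * static_cast<std::uint32_t>(v)
                    : 2u * static_cast<std::uint32_t>(-(v + 1)) + 1u;
}
```

Because the divisor is a power of two, encoding reduces to shifts and masks, which is what gives the scheme its well documented low complexity relative to Huffman or arithmetic coding.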


Table 4: Computational efficiencies achieved with SBC using 8 sub-bands, SNR allocation method, joint stereo channel mode and a bitpool of 58

CONCLUSION

This paper discusses some of the research undertaken in developing a high quality audio codec for wireless audio streaming applications. The selection of a sub-banding filter and its impact on the computational requirements of the codec are discussed, with a CDF 9/7 based wavelet packet decomposition filter selected as the optimal choice in terms of quality and complexity. We also describe the use of conditional coefficient update techniques to reduce the average computational requirements of an LMS prediction algorithm as used in an ADPCM codec. We demonstrate how this adaptive prediction technique can be used to reduce complexity without adversely affecting quality.

The complementary use of adaptive frame lengths and lossless coding of quantized samples is discussed. While such a scheme is unsuitable for low latency applications, it is shown to achieve significant reductions in both bit rate and computation. The overall goal of this research is to combine the developed coding tools to create a high quality audio compression algorithm designed specifically for wireless streaming purposes. To meet this objective, further work will expand upon and explore the results obtained thus far. The important area of robustness over a wireless network will also be investigated.

REFERENCES

1) Audio Video WG, Bluetooth Special Interest Group (SIG), Inc., "Advanced Audio Distribution Profile Specification", April 2007.
2) APTX, http://www.aptx.com/, Accessed July 2009.
3) H. Tomyiama, H. T. Ishihara, A. Inoue, and H. Yasuura, "Instruction scheduling for power reduction in processor-based system design," Design, Automation, Test Eur., pp. 23-26, 1998.
4) M. Lee, V. Tiwari, S. Malik and M. Fujita, "Power analysis and minimization techniques for embedded DSP software," Fujitsu Scientific and Technical Journal, 31(2), pp. 215-229, 1995.
5) X. Zhuang and S. Pande, "Parallelizing load/stores on dual-bank memory embedded processors," ACM Transactions on Embedded Computing Systems (TECS), Vol. 5, Issue 3, pp. 613-657, August 2006.


6) Z. Guangjun, C. Lizhi and C. Huowang, "A simple 9/7-tap wavelet filter based on lifting scheme," Proceedings of the 2001 International Conference on Image Processing, Volume 2, pp. 249-252, October 2001.
7) CSR BlueCore, http://www.csr.com, Accessed July 2009.
8) J. Herre, K. Brandenburg and D. Lederer, "Intensity Stereo Coding," Proc. 96th Audio Eng. Soc. (AES) Convention, AES, 1994, preprint 3799.
9) J. W. Gibbs, "Fourier Series," Nature 59, 200 and 606, 1899.

MORE INFORMATION

For more information contact us: [email protected]
www.aptx.com

APTX – the apt-X® licensing company
APT Licensing Limited
Whiterock Business Park
729 Springfield Road
Belfast, BT12 7FP
Northern Ireland, UK
