CITY UNIVERSITY OF HONG KONG 香港城市大學

Lossless Audio Coding Using Code-Excited Linear Prediction with Embedded Entropy Coding 碼激勵線性預測與嵌入式熵編碼結合的 無損壓縮音頻編碼

Submitted to Department of Electronic Engineering 電子工程學系 in Partial Fulfillment of the Requirements for the Degree of Master of Philosophy 哲學碩士學位

by

Li Yongmin 李永敏

May 2011 二零一一年五月


Abstract

Audio coding is used to compress and decompress digital audio data.

Generally, there are two classes of audio coding techniques: lossy coding and lossless coding. Lossy coders, such as MP3, can achieve a high compression ratio; however, the reconstructed signal is not exactly the same as the original signal. Lossless coders can reconstruct a signal identical to the original, but their compression performance is not as good as that of lossy coders.

With the increasing demand for high-fidelity audio, most efforts in audio coding research are directed to lossless compression of high-quality audio. Recent research trends in lossless audio compression focus on a decorrelation-plus-entropy coding approach.

The main disadvantage of this approach is that it is not scalable and so cannot cater for channels with different bandwidths. In addition, the order of the decorrelation filter is generally very high, and computing the filter coefficients on the fly in real time is a demanding task for both the encoder and the decoder.

In this thesis, a scalable coding algorithm for lossless compression of audio signals is presented. The proposed algorithm consists of a lossy coding part and a lossless coding part. The lossy coding part is based on a code excitation approach similar to the Code-Excited Linear Prediction (CELP) coding used for speech signals.

However, the proposed approach differs fundamentally from CELP coding in many aspects. In particular, the excitation gain and short-term prediction coefficients are adapted in a sample-by-sample fashion to cope with the rapidly time-varying nature of audio signals. In addition, a block adaptive linear predictive filter is added to the proposed coder to further improve short-term prediction.
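As a rough illustration of what sample-by-sample adaptation of a short-term predictor means, the sketch below updates an order-2 predictor after every sample. The normalised LMS (NLMS) update rule, the step size `mu`, the function name `sbs_predict`, and the sinusoidal test signal are all assumptions made for this example; the abstract does not state which adaptation rule the proposed coder uses.

```python
import math

def sbs_predict(signal, order=2, mu=0.5, eps=1e-6):
    """Prediction errors of a predictor whose weights are updated after every sample.

    Assumption: an NLMS update is used here purely as a stand-in adaptation rule.
    """
    weights = [0.0] * order
    history = [0.0] * order          # most recent past sample first
    errors = []
    for x in signal:
        prediction = sum(w * h for w, h in zip(weights, history))
        e = x - prediction
        errors.append(e)
        # The update uses only past samples and the current error,
        # so it can be repeated sample by sample.
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        history = [x] + history[:-1]
    return errors

# Hypothetical test signal: a slowly varying sinusoid.
signal = [math.sin(0.05 * n) for n in range(2000)]
errors = sbs_predict(signal)

# Compare energies over the second half, after the predictor has adapted.
energy_in = sum(s * s for s in signal[1000:])
energy_err = sum(e * e for e in errors[1000:])
print(energy_err < energy_in)
```

Because the weights change with every sample rather than once per block, such a predictor can follow a signal whose statistics drift quickly, which is the motivation given above.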

The excitation codebook is a fixed codebook constructed from a collection of stochastic codewords, which are obtained by training on a large collection of real audio signals. A total of eleven stochastic codebooks for various rates were designed. During encoding, the excitation codebook is searched using an M-L tree search strategy with a joint optimization based on minimum error energy and minimum code length after entropy coding. Several codebook selection methods and M-L tree search strategies are discussed and compared in order to determine the best compression performance at various encoding complexities.

In the lossless coding part, the error between the input and the synthetic signal has significantly lower entropy and can then be encoded by an arithmetic coder to achieve lossless compression. If the residual is not sent to the decoder, the decoder can still recover an audio signal of reasonably good quality from the received lossy coding parameters alone.
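The two-stage idea can be sketched in a few lines: a lossy stage produces a synthetic signal, the integer residual between the input and that signal is what the entropy coder would compress, and adding the residual back restores the input exactly. In this sketch the lossy stage is a coarse uniform quantiser standing in for the code-excited coder, and the helper names, the step size of 64, and the sinusoidal test signal are assumptions for illustration only. The empirical entropy is computed in place of a real arithmetic coder.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Empirical zeroth-order entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def lossy_encode(samples, step=64):
    # Stand-in lossy stage: coarse uniform quantisation
    # (an assumption, NOT the code-excited coder of the thesis).
    return [round(s / step) for s in samples]

def lossy_decode(indices, step=64):
    return [i * step for i in indices]

# Hypothetical integer test signal.
x = [int(8000 * math.sin(2 * math.pi * 5 * n / 1000)) for n in range(1000)]

indices = lossy_encode(x)
synthetic = lossy_decode(indices)                  # what a lossy-only decoder recovers
residual = [a - b for a, b in zip(x, synthetic)]   # what the entropy coder would encode
reconstructed = [s + r for s, r in zip(synthetic, residual)]

print(reconstructed == x)
print(entropy_bits(residual) < entropy_bits(x))
```

A decoder that receives only `indices` outputs `synthetic`, an approximation of the input; a decoder that also receives `residual` outputs the input exactly, which is the scalability described above.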

The proposed coding algorithm has very low decoding complexity owing to its simple code excitation structure, and it achieves compression performance comparable to that of other advanced lossless coders for CD-quality audio.

Table of Contents

Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
Glossary of Symbols

Chapter 1 Introduction
1.1 Lossy Audio Coding
1.2 Lossless Audio Coding
1.3 Motivation of This Research
1.4 Thesis Outline

Chapter 2 Lossless Audio Coding Using A Two Stage Joint Lossy/Entropy Coding Approach
2.1 Lossy Coding Part
  2.1.1 Excitation Codebook
  2.1.2 Sample-by-Sample Gain Adaption
  2.1.3 Sample-by-Sample Adaptive Short Term Predictor
  2.1.4 Block Adaptive Linear Predictive Filter
  2.1.5 Codebook Search Algorithm
  2.1.6 Selecting the Order of the Sample-by-Sample Adaptive Predictor
  2.1.7 Improvement of Adding Block Adaptive Filter
  2.1.8 Adding A Long-Term Adaptive Codebook
  2.1.9 Using A Nonlinear Predictor
2.2 Entropy Coding
  2.2.1 Basic
  2.2.2 Adaptive Arithmetic Coding

Chapter 3 Improved Coding Schemes
3.1 Improving Codebook Search Algorithm
  3.1.1 Codeword Search
  3.1.2 Adaptive Codebook Selection
3.2 Encoder for Joint Lossy/Entropy Coding Scheme
3.3 Decoder for Joint Lossy/Entropy Coding Scheme
3.4 Improvement by Nonscalable Lossless Coding Scheme
3.5 Extension to Multi-Channel Coding

Chapter 4 Simulation and Results
4.1 Simulation Performance
4.2 Performance Comparison
  4.2.1 Test of Mono Files
  4.2.2 Test of Stereo Files

Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work

Bibliography


List of Figures

Figure 1.1.1 Classification of lossy audio coding
Figure 1.1.2 Generic structure of coder by exploring
Figure 1.1.3 Block diagram of the subband coding system
Figure 1.2.1 Two approaches to lossless audio coding
Figure 1.3.1 Coder with joint optimization algorithm
Figure 2.1.1 Basic structure of a two-stage lossy-plus-entropy coding approach for lossless audio coding
Figure 2.1.2 Block diagram of ADPCM encoder
Figure 2.1.3 Block diagram of CELP encoder
Figure 2.1.4 Two-stage lossy-plus-entropy coding approach with separated optimization for lossy and entropy coding part
Figure 2.1.5 Basic structure of the lossy coding part
Figure 2.1.6 Proposed structure of the lossy coding part
Figure 2.1.7 Basic structure of the excitation codebook
Figure 2.1.8 M-L tree search strategy for the excitation codebook
Figure 2.1.9 Encoder without block adaptive filter
Figure 2.1.10 Performance comparison between coding schemes with order-2 and order-3 SbS adaptive linear predictor
Figure 2.1.11 Performance comparison between coding schemes with and without block adaptive filter
Figure 2.1.12 Simplified structure with the added adaptive codebook
Figure 2.1.13 Encoder with added adaptive codebook
Figure 2.1.14 Performance comparison between coding schemes with and without adaptive codebook
Figure 2.1.15 Encoder with SbS adaptive nonlinear predictor
Figure 2.1.16 Structure of 1-D second-order Volterra filter
Figure 2.1.17 Encoder with block adaptive nonlinear synthesis filter
Figure 2.2.1 Proposed coder structure with entropy coding part
Figure 2.2.2 An example of arithmetic coding process
Figure 3.1.1 Modified M-L tree search strategy
Figure 3.2.1 Proposed scalable lossless audio encoder
Figure 3.3.1 Proposed scalable lossless audio decoder
Figure 3.4.1 Nonscalable encoder with a clean input of block adaptive analysis filter
Figure 3.4.2 Nonscalable encoder with a clean input of block adaptive predictor
Figure 3.4.3 Nonscalable encoder with a clean input of SbS adaptive predictor
Figure 3.4.4 Compression performance comparison between nonscalable scheme with and without a clean input of SbS adaptive predictor
Figure 3.5.1 Perception of stereo signals
Figure 3.5.2 Coding scheme for removing cross-correlation between two channels
Figure 3.5.3 Decoding scheme for reconstructing original stereo signals
Figure 3.5.4 Coding scheme for multi-channel signals


List of Tables

Table 2.1.1 List of the error entropies and SNRs obtained by the coding scheme with order-2 SbS adaptive predictor shown in Figure 2.1.9
Table 2.1.2 List of the error entropies and SNRs obtained by the coding scheme with order-3 SbS adaptive predictor shown in Figure 2.1.9
Table 2.1.3 List of the error entropies and SNRs obtained by the coding scheme with block adaptive filter shown in Figure 2.1.6
Table 2.1.4 List of the error entropies plus additional and SNRs obtained by the scheme with adaptive codebook shown in Figure 2.1.13
Table 2.1.5 List of error entropies and SNRs obtained by the scheme with order-2 SbS adaptive nonlinear predictor shown in Figure 2.1.15 and with order-2 SbS adaptive linear predictor shown in Figure 2.1.6 by using codebook (6-2)
Table 2.1.6 List of error entropies and SNRs obtained by the scheme with block adaptive nonlinear filter shown in Figure 2.1.17 and with block adaptive linear filter shown in Figure 2.1.6 by using codebook (6-4)
Table 3.1.1 List of error entropies, SNRs and compression ratio obtained by the scheme with and without improved codebook search algorithm
Table 3.4.1 Compression ratios obtained by the scalable coding scheme shown in Figure 3.2.1 and by the nonscalable coding scheme with a clean input of the block adaptive predictor shown in Figure 3.4.2
Table 3.4.2 Compression ratios obtained by the nonscalable coding scheme shown in Figure 3.4.2 without adaptive codebook selection algorithm
Table 3.4.3 Compression ratios obtained by the nonscalable coding scheme with a clean input of the SbS adaptive predictor shown in Figure 3.4.3
Table 3.5.1 List of the original signal entropy, signal entropy and the prediction gain obtained by the coding scheme with inter-channel decorrelation
Table 4.1.1 MPEG items used in the experiment
Table 4.1.2 CD sets used to record test audio files
Table 4.1.3 List of results in entropy and SNR of residual signal obtained after lossy coding of mono audio test sets
Table 4.1.4 Comparison between G.726 and the proposed lossy coding scheme
Table 4.2.1 Compression ratios obtained by FLAC -2, MPEG-4 ALS and our proposed coding scheme for coding mono audio test sets
Table 4.2.2 Decoding time (in seconds) used by MPEG-4 ALS (RLS-LMS) and our proposed scalable coding scheme for mono audio test sets
Table 4.2.3 Compression ratios obtained by FLAC -2, MPEG-4 ALS (RLS-LMS) and our proposed scalable coding scheme for coding stereo audio files
Table 4.2.4 Decoding time (in seconds) used by MPEG-4 ALS (RLS-LMS) and our proposed decoding scheme for stereo audio files