The Perceptual Impact of Different Quantization Schemes in G.719
Bote Liu
Master’s Degree Project
Stockholm, Sweden, May 2013
XR-EE-SIP 2013:001

Abstract

In this thesis, three quantization schemes, Fast Lattice Vector Quantization (FLVQ), Pyramidal Vector Quantization (PVQ) and Scalar Quantization (SQ), are studied in the framework of the audio codec G.719. FLVQ is composed of an RE8-based low-rate lattice vector quantizer and a D8-based high-rate lattice vector quantizer. PVQ uses pyramidal points in multi-dimensional space and is well suited to the compression of Laplacian-like sources produced by a transform. The SQ scheme combines uniform scalar quantization with entropy coding. Subjective tests of the three codec versions show that the FLVQ and PVQ versions both perform better than the SQ version on music signals, while the SQ version performs well on speech signals, especially for male speakers.

Acknowledgements

I would like to express my sincere gratitude to Ericsson Research for providing me with this thesis project. I am indebted to my supervisor, Sebastian Näslund, for taking the time to discuss my work with me every week and for giving me many valuable suggestions. I am also very grateful to Volodya Grancharov and Eric Norvell for their advice, patience and consistent encouragement throughout the thesis. My thanks extend to the other Ericsson researchers who took part in the subjective listening evaluation. Finally, I want to thank my examiner, Professor Arne Leijon of the Royal Institute of Technology (KTH), for reviewing my report very carefully and for supporting my work.

Contents

1 Introduction
  1.1 A review of audio and speech coding
  1.2 Objective
  1.3 Thesis outline
2 Background
  2.1 Audio and speech coding
    2.1.1 Waveform coding
    2.1.2 Parametric coding
    2.1.3 Perceptual model
  2.2 Quantization
    2.2.1 Scalar quantization
    2.2.2 Vector quantization
  2.3 Entropy coding
    2.3.1 Huffman coding
    2.3.2 Range coding
  2.4 Performance evaluation
3 G.719
  3.1 System description
    3.1.1 Encoder overview
    3.1.2 Decoder overview
  3.2 Encoder
    3.2.1 Transient detection
    3.2.2 Adaptive time-frequency transform
    3.2.3 Grouping of spectral coefficients
    3.2.4 Norm quantization
    3.2.5 Bit-allocation
    3.2.6 Shape quantization
    3.2.7 Noise level adjustment
  3.3 Decoder
    3.3.1 Norm decoding
    3.3.2 Spectral coefficient decoding
    3.3.3 Spectrum filling
    3.3.4 Noise level adjustment
    3.3.5 De-normalization
    3.3.6 Inverse transform
4 Quantization theory
  4.1 Fast lattice vector quantization
    4.1.1 Lattice
    4.1.2 FLVQ encoder
    4.1.3 FLVQ decoder
  4.2 Pyramidal vector quantization
    4.2.1 Mapping between bits and pulses
    4.2.2 Calculation of the size of the codebook
    4.2.3 PVQ Encoder
    4.2.4 PVQ Decoder
    4.2.5 Split
5 Implementation
  5.1 Bit-stream format
  5.2 G.719 with scalar quantization
    5.2.1 Creation of a codebook
    5.2.2 Entropy coding implementation
  5.3 G.719 with PVQ
    5.3.1 Quantization of norms
    5.3.2 Bit-pulse conversion and pre-calculated table
    5.3.3 Range coding
    5.3.4 Bit saved by range coding
  5.4 Bit-allocation schemes
6 Results
  6.1 Objective evaluations
    6.1.1 Fixed bit-allocation coding
    6.1.2 Flexible bit-allocation coding