Voice and Audio Compression for Wireless Communications
Total Page:16
File Type:pdf, Size:1020Kb
VOICE-BOOK-2E-SAMPLE-CHAPS 2007/8/20 page 1 Voice and Audio Compression for Wireless Communications by c L. Hanzo, F.C.A. Somerville, J.P. Woodard, H-T. How School of Electronics and Computer Science, University of Southampton, UK VOICE-BOOK-2E-SAMPLE-CHAPS 2007/8/20 page i Contents Preface and Motivation 1 Acknowledgements 11 I Speech Signals and Waveform Coding 13 1 Speech Signals and Coding 15 1.1 Motivation of Speech Compression . 15 1.2 Basic Characterisation of Speech Signals . ........ 16 1.3 Classification of Speech Codecs . 20 1.3.1 WaveformCoding ........................... 20 1.3.1.1 Time-domain Waveform Coding . 21 1.3.1.2 Frequency-domain Waveform Coding . 21 1.3.2 Vocoders ................................ 22 1.3.3 HybridCoding ............................. 23 1.4 WaveformCoding................................ 23 1.4.1 DigitisationofSpeech . 23 1.4.2 QuantisationCharacteristics . 25 1.4.3 Quantisation Noise and Rate-Distortion Theory . ....... 25 1.4.4 Non-uniform Quantisation for a Known PDF: Companding ...... 28 1.4.5 PDF-independent Quantisation using Logarithmic Compression . 31 1.4.5.1 The µ-Law Compander . 32 1.4.5.2 The A-law Compander . 33 1.4.6 Optimum Non-uniform Quantisation . 35 1.5 ChapterSummary................................ 39 2 Predictive Coding 41 2.1 ForwardPredictiveCoding . 41 2.2 DPCMCodecSchematic ............................ 42 2.3 PredictorDesign ................................ 43 i VOICE-BOOK-2E-SAMPLE-CHAPS 2007/8/20 page ii ii CONTENTS 2.3.1 ProblemFormulation. 43 2.3.2 Covariance Coefficient Computation . 45 2.3.3 Predictor Coefficient Computation . 46 2.4 Adaptive One-word-memory Quantization . ...... 50 2.5 DPCMPerformance............................... 53 2.6 Backward-Adaptive Prediction . 55 2.6.1 Background............................... 55 2.6.2 StochasticModelProcesses . 57 2.7 The 32 kbps G.721 ADPCM Codec . 60 2.7.1 Functional Description of the G.721 Codec . 60 2.7.2 AdaptiveQuantiser. 62 2.7.3 G.721 Quantiser Scale Factor Adaptation . 62 2.7.4 G.721 Adaptation Speed Control . 63 2.7.5 G.721 Adaptive Prediction and Signal Reconstruction ........ 64 2.8 SpeechQualityEvaluation . 66 2.9 G.726 and G.727 ADPCM Coding . 68 2.9.1 Motivation ............................... 68 2.9.2 Embedded G.727 ADPCM coding . 68 2.9.3 Performance of the Embedded G.727 ADPCM Codec . 70 2.10 Rate-Distortion in Predictive Coding . ........ 74 2.11 ChapterSummary................................ 80 II Analysis by Synthesis Coding 83 3 Analysis-by-synthesis Principles 85 3.1 Motivation.................................... 85 3.2 Analysis-by-synthesis Codec Structure . ........ 86 3.3 TheShort-termSynthesisFilter. ..... 87 3.4 Long-TermPrediction. 90 3.4.1 Open-loop Optimisation of LTP parameters . 90 3.4.2 Closed-loop Optimisation of LTP parameters . ..... 96 3.5 ExcitationModels................................ 100 3.6 AdaptivePost-filtering . 102 3.7 Lattice-based Linear Prediction . 105 3.8 ChapterSummary................................111 4 Speech Spectral Quantization 113 4.1 Log-areaRatios.................................113 4.2 LineSpectralFrequencies. 117 4.2.1 Derivation of the Line Spectral Frequencies . 117 4.2.2 Computation of the Line Spectral Frequencies . 121 4.2.3 Chebyshev-description of Line Spectral Frequencies .........123 4.3 Spectral Vector Quantization . 125 4.3.1 Background...............................125 4.3.2 Speaker-adaptive Vector Quantisation of LSFs . 129 VOICE-BOOK-2E-SAMPLE-CHAPS 2007/8/20 page iii CONTENTS iii 4.3.3 Stochastic VQ of LPC Parameters . 130 4.3.3.1 Background .........................131 4.3.3.2 The Stochastic VQ Algorithm . 132 4.3.4 Robust Vector Quantisation Schemes for LSFs . 134 4.3.5 LSF Vector-quantisers in Standard Codecs . 136 4.4 Spectral Quantizers for Wideband Speech Coding . ........137 4.4.1 Introduction to Wideband Spectral Quantisation . ........137 4.4.1.1 Statistical Properties of Wideband LSFs . 139 4.4.1.2 Speech Codec Specifications . 139 4.4.2 Wideband LSF Vector Quantizers . 142 4.4.2.1 Memoryless Vector Quantization . 142 4.4.2.2 Predictive Vector Quantization . 145 4.4.2.3 Multimode Vector Quantization . 149 4.4.3 Simulation Results and Subjective Evaluations . .......152 4.4.4 Conclusions on Wideband Spectral Quantisation . 153 4.5 ChapterSummary................................154 5 RPE Coding 155 5.1 TheoreticalBackground. 155 5.2 The 13 kbps RPE-LTP GSM Speech encoder . 162 5.2.1 Pre-processing .............................162 5.2.2 STPanalysisfiltering. 164 5.2.3 LTPanalysisfiltering. 165 5.2.4 Regular Excitation Pulse Computation . 165 5.3 The 13 kbps RPE-LTP GSM Speech Decoder . 166 5.4 Bit-sensitivityoftheGSMCodec. 170 5.5 A ’Tool-box’ Based Speech Transceiver . 171 5.6 ChapterSummary................................172 6 Forward-Adaptive CELP Coding 175 6.1 Background...................................175 6.2 TheOriginalCELPApproach . 176 6.3 FixedCodebookSearch. 179 6.4 CELPExcitationModels . 181 6.4.1 BinaryPulseExcitation . 181 6.4.2 Transformed Binary Pulse Excitation . 182 6.4.2.1 Excitation Generation . 182 6.4.2.2 TBPEBitSensitivity . 184 6.4.3 Dual-rate Algebraic CELP Coding . 187 6.4.3.1 ACELP Codebook Structure . 187 6.4.3.2 Dual-rate ACELP Bitallocation . 189 6.4.3.3 Dual-rate ACELP Codec Performance . 190 6.5 CELPOptimization...............................191 6.5.1 Introduction...............................191 6.5.2 Calculation of the Excitation Parameters . 192 6.5.2.1 Full Codebook Search Theory . 192 VOICE-BOOK-2E-SAMPLE-CHAPS 2007/8/20 page iv iv CONTENTS 6.5.2.2 Sequential Search Procedure . 194 6.5.2.3 Full Search Procedure . 195 6.5.2.4 Sub-Optimal Search Procedures . 197 6.5.2.5 Quantization of the Codebook Gains . 198 6.5.3 Calculation of the Synthesis Filter Parameters . ........200 6.5.3.1 Bandwidth Expansion . 201 6.5.3.2 Least Squares Techniques . 201 6.5.3.3 Optimization via Powell’s Method . 204 6.5.3.4 Simulated Annealing and the Effects of Quantization . 205 6.6 CELPError-sensitivity . 209 6.6.1 Introduction...............................209 6.6.2 Improving the Spectral Information Error Sensitivity .........209 6.6.2.1 LSFOrderingPolicies. 209 6.6.2.2 The Effect of FEC on the Spectral Parameters . 211 6.6.2.3 The Effect of Interpolation . 212 6.6.3 Improving the Error Sensitivity of the Excitation Parameters . 213 6.6.3.1 The Fixed Codebook Index . 214 6.6.3.2 The Fixed Codebook Gain . 214 6.6.3.3 Adaptive Codebook Delay . 215 6.6.3.4 Adaptive Codebook Gain . 215 6.6.4 Matching Channel Codecs to the Speech Codec . 216 6.6.5 ErrorResilienceConclusions. 220 6.7 Dual-mode Speech Transceiver . 221 6.7.1 The Transceiver Scheme . 221 6.7.2 Re-configurable Modulation . 222 6.7.3 Source-matched Error Protection . 224 6.7.3.1 Low-quality 3.1 kBd Mode . 224 6.7.3.2 High-quality 3.1 kBd Mode . 228 6.7.4 Packet Reservation Multiple Access . 229 6.7.5 3.1kBdSystemPerformance. 231 6.7.6 3.1kBdSystemSummary . 234 6.8 Multi-slotPRMATransceiver . 235 6.8.1 Background and Motivation . 235 6.8.2 PRMA-assisted Multi-slot Adaptive Modulation . 235 6.8.3 Adaptive GSM-like Schemes . 237 6.8.4 Adaptive DECT-like Schemes . 238 6.8.5 Summary of Adaptive Multi-slot PRMA . 239 6.9 ChapterSummary................................240 7 Standard Speech Codecs 241 7.1 Background...................................241 7.2 The US DoD FS-1016 4.8 kbits/s CELP Codec . 241 7.2.1 Introduction...............................241 7.2.2 LPC Analysis and Quantization . 243 7.2.3 The Adaptive Codebook . 244 7.2.4 The Fixed Codebook . 245 VOICE-BOOK-2E-SAMPLE-CHAPS 2007/8/20 page v CONTENTS v 7.2.5 Error Concealment Techniques . 246 7.2.6 DecoderPost-Filtering . 247 7.2.7 Conclusion ...............................247 7.3 The IS-54 DAMPS speech codec . 247 7.4 TheJDCspeechcodec .............................251 7.5 The Qualcomm Variable Rate CELP Codec . 253 7.5.1 Introduction...............................253 7.5.2 Codec Schematic and Bit Allocation . 254 7.5.3 CodecRateSelection. 255 7.5.4 LPC Analysis and Quantization . 256 7.5.5 ThePitchFilter.............................257 7.5.6 The Fixed Codebook . 258 7.5.7 Rate1/8FilterExcitation. 259 7.5.8 DecoderPost-Filtering . 260 7.5.9 Error Protection and Concealment Techniques . 260 7.5.10 Conclusion ...............................261 7.6 Japanese Half-Rate Speech Codec . 261 7.6.1 Introduction...............................261 7.6.2 Codec Schematic and Bit Allocation . 262 7.6.3 EncoderPre-Processing . 264 7.6.4 LPC Analysis and Quantization . 264 7.6.5 TheWeightingFilter . 265 7.6.6 ExcitationVector1 . 265 7.6.7 ExcitationVector2 . 266 7.6.8 ChannelCoding ............................266 7.6.9 DecoderPostProcessing . 268 7.7 Thehalf-rateGSMcodec . 269 7.7.1 Half-rateGSMcodecoutline. 269 7.7.2 Half-rate GSM Codec’s Spectral Quantisation . 271 7.7.3 Errorprotection.............................272 7.8 The8kbits/sG.729Codec . 273 7.8.1 Introduction...............................273 7.8.2 Codec Schematic and Bit Allocation . 274 7.8.3 EncoderPre-Processing . ..