The Design and Implementation of a Speech Codec for Packet Switched Networks Stephen Charles Hall University of Wollongong
University of Wollongong
Research Online
University of Wollongong Thesis Collection
1988

The design and implementation of a speech codec for packet switched networks
Stephen Charles Hall
University of Wollongong

Recommended Citation
Hall, Stephen Charles, The design and implementation of a speech codec for packet switched networks, Doctor of Philosophy thesis, Department of Electrical and Computer Engineering, University of Wollongong, 1988. http://ro.uow.edu.au/theses/1352

Research Online is the open access institutional repository for the University of Wollongong.

THE DESIGN AND IMPLEMENTATION OF A SPEECH CODEC FOR PACKET SWITCHED NETWORKS

A thesis submitted in fulfilment of the requirements for the award of the degree of
DOCTOR OF PHILOSOPHY
from
THE UNIVERSITY OF WOLLONGONG
by
STEPHEN CHARLES HALL, B.Sc. (Eng.)
Department of Electrical and Computer Engineering
1988

I hereby certify that no part of the work presented in this thesis has been previously submitted for a degree to any university or similar institution.
Stephen Charles Hall
29/08/88

CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT

CHAPTER 1 : INTRODUCTION
1.1 Background to the thesis
1.1.1 Segregated and integrated communications networks
1.1.2 Local and wide area networks
1.1.3 Problems associated with the addition of voice data LAN
1.1.4 The need for a special speech codec
1.2 Aims of the thesis
1.3 An overview of the thesis contents
1.4 Original contributions made by the thesis
1.5 Publications by the author related to the thesis

CHAPTER 2 : THE NETWORK AND WORKSTATIONS
2.1 Introduction
2.2 The network
2.2.1 Configuration
2.2.2 Switching technique
2.2.3 Capacity
2.2.4 Channel errors
2.2.5 Delay
2.3 The workstations
2.3.1 Functional components
2.3.2 Structure of the packet voice terminal
2.3.3 Conclusions

CHAPTER 3 : SPEECH QUALITY IN PACKET VOICE COMMUNICATIONS
3.1 Introduction
3.2 Signal distortion
3.2.1 Introduction
3.2.2 Fixed distortion
3.2.3 Variable distortion
3.2.4 Summary and conclusions
3.3 Signal delay
3.3.1 Types of delay
3.3.2 The subjective effects of fixed delay
3.3.3 Delay minimization
3.3.4 Summary and conclusions
3.4 Signal loss
3.4.1 Introduction
3.4.2 Lost packets
3.4.3 The effect of lost packets on speech quality
3.4.4 Summary and conclusions
3.5 Signal corruption
3.5.1 Introduction
3.5.2 Corruption of voice packets
3.5.3 Summary and conclusions
3.6 Silence elimination
3.6.1 Introduction
3.6.2 The advantage of silence elimination
3.6.3 The disadvantages of silence elimination
3.6.4 Summary and conclusions
3.7 Overall speech quality
3.7.1 Quality standards
3.7.2 Maximizing the overall quality
3.7.3 Conclusions

CHAPTER 4 : THE ACCESS CONTROLLER
4.1 Introduction
4.2 Contention-based vs. ordered access
4.3 Priority access
4.4 Summary and conclusions

CHAPTER 5 : THE NETWORK VOICE PROTOCOL
5.1 Introduction
5.2 Packetization
5.2.1 Introduction
5.2.2 Factors influencing the optimum packet length
5.2.3 Instantaneous variations in the packet length
5.2.4 Long-term variations in the packet length
5.2.5 Summary and conclusions
5.3 Prioritization
5.3.1 Introduction
5.3.2 The relative prioritization of voice and data
5.3.3 Prioritization of voice according to its activity
5.3.4 Prioritization of voice according to its transmission history
5.3.5 Summary and conclusions
5.4 Flow control
5.4.1 Introduction
5.4.2 Flow control of voice traffic
5.4.3 Network load estimation/prediction
5.4.4 Summary and conclusions
5.5 Synchronization
5.5.1 Introduction
5.5.2 Essential issues in packet voice synchronization
5.5.2.1 Packet ordering
5.5.2.2 Identification of the type of a missing packet
5.5.2.3 Correction of variable packet delay
5.5.2.4 Clock frequency matching
5.5.2.5 Temporal distortion of silence intervals
5.5.2.6 Adjustment of the playout rate
5.5.3 A taxonomy of packet voice synchronization schemes
5.5.3.1 Introduction
5.5.3.2 Synchronization schemes with exact knowledge of Dv
5.5.3.3 Synchronization schemes with approximate knowledge of Dv
5.5.3.4 Synchronization schemes with no knowledge of Dv
5.5.4 Summary and conclusions
5.6 Fill-in
5.6.1 Introduction
5.6.2 Simple packet fill-in schemes
5.6.3 Advanced packet fill-in schemes
5.6.4 Summary and conclusions

CHAPTER 6 : CODEC REQUIREMENTS
6.1 Introduction
6.2 Input signal characteristics
6.3 Signal distortion
6.4 Signal delay
6.5 Bandwidth efficiency
6.6 Variable rate coding
6.7 Robustness to bit errors
6.8 Robustness to packet loss
6.9 Tandem coding
6.10 Voice conferencing
6.11 Voice messaging
6.12 PCM compatibility
6.13 Non-speech code
6.14 Control information
6.15 Packetization
6.16 Implementation

CHAPTER 7 : DESIGN AND DEVELOPMENT OF THE CODEC
7.1 Introduction
7.2 Variable rate coding
7.2.1 Introduction
7.2.2 Variable rate coding in DCM systems
7.2.2.1 Techniques
7.2.2.2 Issues
7.2.3 Variable rate coding in packet switched networks
7.2.3.1 Techniques
7.2.3.2 Issues
7.2.4 Multi rate coding
7.2.5 Embedded coding
7.2.6 Issues in the design of the embedded code
7.2.6.1 Code hierarchy
7.2.6.2 Explicit noise coding vs. coarse feedback coding
7.2.6.3 Code format
7.2.7 Summary and conclusions
7.3 Redundancy reduction coding
7.3.1 Introduction
7.3.2 Waveform coders vs. vocoders
7.3.3 Time domain vs. frequency domain waveform coders
7.3.4 Predictive waveform coders
7.3.5 Conclusions
7.4 The adaptive quantizer in the primary coder
7.4.1 Introduction
7.4.2 Adaptation vs. companding
7.4.3 Backward vs. forward adaptation
7.4.4 Syllabic, instantaneous and hybrid adaptation
7.4.5 The optimization of backward adaptive quantizers
7.4.6 The Generalized Hybrid Adaptive Quantizer
7.4.6.1 Introduction
7.4.6.2 The syllabic compandor
7.4.6.3 The instantaneously adaptive quantizer
7.4.7 Derivation of the GHAQ optimization procedure
7.4.8 Performance measures
7.4.9 The training set
7.4.10 Evaluation of the GHAQ optimization procedure
7.4.10.1 Introduction
7.4.10.2 Convergence
7.4.10.3 Design optimality
7.4.10.4 The effect of p on the performance of the GHAQ
7.4.10.5 The effect of L on the performance of the GHAQ
7.5 The predictor in the primary coder
7.5.1 Introduction
7.5.2 An analytic approach to predictor optimization
7.5.3 An iterative approach to predictor optimization
7.6 Comparative performance tests
7.6.1 Introduction
7.6.2 Test conditions
7.6.3 Results for the 1-bit adaptive quantizers
7.6.3.1 The optimum coder parameters
7.6.3.2 SNR results
7.6.3.3 Step responses
7.6.4 Results for the 2-bit adaptive quantizers
7.6.4.1 The optimum coder parameters
7.6.4.2 SNR results
7.6.4.3 Step responses
7.6.5 Summary and conclusions
7.7 Development of the secondary coding algorithm
7.7.1 Introduction
7.7.2 Selection of the coding technique
7.7.3 Adaptation of the secondary quantizer
7.7.4 Embedded code generation
7.7.5 Optimization of the secondary quantizer
7.7.5.1 The optimization procedure
7.7.5.2 Convergence of the optimization procedure
7.7.5.3 Results
7.8 Recovery from bit errors
7.8.1 Introduction
7.8.2 Effects of bit errors on the primary decoder
7.8.3 The development of the robust GHAQ
7.8.4 Performance of the robust GHAQ
7.8.5 Idle channel noise in the robust GHAQ
7.8.6 The effects of bit errors on the secondary decoder
7.9 Recovery from missing packets
7.9.1 Introduction
7.9.2 The effect of missing packets on the embedded decoder
7.9.3 A mechanism for recovering from missing packets
7.10 Packetization issues
7.11 Prioritization and flow control issues
7.11.1 Introduction
7.11.2 Speech prioritization in DCM systems
7.11.3 Fixed-rate performance of the embedded coder
7.11.4 Generation of the prioritization variables
7.11.5 Use of the prioritization variables
7.12 Packet voice synchronization and fill-in issues
7.12.1 Synchronization
7.12.2 Fill-in

CHAPTER 8 : IMPLEMENTATION OF THE CODEC
8.1 Introduction
8.2 Implementation strategy
8.3 An overview of the codec card
8.4 Signal conditioning and conversion
8.5 The embedded codec
8.5.1 Choice of digital signal processor
8.5.2 Program structure and timing
8.5.3 Arithmetic considerations
8.5.3.1 Fixed-point notation
8.5.3.2 Arithmetic overflow
8.5.3.3 Truncation error
8.5.4 Code and control information formats
8.5.4.1 Introduction
8.5.4.2 Transmit group structure
8.5.4.3 Receive group structure
8.5.4.4 Codec control/status word
8.5.5 DSP resource usage
8.6 The codec/network voice protocol interface
8.6.1 Introduction
8.6.2 Information transfer techniques
8.6.3 Memory buffers and blocks
8.6.4 Information parcels and frames
8.7 The card control/status register
8.8 Card configuration options

CHAPTER 9 : EVALUATION OF THE CODEC
9.1 Introduction
9.2 Performance comparison with log PCM
9.3 Dynamic range
9.4 Signal delay
9.5 Robustness to bit errors and missing packets
9.6 Idle channel noise
9.7 Transcoding
9.8 Subjective quality