Code Excited Linear Prediction with Multi-Pulse Codebooks
Total Page:16
File Type:pdf, Size:1020Kb
CODE EXCITED LINEAR PREDICTION WIT& '.' MU~TI-PULSECODEBOOI~S __- --- - - .. .. I. Lei Zhang R.X-Sc.. Iiniversi~yof British / .A THESIS SUBMITTED IZE PARTIAL F1-LFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF in the School Engineering Science 6 I& Zhang 1997 f SIIIOS FKXSER liN1~"'ESITJ' - 3 A11 rights reserved. This work may not be - rcprodu<eed in. whofe or in part. by pltotocopg~ or other means. without the permission of the author. *-. - . T BiMiothqpmtmak ~ationalLibrary P 1*m of Canada du Cana ". l ?F.r Acquisitions and - Acquisitions 6t Bibliographig Services services bibliographiques 395 Wellingtpn Street 395, rue Wellington Ottawa OW KMON4 Onaw ON K1A ON4 Canada Cairada . ... " 4 I T 4 -. .9 Our fie Notre dfBrmce a - i * ' 1 ' C \ . -- , . 2 3 - 0 * I- *. - + * I The author has granted a non- L'auteur a accorde une licence non exclusive licence allowing the - exclusive permettant B la National Library of Canada to Bibliotheque nationale du ~aqa&de reproduce, loan, distribute or sell reprodwe, preter, dishher ou ' copies of this thesis in microform, vendre dks copies de cette these sous paper or electronic' formats. la fonne de microfiche/^, de reproduction sur papier ou sur format I elecfronique. *. F, I The author retains ownership of the L'guteur conserve la propriete du copyright in this thesis. Neither the droit d7autekqui protege cette these. thesis nor substantial extracts fiom it Ni la these ni des extraits substantiels may be printed or otherwise f #+ de celle-cipe doivent etre imprimes reproduced without the author's ou autrement reproduits sans son permission. autorisation. A APPROVAL . Name: -kei Zhang . Degree: Master, of AppliedeScience * TitIe af thesis Code Excited Linear Prediction with Multi-Pulse Code- L book s Examining Committee: Dr.. B. Gru~er,~Chai:man Dr; V. Cuperman Professor, Engineering Science, SFU Senior Supervisor Dr. J. Vaisey Assistant Professor, :Engineering Science, SFU Supervisor D1; J. Cavers fiofessdr, Engineering Science, SFI! Examiner %, May 13, 1997 Date Approved: ' Abstract - d ' f Voice compression is an important ingredient in digital cornmrtnication ancl voice stor- age systems. In the past decade. Code Excited Linear Prediction (CELP)has become - the dominant speech coding algorithm for bit rates between -I kb/s and=16 kb/s. How- ever, for rates around 4 kb/s and beloiz;, CELP loses its competitive eclge to spectral , domain coding. For many applications, an attractive approach for increasing systern + - 1 rapacity whilt maintaining good quality is to allow tthe ),it rate to vary according to the input speech charzicterist ics. The corresponding coclecs are charact crized by their B average hit rate and belong to the$lass of source-coiltrolled variable rate c~clecs. Source-controlled variable rate speech coders have been applied to digital cellular ' P communications ancl to- speech storage systems such as voice mail anel voice response equipment. Replacing the fixed-rate coders 115. variable-rate coders resdts in a signif- *_ icant increase in the Gystem capacity while maintaining the clesirecl quality of servfce. This thesis presents a variable-rate C'ELPpystern which achieves good cornulunica- tions speech quality at an average rate ofaabout3 kb/s based on a one-way converwtioti with 30% silence.. The coclec operates as a source-controlled variable rate coder with rates of 1.9 kh/s for voiced and transition souncls, :3.0 kb/s for unCoicetl sou~~clsand 6'70 b/s for silent -frames. The appropriate coding rate isaselected by analyzing ex.h input speech frame using a frame classifier. The codec uses a n~odrilardesign in'which . - * - the ge"nera1. struct t1r6 and coding algorithm is the same for all rates. A11 configw%t ior~s are based on 'the system with the highest bit-rate. The lower hit-rates are, obtainecl- * by l-arying the franie/snhfrarne sizes. using different Eodehooks for quantization. and in some cases disabling coclec components. The predominant source of quality degradation at rates around 4 kb/s or lower for C'ELP systems is. t he inadeq~iaternocleling of t he pitch lag correlation which rcsults li in noisy reconstructed speech. We aclclress the prohlenl .by usi~~gnew techr~iques * +-&* -including prediction of the +ed codebook target vector and joint optimization of the - adaptire and c&ehodk search. The prediction of the fixed codebook target ' vector is based on fixed codebook selections in previous subframes and a running * estimate for the fundamental frequency. Rpuits are presented which indicltte that the variable rate system at ail average rate of less than 3.2 kb/s, achieves better *q?~alitythan fixed ate standard codecs with rates irkhe range 4 - 1.8 kbjs on the m. P speech database tested. To C;lncc and &bo. fhc ctnt~rof nsy uwi-ld. d I would like to thank my advisor. Dr. Vladimir C'upernlan,\ or his steady support and guida ice throughout the course of this reseealso want td thank qveryone in the I speecbiab for their friendship, antl for lentlirig me a helping hand when needed. I sincerely thank my parents for taking great care of Grace, antl leaviilg me no ' chance of doirig a better joh in the future, and ai~ofor so many other things ... Finally, 8 thariks to Yubo who has seen the best and the worst sides .of me.. antl who has always* L b been a spurce of support and comfort for the past ti(o Fears. * . ... Abstract ................; ........., ................... .= ..................... 11 i T & 7 Acknowledgements .............. ; .......................................... vi .'+esL e 9 * List of 'Ia5Ps ......................................................: .-.......... x List .~fFigures ......................................................... .'. .. xii .. .* ... List of ~bbreviatiork...................................................... arll 1 Introduction ........: .............................. :................... 1 1.1 . Thesis Objectives ............................................... 2 .. 1.2 Thesis Organization .............................................. 3 2 Speech Coding .......................................................... 4 1 Iritroduction .................................................. .I . * 2.2 Siglal C'onlpression Techniques ................................... 6 2.2 Scalar Quantization ....... ............. .- . : ............. 6 2.2 Vector Quantization ......................... I. .......... -r 22 Linear Prediction in Speech C'ocling .........................- S 2.2.i ..\ltern~tiveRepresentation of LPC. C'orfficieilts .............. 12 2. Spccch C'oding Systems ........................................ 13 2.3. I Analysis-S\..nt hesis C'oders .........: ....................... 14 2.32 \Vaveform Coders ....................................... l'i \ = ,* r ? vii I 1 . , . d ! ~uiti~ulseExcited Linear Predicti 3.1 . MPLPC Algorithm ............ i 3.1.1 Multipulse Excitation Model ................................, , 21 3.1.2- MPLPC Pulse Search ...................;' .................. 23 i :3.2 Other MPLPC: Coding Techniques. ..............; .................ir 24 3.3 Regular Pulse Excitation Coding .:........... .'., ..................- 25 3.4 MPLPC based Systems ..............: ...... 1 .................. 27 Code Excited Linear Prediction ................ .,I.. .................. I 29 1 Introduction .................................................. 29 * 2 Excitation Coclel>ooks ...................... !. ................... 32 4.2. Stochastic Codebook ................................. .I.. :$2 4.2. Adaptive Codehook ................: .................... 33 4.3 ZIR-ZSR Decornposit ion .................................... : ... 13.1 I -1.3:1 Codebook Search ................. .' ..................... 3.5 4.1 Linear Prediction .Analysis and Quantization ...................... 117 -1.6 C'ELP Systems ......................... ;. ..................... 39 1.6.1 The DoD 4.8 kh/s Speech Coding Stanclad ................. :19 , I -1.6.:3 LD-C'ELP ....................... .I....................... -I0 Variable-Rate Speech Coding .....,.........I. ......................... 42 -54 Voice Activity Detection ............... .I. ........................ 4:3 1 - I 5.2 Active Speech C'kssification ............ :. ............ , .......... 45 I A Low-Rate Variable Rate CELP C~der.I. .......................... -19 6.1 System Overview 50 2 System Configuration 52 2 Bit Allocations ............................... .. ...... ,. 52 . 6.2.2 Voicecl/Transit ion Coding ................................. 53 t, 6.2.3 linvoicetl Coding .............. 1 ......................... 3:% 6.2. Silence Coding ..........................i .............. 3 ... 'b - ... ,P 6 .I%.' - f rC - . m v s ,,, * # 63.5 variable Rate Operatio11 ..................... - 6.3 Excitat,ion Generation and Encoding ......: ....... ., -. 2% 6.3.1 ;\I ulti-pulse ~ixedcodehook ~esi~n1 ....... 6.3.2 - Predicted Vector .................., ................... .': 55 -" 6.3.3 .Joint C'odebook Search and Gain Quantization ............... a I 6.3.4 Gain Quantizer ...................... : ..................... 59 * 6.13.5 Low Complexity Codebook. Search ......:...... 1. ........... 61 . 9 :* . 6.4 C'odec Components .............................................* 6% 6.4.1 Frame Classifier ......................................... 62 Z s I 6.4.2 LPC Analysis and Quantization ............ c ..:........... 67 + 6.4.:3 Pitch Estimation ........................................ 68 4.Adaptive C'odehook ...................................... 69 6.Fixed C'odebook ...................................