(12) United States Patent (10) Patent No.: US 7,848,922 B1 Jabri et al.
US007848922B1

(12) United States Patent
     Jabri et al.

(10) Patent No.: US 7,848,922 B1
(45) Date of Patent: *Dec. 7, 2010

(54) METHOD AND APPARATUS FOR A THIN AUDIO CODEC

(76) Inventors: Marwan A. Jabri, 656 Hilary Dr., Tiburon, CA (US) 94920; Nicola Chong-White, 364 Penshurst St., Chatswood, NSW (AU) 2067; Jianwei Wang, 2700 Lincoln Village Cir., Apt. 291, Larkspur, CA (US) 94939

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 512 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 11/890,263

(22) Filed: Aug. 2, 2007

Related U.S. Application Data

(63) Continuation of application No. 10/688,857, filed on Oct. 17, 2003, now Pat. No. 7,254,533.

(60) Provisional application No. 60/439,366, filed on Jan. 9, 2003, provisional application No. 60/419,776, filed on Oct. 17, 2002.

(51) Int. Cl. G10L 19/04 (2006.01)

(52) U.S. Cl.: 704/219; 704/220; 704/200.1; 704/229; 704/223; 704/262

(58) Field of Classification Search: 704/219, 220, 230, 200.1, 500-504, 223, 229, 262, 264. See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,787,390 A * 7/1998 Quinquis et al., 704/219
6,115,688 A * 9/2000 Brandenburg et al., 704/503
6,115,689 A * 9/2000 Malvar, 704/503
6,167,373 A 12/2000 Morii, 704/219
6,314,393 B1 11/2001 Zheng et al.
6,424,939 B1 * 7/2002 Herre et al., 704/219
6,717,955 B1 4/2004 Holler
6,799,060 B1 * 9/2004 Kim, 455/563
6,807,524 B1 * 10/2004 Bessette et al., 704/200.1
6,912,584 B2 * 6/2005 Wang et al., 709/231
7,254,533 B1 * 8/2007 Jabri et al., 704/219
7,539,612 B2 * 5/2009 Thumpudi et al., 704/200.1
2002/0028670 A1 3/2002 Ohsuge
2003/0103524 A1 6/2003 Hasegawa

OTHER PUBLICATIONS

3GPP TS 26.073 "ANSI-C code for the Adaptive Multi Rate (AMR) speech codec", Release 5.00, (Mar. 2002) 3rd Generation Partnership Project (3GPP), http://www.3gpp2.org/.
3GPP TS 26.090, "Adaptive Multi-Rate (AMR) speech codec; Transcoding functions", Release 5.0.0 (Jun. 2002), 3rd Generation Partnership Project (3GPP), http://www.3gpp2.org/.
3GPP TS 26.104 "ANSI-C code for the floating-point AMR speech codec", Release 5.00, (Jun. 2002) 3rd Generation Partnership Project (3GPP), http://www.3gpp2.org/.
3GPP TS 26.173 "ANSI-C code for the Adaptive Multi-Rate Wideband speech codec", (Mar. 2002) 3rd Generation Partnership Project (3GPP), http://www.3gpp2.org/.
3GPP TS 26.190 "AMR Wideband speech codec; Transcoding Functions (Release 5)", 3rd Generation Partnership Project (3GPP); Dec. 2001, http://www.3gpp2.org/.
3GPP TS 26.204 "ANSI-C code for the floating-point Adaptive Multi-Rate Wideband (AMR-WB) speech codec", Release 5.0.0, (Mar. 2002) 3rd Generation Partnership Project (3GPP); http://www.3gpp2.org/.
3GPP2 C.S0030-0 "Selectable Mode Vocoder Service Option for Wideband Spread Spectrum Communication Systems", 3rd Generation Partnership Project (3GPP2), Dec. 2001, http://www.3gpp2.org/.
ANSI/TIA/EIA-136-Rev.C, part 410, "TDMA Cellular/PCS Radio Interface, Enhance Full Rate Voice Codec (ACELP)." Formerly IS-641. TIA published standard, Jun. 1, 2001, http://www.tiaoline.org.
Cox, "Speech Coding Standards," Speech Coding and Synthesis, W.B. Kleijn et al., eds., pp. 49-78, Elsevier Science, (1995), The Netherlands.
ETSI, GSM 6.10 "Recommendation GSM 6.10 Full-Rate Speech Transcoding", version 8.02 (Nov. 2000), European Telecommunications Standards Institute (ETSI), http://www.etsi.org/.
ETSI GSM 06.60, "Enhanced Full Rate (EFR) Speech transcoding", version 8.0.1 (Nov. 2000), European Telecommunications Standards Institute (ETSI), http://www.etsi.org/.
ETSI GSM 06.20, "Half rate speech: Half rate speech transcoding", version 8.01 (Nov. 2000), European Telecommunications Standards Institute (ETSI), http://www.etsi.org/.
ISO/IEC 14496-3 MPEG4-CELP Coder, "Information Technology: coding of Audiovisual Objects, Part 3: Audio, Subpart 3: CELP", ISO/JTC 1/SC 29 N2203CELP, May 1998.
ITU-T G.723.1 "Speech Coders: Dual rate speech coder for multimedia communications transmission at 5.3 and 6.3 kbit/s", ITU-T Recommendation G.723.1 (1996), Geneva, http://www.itu.org/.
ITU-T G.723.1 Annex B "Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s, Annex B: Alternative specification based on floating point arithmetic", ITU-T Recommendation G.723.1, Annex B, http://www.itu.org/.
ITU-T G.728 "Coding of speech at 16 kbit/s using low-delay code excited linear prediction", ITU Recommendation G.728 (1992), Geneva, http://www.itu.org/.
ITU-T G.729 "Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP)", ITU-T Recommendation G.729 (1996), Geneva, http://www.itu.org/.
ITU-T G.729A "Coding of Speech at 8 kbit/s using conjugate structure algebraic-code-excited linear-prediction (CS-ACELP) Annex A: Reduced complexity 8 kbit/s CS-ACELP speech codec", ITU-T Recommendation G.729, Annex A, Nov. 1996, http://www.itu.org/.
ITU-T G.729C "Annex C: Reference floating-point implementation for G.729 CS-ACELP 8 kbit/s speech coding", ITU-T Recommendation G.729, Annex C, Sep. 1998, http://www.itu.org/.
Spanias, A.S. "Speech Coding: A Tutorial Review", Proc. IEEE, vol. 82, No. 10, pp. 1541-1582, Oct. 1994.
TIA/EIA/IS-127-2 "Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems", Telecommunications Industry Association, 1999.
TIA/EIA/IS-733, "High Rate Speech Service Option 17 for Wideband Spread Spectrum Communication Systems", TIA published standard, Nov. 17, 1997.

* cited by examiner

Primary Examiner: Vijay B Chawan

(74) Attorney, Agent, or Firm: Hoffmann & Baron, LLP

(57) ABSTRACT

An apparatus and method for encoding and decoding a voice signal. The apparatus includes an encoder configured to generate an output bitstream signal from an input voice signal. The output bitstream signal is associated with at least a first standard of a first plurality of CELP voice compression standards. Additionally, the apparatus includes a decoder configured to generate an output voice signal from an input bitstream signal. The input bitstream signal is associated with at least a first standard of a second plurality of CELP voice compression standards. The CELP encoder includes a plurality of codec-specific encoder modules. Additionally, the CELP encoder includes a plurality of generic encoder modules. The CELP decoder includes a plurality of codec-specific decoder modules. Additionally, the CELP decoder includes a plurality of generic decoder modules.

20 Claims, 23 Drawing Sheets

[Representative drawing: encoder block diagram, reference numerals 1710, 1712, 1716. Legible block labels: input s(n); pre-processing; LP analysis; perceptual classification; rate determination; pitch search; LPC to LSP conversion; LSP quantization; NELP encoding (calculate gain index, generate random excitation, frame gain quantization); CELP encoding (Type 0) and RCELP encoding (Type 1), selected by frame type, with outputs ACB lag, ACB gain, FCB lag, FCB gain, pred. switch, interp. path. The rate labels on the branches are not legible in the extracted text.]

U.S. Patent Dec. 7, 2010 Sheet 1 of 23 US 7,848,922 B1

[FIG. 1A: Codec 1 Encoder, Codec 2 Encoder, ..., Codec K Encoder. Each takes speech samples (e.g. PCM) and produces an encoded bitstream according to codec standard 1, 2, ..., K respectively.]

Sheet 2 of 23

[FIG. 1B: Codec 1 Decoder, Codec 2 Decoder, ..., Codec K Decoder. Each takes an encoded bitstream according to codec standard 1, 2, ..., K respectively and produces speech samples (e.g. PCM).]

Sheet 3 of 23

[FIG. 2: reference numerals 200, 210. Two blocks, each labelled "Thin codec (encoder part)" in the extracted text, connecting speech samples (e.g. PCM) and an encoded bitstream according to one of the standards {Codec 1, Codec 2, ... Codec K}.]

Sheets 4 and 5 of 23

[No figure text survives in the extracted text for these sheets.]

Sheet 6 of 23

[FIG. 5: encoder flow, reference numerals 510-590. Input speech sample (PCM); pre-processing; LP analysis and quantization; open-loop pitch lag analysis; adaptive codebook lag analysis and quantization; adaptive codebook gain analysis and quantization; fixed codebook index analysis and quantization; fixed codebook gain analysis and quantization; bitstream packing; codec bitstream.]

Sheet 7 of 23

[FIG. 6: decoder flow, reference numerals 610-660. Codec bitstream; bitstream unpacking; excitation reconstruction; synthesis filtering; post processing; output speech sample (PCM).]

Sheet 8 of 23

[FIG. 7: Universal Thin Codec (Encoder), reference numeral 720. Per-codec chains for Codec 1 (C1) through Codec K (CK): pre-processing, parameter 1 encoding (Q1), parameter 2 encoding through (QN), bitstream packing. These are combined into a single encoder made of generic pre-processing, specific pre-processing, generic parameter 1 encoding, specific parameter 1 encoding, specific parameter 2 encoding, generic bitstream packing, and specific bitstream packing.]

U.S.