United States Patent US 7,657,427 B2 (Jelinek), Date of Patent: Feb. 2, 2010

(12) United States Patent: Jelinek
(10) Patent No.: US 7,657,427 B2
(45) Date of Patent: Feb. 2, 2010

(54) METHODS AND DEVICES FOR SOURCE CONTROLLED VARIABLE BITRATE WIDEBAND SPEECH CODING
(75) Inventor: Milan Jelinek, Sherbrooke (CA)
(73) Assignee: Nokia Corporation, Espoo (FI)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 768 days.
(21) Appl. No.: 11/039,539
(22) Filed: Jan. 19, 2005
(65) Prior Publication Data: US 2005/0177364 A1, Aug. 11, 2005

Related U.S. Application Data
(63) Continuation of application No. PCT/CA03/01571, filed on Oct. 9, 2003.

(51) Int. Cl.: G10L 11/06 (2006.01); G10L 19/02 (2006.01); G10L 19/12 (2006.01)
(52) U.S. Cl.: 704/208; 704/214; 704/221; 704/229
(58) Field of Classification Search: None. See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
5,911,128 A       6/1999   DeJaco            704/221
6,360,199 B1      3/2002   Yokoyama          704/214
6,604,070 B1 *    8/2003   Gao et al.        704/222
6,961,698 B1 *    11/2005  Gao et al.        704/229
7,472,059 B2 *    12/2008  Huang             704/220
2002/0099548 A1   7/2002   Manjunath et al.  704/266
2002/0143527 A1 * 10/2002  Gao et al.        704/223

FOREIGN PATENT DOCUMENTS
JP 08-305398       11/1996
WO WO-96/04646 A1  2/1996
WO WO 96/05592     2/1996
WO WO-01/22402 A1  3/2001

OTHER PUBLICATIONS
Tammi, M., et al., "Signal Modification For Voiced Wideband Speech Coding And Its Application For IS-95 System", IEEE 2002, pp. 35-37.
Jelinek, M., et al., "Advances In Source-Controlled Variable Bit Rate Wideband Speech Coding", Special Workshop in Maui, Lectures by Masters In Speech Processing, Jan. 2004, pp. 1-8.
Das et al., "Variable Dimension Spectral Coding of Speech at 2400 bps and Below with Phonetic Classification", Acoustics, Speech, and Signal Processing, 1995, ICASSP-95, 1995 International Conference of Detroit, MI, USA, May 9-12, 1995, New York, NY, USA, IEEE, US, May 9, 1995, pp. 492-495, XP010625277, ISBN: 0-7803-2431-5.
Wang et al., "Phonetically-Based Vector Excitation Coding of Speech at 3.6 kbps", International Conference on Acoustics, Speech, and Signal Processing ICASSP 1989, May 23, 1989, pp. 49-52, XP010083193.
Cellario, L., et al., "CELP Coding At Variable Rate", European Transactions On Telecommunications and Related Technologies, vol. 5, No. 5, Sep. 1994, pp. 69-79.
* cited by examiner

Primary Examiner: Matthew J Sked
(74) Attorney, Agent, or Firm: Harrington & Smith, PC

(57) ABSTRACT
Speech signal classification and encoding systems and methods are disclosed herein. The signal classification is done in three steps, each of them discriminating a specific signal class. First, a voice activity detector (VAD) discriminates between active and inactive speech frames. If an inactive speech frame is detected (background noise signal) then the classification chain ends and the frame is encoded with comfort noise generation (CNG). If an active speech frame is detected, the frame is subjected to a second classifier dedicated to discriminate unvoiced frames. If the classifier classifies the frame as unvoiced speech signal, the classification chain ends, and the frame is encoded using a coding method optimized for unvoiced signals. Otherwise, the speech frame is passed through to the "stable voiced" classification module. If the frame is classified as stable voiced frame, then the frame is encoded using a coding method optimized for stable voiced signals. Otherwise, the frame is likely to contain a non-stationary speech segment such as a voiced onset or rapidly evolving voiced speech signal. In this case a general-purpose speech coder is used at a high bit rate for sustaining good subjective quality.

12 Claims, 12 Drawing Sheets

[Drawing sheets 1 to 12 are not reproduced. Recoverable content:
Sheet 2: flowchart of the classification chain. "Voice Activity Detected?" (No: CNG encoding or DTX); "Unvoiced Frame?" (Yes: unvoiced speech optimized encoding); then voiced speech optimized encoding, otherwise generic speech encoding.
Sheet 4: signal modification flowchart. Pitch cycle search, delay contour selection, and pitch-synchronous modification, each followed by "Operation successful?"; the outcomes are stable voiced low bit rate coding or full-rate generic coding.
Sheets 6, 7, 9 and 10: rate determination flowcharts. Decisions "Voice Activity Detected?", "Unvoiced Frame?", "Stable Voiced Frame?", "Low energy frame?" and "V/UV Transition?" lead to Generic Full-Rate, Generic Half-Rate, Voiced HR, Unvoiced HR, Unvoiced QR and CNG ER coding and quantization.]

METHODS AND DEVICES FOR SOURCE CONTROLLED VARIABLE BITRATE WIDEBAND SPEECH CODING

CROSS REFERENCE TO RELATED APPLICATION

This application is a continuation of International Patent Application No. PCT/CA2003/001571 filed on Oct. 9, 2003.

FIELD OF THE INVENTION

The present invention relates to digital encoding of sound signals, in particular but not exclusively a speech signal, in view of transmitting and synthesizing this sound signal. In particular, the present invention relates to signal classification and rate selection methods for variable bit-rate (VBR) speech coding.

BACKGROUND OF THE INVENTION

Demand for efficient digital narrowband and wideband speech coding techniques with a good trade-off between the subjective quality and bit rate is increasing in various application areas such as teleconferencing, multimedia, and wireless communications. Until recently, telephone bandwidth constrained into a range of 200-3400 Hz has mainly been used in speech coding applications.

[...]

In wireless systems using code division multiple access (CDMA) technology, the use of source-controlled variable bit rate (VBR) speech coding significantly improves the system capacity. In source-controlled VBR coding, the codec operates at several bit rates, and a rate selection module is used to determine the bit rate used for encoding each speech frame based on the nature of the speech frame (e.g. voiced, unvoiced, transient, background noise). The goal is to attain the best speech quality at a given average bit rate, also referred to as average data rate (ADR). The codec can operate at different modes by tuning the rate selection module to attain different ADRs at the different modes where the codec performance is improved at increased ADRs. The mode of operation is imposed by the system depending on channel conditions. This enables the codec with a mechanism of trade-off between speech quality and system capacity.

Typically, in VBR coding for CDMA systems, an eighth rate is used for encoding frames without speech activity (silence or noise-only frames). When the frame is stationary voiced or stationary unvoiced, half-rate or quarter-rate are used depending on the operating mode. If half-rate can be used, a CELP model without the pitch codebook is used in unvoiced case and a signal modification is used to enhance the periodicity and reduce the number of bits for the pitch indices in voiced case. If the operating mode imposes a quarter-rate, no waveform matching is usually possible as the number of bits is insufficient and some parametric coding is generally [...]
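
The three-step classification chain from the abstract, combined with the typical CDMA rate assignment described in the background, can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `FrameDecisions`, `classify`, `select_rate` and the flag `quarter_rate_mode` are hypothetical. The three detector outcomes are taken as given booleans, because the text above does not say how they are computed. Mapping the generic class to full rate is an assumption based on the "Generic Full-Rate" label in the drawings and the abstract's "high bit rate".

```python
from dataclasses import dataclass
from enum import Enum


class FrameClass(Enum):
    INACTIVE = "inactive (background noise)"
    UNVOICED = "unvoiced"
    STABLE_VOICED = "stable voiced"
    GENERIC = "generic (e.g. voiced onset or rapidly evolving voiced speech)"


class Rate(Enum):
    EIGHTH = "eighth-rate"
    QUARTER = "quarter-rate"
    HALF = "half-rate"
    FULL = "full-rate"


@dataclass
class FrameDecisions:
    """Hypothetical container for the three detector outcomes of one frame."""
    voice_active: bool   # outcome of the voice activity detector (VAD)
    unvoiced: bool       # outcome of the unvoiced-frame classifier
    stable_voiced: bool  # outcome of the "stable voiced" classifier


def classify(d: FrameDecisions) -> FrameClass:
    """Run the chain in order; each step either ends it or passes the frame on."""
    if not d.voice_active:
        return FrameClass.INACTIVE       # chain ends: comfort noise generation
    if d.unvoiced:
        return FrameClass.UNVOICED       # chain ends: unvoiced-optimized coding
    if d.stable_voiced:
        return FrameClass.STABLE_VOICED  # stable-voiced-optimized coding
    return FrameClass.GENERIC            # general-purpose coder at a high bit rate


def select_rate(frame_class: FrameClass, quarter_rate_mode: bool) -> Rate:
    """Typical CDMA assignment from the background section (simplified)."""
    if frame_class is FrameClass.INACTIVE:
        return Rate.EIGHTH
    if frame_class in (FrameClass.UNVOICED, FrameClass.STABLE_VOICED):
        # The operating mode decides between quarter-rate and half-rate.
        return Rate.QUARTER if quarter_rate_mode else Rate.HALF
    return Rate.FULL  # assumption: the generic class is coded at full rate


if __name__ == "__main__":
    frames = [
        FrameDecisions(voice_active=False, unvoiced=False, stable_voiced=False),
        FrameDecisions(voice_active=True, unvoiced=True, stable_voiced=False),
        FrameDecisions(voice_active=True, unvoiced=False, stable_voiced=True),
        FrameDecisions(voice_active=True, unvoiced=False, stable_voiced=False),
    ]
    for f in frames:
        c = classify(f)
        print(c.value, "->", select_rate(c, quarter_rate_mode=False).value)
```

The order of the tests matters: a frame only reaches a later classifier if every earlier one declined it, so the later detector outcomes are ignored for an inactive frame.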