HD Voice and Wideband Codecs (HD-02) Panel Discussion (ITEXPO West 2009) September 02, 2009 Los Angeles, CA
A World Leader and Innovator In Wireless Technologies

A. Ryan Heidari
Director, Technology Marketing

HD Voice and Wideband Codecs (HD-02)
Wednesday - 09/02/09, 9:30-10:15am

Historically high-quality voice codecs have been compute intensive and complicated to license due to multiple ownership of intellectual property. This has now changed. Moore’s Law has made the compute requirements easily manageable, and several vendors have released state of the art royalty-free codecs, for example Polycom, Skype and Speex. This session explains the benefits of each.

Presented by:
• Jeffrey Rodman, Founder, Polycom, and CTO, Polycom Voice Solutions Group, Polycom
• Jan Linden, VP, Engineering, Global IP Solutions
• Julian Spittka, Product Manager and Senior Engineer - Audio/Video Group, Skype
• A. Ryan Heidari (Moderator), Director, Technology and Product Marketing, Qualcomm Inc.

Evolution in Wireless Modem
[Figure: average throughput (Kbps, logarithmic axis from 1 to 10,000) versus year (1980 to 2010). Technologies plotted: AMPS, GSM, CDMA, GPRS, CDMA 1x, EDGE, WCDMA, EVDO, DO-rev A, HSDPA 3.6, HSDPA 7.2, DO-rev B (DOrB), HSPA+, LTE.]

Qualcomm Multimedia Codecs
Video Codecs:
– MPEG-4 Simple Profile
– H.263 Profile 0/3
– H.264 Baseline
– RealVideo v10*
– Windows Media v9*
Audio Codecs:
– EFR
– AMR-NB – all rates
– AMR-WB – all rates*
– AAC – up to 128kbps @ 48kHz*
– AAC Plus - up to 128kbps @ 48kHz*
– Enhanced AAC Plus - up to 128kbps @ 48kHz*
– QCELP -- 13kbps fixed full and half rates
– EVRC -- 8kbps fixed full rate
– EVRC-B -- new NB extension of EVRC
– EVRC-WB* -- new WB extension of EVRC
– Real Audio v8*
– Windows Media Audio v9*
* Not available on all MSM platforms. For a detailed feature release plan by chipset, please refer to “Mobile Video SW Release Plan”.

What is in a QUALCOMM chipset?
• A Leader in Delivering Wireless Solutions
• Always Striving to Achieve Semiconductor Excellence
• Smaller Form Factor
• Enhanced System Performance
• Highly Integrated
• Greater Quality & Reliability
• Reduced Power Consumption
• Lower Overall Costs
• Faster Time to Market
[Figure: chipset block diagram. Block labels: CDMA, Micro-Processor, GSM/GPRS, RF, PM, GPS, DSP, 3D Graphics, Video, DSP, Applications, Audio, Memory, Imaging.]

Snapdragon Will Enable the Next Generation of Consumer Electronic Devices
Snapdragon pairs industry-leading processing capabilities with QUALCOMM’s proven wireless leadership

Things you always wanted to know, but were afraid to ask
“Dad, where do codecs come from?”
“There is an evil monster, who lives across the ocean, in the land of MENS. He makes up new codecs whenever he is angered by the X-Men.”

Speech Codecs Landscape
Fragmented and mostly driven by fixed network service providers
[Figure: codecs grouped by origin, with the labels Wireless and Wireline, plotted against a bit-rate axis marked 4kbps, 8kbps, 16kbps, 32kbps and 64kbps.]
• Proprietary: ISAC-WB, iLBC, RNA-10, RT-Audio, WMA-WB, SILK, IP-MR, SVOPC, Speex, BV16, BV32
• 3GPP2: EVRC-WB, EVRC-B, EVRC, QC8, QC13
• 3GPP: AMR-WB, AMR, HR, EFR
• ITU-T: G.728, G.729.1, G.729, G.722.1, G.722, G.728, G.723, G.726, G.711

The Linux operating system is cheap or even free compared with Sun’s own Solaris OS or Microsoft’s Windows.
“Open source is free like a puppy is free,” says Scott McNealy, Sun Chief Executive, hinting at long-term costs and hassles, and occasional clean-up jobs.
This is despite the fact that Sun later released Solaris under an open source license as well!
ITExpo Introduction
Jeffrey Rodman
CTO/Co-Founder
August 2009

Polycom’s Vision: A Phone Call can be…
• a phone call…
• an HD Voice session…
• an HD Video session…
• a gaming session…
• an applications session…
• an astronomy lesson…
And all unified via the IP network

Who is Polycom?
• Founded in 1990
• HQ in San Francisco Bay area, global presence
• Pioneer in Voice, Video, Data, Web conferencing
• Leader in conferencing and personal communication
• 2,400+ employees, worldwide
• Financially strong, publicly traded (NASDAQ: PLCM)
• >2x the installed base of nearest competition
• 600+ patents registered or pending

What Polycom Does
[Figure: chart with the segment labels Wireless, Infrastructure and Telepresence/Video; values shown: 12%, 14%, 18%, 50%.]

Polycom Collaboration: From the endpoints
[Figure: product families. Labels: Telepresence (Personal, Immersive), Voice.]

Polycom Collaboration: To The Infrastructure
[Figure: product families. Labels: Telepresence, Conference Platforms, Recording & Streaming, Management, Security, Infrastructure and Applications, Voice. Products named: RSS 2000, RMX, MGC Family, SE200, VMC 1000, VBP.]

Human Collaboration: HD Voice From the Start
• 2005: HD Voice over IP Desk
• 2003: HD Voice over POTS
• 1988: VIDEO with HD Voice

Thank You!

• Global IP Solutions (GIPS)
– Recognized leader in world class voice and video processing technology for IP networks
– GIPS s/w is deployed in over 800 million end-points
– Enables developers to offer the highest quality regardless of network conditions
– HD Voice since 2001
• Jan Linden
– Vice President Engineering – GIPS
– R&D in speech and video processing for more than 15 years

Get the Most Out of HD Voice
Just getting an HD Voice codec is not enough!
1. HD capable microphone
2. High quality HD Voice codec
– Suitable for usage scenario
3. HD Voice Quality Enhancement
– Echo cancellation, noise suppression, gain control,…
4. End-to-end network HD Voice support
– Preferably no transcoding
– Absolutely no narrowband
5. Network clean-up
– Jitter buffer and packet loss concealment
6. HD capable speaker

So Many Codec Options
iSAC, RTAudio, G.729.1, iPCM-WB, G.719, BV 32, G.722.1 (Siren), Speex, G.722.2 (AMR-WB), AAC-LD, G.722, SILK, G.718, SVOPC, G.711.1, EVRC-WB

Choosing Speech Codec
• Many conflicting parameters affect codec choice
• Determines upper limit of quality
• Support of several codecs necessary
– Interoperability
– Usage scenario
• IPR issues a significant concern
[Figure: “Speech Codec” surrounded by the parameters Packet-loss Robustness, Complexity, Memory, Delay, Cost, Bit-rate, Sampling Rate, Quality.]

Cost of a Codec
• Implementation
– Optimized implementation on a specific platform
– For speech codecs rarely any IPR issues
• Video and music codecs very different
• IPR licensing
– Pay patent holders licensing fees
– “One-stop-shopping” licensing possible but usually, not all IPR holders represented
– Only this part is usually free for a “royalty free codec”
• Indemnification
– Pay your vendor for indemnification or risk paying unknown IPR holders
[Figure: “Codec IPR” with holders A, B and C, each marked with a question mark.]

HD Voice and Wideband Codecs
Julian Spittka
Skype
Wednesday - 09/02/09, 9:30-10:15am

Bio
• Product Manager and Senior Engineer – Skype Audio / Video Group
• Co-Founder of Camino Networks (acquired by Skype)
• Co-Author of iLBC codec standard (IETF RFC 3951 and CableLabs)
• 8+ years of experience in IP Communications
• Master’s degree from RWTH Aachen University

Commoditization of Communications
1. Move from Circuit Switched Voice to VoIP
2. Move from Firmware/Hardware to Software

New Communications World
• New applications require scalability (e.g. video chat, mobile VoIP, telepresence, in-game chat, live music performances)
• New distribution models require simple licensing models (e.g. free download)

What is needed?
• Commoditization through codecs that are:
– Scalable (speech quality, bit rate, complexity)
– Widely and easily distributable
• Accessible codec technology
• Interoperability
• Minimal number of codecs that cover the operating space

What can be done?
• In May 2009, Skype proposed to IETF to form a working group with the agenda to standardize a wideband codec
• Skype has submitted its signature default super wideband codec SILK to IETF
• SILK is available today royalty free to 3rd party developers

SILK
• Scalable
– Variable and adaptive bit rate: 6 - 40kbit/s
– Sampling rate: 8 - 24 kHz
– Complexity
• Portable (Fixed point ANSI C)
• Lightweight (CPU, memory, code size)
• Robust to jitter and packet loss
• Low delay (25ms)
• More info: https://developer.skype.com/silk

SILK Speech Quality
[Figure: MOS scores for wideband speech signals at different bit rates.]
[Figure: MOS scores for wideband speech signals at different packet loss rates. All codecs were operated at a bit rate of 18.25 kbps.]
MOS (Mean Opinion Score) listening test was performed by Dynastat, an independent 3rd party laboratory. Confidence intervals (95%) are +/- 0.1 MOS. All bitrates are measured and averaged over frames containing active speech. All audio signals are wideband. SILK and Speex were run in the highest complexity mode.

Thank You!
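
Worked example: what the SILK bit-rate range means per packet
The SILK slide quotes an adaptive bit rate of 6 - 40 kbit/s, and the listening test ran all codecs at 18.25 kbps. The sketch below turns those figures into average payload bytes per packet and an approximate on-the-wire rate. The 20 ms packet interval and the 40-byte IPv4 + UDP + RTP header (20 + 8 + 12 bytes, no header compression) are assumptions made for illustration and are not stated on the slides; the function names are hypothetical.

```python
# Assumptions for illustration only (not from the slides):
FRAME_MS = 20        # one codec frame per packet, 20 ms apart
HEADER_BYTES = 40    # IPv4 (20) + UDP (8) + RTP (12), no header compression


def payload_bytes(bit_rate_bps: float, frame_ms: float = FRAME_MS) -> float:
    """Average codec payload per packet, in bytes, excluding headers."""
    return bit_rate_bps * frame_ms / 1000 / 8


def wire_bit_rate_bps(bit_rate_bps: float,
                      frame_ms: float = FRAME_MS,
                      header_bytes: int = HEADER_BYTES) -> float:
    """Codec bit rate plus per-packet header overhead, in bits per second."""
    packets_per_second = 1000 / frame_ms
    return bit_rate_bps + header_bytes * 8 * packets_per_second


if __name__ == "__main__":
    # Bit rates taken from the slides: SILK minimum, test rate, SILK maximum.
    for rate in (6_000, 18_250, 40_000):
        print(f"{rate / 1000:6.2f} kbit/s codec rate: "
              f"{payload_bytes(rate):7.3f} payload bytes/packet, "
              f"{wire_bit_rate_bps(rate) / 1000:6.2f} kbit/s on the wire")
```

Under these assumptions the header overhead is a fixed 16 kbit/s, so it outweighs the codec payload at the low end of the range and becomes a smaller share as the bit rate rises.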