IETF Opus Characterization: MUSHRA and Crowd Sourcing?

Dr. Christian Hoene, Symonics GmbH
Novi Sad, Serbia, March 2013 (slides dated 11.03.2013)

IETF RFC 6716: Opus Speech and Audio Codec
by JM. Valin, K. Vos, T. Terriberry
Features
- Bit rates: from 6 kbit/s to 510 kbit/s
- Frame sizes: 2.5 ms to 60 ms
- Sampling rates: 8 kHz to 48 kHz
- Dynamically changeable modes and support for concealment of time adjustment
- Open source and royalty free
- Mandatory for WebRTC

What is WebRTC?
- A browser with HTML5 and JavaScript
- Real-time voice and video codecs
- Access to microphone and camera
- RTP, RTCP, UDP, and rate control
- Secure transmission (SRTP)
- NAT traversal (ICE, STUN, TURN)
- Global IP Sound's acoustic processing routines
- APIs for web programming

Simple Video Communication Service
http://tools.ietf.org/agenda/81/slides/rtcweb-1.pdf

WebRTC Status as of today
- Google and Mozilla are shipping WebRTC browsers for PCs, Android, and iOS
- Opera, Microsoft, and Apple will follow soon
- Codecs: G.711, Opus, and probably VP8
WebRTC will change the telecom business!

How does Opus work?
- Opus consists of two coding algorithms, based on CELT and Silk respectively
- Designed with the Internet specialists
- Internet requirements
    - Wide range of bit and packet rates
    - Dynamic changing of modes
    - Support of playout adjustment
    - Similar royalty model as Internet protocols
- Time line
    - Standardization started 2009
    - RFC 6716 published Sep. 2012
    - Next step: Formal characterization of Opus

Operational Modes
[Figure: operational modes plotted over frequency band and bit rate. Labels: Speech Codec SILK from Skype; Audio Codec CELT from Xiph.org; Hybrid; Frequency Enhanced.]

Best effort Internet requires a scalable codec
[Figure: application areas by quality. Labels: Telephony, Toll Quality, HD Voice, Telepresence, Hi-Fi, Stereo, Music sharing, Live Music, Ultra-low Delay.]
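
The bit-rate and frame-size ranges listed under Features determine how large each packet's codec payload is and how many packets are sent per second. The following is a minimal arithmetic sketch, assuming a constant bit rate and one frame per packet, and ignoring RTP/UDP/IP header overhead. The function names are illustrative and are not part of any Opus API, and not every corner combination shown is necessarily a practical operating point.

```python
def payload_bytes(bitrate_bps: float, frame_ms: float) -> float:
    """Average codec payload per frame in bytes at a constant bit rate."""
    # bits per frame = bitrate * frame duration; divide by 8 for bytes
    return bitrate_bps * frame_ms / 8000.0


def packets_per_second(frame_ms: float) -> float:
    """Packet rate when each packet carries exactly one frame."""
    return 1000.0 / frame_ms


if __name__ == "__main__":
    # Corner values of the ranges quoted on the slide
    for bitrate, frame in [(6_000, 60.0), (6_000, 2.5),
                           (510_000, 2.5), (510_000, 60.0)]:
        print(f"{bitrate / 1000:6.0f} kbit/s, {frame:4.1f} ms frames: "
              f"{payload_bytes(bitrate, frame):8.3f} bytes/frame, "
              f"{packets_per_second(frame):5.1f} packets/s")
```

Short frames lower the delay but raise the packet rate, so the per-packet header overhead grows relative to the payload. This is the "wide range of bit and packet rates" trade-off named in the Internet requirements.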

Sound Check

Analog Telephone (60s-70s)
ITU-T P.48: Specification for an intermediate reference system (300-3400 Hz)

Digital, ISDN (80s)
ITU-T G.711: Pulse code modulation (PCM) of voice frequencies
ITU-T P.830: Annex D - Modified IRS send and receive characteristics

Cellular Phone (90s)
Full rate speech; Transcoding (GSM 06.10 version 8.1.1 Release 1999)

2000s: VoIP is not better
ITU-T G.711 Appendix I Packet Loss Concealment, 60 ms, 20% LR

2010s: Things are getting better
- Silk (16 kHz, 16 kb/s)
- Silk (24 kHz, 24 kb/s)
- PCM (48 kHz, stereo)
Voice sounds good with >16 kHz

Adding DNA to Suzanne Vega
- Reference
- GSM (FR, 8 kbits)
- Opus (stereo, 64 kHz, 64 kbits)

Prof. Dr. Alexander Carôt
- Computer scientist and musician
- PhD on remote music collaboration
- Uses CELT in Soundjack [http://www.carot.de/soundjack/]

Soundjack Live Session

Wideband Voice Tests by Google 2011
- Formal listening-only tests
- Using the MUSHRA scale
    - 100: Perfect
    - 37: Toll quality (MOS value of about 4.5)
    - 0: Bad
- About 20 listeners

Test Results

Wideband Speech

Stereo Music Results

Formal Tests at Universität Tübingen
- MUSHRA tests on the published Opus version
- Covering the complete parameter space
- Cross-check with PEAQ and POLQA
- 300 students willing to do the testing
- Starting March 2013
- We are looking for a partner to help us
    - do the crowd sourcing tests
    - and publish the results in an RFC and a scientific journal

Questions to address
- How good is crowd sourcing compared to formal MUSHRA testing?
- In which quality ranges does crowd testing work well?
- Which questions about testing conditions should be asked before doing the quality tests?
- What kinds of errors are introduced by crowd testing?
- How well does Opus perform against other codecs?
- Which is better: crowd testing or PEAQ/POLQA?
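
One way to approach the first question, how crowd sourcing compares with formal MUSHRA testing, is to summarize the ratings per test condition in both settings and check how closely the per-condition means agree. The sketch below is only an illustration under stated assumptions. The condition names and all ratings are made-up placeholders, not results from any test. The confidence interval uses a normal approximation (factor 1.96), which is somewhat optimistic for panels of about 20 listeners. Pearson correlation is just one possible agreement measure and is not a method described in the slides.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, approximate 95% confidence half-width) for one condition."""
    half_width = 1.96 * stdev(scores) / sqrt(len(scores))  # normal approximation
    return mean(scores), half_width


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical MUSHRA ratings (0-100) per condition; placeholders only.
    lab = {
        "hidden_reference": [100, 98, 100, 95],
        "codec_under_test": [80, 75, 85, 78],
        "low_anchor": [20, 25, 15, 22],
    }
    crowd = {
        "hidden_reference": [95, 90, 100, 85],
        "codec_under_test": [70, 82, 76, 65],
        "low_anchor": [30, 18, 35, 25],
    }

    for name in lab:
        lab_mean, lab_ci = summarize(lab[name])
        crowd_mean, crowd_ci = summarize(crowd[name])
        print(f"{name:18s} lab {lab_mean:5.1f} +/- {lab_ci:4.1f}   "
              f"crowd {crowd_mean:5.1f} +/- {crowd_ci:4.1f}")

    lab_means = [mean(v) for v in lab.values()]
    crowd_means = [mean(crowd[k]) for k in lab]
    print("correlation of condition means:", pearson(lab_means, crowd_means))
```

Wider confidence intervals in the crowd data, or disagreement concentrated in one quality range, would point to the kinds of errors and the quality ranges the questions above ask about.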

Thank you very much
Symonics GmbH
Sand 13
72076 Tübingen
Germany
Phone: +49 7071 568130-0
Fax: +49 7071 568130-9
[email protected]
