IETF Opus Characterization: MUSHRA and Crowd Sourcing?
Dr. Christian Hoene, Symonics GmbH, Novi Sad, Serbia, March 2013
IETF RFC 6716: Opus Speech and Audio Codec
. by J.-M. Valin, K. Vos, T. Terriberry
. Features
  . Bit rates: 6 kbit/s to 510 kbit/s
  . Frame sizes: 2.5 ms to 60 ms
  . Sampling rates: 8 kHz to 48 kHz
  . Dynamically changeable modes and support for concealing time adjustments
  . Open source and royalty free
  . Mandatory for WebRTC
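To make the parameter ranges above concrete, the following minimal sketch converts a sampling rate, frame size, and bit rate into samples per frame, average payload bytes per frame, and packets per second. The function names are hypothetical and chosen for illustration; the arithmetic assumes one frame per packet and ignores RTP/UDP/IP header overhead.

```python
# Illustrative arithmetic only; these helpers are hypothetical and are not
# part of any Opus library API.

def samples_per_frame(sample_rate_hz: int, frame_ms: float) -> int:
    """Number of audio samples per channel covered by one frame."""
    return int(round(sample_rate_hz * frame_ms / 1000))


def payload_bytes_per_frame(bitrate_bps: int, frame_ms: float) -> float:
    """Average codec payload per frame in bytes (no packet headers)."""
    return bitrate_bps * frame_ms / 1000 / 8


def packets_per_second(frame_ms: float) -> float:
    """Packet rate, assuming exactly one frame per packet."""
    return 1000 / frame_ms


if __name__ == "__main__":
    # The two extremes of the ranges listed on the slide.
    print(samples_per_frame(48000, 2.5))      # 120
    print(samples_per_frame(8000, 60))        # 480
    print(payload_bytes_per_frame(6000, 60))  # 45.0
    print(packets_per_second(2.5))            # 400.0
```

At the shortest frame size the packet rate is 400 packets per second, so header overhead, which this sketch leaves out, can dominate the total bit rate.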
11.03.2013 2 What is WebRTC?
. A browser with HTML5 and JavaScript
. Real-time voice and video codecs
. Access to microphone and camera
. RTP, RTCP, UDP, and rate control
. Secure transmission (SRTP)
. NAT traversal (ICE, STUN, TURN)
. Global IP Sound's acoustic processing routines
. APIs for web programming
Simple Video Communication Service
http://tools.ietf.org/agenda/81/slides/rtcweb-1.pdf
WebRTC Status as of Today
. Google and Mozilla are shipping WebRTC browsers for PCs, Android, and iOS
. Opera, Microsoft, and Apple will follow soon
. Codecs: G.711, Opus, and probably VP8
WebRTC will change the telecom business!
How does Opus work?
. Opus consists of two coding algorithms
  . based on CELT and SILK, respectively
. Designed together with Internet specialists
. Internet requirements
  . Wide range of bit rates and packet rates
  . Dynamic changing of modes
  . Support for playout adjustment
  . Royalty model similar to that of Internet protocols
. Timeline
  . Standardization started in 2009
  . RFC 6716 published in Sep. 2012
  . Next step: formal characterization of Opus
Operational Modes
[Figure: Opus operating modes. Axes: Bit rate, Frequency Band. Labels: Speech Codec SILK from Skype; Hybrid; Audio Codec CELT from Xiph.org; Frequency Enhanced.]
Best-effort Internet requires a scalable codec
[Figure labels: Live Music, Telephony, Ultra-low Delay, Stereo, Hi-Fi, HD Voice, Telepresence, Toll Quality, Music sharing]
Sound Check
Analog Telephone (60s-70s)
ITU-T P.48: Specification for an intermediate reference system (300-3400 Hz)
Digital, ISDN (80s)
ITU-T G.711: Pulse code modulation (PCM) of voice frequencies
ITU-T P.830, Annex D: Modified IRS send and receive characteristics
Cellular Phone (90s)
Full rate speech; Transcoding (GSM 06.10 version 8.1.1 Release 1999)
2000s: VoIP is not better
ITU-T G.711 Appendix I: packet loss concealment, 60 ms, 20% loss rate
2010s: Things are getting better
. SILK (16 kHz, 16 kbit/s)
. SILK (24 kHz, 24 kbit/s)
. PCM (48 kHz, stereo)
Voice sounds good at more than 16 kHz
Adding DNA to Suzanne Vega
. Reference
. GSM (FR, 8 kbit/s)
. Opus (stereo, 64 kHz, 64 kbit/s)
Prof. Dr. Alexander Carôt
. Computer scientist and musician
. PhD on remote music collaboration
. Uses CELT in Soundjack [http://www.carot.de/soundjack/]
Soundjack Live Session
Wideband Voice Tests by Google 2011
. Formal listening-only tests
. Using the MUSHRA scale
  . 100: Perfect
  . 37: Toll quality (MOS value of about 4.5)
  . 0: Bad
. About 20 listeners
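With a panel of about 20 listeners, each test condition is usually reported as a mean score with a confidence interval. The sketch below shows one way to compute this on the 0 to 100 MUSHRA scale. The function name and the sample ratings are hypothetical, and the interval uses a normal approximation; a Student-t interval would be slightly wider for a panel this small.

```python
from statistics import NormalDist, mean, stdev


def mushra_summary(scores, confidence=0.95):
    """Return (mean, half_width) for one condition's listener ratings.

    Assumption: a normal-approximation confidence interval, which is a
    simplification for small listener panels.
    """
    if len(scores) < 2:
        raise ValueError("need at least two ratings")
    if any(not 0 <= s <= 100 for s in scores):
        raise ValueError("MUSHRA ratings must lie in 0..100")
    # Two-sided quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(scores) / len(scores) ** 0.5
    return mean(scores), half_width


if __name__ == "__main__":
    # Hypothetical ratings from 20 listeners for a single condition.
    ratings = [72, 65, 80, 77, 69, 74, 81, 66, 70, 75,
               78, 71, 68, 73, 79, 64, 76, 72, 70, 74]
    avg, hw = mushra_summary(ratings)
    print(f"mean = {avg:.1f}, 95% CI = +/- {hw:.1f}")
```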
Test Results
Wideband Speech
Stereo Music Results
Formal Tests at Universität Tübingen
. MUSHRA tests on the published Opus version
. Covering the complete parameter space
. Cross-check with PEAQ and POLQA
. 300 students willing to do the testing
. Starting March 2013
We are looking for a partner to help us
. do the crowd sourcing tests
. and publish the results in an RFC and a scientific journal
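Covering the parameter space means rating many combinations of codec settings, which is one reason a large listener pool is attractive. The following sketch enumerates a test grid and estimates the number of listening sessions. All grid values, the session length, and the ratings per condition are hypothetical placeholders, not the actual test plan, and the grid does not filter out combinations that would be impractical to test.

```python
from itertools import product

# Hypothetical grid; values lie inside the ranges stated for Opus but are
# chosen only for illustration.
BITRATES_KBPS = (6, 12, 24, 48, 96, 192, 510)
FRAME_SIZES_MS = (2.5, 5, 10, 20, 40, 60)
SAMPLE_RATES_KHZ = (8, 12, 16, 24, 48)
CHANNELS = (1, 2)


def build_conditions():
    """Enumerate every combination of the grid as a dict."""
    keys = ("bitrate_kbps", "frame_ms", "sample_rate_khz", "channels")
    grid = product(BITRATES_KBPS, FRAME_SIZES_MS, SAMPLE_RATES_KHZ, CHANNELS)
    return [dict(zip(keys, combo)) for combo in grid]


def sessions_needed(n_conditions, per_session, ratings_per_condition):
    """Listening sessions required, rounded up to whole sessions."""
    total_ratings = n_conditions * ratings_per_condition
    return -(-total_ratings // per_session)  # integer ceiling division


if __name__ == "__main__":
    conditions = build_conditions()
    print(len(conditions))                            # 420
    print(sessions_needed(len(conditions), 12, 20))   # 700
```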
Questions to address
. How good is crowd sourcing compared to formal MUSHRA testing?
. In which quality ranges does crowd testing work well?
. Which questions about the testing conditions should be asked before the quality tests?
. What kinds of errors are introduced by crowd testing?
. How well does Opus perform against other codecs?
. Which is better: crowd testing or PEAQ/POLQA?
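One possible way to approach the first question is to compare per-condition mean scores from a formal test with those from a crowd test, using the Pearson correlation and the root-mean-square error. The sketch below is an assumption about how such a comparison could be set up, not the planned analysis, and the scores in the example are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rmse(x, y):
    """Root-mean-square difference between paired scores."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Invented per-condition mean MUSHRA scores (same condition order).
    lab_scores = [22.0, 41.0, 58.0, 77.0, 90.0]
    crowd_scores = [30.0, 45.0, 55.0, 70.0, 82.0]
    print(f"correlation = {pearson(lab_scores, crowd_scores):.3f}")
    print(f"RMSE = {rmse(lab_scores, crowd_scores):.1f} MUSHRA points")
```

A high correlation with a large RMSE would suggest that the crowd ranks conditions consistently but uses the scale differently, which bears on the question of which quality ranges crowd testing handles well. The same two measures could be applied to PEAQ or POLQA predictions, provided they are first mapped to a comparable scale.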
Thank you very much
Symonics GmbH
Sand 13
72076 Tübingen
Germany
Phone: +49 7071 568130-0
Fax: +49 7071 568130-9
[email protected]