IETF Characterization: MUSHRA and Crowd Sourcing?

Dr. Christian Hoene, Symonics GmbH, Novi Sad, Serbia, March 2013

IETF RFC 6716: Opus Speech and Audio Codec

. by JM. Valin, K. Vos, T. Terriberry
. Features
  . Bit rates: from 6 kbit/s to 510 kbit/s
  . Frame sizes: 2.5 ms to 60 ms
  . Sampling rates: 8 kHz to 48 kHz
  . Dynamically changeable modes and support for concealment of time adjustment
  . Open source and royalty free
  . Mandatory for WebRTC
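The feature ranges above can be turned into a small sanity check. The following is a minimal sketch, not part of the Opus reference code: the function name `describe_config` is hypothetical, and the discrete sets of sampling rates and frame sizes are an assumption based on our reading of RFC 6716 (the slide only gives the ranges).

```python
# Illustrative check of an Opus configuration against the ranges on the slide.
# ASSUMPTION: the discrete value sets below follow RFC 6716; the slide itself
# only states the ranges (8-48 kHz, 2.5-60 ms, 6-510 kbit/s).
ALLOWED_SAMPLE_RATES_HZ = (8000, 12000, 16000, 24000, 48000)
ALLOWED_FRAME_SIZES_MS = (2.5, 5, 10, 20, 40, 60)
MIN_BITRATE_BPS = 6_000
MAX_BITRATE_BPS = 510_000


def describe_config(sample_rate_hz, frame_ms, bitrate_bps):
    """Return (samples per frame, frames per second, payload bytes per frame)."""
    if sample_rate_hz not in ALLOWED_SAMPLE_RATES_HZ:
        raise ValueError(f"unsupported sampling rate: {sample_rate_hz} Hz")
    if frame_ms not in ALLOWED_FRAME_SIZES_MS:
        raise ValueError(f"unsupported frame size: {frame_ms} ms")
    if not MIN_BITRATE_BPS <= bitrate_bps <= MAX_BITRATE_BPS:
        raise ValueError(f"bit rate out of range: {bitrate_bps} bit/s")

    samples_per_frame = round(sample_rate_hz * frame_ms / 1000)
    frames_per_second = 1000 / frame_ms
    # Average payload size if the bit rate were spread evenly over all frames.
    payload_bytes = bitrate_bps / 8 / frames_per_second
    return samples_per_frame, frames_per_second, payload_bytes
```

For example, `describe_config(48000, 20, 64000)` describes a 20 ms frame at 48 kHz and 64 kbit/s; the payload figure is an average, since the actual size of an encoded frame can vary.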

11.03.2013 2 What is WebRTC?

. A browser with HTML5 and JavaScript
. Real-time voice + video codecs
. Access to microphone and camera
. RTP, RTCP, UDP, and rate control
. Secure transmission (SRTP)
. NAT traversal (ICE, STUN, TURN)
. Global IP Sound’s acoustic processing routines
. APIs for web programming

Simple Video Communication Service

http://tools.ietf.org/agenda/81/slides/rtcweb-1.pdf

WebRTC Status as of Today

. Google and Mozilla are shipping WebRTC browsers for PCs, Android, and iOS
. Opera, Microsoft, and Apple will follow soon
. Codecs: G.711, Opus, and probably VP8

WebRTC will change the telecom business!

How does Opus work?

. Opus consists of two coding algorithms
  . based on CELT and SILK, respectively
. Designed together with Internet specialists

. Internet requirements
  . Wide range of bit and packet rates
  . Dynamic changing of modes
  . Support for playout adjustment
  . Royalty model similar to that of Internet protocols

. Timeline
  . Standardization started in 2009
  . RFC 6716 published in Sep. 2012
  . Next step: formal characterization of Opus
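The requirement of a "wide range of bit and packet rates" matters because every packet carries fixed protocol headers. The sketch below estimates the bit rate seen on the network for a given codec bit rate and frame size. It rests on several assumptions not stated in the slides: one Opus frame per packet, IPv4 + UDP + RTP headers of 20 + 8 + 12 bytes, and no SRTP authentication tag or link-layer overhead. The function name is hypothetical.

```python
# ASSUMPTION: IPv4 (20 bytes) + UDP (8 bytes) + RTP (12 bytes) per packet,
# without IP options, RTP header extensions, or an SRTP authentication tag.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def network_bitrate_bps(codec_bitrate_bps, frame_ms,
                        header_bytes=IP_UDP_RTP_HEADER_BYTES):
    """Estimate the on-the-wire bit rate, assuming one frame per packet."""
    if frame_ms <= 0:
        raise ValueError("frame size must be positive")
    packets_per_second = 1000 / frame_ms
    overhead_bps = packets_per_second * header_bytes * 8
    return codec_bitrate_bps + overhead_bps


if __name__ == "__main__":
    # Compare the shortest and longest frame sizes at the lowest bit rate.
    for frame_ms in (2.5, 20, 60):
        total = network_bitrate_bps(6_000, frame_ms)
        print(f"{frame_ms:>4} ms frames: about {total / 1000:.1f} kbit/s on the wire")
```

Under these assumptions, short frames at low bit rates are dominated by header overhead, which is one reason a codec for the best-effort Internet benefits from adjustable frame sizes.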


Operational Modes

[Figure: Opus operational modes shown over bit rate and frequency band. Labels: speech codec SILK from Skype, audio codec CELT from Xiph.org, hybrid, frequency enhanced.]

Best effort Internet requires a scalable codec

[Figure: application areas. Labels: Telephony, Toll Quality, HD Voice, Telepresence, Music sharing, Stereo Hi-Fi, Live Music, Ultra-low Delay.]

Sound Check

Analog Telephone (60s-70s)

ITU-T P.48 : Specification for an intermediate reference system (300-3400 Hz)

Digital, ISDN (80s)

ITU-T G.711: Pulse code modulation (PCM) of voice frequencies
ITU-T P.830: Annex D - Modified IRS send and receive characteristics

Cellular Phone (90s)

Full rate speech; Transcoding (GSM 06.10 version 8.1.1 Release 1999)

2000s: VoIP is not better

ITU-T G.711 Appendix I: Packet loss concealment, 60 ms, 20% loss rate

2010s: Things are getting better

. SILK (16 kHz, 16 kbit/s)
. SILK (24 kHz, 24 kbit/s)
. PCM (48 kHz, stereo)

Voice sounds good with >16 kHz

Adding DNA to Suzanne Vega

. Reference
. GSM (FR, 8 kbit/s)
. Opus (stereo, 64 kHz, 64 kbit/s)

Prof. Dr. Alexander Carôt

. Computer scientist and musician
. PhD on remote music collaboration

. Uses CELT in Soundjack [http://www.carot.de/soundjack/]

Soundjack Live Session

Wideband Voice Tests by Google 2011

. Formal listening-only tests
. Using MUSHRA scale
  . 100: Perfect
  . 37: Toll quality (MOS value of about 4.5)
  . 0: Bad
. About 20 listeners
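With about 20 listeners rating each condition on the 0 to 100 MUSHRA scale, results are usually reported as a mean score with a confidence interval. The following is a minimal sketch of that summary step. The function name is hypothetical, and the interval uses a normal approximation (factor 1.96); with only 20 listeners a Student-t factor (about 2.09 for 19 degrees of freedom) would give a slightly wider interval.

```python
from math import sqrt
from statistics import mean, stdev


def mushra_summary(scores, z=1.96):
    """Return (mean, half-width of an approximate 95% confidence interval).

    `scores` holds one MUSHRA rating (0..100) per listener for one condition.
    ASSUMPTION: normal approximation via `z`; pass a t-quantile for small panels.
    """
    if len(scores) < 2:
        raise ValueError("need at least two listeners")
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score outside MUSHRA scale: {score}")
    half_width = z * stdev(scores) / sqrt(len(scores))
    return mean(scores), half_width
```

Two conditions whose intervals overlap strongly cannot be ranked reliably from the panel alone, which is worth keeping in mind when reading per-codec bar charts.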

Test Results

Wideband Speech

Stereo Music Results

Formal Tests at Universität Tübingen

. MUSHRA tests on published Opus version
. Covering complete parameter spaces
. Cross check with PEAQ and POLQA
. 300 students willing to do the testing
. Starting March 2013
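A cross check against PEAQ or POLQA typically asks how well the objective scores track the subjective MUSHRA means across conditions. One common way to quantify this is the Pearson correlation coefficient, sketched below. The function name is hypothetical, and the choice of Pearson correlation is an assumption of this sketch rather than the method announced in the slides.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists.

    For a cross check, `xs` could hold per-condition MUSHRA means and `ys`
    the objective scores for the same conditions, in the same order.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

Because the coefficient is unaffected by linear rescaling, the subjective and objective scores can stay on their own scales.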

. We are looking for a partner to help us
  . do the crowd sourcing tests
  . and publish the results in an RFC and a scientific journal


Questions to address

. How good is crowd sourcing compared to formal MUSHRA testing?
. In which quality ranges does crowd testing work well?
. Which questions about testing conditions should be asked before the quality tests?
. What kinds of errors are introduced by crowd testing?
. How well does Opus perform against other codecs?
. Which is better: crowd testing or PEAQ/POLQA?
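One simple way to approach the first and fourth questions is to compare per-condition mean scores from a crowd test with those from a formal lab test. The sketch below reports the systematic offset (bias) and the root-mean-square error between the two. The function name, the dictionary layout, and the condition labels in the usage note are hypothetical, and this metric pair is only one possible choice.

```python
from math import sqrt


def compare_crowd_to_lab(lab_means, crowd_means):
    """Return (bias, rmse) of crowd scores relative to lab scores.

    Both arguments map a condition label to its mean MUSHRA score.
    Only conditions present in both tests are compared.
    """
    shared = sorted(set(lab_means) & set(crowd_means))
    if not shared:
        raise ValueError("no common conditions to compare")
    diffs = [crowd_means[c] - lab_means[c] for c in shared]
    bias = sum(diffs) / len(diffs)                       # systematic offset
    rmse = sqrt(sum(d * d for d in diffs) / len(diffs))  # overall deviation
    return bias, rmse
```

A call such as `compare_crowd_to_lab({"codec_a": 80, "codec_b": 40}, {"codec_a": 70, "codec_b": 50})` would show no overall bias but a sizable deviation, the pattern expected if crowd listeners compress the scale toward the middle.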

Thank you very much

Symonics GmbH
Sand 13
72076 Tübingen
Germany
Phone: +49 7071 568130-0
Fax: +49 7071 568130-9
[email protected]
