RTP Payload Format for the CELT Codec (draft-valin-celt-rtp-profile-00.txt)

IETF 74, March 2009
Greg Maxwell, Jean-Marc Valin
[email protected], [email protected]

About CELT…
• There are two main classes of audio codecs
  – Speech codecs with low to medium quality and low delay
  – Music codecs with high quality and high delay
• CELT aims for both high quality and very low delay
  – Prevents collisions during conversations (higher sense of presence)
  – Reduces or removes the need for acoustic echo cancellation
  – Allows synchronization for live music performance
• Perceptual transform (MDCT) codec
• Developed within the Xiph.Org Foundation
• Reference implementation is open source (BSD-licensed)
• No royalties, avoids known patents in the field

AVT IETF-74, Greg Maxwell <[email protected]>, draft-valin-celt-rtp-profile-00.txt

CELT characteristics
• Sampling rates from 32 kHz to 96 kHz
• Total algorithmic delay from 2 ms to 24 ms (8 ms typical)
• Frame sizes from 64 samples to 512 samples
• One or two channels of audio encoded into a single frame
• Error and loss robustness
  – Monotonically decreasing "bit importance"
• Signaling-free on-the-fly rate adjustment
• Bit-stream "not frozen yet"

Audio codec landscape
[Figure: delay (ms) versus bitrate (kbps/channel) for AAC, MP3, Vorbis, AMR-WB+, G.722.1C, G.723.1, AMR-WB, AAC-LD, G.729, G.722 and CELT, grouped by audio bandwidth (narrowband, wideband, > wideband)]

Codec behavior impacting the draft
• The decoder MUST know
  – The sample rate the sender is using
  – The codec frame size the sender is using
  – The length of each compressed codec frame
  – Whether the encoded frame codes for one or two channels
• Of these, only the compressed length should reasonably change from frame to frame
• Changes to sample rate or frame size require somewhat computationally expensive setup

Frame size
• Power-of-two sizes give the best performance
  – Embedded implementations may only support some sizes
  – Single frame size concurrently
• External factors often drive frame size preferences
• Current draft negotiates using fmtp, requires that the answerer respond with a single supported size, and presumes it will send with that rate
  – This has early media issues

Channel mapping
• Indicates the grouping of audio channels into CELT frames and how the channels are used
• Not all receivers will support multi-channel reception
• Common use cases would have asymmetric configurations
  – Stereo down to conference bridge clients, mono up
• Current draft is simply broken in this regard
  – SDP signals a 'mapping' parameter
  – If it is used like a 'sprop', there is no way to indicate receiver capability
  – Change to having separate capability and sender mode attributes
• Early media problems

Compressed length
• CELT can output any requested number of bytes
• Support for multiple CELT frames per packet requires signaling the distribution of bytes to frames
• Signaled in-band
• CELT compressed lengths at the start of each RTP payload
  – Most common case is short lengths
  – Lengths under 255 bytes use a single byte
  – Longer lengths encode a 0xFF for each 255 bytes of payload, then another byte with the remaining length
• No issues with this approach?
• Is the low overhead in the draft mode worthwhile?
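
The compressed-length encoding described above can be sketched in a few lines. This is only an illustration of the rule as stated on the slide, not code from the draft or from the CELT reference implementation: the function names `encode_length` and `decode_length` are hypothetical, and the sketch assumes that a length of exactly 255 bytes is written as 0xFF followed by a zero byte.

```python
def encode_length(n: int) -> bytes:
    """Encode one compressed-frame length (hypothetical helper)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    full, rest = divmod(n, 255)
    # One 0xFF per full 255 bytes, then a final byte (0..254) with the remainder.
    return b"\xff" * full + bytes([rest])


def decode_length(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (length, next_position) for the length field starting at pos."""
    total = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated length field")
        byte = buf[pos]
        pos += 1
        total += byte
        if byte != 0xFF:  # any byte below 0xFF terminates the field
            return total, pos


# Show the encoded form of a few typical and boundary lengths.
for n in (40, 254, 255, 600):
    print(n, encode_length(n).hex())
```

Under these assumptions a frame shorter than 255 bytes costs one byte of overhead, and a 600-byte frame costs three, which is the low-overhead property the slide asks about.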

Common SDP attributes
• ptime
  – Profile treats this as a receiver-requested minimum packetization interval only
• b=AS:
  – Profile treats this as a receiver-requested maximum bitrate
• These are the simple, conventional uses, with no codec interaction
• No issues here?
• Some implementers appear to have incorrect beliefs about ptime

Open issues
• Early media issues with current negotiation approach
  – The offerer could use distinct payload types with single configurations
  – How acceptable is it to burn payload types for this?
  – Also send in-band?
    • Could be done without continual overhead
• Re-invite not addressed
  – Obvious solution is to recommend different payload types be used when sender parameters change

Future work
• CELT would be more flexible with configuration data (~100 bytes)
  – Expected all receivers would support all configuration data
  – Configuration packet transmitted at regular interval?
  – Incrementally transmitted in-band?
  – Base64-encoded in SDP parameters?
• Freezing the CELT bit-stream
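
To get a feel for the Base64-in-SDP option raised under future work, the sketch below sizes a roughly 100-byte configuration blob once it is Base64-encoded. Everything specific here is an assumption for illustration: the draft defines no such parameter, so the `config` parameter name, the payload type 96, and the helper names are hypothetical, and the blob is dummy data rather than real CELT configuration.

```python
import base64


def config_to_fmtp_line(payload_type: int, config: bytes) -> str:
    """Build a hypothetical fmtp line carrying Base64 configuration data."""
    value = base64.b64encode(config).decode("ascii")
    # "config" is an assumed parameter name, not one defined by the draft.
    return f"a=fmtp:{payload_type} config={value}"


def fmtp_line_to_config(line: str) -> bytes:
    """Recover the configuration bytes from the hypothetical fmtp line."""
    # Split on the first "config=" only; Base64 padding may itself contain "=".
    _, _, value = line.partition("config=")
    return base64.b64decode(value, validate=True)


config = bytes(range(100))          # dummy 100-byte configuration blob
line = config_to_fmtp_line(96, config)
encoded_len = len(line) - len("a=fmtp:96 config=")
print(encoded_len)                  # Base64 characters needed for 100 bytes
```

Base64 expands every 3 bytes to 4 characters, so 100 bytes become 136 characters in the SDP, sent once per offer or answer rather than in every packet.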
