INF SERV – Media Storage and Distribution Systems: Data Formats and Codecs

1/9 – 2003

Why codecs and formats?

! Codecs (coders/decoders) " Determine how information is represented " Important for servers and distribution systems

# Required sending speed

# Amount of loss allowed

# Buffers required

# … ! Formats " Determine how data is stored " Important for servers and distribution systems

# Where is the data?

# Where is the data about the data?

INF 5070 – media servers and distribution systems 2003 Carsten Griwodz & Pål Halvorsen

Media data

! Medium: "Thing in the middle" " here: means to distribute and present information ! Media affect human-computer interaction

! The mantra of multimedia users " Speaking is faster than writing " Listening is easier than reading " Showing is easier than describing

Dependence of Media

! Time-independent media " Text " Graphics " Discrete media

! Time-dependent media " Audio " Video " Animation " Continuous media

! "Continuous" refers to the user’s impression of the data, not necessarily to its representation

! Interdependent media " Multimedia

! Combined video and audio is multimedia - relations must be specified

Properties of a Multimedia System

! Flexibility " Provide mechanisms to handle all kinds of media, in particular, discrete and continuous media " A VCR and a desktop publishing system for text and graphics are not multimedia systems " An editor with voice annotation is a multimedia system

! Integration " Independent media storage " Computer-controlled media combination

! Definition: A multimedia system is characterized by the integrated computer-controlled handling of independent discrete and continuous media

Multimedia: Not Your Ordinary Data

! Multimedia is different from traditional digital data: " High data volume " Continuous streaming " Several related streams " Quality of service

High Data Volume

! Throughput: " Higher volume than for traditional data " Longer transactions than for traditional data " Requires

# Performance and bandwidth

# Resource management techniques

# Compression

! Typical values " Uncompressed video: 140 – 216 Mbit/s " Uncompressed audio (CD): 1.4 Mbit/s " Uncompressed speech: 64 Kbit/s " Compressed audio & video (VoD): down to 1.2 – 4 Mbit/s " Compressed audio & video (Conf.): down to 128 Kbit/s " Compressed speech: down to 6.2 Kbit/s

Coding for distribution

Compression - Necessity

! E.g., video sequence " 25 images/sec. # PAL standard " 3 byte/pixel # YUV (luminance + 2 chrominance values) # RGB (red-green-blue values) " Image resolution 640 * 480 pixel # Data rate = 640 * 480 * 3 Byte * 25/s = 23040000 byte/s ~ 22 MByte/s

o Approx. 1/100 stream over ADSL o Approx. 1/16 stream over Ethernet o Approx. 1/2 stream over Fast Ethernet # Compression is necessary
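The arithmetic above can be checked with a short sketch. The function name and the nominal link rates (2, 10 and 100 Mbit/s) are assumptions for illustration; the slide does not state which link rates its fractions are based on, so the computed fractions only roughly match them.

```python
def raw_video_rate(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps

rate = raw_video_rate(640, 480, 3, 25)   # bytes per second
mbyte = rate / 2**20                     # about 22 MByte/s
mbit = rate * 8 / 1e6                    # the same rate in Mbit/s

# Nominal link rates in Mbit/s (assumed values, not taken from the slides).
links = {"ADSL": 2, "Ethernet": 10, "Fast Ethernet": 100}

# Fraction of one uncompressed stream that each link could carry.
fractions = {name: capacity / mbit for name, capacity in links.items()}
```

No assumed link carries even one full uncompressed stream, which is the point of the slide.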

Compression – General Requirements

! Dependence on application type: " Dialogue mode " Retrieval mode

Compression – Mode Dependent Requirements

! Dialogue and retrieval mode requirements: " Synchronization of audio, video, and other media

! Dialogue mode requirements: " End-to-end delay < 150ms " Compression and decompression in real-time " Symmetric ! Retrieval mode requirements: " Fast forward and backward data retrieval " Random access within 1/2 s " Asymmetric

! We look mainly at retrieval mode!

Compression Categories

Basic Encoding Steps

Run-Length Coding

! Assumption " Long sequences of identical symbols ! Example
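A minimal sketch of run-length coding, assuming each run of identical symbols is stored as a (symbol, count) pair. The function names and the pair layout are illustrative choices, not a format defined in these slides.

```python
from itertools import groupby

def rle_encode(symbols):
    """Replace each run of identical symbols by a (symbol, run length) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(symbols)]

def rle_decode(pairs):
    """Expand (symbol, run length) pairs back into the original sequence."""
    return [sym for sym, count in pairs for _ in range(count)]

encoded = rle_encode("AAAABBBCCD")
decoded = "".join(rle_decode(encoded))
```

The gain depends on the assumption holding: a sequence without repeated symbols produces one pair per symbol and grows instead of shrinking.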

Bit-Plane Coding

! Assumption " Even longer sequences of identical bits ! Example

10,0,6,0,0,3,0,2,2,0,0,2,0,0,1,0, … ,0,0 (absolute)
0,x,1,x,x,1,x,0,0,x,x,1,x,x,0,x, … ,x,x (sign bits)

1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, … ,0,0 (MSB)    (0,1)
0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0, … ,0,0 (MSB-1)  (2,1)
1,0,1,0,0,1,0,1,1,0,0,1,0,0,0,0, … ,0,0 (MSB-2)  (0,0)(1,0)(2,0)(1,0)(0,0)(2,1)
0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0, … ,0,0 (MSB-3)  (5,0)(8,1)

" Up to 20% savings over run-length coding can be achieved

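The bit-plane example can be reproduced with a short sketch. It assumes that each pair is read as (number of zeros before the next 1, flag set on the last 1 of the plane); this reading is consistent with the example but is not spelled out on the slide. The function names are hypothetical and sign bits are ignored.

```python
def bit_planes(values, nbits):
    """Split absolute values into bit planes, most significant plane first."""
    return [[(abs(v) >> b) & 1 for v in values]
            for b in range(nbits - 1, -1, -1)]

def run_eop(plane):
    """Code one plane as (run, end-of-plane) pairs, one pair per 1 bit."""
    ones = [i for i, bit in enumerate(plane) if bit]
    symbols, prev = [], -1
    for k, pos in enumerate(ones):
        # run = zeros since the previous 1; flag is 1 only for the last 1
        symbols.append((pos - prev - 1, int(k == len(ones) - 1)))
        prev = pos
    return symbols

# The first 16 values of the example (the elided tail is all zeros).
values = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0]
coded = [run_eop(plane) for plane in bit_planes(values, 4)]
```

In this sketch an all-zero plane yields an empty list; a real coder would have to signal that case explicitly.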

! Assumption " Some symbols occur more often than others " E.g., character frequencies of the English language

! Fundamental principle " Frequently occurring symbols are coded with shorter bit strings

Huffman Coding

! Example " Characters to be encoded:

# A, B, C, D, E " Probability to occur:

# p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15

Huffman

! Table and example of application to data stream
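A minimal sketch of Huffman code construction for the probabilities of the example above. The function name is hypothetical, and the exact codewords depend on how ties are broken, so they may differ from the table on the original slide; the code lengths and the average length do not.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a prefix code by repeatedly merging the two least likely subtrees."""
    tiebreak = count()  # unique counter so that the dicts are never compared
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, zeros = heapq.heappop(heap)
        p1, _, ones = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in zeros.items()}
        merged.update({s: "1" + c for s, c in ones.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"A": 0.3, "B": 0.3, "C": 0.1, "D": 0.15, "E": 0.15}
code = huffman_code(probs)

# Expected number of bits per symbol under this code.
avg_len = sum(probs[s] * len(code[s]) for s in probs)
```

For these probabilities the average comes to 2.25 bits per symbol, compared with 3 bits for a fixed-length code over five symbols.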

JPEG

! “JPEG”: Joint Photographic Experts Group

! International Standard: " For digital compression and coding of continuous-tone still images:

# Gray-scale

# Color " Since 1992

! Joint effort of: " ISO/IEC JTC1/SC2/WG10 " Commission Q.16 of CCITT SGVIII

! Compression rate of 1:10 yields reasonable results

JPEG

! Very general compression scheme

! Independence of " Image resolution " Image and pixel aspect ratio " Color representation " Image complexity and statistical characteristics

! Well-defined interchange format of encoded data

! Implementation in " Software only " Software and hardware

! “Motion JPEG” for video compression " Sequence of JPEG-encoded images

JPEG

! Sequence of compression steps " Different resolutions possible " Lossy or lossless mode

# factor ~1.6:1 " Symmetrical codec

JPEG – Baseline Mode: Quantization

! Use of quantization tables for the DCT coefficients " Map interval of real numbers to one integer number " Allows a different granularity for each coefficient
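A minimal sketch of table-based quantization, assuming each DCT coefficient is divided by its table entry and rounded to the nearest integer. The function names and the 2x2 values are made up for illustration; JPEG works on 8x8 blocks.

```python
def quantize(coeffs, table):
    """Map each real coefficient to an integer level using its own step size."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Reconstruct approximate coefficients; the rounding error is lost."""
    return [[lvl * q for lvl, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

coeffs = [[-415.0, 30.0], [12.0, -8.0]]   # illustrative DCT coefficients
table = [[16, 11], [12, 14]]              # illustrative step sizes
levels = quantize(coeffs, table)
approx = dequantize(levels, table)
```

A larger table entry maps a wider interval of coefficient values to the same integer, which is how the granularity is chosen per coefficient.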

JPEG – 4 Modes of Compression

Motion JPEG

! Use series of JPEG frames to encode video

! Pro " Lossless mode – editing advantage " Frame-accurate seeking – editing advantage " Arbitrary frame rates – playback advantage " Arbitrary frame skipping – playback advantage " Scaling through progressive mode – distribution advantage " Min transmission delay = 1/framerate – conferencing advantage " Supported by popular frame grabbers

! Contra " Series of JPEG-compressed images " No standard, no specification # Worse, several competing quasi-standards " No relation to audio " No inter-frame compression

H.261 (px64)

! International Standard " for video conferences at p x 64kbit/s (ISDN):

# Real-time encoding/decoding, max. signal delay of 150ms

# Constant data rate ! Intraframe coding " DCT as in JPEG baseline mode ! Interframe coding, motion estimation

" Search of similar macroblock in previous image and compare

# Position of this macroblock defines motion vector

# Difference between similar macroblocks
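The macroblock search above can be sketched as an exhaustive block match. The sum of absolute differences (SAD) as the similarity measure, the search window, and all names are assumptions for illustration; the slides do not say how the encoder has to search.

```python
def sad(prev, cur, px, py, cx, cy, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(prev[py + j][px + i] - cur[cy + j][cx + i])
               for j in range(n) for i in range(n))

def motion_vector(prev, cur, cx, cy, n, search):
    """Search prev around (cx, cy) for the block most similar to cur's block."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px <= width - n and 0 <= py <= height - n:
                cost = sad(prev, cur, px, py, cx, cy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best  # (remaining difference, dx, dy)

# A 2x2 bright block moves from (1, 1) in prev to (3, 2) in cur.
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for j in range(2):
    for i in range(2):
        prev[1 + j][1 + i] = 9
        cur[2 + j][3 + i] = 9

result = motion_vector(prev, cur, 3, 2, 2, 3)
```

The returned offset is the motion vector; the remaining difference between the two blocks is what still has to be coded.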

MPEG (Moving Picture Experts Group)

! International Standard: " Compression of audio and video for playback (1.5 Mbit/s): " Real-time decoding ! Sequence of I-, P-, and B-Frames:

! Random access " at I-frames " at P-frames: i.e. decode previous I-frame first " at B-frame: i.e. decode I and P-frames first
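The cost of random access can be sketched by listing which frames must be decoded to show a given frame. The sketch assumes frames are given in display order as a string of frame types starting with an I-frame, that a P-frame depends on the preceding I- or P-frame, and that a B-frame depends on the nearest I- or P-frame on each side. The function names are hypothetical.

```python
def anchor_chain(gop, idx):
    """Indices of the I-frame and the P-frames needed to decode the anchor at idx."""
    chain = []
    while idx >= 0:
        if gop[idx] in "IP":
            chain.append(idx)
            if gop[idx] == "I":
                break
        idx -= 1
    return chain[::-1]

def frames_to_decode(gop, target):
    """All frame indices that must be decoded to display frame target."""
    if gop[target] != "B":
        return anchor_chain(gop, target)
    # Assumes at least one I- or P-frame precedes the B-frame.
    prev_anchor = max(i for i in range(target) if gop[i] != "B")
    later = [i for i in range(target + 1, len(gop)) if gop[i] != "B"]
    needed = set(anchor_chain(gop, prev_anchor))
    if later:
        needed.update(anchor_chain(gop, later[0]))
    return sorted(needed | {target})

gop = "IBBPBBPBBI"
at_i = frames_to_decode(gop, 0)
at_p = frames_to_decode(gop, 6)
at_b = frames_to_decode(gop, 7)
```

Only I-frames can be decoded on their own, which is why the distance between I-frames bounds the random access time.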

MPEG-2

! From MPEG-1 to MPEG-2 " Improvement in quality

# From VCR to TV to HDTV " No CD-ROM based constraints

# Higher data rates o MPEG-1: about 1.5 MBit/s o MPEG-2: 2-100 MBit/s ! Evolution " 1994: International Standard " Also later known as H.262 " Prominent role for digital TV in DVB (digital video broadcasting) and DVD (digital video disk)

# Commercial MPEG-2 realizations available

MPEG-2

! Beyond MPEG-1: " Higher quality encoding " Higher data rates " Interleaved modes

! Use cases " Broadcast quality production # DVB-T: Terrestrial # DVB-S: Satellite # DVB-C: Cable " Program Stream # for post-processing, storage, and DVD distribution " Transport Stream # for broadcasting, error resilience

! Scaling: " Signal-to-Noise Ratio (SNR) scaling - progressive compression, error correcting codes " Spatial scaling - several pixel resolutions " Temporal scaling - frame dropping

MPEG-4

! MPEG-4 (ISO 14496) originally " Targeted at systems with very scarce resources " To support applications like

# Mobile communication

# Videophone and E-mail " Max. data rates and dimensions (roughly)

# Between 4800 and 64000 bits/s

# 176 columns x 144 lines x 10 frames/s

! Further demand " To provide enhanced functionality to allow for analysis and manipulation of image contents

MPEG-4

! Hence: find standardized ways to " Represent units of aural, visual or audiovisual content

# "audio/visual objects" or AVOs

# object coding independent of other objects, surroundings and background

# natural and synthetic objects " Compose these objects together

# i.e. creation of compound objects that form audiovisual scenes " Multiplex and synchronize the data associated with AVOs

# for transportation over network channels providing a QoS (Quality-of-Service) " Interact with the audiovisual scene generated at the decoder’s site

MPEG-4: Scope

! Definition of " „System Decoder Model“

# specification for decoder implementations " Description language

# binary syntax of an AV object’s bitstream representation

# scene description information " Corresponding concepts, tools and algorithms, especially for

# content-based compression of simple and compound audiovisual objects

# manipulation of objects

# transmission of objects

# random access to objects

# animation

# scaling

# error robustness

MPEG-4: Scope

! Targeted bit rates for video and audio: " VLBV core

# „Very Low Bit-rate Video“

# 5 - 64 Kbit/s

# image sequences with CIF resolution and up to 15 frames/s " Higher-quality video

# 64 Kbit/s - 4 Mbit/s

# quality like digital TV " Natural audio coding

# 2 - 64 Kbit/s

MPEG-4: Video and Image Encoding

! Encoding / decoding of " Rectangular images and video

# coding similar to MPEG-1/2

# motion prediction

# texture coding " Images and video of arbitrary shape

# as done in conventional approach o 8x8 DCT or shape-adaptive DCT

# plus coding of shape and transparency information ! Encoder " Must generate timing information

# speed of the encoder clock = time base

# desired decoding times and/or expiration times o by using time stamps attached to the stream " Can specify the minimum buffer resources needed for decoding

MPEG-4: Composition of Scenes

! Scene description includes: " Tree to define hierarchical relationships between objects

" Objects’ positions in space and time

# by converting the objects’ local coordinate system into a global coordinate system " Attribute value selection

# e.g. pitch of sound, color, texture, animation parameters " Description based on some VRML concepts

# VRML = „Virtual Reality Modeling Language“ " Interaction with scenes

# e.g. change viewing point, drag object, start/stop streams, select language

MPEG-4: Example of a Composition

MPEG-4: Synthetic Objects

! Visual objects: " Virtual parts of scenes

# e.g. virtual background " Animation

# e.g. animated faces

! Audio objects: " „Text-to-speech“

# speech generation from given text and prosodic parameters

# face animation control " „Score driven synthesis“

# music generation from a score

# more general than MIDI " Special effects

MPEG-4: Layered Networking Architecture

! DMIF „Delivery Multimedia Integration Framework“ " Allows multi-party sessions to be established

# interaction with o remote interactive peers o broadcast systems o storage systems

# establishment of channels with specific QoS and bandwidths " Controls

# FlexMux layer

# TransMux layer

MPEG-4: Error Handling

! Mobile communication: " Low bit-rate (< 64 Kbps) " Error-prone ! MPEG-4 concepts for error handling: " Resynchronization

# enables receiver to „tune in“ again

# based on markers within bitstream " Data recovery

# enables receiver to reconstruct lost data

# encode data in an error-resilient manner " Error concealment

# enables receiver to bridge gaps in data

# e.g. by repeating parts of old frames

Network-aware coding

! Adapt to reality of the Internet " Content

# Is created once, off-line

# Is sent many times, under different circumstances " No guarantees concerning

# Throughput

# Jitter

# Packet loss " Sending rate

# Must adhere to rules

# Often: don’t send more than TCP would

" Can’t send at the best available encoding rate

Approaches

! Simulcast ! Scalable coding " SNR Scalability " Temporal Scalability " Spatial Scalability " Fine Grained Scalability ! Multiple Description Coding

Simulcast

! Choose a set of sending rates " During content creation

# Encode content in best possible quality below that sending rate " During transmission

# Choose version with the best admissible quality

[Figure: quality vs. sending rate. The curve of best possible quality at each possible sending rate is compared with 3 simulcast rates and a single rate codec.]
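Version selection during transmission can be sketched as follows, assuming the sender knows the currently admissible sending rate. The function name and the three rates in Kbit/s are hypothetical.

```python
def pick_version(simulcast_rates, available_rate):
    """Return the highest pre-encoded rate that fits, or None if none fits."""
    admissible = [r for r in simulcast_rates if r <= available_rate]
    return max(admissible) if admissible else None

rates = [128, 512, 1500]            # Kbit/s, chosen during content creation
chosen = pick_version(rates, 800)   # admissible sending rate of 800 Kbit/s
```

With only a few versions, the chosen quality follows the admissible rate in coarse steps, and any rate between two versions stays unused.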

Scalable coding

! Typically used as Layered coding

! A base layer " Provides basic quality " Must always be transferred

! One or more enhancement layers " Improve quality " Transferred if possible

[Figure: quality vs. sending rate. The curve of best possible quality at each possible sending rate is compared with the base layer and the enhancement layer.]

Temporal Scalability

! Frames can be dropped " In a controlled manner " Frame dropping does not violate dependencies " Low gain example: B-frame dropping in MPEG-1

SNR Scalability

! SNR – signal-to-noise ratio ! Idea " Base layer

# Is regularly DCT encoded

# A lot of data is removed using quantization " Enhancement layer is regularly DCT encoded

# Run Inverse DCT on quantized base layer

# Subtract from original

# DCT encode the result " If enhancement layer arrives at client

# Add base and enhancement layer before running Inverse DCT
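The layering idea can be sketched on a list of DCT coefficients. As a simplification, the sketch stays in the coefficient domain and skips the DCT and inverse DCT steps; this relies on the DCT being linear and is not the exact procedure on the slide. The step sizes, values, and function names are hypothetical.

```python
def encode_layers(coeffs, base_step, enh_step):
    """Coarsely quantize the base layer, then quantize what it lost more finely."""
    base = [round(c / base_step) for c in coeffs]
    residual = [c - b * base_step for c, b in zip(coeffs, base)]
    enh = [round(r / enh_step) for r in residual]
    return base, enh

def decode_layers(base, enh, base_step, enh_step):
    """Reconstruct coefficients from the base layer, plus enhancement if present."""
    if enh is None:
        return [b * base_step for b in base]
    return [b * base_step + e * enh_step for b, e in zip(base, enh)]

coeffs = [100.0, -37.0, 12.0, 3.0]
base, enh = encode_layers(coeffs, base_step=16, enh_step=2)
base_only = decode_layers(base, None, 16, 2)
both = decode_layers(base, enh, 16, 2)
```

The client can decode either result, and the error of the combined reconstruction is bounded by half the enhancement step instead of half the base step.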

Spatial Scalability

! Idea " Base layer

# Downsample the original image (code only 1 pixel instead of 4)

# Send like a lower resolution version " Enhancement layer

# Subtract base layer pixels from all pixels

# Send like a normal resolution version " If enhancement layer arrives at client

# Decode both layers

# Add layers

Example (2x2 block):
Original pixels:   72 61 / 75 83
Base layer:        73 (less data to code)
Enhancement layer: -1 -12 / 2 10 (better compression due to low values)
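The two layers can be sketched for an image with even width and height. Downsampling by the rounded mean of each 2x2 block is an assumption; it is consistent with the base value 73 in the example, but the slide does not name the filter. The function names are hypothetical.

```python
def split_layers(image):
    """Return (base, enhancement): one base pixel per 2x2 block plus differences."""
    base = []
    enh = [row[:] for row in image]
    for y in range(0, len(image), 2):
        base_row = []
        for x in range(0, len(image[0]), 2):
            block = [image[y + j][x + i] for j in (0, 1) for i in (0, 1)]
            mean = round(sum(block) / 4)
            base_row.append(mean)
            for j in (0, 1):
                for i in (0, 1):
                    enh[y + j][x + i] = image[y + j][x + i] - mean
        base.append(base_row)
    return base, enh

def merge_layers(base, enh):
    """Add each base pixel back to the four enhancement values it covers."""
    return [[enh[y][x] + base[y // 2][x // 2] for x in range(len(enh[0]))]
            for y in range(len(enh))]

original = [[72, 61], [75, 83]]
base, enh = split_layers(original)
restored = merge_layers(base, enh)
```

A client that receives only the base layer shows the lower-resolution picture; with both layers the original pixels are recovered.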

Fine Grained Scalability

! Idea " Cut off compressed tail bits of samples ! Base layer " As in SNR coding ! Enhancement layer " Use bit-plane coding for enhancement layer instead of run-level coding " Cut tail bits off until data rate is reached

Fine Grained Scalability

(0,1) (2,1) (0,0)(1,0)(2,0)(1,0)(0,0)(2,1) (5,0)(8,1)

[Figure: quality vs. sending rate. The goal of FGS is to follow the curve of best possible quality at each possible sending rate.]

Multiple Description Coding

! Idea " Encode data in two streams " Each stream has acceptable quality " Both streams combined have good quality " The redundancy between both streams is low

! Problem " The same relevant information must exist in both streams " Old problem: started for audio coding in telephony " Currently a hot topic

Multimedia File Formats Overview

! File formats " Define the storage of media data on disks " Specify synchronization " Specify timing " Contain metadata

! They allow " Interchange of data without interpretation

# Copying

# Platform independence " Management " Editing " Retrieval for presentation

! Needed for all asynchronous applications

Examples

! Streaming format " File format and wire format are identical " MPEG-1, DVI ! Streamable format " File format specifies wire format(s) " MPEG-4, Quicktime, Windows Media, Real Video

RTP Recorder Solution

! Pragmatic generic solution " Stores and sends all MBone sessions " No interpretation of data " Interpretation of network timestamps " Derivation of synchronization information

[Figure: senders and receivers connected to the MBone VCR, which keeps an index file and a data file.]

Stored Motion JPEG

! Motion JPEG Chunk File Format (UC Berkeley) " Specifies entire clip’s length in s+ns " Contains sequence of images " Each image in Independent JPEG Group’s JFIF format ! AVI MJPEG DIB (Microsoft) " Supports audio interleaving " Time-stamped data chunks " One frame per AVI RIFF data chunk " Hack for file size > 1GB ! Quicktime (Apple) " Dedicated tracks for interleaving and timing " One frame per field " Several fields per sample " Formats A: full JFIF images, B: QT headers and data only

Quicktime File Format

[Figure: Quicktime file structure. Labels: Quicktime file (duration, default rate, copyright, author); track; track choice; edit list; media; sample (size, location, order).]

! Run-time choice of tracks " availability of codecs " bandwidth " language

MPEG-4 File Format

[Figure: two MP4 file layouts. First layout: one MP4 file with a moov atom (1st OD, trak (video), trak (OD), trak (BIFS), trak (audio), other atoms, ...) and an mdat atom holding interleaved, time-ordered BIFS, OD, video, and audio access units. Second layout: an MP4 file and a separate media file, with one moov and several mdat atoms holding video and audio access units and BIFS access units; some units may be unordered and some units may be unused.]

Other File Formats

! Real Video " Not published " Supports various codecs " Supports various encoding formats per file " Supports dynamic selection " Supports dynamic scaling ("stream thinning") ! AVI " AVI is published " Uses Resource Interchange File Format (RIFF) " Supports various codecs ! ASF / Windows Media File Format " Submitted as MPEG-4 proposal (but refused) " ASF files can include Windows binary code " ASF is patented in the USA

Summary

! Storage and distribution system must support: " Discrete media such as text and graphics " Continuous media such as audio and video " Interrelated Multiplexed media

! Encoding Format and File Format must be distinguished " Separation of file format and wire format " Streamable files vs. streaming format

! Trend towards " Formats that define presentation environments " Interaction of encoding format and application " Interaction of client and server

! Influence on Distribution Systems?
