
CS 414 – Multimedia Systems Design
Lecture 13 – MPEG-2/MPEG-4

Klara Nahrstedt Spring 2014

Administrative

 MP1 – deadline February 19, 5pm (Wednesday)
 MP1 Demonstration – February 21, 5–7pm (Friday); sign-up sheet on Piazza
 HW1 – posted on February 24 (Monday); individual effort!
 HW1 – deadline on March 3 (Monday)
 Midterm – March 7 (Friday)

Outline
 MPEG-2/MPEG-4
 Reading:
 Media Coding, Sections 7.7.2 – 7.7.5
 http://www.itu.int/ITU-D/tech/digital-broadcasting/kiev/References/mpeg-4.
 http://en.wikipedia.org/wiki/H.264

MPEG-2 Extension

 MPEG-1 was optimized for CD-ROM and applications at 1.5 Mbps (video strictly non-interleaved)
 MPEG-2 adds to MPEG-1:
 more chroma formats: 4:2:2, 4:4:4 (MPEG-1 supports only 4:2:0)
 progressive and interlaced coding
 four scalable modes: spatial scalability, data partitioning, SNR scalability, temporal scalability
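The idea behind spatial scalability can be sketched in a few lines. This is an illustrative toy (nearest-neighbour sampling, not the standard's actual filters): the base layer is a downsampled frame a low-resolution decoder can use on its own, and the enhancement layer carries the residual against the upsampled base.

```python
# Toy sketch of spatial scalability: base layer = downsampled frame,
# enhancement layer = residual against the upsampled base.
def downsample(frame):                      # keep every 2nd sample per row/col
    return [row[::2] for row in frame[::2]]

def upsample(frame):                        # nearest-neighbour reconstruction
    up = []
    for row in frame:
        wide = [v for v in row for _ in (0, 1)]
        up += [wide, list(wide)]
    return up

def split_layers(frame):
    base = downsample(frame)
    pred = upsample(base)
    resid = [[o - p for o, p in zip(orow, prow)]
             for orow, prow in zip(frame, pred)]
    return base, resid                      # base-only decoders use `base`;
                                            # full decoders add `resid` back

frame = [[10, 12, 14, 16],
         [11, 13, 15, 17],
         [20, 22, 24, 26],
         [21, 23, 25, 27]]
base, resid = split_layers(frame)
rec = [[p + r for p, r in zip(prow, rrow)]
       for prow, rrow in zip(upsample(base), resid)]
assert rec == frame                         # enhancement layer restores full res
```

The same layering idea underlies SNR scalability (residual in quality rather than resolution) and temporal scalability (extra frames rather than extra pixels).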

MPEG-1/MPEG-2

 Pixel-based representation of content takes place at the encoder
 Lack of support for content manipulation, e.g., removing a date stamp from a video, or turning off the “current score” overlay in a live game

 Supporting manipulation and interaction requires that the video be aware of its own content

MPEG-4 Compression

Interact With Visual Content

Original MPEG-4

 Conceived in 1992 to address very low bit rate audio and video (64 Kbps)
 Required quantum leaps in compression, beyond statistical- and DCT-based techniques; the committee felt this was possible within 5 years

 The quantum leap did not happen (so MPEG-4 still uses DCT-based techniques)

The “New” MPEG-4
 Support object-based features for content

 Enable dynamic rendering of content: defer composition until decoding

 Support convergence among digital video, synthetic environments, and the Web

MPEG-4 Components
 Systems – defines architecture, multiplexing structure, syntax
 Video – defines video coding for animation of synthetic and natural hybrid video (Synthetic/Natural Hybrid Coding)
 Audio – defines audio/speech coding and Synthetic/Natural Hybrid Coding, such as MIDI and text-to-speech synthesis integration
 Conformance Testing – defines compliance requirements for MPEG-4 bitstreams and decoders
 Technical Report
 DSM-CC Multimedia Integration Framework – defines the Digital Storage Media Command & Control Multimedia Integration Format; specifies the merging of broadcast, interactive and conversational multimedia for set-top boxes and mobile stations

MPEG-4 Example

MPEG-4 Example (ISO N3536)


Daras, P. MPEG-4 Authoring Tool, J. Applied Signal Processing, 9, 1-18, 2003

Interactive Drama

http://www.interactivestory.net

MPEG-4 Characteristics and Applications

Example of MPEG-4 Scene

Source: http://mpeg.chiariglione.org/standards/mpeg-4/mpeg-4.htm

Media Objects

 In MPEG-4, each element of a scene is called a media object: real and synthetic images; natural and synthetic audio; animated faces; interaction

 Compose media objects into a hierarchical representation: compound, dynamic scenes

Logical Structure of Scene

Example of Composed Scene

Sprite Object Player Object

Composed Scene

Video Syntax Structure

New MPEG-4 Aspect: Object-based layered syntactic structure

Content-based Interactivity
 Achieves different qualities for different objects, with fine granularity in spatial resolution, temporal resolution and decoding complexity
 Needs coding of video objects with arbitrary shapes
 Scalability:
 spatial and temporal scalability
 needs more than one layer of information (base and enhancement layers)

Examples of Base and Enhancement Layers

Coding of Objects
 Each VOP (Video Object Plane) corresponds to an entity that, after being coded, is added to the bitstream
 Together with each VOP, the encoder sends composition information: where and when each VOP is to be displayed
 Users are allowed to change the composition of the entire displayed scene by interacting with the composition information

Spatial Scalability

A VOP that is temporally coincident with an I-VOP in the base layer is encoded as a P-VOP in the enhancement layer. A VOP that is temporally coincident with a P-VOP in the base layer is encoded as a B-VOP in the enhancement layer.
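The coincidence rule above reduces to a small lookup. A simplified sketch (real encoders may also insert additional enhancement-layer VOPs between the coincident ones):

```python
# Enhancement-layer VOP type chosen from the temporally coincident
# base-layer VOP type (simplified spatial-scalability rule).
ENHANCEMENT_TYPE = {"I": "P",   # coincident with base-layer I-VOP -> P-VOP
                    "P": "B"}   # coincident with base-layer P-VOP -> B-VOP

base_layer = ["I", "P", "P", "P"]
enh_layer = [ENHANCEMENT_TYPE[t] for t in base_layer]
assert enh_layer == ["P", "B", "B", "B"]
```

The enhancement layer never needs I-VOPs at these positions, since it can always predict from the (upsampled) base layer.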

Examples of Base and Enhancement Layers

Temporal Scalability

High-Level Codec for Generalized Scalability

Composition (cont.)

 Encode objects in separate channels: encode each using the most efficient mechanism; transmit each object in a separate stream
 Composition takes place at the decoder, rather than at the encoder: requires a binary scene description (BIFS)
 BIFS is a low-level language for describing hierarchical, spatial, and temporal relations

MPEG-4 Part 11 – Scene Description
 BIFS – Binary Format for Scenes
 Coded representation of the spatio-temporal positioning of audio-visual objects, as well as their behavior in response to interaction
 Coded representation of synthetic 2D and 3D objects that can be manifested audibly and/or visibly
 BIFS is the MPEG-4 scene description protocol used to:
 compose MPEG-4 objects
 describe interaction with MPEG-4 objects
 animate MPEG-4 objects
 BIFS language framework – XMT (textual representation of multimedia content using XML)
 accommodates SMIL, W3C Scalable Vector Graphics, and VRML (now X3D)

MPEG-4 Rendering

Major Components of an MPEG-4 Terminal (ISO N3536)

Source: http://mpeg.chiariglione.org/standards/mpeg-4/mpeg-4.htm

Integration and Synchronization of Multiple Streams

Source: http://mpeg.chiariglione.org/technologies/mpeg-4/mp04-bifs/index.htm

Synchronization

 Global timeline with high-resolution units, e.g., 600 units/sec
 Each continuous track specifies its relation to the timeline, e.g., if a video is 30 fps, then a frame should be displayed every 33 ms
 Other tracks specify a start/end time
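The mapping from frame rate to timeline positions is simple arithmetic. A minimal sketch (the helper name is illustrative, not an MPEG-4 API): with a timescale of 600 units/sec and video at 30 fps, each frame advances the timeline by 600/30 = 20 units, i.e., about 33 ms.

```python
# Map media frames onto a global timeline: 600 units/sec, 30 fps video.
def frame_timestamps(fps, timescale, num_frames):
    """Return (timeline_units, milliseconds) for each frame index."""
    units_per_frame = timescale / fps          # 600 / 30 = 20 units per frame
    return [(round(i * units_per_frame),
             round(i * 1000 / fps, 2)) for i in range(num_frames)]

ts = frame_timestamps(fps=30, timescale=600, num_frames=4)
# frame 0 at unit 0 (0 ms), frame 1 at unit 20 (33.33 ms), frame 2 at unit 40, ...
```

A high-resolution timescale such as 600 exists so that common frame rates (24, 25, 30 fps) all land on integer unit counts.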

Data Recovery

 Reversible Variable Length Codes (RVLC): variable-length code words designed so that they can be read in both the forward and reverse directions; after a transmission error, the decoder can parse backward from the next resynchronization marker and recover data that would otherwise be discarded
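A toy RVLC illustrates the property (the codebook is illustrative, not the MPEG-4 tables): palindromic codewords give a set that is prefix-free in both reading directions, so the same greedy decoder works forward and backward.

```python
# Toy RVLC: every codeword is a palindrome, and the set is prefix-free,
# so the bitstream can be parsed from either end.
CODE = {"A": "0", "B": "11", "C": "101", "D": "1001"}
INV = {v: k for k, v in CODE.items()}

def encode(symbols):
    return "".join(CODE[s] for s in symbols)

def decode(bits):
    """Greedy prefix decode, reading left to right."""
    out, word = [], ""
    for b in bits:
        word += b
        if word in INV:
            out.append(INV[word])
            word = ""
    assert word == "", "truncated codeword"
    return out

def decode_backward(bits):
    """Decode from the end: reverse the bits and decode with the reversed
    codebook (identical here, since every codeword is a palindrome)."""
    return decode(bits[::-1])[::-1]

bits = encode(list("BADCAB"))
assert decode(bits) == list("BADCAB")
assert decode_backward(bits) == list("BADCAB")
```

In an error-resilient decoder, the forward pass runs until the corrupted region and the backward pass recovers the tail, bounding the damage to the bits in between.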

QP – quantization parameter
HEC – header extension code
MB – macroblock

MPEG-4 Parts

 MPEG-4 Part 2:
 includes the Advanced Simple Profile, used by codecs such as QuickTime 6
 MPEG-4 Part 10 (MPEG-4 AVC, also called H.264):
 used by codecs such as QuickTime 7
 used by high-definition video media such as Blu-ray Disc

MPEG-4 System Layer Model

SL – sync layer
DMIF – Delivery Multimedia Integration Framework
Source: http://mpeg.chiariglione.org/standards/mpeg-4/mpeg-4.htm

MPEG-4 File with Media Data

OD – Object Descriptor
BIFS – Binary Format for Scenes
mdat – media data atoms
Source: http://mpeg.chiariglione.org/standards/mpeg-4/mpeg-4.htm

Interaction as Objects

 Change colors of objects
 Toggle visibility of objects
 Navigate to different content sections
 Select from multiple camera views; change the current camera angle
 Standardizes content and interaction, e.g., for broadcast HDTV and stored DVD

Hierarchical Model

 Each MPEG-4 movie is composed of tracks:
 each track is composed of media elements (one reserved for BIFS information)
 each media element is an object
 each object is audio, video, a sprite, etc.
 Each object specifies:
 spatial information relative to a parent
 temporal information relative to the global timeline
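The parent-relative spatial model above can be sketched as a small scene tree (class and field names are illustrative, not the BIFS node set): each object stores an offset relative to its parent, and absolute coordinates come from walking up the tree.

```python
# Minimal scene tree: positions are relative to the parent object;
# absolute screen coordinates accumulate offsets up to the root.
class MediaObject:
    def __init__(self, name, dx, dy, parent=None):
        self.name, self.dx, self.dy, self.parent = name, dx, dy, parent

    def absolute_pos(self):
        x, y = self.dx, self.dy
        node = self.parent
        while node is not None:
            x, y = x + node.dx, y + node.dy
            node = node.parent
        return (x, y)

scene  = MediaObject("scene", 0, 0)
group  = MediaObject("newscaster group", 100, 50, parent=scene)
sprite = MediaObject("logo sprite", 10, 5, parent=group)
assert sprite.absolute_pos() == (110, 55)   # moving the group moves the sprite
```

This is what makes composed scenes cheap to manipulate: repositioning one parent node retargets every descendant object without re-encoding any of them.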

Conclusion

 Many MPEG-4 examples with interactive capabilities
 Content-based interactivity
 Scalability
 Sprite coding
 Improved compression efficiency (improved quantization)
 Universal access:
 re-synchronization
 data recovery
 error concealment
