Multimedia Information Systems

Multimedia Compression-MPEG

Multimedia Compression

❚ Fundamentals of Data Compression
❚ Image Compression & JPEG
❚ Video Compression & MPEG-Video
❚ References
  ❙ V. O. K. Li & W. J. Liao, Distributed Multimedia Systems, Proceedings of the IEEE, Vol. 85, No. 7, 1997. (1083-1089)
  ❙ D. Le Gall, MPEG: A Video Compression Standard for Multimedia Applications, CACM, Vol. 34, No. 4, 1991.
  ❙ http://www.cis.ohio-state.edu/hypertext/faq/usenet/compression-faq
  ❙ http://bmrc.berkeley.edu/frame/research/mpeg/mpegfaq.thml

Standards

❚ Important for communications
❚ Customers prefer standards (freedom to choose)
❚ Increase volumes and bring down the cost of service and SW/HW
❚ Reduce the risk of deploying new technology
❚ Major players often participate
❚ Facilitate development on a common background
❚ Provide research opportunities

Types of Standards

❚ Industrial/Commercial standards
  ❙ Mutual agreement among companies
  ❙ May become de facto standards
❚ Voluntary standards
  ❙ By volunteers in open committees
  ❙ Based on consensus
  ❙ Market driven
  ❙ Stay ahead of technology

Global Standards

❚ International
  ❙ ITU: International Telecommunication Union (UN)
    ◆ ITU-T: ITU Telecommunication Standardization Sector (formerly CCITT)
    ◆ ITU-R: ITU Radiocommunication Sector (formerly CCIR)
  ❙ ISO: International Organization for Standardization
  ❙ IEC: International Electrotechnical Commission
  ❙ JTC1: Joint Technical Committee on Information Technology
❚ National
  ❙ ANSI: American National Standards Institute

Organization of an ISO Standards Body

❚ Group: WG1 (JPEG), WG11 (MPEG)
❚ Convenor: Daniel Lee (JPEG), Leonardo Chiariglione (MPEG)
❚ Sub-Group: Video, System, Audio and Conformance
❚ Ad Hoc Group: Coding Efficiency, Encoder Optimization
❚ NB: National Body Delegates
❚ HoD: Head of Delegation
❚ Observer

ISO/IEC JTC1 SC29

❚ Subcommittee (SC) 29
  ❙ Working Group (WG) 1
    ◆ Joint Bi-level Image Experts Group (JBIG)
    ◆ Joint Photographic Experts Group (JPEG)
  ❙ WG11
    ◆ Moving Picture Experts Group (MPEG)
  ❙ WG12
    ◆ Multimedia and Hypermedia Experts Group (MHEG)

How Do Standards Work?

❚ Schedule
  ❙ 3 to 5 one-week meetings in different nations each year
  ❙ 300 to 400 delegates from around the world
  ❙ 200 companies from over 50 nations
  ❙ A final standard in about 4-5 years

Proposal Review Process

❚ Call for proposals
❚ CE: Core Experiments Process
  ❙ Complete descriptions with at least one independent verification
  ❙ One functionality, one tool, as reviewed by peers
  ❙ Consensus-based decision process at AHG and SG level
❚ CE: Core Experiment Description
  ❙ A proposal that is relevant and is supported by two companies
❚ VM: Verification Model
  ❙ The best proposal is admitted to the VM for everyone to implement
  ❙ The new reference for the best performance
  ❙ The proposal needs to be challenged by incoming proposals

Proposal Review Process (Cont.)

❚ WD: Working Draft
❚ CD: Committee Draft
  ❙ All the doors will be frozen
  ❙ First-round vote by National Bodies with comments
❚ FCD: Final Committee Draft
❚ DIS: Draft International Standard
  ❙ Second-round vote by National Bodies without comments
❚ IS: International Standard

Why Do Companies Work on Standards?

❚ Interoperability: war of formats (VHS vs. Beta)
❚ Patent royalties
  ❙ Licensing fee for an MPEG-2 box: US$4
  ❙ Total licensing fee for DVD: US$10
  ❙ Big companies can avoid being taxed by other companies
  ❙ $250 million per year for RCA patent profiles
❚ Create new markets
  ❙ VCD: Video Compact Disk
  ❙ DVD: Digital Versatile Disk
  ❙ DBS: Direct Broadcast Satellite
  ❙ HDTV or DVB (digital video broadcast)

MPEG

❚ Moving Picture Experts Group, ISO/IEC JTC1/SC29/WG11
❚ MPEG Standards
  ❙ MPEG-1 (ISO/IEC 11172, Nov. 92)
  ❙ MPEG-2 (ISO/IEC 13818, Nov. 94)
  ❙ MPEG-4 (ISO/IEC 14496, Oct. 98)
  ❙ MPEG-7 (ongoing)
  ❙ MPEG-21 (ongoing)
❚ Only the bit stream syntax and decoding are specified

History of MPEG

❚ MPEG-1
  ❙ Started in 1988
  ❙ Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mb/s
  ❙ Compression standard for progressive frame-based video in SIF (source input format, 360x240)
  ❙ Applications: VCD
❚ MPEG-2
  ❙ Generic coding of moving pictures and associated audio
  ❙ Compression standard for interlaced frame-based video in CCIR-601 (720x480) and HDTV (1920x1088)
  ❙ Applications: DVD, SVCD, Direct TV, DVB, HDTV

History of MPEG (Cont.)

❚ MPEG-4
  ❙ Very low bit rate audio-visual coding
  ❙ Multimedia standard for object-based video from natural or synthetic sources
  ❙ Applications: Internet, cable TV, virtual studio, home LAN
❚ MPEG-7
  ❙ Multimedia content description interface
  ❙ Applications: Internet, video search engine, digital library

Applications of MPEG-1 & 2

❚ Digital storage media: 1~1.5 Mb/s
❚ Asymmetric applications
  ❙ Electronic publishing
    ◆ education & training
    ◆ travel guidance
    ◆ videotext
    ◆ point of sale
  ❙ Games
  ❙ Entertainment
❚ Symmetric applications
  ❙ Electronic publishing
  ❙ Video mail
  ❙ Video-telephone
  ❙ Video conferencing

                   JPEG              H.261               MPEG-1          MPEG-2
Application        multilevel still  video-based         multimedia and  Digital NTSC and
                   images            telecommunications  broadcast TV    video-on-demand
Resolution                           352x288             352x288         720x480
Data rate                            64kbps ~ 2.048Mbps  1.5Mbps         4 ~ 10Mbps
Full motion        No                Yes                 Yes             Yes
Picture rate                         30                  24 ~ 30         30
Compression ratio  15:1              100:1 ~ 200:1       200:1           100:1

Features of Video Compression Algorithm

❚ Random access (latency: 0.5 seconds)
❚ Fast forward/reverse searches
❚ Reverse playback
❚ Audio-visual synchronization
❚ Robustness to errors
❚ Coding/decoding delay (150 ms for videotelephony)
❚ Editability
❚ Format flexibility: raster size & frame rate
❚ Cost tradeoffs: implementable

Principles of MPEG

❚ Spatial redundancy (intra-frame)
  ❙ DCT
❚ Temporal redundancy (inter-frame)
  ❙ Block-based motion compensation
  ❙ MB: 16x16 macroblock
  ❙ Prediction, interpolation

[Figure: forward prediction (I and P frames) and bidirectional prediction (B frames) over the sequence below]

Frame:  1  2  3  4  5  6  7  8  9 10 11 12 13
Type:   I  B  B  B  P  B  B  B  P  B  B  B  I

Motion estimation and compensation

❚ Motion estimation: estimate the motion parameters of moving objects in an image sequence
  ❙ At the encoder
❚ Motion compensation: replace a picture, or a portion thereof, based on displaced pels of a previously transmitted frame in an image sequence
  ❙ At the decoder
❚ Why motion compensation
  ❙ Reduce interframe correlation
  ❙ Block motion compensation is adopted by H.261/H.263 and MPEG-1/2

Motion Estimation

❚ Predict the current frame from the previous frame
❚ Motion estimation approaches
  ❙ Block matching method
    ◆ Pel based
    ◆ Block based
    ◆ Object based
  ❙ Differential (gradient) method: optical flow
  ❙ Fourier method

Motion estimation problem

❚ Moving object: a group of contiguous pels that share the same set of motion parameters (this does not necessarily match the ordinary meaning of object)
❚ Assumptions:
  ❙ Objects are rigid bodies: object deformation can be neglected for at least a few nearby frames
  ❙ Objects undergo only translational movement for at least a few frames

Motion estimation problem (cont)

❚ Assumptions: (cont)
  ❙ Illumination is spatially and temporally uniform; the observed object intensities are unchanged under movement
  ❙ Occlusion of one object by another and uncovered background are neglected

Block matching motion estimation

❚ Concept: a correlation technique that searches for the best match between the current image block and candidates in a confined area of the previous frame
❚ Assumptions: images are partitioned into non-overlapping rectangular blocks
  ❙ Each block is viewed as an independent object
  ❙ The motion of pels within the same block is uniform

Motion Estimation

[Figure: block matching between the predicted picture and the reference picture. (1) Current macroblock at (x,y); (2) defined search window; (3) best matched macroblock within the search window; (4) motion vector from (x,y) to the matched search position]

Factors Affecting Block-based Matching Algorithms

• Searching algorithm, order
• Matching criteria
• Searching range

Full search
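A minimal sketch of full-search block matching, assuming frames are stored as lists of rows of integer pel values and using mean absolute difference as the matching criterion. The function names `mad` and `full_search`, the toy frames, and the choice to skip candidates that fall outside the reference frame are all assumptions made for illustration. The displacement convention follows the deck's matching function, f(n1-d1, n2-d2, t-1).

```python
def mad(cur, ref, top, left, d1, d2, n):
    """Mean absolute difference between the n x n block of `cur` whose top-left
    pel is (top, left) and the block of `ref` displaced by (d1, d2)."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(cur[top + i][left + j] - ref[top + i - d1][left + j - d2])
    return total / (n * n)


def full_search(cur, ref, top, left, n, w):
    """Try every displacement in [-w, w] x [-w, w] and return (cost, d1, d2)
    for the candidate with the smallest MAD."""
    rows, cols = len(ref), len(ref[0])
    best = None
    for d1 in range(-w, w + 1):
        for d2 in range(-w, w + 1):
            r, c = top - d1, left - d2
            # Skip candidates that fall partly outside the reference frame.
            if r < 0 or c < 0 or r + n > rows or c + n > cols:
                continue
            cost = mad(cur, ref, top, left, d1, d2, n)
            if best is None or cost < best[0]:
                best = (cost, d1, d2)
    return best


# Toy 8x8 reference frame, and a current frame equal to it shifted down 1 and right 2.
ref = [[i * 31 + j * j for j in range(8)] for i in range(8)]
cur = [[ref[i - 1][j - 2] if i >= 1 and j >= 2 else 0 for j in range(8)]
       for i in range(8)]

result = full_search(cur, ref, top=3, left=3, n=2, w=2)
positions = (2 * 2 + 1) ** 2  # (2W+1)^2 candidate positions for W = 2
```

Here `result` holds the matching cost and the motion vector for the 2x2 block at (3, 3). The search visits up to (2W+1)^2 positions per block, which is the cost that the logarithmic and cross searches try to reduce.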

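[Figure: a block b in the current frame and its search window in the reference frame, extending W pels in each direction]

Search positions: (2W+1)^2

Logarithmic search / Cross search algorithm

[Figure: search points numbered by step (1, 2, 3)]

Search positions: 5 + 4 log2 W

Matching function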

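❚ Mean squared error (MSE): (min)

$$ \mathrm{MSE}(d_1, d_2) = \frac{1}{N_1 N_2} \sum_{n_1} \sum_{n_2} \left[ f(n_1, n_2, t) - f(n_1 - d_1, n_2 - d_2, t-1) \right]^2 $$

❚ Mean absolute difference (MAD): (min)

$$ \mathrm{MAD}(d_1, d_2) = \frac{1}{N_1 N_2} \sum_{n_1} \sum_{n_2} \left| f(n_1, n_2, t) - f(n_1 - d_1, n_2 - d_2, t-1) \right| $$

d_1, d_2 are the displacement components; the sums run over the N_1 x N_2 pels of the block.

MPEG 1/2 Motion Compensated Interpolation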
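The three interpolation modes for block B on this slide (copy A, copy C, or average A and C) can be sketched as a choice among three candidate predictions. This is only an illustration: the mode names `previous`, `future` and `average`, the sum-of-absolute-differences error measure, and the toy 2x2 blocks are assumptions, and blocks A and C are taken to be already motion compensated.

```python
def block_error(x, y):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y))


def interpolation_candidates(block_a, block_c):
    """The three predictions for block B: A, C, or (A + C) / 2."""
    average = [[(a + c) / 2 for a, c in zip(row_a, row_c)]
               for row_a, row_c in zip(block_a, block_c)]
    return {"previous": block_a, "future": block_c, "average": average}


def best_mode(block_b, block_a, block_c):
    """Return (mode_name, error) for the candidate closest to the actual block B."""
    candidates = interpolation_candidates(block_a, block_c)
    return min(((name, block_error(block_b, pred)) for name, pred in candidates.items()),
               key=lambda item: item[1])


a = [[8, 18], [28, 38]]    # block A from the previous frame
c = [[12, 22], [32, 42]]   # block C from the future frame
b = [[10, 20], [30, 40]]   # actual block B in the current frame
mode, error = best_mode(b, a, c)
```

In this toy case B lies exactly halfway between A and C, so the averaging mode predicts it with no residual, while either single-direction mode leaves an error of 8.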

[Figure: block B in the current frame is predicted from block A in the previous frame and block C in the future frame]

MC interpolation modes:
1. Block B = Block A
2. Block B = Block C
3. Block B = (A + C)/2

MPEG Decoder

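[Figure: MPEG decoder block diagram. Coded data passes through VLC⁻¹, Scan⁻¹, Q⁻¹ and DCT⁻¹ to the output. For I frames the DCT⁻¹ result is output directly; for P/B frames the motion-compensated prediction from the framestore memory is added to it.]

MPEG Encoder

[Figure: MPEG encoder block diagram. Blocks shown: input pictures, inter/intra selection (with subtraction of the prediction for inter), DCT, Q, Scan, VLC, buffer, output, rate controller, Q⁻¹, DCT⁻¹, adder, framestore memory, motion estimator, motion compensation, coding mode decision.]

VLC: Variable-length Coding
Q: Quantization
DCT: Discrete Cosine Transform

Parts of MPEG-1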

❚ ISO/IEC 11172-1: Systems
❚ ISO/IEC 11172-2: Video
❚ ISO/IEC 11172-3: Audio
❚ ISO/IEC 11172-4: Conformance Testing
❚ ISO/IEC 11172-5: Software

Parameters of MPEG-1

❚ Picture size: up to 4095x4095, normally 360x240
❚ Pel aspect ratio: choices
❚ Picture rates: 23.976, 24, 25, 29.97, 30, 50, 59.94, 60
❚ 4:2:0 format

MPEG Layers

Layers of the syntax       Function
Sequence layer             Random access unit: context
Group of pictures layer    Random access unit: video
Picture layer              Primary coding unit
Slice layer                Resynchronization unit
Macroblock layer           Motion compensation unit
Block layer                DCT unit

Video stream structure

Sequence layer:    Sequence header | GOP 1 | ... | GOP i | ... | GOP n | Sequence end
GOP layer:         GOP header | Picture 1 | ... | Picture j | ... | Picture m
Picture layer:     Picture header | Slice 1 | ... | Slice k | ... | Slice p
Slice layer:       Slice header | Macroblock 1 | ... | Macroblock l | ... | Macroblock q
Macroblock layer:  Macroblock header | Block 1 | Block 2 | Block 3 | Block 4

[Figure: Slice in MPEG (MPEG2 not across line)]

Parts of MPEG-2

❚ ISO/IEC 13818-1: Systems
❚ ISO/IEC 13818-2: Video
❚ ISO/IEC 13818-3: Audio
❚ ISO/IEC 13818-4: Compliance Testing
❚ ISO/IEC 13818-5: Software
❚ ISO/IEC 13818-6: DSM-CC
❚ ISO/IEC 13818-7: NBC Audio
❚ ISO/IEC 13818-8: 10-Bit Video (dropped)
❚ ISO/IEC 13818-9: Real-Time Interface
❚ ISO/IEC 13818-10: DSM-CC Conformance (DSM: digital storage medium)

Differences between MPEG-1 & MPEG-2

❚ MPEG-2 is backwards compatible with MPEG-1
❚ MPEG-1: progressive, MPEG-2: interlaced
❚ MPEG-1: fixed picture rate, MPEG-2: low delay mode for big pictures
❚ Slice boundary
❚ Zig-zag & alternate scan for DCT coefficients
❚ MPEG-2: scalable video coding
❚ New VLC table for DCT coefficients
❚ Nonlinear quantization table
❚ IDCT mismatch control

SCALABILITY

❚ SNR scalability:
  ❙ Same luminance resolution and format; lower layer (4:2:0, more error correction, MPEG-1)
  ❙ A single enhancement layer (4:2:2, less resilient to error)
❚ Spatial scalability:
  ❙ Base layer at lower resolution, independently coded
  ❙ Enhancement: difference between the interpolated base layer and the source image
❚ Temporal scalability:
  ❙ Extension to a higher temporal picture rate, backward compatible with the lower rate
  ❙ Base temporal rate coded independently
  ❙ Temporal prediction relative to the base layer
❚ Data partition extension
  ❙ Two-channel transmission/storage
  ❙ Header, motion vectors, low-frequency DCT coefficients
  ❙ Less critical information, high-frequency DCT coefficients, less error protection

Profiles

❚ Simple profile (SP)
❚ Main profile (MP)
❚ SNR scalable profile (SNR)
❚ Spatially scalable profile (Spt)
❚ High profile (HP)

Parameters of Profile

Parameter           Constrained MPEG-1  MPEG-1      SP        MP
Chroma              4:2:0               4:2:0       4:2:0     4:2:0
Picture type        I, P, B, D          I, P, B, D  I, P      I, P, B
Slices              All MB              All MB      All MB    All MB
Scalable            No                  No          No        No
Intra DC precision  8                   8           8, 9, 10  8, 9, 10

Level      Label    Samples/line  Lines/frame  Frames/sec  Luminance rate  Bitrate  VBV buffer size
High       MP@HL    1920          1152         60          62668800        80 Mbps  9781248 bits
High 1440  MP@H-14  1440          1152         60          47001600        60 Mbps  7340032 bits
Main       MP@ML    720           576          30          10368000        15 Mbps  1835008 bits
Low        MP@LL    352           288          30          3041280         4 Mbps   475136 bits

Color Space in MPEG

❚ YCbCr
❚ Luminance-chrominance ratio
  ❙ 4:2:0 (MPEG-1, MPEG-2)
  ❙ 4:4:4 (MPEG-2)
  ❙ 4:2:2 (MPEG-2)

Chrominance Sampling

[Figure: positions of luminance and chrominance samples for (a) 4:2:0 format, (b) 4:2:2 format and (c) 4:4:4 format]

× represents luminance samples
○ represents chrominance samples

Chrominance Sampling (Cont.)

❚ Given a typical CCIR frame of 720x480 (X = no subsampling)

Format  Y samples/line  Y lines/frame  C samples/line  C lines/frame  Horizontal subsampling  Vertical subsampling
4:4:4   720             480            720             480            X                       X
4:2:2   720             480            360             480            2:1                     X
4:2:0   720             480            360             240            2:1                     2:1
4:1:1   720             480            180             480            4:1                     X
4:1:0   720             480            180             120            4:1                     4:1
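The chrominance dimensions follow from dividing the luminance dimensions by the subsampling factors. A minimal sketch, in which the `SUBSAMPLING` table and both function names are hypothetical and the factors are the conventional ones for each format, stated here as assumptions (4:1:0 is left out because its definition varies between sources):

```python
# (horizontal factor, vertical factor) applied to each chrominance component
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:1": (4, 1),
}


def chroma_size(width, height, fmt):
    """Chrominance samples per line and lines per frame for one of Cb or Cr."""
    h, v = SUBSAMPLING[fmt]
    return width // h, height // v


def samples_per_frame(width, height, fmt):
    """Total samples per frame: one luminance plane plus two chrominance planes."""
    cw, ch = chroma_size(width, height, fmt)
    return width * height + 2 * cw * ch


sizes = {fmt: chroma_size(720, 480, fmt) for fmt in SUBSAMPLING}
total_420 = samples_per_frame(720, 480, "4:2:0")
total_444 = samples_per_frame(720, 480, "4:4:4")
```

For a 720x480 frame, 4:2:0 carries half as many samples as 4:4:4 before any compression is applied.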

Scan Order in MPEG
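As a sketch of the zig-zag scan named on this slide, the visiting order of the coefficients in a DCT block can be generated by walking the anti-diagonals and alternating direction. The function name `zigzag_order` and the (row, column) convention are assumptions, and the 8x8 block size is the usual DCT block size. The alternate scan is defined by a fixed table in the standard and is not reproduced here.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + col == s, top-right first.
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def scan_block(block):
    """Flatten a square block of coefficients into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
flat = scan_block([[1, 2, 6], [3, 5, 7], [4, 8, 9]])
```

The scan starts at the DC coefficient and moves towards higher frequencies, so the trailing coefficients of a quantized block tend to be zeros that the run-length and VLC stages code cheaply.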

[Figure: zig-zag scan and alternate scan orders over a block of DCT coefficients]

Typical Frame Size of MPEG

❚ with an I frame distance of 15 and a P frame distance of 3
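The average frame size for this structure can be checked by hand: with an I-frame distance of 15 and a P-frame distance of 3, each group holds 1 I frame, 4 P frames and 10 B frames. The sketch below uses the per-type sizes quoted on this slide, which are assumed here to be bits per frame; the function names are hypothetical.

```python
def gop_counts(i_distance, p_distance):
    """Number of I, P and B frames in one group of i_distance frames."""
    n_i = 1
    n_p = i_distance // p_distance - 1
    n_b = i_distance - n_i - n_p
    return n_i, n_p, n_b


def average_frame_bits(sizes, i_distance=15, p_distance=3):
    """Average bits per frame for the given per-type frame sizes."""
    n_i, n_p, n_b = gop_counts(i_distance, p_distance)
    total = n_i * sizes["I"] + n_p * sizes["P"] + n_b * sizes["B"]
    return total / i_distance


sif = {"I": 150_000, "P": 50_000, "B": 20_000}
ccir = {"I": 400_000, "P": 200_000, "B": 80_000}

sif_avg = average_frame_bits(sif)      # roughly 36,700 bits per frame
ccir_avg = average_frame_bits(ccir)    # roughly 133,300 bits per frame
sif_rate = sif_avg * 30                # bits per second at 30 Hz
ccir_rate = ccir_avg * 30
```

The results come out near 1.1 Mbit/s and 4 Mbit/s, close to the rounded averages and bit rates quoted for the two formats.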

                             I        P        B       Average
30 Hz SIF @ 1.15 Mbit/sec    150,000  50,000   20,000  38,000
30 Hz CCIR 601 @ 4 Mbit/sec  400,000  200,000  80,000  130,000

Frame Order of MPEG

Time (display order)
Type:    I  B  B  P  B  B  P  B  B  P  B  B  P  B  B  I
Frame#:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
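The reordering from display order to decoding order follows from the fact that a B frame needs both of its reference frames first, so each I or P frame is moved ahead of the B frames that precede it in display order. A minimal sketch, where `decode_order` is a hypothetical helper and frame numbers are 1-based as on the slide:

```python
def decode_order(display_types):
    """Return 1-based display frame numbers in the order they must be decoded."""
    order = []
    pending_b = []
    for number, frame_type in enumerate(display_types, start=1):
        if frame_type == "B":
            # Hold B frames back until their future reference has been emitted.
            pending_b.append(number)
        else:
            order.append(number)
            order.extend(pending_b)
            pending_b = []
    # Trailing B frames with no later reference are emitted as they are.
    order.extend(pending_b)
    return order


display = "IBBPBBPBBPBBPBBI"
decoded = decode_order(display)
decoded_types = "".join(display[n - 1] for n in decoded)
```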

Dec. (decoding order)
Type:    I  P  B  B  P  B  B  P  B  B  P  B  B  I  B  B
Frame#:  1  4  2  3  7  5  6 10  8  9 13 11 12 16 14 15

MPEG Myths
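The first myth on this slide, compression ratios over 100:1, can be put in perspective with a rough calculation. The sketch below computes the uncompressed bit rate of 352x240 video in 4:2:0 format at 30 Hz and compares it with the 1.15 Mbit/s rate quoted for SIF in this deck. The function name is hypothetical and 8 bits per sample is an assumption.

```python
def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate of 4:2:0 video: one luminance plane plus two
    chrominance planes subsampled 2:1 in each direction."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bits_per_sample * fps


raw = raw_bitrate(352, 240, 30)   # bits per second before compression
ratio = raw / 1_150_000           # relative to a 1.15 Mbit/s coded stream
```

Under these assumptions the ratio is about 26:1, which suggests that figures above 100:1 count more than the coding of the SIF picture itself, for example downsampling from a larger source; that reading is an interpretation rather than something the slide states.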

❚ Compression ratios over 100:1
❚ MPEG-1 is 352x240: up to 4095x4095 and bit rates up to 100 Mbit/sec
❚ Motion compensation displaces macroblocks from previous pictures
❚ Display picture size is the same as the coded picture size
❚ Picture coding types (I, P, B) all consist of the same macroblock types
❚ Sequence structure is fixed to a specific I, P, B frame pattern

Other Video Standards

❚ ITU-T H.261: px64 teleconferencing standard, 64kb/s
❚ ITU-T H.263: low bit rate teleconferencing, <= 64kb/s
❚ ATSC (Advanced Television Systems Committee): MPEG-2 video + Dolby AC3
❚ DVB (European Digital Video Broadcast): MPEG-2
❚ DVC (Digital Video Cassette Consortium): 6mm tape, MPEG-2 I-frame-like coding, 25Mb/s
❚ MJPEG (Motion JPEG)

Assignment 1

❚ Operations of video editing
  ❙ extraction
  ❙ cutting
  ❙ insertion
  ❙ concatenation
  ❙ scaling
❚ How can MPEG video be edited without full decompression? (Hint: GOP, the 6th myth of MPEG)

Extraction of MPEG Video

Start Point ...... B B P B B I B B P B B P B B P B B I B B P B B P ......

Start Point
...... B B P B B I B B P B B P B B P B B I B B P B B P ......
Forward Prediction

Start Point
...... B B P B B I B B P B B P B B P B B I B B P B B P ......
Forward Prediction

Cutting of MPEG Video

Start Point CLIP1 CLIP2 ...... B B P B B I B B P B B P B B P ...... B I B P ......

(a) the cut range specified by the users

CLIP1 CLIP2 ...... B B P B B I ...... B I B P ......

(b) Result of cutting

Backward Prediction

...... B B P B B I B B P B B P B B P ...... B I B P ...... CLIP1 Start Point CLIP2