Image and Video Coding II

Thomas Wiegand

Technische Universität Berlin Introduction

bitstream encoder decoder Motivation Motivation for Image and Video Coding

Application of Image and Video Coding Efficient transmission or storage of images and videos Use less throughput for given data Transmit more data for given throughput

Image and Video Coding: Enabling Technology Enables new applications Makes applications economically feasible

Examples Distribution of digital images Storage of images and videos Digitial television (DVB-T, DVB-T2, DVB-S, DVB-S2) Internet video streaming (YouTube, Netflix, Amazon, ...)

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 2 / 23 Motivation Practical Video Coding Examples

Image Storage & Distribution Raw camera formats, JPEG, JPEG-2000, JPEG-XR, H.265 | HEVC

Digital Television Terrestrial: MPEG-2 (SD, DVB-T), H.265 | HEVC (HD, DVT-T2) Satellite: H.264 | AVC (HD, DVB-S), H.265 | HEVC (UHD, DVB-S2)

Video Storage on Optical Discs DVD-Video (MPEG-2) Blu-ray (MPEG-2, H.264 | AVC, VC-1) Blu-ray 3D (H.264 | AVC), Ultra HD Blu-ray (H.265 | HEVC)

Video Streaming YouTube, Netflix, Amazon, other media portals H.264 | AVC, VP9, H.265 | HEVC

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 3 / 23 Motivation Types of Image and Video Compression

Lossless Compression Invertible / reversible Original input data can be completely recovered Examples: PNG, JPEG-LS for images H.264 | AVC Lossless for video

Lossy Compression Not invertible Only approximation of original input data can be recovered Achieves much higher compression ratios Dominant form of image and video compression Examples: JPEG, JPEG-2000 for images MPEG-2, H.264 | AVC, H.265 | HEVC for video

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 4 / 23 Source Data Sampling and Quantization Source Data

Analog Signals Continuous-time/space and continuous-amplitude signals Typically resulting from physical measurements Most signals in reality (audio and visual signals)

Digital Signals Discrete-time/space and discrete-amplitude signals Often generated from analog signal Sometimes directly measured (typical for image & video signals)

Input Data for Image and Video Coding Require digital signals Need to be stored and processed with a computer Compression methods are computer programs

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 5 / 23 Source Data Sampling and Quantization Analog-to-Digital Conversion

Sampling Maps continuous-time/space signal into discrete-time/space signal

Quantization Maps continuous-amplitude signal into discrete-amplitude signal Simplest case: Rounding

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 6 / 23 Source Data Sampling and Quantization Impact of Sampling (Spatial Resolution)

400×300 samples 200×150 samples

100×75 samples 50×38 samples

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 7 / 23 Source Data Sampling and Quantization Impact of Quantization (Bit Depth)

8 bits per component 4 bits per component

3 bits per component 2 bits per component

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 8 / 23 Source Data Sampling and Quantization Raw Images and Videos

Single-Component Image x Matrix of integer samples

s[x, y]

Characterized by y Number of samples in horizontal and vertical direction W ×H Sample bit depth B

Color Images Three color components (typically RGB or YCbCr) Additionally characterized by color sampling format

Videos Sequence of images Additionally characterized by frame rate

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 9 / 23 Source Data Sampling and Quantization Color Sampling Formats

RGB YCbCr 4:4:4 YCbCr 4:2:2 YCbCr 4:2:0

most common format

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 10 / 23 Source Data Sampling and Quantization Raw Data Rate

Raw Data Rate Bit rate R of raw data format

R = (samples per time unit) · (bit depth per sample) (1)

Example: Full High Definition (HD) Video 1920×1080 luma samples, 50 frames per second (Europe) 4:2:0 chroma format (chroma components with 1/4 luma resolution) 8 bits per sample raw data rate = 50 Hz · 1920 · 1080 · (1 + 2/4) · 8 bits ≈ 1.24 Gbit/s

Example: Ultra High Defintion (UHD) Video 3840×2160 luma samples, 60 frames per second (USA, Japan) 4:2:0 chroma format, 10 bits per sample raw data rate = 60 Hz · 3840 · 2160 · (1 + 2/4) · 10 bits ≈ 7.5 Gbit/s

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 11 / 23 The Image & Video Coding Problem Typical Video Communication Scenario

pre- capture processing video raw input encoder video samples encoded bitstream

transmission channel (can be replaced by storage) bitstream channel no transmission errors channel demodulator channel modulator decoder encoder

received bitstream display and perception video decoder raw output post- video samples processing

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 12 / 23 The Image & Video Coding Problem Typical Video Communication Scenario

pre- capture processing video raw input encoder video samples encoded bitstream

transmission channel (can be replaced by storage) bitstream channel no transmission errors channel demodulator channel modulator decoder encoder

received bitstream display and perception video decoder raw output post- video samples processing

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 12 / 23 The Image & Video Coding Problem Typical Video Communication Scenario

pre- capture processing video raw input encoder video samples encoded bitstream

transmission channel (can be replaced by storage) bitstream channel no transmission errors channel demodulator channel modulator decoder encoder

received bitstream display and perception video decoder raw output post- video samples processing

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 12 / 23 The Image & Video Coding Problem Application Examples

HD movie UHD broadcast on Blu-ray Disc over DVB-S2

1920×1080 luma samples 3840×2160 luma samples raw video format 4:2:0 chroma format 4:2:0 chroma format 8 bits per sample 10 bits per sample 24 frames per second 60 frames per second

raw data rate ca. 600 Mbit/s ca. 7.5 Gbit/s

channel bit rate 36 Mbit/s (read speed) 58 Mbit/s (8PSK 2/3)

video bit rate ca. 20 Mbit/s ca. 25 Mbit/s

required compression ca. 30:1 ca. 300:1

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 13 / 23 The Image & Video Coding Problem The Basic Image & Video Coding Problem

Basic Image & Video Coding Problem Two equivalent formulations:

Representing image / video data with highest fidelity possible within an available bit rate and

Representing image / video data using lowest bit rate possible while maintaining a specified reproduction quality

Image / Codec: System of encoder and decoder

bitstream encoder decoder

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 14 / 23 The Image & Video Coding Problem Practical Video Coding Problem

Characteristics of Video Codecs / Video Communication Bit rate: Throughput of the communication channel Quality: Fidelity of the reconstructed signal Delay: Start-up latency, end-to-end delay Complexity: Computation, memory, memory access

Practical Video Coding Problem

Given a maximum allowed complexity and a maximum delay, achieve an optimal trade-off between bit rate and reconstruction quality for the transmission problem in the targeted application

In this course: Will concentrate on source codec Ignore aspects of transmission channel (e.g., transmission errors)

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 15 / 23 The Image & Video Coding Problem Basic Design and Principles of Hybrid Video Coding

(see other set of slides)

Block-based Image Coding Transform of sample blocks Scalar quantization Entropy coding of quantization indexes In modern codecs: Intra-picture prediction

Hybrid Video Coding Motion-compensated prediction using already coded picture of prediction erro signals Typically: Block-based decision between intra-and inter-picture prediction

Basic principles are used in all major video coding standards: H.261, MPEG-2 Video, H.263, MPEG-4 Visual, H.264/AVC, H.265/HEVC New project: Versatile Video Coding (VVC) started April 2018

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 16 / 23 Coding Efficiency Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Image Compression Example

OriginalLosslessLossy Compressed: Compressed:Image (1024 JPEG× GZIPPNG678 image(Quality points, 95)75)50)25)1) 945 KB)67.0346.5012.033.181.941.110.24 % %

100 %

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 17 / 23 Coding Efficiency Trade-Off between Quality and Compression Ratio

Coding Efficiency Ability to trade-off bit rate and reconstruction quality Want best reconstruction quality for a given bitrate (or vice versa)

quality

better codec B codec A

bit rate

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 18 / 23 Coding Efficiency Coding Efficiency Example

1:100 Compression: JPEGH.265 | HEVC

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 19 / 23 Coding Efficiency Coding Efficiency Example

1:100 Compression: JPEGH.265 | HEVC

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 19 / 23 Coding Efficiency Measuring Coding Efficiency

Average Bit Rate number of bits in bitstream for a video sequence R = (2) nominal duration of the video sequence

Quality / Distortion Typically: Mean square error (MSE) and peak-signal-to-noise ratio (PSNR) For a W ×H array s[x, y] of original samples, the array s0[x, y] of reconstructed samples, and a sample bit depth B

1 X 2 MSE = s[x, y] − s0[x, y] (3) W · H ∀(x,y)

(2B − 1)2  PSNR = 10 · log (4) 10 MSE For videos: Average PSNR over entire video sequence

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 20 / 23 Outline Outline of Course

Image and Video Coding II Acquisition, Representation, Display, and Perception of Images Source Coding Fundamentals Video Coding Overview Video Encoder Control Intra-Picture Coding Techniques Inter-Picture Coding Technqiues Comparision of Video Coding Standards

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 21 / 23 Outline Literature I

Source Coding Wiegand, T. and Schwarz, H., “Source Coding: Part I of Fundamentals of Source and Video Coding,” Foundations and Trends in Signal Processing, Now Publishers, vol. 4, no. 1–2, 2011. ( available on web site) Cover, T. M. and Thomas, J. A., “Elements of Information Theory,” John Wiley & Sons, New York, 2006. Gersho, A. and Gray, R. M., “Vector Quantization and Signal Compression,” Kluwer Academic Publishers, Boston, Dordrecht, London, 1992. Jayant, N. S. and Noll, P., “Digital Coding of Waveforms,” Prentice-Hall, Englewood Cliffs, NJ, 1994.

Human Vision Wandell, B. A., “Foundations of Vision,” Sinauer Associates, Sunderland, 1995. Hauske, G. “Systemtheorie der visuellen Wahrnehmung,” B. G. Teubner, Stutgart, 1994.

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 22 / 23 Outline Literature II

Image and Video Coding Schwarz, H. and Wiegand, T., “Video Coding: Part II of Fundamentals of Source and Video Coding,” Foundations and Trends in Signal Processing, Now Publishers, vol. 10, no. 1–3, 2016. (pdf available on web site) Bull, D. R., “Communicating Pictures: A Course in Image and Video Coding,” Elsevier, 2014. Ohm, J.-R., “Multimedia Signal Coding and Transmission,” Springer, 2015. Pennebaker, W. and Mitchell, J., “JPEG Still Image Standard,” Van Nostrand Reinhold, New York, 1993. Vetterli, M. and Kovacevic, J., “ and Subband Coding,” Prentice-Hall, 1995. Wien, M., “High Efficiency Video Coding — Coding Tools and Specifications,” Springer 2014. Sze, V., Budagavi, M., and Sullivan, G. J. (eds.), “High Efficiency Video Coding (HEVC): Algorithm and Architectures,” Springer, 2014.

T. Wiegand (TU Berlin) — Image and Video Coding: Introduction 23 / 23