Human Perception of Video
• 30 frames per second seems to allow the visual system to integrate the discrete frames CSEP 590 into continuous perception. Even 10 frames Data Compression per second is acceptable. Autumn 2007 • If distorted, nearby frames in the same scene should have only small details wrong. Video Compression – A difference in average intensity is noticeable • Compression choice when reducing bit rate – skipped frames cause stop action – lower fidelity frames may be better
CSEP 590 - Lecture 13 - Autumn 2007 2
Applications of Digital Video Video Encoding Frame-by-Frame coding • Teleconference or video phone – Real-time video – Very low delay (1/10 second is a standard) Encoder • Live Broadcast Video video frames decoding each frame – Modest delay is tolerable (seconds is normal) requires the previous frame – Error tolerance is needed. • Video-in-a-can (DVD, Video-on-Demand) Group-of-Frames coding
– Random access to compressed data is desired Group1444 of 2 Frames444 3 – Encoding can take a lot of time • Decoding must always be at at least the frame rate. Encoder video frames coded groups of frames no interdependencies
CSEP 590 - Lecture 13 - Autumn 2007 3 CSEP 590 - Lecture 13 - Autumn 2007 4
Coding Techniques Digital Video Data
• Frame-by-frame coding with prediction • CCIR 601 (4,2,2 scheme) – Very low bit rates – 13.5 MHz sample rate for luminance channel – 6.75 MHz sample rate for each of two chrominance channels – low delay – 8 bits per sample is a bit rate of 27 x 8 = 216 Mb per second – Not error resilient – MPEG-SIF – ½ sample rate for luminance and ¼ for • Group-of-frames coding chrominance – 81 Mb per second – Higher bit rates – within a group prediction is used • CIF (Common Interchange Format) – 288 x 352 pixels per frame for luminance channel – Error resilient – 144 x 176 pixels per frame for each of two chrominance – Random Access – 8 bits per pixel and 30 frames per second gives 48.7 Mb per – Higher delay second – QCIF (Quarter - CIF) is ¼ the data or 12.2 Mb per second.
CSEP 590 - Lecture 13 - Autumn 2007 5 CSEP 590 - Lecture 13 - Autumn 2007 6
1 High Compression Ratios Possible Motion Compensation
• Nearby frames are highly correlated. Use the previous frame to predict the current one. • Need to take advantage of the fact that usually objects move very little in 1/30 th of a second. – Video coders use motion compensation as part of prediction
Previous Frame Frame
CSEP 590 - Lecture 13 - Autumn 2007 7 CSEP 590 - Lecture 13 - Autumn 2007 8
Block Based Motion Compensation Motion Vectors
motion compensation blocks motion vector = (0,0)
CSEP 590 - Lecture 13 - Autumn 2007 9 CSEP 590 - Lecture 13 - Autumn 2007 10
Motion Vectors Motion Compensation
• For each motion compensation block – Find the block in the previous decoded frame that gives the least distortion. – If the distortion is too high then code the block independently. (intra block) – Otherwise code the difference (inter block) • The previous decoded frame is used because both the encoder and decoder have access to it.
motion vector = (20,5) 20 down and 5 to right
CSEP 590 - Lecture 13 - Autumn 2007 11 CSEP 590 - Lecture 13 - Autumn 2007 12
2 Issues Fractional Motion Compensation
• Distortion measured in squared error or absolute • Half or quarter pixel motion compensation may achieve better error predictions. – Absolute error is quicker to calculate • Fractional motion compensation is achieved by linear • Block size interpolation. – Too small then too many motion vectors previous frame current frame – Too large then there may be no good match • Searching range to find best block – Too large a search range is time consuming – Too small then may be better matches – Prediction can help. • Prediction resolution – Full pixel, half-pixel, quarter-pixel resolution – Higher resolution takes longer, but better prediction results Half pixel motion compensation
CSEP 590 - Lecture 13 - Autumn 2007 13 CSEP 590 - Lecture 13 - Autumn 2007 14
Linear Interpolation Rate Control
frame • Calculate an interpolated pixel as the average of buffer overlapping pixels. Encoded Decoder player • Better interpolation methods exist. frames
a1 a2 Half = a1 + a2 + a3 + a4 a a 4 4 4 4 • Buffer is filled at a constant rate (almost). a3 a4 • Buffer is emptied at a variable rate.
a a 1 2 3a 1 9a 2 a3 3a 4 a = + + + Quarter a 16 16 16 16
a3 a4
CSEP 590 - Lecture 13 - Autumn 2007 15 CSEP 590 - Lecture 13 - Autumn 2007 16
Underflow Problem Example
• Set up C/F = 60 buffer – Constant rate channel at C bits per second Decoder initial 20 100 – Frame rate F in frames per second. 60 bits in this frame not yet arrived – bi is the number of bits in compressed frame i – Initial occupancy of buffer B Decoder 40 20 20 •Bi is the number of bits in the buffer at frame i = 60 bits that arrived B0 B = + − Bi+1 Bi C/F bi Decoder 100 20 • Buffer should never empty B i > 0 for all i
CSEP 590 - Lecture 13 - Autumn 2007 17 CSEP 590 - Lecture 13 - Autumn 2007 18
3 Example Rate Control
Decoder • The rate control buffer is seeded to allow for 100 20 variability in number of bits per frame and variability in channel rate. Decoder • Causes of variability in bits per frame. 80 20 20 20 100 – Encoder does not predict well how many bits will be used in a frame - scalar quantization. Decoder – Encoder allocates more bits in frames that are 20 80 20 20 hard to encode because they are not predicted well - scene changes. Decoder • Causes of channel rate variability 20 20 100 20 – Congestion on the internet
CSEP 590 - Lecture 13 - Autumn 2007 19 CSEP 590 - Lecture 13 - Autumn 2007 20
Rate Control Algorithms H.261
• On-line solution • Application – low bit rate streaming video – Send a few frames to seed the buffer • Frame-by-frame encoder – Encoder simulates the buffer, should the buffer threaten to empty start sending more frames at • DCT based with 8x8 coding block lower fidelity or skip frames (decoder will – Uses JPEG style coding interpolate skipped frames). • Motion compensation based on 16x16 • Off-line solution macroblocks. – Attempt to allocate bits to frames to assure even fidelity. • Half pixel motion compensation – Seed the buffer with enough frames to prevent underflow.
CSEP 590 - Lecture 13 - Autumn 2007 21 CSEP 590 - Lecture 13 - Autumn 2007 22
H.261 H.261(QCIF)
176 pixels Block P-Frame Macroblock
144 99 motion pixels I-Frame vectors
CSEP 590 - Lecture 13 - Autumn 2007 23 CSEP 590 - Lecture 13 - Autumn 2007 24
4 H.261 H.261 Organization 176 Picture Header • Within a group of blocks (GOB) prediction is GOB 1 Picture GOB 2 used with motion vectors for coding. 144 Frame … GOB 9 GOB
Group of GOB MB 1 MB 2 … MB 11 Blocks Header
Macroblock MB Y1 Y2 Also called a slice Header Y3 Y4 Cb Cr
Block EOB JPEG style run length code
CSEP 590 - Lecture 13 - Autumn 2007 25 CSEP 590 - Lecture 13 - Autumn 2007 26
P-Frame Intra-Macroblock Distribution
Inter Forman Video Intra
CSEP 590 - Lecture 13 - Autumn 2007 27 CSEP 590 - Lecture 13 - Autumn 2007 28
MPEG-1 MPEG-1 • P-frames predicted by I-frames • Application – Video coding for random access or P-frames • B-frames predicted by I-frames • Group-of-frames encoder and P-frames • DCT based with 8x8 coding block – Uses JPEG style coding • Motion compensation based on 16x16 macroblocks. P-Frame • Forward and Backward Prediction within a group of frames B-Frame
I-Frame
CSEP 590 - Lecture 13 - Autumn 2007 29 CSEP 590 - Lecture 13 - Autumn 2007 30
5 Orders MPEG-1 Notes
Display Order • Random access unit = Group-of-Frames – Called GOP for group-of-pictures 10 9 8 7 6 5 4 3 2 1 • Error resilient – B-frames can be damaged without propagation Coding/Decoding Order • Added delay – Coding order different than display order 9 8 10 6 5 7 3 2 4 1 • Encoding time consuming – Suitable for non-interactive applications Added delay is one frame time
CSEP 590 - Lecture 13 - Autumn 2007 31 CSEP 590 - Lecture 13 - Autumn 2007 32
Beyond MPEG-1 Newest Trends
• MPEG-2 • H.264 – Just out in 2003, many new features – Application independent standard – Quarter pixel motion compensation • MPEG-4 – Variable size motion blocks – Multimedia applications – Multiple frame prediction • 3-D Wavelet Coding – Model based coding – Third dimension is time • H.263 – 3-D SPIHT has been implemented – More error resilience – Delay is large because GOP is large • GTV – Group testing for video – Bits per frame can be controlled enabling off-line rate control to succeed.
CSEP 590 - Lecture 13 - Autumn 2007 33 CSEP 590 - Lecture 13 - Autumn 2007 34
6