9.1How Does H.263 Differ from H.261 and MPEG-1?
Chapter 9 H.263
9.1How does H.263 differ from H.261 and MPEG-1?
9.1.1Coding of H.263 coefficients
three-dimensional event of (last, run, level)
9.1.2Coding of motion vectors
MVDx = MVx – predx
MVDy = MVy – predy(9.2)
Figure 9.1 Motion vector prediction
Figure 9.2 Motion vector prediction for the border macroblocks
9.1.3Source pictures
10Table 9.1 Number of pixels per line and number of lines per picture for each of the H.263 picture formats
Picture format / Number of pixels for luminance per line / Number of lines for luminance per picture / Number of pixels for chrominance per line / Number of lines for chrominance per picturesubQCIF / 128 / 96 / 64 / 48
QCIF / 176 / 144 / 88 / 72
CIF / 352 / 288 / 176 / 144
4CIF / 704 / 576 / 352 / 288
16CIF / 1408 / 1152 / 704 / 576
Extensions of H.263 (H.263+ , H.263++)
This has led to H.26L and eventually to H.264/AVC/MPEG4-V10
Scope and goals of H.263+
The expected enhancements of H.263+ over H.263 fall into two basic categories:
- enhancing quality within existing applications
- broadening the current range of applications.
A few examples of the enhancements are:
- improving perceptual compression efficiency
- reducing video coding delay
- providing greater resilience to bit errors and data losses.
Scopes and goals of H.26L
- enhanced visual quality at very low bit rates and particularly at PSTN rates (e.g. at rates below 24 kbit/s)
- enhanced error robustness in order to accommodate the higher error rates experienced when operating for example over mobile links.
- low complexity appropriate for small, relatively inexpensive, audio-visual terminals
- low end-to-end delay as required in bidirectional personal communications.
In addition, the group was closely working with the MPEG-4 experts group, to include new coding methods and promote interoperability. This work was formally ratified in 2003 and named H.264 in ITU-T and MPEG_4 Part-10 in ISO/IEC. Details of this codec will be covered in Chapter 11.
Optional modes of H.263
Advanced motion estimation/compensation
9.4.1.1 Four motion vectors per macroblock
Figure 9.3 Redefinition of the candidate predictors MV1, MV2 and MV3 for each luminance block in a macroblock
9.4.1.2 Overlapped motion compensation
Figure 9.4 Weighting values for prediction with motion vectors of the luminance blocks on top or bottom of the current luminance block, H1(i,j)
Figure 9.5 Weighting values for prediction with motion vectors of luminance blocks to the left or right of current luminance block, H2(i,j)
The creation of each interpolated (overlapped) pixel, p(i,j), in an reference luminance block is governed by:
(9.3)
where q(i,j), r(i,j) and s(i,j) are the motion compensated pixels from the reference picture with the three motion vectors defined by:
Figure 9.6 Weighting values for prediction with motion vector of current block, H0(i,j)
9.4.2Importance of motion estimation
Figure 9.7 PSNR of Claire sequence coded at 256 kbit/s, with MPEG-1, H.261 and H.263
- motion compensation on smaller block sizes of pixels results in smaller error signals than for the macroblock compensation used in the other codecs
- overlapped motion compensation; by removing the blocking artefacts on the block boundaries, the prediction picture has a better quality, so reducing the error signal, and hence the number of significant DCT coefficients
- efficient coding of DCT coefficients through three-dimensional (last, run, level)
- efficient representation of the combined macroblock type and block patterns.
Deblocking filter
Figure 9.8 Filtering of pixels at the block boundaries
Figure 9.9 d1 as a function of d
Motion estimation/compensation with spatial transforms
Figure 9.10 Mapping of a block to a quadrilateral
One of the methods for this purpose is the bilinear transform, defined as [11]:
Figure 9.11 Intensity interpolation of a nongrid pixel
(9.7)
which is simplified to
Figure 9.12 Reconstructed pictures with operating individually the BMST and BMA motion vectors.
Figure 9.13. Frame by frame reconstruction of the pictures by BMST
Figure 9.14 Mesh-based motion compensation
a mesh
b motion compensated picture
Figure 9.15 Performance of spatial transform motion compensation
9.5Treatment of B-pictures
Figure 9.16 Prediction in PB frames mode
9.5.1.1 Macroblock type
One of the modes of the MCBPC is the intra macroblock type that has the following meaning:
- the P-blocks are intra coded
- the B-blocks are inter coded with prediction as for an intra block.
9.5.1.2Motion vectors for B-pictures in PBframes
9.5.1.3 Prediction for a B-block in PB frames
Figure 9.17 Forward and bi-directional prediction for a B-block
9.5.2Improved PB frames
Bidirectional prediction: in the bidirectional prediction mode, prediction uses the reference pictures before and after the BPB-picture
Forward prediction: In the forward prediction mode the vector data contained in MVDB are used for forward prediction from the previous reference picture (an intra or inter picture, or the P-picture part of PB or improved PB frames).
Backwardprediction: In the backward prediction mode the prediction of BPB-macroblock is identical to BREC of normal PB frames mode. No motion vector data is used for the backward prediction.
9.5.3Quantisation of B-pictures
10Table 9.2 dbquant codes and relation between quant and bquant
dbquant / Bquant00 /
01 /
10 /
11 /
10.4Advanced variable length coding
10.4.1Syntax-based arithmetic coding (More in H.264)
10.4.2Reversible variable length coding
Figure 9.18 A reversible VLC
10.4.3Resynchronisation markers
10.4.4Advanced Intra/Inter VLC
10.4.4.1Advanced intra coding
Scanning
Figure 9.19 Alternate scans: (a) horizontal, (b) vertical
Figure 9.20 Three neighbouring blocks in the DCT domain
The reconstruction for each coding mode is given by:
Mode 0: DC prediction only
(9.9)
Mode 1: DC and AC prediction from the block above
(9.10)
Mode 2: DC and AC prediction from the block to the left
(9.11)
10.4.4.2 Advanced inter coding with switching between two VLC tables
The outcome of this annex [25-S] in a different form is also used in H.264. In H.264, rather than switching between two VLC tables, selection is made among 11 VLC tables. Decision for selection of the most suitable table is based on the context of data to be coded. This is called context adaptive VLC (CAVLC), to be explained in great details in Chapter 11.
10.5Protection against error
10.5.1Forward error correction
To allow the video data and error correction parity information to be identified by the decoder, an error correction framing pattern is included. This pattern consists of multiframes of eight frames, each frame comprising 1 bit framing, 1 bit fill indicator (FI), 492 bits of coded data and 18 bits parity. One bit from each one of the eight frames provide the frame alignment pattern of (S1S2S3S4S5S6S7S8)=(00011011) that will help the decoder to resynchronise itself after the occurrence of errors.
The error detection/correction code is a BCH (511, 493) [16]. The parity is calculated against a code of 493 bits, comprising a bit fill indicator (FI) and 492 bits of coded video data. The generator polynomial is given by:
(9.12)
The parity bits are calculated by dividing the 493 bits (left shifted by 18 bits) of the video data (including the fill bit) to this generating function. Since the generating function is a 19-bit polynomial, the remainder will be an 18-bit binary number (that is why data bits had to be shifted by 18 bits to the left), to be used as the parity bits. For example, for the input data of 01111...11 (493 bits), the resulting correction parity bits are 011011010100011011 (18 bits). The encoder appends these 18 bits to the 493 data bits, and the whole 511-bits are sent to the receiver as a block of data. Now this 511 bit data is exactly divisible to the generating function, and the remainder will be zero. Thus, the receiver can perform a similar division, and if there is any remainder, it is an indication of channel error. This is a very robust form of error detection, since burst of errors can also be detected.
10.5.2Back channel
T: TransformQ: QuantiserCC: Coding control
P: Picture memory with motion compensated variable delay
AP: Additional picture memoryv: Motion vector
p: Flag for INTRA/INTERt: Flag for transmitted or not
qz: Quantisation indicationq: Quantisation index for DCT coefficients
Figure 9.21 An encoder with multiple reference pictures
A positive acknowledgment (ACK) or a negative acknowledgment (NACK) is returned depending on whether the decoder successfully decodes a GOB.
An improved version of this annex [25-N] is used in H.264, under the concept of multiple reference frames. It is used both to improve compression efficiency, by searching for better motion vectors among several frames, as well as error robustness as discussed here. Multiple reference motion compensation and error free selection of reference frame will be studied in some details in Chapter 11.
10.5.3Data partitioning
Figure 9.22 Effects of errors on (a) with and (b) without data partitioning
Table 9.3 VLC and RVLC bits of MCBPC (for I-pictures)
Index / MB type / CBPC / Normal VLC / RVLC0 / 3 (Intra) / 00 / 1 / 1
1 / 3 / 01 / 001 / 010
2 / 3 / 10 / 010 / 0110
3 / 3 / 11 / 011 / 01110
4 / 4 (Intra+Q) / 00 / 0001 / 00100
5 / 4 / 01 / 000001 / 011110
6 / 4 / 10 / 000010 / 001100
7 / 4 / 11 / 000011 / 0111110
Table 9.4 Number of bits per slice for data partitioning
Data PartitioningSlice No / Slice header / MV / Coeff / SUM / Normal
1 / 52 / 30 / 211 / 293 / 269
2 / 63 / 34 / 506 / 603 / 571
3 / 45 / 42 / 748 / 835 / 803
4 / 48 / 42 / 1025 / 1115 / 1083
5 / 45 / 71 / 959 / 1075 / 1043
6 / 41 / 46 / 844 / 931 / 899
7 / 48 / 34 / 425 / 507 / 475
8 / 51 / 32 / 408 / 491 / 459
9 / 38 / 24 / 221 / 283 / 251
Figure 9.23 Error in a bit stream
:
(9.13)
whereN is the total number of pixels at the upper and lower borders of the MB.
Figure 9.24 Pixels at the boundary of (a) a macroblock and (b) four blocks
Figure 9.25 Step-by-step decoding and skipping of bits in the bit stream
10.5.4Error concealment
10.5.4.1 Intraframe error concealment
Figure 9.26 An example of Intraframe error concealment
9.7.5.1 Interframe error concealment
Figure 9.27 A grid of macroblocks in the current and previous frame
Zero mv
Previous mv
Top mv
Mean mv
;(9.14)
Majority mv
; N 6(9.15)
Vector median mv
thejth motion vector, mvj, is the median if:
(9.16)
Table 9.5 PSNR [dB] of various error concealment methods at 5 frames/s.
Type / 64Kbps, 5 fps, QCIFSequence / Seq-1 / Seq-2 / Seq-3 / Seq-4
Zero / 17.04 / 18.15 / 14.54 / 13.08
Previous / 17.28 / 18.34 / 14.53 / 13.48
Top / 19.27 / 21.08 / 17.04 / 16.25
Average / 19.18 / 21.74 / 17.51 / 16.18
Majority / 19.35 / 21.83 / 17.89 / 16.61
Median / 19.87 / 22.52 / 18.29 / 16.89
No Errors / 22.57 / 26.94 / 20.85 / 19.88
Table 9.6 PSNR [dB] of the various error concealment methods at 12.5 frames/s.
Type / 64Kbps, 12.5 fps, QCIFSequence / Seq-1 / Seq-2 / Seq-3 / Seq-4
Zero / 20.62 / 22.11 / 18.64 / 16.62
Previous / 22.49 / 22.19 / 18.19 / 16.53
Top / 22.97 / 25.16 / 20.97 / 20.04
Average / 22.92 / 25.99 / 21.24 / 20.08
Majority / 23.33 / 26.32 / 21.57 / 20.36
Median / 24.36 / 26.72 / 22.16 / 20.81
No Errors / 26.12 / 29.69 / 23.93 / 23.04
Figure 9.28, An erroneous picture along with its error concealed one
Bidirectional mv
Figure 9.29 A group of alternate P and B pictures
To estimate a missing motion vector for a P-picture, say P31, the available motion vectors of the same spatial coordinates of the B-pictures can be used, with the following substitutions:
if only B23 is available, then P31 = 2 × B23
if only B21 is available, then P31 = -2 × B21
if both B23 and B21 are available, then P31 = B23 - B21
If none of them are available, then set P31 = 0
To estimate a missing motion vector of a B-picture, simply divide that of the P-picture by two; B23=½P31 or B21= -½P31
9.7.5.2Loss concealment
9.7.5.3 Selection of best estimated motion vector
Scalability
Figure 8.11 Block diagram of a two-layer SNR scalable coder
Figure 8.12 A DCT based base layer encoder
Figure 8.13 A two-layer SNR scalable encoder with drift at the base layer
Figure 8.14 A three layer drift free SNR scalable encoder
Figure 8.15 A block diagram of a three-layer SNR decoder
Figure 8.16 Picture quality of the base layer of SNR encoder at 2 Mbit/s
Base + Enhancement @ 8 Mbit/s
Spatial scalability
Figure 8.17 Block diagram of a two-layer spatial scalable encoder
Figure 8.19 Details of spatial scalability encoder
(a)
(b)
Figure 8.20 (a) Base layer picture of a spatial scalable encoder at 2 Mbit/s , and (b) its enlarged version
Temporal scalability
Figure 8.21 A block diagram of a two-layer temporal scalable encoder
Hybrid scalability
Figure 8.25 SNR, spatial and temporal hybrid scalability encoder
Overhead due to scalability
Figure 8.26 Increase in bit rate due to scalability
Table 8.4 Applications of SNR scalability
Base layer / Enhancement layer / ApplicationITU-R-601 / same resolution and format as lower layer / two quality service for standard TV
high definition / same resolution and format as lower layer / two quality service for HDTV
4:2:0 high definition / 4:2:2 chroma simulcast / video production/distribution
Table 8.5 Applications of spatial scalability
Base / Enhancement / Applicationprogressive(30Hz) / progressive(30Hz) / CIF/QCIF compatibility or scalability
interlace(30Hz) / interlace(30Hz) / HDTV/SDTV scalability
progressive(30Hz) / interlace(30Hz) / ISO/IECE11172-2/compatibility with this specification
interlace(30Hz) / progressive(60Hz) / Migration to HR progressive HDTV
Table 8.6 Applications of temporal scalability
Base / Enhancement / Higher / Applicationprogressive(30Hz) / progressive(30Hz) / progressive (60Hz) / migration to HR progressive HDTV
interlace(30Hz) / interlace(30Hz) / progressive (60Hz) / migration to HR progressive HDTV
Scalability in H.263
Figure 9.30 B-picture prediction dependency in the temporal scalability
SNR scalability
Figure 9.31 Prediction flow in SNR scalability
Spatial scalability
Figure 9.32 Prediction flow in an spatial scalability
Multilayer scalability
Figure 9.33 Positions of the base and enhancement-layer pictures in a multilayer scalable bit-stream
Transmission order of pictures
Figure 9.34 Example of picture transmission order
Numbers next to each picture indicate the bit stream order, separated by commas for the two alternatives.