9.1 How Does H.263 Differ from H.261 and MPEG-1?

Chapter 9 H.263

9.1 How does H.263 differ from H.261 and MPEG-1?

9.1.1 Coding of H.263 coefficients
Three-dimensional event of (last, run, level)

9.1.2 Coding of motion vectors
MVDx = MVx - predx
MVDy = MVy - predy    (9.2)
Figure 9.1 Motion vector prediction
Figure 9.2 Motion vector prediction for the border macroblocks

9.1.3 Source pictures
Table 9.1 Number of pixels per line and number of lines per picture for each of the H.263 picture formats

Picture format  Luminance pixels per line  Luminance lines per picture  Chrominance pixels per line  Chrominance lines per picture
subQCIF         128                        96                           64                           48
QCIF            176                        144                          88                           72
CIF             352                        288                          176                          144
4CIF            704                        576                          352                          288
16CIF           1408                       1152                         704                          576

Extensions of H.263 (H.263+, H.263++)
This has led to H.26L and eventually to H.264/AVC/MPEG4-V10.

Scope and goals of H.263+
The expected enhancements of H.263+ over H.263 fall into two basic categories:
- enhancing quality within existing applications
- broadening the current range of applications.
A few examples of the enhancements are:
- improving perceptual compression efficiency
- reducing video coding delay
- providing greater resilience to bit errors and data losses.

Scope and goals of H.26L
- enhanced visual quality at very low bit rates and particularly at PSTN rates (e.g. at rates below 24 kbit/s)
- enhanced error robustness in order to accommodate the higher error rates experienced when operating, for example, over mobile links
- low complexity appropriate for small, relatively inexpensive, audio-visual terminals
- low end-to-end delay as required in bidirectional personal communications.
In addition, the group worked closely with the MPEG-4 experts group to include new coding methods and promote interoperability. This work was formally ratified in 2003 and named H.264 in ITU-T and MPEG-4 Part 10 in ISO/IEC. Details of this codec are covered in Chapter 11.
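As an illustration of Equation (9.2) in Section 9.1.2, the sketch below forms a motion vector difference from a predictor. It assumes the predictor is the component-wise median of three candidate vectors MV1, MV2 and MV3 taken from neighbouring macroblocks, which is how H.263 is generally described as forming it. The function names and sample vectors are hypothetical, and the border cases of Figure 9.2 are not modelled.

```python
def median3(a, b, c):
    """Return the median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of three candidate predictors, each an (x, y) pair.

    Assumption: the median is taken separately for x and y.
    """
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))


def mv_difference(mv, pred):
    """Equation (9.2): MVDx = MVx - predx, MVDy = MVy - predy."""
    return (mv[0] - pred[0], mv[1] - pred[1])


def reconstruct_mv(mvd, pred):
    """Decoder side: add the predictor back to the received difference."""
    return (mvd[0] + pred[0], mvd[1] + pred[1])


# Hypothetical candidate vectors and current vector (illustrative values only)
mv1, mv2, mv3 = (2, -1), (4, 0), (3, 5)
current = (5, 1)

pred = predict_mv(mv1, mv2, mv3)
mvd = mv_difference(current, pred)
print(pred, mvd, reconstruct_mv(mvd, pred))
```

Only the difference `mvd` needs to be entropy coded; the decoder forms the same predictor from vectors it has already decoded and adds it back.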
Optional modes of H.263

Advanced motion estimation/compensation

9.4.1.1 Four motion vectors per macroblock
Figure 9.3 Redefinition of the candidate predictors MV1, MV2 and MV3 for each luminance block in a macroblock

9.4.1.2 Overlapped motion compensation
Figure 9.4 Weighting values for prediction with motion vectors of the luminance blocks on top or bottom of the current luminance block, H1(i,j)
Figure 9.5 Weighting values for prediction with motion vectors of luminance blocks to the left or right of current luminance block, H2(i,j)
The creation of each interpolated (overlapped) pixel, p(i,j), in a reference luminance block is governed by: (9.3)
where q(i,j), r(i,j) and s(i,j) are the motion compensated pixels from the reference picture with the three motion vectors defined by:
Figure 9.6 Weighting values for prediction with motion vector of current block, H0(i,j)

9.4.2 Importance of motion estimation
Figure 9.7 PSNR of Claire sequence coded at 256 kbit/s, with MPEG-1, H.261 and H.263
- motion compensation on smaller block sizes of pixels results in smaller error signals than for the macroblock compensation used in the other codecs
- overlapped motion compensation; by removing the blocking artefacts on the block boundaries, the prediction picture has a better quality, so reducing the error signal, and hence the number of significant DCT coefficients
- efficient coding of DCT coefficients through three-dimensional (last, run, level)
- efficient representation of the combined macroblock type and block patterns.

Deblocking filter
Figure 9.8 Filtering of pixels at the block boundaries
Figure 9.9 d1 as a function of d

Motion estimation/compensation with spatial transforms
Figure 9.10 Mapping of a block to a quadrilateral
One of the methods for this purpose is the bilinear transform, defined as [11]:
Figure 9.11 Intensity interpolation of a nongrid pixel
(9.7)
which is simplified to
Figure 9.12 Reconstructed pictures obtained by operating the BMST and BMA motion vectors individually
Figure 9.13 Frame by frame reconstruction of the pictures by BMST
Figure 9.14 Mesh-based motion compensation: (a) mesh, (b) motion compensated picture
Figure 9.15 Performance of spatial transform motion compensation

9.5 Treatment of B-pictures
Figure 9.16 Prediction in PB frames mode

9.5.1.1 Macroblock type
One of the modes of the MCBPC is the intra macroblock type, which has the following meaning:
- the P-blocks are intra coded
- the B-blocks are inter coded with prediction as for an inter block.

9.5.1.2 Motion vectors for B-pictures in PB frames

9.5.1.3 Prediction for a B-block in PB frames
Figure 9.17 Forward and bi-directional prediction for a B-block

9.5.2 Improved PB frames
Bidirectional prediction: in the bidirectional prediction mode, prediction uses the reference pictures before and after the BPB-picture.
Forward prediction: in the forward prediction mode, the vector data contained in MVDB are used for forward prediction from the previous reference picture (an intra or inter picture, or the P-picture part of PB or improved PB frames).
Backward prediction: in the backward prediction mode, the prediction of the BPB-macroblock is identical to BREC of normal PB frames mode.
No motion vector data is used for the backward prediction.

9.5.3 Quantisation of B-pictures
Table 9.2 dbquant codes and relation between quant and bquant

dbquant  bquant
00
01
10
11

10.4 Advanced variable length coding

10.4.1 Syntax-based arithmetic coding (more in H.264)

10.4.2 Reversible variable length coding
Figure 9.18 A reversible VLC

10.4.3 Resynchronisation markers

10.4.4 Advanced Intra/Inter VLC

10.4.4.1 Advanced intra coding
Scanning
Figure 9.19 Alternate scans: (a) horizontal, (b) vertical
Figure 9.20 Three neighbouring blocks in the DCT domain
The reconstruction for each coding mode is given by:
Mode 0: DC prediction only (9.9)
Mode 1: DC and AC prediction from the block above (9.10)
Mode 2: DC and AC prediction from the block to the left (9.11)

10.4.4.2 Advanced inter coding with switching between two VLC tables
The outcome of this annex [25-S], in a different form, is also used in H.264. In H.264, rather than switching between two VLC tables, selection is made among 11 VLC tables. The most suitable table is selected according to the context of the data to be coded. This is called context adaptive VLC (CAVLC), which is explained in detail in Chapter 11.

10.5 Protection against error

10.5.1 Forward error correction
To allow the video data and error correction parity information to be identified by the decoder, an error correction framing pattern is included. This pattern consists of multiframes of eight frames, each frame comprising 1 framing bit, 1 fill indicator (FI) bit, 492 bits of coded data and 18 parity bits. One bit from each of the eight frames provides the frame alignment pattern (S1S2S3S4S5S6S7S8) = (00011011), which helps the decoder to resynchronise itself after the occurrence of errors.
The error detection/correction code is a BCH (511, 493) code [16]. The parity is calculated against a code of 493 bits, comprising the 1-bit fill indicator (FI) and 492 bits of coded video data.
The generator polynomial is given by: (9.12)
The parity bits are calculated by shifting the 493 data bits (the video data including the fill bit) left by 18 bits and dividing the result, modulo 2, by this generator polynomial. Since the generator polynomial is a 19-bit polynomial, the remainder is an 18-bit binary number (which is why the data bits had to be shifted left by 18 bits), and this remainder is used as the parity. For example, for the input data of 01111...11 (493 bits), the resulting correction parity bits are 011011010100011011 (18 bits). The encoder appends these 18 bits to the 493 data bits, and the whole 511 bits are sent to the receiver as one block of data. This 511-bit block is exactly divisible by the generator polynomial, so the remainder is zero. The receiver can therefore perform the same division, and any nonzero remainder indicates a channel error. This is a robust form of error detection, since bursts of errors can also be detected.

10.5.2 Back channel
T: Transform
Q: Quantiser
CC: Coding control
P: Picture memory with motion compensated variable delay
AP: Additional picture memory
v: Motion vector
p: Flag for INTRA/INTER
t: Flag for transmitted or not
qz: Quantisation indication
q: Quantisation index for DCT coefficients
Figure 9.21 An encoder with multiple reference pictures
A positive acknowledgment (ACK) or a negative acknowledgment (NACK) is returned depending on whether the decoder successfully decodes a GOB.
An improved version of this annex [25-N] is used in H.264, under the concept of multiple reference frames. It is used both to improve compression efficiency, by searching for better motion vectors among several frames, and to improve error robustness, as discussed here. Multiple reference motion compensation and error-free selection of the reference frame are studied in some detail in Chapter 11.
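The parity calculation described in Section 10.5.1 can be sketched as modulo-2 polynomial division. This is a minimal sketch and not a conformant implementation. The generator polynomial is not reproduced in the text above, so the sketch assumes the one given in the ITU-T Recommendations for this BCH (511, 493) code, g(x) = (x^9 + x^4 + 1)(x^9 + x^6 + x^4 + x^3 + 1). It also assumes that the first transmitted bit is the highest-order coefficient. All function names are hypothetical.

```python
def gf2_multiply(a, b):
    """Multiply two GF(2) polynomials stored as integers (bit i = coefficient of x^i)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def gf2_remainder(value, generator):
    """Remainder of the modulo-2 division of value by generator."""
    degree = generator.bit_length() - 1
    while value.bit_length() - 1 >= degree:
        value ^= generator << (value.bit_length() - 1 - degree)
    return value


# Assumed generator: (x^9 + x^4 + 1)(x^9 + x^6 + x^4 + x^3 + 1), a 19-bit polynomial
GENERATOR = gf2_multiply(0b1000010001, 0b1001011001)
PARITY_BITS = GENERATOR.bit_length() - 1  # degree 18, hence 18 parity bits


def encode_block(data_493):
    """Shift the 493 data bits left by 18 and append the 18-bit remainder."""
    shifted = data_493 << PARITY_BITS
    return shifted | gf2_remainder(shifted, GENERATOR)


def has_error(block_511):
    """A nonzero remainder at the receiver signals a channel error."""
    return gf2_remainder(block_511, GENERATOR) != 0


# Fill bit 0 followed by 492 ones, the input pattern quoted in the text
data = (1 << 492) - 1
block = encode_block(data)
corrupted = block ^ (0b111 << 100)  # a hypothetical 3-bit burst error

print(has_error(block), has_error(corrupted))
```

Whether the parity of `block` matches the 18 bits quoted in the text depends on the bit-ordering convention assumed here. The detection property does not: the encoded block always leaves a zero remainder, and any burst no longer than 18 bits leaves a nonzero one.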
10.5.3 Data partitioning
Figure 9.22 Effects of errors on (a) with and (b) without data partitioning

Table 9.3 VLC and RVLC bits of MCBPC (for I-pictures)

Index  MB type      CBPC  Normal VLC  RVLC
0      3 (Intra)    00    1           1
1      3            01    001         010
2      3            10    010         0110
3      3            11    011         01110
4      4 (Intra+Q)  00    0001        00100
5      4            01    000001      011110
6      4            10    000010      001100
7      4            11    000011      0111110

Table 9.4 Number of bits per slice for data partitioning

          Data partitioning
Slice No  Slice header  MV  Coeff  SUM   Normal
1         52            30  211    293   269
2         63            34  506    603   571
3         45            42  748    835   803
4         48            42  1025   1115  1083
5         45            71  959    1075  1043
6         41            46  844    931   899
7         48            34  425    507   475
8         51            32  408    491   459
9         38            24  221    283   251

Figure 9.23 Error in a bit stream
(9.13)
where N is the total number of pixels at the upper and lower borders of the MB.
Figure 9.24 Pixels at the boundary of (a) a macroblock and (b) four blocks
Figure 9.25 Step-by-step decoding and skipping of bits in the bit stream

10.5.4 Error concealment

10.5.4.1 Intraframe error concealment
Figure 9.26 An example of intraframe error concealment

9.7.5.1 Interframe error concealment
Figure 9.27 A grid of macroblocks in the current and previous frame
Zero mv
Previous mv
Top mv
Mean mv; (9.14)
Majority mv; N  6 (9.15)
Vector median mv: the j-th motion vector, mvj, is the median if: (9.16)

Table 9.5 PSNR [dB] of various error concealment methods at 5 frames/s (64 kbit/s, 5 fps, QCIF)

Type       Seq-1  Seq-2  Seq-3  Seq-4
Zero       17.04  18.15  14.54  13.08
Previous   17.28  18.34  14.53  13.48
Top        19.27  21.08  17.04  16.25
Average    19.18  21.74  17.51  16.18
Majority   19.35  21.83  17.89  16.61
Median     19.87  22.52  18.29  16.89
No Errors  22.57  26.94  20.85  19.88

Table 9.6 PSNR [dB] of the various error concealment methods at 12.5 frames/s (64 kbit/s, 12.5 fps, QCIF)

Type       Seq-1  Seq-2  Seq-3  Seq-4
Zero       20.62  22.11  18.64  16.62
Previous   22.49  22.19  18.19  16.53
Top        22.97  25.16  20.97  20.04
Average    22.92  25.99  21.24  20.08
Majority   23.33  26.32  21.57  20.36
Median     24.36  26.72  22.16  20.81
No Errors  26.12  29.69  23.93  23.04

Figure 9.28 An erroneous picture along with its error-concealed version

Bidirectional mv
Figure 9.29 A group of alternate P and B pictures
To estimate a missing motion vector for a P-picture, say P31, the available motion vectors at the same spatial coordinates of the B-pictures can be used, with the following substitutions:
- if only B23 is available, then P31 = 2 × B23
- if only B21 is available, then P31 = -2 × B21
- if both B23 and B21 are available, then P31 = B23 - B21
- if neither is available, then set P31 = 0.
To estimate a missing motion vector of a B-picture, simply divide that of the P-picture by two: B23 = ½P31 or B21 = -½P31.

9.7.5.2 Loss concealment

9.7.5.3 Selection of best estimated motion vector

Scalability
Figure 8.11 Block diagram of a two-layer SNR scalable coder
Figure 8.12 A DCT based base layer encoder
Figure 8.13 A two-layer SNR scalable encoder with drift at the base layer
Figure 8.14 A three-layer drift-free SNR scalable encoder
Figure 8.15 A block diagram of a three-layer SNR decoder
Figure 8.16 Picture quality of the base layer of SNR encoder at 2 Mbit/s; base + enhancement at 8 Mbit/s

Spatial scalability
Figure 8.17 Block diagram of a two-layer spatial scalable encoder
Figure 8.19 Details of spatial scalability encoder
Figure 8.20 (a) Base layer picture of a spatial scalable encoder at 2 Mbit/s, and (b) its enlarged version

Temporal scalability
Figure 8.21 A block diagram of a two-layer temporal scalable encoder

Hybrid scalability
Figure 8.25 SNR, spatial and temporal hybrid scalability encoder

Overhead due to scalability
Figure 8.26 Increase in bit rate due to scalability

Table 8.4 Applications of SNR scalability

Base layer             Enhancement layer                          Application
ITU-R-601              same resolution and format as lower layer  two quality service for standard TV
high definition        same resolution and format as lower layer  two quality service for HDTV
4:2:0 high definition  4:2:2 chroma simulcast                     video production/distribution

Table 8.5 Applications of spatial scalability

Base                 Enhancement          Application
progressive (30 Hz)  progressive (30 Hz)  CIF/QCIF compatibility or scalability
interlace (30 Hz)    interlace (30 Hz)    HDTV/SDTV scalability
progressive (30 Hz)  interlace (30 Hz)    ISO/IEC 11172-2/compatibility with this specification
interlace (30 Hz)    progressive (60 Hz)  migration to HR progressive HDTV

Table 8.6 Applications of temporal scalability

Base                 Enhancement          Higher               Application
progressive (30 Hz)  progressive (30 Hz)  progressive (60 Hz)  migration to HR progressive HDTV
interlace (30 Hz)    interlace (30 Hz)    progressive (60 Hz)  migration to HR progressive HDTV

Scalability in H.263
Figure 9.30 B-picture prediction dependency in the temporal scalability

SNR scalability
Figure 9.31 Prediction flow in SNR scalability

Spatial scalability
Figure 9.32 Prediction flow in spatial scalability

Multilayer scalability
Figure 9.33 Positions of the base and enhancement-layer pictures in a multilayer scalable bit-stream

Transmission order of pictures
Figure 9.34 Example of picture transmission order. Numbers next to each picture indicate the bit stream order, separated by commas for the two alternatives.
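Returning to the interframe concealment candidates of Section 9.7.5.1, the sketch below illustrates two of them: the vector median and the bidirectional substitution rules for a missing P31 vector. Equation (9.16) is not reproduced in the text, so the usual vector-median definition is assumed (the candidate whose summed Euclidean distance to all the others is smallest). The substitution rules follow the list in the text. Function names and sample vectors are hypothetical.

```python
import math


def vector_median(candidates):
    """Pick the candidate with the smallest summed distance to all candidates.

    Assumption: Euclidean distance is used as the norm.
    """
    def total_distance(v):
        return sum(math.hypot(v[0] - o[0], v[1] - o[1]) for o in candidates)
    return min(candidates, key=total_distance)


def estimate_p_vector(b23=None, b21=None):
    """Estimate a missing P31 vector from the co-located B-picture vectors."""
    if b23 is not None and b21 is not None:
        return (b23[0] - b21[0], b23[1] - b21[1])   # P31 = B23 - B21
    if b23 is not None:
        return (2 * b23[0], 2 * b23[1])             # P31 = 2 x B23
    if b21 is not None:
        return (-2 * b21[0], -2 * b21[1])           # P31 = -2 x B21
    return (0, 0)                                   # nothing available


# Hypothetical neighbouring vectors; (10, 10) is an outlier
neighbours = [(0, 0), (1, 0), (2, 0), (10, 10)]
print(vector_median(neighbours))

print(estimate_p_vector(b23=(3, -2)))
print(estimate_p_vector(b21=(-1, 2)))
print(estimate_p_vector(b23=(3, -2), b21=(-1, 2)))
print(estimate_p_vector())
```

Unlike the mean, the vector median always returns one of the actual neighbouring vectors, so a single outlier cannot pull the estimate towards a vector that no neighbour has. This may be one reason the median rows lead Tables 9.5 and 9.6.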
