<p>Chapter 9 H.263</p><p>9.1 How does H.263 differ from H.261 and MPEG-1? 9.1.1 Coding of H.263 coefficients three-dimensional event of (last, run, level)</p><p>9.1.2 Coding of motion vectors</p><p>MVDx = MVx – predx</p><p>MVDy = MVy – predy (9.2)</p><p>Figure 9.1 Motion vector prediction</p><p>Figure 9.2 Motion vector prediction for the border macroblocks</p><p>9.1.3 Source pictures 10 Table 9.1 Number of pixels per line and number of lines per picture for each of the H.263 picture formats Picture Number of Number of Number of Number of format pixels for lines for pixels for lines for luminance per luminance per chrominance chrominance line picture per line per picture subQCIF 128 96 64 48 QCIF 176 144 88 72 CIF 352 288 176 144 4CIF 704 576 352 288 16CIF 1408 1152 704 576 Extensions of H.263 (H.263+ , H.263++) This has led to H.26L and eventually to H.264/AVC/MPEG4-V10</p><p>Scope and goals of H.263+ The expected enhancements of H.263+ over H.263 fall into two basic categories:</p><p> enhancing quality within existing applications broadening the current range of applications. A few examples of the enhancements are:</p><p> improving perceptual compression efficiency reducing video coding delay providing greater resilience to bit errors and data losses.</p><p>Scopes and goals of H.26L enhanced visual quality at very low bit rates and particularly at PSTN rates (e.g. at rates below 24 kbit/s) enhanced error robustness in order to accommodate the higher error rates experienced when operating for example over mobile links. low complexity appropriate for small, relatively inexpensive, audio-visual terminals low end-to-end delay as required in bidirectional personal communications. In addition, the group was closely working with the MPEG-4 experts group, to include new coding methods and promote interoperability. This work was formally ratified in 2003 and named H.264 in ITU-T and MPEG_4 Part-10 in ISO/IEC. Details of this codec will be covered in Chapter 11. Optional modes of H.263</p><p>Advanced motion estimation/compensation</p><p>9.4.1.1 Four motion vectors per macroblock</p><p>Figure 9.3 Redefinition of the candidate predictors MV1, MV2 and MV3 for each luminance block in a macroblock</p><p>9.4.1.2 Overlapped motion compensation</p><p>Figure 9.4 Weighting values for prediction with motion vectors of the luminance blocks on top or bottom of the current luminance block, H1(i,j) Figure 9.5 Weighting values for prediction with motion vectors of luminance blocks to the left or right of current luminance block, H2(i,j)</p><p>The creation of each interpolated (overlapped) pixel, p(i,j), in an reference luminance block is governed by:</p><p>(9.3) where q(i,j), r(i,j) and s(i,j) are the motion compensated pixels from the reference picture with the three motion vectors defined by:</p><p>Figure 9.6 Weighting values for prediction with motion vector of current block, H0(i,j)</p><p>9.4.2 Importance of motion estimation Figure 9.7 PSNR of Claire sequence coded at 256 kbit/s, with MPEG-1, H.261 and H.263</p><p> motion compensation on smaller block sizes of pixels results in smaller error signals than for the macroblock compensation used in the other codecs overlapped motion compensation; by removing the blocking artefacts on the block boundaries, the prediction picture has a better quality, so reducing the error signal, and hence the number of significant DCT coefficients efficient coding of DCT coefficients through three-dimensional (last, run, level) efficient representation of the combined macroblock type and block patterns.</p><p>Deblocking filter </p><p>Figure 9.8 Filtering of pixels at the block boundaries Figure 9.9 d1 as a function of d</p><p>Motion estimation/compensation with spatial transforms</p><p>Figure 9.10 Mapping of a block to a quadrilateral</p><p>One of the methods for this purpose is the bilinear transform, defined as [11]: Figure 9.11 Intensity interpolation of a nongrid pixel</p><p>(9.7) which is simplified to</p><p>Figure 9.12 Reconstructed pictures with operating individually the BMST and BMA motion vectors.</p><p>Figure 9.13. Frame by frame reconstruction of the pictures by BMST</p><p>Figure 9.14 Mesh-based motion compensation a mesh b motion compensated picture Figure 9.15 Performance of spatial transform motion compensation</p><p>9.5 Treatment of B-pictures</p><p>Figure 9.16 Prediction in PB frames mode</p><p>9.5.1.1 Macroblock type One of the modes of the MCBPC is the intra macroblock type that has the following meaning:</p><p> the P-blocks are intra coded the B-blocks are inter coded with prediction as for an intra block.</p><p>9.5.1.2 Motion vectors for B-pictures in PB frames </p><p>9.5.1.3 Prediction for a B-block in PB frames </p><p>Figure 9.17 Forward and bi-directional prediction for a B-block</p><p>9.5.2 Improved PB frames Bidirectional prediction: in the bidirectional prediction mode, prediction uses the reference pictures before and after the BPB-picture</p><p>Forward prediction: In the forward prediction mode the vector data contained in MVDB are used for forward prediction from the previous reference picture (an intra or inter picture, or the P-picture part of PB or improved PB frames).</p><p>Backward prediction: In the backward prediction mode the prediction of BPB-macroblock is identical to</p><p>BREC of normal PB frames mode. No motion vector data is used for the backward prediction.</p><p>9.5.3 Quantisation of B-pictures 10 Table 9.2 dbquant codes and relation between quant and bquant dbquant Bquant 00 01 10 11</p><p>10.4 Advanced variable length coding 10.4.1 Syntax-based arithmetic coding (More in H.264) 10.4.2 Reversible variable length coding </p><p>Figure 9.18 A reversible VLC</p><p>10.4.3 Resynchronisation markers 10.4.4 Advanced Intra/Inter VLC</p><p>10.4.4.1 Advanced intra coding Scanning</p><p>Figure 9.19 Alternate scans: (a) horizontal, (b) vertical</p><p>Figure 9.20 Three neighbouring blocks in the DCT domain</p><p>The reconstruction for each coding mode is given by: Mode 0: DC prediction only</p><p>(9.9)</p><p>Mode 1: DC and AC prediction from the block above</p><p>(9.10)</p><p>Mode 2: DC and AC prediction from the block to the left</p><p>(9.11)</p><p>10.4.4.2 Advanced inter coding with switching between two VLC tables</p><p>The outcome of this annex [25-S] in a different form is also used in H.264. In H.264, rather than switching between two VLC tables, selection is made among 11 VLC tables. Decision for selection of the most suitable table is based on the context of data to be coded. This is called context adaptive VLC (CAVLC), to be explained in great details in Chapter 11.</p><p>10.5 Protection against error 10.5.1 Forward error correction To allow the video data and error correction parity information to be identified by the decoder, an error correction framing pattern is included. This pattern consists of multiframes of eight frames, each frame comprising 1 bit framing, 1 bit fill indicator (FI), 492 bits of coded data and 18 bits parity. One bit from each one of the eight frames provide the frame alignment pattern of (S1S2S3S4S5S6S7S8)=(00011011) that will help the decoder to resynchronise itself after the occurrence of errors.</p><p>The error detection/correction code is a BCH (511, 493) [16]. The parity is calculated against a code of 493 bits, comprising a bit fill indicator (FI) and 492 bits of coded video data. The generator polynomial is given by:</p><p>(9.12)</p><p>The parity bits are calculated by dividing the 493 bits (left shifted by 18 bits) of the video data (including the fill bit) to this generating function. Since the generating function is a 19-bit polynomial, the remainder will be an 18-bit binary number (that is why data bits had to be shifted by 18 bits to the left), to be used as the parity bits. For example, for the input data of 01111...11 (493 bits), the resulting correction parity bits are 011011010100011011 (18 bits). The encoder appends these 18 bits to the 493 data bits, and the whole 511-bits are sent to the receiver as a block of data. Now this 511 bit data is exactly divisible to the generating function, and the remainder will be zero. Thus, the receiver can perform a similar division, and if there is any remainder, it is an indication of channel error. This is a very robust form of error detection, since burst of errors can also be detected. </p><p>10.5.2 Back channel T: Transform Q: Quantiser CC: Coding control P: Picture memory with motion compensated variable delay AP: Additional picture memory v: Motion vector p: Flag for INTRA/INTER t: Flag for transmitted or not qz: Quantisation indication q: Quantisation index for DCT coefficients</p><p>Figure 9.21 An encoder with multiple reference pictures</p><p>A positive acknowledgment (ACK) or a negative acknowledgment (NACK) is returned depending on whether the decoder successfully decodes a GOB. </p><p>An improved version of this annex [25-N] is used in H.264, under the concept of multiple reference frames. It is used both to improve compression efficiency, by searching for better motion vectors among several frames, as well as error robustness as discussed here. Multiple reference motion compensation and error free selection of reference frame will be studied in some details in Chapter 11. </p><p>10.5.3 Data partitioning</p><p>Figure 9.22 Effects of errors on (a) with and (b) without data partitioning</p><p>Table 9.3 VLC and RVLC bits of MCBPC (for I-pictures) Index MB type CBPC Normal VLC RVLC</p><p>0 3 (Intra) 00 1 1</p><p>1 3 01 001 010 2 3 10 010 0110</p><p>3 3 11 011 01110</p><p>4 4 (Intra+Q) 00 0001 00100</p><p>5 4 01 000001 011110</p><p>6 4 10 000010 001100</p><p>7 4 11 000011 0111110</p><p>Table 9.4 Number of bits per slice for data partitioning Data Partitioning</p><p>Slice No Slice header MV Coeff SUM Normal</p><p>1 52 30 211 293 269</p><p>2 63 34 506 603 571</p><p>3 45 42 748 835 803</p><p>4 48 42 1025 1115 1083</p><p>5 45 71 959 1075 1043</p><p>6 41 46 844 931 899</p><p>7 48 34 425 507 475</p><p>8 51 32 408 491 459</p><p>9 38 24 221 283 251 Figure 9.23 Error in a bit stream</p><p>:</p><p>(9.13) where N is the total number of pixels at the upper and lower borders of the MB.</p><p>Figure 9.24 Pixels at the boundary of (a) a macroblock and (b) four blocks Figure 9.25 Step-by-step decoding and skipping of bits in the bit stream</p><p>10.5.4 Error concealment 10.5.4.1 Intraframe error concealment</p><p>Figure 9.26 An example of Intraframe error concealment</p><p>9.7.5.1 Interframe error concealment Figure 9.27 A grid of macroblocks in the current and previous frame</p><p>Zero mv</p><p>Previous mv</p><p>Top mv</p><p>Mean mv</p><p>; (9.14)</p><p>Majority mv</p><p>; N 6 (9.15)</p><p>Vector median mv</p><p> th the j motion vector, mvj, is the median if:</p><p>(9.16)</p><p>Table 9.5 PSNR [dB] of various error concealment methods at 5 frames/s. Type 64Kbps, 5 fps, QCIF</p><p>Sequence Seq-1 Seq-2 Seq-3 Seq-4</p><p>Zero 17.04 18.15 14.54 13.08</p><p>Previous 17.28 18.34 14.53 13.48</p><p>Top 19.27 21.08 17.04 16.25</p><p>Average 19.18 21.74 17.51 16.18</p><p>Majority 19.35 21.83 17.89 16.61</p><p>Median 19.87 22.52 18.29 16.89</p><p>No Errors 22.57 26.94 20.85 19.88</p><p>Table 9.6 PSNR [dB] of the various error concealment methods at 12.5 frames/s. Type 64Kbps, 12.5 fps, QCIF</p><p>Sequence Seq-1 Seq-2 Seq-3 Seq-4</p><p>Zero 20.62 22.11 18.64 16.62</p><p>Previous 22.49 22.19 18.19 16.53</p><p>Top 22.97 25.16 20.97 20.04</p><p>Average 22.92 25.99 21.24 20.08</p><p>Majority 23.33 26.32 21.57 20.36</p><p>Median 24.36 26.72 22.16 20.81</p><p>No Errors 26.12 29.69 23.93 23.04</p><p>Figure 9.28, An erroneous picture along with its error concealed one</p><p>Bidirectional mv Figure 9.29 A group of alternate P and B pictures</p><p>To estimate a missing motion vector for a P-picture, say P31, the available motion vectors of the same spatial coordinates of the B-pictures can be used, with the following substitutions:</p><p> if only B23 is available, then P31 = 2 × B23</p><p> if only B21 is available, then P31 = -2 × B21</p><p> if both B23 and B21 are available, then P31 = B23 - B21</p><p>If none of them are available, then set P31 = 0 </p><p>To estimate a missing motion vector of a B-picture, simply divide that of the P-picture by two; </p><p>B23=½P31 or B21= -½P31</p><p>9.7.5.2 Loss concealment 9.7.5.3 Selection of best estimated motion vector</p><p>Scalability Figure 8.11 Block diagram of a two-layer SNR scalable coder</p><p>Figure 8.12 A DCT based base layer encoder</p><p>Figure 8.13 A two-layer SNR scalable encoder with drift at the base layer Figure 8.14 A three layer drift free SNR scalable encoder Figure 8.15 A block diagram of a three-layer SNR decoder</p><p>Figure 8.16 Picture quality of the base layer of SNR encoder at 2 Mbit/s Base + Enhancement @ 8 Mbit/s</p><p>Spatial scalability</p><p>Figure 8.17 Block diagram of a two-layer spatial scalable encoder Figure 8.19 Details of spatial scalability encoder</p><p>(a) (b) Figure 8.20 (a) Base layer picture of a spatial scalable encoder at 2 Mbit/s , and (b) its enlarged version</p><p>Temporal scalability</p><p>Figure 8.21 A block diagram of a two-layer temporal scalable encoder Hybrid scalability Figure 8.25 SNR, spatial and temporal hybrid scalability encoder</p><p>Overhead due to scalability</p><p>Figure 8.26 Increase in bit rate due to scalability</p><p>Table 8.4 Applications of SNR scalability Base layer Enhancement layer Application ITU-R-601 same resolution and two quality service for format as lower layer standard TV high definition same resolution and two quality service for HDTV format as lower layer 4:2:0 high definition 4:2:2 chroma simulcast video production/distribution</p><p>Table 8.5 Applications of spatial scalability Base Enhancement Application progressive(30Hz) progressive(30Hz) CIF/QCIF compatibility or scalability interlace(30Hz) interlace(30Hz) HDTV/SDTV scalability progressive(30Hz) interlace(30Hz) ISO/IECE11172-2/compatibility with this specification interlace(30Hz) progressive(60Hz) Migration to HR progressive HDTV</p><p>Table 8.6 Applications of temporal scalability Base Enhancement Higher Application progressive(30Hz) progressive(30Hz) progressive migration to HR (60Hz) progressive HDTV interlace(30Hz) interlace(30Hz) progressive migration to HR (60Hz) progressive HDTV</p><p>Scalability in H.263</p><p>Figure 9.30 B-picture prediction dependency in the temporal scalability SNR scalability</p><p>Figure 9.31 Prediction flow in SNR scalability</p><p>Spatial scalability Figure 9.32 Prediction flow in an spatial scalability</p><p>Multilayer scalability</p><p>Figure 9.33 Positions of the base and enhancement-layer pictures in a multilayer scalable bit-stream</p><p>Transmission order of pictures Figure 9.34 Example of picture transmission order Numbers next to each picture indicate the bit stream order, separated by commas for the two alternatives.</p>
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages32 Page
-
File Size-