Paper

An MPEG-2 to H.264 Intra Transcoding Method using Adaptive Macroblock Pair Type Selection

Takeshi YOSHITOME (Member), Kazuto KAMIKURA NTT Cyber Space Laboratories, NTT Corporation

〈Summary〉 We propose an MPEG-2 to H.264 intra transcoding method for interlaced streams in which frame and field macroblocks (MBs) are intermingled, improving on our previous work. The method uses adaptive MB pair type selection to keep as many discrete cosine transform (DCT) coefficients of the original MPEG-2 bitstream as possible, which limits the re-quantization noise mixed into the transcoded stream. Experimental results show that the proposed method improves the peak signal-to-noise ratio (PSNR) by about 0.33–1.55 dB over the conventional method.
Keywords: transcoding, MPEG-2, H.264/AVC, adaptive MB pair type selection

1. Introduction

High quality transcoding will be one of the most important technologies for visual applications in the future. For example, a video service using terrestrial digital broadcasting for Internet protocol (IP) network users has started in Japan 1). In this service, the MPEG-2 2) compressed digital broadcasting signals are sent to a network center, transcoded to H.264 3), and transferred to network users. The most important factor in such transcoding services is reducing re-quantization noise during the transcoding procedure. It is well known that repeating all decisions of the first encoding procedure during the transcoding procedure reduces re-quantization noise 4). This noise suppression mechanism is effective when the compression standards of the first and second encoding are identical, such as for MPEG-2/MPEG-2 or H.264/H.264 conversion. In our previous work 5),6), we extended this approach to MPEG-2/H.264 transcoding for progressive bit-streams. In this paper, we show how we used an adaptive macroblock (MB) pair type selection method to apply this noise suppression mechanism to interlaced intra bit-streams that consist of frame and field MBs. Reducing the number of bits in I-pictures is important for obtaining a smaller bitstream size because I-pictures have a larger number of bits than P- or B-pictures. Our proposed method is effective both for I-pictures and for the intra MBs included in P- and B-pictures. This method will also improve P- and B-picture image quality because these pictures refer to the high quality I-pictures that the method generates.

2. Mechanism of Using First Encoding Information

A data flow example of transcoding with first MPEG-2 encoding information is shown in Fig. 1. For simplicity, the macroblock (MB) size is 2 × 2. A quantization parameter is represented using the quantizer-scale-code (QSC) in MPEG-2, or using QP in H.264. Because this paper deals with both MPEG-2 and H.264, we use the quantization step Δ, which means the divisor of the DCT coefficient. For example, Δ = 16 corresponds to QSC = 12 in MPEG-2 when q_scale_type = 1, and to QP = 28 in H.264.

In the first MPEG-2 encoding, an input MB that includes the values (A, B, C, and D) is processed by orthogonal transformation. Then, the transformed

33 The Journal of the IIEEJ vol. 40 no. 1(2011)

discrete cosine transform (DCT) coefficients (44, 33, 21, and 12) are quantized by a quantization step Δ = 10. This first quantization process adds quantization noise to the input MB.

Fig. 1 Second encoding with first MPEG-2 encoding information

Table 1 Major differences between MPEG-2 and H.264

Item                                   MPEG-2    H.264                                                   H.264 re-enact
Block size                             16 × 16   4 × 4, 4 × 8, 8 × 4, 8 × 8, 8 × 16, 16 × 8, 16 × 16     OK
Vector resolution                      0.5 pix   0.25 pix                                                OK
Intra prediction                       no        yes                                                     ?
Orthogonal transformation              DCT       integer DCT                                             ?
Size of orthogonal transformation      8 × 8     8 × 8 (*1), 4 × 4                                       OK
Entropy coding                         VLC       CAVLC, CABAC                                            OK
Specification of quantization matrix   yes       yes (*1)                                                OK
Interval of quantization step          linear    log.                                                    ?
Specifications of Field/Frame MB       each MB   pair MBs                                                ?
(*1): only High-Profile
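The quantization in the Fig. 1 toy example can be illustrated with a minimal sketch. It assumes plain round-to-nearest quantization, which is a simplification and not the exact MPEG-2 or H.264 quantizer; the helper names `quantize` and `noise_power` and the mismatched step of 15 are hypothetical and chosen only for illustration. It shows that quantizing already-quantized coefficients again with the same step adds no further error, whereas a different step does.

```python
def quantize(coeffs, step):
    """Round each coefficient to the nearest multiple of `step` (simplified quantizer)."""
    return [step * round(c / step) for c in coeffs]


def noise_power(reference, quantized):
    """Sum of squared differences between two coefficient lists."""
    return sum((r - q) ** 2 for r, q in zip(reference, quantized))


# Transformed coefficients and first quantization step from the Fig. 1 example.
coeffs = [44, 33, 21, 12]
delta1 = 10

first = quantize(coeffs, delta1)      # first (MPEG-2) quantization
same_step = quantize(first, delta1)   # second encoding re-uses the same step
other_step = quantize(first, 15)      # hypothetical mismatched second step

print("after first quantization:", first)
print("noise added by first quantization:", noise_power(coeffs, first))
print("noise added by second pass, same step:", noise_power(first, same_step))
print("noise added by second pass, step 15:", noise_power(first, other_step))
```

Under these assumptions the second pass with the same step leaves every coefficient unchanged, which is the property the transcoder tries to reproduce across the two standards.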

In the second encoding, every type of encoding information, such as MB type, motion vector, DCT mode, and quantization step, is re-used. This re-use of encoding information does not produce any significant difference between the errors of the first encoding and those of the second encoding, nor does it produce any notable difference between the DCT coefficients of the two encodings. This means that the re-quantization noise added in the second encoding phase is almost zero. Unfortunately, there is no bit reduction because the output bitstream size of the first and second encodings is almost the same when these encodings make use of the same compression tool. But transcoding using different compression tools, such as those for MPEG-2 and H.264, may reduce the bitstream size because the performance of H.264's entropy coding (CABAC) is higher than that of MPEG-2's entropy coding (VLC).

3. Differences between MPEG-2 and H.264

As depicted in Fig. 1, there is a close resemblance between the decoded images of the first and second decoders when the compression standards of the first and second encodings are exactly the same. To reenact the effect of this noise reduction mechanism, we must control H.264 behavior to imitate MPEG-2 behavior as much as possible when the first encoder is MPEG-2 and the second encoder is H.264. The two encoders are similar but are not upward compatible. The differences between them are listed in Table 1. These differences should be absorbed to recreate this noise reduction mechanism. “OK” means that H.264 can recreate the same MPEG-2 function in the table. “?” means that H.264 can process the function only in a way that is similar, but not identical, to that of MPEG-2. For example, certain MPEG-2 quantization steps cannot be expressed exactly using H.264 steps because the former are defined on a linear interval and the latter on a logarithmic interval. The following four functions are not exactly the same between MPEG-2 and H.264: (1) intra prediction, (2) orthogonal transformation, (3) available quantization step, and (4) frame/field MB specification. The differences (1) to (3) were analyzed in our previous work 5),6). In the second half of this section, we describe the differences of (4) in detail and show a way to overcome them.

The DCT type specifications for MPEG-2 and


H.264 MBs are shown in Fig. 2. Although a DCT type can be specified for each MB independently in MPEG-2, H.264 specifies the DCT type for each pair of MBs that adjoin each other above and below. When an upper MB is specified as frame DCT and its lower MB is specified as field DCT in MPEG-2, the MPEG-2 DCT type of one of the two MBs necessarily differs from the DCT type specified in H.264, whether the pair is coded with frame DCT or field DCT.

Fig. 2 DCT type specification of macroblock in MPEG-2 and H.264

The noise reduction in the second encoder, mentioned in Section 2, decreases when many disagreements of DCT type exist within MB pairs. We therefore measured the amount of disagreement using the MPEG-2 bit-streams generated by five commercial encoders. These encoders are NEC's VC-5310, Mitsubishi's MH2200E, and three of NTT's encoders 7)–9). The percentage of MB pairs in which the upper MB has the same DCT type as the lower MB is shown in Table 2. The input images are 12 sequences included in the standard HDTV sequences. The average of the DCT type agreement between upper and lower MBs is about 86%. For the remaining 14%, agreement between MPEG-2 and H.264 is obtained for only one MB of the pair (the upper or the lower), so the overall MB agreement is 86% + 14%/2 = 93%. In other words, the quantization noise reduction is ineffective for only 7% of the MBs.

Table 2 Percentage of the DCT type agreement between upper and lower MBs in MPEG-2

No    Scene                   MPEG-2 encoder
                              A      B      C      D      E
1     Cognac and Fruit        84.9   84.5   84.0   89.7   84.8
8     Walk through the Sq.    83.8   85.9   82.8   89.7   83.5
10    Streetcar               84.4   85.3   82.7   82.0   81.2
14    Yacht Harbor            84.3   85.5   84.8   82.9   82.7
15    Yachting                89.2   84.3   83.3   89.6   84.7
16    Whale Show              87.3   88.2   87.6   88.1   87.6
19    Opening Ceremony        81.8   86.7   80.6   82.3   82.5
22    Marching in             82.5   88.9   82.5   83.6   81.9
23    Green Leaves            92.5   93.9   89.1   91.5   91.1
28    Summertime Tanning      89.6   88.8   87.9   89.7   88.0
46    Sprinkling              85.8   85.5   84.1   84.6   85.0
47    Picture Cuts            86.0   86.0   86.8   89.6   84.3
Ave                           86.0   87.0   86.7   86.9   84.8

4. Simple Transcoding Example

In this section, we show an actual transcoding example that indicates the re-quantization noise is reduced when the MPEG-2 DCT type is equal to the H.264 DCT type. The MPEG-2 compressed image using the quantization step Δ1 = 20 is shown at the right side of Fig. 3. We show the transcoding process for a certain MB pair, shown at the left side of the figure. The upper MB of this pair was transformed by MPEG-2's frame DCT, and the lower MB was transformed by field DCT. Let us assume this MB pair is transcoded using H.264's frame DCT. Then, Eq. (1) shows the 8 × 8 pixel values, Up, located at the top-left corner of the upper MB. The DCT coefficients of the upper MB, Ut, are shown in Eq. (2). These coefficients are quantized by the re-quantization step Δ2 = 20, which is equal to MPEG-2's quantization step Δ1. The re-quantized DCT coefficients, Uq, are shown in Eq. (3). All the Uq coefficients are multiples of 20 because the DCT coefficients Ut are quantized by 20. This process adds re-quantization noise. The mean square error of the 8 × 8 coefficients, U_MSE, is shown in Eq. (4). The sum total of its coefficients is 127.0.

Fig. 3 MPEG-2 compressed image and an MB pair for the simple transcoding example

The 8 × 8 pixel values Lp, located at the top-left corner of the lower MB, are also transcoded in the same manner. The DCT coefficients of the lower MB Lt, quantized DCT coefficients of the lower MB Lq,

and the mean square error of the lower MB, L_MSE, are shown in Eqs. (7), (8), and (9). The sum total of the L_MSE coefficients is 1214.8. Because the lower MB was transformed using field DCT in MPEG-2 and frame DCT in H.264, L_MSE becomes larger than U_MSE.

\[
U_p = \begin{pmatrix}
50.0 & 57.0 & 62.0 & 58.0 & 47.0 & 38.0 & 36.0 & 38.0 \\
79.0 & 83.0 & 85.0 & 81.0 & 68.0 & 55.0 & 43.0 & 38.0 \\
94.0 & 94.0 & 96.0 & 97.0 & 93.0 & 81.0 & 66.0 & 55.0 \\
103.0 & 96.0 & 91.0 & 91.0 & 91.0 & 84.0 & 67.0 & 53.0 \\
83.0 & 72.0 & 63.0 & 63.0 & 71.0 & 71.0 & 63.0 & 52.0 \\
50.0 & 41.0 & 33.0 & 38.0 & 50.0 & 60.0 & 58.0 & 54.0 \\
42.0 & 35.0 & 31.0 & 33.0 & 41.0 & 46.0 & 44.0 & 39.0 \\
29.0 & 28.0 & 30.0 & 37.0 & 45.0 & 49.0 & 46.0 & 42.0
\end{pmatrix} \tag{1}
\]

\[
U_t = \begin{pmatrix}
476.1 & 40.5 & -19.9 & 19.2 & 0.1 & -1.5 & -0.9 & 2.1 \\
78.5 & 58.9 & -19.8 & -21.5 & 0.3 & -0.2 & -1.5 & 2.1 \\
-99.7 & -18.8 & -0.8 & -19.3 & -0.2 & 1.3 & -0.0 & -1.8 \\
-61.7 & -21.2 & 20.4 & 1.0 & -0.4 & 0.0 & 1.3 & -1.1 \\
0.4 & 0.0 & -0.5 & 0.2 & -0.1 & 0.0 & -0.2 & 0.6 \\
1.2 & -0.2 & -0.3 & 0.6 & -0.1 & 0.3 & 0.3 & 0.4 \\
-7.0 & -21.2 & -0.3 & -1.0 & 0.5 & 0.1 & -0.2 & -1.4 \\
2.6 & 2.6 & -0.4 & -1.3 & 0.2 & 0.3 & -0.6 & 1.7
\end{pmatrix} \tag{2}
\]

\[
U_q = \begin{pmatrix}
480.0 & 40.0 & -20.0 & 20.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
80.0 & 60.0 & -20.0 & -20.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
-100.0 & -20.0 & 0.0 & -20.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
-60.0 & -20.0 & 20.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & -20.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0
\end{pmatrix} \tag{3}
\]

\[
U_{MSE} = \begin{pmatrix}
15.0 & 0.3 & 0.0 & 0.7 & 0.0 & 2.2 & 0.8 & 4.5 \\
2.3 & 1.1 & 0.0 & 2.2 & 0.1 & 0.0 & 2.3 & 4.2 \\
0.1 & 1.4 & 0.6 & 0.5 & 0.1 & 1.6 & 0.0 & 3.3 \\
2.9 & 1.3 & 0.1 & 1.0 & 0.2 & 0.0 & 1.7 & 1.1 \\
0.1 & 0.0 & 0.2 & 0.0 & 0.0 & 0.0 & 0.1 & 0.4 \\
1.5 & 0.0 & 0.1 & 0.3 & 0.0 & 0.1 & 0.1 & 0.1 \\
48.4 & 1.5 & 0.1 & 1.0 & 0.2 & 0.0 & 0.1 & 1.9 \\
6.9 & 6.7 & 0.2 & 1.8 & 0.0 & 0.1 & 0.3 & 3.0
\end{pmatrix} \tag{4}
\]

\[
\mathrm{NoisePower}(U_{MSE}) = \sum_{x=0}^{7} \sum_{y=0}^{7} U_{MSE}(x, y) \tag{5}
\]

\[
L_p = \begin{pmatrix}
45.0 & 43.0 & 41.0 & 40.0 & 40.0 & 41.0 & 44.0 & 45.0 \\
54.0 & 44.0 & 50.0 & 72.0 & 77.0 & 56.0 & 35.0 & 28.0 \\
38.0 & 43.0 & 50.0 & 55.0 & 56.0 & 52.0 & 46.0 & 42.0 \\
68.0 & 57.0 & 57.0 & 66.0 & 61.0 & 43.0 & 34.0 & 39.0 \\
27.0 & 34.0 & 44.0 & 51.0 & 53.0 & 49.0 & 42.0 & 36.0 \\
39.0 & 42.0 & 53.0 & 61.0 & 54.0 & 39.0 & 43.0 & 57.0 \\
40.0 & 41.0 & 43.0 & 45.0 & 47.0 & 49.0 & 50.0 & 51.0 \\
115.0 & 81.0 & 48.0 & 41.0 & 49.0 & 58.0 & 67.0 & 75.0
\end{pmatrix} \tag{6}
\]

\[
L_t = \begin{pmatrix}
398.3 & 12.5 & -13.0 & 7.4 & 18.5 & -0.5 & -1.2 & 1.1 \\
-23.7 & 4.3 & -39.2 & -1.0 & 1.3 & 0.3 & -2.9 & 0.2 \\
18.7 & 4.2 & 39.3 & 16.0 & -1.3 & -0.7 & 2.4 & 0.3 \\
-37.7 & -22.6 & -15.2 & -19.0 & -3.7 & 1.6 & -1.4 & -0.8 \\
10.2 & 16.3 & 31.9 & 9.0 & -1.5 & -0.4 & 2.3 & 0.7 \\
-13.2 & -2.7 & 9.3 & -10.1 & -5.2 & 0.6 & 1.1 & -0.5 \\
7.0 & -4.9 & 25.2 & -8.8 & -3.0 & 0.3 & 1.4 & 0.1 \\
-29.6 & -27.4 & -5.6 & 0.1 & -18.1 & 0.5 & -0.2 & -1.4
\end{pmatrix} \tag{7}
\]

\[
L_q = \begin{pmatrix}
400.0 & 20.0 & -20.0 & 0.0 & 20.0 & 0.0 & 0.0 & 0.0 \\
-20.0 & 0.0 & -40.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
20.0 & 0.0 & 40.0 & 20.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
-40.0 & -20.0 & -20.0 & -20.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
20.0 & 20.0 & 40.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
-20.0 & 0.0 & 0.0 & -20.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
0.0 & 0.0 & 20.0 & 0.0 & 0.0 & 0.0 & 0.0 & 0.0 \\
-20.0 & -20.0 & 0.0 & 0.0 & -20.0 & 0.0 & 0.0 & 0.0
\end{pmatrix} \tag{8}
\]

\[
L_{MSE} = \begin{pmatrix}
3.1 & 55.8 & 48.4 & 55.4 & 2.2 & 0.2 & 1.4 & 1.3 \\
13.4 & 18.2 & 0.7 & 1.0 & 1.8 & 0.1 & 8.7 & 0.0 \\
1.8 & 17.3 & 0.5 & 15.7 & 1.6 & 0.5 & 5.8 & 0.1 \\
5.4 & 6.6 & 22.9 & 1.1 & 13.5 & 2.5 & 1.9 & 0.6 \\
95.1 & 14.0 & 66.3 & 80.5 & 2.3 & 0.1 & 5.3 & 0.5 \\
46.2 & 7.3 & 87.2 & 98.7 & 27.1 & 0.4 & 1.3 & 0.3 \\
48.4 & 24.2 & 26.5 & 77.7 & 9.0 & 0.1 & 2.1 & 0.0 \\
92.2 & 55.3 & 31.5 & 0.0 & 3.6 & 0.3 & 0.0 & 2.0
\end{pmatrix} \tag{9}
\]

\[
\mathrm{NoisePower}(L_{MSE}) = \sum_{x=0}^{7} \sum_{y=0}^{7} L_{MSE}(x, y) \tag{10}
\]

5. Adaptive Selection Method of MB Pair Type in H.264

In our previous method 10), the selection method of H.264's MB pair type was very simple. The method, named “r”, uses FramePair in H.264 encoding except when both the upper and lower MPEG-2 MBs were coded with field DCT.

In this paper, we improve the method “r” by using the adaptive selection method of H.264's MB pair type, shown in Fig. 4, to increase the transcoded image quality. In this algorithm, if the upper and lower MBs have the same DCT type in MPEG-2 (i.e., Case 1 and Case 2 in Fig. 4), the MB pair type of H.264 is set to the MPEG-2 DCT type.

Fig. 4 The adaptive selection method of H.264's MB pair type

Otherwise, the prediction errors of the MB whose H.264 DCT type differs from the MPEG-2 DCT type are calculated using the function “Err (a, b, X, Y)”, shown in Eq. (11).

\[
\mathrm{Err}(a, b, X, Y) = \sum_{i=0}^{15} \sum_{j=0}^{15} \left\{ ORG_{xy} - DCPRED_{xy} \right\}^2 \tag{11}
\]

where
a = upperMB or lowerMB,
b = Frame or Field,
x = X + i (for all cases),


\[
y = \begin{cases}
Y + j & (a = \mathrm{lowerMB},\ b = \mathrm{Frame}) \\
Y + j + 16 & (a = \mathrm{upperMB},\ b = \mathrm{Frame}) \\
Y + 2j & (a = \mathrm{lowerMB},\ b = \mathrm{Field}) \\
Y + 2j + 1 & (a = \mathrm{upperMB},\ b = \mathrm{Field})
\end{cases}
\]

In the above equation, (X, Y) is the top-left coordinate of the input MB pair. ORG_xy is the pixel value located at the coordinate (x, y) in the input MB pair. DCPRED_xy is the pixel value of an MB pair predicted by the intra-DC prediction mode.

The method shown in Fig. 4 is a good selection manner, but it is not the best one because the function Err (a, b, X, Y) does not return the minimum error over all prediction modes. It returns the error obtained when the intra DC prediction mode is used. Finding the best intra prediction mode requires evaluating all possible prediction modes, which demands very high calculation power. Our proposed method, however, does not require such high calculation power. This is the reason we use the method shown in Fig. 4.

Table 3 shows the relationship among MPEG-2's DCT type, H.264's MB pair type, and DCT coefficient preservation. The DCT coefficients are preserved when the H.264 MB pair type is equal to the MPEG-2 DCT type. If the H.264 MB pair type differs from the MPEG-2 DCT type, the coefficients will be changed. This means that re-quantization noise will be mixed into the transcoded image.

Table 3 Relationship among MPEG-2's DCT type, H.264's MB pair type, and DCT coefficient preservation

First: MPEG-2 DCT type    Case # in   Transcoding: H.264   Coeff.
Upper MB    Lower MB      Fig. 4      pair MB type         Upper MB    Lower MB
Frame       Frame         1           Frame                O           O
Field       Field         2           Field                O           O
Frame       Field         3           Frame                O           X
Frame       Field         4           Field                X           O
Field       Frame         5           Frame                X           O
Field       Frame         6           Field                O           X
O: DCT coef. are preserved
X: DCT coef. are altered

We compare the PSNR values of transcoded images generated by the method “r” with those generated by the proposed method. The results are shown in Table 4. The proposed method improves the PSNR of transcoded images by about 0.08 dB over that of method “r”. Table 5 shows the relationship between the selected H.264 MB pair type and the MPEG-2 DCT type. On average, 25.8% of MB pairs had different DCT types in MPEG-2 compression; in H.264 compression, the FrameMB pair was selected for 18.6% of all MB pairs and the FieldMB pair for 7.2%.

Table 4 PSNR comparison between the method “r” and the proposed method

Scene         Proposed method (dB)   Method r (dB)   PSNR improvement (dB)
Boat          36.89                  36.82           0.07
Bus           33.23                  33.14           0.08
Bicycle       32.62                  32.55           0.07
Cheerleader   31.08                  31.04           0.04
Mobile        30.13                  30.05           0.08
Plane         37.36                  37.28           0.08
Sussie        37.82                  37.68           0.14
Race          35.23                  35.22           0.01
Ave.          34.30                  34.22           0.08

Table 5 Percentage of selected MB pair types in the proposed method

MPEG-2 DCT type in MB pair:         Frame × 2    Frame × 1 and Field × 1    Field × 2
Selected MB pair type in H.264:     Frame Pair   Frame Pair   Field Pair    Field Pair
Boat                                19.0         21.8         9.3           49.9
Bicycle                             17.3         11.9         10.5          60.3
Cheerleader                         68.0         17.6         8.1           6.2
Mobile                              53.6         19.4         11.6          15.4
Plane                               49.8         21.9         9.5           17.9
Sussie                              76.3         6.2          3.6           13.9
Race                                62.7         25.9         2.2           9.2
Ave.                                49.5         18.6         7.2           24.7

6. Proposed Method

The main features of the proposed method are listed in Table 6. This method uses the first MPEG-2 encoding information, which contains the quantization step and DCT type for each MB and the quantization matrix for each picture. The MB type is always “I8 × 8” in this method. The intra prediction mode is always “DC mode” in this method. The high H.264 profile enables the use of an 8 × 8 integer


Table 6 Main features of proposed transcoding method

Profile               High Profile
MB type               I8 × 8
Intra pred. mode      DC
Quantization matrix   the same as MPEG-2's matrix
Quantization step     nearest to MPEG-2's step
MB pair type          See Fig. 4

DCT that has the same DCT size as that of MPEG-2. The quantization matrix of H.264 is the same matrix used in the first MPEG-2 quantization. The quantization step is selected to be the value nearest to that of MPEG-2's quantization step. The H.264 standard can select one of the intra MB modes from among I4×4, I8×8, and I16×16, whose DCT matrix sizes are 4×4, 8×8, and 4×4, respectively. Note that neither MPEG-2 nor H.264 has a “16×16” DCT matrix size. The proposed method uses the I8×8 mode because its DCT matrix size is the same as that of MPEG-2. The MB pair type selection method of H.264 is shown in Fig. 4 in Section 5.
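The "nearest quantization step" rule can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes that Δ can be compared directly with the commonly cited nominal H.264 quantizer step size (0.625 at QP = 0, doubling for every increase of 6 in QP), which is consistent with the correspondence Δ = 16 and QP = 28 given in Section 2. The names `BASE_QSTEP`, `h264_qstep`, and `nearest_qp` are hypothetical.

```python
# Nominal H.264 quantizer step sizes for QP 0..5; the step doubles every 6 QP.
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def h264_qstep(qp):
    """Nominal quantizer step size for an H.264 QP in the range 0..51."""
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


def nearest_qp(delta):
    """Return the QP whose nominal step size is closest to the MPEG-2 step `delta`."""
    return min(range(52), key=lambda qp: abs(h264_qstep(qp) - delta))


for delta in (16, 26, 36):
    qp = nearest_qp(delta)
    print(f"delta={delta} -> QP={qp} (Qstep={h264_qstep(qp)})")
```

Because the H.264 steps are spaced logarithmically, some linearly spaced MPEG-2 steps fall between two H.264 steps; in that case this sketch simply returns the first of the closest candidates.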

7. Simulation and Evaluation

We used computer simulations to compare our proposed method with the conventional method. The conventional method we used is the ordinary MPEG-2/H.264 transcoding method that does not use the first encoding information, but uses intra MB mode selection. TM5 11) was used for the first MPEG-2 encoding under the following conditions: the quantization step Δ was fixed, the q_scale_type was equal to one, and the quantization matrix was the MPEG-2 default matrix. JM12.1 12) was used for the conventional second H.264 encoding under the following conditions: the profile was High Profile, UseHadamard was on, and the quantization matrix was the H.264 default matrix. The conventional second encoding had no restrictions with regard to selecting the MB mode and intra prediction mode, so its output bit-streams included all MB modes, all prediction types, and all available DCT sizes. We used a modified JM12.1, which can input the first encoding information, for our proposed transcoding method. The quantization step Δ was fixed for the simulations of both the conventional and proposed methods. All PSNR values were calculated using the original raw image sequences.

Fig. 5 PSNR of the proposed and conventional methods (Bicycle): (a) Δ = 16, (b) Δ = 26, (c) Δ = 36

The PSNR values of the conventional and the proposed methods are shown in Fig. 5 (a)–(c). The MPEG-2 compressed scene “Bicycle” was used for these simulations. The quantization step of the first encoding, Δ1, was set to 16, 26, or 36. That of the second encoding, Δ2, varied from 2 to 64. The solid line indicates the proposed method and the dotted line indicates the conventional method. These figures show that the proposed method has two peaks, at Δ2 = Δ1 and Δ2 = 0.5Δ1. At these points, the PSNR of the proposed method is about 0.94–1.15 dB higher than that of the conventional method. The PSNR of the proposed method drops when 0.5Δ1


< Δ2 < Δ1 and Δ1 < Δ2 < 2.0Δ1. This PSNR drop is similar to that reported for MPEG-2/MPEG-2 transcoding 13). The reason for this behavior is that the DCT coefficients input to the transcoder are close to multiples of the MPEG-2 quantization step Δ1. The re-quantization of such multiples generates only a small quantization error when the re-quantization step Δ2 is equal to Δ1, 0.5Δ1, 0.25Δ1, and so on. When 0.5Δ1 < Δ2 < Δ1, Δ2 is not a factor of the DCT coefficients. In this case, a large re-quantization error is generated.

Figure 5 (a) shows that the PSNR of the proposed method is higher than that of the conventional method when Δ2 < 22. In Fig. 5 (b), the PSNR of the proposed method is higher than that of the conventional method when Δ2 < 36. In Fig. 5 (c), the PSNR of the proposed method is higher than that of the conventional method when Δ2 < 44. The compression ratio and PSNR of the proposed method when the quantization step Δ2 is equal to Δ1 are listed in Table 7. The compression ratio is the ratio of H.264 picture size to MPEG-2 picture size. The conventional PSNR column is the interpolated value that the conventional method would give when generating a bitstream of the same size. These interpolated values are calculated using the simulation results of the conventional method. Details of the interpolation are shown in the following steps.

(1) Find the two conventional method simulations (C1, C2) whose bitrates (BIT_C1, BIT_C2) sandwich the proposed bitrate BIT_P.
(2) Calculate the interpolated conventional-method PSNR at the proposed bitrate, PSNR_P, by the following equation:

\[
PSNR_P = a \cdot BIT_P + b
\]

where

\[
a = \frac{PSNR_{C1} - PSNR_{C2}}{BIT_{C1} - BIT_{C2}}, \qquad b = PSNR_{C1} - a \cdot BIT_{C1}
\]

Table 7 Comp. ratio and PSNR of the proposed method

                     Input MPEG-2                 Proposed transcoder                       Conv.       PSNR
Scene         Δ      PSNR (dB)   Picture size     Picture size   Comp. ratio   PSNR (dB)    PSNR (dB)   diff. (dB)
                                 (bit) *1         (bit) *2       *2/*1         *3           *4          *3−*4
Boat          16     37.11       181478           136104         0.750         36.89        36.15       0.74
              26     34.95       137327           88752          0.646         34.67        34.18       0.49
              36     33.48       115553           63544          0.550         33.22        32.91       0.31
Bus           16     33.62       375350           327664         0.873         33.23        31.89       1.34
              26     30.90       269078           215808         0.802         30.61        29.64       0.97
              36     29.14       212218           156248         0.736         28.96        28.12       0.84
Bicycle       16     33.13       407425           358544         0.880         32.62        31.07       1.55
              26     30.39       292379           238336         0.815         30.01        28.79       1.22
              36     28.58       229024           171800         0.750         28.30        27.36       0.94
Cheerleader   16     34.15       437518           400888         0.916         31.08        30.29       0.80
              26     31.57       321946           278368         0.865         31.16        29.86       1.30
              36     29.78       258731           208648         0.806         29.52        28.25       1.27
Mobile and    16     30.78       615195           584064         0.949         30.13        29.10       1.03
calend.       26     27.85       444828           403952         0.908         27.27        26.33       0.94
              36     25.91       347213           300176         0.865         25.50        24.63       0.87
Plane         16     37.69       186106           144896         0.779         37.36        36.87       0.49
              26     35.51       149595           99688          0.666         35.21        34.88       0.33
              36     33.84       129335           74504          0.576         33.63        33.24       0.38
Sussie        16     37.78       174206           110200         0.633         37.82        36.79       1.03
              26     36.05       136072           71440          0.525         36.22        35.28       0.94
              36     34.86       117297           51880          0.442         35.09        34.21       0.89
Race          16     35.58       260261           211664         0.813         35.23        34.00       1.23
              26     33.33       195038           142016         0.728         33.13        31.96       1.17
              36     31.79       163418           107648         0.659         31.73        30.71       1.03

In Table 7, the PSNR of the transcoded video from “Sussie” is higher than that of the input MPEG-2 video.
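The interpolation used for the conventional PSNR column of Table 7 can be sketched as a straight line through the two bracketing conventional results, evaluated at the proposed bitstream size. The function name `interpolate_conventional_psnr` and the numbers in the usage lines are hypothetical, chosen only to exercise the formula; they are not simulation results from this paper.

```python
def interpolate_conventional_psnr(bit_p, c1, c2):
    """Linearly interpolate the conventional-method PSNR at size `bit_p`.

    c1 and c2 are (bits, psnr) pairs of two conventional runs whose sizes
    sandwich bit_p.
    """
    bit_c1, psnr_c1 = c1
    bit_c2, psnr_c2 = c2
    a = (psnr_c1 - psnr_c2) / (bit_c1 - bit_c2)  # slope in dB per bit
    b = psnr_c1 - a * bit_c1                     # intercept
    return a * bit_p + b


# Made-up example: two conventional runs at 100000 and 200000 bits.
psnr_at_p = interpolate_conventional_psnr(150000, (100000, 30.0), (200000, 34.0))
print(round(psnr_at_p, 2))
```

The PSNR difference reported in the last column of Table 7 is then the proposed PSNR minus this interpolated value.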


We assume that this unexpected PSNR improvement was generated by the deblocking filter of H.264. As MPEG-2 has no deblocking filter, MPEG-2 compressed images always have some blocking noise. In general, a deblocking filter is effective for monotonous images but less effective for complicated images. Because “Sussie” has a larger monotonous background than other images, we assume that the PSNR improvement in “Sussie” generated by H.264's deblocking filter is larger than those of other images. The PSNR of the proposed method is about 0.33–1.55 dB better than that of the conventional method when the quantization step Δ2 is equal to Δ1. The re-compression ratio of the H.264 bitstream size to the MPEG-2 bitstream size ranges from 0.442 to 0.916. The noise in the conventional method becomes larger in proportion to Δ2, and shows no relationship between Δ1 and Δ2. The noise in the proposed method becomes smallest when Δ2 is equal to Δ1, and the re-quantization noise reduction effect of the proposed method becomes smaller as the difference between Δ2 and Δ1 becomes larger. As Δ2 becomes larger than a certain value, the PSNR of the proposed method becomes lower than that of the conventional method. Table 8 shows the lowest compression ratio at which the proposed method is superior to the conventional method. Although this crossing value depends on the input images we tested, the proposed method is superior to the conventional method in all our tested simulations when the re-compression ratio is over 0.837.

Table 8 Lowest comp. ratio when the proposed method is superior to the conventional method

Scene         Δ      Lower limit size (bit)   MPEG size (bit)   Comp. ratio limit
Boat          16     120496                   181478            0.664
              26     85388                    137327            0.622
              36     61025                    115553            0.528
Bus           16     263005                   375350            0.701
              26     196332                   269078            0.730
              36     144448                   212218            0.681
Bicycle       16     214544                   407425            0.527
              26     206837                   292379            0.707
              36     153215                   229024            0.669
Cheerleader   16     314171                   437518            0.718
              26     224580                   321946            0.698
              36     175380                   258731            0.678
Mobile        16     514620                   615195            0.837
              26     357932                   444828            0.805
              36     267602                   347213            0.771
Plane         16     137974                   186106            0.741
              26     95885                    149595            0.641
              36     71574                    129335            0.553
Sussie        16     -                        -                 -
              26     53418                    136072            0.393
              36     49086                    117297            0.418
Race          16     133741                   260261            0.514
              26     130302                   195038            0.668
              36     101665                   163418            0.622

8. Conclusions

We proposed an MPEG-2 to H.264 intra transcoding method for interlaced bit-streams in which frame and field macroblocks are intermingled. This method uses the adaptive MB pair type selection method and keeps as many discrete cosine transform (DCT) coefficients of the original MPEG-2 bitstream as possible. Experimental results show that the proposed method improves peak signal-to-noise ratio (PSNR) by about 0.33–1.55 dB over that of the conventional method. The advantages of the proposed method are that it not only results in high PSNR but also does not require complex calculation to select an MB mode and an intra prediction mode.

References

1) A. Sagata, M. Ikeda, H. Iwasaki, K. Nitta, T. Onishi, T. Sano, Y. Nakajima, M. Inamori, T. Yoshitome, H. Matsuda, R. Tanida, A. Shimizu, K. Nakamura, and J. Naganuma: “A Professional-use Transcoder for Retransmission of Digital Terrestrial TV Broadcast over IP”, Proceedings of the IEICE General Conference, D-11-10, (2009-3).
2) ISO/IEC IS 13818-2, ITU-T Recommendation H.262, “Generic coding of moving pictures and associated audio information”, (1994-11).
3) ISO/IEC 14496-10:2003, Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding, (2003-12).
4) P. Guilotel, et al: “Adaptive Encoders: The New Generation of MPEG-2 Encoders”, SMPTE journal (2000-4).
5) T. Yoshitome, J. Naganuma, and Y. Yashima: “A study of MPEG-2 to H.264 Intra Transcoding for Progressive Contents”, ITE Journal, Vol.62, No.11, pp.1819–1824 (2008-11).
6) T. Yoshitome, J. Naganuma, and Y. Yashima: “An


MPEG-2 to H.264 Transcoding Method Preserving DCT Information for Progressive Contents”, ITE Journal, Vol.63, No.6, pp.837–846 (2009-6).
7) T. Yoshitome, K. Nakamura, K. Nitta, M. Ikeda, and M. Endo: “Development of an HDTV MPEG-2 encoder based on multiple enhanced SDTV encoding LSIs”, IEEE International Conference on Consumer Electronics, pp.160–161 (2001-6).
8) J. Naganuma, H. Iwasaki, K. Nitta, K. Nakamura, T. Yoshitome, M. Ogura, Y. Nakajima, Y. Tashiro, T. Onishi, M. Ikeda, and M. Endo: “VASA: Single-chip MPEG-2 422P@HL LSI with Multi-chip Configuration for Large Scale Processing beyond HDTV Level”, in IEEE Hot Chips 14, Session 7, (2002-8).
9) H. Iwasaki, J. Naganuma, Y. Nakajima, Y. Tashiro, M. Ikeda, K. Nakamura, T. Yoshitome, T. Onishi, T. Izuoka, and M. Endo: “A 1.1W single-chip MPEG-2 HDTV CODEC LSI for embedding in consumer-oriented mobile CODEC system”, Custom Integrated Circuits Conf. (CICC), pp.177–180 (2003-9).
10) T. Yoshitome, K. Kamikura, and N. Kitawaki: “An MPEG-2 to H.264 intra transcoding for interlace bit-streams intermingled with a frame and field macroblock”, in the IIEEJ Image Electronics and Visual Computing Workshop 2010 (IEVC2010), 1P-6, (2010-3).
11) MPEG-2, Test Model 5 (TM5), Doc ISO/IEC JTC/SC29/WG11/N0400, Test Model Editing Committee, (1993-4).
12) Joint Video Team (JVT), “Reference Software JM12.1”, http://iphome.hhi.de/suehring/tml/
13) S. Kadono, M. Etoh, and N. Yokoya: “Rationality of restricted re-quantization for efficient MPEG transcoding”, 2000 International Conference on Image Processing, pp.952–955 (2000-9).

(Received July 1, 2010)
(Revised Sep. 2, 2010)

Takeshi Yoshitome (Member)
Dr. Yoshitome received the B.E., M.E., and Ph.D. degrees in computer science from Tsukuba University, Japan, in 1982, 1984, and 2010. In 1984, he joined Electrical Communication Laboratories, Nippon Telegraph and Telephone Corporation (NTT), Kanagawa, Japan, where he has since been engaged in research and development of image processing systems. Dr. Yoshitome is currently a Senior Research Engineer of the Visual Media Communications Project in NTT Cyber Space Laboratories, Kanagawa, Japan. He is a member of IIEEJ and ITE.

Kazuto Kamikura
Dr. Kamikura received the B.E. and M.E. degrees in electrical engineering from Tokyo Science University, Japan, in 1984 and 1986, respectively. Since 1986, he has been with NTT Human Interface Laboratories of Nippon Telegraph and Telephone Corporation (NTT), Kanagawa, Japan. His current research interests include digital image processing and video sequence coding. Dr. Kamikura is a member of the IEICE of Japan, the Institute of Television Engineers of Japan, and the Institute of Image Electronics Engineers of Japan. He is also a member of IEICE and ITE.
