Lossless Video Coding (EE398B, 2007.4.19)
Stanford EE398B, April 19, 2007

Lossless Video Coding
Dr. Seishi Takamura
NTT Corporation, Japan

“Lossy” vs. “Lossless”

Lossy image/video coding
– Input != Output
– High compression (~1/10 - 1/1000)
– Possibly “perceptually lossless”

Lossless image/video coding
– Input == Output
– Low compression (~1/2 - 1/3)
– “Mathematically lossless”
– This is what “lossless” means in this talk

Is It Worth Sacrificing Compression Ratio?

Yes, for many practical reasons
– Medical diagnosis, fine art archives, military purposes
– Image/video analysis, automatic product inspection
  The observer is not a human, so “perceptually lossless” does not make sense
– Contribution transmission, studio applications
  The material is re-encoded and edited repeatedly, so coding errors accumulate if the codec is lossy

And yes, for some emotional reasons
– A CG artist said, “Just do not lose even one bit from my artwork!”
– Provides an upper bound of image quality (can feel at ease)
– Fed up with drawing R-D curves, evaluating Lagrange cost functions, subjective impairment measurement, HVS…

Outline

Lossless image coding
– Typical schemes and technology
– FYI: Lossless image codecs comparison
Lossless video coding
– Lossless variants of lossy H.264/AVC
– Using reversible transforms
– Using DCT residual packing
– FYI: Lossless video codecs comparison
Summary

Typical Lossless Image Codecs
Algorithm   Method: prediction                 Method: entropy coding          Note
GIF         none                               LZW                             up to 256 colors
TIFF        none                               LZW
PNG         Linear prediction                  DEFLATE (LZ77+Huffman)
JPEG*       Linear prediction                  Huffman                         *lossless mode
JPEG 2000   Reversible (5,3) DWT               Arithmetic coding (EBCOT)
JPEG-LS     Median prediction                  Context-based Golomb            based on the “LOCO” method
                                               (run-length in flat regions)
CALIC       Context-based linear prediction    Huffman / arithmetic coding
            + nonlinear correction

Better compression toward the bottom rows.

Prediction Schemes

Linear prediction
– Fixed, encoder-optimized and online-update predictors
– Using causal pixels (extrapolative DPCM)
– Using non-causal pixels (interpolative DPCM)
– Block transform
– Reversible DWT
– Reversible DCT
– R/G/B decorrelation
Median prediction
Markov model (x: current pixel, c: its context)
– Build a stochastic model of the image
– Encode the pixel occurrence probability: Prob(x|c)
– cf. minimum MSE prediction = Σ_x (Prob(x|c)·x)
Pixelwise residual (“Lossy plus lossless”) approach
– Consider another lossy coder (e.g., JPEG) as a predictor
– Entropy code the pixel-wise residual
  Residual = Original image data - Decoded (lossy) image
(And motion compensation for video)

ARTest Codec Comparison (full-color)

[Bar chart: compression ratio, vertical axis from 2.3 to 3.0]
– Top 14 codecs out of 26 [The art of lossless data compression, 2002]
– Average of 627 full-color (24-bit) images, weighted by image size (#pixels)
– Also provides encoding/decoding time comparison
– Codecs shown: BMF, RKIM, UHIC, NK, ERI, ArHanGeL, UHArc, ACE, LSP, SBC, DC, RK, PNGCrush, LOCO
– LOCO is the base of JPEG-LS
– A 25% difference is marked on the chart

MRP:
A Gray-Scale Lossless Image Codec [Matsuda et al., 2005]

Context-based 2D linear prediction
– M = 20-56 contexts (predictors)
– P = 30-72 pixels from the current frame
– All M·P coefficients are optimized for minimum rate, not for minimum MSE, and downloaded (sent to the decoder)
Encoding time: 10-25 minutes per frame
Decoding time: 0.2 seconds per frame

SUT Codec Comparison (gray-scale)

[Bar chart: compression ratio, vertical axis from 1.8 to 2.2]
– Average of 13 gray-scale (8-bit) images, weighted by image size (#pixels)
– http://itohws03.ee.noda.sut.ac.jp/~matsuda/mrp/
– Codecs shown: MRP, BMF, TMW, Glicbawls, CALICa, JPEG-LS, CALICh, JPEG 2000
– CALICa: CALIC with arithmetic coding
– CALICh: CALIC with Huffman coding
– Differences of 1.6% and 9.5% are marked on the chart
– Note: conditions are different from the previous comparison

Lossless video coding
– Lossless variants of lossy H.264/AVC
– Using reversible transforms [Takamura, ICASSP2005]
– Using DCT residual packing
– FYI: Lossless video codecs comparison

Lossless Coding of H.264 FRExt

H.264 FRExt skips the transform and quantization
[Diagram: Video → Inter/Intra prediction → 4x4/8x8 transform → Quantization → Zigzag scan → Entropy coding. Three stages are labeled “entropy reduction”. The FRExt lossless path passes through (“thru”) instead of transforming and quantizing, so the entropy remains the same.]

Enhancement of FRExt

Instead of no-operation (FRExt), we apply a reversible transform to reduce entropy
We compared four transforms
[Diagram: the same pipeline, with a reversible transform, aiming at entropy reduction, in place of the FRExt pass-through.]

Reversible Transforms Tested

– 4x4 Transform of H.264 (reversible if not
  quantized)
– Piecewise Scaled Transform
– Reversible DCT
– Linear Prediction
  – Non-adaptive version
  – Adaptive version

4x4 Reversible Transform: H.264 4x4 Transform

Transform coefficients:

\[
\begin{bmatrix} A & B & C & D \\ E & F & G & H \\ I & J & K & L \\ M & N & O & P \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}
\bigl[\,\text{intra/inter prediction residual}\,\bigr]
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}
\]

– Obviously reversible (if not quantized)
– Signal correlation is reduced
– However, the signal space is expanded
  – 2.66 bit/pixel increment
  – Degrades coding performance

4x4 Reversible Transform: Piecewise Scaled Transform

Decompose H.264’s 4x4 transform into four equivalent 2x2 transforms, and compensate for the expansion at each stage
[Diagram: butterfly structure of the H.264 transform (decomposed) next to the piecewise scaled transform. Legend: “2” means divide by 2 and round; “5” means divide by 5 and round. Annotation: the output is still reversible.]

4x4 Reversible Transform: Reversible DCT [Komatsu, Sezaki, 1996]

\[
\begin{aligned}
X_0 &= R\bigl((x_0 + x_1 + x_2 + x_3)/4\bigr) \\
X_1 &= x_0 - x_3 + R\bigl((x_1 - x_2)/C\bigr) \\
X_2 &= R\bigl((x_0 - x_1 - x_2 + x_3)/2\bigr) \\
X_3 &= R\bigl((x_0 - x_3)/C\bigr) - x_1 + x_2
\end{aligned}
\qquad \text{where } R(x) = \lfloor x + 0.5 \rfloor
\]

The basis vectors are parallel to those of
– H.264’s 4x4 transform, when C = 2
– the mathematical DCT, when C = 1 + √2 (recommended by the paper)
We tried C = 1 + √2, 3 and 3.5

4x4 Reversible Transform: Linear Prediction (non-adaptive)

Horizontal: apply the following transform to each of the four rows. Each sample is predicted from its left neighbor: the predictions for x1, x2, x3 are αH·x0, αH·x1, αH·x2.
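A minimal Python sketch of this row and column prediction, written for illustration and not taken from the talk. It assumes the rounding R(x) = floor(x + 0.5) given for the reversible DCT, integer samples, and hypothetical helper names (`predict_row`, `restore_row`, `forward_block`, `inverse_block`). For an integer sample x_i, R(x_i - α·x_{i-1}) equals x_i + R(-α·x_{i-1}); the sketch uses the second form in both directions so that encoder and decoder round identically and the step stays reversible even with a floating-point α.

```python
import math


def R(x):
    """Rounding assumed from the slides: floor(x + 0.5)."""
    return math.floor(x + 0.5)


def predict_row(row, alpha):
    """Forward step: keep x0, code x_i as x_i + R(-alpha * x_{i-1})."""
    out = [row[0]]
    for i in range(1, len(row)):
        out.append(row[i] + R(-alpha * row[i - 1]))
    return out


def restore_row(coded, alpha):
    """Inverse step: rebuild samples left to right from restored neighbors."""
    row = [coded[0]]
    for i in range(1, len(coded)):
        row.append(coded[i] - R(-alpha * row[i - 1]))
    return row


def transpose(block):
    """Swap rows and columns of a list-of-lists block."""
    return [list(col) for col in zip(*block)]


def forward_block(block, alpha_h, alpha_v):
    """Horizontal prediction on every row, then vertical on every column."""
    rows = [predict_row(r, alpha_h) for r in block]
    cols = [predict_row(c, alpha_v) for c in transpose(rows)]
    return transpose(cols)


def inverse_block(coded, alpha_h, alpha_v):
    """Undo the vertical step first, then the horizontal step."""
    cols = [restore_row(c, alpha_v) for c in transpose(coded)]
    return [restore_row(r, alpha_h) for r in transpose(cols)]
```

With α = 0.7 the row (10, 12, 13, 15) becomes (10, 5, 5, 6), and `inverse_block(forward_block(b, 0.7, 0.7), 0.7, 0.7)` returns the 4x4 block `b` unchanged.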
Input:  x0, x1, x2, x3
Output: x0, R(x1 - αH·x0), R(x2 - αH·x1), R(x3 - αH·x2)

Vertical: transform the columns in the same manner, using αV
– αH = αV = 0.7 works well for I frames
– αH = αV = 0.4 works well for P and B frames

4x4 Reversible Transform: Linear Prediction (adaptive)

The optimal coefficient α is image dependent
In the energy-concentration sense, the autocorrelation factor gives an optimal α (but it is not known to the decoder unless explicitly transmitted)
We estimate α from neighboring decoded blocks
ρH, ρV: horizontal and vertical autocorrelation factors in a block
[Diagram: three cases. If horizontally correlated: αH, αV are a weighted average of the ρH, ρV of neighboring coded blocks, with the highest weight on one neighbor. If vertically correlated: the same weighted average, with the highest weight on a different neighbor. If not correlated: default αH, αV are used.]

Transform Comparison

Determinant: the smaller, the better. Norm of each row: the more constant, the better.

FRExt
  Matrix: $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
  Determinant: $1$
  Norm of each row: $(1, 1, 1, 1)$

H.264 Transform
  Matrix: $\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$
  Determinant: $40$
  Norm of each row: $(2, \sqrt{10}, 2, \sqrt{10})$

Piecewise Scaled
  Matrix: $\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1/2 & -1/2 & -1 \\ 1/2 & -1/2 & -1/2 & 1/2 \\ 1/10 & -1/5 & 1/5 & -1/10 \end{bmatrix}$
  Determinant: $1$
  Norm of each row: $(2, \sqrt{10}/2, 1, \sqrt{10}/10)$

Reversible DCT
  Matrix: $\begin{bmatrix} 1/4 & 1/4 & 1/4 & 1/4 \\ 1 & 1/C & -1/C & -1 \\ 1/2 & -1/2 & -1/2 & 1/2 \\ 1/C & -1 & 1 & -1/C \end{bmatrix}$
  Determinant: $1 + 1/C^2$
  Norm of each row: $(1/2, \sqrt{2 + 2C^2}/C, 1, \sqrt{2 + 2C^2}/C)$

Linear Prediction (non-orthogonal)
  Matrix: $\begin{bmatrix} 1 & 0 & 0 & 0 \\ -\alpha & 1 & 0 & 0 \\ 0 & -\alpha & 1 & 0 \\ 0 & 0 & -\alpha & 1 \end{bmatrix}$
  Determinant: $1$
  Norm of each row: $(1, \sqrt{1 + \alpha^2}, \sqrt{1 + \alpha^2}, \sqrt{1 + \alpha^2})$

Simulation Conditions

4 QCIF sequences and 5 CIF sequences
I frame performance
– Encoded 100 frames (all intra)
– Compared with Motion JPEG 2000 (Kakadu codec)
P frame performance
– Encoded 100 frames in IPPP… structure
– Coding performance is obtained from 99 P frames
B frame performance
– Encoded 298 frames in IBBPBBP… structure
– Coding performance is obtained from 198 B frames

Results (average of 9 sequences)

H.264 trans.