Lossless Video Coding

Dr. Seishi Takamura
NTT Corporation, Japan
EE398B, Stanford
April 19, 2007

“Lossy” vs. “Lossless”

• Lossy image/video coding
– Input != Output
– High compression (~1/10 - 1/1000)
– Possibly “perceptually lossless”
• Lossless image/video coding
– Input == Output
– Low compression (~1/2 - 1/3)
– “Mathematically lossless”
– This is what is meant by “lossless” in this talk

Takamura: Lossless Video Coding, EE398B, 2007.4.19

Is It Worth Sacrificing Compression Ratio?

• Yes, for many practical reasons
– Medical diagnosis, fine art archive, military purposes
– Image/video analysis, automatic product examination
  The observer is not a human, so “perceptually lossless” does not make sense
– Contribution transmission, studio applications
  The material is re-encoded/edited repeatedly, so coding error accumulates if the coding is lossy
• And yes, for some emotional reasons
– A CG artist said, “Just do not lose even one from my artwork!”
– Provides an upper bound of image quality (can feel at ease)
– Fed up with drawing R-D curves, evaluating Lagrange cost functions, subjective impairment measurement, HVS…

Outline

• Lossless image coding
– Typical schemes and technology
– FYI: Lossless image codecs comparison
• Lossless video coding
– Lossless variants of lossy H.264/AVC
– Using reversible transforms
– Using DCT residual packing
– FYI: Lossless video codecs comparison
• Summary

Typical Lossless Image Codecs

Algorithm   Prediction                        Entropy coding                 Note
GIF         none                              LZW                            up to 256 colors
TIFF        none                              LZW
PNG         Linear prediction                 (LZ77+Huffman)
JPEG*       Linear prediction                 Huffman                        *lossless mode
JPEG 2000   reversible (5,3) DWT              (EBCOT)
JPEG-LS     Median prediction                 context-based Golomb           based on “LOCO” method
                                              (run-length in flat region)
CALIC       Context-based linear prediction   Huffman / Arithmetic coding
            + nonlinear correction

Better compression toward the bottom of the table.

Prediction Schemes

• Linear prediction
– Fixed, encoder-optimized and online-update predictors
– Using causal pixels (extrapolative DPCM)
– Using non-causal pixels (interpolative DPCM)
– Block transform
– Reversible DWT
– Reversible DCT
– R/G/B decorrelation
• Median prediction
• Markov model (x: current pixel, c: context)
– Build a stochastic model of the image
– Encode the occurrence probability: Prob(x|c)
– cf. minimum MSE prediction = Σx(Prob(x|c)*x)
• Pixelwise residual (“Lossy plus lossless”) approach
– Consider another lossy coder (e.g., JPEG) as a predictor
– Entropy code the pixel-wise residual
  Original image data − Decoded (lossy) = Residual
• (And for video)
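As an illustration of the median prediction item above, here is a minimal sketch of the median predictor commonly associated with LOCO/JPEG-LS, together with a pixel-wise residual round trip. The border rule (missing neighbors count as 0) and the function names are assumptions made for this sketch, not part of the slides or of the JPEG-LS standard.

```python
def med_predict(a, b, c):
    """Median prediction from left (a), above (b) and above-left (c) pixels."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def neighbors(img, y, x):
    """Causal neighbors; pixels outside the image count as 0 (an assumption)."""
    a = img[y][x - 1] if x > 0 else 0
    b = img[y - 1][x] if y > 0 else 0
    c = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def encode(img):
    """Return the pixel-wise prediction residuals of a 2D list of integers."""
    return [[img[y][x] - med_predict(*neighbors(img, y, x))
             for x in range(len(img[0]))] for y in range(len(img))]


def decode(res):
    """Rebuild the image in raster order, using already decoded pixels only."""
    img = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            img[y][x] = res[y][x] + med_predict(*neighbors(img, y, x))
    return img
```

Because the decoder forms the same prediction from pixels it has already reconstructed, adding the residual back recovers the input exactly; only the residuals need to be entropy coded.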

ARTest Codec Comparison (full-color)

[The art of lossless data compression, 2002]

[Figure: bar chart of compression ratio (axis 2.3 to 3.0) for the top 14 codecs out of 26, with JPEG-LS among them and a 25% gap annotated.]

Average of 627 full-color (24-bit) images, weighted by image size (#pixels)
Also provides encoding/decoding time comparison

MRP: A Gray-Scale Lossless Image Codec

[Matsuda et al., 2005]
• Context-based 2D-linear prediction
– M=20-56 contexts (predictors)
– P=30-72 pixels from current frame
– All M·P coefficients are optimized for minimum rate, not for minimum MSE, and downloaded
• Encoding time: 10-25 minutes per frame
• Decoding time: 0.2 seconds per frame

SUT Codec Comparison (gray-scale)

http://itohws03.ee.noda.sut.ac.jp/~matsuda/mrp/

[Figure: bar chart of compression ratio (axis 1.8 to 2.2) for MRP, BMF, TMW, Glicbawls, CALICa, CALICh, JPEG-LS and JPEG 2000, with gaps of 1.6% and 9.5% annotated.]

Average of 13 gray-scale (8-bit) images, weighted by image size (#pixels)
CALICa: CALIC with arithmetic coding
CALICh: CALIC with Huffman coding
Note: Conditions are different from previous comparison

Outline

• Lossless image coding
– Typical schemes and technology
– FYI: Lossless image codecs comparison
• Lossless video coding
– Lossless variants of lossy H.264/AVC
– Using reversible transforms [Takamura, ICASSP2005]
– Using DCT residual packing
– FYI: Lossless video codecs comparison
• Summary

Lossless Coding of H.264 FRExt

• H.264 FRExt skips the transform and quantization

[Figure: H.264 coding pipeline: Video -> Inter/Intra prediction -> 4x4/8x8 transform -> Quantization -> Zigzag scan -> Entropy coding. Three stages are marked “entropy reduction”. In lossless mode FRExt passes the signal through the transform and quantization stages; entropy coding remains the same.]

Enhancement of FRExt

• Instead of no-operation (FRExt), we apply a reversible transform to reduce entropy
• We compared four transforms

[Figure: the same pipeline (Video -> Inter/Intra prediction -> 4x4/8x8 transform -> Quantization -> Zigzag scan -> Entropy coding), with a reversible transform, aiming at entropy reduction, in place of the FRExt pass-through.]

Reversible Transforms Tested

• 4x4 Transform of H.264 (reversible if not quantized)
• Piecewise Scaled Transform
• Reversible DCT
• Linear Prediction
– Non-adaptive version
– Adaptive version

4x4 Reversible Transform: H.264 4x4 Transform

Transform coefficients

\[
\begin{bmatrix} A & B & C & D \\ E & F & G & H \\ I & J & K & L \\ M & N & O & P \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}
X
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}
\]

where X is the 4x4 block of intra/inter prediction residuals.

• Obviously reversible (if not quantized)
• Signal correlation is reduced…
• However, the signal space is expanded
– 2.66 bit/pixel increment
– Degrades coding performance
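The 2.66 bit/pixel figure is consistent with a simple volume argument, sketched below under the assumption that the increment is the base-2 logarithm of the transform's volume expansion, spread over the 16 pixels of a block. The separable 2D transform applies the 4x4 matrix to 4 columns and 4 rows, so its determinant is the 1D determinant raised to the 8th power.

```python
import math

# Forward 4x4 integer transform matrix of H.264 (rows are the basis vectors).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


d1 = abs(det(CF))                       # expansion of the 1D transform
expansion_2d = d1 ** 8                  # 4 row passes and 4 column passes
bits_per_pixel = math.log2(expansion_2d) / 16
print(d1, round(bits_per_pixel, 2))
```

The same determinant reappears in the transform comparison later in the talk.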

4x4 Reversible Transform: Piecewise Scaled Transform

• Decompose H.264’s 4x4 transform into four equivalent 2x2 transforms,
• and compensate the expansion at each stage

[Figure: two butterfly diagrams. Left: H.264 transform (decomposed) into 2x2 stages with weights 1, –1, 2 and –2. Right: piecewise scaled transform, the same stages each followed by a scaling step, either “divide by 2 and round” or “divide by 5 and round”. Output is still reversible!]

4x4 Reversible Transform: Reversible DCT

[Komatsu, Sezaki, 1996]

\[
\begin{aligned}
X_0 &= R\big((x_0 + x_1 + x_2 + x_3)/4\big) \\
X_1 &= x_0 - x_3 + R\big((x_1 - x_2)/C\big) \\
X_2 &= R\big((x_0 - x_1 - x_2 + x_3)/2\big) \\
X_3 &= R\big((x_0 - x_3)/C\big) - x_1 + x_2
\end{aligned}
\qquad \text{where } R(x) = \lfloor x + 0.5 \rfloor
\]

• This is parallel to
– H.264’s 4x4 transform, when C = 2
– the mathematical DCT, when C = 1 + √2 (recommended by the paper)
• We tried C = 1 + √2, 3 and 3.5

4x4 Reversible Transform: Linear Prediction (non-adaptive)

• Horizontal: apply the following transform to each of the four rows
  (each sample is predicted from its left neighbor, scaled by αH)
  Input:  x0, x1, x2, x3
  Output: x0, R(x1 − αH·x0), R(x2 − αH·x1), R(x3 − αH·x2)
• Vertical: transform the columns in the same manner, using αV
• αH = αV = 0.7 works well for I frames
• αH = αV = 0.4 works well for P and B frames
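The reversibility of this transform can be sketched as follows. The function names are hypothetical, and α is held as an exact fraction so that encoder and decoder round identically; that is an implementation choice of this sketch, not something the slides specify. Since every input sample is an integer, R(x1 − α·x0) equals x1 + R(−α·x0), so the decoder can subtract the same rounded term.

```python
import math
from fractions import Fraction


def R(v):
    """Rounding used on the slides: floor(v + 0.5)."""
    return math.floor(v + Fraction(1, 2))


def forward_row(x, alpha):
    """x0 is kept; every other sample is predicted from its left neighbor."""
    return [x[0]] + [R(x[i] - alpha * x[i - 1]) for i in range(1, len(x))]


def inverse_row(y, alpha):
    """Undo forward_row sample by sample, left to right."""
    x = [y[0]]
    for i in range(1, len(y)):
        x.append(y[i] - R(-alpha * x[i - 1]))
    return x


def forward_block(block, a_h, a_v):
    """Horizontal pass on the rows, then vertical pass on the columns."""
    rows = [forward_row(r, a_h) for r in block]
    cols = [forward_row(list(c), a_v) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def inverse_block(coef, a_h, a_v):
    """Inverse passes in the opposite order: columns first, then rows."""
    cols = [inverse_row(list(c), a_v) for c in zip(*coef)]
    return [inverse_row(list(r), a_h) for r in zip(*cols)]


alpha = Fraction(7, 10)  # the I-frame value quoted on the slide
block = [[10, 12, 15, 11], [-3, 0, 4, 9], [7, 7, 7, 7], [255, -255, 0, 1]]
assert inverse_block(forward_block(block, alpha, alpha), alpha, alpha) == block
```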

4x4 Reversible Transform: Linear Prediction (adaptive)

• Optimal coefficient α is image dependent
• In the energy concentration sense, the autocorrelation factor gives an optimal α (but it is not known to the decoder unless explicitly transmitted)
• We estimate α from neighboring decoded blocks

ρH, ρV: horizontal and vertical autocorrelation factors in a block

[Figure: three cases for deriving αH, αV of the current block from the ρH, ρV of neighboring coded blocks. If horizontally correlated, or if vertically correlated: a weighted average of the neighbors' ρH, ρV, with one neighbor given the highest weight. If not correlated: default αH, αV.]

Transform Comparison

Determinant: the smaller, the better. Norm of each row: the more constant, the better.

FRExt: determinant 1, row norms (1, 1, 1, 1)
\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

H.264 Transform: determinant 40, row norms (2, \sqrt{10}, 2, \sqrt{10})
\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix} \]

Piecewise Scaled: determinant 1, row norms (2, \sqrt{10}/2, 1, \sqrt{10}/10)
\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1/2 & -1/2 & -1 \\ 1/2 & -1/2 & -1/2 & 1/2 \\ 1/10 & -1/5 & 1/5 & -1/10 \end{bmatrix} \]

Reversible DCT: determinant 1 + 1/C^2, row norms (1/2, \sqrt{2 + 2C^2}/C, 1, \sqrt{2 + 2C^2}/C)
\[ \begin{bmatrix} 1/4 & 1/4 & 1/4 & 1/4 \\ 1 & 1/C & -1/C & -1 \\ 1/2 & -1/2 & -1/2 & 1/2 \\ 1/C & -1 & 1 & -1/C \end{bmatrix} \]

Linear Prediction (non-orthogonal): determinant 1, row norms (1, \sqrt{1 + \alpha^2}, \sqrt{1 + \alpha^2}, \sqrt{1 + \alpha^2})
\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ -\alpha & 1 & 0 & 0 \\ 0 & -\alpha & 1 & 0 \\ 0 & 0 & -\alpha & 1 \end{bmatrix} \]

Simulation Conditions

• 4 QCIF sequences and 5 CIF sequences
• I frame performance
– Encoded 100 frames (all intra)
– Compared with Motion JPEG 2000 (kakadu codec)
• P frame performance
– Encoded 100 frames in IPPP… structure
– Coding performance is obtained from 99 P frames
• B frame performance
– Encoded 298 frames in IBBPBBP… structure
– Coding performance is obtained from 198 B frames

Results (average of 9 sequences)

[Figure: bar chart of compression ratio (axis 1 to 3.5) for I frames, P frames and B frames. Series: H.264 transform without quantization, piecewise scaled, reversible DCT (C = 1 + √2, C = 3, C = 3.5), FRExt, linear prediction (non-adaptive), linear prediction (adaptive) and JPEG 2000. Annotations: “FRExt is not good at intra” and “Almost no compression”.]

Summary for Reversible Transform Coding

• Compared FRExt with four different reversible transforms
– Non-orthogonal transform (linear prediction) yields better compression than the orthogonal transforms
– Adaptive linear prediction works the best
  – 12% better than FRExt (intra)
  – 3% better than FRExt (inter)
  – 2% worse than JPEG 2000

Note: our method can gain yet another ~1% improvement [Sano et al., 2006]

Outline

• Lossless image coding
– Typical schemes and technology
– FYI: Lossless image codecs comparison
• Lossless video coding
– Lossless variants of lossy H.264/AVC
– Using reversible transforms
– Using DCT residual packing [Takamura, ICIP2005]
– FYI: Lossless video codecs comparison
• Summary

Motivation

• H.264/AVC: promising lossy video coding
– About twice the compression of other standards
– More and more H.264-compliant bit streams will be created and distributed
• What if H.264 bit stream + additional (enhancement) bit stream = lossless?
– Useful for many applications
– Good interoperability with existing H.264 systems
– Two-layered scalability
• Existing technologies
– Conventional pixel-wise residual coding: inefficient
– H.264 High 4:4:4 Profile (FRExt): single-layered, incompatibility

[Figure: an enhancement bit stream stacked on an H.264 bit stream; together they are labeled lossless.]

Transform and Quantization

An image (2D array of pixels): take a pixel pair (x, y); x (and y) are 8-bit integers

2D mapping (x, y), then transform:

\[
\begin{bmatrix} A \\ B \end{bmatrix}
=
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
\]

[Figure: the 8-bit original border (the square 0..255 by 0..255 in the (x, y) plane) is rotated by 45 degrees and magnified in the (A, B) plane, with A from 0 to 510 and B from -255 to 255. A quantization bin and the quantization step size are marked on the transformed plane.]

Recovering Original from Q-Bin

[Figure: a quantization bin in the (A, B) plane containing the original point. White grid points are impossible signals. Ways to specify the original within the bin:
– Pixel-wise residual after inverse transform (xdif, ydif): 20 choices
– No conversion (simplified FGS): 25 choices
– No conversion (general FGS): a lot of choices
– Valid grid points only (optimal): 12 choices]

H.264’s 4x4 Transform

Transformed coefficients

\[
\begin{bmatrix} A & B & C & D \\ E & F & G & H \\ I & J & K & L \\ M & N & O & P \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}
X
\begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}
\]

where X is the 4x4 block of intra/inter prediction residuals.

Key points:
• Above matrices are all integer-valued
• To have all 16 coefficient values (A to P) = to have exact 4x4 residual values = to have the original image!

Enhancement Coding

• Scan all the integer grid points within the 16-dimensional quantization bin
  Ω = (A|A=Amin…Amax, B|B=Bmin…Bmax, …, P|P=Pmin…Pmax)
  using a 16-nested for-loop to obtain:
  – cases: total number of possible candidates
  – index: the serial number of the original image among them
  (in the 2D illustration: cases=12, index=5)
• Encode index using log2(cases) bits

However…
• Loop count (= volume of Ω) becomes 800,000,000,000,000,000,000 when QP=12, for only one 4x4 block!
  It takes 24 million YEARS if each loop is done in 1 nsec
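The cases/index idea can be sketched on the 2D toy transform A = x + y, B = y − x from the earlier slide. A grid point (A, B) is a possible signal only if A + B = 2y is even. The bin bounds and the original point below are hypothetical values, chosen so that the bin holds 12 candidates, and the index here is counted from 0.

```python
import math


def scan_bin(a_bounds, b_bounds, original):
    """Scan the integer grid of a 2D quantization bin.

    Returns (cases, index): the number of possible signals in the bin and
    the 0-based serial number of `original` among them (None if absent).
    """
    cases, index = 0, None
    for A in range(a_bounds[0], a_bounds[1] + 1):
        for B in range(b_bounds[0], b_bounds[1] + 1):
            if (A + B) % 2 != 0:
                continue  # A + B = 2y must be even: impossible signal
            if (A, B) == original:
                index = cases
            cases += 1
    return cases, index


# Hypothetical bin: A in 0..4, B in 1..5 (25 grid points, 12 of them valid).
cases, index = scan_bin((0, 4), (1, 5), original=(2, 2))
bits = math.log2(cases)      # cost of sending index with an ideal coder
A, B = 2, 2
x, y = (A - B) // 2, (A + B) // 2   # the decoder inverts the transform exactly
print(cases, index, round(bits, 3), (x, y))
```

In 16 dimensions the same scan needs 16 nested loops, which is what makes the loop count explode.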

H.264 Basis Functions

[Figure: the sixteen 4x4 basis patterns of the H.264 transform, one per coefficient A to P. Each pattern is the outer product of two of the rows (1, 1, 1, 1), (2, 1, –1, –2), (1, –1, –1, 1) and (1, –2, 2, –1).]

Nifty Properties of H.264 Coefficients

[Figure: basis pattern of A plus basis pattern of C gives a pattern whose rows are all (2, 0, 0, 2).]

• For arbitrary input, A+C is always even
– One bit reduction of C, if A is known (can be ‘pack’ed half)
– So are A+I, B+J, E+G, M+O etc.
• A+C+I+K is always a multiple of four.

[Figure: the basis patterns of A, C, I and K add up to a pattern with 4 at the four corners and 0 elsewhere.]

More Nifty Properties

• B+(C>>1)+(A>>1): Always even
– If A and C are known, B can be discretized with interval of 2.
– Same for F+(E>>1)+(G>>1), E+(A>>1)+(I>>1), G+(C>>1)+(K>>1), etc.
• B–J+E–G: Multiple of 4
• 2.5(A+C)+2B+D: Multiple of 10
• 2.5(A+C+I+K)+2(E+G)+M+O: Multiple of 20
• 6.25(A+C+I+K)+5(B+E+G+J)+4F+2.5(D+L+M+O)+2(H+N)+P: Multiple of 100

[Figure: the summed basis pattern for each property, e.g. rows of (10, 0, 0, 0) for the multiple-of-10 case, a top row of (20, 0, 0, 20) for the multiple-of-20 case and a single 100 in the top-left corner for the multiple-of-100 case.]
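These properties can be checked numerically. The sketch below applies the H.264 4x4 transform to random integer residual blocks and tests each relation; fractional weights are scaled to integers (the last relation is multiplied by 4) to avoid floating point. The residual range of -255..255 and the helper names are assumptions of this sketch.

```python
import random

CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def transform(X):
    """Y = CF * X * CF^T in exact integer arithmetic."""
    T = [[sum(CF[i][m] * X[m][n] for m in range(4)) for n in range(4)]
         for i in range(4)]
    return [[sum(T[i][n] * CF[j][n] for n in range(4)) for j in range(4)]
            for i in range(4)]


def properties_hold(X):
    (A, B, C, D), (E, F, G, H), (I, J, K, L), (M, N, O, P) = transform(X)
    return all([
        (A + C) % 2 == 0,
        (A + C + I + K) % 4 == 0,
        (B + (C >> 1) + (A >> 1)) % 2 == 0,
        (B - J + E - G) % 4 == 0,
        # 2.5(A+C) is exact because A+C is even
        (5 * (A + C) // 2 + 2 * B + D) % 10 == 0,
        (5 * (A + C + I + K) // 2 + 2 * (E + G) + M + O) % 20 == 0,
        # the multiple-of-100 relation, times 4: it equals 400 * X[0][0]
        25 * (A + C + I + K) + 20 * (B + E + G + J) + 16 * F
        + 10 * (D + L + M + O) + 8 * (H + N) + 4 * P == 400 * X[0][0],
    ])


random.seed(0)
blocks = [[[random.randint(-255, 255) for _ in range(4)] for _ in range(4)]
          for _ in range(1000)]
print(all(properties_hold(b) for b in blocks))
```

The last check shows slightly more than divisibility: the weighted sum recovers the top-left residual sample directly.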

Enhanced Enhancement Coding

• For each block
1. Entropy-code the coefficient A using log2(Amax−Amin+1) bits
2. ‘Pack’ each of seven coefficients (B, C, D, E, I, K, M) and entropy-code in the same manner
3. For-loop the rest eight coefficients (Ω8: F, G, H, J, L, N, O, P) to find out index and cases
4. Entropy-code index using log2(cases) bits
• At 2 and 3, numerical properties are exploited

Loop count = 400 (2·10^18 times speed-up!)
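A minimal sketch of what ‘packing’ in step 2 could look like for the simplest property, A+C even: once A is known, C must share its parity, so only every second integer in [Cmin, Cmax] is a candidate. The function names and the example bounds are hypothetical; the packing of the other coefficients relies on further properties and is not covered here.

```python
import math


def first_candidate(lo, parity):
    """Smallest integer >= lo whose parity (value % 2) equals `parity`."""
    return lo if (lo - parity) % 2 == 0 else lo + 1


def pack(value, lo, hi, parity):
    """Return (index, count): position of `value` among same-parity candidates."""
    first = first_candidate(lo, parity)
    count = (hi - first) // 2 + 1 if hi >= first else 0
    return (value - first) // 2, count


def unpack(index, lo, parity):
    """Inverse of pack: recover the coefficient from its packed index."""
    return first_candidate(lo, parity) + 2 * index


# Hypothetical example: A = 7 is known, C lies in [-4, 5] and C = 3.
A, C, c_min, c_max = 7, 3, -4, 5
index, count = pack(C, c_min, c_max, A % 2)
saved_bits = math.log2(c_max - c_min + 1) - math.log2(count)
assert unpack(index, c_min, A % 2) == C
print(index, count, saved_bits)
```

With 10 integers in the bin and 5 candidates left after packing, exactly one bit is saved, matching the “one bit reduction” noted for C.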

Quantization and Bounds

[Figure: coefficients A…P -> quantization -> levels levelA…levelP -> bounds estimation -> bounds (Amin, Amax), (Bmin, Bmax), …, (Pmin, Pmax).]

levelA = (A*α + β) >> γ
x = levelA << γ
y = x + (1 << γ) – 1
Amin = (x – β + α – 1) / α
Amax = (y – β) / α

The bounds can be calculated/shared by both encoder and decoder.
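The bound formulas can be exercised with a short sketch. The values of α, β and γ below are illustrative only (chosen to resemble an H.264-style multiplier, rounding offset and shift), and sign handling of coefficients is left out; Python's `//` is floor division, so the `+ α – 1` term turns the first division into a ceiling.

```python
def quantize(A, alpha, beta, gamma):
    """levelA = (A*alpha + beta) >> gamma."""
    return (A * alpha + beta) >> gamma


def bounds(level, alpha, beta, gamma):
    """Smallest and largest integer A that quantize to `level`."""
    x = level << gamma
    y = x + (1 << gamma) - 1
    a_min = (x - beta + alpha - 1) // alpha   # ceiling of (x - beta) / alpha
    a_max = (y - beta) // alpha               # floor of (y - beta) / alpha
    return a_min, a_max


# Illustrative parameters (assumptions of this sketch, not taken from the talk).
ALPHA, GAMMA = 13107, 17
BETA = (1 << GAMMA) // 3

for A in range(0, 500):
    level = quantize(A, ALPHA, BETA, GAMMA)
    a_min, a_max = bounds(level, ALPHA, BETA, GAMMA)
    assert a_min <= A <= a_max
    # the bounds are tight: one step outside lands in a neighboring level
    assert quantize(a_min - 1, ALPHA, BETA, GAMMA) == level - 1
    assert quantize(a_max + 1, ALPHA, BETA, GAMMA) == level + 1
```

Only the level is needed to compute the bounds, which is why the decoder can derive them without any side information.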

Then Amin <= A <= Amax

Coding Simulation with JM-BL

• Implemented on JM8.3 (H.264 reference codec)
• Two mode selection strategies regarding the base layer (BL) and enhancement layer (EL):
1. OverallOpt: chooses a mode that minimizes the total (BL+EL) bit amount
2. BaseOpt: chooses a mode that RD-optimizes the BL, no care for the EL
• Compared the performance with:
– Motion JPEG 2000
– FRExt-based lossless coding
– Pixelwise residual entropy

Rate Trade-Off

[Figure: rate (% of uncompressed, axis 8 to 28; lower is better) versus QP(P) from 4 to 8 for the Silent sequence, IPP structure. Curves: Total (EL+BL), BL and EL for OverallOpt and BaseOpt, plus the pixel-wise residual entropy.]

Performance Comparison

[Figure: bar chart of rate (% of uncompressed, axis 0 to 70; lower is better) for the container, news, silent and stefan sequences. Bars: Motion JPEG2000, FRExt, OverallOpt (EL/BL) and BaseOpt (EL/BL), for intra coding and for inter coding with IPP, IBP and IBBP structures.]

Coding Simulation with JSVM-BL

• Implemented on JSVM 4 (H.264 scalable extension reference codec)
• 14 test sequences (QCIF/CIF/4CIF, 15/30/60Hz)
• BL structures: III…, IPP..., IBP... and IBBBP...
• Other settings
– FastSearch: used
– ME search range: 96
– Reference frame #: 1
– 8x8 transform: disabled
• Compared with:
– Ordinary coding of Motion JPEG2000 (lossless)
– Residual coding of Motion JPEG2000 (lossless) for EL
– Residual entropy

Bit Rate Evaluation

[Figure: evaluation setup. The original video is coded by the JSVM encoder, giving the “Proposed BL” (III, IPP, etc.), and decoded by the JSVM decoder. A residual packer followed by an arithmetic coder gives the “Proposed EL”. The pixel-wise residual (original video minus decoded video) goes to an entropy measurement (“EL Entropy”) and to a Motion JPEG 2000 lossless coder (“MJ2K EL”). The original video is also coded directly by a Motion JPEG 2000 lossless coder (“MJ2K (total)”).]

Bit Rate Breakout (CITY 4CIF, BL=III/IPP)

[Figure: rate (% of uncompressed, axis 0 to 60) versus QP from 3 to 13. Curves: Total (EL+BL), Base (BL) and Enhance (EL), each for III and IPP, with Motion JPEG2000 (MJ2K) as a reference.]

EL Bit Rate (BL=III…)

[Figure: EL rate (% of uncompressed, axis 0 to 30) versus QPi from 3 to 13. Curves: MJ2K, Entropy and Proposed.]

EL Bit Rate (BL=IPP…)

[Figure: EL rate (% of uncompressed, axis 0 to 30) versus QPp from 3 to 13. Curves: MJ2K, Entropy and Proposed.]

Performance Comparison

[Figure: bar chart of rate (% of uncompressed, axis 30 to 75) for MJ2K, III, IPP, IBP and IBBBP over the 14 test sequences (Bus, City, Crew, Foreman, Harbour, Mobile and Soccer at QCIF/CIF/4CIF, 15/30/60 Hz) and their average.]

• Intra BL compression ratio is 3 points worse than Motion JPEG2000
• Inter BL is 7 points better than Motion JPEG2000

BL:EL Bit Rate Ratio

[Figure: two stacked-bar charts of the base/enhancement bit rate split (0% to 100%) per sequence (BUS, CITY, CREW, FOREMAN, HARBOUR, MOBILE and SOCCER at QCIF/CIF/4CIF). Intra (III): average EL ratio = 29%. Inter (IBBBP): average EL ratio = 32%.]

• ~1/3 of the bits are spent for EL, regardless of BL GOP structure and sequence

Summary of Residual Packing

• Scalable lossless coding having H.264-compliant BL
– Drastic speed-up without losing efficiency
– 20% fewer bits than conventional pixel-wise residual coding
– Intra BL coding (III)
  – ~55% compression (compared to original)
  – ~3 points worse than Motion JPEG2000
  – 1-3% worse than Motion JPEG2000
  – 14% fewer bits than FRExt
– Inter BL coding (IPP, IBP, IBBBP)
  – ~44% compression (compared to original)
  – 5~7 points better than Motion JPEG2000
  – 14% more bits than FRExt
• BL can be safely RD-optimized (i.e., BaseOpt works well)
• Optimal EL amount: ~1/3 of total bit rate

VBS: A Gray-Scale Lossless Video Codec

[Shiodera et al., 2006]
• Video extension of MRP (see page 10)
• Variable block-sized motion compensation
– Quad-tree based, 32x32 to 8x8 sizes
– Integer-pel accuracy
• Context-based 3D-linear prediction
– M=24 contexts (predictors)
– P=20 pixels from current frame
– Q=25 pixels from previous frame
– All M(P+Q) coefficients are optimized for minimum rate, not for minimum MSE, and downloaded for each frame
• Encoding time: 3-4 minutes per frame
• Decoding time: 0.02 seconds per frame

SUT Codec Comparison (gray-scale)

[Shiodera et al., 2006]

[Figure: bar chart of compression ratio (axis 1.8 to 3.0) for VBS, FRExt, VBS-intra, JPEG-LS (intra) and JPEG 2000 (intra), with gaps of 15% and 14% annotated and a 34% gain due to motion compensation.]

Average of 6 gray-scale (8-bit) video sequences (all CIF, 25 frames)

MSU Codec Comparison (YUV 4:2:0)

[MSU “Lossless Codecs Comparison”, Apr. 2007]

[Figure: bar chart of compression ratio (axis 1.8 to 2.8) for the codecs tested by MSU, with FRExt added for reference and a 9.7% gap annotated.]

Average of 9 YUV 4:2:0 (YV12) video sequences
Weighted by image size (#pixels, not by #frames)
Also provides encoding time, RGB & YUV4:2:2 coding comparison
Note: Conditions are different from previous comparison

Summary

• Presented lossless coding technologies
– H.264 variants: FRExt extension and Scalable coding
– Introduced MRP, VBS and other state-of-the-art lossless codec comparisons (only quantitative comparison, since the conditions are not the same)
• Lossless image compression
– About 1/2 for gray image
– About 1/3 for RGB24 image
• Lossless video compression
– About 1/2.9 for both gray and YUV4:2:0 video

Last But Not Least…

• Drawbacks
– High data rate
– No rate control possible
– Difficult to lock in users (re-encoding with another codec is not a big deal)
• Merits
– No need to locally decode the signal
– Can predict blocks/frames in parallel
– No subjective viewing necessary
– Independent of monitors, viewers and viewing conditions

References

• MRP
  I. Matsuda et al.: “A Lossless Coding Scheme Using Adaptive Predictors and Arithmetic Code Optimized for Each Image,” Trans. IEICE (DII), vol. J88-D-II, no. 9, pp. 1798-1807, Sep. 2005 (in Japanese)
• The art of Lossless Data Compression
  http://artest1.tripod.com/graphics25.html
• Reversible DCT
  K. Komatsu and K. Sezaki: “Reversible of images,” IEICE Trans. Fundamentals, vol. J79-A, no. 4, pp. 981-990, Apr. 1996 (in Japanese)
• H.264 + reversible transform (and its follower)
  S. Takamura and Y. Yashima: “H.264-based Lossless Video Coding Using Adaptive Transforms,” ICASSP 2005, vol. 2, pp. 301-304, Philadelphia, Mar. 2005
  S. Sano et al.: “H.264-based lossless video coding using correlation of adjacent block,” FIT2006, J-023, 2006 (in Japanese)
• H.264 + residual packing
  S. Takamura and Y. Yashima: “Lossless Scalable Video Coding with H.264 Compliant Base Layer,” ICIP 2005, vol. 2, pp. II-754-757, Genova, Sep. 2005
• VBS
  T. Shiodera et al.: “A Lossless Video Coding Scheme Using MC and 3D Prediction Optimized for Each Frame,” Journal of the ITE, vol. 60, no. 7, Jul. 2006 (in Japanese)
• MSU codecs comparison 2007
  http://www.compression.ru/video/codec_comparison/lossless_codecs_2007_en.html