Perceptual Signal Coding for More Efficient Usage of Bit Codes
Scott Miller, Mahdi Nezamabadi, Scott Daly
Dolby Laboratories, Inc.

What defines a digital video signal?
• SMPTE 292M, SMPTE 372M, HDMI?
  – No, these are interface specifications
  – They don’t say anything about what the RGB or YCbCr values mean
• Rec601 or Rec709?
  – Not really, these are encoding specifications which define the OETF (Opto-Electrical Transfer Function) used for image capture
  – Image display ≠ inverse of image capture

© 2012 SMPTE · The 2012 Annual Technical Conference & Exhibition · www.smpte2012.org

What defines a digital video signal?
• This does!
• The EOTF (Electro-Optical Transfer Function) is what really matters
  – Content is created by artists while viewing a display
  – So the reference display defines the signal

Current Video Signals
• “Gamma” nonlinearity
  – Came from CRT (Cathode Ray Tube) physics
  – Very much the same since the 1930s
  – Works reasonably well since it is similar to human visual sensitivity (with caveats)
• No actual standard until last year! (2011)
  – Finally, with CRTs almost extinct, the effort was made to officially document their response curve
  – Result was ITU-R Recommendation BT.1886

[Figure: Recommendation ITU-R BT.1886 EOTF]

If gamma works so well, why change?
• Gamma is similar to perception within limits
  – Traditional cinema and television programs are viewed at moderately low light levels, and have relatively limited dynamic ranges
  – Within these constraints, gamma works
• If we stay in these ranges for the future, no change may be required
  – This is the current thinking for UHDTV (Ultra High Definition Television), detailed in ITU-R Report BT.2246

[Figure: 12 bit Rec1886 Curve with 100 nit Peak]

But 100 nits is no longer adequate!
• Most modern display devices are already operating well above this level
  – Consumer displays typically 200 to 500 nits
  – Commercial displays available at 1000 to 2000 nits
  – Laboratory displays at 4000 to 20,000 nits
• When we expand the range, gamma shows its limitations

[Figure: 12 bit Rec1886 Curves]

How can we improve performance?
• Add more bits
  – Not very practical: too many legacy pipelines
  – 10 or 12 bits is about the best we can expect typically
• Use a better curve
  – Power functions waste codes at high end
  – Log functions waste codes at low end
  – Greatest efficiency would be to follow human perception

Perception & Signal Coding
• DICOM (Digital Imaging and Communications In Medicine)
  – Grayscale standard display function, 1998
  – Barten model used directly for signal coding
  – 0.05 to ~4000 nits
  – Too aggressive with Barten parameters: visible steps at low end of scale

Perception & Signal Coding Continued
• Digital Cinema
  – Cowan, et al., 2004
  – Barten model used as benchmark for gamma coding
  – Only up to ~50 nits
• UHDTV
  – ITU-R Report BT.2246 mentioned earlier
  – Barten model again used as benchmark for gamma
  – Still thinking traditional 100 nit brightness

Barten CSF (Contrast Sensitivity Function) Model

$$
\mathrm{CSF}(u) = \frac{1}{m_t} = \frac{M_{\mathrm{opt}}(u)/k}{\sqrt{\dfrac{2}{T}\left(\dfrac{1}{X_0^2} + \dfrac{1}{X_{\max}^2} + \dfrac{u^2}{N_{\max}^2}\right)\left(\dfrac{1}{\eta p E} + \dfrac{\Phi_0}{1 - e^{-(u/u_0)^2}}\right)}}
$$

$$
M_{\mathrm{opt}}(u) = e^{-2\pi^2\sigma^2u^2}, \qquad \sigma = \sqrt{\sigma_0^2 + (C_{ab}\,d)^2}
$$

$$
d = 5 - 3\tanh\left(0.4\log\left(L X_0^2/40^2\right)\right), \qquad E = \frac{\pi d^2}{4}\,L\left(1 - (d/9.7)^2 + (d/12.4)^4\right)
$$

Parameters:
  – $k = 3.0$
  – $\sigma_0 = 0.5$ arc min
  – $C_{ab} = 0.08$ arc min/mm
  – $T = 0.1$ sec
  – $X_{\max} = 12°$
  – $N_{\max} = 15$ cycles
  – $\eta = 0.03$
  – $\Phi_0 = 3\times10^{-8}$ sec deg²
  – $u_0 = 7$ cycles/deg
  – $p = 1.2\times10^{6}$ photons/sec/deg²/Td

JNDs Based on Barten Model
• Barten parameters chosen conservatively
  – 40° angular size
  – Varied spatial frequency at every luminance level to track peaks of the CSF
  – Select peak luminance level
  – Iteratively calculate the rest of the steps

[Figure: Tracking Peaks of Contrast Sensitivity]

[Figure: Converting CSF to Contrast Steps]

Building Some Optimized Curves
• Choose peaks of 100, 1000, and 10,000 nits as before
  – Then pick f (the fraction of a JND per code word) so that a near zero minimum level is reached
  – 0 to 100 nits = 0.46 JNDs per code word at 12 bits
  – 0 to 1000 nits = 0.68 JNDs per code word at 12 bits
  – 0 to 10,000 nits = 0.9 JNDs per code word at 12 bits

[Figure: 12 bit Uniform JND Curves]

Functional Approximation
• Would be great to have a functional form of the iterative LUT
  – Helpful for standardization
  – Simpler to document
  – Invertibility is very helpful
  – Good alignment with a modified Naka-Rushton model

[Figure: Perceptual Quantizer (PQ) EOTF]

[Figure: 12 bit PQ and Rec1886 Curves]

[Figure: 10 bit PQ and Rec1886 Curves]

[Figure: Visual Test Framework]

[Figure: JND Cross Test Pattern]

JND Cross Test Results
Bits required for no visible boxes, by gray field luminance (cd/m²):

             0.001  0.005   0.01   0.03   0.05    0.1      1      5     10     50    100    250    500
PQ 1K           11     11     10     11     10     10     10     10     10     10     10     10      9
PQ 10K          11     11     11     10     11     11     11     11     10     11     10     10     10
Rec1886 1K     >12    >12    >12    >12    >12     12     12     12     12     10     10     10      9
Rec1886 2K     >12    >12    >12    >12    >12    >12    >12     12     12     11     11     10      9
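
The PQ EOTF slide survives here only as a title, so the following is a rough Python sketch of the kind of comparison behind the JND cross results, not the authors' implementation. Assumptions that are not in the slides: the PQ constants are the ones later standardized in SMPTE ST 2084, the BT.1886 black level is 0, code values are full range, and `step_percent` is a hypothetical helper name.

```python
# Sketch only: PQ constants are the SMPTE ST 2084 values (assumed to match the
# slide's formula, which is not present in the extracted text).
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875
PQ_PEAK = 10000.0        # nits at code value 1.0


def pq_eotf(v):
    """Normalized code value v in [0, 1] -> display luminance in nits."""
    p = v ** (1.0 / M2)
    return PQ_PEAK * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def pq_inverse_eotf(lum):
    """Luminance in nits -> normalized code value in [0, 1]."""
    y = (lum / PQ_PEAK) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def bt1886_eotf(v, lw=100.0, lb=0.0, gamma=2.4):
    """ITU-R BT.1886 EOTF: L = a * max(v + b, 0) ** gamma."""
    root = lw ** (1.0 / gamma) - lb ** (1.0 / gamma)
    a = root ** gamma
    b = lb ** (1.0 / gamma) / root
    return a * max(v + b, 0.0) ** gamma


def bt1886_inverse_eotf(lum, lw=100.0, lb=0.0, gamma=2.4):
    """Inverse of bt1886_eotf for lum >= lb."""
    root = lw ** (1.0 / gamma) - lb ** (1.0 / gamma)
    return (lum ** (1.0 / gamma) - lb ** (1.0 / gamma)) / root


def step_percent(eotf, inverse_eotf, lum, bits):
    """Relative luminance step (%) between two adjacent code words near lum.

    Hypothetical helper; assumes full-range integer codes 0 .. 2**bits - 1.
    """
    top = 2 ** bits - 1
    # Nearest code to the target luminance, clamped so k and k + 1 are valid
    # and the lower code is above zero luminance.
    k = min(max(round(inverse_eotf(lum) * top), 1), top - 1)
    lo, hi = eotf(k / top), eotf((k + 1) / top)
    return 100.0 * (hi - lo) / lo


if __name__ == "__main__":
    # Compare 12 bit PQ (10,000 nit) with 12 bit BT.1886 at a 1000 nit peak.
    for lum in (0.01, 1.0, 100.0):
        pq = step_percent(pq_eotf, pq_inverse_eotf, lum, 12)
        gamma_step = step_percent(
            lambda v: bt1886_eotf(v, lw=1000.0),
            lambda l: bt1886_inverse_eotf(l, lw=1000.0),
            lum, 12)
        print(f"{lum:>7} nits: PQ 10K {pq:.2f}%  BT.1886 1K {gamma_step:.2f}%")
```

Under these assumptions the PQ step near 0.01 nits comes out well below the gamma step, which matches the qualitative pattern in the results table: gamma needs more bits at the dark end of the range.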

Video Images
[Figure: test images: Dark Ramp, Black Controllers, Glacier, White Feathers, Black Feathers, Charcoal, White Paper, Plane Hangar]

Video Images Test Results
Bits required for no visible banding:

Image Name          PQ 1K  PQ 10K  Rec1886 1K  Rec1886 2K
Dark Ramp              10      10         >12         >12
Black Controllers       9       9          11          11
Black Feathers          8       9          10          11
Charcoal                8       8          10          10
Glacier                 7       8           7           7
White Feathers          7       8           7           7
White Paper             8       9           8           9
Plane Hangar            9       9           9           9

PQ Shows Balanced Performance Across Entire Luminance Range
• 10 or 11 bits were always enough to eliminate visible quantization artifacts with PQ
• Most PQ images looked good with 10 bits or even slightly less
• Gamma fails at low luminance levels, even at higher bit depths

PQ Enables Good Performance With High Dynamic Ranges
• PQ shows 1 to 2 bit advantage over gamma at low end with 1000 nit peak signals
• Only slight performance impact going to 10,000 nit peak signals with PQ
• Gamma gets much worse with higher peak levels

Questions?

SMPTE Meeting Presentation

Perceptual Signal Coding for More Efficient Usage of Bit Codes

Scott Miller
Dolby Laboratories, Inc., 1040 Stony Hill Road, Yardley, PA

Mahdi Nezamabadi
Dolby Laboratories, Inc., 1040 Stony Hill Road, Yardley, PA

Scott Daly
Dolby Laboratories, Inc., 432 Lakeside Drive, Sunnyvale, CA

Written for presentation at the 2012 SMPTE Annual Technical Conference & Exhibition

Abstract. As the performance of electronic display systems continues to increase, the limitations of current signal coding methods become more and more apparent.
With bit depth limitations set by industry standard interfaces, a more efficient coding system is desired to allow image quality to increase without requiring expansion of legacy infrastructure bandwidth. A good approach to this problem is to let the human visual system determine the quantization curve used to encode video signals. In this way optimal efficiency is maintained across the luminance range of interest, and the visibility of quantization artifacts is kept to a uniformly small level.

Keywords. perception, human visual system, transfer function, perceptual curve, signal, encoding, coding, bit depth, gamma, logarithmic, Barten, efficiency, EOTF

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the Society of Motion Picture and Television Engineers (SMPTE), and its printing and distribution does not constitute an endorsement of views which may be expressed. This technical presentation is subject to a formal peer-review process by the SMPTE Board of Editors, upon completion of the conference. Citation of this work should state that it is a SMPTE meeting paper. EXAMPLE: Author's Last Name, Initials.