Perceptual Signal Coding for More Efficient Usage of Bit Codes

Scott Miller, Mahdi Nezamabadi, Scott Daly
Dolby Laboratories, Inc.

What defines a digital video signal?

• SMPTE 292M, SMPTE 372M, HDMI?
  – No, these are interface specifications
  – They don’t say anything about what the RGB or YCbCr values mean
• Rec601 or Rec709?
  – Not really, these are encoding specifications which define the OETF (Opto-Electrical Transfer Function) used for image capture
  – Image display ≠ inverse of image capture

© 2012 SMPTE · 2012 Annual Technical Conference & Exhibition · www.smpte2012.org

What defines a digital video signal?

• This does!
• The EOTF (Electro-Optical Transfer Function) is what really matters
  – Content is created by artists while viewing a display
  – So the reference display defines the signal

Current Video Signals

• “Gamma” nonlinearity
  – Came from CRT (Cathode Ray Tube) physics
  – Very much the same since the 1930s
  – Works reasonably well since it is similar to human visual sensitivity (with caveats)
• No actual standard until last year! (2011)
  – Finally, with CRTs almost extinct, the effort was made to officially document their response curve
  – Result was ITU-R Recommendation BT.1886

Recommendation ITU-R BT.1886 EOTF

If gamma works so well, why change?

• Gamma is similar to perception within limits
  – Traditional cinema and television programs are viewed at moderately low light levels, and have relatively limited dynamic ranges
  – Within these constraints, gamma works
• If we stay in these ranges for the future, no change may be required
  – This is the current thinking for UHDTV (Ultra High Definition Television), detailed in ITU-R Report BT.2246

12 bit Rec1886 Curve with 100 nit Peak

But 100 nits is no longer adequate!

• Most modern display devices are already operating well above this level
  – Consumer displays typically 200 to 500 nits
  – Commercial displays available at 1000 to 2000 nits
  – Laboratory displays at 4000 to 20,000 nits
• When we expand the range, gamma shows its limitations

12 bit Rec1886 Curves

How can we improve performance?

• Add more bits
  – Not very practical: too many legacy pipelines
  – 10 or 12 bits is about the best we can expect typically
• Use a better curve
  – Power functions waste codes at high end
  – Log functions waste codes at low end
  – Greatest efficiency would be to follow human perception

Perception & Signal Coding

• DICOM (Digital Imaging and Communications in Medicine)
  – Grayscale standard display function, 1998
  – Barten model used directly for signal coding
  – 0.05 to ~4000 nits
  – Too aggressive with Barten parameters: visible steps at low end of scale

Perception & Signal Coding Continued

• Digital Cinema
  – Cowan, et al., 2004
  – Barten model used as benchmark for gamma coding
  – Only up to ~50 nits
• UHDTV
  – ITU-R Report BT.2246 mentioned earlier
  – Barten model again used as benchmark for gamma
  – Still thinking traditional 100 nit brightness

Barten CSF (Contrast Sensitivity Function) Model

$$ \mathrm{CSF} = \frac{1}{m_t} = \frac{M_{opt}(u)/k}{\sqrt{\dfrac{2}{T}\left(\dfrac{1}{X_0^2} + \dfrac{1}{X_{max}^2} + \dfrac{u^2}{N_{max}^2}\right)\left(\dfrac{1}{\eta p E} + \dfrac{\Phi_0}{1 - e^{-(u/u_0)^2}}\right)}} $$

$$ M_{opt}(u) = e^{-2\pi^2 \sigma^2 u^2} $$

$$ \sigma = \sqrt{\sigma_0^2 + (C_{ab}\, d)^2} $$

$$ d = 5 - 3\tanh\left(0.4 \log\left(L X_0^2 / 40^2\right)\right) $$

$$ E = \frac{\pi d^2}{4}\, L \left(1 - (d/9.7)^2 + (d/12.4)^4\right) $$

Parameter values: k = 3.0; σ0 = 0.5 arc min; Cab = 0.08 arc min/mm; T = 0.1 sec; Xmax = 12°; Nmax = 15 cycles; η = 0.03; Φ0 = 3×10^-8 sec deg2; u0 = 7 cycles/deg; p = 1.2×10^6 photons/sec/deg2/Td

JNDs Based on Barten Model

• Barten parameters chosen conservatively
  – 40º angular size
  – Varied spatial frequency at every luminance level to track peaks of the CSF
  – Select peak luminance level
  – Iteratively calculate the rest of the steps

Tracking Peaks of Contrast Sensitivity

Converting CSF to Contrast Steps

Building Some Optimized Curves

• Choose peaks of 100, 1000, and 10,000 nits as before
  – Then pick f so that a near zero minimum level is reached
  – 0 to 100 nits = 0.46 JNDs per code word at 12 bits
  – 0 to 1000 nits = 0.68 JNDs per code word at 12 bits
  – 0 to 10,000 nits = 0.9 JNDs per code word at 12 bits

12 bit Uniform JND Curves

Functional Approximation

• Would be great to have a functional form of the iterative LUT
  – Helpful for standardization
  – Simpler to document
  – Invertibility is very helpful
  – Good alignment with a modified Naka-Rushton model

Perceptual Quantizer (PQ) EOTF

12 bit PQ and Rec1886 Curves

10 bit PQ and Rec1886 Curves

Visual Test Framework

JND Cross Test Pattern

JND Cross Test Results

Gray field (cd/m2)  0.001  0.005   0.01   0.03   0.05    0.1      1      5     10     50    100    250    500
PQ 1K                  11     11     10     11     10     10     10     10     10     10     10     10      9
PQ 10K                 11     11     11     10     11     11     11     11     10     11     10     10     10
Rec1886 1K            >12    >12    >12    >12    >12     12     12     12     12     10     10     10      9
Rec1886 2K            >12    >12    >12    >12    >12    >12    >12     12     12     11     11     10      9

Bits required for no visible boxes

Video Images

Dark Ramp, Black Controllers, Glacier, White Feathers, Black Feathers, Charcoal, White Paper, Plane Hangar

Video Images Test Results

Image name          PQ 1K   PQ 10K   Rec1886 1K   Rec1886 2K
Dark Ramp              10       10          >12          >12
Black Controllers       9        9           11           11
Black Feathers          8        9           10           11
Charcoal                8        8           10           10
Glacier                 7        8            7            7
White Feathers          7        8            7            7
White Paper             8        9            8            9
Plane Hangar            9        9            9            9

Bits required for no visible banding

PQ Shows Balanced Performance Across Entire Luminance Range

• 10 or 11 bits were always enough to eliminate visible quantization artifacts with PQ
• Most PQ images looked good with 10 bits or even slightly less
• Gamma fails at low luminance levels, even at higher bit depths

PQ Enables Good Performance With High Dynamic Ranges

• PQ shows 1 to 2 bit advantage over gamma at low end with 1000 nit peak signals
• Only slight performance impact going to 10,000 nit peak signals with PQ
• Gamma gets much worse with higher peak levels

Questions?

SMPTE Meeting Presentation

Perceptual Signal Coding for More Efficient Usage of Bit Codes

Scott Miller, Dolby Laboratories, Inc., 1040 Stony Hill Road, Yardley, PA
Mahdi Nezamabadi, Dolby Laboratories, Inc., 1040 Stony Hill Road, Yardley, PA
Scott Daly, Dolby Laboratories, Inc., 432 Lakeside Drive, Sunnyvale, CA

Written for presentation at the 2012 SMPTE Annual Technical Conference & Exhibition

Abstract. As the performance of electronic display systems continues to increase, the limitations of current signal coding methods become more and more apparent. With bit depth limitations set by industry standard interfaces, a more efficient coding system is desired to allow image quality to increase without requiring expansion of legacy infrastructure bandwidth. A good approach to this problem is to let the human visual system determine the quantization curve used to encode video signals. In this way optimal efficiency is maintained across the luminance range of interest, and the visibility of quantization artifacts is kept to a uniformly small level.

Keywords. perception, human visual system, transfer function, perceptual curve, signal, encoding, coding, bit depth, gamma, logarithmic, Barten, efficiency, EOTF

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the Society of Motion Picture and Television Engineers (SMPTE), and its printing and distribution does not constitute an endorsement of views which may be expressed. This technical presentation is subject to a formal peer-review process by the SMPTE Board of Editors, upon completion of the conference. Citation of this work should state that it is a SMPTE meeting paper. EXAMPLE: Author's Last Name, Initials. 2011. Title of Presentation, Meeting name and location.: SMPTE. For information about securing permission to reprint or reproduce a technical presentation, please contact SMPTE at 914-761-1100 (3 Barker Ave., White Plains, NY 10601).

Copyright © 2012 Society of Motion Picture and Television Engineers. All rights reserved.

Introduction

The fundamental basis for interpreting any visual signal is knowledge of that signal’s transfer function: the description of how to convert the signal’s carrier (analog voltage, film density, or digital code values) to optical energy. With electronic displays for television and film, the critical information is found in the EOTF (electro-optical transfer function) for reference standard displays. The vast majority of content is color graded (either live in the camera, or during post production) according to artistic preference while viewing on a reference standard display. Therefore it is the EOTF, and not the OETF (opto-electronic transfer function, used in camera capture), that truly defines the intent of visual signal code values.

Reference EOTF curves have been defined for television [1] and digital cinema [2] applications, both based on power functions, with exponent values of 2.4 and 2.6 respectively. While these systems have some known issues with dark level reproduction, they have been used with great success for many years. This success comes primarily because these curves crudely approximate human perception when implemented on relatively dim reference displays with a peak brightness of ~50 to 100 cd/m2 and a dynamic range (or contrast) of less than 3 log units of luminance. As typical display brightness and dynamic range have steadily increased, this approximation has become more and more inaccurate. Typical displays today achieve peak levels of 500 cd/m2 or more (with several commercial examples above 1000 cd/m2), and artifacts in dark details have increased proportionately. Further, digital driving circuitry has vastly reduced display noise, and better cameras and synthetic imagery have made image capture noise vastly lower or even zero. Thus the well-known effect of masking by noise no longer hinders the visibility of low amplitude artifacts.

It is clear that the displays of today and the future could benefit from a better system.

Gamma Coding and Perception

The ITU-R Rec. BT.1886 [1] EOTF for television, commonly referred to as “gamma encoding”, is often said to be perceptually linear. A recent ITU report on Ultra-High Definition Television (UHDTV), Report ITU-R BT.2246 [3], used a scaled Barten contrast sensitivity function, called “Barten (Ramp)”, along with an alternative threshold function by Schreiber, to illustrate that the ITU-R Rec. BT.1886 EOTF for HDTV behaves similarly to human perception and is near or below visual detection thresholds for 10 and 12 bit implementations. This is roughly the case for a gamma curve with a peak level of 100 cd/m2 (or 100 nits), as shown in figure 1. When higher peak luminance levels are used, however, the 12 bit gamma curve quickly rises above both the Barten and Schreiber thresholds, suggesting that visible quantization artifacts become likely, especially at the dark end of the luminance range.

Figure 1. 12 bit Rec1886 gamma curves with peak luminances of 100, 1000, and 10,000 cd/m2.
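The effect described above can be reproduced numerically. The following is a minimal sketch, not the authors' plotting code: it uses the BT.1886 reference EOTF with its 2.4 exponent and reports the contrast between two adjacent 12 bit code values near a chosen luminance. The function names, the default black level of 0 cd/m2, and the use of full-range codes (rather than the SDI-legal range) are assumptions made for illustration.

```python
GAMMA = 2.4  # display exponent used by the BT.1886 reference EOTF


def _coeffs(l_white, l_black):
    """Gain a and black lift b of the BT.1886 EOTF for given white/black levels."""
    root_w = l_white ** (1.0 / GAMMA)
    root_b = l_black ** (1.0 / GAMMA)
    a = (root_w - root_b) ** GAMMA
    b = root_b / (root_w - root_b)
    return a, b


def bt1886_eotf(v, l_white, l_black=0.0):
    """Screen luminance in cd/m2 for a normalized signal v in [0, 1]."""
    a, b = _coeffs(l_white, l_black)
    return a * max(v + b, 0.0) ** GAMMA


def bt1886_inverse(lum, l_white, l_black=0.0):
    """Normalized signal value that produces luminance lum (cd/m2)."""
    a, b = _coeffs(l_white, l_black)
    return (lum / a) ** (1.0 / GAMMA) - b


def step_modulation(lum, l_white, bits=12, l_black=0.0):
    """Contrast (Lhi - Llo) / (Lhi + Llo) between the code nearest lum and the next code up."""
    max_code = 2 ** bits - 1  # assumption: full-range codes, not the SDI-legal range
    code = round(bt1886_inverse(lum, l_white, l_black) * max_code)
    code = min(max(code, 0), max_code - 1)
    lo = bt1886_eotf(code / max_code, l_white, l_black)
    hi = bt1886_eotf((code + 1) / max_code, l_white, l_black)
    return (hi - lo) / (hi + lo)


if __name__ == "__main__":
    for peak in (100.0, 1000.0, 10000.0):
        print(f"peak {peak:>7.0f} cd/m2: step contrast near 1 cd/m2 = "
              f"{step_modulation(1.0, peak):.4f}")
```

At a fixed dark luminance the step contrast grows as the peak level rises, because the same luminance then falls on a lower code value where the power function's steps are relatively coarser.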


Though the system precision could be increased by using a higher bit depth, legacy infrastructures would be difficult to push beyond 12 bits. In fact most live production and broadcast environments still operate at the 10 bit level, so a system which could provide increased performance at these common bit depths would be ideal.

Barten Perceptual Model

Several models have been created over the years to represent the human visual system response. One well respected model for the contrast sensitivity function (CSF) was developed by Peter Barten [4] and has been referenced by many electronic imaging studies and standards. This complex model of contrast sensitivity is based on physics, optics, and some experimentally determined parameters. It has been shown to align well with many visual experiments spanning several decades of research, and is summarized in figure 2, where S is contrast sensitivity, L is luminance in cd/m2, u is spatial frequency in cycles/deg, and $X_0$ is angular size in deg.

Figure 2. Barten model for human visual contrast sensitivity.

For more details, consult Barten’s 2004 paper [5], where he describes the equation and documents most of the commonly accepted parameter values. This model was used by Digital Imaging and Communications in Medicine (DICOM) to create a specialized EOTF for the medical industry [6], and was also used as a visual threshold reference in studies for digital cinema [7] and UHDTV [3].

An EOTF Based on Perception

To create a more efficient EOTF, a curve is desired that is a closer fit to the actual human visual response. Since the Barten model has been used effectively as a benchmark for evaluating the performance of other EOTF curves, why not use it directly to compute an optimized perceptual EOTF? For this system, Barten model parameters were chosen to be very similar to those used in prior studies, with two exceptions. First, the angular size $X_0$ was chosen to be 40 degrees; this angle is representative of many display scenarios, and the overall system is near its peak sensitivity at 40 degrees. Second, the spatial frequency u was allowed to vary with luminance to track the maximum sensitivity of the human visual system, that is, to follow the peak of the CSF as its shape changes with adapting luminance level, as shown in figure 3.
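The peak-tracking idea can be sketched in a few lines. The code below is an illustrative reading of the Barten equation using the parameter values listed with the model in the slides, with the angular size set to 40 degrees as described above. Three details are assumptions rather than statements from the text: the logarithm is taken as base 10, the blur term σ is converted from arc minutes to degrees before it is combined with u, and the peak is found by a simple log-spaced grid search whose range and density are arbitrary.

```python
import math

# Parameter values as listed with the model; X0 = 40 degrees per the text.
K = 3.0        # k
SIGMA0 = 0.5   # arc min
C_AB = 0.08    # arc min / mm
T = 0.1        # sec
X_MAX = 12.0   # deg
N_MAX = 15.0   # cycles
ETA = 0.03
PHI0 = 3e-8    # sec deg^2
U0 = 7.0       # cycles/deg
P = 1.2e6      # photons / sec / deg^2 / Td
X0 = 40.0      # deg


def barten_csf(u, lum):
    """Contrast sensitivity at spatial frequency u (cycles/deg) and luminance lum (cd/m2)."""
    # Pupil diameter in mm (assumption: log is base 10).
    d = 5.0 - 3.0 * math.tanh(0.4 * math.log10(lum * X0 ** 2 / 40.0 ** 2))
    # Optical blur, converted from arc minutes to degrees (assumption).
    sigma = math.sqrt(SIGMA0 ** 2 + (C_AB * d) ** 2) / 60.0
    m_opt = math.exp(-2.0 * math.pi ** 2 * sigma ** 2 * u ** 2)
    # Retinal illuminance term E.
    e = math.pi * d ** 2 / 4.0 * lum * (1.0 - (d / 9.7) ** 2 + (d / 12.4) ** 4)
    spatial = 1.0 / X0 ** 2 + 1.0 / X_MAX ** 2 + u ** 2 / N_MAX ** 2
    noise = 1.0 / (ETA * P * e) + PHI0 / (1.0 - math.exp(-(u / U0) ** 2))
    return (m_opt / K) / math.sqrt(2.0 / T * spatial * noise)


def peak_sensitivity(lum, u_min=0.1, u_max=20.0, samples=400):
    """Grid search for the frequency that maximizes the CSF at luminance lum."""
    best_u, best_s = u_min, 0.0
    for i in range(samples):
        u = u_min * (u_max / u_min) ** (i / (samples - 1))
        s = barten_csf(u, lum)
        if s > best_s:
            best_u, best_s = u, s
    return best_u, best_s


if __name__ == "__main__":
    for lum in (0.01, 1.0, 100.0, 10000.0):
        u, s = peak_sensitivity(lum)
        print(f"L = {lum:>8} cd/m2: peak CSF ~ {s:.0f} at u ~ {u:.2f} cycles/deg")
```

The reciprocal of the returned peak sensitivity is the minimum detectable modulation at that luminance, which is the quantity the step construction needs.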


Figure 3. Tracking the peaks of contrast sensitivity when adjusting luminance levels.

Plugging all of these parameters into the Barten equation (along with a luminance level) generates the estimated contrast sensitivity function for human vision at that luminance level. Other work has shown that the visual system sensitivity is maximized at the luminance level to which it is adapted, known as the crispening effect [8]. With a CSF response now determined, its inverse defines the minimum detectable modulation $m_t$ for every luminance level ($m_t = 1/\mathrm{CSF}$). Since $m_t$ is also defined by equation 1, a signal can be constructed based on single just noticeable difference (JND) steps by using the form shown in equation 2.

$$ m_t = \frac{L_{max} - L_{min}}{L_{max} + L_{min}} \quad \text{or} \quad L_{max} = L_{min}\,\frac{1 + m_t}{1 - m_t} \qquad (1\ \&\ 2) $$

Starting with any selected luminance level $L_j$, the next level up $L_{j+1}$ or the next level down $L_{j-1}$ can be calculated using the relationships in equations 3 & 4.

$$ L_{j+1} = L_j\,\frac{1 + m_t}{1 - m_t} \quad \text{or} \quad L_{j-1} = L_j\,\frac{1 - m_t}{1 + m_t} \qquad (3\ \&\ 4) $$

As an alternative, each signal level can be stepped by a fraction of a JND instead. This fraction value is designated as f in equations 5 & 6.

$$ L_{j+1} = L_j\,\frac{1 + f m_t}{1 - f m_t} \quad \text{or} \quad L_{j-1} = L_j\,\frac{1 - f m_t}{1 + f m_t} \quad \text{where } f < 1 \qquad (5\ \&\ 6) $$

With the appropriate selection of the JND fraction f, any desired luminance range can be matched to the exact number of code word steps available. By setting f to a value of 0.46, the 4060 code words of a 12 bit SDI-legal signal can be used to cover the range from 100 cd/m2 down to ~10^-6 cd/m2. To state this another way: a perceptually based EOTF can be constructed with 12 bits such that the lowest code word represents ~1x10^-6 cd/m2 and the highest code word represents ~100 cd/m2, with a precision of 0.46 JNDs per code word step. Choosing larger f values produces larger ranges of luminance. For example, an f value of 0.68 covers ~10^-6 to 1000 cd/m2, and an f of 0.9 covers ~10^-6 to 10,000 cd/m2. These curves are plotted in figure 4.
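The downward iteration of equation 6 can be sketched as follows. This is an illustration of the stepping rule only: the threshold function is passed in as a callable, and the constant 1% threshold used in the example is a stand-in chosen so that the result is easy to check by hand, not the Barten-derived threshold the procedure above relies on. The function name and the choice to evaluate the threshold at the current level $L_j$ are likewise assumptions.

```python
def jnd_table(peak, steps, f, threshold):
    """Build `steps` luminance levels ending at `peak`, spaced f JNDs apart.

    threshold(lum) must return the minimum detectable modulation m_t at
    luminance lum. Levels are generated from the peak downward with
    L[j-1] = L[j] * (1 - f*m_t) / (1 + f*m_t), then returned in ascending order.
    """
    levels = [peak]
    for _ in range(steps - 1):
        lum = levels[-1]
        fm = f * threshold(lum)  # assumption: m_t evaluated at the current level
        levels.append(lum * (1.0 - fm) / (1.0 + fm))
    return levels[::-1]


if __name__ == "__main__":
    # Stand-in threshold: constant 1% modulation at every luminance (hypothetical).
    table = jnd_table(peak=100.0, steps=5, f=0.5, threshold=lambda lum: 0.01)
    for code, lum in enumerate(table):
        print(code, round(lum, 4))
```

With a constant threshold the levels form a geometric series; with a luminance-dependent threshold the ratio between neighbours changes from step to step. In practice f would be tuned until the table of the desired length reaches the desired minimum level.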


Figure 4. 12 bit perceptual, uniform JND curves with peak luminances of 100, 1000, and 10,000 cd/m2.

The plots show that the perceptually based curves vary much less in the precision of their contrast steps than their gamma based counterparts did: the lines lie closer together. Because of the high efficiency of the perceptual EOTF, and the desire for larger available peak luminance levels for future displays, a 10,000 cd/m2 version was selected for extended testing.

Although this EOTF built by iteration could be incorporated directly into look up tables (LUTs) used in video hardware, an equation which closely approximates this curve is also useful. A direct functional form simplifies specification and standardization by representing the curve in a compactly written equation. This equation (and its inverse) may also be implemented directly by software based systems for video processing as an alternative to large LUTs. A modification of the Naka-Rushton cone response equation [9] was found to fit the iterative JND table well (this discovery courtesy of Robin Atkins, also at Dolby Laboratories), with the form and parameter values shown in equation 7 below.

$$ Y = L \left( \frac{V^{1/m} - c_1}{c_2 - c_3 V^{1/m}} \right)^{1/n} \qquad (7) $$

$$ 0 \le V \le 1;\quad L = 10{,}000;\quad m = 78.8438;\quad n = 0.1593;\quad c_1 = 0.8359;\quad c_2 = 18.8516;\quad c_3 = 18.6875 $$

Here V is the normalized signal value, Y is the displayed luminance in cd/m2, and L is the peak luminance of the curve.

For compact reference, this functional form is referred to as the Perceptual Quantizer (or PQ) curve. This signal encoding is anchored to absolute luminance levels viewed on the display screen (note that this is not absolute luminance at capture; television and cinema are display referred and not scene referred systems). The PQ curve has nearly a square-root behaviour (slope = -1/2) at the darkest light levels, consistent with the Rose-DeVries law based on photon detection statistics, and then rolls off to a constant zero slope for the highest light levels, which is consistent with the log behaviour of the well-known Weber’s law. Between those extreme luminance regions it exhibits varying slopes, and throughout the mid luminance levels it exhibits a slope similar to the gamma nonlinearities.
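As a sketch of how equation 7 and its inverse might be implemented in software, the code below uses the seven constants given with the equation. The function names are illustrative. Clamping the numerator at zero is an assumption added here so that signal values near zero do not produce a negative base under the fractional exponent; it is not part of equation 7 as written. The `quantize` helper, which rounds to a full-range integer code, is also an assumption rather than the SDI-legal coding discussed earlier.

```python
PEAK = 10000.0  # cd/m2, peak luminance of the curve
M = 78.8438
N = 0.1593
C1 = 0.8359
C2 = 18.8516
C3 = 18.6875


def pq_eotf(v):
    """Displayed luminance in cd/m2 for a normalized signal v in [0, 1] (equation 7)."""
    p = v ** (1.0 / M)
    num = max(p - C1, 0.0)  # assumption: clamp so very small v maps to zero luminance
    return PEAK * (num / (C2 - C3 * p)) ** (1.0 / N)


def pq_inverse(lum):
    """Normalized signal value for a luminance lum in [0, PEAK] cd/m2."""
    q = (lum / PEAK) ** N
    return ((C1 + C2 * q) / (1.0 + C3 * q)) ** M


def quantize(lum, bits):
    """Round-trip a luminance through an integer code of the given bit depth."""
    max_code = 2 ** bits - 1  # assumption: full-range codes
    code = round(pq_inverse(lum) * max_code)
    return pq_eotf(code / max_code)


if __name__ == "__main__":
    for v in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"V = {v:.2f} -> {pq_eotf(v):.4f} cd/m2")
    for bits in (8, 10, 12):
        print(f"{bits} bit round trip of 100 cd/m2 -> {quantize(100.0, bits):.3f} cd/m2")
```

The inverse follows from solving equation 7 for $V^{1/m}$, which is why a closed form exists; this invertibility is one of the practical benefits of the functional approximation over the iterated table.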

Visual Tests

A real world comparison was developed to illustrate the advantages of the Perceptual Quantizer over traditional ITU-R Rec. BT.1886 gamma for some brighter display scenarios. Figure 5a shows plots of the 12 bit PQ curve generated for a range up to 10K cd/m2, as well as a version generated for a range up to 1K cd/m2, compared to a 1K cd/m2 peak Rec1886 curve. The 1K PQ signal shows much higher performance than the ITU-R Rec. BT.1886 gamma function, and even though it covers an order of magnitude greater dynamic range, the 10K PQ signal also shows higher performance in the shadow and midtone regions than the 1K Rec1886.

Figure 5a. 12 bit PQ curves compared to 12 bit Rec1886 curve at 1000 cd/m2 peak.

Figure 5b. 10 bit PQ curves compared to 10 bit Rec1886 curve at 1000 cd/m2 peak.

Though the ITU-R Rec. BT.1886 systems show greater precision in their brightest region, the plots indicate that these areas are below perceptual thresholds, so this extra precision is likely to be “wasted” visually and not contribute to a better viewing experience. Figure 5b shows the same plots for 10 bit versions of all three curves. The relative performance differences are similar, but with the coarser quantization it would seem that ITU-R Rec. BT.1886 ought to exhibit visible artifacts in the darker regions, while the PQ curves are still near or below the ITU-R BT.2246 thresholds for their entire ranges.

In order to validate the actual, perceived performance of these EOTF curves, visual tests were conducted at several different bit depths. A test chart was developed to target the most sensitive area of the human visual system: the region near white or gray. Flat field images with a D65 chromaticity were created in a linear RGB space with UHDTV primaries and a D65 white point. These images were then quantized using the inverses of four different proposed EOTF functions: ITU-R Rec. BT.1886 with a 0.001 cd/m2 black point and 1000 cd/m2 white point, ITU-R Rec. BT.1886 with a 0.002 cd/m2 black point and 2000 cd/m2 white point (both 1,000,000:1 contrast ratio), a PQ curve with a 1000 cd/m2 peak level, and the PQ curve with a 10,000 cd/m2 peak level. Each curve under test was quantized to six different bit depths, from 7 bits up to 12 bits. The images were then brought back into linear space by the corresponding EOTF and sent to a high quality reference display (Dolby PRM-4200) for viewing. This display has a displayable luminance range of roughly 0.001 to 600 cd/m2 and covers the “P3” color gamut (the minimum for DCI compliance). A block diagram of the test framework is shown in figure 6.


Figure 6. Visual test framework.

All images were kept within the P3, 600 cd/m2 range of the display (using only a subset of the 1000 to 10,000 cd/m2 systems under test) to ensure no clipping artifacts would disrupt the visual evaluations. After quantization, the JND Cross test pattern shown in figure 7a was created by perturbing square areas of the gray field by one quantization step in every direction in the RGB color space: 26 possible combinations, as shown in figure 7b. This pattern has the advantage of testing quantization visibility in both the luminance and chromatic dimensions, in a way that explicitly spreads the levels over the exact color gamut being taken into account.
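The count of 26 follows from taking every combination of -1, 0, or +1 code steps on the three channels and discarding the unperturbed case (3 × 3 × 3 − 1). A minimal sketch of that enumeration is shown below; the function names are illustrative, and representing the gray field as equal R, G, and B code values is an assumption about how the neutral field is coded.

```python
from itertools import product


def jnd_cross_offsets():
    """All one-step perturbations in RGB code space, excluding the gray field itself."""
    return [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def perturbed_codes(gray_code, max_code):
    """RGB code triplets for the boxes around a gray field (assumed R = G = B)."""
    boxes = []
    for dr, dg, db in jnd_cross_offsets():
        rgb = (gray_code + dr, gray_code + dg, gray_code + db)
        # Drop any box that would fall outside the legal code range.
        if all(0 <= c <= max_code for c in rgb):
            boxes.append(rgb)
    return boxes


if __name__ == "__main__":
    print(len(jnd_cross_offsets()))            # number of perturbation directions
    print(perturbed_codes(512, 1023)[:3])      # first few boxes for a 10 bit mid gray
```

Six of the offsets change a single channel, twelve change two channels, and eight change all three, which together cover the faces, edges, and corners of a cube around the gray point.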

Figure 7a. JND Cross test pattern.

Figure 7b. Color box locations in RGB space.

A total of thirteen JND Cross images were generated at different luminance levels for the D65 gray field. For each curve under test, the bit depth of the quantization was started at 7 bits, then increased by one bit at a time until none of the colored boxes were visible (a box is visible when the quantization step pushes it more than a threshold difference away from the gray field). At each luminance level, the bit depth required to make all quantization steps invisible was recorded. If visible boxes were still present at 12 bits, a “>12” value was recorded. Results of the JND Cross test are shown in table 1.


Gray field (cd/m2)  0.001  0.005   0.01   0.03   0.05    0.1      1      5     10     50    100    250    500
PQ 1K                  11     11     10     11     10     10     10     10     10     10     10     10      9
PQ 10K                 11     11     11     10     11     11     11     11     10     11     10     10     10
Rec1886 1K            >12    >12    >12    >12    >12     12     12     12     12     10     10     10      9
Rec1886 2K            >12    >12    >12    >12    >12    >12    >12     12     12     11     11     10      9

Table 1. JND Cross visual test results: bits required for no visible boxes.

Due to the extreme sensitivity of the JND Cross pattern, some additional images (shown in figure 8) were run through the same test framework. Though some of these images are still difficult, they are more representative of normal use cases for video than an extreme scenario like the JND Cross. All but “Dark Ramp” and “Plane Hangar” were captured at 16 bits per color through HDR still image techniques to prevent conflicting source quantization artifacts.

Figure 8. More typical video images.

Image name          PQ 1K   PQ 10K   Rec1886 1K   Rec1886 2K
Dark Ramp              10       10          >12          >12
Black Controllers       9        9           11           11
Black Feathers          8        9           10           11
Charcoal                8        8           10           10
Glacier                 7        8            7            7
White Feathers          7        8            7            7
White Paper             8        9            8            9
Plane Hangar            9        9            9            9

Table 2. Images visual test results: bits required for no visible banding/artifacts.

The test results show that the PQ curve appears to be very close to its goal of perceptual uniformity, straddling the 10/11 bit threshold across the range of levels tested with the JND Cross. With PQ, none of the tested images required more than 10 bits. In contrast, the gamma systems struggled with the test content. At least 12 bits were required for most of the JND Cross patterns, and even 12 bits were not sufficient for any of the patterns at or below 0.05 cd/m2, nor for the “Dark Ramp” image. Though the ITU-R Rec. BT.1886 functions performed better on some of the bright images, they required very different bit depths depending on scene brightness (consistent with the plots in figure 5). On the darker images the PQ curve shows a clear 1 or 2 bit advantage over Rec1886, and if the Rec1886 peak level were pushed higher, these margins would continue to grow.

The bit-depth experiments support the PQ approach: of the curves tested, an EOTF based on the PQ curve provides the best quantization performance and satisfies the criterion of minimum visibility. Though the 1K PQ curve had slightly more precision than the 10K PQ curve, the visible differences were very small given the drastic difference in peak output. Difficult images such as “Dark Ramp”, “Black Controllers”, and “Plane Hangar” required 9 or 10 bits from either PQ system, so the 10K PQ seems to be the better option overall, considering the balance of quantization performance and dynamic range capability.


Conclusion

It is clear that current high performance displays, with higher brightness, higher dynamic range, wider color gamut, and lower noise levels, are beginning to show the weaknesses of traditional display response functions, and that a new type of EOTF is required to allow these displays to reach their full potential. It is also clear that this new EOTF must satisfy the criterion of minimum visibility of steps using 10 or 12 bits over a wide brightness range (wider than available in current display devices, in order to allow for future developments in display technology). The conventional gamma based curve, which came from the no longer employed Cathode Ray Tube (CRT), does not meet the requirements; experiments show that stepping is visible even at 12 bits. An EOTF based on perception can easily meet the requirements. Even over an extreme brightness range (up to 10,000 cd/m2), visible stepping is not apparent on test patterns at 11 bits, or on the images tested down to 9 or 10 bits.

References

1. International Telecommunications Union Radiocommunication Sector, Recommendation ITU-R BT.1886, “Reference electro-optical transfer function for flat panel displays used in HDTV studio production”, Mar. 2011.
2. Society of Motion Picture and Television Engineers, Recommended Practice RP 431-2-2011, “D-Cinema Quality – Reference Projector and Environment”, May 2011.
3. International Telecommunications Union Radiocommunication Sector, Report ITU-R BT.2246, “The present state of ultra high definition television”, Oct. 2011.
4. P. G. J. Barten, Contrast Sensitivity of the Human Eye and its Effects on Image Quality, SPIE Optical Engineering Press: Bellingham, WA, 1999.
5. P. G. J. Barten, “Formula for the contrast sensitivity of the human eye”, Proc. SPIE-IS&T Vol. 5294:231-238, Jan. 2004.
6. NEMA Standards Publication PS 3.14-2008, Digital Imaging and Communications in Medicine (DICOM), Part 14: Grayscale Standard Display Function, National Electrical Manufacturers Association, 2008.
7. M. Cowan, G. Kennel, T. Maier, and B. Walker, “Contrast Sensitivity Experiment to Determine the Bit Depth for Digital Cinema”, SMPTE Mot. Imag. J., 113:281-292, Sept. 2004.
8. P. Whittle, “Increments and decrements: luminance discrimination”, Vis Res. V 26, #10, 1677-1691, 1986.
9. K. I. Naka, W. A. H. Rushton, “S-potentials from luminosity units in the retina of fish (Cyprinidae)”, J. Physiol. 185: 587-599, Jan. 1966.
