
BACHELOR THESIS

Martin Safko

Implementation of a tone mapping operator for scotopic viewing conditions

Department of Software and Computer Science Education

Supervisor of the bachelor thesis: doc. Alexander Wilkie, Dr.

Study programme: Computer Science

Study branch: Programming and Software Systems

Prague 2019

I declare that I carried out this bachelor thesis independently, and only with the cited sources, literature and other professional sources.

I understand that my work relates to the rights and obligations under the Act No. 121/2000 Sb., the Copyright Act, as amended, in particular the fact that the Charles University has the right to conclude a license agreement on the use of this work as a school work pursuant to Section 60 subsection 1 of the Copyright Act.

In ...... date ...... signature of the author

I would like to thank my parents for supporting me during my studies.

Title: Implementation of a tone mapping operator for scotopic viewing conditions

Author: Martin Safko

Department: Department of Software and Computer Science Education

Supervisor: doc. Alexander Wilkie, Dr., Department of Software and Computer Science Education

Abstract: Creating night-time images and movies that look plausible has been a problem in the industry since the creation of the camera. To capture an image, we need enough light to create a measurable quantity on the camera sensor. For this reason, shooting at night was not possible until sufficiently sensitive sensors were developed, and even then the captured images do not look realistic. The movie industry circumvents these issues by manually color correcting the footage in post-production. We implement an algorithm presented in a 2011 SIGGRAPH paper that solves this problem in a psycho-physically plausible and consistent way for spectral images, and we augment it with a technique taken from a paper by INRIA.

Keywords: human visual system, tone mapping, low light conditions

Contents

Introduction
    Motivation
    Goals

1 Human Visual System
    1.1 Eye
    1.2 Trichromacy
    1.3 Opponency model
    1.4 Rods
    1.5 Luminous efficiency function
    1.6 Visual acuity

2 Implementation
    2.1 Overview
    2.2 Algorithm
        2.2.1 Bilateral filter
    2.3 ART

3 Results

Conclusion

Bibliography

List of Figures

A Attachments

Introduction

In low-light environments, humans perceive color differently than in daylight conditions. However, devices and software that try to capture reality as we see it, for example cameras or photo-realistic renderers, are optimized for daylight use and produce unnatural-looking images in dark environments. This thesis aims to implement a tone mapping operator that produces images perceptually similar to the real-life experience. The algorithm is based on a 2011 SIGGRAPH paper [1].

Motivation

Creating plausible night-time images or movie scenes is not easy. Night shooting requires long exposures, which are suitable only for static scenes. Another possibility is to use more sensitive equipment. Nevertheless, both of these techniques are either impractical or require very expensive equipment. The movie industry has come up with a process known as day for night to overcome these issues. The idea is to shoot well-lit scenes and later color grade the footage in post-production. Although the result looks good, it requires manual control and is typically not perceptually correct.

From the scientific point of view, simulating human vision under different conditions can also help us better understand the human visual system, especially the perception of colors. By comparing the result of an experiment to a simulation, we can build a model that gives us insight into the inner workings of our brain.

Goals

The goal of this thesis is to implement a psycho-physically plausible tone mapping operator for low light spectral images and use it in an existing spectral renderer. We would also like to simulate the accompanying loss of spatial resolution.

1. Human Visual System

In this chapter we describe the basic theory of vision and general colorimetry terminology. Out of all human senses, vision is the most complex one. Figure 1.1 shows the abstract path from light to color sensation. Light entering our eye is converted to a nerve signal, which travels all the way to the back of the brain, where it is processed and perceived as a color sensation. We can study the physics of light and the biology and chemistry of the eye in great detail. Unfortunately, the human brain is still mostly a black box: we do not yet have a complete understanding of its neural pathways in the same way as we do with electrical circuits. Another thing to note is that no two people are the same, so everyone's vision is a little different. In this text we focus on the visual system of an average person with no defects.

Figure 1.1: Abstract path from light to color perception

1.1 Eye

A simplified diagram of an eye is shown in Figure 1.2. Light entering the eye is focused by the lens onto the retina, a light-sensitive layer located at the back of the eye. It is covered by two types of specialized cells, rods and cones. The distribution of rods and cones across the retina is not uniform but follows the curves in Figure 1.3. Most of the cones are located at a spot called the fovea, which is where our sharp vision is established. The other special point on the retina is the blind spot. It is the place where the optic nerve connects to the eye, so no rods or cones reside there.

Figure 1.2: Diagram of an eye

1.2 Trichromacy

There are three types of cone cells: L (long), M (medium) and S (short). Each one is sensitive to a different part of the visible spectrum. Figure 1.4 shows the spectral sensitivity function of each cone type. Sometimes they are also referred to as red, green and blue cones, after the part of the visible spectrum they occupy. Interestingly, the L and M types have very similar spectral responses, differing only slightly at longer wavelengths. Given the spectrum entering the eye, we can calculate the response of each cone type by integrating the product of the spectral power distribution and the response function, as shown in Equation 1.1:

\[
L = \int_{\lambda} R(\lambda)\, l(\lambda)\, d\lambda \qquad
M = \int_{\lambda} R(\lambda)\, m(\lambda)\, d\lambda \qquad
S = \int_{\lambda} R(\lambda)\, s(\lambda)\, d\lambda \tag{1.1}
\]

where $l$, $m$, $s$ are the response functions, $R$ is the spectrum of the light, and we integrate over the visible wavelengths.

Human eyes have only three cone types, which means the incoming spectrum is reduced to three quantities. As a result, different spectra can produce the same color response. In other words, two colored objects could look the same under a table lamp but completely different under sunlight. This phenomenon is called metameric failure [2]. It is generally undesirable, which is why color quality testing needs to be done under specific illumination conditions. On the other hand, trichromacy is very useful: from storing colored images to manufacturing display devices, we only need three quantities to specify a color.

Figure 1.3: Cone and rod distribution around fovea

1.3 Opponency model

Trichromacy is not enough to explain all phenomena of color vision. One such example is color blindness. The most common form is red-green, which means such a person has difficulty distinguishing between those colors. Other forms include blue-yellow and total color blindness. Another example is the absence of bluish yellow or reddish green colors. Such observations led to the development of the opponency model in the 19th century. It states that there are three channels of opposing values: a light/dark, a red/green and a blue/yellow channel. We can imagine that instead of only a positive signal there is a negative and a positive one, analogous to electrical engineering, where 0V/5V signals are transformed into -5V/5V signals with 0V representing the neutral value. The opponency model and trichromacy are not mutually exclusive theories but rather complement each other: trichromacy states how the signals are created, and the opponency model specifies how those signals are transformed and sent to the brain. The diagram in Figure 1.5 shows how cone signals are combined. Although this is just a model, it explains many observed visual phenomena.

Figure 1.4: Normalized spectral sensitivity curves. From left to right: short, medium and long wavelength cone response functions. Note that even though the long wavelength cones are sometimes referred to as "red" cones, their integrated sensitivity corresponds to a greenish yellow color that is actually not very different from that of the medium wavelength cones.

Figure 1.5: Opponency model

Note that rods also contribute to the opponency model. The signal from rods is wired into the L+M channel, so no separate neural pathway is needed for night vision.

1.4 Rods

Similar to cones, rods also have a specific spectral response function. However, the difference between rods and cones is in the amount of light at which they are active. Humans use cones in bright light situations; we call this photopic vision. On the other hand, during the night we rely only on our rods, which we call scotopic vision. There is also a middle ground, mesopic vision, when both cones and rods contribute to visual perception. One thing to note is that we do not switch between scotopic and photopic vision instantly. Instead, it is a rather slow process in which our vision adapts to the current situation. Empirical data in [3] provide information on the amount of time the process takes. What is interesting is that going from scotopic to photopic vision (i.e. from darkness to sunlight) is relatively fast compared to the other way around, which can take several tens of minutes. Similarly to cones, rods are connected to the same pathways of the opponency model. Because of that, even though rods produce an achromatic signal, it can still affect our color vision.

1.5 Luminous efficiency function

The luminous efficiency function describes the perceived brightness of a given wavelength of light. Figure 1.6 shows that in photopic conditions we perceive the green part of the visible spectrum as the brightest. We can also see that our brightness sensitivity shifts toward the blue part of the spectrum in scotopic conditions. That is because a low amount of light hitting our eyes does not have enough power to activate the cones, so all of our visual perception is based on the signal coming from the rods. As a result, we are more sensitive to blue light and less sensitive to red light. This is called the Purkinje shift after Jan Evangelista Purkyně, who first observed the phenomenon.

Figure 1.6: Luminous efficiency function

1.6 Visual acuity

Visual acuity can be defined as the spatial resolution of the visual system. Many factors influence it, ranging from the optical properties of the eye to neural processes. We will concentrate on spatial resolution in relation to scotopic vision. Recall from Section 1.1 that rods are absent from the fovea, the region responsible for sharp vision. That means that when the illumination intensity decreases and we increasingly rely on rods for our vision, less information is sent from the fovea. Because of that and other factors [4], our vision is blurry at night.

2. Implementation

In this chapter we describe the algorithm used for creating psycho-physically plausible low light images. First we outline the mathematics, and later we describe the implementation in a spectral renderer.

2.1 Overview

The algorithm in this thesis is based on the SIGGRAPH 2011 paper by Kirk and O'Brien [1]. The paper is divided into three parts:

• Acquiring a spectrum from an RGB image

• Computing the photopic response most perceptually similar to the scotopic one

• Tone mapping the HDR image to an LDR image for viewing on a specific display device

The use of a spectral renderer allows us to compute the cone and rod responses directly, without the need to estimate a spectrum from an RGB image. This lets us skip the first part of the paper while giving us more accurate results. We also skip the last part: instead of tailoring the output image to a specific monitor, we use the general method of keeping the image data in device-independent coordinates. The image is then converted on output by an ICC profile [5] specific to a given display device.

Note that the algorithm presented below is not based on rigorous analysis but rather on empirical data and incomplete color science models. Consequently, the algorithm contains many constants with a wide range of possible values, and altering those values would slightly alter the results. Since comparing images with real scotopic vision is very subjective, we do not experiment with optimizing the constants and leave that as future work.

2.2 Algorithm

The idea is to take the human vision models described in Chapter 1, extended with the influence of rod signals, evaluate them in the scotopic setting, and then invert them in the photopic setting to obtain a response perceptually similar to scotopic viewing conditions. The algorithm can be described in the following steps:

1. compute the LMSR (long, medium, short, rod) response to the given spectrum

2. calculate the regulated signals

3. calculate the color shift in the opponent model

4. compute the photopic response most perceptually similar to the scotopic or mesopic response

5. convert the result into CIE XYZ

6. optionally blur the image using a bilateral filter to simulate acuity loss

We start with a spectral image that is ready to be processed. For every pixel of the input image we perform the steps above. Calculating the LMSR response $q$ is the same as described in Section 1.2:

\[
q_{long} = \int_{\Omega} l(\lambda)\, R(\lambda)\, d\lambda \qquad
q_{med} = \int_{\Omega} m(\lambda)\, R(\lambda)\, d\lambda
\]
\[
q_{short} = \int_{\Omega} s(\lambda)\, R(\lambda)\, d\lambda \qquad
q_{rod} = \int_{\Omega} r(\lambda)\, R(\lambda)\, d\lambda
\]
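In practice the integrals are evaluated numerically, since the renderer stores the spectrum as a finite set of samples. The following C sketch shows the idea under the assumption of uniformly spaced samples; the array layout and the function name are illustrative choices, not ART's actual interface.

```c
/* Sketch of step 1: Riemann-sum approximation of the LMSR integrals.
 * Assumes the spectrum R and the response curves l, m, s, r are sampled
 * at the same n wavelengths with uniform spacing d_lambda (in nm).
 * This representation is illustrative, not ART's actual one. */
typedef struct { double l, m, s, rod; } LMSR;

static LMSR lmsr_response(const double *R, const double *l,
                          const double *m, const double *s,
                          const double *r, int n, double d_lambda)
{
    LMSR q = { 0.0, 0.0, 0.0, 0.0 };
    for (int i = 0; i < n; ++i) {
        q.l   += l[i] * R[i] * d_lambda;   /* q_long  */
        q.m   += m[i] * R[i] * d_lambda;   /* q_med   */
        q.s   += s[i] * R[i] * d_lambda;   /* q_short */
        q.rod += r[i] * R[i] * d_lambda;   /* q_rod   */
    }
    return q;
}
```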

Our goal is to calculate the photopic cone response $\hat{q}$ most perceptually similar to the scotopic LMSR response $q$. We do this by calculating $\Delta q$ such that

\[
\hat{q} = [q_{long}, q_{med}, q_{short}]^T + \Delta q
\]

The next step is to compute the regulated signals, a consequence of rods and cones sharing neural pathways:

\[
g_{long} = \left(1 + 0.33\,(q_{long} + \kappa_1 q_{rod})\right)^{-1/2}
\]
\[
g_{med} = \left(1 + 0.33\,(q_{med} + \kappa_1 q_{rod})\right)^{-1/2}
\]
\[
g_{short} = \left(1 + 0.33\,(q_{short} + \kappa_2 q_{rod})\right)^{-1/2}
\]

After that we determine the amount of color shift in the opponent color model:

\[
\Delta o_{r/g} = x\,\kappa_1 \left(\rho_1 \frac{g_{med}}{m_{max}} - \rho_2 \frac{g_{long}}{l_{max}}\right) q_{rod}
\]
\[
\Delta o_{b/y} = y \left(\rho_3 \frac{g_{short}}{s_{max}} - \rho_4 \left(\alpha \frac{g_{long}}{l_{max}} + (1 - \alpha) \frac{g_{med}}{m_{max}}\right)\right) q_{rod}
\]
\[
\Delta o_{lum} = z \left(\alpha \frac{g_{long}}{l_{max}} + (1 - \alpha) \frac{g_{med}}{m_{max}}\right) q_{rod}
\]

We can rewrite the opponency model diagram in Figure 1.5 using a matrix.

\[
A = \begin{bmatrix} -1 & 1 & 0 \\ -1 & -1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\]

In other words, opponency values are a linear combination of cone signals. We can write this as $o = A\hat{q}$.

Using $A^{-1}$ we can recover the trichromatic values from the opponency model:

\[
\hat{q} = [q_{long}, q_{med}, q_{short}]^T + A^{-1} \Delta o
\]
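Since $A$ is constant, its inverse can be precomputed once ($\det A = 2$, so the entries are simple rationals), leaving a single 3x3 multiply per pixel. A minimal C sketch of step 4; the names are illustrative:

```c
/* Sketch of step 4: map the opponent shift back to cone space and add
 * it to the photopic cone response.  A_inv is the inverse of
 * A = [-1 1 0; -1 -1 1; 1 1 0], computed by hand (det A = 2). */
static const double A_inv[3][3] = {
    { -0.5, 0.0, 0.5 },
    {  0.5, 0.0, 0.5 },
    {  0.0, 1.0, 1.0 },
};

static void apply_shift(const double q[3],       /* q_long, q_med, q_short */
                        const double delta_o[3], /* r/g, b/y, luminance    */
                        double q_hat[3])
{
    for (int i = 0; i < 3; ++i) {
        double delta_q = 0.0;
        for (int j = 0; j < 3; ++j)
            delta_q += A_inv[i][j] * delta_o[j];
        q_hat[i] = q[i] + delta_q;   /* q_hat = q + A^-1 * delta_o */
    }
}
```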

The final step is to convert from LMS to CIE XYZ to make it easier to work with the image later on. We use the standard conversion matrix defined in CIECAM02, taken from [6]. For all the constants in the above equations we used the values below; for more details see [1].

Constants used:

\[
\begin{aligned}
&\rho_1 = 1.111 \quad && l_{max} = 0.637 \quad && x = 15 \quad && \kappa_1 = 0.25 \\
&\rho_2 = 0.939 \quad && m_{max} = 0.392 \quad && y = 15 \quad && \kappa_2 = 0.40 \\
&\rho_3 = 0.400 \quad && s_{max} = 1.606 \quad && z = 5  \quad && \alpha = 0.619 \\
&\rho_4 = 0.150
\end{aligned}
\]
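Step 5 is then another fixed 3x3 matrix multiply. A sketch, assuming the CAT02 matrix of CIECAM02 [6] is used for the LMS to XYZ relation; the rounded inverse entries below are our assumption taken from common colorimetry references and should be checked against [6] rather than treated as ART's actual values.

```c
/* Sketch of step 5: convert the corrected LMS response to CIE XYZ.
 * CIECAM02 defines the XYZ -> LMS (CAT02) matrix; here we apply its
 * inverse.  The rounded values are an assumption from common
 * colorimetry references [6], not copied from ART's source. */
static const double cat02_inv[3][3] = {
    {  1.096124, -0.278869, 0.182745 },
    {  0.454369,  0.473533, 0.072098 },
    { -0.009628, -0.005698, 1.015326 },
};

static void lms_to_xyz(const double lms[3], double xyz[3])
{
    for (int i = 0; i < 3; ++i)
        xyz[i] = cat02_inv[i][0] * lms[0]
               + cat02_inv[i][1] * lms[1]
               + cat02_inv[i][2] * lms[2];
}
```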

2.2.1 Bilateral filter

An optional last step is to blur the image. We do this to simulate the acuity loss in low light conditions. We use a bilateral filter [7] because even though we lack spatial detail in such conditions, edges remain sharp. This idea was inspired by an INRIA paper [8] on a similar topic. A bilateral filter works by blurring the image spatially, but only across areas with similar color information; in other words, it avoids averaging across edges. To determine whether two pixel color values are similar, we convert them to CIE LAB and use the standard Euclidean distance. Although the bilateral filter is typically used for noise reduction, it can be found in a wider range of applications, as this thesis shows.

2.3 ART

The algorithm described above was implemented in ART¹. ART (Advanced Rendering Toolkit) is a spectral rendering system with an emphasis on modularity rather than speed, which makes it a good choice for academic research and experimentation. ART encapsulates the entire image creation process, starting from modeling a scene, through path tracing, to finally tone mapping the output image. What makes it special is that it is capable of handling spectral information along each step of the pipeline. We exploit the fact that by the time we reach the tone mapping step, a spectral image is still available to us.

The programming languages used also play a role in the design. The core of ART is written in ANSI C, while the high-level classes are implemented in Objective-C. This gives the best of both worlds: on one hand an efficient and fast core, on the other hand powerful object-oriented features for fast development.

We extended the tonemap part of ART. Tonemap works by first parsing the command line arguments, from which an action sequence is built. An action sequence is a chain of ArnAction-derived object instances. Each action takes an input from the image stack, performs its operation and places the output back on the stack. This is one of the things that make ART so modular and easy to extend. We added a --scotopic option that selects the algorithm above when converting a spectral image. This option replaces the default ArtRAW converter with our custom one, followed by a bilateral filter, in the action sequence.

¹ https://cgg.mff.cuni.cz/ART/
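The action-stack idea can be illustrated with a small sketch. All names and types below are invented for illustration and do not match ART's actual Objective-C classes:

```c
/* Illustration of the action-sequence idea: each action pops its input
 * from an image stack and pushes its output back.  These names are
 * invented for this sketch; they are not ART's ArnAction API. */
typedef struct Image Image;   /* opaque image handle */

typedef struct Action {
    const char *name;
    Image *(*run)(Image *input);   /* pop input, return output */
} Action;

static Image *run_sequence(const Action *actions, int n, Image *initial)
{
    Image *top = initial;          /* one-deep "stack" for the sketch */
    for (int i = 0; i < n; ++i)
        top = actions[i].run(top); /* each action replaces the stack top */
    return top;
}

/* With --scotopic, the sequence conceptually becomes
 *   { scotopic spectral converter, bilateral filter, output }
 * instead of the default ArtRAW conversion. */
```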

3. Results

Here we present the results of the implemented algorithm. We used the CornellBox, MacbethChart, ImageData and VillaRotonda scenes available in the Gallery folder of the ART repository. Figure 3.1 shows the results of tone mapping with different options specified. Images 3.1a and 3.1d are the default images with no color transformation applied. In the middle, images 3.1b and 3.1e are outputs from the implemented algorithm. These represent the photopic responses most perceptually similar to what a human would see in dark-adapted conditions. You can see the Purkinje shift, i.e. the blue shift in colors discussed in Section 1.5. We used scenes with high luminance levels to showcase the color shift; therefore, we decreased the exposure in images 3.1c and 3.1f to replicate the illumination intensity of scotopic conditions and get more realistic results.

Finally, we show the effects of the bilateral filter in Figure 3.2. Without the filter, the images are perfectly sharp. With the bilateral filter switched on, we see a great reduction in spatial resolution while the edges stay sharp. In image 3.2b we can clearly see how Gandalf's face and beard become blurred while the edge of his hat remains sharp. Image 3.2d shows the same attributes: although the statues and ornamental details are just blurry blobs, the structure as a whole remains well defined, with very little blurring between the building and the background. A positive side effect of using the bilateral filter is the removal of noise left by the renderer.

(a) Photopic (b) Scotopic (c) Adjusted exposure

(d) Photopic (e) Scotopic (f) Adjusted exposure

Figure 3.1: Results

(a) No filter (b) Bilateral filter

(c) No filter (d) Bilateral filter

Figure 3.2: Bilateral filter results

Conclusion

We have achieved the goal stated at the start by successfully implementing a psycho-physically plausible tone mapping operator for low light spectral images as part of ART, the academic spectral rendering system. The implementation of the algorithm presented in the SIGGRAPH 2011 paper [1] is augmented with a bilateral filter to simulate the acuity loss in scotopic viewing conditions.

We have reviewed the relevant color science theory and used it to describe the principles of the implemented algorithm. The presented results correctly reproduce observed visual phenomena. However, no rigorous experiment was performed to measure how closely the results correspond to real human scotopic vision.

Bibliography

[1] Adam G. Kirk and James F. O'Brien. Perceptually based tone mapping for low-light conditions. ACM Transactions on Graphics, 30(4):42:1–10, July 2011. Proceedings of ACM SIGGRAPH 2011, Vancouver, BC, Canada.

[2] Wikipedia contributors. Metamerism (color) — Wikipedia, the free encyclopedia, 2019. https://en.wikipedia.org/w/index.php?title=Metamerism_(color)&oldid=885559956 [Online; accessed 19-July-2019].

[3] Michael Pianta and Michael Kalloniatis. Characterisation of dark adaptation in human cone pathways: An application of the equivalent background hypothesis. The Journal of Physiology, 528:591–608, 08 2004.

[4] Wikipedia contributors. Visual acuity — Wikipedia, the free encyclopedia, 2019. https://en.wikipedia.org/w/index.php?title=Visual_acuity&oldid=906787520 [Online; accessed 21-July-2019].

[5] Wikipedia contributors. ICC profile — Wikipedia, the free encyclopedia, 2018. https://en.wikipedia.org/w/index.php?title=ICC_profile&oldid=876098208 [Online; accessed 19-July-2019].

[6] Wikipedia contributors. LMS color space — Wikipedia, the free encyclopedia, 2019. https://en.wikipedia.org/w/index.php?title=LMS_color_space&oldid=906114260 [Online; accessed 19-July-2019].

[7] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In Proceedings of the Sixth International Conference on Computer Vision, ICCV '98, pages 839–846, Washington, DC, USA, 1998. IEEE Computer Society.

[8] Ning Zhou, Weiming Dong, Jiaxin Wang, and Jean-Claude Paul. Simulating Human Visual Perception in Nighttime Illumination. Tsinghua Science & Technology, 14(1):133–138, 2009.

List of Figures

1.1 Abstract path from light to color perception
1.2 Diagram of an eye
1.3 Cone and rod distribution around fovea
1.4 Normalized spectral sensitivity curves
1.5 Opponency model
1.6 Luminous efficiency function

3.1 Results
3.2 Bilateral filter results

A. Attachments

The attachments of this thesis include the source code and uncompressed result images.
