Depth Perception in Computer Graphics
UCAM-CL-TR-546
Technical Report
ISSN 1476-2986
Number 546

Computer Laboratory

Depth perception in computer graphics
Jonathan David Pfautz
September 2002

15 JJ Thomson Avenue
Cambridge CB3 0FD
United Kingdom
phone +44 1223 763500
http://www.cl.cam.ac.uk/

© 2002 Jonathan David Pfautz

This technical report is based on a dissertation submitted May 2000 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity College.

Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/TechReports/

Series editor: Markus Kuhn

ABSTRACT

With advances in computing and visual display technology, the interface between man and machine has become increasingly complex. The usability of a modern interactive system depends on the design of the visual display. This dissertation aims to improve the design process by examining the relationship between human perception of depth and three-dimensional computer-generated imagery (3D CGI).

Depth is perceived when the human visual system combines various different sources of information about a scene. In Computer Graphics, linear perspective is a common depth cue, and systems utilising binocular disparity cues are of increasing interest. When these cues are inaccurately and inconsistently presented, the effectiveness of a display will be limited.

Images generated with computers are sampled, meaning they are discrete in both time and space. This thesis describes the sampling artefacts that occur in 3D CGI and their effects on the perception of depth. Traditionally, sampling artefacts are treated as a Signal Processing problem. The approach here is to evaluate artefacts using Human Factors and Ergonomics methodology; sampling artefacts are assessed via performance on relevant visual tasks.
A series of formal and informal experiments were performed on human subjects to evaluate the effects of spatial and temporal sampling on the presentation of depth in CGI. In static images with perspective information, the relative size of an object can be inconsistently presented across depth. This inconsistency prevented subjects from making accurate relative depth judgements. In moving images, these distortions were most visible when the object was moving slowly, pixel size was large, the object was located close to the line of sight and/or the object was located a large virtual distance from the viewer.

When stereo images are presented with perspective cues, the sampling artefacts found in each cue interact. Inconsistencies in both size and disparity can occur as the result of spatial and temporal sampling. As a result, disparity can vary inconsistently across an object. Subjects judged relative depth less accurately when these inconsistencies were present. An experiment demonstrated that stereo cues dominated in conflict situations for static images. In moving imagery, the number of samples in stereo cues is limited. Perspective information dominated the perception of depth for unambiguous (i.e., constant in direction and velocity) movement.

Based on the experimental results, a novel method was developed that ensures the size, shape and disparity of an object are consistent as it moves in depth. This algorithm manipulates the edges of an object (at the expense of positional accuracy) to enforce consistent size, shape and disparity. In a time-to-contact task using only stereo and perspective depth cues, velocity was judged more accurately using this method. A second method manipulated the location and orientation of the viewpoint to maximise the number of samples of perspective and stereo depth in a scene. This algorithm was tested in a simulated air traffic control task.
The experiment demonstrated that knowledge about where the viewpoint is located dominates any benefit gained in reducing sampling artefacts. This dissertation provides valuable information for the visual display designer in the form of task-specific experimental results and computationally inexpensive methods for reducing the effects of sampling.

PREFACE

This dissertation is the result of my own work and includes nothing that is the outcome of work done in collaboration. I hereby declare that this dissertation is not substantially the same as any other I have submitted for a degree or a diploma or other qualification at any other University. I further state that no part of this dissertation has already been or is being concurrently submitted for any such degree, diploma or other qualification. This dissertation does not exceed sixty thousand words, including tables, footnotes and bibliography.

PUBLICATIONS

Sections of this work have been published previously [Pfautz & Robinson 1999].

TRADEMARKS

All trademarks contained in this dissertation are hereby acknowledged.

Jonathan D. Pfautz

ACKNOWLEDGEMENTS

Financial support for this work was generously provided by Trinity College, the Cambridge University Computer Laboratory, the Cambridge Overseas Trust, the Overseas Research Student scheme offered by the Committee of Vice-Chancellors and Principals of U.K. Universities and my parents, Glenn and Virginia Pfautz.

I would like to express my thanks to my supervisor, Peter Robinson, Neil Dodgson, the members of the Rainbow Research Group and the many others who have contributed to this dissertation.

GLOSSARY OF ABBREVIATIONS

Numbers in brackets indicate the chapters or experiments in which the abbreviation appears.
ANOVA   Analysis of Variance (A, B, C, D, E)
BDT     Binocular Disparity Threshold (3, 7, E, F)
CASD    Cambridge Autostereo Display (3)
CFF     Critical Fusion Frequency (3)
CGI     Computer-Generated Imagery (1, 2, 3, 4, 5, 6, 7, 8, A, B, C, D, E, F)
CRT     Cathode Ray Tube (3, 4, 5, 6, 7, D, E, F)
FOV     Field-of-View (1, 3, 5, 6, A, C, F)
GFOV    Geometric Field-of-View (3)
HDTV    High-Definition Television (3)
HMD     Head-Mounted Display (3)
HVS     Human Visual System (1, 2, 3, 4, 5, 7)
IOD     Inter-Ocular Distance (3, 7)
JND     Just Noticeable Difference (5)
LCD     Liquid Crystal Display (3, 4, 5, 6, A)
SD      Standard Deviation (B, C, D, E, F)
TTC     Time to Contact (6, 7, 8, E)
VDS     Visual Display System (1, 2, 3, 4, 5, 6, 7, 8, A)
VE      Virtual Environment (1, 3)

CONVENTIONS

This thesis adopts some conventions for clarity:

• r[x] denotes the nearest integer function or round of a number, x.
• ⌊x⌋ denotes the floor of a number, x.
• n[V] denotes the Euclidean norm of a vector, V.
• x ∈ (n, m) denotes the interval n < x < m; x ∈ [n, m] denotes the interval n ≤ x ≤ m.
• Small visual angles will be expressed in minutes, m, and seconds, s: m′s″
• Results of analyses of variance will be presented as follows: F(m,n) = j, p < 0.01, where m is the degrees of freedom of the independent variable being analysed, n is the residual degrees of freedom, j is the F-value and p is the significance level. Significance levels of p < 0.01 will be reported. Graphs reporting statistical data will have error bars representing a 95% confidence interval.
• In this thesis, we present many graphs that show the effects of sampling for typical viewing parameters and display characteristics. For brevity, we will not explicitly state the values of the parameters used. Generally, they are chosen to demonstrate common trends and behaviours.
• Data for experiments A–F can be found at: http://mit.edu/jpfautz/www/phd/

TABLE OF CONTENTS

Chapter 1: Introduction
    1.1 Depth in Computer Graphics
    1.2 Applications of 3D CGI
    1.3 Displaying Digital Imagery
    1.4 Methodology
    1.5 Aims
    1.6 Layout of Dissertation

Chapter 2: Human Depth Perception
    2.1 Pictorial Depth Cues
    2.2 Oculomotor Depth Cues
    2.3 Binocular Depth Perception
    2.4 Depth from Motion
    2.5 Combination and Application of Depth Cues
    2.6 Depth Acuity
    2.7 Conclusions