Gaze-Contingent Visual Communication

GAZE-CONTINGENT VISUAL COMMUNICATION

A Dissertation

by

ANDREW TED DUCHOWSKI

Submitted to the Office of Graduate Studies of Texas A&M University
in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

August 1997

Major Subject: Computer Science

Approved as to style and content by:

  Bruce H. McCormick (Chair of Committee)
  Udo W. Pooch (Member)
  John J. Leggett (Member)
  Norman C. Griswold (Member)
  Wayne L. Shebilske (Member)
  Richard A. Volz (Head of Department)

ABSTRACT

Gaze-Contingent Visual Communication. (August 1997)
Andrew Ted Duchowski, B.Sc., Simon Fraser University
Chair of Advisory Committee: Dr. Bruce H. McCormick

Virtual environments today lack realism. Real-time display of visually rich scenery is encumbered by the demand of rendering excessive amounts of information. This problem is especially severe in virtual reality. To minimize refresh latency, image quality is often sacrificed for speed. What is missing is knowledge of the participant's locus of visual attention, and an understanding of how an individual scans the visual field. Neurophysiological and psychophysical literature on the human visual system suggests the field of view is inspected minutatim through brief fixations over small regions of interest. Significant savings in scene processing can be realized if fine detail information is presented "just in time" in a gaze-contingent manner, delivering only as much information as required by the viewer. An attentive model of vision is proposed where Volumes Of Interest (VOIs) represent fixations through time. The visual scanpath, composed of raw two-dimensional point of regard (POR) data, is analyzed over a sequence of video frames in time.
Fixation locations are predicted by a piecewise auto-regressive integrated moving average (PARIMA) time series model of eye movements. PARIMA model parameters are derived from established spatio-temporal characteristics of eye movements. POR data is fitted to the PARIMA model through the application of the three-dimensional wavelet transform. Identified fixations are assembled into volumes in three-dimensional space-time, delineating dynamic foveal attention. The attentive visual model utilizes VOIs to synthesize video sequences matching human visual acuity. Specifically, spatial resolution drops off smoothly with the degree of eccentricity from the viewer's point of gaze. Seamless degradation of individual video frames is accomplished through inhomogeneous wavelet reconstruction where the intersections of VOIs and frames constitute expected foveal regions. Peripheral degradation of video is evaluated through human subjective quality testing in a gaze-contingent environment. The proposed method of visual representation is applicable to systems with inherently intensive display requirements, including teleoperator and virtual environments.

DEDICATION

To my wife and family.

ACKNOWLEDGMENTS

I would like to thank Professor Bruce H. McCormick, my advisor and mentor, for his masterful guidance. His strict adherence to the scientific method inspired me to strive for a high level of rigor throughout the course of my investigation. The ultimate reward in maintaining this discipline is my personal satisfaction and confidence in the completion of this work. I will never forget Professor McCormick's insights and poignant prose. I am forever indebted and deeply grateful.

I would like to extend thanks to my advisory committee members for their contribution and patience in the preparation of this dissertation. I thank Professor Wayne Shebilske from the Psychology Department for his invaluable help in setting up the eye tracking apparatus.
His forewarning of potential difficulties in dealing with human subjects was right on the mark, and I am grateful for his suggestions. I would also like to thank Dr. Shebilske for deepening my understanding of the theoretical aspects of human perception and performance, and above all for his encouragement.

I would like to thank Professor Don House of the Visualization Department for putting together the summer workshop on wavelets. This workshop, together with Dr. House's enthusiasm, prompted me to learn this mathematical concept, which forms the analytical foundation of the work found herein. Without Dr. House's interest and support I might never have braved the initial learning curve.

I would like to express my appreciation to Dr. Rick Jacoby from NASA Ames Research Center for his contribution to the development of the gaze-contingent video system. Rick's help with the implementation of shared memory prompted me to design the eye tracking system without which none of the human subject experiments would have been possible. I would like to extend my warm thanks to Drs. Nancy Amato and Mac Lively for their help and guidance in the preparation of my defense. I would also like to express my thanks to the support staff of the Computer Science Department at Texas A&M. They are the silent support force who enabled me to put this work together.

I thank my family for seeing me through graduate school. Their moral (and of course financial) support carried me through. Finally, I owe my greatest debt of gratitude to my wife, Corey. Thanks for your patience, understanding, and above all your love.

This research was supported in part by the National Science Foundation, under Infrastructure Grant CDA-9115123 and CISE Research Instrumentation Grant CDA-9422123, and by the Texas Advanced Technology Program under Grant 999903-124.

TABLE OF CONTENTS

ABSTRACT . iii
DEDICATION . iv
ACKNOWLEDGMENTS . v
TABLE OF CONTENTS . vi
LIST OF FIGURES . xi
LIST OF TABLES . xv

CHAPTER

I INTRODUCTION . 1
  1.1 Research Objective . 1
  1.2 Specific Aims . 3
  1.3 Dissertation Organization . 3

II VISUAL ATTENTION . 5
  2.1 Chronological Review of Visual Attention . 5
  2.2 Visual Search . 9
  2.3 Scene Integration . 10
  2.4 Summary . 10

III NEUROPHYSIOLOGY . 12
  3.1 The Brain and the Visual Pathways . 12
  3.2 Physiology of the Eye . 17
  3.3 Implications for Attentional Visual Display Design . 26

IV EYE MOVEMENTS . 28
  4.1 Eye Trackers . 28
  4.2 The Oculomotor System . 30
  4.3 Taxonomy and Models of Eye Movements . 31
  4.4 Implications for Eye Movement Analysis . 36
  4.5 Implications for Pre-Attentional Visual Display Design . 37

V INTRODUCTION TO WAVELETS . 39
  5.1 Fundamentals . 39
  5.2 Wavelet Functions . 49
  5.3 Wavelet Maxima and Multiscale Edges . 52
  5.4 Multiresolution Analysis . 55
  5.5 Wavelet Decomposition and Reconstruction . 57
  5.6 Wavelet Filters . 62
  5.7 Discrete Wavelet Transform . 69
  5.8 Multidimensional Multiscale Edge Detection . 85
  5.9 Anisotropic Multidimensional Discrete Wavelet Transform . 91
  5.10 Wavelet Interpolation . 93

VI TIME SERIES ANALYSIS . 101
  6.1 Fundamentals . 101
  6.2 Nondeterministic (Stochastic) Time Series Models . 107
  6.3 Stochastic Process Sample Statistics . 118
  6.4 Stationary Time Series Modeling . 120
  6.5 Non-stationary (Linear) Time Series Modeling . 122
  6.6 Interrupted Time Series Experiments . 124
  6.7 Piecewise Autoregressive Integrated Moving Average Time Series . 124

VII EYE MOVEMENT MODELING . 133
  7.1 Linear Filtering Approach to Eye Movement Classification . 133
  7.2 Conceptual Specification of the PARIMA Model . 136
  7.3 Implementation Recommendations . 142
  7.4 Three-dimensional Considerations in the Frame-Based Implementation . 144
  7.5 Automatic Algorithm Specification . 147
  7.6 Limitations of the Frame-Based PARIMA Implementation . 150
  7.7 Summary . 153

VIII VOLUMES OF INTEREST . 154
  8.1 Synthesis of Volumes Of Interest . 156
  8.2 Graphical VOI Construction . 157
  8.3 Comparison of Two- and Three-dimensional Eye Movement Visualizations . 159
  8.4 Aggregate Volumes Of Interest . 161

IX GAZE-CONTINGENT VISUAL COMMUNICATION . 165
  9.1 Background . 165
  9.2 Resolution Mapping . 168
  9.3 Multiple ROI Image Segmentation . 173
  9.4 Multiple ROI Image Reconstruction Examples . 174

X EXPERIMENTAL METHOD AND APPARATUS . 178
  10.1 Hardware . 178
  10.2 Software . 182
  10.3 Calibration Procedures . 186
  10.4 Eye Tracker-Image Coordinate Space Mapping Transformation . 188

XI EXPERIMENT 1: EYE MOVEMENT MODELING . 196
  11.1 Video Sequences . 196
  11.2 Experimental Trials . 197
  11.3 Subjects . 198
  11.4 Experimental Design . 198
  11.5 Results . 199
  11.6 Discussion . 204

XII EXPERIMENT 2: GAZE-CONTINGENT VOI DETECTION . 207
  12.1 Video Sequences . 207
  12.2 Experimental Trials . 208
  12.3 Subjects . …
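The abstract's central idea — spatial resolution falling off smoothly with eccentricity from the point of gaze — can be illustrated with a minimal sketch. This is not the dissertation's wavelet-based method; it only converts a pixel offset from the gaze point into visual eccentricity and applies a generic hyperbolic acuity falloff. The function names and the half-resolution eccentricity `e2` are assumptions for illustration, not values from the text.

```python
import math

def eccentricity_deg(px_dist, px_per_cm, viewing_dist_cm):
    """Visual eccentricity (degrees) of a point px_dist pixels
    from the gaze point, given screen density and viewing distance."""
    return math.degrees(math.atan2(px_dist / px_per_cm, viewing_dist_cm))

def resolution_scale(ecc_deg, e2=2.3):
    """Relative spatial resolution at a given eccentricity.

    Uses a simple hyperbolic falloff 1 / (1 + ecc/e2); e2 (the
    eccentricity at which resolution halves) is an assumed
    parameter, not one taken from this work."""
    return 1.0 / (1.0 + ecc_deg / e2)

# Example: ~96 dpi display (37.8 px/cm) viewed from 60 cm.
ecc = eccentricity_deg(300, 37.8, 60.0)   # point 300 px from gaze
scale = resolution_scale(ecc)             # fraction of full resolution
```

Under these assumptions, resolution is full (1.0) at the gaze point and decays smoothly into the periphery, which is the qualitative behavior the gaze-contingent display is designed to exploit.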
