DIRECT INTERACTION WITH LARGE DISPLAYS THROUGH MONOCULAR COMPUTER VISION

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the School of Information Technologies at The University of Sydney

Kelvin Cheng October 2008

© Copyright by Kelvin Cheng 2008 All Rights Reserved

Abstract

Large displays are everywhere, and have been shown to provide higher productivity gains and greater user satisfaction than traditional desktop monitors. The computer mouse remains the most common input tool for interacting with these larger displays, and much effort has been devoted to making this interaction more natural and intuitive for the user. The use of computer vision for this purpose has been well researched, as it gives users freedom and mobility and allows them to interact at a distance. Interaction that relies on monocular computer vision, however, has not been well researched, particularly when it is used to recover depth information. This thesis investigates the feasibility of using monocular computer vision to allow bare-hand interaction with large display systems from a distance. By taking into account the location of the user and the interaction area available, a dynamic virtual touchscreen can be estimated between the display and the user. In the process, theories and techniques that make interaction with a computer display as easy as pointing to real-world objects are explored. Studies were conducted to investigate the way humans naturally point at objects with their hands and to examine the inadequacies of existing pointing systems. Models that underpin the pointing strategies used in many previous interactive systems were formalized. A proof-of-concept prototype was built and evaluated in several user studies. The results suggest that it is possible to allow natural user interaction with large displays using low-cost monocular computer vision. Furthermore, the models developed and lessons learnt in this research can assist designers in developing more accurate and natural interactive systems that make use of humans' natural pointing behaviour.


Acknowledgements

I would like to thank my supervisor Masahiro Takatsuka for his continuing support and for being a fantastic supervisor throughout my degree.

I would also like to thank Peter Eades for his guidance and advice over the years, and to David Vronay for his supervision and for making possible the internship at Microsoft Research Asia.

I want to thank Frank Yu, Tony Huang, Adel Ahmed, Ben Yip, Calvin Cheung, Nelson Yim and Daniel Rachmat for their advice and discussions at various times during my PhD.

Thanks to Shea Goyette, Henry Molina, Abhaya Parthy, Howard Lau, Jacob Munkberg, Zianlong Zhou, and Sheila Cheng for proof-reading my thesis as well as the mathematics.

A big thank you to all the volunteers from my usability studies, particularly the few who participated in all of my experiments.

Thank you to my family for their support and constant encouragement.

My thanks to fellow students at ViSLAB and the Infovis/HUM group for providing laughter and entertainment, creating a relaxed and enjoyable working atmosphere.

Thanks to the various people at the workshop, dp, and the administrative and academic staff at the School of IT for helping me in various areas throughout my time at SIT, and especially for their help with my broken laptop. Thanks for the weekly supply of milk, and in particular for providing a fantastic espresso machine to keep me going during the day.

Finally, thank you to The University of Sydney, Faculty of Science, Capital Markets Cooperative Research Centre and Microsoft Research Asia for keeping me funded.


Table of Contents

Abstract ...... ii

Acknowledgements ...... iii

Table of Contents ...... v

List of Tables ...... x

List of Illustrations ...... xi

Chapter 1 Introduction ...... 1

1.1 Human-Computer Interaction ...... 1

1.2 Interaction with Large Displays ...... 2
1.2.1 Entertainment Systems ...... 4
1.2.2 Presentation Systems ...... 5

1.3 Alternate input techniques ...... 6

1.4 Pointing Strategies ...... 9

1.5 Aim of this Thesis ...... 9

1.6 Contribution of this Thesis ...... 11

1.7 Research Methodology ...... 12

1.8 Thesis Structure ...... 13

Chapter 2 Background and Previous Work ...... 15

2.1 Manipulation and Interaction ...... 15

2.2 Graphical User Interface ...... 16

2.3 Handheld Intrusive Devices...... 18

2.4 Non-intrusive Techniques ...... 24
2.4.1 Sensitive Surfaces ...... 24
2.4.2 Computer Vision ...... 25

2.5 Selection Strategies ...... 38


2.5.1 Mouse click-less interfaces ...... 38
2.5.2 Laser Pointer ...... 39
2.5.3 Hand gesture ...... 40

2.6 Summary ...... 42

Chapter 3 Case Study – Current Interactive Methods ...... 45

3.1 User Interface for the Media PC ...... 45

3.2 Related Work ...... 46
3.2.1 Input devices ...... 46
3.2.2 User Interface and Interaction Design ...... 48

3.3 Hypotheses...... 49

3.4 Usability Study ...... 49

3.5 Apparatus ...... 49

3.6 Participants ...... 50

3.7 Procedure ...... 51

3.8 Design ...... 51

3.9 Results ...... 52

3.10 Participants’ comments ...... 53

3.11 General Observations ...... 59
3.11.1 Questionnaires ...... 60

3.12 Discussion ...... 62

3.13 Recommendations ...... 63
3.13.1 Unified Selection ...... 64
3.13.2 Keyboard/Remote Equivalence ...... 64
3.13.3 Action Target / Remote Correspondence ...... 64
3.13.4 Software solutions ...... 65
3.13.5 User Experience ...... 65
3.13.6 Natural Interfaces ...... 66

3.14 Summary ...... 66

Chapter 4 Pointing Strategies ...... 68

4.1 Background and Related Work ...... 68

4.2 Experiment: Style of Pointing Gesture ...... 72
4.2.1 Participants ...... 72
4.2.2 Procedure ...... 72

4.2.3 Observations & Discussions ...... 73
4.2.4 Limitations ...... 77
4.2.5 Summary ...... 77

4.3 Experiment: Pointing Accuracy ...... 78
4.3.1 Participants ...... 79
4.3.2 Apparatus ...... 80
4.3.3 Design and Procedure ...... 81
4.3.4 Results and Discussion ...... 82
4.3.5 Summary ...... 86

4.4 Models for Targeting ...... 86
4.4.1 Geometrical Configuration ...... 86
4.4.2 The Point Model ...... 88
4.4.3 The Touch Model ...... 89

4.5 The dTouch (distant-Touch) Model ...... 91

4.6 Indirect Interaction ...... 94

4.7 Summary ...... 95

Chapter 5 Conceptual Design of Interaction Model ...... 98

5.1 Monocular Vision ...... 98

5.2 Design Overview ...... 99
5.2.1 The View Frustum ...... 100
5.2.2 Virtual Touchscreen ...... 102
5.2.3 Interaction Model ...... 104
5.2.4 Fingertip interaction ...... 105
5.2.5 Visual Feedback ...... 106
5.2.6 Limitations ...... 106
5.2.7 Summary ...... 106

Chapter 6 Monocular Positions Estimation ...... 108

6.1 Environment setup ...... 109

6.2 Geometric Constraints ...... 110

6.3 Step 1: Eye Location Estimation ...... 110
6.3.1 Implementation ...... 115
6.3.2 Experiment ...... 116
6.3.3 Results ...... 117
6.3.4 Face position in 3D ...... 119
6.3.5 Summary ...... 122


6.4 Step 2: Fingertip Location Estimation ...... 122
6.4.1 Fingertip detection ...... 126
6.4.2 Imaginary Point ...... 127
6.4.3 Line Equation ...... 129
6.4.4 Intersection with sphere ...... 129
6.4.5 Implementation ...... 131
6.4.6 Experiment ...... 132
6.4.7 Results ...... 134
6.4.8 Frontal Fingertip Detection ...... 135

6.5 Step 3: Resultant position estimation ...... 137
6.5.1 Simplified Model ...... 138
6.5.2 Parallel Plane Model ...... 141

6.6 Estimation error ...... 146

6.7 Summary ...... 153

Chapter 7 Experimental Evaluation ...... 154

7.1 General Experimental Setup ...... 155
7.1.1 The Task ...... 157
7.1.2 Statistical Treatment ...... 159

7.2 Experiment 1: The effect of feedback on dTouch and EyeToy targeting systems ...... 159
7.2.1 2D Targeting System ...... 159
7.2.2 The Study ...... 161
7.2.3 Experimental Design ...... 161
7.2.4 Participants ...... 162
7.2.5 Procedure ...... 163
7.2.6 Hypotheses ...... 163
7.2.7 Results and Discussion ...... 164
7.2.8 General Discussions ...... 175
7.2.9 Limitations ...... 175
7.2.10 Summary ...... 176

7.3 Experiment 2: Minimizing target size ...... 177
7.3.1 Participants ...... 177
7.3.2 Procedure and Design ...... 177
7.3.3 Results ...... 178

7.4 Experiment 3: The Effect of Calibration ...... 181
7.4.1 Participants ...... 182
7.4.2 Procedure ...... 182
7.4.3 Design ...... 183

7.4.4 Results ...... 183

7.5 Experiment 4: The Effect of User's Location ...... 185
7.5.1 Participants ...... 186
7.5.2 Procedure and Design ...... 186
7.5.3 Results ...... 188
7.5.4 Summary ...... 191

7.6 Overall Summary ...... 192

Chapter 8 Conclusion ...... 193

8.1 Summary ...... 193

8.2 Implications of this research ...... 195

8.3 Application Domains ...... 196

8.4 Future Research Direction ...... 199

8.5 Final Remark ...... 200

Bibliography ...... 202


List of Tables

Table 2.1: A summary of previous work (potentially problematic properties are highlighted in red) ...... 44
Table 3.1: Device Assessment Questionnaire ...... 61
Table 3.2: Device Comparison Questionnaire ...... 62
Table 4.1: Number of subjects using each pointing style in each task ...... 74
Table 4.2: Table of p-values illustrating the significance of multiple pairwise means comparisons within each pointing technique ...... 83
Table 4.3: Table of p-values illustrating the significance of multiple pairwise means comparisons within each distance ...... 84
Table 4.4: A summary of interaction methods under each targeting model proposed ...... 96
Table 6.1: Numerical results of the estimated distances at each distance, as well as the mean accuracy that the system produced ...... 117
Table 6.2: Numerical results showing face and eye detection rate at each distance ...... 118
Table 6.3: Estimated maximum head rotation on each axis before ...... 119
Table 6.4: Table of average detected location and differences for each grid position. All values are measured in centimetres ...... 135
Table 7.1: A summary of p-values and significance of multiple pairwise means comparisons for accuracy analysis ...... 165
Table 7.2: A summary of p-values and significance of multiple pairwise means comparisons for task time completion analysis ...... 168
Table 7.3: Table of p-values for each before and after t-tests ...... 172
Table 7.4: Target size and the corresponding effective pointing accuracy required ...... 177
Table 7.5: Table of p-values illustrating the significance of multiple pairwise means comparisons between each condition ...... 184
Table 7.6: Paired t-test between the two conditions ...... 189
Table 7.7: Descriptive Statistics for the Distribution of Mean Depth Estimations ...... 190


List of Illustrations

Figure 1.1: Macintosh System 1 (January 1984) [6] ...... 1
Figure 1.2: The first ever computer mouse, invented by Douglas Engelbart in 1968 [48] ...... 2
Figure 1.3: Steve Jobs' 2008 WWDC Keynote [47] ...... 3
Figure 1.4: Mission Control Center, Houston [3] ...... 3
Figure 1.5: The Microsoft Home Living Room [90] ...... 5
Figure 1.6: An illustration of presenter's mobility being restricted by the mouse and keyboard ...... 6
Figure 1.7: Gyration GyroMouse [59] ...... 7
Figure 2.1: Project Looking Glass from Sun Microsystems [133] ...... 16
Figure 2.2: BumpTop Prototype – using the concept of piling [2] ...... 17
Figure 2.3: Drag-and-Pop – stretching potential targets closer [7] ...... 18
Figure 2.4: Polhemus FasTrak [115] ...... 19
Figure 2.5: Logitech Spaceball [95] ...... 20
Figure 2.6: Soap - a Pointing Device that Works in Mid-Air [8] ...... 21
Figure 2.7: The Hand-mouse System for Augmented Reality Environments [76] ...... 22
Figure 2.8: The MobiVR system [117] ...... 23
Figure 2.9: SmartSkin [119] ...... 24
Figure 2.10: Low-Cost Multi-Touch Sensing through Frustrated Total Internal Reflection [60] ...... 25
Figure 2.11: Bare-hand interaction with a whiteboard [61] ...... 26
Figure 2.12: The Everywhere Displays Project [114] ...... 27
Figure 2.13: Scanning Laser Rangefinder [130] ...... 27
Figure 2.14: Microsoft Surface [136] ...... 28
Figure 2.17: A user playing the EyeToy game [12] ...... 35
Figure 2.19: Virtual Keypad [146] ...... 36
Figure 2.20: 3DV ZCam [10] ...... 37
Figure 2.21: A monocular system used in [79] ...... 37
Figure 2.22: Targets arranged in various angles from the camera in the Peek Thru implementation [79] ...... 38
Figure 2.23: GentleMouse [54] ...... 39
Figure 2.24: AirTap and ThumbTrigger [153] ...... 41
Figure 3.1: A user using the mouse in our user study ...... 50
Figure 3.2: The effect of devices on task times for each task ...... 52
Figure 3.3: Total task time for remote vs total task time for mouse ...... 53
Figure 3.4: Screenshot of task 2 - music UI ...... 54
Figure 3.5: Screenshot of task 3 - photo browsing UI ...... 55
Figure 3.6: Screenshot of Task 4 - Scrolling ...... 56
Figure 3.7: Screenshot of Task 5 – Folders navigation ...... 57
Figure 3.8: A photo of the MC remote focusing on the navigational buttons as well as the "back" button ...... 57
Figure 3.9: Screenshot of Task 6 – UI widgets ...... 58
Figure 3.10: Average ratings in the device assessment ...... 61
Figure 3.11: Average rating in the device comparison questionnaire ...... 62
Figure 4.1: An example of using the head-hand line to deduce pointing direction [33] ...... 70
Figure 4.2: An example of an archer using the eye-fingertip method [18] ...... 71
Figure 4.3: An example of forearm pointing, where the user only uses their forearm and fingertip to point, while keeping their upper-arm as close to their body as possible ...... 73
Figure 4.4: Pointing styles used for each task ...... 74
Figure 4.5: Full arm pointing (1) – using the arm as a pointing instrument while keeping the arm as straight as possible ...... 75
Figure 4.7: Mean Overall User Preference ...... 76
Figure 4.6: Full arm pointing (2) – the fingertip is placed between the eye and the target ...... 76
Figure 4.8: The laser pointer was used to represent the arm's direction ...... 79
Figure 4.9: A 5 x 5 mm target at the center of a 30 x 30 cm square composed of 4 dots used for calibration ...... 80
Figure 4.10: The red laser pointer used in this experiment ...... 80
Figure 4.11: The laser pointer used with the line-up method of pointing ...... 81
Figure 4.12: A typical image capturing the red laser dot ...... 82
Figure 4.13: Mean distance between target and the laser dot position ...... 83
Figure 4.14: Perceived Accuracy and Overall Preference for each pointing technique ...... 85
Figure 4.15: The general configuration for targeting ...... 87
Figure 4.16: Examples of visual markers (Pp) ...... 87
Figure 4.17: The Point Model for targeting ...... 88
Figure 4.18: Examples of techniques that use the Point model. (a) pointing with a straight arm, forearm or fingertip. (b) pointing with an infrared laser pointer ...... 89
Figure 4.19: The Touch Model for targeting ...... 90
Figure 4.20: Examples of techniques that use the Touch model. (a) targeting using the fingertip. (b) pointing with a red laser pointer ...... 91
Figure 4.21: The dTouch model for targeting ...... 92
Figure 4.22: Examples of techniques that use the dTouch model. (a) targeting using the eye-fingertip or head-hand line. (b) targeting with a head mounted display and fingertip ...... 93
Figure 4.23: Examples of indirect techniques that use the Touch model. (a) computer mouse. (b) EyeToy game with hand movements ...... 95
Figure 5.1: Overview of our interactive system ...... 100
Figure 5.2: View frustum in computer graphics is the area within a rectangular pyramid between a near plane (computer screen) and a far plane. Only object(s) within the view frustum are drawn on the computer screen ...... 101
Figure 5.3: The view frustum is estimated by detecting the user's head and eye, with the origin at the eye ...... 102
Figure 5.4: (a) the way users used to point at a large display using their hand (b) users point at the virtual touchscreen which is brought towards the users within arm's reach ...... 103
Figure 5.5: A user is interacting with the large display using their fingertip through a virtual touchscreen at an arm's length ...... 103
Figure 5.6: The virtual screen is readjusted as the user moves ...... 104
Figure 5.7: The dTouch model for targeting with the virtual touchscreen ...... 105
Figure 6.1: Proposed interactive system overview ...... 108
Figure 6.2: Origin of world coordinates at the webcam's center ...... 109
Figure 6.3: An illustration of the webcam's view (camera's image plane) ...... 110
Figure 6.4: One end of the view frustum is determined to be the eye location of the user. By using the size of the face, the distance between the user and the display can be calculated ...... 111
Figure 6.5: Checkerboard pattern for determining Rwd. Each square measures 3 cm by 3 cm ...... 113
Figure 6.6: Diagrammatical representation of the position of the checkerboard ...... 113
Figure 6.7: Graph of Field-Of-View (both horizontal and vertical) vs Distance ...... 114
Figure 6.8: Screenshot of our current implementation ...... 116
Figure 6.9: Graph of distance estimated using face detection. The I bars indicate the lowest and highest detected value ...... 118
Figure 6.10: Diagrams representing the pinhole camera model. Source: [62] ...... 120
Figure 6.11: (a) An illustration of the webcam's view (image plane) (b) a top view of the real world coordinate representation ...... 121
Figure 6.12: A fully stretched arm is modelled by a sphere, with the right shoulder as the centre ...... 123
Figure 6.13: Depth deviation between shoulder and eye position ...... 124
Figure 6.14: The depth deviation can be calculated based on the plane normal of the display ...... 125
Figure 6.15: An illustration of the webcam's view ...... 126
Figure 6.16: An illustration of the side-on view. We can clearly see the unknown we need to determine: the distance between fingertip and webcam ...... 127
Figure 6.17: P indicates the location of the imaginary point; X indicates the actual displacement of P from eye position E; Ez indicates distance from camera to eye ...... 127
Figure 6.18: (a) An illustration of the webcam's view (image plane) (b) a top view of the real world coordinate representation ...... 128
Figure 6.19: The sphere's center is at the shoulder and its radius is the arm's length ...... 130
Figure 6.20: An illustration of the grid alignment process with the webcam ...... 133
Figure 6.21: An illustration of the grid alignment process with the webcam ...... 134
Figure 6.22: (a) webcam's view showing fingertip detection discrepancy (b) user's view showing the pointing action ...... 136
Figure 6.23: (a) side view of the lowered index finger (b) user's view illustrating the new pointing action ...... 137
Figure 6.24: Before and after skin colour detection ...... 137
Figure 6.25: Plane z = 0 shared between webcam and display ...... 138
Figure 6.26: offset between z = 0 plane and display ...... 141
Figure 6.27: checkerboard is used to define the plane that the display lies on ...... 142
Figure 6.28: an example of checkerboard used to calibrate the display plane ...... 142
Figure 6.29: Intersection of resultant vector and display plane ...... 144
Figure 6.30: System's estimated position with no filtering ...... 147
Figure 6.31: Scatter plot of system's estimated positions with no filtering ...... 148
Figure 6.32: System's estimated position with depth assumed accurate ...... 149
Figure 6.33: Scatter plot of system's estimated positions ...... 149
Figure 6.34: A webcam view showing the size and position of the virtual touchscreen ...... 150
Figure 6.35: System's estimated position with filtering for eye position ...... 151
Figure 6.36: Scatter plot of system's estimated positions with filtering for eye position ...... 151
Figure 6.37: System's estimated positions after resultant point filtering ...... 152
Figure 6.38: Scatter plot of system's estimated positions after resultant point filtering ...... 152
Figure 7.1: A diagrammatical illustration of our experimental setup ...... 155
Figure 7.2: Diagram of display used and the target position ...... 156
Figure 7.3: A photo of the experimental setup ...... 157
Figure 7.4: A red circle is highlighted over the target when it is selected successfully ...... 158
Figure 7.5: An illustration of the EyeToy system ...... 160
Figure 7.6: An illustration of the EyeToy system ...... 160
Figure 7.7: Effect of feedback on accuracy ...... 164
Figure 7.8: Effect of feedback on task completion time ...... 167
Figure 7.9: Error rate for each condition (over time limit) ...... 169
Figure 7.10: Effect of before and after on Pointing Accuracy for dTouch ...... 171
Figure 7.11: Effect of before and after on Pointing Accuracy for EyeToy ...... 171
Figure 7.12: Error rate for dTouch ...... 173
Figure 7.13: Error rate for EyeToy ...... 173
Figure 7.14: User Preference (rank 1-6) ...... 174
Figure 7.15: Percentage of successful and unsuccessful trials for each target ...... 179
Figure 7.16: Mean Time Taken for Target Selection ...... 180
Figure 7.17: Accuracy of pointing to each target ...... 184
Figure 7.18: An illustration of the position of the four standing locations ...... 186
Figure 7.19: An illustration of the difference between detected face width and the actual face width ...... 187
Figure 7.20: Accuracy of pointing from different locations ...... 188
Figure 7.21: Mean estimated depth at various standing positions ...... 189
Figure 7.22: Distribution of Mean Depth Estimations for Front and Center Positions ...... 190
Figure 8.1: 3D model of Manchester city in virtual reality cave [148] ...... 197
Figure 8.2: First Person Shooter arcade game – Time Crisis 4 [100] ...... 197
Figure 8.3: Ralph Lauren's window shopping touch screen [149] ...... 198
Figure 8.4: The IshinDenshin System was designed to increase the sense of presence between remote collaborators [81] ...... 198


Chapter 1 Introduction

1.1 Human-Computer Interaction

One of the main themes in the field of Human-Computer Interaction (HCI) is the interaction between computer systems and their users. Interaction methods are designed to bridge the gap between these two entities by allowing bidirectional communication [70]. They usually involve both hardware input devices and software graphical user interfaces (GUIs). By providing devices and user interfaces that are natural and easy to use, the gap between user and system is reduced, which in turn gives users the ability to communicate with the system effectively and efficiently [39]. To this end, researchers have attempted to develop adequate interface and interaction techniques since the dawn of the computer industry. Since its mainstream introduction with the Apple Macintosh in 1984 (illustrated in Figure 1.1), the mouse-operated desktop GUI has become the interface

Figure 1.1: Macintosh System 1 (January 1984)[6]

of choice for personal computers (PCs). This interface was designed to support tasks that were common to computers at that time: namely word processing, spreadsheets, drawing, project planning, and other "productivity tasks". These tasks are typically performed with the user sitting in a chair at a desk. The desk provides a surface for the operation of the mouse as well as placement of the monitor, typically a high-resolution display less than a metre from the user. The mouse has gone through numerous cycles of ergonomic improvement, from the earliest brick-like units with mechanical rollers (Figure 1.2) to the sculpted optical wireless mice prevalent today. Various studies have shown it to be a simple-to-use and efficient input device [84, 91, 126].

Figure 1.2: The first ever computer mouse, invented by Douglas Engelbart in 1968[48].

In his 1996 survey, Robert Jacob, an HCI visionary, predicted that "it is likely that computers smaller and larger than today's workstation will appear, and the workstation-size machines may disappear" [69]. Today, we see that both smaller and larger computers (in terms of display size) have indeed appeared: small devices such as PDAs and smartphones, and large display systems such as projected wall-sized displays, are now commonplace. As computer technologies become more advanced, they can increasingly handle tasks beyond simple productivity applications, moving towards the manipulation of rich media.

1.2 Interaction with Large Displays

Large displays have become less expensive in recent years, and the performance of graphics processors has improved. Computer users can now afford, and demand, more screen real estate. It has been shown that large displays provide a higher productivity gain as

well as greater user satisfaction compared to traditional monitor displays for personal computers [37, 139]. Large-scale display systems spanning an entire wall are widely used in many modern information technology facilities, especially for interactive purposes such as presentations (Figure 1.3), information visualization (Figure 1.4), and 3D immersive environments. The user interaction devices typically consist of the standard keyboard and the computer mouse. However, there are a number of reasons why these devices, especially the mouse, are less than optimal for such displays.

Figure 1.3: Steve Jobs’ 2008 WWDC Keynote[47]

Figure 1.4: Mission Control Center, Houston[3]


From the outset, in 1968, Douglas Engelbart developed the mouse to provide a way for users to interact with personal computers [49]. It was not anticipated for use in a large display environment. As a result, the mouse only performs moderately well when scaled to large displays.

Pointing is fundamental for users to interact with GUIs [84]. The computer mouse is an intermediary device, providing a means for humans to interact with the computer. Pointing devices, particularly the mouse, are tools for mapping hand movements to on-screen cursor movements, for the manipulation of on-screen objects. Such an indirect mapping is due to differences between the input space (usually the horizontal table used by the mouse) and the output space (usually a monitor) [65, 118]. This indirectness enlarges the Gulf of Execution and the Gulf of Evaluation [105], and thus reduces users' freedom and efficiency.

The large displays that we are particularly interested in studying are ones that provide interactivity inside the home or office environment. Within the context of this thesis, large displays are categorized as ones that are around 1 or 2 metres wide, larger than the normal monitor that one would use on a desk, and typically viewed from more than one metre away. This means that the user would be unable to reach the display without physically moving. We aim to study the way humans interact with such displays and the interaction currently available to them, and, in the process, to devise an interaction that is more natural and more intuitive for the user. In this thesis, we focus on interaction with the two most common usages of large displays: presentation systems and entertainment systems. We describe the current technologies that are being used with these systems.

1.2.1 Entertainment Systems

Consumer electronics, such as televisions and DVD players, utilize a purpose-designed input device – the remote control. Similar to the mouse, remote controls have evolved through several generations of industrial design. The increasing complexity of consumer electronics can be seen as the microchip becomes an embedded component in all modern devices. With the added functionality of these semi-intelligent devices came more interfaces for controlling their multiple functions, and the convenience of controlling them remotely. Users typically find remote controls convenient and easy to use, at least for simple

tasks such as changing channels on the television. In our modern world, it is common to find consumers experienced with both remote control and mouse interfaces.

Figure 1.5: The Microsoft Home Living Room[90]

As computer technology becomes more advanced, it can increasingly handle tasks beyond simple productivity applications, moving towards the manipulation of rich media. Computers that are capable of playing back music, showing digital photos, and recording live television are now commonplace.

As their multimedia capabilities increase, we see an increasing number of computers making their way from the desk in the home office to the living room. In the living room environment, the user typically sits on a couch rather than at a desk – a setting where more users are accustomed to using a remote control than a mouse. The current desktop GUI loses effectiveness when viewed from a distance of 10 feet. In addition, the remote control may require learning both the device itself and the on-screen interface. This presents a problem, as each remote requires a new set of skills, and the problem only multiplies as more functionality is added to ever-advancing technology. It is also possible that these devices may be misplaced, disrupting the user experience.

1.2.2 Presentation Systems

In presentation settings, a computer system – typically a laptop – is placed on a desk and connected to a projection display. When the presenter wishes to navigate the content, s/he has to return to the laptop to use the keyboard and

mouse. This greatly restricts the presenter's mobility. Moreover, to make use of the on-screen cursor, the presenter must be aware of the degree to which small movements of the mouse correspond to larger movements on the display. This distracts the presenter from the flow of the presentation, as s/he has to look at both the mouse and the laptop display. To highlight key points on the slides, the presenter might walk up to the projected display to point with their arm and hand. Commercially available direct interaction systems, such as the touch-sensitive Smartboard, could be used [127]. However, to use such systems, the presenter is required to touch the surface directly with their hand.

Figure 1.6: An illustration of presenter’s mobility being restricted by the mouse and keyboard

Alternatively, when the screen is beyond arm's reach, a laser pointer could be used, with the laser dot produced on the display tracked by cameras. However, the presenter's hand tremor is magnified, and it has been reported that audiences may find it difficult to see and follow the laser dot [21]. Researchers have found the laser pointer to be the slowest device for pointing from a distance [98], compared to both traditional mouse pointing and directly touching targets on the display.

1.3 Alternate input techniques

Various research efforts have attempted to optimize and improve the use of the computer mouse on large displays using software solutions such as Drag-and-Pop [7], where a prediction algorithm identifies the objects most likely to be required and moves them closer to the user for easy access. Hardware solutions have also been explored. Remote pointing devices are commonly used as alternatives and are readily available commercially. They are designed specifically to be used from a distance to complement or replace the mouse. GyroMouse [59] and RemotePoint [67] are two examples. GyroMouse is a handheld device that allows a user to control the on-screen cursor position through angular movements of the hand detected by gyroscopes (Figure 1.7), while the RemotePoint system allows the user to control the on-screen cursor with a thumb-operated isometric joystick. These methods, however, still underperform the mouse in terms of throughput [84].

Figure 1.7: Gyration GyroMouse [59]

It can be seen that, while research in the area of remote pointing devices for large displays has become more prevalent in recent years, there is still no widely accepted standard input device or technique that is readily available to all. Perhaps the most direct form of interaction is being able to point at something without any restrictions. One common approach is to use our own hand as the input method, discarding the intermediary device that has restricted our mobility and freedom. The pointing gesture may be used to indicate an on-screen target directly, without the need for any hardware devices.

Computer vision can be used to detect and track the user's pointing gesture, and is an active area of research in which vision is exploited as a means of communication from the user to the computer system. Computer vision based systems have the advantage of being a non-invasive input technique and do not require a physically touchable surface, which makes them highly suitable for interaction at a distance or in hygienically demanding environments such as public spaces or factories.

Stereo cameras allow tracking of the user in 3D space, so the system knows exactly where the user is. Combined with knowledge of the user's pointing direction, the system is then able to determine precisely where on the display the user is pointing, while allowing the user to move around freely.

A single camera can be used to detect the position of the hand, either from a top-down view of a tabletop or from a front-on view above a vertical display, so that the x and y coordinates in 2D space can easily be determined. Interaction then relies on visual feedback, usually in the form of an on-screen cursor or an image of the user. This concept is used in various works [81, 107] as well as in the popular game EyeToy [129] on the PlayStation 2 console. The major drawback of this type of interaction is that the interaction space is fixed to a 2D area, and the user is not allowed to move around.

Monocular vision has only gained popularity in recent years, in the form of webcams for personal computing and console gaming. There are several advantages to the use of a single camera:
- The computational cost associated with matching multiple image streams is eliminated.
- It would be more easily accessible to, and adopted by, the average computer user, as the majority already own a webcam.
- Webcams are mainly used for conferencing or one-on-one video chatting. Since the webcam is already set up, no additional setup cost is incurred.
- For those who do not already own a webcam, a single camera is easier to set up and use, particularly for the novice computer user.

Compared to monocular techniques, it is easier to find the exact location of objects in the scene using a stereoscopic view. The need for a pair of stereo cameras can be eliminated if depth recovery is achievable with a single camera, which would allow a wider user base. This in turn could eliminate the hardware problem and allow research to focus more on the software issues involved. However, few researchers

have investigated interaction methods that rely on monocular computer vision where depth information recovery is required.
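To make the 2D, fixed-interaction-space case described above concrete, the sketch below shows how a single front-on camera could drive an on-screen cursor from a detected hand position. It is only an illustrative approximation, assuming OpenCV 4, a naive HSV skin-colour threshold and a hypothetical 1920x1080 display; it is not the approach developed in this thesis, which recovers depth as well.

```python
import cv2
import numpy as np

SCREEN_W, SCREEN_H = 1920, 1080  # assumed display resolution (placeholder)

def hand_position_2d(frame):
    """Return a rough (x, y) hand position in image coordinates using a
    naive HSV skin-colour threshold; None if no skin-coloured blob is found."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Broad skin-colour range; a real system would calibrate per user/lighting.
    mask = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)  # take the largest skin blob
    m = cv2.moments(hand)
    if m["m00"] == 0:
        return None
    return m["m10"] / m["m00"], m["m01"] / m["m00"]  # blob centroid

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    pos = hand_position_2d(frame)
    if pos is not None:
        h, w = frame.shape[:2]
        # Map image coordinates to screen coordinates; mirror x so the cursor
        # moves in the same direction as the user's hand.
        cursor = (int((1 - pos[0] / w) * SCREEN_W), int(pos[1] / h * SCREEN_H))
        print(cursor)  # a real system would move a cursor or highlight a target
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```

Because no depth is recovered, the user must stay within the camera's fixed 2D interaction area – precisely the limitation that motivates the monocular depth-recovery approach investigated in later chapters.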

1.4 Pointing Strategies

In most vision-based interactive systems, the accuracy of the estimated pointing direction is essential to their success. The focus is usually on finding new ways of improving the detection, estimation and tracking of the pointing hand or finger, and deducing the pointing direction [32, 33, 102]. However, these systems do not account for inaccuracies made by the user and implicitly assume that the user is always pointing at the desired location. The accuracy achieved by a vision system is only as good as the pointing accuracy provided by the user, and in practice it can be even worse. To make any system more reliable and accurate, one should begin by understanding the pointing strategies adopted by users and the methods they can adopt to point to the best of their ability. Only then should we focus on detecting the hand accurately. We will also be addressing these issues in this thesis.

1.5 Aim of this Thesis

The ultimate goal is to allow direct interaction with multiple large wall-sized displays using nothing but one's own hand. This research builds upon previous work and takes another step towards this aim. This thesis investigates the feasibility of using monocular computer vision to allow bare-hand interaction with large display systems from a distance using the pointing gesture. It is important to note that the aim of computer vision based interaction techniques is not to replace the mouse in general, but rather to design input methods that provide a viable alternative for users, particularly for tasks where the mouse is not suitable. Our goal is to design a new system that meets the following guidelines:

Natural – the interaction method should respond to the natural behaviour of humans, so that users do not need to learn anything new in order to use the system. They should also be comfortable throughout the entire interaction process.


Direct – the input space and the output space should be as close to each other as possible, so that users do not need to think too long and hard, making task completion easier and faster (a touch screen is a good example of this).

Unintrusive – users should not have to hold on to any device, making it easier for anyone standing near the display to use it. This helps with collaboration, since collaborators do not need to pass around an input device.

Unconstrained and untethered – the user should be able to walk around freely while interacting with the display, unlike, for example, the computer mouse and keyboard, which restrict the user to a fixed place of interaction at a desk or table.

Inexpensive – to allow wide adoption of such input systems in home and office environments, cost should be kept as low as possible.

Simple to set up – users should be able to set up the system anywhere relatively easily and quickly, so that time is not wasted making it work, distracting them from their task.

Using monocular vision allows us to challenge what is achievable with current off-the-shelf technology, so that future researchers and product designers can build on this method and make it easily accessible to the mass market without requiring excessive setup. This helps to avoid the scenario in which an expensive setup in a specialized laboratory is ultimately used only for the purpose of demonstration. It is important to note that this thesis does not try to address specific computer vision problems such as improving or developing algorithms that already exist (for example, face detection and skin colour detection), but rather to develop novel interaction techniques and approaches in an attempt to solve previously unaddressed interaction issues. Existing algorithms are used where possible. Without accurate knowledge of human pointing ability, the task of detecting the pointing direction is made more difficult. In order to understand this, this research also aims to study the way that people naturally point at real-world objects and to propose models that formalize our findings. Understanding the mechanism of pointing can assist future human-computer interaction researchers and practitioners in deciding the input strategy that best suits their users in completing the required tasks.


1.6 Contribution of this Thesis

This thesis makes a number of research contributions that investigate and develop theories, together with examples that attempt to push the limits in the area of HCI.
- We investigate the use of current input devices (mouse and remote control) on a Media PC interface, resembling the use of these technologies in an everyday situation. We recommend that future interaction techniques should resemble the way we interact with things in the real world, much like using our hand to point at objects.
- We study the strategies which people naturally use to point at objects in the real world at different distances. We observe in our experiment that the use of a full arm stretch is the most natural and the use of the line-up strategy is the most accurate.
- From these analyses, we introduce and formalize three geometric models (Point, Touch, dTouch) to systematically classify the different pointing strategies that underpin the methods used in previous and current interactive systems.
- We introduce a depth recovery technique which makes use of human physical attributes for use in monocular computer vision environments.
- We present an interaction technique we call the "virtual touchscreen", which makes use of the user's view frustum.
- We design and show how it is possible to implement a novel interaction system that uses monocular computer vision, geometric constraints of the human body, and a virtual touchscreen to allow interaction with a vertical display using a pointing gesture (proof-of-concept prototype).
- We present an evaluation of our prototype in several usability studies to show that it is possible to allow natural user interaction with large displays using low-cost monocular computer vision:
  o We compare its pointing performance with other similar pointing systems using varying levels of feedback, and show that our system is more accurate and is preferred by users overall.
  o We study the minimum target size required to attain reasonable accuracy from our system, and show that users can comfortably point at 10 cm wide targets when interacting from a distance of 1.6 metres.
  o We test the effect of the system calibration required to use our prototype, and show that our system can be used by new users to obtain accurate interaction after a two-step calibration process.
  o We investigate the ability of our prototype to handle changes in the user's standing location. We show that this is possible with our system but is hindered by detection issues.

Some of the theories and results presented in this thesis have been published in UIST [31], OZCHI [27] and ACSC [26], while a few more have been accepted for publication in INTERACT [28], EICS [29] and IN-TECH [30].

1.7 Research Methodology

As with the typical design process for HCI systems, our research follows both a user-centered and an iterative approach [9]. In user-centered design, users are involved in all stages of the design process, from the initial gathering of user requirements to performance testing and evaluation. Iterative design is an approach to the development of user interfaces that recognizes that perfect user interfaces cannot be designed on the first attempt, even by the best usability experts [103]. The design is evaluated through user testing; any problems revealed in the process are fixed and the design refined. The design is then evaluated again, and thus the refinement cycle continues. This thesis represents one iteration in the user-centered interaction design process.

We begin by evaluating two common input methods (the mouse and the remote control) that are widely used for interaction with a display from a distance. The results from this usability study reveal inadequacies in the current interaction models, and recommendations are provided for improving current techniques and for designing more natural interaction methods.

To design natural interaction methods, designers must understand how humans naturally point in their environment. However, there is a lack of literature that systematically describes the strategies humans adopt when pointing, and the accuracy that these different strategies provide. We conduct experiments and formulate geometric models to capture the essence of the different pointing strategies that are observed. The model that provides the best accuracy is used to design a new interactive system.

An interactive system is designed to:
- attempt to solve problems with previous interaction methods,
- adopt the pointing model that provided the best accuracy, and
- satisfy the initial goals that were laid down.

Empirical evaluation is the main method used to evaluate our proposed system. Estimation methods are designed and a proof-of-concept prototype developed to demonstrate the viability of such an approach. We then evaluate the robustness of our system by testing the effect of varying amounts of initial user calibration. We also observe the effect of varying the user's standing location. A series of experimental evaluations is then used to gauge the use of our system from the user's point of view. Test subjects represent potential users, and we measure the performance of our system using both quantitative measures (accuracy, task completion time and error rate) and qualitative means (user preference and comfort). A usability experiment is conducted to compare our prototype with another bare-hand interaction method. We also perform a short experiment to see what size of target users can comfortably point at. With this information, we can get an idea of the true resolution our pointing system can provide, which aids future interface designs for this type of interaction method.

1.8 Thesis Structure

The thesis is organised so that most of the background material and related work are presented in Chapters 1 and 2, followed by a series of experimental studies. However, in order to preserve the flow of the whole thesis and to make each experiment easier to understand as a single unit, some related work specific to certain topics is presented within the relevant chapters.

In the next chapter we investigate the significant research that has been done, both historically and recently, in the area of input methods in HCI, and then move on to the more specific areas relevant to this thesis: specifically, the solutions that have been attempted for interacting with large displays, past and present, and the kinds of problems from which they suffer.

In Chapter 3 we study the use of two common input methods that are used to

interact with displays from a distance, based on a user interface used in a real-world environment, and provide recommendations for designing a more natural interaction method.

Chapter 4 presents an investigation into the use of natural pointing gestures in the real world, and formulates models to classify them. We then recommend a strategy to help in designing more natural interaction methods while preserving accuracy.

In Chapter 5 we propose and present our design for a novel interaction method. We show the concepts within our design that address prior problems, and the benefits that they bring.

In Chapter 6, the position estimation methods are presented: methodologies and algorithms are described to show how it is possible to build such a system. In the process, a proof-of-concept prototype is developed.

A series of experimental evaluations is conducted and reported in Chapter 7 to evaluate the system's performance. We also compare it with a similar pointing system.

Finally, in Chapter 8 we conclude and discuss the implications of this thesis for the future of human-computer interaction.


Chapter 2 Background and Previous Work

This chapter introduces historically important work and investigates significant related research underlying this thesis. We take a close look at the current trend towards immersive interaction, how it has been achieved, and how pointing has evolved from the earliest pointing devices (namely the computer mouse) to the state-of-the-art techniques that researchers around the world have proposed to improve interaction with large surfaces.

2.1 Manipulation and Interaction

Manipulation is the adjustment of, or change made to, an object in some shape or form. In terms of information technology, it goes as far back as the 1960s, when Seymour Papert at the Massachusetts Institute of Technology developed the LOGO computer programming language. The concept of symbolic computation [35] was adopted to give a visual representation to words and ideas. A graphical representation, a turtle, is used to allow children to type simple commands and observe their effects on the turtle on screen. This is an example of indirect manipulation, where instructions are entered into the computer using the keyboard in order to manipulate the turtle. Alternatively, a more direct approach is to control the turtle by interacting with its on-screen representation. This is the principle behind direct manipulation, a term coined by Ben Shneiderman in 1983 [125]. MacDraw was the first commercially available application that used this approach, allowing users to manipulate lines and shapes in a drawing [5] by providing a tool palette along the left side of the drawing. Users can then select different tools using their mouse to manipulate the drawing as desired. The manipulation is achieved by using the mouse to interact with the system. Interaction, as distinct from manipulation, refers to the method by which users provide input to the system and the system provides feedback to the user.

The interaction technique used in this thesis is pointing. The traditional keyboard is sufficient for inputting characters or numbers, such as those used by children to enter simple commands in LOGO. However, convenience and efficiency can be gained if one can point or aim directly at a location of interest. Such a system would be more direct and easier to use. Indeed, children can use body postures and gestures such as reaching and pointing by about 9 months of age [83]. The act of using the index finger to point at something is defined as the "deictic gesture" [154].

2.2 Graphical User Interface

The WIMP paradigm (Windows, Icons, Menus, Pointing) has been with us since the birth of the PC revolution with the Xerox Star in 1981. It provides a way for computer users to interact with the computer using a mouse cursor rather than just typing characters. Twenty-odd years on, this desktop metaphor is still the norm for interacting with computers, even though processing speed and screen size have increased dramatically.

Figure 2.1: Project Looking Glass from Sun Microsystems [133]

There have been various attempts at improving this paradigm. Project Looking Glass [135] from Sun Microsystems is an attempt at revolutionizing the current 2D desktop by adding depth perception. Its most novel concept is the ability to rotate windows and write or scribble notes and comments on the back of them. However, most of the features only provide a 3D representation of their 2D counterparts.


Figure 2.2: BumpTop Prototype – using the concept of piling [2]

In a similar work, BumpTop [2] enhances the desktop metaphor by giving documents the physical characteristics of lightweight objects and manipulating them in a more physically realistic manner. By using the concept of piling rather than filing, document management becomes more natural to the user. Using a stylus, users can arrange a pile of documents by circling around them, organize them into a neat pile, or search through them in a heap one by one.

As the demand for large screens increases, so does the need for techniques that facilitate their use. To address this, various approaches have been proposed to help with large-screen interaction. Drag-and-Pop [7] attempts to reduce the mouse movements required for dragging screen objects (e.g. icons) across a large screen by animating and bringing potential targets (e.g. folders or the recycle bin) closer to the object being dragged. To preserve the user's spatial memory, a rubber-band-like visualization is used to stretch the potential targets closer to the original object rather than moving the targets themselves. Compared to the "normal" condition where targets are not stretched, improvements were observed when the technique was used on multiple screens, where crossing the bezel is a problem. However, no performance gain was observed on a single screen.


Figure 2.3: Drag-and-Pop – stretching potential targets closer [7]

Object Pointing [58] attempts to reduce the gap between selectable objects by "skipping empty spaces", making it much easier to move from object to object, especially on a large screen. A user study shows that movement time does not follow Fitts' Law (a linear increase with the index of difficulty; the formulation is given below for reference) but stays constant when the distance between two objects increases or their size decreases. As the authors pointed out, one drawback of such a solution is that no performance gain results when selectable regions are tiled closely together.

However, in all these cases, the same WIMP paradigm is still used. Users are still required to use their mouse and click on buttons and menus. Even with such software improvements, the fact remains that the indirectness of the mouse is still a factor inhibiting the naturalness of interacting with large screens. Over the years, various novel input devices have been developed to address this problem of indirectness, with varying degrees of success – a comprehensive overview is outlined in [69]. We now consider some devices and techniques that provide a somewhat more direct approach than the mouse. We will first examine intrusive techniques, followed by a survey of non-intrusive techniques.
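For reference, the Fitts' Law relation referred to above is commonly written in its Shannon formulation (after MacKenzie), in which movement time grows linearly with the index of difficulty:

```latex
% Fitts' Law, Shannon formulation:
%   MT   : movement time
%   D    : distance (amplitude) of the movement to the target
%   W    : width of the target
%   a, b : empirically fitted constants
MT = a + b \log_2\!\left(\frac{D}{W} + 1\right)
```

In these terms, Object Pointing's claim is that, because the empty space between targets is skipped, movement time stays roughly constant as D/W grows, instead of increasing with the index of difficulty log2(D/W + 1).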

2.3 Handheld Intrusive Devices

Intrusive devices for HCI are ones that can be hand-held or placed on the body. The lightpen is one of the first pointing devices produced [97]. It works by

pressing the pen against a CRT display; a switch on the pen activates it and allows it to capture the light produced by the screen. However, it is not suitable for prolonged use due to its poor ergonomics, and it was eventually replaced by the mouse. As discussed in Chapter 1, the mouse provides an indirect method of interacting with the computer. Light gun technology has been used extensively, especially in the computer gaming industry. It also provides a direct approach to interacting with CRTs and, unlike light pens, it can operate at a distance. Although accuracy is maintained, its major drawback is that it must be used with a CRT display; it therefore cannot be used with large-scale displays, whose images are provided by a data projector.

Figure 2.4: Polhemus FasTrak[115]

The Polhemus FasTrak and Logitech 3D Mouse belong to a category of industrial-strength tracking systems primarily designed for 3D motion tracking, as found in applications such as CavePainting [73]. FasTrak is an electromagnetic tracking system that computes the position and orientation of a tiny receiver as it moves through space. Its major problem, however, is its vulnerability to electromagnetic interference and radiation, particularly from the monitor. In addition, the system has a limited range of 3 metres and a latency of 4 milliseconds. It is primarily designed for 3D motion capture in a Virtual Reality environment. The 3D Mouse is another similar tracking system. It uses a stationary triangular transmitter which emits ultrasonic signals to track the movement of a smaller triangular receiver. This resolves the problem of interference from radiation but introduces interference from other equipment that uses ultrasonic signals. The system also has a limited range of 2 metres and a high latency of 30 milliseconds. These are typically used for CAD

object manipulation and Virtual Reality, and are cumbersome and expensive, costing up to US$6000. Developed at MIT, the Put That There system is considered the first multimodal system to incorporate both speech and gesture recognition [13]. A Polhemus position-sensing cube is attached to a watchband worn on the wrist. The user is able to indicate (using a pointing gesture) what should be moved by pointing at an object on a large screen, and uses voice commands to indicate what to do with the object.

Figure 2.5: Logitech Spaceball[95]

The Logitech Spaceball [95], Gyration GyroMouse [59], Interlink RemotePoint [67], and SpaceCat [52, 134] represent another category of input devices designed for personal use as a mouse replacement. The Spaceball is a device with a ball-shaped controller mounted on top. It allows users to push, pull and twist the ball in order to manipulate on-screen objects; it is designed for 3D model manipulation and provides a more natural movement for the user. The GyroMouse is based on a technology called GyroPoint that uses gyroscopes to detect angular movements of the device; these rotations can be used to control a 3D object or mouse cursor. The RemotePoint allows users to roll their thumb around a soft rubber pointing button fitted onto a handheld device. SpaceCat is a softly elastic input device developed to allow precise short-range movements and provide six degrees of freedom. The advantages of these devices over the previous category include their wireless operation, affordability and ease of handling, although the user still interacts through an intermediary device. In a recent piece of research, a special casing was used to house a wireless mouse, transforming it into a soap-like device that can be used in mid-air [8]. The hard outer casing of a normal computer mouse is replaced with a soft towel-like layer.


Users can rotate the mouse independently of the outer layer of the soap for fast access to hard-to-reach places on large displays, while maintaining precise control via a touchpad-type interaction. The performance of this device is yet to be disclosed.

Figure 2.6: Soap - a Pointing Device that Works in Mid-Air [8]

Recent inventions such as the handheld Personal Digital Assistant (PDA) and graphics tablets use a stylus device, such as a specialized pen, for data input. Such devices are best for personal use, allowing direct interaction. However, they are still considered indirect when used with a large display, because the user performs input on the handheld device and receives feedback on the large display on the wall. One device that is attracting increasing amounts of research is the laser pointer. These have the advantages of mobility, direct interaction, and being comparably inexpensive, with the notable disadvantages of lag and the instability of the human hand. Many studies of these systems have been carried out [74, 108, 131], in which a normal red laser dot on the screen is captured by a camera. Dwelling is a popular interaction technique for replacing the mouse click [74], and Olsen investigated the effect of lag with this method [108], which led to a discussion on the use of visible and invisible laser pointers [23]. The problem of hand jitter was presented in [98, 108, 111]. LasIRPoint [25] is a pointing system that uses an invisible infrared laser pointer rather than a normal red laser pointer in an attempt to hide both the user's hand jitter and the latency of the system. To capture the infrared pointer, an infrared-filtered camera called SmartNav is used. SmartNav is a consumer product from NaturalPoint [101], a hands-free mouse developed for people with disabilities. An infrared camera is placed on top of the monitor facing the user. Within the camera are four infrared LEDs which emit infrared light. A small reflective dot is worn on the user's forehead, cap, glasses or mic boom. When the user moves their head, the camera detects the dot's movement and translates that motion into mouse cursor movement on screen.

Figure 2.7: The Hand-Mouse System for Augmented Reality Environments [76]

Computer vision is normally used in a non-intrusive setting and will be discussed in the next section. However, there is a category of context-aware systems based on computer vision techniques, called Visual Wearables [77], which is used to capture contextual information about the wearer's environment in order to construct an augmented environment. One typical example is the Hand Mouse developed by Kurata et al. [76]. It is a wearable input device that detects the hand using colour and shape information. A camera worn by the user, hanging off their ear, captures the user's pointing finger in front of them. Users also wear a head-worn MicroOptical Clip-On display in the form of a pair of glasses. A GUI is presented to the user through this display, and they use their finger to point at particular items of interest. Because the device is wearable in nature, varying lighting and background conditions must be taken into account. The user's hand is segmented by approximating a colour histogram with a Gaussian Mixture Model (GMM), and the resulting classified hand pixels are then fitted to a simple model of hand shapes. Similar to the Hand Mouse is the MobiVR system [117], which also captures and detects a pointing finger behind a micro display. The main difference with

MobiVR is that it is a handheld device rather than head-mounted, which allows users to use it at will rather than occluding the real world at all times (as with most other wearable devices), making it more mobile. Their implementation uses a non-see-through binocular near-eye microdisplay (taken from a Head Mounted Display) with a reconfigured SmartNav infrared camera attached, aiming forward. By wearing an infrared-reflective ring on the finger, the user can perform a pointing gesture in front of the device to control a cursor.

Figure 2.8: The MobiVR system [117]
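To make the colour-based segmentation step above concrete, the following is a minimal sketch of a GMM skin-colour classifier in the spirit of the Hand Mouse. The component count, likelihood threshold and function names are illustrative assumptions, not details taken from [76].

import numpy as np
from sklearn.mixture import GaussianMixture

def train_skin_model(skin_pixels_rgb, n_components=3):
    """Fit a Gaussian Mixture Model to labelled skin pixels (an N x 3 RGB array)."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="full")
    gmm.fit(skin_pixels_rgb.astype(float))
    return gmm

def classify_frame(gmm, frame_rgb, log_lik_threshold=-12.0):
    """Return a boolean mask of pixels whose likelihood under the skin GMM
    exceeds a hand-tuned threshold (the threshold value is an assumption)."""
    h, w, _ = frame_rgb.shape
    pixels = frame_rgb.reshape(-1, 3).astype(float)
    log_lik = gmm.score_samples(pixels)          # per-pixel log-likelihood
    return (log_lik > log_lik_threshold).reshape(h, w)

The resulting mask would then be cleaned up and fitted to a hand-shape model, as the Hand Mouse does with its own classifier.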

Fredriksson et al. used a consumer-grade webcam to capture the user's hand, requiring the user to wear a colour-coded glove to facilitate 3D hand tracking for interacting with the computer [53]. Another popular device is the Wii remote controller (Wiimote) used with the Wii game console [104]. The Wiimote is a handheld pointing device with an infrared camera mounted at the front, which captures two infrared LEDs placed on top of a television display. By detecting the size and orientation of the LEDs in the camera's view, an approximation can be made of the position and orientation of the Wiimote. This information can then be used to determine the position on the display at which the user is pointing; a rough sketch of this estimate is given after this paragraph. Interaction then relies heavily on visual feedback. Even though intrusive devices have the advantage of providing accurate estimates of user position and can support sophisticated interactions [33], they are nevertheless intrusive: users might either have to adjust to the position of the device, or have to wear or hold the device for as long as necessary.

Some might even have to worry about dangling wires.
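As an illustration of the geometry involved, the Wiimote's distance and roll can be recovered from the apparent separation of the two LEDs under a pinhole camera model. The constants and function below are assumptions made for this sketch, not values from the Wiimote hardware.

import math

LED_SEPARATION_MM = 200.0      # assumed physical distance between the two LEDs
CAMERA_FOCAL_PX = 1300.0       # assumed focal length of the infrared camera, in pixels

def estimate_distance_and_roll(led1_px, led2_px):
    """Estimate camera-to-LED-bar distance (mm) and roll angle (radians) from
    the two LED image positions, using the pinhole relation d = f * D / s."""
    dx = led2_px[0] - led1_px[0]
    dy = led2_px[1] - led1_px[1]
    separation_px = math.hypot(dx, dy)
    distance_mm = CAMERA_FOCAL_PX * LED_SEPARATION_MM / separation_px
    roll_rad = math.atan2(dy, dx)              # rotation of the Wiimote about its pointing axis
    return distance_mm, roll_rad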

2.4 Non-intrusive Techniques

Non-intrusive techniques for HCI are ones where users do not need to wear or hold any special device, nor have wires attached. Users only need to approach the surface and use their bare hands. Most of these techniques use some kind of sensor or computer vision to detect the hand position.

2.4.1 Sensitive Surfaces

DiamondTouch [38] is a touch-sensitive table from Mitsubishi Electric Research Laboratories (MERL) that can detect and identify multiple simultaneous users' touches. Antennas are placed under the surface of the table, each carrying a different electrical signal. When the user touches the table, a small electrical circuit is completed, running from a transmitter to the table surface, through the finger touching the surface and the user's body, and onto a receiver in the user's chair. Users must be seated to operate this device.

Figure 2.9: SmartSkin [119]

SmartSkin [119] is a similar technique proposed by Rekimoto. Instead of passing electricity through the body, it relies on capacitive sensing, laying a mesh of transmitter and receiver electrodes on the surface. When a conductive object (such as the human hand) comes near the surface, the signal at the crossing point decreases. This also enables the detection of multiple hand positions, and the hand can be detected even when it is not actually touching the surface.


More recently, multi-touch techniques have been widely researched. One example is the use of frustrated total internal reflection (FTIR) to detect multiple fingers on a tabletop display [60]. This technique exploits the fact that when light travelling inside a medium (e.g. an acrylic pane) under total internal reflection encounters an external material (e.g. a human finger), some of the light escapes from the medium. An external camera can capture exactly where the light escapes, thereby detecting the position of the fingertip. Multiple fingertips can thus be detected at the same time.

Figure 2.10: Low-Cost Multi-Touch Sensing through Frustrated Total Internal Reflection [60]
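Detecting the escaping light reduces to finding bright blobs in the camera image. The sketch below, with assumed threshold and blob-size values, shows one way this could be done with OpenCV; it is illustrative and not the implementation used in [60].

import cv2
import numpy as np

def detect_ftir_touches(gray_frame, brightness_thresh=200, min_area=30):
    """Return (x, y) centroids of bright blobs where frustrated total internal
    reflection lets light escape under a fingertip. Thresholds are assumptions."""
    _, binary = cv2.threshold(gray_frame, brightness_thresh, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary.astype(np.uint8))
    touches = []
    for i in range(1, n):                      # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            touches.append(tuple(centroids[i]))
    return touches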

All of these techniques require the user to be at the location they want to point at. They also require large movements of the arm, as well as pacing across the surface, and they do not work well when parts of the surface are hard to reach.

2.4.2 Computer Vision

Computer vision can also be used to track different parts of the user's body.

Tracking hands above surfaces

Various research projects have dealt with tracking hands above surfaces. The most notable is the DigitalDesk [157], implemented by Wellner in 1993, which used a projector and a camera pointing downwards from above a normal desk. It allows users to use their finger, detected using finger motion tracking, as well as a digitizing pen. It supports computer-based interaction with paper documents, allowing tasks such as copying numbers from paper into a digital calculator, copying part of a sketch into digital form and moving digital objects around the table. In similar research, rather than aiming a camera at a desk, the camera is directed at a vertical whiteboard. Magic Board [36] used a steerable camera so that the screen can be larger than the camera's field of view. Rather than using motion detection, cross-correlation is used, which extracts small regions from an image (e.g. an image of a pointing finger) as templates to search for in later images.
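The cross-correlation search can be sketched as template matching: a small fingertip patch from an earlier frame is slid over the current frame and the best-scoring location is taken as the new fingertip position. This is an illustration of the general idea, not Magic Board's actual code.

import cv2

def track_by_cross_correlation(frame_gray, fingertip_template):
    """Locate the fingertip template in the current greyscale frame using
    normalised cross-correlation and return the best match and its score."""
    response = cv2.matchTemplate(frame_gray, fingertip_template, cv2.TM_CCORR_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(response)
    return max_loc, max_val                    # top-left corner of best match, and its score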

Figure 2.11: Bare-hand interaction with a whiteboard [61].

Hardenberg and Berard [61] also used a camera that captures hand activity on a whiteboard. They developed an improved hand segmentation and fingertip detection algorithm and demonstrated their implementation. They used bare-hand pointing on the whiteboard surface as a mouse replacement, as well as using the number of outstretched fingers to control the movement of presentation slides. The SMART Board [127] fixes four tiny cameras at the four corners of its display, which detect any object that comes into contact with the surface or hovers above it. The Everywhere Displays project uses a "multi-surface interactive display projector" so that it can make any surface in the room interactive [113]. It is done by

attaching a rotating mirror so that any surface in the room can be projected onto and captured by the camera. Finger detection is performed on the same surface that is being projected onto, generating a "click" event as if from a computer mouse. In a similar approach, Molyneaux et al. [92] were able to project images onto the surfaces of 3D objects.

Figure 2.12: The Everywhere Displays Project [114].

In other research, Strickon and Paradiso built their own scanning laser rangefinder and placed it at one corner of a wall-sized display to capture the user's hands up to 4 metres away [130]. However, this method suffers from occlusion when one hand gets in front of another.

Figure 2.13: Scanning Laser Rangefinder [130]

Infrared has been used to remove the need for colour hand segmentation and background subtraction, because these sometimes fail when the scene has a complicated background and dynamic lighting [107]. Diffused illumination (DI) is another common technique, used in HoloWall [87] and Barehands [120], where infrared LEDs illuminate the wall from behind alongside a rear projector. An infrared-filtered camera is also positioned behind the wall to detect the presence of a hand or finger: when it comes near the wall (within 30 cm) it reflects additional infrared light. The Microsoft Surface [89] is a tabletop display that also uses this technique, and is designed to detect 52 touches at a time. Apart from detecting fingertips, the camera can also recognize tagged physical objects placed on the surface.

Figure 2.14: Microsoft Surface [136]

EnhancedDesk [107] used hand and fingertip tracking on a (horizontal) augmented desk interface. An infrared camera above the desk is used to detect areas close to human body temperature. Selection is based on where the fingertip is. They also developed a fingertip detection algorithm that makes use of a reduced search window as well as simple assumptions. They are able to track multiple hands and fingers, predict fingertip trajectories in real time, and recognize symbolic gestures using a Hidden Markov Model (HMM).


Figure 2.15: EnhancedDesk [107]

The Perceptive Workbench [80] uses infrared illumination from the ceiling and a camera under the table. When the user extends their arm over the desk, it casts a shadow which is picked up by the camera. A second camera fitted on the right side of the table captures a side view of the hand. Combining the two, the location and orientation of the user's deictic (pointing) gesture can be computed. The approach assumes that the user's arm is not overly bent. It fails when a shadow from the user's body is cast, as well as when two arms are extended at the same time. As with the intrusive devices, these computer vision techniques require the user to be at the location they want to point at. They also require large movements of the arm, as well as pacing across surfaces, and they do not work well when surfaces are hard to reach.

Head pose

Another category of input techniques using computer vision tracks the position of the user's head and moves the mouse cursor on the display accordingly. These are primarily developed to provide full mouse control for people who cannot use their hands but have good head control. Devices in this category include the Synapse Head Tracking Device and the Origin Instruments HeadMouse. The Perceptual Window [36] uses head motions to control the scrolling of a document both vertically and horizontally, effectively controlling the window's viewpoint of the document. Skin colour is detected by using a ratio of histograms of normalized colour with a table look-up. Rather than detecting the whole face, Nouse [56] is a system that uses the nose to control the on-screen cursor and uses a double blink (both eyes blinking at the same time) as a mouse click. More examples of head pose estimation are given in conjunction with other techniques in the next section.

Hand Gesture Recognition (3D)

Maggioni and Kammerer developed a computer system that is able to detect the 3D position and orientation of the human hand, together with the position of the head, under noisy and changing environments in real time [85]. Two cameras are used, both placed on top of the computer; one looks down on the desk capturing the hand, and the other faces the user capturing the head. A virtual 3D model of the user's hand is displayed on the screen, so that movement of the real hand causes the virtual hand to move. They used two approaches for detecting the position and gesture of the hand and head. In the first, a marker is attached to a glove that the user wears and a grey-level image is captured; this simplifies the image processing required and decreases the time spent. The image is then binarised by applying a global threshold. They also compared the images with a background image so that not all pixels in the image are binarised unnecessarily. A contour-following algorithm is used in conjunction with moment calculation to detect objects in the image. Using the marker, they can then calculate the x, y and z rotation of the hand. The second approach uses a fast colour classifier to segment the image by finding human skin colour. To detect the position of the head, they used the same contour and moment calculation as well as a colour-based head detector. It finds the largest elliptical area, uses a connected components algorithm to combine different parts of the head, and uses a histogram approach to determine the centre of the forehead. In addition, they determine the distance between the head and the camera from the width of the head. To detect the position of the hand in x, y and z, they used the contour algorithm together with a region-shrinking algorithm and the centre of gravity to determine the centre of the hand. They then search for fingertips by locating local

maxima of the distance from the centre of the hand and finding a T-like structure. They are then able to recognise six static hand gestures based on the number of visible fingers and their orientation. Their intended application is to allow the manipulation of 3D objects in a non-immersive virtual world on a normal 2D monitor. Users can use the computer mouse normally, and when they raise their hand, the cameras are activated. They have also implemented a virtual holographic system in which the head of the user controls the viewpoint of the 3D virtual space. When users move their head, their line of sight also changes, thereby changing the view of the virtual object, similar to walking around an object in real life. They have also introduced a virtual touchscreen application in which the screen output is projected onto the desk and fingers are used to point, equivalent to a touchscreen but without physically touching it. A usability study was conducted to compare the use of head movements to change the viewpoint of a 3D image with the use of an ordinary mouse, as well as with a set of GUI wheels. They found that problem solving in a 3D scene was much faster than when the viewpoint was changed using head tracking. Building on Maggioni's work, Segen and Kumar [124] developed a vision-based system, GestureVR, which allows hand gesture interaction with a 3D scene. Two cameras are used to track the thumb, the pointing finger, three simple gestures (point, reach and click), as well as the hand's 3D position (x, y and z) and its orientation (roll, pitch, yaw) in real time (60 Hz). After the hand is captured in 2D using background subtraction, the boundary of the hand is determined. By measuring the curvature at each point on the boundary, local extremes are detected and classified as either Peaks or Valleys. The number of Peaks and Valleys, together with a finite state classifier, is used to classify the different gestures being produced. These then determine the orientation of the pointing finger as well as the detection of clicks (a quick bending of the pointing finger). The results from the separate 2D image analyses are then combined to calculate the 3D pose of the hand. In an informal trial where users were asked to show one gesture at a time, the error rate was approximately 1/500. One of the restrictions of this system is the requirement for a high-contrast

uniform, stationary background and stable ambient illumination. Their reasoning is that this restriction gives them an interaction system that is fast, accurate and reliable, which outweighs the small limitation. The main problem with these solutions is that they still provide indirect interaction, since users do not point to objects directly. Gestures must be learnt and are used to indirectly control the user interface and the objects on screen. In addition, the interaction is restricted to a small display area only: the user's hand must be directly under the downward-looking camera, so users are not free to move around and must sit in front of the computer, restricting their freedom.
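The fingertip search described above (local maxima of the contour's distance from the hand centre) can be approximated in a few lines of code. The sketch below assumes the hand contour has already been extracted and uses a simplified peak test; the ratio threshold is an assumption, not a value from [85].

import numpy as np

def fingertip_candidates(contour_xy, min_dist_ratio=1.5):
    """Given a closed hand contour as an (N, 2) array of (x, y) points, return
    points that are local maxima of distance from the contour centroid and
    clearly further out than the average boundary point."""
    centre = contour_xy.mean(axis=0)
    dist = np.linalg.norm(contour_xy - centre, axis=1)
    mean_dist = dist.mean()
    tips = []
    for i in range(len(dist)):
        prev_d = dist[i - 1]                   # index -1 wraps around the closed contour
        next_d = dist[(i + 1) % len(dist)]
        if dist[i] > prev_d and dist[i] > next_d and dist[i] > min_dist_ratio * mean_dist:
            tips.append(tuple(contour_xy[i]))
    return tips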

Free Hand Pointing without Device

Perhaps the most direct form of interaction is being able to point at something with your hand without restrictions such as having to walk up to a particular surface. These systems have the advantage of having no physically touchable surfaces, making them highly suitable for hygiene-critical environments such as factories or public spaces. Depending on the system setup, this usually allows users to interact with the display from wherever they are standing. The Hand Pointing System [138] developed by Takenaka Corporation uses two cameras attached to the ceiling to recognize the three-dimensional position of the user's finger. A mouse click is mimicked by a forward and backward movement of the finger. Free-Hand Pointing [66] is a similar system that also uses a stereo camera pair to track the user's finger, by first detecting and segmenting the finger with a global search. The fingertip and finger orientation are then determined in a local search. Apart from the "finger-orientation mode", where the line from finger to display is determined from the finger orientation, they also introduced an "Eye-to-Fingertip mode", where the line to the display is determined from the eye (either left or right) to the fingertip. They found that the latter approach is less susceptible to noise because of the higher resolution afforded by the larger distance between eye and fingertip. However, users need to raise their hand high enough that it sits between the eye and the display. The PointAt System [33] allows users to walk around freely in a room within a

museum while pointing at specific parts of a painting with their hand. Two cameras are set up to detect the presence of a person, using a modified background subtraction algorithm as well as skin colour detection. The tip of the pointing hand and the centroid of the head are then extracted. Using visual geometry and stereo triangulation, a pointing line is deduced. This method can also be applied to more than two cameras, and does not require manual calibration. Dwell clicking is used in this implementation.

Figure 2.16: The PointAt System [33]

Similar to the PointAt system, Nickel and Stiefelhagen [102] also used a set of stereo cameras to track the user's hand and head to estimate the pointing direction in 3D. Detection was done using a dense disparity map that provides 3D coordinates for each pixel, together with skin colour classification. Pointing gestures are recognised using a three-phase model: Begin (the hand moves from an arbitrary position towards the pointing position), Hold (the hand remains motionless while pointing) and End (the hand moves away from the pointing position). The three phases are detected using a Hidden Markov Model trained with sample pointing gestures in the different phases. They observed that users tend to look at their target before interacting with it, and so in a second experiment they investigated whether head orientation, tracked using a magnetic sensor, could improve their pointing gesture recognition. Results showed that both detection and precision rates increased. In a third experiment, they compared three different approaches for estimating the direction of the pointing gesture: the line of sight between head and hand, the orientation of the forearm, and the head orientation. Results show that, for their implementation, the hand-head line method was the most reliable in estimating the

pointing direction (90%). They concluded that head orientation proved to be a very useful feature in determining pointing direction. However, the conclusion drawn from the third experiment is highly unreliable because of the inconsistent methods: the first two approaches were implemented using computer vision, while head orientation was tracked by a magnetic sensor. It has yet to be seen whether head orientation would be as accurate when detected using computer vision. To compare the three methods fairly, all of them should be tested using specialised devices. In most of the case studies above, the design choices for the interaction techniques, for example the particular hand signals or hand gestures used, were not based on observations from prior user studies. They were frequently based solely on the authors' intuition and are inconsistent across experiments.
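Whatever feature pair is used (hand-head line, eye-fingertip line, or forearm orientation), estimating the pointed-at location amounts to intersecting a 3D ray with the display plane. The following sketch assumes the display lies in the z = 0 plane with the user at positive z; this convention and the function name are illustrative, not taken from [102].

import numpy as np

def hand_head_pointing_target(head_xyz, hand_xyz):
    """Intersect the ray from the head through the hand with the display plane.
    The display is assumed to lie in the z = 0 plane, with the user at z > 0."""
    head = np.asarray(head_xyz, dtype=float)
    hand = np.asarray(hand_xyz, dtype=float)
    direction = hand - head
    if abs(direction[2]) < 1e-9:
        return None                            # ray is parallel to the display plane
    t = -head[2] / direction[2]
    if t <= 0:
        return None                            # user is pointing away from the display
    hit = head + t * direction
    return hit[0], hit[1]                      # (x, y) on the display plane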

There are a number of problems associated with using two cameras [137]. One is reduced acquisition speed, since two entire images must be processed to locate the same point. With a stereoscopic view it is not difficult to find the exact location of an object in the scene, which is why much gesture-based computer vision research has been built on stereo cameras. However, there are few studies in the literature that use monocular vision to support remote hand pointing gestures while remaining non-intrusive.

Monocular Systems

A single camera makes real-time computation much easier and simpler. Nowadays, most computer users have one web camera at home, but possessing two or more is less likely. In the computer vision community, various researchers have investigated the use of monocular vision to estimate depth [123, 145]. It has even been suggested that monocular vision is superior to stereo vision in detecting depth, because "small errors in measuring the direction each camera is pointing have large effect on the calculation" [11]. On the other hand, very few examples in the HCI literature use monocular vision to support remote hand pointing gestures whilst being non-intrusive. Compared with monocular vision techniques, it is easier to find the exact

location of an object in the scene using a stereoscopic view. The need for a pair of stereo cameras would be eliminated if depth recovery were achievable with a single camera. This in turn transfers the problem from one of hardware to one of software. However, few researchers have investigated interaction methods that rely on monocular computer vision where depth information recovery is required. A motion recognition camera, such as the EyeToy USB camera for the PlayStation 2 [129], is placed on top of a large display. A mirror image from the camera is presented on the display, along with additional game-play information. A selection is made when users place their hands at a specific location so that the on-screen image of the hand coincides spatially with an on-screen target. Note that the hand can only be tracked in 2D; depth is not registered.

Figure 2.17: A user playing the EyeToy game [12].

This method is also used in [46] to compare the accuracy of two computer-vision-based selection techniques (motion sensing and object tracking). They found that both techniques were 100% accurate, while the object tracking technique produced significantly fewer errors and took less time.


Figure 2.18: Motion sensing (left) compared to object tracking (right) as selection technique [46]

A similar Virtual Keypad implementation detects the position of the user's fingertip in both the x and y directions for interacting with on-screen targets [146]. It is, however, used much closer to the camera.

Figure 2.19: Virtual Keypad [146]

The major drawback of this type of interaction is that the interaction space is fixed to a 2D area, and the user is not allowed to move around. More recently, 3DV Systems developed the "ZCam" [1] to detect depth information based on the time-of-flight principle. Infrared light is emitted into the environment and a camera captures depth by sensing the intensity of the reflected infrared light reaching back to the camera. However, potential users are required to purchase this special camera.


Figure 2.20: 3DV ZCam [10]

The above-mentioned systems use monocular vision to detect the user's hand only. To provide a truly non-intrusive and natural pointing system, the system should also take into account the user's standing position in addition to the hand location, as the stereo camera systems do (for example, [102]).

In other work, a single camera is used to detect the user's pointing finger and eye position, as shown in the "Peek Thru" implementation [79]. However, the fingertip was deemed too difficult to detect because of occlusion by the user's torso, so users were asked to wear an easily detected thimble on their fingertip. We feel that this violates the notion of using nothing but our own hands for interaction.

Figure 2.21: A monocular system used in [79]


And even if the fingertip position were detected using computer vision, this setup only differentiates between different angles from the camera's view, rather than exact x and y coordinates.

Figure 2.22: Targets arranged in various angles from the camera in the Peek Thru implementation [79]

As can be seen, there is a gap in the literature: no system uses a single camera to determine the direction of pointing from both the user's standing location and their hand position.

2.5 Selection Strategies

The current WIMP paradigm was initially designed for GUIs used exclusively with the computer mouse, where the intended target is selected by "clicking" a button on the mouse. This kind of UI provides pixel-level accuracy; a typical target is often around 20 by 20 pixels. The "click" produced when the user presses a mouse button provides tactile and acoustic feedback. When touch-sensitive displays were developed, the same GUI was also used with a stylus, where the button is now located at the tip of the pen. In addition, styluses let the user write in the same way they would with a pen, as well as providing freehand drawing functionality. The PDA stylus currently in use adopts the same principle.

2.5.1 Mouse click-less interfaces

Cooliris [34] provides a novel way of interacting with internet browsers using the mouse without clicking. These applications are designed as add-ons to current internet browsers. When the mouse cursor hovers on a link, a small icon (around 5 by 5 pixels) appears beside the link; when the cursor then hovers on the icon for a pre-determined time (e.g. 1 second), a special in-browser pop-up appears so the user can preview the linked-to page in a separate window, saving the user time when the linked-to page is not what they want. The hover time is thus an alternative to mouse clicking. Dwell clicking is another name for the same technique, and it is also used in many other systems such as SmartNav. GentleMouse [54] is a similar mouse click replacement. As the user moves over an object, instead of clicking on it, the user pauses for a predetermined amount of time and a small trigger window appears (see Figure 2.23). Meanwhile, the current cursor location is saved (shown in the figure as a red dot). The user selects the desired action, such as a left click, by moving the mouse to the corresponding trigger box. Once the user stops moving, the appropriate action (e.g. a left click) is initiated. The makers claim that by reducing daily repetitive clicking, their product "may significantly reduce the risk of Repetitive Strain Injury (RSI) such as Carpal Tunnel Syndrome (CTS)."

Figure 2.23: GentleMouse [54]

The WIMP paradigm is often difficult, if not impossible, to adapt when button-less pointing methods are used, as there is no way of indicating a selection. Dwell clicking provides a viable alternative for these systems.

2.5.2 Laser Pointer

It is also worth studying how interactive systems that use camera-tracked laser pointers provide input to the system. There are three ways of indicating selection:

- Laser dot on/off status – The laser pointer is initially off. When the laser dot is first turned on and appears on the screen, the location where it appears is regarded as the selection point [45]. Experiments have found that the position of the laser dot when it is first turned on is not a reliable indicator of the user's intended target [98]. Furthermore, it takes around 1 second to move the dot onto the intended target.

- Dwell – When the laser pointer is directed at the same location for a certain amount of time, the target at that location is considered selected [74]. Using this method, it was found that the unsteadiness of the human hand causes the laser dot to wiggle. This causes a deviation of about 8 pixels, which can be reduced to around ±2-4 pixels with filtering [98]; a simple smoothing filter of this kind is sketched after this list. This method is also commonly used in other button-less interaction methods, such as gaze selection [75].

- Physical button – Alternatively, a physical button can be added to the laser pointer and the signal transmitted via wireless means [106]. With desktop-based devices, a button-up event triggers a selection. However, as the laser pointer is held in mid-air, a button press will cause a small deviation, so it is best to use the button-down event to indicate selection.
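As referenced in the dwell item above, jitter in a tracked laser dot (or any noisy pointer) can be reduced with a simple exponential moving average. The class below is a minimal sketch; the smoothing factor is a hand-picked assumption, and heavier smoothing trades jitter for lag.

class PointerSmoother:
    """Exponential moving average over 2D pointer positions."""
    def __init__(self, alpha=0.3):
        self.alpha = alpha                     # smaller alpha = heavier smoothing, more lag
        self.state = None

    def update(self, x, y):
        if self.state is None:
            self.state = (float(x), float(y))
        else:
            sx, sy = self.state
            self.state = (sx + self.alpha * (x - sx),
                          sy + self.alpha * (y - sy))
        return self.state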

Olsen investigated the effect of lag when using the laser pointer [108], which led to a discussion on the use of visible and invisible laser pointers [23] for hiding the effect of hand jitter.

2.5.3 Hand gesture

In order to explore interaction techniques that will allow bare-hand interaction in the future, when marker-free tracking becomes widely available, Vogel and Balakrishnan introduced AirTap and ThumbTrigger [153].


Figure 2.24: AirTap and ThumbTrigger [153]

AirTap mimics the movement of the index finger when a mouse button is pushed. However, as there is no physical object to constrain the downward movement of the finger, kinesthetic feedback is lost; visual and auditory feedback are used instead. The velocity and acceleration of the finger motion are used to detect the tap. ThumbTrigger, on the other hand, uses a sideways thumb motion (moving in and out towards the index finger). Kinesthetic feedback is present in this case, in the form of the index finger, but users reportedly found this method uncomfortable and tiring. It should be noted that the authors were only able to achieve this with the use of a motion tracking system.
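A much-simplified stand-in for the velocity test behind AirTap is shown below: the downward velocity of the tracked fingertip is thresholded frame by frame. The frame rate and threshold values are assumptions made for this illustration and are not taken from [153].

def detect_air_tap(y_positions, frame_rate_hz=60.0, velocity_thresh=0.4):
    """Return frame indices at which the fingertip's downward velocity exceeds
    a threshold (metres per second). +y is taken as downwards."""
    dt = 1.0 / frame_rate_hz
    taps = []
    for i in range(1, len(y_positions)):
        velocity = (y_positions[i] - y_positions[i - 1]) / dt
        if velocity > velocity_thresh:
            taps.append(i)
    return taps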

Grossman et al. also use a thumb trigger gesture [57]. The pointing gesture provides a natural and intuitive way to refer to distant objects in the environment. Many researchers have used it in their systems to indicate a target on a distant display that the user wants to interact with. The process of performing a pointing gesture can be broken down into three steps [94, 102]:

- The pointing hand is positioned towards the desired object from a stationary resting position.

- When the hand is positioned correctly, it remains motionless for a period of time (the dwell time).

- The hand is then moved away from the desired target.

The second step is the most crucial for determining whether an object is selected. When the dwell time is too short, unwanted selections can occur – the "Midas Touch" problem [94]. When the dwell time is too long, users may grow impatient and feel that the system is unresponsive. A recent study found that a dwell time of between 350 and 600 ms gives users a natural and convenient experience [94].
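A dwell-based selector of this kind might be implemented by firing a selection only when the pointer has stayed within a small radius of where it settled for the full dwell period. In the sketch below the 500 ms dwell sits within the 350-600 ms range reported in [94], but the radius and class interface are assumptions made for illustration.

import math
import time

class DwellSelector:
    """Fire a selection when the pointer stays within radius_px of where it
    settled for at least dwell_s seconds."""
    def __init__(self, dwell_s=0.5, radius_px=15):
        self.dwell_s = dwell_s
        self.radius_px = radius_px
        self.anchor = None
        self.anchor_time = None

    def update(self, x, y, now=None):
        now = time.monotonic() if now is None else now
        moved = (self.anchor is None or
                 math.hypot(x - self.anchor[0], y - self.anchor[1]) > self.radius_px)
        if moved:
            self.anchor, self.anchor_time = (x, y), now   # pointer moved: restart the timer
            return None
        if now - self.anchor_time >= self.dwell_s:
            self.anchor_time = now                        # restart so the same spot does not fire repeatedly
            return self.anchor                            # selection at the dwell position
        return None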

Given our exploration of previous work on selection strategies that do not require buttons, dwelling is regarded as the best choice for interacting with natural hand pointing systems.

2.6 Summary

From software enhancements to monocular computer vision, we can observe numerous attempts within HCI research to improve on the current keyboard-and-mouse interface for the large display. Having reviewed the literature, the major previous work within each category is summarised in Table 2.1, together with the problems and limitations associated with it. At this point in time, we are starting the transition from the mouse-based UI to an era of surface-based hand tracking, which already exists commercially in the form of public directory touchscreens and multi-touch interfaces such as the iPhone and Microsoft Surface. We believe that camera-based hand pointing is the next frontier in HCI: users do not even need to touch the display, and are able to interact from a distance. It is time for the computer system to finally adapt to humans, rather than humans adapting to technologies. This research is invaluable for developing more natural interaction methods that are easy to set up and use. In the next chapter, we begin by evaluating two common input methods that are widely used for interaction with large displays from a distance: the computer mouse and the remote control. We observe that these devices have not been studied in a real-world environment, such as the living room, with a user interface that allows the use of both devices at the same time.


Author | Technique used | Pointing | Distance / reach | Computer vision | Other limitations & error rate | Cost

Intrusive devices
Baudisch 2003 [7] | Software based: visualisation, icons move closer to the mouse | Mouse, indirect | Long distances | - | Mouse requires a table | Inexpensive
Rakkolainen 2003 [117] | Near-eye microdisplay with modified SmartNav (infrared camera) | Finger pointing on a virtual touch screen | Arm's length | Monocular | Hand-held device, reflective ring | -
Cantzler and Hoile [20] | Hand-held camera pointing | Camera pointing | 40 cm - 3 m | Monocular | Requires holding the camera; 2-3 pixel error | USB webcam

Tracking hand above surfaces
Dietz and Leigh 2001 [38] | Electrical circuit through the body (touch table) | Touch on table | Must be reachable (table) | - | Requires a special chair | Expensive table
Rekimoto 2002, SmartSkin [119] | Capacitive sensing | Touch on table | Must be reachable (table) | - | - | Expensive table
Oka et al. 2002 [107] | Infrared camera detects hand (temperature) and fingertips | Hand gesture and pointing on table | Must be reachable (table) | Infrared camera | - | Expensive infrared camera

Stereo camera
Sato et al. 2001 [122] | 3D position, pose and orientation of hand | Hand gesture and pointing | ~2 m | Stereo | Two CCD cameras at specific angles | -
Colombo et al. 2003 [33] | Hand-head line | Arm pointing | 3 m | Stereo | Dwell 1.2 s; max error = 19.5 cm | Inexpensive USB webcams
Nickel and Stiefelhagen 2003 [102] | Line of sight between head and hand, forearm orientation | Arm pointing | ~2 m | Stereo | 25 degrees error; hold phase = 1.8 s | -

Monocular camera
SONY EyeToy [129] | Indirect hand position in space | Indirect hand position | 2 m | Monocular | Specialised hardware | USB EyeToy camera
3DV [10] | Indirect hand position in space | Indirect hand position | ~2 m | Single camera | Specialised hardware; error rate not available | -
Lee 2001, Peek Thru [79] | Direct hand pointing to target | Direct hand pointing to target | 3 m | Monocular | Detects targets at different angles rather than 2D positions | Webcam

Our approach
Cheng 2008 | Eye-fingertip line, virtual touch screen | Arm pointing | 1.5 m | Monocular | - | USB webcam

Table 2.1: A summary of previous work within each category, with its properties and limitations

Chapter 3 Case Study – Current Interactive Methods

As computers become more powerful and increasingly capable of a variety of multimedia tasks, they have naturally moved from their original place in the home office into our living rooms, where large screen displays are typically used. Unfortunately, the desktop user interface was designed for a user sitting close to the screen using a mouse, and it does not work well at typical television distances when operated by a remote control. In this chapter, we compare and investigate the use of a common Media PC user interface operated by a mouse and by a remote control in a living room environment. This helps us understand and find interaction methods that work well with large displays, especially when the display is out of reach. The results from this investigation are also used to formulate recommendations for developing user interfaces that play to the advantages of both devices. In the process, inadequacies in the current interaction method are revealed. The recommendations can serve as guidelines that will help us design a more natural interaction method for large displays.

3.1 User Interface for the Media PC

Various studies [84, 91, 116, 126] have shown that the computer mouse is a simple-to-use and efficient input method for the desktop graphical user interface (GUI). The remote control, on the other hand, is most commonly used with consumer electronic devices that require simple input, such as televisions. As computer technology advances and its multimedia capabilities increase, we see an increasing number of computers making their way from the desk into the living room. The current desktop GUI loses effectiveness when viewed from a distance of 3 metres.


Manufacturers have attempted to address this problem in two ways. The dedicated device approach focuses on making specialty devices specifically for handling media. Despite having the full functionality of a desktop computer inside, the manufacturer chooses to specialize the device for a small number of media-related functions. For instance, the TiVo personal video recorder (PVR) [144] is essentially a standard IBM-PC compatible computer running the Linux operating system, but with a custom user interface designed for use at 3 metres via a remote control. The Media PC approach keeps the desktop user interface for most tasks but adds specific software, with a dedicated user interface, to handle media-related tasks. In addition, the computer is provided with a remote control which can operate both media applications and standard productivity applications. For example, Microsoft's Windows XP Media Center Edition comes with a custom-designed remote control together with a media software application. Unlike dedicated devices, Media PCs can be used in both the traditional desktop setting and the living room setting. This presents a unique challenge for the interaction design team, as they must come up with a design that works in the mouse-operated desktop setting and in the remote-operated 3-metre UI setting. We informally observed that some tasks were better suited to the mouse, while others were better suited to the remote control. We were also interested in seeing whether previous researchers had compared the two devices operating the same user interface, and whether there were design guidelines for which operations favour which device.

3.2 Related Work

3.2.1 Input devices

MacKenzie and Jusoh [84] compared the performance of a standard traditional mouse and two remote pointing devices (i.e. remote controls that have an additional mouse function controlling an on-screen cursor in the same way as the traditional mouse). They argued that today's remote controls are not sufficient for the home entertainment systems of the near future. Their study found that both remote pointing devices, the GyroPoint [59] and the RemotePoint [67], are slower and have lower throughput than the traditional mouse, as well as receiving lower subjective ratings from participants. They concluded that these new remote pointing devices "need further

development to support facile interaction". They focused only on the same pointing interaction technique (i.e. moving the mouse cursor) in different form factors with different ergonomics (a mouse that sits on the table versus one that users hold in their hand), rather than comparing different interaction techniques. They also conducted the experiment only in a controlled environment and not on an actual production GUI. Microsoft recently performed an internal usability test, also comparing the Gyration remote mouse [59], but against a traditional remote. They found that the traditional remote was significantly faster in task completion and produced significantly fewer errors. Shneiderman [126] summarised that "pointing devices are faster than keyboard controls such as cursor movement keys but this result depends on the task". Card et al. [22] reported that cursor keys were faster than the mouse for short distances but slower for longer distances. However, the keyboard is also a device that must be placed on a surface, restricting how the user can handle it. Preece [116] suggested that the mouse allows users to drag objects around, in addition to pointing, which is hard to achieve with cursor controls. Enns and MacKenzie [50] developed a novel remote control device with an additional touchpad attached to the front, allowing gestural input. Their solution was to try to move the interface onto the TV and off the remote. The authors outlined the advantages of this innovation, including simplification of the device, flexible interaction styles and reduced manufacturing costs. However, no user studies were reported on how it compared to previous remote controls. Other devices have also been studied. Douglas, Kirkpatrick and MacKenzie [42] compared a finger-controlled isometric joystick and a touchpad, both of which are widely used in laptops. Results showed that the joystick provides significantly higher throughput and a significantly lower error rate. Qualitatively, they found no overall difference, except that the joystick required more force. However, Douglas and Mithal [43] found that the mouse was faster than the joystick, and later found that random variations in the movement microstructure of the isometric joystick made it hard to control [91]. Westerink and van den Reek [158] suggested that interaction techniques for entertainment-oriented environments were adopted from task-oriented environments. However, in entertainment-oriented environments, users' subjective appreciation of pointing devices is more important than objective performance measures such as efficiency. They observed that "mouse-like-pointing devices" depended greatly on Fitts' Law, while other pointing devices, such as the joystick, did not, due to cursor constraints. They also found that the mouse was more appreciated than the joystick. Most of these findings have concentrated on testing pointing devices in a controlled environment, and we have yet to find any literature that directly compares the mouse and the remote control in the same user interface in a real-world environment.

3.2.2 User Interface and Interaction Design

Many case studies have been done on UIs for interactive TV. Hedman and Lenman [64] conducted a user study comparing a media-rich interface (rich in graphics, animations and audio) with a simple interface (text and still images) for interactive television using remote controls. They found that the media-rich interface was more engaging, while the simple interface was easier to navigate and understand. They recommended that remote controls be mapped logically and that it would be risky to develop non-standard interfaces. Eronen and Vuorimaa [51] suggested that "thumb navigation" should be supported when using a remote control, since more than three quarters of TV viewers hold the remote control in one hand and press the buttons with their thumb. They conducted a case study with two new UI prototypes for digital TV, one aimed at simplicity and the other at efficiency. They found that the former felt faster and easier to use, while the latter was efficient to use but hard to learn; however, task completion times were the same. They observed that users were more interested in browsing alternatives and selecting one than in finding specific information. They also stressed the importance of navigation as part of both the functionality and the content, and that these should not be designed independently. A Media PC UI designed to accommodate both devices has not been studied before. We therefore conducted a usability experiment to investigate the use of these two pointing devices in such a design. Being the only UI that supports both devices, Microsoft Windows XP Media Center Edition represents the state-of-the-art design and was used in our experiment. The goal of this experiment was not only to compare the two devices, but also to investigate the differences in terms of interaction design.

3.3 Hypotheses

In general, remote controls are specialized input devices designed for operating a single specific user interface. This fact alone should make them at least potentially superior, in terms of efficiency and suitability, to a general-purpose input device. Typically, remote controls do not require accurate pointing. The targets (e.g. selectable items in a DVD menu) are fixed and the interaction path is constrained. Users can only select among these targets; they cannot, for example, choose an area outside them. Users may not need to concentrate as hard to select a target compared with the mouse. We also suspect that users should find the remote easier to use overall (although their satisfaction might be limited by unfamiliarity with the remote on first use). The mouse, on the other hand, may be less accurate, but may be faster due to its simple and direct operation, particularly in tasks with many targets: users need only move the mouse once, compared with the multiple presses that remote users would need to perform. Fatigue should be a factor that affects user preference. The remote is usually held in the hand in mid-air, while the mouse usually sits on a flat surface. However, the mouse restricts the user to a certain position (i.e. near the table), while remote users are free to move around. We are not certain which one users would prefer. With these hypotheses in mind, we conducted our study.

3.4 Usability Study

Our study tested the usability of the mouse and the remote control on a Media PC, focusing on photo browsing. It was set up so that subjects would feel as though they were at home in their living room, comfortably sitting on a couch in front of the Media PC. Subjects were asked to perform a series of tasks as if they were at home: turning on some music and browsing their photos.

3.5 Apparatus

The input devices used in this study were a Microsoft Media Center remote control,

a Microsoft Wireless Optical Mouse and a Microsoft Wireless MultiMedia Keyboard. They were connected to a Dell Optiplex GX260 Pentium 4 2.8 GHz PC running the Chinese version of Microsoft Windows XP Media Center Edition 2004 (MC). The display used was a 42" LG Plasma Display Panel (MT-42PZ12) running at a resolution of 800x600. Subjects were approximately 2.5 metres (8.2 feet) away from the display while sitting.

Figure 3.1: A user using the mouse in our user study

Every subject used the same photo collection, which consisted of several hundred photos. They were taken by a member of our team during a family vacation, and were representative of the type of photo collection a user might have. The user study was conducted in a living room setting with couches, a tea table, a rug, a vase and an entertainment system which included a plasma TV, DVD player, Xbox, and a set of 5.1 speakers.

3.6 Participants

Eighteen volunteers (13 female, 5 male) were recruited for the study. Participants' ages ranged from 22 to 31 years, with an average of 25.67. In order to control for possible bias or expertise effects, all participants were working or studying in a non-technical field and had average computer experience. None of the participants had previously seen or used Microsoft Windows XP Media Center Edition or any other Media PC UI. All subjects were fluent in the Chinese language used in Windows XP Media Center. Some of the participants used a computer (and therefore the keyboard and mouse) a few times a week, while most used one daily. All of them had used some kind of remote before, whether a TV remote or a DVD player remote. Experience with using a remote control to navigate a DVD menu structure ranged from a few times a week to never. Participants were provided free soft drinks and were given a small logo item as a gratuity. Participants were observed by two observers in addition to the experimenter, and all sessions were videotaped.

3.7 Procedure

Participants started by filling out an initial questionnaire indicating their experience with the remote and with DVD menu operation. They were then given a learning phase, using the remote and exploring parts of the Media PC interface by themselves. Although many features of the Media PC were not used in the actual experiment, we hoped this would make subjects more comfortable. A second Media PC outside the experimental room was used for this phase of the experiment. Participants were then asked to complete a list of simple tasks (collectively called Task 1) that were similar to the tasks they would be asked to perform in the actual user study. Help was available if needed, and the subject had no time constraints. The purpose of Task 1 was to reduce the learning effect of the remote control and the user interface. Participants were then told to complete tasks 2 to 7 with either the mouse or the remote. The moderator read out each task in turn, and participants had to complete each task before proceeding to the next. Finally, they were instructed to complete a device assessment questionnaire. After completing this sequence, subjects were asked to repeat it with the other device. Two different sets of photos were used between the trials. At the end of the study, users were asked to complete a device comparison questionnaire and were asked a series of questions in an interview.

3.8 Design

The study was a within-subjects study, so each of the eighteen subjects had to complete all tasks (2 to 7) with each device. The order of devices was counter-balanced, with approximately half of the males and half of the females starting with the mouse and the other half with the remote, to check for ordering effects. The ordering of the tasks was not randomized. Any possible effect of not

doing so was assumed to be negligible, since we were not comparing between the tasks and each task required a different skill. A five-minute deadline was imposed on each task; those who failed to complete a task were excluded from the results. Each session generally lasted around an hour, and in some cases one and a half hours.

3.9 Results

Although we timed the tasks during the experiment, we had to use our video logs to determine the actual task times, as subjects would often ask questions or make comments that were not part of the study. All times discussed here are from the video logs. The average overall task time for the mouse (mean = 134 seconds) was shorter than for the remote (mean = 152 seconds), but the difference was not statistically significant at the 0.05 level, t(14) = 1.84, p = 0.09 (Figure 3.2).

[Bar chart omitted: average task time in seconds for tasks 3 to 6, for the remote and the mouse.]

Figure 3.2: The effect of devices on task times for each task

In tasks 3 and 5, the differences between the devices were not significant, but for tasks 4 and 6 they were. In task 4, the remote (mean = 24 s) was significantly faster than the mouse (mean = 36 s), p = 0.02. In task 6, the mouse (mean = 21 s) was significantly faster than the remote (mean = 52 s), p = 0.04. In addition, we tested statistically and found that gender (p = 0.77 for remote, p = 0.99 for mouse), experience with the remote (p = 0.85 for remote, p = 0.31 for mouse) and presentation order (p = 0.20 for remote, p = 0.57 for mouse) were not factors in our study. Upon further investigation, we found a statistically significant correlation between the task times when using the mouse and the remote, r2 = 0.609, t(13) = 4.50, p < 0.01 (Figure 3.3). In general, those who were slower on one device were also slower on the other, and those who were faster were faster on both. Ignoring comfort and convenience, users were able to use either device about as well as the other.

[Scatter plot omitted: total task time for the remote versus total task time for the mouse for each subject, with linear fit y = 0.6416x + 66.725, R2 = 0.609.]

Figure 3.3: Total task time for remote vs total task time for mouse
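For readers who wish to reproduce this style of analysis, the sketch below shows how a paired (within-subjects) t-test, a correlation and a linear fit can be computed with SciPy. The arrays are placeholder values, not the data collected in this study.

import numpy as np
from scipy import stats

# Hypothetical per-subject total task times in seconds (NOT the study's data).
mouse_times  = np.array([120, 150, 110, 140, 160, 130, 125, 145])
remote_times = np.array([135, 170, 125, 150, 175, 150, 140, 160])

t_stat, p_value = stats.ttest_rel(remote_times, mouse_times)   # paired t-test across subjects
r, r_p = stats.pearsonr(mouse_times, remote_times)             # correlation between the two devices
slope, intercept, r_value, _, _ = stats.linregress(mouse_times, remote_times)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
print(f"r^2 = {r_value**2:.3f}, fit: remote = {slope:.3f} * mouse + {intercept:.1f}")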

3.10 Participants' comments1

Before starting task 2, we asked users what their first impression was. Around half of the participants said that it was "easy to control", "very convenient to use", "it's so cool!" or "very intuitive". Five said it was "OK", that "it looks complicated at first but once I figured it out, it was very easy", or that "some aspects differ from the traditional mouse". Two said it was "difficult to use" and that they were "not familiar with the menu". The general consensus was that the UI required some initial learning, but that the remote control itself was very easy to use.

Task 2 involved turning on a specified piece of music. This is something that real users might do, and it also helped them feel comfortable. This task was not timed.

1 Note that the experiment was carried out in Chinese. Quotes were translated to English for the purpose of reporting. Screenshots were captured using an English version of Media Center 2004 where possible, set up in the exact same way as in the user study.

When using the remote, a lot of them commented on the separation of the two "areas" - the menu on the left and the list of music on the right (Figure 3.4).

Figure 3.4: Screenshot of task 2 - music UI

They said that it was very "annoying" since they "couldn't figure it out at first". Most initially tried to use the "down" button on the remote. They also mentioned that "pressing one at a time is slow" since it requires "too many clicks", especially if there were more music items. They further commented that the remote is "not as good as the mouse" since "the mouse is more direct". But others insisted that it was "better than the mouse" and that the remote was "fairly easy" to use. When using the mouse, a major issue was that subjects didn't know how to go back. On the remote there is a dedicated "back" button, but no such button exists on the mouse. The user interface actually has a back button (a relatively small button near the top-left corner, shown in Figure 3.4), but for most users it was not discoverable without instruction. A few users didn't like the mouse either and said that there were "too many steps involved" and it was "not as easy as pressing the play button like on a DVD player". Most said it was "easy" because they were already used to using the mouse on their PC. One subject summed it up by saying "for computer users, the mouse is faster but for family users, the remote is more convenient".


Figure 3.5: Screenshot of task 3 - photo browsing UI

Task 3 was a simple task where users were asked to find a specified photo from a folder of nine photos and enlarge it to full screen. We chose nine photos because it was the maximum number that MC could display at any one time without scrolling (Figure 3.5). While using the remote, some found that it was “very simple” and “convenient” with “no problem”; others commented that it was “troublesome”, with “too many steps involved”, and that they “can’t move to the photo directly”. When using the mouse, they felt that it was “quite convenient” because it was “much faster and more direct”. This initial reaction was supported by the statistics (mouse mean = 6 seconds, remote mean = 8 seconds), although the difference was not significant.

Task 4 extended task 3 by adding more photos to the directory so that it was necessary to scroll down before finding the target photo. On the first iteration, no matter which device they were using, half of the subjects didn’t know that there were more photos than those on display. The only indicator in the UI was some text and scroll arrow buttons in the lower-right corner (Figure 3.6). Some of the subjects discovered them eventually because they couldn’t find a matching photo. They commented that there is a need for “more indication” suggesting that the current indication is not obvious.


Figure 3.6: Screenshot of Task 4 - Scrolling.

Once the need to scroll was made apparent, users found the remote generally “quite convenient”. Subjects mostly used the scroll wheel on the mouse to move through the photos, although some used the up and down arrows at the bottom, and a few used a combination of both. Some commented that they could “scroll faster with the mouse” compared to the remote. This is attributed to the fact that the “down” button on the remote allows only “discrete” pushes, while the scroll wheel allows “continuous” rolling. Each push on the remote reveals one new row of photos, whereas one roll of the mouse wheel uncovers multiple rows of unseen photos. Overall, users found the mouse much “faster and more direct” and they felt “as if they were using a computer”. Despite this impression, users performed the task significantly faster with the remote. This difference between perception and reality could be explained by the fact that even though the remote only allowed one new row of photos to appear at a time, subjects could tell exactly which photos were new and which were not. This decreased the time needed to scan the display. Also, some users would scroll too much and overshoot their target, thus requiring them to spend additional time scrolling backwards to their original starting point.

Task 5 tested the efficiency of the interface for multiple directories. Ten folders were presented to the user and their objective was to find a photo located in one of the folders. They were told to search the folders sequentially. This task was designed to require a large amount of repetitive movements – moving into and out of folders – thereby uncovering problems and inconvenience with each device (Figure 3.7).


Figure 3.7: Screenshot of Task 5 – Folders navigation

Remote users generally found that the “buttons were too hard to click” and they “have to click harder or click more than once to get it working”. One user mentioned that “the default position for the thumb was at ‘OK’ but after reaching for the ‘back’ button, my finger was out of place, which made the next move difficult.” (Figure 3.8)

Figure 3.8: A photo of the MC remote focusing on the navigational buttons as well as the “back” button.

An interesting observation was that after scrolling to the target photo, subjects would click ‘OK’ directly, without first selecting the photo. This suggests a disparity between eye movement and hand-remote movement. The overall feeling was that the remote was “not easy” and “complex”. Mouse users complained about the small “back” button provided, which was “badly positioned”. Another user commented that the “remote is faster when doing things repeatedly and for short distances while mouse is better at travelling longer distances”. This observation is in general supported by the data. The mouse can move quickly, but is slow to aim. For long distances, the faster travel time makes up for the slower aiming time at the end of the journey, while for short distances the need to aim slows down the user. The remote excels at short navigation tasks, where the user can just press the same button sequence repeatedly. With longer navigation paths, the number and type of button presses required is less predictable, resulting in slower performance. The mean task time for this task was the same for both devices (65 seconds).
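This travel-versus-aiming trade-off is consistent with Fitts’ Law (referred to again in the observations below), which in its commonly used Shannon formulation predicts the movement time MT for acquiring a target of width W at distance D as:

MT = a + b log2(D/W + 1)

where a and b are empirically determined constants for a given device. Because movement time grows only logarithmically with distance, the mouse’s fast travel pays off mainly over long paths, while short, repeated selections are dominated by the aiming component. This is offered here only as a standard interpretation, not as an analysis carried out in the study.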

Figure 3.9: Screenshot of Task 6 – UI widgets

Task 6 required adjusting the slideshow timing in settings and playing the slide show. This tested the interaction with standard UI widgets.

The major problem that surfaced was changing the time with the remote. All users managed to navigate to the “slideshow timing” item with no problem, but none knew immediately that they had to navigate to the “+” or “-” on the UI and press “OK” on the remote. They tried pressing the “up” button on the remote instead of “OK”. Most were very confused at first but discovered the usage without help in the end. One subject tried to use the number keypad on the remote without success. The general feeling was that “there were too many steps involved”, “slower than the mouse”, “it was not convenient”, and one said “I wish I had a mouse”. A few mouse users tried clicking on the actual number display, expecting that they could manually input the desired number or that a drop-down menu would appear. They found the mouse “more comfortable, more direct, faster and easier than the remote”. Indeed, our statistics show that the mouse (mean = 21 seconds) is significantly faster than the remote (mean = 52 seconds).

Task 7 was designed to test for fatigue and was therefore not timed. Participants were required to find a photo by browsing sequentially in full screen, using the “right” navigational button on the remote or on the keyboard. Our aim for this task was to test for fatigue in the extreme case. For both the keyboard and the remote, it was observed that almost all users pressed one button at a time in the initial phase. Only a few pressed continuously until the required photo was found. “Because it was going so slow, it was only intuitive to click faster and harder, even though the photos don’t show up faster”, most users commented. Some reduced their rate of clicking to align with the rate of transition, thereby reducing unnecessary clicks. They agreed that their eyes were tired after the task and their concentration decreased towards the end. They also grew very impatient. Keyboard users generally found that there was no fatigue and the buttons were “easier to press than the remote”. Remote users agreed that it was “hard to push”. Some subjects suggested it was because the remote was “not responsive enough”, which gave them the impression that they needed to push harder. However, they found that it was more “comfortable and relaxing” using the remote. One commented that they didn’t feel tired since they were used to pressing buttons while typing text messages on their mobile phones, where buttons are much smaller.

3.11 General Observations

Remote:
- It was observed that more than half of the subjects looked at the remote occasionally, especially when they were changing buttons. However, this unfamiliarity is expected to diminish with practice.
- Users liked to hold the remote in mid-air while performing the tasks, but retreated and rested it on the lap once they had completed the tasks.
- Around half of the subjects used both hands to hold the remote, usually one for support and one for clicking.
- The remote requires many more button presses than the mouse, but the mouse requires physical translation of the device.

Mouse:
- It was observed that mouse users had to lean forward almost all the time, but this was not the case for the remote.
- Half of the users double-clicked on folders, as if on a computer GUI, but only one click was necessary.

- Most users used the mouse as their eye: their mouse action seemed to follow their eye. Some even used the scroll wheel to do the same. This behaviour has been described in [24]. In addition, their mouse movements were unpredictable and quite erratic rather than a straight line of direct movement.
- When users were clicking the “back” button, we noticed that they generally increased their speed in the initial phase of the journey and then slowed down as they approached the back button. Some overshot the target. This is in accordance with expectation based on Fitts’ Law.
- On many occasions, users felt the need to readjust their mouse position by lifting the mouse or by moving the mouse pad.

User Interface Interaction:
- It was interesting to observe that users always wanted to see if there were more photos in a folder, either by scrolling with the mouse wheel or by pressing the “down” button on the remote, even after being told that an indicator exists. This suggests that most users have become habituated to searching for data that are not immediately visible on the screen. It may also be attributed to unfamiliarity with the indicator.
- There was a lack of indication of which folder users had come out from after going back up one level. Some users kept visiting folders that they had been to before. This suggests that “folders” might not be a good metaphor for storing photos.

3.11.1 Questionnaires The first questionnaire used in this study was a slightly modified version adapted from [42] and is shown in Table 3.1. The results are presented in Figure 3.10. After calculating the means and performing statistical analysis, we found that only questions 1 and 2 showed statistically significant differences (p = 0.00 and p = 0.03 respectively). This supports users’ comments that excessive force was needed to push the remote buttons. On the other hand, the mouse was found to be smoother during operation. Together with the previous observations, this again provides evidence that the mouse was more direct than the remote.
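As an illustration of how such per-question comparisons could be run (the statistical software actually used is not stated, and the file and column names below are assumptions), each of the thirteen items can be tested with a paired test across participants:

import pandas as pd
from scipy.stats import ttest_rel

# Hypothetical layout: one row per participant, columns remote_q1..remote_q13
# and mouse_q1..mouse_q13 holding the 1-5 ratings.
ratings = pd.read_csv("device_assessment.csv")

for q in range(1, 14):
    t, p = ttest_rel(ratings[f"remote_q{q}"], ratings[f"mouse_q{q}"])
    flag = "*" if p < 0.05 else ""
    print(f"Q{q:2d}: t = {t:5.2f}, p = {p:.3f} {flag}")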

Device Assessment Questionnaire

1. The force required for pushing the buttons was: Too low 1,2,3,4,5 Too high
2. Smoothness during operation was: Very smooth 1,2,3,4,5 Very rough
3. The mental effort required for operation was: Too low 1,2,3,4,5 Too high
4. The physical effort required for operation was: Too low 1,2,3,4,5 Too high
5. Accurate pointing was: Easy 1,2,3,4,5 Difficult
6. Operation speed was: Too slow 1,2,3,4,5 Too fast
7. Finger fatigue: None 1,2,3,4,5 Very high
8. Wrist fatigue: None 1,2,3,4,5 Very high
9. Arm fatigue: None 1,2,3,4,5 Very high
10. Shoulder fatigue: None 1,2,3,4,5 Very high
11. Neck fatigue: None 1,2,3,4,5 Very high
12. General comfort: Very comfortable 1,2,3,4,5 Very uncomfortable
13. Overall, the input device was: Very easy to use 1,2,3,4,5 Very difficult to use

Table 3.1: Device Assessment Questionnaire


Figure 3.10: Average ratings in the device assessment questionnaire for remote and mouse


We also conducted a second questionnaire which asked participants to compare the two devices directly on different attributes. The questions are presented in Table 3.2 and the results in Figure 3.11. Subjects felt that more control could be exerted over the mouse and that it was comparatively faster and more accurate. However, the remote was reported to be more comfortable. Interestingly, when asked which device was preferred overall, every participant expressed a preference (i.e. none chose the neutral option). We can conclude that although subjects agreed on preferring the mouse for its control, speed and accuracy, and preferred the remote for its comfort, there was no consensus on which device they preferred overall.

Device Comparison Questionnaire
1 - strongly prefer remote control, 2 - prefer remote control, 3 - neutral, 4 - prefer mouse, 5 - strongly prefer mouse
1. Which one did you think you have more control over?
2. Which one did you think was faster?
3. Which one did you think was more accurate?
4. Which one did you think was more comfortable to use?
5. Which one did you prefer overall?

Table 3.2: Device Comparison Questionnaire

Figure 3.11: Average rating in the device comparison questionnaire (question 1 = 3.4, question 2 = 3.9, question 3 = 3.8, question 4 = 2.7, question 5 = 3.0).

3.12 Discussion The mouse is a general purpose input device. Familiarity with its functionality from one application allows the user to be competent with other PC-based applications. Mouse-based applications are typically designed for the user to position their mouse cursor on a command target and click to execute the command. In the event the command requires an object, the user will click on the object target to select it and then click on

the command target. When confronted with an unfamiliar application, the user will move the mouse around the screen looking for potential targets that might match their intended task. The remote control, on the other hand, is typically designed for a single user interface. Each action in the user interface has a dedicated button on the remote. For commands which require an object, the user can navigate among targets using the left, right, up and down buttons. Each press selects the next object. Once the proper object is selected, the user can then press the appropriate action button on the remote. The remote almost necessitates a trial-and-error approach to learning a new user interface and how it responds to that particular remote. Many design variables in remote control interface design are intended to be user friendly but can often result in frustration. In a mouse-driven application, only the screen interface determines usability, whereas with remotes both the screen and the remote interfaces need to be learned in some way by new users. One of the primary sources of user confusion when switching between these devices is the difference in selection paradigms. Specifically, remote control UIs tend to always have a selection, while mouse UIs support states with no selection. Another source of confusion comes from the discoverability of possible user interface actions. On a remote control, the user can quickly see all of the possible actions, whereas the mouse UI must be explored by the user in order to discover command targets. In order to save screen real estate, mouse UIs typically reconfigure themselves to provide just the commands appropriate for the task at hand. This can make the mouse UI more efficient, but results in confusion when users switch between the input devices. The question for interaction designers remains whether they should design screen interfaces to be used by both devices or have dedicated screens optimized for each particular control device. This study attempts to address some of the difficulties in having one screen interface for two control devices.

3.13 Recommendations Based on these results and observations, we have come up with the following recommendations for those designing UIs for both mouse and remote control operation.


3.13.1 Unified Selection Many problems were the result of the differences in selection modality between the two devices. Given the poor performance of attempts to control a mouse cursor using directional buttons [67, 84], we recommend making the mouse work more like a remote. The user interface targets should be large and adjacent to minimize the need for precise targeting. The targets should have obvious highlighting to indicate that they are targeted when the mouse is over them, and this highlighting should be clearly distinct from the selection indication.

3.13.2 Keyboard/Remote Equivalence Users draw a natural correspondence between the buttons on the remote control and the buttons on the keyboard. Therefore, it is crucial that those buttons that are shared between these devices have the same effect in the user interface. For example, most remote controls contain a numeric keypad, and yet we see many instances where the user cannot input numbers by using the remote, even though they can enter numbers using the keyboard. Another example is that the arrow keys on the keyboard rarely have the same functionality as the equivalent buttons on the remote. (In MC, the “more info” button on the remote does not have any equivalence on either the keyboard or the mouse.)

3.13.3 Action Target / Remote Correspondence Often users who are familiar with the user interface for one device have difficulty transferring that knowledge to operate the other device. To facilitate this transfer, we recommend making the action targets in the GUI and the action buttons on the remote correspond as closely as possible, in both graphical appearance and relative spatial layout. We also recommend that the action targets provide the same visual feedback when the action is invoked, whether invoked from the click of a mouse or the press of a remote button. (For example, the MC remote has a grey button labelled ‘back’, while the UI on screen has a green button labelled with a white arrow pointing to the left.) This will not only help the user learn the operation of one device when using the other device, but can also help instruct the user of a remote which of the dedicated remote buttons are appropriate at the given time.

3.13.4 Software solutions Novel devices combining both mouse and remote have not proven to be more efficient, as mentioned earlier with the Gyration mouse. The touchpad-based remote control was a combined software and hardware approach to solving the remote’s lack of scalability [50]. It would therefore be appropriate to further research software approaches to reduce the interaction inconsistency between the mouse and the remote (possibly in conjunction with other novel hardware approaches). Initial research has been done on using a personal digital assistant (PDA) that operates together with interactive television [121]. Myers et al. have shown that it is possible to capture the whole or parts of the screen and transfer it to a PDA; this technique shows great potential for application to media PCs [99]. Constrained-movement techniques used with the remote UI have shown promising results. As observed by Jul [72], compared with unconstrained movement, constrained movement in a 2D zooming environment showed improvements both quantitatively (decreased task time) and qualitatively (decreased spatial disorientation).

3.13.5 User Experience As reported in the questionnaires, although users described the mouse as fast and direct, it was not the overall preferred device. The restrictions associated with the mouse were the main contributing factor. On the other hand, the comfort and freedom provided by the remote were considered to outweigh its complexity and the proficiency it required. This suggests that users generally prefer devices that are comfortable and at the same time provide an enjoyable experience. A study has shown that users may prefer a method with higher subjective satisfaction even when it has lower quantitative performance [44]. As illustrated, future user interface and interaction design should address user satisfaction with at least the same priority as quantitative performance, if not higher. It is our recommendation that designers treat an enjoyable user experience as one of their primary objectives.

3.13.6 Natural Interfaces Users’ performance was affected by their degree of familiarity with the interface. We observed that subjects required time to learn the interface with the mouse and to learn the interface with the remote control. In addition, if the behaviours or placement of elements in a learnt interface are changed, users are easily confused and disoriented. This is due to differences between the way people perform tasks in the real world and the way they interact with these input devices. We recommend that future interaction methods be as natural to the user as possible, closer to interacting with real world objects. This may be achieved by studying the behaviour of humans performing day-to-day tasks.

3.14 Summary We have presented a user study comparing the performance of a remote control and a mouse in operating a Media PC user interface that is intended to be operated by either device, in front of a large display and from a distance. We observed that users found the mouse to be more direct but restrictive, while the remote was more comfortable and enjoyable but slower to learn. We also observed a number of problems and inconsistencies with the mouse, the remote and their interaction with the Media PC UI. We found no difference in the overall task time, and no overall preference for either the mouse or the remote control. From specific user comments, we observed that users preferred the device that closely matched their personal browsing and interaction style. We have also seen that, depending on the task, one device may be better suited to accomplishing it. However, the most surprising result is that users were able to use either control device to accomplish the tasks in a similar amount of time, with the mouse being slightly faster. Although the results show, based on time and user feedback, that the mouse was faster and offered better control, users still felt that the comfort and convenience of the less precise device made it equal to the mouse in preference. Future research should take note of the tolerance level that users may have in sacrificing comfort for effectiveness. Finding this threshold will be key to shaping users’ experience with new entertainment devices as richer and more complex media applications migrate towards the living room.

We see from this study that it is possible to have one screen interface for both mouse and remote that users can operate with their preferred control device. Interface designers can design a screen interface usable by both control devices. However, further research could be conducted to determine whether a screen interface optimized for a particular control device would improve effectiveness or user experience. Findings from this study suggest that such gains may be too minor to warrant the time and resources needed to develop a specialized screen for each control device. In addition, we have presented recommendations for designing such multi-device user interfaces. These can serve as guidelines for the design of future interactive methods. Of particular importance to this thesis is the need to provide a user experience that is natural for the user. In the next chapter, we will investigate the natural strategies adopted by people to point at real world objects. We may then be able to design an interactive system that is natural to the user and one that they will enjoy using.


Chapter 4 Pointing Strategies

From the previous chapter, it was observed that future interaction should take into account user preference as well as allow natural interaction. It follows that if we are to design any kind of interactive system, it is necessary to study the natural behaviours of its intended users. This chapter investigates the use of natural pointing for interacting with the computer. Much effort has been made on making interaction with the computer more natural and more intuitive for the user. We will review the HCI literature that has adopted hand pointing as one of the main approaches to selection. Although many interactive systems have focused on improving the detection of the users’ pointing direction, few have analyzed the kinds of pointing strategy that are natural to users, or the accuracy that users themselves can achieve with these strategies. Pointing in the natural environment will be studied in an experiment to determine the most common and natural strategies used. A second experiment will study the accuracy that can be achieved with these strategies. In order to understand the differences between strategies and the mechanism of hand pointing with respect to human-computer interaction, we introduce several models that underpin the pointing strategies observed in our experiments, as well as those used in many previous interactive systems. We recommend strategies that will help future interaction designers make use of the natural pointing behaviours of humans while at the same time allowing users to point accurately.

4.1 Background and Related Work Perhaps the most direct form of selection at a distance is being able to point at something with your hand without any restrictions. The current trend is to make pointing as easy as pointing to real world objects. Taylor and McCloskey identified that “to indicate accurately the position of an object that is beyond arm’s reach we

commonly point towards it with an extended arm and index finger” [142]. The pointing gesture belongs to a class of gestures called “deictics” and is defined by McNeill as “gestures pointing to something or somebody either concrete or abstract” [88]. Studies have shown that the pointing gesture is often used to indicate a direction as well as to identify nearby objects [88]. This is an interesting concept because the pointing gesture is natural, intuitive, and easy to use and understand. Indeed, children can use body postures and gestures such as reaching and pointing by about nine months [83]. Here, we will review some of the literature, approaches to pointing and the varied implementations where hand pointing has been used for interactive systems.

The pointing finger has been used in many existing systems. Using active contours, a pair of uncalibrated stereo cameras has been used to track and detect the position and direction of the index finger for human-robot interaction within a 40cm workspace [32]. In the Ishindenshin system [81], a small video camera is placed near the center of a large vertical display, directly in front of the user. The user is able to keep eye-contact and use the pointing gesture to interact with another user in a video conference. The fingertip is detected by the camera and its location in x and y coordinates is determined.

Hand tracking is another popular method to support natural interaction. While the above works focus on vertical screens, Oka, Sato & Koike [107] used hand and fingertip tracking on an augmented desk interface (horizontal). An infrared camera is used to detect areas that are close to human body temperature. In [129], a motion recognition camera (EyeToyTM camera for PlayStation®) is placed on top of a large display. A mirror image of the camera is presented on the display, as well as additional game play information. A selection is made when the user places their hands at specific locations so that the on-screen image of the hands coincides spatially with on-screen buttons. Note that in [81, 107, 129], the fingertip or hand can only be tracked in 2D; depth is not registered.

The MobiVR system [117] captures and detects a pointing finger behind a handheld, non-see-through binocular near-eye microdisplay (taken from a Head Mounted Display) attached to a reconfigured SmartNav infrared camera aiming forward. By attaching an infrared reflector ring to the finger, the user can perform a pointing gesture in front of the device to control a mouse cursor. This makes use of the eye-fingertip strategy, where the pointing direction is extracted from a line starting at the eye and continuing to the fingertip. An interesting point here is that the fingertip is behind the near-eye display.

Various other researchers have investigated the use of the head-hand line (similar to the eye-fingertip strategy but detecting the whole head and hand rather than specifically the eye or fingertip) for interacting with their systems [33, 102] (illustrated in Figure 4.1). All of these systems used a set of stereo cameras to track the user’s head-hand line to estimate the pointing direction in 3D at a distance of around 2 meters. However, Nickel and Stiefelhagen [102] also found that adding head orientation detection increased their gesture detection and precision rate. Comparing three approaches to estimating pointing direction, they found that the head-hand line method was the most reliable in estimating the pointing direction (90%).
The others were forearm direction (73%) and head orientation (75%). However, their result is based on whether 8 targets in the environment were correctly identified, rather than on accurately measuring the error relative to a specific target. The forearm orientation and head-hand line were extracted through stereo image processing, while the head orientation was measured by means of an attached sensor.

Figure 4.1: An example of using the head-hand line to deduce pointing direction [33].
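To make the head-hand line approach concrete, the sketch below shows how a pointing direction estimated from 3D head and hand positions can be intersected with a planar display to obtain the indicated on-screen point. This is an illustration only, not the implementation used in the systems cited above; the plane parameters and the example coordinates are assumptions.

import numpy as np

def head_hand_target(head, hand, plane_point, plane_normal):
    """Intersect the head-hand ray with a planar display.

    head, hand   : 3D positions (e.g. from stereo tracking), in metres
    plane_point  : any point on the display plane
    plane_normal : unit normal of the display plane
    Returns the 3D intersection point, or None if the ray is parallel
    to the plane or points away from it.
    """
    head = np.asarray(head, dtype=float)
    direction = np.asarray(hand, dtype=float) - head        # line of pointing
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < 1e-9:
        return None                                          # parallel to the display
    t = float(np.dot(plane_normal, np.asarray(plane_point) - head)) / denom
    if t < 0:
        return None                                          # pointing away from the display
    return head + t * direction

# Hypothetical example: display lies in the plane z = 0, user about 2 m away.
print(head_hand_target(head=[0.0, 1.7, 2.0], hand=[0.1, 1.4, 1.5],
                       plane_point=[0.0, 0.0, 0.0], plane_normal=[0.0, 0.0, 1.0]))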

There is evidence to suggest that the natural pointing gesture may lie somewhere between the head-hand line and the arm pointing strategy. Mase [86] used a pair of stereo cameras to determine the fingertip position. In order to extract a virtual line between the finger and a target, the location of the “Virtual Project Origin (VPO)” must be calculated. The VPO varies according to the pointing style of each individual. They made use of a pre-session calibration and

experimental results suggested that “the VPO is mostly distributed along the line of the eye on the pointing hand’s side and the shoulder”. In an experiment to investigate the effect of vision and proprioception in pointing (“the ability to sense the position, location, orientation, and movement of the body and its parts” [143]), it has been reported that users tend to point (with an extended arm) to a target about a meter away by placing their fingertip just lateral to the eye-target line [142]. However, a line extended from the direction of the pointing arm would miss the target. They explained that this may be used to avoid the fingertip occluding the target or, alternatively, may be influenced by proprioception (which usually results in using the whole arm as a pointer). Their result suggests that the eye-target line is a good predictor of the pointing direction. It is interesting to note that Olympic pistol marksmen and archers aim at their targets by extending their arm and standing side-on to the target, so that the line running from the eye to the target coincides with the line running from the shoulder to the target [142] (illustrated in Figure 4.2).

Figure 4.2: An example of an archer using the eye-fingertip method [18].

It is observed that different systems require different strategies for targeting. There is currently a gap in the literature in systematically describing how and why natural interaction methods work and in classifying them based on their accuracy, naturalness and how well they capture the exact intention of the user. The literature mentioned above has mainly focused on extracting or detecting different pointing gestures through image processing or with different sensors. However, no pointing strategies have been recommended that allow users to point both naturally and accurately (so as to best represent their aim). In this context, we would like to investigate

how people point naturally and better understand the mechanism and the strategy behind targeting.

4.2 Experiment: Style of Pointing Gesture An experiment was conducted to investigate how people point at objects on a vertical wall in a normal environment, and to identify the styles of gesture people adopt when pointing. We hypothesized that there are three main pointing strategies: (1) using a straight arm, (2) using only the forearm and extended fingertip, and (3) placing the fingertip between the eye and the target (line-up).

4.2.1 Participants Nineteen volunteers (10 female, 9 male) were recruited for the study, ten of whom were from the area of computer science. The average age was 23. All but one of the subjects were right-handed, and all had normal or corrected eyesight. Subjects were not told the main objective of the study at the beginning, only that they would be required to point at a target using their arm.

4.2.2 Procedure Each subject was asked to stand at three distances (1.2m, 3m and 5m) from a wall. A target object, 5mm by 5mm, was marked on the wall, 155cm from the ground (around eye level for most participants). In the first task, standing at one of the three distances from the target, subjects were asked to point at the target using their preferred hand. They were free to use any strategy they wanted and were not told specifically how they should point. This was repeated for the other two distances. The purpose of this task was to observe the kind of strategy used naturally by each subject to point at objects from a distance. At the end of the task, subjects were asked to explain the strategy they used. In the second task, subjects were specifically asked to point using only their forearm and fingertip, limiting the movement of their upper arm (Figure 4.3). The purpose of this task was to observe different variations of the forearm pointing method.

Figure 4.3: An example of forearm pointing, where the user only uses their forearm and fingertip to point, while keeping their upper-arm as close to their body as possible.

In the final task, subjects were specifically asked to use their whole arm to point while keeping it straight. The purpose of this task was to observe different variations of the full arm pointing method. They were then asked to complete a qualitative questionnaire. When subjects pointed at a distance of 1.2m, a webcam captured the subject’s torso, head, arm and the target object on the wall from a side view, at a resolution of 640 x 480. This allowed us to observe off-line the pointing style employed by each user. For 3m and 5m, however, the pointing style was observed only by the experimenter and inferred from the subjects’ comments. The study was a within-subject study, so each subject completed all tasks at each distance. The order of distances was counter-balanced, with approximately one third of subjects starting at each distance.

4.2.3 Observations & Discussions When asked to point freely at the target in task 1, the main observation was that almost all subjects used a full arm stretch to point regardless of distance, with only three subjects using the forearm. This may be because, during full arm pointing, subjects can see their arm in front of them, which provides a better visual approximation than the limited view of the arm available with just the forearm. It is also possible that subjects tend to move their hand and fingertip as close to the target as possible, as if to touch the target physically.


                               Task 1 (free style)    Task 2 (forearm only)    Task 3 (straight arm only)
Pointing Style / Distance (m)    1.2    3    5           1.2    3    5             1.2    3    5
Straight arm                      1     0    2            -     -    -              2     0    1
Line up with straight arm        16    18   17            -     -    -             17    19   18
Forearm                           2     0    0           12    11   10              -     -    -
Line up with forearm              0     1    0            7     8    9              -     -    -

Table 4.1: Number of subjects using each pointing style in each task


Figure 4.4: Pointing styles used for each task

It is interesting to note that, of those observed, 50% of subjects actually closed one of their eyes (presumably the non-dominant eye) to aim, while the others kept both eyes open. The majority thought that pointing using the full arm was intuitive, natural, fast and accurate, although around half of the subjects commented on the effort required to lift the arm and noted that continuous pointing might be tiring. However, as they were only asked to point for at most two seconds, fatigue was not an issue. Except as required in task 2, almost no subject used the forearm pointing method (Figure 4.3). This was surprising, considering that most subjects commented that the advantage of this method was the minimal effort required to point. However, the subjects felt that this method was unnatural and awkward, and high inaccuracy was the most cited drawback. This also helps to explain the surprising observation that subjects used the line-up method even when using only their forearm. As the forearm does not provide an accurate pointing strategy, subjects naturally try to line up their eye and fingertip with the target to increase accuracy. There is also a trend whereby the further away the subjects were from the target, the more likely they were to use the line-up method. In task 3, even though users were specifically asked to use full arm pointing, we observed two main strategies used to point at the target. Three subjects used their arm as a pointing instrument where the pointing direction is estimated from the shoulder to the fingertip (Figure 4.5), while most users tried to align their fingertip between the eye and the target (Figure 4.6), as hypothesized. This observation can also be seen in task 1.

Figure 4.5: Full arm pointing (1) – using the arm as a pointing instrument while keeping the arm as straight as possible.

Figure 4.6: Full arm pointing (2) – the fingertip is placed between the eye and the target.

In task 2, the normal forearm pointing method was where the user estimated their pointing direction from the direction of their forearm (Figure 4.3), while the line-up method was similar to that of straight arm pointing except that the subject’s arm was bent. A few subjects commented that the further they moved away from the target, the more likely they were to use a fully stretched arm. This is consistent with the observation that two subjects used the forearm in task 1 at 1.2m, though it cannot be treated as conclusive evidence. Overall, we observed a clear preference for the line-up method when pointing with a straight arm, while the two different forearm pointing methods were roughly equally used.

Figure 4.7: Mean overall user preference for each pointing style (forearm = 3.5, straight arm = 1.7, on a 1-5 scale).

From the questionnaire, forearm pointing received a mean score of 3.5 out of 5 on the overall preference question “I like using this method of pointing”, where 3.5 represents midway between “neutral” and “disagree”. On the other hand, the mean score for full arm pointing was 1.7 out of 5, representing midway between “strongly agree” and “agree”.

4.2.4 Limitations In hindsight, this experiment may have been affected by the size of the target. The target that subjects were asked to point at was a 5mm x 5mm dot. Subjects may have felt that they were required to point with high precision, within the target, and may therefore have restricted themselves to a method that produces high accuracy, namely the full arm stretch. However, we were still able to observe the most natural pointing technique that people adopt. If the target had been larger (for example 5cm x 5cm), it is possible that more users would have used the forearm pointing technique rather than the full arm stretch. This remains an open question.

4.2.5 Summary In agreement with our hypothesis, we observed three different methods of pointing:
1) Straight arm - where users estimate a direction from the shoulder to the fingertip.
2) Forearm - where users estimate a direction from the direction of their forearm.
3) Line up - where the pointing direction is given by the line from the eye to the fingertip.
The results from this study suggest that the line-up pointing method is the most natural way of pointing at targets. Overall, both quantitative and qualitative measures suggest that users prefer to use a full arm stretch to point at targets. Given its small scope, this experiment should only be treated as preliminary work on this subject. However, it may serve as a basis for further analysis and experimentation with different sizes of target. Having studied the styles of pointing that are natural to users, we observed informally that the forearm pointing method may be less accurate than both forms of full arm pointing. Further investigation is required to justify this.

4.3 Experiment: Pointing Accuracy Current research into vision-based interactive systems typically focuses on finding new ways of improving the detection, estimation and tracking of the pointing hand or finger, and on deducing the pointing direction [32, 33, 102]. For example, in a system that detects the user’s finger pointing direction, Cipolla & Hollinghurst [32] evaluated their system by analyzing their finger tracking uncertainty. They also analyzed their experimental accuracy using an artificial pointing device – a white cylindrical rod. However, these systems do not account for inaccuracies made by the user, and implicitly assume that the user is always pointing at the desired location. In the first experiment, we investigated the pointing strategies that people naturally used when asked to point at a target. The aim of this second experiment is to investigate which pointing strategy inherently provides the best accuracy from the user, since different pointing strategies may provide different accuracies. The three techniques used in this experiment were identified from the previous experiment as well as from the literature reviewed earlier:
1) Forearm – where only the forearm is used to point, with limited upper arm movement.
2) Straight-arm - where the whole arm is used to point, without bending.
3) Line-up – the fingertip is positioned collinearly between the eye and the target. Several names have been used in the literature for this method, such as “eye-fingertip” [66] or “head-hand line” [33]; we will call it the “line-up” method.

We hypothesized that the straight-arm method and the line-up method would be more accurate than forearm pointing because the whole arm is visible to the user. When the forearm method is used, users can only estimate where their arm is pointing, and as the forearm is shorter than the full arm, accuracy would be expected to decrease. Despite advances in computer vision, there is still no consistent method to segment the arm and extract its pointing direction. To minimize errors introduced by computer vision systems, we made use of a laser pointer. A laser pointer can be attached to the arm and gives a direct and simple resultant point, making it possible to quantitatively compare the different targeting

strategies. However, it would be difficult to fix a laser pointer at a specific place on the arm. Consequently, subjects were asked to hold the laser pointer in their hand in a way that best represents the extension of their arm direction, in a consistent manner across all three methods (Figure 4.8).


Figure 4.8: The laser pointer was used to represent the arm’s direction

Although physically pointing with the hand or arm is slightly different from holding the laser in the palm, we believe this difference is consistent enough that the strategies can still be compared relative to one another. Therefore, until a more suitable method is found, we felt it was appropriate in this case to use the laser pointer. In addition, users were not given feedback from the laser pointer with which to adjust their aim. The effect of distance on the accuracy of pointing – whether pointing deteriorates as the user moves away from the target – was also investigated in this experiment.

4.3.1 Participants Fifteen volunteers (3 female, 12 male) participated in this study, some of whom had participated in the previous study. All were working or studying in the area of computer science. Their ages ranged from 20 to 31 (average 24.9). One subject was left-handed, and 8 were right-eye dominant. All had normal or corrected eyesight, and only 6 had previous experience using a red laser pointer. Participants were provided chocolates and juice as gratuity during the breaks.

4.3.2 Apparatus A 5 x 5mm black target was marked on a wall at a height of 1.5 metres (Figure 4.9). For the purpose of calibration, the target was surrounded by four additional black dots forming a 30cm square. The laser pointer used was a pen (1 x 14.7cm) with a red laser pointer fitted at the back end (Figure 4.10). In order to detect the target and the red laser dot, a webcam (Logitech Quickcam Pro 4000) positioned 2 metres from the target was used, together with custom image processing software (Visual C++ and OpenCV). The lighting in the room was dimmed to optimize detection of the red laser dot.


Figure 4.9: A 5 x 5mm target at the center of a 30 x 30 cm square composed of 4 dots used for calibration

Figure 4.10: The red laser pointer used in this experiment.

4.3.3 Design and Procedure The study was a within-subject study, where each subject performed pointing tasks with all three pointing styles at three distances from the target: 1, 2 and 3 metres. Three blocks of trials were completed for each of the 9 combinations and the mean position for each combination was determined. The order of pointing styles and distances was counter-balanced, with approximately one third of subjects starting at each distance and one third starting with each pointing style. Subjects started by filling out an initial questionnaire indicating their details, including hand preference, eyesight, and experience with laser pointers. They were then given a learning phase on the different pointing techniques they needed to use. Without turning on the laser, subjects were asked to aim as accurately as possible, holding the laser pointer in their dominant hand in a way that best represented the direction of their pointing arm (for straight arm and forearm pointing), as illustrated in Figure 4.8. For the line-up method, users were asked to place the laser pointer between their eye and the target, so that both ends of the laser pointer were collinear with the eye and target (Figure 4.11). Subjects were free to use one or both eyes for this method.


Figure 4.11: The laser pointer used with the line-up method of pointing

To prevent subjects receiving feedback from the laser dot, the laser was only turned on after they had taken aim. Subjects were asked to keep the orientation of the pen consistent throughout the entire experiment, in order to minimize unwanted deviation from the button press. Once the laser was switched on, subjects were asked to keep as still as possible and were not allowed to move their arm. A snapshot was then taken by the webcam. Accuracy was measured in terms of the distance between the target and the laser dot produced on the wall. The positions of the laser dots were determined off-line.
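As an illustration of how such off-line measurement could be carried out with the OpenCV tools mentioned above (the exact processing pipeline used in the study is not described here, so the following is only a sketch under assumed names and layout), the four calibration dots forming the 30cm square can be used to estimate a homography that maps webcam pixel coordinates onto wall coordinates in millimetres; the detected laser dot can then be converted into a physical distance from the target.

import numpy as np
import cv2

def laser_error_mm(corner_px, laser_px, target_px, square_mm=300.0):
    """Estimate the laser dot's distance from the target in millimetres.

    corner_px : 4x2 array of the calibration dots' pixel positions,
                ordered top-left, top-right, bottom-right, bottom-left
    laser_px  : (x, y) pixel position of the detected laser dot
    target_px : (x, y) pixel position of the target (centre of the square)
    square_mm : physical side length of the calibration square (300 mm)
    """
    # Wall coordinates of the four calibration dots, in millimetres.
    corner_mm = np.array([[0, 0], [square_mm, 0],
                          [square_mm, square_mm], [0, square_mm]], dtype=np.float32)
    H, _ = cv2.findHomography(np.asarray(corner_px, dtype=np.float32), corner_mm)
    pts = np.array([[laser_px, target_px]], dtype=np.float32)     # shape (1, 2, 2)
    laser_mm, target_mm = cv2.perspectiveTransform(pts, H)[0]
    return float(np.linalg.norm(laser_mm - target_mm))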

Figure 4.12: A typical webcam image capturing the red laser dot

At the end of the study, users were asked to rank their preferences in a questionnaire.

In summary, the experimental design was: 15 subjects x 3 pointing techniques (straight arm, forearm, line-up) x 3 distances x 3 blocks = 405 pointing trials

4.3.4 Results and Discussion Accuracy was measured as the distance between the recorded laser dot position and the center of target. None of the trials were outside the field of view of the camera, and none were occluded by a subject’s body or arm. Figure 4.13 illustrates the mean distance from target for each pointing method at the three distances and their interactions, for all trials.

Figure 4.13: Mean distance between the target and the laser dot position (mm), showing the effect of distance and pointing style on accuracy. At 1m, 2m and 3m respectively: forearm 133, 155, 183; straight arm 42, 141, 239; line-up 15, 36, 59.

A two-way analysis of variance (ANOVA) with repeated measures revealed a significant main effect of pointing technique on accuracy (F[2,28] = 37.97, p < 0.001), with aggregate means of 156.9mm, 140.6mm and 36.6mm for the forearm, straight arm and line-up techniques respectively. A significant main effect of distance on accuracy was also observed (F[2,28] = 47.20, p < 0.001); the aggregate means for each distance are 63.0mm (1m), 111.0mm (2m) and 160.1mm (3m). We also observed a significant interaction between technique and distance (F[4,56] = 9.879, p < 0.001), as shown in Figure 4.13. Multiple pairwise means comparisons were tested within each pointing technique, and the resulting p-values for each comparison between distances are shown in the following table with Bonferroni correction. Trend analyses were also performed on each technique. Significance in the linear component for a particular technique signifies a linear increase in error (distance from target) with increasing distance.

Technique        1m vs 2m    2m vs 3m    1m vs 3m    Linear component
Forearm          1.000       0.378       0.055       0.018*
Straight arm     <0.001*     0.002*      <0.001*     <0.001*
Line up          0.003*      0.116       0.014*      0.005*
*Denotes significance at the 0.05 level

Table 4.2: Table of p-values illustrating the significance of multiple pairwise means comparisons within each pointing technique
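For illustration, a two-way repeated-measures ANOVA of this kind could be run today with statsmodels as sketched below. This is not the analysis package used at the time; the file name, column names and long-format layout are assumptions.

import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long format: one row per subject x technique x distance, with the error
# in millimetres (averaged over the three blocks) as the dependent variable.
data = pd.read_csv("pointing_accuracy.csv")   # columns: subject, technique, distance, error_mm

result = AnovaRM(data, depvar="error_mm", subject="subject",
                 within=["technique", "distance"],
                 aggregate_func="mean").fit()
print(result)   # F and p values for technique, distance and their interaction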


Multiple pairwise means comparisons were also performed within each distance to investigate possible differences between the techniques, and the resulting p-values are shown in the table below.

Distance    Forearm vs straight arm    Straight arm vs line up    Forearm vs line up
1m          0.001*                     <0.001*                    <0.001*
2m          1.000                      <0.001*                    <0.001*
3m          0.182                      <0.001*                    <0.001*
*Denotes significance at the 0.05 level

Table 4.3: Table of p-values illustrating the significance of multiple pairwise means comparisons within each distance

The results suggest that the line-up method is the most accurate across all distances, in line with our initial hypothesis. A linear increase in error across the three distances, at a rate of 14.7mm per metre, can also be observed. The highest mean distance from target for this method was 59mm, at a distance of 3 metres. Interestingly, the difference in accuracy between 2m and 3m was not significant. The forearm pointing technique is consistently less accurate than the line-up method. Even though its linear increase (16.7mm per metre) is similar to that of the line-up method, the difference in accuracy between the two methods is at least 115mm at all three distances. It is interesting to note the lack of significant differences between the three distances within the forearm pointing method. This may suggest a high tolerance to increasing distance from the target for this pointing method. On the other hand, the straight arm pointing method is highly affected by increasing distance from the target. This is illustrated by the significant differences across all distances and the relatively high linear increase (65.7mm per metre). Compared to forearm pointing, the differences in accuracy at 2m and 3m are not significant; the only significant difference between forearm and straight arm pointing is at 1m. This may be due to the higher level of feedback given by the longer arm extension, and to the fact that the straight arm pointing method resembles the line-up method in close proximity to the target.

Figure 4.14: Qualitative ratings - perceived accuracy and overall preference (out of 5) for each pointing technique.

At the end of the experiment, subjects were asked to rate each method for perceived accuracy and overall preference on a scale from 1 to 5. For perceived accuracy, the scale ran from not very accurate (1) to very accurate (5); for overall preference, 5 indicated strong preference. Figure 4.14 shows the mean rating for each pointing technique. As can be seen, the subjects’ perceived accuracy is in line with the actual accuracy discussed previously. For overall preference, subjects preferred the straight arm method over the other methods (mean rating = 4.2), regarding it as comfortable and natural, even though its accuracy was not as good as that of the line-up method. The line-up method followed closely in second place with a mean rating of 3.9. Subjects found this method accurate, though not as natural as straight arm pointing. This is not consistent with the results of the first experiment, where the line-up method was by far the most natural method in both tasks 1 and 3. We propose that this inconsistency may be due to the unnaturalness of holding the laser pointer in an awkward position rather than to the line-up pointing method itself. If users had used their own hand and fingertip to point with this method, they would potentially have been more in favour of it. As for the forearm pointing method, subjects commented that it was comfortable but that attention was required to make sure they had aimed correctly.


4.3.5 Summary From this experiment, we observed that the line-up method is the most accurate pointing method, and that the straight arm is more accurate than the forearm method only at a distance of one metre. From the qualitative feedback, we observed that preference was higher for both the line-up and the straight arm methods compared to the forearm method. Different interactive systems require different strategies for pointing. However, pointing strategies have not been systematically studied for use in interactive systems. Here, we attempt to characterize the mechanism of pointing in terms of the geometric models used in previous interactive systems, and in the process we use the results from our experiments to gain a better understanding of how and why the line-up method is a more accurate pointing method.

4.4 Models for Targeting We hypothesized that there is a difference between the two strategies of targeting using a full arm stretch. To investigate the reason for this difference, we begin by formalizing the concept of targeting from a geometrical perspective, based on our observations and on previous work. We then introduce three models for targeting - the Point, Touch and dTouch models. It should be noted that the words “pointing” and “targeting” can be used interchangeably to mean the act of aiming (at some target). However, the “Point” model is a similar concept used in a very specific manner, as will be discussed below.

4.4.1 Geometrical Configuration We now define the geometrical configuration on which our models for targeting from a distance are based. These are the building blocks for introducing the models of targeting.

A line of gaze, lg, is directed towards a point on a target object, Po, from a point at the eye, Pe. Meanwhile, a point provided by a pointing mechanism, Pp, guides a line of ray (pointing), lr, towards the target Po. Figure 4.15 illustrates this geometrical arrangement.

Pe = (Xe, Ye, Ze), Po = (Xo, Yo, Zo), Pp = (Xp, Yp, Zp)
lg : Pe -> Po;  lr : Pp -> Po

Pe = eye position
Po = object/target position
lg = line of gaze from eye to object
Pp = visual marker (provided by some pointing mechanism or instrument)
lr = line of ray/pointing

Figure 4.15: The general configuration for targeting.

The task of targeting is therefore to intersect lg with lr at the object Po. A pointing mechanism or visual marker (Pp) may be any of a variety of pointers that the user holds or uses to point. The fingertip (when the user is pointing with their arm) and the laser dot produced by a laser pointer are examples (illustrated in Figure 4.16). The arm direction, on the other hand, is an example of a line of pointing (lr).

Figure 4.16: Examples of visual markers (Pp): (left) a pointing arm and fingertip; (right) a visible red laser pointer and its red laser dot.

We now distinguish three models that can explain most of the common strategies used in the real environment and in many previous interactive systems.

4.4.2 The Point Model The Point model describes the occasion when users point at a target from a distance using their arm or a presentation pointer as the pointing instrument.

Targeting using the Point model is characterized by having the line of gaze, lg, intersect the pointing direction provided by the arm, lr, at the target object, Po, while the pointing marker, Pp, does not meet the target object Po:

Po ≠ Pp (4.1)

Figure 4.17 illustrates this geometrical arrangement. The task for the user is to use their pointing instrument to approximate a pointing direction that meets the target object. However, it is only an approximation, rather than precise targeting, since the visual marker is not on the surface of the target to assist the targeting process.

Figure 4.17: The Point model for targeting. Pe = eye position; Po = object/target position; lg = line of gaze from eye to object; Pp = the pointing instrument; lr = line of pointing.

This can be used to model the cases when the arm is fully stretched, when only the forearm is used to point to the target [102], or when only the fingertips are used, as in [32]. This technique is known as ray casting for interacting with virtual environments [17]. It can also be used to model the straight-arm method and the forearm method that were observed and used in our experiments in Sections 4.2 and 4.3 of this chapter. In these cases, the length of the whole arm, forearm or fingertip is used as a pointing instrument Pp to infer a line of pointing lr towards a target Po (Figure 4.18a). This model can also be used to explain the use of an infrared laser pointer [25]. In that work, the infrared laser pointer is used to point at an on-screen target (similar to a regular red laser pointer). However, the laser dot is not visible to the user. Cameras are used to capture the infrared laser dot on the display to determine the on-screen target selected. The only visual marker to guide the user towards the target is the laser pointer itself (and not the laser beam). In this case, the infrared laser pointer is represented by Pp and its inferred pointing direction by lr (Figure 4.18b).

Figure 4.18: Examples of techniques that use the Point model. (a) pointing with a straight arm, forearm or fingertip. (b) pointing with an infrared laser pointer.

Pointing using the Point model may be inaccurate, due mainly to the distance between Pp and Po. Consistent with the results of our accuracy experiment in Section 4.3.4, the straight arm pointing method was observed to be more accurate when the subject, and hence the arm and hand (Pp), is close to the target: at a distance of 1 metre from the target, the mean error was 42 mm. However, as the user moves further away from the target, the error increased dramatically (mean error of 141 mm at a distance of 2 m). Therefore, it can be seen that accuracy is not guaranteed when pointing techniques that make use of the Point model are used.

4.4.3 The Touch Model The Touch model describes the occasion when the fingertip is used to physically touch a target object or when a pointing instrument is used and a visual marker is seen on the surface of the object.

Targeting using the Touch model is characterized by having the eye gaze, lg, intersect the pointing direction provided by the arm, lr, at the target object, Po, such that the pointing marker, Pp, coincides with the target object Po:

Po = Pp (4.2)

Figure 4.19 illustrates this geometrical arrangement. With this model, the task for the user is to use a visual marker (e.g. their fingertip or a pointing instrument) to physically make contact with a target object (more specifically, with the surface of the object). This is a form of precise targeting, as the visual marker assists the targeting process.

Figure 4.19: The Touch model for targeting. Pe = eye position; Po = object/target position; lg = line of gaze from eye to object; Pp = the pointing marker/instrument; lr = line of pointing.

This can be used to model any kind of touch-based interaction, including DiamondTouch [38] and SmartSkin [119]. The fingertip acts as a pointing instrument Pp and is used to make physical contact with the target Po (Figure 4.20a). Other input methods may be used in place of the fingertip, such as a stylus (a pen-like object with a tip), to act as the pointing instrument. This model can also be used to explain the use of a red laser pointer, for example in [108]. The red laser dot produced by the pointer is represented by Pp and its pointing direction by lr (Figure 4.20b). The red laser can be thought of as an extended arm, and the laser dot as the index finger.

Figure 4.20: Examples of techniques that use the Touch model. (a) targeting using the fingertip. (b) pointing with a red laser pointer.

Targeting using the Touch model is accurate. This is because the distance between Pp and Po is zero and they overlap at the same position, which makes targeting a precise task. Unlike the Point model, estimating the direction of pointing, lr, is not required. Even when users misalign their pointing instrument and the target, the misalignment can be easily observed by the user, allowing readjustment of the position of the pointing instrument. Therefore, it can be seen that accuracy is guaranteed when pointing techniques which make use of the Touch model are used.

4.5 The dTouch (distant-Touch) Model The dTouch model describes the occasion when the fingertip or a visual marker is used to overlap the target object in the user's view, from a distance. It may also be described as using the fingertip to touch the target object from a distance (from the user's point of view).

Targeting using the dTouch model is characterized by having the eye gaze, lg, intersect the pointing direction provided by the arm, lr, at the pointing marker, Pp, such that the eye, Pe, the pointing marker, Pp, and the target object, Po, are collinear. Pp may or may not coincide with Po. Figure 4.21 illustrates this geometrical arrangement.

Figure 4.21: The dTouch model for targeting: (a) the marker Pp lies between the eye Pe and the target Po; (b) Pp coincides with Po; (c) Pp lies beyond Po. Pe = eye position; Po = object/target position; lg = line of gaze from eye to object; Pp = the pointing marker/instrument; lr = line of pointing.

With this model, the task for the user is to align a visual marker, Pp, anywhere along the gaze from eye to target, lg, so that it aligns with the object (as seen from the user's eye). It is not a requirement that the user is located close to the target object. The target may even be unreachable to the user (Figure 4.21a). In such a case, the dTouch model can be considered a remote touch, a touch that occurs from afar (touch interaction without physically touching).
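To make this alignment condition concrete, the following sketch (hypothetical helper names, not part of any system described in this thesis) tests whether the eye Pe, the marker Pp and the target Po are approximately collinear by comparing the directions from the eye to the marker and from the eye to the target:

#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 normalize(const Vec3& v) {
    double n = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / n, v.y / n, v.z / n };
}

// Returns true when Pe, Pp and Po are (nearly) collinear, i.e. the marker lies
// on the line of gaze from the eye to the target, as the dTouch model requires.
bool isDTouchAligned(const Vec3& Pe, const Vec3& Pp, const Vec3& Po,
                     double cosTolerance = 0.999)   // roughly 2.5 degrees
{
    Vec3 toMarker = normalize({ Pp.x - Pe.x, Pp.y - Pe.y, Pp.z - Pe.z });
    Vec3 toTarget = normalize({ Po.x - Pe.x, Po.y - Pe.y, Po.z - Pe.z });
    double cosAngle = toMarker.x * toTarget.x + toMarker.y * toTarget.y + toMarker.z * toTarget.z;
    return cosAngle >= cosTolerance;
}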

In the case when Pp coincides with Po (i.e. the fingertip touches the target, Figure 4.21b), the configuration fits both the Touch model and the dTouch model. The dTouch model can be considered a generalization of the Touch model, since it encompasses the case when Pp coincides with Po (defined by the Touch model) as well as other cases where Pp and Po do not coincide. In other words, the Touch model is a specific case of the dTouch model. The main difference between the Touch model and the dTouch model is the position of the pointing marker, Pp. Even though both models restrict the marker to lie on the line of gaze, lg, it must coincide with the target object in the Touch model, while it is unrestricted in the dTouch model. The dTouch model can therefore represent a wider selection of pointing techniques than the Touch model. The generalized dTouch model can be used to model previous works that use the eye-fingertip line [79] or the head-hand line [33], and the line-up method that was observed and used in our experiments in Sections 4.2 and 4.3 of this chapter. The fingertip or hand acts as a pointing instrument, Pp, and is aligned with the target Po on the line of gaze (Figure 4.22a). When the user interacts with an object using the dTouch model, the user's intention is realized on the screen target. This model can also be used to explain the interactions in previous works on head mounted virtual displays [112, 117]. In these works, a head mounted display is worn in front of the user's eye, and the fingertip is used to interact with the virtual objects. However, the virtual target, Po, is located between the eye, Pe, and the fingertip, Pp, on the line of gaze, lg (Figure 4.22b).

Figure 4.22: Examples of techniques that use the dTouch model. (a) targeting using the eye-fingertip or head-hand line. (b) targeting with a head mounted display and fingertip.

The accuracy of targeting using the dTouch model depends on the distance between Pp and Po. Because of hand instability, the further the two points are from each other, the larger the effect of hand jitter. However, unlike the Point model, estimating the direction of pointing, lr, is not required. Users can readjust the position of their pointing instrument when a misalignment of the two points is observed.

This is consistent with the results observed in our accuracy experiment in Section 4.3.4: at a distance of 1 metre the line-up method (mean error of 15 mm) is more accurate than the other two methods based on the Point model (mean errors of 42 and 133 mm). This is also consistent with the results of Nickel and Stiefelhagen [102], where the percentage of targets identified with the head-hand line (our dTouch model, 90%) is higher than with the forearm line (our Point model, 73%). As can be seen, the accuracy of the dTouch model is better than that of the Point model. When the distance between the two points is reduced to zero, we can expect guaranteed accuracy, as with the Touch model.

4.6 Indirect Interaction Even though our Touch model is only relevant when the interaction is direct, the model can actually be used to represent indirect interactions, with only minor modifications. The interaction is indirect when the input space is no longer the same as the output space. The computer mouse is a good example of this. In such cases, the line of ray is no longer a straight line. The input and the output space may certainly have some form of correlation; however, a direct relationship (in terms of physical space) is no longer necessary. The ray of pointing is therefore no longer relevant. Here are some examples:

1) The mouse cursor appearing on the screen can be represented as Pp. When the cursor moves onto an on-screen UI widget and the mouse is clicked, Po and Pp coincide. The task for the user here is to align the mouse cursor with the target (indirectly, through a mouse) (Figure 4.23a).

2) Even though remote controllers are discrete input devices, they can be thought of in terms of this model as well. The directional keys on the remote controller may be mapped to the physical grid-like space on the display screen. When a directional key is pressed, the on-screen selection indicator (Pp) is moved closer to its intended target (Po).

3) Yet another example of the use of this model is the EyeToy camera [129]. The user's hand image displayed on the screen is represented by Pp. The task for the user is to shift their on-screen hand image as close to the target (Po) as possible (Figure 4.23b).

Figure 4.23: Examples of indirect techniques that use the Touch model. (a) computer mouse. (b) EyeToy game with hand movements.

Therefore, the Touch model can also be used to model any interactive system that relies on visual feedback, either direct or indirect. In our work, however, we are mainly interested in direct interaction: interactions in which users are not required to hold any intermediary devices and which can be performed from a distance. Interaction techniques of this kind do not require the user to touch the screen or be within reach of the output display; users are able to interact remotely. However, we should still recognize the benefits exhibited when the Touch model is adapted to indirect devices. For example, the computer mouse can provide users with stability, as hand jitter and fatigue will no longer be concerns. It also provides users with a higher degree of accuracy.

4.7 Summary In this chapter, we investigated the strategies that humans use to point at targets. This was necessary as it was observed that interactive systems in the literature do not take into account the naturalness and accuracy provided by the specific pointing strategies users are asked to use. In the first experiment, we investigated the pointing strategies that people naturally used when asked to point at a target. We observed three main strategies for pointing: straight arm, forearm and line-up. The majority of subjects naturally used the line-up method to point (where they were trying to align their fingertip between their eye and the target). The second experiment gauged the accuracy of the different pointing strategies observed. We found that the line-up strategy was the most accurate of the three. From observing users' pointing strategies in the two experiments, we were able to define and distinguish three major models of targeting: Point, Touch and dTouch. The dTouch model was also shown to be a generalized extension of the Touch model. We have shown that it is possible to describe the models underpinning previous interactive hand pointing systems, both for interactions within arm's reach and at a distance. The models of targeting introduced in this chapter and the interaction techniques that use them are summarized in the following table.

Model  | Intrusive methods                 | Non-intrusive methods                                               | Distance  | Direct/Indirect interaction | Accuracy
Point  | Infrared laser pointer [25]       | Straight arm, forearm [102], fingertip pointing [32]                | Distant   | Direct                      | Inaccurate
Touch  | Stylus                            | Touch tables, touch screens [38]                                    | Reachable | Direct                      | Accurate
Touch  | Red laser pointer [108]           |                                                                     | Distant   | Direct                      | Accurate
Touch  | Computer mouse, remote controller | Hand pointing systems requiring visual feedback, e.g. EyeToy [129]  | Distant   | Indirect                    | Accurate
dTouch | Head Mounted Display [117]        | Eye-fingertip line [79], head-hand line [33]                        | Distant   | Direct                      | Accurate (slightly affected by hand jitter)

Table 4.4: A summary of interaction methods under each targeting model proposed.

From these models, we can deduce that the Point model does allow direct interaction from a distance but can be highly inaccurate. The Touch model provides high accuracy but does not allow bare-hand interaction from a distance. The dTouch model, on the other hand, provides good accuracy and allows the direct hand pointing strategy that we observed to be most natural to users (eye-fingertip). Understanding the mechanism of targeting can assist future human-computer interaction researchers and practitioners in deciding on the input strategy that best suits their users in completing the required tasks. When designing an interactive system that is natural, unintrusive and direct, we recommend that the dTouch model be used as the underlying strategy.

In the next chapter, we will use the dTouch model to design a direct and non-invasive method for interaction with a large display at a distance using a single camera.


Chapter 5 Conceptual Design of Interaction Model

In the previous chapter, the dTouch model was demonstrated to be one of the best targeting strategies studied. The eye-fingertip method is a pointing technique that uses the dTouch model to allow direct, unobtrusive and accurate interaction from a distance. In this chapter, we propose a conceptual design of our monocular interactive system that makes use of the eye-fingertip method. We call this the “dTouch pointing system”.

5.1 Monocular Vision Our goal is to design a natural interactive system for large screen displays that uses only a single camera and relies on monocular computer vision techniques. Stereoscopic computer vision has the advantage of providing higher accuracy due to the additional information provided by the second camera. However, stereo vision setups usually require prior calibration as well as the computational cost associated with matching the two image streams. A single camera setup has gained popularity in recent years in the form of webcams for personal computing and console gaming [129]. The main advantage of using a single camera as the basis for a new interactive system is the availability and accessibility of the webcam. Most PC owners already have one, and they are commonly included in laptop computers. This reduces the need for users to purchase expensive specialized hardware (as in the case of a new input device), or a second webcam (as in the case of a stereo camera setup) which may otherwise be unnecessary. It is hoped that our technique will achieve accuracy comparable to stereoscopic methods. By using computer vision, users are able to use their natural abilities to interact with the computer, allowing a more enjoyable experience. Such an approach may also allow a wider adoption of new interactive technologies in daily life. It is hoped that our method may also allow interaction with personal computing displays, apart from large displays. Current monocular systems often use a single camera to detect the position of the user's hand. The x and y coordinates in 2D space are determined and used to infer an intended target position on the screen. Interaction then relies on visual feedback, usually in the form of an on-screen cursor. Although demonstrated to be accurate because of this feedback, this provides only indirect interaction. The major drawback of this type of interaction is the fixed interaction space in a 2D area. The user is not able to move around freely, as they must stay within the same area to interact with the system. To provide a more natural interactive system, we attempt to extract 3D information from the environment and the user, which we discuss in the following chapter. Here, we will explain our design in detail and set up an environment that is natural for the user when interacting with large displays.

5.2 Design Overview We envision that our pointing system would be used in situations where occasional, quick, and convenient interaction is required, such as in a presentation setting or an information kiosk, as opposed to the highly accurate and continuous mouse movements required for typical personal computing usage. To provide natural interaction, the system must allow users to point at the display directly using their bare hand. Our method is to place a web-camera above the screen, angled downwards, in order to determine the location of the user. To estimate the user's pointing direction, a vector is approximated from the user's eye through the pointing finger to a point of intersection on the display screen. The resulting position on the display would be the user's intended on-screen object. This makes use of the eye-fingertip method in the dTouch model (Figure 5.1).

Figure 5.1: Overview of our interactive system (webcam above the large display; the user's eye, fingertip and on-screen object).

To determine the pointing direction, image processing must be performed to track the eye and fingertip. In effect, we wish to construct a straight line in 3D space which can be used to give us a 3D coordinate on the display screen. The specific concepts used in this setup are examined below.

5.2.1 The View Frustum A view frustum is used in computer graphics to define the field of view of a camera, or a region of space which may appear on a computer screen (Figure 5.2). The view frustum is the area within a rectangular pyramid intersected by a near plane (for example a computer screen) and a far plane [159]. It defines how much of the virtual world the user will see [132]. All objects within the view frustum will be captured and displayed on the screen, while objects outside the view frustum are not drawn to improve performance.

Figure 5.2: The view frustum in computer graphics is the area within a rectangular pyramid between a near plane (computer screen) and a far plane. Only objects within the view frustum are drawn on the computer screen.

An interaction volume is an area where the interaction occurs between the user and the system (the display, in our case). In computer vision based interactive systems this area must be within the camera’s field of view. Users can use their hand within this area to interact with objects displayed on the screen. In interactive systems that do not require explicit knowledge of the user’s location, the interaction volume is static. To adequately interact with the system, the user must adjust themselves to the volume’s location by pacing or reaching out. In addition, because the interaction volume is not visible, users must discover it by trial and error. On the other hand, when the user’s location is known, the interaction volume adjusts to the user, and is always in front of the user. To achieve the latter, as in the case of our method, a camera can be used to detect the face of the user. A view frustum can then be constructed between the origin (at the eye position) and the large display (Figure 5.3). The view frustum can therefore be used as a model for approximating the interaction volume. The user can use their hand and fingers within this volume to interact with the display. The view frustum thus defines the interaction area available to the user.

Figure 5.3: The view frustum is estimated by detecting the user's head and eye, with the origin at the eye.

5.2.2 Virtual Touchscreen To integrate the benefits afforded by large displays with the interaction space afforded by the user, we propose improving interaction with large displays by leveraging the benefits provided by touch screens. Our idea is to imagine bringing the large display close enough to the user that it is within arm's length (and thereby reachable), so that users can imagine a touchscreen right in front of them. The distance between the user and the virtual objects remains unchanged. This "virtual" touchscreen and the original large display share the same viewing angle within the confines of the view frustum (Figure 5.4).

Figure 5.4: (a) The way users would point at a large display using their hand. (b) Users point at the virtual touchscreen, which is brought towards the user to within arm's reach.

With this approach, users can use their finger to interact with the virtual touchscreen as if it was a real touchscreen (Figure 5.5).

Figure 5.5: A user interacting with the large display using their fingertip through a virtual touchscreen at arm's length.

The user is therefore restricted to using a fully stretched arm, so that a virtual touchscreen can always be approximated at the end of their fingertip, eliminating the guess-work for the user in finding the touchscreen. From the experiment in Section 4.2, the majority of subjects were observed to use the full arm stretch and the eye-fingertip method to point at the target. Therefore, rather than feeling constrained, it should be natural for the user to point in our system. An advantage of this approach is that it provides a more accurate estimation of the fingertip position, as the distance between finger and display can now be estimated from the position of the user. The other advantage is that the virtual touchscreen is re-adjusted as the user moves (Figure 5.6). In previous interactive systems, the interaction volume, and therefore the virtual touchscreen, is static. To interact with such systems, the user must determine the interaction area by initially guessing and/or through a feedback loop. When the user moves around, the interaction area does not move correspondingly, and it is therefore necessary to re-adjust the hand position in order to point to the same target. Conversely, in our approach, as the user's location is known, the virtual screen adjusts to the user accordingly, staying in front of the user. The user will always be able to "find" the virtual touchscreen as it is always within their view frustum (whose origin is at the user's eye position). As long as the user extends their hand within the bounds of the view frustum (or, visibly, the large display) they can always interact with the system.

Figure 5.6: The virtual touchscreen is readjusted as the user moves.

5.2.3 Interaction Model The moment the user touches a virtual object on the virtual touchscreen, the dTouch model can be used to define this targeting action, as illustrated in Figure 4.15.

Figure 5.7: The dTouch model for targeting with the virtual touchscreen. Pe = eye position; Po = target object position; lg = line of gaze from eye to object; Pp = visual marker provided by the fingertip; lr = line of pointing.

In our system, the task for the user is to move their fingertip (Pp) onto the line of gaze from eye to target (lg) so that it aligns with the object (as seen from the user's eye). Pp is also the location of the virtual touchscreen. The object on the large display is indicated by Po, which is unreachable to the user when interacting at a distance. When the eye, fingertip and on-screen object are aligned, the dTouch model is in action and the virtual touchscreen is automatically present. The use of dTouch in this case can be considered a distant touch, a touch that occurs from afar. When the user walks towards the large display and is able to touch it physically, Pp and Po coincide: the fingertip touches the target. The virtual touchscreen coincides with the large display, making up a large touchscreen. In this case, the Touch model applies. In practice, due to the placement and angle of the camera, the user may not be able to interact with the display at such close proximity, as the fingertip may be out of the camera's view.
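For illustration, a minimal sketch of how the on-screen point could be obtained by intersecting the eye-fingertip ray with the display plane is shown below (hypothetical helper names; the estimation actually used in our system is developed in Chapter 6):

#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Find where the ray from the eye through the fingertip meets the display
// plane (given by a point on the plane and its normal). This is the on-screen
// point Po targeted under the dTouch model.
bool onScreenPoint(const Vec3& eye, const Vec3& fingertip,
                   const Vec3& planePoint, const Vec3& planeNormal, Vec3& out)
{
    Vec3 dir = sub(fingertip, eye);                 // direction of the pointing ray
    double denom = dot(planeNormal, dir);
    if (std::fabs(denom) < 1e-9) return false;      // ray parallel to the display
    double t = dot(planeNormal, sub(planePoint, eye)) / denom;
    if (t < 0.0) return false;                      // display is behind the user
    out = { eye.x + t * dir.x, eye.y + t * dir.y, eye.z + t * dir.z };
    return true;
}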

5.2.4 Fingertip Interaction To select an on-screen object, it is expected that users will use the virtual touchscreen as a normal touch screen, where they select objects by using a forward and backward motion. However, it may be difficult for the user to know how far they have to push forward before the system will recognize the selection. Furthermore, since we are only using a single web-camera, it may be difficult to capture small changes in the distance of the fingertip from the image. One possible solution is to use dwell selection, where the user has to stay motionless at a particular position (within a given tolerance) for a specified time (typically around one second).
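A minimal sketch of such a dwell-selection mechanism is shown below (hypothetical class and parameter names; the tolerance and dwell time would need tuning against the system's jitter and accuracy):

#include <chrono>
#include <cmath>

class DwellSelector {
public:
    DwellSelector(double tolerance = 2.0, double dwellSeconds = 1.0)
        : tol_(tolerance), dwell_(dwellSeconds) {}

    // Feed the current fingertip position every frame; returns true when a
    // selection should be triggered.
    bool update(double x, double y) {
        auto now = std::chrono::steady_clock::now();
        if (!active_ || std::hypot(x - ax_, y - ay_) > tol_) {
            ax_ = x; ay_ = y; start_ = now; active_ = true;   // (re)start the dwell timer
            return false;
        }
        std::chrono::duration<double> held = now - start_;
        if (held.count() >= dwell_) { active_ = false; return true; }
        return false;
    }

private:
    double tol_, dwell_;
    double ax_ = 0.0, ay_ = 0.0;
    bool active_ = false;
    std::chrono::steady_clock::time_point start_;
};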

5.2.5 Visual Feedback As the dTouch model is used, user pointing accuracy will be high, except for slight errors due to hand jitter. In addition, as we do not anticipate continuous use with pixel-level accuracy, it is expected that users would not require continuous feedback when selecting targets. When continuous feedback is present, possible system lag may irritate users. Depending on the system’s estimated accuracy, the presence of feedback may or may not be required. The use of feedback for this type of system will be examined further in Chapter 7.

5.2.6 Limitations The user's head and fingertip must be within the view of the camera, so that the eye and fingertip positions can be determined from monocular vision. As mentioned in this chapter, to interact with the virtual touchscreen using the dTouch model, users must use a straight arm for interaction. They cannot use a different pointing strategy, such as forearm pointing. The disadvantage of this is that the user's pointing arm may fatigue after prolonged use. However, since this system is not designed for continuous use, this may not be a problem. The advantage of using a straight arm for interaction is that users are always aware of the location of the virtual touchscreen, as it is always at the tip of their index finger (without the need for trial and error). It also allows the system to estimate the depth of the fingertip more easily using monocular vision (see Section 6.4).

5.2.7 Summary We have therefore designed a system that makes use of users' natural pointing ability, by using the eye-fingertip pointing strategy and the dTouch model, which was shown to be natural and accurate in previous work as well as in our own studies. With the use of a virtual touchscreen, the input space is collocated with the output space, giving us a form of direct interaction. Using the hand as the input method frees users from having to hold any devices or wear any special gear, giving users unobtrusive and unconstrained interaction. The use of a single consumer webcam keeps the setup simple and inexpensive. Having designed our dTouch interaction system, our methods for extracting 3D information from 2D images, using constraints from the environment and from the user, are described in the next chapter.


Chapter 6 Monocular Positions Estimation

Our aim is to produce a method for finding the head and fingertip positions in 3D space, as well as the resultant position on the display, all from a 2D camera image. It should be noted that cameras will become cheaper and better with time; however, the contributions here can be used as a basis for further advancement in this field for years to come. As described in the previous chapter, our proposed interactive system is designed to be used with large displays together with a webcam. The webcam is positioned on top of the display to detect the user's pointing direction. The pointing direction is estimated from two 3D positions in space: the eye and the fingertip. As users will be using the dTouch model of pointing with a virtual touchscreen, the first position is the eye position and the second is the fingertip position of the user. The following diagram illustrates this.

Figure 6.1: Proposed interactive system overview. C = camera; E = eye position; F = fingertip position; P = resultant point on display; de = distance between C and E; df = distance between C and F.

To estimate the pointing direction and the final resulting point, we divide the process into three steps:
1. eye position (E) estimation
2. fingertip position (F) estimation
3. resultant point (P) estimation

In this chapter we will describe these three stages.

6.1 Environment setup For simplicity, we assume that the camera center is the origin of the world coordinate frame (as well as the camera coordinate frame). The z-axis extends outward towards the center of the camera’s image plane and towards the user and the environment. The principal point is the location at which the z-axis from the camera center intersects with the image plane. We assume that this principal point intersects exactly at the center of the image plane. Figure 6.2 illustrates this.

Figure 6.2: Origin of the world coordinate frame at the webcam's centre, with the camera centre at (0,0,0), the z-axis towards the user, and the principal point at the centre of the camera's image plane.

Figure 6.3: An illustration of the webcam's view (camera's image plane), with origin (0,0) and dimensions imgw by imgh.

6.2 Geometric Constraints

The first two steps each involve finding a point in 3D space given a single 2D image. In our system, the z-coordinate is the missing information. To acquire extra information, constraints from the environment are often exploited. It is possible to use the known size of familiar objects in the scene within an otherwise unknown environment [145]. We propose that it is also possible to use the kinematic constraints provided naturally by the human body [161]. This has the advantage of being more robust to different users and insensitive to the environment. This is the main approach that we will use in this work.

6.3 Step 1: Eye Location Estimation

The first 3D point that we need to estimate is the eye position. As mentioned earlier in the previous chapter, we define one end of the view frustum at the eye of the user.

Specifically, we define this point, E, as the dominant eye of the user. To capture this point, we use a face detection algorithm on each frame we receive from the web camera. However, this only gives us the x and y coordinates in the web camera's image. For the approach to work we need to know this point in 3D space; we therefore need to determine the third dimension – the distance between the camera and the eye, de. As can be seen in Figure 6.4, de is vital in determining the position of E.

Figure 6.4: One end of the view frustum is determined to be the eye location of the user. By using the size of the face, the distance between the user and the display can be calculated. C = camera centre; E = dominant eye; de = distance between C and E; d = perpendicular distance between the user and the display.

Over the years, various approaches have been used to determine depth in computer vision. Most of the previous work, as seen in Chapter 2, deals with stereo cameras, in which depth can be determined by stereo matching. Of the approaches that only require a single camera (monocular vision), one study has shown that it is possible to detect depth by comparing two frames taken from a moving vehicle [96]. However, in our situation the camera is not allowed to move. In another work [19], a texture analysis approach was used, as well as a histogram inspection approach, to determine depth. However, both approaches can only "determine whether a point within an image is nearer or farther than another with respect to the observer" [19]. We require an approach that can estimate the distance of the user's face with respect to the camera, not merely relative to other points. It is well known that when an object moves closer to a perspective camera, the object appears larger, and when it moves further away, it appears smaller in the camera's view. Our approach is based on this fact and determines the depth from the width of the user's face. We can observe an increase in size (in terms of the number of pixels captured by the camera) when the user moves forward, closer to the camera, and a decrease in size when the user moves away. This assumes that we have prior knowledge about the size of the user's face in pixels at a particular (known) distance.

The detected face width (in pixels) is inversely proportional to the depth of the face from the camera:

detected face width ∝ 1 / depth ,   i.e.   Ratio = detected face width · depth    (6.1)

where Ratio is approximately constant for a given user and camera.

With this approach, we need to either:

(1) assume that all users have similar face size, or
(2) introduce a calibration phase at the beginning to measure (manually or automatically) the real width of the user's face.
For now, the face width is measured manually and individually for each user.

Any face detection algorithm can be used to detect the face, as long as it gives a consistent approximation of the face at varying distances. The output of the face detection algorithm gives the number of pixels that enclose the face horizontally, which varies as the user moves forward or backward. To determine the depth from this pixel count, a width-to-distance ratio (Rwd) is needed. That is, we need to answer the question: if the actual face width is 17 cm and the face detection gives 100 pixels, how far is the face from the camera? To calculate Rwd, we need to determine the field-of-view at known distances from the camera. We used a calibration checkerboard pattern as shown in Figure 6.5. The camera image is undistorted beforehand, so that radial distortion is eliminated.


Figure 6.5: Checkerboard pattern for determining Rwd. Each square measures 3 cm by 3 cm.

The checkerboard pattern is placed at various known distances from the camera and the number of pixels a square occupies is recorded.

Figure 6.6: Diagrammatic representation of the position of the checkerboard at a known distance (d) from the camera (C) within its field-of-view (width w).

The number of squares that is required to fill the field-of-view horizontally can be computed.

fovw = actual width of camera’s field-of-view at known depth (cm)

SQUAREw = width of a square (cm)

imgw = Horizontal resolution of camera (px)

squarew = number of pixels occupied per square (px)

d = known depth of the checkerboard (cm)

fovw = imgw · (SQUAREw / squarew)

Rwd = fovw / d    (6.2)

The height-to-distance ratio (Rhd) can be similarly determined:

Rhd = fovh / d    (6.3)
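For illustration, a minimal sketch (hypothetical helper name) of this calibration computation, following equation (6.2), is shown below; the same computation with the image height and square height gives Rhd.

// Compute the width-to-distance ratio Rwd from one checkerboard measurement,
// as in equation (6.2).
double computeRwd(double squarePx,     // pixels occupied by one square (px)
                  double squareCm,     // real width of a square, e.g. 3.0 (cm)
                  double imgWidthPx,   // horizontal camera resolution (px)
                  double distanceCm)   // known distance of the board (cm)
{
    double fovwCm = imgWidthPx * (squareCm / squarePx);  // field-of-view width at that distance
    return fovwCm / distanceCm;                          // Rwd
}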

Results from determining the width-to-distance (Rwd) and height-to-distance (Rhd) ratios using the checkerboard pattern are shown in Figure 6.7.

Figure 6.7: Graph of field-of-view (cm), both horizontal (width-to-distance) and vertical (height-to-distance), versus distance from camera (cm).

The mean for Rwd can be estimated to be 0.597 ± 0.008 (±0.008 being the maximum and minimum error), while the mean for Rhd is 0.797 ± 0.006. As can be seen, both ratios do not vary greatly with distance, and thus this method provides an adequately consistent ratio. It should be noted that these ratios were obtained only for the camera used throughout this thesis; when a different camera is used, they will need to be re-calculated. Now that we have determined the ratios, we can use the face width as our reference for depth estimation. These two important ratios will continue to be used throughout this chapter. To determine the validity of this approach, we implemented this idea and conducted a simple experiment to test the accuracy of our algorithm.
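Combining the calibrated ratio with a known face width, the depth estimate itself is a short computation; a minimal sketch (hypothetical helper name) is shown below.

// Estimate the camera-to-face distance from the detected face width in pixels,
// given the user's real face width (measured beforehand) and the calibrated Rwd.
double estimateFaceDepthCm(double faceWidthPx,     // detected face width (px)
                           double realFaceWidthCm, // measured face width (cm)
                           double imgWidthPx,      // horizontal camera resolution (px)
                           double Rwd)             // width-to-distance ratio
{
    // At depth d the camera sees Rwd * d centimetres across the image, and the
    // face occupies faceWidthPx / imgWidthPx of that width, so
    //   realFaceWidthCm = (faceWidthPx / imgWidthPx) * Rwd * d.
    return (realFaceWidthCm * imgWidthPx) / (faceWidthPx * Rwd);
}

For example, a 17 cm wide face detected as 100 pixels on a 320-pixel-wide image, with Rwd of about 0.597, gives an estimated depth of roughly 91 cm.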

6.3.1 Implementation The viability of the proposed technique for depth estimation is demonstrated by implementing a proof-of-concept prototype. As with all implementations in later sections of this thesis, we used Visual C++ and an open source computer vision library – OpenCV – started by Intel Research's Visual Interactivity Group and now available on sourceforge.net [109]. It provides the image processing, recognition, measurement, and geometric image processing functions needed for our implementation. Using the API provided, we used the Haar face detection algorithm (a cascade of boosted classifiers working with Haar-like features) proposed by Viola [151] and improved by Lienhart [82], and were able to detect the human face in real-time. To increase the tolerance and robustness, we also used an eye detector [110] which searches for the two eyes using a similar classifier cascade but is trained for detection of frontal views of eyes only. If only the face or only the pair of eyes is detected, that detection's eye positions will be used, but if both are detected, the average will be used. A screenshot of our simple application is illustrated in Figure 6.8. The image on the left is a view from the webcam. The larger blue square indicates the detection of a face. The smaller rectangle indicates the detection of a pair of eyes. The circle represents the averaged midpoint between the eyes and the crosshair represents the estimated right-eye position. The image on the right represents a top view of the estimated position of the user; the user's head is represented by a circle. The inverted red "V" indicates the user's view frustum. The black "V" indicates the webcam's field of view. The horizontal black line at the bottom represents the display.

Figure 6.8: Screenshot of our current implementation.
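For illustration, a minimal sketch of the per-frame detection and depth-estimation loop is given below. It uses the modern OpenCV C++ API rather than the OpenCV 1.x C API available at the time; the cascade file name, the use of the first detected face, and the vertical placement of the eye estimates are assumptions made for this sketch only.

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // Assumed cascade file shipped with OpenCV; the original implementation
    // used the Viola-Jones/Lienhart Haar cascades through the older C API.
    cv::CascadeClassifier faceDetector;
    if (!faceDetector.load("haarcascade_frontalface_default.xml")) return 1;

    cv::VideoCapture cap(0);                  // the webcam
    const double Rwd = 0.597;                 // calibrated width-to-distance ratio
    const double realFaceWidthCm = 17.0;      // measured for this user

    cv::Mat frame, gray;
    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        std::vector<cv::Rect> faces;
        faceDetector.detectMultiScale(gray, faces);
        if (faces.empty()) continue;

        const cv::Rect& f = faces[0];
        // Depth estimate from the detected face width (Section 6.3).
        double depthCm = (realFaceWidthCm * frame.cols) / (f.width * Rwd);
        // Eye x positions approximated at 0.25 and 0.75 of the face box width;
        // the vertical placement at mid-height is only a rough guess.
        cv::Point eye1(f.x + f.width / 4,     f.y + f.height / 2);
        cv::Point eye2(f.x + 3 * f.width / 4, f.y + f.height / 2);
        // ... use depthCm and the eye positions to update the view frustum ...
    }
    return 0;
}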

It should be noted that different algorithms could have been used for face and eye detection to the same effect. The face and eye detection occurs at every frame and the user's distance is re-estimated every time. The webcam that we used is a Logitech QuickCam Pro 4000 running at a resolution of 320x240. From the face detection, the eye positions can be estimated as ratios of the width of the bounding box. We found ratios of 0.25 and 0.75 to be good estimates for the positions of the two eyes. There are several limitations with the current implementation:
- Only frontal face images are detected. Therefore the user's head cannot be tilted too far away from the neutral position in any of the three axes: yaw, pitch and roll. In the next section we will test the robustness of the face detection technique. One implication is that the camera cannot be positioned too high above the display, since users typically look towards the screen and not directly at the camera.
- Faces can only be reliably detected within a range of approximately 30 cm to 180 cm from the camera.

6.3.2 Experiment To evaluate the performance of our depth estimation (using the width of the face), an experiment was conducted to determine the accuracy of our system. We also tested the detection rate of the two different detection algorithms, as well as the effect of rotation of the face. We marked distances on the ground at 30 cm intervals up to 180 cm. For each distance, we took 10 different positions (vertically and horizontally) within the field of view of the camera and ensured that the whole face was inside the camera's view. We then took the mean recorded distance estimated by our system, as well as whether the face or the eyes were detected. To control for rotation effects, effort was made to ensure that the user's head faced the camera as orthogonally as possible. At the end of the experiment, we measured the amount of rotation the system could cope with before it could no longer detect the face.

6.3.3 Results It can be observed from Table 6.1 and Figure 6.9 that this method can provide an adequate estimation of distance. Its peak performance is at distances from 90 cm to 150 cm, where the mean accuracy is above 95%. At 30 and 60 cm it always overestimated the distance, while at 180 cm it always underestimated the distance. Table 6.2 suggests that faces can be detected at all times between 60 cm and 150 cm, while at other distances the detection rate suffers. If the user is too close to the camera (less than 60 cm) the whole face might not be visible to the camera, and if too far away (more than 150 cm), the face might be too small for detection. As for the eye detection algorithm, it works best up close, and its performance reduces incrementally until it cannot positively detect any eye features from 120 cm onwards. Aggregating all data, we see that the best operational distance is from 90 cm to 150 cm. It is also observed that the eye detection might not be as helpful as originally thought. Results for head rotation in Table 6.3 show that our system is robust enough to accept small head rotations in each direction, but is most vulnerable to head tilting.

Actual distance (cm) | Estimated distance (cm), lowest | mean | highest | Mean accuracy (%)
30  | 39  | 40  | 43  | 67
60  | 64  | 67  | 72  | 83
90  | 78  | 86  | 101 | 96
120 | 107 | 117 | 127 | 98
150 | 140 | 149 | 162 | 99
180 | 150 | 164 | 169 | 91

Table 6.1: Numerical results of the estimated distances at each distance, as well as the mean accuracy that the system produced.

Figure 6.9: Graph of distance estimated using face detection (mean estimated distance versus actual distance, in cm). The I bars indicate the lowest and highest detected value.

Actual distance (cm) | Face detection rate (%) | Eye detection rate (%)
30  | 40  | 100
60  | 100 | 90
90  | 100 | 70
120 | 100 | 0
150 | 100 | 0
180 | 70  | 0

Table 6.2: Numerical results showing face and eye detection rate at each distance.


Head rotation | Positive rotation (degrees) | Negative rotation (degrees)
Pitch (looking up or down)   | 40 | 20
Yaw (turning right or left)  | 30 | 30
Roll (tilting right or left) | 10 | 10

Table 6.3: Estimated maximum head rotation on each axis before non-detection at a distance of 100 cm. (Values are estimated manually.)

Thus, we have successfully demonstrated a simple algorithm for depth estimation using monocular computer vision to detect the width of the face. We will now describe our method for determining the x and y coordinates of the eye.

6.3.4 Face position in 3D The result from the face detection gives us a rectangular area around the face. The eye positions are then approximated by using ratios from the bounding box. This gives us the eye position in the image coordinate frame, x and y. To convert them into the world coordinate frame, a camera model is used. We use the pin-hole camera model to deduce the corresponding X and Y in world coordinates.

Figure 6.10: Diagrams representing the pinhole camera model. Source: [62]

In Figure 6.10(a) above, the coordinates of the eye position may be represented as X=(X,Y,Z), and x = (x,y) in the corresponding image plane. f is the focal length. In our case, given Z (the depth from the face detection), and x and y, we could potentially calculate Y based on the following:

Let f Y y  Z rearranging gives: y  Z Y  f

We know Z (the depth) and y (the number of pixels); however, we do not know f, the focal length, because the focal length value calculated from an intrinsic calibration is an effective focal length (α) in terms of pixel width and pixel height. That is, we are only given

αx = f / Pw ,   αy = f / Ph

where

Pw = width of pixel

Ph = height of pixel

This is not sufficient: as it is difficult to physically measure a pixel's width and height on the image plane in millimetres, we cannot reliably calculate the focal length, f. The method we propose instead is to make use of the ratio between the displacement of the eye from the centre of the image and the image width.

Figure 6.11: (a) An illustration of the webcam's view (image plane). (b) A top view of the real world coordinate representation.

To calculate the value of X in world coordinates, we can use the ratios we calculated in the previous section and use similar triangles to obtain:

X = Z · Rwd · (x / imgw)    (6.4)

where
X = distance from the centre of the image in world coordinates
Z = depth detected from the face width detection
Rwd = width-to-distance ratio
Z · Rwd = width of the field-of-view at the detected depth
x = number of pixels between the centre of the image and the right eye
imgw = width of the camera's image in pixels

The Y coordinate in the world frame can be calculated in a similar fashion:

Y = Z · Rhd · (y / imgh)    (6.5)

The evaluation of the accuracy of the X and Y values yielded by this method is deferred until Section 6.4.6, together with the fingertip accuracy (after fingertip detection has been described).
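For illustration, a minimal sketch (hypothetical helper names) of the conversion in equations (6.4) and (6.5) is shown below.

struct WorldXY { double X, Y; };

// Convert an image-plane offset from the principal point into world-frame X and
// Y at a known depth Z, following equations (6.4) and (6.5).
WorldXY pixelToWorld(double xPx, double yPx,    // offsets from the image centre (px)
                     double Z,                  // depth, e.g. from the face-width estimate (cm)
                     double imgW, double imgH,  // camera resolution (px)
                     double Rwd, double Rhd)    // calibrated ratios
{
    WorldXY w;
    w.X = Z * Rwd * (xPx / imgW);   // width of the field of view at Z, scaled by the pixel offset
    w.Y = Z * Rhd * (yPx / imgH);
    return w;
}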

6.3.5 Summary In this section (6.3) we have presented a simple algorithm for depth estimation using the relative width of the face. Using monocular computer vision with an inexpensive webcam, we successfully implemented our algorithm. We see that our system performs best when the user is at a distance of around 0.9 to 1.5 metres from the camera, with a mean estimated accuracy above 96% and a detection rate of 100%. It is robust enough to cope with small head rotations. Now that we have determined the eye position E with respect to the camera C, the next step will be to detect the second point – the fingertip.

6.4 Step 2: Fingertip Location Estimation The fingertip point F can also be determined by computer vision. As reviewed in Chapter 2, there is extensive previous work on hand detection and gesture recognition. As with the eye estimation, however, this only gives us an accurate estimation in 2D space. For the third dimension, it would be very difficult to use the width of the hand to estimate the distance, since users can turn their hand in different ways when pointing, and therefore a different strategy is adopted. One way of determining the fingertip position is to estimate it from the user's position. From the previous section (6.3), we have knowledge of the location of the eye in 3D space; it is then possible to estimate the fingertip position using simple approximations and constraints of the human body. Our approach is to model the whole arm movement as a sphere around the shoulder of the pointing arm. The shoulder then becomes the centre of the sphere, with the arm's length as the radius. With this approach we need to make the assumption that the user is always using a straight (fully stretched) arm. It would also be possible to assume that the user has a half-stretched arm, that is, keeping the upper arm close to the body and only stretching out the forearm; the elbow joint between the upper arm and the forearm would effectively become the centre of the sphere. However, this would violate our dTouch model of pointing, as it would use the Point model, the less accurate one, and thus is not preferred.

Figure 6.12 illustrates our idea.

Figure 6.12: A fully stretched arm is modelled by a sphere, with the right shoulder as the centre. C = camera centre; E = eye; F = fingertip; P = resultant point on display; r = radius of the sphere; CF = vector from C to F.

This is a reasonable model since most pointing will be done in front of the user. Only a small portion of the sphere's surface will be used, and therefore extreme cases, such as the arm going behind the body, are not accounted for. For now, the difference in the horizontal and vertical directions between the shoulder position and the eye position is measured manually, and the shoulder is assumed to be at the same depth as the eye position. This can then be used to approximate the centre of the sphere. A vector from the camera's centre through the fingertip position on the image plane can be intersected with the sphere to determine the fingertip location in 3D space. When calculating the intersection, two values will be returned, and only the one closest to the camera will be used. It should be noted that, in practice, the shoulder position has a slightly greater depth value than the eye position. The following diagram illustrates this.

Figure 6.13: Depth deviation between the shoulder position (S) and the eye position (E).

This deviation needs to be accounted for after the display plane is calibrated (explained in section 6.5 ), after which a plane normal is calculated for the large display. This can be used to calculate the depth offset (z) as shown in Figure 6.14. We assume that the eye and the shoulder positions are on the same plane, parallel to the large display.

Figure 6.14: The depth deviation can be calculated based on the plane normal of the display. E = eye position; S = shoulder position; n = plane normal; Y, Z = y and z components of the plane (with hypotenuse H); y, z = y and z components of SE (with length h).

From the plane normal n, we can assume:

Y = nz ,   Z = ny

By similar triangles and Pythagoras' theorem, we can deduce:

y = nz · h / √(Y² + Z²) ,   z = ny · h / √(Y² + Z²)

Then, the shoulder position after adjustment is:

Sx = Ex + x
Sy = Ey − nz · h / √(Y² + Z²)
Sz = Ez + ny · h / √(Y² + Z²)

where h is the measured vertical difference, and x the measured horizontal difference, between the eye and shoulder positions when the person is standing upright.
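A minimal sketch of this adjustment is shown below (hypothetical helper names; the signs follow the reconstruction above, with the shoulder below and slightly deeper than the eye, and should be checked against the actual camera and display orientation).

#include <cmath>

struct Vec3 { double x, y, z; };

// Offset the eye position E by the measured eye-shoulder differences (h vertical,
// xOffset horizontal), taking the tilt of the display plane (normal n) into account.
Vec3 shoulderFromEye(const Vec3& E, const Vec3& n, double h, double xOffset)
{
    double denom = std::sqrt(n.y * n.y + n.z * n.z);  // length of (ny, nz)
    Vec3 S;
    S.x = E.x + xOffset;
    S.y = E.y - n.z * h / denom;   // shoulder below the eye
    S.z = E.z + n.y * h / denom;   // shoulder slightly deeper than the eye
    return S;
}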

6.4.1 Fingertip detection We have assumed that the user will be using a fully stretched arm with their index finger extended; hence the index finger will be closest to the camera. It would be possible to capture the fingertip from a second camera directly on top, looking down at the user and their fingertip (top-down view). Skin colour detection could then be used to detect the fingertip as the furthest point. However, since we are only using one camera, we use the camera that is already being used to detect the user's face. Using the same principle, the camera needs to be high enough so that the fingertip becomes the lowest point in the image frame. We can then extract this point using skin colour detection. This gives us a second position in terms of x and y coordinates in the image plane (apart from the eye point), as illustrated below:

Figure 6.15: An illustration of the webcam's view (image plane of width imgw and height imgh, with origin at (0,0)).

Figure 6.16: An illustration of the side-on view. We can clearly see the unknown we need to determine: the distance between the fingertip and the webcam.

Unlike the determination of the eye, we cannot determine the world coordinates of the fingertip just by using the offset from the centre of the image plane as in Section 6.3. This is because we do not yet have an estimate of the distance between the fingertip and the camera, as illustrated in Figure 6.16.

6.4.2 Imaginary Point In its current form, it is difficult to determine the exact depth of the fingertip directly; only the vector from the camera to the fingertip is known. Our idea is to assume an imaginary point P on the fingertip vector, at the same distance from the camera as the eye position (Figure 6.17).

Figure 6.17: P indicates the location of the imaginary point; X indicates the actual displacement of P from the eye position E; Ez indicates the distance from the camera to the eye.

The z-coordinate of the imaginary point (Pz) in the world reference frame would then be the same as the eye (Ez):

Pz = Ez    (6.6)

Similar to the determination of the eye, we can use the ratios in equation 6.2 and use similar triangles to determine X (the distance between P and E).

Figure 6.18: (a) An illustration of the webcam's view (image plane). (b) A top view of the real world coordinate representation.

X = Ez · Rwd · (x / imgw)    (6.7)

where
x = distance between the eye and the fingertip on the image plane (px)
X = distance between the eye and the imaginary point at depth Ez in world coordinates

We can then determine Px by adding or subtracting the distance X depending on the relative position of the fingertip and the eye position. The y coordinate can be similarly determined:

128 y Y  Ez  Rhd  (6.8) img h

6.4.3 Line Equation

Now that we have a vector from the camera's position (Cx, Cy, Cz) to the imaginary point (Px, Py, Pz) in world coordinates, we can estimate a line going through these two points:

(x − Cx) / (Px − Cx) = (z − Cz) / (Pz − Cz)    (6.9)

(y − Cy) / (Py − Cy) = (z − Cz) / (Pz − Cz)    (6.10)

As the camera centre is the origin of the camera coordinate frame, (Cx, Cy, Cz) = (0, 0, 0), and the equations reduce to:

x = z · Px / Pz    (6.11)

y = z · Py / Pz    (6.12)

6.4.4 Intersection with sphere

Now that we have estimated a vector from the webcam to the imaginary point, we need to estimate the fingertip position. The fingertip position is the intersection between this vector and the sphere centred at the shoulder. As mentioned previously, the shoulder is assumed to have the same depth as the face.

Figure 6.19: The sphere's centre is at the shoulder (S) and its radius r is the arm's length; the ray from the camera (C) through the imaginary point (P) intersects the sphere at the fingertip (F).

Let the coordinate at the shoulder be S = (Sx,Sy,Sz)

The sphere equation is (x − Sx)² + (y − Sy)² + (z − Sz)² = r², and when intersecting it with the line equation, we substitute x and y using (6.11) and (6.12), which gives:

(z · Px/Pz − Sx)² + (z · Py/Pz − Sy)² + (z − Sz)² = r²    (6.13)

Expanding and rearranging gives us:

a·z² + b·z + c = 0    (6.14)

where

a = (Px² + Py²) / Pz² + 1
b = −2 (Sx · Px/Pz + Sy · Py/Pz + Sz)
c = Sx² + Sy² + Sz² − r²

We can thus solve for z using the quadratic formula. Two solutions will be produced:

z = (−b + √(b² − 4ac)) / 2a   and   z = (−b − √(b² − 4ac)) / 2a    (6.15)

The solution with a z value less than that of the eye position will be used, as it represents a location between the camera and the user. Using (6.11) and (6.12), the fingertip position on the webcam-to-fingertip line is therefore:

Fx = z · Px / Pz
Fy = z · Py / Pz
Fz = z

We now have the fingertip position (Fx, Fy, Fz).
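A minimal sketch (hypothetical helper names) of this final computation, solving the quadratic in equation (6.14) and keeping the root closer to the camera, is shown below.

#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };

// Intersect the camera-to-imaginary-point ray with the sphere centred at the
// shoulder S (radius r = arm length) to recover the fingertip F. Returns false
// if the ray does not reach the sphere.
bool fingertipFromRay(const Vec3& P, const Vec3& S, double r, Vec3& F)
{
    double kx = P.x / P.z, ky = P.y / P.z;            // ray: x = kx*z, y = ky*z
    double a = kx * kx + ky * ky + 1.0;
    double b = -2.0 * (kx * S.x + ky * S.y + S.z);
    double c = S.x * S.x + S.y * S.y + S.z * S.z - r * r;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return false;
    double z1 = (-b + std::sqrt(disc)) / (2.0 * a);
    double z2 = (-b - std::sqrt(disc)) / (2.0 * a);
    double z = std::min(z1, z2);                      // solution closer to the camera
    F = { kx * z, ky * z, z };
    return true;
}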

To determine the validity of this approach we implemented this idea and conducted a short experiment to test the accuracy of our algorithm.

6.4.5 Implementation Similar to the implementation of the depth estimation using face information in Section 6.3.1, the same equipment and development environment are used here. Fingertip detection is separated into three steps. The first step involves skin colour detection. We used a set of rules for skin colour detection based on a constructive induction algorithm by Gomez and Morales [55]:

Gomez and Morales’s rule for skin colour detection [55]

These rules are based on simple arithmetic operations that change the representation space:
- luminance is normalized at (r+g+b)²
- the red-green channel is used for the first and third conditions.

This method has been shown to provide fast and reasonably accurate skin detection [55, 150], which is particularly important for real-time human interactive systems such as ours. The second step is to recognize the fingertip from the pixels classified as skin. The camera can be placed high enough so that, when looking down at the user and their arm, the fingertip is the lowest skin pixel detected. Our camera was placed on top of the large display so that it could capture both the user's face and their fingertip. The third step relates to tracking the fingertip while the user's arm moves from one place to another. In the initial frame, skin colour detection is performed on the whole image frame. On subsequent frames, the search area is reduced to only the pixels adjacent (for example, 10 pixels on all sides) to the previously detected position. The advantage of this is that the speed of the detection process is increased, and it is less susceptible to noise (e.g. other skin colour in the scene). It should be noted that we have shown only one way of detecting the fingertip; other methods can also be used to achieve this.
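A minimal sketch of the second and third steps is shown below (hypothetical helper names, modern OpenCV C++ API, assuming an 8-bit BGR frame). The skin-colour test here is a simple stand-in rule rather than the Gomez and Morales rule used in the actual implementation, and the refinement of skipping the first few skin-coloured pixels is omitted for brevity.

#include <opencv2/opencv.hpp>
#include <cstdlib>

// Simple stand-in skin test on a BGR pixel (not the rule used in this thesis).
static bool isSkin(const cv::Vec3b& bgr)
{
    int b = bgr[0], g = bgr[1], r = bgr[2];
    return r > 95 && g > 40 && b > 20 && r > g && r > b && std::abs(r - g) > 15;
}

// Find the lowest skin-coloured pixel, restricting the search to a window
// around the previously detected fingertip on subsequent frames.
bool findFingertip(const cv::Mat& frame, cv::Point& tip,
                   bool havePrevious, cv::Point previous, int margin = 10)
{
    cv::Rect search(0, 0, frame.cols, frame.rows);
    if (havePrevious)
        search &= cv::Rect(previous.x - margin, previous.y - margin,
                           2 * margin + 1, 2 * margin + 1);

    for (int y = search.y + search.height - 1; y >= search.y; --y)      // scan bottom-up
        for (int x = search.x; x < search.x + search.width; ++x)
            if (isSkin(frame.at<cv::Vec3b>(y, x))) { tip = cv::Point(x, y); return true; }
    return false;
}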

6.4.6 Experiment

We conducted a simple experiment to gauge the accuracy of our approach to fingertip detection in the x and y coordinates. We examine the position of the fingertip by itself. This shows us the accuracy of the system in determining the imaginary point, thereby testing the process of converting a position from image plane coordinates to world coordinates. It should be noted that the x and y positions of the face detection process also rely on this method (please refer to 6.3.4). A webcam was placed on a tripod, pointing parallel to the ground. A white board was placed exactly 1 metre in front of the webcam, with a marked grid of positions covering 60 cm x 50 cm. The centre of this grid is (0, 0). Two lines are marked on the webcam's image: one drawn horizontally, dividing the image into two halves, and the same done vertically. We then line up the webcam so that the dividing lines on the image align as closely with the grid as possible. This process is shown in Figure 6.20. Similarly to the previous derivation, this assumes that the origin of the world coordinate frame is at the centre of the webcam, with the z direction pointing directly out towards the white board. The centre of the grid on the white board therefore has position (0, 0, 100) in centimetres in world coordinates.

Figure 6.20: An illustration of the grid alignment process with the webcam.

A fingertip is positioned in front of the white board. To detect the fingertip, we first use the aforementioned skin colour detection algorithm. The pink colour in the figure indicates the first 10 detected skin colour pixels, scanning from the bottom to the top of the image. We used the 10th skin colour pixel to represent the fingertip. Informal empirical evaluation revealed that this is much more robust and stable against false positives than using the first skin colour pixel encountered. A green crosshair is used to indicate the detected fingertip position. Using the video image as feedback, we align the green crosshair with one of the marked positions (Figure 6.21). This has the effect of establishing a known vector from the webcam to the position of the fingertip. Using our method from 6.4.2, and assuming a depth of 100 cm, we can determine the error between the calculated imaginary point and the corresponding marked position on the grid.

Figure 6.21: Aligning the detected fingertip (green crosshair) with a marked position on the grid.

6.4.7 Results

For each grid position, we performed 3 trials. The results of the trials were averaged for each position. The following table shows the imaginary points in x and y, in centimetres (z is assumed to be 100). The results are grouped with respect to their corresponding position on the grid.


                          X       Y        X       Y        X        Y
Actual grid position      30      25       0       25      -30       25
Mean detected location    29.34   24.29    0       24.12   -29.09    24.12
Mean difference           -0.66   -0.71    0       -0.88     0.91    -0.88
Actual grid position      30      0        0       0       -30       0
Mean detected location    29.34   0        0       -0.17   -29.09    -0.08
Mean difference           -0.66   0        0       -0.17     0.91    -0.08
Actual grid position      30      -25      0       -25     -30       -25
Mean detected location    29.42   -24.45   0       -24.37  -29.09    -24.20
Mean difference           -0.58   0.55     0       0.63      0.91     0.80

Table 6.4: Table of average detected location and differences for each grid position. All values are measured in centimetres.

It can be observed from Table 6.4 that the mean detected locations are very close to the actual locations. All of the mean differences are less than 1 cm from the actual position. The closer the position is to the centre of the grid, the better our method performs, particularly at location (0, 0), where the difference is 0 for the x coordinate and 2 millimetres for the y coordinate. The worst performance was at the top right corner, where the error was around 0.9 cm for both coordinates. We can conclude that our method provides an adequate estimation of the imaginary point and thus of the webcam-to-fingertip vector. It also means that this method provides an adequate estimation of the x and y positions of the detected face position.

6.4.8 Frontal Fingertip Detection

A back-projected display was used to project an image of 81.5 cm x 61.5 cm on the screen with a resolution of 1024x768 pixels. The screen was 125 cm off the ground and the webcam was positioned 207 cm off the ground, on top of the display. We observed a small discrepancy with our fingertip detection method when using the lowest skin colour pixel as the fingertip. Due to the need for face detection, the height of the camera is restricted. With a full arm stretch and the fingertip pointing directly outwards towards the target, we observed that the lowest skin colour pixel is in fact not the fingertip. Figure 6.22(a) shows a portion of an image taken from the

webcam. We can observe a discrepancy between the actual index fingertip location (indicated by the red arrow) and the detected lowest fingertip position (green cross). Figure 6.22(b) shows the user's view from their dominant eye while pointing to a target (crosshair).


Figure 6.22: (a) webcam’s view showing fingertip detection discrepancy (b) user’s view showing the pointing action

The simplest solution would be to have the user wear a plastic fingertip, similar to those used by cashiers. However, this would defeat our aim of not requiring users to wear anything. The problem can instead be solved by lowering the index finger (without lowering the arm) so that the fingertip is lower than the rest of the hand. This is demonstrated in Figure 6.23(a). Although it does not represent the way humans normally point, the dTouch model of pointing still applies, as the fingertip direction is irrelevant. The user's eye still lines up with the fingertip and the target, as shown in Figure 6.23(b). We envision that this would not pose too much of a problem for users.


Figure 6.23: (a) side view of the lowered index finger (b) user’s view illustrating the new pointing action

Figure 6.24 shows two images of the same scene. The first image shows the environment before skin colour detection. In the second image, pink is overlaid on those pixels that were detected as skin colour.

Figure 6.24: Before and after skin colour detection

6.5 Step 3: Resultant position estimation

Knowing the two points (the eye point and the fingertip point) in 3D world coordinates, it is straightforward to calculate the resultant vector. We can then intersect this vector with the screen and determine a resultant on-screen position. In this section, we present two methods for finding the on-screen position. In both cases, the centre of the camera is assumed to be the origin of the Euclidean coordinate frame, which is also known as the world coordinate frame.

6.5.1 Simplified Model

As the display itself is not in the view of the webcam, the system has no way of knowing the exact location and size of the display. For simplicity, we can assume that the web camera is placed directly on top of the display, facing perpendicularly outwards from the screen. That is, the screen lies in the same plane as the webcam's position, and we thereby approximate a display plane parallel to the camera's image plane. This means that the display surface is at the z = 0 plane of the webcam.


Figure 6.25: Plane z = 0 shared between webcam and display

When the resultant vector (from eye to fingertip) intersects with this plane, a resultant point can be deduced on the display surface. The final resultant point in world coordinates can be calculated as:

138 Ex  Fx Px  Fx  Fz Ez  Fz

Ey  Fy Py  Fy  Fz Ez  Fz

Pz  0

Knowing the final resultant point in world coordinates, we need to transform it into a pixel position on the 2D projected image, in image coordinates.

(For example, a pixel value based on a screen size of 1024x768 px.) As the display itself is not in the view of the web camera, the system has no way of knowing the size and exact location of the display, only that it lies on the z = 0 plane. The projected image may also not be exactly rectangular due to distortion. One approach is to add a calibration phase to the system before use. During this phase, the calibrator points to the four corners of the projected image displayed on-screen, and the four resultant points in world coordinates are recorded. We can then compute a 2D homography so that the four corner points on the display plane (in world coordinates) are matched to the four corner points of the rectangular image plane (in image coordinates). This phase only needs to be done once. A 2D-to-2D transformation from world coordinates to image coordinates can then be expressed as a planar homography. Let x = (x, y, 1) be a point in homogeneous coordinates in the world space and X = (X, Y, 1) be the corresponding point in homogeneous coordinates in the image space. They are related by a 3x3 matrix H, called a planar homography:

X = Hx

Xw h11 h12 h13 x Yw  h h h y    21 22 23    w  h31 h32 h33 1

where w ≠ 0 is the scaling factor and h33 = 1.


This gives us

 h11x  h12 y  h13 h21x  h22 y  h23  (X ,Y )   ,   h31x  h32 y  h33 h31x  h32 y  h33  and expanded, gives two equations:

h11 x  h12 y  h13  X h31 x  h32 y  h33   0

h21 x  h22 y  h23  Y h31 x  h32 y  h33   0

Since the matrix H has 8 degrees of freedom, we can determine H by using at least four such point correspondences. A system of eight linear equations can then be composed and used to solve H.

\begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1 X_1 & -y_1 X_1 & -X_1 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1 Y_1 & -y_1 Y_1 & -Y_1 \\
x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2 X_2 & -y_2 X_2 & -X_2 \\
0 & 0 & 0 & x_2 & y_2 & 1 & -x_2 Y_2 & -y_2 Y_2 & -Y_2 \\
x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3 X_3 & -y_3 X_3 & -X_3 \\
0 & 0 & 0 & x_3 & y_3 & 1 & -x_3 Y_3 & -y_3 Y_3 & -Y_3 \\
x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4 X_4 & -y_4 X_4 & -X_4 \\
0 & 0 & 0 & x_4 & y_4 & 1 & -x_4 Y_4 & -y_4 Y_4 & -Y_4
\end{bmatrix}
\begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \\ h_{33} \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}

We thus have a matrix equation of the form Ah = 0, where h is a 9-element vector. We can solve this by determining the null space of A using methods such as singular value decomposition. After the screen has been calibrated, at each frame while the user is interacting with the screen, the homography matrix allows the new final resultant point to be transformed into a 2D pixel position in the image space.

X ~ Hx then gives a direct mapping between points:

140 Xw h11 h12 h13 x Yw  h h h y    21 22 23    w  h31 h32 h33 1

Xw X  w Yw Y  w

(X, Y) is then the final pixel coordinate on the large display.
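The calibration and mapping just described could be implemented along the following lines. This is a hedged NumPy sketch rather than the thesis code: estimate_homography builds the system Ah = 0 from the four corner correspondences and solves it via singular value decomposition, and apply_homography maps a new resultant point on the z = 0 plane to a pixel position.

    import numpy as np

    def estimate_homography(world_pts, image_pts):
        """Estimate the 3x3 planar homography H from four (or more)
        correspondences between (x, y) points on the z = 0 display plane and
        (X, Y) pixel positions, by taking the null space of A (Ah = 0)."""
        rows = []
        for (x, y), (X, Y) in zip(world_pts, image_pts):
            rows.append([x, y, 1, 0, 0, 0, -x * X, -y * X, -X])
            rows.append([0, 0, 0, x, y, 1, -x * Y, -y * Y, -Y])
        A = np.asarray(rows, dtype=float)
        _, _, Vt = np.linalg.svd(A)
        H = Vt[-1].reshape(3, 3)         # right singular vector of the smallest singular value
        return H / H[2, 2]               # normalise so that h33 = 1

    def apply_homography(H, x, y):
        """Map a resultant point on the display plane to screen pixels."""
        Xw, Yw, w = H @ np.array([x, y, 1.0])
        return Xw / w, Yw / w

During use, estimate_homography would be run once on the four calibration points, and apply_homography applied to each new resultant point.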

6.5.2 Parallel Plane Model

In most cases, however, we expect the screen position to be higher than the user, so the webcam will need to be adjusted (angled downwards in our case) so that the user stays within the webcam's view. This is shown in the following diagram.


Figure 6.26: offset between z=0 plane and display

As can be seen, the plane of the display's location is no longer in line with the z = 0 plane of the world/camera coordinate frame (a rotation about the x-axis). We also anticipate that the webcam and the display will not line up exactly in the y-axis. In these situations, if the method in section 6.5.1 is used, the display would be estimated to lie on the z = 0 plane, which is at an angle to the true position of the screen. In theory, it would still produce reasonable accuracy, as long as the user does not move their head position after the four-corner calibration. Once the head position moves, the final pointing position becomes inaccurate.

Parallel Plane Estimation

Our approach to solving this problem is to calibrate the display plane using a checkerboard pattern held parallel to the screen. This allows us to determine a plane that is parallel to the screen: the plane of the display can be calculated directly from the plane normal of the checkerboard. This is shown in the next diagram.


Figure 6.27: checkerboard is used to define the plane that the display lies on

Figure 6.28: an example of checkerboard used to calibrate the display plane


Assuming the pinhole camera model, the camera's external parameters can be determined using a camera calibration process [14]; these include the 35 corner points found on the checkerboard (as illustrated in Figure 6.28) in the 3D grid coordinate frame (where the corner point marked as position 0 has coordinates (0, 0, 0)). It is then possible to transform the points found into their corresponding points in the camera (world) coordinate frame. Let X_grid = (X, Y, Z)^T be the coordinate vector of a corner point in the grid coordinate frame, and X_cam = (X_c, Y_c, Z_c)^T be the coordinate vector of the same point in the camera coordinate frame.

The corresponding points are related to each other through:

X cam  r11 r12 r13 X grid  t1         Y  r r r Y  t  cam   21 22 23  grid   2         Zcam  r31 r32 r33 Z grid  t3  or concisely as:

Xcam = R Xgrid + t

where R is a rotation matrix representing the orientation of the grid pattern in the camera coordinate frame, and t is a translation vector representing the origin of the grid pattern (position 0) in the camera coordinate frame. R and t are outputs of the camera calibration process.

The surface normal of the grid pattern in the camera frame can be calculated using a cross product:

n  P0 P4  P0 P30

where n is the surface normal vector and P_i is the coordinate vector of position i in the grid pattern. Alternatively, the third column of the rotation matrix can also be used to represent the surface normal vector [15]:


r13  n  r   23  r33 

Having found the surface normal, the plane that the display lies on (in the form of Ax+By+Cz = D) is:

n_x x + n_y y + n_z z = 0

It should be noted that this plane intersects the origin of the camera coordinate frame at (0,0,0).

Screen Position Calibration

Having found the plane the display lies on, we can deduce the resultant point by intersecting the pointing direction vector with the display plane.


Figure 6.29: Intersection of resultant vector and display plane

E = (x_e, y_e, z_e) is the point at the eye

F = (x_f, y_f, z_f) is the point at the fingertip

I = (x_i, y_i, z_i) is the point of intersection

As noted in [16], the solution to the line-plane intersection, adapted to our setup, is:

I  E u(F  E)

x x  y y  z z u  n e n e n e xn (xe  x f )  yn (ye  y f )  zn (ze  z f )

As in the simplified model, knowing the final resultant point in camera coordinates, we need to transform it into a pixel position in the 2D projected image.

Here, we also use a calibration phase to obtain four vectors that intersect with the screen so that we can define its size, and we record the four resultant points in world coordinates.

Transformation to Screen Coordinates

In the previous model, a parallel plane was assumed, which meant that the resultant position on screen was always of the form (x, y, 0); this only required a 2D-to-2D transformation from camera coordinates to screen coordinates. In this model, however, the resultant point is in 3D coordinates, so a 3D-to-2D projective transformation is required. This can be achieved using the method in [128], where the translation, rotation and scaling involved in the transformation are summarised in a 2x4 matrix.

Let x = (x, y, z, 1)^T be a point in homogeneous coordinates in the camera coordinate frame and X = (X, Y)^T be the corresponding point in the screen coordinate frame. The corresponding points are related through a projective transformation matrix:

x X a a a a y    11 12 13 14        Y  a21 a22 a23 a24 z   1

As we have four such point correspondences, the system of linear equations can be rewritten as:

\begin{bmatrix}
x_1 & y_1 & z_1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & x_1 & y_1 & z_1 & 1 \\
x_2 & y_2 & z_2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & x_2 & y_2 & z_2 & 1 \\
x_3 & y_3 & z_3 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & x_3 & y_3 & z_3 & 1 \\
x_4 & y_4 & z_4 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & x_4 & y_4 & z_4 & 1
\end{bmatrix}
\begin{bmatrix} a_{11} \\ a_{12} \\ a_{13} \\ a_{14} \\ a_{21} \\ a_{22} \\ a_{23} \\ a_{24} \end{bmatrix}
=
\begin{bmatrix} X_1 \\ Y_1 \\ X_2 \\ Y_2 \\ X_3 \\ Y_3 \\ X_4 \\ Y_4 \end{bmatrix}

or concisely as: xA = X

We can then solve for A using a least-squares approach with the pseudo-inverse:

A  (xT x) 1 xT X

The resultant position after transformation will be in pixel values in the screen coordinate frame. It should be noted that this is the method used in our implementation.
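A least-squares fit of this 2x4 mapping could look like the sketch below, again with illustrative names and under the assumption that exactly the four calibration correspondences are available; numpy.linalg.pinv plays the role of the pseudo-inverse (x^T x)^{-1} x^T.

    import numpy as np

    def fit_screen_mapping(cam_pts, screen_pts):
        """Fit the 2x4 projective mapping A from 3D calibration points in
        camera coordinates to their 2D screen positions (least squares)."""
        M, b = [], []
        for (x, y, z), (X, Y) in zip(cam_pts, screen_pts):
            M.append([x, y, z, 1, 0, 0, 0, 0])
            M.append([0, 0, 0, 0, x, y, z, 1])
            b.extend([X, Y])
        a = np.linalg.pinv(np.asarray(M, float)) @ np.asarray(b, float)
        return a.reshape(2, 4)

    def to_screen(A, point3d):
        """Map an intersection point I = (x, y, z) to screen pixel coordinates."""
        x, y, z = point3d
        return A @ np.array([x, y, z, 1.0])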

6.6 Estimation error

A prototype of the dTouch pointing system was implemented using the methods proposed in the previous sections of this chapter. The system's accuracy was tested on one user. The user's body measurements were collected, adjusting for the user's hand preference and eye dominance. The user was then asked to stand at a distance of 160 cm from the display and to perform the calibration process by pointing to the four corners of the display. The display was 81.5 cm x 61.5 cm with a resolution of 1024x768. The webcam used was a Logitech QuickCam Pro 4000 at a resolution of 320x240. A target marked with a crosshair, 200 px in both width and height, was placed at the centre of the display. The user was asked to point to the centre of the target for 5 seconds, keeping the hand steady. This was done directly after the calibration process so that factors such as changes in the user's posture were reduced to a minimum. During the calibration phase, a blue circle was shown to indicate the current pointing position (similar in function to a mouse cursor), in order to ensure a successful calibration; otherwise, the calibration process was repeated. After the calibration process, the blue circle was removed in order to measure the true accuracy provided by the system. Figure 6.30 shows the error produced by the system without filtering. It should be noted that the results may be affected by hand tremor and the inaccuracy of the user's aim. The x and y coordinates are plotted on the same axis.

[Plot of the estimated pixel position (px) for the x and y coordinates against frame number, without filtering.]

Figure 6.30: System’s estimated position with no filtering

Figure 6.31: Scatter plot of system's estimated positions with no filtering.

Figure 6.31 illustrates the estimated positions in a scatter plot. The mean position for the x coordinate is 32.79 pixels away from the target (0,0), while the mean position for the y coordinate is -58.4. As observed, there is a large systematic variation on the y axis. From our analysis, this may be due to limitations of the face detection algorithm used, as it was not designed to detect face width accurately. This affected the depth estimation, which in turn resulted in variations in the resultant positions. The x positions are not affected to the same extent, as the difference in the x coordinate between the eye and fingertip is less significant (in this scenario). We will investigate this factor in a more detailed discussion in Chapter 7.4. We hypothesized that this error can be reduced with a more reliable face detection algorithm. To evaluate the potential accuracy of our system, we assumed that the system had accurately estimated the user's depth. This means that the user's distance from the camera was manually set to 160 cm. The results are presented in Figure 6.32 and Figure 6.33 and were extracted from a separate trial with the same user.

Figure 6.32: System's estimated position with depth assumed accurate

Figure 6.33: Scatter plot of system's estimated positions

The above two figures illustrate the estimated positions. It should be noted that the plot is enlarged so that it only represents an area of 160 px by 160 px. The mean position for the x coordinates is 26.14 pixels with a standard deviation of 16.81, while the mean position for the y coordinates is -28.08 with a standard deviation of 13.00. This accuracy is higher than in the previous set of data, meaning that the depth estimation was indeed the weakest link in the system. The observed grid-like positioning of the estimated positions may be attributed to the limited resolution of the virtual touchscreen. The virtual touchscreen is illustrated in Figure 6.34 as a blue trapezium. It represents the allowable positions of the fingertip that keep it pointing at the display. For example, the user is shown in the figure to be pointing to the centre of the display, where the target is positioned.

Figure 6.34: A webcam view showing the size and position of the virtual touchscreen.

The size of the virtual touchscreen, in this case, is around 105 x 85 px, while the display is at 1024x768. The smallest distance between two possible fingertip positions projected on the large display is 1024/105 = 9.75 pixels in width and 768/85 = 9.04 pixels in height. This explains the appearance of the grid pattern. In addition, the eye position may move slightly during the course of the pointing process, and therefore we also see more varied pointing positions not aligned to the grid. The resolution of the webcam can be increased to 640x480 px; however, informal experimentation revealed a decrease in frame rate to around 5 frames per second. Since real-time interaction was deemed more important, the resolution remained at 320x240 px. To improve the accuracy of the system, Kalman filtering was used. The smoothing filter was first applied to the face bounding box, as it was observed to fluctuate the most, in comparison to the fingertip position, which was observed to be quite stable. We implemented a 1st order Kalman filter, similar to the one used in [63], for each of the x and y coordinates. A covariance variable of 80 was used to adjust the degree of damping. The following two figures show the accuracy of our system with the eye position smoothed.


Figure 6.35: System’s estimated position with filtering for eye position.

Figure 6.36: Scatter plot of system's estimated positions with filtering for eye position.

It should be noted that a damping value of 80 may be considered quite high and may introduce lag to the system. This is because, with an increasing damping value, the smoothness is also increased; however, it also increases the time the estimate takes to become stable, and thus it does not react to sudden movement. A more appropriate value may be required to trade off the amount of smoothness against system response time in real usage scenarios. After eye position filtering, the mean positions are -5.12 for the x coordinate and -21.03 for the y coordinate. It can be observed that the accuracy has increased and the range of the positions within the 5 seconds has reduced. However, the grid-like positioning of the estimated positions is still clearly visible. To further increase the accuracy, the same filtering method was applied to the resultant pointing position, that is, the pixel position represented on the large display. This is illustrated in Figure 6.37 and Figure 6.38.
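The kind of per-coordinate filter described above can be sketched as follows. This is only an illustration of a first-order (constant-position) Kalman filter with a tunable measurement covariance acting as the damping value; it is not a reproduction of the filter from [63] or of the thesis code.

    class ScalarKalman:
        """Minimal 1D constant-position Kalman filter; one instance is used
        per coordinate (e.g. the x and y of the face bounding box, or of the
        resultant on-screen position)."""

        def __init__(self, damping=80.0, process_noise=1.0):
            self.r = damping          # measurement noise covariance (degree of damping)
            self.q = process_noise    # process noise covariance
            self.x = None             # filtered estimate
            self.p = 1.0              # estimate covariance

        def update(self, measurement):
            if self.x is None:
                self.x = float(measurement)
                return self.x
            self.p += self.q                         # predict: state assumed constant
            k = self.p / (self.p + self.r)           # Kalman gain
            self.x += k * (measurement - self.x)     # correct with the new measurement
            self.p *= (1.0 - k)
            return self.x

    # e.g. smoothing the eye position: fx, fy = ScalarKalman(80.0), ScalarKalman(80.0)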

Figure 6.37: System's estimated positions after resultant point filtering

Figure 6.38: Scatter plot of system's estimated positions after resultant point filtering


The mean position for the x coordinate is -5.19, while the mean for the y coordinate is -12.7. As can be seen, the accuracy has improved and the pointing positions have stabilized. All of these processes were used to obtain results for a best-case scenario. Further results are shown in the next chapter.

6.7 Summary

In this chapter, we have proposed methods for estimating 3D positions using monocular computer vision. The eye position was estimated using the width of the face and a width-to-depth ratio specific to the camera used. The fingertip position was then determined by intersecting a sphere, defined by the user's shoulder position and arm length, with the vector from the camera to the fingertip position detected in the camera's image. A vector starting at the eye position and extending to the fingertip position is used to represent the pointing direction. This vector is then intersected with a plane that represents the position of the display. By using a transformation matrix, we were able to determine the exact screen coordinates that the user is pointing to. In addition, we also demonstrated the accuracy of our proof-of-concept prototype in the best case scenario. In the next chapter, we will present four controlled usability experiments with multiple users, which illustrate the accuracy provided by our pointing system under varying factors.


Chapter 7 Experimental Evaluation

In the last chapter, we presented the methods used to implement a proof-of-concept dTouch pointing system. In this chapter, we present an evaluation of our prototype system through experimental evaluation. In order to verify the usability of our system, we invited volunteers to act as users. Four usability studies were conducted:
- to compare the accuracy of the dTouch pointing technique with a similar pointing system, and the effect of feedback on pointing accuracy;
- to determine the optimal size for a target in a system such as ours;
- to evaluate the tolerance of our system with respect to the amount of calibration necessary for each individual user to achieve reasonable accuracy;
- to evaluate the system's ability to handle varying user standing locations.

To begin with, the apparatus and procedures common to all four experiments are described, followed by a detailed account of each experiment.

7.1 General Experimental Setup


Figure 7.1: A diagrammatical illustration of our experimental setup

A back-projected display was used to project an image on a screen of 102 cm x 82 cm with a resolution of 1280 x 1024 pixels. The screen was elevated 125 cm off the ground at the lowest end and 207 cm at the highest point. To reduce the number of factors being tested in each experiment, subjects were asked to stand at a fixed distance of 160 cm from the screen by default, and this depth is assumed to have been estimated accurately by the system (except in experiment 2, where depth estimation is included as a factor for testing). Using this setup, however, it was discovered that when interacting with the screen, the user's face is quite often occluded by the user's pointing hand, inhibiting the system's ability to perform face tracking, especially when a right-handed user tries to point to the upper left section of the screen. To minimize this occurrence, our windowed application was resized to 1024x768 and moved to the lowest position possible, occupying 81.5 cm x 61.5 cm. This translates to 0.80 mm per pixel horizontally and vertically. Although this did not eliminate the problem entirely (particularly when calibrating the four corners), the number of occurrences was minimized sufficiently to enable the continuation of the pointing tasks within tolerable limits.


Figure 7.2: Diagram of display used and the target position.

The webcam used in these experiments was a Logitech Quickcam Pro 4000, capturing at a resolution of 320x240 with an average frame rate of 22 frames per second (post image processing). It was placed centred on top of the screen. The accuracy of the system is defined as the discrepancy between the user's intention and the system's estimation of the user's intention. Unless stated otherwise, in all experiments the accuracy was attained by asking the subjects to point to a target (crosshair), 200 px in both width and height. Users began by resting their hand on a table directly in front of them, which was approximately at waist height for most subjects (95 cm). This ensured that the user's pointing hand was within the camera's view and was being captured at the start of each trial, to prevent the system from losing the hand trajectory. As there were difficulties in detecting the fingertip front-on from the webcam's view, subjects were asked to lower their index finger (without lowering their arm) so that it appeared as the lowest skin colour pixel in the webcam's view. Although it does not represent the way humans normally point, the models of pointing remain the same, as the fingertip direction is irrelevant. These experiments were conducted in a controlled laboratory environment. The amount of interference in the environment captured by the camera was minimized, for example by simplifying the surrounding background and blocking out sunlight. In our current implementation, subjects were told to avoid sensitive colours such as red, orange and brown, and wore long-sleeved clothing, for improved skin colour detection, thereby minimizing false detections.

Figure 7.3: A photo of the experimental setup.

7.1.1 The Task

We envisioned that our dTouch pointing system would be used in situations where occasional, quick, and convenient interaction is required, such as during a presentation or at an information kiosk (as opposed to the highly accurate and continuous mouse movement required for typical personal computing usage). Therefore, the task was to point to a target on a projected display, as comfortably as possible. Users point to the centre of the target with a straight arm using the dTouch technique to select the target. To achieve a successful selection, users must stay pointing at the target until their hand is deemed to have stabilised and is no longer moving. This is a common method for object selection where there is a lack of buttons for selection (for example, the buttons on a computer mouse) and is often called "dwelling". In our implementation, a dwell was recorded when the positions within a 1 second timeframe (a sequence of around 22 frames) did not deviate by more than 15 pixels within the sequence. This was a reasonable requirement, as it was observed in Chapter 6.6 that the achievable accuracy was 20 pixels within 5 seconds. When successfully selected, a red circle was highlighted over the target. Due to hand jitter, an average of 10 frames was used to determine the position at which selection occurred, as opposed to an instantaneous position from a single frame. Apart from the red circle, users were given no feedback (i.e. no continuous graphical "cursor" was shown, except in experiment 1, where this type of feedback was specifically tested as a factor).
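The dwell criterion described above amounts to a sliding-window stability test. The following Python sketch illustrates one way it could be realised, using the window length, deviation threshold and averaging length given in the text; the structure and names are ours, not the experiment software.

    from collections import deque
    import numpy as np

    def make_dwell_detector(window=22, max_dev=15, avg_frames=10):
        """Return a per-frame function: feed it the estimated (x, y) each frame
        and it reports a selection once the last `window` positions stay within
        `max_dev` pixels of each other, returning the mean of the last
        `avg_frames` positions as the selected point."""
        history = deque(maxlen=window)

        def feed(x, y):
            history.append((x, y))
            if len(history) < window:
                return None
            pts = np.asarray(history, dtype=float)
            spread = pts.max(axis=0) - pts.min(axis=0)
            if spread.max() > max_dev:
                return None                                   # hand still moving
            return tuple(pts[-avg_frames:].mean(axis=0))      # dwell detected

        return feed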

Figure 7.4: A red circle is highlighted over the target when it is selected successfully.

Unless stated otherwise, a four-second time limit was imposed on each pointing task to minimize fatiguing of the arm (except for experiment one, where it was five seconds). When the time limit was exceeded, a time-out error was recorded and the trial was not repeated; however, the accuracy (or the lack of it) was still recorded. It should be noted that the effect of target size on selection accuracy is tested specifically in experiment 2. Occasionally, the user's hand may be absent from the camera's view, either because the user had unintentionally hidden their hand, had moved it outside the camera's view, or because interference such as background noise was falsely identified as skin colour. When detection errors occurred through no fault of the user, the trial was restarted and did not count as an error, as it is an error caused by the detection system rather than by the (lack of) accuracy of pointing. For each trial, users were asked to move their hand towards the target once only for each pointing task and to stay motionless until the system had recorded their intended position. This was in line with the intended use of the system, where the user points to a target occasionally. However, when feedback was introduced in experiment 1, they were allowed to move more than once, but were asked to keep it to a minimum. Once completed, they could then lower their arm and relax. Volunteers were given a choice as to how many and which experiments they would like to participate in, as each experiment was done independently from the others.

7.1.2 Statistical Treatment

Unless otherwise specified, a 5% significance level was used when reporting results from t-tests and analysis of variance (ANOVA) tests. This meant that if the p-value was smaller than 0.05, there was a statistically significant difference between the sets of data being tested and the null hypothesis was rejected, which in most cases signified a difference between the means of the sets of data being tested. Selective multiple means comparison (a priori) was performed subsequent to all ANOVA tests unless stated otherwise; otherwise, post hoc testing, where all sets of data are compared to each other, was used. This allowed us to minimize the familywise error rate in order to keep the alpha value as close to 0.05 as possible.

7.2 Experiment 1: The effect of feedback on the dTouch and EyeToy targeting systems

To evaluate our system, we compared it with a similar hand pointing system that uses a related but different (Touch) model and also uses a single webcam.

7.2.1 2D Targeting System

The EyeToy game for PlayStation [129] is a good example of a common 2D

interactive system in use. When users place their hand in front of the interaction area, an image of themselves is mirrored on the display and virtual objects are overlaid on top of that image. By waving their hand over an object, they can interact with that particular object. This arrangement provides users with feedback regarding how far they have to stretch and where to position their hand in order to select an object. The size of the interaction area can be increased or decreased by placing the EyeToy camera closer to or further away from users.


Figure 7.5: An illustration of the EyeToy system.

This model of pointing only provides 2D interaction, as the system does not detect depth. Users adjust to the position of the interaction area and must stay inside it to interact with the system. This model of pointing is adopted and modified to suit our experiment. A webcam is placed on top of a large display and is used to detect skin colour pixels. The position of the fingertip is then translated to an on-screen cursor. Instead of using the full field of view of the webcam, as is the case for EyeToy, users were asked to define a rectangular area in front of them. This has two rationales: users can define an area that is more comfortable for them, rather than having to stretch their arm, and it also helps to reduce detection errors when the hand is directly in front of the head. Similar pointing techniques are used in [153], where it is called "relative pointing", and also in [46, 146, 160]. We will call this the "EyeToy" system or technique, and it is evaluated against our own system. The same fingertip detection system is used for both pointing techniques to ensure equality.

7.2.2 The Study

A usability experiment was conducted to evaluate our dTouch pointing system against the EyeToy system, both qualitatively and quantitatively, with different levels of feedback, as well as to evaluate the effect of repetition. EyeToy, being a 2D pointing technique and indirect, similar to the mouse, relies on the presence of a feedback loop between system and user. dTouch, on the other hand, does not: it relies on the user's natural pointing accuracy and can be accurate even when no system feedback is given (an inherent property of the dTouch targeting model). To be fair to both systems, we tested them in situations with and without feedback. We envisioned that dTouch would perform better with the use of discrete feedback, particularly when hand jitter and system lag may be present. We introduced three levels of feedback: no feedback, cursor feedback and target feedback. Cursor feedback was represented as a blue circle 10 pixels wide; this provided continuous feedback to the user about the system's estimated location. Target feedback was represented as a large green static circle and was only shown at the position of the target when the estimated position went inside the target. This is a form of discrete feedback. Another advantage of the dTouch technique is that users do not have to remember the location of their virtual touchscreen. Once calibrated, users are free to move around or move away from the immediate area. They can return to the same position (or a different one) and point without affecting their pointing accuracy. On the other hand, using the EyeToy technique, users must be able to remember their virtual interaction area, particularly when feedback is absent, which may be difficult if not impossible. To validate this claim, we introduced a second round to the experiment where users return from a short break and repeat the experiment.

7.2.3 Experimental Design

In this experiment we compared our dTouch system with the EyeToy technique in terms of time to target (speed), mean distance from target (accuracy), error rate (when users went over time, or outside the target area at selection time), and a qualitative measure (overall user preference). The study was a within-subjects study, where each subject performed the two techniques and three types of feedback per technique. For each of the 6 combinations, they were asked to point to a target at the centre of the screen as smoothly as possible. Subjects were asked to perform three blocks of trials and the mean position was determined. Half of all subjects began with dTouch, while the 3 feedback conditions within each technique were presented based on a Latin square, counter-balanced to avoid ordering effects. The order of the second round of the experiment was the same as the first to maintain consistency. It should be noted that results from the second round were only used to compare with their respective first round results, for the purpose of validating the difference in the two techniques' ability to allow accurate pointing even after moving away. For this experiment only, a 5 second time limit was imposed on each pointing task. In addition, a requirement for successful dwelling was that the recorded position must be within 160 pixels in radius from the target. This gave a generous amount of target area, approximately 31% of the vertical and 42% of the horizontal screen size. Otherwise, it was deemed off-target and dwelling was disabled, which eventually led to a time-out error. This requirement was necessary in order to categorize the type of errors made by each pointing technique. In summary, the experimental design was:

22 subjects
x 2 pointing techniques (dTouch, EyeToy)
x 3 feedback types (none, cursor, target)
x 1 target (centre target)
x 2 rounds (before and after short break)
x 3 blocks (per combination)
= 792 pointing trials

7.2.4 Participants

Twenty-two volunteers, 2 female and 20 male, aged from 20 to 36, participated in this experiment. They were mostly working or studying in the area of computer science or engineering. Three subjects were left handed, while 15 of them were right eye dominant. Most had used alternative input devices such as joysticks and touch screens. Fifteen had played the Nintendo Wii before, while only four had experienced the PlayStation EyeToy game once or twice. Participants were provided chocolates and juice as gratuity during the breaks.

7.2.5 Procedure

Participants started off by filling out an initial questionnaire indicating their details, including hand preference, eyesight, and experience with alternative input devices. Participants' physical dimensions were then measured, which were used to calculate the x and y deviations from the user's dominant eye to their right or left shoulder, depending on the hand they used. They were then given a learning phase on the differences between the dTouch and EyeToy systems. They were reminded that dTouch is a 3D pointing system where both the face and fingertip positions are detected, and a virtual touchscreen exists where their fingertip is, while EyeToy is a 2D pointing system where only the fingertip position is important. A calibration phase was then performed. For dTouch, they pointed at the four corners of the screen with a straight arm, while for EyeToy, each user defined a rectangular area or "virtual interaction area" in front of them; a straight arm was not required. They were given a chance to recalibrate if they were not comfortable with the calibration. A blue cursor was shown during this phase as feedback to the user and the experimenter. They were allowed a few practice runs.

7.2.6 Hypotheses

For the dTouch technique, we hypothesized that the presence of feedback would affect neither the accuracy nor the time taken for pointing, which is the main advantage of dTouch. For the EyeToy technique, cursor feedback should provide the best accuracy due to its small size, resulting in higher resolution, while no feedback would be the least accurate, as users are not given any indication of where their interaction area is. Target feedback should reduce the time taken for pointing, as the perceived size of the target is larger. Comparing the two techniques, dTouch would be more accurate than EyeToy when feedback is not given, due to the inherent difference between the two pointing models; but when feedback is given, the accuracy of both techniques should be similar. In terms of speed, dTouch would be quicker overall, while EyeToy would be slower due to the reliance on system feedback. We hypothesized that accuracy would degrade for the EyeToy-None condition when the experiment is repeated after a short break away from the setup, as users are provided no hint of where their interaction area is. For the same reason, when feedback is provided in the other EyeToy conditions, the time taken would increase. On the other hand, accuracy for dTouch would not degrade in any condition. Overall, we expected users to find dTouch more satisfying to use, as it is direct and natural to them, while EyeToy would be more comfortable, as there is less effort in lifting the arm.

7.2.7 Results and Discussion

Accuracy Analysis

Accuracy was measured as the distance from the recorded position to the centre of the target. Trials that had timed out were included in the analysis, as users were still pointing at the time limit and these trials thus represent their best effort. Some of these trials may also have timed out due to unsteady arms and hands. An analysis of these error rates will be discussed later in this section. Figure 7.7 below shows the mean distance from target for each technique with the three different feedback types and their interactions for all trials.

[Bar chart: mean distance from target (px) for each feedback type — dTouch: none 53, cursor 24, target 49; EyeToy: none 174, cursor 22, target 150.]

Figure 7.7: Effect of feedback on accuracy

A two-way ANOVA with repeated measures reveals a significant main effect for pointing technique on accuracy (F[1,21] = 22.6, p < 0.001), with means of 42 and 115 pixels for dTouch and EyeToy respectively. A significant main effect was also obtained for feedback type on accuracy (F[2,42] = 31.1, p < 0.001); the means for each feedback type are 113 (none), 23 (cursor) and 99 pixels (target). We also observed a significant interaction between technique and feedback type (F[2,42] = 13.6, p < 0.001). Multiple pairwise means comparisons within the dTouch technique show a significant difference between cursor and no feedback (t[21]=4.1, p<0.001). However, the difference between cursor and target feedback is not significant (t[21]=2.87, p=0.009), nor is the difference between no feedback and target feedback (t[21]=0.67, p=0.51). For EyeToy, there is also a significant difference between cursor and no feedback (t[21]=5.53, p<0.001), as well as between cursor and target feedback (t[21]=5.71, p<0.001), but no significant difference between target and no feedback (t[21]=1.12, p=0.27). Comparing the two pointing techniques for each feedback type, we found significant differences for no feedback (t[21]=4.33, p<0.001) and target feedback (t[21]=4.27, p<0.001), but no significant difference for cursor feedback (t[21]=0.58, p=0.57). As we have 9 comparisons, the significance levels were adjusted using the Bonferroni correction. Each test would only be significant when the p-value is less than α = 0.05/9 = 0.0056.

                    None vs cursor   Cursor vs target   None vs target
dTouch              <0.001*          0.009              0.51
EyeToy              <0.001*          <0.001*            0.27

                    None             Cursor             Target
dTouch vs EyeToy    <0.001*          0.57               <0.001*

*Denotes significance at the 0.0056 level

Table 7.1: A summary of p-values and significance of multiple pairwise means comparisons for the accuracy analysis

Within dTouch, cursor feedback was in fact more accurate than no feedback, contrary to our hypothesis. This may be because, with cursor feedback present, the deviation only shows the amount of hand jitter, while with no feedback, the deviation shows the amount of hand jitter plus the error in the system's estimate of the user's pointing location, i.e. the system's estimation error. Therefore, the system's estimation error may be determined as:

53.26 (no feedback: system's estimation + user's estimation) - 24.10 (cursor feedback: user's estimation) = 29.16 pixels (system's estimation error)


For the EyeToy technique, as expected, the cursor feedback provided the best accuracy and no feedback provided the worst accuracy. Even when target feedback was brought in, users easily went past the target requiring readjustments, which often led to reaching the time limit. Comparing the two pointing techniques, when no feedback was present, dTouch was indeed more accurate than the EyeToy method. Once again, this confirmed that the dTouch method provides a more accurate pointing model. When cursor feedback was present, both methods performed with similar accuracy. From this analysis, we saw that the dTouch technique dominates EyeToy when no feedback is given. Cursor feedback provided improved accuracy for both techniques over no feedback. Target feedback provided no significant improvement in accuracy for both techniques.

Trial Completion Time Analysis

The time taken for each selection task was measured from when the subject's hand started moving until the target was successfully selected. As such, reaction time was excluded (mean = 0.85 s) for consistency between all subjects, while the time taken for dwell selection was included, which lasted around 1 second (25 frames). Trials that went over the time limit were excluded from this analysis, so that only successful trials were analysed. Figure 7.8 shows the time taken for each method with the three types of feedback.

[Bar chart of mean task completion time (s) for each technique and feedback type; bar values range from 2.48 to 3.06 s.]

Figure 7.8: Effect of feedback on task completion time

As we now have unequal sample sizes for the six groups, we used a two-way between-groups ANOVA. The test reveals a significant main effect for feedback type on task time (F[2,281]=10.01, p<0.001); the aggregate unweighted means for each feedback type are 2.61 (none), 2.97 (cursor) and 2.54 seconds (target). However, there was no significant main effect for technique on task time (F[1,281]=0.781, p=0.378), with aggregate unweighted means of 2.75 and 2.67 seconds for dTouch and EyeToy respectively. There was also no significant interaction between technique and feedback type (F[2,281]=0.797, p=0.452). Multiple means comparisons within the dTouch technique show significant differences between cursor and no feedback (t[77]=-3.81, p<0.001), as well as between cursor and target feedback (t[89]=3.30, p=0.001). However, the difference between no feedback and target feedback is not significant (t[115]=-0.34, p=0.736). On the other hand, within the EyeToy technique, there are no significant differences between the three groups: t[60]=-1.28, p=0.205 for none vs cursor feedback, t[63]=0.865, p=0.390 for none vs target feedback, and t[73]=2.37, p=0.021 for cursor vs target feedback. Again, each test would only be significant when the p-value is less than α = 0.0056.

                    None vs cursor   Cursor vs target   None vs target
dTouch              <0.001*          0.001*             0.736
EyeToy              0.205            0.021              0.390

                    None             Cursor             Target
dTouch vs EyeToy    0.633            0.251              0.411

*Denotes significance at the 0.0056 level

Table 7.2: A summary of p-values and significance of multiple pairwise means comparisons for the task completion time analysis

Contrary to our hypothesis, cursor feedback with the dTouch technique was in fact the most time consuming. This may suggest difficulties when adjusting with cursor feedback. The pointing process may be broken down into two phases:
- In the first phase, users first lined up their fingertip with the target (common to all feedback types).
- The second phase was the realization of the slight offset between the cursor and target (which only emerges with cursor feedback). Users reacted to this by adjusting their fingertip and moving the cursor closer to the target centre, even when this means their fingertip will no longer line up with the target. This may have delayed the dwell selection, perhaps even more so when the deviation of this adjustment is larger than 15 pixels (the stability requirement for a successful dwell).

This is consistent with our analysis in the previous section, where the difference between the system's estimation and the user's estimation is 29.16 pixels. The absence of the second phase in the target feedback condition may be due to the lack of need for readjustment: as the target feedback circle provided a coarser resolution, users may have been content with their selection earlier. It should be noted that the first phase is not present in the EyeToy-Cursor condition, which explains the small but insignificant time reduction compared to the dTouch-Cursor condition. For the EyeToy technique, target feedback did indeed provide the quickest selection time and cursor feedback was indeed the slowest, due to the time needed for the feedback loop, although both differences were insignificant. Between the two techniques, there is no significant difference in selection time for all three feedback types. This should not be surprising considering that

inaccuracy and error rate are not taken into account in this analysis. The results only represent the time required to lift the arm or forearm and line up with the target. Overall, the type of feedback did not affect the time required for selection between dTouch and EyeToy, except that cursor feedback hindered the time taken for dTouch. It should be noted that the trial completion time results should not be used for comparison with systems external to this thesis, as the parameters used were strictly controlled and are specific to this comparison.

Error Rate Analysis

An error was recorded for a trial when the time limit of five seconds was exceeded. These trials were included in the analysis of accuracy but were excluded from the time completion analysis, as they were not completed within the time limit. Figure 7.9 shows the percentage of trials that were counted as errors (out of 66 trials) within each condition.


Figure 7.9: Error rate for each condition (over time limit)

Overall, the error count was indeed highest for the EyeToy-None condition, due to the nature of the pointing method and the lack of feedback. The lowest

number of errors occurred for dTouch-None and dTouch-Target. In terms of in-target but over-time trials, dTouch-Cursor had the highest number of errors, even though its accuracy was one of the best. This may be due to the increased time users took to select the target: intricate adjustment of their fingertip in the presence of a high accuracy cursor feedback. They also had to remain still for target selection. This is consistent with the time completion analysis, where the time taken was the highest. For EyeToy, it is interesting to observe a consistent in-target error rate for all types of feedback. This may be due to the low resolution of the EyeToy technique, caused by the smaller interaction area compared to that of dTouch. Higher stability is thus required from the users for dwell selection. Despite this, the accuracy was not at all compromised, as the EyeToy-Cursor condition had the highest accuracy of all six conditions (Figure 7.7). For errors that were both off target and beyond the time limit, we observed a substantially high rate for EyeToy-None and EyeToy-Target. This is consistent with the worst accuracies shown in the previous analysis. On the other hand, it is interesting to observe, although not surprising, that all trials were inside the target for cursor feedback with both techniques.

Effect of Repetition (Before and After) Analysis

Two rounds of the experiment were completed for each subject, where the second round was done after a short break. Subjects were asked to return to the same position as in the first round and repeat the experiment.

Figure 7.10: Effect of before and after on Pointing Accuracy for dTouch

Figure 7.11: Effect of before and after on Pointing Accuracy for EyeToy

In this analysis, we are only concerned with comparing before and after within each feedback condition for each technique; therefore, only 6 paired t-tests were performed. None of the pairs showed a significant difference.


Feedback    dTouch Technique                 EyeToy Technique
Type        before    after    p-value       before    after    p-value
None        53.26     57.03    0.5924        173.4     149.9    0.3181
Cursor      24.10     21.26    0.4852        22.02     21.07    0.7699
Target      49.24     58.42    0.3362        149.5     142.3    0.8130

Table 7.3: Table of p-values for each before-and-after t-test.

Contrary to our hypothesis, the accuracy of EyeToy with no feedback in particular did not deteriorate. Surprisingly, the second round of trials was in fact more accurate, although not significantly so. This may suggest the presence of a learning effect, where subjects in the second round are more experienced than in the first. It is also interesting to note the small but insignificant deterioration of the dTouch technique with target feedback. To understand this occurrence, we investigate further by looking at the error rate for both rounds. From Figure 7.12 and Figure 7.13, we observed a general trend where all types of errors decreased by a few percentage points from one round to the next. This supports the possibility of a learning effect, as proposed earlier. There are, however, two exceptions:
- The number of trials that went out of target increased slightly from 38% to 41% for the EyeToy-None condition. This is in line with our hypothesis that users did not remember where their interaction area was.
- For the dTouch-Target condition, even though the number of out-of-target trials was reduced to zero, the in-target error rate doubled from 6.1% to 12.1%. This is consistent with the small but insignificant deterioration observed in the accuracy. It may be attributed to a more relaxed attitude from the now more experienced users: after target feedback was given, they were less concerned with the stability of their hand, as they knew their objective had been completed, thus taking longer for dwell selection.


Figure 7.12: Error rate for dTouch (percentage of all trials out of target and in target, by round and feedback type)

Figure 7.13: Error rate for EyeToy (percentage of all trials out of target and in target, by round and feedback type)

Qualitative Analysis Participants were asked to rank and comment on their preference for each of the six conditions from 1 to 6, 6 being the most preferred way of selecting the target. Figure 7.14 shows the mean rank of the six conditions.

Figure 7.14: User Preference (mean rank from 1 to 6, by feedback type, for dTouch and EyeToy)

As can be seen from the graph above, the dTouch-Cursor condition was the most preferred method of pointing, while the EyeToy-None condition was the least favourite. Within all three feedback types, dTouch was the more preferred technique compared to EyeToy. In general, subjects liked the dTouch technique mainly because their fingertip provided a reference point, especially in the absence of any on-screen feedback. The ability to line up the fingertip and target allowed them to point to where they were looking. It was seen by many as a quick and accurate technique for pointing. The disadvantage was arm fatigue, although only 5 subjects commented on it. The other major concern was the need to maintain the body posture adopted at calibration, as slight movement of the upper body would affect the system's estimated position. Of the 22 subjects, 18 ranked dTouch-Cursor as 6 or 5; the other 4 ranked it towards the bottom. Subjects found the continuous feedback made the task much easier than it would otherwise be. Target feedback for dTouch was praised as quick, responsive and better than no feedback. It gave them a general sense of being near the center of the target, but subjects were not comfortable with the absence of more precise feedback. This is perhaps an indication of the influence of the ubiquitous mouse cursor. No feedback was the least preferred for both techniques. Almost all subjects commented that it was difficult to know whether the system had recognized their pointing position. This was expected, and thus in a real usage scenario, the dTouch method would be used with some form of discrete visual feedback.

The main concerns with the EyeToy technique were that subjects could not remember where the virtual interaction area was (as expected) and that there was an offset between where the target was and where their fingertip needed to be, although it was more comfortable than the dTouch technique. In all three feedback types the EyeToy technique received the lower ranking.

7.2.8 General Discussion Within the dTouch technique, cursor feedback provided the best accuracy, but this advantage was offset by the longer trial time (and hence a higher error rate); no feedback and target feedback had the opposite effect. In addition, the no-feedback condition was least preferred. The choice is therefore between continuous cursor feedback (accurate) and target feedback (quick but error prone). The EyeToy pointing technique is best used with cursor feedback, as it excelled in all categories; this is consistent with the initial hypothesis. Comparing the two techniques, for both target and no feedback, dTouch is more accurate, less error prone and more preferred by users. For cursor feedback, the difference is less apparent: EyeToy is more error prone and has an (insignificantly) shorter completion time, though dTouch is slightly more preferred by users.

7.2.9 Limitations It should be noted that there are limitations to this experiment. With current segmentation techniques, it is still difficult to detect the fingertip from a frontal view. Subjects were asked to lower their fingertip so that it would appear as the lowest skin colour pixel. Many subjects felt that this was awkward and uncomfortable after prolonged use. However, as this style of pointing was used in all conditions, users' ability to use a particular style of pointing for target selection was not impaired. Once detection techniques have improved, users of this kind of system would be able to interact with the dTouch system more naturally, closer to pointing in the real physical world. Detection and tracking may also have contributed to the jitter that we observed as hand jitter. If this were improved, accuracy may increase and the time taken for dwell selection may in turn decrease. The number of errors and the time taken for task completion may also be reduced if more relaxed criteria for dwelling were used, for example reducing the number of subsequent frames required and increasing the number of pixels within which these frames must stay. Users' body postures were restricted, as slight deviation would increase the system's estimation error; this was seen by users during the calibration phase. These estimation errors were found to stem from the inaccuracy of the face detector used, so improving the face detection algorithm would minimize them. These limitations are also present in the rest of the experiments in this chapter, as the same system was used for all of them. In light of these limitations and constraints, it was felt that the results were not significantly distorted and are sufficient for testing our prototype.
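To make the dwell criteria concrete, a minimal sketch of this kind of check is given below; the frame count and pixel tolerance are illustrative assumptions, not the actual settings of the prototype:

def dwell_selected(positions, frames_required=15, tolerance_px=20):
    """True once the last `frames_required` pointing positions stay within `tolerance_px`."""
    if len(positions) < frames_required:
        return False
    xs = [x for x, y in positions[-frames_required:]]
    ys = [y for x, y in positions[-frames_required:]]
    # Relaxing the criteria means lowering frames_required or raising tolerance_px.
    return (max(xs) - min(xs) <= tolerance_px) and (max(ys) - min(ys) <= tolerance_px)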

7.2.10 Summary We have presented a usability experiment that set out to evaluate the feasibility of our dTouch technique by testing it against a similar pointing system, EyeToy. We also investigated the use of different types of feedback and their effect on the pointing performance of the two systems. The experiment was done twice to detect any deterioration in accuracy after a short break. Results from this study suggested that the two techniques are comparable, each with its pros and cons. When designing an interface where continuous feedback is not required, we suggest using the dTouch technique with target feedback (due to its speed). Conversely, when continuous feedback is important, both techniques may suffice: EyeToy provides quantitative advantages, though dTouch is more preferred by users. This is in line with the inherent accuracy provided by the pointing models described in Chapter 4. We have thus demonstrated the feasibility of using the dTouch technique to allow interaction with a large display using a single webcam. It should be noted that the dTouch method is designed with mobility in mind: users are expected to be able to move around and still point to the same target with consistent accuracy, whereas the EyeToy method does not allow this.

7.3 Experiment 2: Minimizing target size Having observed the potential of the dTouch pointing technique, we would like to investigate the target size that is most suited to our pointing system. Obviously, the larger the target, the easier it is for the user to point and select. We would like to determine the smallest possible target size that still achieves a reasonable success rate and time. Therefore, an experiment was conducted to investigate the minimum target size that can be selected. The factors that were tested were speed (task time) and error rate. Circular targets were chosen as they provide a uniform acceptable distance from the target center, compared to square targets. From the previous experiment, we attained a mean distance from target of around 55 pixels; in this experiment we selected six target sizes, three of which were smaller and three larger than this mean distance. The smallest circular target was 50 pixels and the largest 175 pixels in diameter, with a gap of 25 pixels between successive sizes. The effective accuracy required to select a target is half the target size.

Target size (diameter in px)                        50      75      100     125     150     175
Effective pointing accuracy required
(distance from target in px)                        25      37.5    50      62.5    75      87.5
Table 7.4: Target sizes and their corresponding effective pointing accuracy required

7.3.1 Participants Twenty-two volunteers (2 female and 20 male) aged between 20 and 36 participated in this experiment. They were all working or studying in the area of computer science or engineering. Three subjects were left-handed, while 15 were right-eye dominant. Most had used alternative input devices such as joysticks and touch screens. Fifteen had played the Nintendo Wii before, while only four had played the PS2 game once or twice. Participants were provided with chocolates and juice as gratuity during the breaks within the experiment.

7.3.2 Procedure and Design Participants started off by filling out an initial questionnaire about themselves and their experience. Participants' physical dimensions were then measured and the calibration phase completed. This study was a within-subjects study, where each subject performed the six conditions and was required to point to the six targets shown at the center of the screen. Subjects were asked to perform 3 blocks of these trials. The order of the targets was randomized to avoid ordering effects. A time limit of five seconds was imposed on each selection task; users were notified when this happened by the appearance of a red circle over the target. The current pointing position was recorded regardless of whether it was inside the target or not. This happened on two occasions: 1) when the user had not yet selected the target, and 2) when the user was pointing inside the target but had not stabilized enough to activate a dwell. Neither of these counted as a successful selection. In summary, the experimental design was: 22 subjects x 6 targets (50, 75, 100, 125, 150, 175px) x 3 blocks of trials = 396 pointing trials

7.3.3 Results In this experiment, it was not necessary to combine the three trials for each target per subject. Instead, the number of successful selections for each target size was simply added up, giving a total of 66 trials for each target size. We classified the trials into three categories:

- In target & within time limit: successful selection
- In target but over time limit: borderline
- Out of target & over time limit: unsuccessful selection

Trials that were in target but went over the time limit were classified as borderline because, strictly speaking, the target would have been selected had more time been given. The delay in selection may have been due to a number of reasons:

- hand jitter causing an unstable pointing position, thereby increasing the time required to dwell
- the pointing position being inside the target in one frame and outside in the next frame due to hand jitter
- slow hand movement from the subject to begin with

Because of these uncertainties, we listed such trials as borderline. Figure 7.15 shows the percentage of successful and unsuccessful trials for each target size.

Figure 7.15: Percentage of successful and unsuccessful trials for each target (successful, borderline, unsuccessful and successful + borderline, by effective target size in px)

As can be seen from Figure 7.15, the number of successful trials increases as the target size increases, while the number of unsuccessful trials decreases at a similar rate. A peak of 72.7% success was observed at the largest target size. An increase of 19.7% in successful selection was observed from size 50 to size 62.5px, while only a modest increase of 6.1% was observed from size 62.5 to 75px. Borderline cases were overall quite low in number, on average around 10% of all trials. If borderline cases are included as successful selections, we observe a peak success rate of 83.3%. With the combined success rate, a slowing down in improvement can again be seen from size 62.5 to 75px (a 3.1% increase compared to an increase of 13.6% from 50 to 62.5px). This may suggest a target size of 62.5px as an optimal size, as further increases in size do not result in a consistent gain.

Figure 7.16: Mean Time Taken for Target Selection (in seconds, for successful, borderline, unsuccessful and successful + borderline trials, by effective target size in px)

Figure 7.16 shows the mean time taken when pointing to the different targets. Time taken was calculated from the moment subjects began moving their hand until a successful selection or until the time limit was reached; reaction time is therefore not included in these recordings. As expected, the time required for successful selections is lower than that of the combined successful selections. Two one-way between-groups ANOVAs were performed for successful and successful+borderline trials. They showed no significant difference for successful selections across all target sizes (F[5,195] = 0.857, p = 0.5113). There was also no significant difference for successful+borderline selections (F[5,234] = 1.693, p = 0.1371). We can see a general consistency in time taken for all targets. However, it is interesting to note that the largest target size did not attain the quickest, though insignificant, mean selection time. The quickest selected target was 62.5px for successful selections, with a time of 2.51s, while target 75px came second at 2.53s. The quickest selected target for combined successful selections was 75px at 2.69s, while target 62.5px came second at 2.72s.

It should be noted that accuracy was not a factor because, in order to select the targets successfully, the accuracy must be within the target size. Aggregating the two factors (number of successful trials and time taken for selection), we can conclude that a target size of 62.5px (125px diameter) is the most suitable choice for a system such as ours. On a 1024x768 screen, one can fit around 8 targets horizontally and 6 targets vertically, a total of 48 targets. In terms of physical dimensions in our setup, this translates to around 99.5mm for each target both vertically and horizontally. Further investigation may be undertaken to refine the target resolution beyond that used in this experiment.
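As a rough check of these figures, using the screen dimensions of our setup (81.5 cm, or 815 mm, spanning 1024 px):

1024 px / 125 px ≈ 8 targets horizontally, 768 px / 125 px ≈ 6 targets vertically, giving 8 x 6 = 48 targets
125 px x 815 mm / 1024 px ≈ 99.5 mm per target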

7.4 Experiment 3: The Effect of Calibration To use the system reliably, it needs information about the user and the environment. This is attained through a calibration process, which is broken into two phases. The first involves measuring the user's biometric parameters: the distance between eye and shoulder, and the arm's length. These are required so that we can estimate the position and radius of the sphere needed to determine the 3D position of the fingertip (for further information please refer to chapter 5.x). The second gives the system knowledge of the exact location of the display that the user is pointing to; the system has no prior knowledge of this location as the display is not within the webcam's view. This is done by having the user point at the four corners of the display. After the system has been calibrated by the calibrator, it needs to be recalibrated when a different user would like to use it. Ideally, when both processes are done for the new user, the system would then be able to detect the user's pointing location accurately. However, there are times when both or either one of the calibration processes cannot be completed due to time constraints, or when doing so would inconvenience the potential user. We would also like to investigate the possibility of reducing the setup cost of our system, and it would be interesting to test the robustness of the system if no calibration is done for the new user. An experiment was therefore conducted to investigate the effect of varying the amount of calibration on the overall system accuracy. Crossing the two calibration factors gives four conditions:


1) No calibration (calibrator's parameters and screen location are used)
2) Subject calibrates screen location (calibrator's parameters are used)
3) Subject's parameters measured (calibrator's screen location is used)
4) Subject's parameters measured and subject calibrates screen location

The fourth condition is used as a control for the experiment and also represents the best case, as both calibration processes are done. The first condition represents the setup when no calibration is done with the new user; we hypothesized that this condition would produce the worst case. The second condition simulates a setup where the user is only asked to calibrate the screen location, which would be the most likely usage scenario in the real world, particularly when measuring the user is deemed too intrusive. The third condition only takes into account the subject's body measurements. Ideally, it is hoped that the second condition would produce accuracy closest to the control condition. It should be noted that in the other experiments presented in this chapter, both calibration processes were completed by all subjects.

7.4.1 Participants Twelve volunteers (10 male, 2 female) aged between 20 and 35 (with an average of 25.8) participated in this experiment. Their height ranged from 162 to 179cm, with an average of 172.3cm. They were all working or studying in the area of computer science or engineering. All but one were right-handed. Nine were right-eye dominant. Most had used alternative input techniques such as touch screens. Four had played the Nintendo Wii before, while only three had played the PS2 EyeToy game. Participants were provided with chocolates and juice as gratuity during the breaks within the experiment.

7.4.2 Procedure Participants started off by filling out an initial questionnaire indicating their details, including hand preference, eyesight, and experience with input devices other than the mouse. Participants' physical dimensions were then measured and used to calculate the x and y coordinates from the user's dominant eye to their right or left shoulder, depending on the hand they would be using. The calibration phase was then performed, where they had to point at the four corners of the pointing area. They were given a chance to recalibrate when deemed necessary.

7.4.3 Design The study was a within-subjects study, where each subject performed all four conditions and was required to point at 5 different targets in turn. Subjects were asked to perform 3 blocks of these trials. For each of the 20 combinations, the mean position was determined from the three blocks. The order of the conditions was presented based on a Latin Square and the targets were randomized, counter-balancing to avoid ordering effects. A four second time limit was imposed on each pointing task. If after four seconds the subject still had not dwelled successfully at the target, the system dwelled regardless and recorded the current pointing position. This was to minimize hand fatigue. Five targets were used to identify whether the system produced consistent accuracy. As subjects were told to use the dTouch method of pointing, their pointing accuracy was guaranteed, except for a small amount of hand jitter.

In summary, the experimental design was:

12 subjects x 4 conditions (none, screen, measurement, both) x 5 targets (top, left, center, right, bottom) x 3 blocks of trials = 720 pointing trials

7.4.4 Results We measured the distance, in pixels, between the target and the system's estimate of where the user had pointed. The mean distance for each target is illustrated in Figure 7.17. The last group of bars on the right hand side indicates the aggregate mean distance for all targets.

Figure 7.17: Accuracy of pointing to each target (mean distance from target in pixels, for conditions C1: no calibration, C2: screen only, C3: measurement only and C4: both, at the top, left, center, right and bottom target positions, plus the aggregate)

As can be seen, when both calibration processes were completed (the control condition), users could select their targets more accurately in all positions, while the setup that required no calibration was the least accurate, as suggested earlier, except for a slight inconsistency for condition 3 when selecting the left target. Aggregating all targets and using a one-way ANOVA test for repeated measures, we observed a significant difference across all conditions (F[3,177] = 42.13, p < 0.001). A priori multiple means comparisons revealed significant differences between all 4 groups except between conditions 2 and 3. The following table shows the p-values for each pair of tests after adjusting for Bonferroni correction. The aggregate means for each condition are also shown.

                        C1: No calibration   C2: Screen only   C3: Measurement only   C4: Both
Aggregate mean (px)     149.07               101.67            128.39                 52.46
C1: No calibration      -                    0.000*            0.018*                 0.000*
C2: Screen only         -                    -                 0.254                  0.000*
C3: Measurement only    -                    -                 -                      0.000*
*Denotes significance
Table 7.5: Table of p-values illustrating the significance of multiple pairwise means comparisons between each condition.
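A minimal sketch of how these Bonferroni-adjusted pairwise comparisons could be computed is shown below, assuming acc is a NumPy array with one row per subject and one column per condition (C1 to C4); SciPy is used here only as an illustrative stand-in for the statistics package actually used:

from itertools import combinations
from scipy import stats

def pairwise_bonferroni(acc, labels=("C1", "C2", "C3", "C4"), alpha=0.05):
    """Paired t-tests between all condition pairs, with Bonferroni-adjusted p-values."""
    pairs = list(combinations(range(len(labels)), 2))
    m = len(pairs)  # 6 pairwise tests, so each raw p-value is multiplied by 6
    results = {}
    for i, j in pairs:
        _, p = stats.ttest_rel(acc[:, i], acc[:, j])  # paired: same subjects in every condition
        adjusted = min(p * m, 1.0)
        results[(labels[i], labels[j])] = (adjusted, adjusted < alpha)
    return results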

It can be seen that the performance of all 4 conditions is very clear cut, except between conditions 2 and 3. Calibrating the user's body measurements gave around a 13.9% improvement over the worst case, while calibrating the screen gave 31.8% over the worst case, although the difference between these two conditions was not significant. However, to be able to use the system with only screen calibration, we would need to increase the target size at least two-fold, reducing the number of targets to a maximum of 12. When both calibration processes are performed, the mean accuracy of the system can be as good as 52 pixels. As can be seen from the prototype as it stands, even though calibrating only the screen or only the body measurements significantly improves on the default calibration, it is still not practical to reduce the amount of calibration for each new user, as it does not provide similar accuracy. However, we have demonstrated that our system is sufficiently accurate for new users once the two calibration processes are completed.

7.5 Experiment 4: The Effect of User's Location One of the main advantages of our system compared to other single-camera interaction systems is its ability to adjust to the user's changing location. When the user moves to a new location, the interaction area changes to match, making sure that the virtual touchscreen stays directly in front of the user, between the user and the large display. To evaluate this, a usability study was conducted to assess the accuracy of the dTouch pointing system when the user is standing at various locations. It should be noted that the evaluation is limited to our current setup: the user's head and fingertip must be within the view of the camera, and with our current implementation of face detection the user is restricted to a maximum of 180cm from the webcam. We tested our system at 4 different locations covering 2 distances, as illustrated in the following diagram:

Figure 7.18: An illustration of the four standing locations (front, left, center and right) in front of the large screen (81.5 cm wide, 1024 px), with the front position at 135 cm from the screen, the other three at 160 cm, and 40 cm between the left, center and right positions.

Apart from using different locations as a factor, it is also appropriate to test the accuracy of depth detection using face width on different users. Two conditions were introduced. In the first condition, the system was assumed to have accurate depth information at 135 and 160cm. In the second condition, the system estimated the user's location from their face width. In both cases, subjects were asked to stand at the four different locations, which cover both distances.

7.5.1 Participants Thirteen volunteers (2 female and 11 male) aged between 20 and 34 participated in this experiment. They were all working or studying in the area of computer science or engineering. One was left-handed. Nine were right-eye dominant. Most had used alternative input techniques such as touch screens. Eight had played the Nintendo Wii before, while only four had played the PS2 EyeToy game. Participants were provided with chocolates and juice as gratuity during the breaks within the experiment.

7.5.2 Procedure and Design Participants started off by filling out an initial questionnaire about themselves and their experience. Participants' physical dimensions were then measured and the calibration phase completed. This study was a within-subjects study, where each subject performed the two conditions and was required to stand at each location in turn and point to the same target. Subjects were asked to perform 3 blocks of these trials. For each of the 8 combinations, the mean accuracy was determined from the three blocks. The order of the locations was presented based on a Latin Square and the conditions were counter-balanced to avoid ordering effects.

In summary, the experimental design was: 13 subjects x 1 target (center) x 2 conditions (depth assumed accurate, estimated depth) x 4 standing locations (front, left, center, right) x 3 blocks of trials = 312 pointing trials

It should be noted that the face detection algorithm used detected a face region that was wider than the measured face, as can be seen below:

Figure 7.19: An illustration of the difference between the detected face width and the actual face width

To compensate for this, in addition to the two calibration processes, we added a third calibration process for each subject in this experiment. We measured the number of pixels the two widths occupy in a frame captured by the camera, such as the one shown above. One can then estimate the detected face width in physical measurement (cm); we used this as the face width instead of the actual face width.

detected face width (cm) = detected face width (px) x actual face width (cm) / actual face width (px)
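A minimal sketch of how this calibration and the subsequent face-width-based depth estimate could be implemented is shown below. It assumes an OpenCV Haar cascade as a stand-in for the Viola-Jones detector used by the prototype, and a simple pinhole-style relation (depth = focal length x width / pixel width) as a stand-in for the actual formula derived in Chapter 6; the function and parameter names are illustrative only:

import cv2

FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detected_face_width_px(gray_frame):
    """Pixel width of the largest detected face, or None if no face is found."""
    faces = FACE_CASCADE.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    return max(w for (x, y, w, h) in faces)

def calibrated_face_width_cm(detected_px, actual_cm, actual_px):
    # Third calibration step from the text: scale the physically measured face width
    # so it matches the (wider) width reported by the detector.
    return detected_px * actual_cm / actual_px

def estimate_depth_cm(face_width_px, face_width_cm, focal_length_px):
    # Assumed pinhole relation; the thesis derives its own expression from camera
    # calibration, so treat this only as an illustrative approximation.
    return focal_length_px * face_width_cm / face_width_px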

7.5.3 Results The distance, in pixels, between the target and the system's estimate of where the user had pointed was measured. The mean accuracy for each location is illustrated in Figure 7.20. The blue bars indicate the distance recorded when depth was assumed accurate, while the red bars indicate the mean distance when depth was estimated by the system.

Figure 7.20: Accuracy of pointing from different locations (mean distance from target in pixels, for accurate depth and estimated depth, at the front, left, center and right standing locations)

As can be seen, the pointing accuracy was consistent across the different standing locations when depth was assumed accurate; a one-way Analysis of Variance (ANOVA) test for repeated measures revealed no significant difference (F[3,36] = 2.544, p = 0.071). On the other hand, when depth was estimated, the accuracy deteriorated, particularly at the front standing position, although the ANOVA test again revealed no significant difference (F[3,36] = 1.521, p = 0.226).

Between the two conditions within each standing location, four separate paired t-tests revealed significant differences in accuracy, except for the left position.

Standing location    front      left      center     right
p-value              0.0074*    0.0660    0.0403*    0.0355*
*Denotes significance at the 0.05 level
Table 7.6: Paired t-tests between the two conditions

The significant differences may be explained by the inaccuracy in the depth estimation as illustrated in Figure 7.21.

Figure 7.21: Mean estimated depth at various standing positions (mean estimated distance from camera in cm, for the front, left, center and right positions)

The actual standing distance was 135cm for the front position and 160cm for the other three. From the above figure, we can observe a discrepancy between the estimated depth and the actual depth. Upon further investigation, we found that the estimated depths were spread over a wide range (Figure 7.22).

Figure 7.22: Distribution of Mean Depth Estimations for Front and Center Positions (number of subjects per 5cm bin of mean estimated distance from camera, from 125 to 180cm)

            Front                                        Center
            Mean depth    Discrepancy from true          Mean depth    Discrepancy from true
            (cm)          position (135 cm)              (cm)          position (160 cm)
Min         133.5         -1.5                           134.5         -25.5
Average     146.2         +11.2                          154.0         -6.0
Max         159.4         +24.4                          177.7         +17.7
Spread      25.9                                         43.2
Table 7.7: Descriptive Statistics for the Distribution of Mean Depth Estimations

The discrepancy between estimated and actual depth, and the wide spread of error, may be due to limitations of the face detection algorithm implemented. Although its performance is one of the best and it is claimed to be the fastest currently available [41], the Viola and Jones method [152] was designed to detect the presence of a face, not to measure the width of the face. This problem was also noted in the previous section, where the detected face width was wider than the actual face as captured by the camera. To solve this problem in the future, additional image processing may be required, without compromising frame rate.

These depth estimation results are not consistent with those reported in the initial feasibility study for depth estimation using face width in 6.3.3. In that study, the mean depth estimation was much closer to the actual depth: for an actual depth of 150cm, the mean estimated depth was 149cm, with a minimum and maximum of 140 and 162cm respectively. This discrepancy may be due to the position, and therefore the angle, of the camera. In that study, the camera was placed at eye level to the user, where the full frontal face of the user was detected. In this experiment, and the other experiments in this chapter, the camera was positioned above the large display; the subjects were not directly facing the camera and the estimation may therefore have been affected.

Furthermore, accuracy results from this experiment are not consistent with those found in experiments 1 and 3 of this chapter, where depth was not estimated. In the condition where depth was assumed accurate at the center position (160cm), this experiment indicates a mean distance from target of 72 pixels, whereas in experiment 3 the mean distance was around 50 pixels when both calibration processes were completed, as they were in this experiment. The difference between these two results is significant according to a two-sample unequal variance t-test (p = 0.047). This discrepancy may be due to the variability of the subjects' head positions. In previous experiments, subjects were only asked to stand at a specific location, whereas in this experiment subjects were told to move to different locations at various times. This may have caused their bodies to rotate slightly; since our conceptual model assumes that the body is parallel to the screen, such rotations may have affected the result. In addition, for the comfort of the user and to minimize detection errors, a table was set up directly in front of the subjects, which also helped them stay consistently at the same position. In this experiment the table was set up at 135cm, so at the other locations subjects relied only on a marked X on the floor, and their variable posture may have affected their head position.

As the same limitations were faced by all subjects alike, we can conclude from these results that users can select the same target consistently from various locations within the view of the camera, when depth can be estimated correctly.

7.5.4 Summary This experiment set out to evaluate the system's ability to adjust to the user's changing location. Results from this study suggested a consistent mean pointing accuracy of 64 to 105 pixels from the target when depth is estimated correctly. We observed difficulties in estimating the user's distance from the camera, due to weaknesses in the face detection algorithm and inconsistent user standing positions. In light of the limitations presented, we showed that the existing implementation of our concept provides reasonable pointing accuracy across varying user locations and reflects a viable pointing system.

7.6 Overall Summary In this chapter, we have evaluated our dTouch pointing system in four usability experiments. We found that our dTouch system is much more accurate, less error prone and preferred by users overall compared to the EyeToy pointing method, which represents a well known pointing approach. Our system can be used comfortably by subjects with a target size of around 125 pixels, which gives around 48 targets on a large display. Although the system still requires a full calibration for each new user, we have shown that it is viable for users to point at objects on a large display from various locations and with reasonable accuracy. Therefore, the dTouch model, in conjunction with a monocular vision system, has the potential to enable a new type of interaction system for large displays. In light of the limitations and constraints mentioned, such as the system's detection inaccuracy and the rigid body movement required, it was felt that the results were not significantly affected and are sufficient for testing on a prototype such as ours. These experiments should only be regarded as an initial performance evaluation of the theories involved, and a validation of the worth of such concepts for continuing research and development.


Chapter 8 Conclusion

A summary of the thesis is presented in this chapter. We discuss the implications of this work as well as possible application domains. Ideas for further work are also presented.

8.1 Summary This thesis set out to investigate the feasibility of using monocular computer vision for natural 3D interaction with large displays at a distance.

Initial case study To begin with, a usability study was conducted to compare two common consumer input devices – the isotonic (computer) mouse and the handheld remote controller – and to study the way users use them in a living room environment with a large display. The tasks required are commonly encountered in real-life scenarios: playing a specified song, searching for photos, browsing through folders, and dealing with UI widgets such as drop-down menus. Results showed that even though both devices can be used to accomplish the required tasks, there were problems associated with each. The mouse restricted users to a fixed place of interaction, while the remote required learning, thereby hindering the users' task. The common difficulty arising from the use of both devices was the lack of naturalness. We observed that future interaction techniques should resemble the way we interact with things in the physical world, much like using our hand to point at objects.

Pointing Strategies Current bare-hand interactive systems often require users to adopt pointing strategies that may be neither natural nor the most accurate. We therefore conducted two

usability experiments to study the way we point at physical world objects and to gauge the accuracy naturally provided by these different pointing strategies. We found that the full arm stretch was the most common pointing strategy, while the most accurate strategy was when users lined up the target with their eye and fingertip. From these observations, we systematically analysed the various natural pointing strategies and formalized geometric models to explain their differences. The dTouch model was found to give the most accurate strategy, and we recommend the use of this model for designing future interactive systems that require interaction at a distance. It was therefore used as the basis for designing our system.

dTouch Pointing System Our dTouch interactive system was designed to provide users with a more natural (bare-hand pointing) and direct (where the input and output spaces coincide) method for interacting with large displays. At the heart of the dTouch system are three key elements: the dynamic virtual touchscreen, the dTouch model and monocular vision. The virtual touchscreen stems from the realization that the advantage of a large touch display could be leveraged by bringing the whole screen (virtually) towards the user, so that the display can be interacted with at arm's reach. By taking into account the user's location, a dynamic virtual touchscreen enables users to roam around the room and still be able to interact with it. As the dTouch pointing model was used, the task for the users was to line up their fingertip between the eye and the target object. This gave users the illusion that they were touching the target with their fingertips. The pointing direction was defined by two 3D points, which allowed pointing to occur in physical 3-dimensional space. As long as users stood within the camera's view, they could interact with the screen at arm's reach. With the use of monocular computer vision, retrieving depth information becomes nontrivial. Our attempt at solving this problem was to exploit geometric constraints in the environment. Face detection was used to detect the position of the dominant eye, while depth was determined from the width of the face. The fingertip was detected as the lowest skin colour pixel from the view of the camera. The depth of the fingertip was calculated by intersecting a sphere (centred at the user's shoulder, with the arm's length as the radius) with a line from the center of the camera through the position of the detected fingertip in the camera's image plane.
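A minimal sketch of this sphere-line intersection is given below, assuming the camera sits at the origin of the coordinate frame, ray_dir is the back-projected direction of the detected fingertip pixel, and shoulder and arm_length come from the calibration measurements; the names and the choice of root are illustrative assumptions rather than the exact implementation:

import numpy as np

def fingertip_position(ray_dir, shoulder, arm_length):
    """Intersect the camera ray t*d with the sphere |p - shoulder| = arm_length."""
    d = np.asarray(ray_dir, dtype=float)
    d = d / np.linalg.norm(d)
    c = np.asarray(shoulder, dtype=float)

    # |t*d - c|^2 = r^2  =>  t^2 - 2(d.c)t + (|c|^2 - r^2) = 0
    b = -2.0 * float(np.dot(d, c))
    k = float(np.dot(c, c)) - arm_length ** 2
    disc = b * b - 4.0 * k
    if disc < 0:
        return None  # the ray misses the sphere: no consistent fingertip depth

    t1 = (-b - np.sqrt(disc)) / 2.0
    t2 = (-b + np.sqrt(disc)) / 2.0
    # The outstretched fingertip lies on the camera side of the shoulder, so the
    # nearer positive intersection is taken (a geometric assumption).
    t = t1 if t1 > 0 else t2
    return t * d  # 3D fingertip position in the camera frame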

Experimental Evaluation Four usability experiments were conducted to evaluate various aspects of our system.
- The dTouch pointing system was compared with a similar hand pointing system (EyeToy), which also used a single webcam but used the Touch model of targeting. Results showed that our system was much more accurate, less error prone and preferred by users overall, compared to the EyeToy system, which represents a well known pointing method.
- Our system can be used comfortably by subjects with a target size of around 125 pixels, which gives around 48 targets on a large display.
- In an attempt to reduce the amount of calibration required for each new user, the effect of varying the amount of calibration on the overall system accuracy was investigated. However, we found that this was still not yet practical, as the accuracy observed was not sufficient.
- We have shown that it is viable for users to point at objects on a large display from various locations and with reasonable accuracy, even though face detection difficulties were observed.
These initial performance evaluations demonstrated the feasibility of using the dTouch model and monocular vision to develop an interaction method for large displays.

8.2 Implications of this research The push for post-WIMP interfaces is hindered mainly by the fact that most software products are designed with the keyboard and mouse as the primary interaction devices. In order to push current interaction techniques beyond the WIMP interface, future systems must build upon users' existing skills and experiences or offer enormous advantages over traditional techniques [93]. We provided a non technology-oriented solution, in terms of a theoretical exploration into what is possible now and in the future. This thesis can be seen as a step towards this goal by using input techniques that require only the user's natural pointing ability.

We met all the objectives that were laid down at the beginning of this thesis, in that our dTouch pointing system is natural, direct, unintrusive, unconstrained, untethered, inexpensive and simple to set up. We exploited geometric constraints in the environment, and from this we were able to use monocular computer vision to allow bare-hand interaction with large displays. From the results of the experimental evaluation of the dTouch system, it is observed that this approach has the potential to serve as a basis for the design of next-generation interactive systems. In its current state, we see our implementation as a basic prototype of our idea; further research and development will be needed to make the dTouch pointing technique more practical and robust in real usage scenarios. In addition, the case study presented in Chapter 3 and the pointing models that were formalized in Chapter 4 may serve as guidelines for system designers to build future interactive systems that provide a more natural and more satisfying interaction experience to their users.

8.3 Application Domains We envision that this interactive system is suited for applications that only require intermittent pointing with object-level accuracy.

Presentation environment – The use of such systems in a presentation environment would reduce hindrance to the presenters as hand pointing is a natural human behaviour. This allows presenters to roam around and occasionally point at the screen to highlight certain things to their audience. Bare hand interaction has been shown to be more accepted and preferred by users over the use of laser pointer and the standard mouse and keyboard, in presentation type environments [21].

3D Immersive Virtual Reality Displays – With the use of head position detection, the dTouch pointing system is ideal for VR environments, where users can move around in all directions in front of the displays and their position is updated so that a corresponding 3D view produced from this new position is displayed. This gives users a sense of being inside the virtual space. Similar research has been done with a low-cost setup using the Wii Remote [78]. The advantage of using the dTouch system is that users are not required to wear or hold on to any device.

Figure 8.1: 3D model of Manchester city in a virtual reality cave [148]

Personal Entertainment – In addition to head position detection, the dTouch system also detects the pointing finger. A toy gun could be used by users in a First Person Shooter game. The aiming model with the use of a gun is consistent with the dTouch model of pointing. The fingertip position can be replaced by the sight often found on top of a gun. This gives users the realism of actually being in the environment as well as being able to interact with animated characters in the virtual world.

Figure 8.2: First Person Shooter arcade game – Time Crisis 4[100]

Informational Public Display – A webcam can easily be attached to a public touchscreen (for example, a directory service in a shopping center). This gives users the ability to touch the screen physically or to use their arm to point at a specific location when the screen is not in reach. Alternatively, it is possible for stores to incorporate the dTouch system behind their shop front window, so that users can still interact with the display from a distance even when the shop is closed. Currently, touch screens are used for these types of systems [147, 149].

Figure 8.3: Ralph Lauren’s window shopping touch screen[149].

Remote Collaboration – Similar to its use in the IshinDenshin System [81] and ClearBoard [68], the dTouch system can be used in conjunction with video conferencing systems, providing both sides with the ability to interact with on-screen objects together virtually. In addition, collaborators are not restricted to being right in front of the display.

Figure 8.4: The IshinDenshin System was designed to increase the sense of presence between remote collaborators[81]

8.4 Future Research Direction As observed throughout this thesis, there is potential for further research and development based on this work.

Pointing Direction A more thorough investigation into the different pointing styles may be necessary, particularly one that is able to detect and segment users' pointing direction using stereo computer vision [32, 40]. In our research, the natural style of pointing was observed rather than measured accurately, and the accuracy of the different styles was only measured through the use of laser pointers, which are limited in their ability to represent the pointing hand. However, our research provides a good basis for further analysis. As observed in our experimental evaluation in Chapter 7, using a straight arm to point at the target may fatigue the user after prolonged use. However, the requirement of a straight arm is only a restriction of our current implementation; the dTouch model of pointing has no such requirement. It is hoped that future monocular systems may be able to detect the eye and fingertip positions without the need for a straight arm, allowing a more natural and comfortable user experience.

Hand Detection With current segmentation techniques in computer vision, it is difficult to detect a user's fingertip from a frontal image of a pointing arm. In our experimental evaluations, subjects were asked to adjust their index fingertip so that it was lower than its natural position. As was observed, subjects found this requirement unnatural and uncomfortable. It is hoped that advances in segmentation techniques may alleviate this, thereby giving users a fully natural interactive experience. In addition to using skin colour, it may also be possible to use background subtraction or motion analysis to detect the user's pointing hand; incorporating these methods may allow a more robust interactive system.
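For illustration, a minimal sketch of the lowest-skin-colour-pixel approach is given below, using a simple HSV threshold as a stand-in for the skin-colour model actually used; the threshold values are assumptions that would need tuning per camera and lighting:

import cv2
import numpy as np

def lowest_skin_pixel(bgr_frame):
    """Return (x, y) of the lowest skin-coloured pixel in the frame, or None."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    skin = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))  # rough skin-tone band
    skin = cv2.morphologyEx(skin, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))  # remove speckle
    ys, xs = np.nonzero(skin)
    if len(ys) == 0:
        return None
    i = int(np.argmax(ys))  # image y grows downward, so the largest y is the lowest pixel
    return int(xs[i]), int(ys[i])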

Automatic Body Measurements To reduce the need for explicit body measurements for each new user, these could be calculated automatically by having the user stand at a specific distance away while holding their arm straight out to the side. An image of this stance can be taken from the webcam and the background subtracted out, leaving only the user's silhouette, which can be used for detecting the length of the arm or the height of the user [71]. It may also be possible to begin with a standard profile and adapt it to the user after multiple uses.
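A minimal sketch of such a silhouette measurement is given below, assuming background and frame are grayscale images of the empty scene and of the user in the calibration stance, and that the cm-per-pixel scale at the known standing distance is available; names and thresholds are illustrative only:

import cv2
import numpy as np

def silhouette_extent_px(background, frame, thresh=30):
    """Widest horizontal extent of the background-subtracted silhouette, in pixels."""
    diff = cv2.absdiff(frame, background)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.max() - xs.min())

# Scaling this extent by the cm-per-pixel factor at the known distance gives an
# approximate arm span, from which arm length might be derived (an assumption;
# see [71] for the silhouette-based approach cited above).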

Gestures Incorporating gestures into our system is a natural next step. With a wave of the arm, users could change to the next slide in a presentation, browse to the next photo in their slideshow or even fast-forward to their favourite song. The possibilities are endless.

Computer Supported Cooperative Work It is possible to have two people using the system at the same time. The challenge then is to distinguish which hand belongs to which user; VideoArm is a good example of solving this [140, 141]. In remote collaborative environments, allowing collaborators to see each other's hands and to be more aware of where each other is looking and pointing is also of high importance [4].

Office of the Future The ultimate dream would be to allow users to interact with a room that contains both vertical displays as well as horizontal surfaces. In addition, objects in the room may also be selected. Users are allowed to walk freely around the room, interacting with objects anywhere with nothing more than their pointing hand.

8.5 Final Remark In his landmark paper “The Computer for the 21st Century”, Mark Weiser describes future computer systems as “one that takes into account the natural human environment and allows the computers themselves to vanish into the background” [155]. He coined the term “ubiquitous computing” describing the new wave in

computing where technology recedes into the background of our lives - "Long term, the PC and workstation will wither because computer access will be everywhere: in the wall,…lying about to be grabbed as needed" [156]. We are now one step closer towards this dream.


Bibliography

[1] 3DV Systems, "ZCam", 2008, http://www.3dvsystems.com/ [2] Agarawala, A. and Balakrishnan, R., "Keepin' It Real: Pushing the Desktop Metaphor with Physics, Piles and the Pen," in Proceedings of CHI '06, ACM Press., 2006, pp. 1283-1292. [3] AirForceLink, "Mission Control Center, Houston", 2006, http://www.af.mil/photos/index.asp?galleryID=255&page=17, date accessed: 24 September [4] Andersson, R. and Ehrensvard, A., "Mixed Presence Groupware (MPG): Requirements and Taxonomy for Collaboration and Sketching." Master of Science Thesis, Goteborg, Sweden, Chalmers University of Technology, 2008. [5] Apple Computer Inc., "MacDraw Manual", 1984, [6] Apple Computer Inc., Macintosh System 1, http://r-101.blogspot.com/2006_08_01_archive.html [7] Baudisch, P., Cutrell, E., Robbins, D., Czerwinski, M., Tandler, P., Bederson, B., and Zierlinger, A., "Drag-and-Pop and Drag-and-Pick: Techniques for Accessing Remote Screen Content on Touch- and Pen-operated Systems," in Proceedings of INTERACT 2003, Zurich, Switzerland, IOS Press., August, 2003, pp. 57-94. [8] Baudisch, P., Sinclair, M., and Wilson, A., "Soap: a Pointing Device that Works in Mid-Air," in Proceedings of UIST '06, ACM Press., 2006, pp. 43-46. [9] Beaudouin-Lafon, M. and Mackay, W., "Prototyping Tools and Techniques," in The Human-Computer Interaction Handbook, J. A. Jacko and A. Sears, Eds., Mahwah, New Jersey, Lawrence Erlbaum Associates, Inc., 2003, pp. 1006-1031. [10] Beck, J., "CES: 3DV Systems does Wii-like fun with no controller", 2008, http://www.sfgate.com/cgi-bin/blogs/sfgate/detail?blogid=19&entry_id=23275 [11] Biever, C., "Why a robot is better with one eye than two," New Scientist, 188(2530), pp. 30-31, 2005. [12] blfelectronics.net, "A person playing with the EyeToy game.", http://blfelectronics.net/eyetoy2.jpg [13] Bolt, R. A., ""Put-that-there": Voice and gesture at the graphics interface," in Proceedings of SIGGRAPH, Seattle, ACM Press., 1980, pp. 262-270. [14] Bouguet, J.-Y., "Camera Calibration Toolbox for Matlab", 2008, http://www.vision.caltech.edu/bouguetj/calib_doc/, date accessed: 15 September 2008 [15] Bouguet, J.-Y., "Camera Calibration Toolbox for Matlab: Description of the calibration parameters", 2008, http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/parameters.html, date accessed: 15 September 2008 [16] Bourke, P., "Intersection of a plane and a line", 1991, http://local.wasp.uwa.edu.au/~pbourke/geometry/planeline/, date accessed: 13 November 2007 [17] Bowman, D. A. and Hodges, L. F., "An Evaluation of Techniques for Grabbing and Manipulating Remote Objects in Immersive Virtual Environments," in Proceedings of Symposium on Interactive 3D Graphics, 1997, pp. 35-38. [18] Burscough Archery Club, "Robin Hood and the Golde Arrow", 1938, http://www.burscougharchers.co.uk/historyofarchery.html [19] Cantoni, V., Lombardi, L., Porta, M., and Vallone, U., "Qualitative Estimation of Depth in Monocular Vision," in Proceedings of 4th International Workshop on Visual Form, Capri, Italy, Springer-Verlag, 2001, pp. 135-144. [20] Cantzler, H. and Hoile, C., "A Novel Form of a Pointing Device," in Proceedings of Vision, Video, and Graphics, University of Bath, July, 2003, pp. 57-62. [21] Cao, X., Ofek, E., and Vronay, D., "Evaluation of Alternative Presentation Control Techniques," in Proceedings of CHI 2005, Portland, Oregon, ACM Press., April 2-7, 2005,

202 pp. 1248-1251. [22] Card, S. K., English, W. K., and Burr, B. J., "Evaluation of Mouse, Rate-Controlled Isometric Joystick, Step Keys, and Text Keys for Text Selection on a CRT," Ergonomics, 21(8), pp. 601-613, 1978. [23] Cavens, D., Vogt, F., Fels, S., and Meitner, M., "Interacting with the Big Screen: Pointers to Ponder," in Proceedings of Proceedings of CHI '02 Conference on Human Factors in Computing Systems, Extended Abstracts, Minnesota, ACM Press., April, 2002, pp. 678-679. [24] Chen, M. C., Anderson, J. R., and Sohn, M. H., "What can a mouse cursor tell us more? Correlation of eye/mouse movements on web browsing," in Proceedings of CHI'01 conference on Human factors in computing systems, Seattle, ACM, 2001, pp. 281-282. [25] Cheng, K. and Pulo, K., "Direct Interaction with large-scale display systems using infrared laser tracking devices," in Proceedings of Proceedings of the Australian Symposium on Information Visualisation (invis '03), Adelaide, ACS Inc., 2003, pp. 67-74. [26] Cheng, K. and Takatsuka, M., "Real-time Monocular Tracking of View Frustum for Large Screen Human-Computer Interaction," in Proceedings of Proceedings of the 28th Australasian Computer Science Conference, Newcastle, Australia, 2005, pp. 125-134. [27] Cheng, K. and Takatsuka, M., "Estimating Virtual Touchscreen for Fingertip Interaction with Large Displays," in Proceedings of Proc. OZCHI 2006, Sydney, ACM Press., 2006, pp. 397-400. [28] Cheng, K. and Takatsuka, M., "Hand Pointing Accuracy for Vision-Based Interactive Systems," in Proceedings of INTERACT 2009, 12th IFIP TC13 Conference in Human-Computer Interaction, Uppsala, Sweden, Springer Verlag (to appear), 2009. [29] Cheng, K. and Takatsuka, M., "Initial Evaluation of a Bare-Hand Interaction Technique for Large Displays using a Webcam," in Proceedings of EICS'09, ACM SIGCHI Symposium on Engineering Interactive Computing Systems, Pittsburgh, Pennsylvania, ACM Press. (to appear), 2009. [30] Cheng, K. and Takatsuka, M., "Interaction Paradigms for Bare-Hand Interaction with Large Displays," in Human-Computer Interaction, A. Lazinica, Ed., Vienna, IN-TECH (accepted for publication), ISBN 978-953-7619-26-8, 2009. [31] Cheng, K., Vronay, D., and Yu, F., "Interaction Design for the Media PC," in Proceedings of Companion proc. UIST'04, Santa Fe, New Mexico, ACM Press., 2004, pp. 5-6. [32] Cipolla, R. and Hollinghurst, N. J., "A Human-Robot Interface using Pointing with Uncalibrated Stereo Vision," in Computer Vision for Human-Machine Interaction, R. Cipolla and A. Pentland, Eds., Cambridge University Press, 1998, pp. 97-110. [33] Colombo, C., Bimbo, A. D., and Valli, A., "Visual Capture and Understanding of Hand Pointing Actions in a 3-D Environment," IEEE Transactions on Systems, Man, and Cybernetics, 33(4), University of Florence, Italy, pp. 677-686, August 2003. [34] Cooliris, "Cooliris", http://www.cooliris.com/ [35] Cotter, C., "Turtle Graphics Interface for REDUCE Version 3", 1998, www.zib.de/Symbolik/reduce/moredocs/turtle.pdf [36] Crowley, J. L., Coutaz, J., and Berard, F., "Things that See," Communications of the ACM, 43(3), ACM Press., pp. 54-64, March 2000. [37] Czerwinski, M., Smith, G., Regan, T., Meyers, B., Robertson, G., and Starkweather, G., "Toward Characterizing the Productivity Benefits of Very Large Displays," in Proceedings of INTERACT 2003, International Conference on Human-Computer Interaction, IOS Press, 2003, pp. 9-16. [38] Dietz, P. 
and Leigh, D., "DiamondTouch: A Multi-User Touch Technology," in Proceedings of UIST '01, ACM Press, 2001, pp. 219-226.
[39] Dix, A., Finlay, J., Abowd, G. D., and Beale, R., Human-Computer Interaction, 3rd ed., Pearson Education Limited, 2004.
[40] Do, J.-H., Kim, J.-B., Park, K.-H., Bang, W.-C., and Bien, Z. Z., "Soft Remote Control System using Hand Pointing Gestures," International Journal of Human-friendly Welfare Robotic Systems, 3(1), pp. 27-30, 2002.
[41] Dominguez, S., Keaton, T., and Sayed, A., "Robust Finger Tracking for Wearable Computer Interfacing," in Proceedings of Perceptive User Interfaces, Orlando, ACM Press, 2001, pp. 1-5.
[42] Douglas, S. A., Kirkpatrick, A. E., and MacKenzie, I. S., "Testing Pointing Device Performance and User Assessment with the ISO 9241, Part 9 Standard," in Proceedings of CHI'99, Conference on Human Factors in Computing Systems, Pittsburgh, ACM Press, 1999, pp. 215-222.
[43] Douglas, S. A. and Mithal, A. K., "The Effect of Reducing Homing Time on the Speed of a Finger-Controlled Isometric Pointing Device," in Proceedings of CHI '94, Conference on Human Factors in Computing Systems, Boston, ACM Press, 1994, pp. 411-416.
[44] Drucker, S. M., Glatzer, A., Mar, S. D., and Wong, C., "SmartSkip: Consumer Level Browsing and Skipping of Digital Video Content," in Proceedings of CHI'02, Conference on Human Factors in Computing Systems, Minneapolis, ACM Press, 2002, pp. 219-226.
[45] Eckert, R. R. and Moore, J. A., "The Classroom of the 21st Century: The Interactive Learning Wall," ACM SIGCHI Bulletin, 32(2), pp. 33-40, 2000.
[46] Eisenstein, J. and Mackay, W., "Interacting with Communication Appliances: An Evaluation of Two Computer Vision Based Selection Techniques," in Proceedings of CHI, Montreal, Quebec, Canada, ACM Press, 2006, pp. 1111-1114.
[47] Engadget, "Steve Jobs' 2008 WWDC Keynote", http://www.engadget.com/2008/06/09/steve-jobs-keynote-live-from-wwdc-2008/
[48] Engelbart, D., "The First Ever Computer Mouse", http://cerncourier.com/cws/article/cern/28358/1/cernbooks2_12-00
[49] Engelbart, D. C. and English, W. K., "A Research Center for Augmenting Human Intellect," in Proceedings of the AFIPS 1968 Fall Joint Computer Conference, San Francisco, CA, December, 1968, pp. 395-410.
[50] Enns, N. R. N. and MacKenzie, I. S., "Touchpad-based Remote Control Devices," in Companion Proceedings of the CHI '98 Conference on Human Factors in Computing Systems, ACM, 1998, pp. 229-230.
[51] Eronen, L. and Vuorimaa, P., "User Interfaces for Digital Television: A Navigator Case Study," in Proceedings of AVI 2000, Conference on Advanced Visual Interfaces, 2000, pp. 276-279.
[52] Franjcic, Z., Anglart, M., Sundin, M., and Fjeld, M., "Softly Elastic Device for 6DOF Input," in Proceedings of NordiCHI 2008, Lund, Sweden, ACM Press, 2008, pp. id5-id6.
[53] Fredriksson, J., Ryen, S. B., and Fjeld, M., "Real-Time 3D Hand-Computer Interaction: Optimization and Complexity Reduction," in Proceedings of NordiCHI 2008, Lund, Sweden, ACM Press, 2008, pp. 133-141.
[54] GentleMouse, "GentleMouse", http://www.gentlemouse.com/
[55] Gomez, G. and Morales, E. F., "Automatic Feature Construction and a Simple Rule Induction Algorithm for Skin Detection," in Proceedings of the ICML Workshop on Machine Learning in Computer Vision, Sydney, Australia, July 9, 2002, pp. 31-38.
[56] Gorodnichy, D., Malik, S., and Roth, G., "Nouse 'Use Your Nose as a Mouse' - A New Technology for Hands-free Games and Interfaces," in Proceedings of VI'02, International Conference on Vision Interface, Calgary, May, 2002, pp. 354-361.
[57] Grossman, T., Wigdor, D., and Balakrishnan, R., "Multi-Finger Gestural Interaction with 3D Volumetric Displays," in Proceedings of the 17th ACM Symposium on User Interface Software and Technology, Santa Fe, ACM Press, 2004, pp. 931-931.
[58] Guiard, Y., Blanch, R., and Beaudouin-Lafon, M., "Object Pointing: A Complement to Bitmap Pointing in GUIs," in Proceedings of Graphics Interface 2004, London, Ontario, May, 2004, pp. 9-16.
[59] Gyration Inc., "Gyration GyroMouse", 2006, http://www.gyration.com/
[60] Han, J., "Low-Cost Multi-Touch Sensing through Frustrated Total Internal Reflection," in Proceedings of UIST '05, Seattle, Washington, ACM Press, 2005, pp. 115-118.
[61] Hardenberg, C. v. and Berard, F., "Bare-Hand Human-Computer Interaction," in Proceedings of the ACM Workshop on Perceptive User Interfaces, Orlando, Florida, ACM Press, 2001, pp. 1-8.
[62] Hartley, R. and Zisserman, A., Multiple View Geometry in Computer Vision, Cambridge University Press, Cambridge, New York, 2000.
[63] Hawkins, T., "Kalman Filtering", 2006, http://www.techsystemsembedded.com/Kalman.html
[64] Hedman, A. and Lenman, S., "A Media Rich Interface vs A Simple Interface for Interactive Television," in Proceedings of E-Learn 2002, World Conference on E-Learning in Corporate, Government, Healthcare, & Higher Education, Montreal, 2002, pp. 420-425.
[65] Hinckley, K., "Input Technologies and Techniques," in The Human-Computer Interaction Handbook, J. A. Jacko and A. Sears, Eds., Lawrence Erlbaum Associates Inc., 2003, pp. 151-168.
[66] Hung, Y.-P., Yang, Y.-S., Chen, Y.-S., Hsieh, I.-B., and Fuh, C.-S., "Free-Hand Pointer by Use of an Active Stereo Vision System," in Proceedings of ICPR'98, 14th International Conference on Pattern Recognition, Brisbane, August, 1998, pp. 1244-1246.

[67] Interlink, "RemotePoint, Interlink Electronics", 2008, http://www.interlinkelec.com/
[68] Ishii, H. and Kobayashi, M., "ClearBoard: A Seamless Medium for Shared Drawing and Conversation with Eye Contact," in Proceedings of CHI '92, SIGCHI Conference on Human Factors in Computing Systems, ACM Press, 1992, pp. 525-532.
[69] Jacob, R. J. K., "Human-Computer Interaction: Input Devices," ACM Computing Surveys, 28(1), pp. 177-179, March 1996.
[70] Jacob, R. J. K., "Computers in Human-Computer Interaction," in The Human-Computer Interaction Handbook, J. A. Jacko and A. Sears, Eds., Mahwah, New Jersey, Lawrence Erlbaum Associates, Inc., 2003, pp. 147-149.
[71] Jaimes, A., "Posture and Activity Silhouettes for Self-Reporting, Interruption Management, and Attentive Interfaces," in Proceedings of IUI'06, Sydney, ACM Press, January 29 - February 1, 2006, pp. 24-31.
[72] Jul, S., ""This Is a Lot Easier!": Constrained Movement Speeds Navigation," in Proceedings of the CHI'03 Conference on Human Factors in Computing Systems, Ft. Lauderdale, ACM, 2003, pp. 776-777.
[73] Keefe, D., Acevedo, D., Moscovich, T., Laidlaw, D., and LaViola, J., "CavePainting: A Fully Immersive 3D Artistic Medium and Interactive Experience," in Proceedings of Symposium on Interactive 3D Graphics 2001, 2001, pp. 85-93.
[74] Kirstein, C. and Muller, H., "Interaction with a Projection Screen Using a Camera-Tracked Laser Pointer," in Proceedings of MMM'98, Conference on MultiMedia Modeling, IEEE Computer Society, 1998, pp. 191-192.
[75] Kooper, R. and MacIntyre, B., "Browsing the Real-World Wide Web: Maintaining Awareness of Virtual Information in an AR Information Space," International Journal of Human-Computer Interaction, 16(3), pp. 425-446, 2003.
[76] Kurata, T., Okuma, T., Kourogi, M., and Sakaue, K., "The Hand-mouse: A Human Interface Suitable for Augmented Reality Environments Enabled by Visual Wearables," in Proceedings of ISMR2001, International Symposium on Mixed Reality, Yokohama, Japan, 2001, pp. 188-189.
[77] Kurata, T., Okuma, T., Kourogi, M., and Sakaue, K., "The Hand Mouse: GMM Hand-Color Classification and Mean Shift Tracking," in Proceedings of IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems (RATFG-RTS'01), Vancouver, IEEE, July, 2001, pp. 119-124.
[78] Lee, J. C., "Head Tracking for Desktop VR Displays using the Wii Remote", 2007, http://www.cs.cmu.edu/~johnny/projects/wii/, date accessed: 4 June 2008
[79] Lee, M. S., Weinshall, D., Cohen-Solal, E., Colmenarez, A., and Lyons, D., "A Computer Vision System for On-Screen Item Selection by Finger Pointing," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2001, 2001, pp. 1026-1033.
[80] Leibe, B., Starner, T., Ribarsky, W., Wartell, Z., Krum, D., Weeks, J., Singletary, B., and Hodges, L., "Towards Spontaneous Interaction with the Perceptive Workbench, a Semi-Immersive Virtual Environment," IEEE Computer Graphics and Applications, 20(6), IEEE, pp. 54-65, November/December 2000.
[81] Lertrusdachakul, T., Taguchi, A., Aoki, T., and Yasuda, H., "Transparent Eye Contact and Gesture Videoconference," International Journal of Wireless and Mobile Computing, 1(1), pp. 29-37, 2005.
[82] Lienhart, R. and Maydt, J., "An Extended Set of Haar-like Features for Rapid Object Detection," in Proceedings of the International Conference on Image Processing, IEEE, September 22-25, 2002, pp. 900-903.
[83] Lock, A., "Language Development: Past, Present and Future," Bulletin of the British Psychological Society, 33, pp. 5-8, 1980.
[84] MacKenzie, I. S. and Jusoh, S., "An Evaluation of Two Input Devices for Remote Pointing," in Proceedings of the 8th IFIP International Conference on Engineering for Human-Computer Interaction 2001, 2001, pp. 235-250.
[85] Maggioni, C. and Kammerer, B., "GestureComputer - History, Design and Applications," in Computer Vision for Human-Machine Interaction, R. Cipolla and A. Pentland, Eds., Cambridge University Press, 1998, pp. 23-51.
[86] Mase, K., "Human Reader: A Vision-Based Man-Machine Interface," in Computer Vision for Human-Machine Interaction, R. Cipolla and A. Pentland, Eds., Cambridge University Press, 1998, pp. 53-81.
[87] Matsushita, N. and Rekimoto, J., "HoloWall: Designing a Finger, Hand, Body, and Object Sensitive Wall," in Proceedings of UIST '97, ACM Press, 1997, pp. 209-210.
[88] McNeill, D., Hand and Mind: What Gestures Reveal about Thought, University of Chicago Press, 1992.
[89] Microsoft, "Microsoft Surface", http://www.microsoft.com/surface/index.html
[90] Microsoft, "Microsoft Home Living Room", 2007, http://www.microsoft.com/presspass/events/mshome/gallery.mspx, date accessed: 24 September 2008
[91] Mithal, A. K. and Douglas, S. A., "Differences in Movement Microstructure of the Mouse and the Finger-Controlled Isometric Joystick," in Proceedings of CHI '96, Vancouver, ACM Press, 1996, pp. 300-307.
[92] Molyneaux, D., Gellersen, H., Kortuem, G., and Schiele, B., "Cooperative Augmentation of Smart Objects with Projector-Camera Systems," in Proceedings of Ubicomp 2007: 9th International Conference on Ubiquitous Computing, LNCS, Springer-Verlag, September, 2007, pp. 501-518.
[93] Moran, T. P. and Zhai, S., "Beyond the Desktop Metaphor in Seven Dimensions," in Beyond the Desktop Metaphor: Designing Integrated Digital Work Environments, V. Kaptelinin and M. Czerwinski, Eds., The MIT Press, 2006.
[94] Muller-Tomfelde, C., "Dwell-Based Pointing in Applications of Human Computer Interaction," in Proceedings of INTERACT 2007, 11th IFIP TC13 International Conference on Human-Computer Interaction, Springer-Verlag, April 2007.
[95] Murdock, K. L., "3Dconnexion's Spaceball 5000 Product Reviews", http://www.gamedev.net/features/reviews/productreview.asp?productid=509
[96] Murphey, Y. L., Chen, J., Crossman, J., and Zhang, J., "A Real-time Depth Detection System using Monocular Vision," in Proceedings of the SSGRR Conference, 2000.
[97] Myers, B. A., "A Brief History of Human-Computer Interaction Technology," ACM Interactions, 5(2), ACM Press, pp. 44-54, March 1998.
[98] Myers, B. A., Bhatnagar, R., Nichols, J., Peck, C. H., Kong, D., Miller, R., and Long, A. C., "Interacting at a Distance: Measuring the Performance of Laser Pointers and Other Devices," in Proceedings of CHI '02, Minnesota, ACM Press, April, 2002, pp. 33-40.
[99] Myers, B. A., Peck, C. H., Nichols, J., Kong, D., and Miller, R., "Interacting at a Distance Using Semantic Snarfing," in Proceedings of Ubicomp 2001, Springer-Verlag, 2001, pp. 305-314.
[100] Namco, "Time Crisis 4", 2008, http://kotaku.com/gaming/namco/aou2006-namcos-time-crisis-4-soul-calibur-iii-ae-155679.php
[101] NaturalPoint, "NaturalPoint SmartNav", 2004, http://www.naturalpoint.com/smartnav/
[102] Nickel, K. and Stiefelhagen, R., "Pointing Gesture Recognition based on 3D-Tracking of Face, Hands and Head-Orientation," in Proceedings of ICMI '03, International Conference on Multimodal Interfaces, Vancouver, ACM Press, November, 2003, pp. 140-146.
[103] Nielsen, J., "Iterative User-Interface Design," IEEE Computer, 26(11), pp. 32-41, 1993.
[104] Nintendo, "Nintendo Wii Console", 2008, http://wii.com/
[105] Norman, D. A., "Cognitive Engineering," chapter I-3, pp. 31-61, New Jersey, Lawrence Erlbaum Associates, Inc., 1986.
[106] Oh, J.-Y. and Stuerzlinger, W., "Laser Pointers as Collaborative Pointing Devices," in Proceedings of Graphics Interface, 2002, pp. 141-149.
[107] Oka, K., Sato, Y., and Koike, H., "Real-Time Fingertip Tracking and Gesture Recognition," IEEE Computer Graphics and Applications, 22(6), pp. 64-71, 2002.
[108] Olsen, D. R. and Nielsen, T., "Laser Pointer Interaction," in Proceedings of CHI'01, ACM Conference on Human Factors in Computing Systems, Seattle, WA, ACM Press, 2001, pp. 17-22.
[109] OpenCV, "OpenCV", http://sourceforge.net/projects/opencvlibrary/
[110] OpenCV, "OpenCV forum, posted by Yusuf Bediz", 2004, http://groups.yahoo.com/group/OpenCV/message/18953
[111] Peck, C. H., "Useful Parameters for the Design of Laser Pointer Interaction Techniques," in Proceedings of CHI 2001 (Extended Abstracts), Conference on Human Factors in Computer Systems, Minnesota, ACM Press, April, 2002, pp. 461-462.
[112] Pierce, J., Forsberg, A., Conway, M., Hong, S., and Zeleznik, R., "Image Plane Interaction Techniques in 3D Immersive Environments," in Proceedings of Symposium on Interactive 3D Graphics, 1997, pp. 39-43.
[113] Pinhanez, C., "Using a Steerable Projector and a Camera to Transform Surfaces into Interactive Displays," in Proceedings of CHI 2001, IBM T J Watson Research Center, March, 2001, pp. 369-370.
[114] Pinhanez, C., "The Everywhere Displays Project", 2003, http://www.research.ibm.com/ed/
[115] Polhemus, "Polhemus FasTrak", http://www.cybermind.nl/Info/EURO_PriceList.htm
[116] Preece, J., Human-Computer Interaction, Addison-Wesley, 1994.
[117] Rakkolainen, I., "MobiVR - A Novel User Interface Concept for Mobile Computing," in Proceedings of the 4th International Workshop on Mobile Computing, Rostock, Germany, June 17-18, 2003, pp. 107-112.
[118] Rauterberg, M., "Über die Qualifizierung software-ergonomischer Richtlinien" [On the qualification of software-ergonomic guidelines], Ph.D. Thesis, University of Zurich, Zurich, 1995.
[119] Rekimoto, J., "SmartSkin: An Infrastructure for Freehand Manipulation on Interactive Surfaces," in Proceedings of CHI '02, ACM Press, April, 2002, pp. 113-120.
[120] Ringel, M., Berg, H., Jin, Y., and Winograd, T., "Barehands: Implement-Free Interaction with a Wall-Mounted Display," in Proceedings of CHI 2001, ACM Press, 2001, pp. 367-368.
[121] Robertson, S., Wharton, C., Ashworth, C., and Franzke, M., "Dual Device User Interface Design: PDAs and Interactive Television," in Proceedings of CHI'96, Conference on Human Factors in Computing Systems, ACM Press, 1996, pp. 79-86.
[122] Sato, Y., Saito, M., and Koike, H., "Real-Time Input of 3D Pose and Gestures of a User's Hand and Its Applications for HCI," in Proceedings of VR'01, IEEE Virtual Reality 2001 Conference, March, 2001, pp. 79-86.
[123] Saxena, A., Chung, S. H., and Ng, A. Y., "3-D Depth Reconstruction from a Single Still Image," International Journal of Computer Vision, 76(1), pp. 53-69, 2007.
[124] Segen, J. and Kumar, S., "GestureVR: Vision-Based 3D Hand Interface for Spatial Interaction," in Proceedings of the ACM Multimedia Conference, Bell Laboratories, 1998, pp. 455-464.
[125] Shneiderman, B., "Direct Manipulation: A Step Beyond Programming Languages," IEEE Computer, 16(8), pp. 57-69, August 1983.
[126] Shneiderman, B., Designing the User Interface: Strategies for Effective Human-Computer Interaction, Addison-Wesley Publishing Company, 1987.
[127] SMART, "DViT: Digital Vision Touch Technology," White Paper, SMART Technologies Inc., 2003, http://www.smarttech.com/dvit/DViT_white_paper.pdf
[128] Song, L. and Takatsuka, M., "Real-time 3D Finger Pointing for an Augmented Desk," in Proceedings of the 6th Australasian User Interface Conference (AUIC2005), Newcastle, ACS, 2005.
[129] SONY, "SONY EyeToy", 2003, http://www.eyetoy.com, date accessed: 15 March 2008
[130] Strickon, J. and Paradiso, J., "Tracking Hands Above Large Interactive Surfaces with a Low-Cost Scanning Laser Rangefinder," in Extended Abstracts of the CHI'98 Conference, ACM Press, April 21-23, 1998, pp. 213-232.
[131] Sukthankar, R., Stockton, R. G., and Mullin, M. D., "Self-Calibrating Camera-Assisted Presentation Interface," in Proceedings of ICARCV'00, International Conference on Automation, Control, Robotics and Computer Vision, December, 2000, pp. 33-40.
[132] Sun Microsystems Inc., "Definition of View Frustum", 2004, http://java.sun.com/products/java-media/3D/forDevelopers/j3dguide/glossary.doc.html#47353
[133] Sun Microsystems, "Project Looking Glass, Sun Microsystems", 2004, http://wwws.sun.com/software/looking_glass/
[134] Sundin, M., "Elastic Computer Input Control in Six Degrees of Freedom," Ph.D. Thesis, Royal Institute of Technology (KTH), Stockholm, Sweden, 2001.
[135] Sun Microsystems, "Project Looking Glass, Sun Microsystems", 2004, http://wwws.sun.com/software/looking_glass/
[136] Sydell, A., "Microsoft Surface Image", http://sparkingtech.com/computers/microsoft-surface-coming-sooner-due-to-high-demand/
[137] Takatsuka, M., West, G. A. W., Venkatesh, S., and Caelli, T. M., "Low-cost Interactive Active Range Finder," Machine Vision and Applications, 14, Springer-Verlag, pp. 139-144, 2003.
[138] Takenaka Corporation, "Hand Pointing System", 1998, http://www.takenaka.co.jp/takenaka_e/news_e/pr9806/m9806_02_e.htm, date accessed: 5 July 2004
[139] Tan, D. S., Gergle, D., Scupelli, P. G., and Pausch, R., "With Similar Visual Angles, Larger Displays Improve Spatial Performance," in Proceedings of CHI '03, ACM Press, April, 2003, pp. 217-224.

[140] Tang, A., Boyle, M., and Greenberg, S., "Display and Presence Disparity in Mixed Presence Groupware," in Proceedings of AUIC '04, the Fifth Conference on Australasian User Interface, Dunedin, New Zealand, ACM Press, 2004, pp. 73-82.
[141] Tang, A., Neustaedter, C., and Greenberg, S., "VideoArms: Embodiments for Mixed Presence Groupware," in Proceedings of the 20th British HCI Group Annual Conference (HCI 2006), London, UK, ACM Press, 2006, pp. 85-102.
[142] Taylor, J. and McCloskey, D. I., "Pointing," Behavioural Brain Research, 29, pp. 1-5, 1988.
[143] The Gale Group Inc., "Gale Encyclopedia of Neurological Disorders", 2005, http://www.answers.com/topic/proprioception
[144] TiVo Inc., http://www.tivo.com/0.0.asp
[145] Torralba, A. and Oliva, A., "Depth Estimation from Image Structure," IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9), pp. 1226-1238, 2002.
[146] Tosas, M. and Li, B., "Virtual Touch Screen for Mixed Reality," in Proceedings of the European Conference on Computer Vision, Prague, Springer Berlin, 2004, pp. 72-82.
[147] TouchMe, "TouchMe Interactive Solutions AB", http://www.touchme.nu/, date accessed: 1/5/09
[148] University of Salford, "3D Manchester in Virtual Reality Cave", 2008, http://www.vision.salford.ac.uk/cms/pages/photo_gallery/category?cat=virtual
[149] USA Today, "Ralph Lauren debuts 'window shopping' touch screen", 2007, http://www.usatoday.com/tech/news/techinnovations/2007-06-20-ralph-lauren-window-shopping_N.htm, date accessed: 24 September 2008
[150] Vezhnevets, V., Sazonov, V., and Andreeva, A., "A Survey on Pixel-Based Skin Color Detection Techniques," in Proceedings of Graphicon, Russia, 2003, pp. 85-92.
[151] Viola, P. and Jones, M., "Rapid Object Detection using a Boosted Cascade of Simple Features," in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, December 8-14, 2001, pp. 511-518.
[152] Viola, P. and Jones, M., "Robust Real-Time Face Detection," International Journal of Computer Vision, 57(2), pp. 137-154, 2004.
[153] Vogel, D. and Balakrishnan, R., "Distant Freehand Pointing and Clicking on Very Large, High Resolution Displays," in Proceedings of UIST'05, ACM Symposium on User Interface Software and Technology, Seattle, ACM Press, 2005, pp. 33-42.
[154] Wahlster, W., "User and Discourse Models for Multimodal Communication," in Proceedings of Intelligent User Interfaces, New York, ACM Press, 1991, pp. 45-67.
[155] Weiser, M., "The Computer for the 21st Century," Scientific American, 265, Xerox PARC, pp. 94-104, September 1991.
[156] Weiser, M., "Ubiquitous Computing", 1993, http://nano.xerox.com/hypertext/weiser/UbiCompHotTopics.html, date accessed: 9 September 2008
[157] Wellner, P., "Interacting with Paper on the DigitalDesk," Communications of the ACM, 36(7), ACM Press, pp. 87-96, July 1993.
[158] Westerink, J. H. D. M. and Reek, K. v. d., "Pointing in Entertainment-Oriented Environments: Appreciation versus Performance," in Proceedings of CHI'95, Conference on Human Factors in Computing Systems, ACM Press, 1995, pp. 41-44.
[159] Wikipedia, "Definition of View Frustum", 2004, http://en.wikipedia.org/wiki/View_frustum
[160] Wilson, A. and Oliver, N., "GWindows: Robust Stereo Vision for Gesture-Based Control of Windows," in Proceedings of the International Conference on Multimodal Interfaces, ACM, 2003, pp. 211-218.
[161] Zatsiorsky, V. M., Kinematics of Human Motion, 1st ed., Human Kinetics Publishers, 1998.
