Gesture-Based Interaction with Time-of-Flight Cameras

From the Institute for Neuro- and Bioinformatics of the University of Lübeck
Director: Prof. Dr. rer. nat. Thomas Martinetz

Inaugural dissertation for the attainment of the doctoral degree of the University of Lübeck
From the Faculty of Technology and Natural Sciences
Submitted by Martin Haker from Reinbek
Lübeck 2010

First referee: Prof. Dr.-Ing. Erhardt Barth
Second referee: Prof. Dr.-Ing. Achim Schweikard
Date of oral examination: 20 December 2010
Approved for printing. Lübeck, 10 January 2011

Contents

Acknowledgements

Part I Introduction and Time-of-Flight Cameras

1 Introduction
2 Time-of-Flight Cameras
  2.1 Introduction
  2.2 State-of-the-art TOF Sensors
  2.3 Alternative Optical Range Imaging Techniques
    2.3.1 Triangulation
    2.3.2 Photometry
    2.3.3 Interferometry
    2.3.4 Time-of-Flight
    2.3.5 Summary
  2.4 Measurement Principle
  2.5 Technical Realization
  2.6 Measurement Accuracy
  2.7 Limitations
    2.7.1 Non-Ambiguity Range
    2.7.2 Systematic Errors
    2.7.3 Multiple Reflections
    2.7.4 Flying Pixels
    2.7.5 Motion Artefacts
    2.7.6 Multi-Camera Setups

Part II Algorithms

3 Introduction
4 Shading Constraint
  4.1 Introduction
  4.2 Method
    4.2.1 Probabilistic Image Formation Model
    4.2.2 Lambertian Reflectance Model
    4.2.3 Computation of Surface Normals
    4.2.4 Application to Time-of-Flight Cameras
  4.3 Results
    4.3.1 Synthetic Data
    4.3.2 Real-World Data
  4.4 Discussion
5 Segmentation
  5.1 Introduction
    5.1.1 Segmentation Based on a Background Model
    5.1.2 Histogram-based Segmentation
    5.1.3 Summary
6 Pose Estimation
  6.1 Introduction
  6.2 Method
  6.3 Results
    6.3.1 Qualitative Evaluation
    6.3.2 Quantitative Evaluation of Tracking Accuracy
  6.4 Discussion
7 Features
  7.1 Geometric Invariants
    7.1.1 Introduction
    7.1.2 Geometric Invariants
    7.1.3 Feature Selection
    7.1.4 Nose Detector
    7.1.5 Results
    7.1.6 Discussion
  7.2 Scale Invariant Features
    7.2.1 Introduction
    7.2.2 Nonequispaced Fast Fourier Transform (NFFT)
    7.2.3 Nose Detection
    7.2.4 Experimental Results
    7.2.5 Discussion
  7.3 Multimodal Sparse Features for Object Detection
    7.3.1 Introduction
    7.3.2 Sparse Features
    7.3.3 Nose Detection
    7.3.4 Results
    7.3.5 Discussion
  7.4 Local Range Flow for Human Gesture Recognition
    7.4.1 Range Flow
    7.4.2 Human Gesture Recognition
    7.4.3 Results
    7.4.4 Discussion

Part III Applications

8 Introduction
9 Facial Feature Tracking
  9.1 Introduction
  9.2 Nose Tracking
  9.3 Results
  9.4 Conclusions
10 Gesture-Based Interaction
  10.1 Introduction
  10.2 Method
    10.2.1 Pointing Gesture
    10.2.2 Thumbs-up Gesture
    10.2.3 System Calibration
  10.3 Results
  10.4 Discussion
11 Depth of Field Based on Range Maps
  11.1 Introduction
  11.2 Method
    11.2.1 Upsampling the TOF range map
    11.2.2 Preprocessing the Image Data
    11.2.3 Synthesis of Depth of Field
    11.2.4 Dealing with Missing Data
  11.3 Results
  11.4 Discussion
12 Conclusion

References

Acknowledgements

There are a number of people who have contributed to this thesis, both in form and content as well as on a moral level, and I want to express my gratitude for this support.

First of all, I want to thank Erhardt Barth. I have been under his supervision for many years and I feel that I have received a lot more than supervision, namely earnest support on both a technical and the above-mentioned moral level. Thank you!

At the Institute for Neuro- and Bioinformatics I am grateful to Thomas Martinetz for giving me the opportunity to pursue my PhD. I thank Martin Böhme for a fruitful collaboration on the work presented in this thesis. Special gratitude goes to Michael Dorr, who was always around when advice or a motivating chat was needed. I also want to thank all my other colleagues for providing such a healthy research environment, not only on a scientific level but also at the foosball table.

I also need to thank the undergraduate students who worked with me over the years. Special thanks go to Michael Glodek and Tiberiu Viulet, who contributed significantly to the results of this thesis.

Major parts of this thesis were developed within the scope of the European project ARTTS (www.artts.eu). I thank the ARTTS consortium for the collaboration and the European Commission for funding the project (contract no. IST-34107) within the Information Society Technologies (IST) priority of the 6th Framework Programme.
This publication reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.

Part I Introduction and Time-of-Flight Cameras

1 Introduction

Nowadays, computers play a dominant role in various aspects of our lives and it is difficult, especially for younger generations, to imagine a world without them. In their early days, computers were mainly used to perform the tedious work of solving complex mathematical evaluations. Only universities and large companies were able to afford computers, and computing power was expensive. This changed with the advent of personal computers in the 1970s, such as the Altair and the Apple II. These small yet powerful machines for individual use brought the technology into both offices and homes. In this period, computers were also introduced to the gaming market.

As computers were intended for a wider range of users, they required suitable forms of user interaction. While punched cards provided a means for computer specialists to feed simple programs into early computer models even up to the 1960s, the general user required a graphical user interface and intuitive input devices, such as mouse and keyboard.

In the years that followed, the computer industry experienced enormous technological progress; computing power increased significantly while both the cost of production and the size of the devices were reduced. This development led to the emergence of computers in more and more aspects of our lives. Hand-held devices allow us to access the internet wherever we are. Complex multimedia systems for viewing digital media, such as movies or photos, appear in our living rooms. And computer-aided systems provide support in automobiles, production facilities, and hospitals.
These new applications of computer technology require novel forms of human-computer interaction; mouse and keyboard are not always the ideal means of providing user input. Here, a lot of progress has been achieved in the areas of speech recognition systems and touch screens in recent years. In the former case, however, it has proven difficult to devise systems that perform with sufficient reliability and robustness: apart from correctly recognizing speech, it is difficult to decide whether an incoming signal is intended for the system or whether it belongs to an arbitrary conversation. So far, this has limited the widespread use of speech recognition systems.

Touch screens, on the other hand, have experienced broad acceptance in information terminals and hand-held devices, such as Apple's iPhone. Especially in the latter case, one can observe a new trend in human-machine interaction: the use of gestures.

The advantage of touch-sensitive surfaces in combination with a gesture recognition system is that a touch on the surface can carry more meaning than a simple pointing action that selects a certain item on the screen. For example, the iPhone can recognize a swiping gesture to navigate between so-called views, which are used to present different content to the user, and two-finger gestures can be used to rotate and resize images.

Taking the concept of gestures one step further means recognizing and interpreting human gestures independently of a touch-sensitive surface. This has two major advantages: (i) it comes closer to our natural use of gestures as a body language, and (ii) it opens a wider range of applications where users do not need to be in physical contact with the input device. The success of such systems has first been shown in the gaming market
