TOWARDS A ROBUST FRAMEWORK FOR VISUAL HUMAN-ROBOT INTERACTION

Junaed Sattar
School of Computer Science
McGill University, Montréal
November 2011

A Thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Doctor of Philosophy

© Junaed Sattar, MMXI

ABSTRACT

This thesis presents a vision-based interface for human-robot interaction and control of autonomous robots in arbitrary environments. Vision has the advantage of being a low-power, unobtrusive sensing modality. The advent of robust algorithms and a significant increase in computational power are the two most significant reasons for its widespread integration. The research presented in this dissertation examines visual sensing as an intuitive and uncomplicated method for a human operator to communicate at close range with a mobile robot. The array of communication paradigms we investigate includes, but is not limited to, visual tracking and servoing, programming of robot behaviors with visual cues, visual feature recognition, mapping, identification of individuals through gait characteristics using spatio-temporal visual patterns, and quantification of the performance of these human-robot interaction approaches. The proposed framework enables a human operator to control and program a robot without the need for any complicated input interface, and also enables the robot to learn about its environment and the operator through the visual interface. We investigate the applicability of machine learning methods, supervised learning in particular, to train the vision system using stored training data. A key aspect of our work is a system for human-robot dialog for safe and efficient task execution under uncertainty. We present extensive validation through a set of human-interface trials, and also demonstrate the applicability of this research in the field on the Aqua amphibious robot platform in the underwater domain.
While our framework is not specific to robots operating in the underwater domain, vision under water is affected by a number of issues, such as lighting variations and color degradation, among others. Evaluating the approach in such difficult operating conditions provides a definitive validation of our approach.

RÉSUMÉ

This thesis presents a vision-based interface that enables interaction between humans and robots, as well as the control of autonomous robots operating in arbitrary environments. Vision has the advantage of being an unobtrusive, low-power sensing modality. The availability of robust algorithms and a significant increase in computational power are two of the most important reasons for such widespread integration. The research presented in this dissertation evaluates visual sensing as a simple and intuitive method for a human operator to communicate at close range with a mobile robot. The set of communication paradigms studied includes, but is not limited to, visual tracking and servoing, the use of visual cues to program robot behaviors, visual feature recognition, and the recognition of individuals through their characteristic body movements using spatio-temporal visual patterns, while quantifying the performance of these approaches to human-robot interaction. The proposed framework allows a human operator to program and control a robot without the need for a complex input interface. It also allows the robot to recognize key features of its environment and of its human operator through a visual interface. We study the applicability of supervised machine learning methods to train the vision system using stored training data.

An important aspect of this research is the development of a human-robot dialog system that permits the safe and efficient execution of tasks under uncertainty. We present extensive validation through numerous trials of our interface with human participants. We also demonstrate the possible applications of this research in underwater deployments of Aqua, an amphibious robot platform. While our research framework is not specialized for underwater robotics, vision under water is affected by numerous factors, notably varying illumination and color degradation. Evaluating the approach in such difficult operating conditions provides a definitive validation of our research.

ACKNOWLEDGEMENT

The author would like to gratefully acknowledge the contribution and support of colleagues, friends and family, without whom this thesis would not have been possible. First and foremost, I would like to express my gratitude to Gregory Dudek, supervisor, friend and mentor for seven years. Without his infectious enthusiasm and steadfast faith in my abilities, two theses, more than ten field trials and a variety of stimulating research projects would not have seen the light of day. I would not be pursuing a doctoral degree, let alone in the field of Robotics, if it were not for Greg. Rarely am I at a loss for words to describe my gratitude, but I feel no amount of words would ever do justice to the inspiration he has provided. I am also thankful to the Dudek family, Nick, Natasha and Krys, for treating me as one of their own, and for unhesitatingly helping out on numerous research field trials, often without being asked. My appreciation also extends to the various members of my PhD committee, and to faculty members at the School of Computer Science and the Centre for Intelligent Machines (CIM).
In particular, I am thankful to Joelle Pineau, Tal Arbel, Doina Precup, Godfried Toussaint, Sue Whitesides, and Frank Ferrie, who have provided valuable guidance during the course of my PhD. I would like to thank Michael Langer, for it was his course in Computational Perception that sparked my interest in machine vision all those years ago. Meyer Nahon and Inna Sharf have assisted with ideas regarding the Aqua platform, and in various field trials over the years in Barbados. I also acknowledge the input provided by Nicholas Roy and Nando de Freitas. The staff at CIM and the School of Computer Science, both past and present, have made my life easier in so many ways, and my appreciation goes to them as well: Cynthia Davidson, Marlene Gray, Jan Binder, Patrick McLean, Diti Anastasopoulos, Sheryl Morrissey and Vanessa Bettencourt-Laberge. All of this research has been validated on board the Aqua platform, which has proven to be an amazingly robust and versatile robotic platform to build my research upon. For that, I am thankful to my friend and colleague Christopher Prahacs, the man behind the design and construction of the Aqua robots. Without his unwavering dedication, impeccable work ethic and extreme patience, there would not be even one Aqua robot, let alone three. I am grateful for having had the opportunity to work alongside and learn from Chris, lessons very few textbooks can teach, if any. My lab-mates and friends at the Mobile Robotics Lab have provided the best support any doctoral student can ask for. I would like to thank Eric Bourque for his advice and amazing insights into programming in general and the world of open-source software. Ioannis Rekleitis gets a big thank-you for providing video and photographic support for almost all the Aqua trials.
Malika Meghjani, Gabrielle Charette, Yogesh Girdhar, Anqi Xu, Florian Shkurti, Nicolas Plamondon, Olivia Chiu, Matt Garden, Dave Meger, Dimitri Marinakis and Bir Bikram Dey have all played key roles in assisting with robot trials and providing stimulating discussions towards my research. Special gratitude goes to Philippe Giguère, for sharing the task of playing "robot parent" with Chris Prahacs and myself over the years. Travis Thomson and Erica Dancose take credit for translating the thesis abstract into French. Last but not least, an enormous amount of gratitude goes to my family. No words would suffice to express the love and support Rafa has given me all these years. She has unwaveringly seen me through the toughest of times, and I am thankful to have her beside me. As she approaches the end of her own doctoral journey, I hope I can be for her what she has been for me. This thesis is for Nadyne, my little angel, and my father, the late M. A. Sattar.

TABLE OF CONTENTS

LIST OF FIGURES . . . xv
LIST OF TABLES . . . xxi
CHAPTER 1. Introduction . . . 1
1.1. A Framework for Visual Human-Robot Interaction . . . 1
1.2. Visual Human-Robot Interaction . . . 3
1.3. Overview of Approach . . . 6
1.3.1. Visual Programming of a Mobile Robot . . . 7
1.3.2. People Tracking . . . 8
1.3.3. Risk-Uncertainty Assessment . . . 8
1.4. Motivation . . . 9
1.5. Contributions . . . 11
1.6. Statement of Originality . . . 12
1.7. Document Outline . . . 13
CHAPTER 2. A Framework for Robust Visual HRI . . . 15
2.1. Related Work . . . 16
2.1.1. Visual Tracking . . . 16
2.1.2. Distribution Similarity Measures . . . 19
2.1.3.