UNSUPERVISED LEARNING AND REVERSE OPTICAL FLOW IN MOBILE ROBOTICS

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Andrew Lookingbill
May 2011

© 2011 by Andrew Lookingbill. All Rights Reserved. Re-distributed by Stanford University under license with the author.

This dissertation is online at: http://purl.stanford.edu/mz066kz5780

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Sebastian Thrun, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Bernd Girod

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Andrew Ng

Approved for the Stanford University Committee on Graduate Studies. Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

Preface

They say you are supposed to be able to describe your research to a layperson in five minutes: your “elevator pitch.” In that sense, my graduate work is perfect. Whether the audience consists of strangers on a plane, extended family, or curious neighbors, what I do, why I do it, and the basics of how it is done are straightforward to explain. To teach robots to “see,” to operate independently of human supervision, and to learn about the environment without having explicitly labeled data is exciting stuff. So, while this thesis may not have any plot twists or a surprise ending, I hope you find it interesting reading. And who knows? In the coming revolution, I may be partly to blame for any stray toasters.

Acknowledgments

Research, at least at Stanford, is a collaborative effort. I have had the unique good fortune to work with some extraordinarily talented people during my time here. I would like to thank the members of the Stanford Autonomous Helicopter Project for their help in acquiring the video used for testing the multi-object tracking algorithm discussed in Chapter 2. I would also like to thank my collaborators David Lieb, David Stavens, John Rogers, Jim Curry, and Itai Katz for their insights as well as the long hours, late nights, sunburns, and mosquito bites we endured as we wrote and then tested our algorithms in the field. I am indebted to the members of my reading committee, Professors Girod and Ng, my thesis defense committee chair, Professor Widrow, and my advisor, Professor Thrun. Finally, I want to thank my mother, for everything.

Contents

Preface

Acknowledgments

1 Introduction
   1.1 Thesis structure

2 Optical Flow and Reverse Optical Flow
   2.1 Feature Selection
   2.2 Feature Tracking
   2.3 Flow Caching and Traceback
   2.4 Examples
   2.5 Related Work

3 Multi-Object Tracking and Activity Models
   3.1 Learning Activity Maps from a Moving Platform
      3.1.1 Feature Selection and Feature Tracking
      3.1.2 Identifying Moving Objects on the Ground
      3.1.3 Tracking Moving Objects with Particle Filters
      3.1.4 Learning the Activity-Based Ground Model
   3.2 Applications
      3.2.1 Using the Activity Model for Improved Tracking
      3.2.2 Registration Based on Activity Models
   3.3 Results
      3.3.1 Hypotheses
      3.3.2 Methods
      3.3.3 Findings
      3.3.4 Additional Results
      3.3.5 Conclusions
   3.4 Related Work

4 Road Following
   4.1 Adaptive Road Following
   4.2 Results
      4.2.1 Hypotheses
      4.2.2 Methods
      4.2.3 Findings
      4.2.4 Additional Results
      4.2.5 Conclusions
   4.3 Related Work

5 Self-Supervised Navigation
   5.1 Off-Road Navigation Algorithm
      5.1.1 Alternate Approaches
   5.2 Results
      5.2.1 Hypotheses
      5.2.2 Methods
      5.2.3 Findings
      5.2.4 Additional Results
      5.2.5 Conclusions
   5.3 Related Work

6 Conclusions and Future Work
   6.1 Conclusions
   6.2 Future Work

Bibliography

List of Tables

3.1 Single and Multi-Object Tracking Performance

List of Figures

2.1 Features identified using an algorithm by Shi and Tomasi [5].
2.2 Image pyramids, filtered and subsampled, for two consecutive video frames. A sum of squared differences measure is iteratively minimized between the two, moving from coarser to finer levels, to calculate the optical flow for a given feature.
2.3 (a) Optical flow based on a short image sequence for an image containing a moving object (dark car).
2.4 Changes in texture and color appearance with distance
2.5 Changes in specular illumination with distance. Specular illumination depends on the angle of incidence at point P, which differs between robot positions a and b.
2.6 White lines represent the optical flow field in a typical desert driving scene (the length of flow vectors has been scaled by a factor of 2 for clarity)
2.7 Optical flow compressed and stored for a number of frames in the past
2.8 Operations for tracing the location of a feature backwards in time
2.9 (a) Points selected in initial video frame (b) Origin of points 200 frames in the past
2.10 (a) Numbered points selected in initial video frame (b) Corresponding points 200 frames in the past
2.11 Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.

2.12 Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
2.13 Indoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.
2.14 (a) Points on desert roadway selected in initial video frame (b) Corresponding points 200 frames in the past

3.1 The Stanford Helicopter is based on a Bergen Industrial Twin platform and is outfitted with instrumentation for autonomous flight (IMU, GPS, magnetometer, PC104). In the experiments reported here the onboard laser was replaced with a color camera.
3.2 (a) Optical flow based on a short image sequence, for an image containing a moving object (dark car). (b) The “corrected” flow after compensating for the estimated platform motion, which itself is obtained from the image flow. The reader will notice that this flow is significantly higher for the moving car. These images were acquired with the Stanford helicopter.
3.3 (a) Multiple particle filters, used for tracking multiple moving objects on the ground. Lighter particles have been more heavily weighted in the reward calculation, and are more likely to be selected during the resampling step. Shown here is an example of tracking three moving objects on the ground, a bicyclist and two pedestrians (the truck in the foreground is not moving). (b) The center of each particle filter in a different frame in the sequence clearly identifies all moving objects.
3.4 Two moving objects being tracked in video taken from a helicopter as part of a DARPA demo.

3.5 Example of a learned activity map of an area on campus, using data acquired from a camera platform undergoing unknown motion. The arrows indicate the most likely motion direction modes in each grid cell; their lengths correspond to the most likely velocity of that mode; and the thickness represents the probability of motion. This diagram clearly shows the main traffic flows around the circular object; it also shows the flow of pedestrians that moved through the scene during learning.
3.6 The single-frame alignment of two independent video sequences based on the activity-based models acquired from each. This registration is performed without image pixel information. It uses only activity information from the learning grid.
3.7 Example of two tracks (a) without and (b) with the learned activity map. The track in (a) is incomplete and misses the moving object for a number of time steps. The activity map enables the tracker to track the top object more reliably.
3.8 Selected features in frame from jogging video.
3.9 Features classified as moving in jogging video frame.
3.10 Particles corresponding to a particle filter tracking the jogger. Lighter colored particles have been heavily weighted in this step, darker particles have received a lower weight.
3.11 Dependence of tracking accuracy on number of moving objects in training data.

4.1 Adaptive road following algorithm
4.2 (a) Dark line shows the definition region used in the proposed algorithm. (b)-(d) White lines show the locations in previous frames to which optical flow has traced the definition region.
4.3 Output of horizon detector is the dark line
4.4 (a) Input video frame (b) Visualization of SSD matching response for 10 horizontal templates for this frame.

4.5 Dark circles represent locations of maximum SSD response along each horizontal search line. Light circles are the output of the dynamic programming routine. The gray region is the final output of the algorithm and is calculated using the dynamic programming output. The width of this region is linearly interpolated from the horizontal template widths. Uneven vertical spacing between circles is the result of changing vehicle speeds over time.
4.6 Single frame algorithm output for three Mojave Desert data sets. Each column contains results from one of the three video sequences.
4.7 Typical human-labeled ground-truth image
4.8 Pixel coverage results on the three test video sequences
4.9 (a) Input frame from the video characterized by straight dirt road with scrub brush lining the sides (b) Optical flow technique output (c) MRF classifier output
4.10 (a) Input frame from the video with long shadows and sparse vegetation (b) Optical flow technique output (c) MRF classifier output
4.11 (a) Input frame (b) SSD response
4.12 (a) Input frame from video characterized by changing elevations and gravel colors (b) Optical flow technique output (c) MRF classifier output
4.13 Line coverage results are shown at different distances from the front of the vehicle towards the horizon for the three video sequences.
4.14 Sample optical flow field in frame from winding desert video.
4.15 First figure shows the definition region in front of the vehicle, subsequent figure shows the location the definition region is tracked back to in successively earlier video frames. The last figure is 200 frames in the past.
4.16 Sample frame with roadway position calculated from positions of horizontal template matches.
4.17 Sample frame with roadway position calculated using dynamic programming approach.
4.18 Output of naive color-based road classification algorithm.

4.19 Output of naive texture-based road classification algorithm.
4.20 Comparison of reverse optical flow, color, and texture-based algorithms using the pixel coverage metric.
4.21 Comparison of reverse optical flow, color, and texture-based algorithms using the line coverage metric.

5.1 Off-road navigation algorithm
5.2 The LAGR robot platform. It is equipped with a GPS receiver, infrared bumpers, a physical bumper, and two stereo camera rigs.
5.3 Points corresponding to the good class are depicted by x's, bad class by circles, and lethal class by stars
5.4 Statistics for Gaussian mixture components after a run where the robot interacted with an orange fence, and avoided subsequent orange objects. Each row has the mean and standard deviation for that Gaussian component, followed by the number of good, bad, and lethal votes for that component based on training data.
5.5 (a) STFT during a period of normal robot operation (b) STFT during a period of detected wheel slippage
5.6 (a) Input frame (b) Raw segmentation output (bushes have been classified as obstacles) (c) Output of “bottom finder”
5.7 Top figure shows Gaussian mixture model of the scene (potential obstacles colored red). Bottom figure shows the placement of obstacles in the occupancy grid.
5.8 iRobot ATRV platform
5.9 Trees are classified as obstacles with a texture and color-based segmentation algorithm after interacting with the robot's physical bumper
5.10 Learned optical flow field is used by the robot to determine how to maneuver to push obstacles out of the field of view
5.11 (a) Video frame (b) Hand-labeled obstacle image (c) Segmentation without optical flow (d) Segmentation with optical flow

5.12 (a) Paths taken using data collected without optical flow (b) Paths taken using data collected with optical flow.
5.13 Autonomous navigation results with 95% confidence ellipses. The average run duration (in minutes) is indicated on the y-axis, and the average number of obstacles encountered on the x-axis.
5.14 Planner local and global trajectories
5.15 (a) Input frame (b) Initial classifier output (c) Traversable pixels (d) Polynomial contour (e) Refined classifier output (f) Estimated path

Chapter 1

Introduction

The amount of data available from on-board sensors on mobile robotics platforms is growing rapidly as the resolution of sensors increases and costs decrease. Monocular video streams alone, which provide staggering amounts of training and test data, together with the processing power necessary to pull useful information from them, open up new research opportunities with unique challenges. Unsupervised algorithms, with their ability to produce useful information without large labeled training sets, are an important tool for dealing with, and benefiting from, this abundance of data.

This thesis examines the application of unsupervised learning techniques to three subfields of mobile robotics. The first, tracking multiple moving objects from above, is an area of current interest for unmanned aerial vehicle (UAV) researchers. The second, road following in loosely-structured environments, was made famous by the DARPA Grand Challenge, an autonomous robot race featuring a 100+ mile course through different terrains. The third, autonomous off-road navigation, was the focus of DARPA's Learning Applied to Ground Robots (LAGR) challenge, a competition focusing on computer vision-based navigation.

This thesis describes three novel contributions, one in each of the subfields listed above. First, the ability to build dynamic, activity-based ground models from a moving platform paves the way for improved multi-object tracking (important for coping with real-world data containing multiple objects of interest) and for other applications such as video stream registration in video taken from UAVs.


Second, the combination of optical flow techniques and dynamic programming produces a real-time algorithm for accurately estimating the position of traversable areas in a loosely-structured environment. This allows improved road classification in unpaved driving conditions, in turn allowing higher robot travel speeds. Finally, an extension of these optical flow techniques allows an autonomously navigating robot to improve the quality of its obstacle classification in monocular video, which in turn improves its obstacle avoidance performance.

All the work described in this thesis uses a monocular video camera as the primary sensor. This is challenging in that it lacks the dense 3D range information available from laser scanners or stereo vision. It is useful, however, because a camera is a purely passive sensor, which is desirable in some applications, and because it provides information all the way to the horizon. As the experimental results discussed in this thesis will show, these contributions improve the state of the art in each of these three subfields.

1.1 Thesis structure

The following chapters of this thesis are based on published papers. Chapter 3 is based on a paper published at ICRA with coauthors Lieb, Stavens, and Thrun [1]. Chapter 4 is based on a paper published at RSS with coauthors Lieb and Thrun [2]. Chapter 5 is based on a paper published in IJCV with coauthors Rogers, Lieb, Curry, and Thrun, and on a book chapter in the Springer STAR series with coauthors Lieb and Thrun [3], [4].

All of the approaches discussed in this thesis make use of optical flow techniques. These techniques, which associate the location of an object in a given video frame with its location in an earlier frame of the video, are described in detail in Chapter 2.

Chapter 2

Optical Flow and Reverse Optical Flow

There are two tools from computer vision that play a large part in the work described in Chapters 3, 4, and 5 of this thesis.

Optical Flow: Trucco and Verri define optical flow as “the apparent motion of the image brightness pattern” [7]. The term sparse optical flow is often used to describe this motion for a subset of pixels in an image. The work in Chapter 3 of this thesis combines the techniques in Sections 2.1 and 2.2 to produce a sparse optical flow estimate used for object tracking in a monocular video stream taken from a moving platform. A note on terminology: what I refer to here as sparse optical flow is sometimes referred to in the literature more generally as feature tracking.

Reverse Optical Flow: I will define reverse optical flow as the use of a stored buffer of inter-frame sparse optical flow vectors, for some number of pairs of consecutive video frames in the past, to associate any pixel in the current image with the location of the object it corresponds to in an earlier video frame. This is interesting because, rather than identifying an object ahead of time and tracking its motion, we can pick an object in the current frame which is interesting (perhaps because of an interaction with short-range sensors) and examine its appearance at some point in the past. The work in Chapters 4 and 5 of this thesis combines the techniques in Sections 2.1, 2.2, and 2.3 into a robust implementation of the calculation of reverse optical flow.


2.1 Feature Selection

The first step of the proposed approach involves identifying appropriate features in the camera image and tracking them over multiple frames. The goal is to produce an algorithm that does not use color, texture, shape, or size information to track moving objects while keeping the details of the implementation simple. In addition, no prior assumptions about the nature of the moving objects being tracked are made. In the approach proposed here, features are first identified using an algorithm by Shi and Tomasi [5], which selects unambiguous feature points by finding regions in the image containing large spatial image gradients in two orthogonal directions. The features found and tracked by this algorithm are corners. Using Scale Invariant Feature Transform (SIFT) features would have been another option [6]. In order to find corners, the image is first smoothed using a Gaussian filter (with an 11x11 pixel kernel and standard deviation of 2.15 in both dimensions), and the minimal eigenvalue of the matrix

$$\begin{bmatrix} \sum E_x^2 & \sum E_x E_y \\ \sum E_x E_y & \sum E_y^2 \end{bmatrix} \qquad (2.1)$$

where $E_x = \partial E / \partial x$ is the spatial image gradient in the x direction, is then found at each pixel location [7]. Features whose minimal eigenvalue is smaller than a threshold (0.05 times the largest such eigenvalue in the image, in this case) are dropped. The OpenCV function cvGoodFeaturesToTrack was used to perform the operation discussed here [9]. A sample of features found by this algorithm, in an image acquired by the Stanford Autonomous Helicopter, is shown in Fig. 2.1.
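As a concrete illustration of this step, the sketch below uses the modern OpenCV Python bindings (the successor to the cvGoodFeaturesToTrack call named above). The smoothing kernel, sigma, and 0.05 quality threshold follow the text; the maximum corner count and minimum corner spacing are illustrative assumptions, and the sketch is not the original implementation.

```python
import cv2

def select_features(gray_frame, max_corners=500):
    """Shi-Tomasi corner selection, roughly as described in Section 2.1."""
    # Smooth with an 11x11 Gaussian kernel, sigma = 2.15 (sigmaY defaults to sigmaX).
    smoothed = cv2.GaussianBlur(gray_frame, (11, 11), 2.15)
    # goodFeaturesToTrack keeps corners whose minimal eigenvalue exceeds
    # qualityLevel times the largest minimal eigenvalue in the image (0.05 here).
    # maxCorners and minDistance are assumed values, not given in the text.
    corners = cv2.goodFeaturesToTrack(smoothed,
                                      maxCorners=max_corners,
                                      qualityLevel=0.05,
                                      minDistance=5)
    return corners  # float32 array of shape (N, 1, 2), or None if nothing found
```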

2.2 Feature Tracking

The tracking of the selected features is achieved using a pyramidal implementation of the Lucas-Kanade tracker [8]. This approach forms image pyramids consisting of filtered and subsampled versions of the original images (see Fig. 2.2). The pyramids had five levels; each level was half the size, in each dimension, of the level above.

Figure 2.1: Features identified using an algorithm by Shi and Tomasi [5].

Figure 2.2: Image pyramids, filtered and subsampled, for two consecutive video frames. A sum of squared differences measure is iteratively minimized between the two, moving from coarser to finer levels, to calculate the optical flow for a given feature.

The filtering consisted of smoothing with a 5x5 pixel Gaussian kernel with a standard deviation of 1.25 in both dimensions. The displacement vectors between the feature locations in the two images are found by iteratively minimizing the sum of squared errors over a small window, from the coarsest level up to the original level.

Figure 2.3: (a) Optical flow based on a short image sequence for an image containing a moving object (dark car).

The window used in this work was 3x3 pixels. The result of tracking features is shown in Fig. 2.3. The optical flow (the movement of the pixels corresponding to an object in image space) of a number of features, tracked through consecutive images and indicated by small arrows in the direction of the flow, is shown. This approach has two important benefits: it is robust to fairly large pixel displacements due to the pyramidal structure, and it allows the tracking of a sparse set of features to calculate camera and object motion, yielding a faster implementation than a dense optical flow calculation. Bilinear interpolation of image values at non-integer pixel locations is used to allow sub-pixel tracking accuracy. The tracker exits once either a maximum correlation is achieved or a maximum number of iterations has occurred for each tracked feature; the precision is limited by how quickly one of these exit criteria is met. The tracking was performed using the OpenCV function cvCalcOpticalFlowPyrLK [9]. The output of this stage is a sparse optical flow field. While this field does not have information about the movement of every pixel, it does give a good overview of the motion of the different objects between frames.
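A companion sketch of the tracking step, again using the Python binding of the OpenCV routine named in the text. The five pyramid levels and 3x3 window follow the description above; the termination criteria are assumed values, since the text does not give the maximum iteration count or the correlation threshold.

```python
import cv2

def track_features(prev_gray, next_gray, prev_pts):
    """Pyramidal Lucas-Kanade tracking of the selected features (Section 2.2)."""
    # Five pyramid levels -> maxLevel = 4 (level 0 is the original image);
    # the sum-of-squared-differences window is 3x3 pixels.
    # The iteration count / epsilon below are illustrative exit criteria.
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 20, 0.03)
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None,
        winSize=(3, 3), maxLevel=4, criteria=criteria)
    good = status.ravel() == 1
    # Sparse flow vectors for the successfully tracked features.
    flow = next_pts[good] - prev_pts[good]
    return prev_pts[good], next_pts[good], flow
```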

Figure 2.4: Changes in texture and color appearance with distance

2.3 Flow Caching and Traceback

Recall that reverse optical flow is the use of stored optical flow vectors between each video frame and its preceding frame to establish correspondences between pixels corresponding to objects in the current frame and their locations in frames from the past. This approach uses optical flow information to track features on objects from the time they appear on screen until they interact with the local sensors of the robot. Classification and segmentation algorithms can then be trained using the appearance of these features at large distances from the robot. The approach is motivated by the example shown in Fig. 2.4. Where traditional monocular image segmentation approaches use the visual characteristics of the tree at short range, shown in the inset on the right side of the figure, after a short-range sensor interaction, the proposed approach uses the characteristics of the tree at a much greater distance, shown in the inset on the left side of the figure.

Figure 2.5: Changes in specular illumination with distance. Specular illumination depends on the angle of incidence at point P, which differs between robot positions a and b.

There are several explanations for why the visual characteristics of obstacles differ so greatly at different distances from the robot. These include possible automatic gain control of the on-board camera; periodic textures that are not visible at great distances due to camera resolution; and changes in the specular component of object illumination, which depends on the viewing angle of the observer with respect to the surface normal of the object, which in turn depends on the distance between the observer and the object (see Fig. 2.5). To combat these distance-dependent changes in visual appearance and still extract useful terrain classification information from monocular images, the approach discussed in this chapter uses the standard optical flow procedures outlined in the preceding sections to assemble a history of inter-frame optical flow in real time as the robot navigates. This information is then used to trace features on any object in the current frame back to their positions in a previous frame.

This optical flow field is populated in the following manner. First, the optical flow between adjacent video frames is calculated as discussed in the preceding two sections of this chapter. A typical optical flow field captured in this manner is shown overlaid on the original frame from a dataset taken in the Mojave Desert in Fig. 2.6. The optical flow field for each consecutive pair of video frames is then subdivided and coarsened by dividing the 720x480 image into a 12x8 grid and averaging the optical flow vectors in grid cells after removing outliers (in this work, vectors reflecting inter-frame displacement of more than 60 pixels were dropped).

Figure 2.6: White lines represent the optical flow field in a typical desert driving scene (the length of flow vectors has been scaled by a factor of 2 for clarity)

The resulting grid, with a mean vector for each cell, is then stored in a ring buffer (for the work discussed in this thesis, a 200-frame buffer was used), a simplified version of which is pictured in Fig. 2.7. A point in the current frame can be traced back to its approximate previous location in any frame in the history buffer, or to the frame where it entered the robot's field of view, by adding, for each frame of the traceback, the offset vector of the grid cell the point currently falls in to the point's coordinates. The diagram shown in Fig. 2.8 illustrates how this is done for a 200-frame traceback. Zero flow is assumed when an optical flow grid cell is empty. Fig. 2.6 gives an idea of the relative density of the optical flow field.
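To make the caching and traceback concrete, the following sketch implements the bookkeeping described above: a 12x8 grid of mean flow vectors per frame pair, a 200-frame ring buffer, and a traceback that adds the stored cell offset at each step (assuming zero flow for empty cells). The class and all names are invented for illustration; this is a sketch rather than the dissertation's code.

```python
from collections import deque
import numpy as np

GRID_COLS, GRID_ROWS = 12, 8                         # 720x480 frame -> 60x60-pixel cells
CELL_W, CELL_H = 720 // GRID_COLS, 480 // GRID_ROWS
HISTORY = 200                                        # frames of flow kept in the ring buffer
MAX_DISP = 60                                        # outlier threshold on inter-frame displacement

class FlowHistory:
    def __init__(self):
        self.buffer = deque(maxlen=HISTORY)          # one coarsened flow grid per frame pair

    def push(self, prev_pts, next_pts):
        """Coarsen one frame pair's sparse flow into a 12x8 grid of mean vectors."""
        grid_sum = np.zeros((GRID_ROWS, GRID_COLS, 2))
        grid_cnt = np.zeros((GRID_ROWS, GRID_COLS))
        for (x0, y0), (x1, y1) in zip(prev_pts.reshape(-1, 2), next_pts.reshape(-1, 2)):
            d = np.array([x0 - x1, y0 - y1])         # offset pointing back in time
            if np.linalg.norm(d) > MAX_DISP:         # drop outlier displacements
                continue
            r, c = int(y1 // CELL_H), int(x1 // CELL_W)
            if 0 <= r < GRID_ROWS and 0 <= c < GRID_COLS:
                grid_sum[r, c] += d
                grid_cnt[r, c] += 1
        grid = np.where(grid_cnt[..., None] > 0,
                        grid_sum / np.maximum(grid_cnt[..., None], 1), 0.0)
        self.buffer.append(grid)

    def trace_back(self, x, y, n_frames=HISTORY):
        """Follow a point in the current frame back n_frames using the cached offsets."""
        for grid in list(self.buffer)[-1:-n_frames - 1:-1]:   # newest to oldest
            r, c = int(y // CELL_H), int(x // CELL_W)
            if 0 <= r < GRID_ROWS and 0 <= c < GRID_COLS:
                dx, dy = grid[r, c]                  # zero flow if the cell was empty
                x, y = x + dx, y + dy
        return x, y
```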

Figure 2.7: Optical flow compressed and stored for a number of frames in the past

2.4 Examples

A few additional examples of reverse optical flow applied in different environments are shown in this section. Fig. 2.9a shows a set of points in an input video frame denoted by white circles, while Fig. 2.9b shows the location of the points in a frame 200 frames earlier in the sequence, calculated with this technique. A similar image pair taken in the same forest, but with each point numbered to aid in visualization, is shown in Fig. 2.10. Additional image pairs showing a point and the corresponding image location it was traced back to using reverse optical flow, on videos taken in the lab and outdoors with a mobile robotic platform, are shown in Fig. 2.11, Fig. 2.12, and Fig. 2.13. Finally, Fig. 2.14 shows the points resulting from the application of reverse optical flow to a video taken along a road in an ill-structured desert environment. The similarities in texture of different areas of the roadway have slightly degraded the quality of the traceback.

Figure 2.8: Operations for tracing the location of a feature backwards in time

While all the points in the current frame lie along a single line, the calculated origin points in a frame 200 frames in the past are scattered slightly in the vertical direction. Even this level of accuracy in traceback can be helpful in learning algorithms, as will be discussed in Chapters 4 and 5.


Figure 2.9: (a) Points selected in initial video frame (b) Origin of points 200 frames in the past


Figure 2.10: (a) Numbered points selected in initial video frame (b) Corresponding points 200 frames in the past

2.5 Related Work

An alternative to the feature tracking discussed in this chapter, which results in a sparse optical flow field, is dense optical flow, where the per-pixel flow field is calculated. This calculation, based on the image constraint equation and handled in Horn and Schunck's seminal paper, can be slower than sparse optical flow and has historically suffered from poor performance in cases involving large displacements or changing illumination [11]. Recent work has addressed many of these problems. The work of Brox and Malik focuses on incorporating local descriptors such as SIFT features into the variational optical flow model by treating it as an optimization problem [10].

Figure 2.11: Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.

Figure 2.12: Outdoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.

The resulting method is capable of dealing robustly with large displacements. Meanwhile, Zang et al. have done work focusing on robust optical flow calculations in the face of brightness variations [12]. Their work focuses on replacing the brightness constancy assumption with local image phase constancy assumptions, and builds on the work of Bruhn et al., whose work, like that of Brox et al., strives to use elements of local optical flow methods to improve the robustness of dense optical flow techniques [13].

Figure 2.13: Indoor reverse optical flow example. Frame on right shows point when object interacts with robot's local sensors, frame on left shows image region where reverse optical flow indicates the point originated.


Figure 2.14: (a) Points on desert roadway selected in initial video frame (b) Corresponding points 200 frames in the past

Other recent work has improved the robustness of dense optical flow by applying RANSAC (RANdom SAmple Consensus)-like techniques to remove outliers [14].

The question of what constitutes a good feature for the purposes of tracking is an open one. Though the approach proposed here uses the simple corner features suggested by Shi and Tomasi, there are many alternatives. These include methods for adaptively selecting features while tracking. Collins et al. select features based on how well they separate sample distributions drawn from the presumed background and foreground [15], while Chen et al. similarly pair adaptive selection of color features with a particle filter for tracking to maximize color histogram differences between the background and foreground [16].

Other approaches, such as that proposed by Neumann and You, seek to combine the stability and robustness of using larger image patches for tracking with the relative transformation-invariance of smaller, point-sized features [17].

Improving the robustness of feature tracking has also been an area of active research in recent years. Zhou et al. chose to work with SIFT features and combine them with a mean-shift algorithm that maximizes the color histogram similarity of regions; using Expectation-Maximization, these two complementary approaches are combined to produce a robust tracker [18]. Dorini and Goldenstein's work on unscented feature tracking represents the uncertainty about the location of features using Gaussian random variables, and then uses this to improve the performance of the Kanade-Lucas-Tomasi feature tracker [19]. Takada and Sugaya improve the robustness of feature tracking by detecting incorrect feature tracks through an affine constraint imposed on feature trajectories [20]. Finally, the work of Ta et al. increases the efficiency of feature tracking by searching for matches within a neighborhood in a 3D image pyramid, rather than the 2D image itself, and by making use of a motion model [21].

Feature tracking is also being applied to novel environments. Rodrigo et al. introduce planar homographies to improve the results of feature tracking indoors [22]. Wagner et al. use a combination of SIFT and Ferns feature tracking to achieve 30 fps feature tracking on mobile phones [23]. Caballero et al. are pushing the limits of SLAM by using feature tracking from UAVs to serve as monocular visual odometry [24].

The term reverse optical flow appeared twice in the literature prior to the publication of the RSS paper on road following [2]. In the work by Fielding and Kam, the term is used to refer to the single inter-frame flow between a video frame and the one preceding it in time, as part of the process of using previously computed disparity maps to improve the quality and speed of disparity map calculation for dynamic stereo [25]. In Benoit's dissertation, the term is used to refer to using the inter-frame flow between a video frame and both the frame preceding it and the frame following it in time to recover scene information in a way that is robust to temporary occlusions [26]. The definition proposed in this chapter differs from these in that flow is calculated and stored for a much larger number of consecutive frames, allowing operations between frames significantly separated in time.

Chapter 3

Multi-Object Tracking and Activity Models

The work detailed in this chapter constitutes a novel contribution to the field: the construction and use of activity-based ground models built from video taken from a moving platform. The thrust of the research discussed in this chapter is the acquisition of activity-based models, which are models that characterize places based on the type of motion activities that occur there. For example, the activities found on roads differ from those found on sidewalks. Even among roads, motion characteristics vary significantly. Accurate activity-based ground models offer a number of potential benefits: they can help us understand traffic flow; they can assist unmanned ground vehicles in navigating autonomously (e.g., guide them to stay off a busy road); and they can help us spot activity-related change and abnormalities. Good activity models also facilitate the tracking of individual moving objects, as will be shown in this chapter.

The problem addressed here is the acquisition of activity-based ground models from a moving platform such as a helicopter. This system has been used with the Stanford helicopter shown in Fig. 3.1. The approach transforms video acquired by the helicopter, and other moving platforms, into probability distributions that characterize the frequency, speeds, and directions of moving objects on the ground, for each x-y location on the ground. To obtain such activity maps, the approach uses a pipeline of techniques for reliably extracting tracks and updating the map statistics.


Figure 3.1: The Stanford Helicopter is based on a Bergen Industrial Twin platform and is outfitted with instrumentation for autonomous flight (IMU, GPS, magnetometer, PC104). In the experiments reported here the onboard laser was replaced with a color camera.

The algorithm performs feature tracking in the image plane, followed by an optical flow analysis that uses Expectation-Maximization (EM) to identify features that are likely moving on the ground. Multiple particle filters, which are spawned, merged, and killed, are then applied to reliably identify multiple moving objects on the ground. The resulting tracks from the particle filters are fed into a histogram that characterizes the probability distribution over speeds and orientations of motions on the ground. This probability histogram constitutes the learned activity map. To illustrate the utility of the activity map, two applications are examined in this chapter: an improved particle filter tracker, and an application to the problem of global image registration.

3.1 Learning Activity Maps from a Moving Platform

3.1.1 Feature Selection and Feature Tracking

To determine which pixels in a given video frame potentially correspond to a moving object, the feature selection and feature tracking methods described in Sections 2.1 and 2.2 are applied.


Figure 3.2: (a) Optical flow based on a short image sequence, for an image containing a moving object (dark car). (b) The “corrected” flow after compensating for the estimated platform motion, which itself is obtained from the image flow. The reader will notice that this flow is significantly higher for the moving car. These images were acquired with the Stanford helicopter.

3.1.2 Identifying Moving Objects on the Ground

The principal difficulty in interpreting the optical flow to identify moving objects arises from the fact that most of the flow is caused by the platform's ego-motion. The flow shown in Fig. 3.2a is largely due to the helicopter's own motion; the only exception is the flow associated with the dark vehicle in the scene.

The proposed approach uses an Expectation-Maximization (EM) algorithm to identify the nature of the flow. Let $\{x_i, y_i, x_i', y_i'\}$ be the set of features returned by Lucas-Kanade, where $(x_i, y_i)$ corresponds to the image coordinates of a feature in one frame, and $(x_i', y_i')$ corresponds to the image coordinates of that feature in the next frame.

The displacement between these two sets of coordinates is proportional to the velocity of a feature relative to the camera plane (but not the ground!). The probability that $\{x_i, y_i, x_i', y_i'\}$ corresponds to a moving object on the ground is now calculated using an EM algorithm [27].

Specifically, let us define the binary variable $c_i$ that indicates whether the i-th feature is moving. Initially, we set $c_i = 0$ for all $i$, meaning that all features are assumed to be non-moving. The flow represented by $\{x_i, y_i, x_i', y_i'\}$ is then used to estimate the image plane transformation that results from ego-motion of the platform. The image plane transformation is represented with an affine model that captures translation, rotation, scaling, and shearing. Due to the small amount of camera motion between individual frames, and the small depth of field of the scene relative to the platform altitude, an affine transformation is a reasonable approximation in most cases. For each point $(x_i, y_i)$, the affine transformation determines its position $(x_i', y_i')$ in the subsequent frame:

$$\begin{bmatrix} x_i' & y_i' \end{bmatrix} = \begin{bmatrix} 1 & x_i & y_i \end{bmatrix} \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \qquad (3.1)$$

Using the set of feature correspondences $\{x_i, y_i, x_i', y_i'\}$, the linear least squares solution provides the optimal affine parameters $\vec{a}$ and $\vec{b}$. The key to the identification of moving features is now the E-step: based on the estimated image plane transformation, the expectation of the binary variable $c_i$ is then calculated:

$$p(c_i = 1 \mid \vec{a}, \vec{b}) = \eta \cdot \alpha \qquad (3.2)$$
$$p(c_i = 0 \mid \vec{a}, \vec{b}) = \eta \cdot \exp\left(-\frac{1}{2}\, \vec{D}^{\,T} \Sigma_D^{-1} \vec{D}\right) \qquad (3.3)$$

where
$$\vec{D} = \begin{bmatrix} x_i' \\ y_i' \end{bmatrix} - \left( \begin{bmatrix} 1 & x_i & y_i \end{bmatrix} \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ a_3 & b_3 \end{bmatrix} \right)^T
$$

$\eta$ is a normalization factor added to ensure that the probabilities sum to 1, and $\alpha$ is a constant chosen empirically ($\exp(2)$ in this case).

The matrix $\Sigma_D$ is a diagonal matrix of size 2-by-2, containing variances for the x and y components; in this particular case it was the 2x2 identity matrix. The subsequent M-step iterates the calculation of the model parameters, but is now weighted by the expectations calculated in the E-step.

A small number of iterations then leads to an improved ego-motion estimate and, more importantly, an estimate of the probability that a feature is moving, $p(c_i)$. Fig. 3.2b shows the result of the EM: the flow vectors shown there as small white arrows all correspond with high likelihood to a moving object. In this example, the algorithm correctly identifies the features associated with the vehicle as moving, whereas most features corresponding to static objects have been identified correctly as static (and are therefore omitted in Fig. 3.2b).
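The sketch below shows one way the least-squares affine fit of Eq. (3.1) and the E-step of Eqs. (3.2)-(3.3) could be implemented. The alpha value, the identity Sigma_D, and the initialization with all features assumed static follow the text; the number of iterations and all function names are illustrative assumptions, not the original implementation.

```python
import numpy as np

def fit_affine(pts, pts_next, weights):
    """Weighted least-squares fit of the affine model of Eq. (3.1).

    pts, pts_next: (N, 2) arrays of feature coordinates in two frames.
    """
    A = np.hstack([np.ones((len(pts), 1)), pts])         # rows are [1, x_i, y_i]
    W = np.sqrt(weights)[:, None]
    # Solve (W*A) * M = (W*pts_next) for the 3x2 parameter matrix [a | b].
    M, *_ = np.linalg.lstsq(W * A, W * pts_next, rcond=None)
    return M

def estimate_moving_probabilities(pts, pts_next, alpha=np.exp(2), n_iters=3):
    """EM-style identification of features whose flow disagrees with ego-motion."""
    weights = np.ones(len(pts))                          # start: all features assumed static
    for _ in range(n_iters):
        M = fit_affine(pts, pts_next, weights)           # M-step: ego-motion model
        A = np.hstack([np.ones((len(pts), 1)), pts])
        residual = pts_next - A @ M                      # the D vector of Eqs. (3.2)-(3.3)
        # Sigma_D is the 2x2 identity here, so the quadratic form is just |D|^2.
        p_static = np.exp(-0.5 * np.sum(residual ** 2, axis=1))
        p_moving = alpha * np.ones(len(pts))
        norm = p_moving + p_static                       # eta normalization
        p_moving, p_static = p_moving / norm, p_static / norm
        weights = p_static                               # E-step result re-weights the fit
    return p_moving                                      # p(c_i = 1) per feature
```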

3.1.3 Tracking Moving Objects with Particle Filters

Unfortunately, the data returned by the EM analysis are still too noisy for constructing activity-based maps. The affine model assumes an orthographic projection, and is therefore, in general, insufficient to model all possible platform motion. In addition, some features appear to have a high probability of belonging to moving objects due to association error in the Lucas-Kanade algorithm (if the interframe tracking just discussed confuses one feature for another in the second frame, the estimated motion for the original feature may be large). The resulting activity map would then show high activity in areas where the affine assumption breaks down or Lucas-Kanade errs.

To improve the quality of the tracking, the proposed approach employs multiple particle filters. This approach is capable of tracking a variable number of moving objects, spawning an individual particle filter for each such object. Particle filters in particular provide a way to model a multi-modal distribution. Let $(\vec{s}_k^{[m]}\ \vec{v}_k^{[m]})^T$ be the m-th particle in the k-th particle filter (corresponding to the k-th tracked object).

Throughout this chapter, $\vec{s}_i$ will refer to a feature's coordinates and $\vec{v}_i$ to its velocity. The prediction step for this particle assumes Brownian motion:

$$\begin{bmatrix} \vec{s}_k^{[m]} \\ \vec{v}_k^{[m]} \end{bmatrix} \leftarrow \begin{bmatrix} 1 & \delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{s}_k^{[m]} \\ \vec{v}_k^{[m]} \end{bmatrix} + \begin{bmatrix} 0 \\ \vec{\varepsilon} \end{bmatrix} \qquad (3.4)$$

where $\vec{\varepsilon}$ is a random vector modeling the random changes in vehicle velocity. In this work it was a two-dimensional uniform random vector with zero mean and a range of 30 pixels in each dimension, though a Gaussian random variable might arguably be a better choice.

The importance weights are set according to the motion extracted in the previous step. Specifically,

$$w^{[m]} = \sum_i p(c_i)\, \exp\{\gamma\} \qquad (3.5)$$

where

$$\gamma = -\frac{1}{2} \left( \begin{bmatrix} \vec{s}_k^{[m]} \\ \vec{v}_k^{[m]} \end{bmatrix} - \begin{bmatrix} \vec{s}_i \\ \vec{v}_i \end{bmatrix} \right)^{T} \Sigma_w^{-1} \left( \begin{bmatrix} \vec{s}_k^{[m]} \\ \vec{v}_k^{[m]} \end{bmatrix} - \begin{bmatrix} \vec{s}_i \\ \vec{v}_i \end{bmatrix} \right) \qquad (3.6)$$

and $(\vec{s}_i\ \vec{v}_i)^T$ are the motion tracks extracted as described in the previous section, and $p(c_i)$ are the corresponding expectations. The matrix $\Sigma_w$ is a diagonal matrix of size 4-by-4, with two variances for the noise in location and two for the noise in velocity. This matrix essentially convolves each track $(\vec{s}_i\ \vec{v}_i)^T$ with a Gaussian with covariance $\Sigma_w$.

New particle filters are started if, at the border of the camera field, a large number of features with high probability $p(c_i)$ exist that are not associated with any of the existing particle filters. This operation uses tiled mean-shift operators which begin by spanning the image plane, thereby detecting all large peaks of motion. (In this implementation, the kernel size of the mean shift operator was one twentieth of the frame width by one twentieth of the frame height.) It spawns new particle filters when no existing filters are within a specified distance (30 pixels in this implementation) of each peak in the image plane. Particle filters are discontinued when particle tracks leave the image or when the total sum of all importance weights drops below a user-defined threshold (40, in this case). To help distinguish slowly moving objects from the background, and to increase the disparity between ego-motion and object motion, a full calculation is performed only once every six frames; particle filter position information for the interleaved frames is interpolated.
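A compact sketch of the per-particle prediction and weighting steps of Eqs. (3.4)-(3.6), assuming a simple (M, 4) array layout of [position, velocity] per particle. The uniform velocity noise with a 30-pixel range follows the text; the diagonal entries of Sigma_w are placeholders, since their values are not given in the text, and all names are illustrative.

```python
import numpy as np

def predict(particles, dt=1.0, vel_noise_range=30.0):
    """Brownian-motion prediction of Eq. (3.4).

    particles: array of shape (M, 4) holding [sx, sy, vx, vy] per particle.
    """
    particles = particles.copy()
    particles[:, :2] += dt * particles[:, 2:]                # s <- s + dt * v
    particles[:, 2:] += np.random.uniform(-vel_noise_range / 2,
                                          vel_noise_range / 2,
                                          size=particles[:, 2:].shape)
    return particles

def importance_weights(particles, tracks, p_moving,
                       sigma_w=np.array([10.0, 10.0, 5.0, 5.0])):
    """Importance weights of Eqs. (3.5)-(3.6).

    tracks: (N, 4) feature states [sx, sy, vx, vy] from the EM step.
    p_moving: (N,) array of p(c_i).  sigma_w: assumed diagonal of Sigma_w.
    """
    w = np.zeros(len(particles))
    for j, particle in enumerate(particles):
        diff = tracks - particle                             # (N, 4) differences
        gamma = -0.5 * np.sum(diff ** 2 / sigma_w, axis=1)   # diagonal quadratic form
        w[j] = np.sum(p_moving * np.exp(gamma))              # Eq. (3.5)
    return w
```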



Figure 3.3: (a) Multiple particle filters, used for tracking multiple moving objects on the ground. Lighter particles have been more heavily weighted in the reward calculation, and are more likely to be selected during the resampling step. Shown here is an example of tracking three moving objects on the ground, a bicyclist and two pedestrians (the truck in the foreground is not moving). (b) The center of each particle filter in a different frame in the sequence clearly identifies all moving objects.

Figure 3.4: Two moving objects being tracked in video taken from a helicopter as part of a DARPA demo.

Fig. 3.3 shows the result of the particle filter tracking. Fig. 3.3a shows a situation in which three different particle filters have been spawned, each corresponding to a different object. Fig. 3.3b shows the center of each particle filter. In this example all three moving objects are correctly identified (the large truck in the foreground did not move in the image sequence). Fig. 3.4 shows a shot of tracking video taken from the Stanford helicopter during a demo for the Defense Advanced Research Projects Agency (DARPA). The two moving objects in the video have been correctly identified and tracked from overhead.

3.1.4 Learning the Activity-Based Ground Model

The final step of this approach involves the acquisition of the behavior model. For that, the map is anchored using features in the image plane that, with high likelihood, are not moving. In this way, the activity map refers to a projection of a patch of ground into the camera plane, even when that patch of ground is not presently observable by the camera. This ground plane projection remains static with respect to the ground and does not refer to relative locations in the camera image.

The activity map is then calculated by histogramming the various types of motion observed at different locations. More specifically, the approach learns a 4-dimensional histogram $h(x, y, v, \theta)$, indexed over x-y locations in the projection of the ground in the camera plane and the velocity of the objects observed at these locations, represented by a velocity magnitude $v$ and an orientation of object motion $\theta$ (30 bins are used for speed, 36 for orientation). Specifically, each time the k-th particle filter's state $[\vec{s}\,'\ \vec{v}\,']$, where $\vec{s}\,' = \frac{1}{M} \sum_m \vec{s}_k^{[m]}$ and $\vec{v}\,' = \frac{1}{M} \sum_m \vec{v}_k^{[m]}$ (with $M$ the number of particles), intersects with an x-y cell in the histogram, the counter $h(x, y, v, \theta)$ is incremented, where $x$ corresponds to the x-coordinate of $\vec{s}\,'$, $y$ to its y-coordinate, $v = \|\vec{v}\,'\|_2$ to the magnitude of the velocity vector, and $\theta$ to its orientation.

Fig. 3.5 shows the result of the learning step. Shown here is an activity map overlaid with one of the images acquired during tracking. Blue arrows correspond to the most likely motion modes for each grid cell in the projection of the ground in the camera plane; if no motion has been observed in a cell, no arrow is displayed. Further, the length of each arrow indicates the most likely orientation and velocity at each location, and its thickness corresponds to the frequency of motion. As this image illustrates, the activity models acquired by this approach are informative of the motions that occur on the ground.
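The histogram update itself is simple bookkeeping. The sketch below uses the 30 speed bins and 36 orientation bins described above; the grid dimensions, cell size, and maximum speed are arbitrary illustrative choices, not values from the dissertation.

```python
import numpy as np

GRID_X, GRID_Y = 64, 48          # ground-plane grid (illustrative size)
SPEED_BINS, ANGLE_BINS = 30, 36  # binning described in Section 3.1.4
MAX_SPEED = 60.0                 # pixels/frame mapped to the top speed bin (assumed)

activity = np.zeros((GRID_X, GRID_Y, SPEED_BINS, ANGLE_BINS))

def update_activity(s_mean, v_mean, cell_size=10.0):
    """Increment h(x, y, v, theta) for one particle filter's mean state."""
    x = int(np.clip(s_mean[0] // cell_size, 0, GRID_X - 1))
    y = int(np.clip(s_mean[1] // cell_size, 0, GRID_Y - 1))
    speed = np.linalg.norm(v_mean)
    theta = np.arctan2(v_mean[1], v_mean[0])                     # orientation in [-pi, pi]
    v_bin = int(np.clip(speed / MAX_SPEED * SPEED_BINS, 0, SPEED_BINS - 1))
    t_bin = int((theta + np.pi) / (2 * np.pi) * ANGLE_BINS) % ANGLE_BINS
    activity[x, y, v_bin, t_bin] += 1
```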

Figure 3.5: Example of a learned activity map of an area on campus, using data acquired from a camera platform undergoing unknown motion. The arrows indicate the most likely motion direction modes in each grid cell; their lengths correspond to the most likely velocity of that mode; and the thickness represents the probability of motion. This diagram clearly shows the main traffic flows around the circular object; it also shows the flow of pedestrians that moved through the scene during learning.

3.2 Applications

3.2.1 Using the Activity Model for Improved Tracking

To understand the utility of the activity model, it has been applied to improve the quality of the particle filter tracking.

Specifically, in the improved tracking algorithm the importance weights $w^{[m]}$ are modified to take into account how well a feature's motion matches the motion seen previously in that grid cell, according to the histogram $h$:

$$w^{[m]}_{\mathrm{improved}} = w^{[m]} + k \cdot p(v_k^{[m]}, \theta_k^{[m]} \mid x_k^{[m]}, y_k^{[m]}) \qquad (3.7)$$

The second term represents the probability of each particle's motion, given its location, times a constant scale factor $k$ ($k$ was set to 20 for the results discussed here). This second term was added to, and not multiplied by, the original weights so that no single effect, either the original importance weight or the histogram-based motion reward, dominates. In the rich literature of activity learning, this approach of using learned activity models to improve the accuracy of the motion tracker is unique.
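In code, Eq. (3.7) amounts to a small addition to the weighting routine sketched earlier. The lookup below reuses the binning constants from the histogram sketch above and treats a cell's relative bin frequency as p(v, theta | x, y); all names and the cell size are illustrative assumptions.

```python
import numpy as np

def improved_weight(w, particle, activity, k=20.0, cell_size=10.0):
    """Eq. (3.7): add a histogram-based motion reward to the original weight w."""
    x = int(np.clip(particle[0] // cell_size, 0, activity.shape[0] - 1))
    y = int(np.clip(particle[1] // cell_size, 0, activity.shape[1] - 1))
    speed = np.linalg.norm(particle[2:])
    theta = np.arctan2(particle[3], particle[2])
    v_bin = int(np.clip(speed / MAX_SPEED * SPEED_BINS, 0, SPEED_BINS - 1))
    t_bin = int((theta + np.pi) / (2 * np.pi) * ANGLE_BINS) % ANGLE_BINS
    cell = activity[x, y]
    # p(v, theta | x, y): relative frequency of this motion within the cell.
    p_motion = cell[v_bin, t_bin] / cell.sum() if cell.sum() > 0 else 0.0
    return w + k * p_motion
```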

3.2.2 Registration Based on Activity Models

Learned activity models can also be applied to the problem of global image registration of independent video sequences. It is often desirable to align, or mosaic, two video sequences based upon the content of the recorded scene. Traditional registration techniques estimate the transformation between the two images using feature-based [28] or correlation-based [29] measures derived from the two source images. To demonstrate the accuracy and flexibility of the learned activity models described here, as well as their ability to identify a scene uniquely, they are applied here to the problem of global image registration.

By encoding the major activity modes of each learned grid cell as pixel intensity values, learned activity models can be translated into conventional images. Traditional image registration techniques can then be applied to align the learned activity maps. In this manner, independent activity maps of the same terrain can be merged, and previously acquired activity maps can easily be updated with additional learning data. Furthermore, traditional template matching techniques applied to learned activity maps would enable autonomous systems to characterize and later identify locations on the ground based on the motions observed.

For example, an autonomous helicopter could distinguish a four-way stop from a traffic circle using template matching and, by image registration, orient itself based on the motions of the vehicles it observes.

As a natural extension, these same registration techniques applied to learned activity models can be used to align video sequences based on observed activity (but not the actual image pixel values). One can conceive of a variety of situations in which two aerial video sequences of the same physical terrain need to be aligned but traditional image-based registration is insufficient. Such terrains may lack sufficient image landmarks for traditional registration (e.g., a desert road or maritime shipping channel), or their structural characteristics may change over time (e.g., an urban combat zone in which buildings have been destroyed). Or, the two videos could have been taken at very different times of day. In situations like these, accurate activity-based models would allow video sequences to be aligned solely on the observed motion of objects in the scene.

Fig. 3.6 shows a single-frame alignment of two independent video sequences based only on the activity-based models acquired from each video. Registration has been performed between the image-encoded activity maps of the two sequences (the registration technique used was a brute-force search over possible rotations and translations, using a sum of squared differences criterion), and the resulting transformation has been applied to the individual frames shown. While the alignment is not perfect, surprisingly accurate results can be obtained solely from the observed motion data.
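As a rough sketch of the registration just described: each activity map is encoded as an intensity image and a brute-force search over rotations and translations minimizes a sum-of-squared-differences criterion. The particular intensity encoding (strength of the dominant motion mode per cell) and the search ranges are assumptions, and SciPy's ndimage routines are used here for the geometric transforms.

```python
import numpy as np
from scipy.ndimage import rotate, shift

def activity_to_image(activity):
    """Encode each cell's dominant motion mode as a pixel intensity (assumed encoding)."""
    counts = activity.reshape(activity.shape[0], activity.shape[1], -1)
    img = counts.max(axis=2)                     # strength of the dominant mode per cell
    return img / (img.max() + 1e-9)

def register(map_a, map_b, angles=range(0, 360, 5), shifts=range(-10, 11, 2)):
    """Brute-force SSD search over rotations and translations of map_b onto map_a."""
    img_a, img_b = activity_to_image(map_a), activity_to_image(map_b)
    best, best_params = np.inf, None
    for angle in angles:
        rotated = rotate(img_b, angle, reshape=False, order=1)
        for dx in shifts:
            for dy in shifts:
                moved = shift(rotated, (dy, dx), order=1)
                ssd = np.sum((img_a - moved) ** 2)
                if ssd < best:
                    best, best_params = ssd, (angle, dx, dy)
    return best_params                           # (rotation in degrees, dx, dy) in grid cells
```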

3.3 Results

3.3.1 Hypotheses

The work in this chapter was motivated by two main hypotheses.

Multi-object tracking: First, that the proposed pipeline of feature selection and tracking, ego-motion calculation and subtraction, and multiple particle filters would produce good multiple-object tracking performance in real-world video taken from a moving platform.

Activity model improvement: Second, that the use of the activity-based ground models would boost the performance of the original tracking algorithm.

Figure 3.6: The single-frame alignment of two independent video sequences based on the activity-based models acquired from each. This registration is performed without image pixel information. It uses only activity information from the learning grid.

3.3.2 Methods

Multi-object tracking: The performance of the approach for tracking moving objects was evaluated in terms of true positives, true negatives, false positives, and false negatives.

For the purposes of this chapter, these terms are defined as follows. A true positive occurs when a moving object is correctly identified as such by the algorithm in the current frame. A true negative occurs when there are no moving objects in the current frame, and the algorithm correctly identifies this fact. A false positive occurs when the approach identifies an object as moving in the current frame when it is not, and a false negative occurs when a specific object in the current frame is not identified as moving when, in fact, it is. Evaluation occurred on two video sequences, both shot with the Stanford Helicopter: one of a single moving object (a pedestrian walking), and one of multiple moving objects (cars, pedestrians, and a cyclist).

Activity model improvement: The improvement in performance resulting from the application of the activity-based ground models was measured as follows. On a 2100-frame test data sequence, tracking accuracy is defined as the number of correct tracks, minus the number of false positives, divided by the total number of moving objects. When a red indicator circle was located over a moving object it was counted as a correct track, while a red circle located over a stationary part of the scene was considered a false positive. No segmentation of the moving objects was performed once they had been identified; automatic segmentation of moving objects has been addressed previously in the literature, and the goal of this work is simply to recognize and track moving objects.

3.3.3 Findings

Multi-object tracking: The single and multi-object tracking results on these video sequences can be found in Table 3.1. Not surprisingly, the results clearly indicate that tracking multiple moving objects is more challenging than tracking single objects.

Activity model improvement: On a 2100-frame test data sequence, tracking accuracy (defined as the number of correct tracks, minus the number of false positives, divided by the total number of moving objects) was 0.85 without using the learning data, and 0.89 when using the learning data. This corresponds to roughly a 27% reduction in the number of incorrectly identified or missed moving objects (the error rate drops from 0.15 to 0.11, and 0.04/0.15 ≈ 27%).

Table 3.1: Single and Multi-Object Tracking Performance

                  Single Object   Multi-Object
True Positives         62             140
True Negatives        124              15
False Positives        16              57
False Negatives         5              78

Fig. 3.7 compares the tracking without (top panel) and with (bottom panel) the learned activity map. More specifically, the top diagram is the result of using the standard importance weights to update the particle filters, whereas the bottom diagram uses the learned activity map for tracking, on independent testing data. As is easily seen, the track in the bottom diagram is more complete than the one in the top diagram, illustrating one of the benefits of the learned activity model. In the top diagram, the cyclist would have a different track ID when the algorithm started tracking the second time, making tasks such as statistics gathering (number of moving objects seen at different times of day, etc.) more error-prone.

3.3.4 Additional Results

To illustrate all the intermediate stages of the tracking process in another environment, the following figures show processed frames from a video of a jogger, taken beside the Gates building on the Stanford University campus using a robotic helicopter platform. Fig. 3.8 shows boxes indicating the location of the features chosen for the optical flow calculation. Fig. 3.9 shows only those features which have been determined to be moving after correcting for egomotion (in this case, all lie on the jogger). Fig. 3.10 shows the particles corresponding to a particle filter tracking the jogger. Particles which have received a large weight are light colored; particles receiving a lower weight (less than 0.01) are a darker color.

In order to quantify the effect of using more training data while building the learned activity model on tracking accuracy, the number of objects appearing in the training data (a more useful metric than simply the number of frames) was increased while recording the tracking accuracy on a separate test video sequence. The results are shown in Fig. 3.11.

(a) Tracking without learned activity map

(b) Tracking with learned activity map

Figure 3.7: Example of two tracks (a) without and (b) with the learned activity map. The track in (a) is incomplete and misses the moving object for a number of time steps. The activity map enables the tracker to track the top object more reliably.

Figure 3.8: Selected features in a frame from the jogging video.

3.3.5 Conclusions

This chapter presents a system for learning activity models of outdoor terrain from a moving aerial camera. The approach acquires such models from a camera that is undergoing unknown motion. To identify moving objects on the ground, this approach combines image-based feature tracking with an EM approach for estimating the image transformation caused by the camera's ego-motion. This identifies features whose motion is counter to the flow induced by the estimated ego-motion. Next, multiple particle filters are employed to identify and track moving objects. The object motion is then cached in a histogram that learns the probability distribution of different motions at different places in the world. Applications of the learned activity model include improved tracking and global registration of two different models based on the activity patterns. This approach runs at 20 Hz on a 2.4 GHz PC.

Figure 3.9: Features classified as moving in a frame of the jogging video.

While most elements of this approach are well known in the literature, it defines the state of the art in finding moving objects on the ground from a helicopter platform. Further, the use of learned activity models for tracking and registration is unique. The system has been found to be robust in tracking moving objects and learning useful activity models of ground-based motion, and these models have proven to be applicable to problems of general interest.

Figure 3.10: Particles corresponding to a particle filter tracking the jogger. Lighter colored particles have been heavily weighted in this step; darker particles have received a lower weight.

3.4 Related Work

In recent decades, the problem of acquiring accurate ground models, or maps, has become the focus of a number of different research communities. Photogrammetry investigates the acquisition of models from remote imaging sensors flown on high-altitude aircraft or satellites [30]. Many roboticists concern themselves with the acquisition of maps from the ground, using mobile robots operated indoors [31], outdoors [32], underwater [33], or in the subterranean world [34].

The vast majority of these techniques, however, address the acquisition of static models. Moving entities, such as cars, bicyclists, and pedestrians, are usually considered irrelevant to the mapping problem.

Figure 3.11: Dependence of tracking accuracy on number of moving objects in training data.

The acquisition of activity-related models, however, has received some attention. For example, Makris and Ellis use video from surveillance cameras to develop an activity-based model of entry points, exit points, paths, and junctions within a scene [35]. Unlike the work described here, their approach assumes a static sensor platform, which greatly facilitates the detection and tracking of moving entities. Related work by Wang et al. uses an unsupervised learning algorithm to learn semantic scene models via trajectory analysis [36]. Although this work also uses a static sensor platform, the focus on the importance of constructing a priori distributions to describe different types of behavior occurring in different areas (and its usefulness for identifying unusual behaviors) is similar to the work described in this thesis. Two major differences between their work and this work, besides the moving platform used to acquire the model proposed here, are their use of sources, sinks, and paths to describe motion (vs. the more general histogram suggested here), and their explicit classification of pedestrian and vehicular movement as part of the algorithm.

Stauffer and Grimson also use a static sensor forest to track motion, learn patterns of activity at a site, and classify the observed activities [37]. This approach also allows them to identify abnormal behaviors in the scene. For them, the static sensor forest is a key part of the implementation, since it allows the use of background pixel statistics. More recently, Ermis et al., experimenting with a static sensor forest with partial field-of-view overlap, found that activity features based on the motion of objects in the image plane outperformed SIFT features for tracking objects between cameras [38]. Using visual words to assemble video documents that in turn are fed to a Latent Semantic Analysis algorithm, Varadarajan and Odobez are able to build a model of activity patterns which, like the work of Stauffer and Grimson, recognizes abnormal behaviors in video of automotive traffic [39]. Kembhavi et al. take a different approach, representing an activity model using a Markov Logic Network based on human knowledge, and tackling the problem of interpreting such a 3D model from 2D video [40]. The work of Zhang et al. leverages a static sensor forest that provides multiple viewpoints to learn motion patterns for vehicles and pedestrians once a separate co-trained classifier, using the multiple viewpoints, has classified moving objects as either pedestrians or vehicles; a graph is then developed which clusters the observed motion patterns [41]. Opting instead to use dense optical flow to aggregate pixel-wise motion statistics, Yang et al. utilize a method for describing video clips using a bag-of-words model to achieve high-quality motion pattern detection [42]. Finally, using a very different sensor suite (RFID tags), the work of Wilson and Atkeson also exploits the utility of allowing learned activity models (in this case the movement of monitored patients) to allow the Bayesian filter to recover from ambiguities and improve tracking [43].

Tracking objects from a moving platform has received attention from Burt et al. [44]. Their approach uses foveation as part of a dynamic motion analysis technique. More recent work has harnessed a number of novel techniques to improve the performance of multiple object tracking from moving platforms. Lin and Wolf use Markov Chain Monte Carlo-based sampling to make particle filter tracking robust to sudden position changes resulting from ego-motion [45], while also proposing a new ego-motion estimation method which leverages a genetic algorithm [46].

McIntyre et al. propose an efficient method for tracking moving objects from a potentially moving platform by modeling the potential changes in the objects' appearance in the image space [47]. Ess et al. add a stereo rig and utilize a graphical model in an attempt to deal with the complexity of the multi-object tracking problem [48]. Xiao et al. at Sarnoff Corporation have adopted an approach similar in spirit to the one proposed in this chapter for ego-motion estimation. They use dense optical flow to estimate the dominant motion between video frames, and then outlier regions are segmented out and removed from the calculation [49].

Although the tracking pipeline proposed here is novel, its component parts have been used by other researchers for similar goals. Probabilistic filtering methods (Kalman and particle filters) have been applied previously to tracking moving objects [50, 51]. The use of EM to identify objects on the ground that are moving is similar to the approach taken by Jung and Sukhatme [50]. Finally, the use of multiple particle filters which are spawned, merged, and killed to track moving objects has been proposed by Vermaak, Doucet, and Pérez [52].

Chapter 4

Road Following

The work described in this chapter gives rise to a novel method for identifying traversable areas in a loosely-structured environment that makes use of a combination of optical flow techniques and dynamic programming. Autonomous mobile robot navigation on ill-structured roads presents unique challenges for machine perception. A successful terrain or roadway classifier must be able to learn in a self-supervised manner and adapt to inter- and intra-run changes in the local environment. The approach discussed here was originally developed for long-range sensing in the DARPA Grand Challenge. In this application, the methods described in Chapter 2 are used to produce a stored reverse optical flow history grid for the 200 video frames taken before the current frame. For a representative 7000-frame video sequence, an average of 20% of the grid cells below the horizon are empty, often as a result of a lack of texture in the image. The reverse optical flow information is used to assemble a set of templates based on the appearance of the patch of roadway currently in front of the robot. These templates are used to find the location of the road at different distances from the robot in the current frame.
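The flow caching and traceback machinery of Chapter 2 can be summarized, under stated assumptions, with the minimal sketch below: per-frame mean flow vectors are cached in a fixed-length buffer of grid cells, and a pixel is traced backwards by repeatedly subtracting the cached flow of the cell it falls in. The class name, grid granularity, and the treatment of empty cells (zero flow here) are illustrative assumptions, not the exact implementation.

```python
from collections import deque
import numpy as np

class FlowHistory:
    """Minimal sketch of a reverse optical flow history grid (assumed interface).

    Each frame, the mean optical flow vector observed in every grid cell is
    cached.  A point in the current frame can then be traced back through the
    last `depth` frames by stepping it backwards along the cached per-cell flow.
    Cells with no tracked features are assumed here to contribute zero flow.
    """

    def __init__(self, grid_rows, grid_cols, image_h, image_w, depth=200):
        self.grid_rows, self.grid_cols = grid_rows, grid_cols
        self.cell_h = image_h / grid_rows
        self.cell_w = image_w / grid_cols
        self.history = deque(maxlen=depth)   # the oldest frame is dropped automatically

    def push(self, mean_flow_grid):
        """mean_flow_grid: (grid_rows, grid_cols, 2) array of mean (dx, dy) per cell."""
        self.history.append(np.asarray(mean_flow_grid, dtype=float))

    def trace_back(self, x, y, n_frames):
        """Return the estimated (x, y) location of the point n_frames in the past."""
        n = min(n_frames, len(self.history))
        frames = list(self.history)[len(self.history) - n:]
        for grid in reversed(frames):        # walk from the newest frame to the oldest
            col = int(np.clip(x // self.cell_w, 0, self.grid_cols - 1))
            row = int(np.clip(y // self.cell_h, 0, self.grid_rows - 1))
            dx, dy = grid[row, col]
            x, y = x - dx, y - dy            # step backwards along the cached flow
        return x, y
```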


Figure 4.1: Adaptive road following algorithm

4.1 Adaptive Road Following

The approach discussed here was described in an earlier form in [2] and is composed of the following steps: reverse optical flow, horizontal 1D template matching, and dynamic programming. The algorithmic flow is depicted in Fig. 4.1.

This approach is designed to deal with ill-structured desert roads where traditional highway road following cues such as lane markings and sharp image gradients associated with the shoulder of the road are absent. Regions off the roadway will often be similar enough in texture and color to the roadway itself that traditional segmentation techniques are unable to differentiate between them. This approach requires that the vehicle is currently traveling on the road and then tracks regions similar to the area directly in front of the vehicle.



Figure 4.2: (a) Dark line shows the definition region used in the proposed algorithm. (b)-(d) White lines show the locations in previous frames to which optical flow has traced the definition region.

This region, typically twenty pixels high, which will be called the definition region, is shown in Fig. 4.2a. If the vehicle is currently on the road, the pixels directly in front of the vehicle in the image are representative of roadway. Since the appearance of the definition region at different times in the past is important for the template matching procedure described below, the requirement for the correct functioning of the algorithm is that the vehicle has been traveling and recording video for at least 7 seconds (not necessarily on the road) and is currently on the road.

Template Matching: To determine the location of the road at different heights in the current image, a set of horizontal templates is collected that reflects the best guess about the appearance of the roadway at the current time. These templates are formed by taking the location of the definition region in the current frame. Optical flow is used to find the location of this region in images increasingly far in the past.

Figure 4.3: Output of the horizon detector is the dark line.

This approach is effective because of the difference in appearance of objects as a function of their distance from the robot discussed earlier in this thesis. Figs. 4.2(b-d) show the locations of the templates at different distances from the robot as a result of using reverse optical flow to find the location of the definition region in past frames. By using the set of templates collected for the current definition region, the most likely location of the road at various heights in the image can be estimated with a horizontal template matching algorithm. The vertical search height for a given template in the current image is determined by multiplying the vertical location of the template in the original image by a scale factor. The scale factor is the difference between the height of the horizon in the current frame and its height in the original. The location of the horizon in each frame is determined using a horizon detector based on the work of Ettinger et al. [53]. A 2-D search space parametrized by the height and angle of the horizon in the image is searched using a multi-resolution approach to minimize a criterion. This criterion is the sum of the variances in the blue channel of the pixels labeled as sky and those labeled as ground. An example of the detected horizon is shown in Fig. 4.3. This scaling was necessary to mitigate the effect of changes in the pitch of the vehicle on the performance of the algorithm.
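A hedged sketch of the horizon criterion described above is given below: for a candidate horizon parameterized by its height and angle, the score is the sum of the blue-channel variances of the pixels labeled sky and ground. The coarse-to-fine search over the two parameters is omitted, and the function name and BGR channel order are assumptions rather than details of the implementation in [53].

```python
import numpy as np

def horizon_criterion(image_bgr, height, angle_rad):
    """Sum of blue-channel variances above (sky) and below (ground) a candidate
    horizon line parameterized by its height at the image center and its angle.
    Lower is better; a multi-resolution search over (height, angle) would
    minimize this score.  (Illustrative sketch, not the exact implementation.)"""
    h, w = image_bgr.shape[:2]
    blue = image_bgr[:, :, 0].astype(float)            # BGR channel order assumed
    cols = np.arange(w) - w / 2.0
    line_y = height + np.tan(angle_rad) * cols          # horizon row for each column
    rows = np.arange(h)[:, None]
    sky_mask = rows < line_y[None, :]
    sky, ground = blue[sky_mask], blue[~sky_mask]
    if sky.size == 0 or ground.size == 0:
        return np.inf
    return sky.var() + ground.var()
```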

It is worth noting that while this approach was developed using a video stream without vehicle telemetry, the addition of accurate pitch information would obviate the need for a horizon detector.

Both the templates and the search space are horizontal slices of the image. Templates taken from curved roads therefore appear similar to those taken from straight roads. However, templates taken from curved portions of roadway will be artificially wide. The same effect occurs if the vehicle is undergoing a moderate amount of roll. The template matching measure combined with the dynamic programming approach described below mitigates these problems. Normalized sum of squared differences (SSD) is used as the template matching method. This computes the strength of the template match along each horizontal search line. The normalized SSD measure is defined as follows (where I is the image, T is the template, x', y' range over the template, and x, y range over the image):

R(x, y) = \frac{\sum_{x'} \sum_{y'} \left[ T(x', y') - I(x + x', y + y') \right]^2}{\left[ \sum_{x'} \sum_{y'} T(x', y')^2 \cdot \sum_{x'} \sum_{y'} I(x + x', y + y')^2 \right]^{0.5}} \qquad (4.1)
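A small sketch of Eq. (4.1), evaluated for one template along a single horizontal search line, is shown below; lower values of R indicate better matches, and the function name and array conventions are assumptions. OpenCV's matchTemplate with the TM_SQDIFF_NORMED flag computes the same normalized measure far more efficiently.

```python
import numpy as np

def normalized_ssd_row(image, template, row):
    """Evaluate Eq. (4.1) for one template along a single horizontal search line.

    `image` and `template` are 2-D float arrays; `row` is the top row of the
    horizontal band being searched.  Returns R(x, row) for every horizontal
    offset x.  (Illustrative sketch only.)"""
    th, tw = template.shape
    band = image[row:row + th, :].astype(float)
    t = template.astype(float)
    t_energy = np.sum(t ** 2)
    n_positions = band.shape[1] - tw + 1
    scores = np.empty(n_positions)
    for x in range(n_positions):
        window = band[:, x:x + tw]
        num = np.sum((t - window) ** 2)                      # numerator of Eq. (4.1)
        den = np.sqrt(t_energy * np.sum(window ** 2)) + 1e-12
        scores[x] = num / den
    return scores
```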

Because the search space for each template is only a single horizontal line and the height of the template is typically 20 pixels, this matching measure can be computed very quickly. Fig. 4.4b shows the output of the matching measure for a set of 10 template search lines in the image shown in Fig. 4.4a. The responses in the SSD image have been widened vertically for display purposes, though they appear in the image at the height at which each individual template was checked. White regions indicate stronger matches, while dark regions indicate weaker matches. Strong responses can also be seen in the upper right portions of the scene, where the lack of both vegetation and shadows combine to make template matches to the right of the roadway attractive.

Dynamic Programming: Fig. 4.5 depicts the location of the maximum SSD response along each horizontal search line with dark circles. Sometimes the location of maximum response does not lie on the roadway due to similarities in visual characteristics between the roadway and areas off the road, and because of illumination differences between the search areas and the templates.



Figure 4.4: (a) Input video frame (b) Visualization of SSD matching response for 10 horizontal templates for this frame.

The need to find the globally optimal set of estimated road positions while satisfying the constraint that some configurations are physically impossible for actual roads suggests the use of dynamic programming. The purpose of dynamic programming is then to calculate the estimated position of the road at each search line in the image so that when the positions are taken together they minimize some global cost function. The cost function chosen in this case is the SSD response for each horizontal search line, summed over all search lines. The search lines are processed from the topmost downward, with the cost at each horizontal position computed as the SSD cost at that location plus the minimum cost within a window around the current horizontal position in the search line above.

Figure 4.5: Dark circles represent locations of maximum SSD response along each horizontal search line. Light circles are the output of the dynamic programming routine. The gray region is the final output of the algorithm and is calculated using the dynamic programming output. The width of this region is linearly interpolated from the horizontal template widths. Uneven vertical spacing between circles is the result of changing vehicle speeds over time.

The horizontal position of this minimum cost is also stored as a link. Once the bottommost search line has been processed in this way, the globally optimal solution is found by following the path of stored links, each of which points to the minimum cost position in the search line above. The path traversed represents the center of the estimated position of the road. Given the assumption that the vehicle is currently on the roadway, the finite window restriction serves to enforce a constraint on the maximum expected curvature of the road in the image plane, and reduces the computation time of the optimization. To compute a reasonable window size for a given road heading, the equation below can be used.

W \geq \frac{(\min \Delta_H) \cdot \Delta x}{\Delta y} \qquad (4.2)

Imagine the centerline of the roadway as a two-dimensional curve in the image plane. At the height in the image where the road is undergoing the most curvature, ∆x will be the largest for a given ∆y. The dynamic programming window size W used in the technique just described must be greater than or equal to the minimum vertical separation between template match regions (min ∆H) multiplied by the maximum expected value of ∆x/∆y. For the results discussed in this chapter, a dynamic programming window size of 5 pixels was used.

The output of the dynamic programming module is depicted by the light circles in Fig. 4.5. The gray region in Fig. 4.5 indicates road segmentation. The location of this region was determined by the dynamic programming output, while the width of the segmented region was linearly interpolated from the widths of the horizontal templates.
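The dynamic programming pass just described can be sketched as follows, under the assumption that the per-line match responses have already been computed and are expressed as costs to be minimized; the function name and data layout are illustrative.

```python
import numpy as np

def dp_road_centerline(ssd_rows, window=5):
    """Pick one horizontal position per search line so the summed cost is
    globally minimal, subject to a maximum horizontal jump of `window` pixels
    between adjacent lines (illustrative sketch of the routine described above).

    ssd_rows: list of 1-D arrays, topmost search line first; lower cost = better.
    Returns one index per search line, recovered by following the stored links
    upwards from the bottommost line.
    """
    costs = [np.asarray(ssd_rows[0], dtype=float)]
    links = []
    for row in ssd_rows[1:]:                      # process lines from the top downward
        row = np.asarray(row, dtype=float)
        prev = costs[-1]
        cost = np.empty_like(row)
        link = np.empty(len(row), dtype=int)
        for x in range(len(row)):
            lo, hi = max(0, x - window), min(len(prev), x + window + 1)
            j = lo + int(np.argmin(prev[lo:hi]))  # cheapest predecessor within the window
            cost[x] = row[x] + prev[j]
            link[x] = j
        costs.append(cost)
        links.append(link)
    # Trace the stored links from the bottommost search line back to the top.
    path = [int(np.argmin(costs[-1]))]
    for link in reversed(links):
        path.append(int(link[path[-1]]))
    return path[::-1]                             # topmost line first
```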

4.2 Results

4.2.1 Hypotheses

The hypothesis driving the work described in this chapter was that the combination of reverse optical flow, 1D template matching, and dynamic programming would produce road classification results in ill-structured environments that were superior to a competitive, off-the-shelf, Markov Random Field (MRF) road classification algorithm that made no prior assumptions about the visual appearance of the roadway being followed.

4.2.2 Methods

Single frame results taken from three different 720x480 pixel video sequences shot in the Mojave Desert are shown in Fig. 4.6. Each column of the figure contains images from a different test video sequence.

Figure 4.6: Single frame algorithm output for three Mojave Desert data sets. Each column contains results from one of the three video sequences.

The first video sequence, taken in direct sun, contains footage from a straight dirt road where scrub brush lines the sides of the road. The second sequence comes from a straight road with less vegetation along the sides. Taken late in the afternoon, it has long shadows stretching across the road. The third sequence is from a trip through terrain with changes in elevation and gravel coloration. Between the three sequences there are more than 12 minutes of video.

The details of the implementation of the algorithm discussed above are as follows. The road position estimates are the result of 1D template matching of a set of 10 horizontal templates found using the optical flow procedure. These templates are samples, at different points in time, of the visual characteristics of the roadway area currently in front of the robot. These templates were taken from the past, ranging from 1 frame to 200 frames prior to the current frame. The spacing of the temporal samples was chosen to provide an even vertical spacing in the image plane.

The templates were 20 pixels high. The definition region and templates were refreshed every 10 frames in order to adapt to gradual changes in the appearance of the roadway. A total of 3000 feature correspondences were used to calculate the optical flow fields. The mean flow vectors were stored in a grid of 96 square cells covering the entire image plane.

To quantify the overall performance of the algorithm in this domain, the results of running it on the three 7200-frame data sets described above were evaluated using the two performance metrics described below. The data sets were taken prior to the development of this algorithm. They do not reflect the algorithm exerting any control over the path of the vehicle. For comparison purposes, each of these data sets has also been run through an image segmentation program which uses an MRF with the Metropolis algorithm for classification. The software was written by Kato et al. [54, 55, 56]. This software is publicly available at Zoltan Kato's website, but it has been modified for the purposes of this experiment to take fixed training regions and upgraded to use full color information and covariance matrices. The training regions have been permanently set for every test frame to be a rectangular region in front of the vehicle, which corresponds to the definition region used for the algorithm, and a square region to the left of where the road should appear in the image. These regions were chosen because they provide the MRF with a good example of what the road looks like versus the rest of the scene. The Metropolis algorithm works by iteratively refining a labeling assignment over pixels according to a Single and Double potential to minimize an energy function. The Double potential component of the energy function is minimized when neighboring pixels have the same labeling. The Single potential is minimized when a pixel is labeled as belonging to the most similar class. This procedure produces an image classification which will be compared to the results generated by the optical flow technique using the following two metrics: a pixel coverage metric and a line coverage metric.

Pixel Coverage Metric: The first metric compares pixel overlap between the algorithm output and ground truth images in which the road has been segmented by a human operator, as shown in Fig. 4.7.

Figure 4.7: Typical human-labeled ground-truth image.

The number of pixels in the frame that have been incorrectly labeled as roadway is subtracted from the number of correctly labeled roadway pixels. This number is then divided by the total number of pixels labeled as road by the human operator for that frame. Using the metric proposed here, a score of 1.0 would correspond to correct identification of all the road pixels as lying in the roadway (while not labeling any pixels outside the roadway as road pixels). A score of 0.0 would occur when the number of actual road pixels labeled as roadway is equal to the number of non-roadway pixels incorrectly identified as being in the road. If there were more incorrect than correct roadway pixel labels, the score would be negative. This measure is computed once per second and averaged over the entire video sequence. While this pixel coverage metric is easily visualized and simple to compute, it must be recognized that, due to perspective effects, it is strongly weighted towards regions close to the vehicle.

Line Coverage Metric: The second metric mitigates the distance-related bias of the first metric by comparing pixel overlap separately along a set of horizontal lines in the images. Five evenly spaced horizontal lines are chosen, ranging in vertical position between the road vanishing point and the vehicle hood in the ground-truth image. Success scores are calculated just as in the first metric, except they are reported individually for each of the five lines. The metric returns five sets of success scores computed once per second and averaged over the entire video sequence.
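A minimal sketch of the two metrics, assuming binary road masks for the algorithm output and the human-labeled ground truth, is given below; the function names and mask conventions are illustrative.

```python
import numpy as np

def pixel_coverage_score(pred_mask, truth_mask):
    """Pixel coverage metric: (correct road pixels - incorrectly labeled road
    pixels) / total human-labeled road pixels.  Masks are boolean arrays with
    True = road.  (Illustrative sketch of the metric defined above.)"""
    correct = np.logical_and(pred_mask, truth_mask).sum()
    incorrect = np.logical_and(pred_mask, ~truth_mask).sum()
    total_truth = truth_mask.sum()
    return (correct - incorrect) / total_truth if total_truth else 0.0

def line_coverage_scores(pred_mask, truth_mask, rows):
    """Line coverage metric: the same score evaluated separately on each of a
    set of horizontal evaluation rows (e.g. five rows spaced between the
    vanishing point and the vehicle hood)."""
    return [pixel_coverage_score(pred_mask[r:r + 1, :], truth_mask[r:r + 1, :])
            for r in rows]
```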

Figure 4.8: Pixel coverage results on the three test video sequences

4.2.3 Findings

Fig. 4.8 shows the performance of the algorithm proposed in this chapter on the three different video sequences, evaluated using the pixel coverage metric. These results are compared to the output from the Markov Random Field segmentation program. The MRF classifier assumes that the region directly in front of the vehicle is on the roadway, but it does not use optical flow methods.

It is worth noting again here that this approach performs road following only when the vehicle is currently on the roadway. The scores for both the proposed approach and the MRF-based classification reflect only the ability to follow the road, not to find it. If this were informing a naive planner with no inertia, the results would be sobering. Over the 7200-frame test data, the mean time to failure, where failure is defined as a single frame where the calculated position of the roadway does not overlap the roadway at all, at any height in the image, is 18 seconds (540 frames).

The MRF classifier outperforms the optical flow technique slightly on the first video sequence. In this video, the areas off the roadway are covered with sage brush which differs from the roadway in both color and texture. A representative frame of output from both the MRF classifier and the optical flow technique, and the corresponding input frame, are shown in Fig. 4.9.



Figure 4.9: (a) Input frame from the video characterized by straight dirt road with scrub brush lining the sides (b) Optical flow technique output (c) MRF classifier output

In the second video sequence, the long shadows and sparser vegetation combine to make areas off the road visually more similar to the roadway. Consequently, the performance of the MRF classifier was adversely affected while the optical flow technique was unaffected. Fig. 4.10 shows the test output on a representative frame from this video. An interesting limitation of the approach is its vulnerability to intermittent shadows. If a template taken from the past comes from a region in shadow, and is being matched against a region in the current image which is not in shadow, the resulting SSD response will be biased towards darker regions. This effect is shown in the SSD response and corresponding input frame in Fig. 4.11. The dynamic programming step alleviates the severity of this effect.



Figure 4.10: (a) Input frame from the video with long shadows and sparse vegetation (b) Optical flow technique output (c) MRF classifier output

In the third video sequence, the almost complete absence of vegetation makes the areas off the road visually very similar to the roadway. The performance of the MRF classifier suffers as a result, while the strong vertical texture components in the roadway improve the results of the template matching done using the optical flow technique. This can be seen in the test output in Fig. 4.12. Fig. 4.13 shows the performance of the algorithm on the same three data sets, now evaluated using the line coverage metric. Scores are graphed for a set of five evaluation lines increasingly distant from the vehicle. The performance of the algorithm generally declines as the distance from the vehicle increases. The proposed algorithm achieves very low false positive rates by making no assumptions about the general appearance of the road and classifying only regions that adhere to its learned roadway information.

Videos of the three 7200-frame test sets, showing the results of tracking with the proposed algorithm as well as with the MRF classifier, are available at http://cs.stanford.edu/group/lagr/road following/. The algorithm runs at 3 Hz on a 3.2 GHz PC at 720x480 pixel resolution.


Figure 4.11: (a) Input frame (b) SSD response

4.2.4 Additional Results

To illustrate the performance of the reverse optical flow calculation on a winding desert road, Fig. 4.14 shows the computed sparse optical flow field for a single frame, while Fig. 4.15 shows the calculated location of the definition region in a series of frames going back to a point 200 frames in the past. The importance of the dynamic programming step can be visualized in Fig. 4.16 and Fig. 4.17. In the first, the absolute template match positions are used to calculate the center of the roadway at different heights in the image, while in the second, the output of the dynamic programming step is used.

In addition to comparing the output of the road following algorithm with an off-the-shelf MRF classifier, as noted above, its performance was also evaluated in reference to naive color (HSV space clustering, wherein a pixel was considered a good match if its Hue was within 10 and its Saturation and Value within 20 of the mean values for the template) and texture (a 20x20 pixel template, evaluated using a Sum of Squared Differences criterion) road classification algorithms. These algorithms were fed the same definition region pixels as were used by the road following algorithm described in this chapter.
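A hedged sketch of the naive color baseline is shown below; it assumes OpenCV's 8-bit HSV conventions (Hue in [0, 180), Saturation and Value in [0, 255]), and the function name and definition-region format are illustrative rather than taken from the original implementation.

```python
import cv2
import numpy as np

def naive_hsv_road_mask(frame_bgr, definition_region):
    """Naive color baseline: a pixel matches if its Hue is within 10 and its
    Saturation and Value within 20 of the mean HSV of the definition region.
    (Illustrative sketch; the HSV scale used in the original is assumed.)"""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV).astype(float)
    x, y, w, h = definition_region                  # region directly in front of the vehicle
    mean_h, mean_s, mean_v = hsv[y:y + h, x:x + w].reshape(-1, 3).mean(axis=0)
    dh = np.abs(hsv[:, :, 0] - mean_h)
    dh = np.minimum(dh, 180 - dh)                   # hue wraps around
    return ((dh <= 10)
            & (np.abs(hsv[:, :, 1] - mean_s) <= 20)
            & (np.abs(hsv[:, :, 2] - mean_v) <= 20))
```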



Figure 4.12: (a) Input frame from video characterized by changing elevations and gravel colors (b) Optical flow technique output (c) MRF classifier output

Fig. 4.18 shows the output of the color-based algorithm, while Fig. 4.19 shows the output of the texture-based algorithm. The failures of the two algorithms are instructive, given that it is precisely the similarity of texture and color between on-roadway and off-roadway image patches that makes the road following problem difficult. Fig. 4.20 and Fig. 4.21 show the results of comparing these approaches to the reverse optical flow-based road following algorithm using the same test data and the pixel and line coverage metrics described above.

4.2.5 Conclusions

The approach described in this chapter pairs a novel use of optical flow, which harnesses the visual differences between objects observed at different distances, with a dynamic programming approach to produce a road following algorithm with good performance.

Figure 4.13: Line coverage results are shown at different distances from the front of the vehicle towards the horizon for the three video sequences.

Figure 4.14: Sample optical flow field in a frame from the winding desert video.

In this approach, reverse optical flow was used to track a definition region in front of the vehicle at the current time back to its location in previous frames at different points in the past. The appearance of a horizontal template, taken from the discovered position in the past frames, was used to find likely positions of the roadway at different heights in the current image.

Two separate methods for determining the correctness of the road following algorithm were proposed: one based on a total pixel coverage metric, and one based on pixel coverage along a number of lines at different distances from the vehicle, which somewhat ameliorates the bias towards getting the area closest to the vehicle right. The results reported in the section above indicate that the combination of reverse optical flow, 1D template matching, and dynamic programming did produce road classification results in ill-structured environments that were superior to a competitive, off-the-shelf, MRF-based road classification algorithm. Note that although the MRF-based approach is more general, it also requires an initial example of pixels corresponding to the roadway and of pixels off the roadway.

4.3 Related Work

The general problem of road following in loosely-structured environments has received a great deal of attention from roboticists in the last few years.

Some approaches use laser range finders in concert with computer vision, as in the work of Manz et al., where a LIDAR occupancy grid is used along with a color-based visual feature derived from the Hue component of the HSI color space to evaluate different proposed positions of the roadway [57]. Ordonez et al. use lasers in concert with an Extended Kalman Filter to find ruts caused by earlier traversals of a path and determine the vehicle's orientation with respect to these ruts [58]. Other approaches use stereo vision as an additional input to the algorithm. In the method proposed by Guo et al., a planar road surface is assumed, and splines which represent the roadway boundaries are estimated using stereo data and a RANSAC algorithm [59]. Hadsell et al. suggest a more generalized approach, also applicable to offroad navigation and tested on the DARPA Learning Applied to Ground Robotics platform, where stereo vision is used to train a deep hierarchical network which in turn can classify monocular video at distances out to the horizon [60].

Approaches that use only monocular video fall into several different classes as well. The work of Wang et al. uses hybrid features comprised of color, texture, and edge components, and an SVM (Support Vector Machine)-based classifier [61]. The examples included in this experiment, however, featured a large color and texture disparity between on-road and off-road regions. Kong et al., meanwhile, focus on finding the vanishing point of the road in a single video frame using Gabor filters to compute the dominant texture orientations at each pixel, followed by segmentation of the image into road and non-road areas based on dominant edge detection constrained by the location of the vanishing point [62, 63]. Wu and ShuFeng employ the Otsu multi-threshold algorithm (which attempts to minimize the mean squared error between a segmented binary image and the original image) to segment a single image, and then use Canny edge detection to determine the location of the unstructured road [64]. Finally, Yanqing et al. propose using the Otsu algorithm and Canny edge detection as described in the Wu paper, but include a Monte Carlo sampling step to associate the position of the road in different video frames [65]. Interestingly, this work assumed that roads had a very small curvature, and that their boundaries could be represented as straight lines, something the work presented in this chapter does not assume.

The ambitious work of Cheng et al., which proposes a lane detection system capable of handling structured and unstructured roads, uses mean-shift segmentation to divide the image into different regions after a hierarchical system has decided, based on the absence of lane marking-colored pixels, that the vehicle is in an unstructured environment, and then selects road boundaries using Bayes rule [66].

In the same spirit as the work described in this chapter, optical flow techniques have been widely used in mobile robot navigation. Image flow divergence has been used to orient a robot within indoor hallways by estimating time to collision [67]. Differences in optical flow magnitude have been used to classify objects moving at different speeds to simplify road navigation in traffic [68].

While the combination of techniques proposed here is novel, many of the components have appeared before in the literature on this and related topics. Similar template matching techniques have been used to determine lateral offset in previous work for a lane departure warning system [69].1 Dynamic programming has already been used for road detection, both aerial [70] and ground-based [71]. Some implementations use custom, parallelized hardware [72]. Monocular vision-based techniques have been applied with success to road-following in ill-structured environments [73]. In particular, the work of Redmill et al. is similar to the approach proposed here both in its use of dynamic programming for picking an optimal set of centerpoint locations for the roadway being followed and in its a priori assumptions about various characteristics of the curvature of the roadway [74]. In fact, the Redmill et al. paper proposes the use of a Kalman filter for prediction of the roadway's position in the next video frame, a step the work discussed in this thesis lacks. Their work, however, is proposed for following paved, marked roadways, and does not exploit optical flow techniques to improve the quality of the initial guesses for roadway position.

One of the first uses of self-supervised learning was road-following in unstructured environments [75]. Crisman's work, though it relies on the roadway having different visual characteristics from the surrounding areas, is very useful since it does not require that the vehicle currently be on the road, and so it can be used as a bootstrapping method for a more robust road follower.

1 The requirement that the template be compared to an image where the vehicle is known to have been centered in the lane is similar to the bootstrapping requirement of the approach described in this chapter.

While this work was originally developed for use during the DARPA Grand Challenge, it was replaced by the more general vision work of Dahlkamp et al. prior to race day [76].

Figure 4.15: The first panel shows the definition region in front of the vehicle; subsequent panels show the location to which the definition region is tracked back in successively earlier video frames. The last panel is 200 frames in the past.

Figure 4.16: Sample frame with roadway position calculated from positions of horizontal template matches.

Figure 4.17: Sample frame with roadway position calculated using the dynamic programming approach.

Figure 4.18: Output of naive color-based road classification algorithm.

Figure 4.19: Output of naive texture-based road classification algorithm.

Figure 4.20: Comparison of reverse optical flow, color, and texture-based algorithms using the pixel coverage metric.

Figure 4.21: Comparison of reverse optical flow, color, and texture-based algorithms using the line coverage metric.

Chapter 5

Self-Supervised Navigation

The capstone to the work discussed in the previous chapters is the novel application of reverse optical flow and self-supervised learning techniques to the problem of autonomous mobile robot navigation in offroad environments using monocular video as the only long-range sensor. This chapter demonstrates the improvements achieved by augmenting an existing self-supervised image segmentation procedure with an additional supervisory input. Obstacles and roads may differ in appearance at distance because of illumination and texture frequency properties. Therefore, reverse optical flow is added as an input to the image segmentation technique to find examples of a region of interest at previous times in the past. This provides representations of this region at multiple scales and allows the algorithm to determine more accurately where additional examples of this class appear in the image.

Navigation methods that utilize only a single monocular camera for terrain classification are subject to limitations. Due to perspective effects, the sensor resolution per terrain area decreases in a monocular image as the distance from the robot increases. To make intelligent navigation decisions about distant objects, the pixels corresponding to these objects must be correctly classified as early as possible. This requires that the robot infer class information using only a small patch of the image plane.

As discussed in Chapter 2, the specularity of an observed object depends on the viewing angle of the observer with respect to the surface normal of the object, which in turn is dependent on the distance between the observer and the object.


This effect, combined with the periodic nature of textures, means that the visual appearance of an object at a great distance may be different than its appearance when the robot is close enough to detect it with local sensors. Finally, the automatic gain control operations necessary to mitigate the large dynamic range of outdoor scenes will create differences between the appearance of an object when it occupies a large portion of the robot's field of view and its appearance when it constitutes a small part of a larger scene.

These difficulties can be overcome with the use of self-supervised learning and optical flow techniques. Self-supervised learning in this context refers to a kernel of supervised learning which operates in an unsupervised manner on a stream of unlabeled data. The exact nature of the supervised kernel varies according to the application. In the case of the adaptive road following in Chapter 4, the supervision is the assumption that the vehicle is currently on the roadway. In the case of autonomous navigation, the supervision is the assertion that objects or terrain that trigger near-range sensors - such as physical bumpers - are hazardous and are to be avoided in the future. The self-supervised algorithm takes this information and monitors the sensor inputs and incoming video stream to label a data set without human assistance.

By storing a history of observed optical flow vectors in the scene as outlined in Chapter 2, the perception module is able to trace pixels belonging to an object in the current image back to their positions in a given image in the past. In this way, obstacles and terrain types that the robot has interacted with using the short range sensors can be correlated with their appearance when they were first perceived at long range. Then the algorithm learns the visual characteristics of obstacles and easily traversable terrain at distances useful for making early navigation decisions and segmenting a 2D image accurately. This information allows long-range planning and higher traversal speeds. This approach, while applying a pipeline of established techniques, is novel in its use of reverse optical flow for image region correlation and results in improved navigation over a technique using the same image classification model, but no reverse optical flow.

Figure 5.1: Off-road navigation algorithm

This chapter addresses the implementation of such a self-supervised learning system and its advantages and limitations. The algorithms presented here have been developed by the Stanford Artificial Intelligence Lab as part of work on the DARPA Learning Applied to Ground Robotics (LAGR) project. Section 5.1 discusses the learning algorithms used to allow the robot to maneuver in its environment, including training events, reverse optical flow, image segmentation, and autonomous navigation. Section 5.2 discusses the evaluation metrics used to measure the performance of the algorithm, and the outcome of the evaluation.

5.1 Off-Road Navigation Algorithm

The algorithm described here is illustrated in Fig. 5.1. It consists of four major parts: initial pixel clustering, image segmentation, projection of image space information into an occupancy grid, and the handling of training events. This approach was designed to provide long-range terrain classification information for a mobile robot. By placing that information into an occupancy grid, other sensors such as stereo vision and infrared sensors can be used to augment this grid. A D* global path planning algorithm [77] is run on this map to perform rapid navigation. Although the majority of this section deals with this global path planner and the LAGR hardware platform, the section on "Alternate Approaches" discusses other planning and segmentation techniques that use the same combination of self-supervised learning and reverse optical flow.

The initial pixel clustering step takes an RGB input image and runs K-means on it with K set to 16 clusters. The pixel values from each of these 16 clusters are then used to compute a multi-variate Gaussian in RGB color space for each initial cluster. Each of these Gaussians is represented by a mean vector and a covariance matrix. After the initial input image is processed, each subsequent image is segmented based on these Gaussians according to a Maximum Likelihood (ML) strategy. A class label l is chosen for each pixel in order to maximize the probability of the observed color given the class label. The class label l is one of {Good, Bad, Lethal}. Three classes were chosen instead of two to allow the global D* planner to discriminate between terrain that was passable, but potentially hazardous, and lethal obstacles through which the robot could not navigate. For instance, an obstacle that triggered the infrared bumpers might be bad (it could be loose scrub brush, etc.). An obstacle that triggered the physical bumper, however, would be lethal. The choice of near-range sensor suite used for training did not provide enough information to warrant the use of more than three classes. The conditional pdf for a color c, given that it belongs to class l, is

f(c \mid l) = \sum_{j=1}^{M(l)} \frac{1}{M(l)} \, G(c;\, \mu_{l,j}, \Sigma_{l,j}) \qquad (5.1)

where the Gs are the Gaussian pdfs with means \mu and covariance matrices \Sigma, and M(l) is the number of Gaussians labeled as belonging to class l. This model is interesting in this context as an approach with solid performance which can be improved with the use of optical flow. Note that the class Unknown corresponds to the case when a pixel's color falls more than a maximum distance from any of the Gaussians in RGB space. The following Mahalanobis distance measure is used to compute this distance [78]:

Figure 5.2: The LAGR robot platform. It is equipped with a GPS receiver, infrared bumpers, a physical bumper, and two stereo camera rigs.

D_m(\vec{x}) = \sqrt{(\vec{x} - \vec{\mu})^T S^{-1} (\vec{x} - \vec{\mu})} \qquad (5.2)

\vec{x} in this case is the vector of RGB values describing the color of the pixel, \vec{\mu} is the mean of the Gaussian being compared against, and S is the covariance matrix of that Gaussian. In this work, a threshold distance of 0.9 was found empirically to be good at separating objects with new visual characteristics from ones that had colors similar to objects that belonged to known classes.

The critical portion of this approach is the training of the model. When the robot gathers data from the environment through its local sensors, information about which pixels in an image correspond to obstacles or different types of terrain is used to train the Mixture of Gaussians model. The addition of the reverse optical flow method described in Chapter 2 makes it possible to do this training based on the appearance of obstacles and terrain at greater distances from the robot. This improves the performance of classification of image regions corresponding to those obstacles and terrain types at distance and enables higher traversal speeds.
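Putting Eq. (5.1) and Eq. (5.2) together, a per-pixel classification step can be sketched as follows; the data structures and function names are assumptions for illustration, not the LAGR implementation itself.

```python
import numpy as np

def gaussian_pdf(c, mu, cov):
    """Multivariate normal density in 3-D RGB space."""
    d = c - mu
    inv = np.linalg.inv(cov)
    norm = 1.0 / np.sqrt(((2.0 * np.pi) ** 3) * np.linalg.det(cov))
    return norm * np.exp(-0.5 * d @ inv @ d)

def mahalanobis(c, mu, cov):
    """Eq. (5.2): Mahalanobis distance of a color to one mixture component."""
    d = c - mu
    return np.sqrt(d @ np.linalg.inv(cov) @ d)

def classify_pixel(c, mixtures, threshold=0.9):
    """`mixtures` maps a label in {"Good", "Bad", "Lethal"} to a list of
    (mean, cov) components.  Returns the maximum-likelihood label per
    Eq. (5.1), or "Unknown" if the pixel is farther than `threshold` from
    every component.  (Illustrative sketch.)"""
    c = np.asarray(c, dtype=float)
    min_dist = min(mahalanobis(c, mu, cov)
                   for comps in mixtures.values() for mu, cov in comps)
    if min_dist > threshold:
        return "Unknown"
    def class_likelihood(comps):
        return sum(gaussian_pdf(c, mu, cov) for mu, cov in comps) / len(comps)  # Eq. (5.1)
    return max(mixtures, key=lambda label: class_likelihood(mixtures[label]))
```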

Figure 5.3: Points corresponding to the good class are depicted by x's, the bad class by circles, and the lethal class by stars.

For the self-supervised learning, the supervisory inputs are the physical bumpers and the infrared range sensors. Blame assignment, while not perfect, is made easier by the fact that the cameras on the platform (the LAGR robot, see Fig. 5.2) point downward and inward. The pixels which correspond to the top of a traffic cone-sized obstacle that triggered either the right or left physical bumper switch have been determined via camera calibration. In the same manner, when the infrared range sensors register an obstacle at a certain range, the pixel correspondences have been determined ahead of time.

When a local sensor such as a physical bumper or an infrared range sensor registers an obstacle, the optical flow procedure is called. The pixels that correspond to where that object lies in the current image are traced back to the point where they first entered the field of view of the robot (or to the location in the oldest frame for which information exists in the optical flow history buffer if a full traceback is not possible; a 200-frame history buffer was used for the work discussed here).

Figure 5.4: Statistics for Gaussian mixture components after a run where the robot interacted with an orange fence, and avoided subsequent orange objects. Each row has the mean and standard deviation for that Gaussian component, followed by the number of good, bad, and lethal votes for that component based on training data.

Pixels corresponding to the object from this frame in the past are then used to train the Mixture of Gaussians classifier. These pixels are incorporated into the Gaussian mixture component whose mean and covariance yield the minimum distance in the Mahalanobis sense. If the Mahalanobis distance to the nearest mean is greater than 0.9, then a new mixture component is created to capture any future examples of this obstacle. Since these Gaussian mixture components are initially trained on K-means output, their covariances model the underlying variability of the color of the object being recognized.
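A minimal sketch of this training step is shown below: each traced-back pixel is assigned to the nearest existing component in the Mahalanobis sense, or a new component is spawned if the nearest distance exceeds 0.9. The component data structure, the seed covariance for a new component, and the omission of a running mean/covariance update are assumptions made for illustration.

```python
import numpy as np

def incorporate_training_pixels(pixels, components, label, threshold=0.9):
    """Fold traced-back training pixels into the mixture (illustrative sketch).

    `components` is a list of dicts with keys 'mean', 'cov', and 'votes'
    (per-class vote counts, as in Fig. 5.4).  Each pixel casts a vote for
    `label` ("Good", "Bad", or "Lethal") on the component it is assigned to.
    """
    for c in np.asarray(pixels, dtype=float):
        dists = [np.sqrt((c - comp["mean"]) @ np.linalg.inv(comp["cov"]) @ (c - comp["mean"]))
                 for comp in components]
        if components and min(dists) <= threshold:
            comp = components[int(np.argmin(dists))]
        else:
            # New visual class: spawn a component (seed covariance is an assumption).
            comp = {"mean": c.copy(), "cov": 25.0 * np.eye(3), "votes": {}}
            components.append(comp)
        comp["votes"][label] = comp["votes"].get(label, 0) + 1
    return components
```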


Figure 5.5: (a) STFT during a period of normal robot operation (b) STFT during a period of detected wheel slippage

This means that the classification is insensitive to the specific value of the threshold Mahalanobis distance. A value of 0.9 empirically allows various types of trees to be classified together, while still permitting new types of objects to be recognized as being different. A point cloud representation of what the different classes look like for data gathered during a test run is shown in Fig. 5.3. The image points used to train the Gaussians are shown in RGB space classed into good, bad, and lethal classes. Fig. 5.4 shows the state of the Gaussian mixture components after a run where the robot interacted with an orange fence, and then learned to avoid similarly-colored objects. The last mixture component did not exist after the initial K-means clustering, and was added after the interaction with the fence.

Other inputs can also be used to trigger reverse optical flow traceback training events. The ability to recognize accurately when the robot's wheels are slipping is of crucial importance. The wheel encoders would report that the robot was making forward progress, while the on-board pose estimate is too coarse to recognize slippage immediately with reliability. Once the high-level navigation software realizes that the wheels are slipping, it is trivial to back up, mark the region as an obstacle in the occupancy grid, and try another route.

The slip detector proposed here is based on the realization that the motor controllers on the LAGR robot platform caused characteristic 3 Hz peaks in the frequency domain when the vehicle was undergoing significant wheel slippage.



Figure 5.6: (a) Input frame (b) Raw segmentation output (bushes have been classified as obstacles) (c) Output of "bottom finder"

A Short Time Fourier Transform (STFT) was employed with a 2-second sliding window. The results were normalized to remove the effects of the DC coefficient, and an empirical threshold (0.25) was used to recognize significant frequency components in the neighborhood of 3 Hz. Fig. 5.5a shows what the STFT looks like during periods of normal operation of the LAGR platform, while Fig. 5.5b shows what it looks like during periods of detected wheel slip. In addition to dropping an obstacle in the map and moving on, the reverse optical flow approach discussed above could be used to train the image region classifier by providing additional examples of the bad class.
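A hedged sketch of such a slip detector follows; the choice of input signal, the Hann window, and the exact band edges around 3 Hz are assumptions, while the 2-second window and the 0.25 threshold come from the description above.

```python
import numpy as np

def detect_wheel_slip(signal, sample_rate, band=(2.5, 3.5), threshold=0.25):
    """Flag wheel slip from the most recent 2-second window of a motor-related signal.

    Computes the magnitude spectrum of the window, removes the DC coefficient,
    normalizes by the total remaining energy, and reports slip if the fraction
    of energy in the neighborhood of 3 Hz exceeds `threshold`.
    (Illustrative sketch; the exact signal and normalization are assumptions.)"""
    n = int(2 * sample_rate)
    window = np.asarray(signal[-n:], dtype=float)
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window))))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    spectrum[0] = 0.0                                  # remove the DC coefficient
    total = spectrum.sum()
    if total == 0:
        return False
    in_band = spectrum[(freqs >= band[0]) & (freqs <= band[1])].sum()
    return (in_band / total) > threshold
```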

Finally, the terrain classification information present in the current segmented video frame must be processed in such a way that the robot's planner can make useful navigation decisions. The pipeline uses look-up tables with information about the point on an assumed flat ground plane which corresponds to every pixel in the image. These tables are constructed using the intrinsic and extrinsic parameters of the robot's cameras determined from a camera calibration. They allow a ray to be calculated for each pixel in the image. The location at which that ray intersects the ground plane is then a simple trigonometric calculation. The information about the location on the ground plane to which each pixel corresponds is used to cast votes for the class each grid cell in the occupancy grid will be assigned.

To locate vertical obstacles in space under the flat ground plane assumption, the algorithm scans vertically from the bottom of the image to find the first obstacle-classified pixels. Since the approach assumes that all good terrain lies flat on the ground plane, the first pixels which do not conform to the good class when processed in this manner locate the intersection of the obstacle with the ground. All additional pixels above this intersection point would be improperly projected with the ground plane table. Instead, their collective influence is represented by summing all of their influence at this intersection. For an example of this process, see Fig. 5.6. For a sample of what the obstacles look like when the occupancy grid is populated, see Fig. 5.7. The occupancy grid used had cells that corresponded to 20x20 cm patches of terrain, and three costs: no cost, high cost, and lethal cost.
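The two projection steps described above, intersecting a calibrated pixel ray with the flat ground plane and the per-column "bottom finder" scan, can be sketched as follows; the coordinate conventions and function names are assumptions made for illustration.

```python
import numpy as np

def ground_point_for_pixel(ray_dir, cam_height):
    """Intersect a camera ray with the flat ground plane z = 0.

    `ray_dir` is the unit ray direction for a pixel in a world frame with z up
    (from the intrinsic/extrinsic calibration); `cam_height` is the camera
    height above the ground.  Returns (x, y) on the ground plane, or None if
    the ray points at or above the horizon.  (Illustrative sketch.)"""
    if ray_dir[2] >= 0:
        return None
    t = cam_height / -ray_dir[2]          # distance along the ray to the ground plane
    return (t * ray_dir[0], t * ray_dir[1])

def bottom_finder(label_image, good_label=0):
    """For each column, scan up from the bottom row and return the row of the
    first pixel not labeled as good terrain; the collective influence of the
    obstacle pixels above is credited to this ground intersection.
    Returns -1 for obstacle-free columns."""
    h, w = label_image.shape
    bottoms = np.full(w, -1, dtype=int)
    for col in range(w):
        for row in range(h - 1, -1, -1):  # from the bottom of the image upwards
            if label_image[row, col] != good_label:
                bottoms[col] = row
                break
    return bottoms
```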

5.1.1 Alternate Approaches

The utility of reverse optical flow is not limited to the global planner and the Mixture of Gaussians classifier described above. The approach has also been integrated with two local planning algorithms. These algorithms also make use of reverse optical flow and self-supervised learning, but they lack the state of a global planner and can get caught in local minima without the addition of higher-level logic.

Figure 5.7: Top figure shows Gaussian mixture model of the scene (potential obstacles colored red). Bottom figure shows the placement of obstacles in the occupancy grid.

Local Image-space Planner

As an experiment related to the work above, an iRobot ATRV platform (seen in Fig. 5.8) was made to navigate autonomously with a minimum of on-board sensors. A single video camera was attached to it, and an onboard computer took input from this camera and the robot's front physical bumper and provided steering signals to the robot's motor controllers.

Figure 5.8: iRobot ATRV platform

The goal of this subproject was to allow the robot to maneuver in an outdoor environment, learning from its interactions with obstacles, but operating without a global goal except proceeding at a fixed velocity without getting stuck. The planner was local in the image space. After obstacles were encountered (trees were the only possible obstacles in this environment), reverse optical flow was used to learn the color and texture of image patches on the obstacles 200 frames before the robot interacted with them. In the current video frame, after scene classification, the planner picked a direction that would steer it towards the part of the current field of view that was the most free of obstacles. Fig. 5.9 shows the classification of the trees as obstacles after one interaction with a tree. The color classifier simply checked whether a pixel was within a threshold distance in HSV color space of the mean of the sample pixels (the threshold was 20 for S and V, and 10 for H), and the texture classifier simply compared a 20x20 pixel template to each pixel location using the Sum of Squared Differences criterion.

Figure 5.9: Trees are classified as obstacles with a texture- and color-based segmentation algorithm after interacting with the robot's physical bumper.

Figure 5.10: The learned optical flow field is used by the robot to determine how to maneuver to push obstacles out of the field of view.

Learned Optical Flow Controller

The reverse optical flow pipeline has been integrated with a system capable of learning not only the visual characteristics of obstacles, but also the control signals necessary to maneuver in a given environment [79]. When the robot is first turned on, the controller commands a series of twists and turns which run through every possible combination of forward and reverse speeds for each of the two front drive wheels.

Using video recorded during this exercise, the algorithm builds an estimate of the optical flow that different parts of the scene will exhibit for a given combination of wheel speeds. The controller then learns about the environment and the relevant obstacle classes using the robot's local sensors, as described earlier in this chapter. After classifying the pixels in each video frame, however, the controller picks wheel speeds that move detected obstacles out of the danger area in front of the robot. To do this, the algorithm takes the current frame, iterates through each possible pair of wheel speeds, and estimates how the image would look once the various regions moved according to the expected optical flows. Fig. 5.10 shows both the set of image-region optical flows learned for a given wheel-speed combination and the part of the input image the controller used to calculate the cost of different wheel-speed choices, given the pixel classification of the current video frame.

This approach is interesting because it dynamically calibrates itself "out-of-the-box," and because it obviates the need for an additional global planner. However, it is precisely this lack of a global planner that makes it vulnerable to local minima (horseshoe-shaped obstacles, for instance).
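A minimal sketch of the wheel-speed selection step follows. The data layout is assumed for illustration: region_flow is taken to be a coarse grid of learned (dy, dx) displacements per image block for one wheel-speed pair, flow_by_speeds a dictionary keyed by (left, right) speed pairs, and danger_region a boolean mask over the image; none of these names come from the original system.

    import numpy as np

    def predict_mask(obstacle_mask, region_flow, cell=40):
        """Shift each coarse block of the obstacle mask by the flow learned for
        one wheel-speed pair (integer shifts; blocks leaving the frame are dropped)."""
        h, w = obstacle_mask.shape
        predicted = np.zeros_like(obstacle_mask)
        for i in range(0, h, cell):
            for j in range(0, w, cell):
                dy, dx = region_flow[i // cell, j // cell]
                src = obstacle_mask[i:i + cell, j:j + cell]
                y0, x0 = i + int(dy), j + int(dx)
                if y0 < 0 or x0 < 0 or y0 + src.shape[0] > h or x0 + src.shape[1] > w:
                    continue
                predicted[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] |= src
        return predicted

    def pick_wheel_speeds(obstacle_mask, flow_by_speeds, danger_region):
        """Choose the (left, right) wheel-speed pair whose learned flow leaves the
        least obstacle mass inside the danger area in front of the robot."""
        costs = {speeds: predict_mask(obstacle_mask, flow)[danger_region].sum()
                 for speeds, flow in flow_by_speeds.items()}
        return min(costs, key=costs.get)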

5.2 Results

This section discusses the results of applying the technique described above. The work was done as part of the Stanford Artificial Intelligence Lab's participation in the DARPA LAGR program; the robot platform for the program is shown in Fig. 5.2. The goal of the program is autonomous off-road navigation between two GPS waypoints while avoiding obstacles, using computer vision as the only long-range sensor. Since the visual obstacles at the remote test sites where the races were run may vary drastically from the obstacles used to test the vehicle on local test courses, self-supervised learning is an attractive approach.

5.2.1 Hypotheses

The investigations detailed in this chapter led to two main hypotheses. First, the use of reverse optical flow improves the scene-segmentation and obstacle-classification performance of the Mixture of Gaussians classifier. Second, the use of reverse optical flow also improves the real-world navigation behavior of a robot running the Mixture of Gaussians classifier and a global D* planner, reducing both the time taken to reach a goal and the number of obstacles that the robot contacts with its short-range sensors.

5.2.2 Methods

Scene Segmentation: To evaluate scene-segmentation accuracy, the following two-part metric was used. If the image-segmentation stage of the proposed algorithm is set to return a binary cost map (hazard or non-hazard), then false negatives are reflected in the percentage of each discrete obstacle that is correctly matched. The number of pixels the algorithm correctly classified as belonging to an obstacle that a human also labeled as belonging to that obstacle is divided by the total number of human-labeled pixels for that obstacle, and this percentage is averaged over all the obstacles in the scene. The number is then corrected for the smaller total number of pixels present in obstacles farther from the robot, which are considered equally important for successful long-range, high-speed navigation.

False positives are reflected by the number of square meters of ground within 25 meters of the robot that are incorrectly classified in a given image, divided by the total number of square meters the algorithm was asked to segment. The number of square meters corresponding to a given pixel was calculated using the ground-plane tables discussed earlier in this chapter, so incorrect classifications of terrain at large distances from the robot have a large effect on this score.

This process is illustrated in Fig. 5.11. Fig. 5.11a shows a sample frame from the video sequence, Fig. 5.11b shows the hand-labeled obstacles, Fig. 5.11c shows the segmentation produced without reverse optical flow, and Fig. 5.11d shows the segmentation produced with reverse optical flow.
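The following sketch shows one way the two scores could be computed from a predicted binary mask and a hand-labeled obstacle image. It is illustrative only: the per-obstacle distance correction mentioned above is not reproduced, the label convention (0 = background, positive integers = obstacle IDs) is assumed, and pixel_area_m2 stands in for the per-pixel ground area derived from the ground-plane tables.

    import numpy as np

    def coverage_score(pred_mask, obstacle_labels):
        """Average, over hand-labeled obstacles, of the fraction of each obstacle's
        pixels that the classifier also marked as obstacle (false-negative term)."""
        ids = [k for k in np.unique(obstacle_labels) if k != 0]
        fractions = [(pred_mask & (obstacle_labels == k)).sum() / (obstacle_labels == k).sum()
                     for k in ids]
        return float(np.mean(fractions)) if fractions else 1.0

    def false_positive_area(pred_mask, obstacle_labels, pixel_area_m2, within_25m_mask):
        """Square meters of traversable ground within 25 m wrongly marked as obstacle,
        divided by the total ground area the algorithm was asked to segment."""
        ground = (obstacle_labels == 0) & within_25m_mask
        wrong = pred_mask & ground
        return float(pixel_area_m2[wrong].sum() / pixel_area_m2[ground].sum())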


Figure 5.11: (a) Video frame (b) Hand-labeled obstacle image (c) Segmentation without optical flow (d) Segmentation with optical flow

The proposed evaluation compares algorithm-labeled images to operator-labeled images using a frame-by-frame, pixel-level metric. There is great research benefit in using standardized benchmarks for natural object recognition algorithms such as MINERVA [80]; however, since the proposed approach requires the entire video stream up until an object interacts with the local sensors, a static database of benchmark images is not appropriate for evaluating this work. The traditional error metric of the percentage of incorrectly classified pixels fails to capture the importance of correctly classifying objects at greater distances from the robot, which subtend fewer pixels. The pixel distance error metric [81] is also not well suited to natural scenes with a large depth of field.

The results shown here come from a set of three test runs in a forest in flat lighting with no sharp shadows. The only obstacles were trees. Two instances of the segmentation module ran at the same time: one used optical flow techniques to determine the characteristics of obstacles the robot interacted with, while the other did not. During each run the robot was guided by an operator at normal traversal speeds towards three or four of the trees in the scene for a local-sensor interaction. After these initial interactions, the robot was driven through the forest without any additional close-range obstacle interactions. During the runs, video frames and their segmented counterparts were logged on the robot at a rate of 10 Hz. After each run these logs were parsed, and every 10th frame was analyzed using the metrics described above. The three runs, taken together, totaled over 3,000 frames, of which 304 were used to compute the results.

Autonomous Navigation: The applicability of the scene segmentation results was tested by evaluating the autonomous performance of the robot with the Gaussian databases collected with and without optical flow traceback during the last data collection run discussed above. The evaluation covered both the overall duration of a given run and the number of obstacles the robot contacted with its near-range sensors. Running all of the planning and sensing modules except stereo vision, the robot was tested on a 70 m course through a different part of the forest containing trees with which the robot had not previously interacted. Learning was turned off during these runs. Ten runs were made, and the robot reached the goal in every one. Half of the runs used the Gaussian database collected without optical flow traceback, while the other half used the database collected with optical flow traceback. The starting position of the robot was randomly varied within a radius of 5 meters for each traceback/no-traceback pair of runs.

5.2.3 Findings

Scene Segmentation: Without optical flow, the average percentage coverage of each obstacle was 81.65%, and the percentage of area incorrectly classified as belonging to an obstacle was 53.75%. With optical flow, the percentage coverage of each obstacle was 88.98% and the percentage of area incorrectly classified was 36.53%. The data show that the percentage of each obstacle in the scene correctly identified by the classifier is higher with optical flow traceback than without. More significant is the decrease in the percentage of area incorrectly classified as belonging to an obstacle class.


Figure 5.12: (a) Paths taken using data collected without optical flow. (b) Paths taken using data collected with optical flow.

Autonomous Navigation: The paths taken by the robot during these ten tests, as well as the locations of the trees on the test course, are shown in Fig. 5.12. The number of interactions with trees that caused the robot to stop and back up (physical and IR bumper hits) was noted for each run, along with the total run time. The average time and the average number of trees the robot contacted at close range were lower for runs using the data collected with the optical flow approach than for runs using the data collected without reverse optical flow. The results, with 95% confidence ellipses, are shown in Fig. 5.13. On average, the robot took ≈4.2 minutes to complete the course and encountered ≈3.5 obstacles when reverse optical flow was not used to train the classifier, and took ≈3 minutes and encountered ≈0.5 obstacles when it was. The algorithm ran at 10 Hz on the robot.

5.2.4 Additional Results

Each of the DARPA LAGR test runs was similar in form. Teams wrote code for perception, planning, and control, and sent a flash drive to DARPA. DARPA would load the drive into a robot at an undisclosed remote location, give the robot a target GPS coordinate, and evaluate its performance as it attempted to reach the goal in different types of off-road terrain. Each test was engineered to focus on different objectives (the ability to function with limited GPS coverage, in conditions with significant wheel slip, in situations with concave obstacles, etc.). The final three tests of Phase I (the results of which decided which teams completed Phase I successfully) consisted of two Learning from Experience tests and one Learning from Example test. The Learning from Experience tests featured a prominent visual feature that the robot was supposed to learn was traversable (in this case a dark red cedar mulch trail). The robot had three opportunities to run the course, saving data between runs as necessary.

The Learning from Experience tests drove the development of a special planner architecture that maintained a rolling evidence grid centered on the location of the robot. The minimum-cost path to the goal was computed using dynamic programming, with a heuristic function predicting the cost from the edge of the local map to the final goal. Local control was done by rolling out short trajectories obeying the non-holonomic constraints of the robot and adding the cost of the best global path from the end of each local trajectory. A screenshot of the planner is shown in Fig. 5.14. The robot is located in the middle of the map and is planning towards a point in the upper left corner (behind the robot); the local plan is a tight right turn, while the global plan avoids previously detected obstacles in the map. Separating the local and global planners allowed a much faster planning cycle than the baseline, enabling faster reactions to obstacles and smoother performance at high velocities. In the first Learning from Experience test the robot successfully learned that the mulch trail was traversable, following it to its end. In the second Learning from Experience test, the robot completed the course all three times with competitive times, the last run being the fastest of the set. Operators could see the robot making use of its previously saved trajectory to eliminate cul-de-sacs during planning.

The Learning from Example test was similar in concept. Before the robot's autonomous runs, a human operator guided the robot to the goal using a remote control. While the algorithm was allowed to save perception information (in the Stanford case, the Mixture of Gaussians describing obstacles and traversable terrain in RGB space), the robot was not allowed to save the GPS coordinates of the path used, or a global map. An example of one of the Stanford in-house training courses can be seen in Fig. 5.15a, and the output of the trained classifier on the same frame is shown in Fig. 5.15b. In addition to the training changes, the output of the classifier was treated differently. The pixels in the current monocular image classified as traversable terrain (shown in Fig. 5.15c) were used to fit a polynomial contour (Fig. 5.15d), which the planner treated as the location of the desired path. An EM step then used the pixels lying along this path as additional training examples for the classifier; a sketch of this refinement appears below. The improved classifier output is shown in Fig. 5.15e, and the final estimate of the location of the path is illustrated by the polynomial contour in Fig. 5.15f. Stanford's performance over the final three tests satisfied the Phase I requirements of the program.
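One plausible reading of the contour-fitting and EM refinement step is sketched below. The choice of fitting a polynomial column = f(row) through per-row centroids of the traversable pixels, the polynomial degree, and the path half-width are illustrative assumptions, not details of the original implementation.

    import numpy as np

    def fit_path_contour(traversable_mask, degree=2):
        """Fit a polynomial column = f(row) through the per-row centroids of the
        pixels currently classified as traversable (the desired-path estimate)."""
        rows, cols = np.nonzero(traversable_mask)
        ys = np.unique(rows)
        if ys.size <= degree:
            return None                      # not enough traversable rows to fit a contour
        centroids = np.array([cols[rows == y].mean() for y in ys])
        return np.polyfit(ys, centroids, degree)

    def pixels_along_path(coeffs, image_shape, half_width=10):
        """Return a mask of pixels within half_width columns of the fitted contour;
        an EM-style step re-trains the terrain classifier on these pixels."""
        h, w = image_shape
        mask = np.zeros((h, w), dtype=bool)
        for y in range(h):
            x = int(round(np.polyval(coeffs, y)))
            if 0 <= x < w:
                mask[y, max(0, x - half_width):min(w, x + half_width + 1)] = True
        return mask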

5.2.5 Conclusions

The 2D monocular-image terrain classifier performs well on its own, and performs even better when optical flow techniques are used to correlate the appearance of objects close to the robot with their appearance at greater distances. Videos illustrating the use of optical flow techniques on the robot to trace features corresponding to specific objects back in time, as well as the performance of the segmentation algorithm, can be found at http://cs.stanford.edu/group/lagr/IJCV IJRR/.

5.3 Related Work

Autonomous mobile robot navigation in challenging, real-world environments is currently an area of interest in the fields of computer vision and robotics. While competition-style events such as the DARPA Grand Challenge have drawn public attention to the difficulties of following roads in ill-structured environments [82], government-funded research projects such as the DARPA LAGR program and the US Department of Defense DEMO I, II, and III programs [83, 84] have emphasized navigation in completely off-road environments. Robust autonomous perception and navigation algorithms benefit both the military and the private sector. As these algorithms improve, automakers will be able to extend the usefulness of current research in driver assistance technologies, such as lane departure detection [85, 86], pedestrian tracking [87], and traffic sign detection and tracking [88], to more varied environments. Robotic exploration and mapping expeditions will also become possible in a wider variety of environments.

Many current approaches to autonomous robot navigation use 3D sensors such as laser range-finders or stereo vision as the basis of their long-range perception systems [89, 90, 91, 92, 93, 94, 95, 96]. These sensors have several disadvantages. While they provide accurate measurements within a certain distance, their perceptive range is limited, which in turn limits the top speed of the robotic platform. Also, active sensors such as lasers tend to cost more, consume more power, and broadcast the position of the robot when it may be advantageous for it to remain undetected [97].

A monocular camera can mitigate these problems by capturing scene information out to the horizon. Color or texture information can be used to classify pixels into groups corresponding to traversable terrain or obstacles. Color dissimilarity and color gradient changes have been used to perform this classification [98, 99]. The use of texture was pioneered by MIT's Polly tour guide for indoor navigation [100] and has since progressed to the point where natural-terrain classifiers that use texture cues can be used on planetary rovers [101]. Techniques that use stereo camera information to tag obstacles in the field of view and then pair that information with a visual appearance-based classifier to determine terrain traversability [102, 103, 104, 105, 106] have met with success in off-road navigation tests. These approaches might benefit from the same reverse optical flow framework to identify potential obstacles at distances beyond the effective range of stereo cameras, although the difference in appearance between potential obstacles and the training examples would likely be less marked than in the monocular video work discussed here, where much shorter-range sensors are used for labeling.

Self-supervised learning has also recently been applied successfully to robot navigation in off-road environments [108, 60, 107]. The work of Sofman et al. was notable for its use of overhead data (aerial imagery, etc.) as the input corpus for classification; in a situation where the robot's global position is not immediately known, or where no appropriate imagery corpus is available, its applicability is limited. The approach of using a mixture of Gaussians to describe the state of the world is similar to that of Manduchi et al. [103]. That work uses laser and stereo sensors to incorporate range information into the obstacle-labeling process, and therefore does not use a method like reverse optical flow to correlate the appearance of obstacles detected near the robot with their appearance at more useful distances from the robot.

Figure 5.13: Autonomous navigation results with 95% confidence ellipses. The average run duration (in minutes) is indicated on the y-axis, and the average number of obstacles encountered on the x-axis.

Figure 5.14: Planner local and global trajectories.


Figure 5.15: (a) Input frame (b) Initial classifier output (c) Traversable pixels (d) Polynomial contour (e) Refined classifier output (f) Estimated path

Chapter 6

Conclusions and Future Work

6.1 Conclusions

This thesis has covered three novel contributions to subfields of mobile robotics that leverage large datasets, unsupervised or self-supervised learning techniques, and computer vision tools. All of the methods discussed rely on optical flow techniques, either to distinguish moving objects from static surroundings (as in Chapter 3) or to allow classification algorithms to train on more representative pixel values than would otherwise be possible, by associating objects in the current frame with their appearance at a larger distance from the robot's sensors. These techniques were presented in detail in Chapter 2.

In Chapter 3, a method based on feature tracking and EM was presented for distinguishing between object motion and camera egomotion, which in turn enabled high-accuracy tracking of multiple moving objects from a moving platform using particle filters. This tracking information (along with the knowledge of which objects in the scene were moving and which were not) was then used to produce activity ground models. These models specify, based on training data, how moving objects are expected to move (speed and direction) given their current location. Intuitively this makes sense: pedestrians, for the most part, travel in different areas from cars, and the direction in which cars travel is often related to which part of a street they are traversing (lanes, etc.). The proposed approach, however, does not need any prior information about the geometry of the area in question. These ground activity models, once constructed, can be used for a variety of applications, including video sequence registration and improved tracking of moving objects.

In Chapter 4, dynamic programming and reverse optical flow were combined in a novel technique to identify where a loosely structured roadway lay in a frame of a video sequence. This approach used the appearance, in previous video frames, of the roadway currently in front of the car to find the globally most likely 1-D template match positions for the roadway at different heights in the image. The algorithm operated at real-time frame rates and, when compared to a hand-labeled ground truth dataset, was shown to be effective where other approaches may break down.

In Chapter 5, an algorithm was presented that uses optical flow techniques to integrate long-range data from monocular camera images with short-range sensor information to improve the quality and utility of terrain classification. In this approach, the optical flow field between successive input frames is coarsened and saved in a ring buffer. Information about the location of features on an object in the current frame can then be used to trace them back to their locations in a previous frame. This allows information about the traversability and danger of objects within range of the robot's physical and IR bumpers to be used to classify similar objects reliably at large distances from the robot, permitting more efficient navigation and higher speeds. The approach resulted in an increase in the percentage of each obstacle correctly segmented in the input image and a decrease in the percentage of traversable terrain incorrectly classified. Finally, using this technique in a situation where monocular vision was the only long-range sensor on a robot platform proved advantageous in terms of both the number of obstacles encountered and the time it took the robot to reach the goal.

There are situations in which the optical flow techniques upon which this approach is based do poorly. Unusually smooth image regions, or regions which are completely saturated or desaturated, will not contain trackable features. This causes the optical flow history grid cells in those areas to be empty or filled with noise, which in turn causes errors when objects are traced back to their original locations. Objects which are in the field of view of the robot, leave the image, and return later (such as objects seen while the robot is executing an S-turn) are currently traced back only to the location where they most recently re-entered the field of view; handling these cases using template matching or particle filter tracking might allow a more complete traceback. Finally, changing illumination conditions can result in unacceptable rates of misclassification.
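For concreteness, the ring buffer of coarsened flow fields and the traceback operation summarized above can be sketched as follows. This is a minimal illustration under stated assumptions: the cell size, buffer depth, class name, and flow sign convention are chosen here for the example, and the actual implementation is the one described in Chapter 2.

    from collections import deque
    import numpy as np

    class FlowHistory:
        """Ring buffer of coarsened optical flow fields used to trace a pixel in the
        current frame back to its location n frames earlier."""
        def __init__(self, depth=200, cell=20):
            self.cell = cell
            self.buffer = deque(maxlen=depth)    # oldest coarse flow fields fall off the end

        def push(self, flow):
            """Store the mean (dx, dy) flow of each cell x cell block of a dense flow field
            of shape (h, w, 2), where flow maps the previous frame into the current one."""
            h, w, _ = flow.shape
            coarse = flow[:h - h % self.cell, :w - w % self.cell].reshape(
                h // self.cell, self.cell, w // self.cell, self.cell, 2).mean(axis=(1, 3))
            self.buffer.append(coarse)

        def trace_back(self, x, y, n_frames):
            """Follow the stored flow backwards from image position (x, y) for n_frames."""
            for coarse in list(self.buffer)[-n_frames:][::-1]:   # newest first
                i = min(int(y) // self.cell, coarse.shape[0] - 1)
                j = min(int(x) // self.cell, coarse.shape[1] - 1)
                dx, dy = coarse[i, j]
                x, y = x - dx, y - dy     # undo the motion that produced the newer frame
            return x, y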

6.2 Future Work

The work in Chapter 3 is of limited utility in building activity models of areas where different travel directions and speeds occur in the same spatial grid cells; multi-purpose trails and time-of-day reversed freeway lanes are examples of such situations. More work will be required to make ground activity models robust to these limitations. It will also be interesting to see how ground activity models might help in other domains. Change detection in automated monitoring of video streams, large-scale surveying to identify intersections or roundabouts that would benefit from rearchitecting, and automated map attribute extraction (one-ways, turn restrictions, and the like) are interesting potential research areas.

Extensions of the work in Chapters 4 and 5 might include the use of more sophisticated machine learning and computer vision techniques for feature selection and scene segmentation once objects have been traced back to their origins in earlier frames. The approach is also applicable to tracking, recognition, and surveillance problems in which correlating the appearance of objects or people at close range with their visual characteristics at long range increases the accuracy of classification.

As the size of the datasets available for training machine learning algorithms continues to increase, clever applications of computer vision and large-scale unsupervised learning will become ever more important.

Bibliography

[1] A. Lookingbill, D. Lieb, D. Stavens, and S. Thrun. “Learning Activity-Based Ground Models from a Moving Helicopter Platform.” Proc. ICRA. 2005.

[2] Lieb, D., Lookingbill, A., and Thrun, S., "Adaptive Road Following using Self-Supervised Learning and Reverse Optical Flow," Proceedings of Robotics: Science and Systems, 2005.

[3] A. Lookingbill, J. Rogers, D. Lieb, J. Curry, and S. Thrun. "Reverse Optical Flow for Self-Supervised Adaptive Autonomous Robot Navigation." International Journal of Computer Vision, 74(3), pp. 287-302, 2007.

[4] A. Lookingbill, D. Lieb, and S. Thrun. "Optical Flow Approaches for Self-supervised Learning in Autonomous Mobile Robot Navigation." Autonomous Navigation in Dynamic Environments. STAR 35, Springer, 2007.

[5] J. Shi and C. Tomasi. “Good Features to Track.” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 593–600, 1994.

[6] D. G. Lowe. "Object Recognition from Local Scale-Invariant Features." Proc. of the Seventh International Conference on Computer Vision (ICCV'99), p. 1150, Volume 2, 1999.


[7] E. Trucco and A. Verri. Introductory Techniques for 3-D Computer Vision. Prentice Hall, 1998.

[8] J. Bouguet. "Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm." Intel Corporation, Microprocessor Research Labs, 2000. OpenCV Documents.

[9] “OpenCV 2.1 C++ Reference” Online reference: http://opencv.willowgarage.com/documentation/cpp/index.html

[10] T. Brox, and J. Malik, “Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation.” In IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 500-513, Volume 33, 2011.

[11] B. Horn and B. Schunck, “Determining optical flow.” In Artificial Intelligence, pp. 185-203, Volume 17, 1981.

[12] D. Zang, L. Wietzke, C. Schmaltz, and G. Sommer, "Dense Optical Flow Estimation from the Monogenic Curvature Tensor." In Lecture Notes in Computer Science, pp. 239-250, Volume 4485, 2007.

[13] A. Bruhn, J. Weickert, and C. Schnörr, "Lucas/Kanade Meets Horn/Schunck: Combining Local and Global Optic Flow Methods." International Journal of Computer Vision, 61(3), pp. 211-231, 2005.

[14] N. Onkarappa and A. Sappa, “On-Board Monocular Vision System Pose Estimation through a Dense Optical Flow.” In Lecture Notes in Computer Science, pp. 230-239, Volume 6111, 2010.

[15] R. T. Collins, Y. Liu, and M. Leordeanu, "Online Selection of Discriminative Tracking Features." In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.

[16] H. Chen, T. Liu, and C. Fuh, "Probabilistic Tracking with Adaptive Feature Selection." In Proceedings of ICPR 2004, 2004.

[17] U. Neumann and S. You, "Natural Feature Tracking for Augmented Reality." In IEEE Transactions on Multimedia, 1999.

[18] H. Zhou, Y. Yuan, and C. Shi, "Object Tracking Using SIFT Features and Mean Shift." In Computer Vision and Image Understanding, pp. 345-352, Volume 113, 2009.

[19] L. Dorini and S. Goldenstein, "Unscented Feature Tracking." In Computer Vision and Image Understanding, pp. 8-15, Volume 115, 2011.

[20] C. Takada and Y. Sugaya, “Detecting Incorrect Feature Tracking by Affine Space Fitting.” In Lecture Notes in Computer Science, pp. 191-202, Volume 5414, 2009.

[21] D. Ta, W. Chen, N. Gelfand, and K. Pulli, "SURFTrac: Efficient tracking and continuous object recognition using local feature descriptors." In IEEE Conference on Computer Vision and Pattern Recognition, pp. 2937-2944, 2009.

[22] R. Rodrigo, M. Zouqi, C. Zhenhe, and J. Samarabandu, "Robust and Efficient Feature Tracking for Indoor Navigation." In IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, pp. 658-671, Volume 39, 2009.

[23] D. Wagner, G. Reitmayr, A. Mulloni, T. Drummond, and D. Schmalstieg, "Real-Time Detection and Tracking for Augmented Reality on Mobile Phones." In IEEE Transactions on Visualization and Computer Graphics, pp. 355-368, Volume 16, 2010.

[24] F. Caballero, L. Merino, J. Ferruz, and A. Ollero, "Vision-Based Odometry and SLAM for Medium and High Altitude Flying UAVs." In Unmanned Aircraft Systems, pp. 137-161, 2009.

[25] G. Fielding and M. Kam, "Disparity maps for dynamic stereo." In Pattern Recognition, Vol. 34, pp. 531-545, 2001.

[26] Benoit, S. "Towards Direct Motion and Shape Parameter Recovery from Image Sequences." PhD Dissertation, Department of Electrical Engineering, McGill University, 2003.

[27] A.P. Dempster, N.M. Laird, and D.B. Rubin. "Maximum Likelihood from Incomplete Data via the EM Algorithm." Journal of the Royal Statistical Society, Series B, 39(1), pp. 1-38, 1977.

[28] C. Hsu and R. Beuker, "Multiresolution Feature-Based Image Registration." Visual Communications and Image Processing, Proc. of SPIE vol. 4067, 1490-1498, 2000.

[29] J. Kim and J. Fessler “Intensity-based image registration using robust correlation coefficients” IEEE Transactions on Medical Imaging vol. 23, 1430-1444, 2004.

[30] G. Konecny. Geoinformation: Remote Sensing, Photogrammetry and Geographical Information Systems. Taylor & Francis, 2002.

[31] F. Lu and E. Milios. “Globally Consistent Range Scan Alignment for Environment Mapping.” Autonomous Robots, 4:333–349, 1997.

[32] D. Hähnel, W. Burgard, and S. Thrun. "Learning Compact 3D Models of Indoor and Outdoor Environments with a Mobile Robot." Robotics and Autonomous Systems, 44(1), 2003.

[33] S.B. Williams, G. Dissanayake, and H. Durrant-Whyte. “An Efficient Approach to the Simultaneous Localisation and Mapping Problem.” In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 406–411, Washington, DC, 2002.

[34] D. Ferguson, A. Morris, D. Hähnel, C. Baker, Z. Omohundro, C. Reverte, S. Thayer, W. Whittaker, W. Burgard, and S. Thrun. "An Autonomous Robotic System for Mapping Abandoned Mines." In S. Thrun, L. Saul, and B. Schölkopf, editors, Proceedings of Conference on Neural Information Processing Systems (NIPS). MIT Press, 2003.

[35] D. Makris and T. Ellis. “Automatic Learning of an Activity-Based Semantic Scene Model.” IEEE Conf. on Advanced Video and Signal Based Surveillance, pages 183-190, 2003.

[36] X. Wang, K. Tieu, and E. Grimson. "Learning Semantic Scene Models by Trajectory Analysis." Eur. Conf. Computer Vision, p. 110, 2006.

[37] C. Stauffer and W. E. L. Grimson. “Learning Patterns of Activity Using Real-time Tracking.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:747–757, 2000.

[38] E. B. Ermis, P. Clarot, P. M. Jodoin, and V. Saligrama, "Activity Based Matching in Distributed Camera Networks." In IEEE Transactions on Image Processing, Volume 19, pp. 2595-2613, 2010.

[39] J. Varadarajan and J. M. Odobez, "Topic Models for Scene Analysis and Abnormality Detection." In Proceedings of IEEE 12th International Conference on Computer Vision Workshops, 2009.

[40] A. Kembhavi, T. Yeh, and L. Davis, “Why Did the Person Cross the Road (There)? Scene Understanding Using Probabilistic Logic Models and Common Sense Reasoning.” In ECCV, Lecture Notes in Computer Science, Volume 6312, pp. 693-706, 2010.

[41] T. Zhang, H. Lu, and S. Z. Li, “Learning semantic scene models by object classification and trajectory clustering.” In IEEE Conference on Computer Vision and Pattern Recognition, pp. 1940-1947, 2009.

[42] Y. Yang, J. Liu, and M. Shah, "Video Scene Understanding Using Multi-scale Analysis." In Proceedings of International Conference on Computer Vision, 2009.

[43] D. H. Wilson and C. Atkeson. "Simultaneous Tracking and Activity Recognition (STAR) Using Many Anonymous, Binary Sensors." Pervasive Computing, Springer, 2005.

[44] P.J. Burt, J.R. Bergen, R. Hingorani, R. Kolczynski, W.A. Lee, A. Leung, J. Lubin, and H. Shvayster. "Object tracking with a moving camera." Proceedings of the Workshop on Visual Motion, 2-12, 1989.

[45] C. Lin and M. Wolf, "MCMC-based Feature-guided Particle Filtering for Tracking Moving Objects from a Moving Platform." In Proceedings of IEEE 12th International Conference on Computer Vision Workshops, pp. 828-833, 2009.

[46] C. Lin and M. Wolf, “Detecting Moving Objects Using a Camera on a Moving Platform.” In Proceedings of 20th International Conference on Pattern Recognition, pp. 460-463, 2010.

[47] J. McIntyre, A. Church, F. Labrosse, “Efficient Image-based Tracking of Apparently Changing Moving Targets.” In Proceedings of Towards Autonomous Robotics Systems, pp. 119-126, 2009.

[48] A. Ess, B. Leibe, K. Schindler, and L. van Gool, “Robust Multiperson Tracking from a Mobile Platform.” In IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 31, pp. 1831-1846, 2009.

[49] J. Xiao, C. Yang, F. Han, and H. Cheng, “Vehicle and Person Tracking in Aerial Videos.” In Lecture Notes in Computer Science, pp. 203-214, Volume 4625, 2008.

[50] B. Jung and G. Sukhatme. "Detecting Moving Objects using a Single Camera on a Mobile Robot in an Outdoor Environment." 8th Conference on Intelligent Autonomous Systems, pages 980-987, 2004.

[51] J. Kang, I. Cohen, and G. Medioni. "Continuous Tracking Within and Across Camera Streams." CVPR 03, 2003.

[52] J. Vermaak, A. Doucet, and P. Pérez. "Maintaining Multi-Modality Through Mixture Tracking." Int. Conf. on Computer Vision, pages 1110-1116, 2003.

[53] Ettinger, S. M., Nechyba, M. C., Ifju, P. G., and Waszak, M., "Towards Flight Autonomy: Vision-Based Horizon Detection for Micro Air Vehicles," Florida Conference on Recent Advances in Robotics, 2002.

[54] Berthod, M., Kato, Z., Yu, S., and Zerubia, J., "Bayesian Image Classification Using Markov Random Fields." Image and Vision Computing, issue 14, pp. 285-295, 1996.

[55] Kato, Z., Zerubia, J., and Berthod, M., "Satellite Image Classification Using a Modified Metropolis Dynamics." In Proceedings of International Conference on Acoustics, Speech and Signal Processing, volume 3, pages 573-576, March 1992.

[56] Kato, Z., "Modélisations markoviennes multirésolutions en vision par ordinateur. Application à la segmentation d'images SPOT." PhD Thesis, INRIA, Sophia Antipolis, France, December 1994.

[57] M. Manz, M. Himmelsbach, T. Luettel, and H. Wuensche, "Fusing LIDAR and Vision for Autonomous Dirt Road Following Incorporating a Visual Feature into the Tentacles Approach." In Autonome Mobile Systeme, pp. 17-24, 2009.

[58] C. Ordonez, O. Y. Chuy, E. G. Collins, and L. Xiuwen, “Rut tracking and steering control for autonomous rut following.” In Proceedings of IEEE International Conference on Systems, Man and Cybernetics, pp. 2775 - 2781, 2009.

[59] G. Chunzhao, S. Mita, and D. McAllester, "Stereovision-based Road Boundary Detection for Intelligent Vehicles in Challenging Scenarios." In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1723-1728, 2009.

[60] R. Hadsell, P. Sermanet, J. Ben, A. Erkan, M. Scoffier, K. Kavukcuoglu, U. Muller, and Y. LeCun, "Learning Long-range Vision for Autonomous Off-road Driving." In Journal of Field Robotics, Volume 26, pp. 120-144, 2009.

[61] J. Wang, Z. Ji, and Y. Su, "Unstructured Road Detection Using Hybrid Features." In Proceedings of the Eighth International Conference on Machine Learning and Cybernetics, pp. 482-486, 2009.

[62] H. Kong, J. Y. Audibert, and J. Ponce, “Vanishing point detection for road detection.” In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 96-103, 2009.

[63] H. Kong, J. Y. Audibert, and J. Ponce, “General Road Detection From a Single Image.” In IEEE Transactions on Image Processing, Volume 19, pp. 2211-2220, 2010.

[64] W. Wu and G. ShuFeng, "Research on Unstructured Road Detection Algorithm Based on the Machine Vision." In Proceedings of Asia-Pacific Conference on Information Processing, Volume 2, pp. 112-115, 2009.

[65] W. Yanqing, C. Deyun, ZS. Chaoxia, and W. Peidong, “Vision-based Road Detection by Monte Carlo Method.” In Information Technology Journal, Volume 9, pp. 481-487, 2010.

[66] H. Y. Cheng, C. C. Yu, C. C. Tseng, K. C. Fan, J. N. Hwang, and B. S. Jeng, "Environment Classification and Hierarchical Lane Detection for Structured and Unstructured Roads." In Institution of Engineering and Technology, Computer Vision, Volume 4, pp. 37-49, 2010.

[67] Coombs, D., Herman, M., Hong, T., and Nashman, M., “Real-Time Obstacle Avoidance Using Central Flow Divergence, and Peripheral Flow,” IEEE Trans. on Robotics and Automation, vol. 14, no. 1, pp. 49-59, 1998.

[68] Giachetti, A., Campani, M., and Torre, V., “The Use of Optical Flow for Road Navigation,” IEEE Trans. on Robotics and Automation, vol. 14, no. 1, pp. 34-48, 1998.

[69] Pomerleau, D. "RALPH: Rapidly Adapting Lateral Position Handler." IEEE Symposium on Intelligent Vehicles, September 25-26, 1995, Detroit, Michigan, USA.

[70] Dal Poz, A. P., and do Vale, G. M., “Dynamic programming approach for semi-automated road extraction from medium- and high-resolution images,” ISPRS Archives, vol. XXXIV, part 3/W8, Sept. 2003.

[71] Kang, D., and Jung, M., "Road Lane Segmentation using Dynamic Programming for Active Safety Vehicles," Elsevier Pattern Recognition Letters, vol. 24, issue 16, pp. 3177-3185, 2003.

[72] Kim, H., Hong, S., Oh, T., and Lee, J., “High Speed Road Boundary Detection with CNN-Based Dynamic Programming,” Advances in Multimedia Information Processing - PCM 2002: Third IEEE Pacific Rim Conference on Multimedia, pp. 806-813, Dec. 2002.

[73] Rasmussen, C. "Combining Laser Range, Color, and Texture Cues for Autonomous Road Following," Proc. IEEE Inter. Conf. on Robotics and Automation, Washington, DC, May 2002.

[74] Redmill, K., Upadhya, S., Krishnamurthy, A., and Ozguner, U., "A Lane Tracking System for Intelligent Vehicle Applications," Proc. IEEE Intelligent Transportation Systems Conference, 2001.

[75] Crisman, J., and Thorpe, C. “UNSCARF, A Color Vision System for the Detection of Unstructured Roads” In Proceedings of International Conference on Robotics and Automation, Sacramento, CA 1991.

[76] Dahlkamp, H., Kaehler, A., Stavens, D., Thrun, S., and Bradski, G., "Self-supervised Monocular Road Detection in Desert Terrain." In Proceedings of the Robotics Science and Systems Conference, 2006.

[77] Stentz, A., “Optimal and Efficient Path Planning for Partially-known Environments,” Proceedings of IEEE International Conference on Robotics and Automation vol. 4, pp. 3310-3317, 1994.

[78] De Maesschalck, R., Jouan-Rimbaud, D., and Massart, D. L., "The Mahalanobis distance," In Chemometrics and Intelligent Laboratory Systems, Vol. 50, pp. 1-18, 2000.

[79] Rogers, J., Lookingbill, A., and Thrun, S., "Learned Optical Flow for Mobile Robot Control." In Proceedings of NIPS 2005 Workshop on Machine Learning Based Robotics in Unstructured Environments, 2005.

[80] Singh, S. and Sharma, M., "Minerva scene analysis benchmark," Proc. 7th Australian and New Zealand Intelligent Information Systems Conference, Perth, 18-21 November, 2001.

[81] Yasnoff, W.A., Mui, J.K., and Bacus, J.W., “Error Measures for Scene Segmentation,” Pattern Recognition, vol 9, No. 4, pp. 217-231, 1977.

[82] Defense Advanced Research Projects Agency (DARPA), Darpa Grand Challenge (DGC), Online source: http://www.grandchallenge.org.

[83] Defense Advanced Research Projects Agency (DARPA), Learning Applied to Ground Robots (LAGR), Online source: http://www.darpa.mil/ipto/programs/lagr/vision.htm.

[84] Shoemaker, C. M. and Bornstein, J. A., “Overview of the Demo III UGV Program,” Proc. Of the SPIE Robotic and Semi-Robotic Ground Vehicle Technology, Vol. 3366, pp. 202-211, 1998.

[85] Pilutti, T. and Ulsoy, A. G., “Decision Making for Road Departure Warning Systems,” Proceedings of the American Control Conference, June 1998.

[86] H. Dahmani, M. Chadli, A. Rabhi, A. E. Hajjaji, “Fuzzy Uncertain Observer with Unknown Inputs for Lane Departure Detection.” In Proceedings of American Control Conference, pp. 688 - 693, July 2010.

[87] J. Ge, Y. Luo, and G. Tei, “Real-Time Pedestrian Detection and Tracking at Nighttime for Driver-Assistance Systems.” In IEEE Transactions on Intelligent Transportation Systems, Volume 10, pp. 283-298, 2009.

[88] V. Prisacariu, R. Timofte, K. Zimmermann, I. Reid, and L. Van Gool, "Integrating Object Detection with 3D Tracking Towards a Better Driver Assistance System." In Proceedings of 20th International Conference on Pattern Recognition, pp. 3344-3347, 2010.

[89] Murray, D. and Little, J., “Using Real-Time Stereo Vision for Mobile Robot Navigation,” Autonomous Robots, vol. 8, Issue 2, pp. 161-171, Apr. 2000.

[90] DeSouza, G. and Kak, A., “Vision for Mobile Robot Navigation: A Survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 2, pp. 237-267, Feb. 2002.

[91] Moorehead, S., Simmons, R., Apostolopoulos, D., and Whittaker, W. L., "Autonomous Navigation Field Results of a Planetary Analog Robot in Antarctica," International Symposium on Artificial Intelligence, Robotics and Automation in Space, June 1999.

[92] Asensio, J. R., Montiel, J. M. M., and Montano, L., “Goal Directed Reactive Robot Navigation with Relocation Using Laser and Vision,” IEEE Proc. of International Conference on Robotics and Automation, vol 4, pp. 2905-2910, 1999.

[93] K. M. Wurm, R. Kummerle, C. Stachniss, and W. Burgard, "Improving Robot Navigation in Structured Outdoor Environments by Identifying Vegetation from Laser Data." In Proceedings of Intelligent Robots and Systems, pp. 1217-1222, Oct. 2009.

[94] F. Maurelli, D. Droeschel, T. Wisspeintner, S. May, H. Surmann, “A 3D Laser Scanner System for Autonomous Vehicle Navigation.” In Proceedings of International Conference on Advanced Robotics, pp. 1-6, 2009.

[95] A. Rankin, A. Huertas, and L. Matthies, “Stereo Vision Based Terrain Mapping for Off-road Autonomous Navigation.” In Proceedings of SPIE, the International Society for Optical Engineering, Volume 7332, 2009.

[96] D. Rosselot, M. Aull, and E. Hall, "Predictive Vision from Stereo Video: Robust Object Detection for Autonomous Navigation Using the Unscented Kalman Filter on Streaming Stereo Images." In Proceedings of SPIE - The International Society for Optical Engineering, Volume 7539, Jan. 2010.

[97] Durrant-Whyte, H., “A critical review of the state-of-the-art in autonomous land vehicle systems and technology,” Sandia Report SAND2001-3685, Sandia National Laboratories, Albuquerque, N.M., 2001.

[98] Ulrich, I. and Nourbakhsh, I., "Appearance-Based Obstacle Detection with Monocular Color Vision," Proceedings of AAAI Conference, pp. 866-871, 2000.

[99] Lorigo, L. M., Brooks, R. A., and Grimson, W. E. L., "Visually-Guided Obstacle Avoidance in Unstructured Environments," In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 373-379, 1997.

[100] Horswill, I., “Polly: A Vision-based Artificial Agent,” Proceedings of AAAI Conference, pp. 824-829, 1993.

[101] Shirkhodaie, A. and Amrani, R., “Visual Terrain Mapping for Traversable Path Planning of Mobile Robots,” Proceedings of SPIE, vol. 5608, pp. 118-127, Oct. 2004.

[102] Bellutta, P., Manduchi, R., Matthies, L., and Rankin, A. “Terrain Perception for DEMO III,” In: Proc. Intelligent Vehicles Conf., Dearborn, MI, 2000.

[103] Manduchi, R., Castano, A., Talukder, A., and Matthies, L. "Obstacle Detection and Terrain Classification for Autonomous Off-Road Navigation," Auton. Robots, 18(1): pp. 81-102, 2005.

[104] Iagnemma, K., and Dubowsky, S., "Terrain Estimation for High-Speed Rough-Terrain Autonomous Vehicle Navigation," Proceedings of the SPIE Conference on Unmanned Ground Vehicle Technology IV, 2002.

[105] J. Chetan, K. Madhava, and C. V. Jawahar, "Fast and Spatially-Smooth Terrain Classification Using Monocular Camera." In Proceedings of 20th International Conference on Pattern Recognition, pp. 4060-4063, 2010.

[106] M. Procopio, J. Mulligan, and G. Grudic, "Learning Terrain Segmentation with Classifier Ensembles for Autonomous Robot Navigation in Unstructured Environments." In Journal of Field Robotics, Volume 26, pp. 145-175, 2009.

[107] P. Moghadam, W. S. Wijesoma, “Online, Self-supervised Vision-based Terrain Classification in Unstructured Environments.” In Proceedings of IEEE International Conference on Systems, Man and Cybernetics, pp. 3100-3105, 2009.

[108] Sofman, B., Lin, E., Bagnell, J., Vandapel, N., and Stentz, A. "Improving Robot Navigation Through Self-Supervised Online Learning," In Proceedings of Robotics: Science and Systems, 2006.