Robot Self-Modeling
Abstract

Robot Self-Modeling
Justin Wildrick Hart
2014

Traditionally, models of a robot's kinematics and sensors have been provided by designers through manual processes. Such models are used for sensorimotor tasks, such as manipulation and stereo vision. However, these techniques often yield static models based on one-time calibrations or ideal engineering drawings; models that often fail to represent the actual hardware, or in which individual unimodal models, such as those describing kinematics and vision, may disagree with each other. Humans, on the other hand, are not so limited. One of the earliest forms of self-knowledge learned during infancy is knowledge of the body and senses. Infants learn about their bodies and senses through the experience of using them in conjunction with each other. Inspired by this early form of self-awareness, the research presented in this thesis attempts to enable robots to learn unified models of themselves through data sampled during operation. In the presented experiments, an upper-torso humanoid robot, Nico, creates a highly accurate self-representation through data sampled by its sensors while it operates. The power of this model is demonstrated through a novel robot vision task in which the robot infers the visual perspective representing reflections in a mirror by watching its own motion reflected therein.

In order to construct this self-model, the robot first infers the kinematic parameters describing its arm. This is first demonstrated using an external motion capture system, then implemented in the robot's stereo vision system. In a process inspired by infant development, the robot then mutually refines its kinematic and stereo vision calibrations, using its kinematic structure as the invariant against which the system is calibrated. The product of this procedure is a very precise mutual calibration between these two, traditionally separate, models, producing a single, unified self-model.
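The kinematic inference mentioned above is developed in Chapter 6 via circle point analysis: as a single revolute joint sweeps while the others hold still, the tracked end-effector traces a circle whose supporting plane's normal gives the joint's axis direction and whose center gives a point on that axis. The following is an illustrative sketch of that single-joint idea only, not the dissertation's implementation; the function name and the least-squares (Kasa) circle fit are my own choices.

```python
import numpy as np

def fit_joint_axis(points):
    """Recover a revolute joint's axis from end-effector positions sampled
    while only that joint moves (single-joint circle point analysis).

    points: (N, 3) array of 3D samples tracing a circular arc.
    Returns (center, normal): a point on the joint axis and the axis
    direction (sign ambiguous)."""
    centroid = points.mean(axis=0)
    # The samples lie on a plane; the direction of least variance of the
    # centered samples is that plane's normal, i.e. the joint axis.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[2]
    # Express the samples in an in-plane 2D basis and fit a circle
    # (Kasa fit): (x-a)^2 + (y-b)^2 = r^2 rearranges to the linear
    # system 2ax + 2by + c = x^2 + y^2.
    u, v = vt[0], vt[1]
    p2d = np.column_stack(((points - centroid) @ u, (points - centroid) @ v))
    A = np.column_stack((2 * p2d, np.ones(len(p2d))))
    b = (p2d ** 2).sum(axis=1)
    cx, cy, _ = np.linalg.lstsq(A, b, rcond=None)[0]
    # Lift the 2D circle center back to 3D: a point on the joint axis.
    center = centroid + cx * u + cy * v
    return center, normal
```

In practice the sampled positions are noisy, which is why the dissertation follows this closed-form step with nonlinear refinement (Section 6.2.3).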
The robot then uses this self-model to perform a unique vision task. Knowledge of its body and senses enables the robot to infer the position of a mirror placed in its environment. From this, an estimate of the visual perspective describing reflections in the mirror is computed, which is subsequently refined over the expected positions of images of the robot's end-effector as reflected in the mirror, and their real-world, imaged counterparts. The computed visual perspective enables the robot to use the mirror as an instrument for spatial reasoning, by viewing the world from its perspective. This test utilizes knowledge that the robot has inferred about itself through experience, and approximates tests of mirror use that are used as a benchmark of self-awareness in human infants and animals.

Robot Self-Modeling
A Dissertation Presented to the Faculty of the Graduate School of Yale University in Candidacy for the Degree of Doctor of Philosophy
by Justin Wildrick Hart
Dissertation Director: Brian Scassellati
December 2014

UMI Number: 3582284. Published by ProQuest LLC 2015.
Copyright © 2014 by Justin Wildrick Hart. All rights reserved.

Contents

1 Introduction
  1.1 Self-awareness in humans
  1.2 What is to be gained by emulating this process in robots?
  1.3 The kinesthetic-visual self-model
  1.4 Summary
2 The mirror-test as a target for self-aware systems
  2.1 An architecture for mirror self-recognition
    2.1.1 The perceptual model
    2.1.2 The end-effector model
    2.1.3 The perspective-taking model
    2.1.4 The structural model
    2.1.5 The appearance model
    2.1.6 The functional model
  2.2 Summary and conclusions
3 Test platform
  3.1 Nico, an infant humanoid
    3.1.1 Stereo vision system
    3.1.2 Actuation and motor control
  3.2 The Vicon MX motion capture system
  3.3 Summary
4 Background: Homogeneous representations
  4.1 Points
  4.2 Lines
  4.3 Planes
  4.4 Rigid transformations
  4.5 Summary
5 Background: Computer vision
  5.1 Homographies
  5.2 Pinhole camera model
    5.2.1 The pinhole camera's projection
    5.2.2 Ideal cameras & extrinsic parameters
    5.2.3 Intrinsic parameters
    5.2.4 Lens distortion
  5.3 Epipolar geometry
  5.4 Summary
6 Kinematic inference
  6.1 Nico's kinematic model
    6.1.1 Representing kinematics
    6.1.2 Encoder offset and gear reduction
  6.2 Kinematic inference
    6.2.1 Circle point analysis
      6.2.1.1 The single joint case
      6.2.1.2 Extending to multiple revolute joints
    6.2.2 Offset and gear reduction
    6.2.3 Nonlinear refinement
  6.3 Evaluation
  6.4 Results
  6.5 Summary and conclusions
7 Integrating kinematics and computer vision
  7.1 Tracking the robot's hand using computer vision
    7.1.1 Object tracking
      7.1.1.1 Color blob detection
      7.1.1.2 Fiducial tracking
    7.1.2 Stereo reconstruction
  7.2 Integrating vision and kinematics
  7.3 Evaluation
  7.4 Results
  7.5 Summary and conclusions
8 Simultaneous refinement of kinematics and vision
  8.1 Classical approaches to camera calibration
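The geometric core of the mirror task described in the abstract is standard: once the mirror plane is known, the perspective that "sees" the reflections is the real camera's pose reflected across that plane. A minimal sketch of that reflection transform follows, assuming a plane given as {x : n·x = d} with unit normal n; the function name is my own, and this illustrates only the textbook reflection, not the dissertation's full estimation and refinement procedure.

```python
import numpy as np

def mirror_reflection(n, d):
    """Build the 4x4 homogeneous transform that reflects points across
    the mirror plane {x : n.x = d}, where n is the plane's unit normal
    and d its offset from the origin.

    A point x maps to x - 2(n.x - d)n; applying the transform to the
    real camera's pose yields the virtual camera behind the mirror."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)          # guard against a non-unit normal
    M = np.eye(4)
    M[:3, :3] = np.eye(3) - 2.0 * np.outer(n, n)   # Householder reflection
    M[:3, 3] = 2.0 * d * n                          # plane-offset translation
    return M
```

For example, reflecting the point (0, 0, 3) across the plane z = 1 yields (0, 0, -1), as expected for a mirror one unit in front of the origin.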