ABSTRACT

Title of dissertation: A COGNITIVE ROBOTIC IMITATION LEARNING SYSTEM BASED ON CAUSE-EFFECT REASONING

Garrett Ethan Katz, Doctor of Philosophy, 2017

Dissertation directed by: Professor James A. Reggia, Department of Computer Science

As autonomous systems become more intelligent and ubiquitous, it is increasingly important that their behavior can be easily controlled and understood by human end users. Robotic imitation learning has emerged as a useful paradigm for meeting this challenge. However, much of the research in this area focuses on mimicking the precise low-level motor control of a demonstrator, rather than interpreting the intentions of a demonstrator at a cognitive level, which limits the ability of these systems to generalize. In particular, cause-effect reasoning is an important component of human cognition that is under-represented in these systems.

This dissertation contributes a novel framework for cognitive-level imitation learning that uses parsimonious cause-effect reasoning to generalize demonstrated skills, and to justify its own actions to end users. The contributions include new causal inference algorithms, which are shown formally to be correct and to have reasonable computational complexity characteristics. Additionally, empirical validations both in simulation and on board a physical robot show that this approach can efficiently and often successfully infer a demonstrator's intentions on the basis of a single demonstration, and can generalize learned skills to a variety of new situations. Lastly, computer experiments are used to compare several formal criteria of parsimony in the context of causal intention inference, and a new criterion proposed in this work is shown to compare favorably with more traditional ones.

In addition, this dissertation takes strides towards a purely neurocomputational implementation of this causally-driven imitation learning framework.
In particular, it contributes a novel method for systematically locating fixed points in recurrent neural networks. Fixed points are relevant to recent work on neural networks that can be "programmed" to exhibit cognitive-level behaviors, like those involved in the imitation learning system developed here. As such, the fixed point solver developed in this work is a tool that can be used to improve our engineering and understanding of neurocomputational cognitive control in the next generation of autonomous systems, ultimately resulting in systems that are more pliable and transparent.

A COGNITIVE ROBOTIC IMITATION LEARNING SYSTEM BASED ON CAUSE-EFFECT REASONING

by Garrett Ethan Katz

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2017

Advisory Committee:
Professor James A. Reggia, Chair
Professor Yiannis Aloimonos
Professor Rodolphe J. Gentili
Professor Jeffrey W. Herrmann
Professor Dana S. Nau

© Copyright by Garrett Ethan Katz, 2017

Acknowledgments

This work was supported financially by ONR award N000141310597 and a seed grant from Lockheed Martin Corporation. Previous versions of some content in Chapters 2, 4, 5, and 7 were published in [67-71], all papers on which I was first author and to which I made the primary contributions.

To my classmates, Konstantinos Zampogiannis, Jason Filippou, Mahfuza Sharmin, Michael Maynord, and Tommy Pensyl: thank you for providing the spice of life, getting me out of the house, and keeping me balanced. To my former roommates, Raegan and Kevin Conlin: thank you for so many years of warmth and friendship. You and your family are an inspiration. To my fellow research assistants over the years, Charmi Patel, Dale Dullnig, Baram Sosis, Josh Langsfeld, Hyuk Oh, Isabelle Shuggi, Theresa Hauge, Gregory Davis, and Di-Wei Huang: thank you for your fruitful collaborations and camaraderie.
Thanks especially to you, Di-Wei, for showing me the ropes and always bringing me back gifts from your trips to Taiwan. It was a small thing but it said a lot.

To all of the professors at the University of Maryland with whom I was a student or a teaching assistant, Aravind Srinivasan, Howard Elman, Héctor Corrada Bravo, Jandelyn Plane, James Reggia, David Jacobs, Yiannis Aloimonos, Michael Hicks, Dave Levin, Dana Nau, Giovanni Forni, and David Van Horn: thank you for sharing your time and knowledge, and for making me a better learner and teacher. Thanks especially to Jandelyn Plane for giving me the opportunity to be a summer course instructor. Special thanks also to Howard Elman and Dana Nau for generously agreeing to be on my preliminary exam committee, and to Giovanni Forni for the insightful feedback on the content of Chapter 7. And thanks to Dana Nau, Yiannis Aloimonos, Rodolphe Gentili, and Jeffrey Herrmann for taking an interest in my work and agreeing to be on my dissertation committee.

Rodolphe, thank you for being so generous and hospitable, letting me use your lab day in and day out. Thank you also for your guidance, and for always valuing my opinion and treating me as if I were your peer, so much so that I sometimes forgot how much farther I have to go before that point. You always suffered through my less self-aware moments with grace.

Jim, thank you for dutifully performing the demanding and at certain times thankless job of being my advisor. You have always been patient, encouraging, and remarkably devoted to my well-being and success, as you are with all of your students. You gave form to my vague notions and taught me what research was, always steering me towards the right path. It has made all the difference.

Lastly, Al and Paul, thank you for giving me the opportunity to first do research back at City College, and for supporting me in my move to Maryland to pursue a doctoral degree.
And to Mom, Dad, Adria, and the rest of my family: thank you for your unconditional love and support. Everyone should be so lucky.

Table of Contents

Acknowledgements
List of Tables
List of Figures
List of Abbreviations

1 Introduction
  1.1 Problem Statement
  1.2 Objectives
  1.3 Overview
2 Background
  2.1 Imitation Learning
  2.2 Causal Inference
    2.2.1 Abductive Inference
    2.2.2 Automated Planning
    2.2.3 Plan Recognition
  2.3 Neural Network Transparency and Control
3 The CERIL Architecture
  3.1 A Running Example: Hard Drive Maintenance
  3.2 Recording Demonstrations
  3.3 Learning New Skills by Inferring Intentions
  3.4 Transferring Learned Skills to New Situations
  3.5 Post-Learning Imitation and Generalization
  3.6 Encoding Background Knowledge
4 Causal Reasoning Algorithms for Inferring Intent
  4.1 Formalizing Causal Knowledge
  4.2 Formalizing Parsimonious Explanation
  4.3 Parsimonious Covering Algorithms
  4.4 Theoretical Results
5 Experimental Validation
  5.1 Overall System Performance
    5.1.1 Imitation Learning Trials
    5.1.2 Monroe Plan Corpus Experiments
  5.2 Empirical Comparison of Parsimony Criteria
    5.2.1 Parsimony Criteria Considered
    5.2.2 Testing Data and Performance Metrics
    5.2.3 Learning Novel Intention Sequences in the Monroe Domain
    5.2.4 Results
    5.2.5 Discussion
6 Causal Explanation of Planned Actions
  6.1 CERIL's XAI Mechanism
  6.2 Causal Plan Graphs
  6.3 Justifying Actions with Causal Chains
  6.4 Graphical User Interface
  6.5 Initial Experimental Results
  6.6 Discussion
7 Locating Fixed Points in Neural Networks
  7.1 Fixed Points of Neural Attractor Dynamics
  7.2 Theoretical Groundwork
    7.2.1 Notation
    7.2.2 Directional Fibers
    7.2.3 A Fiber-Based Fixed Point Solver
    7.2.4 The traverse Update Scheme
  7.3 Application to Recurrent Neural Networks
    7.3.1 Neural Network Model
    7.3.2 Applying traverse
    7.3.3 Topological Sensitivity of Directional Fibers
  7.4 Computer Experiments
    7.4.1 Experimental Methods
      7.4.1.1 Sampling Distribution for W
      7.4.1.2 Counting Unique Fixed Points
    7.4.2 Comparison with a Baseline Solver
    7.4.3 Comparison of Different Directional Fibers
  7.5 Discussion
8 Discussion
  8.1 Summary
  8.2 Contributions
  8.3 Limitations and Future Work
A Appendix
  A.1 Dock Maintenance Causal Relations
  A.2 Monroe County Corpus Causal Relations
  A.3 State Reconstruction in the Monroe Corpus
  A.4 Anomalies