Deep Reinforcement Learning for Adaptive Human Robotic Collaboration


Degree Project in the Field of Technology Engineering Physics and the Main Field of Study Computer Science and Engineering, Second Cycle, 30 Credits
Stockholm, Sweden 2019

JOHAN FREDIN HASLUM
Master in Computer Science
Date: April 11, 2019
Supervisor: Mårten Björkman
Examiner: Olov Engvall
KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science

Abstract

Robots are expected to become an increasingly common part of most humans' everyday lives. As the number of robots increases, so will the number of human-robot interactions. For these interactions to be valuable and intuitive, new, advanced robotic control policies will be necessary. Current policies often lack flexibility, rely heavily on human expertise, and are often programmed for very specific use cases.

A promising alternative is the use of Deep Reinforcement Learning, a family of algorithms that learn by trial and error. Following the recent successes of Reinforcement Learning (RL) in areas previously considered too complex, RL has emerged as a possible method for learning robotic control policies. This thesis explores the possibility of using Deep Reinforcement Learning (DRL) as a method to learn robotic control policies for Human-Robot Collaboration (HRC). Specifically, it evaluates whether DRL algorithms can be used to train a robot to collaboratively balance a ball with a human along a predetermined path on a table. To evaluate whether this is possible, several experiments are performed in a simulator, where two robots jointly balance a ball: one emulating a human and one relying on the policy from the DRL algorithm.
The experiments performed suggest that DRL can be used to enable HRC that performs equivalently to or better than an emulated human performing the task alone. Further, the experiments indicate that the performance of less skilled human collaborators can be improved by cooperating with a DRL-trained robot.
Contents

1 Introduction
  1.1 Motivation
  1.2 Problem Specification
    1.2.1 Research Questions
    1.2.2 Scope & Delimitation
  1.3 Ethics and Societal Impact
2 Background
  2.1 Reinforcement Learning
    2.1.1 Nature's Way of Learning
    2.1.2 Formulation
    2.1.3 Reinforcement Learning Tools and Ideas
  2.2 Deep Learning in Reinforcement Learning
  2.3 Deep Reinforcement Learning Algorithms
    2.3.1 Deep Q-Learning
    2.3.2 Deep Deterministic Policy Gradient
    2.3.3 A3C
    2.3.4 TRPO
    2.3.5 PPO
3 Related Work
  3.1 Reinforcement Learning for Robotic Control
    3.1.1 End-to-End Visuomotor Policies
    3.1.2 Training in Simulation
    3.1.3 Imitation Learning
    3.1.4 Autoencoder Learning
    3.1.5 Domain Randomization and Large Scale Data Collection
    3.1.6 Prioritized Experience Replay
  3.2 Human-Robot Collaboration
4 Method
  4.1 Problem
  4.2 Implementation
    4.2.1 Physics Engine
    4.2.2 Robot Environment
    4.2.3 Human Movement Simulation
    4.2.4 Reinforcement Learning Algorithm
    4.2.5 Observation Space
    4.2.6 Design Choices
  4.3 Experimental Setup
    4.3.1 Common Details
    4.3.2 Collaborative Balancing with Varying Level of Skilled Human Partner
    4.3.3 Balancing with DRL Collaborator
    4.3.4 Balancing with More Information
5 Experiments
  5.1 Results
    5.1.1 Analysis
    5.1.2 Performance of Human Collaborator Acting Alone
    5.1.3 Performance of Robot Collaborator Acting Alone
    5.1.4 Performance of Human-Robot Collaborator
    5.1.5 Performance of Robot-Robot Collaborator
    5.1.6 Performance of Human-Robot Collaborator with More Information
6 Discussion
  6.1 Conclusions
    6.1.1 Research Questions
    6.1.2 Unanswered Questions
    6.1.3 Limitations and Improvements
7 Conclusion
Bibliography
8 Appendix

Chapter 1
Introduction

1.1 Motivation

As robots become increasingly common in society, the number of interactions between humans and robots will most likely increase significantly. The interplay between the two will become a part of everyday life for many people. For these interactions to be useful, they need to feel natural to the individuals involved. This requires not only that the robot interacts in a way that feels customary to humans in general; it also has to adapt on a person-to-person basis.

Human collaboration involves complex organization and communication that result in an outcome greater than the sum of the individual capabilities. This advanced interplay between two or more individuals can often happen without much effort, and in silence. For example, carrying a table together can easily be done relying only on the haptic feedback felt in the collaborators' hands. This ability to adapt in a way that feels ordinary to humans has not been transferred to robots. Successfully equipping robots with this capability will likely be essential in the future.

More specifically, current research within the area of human-robot collaboration is focused on industrial applications. An example of this is the use of a human-robot pair in car manufacturing. By combining the skillfulness of humans and their ability to learn quickly with the cost efficiency and physical strength of robots, efficiency is increased and operational cost reduced [19].

The ability to learn how best to collaborate with humans is an active area of research; although several different approaches have been suggested, few have shown great promise [9]. Specific problems have been solved, such as jointly lifting an object; however, the proposed methods rely heavily on human expert knowledge to implement a functioning control system.
This is not only true for collaboration tasks, but for all robotic control problems. Since human expertise is costly and can be a scarce resource, the possibility of teaching robots how to interact with their surroundings using lay users, or no human intervention at all, would enable cheaper and more accessible robotic control systems.

The possibility of teaching robots how to behave through means other than human-crafted control policies is one that is researched extensively. One such promising field is Deep Reinforcement Learning (DRL). DRL is a set of self-learning algorithms that rely on interaction and examples, thus requiring no expert knowledge. A lot of work is currently focused on the applicability of DRL to robotic control, and the vision of many researchers is for robots to learn in a similar way to humans: by learning to recognize visual and other sensory inputs and learning how to map these inputs to appropriate actions. The application of DRL to robotic control has shown promise, although it is still in the early stages of development. The goal of this thesis is to evaluate the applicability of DRL to human-robot collaboration. Further, it also evaluates the importance of different sensory modalities for the performance of the algorithm.

1.2 Problem Specification

1.2.1 Research Questions

The questions that this thesis attempts to explore and answer are the following:

What is a suitable Deep Reinforcement Learning framework for learning adaptive human-robot interactions, such that robots can learn to collaborate with humans in a way that the human perceives as natural?

What impact do the available sensor modalities have on the performance of such a framework?

1.2.2 Scope & Delimitation

The scope of this thesis is to evaluate the applicability of DRL algorithms to human-robot collaboration problems. This is evaluated using a toy problem that represents the challenges involved in human-robot interaction.
This toy problem is an adaptation of previous experiments used in other research projects involving human-robot and human-human collaboration.
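The thesis's actual experiments run in a physics simulator with trained DRL policies, but the agent-environment loop underlying such a setup can be illustrated with a minimal toy sketch. Everything here is hypothetical and not taken from the thesis: the class name BallBalanceEnv, the one-dimensional dynamics, the emulated-human correction term, and the reward shape are all illustrative assumptions.

```python
import random

class BallBalanceEnv:
    """Toy stand-in for the collaborative ball-balancing task
    (hypothetical dynamics, not the thesis's simulator)."""

    def __init__(self, path_length=10):
        self.path_length = path_length
        self.reset()

    def reset(self):
        # Ball starts at the beginning of the path with zero lateral offset.
        self.progress = 0
        self.offset = 0.0
        return (self.progress, self.offset)

    def step(self, tilt):
        # The agent's action is a table tilt; an emulated human partner
        # adds its own noisy corrective tilt, as in the thesis's setup
        # where a second robot stands in for the human.
        human_tilt = -0.5 * self.offset + random.uniform(-0.1, 0.1)
        self.offset += tilt + human_tilt
        self.progress += 1
        # Reward progress along the path, penalize distance from it.
        reward = 1.0 - abs(self.offset)
        done = self.progress >= self.path_length or abs(self.offset) > 2.0
        return (self.progress, self.offset), reward, done

def run_episode(env, policy):
    """Roll out one episode and return the summed reward."""
    state, total, done = env.reset(), 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total

env = BallBalanceEnv()
# A simple proportional policy standing in for a learned DRL policy.
episode_return = run_episode(env, lambda s: -0.3 * s[1])
```

A DRL algorithm such as DDPG or PPO (both covered in Chapter 2) would replace the hand-written proportional policy with a neural network trained to maximize the episode return.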