Learning Motor Skills: from Algorithms to Robot Experiments

Learning Motor Skills: From Algorithms to Robot Experiments Erlernen Motorischer Fähigkeiten: Von Algorithmen zu Roboter-Experimenten Zur Erlangung des akademischen Grades Doktor-Ingenieur (Dr.-Ing.) vorgelegte Dissertation von Dipl.-Ing. Jens Kober aus Künzelsau März2012—Darmstadt—D17 Learning Motor Skills: From Algorithms to Robot Experiments Erlernen Motorischer Fähigkeiten: Von Algorithmen zu Roboter-Experimenten Vorgelegte Dissertation von Dipl.-Ing. Jens Kober aus Künzelsau 1. Gutachten: Prof. Dr. Jan Peters 2. Gutachten: Prof. Dr. Stefan Schaal Tag der Einreichung: Darmstadt—D17 Bitte zitieren Sie dieses Dokument als: URN: urn:nbn:de:tuda-tuprints-12345 URL: http://tuprints.ulb.tu-darmstadt.de/12345 Dieses Dokument wird bereitgestellt von tuprints, E-Publishing-Service der TU Darmstadt http://tuprints.ulb.tu-darmstadt.de [email protected] Die Veröffentlichung steht unter folgender Creative Commons Lizenz: Namensnennung – Keine kommerzielle Nutzung – Keine Bearbeitung 2.0 Deutschland http://creativecommons.org/licenses/by-nc-nd/2.0/de/ Erklärung zur Dissertation Hiermit versichere ich, die vorliegende Dissertation ohne Hilfe Dritter nur mit den an- gegebenen Quellen und Hilfsmitteln angefertigt zu haben. Alle Stellen, die aus Quellen entnommen wurden, sind als solche kenntlich gemacht. Diese Arbeit hat in gleicher oder ähnlicher Form noch keiner Prüfungsbehörde vorgelegen. Darmstadt, den 13. März 2012 (J. Kober) i Abstract Ever since the word “robot” was introduced to the English language by Karel Capek’sˇ play “Rossum’s Universal Robots” in 1921, robots have been expected to become part of our daily lives. In recent years, robots such as autonomous vacuum cleaners, lawn mowers, and window cleaners, as well as a huge number of toys have been made commercially available. However, a lot of additional research is required to turn robots into versatile household helpers and companions. One of the many challenges is that robots are still very specialized and cannot easily adapt to changing environments and requirements. Since the 1960s, scientists attempt to provide robots with more autonomy, adaptability, and intelligence. Research in this field is still very active but has shifted focus from reasoning based methods towards statistical machine learning. Both navigation (i.e., moving in unknown or changing environments) and motor control (i.e., coordinating movements to perform skilled actions) are important sub-tasks. In this thesis, we will discuss approaches that allow robots to learn motor skills. We mainly consider tasks that need to take into account the dynamic behavior of the robot and its environment, where a kinematic movement plan is not sufficient. The presented tasks correspond to sports and games but the presented techniques will also be applicable to more mundane household tasks. Motor skills can often be represented by motor primitives. Such motor primitives encode elemental motions which can be generalized, sequenced, and combined to achieve more complex tasks. For example, a forehand and a backhand could be seen as two different motor primitives of playing table tennis. We show how motor primitives can be employed to learn motor skills on three different levels. First, we discuss how a single motor skill, represented by a motor primitive, can be learned using reinforcement learning. Second, we show how such learned motor primitives can be generalized to new situations. Finally, we present first steps towards using motor primitives in a hierarchical setting and how several motor primitives can be combined to achieve more complex tasks. To date, there have been a number of successful applications of learning motor primitives employing imitation learning. However, many interesting motor learning problems are high-dimensional reinforcement learning problems which are often beyond the reach of current reinforcement learning methods. We review research on reinforcement learning applied to robotics and point out key challenges and important strategies to render reinforcement learning tractable. Based on these insights, we introduce novel learning approaches both for single and generalized motor skills. For learning single motor skills, we study parametrized policy search methods and introduce a framework of reward-weighted imitation that allows us to derive both policy gradient methods and expectation- maximization (EM) inspired algorithms. We introduce a novel EM-inspired algorithm for policy learning that is particularly well-suited for motor primitives. We show that the proposed method out-performs several well-known parametrized policy search methods on an empirical benchmark both in simulation and on a real robot. We apply it in the context of motor learning and show that it can learn a complex ball-in-a-cup task on a real Barrett WAM. In order to avoid re-learning the complete movement, such single motor skills need to be generalized to new situations. In this thesis, we propose a method that learns to generalize parametrized motor plans, obtained by imitation or reinforcement learning, by adapting a small set of global parameters. We employ reinforcement learning to learn the required parameters to deal with the current situation. Therefore, we introduce an appropriate kernel-based reinforcement learning algorithm. To show its feasibility, we evaluate this algorithm on a toy example and compare it to several previous approaches. Subsequently, we apply the approach to two robot tasks, i.e., the generalization of throwing movements in darts and of hitting movements in table tennis on several different real robots, i.e., a Barrett WAM, the JST-ICORP/SARCOS CBi and a Kuka KR 6. We present first steps towards learning motor skills jointly with a higher level strategy and evaluate the approach with a target throwing task on a BioRob. Finally, we explore how several motor primitives, iii representing sub-tasks, can be combined and prioritized to achieve a more complex task. This learning framework is validated with a ball-bouncing task on a Barrett WAM. This thesis contributes to the state of the art in reinforcement learning applied to robotics both in terms of novel algorithms and applications. We have introduced the Policy learning by Weighting Exploration with the Returns algorithm for learning single motor skills and the Cost-regularized Kernel Regression to generalize motor skills to new situations. The applications explore highly dynamic tasks and exhibit a very efficient learning process. All proposed approaches have been extensively validated with benchmarks tasks, in simulation, and on real robots. iv Abstract Zusammenfassung Schon seit 1921 mit dem Theaterstück „Rossum’s Universal Robots“ von Karel Capekˇ das Wort „Roboter“ in die deutsche Sprache eingeführt wurde, besteht die Erwartung, dass Roboter Teil unseres täglichen Lebens werden. Seit ein paar Jahren sind sowohl Roboter wie autonome Staubsauger, Rasenmäher und Fensterreiniger als auch eine große Anzahl an Spielzeugrobotern im Handel erhältlich. Allerdings ist noch viel Forschung nötig, bis Roboter als universelle Haushalts-Helfer und Gefährten einsetzbar sind. Eine der größten Herausforderungen ist, dass Roboter immer noch sehr spezialisiert sind und sich nicht ohne Weiteres an sich ändernde Umgebungen und Anforderungen anpassen können. Seit den 1960ern versuchen Wissenschaftler, Roboter mit mehr Autonomie, Anpassungsfähigkeit und Intelligenz auszustatten. Die Forschung auf diesem Gebiet ist sehr aktiv, hat sich allerdings von regel-basierten Systemen hin zu statistischem maschinellem Lernen verlagert. Sowohl Navigation (d.h. sich in unbekannten oder sich ändernden Umgebungen zu bewegen) als auch Motorsteuerung (d.h. das Koordinieren von Bewegungen, um komplexe Aktionen auszuführen) sind hierbei wichtige Teilaufgaben. In dieser Doktorarbeit diskutieren wir Ansätze, die es Robotern ermöglichen, motorische Fähigkeiten zu erlernen. Wir betrachten in erster Linie Aufgaben, bei denen das dynamische Verhalten des Roboters und seiner Umgebung berücksichtigt werden muss und wo ein kinematischer Bewegungsplan nicht ausreichend ist. Die vorgestellten Anwendungen kommen aus dem Sport- und Spiel-Bereich, aber die vorgestellten Techniken können auch bei alltäglichen Aufgaben im Haushalt Anwendung finden. Motorische Fähigkeiten können oft durch Motor-Primitive dargestellt werden. Solche Motor-Primitive kodieren elementare Bewegungen, die verallgemeinert, aneinandergereiht und kombiniert werden können, um komplexere Aufgaben zu erfüllen. Zum Beispiel könnte ein Vorhand- und Rückhand-Spiel als zwei verschiedene Motor- Primitive für Tischtennis angesehen werden. Wir zeigen, wie Motor-Primitive verwendet werden können, um motorische Fähigkeiten auf drei verschiedenen Ebenen zu erlernen. Zuerst zeigen wir, wie eine einzelne motorische Fertigkeit, die durch eine Motor-Primitive dargestellt wird, mittels Reinforcement-Learning (bestärkendes Lernen) gelernt werden kann. Zweitens zeigen wir, wie solche erlernten Motor-Primitiven verallgemeinert werden können, um auf neue Situationen zu reagieren. Schließlich präsentieren wir erste Schritte, wie Motor-Primitive in einer hierarchischen Struktur gelernt werden können und wie sich mehrere Motor-Primitive kombinieren lassen, um komplexere Aufgaben zu erfüllen. Es gab schon eine Reihe von erfolgreichen Anwendungen des Erlernens von Motor-Primitiven durch überwachtes Lernen. Allerdings sind viele interessante motorische Lernprobleme hochdimensionale Reinforcement-Learning-Probleme, die oft außerhalb der Anwendbarkeit

Learning Motor Skills: from Algorithms to Robot Experiments

A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms

Self-Corrective Apprenticeship Learning for Quadrotor Control

Arxiv:1907.03146V3 [Cs.RO] 6 Nov 2020

Deep Q-Learning from Demonstrations

Third-Person Imitation Learning

Teaching Agents with Deep Apprenticeship Learning

Exploring Applications of Deep Reinforcement Learning for Real-World Autonomous Driving Systems

One-Shot Imitation Learning

Reward Learning from Human Preferences and Demonstrations in Atari

Aerobatics Control of Flying Creatures Via Self-Regulated Learning

Supplemental

Truly Batch Apprenticeship Learning with Deep Successor Features ∗