Learning and Adaptation of Inverse Dynamics Models: a Comparison

Learning and Adaptation of Inverse Dynamics Models: A Comparison Kevin Hitzler1, Franziska Meier2, Stefan Schaal2 and Tamim Asfour1 Abstract— Performing tasks with high accuracy while inter- acting with the real world requires a robot to have an exact representation of its inverse dynamics that can be adapted to new situations. In the past, various methods for learning inverse dynamics models have been proposed that combine the well-known rigid body dynamics with model-based parameter estimation, or learn directly on measured data using regression. However, there are still open questions regarding the efficiency of model-based learning compared to data-driven approaches as well as their capabilities to adapt to changing dynamics. In this paper, we compare the state-of-the-art inertial parameter estimation to a purely data-driven and a model-based approach on simulated and real data, collected with the humanoid robot Apollo. We further compare the adaptation capabilities of two models in a pick and place scenario while a) learning the model incrementally and b) extending the initially learned Fig. 1. The humanoid robot Apollo with 7 DoFs in each arm, executing model with an error model. Based on this, we show the gap a pick and place task without (left) and with an object (right) between simulation and reality and verify the importance of modeling nonlinear effects using regression. Furthermore, we work in human-centered environments. Additionally, there demonstrate that error models outperform incremental learning are cases in which the dynamic parameters are only partially regarding adaptation of inverse dynamics models. known in advance or might not even be available at all, so that they need to be identified at run-time. Current state- I. INTRODUCTION of-the-art methods address these problems by estimating the Humans are able to move their body through space with inertial parameters of a robot [2] or learning the inverse dy- high accuracy, even when moving fast, lifting heavy objects namics model directly from measured data using regression, or using tools. To achieve similar capabilities, a robot needs see e.g. [3], [4], [5]. However, these methods strongly rely to learn a representation of its own body with its dynamic on the data used for model inference. Hence, they might properties that can be adapted to new situations and environ- only approximate the robot’s dynamics from previously seen mental interactions. tasks or workspace configurations. Due to the fact that the In the context of robot control, this is known as the inverse robot’s dynamics always depends on the current state, offline dynamics problem. Given a desired trajectory with joint posi- learning would require large amounts of training data from a tions, velocities and accelerations, a dynamics model is used well-explored state space. For robots with many degree-of- to predict the torques (or efforts) that have to be applied on freedom (DoF), this is not easily feasible. each individual joint [1]. Deriving such an inverse dynamics Furthermore, one has to keep in mind that the robot’s model analytically, e.g. by making use of the rigid-body dynamics can change during task execution, for example dynamics formulation, is not sufficient as it does not consider when using tools or lifting heavy objects. This means that nonlinearities like backlash or friction and thus, would lead the robot would either have to learn new dynamics or to large tracking errors. In such cases, the robot would continuously adapt its existing model. A promising way constantly have to track its current position and compensate to overcome these limitations is to incrementally learn the the errors with high-gain feedback control. This makes it inverse dynamics model online [5], [6] or learn an error dangerous to interact with the real world and impossible to model on top of an existing one [7], [8], [9]. However, there are many challenges when it comes to online learning as the The research leading to these results has received funding from the robot has to adapt to the new dynamics without losing the German Research Foundation (DFG: Deutsche Forschungsgemeinschaft) ability of its previously learned model, whereas many error under Priority Program on Autonomous Learning (SPP 1527) and from the German Federal Ministry of Education and Research (BMBF) under model approaches assume that learning can be done in a the project ROBDEKON (13N14678). The research was conducted in task-specific way. Hence, the robot needs to have knowledge collaboration between the Max-Planck-Institute of Intelligent Systems, the about when to switch to a certain task [8], [10]. University of Southern California and the Karlsruhe Institute of Technology. 1Institute for Anthropomatics and Robotics, Karlsruhe Institute of Tech- While regression-based learning methods for inverse dy- nology, Karlsruhe, Germany. fhitzler, [email protected] namics have been explored in the past [11], there are still 2Max-Planck-Institute of Intelligent Systems, Germany and Computa- many open questions regarding the efficiency of model- tional Learning and Motor Control Lab, University of Southern California, USA. Franziska Meier is currently with Facebook AI Research and Stefan based learning compared to data-driven approaches as well Schaal is currently with Google X. as their capabilities to adapt to new situations with changing dynamics. In this work, we target both challenges. First, dynamics [14]. However, applying these methods can lead we compare three different methods for inverse dynamics to inaccurate inverse dynamics models, since many of the learning, the state-of-the-art inertial parameter estimation [2], robot’s parameters are usually not known in advance and the model-based Dynamic Bezier´ Map [12] and a purely must be assumed such as the inertial parameters (i.e. the data-driven approach that uses feedforward neural networks mass, center of mass and inertia tensor matrix). Furthermore, [13]. The methods are evaluated 1) in simulation and 2) two when making use of the rigid body dynamics, nonlinear of them based on real data collected with the humanoid robot backlash and friction effects are neglected so that the feed- Apollo (Fig. 1) with 7 DoFs in each arm. After that, we forward command results in compare the adaptation capabilities of the previously learned = H(q )q¨ +C(q ;q˙ )q˙ + g(q ) (4) models in a pick and place scenario, executed on a real robot, tFF d d d d d d while a) learning the models incrementally and b) extending which is only a rough approximation of the equation of the models with an error model. motion. Thus, it can lead to poor torque predictions and cause imprecise movements of the robot. II. FUNDAMENTALS In the following, we describe the basic concept of in- B. Learning Inverse Dynamics verse dynamics models in the context of feedforward robot For the reasons given above, many efforts have been control and provide an overview of existing state-of-the-art made to identify the robot’s dynamics parameters at run- approaches. After introducing the necessary theory, the most time, or learn the inverse dynamics model directly from interesting methods are evaluated in detail. measured data. State-of-the-art inertial parameter estimation, A. Inverse Dynamics in Feedforward Control for example, learn the inertial parameters of a robot by re-formulating the well-known Newton-Euler equations into The inverse dynamics equation represents the mapping a linear combination of known and unknown parameters between the robot’s motion expressed in generalized coordi- and solving a least-squares problem [2], [15]. With the nates (i.e. joint positions q, velocitiesq ˙ and accelerationsq ¨) prior knowledge of the rigid body dynamics, the resulting and the generalized forces t (i.e. joint torques) that are dynamics model is able to learn fast and generalize well on required to achieve a particular motion. The symbols q,q ˙ simulated data, but it does not consider any non-linearities andq ¨ as well as t denote n-dimensional vectors where n and may lead to poor results on a real robot. is the robot’s number of DoFs. The inverse dynamics of a Nonlinear regression methods such as LWPR [5], GPR robot with n joints can then be described as [11] and n-SVR [4] overcome this problem by directly learn- t = H(q)q¨ +C(q;q˙)q˙ + g(q) + e(q;q˙;q¨) (1) ing from measured data of a robot. The local learning method LWPR is very efficient due to its low computational costs where H(q) is the symmetric and positive-definite inertia ma- but often performs poorly in unseen regions, i.e. they suffer trix, C(q;q˙)q˙ is the centripetal and Coriolis force, g(q) is the from insufficient extrapolation capabilities [16], whereas the gravitational component and e(q;q˙;q¨) are nonlinear backlash global learning methods GPR and n-SVR can learn more and friction effects [14]. In order to control the robot, e.g. by accurate models but need much more time for training [11]. using feedforward torque control, the generalized forces have Due to the fact that regression methods only learn based to be transmitted to torque commands and sent to the motor on previously seen data, they can only approximate the real controller. Typically, the motor commands are computed as function of the robot’s inverse dynamics. The so-called ”Dynamic Bezier´ Map” (DBM) proposed t = tFF + tFB (2) in [12] allow to learn an exact encoding of a robot’s inverse where tFF is the feedforward and tFB the feedback com- dynamics, represented by Lagrangian equations. Thus, they ponent. Given a desired motion with desired positions qd, can provide both high accuracy in trained regions, i.e. good velocitiesq ˙d and accelerationsq ¨d, the feedforward compo- interpolation, as well as good extrapolation capabilities.

Learning and Adaptation of Inverse Dynamics Models: a Comparison

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support