Humanoid Whole-Body Movement Optimization from Retargeted
Total Page:16
File Type:pdf, Size:1020Kb
Humanoid Whole-Body Movement Optimization from Retargeted Human Motions Waldez Gomes, Vishnu Radhakrishnan, Luigi Penco, Valerio Modugno, Jean-Baptiste Mouret, Serena Ivaldi To cite this version: Waldez Gomes, Vishnu Radhakrishnan, Luigi Penco, Valerio Modugno, Jean-Baptiste Mouret, et al.. Humanoid Whole-Body Movement Optimization from Retargeted Human Motions. IEEE/RAS Int. Conf. on Humanoid Robots, Oct 2019, Toronto, Canada. hal-02290473 HAL Id: hal-02290473 https://hal.archives-ouvertes.fr/hal-02290473 Submitted on 17 Sep 2019 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Humanoid Whole-Body Movement Optimization from Retargeted Human Motions Waldez Gomes1, Vishnu Radhakrishnan1, Luigi Penco1, Valerio Modugno2, Jean-Baptiste Mouret1, Serena Ivaldi1 Abstract— Motion retargeting and teleoperation are powerful tools to demonstrate complex whole-body movements to hu- manoid robots: in a sense, they are the equivalent of kinesthetic teaching for manipulators. However, retargeted motions may not be optimal for the robot: because of different kinematics and dynamics, there could be other robot trajectories that perform the same task more efficiently, for example with less power consumption. We propose to use the retargeted trajectories to bootstrap a learning process aimed at optimizing the whole-body trajectories w.r.t. a specified cost function. To ensure that the optimized motions are safe, i.e., they do not violate system constraints, we use constrained optimization algorithms. We compare both global and local optimization approaches, since the optimized robot solution may not be close to the demonstrated one. We evaluate our framework with the humanoid robot iCub on an object lifting scenario, initially demonstrated by a human operator wearing a motion-tracking suit. By optimizing the initial retargeted movements, we can improve robot performance by over 40%. I. INTRODUCTION Kinesthetic teaching is widely used to demonstrate the Fig. 1: After several human whole-body demonstrations, the motion required to perform a task to robotics manipulators. robot learns an initial movement that is further refined by an Motion retargeting [1], [2] implements this idea for humanoid optimization algorithm in simulation. robots by using human motion tracking data to demonstrate whole-body movements. While this approach is very powerful, it presents two main limitations. First, humans generate based on Probabilistic Movement Primitives [3], which can movements that may not be optimal for the robot mainly capture motion variability computing a Gaussian trajectory due to structural differences between them: different number distribution from a set of demonstrations. Task trajectories of joints, or power generation at the joint level for instance. are executed by our multi-task whole-body controller [4] Second, because of the intrinsic variability of the human which solves a quadratic programming problem to generate motion, the robot may not learn from the best possible the robot controls for executing the desired motion. representation of the task execution. The main insight is As we previously did in [5], we use non-linear constrained that even if a human demonstration enables the robot to optimization to find the best parameters of the whole-body perform a given task, i.e., lift a box (Fig. 1), the retargeted trajectories that minimize an effort cost of executing the motion will be hardly optimal for the robot from an energetic desired task, under the constraints related to the robot point of view: if a more efficient way to perform a task exists, motion and actuator capabilities. This optimization phase, it needs to be found. bootstrapped by the initial retargeted trajectories, evaluates In this paper, we tackle those issues by using the retargeted in simulation the execution of the entire trajectory at each motions to bootstrap an optimization process that finds the episode (roll-out) until an optimal trajectory is found (Fig. 1). best whole-body movement that minimizes a given cost We use constrained optimization algorithms to ensure that the function related to the energetic consumption. solutions are always safe, not violating any constraints [5], [6]. Additionally, instead of picking one trajectory, we rely on Since we cannot exclude a priori that the optimized solution a parameterized description of the retargeted task trajectories is not close to the initial demonstration, we investigate both *This work was supported by the European Unions Horizon 2020 Research local and global constrained optimization algorithms. and Innovation Programme under Grant Agreement No. 731540 (project We show the effectiveness of our framework for whole- AnDy) and from the European Research Council (ERC) under Grant body movement optimization on a lifting task, initially Agreement No. 637972 (project ResiBots). 1 Inria, CNRS, University of Lorraine, Loria, UMR 7503 demonstrated by a human operator wearing a motion-tracking 2 La Sapienza University, Rome, Italy suit: the robot performance, evaluated by the fitness related Roll-out Stage to the torque consumption, improves by over 40%. (III - C) (III - B) The main contribution of this paper is twofold: First, we Trajectory Initial Whole-body ProMP Generation show a framework that learns optimal whole-body trajectories ProMP QP Training and Weights controller that do not violate any of the problem constraints, and at the Execution same time benefits from ProMPs to bootstrap the learning Eq. 6 with a trajectory that takes into account several human (III - E) Motion Constrained demonstrations. Second, we show the method’s viability with Retargeting (III - D) Optimization different optimization algorithms in a simulated environment, providing practical intelligence for future research on trajec- tory optimization for humanoid robots. Optimal Demonstrations ProMP In the rest of the paper, we explain how similar studies have From Human Weights approached the problem of optimizing motion retargeting for humanoid robots, mainly by optimizing control parameters, Fig. 2: Overview of our framework for learning optimal or reference trajectories (Section II). Then we follow to weight parameters for whole-body trajectories. the building blocks of our framework (Fig. 2), precisely, fundamentals of our robot controller, ProMPs, our motion retargeting, and the constrained optimization algorithms that optimize them w.r.t. to different cost functions related to the were used. In Section IV, we describe our experimental humanoid kinematics or dynamics. scenario for the humanoid, that it is further evaluated by In this work, we take the approach of keeping the QP three different optimization algorithms in a benchmark with controller parameters fixed, and optimizing the reference tra- different initial trajectories. Finally, we discuss the results jectories directly. Furthermore, before trajectory optimization, in Section V and conclude in favor of the viability of our those trajectories need to be compactly parameterized so that method to optimize retargeted motions to humanoid robots the optimization only deals with a small set of parameters, (Section VI). as it is typically done in motion planning and reinforcement learning [20], [21]. II. RELATED WORK In [22], the authors used reinforcement learning (RL) to Humanoid robots are able to execute multiple tasks through modify waypoints in a reference trajectory for the robot’s different joint trajectories (motion redundancy). In order to hand. The waypoints were used as parameters for a Bayesian control a humanoid robot, a well-established approach is to Optimization (BO) framework. It included cost function use quadratic programming (QP) formulations that implement evaluations directly from robot demonstrations to have only a strict [7], [8] or soft [9], [10] priorities among many tasks few roll-outs in comparison to optimizations in simulations. while dealing with low-level constraints (e.g. joint or torque In [5], the authors parameterized a reference trajectory for limits). However, even though those QP controllers have the robot Center of Mass (CoM). A compact representation of shown satisfactory results, they still demand good input the entire trajectory is provided through a radial basis function trajectories. Otherwise, the high number of degrees of freedom (RBF) network. Furthermore, the trajectory optimization was and constraints in humanoid robots may hinder the generation done in 2 steps: Unconstrained optimization; and black-box of feasible motions that do not violate robot constraints. constrained optimization, where the solution of the first step One solution for whole-body trajectory generation is was used as an initial point for the second step. The main issue to learn the movements from human demonstrations [11], of this study was exactly the fact that its main optimization [12]. In [1], the authors retarget human motion onto a step required a prior bootstrapping step from a successful humanoid robot using a QP controller and additionally taking unconstrained optimization