Learning-In-The-Loop Optimization: End-To-End Control and Co-Design of Soft Robots Through Learned Deep Latent Representations

Learning-In-The-Loop Optimization: End-To-End Control And Co-Design of Soft Robots Through Learned Deep Latent Representations Andrew Spielberg, Allan Zhao, Tao Du, Yuanming Hu, Daniela Rus, Wojciech Matusik CSAIL Massachusetts Institute of Technology Cambridge, MA 02139 [email protected], [email protected], [email protected] [email protected], [email protected], [email protected] Abstract Soft robots have continuum solid bodies that can deform in an infinite number of ways. Controlling soft robots is very challenging as there are no closed form solutions. We present a learning-in-the-loop co-optimization algorithm in which a latent state representation is learned as the robot figures out how to solve the task. Our solution marries hybrid particle-grid-based simulation with deep, variational convolutional autoencoder architectures that can capture salient features of robot dynamics with high efficacy. We demonstrate our dynamics-aware feature learning algorithm on both 2D and 3D soft robots, and show that it is more robust and faster converging than the dynamics-oblivious baseline. We validate the behavior of our algorithm with visualizations of the learned representation. Figure 1: Our algorithm learns a latent representation of robot state which it uses as input for control. Above are velocity field snapshots of a soft 2D biped walker moving to the right (top), the corresponding latent representations (middle), and their reconstructions (bottom) from our algorithm. In each box, the x (left) and y (right) components of the velocity fields are shown; red indicates negative values, blue positive. 1 Introduction Recent breakthroughs have demonstrated capable computational methods for both controlling (Heess et al. [2017], Schulman et al. [2017], Lillicrap et al. [2015]) and designing (Ha et al. [2017], Spielberg et al., Wampler and Popovic´ [2009]) rigid robots. However, control and design of soft robots have been explored comparatively little due to the incredible computational complexity they present. Due to their continuum solid bodies, soft robots’ state dimensionality is inherently infinite. High, but finite dimensional approximations such as finite elements can provide robust and accurate forward simulations; however, such representations have thousands or millions of degrees of freedom, making them ill-suited for most control tasks. To date, few compact, closed-form models exist for describing soft robot state, and none apply to the general case. In this paper, we address the problem of learning low-dimensional robot state while simultaneously optimizing robot control and/or material parameters. In particular, we require a representation applicable to physical control of real-world robots. We propose a computer vision-inspired approach which makes use of the robot’s observed dynamics in learning a compact observation model for soft robots. Our task-centric method interleaves 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. controller (and material) optimization with learning low-dimensional state representations. Our “learning-in-the-loop optimization” method is inspired by recent advances in hybrid particle-grid- based differentiable simulation techniques and deep, unsupervised learning techniques. In the learning phase, simulation grid data is fed into a deep, variational convolutional autoencoder to learn a compact latent state representation of the soft robot’s motion. In the optimization phase, the learned encoder function creates a compact state description to feed into a parametric controller; the resulting, fully differentiable representation allows for backpropagating through an entire simulation and directly optimizing a simulation loss with respect to controller and material parameters. Because learning is interleaved with optimization, learned representations are catered to the task, robot design (including, e.g., discrete actuator placement), and environment at hand, and not just the static geometry of the soft structure. Because of our judicious choice of a physics engine which operates (in part) on a grid, we are able to easily employ modern, deep learning architectures (convolutional neural networks) to extract robust low-dimensional state representations while providing a representation amenable to real-world control through optical flow. Because of our fully-differentiable representation of the controller, observations, and physics, we can directly co-design robot performance. To our knowledge, our pipeline is the first end-to-end method for optimizing soft robots without the use of a pre-chosen, fixed representation, minimizing human overhead. In this paper, we contribute: 1) An algorithm for control and co-design of soft robots without the need for manual feature engineering; 2) experiments on five model robots evaluating our system’s performance compared to baseline methods; 3) visualizations of the learned representations, validating the efficacy of our learning procedure. 2 Related Work Dimensionality Reduction for Control A compact, descriptive latent space is crucial for tractably modeling and controlling soft robots. Methods for extracting and employing such spaces for control typically fall into two categories: a) analytical methods, and b) learning-based methods. Analytical methods examine the underlying physics and geometry of soft structures in order to extract an optimal subspace for capturing low-energy (likely) deformations. Most popular among these methods are modal bases [Sifakis and Barbic, 2012], formed by solving a generalized eigenvalue problem based on the harmonic dynamics of a system. These methods suffer from inadequately modeling actuation, contact, and tasks and only represent a linear approximation of system’s dynamics. Still, such representations have been successfully applied to real-time linear control (LQR) in Barbicˇ and Popovic´ [2008] and Thieffry et al. [2018], and (with some human labeling) animation [Barbicˇ et al., 2009], but lack the physical accuracy needed for physical fabrication. In another line of work, Chen et al. [2017] presented a method for using analytical modal bases in order to reduce the degrees of freedom of a finite element system for faster simulation while maintaining physical accuracy. However, the resulting number of degrees of freedom are still impractical for most modern control algorithms. For the specific case of soft robot arms, geometrically-inspired reduced coordinates may be employed. Della Santina et al. [2018] developed a model for accurately and compactly describing the state of soft robot arms by exploiting segment-wise constant curvature of arms. Learning-based methods, by contrast, use captured data in order to learn representative latent spaces for control. Since these representations are derived from robot simulations or real-world data, they can naturally handle contact, actuation, and be catered to the task. Goury and Duriez [2018] demonstrated some theoretical guarantees on how first-order model reduction techniques could be applied to motion planning and control for real soft robots. As drawbacks, their method is catered to FEM simulation, requires a priori knowledge of how the robot will move, and representations are never re-computed, making it ill-suited to co-design where dynamics can change throughout optimization. Two works from different domains have similarities to our work. Ma et al. [2018] applied deep- learning of convolutional autoencoders in the context of controlling rigid bodies with directed fluids. Our algorithm shares high-level similarities, but operates in the domain of soft robot co-optimization and exploits simulation differentiability for fast convergence. Amini et al. [2018] employed latent representations for autonomous vehicle control in the context of supervised learning on images. Co-Design of Soft Robots There exist two main threads of work in which robots are co-designed over morphology and control — gradient-free and gradient-based. Most of the work in model-free co-optimization of soft-robots is based on evolutionary algorithms. Cheney et al. [2013], Corucci 2 Figure 2: At each step of our simulation, the following procedure is performed. First, the unreduced state is fed into an observer function — centroids of a segmentation, as in Hu et al. [2019], or, as we demonstrate in this paper, an automatically learned latent space. Regardless, the observer outputs features to be processed by a parametric controller, which converts these features to actuation signals. Finally, the actuation is fed into our MPM simulator, which performs a simulation step. The entire pipeline is differentiable and therefore we can compute derivatives with respect to design variables even when executing the workflow for many steps. et al. [2016], and Cheney et al. [2018] have demonstrated task-specific co-optimization of soft robots over materials, actuators, and topology. These approaches are less susceptible to local minima than gradient-based approaches but are vastly more sample inefficient. For instance, a single evolved robot in Cheney et al. [2013] requires 30000 forward simulations; by comparison, optimized robots in our work are optimized in the equivalent of 400 simulations (treating gradient calculation as equal to 3 forward simulations). Further, their approach was limited to simple open-loop controllers tied to form, while ours solves for more robust, closed-loop control. While some algorithms exist for gradient-based co-optimization of rigid robots (Wampler and Popovic´

Learning-In-The-Loop Optimization: End-To-End Control and Co-Design of Soft Robots Through Learned Deep Latent Representations

Control in Robotics

An Abstract of the Dissertation Of

Curriculum Reinforcement Learning for Goal-Oriented Robot Control

Final Program of CCC2020

Real-Time Vision, Tracking and Control

Arxiv:2011.00554V1 [Cs.RO] 1 Nov 2020 AI Agents [5]–[10]

An Emotional Mimicking Humanoid Biped Robot and Its Quantum Control Based on the Constraint Satisfaction Model

Latent-Space Control with Semantic Constraints for Quadruped Locomotion

Mobile Robots Adaptive Control Using Neural Networks

Real-Time Hebbian Learning from Autoencoder Features for Control Tasks

THE MOBILE ROBOT CONTROL for OBSTACLE AVOIDANCE with an ARTIFICIAL NEURAL NETWORK APPLICATION Victor Andreev, Victoria Tarasova

AI-Perspectives: the Turing Option Frank Kirchner1,2