Deep Reinforcement Learning for Soft Robotic Applications: Brief Overview with Impending Challenges
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 23 November 2018; doi:10.20944/preprints201811.0510.v2
Peer-reviewed version available at Robotics 2019, 8, 4; doi:10.3390/robotics8010004

Review

Deep Reinforcement Learning for Soft Robotic Applications: Brief Overview with Impending Challenges

Sarthak Bhagat 1,2,†, Hritwick Banerjee 1,3,†, Zion Tsz Ho Tse 4 and Hongliang Ren 1,3,5,*

1 Department of Biomedical Engineering, Faculty of Engineering, 4 Engineering Drive 3, National University of Singapore, Singapore 117583, Singapore; [email protected] (H.B.)
2 Department of Electronics and Communications Engineering, Indraprastha Institute of Information Technology, Delhi, New Delhi, Delhi 110020, India; [email protected] (S.B.)
3 Singapore Institute for Neurotechnology (SINAPSE), Centre for Life Sciences, National University of Singapore, Singapore 117456.
4 School of Electrical & Computer Engineering, College of Engineering, The University of Georgia, Athens 30602, USA; [email protected] (Z.T.H.T.)
5 NUS (Suzhou) Research Institute (NUSRI), Wuzhong Dist., Suzhou City, Jiangsu Province, China.
† These authors contributed equally to this manuscript.
* Correspondence: [email protected] (H.R.); Tel.: +65-6601-2802

Abstract: The growing trend of studying the innate softness of robotic structures and combining it with the extensive developments in embodied intelligence has given rise to a relatively new yet extremely rewarding sphere of technology. Fusing current deep reinforcement learning algorithms with the physical advantages of a soft, bio-inspired structure points toward the promising prospect of designing completely self-sufficient agents that learn, from observations collected in their environment, to achieve the tasks they have been assigned.
For soft robotic structures possessing countless degrees of freedom, it is often difficult (and sometimes impossible) to formulate the mathematical constraints necessary for training a deep reinforcement learning (DRL) agent for the task at hand. Hence, we turn to imitation learning techniques, since tasks such as manipulation are easy to perform manually and can be comfortably mimicked by an agent. Deploying current imitation learning algorithms on soft robotic systems has been observed to provide satisfactory results, but challenges remain in doing so. This review article therefore presents an overview of such algorithms, together with instances of their application to real-world scenarios yielding state-of-the-art results, followed by brief descriptions of several pristine branches of DRL research that may become centers of future work in this field.

Keywords: deep reinforcement learning; imitation learning; soft robotics

1. Introduction

1.1. Soft robotics: a new surge in robotics

The past decade has seen engineering and biology come together [1–5], leading to the emergence of a relatively new field of research: Soft Robotics (SoRo). SoRo enhances the physical capabilities of robotic structures, amplifying their flexibility, rigidity and strength and hence accelerating their performance. Biological organisms use their soft structures to good advantage when maneuvering in complex environments, motivating us to incorporate such physical attributes into our models to perform tasks that demand robust interaction with uncertain environments. SoRo can create three-dimensional bio-inspired structures [6] capable of self-regulated homeostasis, resulting in robotic actuators that have the potential to mimic biomimetic motions with simple, inexpensive actuation [7,8] and to achieve bending, twisting, extension and flexion with non-rigid materials [5,9].

© 2018 by the author(s). Distributed under a Creative Commons CC BY license.

These developments in robotic hardware present an opportunity to couple such systems with imitation learning techniques, exploiting the special properties of these materials to reach much higher levels of precision and accuracy. Underlying physical properties including body shape, elasticity, viscosity, softness and density have prompted us to employ such unconventional structures and morphologies in our robotic systems, bringing about a revolution in the research domain of embodied intelligence. Developing such techniques would lead to the fabrication of robots that can interact safely with their environment. SoRo also offers strong prospects of being melded with tissue engineering, giving rise to composite systems that could find vast applications in the medical domain [10].

Figure 1. Various applications of SoRo in today's world.

1.2. Deep learning: an overview

Deep Learning for Controls in Robotics

There has been a marked inclination towards using deep learning techniques to create autonomous systems capable of replacing humans in varied domains. Deep learning [11] approaches have shown a tremendous amount of success when combined with reinforcement learning (RL) tasks over the past decade and are known to produce state-of-the-art results in diverse fields [12]. Several pioneering algorithms in this domain have shown ground-breaking results on tasks difficult to handle with former methods.
The need to create completely autonomous, intelligent robotic systems has led to heavy reliance on deep RL for solving complex real-world problems without any prior information about the environment. These continuously evolving systems help the agent learn over a sequence of time steps, gradually moving towards an optimal solution.

Robotics: perception & control

Essentially, all robotics tasks can be broken down into two fragments, namely perception [13,14] and control. Perception can be viewed as the relatively simpler problem: agents are provided with the necessary information about the environment via sensory inputs, from which they extract desired target quantities or properties. In learning a control policy, by contrast, the agent actively interacts with the environment, trying to achieve optimal behaviour based on the rewards received. The problem of control goes one step further than the former due to the following factors:

• Data distribution: In deep RL for perception, the observations are independent and identically distributed, while for control they are accumulated in an online manner; owing to their sequential nature, each observation is correlated with the previous ones [15].
• Supervision signal: Complete supervision is often provided for perception in the form of ground-truth labels, while for control only sparse rewards are available.
• Data collection: Datasets can be collected offline for perception, but control requires online collection. This greatly constrains the data we can gather, because the agent needs to execute actions in the real world, which is not a trivial task in most scenarios.
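The online, correlated data collection that distinguishes control from perception can be made concrete with a minimal tabular Q-learning loop: each transition the agent records depends on the state the previous action produced, and the only supervision is a sparse terminal reward. The toy five-state chain environment, reward and hyperparameters below are illustrative assumptions for the sketch, not drawn from any system reviewed here.

```python
import random

# Toy sketch: online control learning on a 5-state chain.
# Actions: 0 = move left, 1 = move right; reward only at the rightmost
# state, so the supervision signal is sparse, as described above.
N_STATES, ACTIONS = 5, (0, 1)
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # tabular action values
alpha, gamma, eps = 0.5, 0.9, 0.3           # assumed hyperparameters

def step(s, a):
    """Environment transition; data arrives online, one step at a time."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

random.seed(0)
for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy exploration: the agent must act to get data
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # temporal-difference update on the correlated sample (s, a, r, s2)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Greedy policy after training: move right from every non-terminal state
print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])
```

The loop highlights the three bullet points above: samples are generated sequentially by the current policy, the reward is sparse, and no dataset exists before the agent starts acting.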
1.3. Deep learning in SoRo

The implementation of deep learning techniques for solving compound control problems [16–18] in soft robotics has been one of the hottest topics of research, and various algorithms have been developed that surpass the accuracy and precision of earlier approaches. Innumerable control problems exist, but by far the most popular and in-demand control task is manipulation. The last decade has seen heavy reliance on soft robotics (and/or bio-robotics) for solving control-related tasks, and applying DRL techniques to these soft robotic systems has become a focal point of ongoing research. The amalgamation of these budding fields presents both a challenge and the potential to build much smarter control systems [10] that can handle objects of varying shapes [19], adapt to continuously changing environments and perform substantial manipulation tasks, combining the hardware capabilities of a soft robotic system with the learning procedures of a deep learning agent. Hence, in this paper we focus on applying DRL and imitation learning techniques to soft robots for the control of robotic systems.

1.4. Forthcoming challenges

The next big step towards learning control policies for robotic applications is imitation learning. In such approaches, the agent learns to perform a task by mimicking the actions of an expert, gradually improving its performance over time, just as a deep reinforcement learning agent does. Even though it is hard to design the reward/loss function in these cases, owing to the huge dimensionality of the action space arising from the wide variety of motions possible in soft robots, these approaches are extremely useful for humanoid robots or manipulators with high degrees of freedom, where desired behaviours are easy to demonstrate thanks to the magnified flexibility and tensile strength.
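The simplest instance of the imitation-learning recipe sketched above is behavioral cloning: record (state, action) pairs from expert demonstrations and fit a policy to them by ordinary supervised learning, sidestepping reward design entirely. The sketch below assumes a hypothetical proportional-feedback expert on a one-dimensional plant; the expert, the plant dynamics and the gains are illustrative stand-ins, not taken from the soft-robot systems reviewed here.

```python
import random

random.seed(1)

def expert(x):
    # Hypothetical expert policy: proportional feedback with gain -2.0.
    # In a soft-robot setting this would be a human teleoperating the arm.
    return -2.0 * x

# 1) Collect demonstrations by rolling the expert out on a toy plant
#    with dynamics x' = x + 0.1 * a (an assumed discretized integrator).
demos = []
for _ in range(20):
    x = random.uniform(-1.0, 1.0)
    for _ in range(10):
        a = expert(x)
        demos.append((x, a))
        x = x + 0.1 * a

# 2) Fit a linear policy a = k * x to the demos by least squares;
#    no reward function is ever specified, only expert state-action pairs.
num = sum(x * a for x, a in demos)
den = sum(x * x for x, a in demos)
k = num / den

# 3) The cloned policy recovers the expert's gain from demonstrations alone
print(round(k, 3))  # -> -2.0
```

The same recipe scales to high-dimensional soft-robot action spaces by replacing the linear fit with a neural network regressor; the appeal noted above is that demonstrating the behaviour is often far easier than writing down a reward over countless degrees of freedom.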
Manipulation tasks especially the