Online Learning of Uneven Terrain for Humanoid Bipedal Walking Seung-Joon Yi Byoung-Tak Zhang Daniel D
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Online Learning of Uneven Terrain for Humanoid Bipedal Walking Seung-Joon Yi Byoung-Tak Zhang Daniel D. Lee Biointelligence Laboratory Biointelligence Laboratory GRASP Laboratory Seoul National University Seoul National University University of Pennsylvania Seoul 151-742, Korea Seoul 151-742, Korea Philadelphia, PA 19104, USA [email protected] [email protected] GRASP Laboratory University of Pennsylvania Philadelphia, PA 19104, USA [email protected] Abstract literature have focused on the second part of this problem, such as building reactive feedback controllers to walk on We present a novel method to control a biped humanoid uneven surfaces without an explicit model of the surface. In robot to walk on unknown inclined terrains, using an online learning algorithm to estimate in real-time the these approaches, the terrain model is implicitly represented local terrain from proprioceptive and inertial sensors. in parameters of the feedback controller and tuned indirectly. Compliant controllers for the ankle joints are used to In addition, much of this work has been implemented on pro- actively probe the surrounding surface, and the mea- prietary hardware platforms, using special hardware to help sured sensor data are combined to explicitly learn the mechanically stabilize the robot platform (Sano et al. 2008; global inclination and local disturbances of the terrain. Hashimoto et al. 2006). More recent work on quadruped lo- These estimates are then used to adaptively modify the comotion over uneven terrain assumes that a precise model robot locomotion and control parameters. Results from of the terrain is given, and uses motion planning techniques both a physically-realistic computer simulation and ex- with perfect state and terrain information (Kalakrishnan et periments on a commercially available small humanoid al. 2009; Kolter, Rodgers, and Ng 2008; Pongas, Mis- robot show that our method can rapidly adapt to chang- ing surface conditions to ensure stable walking on un- try, and Schaal 2007; Ratliff, Bagnell, and Srinivasa 2007; even surfaces. Rebula et al. 2007). Obviously, this has clear limitations in real world implementations where noise and uncertainty are much greater problems. Introduction In this work, we show how to use existing hardware on The main advantage of legged locomotion over wheeled lo- bipedal robots to address the sensing part of the problem comotion is that legs have the capability of climbing rougher using online machine learning techniques. By incorporating terrain than wheeled or tracked vehicles. Unfortunately, this electronic compliance and foot pressure sensors, the swing ideal is often not achieved in reality, especially for the cur- foot is used to provide noisy estimates of the local gradient rent generation of bipedal humanoid robots. Many walking of the contact point, and the computed pose of the foot from controller implementations for humanoid robots assume per- joint encoders and the inertial measurement unit is used to fectly flat surfaces, and even a slight deviation in the floor rapidly learn an explicit model of the surface the robot is can lead to serious instabilities in these controllers. In this walking upon. manuscript, we propose a novel method to learn and adapt to Our proposed framework has a number of key advantages. uneven terrain, and demonstrate how our methods can stabi- We show how algorithms for surface measurement, terrain lize walking in humanoid robots. model estimation, and adaptive walking control can be sep- The problem of legged walking control on uneven arately analyzed and integrated without having to consider surfaces has been studied by a number of researchers the full complicated dynamics of the robot and environment. (Hashimoto et al. 2006; Hyon 2009; Jenkins, Wrotek, Our approach also uses direct sensory information from the and McGuire 2007; Kalakrishnan et al. 2009; Kim et al. foot, thus enabling more rapid learning of the surround- 2007; Kim, Park, and Oh 2007; Mikuriya et al. 2005; ing environmental characteristics compared to other meth- Kolter, Rodgers, and Ng 2008; Liu, Chen, and Veloso 2009; ods that depend upon indirect measurement from the torso Ogino, Toyama, and Asada 2007; Park and Kim 2009; alone. Pongas, Mistry, and Schaal 2007; Ratliff, Bagnell, and Srini- To show the effectiveness of our approach, we first vasa 2007; Rebula et al. 2007; Sano et al. 2008; Yamaguchi, demonstrate our algorithms using a freely available robot Takanishi, and Kato 1994). This problem can be divided into simulation package that we have developed. This simula- two parts: (a) using sensor information to create a model of tion environment allows us to perform many trials to quan- the surrounding terrain, and (b) constructing controllers to tify the performance of our approach. In addition, we walk on rough terrains. Much of the previous research in the have performed experiments using a commercially available Copyright c 2010, Association for the Advancement of Artificial small humanoid robot platform without any special hard- Intelligence (www.aaai.org). All rights reserved. ware modifications. In spite of limitations in the hardware 1639 Figure 1: Side and frontal view of surface sensing. Foot position displacement f~ and foot normal f^ are calculated d n ^ by forward kinematics using inertial sensors and joint angle Figure 2: Estimating the surface gradient N from foot dis- ~ ^ readings. placement fd and foot normal fn measurements. For the foot displacement case, we incorporate the constraints that ~ N^new should lie in the same plane with N^ and fd to uniquely platform, our results show that our method can rapidly esti- determine N^new. mate the surface gradient, enabling the robot to successfully walk across unknown surfaces. This paper is structured as follows. First we describe to the joints of the swing leg during the latter part of the sin- our method to probe the shape of the walking surface us- gle support phase. Touchdown is determined using the foot ing a compliant swing foot, and show how online learning is force sensors, and the swing leg is stiffened once the foot has used to estimate an explicit surface model in real-time using successfully made ground contact. After landing, joint angle the measured foot position and angles. Then we demon- values from the encoders are measured, and the pose of the strate our walk controller for rough terrain when the surface foot is calculated using the forward kinematic model of the model is known and estimated from the sensors. We quan- robot. An overview of this surface sensing process is shown tify our results from simulation as well as experiments on a in Figure 1. The reliability of this calculation depends upon humanoid robot, and conclude with a discussion of our find- the accuracy of the joint encoders and inertial sensor read- ings. ings. We compensate for any noise in the estimates of the ~ ^ foot displacement fd and the foot normal fn by using the Online Learning of Uneven Terrain following online learning algorithm. When bipedal animals walk on unknown terrains without vi- sual feedback, they can cautiously sense the local surface Learning the surface model using proprioception and force feedback. A foot is placed After the position and angle of the landed foot is determined, into an expected landing position until the ground reaction we use this information to explicitly estimate the local sur- force is felt, and the location and shape of the surface can be face gradient. There are a variety of methods to explicitly estimated from the foot landing position and angle. In situa- model the various terrain surfaces. One possibility is to de- tions where the surface is very rough, humans actively probe fine the surface as a function of the global 2D position of the the surface by sensing a number of positions and the foot robot: Msurface = f(x; y). trajectory is actively modified according to the learned sur- This requires very good localization capability, as well as face profile. This process is the inspiration for our approach. building a detailed map of the surrounding environment. In- Instead of relying upon torso-based measurements, we take stead we take a simpler approach, using rapid online learn- the foot landing positions and angles as noisy measurements ing of the local surface gradient via the noisy, built-in sen- and use them to learn the surrounding surface gradient. We sors. Thus, in this simple approach, the model consists of explain this process in more detail here. maintaining an estimate of the local surface normal vector: Msurface = N^. Surface sensing using compliant swing leg After each footfall, we get two unit vectors as noisy mea- ^ ~ In order for the foot to conform to the local surface gradi- surements: the foot normal fn and foot displacement fd. We ent upon landing, we purposefully give higher compliance will estimate the surface gradient using both of these mea- 1640 surements. In the ideal case, the measured gradient of the landing point will coincide with that of the surface gradient, and the foot displacement vector will be perpendicular to the surface normal: ^ fn = N^ (1) ~ fd · N^ = 0 (2) But because of noise in these measurements, we will not use these as hard constraints for the model. Instead, we de- fine the new model Nnew so that it minimizes an online learning cost function. For the foot normal measurement, we use following cost: Figure 3: Stepping on uneven terrain. (a) The swing foot d~ ^fn ^ 2 ^fn ^fn 2 starts at ground position with distance variable L. (b) The Cfn = kNnew − fnk + αkNnew − N k (3) foot pose follows the computed trajectory during the early phase of the step, and the swing leg becomes compliant dur- which results in the following simple update rule: ing the latter part of the step.