Locomotion of Bipedal Humanoid Robots: Planning and Learning to Walk
THE UNIVERSITY OF NEW SOUTH WALES
Thesis/Dissertation Sheet

Surname or Family name: YIK
First name: TAK FAI
Other name/s:
Abbreviation for degree as given in the University calendar: PhD
School: Computer Science and Engineering
Faculty: Engineering
Title: Locomotion of Bipedal Humanoid Robots: Planning and Learning to Walk

Abstract (350 words maximum):

Pure reinforcement learning does not scale well to domains with many degrees of freedom and particularly to continuous domains. In this thesis, we introduce a hybrid method in which a symbolic planner constructs an approximate solution to a control problem. Subsequently, a numerical optimisation algorithm is used to refine the qualitative plan into an operational policy. The method is demonstrated on the problem of learning a stable walking gait for a bipedal robot.

The contributions of this thesis are as follows. First, the thesis proposes a novel way to generate walking gaits for a humanoid robot by using a genetic algorithm with the zero moment point as the stability criterion. This is validated on a physical robot. Second, we propose an innovative generic learning method that utilises the trainer's domain knowledge about the task to accelerate learning and extend the capabilities of the learning algorithm. The proposed method, which combines symbolic planning and learning to reduce the search space of the learning problem and accelerate learning, is tested on a bipedal humanoid robot learning to walk. Finally, experimental verification on a physical robot shows that the extended learning algorithm handles high-complexity learning tasks in the physical world.
Declaration relating to disposition of project thesis/dissertation

I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only).

Witness Date

The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research.

ORIGINALITY STATEMENT

'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.'
Signed Date

COPYRIGHT STATEMENT

'I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'

Signed Date

AUTHENTICITY STATEMENT

'I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.'

Signed Date

THE UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF COMPUTER SCIENCE AND ENGINEERING

Locomotion of Bipedal Humanoid Robots: Planning and Learning to Walk

Tak Fai Yik

Requirement for the Degree of Doctor of Philosophy
October 2007
Supervisor: Professor Claude Sammut

Abstract

Pure reinforcement learning does not scale well to domains with many degrees of freedom and particularly to continuous domains. In this thesis, we introduce a hybrid method in which a symbolic planner constructs an approximate solution to a control problem. Subsequently, a numerical optimisation algorithm is used to refine the qualitative plan into an operational policy.
The method is demonstrated on the problem of learning a stable walking gait for a bipedal robot.

The contributions of this thesis are as follows. First, the thesis proposes a novel way to generate walking gaits for a humanoid robot by using a genetic algorithm with the zero moment point as the stability criterion. This is validated on a physical robot. Second, we propose an innovative generic learning method that utilises the trainer's domain knowledge about the task to accelerate learning and extend the capabilities of the learning algorithm. The proposed method, which combines symbolic planning and learning to reduce the search space of the learning problem and accelerate learning, is tested on a bipedal humanoid robot learning to walk. Finally, experimental verification on a physical robot shows that the extended learning algorithm handles high-complexity learning tasks in the physical world.

Contents

1 Introduction
  1.1 Problem definition
  1.2 Contributions of this thesis
  1.3 Overview
2 Background
  2.1 Walking and gaits
  2.2 Robotics
    2.2.1 Kinematics and inverse kinematics
    2.2.2 Dynamics and zero moment point
  2.3 Learning
    2.3.1 Genetic algorithms
    2.3.2 Hill climbing
  2.4 Symbolic planning
    2.4.1 STRIPS notation
    2.4.2 Plan search
    2.4.3 Looped plans
    2.4.4 Constraint logic programming
3 Related research
  3.1 Walking robots
    3.1.1 Direct gait generation
    3.1.2 Alternative gait generation
    3.1.3 Learning walking gaits
    3.1.4 Summary
  3.2 Symbolic problem solving
    3.2.1 Qualitative physics
    3.2.2 Relational and guided learning
    3.2.3 Hybrid system: Combining planning and learning
    3.2.4 Iterative plans
  3.3 Summary
4 Evolving a locus based gait
  4.1 Introduction
    4.1.1 Gait generation
    4.1.2 Locus based quadrupedal walk
    4.1.3 Locus based bipedal walk
  4.2 The GuRoo project
  4.3 Gait generation
    4.3.1 Motion sequence
    4.3.2 Estimation of the zero moment point
    4.3.3 Gait evolution
  4.4 Walking
5 Combining planning and learning
  5.1 The bipedal humanoid robot: Cycloid-II
  5.2 The learning task
  5.3 Overview of the learning system
  5.4 Implementation
  5.5 Qualitative representation of robot walking
    5.5.1 States
  5.6 Description of the planner
    5.6.1 Actions
  5.7 Constraint solving
  5.8 Description of the learner
6 Experiments
  6.1 Crouching experiment
  6.2 Walking experiments
  6.3 Experimental results
7 Conclusions and future work
  7.1 Genetic algorithm
  7.2 Symbolic planning with constrained learning
  7.3 Discussion and future work
    7.3.1 Improvements in planning
    7.3.2 Improvements in learning
  7.4 Conclusions

List of Figures

2.1 A human walk cycle (from left to right). The left leg is indicated by a dotted line and the right leg by a solid line.
2.2 Relationship between zero moment point, gravity and centre of mass.
2.3 Genetic operators used to create offspring.
2.4 Hill climbing: a) if the change results in a lower function value, the change is discarded; b) if the change results in a higher function value, the point is moved to the new position.
2.5 All states from the block world domain (with 2 blocks).
4.1 The GuRoo humanoid robot (Wyeth et al., 2003).
4.2 The gait generator learns from off-line evolution using an approximate model, which can then be tested on the real robot or in simulation.
4.3 The generated locus (Wyeth et al., 2003).
4.4 Approximate configuration of GuRoo used to simplify the zero moment point estimation during gait evolution (Wyeth et al., 2003).