Biologically Inspired Control Systems for Autonomous Navigation and Escape from Pursuers
Dejanira Araiza Illan (Candidate)
Dr. Tony J. Dodd (Supervisor)
Department of Automatic Control & Systems Engineering
July 2012

Summary

This thesis proposes biologically inspired solutions to the problem of autonomous navigation and escape from pursuers for a robot in a dynamic environment. Pursuit arises in real-life tasks, yet strategies for evading it have received far less study than pursuit itself. The navigation and escape problem has been solved, from a high-level control perspective, by means of two different available techniques: path planning based on potential functions, and learning navigation strategies with Q-learning. The sensed environment is represented through an approximate cell decomposition, or occupancy grid.

Embedding a navigation scheme that allows the robot to avoid static and dynamic dangers provides it with capabilities that improve its self-preservation when interacting with the environment for the first time, equivalent to innate behaviours in animals. Biologically inspired features have been added to the designs, in the form of:

• Temporary local goals for navigation and escape, representing possible hideaways (refuges) or locations away from pursuers, according to the current knowledge of the environment.
• Protean (unpredictable) motion towards the goal, to increase the chances of avoiding capture.
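The grid-based potential-function approach described above can be illustrated with a minimal sketch: a wave-front potential computed over an occupancy grid, followed by steepest descent towards the goal. This is an illustrative sketch only, not the algorithms developed in the thesis (which add protean subgoals, hideaways and sensing updates); the grid, goal position and function names are assumptions.

```python
from collections import deque

def wavefront_potential(grid, goal):
    """BFS wave-front: potential of a free cell = grid distance to the goal.
    Obstacle cells (grid value 1) and unreachable cells keep potential None."""
    rows, cols = len(grid), len(grid[0])
    pot = [[None] * cols for _ in range(rows)]
    pot[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and pot[nr][nc] is None):
                pot[nr][nc] = pot[r][c] + 1
                queue.append((nr, nc))
    return pot

def steepest_descent(pot, start):
    """Follow strictly decreasing potential from start to the goal (potential 0).
    A BFS wave-front guarantees every non-goal cell has a neighbour one step lower."""
    path = [start]
    r, c = start
    while pot[r][c] != 0:
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= r + dr < len(pot) and 0 <= c + dc < len(pot[0])
                      and pot[r + dr][c + dc] is not None]
        r, c = min(neighbours, key=lambda p: pot[p[0]][p[1]])
        path.append((r, c))
    return path

# Hypothetical 3x4 occupancy grid: 0 = free, 1 = obstacle.
grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
pot = wavefront_potential(grid, goal=(0, 3))
path = steepest_descent(pot, start=(2, 0))  # skirts the obstacle block
```

The wave-front construction is what makes the potential free of local minima on the grid, so plain steepest descent cannot get trapped; the thesis's protean variants deliberately perturb this descent through temporary subgoals.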
Novel contributions include a description of the problem of escape from pursuers in a dynamic, sensed, previously unknown environment; the use of different kinds of potential functions adapted to the problem, with added biologically inspired proteanism and temporary local goals (hideaways or more distant locations); a comparison of the performance of different potential functions; a comparison of the possible variations in the proposed algorithms; an interpretation of the problem from a game theory perspective; and the use of tabular Q-learning with dynamic parameters and shaped reward functions to find suitable navigation strategies. The work presented in this thesis intends to widen research on the problem of individual escape from pursuers, along with the use of biologically inspired features to solve engineering problems.

Acknowledgements

I would like to express my deepest gratitude to my supervisor, Dr. Tony J. Dodd, for all the guidance and patience. All my gratitude to all the people that have encouraged me throughout the PhD:

• My Mum, Dad, my sisters Maria and Gloria, and my boyfriend Chris Kitchen.
• My dearest friends in Sheffield: Javier Caballero, Miriam Flores, Agustín Lugo, Fortino Acosta, Rodrigo Echegoyén, Ernesto Vidal, and the Chacko family (Elishba, Joe and Sophie).
• My university professors in ITESM CCM, particularly Pedro Ponce, Katya Romo, and Rodrigo Regalado.
• All the lovely people from the Mexican and Latin-American Society, PROGRESS, StitchSoc and Son de America.
• Other dear friends in Mexico.

I would also like to thank Charles Winter for his invaluable collaboration in the implementation of the algorithms in real robots, and the colleagues in the department that shared their knowledge and advice with me.

Finally, I would like to thank my two Mexican sponsors, CONACyT and Becas Complemento SEP, for giving me the great opportunity to study abroad and contribute to the development of science and technology.
Contents

Summary
Acknowledgements
Nomenclature

1 Introduction
1.1 Motivation
1.2 Problem definition
1.3 Aims and objectives
1.4 Main original contributions
1.5 Publications
1.6 Thesis outline

2 Bio-robotics and biological studies on anti-predator behaviours
2.1 Bio-robotics
2.1.1 Biomimetics
2.1.2 Biologically inspired robotics
2.2 Studies on prey-predator interactions
2.2.1 Anti-predator behaviours
2.2.2 Population dynamics, game theory and food chains
2.2.3 Other models of prey-predator interactions
2.2.4 Evolutionary prey-predator interactions
2.2.5 Enhanced anti-predator behaviours and applications
2.3 Discussion
2.4 Concluding remarks

3 Autonomous navigation with potential functions
3.1 Path planning for navigation of autonomous robots
3.1.1 Sensor-based bug algorithms
3.1.2 Roadmaps
3.1.3 Cell decomposition methods
3.1.4 Sample-based methods
3.1.5 Potential functions
3.1.6 Search algorithms for path planning
3.1.7 Models of the environment for path planning
3.2 Stealth navigation and evasion from pursuers
3.3 Artificial potential fields for path planning
3.3.1 Scalar potential functions
3.3.2 Boundary value problem solutions to the Laplace equation
3.3.3 Vector fields
3.3.4 Planning for dynamic environments and information from sensors
3.3.5 Discrete potential fields
3.3.6 Challenges implementing potential functions
3.3.7 Combining potential functions with other path planning methods
3.4 Discussion
3.5 Concluding remarks

4 Bio-inspired analysis of the environment and path planning
4.1 The navigation and escape problem
4.2 Approach to the solution of the problem
4.2.1 Selecting a control approach: reactive or deliberative
4.2.2 Considerations about the environment
4.2.3 Choosing an approach to solve the problem
4.2.4 Path planning with potential functions and discrete environment
4.2.5 Modelling the pursuers
4.2.6 Simulations and implementation
4.3 Bio-inspiration for navigation and escape
4.4 Analysis of the environment
4.4.1 Approximation of sensed elements into a grid
4.4.2 Computing bio-inspired hideaways
4.4.3 Computing distance functions
4.4.4 Selecting a local navigation goal
4.4.5 Bio-inspired a priori guided proteanism through subgoals
4.4.6 Updating the environment representation
4.4.7 Modifying the resolution of the grid
4.5 Potential functions for navigation and escape
4.5.1 Discrete potential functions
4.5.2 Approximation of a boundary value problem
4.5.3 Wave-front potential functions
4.5.4 Comparison of the used potential functions
4.6 Planning the path
4.7 Discussion
4.8 Concluding remarks

5 Simulations with bio-inspired paths and pursuers
5.1 Parameters for the simulations
5.2 Guided proteanism, low resolution grid, no updates
5.2.1 Visual examples of computed paths
5.2.2 Differences in guided protean paths and steepest descent only paths
5.2.3 Success computing paths that reached the local goal
5.3 Guided proteanism, low resolution grid, sensing updates
5.3.1 Visual examples of computed paths
5.3.2 Success of computed paths
5.4 Guided proteanism, high resolution grid
5.4.1 Visual examples of computed paths
5.4.2 Differences in guided protean paths and steepest descent only paths
5.4.3 Success of computed paths
5.5 Guided proteanism, high resolution, sensing updates
5.5.1 Visual examples of computed paths
5.5.2 Success of computed paths
5.6 General proteanism, high resolution, sensing updates
5.6.1 Visual examples of computed paths
5.6.2 Success of computed paths
5.7 Discussion
5.8 Concluding remarks

6 Pursuit-evasion games and reinforcement learning
6.1 Pursuit-evasion games
6.2 Solutions to pursuit-evasion games
6.2.1 Differential games
6.2.2 Controllers
6.2.3 Co-evolution of strategies
6.2.4 Visibility-based analysis (graph search)
6.2.5 Probabilistic analysis
6.2.6 Multi-agent systems approach
6.3 Reinforcement learning
6.4 Reinforcement learning algorithms
6.4.1 Monte Carlo
6.4.2 Temporal difference (TD)
6.4.3 Eligibility traces and function approximation
6.4.4 Variations in the key features of the algorithms
6.4.5 Applications of reinforcement learning to navigation
6.5 Discussion
6.6 Concluding remarks

7 Autonomous navigation and escape using reinforcement learning
7.1 The problem from a game theory and reinforcement learning perspective
7.2 Solving the game using reinforcement learning
7.2.1 Biologically inspired analysis of the environment
7.2.2 Parameters for the Q-learning algorithm
7.2.3 Empirical selection of ranges of operation of the parameters
7.3 Simulations
7.4 Discussion