Optimization, Imitation and Innovation: Computational Intelligence and Games

Julian Togelius

A Thesis submitted for the degree of Doctor of Philosophy

Department of Computer Science

University of Essex

September 2007

Contents

1 Introduction
  1.1 Main scientific contributions
  1.2 Organization of the thesis
  1.3 List of papers
  1.4 Notes on style

2 Computational intelligence
  2.1 Evolutionary computation
    2.1.1 Evolution strategies
    2.1.2 Genetic algorithms
    2.1.3 Cascading Elitism
  2.2 Neural networks
    2.2.1 Multi-layer perceptrons
    2.2.2 Evolutionary neural networks
  2.3 Evolutionary robotics
    2.3.1 Reality and simulation
    2.3.2 The assumptions and philosophy of evolutionary robotics
  2.4 Issues in evolving controllers for complex behaviour in embodied agents
    2.4.1 Sensing
    2.4.2 Incrementality
    2.4.3 Modularity
    2.4.4 Controller representations
    2.4.5 Stateful control and internal models
    2.4.6 Competitive co-evolution
    2.4.7 Evolution and other forms of reinforcement learning
  2.5 Summary

3 Computational intelligence and games
  3.1 Computer games
    3.1.1 Types of computer games
    3.1.2 Commercial versus academic game AI
  3.2 Optimization
    3.2.1 Games for computational intelligence: testing machine learning algorithms
    3.2.2 Computational intelligence for games: optimizing agents and games
  3.3 Imitation
    3.3.1 Games for computational intelligence: testing supervised learning algorithms
    3.3.2 Computational intelligence for games: modelling behaviour and dynamics
  3.4 Innovation
    3.4.1 Games for computational intelligence: evolving complex general intelligence
    3.4.2 Computational intelligence for games: emergence and content creation
  3.5 Summary

4 Games in this thesis
  4.1 Simulated car racing
    4.1.1 Dynamics
    4.1.2 Collisions
    4.1.3 Games
  4.2 Other games
    4.2.1 Simulated Helicopter Control
    4.2.2 Cellz

5 Optimization
  5.1 Racing single cars on single tracks
    5.1.1 No inputs and action sequences
    5.1.2 Open-loop neural network
    5.1.3 Newtonian inputs and force fields
    5.1.4 Newtonian inputs and neural networks
    5.1.5 Simulated sensor inputs and neural networks
    5.1.6 Conclusions
  5.2 Scaling up to multiple tracks
    5.2.1 Methods
    5.2.2 Evolving track-specific controllers
    5.2.3 Evolving general and robust driving skills
    5.2.4 Evolving specialized controllers
    5.2.5 Observations on evolved driving behaviour
    5.2.6 Conclusions
  5.3 Further experiments in optimizing car driving
    5.3.1 Comparing controller architectures on the track task
    5.3.2 Comparing learning methods for point-to-point racing
  5.4 Optimization in other game domains
    5.4.1 Controlling a simulated helicopter
    5.4.2 Playing Cellz
  5.5 Summary

6 Imitation
  6.1 Imitating driving styles
    6.1.1 When is a player model adequate?
    6.1.2 Direct modelling
    6.1.3 Indirect modelling
    6.1.4 Results
  6.2 Imitating real car dynamics for controller evolution
    6.2.1 Our approach to dynamics modelling and controller evolution
    6.2.2 Model requirements for controller evolution
  6.3 Data collection
    6.3.1 Modelling techniques
    6.3.2 Model acquisition experiments
    6.3.3 Controller learning experiments
    6.3.4 Discussion
  6.4 Other types of imitation
    6.4.1 Imitating simulated car dynamics in sensor space
  6.5 Summary

7 Innovation
  7.1 Co-evolving car controllers for competitive driving
    7.1.1 Co-Evolutionary Algorithm
    7.1.2 Experiments
    7.1.3 Discussion
  7.2 Multi-population competitive co-evolution
    7.2.1 Controller architectures
    7.2.2 Co-evolutionary algorithms
    7.2.3 Results
    7.2.4 Discussion
  7.3 Personalized racing track creation
    7.3.1 Fitness functions
    7.3.2 Track representation
    7.3.3 Initialisation and mutation
    7.3.4 Results
    7.3.5 Discussion
  7.4 Summary

8 Conclusions
  8.1 The computational intelligence perspective
    8.1.1 Robot reality and simulation
    8.1.2 Incrementality and modularity
    8.1.3 Controller architectures and learning methods
    8.1.4 Co-evolution
  8.2 The game AI perspective
    8.2.1 Generalization and specialization
    8.2.2 Generation of diverse opponents
    8.2.3 Personalized content creation
  8.3 Summary and future research directions

Acknowledgements

First of all I want to thank those people who had a direct influence on the contents of this thesis. These include first and foremost my supportive supervisor, Simon Lucas, but also the people with whom I have collaborated on various papers. In order of the number of co-authored papers, which partially coincides with their influence on my development as a researcher, these are Renzo De Nardi, Hugo Marques, Alberto Moraglio, Alexandros Agapitos, Richard Newcombe, Peter Burrow, Magdalena Kogutowska and Edgar Galvan Lopez. I have also benefited from many stimulating discussions on the nature of research and academia with Owen Holland and Marie Gustafsson. As for the people who have supported and influenced me as a person during the last three years, these are too many to enumerate. A good way to sum it up is friends, family and Georgia. Thanks for the love, you'll have more of it back now that I'll be less preoccupied with finishing my PhD! Of course, I am also grateful to the University of Essex and EPSRC for the scholarship that made all this possible in the first place.

Abstract

This thesis concerns the application of computational intelligence techniques, mainly neural networks and evolutionary computation, to computer games. This research has three parallel and non-exclusive goals: to develop ways of testing machine learning algorithms, to augment the entertainment value of computer games, and to study the conditions under which complex general intelligence can evolve. Each of these goals is discussed at some length, and the research described is also discussed in the light of current open questions in computational intelligence in general and evolutionary robotics in particular.

A number of experiments are presented, divided into three chapters: optimization, imitation and innovation. The experiments in the optimization chapter deal with optimizing certain aspects of computer games using unambiguous fitness measures and evolutionary algorithms or other reinforcement learning algorithms. In the imitation chapter, supervised learning techniques are used to imitate aspects of behaviour or dynamics. Finally, the innovation chapter provides examples of using evolutionary algorithms not as pure optimizers, but rather to innovate new behaviour or structures using complex, nontrivial fitness measures. Most of the experiments in this thesis are performed in one of two games based on a simple car racing simulator, and one of the experiments extends this simulator to the control of a real-world radio-controlled model car. The other games that are used as experimental environments are a helicopter simulation game and the multi-agent foraging game Cellz. Among the main achievements of the thesis are a method for personalised content creation based on modelling player behaviour and evolving new game content (such as racing tracks), a method for evolving control for non-recoverable robots (such as racing cars) using multiple models, and a method for multi-population competitive co-evolution.

Chapter 1

Introduction

Once upon a time, it was widely believed that we could build artificial intelligence ourselves. In other words, that we could understand the mechanisms of intelligence in sufficient depth and detail to be able to write a program (or construct a machine) that was intelligent. Some people still believe this; it's a perfectly respectable academic opinion. I don't believe we can do this, at least not within the foreseeable future. There is just too much we don't know about intelligence. But I think we will have more success with building systems that learn how to be intelligent. That is what this thesis is about: how to construct systems that learn how to be intelligent. More specifically, how to write software that learns to be intelligent within computer games. My two main objectives are to show how computer games can be used to further our understanding of how to create systems that learn to be intelligent, and how systems that learn to be intelligent can be useful in the construction of computer games.

1.1 Main scientific contributions

The background chapters of this thesis can come across as somewhat eclectic, and the experimental chapters describe more than a dozen groups of experiments (though the more central experiments are described in more detail than the others; this will be discussed in the next section). Therefore, it is important that I describe what I consider to be the main scientific contributions of the thesis. They can be divided into a general theoretic framework and a few specific experiments.

The general theoretic framework is the taxonomy of three different approaches to computational intelligence and games, and the discussion about the potential for CI in games and games in CI inherent in each approach, in chapter 3. Most current research concerning CI and games fails to appreciate the nature of computer games, and focuses on using them as testbeds for reinforcement learning algorithms in a rather simplistic way. I point out several other ways in which CI can be used for games, and games for CI, within a structured framework. In doing this I draw not only on the CI literature but also on research findings in e.g. paleobiology and game development.

While I consider all the experiments described in this thesis to be scientifically sound and of at least some interest, the four sets of experiments which I claim to be the most important contributions in this thesis are the following:

• The incremental evolution of general and specific driving skills in section 5.2. There, we show that there is such a thing as general driving skills in our car racing task, that this is something different from being able to drive a single track, and that general driving skills can be evolved through incremental evolution. We also show that general controllers can be specialized through further evolution, and thereby obtain higher fitness on particular tracks than controllers evolved directly for those tracks; for some tracks the difference is that between being able to drive the track at all or not. This technique can both have applications in computer games and represent some small progress towards a method for evolving truly complex general behaviour.

• The evolutionary personalised content creation experiments described in sections 6.1 and 7.3. The main invention here lies in the combination of modelling of human behaviour, evolution of controllers to match the modelled behaviour, and evolution of game content (in this case racing tracks) with entertainment metrics as fitness functions. These experiments combine several computational intelligence techniques in a novel way, and open up new applications of computational intelligence in games.

• The modelling of the dynamics of a radio-controlled car, and subsequent evolution of controllers in simulation which can then be used to control the real car, described in section 6.2. Like the content generation experiments, this experiment combines supervised learning with evolutionary computation in an at least partly novel way. A crucial invention needed to make this work was the use of several different models, acquired using different learning algorithms, during controller evolution to cancel out exploitable weaknesses of the models. With this approach, we believe we are the first to have used automated modelling techniques to evolve transferable controllers for a fast, dynamic, “non-recoverable” robot.

• The use of “diffuse” competitive co-evolution, where more than two populations are used, to compare different controller architectures, and to evolve more complex general behaviour than would be possible with single-population approaches, described in section 7.2. While diffuse competitive co-evolution has been used by a few researchers before (unbeknownst to us at the time of conducting the experiments), it seems never to have been used with different controller architectures in the different populations. Some of the experimental results were in line with predictions, such as the superior performance of the multi-population-evolved controllers, while some were unexpected and intriguing, especially the effect where faster-learning representations reach lower eventual fitness.

1.2 Organization of the thesis

The first four chapters of this thesis, including this chapter, are background chapters discussing the techniques, problems, issues and theories that will be used and/or investigated in the experiment chapters. Chapters 5, 6 and 7 are experimental chapters, describing the motivations for each experiment, the experimental methods particular to those experiments, and the results of the experiments together with some discussion. The experimental chapters are topically organised, depending on which approach to computational intelligence and games is taken by the experiments in the chapter. A final chapter puts the results in context and discusses future directions for the research.

1.3 List of papers

This thesis is based on theoretical and experimental research which has previously been written up in 12 scientific papers, all of which have passed the peer-review process and been accepted for publication in the proceedings of various international conferences, symposia, and workshops with high academic standards. Much of the same research is also currently being written up or under review for archival publication in journals and books. This section lists the papers on which the thesis is based, along with the section(s) of the experimental chapters in which the experiments in each paper are discussed. The theory, related work and methods of each paper are instead discussed mostly in the background chapters of the thesis.

One problem with basing a doctoral thesis on 12 papers is that the amount of research to describe threatens to make the thesis too long and lose focus. I have therefore selected five papers to be discussed extensively, both because they are some of the better papers out of the 12 and because they are central to the arguments of the thesis. This means that the experiments in those papers will be described in rather more detail in the thesis than in the papers. The experiments in the other seven papers will be discussed briefly, at about the level of detail of an extended abstract. For those papers, the published paper rather than the discussion in the thesis is to be considered the authoritative version. (All papers are available for downloading from my web site, http://julian.togelius.com.) However, note that most of the papers share at least some theory, related work, and methods, and these will be discussed in the background chapters. These are the papers:

• Julian Togelius and Simon M. Lucas (2005): Forcing neurocontrollers to exploit sensory symmetry through hard-wired modularity in the game of Cellz. Proceedings of the IEEE Symposium on Computational Intelligence and Games, 37-43. Briefly discussed in section 5.4.2. [143]

• Julian Togelius and Simon M. Lucas (2005): Evolving Controllers for Simulated Car Racing. Proceedings of IEEE Congress on Evolutionary Computation, 1906-1913. Extensively discussed in section 5.1. [142]

• Renzo De Nardi, Julian Togelius, Owen Holland and Simon M. Lucas (2006): Evolution of Neural Networks for Helicopter Control: Why Modularity Matters. Proceedings of the IEEE Congress on Evolutionary Computation. Briefly discussed in section 5.4.1. [36]

• Julian Togelius and Simon M. Lucas (2006): Evolving robust and specialized car racing skills. Proceedings of the IEEE Congress on Evolutionary Computation. Extensively discussed in section 5.2. [145]

• Julian Togelius and Simon M. Lucas (2006): Arms races and car races. Proceedings of Parallel Problem Solving from Nature. Briefly discussed in section 7.1. [144]

• Julian Togelius, Renzo De Nardi and Simon M. Lucas (2006): Making racing fun through player modeling and track evolution. Proceedings of the SAB Workshop on Adaptive Approaches to Optimizing Player Satisfaction. Briefly discussed in sections 6.1 and 7.3. [139]

• Hugo Marques, Julian Togelius, Magdalena Kogutowska, Owen Holland and Simon M. Lucas (2007): Sensorless but not Senseless: Prediction in Evolutionary Car Racing. Proceedings of IEEE ALife. Briefly discussed in section 6.4.1. [83]

• Simon M. Lucas and Julian Togelius (2007): Point-to-Point Car Racing: an Initial Study of Evolution Versus Temporal Difference Learning. Proceedings of IEEE CIG. Briefly discussed in section 5.3.2. [80]

• Julian Togelius, Renzo De Nardi and Simon M. Lucas (2007): Towards automatic personalised content creation for racing games. Proceedings of IEEE CIG. Extensively discussed in sections 6.1 and 7.3. [140]

• Alexandros Agapitos, Julian Togelius and Simon M. Lucas (2007): Evolving controllers for simulated car racing with object-oriented genetic programming. To be presented at Gecco. Briefly discussed in section 5.3.1. [3]

• Julian Togelius, Renzo De Nardi, Hugo Marques, Richard Newcombe, Simon M. Lucas and Owen Holland (2007): Nonlinear dynamics modelling for controller evolution. To be presented at Gecco. Extensively discussed in section 6.2. [141]

• Julian Togelius, Peter Burrow and Simon M. Lucas (2007): Multi-population competitive co-evolution of car racing controllers. To be presented at IEEE CEC. Extensively discussed in section 7.2. [138]

In addition, the following papers were published during my PhD, but are not included in this thesis in order to keep its length manageable:

• Alberto Moraglio, Julian Togelius and Simon M. Lucas (2006): Product Geometric Crossover for the Sudoku Puzzle. Proceedings of the IEEE Congress on Evolutionary Computation. [94]

• Alberto Moraglio and Julian Togelius (2007): Geometric Particle Swarm Optimization for the Sudoku Puzzle. To be presented at Gecco. [93]

• Alexandros Agapitos, Julian Togelius and Simon M. Lucas (2007): Multiobjective techniques for the use of state in genetic programming applied to simulated car racing. To be presented at the IEEE Congress on Evolutionary Computation. [4]

1.4 Notes on style

In this chapter, I use the word I to refer to myself and separate my views from those of others. In the subsequent chapters, I will use the word we. The main reasons for this are that much of the experimental work was done in collaboration with other researchers, that most of my background reasoning has been discussed with others, and that I don't want to switch between the two forms of personal pronoun throughout the thesis. At the very least, all of the research has been done under Simon's supervision. So I think it is justified to use the first person plural throughout the thesis, even though it should sometimes be construed as referring to the reader and the author (me and you), or perhaps as pluralis majestatis; it is in any case not to be taken as evidence of multiple personalities of the author.

Further, I consider it important that this thesis is reasonably easy to read; if possible it should even be pleasant to read. This has a number of minor consequences, such as the limited use of acronyms, and the occasional use of interchangeable terms. A more important consequence is that several procedures and lines of reasoning that other authors would have presented in the form of equations are here presented in plain English. I consider excessive use of mathematics to be disastrous for readability, and I therefore present some of the material in mathematical form only when I am more or less forced to.

Chapter 2

Computational intelligence

This chapter surveys a range of computational intelligence methods that will be used in the experimental parts of this thesis, and the particular issues within those methods that will be explored in the experiments. The first section is on evolutionary algorithms, a broad class of algorithms mimicking natural evolution, that are used in at least some part of all the experiments in this thesis. The next section is on neural networks, a class of function representations inspired by biological nervous systems, that is again used (in some form) in almost all the experiments in this thesis. Evolutionary robotics, which is a field that uses evolutionary computation and (most often) neural networks for robot control, and which this thesis can be seen either as a contribution to or as closely related to, is the subject of the third section. The first three sections are by necessity only shallow overviews, as the subject of each of them is a full-blown research field in its own right. For the final section, however, we go into some more depth on a number of issues that arise when using evolutionary computation to create intelligent behaviour. These issues, which are being actively researched within both the evolutionary computation and evolutionary robotics communities, are the role and representation of sensors and actuators, learning curves and incremental evolution, competitive co-evolution, modularity, different controller representations and stateful versus reactive control. Each of these issues is addressed and explored in at least one of the experiments in the thesis, most of them in several experiments. Hopefully, the research described in this thesis will also have contributed at least something to the understanding of each of the issues.

2.1 Evolutionary computation

Evolutionary computation is a label stuck to a number of algorithms inspired by Darwin's theory of biological evolution by natural selection, and also to the scientific field that studies these algorithms. To what extent these algorithms actually reproduce the mechanisms of biological evolution varies wildly, but most computer scientists tend to disregard the biological interpretation of evolutionary algorithms and treat them simply as global optimizers. Seen from this perspective, evolutionary algorithms are enormously versatile, as they can optimize any problem as long as the solution can be expressed as a sequence of parameters, and the fitness (quality) of a candidate solution can be measured quickly and reliably. Compared to other optimization techniques, little or no domain knowledge is required, though better results can usually be achieved when infusing some domain knowledge [40].

To get a bit more concrete, a typical simple evolutionary algorithm works like this: first, a number of candidate solutions (alternatively called chromosomes, genomes or individuals) to the problem are generated randomly; the set of all of them is called the population. The evolutionary process then proceeds in generations. In each generation, all individuals in the population are evaluated for how well they solve the problem at hand, and given a numerical fitness by a fitness function. Then, some sort of selection takes place, whereby the more fit individuals are kept and the less fit individuals disposed of and replaced with new individuals. The new individuals are created either from copies of the existing good (more fit) individuals, or from recombination (crossover, blending together) of two or more good individuals. After that, mutation (small random changes) is applied to all or some individuals in the population, and the next generation starts. This process is repeated for a fixed number of generations, or until a certain predetermined fitness level is reached by some member of the population.

Why this process works is easy to see on a common-sense level, if we visualize the algorithm as moving the individuals through a multi-dimensional search space where each individual constitutes a point in space, and changing some value of an individual moves that point in the corresponding direction. (If the y-dimension of the space is taken to be the fitness of the individual, we talk about a fitness landscape.) Bad individuals are continuously removed, good individuals are kept, and new individuals are generated based on making random changes to copies of the good individuals. Thus, the fitness of the best individual in the population can never decrease (except in the case of noisy fitness, when two consecutive evaluations of the same individual yield different fitness values, or when elitism is not used), and from the mutations better individuals are likely to emerge, if they can be found in the vicinity of the best individuals. As a result, the best individual of the population moves closer to a local optimum or global optimum, a point from which there are no fitter individuals nearby; this can be visualized as a peak in the fitness landscape. However, mathematically proving that this process works has turned out to be extremely difficult, and currently no general convergence proof of evolutionary algorithms exists.
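
As an illustration, the generational loop just described might be sketched as follows (in Python, used here purely for illustration; the function names, the truncation selection and the parameter values are our own choices rather than anything prescribed by the algorithms used in this thesis):

    import random

    def evolutionary_algorithm(fitness, random_individual, recombine, mutate,
                               population_size=100, generations=100, elite_fraction=0.5):
        # Generate the initial population randomly.
        population = [random_individual() for _ in range(population_size)]
        for _ in range(generations):
            # Evaluate all individuals and keep the more fit ones.
            population.sort(key=fitness, reverse=True)
            elite = population[:int(population_size * elite_fraction)]
            # Replace the less fit individuals with mutated recombinations
            # (or copies) of the surviving individuals.
            offspring = [mutate(recombine(random.choice(elite), random.choice(elite)))
                         for _ in range(population_size - len(elite))]
            population = elite + offspring
        return max(population, key=fitness)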

The description of an evolutionary algorithm above is only intended to represent a typical algorithm, and in the vast and diverse field of evolutionary computation every single part of the mechanism described has been changed or replaced at some point. In their introduction to evolutionary computation, Eiben and Smith introduce a taxonomy of evolutionary algorithms based on how they vary in six important dimensions [40]: representation, fitness function, population, parent selection mechanism, variation operators (mutation and recombination), and survivor selection (replacement). However, such an extensive overview is outside the scope of this chapter. Instead, we will here discuss the main evolutionary algorithms that will be used throughout the dissertation.

2.1.1 Evolution strategies

Evolution strategies (ESs) are particularly well-suited to optimizing problems represented as vectors of real numbers. The basic evolution strategy was introduced in the early 1960's by Ingo Rechenberg and Hans-Paul Schwefel, and was originally used for optimizing aircraft wings in wind tunnels [9]. Here we present the simple ES we use in many of the experiments in this thesis, and which forms the basis for the algorithms used in some other experiments.

1. At the start of a run, a population is created, containing µ + λ individuals. The first µ individuals are also called the elite, while the other λ individuals are called the tail. Usually, we set µ and λ to the same number, somewhere between 15 and 50. Each individual is a fixed-length vector of real numbers, with all numbers initially set to small random values.

2. The population is sorted, in order from higher-scoring to lower scoring individuals.

3. The λ worst-scoring individuals (the tail) are removed from the population.

4. The removed individuals are replaced with copies of the first µ individuals (the elite). If µ = λ, each individual in the elite is copied once.

5. Mutation is applied to the λ new individuals in the tail. This means adding random numbers to all the real numbers in the vector constituting the mutated individual. These random numbers are drawn from a Gaussian distribution with mean 0 and some low standard deviation, typically 0.1. This standard deviation is also referred to as the mutation magnitude.

6. All individuals in the population are evaluated by the fitness function. If the fitness evaluations are noisy (i.e. the fitness for the same individual changes between evaluations due to random effects) each individual is evaluated several times (usually between 3 and 10) and the average of these values is used.

7. If the requisite number of generations has passed, or the highest fitness value in the population exceeds the desired fitness, the algorithm terminates and the most fit individual is returned as its result. Otherwise, the number of generations passed is increased by one, and the algorithm returns to step 2.

Note that no recombination is used, and that the mutation magnitude is constant throughout the run; self-adaptation is not used.
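
A minimal sketch of this evolution strategy, assuming a fitness function that maps a real-valued genome to a (possibly noisy) score, might look as follows; the function and parameter names, the initialisation range and the number of repeated evaluations are illustrative choices, not specifications taken from the experiments:

    import random

    def evolve_es(fitness, genome_length, mu=25, lam=25, generations=100,
                  mutation_magnitude=0.1, evaluations_per_individual=3):
        # Step 1: mu + lambda individuals, initialised to small random values.
        population = [[random.gauss(0.0, 0.1) for _ in range(genome_length)]
                      for _ in range(mu + lam)]
        best = population[0]
        for _ in range(generations):
            # Step 6: evaluate each individual, averaging repeated evaluations
            # to smooth out noisy fitness values.
            scored = [(sum(fitness(ind) for _ in range(evaluations_per_individual))
                       / evaluations_per_individual, ind) for ind in population]
            # Step 2: sort from higher-scoring to lower-scoring individuals.
            scored.sort(key=lambda pair: pair[0], reverse=True)
            ranked = [ind for _, ind in scored]
            best, elite = ranked[0], ranked[:mu]
            # Steps 3-5: discard the tail, refill it with copies of the elite,
            # and apply Gaussian mutation to the copies.
            tail = [[w + random.gauss(0.0, mutation_magnitude) for w in parent]
                    for parent in (elite * ((lam // mu) + 1))[:lam]]
            population = elite + tail
        return best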

2.1.2 Genetic algorithms

Genetic algorithms (GAs) were introduced by John Holland in 1975 as a tool for biological modelling. The original version of genetic algorithms worked with individuals represented as strings of bits, and the variation operators worked on the level of flipping and exchanging bits; nowadays, however, genetic algorithms have been developed to work with a wide variety of representations. A defining feature of GAs is that they all use some sort of recombination, though the emphasis on recombination varies. A typical simple GA works very much like the ESs discussed above, except for the inclusion of recombination:

1. At the start of a run, a population is created, containing n individuals. The first part of the population is referred to as the elite, and the second part as the tail; the proportion between these parts depends on the experimental setup, with the elite usually considerably smaller than the tail. The representation of the individuals is dependent on the problem at hand.

2. The population is sorted, in order from higher-scoring to lower scoring individuals.

3. The tail is removed from the population.

4. The removed individuals are replaced with new individuals. These new individuals are created by recombining two randomly chosen individuals in the population, and mutating the resulting individual. Exactly how the recombination and mutation work is dependent on the problem at hand.

5. All individuals in the population are evaluated by the fitness function. If the fitness evaluations are noisy each individual is evaluated several times and the average of these values is used.

6. If the requisite number of generations has passed, or the highest fitness value in the population exceeds the desired fitness, the algorithm terminates and the most fit individual is returned as its result. Otherwise, the number of generations passed is increased by one, and the algorithm returns to step 2.
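
The recombination step, which is what distinguishes this algorithm from the evolution strategy above, could for a real-valued representation be sketched like this; one-point crossover is just one of the many possible recombination operators, and the function names are our own:

    import random

    def one_point_crossover(parent_a, parent_b):
        # Splice two parent genomes together at a random cut point.
        cut = random.randint(1, len(parent_a) - 1)
        return parent_a[:cut] + parent_b[cut:]

    def make_offspring(survivors, mutation_magnitude=0.1):
        # Step 4 above: recombine two randomly chosen surviving individuals,
        # then mutate the resulting individual (Gaussian mutation, since a
        # real-valued representation is assumed here).
        parent_a, parent_b = random.sample(survivors, 2)
        child = one_point_crossover(parent_a, parent_b)
        return [gene + random.gauss(0.0, mutation_magnitude) for gene in child]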

2.1.3 Cascading Elitism

In many cases, there is more than one fitness measure to maximize; a problem where this is the case is called a multiobjective problem. Evolutionary multiobjective optimisation is a rich and active research field, and it would take us too far from the topics of this thesis to discuss that field here. However, in the experiments in sections 6.1 and 7.3 we have several different objectives, and needed a way of handling this. We invented a very simple solution to this, namely an evolution strategy with multiple elites. In the case of three fitness measures, it works as follows: out of a population of 100, the best 50 genomes are selected according to fitness measure f1. From these 50, the 30 best according to fitness measure f2 are selected, and finally the best 20 according to fitness measure f3 are selected. Then these 20 individuals are copied four times each to replenish the 80 genomes that were selected against, and finally the newly copied genomes are mutated. This algorithm, which we call Cascading Elitism, is inspired by an experiment by Jirenhed et al. [68]. We have not analyzed the effects of this algorithm in detail, nor have we compared it to “proper” multiobjective optimization algorithms; we have merely established that it works for the particular problem instances we developed it for.

In both of the cases where multiple fitness functions are used in this thesis, the component fitnesses are listed as f1, f2 and f3. In all other cases, only a single fitness function is used. In most cases this fitness function is simple enough that it is described in plain text. The exception is the helicopter evolution experiments, where an equation is given.
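
For concreteness, one generation of Cascading Elitism with the population and elite sizes given above (100, 50, 30 and 20) might be sketched as follows, with the three fitness measures passed in as functions; the function name and the mutation details are illustrative:

    import random

    def cascading_elitism_generation(population, f1, f2, f3, mutation_magnitude=0.1):
        # Select 50 of 100 genomes by f1, 30 of those by f2, then 20 of those by f3.
        survivors = sorted(population, key=f1, reverse=True)[:50]
        survivors = sorted(survivors, key=f2, reverse=True)[:30]
        elite = sorted(survivors, key=f3, reverse=True)[:20]
        # Copy each of the 20 remaining genomes four times to replace the 80
        # genomes that were selected against, and mutate the copies.
        offspring = [[g + random.gauss(0.0, mutation_magnitude) for g in parent]
                     for parent in elite for _ in range(4)]
        return elite + offspring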

2.2 Neural networks

Neural networks are mathematical structures that are, to varying degrees, inspired by biological nervous systems. There are probably more published academic papers concerning neural networks than anyone could read in a lifetime, and the variety of different types of networks and learning algorithms is dizzying. For this reason we will here limit the discussion to the particular class of neural networks that is used in the experiments in this thesis, namely the multi-layer perceptron (MLP) and some variations and derivations thereof. We will also limit our perspective on these networks to seeing them as function approximators, and not as models of biological nervous systems. These are arguably uncontroversial decisions, as the MLP is almost certainly the most common type of neural network used in applied research, and is only distantly related to the networks used for modelling in neuroscience. Modern introductions to neural networks from pattern recognition and machine learning perspectives can be found in [10] and [91].

2.2.1 Multi-layer perceptrons

The MLP, together with the backpropagation learning algorithm, was popularized in 1986 by Rumelhart and McClelland [114]. It can be seen as a generalization of the perceptron, a simple learning structure first described by McCulloch and Pitts in the 1940s [85]. Several theoreticians have proved that a three-layer MLP can be made to approximate any numerical function, provided that the requisite number of hidden neurons is present (see [10] for details). While these results are certainly interesting and cause for cautious optimism, the practical feasibility of training an MLP to approximate a particular function is another matter. Some of the algorithms used and issues faced in this generally non-trivial task will be discussed in this chapter.

The basic abstractions in an MLP are the neuron and the connection. The neurons are organized in layers; usually three layers, called the input layer, the hidden layer and the output layer respectively. Connections are directional, and in a standard fully-connected MLP (in this thesis we assume all networks are fully connected unless another structure (topology) is explicitly given) all the neurons in the input layer are connected one-way to all the neurons in the hidden layer, and all the neurons in the hidden layer are connected one-way to all the neurons in the output layer. This means that an MLP has i ∗ h + h ∗ o connections, where i, h, and o are the number of neurons in the input, hidden and output layer respectively.

Neurons and connections contain one numerical value each, but the connections act as constants and the neurons as variables. This means that when a neural network is used for approximating a function (propagating a vector of values), the values of the neurons change but not those of the connections. When training the network, however, the connection values (also called weights) change. It also means that the functioning of a (fully connected) MLP is completely specified by its connection weights.

Propagation

So, how does a multi-layer perceptron actually work - what does it do? It works like this. First the input to the network is placed in the input layer, one value in each neuron (the input has to be a vector whose dimensionality equals the number of neurons in the input layer). Then, the value of each neuron in the hidden layer is calculated as the sum of all the contributions to that neuron from the input neurons. The contribution from each input neuron is defined as the value of that neuron multiplied by the weight of the connection from that particular input neuron to that particular hidden neuron. After the summing up, a transfer function or squashing function is applied to the value of each neuron, introducing the nonlinearity that is crucial for the universal computational power of the MLP. In the networks in this thesis, the hyperbolic tangent (tanh) transfer function is used. Expressed differently, the activation of the hidden neuron becomes:

h_j = \tanh\left(\sum_{i=1}^{n} i_i \ast w_{ij}\right) \qquad (2.1)

The propagation from hidden layer to output layer works in exactly the same way. It is worth noting that the layers don't have to be of the same sizes at all; 5 inputs, 3 hidden neurons and 176 outputs is a perfectly possible (though improbable) MLP topology. (In most descriptions of the MLP, it is stipulated that there is an extra bias connection to every neuron in every layer from a constant input with the value 1, in order to achieve full computational power; otherwise, an input of all zeroes could not result in a non-zero output. In our implementation, we are instead fixing one of the inputs to 1 in all experiments.)
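
As a sketch of equation (2.1), together with the identical hidden-to-output step, the forward pass through a fully connected MLP could be written like this; the weight layout (lists of lists) and the function name are our own choices, and the bias discussed above is assumed to be one of the inputs, fixed to 1:

    import math

    def propagate(inputs, weights_in_hidden, weights_hidden_out):
        # weights_in_hidden[i][j] connects input i to hidden neuron j;
        # weights_hidden_out[j][k] connects hidden neuron j to output k.
        hidden = [math.tanh(sum(inputs[i] * weights_in_hidden[i][j]
                                for i in range(len(inputs))))
                  for j in range(len(weights_in_hidden[0]))]
        outputs = [math.tanh(sum(hidden[j] * weights_hidden_out[j][k]
                                 for j in range(len(hidden))))
                   for k in range(len(weights_hidden_out[0]))]
        return outputs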

Backpropagation

That is how an MLP works once its connection weights are set. To train MLPs, a number of different algorithms can be used. The next section describes how evolutionary computation is used to train networks; here, we describe the backpropagation algorithm for supervised learning with multi-layer perceptrons. Supervised learning refers to approximating the function underlying a set of training data, which can be expressed as a list of pairs of inputs and outputs. The training data could consist of various demographic figures (inputs) and average house prices (outputs) for a number of UK towns (data points), or, to take an example from the experiments in this thesis, prior state and control signal (inputs) and next state (outputs) over a few thousand samples taken from a motion capture system at 20 Hz (data points). If there is enough data, the data is representative enough of the underlying function (not too much noise), the underlying function is not too complicated, and the network has the right size and topology, the backpropagation algorithm can usually train the network to reproduce the underlying function with good accuracy.

The basic idea in the backpropagation algorithm is to go through all the data points several times, and for each data point feed the input to the network, and compare the output of the network (the prediction) with the “real thing”, the output that is part of the data point (the target). When the prediction differs from the target, the weights of the network are changed so as to better match the target. This is done by “descending the error gradient”; moving the value of each connection in the direction that minimises the error that that connection contributed to in the last forward propagation. To do this, we must calculate how much each connection weight has contributed to the error and in what direction. More formally, the equation for the weight updates is the delta rule:

\Delta w_{ij} = \eta \ast \delta_j \ast y_i \qquad (2.2)

Here, ∆w_{ij} is the change in the weight of the connection from i to j, η is the learning rate (a small positive constant value), and y_i is the input signal to node j (the signal that is being propagated from i to j). δ_j is the local gradient, which, if j is an output node, is defined as:

\delta_j = a_j \ast (1 - a_j)(t_j - a_j) \qquad (2.3)

where a_j is the activation of output node j, and t_j is the target value for that output. If we are looking for the local gradient for a hidden node, it is:

\delta_j = a_j \ast (1 - a_j) \sum_{k \in \text{outputs}} w_{jk} \, \delta_k \qquad (2.4)

where w_{jk} is the weight of the connection from hidden neuron j to output k, and δ_k is the local gradient of that output. What is outlined here are the central equations behind plain vanilla backpropagation. Even disregarding the myriad variations of and improvements on basic backpropagation, there are several technical considerations to be aware of in the application of the algorithm which we will not discuss here; interested readers are referred to Bishop's textbook [10]. In this thesis, only the “orthodox” backpropagation algorithm will be used.
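
For a single training example, the update defined by equations (2.2)-(2.4) might be sketched as follows. Note that the a ∗ (1 − a) factor in equations (2.3) and (2.4) is the derivative of the logistic (sigmoid) activation function, so the sketch uses logistic units throughout; the learning rate and function names are illustrative:

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def backprop_update(inputs, targets, w_ih, w_ho, eta=0.1):
        # Forward pass with logistic units (matching the a * (1 - a) derivative
        # used in equations 2.3 and 2.4).
        hidden = [sigmoid(sum(inputs[i] * w_ih[i][j] for i in range(len(inputs))))
                  for j in range(len(w_ih[0]))]
        outputs = [sigmoid(sum(hidden[j] * w_ho[j][k] for j in range(len(hidden))))
                   for k in range(len(w_ho[0]))]
        # Equation (2.3): local gradients of the output nodes.
        delta_out = [outputs[k] * (1.0 - outputs[k]) * (targets[k] - outputs[k])
                     for k in range(len(outputs))]
        # Equation (2.4): local gradients of the hidden nodes.
        delta_hidden = [hidden[j] * (1.0 - hidden[j]) *
                        sum(w_ho[j][k] * delta_out[k] for k in range(len(outputs)))
                        for j in range(len(hidden))]
        # Equation (2.2), the delta rule, applied to both layers of weights.
        for j in range(len(hidden)):
            for k in range(len(outputs)):
                w_ho[j][k] += eta * delta_out[k] * hidden[j]
        for i in range(len(inputs)):
            for j in range(len(hidden)):
                w_ih[i][j] += eta * delta_hidden[j] * inputs[i]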

2.2.2 Evolutionary neural networks

Evolutionary neural networks are the result of the happy marriage of two excellent ideas. The name refers to the research field (and application practice) of using evolutionary algorithms to set the weights (and sometimes devise the topology) for neural networks, but it can also refer to the very networks that are developed using this method. The practice of evolving neural networks is also called neuroevolution. A good but somewhat dated introduction to the field with a substantial bibliography can be found in [155].

Like backpropagation, neuroevolution can be used to train a network to approximate a function given a list of data points representing the inputs to and outputs from the target function. All that is needed for supervised learning is a fitness function that returns a fitness proportional to how close the predictions of the network are to the targets. (The relative merits of backpropagation and neuroevolution for supervised learning is a topic which we will not discuss here.) However, evolutionary algorithms are not limited to supervised learning, but can also be used for optimization, reinforcement learning and unsupervised learning. Everything that is needed is a fitness function that somehow judges the performance of the network. Many of the experiments in this thesis use what could be considered one of the simplest forms of evolutionary neural network, namely an MLP whose weights are tuned by an evolution strategy. It works like this:

1. At the start of a run, a population of µ + λ individuals is created, each individual being a vector of real numbers of the same dimensionality as the number of connections in the neural network. Each dimension in the vector will be interpreted as a weight in the network. (Remember that a fully-connected MLP is completely specified by its connection weights.)

2. The population is sorted.

3. The tail is removed from the population.

4. The removed individuals are replaced with copies of the elite.

5. Gaussian mutation is applied to the new individuals.

6. Each individual in the population is translated into a neural network, which is then tested using the fitness function. The individual is assigned the fitness given to the neural network constructed from it.

7. End evolution if the time is nigh, or return to step 2.
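
The only ingredient not already covered by the evolution strategy in section 2.1.1 is the translation from genome to network in step 6, which might be sketched as follows, reusing the layered weight layout from the propagation sketch above; the function names and the task fitness callback are our own illustrative choices:

    def decode(genome, num_inputs, num_hidden, num_outputs):
        # Interpret a flat vector of i*h + h*o real numbers as the weights of a
        # fully connected MLP: first the input-to-hidden weights, then the
        # hidden-to-output weights.
        w_ih = [[genome[i * num_hidden + j] for j in range(num_hidden)]
                for i in range(num_inputs)]
        offset = num_inputs * num_hidden
        w_ho = [[genome[offset + j * num_outputs + k] for k in range(num_outputs)]
                for j in range(num_hidden)]
        return w_ih, w_ho

    def network_fitness(genome, task_fitness, num_inputs, num_hidden, num_outputs):
        # Step 6: translate the individual into a network and let the task
        # assign it a fitness.
        w_ih, w_ho = decode(genome, num_inputs, num_hidden, num_outputs)
        return task_fitness(w_ih, w_ho)

The resulting fitness function can then be plugged straight into an evolution strategy of the kind sketched in section 2.1.1.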

Neural networks can of course be evolved using other types of evolutionary algorithms, such as genetic algorithms or particle swarm optimization. However, while mutation, especially Gaussian mutation, usually is an efficient variation operator for neural network weights, recombination between neural networks can sometimes have disastrous effects if not done right (many types of recombination are possible). This is because of the competing conventions problem: given that neural networks that behave identically can have very different internal weights (e.g. different hidden neurons can be used for connecting the same input and output in the same way), recombination between two fully functioning networks can easily result in an offspring of very low fitness1.

Several specialized evolutionary algorithms for evolving neural networks have been developed during the last decade. These include Moriarty's SANE, and Gomez' ESP and CoSyNE, which all rely on (different) forms of cooperative co-evolution to speed up the evolution of fixed-topology networks. The scope of these methods is not yet established, but they have managed to provide substantial speed-ups for some specific control and pattern recognition tasks [95][50][51].

1 Literally a “worthless bastard”

Evolving topologies as well as weights

While the above methods all presuppose a fixed neural network topology and just evolve the weights of the neural network, other algorithms have been developed that evolve the topology of the network as well. The argument here is that the human designer knows neither what topology is best for representing the solution to a particular problem, nor what topology makes it easiest to evolve such a solution. (We will discuss how a network might be able to represent but not learn a particular function later on in this chapter.) Many researchers would just use a fully-connected feedforward network with about as many hidden neurons as inputs, but there is no guarantee at all that this is a good topology for a particular problem.

Algorithms for evolving both topologies and connection weights can roughly be divided into indirect and direct approaches (but see [127] for a more refined taxonomy, including an in-depth discussion of different types of indirect encodings). The indirect approaches are so called because they, taking clues from biology, distinguish between genotype and phenotype. The individuals that are evolved by the evolutionary algorithm, the genotypes, are not the neural networks in themselves but rather a procedural specification for how a neural network should be built. Each time an individual is evaluated, the genotype must go through a developmental process to yield its phenotype, the actual neural network which is then evaluated by the fitness function. Early examples of neuroevolution using indirect encodings include Kitano's algorithm based on matrix rewriting [69], and Gruau's Cellular Encoding, where the genotype is an expression tree, which is “executed” in the development phase to construct a neural network [54][53]. Gruau's research inspired many researchers to develop similar systems, such as Hornby et al. [61] and Kodjabachian and Meyer [70]. Other researchers have gone back to biology for more inspiration, and returned with ideas such as artificial cell division and movement [23][66] and genetic regulatory networks [12], creating encodings that require rather complicated processes to go from genotype to phenotype.

Whereas all of the above algorithms have been proven to be able to evolve neural networks for at least some problem, such indirect encoding schemes are very rarely seen in application-oriented research, or indeed in any research papers except the ones where they are first reported. Further, there is a conspicuous absence of really impressive controllers or pattern recognizers evolved using indirect encodings. The reason for this seems to be that such encodings don't work very well in practice. This becomes clear in those cases where head-to-head comparisons have been made between direct and indirect encodings [118]. It is important to remember that any indirect encoding favours certain topologies over others, and might even make some parts of design space completely inaccessible. The task of creating a generally efficient indirect encoding is complex, interesting and unsolved.

Direct encoding approaches, however, can be made quite efficient. In these approaches, the genotype and the phenotype are the same; the variation operators of the evolutionary algorithm act directly on the topology and weights of the network. One direct-encoding topology-evolving algorithm which has attracted some attention for working well in practice is Stanley's NEAT, NeuroEvolution of Augmenting Topologies [126][123]. NEAT is a rather complex algorithm that uses protected subpopulations to preserve structural innovations until they become viable, and innovation tracking that matches specific connections to their homologous equivalents in other individuals in order to alleviate the competing conventions problem for crossover. The central idea, though, is complexification. At the start of an evolutionary run, all individuals are minimal neural networks, with only one connection from one neuron in the input array to one of the output neurons. The mutation operator then works by adding, disabling (removing) and splitting connections as well as changing the weights of existing connections. A connection can be added between any two neurons; when a connection is split, a new neuron is formed in the middle of the old connection, connections are added from the start neuron of the old connection to the new neuron, and from the new neuron to the end neuron of the old connection. In this way, networks of arbitrary topology and complexity can be created. In empirical comparisons, NEAT has been shown to be very competitive with evolving the weights of fixed-topology neural networks, sometimes performing better but sometimes not performing as well [123][51].

Evolving the topologies of neural networks together with their weights is a very interesting idea and research direction, and topology evolution might very well be a necessary part of any generally efficient neuroevolution algorithm capable of handling truly complex problems. However, topology evolution is not a focus of this thesis, and in order to be able to concentrate on other issues all the neuroevolution experiments in the experimental chapters employ fixed topologies. This being said, further investigation and understanding of the issues explored by this thesis (especially incrementality and modularity) will no doubt help in the development of the next topology-evolving algorithm.

2.3 Evolutionary robotics

In evolutionary robotics, evolutionary algorithms are used to construct the controllers (usually based on neural networks), and sometimes the body morphologies, of real or simulated robots. The research field took its beginnings in the early nineties with the work of Harvey, Husbands and collaborators in Sussex, and Nolfi, Floreano and collaborators in Italy. The book by the latter two is a good, but again somewhat dated, introduction to the field [102]. Their work, in its turn, owes much to Rodney Brooks' pioneering of behaviour-based robotics, which focused robotics research on controlling robots by combinations of simple reactive behaviours, rather than the complicated sense-plan-act loop driven by symbolic logic that had been the only paradigm in town until the mid-eighties [14].

The most popular experimental platform for evolutionary robotics is probably the Khepera robot, a small (fist-sized) circular robot with two wheels, and eight sensors that can act either as directional light sensors or infrared range-finder sensors [92]. The robot has limited onboard processing capacity, but can be connected via a wire to be controlled by a desktop computer, and can be equipped with a gripper and a rudimentary linear (one-dimensional) camera. Importantly, the robot is semi-holonomic, meaning that it can turn on the spot, and due to its low weight it can stop and accelerate to full speed almost instantaneously.

Using configurations of one or several Kheperas or similar robots on tabletop arenas with walls and other obstacles, controllers were evolved for many simple tasks, including variations on wall following and phototaxis [56], “garbage collection” (picking up and moving red objects) [103], and simple life-time learning capacities [44][147]. Recently, many of the original evolutionary roboticists have focused on evolving collective behaviours in swarms of more advanced but still semi-holonomic miniature robots [39], whereas other researchers have focused on e.g. evolving morphologies [61], evolving controllers for legged robots [156], evolving robots capable of modelling their own bodies [11], and using evolutionary robotics methods for investigating problems in theoretical biology.

The above is by no means intended as a complete overview of the state of the art in evolutionary robotics - the space not permitting, and this thesis not mainly being about evolutionary robotics anyway. However, from an overview of the literature in the field several observations can be made. The most important of these is that so far, no truly complex intelligence has been evolved. The most complex tasks which evolved robots have been able to solve were solved with a combination of very simple reactive behaviours, and they usually seem easily solvable by standard engineering methods. Simply stated, evolutionary robotics fails to scale up. Given the rather obvious and grand promise of using the same method as nature did for creating complex intelligence, this can be quite puzzling - nature obviously succeeded, so why don't we?

A lesser, but related, observation is that the vast majority of evolutionary robotics experiments are done using robots that are more or less stable, can start or stop more or less instantaneously, and if they are wheeled, turn on the spot. This is a striking difference to most real-world vehicles and animal bodies. Neither a car nor an airplane can stop instantaneously or turn on the spot, and when moving both require constant attention so as not to crash, in contrast to a Khepera, which will usually stop and just stand there if movement commands cease. Similarly, a human body requires a complicated chain of movement to get from one position (standing up, right arm stretched out) to another (sitting down, legs crossed, hands clenching each other), and requires a constant energy expenditure and stream of compensating movement commands to not just fall over. In contrast, a Khepera's gripper takes an integer between 0 and 255 as input, promptly moves to the desired position and stays there until a new command is received. Even though other researchers have used more complex robot morphologies, they are very rarely as complex as human bodies or as dynamic as an airplane or fast car. Given the importance many philosophers of cognitive science place on the interplay of the nervous system with a complex body for the emergence of intelligence, this is rather notable [86][26][97][107].

A third observation is that the dimensionality of the input space is generally low. A Khepera has 8 (or 16) sensors, and this is usually the length of the array of inputs to the neural network. Even when using more complex robots it is rare that more than ten input neurons are used. When visual input from a camera is used, the values of just a few of the pixels are usually input to the network; these pixels are either selected a priori by the human experimenter [29], or chosen by the neural network in a process similar to active vision [43]. It is hard to imagine how to respond intelligently to a complex world when presented with such low-dimensional sensory information - how would you get along with your daily tasks if you could experience your environment only through a grey-scale screen with a resolution of 4 × 4 pixels? Not very well at all, probably. We will return to this later on, but first we will discuss the issue of simulation in evolutionary robotics.

2.3.1 Reality and simulation

One of the main reasons for the virtual absence of fast, non-holonomic or morphologically complex robots in evolutionary robotics is that it is very hard to use such robots with an evolutionary algorithm. In evolution, we need to give the robot roughly equal starting positions in each trial. Therefore, any action that breaks the robot or moves the robot to a position from which it is not trivial to move it back to the starting position (such as the robot having its legs and arms tangled, being upside down, or just being out of range of the tracking system) is not permissible. However, evolution relies on producing random strategies, and testing them to see whether they work. So the only way to make sure that we can evolve behaviour is by using a robot so simple and robust that it can withstand any sort of action and then trivially be brought back to an initial position each trial. This pretty much excludes airplanes, racing cars and human bodies.

19 For the purposes of the discussion here, we can call robotic systems which it is trivial to automatically return to a good starting state (member of a set of a predefined starting states, possibly just a single position and orientation) regardless of the preceding sequence of actions recoverable, and a robotic system where this is not possible non-recoverable. A good example of a recoverable system is a speed limited Khepera robot on a walled tabletop covered by an overhead webcam; good examples of non-recoverable systems includes almost any mobile robotic or vehicular system which is actually used for something. The obvious answer to this problem is to evolve in simulation. Problem solved: just reset the simulation at the start of each trial! An added benefit would be a huge potential speedup, as an efficient simulation could be run much faster than real-time. In practice, things are not that simple - far from. The reason that most evolutionary robotics research is still done on physical robots is that good simulations are hard to get by. If we had access to The Matrix2, we could do fantastic evolutionary robotics experiments effortlessly, but alas, we have not. To get a good simulator of a robot, we either have to build a model of the robot, based on physical theory and manual measurements, or learn a model, based on taking actions and observing the responses from the real robot and its environment. Or a combination of both. Complicating the problem of acquiring a model good enough to base a simulation on is the issue of model exploitation. Ample empirical experience has shown that evolutionary algorithms are fantastically good at “cheating”, in the sense of finding strategies that use models in ways that were not intended by the designer of the model, and would not work on the modelled system. For example, if wall collision handling only works properly at low speeds, the evolutionary algorithm will probably find a strategy that involves accelerating to high speeds and driving straight through walls; if the rangefinder sensors are more accurate in simulation than reality, the evolutionary might well find a strategy that relies on the exact sensor signatures of various parts of the arena, and will not work with on the real robot with its noisy sensors. Here, the celebrated capability of evolutionary algorithms to find unorthodox solutions that transcend the conventions of human invention turns into a troublesome tendency to exploit little mistakes and errors in a model, like a malicious lawyer uses his creativity to find holes in the law. One of the earliest approaches to acquiring models specifically for evolution is Jakobi’s radical envelope of noise hypothesis [65]. Jakobi advocates dividing aspects of the robot and environ- ment simulation into a base set and an implementation set. The base set contains all the aspects that are deemed (by the experimenter) to be required for a good evolved controller; these are subject to the same variability as present in their real world characteristics. The implementation

2“The Matrix” is a 1999 science fiction film in which the world turns out to be simulated on a massive computer.

set contains all other aspects, and is subject to variable amounts of noise, from none to massive, in order to discourage the evolution of controllers relying on features from that set (even on the presence of noise, which is why the noise levels are varied). For example, if a robot only needs one of its sensors for a particular part of a task, this sensor will be assigned to the base set and the other sensors to the implementation set (their readings in simulation will then be nothing but variable amounts of noise). In other parts of the experiment, other sensors might be part of the base set, or maybe only some aspects of the readings of some sensors. Jakobi does not prescribe any particular way of delineating base set aspects from implementation set aspects, nor any particular way of modelling the base set. This leaves the task of modelling the base set accurately unsolved, and adds the task of deciding what information is needed when - which, even if we manage to do it right, risks restricting the creativity of the evolutionary process. Jakobi built a Khepera simulator using this method (he used lookup tables of sensor readings recorded with the robot placed at various positions relative to walls, and returned the reading in the table most similar to the simulated situation) that was used for several experiments where controllers were evolved in simulation and transferred to the real robot with intact functionality. However, this seems not to have been done with any robot significantly more complex than the Khepera. A recent and very different approach, which has also been used with morphologically more complex robots, is Bongard et al.’s continuous self-modelling [11]. The process is best described as a cycle. The robot starts by performing an arbitrary action and recording its sensory consequences. This information is passed to a set of internal models that compete for the best explanation of the behaviour. In order to improve the quality of the self-model, the robot tries to gather further information by triggering different actions. The action selection mechanism is model-driven; it selects the next action based on the maximization of information, more specifically it selects the action that causes the greatest disagreement among the competing internal models. Once an action is selected, it is overtly executed and the whole cycle starts again. Using this highly innovative and interesting method, Bongard et al. managed to co-evolve a model and a controller for a legged robot directly in physical reality. However, it is very unclear how well this method would apply to a non-recoverable robot, especially one with complex dynamics or significant momentum. The crux here is that the action that maximizes information might very well be one that breaks the robot, or otherwise puts it in a non-recoverable state. Another possibility is to use a robot for which the dynamics have already been extensively studied, and use the best available knowledge to constrain the solutions that can be learnt. Abbeel et al. [1] used this technique to acquire a model of a large single-rotor model helicopter, a model which was then used to learn a controller for the helicopter using temporal difference

learning. Their work, though technically impressive, is characterized by the frequent use of domain knowledge; whenever possible, human knowledge of the functioning of the system and the characteristics of the desired task is used to constrain the model, and in the end a least-squares algorithm is used to fit a handful of parameters in a mathematical model. Their approach works well in the sense that it produces the desired behaviour, and nothing but the desired behaviour, but it is certainly not in the spirit of evolutionary robotics: nothing new was learned about how to achieve a solution to the task at hand, or about the capabilities of the robotic platform, except in a strictly quantitative sense. (Strictly speaking, their approach is not evolutionary robotics at all, as they do not use evolutionary algorithms, and they are probably happy with this being so.)
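To make Jakobi's base set/implementation set distinction more concrete, the sketch below shows one way a minimal simulation could resample the noise applied to implementation-set sensor channels at the start of every trial. It is an illustrative reconstruction only, not Jakobi's actual simulator: the class and parameter names are hypothetical, and the scheme is much cruder than his lookup-table approach.

```java
import java.util.Random;

/**
 * Illustrative sketch (not Jakobi's actual simulator): base-set sensor
 * channels are simulated with their measured real-world variability, while
 * implementation-set channels return nothing but noise, whose magnitude is
 * resampled at the start of every trial, from zero up to a massive level.
 */
public class EnvelopeOfNoiseSim {
    private final Random rng = new Random();
    private final boolean[] inBaseSet;        // which channels the task is deemed to need
    private final double baseSetVariability;  // matched to real-world variability
    private final double maxImplNoise;        // upper end of the noise envelope
    private final double[] implNoiseLevel;    // per-channel level, resampled each trial

    public EnvelopeOfNoiseSim(boolean[] inBaseSet, double baseSetVariability, double maxImplNoise) {
        this.inBaseSet = inBaseSet;
        this.baseSetVariability = baseSetVariability;
        this.maxImplNoise = maxImplNoise;
        this.implNoiseLevel = new double[inBaseSet.length];
    }

    /** Call at the start of every fitness trial. */
    public void startTrial() {
        for (int i = 0; i < implNoiseLevel.length; i++) {
            implNoiseLevel[i] = rng.nextDouble() * maxImplNoise;
        }
    }

    /** trueReadings would come from whatever (crude) model of the base set we have. */
    public double[] sense(double[] trueReadings) {
        double[] out = new double[trueReadings.length];
        for (int i = 0; i < out.length; i++) {
            if (inBaseSet[i]) {
                out[i] = trueReadings[i] + rng.nextGaussian() * baseSetVariability;
            } else {
                out[i] = rng.nextGaussian() * implNoiseLevel[i]; // pure noise, level varies per trial
            }
        }
        return out;
    }
}
```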

2.3.2 The assumptions and philosophy of evolutionary robotics

Most “mainstream” evolutionary robotics research is driven by a set of shared assumptions and motivations, which can be expressed very bluntly as a wish to exclude humans from the design process as far as possible and leave as much as possible to evolution, because we get better results that way and/or because the results get more interesting that way. Some researchers, notably Cliff [27][28] and Nolfi [98], have attempted to spell out these assumptions and motivations and forge them into something like a coherent philosophy of evolutionary robotics. Cliff stresses the importance of the action-perception cycle and of modelling complete sensory-motor pathways. This means that the neural network controlling the robot should be connected directly to the robot’s sensors at one end and to its motors at the other end, without any additional human-constructed processing of the inputs or outputs. Only by studying whole organisms, whether physical robots or simulated agents in virtual worlds, in their natural environments, can we learn how and when a certain behaviour arises and how it is produced. The argument for this is that if we want to learn something about the neural mechanisms that “naturally” underlie a particular form of adaptive behaviour, we need to refrain from interfering with how that behaviour is represented. In particular, we need to avoid human interpretation and re-representation of the sensor inputs. The roots of this argument lie in Harnad’s critique of much of the early work in neural network modelling of cognitive processes [55]. In these studies, processes such as conjugation of verbs or recall from short-term memory were explored using mainly symbolic representations at both the input and output ends of neural networks, which were usually trained with backpropagation. As a hypothetical example, the first and last letter of a word could be encoded as activations of the

52-dimensional input vector of a neural network, with each neuron representing a different letter, and the activations of the output vector representing the gender of the word, the language it was written

in, the mood of the speaker, or some such. Harnad (and, to be fair, several others) made the point that while such experiments might help our understanding of the learning capabilities of neural networks, they do not help our understanding of grammar processing or short-term memory, as they suffer from the symbol grounding problem. That is, there is no guarantee that information is represented/encoded in the same way in the human brain as in the experimental setup, or that such a representation would ever occur in a naturally or artificially evolved system. Indeed, not only is there no guarantee, there is not even a reason to believe that the input and output representations used exist anywhere outside the experiment in question. So the symbols are not grounded, and the neural network lacks semantics. It is very possible that the actual problem of e.g. conjugating verbs in a complete system that can comprehend and produce speech is completely unrelated to the experiment, and thus that an experiment with human-designed input and output representations fails to add to our understanding of the actual phenomenon. Cliff agrees with Harnad that the symbol grounding problem can be avoided by connecting the sensors of the agent directly to the neural network controlling it, and its outputs to the agent’s actuators. He also adds, following Brooks [14], that the agent should be situated and embodied, so that it takes actions in a consistent world, whether real or simulated, where the agent has a body that is affected by the interplay of the actions it chooses to take with the state of the body and world, and where the state of the body and world decides what it perceives. In other words, the complete opposite of a neural network that is passively fed a list of data and outputs some transformation of it. Nolfi distinguishes between proximal and distal descriptions of behaviour. The distal description is a description of some behaviour from the point of view of an external observer, who uses his own concepts to make sense of the behaviour. For example, when an ethologist describes the behaviour of a mammal from his distal perspective, he might use terms such as hunting, searching, fleeing etc. to describe different aspects of its behaviour. However, things might look very different from the point of view of the sensory-motor system that creates the behaviour, i.e. from the proximal perspective. Nolfi argues that the proximal perspective always provides the most relevant information for designing behaviour. If we try to interfere as human designers, we will probably use the divide-and-conquer approach that is the norm in most types of engineering, and divide the controller into different parts, responsible for different aspects of behaviour. However, this division might be inappropriate to the demands of the task and fail to exploit the possibilities inherent in the interplay of the robot’s body, control system and the world. The best we can do, Nolfi argues, is therefore to avoid specifying the structure of the control system, and to design our fitness function so that it is as high-level as possible, rewarding

the overall performance on the target task rather than performance on components or aspects of that task. It is worth noting that Nolfi advocates this approach not only for studying the evolution of adaptive behaviours, but also for pure engineering purposes. We will return to the important questions raised in the last few paragraphs in the sections below on sensing and actuating, incrementality and modularity.

2.4 Issues in evolving controllers for complex behaviour in embodied agents

Why won’t evolutionary robotics scale up? There are probably many reasons. Here, we will discuss a number of open research questions and promising methodologies related to the central goal of being able to evolve complex general intelligence. These are issues that remain to be elucidated even after we have solved the problem of having a good enough robot or simulation to evolve on. This section was originally intended to be called “Issues in Evolving Intelligence”, but as we are here going to discuss the topics which the research in this thesis aims to contribute to the understanding of, and thus will have to go into more detail, some demarcations have to be made. (And so a long sentence can be justified. For the title, that is.) The first demarcation is that we are talking about evolving controllers, of the variety that more or less conforms to the principle of the closed sensory-motor loop. We are not discussing evolving plans, trajectories or parameters for controllers where the actual control algorithm is already specified. The second demarcation is that we are talking about embodied agents. The third is that we are not primarily discussing the sort of higher-level cognitive functions that are the subject of linguistics, personality psychology or fine literature, but rather the sort of sensorimotor coordination that is also displayed by many of our fellow vertebrates.

2.4.1 Sensing

To illustrate the importance of having the right sensor data, consider the argument by palaeontologist Andrew Parker that the development of the eye was the necessary precondition and trigger of the evolution of complex behaviours and ecosystems at the time of the Cambrian explosion, some 540 million years ago [104]. Before that point in time, the fossil record indicates that complex body forms, thought to be necessary for fast and complex movement, simply did not exist. Most probably, Earth’s oceans were filled with grey, slug-like beings slowly crawling along the seabed, filter feeding or eating other primitive animals when they happened upon them. These creatures seem to have had simple light sensors, but not eyes capable of discriminating

shape and colour at a distance. Soon after the first appearance of proper eyes in the fossil record, we have a phenomenal diversification, with clear evidence of specialized predators of all sizes, complicated defence mechanisms in grazers and filter-feeders, parasitism, symbiosis etc. Seeing evolutionary robotics (and more broadly, evolutionary artificial intelligence) in this light, the suspicion quickly arises that we might never move far beyond the behavioural complexity of a slug without making sure that we have the right sensor data, the right amount of it, and represented in the right way. As noted above (in section 2.3), in most evolutionary robotics experiments, fewer than ten inputs are used. As a comparison, the eye of a fly has about 400 directional facets, and thus at least that many dimensions of visual input data available. It is important to note that the issue of sensing is not settled by agreeing (or disagreeing) with Cliff that we need to close the sensory-motor loop and feed the controller unprocessed sensor data (and the actuators unprocessed controller outputs). There are still issues about the type, amount and representation of sensor data. If we use a physical robot, the type of sensor data we can use is obviously limited by what sort of sensors are available on the robot. But, budget permitting, we can choose what robot we want to use, or even develop our own. Here, there is a bewildering array of sensors to choose from: infrared and laser rangefinders, different types of cameras, odometry, global (GPS) or local (e.g. Vicon) positioning systems, whiskers, various exotic sensors for chemical compounds, air pressure etc. All of these sensors are available in different versions, with different precision, range, noise levels etc. If we evolve in simulation it is even harder to choose the best type of sensor, as we are free to define whatever sensors we want. Some evolutionary roboticists have taken the logical step of evolving the sensor configuration along with the controller, but these experiments have largely been confined to evolving the angles of a small number of simulated infrared rangefinder sensors [17]. As discussed above, the amount of sensor data used in evolutionary robotics experiments is typically very small, even when visual data is used. Two likely contributing reasons are that it is computationally expensive to simulate complex sensing, and that the neuroevolution methods used so far are not able to cope with large numbers of inputs to the neural networks. But it is not easy to ascertain whether these are indeed the main reasons, as people usually don’t publish experiments with only negative results. Probably, there are numerous half-written-up reports on experiments with massive input spaces that just failed to evolve any interesting behaviour lying around on researchers’ hard drives. Alternatively, most researchers don’t see scaling up the complexity of input data as a priority. The representation of input data is another understudied question, which becomes an issue especially in studies done in simulation, where there is no “raw” format for the sensor data.

Imagine that the controlled agent is at position (10, 10) facing left, and that there is an adversary at (20, 30). There are at least four obvious ways of representing this situation to the controller, and endless variations and hybrids of these, without violating the principle of the closed sensory-motor loop:

• Third person Cartesian: two inputs are given the values 10 and 10, and two other inputs are given values 20 and 30.

• First person Cartesian: two inputs are given the values 10 and 20.

• First person polar: two inputs are set to the angle and range to the adversary.

• Wraparound sensors: a number (e.g. 8) of simulated sensors are spread evenly around the agent (at angles 0, π/2, π etc.), each sensor is connected to one input of the controller, and all sensors return 0 except the one whose direction is closest to the direction to the adversary. Alternatively, each sensor returns a fractional value proportional to how close its direction is to the direction to the adversary, etc.

All evolutionary robotics experiments must decide how to represent the input at some point, but this is very rarely seen as a topic of investigation in its own right. One very recent example of taking this issue seriously is Stanley’s tutorial at CIG 2007 [122]. Issues of sensing and sensor data representation will be addressed in sections 5.1, 5.2, 5.3.1 and 5.4.2.
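To make the representational alternatives listed above concrete, the sketch below computes first-person polar and wraparound-sensor inputs for the example situation. It is purely illustrative: the eight-sensor layout and the linear falloff of the fractional wraparound variant are arbitrary choices, not prescriptions from the text.

```java
/**
 * Minimal sketch of two of the input representations discussed above, for an
 * agent at (10, 10) facing left (heading = pi) and an adversary at (20, 30).
 * The eight-sensor layout and linear falloff are arbitrary illustrative choices.
 */
public class SensorRepresentations {

    /** First person polar: angle (relative to heading) and range to the adversary. */
    static double[] firstPersonPolar(double ax, double ay, double heading, double tx, double ty) {
        double dx = tx - ax, dy = ty - ay;
        double range = Math.hypot(dx, dy);
        // Relative angle, wrapped into (-pi, pi]
        double angle = Math.atan2(dy, dx) - heading;
        angle = Math.atan2(Math.sin(angle), Math.cos(angle));
        return new double[] { angle, range };
    }

    /** Wraparound sensors: n inputs, each proportional to how closely it points at the adversary. */
    static double[] wraparound(double ax, double ay, double heading, double tx, double ty, int n) {
        double[] inputs = new double[n];
        double target = Math.atan2(ty - ay, tx - ax);
        for (int i = 0; i < n; i++) {
            double sensorDir = heading + 2 * Math.PI * i / n;
            double diff = Math.atan2(Math.sin(target - sensorDir), Math.cos(target - sensorDir));
            // Linear falloff: 1 when pointing straight at the adversary, 0 when pointing away.
            inputs[i] = Math.max(0.0, 1.0 - Math.abs(diff) / Math.PI);
        }
        return inputs;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(firstPersonPolar(10, 10, Math.PI, 20, 30)));
        System.out.println(java.util.Arrays.toString(wraparound(10, 10, Math.PI, 20, 30, 8)));
    }
}
```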

2.4.2 Incrementality

When evolving a controller to solve a complex problem, the obvious way to design the fitness function is to measure the success of each individual controller at solving the whole problem. This, however, can give rise to the bootstrap problem: the initial, random controllers fail to make any progress at all towards solving the problem. All individuals receive zero fitness, and consequently no evolution takes place. It is all too easy to construct such a fitness function: imagine evolving a dogbot to fetch the slippers at the door and bring them to the armchair in the middle of the room. A natural fitness function would be how many slippers are fetched, maybe with the time taken to fetch them subtracted. However, the initial generation of controllers would be completely random, probably having the dogbot running around in circles chasing its tail or standing still and barking on the spot. Zero slippers are fetched in the first generation, and so no slippers will ever be fetched. A solution to this problem would be to use a sequence of fitness functions: first a fitness function rewarding the controllers for getting the dogbot as close as possible to the slippers, and once the best individuals in the population have a decent fitness of this sort, the fitness

function is changed to one that assigns fitness based on how often the slippers are being picked up, and then another one could be based on how close to the armchair they are dropped. The assumption here is that a controller that performed well on the previous fitness function is likely to at least occasionally perform decently on the next fitness function, so that a fitness gradient is established in the population and evolution can proceed. In evolutionary robotics, such a decomposition into a sequence of fitness functions, or of increasingly complex versions of the same problem, is called incremental evolution. Early examples of incremental evolution include Gomez and Miikkulainen’s simulated grid-world predator, which is first evolved to approach a static prey, then a slowly moving prey and finally a fast prey with complicated behaviour [49]. It was shown that the incrementally evolved controllers performed much better than controllers evolved using direct (non-incremental) evolution. Similarly, Harvey et al. evolved a visually guided robot first to move forward, then to move towards a large target, to move towards a small target, and finally to move towards triangles and not towards squares [57]. In this case, evolving directly for distinguishing triangles from squares did not work at all. Incremental evolution has strong similarities to programmed learning, an educational technique invented by the father of behaviorism, B. F. Skinner [120]. In this paradigm, complex topics are taught to students through a “teaching machine” where simple questions are repeatedly given to the student; if the student passes the question, a new, more complex question is given; if the student fails the question, some information pertinent to the question is given, and a new question of similar difficulty is presented. (In practice, the “machine” could be a book with instructions as to which page to jump to.) This idea drew considerable attention in the 1960s, but currently holds negligible clout in educational policy, partly because of dissatisfaction from students, partly because of the immense amount of work required in “programming” the “machines”, but mostly because behaviorism is so completely out of fashion nowadays.

However, the idea is alive and well in animal training, and it could be worthwhile looking at animal training programs for new ideas about incremental evolution. It is interesting to contrast the idea of incremental evolution with Nolfi’s argument (in section 2.3.2) that we limit the potential of evolutionary robotics, both as an engineering tool and as a scientific one, if we decompose the problem ourselves and only feed the evolutionary process bite-sized chunks. Incremental evolution is problem decomposition, if anything is. It might be that we have a trade-off here: if we tell evolution which way to take, we are more likely to get there, but we are also constraining evolution’s creativity. One way to have the cake and eat it too could be to find a way to automatically decompose the problem into incremental stages, by evolution or otherwise. But to do this we need to clarify when incremental evolution works, and

when it doesn’t. Another perspective on incrementality is that in real life, real organisms only have one fitness function, and it is quite a crude one (roughly, the number of grandchildren the organism has). Still, very complex behaviour manages to evolve, probably because there is not only one path to take; there are countless niches to fill, and the evolution of one capability can pave the way for (or exapt) another (e.g. according to one theory, language evolved to enable gossip about other hominids in our group, but then turned out to be useful for lots of other things). Thus, it might be that externally imposed incrementality is only really necessary because of the simplicity of our experimental environments. In this thesis, incrementality will be addressed in sections 5.2, 5.4.1 and 7.2.
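As a minimal sketch of the staged-fitness idea described above (and not a reconstruction of any particular experiment in this thesis), the loop below evolves against a sequence of fitness functions and advances to the next stage once the best individual clears a hypothetical threshold; the FitnessFunction interface, thresholds and the mutation-only evolutionary loop are all illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical interface: one stage-specific fitness function. */
interface FitnessFunction { double evaluate(double[] genotype); }

public class IncrementalEvolution {
    /**
     * Evolve against each fitness function in turn, moving on when the best
     * individual exceeds the threshold for the current stage. A simple
     * mutation-only loop is used purely for illustration.
     */
    public static double[] evolve(List<FitnessFunction> stages, double[] thresholds,
                                  int popSize, int genotypeLength, int maxGenerations) {
        Random rng = new Random();
        double[][] pop = new double[popSize][genotypeLength];
        for (double[] g : pop) for (int i = 0; i < g.length; i++) g[i] = rng.nextGaussian();

        int stage = 0;
        for (int gen = 0; gen < maxGenerations && stage < stages.size(); gen++) {
            FitnessFunction f = stages.get(stage);
            // Sort descending by fitness on the current stage's fitness function.
            Arrays.sort(pop, Comparator.comparingDouble((double[] g) -> f.evaluate(g)).reversed());
            if (f.evaluate(pop[0]) >= thresholds[stage]) {
                stage++;   // bootstrap achieved: switch to the next, harder fitness function
                continue;
            }
            // Replace the worse half with mutated copies of the better half.
            for (int i = popSize / 2; i < popSize; i++) {
                double[] child = pop[i - popSize / 2].clone();
                child[rng.nextInt(genotypeLength)] += rng.nextGaussian() * 0.1;
                pop[i] = child;
            }
        }
        return pop[0];
    }
}
```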

2.4.3 Modularity

Another form of decomposition is modularity, where the controller rather than the task is decomposed. The majority of work done in evolutionary robotics and evolutionary neural networks is on evolving the connection weights of a single network with a homogeneous structure, such as a fully connected MLP. Modular neural networks can, very broadly, be categorized as neural networks made up of structurally and/or functionally distinct parts. Theories about modularity go far back in the history of science, and there are reasons to believe that modularization of neural networks can render them more evolvable.

Modularity and the mind

The debate over whether the mind is modular goes back at least to the early modern age, with the empiricists (e.g. Locke and Hume) claiming that the mind was “a blank slate” at birth, acquiring its content through associations between sensory data, and the rationalists (e.g. Descartes) holding that the mind was innately and intricately structured. Back then, however, little was known about the brain. The first to base his theory of mind explicitly on neural modularity was probably Franz Joseph Gall (1758-1828), best remembered as the father of phrenology. Gall maintained that the brain was composed of a great number of “propensities” or “faculties”, such as self-esteem, imitation, tune, and cautiousness, all of which were physically localizable. The methods he proposed to localise these faculties, however, rightly became infamous. In the early twentieth century, behaviorism was invented and gradually came to dominate the new science of psychology [119]. Like classical empiricism, behaviorism stated that no knowledge was innate, but all was formed from experience; the mechanism whereby this was accomplished was the association of stimuli with other stimuli, or of actions with consequences. At birth the mind was a tabula rasa, a blank slate. As behaviorism gave way to cognitive psychology, the

anti-modular bias persisted. However, in the early eighties, the cognitive scientist Jerry Fodor brought modular thinking back with his influential book “The Modularity of Mind”, in which he argued that the mind/brain is made up of several modules at the input and output stages (e.g. modules for perceiving various visual patterns, and executing smooth movements), but that these modules all connect to a “central processing unit” [45]. Others have since gone further, especially within the field of evolutionary psychology, which partly relies on the massive modularity hypothesis [7]. According to this hypothesis, the mind/brain is made of thousands of modules, which are responsible for particular behaviours, from mate-seeking to navigation to sleeping at night. One of the main arguments for this hypothesis is that there seems never to have been any evolutionary pressure for a non-modular, general-purpose brain, and that evolution usually proceeds by building new layers on top of the existing organism. Thus, the most probable path for evolution would be to incrementally add new functionality to an existing mind, rather than redesigning the mind from the ground up. Some evolutionary psychologists add that it is very hard to imagine how a non-modular mind would function at all. However, this hypothesis has come under heavy criticism from more orthodox psychologists, who usually accept that the mind is modular to some degree (à la Fodor) but assert that there is very scant empirical support for the massive modularity hypothesis [112]. While the debate is sometimes fierce over the amount of functional modularity of the mind, there is broad agreement among neuroscientists that there is considerable structural modularity in the brain [22]. The human brain (and those of most animals) is clearly divided into numerous parts, which are different from each other even to the naked eye; furthermore, there is plenty of repeating structure in some parts of the brain, e.g. in the visual areas. Some parts of the brain have also been shown to have a very distinctive functionality, in the sense that if an area is damaged you lose the corresponding functionality - for example Broca’s and Wernicke’s areas of the cortex, involved in speech production and comprehension, respectively. The specific function of some other parts of the brain is much less understood.

Evolving modular neural networks: theory

Why would we want to evolve modular neural networks? The two broad answers, which are of course not mutually exclusive, are that with good models we can contribute to the ongoing interdisciplinary investigation into modularity in natural systems summarized above, and that we could conceivably increase the evolvability of neural networks by modularizing them in different ways, thus increasing the performance of technical applications of neuroevolution. While the potential for neural network modelling to contribute to natural science might be

obvious, some explanation might be needed as to why modularity would increase the evolvability of such networks (and in particular, why it could be a cure for the curse of dimensionality). First we need to make a distinction. We can call dividing up a neural network into several modules, where none of them needs to be similar to the other modules, static modularity. When two or more modules are identical, sharing topology and connection weights (defined by the same part of the genome), we call this repeating modularity. The special case of repeating modularity, where modules repeat according to some tiling or symmetry, can be called convolutional modularity. The following arguments for the usefulness of modularity are drawn from an earlier paper by Togelius [137], and another by Di Ferdinando and Calabretta [20].

• To begin with static modularity, the fact that most neurons in a module are only connected to other neurons in that module (encapsulation) reduces neural interference, a term coined by Calabretta for when parts of the network which have nothing to do with each other, and therefore should not be connected, are connected. The problem this causes can be understood by thinking of a fully connected network - in such a network, any connection weight change is bound to influence every neuron! This causes a lot of epistasis, and more or less rules out neutral mutations (mutations which change the network without increasing or decreasing fitness). Severing the right connections can increase evolvability by reducing neural interference. This is illustrated by a lesioning experiment by Nolfi, who showed that cutting just a single connection in a robot-controlling network allowed for much higher fitness on a task [99].

• Another huge benefit of encapsulated modules is the reduction of the search space dimensionality. Most of the parameters of a neural network are connection weights; in a fully connected neural network, these grow on the order of the number of neurons squared. In a modular neural network the number of connections per neuron can be dramatically smaller (see the sketch after this list). Apart from a smaller search space, fewer connections also mean faster network updating.

• Given that evolved neural networks are usually structurally obscure, and their inner workings in many cases more or less impossible to understand even for the human designer of the software that evolved the network, another benefit of modularity is that it might be possible to find out which evolved modules do what, and to reuse those modules in another network or connect them to some other non-neural code.

• Moving on to the benefits of repeating modularity, having several modules defined by the same part of the genome further decreases the search space. This effect can be dramatic in the case of convolutional modularity with a large number of modules.

• Finally, modularity can be combined with incremental evolution to create layered evolution, where modules are evolved one by one, and the weights of the old modules are “frozen” (not affected by the mutation operators) while the new module is evolved. This was shown to be more effective than either incremental evolution or static modularity alone in a simulated robotics experiment [136][137].
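As a back-of-the-envelope sketch of the search-space argument made in the list above, the following counts the evolvable weights of a monolithic MLP, of two encapsulated modules that each see half the inputs and drive half the outputs, and of the repeating case where both modules share one genome. The layer sizes are arbitrary illustrative values, not taken from any experiment in this thesis.

```java
/**
 * Illustrative weight counts for monolithic, statically modular and repeating
 * (genome-sharing) controllers. Layer sizes are arbitrary example values.
 */
public class ModularityWeightCount {

    /** Weights (including biases) of a single fully connected MLP. */
    static int mlpWeights(int inputs, int hidden, int outputs) {
        return (inputs + 1) * hidden + (hidden + 1) * outputs;
    }

    public static void main(String[] args) {
        int inputs = 20, hidden = 10, outputs = 4;

        int monolithic = mlpWeights(inputs, hidden, outputs);
        // Static modularity: two encapsulated modules, each half the size, no cross-connections.
        int staticModular = 2 * mlpWeights(inputs / 2, hidden / 2, outputs / 2);
        // Repeating modularity: both modules are defined by the same part of the genome.
        int repeating = mlpWeights(inputs / 2, hidden / 2, outputs / 2);

        System.out.println("monolithic MLP:             " + monolithic + " evolvable weights");
        System.out.println("two encapsulated modules:   " + staticModular);
        System.out.println("two modules, shared genome: " + repeating);
    }
}
```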

Evolving modular neural networks: practice

Several researchers have tried evolving neural networks with fixed modular topologies for various tasks, and have usually shown that modularization increases the evolvability of the network. Starting with a non-robotic experiment, Schraudolph trained a convolutional network to evaluate board values for the board game Go [117]. The network has certain similarities to the convolutional pyramids used in optical character recognition. Husken et al. did an interesting experiment, where they evolved a monolithic network together with which subset of its connections to activate, and showed that for a network with two completely separable inputs and outputs, more or less encapsulated modules did indeed evolve [63]. Further, Di Ferdinando and Calabretta showed that hard-coded modularity increased evolvability on a pattern-recognition task based on a simple model of the what/where visual system in the mammalian brain [42]. More interesting for our purposes are the experiments that have been done with evolving modular neural networks for robot control. An early example is Beer’s work on evolving CTRNNs that act as pattern generators. Beer used six identical modules with only limited intercommunication and no central controller to successfully evolve a walking gait in a model insect [47]. The use of CTRNNs, where neurons have internal state, in this example points to the need to recognize that reusable modules are indeed separate modules with their own state, even though they are generated from the same genetic information. A more recent example of evolving repeated modules is due to Vaughan, who evolved a segmented robot arm for grasping objects in a two-dimensional simulated environment [148]. Each segment of the arm was fitted with an identical neural network, where the output of one segment’s network was the input to the next; the first segment was the only one to receive any direct sensor input. The most well-known (or at least most highly cited) experiment in evolving modular networks for robot control is probably due to Calabretta et al. [21]. Calabretta’s team investigated several modular and non-modular network architectures for a rather complicated robot control task, and found that the modular structures evolved better than the non-modular ones, and of those

structures, the one where human involvement was minimized (by letting the network itself choose when to use which module) performed best.

Evolving the modularity itself

Deliberately modularising neural networks can be criticised with the same argument as that used against incremental evolution in the previous section: we are constraining evolution by forcing it onto our chosen path, and might miss out on some of its most innovative solutions. One way to retain the full innovative capacity of evolution while reaping the benefits could be to evolve the modularisation itself. Several of the indirect neural network encodings mentioned in section 2.2.2 were designed to allow static or repeating modularity to emerge, and this did indeed happen in some experiments. However, the failure of these encodings to match the performance of fixed-topology neuroevolution in the general case points to the need for further research and new approaches in this area. In order to design a winning evolutionary modular neural network design algorithm, we probably need to learn more about what sorts of modularity work when, through systematic experiments with hard-coded modularity. In the experimental chapters of this thesis, different types of modular controllers will be analysed and compared to non-modular controllers in sections 5.4.2, 5.4.1 and 7.2.

2.4.4 Controller representations

In the discussion above we have implicitly assumed that the controller is based on a neural network, similar to a standard MLP in its function but possibly with a different topology. However, such neural networks are not the only evolvable representations capable of universal function approximation. The main competitors in this respect are the various forms of more complex and often more biologically faithful types of neural networks, and the various types of genetic programming. Continuous-time recurrent neural networks (CTRNNs) are a class of more complex neural networks, whose capabilities subsume and significantly extend those of multi-layer perceptrons [8]. Like an MLP, each neuron in a CTRNN has several inputs, and produces a single output value which can be fed as input into several other neurons; unlike in an MLP, each neuron has a persistent activation, and several parameters guiding how that state evolves in time. While CTRNNs, just like MLPs, represent the activation of a neuron as a scalar with a value defined at any instant in time, spiking neural networks borrow another key principle from the study of biological nervous systems and represent neural activations as “spikes”, short bursts of (simulated) electric potential interspersed with long periods of relatively low activity. In such networks, the activation of a neuron is defined as spike density over

a given period of time [35]. Other networks take the basic idea of a neural network in yet other directions, such as networks with plastic synapses [44], homeostatic networks [38], complex-valued networks etc. Genetic programming (GP) refers to the evolution of computer programs or function approximators which are not represented as neural networks [73]. In most cases, GP programs are represented as expression trees. This means that they consist of a number of nodes, both terminals and non-terminals, the difference being that a non-terminal has child nodes, which are themselves either terminals or non-terminals. Evaluation of a GP tree is typically done through lazy evaluation: the root node of the tree is evaluated first; its child nodes, and their child nodes etc., are then evaluated recursively as needed, in order to provide the values needed to evaluate the root node. Thinking of a GP tree in terms of a neural network, the terminals can be thought of as the inputs to the tree, the root node of the tree can be thought of as the output, and the non-terminal nodes as the neurons. Two crucial differences from neural networks are that there is no multiplication of node outputs by connection weights (or, equivalently, all connections between the nodes have weight 1), and that most non-terminals used in GP trees are quite different from normal MLP neurons, which sum the inputs from their “child nodes” and then apply the tanh squashing function to the sum. Standard GP non-terminals include the arithmetic operators (+, −, ∗, /) applied to two child nodes, and the conditional operator, which outputs the value of the second child if the value of the first child is below 0, and otherwise the value of the third child. Most GP trees have a mix of different types of non-terminals, unlike MLPs and similar neural networks where all neurons function identically. Other significant differences lie in the variation operators. Almost all GP algorithms are constructive, meaning that the trees start small and grow bigger through mutation, recombination and selection. A common mutation operator is one-point macro-mutation, where a node is selected at random and exchanged for a freshly created node of a type picked at random from a list of permitted node types. The new node can be either a terminal or a non-terminal, in which case its children are created recursively in the same way. A potential problem with this operator is that the tree can grow without bounds, if non-terminals are selected more often than terminals. Therefore, a maximum tree depth, at which only terminal nodes can be selected by the mutation operator, is often imposed. In most GP implementations, recombination is used as well as mutation; the simplest recombination operator simply picks a random node in one tree and swaps it with a random node in another tree (of another individual). Obviously, many more variations of mutation and recombination exist in the GP literature. One of the main inventions in tree-based GP is automatically defined functions (ADFs)

[74], which are external trees that can be called via terminals from the main tree(s) of the same individual. The advantage of ADFs is that the same ADF can be called several times from different trees, or from different points in the same tree, hence allowing for reusable modularity. Additionally, there are variations on GP that do not use tree representations, for example linear GP, where strings of instructions similar to those of mainstream programming languages are evolved; we will not go into detail on any of these variations here. With this wealth of different evolvable controller representations available, the key question is: which one is best? Put a little less crudely, there are a number of interrelated questions that we currently don’t know the answers to: Which controller representations can learn good control fastest (learning speed)? Can some controller representations learn to solve more complex control tasks than others (learning depth)? What is the relation between learning speed and learning depth? Are some controller representations better suited for some problems and other representations for other problems? If so, is there a systematic relationship between problems and suitable representations? Which controller representations are most understandable and reusable by human designers? Which controller representations are easiest to combine with non-evolutionary learning algorithms? There is currently no complete answer to any of these questions, and little to be learned from either theory or empirical studies. We do have partial answers to the last two questions, though. Evolved GP trees are usually much easier to understand than evolved neural networks, as the operations performed at the GP non-terminals are much closer to scientifically educated humans’ ways of thinking than the operations of MLP neurons are, and as the constructive evolution of GP trees makes for sparser structures than fully connected networks. Further, there are other variations of GP developed expressly to be easier for humans to understand [146]. On the other hand, MLPs and some other neural networks can easily be trained by non-evolutionary algorithms, such as the ubiquitous backpropagation algorithm, and it is thus easier to create systems that combine evolutionary learning with other forms of learning. One take on the questions about learning depth and learning speed is how they relate to the complexity of the controller. It seems more or less obvious that a larger neural network would be able to produce more complex behaviour, as there is simply room for more information in it. But would the larger neural network learn a behaviour of a given complexity faster or slower than the smaller neural network? E.g., the complexification approach of Stanley’s NEAT algorithm relies on the assumption, borne out by some experimental results of his, that learning speed is higher for a small network [123], but see section 7.2 for results suggesting otherwise. It is not obvious how to settle such questions other than by doing numerous independent empirical studies, where different controller architectures and different sizes of the same

architecture are compared on different tasks by different experimenters. The experiments presented in this thesis therefore represent only a small contribution to answering these questions. In this thesis, the neural networks we use are either MLPs or simple MLP-based recurrent networks, but in some cases we also compare these with different versions of tree-based GP programs. The more exotic versions of neural networks and genetic programming have been left out in order to simplify both implementation and analysis, by limiting the number of extra parameters that could affect learning speed. Comparisons between different controller representations will be made in sections 5.1, 5.3.1 and 7.2.
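As a minimal sketch of the tree-based GP representation described above, the classes below evaluate an expression tree with arithmetic non-terminals, a three-child conditional, and terminals that read controller inputs. The node set and the protected-division convention are generic textbook choices, not the exact primitives used in the experiments later in this thesis.

```java
/**
 * Minimal sketch of GP expression-tree evaluation: terminals read inputs,
 * non-terminals combine their children's values recursively.
 */
interface GpNode {
    double eval(double[] inputs);
}

/** Terminal: returns one of the controller's inputs. */
class InputTerminal implements GpNode {
    private final int index;
    InputTerminal(int index) { this.index = index; }
    public double eval(double[] inputs) { return inputs[index]; }
}

/** Non-terminal: an arithmetic operator applied to two children. */
class ArithmeticNode implements GpNode {
    private final char op;
    private final GpNode left, right;
    ArithmeticNode(char op, GpNode left, GpNode right) { this.op = op; this.left = left; this.right = right; }
    public double eval(double[] inputs) {
        double a = left.eval(inputs), b = right.eval(inputs);
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return b == 0 ? 1.0 : a / b;   // protected division
        }
    }
}

/** Non-terminal: if the first child is below 0, return the second child, otherwise the third. */
class IfNegativeNode implements GpNode {
    private final GpNode cond, ifNeg, otherwise;
    IfNegativeNode(GpNode cond, GpNode ifNeg, GpNode otherwise) { this.cond = cond; this.ifNeg = ifNeg; this.otherwise = otherwise; }
    public double eval(double[] inputs) {
        return cond.eval(inputs) < 0 ? ifNeg.eval(inputs) : otherwise.eval(inputs);
    }
}

public class GpTreeExample {
    public static void main(String[] args) {
        // Tree computing: if (in0 - in1) < 0, output in0 + in1, otherwise in0 - in1.
        GpNode tree = new IfNegativeNode(
                new ArithmeticNode('-', new InputTerminal(0), new InputTerminal(1)),
                new ArithmeticNode('+', new InputTerminal(0), new InputTerminal(1)),
                new ArithmeticNode('-', new InputTerminal(0), new InputTerminal(1)));
        System.out.println(tree.eval(new double[] { 0.3, 0.8 }));
    }
}
```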

2.4.5 Stateful control and internal models

In much of the discussion above, we have implicitly assumed that the controller is a function from inputs (typically sensors) to outputs (typically motor activations). Thus, at any point in time, the outputs of the controller depend only on its inputs at that very moment, and not on the state of the controller itself. The same sensor readings always result in the same motor actions. Such controllers are called reactive controllers. The opposite, controllers whose outputs are influenced by the internal state of the controller, are called stateful controllers. Several evolvable controller representations allow for the development of stateful controllers. The most common neural networks with stateful capabilities are the above-mentioned recurrent neural networks, which differ from feedforward networks in that they can have cycles, or connections from a neuron to itself or to another neuron which ultimately connects to itself (but note that e.g. CTRNNs can have state without being recurrent). The one we will mainly be using here is the simple Elman network, named after its inventor, Elman, who introduced it in 1990 [41]; it also commonly goes by the name of recurrent MLP, as it is a straightforward extension of a multi-layer perceptron. The difference between an MLP and an Elman network is that the latter, in addition to the layers of connections between the input layer and the hidden layer, and between the hidden layer and the output layer, also has a layer of connections from the hidden layer to itself, so that each hidden neuron has a connection to itself and to each of the other hidden neurons (in this case; variations are possible). The propagation phase of a recurrent MLP is just like that of a normal MLP, except that every time step a copy is made of the activations of the hidden neurons; every time step, activations are also propagated from the copy of the hidden layer of the previous time step to the hidden layer of the current time step, via the connections in the recurrent connection layer. The backpropagation algorithm can be adapted to work with recurrent MLPs, yielding an algorithm called backpropagation through time. This is an advantage of recurrent

MLPs over more complex neural networks with state capabilities, such as CTRNNs. As discussed above, most GP systems represent solutions as simple expression trees, which only allow for reactive controllers. However, ever since the very early days of GP, researchers have been experimenting with forms of GP that allow stateful controllers to evolve. In some experiments in his first book on GP, Koza used global registers that could be manipulated with specially built storage operators [73], and he later introduced the notion of automatically defined store (analogous to the ADFs discussed above) [75]. Similarly, Teller introduced the concept of Indexed Memory to allow reading and writing from an arbitrary set of memory cells [133]. Recently, Lucas introduced Object-Oriented Genetic Programming (OOGP) [78][2], an approach where the primitives of the evolved program are calls to methods of objects in a standard object-oriented language, such as Java. The intention is to leverage the considerable libraries of tried-and-true atomic operations available in the APIs of modern programming languages, including complex data structures. So both neural nets and GP are available in strictly stateless versions as well as versions that allow for statefulness to be evolved. Yet it is possible to use model-based control, which normally requires state, while basing a controller on a stateless function approximator such as an MLP or a GP tree. This is done by complementing the function approximator with an explicit model of the dynamics of the agent, and instead of using the function approximator to generate motor commands given sensor inputs, it is used to estimate the value of being in a particular state, as specified by the sensor inputs. The controller then takes all the possible actions in its internal simulation of the agent, feeds the simulated sensor inputs resulting from each simulated new state to the function approximator, one after another, and records the value estimates. After simulating all possible actions, the highest-value action is selected as the output of the controller, and thus carried out in the real world (or in the “real” simulation). While there are apparent potential advantages of this approach, it relies on having a sufficiently fast and accurate model of the agent’s dynamics available, something that is often not the case. It also relies on it being possible and practical to enumerate the available actions (or at least a representative and meaningful subset of them) at any point in time. It could also be argued that it violates the principle of the closed sensory-motor loop. Despite these different ways of evolving stateful controllers, a great many evolutionary robotics experiments use strictly reactive controllers. When is statefulness an advantage, or even a requirement? This is a topic of both debate and empirical research, though there is probably more of the former than of the latter. Several cognitive scientists and roboticists argue forcefully that the need for agents to have internal representations of their surroundings is exaggerated. (Representations are clearly a form

of internal state, but it is a point of debate whether all state can be interpreted as representational.) Brooks famously claimed that “the world is its own best model”, in other words that a situated and embodied agent doesn’t need to store a complicated model of its environment, as its ability to control its sensors means that it can simply examine its environment when needed, a process known as active perception [14]. Similar sentiments are echoed by philosophers such as Clark [26]. Some evidence for world models being far less common and extensive than commonly assumed comes from a set of experiments on human vision, the debate over the correct interpretation of which is referred to as “the Grand Illusion debate”. Many fairly recent experiments in psychophysics show that the human visual system is unable to take in information at a very high bandwidth - we simply don’t see nearly as much as we think we do. (These experiments are often very entertaining to read about, or to perform on yourself. For example, did you know that while we can very easily spot the slight details differing between two pictures flashed in quick succession, this becomes more or less impossible if we wait half a second after extinguishing the first picture before showing the new picture?) Philosophers such as Patricia Churchland interpret this as evidence that we just think we have a model of the world, while we really don’t, at least not a very detailed one [25]. In the same vein, several evolutionary roboticists have performed experiments aimed at showing that tasks whose solutions appear to require stateful control can actually be solved by reactive controllers through active perception. To this end, Nolfi managed to evolve reactive neural network controllers that approach cylindrical objects and avoid walls, even though the robot (a Khepera) was equipped only with a crude rangefinder sensor array, to which walls and cylinders would look the same from most angles; evolution found a solution involving clever repositioning of the robot into a vantage point from where the objects could be disambiguated [100]. The research on active vision from Cliff’s and Floreano’s teams (discussed in previous sections) also serves as an example of evolutionary robotics research being used to push the boundaries of reactive control [29][43]. Yet, there are clearly tasks where reactive controllers fall far short. For example, there is evidence that the feedback loop between your hand, your eyes and your brain is too slow for you to control the movement of your hands accurately at high speeds using only the most recent information about the position and velocity of your hand. The only way to effect the degree of control we obviously have seems to be to have an internal model of the dynamics of the arm and hand. Similarly, many tasks require different types of memory and can thus not be solved by reactive controllers. Behaviours requiring memory are displayed by a large variety of animals;

it is common even for invertebrates to be able to find their way back to previously visited areas. This can usually not be done by a purely reactive controller, as the action selected is often dependent not only on the current state but also on previous states (i.e. what places the organism has visited previously). In cognitive science, there is a strand of thinking emphasising the role of internal simulation of the body for higher cognitive functions, such as imagination and consciousness. For example, according to Holland’s theory of Machine Consciousness, we need to build robots with detailed models of their own bodies as well as of their environments, in order to build machines that can emulate, achieve or be useful for studying human-level consciousness [60]. The models we will be using in this thesis will be rather simple, and we will not be overly concerned with whether our evolved controllers will be conscious anytime soon. Rather, we will investigate the potential usefulness of having or acquiring models for learning control in certain well-defined tasks. Stateful and reactive control will be compared in sections 5.3.1 and 7.2. State-value control will be contrasted with action-value and direct control in section 5.3.2. Further, acquisition and use of dynamics models will be dealt with in section 6.2.
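To make the recurrent-MLP propagation described at the beginning of this subsection concrete, the sketch below implements a single forward step of an Elman-style network. The weight layout and tanh activations are standard choices; the class is purely illustrative and not the exact implementation used in the experiments.

```java
import java.util.Arrays;

/**
 * Illustrative Elman network (recurrent MLP) forward step: like an MLP, but
 * each hidden neuron also receives the previous time step's hidden
 * activations through an extra recurrent weight layer.
 */
public class ElmanNetwork {
    private final double[][] inputToHidden;   // [hidden][inputs + 1 bias]
    private final double[][] hiddenToHidden;  // [hidden][hidden] recurrent layer
    private final double[][] hiddenToOutput;  // [outputs][hidden + 1 bias]
    private double[] context;                 // copy of last step's hidden activations

    public ElmanNetwork(double[][] inputToHidden, double[][] hiddenToHidden, double[][] hiddenToOutput) {
        this.inputToHidden = inputToHidden;
        this.hiddenToHidden = hiddenToHidden;
        this.hiddenToOutput = hiddenToOutput;
        this.context = new double[hiddenToHidden.length];
    }

    /** One propagation step: returns motor outputs for the current sensor inputs. */
    public double[] step(double[] inputs) {
        int numHidden = inputToHidden.length;
        double[] hidden = new double[numHidden];
        for (int h = 0; h < numHidden; h++) {
            double sum = inputToHidden[h][inputs.length]; // bias weight
            for (int i = 0; i < inputs.length; i++) sum += inputToHidden[h][i] * inputs[i];
            for (int c = 0; c < numHidden; c++) sum += hiddenToHidden[h][c] * context[c];
            hidden[h] = Math.tanh(sum);
        }
        double[] outputs = new double[hiddenToOutput.length];
        for (int o = 0; o < outputs.length; o++) {
            double sum = hiddenToOutput[o][numHidden]; // bias weight
            for (int h = 0; h < numHidden; h++) sum += hiddenToOutput[o][h] * hidden[h];
            outputs[o] = Math.tanh(sum);
        }
        context = Arrays.copyOf(hidden, numHidden); // store for the next time step
        return outputs;
    }
}
```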

2.4.6 Competitive co-evolution

Co-evolution is when the fitness function of an individual is made dependent on other individuals, either in the same population or in a different population altogether. A fundamental distinction within co-evolution is between cooperative and competitive co-evolution. In the cooperative varieties of co-evolution, which will not be discussed further here, different individuals share fitness or otherwise reap benefits from co-operating with each other, whereas in the competitive variety one individual’s fitness is increased by another’s fitness being decreased, either through competition for shared resources, zero-sum fitness, direct battle between agents or some other mechanism. The promise of competitive co-evolution is the hope that linking fitness in this way will lead to some form of global improvement, as individuals compete against each other. The idea is to encourage an evolutionary “arms race”, where improvements in some individuals cause further improvements in other individuals, and vice versa. This idea is laid out (in the context of natural, not artificial, life) in a classic paper by Dawkins and Krebs, where they provide a taxonomy of naturally occurring types of competitive co-evolution (interestingly, only a few of these appear to have been used in artificial evolution) [34]. One of the first experiments to achieve good results from artificial arms races was Hillis’ co-evolution of sorting networks and sorting problems [59]. In this pioneering experiment, a population of sorting networks was scored according to how well they solved the highest-scoring

members of a population of sorting problems, whereas the sorting problems were scored according to how well they defeated the highest-scoring members of the sorting network population. The best co-evolved networks beat both networks that had been evolved without co-evolution and hand-designed sorting networks. Inspired by this early success, several research groups attempted to co-evolve predator and prey robots, where the predators were scored according to how fast they could catch the prey, and the prey were scored according to how long they could avoid the predator [57][101]. While these studies had some success, it quickly became clear that co-evolutionary algorithms can be prone to complex dynamics, which can thwart global progress towards higher fitness. There has also been an implicit assumption that an evolutionary arms race leads directly to an increase in complexity, though this is not always the case. An example of a potential problem with co-evolution is that of “cycling” between different strategies. If an individual develops a strategy that affords it a higher fitness relative to other individuals, it will spread through the population, which will stabilise until a strategy arises that exploits a weakness in the previous one. There is then another rapid replacement of individuals. However, there is no guarantee that the new strategy is better than the one its predecessor replaced, as the dominance relation may be intransitive. E.g. if the population is first dominated by a strategy we call “Rock”, this can be replaced by the strategy “Paper”, which in turn can be replaced by “Scissors” - but Scissors is actually inferior to Rock, although it is superior to Paper! It is therefore possible for the population to cycle through the same set of possible strategies, each exploiting the weaknesses of the previous one, without any global increase in fitness (such as the development of the strategy “Reinforced Pliers”, which beats Paper and is immune to Rock). Another problem that can occur in the case of more than one population is disengagement, a loss of selective gradient [149]. This is where one population evolves individuals of higher fitness more quickly, and they consistently beat individuals in another population. The other population then sees a uniform selection pressure, due to consistently being beaten, and thus succumbs to directionless genetic drift. Several attempts have been made to address these problems, the most prominent of which was invented by Rosin and Belew: the “hall of fame” [113]. This technique has the individuals of the current generation compete not only against other current individuals, but also against a selection of good individuals from previous generations. Exactly how this should be done has been the subject of several studies, see e.g. [129][17]; there is some evidence that this process leads to a sort of conservatism, in that the selection pressure becomes biased towards solutions

that can successfully counter as many previously evolved opponents as possible, rather than radically new solutions. Currently, we are far from having a panacea for the problems of competitive co-evolution. Put another way, there are plenty of open research questions concerning the dynamics of algorithms of this type. In this thesis, competitive co-evolution is explored in sections 7.1 and 7.2.
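As a minimal sketch of the hall-of-fame idea described above (the population representation, the game interface and the decision to archive one champion per generation are arbitrary illustrative choices), each individual is scored against the current population as well as against a memory of past champions, and each generation's best individual is added to that memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.BiFunction;

/**
 * Sketch of competitive co-evolution with a Rosin-and-Belew-style hall of fame:
 * individuals are scored against current opponents and against an archive of
 * previous generations' champions. The play function is a hypothetical game
 * interface returning 1 if the first argument beats the second, 0 otherwise.
 */
public class HallOfFameCoevolution {
    private final List<double[]> hallOfFame = new ArrayList<>();
    private final BiFunction<double[], double[], Integer> play;
    private final Random rng = new Random();

    public HallOfFameCoevolution(BiFunction<double[], double[], Integer> play) {
        this.play = play;
    }

    /** Score one individual against current opponents plus the hall of fame. */
    double fitness(double[] individual, List<double[]> currentOpponents) {
        int wins = 0, games = 0;
        for (double[] opp : currentOpponents) { wins += play.apply(individual, opp); games++; }
        for (double[] champ : hallOfFame)     { wins += play.apply(individual, champ); games++; }
        return games == 0 ? 0 : (double) wins / games;
    }

    /** One generation: evaluate, archive the champion, and mutate the rest towards it. */
    void generation(List<double[]> population) {
        double[] best = population.get(0);
        double bestFitness = -1;
        for (double[] ind : population) {
            double f = fitness(ind, population);
            if (f > bestFitness) { bestFitness = f; best = ind; }
        }
        hallOfFame.add(best.clone()); // remember this generation's champion
        for (int i = 0; i < population.size(); i++) {
            if (population.get(i) == best) continue;
            double[] child = best.clone();
            child[rng.nextInt(child.length)] += rng.nextGaussian() * 0.1;
            population.set(i, child);
        }
    }
}
```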

2.4.7 Evolution and other forms of reinforcement learning

Evolution is not the only way to learn control based on feedback. In the broad category of reinforcement learning we also find a class of algorithms learning from local reinforcements. These algorithms associate particular actions with rewards, rather than the global learning of evolutionary algorithms, which associate rewards with complete strategies. The most famous and widely used of these algorithms are the temporal difference learning (td-learning) algorithms [130]. Various forms of td-learning exist, but they all work essentially according to the following specification. Each time-step, the following steps are taken by the controller:

1. Consider a set of possible actions and score them according to the value function. This can be done either by assigning values to actions based on the current state (the action-value approach), or by simulating the actions and assigning a value based on the simulated state, as detailed above in section 2.4.5 (the state-value approach). The set of possible actions can be all possible actions for the current state if this number is low, or otherwise a heuristically selected subset of that set.

2. Take the action that the value function scores highest (or, with a small probability, another action).

3. Get the reward for this action from the environment.

4. Update the value function depending on the feedback from the environment and the previous estimate of the action's value.

The "trick" of td-learning is in the updating of the value function in step 4. As an example, consider the update equation of the Sarsa(0) variety of td-learning, using the action-value approach:

Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \big( r_{t+1} + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \big) \qquad (2.5)

Where Q(s_t, a_t) is the value of action a given state s at time t according to the value function, r_{t+1} is the reward observed at the next time-step, α is the learning rate, and γ is the discount rate; both of the latter constants are commonly set to small positive values, e.g. 0.1. The
interesting thing about this equation is that the value function is updated with a value that is partly dependent on its own valuation of the best action of the next state. This lifting-itself-by-its-bootstraps procedure ensures that the rewards for particular actions will influence not only the value function's valuation of that action, but also, to a lesser degree, its valuation of actions that lead to the rewarded action. It can be proven that under certain conditions variations on td-learning, such as Sarsa, converge to the real value function. These conditions are rather restrictive, and for most interesting tasks there is no convergence proof. The main lure of td-learning algorithms is the idea that because they can adapt during a trial to local reinforcements, rather than just changing their policy between trials, and because the modification of the value function is directional rather than stochastic, they can potentially learn much faster than evolutionary algorithms. On the other hand, there are certain differences from evolutionary algorithms that make them less attractive in some respects.
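
As an illustration of the action-value variant, the following is a minimal sketch of tabular Sarsa(0) in Java, assuming a small discrete state and action space. The Environment interface and its method names are hypothetical placeholders rather than any particular library; for larger problems the table would be replaced by a function approximator such as a neural network.

import java.util.Random;

// Minimal sketch of tabular Sarsa(0). The Environment interface is a hypothetical placeholder.
public class Sarsa {
    interface Environment {
        int numStates();
        int numActions();
        int reset();              // start a new episode, return the initial state
        int step(int action);     // apply the action, return the next state
        double reward();          // reward observed after the last step
        boolean terminal();       // has the episode ended?
    }

    static double[][] learn(Environment env, int episodes,
                            double alpha, double gamma, double epsilon) {
        Random rng = new Random();
        double[][] q = new double[env.numStates()][env.numActions()];
        for (int e = 0; e < episodes; e++) {
            int s = env.reset();
            int a = selectAction(q[s], epsilon, rng);   // steps 1 and 2: score and pick an action
            boolean done = false;
            while (!done) {
                int sNext = env.step(a);                // take the action
                double r = env.reward();                // step 3: observe the reward r_{t+1}
                done = env.terminal();
                int aNext = selectAction(q[sNext], epsilon, rng);
                // Step 4, the Sarsa(0) update of equation (2.5); at the end of an episode
                // there is no next action to bootstrap from.
                double target = done ? r : r + gamma * q[sNext][aNext];
                q[s][a] += alpha * (target - q[s][a]);
                s = sNext;
                a = aNext;
            }
        }
        return q;
    }

    // Epsilon-greedy: usually the highest-valued action, occasionally a random one (step 2).
    static int selectAction(double[] values, double epsilon, Random rng) {
        if (rng.nextDouble() < epsilon) return rng.nextInt(values.length);
        int best = 0;
        for (int i = 1; i < values.length; i++) if (values[i] > values[best]) best = i;
        return best;
    }
}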

One is that controllers must be based on mappings from state, or state and action, to value, represented either as a function approximator (such as a neural network) or as a table. As discussed above, this means that they must be able to enumerate (a representative subset of) all possible actions, and in the case of state value functions, possess a fast and accurate model of the agent's dynamics. Evolutionary algorithms, on the other hand, can learn both state value functions and action value functions, but also direct control where the output of the controller is interpreted as motor commands. It also means that the value function must be able to learn from examples in a supervised fashion, something which constrains the range of useful function representations. It is hard to see td-learning creating controllers based on complexification of neural networks, or on genetic programming. Another difference with evolution is that instead of a fitness function, a reward scheme needs to be specified. This can sometimes turn out to be non-trivial. While it is straightforward to convert a reward scheme to a fitness function (just count the rewards incurred during a trial), the opposite is not the case, and for many problems several very different reward schemes can be designed, where some work well and some not at all. Overall, despite both algorithms being able to solve reinforcement learning problems, evolutionary learning and td-learning represent two very different ways of thinking. Consequently, the research communities investigating each family of algorithms are largely separate, and papers written by and for td-learning researchers are often written in a style which renders them inaccessible to researchers working in evolutionary learning. (Quite possibly the opposite is also the case.) This makes it all the more important to use empirical comparisons to investigate under what conditions one or the other type of reinforcement learning algorithm works better, and in general experimentally characterise the differences and similarities between them. The holy
grail of this research direction would be an algorithm that combined the advantages of both types of algorithm with the disadvantages of none of them. With recognition of the importance of such comparisons, the body of work comparing temporal difference learning with evolution is growing steadily. Often, though, these efforts are disconnected from each other and use dissimilar performance measures; it is often not clear what to make of comparisons that seem to come up with partly contradictory findings. Further, many of the comparisons deal with learning evaluation functions in board games, which is quite dissimilar from the agent control domains that are the main focus of this thesis. An influential early experiment on td-learning of board game evaluation functions is due to Tesauro, who managed to achieve world-class performance when training neural network-based Backgammon evaluation functions with self-play [134]. Pollack and Blair tried training evaluation functions using the same game and function representation, but instead of td-learning they used the simplest possible form of evolution, a hill-climber where a single individual was repeatedly mutated, and the mutation was kept only if it won a number of games over the non-mutated individual [108]. The algorithm worked, but its end results were far inferior to those of Tesauro [135]. Darwen [33] did a set of similar comparisons for Backgammon, and found that population-based evolution eventually outperformed td-learning when training a linear board evaluation function, even though evolution was much slower. However, when training a nonlinear evaluation function, board evaluators trained by evolution never reached the performance of those trained by td-learning, for the simple reason that evolutionary training took too long; this, in turn, was because effective evolution needed to evaluate the same pair of individuals many times. Runarsson and Lucas [115] investigated temporal difference learning versus co-evolutionary learning for small-board Go strategies. There it was found that td-learning learned faster, but that with careful tuning, evolution eventually learned better strategies. In particular, with evolution it was necessary to use parent-offspring weighted averaging in order to cope with the effects of noise. This effect was found to be even more pronounced in a follow-up paper by Lucas and Runarsson [79], comparing the two methods for learning an Othello position value function. Kotnik and Kalita [72] found that evolution outperformed TDL at learning to play the card game rummy, which unlike the board games in the above studies is not a game of perfect information. However, there are some comparisons using dynamic agent control domains as well. Taylor et al [132] compared td-learning and the NEAT neuroevolution algorithm for 'keep-away' robocup soccer, and found that evolution could learn better policies, though it took more evaluations to do so. Results also showed that Sarsa learned better policies when the task was fully observable
and NEAT learned faster when the task was deterministic. Gomez et al [51] investigated an impressive range of reinforcement learning techniques, including several versions of neuroevolution and td-learning, on four increasingly difficult versions of the benchmark pole-balancing problem, ranging from simple to very hard. In striking contrast to some other studies, the best evolutionary methods universally outperformed the best TD-based methods, both in terms of learning speed and in terms of which methods could solve the harder versions of the problem at all. Further, there were significant differences between different neuroevolutionary and td-based methods, with the best td-based techniques sometimes outperforming some evolutionary techniques. The relative ordering of the algorithms was similar across the different versions of the problem. Some general results are beginning to emerge, but there is still much to be learned. One tendency that can be observed in many - though not all - of the cited studies is that td-learning learns faster than evolution, but evolution eventually learns better strategies. Under what conditions this is true is an important question. In this thesis, we won't provide the missing synthesis of all this information, but hopefully add another small piece of the puzzle, in section 5.3.2 where we compare td-learning with evolution on the car racing problem.

2.5 Summary

This chapter has provided a very high-level overview of evolutionary computation, especially as applied to neural networks and robot control. We have treated the problem of reality versus simulation in evolutionary robotics, and the philosophy of evolutionary robotics, in some more detail, and have likewise gone into some depth on a number of topics that come up when trying to evolve complex general behaviour. These topics include sensors and their representation, incrementality, modularity, controller representation, stateful versus reactive control, competitive co-evolution and the relative advantages and disadvantages of evolution when compared to other reinforcement learning algorithms. Taken together, this provides a grounding in computational intelligence methods and current research issues against which the contributions in the experimental chapters can be judged. This chapter has also described the basic learning algorithms, in particular the simple evolution strategy, which are used in the experimental chapters of the thesis. In some of the particular experiments these algorithms are used in their naive form, whereas in other experiments slightly augmented versions are used. Those algorithms are described in the relevant experimental section, referring to the descriptions given in this chapter.

Chapter 3

Computational intelligence and games

Computational intelligence techniques can be said to have been combined with games for the first time in 1959, when Samuel applied a simple reinforcement learning algorithm to the board game Checkers (also known as Draughts) [116]. With no human instructions, through only playing itself and observing which sequences of moves won games and which sequences lost, and using the extremely limited hardware available at the time, the algorithm managed to learn strategies good enough to beat its inventor. After Samuel's early success, all was quite quiet on the CI front for a long time. But for as long as there has been artificial intelligence research, a few researchers have worked on applying classical AI techniques, essentially specially tailored search algorithms, to board games such as Chess and Checkers. This line of research eventually led to the much-publicized victory of the IBM Deep Blue Chess computer over world Chess champion Garry Kasparov in 1997 [96]. Whereas this and other results are impressive in their own right, little or no computational intelligence is used in such research. Likewise, virtually no effort was spent by the academic community on developing AI for other types of games than board games, or on developing AI whose goal was not to win the game, for a long time. In the last few years, however, a critical mass of researchers have become interested in both computational intelligence and in games, in one way or another, for the small but rapidly growing research field of Computational Intelligence and Games (CIG) to form. Much of the research in this field is published in a small number of conferences, all annual and started in recent years, that deal mainly in CIG research: Artificial Intelligence in Interactive Digital Entertainment (AIIDE), Eurosis GAME'ON, and the IEEE Symposium on Computational Intelligence and Games (IEEE-CIG). This research can also sometimes be found in mainstream CI journals and conferences,
and also in some specialised game publications, such as the AI Game Programming Wisdom series of books [111]. In the last few years, a few PhD theses have appeared whose main focus is the application of CI to games, for example the theses of Spronck [121], Yannakakis [154] and Bryant [15]. All of these also deal with rather complex games, and with the use of CI techniques for other purposes than winning games as quickly and efficiently as possible: adapting the difficulty level of opponents to that of a human player (Spronck), optimising player satisfaction (Yannakakis) and generating behaviour that looks intelligent to the human player (Bryant). This chapter presents a two-dimensional taxonomy of CIG research, which will be used to sample existing research in the field and to structure the experimental chapters of the thesis. The first dimension of the taxonomy indicates how the CI algorithms are used in the game: for optimization, for imitation or for innovation. The second dimension indicates whether the research uses games to study or enhance computational intelligence, or computational intelligence to augment games. Obviously, many studies and experiments could be seen as falling into several of these slots, and it is quite possible that the taxonomy is incomplete in the sense that some research which is undoubtedly CIG would fall outside of it. When that happens, the taxonomy will have to be revised; currently we will use the proposed categories to discuss CIG research in general. Among the many different types of games that will be discussed, we will emphasise racing games, as such games figure prominently in the experimental chapters of this thesis. But before putting CI and games together, we will briefly discuss computer games per se.

3.1 Computer games

There is not one definition of what a game is, but plenty. In fact, the philosopher Wittgenstein used the concept of a game in a number of thought-experiments designed to show that it was impossible to correctly define any concept in terms of sufficient and necessary conditions; instead, concepts are implicitly defined by those things that they refer to, and which are related to each other through family likeness. Learning to use a concept is learning to play the language-game that the concept forms part of, yet another example of a game [150]. While Wittgenstein is fairly far removed from the relatively down-to-earth topics of this thesis, more pragmatically minded authors have provided other definitions. The legendary game designer Sid Meier (creator of what the author of this thesis considers the best computer game ever, the epic strategy game Civilization) defines a game as "a series of meaningful choices". Others, such as Salen and Zimmerman, emphasize conflict as central to games: a game is
"a system in which players engage in an artificial conflict, defined by rules, that results in a quantifiable outcome". Juul provides a more academic definition: "A game is a rule-based formal system with a variable and quantifiable outcome, where different outcomes are assigned different values, the player exerts effort in order to influence the outcome, the player feels attached to the outcome, and the consequences of the activity are optional and negotiable". In discussing these and other definitions, game designer Raph Koster remarks in his recent book that none of them contain the word "fun" [71]. As fun seems to be so central to games, he then devotes the rest of the book to understanding what makes games fun. According to Koster, a game is fun to play because we learn the game as we play; we understand and learn the patterns underlying the game, and finally "grok" how to play it. This requires that the level of challenge is always approximately right, and that new patterns are always available to learn - games that are too simple or impossible to understand are boring. (It is important to note that games can still be fun after figuring out how to win them, as long as it is possible to learn how to play even better.) Thus, a game is fun because it is a good teacher, at least according to Koster. Another theorist who has tried to nail down the essence of games and why they are fun is Thomas Malone. He claims that the factors that make games fun can be organized into three categories: challenge, fantasy, and curiosity [81]. The first thing to point out about challenge is that the existence of some sort of goal adds to the entertainment value. Further, this goal should not be too hard or too easy to attain, and the player should not be too certain about what level of success he will achieve. Games that include fantasy, according to Malone, "show or evoke images of physical objects or social situations not actually present": the sensation of being somewhere else, being someone else, doing something else. As for the third factor, curiosity, Malone claims that fun games have an "optimal level of informational complexity" in that their environments are novel and surprising but not completely incomprehensible. These are games that invite exploration, and keep the user playing just to see what happens next. With this very general characterisation in mind, let us now take a look at what types of games there are.

3.1.1 Types of computer games

There are countless taxonomies of computer games, all of which will be ignored in this section, in favour of a homebrewed categorisation more relevant to the argument of the thesis: computerised games, agent games and management games. We will also take a first look at how each type of game can be used in CI research, though this will be developed further in the rest of the chapter.

Computerised games are games which you don't really need a computer to play, because the computation involved is minimal. Such games almost always have discrete state spaces and simple rule sets, and are usually atemporal. Here we find all the classic board games: Chess, Checkers, Go, Monopoly etc. We also find card games such as Poker and Bridge, and puzzles such as crosswords and Sudoku. On the far end of the complexity scale for this type of game we find tabletop role-playing games such as Dungeons and Dragons, which are commonly played without the help of a computer, but with thousands of pages of reference material available, a large assortment of dice, and occasionally a pocket calculator. It is worth noting that even if physical sports such as Tennis, Football or Fencing are considered to be games, they don't fall into this category, as they require physical activity from the player in a way which e.g. board games don't. Rather, simulations of such sports can be played on computers with the intense physical activity replaced by button-pressing1, but these are rather different things that fall into either the agent game or management game category (both are categories that require a computer to play). Due to the simplicity of implementing computerised games, they are well suited for fair comparisons of different CI algorithms with each other, and with human players. This is especially true for those games that have been extensively analysed and which have large and active communities of human players. However, due to the nature of these games, the range of cognitive skills that can be investigated is strictly limited: Chess is extremely badly suited for research into perception, movement, the codependence of brain and body etc. (Though note Stanley's innovative application of a form of active vision for learning to play Go [128].) A case in point here is how little we learned about human intelligence from Deep Blue's victory over Kasparov.

1 See Nintendo's Sports for an interesting hybrid between computer game and physical game.

but the dynamics is always so complex that the game cannot be played in real time without a computer. Similarly, the complexity of the challenges and the graphical representation vary enormously with the age, budget, target audience and quality of the game. Agent games are especially well suited for studying questions related to cognitive science, or for testing biologically inspired ideas, especially about perception and sensory-motor coordination. This is because agents in many (not all) agent games can be said to be situated and embodied, according to the definitions discussed above. Essentially, many agent games permit much the same type of experiments to be done as can be done in evolutionary robotics, with the crucial difference that they also permit experiments to be done that could definitely not be done with real robots, lest you have an armada of spaceships to spare. Just how well suited they are will be argued at greater length later in the chapter. Management games is a bit of a catch-all term for all the games that don't fit into one of the above two categories. In such games, the player does not control a single embodied agent, at least not most of the time. Rather, the task of the player is to devise strategies, allocate resources, schedule events, or just generally solve puzzles that don't directly involve controlling an agent, be they temporal or atemporal. Management games are, like agent games, too complex to be played practically without computers. This category includes real-time strategy games (RTSs) such as Warcraft, complex turn-based strategy games such as Europa Universalis, god games such as Sim City or The Sims, some aspects of sports games such as the above mentioned FIFA and NHL, text-based adventure games such as Adventure, but also completely unclassifiable outings such as Wario Ware. As this category of games is defined more by what it isn't than by what it is, characterising it in terms of its usefulness is a bit complicated. What can be said is that the complexity of such games makes them rather less suited than computerised games for comparisons of CI algorithms; playing a single turn in Civilization takes several orders of magnitude more computational effort than in Chess, the game's difficulty might make it hard for the algorithms to learn anything at all, and the game is so hard to analyse that if one algorithm performs better than another, it might be hard to draw any general conclusions. And like computerised games, management games are usually unsuited for research on motor and perception issues. On a positive note, some management games can be made to accurately reproduce aspects of real-world decision making, so that algorithms can be evaluated on the game for their suitability for solving the real-world problem. A good example of this is Miles and Louis' research on evolving path-planning algorithms and influence maps for controlling simulated forces in a naval RTS game [90]. The U.S. Navy is funding this research, hoping to eventually use the results of it for actual military decision making.

3.1.2 Commercial versus academic game AI

While the academic study of AI for games, and especially CI for games, has a rather short history, computer games have, for as long as they have existed, needed some sort of AI to control non-player characters (NPCs). One would therefore expect a significant overlap between academic and commercial game AI. However, this is not the case: the two fields have different goals and use different types of algorithms. We will here discuss these differences and how the fields could be brought together. If we exaggerate the differences a little bit, we could say that academic researchers use CI to try to beat games, whereas AI programmers in game companies use their techniques to try to make games more interesting. The reason why the commercial game companies are not very interested in beating their own games is that if you have full access to the game, creating an NPC that plays the game better than any human player is easy - you just have to cheat, i.e. give the NPC some information or abilities that the human doesn't have. There are exceptions to this, as some games are so complex that the amount and nature of cheating necessary for the NPCs to win over a good human player might be such that it detracts from the game experience. For example, in Civilization it is a common complaint (at least on higher difficulty levels) that when waging war against computer-controlled civilisations, military units magically appear in the enemy's cities without the NPC spending the requisite time on building them. Thus, Civilization and similar hugely complex games would be among the minority of games where an AI that played the game better would have obvious industrial applications. The motivation that drives AI programmers in the game industry is to create AI that makes the game entertaining, and keeps the players playing, ultimately driving the sales of the game. In listing desirable properties of game AI, the commercial game AI programmer Gilgenbach claims that good NPC AI should be understandable, predictable, consistent, not too fast, and "not take cheap shots" (hide and shoot you in the back etc.) [48]. This goes very well with Koster's view that games are fun when you can understand them gradually; if your opponents are too smart, the game is no fun. Typically, game companies have a very pragmatic take on how AI is implemented, and anything goes as long as it works in the game; the techniques used to conjure the illusion of intelligence don't need to have anything to do with what academic researchers would recognize as AI or CI. Instead, the NPCs in the majority of games are controlled by finite state machines, lists of conditional statements, and the occasional A* pathfinding algorithm. Typically, no learning is present in either development or execution of the AI. Everything is hard-coded by the human game developers.

So why are CI algorithms, which are undoubtedly more powerful and scalable than finite state machines and lists of conditional statements, hardly ever used in commercial games? Several answers have been given. A common argument is that there are simply not enough computational resources available; CI algorithms use many more processor cycles than the current techniques, and most of the computing power in a modern game goes to producing the pretty graphics. The argument typically goes on to claim that people will stop caring about pretty graphics sometime soon now, and together with new hardware becoming available, this will herald the new era of true AI in games. This argument has been around since the Ancient Geeks, and has not aged well. The Playstation 3 is many orders of magnitude more powerful than the Nintendo NES, but the games for the former console seem not to use much more CI than those for the latter did. And yes, people care just as much about pretty graphics now as they ever did. A much better answer is that CI algorithms are too unpredictable to use in-game. Evolutionary algorithms, td-learning and backpropagation might deliver fantastic results, but also terrible results. The NPCs might become far too easy or far too hard to beat (or cooperate with), or just start to behave erratically and illogically, ruining the illusion of intelligence and the suspension of disbelief. Further, if NPC players change their behaviour patterns too quickly and too radically, the player might feel that he doesn't understand the game and all his learning efforts are in vain, even if the new NPC behaviour is perfectly logical and understandable in itself. If Koster and Gilgenbach are to be believed, this is very bad game AI design. If you ask a game developer about the greatest challenge facing them right now, chances are they will not mention AI at all. Instead, they might reply that with all the processing power and storage space now available, games are expected to be bigger and prettier than ever before, and so the problem is producing enough content to fill the game. Developing a major title might require more than a hundred developer-years, and no small percentage of the people employed are artists and level designers. The huge amount of resources needed to develop a game means that fewer major games are profitable. If the CI community could come up with ways to use their techniques to cut down on development costs, especially for level design and artwork, they might get rather more attention from the game industry than they have seen in the past.

Reconciling academic and commercial game AI research

Several academic researchers have addressed this disconnect between academic and industrial game AI research, and given suggestions for how the two communities could learn to interoperate. In an influential article from a few years back, Laird and van Lent argue [76] that full-fledged commercial computer games are the ideal application area for human-level artificial intelligence. Laird and van Lent argue that users are driven to online games because of the
failings of current game AI. Gamers want NPCs they can interact with in much richer ways than the current crop, and who behave in an overall human-like way. Both academics and game developers would have much to gain from this, as no other AI application areas demand such a broad range of human-level cognitive skills, and as better AI could make more complicated NPCs (that don't rely on cheating) possible, heightening the game experience. Laird and van Lent are not CI researchers, and so they argue not for evolutionary algorithms and neural networks, but rather for good old-fashioned AI techniques such as planning algorithms and expert systems. As such algorithms are more similar to, and in some cases even rather straightforward extensions of, the techniques that are already in use in commercial games (an expert system can be seen as a list of conditional rules), game developers are probably less reluctant to adopt them. And Laird and van Lent make a good case that full-fledged commercial computer games are good research platforms for academic AI developers, including the essential recognition that something that hundreds of people have developed and hundreds of thousands of people play is not just like any robot simulator, and cannot be claimed to lack validity and abstract away from the challenging issues in the same way. However, they fail to explain how their proposed augmentations of NPCs with better AI would make games more fun, and thus translate to higher revenues for game developers. Baekkelund expresses similar views in a more recent article which focuses on practical issues in collaboration between industry and academia [6]. While convincingly explaining some of the benefits for academic researchers of tapping into commercial game technology, he doesn't explain in detail how more advanced AI would benefit the games industry. Instead, he claims that "Research AI stands to be an important source for sparking innovation within game AI" and that "Games with particularly innovative AI will stand out among competitors". This might well be true, but doesn't help us guide our research. Turning the issue somewhat on its head, Cowling suggests that we view the writing of AI as a sport in itself [31]. He notes that there have been competitions between AI programmers, where the goal is to write the AI that wins against the other competitors' AIs, for a long time. Fairly recently, such competitions have expanded from computerised games such as Chess into games requiring a broader take on AI, such as Robocup (robot football). Several academics have also written games specifically tailored to AI competition, such as Cellz [77], The Virus Game [32] and Terrarium [89]. Such games are unlikely to ever get a really large following, however, if they rely on the "players" being able to program and being knowledgeable about AI in order to "play" the game. But imagine that the player could teach the NPCs how to behave without actually writing code, but instead through some other means such as demonstrating the correct behaviour to them,
rewarding or punishing them, or maybe just connecting boxes in a colourful interface. Then gamers without any expert knowledge could easily train NPCs to compete with their friends' trained NPCs. This would form a new game genre, heavily dependent on modern CI techniques. (That games based on training NPCs and pitting them against each other can succeed is obvious, looking at the astonishing success (over 100 million games sold!) of the Pokemon series, though the creature training there is relatively simplistic and not based on CI algorithms.) Cowling discusses some games operating at least partially according to these ideas; they will all be discussed in the next three sections. In those sections we also try to subsume the various ideas about commercial and academic game AI expressed above under our analysis of different types of CI in games.

3.2 Optimization

Most academic CIG research takes the optimisation approach. This means that some aspect of a game is seen as an optimisation problem, and an optimisation algorithm is brought to bear on it. Anything that can be expressed as an array of parameters, and where success can in some form be measured, can easily be cast as an optimisation problem. The parameters might be values of board positions in a board game, relative amounts of resource gathering and weapons construction in a real-time strategy game, personality traits of a non-player character in an adventure game, or weight values of a neural network that controls an agent in just about any kind of game. The optimisation algorithm can be a global optimiser like an evolutionary algorithm, some form of local search, or any kind of problem-specific heuristic. Even CI techniques which are technically speaking not optimisation techniques, such as td-learning, can be used. The main characteristic of the optimisation approach is that we know in advance what we want from the game (e.g. highest score, or being able to beat as many opponents as possible) and that we use the optimisation algorithm to achieve this.
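
As a minimal illustration of this way of casting a game as an optimisation problem, the sketch below wraps "play the game with these parameters and measure success" as a fitness function that any optimiser can call. The Game interface and its method names are hypothetical placeholders, and averaging over several trials is simply one common way of reducing evaluation noise.

// Sketch of the optimisation approach: any game aspect that can be encoded as a
// parameter vector and scored by playing the game yields a fitness function.
// The Game interface and its methods are hypothetical placeholders.
public class GameFitness {
    interface Game {
        void reset();                              // start a fresh game
        void configureAgent(double[] parameters);  // e.g. neural network weights or strategy priorities
        void playToEnd();                          // run the game to completion
        double score();                            // the measure of success we want to maximise
    }

    // Average score over a few trials; this is what an evolutionary algorithm would maximise.
    static double evaluate(Game game, double[] parameters, int trials) {
        double total = 0;
        for (int t = 0; t < trials; t++) {
            game.reset();
            game.configureAgent(parameters);
            game.playToEnd();
            total += game.score();
        }
        return total / trials;
    }
}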

3.2.1 Games for computational intelligence: testing machine learning algorithms

There is a wealth of examples of different sorts of computer games being used to test the efficiency of CI algorithms, either through comparing different algorithms against each other or, more commonly, simply through showing that a particular algorithm can be used to learn to play (or optimize the playing of) a particular game. For computerised games, the list is very long indeed; in section 2.4.7 we gave several examples of using both evolutionary algorithms and td-learning to learn evaluation functions for board games, but there are other important
studies using only evolutionary methods, such as Fogel's Blondie24, where evolution learned to play Checkers at a very high level [46]. Several researchers have also used different forms of management games, mainly strategy games, as benchmark problems for their algorithms. Examples include the above-cited work of Miles and Louis on evolving path-planners and influence maps for a naval RTS game [90], but also Spronck's work on evolving construction priorities and other aspects of strategy in the classic RTS Warcraft [121]. In agent games, evolutionary computation has been used to tune parameters for hard-coded NPC controllers in the very popular FPS Counterstrike [30]; for the 2D space shooter Xpilot, Parker and Parker have evolved both parameters for hard-coded control algorithms [105] that turned out to play very well against standard battle bots and human competitors, and neural networks that control the spaceship on their own [106], closing the sensory-motor loop but playing less well. As an example of a different type of agent game and a different learning algorithm, Graepel et al. applied the Sarsa variety of td-learning to behaviour learning in the modern commercial fighting game Tao Feng, with good results [52]. Also worth mentioning is Priesterjahn et al.'s evolution of control for agents in the FPS Quake 3, based on the internal data structure in the game [110]. Several groups of researchers have taken the optimization approach towards racing games. Tanev [131] developed an anticipatory control algorithm for an R/C racing simulator, and used evolutionary computation to tune the parameters of this algorithm for optimal lap time. Chaperot and Fyfe [24] evolved neural network controllers for minimal lap time in a 3D motocross game. And there are other things than controllers that can be optimised in car racing, as is demonstrated by the work of Wloch and Bentley, who optimised the parameters for simulated Formula 1 cars in a physically sophisticated simulation [151] with the objective of lowering lap times, and by Stanley et al., who evolved neural networks for crash prediction in a simple car game [125].

Our own work on optimization in three agent games (simulated car racing, Cellz and helicopter control) is discussed in chapter 5; in addition, we have performed experiments, published elsewhere, on the evolutionary solution of Sudoku puzzles, which is a straightforward application of optimization to computerised games [94][93]. It should be noted that the work of Chaperot discussed above was done after our initial publication on simulated car racing, and the work of Tanev was done concurrently. (There are other important differences as well, for example that Chaperot's racing simulator lacks walls so that the vehicle never gets stuck, and that Tanev only evolved parameters for his controller, while we evolved the full controller.)

53 3.2.2 Computational intelligence for games: optimizing agents and games

None of the above optimization experiments were concerned mainly with improving the entertainment value of games. Further, the games used were games where NPCs or computer strategies can be hand-coded to beat all human players without cheating, or with only relatively non-obtrusive cheating, so the demand from the games industry for CI techniques to play the game better should be small. The possible exceptions to this are the more complicated games, Counterstrike and Warcraft, where more experienced players might suspect that the computer players are able to "see" things that they shouldn't be able to see. Partly for that reason, those two games are played more often against other human players than against computer players. This is not to say that it is impossible to use CI algorithms to make these games more fun. It would be very possible to e.g. evolve game parameters for a game in order to make it more fun, with the fitness function being either reports by human testers, or some sort of entertainment metric. It is also entirely possible that this is already being done in the development phase of commercial games, but that the game developers see no reason to report it in journals or conferences; its apparent absence from common publication venues only points to it not having been tried in academic game research. However, in the Innovation section below, we will discuss attempts to evolve complete control systems according to entertainment metrics.

3.3 Imitation

In the imitation approach, supervised learning is used to imitate some aspect of a game. What is learned could be either the player's behaviour, the behaviour of another game agent, or the dynamics according to which an agent moves. While supervised learning is a huge and very active research topic in machine learning, with many efficient algorithms developed, this does not seem to have spawned very much research in the imitation approach to CIG.

3.3.1 Games for computational intelligence: testing supervised learning algorithms

Unlike the situation for optimisation and reinforcement learning algorithms, there appears to be very little published research on using games to test supervised learning algorithms. The main reason for this is almost certainly that there are plenty of high-quality datasets for testing supervised learning algorithms freely available already. Using games to test these algorithms
would only be meaningful if what was tested was a version of an algorithm specially modified to work with the game in question; otherwise, one of the available datasets could be used. One of the few studies in the literature that uses games to compare supervised learning algorithms is the paper by Chaperot and Fyfe discussed above, where a second experiment compared variations of the backpropagation algorithm for learning to drive based on human examples [24].

3.3.2 Computational intelligence for games: modelling behaviour and dynamics

Several commercially successful games have used the imitation approach to enhance gameplay. We have examples from both agent games and a management game in this section; it seems that no-one has explored imitation in computerised games. An example from management games is the critically acclaimed Black and White by Lionhead Studios. In this game, the player takes the role of a god trying to influence the inhabitants of his world with the help of a powerful creature. The creature can be punished and rewarded for its actions, but will also imitate the player's actions, so that if the player casts certain spells in certain circumstances, the creature will try to do the same to see whether the player approves of it. In racing games, there are actually examples of the imitation approach being taken both during development and on-line during game-play. For the Playstation game Colin McRae Rally 2.0, Hannan (an academic CI researcher turned game developer) trained neural networks to imitate his own driving in order to produce good human-like NPC driving [84]. The neural networks were standard MLPs trained with the Rprop variation of backpropagation, though much time went into choosing the size of the network and the correct set of inputs. However, the neural networks only provided the basics of the driving AI; there was a layer of rules on top that decided what driving styles to adopt in different situations, e.g. when to overtake a competitor or recover from a crash [18]. An example of using imitation on-line, during the actual gameplay, is the XBox game Forza Motorsport from Microsoft Game Studios. In this game, the player can train a "drivatar" to play just like himself, and then use this virtual copy of himself to get ranked on tracks he doesn't want to drive himself, or test his skill against other players' drivatars. During the development of this feature, a number of approaches were tried, including neural networks, but their performance was unsatisfactory. Instead, a simpler approach was used, where the racing line and speed on each track segment was recorded, and a hard-coded algorithm kept the car as close as possible to the recorded racing line. This method necessitated certain constraints on the track design,
and also that crash recovery had to be handled using another hard-coded algorithm [58]. Moving from racing games to real car driving, Pomerleau's work on teaching a real car to drive on highways through supervised learning based on human driving data is worth mentioning [109]. The neural networks associated human driving commands with features extracted from a forward-pointing video camera, and thus learned to drive rather well on real (American) highways. Pomerleau's reason for using imitation rather than optimisation in this case was probably not that interesting driving was preferred to optimal driving, but rather that evolution or td-learning using real cars on real roads would be costly. Our own research presented in chapter 6 deals with imitation of both behaviour and dynamics in the context of car racing.

3.4 Innovation

The border between the optimisation and innovation approaches is not clear-cut. Basically, the difference comes down to this: in the optimisation approach we know what sort of behaviour, configuration or structure we want, and use CI techniques to achieve the desired result in an optimal way. Taking the innovation approach, we don't know exactly what we are looking for. We might have a way of scoring or rewarding good results over bad, but we are hoping to create lifelike, complex or just generally interesting behaviours, configurations or structures rather than optimal ones. Typically an evolutionary algorithm is used, but it is here treated more like a tool for search-based design than as an optimiser.

3.4.1 Games for computational intelligence: evolving complex general intelligence

In section 2.3 we discussed the problems holding back progress in evolutionary robotics in some detail, especially the problem of finding suitable experimental environments and tasks. In short, evolution on real robots faces the recoverability problem, the problem of computation time, and the problem of high costs of robots and environments. Evolution in simulation faces the problems of simplicity and exploitability: most robot simulators have too simple dynamics, allow only for too simple tasks to be performed, allow only for too simple and low-dimensional inputs to the controller, and have exploitable weaknesses in the sense that evolution can produce behaviour that leads to high fitness due to a flaw in the simulator but would never work on the real robot. These could very well be the real reasons we have not seen any really complex general behaviour, or “real intelligence”, emerging from evolutionary robotics simulations.

The innovation approach to games for computational intelligence means taking computer
games, especially complex agent games, seriously as a replacement for real robots in evolutionary robotics. Such computer games have a long list of advantages over real robots or robot simulations, potentially solving all the problems discussed above:

• Compared to fitness evaluation on real robots, evaluation in a computer game can happen much faster, for two reasons: one is that computer games can be run faster than real-time; even the most advanced computer games of today, which bring a top-of-the-line desktop down on its knees, will have cycles to spare on next year's computer, and by running games from the 1980s on a virtual machine you can get speedups of orders of magnitude. Another is that any game can be run in several replications across a cluster of computers, giving the evolutionary algorithm the ability to evaluate several individuals in parallel at moderate cost.

• Like in robot simulations, recoverability is simply not an issue - just restart the game.

• Unlike robot simulations, there are already very good tasks and associated fitness functions present. If we agree with Koster that good games are good teachers, then a good game is
designed so that it is at first easy for a complete novice to make some progress without dying, but this becomes harder as the game progresses, with more and better enemies and puzzles and more constraints in terms of time, energy etc. Really good games demand not only more of the same skills, but expand the set of demanded skills as the game goes on, providing us with a form of implicit incremental evolution. The fitness function is simple: the score of the game.

• Competitive co-evolution is also well catered for, with many games specially tailored to competitive playing over the internet. Some of these games (such as Counterstrike or
Halo) have huge communities of active players, allowing evaluation against real humans and their sometimes very sophisticated tactics. Many of these games also feature ranking systems, allowing testing against humans of a particular skill level.

• The graphics of modern games, which can hardly even be compared to those of the vast majority of robot simulations, allow for very complex, high-dimensional visual input to the controllers. Further, games are developed so that they are playable using only the information given on the screen; thus there is no problem about selecting what sort of
information to use, just how to make the controller understand it.

• In the same way, computer games provide well-defined outputs from the controller: just connect the outputs from the evolved controller to a simulated game controller, joystick or keyboard.

• Another important advantage over robot simulations is that all games are designed so as not to have exploitable weaknesses, and commercial games go through huge amounts of testing to make sure that there are no ways of cheating and getting top score without being able to play the game particularly well. This is a significant benefit when compared to standard robot simulations. (Turning the problem on its head, evolutionary algorithms can be used to identify such weaknesses, thereby automating game testing to some extent, as exemplified by Denzinger et al. working in collaboration with games developer Electronic Arts [37].)

Given all these advantages, it's surprising that more CI researchers do not take agent games seriously. A common counter-argument is that computer games are not "real". But how can something which has taken hundreds of developer-years to construct, and on which hundreds of thousands of players spend countless hours playing, not be real? Perhaps it's a question of age; researchers that have grown up with computer games at home are more likely to take them seriously. One interesting example of the innovation approach to games for computational intelligence is the previously mentioned work of Floreano et al. [43] on evolving active vision in a racing game, work which was undertaken not to produce a controller which would follow optimal race-lines but to see what sort of vision system would emerge from the evolutionary process. As for the experiments in this thesis, most of the research using the car racing game is part of a project to see how complex and general intelligence can be evolved in such apparently simple games, ranging from the rather straightforward driving behaviour in chapter 5 to the more complex interactions in chapter 7.

3.4.2 Computational intelligence for games: emergence and content creation

The innovation approach can be used to make games more entertaining as well. This can be done for all different types of games, either by improving aspects of existing games or by creating completely new types of games. An example of improving existing games is Yannakakis and Hallam’s work on evolving inter- esting opponents for a simple version of the classic Pacman game [152]. In those experiments, the ghosts (opponent NPCs) were evolved according to a fitness function constructed as a way of measuring Malone’s entertainment factors; the efficacy of this method was borne out in surveys of human players. But what is evolved does not need to be NPC behaviour, but could also be other forms of game content such as levels, environments, puzzles or artwork; in fact, it is quite possible that automatic ways of generating other content is in higher demand in the games industry than
automatic ways of generating behaviour. In section 7.3 we present a method of evolving racing tracks based on player models. Finally, there is the possibility of creating new genres of games based on evolutionary innovation, e.g. training games as discussed above. An impressive example of this is Stanley et al.'s NERO game. Here, real-time evolution of neural networks provides the intelligence for a small army of infantry soldiers [124]. The human player trains the soldiers by providing various goals, priorities and obstacles, and the evolutionary algorithm creates neural networks that solve these tasks. Exactly how the soldiers solve the tasks is up to the evolutionary algorithm, and the game is built up around this interplay between human and machine creativity.

3.5 Summary

In this chapter, we first discussed computer games per se, distinguishing between computerised games, agent games and management games. We then discussed various ways in which computational intelligence can be applied to games, identifying three major approaches: optimisation, imitation and innovation. Within each of these approaches we gave examples of how games can be used in CI research, and how CI research can be used to augment games. While chapter 2 provided the necessary background in computational intelligence methods and research issues to situate the experiments in this thesis, the present chapter aims to provide the same sort of background when it comes to the application of computational intelligence to computer games. However, this chapter cannot claim to be an exhaustive overview of issues and research trends. This is mainly because the CIG research field is still young and far from well-defined. Further, important insights into the nature of games and entertainment maximisation can be found in other academic disciplines, most notably game studies, and in the often non-academic discourse of game development practice. The author willingly concedes to not being well read in these fields. However, as the field of CIG matures, insights from these fields will have to be integrated.

Chapter 4

Games in this thesis

The main experimental environments in this thesis are two simple racing games based on a car racing simulator. However, to demonstrate the applicability of the evolutionary learning methods we discuss here, and to further investigate some of the topics discussed in chapters 2 and 3, some experiments were conducted in different game environments. Both of the additional games, Cellz and helicopter control, are, like the car simulation, based on agent control in continuous physics-based domains. The experiments conducted in these additional environments are described in less detail in the thesis, and so these domains are themselves described in less detail than the car racing simulation. In this chapter, we are not discussing the important issue of which aspects of the environment are available to the controller, nor how these aspects are represented. As how to best sense an environment is one of the issues the research in this thesis aims to throw new light on, it is instead discussed in the methods subsection of the relevant experimental sections. Also worth noting is that we are in this chapter describing all the experimental environments from a games perspective; both the helicopter control and car racing domains can very well be treated from a robotics perspective as well. In section 6.2 we treat the car racing problem from more of a robotics perspective.

4.1 Simulated car racing

Most of the experiments in this thesis use simulated car racing as their experimental environment. Several versions of a simple car racing simulator were developed specifically for these experiments, and the different versions were refined incrementally during the course of research. Two main types of car racing game were developed using this simulator, track-based racing and point-to-point racing. In this section, we first describe the car dynamics that are common to

both games, and then the rules and special dynamics of each game.

Early on in the project, we considered using a freely available open source car simulator such as RARS (http://rars.sourceforge.net/), TORCS (http://torcs.sourceforge.net/), or CarWorld (http://carworld.sourceforge.net/) for the experiments. This could possibly have sped up development and allowed access to good car dynamics and 3D visualization. However, in order to have full freedom to conduct the experiments we wanted, we needed to be able to change any aspect of the simulation, and learning a complex piece of code in enough detail to change what we wanted could well be as complicated as writing it from scratch. Further, while 3D graphics output would be desirable but non-trivial to create ourselves, the dynamics and collision handling present in the available open source offerings were not significantly better than what we could create ourselves. Also, most third-party simulations are not built with computational efficiency as a high priority, and computational efficiency is of paramount importance for something that acts as part of a fitness function in an evolutionary algorithm. Most important to our decision to build the simulation ourselves, however, was that we wanted the simulation implemented in pure Java, for maximum portability between platforms.

The car racing simulation is intended to qualitatively, but not quantitatively, replicate racing a small (scale 1/24) radio-controlled (R/C) car on a tabletop racing track, or on a small indoor floor. More specifically, we started by replicating a tabletop track we have set up at the University of Essex, and the feel of driving our 18 cm toy car on it. A computer-controlled R/C racing competition was organized at the 2005 IEEE Congress on Evolutionary Computation (CEC), with three entrants, but only one competitor managed to drive successfully around the track based on the overhead video feed. The difficulty of this problem contributed to our decision to concentrate on car racing in simulation; however, see section 6.2, where we return to the problem of controlling the physical car. The specific features we wanted to model were that the car would be driving relatively fast for its size, and have poor grip on the surface, easily skidding at high speeds, due to its low mass. For the same reasons, collisions would be highly elastic and easily send the car spinning in a hard-to-predict manner.

In the following subsections we will give a number of equations that specify the dynamics and collision behaviour of the cars. Here, any bold text signifies a two-dimensional vectorial entity; everything else is scalar. Italic text followed by parentheses () signifies a function.

4.1.1 Dynamics

In the simulation, a car is simulated as a 20 × 10 pixel rectangle, operating in a rectangular arena of size 400 × 300 or 400 × 400 pixels. The car's complete state is specified by its position (p),

velocity (v), orientation (θ) and angular velocity (θ̇). The simulation is updated 20 times per second in simulated time, and each time step the state of the car(s) is updated according to the equations presented here. The fundamental equations for the position are:

$\mathbf{s}_{t+1} = \mathbf{s}_t + \mathbf{v}_t$ (4.1)

$\mathbf{v}_{t+1} = \mathbf{v}_t \cdot (1 - c_{drag}) + \mathbf{f}_{driving} + \mathbf{f}_{grip}$ (4.2)

$c_{drag}$ is a scalar constant, set to 0.1 in most versions of the simulation.

$\mathbf{f}_{driving}$ represents the vectorial contribution from the motors. The scalar component of the contribution is 4 if the forward driving command is given, 2 if the backward command is given, and 0 if the command is neutral. The vector is obtained by rotating the vector (the scalar component, 0) according to the orientation of the car.

$\mathbf{f}_{grip}$ represents the effort from the tyres to stop the car's skidding. It is defined as 0 if the angle between the direction of movement and the orientation of the car is less than π/16. Otherwise, it is a vector whose direction is perpendicular to the orientation of the car: θ − π/2 if the difference is > 0, otherwise θ + π/2. Its magnitude is the minimum of the velocity magnitude of the car and the maximum lateral tyre traction, which is set to 2 in all versions of the simulation. Similarly, the fundamental equations for the orientation of the car are:

$\theta_{t+1} = \theta_t + \dot{\theta}_t$ (4.3)

$\dot{\theta}_{t+1} = f_{traction}(f_{steering}() - \dot{\theta}_t)$ (4.4)

$f_{steering}()$ is defined as magnitude($\mathbf{v}$) if the steering command is left, −magnitude($\mathbf{v}$) if the steering command is right, and 0 if it is centre.

$f_{traction}$ limits the change in angular velocity to between −0.2 and 0.2. The behaviour emerging from all this is a car that accelerates faster (and reaches higher top speeds) when driving forwards than backwards, that has a fairly small turning radius at low speeds, but a turning radius approximately twice as large at higher speeds, due to a large amount of skidding at moderate to high speeds.
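To make the update rules above concrete, the following is a minimal Java sketch of one simulation time step implementing equations 4.1-4.4. It is an illustrative reading of the description, not the simulator code used in the thesis; in particular, the sign convention for reversing, the handling of the slip angle, and the interpretation of $f_{traction}$ as a clamp on the new angular velocity are assumptions.

```java
// Sketch of one dynamics update (equations 4.1-4.4). Names, the negative
// thrust for reversing and the slip-angle sign convention are assumptions.
public class CarDynamicsSketch {
    static final double DRAG = 0.1;          // c_drag
    static final double MAX_TRACTION = 2.0;  // maximum lateral tyre traction
    static final double MAX_TURN = 0.2;      // clamp applied by f_traction

    double x, y, vx, vy, theta, thetaDot;

    // drive: 1 forward, -1 backward, 0 neutral; steer: 1 left, -1 right, 0 centre
    void step(int drive, int steer) {
        // Equation 4.1: position update
        x += vx;
        y += vy;
        // Equation 4.2: drag, driving force and grip
        double thrust = drive > 0 ? 4.0 : (drive < 0 ? -2.0 : 0.0);
        double fdx = Math.cos(theta) * thrust, fdy = Math.sin(theta) * thrust;
        double speed = Math.hypot(vx, vy);
        double fgx = 0, fgy = 0;
        double slip = angleDiff(Math.atan2(vy, vx), theta);
        if (speed > 0 && Math.abs(slip) >= Math.PI / 16) {
            double dir = slip > 0 ? theta - Math.PI / 2 : theta + Math.PI / 2;
            double mag = Math.min(speed, MAX_TRACTION);
            fgx = Math.cos(dir) * mag;
            fgy = Math.sin(dir) * mag;
        }
        vx = vx * (1 - DRAG) + fdx + fgx;
        vy = vy * (1 - DRAG) + fdy + fgy;
        // Equations 4.3 and 4.4: orientation and angular velocity
        theta += thetaDot;
        double steering = steer * speed;     // f_steering(): +mag(v) left, -mag(v) right
        thetaDot = Math.max(-MAX_TURN, Math.min(MAX_TURN, steering - thetaDot));
    }

    static double angleDiff(double a, double b) {
        double d = a - b;
        while (d > Math.PI) d -= 2 * Math.PI;
        while (d < -Math.PI) d += 2 * Math.PI;
        return d;
    }
}
```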

4.1.2 Collisions

All versions of the car racing problem feature collisions of some sort, either between cars and walls, between one car and another, or both. These are implemented slightly differently.

Wall collisions

In those versions of the car racing simulation which have tracks defined by walls, the walls are represented as coloured pixels in a background image covering the whole track. The colour of each pixel encodes the orientation of the wall segment of which it is part; these slopes are stored in a table which is indexed on the colour of the pixel. (When a track is created, the walls are usually represented as lines with a 10 pixel width. Tracks are either created by a human designer using a simple graphical tool, or through an evolutionary process, as detailed in section 7.3. In both cases, the track is converted to an image-based representation before being used for driving.) Collision detection is done each time step, by checking each corner of the 20 × 10 rectangle bounding the car for intersection with one of the walls. The checking is done by looking at the colour of the pixel at that position in the background image; the extremely low computational overhead of this method is the reason for choosing it over more direct but computationally more expensive methods based on computational geometry. The collision detection method can have three different outcomes: no intersection is detected, an intersection is detected at one of the corners, or intersections are detected at two corners. Due to the limited speed of the car relative to the frequency of collision checking, intersection with three or four corners is virtually impossible, and if it should occur it is treated as a collision with two corners.
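A minimal sketch of this pixel-lookup detection scheme is given below; the BufferedImage background, the colour-to-slope table and the method names are illustrative assumptions rather than the actual implementation.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of pixel-based wall collision detection: each car corner is tested by
// reading the colour of the background image at that point.
public class WallCollisionSketch {
    BufferedImage track;                 // walls drawn as coloured pixels
    Map<Integer, Double> slopeByColour;  // wall slope indexed by pixel colour

    /** Returns the wall slopes hit by the car's corners (empty if no collision). */
    List<Double> detect(double[][] corners) {
        List<Double> hits = new ArrayList<>();
        for (double[] c : corners) {
            int px = (int) c[0], py = (int) c[1];
            if (px < 0 || py < 0 || px >= track.getWidth() || py >= track.getHeight()) continue;
            int colour = track.getRGB(px, py) & 0xFFFFFF;
            Double slope = slopeByColour.get(colour);
            if (slope != null) hits.add(slope);   // this corner is inside a wall segment
        }
        return hits;                              // 0, 1 or 2 entries in practice
    }
}
```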

Obviously, if no intersection is detected, no collision handling is performed. If one or two intersections are detected, the velocity, angular velocity and position of the car are updated as follows:

$\dot{\theta} = \dot{\theta} \pm mag(\mathbf{v})$ (4.5)

$\mathbf{v}_{new} = \left[\cos(f_{newdir}()) \cdot \frac{mag(\mathbf{v})}{2},\ \sin(f_{newdir}()) \cdot \frac{mag(\mathbf{v})}{2}\right]$ (4.6)

$\mathbf{s} = \mathbf{s} + \mathbf{v}_{new}$ (4.7)

In equation 4.5, the magnitude of the velocity is added to the rotational velocity if the intersection is detected at the front-right corner or the back-left corner. If the intersection is detected at the front-left or the back-right corner, the magnitude of the velocity is instead subtracted from

the rotational velocity.

$f_{newdir}$ is calculated in different ways depending on where the collision was detected. First, the direction (in the global frame of reference) of v (before the collision) is calculated as movedir, and the slope of the wall at the point of collision as wallangle. (The average of the angles of the two walls is used if intersections with different walls are detected at two corners.) Then, newdir = movedir ± abs(movedir − wallangle). The absolute value of the difference is added if the collision is detected at the back-left or front-right corner, and subtracted otherwise. If a full frontal collision is detected, abs(movedir − wallangle)/2 is added, and if a collision is detected at both back corners, this term is subtracted. The resulting wall collision behaviour has the car, for the most part, "bouncing" off walls with reduced speed and changed orientation, and some angular momentum that keeps influencing the orientation of the car over the next few time steps. Sometimes the car bounces off the wall in such a way that it can simply continue driving; often it has to back away first; and in a few cases (mostly in corners of the track) it is simply stuck and cannot easily get back on track. The actual bouncing is relatively hard to predict, which, taken together with the reduction in speed, makes it hard to devise a driving strategy that consistently exploits wall collisions in a way that would not work when driving a real R/C car on a tabletop track with wooden walls.
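The sketch below gives one possible reading of this single-corner collision response (equations 4.5-4.7); the corner numbering is an assumption, the two-corner special cases are omitted for brevity, and this is not the thesis implementation.

```java
// Sketch of the wall collision response (equations 4.5-4.7) for a single
// intersecting corner. Corner indexing is assumed: 0 = front-left,
// 1 = front-right, 2 = back-right, 3 = back-left.
public class WallResponseSketch {
    double x, y, vx, vy, thetaDot;

    void respond(int corner, double wallAngle) {
        double mag = Math.hypot(vx, vy);
        double moveDir = Math.atan2(vy, vx);
        // Equation 4.5: add or subtract the speed from the angular velocity
        boolean positive = (corner == 1 || corner == 3);   // front-right or back-left
        thetaDot += positive ? mag : -mag;
        // f_newdir: deflect the movement direction away from the wall
        double delta = Math.abs(moveDir - wallAngle);
        double newDir = positive ? moveDir + delta : moveDir - delta;
        // Equation 4.6: velocity is halved and redirected
        vx = Math.cos(newDir) * mag / 2;
        vy = Math.sin(newDir) * mag / 2;
        // Equation 4.7: move the car out of the wall along the new velocity
        x += vx;
        y += vy;
    }
}
```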

Vehicle collisions

A collision between two cars differs from a collision between one car and a wall in several respects: both cars are (usually) moving, the collision affects both cars, and it is impossible to get stuck in a corner. Vehicle collisions are therefore handled differently from wall collisions. Each time step, all corners of both cars are checked for collision with the other car; collision checking is done for both cars before collision handling is initiated for either car. This is done simply by checking whether each corner intersects the rotated and translated rectangle defining the other car. The point of collision is calculated as the centre of the points of intersection on both cars, and the collision handling method is then called on both car objects. The first thing that happens is that the velocities (both speed and direction) of the two cars are simply exchanged:

$\mathbf{v}_{this} = \mathbf{v}_{other}$ (4.8)

Then, in order to avoid the otherwise occasionally occurring phenomenon that the two cars "hook up" to each other and keep moving together as a unit, the centre of each car is moved a few pixels away from the centre of the other car:

$\mathbf{s}_{this} = \mathbf{s}_{this} + [\cos(\Delta_{pos}), \sin(\Delta_{pos})]$ (4.9)

$\Delta_{pos} = \mathbf{p}_{this} - \mathbf{p}_{other}$ (4.10)

How the angular velocity is affected depends on where the point of collision is relative to the centre of the car and its orientation.

$\dot{\theta} = \dot{\theta} \pm \frac{mag(\mathbf{v}_{other}) + mag(\mathbf{v}_{this})}{2}$ (4.11)

If the corner closest to the point of collision is the front-left or back-right, the average of this and the other car's velocity magnitude is subtracted from the angular velocity; otherwise it is added. Additionally, this update of the angular velocity is only done in a particular time step if no vehicle collision was detected in the preceding time step. The resulting collision response is somewhat unrealistic in that the collisions are rather too elastic; the cars sometimes bounce away from each other like rubber balls. On the other hand, we reliably avoid the cars hooking up to each other, and make it possible, though hard, for a skilful driver to intentionally collide with the other car in such a way that the other car is forced off course or into a wall (but also easy for the driver initiating the collision to fail this manoeuvre, and lose his own direction).
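A compact sketch of the vehicle collision response just described (equations 4.8-4.11); field names, the unit-length separation step and the interpretation of the displacement angle in equation 4.9 are assumptions made for illustration.

```java
// Sketch of the symmetric vehicle collision response (equations 4.8-4.11).
public class VehicleCollisionSketch {
    static class Car { double x, y, vx, vy, thetaDot; }

    static void collide(Car a, Car b, boolean aSpinPositive, boolean bSpinPositive) {
        // Equation 4.8: exchange velocities
        double tvx = a.vx, tvy = a.vy;
        a.vx = b.vx; a.vy = b.vy;
        b.vx = tvx; b.vy = tvy;
        // Equations 4.9-4.10: push the cars apart to avoid them "hooking up"
        // (a single pixel here; the simulator moves them a few pixels)
        double ang = Math.atan2(a.y - b.y, a.x - b.x);
        a.x += Math.cos(ang); a.y += Math.sin(ang);
        b.x -= Math.cos(ang); b.y -= Math.sin(ang);
        // Equation 4.11: add or subtract the mean speed from each angular velocity,
        // with the sign given by which corner was closest to the collision point
        double mean = (Math.hypot(a.vx, a.vy) + Math.hypot(b.vx, b.vy)) / 2;
        a.thetaDot += aSpinPositive ? mean : -mean;
        b.thetaDot += bSpinPositive ? mean : -mean;
    }
}
```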

4.1.3 Games

Two games were based on the simulated car racing dynamics: point-to-point racing and track-based racing. Although the track-based racing game was chronologically developed (and put to use as an experimental environment) first, point-to-point racing is somewhat simpler and will therefore be described first.

Point-to-point racing

In this game, one or two cars race in an arena without walls. The objective of the game is to reach as many way points as possible within a set number of time steps, either 500 or 1000. These way points appear within a 400 × 400 pixel square area; the cars are not bounded by this area and can drive as far from the centre of the arena as they want. At any point in time, three way points exist within this area. The first two way points can potentially be seen by the car, but only the first of them can be passed; nothing happens when a car drives through one of the other two way points. These way points are randomly positioned at the start of a trial, so that no two trials are identical. See figure 4.1 for a depiction of the game.

Figure 4.1: Two cars in the point-to-point racing game. The black circle represents the current way point, the dark gray circle the next way point, and the light gray circle the next way point after that.

If the centre of one of the cars comes within 30 pixels of the first (current) way point, this way point is passed. The following things happen (a code sketch of this update is given after the list):

• The passed way points count (the score) for that car’s driver increases by 1.

• The current way point disappears, the second way point becomes current, and the third way point becomes second.

• A new third way point is generated, by drawing its x and y coordinates from a uniform distribution limited by 0 and 400.
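The sketch below illustrates the way point update just listed; the queue-based representation and the names are assumptions made for illustration, not the thesis code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

// Sketch of the way point update in the point-to-point game.
public class WayPointSketch {
    static final double PASS_RADIUS = 30.0;   // pixels
    final Deque<double[]> wayPoints = new ArrayDeque<>(); // current, next, next-next
    final Random rng = new Random();
    int score = 0;

    WayPointSketch() {
        for (int i = 0; i < 3; i++) {          // three way points exist at all times
            wayPoints.addLast(new double[] { rng.nextDouble() * 400, rng.nextDouble() * 400 });
        }
    }

    void update(double carX, double carY) {
        double[] current = wayPoints.peekFirst();
        if (Math.hypot(carX - current[0], carY - current[1]) < PASS_RADIUS) {
            score++;                            // passed way point count increases
            wayPoints.pollFirst();              // the current way point disappears
            // a new third way point is drawn uniformly from the 400 x 400 area
            wayPoints.addLast(new double[] { rng.nextDouble() * 400, rng.nextDouble() * 400 });
        }
    }
}
```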

The objective of the game is simply to reach as many way points as possible within the time limits, on a single trial or averaged over several trials. With only one car this mainly comes down to maneuvering the car to the current way point as fast as possible, but some foresight is needed as the way points are not navigated to in isolation. With two cars on the same track considerably more complexity is added. The following is a probably incomplete breakdown of the tasks that need to be mastered, in order of increasing difficulty, in order to win the game in direct competition with a sophisticated opponent:

1. Reach the current way point. This requires not only driving towards the correct way point, but also not overshooting it. A driver that constantly accelerates (issues the forward command) while steering in the direction of the way point will end up “orbiting” the way point at one time or another, as the grip of the tyres is limited and decreases with speed.

2. Reach the current way point fast. While it is possible to solve the preceding task by just driving very slowly (through only accelerating when the speed of the car is below a low

threshold) and steering in the direction in which the angle between the direction of the car and the way point is smallest, it would obviously take a long time to reach the way point that way. As a driver that on average reaches way points faster will win over a slower driver, it is necessary to drive at as high a speed as possible without overshooting the way point.

3. Reach the current way point in such a way that the next way point can be reached quickly. For example, it is usually a bad idea to reach the current way point going at full speed away from the next way point, which will soon be the current way point. In such cases,

it often pays off to start braking before reaching the current way point, trading off a slight delay in reaching that way point against being able to reach the next way point much faster.

4. Predict which car will reach the current way point first. If a driver can predict whether his car or the opponent's car will reach the next way point first, he can take appropriate action. An obvious response is aiming for the next way point instead, and waiting for it to become the current way point; in that way he only misses one way point, instead of risking missing two way points. However, knowing for sure which car will reach the current way

point first involves understanding not only the dynamics of the car simulation, but also the behaviour of the other driver. What if he decides that he is not going to make it first to the current way point, and goes for the next way point directly himself?

5. Use collisions to one's advantage. There are situations when it is advantageous to be able to knock the opponent off course, for example when the opponent would otherwise get to the current way point first, due to higher speed, but it is possible to get in the opponent's way and collide with him so that he misses the way point. It could even be possible to "steal"

some of his velocity this way, and use it to get oneself to the current way point faster. Conversely, if the opponent has figured out how to disrupt one’s path through intentional collisions, it becomes important to learn how to avoid such collisions.

Track-based racing

The track-based racing game introduces tracks and walls. A track consists of starting points for one or two cars, impermeable walls, and a way point chain. The goal of this game is to complete as many laps around the track as possible within a given amount of time, usually 700 time steps, in one trial or averaged over several trials. Progress is measured by the number of way points passed. Each passed way point awards a score equal to 1/n, where n is the number of way points; passing all the way points constitutes one lap and awards score 1, and in most of the tracks used in this thesis it is thus possible to reach scores of between 2 and 3 within the given time.
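A minimal sketch of this progress-based scoring, assuming a simple counter-based representation of the way point chain (names are illustrative):

```java
// Sketch of the track-based score: each passed way point is worth 1/n,
// so one full lap scores 1.
public class LapScoreSketch {
    final int n;             // number of way points on the track
    int current = 0;         // index of the current way point
    double score = 0.0;

    LapScoreSketch(int numWayPoints) { this.n = numWayPoints; }

    void wayPointPassed() {
        score += 1.0 / n;             // progress awarded per way point
        current = (current + 1) % n;  // after the last way point, the first becomes current again
    }
}
```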

Figure 4.2: A single car on one of the easier tracks in the track-based racing game. The circles represent way points and the lines protruding from the car represent wall sensors, as detailed in sections 5.1 and 5.2.

Unlike in the point-to-point game, in the track-based game more way points are used (often between 5 and 10), and their positions are (like those of the walls) fixed for the duration of the trial, and, except for the experiments in section 7.3, also between trials. Like in the point-to-point game, only the current way point can be passed at any one time, and the next way point automatically becomes visible when the current way point is passed. When the last way point is passed, the first way point in the chain becomes current again. In the tracks used in this thesis, the way points are laid out so that driving around the track in the middle of the track will mark all way points as passed.

At the beginning of a trial, each car is positioned very near one of the starting points (the starting points are very close to each other), pointing in the direction specified by that starting point, which is also roughly the direction the track is intended to be driven. Very small random deviations to both heading and position are introduced here, so that no two trials start with the car(s) at exactly the same position and orientation. The track-based racing task offers slightly different challenges than the point-to-point task. On one hand it can be seen as harder, as the presence of the walls restricts possible paths. On some tracks, it is not possible to follow a straight line between two way points, as the car would end up colliding with a wall. On the other hand, the walls provide some guidance as to which is the right way to take, and it is potentially possible to drive some (all?) tracks without relying on way points at all. Another difference is that because the walls and way point chain are fixed, it becomes possible to learn good paths for specific tracks; in theory, a controller could perform optimally on a particular track but fail to make a single lap on other tracks. When, and to what extent,

a controller would actually specialize on a particular track or evolve general driving skills will be investigated in the experimental chapters. As in the point-to-point game, having two cars competing against each other on the same track brings additional challenges, in that blocking, overtaking and avoiding the opponent now become possible (and often mandatory). The presence of walls, however, makes the collisions potentially more hazardous, as it is possible to be pushed into a wall and lose much valuable time; some collisions may mean no more progress at all on the track if the driver is not able to recognize that he is stuck, back away, and resume driving in the right direction.

4.2 Other games

The following two games are used in experiments that are described in less detail in the thesis, and as such they are described in less detail themselves.

4.2.1 Simulated Helicopter Control

One of the experiments in this thesis concerns controlling a single-rotor helicopter in simulation. Although the helicopter simulation, and the waypoint-following task to be performed in it, are here treated as a game, and the research as being about how best to evolve game agent control, the work was originally undertaken as part of the UltraSwarms project. This project, initiated by Owen Holland and with Renzo De Nardi responsible for system development and scientific experimentation, aims at producing a group of autonomous micro helicopters that move together in a swarm-like fashion and continuously exchange data and collaboratively process complex information about their environment through grid computing. The hardware platform chosen for this is a commercially available Hirobo model helicopter, with its electronics completely replaced by a custom-designed circuit board, a microcomputer running Linux, a Bluetooth chip and an inertial measurement unit. As acquiring a good model of the helicopter is both dependent on all the hardware being finished and properly tested, and rather difficult in itself, the decision was taken to start investigating how to develop the controller for the helicopter in parallel with developing the hardware, using a freely available open source helicopter simulator. The simulator in question is the Autopilot software suite, which simulates an XCell 60 model helicopter with a high degree of accuracy (http://autopilot.sourceforge.net). The computation of rotor thrust and drag forces is done using blade element theory, and rotor and stabilising bar dynamics are modelled in detail, as proposed by Mettler et al. [87]. The simulation is updated at 100 Hz; the current state of the helicopter is fed to the controller, and the control signal to the helicopter, at 50 Hz. While the XCell 60 helicopter differs from the hardware platform

in development for the UltraSwarms in several ways, there are also important similarities. Most importantly, the complexity of the dynamics is of the same order. Both the real Hirobo helicopter and the simulated XCell helicopter are virtually unflyable in the absence of assistive control mechanisms (most importantly yaw stabilization) or proper education in helicopter flight. Whereas almost anyone can drive a real or simulated R/C car in a straight line and stop, extensive training is required to be able to take off, fly in a straight line, and land in the Autopilot simulator or using the R/C helicopter it simulates. In this respect, the simulated helicopter control domain is more challenging than the simulated car racing domain. Basing a challenging game on this simulation is relatively straightforward, as the simulation itself is so challenging. The task is basically an extension of the point-to-point car racing task to three dimensions. At the beginning of each trial, which lasts for 1000 time steps, a new way point chain is generated, consisting of way points with a mean distance of 17.5 feet between them. As in the car racing games, only the current way point can be visited at any time, and a way point is deemed visited if the centre of the helicopter comes within 1 foot of the centre of the way point. The basic fitness function is simply the number of way points visited per trial, although other fitness functions that took into account the mean deviation from the shortest path between the way points were used in some experiments. The helicopter control game is included in this thesis in order to show that controllers can be developed through simulated evolution in much the same way as for the simulated car racing and Cellz games, even though the dynamics of the agent are considerably more complex, as is the dimensionality of the inputs to the controller. A second reason for including it is that the experiments performed using it demonstrate and corroborate some of the earlier discussed points about modularity and incrementality in evolution of neural networks.

4.2.2 Cellz

Cellz is a simple dynamical game designed specifically as a test bed for evolutionary algorithms [77]. Like simulated car racing and helicopter control, it is a game which takes place in a physics-based environment simulated in discrete time, where the goal involves reaching certain points in space within a specified time. Unlike those domains, however, the commands returned from the controller are continuous rather than discrete, there are many potential points to reach at any one time, and typically there is a whole group of agents active simultaneously rather than just one or two. A screen shot from a game of Cellz with movement traces of the individual agents is shown in figure 4.3.

Imagine an ooze full of single-celled organisms on the hunt for food particles. This is the central metaphor for Cellz, which was designed with simplicity of the rule-set in mind, yet

Figure 4.3: An evolved perceptron controller playing Cellz, with movement traces on.

allowing for complex dynamics to emerge. Each agent is circular, and moves about on a two-dimensional area. Within this area, ten food particles are scattered randomly. The agents each have an energy level, which can be used for accelerating in any direction; each time-step an agent's controller specifies a two-dimensional force vector. When the energy reaches zero, the agent dies, but if it doesn't move, it doesn't use any energy. When an agent reaches (comes close enough to) a food particle, two things happen: the agent's energy level increases, and the food particle disappears, only to instantly reappear at a random place in the (bounded) game area. If the energy level of the agent exceeds a certain limit, it splits into two agents, each with half the original energy level. A game (trial) of Cellz lasts for 2000 time steps, and the score (fitness) is simply the total energy level at the end of the trial. In other words, high scores are reached by eating as much food as possible, and dividing into as many agents as possible, while not moving unnecessarily. So, how does one play Cellz, and wherein lies the difficulty? If only one agent were present, the problem would be similar to a travelling salesman problem, in that a number of points have to be reached while moving a minimal distance, but with the added complications that a new point appears at a random position whenever a point is reached, and that momentum and friction have to be taken into account, so continued acceleration is necessary to keep a given speed, but it's desirable not to reach a way point at too high a speed, or it will take longer to reach the next way point. When more than one agent is present, things get more complicated still. One of the main complications is that it is now desirable to avoid going for food particles that another agent is going to get to first. In the version of the game that is used for the experiment in this thesis, the same controller is used to control all the agents in a game - the sensors of each agent are input to the controller,

and the movement commands for that agent are output, in turn - so the controller acts as a "hive-mind". However, it is possible to envision versions of the game where each agent has its own controller, and evolution is done "in situ", with a new controller created as an offspring of the old controller each time an agent divides. Other potential extensions involve multiple species, where high-energy agents of one species can eat low-energy agents of another species, etc. The Cellz game is included in this thesis partly to show that learning controllers through simulated evolution is feasible for multi-agent games as well, but mainly because the experiment that uses this environment nicely illustrates the usefulness of convolutional modularity.
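Summarising the rules described above, the following is a minimal sketch of one Cellz time step for a single agent; the numerical constants (eating radius, split threshold, movement cost, friction) and names are illustrative assumptions rather than the published rule set.

```java
import java.util.List;
import java.util.Random;

// Sketch of the Cellz update for a single agent: move under a controller-supplied
// force, pay an energy cost for moving, eat food, and split when energy is high.
public class CellzSketch {
    static final double EAT_RADIUS = 10, SPLIT_ENERGY = 200, MOVE_COST = 0.1;
    static final Random rng = new Random();

    static class Agent { double x, y, vx, vy, energy = 100; }

    static void step(Agent a, double fx, double fy, List<double[]> food, List<Agent> born) {
        a.vx = a.vx * 0.9 + fx;  a.vy = a.vy * 0.9 + fy;       // momentum plus friction
        a.x += a.vx;             a.y += a.vy;
        a.energy -= MOVE_COST * Math.hypot(fx, fy);             // moving costs energy
        for (double[] f : food) {
            if (Math.hypot(a.x - f[0], a.y - f[1]) < EAT_RADIUS) {
                a.energy += 50;                                  // eating increases energy
                f[0] = rng.nextDouble() * 400;                   // the particle reappears
                f[1] = rng.nextDouble() * 400;                   // at a random position
            }
        }
        if (a.energy > SPLIT_ENERGY) {                           // split into two agents,
            a.energy /= 2;                                       // each with half the energy
            Agent child = new Agent();
            child.x = a.x; child.y = a.y; child.energy = a.energy;
            born.add(child);
        }
    }
}
```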

Chapter 5

Optimization

In this chapter we present six series of experiments that take the optimization approach to CIG. The first two of these are presented in detail, while the other four are discussed more summarily. The first four experimental series concern simulated car racing, covering (in order of appearance) the evolution of controllers for single cars on single tracks, scaling up to several tracks, comparisons between neural networks and genetic programming (with and without state capabilities), and comparisons between evolution and temporal difference learning.

The remaining experiments concern evolution of control for Cellz and for simulated helicopter flight.

5.1 Racing single cars on single tracks

This section, based on a paper presented at CEC 2005 (where it won the Best Student Paper Award), details our first experiments with evolving control for simulated car racing [142]. The goal was to investigate whether good car racing control could be evolved at all, and if so, how the controller and its inputs should be represented in order to be evolvable. The experiments in this section take place in the track-based racing game, but using the first version of the car racing simulator, which is somewhat simpler than the one described in section 4.1. As can be seen from the figures in this section, the graphical representation is simpler, and the dynamics and the collision detection and handling are also a bit simpler. In the case of a single car on a single track, however, this should not make any qualitative difference for the results. Another difference is that trials last for 500 time steps instead of 700, and that being able to complete one lap in that time yields a fitness of 5 (the number of way points) rather than 1. The results here are thus not quantitatively comparable to those in later sections. All the experiments in this chapter used a 50+50 ES as described in section 2.1. Five different

controller architectures were tested: action sequences, open-loop neural networks, force fields, neural networks with third-person ("Newtonian") inputs, and neural networks with first-person inputs. Further, each controller was tested both with and without small random perturbations to the starting position and starting orientation.

5.1.1 No inputs and action sequences

Methods

An action sequence is a one-dimensional array of length 500, containing actions, represented as integers in the range 0-8. An action is a combination of a driving command (forward, backward, or neutral) and a steering command (left, right or centre). When evaluating an action sequence controller, the car simulation at each time step executes the action specified at the corresponding index in the action sequence. E.g., at time step 0 the action specified at position 0 is taken, at time step 73 the action specified at position 73, and so on. At the beginning of each evolutionary run, controllers are initialized as sequences of zeroes. The mutation operator then works by selecting a random number of positions between 0 and 100, and changing the values at that many positions in the action sequence to new randomly selected actions. As stated in the beginning of this section, the fitness function was the same in this experiment as in all other experiments in this section: the total progress, measured as a continuous approximation of the number of way points passed. Similarly, the EA used in all experiments in this section is a 50+50 ES as described in section 2.1.
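The following sketch shows one way the action sequence genotype and its mutation operator could look; it is an illustrative reading of the description above, not the code used in the experiments.

```java
import java.util.Random;

// Sketch of the action sequence genotype and its mutation operator: pick a
// random number of positions (up to 100) and overwrite each with a random
// action in the range 0-8.
public class ActionSequenceSketch {
    static final int LENGTH = 500;       // one action per time step
    static final int NUM_ACTIONS = 9;    // 3 driving x 3 steering commands
    static final Random rng = new Random();

    static int[] newGenome() {
        return new int[LENGTH];          // initialised as a sequence of zeroes
    }

    static void mutate(int[] genome) {
        int changes = rng.nextInt(101);  // between 0 and 100 positions
        for (int i = 0; i < changes; i++) {
            genome[rng.nextInt(LENGTH)] = rng.nextInt(NUM_ACTIONS);
        }
    }
}
```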

Results

After evolving the action sequence controllers for 100 generations, most evolutionary runs reached a fitness of about 2; after 500 generations, they often reached about 5. The resulting behaviour indeed looks more like more or less random actions that just happen to take the car in the right direction than it looks like good driving. The car drives very slowly, and many evolved controllers spend considerable amounts of time standing virtually still before finally starting to move. We hypothesize that a major factor restraining fitness growth is the ubiquity of local optima in the early parts of the sequence. This comes about because each action is associated with a time step rather than a position on the track, so that a mutation early in the sequence that is in itself beneficial (e.g. accelerating the car at the start of a straight track section) will offset the actions later in the sequence in such a way that it probably lowers the fitness as a whole, and is thus selected against. Under the randomized starting point regime, fitness is often below two and does not rise much further. Analysis of evolved controllers shows that the car often gets stuck on walls. A

plot of fitness evolution for both the fixed and random starting points is shown in Figure 5.1.

Figure 5.1: Evolving action sequences. The upper graph represents the fitness of the best individual in each generation, averaged over 10 evolutionary runs, under the fixed starting point regime. The lower graph represents the same entity when the car was evolved with randomized starting positions.

5.1.2 Open-loop neural network

Methods

A standard MLP as described in section 2.2.1, with two inputs and five hidden nodes, is fed with the number of the current time step divided by 500, yielding an input value of 0 in the first time step and 1 in the last, and a constant input with the value 1.

Results

After 100 generations of evolution, the controller typically reached fitness levels of about 2 to 3. The car behaviour looks no less random than that of the action sequence controllers; the main difference is that the car goes faster in this case. Most evolutionary runs found a way for the car to accelerate into the walls at the right angle and speed to bounce its way around a little more than half of the track, but none got further. An analysis of the actions produced by the controller reveals that the controller issues the same action for several hundred time steps in a row, and only changes action once or twice per trial. At the moment we don't know why higher-fitness controllers fail to evolve. When evolving with randomized starting points it seems to be impossible to find a behaviour sequence that relies on bouncing off walls in the right way, and so evolved controllers tend to just run around in circles and reach very low fitness levels, barely above zero. The fitness evolution of this type of controller is shown in Figure 5.2.

Figure 5.2: Evolving open loop neural network controllers. The upper line is for a fixed starting position, the lower line for randomised starts.

5.1.3 Newtonian inputs and force fields

Methods

A force field controller is here defined as a two-dimensional array of two-tuples, describing the preferred speed and preferred orientation of the car while it is in the field. Each field covers an area of n ∗ n pixels, and as the fields completely tile the track without overlapping, the number of fields is (l/n) ∗ (w/n), where l and w are the length and width of the track, respectively. At each time-step, the controller finds out which field the centre of the car is inside, and compares the preferred speed and orientation of that field with the car's actual speed and orientation. If the actual speed is less than the preferred, the controller issues an action containing a forward command, otherwise it issues a backward command; if the actual orientation of the car is left of the preferred orientation, the issued action contains a steer right command, otherwise it steers left. In the results reported here, we used fields of size 20 × 20 pixels, evolved with Gaussian mutation with magnitude 0.1, though we have tried other combinations of field size and mutation magnitude without any improvement in fitness. This is broadly similar to the kind of controllers evolved in [64], though we are controlling a car rather than a holonomic robot.
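A minimal sketch of the force field controller's action selection follows; the representation of actions as separate driving and steering commands and the sign convention for "left of the preferred orientation" are assumptions made for illustration.

```java
// Sketch of the force field controller: look up the field under the car's
// centre and compare its preferred speed and orientation with the actual values.
public class ForceFieldSketch {
    static final int FIELD_SIZE = 20;          // pixels per field
    double[][][] fields;                       // [row][col][0]=preferred speed, [1]=preferred orientation

    // returns {drive, steer}: drive 1=forward, -1=backward; steer 1=right, -1=left
    int[] act(double x, double y, double speed, double orientation) {
        double[] field = fields[(int) (y / FIELD_SIZE)][(int) (x / FIELD_SIZE)];
        int drive = speed < field[0] ? 1 : -1;                      // forward if too slow, else backward
        int steer = angleDiff(orientation, field[1]) < 0 ? 1 : -1;  // steer right if left of preferred
        return new int[] { drive, steer };
    }

    static double angleDiff(double a, double b) {
        double d = a - b;
        while (d > Math.PI) d -= 2 * Math.PI;
        while (d < -Math.PI) d += 2 * Math.PI;
        return d;
    }
}
```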

Results

The force field controllers evolved very slowly, and after 100 generations barely exceeded fitness 1; evolving for 1000 generations sometimes brought fitness up to around 4 when using fixed starting positions; when starting positions were randomised, fitness stayed at 1. The cars moved around in a peculiar fashion, sometimes following a sane path around the track for a while, only to become stuck oscillating between two force fields a moment later. Figure 5.3 shows the fitness

evolution of this type of controller, and Figure 5.4 shows a sample trace.

Figure 5.3: Evolving force field controllers.

Figure 5.4: Movement trace of a car controlled by a force field controller.

5.1.4 Newtonian inputs and neural networks

Methods

The neural network is fed seven inputs: a constant input with value 1, the x and y components of the car’s position, the x and y components of its velocity, its speed and its orientation. All inputs are scaled to be in the range -10 to 10. Seven hidden neurons are used in the network, and the two outputs are interpreted as described above.

Results

Evolving for 100 generations, best fitness varies considerably between evolutionary runs. While most runs produced controllers with fitness values around 3, at least one run produced a controller with fitness over 6. None of the controllers manage to drive the car properly around the track, however; the fittest controller instead drove the car into the "box" in the centre of the track and exploited a glitch in the fitness function, whereby it can come close enough to the aim points on the left side of the track for the fitness function to increase without the car ever going around the left wall of the box. The cars drive fast and seem to make sensible turns, but they all eventually get stuck on a wall. Randomising the starting position produces controllers of slightly lower fitness. Figure 5.5 shows the fitness evolution of this type of controller, while Figure 5.6 shows a sample trace of the car.

Figure 5.5: Evolving Newtonian neural network controllers.

5.1.5 Simulated sensor inputs and neural networks

Methods

In this experimental setup, the six inputs to the neural network consist of one constant input with the value 1, the speed of the car, and the outputs of three wall sensors and one aim point sensor. The aim point sensor simply outputs the difference between the car's orientation and the angle from the centre of the car to the next aim point, yielding a negative value if that point is to the left of the car's orientation and a positive value otherwise. Each of the three wall sensors is allowed any forward-facing angle (i.e. a range of 180 degrees), and a reach between 0 and 100 pixels. These parameters are co-evolved with the neural network of the controller in the following manner: a genome consists both of the specification of the neural

Figure 5.6: Movement trace of a car controlled by a neural controller with Newtonian inputs.

network, and of the parameters of the sensors (3 ∗ 2 = 6 scalar values). All sensor parameters are between 0 and 1, and are multiplied by π and 100 to give the angle and reach, respectively. Mutation works in the same way on both neural networks and sensor parameters, adding a value drawn from a Gaussian distribution with mean 0 and standard deviation 0.1 to each value among both network weights and sensor parameters. The sensor works by checking whether a straight line extending from the centre of the car at the angle specified by that sensor intersects a wall, tested at eleven points positioned evenly along the reach of the sensor, and returning a value equal to 1 divided by the position along the line of the first point that intersects a wall. Thus, a sensor with shorter reach has higher resolution, and evolution has an incentive to optimize both reaches and angles of sensors. This type of sensor controller is related to the wrap-around vector histogram approach of [13], except that we are only using three sensors instead of full wrap-around.
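A minimal sketch of such a range-finder sensor reading is given below. It assumes a helper isWall(x, y) for the wall test, and reads "position along the line" as the index of the first sample point; both are assumptions for illustration rather than the thesis code.

```java
// Sketch of the evolvable range-finder wall sensor: sample eleven points evenly
// along a ray from the car's centre and return 1 / (index of the first point
// inside a wall), or 0 if none hit.
public class RangeSensorSketch {
    double angle;   // evolved, relative to the car's orientation (scaled to 0..pi)
    double reach;   // evolved, in pixels (scaled to 0..100)

    double read(double carX, double carY, double carOrientation) {
        double dir = carOrientation + angle;
        for (int i = 1; i <= 11; i++) {
            double d = reach * i / 11.0;                 // evenly spaced along the reach
            double px = carX + Math.cos(dir) * d;
            double py = carY + Math.sin(dir) * d;
            if (isWall(px, py)) return 1.0 / i;          // closer walls give larger readings
        }
        return 0.0;                                      // no wall within reach
    }

    boolean isWall(double x, double y) {
        return false;   // placeholder: in the simulator this is a pixel-colour lookup
    }
}
```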

Results

After 100 generations, evolution produced a controller with excellent fitness values, equalling more than three laps around the track in the allotted 500 time steps. The cars drive around the track at close to full speed, cutting corners incredibly close, crashing into walls only where they can take advantage of the rebound. The sensors vary considerably in the combination of angles and ranges, though they often show a bias towards straight ahead and left. An example evolved sensor configuration is shown in Figure 5.7, which uses the short left sensor to help follow the inside wall or cut corners closely, and the longer-range sensors to help decide when to turn. The best final fitness value found when examining ten evolved controllers was 16.04, which

Figure 5.7: A sample evolved configuration of sensors.

narrowly beats the best human competitor so far. When evolving with randomised starting positions best fitness was slightly lower, with a similar sensor setup. The lower fitness is due to the cars slowing down in corners. Figure 5.8 shows the fitness evolution of this type of controller, while Figure 5.9 shows a sample trace of the car’s movement.

Figure 5.8: Evolving sensor-based neural controllers with full inputs.

To investigate the relative contributions of the wall and aim point sensors, we "lesioned" the controller by disabling the sensor types one at a time. First, we disabled the wall sensors, and evolved controllers making use only of the aim point sensor, speed and the constant input. Under the fixed starting point regime this resulted in cars with fitness often between 11 and 12; they drove well, but bumped into the wall protruding from the top of the track once every lap. When randomizing starting points, the aim sensor-only controller fared much worse, reaching a fitness of only about 7. Evolution produced a controller that sometimes made its way around the track, but, depending on initial conditions, more often got stuck on a wall. Figure 5.10

Figure 5.9: Movement trace of a car controlled by a sensor controller.

shows the fitness evolution under this restriction.

Figure 5.10: Evolving sensor-based neural controllers without wall sensors.

We then re-enabled the wall sensors and instead disabled the aim point sensor. Under the fixed starting point regime, evolution produced controllers that often had all wall sensors set to long range and pointing approximately 20 degrees left of straight ahead, and that drove at high speed, more or less following the outer wall. When randomizing starting points, the evolved controllers often have a long-range sensor pointing straight forward and medium-range sensors pointing approximately 45 and 90 degrees to the left, and execute careful following of the outer wall without ever bumping into it.

Figure 5.11: Evolving sensor-based neural controllers without aim point sensors.

Controller              Fixed    Randomized
Action sequence          2.23      1.36
Open loop neural         2.72      0.17
Force field              1.16      0.16
Newtonian neural         2.94      1.84
Sensor-based            13.59     12.4
No wall sensors         11.76      7.02
No aim point sensor     10.97     11.33

Table 5.1: Average fitness of best individuals of 10 evolutionary runs of the various controller architectures under the starting point regimes.

5.1.6 Conclusions

The most consistent effect across all the experiments reported above is that controllers (except the sensor controller with the aim point sensor disabled) have higher fitness when starting position and orientation are kept fixed. Not surprisingly, evolution is able to optimize car behaviour better in these noise-free cases, and has to develop more robust behaviour (which means longer lap times) or risk getting stuck on a wall when starting position is randomized. When racing actual physical cars, starting position will necessarily vary, and so performance under this regime is the more interesting factor when evaluating the suitability of a controller for transfer to a physical domain.

Our experiments also point to the vast superiority of first-person over third-person information for the problem at hand. In particular, the simulated range-finder sensors have turned out to be a very powerful device. This might be because of the existence of walls, which presumably makes any mapping from third-person spatial information to appropriate first-person actions extremely nonlinear. A controller using third-person information (such as visual data from an overhead web camera) could get around this problem by somehow representing the walls, and re-creating the kind of sensors used in our sensor-based simulations described above. The problems with using third-person information might also partly be due to the difficulty of rotating coordinates as the car's orientation changes.

We were somewhat surprised by the poor performances of the action sequence and force field controllers, both of which should theoretically be able to represent good solutions, at least for fixed starting points. We therefore hypothesize that the poor performance is because of problems with the evolutionary algorithm rather than the representations per se; our main culprit here is the mutation methods, which seem to drive the action sequences into local optima and make for very slow progress in force field evolution. For a fixed starting position, it should be possible to achieve good lap times by seeding the EA with an action sequence observed by running an evolved sensor controller, but we have not yet tried this. Regarding force field controllers, an alternative hypothesis is that it is very hard (or even impossible) to successfully drive a non-holonomic vehicle around a track using that method, as it does not take into account the state of the car when entering a particular cell (this is not a problem for holonomic vehicles, which can be treated as stateless).

5.2 Scaling up to multiple tracks

After finding out that very good car racing control could indeed be evolved, and that neural networks combined with simulated range-finder, way point and speed sensors were the way to

go, the next step was to investigate how well this approach scaled up. In this section, which is based on a paper presented at CEC 2006, we first devise a set of racing tracks of differing difficulty, evolve controllers for each track in turn, investigate different approaches to evolving controllers that can race more than one track, and finally optimize such generally proficient controllers for driving particular tracks. The concrete questions we pose and try to answer are the following: How robust is the evolutionary algorithm, that is, how certain can we be that a given evolutionary run will produce a proficient controller for a given track? Does the layout of the racing track directly influence the fitness landscape, so that some tracks are much harder than others to evolve controllers for, while not being impossible to drive? What is the transferability of knowledge gained in evolving for one track in terms of performance on other tracks? Can we evolve controllers that can proficiently race all tracks in our training set? How? Can such generally proficient controllers be used to reliably create specialized controllers that perform well, but only on particular tracks? Finally, can this be done even for tracks for which it is not possible to evolve a good controller from scratch?

These experiments were performed in the track-based racing game using the full car racing simulator described in section 4.1, with one crucial difference: at the time of doing the experiments there was a bug in the way point sensor, so that it was impossible to tell the difference between way points behind and in front of the car. This bug was discovered well after the results were written up and published, but in hindsight it can actually be viewed as a feature. Its existence in no way invalidates the results, as it proves that the methods work even with a slightly incapacitated sensor. (The main way it affects the controller is that it can lose direction and start driving backwards on the track under some circumstances.) In section 5.3.1 we repeat some of the experiments in this section with a fixed way point sensor, and the results there are qualitatively very similar though quantitatively somewhat better. Like in the preceding section, a 50 + 50 ES was used as the evolutionary algorithm for the experiments in this section.

5.2.1 Methods

For the experiments we have designed eight different tracks, presented in figure 5.12. The tracks are designed to vary in difficulty, from easy to hard. Three of the tracks are versions of three other tracks with all the waypoints in reverse order, and the directions of the starting positions reversed. The neural networks controlling the cars have nine inputs: one bias input with the value

1, one speed input, one input from the waypoint sensor, and six inputs from wall sensors. All networks have two outputs, which are interpreted as driving commands for the car.

Figure 5.12: The eight tracks. Notice how tracks 1 and 2 (at the top), 3 and 4, 5 and 6 differ in the clockwise/anti-clockwise layout of waypoints and associated starting points. Tracks 7 and 8 have no relation to each other apart from both being difficult.

Figure 5.13: The initial sensor setup, which is kept throughout the evolutionary run for those runs where sensor parameters are not evolvable. Here, the car is seen in close-up moving upward-leftward. At this particular position, the front-right sensor returns a positive number very close to 0, as it detects a wall near the limit of its range; the front-left sensor returns a number close to 0.5, and the back sensor a slightly larger number. The front, left and right sensors do not detect any walls at all and thus return 0.

The wall sensors are slightly updated versions of the sensors used in our previous experiment. Each sensor has an angle (relative to the orientation of the car) and a range between 0 and 200 pixels. The output of the wall sensor is zero if no wall is encountered along a line with the specified angle and range from the centre of the car; otherwise it is a fraction of one, depending on how close to the car the sensed wall is. A small amount of noise is applied to all sensor readings, as it is to starting positions and orientations. In some of the experiments the sensor parameters are mutated by the evolutionary algorithm, but in all experiments they start from the following setup: one sensor points straight forward (0 radians) in the direction of the car and has a range of 200 pixels, as have the three sensors pointing forward-left, forward-right and backward respectively. The two other sensors, which point left and right, have a range of 100 pixels; this is illustrated in figure 5.13.

5.2.2 Evolving track-specific controllers

The first experiments consisted in evolving controllers for the eight tracks separately, in order to test the software in general and to rank the difficulty of the tracks. For each of the tracks, the evolutionary algorithm was run 10 times, each time starting from a population of "clean" controllers, with all connection weights set to zero and sensor parameters as explained above. Only weight mutation was allowed. The evolutionary runs lasted 200 generations each.

Fixed sensor parameters: Evolving from scratch

The results are listed in table 5.2, which is read as follows: each row represents the results for one particular track. The first column gives the mean of the fitnesses of the best controller

Track    10           50           100          200          Pr.
1        0.32 (0.07)  0.54 (0.2)   0.7 (0.38)   0.81 (0.5)   2
2        0.38 (0.24)  0.49 (0.38)  0.56 (0.36)  0.71 (0.5)   2
3        0.32 (0.09)  0.97 (0.5)   1.47 (0.63)  1.98 (0.66)  7
4        0.53 (0.17)  1.3 (0.48)   1.5 (0.54)   2.33 (0.59)  9
5        0.45 (0.08)  0.95 (0.6)   0.95 (0.58)  1.65 (0.45)  8
6        0.4 (0.08)   0.68 (0.27)  1.02 (0.74)  1.29 (0.76)  5
7        0.3 (0.07)   0.35 (0.05)  0.39 (0.09)  0.46 (0.13)  0
8        0.16 (0.02)  0.19 (0.03)  0.2 (0.01)   0.2 (0.01)   0

Table 5.2: The fitness of the best controller of various generations on the different tracks, and the number of runs producing proficient controllers. Fitness is averaged over 10 separate evolutionary runs; standard deviations in parentheses.

of each of the evolutionary runs at generation 10, and the standard deviation of the fitnesses of the same controllers. The next three columns present the results of the same calculations at generations 50, 100 and 200, respectively. The "Pr" column gives the number of proficient best controllers for each track. An evolutionary run is deemed to have produced a proficient controller if its best controller at generation 200 has a fitness (averaged, as always, over three trials) of at least 1.5, meaning that it completes at least one and a half laps within the allowed time. For the first two tracks, proficient controllers were produced by the evolutionary process within 200 generations, but only in two out of ten runs. This means that while it is possible to evolve neural networks that can be relied on to race around one of these tracks without getting stuck or taking an excessively long time, the evolutionary process in itself is not reliable. In fact, most of the evolutionary runs are false starts. For tracks 3, 4, 5 and 6, the situation is different, as at least half of all evolutionary runs produce proficient controllers. The best evolved controllers for these tracks get around the track fairly fast without colliding with walls. For tracks 7 and 8, however, we have not been able to evolve proficient controllers from scratch at all. The "best" (least bad) controllers evolved for track 7 might get halfway around the track before getting stuck on a wall, or losing orientation and starting to move back along the track.

Fixed sensor parameters: Generality of evolved controllers

Next, we examined the generality of these controllers by testing the performance of the best controller for each track on each of the eight tracks. The results are presented in table 5.3, and clearly show that the generality is very low. No controller performed very well on any track it had not been evolved on, with the interesting exception of the controller evolved for track 1, which actually performed better on track 3 than on the track for which it had been evolved, and on which it had a rather mediocre performance. It should be noted that both track 1 and track 3 (like all odd-numbered tracks) run counter-clockwise, and there indeed seems to be a slight

Evo/Test  1            2            3            4
1         1.02 (0.14)  0.87 (0.1)   1.45 (0.18)  0.52 (0)
2         0.28 (0.06)  1.13 (0.35)  0.18 (0.1)   0.75 (0.26)
3         0.58 (0.16)  0.6 (0.22)   2.1 (0.48)   1.45 (0.66)
4         0.15 (0.01)  0.32 (0.02)  0.06 (0.05)  1.77 (0.52)
5         0.07 (0.02)  -0.02 (0)    0.05 (0)     0.2 (0.11)
6         1.33 (0.18)  0.43 (0.07)  0.4 (0.2)    0.67 (0.22)
7         0.45 (0.11)  0 (0.07)     0.6 (0.18)   0.03 (0.04)
8         0.16 (0.03)  0.28 (0.04)  0.09 (0.07)  0.29 (0.18)

Evo/Test  5            6            7            8
1         1.26 (0.17)  0.03 (0)     0.2 (0.18)   0.13 (0)
2         0.5 (0.13)   0.66 (0.19)  0.18 (0.15)  0.14 (0.02)
3         0.62 (0.13)  0.04 (0.1)   0.03 (0.09)  0.14 (0.02)
4         0.22 (0.1)   0.13 (0.13)  0.07 (0.09)  0.13 (0.02)
5         2.37 (0.28)  0.1 (0.04)   0.03 (0.05)  0.13 (0.01)
6         1.39 (0.42)  2.34 (0.05)  0.13 (0.13)  0.14 (0.11)
7         0.36 (0.08)  0.07 (0.03)  0.22 (0.15)  0.08 (0)
8         0.21 (0.03)  0.08 (0.1)   0.1 (0.09)   0.13 (0)

Table 5.3: The fitness of each controller on each track. Each row represents the performance of the best controller of one evolutionary run with fixed sensors, evolved on the track with the same number as the row. Each column represents the performance of the controllers on the track with the same number as the column. Each cell contains the mean fitness of 50 trials of the controller given by the row on the track given by the column. Cells with bold text indicate the track on which a certain controller performed best.

bias for the other controllers to get higher fitness on tracks running in the same direction as the track for which they were evolved. We have not analysed this further.

Evolved sensor parameters: Evolving from scratch

Evolving controllers from scratch with sensor parameter mutations turned on resulted in somewhat lower average fitnesses and numbers of proficient controllers, as can be seen in table 5.4. The controllers that reached proficiency seemed to be roughly as fit as those evolved with fixed sensors, but more evolutionary runs got stuck in some local optimum and never produced proficient controllers when sensor parameters were evolvable. It is not known whether this is simply because of the increase in search space dimensionality caused by the addition of sensor parameters, or if the sensor parameters complicate the evolutionary process in some other way.

Evolved sensor parameters: Generality of evolved controllers

Table 5.5 details the performance, on all the tracks in the test set, of one controller evolved with sensor mutation turned on for each track. Controllers evolved with evolvable sensor parameters turn out to generalize really badly, almost as badly as the controllers evolved with fixed sensors. However, there are some interesting differences, and the controllers evolved for tracks 1, 2 and

6 (but not the others) actually perform better on tracks for which they were not evolved. It

Track    10           50           100          200          Pr.
1        0.3 (0.05)   0.58 (0.17)  0.65 (0.18)  0.89 (0.4)   1
2        0.32 (0.09)  0.72 (0.4)   0.81 (0.49)  0.91 (0.6)   3
3        0.53 (0.22)  1.39 (0.51)  2.77 (0.66)  1.99 (0.7)   7
4        1.37 (0.89)  2.25 (0.34)  2.42 (0.37)  2.41 (0.36)  10
5        0.4 (0.07)   0.64 (0.35)  0.95 (0.55)  1.31 (0.66)  4
6        0.48 (0.12)  0.7 (0.29)   0.83 (0.39)  0.99 (0.65)  2
7        0.33 (0.11)  0.43 (0.08)  0.44 (0.08)  0.5 (0.15)   0
8        0.16 (0.02)  0.21 (0)     0.21 (0)     0.21 (0)     0

Table 5.4: Evolving controllers for individual tracks from scratch with sensor mutation turned on; format as in table 5.2.

Evo/Test  1            2            3            4
1         1.33 (0.29)  1.04 (0.34)  0.31 (0.04)  2.54 (0)
2         0.84 (0.19)  1.81 (0.14)  1.49 (0.55)  2.92 (0.16)
3         0.15 (0.01)  0.07 (0.04)  1.04 (0.22)  0.26 (0.04)
4         0.14 (0)     0.11 (0.03)  0.51 (0.14)  1.47 (0.49)
5         0.04 (0.11)  0.21 (0.05)  0.28 (0.05)  0.34 (0.07)
6         0.57 (0.03)  0.31 (0)     2.51 (0.02)  2.72 (0.26)
7         0.19 (0.05)  0.1 (0.05)   0.22 (0)     0.05 (0.05)
8         0.06 (0.04)  -0.09 (0.02) -0.04 (0.02) -0.02 (0.01)

Evo/Test  5            6            7            8
1         0.68 (0.11)  0.04 (0.02)  0.32 (0.16)  0.13 (0)
2         0.3 (0.03)   0.55 (0.15)  0.13 (0.14)  0.14 (0)
3         0.26 (0.06)  0.32 (0.07)  0.05 (0.08)  0.01 (0)
4         0.17 (0)     0.19 (0.11)  0.11 (0.07)  0.12 (0.02)
5         2 (0)        0.14 (0.05)  0.04 (0.1)   0.15 (0.02)
6         0.42 (0.03)  2.53 (0.22)  0.11 (0.09)  0.13 (0)
7         0.13 (0.09)  0.16 (0.06)  0.16 (0.11)  0 (0)
8         0.07 (0.05)  0 (0)        -0.01 (0.02) 0.21 (0)

Table 5.5: Fitness of individual controllers, evolved with sensor mutation turned on, on the different tracks; format as in table 5.3.

It is quite hard to see any kind of logic in which controllers will do well on which tracks, apart from those they were evolved for, and more data would definitely be needed to resolve this.

5.2.3 Evolving general and robust driving skills

The next suite of experiments was concerned with evolving robust controllers, i.e. controllers that can drive proficiently on a large set of tracks.

Simultaneous evolution

Our first attempt was to evolve controllers on all tracks at once. For this purpose, we ran several evolutionary runs in which each controller was tested on each of the first six tracks, with three trials per track, and the fitness was averaged over all these trials. We ran such runs, with both evolvable and fixed sensor parameters, for long periods of time, but saw very little progress: no controller reached an average fitness above 1.

Incremental evolution

Abandoning this method, we tried incremental evolution. The idea here was to evolve a controller on one track, and when it reached proficiency (mean fitness above 1.5) add another track to the training set - so that controllers are now evaluated on both tracks and fitness averaged - and continue evolving. This procedure is then repeated, with a new track added to the fitness function each time the best controller of the population has an average fitness of 1.5 or over, until we have a controller that races all of the first six tracks proficiently. The order of the tracks was 5, 6, 3, 4, 1 and finally 2, the rationale being that the balance between clockwise and counterclockwise should be as equal as possible in order to prevent lopsided controllers, and that easier tracks should be added to the mix before harder ones. This approach turned out to work much better than simultaneous evolution. Several runs were performed, and while some of them failed to produce generally proficient controllers, some others fared better. A successful run usually takes a long time, on the order of several hundred generations, but it seems that once a run has come up with a controller that is proficient on the

first three or four tracks, it almost always proceeds to produce a generally proficient controller. One of the successful runs is depicted in figure 5.14, and the mean fitness of the best controller of that run when tested on all eight tracks separately is shown in table 5.6. As can be seen from this table, the controller does a good job on the six tracks for which it was evolved, bar that it occasionally gets stuck on a wall in track 2. It never makes its way around track 7 or 8. The successful runs were all made with sensor mutation turned off. Some runs of incremental evolution were made with sensor mutation allowed; however, they failed to produce any proficient

Figure 5.14: A successful incremental run, producing a generally proficient controller. New tracks were added to the fitness function when the fitness of the best controller reached 1.5; this happened at generations 53, 240, 253, 394 and 536. Maximum fitness continued to increase for approximately 50 generations after that. The graph shows the fitness of the best controller (dark line) and the mean fitness of the population.

Track         1            2            3            4            5           6            7           8
Fitness (sd)  1.66 (0.08)  1.48 (0.25)  2.56 (0.2)   2.49 (0.15)  2 (0.25)    2.02 (0.42)  0.4 (0.21)  0.16 (0.07)

Table 5.6: Fitness of an incrementally evolved general controller with fixed sensor parameters on the different tracks. Compound fitness over all 8 tracks is 2.01 (0.11).

controllers. We speculate that this is because these runs suffer from "premature specialization" - after evolving a good controller for the first track, the sensor setup might not be suited for good driving on the second track, and changing the parameters would diminish fitness on the first track, thus creating a local optimum. That the first two tracks are, from the point of view of the car, mirror images of each other, adds plausibility to this hypothesis.
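The incremental procedure described above can be summarised roughly as in the following sketch. The evaluate_on_track and evolve_one_generation callables stand in for the actual simulator and the 50+50 evolution strategy, and the names and constants are assumptions for illustration only.

    TRACK_ORDER = [5, 6, 3, 4, 1, 2]   # alternating directions, easier tracks first
    PROFICIENCY = 1.5                  # fitness threshold for adding the next track

    def incremental_evolution(population, evaluate_on_track, evolve_one_generation,
                              max_generations=1000):
        active_tracks = [TRACK_ORDER[0]]
        for _ in range(max_generations):
            def fitness(controller):
                # Mean progress over all tracks currently in the evaluation set
                # (several trials per track in the real experiments).
                return sum(evaluate_on_track(controller, t)
                           for t in active_tracks) / len(active_tracks)
            population = evolve_one_generation(population, fitness)
            best_fitness = max(fitness(c) for c in population)
            # Add the next track once the best controller is proficient on the current set.
            if best_fitness >= PROFICIENCY and len(active_tracks) < len(TRACK_ORDER):
                active_tracks.append(TRACK_ORDER[len(active_tracks)])
        return population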

Further evolution

Evolving sensor parameters can be beneficial, however, when this is done for a controller that has already reached general proficiency. We used one of the generally proficient controllers evolved using the incremental method as the seed for a new evolutionary run, with sensor mutation turned on and controllers tested on all six tracks simultaneously. The result was an increase in mean fitness, as can be seen in table 5.7. Although the mean fitness does not increase on every single track, the best controller of the last generation races all the tracks more reliably, and is very rarely observed to crash into a wall in such a way that the car gets stuck. The evolved sensors of this controller showed little similarity to the original sensor setup described above - see figure 5.15 for an example.

Track         1            2            3            4            5            6            7            8
Fitness (sd)  1.66 (0.12)  1.86 (0.02)  2.27 (0.45)  2.66 (0.3)   2.19 (0.23)  2.47 (0.18)  0.22 (0.15)  0.15 (0.01)

Table 5.7: Fitness of a further evolved general controller with evolvable sensor parameters on the different tracks. Compound fitness 2.22 (0.09).

Figure 5.15: Sensor setup of the further evolved general controller analysed in table 5.7. Only three sensors seem to be long enough to be of any use, and all of those point to the right or front-right. The asymmetry and "waste" is somewhat surprising, as the controller performs well on all the first six tracks (but it does do slightly better on clockwise than on anti-clockwise tracks).

5.2.4 Evolving specialized controllers

In order to see whether we could create even better controllers, we used one of the further evolved controllers (with evolved sensor parameters) as the basis for specialized controllers. For each track, 10 evolutionary runs were made, where the initial population was seeded with the general controller and evolution was allowed to continue for 200 generations. Results are shown in table 5.8. The mean fitness improved significantly on all of the first six tracks, and much of the fitness increase occurred early in the evolutionary run, as can be seen from a comparison with table 5.7. Further, the variability in mean fitness of the specialized controllers from different evolutionary runs is very low, meaning that the reliability of the evolutionary process is very high. Perhaps most surprising, however, is that all 10 evolutionary runs produced proficient

Track  10           50           100          200          Pr.
1      1.9 (0.1)    1.99 (0.06)  2.02 (0.01)  2.04 (0.02)  10
2      2.06 (0.1)   2.12 (0.04)  2.14 (0)     2.15 (0.01)  10
3      3.25 (0.08)  3.4 (0.1)    3.45 (0.12)  3.57 (0.1)   10
4      3.35 (0.11)  3.58 (0.11)  3.61 (0.1)   3.67 (0.1)   10
5      2.66 (0.13)  2.84 (0.02)  2.88 (0.06)  2.88 (0.06)  10
6      2.64 (0)     2.71 (0.08)  2.72 (0.08)  2.82 (0.1)   10
7      1.53 (0.29)  1.84 (0.13)  1.88 (0.12)  1.9 (0.09)   10
8      0.59 (0.15)  0.73 (0.22)  0.85 (0.21)  0.93 (0.25)  0

Table 5.8: Fitness of best controllers, evolving controllers specialised for each track, starting from a further evolved general controller with evolved sensor parameters.

Figure 5.16: Sensor setup of controller specialized for track 5. While more or less retaining the two longest-range sensors from the further evolved general controller it is based on, it has added medium-range sensors in the front and back, and a very short-range sensor to the left.

Figure 5.17: Sensor setup of a controller specialized for, and able to consistently reach good fitness on, track 7. Presumably the use of all but one sensor and their angular spread reflects the large variety of different situations the car has to handle in order to navigate this more difficult track.

controllers for track 7, on which the general controller had not been trained (and indeed had very low fitness) and for which it had previously been found to be impossible to evolve a proficient controller from scratch. Analysis of the evolved sensor parameters of the specialized controllers shows a remarkable diversity, even among controllers specialized for the same track, as evident in figures 5.16, 5.17 and 5.18. Sometimes, no similarity can be found between the evolved configuration and either the original sensor parameters or those of the further evolved general controller the specialization was based on. There are many reasons why this could be the case. The most obvious reason is that there could well be many different ways of solving the same task. As a hypothetical example, in

Figure 5.18: Sensor setup of another controller specialized for track 7, like the one in figure 5.17 seemingly using all its sensors, but in a quite different way.

specialized driving on a particular track, the controller might depend on either an “inside” or an “outside” wall sensor as the main trigger for turning, and either a backward or a forward wall sensor for resolving ambiguities about which turn to take. Another contributing reason could be that not all sensors need to be used in decision making. A sensor could have a long reach while all connection weights from it are close to zero, so several seemingly different sensor configurations might really be rather similar if only “active” sensors are taken into account. When trying to understand the space of sensor configurations it is also important to remember that even a sensor with a short reach can be important. If the presence of a wall in a certain direction only needs to be responded to when it is very close, it makes more sense to have a short sensor than a long one: given the limited number of sampling points, the shorter sensor has higher resolution, and will respond with a larger difference in output signal for the same difference in wall distance.

5.2.5 Observations on evolved driving behaviour

In section 5.1, we found that artificial evolution can produce controllers that outperform human drivers. To corroborate this result, the author measured his own performance on the various tracks, driving the car using keyboard inputs and a suitable delay of 50 ms between timesteps. Averaged over 10 attempts, the author's fitness on track 2 was 1.89, it was 2.65 on track 5, and 1.83 on track 7, numbers which compare rather unfavourably with those found in table 5.8. The author would like to believe that this says more about the capabilities of the evolved controllers than those of the author.

Traces of steering and driving commands from the evolved controllers show that they often use a PWM-like technique, in that they frequently - sometimes almost every timestep - change what commands they issue. For example, the general controller used as the base for the specializations above employs the tactic of constantly alternating between steering left and right when driving parallel to a wall, giving the appearance that the car is shaking. Frequently alternating between neutral and forward drive is also used as a way of keeping a certain speed; an approach many engineers would use when designing a controller for a vehicle that can only be controlled with discrete inputs. Doing so is, however, practically impossible for a human driver, and analysis of action traces for human drivers shows far fewer changes to the commands given; as an example, the author changes drive command only four times per lap when racing on track 3.

5.2.6 Conclusions

We believe that the results presented above answer, at least in part, the several questions posed at the start of this section. Different tracks can indeed be constructed that have differing

difficulty levels, in terms of the probability that an evolutionary run starting from scratch will produce a proficient controller within a given number of generations, and the mean fitness of evolved controllers. Difficulty levels range from very easy, where the evolutionary algorithm almost always succeeds, to very hard, for which no successful controllers have been found, and agree with intuitive human difficulty ratings of the same tracks. These skills are, however, not transferable: a controller evolved from scratch to perform well on a given track usually performs very poorly on all other tracks. Evolving sensor parameters along with network weights makes for fewer proficient controllers (probably because of more local optima), a result which is not inconsistent with the good controllers that do emerge being slightly superior to fixed-sensor ones, as found in previous experiments. As for the question of whether we can automatically create controllers with driving skills so general that they can proficiently race all tracks in our training set, this can be done by using incremental evolution, going from simpler to more complex tracks, with sensor mutation turned off. Attempts to evolve general controllers with sensor mutation turned on failed, as did attempts to evolve controllers on all tracks simultaneously. Once a general controller has been created, its fitness can be increased through continued evolution with sensor mutation turned on. Specialized controllers can be created by further evolving a general controller, using only one track in the fitness function. These specialized controllers invariably have very high fitness. Much to our surprise, this was true even for one hard track on which the general controller had not been evolved, on which it had very low fitness, and for which we have not been able to evolve proficient controllers from scratch. Apparently, the general controller is somehow closer in search space to a proficient controller for that track, even though it has no proficiency on that track itself. Exactly how this works remains to be found out. The ease with which specialized controllers can be obtained from general controllers points to a possible application in racing games: if a user designs his own track, the game could easily evolve

NPC drivers with the required driving skill for that track, starting from a generally proficient controller.

5.3 Further experiments in optimizing car driving

In this section, we briefly discuss two series of experiments aimed at further investigating how best to learn good driving behaviour for the well-defined task of single-car racing. The first series of experiments compares neural networks with genetic programming in stateful and non-stateful versions. The second series compares evolution with temporal difference learning, and state-value control with action-value control and direct control.

5.3.1 Comparing controller architectures on the track task

This section is based on a paper presented at GECCO 2007, with Alexandros Agapitos as first author [3]. The two parallel goals of these experiments were to compare neuroevolution with genetic programming, and to compare representations that allow for stateful control with those that only allow reactive controllers to be evolved. As a bonus, we compared the effect of the number of rangefinder sensors on the performance of the neural network-based controllers. In this section, we will mainly discuss the results of our experiments; the reader is referred to the paper for the background on object-oriented genetic programming and all the details of the GP implementation, including most parameters. Some analysis of the evolved GP trees is also presented in the paper.

Methods

The experiments were done in the track-based racing game using the full car racing simulation (with the non-buggy way point sensor). As the order of the experiments follows that of section 5.2 quite closely, the same tracks were used and numbered in the same way. The usual 50+50 evolution strategy was used for most of the experimental runs, but as crossover is often observed to work better for GP than it does for neural nets, a genetic algorithm with population 100 was used for some of the GP experiments. The GP trees were initialised with depth 5, and were allowed to grow to a maximum depth of 8 if ADFs were used, and 10 if they were not. Uniform crossover combined with point mutation was used in the GA-based runs, and subtree macro-mutation in the ES-based runs. In all the GP experiments, a set of non-terminals was used that included standard arithmetic functions, conditional branching, ordinal predicates (less than, more than etc.), and calls to get the value of a specific wall sensor with range and angle defined by the values of the child nodes (a sketch of such a sensor-reading node is given after the list below). The terminals included a range of constants as well as the values of the speed and way point sensors. For the object-oriented GP experiments, two added non-terminals could either get or set the values in a register. Taken together, the sets of terminals and non-terminals should give GP considerably more expressive power than neuroevolution. For example, it would be possible to evolve a GP-based controller that moved its sensors, so that it had different configurations depending on its speed. In all, nine different configurations of controller architecture and evolutionary algorithm were compared:

• Functional GP without ADFs, no crossover

• Functional GP with ADFs, no crossover

Table 5.9: Average fitness of best controller for generations 10, 50, 100, 200 (averaged over 10 independent evolutionary runs - std. deviation in parentheses)
Method                             10           50           100          200
Functional/no ADFs/Macromutation   1.26 (0.65)  2.33 (0.4)   2.47 (0.4)   2.51 (0.15)
Functional/ADFs/Macromutation      1.54 (0.45)  2.54 (0.17)  2.62 (0.15)  2.67 (0.1)
Functional/no ADFs/Recombination   1.87 (0.52)  2.38 (0.16)  2.45 (0.17)  2.46 (0.17)
Functional/ADFs/Recombination      1.62 (0.74)  2.23 (0.63)  2.39 (0.47)  2.53 (0.17)
OO/Macromutation                   1.55 (0.61)  2.47 (0.7)   2.54 (0.18)  2.59 (0.18)
OO/Recombination                   1.10 (0.75)  2.39 (0.3)   2.47 (0.07)  2.55 (0.07)
MLP                                0.13 (0.17)  2.48 (0.67)  2.92 (0.09)  3.08 (0.07)
Recurrent                          0.19 (0.16)  1.06 (0.45)  2.43 (0.46)  2.92 (0.16)
MLP with less sensors              0.19 (0.22)  2.65 (0.08)  2.94 (0.08)  3.07 (0.02)

• Functional GP without ADFs, with crossover

• Functional GP with ADFs and crossover

• Object-oriented GP with ADFs, no crossover

• Object-oriented GP with ADFs and crossover

• Multilayer perceptrons, 12 hidden neurons, 12 wall sensors

• Elman-style recurrent networks, 12 hidden neurons, 12 wall sensors

• Multilayer perceptrons, 12 hidden neurons, 4 wall sensors
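As a concrete illustration of the sensor-reading non-terminal mentioned above, the sketch below shows how such a GP node might be evaluated. The node and simulation interfaces (including cast_rangefinder) are assumptions for illustration, not the actual implementation from the paper.

    class WallSensorNode:
        # Hypothetical GP non-terminal: its two children compute the angle and range
        # of a range-finder reading, taken from the current car state on demand.
        def __init__(self, angle_child, range_child):
            self.angle_child = angle_child
            self.range_child = range_child

        def evaluate(self, car_state):
            angle = self.angle_child.evaluate(car_state)        # relative to the car's heading
            max_range = abs(self.range_child.evaluate(car_state))
            # cast_rangefinder is an assumed simulation call: distance to the nearest
            # wall along the given direction, capped at max_range.
            return car_state.cast_rangefinder(angle, max_range)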

Evolving single-track controllers

The first set of experiments concerns the evolution of driving skills on a single track, namely track 5. In this set of experiments, each evolutionary run starts with freshly created controllers: neural networks with all connection weights set to 0, or GP controllers with small randomly generated trees. For each controller configuration, we ran 10 independent runs of 200 generations with population 100. The results can be seen in table 5.9. The first observation is that all configurations of both neural network and GP controller representations managed to evolve high-performing car drivers. But the performance of GP and NN controllers is not identical. Generally, it can be seen that the GP controllers evolve much faster than the neural network controllers, but that the neural network controllers ultimately reach higher fitnesses. No significant difference can be seen between functional and object-oriented GP, or between recurrent and feedforward neural nets.

Performance on Several Tracks

Next, we looked at the generalisation capability of the controllers evolved in the preceding section. This was done by trying each of these controllers on each of the eight tracks, not only the track for which they had been evolved.

Figure 5.19: (a) Best-of-generation individuals using GP with ADFs; (b) Best-of-generation individuals using OOGP and MM (generalisation to additional tracks is shown after generation 200); (c) Tracks driven proficiently using OOGP and MM; (d) Best-of-generation individuals using MLP; (e) Generalisation to additional tracks after generation 200 using MLP; (f) Tracks driven proficiently using MLP.

In general, the evolved neural networks perform significantly better than the GP controllers, even on tracks for which they have not been evolved. As in the previous section, not much of a difference was found between stateful and stateless controllers. Finally, similarly to the experiments in section 5.2, all controllers scored lower on other tracks than on track 5, for which they had been evolved, but scored slightly better on tracks which, like track 5, run counter-clockwise than on those that run clockwise.

Evolving Generalisation

The final set of experiments concerns the incremental evolution of general controllers. In these experiments, each evolutionary run was seeded with the results of the 200 generations of evolution on a single track described above. Then, evolution proceeded for 50 generations, with the crucial difference that the fitness function was made incremental: each controller evaluation was done as the average of the progress that controller displayed on a set of tracks. At the first generation of each evolutionary run, only track 5 was used for fitness evaluations, but every time a controller reached fitness 1.5 a track was added to the evaluation set (and kept for the duration of the evolutionary run). The sequence in which tracks were added was 5, 6, 3, 4, 1, 2,

7, 8. (Note that every second track added runs clockwise instead of counterclockwise, to increase

Method                          Avg. no of tracks
Functional/ADFs/Macromutation   6.0 (0.8)
OO/Macromutation                5.8 (0.91)
OO/Recombination                4.6 (1.07)
MLP                             8.0 (0.0)
Recurrent                       8.0 (0.0)
MLP with less sensors           8.0 (0.0)

Table 5.10: Average number of tracks proficiently driven (averaged over 10 independent evolutionary runs - std. deviation in parentheses)

the diversity between tracks added in sequence and thus avoid overfitting to a particular driving direction.) A controller that was able to generalise completely and drive proficiently on all tracks would, at the end of these 50 extra generations, have all 8 tracks in its evaluation set, whereas a very poor generaliser would be stuck with only track 5 in the set. As a consequence, the graphs depicted in figure 5.19 for these evolutionary runs include not only the fitness of the best controller in the population but also the incrementation level (number of tracks in the evaluation set) of the population. As we can see from table 5.10, both NNs and GP are able to incrementally generalise previously evolved controllers and achieve proficiency on a majority of the eight tracks. However, there is a marked difference, in that the neural network-based controllers on average generalised significantly better, and end up being able to drive proficiently on seven of the eight tracks. On the other hand, there seems to be no significant performance difference between stateful and stateless controllers, i.e. between MLPs and functional GP on the one hand and recurrent networks and OOGP on the other hand.

Conclusions

The main finding of our experiments is that, for the given problem and experimental setup, the various versions of GP evolve faster than NNs, but the neural networks ultimately perform better, especially on the more complicated version of the task, which takes all eight tracks into consideration. An early suspicion was that, as the GP trees evidently make use of far fewer wall sensor readings than the neural network controllers, this contributed to the fast learning but poor generalisation of the GP controllers. The argument was that it is easier to learn to drive on a simple track when only caring about the speed and the angle to the next waypoint, but that this strategy breaks down when exposed to the more complicated tracks, where the straight line between two waypoints might pass through a wall. As neural networks are in effect forced to consider all their sensor readings, this makes it harder to find an initial control strategy, but once found, such a control strategy will be much more robust. Much to our dismay, our hypothesis

was falsified by the inclusion of the "minimal" neural network controller that only uses four wall sensors, yet performs almost as well as those controllers that use 12 wall sensors. Another hypothesis is that we are being unfair to GP when we ask it to compete with neural networks on the neural networks' home arena. GP experiments typically use much larger population sizes and different selection regimes. Given such changes, the GP controllers might very well outperform our neural network controllers. An additional puzzling phenomenon is the virtual lack of difference between the performance of stateless and stateful controller representations. Our best bet as to why this is so is that we need even more complex versions of the car racing task, such as competitive multi-car racing, in order to exploit the statefulness of the controllers. Maybe we also need to introduce more complex primitive objects, e.g. complete forward models of the car dynamics, as objects which could be accessed and manipulated by the OOGP system.

5.3.2 Comparing learning methods for point-to-point racing

In this section, which is based on a paper presented at CIG 2007 with Simon Lucas as first author, we compare evolution with temporal difference learning, and direct control with state-value and action-value based control [80]. The game used here is one of the simplest possible variations on car racing, as it is a stripped-down version of the point-to-point task, allowing only one car (and thus no collisions at all), only five actions instead of nine, and a rather simplified dynamics model. Fitness is calculated as the number of way points passed in 500 time steps. The details of the car model, and of the learning algorithms, are to be found in the paper. The paper also contains comparisons with a holonomic version of the car game, and an extremely simple one-dimensional version.

Manual control

Both of the authors of the paper attempted to solve the task themselves repeatedly in the course of experimentation. On a good day, the authors typically scored between 12 and 16.

Hand-coded control

Two hand-coded controllers were implemented: Greedy and Heuristic. The algorithm for each of these controllers is simple: consider the state of the system after each possible action, and select the action that leads to the best score. In this case, the scores are penalty values, so this means the action with the lowest score will be selected. The score function used for the greedy controller is simply the Euclidean distance between the car and the next waypoint. The heuristic controller improves on this by adding in a penalty

term proportional to the square of the car's speed. The square of the speed is used on the basis that stopping distance is proportional to it. The greedy algorithm performs poorly; its failure to consider the velocity of the car leads to a tendency to significantly overshoot each waypoint. This is because, given the current state of the system, the action that takes the car closest to the next waypoint usually involves accelerating toward the waypoint. The addition of the velocity penalty in the heuristic controller leads to much better performance, with a fitness of 18.8 averaged over 1000 evaluations.
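A minimal sketch of the two hand-coded controllers is given below. The forward_model callable and the state and waypoint attributes are assumed stand-ins for the actual simulator interface, and the weighting constant k is illustrative rather than the value used in the paper.

    import math

    ACTIONS = range(5)   # the five discrete actions of the simplified car model

    def greedy_penalty(state, waypoint):
        # Penalty is simply the Euclidean distance to the next waypoint.
        return math.hypot(state.x - waypoint.x, state.y - waypoint.y)

    def heuristic_penalty(state, waypoint, k=1.0):
        # Adds a term proportional to the square of the speed, on the basis that
        # stopping distance scales with it.
        return greedy_penalty(state, waypoint) + k * state.speed ** 2

    def select_action(state, waypoint, penalty, forward_model):
        # One-step lookahead: simulate every action in the (assumed) forward model
        # and pick the action whose successor state has the lowest penalty.
        return min(ACTIONS, key=lambda a: penalty(forward_model(state, a), waypoint))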

Action-value control

Two different representations of the action values were used: neural networks and tables. The table-based controller had 3³ × 5, or 135, cells. Selection was done on the following three dimensions: speed, angle to the next waypoint and distance to the next waypoint. Cut-off values for these dimensions were set to 0.08, 0.3 and 0.31 respectively, values which were found after an exhaustive search. Likewise, the MLP-based controller takes speed, angle and distance to the next waypoint, and the action to evaluate, as input, and outputs an estimate of the value of that action.
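The table-based representation can be pictured roughly as in the following sketch. The cut-off values are the ones quoted above, but exactly how each cut-off splits its dimension into three bins is an assumption here.

    CUTOFFS = (0.08, 0.3, 0.31)   # tuned cut-offs for speed, angle and distance

    def bin3(value, cutoff):
        # Illustrative three-way discretisation based on a single cut-off value.
        if value < cutoff:
            return 0
        if value < 2 * cutoff:
            return 1
        return 2

    def table_index(speed, angle, distance, action):
        # Three bins per dimension give 27 states; with 5 actions this yields the
        # 135-cell table mentioned above.
        state = (bin3(speed, CUTOFFS[0]) * 9
                 + bin3(angle, CUTOFFS[1]) * 3
                 + bin3(distance, CUTOFFS[2]))
        return state * 5 + action   # index into a flat list of 135 action values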

First, we tried the Sarsa variety of temporal difference learning. Learning with the table-based representation was extremely unreliable, with only some runs learning anything at all, and frequent cases of "unlearning", where reasonably good behaviour was observed for a number of epochs before plunging back into randomness. But at least in some cases td-learning produced controllers with performance significantly better than random. Still, the best-performing learned controller had a fitness of only 4.7. The training runs that succeeded did so within 1000 epochs. In contrast to the td-learned controllers for the one-dimensional case, which performed better when occasional random movement was turned off (ε set to 0) after training, the controllers learned for the car model needed a small non-zero ε in order to perform. TDL with the MLP-based representation was unable to learn any meaningful policy at all. We then tried evolving action values using the same representations, and a 15+15 evolution strategy. For both the connection weights of the neural networks and the values in the cells of the tables, we used Gaussian mutation with standard deviation 0.1.
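For reference, a bare-bones Sarsa update over such a flat action-value table looks roughly as follows; the learning rate, discount factor and ε below are illustrative values, not those used in the experiments.

    import random

    def epsilon_greedy(q, state, epsilon=0.05):
        # q is the flat list of action values; state indexes one of the 27 discrete states.
        indices = [state * 5 + a for a in range(5)]
        if random.random() < epsilon:
            return random.choice(indices)
        return max(indices, key=lambda i: q[i])

    def sarsa_update(q, idx, reward, next_idx, alpha=0.1, gamma=0.95):
        # Standard Sarsa: move Q(s, a) towards r + gamma * Q(s', a'), where a' is
        # the action actually selected (epsilon-greedily) in the next state.
        q[idx] += alpha * (reward + gamma * q[next_idx] - q[idx])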

Evolution reliably produced somewhat capable action value function based controllers for the car model. The best evolved table-based controller (from several runs that produced similar controllers) scored 14.4, and basically drives well, apart from the occasional case of “orbiting”. Very similar results were achieved using the MLP-based representation, with the best evolved controller scoring 15.7.

Figure 5.20: Evolving a state-value MLP for controlling a normal car. The error bars show the range of fitnesses (±σ) in each generation.

State-value control

For the state-value control experiments, the controller took and then retracted each of the five possible actions in an internal model identical to the actual dynamics model, and used a function approximator to evaluate the value of each action based on the state resulting from that action. The highest-valued action is then selected for actual execution. What is actually trained through td-learning or evolution is the function approximator, and here we compared perceptrons with multi-layer perceptrons. For input to the function approximator, we compared two different feature vectors, with two and three inputs respectively. The two-input version consisted of the Euclidean distance to the next waypoint, and the square of the car's velocity. The three-input version takes these two and also adds a directional feature: the absolute value of the sine of the angle between the car's heading and the direction to the next waypoint. In each case the value function then applies the feature vector as input to a neural network, whose single output is taken to be the value of that state. Using a 15+15 ES running for 100 generations, very good performance was reliably evolved despite the very noisy fitness evaluations due to the way points being randomized for every trial. Figure 5.20 shows one such evolutionary run with MLP approximators (1 hidden layer with 10 units). Temporal difference learning, on the other hand, was rather unreliable. When it did learn, it often achieved high performance within the first 10 epochs, and in these cases offered much faster learning than evolution. It reached similar performance when using the linear function representation, but not when using the neural network representation.
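Schematically, state-value control works as in the sketch below, where value_net stands for the evolved or TD-trained function approximator and forward_model for the internal copy of the car dynamics; these, together with the state helper methods, are assumed interfaces.

    import math

    def features(state, waypoint, use_heading=True):
        # Two- or three-element feature vector as described above.
        distance = math.hypot(state.x - waypoint.x, state.y - waypoint.y)
        f = [distance, state.speed ** 2]
        if use_heading:
            f.append(abs(math.sin(state.angle_to(waypoint))))   # assumed helper
        return f

    def state_value_control(state, waypoint, forward_model, value_net, actions=range(5)):
        # "Take and retract" each action: simulate it in the internal model, score the
        # resulting state with the value network, and execute the highest-valued action.
        return max(actions,
                   key=lambda a: value_net(features(forward_model(state, a), waypoint)))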

Direct control

The above approaches learn controllers that are based on some sort of explicit representation of the values of states or actions. These are, to our knowledge, the only types of controllers that

can be learnt using temporal difference learning. But evolutionary algorithms are also capable of solving reinforcement learning problems through direct search in policy space, creating controllers that don't necessarily represent state or action values directly. In the case of point-to-point car racing, such controllers would take (aspects of) the state of the car as inputs and output the desired action to take. We chose to represent the controllers as standard three-layer neural networks, and used the same evolution strategy as above. The neural network was fed with the velocity in the vertical and horizontal dimensions, the magnitude of the velocity vector, and the distance and relative angle to the next waypoint. Ten hidden neurons were used, and the two outputs were interpreted as movement commands, with one output controlling steering and the other longitudinal acceleration. (The output range sensitivity is a bit different in this experiment than in the preceding sections, as the steering output is interpreted as left if below −0.1, right if over 0.1, and otherwise centre. This is a mere accident, and to our best knowledge insignificant. A number of auxiliary experiments varying this and other parameters have shown that varying the output sensitivity makes no noticeable difference as long as the mutation magnitude is set high enough, e.g. 0.1.) Using this approach, good controllers reliably emerged from the evolutionary process. Several runs of 500 generations each produced controllers of similar fitness, the best of them scoring 20.1 on average. This controller consistently keeps a high speed and only rarely misses a waypoint, and then only "orbits" briefly.
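The way the two continuous network outputs are turned into discrete commands can be sketched as follows. The ±0.1 steering thresholds are the ones quoted above, while the mapping of the second output to longitudinal commands and the state helper methods are assumptions.

    def direct_control(mlp, state, waypoint):
        # Inputs as described above: horizontal and vertical velocity, speed, and the
        # distance and relative angle to the next waypoint.
        steer_out, drive_out = mlp([state.vx, state.vy, state.speed,
                                    state.distance_to(waypoint),
                                    state.angle_to(waypoint)])
        if steer_out < -0.1:
            steering = "left"
        elif steer_out > 0.1:
            steering = "right"
        else:
            steering = "centre"
        # How the second output maps to longitudinal commands is assumed here.
        driving = "forward" if drive_out > 0 else "neutral"
        return steering, driving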

Combining evolution and temporal difference learning

As we have seen, evolution and temporal difference learning can both be used to learn functional (state, action) to value mappings, although with varying speed, reliability and ultimate success. Another interesting question is whether the two methods learn different solutions to the problem. In particular, it might be suggested that while Sarsa learns approximately correct action values, evolution will learn values that work, regardless of whether they are correct or not. To investigate this, we tried applying our evolutionary algorithm to solutions that had been learned with Sarsa, and Sarsa to solutions that had been evolved. Due to time and space constraints (for us, not for the algorithms) this was only done for the action-value approach.

Seeding the Sarsa algorithm with evolved action-value based car controllers had an interesting effect, in that it had no effect at all: the further td-learning made no significant change, neither negative nor positive. Or, in other words, td-learning performs much better when initialised with an evolved controller. Seeding evolution with the results of Sarsa made no significant difference compared to seeding evolution with random controllers: good controllers evolved just as quickly as they do when evolution is initialised with empty tables or MLPs with neutral connection weights.

One way of interpreting this is that evolution actually does learn the correct action values; another is that evolution has a "lock-in effect", creating a local optimum td-learning can't break out of.

Conclusions

In table 5.11, we present the fitness of the best controller obtained through all of the different methods, tested on 1000 tracks.

Table 5.11: Best controllers found for the normal car, tested on 1000 tracks.
Method            Td-learning  Evolution
State-MLP         -            36.7
State-Perceptron  31.1         31.5
Action-Table      4.7          14.4
Action-MLP        0            15.7
Direct-MLP        -            20.1

Experiments were made on a variety of problem setups using different feature vectors and different neural networks (multi- and single-layer perceptrons). In each case, state-value learning worked much better than action-value learning. Evolution worked more reliably and achieved better final fitness than reinforcement learning, but reinforcement learning, when successful, learned faster. Further, for learning action value functions with TDL, table-based representations were always superior to MLP-based representations. When evolution was used without a model, the direct approach was superior to learning action values. Our suspicion that the methods learn differently, and not only better or worse, was supported by the "lock-in effect" observed when combining them. The task is clearly very sensitive to the learning method, the architecture, and the chosen input features. A strong distinction can be seen between methods which work very well, such as evolved state-based MLPs, and those that work very badly, such as TDL-trained action-based MLPs. While it is likely that with more work it would be possible to improve the TDL results, a great strength of the evolutionary methods is the ease with which they can be applied. The state-value methods can only be applied when a forward model exists, which is able to predict the next state of the car given the current state and the selected action. This is simple for the simulation, where we had access to the exact forward model. For controlling a real car we would have to infer a forward model, something we do in section 6.2. However, these models are only ever approximate, and we don't know if the state-value approach would retain its advantage when used together with such a model. It is also an open question whether these results hold up if a similar comparison is performed on a more complicated version of the car racing problem, involving two cars and possibly walls.

5.4 Optimization in other game domains

While most of the experiments in this thesis deal with car racing in one form or another, we have also used evolutionary optimization to learn to play several other games. In this section we present experiments with evolving neural network-based controllers for a helicopter simulation game and the Cellz game.

5.4.1 Controlling a simulated helicopter

This section is based on a paper presented at CEC 2006, with Renzo De Nardi as first author [36]. The research was performed as part of the UltraSwarms project, of which Renzo is a part, and which aims to produce swarms of miniature helicopters that perform collaborative computing over wireless networks while flying. As a first step in this project, an individual helicopter must be brought to fly autonomously, using a controller that can in turn receive higher-level commands from a swarming algorithm. As the hardware platform was still in preparation at the time of doing the research, we decided to use the open source helicopter simulator described in section 4.2.1 for these initial experiments in controller acquisition. As the simulator is very accurate, flying the helicopter is very, very hard. None of the authors could even get it off the ground and hovering in the same place for more than a few seconds, and trying to complete the way point following task we evolved controllers for was out of the question. The problem turned out to be too difficult not only for humans, but for algorithms as well; a great amount of effort was spent on approaches that turned out not to work. In the notation used below, the helicopter state is specified by the variables

[x,y,z,u,v,w,φ,θ,ψ,p,q,r]

The state vector follows the conventional notation used in the aircraft control community; x, y, z and u, v, w are the position in the inertial reference frame and the velocity in the helicopter's frame of reference, while φ, θ, ψ and p, q, r are the rotations and rotational velocities about the axes of the helicopter, respectively. The simulator comes complete with a PID (proportional, integral, derivative) controller, which was hand-tuned by the creators of the simulator and is able to perform the way point following task (though not with much grace). In our experiments, we made use of parts of this controller in various ways while we were exploring ways to automatically create controllers that would eventually outperform the PID controller by a respectable margin. More details about e.g. the simulation, previous work and some of the experiments are to be found in the paper. The paper also has an analysis of the shortcomings of monolithic networks

105 and simultaneous evolution, which is largely superseded by the discussion in sections 2.4.2 and 2.4.3.

Non-working methods

In the first approach, we tried to evolve a full MLP with two layers of weights, using a 10+23 evolution strategy, and a fitness function based only on progress along the waypoint chain. (The odd population size is because the simulation is very computationally expensive, so we had to distribute the evaluations to a cluster, and we had only 34 machines available.) The MLP was given all 12 state variables (position, velocity, angles and rotational velocity), plus the location of the next waypoint in 3 dimensions, and the network’s four outputs were used to drive all the control surfaces of the helicopter. Results were not encouraging. In some cases, no goal-directed behaviour at all was observed, save that of maintaining enough altitude not to crash into the ground. In some other cases, the helicopter drifted slowly towards the first waypoint, but rarely reached it. In all cases, the helicopter was continuously spinning around the z axis, though the speed with which it did this varied.

Figuring that the z spin was the problem, we tried removing the yaw control (responsible for controlling the z spin) from the neural network, and reinstating the yaw part of the PID controller, but to no avail. We then tried a wide variety of ways of modularizing the neural network, and changes to the fitness function, to be able to evolve a controller that would both follow the way points and not spin around. Nothing worked. Thinking that it would be impossible to evolve yaw stabilization together with the other aspects of the controller, we then opted for incremental evolution.

Incrementally substituting a PID with neural networks

In our first working approach, we used the PID controller delivered with the simulator to enforce functional separation in the network. The first phase was the evolution of the yaw controller. For this purpose a very simple neural network, with four connections in all, was evolved to stabilize the yaw. This network was evolved using a fitness function that linearly penalised deviation from the target heading. Each trial lasted for only twenty timesteps, and evolution produced a fairly good solution within several tens of generations. During the second phase, the yaw network was free to evolve along with the rest of the controller.
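A sketch of the yaw-stabilisation fitness used in this first phase might look as follows; the simulator interface and the scaling are assumptions, but the idea of linearly penalising heading deviation over a twenty-timestep trial is as described above.

    def yaw_fitness(yaw_controller, simulator, target_heading=0.0, timesteps=20):
        # Fitness linearly penalises deviation of the heading (psi) from the target.
        penalty = 0.0
        state = simulator.reset()
        for _ in range(timesteps):
            state = simulator.step(yaw_controller(state))
            penalty += abs(state.psi - target_heading)
        return -penalty   # higher fitness means a smaller accumulated heading error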

The second phase was divided into three steps; in all steps the same fitness function was used. In step 1, a three layer MLP was substituted for the PID controller’s guidance layer; it

Figure 5.21: Topology of the substitution network.

took as its input the distances from the waypoint and the velocity of the helicopter (both in body coordinates), and had as outputs the desired pitch and roll attitudes for the longitudinal and lateral PIDs. In step 2, the two inner PID loops (longitudinal and lateral) were removed and substituted by two separate MLPs. These were evolved to act on the information given by the neural outer layer, and they output helicopter motor commands (longitudinal and lateral cyclic control, and collective pitch). After this step, we had a working helicopter controller consisting solely of neural networks. In step 3, the outer network layer, which was frozen during step 2, was allowed to evolve further, along with the inner network layer. In this way, the networks could coadapt to each other, potentially allowing them to exploit modes of cooperation not possible for the purely linear PID controllers. The controller based on the inner network reached a good fitness level within 200 generations in all ten replications of the experiment. Evolution of the outer network showed more variability, but within 500 generations an outer network had been evolved in all replications which gave the controller a reasonable, if not good, performance. When both networks were further evolved together, however, very good performance was reached in every replication (see fig. 5.22).

Incremental/simultaneous evolution of modular networks

The next experiment aimed at evolving a controller without any involvement of the PID. The first phase, evolving a simple yaw stabilizer, was repeated exactly as in the previous working approach. For the rest of the controller, three relatively simple custom-topology neural networks were evolved simultaneously using the standard fitness function. The three networks respectively output the longitudinal cyclic control, the lateral cyclic control, and the collective pitch; their topologies are depicted in figure 5.23. The longitudinal network had the

Figure 5.22: Evolving the substitution network in three steps. Best fitness (black line) and average of the best fitnesses (gray line) over 10 repetitions of the evolution are shown. Error bars show the standard deviation calculated every 20 generations.

Figure 5.23: Topology of the modular network.

following inputs: the longitudinal distance to the waypoint (∆x), u, q and θ. Similarly, the lateral network made use of the lateral distance to the waypoint (∆y), v, p and φ. The collective pitch network used the difference in altitude to the waypoint (∆z) and ż. The experiment was replicated ten times. The variability in fitness within the first few hundred generations was quite high, but within 500 generations very good performance was reached in all ten replications, as shown in figure 5.24.
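The modular controller can thus be pictured as three small networks plus the yaw network, each seeing only its own slice of the state, roughly as in this sketch; the network objects and the state attribute names are assumed.

    def modular_controller(state, waypoint_delta, long_net, lat_net, coll_net, yaw_net):
        # waypoint_delta = (dx, dy, dz): distance to the waypoint in body coordinates.
        dx, dy, dz = waypoint_delta
        longitudinal_cyclic = long_net([dx, state.u, state.q, state.theta])
        lateral_cyclic = lat_net([dy, state.v, state.p, state.phi])
        main_collective = coll_net([dz, state.zdot])               # zdot: vertical velocity
        tail_collective = yaw_net([state.heading_error, state.r])  # yaw inputs assumed
        return longitudinal_cyclic, lateral_cyclic, main_collective, tail_collective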

Evolving PID gains

In order to obtain a controller to serve as a meaningful standard of comparison in our performance tests, we evolved the gains of PID controllers structurally identical to the handcrafted controller shipped with the simulator. As in the neuroevolutionary approaches above, this only

Figure 5.24: Evolving the modular network. Best fitness (black line) and average of the best fitnesses (gray line) over 10 repetitions of the evolution are shown. Error bars show the standard deviation calculated every 20 generations.

worked if yaw control was evolved first, with the rest of the PID gains "frozen" until the helicopter was stable around the z axis.

Performance comparison

To better understand the peculiarities of the different controllers, additional tests were performed on four tasks differing from the one used in the evolutionary process:

Task 1, Sparse waypoints

The first test was simply a repetition of the task used for the controller evolution, but over a much longer timespan. The extra time allowed for values of Pchain bigger than 1.0 since the helicopter was able to fly the whole waypoint chain more than once.

Task 2, Close waypoints

Closer waypoints were chosen for the second task, with the average distance between the waypoints now set to 6 feet.

Task 3, Sparse waypoints in presence of wind

The waypoints were chosen with the same criteria used for task 1, but an external disturbance in the form of wind gusts was added to the simulation in order to test the robustness of the controller. The wind was simulated as a time varying vector ([0, 10] ft/s) added to the helicopter velocity.

Task 4, Sparse waypoints with varying gross weight

This task provided an understanding of the controllers' ability to handle variations in the helicopter's weight (and mass). The maximum random weight variation was set to fifty percent, since the payload of a small helicopter can often reach this limit. In addition, since the fitness function used for evolution gives only limited insight into the specific abilities of the controllers, three more specific performance indices were developed:

Table 5.12: Performance of the various controllers on different tasks
                    Handcrafted PID  Evolved PID    Substitution Network  Modular Network
Close waypoints
  Pchain            0.96 (0.077)     1.85 (0.091)   2.27 (0.182)          2.38 (0.106)
  ep                0.95 (0.216)     0.40 (0.048)   1.04 (0.21)           0.34 (0.044)
  eh                0.005 (0.003)    0.010 (0.002)  0.457 (0.002)         0.013 (0.001)
Sparse waypoints [10, 25 ft]
  Pchain            0.46 (0.121)     0.74 (0.273)   1.35 (0.127)          1.64 (0.081)
  ep                1.61 (0.265)     0.84 (0.192)   2.5 (0.748)           0.63 (0.242)
  eh                0.007 (0.001)    0.015 (0.004)  0.459 (0.003)         0.018 (0.002)
Sparse waypoints with wind [0, 10 ft/s]
  Pchain            0.27 (0.231)     0.49 (0.330)   0.95 (0.434)          1.07 (0.565)
  ep                1.99 (3.705)     2.58 (2.880)   3.54 (3.097)          1.33 (1.651)
  eh                0.069 (0.083)    0.160 (0.216)  0.430 (0.146)         0.062 (0.049)
  Crashes           7                2              2                     4
Sparse waypoints with variable weight [±50%]
  Pchain            0.54 (0.128)     0.83 (0.235)   1.3 (0.170)           1.53 (0.137)
  ep                1.68 (0.271)     0.80 (0.153)   2.28 (0.503)          0.618 (0.112)
  eh                0.012 (0.002)    0.017 (0.005)  0.48 (0.02)           0.031 (0.017)

• progress along the waypoint chain Pchain (higher is better),

• mean deviation from the shortest path ep = Σ_{i=0}^{N} |wh| (lower is better),

• mean heading error eh = Σ_{i=0}^{N} |ψ − ψnext| (lower is better).

Table 5.12 shows the results of the tests. The waypoint layout was randomly generated at the start of every evaluation, and each task was repeated 20 times; the average value of the performances obtained is shown, with the standard deviation in parentheses. The penalty for crashing into the ground was a fitness of 0, rather than a negative fitness as during evolution. On the wind task, the number of crashes in 20 trials is also given; this is a measure of the ability of the controller to maintain stable flight. In all the test tasks the performance was evaluated during a predefined period of 3000 timesteps (corresponding to 60 s of flight time).
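The three indices can be computed from a flight log roughly as follows; the record fields, the normalisation by the number of timesteps, and the exact definition of the cross-track deviation are assumptions here.

    def performance_indices(log):
        # log: per-timestep records with progress along the waypoint chain, cross-track
        # deviation, current heading and heading towards the next waypoint.
        n = len(log)
        p_chain = log[-1].progress                                # higher is better
        e_p = sum(abs(r.cross_track) for r in log) / n            # mean path deviation
        e_h = sum(abs(r.psi - r.psi_next) for r in log) / n       # mean heading error
        return p_chain, e_p, e_h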

Figures 5.25 and 5.26 show the trajectories of an evolved PID controller and an evolved modular neural network controller, respectively, performing the same task (task 1) for 1100 timesteps. The PID exhibited a very conservative strategy, slowing down when still far away from the waypoint and almost stopping when close to it. The network-based controller instead retained better control over the helicopter's speed, which produced a more linear trajectory and better progress along the waypoint chain. Generally, the modular network (evolved without involving the PID) performed best in all four variations of the task, as it always progressed the furthest of the four controllers along the waypoint chain, and always moved in a more or less straight line to the waypoint. The

110 Figure 5.25: Trajectory of an evolved PID after completing 1100 timesteps of a typical sparse waypoint task. In this particular run the progress along the path (after 3000 timesteps) was 1.14, the mean trajectory error 0.39 and the mean heading error 0.019. The dots mark the position of the helicopter every 10 timesteps.

Figure 5.26: Trajectory of a modular network controller after completing 1100 timesteps of the sparse waypoint task used in figure 5.25. At the end of the run (3000 timesteps) the progress along the path was 1.70, the mean trajectory error 0.38 and the mean heading error 0.016.

substitution network was a close second as far as progress along the chain was concerned, but had a more mixed performance when it came to deviation from the shortest path, and was the only controller that had any significant problems keeping the desired heading. The performance of the PID controllers was much worse than that of the neural network controllers when it came to progress along the waypoint chain, except in the situation with the waypoints closer together, where their performance was comparable to (if slightly lower than) that of the neural networks. This suggests that the reason for the good performance of the neural network controllers was not only the superiority of evolutionary tuning over manual tuning, but also the lack of nonlinearity in the PID controllers. It simply does not seem possible to evolve a purely linear controller that is robust enough to perform well over the various task variations and performance measures we used here.

Conclusion

Both of the successful approaches we tried for evolving neural networks generated very capable controllers that significantly outperformed the human-designed PID controller. They also outperformed our attempts to control the vehicle manually by leaps and bounds. It is interesting to note that the evolved neural networks were quite robust when the parameters of the task, vehicle and environment were varied, something that could not be said about the PID controllers. This may be due to two factors: evolution is better at tuning weights and gains than humans are, and the nonlinearity of neural networks makes them better suited for handling such variations than linear controllers. Such robustness is obviously important when transferring controllers to real vehicles. Without incorporating some domain knowledge in the evolutionary process, we have not been able to evolve a controller for doing anything more advanced than not crashing. Domain knowledge has been introduced either by using parts of a hand-designed PID controller, or by using knowledge of which inputs should be relevant to which outputs. Another form in which human judgement (though not really domain knowledge) was used was that the yaw control network always had to be evolved before the rest of the controller. This phenomenon resonates well with common wisdom in manual controller design: the simpler and less interconnected a controller is, the easier it is to tune. Whether and how artificial evolution can be made to overcome this phenomenon is an open question.

5.4.2 Playing Cellz

In this short section, based on a paper presented at CIG 2005 [143], we discuss a comparison of different neural network-based controller architectures for playing the Cellz game, as described in section 4.2.2. The central hypothesis (which is borne out by the experiments) is that when a symmetry exists in the input array, higher fitness can be achieved if the structure of the neural network is modularised so that it is forced to take this symmetry into account. The paper details an additional experiment with a related function approximation task, and has some additional discussion. In these experiments, each cell is equipped with eight cell sensors and eight food sensors spread evenly around its body (see figure 5.27); each sensor measures the distance to and concentration of other cellz or food within its 45 degree segment. The sensor arrays are used as inputs to the controllers, and their outputs are used to generate the force vectors controlling the cell. Four different neural architectures were tested and compared, two “monolithic” and two “convoluted”. The first two architectures were standard multi-layer perceptrons (MLPs). The

112 Figure 5.27: The wraparound sensors.

first MLP consisted of an input layer of 16 neurons, an 8-neuron hidden layer and an output layer of two neurons. The second MLP had two hidden layers of 16 neurons each. In both architectures, positions 0-7 received inputs from the “food” input vector of the Cellz agent, positions 8-15 received inputs from the “cells” input vector, and the two outputs from the network were used to create the force vector of the cell. The other two architectures consist of eight separate but identical neural network modules - they share the same genome. Each module can be thought of as assigned to its own pair of sensors, and thus being at the same angle r relative to the x axis as those sensors. The outputs from each module's two output units are rotated r degrees, and then added to the summed force vector output of the controller.

In the convoluted architectures, each module gets the full range of sixteen inputs, but they are displaced according to the position of the module (e.g. module number 3 gets food inputs 3, 4, 5, 6, 7, 0, 1, 2, in that order, while the input array to module 7 starts with sensor 7; see figure 5.28). In the first convoluted architecture the modules lack a hidden layer, but in the second convoluted architecture each module has a hidden layer of two neurons. It is interesting to compare the number of synapses used in these architectures, as that number determines the network updating speed and the dimensionality of the search space. The MLP with 8 hidden neurons has 144 synapses, while the MLP with two hidden layers totals 544 synapses. The perceptron-style convoluted controller has 32 synapses per module, which sums to 256 synapses, and the hidden-layer convoluted controller has 36 synapses per module, which sums to 288 synapses. It should be noted that while the convoluted controllers have little or no advantage over the MLPs when it comes to updating speed, they present the evolutionary algorithm with a much smaller search space, as only 32 or 36 synapses are specified in the genome.

Figure 5.28: Simplified illustration of the convoluted architectures, taking only one type of sensor into account. The connections in black are the connections from all sensors to one module; this structure is repeated (grey lines) for each module.
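As a rough illustration of the weight-sharing scheme just described, the sketch below shows how such a convoluted controller could compute its force vector. The NeuralModule interface and all names are hypothetical; this is a minimal sketch, not the code used in the experiments.

```java
interface NeuralModule {
    /** Propagates 16 inputs through the shared module and returns its two outputs. */
    double[] propagate(double[] inputs);
}

public class ConvolutedController {
    private final NeuralModule sharedModule; // one genome, reused by all eight modules

    public ConvolutedController(NeuralModule sharedModule) {
        this.sharedModule = sharedModule;
    }

    /** foodSensors and cellSensors each hold eight readings, one per 45-degree segment. */
    public double[] computeForce(double[] foodSensors, double[] cellSensors) {
        double fx = 0, fy = 0;
        for (int m = 0; m < 8; m++) {
            double[] inputs = new double[16];
            for (int i = 0; i < 8; i++) {
                // Displace the sensor arrays so that module m sees its own sensor pair first
                inputs[i] = foodSensors[(m + i) % 8];
                inputs[8 + i] = cellSensors[(m + i) % 8];
            }
            double[] out = sharedModule.propagate(inputs); // two outputs per module
            double angle = m * Math.PI / 4;                // module m sits at 45 * m degrees
            // Rotate the module's local force back into the agent's frame and accumulate
            fx += out[0] * Math.cos(angle) - out[1] * Math.sin(angle);
            fy += out[0] * Math.sin(angle) + out[1] * Math.cos(angle);
        }
        return new double[] { fx, fy };
    }
}
```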

All four architectures were evolved with a 5+25 evolution strategy; as the available computer power was limited, this was only done ten or so times for each configuration. The two MLP architectures were evolved for 200 generations. The one-layer MLP evolved somewhat faster and reached a higher final fitness. Both architectures produced good controllers, whose agents generally head straight for the food, even though they fairly often fail to take their own momentum into consideration when approaching the food, overshoot the food particle and have to turn back. Finally, the two convoluted controllers were evolved for 100 generations, and quickly generated very good solutions. The convoluted controller with a hidden layer narrowly outperformed the one without. Not only did good solutions evolve considerably faster than in the cases of the MLPs, but the best evolved convoluted controllers actually outperform the best evolved MLP controllers by a significant margin. As the computational capabilities of any of the convoluted controllers are a strict subset of the capabilities of the MLP with two hidden layers, this is slightly surprising, but it can be explained by the extravagantly high-dimensional search problems the MLPs present evolution with: even if a better solution exists, it is improbable that it would be found in reasonable time. It is also interesting to note that the length of the neural path from sensor to actuator in the agents (that is, the number of hidden layers) seems to be of relatively small importance. A comparison between controllers evolved in this paper, the winner of the GECCO 2004 Cellz contest, and the hand-designed controllers mentioned in the original Cellz paper (Lucas 2004) is presented in table 5.13. (The differences between the best three controllers are so small as not to be statistically significant in small samples.)

Table 5.13: Mean fitness and standard deviations over 100 game runs for controllers mentioned in this paper in comparison to other noteworthy Cellz controllers.

Controller                                              Fitness   sd
JB-Smart 1.1 (winner of the GECCO 2004 Cellz contest)   1966      20
Convoluted controller with one hidden layer             1934      18
Hand-coded sensor controller                            1920      26
MLP with one hidden layer                               1460      13
Hand-coded greedy controller                            1327      24
MLP with two hidden layers                              1225      14

The convoluted controller with one hidden layer is one of the best controllers found so far (though the convolutional aspect of the controller was hand-designed). Note that the winner of the GECCO 2004 contest was also partially hand-designed, as a neural implementation of the hand-coded sensor controller. So far the purely evolved neural networks have been unable to compete with the networks that have been partially hand-crafted. Our results clearly show that hard-coding modularity into neurocontrollers can increase evolvability significantly, at least when agents show symmetry, as they do in many computer games and robotics applications. They also show that certain types of modularity perform better than others, depending on the task at hand. Adding hidden neural layers might either increase or decrease evolvability, depending on the task. As neural encodings that intend to let modularity emerge always have certain biases, these results need to be taken into account when designing such an encoding. Regarding the particular issue of rotational symmetry in sensors, a related set of experiments by Bryant and Miikkulainen is worth mentioning [16]. Bryant and Miikkulainen trained agents in a multi-agent turn-based strategy game, where each agent had a set of wrap-around sensors very similar to the ones used by the Cellz agents. However, instead of evolution they used backpropagation to imitate human behaviour, and instead of encoding the symmetry in the neural network the set of training data was rotated. This method significantly increased the performance of the controllers when faced with situations not encountered in the training data.

5.5 Summary

In this chapter, a number of experiments concerning the learning of control for well-defined tasks were described. The experiments can roughly be divided into foundational experiments in evolving car control, described in sections 5.1 and 5.2, auxiliary experiments in learning car control, described in section 5.3, and experiments in evolving control for other game domains, described in section 5.4. The foundational experiments in evolving car control establish that car controllers can be evolved using a particular experimental setup, where the car controller is

represented as a neural network and is fed data from a number of rangefinder wall sensors, a speed sensor and a way point sensor. They also show how general driving skills can be incrementally evolved, and how controllers can be specialized for very good performance on individual tracks. In addition, we show that a number of other combinations of controller architecture and sensor setup apparently do not allow good driving behaviour to be evolved. The foundational experiments in evolving car control set the stage for all other car racing experiments in this thesis, as the methods developed in the foundational experiments are used and elaborated on in the later experiments. The auxiliary experiments in learning car control, on the other hand, corroborate, complement and elaborate on the results of the foundational experiments. A number of different controller architectures and learning methods were compared with complex and interesting results, but the original method of evolving neural networks proved to be at least as good as other tested methods in reliably coming up with good controllers. The experiments in evolving control for other game domains do not build directly on any of the car racing experiments, but a number of insights into what works and what does not work when it comes to evolving neural network controllers can be shared between these domains. Some of the key cross-domain insights concern the benefits of modularity, incrementality and ego-centric sensors, all of which are further discussed in section 2.4.

Chapter 6

Imitation

In this chapter we discuss three series of experiments where we model or imitate aspects of car racing. In the first section we model behaviour, in particular the driving styles of human players in the racing simulator. The second section looks at imitating not behaviour but dynamics, acquiring a model of a physical radio-controlled car that is subsequently used to evolve driving behaviour for the car. The third section briefly discusses another attempt at modelling dynamics, this time in the racing simulator and in a different transformation space.

6.1 Imitating driving styles

In two papers, presented at the SAB Workshop on Optimizing Player Satisfaction 2006 and CIG 2007 respectively, we developed a new method for personalised content creation in racing games; the project was done in collaboration with Renzo De Nardi [139][140]. The method consists in modelling the driving style of a human player and evolving a racing track for that particular player. The track evolution part of the project will be discussed in section 7.3. Those experiments use a controller trained to behave like a modelled human in important respects as part of the fitness function for tracks. More specifically, this controller will be driven on candidate tracks, and aspects of its performance on those tracks will be used in the fitness calculation for the tracks. In this section we discuss our approach to player modelling, focusing on the method used in the later paper. The outcome of our modelling algorithm will be a controller which in some respects behaves like the modelled human, and which can be used in the track evolution experiments. In the following, we will analyse what we actually need to model, describe the differences between direct and indirect modelling, and present the results of our modelling experiments.

6.1.1 When is a player model adequate?

The only complete model of a human player is the human player himself. This is both because human brains and sensory systems are rather more complex than anything machine learning can learn, and because of the limited amount of training data available from the few laps around a test track which is the most we can realistically expect a player to put up with. Further, it is likely that a controller that accurately reproduces the player’s behaviour in some respects and circumstances works less well in others. Therefore we need to decide what features we want from the player model, and which features have higher priority than others. As we want to use our model for evaluating fitness of tracks in an evolutionary algorithm, and evolutionary algorithms are known to exploit weaknesses in fitness function design, the highest priority for our model is robustness. This means that the controller does not act in ways that are grossly inconsistent with the modelled human, especially that it does not crash into walls when faced with a novel situation. The second criterion is that the model has the same average speed as the human on similar stretches of track, e.g. if the human drives fast on straight segments but slows down well before sharp turns, the controller should do the same. That the controller has a similar driving style to the human, e.g. drives in the middle of straight segments but close to the inner wall in smooth curves (if the human does so), is also important but has a lower priority.

6.1.2 Direct modelling

What we call direct modelling is arguably the most straightforward way of acquiring a player model: use supervised learning to associate the state of the car with the actions the human takes given that car state. We let several human players drive test tracks, and logged the speed and the outputs of the waypoint sensor and the wall sensors (as defined above) together with the action taken by the human at each timestep. Two methods of supervised learning were tried on this data set: training a multilayer perceptron for use in the controller with backpropagation, and using the unprocessed data for controlling the car with nearest neighbour classification of input data. Both methods resulted in worthless controllers that rarely completed a whole lap. While the trained neural networks were worthless in an uninteresting way, the nearest neighbour-based controllers reproduced the modelled players' driving style almost perfectly, until the slight random perturbations in the game presented the controller with a situation that differed enough from anything present in the training data, and the car crashed. None of the controllers were able to recover from crashes, as the human players had not crashed during the data collection, and thus the situation was not in the data set. We believe this not to be a problem with the particular supervised learning algorithms we

used, but rather an unavoidable problem with the direct modelling approach. As no model is perfect, controllers developed with direct modelling will tend to err, which diminishes their performance relative to the modelled human. In general, it is very unlikely that they will perform better than or as well as the modelled human (though it is theoretically possible that individual controllers could perform well), as any deviation from correct modelling will tend toward random behaviour. Such imperfect controllers will likely crash into walls, and will not know how to recover, as they cannot learn from their mistakes. This problem was recognized by the developers of Forza Motorsport, who solved it by placing certain constraints on the types of tracks that were allowed in the game, and then recording the player's racing line over each possible track segment. Still, collisions with walls could not be entirely avoided, so a hard-coded crash-recovery behaviour was needed [58]. While this modelling method ostensibly works, it places far too many constraints on the tracks to be useful for our purposes.
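For concreteness, the sketch below shows the nearest neighbour variant of direct modelling in schematic form: each logged sample pairs the sensor state observed by the human with the action taken, and the controller replays the action of the closest logged state. The class and field names are invented for illustration; this is not the code used in our experiments.

```java
import java.util.List;

public class NearestNeighbourDriver {
    public static class Sample {
        public final double[] sensors; // speed, waypoint sensor and wall sensor readings
        public final int action;       // encoded steering/throttle command taken by the human
        public Sample(double[] sensors, int action) {
            this.sensors = sensors;
            this.action = action;
        }
    }

    private final List<Sample> log; // data recorded while the human drove the test tracks

    public NearestNeighbourDriver(List<Sample> log) {
        this.log = log;
    }

    /** Returns the logged action whose recorded state is closest to the current one. */
    public int act(double[] currentSensors) {
        Sample best = null;
        double bestDist = Double.MAX_VALUE;
        for (Sample s : log) {
            double d = 0;
            for (int i = 0; i < currentSensors.length; i++) {
                double diff = currentSensors[i] - s.sensors[i];
                d += diff * diff; // squared Euclidean distance suffices for comparison
            }
            if (d < bestDist) {
                bestDist = d;
                best = s;
            }
        }
        return best.action;
    }
}
```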

6.1.3 Indirect modelling

Indirect modelling means measuring certain properties of the player's behaviour and somehow inferring a controller that displays the same properties. This approach has been taken by e.g. Yannakakis in a simplified version of the Pacman game [153]. In our case, we start from a neural network-based controller that has previously been evolved for robust but not optimal performance over a wide variety of tracks, as described in section 5.2. We then continue evolving this controller with the fitness function being how well its behaviour agrees with certain aspects of the human player's behaviour. This way we satisfy the top-priority robustness criterion, but we still need to decide what fitness function to employ in order for the controller to satisfy the two other criteria described above, situational performance and driving style. In our earlier paper, we measured the average driving speed of the human player on three tracks designed to represent different types of driving challenges, and then evolved controllers to match that performance as closely as possible on each of the three tracks. That method was successful, but could be argued to fail to capture much of the driving style of the player. The later paper used the following method, which is an attempt to model the driving in more detail while still using an indirect approach. First of all, we design a test track, featuring a number of different types of racing challenges. The track, pictured in figure 6.1, has two long straight sections where the player can drive really fast (or choose not to), a long smooth curve, and a sequence of nasty sharp turns. Along the track are 30 waypoints, and when a human player drives the track, the way he passes each waypoint is recorded. What is recorded is the speed of the car when the waypoint is passed, and the orthogonal deviation from the straight path between the waypoints, i.e. how far to the left or right of the waypoint the car passed.

Figure 6.1: The test track and the car.

This matrix of 2 × 30 values constitutes the raw data for the player model. The actual player model is constructed using the Cascading Elitism algorithm (see section 2.1.3), starting from a general controller and evolving it on the test track. Three fitness functions are used, based on minimising the following differences between the real player and the controller:

• f1: total progress (number of waypoints passed within 1500 timesteps),

• f2: speed at which each waypoint was passed,

• f3: orthogonal deviation where the way point was passed.

The first and most important fitness measure is thus total progress difference, followed by speed and deviation difference respectively.
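As a rough sketch of how these three measures could be turned into fitness values for the Cascading Elitism algorithm, the code below compares the recorded human data with the corresponding data gathered from a candidate controller on the test track. The exact aggregation (negated absolute differences) and all names are assumptions made for illustration, not the thesis implementation.

```java
public class ImitationFitness {
    /** f1: negative absolute difference in waypoints passed within 1500 timesteps. */
    public static double progressFitness(int humanProgress, int controllerProgress) {
        return -Math.abs(humanProgress - controllerProgress);
    }

    /** f2: negative mean absolute difference in passing speed over the 30 waypoints. */
    public static double speedFitness(double[] humanSpeeds, double[] controllerSpeeds) {
        return -meanAbsDiff(humanSpeeds, controllerSpeeds);
    }

    /** f3: negative mean absolute difference in orthogonal deviation at each waypoint. */
    public static double deviationFitness(double[] humanDev, double[] controllerDev) {
        return -meanAbsDiff(humanDev, controllerDev);
    }

    private static double meanAbsDiff(double[] a, double[] b) {
        int n = Math.min(a.length, b.length);
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return n == 0 ? 0 : sum / n;
    }
}
```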

6.1.4 Results

In our experiments, five different players' driving was sampled on the test track, and after 50 generations of the Cascading Elitism algorithm with a population of 100, controllers whose driving bore an acceptable degree of resemblance to the modelled humans had emerged. The total progress varied considerably between the five players - between 1.31 and 2.59 laps in 1500 timesteps - and this difference was faithfully replicated in the evolved controllers, which is to say that some controllers drove much faster than others (see the speed fitness in figures 6.2 and 6.3). Progress was made on the two other fitness measures as well, and though there was still some numerical difference between the real and modelled speed and orthogonal deviation at most waypoint passings, the evolved controllers do reproduce qualitative aspects of the modelled players' driving.


Figure 6.2: Evolving a controller to model a slow, careful driver. Since the initial general controller is quite well-performing, the evolutionary algorithm quickly adapts the driving style to obtain the required progress and speeds. Finally, the orthogonal deviation fitness improves as well. See section 6.1.3 for the description of the fitnesses.


Figure 6.3: Evolving a controller to model a good driver. The lack of progress on minimising the progress difference is the result of the fact that the progress of the modelled driver is very close to that of the generic controller used to initialise the evolution. See section 6.1.3 for the description of the fitnesses.

For example, the controller modelled on the first author drives very close to the wall in the long smooth curve, very fast on the straight paths, and smashes into the wall at the beginning of the first sharp turn. Conversely, the controller modelled on the anonymous and very careful driver who scored the lowest total progress crept along at a steady speed, always keeping to the center of the track. Summing up, indirect modelling of behaviour produces player models which qualitatively seem to be reasonably faithful to the modelled human. We have not attempted any quantitative validation of the accuracy of the models themselves; however, in section 7.3 we demonstrate their usefulness for evolving tracks.

6.2 Imitating real car dynamics for controller evolution

The experiments detailed in this section should be seen in the context of the questions discussed in section 2.3.1: can we use a simulation to evolve controllers for a physical robot? If so, how do we acquire the models on which to base this simulation? In particular, how do we do this for a non-recoverable robot, one that can break or otherwise cannot easily be returned to its starting position and configuration? In this section, based on a paper presented at GECCO 2007 with Renzo De Nardi, Hugo Marques and Richard Newcome as co-authors, we investigate how to do this for the small radio-controlled car that is the qualitative basis of the car racing simulator [141].

6.2.1 Our approach to dynamics modelling and controller evolution

As discussed in section 2.3.2, one of the guiding principles of evolutionary robotics is to exclude the human from the controller design process as far as possible. The viewpoint taken in this section is that this should also be the case for the model acquisition. In other words, we are seeking to minimise the amount of domain knowledge used to produce the model. There are several reasons for this stance. One is that, similar to the argument for leaving controller design to evolution, we can potentially harness more of our learning algorithms' power and creativity. Another is that we want a method of model acquisition that can be used for modelling different types of robots (wheeled, winged, walking, whatever) with few or no changes. But the most important one is that for some systems we just don't have the domain knowledge available. For example, for the helicopters in the UltraSwarm project (see section 5.4.1) there is no dynamics model available. In section 2.3.1, we discussed several approaches to acquiring models that can be used to evolve (or otherwise learn) controllers, particularly those of Bongard, Abbeel et al. and

Jakobi [11][1][65]. Here we contrast their approaches with the one we are currently taking. While interesting, Bongard's approach seems to place certain requirements on the type of robot used, requirements which our system does not meet. Importantly, our model car is dynamical and all states are transient, whereas Bongard's robot can be kinematically modelled; it would also be possible to crash or otherwise lose control of our car during the experimentation phase performed by the algorithm. Abbeel et al. succeed in modelling a car and a conventional single-rotor helicopter, but not without putting in considerable domain knowledge, and even adapting their systems so as to be easier to model (such as voltage stabilisation on the car). This is all good if modelling a particular car or helicopter is the primary goal. We aim instead to address the more general problem of nonlinear modelling for controller evolution when the assumptions about the underlying system being modelled are relaxed.

As for Jakobi’s approach, the reliance on a human to separate base set from implementation set clearly violates our goal of minimizing domain knowledge.

6.2.2 Model requirements for controller evolution

As stated above, we are interested in system identification as a means of making it possible to evolve controllers, and the requirements we have on the models we infer are thus likely to differ from those placed on models used in e.g. traditional control theory. These differences stem from evolutionary computation's well-known tendency to exploit the model or fitness function at hand. As trying randomly generated strategies is essential to evolution, the candidate controllers are likely to take actions which are very different from any actions in the training data for the model, or which otherwise "break" the model so that it produces responses which are wildly different from the system it is intended to model. Such weak spots in the model can often be exploited by the evolutionary algorithm to create controllers that achieve good fitness, but are completely useless in reality. An example of such an exploit for a car racing model would be if the model moved sideways when a steering command was issued while the car was standing still. A model with such a deficiency could well be learnt if the situation (steering while not driving) was not in the training data, and the right constraints were not applied to what sort of dynamics could be learnt. That the models behave in a way which adheres to the constraints implied by the system being modelled is paramount; however, beyond learning these large-scale constraints, acquisition of non-exploitable models is more important than achieving high precision on a specific training set.

However, there are ways in which the model could fail to conform to the modelled system and still be usable for controller evolution. In general, we believe that deficiencies that make it harder for the controller to perform its task (e.g. driving to a way point) are admissible, whereas deficiencies that make it easier to succeed are not. For concrete examples in our current domain, we believe that it is preferable that the model overestimates the turning radius of the car rather than underestimates it, that too long a lag between command and effect is preferable to too short a lag, and that it is probably acceptable if the model misjudges all accelerations by some constant, but absolutely not acceptable that the model makes it possible to turn the car on the spot. That being said, we want to avoid encoding any of the above ideas into our model representations, in the form of explicit constraints on dynamics or otherwise, in line with our principle of minimising our use of domain knowledge.

6.3 Data collection

The car we used is a small and inexpensive radio-controlled toy car with a mass of approximately 0.3 kg and a length and width of 18 cm and 10 cm respectively (Figure 6.4). The car is provided only with simple bang-bang control inputs: forward or backward at full throttle, and steering right or left. The combination of the asymmetry in turning and the slack of the worn-out differential drive gearbox makes controlling the car a non-trivial task. The only modification made to the car was the addition of five very light and reflective fiducial markers necessary for tracking. No electrical modifications of any kind were made, which means that the battery level had a direct impact on the car's top speed and responsiveness. The remote control of the car was modified so that it could be controlled through the parallel port of our computer. A simple Java program outputs the steering commands given by a human driver via the keyboard to the parallel port, and logs the commands together with the current car state. The current car state was obtained from a Vicon infrared tracking system that can record the position of the markers placed on the car with very high accuracy (in the order of millimeters) and a frame rate of 200 fps. The real-time data reconstruction delay offered by the system is also very low (in the order of milliseconds) and, being considerably shorter than our control rate (20 Hz), can be safely neglected. All the 50 square meters that constitute the test area were covered with white paper to reduce (but not eliminate) skidding of the car and infrared reflectivity of the floor. The tracking system produces the absolute car position with reference to a coordinate frame fixed to the test area (a third-person perspective), which is translated into the car-centred frame of reference and numerically differentiated to produce the car state. As can be expected, the process of numerical differentiation introduces noise in the computed velocities.

Figure 6.4: The toy car and its remote; note the reflective markers.

In order to limit the noise, we record the data with a time resolution of 200 Hz and compute the velocities, which are then low-pass filtered and down-sampled to 20 Hz. To avoid distortion in the data, a FIR (finite impulse response) filter was used, and the delay introduced by the filtering process was compensated for. The control commands were logged together with the car state.
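A simplified sketch of this state-estimation pipeline is given below. The method names, the clamping at the signal edges and the filter coefficients are placeholders introduced for illustration, not the scheme actually used in the experiments.

```java
public class VelocityPipeline {
    /** Finite differences of a 200 Hz position trace (dt = 0.005 s). */
    public static double[] differentiate(double[] positions, double dt) {
        double[] v = new double[positions.length - 1];
        for (int t = 0; t < v.length; t++) {
            v[t] = (positions[t + 1] - positions[t]) / dt;
        }
        return v;
    }

    /** Symmetric FIR low-pass filter; real coefficients would be designed properly. */
    public static double[] firFilter(double[] signal, double[] coeffs) {
        double[] out = new double[signal.length];
        int half = coeffs.length / 2; // symmetric filter, so its delay can be compensated exactly
        for (int t = 0; t < signal.length; t++) {
            double acc = 0;
            for (int k = 0; k < coeffs.length; k++) {
                int idx = Math.min(Math.max(t + k - half, 0), signal.length - 1);
                acc += coeffs[k] * signal[idx];
            }
            out[t] = acc;
        }
        return out;
    }

    /** Keep every tenth sample to go from 200 Hz down to the 20 Hz control rate. */
    public static double[] downsample(double[] signal, int factor) {
        double[] out = new double[signal.length / factor];
        for (int i = 0; i < out.length; i++) {
            out[i] = signal[i * factor];
        }
        return out;
    }
}
```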

As is common practice in vehicular dynamics, we define the state of the car as the set constituted by its forward, lateral and angular velocities $[u, v, \dot{\psi}]$. The derivative of the state

$[\dot{u}, \dot{v}, \ddot{\psi}]$ will be used for the corresponding accelerations¹; $u_1$ and $u_2$ indicate the driving and steering inputs. In order to minimise the use of domain knowledge, we decided not to collect data while a human driver was performing the point-to-point car racing task (see section 4.1.3), but instead while performing a set of various driving manoeuvres that we regarded as representative of the space of possible driving manoeuvres. In total, about 6 minutes of driving data was collected, with different manoeuvres ranging from loops and figure-8s to simple starting and stopping. Visual inspection of the collected data indicated that the lag in the loop (from sending a command to acquiring position information) was negligible compared to the 20 Hz frequency of the controller, and thus did not need to be modelled.

¹ Computed as $[\dot{u}, \dot{v}]_t = R(\dot{\psi}_t)^{-1}[u, v]_{t+1} - [u, v]_t$ and $\ddot{\psi}_t = \dot{\psi}_{t+1} - \dot{\psi}_t$, where $R(\dot{\psi}_t)$ denotes the rotation matrix between the body frame at time $t+1$ and $t$. Throughout, $[u, v, \dot{\psi}]$ and $[\dot{u}, \dot{v}, \ddot{\psi}]$ refer to the state and acceleration at time $t$.

6.3.1 Modelling techniques

Four different function representations, with different associated learning methods, were used to learn models of the car: a function approximator based on twenty-seven MLPs (multi-layer perceptrons) trained with back-propagation (backprop27), another based on three MLPs aided by an integration routine and trained with artificial evolution (evolved3), a simple nearest neighbour classifier, also using an integration routine, and a parametrised model of the car based on physical insight, with its parameters set by evolution. In the nonlinear system identification literature, colour-coding is traditionally used to distinguish the level of prior knowledge that is available:

• White Box models: the model is perfectly known,

• Grey Box models: the model structure is known based on some physical insight, but its parameters remain to be estimated,

• Black Box models: prior knowledge is not used.

The backprop27 is the darkest model we produced, as it produces the next state of the system based on its current state and on the control inputs. The only handcrafted information pushed through the system was the choice of which neural network to use depending on the active control signal (see below). The evolved3, the nearest neighbour and the parametrised models are grey-box models, as they all assume a physical system. Given a state and a control action, they only produce the derivative of the state vector, a sound assumption for any physical system actuated by forces and torques. The evolved3 and the nearest neighbour models are very dark grey boxes; they simply predict accelerations, leaving the computation of the new state to an external routine that takes care of the integration. In contrast, the parametrised model incorporates substantially more domain knowledge, as it explicitly tries to model coupling effects and friction that are known to be present in the real car.

Back-propagation MLPs

Our first model architecture does not assume anything about the system to be modelled, but rather about the space of inputs. Since the car has a relatively small set of action commands, and since those actions are discrete, it is possible to take advantage of this by using a separate neural network for each possible command. This would possibly give each network a simpler function to learn, not only by reducing the output space, but also by eliminating the motor commands from the input space (as the appropriate network outputs can be selected solely on the basis of the action input). In total we used 27 networks (9 possible actions × 3 DoF), receiving as inputs the full state ($[u, v, \dot{\psi}]$) and the usual bias. The networks were then trained to predict

the state of the system at the next time-step. We used the standard back-propagation algorithm for the training, based on the squared error between the model output and the logged car data. All of the almost 7000 logged data points were used repeatedly over the 1000 training epochs.
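The sketch below condenses this structure: one small network is kept per (action, degree of freedom) pair, and the network used for a prediction is chosen purely by the active command, so the motor command never appears in the input vector. The Mlp class is a placeholder and none of this is the actual experiment code.

```java
public class Backprop27Model {
    private final Mlp[][] nets = new Mlp[9][3]; // 9 discrete actions x 3 DoF (u, v, psi-dot)

    public Backprop27Model() {
        for (int a = 0; a < 9; a++) {
            for (int dof = 0; dof < 3; dof++) {
                nets[a][dof] = new Mlp(4, 4, 1); // inputs: u, v, psi-dot and a bias
            }
        }
    }

    /** Predicts the next state [u, v, psi-dot] given the current state and the action index. */
    public double[] predictNextState(double[] state, int action) {
        double[] input = { state[0], state[1], state[2], 1.0 }; // bias appended
        double[] next = new double[3];
        for (int dof = 0; dof < 3; dof++) {
            next[dof] = nets[action][dof].propagate(input)[0];
        }
        return next;
    }

    /** Placeholder multi-layer perceptron; a real one would also implement backpropagation. */
    public static class Mlp {
        Mlp(int inputs, int hidden, int outputs) { /* allocate weights here */ }
        double[] propagate(double[] input) { return new double[1]; }
    }
}
```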

Evolved MLPs

In this second attempt we raise the level of domain knowledge in the model by assuming that the state of the system is constituted by physical quantities that can be computed by integrating accelerations, which are direct results of states and input commands. Three separate MLPs are in this case evolved to produce the difference between the current and the future state of the system. Each network uses as input the current state of the system and the control command, and produces the change in one of the states. The weights of the neural networks were then evolved using a standard 50+50 evolution strategy, encoding the weights of the networks as real numbers and mutating them with Gaussian mutation with variance 0.01. The fitness function was based on the mean squared error between the predicted state and the ground-truth state logged for the car, calculated as follows. A starting point in the data was picked at random, and the state of the simulated car was initialised to be the same as the state of the real car. The same commands were then fed to the simulated car as to the real one, and the state of the simulated car was updated using the model under evaluation, for 10 time steps, after which the squared difference between the real and the simulated state was calculated. The fitness of a model was defined as the negative mean of these errors over 100 repetitions of this process. The evolved3 models were generally evolved for 1000 generations, though they consistently achieved peak fitness after a few hundred generations.
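A sketch of this fitness evaluation is given below. The DynamicsModel interface, the data layout and all names are assumptions introduced for illustration, not the code used in the experiments.

```java
import java.util.Random;

public class ModelFitness {
    public interface DynamicsModel {
        /** Returns the next state [u, v, psi-dot] given the current state and command. */
        double[] step(double[] state, int command);
    }

    public static double evaluate(DynamicsModel model, double[][] loggedStates,
                                  int[] loggedCommands, Random rng) {
        final int horizon = 10, repetitions = 100;
        double totalError = 0;
        for (int rep = 0; rep < repetitions; rep++) {
            int start = rng.nextInt(loggedStates.length - horizon - 1);
            double[] simState = loggedStates[start].clone(); // initialise from the real car
            for (int t = 0; t < horizon; t++) {
                simState = model.step(simState, loggedCommands[start + t]);
            }
            double[] realState = loggedStates[start + horizon];
            for (int i = 0; i < realState.length; i++) {
                double d = simState[i] - realState[i];
                totalError += d * d; // squared difference after ten steps
            }
        }
        return -totalError / repetitions; // higher (less negative) is better
    }
}
```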

Nearest neighbour

Like the evolved3 models, but unlike the backprop27 models, the nearest neighbour-based model maps the current state and command to accelerations. The model is very simple and consists in having the real car data organised in nine tables, one for each control command, with each entry mapping a recorded state $[u, v, \dot{\psi}]$ to a recorded acceleration $[\dot{u}, \dot{v}, \ddot{\psi}]$. When predicting, given the current control command and state, the algorithm finds the state entry in the associated table with the smallest Euclidean distance and adds the corresponding acceleration entry to the current state.

Parametrised model

The parametrised model we adopted is inspired by the model presented in [1], and is mainly a fruit of the insight into the underlying physics we gained by driving the car and studying the

logged data. It consists in a set of algebraic equations, the parameters of which are fitted to best approximate the car data. Since the lateral translations are generally very small, we deliberately neglected the lateral motion. There were three principal effects we wanted the model to be able to reproduce: 1) there can be asymmetry between forward and backward motion and between right and left turning; 2) it is not possible to turn the car on the spot; 3) the motion of the car is characterised by static and dynamic friction. To achieve the first requirement we simply allowed for four different proportional constants between the control input and the respective accelerations (forward acceleration $\dot{u}_f$, backward acceleration $\dot{u}_b$, right angular acceleration $\ddot{\Psi}_r$, left angular acceleration $\ddot{\Psi}_l$). As suggested in the model by Abbeel, to stop the model from spinning on the spot we simply defined the rotational speed as a proportion of the forward speed (see equation 6.1). And finally, to account for static and dynamic friction, we defined minimum velocity factors (minimum forward velocity $u_m$, minimum angular velocity $\dot{\Psi}_m$) and linear and viscous drag (linear drag $D_{ul}$, linear viscous drag $D_{uv}$, angular drag $D_{\Psi l}$, angular viscous drag

$D_{\Psi v}$). Additional safety factors were also defined to avoid unrealistic velocities (maximum linear velocity $u_M$, maximum angular velocity $\dot{\Psi}_M$).

$$
\begin{aligned}
\ddot{\psi}_d &= \dot{\psi} D_{\Psi l} + \operatorname{sgn}(\dot{\psi})\,\dot{\psi}^2 D_{\Psi v} \\
\ddot{\psi} &= \mathbf{1}(|\ddot{\psi}| > \ddot{\Psi}_M)\left(\mathbf{1}(u_2 = 1)\,u\,\ddot{\Psi}_r - \mathbf{1}(u_2 = -1)\,u\,\ddot{\Psi}_l\right) - \ddot{\psi}_d \\
\dot{\psi} &= \mathbf{1}(|\dot{\psi}| < \dot{\Psi}_m)\,u + \mathbf{1}(|\dot{\psi}| \geq \dot{\Psi}_m)\,\operatorname{sgn}(\dot{\psi})\,\dot{\Psi}_M \\
\dot{u}_d &= u D_{ul} + \operatorname{sgn}(u)\,u^2 D_{uv} \\
\dot{u} &= \mathbf{1}(|\dot{u}| > \dot{u}_M)\left(\mathbf{1}(u_1 = 1)\,\dot{u}_f - \mathbf{1}(u_1 = -1)\,\dot{u}_b\right) - \dot{u}_d \\
u &= \mathbf{1}(|u| < u_m)\,u + \mathbf{1}(|u| \geq u_m)\,\operatorname{sgn}(u)\,u_M
\end{aligned}
\tag{6.1}
$$

The 12 parameters that fully define the model were then evolved using the same evolution strategy and fitness function as described in section 6.3.1. In this case the parameters were encoded as an array of real numbers and evolved with the same 50+50 elitist scheme adopted for the evolved3 model. As with the MLPs, the parametrised models were evolved for 1000 generations but peaked after a few hundred.

Single- and multi-model controller evolution

The final modelling technique we introduce here is multi-model evolution: we evolved two extra controllers using more than one model. At each controller evaluation, the controller was tested using two or three of the best evolved models, acquired using different techniques and representations; the fitness used was the lowest fitness of those achieved. In this way, we reasoned, any evolved strategy that relied on exploiting a weakness of a particular model would score badly, as this particular weakness would not be present in the other models (but rather other weaknesses). According to this hypothesis, the extent to which we can avoid any systematic weaknesses that plague all our models depends both on the quality of the training data and the diversity of function representations and learning algorithms used to acquire the different models. A certain kinship with Jakobi's radical envelope of noise hypothesis can be seen in that the use of multiple models can be said to implicitly separate a base set (properties that can be modelled without exploitable weaknesses) from an implementation set (those that can not). The performance of each controller was then tested with each one of the models and finally with the best model of all: the physical car.

Table 6.1: The root mean squared error [m/s], (standard deviation [m/s]) and [root relative error %] of each model in the testing data.

Model          u                          v                          ψ˙
backprop27     0.6528 (0.6526) [58.90]    0.0832 (0.0832) [106.58]   1.3036 (1.3033) [80.35]
parametrised   0.3213 (0.3211) [28.99]    0.0669 (0.0668) [85.63]    0.5294 (0.5291) [32.63]
evolved3       0.6668 (0.6666) [60.17]    0.1498 (0.1497) [191.80]   1.0061 (1.0055) [62.01]
n. neighbour   0.3164 (0.3157) [28.55]    0.0223 (0.0223) [28.58]    0.4122 (0.4114) [25.41]
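The sketch below illustrates the multi-model fitness idea described above in schematic form: a candidate controller is evaluated on each of the chosen models and the lowest score is assigned as its fitness. The interfaces and method names are hypothetical; this is not the code used in the experiments.

```java
import java.util.List;

public class MultiModelFitness {
    public interface CarModel {
        /** Runs the controller on this model for one trial and returns the way points passed. */
        double evaluate(Controller controller);
    }

    public interface Controller { /* produces driving and steering commands from the car state */ }

    public static double fitness(Controller controller, List<CarModel> models) {
        double worst = Double.MAX_VALUE;
        for (CarModel model : models) {
            worst = Math.min(worst, model.evaluate(controller));
        }
        return worst; // the controller is only as good as its weakest model run
    }
}
```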

6.3.2 Model acquisition experiments

Using the various architectures discussed, several models were obtained and selected for controller evolution. In the case of the nearest neighbour and backprop27 techniques, only one model was produced for each; with both the parametrised and evolved3 techniques, the best model produced after the first evolutionary run had completed was chosen. The accuracy of the selected models was then verified using a validation dataset held back during training. In testing, each model is initially given the state of the real car. Each model is then given the control signals ($u_1$ and $u_2$) recorded from the real car, and at each time step the predicted state is propagated as the next state. In this way, we are able to see the deviation of each model from the baseline given by the testing data. Table 6.1 shows the root mean squared error (RMSE) of the predictions of the velocities (u, v and $\dot{\psi}$) when compared with the testing data. A plot of the predictions made by each model is shown in figures 6.5 and 6.6, using a subset of the testing data. Here we show only the results for the variables u and $\dot{\psi}$, since these are the ones that have greater impact on the quality of the model.


Figure 6.5: Predicted forward speed u for each of the models acquired.

The nearest neighbour model sometimes follows the baseline accurately and other times misses it completely. Without any form of interpolation, the nearest neighbour algorithm understandably fails to generalise in novel situations which are outside the envelope of the training data. In addition, the parametrised model often produces overestimates and is sometimes slow to react to the control commands. The set of 27 networks trained with back-propagation can follow the baseline relatively closely; however, as seen in the example plots, large errors in the predictions can occur over short periods of time (spikes). A quite different behaviour is seen in the predictions produced by the evolved3 model, which exhibits high-frequency oscillations of varying amplitude.

6.3.3 Controller learning experiments

Once a set of models had been derived, we proceeded to investigate their usefulness as simulators with which to evolve controllers. The task we chose to evolve controllers for was point-to-point car racing, as defined in section 4.1.3 (except for the dynamics defined there, of course). The reason for choosing this task over the more complex walled car racing task is that it does not require modelling of collisions with walls. The fitness function of the task is defined as follows: the controller is allowed to control the car for 500 time steps, equivalent to 25 seconds of simulated time. During this time, the car has to pass as many way points as possible, and the fitness is equal to the number of way points passed at the end of the 500 time steps.


Figure 6.6: Predicted angular speed ψ˙ for each of the models acquired.

A way point is considered passed when the centre of the car is within 30 centimetres of the centre of the way point; when a way point is passed, a new way point immediately pops up at a random position within a radius of 1.5 metres from the centre of the arena. Only one way point is available to pass at any one time. The controllers we evolve are based on recurrent neural networks with 5 inputs, 6 hidden neurons and 2 outputs. The inputs are as follows: a constant bias input of 1, the forward speed of the car (u), the rotational velocity of the car (ψ˙), the angle to the next way point (the relative angle between the forward direction of the car and the line connecting the centre of the car to the way point), and the Euclidean distance to the next way point. All inputs are in SI units. The two outputs are interpreted as follows: the car is sent the command to steer left if the first output is below −0.3, to steer right if it is above 0.3, and to steer straight ahead otherwise. Similarly, the second output means drive backward if below −0.3, forward if above 0.3 and neutral otherwise.
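As a tiny illustration of this output mapping, the helper below applies the thresholds quoted above; the class and variable names are invented for this sketch.

```java
public class OutputMapping {
    /** steering: -1 = left, 0 = straight, 1 = right; drive: -1 = backward, 0 = neutral, 1 = forward */
    public static int[] toCommands(double steeringOutput, double driveOutput) {
        int steering = steeringOutput < -0.3 ? -1 : (steeringOutput > 0.3 ? 1 : 0);
        int drive = driveOutput < -0.3 ? -1 : (driveOutput > 0.3 ? 1 : 0);
        return new int[] { steering, drive };
    }
}
```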

As for the evolutionary algorithm, the same evolution strategy is used as for the evolutionary model acquisition above. All weights of the neural network are mutated in parallel by adding numbers drawn from a Gaussian distribution. Each fitness evaluation is the mean of ten trials of 500 time steps each.
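For reference, the sketch below outlines a generic mu+lambda evolution strategy of the kind used here (e.g. 50+50) with Gaussian mutation of all weights in parallel. The fitness function is left abstract, all names are made up, and this is not the exact implementation used in the thesis.

```java
import java.util.Arrays;
import java.util.Random;

public class EvolutionStrategy {
    public interface Fitness {
        /** E.g. the mean number of way points passed over ten trials of 500 time steps. */
        double evaluate(double[] genome);
    }

    public static double[] run(int mu, int lambda, int genomeLength, int generations,
                               double mutationStdDev, Fitness fitness, Random rng) {
        int popSize = mu + lambda;
        double[][] pop = new double[popSize][genomeLength];
        for (double[] genome : pop) {
            for (int i = 0; i < genomeLength; i++) genome[i] = rng.nextGaussian();
        }
        for (int g = 0; g <= generations; g++) {
            // Evaluate every individual once, then sort the population best-first
            final double[] scores = new double[popSize];
            for (int i = 0; i < popSize; i++) scores[i] = fitness.evaluate(pop[i]);
            Integer[] order = new Integer[popSize];
            for (int i = 0; i < popSize; i++) order[i] = i;
            Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));
            double[][] sorted = new double[popSize][];
            for (int i = 0; i < popSize; i++) sorted[i] = pop[order[i]];
            pop = sorted;
            if (g == generations) break; // last pass only ranks the final population

            // The best mu individuals survive; lambda mutated copies replace the rest
            for (int i = 0; i < lambda; i++) {
                double[] child = pop[i % mu].clone();
                for (int j = 0; j < genomeLength; j++) {
                    child[j] += rng.nextGaussian() * mutationStdDev; // mutate all weights in parallel
                }
                pop[mu + i] = child;
            }
        }
        return pop[0]; // best genome found
    }
}
```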

Table 6.2: Fitness and standard deviation (in parentheses) of each controller on each model; the columns refer to the models and the rows to the controllers.

Controller   bp27          param         evo3          nn            real
bp27c        18.6 (0.76)   9 (1.19)      1.3 (0.61)    14 (1.04)     9.33
paramc       18.1 (0.79)   15.1 (0.66)   1.3 (0.43)    15.5 (0.62)   14
evo3c        0.7 (0.52)    0.2 (0.19)    13.4 (0.87)   0.3 (0.29)    0.33
nnc          15 (1.22)     5 (1.45)      0.7 (0.43)    17.0 (0.83)   5.33
mm1c         11 (1.57)     4 (1.3)       15.0 (0.95)   12 (1.40)     8.33
mm2c         13.6 (0.72)   10 (1.04)     10.2 (0.75)   11.7 (0.89)   9.33
humFc        -             -             -             -             8.33
humBc        -             -             -             -             6.0

Results

Controllers that successfully completed the point-to-point racing task within a given model could reliably be evolved with that model within a few hundred generations (see figure 6.7). However, this is more or less a tautology. Things start to get interesting when controllers evolved for one model are tested on another, and really interesting when they are tested on the real car. In table 6.2 we can see the results of a number of such tests. The table contains four controllers derived using the modelling techniques described: nearest neighbour, backpropagation, neuroevolution and evolutionary parameter optimisation (nnc, bp27c, evo3c, paramc), two controllers derived using versions of the multi-model controller evolution (mm1c being a combination of evo3 and nn, and mm2c being a combination of bp27, evo3 and nn), and the best efforts of one of the authors driving either forward or backward (humFc, humBc). The controllers tested were the best of two evolutionary runs on the same model; no attempt at measuring the fitness variance between evolutionary runs was made, but we believe it to be low. As is clear from the table, a controller evolved using a particular model works better on that model than on any other. This effect is most pronounced for the controller produced on the evolved neural network model, but is present for the other controllers as well. Qualitatively, the evolved neural network model behaves quite differently from the other models, with abrupt accelerations and huge turning radii. The backprop27 model is generally the one that "feels most natural", while the nearest neighbour model behaves just like the real car in many situations, only to behave rather inappropriately in situations for which it has no data.

The only controller that consistently performs well on all models is one of the multi-model-evolved controllers, mm2c.


Figure 6.7: Evolution of controllers (fitness against generations for controllers evolved on each of the models).

This controller also produced the most robust behaviour on the real car, even if the highest mean fitness on the real car was achieved by the controller evolved on the parametrised model, paramc (the high fitness of that controller is only because of the limited length of the trials; on longer trials, the mm2c controller would have scored higher than the paramc controller). However, as we shall see below, these controllers go about their task in rather different ways. This is illustrated by figure 6.8, which traces a few seconds of several controllers trying to reach the same sequence of way points with the real car.

Analysis of evolved control strategies

When transferring the evolved controllers to the real car, it was observed that almost all of them drive the car backwards; all except the one which was evolved with the 3-model multi-model trick (mm2c). The advantage of driving backwards seems to be that the car is more maneuverable; the car does go faster when driving forward, but this is apparently not very important given the limited size of the arena and the time taken to accelerate and decelerate. Apart from mostly driving backwards, the behaviour of the controllers varies wildly. The controller evolved on the parametrised model (paramc) is perhaps the most straightforward (or straight-backward), as it drives directly towards the way point at full speed at all times. This works very well when the angle between the car and the way point is small or the distance to the way point is large, but if the next way point pops up right next to the car, the controller gets stuck "orbiting" around the way point, driving in endless circles without being able to reach it

(a) backprop27 model; (b) parametrised model; (c) evolved3 model; (d) nearest neighbour model; (e) 2-model multi-model: nearest neighbour and evolved3 models; (f) 3-model multi-model: nearest neighbour, evolved3 and parametrised models.

Figure 6.8: Traces of a number of different controllers, evolved using single models (a, b, c, d) or combinations of those models (e, f), all tested on the real physical car with the same way point configuration.

134 as its turning radius is too large. A variation on this behaviour is exhibited by the on average best-performing controller (bp27c) evolved on the backprop27 model. This controller almost always turns right; approaching a way point it aims slightly left of the point and then does an inexplicably sharp right turn just before passing the way point. (We would have suspected an evolutionary exploit of a weakness in the model if the model was not actual physical reality.) The strategy works fabulously for most way points, but even this controller sometimes gets stuck in orbit, and some trials get very bad fitness. The reason for its high average fitness is that the trials are short, and switching between two trials usually gets the car out of orbiting. The only controller that seems never to get stuck is the one evolved on the 2-model multi- model controller (mm1c) evolution. While this controller quite often misses a way point, it usually immediately brakes to a full stop just afterwards. When accelerating again, the car has a narrower turning radius than it would have at full speed, and is thus able to turn back and reach the way point. This strategy surprised us at first, as we had not thought of it ourselves, but it obviously works. Another strategy, as displayed by the 3-model multi-model evolved controller, is to drive forward most of the time, trying to aim for the way point. When missing a way point the car backs off some distance and fully repositions so as to be able to reach the way point on a second try. The controller evolved on the 3-MLP model (evo3c) seems to perform some sort of peculiar dance around the arena, with little relation to the position of the way point or to the behaviour exhibited by the same controller in the model it was evolved for.

Analysis of strengths and (exploitable) weaknesses of the models

As the controller representation and evolutionary method are kept constant for all attempts at controller evolution, the characteristics of the models are bound to be the sole determinants of the differing fitnesses and behaviours of the evolved controllers. In this section, we attempt to analyse the strengths and weaknesses of these models based on observations when driving them manually, and observations on the evolved controllers driving in their native models. The 3-MLP model is the most obviously exploitable model, as its very high rates of accel- eration from standstill makes a kind of zig-zagging strategy possible, which certainly would not work on the real car. As the top speed of the model is not very high, but the turning radius is, it is all too tempting to use this exploit to quickly get to a way point. However, we note that the 3-MLP model works well in combination with another model for multi-model controller evolution, as its high turning radius precludes a turning radius exploit and the zig-zagging won’t work in any other model. The parametrised model, which is the model incorporating the most domain knowledge, is

135 generally admirably well-behaved. But it is still vulnerable to turning radius exploitation, as its turning becomes unrealistically sharp at very low speeds. We have seen controllers evolved for this model driving fast and straight, and suddenly slowing down and creeping around the turns. A similar exploit seems to exist for the nearest-neighbour model. A more serious exploit for that model, however, is the spinning on the spot phenomenon, whereby the car can turn around without moving forward after certain decelerations. For the backprop27 model, things look different: when driving fast, and suddenly braking and changing the steering at the same time, the car can suddenly accelerate in unexpected directions. This little oddity seems not to be exploitable by the controller, but we have seen its slight underestimation of the car’s turning radius being exploited.

6.3.4 Discussion

The approach to dynamics modelling and controller evolution presented in this section apparently works well enough to produce proficient (and interesting) controllers for toy car racing, using very little domain knowledge and an ill-behaved toy car. While others have been able to learn competent control of radio-controlled toy cars, we believe we are by far assuming the least about the system we are modelling. Minimising the use of domain knowledge might be seen as an academic concern in the current context, but it becomes all the more important in other domains, where domain knowledge is often lacking. (Also, there is nothing wrong with academic concerns.) Additionally, our concern ties in very well with the general spirit of evolutionary robotics, which is to minimise human interference in the process of controller design. We hope that the lessons learned and techniques developed here, in particular the use of multi-model controller evolution (which to some extent was the "trick" that finally made our experiments work), will be transferable to other domains. However, the results can currently only be considered tentative, and the multi-model evolution "trick" might not work in other domains. In particular, one might suspect that vehicles which have higher-dimensional dynamics (e.g. helicopters) will also have more complicated exploitable weaknesses, which will be harder to cancel out.

6.4 Other types of imitation

It is of course possible to model both behaviour and dynamics in domains other than car racing. For other agent games, approaches similar to the ones discussed in the two sections above could easily be taken. Even for computerized games and management games, it seems very possible to at least indirectly model playing styles. In this thesis, however, we are not providing any examples of imitation outside of the car racing domain.

However, we will briefly discuss a different sort of dynamics modelling in the car racing domain, this time using only the simulated version of car racing and with transformations taking place in sensor space.

6.4.1 Imitating simulated car dynamics in sensor space

This section is based on a paper presented at IEEE-Alife 2007 with Hugo Marques as first author and Magdalena Kogutowska as third [83]. When comparing this section, where the paper is briefly discussed, with the previous one, it is worth remembering that the work here is chronologically earlier than the work on dynamics modelling for the physical radio-controlled car discussed above.

Figure 6.9: Given a set of motor commands, u(t), and the current state, x(t), the forward model (FM) generates a prediction of the outcome, x(t+1).

The problem addressed here is how to predict sensor data based on earlier sensor data and actions taken, in a complex environment with walls (the full track-based racing game is used, with a fixed sensor array). In other words, we are learning forward models (see figure 6.9). In effect, this means both imitating the dynamics of the car - but in an "unnatural" transformation space - and learning the layout of the track, which is arguably a more complex task than only learning the car dynamics, as is done in the previous section. (At least if the dynamics are similar in both cases; we don't currently know whether the dynamics of the simulated or the real car are more complex.) In this section several ways of acquiring the required models are compared. The central questions we try to answer are whether good enough forward models can be learned, and whether there are qualitative as well as quantitative differences between forward models learned based on minimization of error and those learned based on maximization of performance. To test our forward models, we developed two different tasks, which are extensions of the standard task of driving around a track in the track-based racing game.

Tasks

Imagine driving in an unlit tunnel, in a car whose headlights have the irritating habit of flickering on and off, sometimes staying off for several seconds at a time. Or imagine remotely controlling a vehicle based on an unreliable image feed that blacks out from time to time. These scenarios

are the motivation for the intermittent task. The delay task, on the other hand, is motivated by the delay in car state information reaching the controlling computer from an overhead webcam in some of our real-world experiments. Technically speaking, both tasks change the sensor model of the car. In the intermittent task, there are two possible states:

• The normal state, where current sensor information is presented to the controller, or

• The blackout state, with all sensors off.

Figure 6.10: The switch mechanism.

In order to decide what signal reaches the controller, we used the concept of switching proposed in [82] (see figure 6.10). This consists of using the signal directly from the sensors whenever that signal is available, and using the output of the forward model otherwise. At each timestep, the sensor model has a probability of 0.2 of going from the normal to the blackout state, and will then stay in the blackout state for a number of time steps drawn from a uniform distribution with a maximum of 10 or 20 time steps. While in the blackout state, the information received by the controller depends on whether there is a predictor present or not, and on the experimental setup. In the case without a predictor, the controller input is a vector of all zeroes, or the last real sensor vector before the start of the blackout. When a predictor is present, the controller input is the output of that predictor, which in turn receives as input the action and sensor vector at time t, be it the real information or the output of a previous prediction, and tries to predict the state at time t+1. In the delay task, the controller never receives the current sensor data. Instead, when no predictor is present, the controller input is the sensor vector from n time steps ago; n equals 3 in most of the experiments. When a predictor is present, at each timestep

138 the controller receives the end result of the predictor being run n times, starting with the real sensor vector and action at time t-n, and then its own predictions for time steps t-n+1 to t.
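A hedged sketch of this switching logic follows, assuming a forward model object with a predict(x, u) method like the one sketched earlier; the function names, and the way the blackout flag and action log are passed in, are illustrative rather than taken from the thesis.

    def sensor_input_intermittent(real_x, last_pred, u_prev, blacked_out, forward_model):
        """Intermittent task: pass real sensors through when available,
        otherwise chain the forward model on its own previous prediction."""
        if not blacked_out:
            return real_x                                # switch: use real sensors
        source = last_pred if last_pred is not None else real_x
        return forward_model.predict(source, u_prev)     # switch: use prediction

    def sensor_input_delay(x_delayed, actions, n, forward_model):
        """Delay task: start from the sensor vector observed n steps ago and
        roll the forward model forward n times using the logged actions."""
        x = x_delayed
        for u in actions[-n:]:                           # actions from t-n to t-1
            x = forward_model.predict(x, u)
        return x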

Experiments

Throughout the experiments in this section, a “general” controller is used, which has been evolved incrementally according to the procedures described in section 5.2. When the sensor data is not altered in any way, this controller completes on average 2.84 laps on the chosen track, with a standard deviation of 0.007. This means that while other controllers drive somewhat faster, this is a very robust controller.

We chose to compare predictors created through three different methods: backpropagation, evolving for prediction ability, and evolving for driving ability. For the backpropagation training, a log of sensor data from a number of runs of the above controller with unaltered sensor data is used as training data. When evolving for prediction, the predictor does not affect the control of the car, and the fitness value is the negative mean prediction error. Evolving for driving ability means measuring fitness as progress on the track, with the predictor being evaluated indirectly.

For all of these methods, we systematically varied the size of the hidden layer in the predictor networks, with all methods being tested for networks of 5, 10, 15, 20 and 25 hidden neurons. We also varied some parameters for the other predictor creation methods: training with backpropagation was done with training sets of 7, 70, 700, 7000 and 70000 data points; evolving for performance on the intermittent task was done with maximum blackout lengths of 10 and 20 time steps; and evolving for performance on the delay task was done with delay lengths of 3 and 6 time steps. We did 50 replications of most experimental configurations. In this section we will not discuss all these results, but only some of their main features, and refer to the paper for the full story.

The Intermittent Task

Training networks to minimise the error based on logged driving data was not very effective for producing forward models that helped to cope with the intermittent task. This was true regardless of whether the supervised learning was done using backpropagation or using evolution with minimum error as the fitness, and regardless of the amount of training data used. Further, the error on the validation set appeared to have no relevance to the actual performance of the predictor. Evolving for higher performance, on the other hand, yielded much better performance and much higher prediction errors. Some predictors trained with backpropagation performed reasonably well, but these were not necessarily the ones that had the lowest prediction error.

Figure 6.11 shows some representative examples of the driving performed by the same controller with predictors learned in different ways.


Figure 6.11: Trajectories of a range of predictors during the intermittent task with blackout length set to 10: a) the best performing predictor trained with backpropagation, b) the trace of the predictor with the lowest error when trained with backpropagation, c) the trace of the predictor with the highest error when trained with backpropagation, d) the best evolved predictor, e) the evolved predictor with the lowest error, f) no predictor.

In the face of such results, some observations on what behaviour these predictors give rise to could be enlightening - or so we reasoned. Without a predictor, the car starts accelerating and turning left when sensor inputs are blocked. If the last sensation is kept at the start of a blackout, the car, perhaps obviously, keeps doing exactly what it was doing before the blackout. The controller normally makes numerous small turns (a sort of wiggling motion) even on straight segments of the path; examples of this can be seen in figure 6.11, especially a and c. This often causes the car to end up driving into a wall even during straight sections of the track when last sensations are retained. We also found that the wall-bumping behaviour was common amongst the majority of controllers trained for fitness; we were surprised to see how little influence the predictors had on the controllers - often doing nothing other than turning left and accelerating when asked to predict sensor data.

When it comes to the evolved predictors, we are not entirely sure why they are so successful. A possible solution for the controller would be to apply the brakes and thereby minimise the risk of collisions until the sensors are back online. But this does not seem to be what happens. Instead, the controllers take different actions depending on the last non-blocked sensor vector, usually keeping the speed constant.

The Delay Task

A similar, but not identical, pattern can be seen when testing predictors on the delay task. Almost all the predictors trained with supervised learning behave very badly on the delay task as well - in fact, even worse than they do on the intermittent task. As above, the error on the training and validation sets has no apparent relation to performance (or the lack of it) when paired with the controller on the delay task. The performance of these predictors, and of the baseline (using no predictor at all, feeding the controller outdated sensations), is shown in figure 6.12. Interestingly, the baseline actually outperforms most of these predictors at short delay lengths.


Figure 6.12: Comparison between the best-performing predictor (max. fitness), the predictor with the lowest error (min. error) and the predictor with the highest error (max. error), all trained with backpropagation, at various delay lengths. The baseline is a controller without a predictor.

Predictors evolved for performance fared much better, but again had much higher prediction errors. Several evolved predictors made it possible for the controller to achieve good fitness on the delay task with delay length 3. With delays longer than 5, however, almost all the predictors reverted to baseline fitness, little more than zero. The exception was a single predictor which showed the remarkable ability to produce suboptimal but still good fitness levels for exceptionally long delays, at least up to 20 time steps. This can all be seen in figure 6.13.

All tested predictors and non-predictors have displayed variations on two fundamental behaviours. Most of the predictors trained for prediction simply make the controller turn to the left and crash. All other predictors (including no predictor at all) make the controller swerve from side to side, as if the driver were drunk. A plausible interpretation of this behaviour is that the controllers are constantly overcompensating for being too close to one wall or another. The fitness of a predictor seems to be inversely proportional to the magnitude of these oscillations, with the best predictors rarely if ever colliding with the wall. The “remarkable” predictor that could handle long delays also displays a variation of this behaviour, but seems especially good at hitting the walls with the back of the car first, so that it is always able to recover from wall collisions without getting stuck or losing its direction.


Figure 6.13: Comparison between the evolved best-performer, best-predictor and worst-predictor at various delay lengths. The baseline is a controller without a predictor.

Several additional experiments (reported in the paper) test the generality of the predictors over different tracks and different controllers, and rule out that the same effect could be obtained by evolving controllers directly without predictors.

Discussion

The two main findings of these experiments are, first, that predictors evolved for maximizing performance, as part of a controller-predictor system, vastly outperform those that were trained to minimize prediction error. This effect is consistent across different training regimes, data sets, tasks, tracks etc., and holds up over a large number of replications of the experiments. The other main finding is that there is no clear correlation between prediction error and performance as part of a controller-predictor system. Indeed, looking at the trend over all the mentioned controllers, there is a positive correlation - higher prediction errors mean better performance. Obviously, the predictors that allow the controller to do the task do not do this by predicting the future sensor inputs of the car. These results could be seen as bolstering an argument to always look at internal simulation of perception in the context of a complete dynamical system, and that having an accurate model might not always be necessary. They could also be interpreted in the more prosaic way that dynamics modelling is very hard if not performed in a suitable transformation space, but that evolution usually finds a way to sidestep the problem you intend for it to solve.

While a large number of experiments were done as part of this suite, many of which cannot be found in this section but only in the paper, some rather simple things were never tried. For example, we never tried hand-coding rules for how to handle sensor blackouts or delays. It is not impossible that a simple rule of the type “set both steering and drive to neutral in every time step where all sensors output 0” would have performed better than our trained predictors, but we don’t know, as we never implemented this rule. On the other hand, finding the best way of dealing with sensor outages was never the main objective of the study; we were primarily looking for greater insights into the workings of and differences between some different learning methods.

6.5 Summary

In this chapter, three series of experiments dealing with imitating various aspects of dynamics and behaviour in different car racing environments were described. The first section concerned the imitation of driving styles in the development of player models for later use in the track evolution experiments of section 7.3; the second section described the acquisition of dynamics models of physical miniature R/C cars and the subsequent evolution of controllers based on those models; and the third section described several attempts at simultaneously modelling the dynamics of a simulated racing car and the layout of a track. Though all the described experiments were done in the domain of car racing, and draw on the foundational car racing experiments in sections 5.1 and 5.2, there are considerable differences in what sort of learning was attempted and in what environment. This makes it hard to distill the results of the various studies into some sort of general wisdom about imitation or modelling, in car racing or otherwise.

One problem that recurred throughout the different experiments, however, was how to judge the quality of an acquired model. Some of the player models arrived at in the player modelling experiments (the ones obtained through direct modelling) were obviously wrong, in that the controllers didn’t manage to drive a single lap on the test track. Others (obtained through indirect modelling) were at least decently good, but how faithful they really were to the modelled human player is hard to ascertain without doing extensive studies using the qualitative judgement of several humans. Even after some quantitative measure is satisfied, the question remains how well this measure really reflects the sort of similarity we are seeking.

Similarly, in the physical car dynamics modelling experiments, models could be acquired by satisfying some quantitative criterion (error minimisation), and controllers could be evolved for good performance on that particular model, but when tested on the real car the same controller could fail miserably. Here, we have an indirect test of imitation quality through the performance of derived controllers on the real car, but it is still only an indirect test, and also quite time-consuming. Multi-model evolution was found to be a good way to achieve robust controllers, letting the shortcomings of the different models cancel each other out.

The problems with judging imitation quality are perhaps most vividly illustrated by the experiments in the last section, where two different quantitative criteria (fitness maximisation and error minimisation) give rise to dramatically different models. It is hard to tell whether those experiments should be considered successful in the normal sense of the word - after all, the best evolved predictors did not do any discernible predicting - but they do offer interesting insights into the peculiar challenges of behaviour and dynamics imitation.

Chapter 7

Innovation

In this chapter we discuss a number of experiments where evolution is used not to achieve a well-defined goal, but rather to innovate. In the first two sections we use competitive co-evolution to evolve driving behaviour. Here, the fitness function is partly dependent on the other driver on the same track, and so it is not clear what sort of driving yields the highest fitness (in contrast to single-player races, which are about driving fast along the best racing line). In the third section, we evolve not controllers, but racing tracks. The fitness function is here based on a player model and a metric of what sort of tracks should be fun, yielding rather innovative tracks.

7.1 Co-evolving car controllers for competitive driving

This section is based on our first paper on competitive co-evolution for simulated car racing, which was presented at PPSN 2006 [144]. In these experiments, the full track-based racing game, as presented in section 4.1, is used. The difference is that the cars have eight rangefinder sensors instead of six, and that a separate array of eight Boolean values decides whether each sensor is a standard wall sensor or a car sensor. The car sensors work exactly like the wall sensors, except that they return a value dependent on whether the other car, rather than a wall, is within the sensor’s scope and how far away it is. Simulated competitive car racing allows some uncommon forms of competitive co-evolution to be explored. Most competitive co-evolution experiments use two populations, where one population contains prey and the other predators, or one population parasites and the other hosts, or some similar asymmetric configuration. In other words, most competitive co-evolution is interspecific; here, all the individuals are of the same “species” in that they are evaluated in the same way. Co-evolution can therefore be single-population and intraspecific.

A second way in which simulated car racing allows uncommon modes of co-evolution is through the existence of a well-defined solo fitness function: any controller can be tested for absolute solo fitness, which means the distance covered when racing without competition; for absolute competitive fitness, which is the same thing when having to take the behaviour of another car into account (including the possibility of collisions); and for relative fitness, which is defined as how far in front of or behind the competitor a controlled car finishes. Further, absolute competitive fitness and relative fitness can be blended seamlessly. We believe that these characteristics make our car racing games ideal for exploring co-evolution.

The first set of questions we try to answer in this section concerns the extension of the car racing model and evolutionary approach to two cars: how well will controllers evolved for solo racing do with competition? Will it be possible to co-evolve controllers that do better? Is our controller architecture and sensor setup appropriate for this? Will we be able to evolve human-competitive drivers, and if not, what are the problems with our method? The second set of questions addresses co-evolution. Will there be a difference in fitness, and in behaviour, if we evolve for absolute, relative or mixed absolute and relative fitness? What sort of difference will be observed? For example, will controllers evolved for relative fitness turn out to drive more aggressively? Will there be a difference in sensor setups?

7.1.1 Co-Evolutionary Algorithm

For the co-evolutionary algorithm, a modified (µ + λ) evolution strategy with µ = 50 and λ = 50 was used. The difference between the co-evolutionary algorithm used here and a standard evolution strategy is in the fitness calculation. There are two types of primitive fitness defined: absolute and relative fitness. The absolute fitness of a controller C is calculated in the “standard” way, as defined in section 4.1 and used in several other experiments in this thesis. Relative fitness is defined as the difference in absolute fitness between C and the car it is competing against. Both the absolute and relative fitness values for a given controller were calculated as the mean of three trials of the controller on each of the tracks. When the primitive fitnesses of all the controllers have been calculated, they are normalized, so that they are all in the range [-1..1]. The final fitness value of each controller is then calculated by blending the two primitive fitness values: fitness = p ∗ absfit + (1 − p) ∗ relfit, where p is the proportion of absolute fitness, a constant set at the beginning of the evolutionary run. It could be argued that only evolution with completely relative fitness constitutes co-evolution.

There are three mutation operators: Gaussian mutation of all neural connection weights, Gaussian mutation of all sensor parameters (angles and lengths), and sensor type mutation. Each time the mutation method of a controller is called, numbers drawn from a Gaussian distribution with a standard deviation of 0.1 are added to both neural connection weights and sensor parameters. With a probability of 0.4, a sensor type mutation is also performed, meaning that one of the sensors has its type changed from car to wall or from wall to car. At the start of an evolutionary run, all controllers have four wall sensors and four car sensors, pointing in random directions and with random ranges, and the neural connection weights are initialized to small random values. Three different tracks, namely the counter-clockwise versions of three of the six easier tracks among the eight tracks used in section 5.2, were selected for the present experiments; these tracks are shown in figure 7.1. Each pair of controllers was tested on all three of these tracks.

Figure 7.1: The three tracks used in the experiments, including waypoints. Each track also shows a sample car with evolved sensors.
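As a concrete illustration of the fitness blending described above (not code from the thesis), the sketch below assumes a race(controller, opponent, track) function returning the absolute fitness of both cars; the way opponents are drawn, and all names, are illustrative assumptions.

    import random

    def blended_fitnesses(population, tracks, race, p):
        """Mean absolute and relative fitness over three trials per track,
        normalized to [-1, 1] over the population, then blended as
        fitness = p * absfit + (1 - p) * relfit."""
        abs_f, rel_f = [], []
        for c in population:
            a_trials, r_trials = [], []
            for track in tracks:
                for _ in range(3):
                    opponent = random.choice([o for o in population if o is not c])
                    fit_c, fit_o = race(c, opponent, track)
                    a_trials.append(fit_c)
                    r_trials.append(fit_c - fit_o)       # relative fitness
            abs_f.append(sum(a_trials) / len(a_trials))
            rel_f.append(sum(r_trials) / len(r_trials))

        def normalize(values):                            # rescale to [-1..1]
            lo, hi = min(values), max(values)
            if hi == lo:
                return [0.0] * len(values)
            return [2 * (v - lo) / (hi - lo) - 1 for v in values]

        abs_n, rel_n = normalize(abs_f), normalize(rel_f)
        return [p * a + (1 - p) * r for a, r in zip(abs_n, rel_n)]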

7.1.2 Experiments

Giving solo-evolved controllers some competition

10 separate solo-evolutionary runs were made according to the setup described above. Each evolutionary run lasted for 200 generations. (The mean fitness was zero at generation 0 of every evolutionary or co-evolutionary run in this paper; fitness growth graphs have been omitted to conserve space.) On average, the best individual of the last generation of each of the evolutionary runs had fitness 2.49 (with standard deviation σ = 0.23), and used 5.7 (σ = 0.67) wall sensors and 2.3 (σ = 0.67) car sensors. The best run resulted in a best controller with fitness 2.67, and the best controller of the worst run had fitness 1.89. Most of the evolved sensor setups consisted of a relatively even spread of medium-range wall sensors pointing forward and diagonally forward, with the few car sensors pointing backward. One of these controllers, with fitness 2.61 (0.13), was selected for further testing. When put in a competition with another car controlled by a copy of the same controller, average fitness dropped to 1.23 (0.6). Behavioural analysis shows that the two cars collide repeatedly at the beginning of almost every trial, as they don’t have any method of detecting and reacting to each other’s presence. Depending on starting conditions, the outcome of the competitions varies, but usually one or both of the cars is either driven to collide with the wall, or spun around so that it starts driving the track the wrong way.

Prop abs     0.0          0.25         0.5          0.75         1.0
Abs/solo     1.99 (0.31)  2.09 (0.33)  2.11 (0.35)  2.32 (0.23)  2.23 (0.23)
Abs/duo      0.99 (0.53)  0.95 (0.44)  1.56 (0.45)  1.44 (0.44)  1.59 (0.45)
Rel/duo      0 (0.75)     0 (0.57)     0 (0.53)     0 (0.55)     0 (0.47)
Wall/car     5.8 / 2.2    5.6 / 2.4    5.2 / 2.8    4.2 / 3.8    6.4 / 1.6

Table 7.1: The results of co-evolving controllers with various proportions of absolute fitness. All numbers are the mean of testing the best controller of ten evolutionary runs for 50 trials. Standard deviations in parentheses.

A car that starts going the wrong way is usually, but not always, unable to turn around and start driving in the correct direction again; a car that crashes into the wall usually gets stuck. This is because of the controller design rather than the game mechanics, as it is perfectly possible for a human player to back away from the wall and continue driving in the right direction. In many trials, however, one of the cars managed to escape the collisions in the right direction and proceeded to make its way smoothly around the track. From this experiment, it can be seen that the problem of racing two cars concurrently is sufficiently different from the problem of solo racing that the performance of a solo-evolved controller is catastrophically compromised when tested in competition conditions.

Co-evolving controllers: The absolute-relative fitness continuum

50 evolutionary runs were made, each consisting of 200 generations. They were divided into five groups, depending on the absolute/relative fitness mix used by the selection operator of the co-evolutionary algorithm: ten evolutionary runs were performed with absolute fitness proportions 0.0, 0.25, 0.5, 0.75 and 1.0 respectively. These were then tested in the following manner: the best individuals from the last generation of each run were first tested for 50 trials on all three tracks without competitors, and the results averaged for each group. Then, the controllers in each group were tested for 50 trials each in competition against each other controller of the same group. Finally, the number of wall and car sensors was averaged within each group. See table 7.1 for results.

Analysis

It is clear that, when driving without competitors, the co-evolved controllers on average have lower absolute fitness than the solo-evolved controllers. Behavioural inspection suggests that the co-evolved controllers drive more carefully, seldom accelerating to top speeds, and take corners more conservatively. A similar but smaller difference in absolute solo fitness seems to exist between the groups of co-evolved controllers, with controllers evolved more for absolute fitness performing better than controllers evolved more for relative fitness. The controllers within a group perform similarly, and the lower fitness comes from driving more slowly around the track rather than from crashing into walls or losing direction.

Figure 7.2: Traces of the first 100 or so time-steps of three runs that included early collisions. From left to right: the red car pushes the blue car into a wall; the red car fools the blue car into turning around and driving the track backwards; the red and blue cars collide several times over the course of half a lap, until they force each other into a corner and both get stuck. Note that some trials see both cars completing 700 time-steps driving in the right direction without getting stuck.

The difference between controllers co-evolved with different fitness mixes becomes clearer when we measure performance in competition with other controllers from the same group, where controllers evolved mostly for absolute fitness generally get about half a lap farther than those evolved mostly for relative fitness. Behavioural analysis confirms that this is because the cars more often collide at the start of a trial, often forcing one or both of the cars to crash against the wall or spin around and lose track of which direction it is going. Often, the controllers evolved with low (0 or 0.25) proportions of absolute fitness actively look for trouble by trying to collide (see figure 7.2).

There seems to be little consistency in evolved sensor setups, samples of which can be seen in figure 7.1 (wall sensors are blue; car sensors are pink; each car is travelling forwards in the direction of the waypoints). We found one controller in the group evolved purely for relative fitness that had only wall sensors and no car sensors, and another one in the group evolved for purely absolute fitness! There is no obvious tendency towards fewer or more car sensors at either end of the fitness mix, and the data is too scarce to prove any more subtle tendency. When looking at all 50 controllers together, every controller has at least three wall sensors, and there is always at least one pointing mostly forward. On average, the cars have twice as many wall sensors as car sensors, and when car sensors are present, there seems to be at least one pointing mostly backward; overall, more car sensors point backward than forward.

In order to find out how the controllers evolved with various fitness mixtures performed against each other, we tested all five controller groups against each other. The slightly surprising result was that the groups performed on average equally well against each other, though with considerable intra-group variation. The absolute fitnesses of the controllers in these encounters were quite low, on average 0.96, which suggests that all controllers are quite ill-prepared to race against controllers from another fitness mix group.

Prop abs     0.0          0.25         0.5          0.75         1.0
Co-evo       1.41 (0.52)  0.99 (0.44)  1.32 (0.58)  1.23 (0.65)  1.41 (0.45)
Solo-evo     1.68 (0.56)  1.95 (0.62)  1.74 (0.57)  1.38 (0.65)  1.51 (0.63)

Table 7.2: Co-evolved versus solo-evolved controllers.

Co-evolved versus solo-evolved controllers

The 50 controllers co-evolved with various fitness mixes in the section above were now tested against the 10 solo-evolved controllers from section 7.1.2. For each group, the ten co-evolved controllers competed for 10 trials with each of the 10 solo-evolved controllers. See table 7.2 for results. Note that there is a mostly small but consistent fitness advantage for the solo-evolved controllers over the co-evolved ones. (Both co-evolved and solo-evolved controllers performed significantly worse in these competitions than when tested in solo racing conditions.) The cause of this fitness difference is not completely obvious even after looking at a large number of these competitions, but it appears that the solo-evolved controllers (which gain higher fitness than the co-evolved ones in solo trials) simply outrun the co-evolved controllers in many cases, and so avoid many of the collisions; this further corroborates the hypothesis that the controllers tend to be very specialized to compete against controllers similar to themselves. This could be seen either as a shortcoming of the evolutionary algorithm, or as the desired state of things; it could be argued that the co-evolved controllers should have strategies general enough to take on any opponent, or that their more careful driving style should always make them slower than a solo-evolved controller.

Evolution with a static target

To investigate whether the tendency to specialization in co-evolved controllers could be used to create controllers that could out-compete the solo-evolved controllers from section 7.1.2, we modified a copy of the co-evolutionary algorithm to work with a static target. In this configuration, each controller is evaluated by racing three races against randomly selected controllers out of the ten solo-evolved controllers. It should be noted that this is not co-evolution at all, as the target controllers do not evolve. The car controlled by the target controller could instead be seen as an interactive feature of the environment. The experiments we ran with this configuration failed to generate any controllers with better fitness than the target controller. This was despite attempts at evolving from scratch, starting from a general controller, or starting from a clone of the target controller, and using various mixtures of absolute and relative fitness. Our interpretation of this is that the solo-evolved controller drives the tracks as fast as can be done given its sensing and processing limitations, and that the same limitations hinder the co-evolved controllers from doing any better.

Human-competitiveness of evolved controllers

A random selection of controllers was tested by competing against a car controlled by one of the authors via the keyboard. It was found that the solo-evolved controllers were generally good contenders, driving at about the same skill level as the author or slightly better, as long as collisions were avoided. However, it was found to be quite easy to learn how to collide with the computer-controlled car in such a way that it got stuck on a wall, and then go on to win the race. Most of the co-evolved controllers were fairly simple to beat by just accelerating hard at the beginning of a race and keeping going, as their slower driving wouldn’t allow them to catch up.

7.1.3 Discussion

Our main positive finding concerns the effects of changing the type of fitness function. A very clear effect was that controllers evolved more for relative fitness acted more aggressively, but covered less distance both when running solo and when competing with other controllers from the same population, than controllers evolved more for absolute fitness. We could not find any systematic difference between the sensor setups evolved with the various fitness mixtures, but observed a general tendency to point car sensors backwards rather than forward, and the opposite tendency for wall sensors - it seems to be more important to watch your back than to know what your competitors are doing in front of you.

A finding that is relevant to the overarching quest to scale up evolutionary artificial intelligence using computer games is that competitive car racing is a much more complex problem than solo car racing. This can be seen both from the drastic degradation of fitness when solo-evolved controllers are put in competitive environments, and from our great difficulty in evolving controllers that can reliably outperform the solo-evolved ones. It can also be seen from the total inability of all evolved controllers to backtrack upon a frontal collision with a wall, and the relatively poor ability of most evolved controllers to find the correct direction after having been spun around. This points to the need for more complex sensors and neural networks.

However we set up the evolutionary runs, they seem to suffer from over-specialization, where the controllers in a population only learn to race each other. This result is in broad agreement with what has been found in co-evolutionary predator-prey experiments, as discussed in section 2.4.6. So even though the current experimental environment allows us to explore a larger space of variants of competitive co-evolution, it seems that at least the currently explored form of co-evolution suffers from the same basic obstacles to evolving generally good competitive behaviour as previously attempted approaches.

7.2 Multi-population competitive co-evolution

A year later we revisited the problem of competitive co-evolution, armed with a number of new ideas, and using the point-to-point car racing game rather than the track-based racing game. In this section, based on a paper presented at CEC 2007 [138], we investigate the use of multi-population competitive co-evolution, where more than two populations are used. We believe that we are the first to use more than two populations for single-species co-evolution, and we are rather certain that we are the first to use it for comparing controller architectures.

Why would multi-population co-evolution be better than one- or two-population versions of the algorithm? To answer this question, we can go back to Janzen, who distinguished between true and diffuse co-evolution [67] in the context of evolutionary biology (not computer science). The former is defined as evolutionary change in a specific trait of one population in response to the evolutionary change of one specific trait possessed by another population. In contrast, the latter is defined as non-specific evolutionary change in response to a group of traits possessed by another population, or group of populations. Bullock argued that diffuse rather than true co-evolution would be desirable from an engineering standpoint, as diffuse co-evolution should lead to more robust solutions [19]. One way of achieving diffuse co-evolution would be to use many populations, and to test individuals against individuals in more than one other population. Following his suggestion, Hornby and Mirtich did exactly this in a predator-prey simulation [62]. Our own take on this is to use multiple populations to try to force diffuse symmetric co-evolution, but also to use it to compare controller architectures. (In Hornby and Mirtich’s work, controller architectures did not vary between species.) In the process, we compare two different selection strategies: steady-state and generational selection.

Generational and steady-state selection

Standard co-evolution proceeds in generations. In each generation there is a period of evaluation, followed by population decimation and replacement. Though this generational scheme is the standard approach to co-evolution in use today, it is not the only way that co-evolution can proceed. In nature, the process of replacement is usually less dramatic - populations usually remain stable and there is continuous replacement of individuals. Miconi and Channon [88] introduce one method of performing steady-state co-evolution. The N-strikes-out algorithm they propose performs both evaluation and replacement asynchronously, on an individual basis. To our knowledge, this is the first steady-state co-evolutionary algorithm. This asynchronous updating means that the selection landscape changes gradually, in effect acting as a self-maintaining archive of previously fit individuals, and avoiding the need for a hall of fame. The motivation is that this will discourage exploits of the current champion’s weaknesses, as there are more likely to be other high-fitness individuals which don’t share that specific weakness. Another advantage of the steady-state approach is that it is not necessary to manage generational changes in the population. The algorithm merely needs to keep track of the number of defeats for each individual, which is simpler than the generational approach.

Research questions

The main question we address in this section is what it is possible to do with multi-population co-evolution. Can we evolve controllers that perform better than those evolved with solo evolution, or with one- or two-population co-evolution? Can we use multi-population co-evolution to investigate the relative benefits of different controller architectures? When seeding a number of populations with the same architecture, can we evolve a behaviourally diverse set of controllers? We are also interested in the relative performance of the steady-state and generational multi-population co-evolutionary algorithms. In particular, we wonder whether the N-strikes selection mechanism manages to further alleviate the cycling problems. And in addition to the issue of how best to compare a number of controller architectures, we are of course interested in the results of the comparison: which of the implemented controller architectures is best for the given task? For us, the underlying motivation is to be able to evolve complex general intelligence; studying the properties of particular controller architectures and evolutionary algorithms is a means to that end, rather than the other way around.

7.2.1 Controller architectures

A number of evolvable controller architectures were implemented, for the purpose of comparing speed and quality of learning. All controller architectures are based on one or two evolvable function approximators (in all cases except one these are neural networks), and in all cases the main function approximator outputs two real numbers. These two numbers are interpreted as follows: if the first output is above 0.3, the driving force is set to forward; if it is below -0.3, driving is set to backward; otherwise driving is set to neutral. The second output decides in the same way whether to set the steering for the current timestep to left, right or centre.
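A minimal sketch of this output interpretation follows (illustrative only; which sign of the second output maps to left versus right is an assumption, as the thesis does not specify it):

    def interpret_outputs(out1, out2):
        """Map the two real-valued network outputs to discrete commands,
        using the 0.3 / -0.3 thresholds described above."""
        if out1 > 0.3:
            drive = "forward"
        elif out1 < -0.3:
            drive = "backward"
        else:
            drive = "neutral"

        if out2 > 0.3:
            steer = "left"       # sign-to-direction mapping is an assumption
        elif out2 < -0.3:
            steer = "right"
        else:
            steer = "centre"
        return drive, steer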

MLP controllers

The MLP controller is based on a standard multi-layer perceptron with 8 inputs, 6 hidden neurons, 2 outputs and a tanh activation function. The inputs to the network are the speed of the car, the angle to the current way point, the distance to the current way point, the angle to the next way point, the distance to the next way point, the angle to the other vehicle, and the distance to the other vehicle (the last two values are set to 0 if there is no other vehicle present). Apart from these inputs, a bias input (always set to 1) is added to all the neural networks described in this paper. All angles are calculated as the difference between the orientation of the car and a straight line to the waypoint or competitor car in question. We have found that the input representation described above tends to produce controllers that drive backwards, and further that adding PI radians to all angles instead produces controllers that tend to drive forwards. However, the controllers evolved with angles thus reversed actually score somewhat lower on average, which is why we have decided to keep all angles as is in the experiments in this paper. At the start of an evolutionary run, all connection weights of the neural networks are set to zero. Mutation consists of adding random numbers drawn from a Gaussian distribution with standard deviation 0.1, in all neural networks described here.

Recurrent controllers

Most of the controllers are based on simple recurrent neural networks, commonly known as Elman networks [41]. The recurrent neural networks are implemented as standard MLPs, with extra connections from the hidden layer of the last time step to the hidden layer of the current time step. Several controller architectures based on such networks are compared. The RMLPSmall, RMLP and RMLPBig controllers are all based on recurrent networks with exactly the same inputs and outputs as the MLP controllers described above, but differ in the size of their hidden layers, being 4, 8 and 16 units respectively. Two additional controller architectures were also based on recurrent networks but with impoverished inputs: the RMLP1WPOnly has only 6 inputs to its network, as angles and distances are given only to the current way point and the competitor’s car, and the SimpleRMLP does not even input the angle and distance to the competitor car to its network, which only has 4 inputs.

Modular controllers

The modular controllers represent an incorporation of domain knowledge into the controller architecture. The design is based on the observation that the most important task for a good controller, besides driving to a particular way point as quickly and reliably as possible, is to choose which way point to go for: the current or the next? Assuming that these two tasks are reasonably separable, the modular controllers are based on one MLP that decides which way point to go for, and a SimpleRMLP controller that controls the car. The MLP receives three inputs: a bias, the distance to the current way point divided by the distance to the other car, and the speed of the car divided by the speed of the other car. If the only output of the MLP is above 0, the angle and distance to the current way point are fed to the SimpleRMLP, otherwise those of the next way point are fed. In previous experiments with evolving neural networks in layered control architectures, we have found that it is sometimes helpful to evolve the lower layers before the higher layers [137]. Therefore, two versions of the modular architecture are tested: the ModularRMLP is initialised with all connection weights in both networks set to zero. The PrimedModular controllers, on the other hand, are initialised with a “blank” MLP but a SimpleRMLP that has already been solo-evolved to good fitness as an independent controller.
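A hedged sketch of the modular control flow is given below. The state fields, the propagate method names and the exact form of the distance ratio are illustrative assumptions; only the overall structure (a gating MLP choosing a way point, a SimpleRMLP driving towards it) is taken from the description above.

    def modular_control(state, gate_mlp, simple_rmlp):
        """Gating MLP chooses which way point to pursue; SimpleRMLP drives there."""
        gate_inputs = [
            1.0,                                           # bias
            state.dist_current_wp / state.dist_other_car,  # ratio as described above
            state.speed / state.opponent_speed,
        ]
        go_for_current = gate_mlp.propagate(gate_inputs)[0] > 0

        if go_for_current:
            angle, dist = state.angle_current_wp, state.dist_current_wp
        else:
            angle, dist = state.angle_next_wp, state.dist_next_wp

        # SimpleRMLP inputs: speed, angle and distance to the chosen way point, plus bias
        return simple_rmlp.propagate([state.speed, angle, dist, 1.0])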

GP-based controllers

Finally, one controller architecture was based on genetic programming. Each controller consists of two function trees (evaluating to the two outputs, which are then interpreted as driving and steering) and three automatically defined functions (ADFs). The function trees are initialised randomly, and mutated with single-point macro mutation, where a randomly selected node in each tree is replaced with a randomly generated node. The trees are limited to a depth of 5 in order to keep the computational expense of these controllers on a par with the neural network-based controllers. When it comes to the node types, the set of terminals consists of sensor inputs (any of the eight inputs given to the MLP and recurrent controllers), constants (randomly initialised to values between 0 and 2) and ADF calls; the set of non-terminals consists of arithmetic functions (plus, minus, multiplication and protected division), trigonometric functions (sin, cos, tan and tanh) and an if-then-else function (if the first child evaluates to more than 0, return the value of the second child, otherwise that of the third child). To avoid loops, the ADF calls are restricted so that an ADF can only call an ADF with a higher id than itself.
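For illustration, a minimal recursive evaluator for such trees might look as follows. The node representation, the fallback value for protected division, and the omitted operators (subtraction, sin, cos, tan behave analogously) are assumptions; only the node types and the ADF-id restriction come from the description above.

    import math

    def eval_node(node, inputs, adfs):
        """Recursive evaluation of one GP node. 'adfs' is the list of the
        controller's ADF trees; the structure of 'node' is illustrative."""
        if node.kind == "input":
            return inputs[node.index]
        if node.kind == "constant":
            return node.value
        if node.kind == "adf":
            # An ADF may only call ADFs with a higher id, so evaluation terminates.
            return eval_node(adfs[node.adf_id], inputs, adfs)
        args = [eval_node(c, inputs, adfs) for c in node.children]
        if node.kind == "div":                    # protected division; fallback is an assumption
            return args[0] / args[1] if args[1] != 0 else 1.0
        if node.kind == "if":                     # if-then-else
            return args[1] if args[0] > 0 else args[2]
        if node.kind == "add": return args[0] + args[1]
        if node.kind == "mul": return args[0] * args[1]
        if node.kind == "tanh": return math.tanh(args[0])
        raise ValueError(node.kind)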

7.2.2 Co-evolutionary algorithms

In this paper we compare two different co-evolutionary algorithms: generational and steady-state co-evolution.

Generational

We used a standard evolution strategy. For single population co-evolution, the steps are:

1. Evaluate each individual against another chosen at random from the best performing half of the population.

2. Pick the fittest half of the population to keep, and replace the other half with mutated versions of the fittest half.

3. Start another generation.

For multi-population co-evolution, a similar procedure is followed, only now each individual is evaluated against one of the fitter individuals from each population.
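The sketch below illustrates this generational scheme for one or more populations (not code from the thesis); opponent selection details, the handling of ties, and all names are assumptions. It assumes each population is kept ordered best-first between generations, so its first half is the fitter half.

    import random

    def generational_coevolution(populations, evaluate, mutate, generations):
        """Each individual is evaluated against a randomly chosen member of the
        fitter half of every other population (or of its own population in the
        single-population case); the less fit half is then replaced by mutated
        copies of the fitter half."""
        for _ in range(generations):
            for p_idx, pop in enumerate(populations):
                scored = []
                for ind in pop:
                    opponent_pools = [other for o_idx, other in enumerate(populations)
                                      if o_idx != p_idx] or [pop]
                    opponents = [random.choice(pool[:max(1, len(pool) // 2)])
                                 for pool in opponent_pools]
                    fitness = sum(evaluate(ind, opp) for opp in opponents) / len(opponents)
                    scored.append((fitness, ind))
                scored.sort(key=lambda t: t[0], reverse=True)
                survivors = [ind for _, ind in scored[:len(pop) // 2]]
                offspring = [mutate(random.choice(survivors))
                             for _ in range(len(pop) - len(survivors))]
                populations[p_idx] = survivors + offspring
        return populations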

Steady-state

We use a modified version of the N-strikes-out algorithm, as detailed by Miconi and Channon.

The one-population version consists of the following steps:

1. Pick two individuals A and B from the population at random.

2. Pit them against each other; determine the winner and the loser of the confrontation (if any).

3. If the loser has been defeated N times over its entire history, delete it and replace it with a new individual.

4. Start another comparison.

In the two-population case, Miconi and Channon found that a naive approach of comparing one random individual from each population caused disengagement, resulting in the weaker population losing any selection gradient. They overcame this by comparing two individuals from population A against an individual from population B. The winner and loser were determined by which of the population A individuals had scored best against the individual from population B. This process was then reversed, and two individuals from B were evaluated against an individual from A. The members of a population were therefore only competing against each other, rather than against other populations. The individual from the other population was simply used to evaluate the fitness of the two individuals. We follow a similar approach in this paper, generalised to n populations. The advantage of this evaluation scheme is that it should discourage the scenario where one member of the population discovers an exploit which allows it to beat other members of the population, but lowers its general fitness.

One concern with the algorithm is that with a noisy fitness function, high-fitness individuals can still occasionally be beaten by lower-fitness individuals. Because individuals are deleted after a certain small number of defeats, this would mean losing desirable individuals. As suggested in [88], one way around this would be to introduce the concept of ‘forgetting’ old defeats. The approach we took was to have a defeat factor, by which we multiplied the number of losses at each comparison. If it is less than 1, this should cause the number of losses to decay towards zero, being topped up by new losses. It means old defeats would be considered less important.

Another concern we had was that selection had no direct dependence on absolute fitness (an individual’s score on the track). In theory, a controller could win many comparisons by blocking the other controller so that it achieved a low score. This would mean individuals that scored lower could still win lots of contests, and spread through the population. This should be more of a concern with single-population N-strikes, but it could still be an issue in the n-population case. To combat this, we decided to make the defeat factor dependent on absolute fitness. We want individuals with a high apparent fitness to be given more evaluations before deletion, and low-fitness individuals to be quickly replaced. We therefore made the defeat factor F for an individual have the form:

F = 1 / e^(x_i − x̄)          (7.1)

where x_i is the fitness of the individual, and x̄ is the mean fitness of its population. This means the defeat factor will be large for low-fitness individuals, and small for high-fitness individuals, the latter implying more rapid forgetting of old defeats. In order to adjust this parameter in a convenient manner, we introduced a defeat factor multiplier D, as shown below:

F = 1 / e^(D(x_i − x̄))          (7.2)

When D is 0, we get plain N-strikes behaviour. When D is greater than zero, we get an increasing contribution of absolute fitness. One advantage of this form is that it retains the predictability of the number of deletions. In plain N-strikes, the number of deletions should be approximately the number of comparisons divided by N. We found that this behaviour was preserved by using this form of the defeat factor.
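A hedged sketch of the modified N-strikes-out loop for a single population follows. Here score(ind) stands for the individual's (cached) absolute track score x_i; whether the loss decay is applied to all individuals or only to the two just compared is not specified in the text and is an assumption, as are the function names.

    import math
    import random

    def n_strikes_out(population, play_match, mutate, score, comparisons, N=2, D=0.5):
        """Steady-state co-evolution: losers accumulate (decayed) defeats and are
        replaced by a mutated copy of the winner after N defeats. The defeat
        factor F = 1 / exp(D * (x_i - mean_x)) from equations (7.1)-(7.2) makes
        high-scoring individuals forget old defeats faster."""
        losses = [0.0] * len(population)
        for _ in range(comparisons):
            i, j = random.sample(range(len(population)), 2)
            first_wins = play_match(population[i], population[j])
            winner, loser = (i, j) if first_wins else (j, i)
            mean_x = sum(score(ind) for ind in population) / len(population)
            for k in (i, j):
                F = 1.0 / math.exp(D * (score(population[k]) - mean_x))
                losses[k] *= F                    # decay (or amplify) old defeats
            losses[loser] += 1
            if losses[loser] >= N:
                population[loser] = mutate(population[winner])  # replace with mutated winner
                losses[loser] = 0.0
        return population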

Parameter tuning

One of the features of the N-strikes-out algorithm is its flexibility - there are many parameters that can be varied. In initial parameter tuning we tried varying the following parameters for the single-population case:

• population size P

• number of strikes resulting in deletion N

• the number of games during an evaluation

• the value of the defeat factor multiplier D

Preliminary results suggested that with a large population, fitness initially rose more slowly but reached a higher value than with a small population. We also found that small values of N performed better than larger values. The only effect of changing the number of games during an evaluation should be to alter the level of noise in the fitness function. We found that in high-noise environments (such as using only one game to compare two individuals), setting D = 0.5 caused a quicker rise in fitness than plain N-strikes. However, both resulted in about the same fitness eventually. For subsequent experiments, we chose the values of:

• P = 30

• N = 2

• at least 5 games per comparison

• D = 0.5

The other decision to make was how to perform replacement. We used only mutation and not crossover in this paper, so the most straightforward choices were replacement by a mutated version of either the winner or the loser of the comparison. We chose replacement by a mutation of the winner, as this should aid the rapid spread of high fitness through the population. The generational algorithm had fewer parameters to adjust. We decided on a population size of 30, and at least 5 games per comparison, to allow direct comparison with the N-strikes-out algorithm.

7.2.3 Results

Our experiments proceeded as follows: first, to establish a baseline for the experiments with the many-population co-evolutionary algorithms, we tested each of the controller architectures independently.


This was done using both solo evolution and single-population co-evolution. We then tried using both the generational and steady-state algorithms to compare controller architectures. The controller architecture that was found to perform best was then used to seed all populations of both multi-population evolutionary algorithms. The idea here was to investigate diversification between populations, and the extent to which multi-population competition helps evolve complex general behaviour.

Figure 7.3: Solo evolution of diverse architectures, 20 runs.

Solo evolution of controller architectures

Our first set of experiments concerned the evolution of controllers for the single-car version of the task. All nine controller architectures described above were evolved in single-architecture populations using a standard 15+15 evolution strategy for 200 generations. These experiments were all repeated for 20 runs. In figure 7.3 we have plotted the fitness growth of five different controller architectures, including one based on RMLPs (Elman-style recurrent networks). Several things can be learned from this figure: one is that the GP controllers consistently reach a much lower final fitness than the other architectures, even though they learn quite fast in the first few generations. The PrimedModular controllers perform slightly better than the others in the end, but this is probably because they have effectively evolved for longer, the lower layer being pre-evolved. Other than that, RMLPBig (the large recurrent network) learns fastest of the controllers.

In figure 7.4 we compare the five different controller architectures based on monolithic (non-modular) RMLPs. What is remarkable here is how similar their ultimate performance is. The only difference of any note is in their learning speed, and here we see a clear relation to the size of the network: the larger the network (more inputs and a larger hidden layer), the faster it learns.


Figure 7.4: Solo evolution of different RMLP-based architectures, 20 runs.

Single-population co-evolution

Next, we ran a number of experiments where we co-evolved controllers for the two-car version of the task, using single-population co-evolution with populations of 30 individuals containing only one controller type each. So in these runs, the controllers only ever compete against other controllers of the same architecture. We did 20 runs for each controller architecture, using both the generational and the modified N-strikes-out steady-state co-evolutionary algorithms.

Generational: In figure 7.5 we plot the fitness of the best controller of each generation for the same five controller architectures as in figure 7.3. As in the solo evolution, the GP controllers start out as fast learners but are by far the worst of the lot at the end of 200 generations. Unlike in the solo runs, we see a very clear superiority of the modular controllers over the other architectures. At the end of the runs, the primed modular controllers do better than the non-primed ones, but the fitness of the non-primed ones is still increasing. Looking at the RMLP-based controllers (figure 7.6), we see no difference in ultimate fitness and little difference in learning speed. Again, the larger networks learn somewhat faster.

Steady-state: The results of the steady-state runs were very similar, as can be seen from figure 7.7 (we have omitted the graph for the RMLP-only comparisons out of space considerations).

The primed modular controller still performs almost twice as well as the GP controller. One difference is that the non-primed modular controller learns much faster with steady-state than with generational co-evolution.


Figure 7.5: Single-population generational co-evolution of diverse architectures, 20 runs.


Figure 7.6: Single-population generational co-evolution of different RMLP-based architectures, 20 runs.


Figure 7.7: Single-population steady-state co-evolution of diverse architectures, 20 runs.

Multi-architecture multi-population co-evolution

And so, at last, we come to the multi-population co-evolution. We used nine populations for these experiments, one population for each controller architecture. In these runs, which lasted for 500 generations (or steady-state equivalents), each individual of each population is tested against individuals of all other populations, and thus against controllers of all other architectures. At least 20 runs were made for both the generational and steady-state algorithms.

Generational: Figure 7.8 plots the fitness growth of the populations seeded with the main controller architectures (remember that all nine populations were part of the runs, though only five are plotted). Some of what we see here could be expected, given our single-population results. The modular controllers still win (in this case literally) over the other controllers. But what is unexpected is that the GP-based controllers no longer do worst; indeed they perform almost as well as the MLP-based controllers at the very end of the 500 generations. Instead, the RMLPBigControllers perform worst by a significant margin. This rather surprising result concerning the RMLPBig controller architecture is put into context by figure 7.9, which compares only the RMLP-based controllers. Here we see that the smaller the RMLP-based controller is, the better the ultimate fitness it reaches, with the minimalist SimpleRMLPControllers coming out on top.

Steady-state: The steady-state runs paint a similar picture. The main difference between figure 7.10, which shows the fitness growth of the main controller architectures, and figure 7.8 is a slower overall fitness growth, and that the GP controllers as a result don’t do any better than the RMLPBig controllers, i.e. not very well at all. The graph for the RMLP-based controllers has been omitted in order to conserve space, but it looks essentially like figure 7.9, only with slower fitness growth.


Figure 7.8: Nine-population generational co-evolution of diverse architectures, 10 runs.


Figure 7.9: Nine-population generational co-evolution of different RMLP-based architectures, 10 runs.


Figure 7.10: Nine-population steady-state co-evolution of diverse architectures.

Single-architecture multi-population co-evolution

In the final set of experiments, we filled all nine populations with RMLPBig controllers, as these are in theory capable of expressing the most complex strategies. We then evolved them for 500 generations, 10 runs each for the generational and steady-state algorithms. We are not showing any graphs of the same type as for the other experiments, as these would be quite uninteresting: all the populations reach the same fitness (of their best individuals) on average, with little difference between the populations in an individual run. This fitness is around 8 or 8.5, virtually the same as the fitness of the RMLPBig controllers in the multi-architecture runs.

Comparison of best controllers

At this point, the reader will probably wonder which of the experiments above actually produced the best controllers. As always with co-evolution, this question is not straightforward to answer. But we've tried to answer it in two ways: by measuring the competition score of representatively high-performing controllers (the best we could find from a limited probe) from each experiment, and by testing the same controllers against each other in two-car racing. Competition score is the score used to rank submitted controllers in the league table of the CIG Car Racing Competition. It is calculated by letting the controller race 500 trials on its own, and 500 trials each against a rather low-performing hard-coded heuristic controller and a medium-performing MLP-based controller. Table 7.3 shows the competition score breakdown for the selected high-performing controllers. Judging from these scores, the multi-population runs yielded the best overall controllers, the single-population runs slightly less good ones, and the solo evolutionary runs really rather bad controllers.

Controller              solo    heuristic   fixed   competition score
n-pop gen MLP          16.86      12.01     12.27        13.71
n-pop gen Modular      16.49       9.52     13.37        13.13
n-pop nstr MLP         17.40      11.85     12.73        13.99
n-pop nstr Modular     16.59      10.54     12.96        13.36
1-pop gen Modular      16.38       6.96     13.20        12.18
1-pop gen RMLPBig      14.90      11.30     11.89        12.69
1-pop nstr Modular     15.19      10.06     10.16        11.80
1-pop nstr RMLPBig     16.64      11.08     11.68        13.13
Solo Modular           18.50       2.80      3.20         8.17
Solo RMLPBig           17.35       4.91      6.54         9.60

Table 7.3: Competition score of a variety of high-performing controllers
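In table 7.3 the competition score appears to be simply the mean of the three component scores. A minimal sketch in Python of the calculation described above (the helper functions race_solo and race_against are hypothetical; only the 500-trial structure and the three reference conditions come from the text):

def competition_score(controller, race_solo, race_against,
                      heuristic_opponent, mlp_opponent, trials=500):
    """Sketch of the competition-score calculation.

    `race_solo(c)` and `race_against(c, opponent)` are hypothetical functions
    returning the progress score of a single trial; the final score is assumed
    to be the mean of the three per-condition averages, which matches table 7.3.
    """
    solo = sum(race_solo(controller) for _ in range(trials)) / trials
    vs_heuristic = sum(race_against(controller, heuristic_opponent)
                       for _ in range(trials)) / trials
    vs_mlp = sum(race_against(controller, mlp_opponent)
                 for _ in range(trials)) / trials
    return (solo + vs_heuristic + vs_mlp) / 3.0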

The best controller found seems to be based on an MLP, but the differences between the architectures are not great. Table 7.4 shows the results of direct competition between these controllers. The differences are sometimes quite dramatic, as when the solo-evolved RMLPBig gets 8 points lower fitness than the single-population-evolved modular controller. If a clear winner has to be picked, it is the modular controller evolved with generational multi-population co-evolution, which does not lose significantly to any other controller. Generally, the same dominance pattern of multi-population over single-population over solo evolution persists.

7.2.4 Discussion

We set ourselves a handful of questions to answer at the beginning of the section, and even though we can draw quite a few conclusions from our experiments, not all of the questions have been answered. To begin with our conclusions, we found that when two cars were involved, the modular controllers outperform all the other controller architectures, and they do quite well otherwise as well. For solo and single-population evolution, larger, more complex controllers learn faster, whereas this is not the case for multi-population co-evolution. It thus seems that to the extent that multi-population co-evolution is useful for comparing the performance of different controller architectures, it is so in a rather different way than single-population co-evolution. What is clear about the multi-population co-evolution is that (for our particular problem instance) it produces generally better controllers than either solo evolution or single-population co-evolution, corroborating one of our main hypotheses. Why large networks seem to learn faster than small networks is an interesting question, and our results are somewhat at odds with the received wisdom among neuroevolution researchers. For example, the NEAT algorithm starts out with minimally connected networks, and incrementally complexifies these. The stated reason for this is that the search space should be reduced, which is assumed to lead to better or faster learning [123].

                    n-pop   n-pop    n-pop   n-pop    1-pop    1-pop    1-pop    1-pop    Solo
                    gen     gen      nstr    nstr     gen      gen      nstr     nstr     RMLPBig
                    MLP     Modular  MLP     Modular  Modular  RMLPBig  Modular  RMLPBig
n-pop gen MLP        0.00   -0.75     0.09   -0.42    -0.81     1.20     2.37     1.17     4.58
n-pop gen Modular    0.75    0.00     0.71    0.35     0.09     1.91     2.65     1.90     5.48
n-pop nstr MLP      -0.09   -0.71     0.00   -0.53    -0.31     1.08     2.81     1.50     4.64
n-pop nstr Modular   0.42   -0.35     0.53    0.00     0.17     1.06     2.82     1.52     4.31
1-pop gen Modular    0.81   -0.09     0.31   -0.17     0.00     2.10     0.68     1.77     3.57
1-pop gen RMLPBig   -1.20   -1.91    -1.08   -1.06    -2.10     0.00     1.56     0.36     4.15
1-pop nstr Modular  -2.37   -2.65    -2.81   -2.82    -0.68    -1.56     0.00    -1.51     8.02
1-pop nstr RMLPBig  -1.17   -1.90    -1.50   -1.52    -1.77    -0.36     1.51     0.00     3.88
Solo RMLPBig        -4.58   -5.48    -4.64   -4.31    -3.57    -4.15    -8.02    -3.88     0.00

Table 7.4: Score differences in competitions between controllers (average of 500 games). A positive value means that the controller of the current row beats the controller of the current column.

However, our results trigger the suspicion that minimal networks might not have any advantages at all over larger networks, except for the lesser computation required for propagating values through the network. We are currently not sure why the more complex controllers seem to learn faster. It is entirely possible that this is a domain-specific effect, linked e.g. to the mutation magnitude employed: for a given mutation magnitude, there is a larger chance that a Gaussian mutation will lead to a radical change in at least one dimension in a high-dimensional search space than in a low-dimensional one. Further experimentation is needed in order to test whether this effect persists over different parameter settings and domains.

A major unsolved puzzle is why the RMLPBig controller fared so badly and, more generally, why RMLP-based controllers did better the simpler they were in the multi-population experiments. Our main hypothesis here is that because the more complex controllers learn faster, they settle for a particular strategy sooner than the others, and that this particular strategy is not a very good one because the other controllers have not yet developed any useful strategies. Once learned, these strategic niches function as local optima, as learning a generally better strategy would first require unlearning the current mediocre strategy. The slow-learning simple controllers instead find themselves competing against several more well-developed strategies as soon as they unfold their wings, and are forced to learn more general strategies. If this is true, and the effect generalises to other problems, this “slow learner's advantage” could be a valuable tool in our understanding of co-evolutionary dynamics. We are currently thinking of the best way to test this hypothesis.

What we have not had time (and space) to investigate here is the behavioural diversity between the evolved controllers from the different experiments. Thus, we don't know whether the superior performance of the multi-population co-evolution is because of increased diversity between populations, but we still think this is the case. This needs to be investigated further in a forthcoming paper, using both quantitative and qualitative measures. We are also careful to point out that we are not claiming the superiority of multi-population competitive co-evolution over two-population co-evolution in the general case. At least not just yet. The experiments in this section will first have to be validated through independent repetition in different domains and with different parameter settings.
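The dimensionality intuition can be illustrated with a small calculation of our own (not part of the original experiments): if a "radical change" is taken to mean perturbing at least one weight by more than k standard deviations, the probability of such a change grows quickly with the dimensionality of the search space.

import math

def p_radical(n_dims, k=3.0):
    """Probability that an isotropic Gaussian mutation changes at least one of
    n_dims parameters by more than k standard deviations (illustrative only)."""
    p_single = math.erfc(k / math.sqrt(2.0))  # P(|N(0,1)| > k)
    return 1.0 - (1.0 - p_single) ** n_dims

# With k = 3 this is roughly 0.3% for a single weight, about 2.7% for
# 10 weights, and about 24% for 100 weights.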

7.3 Personalized racing track creation

In this section, we discuss the evolution of fun racing tracks for human players. As stand-ins for the human players (who would certainly not be interested in driving the same

track hundreds of times with only slight variations) we use models of their driving styles in the evaluation of the tracks' fitness. The driving style models are exactly those acquired in section 6.1, and this section is also based on the same two papers as that other section. In order to evolve a racing track based on a player model, we must know what it is for a racing track to be fun, how we can measure this property, and how the racing track should be represented in order for good track designs to be in easy reach of the evolutionary algorithm. We have not been able to find any previous research on evolving tracks, or for that matter any sort of computer game levels or environments. However, Ashlock et al.'s paper on evolving path-finding problems is worth mentioning as an example of an approach that could possibly be extended to certain types of computer games [5]. We have discussed some theories about what makes computer games fun in section 3.1. Of special interest here is the theory of Malone that fun depends on challenge, fantasy and curiosity.

At least challenge and curiosity could potentially be measurable. It seems clear that a racing track should not be too challenging, and not too easy either. As for the curiosity, or “optimal level of informational complexity”, this could mean that the track should be varied so that different types of challenges alternate when you drive around it. An observation of our own, confirmed by the opinions of an unstructured selection of non-experts, is that tracks are fun where it is possible to drive very fast on straight sections, but it is necessary to brake hard in preparation for sharp turns, turns which preferably can be taken by skidding. In other words, it's fun to almost lose control. However, it is possible that this is a matter of personality, and that different players attach very different values to different fun factors. Some people seem to like to be in control of things, and people have very different attention spans, which should mean that some people would want tracks that are easier to learn than others. Identifying different player types and being able to select a mix of fun factors optimal to these players would be an interesting project, but we are not aware of any empirical studies on that subject.

7.3.1 Fitness functions

Developing reliable quantitative measures of, and ways of maximising, all the above properties would probably require significant effort. For these experiments we chose a set of features which we believed would not be too hard to measure, and designed a fitness function based on these. The features we want our track to have for the modelled player, in order of decreasing priority, are the right amount of challenge, a varying amount of challenge, and the presence of sections of the track in which it is possible to drive really fast. The corresponding fitness functions (sketched in code after the list) are:

• f1: the negative difference between actual progress and target progress (in this case defined as 30 waypoints in 700 timesteps),

• f2: variance in total progress over five trials of the same controller on the same track,

• f3: maximum speed.
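A minimal sketch in Python of how these three measures could be computed, assuming a hypothetical evaluate(model, track) function that returns the progress and maximum speed of one trial, and taking f1 to be the negative absolute difference from the target (an assumption on our part):

def track_fitnesses(model, track, evaluate, trials=5, target_progress=30.0):
    """Sketch of the three fitness measures; `evaluate(model, track)` is a
    hypothetical function returning (progress, max_speed) for one trial."""
    results = [evaluate(model, track) for _ in range(trials)]
    progresses = [p for p, _ in results]
    mean_progress = sum(progresses) / trials
    variance = sum((p - mean_progress) ** 2 for p in progresses) / trials

    f1 = -abs(mean_progress - target_progress)   # right amount of challenge
    f2 = variance                                # varying amount of challenge
    f3 = max(s for _, s in results)              # fast sections exist
    return f1, f2, f3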

7.3.2 Track representation

In our earlier paper we evolved fixed-length sequences of track segments. These segments could have various curvatures and decrease or increase the breadth of the track. While this representation had the advantage of very good evolvability, in that we could maximise both progress and progress variance simultaneously, the evolved tracks did look quite jagged, and were not closed; they ended at a different point than they started, so the car had to be “teleported” back to the beginning of the track. We therefore set out to create a representation that, while retaining evolvability, allowed for smoother, better-looking tracks where the start and end of the track connect.

The representation we present here is based on b-splines, or sequences of Bezier curves joined together. Each segment is defined by two control points, and two adjacent segments always share one control point. The remaining two control points necessary to define a Bezier curve are computed in order to ensure that the curves have the same first and second derivatives at the point where they join, thereby ensuring smoothness. A track is defined by a b-spline containing 30 segments (the full genome thus consisting of a 60-dimensional array), and mutation is done by perturbing the positions of the control points through adding small numbers drawn from a Gaussian distribution. It can be noted that even though this representation makes nonsensical and undrivable tracks possible, and even probable, such tracks will be strongly selected against and quickly weeded out of the population, precisely because they are undrivable.

The collision detection in the car game works by sampling pixels on a canvas, and this mechanism is taken advantage of when the b-spline is transformed into a track. First, thick walls are drawn at some distance on each side of the b-spline, this distance being either set to 30 pixels or subject to evolution, depending on how the experiment is set up. But when a turn is too sharp for the current width of the track, this will result in walls intruding on the track and sometimes blocking the way. The next step in the construction of the track is therefore “steamrolling” it, or traversing the b-spline and painting a thick stroke of white in the middle of the track. Finally, waypoints are added at approximately regular distances along the length of the b-spline. The resulting track (see fig. 6.1) can look very smooth, as evidenced by the test track, which was constructed simply by manually setting the control points of a spline through a process of human trial-and-error.
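As an illustration (not the exact implementation used), the genome could be stored in Python as a flat list of 30 control points, with the mutation operator of the straightforward method described below perturbing one control point by Gaussian noise with a standard deviation of 20 pixels; the canvas size here is an assumption.

import random

SEGMENTS = 30          # one shared control point per b-spline segment
SIGMA = 20.0           # mutation standard deviation in pixels

def random_genome(width=400.0, height=300.0):
    """A genome is 30 (x, y) control points, i.e. a 60-dimensional array."""
    return [(random.uniform(0, width), random.uniform(0, height))
            for _ in range(SEGMENTS)]

def mutate(genome):
    """Perturb one control point with Gaussian noise (straightforward method)."""
    child = list(genome)
    i = random.randrange(len(child))
    x, y = child[i]
    child[i] = (x + random.gauss(0, SIGMA), y + random.gauss(0, SIGMA))
    return child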

Figure 7.11: Track evolved using the random walk initialisation and mutation.

7.3.3 Initialisation and mutation

In order to investigate how best to leverage the representational power of the b-splines, we experimented with several different ways of initialising the tracks at the beginning of the evolutionary runs, and different implementations of the mutation operator. Three of these configurations are described here.

Straightforward

The straightforward method starts from an initial track shape forming a rectangle with rounded corners. Each mutation operation then perturbs one of the control points by adding numbers drawn from a Gaussian distribution with standard deviation 20 pixels to both the x and y coordinates.

Random walk

In the random walk experiments, mutation proceeds as in the straightforward configuration, but the initialisation is different. A rounded rectangle track is first subjected to a random walk, whereby hundreds of mutations are carried out on a single track, and only those mutations that result in a track on which a generic controller is not able to complete a full lap are retracted. The result of such a random walk is a severely deformed but still drivable track. A population is then initialised with this track and evolution proceeds as usual from there.

Figure 7.12: A track evolved (using the radial method) to be fun for the first author, who plays too many racing games anyway. It is not easy to drive, which is just as it should be.

Radial

The radial method of mutation starts from an equally spaced radial disposition of the control points around the center of the image; the distance of each point from the center is generated randomly. Similarly, at each mutation operation the position of the selected control point is simply changed randomly along the respective radial line from the center. Constraining the control points to a radial disposition is a simple way to exclude the possibility of producing a b-spline containing loops, and therefore produces tracks that are always fully drivable.
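The radial scheme could be sketched in Python as follows (our own illustration; the image centre and radius range are hypothetical parameter choices): only the radius of each control point is stored and mutated, while the angles stay fixed and evenly spaced, which by construction prevents the spline from crossing itself.

import math
import random

def radial_genome(n_points=30, r_min=50.0, r_max=250.0):
    """One radius per control point; angles are fixed and evenly spaced."""
    return [random.uniform(r_min, r_max) for _ in range(n_points)]

def radial_to_points(radii, cx=320.0, cy=240.0):
    """Convert radii to (x, y) control points around the image centre."""
    n = len(radii)
    return [(cx + r * math.cos(2 * math.pi * i / n),
             cy + r * math.sin(2 * math.pi * i / n))
            for i, r in enumerate(radii)]

def radial_mutate(radii, r_min=50.0, r_max=250.0):
    """Change the radius of one randomly chosen control point."""
    child = list(radii)
    child[random.randrange(len(child))] = random.uniform(r_min, r_max)
    return child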

7.3.4 Results

We evolved a number of tracks using the b-spline representation, different initialisation and mutation methods, and different controllers derived using the indirect player modelling approach.

Straightforward

Overall, the tracks evolved with the straightforward method looked smooth, and were just as easy or hard to drive as they should be: the controller for which the track was evolved typically made a total progress very close to the target progress. However, the evolved tracks didn’t differ from each other as much as we would have wanted. The basic shape of a rounded rectangle shines through rather more than it should.

Figure 7.13: A track evolved (using the radial method) to be fun for the second author, who is a bit more careful in his driving. Note the absence of sharp turns.

Random walk

Tracks evolved with random walk initialisation look weird (see fig. 7.11) and differ from each other in interesting ways, and so fulfil at least one of our objectives. However, their evolvability is somewhat lacking, with the actual progress of the controller often quite different from the target progress, and the maximum speed low.

Radial

With the radial method, the tracks evolve rather quickly and look decidedly different (see figs. 7.12 and 7.13) depending on what controller was used to evolve them, and can thus be said to be personalised. However, there is some lack of variety in the end results in that they all look slightly like flowers, a clear bias from the type of mutation used.

Comparison with segment-based tracks

It is interesting to compare these tracks with some tracks evolved using the segment-based representation from our previous paper. Those tracks (see fig. 7.14) do show both the creativity evolution is capable of and a good ability to optimise the fitness values we define. But they don't look like anything you would want to get out and drive on.

Figure 7.14: Tracks evolved using the segment-based method. Track (a) is evolved for a weak player, and tracks (b) and (c) for a good player. Tracks (a) and (b) are evolved using all three fitness functions defined above, while track (c) is evolved using only progress fitness.

7.3.5 Discussion

Based on the player models at hand we seem to be able to evolve tracks that satisfy our own fitness criteria reasonably well. These tracks also look rather interesting, at least in our opinion.

However, whether they are actually more fun to drive than tracks generated randomly or without a player model is not obvious. To find out, we would need to conduct empirical studies with a large number of human subjects. This would indeed be interesting research in its own right, but we currently don't have the time or resources. Further, there is much work left to do before we have a track representation and evolutionary method that can compete with hand-designed tracks. Currently, the tracks either look a bit like flowers, or very jagged. Adding a third dimension to the racing makes devising a good representation even more challenging.

7.4 Summary

This chapter presented three series of experiments on evolving controllers or structures with an unclear and open-ended fitness criterion; we don't know exactly what we want, but we are looking for ways to get there, and might know when we are on the right way. The first two sections concern competitive co-evolution. We explored the absolute-relative fitness spectrum, and introduced multi-population co-evolution with different controller architectures. The wealth of interesting and occasionally counter-intuitive results from especially the second section suggests that the combination of competitive co-evolution with game agent control holds considerable promise, and that this enterprise is still in its initial, explorative stage.

The third section concerned the evolution of racing tracks based on previously acquired player models. This area is even more underexplored, and the results in this section are highly tentative (they also suffer from the problem of being very hard to validate without the use of studies on humans). At the same time, this project could potentially be very rewarding, in that it could open up a new application field for computational intelligence in games.

Chapter 8

Conclusions

We have in this thesis presented a large number of experiments that apply evolution, and occasionally other reinforcement learning or supervised learning algorithms, to different games in different manners with different goals and outcomes. Writing a single conclusion chapter to all this might seem a bit contrived. On the other hand, not having a conclusion wouldn't be right either. Thus, this chapter will contain two main sections, summarizing the main contributions and open directions for future research from two different perspectives: that of computational intelligence and that of game AI. Within each section, our contributions will be discussed in the context of a number of open research topics. A final section will summarise and discuss future research directions.

8.1 The computational intelligence perspective

In this section, we view the experiments of the preceding chapters from the perspective taken in chapter 2. In the following, we shall go through a number of topics and issues central to that chapter, and discuss how the experiments in this thesis can contribute to our understanding of them.

8.1.1 Robot reality and simulation

In section 2.3.1 we discussed the problems associated both with evolving on real robots and with acquiring robot simulations good enough to meaningfully evolve behaviour in. Throughout this thesis we have demonstrated how computer games can provide complex dynamic environments good enough to evolve (and otherwise learn) interesting behaviour in. We have also argued qualitatively for the suitability of computer games as environments in which to evolve complex

general behaviour in section 3.1, suggesting that computer games to a rather large extent can replace robotics in evolutionary robotics. However, sometimes the goal is not to learn behaviour in general, but rather to learn to control a particular physical robot. In section 6.2 we presented a general approach for modelling the dynamics of robots and evolving controllers in simulation that could be transferred back to the physical robot. Central to that approach is multi-model evolution, where the controllers are tested with several models developed using different learning algorithms and/or representations. The most obvious way to continue the research described here is to show that really complex behaviour can be evolved in computer games. For this we probably need to switch from home-made computer games to complex commercial games, and from the simplified input we currently use to high-dimensional visual inputs. (At least if we take the symbol grounding problem and the importance of closing the sensorimotor loop seriously; while it might be very possible to evolve good game solutions using the internal representations of the game, this would arguably not amount to evolving intelligence.) Another path to take is to try to model the dynamics of more complex robots (or possibly robots in more complex environments) and see if multi-model controller evolution scales up. This path is currently being taken for the micro-helicopters in the UltraSwarms project.

8.1.2 Incrementality and modularity

As discussed in sections 2.4.2 and 2.4.3, decomposing the controller structurally and/or functionally, and decomposing the learning of it temporally, can be very useful from the standpoint of efficiently learning effective controllers. In this thesis, no truly new theories about incrementality or modularity have been presented, but we have instead demonstrated the applicability of the concepts to a few domains (in at least slightly novel ways) and so corroborated existing theories. In sections 5.4.2, 5.4.1 and 7.2 we showed the superiority of modular neural network-based controllers over non-modular controllers. In the case of the Cellz game, the modularity was of a symmetrical or convolutional kind which we have not seen used in control learning before. In the helicopter learning experiments, the difference in performance between modular and non-modular controller representations was qualitative rather than quantitative, in that the non-modular networks would not learn to fly at all. The same experiments also convincingly showed that there are “wrong” modularizations that don't improve evolvability at all, and demonstrated the usefulness of combining modularity with incrementality. In the case of the multi-population co-evolution experiments, the extensive list of controller architectures compared proved that the superior performance of the carefully modularized architectures is not due to an unlucky choice

of non-modular architecture. The experiments with incremental generalization and specialization of driving skills in section 5.2 demonstrate the application of incremental approaches in a novel way, as the task is not fundamentally changed so much as aspects of the task are added or taken away; an interesting result here is that “general” driving skills can be incrementally evolved, pointing to the usefulness of incremental evolution for evolving truly general intelligence.

As the benefits of incrementality and (the right sort of) modularity are well established, these concepts are probably to be regarded as tools rather than future research topics in their own right. A related research topic is instead to show that the right type of tasks, i.e. computer games, naturally have an inherent incremental structure suitable for learning complex general behaviour. Another future research topic is to develop algorithms that automatically find suitable modularisations for given problems. However, unpublished research by the author suggests that this is not easy at all. The currently most promising direction here is to use some sort of memetic algorithm, where selection takes place on two different time scales: on the larger time scale, a suitable structure for the function representation (e.g. the modularisation of the neural network) is evolved. Each evaluation of the modular structure would then involve a search of the parameter space of the function representation on a smaller time scale (e.g. local search through a hill climber for the neural network weights). Such an algorithm would almost certainly have to trade slower learning on simpler problems for better final fitness on more complex problems.
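To make the two-time-scale idea concrete, here is a minimal sketch in Python (our own illustration; random_structure, mutate_structure, random_weights, perturb and fitness are hypothetical helpers): the outer loop evolves the modular structure, and each candidate structure is scored by a short hill-climbing run over its weights.

def evaluate_structure(structure, fitness, random_weights, perturb,
                       inner_steps=200):
    """Inner time scale: local search (a simple hill climber) in weight space."""
    weights = random_weights(structure)
    best = fitness(structure, weights)
    for _ in range(inner_steps):
        candidate = perturb(weights)
        f = fitness(structure, candidate)
        if f >= best:
            weights, best = candidate, f
    return best

def memetic_search(random_structure, mutate_structure, fitness,
                   random_weights, perturb, outer_steps=100):
    """Outer time scale: (1+1)-style evolution of the modular structure itself."""
    structure = random_structure()
    score = evaluate_structure(structure, fitness, random_weights, perturb)
    for _ in range(outer_steps):
        child = mutate_structure(structure)
        child_score = evaluate_structure(child, fitness, random_weights, perturb)
        if child_score >= score:
            structure, score = child, child_score
    return structure, score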

8.1.3 Controller architectures and learning methods

Sections 2.4.4, 2.4.5 and 2.4.7 discuss the relative merits of different controller architectures and different types of reinforcement learning algorithms. These sections also made it clear that there is little consensus on these topics. Unfortunately, this thesis will not provide such a consensus, but it might take us a little way along the path toward true insight into this important issue. Most of the experiments in this thesis compare controller architectures or learning algorithms in one way or another. Apart from the points about modularity, a number of main conclusions can be drawn: First of all, all the non-modular controller architectures we have tried in this thesis have had rather similar learning abilities. There have been differences in both learning speed and eventual performance of the best learned controllers, but these differences have been quantitative rather than qualitative, and not really dramatic (in the extreme, one architecture might eventually reach twice the fitness of another, but never ten times the fitness). When there has been a qualitative difference in learning abilities, this has always been down to inappropriate

inputs to the controller (such as in section 5.1), or to non-modular controllers not being up to the job. In particular, in the two sections where genetic programming and neuroevolution were compared (5.3.1 and 7.2), we found gp-based controllers able to learn just the same tasks as neural network-based controllers. However, the best neural network-based controllers always quantitatively outperformed the best gp-based controllers. Usually, the gp-based controllers learned faster, but this effect was not as pronounced.

In those two sections we also compare (potentially) stateful with reactive controllers. We found no significant performance difference between the two classes. This could be because the tasks we used don't really require the use of internal state, but more likely it is because something else in our experimental setup prevents the evolution of solutions that harness the full power of recurrent neural networks and object-oriented genetic programming. Also, we have not analyzed the evolved solutions further to look for potential differences in driving style, or measured the use of state.

In section 7.2 we also compare neural network-based controllers of different sizes, i.e. with different numbers of hidden neurons. The results are somewhat surprising, in that the controllers of different sizes all seem to eventually reach the same fitness, but the larger controllers get there faster. We are currently not aware of any theory that can explain this.

Finally, in section 5.3.2 we compare evolution with temporal difference learning, and state-value control with action-value and direct control (the rest of the controller learning experiments in this thesis use direct control exclusively). The results, that state-value control outperforms action-value and direct control, and that temporal difference learning learns faster than evolution but is less reliable and ultimately reaches lower fitness levels, are in line with our own ideas and some - but not all! - published studies on similar domains.

With disagreement over these topics abundant, many more such comparative studies need to be done in the future, and theory needs to be developed that is supported by their results. Eventually, this research might lead to reinforcement learning algorithms that combine the best features of evolution and temporal difference learning, and which dynamically select the most appropriate controller representation. A rather straightforward way of doing this, which in various incarnations has been suggested or (to a limited extent) explored by various researchers, is to evolve populations of controllers which use some form of td-learning to further learn during their lifetime. A more radical approach would be to start by looking at what information we have (local reinforcements, global fitness measures), what methods for change we have (stochastic variation, recombination, gradient descent in model space, instance-based learning etc.) and try to synthesise a new algorithm from these parts. Quite possibly, evolution itself could be

employed in the design of the learning algorithm, e.g. by giving the various components of the learning algorithm as primitives to a genetic programming system.
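A minimal sketch in Python of the “straightforward way” mentioned above (our own illustration; td_learn, evaluate and mutate are hypothetical helpers): each controller is refined by some form of TD-learning during its lifetime, and selection then acts on the post-learning fitness. This particular variant is Lamarckian, in that the learned controllers are carried over directly; a Baldwinian variant would keep the original genomes and use only the post-learning fitness for selection.

import random

def hybrid_generation(population, td_learn, evaluate, mutate,
                      elite_fraction=0.2):
    """One generation of evolution where each controller also learns during
    its lifetime via TD-learning (all helper names are hypothetical)."""
    trained = [td_learn(c) for c in population]           # lifetime learning
    scored = sorted(trained, key=evaluate, reverse=True)  # post-learning fitness
    n_elite = max(1, int(elite_fraction * len(scored)))
    elite = scored[:n_elite]
    offspring = [mutate(random.choice(elite))
                 for _ in range(len(population) - n_elite)]
    return elite + offspring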

8.1.4 Co-evolution

The topics of section 2.4.6 are explored in two sections of the innovation chapter, namely 7.1 and 7.2. In these sections we competitively co-evolve car controllers using single-population co-evolution in the track-based racing game, and multi-population co-evolution in the point-to-point racing game, respectively. The main positive conclusion of the first of these sections is that different mixes of relative and absolute progress measures in the fitness function lead to interestingly different behaviour (the videos of sneaky evolved drivers pushing each other off the track have proved quite popular on on-line video sharing sites). On the other hand, the best competitively co-evolved controllers for the two-car version of the track-based racing game were nowhere near as human-competitive as solo-evolved controllers on the solo racing version of the game. The co-evolved controllers are just not good enough; this might be due to the limited number of sensors or the limited size of the neural networks, or to pathologies of the co-evolutionary algorithm. Obviously, there is much scope for improving the performance of such controllers in new experiments, possibly using the insights gained from the work described in the next section. In that section, we show that multi-population competitive co-evolution really works, in that overall better controllers are evolved than those evolved using single-population co-evolution or solo evolution. The results also point to a few surprising effects, especially that there seems to be a correlation between larger neural networks, faster learning and lower eventual fitness - in the multi-population case, but not the single-population case! Further investigating these effects in particular, and the use of multi-population competitive co-evolution in general, is currently a top research priority.

8.2 The game AI perspective

In the last section, we asked what games can do for us (well, for computational intelligence). In this section, we will ask what we can do for games, thereby adopting the perspective of chapter 3. The three potential contributions to computer game development that can be found in this thesis are the methods for generalization and specialization, for generation of diverse opponents, and, last but not least, for personalized content creation.

8.2.1 Generalization and specialization

In section 5.2 we demonstrated that it is possible to incrementally evolve not only generally good driving behaviour, but also driving behaviour that is specialized to drive very well on particular tracks, at the expense of some general driving ability. Furthermore, while incremental generalization takes a good number of generations, the specialization process is much faster. One obvious application of this is in racing games which contain a track editor, with which the player can design his own racing tracks (several such games exist, e.g. TrackMania Nations, which currently doesn't provide computer-controlled opponents on user-created tracks). The method would be to first evolve general controllers that can race all tracks that can be developed using the track editor (subject to some constraints, such as the tracks being drivable at all). When the player then designs a track, the specialization process can be used to quickly develop opponents that can give the player a match on his own track. The drivers don't need to be evolved for maximum speed; they could instead be evolved to e.g. match the speed of the driver. Another use of the specialization process could be to measure the difficulty of the player-designed track (a sketch of this idea follows this paragraph). It could for example be defined as the time it takes to specialize to full fitness (starting from a number of different general controllers), or simply as the fitness of the specialized controller after a certain number of generations. This method could in principle be extended to many other types of games where it is possible to talk about playing skill in general and in specific cases, mainly agent games but possibly also management games.
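As an illustration of this difficulty measure (all helper names, such as specialize_one_generation and track_fitness, are hypothetical), the difficulty of a player-designed track could be estimated as follows in Python:

def track_difficulty(track, general_controllers, specialize_one_generation,
                     track_fitness, full_fitness, max_generations=100):
    """Estimate the difficulty of a player-designed track as the average number
    of specialization generations needed before a general controller reaches
    full fitness on it. Controllers that never reach full fitness simply
    contribute max_generations."""
    costs = []
    for controller in general_controllers:
        current = controller
        generations_used = max_generations
        for generation in range(1, max_generations + 1):
            current = specialize_one_generation(current, track)
            if track_fitness(current, track) >= full_fitness:
                generations_used = generation
                break
        costs.append(generations_used)
    return sum(costs) / len(costs)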

8.2.2 Generation of diverse opponents

In sections 7.1 and 7.2 we present different methods of evolving controllers which differ significantly not only in driving performance but also in driving style. One way of doing this is to tune the fitness function, mixing different amounts of relative and absolute fitness. Another way is to use multi-population competitive co-evolution, where the controllers in the different populations are forced into different ecological niches. Yet another way, which is not explored in this thesis, would be to use multi-objective evolutionary algorithms and define a number of behavioural objectives. In driving games, having a diverse set of opponents can certainly make the game more fun; there is more variety, and thus more to learn. This is also true of almost any game with more than one NPC, agent and management games alike.

8.2.3 Personalized content creation

Possibly the most innovative experiments in this thesis are reported in sections 6.1 and 7.3. In these experiments, players' driving styles are modelled, and tracks are evolved for maximal entertainment of the modelled players. This could give rise to significant add-ons to existing racing (and other) games, or inspire the creation of new types of games. The methods could be combined with generative encodings for environments, to evolve entertaining cities, battle maps, levels, clothing, etc. And who would not want a racing game where the track you are driving never repeats, but just always contains the sort of parts you really enjoy driving? The experiments on personalized content creation in this thesis are definitely exploratory rather than mature, but the scope for future research along the lines drawn up there is immense.

8.3 Summary and future research directions

Computational intelligence and computer games can be combined in a great variety of ways. In this thesis, a taxonomy has been given of approaches to computational intelligence in games, and existing research within the nascent field of computational intelligence and games has been positioned within this taxonomy. Three distinct approaches to computational intelligence in games have been identified: optimization, imitation and innovation. Within each approach computational intelligence can be used for the purposes of enhancing computer games, and likewise, computer games can also be used to enhance computational intelligence by providing stimulating problems for new and existing algorithms. We want to emphasise that by stimulating problems we mean not only benchmarks for machine learning algorithms, but also the sort of environments in which complex general intelligence might emerge, consonant with the original goal of evolutionary robotics. A number of experiments have been presented within all three approaches, with several experiments designed to be of interest both for computational intelligence research and for game development. The majority of the experiments use simulated (and in one case real) car racing as experimental environments, but experiments have also been presented that use two other physics-based agent games. The three most significant (or at least, least insignificant) contributions to science in this thesis, according to the author, have been listed in section 1.1. In short, they are the personalized content creation idea (along with a proof-of-concept player modelling and track evolution implementation) discussed in sections 6.1 and 7.3, the multi-model approach to dynamics modelling for controller evolution discussed in section 6.2, and the multi-population competitive co-evolution algorithm discussed in section 7.2.

Plenty of possibilities for taking the research presented in this thesis further have been discussed in this chapter as well as in the experimental chapters. It is our belief that the potential for evolution of complex general intelligence inherent in the car racing problem has not yet been exhausted, and that there is much good research left to do by scaling up to a slightly more complex version of the problem (e.g. allowing for more cars simultaneously on the track), allowing for higher-dimensional input, using more powerful function representations (such as complexifying neural networks) and more sophisticated co-evolutionary algorithms. And when we have evolved the best possible racing car controller for a sophisticated racing game, winning against a variety of good opponents with widely differing strategies, we can move on to the next game with even more potential - which could be an evolution of car racing, as are Wipeout or Grand Theft Auto, allowing already evolved controllers to develop incrementally. But when we have evolved the best possible racing car controller, we will already have achieved a lot - almost certainly we will have generated more complex general intelligence than has ever been automatically generated before. Such an achievement would at the very least force various research communities to take computer games more seriously, and possibly also game developers to take computational intelligence research more seriously.

However, in addition to grand visions of the future, much groundwork remains to be done. Many of the experiments described in this thesis have led to interesting results, pointing to possible new insights into the nature of evolutionary learning. These include the incremental evolution of general and specific control, the cancellation of exploitable model deficiencies through multi-model evolution, the slow learner's advantage in multi-population competitive co-evolution, and several others. However, in no case have we shown that the validity of these results extends significantly beyond the particular experiments in which they were achieved. In order to validate these results under more general sets of circumstances, they either have to be supported by theory, or by exhaustive experiments under systematically varied conditions

(preferably both). As theoretical analysis of the efficacy of evolutionary algorithms has so far had a very limited scope (e.g. to the author's knowledge, there is no convergence proof for an evolutionary algorithm on any remotely interesting problem), we will have to rely on experimentation in order to validate our results. What needs to be done is first to repeat the relevant experiments while systematically varying the parameters, e.g. redoing the multi-population competitive co-evolution experiments while varying population size, selection pressure, mutation rate, etc. Then, the same experiments would have to be redone in different domains, e.g. using a board game instead of car racing for competitive co-evolution. Only when this is done, and the outcomes of the additional experiments corroborate those of the original ones, should we be prepared to accept the (relatively) general validity of the discovered phenomena.

Bibliography

[1] Pieter Abbeel, Morgan Quigley, and Andrew Y. Ng. Using inaccurate models in reinforcement learning. In International Conference on Machine Learning (ICML), Pittsburgh, PA, USA, 2006.

[2] Alexandros Agapitos and Simon M. Lucas. Evolving a statistics class with object-oriented genetic programming. In Proceedings of EuroGP, 2007.

[3] Alexandros Agapitos, Julian Togelius, and Simon M. Lucas. Evolving controller for simulated car racing with object-oriented genetic programming. In Proceedings of the Genetic and Evolutionary Computing Conference (GECCO), 2007.

[4] Alexandros Agapitos, Julian Togelius, and Simon M. Lucas. Multiobjective techniques for the use of state in genetic programming applied to simulated car racing. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 2007.

[5] D. Ashlock, T. Manikas, and K. Ashenayi. Evolving a diverse collection of robot path planning problems. In Proceedings of the Congress On Evolutionary Computation, pages 6728–6735, 2006.

[6] Christian Baekkelund. Academic ai research and relations with the games industry. In Steve Rabin, editor, AI Game Programming Wisdom 3, 2006.

[7] Jerome H. Barkow, Leda Cosmides, and John Tooby. The Adapted Mind: Evolutionary Psychology and the Generation of Culture. Oxford University Press, 1996.

[8] Randall D. Beer. On the dynamics of small continuous-time recurrent neural networks. Adaptive Behavior, 1995.

[9] H.-G. Beyer and H.-P. Schwefel. Evolution strategies: A comprehensive introduction. Natural Computing, 1(1):3–52, 2002.

[10] Cristopher M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995.

[11] J. Bongard, V. Zykov, and H. Lipson. Resilient machines through continuous self-modeling. Science, 314:1118–1121, 2006.

[12] Josh C. Bongard. Evolving modular genetic regulatory networks. In Proceedings of the Congress on Evolutionary Computation, 2002.

[13] J. Borenstein and Y. Koren. The vector field histogram and fast obstacle-avoidance for mobile robots. IEEE Journal of Robotics and Automation, 7:278–288, 1991.

[14] Rodney Brooks. Intelligence without representation. Artificial Intelligence, 47:139–159, 1991.

[15] Bobby D. Bryant. Evolving Visibly Intelligent Behavior for Embedded Game Agents. PhD thesis, Department of Computer Sciences, University of Texas, Austin, TX, 2006.

[16] Bobby D. Bryant and Risto Miikkulainen. Exploiting sensor symmetries in example-based training for intelligent agents. In Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG), 2006.

[17] Gunnar Buason, Nicklas Bergfeldt, and Tom Ziemke. Brains, bodies, and beyond: Competitive co-evolution of robot controllers, morphologies and environments. Genetic Programming and Evolvable Machines, 6(1):25–51, 2005.

[18] Mat Buckland. Interview with Jeff Hannan, publication date unknown.

[19] Seth Bullock. Co-evolutionary design: Implications for evolutionary robotics. Technical Report CSRP384, 1995.

[20] Raffaele Calabretta, Andrea Di Fernando, Gunter P. Wagner, and Domenico Parisi. What does it take to evolve behaviorally complex organisms? BioSystems, 2002.

[21] Raffaele Calabretta, Stefano Nolfi, Domenici Parisi, and Gunter P Wagner. Duplication of modules facilitates functional specialization. Artificial Life, 6:69–84, 2000.

[22] Werner Callebaut and Diego Rasskin-Gutman, editors. Modularity: Understanding the Development and Evolution of Natural Complex Systems. MIT Press, 2005.

[23] Angelo Cangelosi, Domenico Parisi, and Stefano Nolfi. Cell division and migration in a genotype for neural networks. Technical report, Institute of Psychology, CNR, Rome, 1993.

[24] Benoit Chaperot and Colin Fyfe. Improving artificial intelligence in a motocross game. In IEEE Symposium on Computational Intelligence and Games, 2006.

[25] P. S. Churchland, V. S. Ramachandran, and T. J. Sejnowski. A critique of pure vision. In Christof Koch and Joell Davis, editors, Large-scale neuronal theories of the brain. MIT Press, 1994.

[26] Andy Clark. Being There: Putting Brain, Body and World Together Again. MIT Press, 1998.

[27] Dave Cliff. Computational neuroethology: a provisional manifesto. In Proceedings of the first international conference on simulation of adaptive behavior on From animals to animats, pages 29–39, 1991.

[28] Dave Cliff. Neuroethology, computational. In Michael A. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 737–741. MIT Press, 2003.

[29] Dave Cliff, Inman Harvey, and Phil Husbands. Artificial evolution of visual control sys- tems for robots. In M. Srinivasan and S. Venkatesh, editors, From Living Eyes to Seeing Machines. Oxford University Press, 1997.

[30] Nicholas Cole, Sushil J. Louis, and Chris Miles. Using a genetic algorithm to tune first-person shooter bots. In Proceedings of the IEEE Congress on Evolutionary Computation, pages 139–145, 2004.

[31] Peter Cowling. Writing ai as sport. In Steve Rabin, editor, AI Game Programming Wisdom 3, 2006.

[32] Peter Cowling, Richard Fennell, Robert Hogg, Gavin King, Paul Rhodes, and Nick Sephton. Using bugs and viruses to teach artificial intelligence. In Proceedings of The International Conference on Computer Games: Artificial Intelligence, Design and Education (CGAIDE), 2004.

[33] P. J. Darwen. Why co-evolution beats temporal difference learning at backgammon for a linear architecture, but not a non-linear architecture. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), pages 1003–1010, 2001.

[34] Richard Dawkins and John R. Krebs. Arms races between and within species. Proceedings of the Royal Society of London B, 205:489–511, 1979.

[35] Peter Dayan and L. F. Abbott. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, 2001.

[36] Renzo De Nardi, Julian Togelius, Owen Holland, and Simon M. Lucas. Evolution of neural networks for helicopter control: Why modularity matters. In Proceedings of the IEEE Congress on Evolutionary Computation, 2006.

[37] Jörg Denzinger, Kevin Loose, Darryl Gates, and John Buchanan. Dealing with parameterized actions in behavior testing of commercial computer games. In Proceedings of the IEEE 2005 Symposium on Computational Intelligence and Games (CIG), pages 37–43, 2005.

[38] Ezequiel A. Di Paolo. Homeostatic adaptation to inversion of the visual field and other sensorimotor disruptions. In Proceedings of the Conference on the Simulation of Adaptive Behavior (SAB), pages 440–449, 2000.

[39] M. Dorigo, V. Trianni, E. Sahin, R. Gross, T. H. Labella, G. Baldassarre, S. Nolfi, J.-L. Deneubourg, F. Mondada, D. Floreano, and L. M. Gambardella. Evolving self-organizing behaviors for a swarmbot. Autonomous Robots, 17(2-3):223–245, 2004.

[40] A. E. Eiben and J. E. Smith. Introduction to Evolutionary Computing. Springer, 2003.

[41] Jeffrey Elman. Finding structure in time. Cognitive Science, 14:179–211, 1990.

[42] Andrea Di Fernando, Raffaele Calabretta, and Domenico Parisi. Evolving modular architectures for neural networks. In R French and J Sougn, editors, Proceedings of the sixth Neural Computation and Psychology Workshop: Evolution, Learning, and Development, pages 253–262, London, 2000. Springer Verlag.

[43] Dario Floreano, Toshifumi Kato, Davide Marocco, and Eric Sauser. Coevolution of active vision and feature selection. Biological Cybernetics, 90:218–228, 2004.

[44] Dario Floreano and Francesco Mondada. Evolution of plastic neurocontrollers for situated agents. In P. Maes, M. Mataric, J. Meyer, J. Pollack, H. Roitblat, and S. Wilson, editors, From Animals to Animats IV: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, 1996. MIT Press-Bradford Books.

[45] Jerry Fodor. The modularity of mind. The MIT Press, 1983.

[46] David B. Fogel. Blondie24: playing at the edge of AI. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002.

[47] J.C. Gallagher, Randall D. Beer, K. S. Espenschied, and R. D. Quinn. Application of evolved locomotion controllers to a hexapod robot. Robotics and Autonomous Systems, 18:59–64, 1996.

[48] Matt Gilgenbach. Fun game ai design for beginners. In Steve Rabin, editor, AI Game Programming Wisdom 3, 2006.

[49] Faustino Gomez and Risto Miikkulainen. Incremental evolution of complex general behavior. Adaptive Behavior, 5:317–342, 1997.

[50] Faustino Gomez and Risto Miikkulainen. Active guidance for a finless rocket through neuroevolution. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2003.

[51] Faustino Gomez, Juergen Schmidhuber, and Risto Miikkulainen. Efficient non-linear control through neuroevolution. In Proceedings of the European Conference on Machine Learning (ECML), 2006.

[52] Thore Graepel, Ralf Herbrich, and Julian Gold. Learning to fight. In Proceedings of the International Conference on Computer Games: Artificial Intelligence, Design and Education, 2004.

[53] F. Gruau. Neural network synthesis using cellular encoding and the genetic algorithm. PhD thesis, Ecole Normale Superieure de Lyon, 1994.

[54] Frederic Gruau. Genetic synthesis of modular neural networks. In Proceedings of the Fifth International Conference on Genetic Algorithms, pages 318–325. Morgan Kaufmann, 1993.

[55] Stevan Harnad. The symbol grounding problem. Physica D, 42:335–346, 1990.

[56] Inman Harvey, Phil Husbands, and Dave Cliff. Issues in evolutionary robotics. In Proceedings of the Conference on Simulation of Adaptive Behavior (SAB), 1993.

[57] Inman Harvey, Phil Husbands, Dave Cliff, Adrian Thompson, and Nick Jakobi. Evolutionary robotics: the Sussex approach. Robotics and Autonomous Systems, 20:205–224, 1997.

[58] Ralf Herbrich. (personal communication), 2006.

[59] W. Daniel Hillis. Co-evolving parasites improve simulated evolution as an optimization procedure. In Proceedings of the ninth annual international conference of the Center for Nonlinear Studies on Self-organizing, Collective, and Cooperative Phenomena in Natural and Artificial Computing Networks on Emergent computation, pages 228–234, 1990.

[60] Owen E. Holland. Machine Consciousness. Imprint Academic, 2003.

[61] Gregory S. Hornby, Hod Lipson, and Jordan B. Pollack. Evolution of generative design systems for modular robots. In IEEE Conference on Robotics and Automation, 2001.

[62] Gregory S. Hornby and Brian Mirtich. Comparing diffuse and true coevolution in a physics- based world. Technical Report TR-98-11, 1999.

[63] Michael Husken, Christian Igel, and Marc Toussaint. Task-dependent evolution of modularity in neural networks. Connection Science, 14:219–229, 2002.

[64] Kwang-Young Im, Se-Young Oh, and Seong-Joo Han. Evolving a modular neural network-based behavioral fusion using extended VFF and environment classification for mobile robot navigation. IEEE Transactions on Evolutionary Computation, 6:413–419, 2002.

[65] Nick Jakobi. Evolutionary robotics and the radical envelope-of-noise hypothesis. Adaptive Behavior, 6(2):325–368, 1997.

[66] Nick Jakobi. Harnessing morphogenesis. In Proceedings of Information Processing in Cells and Tissues, 1997.

[67] D. H. Janzen. When is it co-evolution? Evolution, 34(3):611–612, 1980.

[68] Dan-Anders Jirenhed, Germund Hesslow, and Tom Ziemke. Exploring internal simulation of perception in mobile robots. In Proceedings of the Fourth European Workshop on Advanced Mobile Robots, pages 107–113, 2001.

[69] H. Kitano. Designing neural networks using genetic algorithms with graph generation system. Complex Systems, 4:461–476, 1990.

[70] Jerome Kodjabachian and Jean-Arcady Meyer. Evolution and development of modular control architectures for 1d locomotion in six-legged animats. Connection Science, 10(3- 4):211–237, 1998.

[71] Raph Koster. A theory of fun for game design. Paraglyph press, 2005.

[72] C. Kotnik and J. Kalita. The significance of temporal-difference learning in self-play training: TD-rummy versus EVO-rummy. In Proceedings of the International Conference on Machine Learning (ICML), pages 369 – 375, 2003.

[73] John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.

[74] John R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, 1994.

[75] John R. Koza. Genetic Programming III: Darwinian Invention and Problem Solving. MIT Press, 1999.

[76] John E. Laird and Michael van Lent. Human-level AI's killer application: Interactive computer games. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, pages 1171–1178, 2001.

[77] Simon M. Lucas. Cellz: A simple dynamic game for testing evolutionary algorithms. In Proceedings of CEC 04, 2004.

[78] Simon M. Lucas. Exploiting reflection in object-oriented genetic programming. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 2004.

[79] Simon M. Lucas and Thomas P. Runarsson. Temporal difference learning versus co-evolution for acquiring Othello position evaluation. In IEEE Symposium on Computational Intelligence and Games (CIG), 2006.

[80] Simon M. Lucas and Julian Togelius. Point-to-point car racing: an initial study of evolution versus temporal difference learning. In Proceedings of the IEEE Symposium on Computational Intelligence and Games, 2007.

[81] Thomas W. Malone. What makes things fun to learn? heuristics for designing instructional computer games. In Proceedings of the 3rd ACM SIGSMALL symposium and the first SIGPC symposium on Small systems, pages 162–169, 1980.

[82] Hugo Marques and Owen Holland. Minimal architectures for embodied imagination. In Proceedings of Brain Inspired Cognitive Systems 2006, Lesvos, Greece, 2006.

[83] Hugo Marques, Julian Togelius, Magdalena Kogutowska, Owen Holland, and Simon M. Lucas. Sensorless but not senseless: prediction in evolutionary car racing. In Proceedings of the IEEE Symposium on Artificial Life, 2007.

[84] James Matthews. Interview with jeff hannan, 2001.

[85] Warren McCulloch and Walter Pitts. A logical calculus of ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:127–147, 1943.

[86] Maurice Merleau-Ponty. Phenomenology of Perception. Routledge, 1945.

[87] B Mettler, M B Tischler, and T Kanade. System identification modeling of a small-scale unmanned rotorcraft for flight control design. Journal of the American Helicopter Society, 47(1):50–63, 2002.

[88] Thomas Miconi and Alastair Channon. The n-strikes-out algorithm: A steady-state algorithm for coevolution. In Proceedings of the IEEE Congress on Evolutionary Computation, pages 1639–1646, 2006.

[89] Microsoft. Terrarium, 2005.

[90] Chris Miles and Sushil J. Louis. Towards the co-evolution of influence map tree based strategy game players. In Proceedings of the IEEE Symposium on Computational Intelligence and Games, 2006.

[91] Tom Mitchell. Machine Learning. McGraw-Hill, 1997.

[92] Francesco Mondada, Edoardo Franzi, and Paolo Ienne. Mobile robot miniaturisation: A tool for investigation in control algorithms. In Proceedings of the 3rd International Symposium on Experimental Robotics, 1993.

[93] Alberto Moraglio and Julian Togelius. Geometric particle swarm optimization for the Sudoku puzzle. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2007.

[94] Alberto Moraglio, Julian Togelius, and Simon M. Lucas. Product geometric crossover for the Sudoku puzzle. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 2006.

[95] David E. Moriarty and Risto Miikkulainen. Efficient reinforcement learning through symbiotic evolution. Machine Learning, 22(1-3):11–32, 1996.

[96] Monroe Newborn. Kasparov versus Deep Blue: Computer Chess Comes of Age. Springer, 1997.

[97] Alva Noë. Action in Perception. MIT Press, 2005.

[98] Stefano Nolfi. Evolutionary robotics: Exploiting the full power of self-organization. Connection Science, 10(3-4), 1998.

[99] Stefano Nolfi. Evolving robots able to self-localize in the environment: the importance of viewing cognition as the result of processes occurring at different timescales. Connection Science, 14(3):231–244, 2002.

[100] Stefano Nolfi. Power and the limits of reactive agents. Neurocomputing, 42(1-4):119–145, 2002.

[101] Stefano Nolfi and Dario Floreano. Coevolving predator and prey robots: Do "arms races" arise in artificial evolution? Artificial Life, 4:311–335, 1998.

[102] Stefano Nolfi and Dario Floreano. Evolutionary Robotics. MIT Press, Cambridge, MA, 2000.

[103] Stefano Nolfi and Domenico Parisi. Evolving non-trivial behaviors on real robots: an autonomous robot that picks up objects. In Proceedings of the 4th Congress of the Italian Association of Artificial Intelligence, pages 243–254, 1995.

[104] Andrew Parker. In the blink of an eye. Gardner books, 2003.

[105] Gary B. Parker and Matt Parker. Evolving parameters for Xpilot combat agents. In Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG), 2007.

[106] Matt Parker and Gary B. Parker. The evolution of multi-layer neural networks for the control of Xpilot agents. In Proceedings of the IEEE Symposium on Computational Intelligence and Games, 2007.

[107] Rolf Pfeifer and Josh Bongard. How the Body Shapes the Way We Think: A New View of Intelligence. MIT Press, 2006.

[108] Jordan B. Pollack and A. D. Blair. Co-evolution in the successful learning of backgammon strategy. Machine Learning, 32:225–240, 1998.

[109] Dean A. Pomerleau. Neural network vision for robot driving. In The Handbook of Brain Theory and Neural Networks, 1995.

[110] Steffen Priesterjahn, Kramer, Alexander Weimer, and Andreas Goebels. Evolution of human-competitive agents in modern computer games. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 2007.

[111] Steve Rabin. AI Game Programming Wisdom 3. Charles River Media, 2006.

[112] Hilary Rose and Steven Rose, editors. Alas, Poor Darwin: Arguments Against Evolutionary Psychology. Harmony, 2000.

[113] Christopher Rosin and Richard Belew. New methods for competitive coevolution. Evolutionary Computation, 5(1), 1996.

[114] D. E. Rumelhart and J. L. McClelland. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, 1986.

[115] Thomas P. Runarsson and Simon M. Lucas. Co-evolution versus self-play temporal difference learning for acquiring position evaluation in small-board Go. IEEE Transactions on Evolutionary Computation, 9:628–640, 2005.

[116] Arthur Samuel. Some studies in machine learning using the game of checkers. IBM Journal, 3(3):210–229, 1959.

[117] Nicol N. Schraudolph, Peter Dayan, and Terrence J. Sejnowski. Temporal difference learning of position evaluation in the game of Go. In J. D. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing 6, pages 817–824. Morgan Kaufmann, San Francisco, 1994.

[118] Abdul A. Siddiqi and Simon M. Lucas. A comparison of matrix rewriting versus direct encoding for evolving neural networks. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), pages 392–397, 2006.

[119] B. F. Skinner. The Behavior of Organisms: An Experimental Analysis. Appleton-Century-Crofts, 1938.

[120] B. F. Skinner. Teaching machines. Science, 128(3330):969–977, 1958.

[121] Pieter Spronck. Adaptive Game AI. PhD thesis, University of Maastricht, 2005.

[122] Kenneth Stanley. Practical issues in evolving neural network controllers for video game agents. In IEEE Symposium on Computational Intelligence and Games Tutorial CD, 2007.

[123] Kenneth O. Stanley. Efficient evolution of neural networks through complexification. PhD thesis, Department of Computer Sciences, University of Texas, Austin, TX, 2004.

[124] Kenneth O. Stanley, Bobby D. Bryant, and Risto Miikkulainen. Real-time neuroevolution in the NERO video game. IEEE Transactions on Evolutionary Computation, 9(6):653–668, 2005.

[125] Kenneth O. Stanley, Nate Kohl, Rini Sherony, and Risto Miikkulainen. Neuroevolution of an automobile crash warning system. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2005), 2005.

[126] Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2):99–127, 2002.

[127] Kenneth O. Stanley and Risto Miikkulainen. A taxonomy for artificial embryogeny. Arti- ficial Life, 9:93–130, 2003.

[128] Kenneth O. Stanley and Risto Miikkulainen. Evolving a roving eye for Go. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2004.

[129] Kenneth O. Stanley and Risto Miikkulainen. Competitive coevolution through evolutionary complexification. Journal of Artificial Intelligence Research, 21:63–100, 2004.

[130] R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.

[131] Ivan Tanev, M. Joachimczak, H. Hemmi, and K. Shimohara. Evolution of the driving styles of anticipatory agent remotely operating a scaled model of racing car. In Proceedings of the 2005 IEEE Congress on Evolutionary Computation (CEC-2005), pages 1891–1898, 2005.

[132] Matthew E. Taylor, Shimon Whiteson, and Peter Stone. Comparing evolutionary and temporal difference methods in a reinforcement learning domain. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2006), pages 1321–1328, 2006.

[133] A. Teller. The evolution of mental models. In Advances in Genetic Programming, pages 199–219. MIT Press, 1994.

[134] G. Tesauro. Temporal difference learning and TD-gammon. Communications of the ACM, 38(3):58–68, 1995.

[135] G. Tesauro. Comments on “Co-Evolution in the Successful Learning of Backgammon Strategy”. Machine Learning, 32(3):241–243, 1998.

[136] Julian Togelius. Evolution of the layers in a subsumption architecture robot controller. Master’s thesis, University of Sussex, Brighton, 2003.

[137] Julian Togelius. Evolution of a subsumption architecture neurocontroller. Journal of Intelligent and Fuzzy Systems, 15:15–20, 2004.

[138] Julian Togelius, Peter Burrow, and Simon M. Lucas. Multi-population competitive co-evolution of car racing controllers. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 2007.

[139] Julian Togelius, Renzo De Nardi, and Simon M. Lucas. Making racing fun through player modeling and track evolution. In Proceedings of the SAB'06 Workshop on Adaptive Approaches for Optimizing Player Satisfaction in Computer and Physical Games, 2006.

[140] Julian Togelius, Renzo De Nardi, and Simon M. Lucas. Towards automatic personalised content creation in racing games. In Proceedings of the IEEE Symposium on Computational Intelligence and Games, 2007.

[141] Julian Togelius, Renzo De Nardi, Hugo Marques, Richard Newcombe, Simon M. Lucas, and Owen Holland. Nonlinear dynamics modelling for controller evolution. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2007.

[142] Julian Togelius and Simon M. Lucas. Evolving controllers for simulated car racing. In Proceedings of the Congress on Evolutionary Computation, 2005.

[143] Julian Togelius and Simon M. Lucas. Forcing neurocontrollers to exploit sensory symmetry through hard-wired modularity in the game of Cellz. In Proceedings of the IEEE 2005 Symposium on Computational Intelligence and Games (CIG05), pages 37–43, 2005.

[144] Julian Togelius and Simon M. Lucas. Arms races and car races. In Proceedings of Parallel Problem Solving from Nature. Springer, 2006.

[145] Julian Togelius and Simon M. Lucas. Evolving robust and specialized car racing skills. In Proceedings of the IEEE Congress on Evolutionary Computation, 2006.

[146] Edward P. K. Tsang, Sheri Markose, Hakan Er, Abdel Salhi, and Giulia Iori. EDDIE in financial decision making. Journal of Management and Economics, 2000.

[147] Elio Tuci, Matt Quinn, and Inman Harvey. An evolutionary ecological approach to the study of learning using a robot based model, 2003.

[148] Eric Vaughan. Bilaterally symmetric segmented neural networks for multi-jointed arm articulation. http://www.droidlogic.com/sussex/papers.html, 2003.

[149] Richard A. Watson and Jordan B. Pollack. Coevolutionary dynamics in a minimal substrate. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), pages 702–709, 2001.

[150] Ludwig Wittgenstein. Philosophical Investigations. Blackwell, 1953.

[151] Krzysztof Wloch and Peter J. Bentley. Optimising the performance of a Formula One car using a genetic algorithm. In Proceedings of the Eighth International Conference on Parallel Problem Solving From Nature, pages 702–711, 2004.

[152] Georgios N. Yannakakis and John Hallam. Towards capturing and enhancing entertainment in computer games. In Proceedings of the Hellenic Conference on Artificial Intelligence, pages 432–442, 2006.

[153] Georgios N. Yannakakis and Manolis Maragoudakis. Player modeling impact on player's entertainment in computer games. In User Modeling, pages 74–78, 2005.

[154] Georgios N. Yannakakis. AI in Computer Games: Generating Interesting Interactive Opponents by the use of Evolutionary Computation. PhD thesis, University of Edinburgh, 2005.

[155] Xin Yao. Evolving artificial neural networks. Proceedings of the IEEE, 87(9):1423–1447, 1999.

[156] V. Zykov, J. C. Bongard, and H. Lipson. Evolving dynamic gaits on a physical robot. In Late Breaking Papers for the Genetic and Evolutionary Computation Conference, 2004.
