Field Guide to Genetic Programming Nicholas Freitag Mcphee
Total Page:16
File Type:pdf, Size:1020Kb
University of Minnesota Morris Digital Well University of Minnesota Morris Digital Well Computer Science Publications Faculty and Staff choS larship 3-2008 Field Guide to Genetic Programming Nicholas Freitag McPhee Riccardo Poli William B. Langdon Follow this and additional works at: https://digitalcommons.morris.umn.edu/cs_facpubs Part of the Computer Sciences Commons Recommended Citation R. Poli, W. B. Langdon, and N. F. McPhee. A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza). GPBiB This Book is brought to you for free and open access by the Faculty and Staff choS larship at University of Minnesota Morris Digital Well. It has been accepted for inclusion in Computer Science Publications by an authorized administrator of University of Minnesota Morris Digital Well. For more information, please contact [email protected]. A Field Guide to Genetic Programming Riccardo Poli Department of Computing and Electronic Systems University of Essex – UK [email protected] William B. Langdon Departments of Biological and Mathematical Sciences University of Essex – UK [email protected] Nicholas F. McPhee Division of Science and Mathematics University of Minnesota, Morris – USA [email protected] with contributions by John R. Koza Stanford University – USA [email protected] March 2008 c Riccardo Poli, William B. Langdon, and Nicholas F. McPhee, 2008 This work is licensed under the Creative Commons Attribution- Noncommercial-No Derivative Works 2.0 UK: England & Wales License (see http://creativecommons.org/licenses/by-nc-nd/2.0/uk/). That is: You are free: to copy, distribute, display, and perform the work Under the following conditions: Attribution. You must give the original authors credit. Non-Commercial. You may not use this work for commercial purposes. No Derivative Works. You may not alter, transform, or build upon this work. For any reuse or distribution, you must make clear to others the licence terms of this work. Any of these conditions can be waived if you get permission from the copyright holders. Nothing in this license impairs or restricts the authors’ rights. Non-commercial uses are thus permitted without any further authorisation from the copyright owners. The book may be freely downloaded in electronic form at http://www.gp-field-guide.org.uk. Printed copies can also be purchased inexpensively from http://lulu.com. For more information about Creative Commons licenses, go to http://creativecommons.org or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. To cite this book, please see the entry for (Poli, Langdon, and McPhee, 2008) in the bibliography. ISBN 978-1-4092-0073-4 (softcover) Preface Genetic programming (GP) is a collection of evolutionary computation tech- niques that allow computers to solve problems automatically. Since its in- ception twenty years ago, GP has been used to solve a wide range of prac- tical problems, producing a number of human-competitive results and even patentable new inventions. Like many other areas of computer science, GP is evolving rapidly, with new ideas, techniques and applications being con- stantly proposed. While this shows how wonderfully prolific GP is, it also makes it difficult for newcomers to become acquainted with the main ideas in the field, and form a mental map of its different branches. Even for people who have been interested in GP for a while, it is difficult to keep up with the pace of new developments. Many books have been written which describe aspects of GP. Some provide general introductions to the field as a whole. However, no new introductory book on GP has been produced in the last decade, and anyone wanting to learn about GP is forced to map the terrain painfully on their own. This book attempts to fill that gap, by providing a modern field guide to GP for both newcomers and old-timers. It would have been straightforward to find a traditional publisher for such a book. However, we want our book to be as accessible as possible to every- one interested in learning about GP. Therefore, we have chosen to make it freely available on-line, while also allowing printed copies to be ordered in- expensively from http://lulu.com. Visit http://www.gp-field-guide. org.uk for the details. The book has undergone numerous iterations and revisions. It began as a book-chapter overview of GP (more on this below), which quickly grew to almost 100 pages. A technical report version of it was circulated on the GP mailing list. People responded very positively, and some encouraged us to continue and expand that survey into a book. We took their advice and this field guide is the result. Acknowledgements We would like to thank the University of Essex and the University of Min- nesota, Morris, for their support. Many thanks to Tyler Hutchison for the use of his cool drawing on the cover (and elsewhere!), and for finding those scary pinks and greens. We had the invaluable assistance of many people, and we are very grateful for their individual and collective efforts, often on very short timelines. Rick Riolo, Matthew Walker, Christian Gagne, Bob McKay, Giovanni Pazienza, and Lee Spector all provided useful suggestions based on an early techni- cal report version. Yossi Borenstein, Caterina Cinel, Ellery Crane, Cecilia Di Chio, Stephen Dignum, Edgar Galv´an-L´opez, Keisha Harriott, David Hunter, Lonny Johnson, Ahmed Kattan, Robert Keller, Andy Korth, Yev- geniya Kovalchuk, Simon Lucas, Wayne Manselle, Alberto Moraglio, Oliver Oechsle, Francisco Sepulveda, Elias Tawil, Edward Tsang, William Tozier and Christian Wagner all contributed to the final proofreading festival. Their sharp eyes and hard work did much to make the book better; any remaining errors or omissions are obviously the sole responsibility of the authors. We would also like to thank Prof. Xin Yao and the School of Computer Science of The University of Birmingham and Prof. Bernard Buxton of Uni- versity College, London, for continuing support, particularly of the genetic programming bibliography. We also thank Schloss Dagstuhl, where some of the integration of this book took place. Most of the tools used in the construction of this book are open source,1 and we are very grateful to all the developers whose efforts have gone into building those tools over the years. As mentioned above, this book started life as a chapter. This was for a forthcoming handbook on computational intelligence2 edited by John Fulcher and Lakhmi C. Jain. We are grateful to John Fulcher for his useful comments and edits on that book chapter. We would also like to thank most warmly John Koza, who co-authored the aforementioned chapter with us, and for allowing us to reuse some of his original material in this book. This book is a summary of nearly two decades of intensive research in the field of genetic programming, and we obviously owe a great debt to all the researchers whose hard work, ideas, and interactions ultimately made this book possible. Their work runs through every page, from an idea made somewhat clearer by a conversation at a conference, to a specific concept or diagram. It has been a pleasure to be part of the GP community over the years, and we greatly appreciate having so much interesting work to summarise! March 2008 Riccardo Poli William B. Langdon Nicholas Freitag McPhee 1See the colophon (page 235) for more details. 2Tentatively entitled Computational Intelligence: A Compendium and to be pub- lished by Springer in 2008. What’s in this book The book is divided up into four parts. Part I covers the basics of genetic programming (GP). This starts with a gentle introduction which describes how a population of programs is stored in the computer so that they can evolve with time. We explain how programs are represented, how random programs are initially created, and how GP creates a new generation by mutating the better existing programs or com- bining pairs of good parent programs to produce offspring programs. This is followed by a simple explanation of how to apply GP and an illustrative example of using GP. In Part II, we describe a variety of alternative representations for pro- grams and some advanced GP techniques. These include: the evolution of machine-code and parallel programs, the use of grammars and probability distributions for the generation of programs, variants of GP which allow the solution of problems with multiple objectives, many speed-up techniques and some useful theoretical tools. Part III provides valuable information for anyone interested in using GP in practical applications. To illustrate genetic programming’s scope, this part contains a review of many real-world applications of GP. These in- clude: curve fitting, data modelling, symbolic regression, image analysis, signal processing, financial trading, time series prediction, economic mod- elling, industrial process control, medicine, biology, bioinformatics, hyper- heuristics, artistic applications, computer games, entertainment, compres- sion and human-competitive results. This is followed by a series of recom- mendations and suggestions to obtain the most from a GP system. We then provide some conclusions. Part IV completes the book. In addition to a bibliography and an index, this part includes two appendices that provide many pointers to resources, further reading and a simple GP implementation in Java. About the authors The authors are experts in genetic programming with long and distinguished track records, and over 50 years of combined experience in both theory and practice in GP, with collaborations extending over a decade. Riccardo Poli is a Professor in the Department of Computing and Elec- tronic Systems at Essex. He started his academic career as an electronic en- gineer doing a PhD in biomedical image analysis to later become an expert in the field of EC.