Genetic Algorithms for the Exploration of Parameter Spaces in Agent-Based Models
NORTHWESTERN UNIVERSITY

Genetic Algorithms for the Exploration of Parameter Spaces in Agent-Based Models

A DISSERTATION SUBMITTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS for the degree DOCTOR OF PHILOSOPHY

Field of Computer Science

By Forrest J. Stonedahl

EVANSTON, ILLINOIS
December 2011

© Copyright by Forrest J. Stonedahl 2011
All Rights Reserved

ABSTRACT

Genetic Algorithms for the Exploration of Parameter Spaces in Agent-Based Models

Forrest J. Stonedahl

This work provides the first comprehensive investigation of the use of genetic algorithms for exploring the range of behaviors produced by agent-based models. Agent-based modeling (ABM) is a powerful computer simulation technique in which many agents interact according to simple rules resulting in the emergence of complex aggregate-level behavior. However, as ABM is increasingly employed in both natural and social sciences, the methods and tools for understanding, exploring, and analyzing the behavior of agent-based models have not kept pace. In particular, models may be characterized by a large number of parameters, and the task of discovering parameter settings for which a model will produce a certain behavior is both difficult and time-consuming. Genetic algorithms (GAs) offer a flexible metaheuristic search mechanism which has previously been successful in a variety of combinatorial optimization and search problems. There is a rich space of possible model exploration tasks, and we offer a new unified framework for the creation and application of quantitative measures to perform these tasks using an evolutionary-search paradigm. We demonstrate the utility of GAs for ABM parameter exploration through a sequence of case studies in various application domains, including behavioral biology, viral marketing, archeology, and web-based journalism.
This work advances agent-based modeling methodology by exploring when and how GAs can be useful in the model development and analysis process. It also contributes to a deeper understanding of GAs, by evaluating their strengths and weaknesses with regard to the particular challenges posed by this problem domain. We develop novel heuristics for dealing with model stochasticity in conjunction with fitness caching techniques. We also present the first set of benchmark models/tasks for automated ABM parameter search and exploration, and we rigorously investigate the performance of GAs on these benchmarks, with varying levels of stochasticity. An important product of this research is BehaviorSearch, a new automated software tool for efficient exploration of ABM parameter spaces. The design and affordances of BehaviorSearch are discussed with respect to improving model exploration and analysis by ABM practitioners.

Acknowledgments

First and foremost, I thank my wonderful wife Susa, who insisted that I not get all mushy in these acknowledgements. So I won't. There is insufficient space to give her the thanks she deserves in any case.

Second and still foremost, I thank our Siamese cat, Gabby, who insisted that I thank her in these acknowledgements. Either that or she was just meowing because I forgot to feed her in my thesis-writing haze. Sometimes it's hard to know.

It goes without saying that this thesis could never have come to fruition without the support of my advisor, Uri Wilensky, who first introduced me to the marvelous world of agent-based modeling, complex systems, and emergence. Uri also possesses an uncanny ability to gather together a research group composed of some of the most intelligent, quirky, and amazing people I have ever met. I am deeply honored to have been a part of the Center for Connected Learning and Computer-Based Modeling these past years. It has been a place of much intellectual growth, and I will never forget it.
I thank everyone in the CCL for the thought-provoking conversations, the good advice, the frenetic Google-document-sharing chats, the late-night 3D-printer surgeries, the inspiration to improve education and technology, and all of the laughter along the way. I particularly wish to single out Michelle Wilkerson-Jerde, who walked side by side with me this past year on the simultaneously treacherous paths of thesis writing and academic job searching. Special thanks also go to Josh Unterman and Daniel Kornhauser who each, in their own inimitable ways, shared oysters of wisdom with me.

I am also deeply grateful to the other members of my committee: Doug Downey, who always gave me sensible advice whenever I asked[1], Luis Amaral, who taught me about networks (and whose book recommendations were superlative[2]), and Bill Rand, with whom I have had the pleasure of collaborating on numerous evolutionary computation projects, and who has served as a mentor to me on countless occasions.

Thanks to my father for writing his daily blog, and helping keep this Forrest in touch with his Idaho roots. Thanks also to my mother for unswervingly positive support, and proofreading portions of this thesis, as well as several of my other papers along the way. (Who besides a loving mother could enjoy reading a stream of text filled with largely unintelligible jargon?) I also thank my brother Birrion, whose ski bum philosophy on life helps to counterbalance my workaholic nature, and thus I'm sure maintains some crucial harmony of the universe. I would also like to mention Marion, Joyce and Jack, Karline, Susan, Peggy, and all those among my extended family and in-laws who have been supportive during my graduate school years.

I thank my friend Dave Ohls, for the many empathetic and commiserative chats about the ups and downs (perhaps mostly the downs) of graduate school.
And David and Laura Little for their noble attempts to preserve my sanity with the occasional enjoyable evening of card/board games. Thanks also to Valdis Krebs, for his stimulating email discussions and his generosity in sharing an interesting dataset with me. I thank Janet Pierrehumbert and Robert Daland, for their collaboration on the linguistics modeling project that is briefly mentioned in this text. I thank Bryan Pardo for wandering into my office with occasional advice on teaching, research, job hunting, and random trivia. I am very grateful to NICO (the Northwestern Institute on Complex Systems) for fostering interdisciplinary research and collaboration at Northwestern.

[1] proving that I should have asked more often...
[2] see, e.g., Influence: Science and Practice by Robert Cialdini

I also must thank the miracles of modern technology: this thesis work would never have been possible without many, many CPU-hours churning through millions of agent-based simulations. For the availability of these computational resources I especially want to recognize: Luis Amaral, who generously provided time on his computing cluster, Northwestern's Social Sciences Computing Cluster, the University of Maryland's OIT HPCC, and Northwestern's new Quest high performance computing cluster (which executed the bulk of the experiments reported herein).

I am also thankful to several sources of funding, which supported this work at various times and in various ways: specifically, the National Science Foundation (grant IIS-0713619), a Murphy Society grant from the McCormick School of Engineering, and a Google and WPP Marketing Research Award. Also, I am grateful for the generosity of both The Graduate School and the Cognitive Science Program at Northwestern, in providing supplementary travel grant awards that allowed me to present this work at several conferences and workshops, and thus receive constructive feedback from my academic peers at other institutions.
Finally, I thank everyone who I meant to thank but have forgotten in these acknowledgements (which is now, by virtue of self-reference, the empty set).

Preface

"... from so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved."
- Charles Darwin

Though I can't remember precise words and must plead for some artistic license as I paraphrase them below, I can still vividly recall one afternoon several years ago in my thesis advisor's office, having just expressed a bit of the traditional graduate student angst about my chosen thesis topic.

"Well," my advisor responded, "it all depends on whether you want to have a tidy thesis or a messy thesis." (I must have looked aghast; after all, who in their right mind would want to produce a messy thesis?) "Don't get me wrong", he continued, "my personal preference is toward messy theses. Tidy theses are narrowly-defined, for example: 'we present a novel algorithm that performs X% better than all previous algorithms for a specific problem Y'. Messy theses are much broader in scope, expressing powerful ideas and looking for the big picture."

Both types of theses are valuable. In the field of artificial intelligence, we often discuss problem solving strategies in terms of exploration versus exploitation. For instance, if you are going out to dinner, should you order a slight variation of your favorite dish (exploitation of previous knowledge with high likelihood of modest reward), or try a new and exotic dish (exploration in the hope of discovering something far better, but the reward is uncertain)? Is it better to take a well-known travel route and refine it, or go out hunting for that elusive 'Northwest Passage', which might or might not exist? All good theses contribute something new to their discipline, and the research process always contains some mix of exploration and exploitation.
But in my view, given this spectrum, "tidy theses" put more emphasis on the exploitation side, while "messy theses" expend more effort exploring.

Perhaps as a result of my name, I also have a penchant for arboreal analogies. "Tidy theses" are akin to pine trees, thin spikes reaching straight up toward the sky. "Messy theses" resemble oak trees, spreading out as they reach upward. Or possibly mangroves, which also reach down, around, and every which way with their roots.