Towards a Data-Driven Approach for Agent-Based Modelling: Simulating Spanish Postmodernisation
Total Page:16
File Type:pdf, Size:1020Kb
UNIVERSIDAD COMPLUTENSE DE MADRID FACULTAD DE INFORMÁTICA Departamento de Ingeniería del Software e Inteligencia Artificial TESIS DOCTORAL Towards a data-driven approach for Agent-Based Modelling: Simulating Spanish postmodernisation MEMORIA PARA OPTAR AL GRADO DE DOCTOR PRESENTADA POR Samer Hassan Collado Directores Juan Pavón Mestras Millán Arroyo Menéndez Madrid, 2017 © Samer Hassan Collado, 2009 UNIVERSIDAD COMPLUTENSE DE MADRID Towards a Data-driven Approach for Agent-Based Modelling: Simulating Spanish Postmodernisation by Samer Hassan Collado Directed by Juan Pav´onMestras Mill´anArroyo Men´endez A thesis submitted in candidature for the degree of PhD Facultad de Inform´atica Departamento de Ingenier´ıadel Software e Inteligencia Artificial September 2009 \In many systems, the situation is such that under some conditions chaotic events take place. That means that given a particular starting point it is impossible to predict outcomes. This is true even in some quite simple systems, but the more complex a sys- tem, the more likely it is to become chaotic. It has always been assumed that anything as complicated as human society would quickly become chaotic and, therefore, unpredictable. What I have done, however, is to show that, in studying human society, it is possible to choose a starting point and to make appropriate assumption that will suppress the chaos, and will make it possible to predict the future, not in full detail, of course, but in broad sweeps; not with certainty, but with calculable probabilities." Hari Seldon Isaac Asimov, Prelude to Foundation, 1988 Abstract In the lasts decades, computer simulation in general, and agent based modelling (ABM) in particular, has become one of the mainstream modelling techniques in many scientific fields, especially in Social Sciences such as Sociology or Economics. Social simulation allows the study of the complexity inherent to social phenomena and it is attracting multidisciplinary research teams in order to manage this complexity. There are different methodologies for ABM that, after compiling experience in pro- cesses, methods and tools, attempt to provide a systematic way to tackle new problems. Both the Multi-Agent Systems field and ABM have tried to provide robust methodolo- gies to guide researchers in the modelling process. However, there is an important epistemological distinction among agent-based mod- els that these methodologies do not consider. Models can be classified depending on their research aim, and this classification can have methodological implications. Some- times researchers seek a generic model to explain a social phenomena from a high degree of abstraction, and one that is simple enough to be used as an illustration of a specific theory or hypothesis. On the other hand, researchers may prefer to focus on the ex- pressiveness of the model, together with the empirical descriptiveness of a specific case study. The first case corresponds to Theoretical Research, while the second one would be Data-driven Research. Nowadays, most of the models are conceived from the theoretical approach, and thus methodologies are frequently biased towards them. However, without disregarding the role of theory, models can also seek expressiveness. In order to do that, they may have needs that are not met in general methodologies. For instance, issues such as the empir- ical initialisation, the limitations of data collection, the throughout empirical validation or the role of data in the design are not usually considered in those methodologies. Thus, there is a lack of a complete ABM methodology that, assuming data-driven research has a different approach and aims, provides a specific flow of data-driven model develop- ment. Such methodology should consider the key role of empirical data throughout all the modelling stages. This lack has caused most data-driven models to be constructed without a common frame. This work attempts to fill this gap and build a complete methodology to guide data- driven agent-based modelling. Therefore, it can be advocated that when there are avail- able data from the observation of the real phenomenon, the modelling and simulation process involves additional stages. This methodology attempts to guide the injection of empirical data into the simulation, bringing them closer to the real phenomenon under study, while acknowledging the important role of theory in the whole process. There- fore, the approach is complemented with a systematic method for the exploration of the model space in order to achieve comprehensible but descriptive models. Such a method was coined `Deepening KISS', as it is exposed in the methodological chapter. This methodology is supported technically by the specification and implementation of a social agent framework. Such framework is structured in modules which can be enabled at will in order to facilitate the exploration of the model space and its incremental construction, both in the frame of the data-driven approach. Instead of attempting to build a general-purpose framework, this agent framework focuses on a family of problems which can be best tackled within it. Moreover, an in-depth case study was developed to test and validate the application of the methodology and proposed framework. This case study addresses the complex issue of social values evolution, together with the friendship emergence and the demo- graphic dynamics involved. The construction of this agent-based model, coined Mentat, can be summarised in a series of key milestones. The proposed data-driven methodology is applied intensively through the course of its development. The modelling process has been realised as (a) bottom-up and (b) top-down. (a) is represented by the social network arising from the micro behaviour and friendship dynamics. (b) relies on the elaborated demographic model. The conceptualisation and specification of (a) and (b) has been justified theo- retically in order to support its development. They have been implemented within the modular agent framework, designed in incremental layers. Mentat features are struc- tured in modules which can be enabled or disabled in order to explore the model space following the stages defined in the methodology. The model is validated from a quan- titative macro perspective (empirical validation), from a qualitative micro perspective (social dynamics matching the theoretical assumptions) and from a theoretical perspec- tive (discussing its sociological consistency). Different techniques of Artificial Intelligence are applied and combined in the model, testing the framework adaptability and their use for social simulation. Mentat serves as a case study of the methodology and framework, but it also provides some sociological insight of the problem under study, by giving new support to specific theories. The ABM specifically stresses the key significance of demo- graphic dynamics in the case study: the evolution of social values in Spain during the end of 20th Century. This implies that intergenerational changes are considerably more important than intragenerational ones in this Spanish context, and supports Inglehart's theories of values evolution. Keywords: agent-based model, data-driven modelling, demography, friendship, fu- zzy logic, microsimulation, social values, social simulation, surveys. Resumen Durante las ´ultimasd´ecadas,la simulaci´oncomputacional y el modelado basado en agentes (agent-based modelling, ABM) en particular, se han convertido en una de las principales tecnolog´ıasde modelado en m´ultiplescampos cient´ıficos,especialmente en ciencias sociales como la sociolog´ıay la econom´ıa. La simulaci´onsocial permite el estudio de la complejidad propia de los fen´omenossociales, y est´aatrayendo equipos de investigaci´onmultidisciplinares para poder manejar dicha complejidad. Existen distintas metodolog´ıaspara ABM que, despu´esde compilar suficiente experi- encia en procesos, m´etodos y herramientas, ofrecen una forma sistem´aticapara estudiar problemas nuevos. Tanto el campo de Sistemas Multi-Agente como el de ABM han tratado de ofrecer metodolog´ıasrobustas que puedan guiar a los investigadores en el proceso de modelado. Sin embargo, hay una distinci´onepistemol´ogicaimportante entre los modelos basados en agentes que tiene implicaciones metodol´ogicasy que dichas metodolog´ıasno consid- eran. Los modelos pueden ser clasificados en funci´onde su objetivo de investigaci´on.En ocasiones, el investigador persigue un modelo gen´ericopara explicar el fen´omenosocial, desde un alto grado de abstracci´ony de forma suficientemente simplificada para ilustrar f´acilmente una teor´ıao hip´otesisconcreta. Por otro lado, el investigador puede preferir centrarse en la expresividad del modelo, en la que se haga una extensa descripci´on emp´ıricade un caso de estudio concreto. El primer caso corresponde a la llamada \in- vestigaci´ondirigida por teor´ıa”,mientras que la segunda es la \investigaci´ondirigida por datos". Hoy d´ıa,la mayor parte de los modelos son concebidos desde el punto de vista de la investigaci´ondirigida por la teor´ıa,y por ello las metodolog´ıassueles estar sesgadas hacia ese enfoque. Sin embargo, y sin ignorar el importante papel de la teor´ıa,los modelos pueden buscar principalmente expresividad y descripci´on. Y para ello, pueden tener requisitos que no son abordados por dichas metodolog´ıasgen´ericas. Por ejemplo, temas como la inicializaci´onemp´ırica,las limitaciones de la recolecci´onde datos, la validaci´on emp´ıricaintensiva, o el papel de los datos en el dise~nono son considerados