Architecture and Services for Modeling Contagions on Large Networked Populations

Providing High Performance Computing based Models as a Service: Architecture and Services for Modeling Contagions on Large Networked Populations Sherif Hanie El Meligy Abdelhamid Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science and Applications Chris J. Kuhlman, Co-Chair Madhav V. Marathe, Co-Chair Mohamed Kholief Chris North Sekharipuram S. Ravi Tuesday 15th November, 2016 Blacksburg, Virginia Keywords: Social Behavior, Contagions, Networks, Control of Contagion Processes, Graph Dynamical Systems, Modeling and Simulation, Open Science, Software Systems, Web Services Copyright 2016, Sherif E. Abdelhamid Providing High Performance Computing based Models as a Service: Architecture and Services for Modeling Contagions on Large Networked Populations Sherif E. Abdelhamid (ABSTRACT) Network science emerged as an interdisciplinary field over the last 20 years, and played a central role to address fundamental problems in other fields, e.g., epidemiology, public health, and transportation, and is now part of most university curriculums. Network dynamics is a major area within network science where researchers study different forms of processes in networked populations, such as the spread of emotions, influence, opinions, flu, ebola, and mass movements. These processes often referred to individually and collectively as contagions. Contagions are increasingly studied because of their economic, social, and political impacts. Yet, resources for studying network dynamics are largely dispersed and stand-alone. Furthermore, many researchers interested in the study of networks are not computer scientists. As a result, they do not have easy access to computing and data resources. Even with the presence of software or tools, it is challenging to install, build, and maintain software. These challenges create a barrier for researchers and domain scientists. The goal of this work is the design and implementation of a research framework for modeling contagions on large networked populations. The framework consists of various systems and services that provide support for researchers and domain scientists at different stages of their research workflow. Providing High Performance Computing based Models as a Service: Architecture and Services for Modeling Contagions on Large Networked Populations Sherif E. Abdelhamid (GENERAL AUDIENCE ABSTRACT) Network science is a field which studies complex networks. Network science emerged over the last 20 years as an interdisciplinary academic field which integrates data, tools, and theories from multiple disciplines, and use this integration to address fundamental problems in other fields, e.g., epidemiology, public health, and transportation, and is now part of most university curriculums. Network dynamics is a major area within network science where researchers study different forms of processes in networked populations, such as the spread of emotions, influence, opinions, flu, ebola, and mass movements. These processes often referred to individually and collectively as contagions. Contagions are increasingly studied because of their economic, social, and political impacts. To study contagions, researchers and domain scientists need both software and hardware that can collect, analyze, and manage large volumes of networked data. Furthermore, for non-computer scientists, it is more challenging to install, build, and maintain the software with the required hardware. These challenges create a barrier for researchers and domain scientists to conduct their experiments and replicate others’ work. The goal of this work is the design and implementation of various systems and services that provide support for researchers and domain scientists at different stages of their research workflow. Dedication I would like to dedicate my PhD dissertation to my lovely sons Yaseen and Yusuf for the joy they brought to my life. My wife Mona for her endless love, kindness, and support. My parents Eman and Hany, my brother Karim, and all my family in Egypt for encouraging and guiding me all the time. iv Acknowledgments I am very grateful to my advisor, Dr. Chris J. Kuhlman who helped me with my PhD research, taught me various professional skills, and was always motivating and encouraging. I thank My advisor, Prof. Madhav V. Marathe, who guided me throughout my research, and I am very grateful to him for providing such a stimu- lating and interactive environment at Network Dynamics and Simulation Science Laboratory (NDSSL). I would like to thank Prof. S. S. Ravi, Prof. Chris North, and Prof. Mohamed Kholief, as they worked closely with me on different aspects of my research, and provided their valuable advice and feedback. I would like to thank all my friends, colleagues, and NDSSL members for their friendship, advice, and support. At last but not the least, I would like to thank my parents, my wife, and all my family for their unconditional support and love. v Contents 1 Introduction1 1.1 Background . .1 1.2 Challenges for Domain Scientists . .2 1.3 Research Motivation and Objectives . .3 1.4 Graph Dynamical Systems Foundations . .4 1.4.1 GDS Overview . .4 1.4.2 Illustrative Examples . .8 1.5 Research Question, Solutions and Hypotheses . 11 1.6 Research Overview, Approach and Contributions . 12 1.6.1 GDSC . 13 1.6.2 EDISON . 14 1.6.3 MARS . 15 1.6.4 NEMO . 15 1.7 Document Structure . 16 2 Literature Review 17 2.1 GDSC Related Work . 17 2.2 EDISON Related Work . 19 2.3 MARS Related Work . 22 vi 2.4 NEMO Related Work . 24 2.5 Agent-based Modeling of Depression Related Work . 26 3 GDSC: Graph Dynamical Systems Calculator 29 3.1 Introduction . 29 3.1.1 Technical Challenges . 30 3.1.2 Contributions . 30 3.2 GDSC System Overview . 34 3.3 Web Application and System Features . 36 3.3.1 Analysis Inputs . 37 3.3.2 Analysis Log . 45 3.3.3 Auto Regression Testing . 47 3.4 Illustrative Case Study . 49 3.5 Limitations . 51 4 EDISON: A Web Application for Evaluation of Network-Based Social Dynam- ics 52 4.1 Introduction . 52 4.1.1 Technical Challenges . 53 4.1.2 Contributions . 53 4.2 Behavior Models . 55 4.2.1 Need For Web Application Functionality . 56 4.3 User Interface . 59 4.4 System Architecture . 63 4.4.1 Middleware . 64 4.4.2 Integration with MARS . 64 vii 4.4.3 InterSim Simulation Engine . 65 4.4.4 HPC Hardware Resources . 66 4.5 Illustrative Case Studies . 66 4.5.1 Networks and Simulations . 66 4.5.2 Simulation Results . 67 4.6 Usability . 76 4.7 Limitations . 76 5 MARS: Network Services and Their Compositions for Network Science 78 5.1 Introduction . 78 5.1.1 Technical Challenges . 79 5.1.2 Contributions . 80 5.2 MARS System Overview . 82 5.3 Network Services . 85 5.3.1 Network Storage Service, NStS . 85 5.3.2 Network Query Service, NQS, and Network Query Parsing Service, NQPS . 85 5.3.3 Network Query Search Service, NQSS . 87 5.3.4 Network Measure Service, NMS . 87 5.4 System Workflow Service . 89 5.4.1 Query Execution Workflow . 89 5.4.2 Query Validation Workflow . 90 5.5 Illustrative Case Study . 92 5.6 Performance Evaluation . 93 5.7 MARS Implementation . 98 5.8 Limitations . 98 viii 6 Understanding Contagion Dynamics in Networked Populations Through Data Exploration and Visualization 99 6.1 Introduction . 99 6.1.1 Contributions . 100 6.2 NEMO Overview . 101 6.2.1 Summary of User Features . 101 6.2.2 Additional Details . 102 6.3 Illustrative Case Study . 106 6.4 Limitations . 110 7 Agent-Based Modeling and Simulation of Depression and Its Impact on Students’ Success and Academic Retention 111 7.1 Overview . 111 7.2 Introduction . 112 7.2.1 Background . 112 7.2.2 Motivation for Agent-based Modeling of Depression . 113 7.2.3 Contributions . 114 7.3 Data and Methodology . 115 7.3.1 College Social Network . 115 7.3.2 Contagion (Behavioral) Model . 116 7.4 Simulation Results . 127 7.4.1 Effect of peer influence on the number of depressed students . 127 7.4.2 Effect of symptom influence on the number of depressed students 127 7.4.3 Students predisposed to depression . 128 7.5 Discussion . 130 7.6 Implications . 131 ix 7.7 Limitations . 131 8 A Survey on Research Practices and Usability Evaluation of EDISON System133 8.1 Introduction . 133 8.2 Study Design . 134 8.2.1 Mixed Methods Approach . 134 8.2.2 Study Participants . 134 8.3 Research Practices Survey . ..

Load more