Mathematical Study of Human Dynamics with Modeling of Collective Motion and Social Media
MATHEMATICAL STUDY OF HUMAN DYNAMICS WITH MODELING OF COLLECTIVE MOTION AND SOCIAL MEDIA

by
HYE RIN LINDSAY LEE

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Thesis Adviser: Dr. Alethea Barbaro

Department of Mathematics, Applied Mathematics and Statistics
CASE WESTERN RESERVE UNIVERSITY
August, 2020

Mathematical Study of Human Dynamics with Modeling of Collective Motion and Social Media
Case Western Reserve University

We hereby approve the thesis¹ of HYE RIN LINDSAY LEE for the degree of Doctor of Philosophy

Dr. Alethea Barbaro, Committee Chair, Adviser
Department of Mathematics, Applied Mathematics, and Statistics

Dr. Jenný Brynjarsdóttir, Committee Member
Department of Mathematics, Applied Mathematics and Statistics

Dr. Roger H. French, Committee Member
Department of Materials Science and Engineering

Dr. David Gurarie, Committee Member
Department of Mathematics, Applied Mathematics and Statistics

Dr. Mary Ann Horn, Committee Member
Department of Mathematics, Applied Mathematics and Statistics

Date of Defense: May 26, 2020

¹ We certify that written approval has been obtained for any proprietary material contained therein.

I dedicate this dissertation to my late grandpa, Dr. ByungHyuk Lee. I thank him for his never ending support and love. Although he is not here to celebrate this big milestone of my life, I know that he has been with me all along on this journey. I am happy.

Table of Contents

List of Tables
List of Figures
Acknowledgements
Abstract

Chapter 1. Introduction
  Overview
  Mathematical and Statistical Background
  Literature Review

Chapter 2. Evaluation of the Social Force Model Using Experimental Data
  Motivation
  Experiment
  Modeling
  Modeling Result
  Experiment and Simulation Comparison
  No Noise Model vs. Noise Model
  Discussion
  Future Work

Chapter 3. Examining the Effects of Parameter Scaling in a Model for Collective Motion
  Motivation
  The Proposed Scaling
  Introduction of the Numerical Experiment
  Scaling of R and Dt
  The Result of the Scaled and Non-Scaled Parameters
  Implementing the Zone of Repulsion
  Different Types of Collision
  Comparison of the Repulsion Cases
  Discussion
  Future Work

Chapter 4. Predicting Civil Unrest Using Twitter Data from the 2015 Baltimore Protest
  Motivation
  Data Background
  Cleaning Data
  Time Series
  Looking into Hashtags
  Outlier Tweets
  The Sentiment Analysis
  Detecting Events Using Nonnegative Matrix Factorization
  Building the Prediction Tool
  Discussion
  Future Work

Chapter 5. Conclusions

Appendix A. Social Issue Events Detected by NMF
Appendix B. One-Way Analysis of Variance (ANOVA) of the Four Cases
Appendix. Complete References

List of Tables

2.1 Parameters used for simulating crowd models

2.2 Top 5 parameter sets are given for all the experimental conditions. The Wasserstein distance is abbreviated as ‘WD’ in the table. From top to bottom, the order goes from the best to worst for each row.

2.3 Results from simulations of parameter sets given by Latin Hypercube Sampling, plotted in 2D parameter space. The x-axes are the parameter A, and the y-axes are the parameter y. Here, we plot the Wasserstein distances as a colored circle at the point in parameter space where it was run. The blue schemes represent lower Wasserstein distance values, whereas the red schemes represent higher values. The best parameter sets are filled with a magenta star, and the next top four parameter sets are filled with a magenta diamond. Since parameter space is four-dimensional, consisting of [ARush, BRush, ANoRush, BNoRush], we look at the Wasserstein Distances projected onto parameter space for the no-rush agents on the left, and projected onto the rush agents on the right.

2.4 Top 5 parameter sets are given for all the rush and no-rush conditions. The Wasserstein distance is abbreviated as ‘WD’ in the table. From top to bottom, the order goes from the best to worst for each row.

2.5 Results from simulations of parameter sets given by Latin Hypercube Sampling, plotted in parameter space. Here, we plot the Wasserstein distances as a colored circle at the point in parameter space where it was run. The blue schemes represent lower Wasserstein distance values, whereas the red schemes represent higher values. The best parameter sets are filled with a magenta star, and the next top four parameter sets are filled with a magenta diamond. Since parameter space is four-dimensional, consisting of [ARush, BRush, ANoRush, BNoRush], we look at the Wasserstein Distances projected onto parameter space for the no-rush agents on the left, and projected onto the rush agents on the right.

3.1 The scaled parameters using Equation 1.19. The first column is the number of particles, the second column is the radius, and the last column is the time step.

3.2 The average difference of the global, local, and fixed local polarities of the non-scaled parameters cases between the varying number of particles. For each cell, the first element is the difference of the global polarities between the two Ns, the second element is the difference of the local polarities, and the last element is the difference of the fixed local polarities.

3.3 The average difference of the global, local, and fixed local polarities of the scaled parameters cases between the varying number of particles. For each cell, the first element is the difference of the global polarities between the two Ns, the second element is the difference of the local polarities, and the last element is the difference of the fixed local polarities.

3.4 The scaled parameters using Equation 1.19. The first column is the number of particles, the second column is the radius for the zone of orientation, the third column is the radius for the zone of repulsion, and the last column is the time step.

3.5 The average difference of the global, local, and fixed local polarities of the non-scaled parameters without repulsion cases between the varying number of particles. For each cell, the first element is the difference of the global polarities between the two Ns, the second element is the difference of the local polarities, and the last element is the difference of the fixed local polarities.

3.6 The average difference of the global, local, and fixed local polarities of the scaled parameters without repulsion cases between the varying number of particles. For each cell, the first element is the difference of the global polarities between the two Ns, the second element is the difference of the local polarities, and the last element is the difference of the fixed local polarities.

3.7 The average difference of the global, local, and fixed local polarities of the non-scaled parameters with repulsion cases between the varying number of particles. For each cell, the first element is the difference of the global polarities between the two Ns, the second element is the difference of the local polarities, and the last element is the difference of the fixed local polarities.

3.8 The average difference of the global, local, and fixed local polarities of the scaled parameters with repulsion cases between the varying number of particles. For each cell, the first element is the difference of the global polarities between the two Ns, the second element is the difference of the local polarities, and the last element is the difference of the fixed local polarities.

3.9 We list the summary statistics for each case. In each cell, we list the number of data, N, mean, and standard deviation (std) of the absolute average differences of global (GP), local (LP), and fixed local (FLP) polarities, respectively. NSNR stands for the non-scaled parameter set with no repulsion, and SNR stands for the scaled parameter set with no repulsion. Similarly, NSR stands for the non-scaled parameter set with repulsion, and SR stands for the scaled parameter set with repulsion.

3.10 We list the test statistic (W) and p-values from the Shapiro-Wilk's test between the cases using the raw (first two columns) and log transformed (last two columns) data. In each cell, we list the mentioned values of the absolute average differences of global, local, and fixed local polarities, respectively. NSNR stands for the non-scaled parameter set with no repulsion, and SNR stands for the scaled parameter set with no repulsion. Similarly, NSR stands for the non-scaled parameter set with repulsion, and SR stands for the scaled parameter set with repulsion.

3.11 We list the statistic (F), degrees of freedom (df), p-values, and 95% confidence interval (CI) from the F-test between the two cases. In each cell, we list the said values of the absolute average differences of global, local, and fixed local polarities, respectively. NSNR stands for the non-scaled parameter set with no repulsion, and SNR stands for the scaled parameter set with no repulsion. Similarly, NSR stands for the non-scaled parameter set with repulsion, and SR stands for the scaled parameter set with repulsion.

3.12 We list the statistic (t), degrees of freedom (df), p-values, and 95% confidence interval (CI) from the t-test between the two cases. In each cell, we list the said values of the absolute average differences of global, local, and fixed local polarities, respectively. NSNR stands for the non-scaled parameter set with no repulsion, and SNR stands for the scaled parameter set with no repulsion.