Exploratory Causal Analysis in Bivariate Time Series Data
EXPLORATORY CAUSAL ANALYSIS IN BIVARIATE TIME SERIES DATA

by James M. McCracken

A Dissertation Submitted to the Graduate Faculty of George Mason University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, Physics

Committee:
Dr. Robert Weigel, Dissertation Director
Dr. Paul So, Committee Member
Dr. Timothy Sauer, Committee Member
Dr. Maria Dworzecka, Acting Department Chair
Dr. Donna M. Fox, Associate Dean for Student Affairs & Special Programs, College of Science
Dr. Peggy Agouris, Dean, College of Science

Date: Fall Semester 2015
George Mason University
Fairfax, VA

Exploratory Causal Analysis in Bivariate Time Series Data

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at George Mason University

By James M. McCracken
Master of Science, University of Central Florida, 2006
Bachelor of Science, Florida Institute of Technology, 2004

Director: Robert Weigel, Professor, Department of Physics

Fall Semester 2015
George Mason University
Fairfax, VA

Copyright © 2015 by James M. McCracken. All Rights Reserved.

Dedication

This dissertation is dedicated to Angel, whose patience I have tested often but never broken.

Acknowledgments

This work would not have been possible without the help of many people who have supported me over the last few years, financially and otherwise. I have, hopefully, made those people aware of how much I appreciate their help. Most importantly, though, I'd like to thank my adviser, Professor Robert Weigel. I appreciate that he was willing to let me prove that I was a serious student, despite my somewhat unusual circumstances as a doctoral candidate. His academically holistic advising style emphasized a deep understanding of both the background literature and the many different methods for approaching modern data analysis problems.
I believe he has helped me learn not only the material printed in this thesis but, more importantly, how to be a professional physicist. I could not have asked for a better adviser.

Table of Contents

List of Tables
List of Figures
Abstract
1 Motivation
2 Introduction
  2.1 Notation and Language
  2.2 Time series causality framework
  2.3 Exploratory causal analysis framework
  2.4 Moving forward with data causality
3 Causality studies
  3.1 Foundational causality
    3.1.1 Philosophical studies
    3.1.2 Natural science studies
    3.1.3 Psychological (and other social science) studies
  3.2 Data causality
    3.2.1 Statistical causality
    3.2.2 Computational causality
    3.2.3 Time series causality
4 Time series causality tools
  4.1 Granger causality
    4.1.1 Background
    4.1.2 Practical Usage
  4.2 Information-theoretic causality
    4.2.1 Background
    4.2.2 Practical usage
  4.3 State space reconstruction causality
    4.3.1 Background
    4.3.2 Practical usage
  4.4 Correlation causality
    4.4.1 Background
    4.4.2 Practical usage
  4.5 Penchant causality
    4.5.1 Background
    4.5.2 Practical usage
5 Exploratory Causal Analysis
  5.1 ECA summary
  5.2 Synthetic data examples
    5.2.1 Impulse with linear response
    5.2.2 Cyclic driving with linear response
    5.2.3 RL series circuit
    5.2.4 Cyclic driving with non-linear response
    5.2.5 Coupled logistic map
    5.2.6 Impulse with multiple linear responses
  5.3 Empirical data examples
    5.3.1 Snowfall data
    5.3.2 OMNI data
6 Conclusions
    6.0.3 ECA results and efficacy
  6.1 Future work

List of Tables

4.1 Conditional and joint probabilities of {X, Y} in Section 4.2.2 (the shorthand notation Z_t := Z(t) is used for convenience)
4.2 Lagged cross-correlation calculations for the example time series pair {X, Y} in Section 4.4.2
5.1 The ECA summary and vector
5.2 The ECA summary distribution
5.3 Comparisons of the ECA summaries
5.4 The values of each of the five time series causality tools
5.5 Additional comparisons of the ECA summaries
5.6 Additional values of each of the five time series causality tools
5.7 Comparisons of the average ECA summaries
5.8 Comparisons of the confidence interval ECA summaries
5.9 The average values of each of the five time series causality tools
5.10 The 90% confidence intervals
5.11 The bootstrapped 90% confidence intervals
5.12 Comparisons of the bootstrapped ECA summaries

List of Figures

5.1 An instance of Eqn. 5.3
5.2 Histograms of the instance of Eqn. 5.3
5.3 Autocorrelations of the instances of X and Y
5.4 Lagged cross-correlation differences
5.5 Lagged cross-correlation plots
5.6 Lagged cross-correlation plots
5.7 Leaning calculations for the instance of Eqn. 5.3
5.8 ECA summaries for different instances of Eqn. 5.3
5.9 An instance of Eqn. 5.4
5.10 Histograms of the instance of Eqn. 5.4
5.11 Autocorrelations of the instances of X and Y
5.12 Lagged cross-correlation differences
5.13 ECA summaries for different instances of Eqn. 5.4
5.14 The voltage and current signals of a circuit
5.15 The 25 bin histograms of the signals
5.16 Autocorrelations of the signals
5.17 Lagged cross-correlation differences
5.18 An instance of Eqn. 5.8
5.19 Autocorrelations of the instances
5.20 Lagged cross-correlation differences
5.21 An instance of Eqn. 5.9
5.22 Histograms of the instance of Eqn. 5.9
5.23 Autocorrelations of the instances
5.24 Lagged cross-correlation differences
5.25 An instance of Eqn. 5.10
5.26 Histograms of the instance of Eqn. 5.10
5.27 Autocorrelations of the instances
5.28 Lagged cross-correlation differences and weighted mean observed leanings
5.29 Daily snowfall and mean temperature
5.30 Histograms of the data
5.31 Autocorrelations of the data
5.32 Hourly measurements
5.33 Histograms of the data
5.34 Hourly measurements of the rectified magnetic field
5.35 Daily averages of the hourly measurements
5.36 Histograms of the data
5.37 Daily averages of the hourly measurements
5.38 Autocorrelations of the data

Abstract

EXPLORATORY CAUSAL ANALYSIS IN BIVARIATE TIME SERIES DATA
James M. McCracken, PhD
George Mason University, 2015
Dissertation Director: Robert Weigel, PhD

Many scientific disciplines rely on observational data of systems for which it is difficult (or impossible) to implement controlled experiments, and data analysis techniques are required for identifying causal information and relationships directly from observational data. This need has led to the development of many different time series causality approaches and tools including.