Download (4Mb)
Total Page:16
File Type:pdf, Size:1020Kb
University of Warwick institutional repository: http://go.warwick.ac.uk/wrap A Thesis Submitted for the Degree of PhD at the University of Warwick http://go.warwick.ac.uk/wrap/57636 This thesis is made available online and is protected by original copyright. Please scroll down to view the document itself. Please refer to the repository record for this item for information to help you to cite it. Our policy information is available from the repository home page. AUTHOR: Christopher F. H. Nam DEGREE: Ph.D. TITLE: The Uncertainty of Changepoints in Time Series DATE OF DEPOSIT: ................................. I agree that this thesis shall be available in accordance with the regulations governing the University of Warwick theses. I agree that the summary of this thesis may be submitted for publication. I agree that the thesis may be photocopied (single copies for study purposes only). Theses with no restriction on photocopying will also be made available to the British Library for microfilming. The British Library may supply copies to individuals or libraries. subject to a statement from them that the copy is supplied for non-publishing purposes. All copies supplied by the British Library will carry the following statement: “Attention is drawn to the fact that the copyright of this thesis rests with its author. This copy of the thesis has been supplied on the condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without the author’s written consent.” AUTHOR’S SIGNATURE: ................................................... .... USER’S DECLARATION 1. I undertake not to quote or make use of any information from this thesis without making acknowledgement to the author. 2. I further undertake to allow no-one else to use this thesis while it is in my care. DATE SIGNATURE ADDRESS ................................................... ............................... ................................................... ............................... ................................................... ............................... ................................................... ............................... ................................................... ............................... www.warwick.ac.uk The Uncertainty of Changepoints in Time Series by Christopher F. H. Nam Thesis Submitted to the University of Warwick for the degree of Doctor of Philosophy Department of Statistics March 2013 Contents Acknowledgments iv Declarations vi List of Abbreviations viii Abstract x Chapter 1 Introduction 1 1.1 StructureofThesis............................ 7 Chapter 2 Literature Review 9 2.1 Introduction................................ 9 2.2 Terminology and Notation . 11 2.3 AtMostOneChange........................... 13 2.4 BinarySegmentation.. .. .. .. .. .. .. .. 16 2.5 Penalised Likelihood approaches . 18 2.6 Global Segmentation . 20 2.7 PrunedExactLinearTimealgorithm. 23 2.8 AutoPARM ................................ 25 2.9 Bayesian Model Selection methods . 28 2.10 Reversible Jump Markov Chain Monte Carlo . 31 2.11 Product-Partitionmodels . 32 2.12 Hidden Markov Models based methods . 35 2.12.1 Deterministic State Sequence Inference . 37 2.12.2 ExactCPDistributions . 42 2.12.3 ConstrainedHMMs . 43 2.13 Exact Sampling of the Posterior via Recursions . ..... 47 2.14Conclusion ................................ 52 i Chapter 3 Exact Changepoint Distributions and Sequential Monte Carlo Samplers 55 3.1 Introduction................................ 55 3.2 Background ................................ 58 3.2.1 Exact CP Distributions via Finite Markov Chain Imbedding 60 3.2.2 Sequential Monte Carlo methods . 67 3.3 Methodology ............................... 75 3.3.1 Approximating the model parameter posterior, p(θ y n) . 78 | 1: 3.4 ResultsandApplications. 81 3.4.1 SimulatedData.......................... 82 3.4.2 Hamilton’s GNP data . 87 3.5 ConclusionandDiscussion. 93 Chapter 4 Quantifying the Uncertainty of Brain Activity 98 4.1 Introduction................................ 98 4.2 A Brief Introduction to Neuroimaging and fMRI . 99 4.2.1 Data Acquisition . 100 4.2.2 Preprocessing.. .. .. .. .. .. .. .. 102 4.2.3 Statistical Analysis . 104 4.3 AnAnxietyInducingExperiment . 107 4.4 Results................................... 108 4.5 ConclusionandDiscussion. 112 Chapter 5 Model Selection for Hidden Markov Models 116 5.1 Introduction................................ 116 5.2 Background ................................ 118 5.3 LiteratureReview ............................ 118 5.3.1 An Information Theoretic Approach . 119 5.3.2 Parallel Markov Chain Monte Carlo . 120 5.3.3 Reversible Jump Markov Chain Monte Carlo . 121 5.3.4 Sequential Hidden Markov Model . 123 5.4 Methodology ............................... 126 5.4.1 Approximating p(y n H) .................... 128 1: | 5.5 Results................................... 129 5.5.1 Simulated Data . 130 5.5.2 Hamilton’s GNP data . 138 5.6 ConclusionandDiscussion. 141 ii Chapter 6 Quantifying the Uncertainty of Autocovariance Change- points 145 6.1 Introduction................................ 145 6.2 Motivation ................................ 149 6.3 Wavelets.................................. 151 6.3.1 Discrete Wavelet Transform (DWT) . 155 6.3.2 Non-Decimated Wavelet Transform (NDWT) . 160 6.3.3 Locally Stationary Wavelet processes . 164 6.4 Methodology ............................... 170 6.4.1 LSW-HMM modelling framework . 171 6.4.2 SMC samplers implementation . 177 6.4.3 ExactCPdistributions . 178 6.4.4 OutlineofApproach . 179 6.5 ResultsandApplications. 180 6.5.1 Simulated Data . 181 6.5.2 Oceanographic Application . 191 6.6 Discussion................................. 194 6.A Appendix ................................. 200 6.A.1 Joint density of Ik = (I1,k,I2,k,...,IJ∗,k)............ 200 D ˜ 6.A.2 Computing Σk , the covariance structure of Dk ........ 202 6.A.3 Determining how much of the EWS one needs to know to D compute Σk ........................... 203 6.A.4 Order of HMM with respect to analysing wavelet and J ∗ . 204 6.A.5 SMC samplers example implementation . 204 Chapter 7 Discussion and Future Work 207 7.1 Summaryofthesis ............................ 207 7.2 FutureWork ............................... 208 7.2.1 Changepoints for multivariate time series . 208 7.2.2 Changepoint uncertainty in light of missing data . 210 7.2.3 Forecasting Changepoints . 211 iii Acknowledgments Several people and services have played some part in the forming of this thesis; each in their own unique ways. I would like to express some gratitude to the founders and developers of Spotify, YouTube, Grooveshark and BBC iPlayer for providing such services and entertainment which have aided me when working through the days (and nights) in C1.03. A special thanks also go to the writers of the sitcoms 30 Rock and Arrested Development for providing the necessary laughter required to survive a PhD. My thanks go to the members of staff at the Statistics Department for pro- viding a supportive working environment and many enjoyable conversations over lunch in the Common Room. These will be sorely missed. I would like to thank my fellow PhD students in the department, both past and present, for providing interesting conversations and entertainment, both statistical and non-statistical in context. My special thanks also go to former PhD students: Peter Windridge, Nas- tasiya Grinberg, Robert Goudie, Bryony Hill and Guy Freeman, for reassuring me through the bad times, entrusting me with their own PhD experiences, and gener- ally making me smile and laugh even if from several miles away. As part of this thesis is also in collaboration with an external affiliation, I would also like to thank the Mathematics and Statistics Department at Lancaster University, in particular Aim´ee Gott, Rebecca Killick and Matt Nunes for entertaining me sufficiently well during my numerous collaborative visits. This thesis would not be possible without the mathematical ideas and sup- port of my supervisor and collaborators. I am extremely indebted to John Aston, my supervisor, for his immense guidance and support over the last three and a half iv years, and for putting up with my frequent incompetence and rare bouts of compe- tence. I have learnt a great deal both academic and non-academic related, namely a first foray into parenting, under your supervision. I would also like to thank Adam Johansen for practically acting as an unofficial second supervisor to me. My special thanks also go to Idris Eckley from Lancaster University for being a superb collab- orator, offering invaluable advice regarding both work and life, and making me get over my dog allergy. Finally, my thanks go to the Nam clan (Peter, Pamela, Janice, Flora, Jos´e) for their continuous love and support over the past seven and half years I have been at the University of Warwick. May you have also enjoyed the Warwickshire and West Midlands area over the past 15 years or so since we first ventured into this area. My how times have changed... v Declarations I, Christopher Nam, hereby declare that this thesis is based on my own research, except when stated otherwise, in accordance with the regulations of the University of Warwick, and has not been submitted elsewhere. The results presented in Chapter 2 utilises open source code available from the respective authors. I am grateful to these authors for making such code widely available. Chapters 3 and 4 contains work