BAYESIAN ANALYSIS OF NON-GAUSSIAN STOCHASTIC PROCESSES FOR TEMPORAL AND SPATIAL DATA

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Jiangyong (Matthew) Yin, M.S.

Graduate Program in

The Ohio State University

2014

Dissertation Committee:

Peter F. Craigmile, Advisor
Xinyi Xu, Advisor
Steven N. MacEachern

© Copyright by

Jiangyong (Matthew) Yin

2014

Abstract

The Gaussian process is the most commonly used approach for modeling temporal and geostatistical data. The Gaussianity assumption, however, is known to be insufficient or inappropriate in many problems. In this dissertation, I develop specific non-Gaussian models to capture the asymmetry and heavy tails of many real-world data indexed in the time, space or space-time domain.

Chapter 2 of this dissertation deals with a particular non-Gaussian time series model: the stochastic volatility model. The parametric stochastic volatility model is a nonlinear state space model whose state equation is traditionally assumed to be linear. Nonparametric stochastic volatility models provide great flexibility for modeling financial volatilities, but they often fail to account for useful shape information. For example, a model may not use the knowledge that the autoregressive component of the volatility equation is monotonically increasing in the lagged volatility. I propose a class of additive stochastic volatility models that allow for different shape constraints and can incorporate the leverage effect, the asymmetric impact of positive and negative return shocks on volatilities. I develop a Bayesian model fitting algorithm and demonstrate model performance on simulated and empirical datasets. Unlike general nonparametric models, the proposed model sacrifices little when the true volatility equation is linear. In nonlinear situations, the proposed method improves the model fit and the ability to estimate volatilities over general, unconstrained, nonparametric models, while at the same time maintaining more modeling flexibility than parametric models.

The second part of this dissertation focuses on non-Gaussian spatial processes. In Chapter 3, I first introduce a general framework for constructing non-Gaussian spatial processes using transformations of a latent multivariate Gaussian process. Based on this framework, I then develop a heteroscedastic asymmetric spatial process (HASP) for capturing the non-Gaussian features of environmental or climatic data, such as heavy tails and skewness. The conditions for this non-Gaussian spatial process to be well defined are discussed at length. The properties of the HASP, especially its marginal moments and covariance structure, are established, along with a Markov chain Monte Carlo (MCMC) procedure for sampling from the posterior distribution. The HASP model is used to study a US nitrogen dioxide concentration dataset. It is demonstrated that the ability of the HASP to capture asymmetry and heavy tails benefits its predictive performance.

Finally, I highlight extensions of the proposed methods in the temporal and spatial domains in several new directions. I discuss the application of the proposed methodology to GARCH-type nonparametric volatility models as well as to other state space models, in particular the stochastic conditional duration model. I also discuss extensions of the heteroscedastic asymmetric spatial process to the space-time domain.

To my parents

Acknowledgments

First and foremost, I owe my deepest gratitude to my advisors, Peter Craigmile and Xinyi Xu, for taking me in as their student from my very first year at Ohio State and devoting so much of their time to the individual studies that we did together, for their continued guidance and coaching throughout the past five years even when we were oceans or cities apart, for their amazing patience and tolerance in the face of my often wrong opinions and sometimes unreasonable requests, for imparting their knowledge and wisdom to me without any reservation, and for their financial support of my academic growth.

I would like to thank my committee member, Professor Steven MacEachern, for his guidance and his invaluable suggestions during my candidacy exam, which have contributed to the development of part of this dissertation. I would also like to thank Dr. Yoonkyung Lee, who was on my candidacy exam committee, for her time and for allowing me to sit in her classes asking numerous questions.

I especially want to thank Dr. Christopher Holloman for the two years of great consulting experience at SCS and for giving me the opportunity to work on the Nationwide project.

I really appreciate the financial support of the university and the department, which has allowed me to complete my graduate studies and also afforded me the chance to see much of this country. I have also benefited greatly from the teachings of so many great statisticians at Ohio State, without whom I would not have achieved the statistical maturity that I have today.

To my friends with whom I have shared too many drunken nights, Agniva, Casey, Dani, Grant, Jingjing, John, Sarah, Steve, Tyler and many others: thank you for making the past five years in Columbus a fun time.

This dissertation is dedicated to my parents, Min Zhang and Mingshi Yin, who, with only high school educations, have worked so hard to put me through college and have always supported me no matter where I decide to go or what I decide to do.

This dissertation research is supported in part by the National Science Foundation (NSF) under grants DMS-0906864, DMS-1209194 and SES-1024709.

Vita

09/09/1983 ...... Born in Rushan, China

09/2002 - 07/2006 ...... Bachelor of Science in Statistics, Fudan University, China
07/2006 - 07/2009 ...... Executive and Senior Executive, ACNielsen, Shanghai
09/2009 - 08/2011 ...... Master of Science in Statistics, The Ohio State University

Publications

Research Publications

Jiangyong Yin, Peter F. Craigmile and Xinyi Xu. Shape-constrained Semiparametric Additive Stochastic Volatility Models. Submitted, May 2014.

Jiangyong Yin and Xinyi Xu. Portfolio Optimization Using Constrained Hierarchical Bayes Models. Department of Statistics Technical Report No. 874. The Ohio State University, Aug. 2013.

Fields of Study

Major Field: Statistics

Table of Contents


Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

Chapters

1. Introduction

1.1 Motivation
1.2 Non-Gaussian Time Series Models
1.3 Non-Gaussian Spatial and Spatio-temporal Processes
1.4 Outline of the Dissertation

2. Shape-constrained Semiparametric Additive Stochastic Volatility Models

2.1 Background
  2.1.1 Asymmetric or Semiparametric Stochastic Volatility Models
  2.1.2 The Role of Shape Constraints in Semiparametric SV Models
2.2 Model Specification
  2.2.1 Semiparametric Additive Stochastic Volatility Models
  2.2.2 Shape-constrained Semiparametric Additive Stochastic Volatility Models
2.3 Model Fitting and Comparison
  2.3.1 MCMC Procedure
  2.3.2 Model Comparison Criteria
2.4 Simulations
  2.4.1 Uncentered versus Centered Basis Functions
  2.4.2 Unleveraged SV Models
  2.4.3 Leveraged SV Models
2.5 Empirical Studies
2.6 Discussion

3. Heteroscedastic Asymmetric Spatial Processes

3.1 Introduction
3.2 A General Framework for Constructing Non-Gaussian Processes
3.3 Heteroscedastic Asymmetric Spatial Process (HASP)
  3.3.1 Model Specification and Properties
  3.3.2 The Covariance Properties of the HASP Model
  3.3.3 Linear Co-Regionalization Version of the HASP Model
  3.3.4 More on the Choices of the Correlation and Cross-Correlation Functions
  3.3.5 Examples of the HASP Sample Paths
3.4 Model Fitting and Spatial Prediction
  3.4.1 The Likelihood Function
  3.4.2 Model Fitting Strategy
  3.4.3 Spatial Prediction
3.5 Using HASP to Model the Nitrogen Dioxide Pollution Data in the Contiguous United States
3.6 Discussion

4. Concluding Remarks and Future Work

4.1 Summary
4.2 Extensions of the Shape-Constrained Semiparametric Additive SV Model
  4.2.1 GARCH-type models
  4.2.2 Other non-Gaussian nonlinear state-space models
4.3 Extensions of the Heteroscedastic Asymmetric Spatial Process
  4.3.1 Into the spatio-temporal domain
  4.3.2 Other directions
4.4 Closing Remarks

Bibliography

List of Tables


2.1 For the three true unleveraged SV models (Lf, Cf and Sf), a summary of the MSE and MAE of the log volatilities, {h_t}_{t=1}^N, obtained from fitting five different SV models. The averages and standard errors are multiplied by 100, and are based on 200 replicate series.

2.2 For the three true leveraged SV models (Lf-Lg, Lf-NLg and NLf-NLg), a summary of the MSE and MAE of the log volatilities, {h_t}_{t=1}^N, obtained from fitting five different SV models. The averages and standard errors are multiplied by 100, and are based on 200 replicate series.

2.3 Predictive log likelihood of daily returns from November 1, 2010 to October 31, 2013 using six different SV models. (u) - unleveraged; (l) - leveraged. The results of the best performing model (those with the largest predictive log likelihood) are in bold face.

3.1 Predictive performance of the GP, SHP and HASP models fitted on the original and the transformed data. The highlighted values are the smallest within each group.

List of Figures


2.1 A comparison of the centered and uncentered basis functions. 7 knots (unequally spaced in the first row and equally spaced in the second row) are used for illustration purposes, where the first and last knot denote the smallest and largest possible values of the predictor x. Different colors and line types are used to differentiate different basis functions. Note how the centered and uncentered basis functions differ relative to the horizontal line at 0.

2.2 Stochastic volatility (grey line) vs. realized volatility (dark line) vs. log(r_t^2)/2 (dots): the distribution of the realized volatilities is indeed less left skewed than that of log(r_t^2)/2, but is still much noisier than the estimated stochastic volatilities and slightly left skewed.

2.3 Trace plots of the parameters µ and β_1 in model (2.2.5) with centered versus uncentered basis functions.

2.4 An example of the simulated datasets for the case of sigmoid f. The top plot shows the simulated return series, while the bottom one shows the simulated latent log volatility series, {h_t}. The quasi-Markov-switching feature is easily noticeable in the lower plot.

2.5 Estimated autoregressive function f for the unleveraged SV models based on a single simulated dataset, using 10 knot intervals. The dashed line in each graph shows the true functional form of f. The solid line shows the posterior mean of f, and the shaded region indicates the point-wise 95% credible bands for f.

2.6 Estimated autoregressive function f for the unleveraged SV models based on a single simulated dataset, using 20 knot intervals. The dashed line in each graph shows the true functional form of f. The solid line shows the posterior mean of f, and the shaded region indicates the point-wise 95% credible bands for f.

2.7 For the three true leveraged SV models (Lf-Lg, Lf-NLg and NLf-NLg), the estimated autoregressive function f (odd rows) and leverage function g (even rows) in different fitted models (columns) based on 10 knot intervals. The line types and shading are defined in the same way as in Figure 2.5.

2.8 For the three true leveraged SV models (Lf-Lg, Lf-NLg and NLf-NLg), the estimated autoregressive function f (odd rows) and leverage function g (even rows) in different fitted models (columns) based on 20 knot intervals. The line types and shading are defined in the same way as in Figure 2.5.

2.9 Fitted autoregressive function f and leverage function g for the daily returns of the S&P500, EQR, MSFT and JnJ. The solid lines show the posterior means and 95% credible bands of the fitted functions in the leveraged semiparametric models with shape constraints, while the dashed lines represent the posterior means and 95% credible bands in the leveraged semiparametric models without shape constraints.

3.1 Plots (i)–(iv) show the mean, variance, skewness and excess kurtosis, respectively, of the process Z(s) as a function of τ and ϱ. Plot (v) shows the simulated density function for the marginal distribution of Z(s) for τ = 0.5. The dotted line in each plot denotes the case ϱ = 0, the dot-dash lines ϱ = ±0.3, the dashed lines ϱ = ±0.6 and the dark solid lines ϱ = 0.9. The grey solid line in plot (v) denotes the density function of a standard normal distribution.

3.2 The shape of the correlation function implied by (3.3.19) for different choices of ϱ(h). Plots (i)–(iii) use the exponential correlation function for ϱ(h), while plots (iv)–(vi) use the Gaussian correlation function. The values of the other parameters are shown in the plots.

3.3 The shape of the correlation function implied by (3.3.20), where ρ_α(h) assumes the form of a Gaussian correlation function and ρ_ξ(h) the form of an exponential correlation function. The values of the other parameters are shown in the plots.

3.4 Five sample paths from each of the four spatial processes in R: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. An exponential correlation function is used for the GP. For the non-Gaussian processes, the same exponential correlation function is used for the marginal as well as the cross-correlation functions of the latent multivariate Gaussian process. The sample paths for different processes bear resemblance to each other because the same seed is used for random number generation, for easier comparison of the different processes. The sample paths are approximated based on a finite number of observations on a grid with an increment of 0.1. Realizations of the HASP model are computed according to (3.3.17).

3.5 Five sample paths from each of the four spatial processes in R: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. A Gaussian (or double exponential) correlation function is used for the GP. For the non-Gaussian processes, the same Gaussian correlation function is used for the marginal as well as the cross-correlation functions of the latent multivariate Gaussian process. The sample paths are constructed in the same way as in Figure 3.4.

3.6 Five sample paths from each of the four spatial processes in R^2: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. An exponential correlation function is used for the GP. For the non-Gaussian processes, the same exponential correlation function is used for the marginal as well as the cross-correlation functions of the latent multivariate Gaussian process. Again, the sample paths for different processes bear resemblance to each other because the same seed is used for random number generation, for easier comparison of the different processes. The sample paths are approximated based on a finite number of observations on a rectangular grid with an increment of 0.1 in each direction. Realizations of the HASP model are computed according to (3.3.17).

3.7 Five sample paths from each of the four spatial processes in R^2: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. A Gaussian correlation function is used for the GP. For the non-Gaussian processes, the same Gaussian correlation function is used for the marginal as well as the cross-correlation functions of the latent multivariate Gaussian process. The sample paths are constructed in the same way as in Figure 3.6.

3.8 The ambient NO2 concentration levels measured across the contiguous United States on September 9, 2013. In the top graph, each circle represents an observation site. Blue circles represent the observation sites used in the training dataset, while the red circles indicate sites randomly chosen for prediction evaluation. The size of the circles (measured by the area, not the radius) is proportional to the NO2 concentration levels. The three histograms show the distributions of the original data, the log-transformed data and the square-root-transformed NO2 concentrations.

3.9 The three graphs in the first column are based on the original data, while those in the second column are based on the transformed data. Plots a) and b) show the posterior density of the co-locational correlation coefficient ϱ, which also measures the degree of skewness in the data. Plots c) and d) present the posterior density of the parameter κ, which is the correlation length in the Gaussian process as well as in the respective processes ε(s) and ξ(s) in the SHP and HASP models. Finally, plots e) and f) demonstrate the implied correlation functions of the fitted Gaussian and non-Gaussian processes.

3.10 Predictive distributions for the NO2 levels at the 25 sites in the test dataset, which are based on the three models fitted on the original data. The red lines represent the predictive distributions of the HASP model, while the blue and grey lines show the predictive distributions of the SHP and GP models, respectively. The dashed lines represent the actual observed values at these test sites...... 118

3.11 Predictive distributions for the NO2 levels on the original scale at the 25 sites in the test dataset, which are based on the three models fitted on the transformed data. The red lines represent the predictive distributions of the HASP model, while the blue and grey lines show the predictive distributions of the SHP and GP models, respectively. The dashed lines represent the actual observed values at these test sites.

Chapter 1: Introduction

1.1 Motivation

Univariate and multivariate time series, spatial processes and space-time models are all special stochastic processes that play fundamental roles in the study of real-world data that are sampled temporally and/or spatially, such as economic, financial, environmental and climatic data. Early modeling efforts were mostly focused on second-order stationary Gaussian processes, such as the stationary autoregressive moving average (ARMA) model for time series analysis (see, e.g., Brockwell and Davis, 2009; Tsay, 2005) and the isotropic Gaussian process for spatial interpolation and function estimation (see, e.g., Cressie, 1993). Although the stationarity and Gaussianity assumptions in these models are reasonable for a wide range of problems, they can be inappropriate in many other settings. In financial time series, there is compelling evidence that financial returns are non-Gaussian (with heavy tails and left skewness) and exhibit temporal dependence in the squared returns. The statistical modeling of these empirical features started with the seminal work of Engle (1982). Since the second moment of financial returns is widely used as a proxy for investment risk in financial econometrics, this type of model has found significant applications in risk management, option pricing, and portfolio optimization, just to name a few.

Modeling non-Gaussian features is also important in the field of geostatistics. For example, Hering and Genton (2010) argued that a skewed-t distribution is more appropriate than a normal distribution for the error terms of their wind speed model. Furthermore, correctly modeling the distribution of wind speed is important not only for the prediction of wind speed itself, but also for the prediction of wind power production. It is well known that the functional relationship between the wind speed and the power production of a wind turbine, called the power curve, is highly nonlinear (see, e.g., Lange, 2005). As a result, a good estimate of the mean wind speed alone is not sufficient for obtaining a good estimate of the wind power production. Non-Gaussian features in the wind speed, such as skewness and heavy tails, play a significant role as well. As another example, Damian et al. (2003) found that the variances of log-transformed 10-day aggregate precipitation at 39 stations in the Languedoc-Roussillon region of France are not constant in space, and they demonstrated spatial dependence as well. This means that at any given time, the purely spatially indexed data should have spatially varying variances. We refer to this phenomenon as spatial heteroscedasticity in this dissertation. From a purely spatial point of view, spatial heteroscedasticity means that there are regions in the spatial domain whose values depart from the underlying trend surface by enough that, under the Gaussian process assumption, they need to be explained by relatively large variances compared to other regions (Palacios and Steel, 2006). This is different from spatial outliers, where the extreme departure from the mean surface under the Gaussian assumption is isolated rather than clustered. Borrowing a term from the time series literature, we can also call this phenomenon variance clustering.

This dissertation is dedicated to the modeling of non-Gaussian features in empirical data from the fields of financial and environmental studies, especially skewness and heavy tails. In particular, this dissertation focuses on the study of non-Gaussian temporal, spatial and spatio-temporal processes of the form

Y(u) = exp(α(u)) ε(u),   (1.1.1)

where α(u) and ε(u) are two latent Gaussian processes and u is a temporal, spatial or spatio-temporal index that can be countable or uncountable.

1.2 Non-Gaussian Time Series Models

In financial econometrics, volatility is usually defined as the conditional standard deviation of a discrete-time return series {r_t, t ∈ Z}; namely, σ_t = √Var(r_t | F_{t-1}), where F_{t-1} is the filtration up to time t − 1. The development of volatility models has been largely motivated by the following empirical features commonly observed in real-world financial returns (see, e.g., Rydberg, 2000):

• Heavy tails. It has been generally accepted that the distribution of financial returns has heavier tails than a normal distribution.

• Volatility clustering. Large price changes tend to cluster together, which can be explained by serial dependence in the volatility process.

• Asymmetry. There is evidence that a high volatility period tends to follow a large negative return rather than a positive one. In other words, positive and negative returns have asymmetric impacts on the volatility. Engle and Ng (1993) used the so-called news impact curve to capture the functional dependence of volatilities on returns. Another type of asymmetry is that the distribution of stock returns is slightly negatively skewed.

Systematic modeling of financial volatility started with the autoregressive conditional heteroscedastic (ARCH) model in the seminal paper of Engle (1982). Bollerslev (1986) proposed the generalized autoregressive conditional heteroscedastic (GARCH) model, which allows for a parsimonious representation of the volatility process. A time series {r_t}_{t∈Z} (e.g., the returns of a financial asset) is called a GARCH(p, q) process if

r_t = σ_t ε_t,  t ∈ Z,
σ_t^2 = ω + Σ_{i=1}^p α_i r_{t-i}^2 + Σ_{j=1}^q β_j σ_{t-j}^2,

where {ε_t}_{t∈Z} is an independent and identically distributed (iid) process with mean zero and variance 1. A number of parametric models have been proposed to describe the asymmetry feature of financial volatilities, among which the "GJR-GARCH" model proposed by Glosten et al. (1993) and the "EGARCH" model of Nelson (1991) are the most well known. The GJR-GARCH model captures the asymmetric news impact curve with a piecewise linear function in r_t^2:

r_t = σ_t ε_t,  t ∈ Z,
σ_t^2 = ω + α_1 r_{t-1}^2 + β_1 σ_{t-1}^2 + α_2 r_{t-1}^2 I(r_{t-1} < 0).
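As a quick illustration (a simulation sketch with arbitrary illustrative parameter values, not code or estimates from this dissertation), the GJR-GARCH(1,1) recursion can be iterated directly; setting α_2 = 0 recovers a plain GARCH(1,1). The simulation reproduces two of the stylized facts listed earlier: marginal heavy tails and volatility clustering, even though the innovations are Gaussian.

```python
import numpy as np

def simulate_gjr_garch(n, omega=0.05, alpha1=0.05, beta1=0.85, alpha2=0.1,
                       seed=0):
    """Simulate r_t = sigma_t * eps_t with the GJR-GARCH(1,1) variance
    sigma_t^2 = omega + alpha1*r_{t-1}^2 + beta1*sigma_{t-1}^2
                + alpha2*r_{t-1}^2 * I(r_{t-1} < 0)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    r = np.zeros(n)
    sigma2 = np.zeros(n)
    # start at the unconditional variance (E[I(r<0)] = 1/2 for symmetric eps)
    sigma2[0] = omega / (1.0 - alpha1 - beta1 - 0.5 * alpha2)
    r[0] = np.sqrt(sigma2[0]) * eps[0]
    for t in range(1, n):
        sigma2[t] = (omega + alpha1 * r[t - 1]**2 + beta1 * sigma2[t - 1]
                     + alpha2 * r[t - 1]**2 * (r[t - 1] < 0.0))
        r[t] = np.sqrt(sigma2[t]) * eps[t]
    return r, sigma2

r, sigma2 = simulate_gjr_garch(50000)
# Heavy tails: positive excess kurtosis despite Gaussian eps_t.
excess_kurt = np.mean(r**4) / np.mean(r**2)**2 - 3.0
# Volatility clustering: squared returns are positively autocorrelated.
x = r**2 - np.mean(r**2)
acf1 = np.sum(x[1:] * x[:-1]) / np.sum(x * x)
```

The parameter values here satisfy the stationarity condition α_1 + β_1 + α_2/2 < 1; they are chosen for illustration only.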

In contrast, the EGARCH model describes the asymmetric news impact curve through a piecewise linear function in ε_{t-1}:

r_t = exp(h_t/2) ε_t,  t ∈ Z,
h_t = γ_0 + γ_1 h_{t-1} + g(ε_{t-1}),

where

g(ε_t) = ω ε_t + λ(|ε_t| − E|ε_t|).
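For intuition (an illustrative sketch, not the dissertation's code), the asymmetry of the news impact function g can be checked numerically: with ω < 0 the function is piecewise linear with slope λ − ω on the negative side and λ + ω on the positive side, so a negative shock raises log volatility more than a positive shock of the same magnitude. The values ω = −0.1 and λ = 0.2 are arbitrary.

```python
import math

def g(eps, omega=-0.1, lam=0.2):
    """EGARCH news impact function g(eps) = omega*eps + lam*(|eps| - E|eps|),
    where E|eps| = sqrt(2/pi) for standard normal eps."""
    e_abs = math.sqrt(2.0 / math.pi)
    return omega * eps + lam * (abs(eps) - e_abs)

# A unit negative shock vs. a unit positive shock; the gap equals -2*omega.
g_neg, g_pos = g(-1.0), g(1.0)
```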

Nonparametric (and usually nonstationary) models for describing financial volatilities include the partially nonparametric (PNP) model of Engle and Ng (1993), the locally stationary ARCH model of Dahlhaus and Subba Rao (2006), a number of additive ARCH models (see Kim and Linton, 2004; Linton and Mammen, 2005; Yang, 2006, etc.), and several nonparametric GARCH models, such as the approach of Audrino and Bühlmann (2009), which is based on multivariate B-splines and estimated with a coordinate-wise gradient descent algorithm.

Another popular approach for describing the dependence in the second moment of financial returns is the so-called stochastic volatility (SV) model, which originated from Hull and White (1987)'s stochastic differential equation for option pricing. The basic parametric (Gaussian) SV model (Taylor, 1994) is usually given as

r_t = ω + exp(h_t/2) ε_t,  ε_t ~ iid N(0, 1);
h_t = µ + φ h_{t-1} + η_t,  η_t ~ iid N(0, σ_1^2),   (1.2.1)

where {ε_t}_{t∈Z} and {η_t}_{t∈Z} are two mutually independent innovation processes. Model (1.2.1) is essentially a nonlinear state space model with the state variables, the log volatilities, following an autoregressive (AR) process.

Model (1.2.1) can explain both the heavy tails and the volatility clustering features introduced above, but it does not accommodate the asymmetric effect of positive and negative returns on volatilities or the skewness in the marginal distribution of financial returns. To capture asymmetry, researchers introduced SV models with leverage effects (see Harvey and Shephard, 1996; Yu, 2005). A leverage effect is defined as a (usually negative) correlation between the innovations of the volatility process and the lagged innovations of the return process, i.e., corr(η_t, ε_{t-1}) = ρ < 0 in (1.2.1). More details are given in Chapter 2.

Although the linearity assumption in both model forms lends itself to the development of efficient estimation procedures, it remains a convenient approximation rather than the truth. In Chapter 2, I propose a semiparametric temporal process for {h_t}_{t∈Z} that offers more modeling flexibility than parametric models. In nonlinear state space models, efficiently modeling the different latent features with semiparametric methods remains an important yet challenging issue due to the large dimension of the latent state space. I address this issue by incorporating shape constraints in the semiparametric nonstationary volatility process, which proves to be crucial for achieving better model performance and also provides a better fit for empirical data.

1.3 Non-Gaussian Spatial and Spatio-temporal Processes

A spatial or spatio-temporal process is a stochastic process defined on the domain {s : s ∈ D ⊂ R^d} or {(s, t) : (s, t) ∈ D ⊂ R^d × R}. Consider a univariate spatio-temporal process {Y(s, t), (s, t) ∈ D ⊂ R^d × R}. The process Y(s, t) is most commonly assumed to be a covariance (or second-order) stationary Gaussian process (GP); i.e., it satisfies the following conditions:

1) E[Y(s, t)] = µ ∈ R, which does not depend on (s, t).

2) The covariance cov(Y(s, t), Y(s', t')) depends only on the separation between (s, t) and (s', t'), but not on the spatio-temporal indices (s, t) and (s', t') per se.

As a special case of a general space-time process, a spatial process Y(s) can be viewed as a snapshot in time of a spatio-temporal process Y(s, t) with fixed t = t_0. For purely spatial processes, another common assumption is the isotropy property. A random field Y(s) is isotropic if the covariance cov(Y(s), Y(s')) depends only on the (usually Euclidean) distance ‖s − s'‖.

Since Y(s, t) is a Gaussian process, the joint distribution of any finite number of realizations of Y(s, t) is uniquely determined by the mean and the covariance structure. As a result, much research attention has been paid to the study of covariance functions in stationary temporal, spatial, or spatio-temporal processes. For spatial processes, the Matérn class of functions (Matérn, 1960; Handcock and Stein, 1993; Guttorp and Gneiting, 2006) has emerged as the most popular choice for covariance functions. Other types of covariance functions include the powered exponential family (see, e.g., Diggle et al., 1998), the Cauchy family (Gneiting and Schlather, 2004), and, for computational convenience, covariance functions with compact support (see, e.g., Gneiting, 2002a; Furrer et al., 2006). Further discussions of space-time covariance functions are provided in Chapter 4. Spatial statistical models dealing with areal data generally rely on the theory of Markov random fields (MRF) (Besag, 1974, 1975; Rue and Held, 2005). The most popular model of this sort is the Gaussian conditional autoregressive (CAR) model. A number of other types of models have also been proposed in the literature. See Held and Rue (2010) and the references therein for more information about areal processes.

Stationarity and Gaussianity of spatial or spatio-temporal processes are often convenient modeling assumptions, but in reality they are not always adequate. It is now widely recognized that many environmental processes exhibit nonstationary covariance structures or non-Gaussian features (see, e.g., Palacios and Steel, 2006; Zhang and El-Shaarawi, 2010; Craigmile and Guttorp, 2011; Huang et al., 2011). Nonstationary models, especially the construction and estimation of nonstationary covariance functions, have received a lot of research attention recently. See the review article of Sampson (2010) and the references therein for more information.

Non-Gaussian spatial or spatio-temporal data have been studied since the 1970s. For the usual count or binary data, non-Gaussian spatial models within the framework of generalized linear mixed models (GLMM) have been discussed in Diggle et al. (1998). For processes with discrete spatial variation, similar models exist, such as the auto-models of Besag (1974). However, not all non-Gaussian data can be accommodated in a GLMM framework. For moderate departures from Gaussianity, De Oliveira et al. (1997) proposed to use a Box-Cox power transformation on a non-Gaussian process and assume that the transformed process is a stationary Gaussian process. A large body of work has focused on addressing a specific shape departure from Gaussianity. For example, Palacios and Steel (2006) proposed a non-Gaussian spatial process, the Gaussian-log-Gaussian (GLG) model, that captures heavy tails by scale mixing a stationary Gaussian process with a stationary log-Gaussian process:

Y(s) = σ ε(s)/√λ(s) + ρ(s),  s ∈ D ⊂ R^d,

where ρ(s) is an iid process with mean zero and variance τ^2 for modeling the nugget effect, ε(s) is a stationary spatial process with zero mean, unit variance and a covariance function C_θ, and log(λ(s)) is a Gaussian process with mean −ν/2, variance ν and covariance function C_θ as well. Craigmile and Guttorp (2011) proposed a heavy-tailed non-Gaussian space-time temperature model to capture the spatially varying

seasonality in the variance of temperature time series:

ζ_t(s) = σ_t(s) η_t(s),  t ∈ T ⊂ Z,  s ∈ D ⊂ R^2,
log(σ_t(s)) = α_0(s) + α_1(s) sin(2πt/365.25) + α_2(s) cos(2πt/365.25)
            + α_3(s) sin(2πt/182.625) + α_4(s) cos(2πt/182.625).

Huang et al. (2011) proposed a heavy-tailed non-Gaussian space-time model based on the temporal stochastic volatility model, which can be viewed as the extension of the GLG model into the space-time domain.

Y(s, t) = σ exp(τ α(s, t)/2) ε(s, t),  s ∈ D ⊂ R^d,  t ∈ R,

where α(s, t) and ε(s, t) are two independent stationary Gaussian processes with mean 0 and variance 1. To capture non-Gaussian features in lattice data, Yan (2007) proposed the following non-Gaussian model:

z_i = φ_i + ε_i,  i = 1, …, I,

φ_i | φ_{j≠i} ∼ N( Σ_{j≠i} b_{ij} φ_j / Σ_{j≠i} b_{ij},  σ_φ² / Σ_{j≠i} b_{ij} ),

ε_i | α ∼ iid N(0, exp(2ω + α_i)),

α_i | α_{j≠i} ∼ N( Σ_{j≠i} c_{ij} α_j / Σ_{j≠i} c_{ij},  σ_α² / Σ_{j≠i} c_{ij} ).
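The tail-inflation mechanism behind these scale-mixture constructions is easy to see by simulation. The sketch below, with illustrative parameter values, compares the marginal excess kurtosis of a Gaussian sample with that of the GLG mixture Y(s) = σ ε(s)/√λ(s) + ρ(s); since only the marginal distribution is at issue, sites are simulated independently:

```python
import numpy as np

rng = np.random.default_rng(42)
n, sigma, tau, nu = 200_000, 1.0, 0.1, 1.0

# log(lambda(s)) ~ N(-nu/2, nu); small lambda(s) inflates the local variance
log_lam = rng.normal(-nu / 2.0, np.sqrt(nu), n)
eps = rng.standard_normal(n)      # smooth Gaussian component, unit variance
rho = rng.normal(0.0, tau, n)     # nugget
y = sigma * eps / np.sqrt(np.exp(log_lam)) + rho

def excess_kurtosis(x):
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

print(excess_kurtosis(rng.standard_normal(n)))  # near 0 for a Gaussian sample
print(excess_kurtosis(y))                       # clearly positive: heavier tails
```

With ν = 1 the mixture's theoretical kurtosis is a multiple of the Gaussian one, and the sample excess kurtosis of y is far above zero.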

All models in the aforementioned papers, especially Palacios and Steel (2006), Craigmile and Guttorp (2011) and Huang et al. (2011), have heavier tails than a Gaussian process, which can accommodate observations that would otherwise be outliers under a Gaussian process. However, none of these models is able to capture skewness in a spatial or spatio-temporal process. There are a number of other papers that focus on modeling the skewness in spatial processes, most of which are based on the skew-normal distribution and its different multivariate generalizations (see Azzalini and Capitanio, 1999; Azzalini and Dalla Valle, 1996; Azzalini, 2005, for discussions of these distributions). For example, Kim and Mallick (2004) proposed a skew-Gaussian process {Y(s), s ∈ D ⊂ R^d} based on the multivariate skew-normal distribution of Azzalini and Dalla Valle (1996), which can be viewed essentially as the sum of a Gaussian random field and a non-Gaussian random variable. In particular,

Y(s) = δ|Z_1| + √(1 − δ²) Z_2(s),  s ∈ D ⊂ R^d,

where Z_1 is a standard normal random variable, Z_2(s) is a zero-mean and unit-variance Gaussian random field, and δ ∈ (−1, 1). Intuitively, as |δ| → 1, the process Y(s) becomes more and more skewed and also less and less spatially correlated, which is an undesirable property. An easy fix is to consider the following stationary skew-Gaussian spatial process proposed by Zhang and El-Shaarawi (2010), which assumes

Y(s) = δ|Z_1(s)| + √(1 − δ²) Z_2(s),  s ∈ D ⊂ R^d,

where δ ∈ (−1, 1) and Z_1(s) and Z_2(s) are two zero-mean and unit-variance spatial processes that are independent of each other.
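A marginal simulation (with an illustrative δ and independently simulated sites) shows how |δ| controls the skewness of this construction:

```python
import numpy as np

rng = np.random.default_rng(7)
n, delta = 200_000, 0.9

z1 = np.abs(rng.standard_normal(n))   # half-normal component |Z_1(s)|
z2 = rng.standard_normal(n)           # independent Gaussian component Z_2(s)
y = delta * z1 + np.sqrt(1.0 - delta ** 2) * z2

y0 = rng.standard_normal(n)           # delta = 0 recovers a symmetric Gaussian

def skewness(x):
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

print(skewness(y), skewness(y0))      # positive for delta > 0; near 0 for delta = 0
```

Even with δ as large as 0.9, the sample skewness stays well below 1, in line with the bounded skewness of the skew-normal family discussed below.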

Heavy tails, skewness and variance clustering are undoubtedly common features of spatially indexed data, and all of these non-Gaussian features need to be taken into account in our statistical models. However, the heavy-tailed spatial processes presented above (e.g., the GLG process) do not allow for skewness in their marginal distributions. On the other hand, the skewness of the skew-normal distribution has a theoretical range of (−0.9953, 0.9953) and the kurtosis is bounded by [0, 0.8691) (see, e.g., Azzalini, 2005), which comes from the skewness and kurtosis of a half-normal distribution. This is too restrictive for the corresponding skew-Gaussian process to be widely applicable in practice. In addition, the ability of the skew-Gaussian process to capture the variance clustering phenomenon is limited. To see this, note that in the extreme case where δ = ±1, the skew-Gaussian process reduces to the process δ|Z_1(s)| which, albeit skewed, is no better at capturing variance clustering than the underlying Gaussian process Z_1(s).

A more general framework is studied in Bolin (2013) and Bolin and Wallin (2013), where non-Gaussian random fields with Matérn covariance functions are obtained as solutions to stochastic partial differential equations (SPDE) driven by non-Gaussian noise processes. This is an extension of the Gaussian case studied in the well-known paper of Lindgren et al. (2011). For the non-Gaussian noise process in a SPDE, Bolin (2013) and Bolin and Wallin (2013) considered special cases of the generalized hyperbolic process (see Eberlein and Hammerstein, 2004) – in particular, the iid processes generated from a normal inverse Gaussian distribution and an asymmetric generalized Laplace distribution (also known as the variance gamma distribution).

However, this SPDE-based approach is still limited in several respects in practice. First of all, computationally efficient implementation of the non-Gaussian random field implied by a SPDE relies on a Hilbert space approximation based on the method developed in Lindgren et al. (2011). The accuracy of such an approximation for non-Gaussian processes is yet to be established. Second, even with the Hilbert space approximation, model fitting remains a challenging issue. Bolin (2013) developed a model fitting procedure based on the expectation maximization (EM) algorithm, but it does not accommodate measurement errors. Furthermore, it requires all nodes in the random field to be observed. Although these requirements are relaxed with the Monte Carlo EM algorithm in Bolin and Wallin (2013), the model fitting procedure remains difficult to implement. In addition, the ability of this model to capture the variance clustering feature is not clearly understood. Therefore, we still need a general approach that naturally accommodates the heavy tails, asymmetry and variance clustering of spatial data, and at the same time, is more straightforward to implement.

1.4 Outline of the Dissertation

This dissertation focuses on the study of non-Gaussian temporal, spatial and spatio-temporal stochastic processes for capturing features such as asymmetry, heavy tails, and temporal or spatial heteroscedasticity.

In Chapter 2, I develop a semiparametric additive stochastic volatility model to describe the heavy tails, volatility clustering and asymmetry of financial returns. In addition, I propose to include shape constraints in each component of our additive SV model, which proves to be very important for achieving better model fit and predictive performance in our data examples. I employ a basis expansion of the shape-constrained functions that allows for efficient fitting of the Bayesian model, and I also present a particle-filter-based model comparison approach. The model is applied to several real-world equity returns datasets and achieves better out-of-sample predictive performance compared to the linear parametric SV model as well as the semiparametric SV models without proper shape constraints. I also demonstrate the necessity of including a leverage effect in the semiparametric additive SV models to capture the asymmetry of many real-world financial returns.

In Chapter 3, I first discuss a more general framework for constructing non-Gaussian spatial processes using transformations of a latent multivariate Gaussian process. Based on this framework, I then introduce a heteroscedastic asymmetric spatial process (HASP) that can capture non-Gaussian features – heavy tails, skewness and variance clustering – of spatially indexed data, such as environmental or climatic data. The properties of the HASP will be established, along with a Markov chain Monte Carlo (MCMC) procedure to sample from the posterior distribution. In the end, we apply the HASP to study a US nitrogen dioxide concentration dataset and to demonstrate the benefits and necessity of modeling the above-mentioned non-Gaussian features for predictive purposes.

Finally, this dissertation is concluded in Chapter 4 with a summary and some extensions of the aforementioned research. I will discuss the application of our methodology to the GARCH-type nonparametric volatility models as well as to other state space models, in particular, the stochastic conditional duration model. I also highlight the extensions of our heteroscedastic asymmetric spatial process to the space-time domain, and discuss the possibility of fitting our non-Gaussian process to large datasets and of building non-Gaussian and nonstationary processes.

Chapter 2: Shape-constrained Semiparametric Additive Stochastic Volatility Models

2.1 Background

2.1.1 Asymmetric or Semiparametric Stochastic Volatility Models

The state space model approach to modeling conditional heteroscedasticity in financial time series leads to the well-known stochastic volatility (SV) model, which originated from the stochastic differential equation in Hull and White (1987)'s work on option pricing (see also Taylor, 1994). The basic form of the SV model was given in (1.2.1). The properties of the basic SV model (1.2.1) as well as its limitations are discussed below (see also Fan and Yao, 2003, p.180).

i) Stationarity. It is well known that if |φ| < 1, the volatility process (1.2.1) is strictly stationary with

h_t ∼ N( ω/(1 − φ), τ²/(1 − φ²) )  and  corr(h_t, h_{t+k}) = φ^|k|  for t, k ∈ Z.

As a result, {r_t}_{t∈Z} is also strictly stationary since the marginal distribution of r_t, for each t ∈ Z, can be expressed as

p(r_t) = ∫ p(r_t | h_t) p(h_t) dh_t,

which does not depend on t. In the discussion below, let µ_h ≡ ω/(1 − φ) denote the mean and σ_h² ≡ τ²/(1 − φ²) denote the variance of the marginal distribution of {h_t}_{t∈Z}.

ii) White Noise. Because the process {ε_t}_{t∈Z} is independent of the process {η_t}_{t∈Z}, it is thus also independent of {h_t}_{t∈Z}. It is then easy to see that E(r_t) = 0 and

E(r_t r_{t+k}) = E[ exp((h_t + h_{t+k})/2) ] E(ε_t ε_{t+k}) = 0,  for any t, k ∈ Z, k ≠ 0.

Also, by the moment generating function of the normal distribution, for each t ∈ Z, we have

Var(r_t) = E(r_t²) = E(e^{h_t}) E(ε_t²) = exp( µ_h + σ_h²/2 ).

In other words, {r_t}_{t∈Z} is a white noise series.

iii) Heavy Tails. Let κ_r and κ_ε denote the kurtosis of the marginal distribution of r_t and ε_t, respectively. Then,

κ_r = E(r_t⁴) / [E(r_t²)]² = ( E(e^{2h_t}) / [E(e^{h_t})]² ) ( E(ε_t⁴) / [E(ε_t²)]² ) = exp(σ_h²) κ_ε > κ_ε,

that is, the kurtosis of {r_t} is always larger than that of {ε_t}. This holds even if ε_t, for all t ∈ Z, follows a distribution other than the standard normal distribution.

iv) Volatility Clustering. First, for t, k ∈ Z, note that h_t + h_{t+k} follows a normal distribution with mean 2µ_h and variance 2σ_h²[1 + corr(h_t, h_{t+k})] = 2σ_h²(1 + φ^|k|). Using the moment generating function of the normal distribution again, we have that, for k ≠ 0,

γ_r(k) ≡ Cov(r_t², r_{t+k}²) = E(exp(h_t + h_{t+k})) − E(exp(h_t)) E(exp(h_{t+k}))
       = exp(2µ_h + σ_h²) [ exp(σ_h² φ^|k|) − 1 ],

and hence, noting that γ_r(0) = Var(r_t²) = exp(2µ_h + σ_h²) [ 3 exp(σ_h²) − 1 ] when ε_t is standard normal (since E(ε_t⁴) = 3),

corr(r_t², r_{t+k}²) = γ_r(k) / γ_r(0) = [ exp(σ_h² φ^|k|) − 1 ] / [ 3 exp(σ_h²) − 1 ] ≈ ( σ_h² / [ 3 exp(σ_h²) − 1 ] ) φ^|k|.

The approximation follows from the fact that e^x ≈ 1 + x for small x > 0. Note that the autocorrelation of the squared returns under the SV model decays at the same rate as that of an AR(1) series (similar results hold for the GARCH model too).
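These moment properties can be checked numerically. The sketch below simulates the basic SV model with illustrative parameter values (ω = 0, φ = 0.9, τ = 0.4), compares the sample variance of r_t with exp(µ_h + σ_h²/2), and confirms that the autocorrelation of r_t² is positive and decaying in the lag:

```python
import numpy as np

rng = np.random.default_rng(123)
n, burn = 200_000, 1_000
omega, phi, tau = 0.0, 0.9, 0.4

# simulate the AR(1) log-volatility and the returns
h = np.empty(n + burn)
h[0] = omega / (1 - phi)
for t in range(1, n + burn):
    h[t] = omega + phi * h[t - 1] + tau * rng.standard_normal()
h = h[burn:]
r = np.exp(h / 2.0) * rng.standard_normal(n)

mu_h = omega / (1 - phi)
sigma2_h = tau ** 2 / (1 - phi ** 2)            # stationary variance of h_t
print(r.var(), np.exp(mu_h + sigma2_h / 2.0))   # should be close

def acf(x, k):
    x = x - x.mean()
    return float(np.mean(x[:-k] * x[k:]) / np.mean(x * x))

r2 = r ** 2
print(acf(r2, 1), acf(r2, 10))                  # positive and decaying, like an AR(1)
```

The geometric decay in the lag mirrors the φ^|k| factor in the autocorrelation of the squared returns.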

v) Asymmetry. Recall that for financial time series data, positive and negative returns are found to have asymmetric impacts on the volatility, and the distribution of returns is slightly negatively skewed. The SV model does not capture asymmetry since the evolution of the volatility process does not depend on the return shocks. Also, the marginal distribution of {r_t} is symmetric since

E[r_t³] = E(e^{3h_t/2}) E(ε_t³) = 0;

that is, the skewness of the marginal distribution is 0.

In volatility modeling, two functional relationships are of particular interest – the dependence of volatilities on lagged volatilities and the functional relationship between volatilities and lagged return shocks. The latter is also known as the news impact curve in the ARCH literature (see, e.g., Engle and Gonzalez-Rivera, 1991; Engle and Ng, 1993). To capture the asymmetric impact of positive and negative return shocks on volatilities in SV models, researchers have introduced the "leverage effect". Recall from Chapter 1 that a leverage effect is defined as a (usually negative) correlation between the innovations of the volatility process and the lagged innovations of the return process (see Harvey and Shephard, 1996; Yu, 2005). In model (1.2.1), we induce the leverage effect by assuming that corr(ε_{t−1}, η_t) = ρ < 0.

The resulting model is an Euler-Maruyama approximation to the following continuous time asymmetric SV model from the options pricing literature:

d log(S(t)) = σ(t) dB_1(t),
d log(σ²(t)) = [ µ + (φ − 1) log(σ²(t)) ] dt + σ_1 dB_2(t);

where S(t) is the asset price, and B_1(t) and B_2(t) are two Brownian motions satisfying corr(dB_1(t), dB_2(t)) = ρ < 0; i.e., the increments of the two Brownian motions are negatively correlated (see Yu (2005) and the references therein for more information).

Since the error distributions in the discrete-time SV models are assumed to be normal, the leveraged SV model can be equivalently expressed as

r_t = exp(h_t/2) ε_t,  ε_t ∼ iid N(0, 1),
h_t = µ + φ h_{t−1} + ψ ε_{t−1} + ξ_t,  ξ_t ∼ iid N(0, σ_2²),    (2.1.1)

where ψ = ρσ_1, σ_2 = σ_1 √(1 − ρ²), and {ξ_t}_{t∈Z} is independent of {ε_t}_{t∈Z}. This alternative form of the leverage effect (2.1.1) has been exploited by several papers including Yu (2005), Omori et al. (2007), and Yu (2012). We will also consider a similar form in our heteroscedastic asymmetric spatial process in Chapter 3. It is easy to establish the following important properties of the SV model with leverage effect (see Harvey and Shephard, 1996; Yu, 2005):

i) The bivariate process {(r_t, h_t)}_{t∈Z} is Markov since

p(r_t, h_t | r_{t−1}, h_{t−1}, …, r_{t−k}, h_{t−k}, …) = p(r_t, h_t | r_{t−1}, h_{t−1})

for all t, k ∈ Z.

ii) The process r_t is a martingale difference sequence (MDS). To see this, let F_t = σ{(ε_s, ξ_s), s ≤ t}. Then (r_t, h_t) ∈ F_t, i.e., (r_t, h_t) is F_t-measurable, and the result follows instantly from

E[r_t | F_{t−1}] = E[ E[r_t | h_t, F_{t−1}] | F_{t−1} ] = 0.

If the second moment of r_t exists for all t ∈ Z, then {r_t} is also a white noise series.

These two properties are shared with other popular volatility models such as the ARCH, GARCH, GJR-GARCH and EGARCH models, as defined in Chapter 1.
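The effect of the leverage term in (2.1.1) can also be seen in simulation: with ψ < 0, a negative return shock today raises tomorrow's volatility, so r_{t−1} and r_t² should be negatively correlated. A sketch with illustrative parameter values (ρ = −0.7, σ_1 = 0.3):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
mu, phi = 0.0, 0.95
rho, sigma1 = -0.7, 0.3
psi = rho * sigma1                        # psi = rho * sigma_1 < 0
sigma2 = sigma1 * np.sqrt(1 - rho ** 2)   # sigma_2 = sigma_1 * sqrt(1 - rho^2)

eps = rng.standard_normal(n)
h = np.zeros(n)
for t in range(1, n):
    h[t] = mu + phi * h[t - 1] + psi * eps[t - 1] + sigma2 * rng.standard_normal()
r = np.exp(h / 2.0) * eps

# leverage: yesterday's return shock vs today's squared return
c = np.corrcoef(r[:-1], (r ** 2)[1:])[0, 1]
print(c)  # negative
```

The sample correlation is clearly negative, which is the asymmetry that the unleveraged model in Section 2.1.1 cannot produce.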

Although parametric assumptions in traditional SV models, such as the linearity of the autoregressive and leverage components, provide reasonable approximations in some settings, they can also deviate considerably from the truth. Comte (2004) showed the nonlinearity of the autoregressive component in SV models for large cap indices of several major equity markets. Yu (2012) modeled the leverage effect semiparametrically using piecewise linear functions:

h_t = φ h_{t−1} + σ Σ_{i=1}^{d+1} ( ρ_i ε_{t−1} + √(1 − ρ_i²) ξ_t ) I(c_{i−1} ≥ ε_{t−1} > c_i),    (2.1.2)

where +∞ = c_0 > c_1 > … > c_d > c_{d+1} = −∞ are the pre-specified knots. They found empirical evidence that the leverage effect for many U.S. large cap equity returns is significantly less than zero when the lagged return innovation is negative but not when it is positive, which implies that a linear leverage function might not be sufficient for capturing the news impact curve. Other related nonparametric models include Bandi and Renò (2012)'s continuous time stochastic volatility jump diffusion model for capturing a time-varying leverage effect as well as Jensen and Maheu (2012)'s approach for modeling the SV model with leverage effect. Both articles have found empirical evidence supporting the usefulness of a time-varying leverage effect.

2.1.2 The Role of Shape Constraints in Semiparametric SV Models

Nonparametric or semiparametric SV models can be used to relax the linearity assumptions in traditional SV models. Such models offer great flexibility for modeling the nonlinear functional relationships in the volatility equation. However, due to the latent structure of the state space model, they can also lead to very large variance in the estimated volatilities. In this chapter, we propose to incorporate our knowledge about the shapes (such as monotonicity or piecewise monotonicity) of the functional components in the volatility equation as additional constraints in semiparametric SV models. In particular, we implement the shape constraints in the prior distributions of our Bayesian semiparametric additive model and show that the resulting model is more flexible than the linear parametric model and that it can achieve lower estimation errors of the log volatilities and higher predictive log likelihood compared to the semiparametric model without shape constraints.

The idea of imposing shape constraints in a semiparametric SV model is warranted by the availability of our knowledge about the functional shapes as well as the proven benefits of imposing shape constraints in linear and generalized linear models. On one hand, the past two decades have witnessed the accumulation of significant knowledge about the shapes of the autoregressive and leverage components in SV models. For example, the phenomenon of volatility clustering (e.g., Rydberg, 2000) suggests that the autoregressive component of the volatility equation should be monotonically increasing as the lagged volatility increases. Additionally, Yu (2012)'s findings show that the leverage function should at least be monotonically decreasing on the negative real line. On the other hand, it has long been established that imposing the correct shape constraints in function estimation problems has the potential of greatly improving model fit without incurring much additional cost (Ramsay, 1988). As a result, shape-constrained function estimation has been widely applied in regression settings and generalized linear models. In state-space models such as SV models, there has been, to the best of our knowledge, no demonstration of the benefits of shape-constrained state smoothing. It is our belief that the aforementioned functional shapes can be exploited to provide additional regularization in the semiparametric SV models to ensure better model fit and prediction accuracy.

The remainder of this chapter is organized as follows. In Section 2.2, we introduce our shape-constrained semiparametric additive SV model. In Section 2.3, we develop the Markov chain Monte Carlo procedures for sampling from the posterior distribution and a particle-filter-based procedure for evaluating the predictive log likelihood. We apply our methods to simulated and real-world financial returns in Sections 2.4 and 2.5 and demonstrate their advantages over both the linear SV model and the semiparametric model without shape constraints. We close with conclusions and discussions in Section 2.6.

2.2 Model Specification

2.2.1 Semiparametric Additive Stochastic Volatility Models

Consider the following semiparametric stochastic volatility model:

r_t = ω + exp(h_t/2) ε_t,  ε_t ∼ iid N(0, 1);
h_t = µ + f(h_{t−1}) + g(ε_{t−1}) + η_t,  η_t ∼ iid N(0, σ²),    (2.2.1)

where the innovations {ε_t}_{t∈Z} and {η_t}_{t∈Z} are mutually independent. We assume that the autoregressive component, f, and the leverage function, g, are additive. It is easy

to see that the leveraged SV model (2.1.1) is a special case of (2.2.1) when both f and g are linear functions. Model (2.2.1) also includes the models of Comte (2004) and Yu (2012) as special cases: when g(ε_{t−1}) = 0, (2.2.1) reduces to Comte (2004)'s nonparametric unleveraged SV model; when f is linear and g is piecewise linear, (2.2.1) becomes Yu (2012)'s semiparametric leverage effect model. Additionally, when η_t = 0 for all t, model (2.2.1) also contains the exponential ARCH (EGARCH) model of Nelson (1991).

Additional constraints on the functions f and g are needed for (2.2.1) to be identifiable. We assume that f(h_m) = 0 and g(ε_m) = 0, where h_m and ε_m are two fixed numbers within the respective ranges of the processes {h_t}_{t∈Z} and {ε_t}_{t∈Z}. Under these conditions, the parameter µ in (2.2.1) represents the expected conditional log volatility when h_{t−1} = h_m and ε_{t−1} = ε_m. The choice of h_m and ε_m is discussed in Section 2.2.2. For the remainder of this chapter, we will assume that the data have already been de-trended, so ω = 0.

2.2.2 Shape-constrained Semiparametric Additive Stochastic Volatility Models

We call model (2.2.1) the shape-constrained semiparametric additive SV model if f and g are restricted to be of certain shapes. As discussed in Section 1, for equity returns, the autoregressive function f is typically monotonically increasing on the entire real line, while the leverage function g is monotonically decreasing (at least) on the negative real line. The findings in Yu (2012) also indicate that g(ε_{t−1}) might have some change-point behavior about zero, but generally, our knowledge of g(ε_{t−1}) for positive ε_{t−1} is mixed.

There are many ways to incorporate shape constraints in function estimation problems. Below we briefly discuss some of the most popular methods for estimating monotone functions, monotonicity being the most important form of shape constraint. While some monotone function estimation methods are built on kernel smoothing techniques (e.g., Hall and Huang, 2001; Dette and Pilz, 2006; Dette and Scheder, 2006), the majority are based on splines or power basis functions. Consider the following basis expansion of an unknown function q:

q(x) = β_0 + β_1 w_1(x) + … + β_J w_J(x),    (2.2.2)

where {w_j(x)}_{j=1}^J are the basis functions. The splines-based methods come in different flavors, but the basic idea is mostly the same – by choosing appropriate basis functions for the basis expansion (2.2.2), shape constraints on the unknown function q can be translated into range constraints on the coefficients of the basis functions, {β_j}_{j=1}^J. The model fitting can be carried out either through a constrained optimization for the maximum likelihood method or through a constrained prior distribution in the Bayesian framework. In this chapter, we focus on Bayesian isotonic regression methods.

There are roughly two flavors when it comes to the choice of splines as basis functions for monotone function estimation. One approach directly uses B-splines or P-splines. For B-splines of order m with equally spaced knots,

(d/dx) B_{j,m}(x) = (1/h) [ B_{j,m−1}(x) − B_{j+1,m−1}(x) ],

where h = x_{j+1} − x_j. As a result, for q(x) = Σ_{j=1}^J β_j B_{j,m}(x),

q′(x) = (1/h) Σ_{j=2}^J (β_j − β_{j−1}) B_{j,m−1}(x).

Since B-splines are nonnegative by construction, it is easy to see that a natural sufficient condition for the function q(x) to be monotonically increasing is that the coefficients satisfy the ordering β_1 ≤ … ≤ β_{J−1} ≤ β_J. For B-splines of higher orders, such as cubic splines, this is a sufficient but not necessary condition and, as a result, it represents only a subset of all legitimate monotone functions. The sufficient and necessary constraints on higher order splines to obtain monotonicity can be very complex. Therefore, most applications have practically considered only linear or quadratic splines. Brezger and Steiner (2008) is an example of this approach, where the unknown function is expanded by the traditional P-splines and the monotonicity of the unknown function is guaranteed through a truncated normal prior constraining the order of the coefficients.

The other approach to monotone function estimation is to embed the monotonic feature into the basis functions. In other words, for q to be monotone, one can simply use monotone basis functions {w_j(x)}_{j=1}^J and then impose sign constraints on the coefficients {β_j}_{j=1}^J. Suppose that the w_j(x)'s are monotonically increasing in x. Then q(x) is monotonically increasing (decreasing) in x if β_j ≥ (≤) 0 for all j ≥ 1. A prominent example of this type of basis functions is the integrated regression splines (I-splines) proposed by Ramsay (1988). For example, Meyer et al. (2011) employs the monotonic I-splines as basis functions and uses gamma priors for the coefficients to ensure monotonicity. A different set of monotone basis functions is proposed by Neelon and Dunson (2004) and also used in Cai and Dunson (2007). Their basis functions are defined as

w_j(x) = I(x > γ_{j−1}) [ min(x, γ_j) − γ_{j−1} ],  j = 1, …, J,    (2.2.3)

given pre-specified knots γ_0 < … < γ_J. These basis functions are monotone and nonnegative but not orthogonal, and because of their non-orthogonality, the coefficients {β_j}_{j=1}^J are heavily correlated with each other. Although this is usually an undesirable feature, Neelon and Dunson (2004) and Cai and Dunson (2007) proposed priors for {β_j}_{j=1}^J based on a latent random walk and a Gaussian Markov random field, respectively, to explicitly model the correlation among the coefficients.

In this work, we adopt the Bayesian method of Neelon and Dunson (2004) in our semiparametric SV models. When (2.2.3) are used as our basis functions in (2.2.2), the monotonicity constraint on the function q is exactly equivalent to sign constraints on the coefficients {β_j}_{j=1}^J. It is also easy to see that with (2.2.3), the parameter β_0 in (2.2.2) satisfies β_0 = q(γ_0); i.e., β_0 represents the "baseline" of the function q on [γ_0, γ_J]. The coefficient β_j, j ≥ 1, is simply the slope of the function q on the knot interval [γ_{j−1}, γ_j]. Due to the nonorthogonality of the basis functions (2.2.3), the estimates of the coefficients {β_j}_{j=0}^J will depend on each other through the equations

q(γ_0) = β_0,
q(γ_1) = β_0 + β_1(γ_1 − γ_0),
q(γ_2) = β_0 + β_1(γ_1 − γ_0) + β_2(γ_2 − γ_1),
⋮
q(γ_J) = β_0 + Σ_{j=1}^J β_j(γ_j − γ_{j−1}).

On top of that, the estimated q function will have larger variance towards the boundaries because of the relative lack of data in those regions. As a result of both the dependence among model coefficients and the large estimation variance of q(γ_j) for small γ_j, the estimation efficiency will be low for the baseline coefficient β_0 as well as the first few slope coefficients β_j. For SV models or other state space models, this problem is aggravated by the necessity of specifying the knots with a wide enough range to cover all possible values of the latent variables.

Hence we propose to shift the basis functions so that the baseline coefficient β_0 in (2.2.2) represents the "center" of the function q. We let

w_j(x) = I(x > γ_{j−1}) [ min(x, γ_j) − γ_{j−1} ] − (γ_j − γ_{j−1})   if 1 ≤ j < M;
w_j(x) = I(x > γ_{j−1}) [ min(x, γ_j) − γ_{j−1} ]                     if M ≤ j ≤ K,    (2.2.4)

where M denotes the "center" of the range of x. Figure 2.1 compares the centered and the uncentered basis functions using a set of unequally spaced knots. With the basis functions (2.2.4), the parameter β_0 in (2.2.2) models the center of the function through the equation β_0 = q(γ_{M−1}). The estimation variance of q(γ_{M−1}) tends to be much smaller than that of q(γ_0) because of the abundance of information contributed by the data from both sides of γ_{M−1}. As a result, although (2.2.4) renders exactly the same model as (2.2.3), it considerably improves the computational efficiency for the class of state space models entertained in this chapter.
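A direct implementation of the shifted bases (2.2.4) makes these properties easy to verify numerically: nonnegative slope coefficients yield a nondecreasing q, and q(γ_{M−1}) equals the baseline β_0. The knot and coefficient values below are illustrative:

```python
import numpy as np

def centered_bases(x, knots, M):
    """Shifted monotone basis functions (2.2.4) for pre-specified knots
    gamma_0 < ... < gamma_K; pieces left of the center index M are shifted
    down by their own knot spacing so they vanish at gamma_{M-1}."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    K = len(knots) - 1
    W = np.empty((len(x), K))
    for j in range(1, K + 1):
        g0, g1 = knots[j - 1], knots[j]
        w = np.where(x > g0, np.minimum(x, g1) - g0, 0.0)
        if j < M:
            w = w - (g1 - g0)
        W[:, j - 1] = w
    return W

knots = np.array([0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0])   # gamma_0..gamma_6, K = 6
M = 4                                                    # center knot gamma_{M-1} = 0.5
beta0 = 1.5
beta = np.array([0.4, 2.0, 0.1, 0.7, 0.0, 1.2])          # nonnegative slopes

xs = np.linspace(0.0, 1.0, 201)
q = beta0 + centered_bases(xs, knots, M) @ beta

q_center = (beta0 + centered_bases(knots[M - 1], knots, M) @ beta)[0]
print(q[0], q_center, q[-1])  # q is nondecreasing and q(gamma_{M-1}) == beta0
```

Because all shifted pieces evaluate to zero at γ_{M−1}, the intercept β_0 is identified with the center of the curve rather than its left boundary.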

Letting w(h_{t−1}) and v(ε_{t−1}) denote the basis functions for f and g respectively, we can rewrite (2.2.1) as

r_t = exp(h_t/2) ε_t,  ε_t ∼ iid N(0, 1);
h_t = µ + Σ_{k=1}^K β_k w_k(h_{t−1}) + Σ_{l=1}^L α_l v_l(ε_{t−1}) + η_t,  η_t ∼ iid N(0, σ²).    (2.2.5)

As for the choice of M, we use M = ⌈(K + 1)/2⌉ for estimating the function f(h_{t−1}) in (2.2.5). For estimating the function g(ε_{t−1}), M is specified in such a way that γ_{M−1} = 0. This is always possible since the lagged return innovation, ε_{t−1}, follows a standard normal distribution. It is easy to see that model (2.2.5) satisfies

f(γ^h_{M−1}) = Σ_{k=1}^K β_k w_k(γ^h_{M−1}) = 0


Figure 2.1: A comparison of the centered and uncentered basis functions. 7 knots (unequally spaced in the first row and equally spaced in the second row) are used for illustration purposes, where the first and last knots denote the lowest and largest possible values of the predictor x. Different colors and line types are used to differentiate different basis functions. Note how the centered and uncentered basis functions differ in terms of the 0 horizontal line.

and

g(γ^ε_{M−1}) = Σ_{l=1}^L α_l v_l(γ^ε_{M−1}) = 0,

which are the additional constraints needed for the model to be identifiable. We use the following priors from Neelon and Dunson (2004) on {β_k}_{k=1}^K to ensure their non-negativity:

β*_1 ∼ π(β*_1);
β*_k ∼ N(β*_{k−1}, τ²),  k ≥ 2;
β_k = I(β*_k > 0) β*_k,  k ∈ {1, …, K}.    (2.2.6)

The prior for β*_1 can be based on prior knowledge of the steepness of the function toward the left tail. However, if such knowledge is not readily available, the flat prior π(β*_1) ∝ 1 can be a convenient alternative, which also has a nice property that the posterior distribution does not depend on the starting point, nor on the direction, of the latent random walk process {β*_k}_{k=1}^K. In other words, the posterior will be exactly the same as by assuming the random walk starts at β*_κ with π(β*_κ) ∝ 1, for any integer κ from 1 to K.

As for the function g(ε_{t−1}) = Σ_l α_l v_l(ε_{t−1}), we have empirical evidence from the literature (such as Yu, 2012) showing that it is monotonically decreasing on the negative real line, but we do not have consistent information about its shape on the positive real line. Accordingly, our priors for {α_l}_{l=1}^L are specified as follows.

α*_1 ∼ π(α*_1);
α*_l ∼ N(α*_{l−1}, γ²),  l ≥ 2;
α_l = I(α*_l < 0) α*_l,  l ≤ M − 1;    (2.2.7)
α_l = ι(α*_l) α*_l,  l ≥ M.    (2.2.8)

In the equations above, π(α*_1) can be specified in the same way as π(β*_1), and ι is an indicator function that depends on the assumption about the shape of g on the positive real line: if g(ε_{t−1}) is monotonically decreasing in positive ε_{t−1}, then ι(α*_l) = I(α*_l < 0); if g(ε_{t−1}) is monotonically increasing in positive ε_{t−1}, then ι(α*_l) = I(α*_l > 0); finally, if we lack such information or if g(ε_{t−1}) is not monotone for positive ε_{t−1}, we can simply set ι(α*_l) = 1 to remove the shape constraint on that part of the leverage function altogether.

The remaining parameters are assumed to be mutually independent. For the variance parameters σ², τ² and γ², we use the conjugate inverse gamma distribution as the prior for each parameter. We assume a N(0, d_µ²) prior for µ and a N(0, d_0²) prior for the initial state h_1.

2.3 Model Fitting and Comparison

2.3.1 MCMC Procedure

We use a Gibbs sampler to simulate from the joint posterior distribution of model (2.2.5). An overview of the MCMC algorithm for a single iteration is described below.

Step 1 - Update {β*_k}_{k=1}^K and {α*_l}_{l=1}^L sequentially from their full conditional distributions, which are mixtures of two truncated normal distributions for the shape-constrained semiparametric model and normal distributions for the corresponding semiparametric model without shape constraints. For the shape-constrained model, compute {β_k}_{k=1}^K and {α_l}_{l=1}^L deterministically using (2.2.6) – (2.2.8).

Step 2 - Update the variance parameters τ², γ² and σ² from their inverse gamma full conditional distributions.

Step 3 - Update µ from its normal full conditional distribution.

Step 4 - Update {h_t, t = 1, …, T} sequentially from their full conditional distributions.

The detailed full conditional distributions, as well as discussions of the specific challenges involved in each step, are presented below.

1) Update β*_k and β_k: Without sign constraints on β_k (i.e., no shape constraints on the autoregressive function f), we draw β_k = β*_k from its normal full conditional distribution, whose mean and variance are given by

µ̃_k = [ τ_k^{2*} / (σ_k^{2w} + τ_k^{2*}) ] µ_k^w + [ σ_k^{2w} / (σ_k^{2w} + τ_k^{2*}) ] µ*_k   and   τ̃_k² = σ_k^{2w} τ_k^{2*} / (σ_k^{2w} + τ_k^{2*}).

In the above equations, µ*_k and τ_k^{2*} are the conditional prior mean and variance of β*_k, whereas µ_k^w and σ_k^{2w} reflect the new information about β*_k in the data. If we use a flat prior for the initial state of the random walk prior of {β*_k}_{k=1}^K, we have

µ*_k = β*_{k+1} if k = 1;   β*_{k−1} if k = K;   (β*_{k−1} + β*_{k+1})/2 otherwise,

and

τ_k^{2*} = τ² if k = 1 or K;   τ²/2 otherwise.

From the likelihood, we can obtain

µ_k^w = Σ_{t=2}^T w_k(h_{t−1}) e_k(t) / Σ_{t=2}^T w_k(h_{t−1})²   and   σ_k^{2w} = σ² / Σ_{t=2}^T w_k(h_{t−1})²,

where

e_k(t) = h_t − µ − Σ_{l=1}^L v_l(ε_{t−1}) α_l − Σ_{j=1, j≠k}^K w_j(h_{t−1}) β_j.
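The quantities µ_k^w and σ_k^{2w} are weighted-least-squares summaries of the partial residuals e_k(t). The sketch below (toy, randomly generated bases; all names hypothetical) computes them and checks a sanity property: if the states were generated exactly from the volatility equation with no noise, µ_k^w recovers β_k:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, L = 500, 6, 4
sigma2 = 0.04

W = rng.uniform(0.1, 1.0, (T - 1, K))    # w_k(h_{t-1}) evaluated for t = 2, ..., T
V = rng.standard_normal((T - 1, L))      # v_l(eps_{t-1}) evaluated for t = 2, ..., T
mu = 0.3
beta = rng.uniform(0.0, 0.5, K)          # nonnegative slopes
alpha = rng.normal(scale=0.1, size=L)

# Noise-free states h_2, ..., h_T for the sanity check
h_rest = mu + V @ alpha + W @ beta

def likelihood_moments(k):
    """mu_k^w and sigma_k^{2w} for the full conditional of beta*_k."""
    # partial residual e_k(t): leave the k-th autoregressive term out
    e_k = h_rest - mu - V @ alpha - (W @ beta - W[:, k] * beta[k])
    s = np.sum(W[:, k] ** 2)
    return np.sum(W[:, k] * e_k) / s, sigma2 / s

mu_w, sig2_w = likelihood_moments(2)
print(mu_w, beta[2], sig2_w)  # mu_w matches beta_2 in the noise-free case
```

In a real MCMC iteration, h would of course be the current draw of the latent states rather than a noise-free construction.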

If $\beta_k$ is constrained to be positive a priori, i.e., $\beta_k = I(\beta_k^* > 0)\,\beta_k^*$, Neelon and Dunson (2004) show that the full conditional of $\beta_k^*$ can be expressed as a mixture of two truncated normal distributions. Let $\Phi(\cdot\,; m, v)$ be the cumulative distribution function of a $N(m, v)$ random variable, $\phi(\cdot\,; m, v)$ be the corresponding density function, and $tN(\mu_0, \sigma_0^2; [a, b])$ denote the truncated normal distribution with support $[a, b]$. Then
\[
\begin{aligned}
\pi(\beta_k^* \mid -) &\propto \prod_{t=2}^T \pi(h_t \mid h_{t-1}, \beta_k, \beta_{-k})\; \pi(\beta_k^* \mid \beta_{k-1}^*, \tau^2)\; \pi(\beta_k^* \mid \beta_{k+1}^*, \tau^2) \\
&\propto \phi\!\left(\beta_k;\, \mu_k^w, \sigma_k^{2w}\right) \times \phi\!\left(\beta_k^*;\, \mu_k^*, \tau_k^{2*}\right) \\
&= \begin{cases} \phi(0; \mu_k^w, \sigma_k^{2w})\, \phi(\beta_k^*; \mu_k^*, \tau_k^{2*}) & \text{if } \beta_k^* \le 0, \\ \phi(\beta_k^*; \mu_k^w, \sigma_k^{2w})\, \phi(\beta_k^*; \mu_k^*, \tau_k^{2*}) & \text{if } \beta_k^* > 0 \end{cases} \\
&= \begin{cases} \dfrac{\phi(\beta_k^*; \mu_k^*, \tau_k^{2*})}{\phi(0; \mu_k^*, \tau_k^{2*})}\, \phi(0; \mu_k^*, \tau_k^{2*})\, \phi(0; \mu_k^w, \sigma_k^{2w}) & \text{if } \beta_k^* \le 0, \\[1.5ex] \dfrac{\phi(\beta_k^*; \tilde\mu_k, \tilde\tau_k^2)}{\phi(0; \tilde\mu_k, \tilde\tau_k^2)}\, \phi(0; \mu_k^*, \tau_k^{2*})\, \phi(0; \mu_k^w, \sigma_k^{2w}) & \text{if } \beta_k^* > 0 \end{cases} \\
&\propto \begin{cases} \dfrac{\phi(\beta_k^*; \mu_k^*, \tau_k^{2*})}{\phi(0; \mu_k^*, \tau_k^{2*})} & \text{if } \beta_k^* \le 0, \\[1.5ex] \dfrac{\phi(\beta_k^*; \tilde\mu_k, \tilde\tau_k^2)}{\phi(0; \tilde\mu_k, \tilde\tau_k^2)} & \text{if } \beta_k^* > 0. \end{cases}
\end{aligned}
\]

Thus, an MCMC draw for $\beta_k^*$ is simply a draw from a $tN(\mu_k^*, \tau_k^{2*}; (-\infty, 0])$ distribution with probability proportional to
\[
\frac{\Phi(0; \mu_k^*, \tau_k^{2*})}{\phi(0; \mu_k^*, \tau_k^{2*})},
\]
or from a $tN(\tilde\mu_k, \tilde\tau_k^2; [0, \infty))$ distribution with probability proportional to
\[
\frac{1 - \Phi(0; \tilde\mu_k, \tilde\tau_k^2)}{\phi(0; \tilde\mu_k, \tilde\tau_k^2)}.
\]
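Sampling from this two-component truncated-normal mixture is straightforward; the Python sketch below (illustrative only, using SciPy's `truncnorm`; the function name is hypothetical) first computes the unnormalized component weights and then draws from the selected truncated component:

```python
import numpy as np
from scipy.stats import norm, truncnorm

def draw_beta_star(mu_star, tau2_star, mu_tilde, tau2_tilde, rng):
    """Draw beta_k^* from the mixture of two truncated normals:
    a tN(mu_star, tau2_star; (-inf, 0]) component and a
    tN(mu_tilde, tau2_tilde; [0, inf)) component, with the
    unnormalized weights given in the text."""
    s_star, s_tilde = np.sqrt(tau2_star), np.sqrt(tau2_tilde)
    w_neg = norm.cdf(0, mu_star, s_star) / norm.pdf(0, mu_star, s_star)
    w_pos = (1 - norm.cdf(0, mu_tilde, s_tilde)) / norm.pdf(0, mu_tilde, s_tilde)
    if rng.uniform() < w_neg / (w_neg + w_pos):
        # truncnorm takes standardized bounds (lo - mu)/s, (hi - mu)/s
        return truncnorm.rvs(-np.inf, (0 - mu_star) / s_star,
                             loc=mu_star, scale=s_star, random_state=rng)
    return truncnorm.rvs((0 - mu_tilde) / s_tilde, np.inf,
                         loc=mu_tilde, scale=s_tilde, random_state=rng)
```

Note that for components whose mean is far from zero, the ratio $\Phi/\phi$ can under- or overflow in floating point; a careful implementation would compute these weights on the log scale.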

We can see that for $\mu_k^w$ and $\sigma_k^{2w}$ (and hence $\tilde\mu_k$ and $\tilde\tau_k^2$) to exist, the corresponding basis function evaluated at $h_{t-1}$, i.e., $w_k(h_{t-1})$, needs to be non-zero for at least some $t \ge 2$. However, this is not always guaranteed, since by the definition of the basis functions (2.2.4), when $k \ge M$, $w_k(h_{t-1}) = 0$ for $h_{t-1} < \gamma_{k-1}$, and when $k < M$, $w_k(h_{t-1}) = 0$ for $h_{t-1} > \gamma_k$. In other words, $w_k(h_{t-1})$ would be zero for all $t \ge 2$ if all values of $\{h_{t-1}\}_{t=2}^T$ simulated in one MCMC iteration are below $\gamma_{k-1}$ for $k \ge M$, or above $\gamma_k$ for $k < M$. Whenever this happens, we draw $\beta_k^*$ from its conditional prior distribution $N(\mu_k^*, \tau_k^{2*})$. It is worth pointing out that this is different from a variable selection problem, where reversible jump MCMC (Green, 1995) or pseudo-priors (Carlin and Chib, 1995) are used, since which coefficients can be updated by the data is uniquely determined given the draws of $h_t$, rather than being a stochastic choice. A useful implication of drawing from the prior to update a coefficient when no data are available is that it allows us to specify a knot grid for estimating the autoregressive function $f$ that covers a large enough range so that the values of $\{h_t\}_{t \in \mathbb{Z}}$ in the MCMC iterations are always within the boundary knots.

2) Update $\alpha_l^*$ and $\alpha_l$: The difference between updating $\alpha_l$ and $\beta_k$ lies in the fact that both positive and negative sign constraints for $\alpha_l$ are considered in (2.2.7) and (2.2.8). Without sign constraints on $\alpha_l$ (i.e., without shape constraints on the leverage function $g$), the full conditional distribution of $\alpha_l^*$ is similar to that of $\beta_k^*$. When $\alpha_l$ is constrained to be positive, the full conditional distribution of $\alpha_l^*$ is again similar to that of $\beta_k^*$ for positive $\beta_k$. However, when $\alpha_l$ is constrained to be negative, the full conditional distribution of $\alpha_l^*$ is a mixture of two truncated normal distributions in which the positive part now depends on the conditional prior mean and variance and the negative part on the data. The full conditional distribution for $\alpha_l^*$ is described below.

Similarly to the update of $\beta_k^*$, we let
\[
\mu_l^v = \sum_{t=2}^T v_l(\varepsilon_{t-1})\, e_l(t) \Big/ \sum_{t=2}^T v_l^2(\varepsilon_{t-1})
\quad\text{and}\quad
\sigma_l^{2v} = \sigma^2 \Big/ \sum_{t=2}^T v_l^2(\varepsilon_{t-1}),
\]
where
\[
e_l(t) = h_t - \mu - \sum_{k=1}^K w_k(h_{t-1})\,\beta_k - \sum_{j=1,\, j \neq l}^L v_j(\varepsilon_{t-1})\,\alpha_j,
\]
with $\varepsilon_{t-1} = e^{-h_{t-1}/2}\, r_{t-1}$. On the other hand, the conditional prior mean and variance for the latent $\alpha_l^*$ are
\[
\mu_l^* = \begin{cases} \alpha_{l+1}^* & \text{if } l = 1, \\ \alpha_{l-1}^* & \text{if } l = L, \\ (\alpha_{l-1}^* + \alpha_{l+1}^*)/2 & \text{otherwise,} \end{cases}
\quad\text{and}\quad
\gamma_l^{2*} = \begin{cases} \gamma^2 & \text{if } l = 1 \text{ or } L, \\ \gamma^2/2 & \text{otherwise.} \end{cases}
\]
As before, the mean and variance of the full conditional distribution of $\alpha_l^*$ when there is no sign constraint are given by
\[
\tilde\mu_l = \frac{\gamma_l^{2*}}{\sigma_l^{2v} + \gamma_l^{2*}}\,\mu_l^v + \frac{\sigma_l^{2v}}{\sigma_l^{2v} + \gamma_l^{2*}}\,\mu_l^*
\quad\text{and}\quad
\tilde\gamma_l^2 = \frac{\sigma_l^{2v}\,\gamma_l^{2*}}{\sigma_l^{2v} + \gamma_l^{2*}}.
\]

In summary, the different scenarios for sampling from the full conditional distribution for $\alpha_l^*$ are as follows.

• If $\alpha_l = \alpha_l^*$, i.e., we do not impose any sign constraints on $\alpha_l$, the full conditional distribution for $\alpha_l^*$ is simply $N(\tilde\mu_l, \tilde\gamma_l^2)$.

• If $\alpha_l = I(\alpha_l^* > 0)\,\alpha_l^*$, then the full conditional distribution for $\alpha_l^*$ is a mixture of two truncated normal distributions: a $tN(\mu_l^*, \gamma_l^{2*}; (-\infty, 0])$ distribution with probability proportional to
\[
\frac{\Phi(0; \mu_l^*, \gamma_l^{2*})}{\phi(0; \mu_l^*, \gamma_l^{2*})},
\]
and a $tN(\tilde\mu_l, \tilde\gamma_l^2; [0, \infty))$ distribution with probability proportional to
\[
\frac{1 - \Phi(0; \tilde\mu_l, \tilde\gamma_l^2)}{\phi(0; \tilde\mu_l, \tilde\gamma_l^2)}.
\]

• If $\alpha_l = I(\alpha_l^* < 0)\,\alpha_l^*$, then the full conditional distribution for $\alpha_l^*$ is a mixture of two different truncated normal distributions: a $tN(\mu_l^*, \gamma_l^{2*}; [0, \infty))$ distribution with probability proportional to
\[
\frac{1 - \Phi(0; \mu_l^*, \gamma_l^{2*})}{\phi(0; \mu_l^*, \gamma_l^{2*})},
\]
and a $tN(\tilde\mu_l, \tilde\gamma_l^2; (-\infty, 0])$ distribution with probability proportional to
\[
\frac{\Phi(0; \tilde\mu_l, \tilde\gamma_l^2)}{\phi(0; \tilde\mu_l, \tilde\gamma_l^2)}.
\]

• If $\sum_{t=2}^T v_l^2(\varepsilon_{t-1}) = 0$, then sample $\alpha_l^*$ from $N(\mu_l^*, \gamma_l^{2*})$.

3) Update the variance parameters $\sigma^2$, $\tau^2$ and $\gamma^2$: Suppose the conjugate prior distributions for $\sigma^2$, $\tau^2$ and $\gamma^2$ are $IG(a_s, b_s)$, $IG(a_t, b_t)$ and $IG(a_g, b_g)$, respectively. Then the full conditional distribution for $\sigma^2$ is
\[
IG\!\left(a_s + \frac{T-1}{2},\; b_s + 0.5 \sum_{t=2}^T e_t^2\right),
\quad\text{where}\quad
e_t = h_t - \mu - \sum_{k=1}^K w_k(h_{t-1})\,\beta_k - \sum_{l=1}^L v_l(\varepsilon_{t-1})\,\alpha_l.
\]
The full conditional distribution for $\tau^2$ is
\[
IG\!\left(a_t + \frac{K-1}{2},\; b_t + 0.5 \sum_{k=2}^K [\beta_k^* - \beta_{k-1}^*]^2\right).
\]
Finally, the full conditional distribution for $\gamma^2$ is
\[
IG\!\left(a_g + \frac{L-1}{2},\; b_g + 0.5 \sum_{l=2}^L [\alpha_l^* - \alpha_{l-1}^*]^2\right).
\]
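These conjugate updates are easy to implement. As a Python sketch (hypothetical function names; NumPy has no inverse-gamma sampler, so we invert a gamma draw with the rate expressed as a scale):

```python
import numpy as np

def update_sigma2(e, a_s, b_s, rng):
    """Draw sigma^2 from IG(a_s + (T-1)/2, b_s + 0.5 * sum e_t^2),
    where e holds the residuals e_2, ..., e_T."""
    shape = a_s + len(e) / 2.0
    rate = b_s + 0.5 * np.sum(np.asarray(e) ** 2)
    # An IG(shape, rate) draw is the reciprocal of a Gamma(shape, rate) draw.
    return 1.0 / rng.gamma(shape, 1.0 / rate)

def update_tau2(beta_star, a_t, b_t, rng):
    """Draw tau^2 from its IG full conditional built from the K-1
    successive differences of the random-walk coefficients beta*_k."""
    d = np.diff(np.asarray(beta_star))  # beta*_k - beta*_{k-1}, k = 2..K
    shape = a_t + len(d) / 2.0
    rate = b_t + 0.5 * np.sum(d ** 2)
    return 1.0 / rng.gamma(shape, 1.0 / rate)
```

The update for $\gamma^2$ is identical in form, with the $\alpha_l^*$ differences in place of the $\beta_k^*$ differences.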

4) Update $\mu$: Let
\[
e_\mu(t) = h_t - \sum_{k=1}^K w_k(h_{t-1})\,\beta_k - \sum_{l=1}^L v_l(\varepsilon_{t-1})\,\alpha_l.
\]
The full conditional distribution for $\mu$ is
\[
N\!\left(\frac{d_\mu^2}{(T-1)\,d_\mu^2 + \sigma^2} \sum_{t=2}^T e_\mu(t),\; \frac{d_\mu^2\,\sigma^2}{(T-1)\,d_\mu^2 + \sigma^2}\right),
\]
where $d_\mu^2$ denotes the prior variance of $\mu$.

5) Update $h_t$: The full conditional distribution for $h_t$ ($2 \le t \le T-1$) satisfies
\[
\pi(h_t \mid r_t, h_{t-1}, h_{t+1}, \beta, \mu, \sigma^2) \propto \phi(r_t; 0, e^{h_t})\; \phi\!\left(h_t;\, \mu + \sum_{k=1}^K \beta_k w_k(h_{t-1}) + \sum_{l=1}^L \alpha_l v_l(\varepsilon_{t-1}),\, \sigma^2\right) \times \phi\!\left(h_{t+1};\, \mu + \sum_{k=1}^K \beta_k w_k(h_t) + \sum_{l=1}^L \alpha_l v_l(\varepsilon_t),\, \sigma^2\right).
\]
For $h_1$, this full conditional becomes
\[
\pi(h_1 \mid r_1, h_2, \beta, \mu, \sigma^2) \propto \phi(r_1; 0, e^{h_1})\; \phi(h_1; 0, d_0^2)\; \phi\!\left(h_2;\, \mu + \sum_{k=1}^K \beta_k w_k(h_1) + \sum_{l=1}^L \alpha_l v_l(\varepsilon_1),\, \sigma^2\right),
\]
and for $h_T$, it is
\[
\pi(h_T \mid r_T, h_{T-1}, \beta, \mu, \sigma^2) \propto \phi(r_T; 0, e^{h_T})\; \phi\!\left(h_T;\, \mu + \sum_{k=1}^K \beta_k w_k(h_{T-1}) + \sum_{l=1}^L \alpha_l v_l(\varepsilon_{T-1}),\, \sigma^2\right).
\]
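A single random-walk Metropolis step targeting the interior full conditional above can be sketched in Python (illustrative only, not the dissertation's R/C++ implementation; `mean_fn`, which stands for the state mean $\mu + f(h_{t-1}) + g(\varepsilon_{t-1})$, and the proposal step size are placeholders):

```python
import numpy as np

def rw_update_ht(h, t, r, mean_fn, sigma2, step, rng):
    """One random-walk Metropolis step for an interior state h[t]
    (0-based index, 0 < t < len(h) - 1). mean_fn(h_prev, eps_prev)
    returns the conditional state mean; h is modified in place."""
    def log_target(ht):
        eps_prev = r[t - 1] * np.exp(-h[t - 1] / 2.0)
        m_t = mean_fn(h[t - 1], eps_prev)
        eps_t = r[t] * np.exp(-ht / 2.0)
        m_next = mean_fn(ht, eps_t)
        lp = -0.5 * (ht + r[t] ** 2 * np.exp(-ht))      # log phi(r_t; 0, e^{h_t})
        lp += -0.5 * (ht - m_t) ** 2 / sigma2           # log p(h_t | h_{t-1})
        lp += -0.5 * (h[t + 1] - m_next) ** 2 / sigma2  # log p(h_{t+1} | h_t)
        return lp

    prop = h[t] + step * rng.normal()
    if np.log(rng.uniform()) < log_target(prop) - log_target(h[t]):
        h[t] = prop
    return h[t]
```

The boundary states $h_1$ and $h_T$ would use the corresponding one-sided full conditionals given above.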

Updating the high-dimensional state vector $\{h_t,\, t = 1, \ldots, T\}$ remains a bottleneck for many applications of Bayesian state space models. For the linear unleveraged SV model, Kim et al. (1998) developed a popular algorithm that uses a mixture of normal distributions to approximate the log chi-squared distribution in the linearized observation equation, and then applies the augmented Kalman filter and simulation smoother to improve the efficiency of the MCMC. This method was later extended to the linear leveraged SV model in Omori et al. (2007) and Nakajima and Omori (2009). However, this approach hinges entirely on the linearity of the state equation, which does not apply in our case.

Several alternative strategies can potentially be used to update the state variables. First, the random-walk Metropolis-Hastings algorithm is the easiest to implement, but has the drawback of high autocorrelation among the samples. However, if we can simulate the chain long enough and thin it to alleviate the autocorrelation problem, this basic algorithm still performs reasonably well. Second, since the state variables are highly correlated with each other, updating more than one state variable at a time, i.e., using block updating (e.g., Chib and Greenberg, 1994, 1995), would be more efficient than one-at-a-time updating. To implement block updating, one way to form a block is to choose state variables further apart in time so that the variables in one block are close to independent of each other. For example, if we choose a block size of $B$, then a typical block can be constructed as
\[
\left(h_t,\; h_{t + \lfloor T/B \rfloor},\; h_{t + 2\lfloor T/B \rfloor},\; \ldots,\; h_{t + K_B \lfloor T/B \rfloor}\right),
\]
where $K_B = \operatorname{argmax}_k\, (t + k \lfloor T/B \rfloor \le T)$. Third, we can improve the proposal distributions for updating the state variables by exploiting the shape of the full-conditional distribution, such as using the Metropolis-adjusted Langevin algorithm (Grenander and Miller, 1994; Roberts and Tweedie, 1996) and its variations. Last but not least, particle-filter-based methods, such as the particle Markov chain Monte Carlo (PMCMC) method (e.g., Andrieu et al., 2010), can also be applied to update the state variables.
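The thinned block construction above amounts to taking every $\lfloor T/B \rfloor$-th time index starting from $t$; a minimal Python sketch (hypothetical function name, 1-based time indices):

```python
def block_indices(t, T, B):
    """Time indices (t, t + floor(T/B), t + 2*floor(T/B), ...) <= T
    forming one block of nearly independent state variables."""
    step = T // B
    return list(range(t, T + 1, step))
```

For example, with $T = 10$ and block size $B = 5$, starting points $t = 1$ and $t = 2$ give the interleaved blocks $(h_1, h_3, h_5, h_7, h_9)$ and $(h_2, h_4, h_6, h_8, h_{10})$.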

2.3.2 Model Comparison Criteria

For simulation studies where the true values of the latent log volatility, ht, are known, model comparison can be easily accomplished by comparing the estimated and simulated ht under the Mean Absolute Error (MAE) or the Mean Squared Error

(MSE). Under these two loss functions, it is also desirable that we make comparisons on the log volatility scale rather than the variance or standard deviation scale since the distribution of the former is more symmetric.

For empirical data, a likelihood-based criterion such as the Bayes factor is a natural choice for model comparison. Kim et al. (1998) developed a method of approximating the Bayes factor using posterior draws based on Chib (1995)'s method,
\[
\log p(r \mid M_k) = \log p(r \mid \theta^*, M_k) + \log p(\theta^* \mid M_k) - \log p(\theta^* \mid r, M_k).
\]
Yu (2012) adopted the same approach, where $\theta^*$ is chosen to be the posterior mean of the parameters $\theta$. However, such approaches require many approximations, such as calculating $p(r \mid \theta, M_k)$ with an importance sampling approach based on the Laplace approximation, as well as estimating $p(\theta \mid r, M_k)$ with a high-dimensional multivariate kernel density estimate. As for out-of-sample predictive performance, the most common approach is to use the so-called realized volatilities as the true states (see, e.g.,

Audrino and Bühlmann, 2009; Yu, 2012). However, realized volatility is but another estimator of the true volatility, and it suffers from market microstructure noise when used on high-frequency data (McAleer and Medeiros, 2008). Yu (2012) used daily returns to calculate weekly realized volatilities, but as illustrated in Figure 2.2, although they are better than simply using $r_t^2$, their usefulness as proxies for the true volatility states is debatable.

[Figure 2.2 about here: S&P 500 log volatility over time (left panel, "SP500") and its distribution (right panel, "Distribution").]

Figure 2.2: Stochastic volatility (grey line) vs. realized volatility (dark line) vs. $\log(r_t^2)/2$ (dots): the distribution of the realized volatilities is indeed less left-skewed than that of $\log(r_t^2)/2$, but is still much noisier than the estimated stochastic volatilities and slightly left-skewed.

In this chapter, we use the predictive log likelihood as our performance measure.

Let rt1:t2 be an abbreviation for {rt1 , . . . , rt2 } and let θ denote all model parameters.

Then the predictive log likelihood over the test period {T + 1,...,T + S} can be expressed as

\[
\log p(r_{(T+1):(T+S)} \mid r_{1:T}) = \sum_{s=0}^{S-1} \log p(r_{T+s+1} \mid r_{1:(T+s)}), \tag{2.3.1}
\]

where

\[
p(r_{T+s+1} \mid r_{1:(T+s)}) = \int p(r_{T+s+1} \mid h_{T+s+1})\; p(h_{T+s+1} \mid h_{T+s}, \theta, r_{1:(T+s)})\; p(h_{T+s}, \theta \mid r_{1:(T+s)})\; dh_{T+s+1}\, dh_{T+s}\, d\theta. \tag{2.3.2}
\]

37 Equation (2.3.1) suggests the following scheme for computing the predictive log

likelihood: first, use (2.3.2) as well as the MCMC output based on the training data

r1:(T +s) to evaluate the one-step-ahead predictive log likelihood, log p(rT +s+1|r1:(T +s));

next, append the new return rT +s+1 to the training data and refit the model using

MCMC. Finally, the overall log likelihood can be computed by iterating the previous

two steps for s = 0,...,S − 1 and then summing up all individual one-step-ahead

predictive log likelihoods. However, this method is computationally intense since we

have to re-run the entire MCMC algorithm whenever we append a new return to the

training data.

To alleviate the computational burden, we propose to use the “Forward Filter-

ing” stage of the nonlinear Forward-Filtering-Backward-Sampling algorithm of God-

sill et al. (2004). Instead of using the one-step-ahead prediction, a particle filter

allows us to carry out multiple-step-ahead prediction very efficiently. A key step in

the multiple-step-ahead prediction approach is to generate samples of hT +s, θ|r1:(T +s)

for s > 0 (when s = 0, the samples of hT , θ|r1:T can be directly obtained from the

output of the MCMC algorithm). Given that

\[
p(h_{T+s}, \theta \mid r_{1:(T+s)}) = \frac{p(r_{T+s} \mid h_{T+s}, \theta, r_{1:(T+s-1)})\; p(h_{T+s}, \theta \mid r_{1:(T+s-1)})}{p(r_{T+s} \mid r_{1:(T+s-1)})} = \frac{p(r_{T+s} \mid h_{T+s})}{p(r_{T+s} \mid r_{1:(T+s-1)})}\; p(h_{T+s}, \theta \mid r_{1:(T+s-1)}), \tag{2.3.3}
\]
we can apply Gordon et al. (1993)'s bootstrap filter to get weighted samples from the posterior distribution $p(h_{T+s}, \theta \mid r_{1:(T+s)})$ without having to re-run the MCMC for each $s$. When applying the bootstrap filter, we follow the suggestion of Liu and Chen

(1998) and resample using the residual resampling scheme only when the effective sample size (ESS) falls below a threshold (see the filtering step iii in the algorithm

38 below). See Kong et al. (1994) and Liu and Chen (1995) for in-depth discussions of

resampling and the ESS. Alternatively, we can consider other resampling algorithms

such as, most notably, the systematic or stratified sampling procedure discussed in

Kitagawa (1996).

Although the resampling approach alleviates the so-called sample impoverishment

or degeneracy problem of particle filters where only a few particles will have non-

negligible weights after a few iterations, it is still desirable to re-run the MCMC

regularly to obtain new samples. In other words, we need to pick a moderate step

size for our multiple-step-ahead prediction in order to limit the impact of the sample

impoverishment problem while still allowing fast computation. We also note that the

idea of rejuvenating the particles once in a while using a Gibbs sampler has been

explored previously in different scenarios (see, e.g. MacEachern et al., 1999).

In summary, our algorithm is as follows. Let D be the number of the posterior

draws after thinning and S0 be the step size of the multiple-step-ahead prediction (i.e., the length of the test data to be evaluated before we re-run the MCMC to refresh the samples). For s = 0,...,S − 1, we iterate the following two steps.

1) The Prediction Step

i. If $s \equiv 0 \pmod{S_0}$, run the Markov chain Monte Carlo algorithm to obtain new samples from the posterior distribution $p(h_{T+s}, \theta \mid r_{1:(T+s)})$. Initialize the particles as $\{h_{T+s}^{(d)}, \theta^{(d)};\, \tilde w_s^{(d)}\}_{d=1}^D$, where $h_{T+s}^{(d)}$ and $\theta^{(d)}$ are obtained from the MCMC and $\tilde w_s^{(d)} = 1/D$.

ii. For $d = 1, \ldots, D$, draw $h_{T+s+1}^{(d)}$ from the univariate normal distribution $p(h_{T+s+1} \mid h_{T+s}^{(d)}, \theta^{(d)}, r_{1:(T+s)})$.

iii. Estimate the predictive likelihood for the observation $r_{T+s+1}$ using
\[
\tilde p(r_{T+s+1} \mid r_{1:(T+s)}) \approx \sum_{d=1}^D \tilde w_s^{(d)}\; \phi\!\left(r_{T+s+1};\, 0,\, e^{h_{T+s+1}^{(d)}}\right).
\]

2) The Filtering Step

i. For $d = 1, \ldots, D$, given the particles $(h_{T+s+1}^{(d)}, \theta^{(d)})$ obtained in the prediction step, update the weights using $w_{s+1}^{(d)} \propto \tilde w_s^{(d)}\, p(r_{T+s+1} \mid h_{T+s+1}^{(d)})$, which follows from Equation (2.3.3).

ii. Normalize the weights, $\tilde w_{s+1}^{(d)} = w_{s+1}^{(d)} / \sum_{d=1}^D w_{s+1}^{(d)}$, and calculate the effective sample size, $ESS = 1 / \sum_{d=1}^D [\tilde w_{s+1}^{(d)}]^2$.

iii. If $ESS < 0.5D$, resample $D$ particles with probabilities $\{\tilde w_{s+1}^{(d)}\}_{d=1}^D$ and reset the weights of the new particles to be $\tilde w_{s+1}^{(d)} = 1/D$. Otherwise, use the particles and weights obtained previously, $\{h_{T+s+1}^{(d)}, \theta^{(d)};\, \tilde w_{s+1}^{(d)}\}_{d=1}^D$. The residual resampling method works as follows:

a - Retain $C_d = \lfloor D \tilde w_{s+1}^{(d)} \rfloor$ copies of $(h_{T+s+1}^{(d)}, \theta^{(d)})$ and calculate the total residual count $C_r = D - \sum_{d=1}^D C_d$.

b - Obtain $C_r$ iid draws from $\{h_{T+s+1}^{(d)}, \theta^{(d)}\}$ with probabilities proportional to $D \tilde w_{s+1}^{(d)} - C_d$, $d = 1, \ldots, D$.
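The ESS rule and the residual resampling scheme in step iii can be sketched as follows (illustrative Python with hypothetical names; the functions operate on normalized weights and return particle indices):

```python
import numpy as np

def ess(weights):
    """Effective sample size of a vector of normalized weights."""
    w = np.asarray(weights)
    return 1.0 / np.sum(w ** 2)

def residual_resample(weights, rng):
    """Residual resampling: keep floor(D*w_d) deterministic copies of
    particle d, then draw the remaining C_r indices with probabilities
    proportional to the residuals D*w_d - floor(D*w_d)."""
    w = np.asarray(weights)
    D = len(w)
    counts = np.floor(D * w).astype(int)
    idx = np.repeat(np.arange(D), counts)
    Cr = D - counts.sum()
    if Cr > 0:
        resid = D * w - counts
        resid = resid / resid.sum()
        idx = np.concatenate([idx, rng.choice(D, size=Cr, p=resid)])
    return idx
```

With uniform weights the residual step is vacuous: every particle is retained exactly once and no random draws are needed, which is why residual resampling adds less Monte Carlo noise than multinomial resampling.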

At the end, after all test data have been evaluated, we calculate the log predictive likelihood using (2.3.1); that is, $\sum_{s=0}^{S-1} \log \tilde p(r_{T+s+1} \mid r_{1:(T+s)})$.

2.4 Simulations

The main objective of our simulation studies is to understand the role of shape constraints in the estimation of log volatilities. We investigate both the unleveraged and the more realistic leveraged SV models with various forms of the volatility function, and compare the fit of our shape-constrained semiparametric additive SV model

40 (denoted by SC-SV) with the fit of two other models: the parametric SV model (AR-

SV) as defined by (1.2.1) and (2.1.1), and the semiparametric additive SV model

without shape constraints (SP-SV). To allow for different degrees of smoothness, we

consider semiparametric models with 10 and 20 knot intervals.

Our simulation results are based on 200 replicate time series of length $N = 2{,}048$ for each experimental condition. We use MAE and MSE to measure our ability to accurately estimate the log volatilities. We assume flat priors for $\beta_1^*$ and $\alpha_1^*$, and conjugate multivariate normal priors for $(\mu, \phi)$ or $(\mu, \phi, \psi)$ in the parametric AR-SV models. The MCMC algorithm for fitting the semiparametric SV models was presented in Section 2.3.1, and we used the basic random-walk Metropolis-Hastings algorithm to update the state variables $\{h_t\}_{t=1}^T$. The same algorithm also applies to the parametric models (1.2.1) and (2.1.1), except that the full conditional for the coefficients $(\mu, \phi)$ or $(\mu, \phi, \psi)$ is now multivariate normal given conjugate priors. We used R (R Core Team, 2013) for all computations, but used C++ through the R package RcppArmadillo (Eddelbuettel and Sanderson, 2014) for the update of the log volatilities, $\{h_t\}_{t=1}^N$, to speed up the computation. We use the log squared returns as the initial states for $\{h_t\}_{t=1}^N$. For the AR-SV and SC-SV models, we sample from the posterior distribution using 21,000 iterations of the MCMC algorithm. To ensure convergence of the Markov chain, we used 31,000 iterations of the MCMC algorithm for the SP-SV model. In all cases, the first 1,000 iterations were discarded as burn-in samples and the remaining samples were thinned every 10 iterations.

2.4.1 Uncentered versus Centered Basis Functions

To compare the two different sets of basis functions, (2.2.4) versus (2.2.3), we implement each set in the unleveraged shape-constrained semiparametric SV model and fit the resulting two models to the same dataset, simulated with the autoregressive function
\[
f(h_{t-1}) = -0.8 + 0.9\, h_{t-1}.
\]
We use 21 knots for the estimation of $f$ and run the MCMC for 21,000 iterations, with the first 1,000 draws discarded as burn-in samples and the remaining samples thinned every 10 iterations. Figure 2.3 shows the trace plots of the parameters $\mu$ and $\beta_1$, as defined in (2.2.5), for each set of basis functions. For the model with uncentered basis functions, $\mu$ represents the value of the estimated autoregressive function $f$ at the leftmost knot, and for the model with centered basis functions, $\mu$ represents the value of $f$ at the central knot.

In Figure 2.3, the trace plots for the uncentered basis functions clearly show larger variance and higher autocorrelation than those for the centered basis functions. The poor mixing of the MCMC chain when using the uncentered basis functions also affects the estimation accuracy of the log volatilities, $\{h_t\}_{t=1}^N$. The MSE of the estimated versus simulated log volatilities is 0.006 when using the centered basis functions, and this number is almost 50% larger for the uncentered version, at 0.010. Clearly, the centered basis functions should be preferred to the uncentered ones. Even though the resulting models are theoretically equivalent, they behave quite differently in practice. For the rest of this chapter, we implement the centered basis functions in all our semiparametric models.

[Figure 2.3 about here: trace plots of $\mu$ and $\beta_1$ for the uncentered (top row) and centered (bottom row) basis functions.]

Figure 2.3: Trace plots of the parameters $\mu$ and $\beta_1$ in model (2.2.5) with centered versus uncentered basis functions.

2.4.2 Unleveraged SV Models

We consider three different forms of the autoregressive component f in the SV

model (2.2.1) without leverage effect. (The three f functions are shown as the dashed

lines in Figure 2.6.) The variance of the error term ηt of the volatility process is set

to be 0.15 for all simulations. Although the scales of the returns and volatilities have

no bearing on model performance, we adjust the value of µ in each volatility equation

so that the simulated log volatilities are centered around −8, which is a typical value for real-world equity returns such as the S&P500 daily returns. The true volatility equations for the three unleveraged SV models are as follows.

1) (Lf) Linear f:

\[
h_t = -0.8 + 0.9\, h_{t-1} + \eta_t.
\]

43 This is the basic SV model.

2) (Cf) Change-point f:

\[
h_t = -8.1 + 0.9\, h_{t-1}^*\, I(h_{t-1}^* > 0) + 0.3\, h_{t-1}^*\, I(h_{t-1}^* \le 0) + \eta_t,
\]

where $h_{t-1}^* = h_{t-1} + 8$. The slope of this autoregressive function is steeper when

the lagged log volatility ht−1 is greater than the change point, −8, which reflects

the belief that the signal in the volatility process is stronger when the market is

more volatile.

3) (Sf) Sigmoid f:
\[
h_t = -8 + 1.6\, h_{t-1}^* \Big/ \sqrt{1 + h_{t-1}^{*2}} + \eta_t,
\]

where $h_{t-1}^* = h_{t-1} + 8$. This particular sigmoid function produces a volatility series with a quasi-Markov-switching behavior (see Figure 2.4). The two states

(volatile and stable) are generated by the two tails of the sigmoid function and

the steepness of the curve in the middle controls the transition between the two

states.
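The quasi-Markov-switching behavior is easy to reproduce by simulating the Sf model directly. The following Python sketch (illustrative only, not the R code used in the dissertation) matches the equation above, with $\mathrm{Var}(\eta_t) = 0.15$ and $r_t = e^{h_t/2}\,\varepsilon_t$ for standard normal $\varepsilon_t$:

```python
import numpy as np

def simulate_sf_sv(T, rng):
    """Simulate T returns from the unleveraged sigmoid-f SV model:
    h_t = -8 + 1.6 * h*_{t-1} / sqrt(1 + h*_{t-1}^2) + eta_t,
    with h* = h + 8, Var(eta) = 0.15, and r_t = exp(h_t / 2) * N(0, 1)."""
    h = np.empty(T)
    r = np.empty(T)
    h[0] = -8.0  # start at the center of the volatility process
    for t in range(T):
        if t > 0:
            hs = h[t - 1] + 8.0
            h[t] = -8.0 + 1.6 * hs / np.sqrt(1.0 + hs ** 2) \
                   + np.sqrt(0.15) * rng.normal()
        r[t] = np.exp(h[t] / 2.0) * rng.normal()
    return r, h
```

Plotting the simulated $\{h_t\}$ shows the two regimes produced by the tails of the sigmoid, with occasional transitions driven by the noise pushing the process across the steep middle section.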

Table 2.1 summarizes the estimated MSE and MAE of the posterior means of the log volatilities compared to their simulated values. When the autoregressive function is linear, i.e., when AR-SV is the correct model, the shape-constrained semiparametric model underperforms the AR-SV model by a small margin, while the SP-SV model, without proper shape constraints, is too flexible to pick up the linear form reliably. When the true autoregressive function is nonlinear, the impact of imposing the correct shape constraints on the estimation of the log volatilities becomes evident. For example, with the change-point autoregressive SV model, imposing shape constraints (i.e., the SC-SV model) reduces the MSE over the SP-SV model by 28% (10 knot intervals) and 30% (20 knot intervals), and the reduction in MSE for the sigmoid autoregressive SV model is 26% and 30%, respectively. This demonstrates the gain of including shape constraints in a semiparametric SV model when they are warranted.

True Model: Lf              AR-SV      SC-SV(10)  SP-SV(10)  SC-SV(20)  SP-SV(20)
Average (s.e.)   MSE        36.3(0.2)  36.7(0.2)  40.2(0.4)  36.9(0.2)  43.2(0.6)
                 MAE        47.9(0.1)  48.2(0.1)  50.4(0.3)  48.3(0.1)  52.2(0.3)
Rel. to AR-SV    MSE        100.0%     101.2%     110.7%     101.5%     119.0%
                 MAE        100.0%     100.6%     105.1%     100.8%     108.9%

True Model: Cf              AR-SV      SC-SV(10)  SP-SV(10)  SC-SV(20)  SP-SV(20)
Average (s.e.)   MSE        24.7(0.1)  23.1(0.1)  31.9(0.3)  23.1(0.1)  33.2(0.3)
                 MAE        39.4(0.1)  38.1(0.1)  44.7(0.2)  38.1(0.1)  46.0(0.2)
Rel. to AR-SV    MSE        100.0%     93.6%      129.3%     93.5%      134.7%
                 MAE        100.0%     96.9%      113.6%     96.8%      116.8%

True Model: Sf              AR-SV      SC-SV(10)  SP-SV(10)  SC-SV(20)  SP-SV(20)
Average (s.e.)   MSE        38.2(0.2)  34.2(0.2)  46.5(0.6)  34.0(0.2)  48.5(0.7)
                 MAE        49.1(0.1)  46.2(0.2)  55.7(0.4)  45.9(0.2)  57.1(0.4)
Rel. to AR-SV    MSE        100.0%     89.6%      121.8%     88.9%      126.9%
                 MAE        100.0%     94.1%      113.5%     93.6%      116.3%

Table 2.1: For the three true unleveraged SV models (Lf, Cf and Sf), a summary of the MSE and MAE of the log volatilities, $\{h_t\}_{t=1}^N$, obtained from fitting five different SV models. The averages and standard errors are multiplied by 100, and are based on 200 replicate series.

[Figure 2.4 about here: simulated return series (top panel) and latent log volatility series (bottom panel) for the sigmoid-f case.]

Figure 2.4: An example of the simulated datasets for the case of sigmoid f. The top plot shows the simulated return series, while the bottom one shows the simulated latent log volatility series, $\{h_t\}$. The quasi-Markov-switching feature is easily noticeable in the lower plot.

Our findings from Table 2.1 are further confirmed by the fitted autoregressive functions shown in Figure 2.5 (for the case of 10 knot intervals) and Figure 2.6 (for the case of 20 knot intervals). The only major difference between Figure 2.5 and Figure 2.6 is that the plots in Figure 2.6 are much smoother. The true autoregressive functions are shown as the dashed lines in each panel of the figures, and the estimates are the posterior means of $f(\tilde h_{t-1})$, where $\tilde h_{t-1}$ is a grid of pre-specified values of $h_t$.

[Figure 2.5 about here: panels AR-SV, SC-SV and SP-SV (columns) for the Lf, Cf and Sf cases (rows), plotting $\mu + f(h_{t-1})$ against $h_{t-1}$.]

Figure 2.5: Estimated autoregressive function f for the unleveraged SV models based on a single simulated dataset, using 10 knot intervals. The dashed line in each graph shows the true functional form of f. The solid line shows the posterior mean of f, and the shaded region indicates the point-wise 95% credible bands for f.

[Figure 2.6 about here: panels AR-SV, SC-SV and SP-SV (columns) for the Lf, Cf and Sf cases (rows), plotting $\mu + f(h_{t-1})$ against $h_{t-1}$.]

Figure 2.6: Estimated autoregressive function f for the unleveraged SV models based on a single simulated dataset, using 20 knot intervals. The dashed line in each graph shows the true functional form of f. The solid line shows the posterior mean of f, and the shaded region indicates the point-wise 95% credible bands for f.

It is clearly seen in both Figure 2.5 and Figure 2.6 that the estimated autoregressive functions for the semiparametric model without shape constraints (SP-SV) have very large posterior variance, especially toward the tails, where the volatility data are scarce. Additionally, the posterior distribution of $f(\tilde h_{t-1})$ can be highly skewed, and the posterior averages can deviate substantially from the truth (see the fitted f function for the change-point model). Imposing proper shape constraints in the semiparametric model considerably reduces the posterior variance and skewness of the estimated autoregressive function across all values of the past volatilities, and the resulting SC-SV model is still flexible enough to capture the shapes of the different true autoregressive functions.

2.4.3 Leveraged SV Models

For the leveraged models, we consider the following three different volatility equa-

tions. (Plots of the true autoregressive and leverage functions are shown as the dashed

lines in Figure 2.8.)

1) (Lf-Lg) Linear f & Linear g: The state equation is given by

ht = −0.9 + 0.9ht−1 − 0.6εt−1 + ηt,

where εt−1 = rt−1 exp(−ht−1/2). This is the AR-SV model with leverage effect.

2) (Lf-NLg) Linear f & Nonlinear g: The state equation is given by

ht = −1.1 + 0.9ht−1 − 0.6εt−1I(εt−1 < 0) + 0.1εt−1I(εt−1 ≥ 0) + ηt,

where $\varepsilon_{t-1} = r_{t-1} \exp(-h_{t-1}/2)$. If the variance of the error term $\eta_t$ is 0, this becomes the EGARCH model, whose news impact curve is asymmetric and piecewise linear.

3) (NLf-NLg) Nonlinear f & Nonlinear g: The state equation is given by
\[
h_t = -8.5 + 0.9\, h_{t-1}^*\, I(h_{t-1}^* > 0) + 0.3\, h_{t-1}^*\, I(h_{t-1}^* \le 0) - 0.6\, \varepsilon_{t-1}\, I(\varepsilon_{t-1} < 0) + 0.1\, \varepsilon_{t-1}\, I(\varepsilon_{t-1} \ge 0) + \eta_t,
\]
where $h_{t-1}^* = h_{t-1} + 8$ and $\varepsilon_{t-1} = r_{t-1} \exp(-h_{t-1}/2)$.

The variances of the $\eta_t$ terms in the above equations are set to be 0.15. We implement the correct shape constraints in the SC-SV models, which means that for the Lf-Lg case the leverage function g is assumed to be monotonically decreasing on the entire real line, while in the other two cases it is assumed to be monotonically decreasing on the negative real line and monotonically increasing on the positive real line. The MSE and MAE of the estimated log volatilities compared to the simulated values are shown in Table 2.2.

True Model: Lf-Lg           AR-SV      SC-SV(10)  SP-SV(10)  SC-SV(20)  SP-SV(20)
Average (s.e.)   MSE        38.0(0.2)  39.5(0.2)  39.6(0.2)  39.7(0.2)  40.3(0.3)
                 MAE        48.5(0.1)  49.6(0.1)  49.6(0.1)  49.7(0.1)  50.0(0.2)
Rel. to AR-SV    MSE        100.0%     104.2%     104.3%     104.7%     106.1%
                 MAE        100.0%     102.3%     102.3%     102.5%     103.1%

True Model: Lf-NLg          AR-SV      SC-SV(10)  SP-SV(10)  SC-SV(20)  SP-SV(20)
Average (s.e.)   MSE        35.3(0.2)  30.9(0.1)  31.2(0.1)  31.0(0.1)  31.7(0.3)
                 MAE        47.2(0.1)  44.3(0.1)  44.6(0.1)  44.4(0.1)  44.8(0.1)
Rel. to AR-SV    MSE        100.0%     87.5%      88.5%      87.9%      89.6%
                 MAE        100.0%     93.9%      94.4%      94.2%      94.9%

True Model: NLf-NLg         AR-SV      SC-SV(10)  SP-SV(10)  SC-SV(20)  SP-SV(20)
Average (s.e.)   MSE        25.5(0.1)  21.8(0.1)  26.3(0.3)  22.0(0.1)  27.0(0.3)
                 MAE        40.2(0.1)  37.4(0.1)  41.2(0.3)  37.6(0.1)  41.9(0.3)
Rel. to AR-SV    MSE        100.0%     85.5%      103.2%     86.4%      105.9%
                 MAE        100.0%     93.0%      102.4%     93.6%      104.2%

Table 2.2: For the three true leveraged SV models (Lf-Lg, Lf-NLg and NLf-NLg), a summary of the MSE and MAE of the log volatilities, $\{h_t\}_{t=1}^N$, obtained from fitting five different SV models. The averages and standard errors are multiplied by 100, and are based on 200 replicate series.

When the autoregressive component f and the leverage function g are both linear,

AR-SV is the true model and as expected, it outperforms both semiparametric models.

When either f or g is nonlinear, however, the AR-SV model is not flexible enough to pick up the true functional forms, resulting in the larger MSE and MAE as compared to the shape-constrained semiparametric model. For all three forms of the volatility function, the advantage of incorporating shape constraints in the semiparametric models is reflected in the consistently smaller MSE and MAE for the SC-SV model over the SP-SV model. As in Section 2.4.2, the greatest gain in the MSE or MAE was obtained when the autoregressive function f deviates the most from linearity

(the NLf-NLg case).

Figure 2.7 shows the estimated autoregressive function f (odd rows) and leverage function g (even rows) based on 10 knot intervals; the 20-knot-interval version is presented in Figure 2.8.

Figure 2.7: For the three true leveraged SV models (Lf-Lg, Lf-NLg and NLf-NLg), the estimated autoregressive function f (odd rows) and leverage function g (even rows) in different fitted models (columns) based on 10 knot intervals. The line types and shading are defined in the same way as in Figure 2.5.

Figure 2.8: For the three true leveraged SV models (Lf-Lg, Lf-NLg and NLf-NLg), the estimated autoregressive function f (odd rows) and leverage function g (even rows) in different fitted models (columns) based on 20 knot intervals. The line types and shading are defined in the same way as in Figure 2.5.

The two versions are very close to each other, with the plots in Figure 2.8 slightly smoother than those in Figure 2.7, so we focus on Figure 2.8 for the following discussion. Whenever the true autoregressive or leverage function is nonlinear, the fitted f or g in the semiparametric SV model without shape constraints (SP-SV) can have extremely large posterior variances toward the tails, where the data are scarce. Restricting the parameter space through proper shape constraints proves to be essential for more efficient estimation in SV models.

2.5 Empirical Studies

We now assess the advantage of our proposed shape-constrained semiparametric additive SV models for predicting volatilities in four real-world equity return series: the S&P500 index, Equity Residential (EQR; a real estate investment trust), Microsoft (MSFT), and Johnson & Johnson (JnJ). The returns are calculated as the change in the logarithm of the daily adjusted closing prices obtained from Yahoo! Finance (retrieved on January 12, 2014). For each series, we use the returns from November 1, 2001 to October 29, 2010 as our training set and the following three years of returns, from November 1, 2010 to October 31, 2013, as the test set. For both the training and test return series, we subtract the sample mean from the data so that the returns have zero mean.

As in our simulation studies, we consider the parametric model (AR-SV) and the semiparametric additive SV model with shape constraints (SC-SV) and without shape constraints (SP-SV). We fit both a leveraged and an unleveraged version of each model and compare their predictive log likelihoods over the test period. We run the MCMC for 102,500 iterations, discarding the first 2,500 draws as burn-in and thinning the remaining draws every 25 iterations. For the shape constraints in the SC-SV model, exploratory analyses indicated that it is reasonable to assume the autoregressive function f to be monotonically increasing and the leverage function g monotonically decreasing; these assumptions will be assessed later. The predictive log likelihood that we use to compare the models is calculated according to Section 2.3.2, with the step size S0 for the multiple-step prediction set to 110.

          AR-SV(u)  SC-SV(u)  SP-SV(u)  AR-SV(l)  SC-SV(l)  SP-SV(l)
S&P500    -3798.8   -3796.5   -3799.3   -3784.5   -3784.2   -3786.0
EQR       -4080.5   -4068.8   -4072.0   -4080.4   -4070.6   -4079.5
MSFT      -4103.6   -4100.7   -4100.8   -4102.6   -4099.4   -4103.1
JnJ       -3681.7   -3682.9   -3684.2   -3680.9   -3680.7   -3681.7

Table 2.3: Predictive log likelihood of daily returns from November 1, 2010 to October 31, 2013 using six different SV models. (u) - unleveraged; (l) - leveraged. The results of the best performing model (those with the largest predictive log likelihood) are in bold face.

Table 2.3 displays the estimated predictive log likelihoods for all six SV models. For each of the four return series, the value in bold denotes the model with the largest predictive log likelihood. The predictive log likelihood of the semiparametric model without shape constraints always lags behind that of the corresponding shape-constrained model, suggesting that it is desirable to incorporate shape constraints in our semiparametric additive model when knowledge of the true functional shapes is available. The semiparametric model without shape constraints outperforms the corresponding AR-SV model in only three out of the eight pairs; imposing shape constraints increases this ratio to seven out of eight. For Equity Residential, the outperformance of the shape-constrained model compared to AR-SV is substantial. In one case, Johnson & Johnson, the linear parametric AR-SV model seems to be the most suitable model, indicating that a more parsimonious model is preferred for explaining the volatility in this return series.

To assess the appropriateness of our shape assumptions for the four equity returns, we estimate the autoregressive function f and leverage function g in the same way as in the simulation studies. Figure 2.9 shows the fitted functions (posterior means) and the associated pointwise 95% credible bands for the leveraged semiparametric SV models (with and without the shape constraints) for each return series.

For the S&P500 index, both fitted functions are clearly nonlinear, which means that the volatility process, and consequently the return series, is nonstationary. When the lagged log volatility ht−1 is small, the lag-one autocorrelation for fixed εt−1 is diminished compared to when ht−1 is moderate; when ht−1 is large, the lag-one autocorrelation is also attenuated, but to a lesser degree. In addition, the shapes of the autoregressive and leverage functions estimated from the two semiparametric models (SC-SV and SP-SV) agree with each other for the most part: the function f is monotonically increasing and g monotonically decreasing, which indicates that our shape-constraint assumptions are appropriate. Even though the leftmost section of the g function is found to be increasing in the SP-SV model, we find this counterintuitive and unreliable, since our simulation studies showed that the estimated functions can deviate considerably from the truth toward the boundaries if no shape constraints are implemented.

As for the other three return series, our monotonicity assumptions for the autoregressive and leverage functions in the shape-constrained models hold for all except the Equity Residential series (EQR). For EQR, the SP-SV model

Figure 2.9: Fitted autoregressive function f and leverage function g for the daily returns of the S&P500, EQR, MSFT and JnJ. The solid lines show the posterior means and 95% credible bands of the fitted functions in the leveraged semiparametric models with shape constraints, while the dashed lines represent the posterior means and 95% credible bands in the leveraged semiparametric models without shape constraints.

finds that the posterior mean of the leverage function g is increasing on the positive real line, but with a very large degree of uncertainty. As a further investigation, we fit two additional shape-constrained models with the same priors on f and on g(εt−1) for εt−1 < 0. In the first model, we assumed the leverage function g to be monotonically increasing on the positive real line; in the second, we imposed no shape constraints on g(εt−1) for εt−1 > 0. Their predictive log likelihoods are −4074.3 and −4074.5 respectively, both larger than those of the corresponding leveraged AR-SV and SP-SV models. However, neither of these two additional models outperforms the original leveraged SC-SV model with the monotonicity constraint on g. This may be because the lagged return innovation εt−1 follows the standard normal distribution a priori, so an extreme εt−1 out in the tails would rarely be observed. Nevertheless, a monotonically decreasing leverage function seems a reasonable assumption when fitting the shape-constrained semiparametric additive SV models to most equity returns.

2.6 Discussion

In this chapter, we have proposed a class of shape-constrained semiparametric additive stochastic volatility models. We have introduced a parameterization that allows for more efficient sampling from the posterior distribution, and developed a particle-filter-based model comparison approach. Through simulation and empirical studies, we demonstrated that the shape-constrained semiparametric SV model has the flexibility to capture unknown functional shapes, while not losing too much efficiency compared to a parametric SV model when the true underlying model is best explained by the latter. We demonstrated the use of specific forms of shape constraints on the autoregressive and leverage functions. Within this framework, researchers and practitioners can easily incorporate other shape constraints that they deem appropriate.

Our shape-constrained SV model can be extended in a number of directions, beyond simply including additional lagged terms. First, the additivity between the autoregressive and leverage functions can be relaxed. One option would be to assume that the log volatility satisfies

ht = ζ(ht−1, εt−1) + ηt,

where the function ζ is subject to a two-dimensional shape constraint. The challenge lies in the fact that ζ does not have a rectangular support, because of the dependence between ht−1 and εt−1. Alternatively, we can assume that the additivity holds on a different scale, ς(σt), where ς can be a known function other than exp(ht/2); the exp(ht/λ0) scale considered by Comte (2004) would be an example. Second, the idea of incorporating shape constraints in semiparametric SV models can also be applied to additive GARCH-type models for more efficient estimation of the news impact curve. Third, our framework can be adapted to include convexity constraints on the functional components. Under the basis expansion (2.2.2) and with the same basis functions (2.2.4), convexity constraints translate to an ordering of the slope coefficients {βj}_{j=1}^J. To constrain the function to be convex and monotonically increasing at the same time, the slope coefficients need to satisfy

0 ≤ β1 ≤ β2 ≤ ... ≤ βJ.

The corresponding model fitting is straightforward.
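One way to impose this ordering in a sampler is to parameterize the slopes through cumulative sums of exponentials of unconstrained parameters, so that any real-valued draw maps to slopes satisfying 0 ≤ β1 ≤ ... ≤ βJ. The sketch below is illustrative only: the function names, the evenly spaced knots, and the random coefficients are assumptions for demonstration, not the dissertation's basis (2.2.2)–(2.2.4) itself.

```python
import numpy as np

def ordered_slopes(theta):
    """Map unconstrained reals to slopes with 0 <= b_1 <= ... <= b_J."""
    return np.cumsum(np.exp(theta))

def piecewise_linear(x, knots, beta, intercept=0.0):
    """Evaluate the piecewise-linear function with slope beta[j] on [knots[j], knots[j+1])."""
    # how far x has traveled through each knot interval, capped at the interval width
    incr = np.clip(x[:, None] - knots[None, :-1], 0.0, np.diff(knots)[None, :])
    return intercept + incr @ beta

rng = np.random.default_rng(1)
knots = np.linspace(-2.0, 2.0, 11)       # 10 knot intervals, as in the simulations
beta = ordered_slopes(rng.normal(size=10))
x = np.linspace(-2.0, 2.0, 200)
f = piecewise_linear(x, knots, beta)

assert np.all(np.diff(f) >= 0)           # monotonically increasing
assert np.all(np.diff(f, 2) >= -1e-12)   # convex: non-decreasing slopes
```

Since the map from theta to beta is smooth and unconstrained, standard MCMC or optimization updates can be performed on theta directly.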

Chapter 3: Heteroscedastic Asymmetric Spatial Processes

3.1 Introduction

In Chapter 2, we focused on semiparametric stochastic volatility models, non-Gaussian time series models that can capture the asymmetry, heavy tails and volatility clustering of financial returns. As discussed in Chapter 1, non-Gaussian features – heavy tails, asymmetry and spatial or spatiotemporal heteroscedasticity – are also important for modeling environmental and climatic data. In this chapter, we focus on the spatial domain; the extension to the space-time domain will be discussed in Chapter 4.

In particular, we propose a non-Gaussian spatial process that captures not only the heavy tails but also the skewness in the marginal distribution. Before we set out to define our model, we first discuss a general strategy for constructing non-Gaussian processes and use it to motivate our asymmetric and heteroscedastic non-Gaussian model.

3.2 A General Framework for Constructing Non-Gaussian Processes

Expressing a stochastic spatial process as a function of two or more latent stochastic processes is a common approach in spatial statistics. It can be argued that these types of models are analogous to the state space models in the time series literature. For example, to explain the so-called nugget effect in spatial data, one commonly models an observed spatial process {Y(s), s ∈ D ⊂ R^d} as the sum of two independent spatial processes, i.e.,

Y(s) = Z(s) + ε(s), s ∈ D ⊂ R^d,

where Z(s) is usually a stationary Gaussian process whose covariance function is smooth at spatial lag zero, and ε(s) represents independent and identically distributed measurement errors over the spatial domain of interest. For multivariate spatial processes, the commonly used linear model of co-regionalization (LMC) is another example of representing the observed spatial processes as functions of latent stochastic processes (see Goulard and Voltz, 1992). In the LMC, the observed multivariate spatial process is expressed as linear combinations of latent independent univariate spatial processes.

Although the previous examples all use Gaussian processes, the same idea can be used to construct a wide range of non-Gaussian processes. Consider the following more general framework for constructing a spatial process. For an observed univariate spatial process {Y(s), s ∈ D ⊂ R^d}, we assume that

Y(s) = f(U(s)), s ∈ D ⊂ R^d,    (3.2.1)

where f is a measurable function taking a p × 1 vector as its argument, f : R^p → R, and {U(s), s ∈ D ⊂ R^d} is a latent p-dimensional multivariate Gaussian spatial process. The measurability of f ensures that Y(s) at any fixed spatial location is a valid random variable; the most common sufficient condition is that f is continuous. In addition, we assume that U(s) has mean zero and is second-order stationary with a matrix-valued covariance function C(h).

Examples of spatial processes with a nonlinear transformation function f include the transformed Gaussian process of De Oliveira et al. (1997), the Gaussian-log-Gaussian (GLG) model of Palacios and Steel (2006), the stationary skew-Gaussian process of Zhang and El-Shaarawi (2010), the stochastic heteroscedastic process (SHP) of Huang et al. (2011), and the wavelet-based trend spatiotemporal long memory model of Craigmile and Guttorp (2011). The transformed Gaussian process of De Oliveira et al. (1997) is a trivial instance of the framework (3.2.1), in which U(s) is a univariate Gaussian process and the transformation f is the inverse of the Box-Cox transformation. If we let U(s) = (α(s), ε(s))^T, s ∈ D ⊂ R^d, and f(x, y) = e^{x/2} y, x, y ∈ R, where α(s) and ε(s) are two Gaussian processes with zero mean and unit variance, we essentially get the GLG model. To get the skew-Gaussian process of Zhang and El-Shaarawi (2010), we assume U(s) = (X1(s), X2(s))^T, s ∈ D ⊂ R^d, with transformation function f(x, y) = σ1|x| + σ2 y, x, y ∈ R. The two zero-mean, unit-variance processes X1(s) and X2(s) are assumed to be stationary and independent of each other, and the resulting univariate process is stationary as well.

In the framework (3.2.1), for the univariate process {Y(s), s ∈ D ⊂ R^d} to be well defined, we first need a well-defined multivariate process {U(s), s ∈ D ⊂ R^d}. Suppose U(s) = (U1(s), ..., Up(s))^T is a mean-zero process with isotropic covariance function C(h); then C(h) is the p × p matrix

C(h) = ( Cij(h) )_{i,j=1,...,p},  h = ||s − s'||,

where Cii(h) = E[Ui(s) Ui(s')] is the univariate covariance function of the process Ui(s), and Cij(h) = E[Ui(s) Uj(s')] is the cross-covariance function between the processes Ui(s) and Uj(s) for i ≠ j. For the process U(s) to be well defined, the covariance function needs to be such that, for any n observations (U(s1)^T, ..., U(sn)^T)^T of the process, the covariance matrix of the joint Gaussian distribution is non-negative definite. This is a notoriously hard condition to satisfy for multivariate processes (see, e.g., Gneiting et al., 2010).

The linear model of co-regionalization (LMC) is a commonly used methodology that ensures the non-negative definiteness of the covariance function by construction. Alternatively, if the covariance and cross-covariance functions are restricted to be of a certain functional form (such as the Matérn class), then we can find sufficient, and sometimes necessary and sufficient, conditions for the multivariate Gaussian spatial process to have a valid covariance function C(h). Let M(h|ν, a) denote the Matérn correlation function with smoothness parameter ν and scale parameter a; i.e.,

M(h|ν, a) = (2^(1−ν) / Γ(ν)) (a||h||)^ν K_ν(a||h||),    (3.2.2)

where K_ν is a modified Bessel function of the second kind. If ν = 1/2, M(h|ν, a) reduces to the exponential correlation function exp(−a||h||). When ν = 3/2, M(h|ν, a) reduces to (1 + a||h||) exp(−a||h||). In general, if ν = 1/2 + n for n = 0, 1, 2, ..., the Matérn function can be expressed as the product of an exponential function and a polynomial:

M(h | n + 1/2, a) = exp(−a||h||) Σ_{k=0}^{n} [(n + k)! / (2n)!] C(n, k) (2a||h||)^(n−k),

where C(n, k) denotes the binomial coefficient. Suppose

Cii(h) = σi^2 M(h|νi, ai)    (3.2.3)

for i = 1, ..., p, and

Cij(h) = Cji(h) = ρij σi σj M(h|νij, aij).    (3.2.4)

Then the following theorem, proved by Gneiting et al. (2010), provides a sufficient (but not necessary) condition for the multivariate process U(s) to have a valid second-order structure.
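The Matérn correlation (3.2.2) and its closed-form special cases are easy to check numerically; the sketch below (assuming SciPy's `kv` for the modified Bessel function of the second kind) verifies the ν = 1/2 and ν = 3/2 reductions:

```python
import numpy as np
from scipy.special import gamma, kv

def matern(h, nu, a):
    """Matern correlation M(h | nu, a) = 2^(1-nu)/Gamma(nu) * (a h)^nu * K_nu(a h)."""
    h = np.asarray(h, dtype=float)
    out = np.ones_like(h)                  # M -> 1 as h -> 0
    pos = h > 0
    x = a * h[pos]
    out[pos] = 2.0 ** (1.0 - nu) / gamma(nu) * x ** nu * kv(nu, x)
    return out

h = np.linspace(0.01, 3.0, 50)
# nu = 1/2: the exponential correlation exp(-a h)
assert np.allclose(matern(h, 0.5, 2.0), np.exp(-2.0 * h))
# nu = 3/2: (1 + a h) exp(-a h)
assert np.allclose(matern(h, 1.5, 2.0), (1.0 + 2.0 * h) * np.exp(-2.0 * h))
```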

Theorem 1 (Gneiting et al., 2010, Theorem 1) For d ≥ 1, p ≥ 2 and 1 ≤ i ≠ j ≤ p, suppose that

νij = (νi + νj)/2,

and that the scale parameters satisfy

a1 = ... = ap = aij = a.

Then U(s) has a valid covariance structure in R^d defined by (3.2.3) and (3.2.4) if the matrix (βij)_{i,j=1}^p, with diagonal elements βii = 1 for i = 1, ..., p and off-diagonal elements

βij = ρij [ (Γ(νi + d/2) / Γ(νi))^(1/2) (Γ(νj + d/2) / Γ(νj))^(1/2) Γ((νi + νj)/2) / Γ((νi + νj)/2 + d/2) ]^(−1),  1 ≤ i ≠ j ≤ p,

is symmetric and non-negative definite.

Gneiting et al. (2010) argue that the conditions in Theorem 1 are not necessarily as restrictive as they seem. If ν = 1/2, i.e., for the exponential covariance function, Ying (1991) showed that either a or σ^2 can be fixed arbitrarily and the composite quantity can still be estimated consistently and efficiently. More importantly, Zhang (2004) proved that in dimension d ≤ 3, the parameters σ^2 and a of a Matérn covariance function with fixed smoothness parameter ν cannot be consistently estimated under infill asymptotics, but that the composite quantity σ^2 a^(2ν) can be consistently estimated, and that this quantity is more important for spatial prediction, as explained in the following theorem.

Theorem 2 (Zhang, 2004, Theorem 2) Let Pi, i = 1, 2, be two probability measures such that under Pi, the process X(s), s ∈ R^d, is stationary Gaussian with mean 0 and an isotropic Matérn covariance function in R^d with variance σi^2, scale parameter ai and the same smoothness parameter ν, where d = 1, 2, 3. For any bounded infinite set D ⊂ R^d, the two probability measures P1 and P2 are equivalent on the paths of X(s), s ∈ D, if and only if σ1^2 a1^(2ν) = σ2^2 a2^(2ν).

When p = 2, the restriction on ρ12 in Theorem 1, expressed in terms of the entries of a positive definite matrix, can be rephrased as the inequality

|ρ12| ≤ (Γ(ν1 + d/2) / Γ(ν1))^(1/2) (Γ(ν2 + d/2) / Γ(ν2))^(1/2) Γ((ν1 + ν2)/2) / Γ((ν1 + ν2)/2 + d/2).

This is immediate, since for the 2 × 2 matrix with unit diagonal and off-diagonal entries β12 to be non-negative definite, we need |β12| ≤ 1. As it turns out, for p = 2 this condition on ρ12 is not only sufficient but also necessary, as stated in the following theorem.

Theorem 3 (Gneiting et al., 2010, Theorem 3) When p = 2, the full bivariate Matérn model described by (3.2.2) and (3.2.4) is valid if and only if

|ρ12| ≤ (Γ(ν1 + d/2) / Γ(ν1))^(1/2) (Γ(ν2 + d/2) / Γ(ν2))^(1/2) [Γ((ν1 + ν2)/2) / Γ((ν1 + ν2)/2 + d/2)] × inf_{t ≥ 0} [ a1^(2ν1) a2^(2ν2) (a12^2 + t^2)^(2ν12 + d) ] / [ a12^(4ν12) (a1^2 + t^2)^(ν1 + d/2) (a2^2 + t^2)^(ν2 + d/2) ].

The above inequality implies the following important cases.

i) If ν12 < (ν1 + ν2)/2, the full bivariate Matérn model is valid if and only if ρ12 = 0. In other words, ν12 ≥ (ν1 + ν2)/2 is a necessary condition for a bivariate Matérn model to capture any cross-correlation in the multivariate spatial data.

ii) If ν12 = (ν1 + ν2)/2 and a1 = a2 = a12 = a, the full bivariate Matérn model is valid if and only if

|ρ12| ≤ (Γ(ν1 + d/2) / Γ(ν1))^(1/2) (Γ(ν2 + d/2) / Γ(ν2))^(1/2) Γ((ν1 + ν2)/2) / Γ((ν1 + ν2)/2 + d/2).

As a special case, if d = 2 (e.g., if we focus only on the two-dimensional spatial domain), then the condition on ρ12 simplifies to

|ρ12| ≤ (ν1 ν2)^(1/2) / [(ν1 + ν2)/2].

iii) (Gneiting et al., 2010, Theorem 4) When ν12 ≥ (ν1 + ν2)/2 and a12^2 ≥ (a1^2 + a2^2)/2, a sufficient but not necessary condition for the full bivariate Matérn model to be valid is given by

|ρ12| ≤ [ a1^(ν1) a2^(ν2) Γ(ν12) / ( a12^(2ν12) Γ(ν1)^(1/2) Γ(ν2)^(1/2) ) ] [ e (a12^2 − (a1^2 + a2^2)/2) / (ν12 − (ν1 + ν2)/2) ]^(ν12 − (ν1 + ν2)/2).
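The d = 2 simplification in case ii) follows from the identity Γ(ν + 1) = ν Γ(ν); the small numerical sketch below confirms it (`rho_bound` is a hypothetical helper name, standard-library `math` only):

```python
from math import gamma, sqrt

def rho_bound(nu1, nu2, d):
    """Upper bound on |rho12| in case ii) of Theorem 3."""
    return (sqrt(gamma(nu1 + d / 2) / gamma(nu1))
            * sqrt(gamma(nu2 + d / 2) / gamma(nu2))
            * gamma((nu1 + nu2) / 2) / gamma((nu1 + nu2) / 2 + d / 2))

# For d = 2 the gamma-function bound collapses to sqrt(nu1*nu2) / ((nu1 + nu2)/2)
for nu1, nu2 in [(0.5, 1.5), (1.0, 2.5), (0.5, 0.5)]:
    simplified = sqrt(nu1 * nu2) / ((nu1 + nu2) / 2)
    assert abs(rho_bound(nu1, nu2, 2) - simplified) < 1e-12
```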

Apanasovich et al. (2012) proved a more general result than Theorem 1, where the smoothness parameter in the cross-covariance function of two processes can be greater than or equal to the average of the smoothness parameters in the covariance functions of the corresponding two marginal processes, as opposed to the equality condition in Theorem 1.

Theorem 4 (Apanasovich et al., 2012, Theorem 1) The Matérn model (3.2.2) and (3.2.4) provides a valid structure if there exists ∆A ≥ 0 such that:

i) νij − (νi + νj)/2 = ∆A (1 − Aij), i, j = 1, ..., p, where the 0 ≤ Aij ≤ 1 form a valid correlation matrix.

ii) The collection −aij^2, i, j = 1, ..., p, forms a conditionally non-negative definite matrix. In other words, if M denotes the matrix with entries −aij^2, then x^T M x ≥ 0 for all x ∈ R^p such that Σ_{i=1}^p xi = 0.

iii) The collection

ρij σi σj aij^(2∆A + νi + νj) Γ(νij + d/2) / [ Γ((νi + νj)/2 + d/2) Γ(νij) ],  i, j = 1, ..., p,

forms a non-negative definite matrix.

It is easy to see that condition (i) in Theorem 4 implies νij ≥ (νi + νj)/2, since we must have ∆A(1 − Aij) ≥ 0. Examples of collections {aij}_{i,j=1}^p that satisfy condition (ii) include aij^2 = (ai^2 + aj^2)/2 + τ(ai − aj)^2 with 0 ≤ τ ≤ ∞, and aij = max{ai, aj} (see Apanasovich et al., 2012). In other words, Theorem 4 relaxes the restrictions on the smoothness and scale parameters in Theorem 1. However, as noted in Apanasovich et al. (2012), the constraints on the co-locational correlation coefficients ρij, i, j = 1, ..., p, still seem unavoidable. These constraints, i.e., condition (iii) in Theorem 4, can be reformulated as upper bounds on ρij that depend on how much the smoothness and scale parameters deviate from the corresponding parameters of the two marginal processes:

ρij^2 ≤ τij^(1) τij^(2) τij^(3) ≤ 1,  i, j = 1, ..., p,

where

τij^(1) = B(νij, d/2) / B((νi + νj)/2, d/2),
τij^(2) = (ai aj / aij^2)^(2∆A),
τij^(3) = [Γ((νi + νj)/2)^2 / (Γ(νi) Γ(νj))] [ai^(2νi) aj^(2νj) / aij^(2(νi + νj))],

and B(·, ·) is the Beta function; see Apanasovich et al. (2012) for the proof. One implication of Theorems 1, 3 and 4 is that we need to place more stringent restrictions on the smoothness and scale parameters in the Matérn correlation functions in order to relax the constraints on the co-locational correlation parameters ρij, 1 ≤ i ≠ j ≤ p. In particular, in Theorem 3, if all the smoothness parameters are equal and so are the scale parameters, then the restriction on ρ12 is no longer needed.

For the remainder of the chapter, we assume that the latent multivariate process {U(s), s ∈ D ⊂ R^d} is always well defined (i.e., that it has a valid covariance function). As long as U(s) also satisfies the natural restrictions of the transformation function f, the univariate process {Y(s), s ∈ D ⊂ R^d} is well defined too. The following result discusses when the constructed univariate process Y(s) can be non-Gaussian.

Result 1 Suppose that U(s) is well defined with a valid stationary and isotropic covariance function C(h). Then (3.2.1) defines a univariate non-Gaussian process only if the transformation function f is nonlinear. However, the converse is not true.

This result is trivial; it is presented only as a reminder that a nonlinear transformation of Gaussian processes can still be a Gaussian process. Here is a simple example. Consider the multivariate Gaussian process {U(s) = (U1(s), U2(s), U3(s))^T, s ∈ D ⊂ R^d}. Suppose that the three univariate processes all have mean zero and variance σ^2, and that they are independent of each other, i.e., Cov(Ui(s), Uj(s')) = 0 for i ≠ j and for all s, s' ∈ D. Define the transformation function f as

f(x1, x2, x3) = x1 g1(x3) + x2 g2(x3),  x1, x2, x3 ∈ R,

where g1 and g2 satisfy g1(x)^2 + g2(x)^2 = 1 for all x ∈ R; for example, g1(x) = sin(x) and g2(x) = cos(x), or g1(x) = 1/√(1 + x^2) and g2(x) = x/√(1 + x^2). The resulting univariate process

Y(s) = f(U(s)) = U1(s) g1(U3(s)) + U2(s) g2(U3(s)),  s ∈ D ⊂ R^d,

is a Gaussian process with mean zero and variance σ^2, since Y(s) | U3(s) is a Gaussian random variable whose distribution does not depend on U3(s).
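A quick Monte Carlo sketch of this example (taking g1 = sin and g2 = cos, with an arbitrary σ and sample size) illustrates that the first, second, and fourth moments of Y match those of N(0, σ^2):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 2.0, 1_000_000

# Three independent mean-zero Gaussian components at a fixed location s
u1, u2, u3 = rng.normal(0.0, sigma, size=(3, n))

# Y = U1*g1(U3) + U2*g2(U3) with g1^2 + g2^2 = 1 (here g1 = sin, g2 = cos)
y = u1 * np.sin(u3) + u2 * np.cos(u3)

# Y | U3 ~ N(0, sigma^2) regardless of U3, so Y is exactly N(0, sigma^2):
# sample mean near 0, sample variance near sigma^2 = 4, excess kurtosis near 0
excess_kurt = np.mean(((y - y.mean()) / y.std()) ** 4) - 3.0
print(y.mean(), y.var(), excess_kurt)
```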

Smoothness of a spatial process is also of special interest (see, e.g., Stein, 1999; Banerjee and Gelfand, 2003). One definition of the smoothness of a stochastic process is mean square continuity (or L2 continuity): a spatial process {Y(s), s ∈ D ⊂ R^d} is mean square continuous at s0 if

lim_{s → s0} E[(Y(s) − Y(s0))^2] = 0.    (3.2.5)

The following result from Banerjee and Gelfand (2003) establishes the smoothness properties of the process Y(s), s ∈ D ⊂ R^d, defined in (3.2.1).

Proposition 1 (Banerjee and Gelfand, 2003, Proposition 4) If the transformation function f : R^p → R is Lipschitz of order 1 and the multivariate process U(s) is mean square continuous, then the process Y(s) = f(U(s)) is also mean square continuous.

The proof of this proposition (see Banerjee and Gelfand, 2003) follows instantly from the definition of the Lipschitz condition of order 1, i.e.,

|f(U(s + ∆s)) − f(U(s))| ≤ M ||U(s + ∆s) − U(s)||,

where M is a constant in s and ∆s, and || · || denotes the Euclidean norm.

Palacios and Steel (2006) discussed a special case in which U(s) consists of two independent stationary processes (U1(s), U2(s))^T, with U1(s) ∈ R and U2(s) > 0, and the transformation function f(x, y) = x/√y, x ∈ R, y > 0. They showed that as long as both U1(s) and U2(s) are mean square continuous, the process Y(s) = f(U1(s), U2(s)) is mean square continuous, too.

3.3 Heteroscedastic Asymmetric Spatial Process (HASP)

3.3.1 Model Specification and Properties

Now that we have introduced the general framework (3.2.1) for constructing a non-Gaussian process, we are ready to return to the topic of this chapter – the heteroscedastic and asymmetric non-Gaussian process. We have shown that assuming U(s) = (α(s), ε(s))^T with the transformation function f(x, y) = e^{x/2} y, where α(s) and ε(s) are two independent Gaussian processes with zero mean and unit variance, yields the GLG model. However, the drawback of the GLG model (as well as the SHP model on the space-time domain and the non-Gaussian model used in Craigmile and Guttorp, 2011) is that by assuming the component processes of the latent multivariate process U(s) = (α(s), ε(s))^T to be mutually independent, the model is not flexible enough to capture the skewness often observed in spatial data. On the other hand, existing models that do capture skewness, such as that of Zhang and El-Shaarawi (2010), do not allow for spatial heteroscedasticity as the GLG model does.

Using (3.2.1), we propose a heteroscedastic asymmetric spatial process (HASP) that captures both the heavy tails and the skewness in the observed spatial data simultaneously. We focus only on the spatial domain here; generalizations to the space-time domain will be discussed in Chapter 4. Consider a latent mean-zero multivariate Gaussian process (α(s), ε(s))^T whose two univariate components, α(s) and ε(s), are not independent of each other but rather have a cross-covariance function γc(h). Using the same transformation function f(x, y) = e^{x/2} y as in (3.2.1), we instantly obtain our heteroscedastic asymmetric spatial process (HASP), which is analogous to the stochastic volatility model with leverage effect in the time series literature. To allow for long-range dependence or non-stationarity in the process, we write the model as

Y(s) = X^T(s) β + exp( (H^T(s) δ + α(s)) / 2 ) ε(s),    (3.3.1)

where X^T(s) and H^T(s) are two deterministic row vectors of spatial covariates. In general, both X^T(s) and H^T(s) should at least contain a column of ones and, if necessary, columns of other spatial covariates. Thus, in the simplest case, X^T(s) = H^T(s) = 1.
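A minimal simulation sketch of this construction on a one-dimensional transect: with a single Matérn correlation shared by all three correlation functions (the equal-parameter case of Theorem 1, valid for any co-locational correlation smaller than one in absolute value), a positive co-locational correlation produces clear right skewness in Z(s) = exp(α(s)/2) ε(s). All numerical settings below (ν = 3/2, a = 5, τ = 1, % = 0.8, 50 sites, 4,000 replicates) are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(7)
n, tau, rho_cl = 50, 1.0, 0.8            # rho_cl: co-locational correlation
s = np.linspace(0.0, 1.0, n)
h = np.abs(s[:, None] - s[None, :])
R = (1.0 + 5.0 * h) * np.exp(-5.0 * h)   # Matern correlation, nu = 3/2, a = 5

# Shared-correlation bivariate covariance of the stacked vector (alpha, eps):
# kron([[tau^2, tau*rho], [tau*rho, 1]], R) is non-negative definite for |rho| < 1
C = np.kron(np.array([[tau**2, tau * rho_cl], [tau * rho_cl, 1.0]]), R)
L = np.linalg.cholesky(C + 1e-10 * np.eye(2 * n))   # small jitter for stability

draws = L @ rng.standard_normal((2 * n, 4000))
alpha, eps = draws[:n], draws[n:]
z = np.exp(alpha / 2.0) * eps            # the HASP stochastic component Z(s)

assert skew(z.ravel()) > 1.0             # rho_cl > 0 induces strong right skewness
```

Repeating the experiment with a negative co-locational correlation flips the sign of the sample skewness, in line with Result 3 below.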

To see that the HASP model (3.3.1) can capture the skewness in the data, we examine the properties of Z(s) = exp(α(s)/2) ε(s), s ∈ D ⊂ R^d, the stochastic component of the process Y(s). For two spatial locations s, s' ∈ D ⊂ R^d, define the spatial distance h = ||s − s'||. The matrix-valued stationary and isotropic covariance function of the latent multivariate process (α(s), ε(s))^T is thus

C(h) = [ τ^2 ρα(h)   τ% ρc(h) ]
       [ τ% ρc(h)    ρε(h)    ],    (3.3.2)

where ρα(h), ρc(h) and ρε(h) are three isotropic correlation functions that, together with the co-locational correlation coefficient %, satisfy appropriate conditions, such as those in Theorem 1, to ensure that the joint process (α(s), ε(s))^T is well defined. In addition, we always assume that ρα(h), ρc(h) and ρε(h) are non-negative functions; i.e., we do not consider the so-called hole effect, which allows for the negative correlations occasionally observed in empirical data (see, e.g., Cressie, 1993). Given the conditions above, the spatial process {Z(s), s ∈ D ⊂ R^d} is also well defined and has the following properties.

Result 2 Z(s) has mean (1/2) τ% e^{τ^2/8} and variance (1 + τ^2 %^2) e^{τ^2/2} − (1/4) τ^2 %^2 e^{τ^2/4}.

Result 3 A non-zero correlation % induces skewness in Z(s). When % > 0, Z(s) is positively (right) skewed, and when % < 0, Z(s) is negatively (left) skewed.

Result 4 When % = 0, the process Z(s) has heavier tails than a Gaussian process. When % ≠ 0, the excess kurtosis is even larger than when % = 0.

Results 2 and 3 can be proven easily with the well-known Stein's identity, first proven in the seminal paper of Stein (1981).

Lemma 1 (Stein’s Identity) (Stein, 1981): If X is a normal random variable with

mean µ and variance σ2 and g is a real-valued function with a Lebesgue measurable

derivative function g0, then E[(X − µ)g(X)] = σ2 E[g0(X)].

Using Stein's identity, we can easily derive the following result.

Lemma 2 For a normal random variable $X$ with mean $\mu$ and variance $\sigma^{2}$, we have
\begin{align*}
\mathrm{E}[X^{n}e^{aX}] &= \mathrm{E}[(X - \mu)X^{n-1}e^{aX}] + \mu\,\mathrm{E}[X^{n-1}e^{aX}] \\
&= \sigma^{2}\,\mathrm{E}[aX^{n-1}e^{aX} + (n-1)X^{n-2}e^{aX}] + \mu\,\mathrm{E}[X^{n-1}e^{aX}] \\
&= (\mu + a\sigma^{2})\,\mathrm{E}[X^{n-1}e^{aX}] + (n-1)\sigma^{2}\,\mathrm{E}[X^{n-2}e^{aX}].
\end{align*}
In particular, for $n = 1, 2, 3$ and $4$, we have

\begin{align}
\mathrm{E}[Xe^{aX}] &= (\mu + a\sigma^{2})\,\mathrm{E}[e^{aX}]; \tag{3.3.3} \\
\mathrm{E}[X^{2}e^{aX}] &= (\mu + a\sigma^{2})\,\mathrm{E}[Xe^{aX}] + \sigma^{2}\,\mathrm{E}[e^{aX}] \notag \\
&= [(\mu + a\sigma^{2})^{2} + \sigma^{2}]\,\mathrm{E}[e^{aX}]; \tag{3.3.4} \\
\mathrm{E}[X^{3}e^{aX}] &= (\mu + a\sigma^{2})\,\mathrm{E}[X^{2}e^{aX}] + 2\sigma^{2}\,\mathrm{E}[Xe^{aX}] \notag \\
&= [(\mu + a\sigma^{2})^{3} + 3\sigma^{2}(\mu + a\sigma^{2})]\,\mathrm{E}[e^{aX}]; \tag{3.3.5} \\
\mathrm{E}[X^{4}e^{aX}] &= (\mu + a\sigma^{2})\,\mathrm{E}[X^{3}e^{aX}] + 3\sigma^{2}\,\mathrm{E}[X^{2}e^{aX}] \notag \\
&= [(\mu + a\sigma^{2})^{4} + 6\sigma^{2}(\mu + a\sigma^{2})^{2} + 3\sigma^{4}]\,\mathrm{E}[e^{aX}]. \tag{3.3.6}
\end{align}
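The closed forms (3.3.3)–(3.3.6) are easy to check numerically. The sketch below (all parameter values are illustrative, not taken from the text) compares each closed form against direct trapezoidal quadrature of $x^{n}e^{ax}$ under the normal density:

```python
import numpy as np

def moment_quad(n, a, mu, sigma):
    """E[X^n e^{aX}] for X ~ N(mu, sigma^2), by trapezoidal quadrature."""
    x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 200001)
    dens = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    f = x ** n * np.exp(a * x) * dens
    return float(np.sum((f[1:] + f[:-1]) / 2 * np.diff(x)))

def moment_closed(n, a, mu, sigma):
    """Closed forms (3.3.3)-(3.3.6); E[e^{aX}] is the normal MGF at a."""
    m0 = np.exp(a * mu + 0.5 * a ** 2 * sigma ** 2)
    b = mu + a * sigma ** 2  # the recurring quantity mu + a*sigma^2
    poly = {1: b,
            2: b ** 2 + sigma ** 2,
            3: b ** 3 + 3 * sigma ** 2 * b,
            4: b ** 4 + 6 * sigma ** 2 * b ** 2 + 3 * sigma ** 4}[n]
    return poly * m0

for n in (1, 2, 3, 4):
    q, c = moment_quad(n, 0.5, 0.3, 1.2), moment_closed(n, 0.5, 0.3, 1.2)
    assert abs(q - c) < 1e-5 * max(1.0, abs(c))
```

The same recursion can of course be continued to any order $n$, which is what makes Lemma 2 useful for the higher-moment calculations below.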

Proof of Result 2 Using (3.3.3) and (3.3.4) as well as the facts that
\[
\varepsilon(s)\,|\,\alpha(s) \sim N(\alpha(s)\varrho/\tau,\; 1 - \varrho^{2}), \quad\text{and}\quad
\alpha(s)\,|\,\varepsilon(s) \sim N(\tau\varrho\,\varepsilon(s),\; \tau^{2}(1 - \varrho^{2})),
\]
we can prove the aforementioned properties of the process $Z(s) = e^{\alpha(s)/2}\varepsilon(s)$:
\[
\mathrm{E}[Z(s)] = \mathrm{E}[\mathrm{E}[e^{\alpha(s)/2}\varepsilon(s)\,|\,\alpha(s)]]
= \frac{\varrho}{\tau}\,\mathrm{E}[\alpha(s)e^{\alpha(s)/2}]
= \frac{1}{2}\tau\varrho\,\mathrm{E}[e^{\alpha(s)/2}]
= \frac{1}{2}\tau\varrho\, e^{\tau^{2}/8};
\]
\begin{align*}
\mathrm{Var}(Z(s)) &= \mathrm{E}[Z(s)^{2}] - \mathrm{E}[Z(s)]^{2} \\
&= \mathrm{E}[\mathrm{E}[\varepsilon^{2}(s)e^{\alpha(s)}\,|\,\alpha(s)]] - \left(\frac{1}{2}\tau\varrho\,\mathrm{E}[e^{\alpha(s)/2}]\right)^{2} \\
&= \mathrm{E}\left[\left(1 - \varrho^{2} + \frac{\varrho^{2}}{\tau^{2}}\alpha^{2}(s)\right)e^{\alpha(s)}\right] - \frac{1}{4}\tau^{2}\varrho^{2}\,\mathrm{E}[e^{\alpha(s)/2}]^{2} \\
&= (1 + \tau^{2}\varrho^{2})\,\mathrm{E}[e^{\alpha(s)}] - \frac{1}{4}\tau^{2}\varrho^{2}\,\mathrm{E}[e^{\alpha(s)/2}]^{2} \\
&= (1 + \tau^{2}\varrho^{2})e^{\tau^{2}/2} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4}.
\end{align*}
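As a sanity check on Results 2 and 3, the co-located pair $(\alpha(s), \varepsilon(s))$ can be simulated directly through the linearization $\alpha = \tau\varrho\,\varepsilon + \tau\sqrt{1-\varrho^{2}}\,\eta$ used later in the proofs, and Monte Carlo moments of $Z = e^{\alpha/2}\varepsilon$ compared with the closed forms. This is a sketch; the seed and the values of $\tau$ and $\varrho$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
tau, rho, n = 0.8, 0.6, 2_000_000
eps = rng.standard_normal(n)
eta = rng.standard_normal(n)
alpha = tau * rho * eps + tau * np.sqrt(1 - rho ** 2) * eta  # linearization of alpha(s)
Z = np.exp(alpha / 2) * eps

# Closed forms from Result 2
mean_theory = 0.5 * tau * rho * np.exp(tau ** 2 / 8)
var_theory = (1 + tau ** 2 * rho ** 2) * np.exp(tau ** 2 / 2) \
    - 0.25 * tau ** 2 * rho ** 2 * np.exp(tau ** 2 / 4)

assert abs(Z.mean() - mean_theory) < 0.01
assert abs(Z.var() - var_theory) < 0.05
assert ((Z - Z.mean()) ** 3).mean() > 0  # rho > 0 implies right skew (Result 3)
```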

Proof of Result 3 We can still use the conditional expectation approach as above to find the third moment of $Z(s)$, but a much simpler approach is to use the properties of a bivariate normal random variable and express $\alpha(s)$ as
\[
\alpha(s) = \tau\varrho\,\varepsilon(s) + \tau\sqrt{1 - \varrho^{2}}\,\eta(s),
\]
where $s$ is a fixed spatial location, and $\eta(s)$ is a normal random variable with zero mean and unit variance that is independent of $\varepsilon(s)$. Note that the original bivariate process $\{[\alpha(s), \varepsilon(s)]^{T}, s \in D \subset \mathbb{R}^{d}\}$ is not equivalent to the bivariate process $\{[\tau\varrho\,\varepsilon(s) + \tau\sqrt{1-\varrho^{2}}\,\eta(s), \varepsilon(s)]^{T}, s \in D \subset \mathbb{R}^{d}\}$, since the cross-covariance functions of these two processes are not necessarily the same. However, for a single fixed spatial location $s$, the two bivariate normal random vectors $[\alpha(s), \varepsilon(s)]^{T}$ and $[\tau\varrho\,\varepsilon(s) + \tau\sqrt{1-\varrho^{2}}\,\eta(s), \varepsilon(s)]^{T}$ are indeed equivalent. Hence random variables based on a measurable transformation of these two bivariate random vectors have the same distribution and thus the same moments. This approach is used repeatedly in the remainder of this chapter. Thus, we have
\begin{align*}
\mathrm{E}[Z^{3}(s)] &= \mathrm{E}[e^{3\tau\varrho\,\varepsilon(s)/2}\varepsilon^{3}(s)]\,\mathrm{E}\left[e^{3\tau\sqrt{1-\varrho^{2}}\,\eta(s)/2}\right] \\
&= \left(\frac{27}{8}\tau^{3}\varrho^{3} + \frac{9}{2}\tau\varrho\right)\mathrm{E}\left[e^{3(\tau\varrho\,\varepsilon(s) + \tau\sqrt{1-\varrho^{2}}\,\eta(s))/2}\right] \\
&= \left(\frac{27}{8}\tau^{3}\varrho^{3} + \frac{9}{2}\tau\varrho\right)\mathrm{E}\left[e^{3\alpha(s)/2}\right]
= \left(\frac{27}{8}\tau^{3}\varrho^{3} + \frac{9}{2}\tau\varrho\right)e^{9\tau^{2}/8}.
\end{align*}

So the skewness of $Z(s)$ is given by
\begin{align*}
\frac{\mathrm{E}[(Z(s) - \mathrm{E}[Z(s)])^{3}]}{[\mathrm{Var}(Z(s))]^{3/2}}
&= \frac{\mathrm{E}[Z(s)^{3}] - 3\,\mathrm{E}[Z(s)]\,\mathrm{E}[Z(s)^{2}] + 2\,\mathrm{E}[Z(s)]^{3}}{[\mathrm{Var}(Z(s))]^{3/2}} \\
&= \frac{\left(\frac{27}{8}\tau^{3}\varrho^{3} + \frac{9}{2}\tau\varrho\right)\mathrm{E}[e^{3\alpha(s)/2}] - \frac{3}{2}\tau\varrho(1 + \tau^{2}\varrho^{2})\,\mathrm{E}[e^{\alpha(s)/2}]\,\mathrm{E}[e^{\alpha(s)}] + \frac{1}{4}\tau^{3}\varrho^{3}\,\mathrm{E}[e^{\alpha(s)/2}]^{3}}{[\mathrm{Var}(Z(s))]^{3/2}} \\
&= \tau\varrho\,\frac{A}{[\mathrm{Var}(Z(s))]^{3/2}},
\end{align*}
where $A = \left(\frac{27}{8}\tau^{2}\varrho^{2} + \frac{9}{2}\right)\mathrm{E}[e^{3\alpha(s)/2}] - \frac{3}{2}(1 + \tau^{2}\varrho^{2})\,\mathrm{E}[e^{\alpha(s)/2}]\,\mathrm{E}[e^{\alpha(s)}] + \frac{1}{4}\tau^{2}\varrho^{2}\,\mathrm{E}[e^{\alpha(s)/2}]^{3}$. Using the fact that $\mathrm{E}[XY] > \mathrm{E}[X]\,\mathrm{E}[Y]$ for positively correlated random variables $X$ and $Y$, we can see that the numerator $A$ satisfies
\[
A > \left\{\left(\frac{27}{8}\tau^{2}\varrho^{2} + \frac{9}{2}\right) - \frac{3}{2}(1 + \tau^{2}\varrho^{2})\right\}\mathrm{E}[e^{\alpha(s)/2}]\,\mathrm{E}[e^{\alpha(s)}] > 0.
\]
Therefore, the skewness of $Z(s)$ always takes the same sign as $\varrho$, the co-locational correlation coefficient between $\alpha(s)$ and $\varepsilon(s)$.

Proof of Result 4 Consider a fixed spatial location $s$. As in the proof of Result 3, we write
\[
\alpha(s) = \tau\varrho\,\varepsilon(s) + \tau\sqrt{1 - \varrho^{2}}\,\eta(s),
\]
where $\eta(s)$ is a normal random variable with zero mean and unit variance that is independent of $\varepsilon(s)$. The fourth non-central moment of $Z(s)$ is thus given by
\begin{align*}
\mathrm{E}[Z^{4}(s)] &= \mathrm{E}\left[e^{2\tau\varrho\,\varepsilon(s) + 2\tau\sqrt{1-\varrho^{2}}\,\eta(s)}\varepsilon(s)^{4}\right]
= \mathrm{E}\left[e^{2\tau\varrho\,\varepsilon(s)}\varepsilon(s)^{4}\right]\mathrm{E}\left[e^{2\tau\sqrt{1-\varrho^{2}}\,\eta(s)}\right] \\
&= (16\tau^{4}\varrho^{4} + 24\tau^{2}\varrho^{2} + 3)\,\mathrm{E}\left[e^{2\tau\varrho\,\varepsilon(s) + 2\tau\sqrt{1-\varrho^{2}}\,\eta(s)}\right] \\
&= (16\tau^{4}\varrho^{4} + 24\tau^{2}\varrho^{2} + 3)\,\mathrm{E}\left[e^{2\alpha(s)}\right]
= (16\tau^{4}\varrho^{4} + 24\tau^{2}\varrho^{2} + 3)e^{2\tau^{2}}.
\end{align*}

To prove the excess kurtosis of $Z(s)$, we only need to show that
\[
\mathrm{E}[(Z(s) - \mathrm{E}[Z(s)])^{4}] - 3\,\mathrm{Var}(Z(s))^{2} > 0.
\]
The calculation is tedious but straightforward. Let $m_{i}$ denote the $i$-th non-central moment of $Z(s)$; then
\begin{align*}
&\mathrm{E}[(Z(s) - \mathrm{E}[Z(s)])^{4}] - 3\,\mathrm{Var}(Z(s))^{2} \\
&= m_{4} - 4m_{3}m_{1} + 6m_{2}m_{1}^{2} - 3m_{1}^{4} - 3m_{2}^{2} + 6m_{2}m_{1}^{2} - 3m_{1}^{4} \\
&= (16\tau^{4}\varrho^{4} + 24\tau^{2}\varrho^{2} + 3)e^{2\tau^{2}}
- \left(\tfrac{27}{4}\tau^{4}\varrho^{4} + 9\tau^{2}\varrho^{2}\right)e^{5\tau^{2}/4}
- 3(1 + \tau^{2}\varrho^{2})^{2}e^{\tau^{2}} \\
&\qquad + (3\tau^{4}\varrho^{4} + 3\tau^{2}\varrho^{2})e^{3\tau^{2}/4}
- \tfrac{3}{8}\tau^{4}\varrho^{4}e^{\tau^{2}/2} \\
&\geq (16\tau^{4}\varrho^{4} + 24\tau^{2}\varrho^{2} + 3)e^{2\tau^{2}}
- \left(\tfrac{39}{4}\tau^{4}\varrho^{4} + 15\tau^{2}\varrho^{2} + 3\right)e^{2\tau^{2}}
+ \left(\tfrac{21}{8}\tau^{4}\varrho^{4} + 3\tau^{2}\varrho^{2}\right)e^{\tau^{2}/2} \\
&= \left(\tfrac{25}{4}\tau^{4}\varrho^{4} + 9\tau^{2}\varrho^{2}\right)e^{2\tau^{2}}
+ \left(\tfrac{21}{8}\tau^{4}\varrho^{4} + 3\tau^{2}\varrho^{2}\right)e^{\tau^{2}/2}
\geq 0,
\end{align*}
where the inequality bounds the two negative terms from above (using $e^{5\tau^{2}/4} \leq e^{2\tau^{2}}$ and $e^{\tau^{2}} \leq e^{2\tau^{2}}$) and the positive term involving $e^{3\tau^{2}/4}$ from below by $e^{\tau^{2}/2}$. When $\varrho = 0$, the difference reduces to $3(e^{2\tau^{2}} - e^{\tau^{2}}) > 0$ for $\tau > 0$, so the process has excess kurtosis even without skewness, and it is easy to see that the excess kurtosis is even larger when $\varrho \neq 0$.

Result 2 indicates that when the co-locational correlation coefficient $\varrho \neq 0$, the mean of the $Z(s)$ process is non-zero. This might seem undesirable for modeling purposes, but since we are trying to capture the skewness in the resulting process, a non-zero mean that takes the same sign as the skewness makes intuitive sense. In addition, it is easy to check that the variance of the process is always positive. When $\varrho = 0$, i.e., when there is no skewness in the process, the mean and skewness reduce to $0$ and the variance reduces to $e^{\tau^{2}/2}$.

Figure 3.1: The plots (i)–(iv) show the mean, variance, skewness and excess kurtosis, respectively, of the process $Z(s)$ as a function of $\tau$ and $\varrho$. Plot (v) shows the simulated density function for the marginal distribution of $Z(s)$ for $\tau = 0.5$. The dotted line in each plot denotes the case where $\varrho = 0$, the dot-dash lines $\varrho = \pm 0.3$, the dashed lines $\varrho = \pm 0.6$ and the dark solid lines $\varrho = 0.9$. The grey solid line in plot (v) denotes the density function of a standard normal distribution.

Figure 3.1 shows the functional relationships between the mean, variance, skewness and excess kurtosis of the process $Z(s)$ and the parameters $\tau$ and $\varrho$, as well as the simulated density of the marginal distribution of $Z(s)$. Results 3 and 4 can be easily verified from the graphs. The plots also clearly show that for a fixed co-locational correlation coefficient $\varrho$, the magnitudes of the mean, variance, skewness and kurtosis of the process $Z(s)$ increase rapidly as $\tau$ increases; for fixed $\tau$, the magnitudes of the four moments also increase as the absolute value of $\varrho$ increases.

3.3.2 The Covariance Properties of the HASP Model

This section presents the covariance function of the univariate spatial process

Y (s). We need the following lemma for our derivations.

Lemma 3 For a mean zero and second-order stationary Gaussian spatial process $\varepsilon(s)$ with correlation function $\rho_{\varepsilon}(h)$, we have, for $\|s - s'\| = h$,
\[
\mathrm{E}[\exp\{a(\varepsilon(s) + \varepsilon(s'))\}\varepsilon(s)\varepsilon(s')]
= [\rho_{\varepsilon}(h) + a^{2}(1 + \rho_{\varepsilon}(h))^{2}]\exp\{a^{2}(1 + \rho_{\varepsilon}(h))\}. \tag{3.3.7}
\]

Proof We prove Lemma 3 using the conditional expectation approach, but note that the proof can be made simpler by applying the same linearization technique as in the proofs of Results 3 and 4.
\begin{align*}
&\mathrm{E}[\exp\{a(\varepsilon(s) + \varepsilon(s'))\}\varepsilon(s)\varepsilon(s')] \\
&= \mathrm{E}[\mathrm{E}[\exp\{a(\varepsilon(s) + \varepsilon(s'))\}\varepsilon(s)\varepsilon(s')\,|\,\varepsilon(s)]] \\
&= \mathrm{E}[\varepsilon(s)\exp\{a\varepsilon(s)\}\,\mathrm{E}[\exp\{a\varepsilon(s')\}\varepsilon(s')\,|\,\varepsilon(s)]] \\
&= \mathrm{E}\left[\varepsilon(s)\exp\{a\varepsilon(s)\}\left(\rho_{\varepsilon}(h)\varepsilon(s) + a(1 - \rho_{\varepsilon}^{2}(h))\right)\exp\left\{a\rho_{\varepsilon}(h)\varepsilon(s) + \frac{a^{2}}{2}(1 - \rho_{\varepsilon}^{2}(h))\right\}\right] \\
&= \exp\left\{\frac{a^{2}}{2}(1 - \rho_{\varepsilon}^{2}(h))\right\}\mathrm{E}\left[\left\{\rho_{\varepsilon}(h)\varepsilon^{2}(s) + a(1 - \rho_{\varepsilon}^{2}(h))\varepsilon(s)\right\}\exp\{a(1 + \rho_{\varepsilon}(h))\varepsilon(s)\}\right].
\end{align*}
Now applying (3.3.3) and (3.3.4), we have
\begin{align*}
&\mathrm{E}[\exp\{a(\varepsilon(s) + \varepsilon(s'))\}\varepsilon(s)\varepsilon(s')] \\
&= \exp\left\{\frac{a^{2}}{2}(1 - \rho_{\varepsilon}^{2}(h))\right\}
[\rho_{\varepsilon}(h) + \rho_{\varepsilon}(h)a^{2}(1 + \rho_{\varepsilon}(h))^{2} + a^{2}(1 + \rho_{\varepsilon}(h))(1 - \rho_{\varepsilon}^{2}(h))]\,\mathrm{E}[\exp\{a(1 + \rho_{\varepsilon}(h))\varepsilon(s)\}] \\
&= [\rho_{\varepsilon}(h) + a^{2}(1 + \rho_{\varepsilon}(h))^{2}]\exp\left\{\frac{a^{2}}{2}(1 - \rho_{\varepsilon}^{2}(h)) + \frac{a^{2}}{2}(1 + \rho_{\varepsilon}(h))^{2}\right\} \\
&= [\rho_{\varepsilon}(h) + a^{2}(1 + \rho_{\varepsilon}(h))^{2}]\exp\{a^{2}(1 + \rho_{\varepsilon}(h))\}.
\end{align*}

Result 5 We assume as usual that the latent stationary bivariate Gaussian process $(\alpha(s), \varepsilon(s))^{T}$ is well defined with a valid stationary and isotropic covariance function (3.3.2). Then $Z(s)$ is also stationary, with covariance function given by
\[
\mathrm{cov}(Z(s), Z(s')) = \left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{c}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4}. \tag{3.3.8}
\]
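The covariance (3.3.8) can be evaluated directly. In the sketch below, all three correlation functions are taken to be the same exponential function $\exp(-\lambda h)$ (anticipating the common-correlation choice (3.3.18) discussed later; the value of $\lambda$ is illustrative), and two properties stated in the text are checked: the covariance equals $\mathrm{Var}(Z(s))$ from Result 2 at $h = 0$, and it vanishes as $h \to \infty$:

```python
import numpy as np

def hasp_cov(h, tau, rho, lam):
    """HASP covariance (3.3.8) with rho_alpha = rho_c = rho_eps = exp(-lam*h)."""
    r = np.exp(-lam * h)
    return (r + tau**2 * rho**2 / 4 * (1 + r)**2) * np.exp(tau**2 / 4 * (1 + r)) \
        - 0.25 * tau**2 * rho**2 * np.exp(tau**2 / 4)

tau, rho, lam = 1.0, 0.5, 2.0
var_z = (1 + tau**2 * rho**2) * np.exp(tau**2 / 2) \
    - 0.25 * tau**2 * rho**2 * np.exp(tau**2 / 4)

assert abs(hasp_cov(0.0, tau, rho, lam) - var_z) < 1e-10   # C(0) = Var(Z(s))
assert abs(hasp_cov(50.0, tau, rho, lam)) < 1e-8           # C(h) -> 0 as h grows
```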

Proof Since the latent bivariate process is assumed to be Gaussian and second-order stationary, we have
\[
\begin{pmatrix} \alpha(s) \\ \alpha(s') \\ \varepsilon(s) \\ \varepsilon(s') \end{pmatrix}
\sim N\left(0,\; \Sigma = \begin{pmatrix} \tau^{2}P & \tau\varrho R \\ \tau\varrho R & Q \end{pmatrix}\right), \tag{3.3.9}
\]
where $P$, $Q$ and $R$ are three $2 \times 2$ correlation matrices with correlation functions $\rho_{\alpha}(h)$, $\rho_{\varepsilon}(h)$ and $\rho_{c}(h)$, respectively. In general, the matrix $\Sigma$ so constructed is not necessarily non-negative definite. However, since we assumed that the bivariate process $(\alpha(s), \varepsilon(s))^{T}$ has a valid second-order structure (3.3.2) (e.g., by assuming the conditions in Theorem 1 hold), $\Sigma$ is indeed non-negative definite. Using the properties of the multivariate normal distribution, we have
\[
\begin{pmatrix} \alpha(s) \\ \alpha(s') \end{pmatrix} \,\bigg|\, \begin{pmatrix} \varepsilon(s) \\ \varepsilon(s') \end{pmatrix}
\sim N\left(\tau\varrho\, R Q^{-1}\begin{pmatrix} \varepsilon(s) \\ \varepsilon(s') \end{pmatrix},\; \tau^{2}P - \tau^{2}\varrho^{2}RQ^{-1}R\right),
\]
and hence
\[
\frac{\alpha(s) + \alpha(s')}{2} \,\bigg|\, \varepsilon(s), \varepsilon(s') \sim N(\mu_{\alpha|\varepsilon}, \Sigma_{\alpha|\varepsilon}),
\]
where
\begin{align*}
\mu_{\alpha|\varepsilon} &= \frac{\tau\varrho}{2}\mathbf{1}^{T}RQ^{-1}\begin{pmatrix} \varepsilon(s) \\ \varepsilon(s') \end{pmatrix}
= \frac{\tau\varrho}{2}\cdot\frac{1 + \rho_{c}(h)}{1 + \rho_{\varepsilon}(h)}\,(\varepsilon(s) + \varepsilon(s')), \quad\text{and} \\
\Sigma_{\alpha|\varepsilon} &= \frac{\tau^{2}}{4}\mathbf{1}^{T}(P - \varrho^{2}RQ^{-1}R)\mathbf{1}
= \frac{\tau^{2}}{4}(\mathbf{1}^{T}P\mathbf{1} - \varrho^{2}\mathbf{1}^{T}RQ^{-1}R\mathbf{1})
= \frac{\tau^{2}}{2}\left(1 + \rho_{\alpha}(h) - \varrho^{2}\frac{(1 + \rho_{c}(h))^{2}}{1 + \rho_{\varepsilon}(h)}\right).
\end{align*}
Using the moment generating function of a normal random variable as well as Lemma 3, we have
\begin{align*}
&\mathrm{cov}(Z(s), Z(s')) \\
&= \mathrm{E}\left[\mathrm{E}\left[\exp\left\{\frac{\alpha(s) + \alpha(s')}{2}\right\}\varepsilon(s)\varepsilon(s')\,\bigg|\,\varepsilon(s), \varepsilon(s')\right]\right] - \mathrm{E}[Z(s)]^{2} \\
&= \exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h)) - \frac{\tau^{2}\varrho^{2}}{4}\frac{(1 + \rho_{c}(h))^{2}}{1 + \rho_{\varepsilon}(h)}\right\}
\mathrm{E}\left[\varepsilon(s)\varepsilon(s')\exp\left\{\frac{\tau\varrho}{2}\frac{1 + \rho_{c}(h)}{1 + \rho_{\varepsilon}(h)}(\varepsilon(s) + \varepsilon(s'))\right\}\right] - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4} \\
&= \exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h)) - \frac{\tau^{2}\varrho^{2}}{4}\frac{(1 + \rho_{c}(h))^{2}}{1 + \rho_{\varepsilon}(h)}\right\}
\left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{c}(h))^{2}\right]\exp\left\{\frac{\tau^{2}\varrho^{2}}{4}\frac{(1 + \rho_{c}(h))^{2}}{1 + \rho_{\varepsilon}(h)}\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4} \\
&= \left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{c}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4}.
\end{align*}

Remark Regardless of the choice of the correlation functions $\rho_{\alpha}(h)$, $\rho_{\varepsilon}(h)$ and $\rho_{c}(h)$, a necessary (but not necessarily sufficient) condition for the bivariate latent process to be valid is
\begin{align*}
(1 + \rho_{\alpha}(h))(1 + \rho_{\varepsilon}(h)) &\geq \varrho^{2}(1 + \rho_{c}(h))^{2}, \quad\text{and} \\
(1 - \rho_{\alpha}(h))(1 - \rho_{\varepsilon}(h)) &\geq \varrho^{2}(1 - \rho_{c}(h))^{2}.
\end{align*}
This can be easily proven by noting that in (3.3.9), given that $Q$ is positive definite, $\Sigma$ is non-negative definite if and only if the Schur complement of $Q$ in $\Sigma$, $\tau^{2}(P - \varrho^{2}RQ^{-1}R)$, is non-negative definite (Haynsworth, 1968).

Denote the covariance function in (3.3.8) as $C(h)$ with $h = \|s - s'\|$. It is easy to verify that $C(0) = \mathrm{Var}(Z(s))$ and that $\lim_{h\to\infty} C(h) = 0$ as long as $\rho_{\alpha}(h)$, $\rho_{c}(h)$ and $\rho_{\varepsilon}(h)$ all tend to zero as the spatial lag goes to infinity. This also leads to our next result on the mean square continuity of the process $Y(s)$.

Result 6 Suppose the correlation functions $\rho_{\alpha}(h)$, $\rho_{c}(h)$, $\rho_{\varepsilon}(h)$ are such that the corresponding bivariate process $[\alpha(s), \varepsilon(s)]^{T}$ is well defined, and in addition assume that
\[
\lim_{h\to 0}\rho_{\alpha}(h) = \lim_{h\to 0}\rho_{c}(h) = \lim_{h\to 0}\rho_{\varepsilon}(h) = 1.
\]
Then the stochastic part of the HASP, $Z(s) = e^{\alpha(s)/2}\varepsilon(s)$, is mean square continuous as defined in Section 3.2.5.

Proof It is easier to prove this result by definition than by using Proposition 1. We need to establish that $\lim_{\Delta s\to 0}\mathrm{E}[(Z(s + \Delta s) - Z(s))^{2}] = 0$. Let $\|\Delta s\| = h$, $\mu = \mathrm{E}[Z(s)]$ and $C(h) = \mathrm{cov}(Z(s + \Delta s), Z(s))$. As has been proven, the process $Z(s)$ has finite mean, finite variance and a valid covariance function. Therefore,

\begin{align*}
\mathrm{E}[(Z(s + \Delta s) - Z(s))^{2}]
&= \mathrm{E}[Z^{2}(s + \Delta s)] + \mathrm{E}[Z^{2}(s)] - 2\,\mathrm{E}[Z(s + \Delta s)Z(s)] \\
&= 2\,\mathrm{E}[Z^{2}(s)] - 2(C(h) + \mu^{2}) \\
&= 2(1 + \tau^{2}\varrho^{2})e^{\tau^{2}/2} - 2\left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{c}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\}.
\end{align*}
It is easy to see that
\begin{align*}
\lim_{\Delta s\to 0}\mathrm{E}[(Z(s + \Delta s) - Z(s))^{2}]
&= \lim_{h\to 0}\left(2(1 + \tau^{2}\varrho^{2})e^{\tau^{2}/2} - 2\left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{c}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\}\right) \\
&= 2(1 + \tau^{2}\varrho^{2})e^{\tau^{2}/2} - 2(1 + \tau^{2}\varrho^{2})e^{\tau^{2}/2} = 0.
\end{align*}

Finally, it is worth pointing out that the function (3.3.8) can still be a valid covariance function for a univariate spatial or spatio-temporal process even when $\rho_{\alpha}(h)$, $\rho_{\varepsilon}(h)$ and $\rho_{c}(h)$ do not imply a valid covariance structure for a bivariate Gaussian process. The following result, although not directly useful for model fitting, sheds some light on the properties of the particular functional form of (3.3.8).

Result 7

i) Suppose $\rho_{\alpha}(h)$, $\rho_{\varepsilon}(h)$ and $\rho_{c}(h)$ are all continuous, non-negative, non-increasing and convex functions that satisfy
\[
\rho_{\alpha}(0) = \rho_{\varepsilon}(0) = \rho_{c}(0) = 1; \qquad
\lim_{h\to\infty}\rho_{\alpha}(h) = \lim_{h\to\infty}\rho_{\varepsilon}(h) = \lim_{h\to\infty}\rho_{c}(h) = 0.
\]
Then (3.3.8) is a positive definite covariance function for a spatial process defined on $\mathbb{R}$.

ii) If, in addition to the conditions in i), the derivatives of the correlation functions, $\rho'_{\alpha}(h)$, $\rho'_{\varepsilon}(h)$ and $\rho'_{c}(h)$, are non-positive, non-decreasing and concave for $h \geq 0$, then (3.3.8) is a positive definite covariance function for a spatial process defined on $\mathbb{R}^{3}$.

It is easy to verify that the exponential correlation function satisfies both sets of conditions. Result 7 follows directly from Pólya's criterion (Pólya, 1949) as well as the following criterion of Pólya type for the positive definiteness of radial functions (Gneiting, 2001):

Theorem 5 (Theorem 1.1 in Gneiting (2001)) Let $\varphi : [0, \infty) \to \mathbb{R}$ be a continuous function with $\varphi(0) = 1$ and $\lim_{t\to\infty}\varphi(t) = 0$. Suppose that $k$ and $l$ are non-negative integers, at least one of which is strictly positive. Let
\[
\eta_{1}(t) = \left[\left(-\frac{d}{du}\right)^{k}\varphi(\sqrt{u})\right]_{u = t^{2}}.
\]
If there exists an $\alpha > 1/2$ so that
\[
\eta_{2}(t) = \left(-\frac{d}{dt}\right)^{k+l-1}\left[-\eta'_{1}(t^{\alpha})\right]
\]
is convex for $t > 0$, then the radial function $\varphi(\|x\|)$, $x \in \mathbb{R}^{n}$, is positive definite for $n = 1, \ldots, 2l + 1$.

Proof of Result 7

i) Let $\gamma(h)$ denote the covariance function in (3.3.8); then $\gamma(h)$ is continuous, $\lim_{h\to\infty}\gamma(h) = 0$, and, after normalizing by $\gamma(0)$, we may take $\gamma(0) = 1$. In addition, given the conditions, $\gamma(h)$ is convex by the following properties of convex functions:

a. If $f(h)$ and $g(h)$ are convex, then $f(h) + g(h)$ is also convex.

b. If $g(h)$ is convex, and $f(h)$ is convex and non-decreasing, then $f(g(h))$ is convex.

c. If $f(h)$ and $g(h)$ are convex, non-negative and are either both non-increasing or both non-decreasing, then $f(h)g(h)$ is convex.

Thus by Pólya's criterion, $\gamma(h)$ is positive definite in $\mathbb{R}$.

ii) In Theorem 5, let $k = 0$, $l = 1$ and $\alpha = 1$. This special case of Theorem 5 (the result of Askey, 1973) dictates that $\gamma(h)$ is positive definite in $\mathbb{R}^{3}$ if $-\gamma'(h)$ is convex, where $-\gamma'(h)$ is given by
\begin{align*}
-\gamma'(h) &= -e^{\tau^{2}(1 + \rho_{\alpha}(h))/4}\left[\rho'_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{2}\rho'_{c}(h)(1 + \rho_{c}(h))\right] \\
&\quad - \frac{\tau^{2}}{4}\rho'_{\alpha}(h)\,e^{\tau^{2}(1 + \rho_{\alpha}(h))/4}\left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{c}(h))^{2}\right].
\end{align*}
Since $\rho'_{\alpha}(h)$, $\rho'_{\varepsilon}(h)$ and $\rho'_{c}(h)$ are non-positive, non-decreasing and concave, the functions $-\rho'_{\alpha}(h)$, $-\rho'_{\varepsilon}(h)$ and $-\rho'_{c}(h)$ are non-negative, non-increasing and convex. Therefore, by the same rules as in part i), $-\gamma'(h)$ is convex.

3.3.3 Linear Co-Regionalization Version of the HASP Model

An alternative to using constrained Matérn covariance functions (as dictated by Theorems 1, 3 and 4) for constructing multivariate Gaussian processes is to consider the linear model of co-regionalization (LMC; see, e.g., Goulard and Voltz, 1992; Wackernagel, 2003; Schmidt and Gelfand, 2003; Zhang, 2007). The LMC is a commonly used approach for obtaining a valid covariance structure for multivariate spatial processes. The LMC expresses a $p$-dimensional multivariate process $U(s)$ as a linear combination of $r$ ($1 \leq r \leq p$) independent spatial processes $W(s) = (w_{1}(s), \ldots, w_{r}(s))^{T}$, i.e.,
\[
U(s) = AW(s),
\]
where $A$ is a $p \times r$ full-rank coefficient matrix and $w_{k}(s)$ has zero mean, unit variance and stationary correlation function $\rho_{k}(h)$ for $1 \leq k \leq r$. The matrix-valued covariance function of $U(s)$ is thus given by
\[
\Sigma(h) = A\Theta(h)A^{T},
\]
where $\Theta(h) = \mathrm{diag}\{\rho_{1}(h), \ldots, \rho_{r}(h)\}$.

Since both $\alpha(s)$ and $\varepsilon(s)$ are Gaussian processes and their co-locational correlation is $\varrho$, we can rewrite $\alpha(s)$ as a linear combination of $\varepsilon(s)$ and a third Gaussian process $\eta(s)$ that has mean $0$, variance $1$, correlation function $\rho_{\eta}(h)$ and is independent of $\varepsilon(s)$, i.e.,
\[
\alpha(s) = \tau\varrho\,\varepsilon(s) + \tau\sqrt{1 - \varrho^{2}}\,\eta(s). \tag{3.3.10}
\]

Alternatively, we can express $\varepsilon(s)$ as
\[
\varepsilon(s) = \varrho\,\alpha(s)/\tau + \sqrt{1 - \varrho^{2}}\,\xi(s), \tag{3.3.11}
\]
where $\xi(s)$ is independent of $\alpha(s)$ and has mean $0$, variance $1$ and correlation function $\rho_{\xi}(h)$. It is easy to see that the bivariate Gaussian process $(\alpha(s), \varepsilon(s))^{T}$ is well defined by construction, but we have also made implicit assumptions about the correlation and cross-correlation functions of $\alpha(s)$ and $\varepsilon(s)$ when using (3.3.10) and (3.3.11). For (3.3.10), the correlation functions satisfy
\begin{align}
\rho_{c}(h) &= \rho_{\varepsilon}(h); \tag{3.3.12} \\
\rho_{\alpha}(h) &= \varrho^{2}\rho_{\varepsilon}(h) + (1 - \varrho^{2})\rho_{\eta}(h), \tag{3.3.13}
\end{align}
and for (3.3.11), we have
\begin{align}
\rho_{c}(h) &= \rho_{\alpha}(h); \tag{3.3.14} \\
\rho_{\varepsilon}(h) &= \varrho^{2}\rho_{\alpha}(h) + (1 - \varrho^{2})\rho_{\xi}(h). \tag{3.3.15}
\end{align}
In other words, the matrix-valued correlation structures for $(\alpha(s), \varepsilon(s))^{T}$ under (3.3.10) and (3.3.11) are respectively
\[
\begin{pmatrix} \varrho^{2}\rho_{\varepsilon}(h) + (1 - \varrho^{2})\rho_{\eta}(h) & \varrho\,\rho_{\varepsilon}(h) \\ \varrho\,\rho_{\varepsilon}(h) & \rho_{\varepsilon}(h) \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} \rho_{\alpha}(h) & \varrho\,\rho_{\alpha}(h) \\ \varrho\,\rho_{\alpha}(h) & \varrho^{2}\rho_{\alpha}(h) + (1 - \varrho^{2})\rho_{\xi}(h) \end{pmatrix}.
\]
By plugging the above two linear representations into model (3.3.1), we obtain the following two LMC versions of the HASP model:
\[
Y(s) = X^{T}(s)\beta + \exp\left\{\frac{H^{T}(s)\delta + \tau\varrho\,\varepsilon(s) + \tau\sqrt{1 - \varrho^{2}}\,\eta(s)}{2}\right\}\varepsilon(s), \tag{3.3.16}
\]
or
\begin{align}
Y(s) &= X^{T}(s)\beta + \exp\left\{\frac{H^{T}(s)\delta + \alpha(s)}{2}\right\}\left[\varrho\,\alpha(s)/\tau + \sqrt{1 - \varrho^{2}}\,\xi(s)\right] \notag \\
&= X^{T}(s)\beta + \frac{\varrho}{\tau}\,\alpha(s)\exp\left\{\frac{H^{T}(s)\delta + \alpha(s)}{2}\right\} + \sqrt{1 - \varrho^{2}}\exp\left\{\frac{H^{T}(s)\delta + \alpha(s)}{2}\right\}\xi(s). \tag{3.3.17}
\end{align}
The term
\[
\frac{\varrho}{\tau}\,\alpha(s)\exp\left\{\frac{H^{T}(s)\delta + \alpha(s)}{2}\right\}
\]
in (3.3.17) clearly illustrates the skewness in the HASP model.
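The "valid by construction" nature of the LMC representation can be illustrated on a finite set of sites: assembling the joint covariance of $(\alpha, \varepsilon)$ implied by (3.3.10) necessarily yields a non-negative definite matrix whose blocks follow (3.3.12)–(3.3.13). The grid, the decay rates of the correlation functions, and the values of $\tau$ and $\varrho$ below are illustrative:

```python
import numpy as np

s = np.linspace(0, 5, 40)                   # one-dimensional site locations
H = np.abs(s[:, None] - s[None, :])         # pairwise distances
Qe = np.exp(-1.5 * H)                       # rho_eps: exponential correlation
Qh = np.exp(-0.7 * H**2)                    # rho_eta: Gaussian correlation
tau, rho = 1.2, 0.8

# Blocks implied by alpha = tau*rho*eps + tau*sqrt(1-rho^2)*eta
Caa = tau**2 * (rho**2 * Qe + (1 - rho**2) * Qh)  # (3.3.13), scaled by tau^2
Cae = tau * rho * Qe                              # cross block: rho_c = rho_eps
Sigma = np.block([[Caa, Cae], [Cae.T, Qe]])

# Valid by construction: smallest eigenvalue is non-negative up to rounding error
assert np.linalg.eigvalsh(Sigma).min() > -1e-8
```

The same check applied to an arbitrary, unconstrained choice of $\rho_{\alpha}$, $\rho_{c}$, $\rho_{\varepsilon}$ can fail, which is exactly the validity issue the LMC construction avoids.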

Result 8 The covariance functions for (3.3.16) and (3.3.17) are, respectively,
\[
\mathrm{cov}(Z(s), Z(s')) = \left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{\varepsilon}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4},
\]
and
\[
\mathrm{cov}(Z(s), Z(s')) = \left[\rho_{\varepsilon}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{\alpha}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4},
\]
where
\[
\rho_{\alpha}(h) = \varrho^{2}\rho_{\varepsilon}(h) + (1 - \varrho^{2})\rho_{\eta}(h), \quad\text{and}\quad
\rho_{\varepsilon}(h) = \varrho^{2}\rho_{\alpha}(h) + (1 - \varrho^{2})\rho_{\xi}(h).
\]

Proof Immediate from (3.3.8) and (3.3.12)–(3.3.15).

Last but not least, it is important to note that although the linearized forms (3.3.16) and (3.3.17) represent only a subset of the more general model (3.3.1), the marginal distributions of (3.3.16) and (3.3.17) are exactly the same as that of (3.3.1). As a result, the mean, variance and skewness of the models (3.3.16) and (3.3.17) are exactly the same as those of (3.3.1).

3.3.4 More on the Choices of the Correlation and Cross-Correlation Functions

We have discussed several possible choices for the correlation functions $\rho_{\alpha}(h)$, $\rho_{\varepsilon}(h)$ and $\rho_{c}(h)$, but the question remains which restrictions we should embrace in practice. In addition to the modeling flexibility afforded by the assumptions on the correlation functions, we also need to consider the practicality of the model fitting procedures. As the marginal distribution is the same regardless of the choice of the correlation functions, we suggest the following two options that can make the model fitting elegantly tractable while, at the same time, still providing enough flexibility in the covariance structure.

I. As mentioned previously, in the Matérn model (3.2.2) and (3.2.3), the constraint on $\varrho$ depends on how much the smoothness and scale parameters of the cross-correlation function deviate from the corresponding parameters of the two marginal processes. Since a major feature of the HASP model is the ability to capture skewness in the process, which depends on the co-locational correlation parameter $\varrho$, it is undesirable for us to place constraints on $\varrho$.

According to Theorem 3, we can remove the restriction on $\varrho$ by requiring both the $\alpha(s)$ and $\varepsilon(s)$ processes to have the same degree of smoothness, i.e., $\nu_{1} = \nu_{2}$. In other words, there is no restriction on $\varrho$ for the HASP model if we require
\[
\rho_{\alpha}(h) = \rho_{c}(h) = \rho_{\varepsilon}(h) = \varrho(h). \tag{3.3.18}
\]
In fact, it is easy to see that we can use any correlation function, not just Matérn functions, for $\varrho(h)$, and the bivariate Gaussian process is always valid. The covariance function is given by (for $\|s - s'\| = h$)
\[
\mathrm{cov}(Z(s), Z(s')) = \left[\varrho(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \varrho(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \varrho(h))\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4}. \tag{3.3.19}
\]

Under the assumption (3.3.18), the three models (3.3.1), (3.3.16) and (3.3.17) are also equivalent to each other. The condition (3.3.18) appears to be very restrictive, but as we have learned from our previous discussion, not all the parameters in the covariance function, even in the more restrictive Gaussian process, can be consistently estimated at the same time (Zhang, 2004). In the case of the exponential covariance function, $\sigma^{2}\exp(-ah)$, either $a$ or $\sigma^{2}$ can be fixed arbitrarily and the composite quantity $a\sigma^{2}$ can still be estimated consistently and efficiently (Ying, 1991). As a result, the assumption (3.3.18) might not be as restrictive as it seems, and its impact on the model performance should be very limited.

II. A more flexible model is the LMC version (3.3.17). (3.3.17) is easier to work with than (3.3.16) since, conditional on $\alpha(s)$, $Y(s)$ is a Gaussian process. Regardless of the choice of correlation functions $\rho_{\alpha}(h)$ and $\rho_{\xi}(h)$, the model is always valid by construction. The covariance function of (3.3.17) is given by
\[
\mathrm{cov}(Z(s), Z(s')) = \left[\varrho^{2}\rho_{\alpha}(h) + (1 - \varrho^{2})\rho_{\xi}(h) + \frac{\tau^{2}\varrho^{2}}{4}(1 + \rho_{\alpha}(h))^{2}\right]\exp\left\{\frac{\tau^{2}}{4}(1 + \rho_{\alpha}(h))\right\} - \frac{1}{4}\tau^{2}\varrho^{2}e^{\tau^{2}/4}. \tag{3.3.20}
\]

Plots of the two covariance functions are presented below to facilitate the understanding of their behaviors and differences. Figure 3.2 shows the covariance function (3.3.19) for different choices of $\varrho(h)$ and different values of the other parameters.

Figure 3.2: The shape of the correlation function implied by (3.3.19) for different choices of $\varrho(h)$. Plots (i)–(iii) use the exponential correlation function for $\varrho(h)$, while plots (iv)–(vi) use the Gaussian correlation function. The values of the other parameters are shown in the plots.

Figure 3.3: The shape of the correlation function implied by (3.3.20), where $\rho_{\alpha}(h)$ assumes the form of a Gaussian correlation function and $\rho_{\xi}(h)$ the form of an exponential correlation function. The values of the other parameters are shown in the plots.

The exponential correlation function is used in plots (i)–(iii) of Figure 3.2, i.e., $\varrho(h) = \exp(-\lambda h)$. In plots (iv)–(vi), the Gaussian correlation function is used, thus $\varrho(h) = \exp(-\lambda h^{2})$. The

Figure 3.2 shows that the co-locational correlation parameter $\varrho$ has little impact on the covariance function, while it has a big impact on the marginal distribution of the spatial process, as can be seen in Figure 3.1. In addition, both $\tau$ and $\lambda$ impact the covariance function in a similar fashion.

Due to the additional parameters, the covariance function (3.3.20) is more flexible than (3.3.19), which can be clearly seen in Figure 3.3. The plots in Figure 3.3 are based on the assumption that $\rho_{\alpha}(h)$ takes the form of a Gaussian correlation function $\exp(-\lambda h^{2})$, and that $\rho_{\xi}(h)$ takes the form of an exponential correlation function $\exp(-\kappa h)$, so that the process $\alpha(s)$ has a smoother sample path than $\varepsilon(s)$. As in Figure 3.2, the impact of different values of $\varrho$ on the shape of the covariance function in Figure 3.3 is not as large as the impacts of $\tau$ and $\lambda$. However, the smoothness of the covariance function at $h = 0$ as well as its effective correlation length does depend partially on $\varrho$. These impacts of $\varrho$ are in large part due to the relation
\[
\rho_{\varepsilon}(h) = \varrho^{2}\rho_{\alpha}(h) + (1 - \varrho^{2})\rho_{\xi}(h).
\]
The parameters $\tau$ and $\lambda$ also have similar effects on the covariance function, suggesting that consistent simultaneous estimation of both parameters might still be a problem.

It is ultimately up to the researcher to decide which of the above two strategies to choose. The covariance function (3.3.20) is certainly more flexible than (3.3.19), but in practice we do not always have enough data to estimate it efficiently. When data are not abundant, or when there is no need to estimate the smoothness of the underlying process with a very general covariance function, the more restricted strategy (3.3.19) might be the better choice. In either case, the model fitting procedures for the two strategies are very similar.

3.3.5 Examples of the HASP Sample Paths

In this section, we present several realizations from the heteroscedastic asymmetric spatial process and compare them with realizations from the SHP and the Gaussian process (GP). Figures 3.4 and 3.5 show sample paths of the processes defined on $\mathbb{R}$, and examples of the processes defined on $\mathbb{R}^{2}$ are presented in Figures 3.6 and 3.7. For Figures 3.4 and 3.6, an exponential function is used as the correlation and cross-correlation function of the (potentially latent) Gaussian processes, whereas a Gaussian correlation function is used for Figures 3.5 and 3.7.

As discussed earlier, the sample path of a Gaussian process with a Gaussian correlation function is smoother than that with an exponential correlation function, which can be clearly seen from the plots. From the time series literature, we know that a stationary and Markov Gaussian process defined on $\mathbb{R}^{+}$ with a continuous correlation function is necessarily an Ornstein–Uhlenbeck process, which is the unique stationary solution to the stochastic differential equation
\[
dX_{t} = \theta(\mu - X_{t})\,dt + \sigma\,dW_{t}, \quad t \geq 0,
\]
where $\{W_{t}, t \geq 0\}$ is a Brownian motion with unit variance (see Uhlenbeck and Ornstein, 1930; Doob, 1942). The Ornstein–Uhlenbeck process has a stationary and isotropic exponential correlation function, and its Euler–Maruyama discretization is the discrete-time AR(1) process. In addition, the sample path of an Ornstein–Uhlenbeck process is continuous but nowhere differentiable with probability 1.

Apart from the discrepancies in the smoothness of the sample paths caused by the different correlation functions, we can also recognize the unique features of the curves or surfaces drawn from the non-Gaussian processes compared to those drawn from the Gaussian processes (which are set to have zero mean and unit variance). In Figures 3.4–3.7, the curves or surfaces drawn from a Gaussian process mostly vary within $-2$ and $2$. When we modulate the said Gaussian process with an independent exponential-Gaussian process and get the GLG/SHP model, the resulting sample paths end up with more fluctuation in terms of the range of the functions over the entire domain. Furthermore, when we correlate the Gaussian process and the modulating exponential-Gaussian process, i.e., for the HASP with positive or negative skewness, the curves or surfaces drawn from the processes can have an even larger swing in one direction. This suggests that the HASP model (with the SHP as a special case) is useful in modeling spatially indexed data when the underlying curve or surface has

Figure 3.4: Five sample paths from each of the four spatial processes in $\mathbb{R}$: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. An exponential correlation function is used for the GP. For the non-Gaussian processes, the same exponential correlation function is used for the marginal as well as the cross correlation functions of the latent multivariate Gaussian process. The sample paths for different processes bear resemblance to each other because the same seed is used for random number generation for easier comparison of the different processes. The sample paths are approximated based on a finite number of observations on a grid with an increment of 0.1. Realizations of the HASP model are computed according to (3.3.17).

Figure 3.5: Five sample paths from each of the four spatial processes in $\mathbb{R}$: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. A Gaussian (or double exponential) correlation function is used for the GP. For the non-Gaussian processes, the same Gaussian correlation function is used for the marginal as well as cross correlation functions of the latent multivariate Gaussian process. The sample paths are constructed in the same way as in Figure 3.4.

Figure 3.6: Five sample paths from each of the four spatial processes in $\mathbb{R}^{2}$: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. An exponential correlation function is used for the GP. For the non-Gaussian processes, the same exponential correlation function is used for the marginal as well as cross correlation functions of the latent multivariate Gaussian process. Again, the sample paths for different processes bear resemblance to each other because the same seed is used for random number generation for easier comparison of the different processes. The sample paths are approximated based on a finite number of observations on a rectangular grid with an increment of 0.1 in each direction. Realizations of the HASP model are computed according to (3.3.17).

Figure 3.7: Five sample paths from each of the four spatial processes in $\mathbb{R}^{2}$: GP, GLG/SHP, HASP with positive skewness and HASP with negative skewness. A Gaussian correlation function is used for the GP. For the non-Gaussian processes, the same Gaussian correlation function is used for the marginal as well as cross correlation functions of the latent multivariate Gaussian process. The sample paths are constructed in the same way as in Figure 3.6.

marked peaks or valleys which, if modeled by the Gaussian processes, might be overly shrunken toward the overall mean.
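Sample paths like those in Figures 3.4–3.7 can be generated directly from the conditional representation (3.3.17). Below is a minimal one-dimensional sketch under the common-correlation strategy (3.3.18), for which $\rho_{\alpha} = \rho_{\xi}$; with $\varrho > 0$ the generated paths should on average exhibit the right skew discussed above. The grid, seed and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
s = np.linspace(0, 10, 200)
H = np.abs(s[:, None] - s[None, :])
P = np.exp(-H)                                   # common exponential correlation
L = np.linalg.cholesky(P + 1e-10 * np.eye(len(s)))  # jitter for numerical safety
tau, rho = 1.0, 0.7

skews = []
for _ in range(200):
    alpha = tau * (L @ rng.standard_normal(len(s)))  # latent log-variance field
    xi = L @ rng.standard_normal(len(s))             # independent of alpha
    # Representation (3.3.17) with beta = delta = 0, so Y(s) = Z(s):
    Z = (rho / tau) * alpha * np.exp(alpha / 2) \
        + np.sqrt(1 - rho**2) * np.exp(alpha / 2) * xi
    skews.append(((Z - Z.mean())**3).mean())

assert np.mean(skews) > 0   # positive co-locational correlation => right skew
```

Flipping the sign of `rho` reverses the direction of the skew, matching the lower panels of the figures.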

3.4 Model Fitting and Spatial Prediction

3.4.1 The Likelihood Function

Suppose we observe the $Y(s)$ process at $n$ spatial locations in $\mathbb{R}^d$, indexed by $s = (s_1, s_2, \cdots, s_n)^T$. Let
\begin{align*}
Y &= (Y(s_1), \cdots, Y(s_n))^T, \\
\alpha &= (\alpha(s_1), \cdots, \alpha(s_n))^T, \quad \text{and} \\
\epsilon &= (\varepsilon(s_1), \cdots, \varepsilon(s_n))^T.
\end{align*}

Assume that
\[
\begin{pmatrix} \alpha \\ \epsilon \end{pmatrix}
\sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau^2 P & \tau\varrho R \\ \tau\varrho R & Q \end{pmatrix} \right),
\]
where $P$, $Q$ and $R$ are three correlation matrices of dimension $n$ whose $(i,j)$ elements are defined by $\rho_\alpha(\|s_i - s_j\|)$, $\rho_\varepsilon(\|s_i - s_j\|)$ and $\rho_c(\|s_i - s_j\|)$, respectively. We also use $A \circ B$ to denote the element-wise multiplication of two vectors or matrices of the same dimensions, and use $A \cdot B$ to denote the inner product of two vectors of the same dimension. In addition, we introduce the following notation for the rest of

this chapter:

\begin{align*}
X &= (X(s_1), \cdots, X(s_n))^T, \\
W_\delta &= \left(e^{H^T(s_1)\delta/2}, \cdots, e^{H^T(s_n)\delta/2}\right)^T, \\
W_\alpha &= \left(e^{\alpha(s_1)/2}, \cdots, e^{\alpha(s_n)/2}\right)^T, \\
V_\delta &= \mathrm{diag}\left\{e^{H^T(s_1)\delta/2}, \cdots, e^{H^T(s_n)\delta/2}\right\}, \\
V_\alpha &= \mathrm{diag}\left\{e^{\alpha(s_1)/2}, \cdots, e^{\alpha(s_n)/2}\right\}, \\
W &= W_\delta \circ W_\alpha, \\
V &= V_\delta V_\alpha.
\end{align*}

As long as the correlation functions $\rho_\alpha(h)$, $\rho_\varepsilon(h)$ and $\rho_c(h)$ are such that the matrix $Q - \varrho^2 R P^{-1} R$ is positive definite for all integers $n > 1$ (e.g., $\rho_\alpha(h)$, $\rho_\varepsilon(h)$ and $\rho_c(h)$ satisfy the conditions in Theorem 1, 3 or 4, or the conditions given in Section 3.3.3), the conditional distribution of $\epsilon$ given $\alpha$ is multivariate normal,
\[
\epsilon \mid \alpha \sim N\left( \frac{\varrho}{\tau} R P^{-1} \alpha, \; Q - \varrho^2 R P^{-1} R \right).
\]
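As a quick numerical check (not the dissertation's code), the closed-form conditional above can be compared with the generic Gaussian conditioning formula. For illustration, $P$, $Q$ and $R$ below share one exponential correlation function, which is an assumption made purely for this sketch:

```python
import numpy as np

# Check eps | alpha = N((rho/tau) R P^{-1} alpha, Q - rho^2 R P^{-1} R)
# against generic Gaussian conditioning on a small synthetic configuration.
rng = np.random.default_rng(0)
n, tau, rho = 5, 1.3, 0.6
sites = rng.uniform(0, 10, size=(n, 2))
dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)
P = Q = R = np.exp(-dist / 2.0)              # exponential correlation, range 2

# Joint covariance of (alpha, eps)
Sigma = np.block([[tau**2 * P, tau * rho * R],
                  [tau * rho * R, Q]])

alpha = rng.normal(size=n)
S11, S12 = Sigma[:n, :n], Sigma[:n, n:]
S21, S22 = Sigma[n:, :n], Sigma[n:, n:]
mean_generic = S21 @ np.linalg.solve(S11, alpha)
cov_generic = S22 - S21 @ np.linalg.solve(S11, S12)

# Closed form from the text
mean_text = (rho / tau) * R @ np.linalg.solve(P, alpha)
cov_text = Q - rho**2 * R @ np.linalg.solve(P, R)

print(np.allclose(mean_generic, mean_text), np.allclose(cov_generic, cov_text))
```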

The distribution of the data $Y$ conditional on $\alpha$ and all model parameters $\theta$ is simply
\[
N\left( X\beta + \frac{\varrho}{\tau} V R P^{-1} \alpha, \; V \left(Q - \varrho^2 R P^{-1} R\right) V \right).
\]

Under the Bayesian framework, the posterior distribution of the parameters (augmented with the latent $\alpha$) is given by
\[
\pi(\alpha, \theta \mid Y) = \frac{\pi(\alpha, \theta, Y)}{\pi(Y)} \propto \pi(Y \mid \alpha, \theta)\,\pi(\alpha \mid \theta)\,\pi(\theta), \tag{3.4.1}
\]
where $\pi(\theta)$ is the prior distribution for $\theta$.

Given (3.4.1), it is straightforward to come up with a Markov chain Monte Carlo (MCMC) algorithm for fitting the model. If we adopt strategy I, presented in Section 3.3.4, to ensure the positive definiteness of $Q - \varrho^2 R P^{-1} R$, i.e., to assume that $P = Q = R$, then the conditional distribution $Y \mid \alpha, \theta$ can be simplified to
\[
N\left( X\beta + \frac{\varrho}{\tau} V\alpha, \; (1 - \varrho^2)\, V P V \right),
\]
or equivalently,
\[
N\left( X\beta + \frac{\varrho}{\tau} W \circ \alpha, \; (1 - \varrho^2)\, V P V \right).
\]

We can simplify the model fitting even further by re-parameterizing the model. Since the row vector $H(s)^T$ contains a constant 1, we can rewrite $H(s)^T$ and $\beta$ as
\[
H(s)^T = \begin{bmatrix} 1 & \tilde{H}(s)^T \end{bmatrix} \quad \text{and} \quad \beta = \begin{bmatrix} \mu & \tilde{\beta}^T \end{bmatrix}^T,
\]
so that $H(s)^T\beta = \mu + \tilde{H}(s)^T\tilde{\beta}$. Define $\sigma = e^{\mu/2}$, $\phi = \varrho\sigma/\tau \in \mathbb{R}$ and $\psi^2 = \sigma^2(1 - \varrho^2) \in \mathbb{R}^+$; the distribution of $Y \mid \alpha, \theta$ can then be equivalently expressed as
\[
N\left( X\beta + \phi\,\tilde{W} \circ \alpha, \; \psi^2\, \tilde{V} P \tilde{V} \right).
\]
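This re-parameterization is invertible: since $\psi^2 + \phi^2\tau^2 = \sigma^2(1-\varrho^2) + \varrho^2\sigma^2 = \sigma^2$, the original parameters can be recovered from $(\phi, \psi^2, \tau)$. A small numerical sketch, with arbitrary illustrative values rather than estimates from the text:

```python
import math

# Forward map: sigma = e^{mu/2}, phi = rho*sigma/tau, psi^2 = sigma^2*(1 - rho^2)
mu, rho, tau = 0.8, 0.6, 1.3

sigma = math.exp(mu / 2)
phi = rho * sigma / tau
psi2 = sigma**2 * (1 - rho**2)

# Inverse map, using the identity psi^2 + phi^2 tau^2 = sigma^2
sigma_back = math.sqrt(psi2 + phi**2 * tau**2)
rho_back = phi * tau / sigma_back

print(abs(sigma_back - sigma) < 1e-12, abs(rho_back - rho) < 1e-12)
```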

On the other hand, if strategy II, i.e., the LMC (3.3.17), is adopted, where the correlation matrix for the process $\xi(s)$ is denoted $\Xi$, then we have $Q = \varrho^2 P + (1 - \varrho^2)\Xi$ and $P = R$ by equations (3.3.14) and (3.3.15). Note that $Q - \varrho^2 R P^{-1} R$ simplifies to $(1 - \varrho^2)\Xi$, so the distribution $\pi(Y \mid \alpha, \theta)$ can be written as
\[
N\left( X\beta + \phi\,\tilde{W} \circ \alpha, \; \psi^2\, \tilde{V} \Xi \tilde{V} \right).
\]

3.4.2 Model Fitting Strategy

A number of different model fitting approaches have been proposed in the literature for non-Gaussian processes of the form (3.2.1). For example, Palacios and Steel (2006) fitted the GLG model under the Bayesian framework. In particular, they used an MCMC algorithm with block updating of the latent scale parameters. The block update is carried out through a Metropolis-Hastings step, in which the proposal distribution is set to be a normal distribution that best approximates the full-conditional distribution. Zhang and El-Shaarawi (2010) used a maximum likelihood approach for the skew-Gaussian process. They augmented the parameter space with one of the two latent univariate Gaussian processes and used the expectation maximization (EM) algorithm to evaluate the likelihood function. Craigmile and Guttorp (2011) proposed a hierarchical Bayesian model on the wavelet domain to fit their non-Gaussian space-time model under an approximate whitened model. Huang et al. (2011) employed a pseudo maximum likelihood approach without augmenting the parameter space and used importance sampling to evaluate the likelihood function
\[
L(\theta; Y) = \int p(Y \mid \alpha, \theta)\, p(\alpha \mid \theta)\, d\alpha,
\]
where $\theta$ denotes all parameters in the model.
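The importance-sampling evaluation of such a likelihood can be sketched with a toy one-dimensional latent variable standing in for the latent field; the densities below are placeholders chosen so the exact marginal is known in closed form, not the HASP model's densities:

```python
import numpy as np

# L(theta; Y) = ∫ p(Y|alpha) p(alpha) d(alpha), estimated by importance sampling
# with weights w = p(Y|alpha) p(alpha) / q(alpha) for a proposal q.
rng = np.random.default_rng(1)

def log_p_y_given_alpha(y, a):              # placeholder: Y | alpha ~ N(alpha, 1)
    return -0.5 * (y - a) ** 2 - 0.5 * np.log(2 * np.pi)

def log_p_alpha(a):                         # placeholder: alpha ~ N(0, 1)
    return -0.5 * a ** 2 - 0.5 * np.log(2 * np.pi)

y, M = 0.7, 200_000
a = rng.normal(loc=y, size=M)               # proposal q = N(y, 1)
log_q = -0.5 * (a - y) ** 2 - 0.5 * np.log(2 * np.pi)
log_w = log_p_y_given_alpha(y, a) + log_p_alpha(a) - log_q
L_hat = np.exp(log_w).mean()

# For these placeholder densities the exact marginal is N(0, 2) at y
L_exact = np.exp(-0.25 * y**2) / np.sqrt(4 * np.pi)
print(abs(L_hat - L_exact) < 5e-3)
```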

Instead of integrating out the latent process $\alpha$ and approximating the resulting likelihood function, we propose to use a Bayesian method to fit the HASP model, similar to the approach of Palacios and Steel (2006). The Gibbs sampler for our MCMC algorithm is presented below. We use strategy II to illustrate the model fitting process, but note that the procedure for strategy I is very similar.

Let $\theta$ denote all model parameters other than $\alpha$, i.e., $\theta = \{\beta, \phi, \psi^2, \tau^2, \lambda, \kappa, \tilde{\delta}\}$. For a single iteration in the MCMC algorithm, we simulate from the posterior distribution using the following steps.

1) Update the latent variables $\alpha$ either one-by-one or through a block update as in Palacios and Steel (2006). One approach is to use a random walk Metropolis-Hastings step. Alternatively, we can consider a better proposal distribution by using techniques such as approximating the full conditional distribution, as was done in Palacios and Steel (2006), or using the Metropolis-adjusted Langevin algorithm (Grenander and Miller, 1994; Roberts and Tweedie, 1996). The full conditional distribution of $\alpha$ is given by
\begin{align*}
p(\alpha \mid Y, \theta) \propto{}& |V_\alpha|^{-1} \exp\left\{ -\frac{1}{2\psi^2} \left[ V_\alpha^{-1}\tilde{V}_\delta^{-1}(Y - X\beta) - \phi\alpha \right]^T \Xi^{-1} \left[ V_\alpha^{-1}\tilde{V}_\delta^{-1}(Y - X\beta) - \phi\alpha \right] \right\} \\
& \times \exp\left\{ -\frac{1}{2\tau^2}\, \alpha^T P^{-1} \alpha \right\}.
\end{align*}

Let $Y_\phi = \tilde{V}_\delta^{-1}(Y - X\beta)$ and let $C$ be a constant in $\alpha$; then
\begin{align*}
\log p(\alpha \mid Y, \theta)
&= C - \frac{1}{2}\sum_{k=1}^n \alpha(s_k) - \frac{1}{2\psi^2}\left[ (V_\alpha^{-1}Y_\phi)^T \Xi^{-1} (V_\alpha^{-1}Y_\phi) - 2\phi\,\alpha^T \Xi^{-1} (V_\alpha^{-1}Y_\phi) \right] \\
&\qquad - \frac{1}{2}\alpha^T \left( \frac{\phi^2}{\psi^2}\Xi^{-1} + \frac{1}{\tau^2}P^{-1} \right)\alpha \\
&= C - \frac{1}{2}\sum_{k=1}^n \alpha(s_k) - \frac{1}{2\psi^2}\, (V_\alpha^{-1}Y_\phi - 2\phi\alpha)^T \Xi^{-1} (V_\alpha^{-1}Y_\phi) \\
&\qquad - \frac{1}{2}\alpha^T \left( \frac{\phi^2}{\psi^2}\Xi^{-1} + \frac{1}{\tau^2}P^{-1} \right)\alpha. \tag{3.4.2}
\end{align*}

In this chapter, we use the Metropolis-adjusted Langevin algorithm (MALA) to sequentially update the components of $\alpha$. For $k = 1, \cdots, n$, let $l_t(\alpha(s_k))$ denote the full conditional log likelihood function of $\alpha(s_k)$ (up to a constant in $\alpha(s_k)$) evaluated at the most recent draws of all other model parameters in iteration $t$. Then,

i. Given the sample of $\alpha(s_k)$ from iteration $t-1$, $\alpha^{[t-1]}(s_k)$, draw a sample $\alpha^*(s_k)$ from the proposal distribution
\[
N\left( \alpha^{[t-1]}(s_k) + \frac{\varsigma_0^2}{2}\, \nabla l_t(\alpha^{[t-1]}(s_k)), \; \varsigma_0^2 \right),
\]
where $\varsigma_0$ is a user-defined tuning parameter.

ii. Calculate the ratio $e^{\Delta_t}$, where
\begin{align*}
\Delta_t &= \left\{ l_t(\alpha^*(s_k)) - \frac{1}{2\varsigma_0^2} \left[ \alpha^{[t-1]}(s_k) - \alpha^*(s_k) - \frac{\varsigma_0^2}{2}\nabla l_t(\alpha^*(s_k)) \right]^2 \right\} \\
&\quad - \left\{ l_t(\alpha^{[t-1]}(s_k)) - \frac{1}{2\varsigma_0^2} \left[ \alpha^*(s_k) - \alpha^{[t-1]}(s_k) - \frac{\varsigma_0^2}{2}\nabla l_t(\alpha^{[t-1]}(s_k)) \right]^2 \right\}.
\end{align*}

iii. Set $\alpha^{[t]}(s_k) = \alpha^*(s_k)$ with probability $\min(1, e^{\Delta_t})$. Otherwise, set $\alpha^{[t]}(s_k) = \alpha^{[t-1]}(s_k)$.
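The three steps above can be sketched for a generic univariate log density. Here a standard normal target stands in for the full conditional of $\alpha(s_k)$, and the tuning value $\varsigma_0 = 1$ is an arbitrary illustrative choice:

```python
import numpy as np

# MALA on a univariate log density l with gradient grad_l.
rng = np.random.default_rng(42)

def l(a):          # log target (up to a constant): standard normal
    return -0.5 * a**2

def grad_l(a):     # its gradient
    return -a

sigma0 = 1.0       # tuning parameter varsigma_0
a, draws = 0.0, []
for t in range(20000):
    # i. Langevin proposal centered at the gradient-shifted current value
    mean_fwd = a + 0.5 * sigma0**2 * grad_l(a)
    a_star = rng.normal(mean_fwd, sigma0)
    # ii. log acceptance ratio Delta_t (target ratio plus proposal correction)
    mean_bwd = a_star + 0.5 * sigma0**2 * grad_l(a_star)
    delta = (l(a_star) - (a - mean_bwd)**2 / (2 * sigma0**2)) \
          - (l(a)      - (a_star - mean_fwd)**2 / (2 * sigma0**2))
    # iii. accept with probability min(1, e^Delta)
    if np.log(rng.uniform()) < delta:
        a = a_star
    draws.append(a)

draws = np.array(draws[2000:])   # discard burn-in
print(abs(draws.mean()) < 0.1, abs(draws.var() - 1.0) < 0.1)
```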

A simpler form of the full conditional log likelihood function of $\alpha(s_k)$, $k = 1, \ldots, n$, can be derived from (3.4.2), as shown below (with subscript $t$ omitted for simplicity). For convenience, let
\[
\Omega = \Xi^{-1} \quad \text{and} \quad \Lambda = \frac{\phi^2}{\psi^2}\Xi^{-1} + \frac{1}{\tau^2}P^{-1}.
\]
Now let $\Omega_i$ and $\Lambda_i$ denote the $i$-th column vectors of $\Omega$ and $\Lambda$, respectively. Since $\Omega$ and $\Lambda$ are both symmetric, $\Omega_i^T$ and $\Lambda_i^T$ are thus their respective $i$-th row vectors. In addition, let $\Omega_{ij}$ and $\Lambda_{ij}$ denote the $(i,j)$-th elements of the respective matrices. Finally, let $W_\alpha^{-1} = (e^{-\alpha(s_1)/2}, \cdots, e^{-\alpha(s_n)/2})^T$ and let $C_k$ be a scalar that does not depend on $\alpha(s_k)$. Ignoring the terms in (3.4.2) that do not contain $\alpha(s_k)$, we have
\begin{align*}
l(\alpha(s_k)) &= C_k - \frac{1}{2}\alpha(s_k) - \frac{1}{2\psi^2}\bigg\{ \left( e^{-\alpha(s_k)/2} Y_{\phi k} - 2\phi\,\alpha(s_k) \right) \Omega_k \cdot (W_\alpha^{-1} \circ Y_\phi) \\
&\qquad + e^{-\alpha(s_k)/2} Y_{\phi k}\, \Omega_k \cdot (W_\alpha^{-1} \circ Y_\phi - 2\phi\alpha) \\
&\qquad - \left( e^{-\alpha(s_k)/2} Y_{\phi k} - 2\phi\,\alpha(s_k) \right) \Omega_{kk} \left( e^{-\alpha(s_k)/2} Y_{\phi k} \right) \bigg\} \\
&\qquad - \frac{1}{2}\left[ 2\alpha(s_k)\, \Lambda_k \cdot \alpha - \alpha(s_k)^2\, \Lambda_{kk} \right].
\end{align*}

To calculate the gradient of $l(\alpha(s_k))$, we can use the following general result.

Lemma 4. Let $\omega$ be a $p \times 1$ column vector and $M$ be a $p \times p$ symmetric matrix whose $k$-th column vector is denoted $M_k$. Let $f$ and $g$ be two differentiable functions
\[
f, g: \mathbb{R} \to \mathbb{R},
\]
and define
\[
f(\omega) = (f(\omega_1), \ldots, f(\omega_p))^T, \qquad g(\omega) = (g(\omega_1), \ldots, g(\omega_p))^T.
\]
Then, for $k = 1, \ldots, p$, we have
\[
\frac{\partial}{\partial \omega_k}\, f^T(\omega)\, M\, g(\omega) = f'(\omega_k)\, M_k \cdot g(\omega) + g'(\omega_k)\, M_k \cdot f(\omega).
\]

The proof is trivial: it involves expanding $f^T(\omega) M g(\omega)$ into a quadratic form and then applying basic calculus rules. By Lemma 4 and (3.4.2), the gradient of $l(\alpha(s_k))$ is given by
\begin{align*}
\nabla l(\alpha(s_k)) &= \frac{\partial}{\partial \alpha(s_k)}\, l(\alpha(s_k)) \\
&= -\frac{1}{2} - \frac{1}{2\psi^2}\bigg\{ \left( -\frac{1}{2} e^{-\alpha(s_k)/2} Y_{\phi k} - 2\phi \right) \Omega_k \cdot (W_\alpha^{-1} \circ Y_\phi) \\
&\qquad + \left( -\frac{1}{2} e^{-\alpha(s_k)/2} Y_{\phi k} \right) \Omega_k \cdot (W_\alpha^{-1} \circ Y_\phi - 2\phi\alpha) \bigg\} - \Lambda_k \cdot \alpha.
\end{align*}
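Lemma 4, and hence the structure of the gradient above, can be checked numerically with finite differences; the test functions $f$ and $g$ below are arbitrary choices made for illustration:

```python
import numpy as np

# Verify d/d(omega_k) [f(omega)^T M g(omega)]
#      = f'(omega_k) M_k . g(omega) + g'(omega_k) M_k . f(omega)
rng = np.random.default_rng(3)
p = 4
A = rng.normal(size=(p, p))
M = A + A.T                                  # symmetric matrix
omega = rng.normal(size=p)

f, fprime = np.exp, np.exp                   # f(x) = e^x
g, gprime = np.sin, np.cos                   # g(x) = sin(x)

def quad(w):
    return f(w) @ M @ g(w)

k, h = 2, 1e-6
e_k = np.eye(p)[k]
numeric = (quad(omega + h * e_k) - quad(omega - h * e_k)) / (2 * h)
analytic = fprime(omega[k]) * (M[:, k] @ g(omega)) + gprime(omega[k]) * (M[:, k] @ f(omega))
print(abs(numeric - analytic) < 1e-5)
```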

2) Update $\beta$ and $\phi$ from a multivariate normal full conditional distribution. Let $X^* = [X \;\; \tilde{W} \circ \alpha]$, $D = (X^{*T}X^*)^{-1}X^{*T}$ and $S_D = \psi^2 D \tilde{V} \Xi \tilde{V} D^T$. We assume the prior distribution for $[\beta^T \; \phi]^T$ is multivariate normal with mean $m_0$ and covariance matrix $S_0$, $N(m_0, S_0)$; then the full conditional for $[\beta^T \; \phi]^T$ is
\[
[\beta^T \; \phi]^T \mid Y, \alpha, \tilde{\delta}, \kappa, \psi^2 \sim N(m_1, S_1),
\]
where
\begin{align*}
S_1 &= (S_0^{-1} + S_D^{-1})^{-1} = S_0(S_0 + S_D)^{-1}S_D = S_D(S_0 + S_D)^{-1}S_0, \\
m_1 &= (S_0^{-1} + S_D^{-1})^{-1}(S_0^{-1}m_0 + S_D^{-1}DY) \\
&= S_D(S_0 + S_D)^{-1}m_0 + S_0(S_0 + S_D)^{-1}DY.
\end{align*}
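The two algebraic forms of $S_1$ and $m_1$ given above can be verified numerically; the dimension and matrices below are illustrative, not taken from the data analysis:

```python
import numpy as np

# Check (S0^{-1} + SD^{-1})^{-1} = SD (S0 + SD)^{-1} S0 and the matching
# identity for the posterior mean, with random positive definite matrices.
rng = np.random.default_rng(7)
q = 3                                   # dimension of [beta^T phi]^T
A0, AD = rng.normal(size=(q, q)), rng.normal(size=(q, q))
S0 = A0 @ A0.T + q * np.eye(q)          # prior covariance (positive definite)
SD = AD @ AD.T + q * np.eye(q)          # "data" covariance (positive definite)
m0 = rng.normal(size=q)
DY = rng.normal(size=q)                 # stands in for D @ Y

S1_a = np.linalg.inv(np.linalg.inv(S0) + np.linalg.inv(SD))
S1_b = S0 @ np.linalg.solve(S0 + SD, SD)

m1_a = S1_a @ (np.linalg.solve(S0, m0) + np.linalg.solve(SD, DY))
m1_b = SD @ np.linalg.solve(S0 + SD, m0) + S0 @ np.linalg.solve(S0 + SD, DY)

print(np.allclose(S1_a, S1_b), np.allclose(m1_a, m1_b))
```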

3) Update the variance parameters $\tau^2$ and $\psi^2$ from their full conditional distributions. We consider independent conjugate inverse gamma priors for $\tau^2$ and $\psi^2$. Since the parameters $\tau^2$ and $\lambda$ have similar effects on the covariance function of the HASP process, as illustrated in Figure 3.2, it might be desirable to use a highly informative prior for at least one of the two parameters.

Suppose the prior for $\tau^2$ is $IG(a_0, b_0)$ and the prior for $\psi^2$ is $IG(c_0, d_0)$; then their full conditional distributions are given, respectively, by
\[
IG\left(a_0 + \frac{n}{2}, \; b_0 + \frac{1}{2}\alpha^T P^{-1} \alpha\right)
\]
and
\[
IG\left(c_0 + \frac{n}{2}, \; d_0 + \frac{1}{2}Y^{*T}(\tilde{V}\Xi\tilde{V})^{-1}Y^*\right),
\]
where
\[
Y^* = Y - X\beta - \phi\tilde{V}\alpha.
\]
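The inverse gamma Gibbs step for $\tau^2$ can be sketched as follows, with an illustrative exponential correlation matrix and latent draw (not the dissertation's code):

```python
import numpy as np

# tau^2 | . ~ IG(a0 + n/2, b0 + alpha^T P^{-1} alpha / 2)
rng = np.random.default_rng(11)
n, a0, b0 = 6, 2.0, 1.0
sites = rng.uniform(0, 10, size=(n, 2))
dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)
P = np.exp(-dist / 2.0)                      # exponential correlation matrix
alpha = rng.normal(size=n)

shape = a0 + 0.5 * n
rate = b0 + 0.5 * alpha @ np.linalg.solve(P, alpha)
# An IG(shape, rate) draw is the reciprocal of a Gamma(shape, scale=1/rate) draw
tau2_draw = 1.0 / rng.gamma(shape, 1.0 / rate)
print(shape == a0 + 0.5 * n, rate > b0, tau2_draw > 0)
```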

4) Update $\lambda$ and $\kappa$ from their full conditional distributions using Metropolis-Hastings steps. The full conditional distribution for $\lambda$ is
\[
\pi(\lambda \mid Y, \alpha, \phi, \psi^2, \tau^2, \beta, \tilde{\delta}) \propto |P|^{-1/2} \exp\left\{ -\frac{1}{2\tau^2}\, \alpha^T P^{-1} \alpha \right\} \pi(\lambda),
\]
while the full conditional distribution for $\kappa$ is given by
\[
\pi(\kappa \mid Y, \alpha, \phi, \psi^2, \tau^2, \beta, \tilde{\delta}) \propto |\Xi|^{-1/2} \exp\left\{ -\frac{1}{2\psi^2}\, Y^{*T} (\tilde{V}\Xi\tilde{V})^{-1} Y^* \right\} \pi(\kappa).
\]
In the equations above, $\pi(\lambda)$ and $\pi(\kappa)$ denote the probability density functions of the prior distributions of $\lambda$ and $\kappa$, respectively. In practice, we can choose an exponential distribution, a gamma distribution, or another distribution with positive support as the prior distribution. In our implementation, a truncated normal distribution is used as the proposal distribution in the Metropolis-Hastings algorithm.

5) Update $\tilde{\delta}$ from its full conditional distribution using a Metropolis-Hastings step. Depending on model assumptions, $\tilde{\delta}$ could be univariate, multivariate, or not needed in the model at all (when the original parameter $\delta$ is univariate). We use a (potentially multivariate) normal distribution with mean $\tilde{m}_0$ and covariance matrix $\tilde{S}_0$ as the prior for $\tilde{\delta}$. The full conditional distribution of $\tilde{\delta}$ is thus
\begin{align*}
\pi(\tilde{\delta} \mid Y, \alpha, \phi, \psi^2, \tau^2, \beta, \kappa)
&\propto |\tilde{V}_\delta|^{-1} \exp\left\{ -\frac{1}{2\psi^2}\, Y^{*T} (\tilde{V}\Xi\tilde{V})^{-1} Y^* \right\} \exp\left\{ -\frac{1}{2} (\tilde{\delta} - \tilde{m}_0)^T \tilde{S}_0^{-1} (\tilde{\delta} - \tilde{m}_0) \right\} \\
&\propto |\tilde{V}_\delta|^{-1} \exp\left\{ -\frac{1}{2\psi^2} \left[ \tilde{V}_\delta^{-1} V_\alpha^{-1}(Y - X\beta) - \phi\alpha \right]^T \Xi^{-1} \left[ \tilde{V}_\delta^{-1} V_\alpha^{-1}(Y - X\beta) - \phi\alpha \right] \right\} \\
&\qquad \times \exp\left\{ -\frac{1}{2} (\tilde{\delta} - \tilde{m}_0)^T \tilde{S}_0^{-1} (\tilde{\delta} - \tilde{m}_0) \right\}.
\end{align*}

3.4.3 Spatial Prediction

The objective of many spatial smoothing problems is to make predictions of the spatial process at unobserved locations. In other words, we need to calculate $E[Y(s_0) \mid Y]$ for $s_0 \in D \subset \mathbb{R}^d$, as well as understand the uncertainty associated with this prediction. The commonly used kriging estimator is essentially a linear predictor based on a weighted average of the observed values, where the weights depend on the known or estimated covariance structure of the spatial process (see, e.g., Cressie, 1993). Under the Bayesian framework, where we use MCMC to sample from the posterior distribution, the spatial prediction problem becomes drawing samples from the predictive distribution $p(Y(s_0) \mid Y)$. Based on these samples, we can estimate the mean $E[Y(s_0) \mid Y]$ and evaluate other properties of $p(Y(s_0) \mid Y)$, such as the spread, skewness and percentiles.

In the context of making spatial predictions based on the HASP model, we need to simulate from the predictive distribution

\[
p(Y(s_0) \mid Y) = \int p(Y(s_0) \mid Y, \alpha, \theta)\, p(\alpha, \theta \mid Y)\, d\alpha\, d\theta. \tag{3.4.3}
\]

We present two strategies to sample from this predictive distribution. The first approach relies on the original parameterization of the HASP model (3.3.1) and is applicable for any choice of the correlation functions. Note that $Y(s_0)$ can be computed deterministically given a sample of $\alpha(s_0)$, $\varepsilon(s_0)$ and $\theta$ from the conditional distribution $p(\alpha(s_0), \varepsilon(s_0), \theta \mid Y)$. At the same time, note that sampling from $p(\alpha(s_0), \varepsilon(s_0), \theta \mid Y)$ is straightforward given the following result:
\begin{align*}
p(\alpha(s_0), \varepsilon(s_0), \theta \mid Y)
&= \int p(\alpha(s_0), \varepsilon(s_0) \mid Y, \alpha, \theta)\, p(\alpha, \theta \mid Y)\, d\alpha \\
&= \int p(\alpha(s_0), \varepsilon(s_0) \mid \alpha, \epsilon, \theta)\, p(\alpha, \theta \mid Y)\, d\alpha, \tag{3.4.4}
\end{align*}
where $p(\alpha(s_0), \varepsilon(s_0) \mid \alpha, \epsilon, \theta)$ is a multivariate normal distribution, which can be easily obtained from the joint multivariate normal distribution $p(\alpha(s_0), \varepsilon(s_0), \alpha, \epsilon \mid \theta)$. The second equality in (3.4.4) holds because the sigma field generated by $(Y, \alpha, \theta)$ satisfies
\[
\sigma\{Y, \alpha, \theta\} = \sigma\{Y, \alpha, \epsilon, \theta\} = \sigma\{\alpha, \epsilon, \theta\}.
\]

Therefore, we propose the following strategy to draw samples from the predictive distribution of Y (s0)|Y within the MCMC algorithm for model fitting.

i) In each MCMC iteration $t$, record the posterior samples of $\alpha$ and $\theta$ as $\alpha^{[t]}$ and $\theta^{[t]}$. Calculate $\varepsilon(s_i)^{[t]}$ for $i = 1, \ldots, n$ from
\[
\varepsilon(s_i)^{[t]} = \exp\left\{ -\frac{H^T(s_i)\delta^{[t]} + \alpha(s_i)^{[t]}}{2} \right\} \left( Y(s_i) - X^T(s_i)\beta^{[t]} \right).
\]

ii) Given $\alpha^{[t]}$, $\epsilon^{[t]}$ and $\theta^{[t]}$, draw a sample $(\alpha(s_0)^{[t]}, \varepsilon(s_0)^{[t]})$ from the conditional distribution $p(\alpha(s_0), \varepsilon(s_0) \mid \alpha, \epsilon, \theta)$ in (3.4.4).

iii) Compute $Y(s_0)^{[t]}$ according to
\[
Y(s_0)^{[t]} = X^T(s_0)\beta^{[t]} + \exp\left\{ \frac{H^T(s_0)\delta^{[t]} + \alpha(s_0)^{[t]}}{2} \right\} \varepsilon(s_0)^{[t]}.
\]
Then $Y(s_0)^{[t]}$ is a sample from the predictive distribution $p(Y(s_0) \mid Y)$.

Our second approach works specifically for the two strategies detailed in Section 3.3.4. As discussed previously, strategy I is essentially a special case of strategy II when $\Xi = P$. Therefore, we use equation (3.3.17) to demonstrate the process of simulating from the predictive distribution. Let $\xi = (\xi(s_1), \ldots, \xi(s_n))^T$. Then equation (3.4.4) becomes
\begin{align*}
p(\alpha(s_0), \xi(s_0), \theta \mid Y)
&= \int p(\alpha(s_0), \xi(s_0) \mid Y, \alpha, \theta)\, p(\alpha, \theta \mid Y)\, d\alpha \\
&= \int p(\alpha(s_0), \xi(s_0) \mid \alpha, \xi, \theta)\, p(\alpha, \theta \mid Y)\, d\alpha \\
&= \int p(\alpha(s_0) \mid \alpha, \theta)\, p(\xi(s_0) \mid \xi, \theta)\, p(\alpha, \theta \mid Y)\, d\alpha. \tag{3.4.5}
\end{align*}
As before, the second equality in (3.4.5) holds because the sigma field generated by $(Y, \alpha, \theta)$ satisfies
\[
\sigma\{Y, \alpha, \theta\} = \sigma\{Y, \alpha, \xi, \theta\} = \sigma\{\alpha, \xi, \theta\},
\]
and the third equality follows from the independence of the processes $\alpha(s)$ and $\xi(s)$, $s \in D \subset \mathbb{R}^d$.

The resulting procedure for sampling from p(Y (s0)|Y ) is as follows.

i) In each MCMC iteration $t$, record the posterior samples of $\alpha$ and $\theta$ as $\alpha^{[t]}$ and $\theta^{[t]}$. Calculate $\xi(s_i)^{[t]}$ for $i = 1, \ldots, n$ from
\[
\xi(s_i)^{[t]} = (\psi^{[t]})^{-1} \left[ \exp\left\{ -\frac{H^T(s_i)\tilde{\delta}^{[t]} + \alpha(s_i)^{[t]}}{2} \right\} \left( Y(s_i) - X^T(s_i)\beta^{[t]} \right) - \phi^{[t]}\alpha(s_i)^{[t]} \right].
\]

ii) Given $\alpha^{[t]}$, $\xi^{[t]}$ and $\theta^{[t]}$, draw a sample $(\alpha(s_0)^{[t]}, \xi(s_0)^{[t]})$ from the conditional distribution $p(\alpha(s_0), \xi(s_0) \mid \alpha, \xi, \theta)$. According to (3.4.5), this can be further broken down into two steps.

a. Draw $\alpha(s_0)^{[t]}$ from the normal distribution $p(\alpha(s_0) \mid \alpha^{[t]}, \theta^{[t]})$, which is given by
\[
N\left( (P_0^{[t]})^T (P^{[t]})^{-1} \alpha^{[t]}, \; (\tau^{[t]})^2 \left[ P_{00} - (P_0^{[t]})^T (P^{[t]})^{-1} P_0^{[t]} \right] \right),
\]
where $P_{00} = 1$ and $P_0^{[t]}$ is a column vector whose $i$-th element is given by $\rho_\alpha(\|s_0 - s_i\|; \lambda^{[t]})$.

b. Draw $\xi(s_0)^{[t]}$ from the normal distribution $p(\xi(s_0) \mid \xi^{[t]}, \theta^{[t]})$, given by
\[
N\left( (\Xi_0^{[t]})^T (\Xi^{[t]})^{-1} \xi^{[t]}, \; \Xi_{00} - (\Xi_0^{[t]})^T (\Xi^{[t]})^{-1} \Xi_0^{[t]} \right),
\]
where $\Xi_{00} = 1$ and $\Xi_0^{[t]}$ is a column vector whose $i$-th element is given by the correlation function of $\xi(s)$ evaluated at $\|s_0 - s_i\|$ with parameter $\kappa^{[t]}$.

iii) Compute $Y(s_0)^{[t]}$ according to
\begin{align*}
Y(s_0)^{[t]} &= X^T(s_0)\beta^{[t]} + \phi^{[t]} \exp\left\{ \frac{H^T(s_0)\tilde{\delta}^{[t]} + \alpha(s_0)^{[t]}}{2} \right\} \alpha(s_0)^{[t]} \\
&\quad + \psi^{[t]} \exp\left\{ \frac{H^T(s_0)\tilde{\delta}^{[t]} + \alpha(s_0)^{[t]}}{2} \right\} \xi(s_0)^{[t]}.
\end{align*}
Then $Y(s_0)^{[t]}$ is a sample from the predictive distribution $p(Y(s_0) \mid Y)$.
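Step ii-a of this procedure, the conditional Gaussian draw of $\alpha(s_0)$, can be sketched as follows, assuming an exponential correlation $\rho_\alpha(h; \lambda) = \exp(-h/\lambda)$ and illustrative parameter values:

```python
import numpy as np

# Conditional (kriging-style) draw of alpha(s0) given alpha at observed sites:
# mean (P0)^T P^{-1} alpha, variance tau^2 [P00 - (P0)^T P^{-1} P0], P00 = 1.
rng = np.random.default_rng(21)
n, tau2, lam = 8, 1.5, 2.0
sites = rng.uniform(0, 10, size=(n, 2))
s0 = np.array([5.0, 5.0])

dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)
P = np.exp(-dist / lam)                                  # correlation among sites
P0 = np.exp(-np.linalg.norm(sites - s0, axis=1) / lam)   # correlation with s0

alpha = rng.multivariate_normal(np.zeros(n), tau2 * P)

w = np.linalg.solve(P, P0)            # kriging-style weights P^{-1} P0
cond_mean = w @ alpha                 # (P0)^T P^{-1} alpha  (P is symmetric)
cond_var = tau2 * (1.0 - P0 @ w)      # tau^2 [1 - (P0)^T P^{-1} P0]
alpha_s0 = rng.normal(cond_mean, np.sqrt(cond_var))

# Conditioning can only reduce the marginal variance tau^2
print(0.0 < cond_var < tau2)
```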

3.5 Using HASP to Model the Nitrogen Dioxide Pollution Data in the Contiguous United States

Nitrogen dioxide (NO2) is a major air pollutant that is closely monitored by the United States Environmental Protection Agency (EPA) as one of the six criteria air pollutants (common pollutants that EPA uses to set national air quality standards, which include ozone, particulate matter, carbon monoxide, nitrogen dioxide, sulfur dioxide and lead, according to the EPA website http://www.epa.gov/airquality/urbanair). According to the World Health Organization (WHO Press, 2006), both short-term and long-term exposures to elevated NO2 concentrations could have adverse effects on pulmonary functions. In addition, NO2 is often used as a marker for the cocktail of combustion-related pollutants, such as ultrafine particles, nitrous oxide, particulate matter and benzene.

Ambient concentrations of nitrogen dioxide are measured hourly (notionally) by EPA monitoring stations nationwide and stored in EPA's Air Quality System. Both the raw data and the aggregated data can be obtained from the EPA website (http://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/download_files.html). For our analysis, we use the daily average of the NO2 concentration values measured hourly on September 9, 2013. The data was retrieved on February 27, 2014. The measurement unit is parts per billion by volume (ppb). Only non-zero measurements not associated with any exceptional events are used in our analysis, and for duplicated records at a single site (due to more than one instrument at a site), we only retain one record.

There are 349 values observed at different sites meeting these criteria, out of which we randomly selected 25 observations as our test set to evaluate the predictive performance of our models; we then fit the HASP and competing models on the remaining 327 observations. The sites and the NO2 concentration levels are illustrated in Figure 3.8.

It can be clearly seen from Figure 3.8 that the NO2 concentration data is skewed.

Since the emission of NO2 is mostly contributed by combustion-related sources, such as road traffic and indoor combustion, we tend to observe high levels of NO2 concentrations in large urban centers. Further scrutiny of the NO2 concentration map suggests that the high NO2 concentration levels in urban centers can diminish very quickly outward, suggesting that its spatial correlation length might not be very

112 ● ●

● ● ● ● ●● ● ● ● ●

● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●●●●●● ● ● ● ● ● ● ●● ●● ● ● ●● ● ●●●● ●●● ● ● ●●● ● ● ● ● ●●● ● ● ● ●●● ● ● ● ● ●● ● ● ● ● ●● ●●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●●● ● ●● ● ●● ● ● ●●●●● ●● ●● ●● ● ● ●● ● ● ●●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●

● ● ● ● ●● ● ● ●● ●●● ●● ● ●●●● ● ● ● ● ●● ●

● ●● 80 60 60 50 50 60 40 40 40 30 30 20 20 20 10 10 0 0 0

0 5 10 15 20 25 30 −2 −1 0 1 2 3 0 1 2 3 4 5 Histogram of the original data (y) Histogram of log(y) Histogram of sqrt(y)

Figure 3.8: The ambient NO2 concentration levels measured across the contiguous United States on September 9, 2013. In the top graph, each circle represents an observation site. Blue circles represent the observation sites used in the training dataset while the red circles indicate sites randomly chosen for prediction evaluation. The size of the circles (measured by the area not the radius) is proportional to the values of the NO2 concentration levels. The three histograms show the distributions of the original data, the log transformed data and the square root transformed NO2 concentrations.

large. A common approach for modeling right-skewed positive data in practice is to transform the original data using a log or square root function. The histograms of the original NO2 concentrations as well as the log and square root transformed data are also presented in Figure 3.8. Although the histograms are not based on independently observed data, we can still use them to inform us about the possible shape of the marginal distribution of the spatial process. It can be seen that the log transformation in this case over-corrects the positive skewness, while the square root transformed data is closer to normal but remains slightly right skewed. In addition, we also tried a cube root transformation, which still renders a histogram with some positive skewness. On the other hand, since the observations are not iid, it is hard to argue that a slight skewness observed in the histogram is necessarily due to skewness in the marginal distribution.

From our observations above, it does not seem a good strategy to fit a Gaussian process or a symmetric non-Gaussian model to the original data. The HASP model, on the other hand, can be applied to both the original and the transformed data. It would be of practical interest to see how the model performances based on the original as well as the transformed data compare with each other. As a result, we decided to fit the following three models on both the original and the square root transformed data: (i) a Gaussian process (GP); (ii) a symmetric non-Gaussian process (SHP); and (iii) the proposed asymmetric non-Gaussian process (HASP).

It is sometimes desirable to include the so-called nugget effect in spatial processes (see, e.g., Cressie, 1993). The nugget effect describes the discontinuity of the covariance function at the origin, which can either be explained by independent measurement errors at each observational site, or by micro-scale variability which cannot be distinguished from the effect of measurement errors (see, e.g., Gneiting and Guttorp, 2010). According to Gneiting and Guttorp (2010), the term "nugget" comes from mining applications, where the occurrence of gold nuggets shows substantial micro-scale variability. It is important to develop a HASP process with a nugget effect in our future work, but for comparing the aforementioned three processes in this section, we focus on models without any nugget effect. We use the MCMC algorithm developed in the previous section to fit the HASP model and the SHP model (by setting $\phi = 0$). The GP model is also fit under the Bayesian framework. All the covariance functions are assumed to be exponential since the sample path shown in Figure 3.8 is not smooth (the empirical semivariogram can also be used for the choice of the covariance functions, but it is not informative in this case). In particular, for the SHP model, we fit

\[
Y(s) = X^T(s)\beta + \sigma^2 \exp\left( \frac{\alpha(s)}{2} \right) \varepsilon(s),
\]
where the covariance functions for the processes $\alpha(s)$ and $\varepsilon(s)$ are assumed to be $\exp(-\Delta s/\lambda)$ and $\exp(-\Delta s/\kappa)$, respectively (note that the parameterization here is different from the previous sections). The parameters $\lambda$ and $\kappa$ are assumed to be different and have diffuse but informative gamma priors. For the HASP model, we fit
\[
Y(s) = X^T(s)\beta + \phi\,\alpha(s) \exp\left( \frac{\alpha(s)}{2} \right) + \psi \exp\left( \frac{\alpha(s)}{2} \right) \xi(s),
\]
where the covariance functions for the processes $\alpha(s)$ and $\xi(s)$ are $\exp(-\Delta s/\lambda)$ and $\exp(-\Delta s/\kappa)$, respectively. Based on our experience, the chain does not mix very well when $\lambda$ and $\kappa$ are allowed to be estimated separately; thus, as discussed in Section 3.3.4, we require that $\lambda = \kappa$. In addition, exploratory analyses did not reveal

any significant spatial trend, so a constant mean is assumed for all three models, i.e., $X^T(s) = 1$.

We implemented the model fitting algorithms in C++11 with the use of the linear algebra library Armadillo (Sanderson, 2010) and a Mersenne-Twister pseudo-random generator of 32-bit numbers with a state size of 19937 bits from the C++11 standard library. The Armadillo library is linked with the Accelerate framework under Mac OS X 10.9.3. We carefully adjusted the tuning parameters and the initial values of the Metropolis-Hastings and Langevin steps in advance to speed up the convergence of the chains. For each model, we ran the chain for 100,000 iterations and thinned it every 20 iterations.

From plots a) and b) in Figure 3.9, we can clearly see that HASP is able to capture the skewness in both the original data and the transformed data. Since the degree of skewness increases with the co-locational correlation coefficient $\varrho$, as illustrated in Figure 3.1, smaller values of $\varrho$ for the square root transformed data are expected. Plots c) to f) are presented to illustrate an interesting phenomenon: when a Gaussian process is scaled by an exponential Gaussian process, the correlation in the composite process is attenuated compared to the original Gaussian process, as illustrated in Figures 3.2 and 3.3.

To compare the performances of the three models on both the original and the transformed data, we examine the predictive accuracy for the NO2 levels (on the original scale) at the 25 randomly selected test sites. Figures 3.10 and 3.11 show the predictive distributions for the original NO2 level at each site based on the original data and the transformed data, respectively. Note that when the models are fitted on the original data, the predictive distributions under the HASP model are strongly right


Figure 3.9: The three graphs in the first column are based on the original data while those in the second column are based on the transformed data. Plots a) and b) show the posterior density of the co-locational correlation coefficient %, which also measures the degree of skewness in the data. Plots c) and d) present the posterior density of the parameter κ, which is the correlation length in the Gaussian process as well as the respective process ε(s) and ξ(s) in the SHP and HASP model. Finally, plots e) and f) demonstrate the implied correlation functions of the fitted Gaussian and non-Gaussian processes.


Figure 3.10: Predictive distributions for the NO2 levels at the 25 sites in the test dataset, which are based on the three models fitted on the original data. The red lines represent the predictive distributions of the HASP model, while the blue and grey lines show the predictive distributions of the SHP and GP models, respectively. The dashed lines represent the actual observed values at these test sites.

118 (1) (2) (3) (4) (5)

(6) (7) (8) (9) (10)

(11) (12) (13) (14) (15)

(16) (17) (18) (19) (20)

(21) (22) (23) (24) (25)

Figure 3.11: Predictive distributions for the NO2 levels on the original scale at the 25 sites in the test dataset, which are based on the three models fitted on the transformed data. The red lines represent the predictive distributions of the HASP model, while the blue and grey lines show the predictive distributions of the SHP and GP models, respectively. The dashed lines represent the actual observed values at these test sites.

                              Original Data          Transformed Data
Criteria                     HASP    SHP     GP     HASP    SHP     GP
RMSE of Posterior Mean       4.15    4.25    4.33   4.16    4.23    4.23
MAE of Posterior Median      3.61    3.71    3.79   3.56    3.60    3.64

Table 3.1: Predictive performance of the GP, SHP and HASP models fitted on the original and the transformed data. The highlighted values are the smallest within each group.

skewed, while those under the SHP and GP models are more or less symmetric. In addition, the predictive distributions of the GP model based on the original data tend to have larger variances than those of the corresponding SHP model, potentially due to the skewness in the data and the absence of a nugget effect in the model. As for the models based on the transformed data, all three processes produce right skewed predictive distributions because of the square transformation back to the original scale. Again, the predictive distributions under the HASP model exhibit more skewness than those under the other two models.

To quantify the accuracy of the point estimates obtained from each model, we compare the root mean squared error (RMSE)
\[
\sqrt{ \frac{1}{25} \sum_{i=1}^{25} \left[ \hat{Y}(s_i)^{M_k} - Y(s_i) \right]^2 }
\]
and the mean absolute error (MAE)
\[
\frac{1}{25} \sum_{i=1}^{25} \left| \tilde{Y}(s_i)^{M_k} - Y(s_i) \right|,
\]
where $\hat{Y}(s_i)^{M_k}$ denotes the posterior mean of the predictive distribution for site $i$ under model $k$, and $\tilde{Y}(s_i)^{M_k}$ represents the posterior median of the predictive distribution for site $i$ under model $k$. The results are presented in Table 3.1. First of all,

as expected, the performance of the SHP and the GP models fitted on the original data is worse than on the transformed data, due to the different degrees of skewness in the two datasets. The performances of the HASP model fitted on the two datasets, on the other hand, are comparable in terms of the RMSE of the posterior means. Secondly, for both the original and the transformed NO2 concentration levels, the HASP model is able to achieve better performance than the SHP and GP processes, in terms of the RMSE of the posterior mean and the MAE of the posterior median. This is not surprising for the original NO2 data, but for the square-root-transformed NO2 concentrations, it seems that even though the skewness in the process is mild, explicitly modeling it can still be beneficial for model performance.
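These two criteria are straightforward to compute from posterior predictive samples; the sketch below uses synthetic samples and observations, not the NO2 data:

```python
import numpy as np

# RMSE of the posterior mean and MAE of the posterior median, computed from
# posterior predictive samples at held-out test sites.
rng = np.random.default_rng(5)
n_sites, n_draws = 25, 1000
y_obs = rng.gamma(shape=2.0, scale=3.0, size=n_sites)             # held-out values
pred_samples = y_obs[:, None] + rng.normal(0, 2.0, size=(n_sites, n_draws))

post_mean = pred_samples.mean(axis=1)        # hat{Y}(s_i)^{M_k}
post_median = np.median(pred_samples, axis=1)  # tilde{Y}(s_i)^{M_k}

rmse = np.sqrt(np.mean((post_mean - y_obs) ** 2))
mae = np.mean(np.abs(post_median - y_obs))
# With predictive samples centered on the truth, both errors should be small
print(rmse < 0.3, mae < 0.3)
```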

In summary, there exists strong skewness in the spatial data of daily average NO2 concentrations. Failure to account for this skewness results in poor predictive perfor- mance of the model. The log or square root transformation attenuates the skewness but does not eliminate it and we should still use a skewed non-Gaussian process such as HASP to model the transformed data. In addition, in the SHP and HASP models, we also find evidence of variance clustering, or spatial heteroscedasticity.

3.6 Discussion

In this chapter, we proposed a non-Gaussian spatial process that is able to capture the skewness, heavy tails and spatial heteroscedasticity in the data. Our model is motivated as a special case of a more general framework for constructing non-Gaussian processes. We studied the moments and the covariance structures of our heteroscedastic asymmetric non-Gaussian process, as well as the conditions on the covariance functions that ensure the process is well defined. We applied our model to the nitrogen dioxide concentration data and found that by capturing the skewness in the data, we can achieve better predictive performance. We found that even though the common transformation functions can reduce the skewness in the data, the "left-over" skewness after the transformation still needs to be accounted for.

Our analyses can be improved in several aspects. First, as discussed previously, we need to consider the nugget effect in our model to account for measurement errors or micro-scale variability. It would be interesting to compare the model performances again with the nugget effects explicitly modeled. In addition, our method will also be compared against the approach of De Oliveira et al. (1997) in our future work. Second, we assume that the two correlation length parameters, λ and κ, in the HASP model are the same. This is merely a modeling choice. We do not believe this should be the truth for real-world spatial data, nor do we believe that the covariance functions for the α(s) and ξ(s) (or ε(s) for the SHP model) should necessarily have the same smoothness either. It is necessary to develop a more efficient model fitting procedure so that we can relax some of the convenient assumptions we made and at the same time, facilities the application of the HASP model on much larger datasets. Third, our nonparametric model comparison criteria only takes into account the posterior mean or median at each test site. The other information contained in the predictive distribution, such as the spread, skewness or heavy tails, are not examined. Alterna- tively, we can consider the continuous ranked probability score suggested in Gneiting et al. (2007) which satisfies proper scoring rules and is robust. In addition, a model comparison criterion based on the joint predictive distribution should be preferred to

our current approach. Alternatively, we can also consider the Bayes factor. The challenge with computing the Bayes factor lies in the fact that we need to integrate out a high-dimensional vector of the latent process evaluated at the observed locations in order to obtain the marginal density. Therefore, approximation methods are usually required for the computation. For example, Palacios and Steel (2006) adopted the method of Verdinelli and Wasserman (1995), which is based on the Savage-Dickey density ratio. Other methods for approximating the Bayes factor also exist, and it would be interesting to evaluate their approximation accuracies for the HASP model under different settings. Last, Proposition 1 gives a general condition for the transformed process (3.2.1) to be mean square continuous. However, mean square continuity says nothing about the continuity or smoothness of the sample paths. Stein (1999) gave an example where the process {Z(s), s ∈ R} is mean square continuous but

P[Z(s) is continuous on R] = 0. For evaluating the smoothness of the sample path, Kent (1989) proved that if a covariance function C(h) for a d-dimensional random field is d-times continuously differentiable and satisfies

|C(h)| = O(|h|^{d+β}), as h → 0,

for some β > 0, then there exists a version of the d-dimensional random field {Y(s), s ∈ R^d} with continuous realizations. Whether we can find a general result for the mean square differentiability of the general non-Gaussian process (3.2.1), as well as the HASP model, remains an important question to be answered in our future work.

Chapter 4: Concluding Remarks and Future Work

4.1 Summary

In this dissertation, we proposed two approaches to modeling the non-Gaussian features, such as heavy tails and skewness, that are commonly observed in temporally and spatially indexed data. Our first method focuses on time series data, for which numerous non-Gaussian models have been developed in the past few decades, including the well-known stochastic volatility (SV) model. We proposed a semiparametric additive SV model that can capture the nonlinear (and potentially nonstationary) features of the latent volatility process. More importantly, we introduced shape constraints in our semiparametric model, which proved to be very important for achieving good model fit and predictive performance. Our second method focuses on modeling the non-Gaussian features of geostatistical data. Compared to the time series literature, the development of non-Gaussian models in the geostatistical literature is lacking, but it has received increasing attention. Our heteroscedastic asymmetric spatial process (HASP) is able to capture not only the heavy tails but also the skewness of non-Gaussian spatial data. When the spatial domain is reduced to the real line, our method can also be used as a curve fitting tool, which lends itself to an even wider range of applications.

Non-Gaussian stochastic processes are undoubtedly very useful in practice, and many research questions about them are yet to be answered. The remainder of this chapter is dedicated to improvements of our methods and their extensions in several directions.

4.2 Extensions of the Shape-Constrained Semiparametric Additive SV Model

4.2.1 GARCH-type models

For capturing the non-Gaussian features of financial time series, an alternative to the SV model is the family of ARCH-type models. The forms of the most commonly used parametric ARCH-type models are given in Section 1.2. Just like the SV model, the very commonly used GARCH model is a stationary white noise process with heavier tails than a Gaussian process, while the two popular alternatives to GARCH, the EGARCH and GJR-GARCH models, are able to capture the asymmetric impact of positive and negative returns on volatilities (for more details, see, e.g., Fan and Yao, 1998).

For the nonparametric estimation of (G)ARCH-type models, a general formulation is

r_t = σ_t ε_t, t ∈ Z;
σ_t = m(r_{t−1}, …, r_{t−d}),

where m is an unknown function taking values in R^+ and the number of lagged return terms, d ∈ Z^+, needs to be pre-specified by researchers. Models of this type have been examined in the literature as early as Pagan and Ullah (1988). However, the positivity of the function m(·) is not easily guaranteed, and estimating such a high-dimensional nonparametric surface suffers from the so-called “curse of dimensionality”. As a result, additional constraints on the function m are usually needed; among them, additivity constraints, such as those considered in the generalized additive model (Hastie and Tibshirani, 1986), are among the most popular. With the additivity constraint, we can consider a nonparametric ARCH-type model with

σ_t^2 = G(ω + Σ_{j=1}^{d} m_j(r_{t−j})), t ∈ Z, (4.2.1)

where the m_j(·)’s are unknown functions and G is a transformation function. To ensure the positivity of σ_t, G should be a function defined on R and taking values in R^+, such as the exponential function corresponding to the logarithmic link used in Yang et al. (1999) (also see Kim and Linton, 2004; Yang, 2006).
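To make the role of the link G concrete, here is a minimal Python sketch of the additive volatility map in (4.2.1). The component functions m_j and all parameter values are hypothetical illustrations, not estimates from this chapter.

```python
import numpy as np

def additive_arch_sigma2(returns, t, omega, m_list, G=np.exp):
    """Volatility from the additive ARCH-type form (4.2.1):
    sigma_t^2 = G(omega + sum_j m_j(r_{t-j})), with d = len(m_list)."""
    d = len(m_list)
    s = omega + sum(m_list[j](returns[t - 1 - j]) for j in range(d))
    return G(s)

# Illustrative choice: d = 2, quadratic components (hypothetical).
m_list = [lambda r: 0.1 * r**2, lambda r: 0.05 * r**2]
r = np.array([0.0, 1.0, -2.0, 0.5])
sigma2 = additive_arch_sigma2(r, t=3, omega=-1.0, m_list=m_list)
# With G = exp, sigma_t^2 is positive regardless of omega and the m_j values.
```

The point of the exponential link is visible here: the additive predictor inside G is negative (−0.55), yet the returned variance is still strictly positive.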

Because (4.2.1) specifies the volatility as a deterministic function of past returns, we do not need to estimate all the volatilities, σ_t, t = 1, …, T, in the model fitting process, as we did for the SV models. However, further restrictions on the forms of the m_j(·) are also warranted for achieving better model fit and predictive performance. Linton and Mammen (2005) consider the following volatility process:

σ_t^2(θ, m) = Σ_{j=1}^{∞} ψ_j(θ) m(r_{t−j}), t ∈ Z, (4.2.2)

where ψ_j(·) is a known function, e.g., ψ_j(θ) = θ^{j−1}, while m(·) is unknown but assumed to be smooth. This model can be viewed as a generalization of the ARCH(∞) model. In the case where ψ_j(θ) = θ^{j−1} for θ ∈ (0, 1), the model (4.2.2) can be simplified to

σ_t^2 = θ σ_{t−1}^2 + m(r_{t−1}), t ∈ Z, (4.2.3)

which is exactly the partially nonparametric (PNP) model proposed by Engle and Ng (1993). Model (4.2.3) also includes the GARCH and GJR-GARCH models as special cases: if we let m(r) = ω + α r^2, we get the GARCH(1,1) model, and if m(r) = ω + α_1 r^2 + α_2 r^2 I(r < 0), we get the GJR-GARCH model.
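The PNP recursion (4.2.3) and its two named special cases can be sketched in a few lines of Python; the parameter values below are illustrative, not fitted.

```python
import numpy as np

def pnp_volatility(returns, theta, m, sigma2_0=1.0):
    """Partially nonparametric (PNP) recursion (4.2.3):
    sigma_t^2 = theta * sigma_{t-1}^2 + m(r_{t-1})."""
    sigma2 = np.empty(len(returns))
    sigma2[0] = sigma2_0
    for t in range(1, len(returns)):
        sigma2[t] = theta * sigma2[t - 1] + m(returns[t - 1])
    return sigma2

# Special cases discussed in the text (omega and alpha values are illustrative):
garch_m = lambda r: 0.05 + 0.1 * r**2                          # -> GARCH(1,1)
gjr_m = lambda r: 0.05 + 0.1 * r**2 + 0.08 * r**2 * (r < 0)    # -> GJR-GARCH

r = np.array([0.5, -0.5, 1.0])
s_garch = pnp_volatility(r, theta=0.85, m=garch_m)
s_gjr = pnp_volatility(r, theta=0.85, m=gjr_m)
# The GJR choice of m adds an extra penalty after the negative return r_1.
```

The asymmetry of GJR-GARCH shows up directly: after the negative return at t = 1, the GJR volatility at t = 2 exceeds the plain GARCH one, while the two paths agree wherever lagged returns are positive.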

For semiparametric GARCH models, Audrino and Bühlmann (2009) propose modeling the volatility function on the log scale, centering on a parametric GARCH(1,1) model and using multivariate B-splines to represent the residual part. In particular,

r_t = σ_t ε_t, t = 1, …, T;
log(σ_t^2) = g(r_{t−1}, σ_{t−1}^2), t = 2, …, T,

where ε_t is an iid process with zero mean and unit variance. The specific representation of log(σ_t^2) using multivariate B-splines is

log(σ_t^2(θ)) = g_{θ_0}(r_{t−1}, σ_{t−1}^2(θ)) + Σ_{j_1=1}^{k_1} Σ_{j_2=1}^{k_2} β_{j_1,j_2} B_{j_1,j_2}(r_{t−1}, σ_{t−1}^2(θ))
             = g_{θ_0}(r_{t−1}, σ_{t−1}^2(θ)) + Σ_{j_1=1}^{k_1} Σ_{j_2=1}^{k_2} β_{j_1,j_2} B_{j_1}(r_{t−1}) B_{j_2}(σ_{t−1}^2(θ)), (4.2.4)

where g_{θ_0} is set to be the logarithm of a parametric GARCH(1,1) process, and the basis function B_{j_1}(r_{t−1}) is assumed to be of degree 3 and B_{j_2}(σ_{t−1}^2(θ)) of degree 2.

Assuming the error terms ε_t, t = 1, …, T, to be normal, we can write down the log-likelihood function and estimate the parameters {β_{j_1,j_2} : j_1 = 1, …, k_1; j_2 = 1, …, k_2} by maximizing the log-likelihood.

However, due to the complexity of this semiparametric model, direct optimization of the likelihood function is undesirable. Audrino and Bühlmann (2009) proposed a gradient boosting algorithm, in particular the functional gradient descent algorithm of Friedman (2001). The algorithm considers the loss function

λ(r_t, g) = (1/2) [ log(2π) + g + r_t^2 / exp(g) ],

where g represents the log squared volatility log(σ_t^2), which is also a function of lagged returns and lagged squared volatilities, as given in (4.2.4). If we sum the values of the loss function over the data sample, we obtain the negative log-likelihood under the assumption of Gaussian errors. Conceptually, the algorithm works by iterating the following steps after initialization:

i) At step m, evaluate the negative gradient of the loss function over the data {r_t : t = 2, …, T}, given the most recent estimates of the squared volatilities, {ĝ_{m−1}(t) : t = 2, …, T};

ii) Find a bivariate B-spline function B_{d_1}(r_{t−1}) B_{d_2}(σ_{t−1}^2(θ)), for some 1 ≤ d_1 ≤ k_1 and 1 ≤ d_2 ≤ k_2, that achieves the best least squares approximation of the direction of the negative gradient obtained in step i);

iii) In the direction represented by the bivariate B-spline B_{d_1}(r_{t−1}) B_{d_2}(σ_{t−1}^2(θ)), find the optimal step size β̂_m such that the updated squared volatilities, {ĝ_m(t) : t = 2, …, T}, where ĝ_m(t) = ĝ_{m−1}(t) + β̂_m B_{d_1}(r_{t−1}) B_{d_2}(σ_{t−1}^2(θ)), minimize the loss function evaluated over the data {r_t : t = 2, …, T}.
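Step i) requires the pointwise gradient of the loss with respect to g, which the text leaves implicit; differentiating λ gives

```latex
-\frac{\partial \lambda(r_t, g)}{\partial g}
  = -\frac{1}{2}\left(1 - \frac{r_t^2}{\exp(g)}\right)
  = \frac{1}{2}\left(\frac{r_t^2}{\exp(g)} - 1\right),
```

so the negative gradient at time t is positive exactly when the squared return r_t^2 exceeds the current variance estimate exp(ĝ_{m−1}(t)), pushing the fitted log volatility upward there.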

The stopping rule is based on sample splitting: the optimal model structure is estimated on 70% of the data and then validated on the remaining 30%. The number of iterations in the algorithm above is set to minimize the empirical risk of the validation sample. Note that this gradient boosting algorithm approximates the function g by greedily adding a weighted B-spline basis function to the current estimate at each step. Since the dictionary of B-spline basis functions is finite in this model, the eventual estimate of g can be expressed in the form of the basis expansion (4.2.4).
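As a concrete illustration of the loop above, the following Python sketch runs functional gradient descent over a small, hypothetical dictionary of basis vectors (simple "regime" indicators rather than the bivariate B-splines of the actual method, and without refitting the volatility recursion), under Gaussian errors. Everything here is a toy stand-in for the real algorithm.

```python
import numpy as np

def loss(r, g):
    """Summed Gaussian loss: sum_t 0.5 * (log 2*pi + g_t + r_t^2 / exp(g_t))."""
    return np.sum(0.5 * (np.log(2 * np.pi) + g + r**2 / np.exp(g)))

def fgd_boost(r, basis, g0, n_iter=50, steps=np.linspace(-1.0, 1.0, 81)):
    """Functional gradient descent over a finite dictionary `basis`
    (an (n_basis, T) array of candidate directions).  Each iteration:
    i) negative gradient, ii) best least-squares basis, iii) line search."""
    g = g0.copy()
    for _ in range(n_iter):
        u = 0.5 * (r**2 / np.exp(g) - 1.0)           # step i): -d loss / d g_t
        scores = (basis @ u) ** 2 / np.sum(basis**2, axis=1)
        k = np.argmax(scores)                        # step ii)
        cand = [loss(r, g + b * basis[k]) for b in steps]
        g = g + steps[np.argmin(cand)] * basis[k]    # step iii): grid search
    return g

rng = np.random.default_rng(0)
T = 200
true_g = np.where(np.arange(T) < T // 2, -1.0, 0.5)  # two-regime log-variance
r = rng.normal(0.0, np.exp(true_g / 2.0))
basis = np.stack([np.ones(T),
                  (np.arange(T) < T // 2).astype(float),
                  (np.arange(T) >= T // 2).astype(float)])
g_hat = fgd_boost(r, basis, g0=np.zeros(T))
# The fitted log-variances should move toward the two true regime levels.
```

Because the step-size grid contains zero, the loss never increases across iterations, mirroring the greedy, dictionary-limited nature of the boosting estimate described above.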

The weights of the basis functions obtained from the boosting algorithm, however, differ from those obtained from a maximum likelihood approach. The coordinate-wise (one basis function at a time) gradient descent approach is related to penalized maximum likelihood methods such as the lasso method (Tibshirani, 1996) for certain loss functions (see Efron et al., 2004; Zhao and Yu, 2007), although Audrino and Bühlmann (2009) did not provide the specific regularizations associated with the loss function used in the algorithm above.

One thing is clear from the discussion above: since semiparametric volatility models are generally very flexible, achieving good model fit and predictive performance requires proper regularization. Such regularization can be imposed either through explicit constraints on the functional form or through regularized estimation procedures. Given this premise, the idea of imposing shape constraints in semiparametric SV models can be applied to ARCH-type models as well. For example, in (4.2.1), we can assume that each function m_j, j = 1, …, d, is monotonically decreasing on the negative real line and monotonically increasing on the positive real line. For fitting such a shape-constrained semiparametric (G)ARCH-type model, the basis functions of Neelon and Dunson (2004) might not be the best choice due to their non-orthogonality. Instead, a number of other spline- or kernel-based isotonic regression techniques could be exploited for the model fitting.

4.2.2 Other non-Gaussian nonlinear state-space models

The challenge of fitting semiparametric nonlinear state space models is not confined to SV models. Our methodology of imposing shape constraints on the different functional components of the additive state equation can be applied to other nonlinear state space models as well, in order to efficiently capture nonlinear features of the state equation. A prominent example is a generalization of the generalized linear model:

y_t ∼ F(g(α_t); θ), t ∈ Z;
α_t = µ + f(α_{t−1}) + ε_t, ε_t ∼ iid N(0, σ^2),

where F is a known non-Gaussian distribution, such as a Poisson or Bernoulli distribution, and g is a parametric link function. Based on domain-specific knowledge, we could potentially assume that the function f is monotone (monotonically increasing or decreasing) and use that to improve the model performance.
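As a sketch of this generalization, the following simulates one such model with a Poisson observation equation and a log link; the monotone tanh form for f is our own illustrative choice, not one prescribed by the text.

```python
import numpy as np

def simulate_poisson_ssm(T, mu, f, sigma, seed=0):
    """Simulate the non-Gaussian state-space model sketched above with a
    Poisson observation equation and log link:
        y_t ~ Poisson(exp(alpha_t)),
        alpha_t = mu + f(alpha_{t-1}) + eps_t,  eps_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    alpha = np.empty(T)
    y = np.empty(T, dtype=int)
    alpha[0] = mu
    for t in range(T):
        if t > 0:
            alpha[t] = mu + f(alpha[t - 1]) + rng.normal(0.0, sigma)
        y[t] = rng.poisson(np.exp(alpha[t]))
    return alpha, y

# Monotonically increasing, bounded autoregression function (hypothetical).
f = lambda a: 0.9 * np.tanh(a)
alpha, y = simulate_poisson_ssm(T=300, mu=0.2, f=f, sigma=0.3)
```

A bounded monotone f like this keeps the latent process stable while preserving the shape information (higher past intensity, higher current intensity) that the shape-constrained fitting strategy is meant to exploit.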

Another group of non-Gaussian models in the financial time series literature is the so-called duration models, which are related to survival models in biostatistics and also share many features with volatility models. Duration models were proposed to describe the behavior of the time intervals (the durations) between high-frequency trading events of a given stock. The behavior of the durations contains useful information about intraday market activity, as longer durations indicate a lack of trading activity, which in turn signifies a period of no new information (Tsay, 2005). Let {t_i}_{i∈Z^+} denote the time of the i-th trading event measured with respect to an origin t_0, and define the duration series d_i as

d_i = t_i − t_{i−1}, i = 1, …, N.

In practice, it is necessary to adjust the durations d_i for the deterministic patterns of intraday transactions, but for simplicity we assume that the diurnal patterns have been accounted for and that d_i > 0 for all i = 1, …, N. The first duration model, the widely known autoregressive conditional duration (ACD) model given below, was introduced by Engle and Russell (1998) and is based on the GARCH model:

d_i = ∆_i ε_i, i ∈ Z;
∆_i = ω + Σ_{k=1}^{p} α_k d_{i−k} + Σ_{l=1}^{q} β_l ∆_{i−l},

where ε_i, i = 1, …, N, are assumed to be iid according to some distribution with positive support. Just as in survival models, the commonly used distributions are the exponential and the Weibull. The SV version of the duration model

is the so-called stochastic conditional duration (SCD) model proposed by Bauwens and Veredas (2004). As with the unleveraged SV model, the SCD model is also a non-Gaussian nonlinear state space model, given by

d_i = e^{ζ_i} ε_i, ε_i ∼ iid Weibull(α, β), i ∈ Z;
ζ_i = µ + φ ζ_{i−1} + η_i, η_i ∼ iid N(0, τ^2),

where Weibull(α, β) denotes a Weibull distribution with shape parameter α and scale parameter β, and the processes {ε_i}_{i∈Z} and {η_i}_{i∈Z} are mutually independent. When α = 1, the Weibull distribution reduces to an exponential distribution.

Based on our argument in Chapter 2, we can consider the following semiparametric SCD model:

d_i = e^{ζ_i} ε_i, ε_i ∼ iid Weibull(α, β), i ∈ Z;
ζ_i = µ + f(ζ_{i−1}) + η_i, η_i ∼ iid N(0, τ^2), (4.2.5)

where the processes {ε_i}_{i∈Z} and {η_i}_{i∈Z} are mutually independent and the autoregressive function f is monotone and satisfies f(0) = 0. The model fitting and model comparison procedures developed in Chapter 2 can be easily adapted to fit (4.2.5).
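A quick way to see the behavior implied by (4.2.5) is to simulate from it; the monotone f below, with f(0) = 0, and all parameter values are illustrative choices for the sketch.

```python
import numpy as np

def simulate_scd(N, mu, f, tau, a, b, seed=0):
    """Simulate the semiparametric SCD model (4.2.5):
        d_i = exp(zeta_i) * eps_i,  eps_i ~ Weibull(shape a, scale b),
        zeta_i = mu + f(zeta_{i-1}) + eta_i,  eta_i ~ N(0, tau^2)."""
    rng = np.random.default_rng(seed)
    zeta = np.empty(N)
    zeta[0] = mu
    for i in range(1, N):
        zeta[i] = mu + f(zeta[i - 1]) + rng.normal(0.0, tau)
    eps = b * rng.weibull(a, size=N)   # numpy's weibull draws have scale 1
    return np.exp(zeta) * eps

f = lambda z: 0.7 * np.tanh(z)         # monotone, with f(0) = 0
d = simulate_scd(N=1000, mu=0.0, f=f, tau=0.2, a=1.5, b=1.0)
```

The simulated durations are positive by construction, and the monotone state equation induces the kind of persistent clustering of long and short durations that the model is designed to capture.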

4.3 Extensions of the Heteroscedastic Asymmetric Spatial Process

4.3.1 Into the spatio-temporal domain

Conceptually, the heteroscedastic asymmetric spatial process (HASP) developed in Chapter 3 can be easily generalized to the spatio-temporal domain, but appreciating the unique features of a space-time process is also crucial for understanding these types of models. In this section, we first briefly review the special features of a space-time process compared to a purely spatial process before introducing a generalized HASP model.

Consider a space-time process Y(s, t), where (s, t) ∈ D ⊂ R^d × R. Although, mathematically, a space-time process can be viewed as a “spatial” process defined on a bounded subset of R^{d+1}, this view is insufficient, since the progression of a stochastic process in time intrinsically differs from its separation in space. In particular, temporal progression usually has an intrinsic direction, while this need not be true for spatial separation. In addition, a valid covariance function defined on R^{d+1} does not necessarily capture the interaction between temporal and spatial dependence. As a result, space-time processes warrant special treatment.

A space-time process {Y(s, t), (s, t) ∈ D ⊂ R^d × R} is said to have a fully symmetric covariance if

cov{Y(s, t), Y(s′, t′)} = cov{Y(s′, t), Y(s, t′)}, (s, t), (s′, t′) ∈ D ⊂ R^d × R.

A related notion is the separability of the covariance function. A space-time covariance function is said to be separable if there exist a purely spatial covariance function C_S and a purely temporal covariance function C_T such that

cov{Y(s, t), Y(s′, t′)} = C_S(s, s′) C_T(t, t′), (s, t), (s′, t′) ∈ D ⊂ R^d × R.

If a space-time process Y(s, t), (s, t) ∈ D ⊂ R^d × R, can be expressed as

Y(s, t) = Z_S(s) Z_T(t),

where the purely spatial process Z_S(s) and the purely temporal process Z_T(t) are independent, then the covariance function of Y(s, t) is separable. The covariance function cov{Y(s, t), Y(s′, t′)} is said to be spatially stationary if it depends on the sites s and s′ only through the spatial separation vector ∆s = s − s′. Similarly, the covariance function is temporally stationary if it depends on the times t and t′ only through the temporal separation ∆t = t − t′. We say that a spatio-temporal process is stationary if its covariance function is both spatially and temporally stationary, and write

cov{Y(s, t), Y(s′, t′)} = C(s − s′, t − t′) = C(∆s, ∆t).
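To see why separability is computationally convenient, note that on a grid of sites and times the covariance matrix of a separable process factors as a Kronecker product of a purely temporal and a purely spatial matrix. A small sketch, with hypothetical exponential and AR(1)-type component covariances:

```python
import numpy as np

def separable_cov(sites, times, CS, CT):
    """Covariance matrix of a separable space-time process observed on the
    grid {sites} x {times}: the matrix factors as the Kronecker product
    C = C_T (x) C_S, which is what makes separability cheap to work with."""
    S = np.array([[CS(np.linalg.norm(s - sp)) for sp in sites] for s in sites])
    T = np.array([[CT(abs(t - tp)) for tp in times] for t in times])
    return np.kron(T, S)

CS = lambda h: np.exp(-h)      # exponential spatial covariance (illustrative)
CT = lambda u: 0.9 ** u        # AR(1)-type temporal covariance (illustrative)
sites = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 2.0])]
times = [0, 1, 2]
C = separable_cov(sites, times, CS, CT)
# C is 9 x 9, symmetric, and non-negative definite, since both factors are.
```

Each entry is exactly C_S(h) C_T(u) for the corresponding pair, so cross space-time dependence in this construction is always the product of the purely spatial and purely temporal terms.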

See the review articles of Kyriakidis and Journel (1999), Gneiting et al. (2006) and Gneiting and Guttorp (2010) for more detailed discussions. The stationarity, separability and full symmetry of a space-time covariance function are all convenient assumptions that simplify the modeling of space-time data. Furthermore, covariance functions are commonly assumed to be spatially isotropic as well. In reality, these assumptions are often insufficient for describing the true physical phenomena of atmospheric, geophysical and environmental processes. For example, consider the values of the space-time process Y(s, t), (s, t) ∈ D ⊂ R^d × R, observed at

(s_i, t_i) and (s_i′, t_i′) for i = 1, 2. Let ∆s_i = s_i − s_i′ and ∆t_i = t_i − t_i′, for i = 1, 2. Under the assumption of a stationary and separable covariance model, the ratio of the correlations

cov{Y(s_1, t_1), Y(s_1′, t_1′)} / cov{Y(s_2, t_2), Y(s_2′, t_2′)} = C_S(∆s_1) C_T(∆t_1) / [C_S(∆s_2) C_T(∆t_2)] (4.3.1)

is always proportional to C_S(∆s_1)/C_S(∆s_2) for fixed time points and proportional to C_T(∆t_1)/C_T(∆t_2) for fixed spatial locations (Cressie and Huang, 1999). Furthermore, if ∆t_1 = ∆t_2, then the correlation ratio (4.3.1) is constant in the magnitudes of the temporal separations, and equals the correlation ratio obtained from the purely spatial process Y(s, t_0), s ∈ D ⊂ R^d, for any t_0 ∈ R. Similarly, the actual sizes of the spatial separations ∆s_1 and ∆s_2 have no bearing on the relative strength of the correlations defined in (4.3.1) when ∆s_1 = ∆s_2; as an extreme case, the univariate time series observed at a fixed spatial location has the same temporal correlation function regardless of the location.

To allow for a space-time interaction in the covariance function, i.e., to relax the proportional correlation ratio property implied by (4.3.1), researchers have proposed stationary, but non-separable, space-time covariance functions (see, e.g., Cressie and Huang, 1999; Gneiting, 2002b; Stein, 2005; Rodrigues and Diggle, 2010; Fonseca and Steel, 2011a). Cressie and Huang (1999) showed that a continuous, bounded, integrable and symmetric function on R^d × R is a space-time covariance function if and only if the function

ψ(∆t | ω) = ∫ e^{−i ∆s^T ω} C(∆s, ∆t) d∆s

is positive definite for almost all ω ∈ R^d. This result reduces Bochner's criterion for positive definite functions on R^d × R to a criterion on R. Based on this result of Cressie and Huang (1999), Gneiting (2002b) constructed non-separable, stationary space-time covariance functions of the form

C(∆s, ∆t) = [σ^2 / ψ(|∆t|^2)^{d/2}] ϕ( ‖∆s‖^2 / ψ(|∆t|^2) ), (∆s, ∆t) ∈ R^d × R,

which is valid provided ϕ(x), x ≥ 0, is a completely monotone function and ψ(x), x ≥ 0, is a positive function with a completely monotone derivative. (A continuous function defined on the positive real line is completely monotone if it possesses derivatives of all orders, among which the odd-order derivatives are non-positive while the even-order derivatives are non-negative.) An example of a valid non-separable stationary space-time covariance function that can be constructed based on either Cressie and Huang (1999) or Gneiting (2002b) is

C(h, u) = [σ^2 / (a u^{2α} + 1)^{δ+βd/2}] exp( −c h^{2γ} / (a u^{2α} + 1)^{βγ} ),

where h = ‖∆s‖ and u = |∆t| are the spatial and temporal distances, respectively. A second strategy for constructing non-separable space-time covariance functions is the convolution-based method of Rodrigues and Diggle (2010). A third strategy, discussed by Ma (2002), Ma (2003), Fonseca and Steel (2011a) and others, is based on the mixture of a purely spatial and a purely temporal process:

Y(s, t) = Z_S(s; Φ) Z_T(t; Ψ), (s, t) ∈ D ⊂ R^d × R,

where (Φ, Ψ) is a bivariate non-negative random vector with cumulative distribution function µ(φ, ψ), independent of the purely spatial process Z_S(s; φ) with covariance C_S(h; φ) and of the purely temporal process Z_T(t; ψ) with covariance C_T(u; ψ).

The covariance function of the space-time process is thus

C(h, u) = ∫ C_S(h; φ) C_T(u; ψ) dµ(φ, ψ), h, u ∈ R.

See, e.g., Fonseca and Steel (2011a) for a discussion of specific choices of the distributions of Φ and Ψ and of the covariance functions C_S(h; φ) and C_T(u; ψ).
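The Gneiting-type example displayed earlier in this section is straightforward to implement. The sketch below (with illustrative parameter values) also checks numerically that setting β = 0 removes the space-time interaction, so that the covariance factors into a purely spatial and a purely temporal part.

```python
import numpy as np

def gneiting_cov(h, u, sigma2=1.0, a=1.0, alpha=1.0, beta=1.0,
                 gamma=0.5, c=1.0, delta=1.0, d=2):
    """Non-separable space-time covariance of the Gneiting (2002b) form:
    C(h, u) = sigma^2 / (a u^{2 alpha} + 1)^{delta + beta d / 2}
              * exp(-c h^{2 gamma} / (a u^{2 alpha} + 1)^{beta gamma}).
    beta controls the strength of the space-time interaction."""
    den = a * u ** (2.0 * alpha) + 1.0
    return (sigma2 / den ** (delta + beta * d / 2.0)
            * np.exp(-c * h ** (2.0 * gamma) / den ** (beta * gamma)))

# beta = 0 (with sigma2 = 1): the covariance factors into C(h, 0) * C(0, u).
C_sep = gneiting_cov(2.0, 3.0, beta=0.0)
C_fact = gneiting_cov(2.0, 0.0, beta=0.0) * gneiting_cov(0.0, 3.0, beta=0.0)
# beta = 1: genuine space-time interaction, and no such factorization.
C_ns = gneiting_cov(2.0, 3.0, beta=1.0)
C_fact_ns = gneiting_cov(2.0, 0.0, beta=1.0) * gneiting_cov(0.0, 3.0, beta=1.0)
```

This makes explicit the role of β as an interaction parameter: the factorization test succeeds at β = 0 and fails once β > 0.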

To model the spatial heteroscedasticity of a space-time process, Damian et al. (2003) proposed a non-stationary non-Gaussian space-time model in which the measurements over time are assumed to be independent and the variation in the variances at different spatial locations is modeled through a latent log-Gaussian variance process, in the same fashion as the stationary GLG model of Palacios and Steel (2006). The stationary non-Gaussian SHP model discussed in Chapter 3 is originally formulated as a general space-time process as well, although Huang et al. (2011) did not provide any example where a general space-time covariance function is used; the only space-time dataset studied in their article assumes that the observations in time are independent. A truly space-time non-Gaussian model is proposed by Fonseca and Steel (2011b), where the non-Gaussianity in the spatial and temporal domains is modeled separately through a scale mixing process as in Palacios and Steel (2006), and the space-time interaction in the covariance structure is induced by the mixture model of Fonseca and Steel (2011a). Let Z_S(s; φ), s ∈ D ⊂ R^d, be a purely spatial process with mean zero and covariance function C_S(h; φ), where h denotes the spatial distance. Similarly, let Z_T(t; ψ), t ∈ R, be a purely temporal process with mean zero and covariance function C_T(u; ψ), where u denotes the temporal distance. Finally, let (Φ, Ψ) be a bivariate non-negative random vector that is independent of Z_S(s; φ) and Z_T(t; ψ). Then the non-Gaussian space-time model of Fonseca and Steel (2011b)

(without the nugget effect in the spatial process) is given by

Y(s, t; Φ, Ψ) = Z̃_S(s; Φ) Z̃_T(t; Ψ), (s, t) ∈ D ⊂ R^d × R,

where

Z̃_S(s; φ) = Z_S(s; φ) / λ_S(s)^{1/2}, or equivalently, Z̃_S(s; φ) = e^{−α_S(s)/2} Z_S(s; φ),

and

Z̃_T(t; ψ) = Z_T(t; ψ) / λ_T(t)^{1/2}, or equivalently, Z̃_T(t; ψ) = e^{−α_T(t)/2} Z_T(t; ψ).

The processes α_S(s) and α_T(t) are assumed to be mean-square continuous and are independent of the other stochastic processes and random variables in the model.

The covariance function of Y(s, t; Φ, Ψ) is thus given by

E_{Φ,Ψ}[ C_S(h; Φ) C_T(u; Ψ) ], h, u ∈ R.

It is easy to see that this model is able to capture the heavy tails of a non-Gaussian space-time process but does not allow for skewness.

As is shown in Chapter 3, capturing the skewness in addition to the heavy tails is important for building an appropriate model for space-time data. To obtain a heteroscedastic asymmetric space-time process with a separable space-time covariance function, we can simply specify a product of a purely spatial process and a purely temporal process. For the non-separable case, there are different ways to generalize our HASP model, depending on which strategy for constructing a non-separable space-time covariance model is adopted. We present the most straightforward approach below.

Under the framework (3.2.1), a latent multivariate space-time process is needed for obtaining a heteroscedastic asymmetric space-time process. However, specifying the covariance functions, which is already difficult for a multivariate spatial process, becomes even more complicated for multivariate space-time processes. In addition to the space-time interaction in the covariance function of each univariate component space-time process, we need to deliberate on the choice of cross-covariance functions between the different component processes measured at different locations and times. In addition, we need to ensure that the different covariance and cross-covariance functions are constrained so that the matrix-valued covariance function of the multivariate space-time process U(s, t),

cov{U(s, t), U^T(s′, t′)} = C(h, u) = [ C_{11}(h, u) ··· C_{1p}(h, u) ]
                                      [      ⋮       ⋱       ⋮      ]
                                      [ C_{p1}(h, u) ··· C_{pp}(h, u) ],

where h = ‖s − s′‖ ∈ R and u = |t − t′| ∈ R, is always valid. In other words, for any finite number of observations of the multivariate space-time process, the resulting covariance matrix must be non-negative definite. To the best of our knowledge, there has been no attempt to establish general conditions for the covariance function of a general space-time process to be well defined. Consider

Y(s, t) = e^{α(s,t)/2} ε(s, t), (s, t) ∈ D ⊂ R^d × R, (4.3.2)

where the latent bivariate spatio-temporal process [α(s, t), ε(s, t)]^T has the stationary matrix-valued covariance function

C(h, u) = [ τ^2 ρ_α(h, u)   τϱ ρ_c(h, u) ]
          [ τϱ ρ_c(h, u)    ρ_ε(h, u)    ]    (4.3.3)

with h = ‖s − s′‖ and u = |t − t′|. In addition, ρ_α(h, u) and ρ_ε(h, u) are the stationary, symmetric and non-separable correlation functions for the latent univariate processes α(s, t) and ε(s, t), respectively. The function ρ_c(h, u) is the cross-correlation function between α(s, t) and ε(s, t), i.e.,

ρ_c(h, u) = cov{α(s, t), ε(s′, t′)} / τ.

It is easy to see that the moments and the covariance function of the non-Gaussian process Y(s, t) derived in Chapter 3 are still applicable here. However, the conditions on the correlation functions ρ_α, ρ_c and ρ_ε for the model to be well defined, given in Theorems 1, 3 and 4 in Chapter 3, do not necessarily hold, since non-separable space-time covariance functions are not necessarily nested within the class of Matérn functions.

On the other hand, we can always guarantee that such a covariance function is valid by adopting the LMC approach, i.e.,

ε(s, t) = ϱ α(s, t)/τ + √(1 − ϱ^2) ξ(s, t),

which implies that

corr{α(s, t), α(s′, t′)} = corr{α(s, t), ε(s′, t′)}, i.e., ρ_c(h, u) = ρ_α(h, u).

The heteroscedastic asymmetric spatio-temporal process can then be expressed as

Y(s, t) = X^T(s, t) β + (ϱ/τ) exp{ (H^T(s, t) δ + α(s, t)) / 2 } α(s, t)
        + √(1 − ϱ^2) exp{ (H^T(s, t) δ + α(s, t)) / 2 } ξ(s, t).
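To make the construction concrete, here is a simulation sketch on a one-dimensional spatial grid. For simplicity we drop the covariates (X and H) and, purely as our own illustrative choice, use separable exponential correlations for the latent α(s, t) and ξ(s, t) processes; the construction above also allows non-separable ones.

```python
import numpy as np

def simulate_hasp_st(sites, times, tau, rho, lam=1.0, kappa=1.0, seed=0):
    """Simulate the heteroscedastic asymmetric space-time process above
    (no covariates) on a small grid:
        Y(s, t) = exp(alpha / 2) * (rho * alpha / tau + sqrt(1 - rho^2) * xi),
    where alpha ~ GP(0, tau^2 R_alpha) and xi ~ GP(0, R_xi) are independent."""
    rng = np.random.default_rng(seed)
    H = np.abs(np.subtract.outer(sites, sites))      # spatial distances
    U = np.abs(np.subtract.outer(times, times))      # temporal distances
    R_alpha = np.kron(np.exp(-U / lam), np.exp(-H / lam))
    R_xi = np.kron(np.exp(-U / kappa), np.exp(-H / kappa))
    n = len(sites) * len(times)
    jitter = 1e-10 * np.eye(n)
    alpha = tau * np.linalg.cholesky(R_alpha + jitter) @ rng.normal(size=n)
    xi = np.linalg.cholesky(R_xi + jitter) @ rng.normal(size=n)
    return np.exp(alpha / 2.0) * (rho * alpha / tau
                                  + np.sqrt(1.0 - rho**2) * xi)

sites = np.linspace(0.0, 1.0, 5)
times = np.arange(4.0)
Y = simulate_hasp_st(sites, times, tau=1.0, rho=0.6)  # rho > 0: right skew
```

The latent α enters both as the scale (through exp(α/2)) and, via ϱ, as part of the mean direction, which is exactly how the HASP construction produces heavy tails and skewness simultaneously.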

One drawback of this model specification is that a single parameter ϱ controls the skewness of the model over the entire space-time domain, while in reality there might be differences between the spatial and temporal behaviors. To explicitly model the difference between time and space, e.g., skewness in space but not in time (the simplest example being independent temporal measurements), we can mix the HASP with a purely temporal process as in Ma (2002), Ma (2003) and Fonseca and Steel (2011a). However, because a HASP process does not have zero mean, understanding whether or not the mixing approach makes sense requires a more in-depth analysis.

4.3.2 Other directions

Even if we can write down a non-Gaussian spatial or spatio-temporal model and develop a Bayesian procedure to fit it, computational feasibility is always a concern for models employing a latent scaling process, and this computational burden can be severe for large datasets. Existing strategies for large spatial datasets are reviewed in detail in Sun et al. (2012). Some of the methods for Gaussian processes, such as the tapering strategy developed by Furrer et al. (2006) and Kaufman et al. (2008) and the Gaussian Markov random field approximation of Lindgren et al. (2011), depend on the sparsification of the covariance matrix of the Gaussian process so that it can be handled more efficiently with specially designed algorithms. However, a major computational challenge in the HASP model, as well as its extension to the space-time domain, is the update of the latent α(s, t) process within a Gibbs sampler. Neither the tapering method nor a Markov approximation of a Gaussian random field would contribute much to the fitting of the HASP model or its space-time counterparts. To solve this problem, we either need to devise a cleverer model fitting procedure or use other approximation techniques, such as a low-rank representation of the spatial processes.

Another question that remains to be explored is the development of a non-Gaussian process that also captures non-stationary features. We have seen an example of such a model in Damian et al. (2003), where the variance of the process is also modeled as a spatially indexed random field. In general, however, the non-stationary models developed in the literature still assume Gaussianity. The number of methods for building non-stationary spatial or spatio-temporal processes is large and growing (see, e.g., Sampson and Guttorp, 1992; Guttorp and Sampson, 1994; Nychka and Saltzman, 1998; Higdon, 1998; Fuentes, 2001; Nott and Dunsmuir, 2002; Nychka et al., 2002, for early papers in this field), and the modeling potential of a flexible non-Gaussian and non-stationary model is certainly worth our research attention.

4.4 Closing Remarks

For many practical problems, the art of statistics resides in the process of designing a model that is as close to the true data generating process as possible. Technically, however, there is usually a nuanced balance between choosing feasible mathematical assumptions for our models and using them to uncover repeatable patterns concealed in the data. The findings from an overly restrictive model might be dominated by our own (and usually wrong) assumptions, while an overly flexible model will end up modeling random errors in the data and produce unrepeatable results.

In this dissertation, we have developed ways to relax some common but restrictive assumptions in temporal and spatial stochastic processes, including the linearity assumption in the state equation of the SV model and the symmetry assumption in the marginal distribution of heavy-tailed non-Gaussian spatial processes. Our proposed methodologies offer disciplined ways to model non-Gaussian features in temporally and spatially indexed data. They are conceptually easy to understand and practically straightforward to implement. In addition, our methods can serve as building blocks for developing models that allow for even more empirical features to approximate the true data generating process.

Bibliography

Andrieu, C., Doucet, A., and Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72:269–342.

Apanasovich, T. V., Genton, M. G., and Sun, Y. (2012). A valid Matérn class of cross-covariance functions for multivariate random fields with any number of components. Journal of the American Statistical Association, 107:180–193.

Askey, R. (1973). Radial characteristic functions. Technical report, University of Wisconsin-Madison, Mathematics Research Center.

Audrino, F. and Bühlmann, P. (2009). Splines for financial volatility. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71:655–670.

Azzalini, A. (2005). The skew-normal distribution and related multivariate families. Scandinavian Journal of Statistics, 32:159–188.

Azzalini, A. and Capitanio, A. (1999). Statistical applications of the multivariate skew normal distribution. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61:579–602.

Azzalini, A. and Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika, 83:715–726.

Bandi, F. M. and Renò, R. (2012). Time-varying leverage effects. Journal of Econometrics, 169:94–113.

Banerjee, S. and Gelfand, A. (2003). On smoothness properties of spatial processes. Journal of Multivariate Analysis, 84:85–100.

Bauwens, L. and Veredas, D. (2004). The stochastic conditional duration model: a latent variable model for the analysis of financial durations. Journal of Econometrics, 119:381–412.

Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B (Methodological), 36:192–236.

Besag, J. (1975). Statistical analysis of non-lattice data. The Statistician, 24:179–195.

Bolin, D. (2013). Spatial Matérn fields driven by non-Gaussian noise. Scandinavian Journal of Statistics, in press.

Bolin, D. and Wallin, J. (2013). Non-Gaussian Matérn fields with an application to precipitation modeling. arXiv preprint arXiv:1307.6366.

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31:307–327.

Brezger, A. and Steiner, W. J. (2008). Monotonic regression based on Bayesian P-splines. Journal of Business & Economic Statistics, 26:90–104.

Brockwell, P. and Davis, R. (2009). Time Series: Theory and Methods. Springer Verlag, New York, NY.

Cai, B. and Dunson, D. (2007). Bayesian multivariate isotonic regression splines. Journal of the American Statistical Association, 102:1158–1171.

Carlin, B. and Chib, S. (1995). Bayesian model choice via Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Methodological), 57:473–484.

Chib, S. (1995). Marginal likelihood from the Gibbs output. Journal of the American Statistical Association, 90:1313–1321.

Chib, S. and Greenberg, E. (1994). Bayes inference in regression models with ARMA(p, q) errors. Journal of Econometrics, 64:183–206.

Chib, S. and Greenberg, E. (1995). Understanding the Metropolis-Hastings algorithm. The American Statistician, 49:327–335.

Comte, F. (2004). Kernel deconvolution of stochastic volatility models. Journal of

Time Series Analysis, 25:563–582.

Craigmile, P. F. and Guttorp, P. (2011). Space-time modelling of trends in tempera-

ture series. Journal of Time Series Analysis, 32:378–395.

Cressie, N. (1993). Statistics for Spatial Data. Wiley, New York, NY.

Cressie, N. and Huang, H.-C. (1999). Classes of nonseparable, spatio-temporal sta-

tionary covariance functions. Journal of the American Statistical Association,

94:1330–1339.

Dahlhaus, R. and Subba Rao, S. (2006). Statistical inference for time-varying ARCH

processes. The Annals of Statistics, 34:1075–1114.

144 Damian, D., Sampson, P. D., and Guttorp, P. (2003). Variance modeling for non-

stationary spatial processes with temporal replications. Journal of Geophysical

Research: Atmospheres, 108:1–12.

De Oliveira, V., Kedem, B., and Short, D. A. (1997). Bayesian prediction of trans-

formed Gaussian random fields. Journal of the American Statistical Association,

92:1422–1433.

Dette, H. and Pilz, K. (2006). A comparative study of monotone nonparametric

kernel estimates. Journal of Statistical Computation and Simulation, 76:41–56.

Dette, H. and Scheder, R. (2006). Strictly monotone and smooth nonparametric

regression for two or more variables. Canadian Journal of Statistics, 34:535–561.

Diggle, P. J., Tawn, J., and Moyeed, R. (1998). Model-based geostatistics. Journal

of the Royal Statistical Society: Series C (Applied Statistics), 47:299–350.

Doob, J. L. (1942). The Brownian movement and stochastic equations. Annals of

Mathematics, 43:351–369.

Eberlein, E. and Hammerstein, E. A. v. (2004). Generalized hyperbolic and inverse

Gaussian distributions: limiting cases and approximation of processes. In Seminar

on Stochastic Analysis, Random Fields and Applications IV, volume 58, pages 221–

264. Springer.

Eddelbuettel, D. and Sanderson, C. (2014). RcppArmadillo: Accelerating R with

high-performance C++ linear algebra. Computational Statistics & Data Analysis,

71:1054–1063.

145 Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression.

The Annals of Statistics, 32:407–499.

Engle, R. (1982). Autoregressive conditional heteroscedasticity with estimates of the

variance of United Kingdom inflation. Econometrica, 50:987–1007.

Engle, R. and Gonzalez-Rivera, G. (1991). Semiparametric ARCH models. Journal

of Business & Economic Statistics, 9:345–359.

Engle, R. and Ng, V. (1993). Measuring and testing the impact of news on volatility.

The Journal of Finance, 48:1749–1778.

Engle, R. F. and Russell, J. R. (1998). Autoregressive conditional duration: a new

model for irregularly spaced transaction data. Econometrica, 66:1127–1162.

Fan, J. and Yao, Q. (1998). Efficient estimation of conditional variance functions in

stochastic regression. Biometrika, 85:645–660.

Fan, J. and Yao, Q. (2003). Nonlinear time series: Nonparametric and parametric

methods. Springer Verlag, New York, NY.

Fonseca, T. C. and Steel, M. F. (2011a). A general class of nonseparable space–time

covariance models. Environmetrics, 22:224–242.

Fonseca, T. C. and Steel, M. F. (2011b). Non-Gaussian spatiotemporal modelling

through scale mixing. Biometrika, 98:761–774.

Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine.

Annals of Statistics, 29:1189–1232.

146 Fuentes, M. (2001). A high frequency kriging approach for non-stationary environ-

mental processes. Environmetrics, 12:469–483.

Furrer, R., Genton, M. G., and Nychka, D. (2006). Covariance tapering for interpo-

lation of large spatial datasets. Journal of Computational and Graphical Statistics,

15:502–523.

Glosten, L., Jagannathan, R., and Runkle, D. (1993). On the relation between the

expected value and the volatility of the nominal excess return on stocks. Journal

of Finance, 48:1779–1801.

Gneiting, T. (2001). Criteria of P´olya type for radial positive definite functions.

Proceedings of the American Mathematical Society, 129:2309–2318.

Gneiting, T. (2002a). Compactly supported correlation functions. Journal of Multi-

variate Analysis, 83:493–508.

Gneiting, T. (2002b). Nonseparable, stationary covariance functions for space–time

data. Journal of the American Statistical Association, 97:590–600.

Gneiting, T., Balabdaoui, F., and Raftery, A. E. (2007). Probabilistic forecasts, cali-

bration and sharpness. Journal of the Royal Statistical Society: Series B (Statistical

Methodology), 69:243–268.

Gneiting, T., Genton, M., and Guttorp, P. (2006). Geostatistical space-time models,

stationarity, separability and full symmetry. In Finkenstadt, B., Held, L., and

Isham, V., editors, Statistical Methods for Spatio-Temporal Systems, pages 151–

175. Chapman and Hall/CRC, Boca Raton, FL.

147 Gneiting, T. and Guttorp, P. (2010). Continuous parameter spatio-temporal pro-

cesses. In Gelfand, A. E., Diggle, P., Guttop, P., and Fuentes, M., editors, Handbook

of Spatial Statistics, pages 427–436. CRC Press, New York, NY.

Gneiting, T., Kleiber, W., and Schlather, M. (2010). Mat´erncross-covariance func-

tions for multivariate random fields. Journal of the American Statistical Associa-

tion, 105:1167–1177.

Gneiting, T. and Schlather, M. (2004). Stochastic models that separate fractal di-

mension and the Hurst effect. SIAM review, 46:269–282.

Godsill, S. J., Doucet, A., and West, M. (2004). Monte Carlo smoothing for nonlinear

time series. Journal of the American Statistical Association, 99:156–168.

Gordon, N. J., Salmond, D. J., and Smith, A. F. (1993). Novel approach to

nonlinear/non-Gaussian Bayesian state estimation. IEEE Proceedings F (Radar

and ), 140:107–113.

Goulard, M. and Voltz, M. (1992). Linear coregionalization model: tools for estima-

tion and choice of cross-variogram matrix. Mathematical Geology, 24:269–286.

Green, P. (1995). Reversible jump Markov chain Monte Carlo computation and

Bayesian model determination. Biometrika, 82:711–732.

Grenander, U. and Miller, M. I. (1994). Representations of knowledge in complex

systems. Journal of the Royal Statistical Society: Series B (Methodological), 56:549–

603.

Guttorp, P. and Gneiting, T. (2006). Studies in the history of probability and statistics

XLIX: On the Mat´erncorrelation family. Biometrika, 93:989–995.

148 Guttorp, P. and Sampson, P. D. (1994). Methods for estimating heterogeneous spatial

covariance functions with environmental applications. In Patil, G. and Rao, C.,

editors, Handbook of Statistics: Environmental Statistics, volume 12, pages 661–

689. Elsevier, New York, NY.

Hall, P. and Huang, L. (2001). Nonparametric kernel regression subject to mono-

tonicity constraints. The Annals of Statistics, 29:624–647.

Handcock, M. S. and Stein, M. L. (1993). A Bayesian analysis of kriging. Techno-

metrics, 35:403–410.

Harvey, A. C. and Shephard, N. (1996). Estimation of an asymmetric stochastic

volatility model for asset returns. Journal of Business & Economic Statistics,

14:429–434.

Hastie, T. and Tibshirani, R. (1986). Generalized additive models. Statistical Science,

1:297–318.

Haynsworth, E. V. (1968). Determination of the inertia of a partitioned Hermitian

matrix. Linear Algebra and its Applications, 1:73–81.

Held, L. and Rue, H. (2010). Conditional and intrinsic autoregressions. In Gelfand,

A. E., Diggle, P., Guttop, P., and Fuentes, M., editors, Handbook of Spatial Statis-

tics, pages 207–208. CRC Press, New York, NY.

Hering, A. S. and Genton, M. G. (2010). Powering up with space-time wind forecast-

ing. Journal of the American Statistical Association, 105:92–104.

Higdon, D. (1998). A process-convolution approach to modelling temperatures in the

North Atlantic Ocean. Environmental and Ecological Statistics, 5:173–190.

149 Huang, W., Wang, K., Breidt, F., and Davis, R. (2011). A class of stochastic volatility

models for environmental applications. Journal of Time Series Analysis, 32:364–

377.

Hull, J. and White, A. (1987). The pricing of options on assets with stochastic

volatilities. The Journal of Finance, 42:281–300.

Jensen, M. J. and Maheu, J. M. (2012). Estimating a semiparametric asymmetric

stochastic volatility model with a Dirichlet process mixture. Technical report,

Federal Reserve Bank of Atlanta.

Kaufman, C. G., Schervish, M. J., and Nychka, D. W. (2008). Covariance tapering

for likelihood-based estimation in large spatial data sets. Journal of the American

Statistical Association, 103:1545–1555.

Kent, J. T. (1989). Continuity properties for random fields. The Annals of Probability,

17:1432–1440.

Kim, H.-M. and Mallick, B. K. (2004). A Bayesian prediction using the skew Gaussian

distribution. Journal of Statistical Planning and Inference, 120:85–101.

Kim, S., Shephard, N., and Chib, S. (1998). Stochastic volatility: likelihood inference

and comparison with ARCH models. The Review of Economic Studies, 65:361–393.

Kim, W. and Linton, O. (2004). The LIVE method for generalized additive volatility

models. Econometric Theory, 20:1094–1139.

Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear

state space models. Journal of Computational and Graphical Statistics, 5:1–25.

150 Kong, A., Liu, J. S., and Wong, W. H. (1994). Sequential imputations and Bayesian

missing data problems. Journal of the American Statistical Association, 89:278–

288.

Kyriakidis, P. C. and Journel, A. G. (1999). Geostatistical space–time models: a

review. Mathematical Geology, 31:651–684.

Lange, M. (2005). On the uncertainty of wind power predictionsanalysis of the forecast

accuracy and statistical distribution of errors. Journal of Solar Energy Engineering,

127:177–184.

Lindgren, F., Rue, H., and Lindstr¨om,J. (2011). An explicit link between Gaus-

sian fields and Gaussian Markov random fields: the stochastic partial differential

equation approach. Journal of the Royal Statistical Society: Series B (Statistical

Methodology), 73:423–498.

Linton, O. and Mammen, E. (2005). Estimating semiparametric ARCH models by

kernel smoothing methods. Econometrica, 73:771–836.

Liu, J. S. and Chen, R. (1995). Blind deconvolution via sequential imputations.

Journal of the American Statistical Association, 90:567–576.

Liu, J. S. and Chen, R. (1998). Sequential Monte Carlo methods for dynamic systems.

Journal of the American Statistical Association, 93:1032–1044.

Ma, C. (2002). Spatio-temporal covariance functions generated by mixtures. Mathe-

matical Geology, 34:965–975.

Ma, C. (2003). Spatio-temporal stationary covariance models. Journal of Multivariate

Analysis, 86:97–107.

151 MacEachern, S. N., Clyde, M., and Liu, J. S. (1999). Sequential importance sam-

pling for nonparametric Bayes models: The next generation. Canadian Journal of

Statistics, 27:251–267.

Mat´ern,B. (1960). Spatial variation: Stochastic models and their application to

some problems in forest surveys and other sampling investigations. Meddelanden

fr˚anStatens Skogsforskningsinstitut, Stockholm, 49.

McAleer, M. and Medeiros, M. (2008). Realized volatility: A review. Econometric

Reviews, 27:10–45.

Meyer, M. C., Hackstadt, A. J., and Hoeting, J. A. (2011). Bayesian estimation

and inference for generalised partial linear models using shape-restricted splines.

Journal of Nonparametric Statistics, 23:867–884.

Nakajima, J. and Omori, Y. (2009). Leverage, heavy-tails and correlated jumps in

stochastic volatility models. Computational Statistics & Data Analysis, 53:2335–

2353.

Neelon, B. and Dunson, D. (2004). Bayesian isotonic regression and trend analysis.

Biometrics, 60:398–406.

Nelson, D. (1991). Conditional heteroskedasticity in asset returns: a new approach.

Econometrica, 59:347–370.

Nott, D. J. and Dunsmuir, W. T. (2002). Estimation of nonstationary spatial covari-

ance structure. Biometrika, 89:819–829.

152 Nychka, D. and Saltzman, N. (1998). Design of air-quality monitoring networks. In

Nychka, D., Piegorsch, W. W., and Cox, L. H., editors, Case studies in environ-

mental statistics, pages 51–76. Springer Verlag, New York, NY.

Nychka, D., Wikle, C., and Royle, J. A. (2002). Multiresolution models for nonsta-

tionary spatial covariance functions. Statistical Modelling, 2:315–331.

Omori, Y., Chib, S., Shephard, N., and Nakajima, J. (2007). Stochastic volatility

with leverage: Fast and efficient likelihood inference. Journal of Econometrics,

140:425–449.

Pagan, A. and Ullah, A. (1988). The econometric analysis of models with risk terms.

Journal of Applied Econometrics, 3:87–105.

Palacios, M. and Steel, M. (2006). Non-Gaussian Bayesian geostatistical modeling.

Journal of the American Statistical Association, 101:604–618.

P´olya, G. (1949). Remarks on characteristic functions. In Proceedings of the Berkeley

Symposium on and Probability, pages 115–123.

R Core Team (2013). R: A Language and Environment for Statistical Computing.

R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-

project.org/.

Ramsay, J. (1988). Monotone regression splines in action. Statistical Science, 3:425–

441.

Roberts, G. O. and Tweedie, R. L. (1996). Exponential convergence of Langevin

distributions and their discrete approximations. Bernoulli, 2:341–363.

153 Rodrigues, A. and Diggle, P. J. (2010). A class of convolution-based models for spatio-

temporal processes with non-separable covariance structure. Scandinavian Journal

of Statistics, 37:553–567.

Rue, H. and Held, L. (2005). Gaussian Markov Random Fields: Theory and Applica-

tions, volume 104 of Monographs on Statistics and Applied Probability. Chapman

& Hall, London, UK.

Rydberg, T. (2000). Realistic statistical modelling of financial data. International

Statistical Review, 68:233–258.

Sampson, P. D. (2010). Constructions for nonstationary spatial processes. In Gelfand,

A. E., Diggle, P., Guttop, P., and Fuentes, M., editors, Handbook of Spatial Statis-

tics, pages 119–130. CRC/Chapman and Hall, New York, NY.

Sampson, P. D. and Guttorp, P. (1992). Nonparametric estimation of nonstation-

ary spatial covariance structure. Journal of the American Statistical Association,

87:108–119.

Sanderson, C. (2010). Armadillo: An open source C++ linear algebra library for fast

prototyping and computationally intensive experiments. Technical report, NICTA,

Australia.

Schmidt, A. M. and Gelfand, A. E. (2003). A Bayesian coregionalization approach

for multivariate pollutant data. Journal of Geophysical Research: Atmospheres,

108:1–9.

Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution.

The Annals of Statistics, 9:1135–1151.

154 Stein, M. L. (1999). Interpolation of spatial data: Some theory for kriging. Springer,

New York.

Stein, M. L. (2005). Space–time covariance functions. Journal of the American

Statistical Association, 100:310–321.

Sun, Y., Li, B., and Genton, M. G. (2012). Geostatistics for large datasets. In Porcu,

E., Montero, J.-M., and Schlather, M., editors, Advances and challenges in space-

time modelling of natural events, Lecture Notes in Statistics, Vol. 207, pages 55–77.

Springer-Verlag, Berlin, Germany.

Taylor, S. (1994). Modeling stochastic volatility: A review and comparative study.

Mathematical Finance, 4:183–204.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of

the Royal Statistical Society: Series B (Methodological), 58:267–288.

Tsay, R. S. (2005). Analysis of financial time series. John Wiley & Sons, Inc.,

Hoboken, NJ.

Uhlenbeck, G. E. and Ornstein, L. S. (1930). On the theory of the Brownian motion.

Physical review, 36:823–841.

Verdinelli, I. and Wasserman, L. (1995). Computing Bayes factors using a gener-

alization of the Savage-Dickey density ratio. Journal of the American Statistical

Association, 90:614–618.

Wackernagel, H. (2003). Multivariate Geostatistics: An Introduction with Applica-

tions. Springer, Berlin, Germany, 3rd edition.

155 WHO Press (2006). WHO Air quality guidelines for particulate matter, ozone, nitro-

gen dioxide and sulfur dioxide: Global update 2005; Summary of risk assessment.

World Health Organization, Geneva, Switzerland.

Yan, J. (2007). Spatial stochastic volatility for lattice data. Journal of Agricultural,

Biological, and Environmental Statistics, 12:25–40.

Yang, L. (2006). A semiparametric GARCH model for foreign exchange volatility.

Journal of Econometrics, 130:365–384.

Yang, L., Hardle, W., and Nielsen, J. (1999). Nonparametric autoregression with mul-

tiplicative volatility and additive mean. Journal of Time Series Analysis, 20:579–

604.

Ying, Z. (1991). Asymptotic properties of a maximum likelihood estimator with data

from a Gaussian process. Journal of Multivariate Analysis, 36:280–296.

Yu, J. (2005). On leverage in a stochastic volatility model. Journal of Econometrics,

127:165–178.

Yu, J. (2012). A semiparametric stochastic volatility model. Journal of Econometrics,

167:473–482.

Zhang, H. (2004). Inconsistent estimation and asymptotically equal interpolations in

model-based geostatistics. Journal of the American Statistical Association, 99:250–

261.

Zhang, H. (2007). Maximum-likelihood estimation for multivariate spatial linear

coregionalization models. Environmetrics, 18:125–139.

156 Zhang, H. and El-Shaarawi, A. (2010). On spatial skew-Gaussian processes and

applications. Environmetrics, 21:33–47.

Zhao, P. and Yu, B. (2007). Stagewise lasso. The Journal of

Research, 8:2701–2726.

157