Doctor of Philosophy
Total Page:16
File Type:pdf, Size:1020Kb
RICE UNIVERSITY By R. Michael Weylandt , Jr. A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE Doctor of Philosophy APPROVED, THESIS COMMITTEE Kathy Ensor (Sep 14, 2020 15:23 CDT) Frank Jones (Sep 14, 2020 16:58 CDT) Katherine B. Ensor Frank Jones Marina Vannucci Richard Baraniuk (Sep 14, 2020 17:02 ADT) Richard G. Baraniuk HOUSTON, TEXAS September 2020 RICE UNIVERSITY Computational and Statistical Methodology for Highly Structured Data by R. Michael Weylandt Jr. A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Approved, Thesis Committee: Katherine B. Ensor, Chair Noah Harding Professor of Statistics Director of the Center for Computational Finance and Economic Systems Richard G. Baraniuk Victor E. Cameron Professor of Electrical and Computer Engineering Founder & Director of OpenStax Frank Jones Noah Harding Professor of Mathematics Marina Vannucci Noah Harding Professor of Statistics Houston, Texas September, 2020 ABSTRACT Computational and Statistical Methodology for Highly Structured Data by R. Michael Weylandt Jr. Modern data-intensive research is typically characterized by large scale data and the impressive computational and modeling tools necessary to analyze it. Equally im- portant, though less remarked upon, is the important structure present in large data sets. Statistical approaches that incorporate knowledge of this structure, whether spatio-temporal dependence or sparsity in a suitable basis, are essential to accurately capture the richness of modern large scale data sets. This thesis presents four novel methodologies for dealing with various types of highly structured data in a statisti- cally rich and computationally efficient manner. The first project considers sparse regression and sparse covariance selection for complex valued data. While complex valued data is ubiquitous in spectral analysis and neuroimaging, typical machine learning techniques discard the rich structure of complex numbers, losing valuable phase information in the process. A major contribution of this project is the develop- ment of convex analysis for a class of non-smooth “Wirtinger” functions, which allows high-dimensional statistical theory to be applied in the complex domain. The second project considers clustering of large scale multi-way array (“tensor”) data. Efficient clustering algorithms for convex bi-clustering and co-clustering are derived and shown to achieve an order-of-magnitude speed improvement over previous approaches. The third project considers principal component analysis for data with smooth and/or sparse structure. An efficient manifold optimization technique is proposed which can flexibly adapt to a wide variety of regularization schemes, while efficiently estimating multiple principal components. Despite the non-convexity of the manifold constraints used, it is possible to establish convergence to a stationary point. Additionally, a new family of “deflation” schemes are proposed to allow iterative estimation of nested prin- cipal components while maintaining weaker forms of orthogonality. The fourth and final project develops a multivariate volatility model for US natural gas markets. This model flexibly incorporates differing market dynamics across time scales and differ- ent spatial locations. A rigorous evaluation shows significantly improved forecasting performance both in- and out-of-sample. All four methodologies are able to flexibly incorporate prior knowledge in a statistically rigorous fashion while maintaining a high degree of computational performance. iv Acknowledgements This work described in this thesis was primarily supported by a National Science Foundation Graduate Research Fellowship under Grant Number 1842494. Additional support was provided by an NSF Research Training Grant Number 1547433, by the Brown Foundation, and by a Computational Science and Engineering Enhancement Fellowship and the 2016/2017 BP Fellowship Graduate Fellowship, both awarded by the Ken Kennedy Institute at Rice University. Computational resources for Chap- ter 7 were provided by the Center for Computational Finance and Economic Sys- tems (CoFES) at Rice University. The Rice Business Finance Center provided the Bloomberg Terminal used to access the data used in Chapter 7. Contents Abstract ii Acknowledgements iv List of Illustrations x List of Tables xix 1 Introduction 1 1.1 Overview of Part I ............................ 4 1.2 Overview of Part II ............................ 7 I Complex Graphical Models 11 2 Optimality and Optimization of Wirtinger Functions 12 2.1 Optimality Conditions in Real Vector Spaces .............. 13 2.1.1 Second-Order Approximations and Convexity ......... 19 2.2 Implications for Wirtinger Convex Optimization ............ 21 2.2.1 Wirtinger Convex Optimization ................. 21 2.2.2 Wirtinger Semi-Definite Programming ............. 29 2.3 Algorithms for Wirtinger Optimization ................. 30 2.3.1 Background ............................ 30 2.3.2 Proximal Gradient Descent ................... 31 2.3.3 Coordinate Descent ........................ 33 2.3.4 Alternating Direction Method of Multipliers .......... 34 2.3.5 Numerical Experiments ..................... 35 vi 3 Sparsistency of the Complex Lasso 38 3.1 Introduction and Background ...................... 38 3.2 Prediction and Estimation Consistency ................. 41 3.3 Model Selection Consistency ....................... 51 3.4 Algorithms ................................ 59 3.5 Numerical Experiments .......................... 61 3.A Concentration of Complex valued Sub-Gaussian Random Variables . 65 4 Sparsistency of Complex Gaussian Graphical Models 79 4.1 Introduction and Background ...................... 79 4.2 Likelihood Approach: Sparsistency ................... 80 4.3 Pseudo-Likelihood Approach: Sparsistency ............... 89 4.4 Algorithms ................................ 99 4.5 Numerical Experiments .......................... 100 4.A Properties of the Complex Multivariate Normal ............ 103 4.A.1 Proper Complex Random Variables ............... 104 4.A.2 Graphical Models for Complex Random Variables ....... 105 4.B Concentration of Complex Sample Covariance Matrices ........ 106 4.C Lemmata Used in the Proof of Theorem 4.1 .............. 112 II Additional Work 116 5 Splitting Methods for Convex Bi-Clustering 117 5.1 Introduction ................................ 118 5.1.1 Convex Clustering, Bi-Clustering, and Co-Clustering ..... 118 5.1.2 Operator Splitting Methods ................... 120 5.2 Operator-Splitting Methods for Convex Bi-Clustering ......... 123 5.2.1 Alternating Direction Method of Multipliers .......... 123 vii 5.2.2 Generalized ADMM ....................... 125 5.2.3 Davis-Yin Splitting ........................ 126 5.3 Simulation Studies ............................ 127 5.4 Complexity Analysis ........................... 130 5.5 Extensions to General Co-Clustering .................. 131 5.6 Discussion ................................. 132 5.A Step Size Selection ............................ 134 5.B Detailed Derivations ........................... 135 5.B.1 ADMM .............................. 135 5.B.2 Generalized ADMM ....................... 138 5.B.3 Davis-Yin ............................. 139 6 Multi-Rank Sparse and Functional PCA 144 6.1 Introduction ................................ 145 6.2 Manifold Optimization for SFPCA ................... 146 6.3 Iterative Deflation for SFPCA ...................... 151 6.4 Simulation Studies ............................ 154 6.5 Discussion ................................. 157 6.A Proofs ................................... 159 6.A.1 Deflation Schemes ........................ 159 6.A.2 Simultaneous Multi-Rank Deflation ............... 163 6.A.3 Solution of the Generalized Unbalanced Procrustes Problem . 166 6.B Algorithmic Details ............................ 168 6.B.1 Manifold Proximal Gradient ................... 168 6.B.2 Manifold ADMM ......................... 170 6.B.3 Additional Note on Identifiability ................ 170 6.C Additional Experimental Results .................... 171 6.D Additional Background .......................... 172 viii 7 Natural Gas Market Volatility Modeling 179 7.1 Introduction ................................ 180 7.1.1 U.S. Natural Gas Markets .................... 183 7.1.2 Previous Work .......................... 185 7.2 Empirical Characteristics of LNG Prices ................ 188 7.3 Model Specification ............................ 190 7.3.1 Bayesian Estimation and Prior Selection ............ 195 7.3.2 Computational Concerns ..................... 197 7.4 Application to Financial Risk Management ............... 198 7.4.1 In-Sample Model Fit ....................... 199 7.4.2 In-Sample Risk Management .................. 200 7.4.3 Out-of-Sample Risk Management ................ 201 7.5 Conclusions ................................ 203 7.A Additional Data Description ....................... 205 7.B Data Set Replication Information .................... 207 7.C Additional Results ............................ 216 7.C.1 In-Sample Model Fit ....................... 216 7.C.2 In-Sample Risk Management .................. 218 7.C.3 Out-of-Sample Risk Management ................ 220 7.D Computational Details .......................... 229 7.D.1 Stan Code ............................ 229 7.D.2 MCMC Diagnostics ........................ 235 7.D.3 Simulation-Based Calibration .................. 237 7.D.4 Additional Simulation Studies .................. 237 7.E Additional Background