Computation of High-Dimensional Multivariate Normal and Student-T Probabilities Based on Matrix Compression Schemes
Dissertation by Jian Cao

In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

King Abdullah University of Science and Technology
Thuwal, Kingdom of Saudi Arabia

April, 2020

EXAMINATION COMMITTEE PAGE

The dissertation of Jian Cao is approved by the examination committee:

Committee Chairperson: Marc G. Genton
Committee Members: Marc G. Genton, David E. Keyes, Håvard Rue, Victor Panaretos

© April, 2020 Jian Cao
All Rights Reserved

ABSTRACT

The first half of the thesis focuses on the computation of high-dimensional multivariate normal (MVN) and multivariate Student-t (MVT) probabilities. Chapter 2 generalizes the bivariate conditioning method to a d-dimensional conditioning method and combines it with a hierarchical representation of the n × n covariance matrix. The resulting two-level hierarchical-block conditioning method requires Monte Carlo simulations to be performed only in d dimensions, with d ≪ n, and allows the dominant complexity term of the algorithm to be O(n log n). Chapter 3 improves the block reordering scheme from Chapter 2 and integrates it into the Quasi-Monte Carlo simulation under the tile-low-rank representation of the covariance matrix. Simulations up to dimension 65,536 suggest that this method can improve the run time by one order of magnitude compared with the hierarchical Monte Carlo method. The second half of the thesis discusses a novel matrix compression scheme with Kronecker products, an R package that implements the methods described in Chapter 3, and an application study with the probit Gaussian random field. Chapter 4 studies the potential of using the sum of Kronecker products (SKP) as a compressed covariance matrix representation.
Experiments show that this new SKP representation can save the memory footprint by one order of magnitude compared with the hierarchical representation for covariance matrices from large grids, and that the Cholesky factorization in one million dimensions can be achieved within 600 seconds. In Chapter 5, an R package is introduced that implements the methods in Chapter 3, and it is shown how the package improves the accuracy of the computed excursion sets. Chapter 6 derives the posterior properties of the probit Gaussian random field, based on which model selection and posterior prediction are performed. With the tlrmvnmvt package, the computation becomes feasible in tens of thousands of dimensions, where the prediction errors are significantly reduced.

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my advisor, Prof. Marc G. Genton, who has kept me on track on the lengthy path towards a Ph.D. with his patience and guidance. Besides my advisor, I would like to thank my collaborators, Prof. David Keyes, Prof. George Turkiyyah, and Prof. Daniele Durante, for their inspiring ideas, critical comments, and academic writing suggestions. I would like to thank the rest of my thesis committee, Prof. Håvard Rue and Prof. Victor Panaretos, for their insightful comments and encouragement. I would like to thank my colleagues and the friends I met at KAUST for filling my four years of Ph.D. life with warmth and joy. Finally, I would like to thank my parents for being available on the phone around the clock, whatever difficulty I encountered on a different longitude. The march to this milestone would not have been feasible without you all.

TABLE OF CONTENTS

Examination Committee Page
Copyright
Abstract
Acknowledgements
List of Symbols
List of Figures
List of Tables

1 Introduction
  1.1 Research motivations
  1.2 Contributions and outline of the thesis

2 Hierarchical-Block Conditioning Approximations for High-Dimensional Multivariate Normal Probabilities
  2.1 Chapter overview
  2.2 Building hierarchical representations
    2.2.1 Generating a hierarchical covariance matrix
    2.2.2 Hierarchical Cholesky factorization
  2.3 d-dimensional conditioning approximation
    2.3.1 d-dimensional LDL decomposition
    2.3.2 d-dimensional truncated expectations
    2.3.3 The d-dimensional conditioning algorithm
    2.3.4 Reordered d-dimensional conditioning
  2.4 The hierarchical-block conditioning method
    2.4.1 The hierarchical-block conditioning algorithm
    2.4.2 Simulation with a constant covariance matrix
    2.4.3 Simulation with a 1D exponential covariance matrix
    2.4.4 Simulation with a 2D exponential covariance matrix
  2.5 Block reordering
  2.6 Discussion
  2.7 Chapter summary

3 Computing High-Dimensional Normal and Student-t Probabilities with Tile-Low-Rank Quasi-Monte Carlo and Block Reordering
  3.1 Chapter overview
  3.2 SOV for MVN and MVT probabilities
    3.2.1 SOV for MVN integrations
    3.2.2 SOV for MVT integrations
  3.3 Low-rank representation and reordering for MVN and MVT probabilities
    3.3.1 Section overview
    3.3.2 TLR as a practical representation for MVN and MVT
    3.3.3 Reordering schemes and TLR factorizations
    3.3.4 Preconditioned TLR Quasi-Monte Carlo algorithms
  3.4 Numerical simulations
  3.5 Application to stochastic generators
    3.5.1 A skew-normal stochastic generator
    3.5.2 Estimation with simulated data
    3.5.3 Estimation with wind data from Saudi Arabia
  3.6 Chapter summary

4 Sum of Kronecker Products Representation and Its Cholesky Factorization for Spatial Covariance Matrices from Large Grids
  4.1 Notations and relevant literature
  4.2 Nearest sum of Kronecker products
    4.2.1 SVD framework
    4.2.2 ACA framework
  4.3 Quasi-optimal sum of Kronecker products
    4.3.1 Indexing and partitioning
    4.3.2 Storage costs comparison
    4.3.3 On perturbed grids
  4.4 Cholesky factorization under SKP
    4.4.1 Basic ideas and notations
    4.4.2 Algorithm and complexity
    4.4.3 Numerical results under expanding domain
    4.4.4 Correction mechanism for a fixed domain
  4.5 Simulation of large Gaussian random fields
  4.6 Chapter summary

5 The tlrmvnmvt R Package
  5.1 R and R packages for MVN/MVT probabilities
  5.2 Package functions
    5.2.1 Function interfaces
    5.2.2 Computation with dense matrices
    5.2.3 Computation with TLR matrices
  5.3 Performance comparison
  5.4 Application in finding excursion sets
  5.5 Chapter summary

6 Computation of High-Dimensional Probit Gaussian Random Field for Binary Classification
  6.1 Chapter overview
  6.2 Properties
  6.3 MVN probabilities and truncated distributions
  6.4 Monte Carlo simulations
  6.5 Chapter summary

7 Concluding Remarks

References

LIST OF SYMBOLS

n                 the problem size/dimension
m                 m ≤ n, the diagonal block size in the hierarchical representation and the block/tile size in the tile-low-rank representation
d                 d ≤ m, the second conditioning dimension
r                 equal to n/m, the number of diagonal blocks in the hierarchical representation
s                 in Chapter 2, equal to m/d, the number of diagonal blocks in the D factor of the LDL decomposition; in Chapter 3, an integration variable
a, b              a, b ∈ R^n, the lower and the upper integration limits
µ                 µ ∈ R^n, the mean parameter
Σ                 Σ ∈ R^{n×n}, the covariance matrix
Φ_n(a, b; µ, Σ)   an n-dimensional multivariate normal probability with the lower integration limits a, the upper integration limits b, the mean parameter µ, and the covariance matrix Σ; µ is occasionally omitted when it is zero, and a is occasionally omitted when it is −∞
φ_n(x; µ, Σ)      the n-dimensional multivariate normal probability density at x with the mean parameter µ and the covariance matrix Σ; µ is occasionally omitted when it is zero
L                 the lower Cholesky factor or the L factor in the LDL decomposition
D                 in Chapter 2, the D factor in the LDL decomposition; in Chapter 4, the inverse of the lower Cholesky factor L; in Chapter 6, equal to diag(2y − 1)
x                 in Chapters 2 and 3, x ∈ R^n is the vector of integration variables; in Chapter 4, x_i is the first coordinate of s_i; in Chapter 5, x is the realization of X(·) at a set of spatial locations; in Chapter 6, x is the predictor vector
y                 in Chapters 2 and 3, y = Lx is the vector of transformed integration variables; in Chapter 5, y is a segment
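As a concrete illustration of the central quantity Φ_n(a, b; µ, Σ) defined above, the sketch below estimates an MVN box probability by plain Monte Carlo. This is illustrative only and is not the method developed in the thesis, which relies on conditioning approximations and tile-low-rank Quasi-Monte Carlo to reach high dimensions; the function name mvn_prob_mc is ours.

```python
import numpy as np

def mvn_prob_mc(a, b, mean, cov, n_samples=200_000, seed=0):
    """Plain Monte Carlo estimate of the MVN box probability
    Phi_n(a, b; mu, Sigma) = P(a <= X <= b) for X ~ N(mu, Sigma)."""
    a, b = np.asarray(a), np.asarray(b)
    rng = np.random.default_rng(seed)
    x = rng.multivariate_normal(mean, cov, size=n_samples)
    inside = np.all((x >= a) & (x <= b), axis=1)  # sample falls inside the box
    return inside.mean()

# 2-d sanity check with independent standard normals on the box [-1, 1]^2;
# the exact value is (Phi(1) - Phi(-1))^2, approximately 0.4661.
p = mvn_prob_mc(a=[-1.0, -1.0], b=[1.0, 1.0], mean=[0.0, 0.0], cov=np.eye(2))
```

The error of such a plain estimator decays only as O(n_samples^{-1/2}) and the acceptance probability shrinks rapidly with dimension, which is precisely why the separation-of-variables and reordering techniques of Chapters 2 and 3 are needed.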
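The SKP representation of Chapter 4 builds on the classical nearest-Kronecker-product problem: minimizing ||A − B ⊗ C||_F reduces, after the block rearrangement of Van Loan and Pitsianis, to a rank-1 approximation of a rearranged matrix (rank k for a sum of k Kronecker products). A minimal NumPy sketch of the rank-1 case, under our own naming, is:

```python
import numpy as np

def nearest_kronecker(A, m, n, p, q):
    """Best Frobenius-norm approximation A ~ B (x) C, with B of shape
    (m, n) and C of shape (p, q), via the Van Loan-Pitsianis
    rearrangement: each p-by-q block of A becomes one row of R, and the
    leading singular triplet of R yields vec(B) and vec(C)."""
    R = np.empty((m * n, p * q))
    for i in range(m):
        for j in range(n):
            # Block (i, j) of A, flattened, becomes row i*n + j of R
            R[i * n + j] = A[i * p:(i + 1) * p, j * q:(j + 1) * q].ravel()
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    B = np.sqrt(s[0]) * U[:, 0].reshape(m, n)
    C = np.sqrt(s[0]) * Vt[0].reshape(p, q)
    return B, C

# When A is exactly a Kronecker product, the rank-1 term recovers it
# (up to a sign shared between B and C, which cancels in B (x) C).
rng = np.random.default_rng(1)
B0 = rng.standard_normal((2, 3))
C0 = rng.standard_normal((4, 5))
A = np.kron(B0, C0)
B, C = nearest_kronecker(A, 2, 3, 4, 5)
```

Keeping the leading k singular triplets instead of one gives a sum of k Kronecker products; Chapter 4 develops SVD- and ACA-based frameworks for doing this efficiently for spatial covariance matrices on large grids.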