Spatio-Temporal Dependence: a Blessing and a Curse for Computation and Inference
Spatio-temporal dependence: a blessing and a curse for computation and inference
(illustrated by compositional data modeling)
(and with an introduction to NIMBLE)

Christopher Paciorek
UC Berkeley Statistics

Joint work with:
The PalEON project team (http://paleonproject.org)
The NIMBLE development team (http://r-nimble.org)

NSF-CBMS Workshop on Bayesian Spatial Statistics
August 2017

Funded by various NSF grants to the PalEON and NIMBLE projects

PalEON Project

Goal: Improve the predictive capacity of terrestrial ecosystem models

Friedlingstein et al. 2006, J. Climate

“This large variation among carbon-cycle models … has been called ‘uncertainty’. I prefer to call it ‘ignorance’.” - Prentice (2013) Grantham Institute

Critical issue: model parameterization and representation of decadal- to centennial-scale processes are poorly constrained by data

Approach: use historical and fossil data to estimate past vegetation and climate and use this information for model initialization, assessment, and improvement

Fossil Pollen Data

[Figure: pollen diagram for Berry Pond West, Berry Pond, W Massachusetts, plotted against Year AD (1000 to 2000). Panels: Pine, Hemlock, Birch, Oak, Hickory, Beech, Chestnut, Grass & weeds, Charcoal:Pollen, % organic matter. Marked events: European settlement, onset of Little Ice Age.]

Settlement-era Land Survey Data

[Figures: survey grid in Wisconsin; surveyor notes; raw oak tree proportions (on a grid in the western portion and in irregular township areas in the eastern portion)]

Outline

• Application 1: Spatial smoothing of compositional data
  – Setting: Multivariate data, high-dimensional quantities, non-conjugate models
  – A hierarchical multinomial probit model with CAR spatial process
  – Data augmentation
  – How much smoothness (in space)?
  – Computational implications
• Computational tools
  – Overview of current software
  – Introduction to NIMBLE
• Application 2: Temporal prediction of biomass from compositional data
  – How much smoothness (in time)?
  – A hierarchical stick-breaking compositional model with Generalized Pareto nonstationary temporal smoothing
  – Default MCMC and computational challenges
  – Customized MCMC using NIMBLE
• Concluding thoughts

Application 1: Spatial smoothing of compositional data

• Multivariate: ~20 taxa (species)
• Sum-to-one constraint on proportions
• 8 km by 8 km grid:
  – ~10,000 grid points
  – 1.3 million trees (> 20 cm diameter) in total
  – ~125 trees per grid cell

Application 1: Should we model spatial dependence?

• Yes:
  – We want to estimate composition at all locations.
  – We want to smooth over noise at observed locations.
  – We are interested in joint inference for multiple locations, so we need to account for posterior covariance.
• No:
  – We would need to model the spatial dependence, with the resulting computational implications.

Application 1: Should we model multivariate dependence?

• Yes:
  – Taxa do show correlated abundance (taxa have similarities in their ecological characteristics).
  – If joint inference on multiple tree species is desired, we need multivariate correlation structure to properly characterize uncertainty given our actual knowledge.
• No:
  – Dependence varies by location (nonstationarity)
    – E.g., hemlock/beech positively correlated in general, but beech not present in some locations where hemlock appears (different western range limits)
    – Would require more complex model
  – Locations with data have data for all taxa
    – Imputation is only spatial, not multivariate
    – With no measurement error and separable covariance, kriging prediction for a taxon depends only on data from that taxon at other locations
  – Inference not focused on multi-taxon functionals

Application 1: Spatial smoothing of compositional data

Model overview:
• Multinomial likelihood (no over-dispersion)
• One spatial process per taxon
• Sum-to-one constraint based on a multinomial probit specification
• Otherwise, no multivariate structure
• Spatial process hyperparameters

Application 1: Standard Spatial Multinomial Logit Model

A spatial multinomial logit model:

$$y_i \sim \mathrm{Multi}(n_i, \theta(s_i))$$
$$\theta_p(s_i) = \frac{\exp(g_p(s_i))}{\sum_k \exp(g_k(s_i))}$$
$$g_p(\cdot) \sim \mathrm{GP}(\phi_p)$$

for location i and taxon p.

Computational implications:
• No conjugacy!
• Can’t integrate analytically over the latent processes
• How to propose good values of each g process?

Consider the McCulloch and Rossi (1994) multinomial extension of the Albert and Chib (1993) data augmentation (DA) trick for probit regression.

Application 1: Spatial Multinomial Probit Model with Data Augmentation

A spatial multinomial probit model:

$$y_{ij} = p \iff w_{ijp} = \max_k w_{ijk}$$
$$w_{ijp} \sim \mathcal{N}(g_p(s_i), 1)$$
$$g_p(\cdot) \sim \mathrm{GP}(\phi_p)$$

for location i, tree j, and taxon p.

Computational implications:
• Data augmentation version allows conjugate updates of each g process
• But!
  This introduces a new level in the model: higher dimensional and with potential for cross-level dependence to impede MCMC performance

Application 1: How much smoothness?

Application is based on an 8 km grid, so CAR-style (i.e., Markov random field) models are a natural choice. How smooth spatially?

• First order (simple neighborhood) CAR models: not smooth spatially.

$$y_{ij} = p \iff w_{ijp} = \max_k w_{ijk}$$
$$w_{ijp} \sim \mathcal{N}(g_p(s_i), 1)$$
$$g_p \sim \mathcal{N}(0, \sigma_p^2 Q^{-}) \quad \text{(ICAR)}$$

• Second order (thin-plate spline) CAR models: very smooth spatially.
• Lindgren et al. (2011) SPDE approximation to Matérn-based Gaussian process: range parameter and limited control over differentiability parameter.

Application 1: Smoothness and computation

• Sparse precision matrices
  – Very computationally efficient for conjugate updates
  – Without conjugacy it is not clear how to generate good proposals for the entire spatial field for a taxon, so computational efficiency is of limited relevance
• Location-specific updates would mix poorly when there is strong spatial dependence
  – Simple CAR models may show reasonable mixing for spatial process values with fixed hyperparameters because of lesser spatial smoothness
• Cross-level dependence from separate updates of latent data values, spatial process values, spatial hyperparameters
  – Updates of spatial process and hyperparameters not directly informed by data

Application 1: MCMC design

$$y_{ij} = p \iff w_{ijp} = \max_k w_{ijk}$$
$$w_{ijp} \sim \mathcal{N}(g_p(s_i), 1)$$
$$g_p \sim \mathcal{N}(0, \sigma_p^2 Q^{-}) \quad \text{(ICAR)}$$

• Cross-level dependence from separate updates of latent data values, spatial process values, spatial hyperparameters
• Adequate performance required joint (cross-level) updates of $\{g_p, \sigma_p\}$:
  – Metropolis proposal for $\sigma_p$ with conjugate proposal for $g_p$
  – Equivalent to marginalizing over $g_p$ but avoids correlated truncated normal density for w
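
The joint update of the spatial field and its variance described under MCMC design can be sketched for a single taxon as follows. This is a minimal illustration under stated assumptions, not the code used in the analysis: it uses a small dense lattice rather than sparse matrices, a random-walk Metropolis proposal on log σ², and an inverse-gamma prior on σ² whose parameters a0 and b0 are hypothetical. The function names are made up for this sketch. Here n holds the number of trees per grid cell and b the per-cell sums of the latent w values for the taxon.

```python
import numpy as np

def icar_structure(nrow, ncol):
    """Structure matrix Q = D - A for a first-order (rook) neighborhood."""
    m = nrow * ncol
    Q = np.zeros((m, m))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            for rr, cc in ((r, c + 1), (r + 1, c)):  # right and down neighbors
                if rr < nrow and cc < ncol:
                    j = rr * ncol + cc
                    Q[i, j] = Q[j, i] = -1.0
                    Q[i, i] += 1.0
                    Q[j, j] += 1.0
    return Q

def log_marginal(sigma2, Q, n, b):
    """log p(w | sigma2) up to a constant, with g integrated out analytically."""
    m = Q.shape[0]
    P = Q / sigma2 + np.diag(n)            # precision of g given w and sigma2
    L = np.linalg.cholesky(P)
    z = np.linalg.solve(L, b)              # z @ z equals b' P^{-1} b
    log_det_P = 2.0 * np.sum(np.log(np.diag(L)))
    # The ICAR prior has rank m - 1, hence the (m - 1) / 2 power of sigma2.
    return -0.5 * (m - 1) * np.log(sigma2) - 0.5 * log_det_P + 0.5 * (z @ z)

def log_target(log_s2, Q, n, b, a0=1.0, b0=1.0):
    """Marginal log posterior of log(sigma2); the inverse-gamma prior is assumed."""
    s2 = np.exp(log_s2)
    log_prior = -(a0 + 1.0) * log_s2 - b0 / s2
    return log_marginal(s2, Q, n, b) + log_prior + log_s2  # last term: Jacobian

def joint_update(log_s2, g, Q, n, b, rng, step=0.3):
    """Metropolis proposal for sigma2 paired with a conjugate draw of g."""
    prop = log_s2 + step * rng.standard_normal()
    log_ratio = log_target(prop, Q, n, b) - log_target(log_s2, Q, n, b)
    if np.log(rng.uniform()) >= log_ratio:
        return log_s2, g                   # reject: keep both sigma2 and g
    P = Q / np.exp(prop) + np.diag(n)
    L = np.linalg.cholesky(P)
    mean = np.linalg.solve(P, b)
    # mean + L^{-T} z has covariance P^{-1}
    g_new = mean + np.linalg.solve(L.T, rng.standard_normal(len(b)))
    return prop, g_new

# Small demonstration on a 4 x 5 lattice with stand-in latent sums.
rng = np.random.default_rng(1)
Q = icar_structure(4, 5)
n = np.full(20, 125.0)
b = n * rng.normal(size=20)
log_s2, g = 0.0, np.zeros(20)
for _ in range(200):
    log_s2, g = joint_update(log_s2, g, Q, n, b, rng)
```

Because the proposal for g is its exact full conditional, the acceptance ratio reduces to the ratio of marginal posteriors for σ², which is why the update behaves like sampling σ² with g integrated out while the latent w values are still updated conditionally on g.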
Application 1: MCMC implementation

$$y_{ij} = p \iff w_{ijp} = \max_k w_{ijk}$$
$$w_{ijp} \sim \mathcal{N}(g_p(s_i), 1)$$
$$g_p \sim \mathcal{N}(0, \sigma_p^2 Q^{-}) \quad \text{(ICAR)}$$

• Overall MCMC written in R
• Truncated normal computations done in C++ via Rcpp (can also use OpenMP for parallelization)
• Joint $\{g_p, \sigma_p\}$ samples done in R using sparse matrix computations with the spam package (which uses Fortran)
• Even with customization, MCMC takes on the order of two weeks
• Computation pre-dates NIMBLE, but NIMBLE is designed to allow users to set up customized MCMC sampling for components of models
  – E.g., the joint $\{g_p, \sigma_p\}$ sampling could be coded as a user-defined sampler in NIMBLE (and NIMBLE provides such a sampler for some such situations)

Application 1: MCMC performance

[Figure: trace plots for taxon-specific hyperparameters, plotted against iteration. Panels: Ash, Beech, Basswood, Gum, Birch, Cedar.]

Application 1: Results

Model selection:
• First order CAR and Lindgren GP approximation have similar performance, but the GP approximation has anomalies at the spatial boundaries.
• Second order (thin plate spline) CAR too smooth.
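
One way to build intuition for the difference in smoothness between first- and second-order CAR models is a one-dimensional analogue. The sketch below is an illustration under that simplifying assumption (the models in the talk are on a two-dimensional lattice), and rw_structure is a name made up here. It builds structure matrices from first and second differences and compares the penalty each assigns to a straight line and to an alternating, rough sequence.

```python
import numpy as np

def rw_structure(m, order):
    """Structure matrix D'D, where D takes order-th differences of a length-m vector."""
    D = np.diff(np.eye(m), n=order, axis=0)
    return D.T @ D

m = 20
Q1 = rw_structure(m, 1)   # first order: penalizes differences between neighbors
Q2 = rw_structure(m, 2)   # second order: penalizes changes in slope

x = np.arange(m, dtype=float)
line = 0.5 * x                              # a linear trend
rough = np.where(x % 2 == 0, 1.0, -1.0)     # alternating +1 / -1

# Number of unpenalized directions (dimension of the null space).
deficiency1 = m - np.linalg.matrix_rank(Q1)
deficiency2 = m - np.linalg.matrix_rank(Q2)

# Quadratic-form penalties x' Q x under each structure matrix.
line_pen1, line_pen2 = line @ Q1 @ line, line @ Q2 @ line
rough_pen1, rough_pen2 = rough @ Q1 @ rough, rough @ Q2 @ rough

print("unpenalized directions:", deficiency1, deficiency2)
print("penalty on a line:", line_pen1, line_pen2)
print("penalty on a rough sequence:", rough_pen1, rough_pen2)
```

In this analogue the second-order matrix leaves linear trends unpenalized while penalizing local roughness more heavily than the first-order matrix, which is consistent with the much smoother fits it produces.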
Prediction: http://gandalf.berkeley.edu:3838/paciorek/setVegComp

Bayesian software landscape

Hand-coded algorithms:
• R, Python: fast to develop and easy to share, but slow computation
• C++, Rcpp: slower to develop and harder to share, but fast computation
• Julia: fast to develop and fast computationally, but less widely used

Black-box MCMC engines:
• JAGS: single variable samplers with a focus on conjugate samplers
• Stan: Hamiltonian MC, variational Bayes
• PyMC3: flexible sampler choice, Hamiltonian MC, variational Bayes

NIMBLE:
• Customizable MCMC and other algorithms plus a system for programming algorithms for