Spatio-temporal dependence: a blessing and a curse for computation and inference (illustrated by compositional data modeling) (and with an introduction to NIMBLE)

Christopher Paciorek
UC Berkeley Statistics

Joint work with:
The PalEON project team (http://paleonproject.org)
The NIMBLE development team (http://r-nimble.org)

NSF-CBMS Workshop on Bayesian Spatial Statistics, August 2017
Funded by various NSF grants to the PalEON and NIMBLE projects

PalEON Project

Goal: Improve the predictive capacity of terrestrial ecosystem models

[Figure: Friedlingstein et al. 2006, J. Climate]

“This large variation among carbon-cycle models … has been called ‘uncertainty’. I prefer to call it ‘ignorance’.” - Prentice (2013), Grantham Institute

Critical issue: model parameterization and representation of decadal- to centennial-scale processes are poorly constrained by data

Approach: use historical and fossil data to estimate past vegetation and climate and use this information for model initialization, assessment, and improvement

Fossil Pollen Data

[Figure: pollen diagram for Berry Pond West / Berry Pond, W Massachusetts. Vertical axis: Year AD (1000 to 2000). Panels: Pine, Hemlock, Birch, Oak, Hickory, Beech, Chestnut, Grass & weeds, Charcoal:Pollen, % organic matter. Annotations: European settlement; onset of Little Ice Age.]

Settlement-era Land Survey Data

[Figure panels: survey grid in Wisconsin; surveyor notes; raw oak tree proportions (on a grid in the western portion and in irregular township areas in the eastern portion)]

Outline

• Application 1: Spatial smoothing of compositional data
  – Setting: Multivariate data, high-dimensional quantities, non-conjugate models
  – A hierarchical multinomial probit model with CAR spatial process
  – Data augmentation
  – How much smoothness (in space)?
  – Computational implications
• Computational tools
  – Overview of current software
  – Introduction to NIMBLE
• Application 2: Temporal prediction of biomass from compositional data
  – How much smoothness (in time)?
  – A hierarchical stick-breaking compositional model with Generalized Pareto nonstationary temporal smoothing
  – Default MCMC and computational challenges
  – Customized MCMC using NIMBLE
• Concluding thoughts

Application 1: Spatial smoothing of compositional data

• Multivariate: ~20 taxa (species)
• Sum-to-one constraint on proportions
• 8 km by 8 km grid:
  • ~10,000 grid points
  • 1.3 million trees (> 20 cm diameter) in total
  • ~125 trees per grid cell

Application 1: Should we model spatial dependence?

• Yes:
  • We want to estimate composition at all locations.
  • We want to smooth over noise at observed locations.
  • We are interested in joint inference for multiple locations, so we need to account for posterior covariance.
• No:
  • We would need to model the spatial dependence, with the resulting computational implications.

Application 1: Should we model multivariate dependence?

• Yes:
  • Taxa do show correlated abundance (taxa have similarities in their ecological characteristics).
  • If joint inference on multiple tree species is desired, we need a multivariate correlation structure to properly characterize uncertainty given our actual knowledge.
• No:
  • Dependence varies by location (nonstationarity)
    • E.g., hemlock and beech are positively correlated in general, but beech is not present in some locations where hemlock appears (different western range limits)
    • Would require a more complex model
  • Locations with data have data for all taxa
    • Imputation is only spatial, not multivariate
    • With no measurement error and separable covariance, the kriging prediction for a taxon depends only on data from that taxon at other locations
  • Inference is not focused on multi-taxon functionals

Application 1: Spatial smoothing of compositional data

Model overview:
• Multinomial likelihood (no over-dispersion)
• One spatial process per taxon
• Sum-to-one constraint based on a multinomial probit specification
• Otherwise, no multivariate structure
• Spatial process hyperparameters

Application 1: Standard Spatial Multinomial Logit Model

A spatial multinomial logit model:

    y_i \sim \mathrm{Multi}(n_i, \theta(s_i))
    \theta_p(s_i) = \frac{\exp(g_p(s_i))}{\sum_k \exp(g_k(s_i))}
    g_p(\cdot) \sim \mathrm{GP}(\phi_p)

for location i and taxon p.

Computational implications:
• No conjugacy!
• Can’t integrate analytically over the latent processes
• How to propose good values of each g process?

Consider the McCulloch and Rossi (1994) multinomial extension of the Albert and Chib (1993) data augmentation (DA) trick for probit regression.

Application 1: Spatial Multinomial Probit Model with Data Augmentation

A spatial multinomial probit model:

    y_{ij} = p \iff w_{ijp} = \max_k w_{ijk}
    w_{ijp} \sim \mathcal{N}(g_p(s_i), 1)
    g_p(\cdot) \sim \mathrm{GP}(\phi_p)

for location i, tree j, and taxon p.

Computational implications:
• The data augmentation version allows conjugate updates of each g process
• But! It introduces a new level in the model: higher dimensional and with potential for cross-level dependence to impede MCMC performance

Application 1: How much smoothness?

The application is based on an 8 km grid, so CAR-style (i.e., Markov random field) models are a natural choice. How smooth spatially?

• First order (simple neighborhood) CAR models: not smooth spatially.

    y_{ij} = p \iff w_{ijp} = \max_k w_{ijk}
    w_{ijp} \sim \mathcal{N}(g_p(s_i), 1)
    g_p \sim \mathcal{N}(0, \sigma_p^2 Q^{-})   (ICAR)

• Second order (thin-plate spline) CAR models: very smooth spatially.
• Lindgren et al. (2011) SPDE approximation to a Matérn-based Gaussian process: range parameter and limited control over the differentiability parameter.

Application 1: Smoothness and computation

• Sparse precision matrices
  • Very computationally efficient for conjugate updates
  • Without conjugacy it is not clear how to generate good proposals for the entire spatial field for a taxon, so computational efficiency is of limited relevance
• Location-specific updates would mix poorly when there is strong spatial dependence
  • Simple CAR models may show reasonable mixing for spatial process values with fixed hyperparameters because of lesser spatial smoothness
• Cross-level dependence from separate updates of latent data values, spatial process values, spatial hyperparameters
  • Updates of spatial process and hyperparameters are not directly informed by data

Application 1: MCMC design

• Cross-level dependence from separate updates of latent data values, spatial process values, spatial hyperparameters
• Adequate performance required joint (cross-level) updates of {g_p, σ_p}:
  • Metropolis proposal for σ_p with conjugate proposal for g_p
  • Equivalent to marginalizing over g_p but avoids the correlated truncated normal density for w

Application 1: MCMC implementation

• Overall MCMC written in R
  • Truncated normal computations done in C++ via Rcpp (can also use OpenMP for parallelization)
  • Joint {g_p, σ_p} samples done in R using sparse matrix computations with the spam package (which uses Fortran)
• Even with customization, MCMC takes on the order of two weeks
• Computation pre-dates NIMBLE, but NIMBLE is designed to allow users to set up customized MCMC sampling for components of models
  • E.g., the joint {g_p, σ_p} sampling could be coded as a user-defined sampler in NIMBLE (and NIMBLE provides such a sampler for some such situations)

Application 1: MCMC performance

[Figure: trace plots for taxon-specific hyperparameters. Panels: Ash, Basswood, Beech, Birch, Cedar, Gum. Horizontal axis: iteration (0 to 250).]

Application 1: Results

Model selection:
• First order CAR and the Lindgren GP approximation have similar performance, but the GP approximation has anomalies at the spatial boundaries.
• Second order (thin plate spline) CAR is too smooth.
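
As a rough illustration of the sparse precision structure behind the first order (ICAR) model compared above, the sketch below builds Q = D - W for a small grid, where W is the adjacency matrix and D holds each cell's neighbor count. It assumes rook neighbors (up, down, left, right) as the "simple neighborhood"; that choice, the function names `icar_precision` and `quad_form`, and the 3 by 3 grid are illustrative assumptions, and this is plain Python rather than the R and spam code used in the analysis.

```python
def icar_precision(nrow, ncol):
    """Sparse ICAR precision Q = D - W for a rook-neighbor grid (assumed).

    Cells are numbered row by row. Returns a dict mapping (i, j)
    index pairs to the nonzero entries of Q.
    """
    q = {}
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            neighbors = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol:
                    neighbors.append(rr * ncol + cc)
            q[(i, i)] = float(len(neighbors))  # D: number of neighbors
            for j in neighbors:
                q[(i, j)] = -1.0  # -W: minus the adjacency indicator
    return q


def quad_form(q, g):
    """Compute g' Q g using only the stored nonzero entries."""
    return sum(v * g[i] * g[j] for (i, j), v in q.items())


# Hypothetical 3 x 3 grid and a field that increases along the cell index.
n_cells = 9
q = icar_precision(3, 3)
row_sums = [sum(v for (i, _), v in q.items() if i == k) for k in range(n_cells)]
g = [float(k) for k in range(n_cells)]
penalty = quad_form(q, g)  # equals the sum of squared neighbor differences
```

Each row of Q has at most five nonzero entries regardless of grid size, which is what makes sparse matrix computations attractive for a grid of roughly 10,000 cells. Every row sums to zero, so Q is singular; this is why the ICAR prior above is written with a generalized inverse.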
Prediction: http://gandalf.berkeley.edu:3838/paciorek/setVegComp

Bayesian software landscape

Hand-coded algorithms:
• R, Python: fast to develop and easy to share, but slow computation
• C++, Rcpp: slower to develop and harder to share, but fast computation
• Julia: fast to develop and fast computationally, but less widely used

Black-box MCMC engines:
• JAGS: single variable samplers with a focus on conjugate samplers
• Stan: Hamiltonian MC, variational Bayes
• PyMC3: flexible sampler choice, Hamiltonian MC, variational Bayes

NIMBLE:
• Customizable MCMC and other algorithms plus a system for programming algorithms for
