A frailty-contagion model for multi-site hourly precipitation driven by atmospheric covariates

Erwan Koch (joint work with Philippe Naveau)

ETH Zurich, Department of Mathematics, RiskLab

Workshop on stochastic weather generators Vannes May 18, 2016

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 1 / 23 Motivation

Three weather stations in northern Brittany () with hourly winter precipitation records from 1995 to 2011.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 2 / 23 Outline

1 Short review of precipitation generators

2 Heteroscedastic multi-site model

3 Inference method and simulation study

4 Application to hourly precipitation in Brittany

5 Conclusion and perspectives

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 3 / 23 Short review of precipitation generators Some seminal papers about single-site models

Parametric methods Two statistically independent components: The occurrence process: Markov chains (Katz, 1977). In most cases a two-state, first-order Markov chain (Richardson, 1981). The intensity process: exponential distributions (Richardson, 1981), gamma distributions (Stern and Coe, 1984) or some exponential mixtures (Woolhiser and Pegram, 1979).

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 4 / 23 Short review of precipitation generators Some seminal papers about multi-site models

Parametric: purely statistical models Seminal paper by Wilks (1998) which extends Katz (1977). Models the spatial correlation of both precipitation occurrence and intensity using latent multivariate normal distributions. Extended in Wilks (2009) and Kleiber et al. (2012). Kleiber et al.’s (2012) approach departs from that of Wilks (2009) by embracing models for occurrence and intensity in a GLM framework. This category also includes generalized chain-dependent processes (Zheng and Katz, 2008; Zheng et al., 2010) and copula-based approaches (Bardossy and Pegram, 2009).

Drawbacks Generally, independence between occurrence and intensity. They do not really take large scale conditions into account. Extreme values are overlooked in both sides of the precipitation distribution (zeros and very large values).

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 5 / 23 Short review of precipitation generators Some seminal papers about multi-site models

Parametric: Weather-state models Use available atmospheric data. Try to represent some physical processes originating precipitation. Important case of hidden Markov models for occurrence (Hughes and Guttorp, 1994, 1999) and intensity (Ailliot et al., 2009 ; Charles et al., 1999). Precipitations depend on a latent process following a Markov chain whose transition probabilities vary under atmospheric conditions.

Drawbacks Generally, independence between occurrence and intensity. More parameters than in purely statistical models ⇒ estimation is difficult. Extreme values are overlooked in both sides of the precipitation distribution (zeros and very large values).

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 6 / 23 Short review of precipitation generators Summary

Literature To the best of our knowledge, no (or only a few) hourly precipitation generator(s) so far. Generally, occurrence and intensity considered as independent (Katz, 1977; Kleiber et al., 2012; . . . ). Extreme values are overlooked: Difficulty to reproduce the large precipitation amounts. Difficulty to reproduce the persistence of dry periods.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 7 / 23 Heteroscedastic multi-site model Link between occurrence and intensity

Weather stations Intensity range (mm) Brest- Pleyber-Christ 0.2-0.8 0.63 0.65 0.68 0.8-1.4 0.74 0.75 0.79 1.4-2 0.81 0.82 0.82 2-4 0.90 0.91 0.91 > 4 0.94 0.94 0.94 Chi-Square test p-value 8.40 × 10−64 4.57 × 10−59 4.39 × 10−43

Independence test Clear increasing relationship. The assumption of independence between occurrence and intensity is clearly unrealistic.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 8 / 23 Heteroscedastic multi-site model The model

Let Pm,t (m = 1,..., M; t = 1,..., T ) be the precipitation amount at station m recorded during the tth hour.

The model

( 0 0 BmPt−1 + m,t if BmPt−1 + m,t ≥ u Pm,t = 0 , 0 if BmPt−1 + m,t < u where:

Bm is a vector of autoregressive coefficients.

The m,t are independent and satisfy, conditionally to an observable common factor Ft ,  0  m,t ∼ N 0, exp(θ Ft ) .

Ft is a vector of d atmospheric explanatory variables at time t. The threshold u is positive.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 9 / 23 Inference method and simulation study Maximum likelihood inference

0 We denote by B the matrix composed of the rows Bm. Maximum likelihood inference

The log-likelihood function denoted Lu(B, θ) for a given u can be written

Lu (B, θ) T M ( " 0 !# " 0 ! #) X X 1 Pm,t − B Pt−1 u − B Pt−1 = I ln φ m + I ln Φ m , {Pm,t ≥u} 0 0 {Pm,t =0} 0 t=2 m=1 exp(θ Ft ) exp(θ Ft ) exp(θ Ft ) where Φ(.) and φ(.) represent the distribution function and the probability distribution function of the standardized normal distribution, respectively.

The maximum likelihood estimator is computed numerically.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 10 / 23 Inference method and simulation study Simulation study

Sample size 100 1000 Parameter values Bias Stdev Relative error Bias Stdev Relative error

B11 = 0.65 -2.08 17.71 -3.19 0.00 0.05 0.00 B12 = −0.08 -8.74 22.97 114.37 -0.01 0.06 0.07 B13 = 0.11 -13.30 30.62 -122.71 0.00 0.06 0.03 B21 = 0.47 -3.44 17.94 -7.37 0.01 0.05 0.02 B22 = 0.25 -5.57 13.83 -22.36 0.00 0.04 -0.01 B23 = 0.02 -7.76 20.43 -435.58 0.00 0.06 -0.09 B31 = 0.22 -8.64 27.02 -39.95 0.00 0.05 -0.01 B32 = 0.10 -7.17 21.43 -75.40 0.00 0.05 0.02 B33 = 0.36 -8.18 24.75 -22.65 -0.01 0.05 -0.03

θ0 = 30.63 -12.36 146.13 -0.40 0.19 1.54 0.01 θ1 = 0.07 0.07 0.36 0.96 0.00 0.01 -0.01 θ2 = 0.03 -0.06 0.14 -0.27 -0.07 0.00 0.01 θ3 = 0.03 0.03 0.08 0.92 0.00 0.00 0.02

Inference assessment by simulations for two sample sizes of 100 and 1000. The empirical bias, standard deviation and relative error are derived from 100 independent replicas simulated from the model with parameters given in the first column.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 11 / 23 Application to hourly precipitation in Brittany Estimates

Threshold: u = 0.7

Matrix B Brest-Guipavas Landivisiau Pleyber-Christ Brest-Guipavas 0.65 -0.08 0.11 [0.59 ; 0.74] (87) [-0.15 ; 0.01] (86) [0.02 ; 0.19] (89) Landivisiau 0.47 0.25 0.02 [0.41 ; 0.53] (81) [0.17 ; 0.32] (92) [-0.06 ; 0.09] (91) Pleyber-Christ 0.22 0.10 0.36 [0.16 ; 0.27 ] (77) [0.02 ; 0.17] (82) [0.30 ; 0.43] (80)

Atmospheric variables Estimates Confidence interval Coverage probability ˆ Regression intercept θ0 = 30.626 [28.02 ; 32.32] 80 ˆ Temperature θ1 = 0.070 [0.064; 0.076] 38 ˆ Seal level pressure θ2 = −0.034 [-0.037 ; -0.031] 94 ˆ Humidity θ3 = 0.028 [0.022; 0.036] 94

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 12 / 23 Application to hourly precipitation in Brittany Forecast scheme

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 13 / 23 Application to hourly precipitation in Brittany Comparison of the dynamics: out-sample time series

Time (weeks) 60 61 62

Brest-Guipavas 6 5 4 3 2 Rainfall (mm) Rainfall 1 0

6 Landivisiau 5 4 3 2 1 0

Pleyber-Christ 6 5 4 3 2 Rainfall (mm) Rainfall 1 0 10001 10101 10201 Time 10301 10401 10501 Time (h)

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 14 / 23 Application to hourly precipitation in Brittany Out-sample intensities

Brest-Guipavas Landivisiau Pleyber-Christ 11 10 9 8 7 6 Median_OS Median_OS Median_OS 5 4 Simulated intensity (mm) intensity Simulated 3 2 1 0

0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3 4 5 6 7 8 9 10 11 Quantiles_OS ObservedQuantiles_OS intensity (mm) Quantiles_OS

Quantile-quantile plot between observed rainfall amount (x-axis) and simulated one (y-axis) from the model. The gray color corresponds to the 98% confidence interval and the solid line to the median.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 15 / 23 Application to hourly precipitation in Brittany Out-sample dry periods

Brest-Guipavas Landivisiau Pleyber-Christ 350 300 250 200 Median_OS Median_OS Median_OS 150 100 Simulated lengths of dry periods (h) of dry periods lengths Simulated 50 0

0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 350 Quantiles_OS Observed lengthsQuantiles_OS of dry periods (h) Quantiles_OS

Quantile-quantile plot between observed dry periods length (x-axis) and sim- ulated one (y-axis) from the model. The gray color corresponds to the 98% confidence interval and the solid line to the median.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 16 / 23 Application to hourly precipitation in Brittany Out-sample transition probabilities

Wet|Wet Wet|Dry 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 200 150 Density Density 100 50 0

N = 1000 Bandwidth = 0.001776 N = 1000 Bandwidth = 0.0003602 200 150 Density Density 100 Density 50 0

N = 1000 Bandwidth = 0.001657 N = 1000 Bandwidth = 0.0003976 200 150 Density Density 100 50 Pleyber-Christ Pleyber-Christ Landivisiau Brest-Guipavas 0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 N = 1000 Bandwidth = 0.001754 Transition probabilities N = 1000 Bandwidth = 0.0004003

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 17 / 23 The left and right panels display the out-sample probability of having a wet hour given that the preceding hour was wet (left) and dry (right). The black points represent the observed estimates whereas the curves correspond to the density of estimates over 1000 out-sample simulated series. Application to hourly precipitation in Brittany Out-sample correlations 0.8 0.4 Median_CorS_1_1 0.0

Plage 0.8 0.4 Median_CorS_2_1 Correlation 0.0

Plage 0.8 0.4 Median_CorS_3_1 Pleyber-Brest Pleyber-Brest Landivisiau-Brest Brest-Brest 0.0 -5 -4 -3 -2 -1 0 1 2 3 4 5 LagPlage (h)

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 18 / 23 Application to hourly precipitation in Brittany Out-sample odd-ratios

Brest−Guipavas Landivisiau Pleyber−Christ 0.9 0.8 0.7 0.6 Odd−ratios 0.5 Array_Odd_Ratios_Same[1, m, 1:Hor_Max] Array_Odd_Ratios_Same[1, m, 1:Hor_Max] Array_Odd_Ratios_Same[1, m, 1:Hor_Max] Array_Odd_Ratios_Same[1, 0.4 0.3

0 12 24 36 48 60 72 0 12 24 36 48 60 72 0 12 24 36 48 60 72 1:Hor_Max Forecast1:Hor_Max horizon (h) 1:Hor_Max

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 19 / 23 Application to hourly precipitation in Brittany Other experiences

Role of the frailty and contagion terms Model fitted without the frailty component: temporal dynamics, rainfall intensity and dry period persistence not reproduced accurately. Model fitted without the contagion term: performance quite good, although the persistence of wet periods seems to be underestimated.

Application of our model on daily precipitation Performance at least as good as in the hourly case. Comparison with the standard model by Wilks (1998): Our model gives similar results concerning the general statistical properties (distributions of intensity and dry periods lengths, transition probabilities, cross-correlations, . . . ). Due to the absence of covariates, Wilks’ model does not capture the dynamic structure, and especially the synchronicity between observed and simulated rainfalls.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 20 / 23 Conclusion and perspectives Conclusion

Simple model for hourly precipitation with 2 components: Auto-regressive part: local scale effects. Heteroscedastic noise based on covariates: large scale effects. Only one equation, allowing to introduce dependence between occurrence and intensity. Compared to models separating occurrence and intensity: Easier to manipulate. Easier to interpret. More realistic. Reasonable results on French Brittany, especially with regard to extremes. Due to the fact that: The model has the relevant structure. The covariates have been well chosen.

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 21 / 23 Conclusion and perspectives Perspectives

∀t, {εm,t }m=1,...,M : components of a general multivariate Gaussian vector =⇒ second source of spatial dependence.

∀t, {εm,t }m=1,...,M : components of a multivariate Student (or more generally elliptical) vector =⇒ heavier tails (more realistic in the South of France for example). Non-parametric relationship (like a Generalized Additive Model) between the variance and the covariates. Stochastic model for the covariates. Parametric structure on the coefficients of matrix B =⇒ more tractable inference procedure in the case of M large.

Analytical computation of P(Pm,t ≤ u + x|Pm,t−1 > u).

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 22 / 23 Appendix References References

Allard, D. and Bourotte, M., 2015 Disaggregating daily precipitations into hourly values with a transformed censored latent Gaussian process. Stochastic Environmental Research and Risk Assessment, 29 (2) Kleiber, W., Katz, R., and Rajagopalan, B., 2012 Daily spatiotemporal precipitation simulation using latent and transformed gaussian processes. Water Resources Research, 48 (1) Koch, E. and Naveau, P., 2015 A frailty-contagion model for multi-site hourly precipitation driven by atmospheric covariates. Advances in Water Resources, 78 Wilks, D. S., 1998 Multisite generalization of a daily stochastic precipitation generation model Journal of Hydrology, 210 (1-4) Wilks, D. S., 2009 A gridded multisite weather generator and synchronization to observed weather data Water Resources Research, 45 (10)

Erwan Koch (ETH Zurich) A model for multi-site hourly precipitation May 18, 2016 23 / 23