Statistical Extreme Value Theory (EVT) Part II
Eric Gilleland Research Applications Laboratory 21 July 2014, Trieste, Italy Peaks Over Threshold (POT)
Poisson distribution applicable for modeling the frequency of exceeding a high threshold (i.e., low probability, rare, event).
What about the intensity of values that exceed the threshold?
UCAR Confidential and Proprietary. © 2008, University Corporation for Atmospheric Research. All rights reserved.
Peaks Over Threshold (POT)
• X random variable
• Let Y = X – u, conditional on X > u, where u is a high threshold. • Model the “excesses” over the threshold. • Y has an approximate generalized Pareto (GP) distribution for high u with cdf:
−1/ξ ⎡ y ⎤ y H (y;σ (u),ξ) = 1− ⎢1+ ξ ⎥ , y > 0, ξ > 0 ⎣ σ (u)⎦ σ (u) σ (u) > 0
UCAR Confidential and Proprietary. © 2008, University Corporation for Atmospheric Research. All rights reserved.
Peaks Over Threshold (POT)
Three types of Extreme Beta df; Value df’s 1.2 bounded Beta Exponential upper tail 1.0 Pareto
0.8 Exponential; 0.6
density light tail 0.4
0.2 Pareto; heavy tail 0.0
0 2 4 6 8 10
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT)
• ξ < 0 yields the Beta cdf (bounded upper tail) upper bound at: u – σ(u) / ξ
• ξ = 0 yields the exponential cdf (“light” upper tail)
• ξ > 0 yields the Pareto cdf (“heavy” upper tail)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT) Connection between GEV and GP families
• Maximum Mn ≤ u if no Xi > u, i = 1, 2, …, n
• “Memoryless” property of exponential distribution Pr{ Y > y + y’ | Y > y’ } = Pr{ Y > y }
• Stability of GP distribution • Lose memoryless property (need to rescale)
• If Y = X – u (X > u) has exact GP distribution with parameters σ(u) > 0 and ξ, then excess over a higher threshold u’ > u follows the GP distribution with parameters σ(u’) > 0 and ξ, where
σ(u’) = σ(u) + ξ(u’ – u), u’ > u
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT) 60 40 20 Hurricane Damage (billion USD) 0
1930 1940 1950 1960 1970 1980 1990 Year
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaksfevd(x = Dam, Over data = damage, thresholdThreshold = 6, type = "GP", time.units = "2/year")(POT)
● ●
70 1−1 line regression line 95% confidence bands 80 50 60 ● 30 40 ●
Empirical Quantiles ● ● ● ● 20 ● ● ● ● ●●● ●●● ●● ●● 10 ● ●●● ●●●●● ●●
10 15 20 25 30 35 Data Quantiles from Model Simulated 10 20 30 40 50 60 70
Model Quantiles Dam( > 6) Empirical Quantiles
Empirical Modeled 0.15 300 0.10
100 ● Density ●
Return Level ●●●●●●●●●●●●●● ● ● 0.05 100 − 0.00 0 20 40 60 2 5 10 50 200 500
N = 18 Bandwidth = 2.289 Return Period (years)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT)
95% lower CI Estimate 95% upper CI
σ(u = 6 billion USD) 1.03 billion USD 4.59 billion USD 8.15 billion USD
ξ -0.15 0.51 1.18
100-year return level 0.65 billion USD 43.64 billion USD 86.64 billion USD (billion USD)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): GP return levels
• First need quantile of GP cdf
-1 • xp = H (1 – p; σ(u), ξ)
• = (σ(u) / ξ) × (p-ξ – 1), 0 < p < 1
• Complication: must account for the rate, ζ, of exceeding the threshold for interpretability, as well
as the number of events per year, ny.
• replace p-1 with m = return period of interest×ny×ζ
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): GP return levels
Hurricane example
Lack of structure in data: some years have no hurricanes, some one, some two, etc.
Reasonable to use an average number per year in place of ny.
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Threshold Selection
• The Bias-Variance Trade-Off
• Want a high threshold to obtain better GP approximation
• Want a lower threshold for more reliable estimation (more data!)
• Difficult to automate selection
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Threshold Selection
Invariance of GP above threshold
• ξ does not change
• σ(u) is a function of the threshold
• The modification, σ* = σ(u) – ξu, is no longer a function of the threshold, and does not change
• Check for stability in parameter estimates as the threshold varies.
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT):
Thresholdthreshrange.plot(x = damage$Dam,Selection r = c(2, 7)) 10 ● ● ● ●
5 ● ● ● ● ● ● 0 5 − reparameterized scale reparameterized
2 3 4 5 6 7 1.0
● ● ● ●
0.5 ●
shape ● ● ● ● ● 0.0
2 3 4 5 6 7
Threshold
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Dependence in Threshold Excess Data
• Remove dependence (e.g., decluster)
• Runs declustering perhaps the simplest
• Model the dependence (e.g., through bivariate extreme value models)
• Concept of “extremal index”
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Dependence in Threshold Excess Data
Extremal Index, θ, 0 < θ ≤ 1
Mean cluster size ≈ 1 / θ
θ = 1 implies a lack of clustering at high levels, and degree of clustering at high levels increases as θ decreases
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Dependence in Threshold Excess Data
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Dependence in Threshold Excess
Full time Data range is 1 August θ ≈ 0.08 1961 to 28 1500 September r = 29 2010 1000 164 clusters 500 Red River flow (Port Royal, Texas) Royal, (Port flow Red River 0
05/62 11/62 06/63 12/63 Time (days)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Dependence in Threshold Excess Data
1500 θ ≈ 0.85 1000 500 Red River flow (Port Royal, Texas) Royal, (Port flow Red River 0
05/62 11/62 06/63 12/63 Time (days)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity
Poisson-GP Model
• Poisson process for exceeding a high threshold
• Event: Xt > u • Rate parameter: λ • Number of events in [0, T] has Poisson distribution with parameeter λT
• GP distribution for excess over threshold
• Excess Yt = Xt – u given Xt > u • Scale and Shape parameters
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity
Point Process (PP) representation
• Subsumes Poisson-GP model • GEV parameterization (connection to GEV) • Can relate parameters of GEV(µ, σ, ξ) to those of the PP(λ, σ(u), ξ) • Shape parameter, ξ, is identical • ln λ = -(1/ξ) ln(1 + ξ(u – µ) / σ) • σ(u) = σ + ξ(u – µ) • Need time scale, h, to take account of the block size for GEV (e.g., h ≈ 1/365.25 for daily data and annual blocks)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity
Point Process (PP) vs Poisson-GP
• Approaches are equivalent • Poisson-GP model • Can obtain the same parameter estimates indirectly through the relationships between the two parameterizations • Point Process • More convenient to quantify total uncertainty • Easier to interpret (eliminates dependence of parameters on the threshold)
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity 3 2 simulated data simulated 1 0
0 200 400 600 800 1000
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity
fevd(x = y, threshold = 1, type = "PP")
● 1−1 line ● 3.5 regression line
3.5 95% confidence bands 3.0
● ● 3.0 ● ● ● 2.5
● ● ●
● ● ● 2.5 2.0 ● ●●●
● ●●● ● ●●● ● 1.5 ●●●● Empirical Quantiles 2.0 ●
● Observed Z_k Values ● ● ● ●●●● ●● ●●●● ●● ● ●● ●● ●● ● 1.0 ●●●● ●●●●● ●● ●●● 1.5 ● ●●● ● ●●●● ●● ●●●●●● ●●●●●● ●●●●●● ●● ●●●●●● ●●●● 0.5 ●●●● ●●●●●● ●●● ●●●● ●●●●●● ●●●● ●●●● ●●● ●● ● 1.0
1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 Expected Values Model Quantiles Under exponential(1) Intensity Frequency
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity
Poisson-GP estimation method
95% lower CI Estimate 95% upper CI λ 0.07 0.08 0.10 σ(u) 0.28 0.40 0.53 ξ -0.16 0.07 0.29
PP estimation method
95% lower CI Estimate 95% upper CI µ 2.06 2.53 3.01 σ 0.20 0.51 0.82 ξ -0.16 0.07 0.30
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
Peaks Over Threshold (POT): Modeling Frequency and Intensity
Verify that the two sets of parameter estimates are consistent ln λ = -(1 / ξ) ln(1 + ξ(u – µ) / σ ≈ 0.009
λ ≈ 29.93 (estimated on an annual maximum scale)
0.08 * 365.25 ≈ 29.22
σ(u) = σ + ξ(u – µ) ≈ 0.4
UCARUCAR ConfidentialConfidential andand Proprietary.Proprietary. ©© 2014,2008, UniversityUniversity CorporationCorporation forfor AtmosphericAtmospheric Research.Research. AllAll rightsrights reserved.reserved.
POT vs Block Maxima ● 4
3 ● ●
● ● 2 simulated data simulated
1 ● 0
0 20 40 60 80 100 120
UCAR Confidential and Proprietary. © 2008, University Corporation for Atmospheric Research. All rights reserved.
Conclusion of Part II: Questions?
UCAR Confidential and Proprietary. © 2008, University Corporation for Atmospheric Research. All rights reserved.