An overview of the extremal index

Nicholas Moloney, Davide Faranda, Yuzuru Sato

To cite this version:

Nicholas Moloney, Davide Faranda, Yuzuru Sato. An overview of the extremal index. Chaos: An Interdisciplinary Journal of Nonlinear Science, American Institute of Physics, 2019, 29 (2), pp.022101. 10.1063/1.5079656. hal-02334271

HAL Id: hal-02334271 https://hal.archives-ouvertes.fr/hal-02334271 Submitted on 28 Oct 2019

An overview of the extremal index

Nicholas R. Moloney,1, a) Davide Faranda,2, b) and Yuzuru Sato3, c)

1) Department of Mathematics and Statistics, University of Reading, Reading RG6 6AX, UK d)
2) Laboratoire des Sciences du Climat et de l'Environnement, UMR 8212 CEA-CNRS-UVSQ, IPSL, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
3) RIES/Department of Mathematics, Hokkaido University, Kita 20 Nishi 10, Kita-ku, Sapporo 001-0020, Japan

(Dated: 5 January 2019)

For a wide class of stationary time series, extreme value theory provides limiting distributions for rare events. The theory describes not only the size of extremes, but also how often they occur. In practice, it is often observed that extremes cluster in time. Such short-range clustering is also accommodated by extreme value theory via the so-called extremal index. This review provides an introduction to the extremal index by working through a number of its intuitive interpretations. Thus, depending on the context, the extremal index may represent (i) the loss of iid degrees of freedom, (ii) the multiplicity of a compound Poisson process, (iii) the inverse mean duration of extreme clusters. More recently, the extremal index has also been used to quantify (iv) recurrences around unstable fixed points in dynamical systems.

Whether extreme events occur in isolation or in clusters is an important question for their prediction and mitigation. Extreme value theory can accommodate clustering via the extremal index which, heuristically, measures the size of the cluster. Mathematically, clustering is most conveniently understood within the framework of point processes. Without clustering, extreme events occur in the manner of a Poisson process. With clustering, extreme events are bunched together in a compound Poisson process. In order to develop an intuition for the extremal index, we survey some simple examples from stochastic processes, real-world time series and dynamical systems.

I. INTRODUCTION

Human societies are perpetually exposed to natural hazards, and an understanding of the statistics of extreme events is vital to predicting and mitigating their effects51,53. Analogous to the central limit theorem for the sums of random variables, universal laws likewise exist for their maxima or minima, and extreme value theory now finds application in a broad range of areas including hydrology41, earth science48, finance46, meteorology and climate science1,13. For recent reviews of modeling frameworks for extremes, see31,36.

While extreme events are most often felt in terms of their impact, for the purposes of this review we will limit ourselves to the technically more tractable definition of an extreme as something rare or large in magnitude (defined more precisely in Sec. II). We have in mind a stationary or easily detrendable process generating weakly correlated random variables, which are available in reasonably large numbers. In the 1920s, Fisher and Tippett30 identified the possible limiting distributions for the rescaled maxima Mn = max{X1, ..., Xn} of a set of n independently and identically distributed (iid) random variables Xi. For parametric modeling, these generalised extreme value (GEV) cumulative distributions are most conveniently written in the form

    G_{ξ;µ,σ}(x) = exp{ −[1 + ξ(x − µ)/σ]^{−1/ξ} },    (1)

where µ and σ > 0 are location and scale parameters, respectively. The shape parameter ξ is determined by the tail behaviour of the parent distribution F(x) = P(X ≤ x)38. These GEV distributions are closely related to generalised Pareto distributions (GPD), which describe the distribution of exceedances over high thresholds. Taken together, the GEV and GPD form the basis of extreme value theory. Details may be found in numerous textbooks8,16,19,20,42. Importantly, extreme value theory covers non-iid series, provided they are stationary and weakly correlated. For example, for stationary Gaussian series, the Berman condition4,20 provides a particularly simple check for applicability in terms of the autocorrelation function. It states that if the autocorrelation rh at time-lag h satisfies rh log h → 0 as h → ∞, then (i) the distribution of block maxima follows Eq. 1 with ξ = 0, and (ii) extremes do not cluster.

Our focus here is on the clustering of extremes. It is often observed that the extremes of e.g. temperatures3, water levels21, wind speeds6 or financial time series47 cluster in time. In other words, they do not occur randomly as would be expected on the basis of a Poisson process. Extreme value theory can accommodate such clustering via the so-called extremal index, while leaving the shape of the GEV distribution in Eq. 1 unchanged (this is not to be confused with the 'extreme value index', which in some literature refers to what we here call the shape parameter).

a) Electronic mail: [email protected]
b) Electronic mail: [email protected]
c) Electronic mail: [email protected]
d) Also at London Mathematical Laboratory, 8 Margravine Gardens, London, UK
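Before turning to clustering, the iid block-maxima limit in Eq. (1) is easy to probe numerically. The sketch below is our own toy example (the distribution, block size and sample counts are arbitrary choices, not taken from the paper); it checks that block maxima of iid exponential variables, centred by log n, approach the Gumbel member of Eq. (1) (ξ = 0, µ = 0, σ = 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n, blocks = 500, 5000  # block size and number of blocks (arbitrary)

# Block maxima of iid Exp(1) variables; log(n) is the natural centring.
maxima = rng.exponential(size=(blocks, n)).max(axis=1) - np.log(n)

# Compare the empirical CDF of the centred maxima with the Gumbel
# limit exp(-exp(-x)), i.e. Eq. (1) with xi = 0, mu = 0, sigma = 1.
xs = np.linspace(-2.0, 4.0, 25)
ecdf = np.array([(maxima <= x).mean() for x in xs])
gumbel = np.exp(-np.exp(-xs))
err = np.abs(ecdf - gumbel).max()
print(f"max deviation from Gumbel CDF: {err:.3f}")
```

With these sizes the deviation is dominated by sampling noise and is typically of order 10^-2.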
The extremal index, denoted θ ∈ [0, 1], essentially quantifies the inverse of the mean cluster size, and thus has an appealing physical meaning. When modeling threshold-exceeding extremes with GPDs, the extremal index estimates the effective number of iid degrees of freedom, since all the exceedances in a cluster are highly correlated. Nevertheless, the clusters themselves do occur randomly in the manner of a Poisson process, and therefore the occurrence of exceedances overall can be described by a compound Poisson process. In this way, the extremal index fits naturally within the point process framework of extremes by quantifying the multiplicity of the underlying compound Poisson process. This perspective is particularly insightful when treating recurrences in chaotic dynamical systems: visits of a trajectory to rare sets on the attractor will cluster if they occur in the manner of a compound Poisson process. This is indeed the case if the rare set includes an unstable fixed point. The extremal index is thus a tool for probing local properties of the attractor. The aim of this review is to work through these various statistical and dynamical interpretations.

It is important to mention that there is no exclusive definition of clustering, and this review restricts its attention to clustering as measured by the extremal index. Other notions of clustering are also developed in the literature. For example, the index of dispersion (also known as the coefficient of variation or Fano factor) is a diagnostic used in point processes14. It measures the ratio of the variance of the number of points in an interval to the mean of the number of points in the same interval. For a Poisson process, this ratio is one. If the ratio is greater than one, the point process is said to be "over-dispersed". Visually, points appear to come in bursts. Clustering is also sometimes associated with the Hurst effect32, since this describes persistence in an underlying time series. The Hurst effect is typically exhibited by long-range autocorrelated time series. In such cases, authors report non-exponential waiting times between extremes for both synthetic and real-world time series7,18,50. Meanwhile, fractional Poisson processes are also capable of producing non-exponential waiting times, again interpreted as a possible mechanism for clustering5.

The review is structured as follows: in Sec. II we define the regime in which extreme value theory operates; Secs. III, IV, V build up an intuition for the extremal index in its traditional statistical setting. Sec. VI gives a brief overview of recent developments in dynamical systems. We conclude in Sec. VII, and describe one of the simplest estimators of the extremal index, together with the idea of declustering, in the Appendix.

II. POISSON APPROXIMATION TO EXTREMES

Apart from furnishing an estimate of the size of extreme events, the Poisson approximation makes clear how these events occur within the framework of a point process. The basic idea is that extreme events, by virtue of their rarity, occur in the manner of a Poisson process.

Consider a set of iid random variables {Xi}_{i=1}^n, and let F̄(x) = P(X > x) be the probability that the random variable X exceeds a value x. The number of exceedances nu over a threshold u from n trials with success probability F̄(u) is a random variable following a binomial distribution, Bin(n, F̄(u)). Therefore, the expected number of exceedances is nF̄(u).

To make the passage to extremes, suppose now that the threshold un is no longer fixed but allowed to increase with n in order to probe the tail of the distribution F. If un is chosen such that

    lim_{n→∞} n F̄(un) = τ,    (2)

for some constant τ, then it is well known that the binomial distribution converges to a Poisson distribution with intensity τ. Thus, in the extreme regime, the number of exceedances is a Poisson random variable. This in turn gives an approximation to the maximum Mn = max{X1, ..., Xn}, since if even the maximum fails to exceed the threshold, Mn ≤ un, then the number of exceedances nu = 0. But since nu is a Poisson random variable,

    P(Mn ≤ un) = P(nu = 0) = e^{−τ}.    (3)

This approximation can be further refined to recover the GEV distributions in Eq. (1) via a suitable choice of the intensity τ = τ(un)19. Note that the Poisson approximation applies not only to extremely high exceedances, but to any kind of rare event. The distinguishing feature is that recurrences to the rare set (i.e. recurrence of an extreme event) take place in the manner of a Poisson process.

To extend this approximation to stationary but non-iid sequences, a condition on the decay of correlations is required. In the literature this is usually denoted D(un), and we refer the reader to textbooks for its technical definition42. In words, it is a mixing condition requiring that well-separated maxima in different blocks of the sequence are asymptotically independent. Provided this condition is satisfied, the same family of GEV distributions describes the distribution of maxima as in the iid case. As before, a sequence of thresholds un satisfying Eq. (2) picks out the rare extremal events. But in the presence of correlations, instead of Eq. (3), we now have

    P(Mn ≤ un) = e^{−θτ},    (4)

for some θ ∈ [0, 1], which defines the so-called extremal index. The aim of this review is to provide some intuition for this quantity in terms of clustering.
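The passage from binomial to Poisson counts, and the resulting approximation in Eq. (3), can be illustrated directly. The sketch below uses arbitrary parameters of our own choosing (uniform variables, a fixed τ), not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n, tau, trials = 2000, 2.0, 3000

# For U ~ Uniform(0,1) the tail is Fbar(u) = 1 - u, so requiring
# n * Fbar(u_n) = tau (Eq. (2)) gives the threshold u_n = 1 - tau/n.
u_n = 1.0 - tau / n

samples = rng.random((trials, n))
counts = (samples > u_n).sum(axis=1)  # exceedances per series, ~ Poisson(tau)

p_below = (counts == 0).mean()  # empirical P(M_n <= u_n), cf. Eq. (3)
print(f"P(Mn <= un) = {p_below:.3f} vs e^-tau = {np.exp(-tau):.3f}")
print(f"mean number of exceedances = {counts.mean():.2f} (tau = {tau})")
```

The empirical probability of zero exceedances tracks e^(−τ), and the mean count tracks τ, as the Poisson approximation predicts.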

We conclude this section with some remarks. First, by comparison with Eq. (3), we see that for iid random variables θ = 1. Second, if the details of the sequence Xn are known, it may be possible to check explicitly whether θ < 1 via an anticlustering condition. In the literature, this is often denoted D′(un)42. Third, it is important to realize that clustering relies on sufficiently strong short-range correlations. Thus, clustering does not affect block maxima, which are assumed to be asymptotically independent for extreme value theory to work in the first place. It does, however, potentially affect a sequence of exceedances over a threshold: by virtue of short-range correlations, exceedances may cluster rather than appear as isolated events. From a modeling point of view, it is therefore important to check whether clustering is present in order to avoid biased fits to the generalised Pareto distribution.
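The effect described in the third remark can be made concrete with a toy moving-maximum series (our own example; for a moving maximum over a window of m iid variables, the extremal index is known to be θ = 1/m). A crude runs-type estimate of θ — the number of clusters of strictly consecutive exceedances divided by the total number of exceedances, in the spirit of the estimator sketched in the Appendix — distinguishes the clustered series from its iid driver:

```python
import numpy as np

def runs_theta(series, q=0.98):
    """Crude runs estimate of the extremal index: number of clusters of
    strictly consecutive exceedances / total number of exceedances."""
    u = np.quantile(series, q)
    exc = series > u
    starts = exc & ~np.concatenate(([False], exc[:-1]))  # cluster beginnings
    return starts.sum() / exc.sum()

rng = np.random.default_rng(2)
y = rng.pareto(1.0, size=200_000)                 # heavy-tailed iid driver
x = np.maximum.reduce([y[:-2], y[1:-1], y[2:]])   # moving maximum, m = 3

t_iid = runs_theta(y)
t_mm = runs_theta(x)
print(f"iid series:     theta ~ {t_iid:.2f}")  # close to 1
print(f"moving maximum: theta ~ {t_mm:.2f}")   # close to 1/3
```

Each large value of the driver produces a run of three consecutive exceedances in the moving-maximum series, so exceedances arrive in clusters of size about three and the estimate lands near 1/3.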

III. LOSS OF DEGREES OF FREEDOM

In this section we consider the extremal index in terms of a loss of effective iid degrees of freedom. In the presence of correlations, extremes may cluster in time such that extremes inside the cluster cannot reasonably be treated as iid. Clustering therefore results in a loss of effective iid degrees of freedom, and the extremal index gives a measure of the fraction of extremes that are approximately iid.

As a concrete example, consider a series Yi of iid random variables drawn from the distribution

    FY(y) = exp(−1/(2y)),    y ≥ 0,    (5)

and construct from it the series Xi according to

    Xi = max{Yi, Yi+1}.    (6)

Evidently, the series Xi is not iid but its marginal distribution can nevertheless be calculated:

    FX(x) = P(Xi ≤ x) = P(Yi ≤ x, Yi+1 ≤ x)    (7)
          = P(Yi ≤ x) P(Yi+1 ≤ x)    (8)
          = exp(−1/x).    (9)

For the sake of comparison, now introduce an iid series Xi* with exactly the same marginal distribution as Xi, i.e. FX*(x) = exp(−1/x). Representative sequences of Xi and Xi* are shown in Fig. 1. The dependent sequence Xi clearly shows clustering among large values, whereas the independent sequence Xi* of course does not.

FIG. 1. (a) Time series of the clustered process Xi = max{Yi, Yi+1} (Eq. (6)) and (b) an iid process having the same marginal distribution as Xi.

We can make this observation explicit by calculating the extreme value distribution for each sequence. For the marginal distribution in this example, the typical size of maxima grows linearly with n. Indeed, by direct calculation, the properly normalized limiting distribution of Mn* = max{X1*, ..., Xn*} is given by

    P(Mn* ≤ nx) = [FX*(nx)]^n    (10)
                = exp(−1/x),    (11)

which is an example of a Fréchet distribution. A similar calculation for the dependent sequence Xi gives

    P(Mn ≤ nx) = P(X1 ≤ nx, ..., Xn ≤ nx)    (12)
               = P(Y1 ≤ nx, Y2 ≤ nx, ..., Yn ≤ nx, Yn+1 ≤ nx)    (13)
               = [FY(nx)]^{n+1}    (14)
               → [exp(−1/x)]^{1/2}, as n → ∞.    (15)

The distributions of maxima for the two series are related via

    P(Mn ≤ nx) = [P(Mn* ≤ nx)]^θ    (16)
               = P(M*_{θn} ≤ nx),    (17)

with extremal index θ = 1/2. Thus, we have the interpretation that the maxima of the dependent series are distributed in the same way as those of the independent series, but with a series length that has been reduced from n to θn. In this example, half the effective iid degrees of freedom have been lost. This is to be expected from the definition of the process Xi, since values are correlated in pairs.

Turning now to a real-world example, let us consider a simplified extreme value analysis of the daily minimum temperatures from Wooster, Ohio — for a more careful treatment, see9. Fig. 2 shows the coldest 150 daily temperatures between 1983-1988. The temperature series

has been detrended (by subtracting a seasonal sinusoidal component) and flipped in sign so that minima become maxima — a conventional choice. For comparison, the inset shows the randomly shuffled series of the same exceedances. Clustering is visually apparent in the original series. Indeed, we find θ = 0.54 for the detrended series, and θ = 0.89 for the shuffled series. This latter value is apparently less than one. We discuss this discrepancy as a finite-size effect in Sec. V.

FIG. 2. Detrended coldest 150 days in Wooster, from 1983-1988. The inset shows a randomly shuffled version of the same data.

Since unclustered exceedances arrive in the manner of a Poisson process, they are separated by exponential waiting times. A diagnostic for comparing empirical data with a model distribution is to plot empirical versus model quantiles for a given probability in a quantile plot, see Fig. 3. Random variables drawn from an exponential distribution would scatter along the diagonal identity line. The original data (black points), however, deviate from the diagonal line for both small and large quantiles. To demonstrate that this is due to clustering, we pass the data through a declustering algorithm (described in the Appendix), which identifies clusters of exceedances and replaces them by a single representative exceedance for each cluster. In this way, remaining exceedances may be treated as effectively iid. In Fig. 3, the declustered data (red points) lie much closer to the diagonal identity line. Declustering is often employed as a preprocessing step before fitting exceedances to e.g. GPDs.

FIG. 3. Quantile plot for waiting times between exceedances, comparing empirical and exponential model quantiles for the original (black) and declustered (red) series.

IV. COMPOUND POINT PROCESS

We now consider our second interpretation of the extremal index in terms of compound point processes. A compound point process can be thought of as a point process in which points carry marks. For example, the mark might be the size of a claim made against an insurance company, or the size of a group of people arriving at a restaurant. In the present context, the mark corresponds to the weight of clusters in a mixture distribution, as we further develop here.

Adopting the language of statistical physics, it is instructive to examine extreme exceedances under the renormalization of an underlying point process11,12. This consists of two steps: a thinning step that knocks out points that do not exceed the threshold, and a time rescaling step that restores the original intensity of the points. These two steps constitute a single renormalization operation. Applying this operation many times is akin to inspecting the process at long timescales. The question, first studied by Rényi49, is whether the sequence of renormalized point processes converges towards some fixed point. We answer this question in the following for iid random variables, before considering a non-random thinning procedure which gives rise to clustering.

Given that we are particularly interested in the manner of arrivals of exceedances, we will track the evolution of the probability density of waiting times between exceedances under renormalization. If we denote this by f(t), then the waiting time density fp(t) between points that exceed the threshold with probability p is given by convolution. Specifically, we convolve over points separated by any number of intermediate waiting times, weighted appropriately. An example of a term that appears in this convolution is shown in Fig. 4, in which two exceedances are separated by a time which is the sum of three independent waiting times. Summing over all weighted possibilities, we have

    fp(t) = p f(t1) + pq f(t1 + t2) + pq² f(t1 + t2 + t3) + ··· ,    (18)

where q = 1 − p. Since the right-hand side is made up of a convolution of waiting times, it is convenient to introduce the Laplace transform

    f̂(s) = ∫₀^∞ dt e^{−st} f(t),    (19)

which, applied to Eq. (18), yields

    f̂p(s) = p [ f̂(s) + q f̂²(s) + q² f̂³(s) + ··· ]    (20)
          = p f̂(s) / (1 − q f̂(s)).    (21)

FIG. 4. Example term from the convolution sum in Eq. 18. The exceedances are separated by the sum of three independent waiting times, t1 + t2 + t3. The weight of this contribution is q²p.

As a result of thresholding, only a fraction p of the original points are retained. Consequently, the intensity of the process decreases. The second step of the renormalization procedure rescales time, t → t/p, in order to restore the original intensity of the process. In Laplace space, this corresponds to the rescaling s → ps. Thus, the combined steps of thresholding and rescaling time lead to

    f̂p(s) = p f̂(ps) / (1 − q f̂(ps)),    (22)

which defines a fixed point equation for the renormalized waiting time density. As can be checked by substitution, this functional equation is satisfied by

    f̂p(s) = 1 / (1 + s/λ),    (23)

where λ is a constant. It can be shown that this fixed point is attracting, such that upon repeated application of the renormalization procedure the waiting time density flows towards the form described in Eq. (23). Inverting Eq. (23), we arrive at

    fp(t) = λ e^{−λt},    (24)

which is the familiar exponential waiting time density for a Poisson point process of intensity λ. Although we assumed iid random variables at the outset, the above result is more general: short-range correlations that may be present in the original point process do not survive renormalization. A more mathematically rigorous treatment may be found in40,56.

It would appear, then, that the Poisson point process is the attractor for renormalized high exceedances, and that there is no clustering among extremes. Indeed, this is true for a thinning procedure that knocks out points at random. For a thinning procedure that produces clustering, we follow Isham39 and introduce Markovian thinning. That is, we retain or delete points according to the state of a two-state Markov process. Let us label these states 1 (for retain) and 0 (for delete), and define transition probabilities α1 for remaining in state 1, and 1 − α0 for remaining in state 0 (see Fig. 5). We can tune these rates so as to retain a fraction p of points by requiring that

    p = α1 p + α0 (1 − p),    (25)

which is essentially a stationarity condition imposed on the Markov process.

FIG. 5. Two states of a Markov process that determine whether to delete (0) or retain (1) points. The transition probabilities α0 and α1 are tuned so that a fraction p of points are retained.

Each time a point is visited, the internal state of the Markov process is updated according to the transition probabilities satisfying Eq. (25). With this thinning protocol set up, we can again examine the evolution of the renormalized waiting time density fp(t). The full details are given in39, but here we just cite the main result that the fixed point equation now reads

    f̂p(s) = [ α1 f̂(ps) + (α0 − α1) f̂²(ps) ] / [ 1 − (1 − α0) f̂(ps) ].    (26)

This more complicated functional equation requires an expansion in small p, which corresponds to retaining a vanishing number of points and is the appropriate limit for inspecting the renormalized point process. For p → 0 we obtain

    lim_{p→0} f̂p(s) = α1 + (1 − α1) · 1 / (1 + s/(λ(1 − α1))),    (27)

which, after inverting, yields

    fp(t) = α1 δ(t) + (1 − α1) [ λ(1 − α1) e^{−λ(1−α1)t} ].    (28)

The second term on the right-hand side is the usual exponential waiting time for a Poisson point process. The first term represents a Dirac point mass at zero waiting time. This term corresponds to the clustering of points which, under renormalization, have become squashed together. Thus, the Markovian thinning procedure results in a mixture distribution for the renormalized waiting time density: clusters of (compounded) points are separated by exponential waiting times. The weight of these clusters in the mixture distribution is given by α1. If we adopt the interpretation of Sec. III, then only a fraction 1 − α1 of exceedances are effectively iid, and so heuristically we have that θ = 1 − α1. It can be shown more generally that when exceedances cluster, a mixture distribution of the same form as Eq. (28) describes their intervals29.

In summary, a compound Poisson process can be thought of as a generalization of a Poisson process. In particular, points are still separated by exponential waiting times. However, a random variable is now assigned to each point (independent of the waiting times) which represents the size of a cluster. Under renormalization, clusters occur instantaneously, giving rise to a Dirac point mass of zero waiting time (see Eq. (28)). In practice, clusters typically appear as a localised train of successive exceedances. Thus, the cluster size — previously assigned to a point — now represents the length of this train.

V. INVERSE MEAN CLUSTER SIZE

Our next interpretation of the extremal index relates clustering to the time spent above a threshold (also known as the sojourn time). In the case of the process described by Eq. (6), the fact that exceedances appear in clusters of size two is directly related to the halving of the effective iid degrees of freedom, θ = 1/2. Indeed, the inverse of the extremal index θ⁻¹ is approximately the mean cluster size or, equivalently, the mean time spent above threshold. Note that a cluster need not consist of strictly consecutive exceedances, but to simplify calculations we will assume this is the case in the following.

For iid random variables, it is possible by chance to have a consecutive sequence of threshold exceedances. But the higher the threshold, the more probable it is that exceedances occur as isolated events. A calculation of the mean sojourn time confirms this: an exceedance begins a cluster of size k if the subsequent k − 1 values exceed the threshold, and the kth value does not. This event occurs with probability p^{k−1} q. Summing over all possibilities, the average cluster size is

    1/θ ≈ 1·q + 2·pq + 3·p²q + ··· = 1/(1 − p),    (29)

which tends to one (i.e. no clustering) in the limit of small p. For clustering to persist in the limit of high thresholds, exceedances must be strongly correlated. It can be shown, for example, that clustering does not occur in a Gaussian sequence with exponential or even power-law decaying autocorrelations19.

Nevertheless, in practice empirically estimated extremal indices may indicate clustering for some series even if it is known theoretically that θ = 1. This is because arbitrarily high thresholds cannot be applied to series of finite length. It is instructive to work through the details of this finite-size effect for a Gaussian, exponentially correlated time series, as given by the AR(1) process

    Xn = φ Xn−1 + √(1 − φ²) Zn,    (30)

where |φ| < 1 is the autoregressive parameter and Zn is standard Gaussian white noise (zero mean, unit standard deviation). The prefactor √(1 − φ²) in front of the noise term is purely for standardization and guarantees that the overall stationary marginal distribution of Xn is Gaussian with unit standard deviation.

For the purpose of calculating sojourn times, it is convenient to work with the transition kernel W(x_{k+1}|x_k) which evolves the distribution of X_k at step k into the distribution of X_{k+1} at step k + 1. For the AR(1) process driven by Gaussian noise, this is given by

    W(x_{k+1}|x_k) = (1/√(2π(1 − φ²))) exp[ −(x_{k+1} − φx_k)² / (2(1 − φ²)) ].    (31)

Given the homogeneous Markovian property of the AR(1) process, a sequence of transitions can be chained together to evolve an AR(1) trajectory starting at x0 according to

    W(x_k|x_0) = ∫_C dx_1 ... dx_{k−1} W(x_k|x_{k−1}) ... W(x_1|x_0),    (32)

where the multidimensional integrals over intermediate points x_1, ..., x_{k−1} are subject to path constraints C, such as the process staying within some interval. If this interval is denoted by (a, b), then the probability that the trajectory remains within the interval in the preceding k steps is

    P_k = ∫_a^b dx_1 ... dx_{k−1} W(x_k|x_{k−1}) ... W(x_1|x_0).    (33)

Multidimensional integrals of this type are known as orthant probabilities, and a number of computational schemes have been developed for their numerical calculation2,15,17. However, for present purposes it is sufficient to numerically compute Eq. (33) using Mathematica, since after a handful of steps k very little probability remains in the interval.

Large or small extremes may be considered by taking b → ∞ or a → −∞, in which case the sojourns either exceed level a or fall below level b, respectively. Here, instead, we will consider small intervals (a, b) which distinguish extreme events by virtue of their rarity — such rare sets are of particular interest in dynamical systems, as discussed in Sec. VI. Thus, the sojourns we will consider are confined to a small interval (a, b), in which the trajectory typically exits soon after entering, as illustrated in Fig. 6. P_k decreases with every step since fewer trajectories remain within the interval. The change in probability leaving the interval thus gives the sojourn time f_k via

    f_k = P_{k−1} − P_k,    (34)

with initial condition P_0 = 1.

FIG. 6. (Left) Trajectory of an AR(1) process evolving in time, of which a portion sojourns inside the interval (a, b), defining a cluster of size 3. (Right) The same trajectory, with the marginal density of the AR(1) process overlaid.

Thus, by numerically computing the multidimensional integral in Eq. (33) and substituting into Eq. (34), we use the sojourn time probabilities f_k to approximate the extremal index in terms of the mean cluster size via

    1/θ ≈ Σ_{k=1}^∞ k f_k.    (35)

Since the probability of long sojourns decays rapidly with k, mean sojourn times in small intervals are well estimated by retaining only the first handful of terms in the above sum.

In Figs. 7 and 8 we show how the theoretical no-clustering limit is approached when φ and q are changed. Note that the estimated extremal index depends on the location x_0 of the interval. This is because intervals further away from the centre of the distribution (where the density is highest) must widen in order to enclose the same integrated area 1 − q. A wider interval in turn leads to a longer sojourn time, and hence more clustering.

FIG. 7. Estimated extremal index (points), together with inverse mean cluster size based on Eq. (35) (solid lines), for an ensemble of 100 AR(1) time series of length 10^7, with φ = 0.25, 0.5, 0.75 (top-to-bottom) and q = 0.99.

FIG. 8. Estimated extremal index (points), together with inverse mean cluster size based on Eq. (35) (solid lines), for an ensemble of 100 AR(1) time series of length 10^7, with φ = 0.5 and q = 0.95, 0.99, 0.995 (bottom-to-top).

The characteristic decay time of the exponential memory of the AR(1) process decreases with decreasing φ (vanishing in the white noise limit φ → 0). Thus, clustering too should likewise decrease. This is borne out in Fig. 7.
Estimated extremal indices (points) approach θ = 1 with decreasing φ, together with the theoretically predicted curves based on Eq. (35). Finally, increasing q (i.e. decreasing the size of sojourn intervals) also results in the same trend towards no clustering, as shown in Fig. 8. In such numerical experiments, time series with 10^7 points are easily obtainable. But for real-world time series very high quantiles are often unrealistic, and significant finite-size effects may be present. Curves such as those in Figs. 7 and 8 are therefore useful for discerning possible clustering in an underlying time series, beyond that given by finite-size artefacts24.

It is also worth mentioning that, in practice, empirical values of the extremal index will depend on the temporal resolution of the time series. Implicitly, we have measured the size of clusters in units of the cadence of the (discrete) time series. If an observable is poorly resolved in time, then clusters may likewise be poorly resolved. Typically, this will tend to increase the empirical value of the extremal index.

VI. EXTREMAL INDEX IN DYNAMICAL SYSTEMS

A. Background

We have emphasized thus far that extreme value theory is underpinned by rare events within the framework of point processes. Put differently, a random variable visits a rare set in state space if its arrivals take place in the manner of a (possibly compound) Poisson process. Rarity, here, is expressed by the fact that the mean number of exceedances over a threshold un is fixed to some intensity τ, i.e. nF̄(un) → τ.

These ideas fit naturally within dynamical systems satisfying Poincaré's recurrence theorem, since visits by orbits to rare sets typically take place after exponential waiting times with an intensity given by Kac's lemma35. Strikingly, extreme value theory is able to provide local dynamical information about these rare sets, such as their local dimension and 'stickiness'. The local dimension, also known as the pointwise dimension57, gives the dimension at a point on an attractor. It characterizes the unpredictability of the dynamics at that point based on the number of expanding directions. The local dimension d(ζ)44 is in fact estimated by the inverse of the scaling parameter σ of the GEV distribution43. This quantity, averaged over all points ζ on the attractor, provides an alternative estimate of the attractor dimension26.

Here, instead, we focus on 'stickiness', which is the inverse of the mean sojourn time near a point on an attractor. Stickiness can therefore be estimated by the extremal index θ. The main result is that generic points on the attractor do not admit clustering, so that recurrences to small balls around these points are Poisson. But at (periodic) unstable fixed points, recurrences cluster and are compound Poisson. We give a brief overview of this result in the following. For rigorous proofs, a recent book43 provides a detailed survey.

Consider a deterministic dynamical system T which evolves an initial condition x0 according to x_{n+1} = T(x_n), so that x_n = T^n(x_0) (our discussion can be generalized to flows). Assume that this map has a unique and compact attractor in phase space. Extreme value theory is applied to the hitting time statistics of an ε-ball B_ε(ζ) centered on a point x = ζ on the attractor:

    P(T^n(x_0) ∈ B_ε(ζ)).    (36)

To make the connection to stationary stochastic series, we construct the random variables X_n = g(T^n(x_0), ζ), where the observable g takes its global maximum at the point ζ. The stochasticity in this series comes from the sampling of the initial condition x_0 from the invariant density of the map. A convenient choice for g is

    X_n = −log(dist(T^n(x_0), ζ)),    (37)

where dist is any metric (e.g. Euclidean) between x_n and ζ. By construction, iterates T^n(x_0) that land close to ζ will be associated with observables X_n with large positive values. Furthermore, it can be shown that taking the logarithm of the distance fixes the maxima of X_n to the Gumbel class10. This result hinges on the asymptotic independence of maxima. To this end, a mixing condition very similar to that used in stochastic processes can be developed for dynamical systems33. Assuming this condition holds, an estimate for the maximum M_n takes the familiar form (see Sec. II)

    P(T^n(x_0) ∈ B_{εn}(ζ)) = P(M_n ≤ u_n; ζ) = e^{−θτ}.    (38)

Here, we focus our attention on the extremal index θ appearing in Eq. (38). Freitas-Freitas-Todd34 have proved that, if ζ is an unstable periodic point with prime period k, then

    θ(ζ) = 1 − |det DT^{−k}|_s(ζ)|,    (39)

where DT^{−k}|_s is the derivative of the kth inverse of T restricted to the contracting subspaces. For all other points ζ on the attractor, θ(ζ) = 1. Put differently, if the ball B_{εn}(ζ) is centered on an unstable periodic point, recurrences to that ball arrive in the manner of a compound Poisson process. Otherwise, recurrences are Poisson. For applications, Eq. (39) implies that 'sticky' regions of the dynamics can be identified by measuring the extremal index at various points on the attractor and checking for θ < 1. Faranda et al.26 have used this method to characterize atmospheric flows.

B. Examples

We now illustrate how Eq. (39) can be used to calculate the extremal index for some simple chaotic maps.

1. Uniformly chaotic maps

The Bernoulli map is given by

    x_{n+1} = f(x_n) = a x_n (mod 1),    (40)

where a > 1, and x ∈ [0, 1]. For almost all initial conditions, the map generates uniformly chaotic dynamics. If we take, say, a = 3, then three unstable fixed points are located at x = 0, x = 1/2 and x = 1. By Eq. (39), the extremal index at these points is

    θ(0) = θ(1/2) = θ(1) = 1 − |Df^{−1}(0)|    (41)
                         = 1 − 1/3 = 2/3.    (42)

For all aperiodic points, θ(x) = 1, since clustering can only occur at unstable fixed points by the Freitas-Freitas-Todd theorem.

2.
Intermittent maps In the dynamical systems context, ǫn plays the rˆole of −un a threshold and is related to un via ǫn = e : as the length n orbit increases, the radius of the ǫ-balls shrink, The Pomeau-Manneville map, a toy model for inter- thereby maintaining a constant intensity τ of recurrences. mittency in turbulent fluids, is given by The sequence un will contain information about the lo- α−1 α 1 xn + 2 xn x ∈ [0, 2 ) cation and scale parameters of the xn+1 = f(xn)= 1 , (43) 23 2xn − 1 x ∈ [ 2 , 1] fitted to Mn . In general, these will need to be esti-  mated. It can be shown that the inverse scale parameter where α > 1, and x ∈ [0, 1]. For stationarity, we require σ−1 of this fit is in fact an estimate of the local Hausdorff that α < 2. Two unstable fixed points are located at 9 x = 0 and x = 1. The vicinity of the neutrally stable 1 fixed point x = 0 is associated with a sticky region of the 0.4 ‘laminar’ phase. At the neutrally stable fixed point at 0.3 x = 0, we have the extremal index 0.9 0.2 − 1 0.8 θ(0) = 1 −|Df (0)| = 1 − 1 = 0., (44) 0.1

y 0 which implies a diverging mean sojourn time, consistent 0.7 with the trapping of orbits for anomalous amounts of -0.1 time. At the unstable fixed point at x = 1, 0.6 -0.2 1 1 θ(1) = 1 −|Df −1(1)| = 1 − = . (45) 2 2 -0.3 0.5

-0.4 For all aperiodic points, θ(x)=1. -1.5 -1 -0.5 0 0.5 1 1.5 x
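These predictions are straightforward to probe numerically. The sketch below (a minimal illustration, assuming NumPy; the function names are ours) iterates the Bernoulli map with a = 3, forms the observable of Eq. (37) at the fixed point ζ = 1/2 and at a generic aperiodic point, and applies the runs estimator of Appendix A with run length rc = 1. The first estimate should fluctuate around the theoretical value θ = 2/3 of Eq. (42), while the second should sit at θ ≈ 1.

```python
import numpy as np

def bernoulli_orbit(x0, a=3.0, n=10**6):
    """Iterate the Bernoulli map x -> a*x (mod 1), Eq. (40)."""
    x = np.empty(n)
    xi = x0
    for i in range(n):
        x[i] = xi
        xi = (a * xi) % 1.0
    return x

def runs_extremal_index(X, q=0.999, rc=1):
    """Runs estimator, Eq. (A1): number of clusters / number of exceedances.
    A cluster terminates once the series stays below threshold for rc steps."""
    u = np.quantile(X, q)
    exc = np.flatnonzero(X > u)         # indices of exceedances
    gaps = np.diff(exc)
    n_clusters = 1 + np.count_nonzero(gaps > rc)
    return n_clusters / exc.size

rng = np.random.default_rng(1)
x = bernoulli_orbit(rng.random(), a=3.0)

X_fixed = -np.log(np.abs(x - 0.5))                 # observable at the fixed point 1/2
X_aper = -np.log(np.abs(x - (np.sqrt(2) - 1)))     # observable at a generic point

# theory: theta = 2/3 at the fixed point, theta = 1 at a generic point
print(runs_extremal_index(X_fixed), runs_extremal_index(X_aper))
```

Near the fixed point the distance to ζ = 1/2 triples each step, so a fraction 1/3 of exceedances are followed by another, giving a mean cluster size of 3/2 and hence θ = 2/3, as in Eq. (42).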

3. Hénon map

FIG. 9. The extremal index for different points on the Hénon attractor. For generic points, θ = 1 (no clustering). But at the unstable fixed point (0.631354, 0.189406), θ ≈ 0.48. These theoretical values are well recovered via numerical estimates, as indicated by the color scale.

The Hénon map is a generic two-dimensional invertible map, originally inspired by three-body Hamiltonian dynamics. It takes the form

(xn+1, yn+1) = f(xn, yn) = (1 − a xn^2 + yn, b xn), (46)

where a and b are parameters. The dissipation rate b ∈ [0, 1] interpolates between one-dimensional logistic dynamics for b = 0, and Hamiltonian dynamics for b = 1. With standard parameter choices a = 1.4 and b = 0.3, the extremal index θ(x*, y*) at the unstable fixed point (x*, y*) ≈ (0.631354, 0.189406) is given by

θ(x*, y*) = 1 − |Df^{−1}|_s(x*, y*)| = 1 − |λ−| ≈ 0.48, (47)

where λ− ≈ −0.52 is the eigenvalue of the Jacobian matrix Df^{−1}(x*, y*) whose absolute value is less than 1, corresponding to the contracting direction. For aperiodic points, θ(x, y) = 1. Fig. 9 illustrates how a scan of the numerically estimated extremal index across the attractor picks out the unstable fixed point, based on the clustering of recurrences.

VII. CONCLUSION

The extremal index is one of a number of quantities that measure the clustering of extremes in stationary time series. If θ = 1, extremes do not cluster and arrive in the manner of a Poisson process; if θ < 1, extremes cluster and arrive in the manner of a compound Poisson process. In this review we have developed some intuition for this quantity in terms of the (i) loss of effective degrees of freedom, (ii) multiplicity of a compound Poisson point process, (iii) sojourn time inside a rare set, and the (iv) stickiness of unstable fixed points in dynamical systems.

The compound Poisson process that underpins the extremal index is asymptotic, in the sense of Eq. (2). In practice, it is not possible to apply arbitrarily high thresholds to series, and therefore one has to be alert to possible convergence issues, as explored in Sec. V for an AR(1) process. In future, it would be interesting to reconcile the implication of Berman's condition with numerical studies on Gaussian long-range dependent time series18, where non-exponential waiting times between extremes are observed.

Although the extremal index is widely used as a statistical parameter in the modeling of extremes, we hope we have convinced the reader that it is an interesting quantity in its own right. For example, in the dynamical systems context, extreme value theory gives access to local and global properties of the attractor. In this review, we have considered recurrences between trajectories and points in phase space45. Instead, by considering recurrences between pairs of trajectories, it is possible to infer global dynamical quantities such as the correlation dimension and Lyapunov exponent27. For systems with a space-time structure, the extremal index is related to synchronization of the dynamics22. All such dynamical quantities can be extracted by fits to universal extreme value distributions and are therefore readily available empirically23. This versatility is reflected in a large number of applications to geophysical flows, where local dimensions and persistence are associated with typical atmospheric configurations25,26, and extreme events37 or dynamical shifts in turbulent boundary layers55 can be detected.

ACKNOWLEDGMENTS

NRM would like to thank Nicholas Watkins, Álvaro Corral and Víctor Navas-Portella for useful discussions.
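The fixed point and contracting eigenvalue entering Eq. (47) are easily reproduced in a few lines. The following is a minimal sketch (assuming NumPy; the variable names are ours): the fixed point solves a x^2 + (1 − b)x − 1 = 0, and the eigenvalue λ− of Df^{−1} with |λ−| < 1 is the reciprocal of the expanding eigenvalue of the forward Jacobian.

```python
import numpy as np

a, b = 1.4, 0.3

# Fixed point of (x, y) -> (1 - a*x^2 + y, b*x): with y* = b*x*,
# x* solves a*x^2 + (1 - b)*x - 1 = 0 (positive root).
xs = (-(1 - b) + np.sqrt((1 - b) ** 2 + 4 * a)) / (2 * a)
ys = b * xs

# Jacobian of the forward map, Eq. (46), at the fixed point.
J = np.array([[-2 * a * xs, 1.0],
              [b, 0.0]])
lam = np.linalg.eigvals(J)
lam_u = lam[np.argmax(np.abs(lam))]   # expanding eigenvalue, |lam_u| > 1

# Eq. (47): the contracting eigenvalue of Df^{-1} is 1/lam_u,
# so theta = 1 - 1/|lam_u|.
theta = 1.0 - 1.0 / np.abs(lam_u)
print(xs, ys, theta)
```

The expanding eigenvalue is λu ≈ −1.924, whose reciprocal ≈ −0.52 is the contracting eigenvalue λ− of Df^{−1}, giving θ ≈ 0.48 as reported in Fig. 9.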

FIG. 10. Declustering of exceedances. (Top) Clusters are identified as distinct if they are separated by more than a run length rc. (Bottom) Each cluster is reduced to a single exceedance.

Appendix A: Statistical estimation of the extremal index and declustering

Here we describe the 'runs' method52, one of the simplest ways of estimating the extremal index. It is computed by dividing the number of clusters, nc, by the number of exceedances nu:

θ̂ = nc/nu. (A1)

If all clusters are composed solely of single exceedances, then this ratio is one, consistent with the absence of clustering. The runs method requires a definition of a cluster, which introduces a degree of arbitrariness. Pragmatically, a cluster is defined as terminated when the series has stayed below threshold for at least rc steps. The extremal index can be recomputed for various choices of the run length rc to check for robustness19.

Having defined a cluster, it is possible to decluster exceedances by replacing each cluster with one of its exceedances, see Fig. 10. In this way, the short-range correlated exceedances within a cluster are replaced by a single effective degree of freedom. It typically does not matter which exceedance is retained. For example, the largest or middle exceedance of a cluster is a reasonable choice.

This declustering procedure has been applied to exceedances in the Wooster temperature series in Sec. III. An indication of the procedure's success is shown in Fig. 3, since the declustered exceedances are much closer to exponential model quantiles than the original exceedances.

There are other estimators of the extremal index which do not require the specification of a run length29,54. In the experience of the authors, these tend to be more reliable for applications. For a review, see Ref. 28.

1 Barriopedro, D., Fischer, E. M., Luterbacher, J., Trigo, R. M., and García-Herrera, R. (2011). The hot summer of 2010: Redrawing the temperature record map of Europe. Science, 332(6026):220–224.
2 Basak, G. K. and Ho, K. R. (2004). Level-crossing probabilities and first-passage times for linear processes. Adv. Appl. Probab., 36:643–666.
3 Beirlant, J., Goegebeur, Y., Segers, J., and Teugels, J. (2004). Statistics of extremes: theory and applications. John Wiley & Sons Inc.
4 Berman, S. M. (1964). Limit theorems for the maximum term in stationary sequences. Ann. Math. Statist., 35:502–516.
5 Blender, R., Raible, C. C., and Lunkeit, F. (2015). Non-exponential return time distributions for vorticity extremes explained by fractional Poisson processes. Q. J. R. Meteorol. Soc., 141:249–257.
6 Brabson, B. B. and Palutikof, J. P. (2000). Tests of the Generalized Pareto Distribution for Predicting Extreme Wind Speeds. J. Appl. Meteorol., 39:1627–1640.
7 Bunde, A., Eichner, J. F., Kantelhardt, J. W., and Havlin, S. (2005). Long-Term Memory: A Natural Mechanism for the Clustering of Extreme Events and Anomalous Residual Times in Climate Records. Phys. Rev. Lett., 94:048701.
8 Coles, S. (2001). An introduction to statistical modeling of extreme values. Springer-Verlag, London.
9 Coles, S. G., Tawn, J. A., and Smith, R. L. (1994). A seasonal Markov model for extremely low temperatures. Environmetrics, 5:221–239.
10 Collet, P. (2001). Statistics of closest return for some non-uniformly hyperbolic systems. Ergod. Theory Dyn. Syst., 21(2):401–420.
11 Corral, A. (2009). Point-occurrence self-similarity in crackling-noise systems and in other complex systems. J. Stat. Mech., page P01022.
12 Corral, A. (2015). Scaling in the timing of extreme events. Chaos Soliton. Fract., 74:99–112.
13 Coumou, D. and Rahmstorf, S. (2012). A decade of weather extremes. Nature Clim. Change, 2(7):491–496.
14 Cox, D. R. and Isham, V. (1980). Point Processes. Chapman and Hall, London and New York.
15 Craig, P. (2008). A new reconstruction of multivariate normal orthant probabilities. J. R. Statist. Soc. B, 70:227–243.
16 de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer, New York.
17 Di Nardo, E. (2008). On the first passage time problem for linear processes. Scientiae Math. Japonica, 21:61–76.
18 Eichner, J. F., Kantelhardt, J. W., Bunde, A., and Havlin, S. (2007). Statistics of return intervals in long-term correlated records. Phys. Rev. E, 75:011128.
19 Embrechts, P., Klüppelberg, C., and Mikosch, T. (1997). Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin Heidelberg.
20 Falk, M., Hüsler, J., and Reiss, R.-D. (2004). Laws of Small Numbers: Extremes and Rare Events. Birkhäuser Verlag, Basel, Boston, Berlin, 2nd edition.
21 Fang, B., Guo, S., Wang, S., Liu, P., and Xiao, Y. (2007). Non-identical models for seasonal flood frequency analysis. Hydrolog. Sci. J., 52:974–991.
22 Faranda, D., Ghoudi, H., Guiraud, P., and Vaienti, S. (2018). Extreme value theory for synchronization of coupled map lattices. Nonlinearity, 31(7):3326.
23 Faranda, D., Lucarini, V., Turchetti, G., and Vaienti, S. (2011). Numerical convergence of the block-maxima approach to the generalized extreme value distribution. J. Stat. Phys., 145(5):1156–1180.
24 Faranda, D., Masato, G., Moloney, N., Sato, Y., Daviaud, F., Dubrulle, B., and Yiou, P. (2016). The switching between zonal and blocked mid-latitude atmospheric circulation: a dynamical system perspective. Clim. Dyn., 47:1587.
25 Faranda, D., Messori, G., Alvarez-Castro, M. C., and Yiou, P. (2017a). Dynamical properties and extremes of northern hemisphere climate fields over the past 60 years. Nonlinear Proc. Geophy., 24(4):713–725.
26 Faranda, D., Messori, G., and Yiou, P. (2017b). Dynamical proxies of North Atlantic predictability and extremes. Sci. Rep., 7:41278.
27 Faranda, D. and Vaienti, S. (2018). Correlation dimension and phase space contraction via extreme value theory. Chaos, 28(4):041103.
28 Ferreira, M. (2018). Heuristic tools for the estimation of the extremal index: a comparison of methods. RevStat, 16:115–136.
29 Ferro, C. A. T. and Segers, J. (2003). Inference for clusters of extreme values. J. R. Statist. Soc. B, 65:545–556.
30 Fisher, R. A. and Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Cambridge Philos. Soc., 24:180–190.
31 Franzke, C. L. E. (2017). Extremes in dynamic-stochastic systems. Chaos, 27:012101.
32 Franzke, C. L. E., Osprey, S. M., Davini, P., and Watkins, N. W. (2015). A Dynamical Systems Explanation of the Hurst Effect and Atmospheric Low-Frequency Variability. Sci. Rep., 5:9068.
33 Freitas, A. C. M., Freitas, J. M., and Todd, M. (2010). Hitting time statistics and extreme value theory. Probab. Theory Relat. Fields, 147(3):675–710.
34 Freitas, A. C. M., Freitas, J. M., and Todd, M. (2012). The extremal index, hitting time statistics and periodicity. Adv. Math., 231(5):2626–2665.
35 Freitas, J. M. (2013). Extremal behaviour of chaotic dynamics. Dynam. Syst., 28:302–332.
36 Ghil, M., Yiou, P., Hallegatte, S., Malamud, B. D., Naveau, P., Soloviev, A., Friederichs, P., Keilis-Borok, V., Kondrashov, D., Kossobokov, V., Mestre, O., Nicolis, C., Rust, H. W., Shebalin, P., Vrac, M., Witt, A., and Zaliapin, I. (2011). Extreme events: dynamics, statistics and prediction. Nonlinear Proc. Geophy., 18(3):295–350.
37 Giamalaki, K., Beaulieu, C., Faranda, D., Henson, S., Josey, S., and Martin, A. (2018). Signatures of the 1976-77 regime shift in the North Pacific revealed by statistical analysis. J. Geophys. Res.-Oceans.
38 Gnedenko, B. V. (1943). Sur la distribution limite du terme maximum d'une série aléatoire. Ann. Math., 44:423–453.
39 Isham, V. (1980). Dependent thinning of point processes. J. Appl. Prob., 17:987–995.
40 Kallenberg, O. (2002). Foundations of Modern Probability. Springer-Verlag, New York Berlin Heidelberg, 2nd edition.
41 Katz, R. W., Parlange, M. B., and Naveau, P. (2002). Statistics of extremes in hydrology. Adv. Water Resour., 25:1287–1304.
42 Leadbetter, M. R., Lindgren, G., and Rootzén, H. (1983). Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, New York, Heidelberg, Berlin.
43 Lucarini, V., Faranda, D., de Freitas, J. M. M., Holland, M., Kuna, T., Nicol, M., Todd, M., Vaienti, S., et al. (2016). Extremes and Recurrence in Dynamical Systems. John Wiley & Sons.
44 Lucarini, V., Faranda, D., Turchetti, G., and Vaienti, S. (2012a). Extreme value theory for singular measures. Chaos, 22(2):023135.
45 Lucarini, V., Faranda, D., and Wouters, J. (2012b). Universal behaviour of extreme value statistics for selected observables of dynamical systems. J. Stat. Phys., 147(1):63–73.
46 Malevergne, Y., Pisarenko, V., and Sornette, D. (2006). On the power of generalized extreme value (GEV) and generalized Pareto distribution (GPD) estimators for empirical distributions of stock returns. Appl. Fin. Econ., 16(3):271–289.
47 McNeil, A. J., Frey, R., and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press.
48 Pisarenko, V. and Sornette, D. (2003). Characterization of the frequency of extreme earthquake events by the generalized Pareto distribution. Pure Appl. Geophys., 160(12):2343–2364.
49 Rényi, A. (1956). A characterization of Poisson processes. Magyar Tud. Akad. Mat. Kutató Int. Közl., 1:519–527.
50 Santhanam, M. S. and Kantz, H. (2008). Return interval distribution of extreme events and long-term memory. Phys. Rev. E, 78:051113.
51 Smith, A. and Katz, R. (2013). US billion-dollar weather and climate disasters: data sources, trends, accuracy and biases. Nat. Hazards, 67:387–410.
52 Smith, R. L. and Weissman, I. (1994). Estimating the extremal index. J. R. Statist. Soc. B, 56:515–528.
53 Smolka, A. (2006). Natural disasters and the challenge of extreme events: risk management from an insurance perspective. Philos. T. Roy. Soc. A, 364(1845):2147–2165.
54 Süveges, M. (2007). Likelihood estimation of the extremal index. Extremes, 10:41–55.
55 Vercauteren, N., Boyko, V., Faranda, D., and Stiperski, I. (2018). Scale interactions and anisotropy in stable boundary layers. arXiv preprint arXiv:1809.07031.
56 Westcott, M. (1976). Simple proof of a result on thinned point processes. Ann. Prob., 4:89–90.
57 Young, L.-S. (1982). Dimension, entropy and Lyapunov exponents. Ergod. Theory Dyn. Syst., 2(1):109–124.
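As a companion to Appendix A, the runs estimator of Eq. (A1) and the declustering of Fig. 10 can be sketched compactly as follows (a minimal illustration, assuming NumPy; the helper decluster, its defaults, and the toy AR(1) parameters are ours). Each cluster is reduced to its largest exceedance, and the ratio of clusters to exceedances gives θ̂.

```python
import numpy as np

def decluster(series, u, rc=1):
    """Runs declustering: exceedances of threshold u belong to the same
    cluster unless separated by more than rc below-threshold steps; each
    cluster is then reduced to its largest exceedance (cf. Fig. 10).
    Returns the runs estimate theta_hat = n_clusters / n_exceedances,
    Eq. (A1), and the retained peak indices."""
    exc = np.flatnonzero(series > u)
    if exc.size == 0:
        return np.nan, exc
    # New cluster wherever the gap between successive exceedances exceeds rc.
    clusters = np.split(exc, np.flatnonzero(np.diff(exc) > rc) + 1)
    peaks = np.array([c[np.argmax(series[c])] for c in clusters])
    return len(clusters) / exc.size, peaks

# Toy example: an AR(1) series, whose extremes cluster at finite thresholds.
rng = np.random.default_rng(0)
phi, n = 0.75, 10**5
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = phi * x[i - 1] + eps[i]

u = np.quantile(x, 0.99)
theta_hat, peaks = decluster(x, u, rc=1)
print(theta_hat, peaks.size)
```

Sweeping rc and recomputing θ̂ implements the robustness check suggested in the appendix; the declustered peaks can then be passed to standard peaks-over-threshold fits.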