Modeling and Estimating the Multivariate Tail Dependence


Andrea Krajina, Institute for Mathematical Stochastics, University of Göttingen

Joint work with John Einmahl (Tilburg University) and Johan Segers (Université catholique de Louvain)

The Hague, 2013 ASTIN Colloquium

Outline

Introduction

Tail dependence: Stable tail dependence function l

Nonparametric estimator of l

M-Estimator of l

– Definition
– Asymptotic Properties
– Examples of Parametric Models
  ⊲ Logistic Model
  ⊲ Factor Model

Multivariate DoA

X1,...,Xn, Xi := (Xi1,...,Xid), i =1,...,n, independent random vectors from a d-variate df F

if there exist sequences $a_{jn} > 0$, $b_{jn} \in \mathbb{R}$, $j = 1,\dots,d$, such that

$$n\bigl(1 - F(a_{1n}x_1 + b_{1n}, \dots, a_{dn}x_d + b_{dn})\bigr) \to -\log G(x_1,\dots,x_d) \qquad (1)$$

weakly as $n \to \infty$, and the marginal distributions $G_1,\dots,G_d$ of $G$ are non-degenerate, we say that $F$ is in the domain of attraction of $G$

such G is called an extreme value distribution

limit df of the vector of component-wise maxima of X1,...,Xn

let $F_1,\dots,F_d$ be the d continuous marginals, and $C$ a copula of $F$,

$$F(x_1,\dots,x_d) = C\bigl(F_1(x_1),\dots,F_d(x_d)\bigr)$$

the DoA condition (1) implies the d univariate DoA conditions, and the DoA condition for the dependence structure

marginal distributions, $j = 1,\dots,d$

$$\lim_{n\to\infty} n\bigl(1 - F_j(a_{jn}x_j + b_{jn})\bigr) = -\log G_j(x_j) = (1 + \gamma_j x_j)^{-1/\gamma_j}$$
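A worked instance of the univariate condition, as a minimal sketch. The Pareto df $F(x) = 1 - x^{-\alpha}$, $x \ge 1$, and the normalizing sequences $a_n = \gamma n^{\gamma}$, $b_n = n^{\gamma}$ with $\gamma = 1/\alpha$ are assumptions chosen for illustration; they are not taken from the talk. For this choice the scaled tail equals the limit for every n, which makes the condition easy to check numerically.

```python
def pareto_tail(x, alpha):
    """Survival function 1 - F(x) of the Pareto df F(x) = 1 - x**(-alpha), x >= 1."""
    return x ** (-alpha) if x >= 1.0 else 1.0


def scaled_tail(n, x, alpha):
    """n * (1 - F(a_n * x + b_n)) with gamma = 1/alpha, a_n = gamma * n**gamma, b_n = n**gamma."""
    gamma = 1.0 / alpha
    a_n = gamma * n ** gamma  # assumed scale sequence (illustrative choice)
    b_n = n ** gamma          # assumed location sequence (illustrative choice)
    return n * pareto_tail(a_n * x + b_n, alpha)


def limit_tail(x, gamma):
    """Limit -log G_j(x) = (1 + gamma * x) ** (-1 / gamma), for 1 + gamma * x > 0."""
    return (1.0 + gamma * x) ** (-1.0 / gamma)


if __name__ == "__main__":
    alpha = 2.0
    for n in (10, 1000, 100000):
        print(n, scaled_tail(n, 1.5, alpha), limit_tail(1.5, 1.0 / alpha))
```

Here $a_n x + b_n = n^{\gamma}(1 + \gamma x)$, so $n(1 - F(a_n x + b_n)) = (1 + \gamma x)^{-1/\gamma}$ exactly.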

dependence structure

$$\lim_{n\to\infty} n\Bigl\{1 - C\Bigl(1 - \frac{x_1}{n},\dots,1 - \frac{x_d}{n}\Bigr)\Bigr\} =: l(x_1,\dots,x_d) = l(x)$$

Tail Dependence: Function l

replacing $1/n$ by $t$, we get

$$l(x) = \lim_{t\to 0} \frac{1 - C(1 - tx_1,\dots,1 - tx_d)}{t}$$

the limit function l is defined for $x := (x_1,\dots,x_d) \in [0,\infty)^d$, and in terms of the original df F it reads

$$l(x) = \lim_{t\to 0} t^{-1}\, P\bigl(1 - F_1(X_{11}) \le tx_1 \text{ or } \dots \text{ or } 1 - F_d(X_{1d}) \le tx_d\bigr)$$
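The copula form of the limit can be checked numerically, as a sketch. The copula used here is the logistic (Gumbel) copula, i.e. the copula of the logistic model that appears among the talk's examples; the function names and the choice t = 1e-4 are illustrative assumptions.

```python
import math


def gumbel_copula(u, theta):
    """Logistic (Gumbel) copula C(u) = exp(-(sum_j (-log u_j)**(1/theta))**theta), 0 < theta <= 1."""
    s = sum((-math.log(uj)) ** (1.0 / theta) for uj in u)
    return math.exp(-(s ** theta))


def l_pre_limit(x, theta, t):
    """Pre-limit quantity (1 - C(1 - t*x_1, ..., 1 - t*x_d)) / t for the copula above."""
    s = sum((-math.log(1.0 - t * xj)) ** (1.0 / theta) for xj in x)
    # 1 - exp(-y) computed as -expm1(-y) to avoid cancellation for small t
    return -math.expm1(-(s ** theta)) / t


def l_logistic(x, theta):
    """Closed-form limit l_theta(x) = (sum_j x_j**(1/theta))**theta."""
    return sum(xj ** (1.0 / theta) for xj in x) ** theta


if __name__ == "__main__":
    x, theta = (1.0, 2.0), 0.5
    for t in (1e-1, 1e-2, 1e-4):
        print(t, l_pre_limit(x, theta, t), l_logistic(x, theta))
```

As t decreases, the pre-limit values approach the closed form.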

l is called the stable tail dependence function

Nonparametric Estimator of l

assumption: the DoA for the dependence structure holds, i.e. the function l exists

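$\hat l_n$ is a nonparametric estimator of l

$$\hat l_n(x) := \frac{1}{k}\sum_{i=1}^{n} \mathbf{1}\Bigl\{R_i^1 > n + \tfrac{1}{2} - kx_1 \text{ or } \dots \text{ or } R_i^d > n + \tfrac{1}{2} - kx_d\Bigr\},$$

– $R_i^j$ is the rank of $X_{ij}$ among $X_{1j},\dots,X_{nj}$, $j = 1,\dots,d$
– $k \in \{1,2,\dots,n\}$, $k = k_n \to \infty$ and $k/n \to 0$ when $n \to \infty$

consistency follows from the d = 2 result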

asymptotic normality for arbitrary d
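The rank-based estimator can be sketched in a few lines. This is a minimal illustration that assumes no ties within a margin (reasonable for continuous margins); the function names `ranks` and `l_hat` are hypothetical.

```python
def ranks(column):
    """Rank (1 = smallest) of each entry within its column; assumes no ties."""
    order = sorted(range(len(column)), key=lambda i: column[i])
    result = [0] * len(column)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def l_hat(sample, k, x):
    """Empirical stable tail dependence function at x.

    sample: list of n observations, each a sequence of d numbers.
    k:      number of upper order statistics used per margin.
    x:      evaluation point, a sequence of d nonnegative numbers.
    """
    n = len(sample)
    d = len(sample[0])
    rank_columns = [ranks([obs[j] for obs in sample]) for j in range(d)]
    count = 0
    for i in range(n):
        # observation i counts if it is extreme in at least one margin
        if any(rank_columns[j][i] > n + 0.5 - k * x[j] for j in range(d)):
            count += 1
    return count / k


if __name__ == "__main__":
    comonotone = [(i, i) for i in range(1, 101)]
    countermonotone = [(i, 101 - i) for i in range(1, 101)]
    print(l_hat(comonotone, 10, (1, 1)), l_hat(countermonotone, 10, (1, 1)))
```

For the two toy samples the estimate at (1, 1) is 1 when the large values coincide and 2 when they never coincide, the two extremes of l(1, 1) in dimension 2.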

Theorem 1 (Asymptotic normality of $\hat l_n$ in dimension d)

if

(C0) for all $j = 1,\dots,d$, the first-order partial derivative of l with respect to $x_j$ exists and is continuous on the set of points x such that $x_j > 0$,
(C1) $t^{-1}P\bigl(1 - F_1(X_{11}) \le tx_1 \text{ or } \dots \text{ or } 1 - F_d(X_{1d}) \le tx_d\bigr) - l(x) = O(t^{\alpha})$, uniformly in $x \in \Delta_{d-1}$ as $t \downarrow 0$, for some $\alpha > 0$,
(C2) $k = o\bigl(n^{2\alpha/(1+2\alpha)}\bigr)$, for the α of (C1),

then for every $T > 0$, as $n \to \infty$,

$$\sup_{x\in[0,T]^d} \Bigl|\sqrt{k}\bigl(\hat l_n(x) - l(x)\bigr) - B(x)\Bigr| \xrightarrow{P} 0,$$


where

$$B(x) = W_l(x) - \sum_{j=1}^{d} l_j(x)\,W_j(x_j)$$

with $W_l(x) := W_\Lambda\bigl(\{u \in [0,\infty]^d : u_1 \le x_1 \text{ or } \dots \text{ or } u_d \le x_d\}\bigr)$ centered Gaussian, covariance structure determined by l (via Λ)

marginal processes $W_j(x_j) := W_l(0,\dots,0,x_j,0,\dots,0)$, $x_j \ge 0$

$l_j$ is the (right-hand) partial derivative of l with respect to $x_j$, $j = 1,\dots,d$

M-Estimator: Model Assumptions

X1,...,Xn iid random vectors from a continuous d-variate df F

assumptions

– the DoA for the dependence structure holds, i.e. the function l exists
– the function l belongs to some parametric family $\{l_\theta : \theta \in \Theta \subseteq \mathbb{R}^p\}$

we do not assume

– that F itself is parametrically modeled
– not even that the copula of F is parametric
– any restrictions on the margins of F
– differentiability of l (factor models; likelihood methods(!))

M-Estimator: Definition

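let $g \equiv (g_1,\dots,g_q)^T : [0,1]^d \to \mathbb{R}^q$, $q \ge p$, be an integrable function such that $\varphi : \Theta \to \mathbb{R}^q$ defined by

$$\varphi(\theta) := \int_{[0,1]^d} g(x)\,l(x;\theta)\,dx$$

is a homeomorphism between Θ and its image φ(Θ)

the M-estimator $\hat\theta_n$ of $\theta_0$ is defined as the minimizer of the criterion function

$$Q_{k,n}(\theta) = \Bigl\| \int_{[0,1]^d} g(x)\,\hat l_n(x)\,dx - \int_{[0,1]^d} g(x)\,l(x;\theta)\,dx \Bigr\|^2$$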
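A minimal sketch of the criterion in the simplest setting: d = 2, the logistic family for l, a single auxiliary function g ≡ 1, a midpoint rule for the integrals and a grid search over θ. All of these choices, and the names `integrate_unit_square` and `m_estimate`, are illustrative assumptions rather than the procedure used in the talk. In practice `l_est` would be the nonparametric estimator $\hat l_n$; here it is replaced by a known $l_{\theta_0}$ so that the minimizer can be verified.

```python
def l_logistic(x, theta):
    """Logistic stable tail dependence function (sum_j x_j**(1/theta))**theta, 0 < theta <= 1."""
    m = max(x)
    if m == 0.0:
        return 0.0
    # factor out the largest coordinate so that small theta does not overflow
    return m * sum((xj / m) ** (1.0 / theta) for xj in x) ** theta


def integrate_unit_square(f, m=40):
    """Midpoint rule for the integral of f over [0, 1]^2 on an m-by-m grid."""
    h = 1.0 / m
    total = 0.0
    for a in range(m):
        for b in range(m):
            total += f(((a + 0.5) * h, (b + 0.5) * h))
    return total * h * h


def m_estimate(l_est, candidates):
    """Minimize Q(theta) = (int l_est - int l_theta)^2 over candidate thetas, with g == 1."""
    target = integrate_unit_square(l_est)

    def q(theta):
        model = integrate_unit_square(lambda x: l_logistic(x, theta))
        return (target - model) ** 2

    return min(candidates, key=q)


if __name__ == "__main__":
    grid = [i / 20 for i in range(1, 21)]
    # stand-in for the nonparametric estimate: the true l with theta0 = 0.5
    print(m_estimate(lambda x: l_logistic(x, 0.5), grid))
```

With q = p = 1 the minimizer solves $\varphi(\theta) = \int \hat l_n$, so the estimator is a moment-matching estimator in this special case.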

M-Estimator: Asymptotic Properties

existence, uniqueness and consistency

asymptotic normality

conditions:

(C0) for all $j = 1,\dots,d$, the first-order partial derivative of l with respect to $x_j$ exists and is continuous on the set of points x such that $x_j > 0$,
(C1) $t^{-1}P\bigl(1 - F_1(X_{11}) \le tx_1 \text{ or } \dots \text{ or } 1 - F_d(X_{1d}) \le tx_d\bigr) - l(x) = O(t^{\alpha})$, uniformly in $x \in \Delta_{d-1}$ as $t \downarrow 0$, for some $\alpha > 0$,
(C2) $k = o\bigl(n^{2\alpha/(1+2\alpha)}\bigr)$, for the α of (C1),
(C3) φ is twice continuously differentiable and $D\varphi(\theta_0)$ is of full rank

Theorem 2 If in addition to the assumptions of Theorem ??, the conditions (C1) and (C2) hold, then as $n \to \infty$ and $k \to \infty$,

$$\sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N\bigl(0, M(\theta_0)\bigr).$$
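For a scalar parameter (p = 1), the normal limit suggests an approximate confidence interval $\hat\theta_n \pm z\sqrt{M/k}$. The sketch below assumes that an estimate `m_hat` of the asymptotic variance is available; how to compute it is not covered here, and the numbers in the example are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval(theta_hat, m_hat, k, level=0.95):
    """Approximate confidence interval for a scalar theta.

    theta_hat: point estimate
    m_hat:     estimate of the asymptotic variance M (assumed given)
    k:         number of upper order statistics used by the estimator
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * math.sqrt(m_hat / k)
    return theta_hat - half_width, theta_hat + half_width


if __name__ == "__main__":
    # hypothetical inputs, for illustration only
    print(wald_interval(theta_hat=0.5, m_hat=0.04, k=100))
```

The interval shrinks at rate $1/\sqrt{k}$, not $1/\sqrt{n}$, because only the k largest observations per margin enter the estimator.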

Corollary 2.1 If in addition to the conditions of Theorem 2, the mapping $\theta \mapsto H_\theta$ is weakly continuous at $\theta_0$ and if $M(\theta_0)$ is non-singular, then as $n \to \infty$ and $k \to \infty$,

$$k\,(\hat\theta_n - \theta_0)^T M(\hat\theta_n)^{-1} (\hat\theta_n - \theta_0) \xrightarrow{d} \chi^2_p. \qquad (2)$$

M-Estimator: Overview

M-estimator $\hat\theta_n$: minimizer of $\bigl\|\int g\,\hat l_n - \int g\,l_\theta\bigr\|^2$

dimensions $d \ge 2$

differentiability of l

consistent and asymptotically normal

existing estimators

– nonparametric (empirical) estimator of l: Huang, 1992; Drees and Huang, 1998; Einmahl, de Haan and Li, 2006

– MLE in parametric models: Coles and Tawn, 1991; Joe, Smith and Weissman, 1992; Smith, 1994; Ledford and Tawn, 1996; de Haan, Neves and Peng, 2007

Example: Logistic Model

the d-dimensional multivariate logistic distribution function with parameter $\theta \in [0,1]$

$$F(x) = \exp\Bigl\{-\Bigl(\sum_{j=1}^{d} x_j^{-1/\theta}\Bigr)^{\theta}\Bigr\}$$

the stable tail dependence function corresponding to the logistic model:

$$l_\theta(x) = \Bigl(\sum_{j=1}^{d} x_j^{1/\theta}\Bigr)^{\theta}, \qquad \theta \in [0,1],\; x_j \ge 0,\; j = 1,\dots,d$$

θ = 0: complete dependence; θ = 1: independence
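The two boundary cases can be checked directly, as a sketch. Treating θ = 0 as the limit $\max_j x_j$ is the usual convention and is assumed here; the scaling by the largest coordinate is only a numerical safeguard for small θ.

```python
def l_logistic(x, theta):
    """Logistic stable tail dependence function on [0, inf)^d for theta in [0, 1]."""
    m = max(x)
    if theta == 0.0 or m == 0.0:
        return m  # theta = 0 is read as the limit max_j x_j (complete dependence)
    # factor out the largest coordinate so that (x_j / m) ** (1 / theta) stays in [0, 1]
    return m * sum((xj / m) ** (1.0 / theta) for xj in x) ** theta


if __name__ == "__main__":
    x = (1.0, 2.0, 3.0)
    for theta in (0.0, 0.01, 0.5, 1.0):
        print(theta, l_logistic(x, theta))
    # at the point (1, ..., 1) in dimension d the value is d ** theta
    print(l_logistic((1.0,) * 5, 0.5), 5 ** 0.5)
```

At θ = 1 the function reduces to the sum of the coordinates (independence), and as θ approaches 0 it approaches their maximum (complete dependence).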

Gumbel, 1960.

Simulations: Logistic Model

dimension d = 5

$$l_\theta(x) = \bigl(x_1^{1/\theta} + \dots + x_5^{1/\theta}\bigr)^{\theta}, \qquad \theta \in [0,1],\; x_j \ge 0,\; j = 1,\dots,5$$

simulation: 500 samples of size n = 3000 from a five-dimensional logistic distribution function with $\theta_0 = 0.5$

$\hat\theta_n$, an M-estimator of $\theta_0$, with $g_1 \equiv 1$

optimal g: $g_{opt}(x) = (\partial/\partial\theta)\,l(x;\theta_0)$

also, we consider the estimation of $l_\theta(1,1,1,1,1)$: we compare $l_{\hat\theta_n}(1,1,1,1,1) = 5^{\hat\theta_n}$ and $\hat l_n(1,1,1,1,1)$

Figure 1: Five-dimensional logistic model, θ0 = 0.5. (a) Bias of estimator of θ; (b) RMSE of estimator of θ; both plotted against k = 50,...,300.

Figure 2: Five-dimensional logistic model, θ0 = 0.5, $l_{0.5}(1,1,1,1,1) = \sqrt{5}$. (a) Bias of estimator of $l_{0.5}(1,1,1,1,1)$; (b) RMSE of estimator of $l_{0.5}(1,1,1,1,1)$; curves for the M-estimator and for $\hat l_n(1,1,1,1,1)$, plotted against k = 50,...,300.

Symmetric vs. asymmetric logistic

Asymmetric logistic model for l, in d = 2

$$l(x_1,x_2;\theta,\psi_1,\psi_2) = (1-\psi_1)x_1 + (1-\psi_2)x_2 + \bigl((\psi_1x_1)^{1/\theta} + (\psi_2x_2)^{1/\theta}\bigr)^{\theta}$$
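A short sketch of the bivariate asymmetric logistic function, showing how it nests the symmetric model. The parameter ranges 0 < θ ≤ 1 and 0 ≤ ψ1, ψ2 ≤ 1 are assumed (they are not stated on the slide), and the function name is hypothetical.

```python
def l_asym_logistic(x1, x2, theta, psi1, psi2):
    """Asymmetric logistic stable tail dependence function in d = 2.

    Assumes 0 < theta <= 1 and 0 <= psi1, psi2 <= 1.
    """
    dependent_part = ((psi1 * x1) ** (1.0 / theta) + (psi2 * x2) ** (1.0 / theta)) ** theta
    return (1.0 - psi1) * x1 + (1.0 - psi2) * x2 + dependent_part


if __name__ == "__main__":
    # psi1 = psi2 = 1 gives the symmetric logistic model
    print(l_asym_logistic(1.0, 2.0, 0.5, 1.0, 1.0))
    # psi1 = psi2 = 0 gives independence, l(x1, x2) = x1 + x2
    print(l_asym_logistic(1.0, 2.0, 0.5, 0.0, 0.0))
    # margins are always standardized: l(x1, 0) = x1
    print(l_asym_logistic(2.0, 0.0, 0.5, 0.3, 0.7))
```

The symmetric model is the sub-model ψ1 = ψ2 = 1, which is what makes a test of symmetric against asymmetric logistic a test of a parameter restriction.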

for the Loss-ALAE data:

– hypothesis test: symmetric vs. asymmetric logistic
– symmetric logistic “won”, θ ≈ 0.65

Example: Factor Model

r-factor model, $r \in \mathbb{N}$, in dimension d: $X = (X_1,\dots,X_d)$ and

$$X_j = b_{1j}Z_1 + \dots + b_{rj}Z_r + \varepsilon_j, \qquad j \in \{1,\dots,d\}$$

– factors $Z_i$ are independent Fréchet(ν) random variables, ν > 0, $P(Z_1 \le x_1) = \exp\{-1/x_1\}$, $x_1 > 0$ (std. Fréchet)
– loadings $b_{ij}$ are nonnegative constants such that

$$b_{1j} + b_{2j} + \dots + b_{rj} = 1, \qquad j \in \{1,\dots,d\}$$

– dimension of the parameter vector $\theta \in \mathbb{R}^p$, $p = d(r-1)$

Ledford and Tawn (1998), Wang and Stoev (2011)

stable tail dependence function

$$l(x_1,\dots,x_d) = \sum_{i=1}^{r} \max_{1\le j\le d}\{b_{ij}x_j\},$$
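The formula translates into a one-line function. The loadings below are those of the four-dimensional, two-factor simulation model from the talk; the layout `loadings[i][j]` for $b_{ij}$ (factor i, component j) is an assumption of this sketch.

```python
def l_factor(x, loadings):
    """Stable tail dependence function of a factor model.

    loadings[i][j] holds b_ij, the loading of factor i on component j;
    for every component j the loadings are assumed to sum to 1 over i.
    """
    return sum(max(b_ij * xj for b_ij, xj in zip(row, x)) for row in loadings)


# loadings of the four-dimensional 2-factor simulation model
LOADINGS = [
    (0.8, 0.5, 0.3, 0.1),  # factor 1
    (0.2, 0.5, 0.7, 0.9),  # factor 2
]

if __name__ == "__main__":
    print(l_factor((1, 1, 1, 1), LOADINGS))
    print(l_factor((1, 0, 0, 0), LOADINGS))
    for s in (0.2, 0.4, 1.0, 1.6, 2.0):
        print(s, l_factor((1, s, 0, 0), LOADINGS))
```

Along the path (1, s, 0, 0) the function is piecewise linear in s with kinks at s = 0.4 and s = 1.6, where the maximizing component changes.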

non-differentiable l

Simulations: Four-dimensional model with 2 factors

simulation: 500 samples of size n = 5000 from a four-dimensional model

$$\begin{aligned}
X_1 &= 0.8Z_1 \vee 0.2Z_2 \\
X_2 &= 0.5Z_1 \vee 0.5Z_2 \\
X_3 &= 0.3Z_1 \vee 0.7Z_2 \\
X_4 &= 0.1Z_1 \vee 0.9Z_2,
\end{aligned}$$

with independent standard Fréchet distributed factors $Z_1$ and $Z_2$

$p = 4 \times (2 - 1) = 4$

$\theta_0 = (0.2, 0.5, 0.7, 0.9)$
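One way to draw from this max-factor model is by inversion: if U is uniform on (0, 1), then Z = −1/log U is standard Fréchet. The sketch below is an illustration under that approach; the function names, the seed and the sample size in the example are assumptions, not the setup used for the reported simulations.

```python
import math
import random

# loadings of the simulation model: row i holds the loadings of factor i
LOADINGS = [
    (0.8, 0.5, 0.3, 0.1),  # factor 1
    (0.2, 0.5, 0.7, 0.9),  # factor 2
]


def standard_frechet(rng):
    """Draw Z with P(Z <= z) = exp(-1/z), z > 0, by inversion."""
    u = rng.random()
    while u == 0.0:  # random() may return 0.0, where log is undefined
        u = rng.random()
    return -1.0 / math.log(u)


def simulate_max_factor(n, loadings, rng):
    """Return n draws of X with X_j = max_i b_ij * Z_i."""
    d = len(loadings[0])
    sample = []
    for _ in range(n):
        z = [standard_frechet(rng) for _ in loadings]
        sample.append(tuple(max(row[j] * zi for row, zi in zip(loadings, z))
                            for j in range(d)))
    return sample


if __name__ == "__main__":
    rng = random.Random(2013)  # arbitrary seed
    sample = simulate_max_factor(5000, LOADINGS, rng)
    # each margin is standard Frechet, so P(X_1 <= 1) = exp(-1)
    print(sum(obs[0] <= 1.0 for obs in sample) / len(sample), math.exp(-1.0))
```

Because the loadings of each component sum to 1, every margin $X_j$ is again standard Fréchet, which the last line checks empirically.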

Figure 3: Four-dimensional 2-factor model, estimation of θ1 = 0.2, θ2 = 0.5. (a) Bias and (b) RMSE of the estimators of θ1 and θ2, plotted against k = 200,...,1000.

Figure 4: Four-dimensional 2-factor model, estimation of θ3 = 0.7, θ4 = 0.9. (a) Bias and (b) RMSE of the estimators of θ3 and θ4, plotted against k = 200,...,1000.

Example: Three-dimensional model with 3 factors

monthly returns of three industry portfolios: Telecommunications, Finance and Oil; July 1, 1926 - December 31, 2009

http://mba.tuck.dartmouth.edu/pages/faculty/ken.french

model

X1 = b11Z1 + b21Z2 + b31Z3 + ε1

X2 = b12Z1 + b22Z2 + b32Z3 + ε2

X3 = b13Z1 + b23Z2 + b33Z3 + ε3,

$\theta \in \mathbb{R}^6$, $\theta = (b_{11},\dots,b_{23})$

the q = 7 auxiliary functions are $g_i(x) = x_i$, $g_{i+3}(x) = x_i^2$, $i = 1,2,3$, and $g_7 \equiv 1$
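The parametrization can be sketched as follows: six free loadings determine the full 3 × 3 loading matrix through the sum-to-one constraint, and the seven auxiliary functions are stacked into one vector. The ordering θ = (b11, b12, b13, b21, b22, b23) and the numerical values are assumptions made for illustration.

```python
def loadings_from_theta(theta):
    """Build the 3-by-3 loading matrix from the six free parameters.

    Assumes theta = (b11, b12, b13, b21, b22, b23); the third row follows from
    the constraint b1j + b2j + b3j = 1 for each component j.
    """
    b1 = list(theta[:3])
    b2 = list(theta[3:])
    b3 = [1.0 - u - v for u, v in zip(b1, b2)]
    return [b1, b2, b3]


def g(x):
    """The q = 7 auxiliary functions: x_i, x_i**2 (i = 1, 2, 3) and the constant 1."""
    return [x[0], x[1], x[2], x[0] ** 2, x[1] ** 2, x[2] ** 2, 1.0]


if __name__ == "__main__":
    theta = (0.4, 0.7, 0.35, 0.55, 0.2, 0.05)  # hypothetical parameter value
    for row in loadings_from_theta(theta):
        print(row)
    print(g((0.5, 0.2, 1.0)))
```

With q = 7 > p = 6 the system is overdetermined, so the estimator minimizes the criterion rather than solving the moment equations exactly.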

k = 60                     k = 90
0.394  0.593  0.013        0.344  0.616  0.040
0.691  0.211  0.098        0.701  0.216  0.083
0.358  0.062  0.580        0.368  0.052  0.580

k = 120                    k = 150
0.387  0.586  0.027        0.388  0.581  0.031
0.695  0.215  0.090        0.699  0.211  0.090
0.348  0.058  0.594        0.364  0.086  0.550

Overview

EVT in a d-dimensional case ⇔ (1) + (2)

(1) d univariate DoA conditions
(2) the DoA condition for the tail dependence = existence of l:

$$l(x_1,\dots,x_d) = \lim_{t\to 0} \frac{1 - C(1 - tx_1,\dots,1 - tx_d)}{t}.$$

estimation of l: nonparametric ($\hat l_n$) and parametric (M-estimator)

logistic, asymmetric logistic and factor models

meta-elliptical model: elliptical distribution with “free” marginals; flexible, contains many known models, such as normal and t-distributions

References (1)

(1) X. Huang (1992), Statistics of Bivariate Extreme Values, Ph.D. thesis, Erasmus University Rotterdam, Tinbergen Institute Research Series 22.

(2) H. Drees, X. Huang (1998), Best attainable rates of convergence for estimates of the stable tail dependence functions, J. Multivariate Anal., 64, 25-47.

(3) J. H. J. Einmahl, L. de Haan, D. Li (2006), Weighted approximations of tail copula processes with application to testing the bivariate extreme value condition, The Annals of Statistics, 34, 1987-2014.

References (2)

(1) A. Ledford and J. Tawn (1998), Concomitant Tail Behaviour for Extremes, Advances in Applied Probability.

(2) Z. Wang and S. Stoev (2011), Conditional Sampling for Max-Stable Random Fields, Advances in Applied Probability.

(1) J.H.J. Einmahl, A. Krajina, J. Segers (2012), An M-Estimator of Tail Dependence in Arbitrary Dimensions, The Annals of Statistics.

(2) J.H.J. Einmahl, A. Krajina, J. Segers (2008), A Method of Moments Estimator of Tail Dependence, Bernoulli.

(3) A. Krajina (2012), A Method of Moments Estimator of Tail Dependence in Elliptical Copula Models, Journal of Statistical Planning and Inference.