Modeling and Estimating the Multivariate Tail Dependence
Andrea Krajina, Institute for Mathematical Stochastics, University of Göttingen
Joint work with John Einmahl (Tilburg University) and Johan Segers (Université catholique de Louvain)
The Hague, 2013 ASTIN Colloquium

Outline
Introduction
Tail dependence: Stable tail dependence function l
Nonparametric estimator of l
M-Estimator of l
– Definition
– Asymptotic Properties
– Examples of Parametric Models
  ⊲ Logistic Model
  ⊲ Factor Model

Multivariate DoA
$X_1,\dots,X_n$, $X_i := (X_{i1},\dots,X_{id})$, $i = 1,\dots,n$, independent random vectors from a $d$-variate df $F$
if there exist sequences $a_{jn} > 0$, $b_{jn} \in \mathbb{R}$, $j = 1,\dots,d$, such that
$$n\bigl(1 - F(a_{1n}x_1 + b_{1n},\dots,a_{dn}x_d + b_{dn})\bigr) \to -\log G(x_1,\dots,x_d) \qquad (1)$$
weakly as $n \to \infty$, and the marginal distributions $G_1,\dots,G_d$ of $G$ are non-degenerate, we say that $F$ is in the domain of attraction (DoA) of $G$
such $G$ is called an extreme value distribution
$G$ is the limit df of the vector of component-wise maxima of $X_1,\dots,X_n$

Multivariate DoA
let $F_1,\dots,F_d$ be the $d$ continuous marginals, and $C$ a copula of $F$,
$$F(x_1,\dots,x_d) = C(F_1(x_1),\dots,F_d(x_d))$$
the DoA condition (1) implies the $d$ univariate DoA conditions, and the DoA condition for the dependence structure
marginal distributions, $j = 1,\dots,d$
$$\lim_{n\to\infty} n\bigl(1 - F_j(a_{jn}x_j + b_{jn})\bigr) = -\log G_j(x_j) = (1 + \gamma_j x_j)^{-1/\gamma_j}$$
dependence structure
$$\lim_{n\to\infty} n\Bigl\{1 - C\Bigl(1 - \frac{x_1}{n},\dots,1 - \frac{x_d}{n}\Bigr)\Bigr\} =: l(x_1,\dots,x_d) = l(x)$$

Tail Dependence: Function l
replacing $1/n$ by $t$, we get
$$l(x) = \lim_{t\to 0} \frac{1 - C(1 - tx_1,\dots,1 - tx_d)}{t}$$
the limit function $l$ is defined for $x := (x_1,\dots,x_d) \in [0,\infty)^d$, and in terms of the original df $F$ it reads
$$l(x) = \lim_{t\to 0} t^{-1}\, P\bigl(1 - F_1(X_{11}) \le tx_1 \text{ or } \dots \text{ or } 1 - F_d(X_{1d}) \le tx_d\bigr)$$
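As a quick numerical illustration of the copula form of this limit, the sketch below evaluates the ratio at a small fixed $t$ for two textbook copulas: the independence copula, for which $l(x) = x_1 + \dots + x_d$, and the comonotone copula, for which $l(x) = \max_j x_j$. The helper name `stdf_from_copula` and the choice $t = 10^{-6}$ are assumptions made for this example only.

```python
import numpy as np

def stdf_from_copula(copula, x, t=1e-6):
    """Finite-t approximation of l(x) = lim_{t -> 0} (1 - C(1 - t*x)) / t."""
    x = np.asarray(x, dtype=float)
    return (1.0 - copula(1.0 - t * x)) / t

def independence(u):
    """Independence copula C(u) = u_1 * ... * u_d."""
    return np.prod(u)

def comonotone(u):
    """Comonotone (complete dependence) copula C(u) = min(u_1, ..., u_d)."""
    return np.min(u)

x = [0.5, 1.0, 2.0]
print(stdf_from_copula(independence, x))  # close to 0.5 + 1.0 + 2.0
print(stdf_from_copula(comonotone, x))    # close to max(x)
```

These two cases are the extremes: every stable tail dependence function satisfies $\max_j x_j \le l(x) \le x_1 + \dots + x_d$.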
$l$ is called the stable tail dependence function

Nonparametric Estimator of l
assumption: the DoA for the dependence structure holds, i.e. the function $l$ exists
$\hat l_n$ is a nonparametric estimator of $l$
$$\hat l_n(x) := \frac{1}{k}\sum_{i=1}^{n} \mathbf{1}\Bigl\{R_i^1 > n + \tfrac12 - kx_1 \text{ or } \dots \text{ or } R_i^d > n + \tfrac12 - kx_d\Bigr\},$$
– $R_i^j$ is the rank of $X_{ij}$ among $X_{1j},\dots,X_{nj}$, $j = 1,\dots,d$
– $k \in \{1,2,\dots,n\}$, $k = k_n \to \infty$ and $k/n \to 0$ when $n \to \infty$
consistency follows from the $d = 2$ result
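The rank-based definition translates almost line by line into code. The following is a minimal sketch, assuming the data sit in an `(n, d)` NumPy array without ties (continuous margins); the function name `empirical_stdf` is chosen for this example.

```python
import numpy as np

def empirical_stdf(X, k, x):
    """Rank-based estimate of l(x) from an (n, d) data array X.

    Counts the observations with at least one coordinate rank
    R_i^j > n + 1/2 - k * x_j and divides the count by k.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Column-wise ranks 1..n (ties assumed absent).
    R = X.argsort(axis=0).argsort(axis=0) + 1
    exceed = R > n + 0.5 - k * np.asarray(x, dtype=float)  # shape (n, d)
    return exceed.any(axis=1).sum() / k

# Two deterministic sanity checks with n = 100, k = 10:
up = np.arange(100.0)
print(empirical_stdf(np.column_stack([up, up]), 10, [1.0, 1.0]))   # identical ranks
print(empirical_stdf(np.column_stack([up, -up]), 10, [1.0, 1.0]))  # reversed ranks
```

With identical ranks the same ten rows are extreme in both coordinates, so the estimate equals $\max(1, 1) = 1$; with reversed ranks the extreme rows are disjoint and the estimate equals $1 + 1 = 2$.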
asymptotic normality for arbitrary $d$

Nonparametric Estimator of l
Theorem 1 (Asymptotic normality of $\hat l_n$ in dimension $d$) If
(C0) for all $j = 1,\dots,d$, the first-order partial derivative of $l$ with respect to $x_j$ exists and is continuous on the set of points $x$ such that $x_j > 0$,
(C1) $t^{-1} P\bigl(1 - F_1(X_{11}) \le tx_1 \text{ or } \dots \text{ or } 1 - F_d(X_{1d}) \le tx_d\bigr) - l(x) = O(t^{\alpha})$, uniformly in $x \in \Delta_{d-1}$ as $t \downarrow 0$, for some $\alpha > 0$,
(C2) $k = o\bigl(n^{2\alpha/(1+2\alpha)}\bigr)$, for the $\alpha$ of (C1),
then for every $T > 0$, as $n \to \infty$,
$$\sup_{x \in [0,T]^d} \Bigl| \sqrt{k}\,\bigl(\hat l_n(x) - l(x)\bigr) - B(x) \Bigr| \xrightarrow{P} 0,$$
where
$$B(x) = W_l(x) - \sum_{j=1}^{d} l_j(x)\, W_j(x_j)$$
with $W_l(x) := W_\Lambda\bigl(\{u \in [0,\infty]^d : u_1 \le x_1 \text{ or } \dots \text{ or } u_d \le x_d\}\bigr)$
centered Gaussian, covariance structure determined by $l$ (via $\Lambda$)
marginal processes $W_j(x_j) := W_l(0,\dots,0,x_j,0,\dots,0)$, $x_j \ge 0$
$l_j$ are the (right-hand) partial derivatives of $l$ wrt $x_j$, $j = 1,\dots,d$

M-Estimator: Model Assumptions
$X_1,\dots,X_n$ iid random vectors from a continuous $d$-variate df $F$
assumptions
– the DoA for the dependence structure holds, i.e. the function $l$ exists
– function $l$ belongs to some parametric family $\{l_\theta : \theta \in \Theta \subseteq \mathbb{R}^p\}$
we do not assume
– F itself is parametrically modeled
– not even: copula of F is parametric
– any restrictions on the margins of F
– differentiability of $l$ (factor models; likelihood methods(!))

M-Estimator: Definition
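let $g \equiv (g_1,\dots,g_q)^T : [0,1]^d \to \mathbb{R}^q$, $q \ge p$, be an integrable function such that $\varphi : \Theta \to \mathbb{R}^q$ defined by
$$\varphi(\theta) := \int_{[0,1]^d} g(x)\, l(x;\theta)\, dx$$
is a homeomorphism between $\Theta$ and its image $\varphi(\Theta)$
the M-estimator $\hat\theta_n$ of $\theta_0$ is defined as the minimizer of the criterion function
$$Q_{k,n}(\theta) = \Bigl\| \int_{[0,1]^d} g(x)\,\hat l_n(x)\,dx - \int_{[0,1]^d} g(x)\, l(x;\theta)\,dx \Bigr\|^2$$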
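To make the criterion concrete, here is a minimal sketch for the logistic family with the single auxiliary function $g \equiv 1$ (so $q = p = 1$). Both integrals over $[0,1]^d$ are approximated with a midpoint rule, and the minimizer is found by a grid search over $\theta$. The midpoint rule, the grid search, the grid $\theta \in [0.05, 1]$ and all function names are assumptions of this example, not the procedure used for the results in these slides.

```python
import numpy as np

def empirical_stdf(X, k, pts):
    """Rank-based estimate of l at each row of pts (shape (m, d))."""
    n = X.shape[0]
    R = X.argsort(axis=0).argsort(axis=0) + 1       # ranks 1..n, ties assumed absent
    thr = n + 0.5 - k * pts                         # thresholds, shape (m, d)
    exceed = R[None, :, :] > thr[:, None, :]        # shape (m, n, d)
    return exceed.any(axis=2).sum(axis=1) / k

def logistic_stdf(pts, theta):
    """l(x; theta) = (sum_j x_j^(1/theta))^theta for each row of pts."""
    m = pts.max(axis=1, keepdims=True)              # factor out the max for stability
    return m[:, 0] * ((pts / m) ** (1.0 / theta)).sum(axis=1) ** theta

def m_estimate_logistic(X, k, grid_size=40, thetas=None):
    """Grid-search minimizer of Q_{k,n}(theta) with g = 1."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    if thetas is None:
        thetas = np.linspace(0.05, 1.0, 96)         # hypothetical search grid
    mid = (np.arange(grid_size) + 0.5) / grid_size  # midpoints in (0, 1)
    pts = np.stack(np.meshgrid(*[mid] * d, indexing="ij"), axis=-1).reshape(-1, d)
    lhs = empirical_stdf(X, k, pts).mean()          # approximates the integral of l_hat_n
    Q = [(lhs - logistic_stdf(pts, th).mean()) ** 2 for th in thetas]
    return thetas[int(np.argmin(Q))]

# Deterministic illustration: perfectly dependent vs. perfectly opposed ranks.
up = np.arange(1000.0)
print(m_estimate_logistic(np.column_stack([up, up]), 100))   # lower end of the grid
print(m_estimate_logistic(np.column_stack([up, -up]), 100))  # upper end of the grid
```

The full `(m, n, d)` comparison array keeps the code short but grows quickly with `grid_size` and `d`; for $d$ much larger than 2 a Monte Carlo approximation of the integrals would be the more practical choice.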
M-Estimator: Asymptotic Properties
existence, uniqueness and consistency
asymptotic normality
conditions:
(C0) for all $j = 1,\dots,d$, the first-order partial derivative of $l$ with respect to $x_j$ exists and is continuous on the set of points $x$ such that $x_j > 0$,
(C1) $t^{-1} P\bigl(1 - F_1(X_{11}) \le tx_1 \text{ or } \cdots \text{ or } 1 - F_d(X_{1d}) \le tx_d\bigr) - l(x) = O(t^{\alpha})$, uniformly in $x \in \Delta_{d-1}$ as $t \downarrow 0$, for some $\alpha > 0$
(C2) $k = o\bigl(n^{2\alpha/(1+2\alpha)}\bigr)$, for the $\alpha$ of (C1)
(C3) $\varphi$ is twice continuously differentiable and $D\varphi(\theta_0)$ is of full rank

M-Estimator: Asymptotic Normality
Theorem 2 If in addition to the assumptions of Theorem ??, the conditions (C1) and (C2) hold, then as $n \to \infty$ and $k \to \infty$,
$$\sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, M(\theta_0)).$$
Corollary 2.1 If in addition to the conditions of Theorem 2, the mapping $\theta \mapsto H_\theta$ is weakly continuous at $\theta_0$ and if $M(\theta_0)$ is non-singular, then as $n \to \infty$ and $k \to \infty$,
$$k\,(\hat\theta_n - \theta_0)^T M(\hat\theta_n)^{-1} (\hat\theta_n - \theta_0) \xrightarrow{d} \chi^2_p. \qquad (2)$$

M-Estimator: Overview
M-estimator $\hat\theta_n$: minimizer of $\bigl\| \int g\,\hat l_n - \int g\, l_\theta \bigr\|^2$
dimensions $d \ge 2$
differentiability of $l$
consistent and asymptotically normal
existing estimators
– nonparametric (empirical) estimator of $l$: Huang, 1992; Drees and Huang, 1998; Einmahl, de Haan and Li, 2006
– MLE in parametric models: Coles and Tawn, 1991; Joe, Smith and Weissman, 1992; Smith, 1994; Ledford and Tawn, 1996; de Haan, Neves, Peng, 2007

Example: Logistic Model
the $d$-dimensional multivariate logistic distribution function with parameter $\theta \in [0,1]$
$$F(x) = \exp\Bigl\{-\Bigl(\sum_{j=1}^{d} x_j^{-1/\theta}\Bigr)^{\theta}\Bigr\}$$
the stable tail dependence function corresponding to the logistic model:
$$l_\theta(x) = \Bigl(\sum_{j=1}^{d} x_j^{1/\theta}\Bigr)^{\theta}, \quad \theta \in [0,1],\ x_j \ge 0,\ j = 1,\dots,d$$
$\theta = 0$: complete dependence, $\theta = 1$: independence
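The link between the logistic df and its stable tail dependence function can be checked numerically. The sketch below assumes the standard copula form of the logistic model, $C(u) = \exp\{-(\sum_j (-\log u_j)^{1/\theta})^\theta\}$, which follows from the unit Fréchet margins of $F$, and compares the finite-$t$ ratio $(1 - C(1 - tx))/t$ with the closed form $l_\theta(x)$. Function names and the value $t = 10^{-6}$ are choices made for this example.

```python
import numpy as np

def logistic_stdf(x, theta):
    """Closed form l_theta(x) = (sum_j x_j^(1/theta))^theta, 0 < theta <= 1."""
    x = np.asarray(x, dtype=float)
    return np.sum(x ** (1.0 / theta)) ** theta

def logistic_copula(u, theta):
    """Copula of the logistic df (assumed form, see the lead-in)."""
    u = np.asarray(u, dtype=float)
    s = np.sum((-np.log(u)) ** (1.0 / theta)) ** theta
    return np.exp(-s)

def stdf_limit(x, theta, t=1e-6):
    """Finite-t version of (1 - C(1 - t*x)) / t."""
    x = np.asarray(x, dtype=float)
    return (1.0 - logistic_copula(1.0 - t * x, theta)) / t

x = [0.5, 1.0, 2.0]
print(stdf_limit(x, 0.5), logistic_stdf(x, 0.5))  # the two values nearly agree
print(logistic_stdf(np.ones(5), 0.5))             # d^theta with d = 5, theta = 0.5
```

At the point $(1,\dots,1)$ the closed form reduces to $d^{\theta}$, and at $\theta = 1$ it reduces to the sum $x_1 + \dots + x_d$, matching the independence case.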
Gumbel, 1960.

Simulations: Logistic Model
dimension $d = 5$
$$l_\theta(x) = \bigl(x_1^{1/\theta} + \dots + x_5^{1/\theta}\bigr)^{\theta}, \quad \theta \in [0,1],\ x_j \ge 0,\ j = 1,\dots,5$$
simulation: 500 samples of size $n = 3000$ from a five-dimensional logistic distribution function with $\theta_0 = 0.5$
$\hat\theta_n$, an M-estimator of $\theta_0$, with $g_1 \equiv 1$
optimal $g$: $g_{\mathrm{opt}}(x) = (\partial/\partial\theta)\, l(x;\theta_0)$
also, we consider the estimation of $l_\theta(1,1,1,1,1)$: we compare $l_{\hat\theta_n}(1,1,1,1,1) = 5^{\hat\theta_n}$ and $\hat l_n(1,1,1,1,1)$

Simulations: Logistic Model
[Figure 1: Five-dimensional logistic model, $\theta_0 = 0.5$. Panel (a): bias of the estimator of $\theta$; panel (b): RMSE of the estimator of $\theta$; horizontal axis: $k$ from 50 to 300.]

Simulations: Logistic Model
[Figure 2: Five-dimensional logistic model, $\theta_0 = 0.5$, $l_{0.5}(1,1,1,1,1) = \sqrt{5}$. Panel (a): bias and panel (b): RMSE of the two estimators of $l_{0.5}(1,1,1,1,1)$, the M-estimator and $\hat l_n(1,1,1,1,1)$; horizontal axis: $k$ from 50 to 300.]

Symmetric vs. asymmetric logistic
Asymmetric logistic model for $l$, in $d = 2$
$$l(x_1,x_2;\theta,\psi_1,\psi_2) = (1-\psi_1)x_1 + (1-\psi_2)x_2 + \bigl((\psi_1 x_1)^{1/\theta} + (\psi_2 x_2)^{1/\theta}\bigr)^{\theta}$$
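A short sketch of this function shows how it nests the symmetric model. The parameter ranges $0 < \theta \le 1$ and $0 \le \psi_1, \psi_2 \le 1$ are the usual ones for this family and are assumed here; the function name is chosen for this example.

```python
import numpy as np

def asym_logistic_stdf(x1, x2, theta, psi1, psi2):
    """Bivariate asymmetric logistic stable tail dependence function."""
    joint = ((psi1 * x1) ** (1.0 / theta) + (psi2 * x2) ** (1.0 / theta)) ** theta
    return (1.0 - psi1) * x1 + (1.0 - psi2) * x2 + joint

theta = 0.65
# psi1 = psi2 = 1 recovers the symmetric logistic model.
symmetric = (0.4 ** (1 / theta) + 1.5 ** (1 / theta)) ** theta
print(np.isclose(asym_logistic_stdf(0.4, 1.5, theta, 1.0, 1.0), symmetric))
# psi1 = psi2 = 0 gives l(x1, x2) = x1 + x2 (independence).
print(np.isclose(asym_logistic_stdf(0.4, 1.5, theta, 0.0, 0.0), 1.9))
# The margins are preserved for any psi: l(x1, 0) = x1.
print(np.isclose(asym_logistic_stdf(2.0, 0.0, theta, 0.3, 0.7), 2.0))
```

Since the symmetric model is the special case $\psi_1 = \psi_2 = 1$, a test of symmetric against asymmetric logistic is a test of a restriction on $(\psi_1, \psi_2)$.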
for the Loss-ALAE data:
– Hypothesis test: symmetric vs. asymmetric logistic
– symmetric logistic “won”, $\theta \approx 0.65$

Example: Factor Model
$r$-factor model, $r \in \mathbb{N}$, in dimension $d$: $X = (X_1,\dots,X_d)$ and
$$X_j = b_{1j}Z_1 + \dots + b_{rj}Z_r + \varepsilon_j, \quad j \in \{1,\dots,d\}$$
– factors $Z_i$ are independent Fréchet($\nu$) random variables, $\nu > 0$; $P(Z_1 \le x_1) = \exp\{-1/x_1\}$, $x_1 > 0$ (std. Fréchet)
– loadings $b_{ij}$ are nonnegative constants such that
$$b_{1j} + b_{2j} + \dots + b_{rj} = 1, \quad j \in \{1,\dots,d\}$$
– dimension of the parameter vector $\theta \in \mathbb{R}^p$, $p = d(r-1)$
Ledford and Tawn (1998), Wang and Stoev (2011)

Example: Factor Model
$r$-factor model, $r \in \mathbb{N}$, in dimension $d$: $X = (X_1,\dots,X_d)$ and
$$X_j = b_{1j}Z_1 + \dots + b_{rj}Z_r + \varepsilon_j, \quad j \in \{1,\dots,d\}$$
stable tail dependence function
$$l(x_1,\dots,x_d) = \sum_{i=1}^{r} \max_{1 \le j \le d}\{b_{ij}x_j\},$$
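Evaluating this function needs only the loading matrix. The sketch below stores the loadings as an `(r, d)` array with rows indexed by factor and columns by coordinate (a layout assumed for this example) and uses the loadings 0.8/0.2, 0.5/0.5, 0.3/0.7, 0.1/0.9 of the four-dimensional 2-factor simulation model in this deck.

```python
import numpy as np

def factor_stdf(x, B):
    """l(x) = sum_i max_j b_ij * x_j for an (r, d) loading matrix B."""
    x = np.asarray(x, dtype=float)
    B = np.asarray(B, dtype=float)
    return np.sum(np.max(B * x, axis=1))

# Rows: factors Z_1, Z_2; columns: coordinates X_1, ..., X_4 (columns sum to 1).
B = np.array([[0.8, 0.5, 0.3, 0.1],
              [0.2, 0.5, 0.7, 0.9]])

print(factor_stdf([1, 1, 1, 1], B))  # 0.8 + 0.9, up to floating-point rounding
print(factor_stdf([1, 0, 0, 0], B))  # 0.8 + 0.2: the margin is standardized
```

Each term is a maximum of linear functions, so $l$ is piecewise linear with kinks, for instance along $0.8\,x_1 = 0.5\,x_2$ in this example.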
non-differentiable $l$

Simulations: Four-dimensional model with 2 factors
simulation: 500 samples of size $n = 5000$ from a four-dimensional model
$$\begin{aligned}
X_1 &= 0.8Z_1 \vee 0.2Z_2 \\
X_2 &= 0.5Z_1 \vee 0.5Z_2 \\
X_3 &= 0.3Z_1 \vee 0.7Z_2 \\
X_4 &= 0.1Z_1 \vee 0.9Z_2,
\end{aligned}$$
with independent standard Fréchet distributed factors $Z_1$ and $Z_2$
$p = 4 \times (2-1) = 4$
$\theta_0 = (0.2, 0.5, 0.7, 0.9)$

Simulations: Four-dimensional model with 2 factors
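One sample from this design can be generated in a few lines. The sketch below draws a single sample of size $n = 5000$ by inverse-cdf sampling of the standard Fréchet factors and, as a simple check, compares the rank-based estimate $\hat l_n(1,1,1,1)$ with the model value $0.8 + 0.9 = 1.7$. The seed, the choice $k = 200$ and the function names are assumptions of this example; it does not reproduce the 500-sample study of the M-estimator.

```python
import numpy as np

def empirical_stdf(X, k, x):
    """Rank-based estimate of l(x) from an (n, d) data array X (no ties assumed)."""
    n = X.shape[0]
    R = X.argsort(axis=0).argsort(axis=0) + 1
    exceed = R > n + 0.5 - k * np.asarray(x, dtype=float)
    return exceed.any(axis=1).sum() / k

def simulate_max_factor(n, B, rng):
    """Draw n observations of X_j = max_i b_ij * Z_i with standard Frechet Z_i."""
    r = B.shape[0]
    # If U is uniform on (0, 1), then -1 / log(U) is standard Frechet.
    Z = 1.0 / -np.log(rng.uniform(size=(n, r)))
    return (Z[:, :, None] * B[None, :, :]).max(axis=1)

# Rows: factors Z_1, Z_2; columns: coordinates X_1, ..., X_4.
B = np.array([[0.8, 0.5, 0.3, 0.1],
              [0.2, 0.5, 0.7, 0.9]])

rng = np.random.default_rng(0)          # hypothetical seed
X = simulate_max_factor(5000, B, rng)
estimate = empirical_stdf(X, 200, [1.0, 1.0, 1.0, 1.0])
true_value = np.max(B, axis=1).sum()    # sum over factors of the largest loading
print(estimate, true_value)
```

Repeating the draw over many seeds and over a range of $k$ gives bias and RMSE curves of the kind summarized in the figures for this model.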
[Figure 3: Four-dimensional 2-factor model, estimation of $\theta_1 = 0.2$, $\theta_2 = 0.5$. Panel (a): bias; panel (b): RMSE; one curve each for $\theta_1$ and $\theta_2$; horizontal axis: $k$ from 200 to 1000.]

Simulations: Four-dimensional model with 2 factors
[Figure 4: Four-dimensional 2-factor model, estimation of $\theta_3 = 0.7$, $\theta_4 = 0.9$. Panel (a): bias; panel (b): RMSE; one curve each for $\theta_3$ and $\theta_4$; horizontal axis: $k$ from 200 to 1000.]

Example: Three-dimensional model with 3 factors
monthly returns of three industry portfolios: Telecommunications, Finance and Oil; July 1, 1926 - December 31, 2009
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french
model
X1 = b11Z1 + b21Z2 + b31Z3 + ε1
X2 = b12Z1 + b22Z2 + b32Z3 + ε2
X3 = b13Z1 + b23Z2 + b33Z3 + ε3,
$\theta \in \mathbb{R}^6$, $\theta = (b_{11},\dots,b_{23})$
the $q = 7$ auxiliary functions are $g_i(x) = x_i$, $g_{i+3}(x) = x_i^2$, $i = 1,2,3$, and $g_7 \equiv 1$

Example: Three-dimensional model with 3 factors
k = 60                     k = 90
0.394  0.593  0.013        0.344  0.616  0.040
0.691  0.211  0.098        0.701  0.216  0.083
0.358  0.062  0.580        0.368  0.052  0.580

k = 120                    k = 150
0.387  0.586  0.027        0.388  0.581  0.031
0.695  0.215  0.090        0.699  0.211  0.090
0.348  0.058  0.594        0.364  0.086  0.550

Overview
EVT in a $d$-dimensional case $\Leftrightarrow$ (1) + (2)
(1) $d$ univariate DoA conditions
(2) the DoA condition for the tail dependence = existence of $l$:
$$l(x_1,\dots,x_d) = \lim_{t\to 0} \frac{1 - C(1 - tx_1,\dots,1 - tx_d)}{t}.$$
estimation of $l$: nonparametric ($\hat l_n$) and parametric (M-estimator)
logistic, asymmetric logistic and factor models
meta-elliptical model: elliptical distribution with “free” marginals; flexible, contains many known models, such as normal and t-distributions

References (1)
(1) X. Huang (1992), Statistics of Bivariate Extreme Values, Ph.D. thesis, Erasmus University Rotterdam, Tinbergen Institute Research Series 22.
(2) H. Drees, X. Huang (1998), Best attainable rates of convergence for estimates of the stable tail dependence functions, J. Multivariate Anal., 64, 25-47.
(3) J. H. J. Einmahl, L. de Haan, D. Li (2006), Weighted approximations of Tail Copula processes with Application to Testing the Bivariate Extreme Value Condition, The Annals of Statistics, 34, 1987-2014.

References (2)
(1) A. Ledford and J. Tawn (1998), Concomitant Tail Behaviour for Extremes, Advances in Applied Probability.
(2) Z. Wang and S. Stoev (2011), Conditional Sampling for Max-Stable Random Fields, Advances in Applied Probability.
(1) J.H.J. Einmahl, A. Krajina, J. Segers (2012), An M-Estimator of Tail Dependence in Arbitrary Dimensions, The Annals of Statistics.
(2) J.H.J. Einmahl, A. Krajina, J. Segers (2008), A Method of Moments Estimator of Tail Dependence, Bernoulli.
(3) A. Krajina (2012), A Method of Moments Estimator of Tail Dependence in Elliptical Copula Models, Journal of Statistical Planning and Inference.