Arthur CHARPENTIER - Multivariate Distributions
Multivariate Distributions: A brief overview (Spherical/Elliptical Distributions, Distributions on the Simplex & Copulas)
A. Charpentier (Université de Rennes 1 & UQàM)
Université de Rennes 1 Workshop, November 2015. http://freakonometrics.hypotheses.org
Geometry in $\mathbb{R}^d$ and Statistics

The standard inner product is $\langle x,y\rangle_{\ell_2} = x^\mathsf{T}y = \sum_i x_iy_i$. Hence, $x\perp y$ if $\langle x,y\rangle_{\ell_2}=0$. The Euclidean norm is $\|x\|_{\ell_2} = \langle x,x\rangle_{\ell_2}^{1/2} = \big(\sum_{i=1}^n x_i^2\big)^{1/2}$. The unit sphere of $\mathbb{R}^d$ is $\mathcal{S}_d = \{x\in\mathbb{R}^d : \|x\|_{\ell_2}=1\}$.

If $x=\{x_1,\cdots,x_n\}$, note that the empirical covariance is
$$\mathrm{Cov}(x,y) = \langle x-\overline{x},\,y-\overline{y}\rangle_{\ell_2}$$
and $\mathrm{Var}(x) = \|x-\overline{x}\|_{\ell_2}^2$. For the (multivariate) linear model, $y_i = \beta_0+\beta_1^\mathsf{T}x_i+\varepsilon_i$, or equivalently,
$$y_i = \beta_0 + \langle\beta_1,x_i\rangle_{\ell_2} + \varepsilon_i.$$
The $d$-dimensional Gaussian Random Vector

If $Z\sim\mathcal{N}(0,I)$, then $X = AZ+\mu\sim\mathcal{N}(\mu,\Sigma)$ where $\Sigma = AA^\mathsf{T}$. Conversely (Cholesky decomposition), if $X\sim\mathcal{N}(\mu,\Sigma)$, then $X = LZ+\mu$ for some lower triangular matrix $L$ satisfying $\Sigma = LL^\mathsf{T}$. Denote $L = \Sigma^{1/2}$.

With the Cholesky decomposition, we have the particular case (with a Gaussian distribution) of Rosenblatt (1952)'s chain,
$$f(x_1,x_2,\cdots,x_d) = f_1(x_1)\cdot f_{2|1}(x_2|x_1)\cdot f_{3|2,1}(x_3|x_2,x_1)\cdots f_{d|d-1,\cdots,2,1}(x_d|x_{d-1},\cdots,x_2,x_1).$$

The density is
$$f(x;\mu,\Sigma) = \frac{1}{(2\pi)^{d/2}|\Sigma|^{1/2}}\exp\Big(-\frac{1}{2}\underbrace{(x-\mu)^\mathsf{T}\Sigma^{-1}(x-\mu)}_{\|x\|_{\mu,\Sigma}}\Big)\quad\text{for all }x\in\mathbb{R}^d.$$
The $d$-dimensional Gaussian Random Vector

Note that $\|x\|_{\mu,\Sigma} = (x-\mu)^\mathsf{T}\Sigma^{-1}(x-\mu)$ is the (squared) Mahalanobis distance. Define the ellipsoid $\mathcal{E}_{\mu,\Sigma} = \{x\in\mathbb{R}^d : \|x\|_{\mu,\Sigma}=1\}$. Let
$$X = \begin{pmatrix}X_1\\X_2\end{pmatrix}\sim\mathcal{N}\left(\begin{pmatrix}\mu_1\\\mu_2\end{pmatrix},\begin{pmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{pmatrix}\right),$$
then
$$X_1|X_2=x_2\sim\mathcal{N}\big(\mu_1+\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2),\;\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\big).$$
$X_1\perp\!\!\!\perp X_2$ if and only if $\Sigma_{12}=0$. Further, if $X\sim\mathcal{N}(\mu,\Sigma)$, then $AX+b\sim\mathcal{N}(A\mu+b,\,A\Sigma A^\mathsf{T})$.
The Gaussian Distribution, as a Spherical Distribution

If $X\sim\mathcal{N}(0,I)$, then $X = R\cdot U$, where
$$R^2 = \|X\|_{\ell_2}^2\sim\chi^2(d)\quad\text{and}\quad U = X/\|X\|_{\ell_2}\sim\mathcal{U}(\mathcal{S}_d),$$
with $R\perp\!\!\!\perp U$.
The Gaussian Distribution, as an Elliptical Distribution

If $X\sim\mathcal{N}(\mu,\Sigma)$, then $X = \mu + R\cdot\Sigma^{1/2}\cdot U$, where
$$R^2 = (X-\mu)^\mathsf{T}\Sigma^{-1}(X-\mu)\sim\chi^2(d)\quad\text{and}\quad U\sim\mathcal{U}(\mathcal{S}_d),$$
with $R\perp\!\!\!\perp U$.
Spherical Distributions

Let $M$ denote an orthogonal matrix, $M^\mathsf{T}M = MM^\mathsf{T} = I$. $X$ has a spherical distribution if $X\overset{\mathcal{L}}{=}MX$. E.g. in $\mathbb{R}^2$,
$$\begin{pmatrix}X_1\\X_2\end{pmatrix}\overset{\mathcal{L}}{=}\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}\begin{pmatrix}X_1\\X_2\end{pmatrix}.$$

For every $a\in\mathbb{R}^d$, $a^\mathsf{T}X\overset{\mathcal{L}}{=}\|a\|_{\ell_2}X_i$ for any $i\in\{1,\cdots,d\}$. Further, the characteristic function of $X$ can be written
$$\mathbb{E}[e^{it^\mathsf{T}X}] = \varphi(t^\mathsf{T}t) = \varphi(\|t\|_{\ell_2}^2),\quad\forall t\in\mathbb{R}^d,$$
for some $\varphi:\mathbb{R}_+\to\mathbb{R}$.
Uniform Distribution on the Sphere

Actually, more complex than it seems... In spherical coordinates,
$$x_1 = \rho\sin\varphi\cos\theta,\quad x_2 = \rho\sin\varphi\sin\theta,\quad x_3 = \rho\cos\varphi,$$
with $\rho>0$, $\theta\in[0,2\pi]$ and $\varphi\in[0,\pi]$.

If $\Theta\sim\mathcal{U}([0,2\pi])$ and $\Phi\sim\mathcal{U}([0,\pi])$, we do not have a uniform distribution on the sphere...

see https://en.wikibooks.org/wiki/Mathematica/Uniform_Spherical_Distribution, http://freakonometrics.hypotheses.org/10355
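The difference can be checked numerically. Here is a small sketch (Python/NumPy, not from the slides; function names are mine): by spherical symmetry of $\mathcal{N}(0,I)$, a normalized Gaussian vector is uniform on the sphere, whereas uniform angles pile mass near the poles.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_sphere(n, d=3):
    """Uniform points on the unit sphere of R^d: normalize standard
    Gaussian vectors (spherical symmetry of N(0, I))."""
    z = rng.standard_normal((n, d))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def naive_sphere(n):
    """Naive (non-uniform) construction from uniform angles: points pile
    up near the poles since the area element is sin(phi) dphi dtheta."""
    theta = rng.uniform(0.0, 2 * np.pi, n)
    phi = rng.uniform(0.0, np.pi, n)
    return np.column_stack([np.sin(phi) * np.cos(theta),
                            np.sin(phi) * np.sin(theta),
                            np.cos(phi)])

u = uniform_sphere(100_000)
v = naive_sphere(100_000)
# Under the uniform law, P(|x3| > 0.9) = 0.1 (cap area is linear in height);
# the naive angle construction puts far more mass near the poles.
print(np.mean(np.abs(u[:, 2]) > 0.9))   # close to 0.10
print(np.mean(np.abs(v[:, 2]) > 0.9))   # noticeably larger
```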
Spherical Distributions

A random vector $X$ has a spherical distribution if
$$X = R\cdot U$$
where $R$ is a positive random variable and $U$ is uniformly distributed on the unit sphere of $\mathbb{R}^d$, $\mathcal{S}_d$, with $R\perp\!\!\!\perp U$. E.g. $X\sim\mathcal{N}(0,I)$.
Elliptical Distributions

A random vector $X$ has an elliptical distribution if
$$X = \mu + R\cdot A\cdot U$$
where $A$ satisfies $AA^\mathsf{T} = \Sigma$ and $U\sim\mathcal{U}(\mathcal{S}_d)$, with $R\perp\!\!\!\perp U$. Denote $\Sigma^{1/2} = A$. E.g. $X\sim\mathcal{N}(\mu,\Sigma)$.
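As a quick numerical sketch of the representation $X=\mu+R\cdot A\cdot U$ (Python/NumPy, not from the slides; variable names are mine): taking $R^2\sim\chi^2(d)$ recovers the Gaussian case, which can be checked on the empirical mean and covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
A = np.linalg.cholesky(Sigma)          # A A^T = Sigma

n, d = 200_000, 2
# U uniform on the unit sphere S_d (normalized Gaussian)
z = rng.standard_normal((n, d))
U = z / np.linalg.norm(z, axis=1, keepdims=True)
# R independent of U; R^2 ~ chi^2(d) gives back the Gaussian case
R = np.sqrt(rng.chisquare(d, n))

X = mu + R[:, None] * (U @ A.T)        # X = mu + R * A * U, row by row

print(X.mean(axis=0))                  # close to mu
print(np.cov(X.T))                     # close to Sigma
```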
Elliptical Distributions

$X = \mu + R\Sigma^{1/2}U$, where $R$ is a positive random variable and $U\sim\mathcal{U}(\mathcal{S}_d)$, with $U\perp\!\!\!\perp R$. If $R\sim F_R$, then we write $X\sim\mathcal{E}(\mu,\Sigma,F_R)$.

Remark. Instead of $F_R$ it is more common to use $\varphi$ such that
$$\mathbb{E}[e^{it^\mathsf{T}X}] = e^{it^\mathsf{T}\mu}\varphi(t^\mathsf{T}\Sigma t),\quad t\in\mathbb{R}^d.$$
Then $\mathbb{E}[X] = \mu$ and $\mathrm{Var}[X] = -2\varphi'(0)\Sigma$. If the density exists,
$$f(x)\propto\frac{1}{|\Sigma|^{1/2}}\,g\Big(\sqrt{(x-\mu)^\mathsf{T}\Sigma^{-1}(x-\mu)}\Big),$$
where $g:\mathbb{R}_+\to\mathbb{R}_+$ is called the radial density. Note that
$$dF_R(r)\propto r^{d-1}g(r)\mathbf{1}(r>0).$$
Elliptical Distributions

If $X\sim\mathcal{E}(\mu,\Sigma,F_R)$, then
$$AX+b\sim\mathcal{E}(A\mu+b,\,A\Sigma A^\mathsf{T},\,F_R).$$
If
$$X = \begin{pmatrix}X_1\\X_2\end{pmatrix}\sim\mathcal{E}\left(\begin{pmatrix}\mu_1\\\mu_2\end{pmatrix},\begin{pmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{pmatrix},F_R\right),$$
then
$$X_1|X_2=x_2\sim\mathcal{E}\big(\mu_1+\Sigma_{12}\Sigma_{22}^{-1}(x_2-\mu_2),\;\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21},\;F_{1|2}\big),$$
where $F_{1|2}$ is the c.d.f. of $(R^2-\,?\,)^{1/2}$ given $X_2=x_2$.
Mixtures of Normal Distributions

Let $Z\sim\mathcal{N}(0,I)$. Let $W$ denote a positive random variable, $Z\perp\!\!\!\perp W$. Set
$$X = \mu + \sqrt{W}\,\Sigma^{1/2}Z,$$
so that $X|W=w\sim\mathcal{N}(\mu,w\Sigma)$. Then $\mathbb{E}[X] = \mu$, $\mathrm{Var}[X] = \mathbb{E}[W]\Sigma$, and
$$\mathbb{E}[e^{it^\mathsf{T}X}] = \mathbb{E}\Big[e^{it^\mathsf{T}\mu-\frac{1}{2}Wt^\mathsf{T}\Sigma t}\Big],\quad t\in\mathbb{R}^d,$$
i.e. $X\sim\mathcal{E}(\mu,\Sigma,\varphi)$ where $\varphi$ is the Laplace transform of $W$, i.e. $\varphi(t) = \mathbb{E}[e^{-tW}]$. If $W$ has an inverse Gamma distribution, $W\sim IG(\nu/2,\nu/2)$, then $X$ has a multivariate $t$ distribution, with $\nu$ degrees of freedom.
Multivariate Student $t$

$X\sim t(\mu,\Sigma,\nu)$ if
$$X = \mu + \Sigma^{1/2}\frac{Z}{\sqrt{W/\nu}},$$
where $Z\sim\mathcal{N}(0,I)$ and $W\sim\chi^2(\nu)$, with $Z\perp\!\!\!\perp W$. Note that
$$\mathrm{Var}[X] = \frac{\nu}{\nu-2}\Sigma\quad\text{if }\nu>2.$$
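The stochastic representation above translates directly into a sampler. A minimal sketch (Python/NumPy, not from the slides; the function name `rmvt` is mine), checking the variance formula $\mathrm{Var}[X]=\frac{\nu}{\nu-2}\Sigma$:

```python
import numpy as np

rng = np.random.default_rng(2)

def rmvt(n, mu, Sigma, nu):
    """Multivariate t(mu, Sigma, nu) via X = mu + Sigma^{1/2} Z / sqrt(W/nu),
    with Z ~ N(0, I), W ~ chi^2(nu), Z independent of W."""
    d = len(mu)
    L = np.linalg.cholesky(Sigma)      # L L^T = Sigma
    Z = rng.standard_normal((n, d))
    W = rng.chisquare(nu, n)
    return mu + (Z @ L.T) / np.sqrt(W / nu)[:, None]

mu = np.zeros(2)
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
nu = 5.0
X = rmvt(500_000, mu, Sigma, nu)
# Var[X] = nu/(nu-2) * Sigma for nu > 2, i.e. (5/3) * Sigma here
print(np.cov(X.T))
```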
Multivariate Student $t$

Densities with $(r=0.1,\nu=4)$, $(r=0.9,\nu=4)$, $(r=0.5,\nu=4)$ and $(r=0.5,\nu=10)$.
On Conditional Independence, de Finetti & Hewitt

Instead of $X\overset{\mathcal{L}}{=}MX$ for any orthogonal matrix $M$, consider the equality for any permutation matrix $M$, i.e.
$$(X_1,\cdots,X_d)\overset{\mathcal{L}}{=}(X_{\sigma(1)},\cdots,X_{\sigma(d)})\quad\text{for any permutation }\sigma\text{ of }\{1,\cdots,d\}.$$
E.g. $X\sim\mathcal{N}(0,\Sigma)$ with $\Sigma_{i,i}=1$ and $\Sigma_{i,j}=\rho$ when $i\neq j$. Note that necessarily
$$\rho = \mathrm{Corr}(X_i,X_j)\geq-\frac{1}{d-1}.$$
From de Finetti (1931), $X_1,\cdots,X_d,\cdots$ are exchangeable $\{0,1\}$ variables if and only if there is a c.d.f. $\Pi$ on $[0,1]$ such that
$$\mathbb{P}[X=x] = \int_0^1\theta^{x^\mathsf{T}\mathbf{1}}[1-\theta]^{n-x^\mathsf{T}\mathbf{1}}\,d\Pi(\theta),$$
i.e. $X_1,\cdots,X_d,\cdots$ are (conditionally) independent given $\Theta\sim\Pi$.
On Conditional Independence, de Finetti & Hewitt-Savage

More generally, from Hewitt & Savage (1955), random variables $X_1,\cdots,X_d,\cdots$ are exchangeable if and only if there is $\mathcal{F}$ such that $X_1,\cdots,X_d,\cdots$ are (conditionally) independent given $\mathcal{F}$.

E.g. popular shared frailty models. Consider lifetimes $T_1,\cdots,T_d$, with Cox-type proportional hazard $\mu_i(t) = \Theta\cdot\mu_{i,0}(t)$, so that
$$\mathbb{P}[T_i>t|\Theta=\theta] = \overline{F}_{i,0}(t)^\theta.$$
Assume that lifetimes are (conditionally) independent given $\Theta$.
The Simplex $\mathcal{S}_d\subset\mathbb{R}^d$

$$\mathcal{S}_d = \Big\{x=(x_1,x_2,\cdots,x_d)\in\mathbb{R}^d : x_i>0,\ i=1,2,\cdots,d;\ \sum_{i=1}^d x_i=1\Big\}.$$
Hence, the simplex here is the set of $d$-dimensional probability vectors. Note that
$$\mathcal{S}_d = \{x\in\mathbb{R}_+^d : \|x\|_{\ell_1}=1\}.$$
Remark. Sometimes the simplex is
$$\tilde{\mathcal{S}}_{d-1} = \Big\{x=(x_1,x_2,\cdots,x_{d-1})\in\mathbb{R}^{d-1} : x_i>0,\ i=1,2,\cdots,d-1;\ \sum_{i=1}^{d-1}x_i\leq1\Big\}.$$
Note that if $\tilde{x}\in\tilde{\mathcal{S}}_{d-1}$, then $(\tilde{x},1-\tilde{x}^\mathsf{T}\mathbf{1})\in\mathcal{S}_d$. If $h:\mathbb{R}_+^d\to\mathbb{R}_+$ is homogeneous of order 1, i.e. $h(\lambda x) = \lambda\cdot h(x)$ for all $\lambda>0$, then
$$h(x) = \|x\|_{\ell_1}\cdot h\Big(\frac{x}{\|x\|_{\ell_1}}\Big),\quad\text{where }\frac{x}{\|x\|_{\ell_1}}\in\mathcal{S}_d.$$
Compositional Data and Geometry of the Simplex

Following Aitchison (1986), given $x\in\mathbb{R}_+^d$ define the closure operator $\mathcal{C}$,
$$\mathcal{C}[x_1,x_2,\cdots,x_d] = \bigg[\frac{x_1}{\sum_{i=1}^d x_i},\frac{x_2}{\sum_{i=1}^d x_i},\cdots,\frac{x_d}{\sum_{i=1}^d x_i}\bigg]\in\mathcal{S}_d.$$
It is possible to define the (Aitchison) inner product on $\mathcal{S}_d$,
$$\langle x,y\rangle_a = \frac{1}{2d}\sum_{i,j}\log\frac{x_i}{x_j}\log\frac{y_i}{y_j} = \sum_i\log\frac{x_i}{\overline{x}}\log\frac{y_i}{\overline{y}},$$
where $\overline{x}$ denotes the geometric mean of $x$. It is then possible to define a linear model with compositional covariates,
$$y_i = \beta_0 + \langle\beta_1,x_i\rangle_a + \varepsilon_i.$$
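The two expressions of the Aitchison inner product (the pairwise log-ratio form and the centered log-ratio form) can be checked to agree numerically. A small sketch (Python/NumPy, not from the slides; function names are mine):

```python
import numpy as np

def closure(x):
    """Aitchison closure: project a positive vector onto the simplex S_d."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def aitchison_inner(x, y):
    """<x, y>_a = sum_i log(x_i / g(x)) log(y_i / g(y)),
    with g(.) the geometric mean (centered log-ratio form)."""
    lx = np.log(x) - np.log(x).mean()
    ly = np.log(y) - np.log(y).mean()
    return float(lx @ ly)

def aitchison_inner_pairs(x, y):
    """Equivalent pairwise form (1/2d) sum_{i,j} log(x_i/x_j) log(y_i/y_j)."""
    lx, ly = np.log(x), np.log(y)
    d = len(x)
    s = 0.0
    for i in range(d):
        for j in range(d):
            s += (lx[i] - lx[j]) * (ly[i] - ly[j])
    return s / (2 * d)

x = closure([1.0, 2.0, 3.0])
y = closure([2.0, 1.0, 1.0])
print(aitchison_inner(x, y), aitchison_inner_pairs(x, y))  # the two forms agree
```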
Dirichlet Distributions

Given $\alpha\in\mathbb{R}_+^d$ and $x\in\mathcal{S}_d\subset\mathbb{R}_+^d$,
$$f(x_1,\cdots,x_d;\alpha) = \frac{1}{B(\alpha)}\prod_{i=1}^d x_i^{\alpha_i-1},\quad\text{where }B(\alpha) = \frac{\prod_{i=1}^d\Gamma(\alpha_i)}{\Gamma\big(\sum_{i=1}^d\alpha_i\big)}.$$
Then
$$X_i\sim\mathrm{Beta}\bigg(\alpha_i,\sum_{j=1}^d\alpha_j-\alpha_i\bigg)\quad\text{and}\quad\mathbb{E}(X) = \mathcal{C}(\alpha).$$
Dirichlet Distributions

Stochastic Representation. Let $Z = (Z_1,\cdots,Z_d)$ denote independent $\mathcal{G}(\alpha_i,\theta)$ random variables. Then $S = Z_1+\cdots+Z_d = Z^\mathsf{T}\mathbf{1}$ has a $\mathcal{G}(\alpha^\mathsf{T}\mathbf{1},\theta)$ distribution, and
$$X = \mathcal{C}(Z) = \frac{Z}{S} = \bigg(\frac{Z_1}{\sum_{i=1}^d Z_i},\cdots,\frac{Z_d}{\sum_{i=1}^d Z_i}\bigg)$$
has a Dirichlet distribution $\mathrm{Dirichlet}(\alpha)$.
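The Gamma-normalization construction gives a direct sampler. A minimal sketch (Python/NumPy, not from the slides; `rdirichlet` is my name), checking $\mathbb{E}(X)=\mathcal{C}(\alpha)$ and the Beta margin:

```python
import numpy as np

rng = np.random.default_rng(3)

def rdirichlet(n, alpha, theta=1.0):
    """Dirichlet(alpha) via independent Gamma(alpha_i, theta) variables
    normalized by their sum (the scale theta cancels in the ratio)."""
    alpha = np.asarray(alpha, dtype=float)
    Z = rng.gamma(alpha, theta, size=(n, len(alpha)))
    return Z / Z.sum(axis=1, keepdims=True)

alpha = np.array([2.0, 3.0, 5.0])
X = rdirichlet(200_000, alpha)
# E[X] = C(alpha) = alpha / sum(alpha)
print(X.mean(axis=0))                       # close to [0.2, 0.3, 0.5]
# X_1 ~ Beta(a, b) with a = alpha_1, b = sum(alpha) - alpha_1,
# so Var[X_1] = a b / ((a+b)^2 (a+b+1))
a, b = alpha[0], alpha.sum() - alpha[0]
print(X[:, 0].var(), a * b / ((a + b) ** 2 * (a + b + 1)))
```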
Uniform Distribution on the Simplex

$X\sim\mathcal{D}(\mathbf{1})$ is a random vector uniformly distributed on the simplex. Consider $d-1$ independent random variables $U_1,\cdots,U_{d-1}$ with a $\mathcal{U}([0,1])$ distribution. Define the spacings
$$X_i = U_{i:d}-U_{(i-1):d},\quad\text{where the }U_{i:d}\text{ are the order statistics},$$
with conventions $U_{0:d}=0$ and $U_{d:d}=1$. Then
$$X = (X_1,\cdots,X_d)\sim\mathcal{U}(\mathcal{S}_d).$$
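The spacings construction can be sketched as follows (Python/NumPy, not from the slides; `runif_simplex` is my name): sort $d-1$ uniforms, pad with $0$ and $1$, and take consecutive differences.

```python
import numpy as np

rng = np.random.default_rng(4)

def runif_simplex(n, d):
    """Uniform on the simplex S_d via spacings of d-1 ordered uniforms,
    with conventions U_{0:d} = 0 and U_{d:d} = 1."""
    U = np.sort(rng.uniform(size=(n, d - 1)), axis=1)
    padded = np.concatenate([np.zeros((n, 1)), U, np.ones((n, 1))], axis=1)
    return np.diff(padded, axis=1)      # X_i = U_{i:d} - U_{(i-1):d}

X = runif_simplex(200_000, 3)
print(X.sum(axis=1)[:3])                # rows sum to 1
print(X.mean(axis=0))                   # uniform law: E[X] = (1/3, 1/3, 1/3)
```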
'Normal Distribution on the Simplex' (also called logistic-normal)

Let $\tilde{Y}\sim\mathcal{N}(\mu,\Sigma)$ in dimension $d-1$. Set $Z = (\tilde{Y},0)$ and
$$X = \mathcal{C}(e^Z) = \bigg(\frac{e^{Z_1}}{e^{Z_1}+\cdots+e^{Z_d}},\cdots,\frac{e^{Z_d}}{e^{Z_1}+\cdots+e^{Z_d}}\bigg).$$
Distributions on $\mathbb{R}^d$ or $[0,1]^d$

Technically, things are simpler when $X = (X_1,\cdots,X_d)$ takes values in a product measurable space, e.g. $\mathbb{R}\times\cdots\times\mathbb{R}$. In that case, $X$ has independent components if (and only if)
$$\mathbb{P}[X\in A] = \prod_{i=1}^d\mathbb{P}[X_i\in A_i],\quad\text{where }A = A_1\times\cdots\times A_d.$$
E.g. if $A_i = (-\infty,x_i)$, then
$$F(x) = \mathbb{P}[X\in(-\infty,x]] = \prod_{i=1}^d\mathbb{P}[X_i\in(-\infty,x_i]] = \prod_{i=1}^d F_i(x_i).$$
If $F$ is absolutely continuous,
$$f(x) = \frac{\partial^d F(x)}{\partial x_1\cdots\partial x_d} = \prod_{i=1}^d f_i(x_i).$$
Fréchet classes

Given some (univariate) cumulative distribution functions $F_1,\cdots,F_d:\mathbb{R}\to[0,1]$, let $\mathcal{F}(F_1,\cdots,F_d)$ denote the set of multivariate cumulative distribution functions of random vectors $X$ such that $X_i\sim F_i$. Note that for any $F\in\mathcal{F}(F_1,\cdots,F_d)$, $\forall x\in\mathbb{R}^d$,
$$F^-(x)\leq F(x)\leq F^+(x),$$
where
$$F^+(x) = \min\{F_i(x_i),\ i=1,\cdots,d\}\quad\text{and}\quad F^-(x) = \max\{0,F_1(x_1)+\cdots+F_d(x_d)-(d-1)\}.$$
Note that $F^+\in\mathcal{F}(F_1,\cdots,F_d)$, while usually $F^-\notin\mathcal{F}(F_1,\cdots,F_d)$.
Copulas in Dimension 2

A copula $C:[0,1]^2\to[0,1]$ is a cumulative distribution function with uniform margins on $[0,1]$. Equivalently, a copula $C:[0,1]^2\to[0,1]$ is a function satisfying

• $C(u_1,0) = C(0,u_2) = 0$ for any $u_1,u_2\in[0,1]$,
• $C(u_1,1) = u_1$ and $C(1,u_2) = u_2$ for any $u_1,u_2\in[0,1]$,
• $C$ is a 2-increasing function, i.e. for all $0\leq u_i\leq v_i\leq1$,
$$C(v_1,v_2)-C(v_1,u_2)-C(u_1,v_2)+C(u_1,u_2)\geq0.$$
Copulas in Dimension 2

Border conditions, in dimension $d=2$: $C(u_1,0) = C(0,u_2) = 0$, $C(u_1,1) = u_1$ and $C(1,u_2) = u_2$.
Copulas in Dimension 2

If $C$ is the copula of the random vector $(X_1,X_2)$, then $C$ couples the marginal distributions, in the sense that
$$\mathbb{P}(X_1\leq x_1,X_2\leq x_2) = C(\mathbb{P}(X_1\leq x_1),\mathbb{P}(X_2\leq x_2)).$$
Note that it is also possible to couple survival distributions: there exists a copula $C^\star$ such that
$$\mathbb{P}(X_1>x_1,X_2>x_2) = C^\star(\mathbb{P}(X_1>x_1),\mathbb{P}(X_2>x_2)).$$
The survival copula $C^\star$ associated with $C$ is the copula defined by
$$C^\star(u_1,u_2) = u_1+u_2-1+C(1-u_1,1-u_2).$$
Note that $(1-U_1,1-U_2)\sim C^\star$ if $(U_1,U_2)\sim C$.
Copulas in Dimension 2

If $X$ has distribution $F\in\mathcal{F}(F_1,F_2)$, with absolutely continuous margins, then its copula is
$$C(u_1,u_2) = F(F_1^{-1}(u_1),F_2^{-1}(u_2)),\quad\forall u_1,u_2\in[0,1].$$
More generally, if $h^{-1}$ denotes the generalized inverse of some increasing function $h:\mathbb{R}\to\mathbb{R}$, defined as $h^{-1}(t) = \inf\{x : h(x)\geq t\}$, then $C(u_1,u_2) = F(F_1^{-1}(u_1),F_2^{-1}(u_2))$ is one copula of $X$. Note that copulas are continuous functions; actually they are Lipschitz: for all $0\leq u_i,v_i\leq1$,
$$|C(u_1,u_2)-C(v_1,v_2)|\leq|u_1-v_1|+|u_2-v_2|.$$
Copulas in Dimension $d$

The increasing property of the copula function is related to the property that
$$\mathbb{P}(X\in[a,b]) = \mathbb{P}(a_1\leq X_1\leq b_1,\cdots,a_d\leq X_d\leq b_d)\geq0$$
for $X = (X_1,\cdots,X_d)\sim F$, for any $a\leq b$ (in the sense that $a_i\leq b_i$). A function $h:\mathbb{R}^d\to\mathbb{R}$ is said to be $d$-increasing if for any $[a,b]\subset\mathbb{R}^d$, $V_h([a,b])\geq0$, where
$$V_h([a,b]) = \Delta_a^b h(t) = \Delta_{a_d}^{b_d}\Delta_{a_{d-1}}^{b_{d-1}}\cdots\Delta_{a_2}^{b_2}\Delta_{a_1}^{b_1}h(t)$$
for any $t$, where
$$\Delta_{a_i}^{b_i}h(t) = h(t_1,\cdots,t_{i-1},b_i,t_{i+1},\cdots,t_n)-h(t_1,\cdots,t_{i-1},a_i,t_{i+1},\cdots,t_n).$$
Copulas in Dimension $d$

Black dot, $+$ sign; white dot, $-$ sign.
Copulas in Dimension $d$

A copula in dimension $d$ is a cumulative distribution function on $[0,1]^d$ with uniform margins on $[0,1]$. Equivalently, copulas are functions $C:[0,1]^d\to[0,1]$ such that for all $0\leq u_i\leq1$, with $i=1,\cdots,d$,
$$C(1,\cdots,1,u_i,1,\cdots,1) = u_i,$$
$$C(u_1,\cdots,u_{i-1},0,u_{i+1},\cdots,u_d) = 0,$$
and $C$ is $d$-increasing.

The most important result is Sklar's theorem, from Sklar (1959).
Sklar's Theorem

1. If $C$ is a copula, and if $F_1,\cdots,F_d$ are (univariate) distribution functions, then, for any $(x_1,\cdots,x_d)\in\mathbb{R}^d$,
$$F(x_1,\cdots,x_d) = C(F_1(x_1),\cdots,F_d(x_d))$$
is a cumulative distribution function of the Fréchet class $\mathcal{F}(F_1,\cdots,F_d)$.

2. Conversely, if $F\in\mathcal{F}(F_1,\cdots,F_d)$, there exists a copula $C$ such that the equation above holds. This function is not unique, but it is if the margins $F_1,\cdots,F_d$ are absolutely continuous, and then, for any $(u_1,\cdots,u_d)\in[0,1]^d$,
$$C(u_1,\cdots,u_d) = F(F_1^{-1}(u_1),\cdots,F_d^{-1}(u_d)),$$
where $F_1^{-1},\cdots,F_d^{-1}$ are generalized quantiles.
Copulas in Dimension $d$

Let $(X_1,\cdots,X_d)$ be a random vector with copula $C$. Let $\phi_1,\cdots,\phi_d$, $\phi_i:\mathbb{R}\to\mathbb{R}$, denote continuous, strictly increasing functions; then $C$ is also a copula of $(\phi_1(X_1),\cdots,\phi_d(X_d))$.

If $C$ is a copula, then the function
$$C^\star(u_1,\cdots,u_d) = \sum_{k=0}^d(-1)^k\sum_{i_1,\cdots,i_k}C(1,\cdots,1,1-u_{i_1},1,\cdots,1,1-u_{i_k},1,\cdots,1),$$
for all $(u_1,\cdots,u_d)\in[0,1]\times\cdots\times[0,1]$, is a copula, called the survival copula associated with $C$. If $(U_1,\cdots,U_d)\sim C$, then $(1-U_1,\cdots,1-U_d)\sim C^\star$. And if
$$\mathbb{P}(X_1\leq x_1,\cdots,X_d\leq x_d) = C(\mathbb{P}(X_1\leq x_1),\cdots,\mathbb{P}(X_d\leq x_d))$$
for all $(x_1,\cdots,x_d)\in\mathbb{R}^d$, then
$$\mathbb{P}(X_1>x_1,\cdots,X_d>x_d) = C^\star(\mathbb{P}(X_1>x_1),\cdots,\mathbb{P}(X_d>x_d)).$$
On Quasi-Copulas

A function $Q:[0,1]^d\to[0,1]$ is a quasi-copula if for any $0\leq u_i\leq1$, $i=1,\cdots,d$,
$$Q(1,\cdots,1,u_i,1,\cdots,1) = u_i,$$
$$Q(u_1,\cdots,u_{i-1},0,u_{i+1},\cdots,u_d) = 0,$$
$s\mapsto Q(u_1,\cdots,u_{i-1},s,u_{i+1},\cdots,u_d)$ is an increasing function for any $i$, and
$$|Q(u_1,\cdots,u_d)-Q(v_1,\cdots,v_d)|\leq|u_1-v_1|+\cdots+|u_d-v_d|.$$
For instance, $C^-$ is usually not a copula, but it is a quasi-copula. Let $\mathcal{C}$ be a set of copula functions and define $C^-$ and $C^+$ as lower and upper bounds for $\mathcal{C}$, in the sense that
$$C^-(u) = \inf\{C(u),\,C\in\mathcal{C}\}\quad\text{and}\quad C^+(u) = \sup\{C(u),\,C\in\mathcal{C}\}.$$
Then $C^-$ and $C^+$ are quasi-copulas (see connections with the definition of Choquet capacities as lower bounds of probability measures).
The Independent Copula $C^\perp$, or $\Pi$

The independent copula $C^\perp$ is the copula defined as
$$C^\perp(u_1,\cdots,u_d) = u_1\cdots u_d = \prod_{i=1}^d u_i\ \ (=\Pi(u_1,\cdots,u_d)).$$
Let $X\in\mathcal{F}(F_1,\cdots,F_d)$; then $X^\perp\in\mathcal{F}(F_1,\cdots,F_d)$ will denote a random vector with copula $C^\perp$, called the 'independent version' of $X$.
Fréchet-Hoeffding bounds $C^-$ and $C^+$, and Comonotonicity

Recall that the family of copula functions is bounded: for any copula $C$,
$$C^-(u_1,\cdots,u_d)\leq C(u_1,\cdots,u_d)\leq C^+(u_1,\cdots,u_d)$$
for any $(u_1,\cdots,u_d)\in[0,1]\times\cdots\times[0,1]$, where
$$C^-(u_1,\cdots,u_d) = \max\{0,u_1+\cdots+u_d-(d-1)\}$$
and
$$C^+(u_1,\cdots,u_d) = \min\{u_1,\cdots,u_d\}.$$
While $C^+$ is always a copula, $C^-$ is a copula only in dimension $d=2$. The upper bound $C^+$ is called the comonotonic copula.
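The bounds, and the fact that $C^+$ is the c.d.f. of $(U,\cdots,U)$, can be checked numerically. A small sketch (pure Python/NumPy, not from the slides; function names are mine):

```python
import numpy as np

def C_indep(u):
    """Product (independence) copula."""
    return float(np.prod(u))

def C_plus(u):
    """Comonotonic upper bound min(u_1, ..., u_d); always a copula."""
    return float(np.min(u))

def C_minus(u):
    """Lower bound max(0, u_1 + ... + u_d - (d-1)); a copula only for d = 2."""
    u = np.asarray(u, dtype=float)
    return max(0.0, float(u.sum()) - (len(u) - 1))

rng = np.random.default_rng(5)
for _ in range(1000):
    w = rng.uniform(size=3)
    assert C_minus(w) <= C_indep(w) <= C_plus(w)
print("Frechet-Hoeffding bounds hold pointwise")

# In d = 2, C+ is the c.d.f. of (U, U): P(U <= u1, U <= u2) = min(u1, u2)
U = rng.uniform(size=200_000)
u1, u2 = 0.3, 0.7
print(np.mean((U <= u1) & (U <= u2)))   # close to min(0.3, 0.7) = 0.3
```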
Fréchet-Hoeffding bounds $C^-$ and $C^+$, and Comonotonicity

Let $X\in\mathcal{F}(F_1,\cdots,F_d)$. Let $X^+\in\mathcal{F}(F_1,\cdots,F_d)$ denote a random vector with copula $C^+$, called the comonotonic version of $X$. In dimension $d=2$, let $X^-\in\mathcal{F}(F_1,F_2)$ be a counter-comonotonic version of $X$.
Fréchet-Hoeffding bounds $C^-$ and $C^+$

1. If $d=2$, $C^-$ is the c.d.f. of $(U,1-U)$ where $U\sim\mathcal{U}([0,1])$.

2. $(X_1,X_2)$ has copula $C^-$ if and only if there are $\phi$ strictly increasing and $\psi$ strictly decreasing such that $(X_1,X_2) = (\phi(Z),\psi(Z))$ for some random variable $Z$.

3. $C^+$ is the c.d.f. of $(U,\cdots,U)$ where $U\sim\mathcal{U}([0,1])$.

4. $(X_1,\cdots,X_d)$ has copula $C^+$ if and only if there are strictly increasing functions $\phi_i$ such that $(X_1,\cdots,X_d) = (\phi_1(Z),\cdots,\phi_d(Z))$ for some random variable $Z$.

Those bounds can be used to bound other quantities. If $\varphi:\mathbb{R}^2\to\mathbb{R}$ is 2-increasing (supermodular), then for any $(X_1,X_2)\in\mathcal{F}(F_1,F_2)$,
$$\mathbb{E}(\varphi(F_1^{-1}(U),F_2^{-1}(1-U)))\leq\mathbb{E}(\varphi(X_1,X_2))\leq\mathbb{E}(\varphi(F_1^{-1}(U),F_2^{-1}(U))),$$
where $U\sim\mathcal{U}([0,1])$; see Tchen (1980) for more applications.
Elliptical Copulas

Let $r\in(-1,+1)$; the Gaussian copula with parameter $r$ (in dimension $d=2$) is
$$C(u_1,u_2) = \frac{1}{2\pi\sqrt{1-r^2}}\int_{-\infty}^{\Phi^{-1}(u_1)}\int_{-\infty}^{\Phi^{-1}(u_2)}\exp\bigg(-\frac{x^2-2rxy+y^2}{2(1-r^2)}\bigg)dx\,dy,$$
where $\Phi$ is the c.d.f. of the $\mathcal{N}(0,1)$ distribution,
$$\Phi(x) = \int_{-\infty}^x\frac{1}{\sqrt{2\pi}}\exp\bigg(-\frac{z^2}{2}\bigg)dz.$$
Elliptical Copulas

Let $r\in(-1,+1)$ and $\nu>0$; the Student $t$ copula with parameters $r$ and $\nu$ is
$$C(u_1,u_2) = \int_{-\infty}^{T_\nu^{-1}(u_1)}\int_{-\infty}^{T_\nu^{-1}(u_2)}\frac{\Gamma\big(\frac{\nu}{2}+1\big)}{\pi\nu\sqrt{1-r^2}\,\Gamma\big(\frac{\nu}{2}\big)}\bigg(1+\frac{x^2-2rxy+y^2}{\nu(1-r^2)}\bigg)^{-\frac{\nu+2}{2}}dx\,dy,$$
where $T_\nu$ is the c.d.f. of the Student $t$ distribution with $\nu$ degrees of freedom,
$$T_\nu(x) = \int_{-\infty}^x\frac{\Gamma\big(\frac{\nu+1}{2}\big)}{\sqrt{\nu\pi}\,\Gamma\big(\frac{\nu}{2}\big)}\bigg(1+\frac{z^2}{\nu}\bigg)^{-\frac{\nu+1}{2}}dz.$$
Archimedean Copulas, in dimension $d=2$

Let $\phi$ denote a decreasing convex function $(0,1]\to[0,\infty]$ such that $\phi(1)=0$ and $\phi(0)=\infty$. A (strict) Archimedean copula with generator $\phi$ is the copula defined as
$$C(u_1,u_2) = \phi^{-1}(\phi(u_1)+\phi(u_2)),\quad\text{for all }u_1,u_2\in[0,1].$$
E.g. if $\phi(t) = t^{-\alpha}-1$, this is the Clayton copula. The generator of an Archimedean copula is not unique. Further, Archimedean copulas are symmetric, since $C(u_1,u_2) = C(u_2,u_1)$.
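The definition is easy to evaluate directly. A minimal sketch (pure Python, not from the slides; function names are mine), using a common Clayton parameterization $\phi(t)=(t^{-\alpha}-1)/\alpha$ and checking the symmetry and uniform-margin properties:

```python
def phi(t, alpha=2.0):
    """Clayton generator (a common parameterization): phi(t) = (t^{-alpha} - 1)/alpha."""
    return (t ** (-alpha) - 1.0) / alpha

def phi_inv(s, alpha=2.0):
    """Inverse generator: phi_inv(s) = (1 + alpha*s)^{-1/alpha}."""
    return (1.0 + alpha * s) ** (-1.0 / alpha)

def C(u1, u2, alpha=2.0):
    """Archimedean copula C(u1, u2) = phi^{-1}(phi(u1) + phi(u2))."""
    return phi_inv(phi(u1, alpha) + phi(u2, alpha), alpha)

# symmetry, uniform margins, and agreement with the Clayton closed form
print(C(0.3, 0.8), C(0.8, 0.3))                      # equal: C is symmetric
print(C(0.3, 1.0))                                   # = 0.3 (uniform margin)
print(C(0.3, 0.8), (0.3 ** -2 + 0.8 ** -2 - 1) ** -0.5)  # matches closed form
```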
Archimedean Copulas, in dimension $d=2$

The only Archimedean copula that is radially symmetric, i.e. such that $C(u_1,u_2) = C^\star(u_1,u_2)$, is obtained with
$$\phi(t) = -\log\bigg(\frac{e^{-\alpha t}-1}{e^{-\alpha}-1}\bigg).$$
This is the Frank copula, from Frank (1979).

Some prefer a multiplicative version of Archimedean copulas,
$$C(u_1,u_2) = h^{-1}[h(u_1)\cdot h(u_2)].$$
The link is $h(t) = \exp[\phi(t)]$, or conversely $\phi(t) = \log(h(t))$.
Archimedean Copulas, in dimension $d=2$

Remark: in dimension 1, $\mathbb{P}(F(X)\leq t) = t$, i.e. $F(X)\sim\mathcal{U}([0,1])$ if $X\sim F$. Archimedean copulas can also be characterized by their Kendall function,
$$K(t) = \mathbb{P}[C(U_1,U_2)\leq t] = t-\lambda(t),\quad\text{where }\lambda(t) = \frac{\phi(t)}{\phi'(t)}$$
and where $(U_1,U_2)\sim C$. Conversely,
$$\phi(t) = \exp\bigg(\int_{t_0}^t\frac{ds}{\lambda(s)}\bigg),$$
where $t_0\in(0,1)$ is some arbitrary constant.
Archimedean Copulas, in dimension $d=2$

Note that Archimedean copulas can also be defined when $\phi(0)<\infty$. Let $\phi$ denote a decreasing convex function $(0,1]\to[0,\infty]$ such that $\phi(1)=0$. Define the inverse of $\phi$ as
$$\phi^{-1}(t) = \begin{cases}\phi^{-1}(t), & \text{for }0\leq t\leq\phi(0),\\ 0, & \text{for }\phi(0)<t<\infty.\end{cases}$$
An Archimedean copula with generator $\phi$ is the copula defined as
$$C(u_1,u_2) = \phi^{-1}(\phi(u_1)+\phi(u_2)),\quad\text{for all }u_1,u_2\in[0,1].$$
Non-strict Archimedean copulas have a non-empty null set, $\{(u_1,u_2) : \phi(u_1)+\phi(u_2)>\phi(0)\}$, such that
$$\mathbb{P}\big((U_1,U_2)\in\{(u_1,u_2) : \phi(u_1)+\phi(u_2)>\phi(0)\}\big) = 0.$$
Archimedean Copulas, in dimension $d=2$

This set is bounded by the zero curve, $\{(u_1,u_2) : \phi(u_1)+\phi(u_2)=\phi(0)\}$, with mass
$$\mathbb{P}\big((U_1,U_2)\in\{(u_1,u_2) : \phi(u_1)+\phi(u_2)=\phi(0)\}\big) = -\frac{\phi(0)}{\phi'(0^+)},$$
which is strictly positive if $-\phi'(0^+)<+\infty$. E.g. if $\phi(t) = \frac{1}{\alpha}(t^{-\alpha}-1)$, with $\alpha\in[-1,\infty)$, $\alpha\neq0$, with limiting case $\phi(t) = -\log(t)$ when $\alpha\to0$: this is the Clayton copula.
Archimedean Copulas, in dimension $d=2$

Generator $\psi(t)$ — range of $\theta$ — name:

(1) $\frac{1}{\theta}(t^{-\theta}-1)$ — $[-1,0)\cup(0,\infty)$ — Clayton, Clayton (1978)
(2) $(1-t)^\theta$ — $[1,\infty)$
(3) $\log\dfrac{1-\theta(1-t)}{t}$ — $[-1,1)$ — Ali-Mikhail-Haq
(4) $(-\log t)^\theta$ — $[1,\infty)$ — Gumbel, Gumbel (1960), Hougaard (1986)
(5) $-\log\dfrac{e^{-\theta t}-1}{e^{-\theta}-1}$ — $(-\infty,0)\cup(0,\infty)$ — Frank, Frank (1979), Nelsen (1987)
(6) $-\log\{1-(1-t)^\theta\}$ — $[1,\infty)$ — Joe, Frank (1981), Joe (1993)
(7) $-\log\{\theta t+(1-\theta)\}$ — $(0,1]$
(8) $\dfrac{1-t}{1+(\theta-1)t}$ — $[1,\infty)$
(9) $\log(1-\theta\log t)$ — $(0,1]$ — Barnett (1980), Gumbel (1960)
(10) $\log(2t^{-\theta}-1)$ — $(0,1]$
(11) $\log(2-t^\theta)$ — $(0,1/2]$
(12) $(\frac{1}{t}-1)^\theta$ — $[1,\infty)$
(13) $(1-\log t)^\theta-1$ — $(0,\infty)$
(14) $(t^{-1/\theta}-1)^\theta$ — $[1,\infty)$
(15) $(1-t^{1/\theta})^\theta$ — $[1,\infty)$ — Genest & Ghoudi (1994)
(16) $(\frac{\theta}{t}+1)(1-t)$ — $[0,\infty)$
Archimedean Copulas, in dimension $d\geq2$

Archimedean copulas are associative (see Schweizer & Sklar (1983)), i.e.
$$C(C(u_1,u_2),u_3) = C(u_1,C(u_2,u_3)),\quad\text{for all }0\leq u_1,u_2,u_3\leq1.$$
In dimension $d>2$, assume that $\phi^{-1}$ is $d$-completely monotone (where $\psi$ is $d$-completely monotone if it is continuous and for all $k=0,1,\cdots,d$, $(-1)^k\,d^k\psi(t)/dt^k\geq0$). An Archimedean copula in dimension $d\geq2$ is defined as
$$C(u_1,\cdots,u_d) = \phi^{-1}(\phi(u_1)+\cdots+\phi(u_d)),\quad\text{for all }u_1,\cdots,u_d\in[0,1].$$
Archimedean Copulas, in dimension $d\geq2$

Those copulas are obtained iteratively, starting with
$$C_2(u_1,u_2) = \phi^{-1}(\phi(u_1)+\phi(u_2))$$
and then, for any $n\geq2$,
$$C_{n+1}(u_1,\cdots,u_{n+1}) = C_2(C_n(u_1,\cdots,u_n),u_{n+1}).$$
Let $\psi$ denote the Laplace transform of a positive random variable $\Theta$; then (Bernstein's theorem) $\psi$ is completely monotone, and $\psi(0)=1$. Then $\phi = \psi^{-1}$ is an Archimedean generator in any dimension $d\geq2$. E.g. if $\Theta\sim\mathcal{G}(1/\alpha,1)$, then $\psi(t) = (1+t)^{-1/\alpha}$, and we obtain the Clayton copula.
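The frailty construction suggests a sampler (the Marshall-Olkin algorithm): draw $\Theta$, then set $U_i=\psi(E_i/\Theta)$ with $E_i$ i.i.d. exponential. A sketch (Python/NumPy, not from the slides; `rclayton` is my name), using $\Theta\sim\mathcal{G}(1/\theta,1)$, $\psi(t)=(1+t)^{-1/\theta}$, which gives the Clayton copula with generator $\phi(t)=t^{-\theta}-1$:

```python
import numpy as np

rng = np.random.default_rng(6)

def rclayton(n, theta):
    """Sample (U1, U2) from the Clayton copula via the frailty representation:
    U_i = psi(E_i / Theta), with Theta ~ Gamma(1/theta, 1), E_i ~ Exp(1) i.i.d.,
    and psi(t) = (1 + t)^{-1/theta} the Laplace transform of Theta."""
    Theta = rng.gamma(1.0 / theta, 1.0, size=n)
    E = rng.exponential(size=(n, 2))
    return (1.0 + E / Theta[:, None]) ** (-1.0 / theta)

theta = 2.0
U = rclayton(300_000, theta)
# margins are uniform on [0, 1]
print(U[:, 0].mean(), U[:, 1].mean())            # close to 0.5
# empirical C(0.5, 0.5) vs the closed form (u^-theta + v^-theta - 1)^(-1/theta)
emp = np.mean((U[:, 0] <= 0.5) & (U[:, 1] <= 0.5))
print(emp, (0.5 ** -theta + 0.5 ** -theta - 1) ** (-1 / theta))
```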
Archimedean Copulas, in dimension $d\geq2$

Let $X = (X_1,\cdots,X_d)$ denote remaining lifetimes, with a joint survival distribution function that is Schur-constant, i.e. there is $S:\mathbb{R}_+\to[0,1]$ such that
$$\mathbb{P}(X_1>x_1,\cdots,X_d>x_d) = S(x_1+\cdots+x_d).$$
Then the margins $X_i$ are also Schur-constant (i.e. exponentially distributed), and the survival copula of $X$ is Archimedean with generator $S^{-1}$. Observe further that
$$\mathbb{P}(X_i-x_i>t|X>x) = \mathbb{P}(X_j-x_j>t|X>x)$$
for all $t>0$ and $x\in\mathbb{R}_+^d$. Hence, if $S$ is a power function, we obtain the Clayton copula; see Nelsen (2005).
Archimedean Copulas, in dimension $d\geq2$

Let $(C_n)$ be a sequence of absolutely continuous Archimedean copulas, with generators $(\phi_n)$. The limit of $C_n$, as $n\to\infty$, is Archimedean if either

• there is a generator $\phi$ such that, for all $s,t\in(0,1]$,
$$\lim_{n\to\infty}\frac{\phi_n(s)}{\phi_n'(t)} = \frac{\phi(s)}{\phi'(t)};$$
• there is a continuous function $\lambda$ such that $\lim_{n\to\infty}\lambda_n(t) = \lambda(t)$;
• there is a continuous function $K$ such that $\lim_{n\to\infty}K_n(t) = K(t)$;
• there is a sequence of positive constants $(c_n)$ such that $\lim_{n\to\infty}c_n\phi_n(t) = \phi(t)$, for all $t\in[0,1]$.
Copulas, Optimal Transport and Matching

Monge-Kantorovich:
$$\min_{T:\mathbb{R}\to\mathbb{R}}\bigg\{\int\ell(x_1,T(x_1))\,dF_1(x_1);\ T(X_1)\sim F_2\text{ when }X_1\sim F_1\bigg\}$$
for some loss function $\ell$, e.g. $\ell(x_1,x_2) = [x_1-x_2]^2$. In the Gaussian case, if $X_i\sim\mathcal{N}(0,\sigma_i^2)$, then $T^\star(x_1) = \sigma_2/\sigma_1\cdot x_1$. Equivalently,
$$\min_{F\in\mathcal{F}(F_1,F_2)}\int\ell(x_1,x_2)\,dF(x_1,x_2) = \min_{F\in\mathcal{F}(F_1,F_2)}\{\mathbb{E}_F[\ell(X_1,X_2)]\}.$$
If $\ell$ is quadratic, we want to maximize the correlation,
$$\max_{F\in\mathcal{F}(F_1,F_2)}\{\mathbb{E}_F[X_1\cdot X_2]\}.$$
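An empirical version of the quadratic problem can be sketched as follows (Python/NumPy, not from the slides): in dimension 1 with quadratic loss, the optimal coupling is comonotone (the copula $C^+$), obtained on samples by sorting both series; with $\sigma_1=1$, $\sigma_2=2$, the optimal map is $T^\star(x_1)=2x_1$.

```python
import numpy as np

rng = np.random.default_rng(7)

# With l(x1, x2) = (x1 - x2)^2, the optimal empirical coupling is comonotone:
# match the sorted samples (equivalently, E[X1 X2] is maximal under C+).
x = rng.normal(0.0, 1.0, 10_000)        # X1 ~ N(0, 1)
y = rng.normal(0.0, 2.0, 10_000)        # X2 ~ N(0, 4)

sorted_cost = np.mean((np.sort(x) - np.sort(y)) ** 2)
random_cost = np.mean((x - rng.permutation(y)) ** 2)
# comonotone: E[(X1-X2)^2] = 1 + 4 - 2*1*2 = 1; independent: 1 + 4 = 5
print(sorted_cost, random_cost)

# In the Gaussian case the optimal map is T*(x1) = (sigma_2/sigma_1) x1 = 2 x1:
# the sorted y sample is close to 2 times the sorted x sample
print(np.mean((np.sort(y) - 2.0 * np.sort(x)) ** 2))   # small
```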