
Depth in Multivariate Statistics

Ignacio Cascos

Departamento de Estadística Universidad Carlos III de Madrid

Getafe, January 2011

Outline

Motivation

Main definitions Depth function and depth-trimmed region Data depths

Algorithms Halfspaces and convex hulls (depth functions) Circular sequence (depth-trimmed regions)

Applications Bagplot, DDplot, volume statistics, L-statistics, stochastic orders, quality control, risk measures

Parameter depth & regression

Boxplot & Bagplot (Rousseeuw, Ruts & Tukey, 1999)
Decathlon @ Athens 2004: long jump & 100m

[Figure: univariate boxplots of the long jump and 100m marks.]

Boxplot & Bagplot (Rousseeuw, Ruts & Tukey, 1999)
Decathlon @ Athens 2004: long jump & 100m

[Figure: scatter plot of the bivariate data, long jump vs. 100m.]

PPplot & DDplot Liu, Parelius & Singh (1999)

[Figure: histogram (Frequency vs. x) of a univariate sample.]

PPplot & DDplot Liu, Parelius & Singh (1999)

PP plot

[Figure: PP plot of the sample; points (Fn(xi), F(xi)) along the diagonal of the unit square.]

PPplot & DDplot Liu, Parelius & Singh (1999)

DD plot

[Figure: DD plot of the sample; points (Dn(xi), D(xi)).]

Dn(x) = min{Fn(x), 1 − Fn(x)} ;  D(x) = min{F(x), 1 − F(x)}
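A minimal sketch in R of the univariate DD plot just defined; the sample is assumed to be drawn from N(0,1), so the population depth uses pnorm:

## Univariate DD plot: empirical depth Dn(x) = min{Fn(x), 1 - Fn(x)}
## against the population depth under N(0,1) (an assumption made here).
set.seed(1)
x  <- rnorm(200)
Fn <- ecdf(x)                       # empirical distribution function
Dn <- pmin(Fn(x), 1 - Fn(x))        # empirical depth of each xi
D  <- pmin(pnorm(x), 1 - pnorm(x))  # population depth of each xi
plot(Dn, D, xlab = "Dn(xi)", ylab = "D(xi)")
abline(0, 1, lty = 2)               # points concentrate near the diagonal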

PPplot & DDplot Liu, Parelius & Singh (1999)

Data cloud

[Figure: scatter plot of a bivariate data cloud.]

PPplot & DDplot Liu, Parelius & Singh (1999)

DD plot

[Figure: DD plot of the bivariate data cloud; points (Dn(xi), D(xi)).]

Convex hull peeling Barnett (1976)

For a given multivariate data set S:

Step 0. Set i = 1.
Step 1. Find the convex hull of the data set S; Ci is the set of its extreme points.
Step 2. Delete Ci from the data set, S = S \ Ci.
Step 3. While there are points left in the data set S, go back to Step 1 with i = i + 1.
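A minimal sketch in R of the peeling loop above for the bivariate case, built on base R's chull(); the data set is simulated for illustration.

## Convex hull peeling: returns the layer index of every point.
peel <- function(xy) {
  layer <- integer(nrow(xy))               # layer number of each point
  idx <- seq_len(nrow(xy))                 # indices of points still in S
  i <- 1
  while (length(idx) > 0) {                # Step 3: points left in S
    hull <- chull(xy[idx, , drop = FALSE]) # Step 1: extreme points Ci
    layer[idx[hull]] <- i
    idx <- idx[-hull]                      # Step 2: S = S \ Ci
    i <- i + 1
  }
  layer
}
set.seed(1)
xy <- cbind(rnorm(50), rnorm(50))
peel(xy)                                   # 1 = outermost layer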


Convex hull peeling Barnett (1976)

[Figure: nested convex layers of the decathlon data, long jump vs. 100m.]

Convex hull peeling Barnett (1976)

Peeling depth. If m is the level of the centermost (deepest) layer, the peeling depth of a point from Ci is given by i/m.

Problem. There is no distributional counterpart of convex hull peeling.



Depth function Liu (1990), Zuo & Serfling (2000), Dyckerhoff (2004)

A depth function, D(x; P) (or D(x) for short), satisfies:

D1 Affine invariance. D(Ax + b; P_{AX+b}) = D(x; P_X) for every nonsingular A ∈ R^{d×d} and b ∈ R^d ;
D2 Vanishes at infinity. D(x; P) → 0 as ‖x‖ → ∞ ;
D3 Upper semicontinuity. {x ∈ R^d : D(x; P) ≥ α} is closed ;
D4 Monotonicity relative to deepest point. D(x; P) ≤ D(θ + λ(x − θ); P) for θ = arg max_x D(x; P) and 0 ≤ λ ≤ 1 ;
D4' Quasiconcavity. D(λx + (1 − λ)y; P) ≥ min{D(x; P), D(y; P)} for 0 ≤ λ ≤ 1 .


Depth-trimmed or central regions Dyckerhoff (2004)

The sets D^α(P) = {x ∈ R^d : D(x; P) ≥ α} are nested, D^β(P) ⊆ D^α(P) if α ≤ β, and further:

R1 Affine equivariance. D^α(P_{AX+b}) = {Ax + b : x ∈ D^α(P_X)} for every nonsingular A ∈ R^{d×d} and b ∈ R^d ;
R2 Bounded. D^α(P) is bounded ;
R3 Closed. D^α(P) is closed ;
R4 Starshaped. If x ∈ D^α(P) for all α, then each D^α(P) is starshaped wrt x ;
R4' Convex. D^α(P) is convex .

Conversely, D(x; P) = sup{α : x ∈ D^α(P)} .



Halfspace depth (Tukey, 1975)
Decathlon @ Athens 2004: long jump vs. 100m

[Figure: the data cloud with closed halfplanes through a candidate point.]

HDn(x) = n^{-1} min_{u∈R^d} #{i : ⟨Xi, u⟩ ≥ ⟨x, u⟩}

Halfspace depth Tukey (1975), Rousseeuw & Ruts (1999)

Population halfspace depth

HD(x; P) = inf{P(H) : x ∈ H closed halfspace} ;

• Univariate HD(x; P) = min{F(x), 1 − F(x)} ;
• Satisfies Properties R1–R4 and R4' .

Halfspace trimming

HD^α(P) = ∩{H : H closed halfspace, P(H) > 1 − α} ;

• Univariate HD^α(P) = [q_X^−(α), q_X(1 − α)] .


Simplicial depth (Liu, 1990)
Decathlon @ Athens 2004: long jump vs. 100m

[Figure: the data cloud with triangles (simplices) through sample points.]

SDn(x) = (n choose d+1)^{-1} Σ_{1≤i1<···<i_{d+1}≤n} 1(x ∈ co{X_{i1}, X_{i2}, ..., X_{i_{d+1}}})

Simplicial depth Liu (1990)

Population simplicial depth

SD(x; P) = Pr(x ∈ co{X1, X2, ..., X_{d+1}})

where X1, ..., X_{d+1} are independent with distribution P .
• Univariate SD(x; P) = 2F(x)(1 − F(x)) ;
• Satisfies R1–R3. Also R4 on absolutely continuous and angularly symmetric distributions.

Generalizations
• Averaging open and closed convex hull.
• Convex hull of k ≠ d + 1 observations.
• Average number of independent observations needed to contain a fixed point inside their convex hull.


Oja depth

OD(x; P) = (1 + E Vol_d(co{x, X1, ..., Xd}) / √det(Σ))^{-1}

Mahalanobis depth

MhD(x; P) = (1 + (x − µ)′Σ^{-1}(x − µ))^{-1}

Projection depth

PD(x; P) = (1 + sup_{u∈R^d} (⟨x, u⟩ − Me(⟨X, u⟩)) / MEDA(⟨X, u⟩))^{-1}
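A minimal sketch of the sample Mahalanobis depth above, plugging the sample mean and covariance into the formula (data simulated for illustration):

## Sample Mahalanobis depth: 1 / (1 + squared Mahalanobis distance).
mhd <- function(x, data)
  1 / (1 + mahalanobis(x, center = colMeans(data), cov = cov(data)))
set.seed(1)
data <- cbind(rnorm(100), rnorm(100))
mhd(rbind(c(0, 0), c(2, 2)), data)    # deeper near the center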


Zonoid trimming (Koshevoy & Mosler, 1997)
Decathlon @ Athens 2004: long jump vs. 100m

[Figure: nested zonoid trimmed regions of the data cloud, shrinking to the sample mean.]

ZD_n^{k/n} = co{ (1/k) Σ_{i=1}^{k} X_{π(i)} : π permutation of {1,...,n} }

Zonoid trimming Koshevoy & Mosler (1997)

Population zonoid trimmed regions

ZD^α(P) = { ∫ x g(x) dP(x) : g : R^d → [0, α^{-1}], ∫ g(x) dP(x) = 1 }

• Univariate
ZD^α(P) = [ (1/α) ∫_0^α q_X(t) dt , (1/α) ∫_{1−α}^1 q_X(t) dt ]
• Properties R1–R4 and R4', ZD^1(P) = {EX}.
• Zonoid depth ZD(x; P) = sup{α : x ∈ ZD^α(P)}.
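A sketch of how the empirical region ZD^{k/n} can be traced: in each direction u, its support point is the average of the k observations with largest projection on u (ties ignored; the direction grid and data are illustrative assumptions):

## Boundary of the empirical zonoid region ZD^{k/n} in the plane.
zd_region <- function(data, k, ndir = 360) {
  phi <- 2 * pi * seq_len(ndir) / ndir
  t(sapply(phi, function(a) {
    u <- c(cos(a), sin(a))
    top <- order(data %*% u, decreasing = TRUE)[1:k]  # k largest projections
    colMeans(data[top, , drop = FALSE])               # their average
  }))
}
set.seed(1)
data <- cbind(rnorm(100), rnorm(100))
plot(data, col = "grey", asp = 1)
polygon(zd_region(data, k = 25), border = "blue")     # ZD^{0.25}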

Expected convex hull trimming (Cascos, 2007)

Minkowski addition

A ⊕ B = {a + b : a ∈ A, b ∈ B}

[Figure: co{(−1/3, −1/2), (1/3, 1/2)} ⊕ co{(−2/3, 1/4), (2/3, −1/4)}, a parallelogram.]
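For illustration, a minimal sketch of the Minkowski sum of two planar point sets: form all pairwise sums and keep the convex hull (the two segments are those of the figure above).

## Minkowski sum of the convex hulls of two point sets A and B.
minkowski <- function(A, B) {
  S <- do.call(rbind, lapply(seq_len(nrow(A)),
                             function(i) sweep(B, 2, A[i, ], "+")))
  S[chull(S), ]                        # vertices of the Minkowski sum
}
A <- rbind(c(-1/3, -1/2), c(1/3, 1/2))
B <- rbind(c(-2/3, 1/4), c(2/3, -1/4))
plot(minkowski(A, B), type = "n", asp = 1)
polygon(minkowski(A, B))               # a parallelogram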


Expected convex hull trimming (Cascos, 2007)

Toy example

• 5 points in the plane
• mean value of the 5 points
• segments joining each pair of the 5 points
• Minkowski addition of the (5 choose 2) segments, weighted by (5 choose 2)^{-1}

Expected convex hull trimming (Cascos, 2007)
Decathlon @ Athens 2004: long jump vs. 100m

[Figure: expected convex hull trimmed regions of the data cloud around the sample mean.]

CD_n^k = (n choose k)^{-1} ⊕_{1≤i1<···<ik≤n} co{X_{i1}, ..., X_{ik}}

Expected convex hull trimming Cascos (2007)

Population expected convex hull

CD^{1/k}(P) = E co{X1, X2, ..., Xk} ,

where E is the selection expectation and X1, X2, ... are independent observations drawn from P.
• Univariate
CD^{1/k}(P) = [E min{X1, ..., Xk} , E max{X1, ..., Xk}]
• Properties R1–R4 and R4', CD^1(P) = {EX}.


Halfspace depth Rousseeuw & Ruts (1996)

• Aim of the algorithm: compute the bivariate halfspace depth HDn(x).

Step 1. For each data point Xi, compute the angle of the line through x and Xi with the horizontal axis.
Step 2. Sort the angles as α1 ≤ α2 ≤ ... ≤ αn.
Step 3. Let di be the number of angles in (αi, αi + π].

HDn(x) = min di/n
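A compact sketch of this idea in R (quadratic rather than the O(n log n) of the original algorithm; it assumes no data point coincides with x):

## Bivariate halfspace depth: minimum, over data angles a_i, of the
## number of points whose angle falls in the half-open arc (a_i, a_i + pi].
hd2 <- function(x, data) {
  a <- atan2(data[, 2] - x[2], data[, 1] - x[1])  # angles seen from x
  counts <- sapply(a, function(ai) {
    d <- (a - ai) %% (2 * pi)                     # angular gaps in [0, 2*pi)
    sum(d > 0 & d <= pi)                          # points in (ai, ai + pi]
  })
  min(counts) / nrow(data)
}
set.seed(1)
data <- cbind(rnorm(100), rnorm(100))
hd2(c(0, 0), data)    # close to 1/2 at the center
hd2(c(3, 3), data)    # close to 0 far from the cloud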


Simplicial depth

Halfspaces and convex hulls. A fixed point x is contained in the convex hull of a finite set of points iff there is no open halfspace with x on its boundary containing all the points of the set.

• The bivariate simplicial depth can be computed in a similar way.
• R package DEPTH, Massé & Plante (2009)
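A brute-force sketch of the bivariate sample simplicial depth, counting the triangles that contain x directly (O(n^3), for illustration only; the package cited above is the practical route):

## Fraction of data triangles containing the point x.
sd2 <- function(x, data) {
  cross <- function(o, p, q)          # z-component of (p - o) x (q - o)
    (p[1] - o[1]) * (q[2] - o[2]) - (p[2] - o[2]) * (q[1] - o[1])
  inside <- function(idx) {
    a <- data[idx[1], ]; b <- data[idx[2], ]; c <- data[idx[3], ]
    s <- c(cross(a, b, x), cross(b, c, x), cross(c, a, x))
    all(s >= 0) || all(s <= 0)        # x on one side of all three edges
  }
  mean(combn(nrow(data), 3, inside))
}
set.seed(1)
data <- cbind(rnorm(30), rnorm(30))
sd2(c(0, 0), data)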



Circular sequence algorithm for expected convex hull Cascos (2007)

CD_n^k = co{ (k/n_{(k)}) Σ_{j=k}^{n} (j−1)_{(k−1)} X_{π(j)} : π permutation of {1,...,n} }

(here a_{(m)} = a(a−1)···(a−m+1) denotes the falling factorial)

• Aim of the algorithm: find the extreme points of the expected convex hull trimmed region of a fixed level k for a set of points in the plane.
• X and Y contain the x and y coordinates of a set of points.
• ORD and RANK are initialized with the values from 1 to n.
• EXT1 and EXT2 will store extreme points of CD_n^k.


Circular sequence

Step 1. Order the points so that X[i] ≤ X[i + 1] and, if X[i] = X[i + 1], then Y[i] > Y[i + 1].
Step 2. Create array ANGLE with the angle determined by the line through the points (X[i], Y[i]) and (X[j], Y[j]). Sort ANGLE.
Step 3. Find entries of ANGLE with the same value (collinear points).
Step 4. Reverse the order of each set of collinear points in the arrays ORD and RANK.


Circular sequence

Step 5. If any RANK[i] ≥ k is modified in Step 4, Σ_{j=k}^{n} (j−1)_{(k−1)} (X[ORD[j]], Y[ORD[j]]) is appended at the end of EXT1. If any RANK[i] ≤ n − k is modified in Step 4, Σ_{j=k}^{n} (j−1)_{(k−1)} (X[ORD[n+1−j]], Y[ORD[n+1−j]]) is appended at the end of EXT2.
Step 6. If any element is left in array ANGLE, continue with Step 3.
Step 7. Multiply EXT1 and EXT2 by k/n_{(k)}.

The sorting of Step 2 has complexity O(n² log n), which determines the overall complexity.



Bagplot Rousseeuw, Ruts & Tukey (1999)

Center. Center of gravity of the halfspace deepest region.
Bag. Halfspace trimmed region containing the deeper half of the observations. If not exact, interpolate radially from the center.
Fence. (unplotted) Enlarge the bag radially from the center by a factor of 3.
Outliers. Points outside the fence.

R package APLPACK, Wolf (2007)
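A quick usage sketch with the package cited above (assuming aplpack is installed; the data are simulated stand-ins for the decathlon marks):

## Bagplot of a bivariate sample: bag, fence, and outliers.
library(aplpack)
set.seed(1)
x <- rnorm(100, mean = 7.3, sd = 0.3)     # long-jump-like marks
y <- rnorm(100, mean = 10.9, sd = 0.15)   # 100m-like marks
bagplot(x, y, xlab = "long jump", ylab = "100m")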

Boxplot & Bagplot (Rousseeuw, Ruts & Tukey, 1999)
Decathlon @ Athens 2004: long jump & 100m

[Figure: bagplot of the decathlon data.]

DDplot, Multivariate goodness-of-fit Liu, Parelius & Singh (1999)

A data set and a population distribution. Plot the empirical depth of every point of the data set versus its population depth.
Two data sets. Take every point from each of the two data sets and plot its empirical depth wrt the first data set versus the empirical depth wrt the second.

DDplot, Multivariate goodness-of-fit Liu, Parelius & Singh (1999)

[Figure: DD plot concentrated along the diagonal.]

Two data sets drawn from the same distribution.

DDplot, Multivariate goodness-of-fit Liu, Parelius & Singh (1999)

[Figure: DD plot departing from the diagonal.]

Location shift.

DDplot, Multivariate goodness-of-fit Liu, Parelius & Singh (1999)

[Figure: DD plot departing from the diagonal.]

Different dispersion.

Volume statistics

Scatter estimates (Zuo & Serfling, 2000 & Oja, 1983). The volume of affine equivariant (R1) depth-trimmed regions is a statistic of scatter:

Vol_d(D^α(P_{AX+b})) = |det(A)| Vol_d(D^α(P_X))

Expected convex hull & Gini mean difference

Vol_1(CD^{1/2}(P)) = E|X1 − X2| = 2M1(X)
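A one-line empirical check of the last identity on simulated data; the length of the univariate region CD^{1/2} is the Gini mean difference E|X1 − X2|:

## U-statistic estimate of the Gini mean difference E|X1 - X2|.
set.seed(1)
x <- rnorm(1000)
n <- length(x)
sum(abs(outer(x, x, "-"))) / (n * (n - 1))
## The same number is the average of pairwise maxima minus the
## average of pairwise minima, i.e. the length of the empirical CD^{1/2}.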

Depth-weighted L-statistics

Classical construction

L = Σ_i Xi W(Dn(Xi)) / Σ_i W(Dn(Xi))

• Set-valued?
• Multivariate L-moments?
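A minimal sketch of the construction, with sample Mahalanobis depth standing in for Dn and an indicator weight, which yields a depth-trimmed mean:

## Depth-weighted L-statistic: trimmed mean of the deepest points.
set.seed(1)
data <- cbind(rnorm(100), rnorm(100))
Dn <- 1 / (1 + mahalanobis(data, colMeans(data), cov(data)))
W  <- as.numeric(Dn >= 0.2)          # W(t) = 1{t >= 0.2}
colSums(data * W) / sum(W)           # sum_i Xi W(Dn(Xi)) / sum_i W(Dn(Xi))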

Stochastic orderings

Scatter (Zuo & Serfling, 2000) and dispersion (Massé & Theodorescu, 1994)

• X ≤sc Y iff Vol_d(HD^α(P_X)) ≤ Vol_d(HD^α(P_Y)) for all α.
• X ≤disp Y iff for all α < β
Vol_d(HD^α(P_X)) − Vol_d(HD^β(P_X)) ≤ Vol_d(HD^α(P_Y)) − Vol_d(HD^β(P_Y)) .

Variability - convex orderings (Koshevoy & Mosler, 1998)

• E f∘l(X) ≤ E f∘l(Y) for all l : R^d → R linear and f : R → R convex iff ZD^α(P_X) ⊆ ZD^α(P_Y) for all α.
• Usually denoted by X ≤lcx Y.

Quality Control

Control Chart

[Figure: univariate control chart of 40 observations.]

Depth-based multivariate rank

rP(x) = P({y : D(y; P) ≤ D(x; P)})

Quality Control Liu (1995)

Multivariate Control Charts

• Control Chart for r instead of the X-Chart.

r_{P̂m}(Xi) = #{Yj : Dm(Yj) ≤ Dm(Xi)}/m .

• Control Chart for Q (average rank) instead of the X̄-Chart.

Q(P̂m; X1, ..., Xk) = (1/k) Σ_{i=1}^{k} r_{P̂m}(Xi) .

• Control Chart for S (accumulated rank) instead of the CUSUM Chart.

S(P̂m; X1, ..., Xn) = Σ_{i=1}^{n} (r_{P̂m}(Xi) − 1/2) .
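A sketch of the r statistic with Mahalanobis depth standing in for Dm (the reference sample and the shifted new observations are simulated):

## Liu's r statistic: fraction of reference points no deeper than Xi.
set.seed(1)
Y <- cbind(rnorm(500), rnorm(500))         # in-control reference sample
X <- cbind(rnorm(20, 1), rnorm(20, 1))     # new observations (shifted)
dep <- function(z, ref)
  1 / (1 + mahalanobis(z, colMeans(ref), cov(ref)))
r <- sapply(seq_len(nrow(X)), function(i)
  mean(dep(Y, Y) <= dep(X[i, , drop = FALSE], Y)))
plot(r, type = "b", ylim = c(0, 1))
abline(h = 0.05, lty = 2)                  # small r signals out of control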

Risk measurement

A risky portfolio is modeled as a random variable X that represents a financial gain.

Classical univariate risk measures
• Value at Risk: V@R_α(X) = −q_X(α)
• Expected Shortfall:
ES_α(X) = −(1/α) ∫_0^α q_X(t) dt = −min ZD^α(P_X)
• Expected Minimum:
EM_{1/n}(X) = −E min{X1, ..., Xn} = −min CD^{1/n}(P_X)
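A quick empirical check of the first two measures on simulated gains (the sample quantile stands in for q_X; α = 0.05):

## Empirical Value at Risk and Expected Shortfall at level 0.05.
set.seed(1)
x <- rnorm(10000)                             # gains
alpha <- 0.05
VaR <- -quantile(x, alpha, names = FALSE)     # -q_X(alpha)
ES  <- -mean(sort(x)[1:(alpha * length(x))])  # minus mean of worst 5%
c(VaR = VaR, ES = ES)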

Classical definition of coherent risk measure Artzner, Delbaen, Eber and Heath (1999)

ρ(X) ∈ R is the risk associated with X ∈ L^∞_1 (essentially bounded)

1. Cash-invariance. ρ(X + y) = ρ(X) − y for y ∈ R. (Translation equivariance)
2. Monotonicity. If X ≥ Y a.s., then ρ(X) ≤ ρ(Y).
3. Homogeneity. ρ(tX) = tρ(X) if t > 0.

ρ is coherent if further
4. Subadditivity. ρ(X + Y) ≤ ρ(X) + ρ(Y) (ES and EM)

Multivariate risk measures Cascos & Molchanov (2007)

A risky portfolio is modeled as a random vector X in R^d.

Marginalisation. If ρ(X) is vector-valued (ρ(X) ∈ R^d) and subadditive, it is marginalised, i.e.

ρ(X1, X2, ..., Xd) = (ρ1(X1), ρ2(X2), ..., ρd(Xd)) .

Family of (set-valued) risks

Operations with sets:
Minkowski addition: A ⊕ B = {a + b : a ∈ A, b ∈ B}
Multiplication by a scalar (λ ∈ R): λA = {λa : a ∈ A}

The family of risks G is composed of all closed and convex subsets of R^d that are upper (i.e. A ⊕ R^d_+ = A whenever A ∈ G).

G is the family of all risks. It is possible to operate with them (Minkowski addition and multiplication by scalar) and to compare them (inclusion).

On deterministic portfolios, ρ(x) = −x + R^d_+ .

Set-valued risks

[Figure: examples of set-valued risks (upper convex sets) in the plane.]

General definition of a set-valued risk measure Jouini, Meddeb and Touzi (2004)

Given a random vector X, its risk SR(X) ∈ G and SR(0) = R^d_+.

1. Translation equivariance. SR(X − y) = SR(X) + y for y ∈ R^d.
2. Monotonicity. If X ≥ Y a.s., then SR(X) ⊇ SR(Y).
3. Homogeneity. SR(tX) = tSR(X) if t > 0.
4. Coherent if further SR(X) ⊕ SR(Y) ⊆ SR(X + Y) .

Further properties of depth-trimmed regions

R5 Monotonicity. (HD, ZD, CD) If X ≥ Y a.s., then D^α(X) ⊆ D^α(Y) ⊕ R^d_+ ,
R6 Subadditivity. (ZD, CD) D^α(X + Y) ⊆ D^α(X) ⊕ D^α(Y) .

Acceptance

Acceptable risks & risk value. The set of acceptable risks consists of those whose depth-trimmed region lies in the first quadrant,

A_α = {X ∈ L^∞_d : D^α(P_X) ⊆ R^d_+}

The risk measure is obtained from the acceptance set as

SR_α(X) = {x ∈ R^d : X + x ∈ A_α} = R^d_+ ⊖ (−D^α(P_X)) = −min D^α(P_X) + R^d_+

Zonoid trimming

ZD^{0.5}(P_X), where X follows a 2-dimensional standard Gaussian distribution

[Figure: the region ZD^{0.5}(P_X) centered at the origin.]

Risk from zonoid trimming

SR_{0.5}(X) = −min ZD^{0.5}(P_X) + R^d_+, where X follows a 2-dimensional standard Gaussian distribution

[Figure: the upper orthant −min ZD^{0.5}(P_X) + R^d_+.]

Set-valued risks graphically

• Cone of acceptable deterministic portfolios, K ⊃ R^d_+
• Trimmed region CD^{1/2}(P_X)
• Cone and region
• Set-valued risk SR_{1/2}(X)

[Figure: the cone K, the region CD^{1/2}(P_X), and the resulting risk SR_{1/2}(X).]

Non-marginalised set-valued risks

ZD^{0.9}(P_X), where X follows a 2-dimensional standard Gaussian distribution, K_{0.9} = R^d_+

[Figure: ZD^{0.9}(P_X) and the cone K_{0.9}.]

Non-marginalised set-valued risks

ZD^{0.5}(P_X), where X follows a 2-dimensional standard Gaussian distribution, K_{0.5} ⊃ R^d_+

[Figure: ZD^{0.5}(P_X) and the cone K_{0.5}.]

Non-marginalised set-valued risks

(K_{0.9} ⊖ (−ZD^{0.9}(P_X))) ∩ (K_{0.5} ⊖ (−ZD^{0.5}(P_X))) is also a set-valued risk measure

[Figure: intersection of the two risks.]


Parameter depth

Mizera (2002), Mizera & Müller (2004)
• Nonfit: an element of the parameter space that is not a suitable candidate for the parameter wrt a sample
• Depth: fraction of points that must be deleted from the sample in order to make our candidate a nonfit

Halfspace depth
• Nonfit: a point outside the convex hull of the sample

Regression depth Rousseeuw & Hubert (1999)

• Problem: univariate regression
• Parameter space: lines on R^2, or R^2 itself
• Nonfit: a line that does not contain any data point and whose residuals change sign at most once

Location-scale depth (Cascos & López-Díaz, 2011) what for? Joint (X̄, S) control charts

Control Chart

[Figure: univariate control chart of the individual measurements.]

Location-scale depth (Cascos & López-Díaz, 2011) what for? Joint (X̄, S) control charts

joint control chart

[Figure: joint control region in the (location, dispersion) plane.]

Location-scale depth (Cascos & López-Díaz, 2011) what for? Joint (X̄, S) control charts

Control Chart

[Figure: control chart of the location-scale depth of each sample.]

Concluding remarks

• Data depth
• Specific depth functions and depth-trimmed regions
• Applications (emphasis on risk measurement)
• Parameter depth (location-scale depth)

Appendix

References from the speaker I

Cascos, I. The expected convex hull trimmed regions of a sample. Comput. Statist. 22, 557–569 (2007)
Cascos, I. & Molchanov, I. Multivariate risks and depth-trimmed regions. Finance Stoch. 11, 373–397 (2007)
Cascos, I. Data depth: multivariate statistics and geometry. In: W.S. Kendall and I. Molchanov, eds. New Perspectives in Stochastic Geometry, Oxford University Press, 398–423 (2010)

References from the speaker II

Cascos, I. & López-Díaz, M. Trimmed regions induced by parameters of a probability. In preparation
Cascos, I. & López-López, A. Data depth in Multivariate Statistics. In preparation for BEIO

Thank you!!