
DEPTH, OUTLYINGNESS, QUANTILE, AND RANK FUNCTIONS – CONCEPTS, PERSPECTIVES, CHALLENGES

Depth, Outlyingness, Quantile, and Rank Functions in Multivariate & Other Data Settings

Robert Serfling¹
Serfling & Thompson Statistical Consulting and Tutoring

ASA Alabama-Mississippi Chapter Mini-Conference
University of Mississippi, Oxford
April 5, 2019

¹ www.utdallas.edu/~serfling

“Don’t walk in front of me, I may not follow. Don’t walk behind me, I may not lead. Just walk beside me and be my friend.” – Albert Camus

OUTLINE

- Depth, Outlyingness, Quantile, and Rank Functions on $\mathbb{R}^d$
- Depth Functions on Arbitrary Data Space $\mathcal{X}$
- Depth Functions on Arbitrary Parameter Space $\Theta$
- Concluding Remarks

DEPTH, OUTLYINGNESS, QUANTILE, AND RANK FUNCTIONS ON $\mathbb{R}^d$

PRELIMINARY PERSPECTIVES

Depth functions are a nonparametric approach

- The setting is nonparametric data analysis. No parametric or semiparametric model is assumed or invoked.
- We exhibit the geometric structure of a data set in terms of a center, quantile levels, measures of outlyingness for each point, and identification of outliers or outlier regions.
- Such data description is developed in terms of a depth function that measures centrality from a global viewpoint and yields center-outward ordering of data points.
- This differs from the density function, which measures local probability mass.
- This extends the univariate median, quantiles, ranks, and outlyingness functions.
- This also yields depth-based inference procedures.

Depth functions on $\mathbb{R}^d$ must satisfy essential criteria

- Two criteria are of paramount importance:
  - Invariance under affine transformation to new coordinates,
  - Robustness against influence of discordant data points.
- A proposed “depth function” deficient with respect to these criteria is not acceptable for nonparametric description of a data cloud in terms of center, quantiles, outlyingness, and ranks.

Depth functions are neither a parametric approach nor a semiparametric approach

- In parametric or semiparametric inference, with primary goals of model-fitting and model-testing, the criteria of invariance and robustness become seriously compromised against the key and overriding criterion of high efficiency.
- Proposed depth functions compromising affine invariance and robustness cannot be advanced on the basis that their associated rank tests are efficient in some parametric or semiparametric setting.
- The fundamental role of depth functions is nonparametric centrality-based data description and ordering.

A depth function in the multivariate setting should properly extend the univariate case

- In the univariate setting, depth (centrality) has been only implicit, with emphasis instead on equivalent outlyingness.
- Extensions of univariate median, quantiles, outlyingness, and ranks to the multivariate setting should reduce to these when the dimension is 1.
- This is a necessary condition for a proposed depth function.
- However, this condition is not sufficient as validation of a proposed depth function.

GENERAL NOTION OF DEPTH FUNCTION

Depth functions – the general notion

- Relative to distribution $F$ on space $\mathcal{X}$, a depth function $D(x, F)$ provides a center-outward ordering of $x$ in $\mathcal{X}$.
- Higher depth represents greater “centrality”.
- Maximum depth points define “center” or “median”.
- Nested contours of equal depth enclose the median.
- Orientation to a “center” compensates for the typical lack of a linear order in $\mathcal{X}$.
- For a sample $\mathbb{X}_n = \{X_1, \ldots, X_n\}$ in $\mathcal{X}$ following distribution $F$, some sample version $D(x, \mathbb{X}_n)$ is constructed.
- Implementation of this simple notion for specific choices of $\mathcal{X}$ poses interesting challenges.
- For $\mathcal{X} = \mathbb{R}^d$, some general treatments: Liu, Parelius, and Singh (1999), Zuo and Serfling (2000), Liu, Souvaine, and Serfling (2006).

Centrality likes symmetry

- Notions of symmetry yield notions of “center”.
- “Centers” provide reference points for “centrality”.
- If $F$ is symmetric about $\theta$, then we might require of a depth function $D(x, F)$:
  - $D(x, F)$ is maximal at $\theta$.
  - $D(X, F)$ is symmetric about $\theta$.
  - $D(x, F)$ decreases along rays from $\theta$.
- In $\mathbb{R}^d$, we have a hierarchy of notions of symmetry:

  spherical $\prec$ elliptical $\prec$ central $\prec$ angular $\prec$ halfspace

Centrality ignores multimodality

- Depth functions on $\mathbb{R}^d$ measure centrality without regard to multimodality.
- Is this a virtue or a defect?
- Multimodality might be regarded as the business of density functions rather than depth functions.
- Are we really interested in “multi-centrality”?
- One resolution: Carry out cluster analysis first, then carry out depth analysis within clusters.

Depth and density functions are very different

- Centrality is a global concept. Thus depth functions are center-oriented in the sense of median-oriented.
- In comparison, density functions provide local measures of probability mass.
- For example, for the uniform distribution on the $d$-cube, the density is constant, but typical depth functions are not.
- Thus the likelihood function is not a depth function.
- Features such as multimodality and nonconvexity are to be handled via the density function, not the depth function.
- Nevertheless, for an ellipsoidally symmetric distribution, the contours of a depth function should coincide with those of the density function.

DEPTH FUNCTIONS ON $\mathbb{R}^d$

The story began formally with Tukey (1975)

- The “Tukey” or “halfspace” depth on $\mathcal{X} = \mathbb{R}^d$ is

  $$D(x, F) = \inf\{P(H) : x \in H \text{ closed halfspace}\}, \quad x \in \mathbb{R}^d,$$

  the minimal probability attached to any halfspace with $x$ on the boundary.
- Antecedents: Hotelling (1929)
- Can use other classes besides halfspaces (Small, 1987)
- Affine invariant
- Robust sample versions
- However,
  - Computationally intensive
  - Complicated asymptotics
  - Exhibits some undesirable anomalous behavior (Dutta, Ghosh, and Chaudhuri, 2011)

Measuring centrality by halfspace depth

[Figure: Halfspace Depth Function for Uniform Distribution on $[0, 1]^2$; surface plot over axes $x$ and $y$.]

Halfspace depth for $F$ uniform on $[0, 1]^2$. For comparison, the density function for this distribution is constant.

The story advanced ...

- The simplicial depth (Liu, 1988, 1990) on $\mathbb{R}^d$ is

  $$D(x, F) = P(x \in S[X_1, \ldots, X_{d+1}]), \quad x \in \mathbb{R}^d,$$

  for $S[X_1, \ldots, X_{d+1}]$ a simplex in $\mathbb{R}^d$ having independent observations $X_1, \ldots, X_{d+1}$ from $F$ as vertices.
- For $d = 2$: probability that a random triangle covers $x$
- Can use other shapes besides simplices
- The sample version is a U-statistic pointwise in $x$
- Affine invariant
- However,
  - Poor robustness (a U-statistic is an average)
  - Computational burden increases with $d$
- With the introduction of this depth by Regina Liu, it was realized that “depth” is a general concept with many quite different implementations. This spawned an “industry”!

The “spatial” (or “$L_1$”) depth

- The spatial depth (Vardi and Zhang, 2000) on $\mathbb{R}^d$ is

  $$D(x, F) = 1 - \| E\, S(x - X) \|, \quad x \in \mathbb{R}^d,$$

  where $S(y) = y / \|y\|$ ($= 0$ if $y = 0$) is the sign function on $\mathbb{R}^d$ and $\| \cdot \|$ is the Euclidean norm.
- Only orthogonally invariant
- The sample version involves basically a sum,

  $$\sum_{i=1}^{n} S(x - X_i),$$

  so it is easily computed and its asymptotic behavior is readily derived via the CLT.
- However, its robustness decreases away from the center.

Halfspace and spatial depths, $F$ Uniform on $[0, 1]^2$

[Figure: six panels over axes $x$ and $y$, titled Halfspace Depth Contours, Halfspace Depth Function, Halfspace Depth Function, Spatial Depth Contours, Spatial Depth Function, Spatial Depth Function.]

Upper row: halfspace depth.
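
As a rough numerical companion to the halfspace and spatial depth definitions, the following Python sketch evaluates sample versions of both on a simulated uniform sample on $[0, 1]^2$. It is an illustration under stated assumptions, not the method used to produce the figures in the talk. The function names `spatial_depth` and `halfspace_depth_approx`, the sample size, the seeds, and the two evaluation points are all hypothetical choices. The halfspace depth is not computed exactly: the infimum over halfspaces is replaced by a minimum over randomly drawn directions, so the result is only an upper bound on the exact sample depth.

```python
import numpy as np


def spatial_depth(x, sample):
    """Sample spatial depth: 1 - || average of S(x - X_i) ||."""
    diffs = x - sample                       # shape (n, d)
    norms = np.linalg.norm(diffs, axis=1)    # Euclidean lengths
    signs = np.zeros_like(diffs)
    nonzero = norms > 0                      # S(0) = 0 by convention
    signs[nonzero] = diffs[nonzero] / norms[nonzero, None]
    return 1.0 - np.linalg.norm(signs.mean(axis=0))


def halfspace_depth_approx(x, sample, n_directions=2000, rng=None):
    """Approximate sample halfspace depth (assumption: random-direction search).

    For each unit direction u, take the fraction of sample points in the
    closed halfspace {y : u . (y - x) >= 0}, which has x on its boundary,
    and return the smallest fraction found. This bounds the exact sample
    depth from above.
    """
    rng = np.random.default_rng(rng)
    d = sample.shape[1]
    u = rng.standard_normal((n_directions, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)   # unit directions
    proj = (sample - x) @ u.T                       # shape (n, n_directions)
    fractions = (proj >= 0).mean(axis=0)            # empirical P(H) per direction
    return fractions.min()


# Simulated data: 2000 draws from the uniform distribution on [0, 1]^2.
rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 1.0, size=(2000, 2))

center = np.array([0.5, 0.5])
corner = np.array([0.05, 0.05])

for name, point in [("center", center), ("near corner", corner)]:
    sd = spatial_depth(point, sample)
    hd = halfspace_depth_approx(point, sample, rng=1)
    print(f"{name}: spatial depth = {sd:.3f}, halfspace depth (approx.) = {hd:.3f}")
```

Both depths should come out markedly larger at the center of the square than near the corner, even though the density is the same at the two points. Increasing `n_directions` tightens the halfspace approximation at the cost of more computation.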