Causal Inference on Discrete Data Via Estimating Distance Correlations

Total Page:16

File Type:pdf, Size:1020Kb

Causal Inference on Discrete Data Via Estimating Distance Correlations 1 Causal Inference on Discrete Data via Estimating Distance Correlations Furui Liu, Laiwan Chan Department of Computer Science and Engineering, The Chinese University of Hong Kong Abstract In this paper, we deal with the problem of inferring causal directions when the data is on discrete domain. By considering the distribution of the cause P (X) and the condi- tional distribution mapping cause to effect P (Y jX) as independent random variables, we propose to infer the causal direction via comparing the distance correlation between P (X) and P (Y jX) with the distance correlation between P (Y ) and P (XjY ). We in- fer “X causes Y ” if the dependence coefficient between P (X) and P (Y jX) is smaller. Experiments are performed to show the performance of the proposed method. Editor: Aapo Hyvarinen¨ 1 Introduction Inferring the causal direction between two variables from observational data becomes arXiv:1803.07712v3 [stat.ML] 7 Aug 2018 a hot research topic. Additive noise models (ANMs) (Zhang and Hyvarinen,¨ 2009a; Shimizu et al., 2006; Hoyer et al., 2009; Peters et al., 2011; Hoyer and Hyttinen, 2009; Zhang and Hyvarinen,¨ 2009b; Shimizu et al., 2011; Hyvarinen¨ and Smith, 2013; Hyvarinen¨ et al., 2010; Hyttinen et al., 2012; Mooij et al., 2011) are preliminary trials to solve this problem. They assume that the effect is governed by the cause and an ad- ditive noise, and the causal inference is done by finding the direction that admits such a model. Recently, under another view of exploiting the asymmetry between cause and effect, the linear trace method (LTr) (Zscheischler et al., 2011; Janzing et al., 2010) and information geometric causal inference (IGCI) (Janzing et al., 2012) are proposed. Suppose X is the cause and Y is the effect. Based on the fact that the generating of P (X) is independent with that of P (Y jX) (Janzing and Scholkopf,¨ 2010; Lemeire and Janzing, 2013; Scholkopf¨ et al., 2012), LTr suggests that the trace condition is fulfilled in the causal direction while violated in the anti-causal direction, and IGCI shows that the density of the cause and the log slope of the function transforming cause to effect are uncorrelated while the density of the effect and the log slope of the inverse of the func- tion are positively correlated. By accessing these so-called cause-effect asymmetries, one can determine the causal direction. Then a kernel method using the framework of IGCI to deal with high dimensional variables is developed (Chen et al., 2014), and nonlinear extensions of trace method is presented (Chen et al., 2013). In some situations, the variables of interest are on discrete domains, and researchers have adopted additive noise models to discrete data for causal inference (Peters et al., 2010, 2011). Given observations of the variable pair, they do regressions in both direc- tions, and test the independence between the residuals and the regressors. The direction that admits an additive noise model is inferred as the causal direction. However, ANMs may not be suitable for modeling discrete variables in some situations. For example, it is not natural to adopt ANMs to modeling categorical variables. Methods with wider applicability would be valuable for causal inference on discrete data. Motivated by the postulate that the generating of P (X) is independent with that of P (Y jX), we suggest that the (P (x);P (Y jx)) is an observation of a pair of vari- ables that are independent with each other (here P (x) = P (X = x), referring to the probability at one specified point). To infer the causal direction, we calculate the de- pendence coefficients between P (X) and P (Y jX), and P (Y ) and P (XjY ). Then the direction that induces smaller correlation is inferred as the causal direction. Without a functional causal model assumption, our method is with wider applicability than tradi- tional ANMs. Various experiments are conducted to demonstrate the performance of the proposed method. This paper is organized as follows. Section 2 defines the problem. Section 3 presents the causal inference principle. Section 4 gives the detailed causal inference method. Section 5 shows the experiments, and section 6 concludes the whole paper. 2 Problem Description Suppose we have a set of observed variables X and Y with support domain X and Y respectively. X is the cause and Y is the effect but we do not have prior knowledge about the causal direction. We assume that they are on discrete domain (for continuous variable we can perform discretization) and there are no latent confounders. We want to identify the causal direction (“X causes Y” or “Y causes X”). For clarity, we list the symbols that may appear in the following sections in table 1. Since we constrain 2 variables in finite range, P (X; Y ) and P (Y jX) can be written as matrices, and P (Y jx) is a vector in the matrix P (Y jX). We use these representations in the rest of the paper. Table 1: Lookup table Symbol Description X support of variable X Y support of variable Y P (X) distribution of X P (x) probability of X = x P (X; Y ) joint distribution of (X; Y ) P (Y jX) conditional distribution of Y given X P (Y jx) conditional distribution of Y given X = x j · j cardinality of a set 3 Causal Inference Principle In this section, we will present the principle that we are using for causal inference on discrete data. We start with the basic idea. 3.1 Basic Idea The basic idea is to consider the (P (x);P (Y jx)) as a realization of a variable pair, and the two variables (one is one dimensional and the other is high dimensional) are independent with each other. See figure 1 for an example. Figure 1 shows an example of P (X) and P (Y jX). Suppose jX j = M and jYj = L (here M = 7 and L = 8). P (X) is a vector in RM and P (Y jX) is a matrix in RL×M . The highlighted (red bars) grids are a pair (P (x);P (Y jx)). Consider P (X) and P (Y jX) as two independent random variables. The generating of the P (X) and P (Y jX) is done by drawing realizations (P (x);P (Y jx)) at each possible value of x (shifting the red bars from right to left). We have jX j realizations. We formalize this in postulate 1. Postulate 1. P (X) and P (Y jX) are both random variables taking realizations at dif- ferent x. P (X) is independent with P (Y jX). Once we have this postulate in mind, one could seek for some properties induced by it for causal discovery. We will discuss this in the next section. 3 P(Y|X) (P(x) , P(Y|x)) P(X) Figure 1: P (X) and P (Y jX) as random variables 3.2 Distance Correlation If we want to characterize the dependence between P (X) and P (Y jX), one measure- ment is the correlation. However, P (Y jX) is a high dimensional random vector. Adopt- ing traditional dependence coefficients like Pearson correlations would cause certain estimation bias when sample size is not large. Moreover, it would be useful if the in- dependence between variables corresponds to 0 correlation. This is not true if we use traditional correlations. Here we propose to use distance correlation (Szekely´ et al., 2007) as the dependence measurement. Distance correlation is a measurement of dependence between two random variables (one dimensional or high dimensional). Suppose we have two random variables (α; β), with characteristic functions fα(t) and fβ(s) respectively. Their joint characteristic function is fα,β(t; s). The distance covariance is defined as below. Definition 1. The distance covariance C(α; β) between two random variables (α; β) is 2 2 C (α; β) = kfα,β(t; s) − fα(t)fβ(s)k (1) Here k·k refers the weighted L2 norm, and similarly we can define distance variance C2(α; α) (Szekely´ et al., 2007). Then the distance correlation is defined as Definition 2. The distance correlation D(α; β) is C(α; β) D(α; β) = (2) pC(α; α)C(β; β) and D(α; β) = 0 if C(α; α) = 0 or C(β; β) = 0. 4 This dependence measurement is a distance metric between the joint and multiply of the marginal characteristic functions. There are other methods to measure the depen- dence, like mutual information and kernel independence measurements (Gretton et al., 2005). However, mutual information is hard to estimate given finite sample size. Kernel methods involve a few parameters (kernel functions, kernel widths) which is not easy to choose. So we use this metric in our paper. Then we discuss how to estimate the distance correlation empirically from data (Szekely´ et al., 2007). Suppose we have n n observations of two random variables (α; β) as f(αi; βi)gi=1. For variable α and β, we can construct n n n 1 X 1 X 1 X a = kα − α k; a = a ; a = a ; a = a (3) ij i j i· n ij ·j n ij ·· n2 ij j=1 i=1 i;j=1 n n n 1 X 1 X 1 X b = kβ − β k; b = b ; b = b ; b = b (4) ij i j i· n ij ·j n ij ·· n2 ij j=1 i=1 i;j=1 and then we construct matrices A and B, with its entries as Aij = aij − ai· − a·j + a·· (5) Bij = bij − bi· − b·j + b·· (6) Then we can estimate the empirical distance covariance as follows (Szekely´ et al., 2007).
Recommended publications
  • The Practice of Causal Inference in Cancer Epidemiology
    Vol. 5. 303-31 1, April 1996 Cancer Epidemiology, Biomarkers & Prevention 303 Review The Practice of Causal Inference in Cancer Epidemiology Douglas L. Weedt and Lester S. Gorelic causes lung cancer (3) and to guide causal inference for occu- Preventive Oncology Branch ID. L. W.l and Comprehensive Minority pational and environmental diseases (4). From 1965 to 1995, Biomedical Program IL. S. 0.1. National Cancer Institute, Bethesda, Maryland many associations have been examined in terms of the central 20892 questions of causal inference. Causal inference is often practiced in review articles and editorials. There, epidemiologists (and others) summarize evi- Abstract dence and consider the issues of causality and public health Causal inference is an important link between the recommendations for specific exposure-cancer associations. practice of cancer epidemiology and effective cancer The purpose of this paper is to take a first step toward system- prevention. Although many papers and epidemiology atically reviewing the practice of causal inference in cancer textbooks have vigorously debated theoretical issues in epidemiology. Techniques used to assess causation and to make causal inference, almost no attention has been paid to the public health recommendations are summarized for two asso- issue of how causal inference is practiced. In this paper, ciations: alcohol and breast cancer, and vasectomy and prostate we review two series of review papers published between cancer. The alcohol and breast cancer association is timely, 1985 and 1994 to find answers to the following questions: controversial, and in the public eye (5). It involves a common which studies and prior review papers were cited, which exposure and a common cancer and has a large body of em- causal criteria were used, and what causal conclusions pirical evidence; over 50 studies and over a dozen reviews have and public health recommendations ensued.
    [Show full text]
  • Statistics and Causal Inference (With Discussion)
    Applied Statistics Lecture Notes Kosuke Imai Department of Politics Princeton University February 2, 2008 Making statistical inferences means to learn about what you do not observe, which is called parameters, from what you do observe, which is called data. We learn the basic principles of statistical inference from a perspective of causal inference, which is a popular goal of political science research. Namely, we study statistics by learning how to make causal inferences with statistical methods. 1 Statistical Framework of Causal Inference What do we exactly mean when we say “An event A causes another event B”? Whether explicitly or implicitly, this question is asked and answered all the time in political science research. The most commonly used statistical framework of causality is based on the notion of counterfactuals (see Holland, 1986). That is, we ask the question “What would have happened if an event A were absent (or existent)?” The following example illustrates the fact that some causal questions are more difficult to answer than others. Example 1 (Counterfactual and Causality) Interpret each of the following statements as a causal statement. 1. A politician voted for the education bill because she is a democrat. 2. A politician voted for the education bill because she is liberal. 3. A politician voted for the education bill because she is a woman. In this framework, therefore, the fundamental problem of causal inference is that the coun- terfactual outcomes cannot be observed, and yet any causal inference requires both factual and counterfactual outcomes. This idea is formalized below using the potential outcomes notation.
    [Show full text]
  • Bayesian Causal Inference
    Bayesian Causal Inference Maximilian Kurthen Master’s Thesis Max Planck Institute for Astrophysics Within the Elite Master Program Theoretical and Mathematical Physics Ludwig Maximilian University of Munich Technical University of Munich Supervisor: PD Dr. Torsten Enßlin Munich, September 12, 2018 Abstract In this thesis we address the problem of two-variable causal inference. This task refers to inferring an existing causal relation between two random variables (i.e. X → Y or Y → X ) from purely observational data. We begin by outlining a few basic definitions in the context of causal discovery, following the widely used do-Calculus [Pea00]. We continue by briefly reviewing a number of state-of-the-art methods, including very recent ones such as CGNN [Gou+17] and KCDC [MST18]. The main contribution is the introduction of a novel inference model where we assume a Bayesian hierarchical model, pursuing the strategy of Bayesian model selection. In our model the distribution of the cause variable is given by a Poisson lognormal distribution, which allows to explicitly regard discretization effects. We assume Fourier diagonal covariance operators, where the values on the diagonal are given by power spectra. In the most shallow model these power spectra and the noise variance are fixed hyperparameters. In a deeper inference model we replace the noise variance as a given prior by expanding the inference over the noise variance itself, assuming only a smooth spatial structure of the noise variance. Finally, we make a similar expansion for the power spectra, replacing fixed power spectra as hyperparameters by an inference over those, where again smoothness enforcing priors are assumed.
    [Show full text]
  • Testing the Causal Direction of Mediation Effects in Randomized Intervention Studies
    Prevention Science (2019) 20:419–430 https://doi.org/10.1007/s11121-018-0900-y Testing the Causal Direction of Mediation Effects in Randomized Intervention Studies Wolfgang Wiedermann1 & Xintong Li1 & Alexander von Eye2 Published online: 21 May 2018 # Society for Prevention Research 2018 Abstract In a recent update of the standards for evidence in research on prevention interventions, the Society of Prevention Research emphasizes the importance of evaluating and testing the causal mechanism through which an intervention is expected to have an effect on an outcome. Mediation analysis is commonly applied to study such causal processes. However, these analytic tools are limited in their potential to fully understand the role of theorized mediators. For example, in a design where the treatment x is randomized and the mediator (m) and the outcome (y) are measured cross-sectionally, the causal direction of the hypothesized mediator-outcome relation is not uniquely identified. That is, both mediation models, x → m → y or x → y → m, may be plausible candidates to describe the underlying intervention theory. As a third explanation, unobserved confounders can still be responsible for the mediator-outcome association. The present study introduces principles of direction dependence which can be used to empirically evaluate these competing explanatory theories. We show that, under certain conditions, third higher moments of variables (i.e., skewness and co-skewness) can be used to uniquely identify the direction of a mediator-outcome relation. Significance procedures compatible with direction dependence are introduced and results of a simulation study are reported that demonstrate the performance of the tests. An empirical example is given for illustrative purposes and a software implementation of the proposed method is provided in SPSS.
    [Show full text]
  • Articles Causal Inference in Civil Rights Litigation
    VOLUME 122 DECEMBER 2008 NUMBER 2 © 2008 by The Harvard Law Review Association ARTICLES CAUSAL INFERENCE IN CIVIL RIGHTS LITIGATION D. James Greiner TABLE OF CONTENTS I. REGRESSION’S DIFFICULTIES AND THE ABSENCE OF A CAUSAL INFERENCE FRAMEWORK ...................................................540 A. What Is Regression? .........................................................................................................540 B. Problems with Regression: Bias, Ill-Posed Questions, and Specification Issues .....543 1. Bias of the Analyst......................................................................................................544 2. Ill-Posed Questions......................................................................................................544 3. Nature of the Model ...................................................................................................545 4. One Regression, or Two?............................................................................................555 C. The Fundamental Problem: Absence of a Causal Framework ....................................556 II. POTENTIAL OUTCOMES: DEFINING CAUSAL EFFECTS AND BEYOND .................557 A. Primitive Concepts: Treatment, Units, and the Fundamental Problem.....................558 B. Additional Units and the Non-Interference Assumption.............................................560 C. Donating Values: Filling in the Missing Counterfactuals ...........................................562 D. Randomization: Balance in Background Variables ......................................................563
    [Show full text]
  • Local White Matter Architecture Defines Functional Brain Dynamics
    2018 IEEE International Conference on Systems, Man, and Cybernetics Local White Matter Architecture Defines Functional Brain Dynamics Yo Joong Choe* Sivaraman Balakrishnan Aarti Singh Kakao Dept. of Statistics and Data Science Machine Learning Dept. Seongnam-si, South Korea Carnegie Mellon University Carnegie Mellon University [email protected] Pittsburgh, PA, USA Pittsburgh, PA, USA [email protected] [email protected] Jean M. Vettel Timothy Verstynen U.S. Army Research Laboratory Dept. of Psychology and CNBC Aberdeen Proving Ground, MD, USA Carnegie Mellon University [email protected] Pittsburgh, PA, USA [email protected] Abstract—Large bundles of myelinated axons, called white the integrity of the myelin sheath is critical for synchroniz- matter, anatomically connect disparate brain regions together ing information transmission between distal brain areas [2], and compose the structural core of the human connectome. We fostering the ability of these networks to adapt over time recently proposed a method of measuring the local integrity along the length of each white matter fascicle, termed the local [4]. Thus, variability in the myelin sheath, as well as other connectome [1]. If communication efficiency is fundamentally cellular support mechanisms, would contribute to variability constrained by the integrity along the entire length of a white in functional coherence across the circuit. matter bundle [2], then variability in the functional dynamics To study the integrity of structural connectivity, we recently of brain networks should be associated with variability in the introduced the concept of the local connectome. This is local connectome. We test this prediction using two statistical ap- proaches that are capable of handling the high dimensionality of defined as the pattern of fiber systems (i.e., number of fibers, data.
    [Show full text]
  • Causation and Causal Inference in Epidemiology
    PUBLIC HEALTH MAnERS Causation and Causal Inference in Epidemiology Kenneth J. Rothman. DrPH. Sander Greenland. MA, MS, DrPH. C Stat fixed. In other words, a cause of a disease Concepts of cause and causal inference are largely self-taught from early learn- ing experiences. A model of causation that describes causes in terms of suffi- event is an event, condition, or characteristic cient causes and their component causes illuminates important principles such that preceded the disease event and without as multicausality, the dependence of the strength of component causes on the which the disease event either would not prevalence of complementary component causes, and interaction between com- have occun-ed at all or would not have oc- ponent causes. curred until some later time. Under this defi- Philosophers agree that causal propositions cannot be proved, and find flaws or nition it may be that no specific event, condi- practical limitations in all philosophies of causal inference. Hence, the role of logic, tion, or characteristic is sufiicient by itself to belief, and observation in evaluating causal propositions is not settled. Causal produce disease. This is not a definition, then, inference in epidemiology is better viewed as an exercise in measurement of an of a complete causal mechanism, but only a effect rather than as a criterion-guided process for deciding whether an effect is pres- component of it. A "sufficient cause," which ent or not. (4m JPub//cHea/f/i. 2005;95:S144-S150. doi:10.2105/AJPH.2004.059204) means a complete causal mechanism, can be defined as a set of minimal conditions and What do we mean by causation? Even among eral.
    [Show full text]
  • Experiments & Observational Studies: Causal Inference in Statistics
    Experiments & Observational Studies: Causal Inference in Statistics Paul R. Rosenbaum Department of Statistics University of Pennsylvania Philadelphia, PA 19104-6340 1 My Concept of a ‘Tutorial’ In the computer era, we often receive compressed • files, .zip. Minimize redundancy, minimize storage, at the expense of intelligibility. Sometimes scientific journals seemed to have been • compressed. Tutorial goal is: ‘uncompress’. Make it possible to • read a current article or use current software without going back to dozens of earlier articles. 2 A Causal Question At age 45, Ms. Smith is diagnosed with stage II • breast cancer. Her oncologist discusses with her two possible treat- • ments: (i) lumpectomy alone, or (ii) lumpectomy plus irradiation. They decide on (ii). Ten years later, Ms. Smith is alive and the tumor • has not recurred. Her surgeon, Steve, and her radiologist, Rachael de- • bate. Rachael says: “The irradiation prevented the recur- • rence – without it, the tumor would have recurred.” Steve says: “You can’t know that. It’s a fantasy – • you’re making it up. We’ll never know.” 3 Many Causal Questions Steve and Rachael have this debate all the time. • About Ms. Jones, who had lumpectomy alone. About Ms. Davis, whose tumor recurred after a year. Whenever a patient treated with irradiation remains • disease free, Rachael says: “It was the irradiation.” Steve says: “You can’t know that. It’s a fantasy. We’ll never know.” Rachael says: “Let’s keep score, add ’em up.” Steve • says: “You don’t know what would have happened to Ms. Smith, or Ms. Jones, or Ms Davis – you just made it all up, it’s all fantasy.
    [Show full text]
  • From Distance Correlation to Multiscale Graph Correlation Arxiv
    From Distance Correlation to Multiscale Graph Correlation Cencheng Shen∗1, Carey E. Priebey2, and Joshua T. Vogelsteinz3 1Department of Applied Economics and Statistics, University of Delaware 2Department of Applied Mathematics and Statistics, Johns Hopkins University 3Department of Biomedical Engineering and Institute of Computational Medicine, Johns Hopkins University October 2, 2018 Abstract Understanding and developing a correlation measure that can detect general de- pendencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery in the big data age. In this paper, we establish a new framework that generalizes distance correlation — a correlation measure that was recently proposed and shown to be universally consistent for dependence testing against all joint distributions of finite moments — to the Multiscale Graph Correla- tion (MGC). By utilizing the characteristic functions and incorporating the nearest neighbor machinery, we formalize the population version of local distance correla- tions, define the optimal scale in a given dependency, and name the optimal local correlation as MGC. The new theoretical framework motivates a theoretically sound ∗[email protected] arXiv:1710.09768v3 [stat.ML] 30 Sep 2018 [email protected] [email protected] 1 Sample MGC and allows a number of desirable properties to be proved, includ- ing the universal consistency, convergence and almost unbiasedness of the sample version. The advantages of MGC are illustrated via a comprehensive set of simula- tions with linear, nonlinear, univariate, multivariate, and noisy dependencies, where it loses almost no power in monotone dependencies while achieving better perfor- mance in general dependencies, compared to distance correlation and other popular methods.
    [Show full text]
  • A Comparative Study of Statistical Methods Used to Identify
    Briefings in Bioinformatics Advance Access published August 20, 2013 BRIEFINGS IN BIOINFORMATICS. page1of13 doi:10.1093/bib/bbt051 A comparative study of statistical methods used to identify dependencies between gene expression signals Suzana de Siqueira Santos, Daniel YasumasaTakahashi, Asuka Nakata and Andre¤Fujita Submitted: 18th April 2013; Received (in revised form): 10th June 2013 Abstract One major task in molecular biology is to understand the dependency among genes to model gene regulatory net- works. Pearson’s correlation is the most common method used to measure dependence between gene expression signals, but it works well only when data are linearly associated. For other types of association, such as non-linear Downloaded from or non-functional relationships, methods based on the concepts of rank correlation and information theory-based measures are more adequate than the Pearson’s correlation, but are less used in applications, most probably because of a lack of clear guidelines for their use. This work seeks to summarize the main methods (Pearson’s, Spearman’s and Kendall’s correlations; distance correlation; Hoeffding’s D measure; Heller^Heller^Gorfine measure; mutual in- http://bib.oxfordjournals.org/ formation and maximal information coefficient) used to identify dependency between random variables, especially gene expression data, and also to evaluate the strengths and limitations of each method. Systematic Monte Carlo simulation analyses ranging from sample size, local dependence and linear/non-linear and also non-functional relationships are shown. Moreover, comparisons in actual gene expression data are carried out. Finally, we provide a suggestive list of methods that can be used for each type of data set.
    [Show full text]
  • From Distance Correlation to Multiscale Graph Correlation
    From Distance Correlation to Multiscale Graph Correlation Cencheng Shen∗1, Carey E. Priebey2, and Joshua T. Vogelsteinz3 1Department of Applied Economics and Statistics, University of Delaware 2Department of Applied Mathematics and Statistics, Johns Hopkins University 3Department of Biomedical Engineering and Institute of Computational Medicine, Johns Hopkins University January 19, 2019 Abstract Understanding and developing a correlation measure that can detect general de- pendencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery in the big data age. In this paper, we establish a new framework that generalizes distance correlation — a correlation measure that was recently proposed and shown to be universally consistent for dependence testing against all joint distributions of finite moments — to the Multiscale Graph Correla- tion (MGC). By utilizing the characteristic functions and incorporating the nearest neighbor machinery, we formalize the population version of local distance correla- tions, define the optimal scale in a given dependency, and name the optimal local correlation as MGC. The new theoretical framework motivates a theoretically sound ∗[email protected] [email protected] [email protected] 1 Sample MGC and allows a number of desirable properties to be proved, includ- ing the universal consistency, convergence and almost unbiasedness of the sample version. The advantages of MGC are illustrated via a comprehensive set of simula- tions with linear, nonlinear, univariate, multivariate, and noisy dependencies, where it loses almost no power in monotone dependencies while achieving better perfor- mance in general dependencies, compared to distance correlation and other popular methods. Keywords: testing independence, generalized distance correlation, nearest neighbor graph 2 1 Introduction p q Given pairs of observations (xi; yi) 2 R ×R for i = 1; : : : ; n, assume they are generated by independently identically distributed (iid) FXY .
    [Show full text]
  • University of Aberdeen the Distance Standard Deviation Edelmann, Dominic; Richards, Donald ; Vogel, Daniel
    University of Aberdeen The Distance Standard Deviation Edelmann, Dominic; Richards, Donald ; Vogel, Daniel Publication date: 2017 Document Version Other version Link to publication Citation for published version (APA): Edelmann, D., Richards, D., & Vogel, D. (2017). The Distance Standard Deviation. (pp. 1-27). ArXiv. https://arxiv.org/abs/1705.05777 General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. ? Users may download and print one copy of any publication from the public portal for the purpose of private study or research. ? You may not further distribute the material or use it for any profit-making activity or commercial gain ? You may freely distribute the URL identifying the publication in the public portal ? Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Download date: 02. Oct. 2021 The Distance Standard Deviation Dominic Edelmann,∗ Donald Richards,y and Daniel Vogelz May 17, 2017 Abstract The distance standard deviation, which arises in distance correlation analy- sis of multivariate data, is studied as a measure of spread. New representations for the distance standard deviation are obtained in terms of Gini's mean differ- ence and in terms of the moments of spacings of order statistics. Inequalities for the distance variance are derived, proving that the distance standard deviation is bounded above by the classical standard deviation and by Gini's mean difference.
    [Show full text]