Chapter 10 U-Statistics

When one is willing to assume the existence of a simple random sample $X_1, \ldots, X_n$, U-statistics generalize common notions of unbiased estimation such as the sample mean and the unbiased sample variance (in fact, the "U" in "U-statistics" stands for "unbiased"). Even though U-statistics may be considered a bit of a special topic, their study in a large-sample theory course has side benefits that make them valuable pedagogically. The theory of U-statistics nicely demonstrates the application of some of the large-sample topics presented thus far. Furthermore, the study of U-statistics even enables a theoretical discussion of statistical functionals, which gives insight into the common modern practice of bootstrapping.

10.1 Statistical Functionals and V-Statistics

Let $\mathcal{S}$ be a set of cumulative distribution functions and let $T$ denote a mapping from $\mathcal{S}$ into the real numbers $\mathbb{R}$. Then $T$ is called a statistical functional. We may think of statistical functionals as parameters of interest. If, say, we are given a simple random sample from a distribution with unknown distribution function $F$, we may want to learn the value of $\theta = T(F)$ for a (known) functional $T$. Some particular instances of statistical functionals are as follows:

• If $T(F) = F(c)$ for some constant $c$, then $T$ is a statistical functional mapping each $F$ to $P_F(X \le c)$.

• If $T(F) = F^{-1}(p)$ for some constant $p$, where $F^{-1}(p)$ is defined in Equation (3.13), then $T$ maps $F$ to its $p$th quantile.

• If $T(F) = \mathrm{E}_F(X)$, then $T$ maps $F$ to its mean.

Suppose $X_1, \ldots, X_n$ is an independent and identically distributed sequence with distribution function $F(x)$. We define the empirical distribution function $\hat{F}_n$ to be the distribution function for a discrete uniform distribution on $\{X_1, \ldots, X_n\}$. In other words,
\[
\hat{F}_n(x) = \frac{1}{n} \#\{i : X_i \le x\} = \frac{1}{n} \sum_{i=1}^n I\{X_i \le x\}.
\]
Since $\hat{F}_n(x)$ is a distribution function, a reasonable estimator of $T(F)$ is the so-called plug-in estimator $T(\hat{F}_n)$. For example, if $T(F) = \mathrm{E}_F(X)$, then the plug-in estimator given a simple random sample $X_1, X_2, \ldots$ from $F$ is
\[
T(\hat{F}_n) = \mathrm{E}_{\hat{F}_n}(X) = \frac{1}{n} \sum_{i=1}^n X_i = \overline{X}_n.
\]
As we will see later, a plug-in estimator is also known as a V-statistic or a V-estimator.
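To make the plug-in idea concrete, here is a minimal numerical sketch in Python. The function names empirical_cdf and plug_in_mean are ours, chosen purely for illustration, and the exponential sample is just an assumed example distribution:

    import numpy as np

    def empirical_cdf(sample, x):
        """Evaluate the empirical distribution function at x: the fraction
        of sample points that are <= x."""
        return np.mean(np.asarray(sample) <= x)

    def plug_in_mean(sample):
        """Plug-in estimator of T(F) = E_F(X): the mean of the empirical
        distribution, i.e., the sample mean."""
        return np.mean(sample)

    rng = np.random.default_rng(0)
    sample = rng.exponential(scale=2.0, size=100)  # assumed example: true mean 2
    print(empirical_cdf(sample, 1.0))   # estimates P(X <= 1) = 1 - exp(-1/2)
    print(plug_in_mean(sample))         # estimates E(X) = 2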
Suppose that for some real-valued function $\phi(x)$, we define $T(F) = \mathrm{E}_F\, \phi(X)$. Note in this case that
\[
T\{\alpha F_1 + (1 - \alpha) F_2\} = \alpha\, \mathrm{E}_{F_1} \phi(X) + (1 - \alpha)\, \mathrm{E}_{F_2} \phi(X) = \alpha T(F_1) + (1 - \alpha) T(F_2).
\]
For this reason, such a functional is sometimes called a linear functional (see Definition 10.1). To generalize this idea, we consider a real-valued function taking more than one real argument, say $\phi(x_1, \ldots, x_a)$ for some $a > 1$, and define
\[
T(F) = \mathrm{E}_F\, \phi(X_1, \ldots, X_a), \tag{10.1}
\]
which we take to mean the expectation of $\phi(X_1, \ldots, X_a)$ where $X_1, \ldots, X_a$ is a simple random sample from the distribution function $F$. We see that $\mathrm{E}_F\, \phi(X_1, \ldots, X_a) = \mathrm{E}_F\, \phi(X_{\pi(1)}, \ldots, X_{\pi(a)})$ for any permutation $\pi$ mapping $\{1, \ldots, a\}$ onto itself. Since there are $a!$ such permutations, consider the function
\[
\phi^*(x_1, \ldots, x_a) \stackrel{\mathrm{def}}{=} \frac{1}{a!} \sum_{\text{all } \pi} \phi(x_{\pi(1)}, \ldots, x_{\pi(a)}).
\]
Since $\mathrm{E}_F\, \phi(X_1, \ldots, X_a) = \mathrm{E}_F\, \phi^*(X_1, \ldots, X_a)$ and $\phi^*$ is symmetric in its arguments (i.e., permuting its $a$ arguments does not change its value), we see that in Equation (10.1) we may assume without loss of generality that $\phi$ is symmetric in its arguments. A function defined as in Equation (10.1) is called an expectation functional, as summarized in the following definition:

Definition 10.1 For some integer $a \ge 1$, let $\phi\colon \mathbb{R}^a \to \mathbb{R}$ be a function symmetric in its $a$ arguments. The expectation of $\phi(X_1, \ldots, X_a)$ under the assumption that $X_1, \ldots, X_a$ are independent and identically distributed from some distribution $F$ will be denoted by $\mathrm{E}_F\, \phi(X_1, \ldots, X_a)$. Then the functional $T(F) = \mathrm{E}_F\, \phi(X_1, \ldots, X_a)$ is called an expectation functional. If $a = 1$, then $T$ is also called a linear functional.

Expectation functionals are important in this chapter because they are precisely the functionals that give rise to V-statistics and U-statistics. The function $\phi(x_1, \ldots, x_a)$ in Definition 10.1 is used so frequently that we give it a special name:

Definition 10.2 Let $T(F) = \mathrm{E}_F\, \phi(X_1, \ldots, X_a)$ be an expectation functional, where $\phi\colon \mathbb{R}^a \to \mathbb{R}$ is a function that is symmetric in its arguments. In other words, $\phi(x_1, \ldots, x_a) = \phi(x_{\pi(1)}, \ldots, x_{\pi(a)})$ for any permutation $\pi$ of the integers 1 through $a$. Then $\phi$ is called the kernel function associated with $T(F)$.

Suppose $T(F)$ is an expectation functional defined according to Equation (10.1). If we have a simple random sample of size $n$ from $F$, then as noted earlier, a natural way to estimate $T(F)$ is by the use of the plug-in estimator $T(\hat{F}_n)$. This estimator is called a V-estimator or a V-statistic. It is possible to write down a V-statistic explicitly: Since $\hat{F}_n$ assigns probability $\frac{1}{n}$ to each $X_i$, we have
\[
V_n = T(\hat{F}_n) = \mathrm{E}_{\hat{F}_n} \phi(X_1, \ldots, X_a) = \frac{1}{n^a} \sum_{i_1=1}^n \cdots \sum_{i_a=1}^n \phi(X_{i_1}, \ldots, X_{i_a}). \tag{10.2}
\]
In the case $a = 1$, Equation (10.2) becomes
\[
V_n = \frac{1}{n} \sum_{i=1}^n \phi(X_i).
\]
It is clear in this case that $\mathrm{E}\, V_n = T(F)$, which we denote by $\theta$. Furthermore, if $\sigma^2 = \mathrm{Var}_F\, \phi(X) < \infty$, then the central limit theorem implies that
\[
\sqrt{n}(V_n - \theta) \stackrel{d}{\to} N(0, \sigma^2).
\]
For $a > 1$, however, the sum in Equation (10.2) contains some terms in which $i_1, \ldots, i_a$ are not all distinct. The expectation of such terms is not necessarily equal to $\theta = T(F)$ because in Definition 10.1, $\theta$ requires $a$ independent random variables from $F$. Thus, $V_n$ is not necessarily unbiased for $a > 1$.

Example 10.3 Let $a = 2$ and $\phi(x_1, x_2) = |x_1 - x_2|$. It may be shown (Problem 10.2) that the functional $T(F) = \mathrm{E}_F |X_1 - X_2|$ is not linear in $F$. Furthermore, since $|X_{i_1} - X_{i_2}|$ is identically zero whenever $i_1 = i_2$, it may also be shown that the V-estimator of $T(F)$ is biased.
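The bias described in Example 10.3 is easy to observe numerically. The following Python sketch implements Equation (10.2) by brute force; the names v_statistic and gini_kernel are ours, and the exponential sample is again only an assumed example, not an efficient or definitive implementation:

    import itertools
    import numpy as np

    def v_statistic(sample, kernel, a):
        """V-statistic of Equation (10.2): average the kernel over all n**a
        ordered index tuples, duplicated indices included."""
        n = len(sample)
        total = sum(kernel(*(sample[i] for i in idx))
                    for idx in itertools.product(range(n), repeat=a))
        return total / n**a

    def gini_kernel(x1, x2):
        """Kernel of Example 10.3."""
        return abs(x1 - x2)

    rng = np.random.default_rng(0)
    sample = rng.exponential(scale=2.0, size=50)  # for this F, E|X1 - X2| = 2
    # The n diagonal terms (i1 = i2) are all zero, which pulls the estimator
    # downward: V_n is biased below theta = 2.
    print(v_statistic(sample, gini_kernel, a=2))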
Since the bias in $V_n$ is due to the duplication among the subscripts $i_1, \ldots, i_a$, one way to correct this bias is to restrict the summation in Equation (10.2) to sets of subscripts $i_1, \ldots, i_a$ that contain no duplication. For example, we might sum instead over all possible subscripts satisfying $i_1 < \cdots < i_a$. The result is the U-statistic, which is the topic of Section 10.2.

Exercises for Section 10.1

Exercise 10.1 Let $X_1, \ldots, X_n$ be a simple random sample from $F$. For a fixed $x$ for which $0 < F(x) < 1$, find the asymptotic distribution of $\hat{F}_n(x)$.

Exercise 10.2 Let $T(F) = \mathrm{E}_F |X_1 - X_2|$.

(a) Show that $T(F)$ is not a linear functional by exhibiting distributions $F_1$ and $F_2$ and a constant $\alpha \in (0, 1)$ such that $T\{\alpha F_1 + (1 - \alpha) F_2\} \ne \alpha T(F_1) + (1 - \alpha) T(F_2)$.

(b) For $n > 1$, demonstrate that the V-statistic $V_n$ is biased in this case by finding $c_n \ne 1$ such that $\mathrm{E}_F\, V_n = c_n T(F)$.

Exercise 10.3 Let $X_1, \ldots, X_n$ be a random sample from a distribution $F$ with finite third absolute moment.

(a) For $a = 2$, find $\phi(x_1, x_2)$ such that $\mathrm{E}_F\, \phi(X_1, X_2) = \mathrm{Var}_F\, X$. Your $\phi$ function should be symmetric in its arguments.

Hint: The fact that $\theta = \mathrm{E}\, X_1^2 - \mathrm{E}\, X_1 X_2$ leads immediately to a non-symmetric $\phi$ function. Symmetrize it.

(b) For $a = 3$, find $\phi(x_1, x_2, x_3)$ such that $\mathrm{E}_F\, \phi(X_1, X_2, X_3) = \mathrm{E}_F (X - \mathrm{E}_F X)^3$. As in part (a), $\phi$ should be symmetric in its arguments.

10.2 Hoeffding's Decomposition and Asymptotic Normality

Because the V-statistic
\[
V_n = \frac{1}{n^a} \sum_{i_1=1}^n \cdots \sum_{i_a=1}^n \phi(X_{i_1}, \ldots, X_{i_a})
\]
is in general a biased estimator of the expectation functional $T(F) = \mathrm{E}_F\, \phi(X_1, \ldots, X_a)$ due to the presence of summands in which there are duplicated indices on the $X_{i_k}$, one way to produce an unbiased estimator is to sum only over those $(i_1, \ldots, i_a)$ in which no duplicates occur. Because $\phi$ is assumed to be symmetric in its arguments, we may without loss of generality restrict attention to the cases in which $1 \le i_1 < \cdots < i_a \le n$. Doing this, we obtain the U-statistic $U_n$:

Definition 10.4 Let $a$ be a positive integer and let $\phi(x_1, \ldots, x_a)$ be the kernel function associated with an expectation functional $T(F)$ (see Definitions 10.1 and 10.2). Then the U-statistic corresponding to this functional equals
\[
U_n = \frac{1}{\binom{n}{a}} \sum_{1 \le i_1 < \cdots < i_a \le n} \phi(X_{i_1}, \ldots, X_{i_a}), \tag{10.3}
\]
where $X_1, \ldots, X_n$ is a simple random sample of size $n \ge a$.

The "U" in "U-statistic" stands for unbiased (the "V" in "V-statistic" stands for von Mises, who was one of the originators of this theory in the late 1940s).
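For comparison with the V-statistic sketch above, Equation (10.3) can be implemented just as directly by averaging the kernel over all subsets of $a$ distinct indices. This is again only an illustrative brute-force sketch under the same assumed example setup; the name u_statistic is ours, not from any library:

    import itertools
    from math import comb
    import numpy as np

    def u_statistic(sample, kernel, a):
        """U-statistic of Equation (10.3): average the kernel over all
        C(n, a) index sets i_1 < ... < i_a, so no index is duplicated."""
        n = len(sample)
        total = sum(kernel(*(sample[i] for i in idx))
                    for idx in itertools.combinations(range(n), a))
        return total / comb(n, a)

    rng = np.random.default_rng(0)
    sample = rng.exponential(scale=2.0, size=50)  # same assumed F: E|X1 - X2| = 2
    # Unlike the V-statistic, U_n is an unbiased estimator of theta.
    print(u_statistic(sample, lambda x1, x2: abs(x1 - x2), a=2))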
