Nonparametric Methods
Franco Peracchi University of Rome “Tor Vergata” and EIEF January 2011 Contents
1 Density estimators 2 1.1Empiricaldensities...... 4 1.2Thekernelmethod...... 10 1.3Statisticalpropertiesofthekernelmethod...... 16 1.4Othermethodsforunivariatedensityestimation...... 30 1.5Multivariatedensityestimators...... 34 1.6Statacommands...... 40
2 Linear nonparametric regression estimators 43 2.1Polynomialregressions...... 45 2.2Regressionsplines...... 47 2.3Thekernelmethod...... 51 2.4Thenearestneighbormethod...... 55 2.5Cubicsmoothingsplines...... 59 2.6Statisticalpropertiesoflinearsmoothers...... 62 2.7Averagederivatives...... 68 2.8Methodsforhigh-dimensionaldata...... 70 2.9Statacommands...... 77
3 Distribution function and quantile function estimators 80 3.1Thedistributionfunctionandthequantilefunction...... 80 3.2Theempiricaldistributionfunction...... 88 3.3Theempiricalquantilefunction...... 98 3.4Conditionaldfandqf...... 113 3.5Estimatingtheconditionalqf...... 116 3.6Estimatingtheconditionaldf...... 128 3.7Relationshipsbetweenthetwoapproaches...... 136 3.8Statacommands...... 139
1 1 Density estimators
Let the data Z1,...,Zn be a sample from the distribution of a random vector Z. We are interested in the general problem of estimating the distribution of Z nonparametrically,thatis,without restricting it to belong to a known parametric family.
We begin with the problem of how to estimate nonparametrically the density function of Z.
Although the distribution function (df) and the density function are equiva- lent ways of representing the distribution of Z,theremaybeadvantages in analyzing a density:
• The graph of a density may be easier to interpret if one is interested in aspects such as symmetry or multimodality.
• Estimates of certain population parameters, such as the mode,aremore easily obtained from an estimate of the density.
2 Use of nonparametric density estimates Nonparametric density estimates may be used for:
• Exploratory data analysis.
• Estimating qualitative features of a distribution (e.g. unimodality, skewness, etc.).
• Specification and testing of parametric models.
• If Z =(X, Y ), they may be used to construct a nonparametric estimate of the conditional mean function (CMF) of Y given X f(x, y) μ(x)=E(Y | X = x)= y dy, fX (x) where fX (x)= f(x, y) dy. In fact, given an estimate fˆ(x, y)ofthejoint density f(x, y)of(X, Y ), the analogy principle suggests estimating μ(x)by fˆ(x, y) μˆ(x)= y dy, (1) fˆX (x) where fˆX (x)= fˆ(x, y) dy.
• More generally, many statistical problems involve estimation of a pop- ulation parameter that can be represented as a statistical functional T (f) of the population density f. In all these cases, if fˆ is a reasonable estimate of f, then a reasonable estimate of θ = T (f)isθˆ = T (fˆ).
3 1.1 Empirical densities We begin with the simplest problem of estimating univariate densities, and discuss the problem of estimating multivariate densities later in Section 1.5.
Thus, let Z1,...,Zn be a sample from the distribution of a univariate rv Z with df F and density function f. What methods are appropriate for estimat- ing f depends on the nature of the rv Z.
Discrete Z If Z is discrete with a distribution that assigns positive probability mass to the points z1,z2,..., then the probability fj =Pr{Z = zj} may be estimated by n −1 fˆj = n 1{Zi = zj},j=1, 2,..., i=1 called the empirical probability that Z = zj.
This is just the sample average of n iid binary rv’s X1,...,Xn,whereXi = 1{Zi = zj} has a Bernoulli distribution with mean fj and variance fj(1−fj). Thus, fˆj is unbiased for fj and its sampling variance is
fj(1 − fj) Var fˆj = . n
√ It is easy to verify that fˆj is n -consistent for fj and asymptotically normal, with asymptotic variance AV(fˆj)=fj(1−fj) which can be estimated consistently by AV (fˆj)=fˆj(1 − fˆj).
Notice that a symmetric asymptotic confidence interval for fj, e.g. CI.95(fj)=fˆj ± 1.96 fˆj(1 − fˆj)/n, is problematic because may contain values less than zero or greater than one.
4 Continuous Z The previous approach breaks down when Z is continuous. Notice however that, if z is any point in a sufficiently small interval (a, b] then, by the definition of probability density, Pr{a