Density Estimation Using Quantile Variance and Quantile-Mean Covariance


Density Estimation Using Quantile Variance and Quantile-Mean Covariance

Montes-Rojas, Gabriel*   Mena, Sebastián†

August 2018

Abstract

Based on asymptotic properties of sample quantiles derived by Hall & Martin (1988) and Ferguson (1999), we propose a novel method that exploits the quantile variance and the quantile-mean covariance to estimate the density of a distribution from samples. The procedure first estimates the sample quantile variance and the sample quantile-mean covariance using bootstrap techniques, and then uses them to compute the density. We conduct Monte Carlo simulations for different data generating processes, sample sizes and parameters, and find that in many cases the quantile density estimators outperform the standard kernel density estimator in terms of mean integrated squared error. Finally, we propose some smoothing techniques to reduce the estimators' variance and increase their accuracy.

Keywords: Density Estimation, Quantile Variance, Quantile-Mean Covariance, Bootstrap
JEL Classification: C13, C14, C15, C46

*CONICET, Universidad de San Andrés, and Universidad de Buenos Aires. Email: [email protected]
†CONICET, Universidad de San Andrés, and Universidad Nacional de Tucumán. Email: [email protected]

1 Introduction

If the continuous random variable $X$ has density $f(x)$ and median $Q(0.5) = \theta$, then the variance of the sample median $\hat{Q}_n(0.5)$ of $n$ observations is known to be approximately $1/[4n f(\theta)^2]$, so that as $n \to \infty$ the asymptotic variance of the ($\sqrt{n}$-scaled) median converges to $1/[4 f(\theta)^2]$. (According to Stigler (1973), this result, together with the asymptotic properties of median estimation, was first stated by Laplace in 1818.) It can be shown that this holds for any quantile $\tau$: the asymptotic variance of the sample estimate of $Q(\tau) = x_\tau$ equals $\tau(1-\tau)/f(x_\tau)^2$. Ferguson (1999) extended these results and proved that the joint mean-quantile distribution has asymptotic covariance equal to $\varpi(\tau)/f(x_\tau)$, where $\varpi(\tau) = \min_a E[\rho_\tau(X - a)]$ is the expected quantile loss function.
These results have been widely used to estimate the variance of order statistics and to construct confidence intervals for robust inference. Doing so usually requires assumptions about the functional form of $f(x)$, or else a preliminary non-parametric estimate of the density. Based on the properties of the bootstrap estimator of the sample quantile variance derived by Hall & Martin (1988), we explore the reverse problem: first estimate the sample quantile variance (Qv) and the sample quantile-mean covariance (Qc), and then use them to estimate the density function $f(x)$ non-parametrically. We call these the Qv and Qc density estimators.

2 Quantile Variance Density Estimator

Let $X_1, \ldots, X_n$ be i.i.d. with distribution function $F(x)$, density $f(x)$, mean $\mu$ and finite variance $\sigma^2$. Let $0 < \tau < 1$ and let $q_\tau$ denote the $\tau$-quantile of $F$, so that $F(q_\tau) = \tau$. Assume that the density $f(x)$ is continuous and positive at $q_\tau$. Let $Q(\tau)$ be the quantile function, so that $Q(\tau) \equiv F^{-1}(\tau) = q_\tau$, and let $Y_{\tau,n} = X_{n:n\tau}$ denote the sample $\tau$-quantile. Then:

$$\sqrt{n}\,(Y_{\tau,n} - q_\tau) \xrightarrow{L} N\!\left(0, \frac{\tau(1-\tau)}{f(x_\tau)^2}\right) \qquad (1)$$

Definition 1. The quantile variance function, $V(\tau)$, is the asymptotic variance of the sample quantile $Y_{\tau,n}$ with index $\tau \in (0,1)$:

$$V(\tau) = \lim_{n \to \infty} n\,V(Y_{\tau,n}). \qquad (2)$$

From Equations (1) and (2) we know that in the limit $V(\tau) = \tau(1-\tau)/f(x_\tau)^2$, so the density at quantile $\tau$ is:

$$f(Q(\tau)) = \sqrt{\frac{\tau(1-\tau)}{V(\tau)}}. \qquad (3)$$

The quantile variance estimator of $f(Q(\tau))$ requires only a consistent estimator of $V(\tau)$. The following subsections sketch the steps needed to obtain one.
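Anticipating the bootstrap construction detailed in the subsections below, the whole Qv estimator can be sketched in code. This is a hypothetical illustration, not the authors' code: the function name and choices such as `np.quantile`'s default interpolation are ours, and because $V(\tau)$ is defined for the $\sqrt{n}$-scaled quantile, the raw bootstrap variance is rescaled by $n$ before inverting Equation (3).

```python
import numpy as np

def qv_density(x, tau, B=500, rng=None):
    """Qv density estimator sketch: bootstrap the sample tau-quantile,
    estimate its variance, and invert f = sqrt(tau*(1-tau)/V(tau))."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x)
    n = len(x)
    # Bootstrap replications of the sample tau-quantile.
    q_b = np.array([np.quantile(rng.choice(x, size=n, replace=True), tau)
                    for _ in range(B)])
    # V(tau) is defined for the sqrt(n)-scaled quantile, hence the factor n.
    V_hat = n * q_b.var()
    return np.sqrt(tau * (1.0 - tau) / V_hat)

# Example: standard normal sample; the true density at the median
# is 1/sqrt(2*pi) ≈ 0.3989.
x = np.random.default_rng(1).standard_normal(2000)
print(qv_density(x, 0.5))
```

With $n = 2000$ the estimate typically lands within a few percent of $0.3989$; the relative error shrinks only at the slow $n^{-1/4}$ rate discussed below.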
2.1 Consistent estimator of $V(\tau)$

We propose a naive non-parametric bootstrap estimator which takes $B$ random sub-samples of size $n$ (with replacement) out of the $n$ observations, sorts each into rank order $X_1^b \le \cdots \le X_n^b$, and:

1. Computes the $\tau$-quantile of sub-sample $b$ as
$$\hat{Q}_n^b(\tau) = \inf\left\{ X_j^b \;\middle|\; \frac{1}{n}\sum_{i=1}^n \mathbb{1}[X_i^b \le X_j^b] \ge \tau \right\}, \qquad b = 1, \ldots, B, \; j = 1, \ldots, n;$$

2. Computes the bootstrapped quantile mean as $\bar{Q}_n(\tau) = \frac{1}{B}\sum_{b=1}^B \hat{Q}_n^b(\tau)$;

3. Finally, computes the bootstrapped quantile variance as $\hat{V}(\tau) = \frac{1}{B}\sum_{b=1}^B \left(\hat{Q}_n^b(\tau) - \bar{Q}_n(\tau)\right)^2$.

Lemma 1. $\hat{V}_n^B(\tau) = V(\tau) + o_p(1)$ as $n \to \infty$, $B \to \infty$.

2.2 QV Density Estimator

Our proposed density estimator is then

$$\hat{f}(Q(\tau)) = \sqrt{\frac{\tau(1-\tau)}{\hat{V}(\tau)}}. \qquad (4)$$

Lemma 2. $\hat{f}_n(Q(\tau)) = f(Q(\tau)) + O_p(n^{-1/4})$ as $n \to \infty$, $B \to \infty$.

Hall & Martin (1988) show that the relative error of the bootstrap quantile variance estimator and of the bootstrap sparsity function estimator is of precise order $n^{-1/4}$, where $n$ is the sample size. Since

$$n\,\sigma^2_{q_\tau} = \frac{\tau(1-\tau)}{f(x_\tau)^2} + O(n^{-1}),$$

the rate of convergence of $\hat{f}_n(Q(\tau))$ is:

$$n^{1/4}\left(\hat{f}_n(Q(\tau)) - f(Q(\tau))\right) \xrightarrow{L} N(0, T^2) \qquad (5)$$

where $T^2 = \frac{1}{2\pi^{1/2}}\,\frac{\tau(1-\tau)}{f(x_\tau)^2}$.

3 Quantile-Mean Covariance Density Estimator

Suppose $\{X_1, \ldots, X_n\}$ is an i.i.d. (independent and identically distributed) sample with distribution function $F(\cdot)$, density $f(\cdot)$, quantile function $Q(\tau)$, $\tau \in (0,1)$, and mean $\mu$.

Definition 2. The quantile-mean covariance function, $C(\tau)$, is the asymptotic covariance between the sample quantile $\hat{Q}_n(\tau)$, with index $\tau \in (0,1)$, and the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$:

$$C(\tau) = \lim_{n \to \infty} n\,\mathrm{COV}(\hat{Q}_n(\tau), \bar{X}_n). \qquad (6)$$

Definition 3. Define the expected quantile loss function (EQLF)

$$\varpi(\tau) = \min_a E[\rho_\tau(X - a)] = \tau\left(\mu - E[X \mid X < Q(\tau)]\right) = \tau\left(\mu - \frac{1}{\tau}\,E[\mathbb{1}[X < Q(\tau)]\,X]\right), \qquad (7)$$

where $\rho_\tau(u) = \{\tau - \mathbb{1}[u \le 0]\}u$ is the quantile check function of Koenker and Bassett (1978). By Ferguson (1999),

$$f(Q(\tau)) = \frac{\varpi(\tau)}{C(\tau)}. \qquad (8)$$

The estimator of $f(Q(\tau))$ requires consistent estimators of $\varpi(\tau)$ and $C(\tau)$.
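Anticipating the estimators made precise in the next subsections, Equation (8) suggests the following sketch. Again this is hypothetical code rather than the authors' implementation; the bootstrap covariance is rescaled by $n$ because $C(\tau)$ is defined for the $\sqrt{n}$-scaled statistics.

```python
import numpy as np

def qc_density(x, tau, B=500, rng=None):
    """Qc density estimator sketch: plug-in expected quantile loss over
    bootstrapped quantile-mean covariance, f = pi(tau)/C(tau)."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x)
    n = len(x)
    q_hat = np.quantile(x, tau)
    # Eq. (7): pi(tau) = tau*(mean - (1/tau)*E[X 1{X < Q(tau)}]).
    pi_hat = tau * (x.mean() - x[x < q_hat].sum() / (tau * n))
    means, quants = np.empty(B), np.empty(B)
    for b in range(B):
        xb = rng.choice(x, size=n, replace=True)
        means[b], quants[b] = xb.mean(), np.quantile(xb, tau)
    # C(tau) is defined for the sqrt(n)-scaled statistics, hence the factor n.
    C_hat = n * np.cov(means, quants)[0, 1]
    return pi_hat / C_hat

# Example: standard normal sample; true density at the median ≈ 0.3989.
x = np.random.default_rng(2).standard_normal(2000)
print(qc_density(x, 0.5))
```

For the standard normal at $\tau = 0.5$, $\varpi(0.5) = C(0.5) \cdot f(0) \approx 0.3989$, so both the numerator and the recovered density should be close to $1/\sqrt{2\pi}$.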
The following subsections sketch the steps needed for each element.

3.1 Consistent estimator of $\varpi(\tau)$

Consider the following estimator:

$$\hat{\varpi}_n(\tau) = \tau\left(\bar{X}_n - \frac{1}{\tau}\,\frac{1}{n}\sum_{i=1}^n X_i\,\mathbb{1}[X_i < \hat{Q}_n(\tau)]\right).$$

Lemma 3. $\hat{\varpi}_n(\tau) = \varpi(\tau) + o_p(1)$ as $n \to \infty$.

3.2 Consistent estimator of $C(\tau)$

The key point is how to estimate $C(\tau)$. Essentially we just want to estimate a "covariance" between two random variables. Think about how we would estimate the covariance between $\bar{X}_n$ and $\hat{\sigma}^2_n$, or any other two "moments".

Proposal: the bootstrap.

1. Consider $B$ bootstrap samples of size $n$, $\{x_i^b\}_{i=1}^n$, $b = 1, 2, \ldots, B$;

2. Compute $(\bar{X}_n^b, \hat{Q}_n^b(\tau))$, $b = 1, 2, \ldots, B$;

3. Compute $\hat{C}_n^B(\tau) = \frac{1}{B}\sum_{b=1}^B \bar{X}_n^b\,\hat{Q}_n^b(\tau) - \left(\frac{1}{B}\sum_{b=1}^B \bar{X}_n^b\right)\left(\frac{1}{B}\sum_{b=1}^B \hat{Q}_n^b(\tau)\right)$ (i.e. the sample covariance).

Lemma 4. $\hat{C}_n^B(\tau) = C(\tau) + o_p(1)$ as $n \to \infty$, $B \to \infty$.

3.3 QC Density Estimator

Our proposed density estimator is then

$$\hat{f}_n(Q(\tau)) = \frac{\hat{\varpi}_n(\tau)}{\hat{C}_n^B(\tau)}. \qquad (9)$$

Lemma 5. $\hat{f}_n(Q(\tau)) = f(Q(\tau)) + o_p(1)$ as $n \to \infty$, $B \to \infty$.

4 Smoothing Methods

In practice, the density estimator may have several kinks in small samples because of the non-continuous nature of quantile estimators. We therefore propose smoothing strategies to obtain smoother estimators.

4.1 Moving Average

Suppose a given grid of $\tau$ values $\mathcal{T} = \{\tau_1, \tau_2, \ldots, \tau_T\}$ such that $\tau_i < \tau_{i+1}$, $i = 1, 2, \ldots, T-1$. Then consider a moving average over $2m+1$ points (i.e. we average the $m$ quantiles to the left, the $m$ quantiles to the right, and the point itself): for $i = m+1, \ldots, T-m$,

$$\hat{f}_n(Q(\tau_i))^{MA} = \frac{1}{2m+1}\left[\hat{f}_n(Q(\tau_{i-m})) + \hat{f}_n(Q(\tau_{i-m+1})) + \cdots + \hat{f}_n(Q(\tau_{i-1})) + \hat{f}_n(Q(\tau_i)) + \hat{f}_n(Q(\tau_{i+1})) + \cdots + \hat{f}_n(Q(\tau_{i+m}))\right]$$

4.2 HP Filter

Hodrick and Prescott (1997) proposed a very popular method for decomposing time series into trend and cycle. Paige & Trindade (2010) proved that the HP filter is a special case of the penalized spline smoother.
Then, our HP-smoothed estimator $\hat{f}_n(Q(\tau_i))^{HP}$ results from the minimization:

$$\min_{\hat{f}_n(Q(\tau_i))^{HP}} \left\{ \sum_{i=1}^T \left(\hat{f}_n(Q(\tau_i)) - \hat{f}_n(Q(\tau_i))^{HP}\right)^2 + \lambda \sum_{i=3}^T \left[\left(\hat{f}_n(Q(\tau_i))^{HP} - \hat{f}_n(Q(\tau_{i-1}))^{HP}\right) - \left(\hat{f}_n(Q(\tau_{i-1}))^{HP} - \hat{f}_n(Q(\tau_{i-2}))^{HP}\right)\right]^2 \right\}$$

where $\lambda$ is a free smoothing parameter.

4.3 Kernel

Alternatively, consider a kernel smoothing estimator for a kernel $\Psi$:

$$\hat{f}_n(Q(\tau_i))^{K} = \frac{1}{2m+1}\left[\Psi(\tau_{i-m} - \tau_i)\hat{f}_n(Q(\tau_{i-m})) + \cdots + \Psi(\tau_{i-1} - \tau_i)\hat{f}_n(Q(\tau_{i-1})) + \Psi(0)\hat{f}_n(Q(\tau_i)) + \Psi(\tau_{i+1} - \tau_i)\hat{f}_n(Q(\tau_{i+1})) + \cdots + \Psi(\tau_{i+m} - \tau_i)\hat{f}_n(Q(\tau_{i+m}))\right]$$

Here $m$ is a smoothing parameter, and we can analyze the asymptotic properties with respect to $n$ and $m$ to obtain optimality properties.

5 Monte Carlo Simulations

This section considers alternative scenarios in which we first simulate known data generating processes (DGPs), then try to recover those DGPs using the techniques presented above (the quantile variance and quantile-mean covariance density estimators), and finally evaluate the performance of the estimators through their mean integrated squared error (MISE).
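As a concrete instance of the smoothers above, the moving average of Section 4.1 reduces to a few lines. This is a hypothetical implementation; here the endpoints where a full $2m+1$ window does not fit are simply left unsmoothed, which is one possible boundary choice the paper does not specify.

```python
import numpy as np

def ma_smooth(f_hat, m):
    """Section 4.1 moving average: replace f_hat(Q(tau_i)) by the mean of
    its 2m+1 neighbours on an ordered tau grid; endpoints are left as-is."""
    f_hat = np.asarray(f_hat, dtype=float)
    out = f_hat.copy()
    for i in range(m, len(f_hat) - m):
        out[i] = f_hat[i - m : i + m + 1].mean()
    return out

# Example: a kinked sequence of raw density estimates on a tau grid.
raw = [0.30, 0.42, 0.35, 0.45, 0.38]
print(ma_smooth(raw, 1))
```

The kernel smoother of Section 4.3 is the same loop with each neighbour weighted by $\Psi(\tau_{i+j} - \tau_i)$ instead of equally.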