QUANTILE REGRESSION
ROGER KOENKER
Abstract. Classical least squares regression maybeviewed as a natural way of extending the
idea of estimating an unconditional mean parameter to the problem of estimating conditional
mean functions; the crucial link is the formulation of an optimization problem that encompasses
b oth problems. Likewise, quantile regression o ers an extension of univariate quantile estimation
to estimation of conditional quantile functions via an optimization of a piecewise linear ob jective
function in the residuals. Median regression minimizes the sum of absolute residuals, an idea
intro duced by Boscovich in the 18th century.
The asymptotic theory of quantile regression closely parallels the theory of the univariate sample
quantiles. Computation of quantile regression estimators may b e formulated as a linear program-
ming problem and eciently solved by simplex or barrier metho ds. A close link to rank based
inference has b een forged from the theory of the dual regression quantile pro cess, or regression
rankscore pro cess.
Quantile regression is a statistical technique intended to estimate, and conduct inference ab out,
conditional quantile functions. Just as classical linear regression metho ds based on minimizing sums
of squared residuals enable one to estimate mo dels for conditional mean functions, quantile regression
metho ds o er a mechanism for estimating mo dels for the conditional median function, and the full
range of other conditional quantile functions. By supplementing the estimation of conditional mean
functions with techniques for estimating an entire family of conditional quantile functions, quantile
regression is capable of providing a more complete statistical analysis of the sto chastic relationships
among random variables.
Quantile regression has b een used in a broad range of application settings. Reference growth
curves for childrens' heightandweighthave a long history in p ediatric medicine; quantile regression
metho ds may b e used to estimate upp er and lower quantile reference curves as a function of age, sex,
and other covariates without imp osing stringent parametric assumptions on the relationships among
these curves. Quantile regression metho ds have b een widely used in economics to study determinents
of wages, discrimination e ects, and trends in income inequality. Several recent studies have mo deled
the p erformance of public scho ol students on standardized exams as a function of so cio-economic
characteristics like their parents' income and educational attainment, and p olicy variables likeclass
size, scho ol exp enditures, and teacher quali cations. It seems rather implausible that suchcovariate
e ects should all act so as to shift the entire distribution of test results by a xed amount. It is of
obvious interest to know whether p olicy interventions alter p erformance of the strongest students in
the same way that weaker students are a ected. Such questions are naturally investigated within
the quantile regression framework.
In ecology, theory often suggests how observable covariates a ect limiting sustainable p opulation
sizes, and quantile regression has b een used to directly estimate mo dels for upp er quantiles of the
conditional distribution rather than inferring such relationships from mo dels based on conditional
central tendency. In survival analysis, and event history analysis more generally, there is often also a
Version: Octob er 25, 2000. This article has b een prepared for the statistics section of the International Encyclo-
pedia of the Social Sciences edited by Stephen Fienb erg and Jay Kadane. The researchwas partially supp orted by
NSF grant SBR-9617206. 1
2 ROGER KOENKER
desire to fo cus attention on particular segments of the conditional distribution, for example survival
prosp ects of the oldest-old, without the imp osition of global distributional assumptions.
1. Quantiles, Ranks and Optimization
Wesay that a student scores at the th quantile of a standardized exam if he p erforms better than
the prop ortion ,andworse than the prop ortion 1 , of the reference group of students. Thus,
half of the students p erform b etter than the median student, and half p erform worse. Similarly,the
quartiles divide the p opulation into four segments with equal prop ortions of the p opulation in each
segment. The quintiles divide the p opulation into 5 equal segments; the deciles into 10 equal parts.
The quantile, or p ercentile, refers to the general case.
More formally,anyrealvalued random variable, Y ,maybecharacterized by its distribution
function,
F y = ProbY y
while for any0< <1,
Q = inffy : F y g
is called the th quantile of X . The median, Q1=2, plays the central role. Like the distribution
function, the quantile function provides a complete characterization of the random variable, Y.
The quantiles may b e formulated as the solution to a simple optimization problem. For any
0 < <1, de ne the piecewise linear \check function", u= u I u<0 illustrated in Figure
1.
ρτ (u)
τ−1 τ
Figure 1. Quantile Regression Function
^
Minimizing the exp ectation of Y with resp ect to yields solutions, , the smallest of
whichisQ de ned ab ove.
The sample analogue of Q , based on a random sample, fy ; :::; y g,of Y 's, is called the th
1 n
sample quantile, and may b e found by solving,
n
X
min y ;
i
2R
i=1
QUANTILE REGRESSION 3
While it is more common to de ne the sample quantiles in terms of the order statistics, y
1
y ::: y , constituting a sorted rearrangement of the original sample, their formulation as a
2 n
minimization problem has the advantage that it yields a natural generalization of the quantiles to
the regression context.
Just as the idea of estimating the unconditional mean, viewed as the minimizer,
X
2
^ = argmin y
i
j
2 R
0
can b e extended to estimation of the linear conditional mean function E Y jX = x= x by solving,
X
0 2
^
p
= argmin y x ;
i
j
i
2 R
0
the linear conditional quantile function, Q jX = x= x , can b e estimated by solving,
Y
i
X
0
^
p
= argmin y x :
i
j
2 R
The median case, =1=2, which is equivalent to minimizing the sum of absolute values of the
residuals has a long history. In the mid-18th century Boscovich prop osed estimating a bivariate
linear mo del for the ellipticity of the earth by minimizing the sum of absolute values of residuals
sub ject to the condition that the mean residual to ok the value zero. Subsequentwork by Laplace
characterized Boscovich's estimate of the slop e parameter as a weighted median and derived its as-
ymptotic distribution. F.Y. Edgeworth seems to have b een the rst to suggest a general formulation
of median regression involving a multivariate vector of explanatory variables, a technique he called
the \plural median". The extension to quantiles other than the median was intro duced in Ko enker
and Bassett 1978.
2. An Example
To illustrate the approachwemay consider an analysis of a simple rst order autoregressivemodel
for maximum daily temp erature in Melb ourne, Australia. The data are taken from Hyndman,
Bashtannyk, and Grunwald 1996. In Figure 2 weprovide a scatter plot of 10 years of daily
temp erature data: to day's maximum daily temp erature is plotted against yesterday's maximum.
Our rst observation from the plot is that there is a strong tendency for data to cluster along the
dashed 45 degree line implying that with high probabilitytoday's maximum is near yesterday's
maximum. But closer examination of the plot reveals that this impression is based primarily on the
left side of the plot where the central tendency of the scatter follows the 45 degree line very closely.
On the right side, however, corresp onding to summer conditions, the pattern is more complicated.
There, it app ears that either there is another hot day, falling again along the 45 degree line, or
there is a dramatic co oling o . But a mild co oling o app ears to b e more rare. In the language of
conditional densities, if to dayishot,tomorrow's temp erature app ears to b e bimo dal with one mo de
roughly centered at to day's maximum, and the other mo de centered at ab out 20 .
Several estimated quantile regression curves have b een sup erimp osed on the scatterplot. Each
curve is sp eci ed as a linear B-spline. Under winter conditions these curves are bunched around
the 45 degree line, however in the summer it app ears that the upp er quantile curves are bunched
around the 45 degree line and around 20 .Intheintermediate temp eratures the spacing of the
quantile curves is somewhat greater indicating lower probability of this temp erature range. This
impression is strengthened by considering a sequence of density plots based on the quantile regression
estimates. Given a family of reasonably densely spaced estimated conditional quantile functions,
it is straightforward to estimate the conditional density of the resp onse at various values of the
conditioning covariate. In Figure 3 we illustrate this approach with several of density estimates
based on the Melb ourne data. Conditioning on a low previous day temp erature weseeanice
4 ROGER KOENKER
...... today's max temperature ...... 10 20. 30 40 ......
10 15 20 25 30 35 40
yesterday's max temperature
Figure 2. Melb ourne Maximum Daily Temp erature: The plot illustrates 10 years
of daily maximum temp erature data in degrees centigrade for Melb ourne, Aus-
tralia as an AR1 scatterplot. The data is scattered around the dashed 45
degree line suggesting that to day is roughly similar to yesterday. Sup erimp osed
on the scatterplot are estimated conditional quantile functions for the quantiles
2f:05;:10;:::; :95g.Notethatwhenyesterday's temp erature is high the spac-
ing b etween adjacent quantile curves is narrower around the 45 degree line and at
ab out 20 degrees Centigrade than it is in the intermediate region. This suggests
bimo dality of the conditional density in the summer.
QUANTILE REGRESSION 5
Yesterday's Temp 11 Yesterday's Temp 16 Yesterday's Temp 21 density density density 0.05 0.10 0.15 0.02 0.06 0.10 0.02 0.06 0.10 0.14
10 12 14 16 12 14 16 18 20 22 24 15 20 25 30
today's max temperature today's max temperature today's max temperature
Yesterday's Temp 25 Yesterday's Temp 30 Yesterday's Temp 35 density density density 0.01 0.03 0.05 0.01 0.03 0.05 0.01 0.03 0.05
15 20 25 30 35 20 25 30 35 40 20 25 30 35 40
today's max temperature today's max temperature today's max temperature
Figure 3. Conditional density estimates of to day's maximum temp erature for sev-
eral values of yesterday's maximum temp erature based on the Melb ourne data:
These density estimates are based on a kernel smo othing of the conditional quantile
estimates as illustrated in the previous gure using 99 distinct quantiles. Note that
temp erature is bimo dal when yesterdaywas hot.
unimo dal conditional density for the following day's maximum temp erature, but as the previous
day's temp erature increases we see a tendency for the lower tail to lengthen and eventually wesee
a clearly bimo dal density. In this example, the classical regression assumption that the covariates
a ect only the lo cation of the resp onse distribution, but not its scale or shap e, is violated.
3. Interpretation of Quantile Regression
Least squares estimation of mean regression mo dels asks the question, \How do es the conditional
mean of Y dep end on the covariates X ?" Quantile regression asks this question at each quantile of the
conditional distribution enabling one to obtain a more complete description of how the conditional
distribution of Y given X = x dep ends on x. Rather than assuming that covariates shift only the
lo cation or scale of the conditional distribution, quantile regression metho ds enable one to explore
p otential e ects on the shap e of the distribution as well. Thus, for example, the e ect of a job-
training program on the length of participants' current unemployment sp ell might b e to lengthen the
shortest sp ells while dramatically reducing the probabilityofvery long sp ells. The mean treatment
6 ROGER KOENKER
e ect in such circumstances might b e small, but the treatment e ect on the shap e of the distribution
of unemployment durations could, nevertheless, b e quite signi cant.
3.1. QuantileTreatment E ects. The simplest formulationofquantile regression is the two-
sample treatment-control mo del. In place of the classical Fisherian exp erimental design mo del in
which the treatment induces a simple lo cation shift of the resp onse distribution, Lehmann 1974
prop osed the following general mo del of treatment resp onse:
\Supp ose the treatment adds the amountx when the resp onse of the untreated
sub ject would b e x. Then the distribution G of the treatment resp onses is that of the
random variable X +X where X is distributed according to F ."
Sp ecial cases obviously include the lo cation shift mo del, X =, and the scale shift mo del,
0
X = X , but the general case is natural within the quantile regression paradigm.
0
Doksum 1974 shows that if x is de ned as the \horizontal distance" b etween F and G at x,
so
F x= Gx +x
then x is uniquely de ned and can b e expressed as