Fused Adaptive Lasso for Spatial and Temporal Quantile Function Estimation
Total Page:16
File Type:pdf, Size:1020Kb
Technometrics ISSN: 0040-1706 (Print) 1537-2723 (Online) Journal homepage: http://www.tandfonline.com/loi/utch20 Fused Adaptive Lasso for Spatial and Temporal Quantile Function Estimation Ying Sun, Huixia J. Wang & Montserrat Fuentes To cite this article: Ying Sun, Huixia J. Wang & Montserrat Fuentes (2016) Fused Adaptive Lasso for Spatial and Temporal Quantile Function Estimation, Technometrics, 58:1, 127-137, DOI: 10.1080/00401706.2015.1017115 To link to this article: http://dx.doi.org/10.1080/00401706.2015.1017115 View supplementary material Accepted author version posted online: 01 Sep 2015. Published online: 22 Feb 2016. Submit your article to this journal Article views: 84 View related articles View Crossmark data Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=utch20 Download by: [North Carolina State University] Date: 26 April 2016, At: 14:04 Supplementary materials for this article are available online. Please go to http://www.tandfonline.com/r/TECH Fused Adaptive Lasso for Spatial and Temporal Quantile Function Estimation Ying SUN Huixia J. WANG CEMSE Division Department of Statistics King Abdullah University of Science and Technology George Washington University Thuwal 23955-6900, Saudi Arabia Washington, DC 20052 ([email protected]) ([email protected]) Montserrat FUENTES Department of Statistics North Carolina State University Raleigh, NC 27695 ([email protected]) Quantile functions are important in characterizing the entire probability distribution of a random vari- able, especially when the tail of a skewed distribution is of interest. This article introduces new quan- tile function estimators for spatial and temporal data with a fused adaptive Lasso penalty to accom- modate the dependence in space and time. This method penalizes the difference among neighboring quantiles, hence it is desirable for applications with features ordered in time or space without repli- cated observations. The theoretical properties are investigated and the performances of the proposed methods are evaluated by simulations. The proposed method is applied to particulate matter (PM) data from the Community Multiscale Air Quality (CMAQ) model to characterize the upper quantiles, which are crucial for studying spatial association between PM concentrations and adverse human health effects. KEY WORDS: Fused adaptive Lasso; Particulate matter; Quantile estimation; Spatial quantiles; Temporal quantiles. 1. INTRODUCTION while a local smoothing technique accounts for the dependence among qτ (i)’s. Moreover, only one realization of Y is needed to For a continuous univariate random variable Y, the quan- estimate qτ . q = F −1 τ ={y ∈ R F y = τ} tile function is defined as τ ( ) : ( ) , In the presence of right-skewed distributions, the usual Gaus- where F is the strictly increasing cumulative distribution func- sian assumption will underestimate the upper tail probability τ ∈ , tion (cdf) and (0 1) is the quantile level. Parzen (2004, and methods based on the extreme value theory or quantiles 2008) gave comprehensive discussions on quantiles from theory are more appropriate to characterize the tail distribution. Coo- to practice. The quantile function is important in characterizing ley et al. (2012) reviewed current methodology for the analysis the entire probability distribution, for example, the median is of spatial extremes, focusing on methods that account for de- a robust location estimator and upper quantiles are useful in pendence in extremes, such as max-stable processes and copula describing extreme events. Downloaded by [North Carolina State University] at 14:04 26 April 2016 models. Under these models, the statistical inference requires = Y ,...,Y T When data, Y ( 1 n) , are from a stochastic pro- repeated measurements, which are not generally available for cess, multivariate quantile function is not uniquely defined. We stochastic processes. In such a framework, the main goal of consider marginal quantiles and define the quantile function as the article consists of proposing nonparametric methods for = q ,...,q n T q i = F −1 τ F qτ ( τ (1) τ ( )) , where τ ( ) i ( ) with i denot- estimating upper quantile values in time or space without repli- Y ing the marginal cdf of i . In practice, estimating qτ is difficult cated observations. Our work is motivated by the need of bet- Y because i ’s are correlated and repeated measurements of the ter tools to characterize the tail distribution of the particulate random vector Y are usually not available. For example, for matter (PM), a mixture of pollutants. Since most air quality time series or spatial data, it is challenging to estimate the tem- data are right-skewed under stable meteorological conditions, poral quantile curves or spatial quantile surfaces due to the de- pendence among neighboring observations. Ignoring the depen- dence may underestimate the magnitude of the quantile function. © 2016 American Statistical Association and In this article, we propose a method to tackle this problem via the American Society for Quality a smoothing procedure to estimate qτ locally. The advantage TECHNOMETRICS, FEBRUARY 2016, VOL. 58, NO. 1 of this method is that the index-varying qτ (i), i = 1,...,n, DOI: 10.1080/00401706.2015.1017115 describes the changing behavior in the marginal distribution, Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/r/tech. 127 128 YING SUN, HUIXIA J. WANG, AND MONTSERRAT FUENTES and are observed as a time series at a particular location, or 2. METHODOLOGY a spatial surface at a certain time, we need to characterize the upper quantiles using one time series or one set of spatial data 2.1 Background only. To estimate quantile functions, quantile regression, first pro- For stochastic processes, one way to create replication is via a posed by Koenker and Bassett (1978), has become an impor- stationarity assumption. For example, the second-order station- tant and widely used technique. Quantile regression can be arity means that the mean and the covariance of the process do used to estimate quantiles of a response variable conditional not change when shifted in time or space. Specifically, a time on the regressors or predictors, and thus provide a comprehen- {Y t } δ series, ( ) , is second-order stationary if, for all , and for sive picture of the conditional response. Suppose we observe all t, E{Y (t)}=μ, and cov{Y (t),Y(t + δ)}=K(δ). Under the p {(Xi ,Yi ),i = 1,...,n}, where Xi ∈ R is the vector of predic- second-order stationarity assumption, covariance functions play tors and Yi ∈ R is the continuous response. Let qτ (Xi )bethe an important role in interpreting the local behavior, and Gaus- conditional τth quantile of Yi , such that P (Yi ≤ qτ (Xi )|Xi ) = τ, sian process models are widely used. For skewed distributions, for 0 <τ<1. The quantile qτ (Xi ) is estimated by minimizing methodologies based on multivariate extreme value theory are n ρ Y − q , ρ u = u{τ − I u< } i=1 τ ( i τ (Xi )) where τ ( ) ( 0) . developed to model the tail dependence, but it becomes difficult Before introducing the regularization technique in quan- for high or infinite dimensions (Cooley et al. 2012). Another way tile regression model, we briefly review two types of Lasso to model the local behavior is through nonparametric smooth- penalties in the least squares regression model. The Lasso ing. In practice, it is often difficult to tell whether the true un- methodology, as proposed by Tibshirani (1996), is a reg- derlying process is nonstationary or stationary with long range ularization technique for simultaneous estimation and vari- dependence by only looking at the observed data. The local T able selection. In a linear regression model E(Yi |Xi ) = Xi β, features can be either interpreted as a result of the nonstation- β = β ,...,β T the coefficient ( 1 p) is estimated by minimizing arity or the long range dependence in a stationary process. For n T 2 p i= (Yi − Xi β) + λ j= |βj |, where λ is a tuning parame- example, Furrer and Nychka (2007) presented a framework to 1 1 ter. The L1 norm penalty shrinks the coefficients toward 0 as understand the asymptotic properties of kriging and splines and λ increases. The fused Lasso penalty (Tibshirani et al. 2005) proved the equivalent interpretation between kriging estimators enforces sparsity in both the coefficients and their successive and generalized splines based on Nychka (1995). Stein (1991) differences, which is desirable for applications with features or- showed that the predictions that a generalized covariance func- dered in some meaningful way. The penalty term can be written tion yields are identical to an order 2 thin plate smoothing spline. λ p |β |+λ p |β − β |. as 1 j=1 j 2 j=2 j j−1 Draghicescu, Guillas, and Wu (2009) addressed time-varying The Lasso-type penalties in the least squares regression model quantile curve estimation for a class of nonstationary time are widely used in many fields. For example, in comparative ge- series. nomic hybridization (CGH) analysis, Eilers (2003), Huang et al. In this article, we propose nonparametric methods assuming (2005), and Tibshirani and Wang (2008) employed the penal- nonstationarity to capture the local dependence by minimizing ized least squares method by shrinking the distances between a loss function with a penalty term. For example, when observa- signals at adjacent clone locations, taking the spatial depen- tions are temporally inhomogeneous, we assume the marginal dence into account. Nowak et al. (2011) considered multiple Y distribution of i varies over time to characterize the temporal arrays of the CGH data. Moreover, Storlie et al. (2011) studied dependence, and improve the quantile estimation by penalizing the Lasso penalty for surface estimation and variable selec- the roughness. We use the fused Lasso technique (Tibshirani tion in multivariate nonparametric regression. Das et al.