Accepted Manuscript

Testing in generalized partially linear models: A robust approach

Graciela Boente, Ricardo Cao, Wenceslao González Manteiga, Daniela Rodriguez

PII: S0167-7152(12)00336-7
DOI: 10.1016/j.spl.2012.08.031
Reference: STAPRO 6408

To appear in: Statistics and Probability Letters

Received date: 22 August 2011 Revised date: 29 August 2012 Accepted date: 30 August 2012

Please cite this article as: Boente, G., Cao, R., González Manteiga, W., Rodriguez, D., Testing in generalized partially linear models: A robust approach. Statistics and Probability Letters (2012), doi:10.1016/j.spl.2012.08.031

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Testing in generalized partially linear models: A robust approach

Graciela Boente^a, Ricardo Cao^b, Wenceslao González Manteiga^c, Daniela Rodriguez^{a,∗}

a Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires and CONICET, Argentina
b Departamento de Matemáticas, Universidade da Coruña, A Coruña, Spain
c Departamento de Estadística e Investigación Operativa, Facultad de Matemáticas, Universidad de Santiago de Compostela, Santiago de Compostela, Spain

Abstract

In this paper, we introduce a family of tests which allows deciding between a parametric model and a semiparametric one. More precisely, under a generalized partially linear model, i.e., when the observations satisfy y_i | (x_i, t_i) ∼ F(·, µ_i) with µ_i = H(η(t_i) + x_i^t β) and H a known link function, we want to test H_0: η(t) = α + γt against H_1: η is a nonlinear smooth function. A general approach which includes robust estimators based on a robustified deviance or a robustified quasi-likelihood is considered. The asymptotic behavior of the test statistic under the null hypothesis is obtained.

Key words: Generalized partially linear models, Kernel weights, Rate of convergence, Robust testing

Abbreviated Title: Robust test in generalized semiparametric models.

∗Corresponding Author: Instituto de Cálculo, FCEyN, UBA, Ciudad Universitaria, Pabellón 2, Buenos Aires, C1428EHA, Argentina. Fax: 54-11-45763375.
Email addresses: [email protected] (Graciela Boente), [email protected] (Ricardo Cao), [email protected] (Wenceslao González Manteiga), [email protected] (Daniela Rodriguez)

Preprint submitted to Elsevier, August 29, 2012

1. Introduction

Semiparametric models contain both a parametric and a nonparametric component. Sometimes the nonparametric component plays the role of a nuisance parameter. A lot of research has been done on estimators of the parametric component in a general framework, aiming to obtain asymptotically efficient estimators. The aim of this paper is to consider semiparametric versions of the generalized linear models where the response y is to be predicted by covariates (x, t), where x ∈ R^p and t ∈ T ⊂ R, with T a compact set. Without loss of generality, we will assume that T = [0, 1]. It will also be assumed that the conditional distribution of y | (x, t) belongs to the canonical exponential family exp[yθ(x,t) − B(θ(x,t)) + C(y)], for known functions B and C. Then, µ(x, t) = E(y | (x, t)) = B′(θ(x,t)), with B′ the derivative of B. In generalized linear models (McCullagh and Nelder, 1989), which are a popular technique for modelling a wide variety of data, it is often assumed that the mean is modelled linearly through a known link function g, i.e., g(µ(x,t)) = γ + x^t β + αt. For instance, an ordinary logistic regression model assumes that the observations (y_i, x_i, t_i) are such that the responses are independent binomial variables y_i | (x_i, t_i) ∼ Bi(1, p_i) whose success probabilities depend on the explanatory variables through the relation g(p_i) = γ + x_i^t β + α t_i, with g(u) = log(u/(1 − u)).

In many situations, the linear model is insufficient to explain the relationship between the response variable and its associated covariates. A natural generalization, which suffers from the curse of dimensionality, is to model the mean nonparametrically in the covariates. An alternative strategy is to allow most predictors to be modelled linearly while one or a small number of predictors enter the model nonparametrically. This is the approach we will follow, so that the relationship will be given by the semiparametric generalized partially linear model

µ(x, t) = H(η(t) + x^t β)    (1)

where H = g^{−1} is a known link function, β ∈ R^p is an unknown parameter and η is an unknown smooth function.

In the context of hypothesis testing for regression models, that is, when H(t) = t, Gao (1997) established a large sample theory for testing H_0: β = 0, while Härdle et al. (2000) also tested H_0: η = η_0, and Härdle and Mammen (1993) considered the lack of fit problem H_0: η ∈ {η_θ : θ ∈ Θ}. Besides, González-Manteiga and Aneiros-Pérez (2003) studied the case of dependent errors and Koul and Ni (2004) considered the case of random design and heteroscedastic errors. These methods are based on an L_2 distance comparison between a nonparametric estimator of the regression function and a smoothed parametric estimator, so they face the problem of selecting the smoothing parameter. An alternative approach is based on the empirical estimator of the integrated regression function. Goodness of fit tests based on empirical processes for regression models with non-random design have been studied, for instance, by Koul and Stute (1998) and Diebolt (1995). On the other hand, under a purely parametric model with Berkson measurement errors, Koul and Song (2008) considered a marked empirical process of the calibrated residuals. Recently, Koul and Song (2010) proposed a test for the partial linear model based on the supremum of a martingale transform of a process of calibrated residuals, when both the covariates in the parametric and nonparametric components are subject to Berkson measurement errors.

On the other hand, for generalized partially linear models, hypothesis testing mainly focusses on comparing kernel based estimators with smoothed parametric estimators. For instance, Härdle et al. (1998) considered a test statistic to decide between a linear and a semiparametric model. Their proposal is based on the estimation procedure considered by Severini and Staniswalis (1994)

modified to deal with the smoothed and unsmoothed likelihoods. A comparative study of different procedures was performed by Müller (2001), while a different approach was considered in Rodríguez Campos et al. (1998).

As is well known, such estimates fail to deal with outlying observations, and so does the test statistic. In a semiparametric setting, outliers can have a devastating effect, since the extreme points can easily affect the scale and the shape of the estimate of the function η, leading to possibly wrong conclusions. In particular, as mentioned in Hampel's comment on Stone's (1977) paper, "If we believe in a smooth model without spikes, . . . , some robustification is possible. In this situation, a clear outlier will not be attributed to some sudden change in the true model, but to a gross error, and hence it may be deleted or otherwise made harmless". Therefore, in this context robust procedures need to be developed to avoid wrong conclusions on the hypothesis to be tested (see Bianco et al. (2006) for a discussion). Robust procedures for generalized linear models have been considered, among others, by Stefanski et al. (1986), Künsch et al. (1989), Bianco and Yohai (1995), Cantoni and Ronchetti (2001), Croux and Haesbroeck (2002) and Bianco et al. (2005). The basic ideas from robust smoothing and from robust regression estimation have been adapted to deal with the case of independent observations following a partially linear regression model with H(t) = t; we refer to Gao and Shi (1997), He et al. (2002) and Bianco and Boente (2004). Moreover, robust tests for a given alternative, under a partially linear regression model, were studied in Bianco et al. (2006). Besides, a robust approach for testing the parametric form of a regression function versus an omnibus alternative, based on the centered asymptotic rank transformation, was considered by Wang and Qu (2007) when H(t) = t and β = 0, i.e., under the nonparametric model y_i = η(t_i) + ε_i.
Under a generalized partially linear model (1), Boente et al. (2006) introduced a general profile-based two-step robust procedure to estimate the parameter β and the function η, while Boente and Rodriguez (2010) (see also Rodriguez, 2008) developed a three-step method to improve the computational time of the previous one. Beyond the importance of developing robust estimators in more general settings, the work on testing also deserves attention. An up-to-date review of robust hypothesis testing results can be found in He (2002). The aim of this paper is to propose a class of tests for the nonparametric component based on the three-step robust procedure proposed by Boente and Rodriguez (2010).

The paper is organized as follows. In Section 2, we recall the definition of the general profile-based two-step estimators, as well as of the three-step robust estimates, and their asymptotic properties. In Section 3, we present a robust alternative to test hypotheses concerning the nonparametric component η. Its asymptotic behavior is studied in Section 4, while a bootstrap procedure is discussed in Section 5. Section 6 reports the results of a Monte Carlo study conducted to evaluate the performance of the tests under the null hypothesis and under a set of alternatives. Finally, proofs are relegated to the Appendix.
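As a concrete illustration of the canonical logistic link used throughout the examples above, a minimal sketch (Python, for illustration only), with g the logit and H = g^{-1} the inverse link appearing in model (1):

```python
import math

def g(u):
    """Logit link of the Bernoulli family: g(u) = log(u / (1 - u))."""
    return math.log(u / (1.0 - u))

def H(v):
    """Inverse link H = g^{-1} (the expit), so mu(x, t) = H(eta(t) + x'beta)."""
    return 1.0 / (1.0 + math.exp(-v))
```

For p = H(γ + x^t β + αt) this reproduces the success probability of the ordinary logistic regression model.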

2. Preliminaries: The estimation procedure

As mentioned in the Introduction, Boente et al. (2006) introduced a highly robust procedure under model (1), while Boente and Rodriguez (2010) introduced a local approach to improve the computational time. Let (y_i, x_i, t_i) be independent observations such that y_i | (x_i, t_i) ∼ F(·, µ_i) with µ_i = H(η(t_i) + x_i^t β) and Var(y_i | (x_i, t_i)) = V(µ_i). Let η_0(t) and β_0 denote the true parameter values, and E_0 the expected value under the true model, so that E_0(y_1 | (x_1, t_1)) = H(η_0(t_1) + x_1^t β_0).

As in Robinson (1988), we will assume that the vector 1_n is not in the space spanned by the column vectors of (x_1, …, x_n)^t, that is, we do not allow β_0 to include an intercept, so that the model is identifiable, i.e., if x_i^t β_1 + η_1(t_i) = x_i^t β_2 + η_2(t_i) for 1 ≤ i ≤ n, then β_1 = β_2 and η_1 = η_2. Due to the generality of the semiparametric model (1), identifiability implies that only "slope" coefficients can be estimated.

Let w_1 : R^p → R be a weight function to control leverage points on the carriers x, ρ : R^2 → R a loss function and K : R → R a kernel function. Define S(β, a, τ) = E_0[ρ(y, x^t β + a) w_1(x) | t = τ] and S_n(β, a, t) = Σ_{i=1}^n W_i(t) ρ(y_i, x_i^t β + a) w_1(x_i), where W_i(t) are the kernel weights on t_i, i.e.,

W_i(t) = K((t − t_i)/h_n) [Σ_{j=1}^n K((t − t_j)/h_n)]^{−1}.

Following the ideas of Severini and Staniswalis (1994), Boente et al. (2006) defined, for each fixed β, the function η_β(t) as the minimizer of S(β, a, t). Since S_n(β, a, t) provides a consistent estimate of S(β, a, t), its minimizer in a, η̂_β(t), estimates η_β(t). These functions allow the above mentioned authors to define two-step robust quasi-likelihood estimators of β_0 and η_0 as β̂ = argmin_β S_n(β, η̂_β, t) and η̂(t) = η̂_β̂(t), respectively. Boente and Rodriguez (2010) introduced a new family of estimators of β_0 and η_0 that improve the computational results. Both proposals provide robust root-n consistent estimators of the regression parameter β.
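The kernel weights W_i(t) and the local objective S_n(β, a, t) can be sketched in a few lines; a hedged illustration (Python), where the Epanechnikov kernel and the loss and weight functions passed in are our own assumptions for demonstration:

```python
import numpy as np

def kernel_weights(t0, t, h):
    """W_i(t0) = K((t0 - t_i)/h) / sum_j K((t0 - t_j)/h), here with the
    Epanechnikov kernel K(u) = 0.75 (1 - u^2) on |u| <= 1."""
    u = (t0 - t) / h
    k = 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1.0)
    return k / k.sum()

def S_n(beta, a, t0, y, x, t, h, rho, w1):
    """Local objective S_n(beta, a, t0) = sum_i W_i(t0) rho(y_i, x_i'beta + a) w1(x_i);
    its minimizer in a estimates eta_beta(t0)."""
    W = kernel_weights(t0, t, h)
    return np.sum(W * rho(y, x @ beta + a) * w1(x))
```

By construction the weights W_i(t0) are nonnegative and sum to one whenever some t_i falls within the bandwidth window around t0.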
If the function ρ(y, u) is continuously differentiable and we denote Ψ(y, u) = ∂ρ(y, u)/∂u, the functional η_β(t) and the estimates η̂_β(t) will be a solution of the differentiated equations, i.e., they will be a solution of S^{(1)}(β, a, t) = 0 and S_n^{(1)}(β, a, t) = 0, respectively, where S^{(1)}(β, a, τ) = E_0[Ψ(y, x^t β + a) w_1(x) | t = τ] and S_n^{(1)}(β, a, t) = Σ_{i=1}^n W_i(t) Ψ(y_i, x_i^t β + a) w_1(x_i). We refer to Boente et al. (2006) and Boente and Rodriguez (2010) for a discussion on the choice of the loss functions, where conditions ensuring Fisher-consistency of the resulting estimators are also stated. We only point out that, under a logistic model, two families of loss functions ρ have been considered in the literature: the first one bounds the deviances, as in our simulation study, while the second one, introduced by Cantoni and Ronchetti (2001), is based on robustifying the quasi-likelihood by bounding the Pearson residuals.
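For intuition on the second family, a score Ψ in the spirit of Cantoni and Ronchetti (2001) can be sketched for a Bernoulli response by Huberizing the Pearson residual; the tuning constant and the exact form below are illustrative assumptions, not the score used in our simulations:

```python
import numpy as np

def huber(r, c=1.345):
    """Huber psi-function: clips the residual at +-c."""
    return np.clip(r, -c, c)

def Psi_bounded(y, eta, c=1.345):
    """Bounded score Psi(y, eta) for y | (x,t) ~ Bi(1, mu), mu = H(eta):
    a Huberized Pearson residual r = (y - mu)/sqrt(V(mu)), V(mu) = mu(1 - mu).
    An illustrative sketch of bounding the Pearson residuals."""
    mu = 1.0 / (1.0 + np.exp(-eta))
    v = mu * (1.0 - mu)
    r = (y - mu) / np.sqrt(v)
    return huber(r, c) * np.sqrt(v)
```

For fixed eta the score is bounded in y by c·sqrt(V(µ)) ≤ c/2, which is what protects the resulting estimators against large residuals.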

3. Test statistics

A robust test statistic to test H_0: η_0 ∈ {α + γt, α ∈ R, γ ∈ R} can be defined by comparing the robust semiparametric estimators with the robust estimators obtained under a parametric model. We will give an approach which robustifies the test statistic defined in Härdle et al. (1998).

Denote by β̂ a robust root-n estimator of β_0 and by η̂(t) = η̂_β̂(t) the estimate of η_0(t), solution of η̂_β(t) = argmin_{a∈R} S_n(β, a, t). As in Section 2, let w_2 : R^p → R be a weight function that controls high leverage points on the covariates x. Denote L(β, α, γ) = E_0[ρ(y, x^t β + α + γt) w_2(x)] and

L_n(β, α, γ) = (1/n) Σ_{i=1}^n ρ(y_i, x_i^t β + α + γ t_i) w_2(x_i),

which correspond to the robustified objective functions under a generalized linear regression model. Then, the robust estimates of the regression parameter under the generalized linear model can be

defined as the minimizer of L_n

(β̂_{h_0}, α̂_{h_0}, γ̂_{h_0}) = argmin_{β∈R^p, α∈R, γ∈R} L_n(β, α, γ).    (2)

To test H_0, a natural approach is to compare the predicted values x_i^t β̂ + η̂(t_i) with those obtained under the null hypothesis, x_i^t β̂_{h_0} + α̂_{h_0} + γ̂_{h_0} t_i. However, as is well known, in nonparametric and semiparametric models, due to the bias of the kernel estimator of S(β, a, t), the smoothing bias of η̂(t) is non-negligible, even under the linear hypothesis H_0; see, for instance, Härdle and Mammen (1993) and Härdle et al. (1998) for a discussion when considering the classical estimators. For that reason, a simple comparison between both estimators may be misleading and lead to wrong conclusions. To solve this problem, Härdle et al. (1998) introduced a smoothing bias in α̂_{h_0} + γ̂_{h_0} t to compensate that of η̂(t). It is worth noting that the smoothed estimators obtained under the null hypothesis may not belong to the family of linear functions. However, they provide consistent estimators under the parametric model.

To define smoothed estimators under the null hypothesis, consider the pseudo-observations ỹ_i corresponding to the parametric fit of the conditional expectation under the null hypothesis, that is, ỹ_i = H(x_i^t β̂_{h_0} + α̂_{h_0} + γ̂_{h_0} t_i), and denote Ψ̃(µ, x^t β + a) = E[Ψ(y, x^t β + a) | (x, t)], where the conditional expectation is taken when y | (x, t) ∼ F(·, µ).

The function η̃_{h_0} is defined as follows. Since the pseudo-observations will not have outliers, in the sense of large Pearson residuals, but only leverage points could appear, it is quite natural to define η̃_{h_0}(t) as the value solving Σ_{i=1}^n W_i(t) Ψ̃(ỹ_i, x_i^t β̂_{h_0} + a) w_1(x_i) = 0, or equivalently as the value η̃_{h_0}(t) = argmin_{a∈R} S̃_n(β̂_{h_0}, a, t), where S̃_n(β, a, t) = Σ_{i=1}^n W_i(t) ρ̃(ỹ_i, x_i^t β + a) w_1(x_i), with ∂ρ̃(µ, a)/∂a = Ψ̃(µ, a). Note that under mild conditions ρ̃(µ, a) = E(ρ(y, a) | (x, t)), where the conditional expectation is taken when y | (x, t) ∼ F(·, µ).

Then, the test statistic is defined using a goodness-of-fit measure based on the quasi-likelihood:

T_1 = −2 Σ_{i=1}^n Q( H(x_i^t β̂ + η̂(t_i)), H(x_i^t β̂_{h_0} + η̃_{h_0}(t_i)) ) w_2(x_i) w(t_i),

where Q(y, µ) = ∫_µ^y (s − y) V^{−1}(s) ds is the quasi-likelihood. Since the quasi-likelihood is computed comparing predicted values for the responses based on robust estimators, large deviations of the predicted responses from their mean will not have a large influence on the test statistic. However, outlying points in the explanatory variables may have a large influence on the quasi-likelihood expression. Hence, in order to bound their effect, we introduce a weight function w_2(x_i) in the test definition. We have also included a weight function w(t) to avoid boundary effects. The function w has compact support T_0 ⊂ T = [0, 1]; in particular, for n large enough, I_{[h_n, 1−h_n]}(t) ≥ w(t).

This robust version of the quasi-likelihood test is different from the robust likelihood ratio-type or score-type tests defined in Heritier and Ronchetti (1994), which still use the responses y_i and compare the fits obtained under the restricted and unrestricted models.
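For the Bernoulli family, where V(s) = s(1 − s), the quasi-likelihood Q and the statistic T_1 can be written down explicitly; the closed form of Q below follows from integrating (s − y)/[s(1 − s)] and is an illustration for this particular family only:

```python
import numpy as np

def Q_bernoulli(y, mu):
    """Q(y, mu) = int_mu^y (s - y) V^{-1}(s) ds with V(s) = s(1 - s), i.e.
    Q(y, mu) = y log(mu/y) + (1 - y) log((1 - mu)/(1 - y)).
    Both arguments are fitted probabilities in (0, 1), so Q <= 0."""
    return y * np.log(mu / y) + (1.0 - y) * np.log((1.0 - mu) / (1.0 - y))

def T1(mu_semi, mu_null, w2x, wt):
    """T_1 = -2 sum_i Q(mu_semi_i, mu_null_i) w2(x_i) w(t_i), comparing the
    semiparametric fit with the smoothed fit under the null."""
    return -2.0 * np.sum(Q_bernoulli(mu_semi, mu_null) * w2x * wt)
```

Since Q(y, µ) ≤ 0 with equality only when y = µ, the statistic T_1 is nonnegative and vanishes when the two fits coincide.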

4. Asymptotic behavior

For the sake of simplicity, we denote ρ_n = h_n² + (n h_n)^{−1/2}, χ(y, a) = ∂Ψ(y, a)/∂a, χ_1(y, a) = ∂²Ψ(y, a)/∂a², υ(β, t) = η̂_β(t) − η_β(t), υ_0(t) = υ(β_0, t), υ_j(β, t) = ∂υ(β, t)/∂β_j and υ_{j,0}(t) = υ_j(β_0, t).

We will need the following set of assumptions:

A1. The density f of t_1 is bounded on T and twice continuously differentiable in the interior of T with bounded derivatives.

A2. inf_{t∈[0,1]} f(t) > 0.

A3. η_0 is twice continuously differentiable in the interior of T with bounded derivatives on T.

A4. r(t, τ) = E_0[χ(y_1, x_1^t β_0 + η_0(t)) w_1(x_1) | t_1 = τ] is uniformly continuous in the interior of T and bounded in T.

A5. The functions v_0(τ) = E_0[χ(y_1, x_1^t β_0 + η_0(τ)) w_1(x_1) | t_1 = τ] and v_1(τ) = E_0[χ(y_1, x_1^t β_0 + η_0(τ)) x_1 w_1(x_1) | t_1 = τ] are uniformly continuous in the interior of T and i_{v_0} = inf_{τ∈[0,1]} |v_0(τ)| > 0.

A6. Ψ, χ, χ_1, w, w_j and ψ_j(x) = x w_j(x) are bounded functions, for j = 1, 2.

A7. K is a function of bounded variation with compact support [0, 1] and it satisfies ∫K(u) du = 1 and ∫uK(u) du = 0.

A8. The bandwidth sequence satisfies n h_n³ / log(n) → ∞ and n^{1/2} h_n⁴ log(n) → 0.

Theorem 4.1. Assume that A1 to A8 hold. Moreover, assume that

a) G = {g(y, x, u) = χ(y, x^t β_0 + a) w_1(x) − E_0[χ(y_1, x_1^t β_0 + a) w_1(x_1) | t_1 = u], a ∈ R} has covering number N(ε, G, L¹(Q)) ≤ A ε^{−W}, for any probability Q and 0 < ε < 1.

b) ψ_{1,2}(x) = w_1(x) ‖x‖² is bounded or sup_{t∈T} E_0(ψ_{1,2}(x) | t) < ∞.

Then, under H_0: η ∈ {α + γt, α ∈ R, γ ∈ R}, we have that v_n^{−1}(T_1 − m_n) →^w N(0, 1), with m_n = c_{1,Ψ} h_n^{−1} ∫K²(u) du and v_n² = 2 c_{2,Ψ} h_n^{−1} ∫(K∗K(u))² du, where

c_{1,Ψ} = E[ w(t_1) E( w_2(x_1) H′(x_1^t β_0 + α_0 + γ_0 t_1)² / V(H(x_1^t β_0 + α_0 + γ_0 t_1)) | t_1 ) E( w_1²(x_1) σ²(x_1, t_1) | t_1 ) v_0(t_1)^{−2} f(t_1)^{−1} ]

c_{2,Ψ} = E[ w²(t_1) { E( w_2(x_1) H′(x_1^t β_0 + α_0 + γ_0 t_1)² / V(H(x_1^t β_0 + α_0 + γ_0 t_1)) | t_1 ) }² { E( w_1²(x_1) σ²(x_1, t_1) | t_1 ) }² / ( v_0(t_1)⁴ f(t_1) ) ]

σ²(x_0, t_0) = E[ ( Ψ(y_1, x_1^t β_0 + η_0(t_1)) − E_0( Ψ(y_1, x_1^t β_0 + η_0(t_1)) | (x_1, t_1) ) )² | (x_1, t_1) = (x_0, t_0) ].

Remark 4.1. When considering the canonical exponential family described in the Introduction, V(H(x_1^t β_0 + η_0(t_1))) = H′(x_1^t β_0 + η_0(t_1)), and so

c_{1,Ψ} = E[ w(t_1) E( w_2(x_1) H′(x_1^t β_0 + α_0 + γ_0 t_1) | t_1 ) E( w_1²(x_1) σ²(x_1, t_1) | t_1 ) v_0(t_1)^{−2} f(t_1)^{−1} ]

c_{2,Ψ} = E[ w²(t_1) { E( w_2(x_1) H′(x_1^t β_0 + α_0 + γ_0 t_1) | t_1 ) }² { E( w_1²(x_1) σ²(x_1, t_1) | t_1 ) }² / ( v_0(t_1)⁴ f(t_1) ) ]
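The kernel constants ∫K²(u)du and ∫(K∗K)²(u)du entering m_n and v_n can be evaluated numerically; a sketch for the Epanechnikov kernel used in Section 6 (the grid and step size are arbitrary choices of ours):

```python
import numpy as np

def K(u):
    """Epanechnikov kernel K(u) = (3/4)(1 - u^2) on [-1, 1]."""
    u = np.asarray(u, dtype=float)
    return 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1.0)

grid = np.linspace(-2.0, 2.0, 4001)   # the convolution K*K is supported on [-2, 2]
du = grid[1] - grid[0]

# int K^2(u) du (= 3/5 for the Epanechnikov kernel), entering m_n.
int_K2 = np.sum(K(grid) ** 2) * du

# (K*K)(u) = int K(v) K(u - v) dv, then int (K*K)^2(u) du, entering v_n^2.
KK = np.array([np.sum(K(grid) * K(u - grid)) * du for u in grid])
int_KK2 = np.sum(KK ** 2) * du
```

Since |K̂(ω)| ≤ ∫K = 1, one always has ∫(K∗K)²(u)du ≤ ∫K²(u)du, so the two constants are on comparable scales.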

5. A Montecarlo test

In this section, we develop a bootstrap procedure to implement the goodness-of-fit test for linearity. The need for bootstrapping has been studied by several authors, such as Härdle and Mammen (1993) and Härdle et al. (1998). These authors applied a wild bootstrap procedure to construct the bootstrap samples. However, in the present setting, due to the expensive computing time needed to compute the robust estimators, a linearised Montecarlo test as defined in Zhu (2005) provides a better approach. This approach was also considered in Zhu and Zhang (2004), who proposed a procedure for approximating the p-value when considering a log-likelihood ratio test statistic for testing homogeneity. Rémillard and Scaillet (2009) and Kojadinovic and Yan (2011) applied this method to provide fast goodness-of-fit tests for copulas.

As will be shown in the Appendix, T_1 = R_n + O_p((n/h_n)^{1/2} ρ_n² log n) under H_0: η_0 ∈ {α + γt, α ∈ R, γ ∈ R}, with

R_n = Σ_{i=1}^n w(t_i) w_2(x_i) [ H′(x_i^t β_0 + α_0 + γ_0 t_i)² / V(H(x_i^t β_0 + α_0 + γ_0 t_i)) ] v_0(t_i)^{−2} f(t_i)^{−2} × [ Σ_{j=1}^n W_{0,j}(t_i) w_1(x_j) ( Ψ(y_j, x_j^t β_0 + η_0(t_i)) − E_0( Ψ(y_j, x_j^t β_0 + η_0(t_i)) | (x_j, t_j, x_i, t_i) ) ) ]²

= n ∫ w(t) w_2(x) [ H′(x^t β_0 + α_0 + γ_0 t)² / V(H(x^t β_0 + α_0 + γ_0 t)) ] v_0(t)^{−2} f(t)^{−2} W_n²(t) dF_n(x, t),

where W_n(t) = Σ_{j=1}^n W_{0,j}(t) w_1(x_j) [ Ψ(y_j, x_j^t β_0 + η_0(t)) − E_0( Ψ(y_j, x_j^t β_0 + η_0(t)) | (x_j, t_j, t_i = t) ) ]. This suggests the following Montecarlo procedure.

Step B1. Given a sample {(y_i, x_i, t_i)}_{1≤i≤n}, compute the estimators (β̂_{h_0}, α̂_{h_0}, γ̂_{h_0}) as in (2). Define

• v̂_0(t) = Σ_{i=1}^n W_i(t) χ(y_i, x_i^t β̂_{h_0} + η̂_{H_0}(t)) w_1(x_i), with η̂_{H_0}(t) = α̂_{h_0} + γ̂_{h_0} t.

• ε̂_j(t) = Ψ(y_j, x_j^t β̂_{h_0} + η̂_{H_0}(t)) − E_0( Ψ(y_j, x_j^t β̂_{h_0} + η̂_{H_0}(t)) | (x_j, t_j) ).

Step B2. Generate n random variables ε_1^⋆, …, ε_n^⋆, independent of the sample {(y_i, x_i, t_i)}_{1≤i≤n} and such that E(ε_i^⋆) = 0, Var(ε_i^⋆) = 1 and the ε_i^⋆ are bounded. For instance, we generate n observations from the two point distribution P(ε^⋆ = a) = p and P(ε^⋆ = b) = 1 − p, with a = (1 − √5)/2, b = (1 + √5)/2 and p = (5 + √5)/10.

Step B3. Define R_n^⋆ = R_n^⋆(β̂_{h_0}, v̂_0, η̂_{H_0}) with

R_n^⋆ = Σ_{i=1}^n w(t_i) w_2(x_i) [ H′(x_i^t β̂_{h_0} + η̂_{H_0}(t_i))² / V(H(x_i^t β̂_{h_0} + η̂_{H_0}(t_i))) ] v̂_0(t_i)^{−2} [ Σ_{j=1}^n W_{0,j}(t_i) w_1(x_j) ε̂_j(t_i) ε_j^⋆ ]².

Step B4. Repeat Step B2 and Step B3 N_boot times, to get N_boot values of R_n^⋆, say R_{n,i}^⋆, 1 ≤ i ≤ N_boot.

The (1 − α) quantiles of the distribution of R_n (and so of T_1) can be approximated by the (1 − α) quantiles of the conditional distribution of R_n^⋆. The p-value can be estimated by p̂ = k/(N_boot + 1), where k is the number of R_{n,i}^⋆ larger than or equal to T_1.
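The multiplier distribution of Step B2 and the p-value rule above are easy to check numerically; a minimal sketch (Python):

```python
import numpy as np

# Two-point distribution of Step B2: a = (1 - sqrt(5))/2, b = (1 + sqrt(5))/2,
# p = (5 + sqrt(5))/10, which gives E(eps*) = 0 and Var(eps*) = 1.
A = (1.0 - np.sqrt(5.0)) / 2.0
B = (1.0 + np.sqrt(5.0)) / 2.0
P = (5.0 + np.sqrt(5.0)) / 10.0

def multipliers(n, rng):
    """Draw n bounded multipliers eps*_i, independent of the sample."""
    return np.where(rng.uniform(size=n) < P, A, B)

def p_value(T1_obs, R_star):
    """p-hat = k / (N_boot + 1), with k = #{i : R*_{n,i} >= T_1}."""
    R_star = np.asarray(R_star)
    return np.sum(R_star >= T1_obs) / (len(R_star) + 1.0)
```

The two-point distribution is the same golden-section construction commonly used for wild bootstrap multipliers; any bounded, mean-zero, unit-variance distribution would do.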

6. Monte Carlo study

This section contains the results of a simulation study conducted with the aim of comparing the performance of the proposed testing procedure with the classical one. We consider a logistic partially linear model. The classical procedure corresponds to using the maximum likelihood estimators under the parametric model and, under the nonparametric one, the estimators defined in Carroll et al. (1997), which are an alternative to those based on profile likelihood considered in Severini and Staniswalis (1994). To be more precise, we select the deviance as loss function and w_1 = w_2 ≡ 1 in Sections 2 and 3. The robust estimators correspond to those controlling large values of the deviance, and they are computed using the score function defined in Croux and Haesbroeck (2002) with tuning constant c = 0.5. The weight functions w_1 and w_2 used to control high leverage points are taken as the Tukey biweight function with tuning constant c = 4.685. To be more precise, since x_i ∈ R, we define w_1(x_i) = w_2(x_i) = (1 − [(x_i − M_n)/4.685]²)² when |x_i − M_n| ≤ 4.685 and 0 otherwise, with M_n the median of the x_i.

The central model, denoted C_0 in the figures, corresponds to a logistic model where x_i ∼ U(−1, 1) and t_i ∼ U(0, 1), independent of each other. On the other hand, the responses are such that y_i | (x_i, t_i) ∼ Bi(1, p(x_i, t_i)) with log(p(x, t)/(1 − p(x, t))) = β_0 x + η_0(t, ∆), with β_0 = 2 and η_0(t, ∆) = (t − 0.5) + ∆ cos(6π(t − 0.5)), that is, H(u) = exp(u)/(1 + exp(u)). The value ∆ = 0 corresponds to the null hypothesis H_0: η_0 ∈ {α + γt, α ∈ R, γ ∈ R}, while as alternatives we choose a grid of 10 equally spaced values of ∆ ∈ [0.2, 2.0]. We performed NR = 1000 replications of samples of size n = 200 and N_boot = 5000 bootstrap samples. The Epanechnikov kernel K(t) = (3/4)(1 − t²) I_{[−1,1]}(t) was selected for the smoothing procedure, with bandwidth h = 0.1.

Clearly, the test statistics considered depend on the choice of h. For the classical procedure, Härdle et al. (1998) pointed out the sensitivity of the test level to the selection of the bandwidth parameter. In particular, they suggested applying the test for different choices of h to get an impression of how the function η_0 differs significantly from linearity. The selected value h = 0.1 was chosen so that, under H_0, the observed frequencies of rejection for the bootstrap robust and classical tests reached values close to the nominal level α = 0.1.

Figure 1 gives the frequency of rejection both for the classical and robust procedures for the uncontaminated samples. The nominal level was 0.10. The frequency of rejection of the asymptotic test is plotted in lines combined with filled diamonds, while that of the Montecarlo test corresponds to the solid line. As expected, the Montecarlo test improves on the performance of the asymptotic one, for the sample size considered.
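The simulation design above can be sketched in a few lines (Python; the function names are ours and the sketch covers only the data generation and leverage weights, not the full estimation procedure):

```python
import numpy as np

def eta0(t, delta):
    """eta_0(t, Delta) = (t - 0.5) + Delta cos(6 pi (t - 0.5)); Delta = 0
    gives the null (linear) model."""
    return (t - 0.5) + delta * np.cos(6.0 * np.pi * (t - 0.5))

def generate_sample(n, delta, beta0=2.0, seed=None):
    """One sample from the central model C0: x ~ U(-1, 1), t ~ U(0, 1),
    y | (x, t) ~ Bi(1, H(beta0 x + eta0(t, Delta))), H(u) = e^u / (1 + e^u)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n)
    t = rng.uniform(0.0, 1.0, n)
    p = 1.0 / (1.0 + np.exp(-(beta0 * x + eta0(t, delta))))
    return rng.binomial(1, p), x, t

def tukey_weight(x, c=4.685):
    """Leverage weights w1 = w2: (1 - ((x - M_n)/c)^2)^2 if |x - M_n| <= c,
    0 otherwise, with M_n the sample median."""
    u = (x - np.median(x)) / c
    return np.where(np.abs(u) <= 1.0, (1.0 - u ** 2) ** 2, 0.0)
```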

For each generated sample, we also consider the following contaminations, labelled C_1 and C_2. We first generate a sample u_i ∼ U(0, 1), for 1 ≤ i ≤ n, and then the contaminated sample, denoted (y_{i,c}, x_{i,c}, t_i), is defined as follows for each contamination scheme:

• Contamination C_1 introduces bad high leverage points in the carriers x, without changing the responses already generated, i.e., (y_{i,c}, x_{i,c}) = (y_i, x_i) if u_i ≤ 0.90 and (y_{i,c}, x_{i,c}) = (y_i, x_{i,new}) if u_i > 0.90, where x_{i,new} is a new observation from a N(10, 1).

• Contamination C_2 includes outlying observations in the responses, generated according to an incorrect model. Let η̃(t, ∆) = ∆ cos(6π(t − 0.5)) and p_{i,new} = H(η̃(t_i, 20(1 − ∆))), and define y_{i,new} ∼ Bi(1, p_{i,new}). Then, (y_{i,c}, x_{i,c}) = (y_i, x_i) if u_i ≤ 0.90 and (y_{i,c}, x_{i,c}) = (y_{i,new}, x_i) if u_i > 0.90.
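The two contamination schemes can be sketched as follows (Python; an illustration of the mechanism rather than the exact simulation code):

```python
import numpy as np

def contaminate_C1(y, x, seed=None):
    """C1: with probability 0.10, replace the carrier x_i by a bad high
    leverage point from N(10, 1); responses are unchanged."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=len(x))
    return y, np.where(u > 0.90, rng.normal(10.0, 1.0, len(x)), x)

def contaminate_C2(y, x, t, delta, seed=None):
    """C2: with probability 0.10, replace y_i by a response from the incorrect
    model p_new = H(eta_tilde(t, 20(1 - Delta))), where
    eta_tilde(t, d) = d cos(6 pi (t - 0.5)); carriers are unchanged."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=len(y))
    p_new = 1.0 / (1.0 + np.exp(-20.0 * (1.0 - delta) * np.cos(6.0 * np.pi * (t - 0.5))))
    return np.where(u > 0.90, rng.binomial(1, p_new), y), x
```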

Figure 1: Frequency of rejection π of the asymptotic test, plotted with filled diamonds, and the Montecarlo test plotted with a solid line. a) Classical test b) Robust test.

Figures 2 and 3 give the frequency of rejection both for the classical and robust procedures for the contaminated samples. Figure 2 reports the frequencies of rejection for both the asymptotic and Montecarlo procedures; on the other hand, only the results for the Montecarlo test are reported for C_2, since the asymptotic ones behave similarly. The results show that the classical test seems to be quite insensitive to high leverage points if the model is adequate, as in C_1, while its power is sensitive to a misleading model. It is worth noting that, under C_1, the Montecarlo classical test outperforms the robust one, since it leads to a lower loss of power. Quite surprisingly, under the present setting, the classical test adapts to the effect of high leverage points if the model is correct, since the same disturbance is produced both in the parametric and nonparametric estimators. However, the classical procedure suffers from the inclusion of outliers under a misleading model, while the robust procedure is more stable. In this sense, the robust tests should be preferred. In particular, the robust asymptotic test avoids extra computing time at the expense of some level loss, leading to a conservative test.

Appendix P. Proofs

Figure 2: Frequency of rejection π of the asymptotic and Montecarlo tests, under C_0 in solid lines and under C_1 in lines with diamonds (upper panels: asymptotic test; lower panels: Montecarlo test). a) Classical test b) Robust test.

Figure 3: Frequency of rejection π of the Montecarlo test, under C_0 in solid lines and under C_2 in lines with diamonds. a) Classical test b) Robust test.

In this section we will give the proof of Theorem 4.1. From now on, let S_n^{(0,1)}(β, a, t) = Σ_{i=1}^n W_{0,i}(t) Ψ(y_i, x_i^t β + a) w_1(x_i), where W_{0,i}(t) = (n h_n)^{−1} K((t_i − t)/h_n). Besides, define the family of functions G = {g(y, x, u) = χ(y, x^t β_0 + a) w_1(x) − E_0[χ(y_1, x_1^t β_0 + a) w_1(x_1) | t_1 = u], a ∈ R} and let N(ε, G, L¹(Q)) stand for its L¹ covering number. Denote also K_n = {(t, β): t ∈ [2h_n, 1 − 2h_n], ‖β − β_0‖ ≤ ρ_n}. We will need the following lemmas, available in Boente et al. (2012).

Lemma P.1. Assume that A1 to A4, A6 and A7 hold and that n h_n³ / log(n) → ∞. Then, we have that sup_{(t,β)∈K_n} |S_n^{(0,1)}(β, η_0(t), t)| = O_p(ρ_n √log n).

Lemma P.2. Assume that A1 to A7 hold and that n h_n³ / log(n) → ∞. If N(ε, G, L¹(Q)) ≤ A ε^{−W}, for any probability Q and 0 < ε < 1, we have that sup_{(t,β)∈K_n} |η̂_β(t) − η_0(t)| = O_p(ρ_n √log n).

Lemma P.3. Assume that A1 to A7 hold and that n h_n³ / log(n) → ∞. If, in addition, ψ_{1,2}(x) = w_1(x) ‖x‖² is bounded or sup_{t∈T} E_0(ψ_{1,2}(x) | t) < ∞ and N(ε, G, L¹(Q)) ≤ A ε^{−W}, for any probability Q and 0 < ε < 1, we have that sup_{(t,β)∈K_n} |η̂_β(t) − η̃(t) − R(β, t)| = O_p(ρ_n² log n), with

η̃(t) = η_0(t) − {v_0(t) f(t)}^{−1} Σ_{i=1}^n W_{0,i}(t) w_1(x_i) Ψ(y_i, x_i^t β_0 + η_0(t))    (P.1)

R(β, t) = v_0(t)^{−1} v_1(t)^t (β − β_0).    (P.2)

Lemma P.4. Assume that H_0 holds, i.e., η_0(t) = α_0 + γ_0 t. Denote ỹ_{i,0} = H(x_i^t β_0 + α_0 + γ_0 t_i) and ν(τ) = E[ w_1(x_1) ζ(ỹ_{1,0}, x_1^t β_0 + η_0(τ)) H′(H^{−1}(ỹ_{1,0})) | t_1 = τ ], where ζ(y, a) = ∂Ψ̃(y, a)/∂y. Under A1 to A7, if in addition n h_n³ / log(n) → ∞, we have that

sup_{t∈[2h_n, 1−2h_n]} | η̃_{h_0}(t) − (α̂_{h_0} + γ̂_{h_0} t − α_0 − γ_0 t) v_0(t)^{−1} ν(t) − η_0(t) + v_0(t)^{−1} f(t)^{−1} Σ_{i=1}^n W_{0,i}(t) w_1(x_i) Ψ̃(ỹ_{i,0}, x_i^t β_0 + η_0(t)) | = O_p(ρ_n² log n).

Proof of Theorem 4.1. In order to derive an expansion for the test statistic, note that, uniformly on t ∈ [2h_n, 1 − 2h_n], we have

η̂(t) − η̃_{h_0}(t) = η̃(t) + R(β̂, t) − (α̂_{h_0} + γ̂_{h_0} t − α_0 − γ_0 t) v_0(t)^{−1} ν(t) − η_0(t) + v_0(t)^{−1} f(t)^{−1} Σ_{i=1}^n W_{0,i}(t) w_1(x_i) Ψ̃(ỹ_{i,0}, x_i^t β_0 + η_0(t)) + O_p(ρ_n² log n)

with η̃(t) and R(β̂, t) defined in (P.1) and (P.2), respectively. Hence,

η̂(t) − η̃_{h_0}(t) = − v_0(t)^{−1} f(t)^{−1} Σ_{i=1}^n W_{0,i}(t) w_1(x_i) [ Ψ(y_i, x_i^t β_0 + η_0(t)) − Ψ̃(ỹ_{i,0}, x_i^t β_0 + η_0(t)) ] + R(β̂, t) − (α̂_{h_0} + γ̂_{h_0} t − α_0 − γ_0 t) v_0(t)^{−1} ν(t) + O_p(ρ_n² log n)

= − v_0(t)^{−1} f(t)^{−1} Σ_{i=1}^n W_{0,i}(t) w_1(x_i) [ Ψ(y_i, x_i^t β_0 + η_0(t)) − E_0( Ψ(y_i, x_i^t β_0 + η_0(t)) | (x_i, t_i) ) ] + O_p(n^{−1/2}) + O_p(ρ_n² log n).

Therefore, we have the following expression for the test statistic, T_1 = R_n + O_p((n/h_n)^{1/2} ρ_n² log n), with

R_n = Σ_{i=1}^n [ H′(x_i^t β_0 + α_0 + γ_0 t_i)² / V(H(x_i^t β_0 + α_0 + γ_0 t_i)) ] ( η̂(t_i) − E( η̂(t_i) | x_1, t_1, …, x_n, t_n ) )² w_2(x_i) w(t_i)

= Σ_{i=1}^n w(t_i) w_2(x_i) [ H′(x_i^t β_0 + α_0 + γ_0 t_i)² / V(H(x_i^t β_0 + α_0 + γ_0 t_i)) ] v_0(t_i)^{−2} f(t_i)^{−2} W_n²(t_i)

W_n(t_i) = Σ_{j=1}^n W_{0,j}(t_i) w_1(x_j) [ Ψ(y_j, x_j^t β_0 + η_0(t_i)) − E_0( Ψ(y_j, x_j^t β_0 + η_0(t_i)) | (x_j, t_j, x_i, t_i) ) ],

which is a U-statistic. Therefore, using standard arguments as in Härdle and Mammen (1993), it follows that v_n^{−1}(T_1 − m_n) →^w N(0, 1), with v_n² = 2 c_{2,Ψ} h_n^{−1} ∫(K∗K(u))² du and m_n = c_{1,Ψ} h_n^{−1} ∫K²(u) du, where c_{1,Ψ}, c_{2,Ψ} and σ²(x_0, t_0) are given in Theorem 4.1.

Let us verify the expressions for m_n and v_n. Denote V_{j,i} = w_1(x_j) [ Ψ(y_j, x_j^t β_0 + η_0(t_i)) − E_0( Ψ(y_j, x_j^t β_0 + η_0(t_i)) | (x_j, t_j, x_i, t_i) ) ]; then

R_n = (1/n²) Σ_{i=1}^n w(t_i) v_0^{−2}(t_i) f^{−2}(t_i) w_2(x_i) [ H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))) ] Σ_{j=1}^n Σ_{ℓ=1}^n K_{h_n}(t_j − t_i) K_{h_n}(t_ℓ − t_i) V_{j,i} V_{ℓ,i}

R_n = [K²(0)/(n² h_n²)] Σ_{i=1}^n w(t_i) v_0^{−2}(t_i) f^{−2}(t_i) w_2(x_i) [ H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))) ] V_{i,i}²

+ [2K(0)/(n² h_n)] Σ_{i=1}^n w(t_i) v_0^{−2}(t_i) f^{−2}(t_i) w_2(x_i) [ H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))) ] V_{i,i} Σ_{ℓ≠i} K_{h_n}(t_ℓ − t_i) V_{ℓ,i}

+ (1/n²) Σ_{i=1}^n w(t_i) v_0^{−2}(t_i) f^{−2}(t_i) w_2(x_i) [ H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))) ] Σ_{j≠i, j≠ℓ} Σ_{ℓ≠i} K_{h_n}(t_j − t_i) K_{h_n}(t_ℓ − t_i) V_{j,i} V_{ℓ,i}

+ (1/n²) Σ_{i=1}^n w(t_i) v_0^{−2}(t_i) f^{−2}(t_i) w_2(x_i) [ H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))) ] Σ_{j≠i} K_{h_n}²(t_j − t_i) V_{j,i}²

= R_1 + R_2 + R_3 + R_4.

Using that n h_n² → ∞ and that

(1/n) Σ_{i=1}^n w(t_i) w_2(x_i) v_0^{−2}(t_i) f^{−2}(t_i) [ H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))) ] V_{i,i}² →^p E_0[ w(t_1) w_2(x_1) v_0^{−2}(t_1) f^{−2}(t_1) ( H′(x_1^t β_0 + η_0(t_1))² / V(H(x_1^t β_0 + η_0(t_1))) ) V_{1,1}² ],

we get that R_1 →^p 0, and so h_n^{1/2} R_1 →^p 0.

On the other hand, using that E(V_{ℓ,i} | (x_ℓ, t_ℓ, x_i, t_i)) = 0 and E(V_{i,i} | (x_i, t_i)) = 0, and that V_{ℓ,i} and V_{i,i} are conditionally independent for ℓ ≠ i, we get that E(R_2) = 0. On the other hand, let Z_i = w(t_i) v_0^{−2}(t_i) f^{−2}(t_i) w_2(x_i) V_{i,i} H′(x_i^t β_0 + η_0(t_i))² / V(H(x_i^t β_0 + η_0(t_i))); then we have that R_2 = [2K(0)/(n² h_n)] Σ_{i≠ℓ} Z_i K_{h_n}(t_ℓ − t_i) V_{ℓ,i}, and so

Var(R_2) = [2 K²(0) n(n−1) / (n⁴ h_n²)] (C_{1,h} + C_{2,h})

with C_{1,h} = E(Z_1² K_{h_n}²(t_2 − t_1) V_{2,1}²) and C_{2,h} = Cov(Z_1 K_{h_n}(t_2 − t_1) V_{2,1}, Z_2 K_{h_n}(t_1 − t_2) V_{1,2}). Note that

\[
C_{1,h} = \frac{1}{h_n^2}\, E\left(Z_1^2\, K^2\!\left(\frac{t_2-t_1}{h_n}\right) V_{2,1}^2\right) = \frac{1}{h_n} \int R(t_1, t_1+u h_n)\, K^2(u)\, f(t_1)\, f(t_1+u h_n)\, du\, dt_1.
\]
Hence, $C_{1,h} = O(1)/h_n$. In a similar way, we get that $C_{2,h} = O(1)/h_n$, which implies that $h_n\, \mbox{Var}(R_2) \to 0$ as $n\to\infty$; therefore, $h_n^{1/2} R_2 \stackrel{p}{\longrightarrow} 0$.

Write $R_4 = (1/n^2) \sum_{i=1}^n \sum_{j\neq i} W_i\, K_{h_n}^2(t_j-t_i)\, V_{j,i}^2$ with
\[
W_i = w(t_i)\, w_2(x_i)\, \left[H'(x_i^{\rm t}\beta_0+\eta_0(t_i))\right]^2 \left[v_0^2(t_i)\, f^2(t_i)\, V\left(H(x_i^{\rm t}\beta_0+\eta_0(t_i))\right)\right]^{-1};
\]

then
\[
E(R_4) = \frac{1}{n}\sum_{j\neq 1} E\left(W_1\, K_{h_n}^2(t_j-t_1)\, V_{j,1}^2\right) = \frac{n-1}{n}\, E\left(W_1\, K_{h_n}^2(t_2-t_1)\, E\left(V_{2,1}^2 \,|\, (x_1,t_1,x_2,t_2)\right)\right),
\]
and it is easy to see that $E\left(V_{2,1}^2 \,|\, (x_1,t_1,x_2,t_2)\right) = w_1^2(x_2)\, \sigma^2(x_2,t_2,t_1)$. Let
\[
R_{4,1} = \frac{n-1}{n}\, E\left(W_1\, K_{h_n}^2(t_2-t_1)\, w_1^2(x_2)\, \sigma^2(x_2,t_2)\right) \quad\mbox{and}\quad R_{4,2} = \frac{n-1}{n}\, E\left(W_1\, K_{h_n}^2(t_2-t_1)\, w_1^2(x_2)\left[\sigma^2(x_2,t_2,t_1)-\sigma^2(x_2,t_2)\right]\right);
\]
then $E(R_4) = R_{4,1} + R_{4,2}$. Using that $\sigma^2(x_2,t_2,t_1)$ is Lipschitz, we obtain that
\[
\left|\sigma^2(x_2,t_2,t_1)-\sigma^2(x_2,t_2)\right| \le C\, |t_2-t_1|
\]

and so $h_n^{1/2} R_{4,2} \to 0$.

Let $a(t_1) = w(t_1)\, v_0^{-2}(t_1)\, f^{-2}(t_1)$ and $b(t_1) = E\left( w_2(x_1)\, \left[H'(x_1^{\rm t}\beta_0+\eta_0(t_1))\right]^2 V^{-1}\left(H(x_1^{\rm t}\beta_0+\eta_0(t_1))\right) \,\big|\, t_1\right)$; then
\[
h_n^{1/2} R_{4,1} = \frac{n-1}{n\, h_n^2}\, h_n^{1/2}\, E\left( K^2\!\left(\frac{t_2-t_1}{h_n}\right) w_1^2(x_2)\, \sigma^2(x_2,t_2)\, a(t_1)\, b(t_1)\right).
\]
Denote $c(t_2) = E\left(w_1^2(x_2)\, \sigma^2(x_2,t_2) \,|\, t_2\right)$. Thus,
\[
h_n^{1/2} R_{4,1} = \frac{n-1}{n\, h_n}\, h_n^{1/2} \int K^2(u)\, du \int a(t_2)\, b(t_2)\, c(t_2)\, f^2(t_2)\, dt_2 + o(1).
\]
Using arguments analogous to those considered previously when studying the convergence of $R_2$, one can easily obtain that $h_n\, \mbox{Var}(R_4) \to 0$. Then,

\[
h_n^{1/2} \left[ R_4 - \frac{1}{h_n} \int K^2(u)\, du\; E\big(a(t_2)\, b(t_2)\, c(t_2)\, f(t_2)\big) \right] \stackrel{p}{\longrightarrow} 0,
\]
where

\[
E\big(a(t_2)\, b(t_2)\, c(t_2)\, f(t_2)\big) = E\left[ E\left(w_1^2(x)\,\sigma^2(x,t) \,|\, t\right) w(t)\, v_0^{-2}(t)\, f^{-1}(t)\, E\left( w_2(x)\, \frac{\left[H'(x^{\rm t}\beta_0+\eta_0(t))\right]^2}{V\left(H(x^{\rm t}\beta_0+\eta_0(t))\right)} \,\Big|\, t\right)\right] = c_{1,\Psi},
\]
so that $h_n^{1/2}\left(R_4 - m_n\right) \stackrel{p}{\longrightarrow} 0$, which verifies the expression for $m_n$.
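The centering and scaling constants in $m_n$ and $v_n^2$ involve the kernel-only quantities $\int K^2(u)\,du$ and $\int (K*K)^2(u)\,du$. As a quick numerical sanity check, the following minimal sketch (assuming, purely for illustration, the standard Gaussian kernel, a choice not prescribed by the text) evaluates both integrals and compares them with the Gaussian closed forms $1/(2\sqrt{\pi})$ and $1/(2\sqrt{2\pi})$:

```python
import numpy as np

def gaussian_kernel(u):
    # Standard Gaussian kernel; an illustrative choice, not fixed by the paper.
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kernel_constants(K, half_width=10.0, n_points=2001):
    # Riemann-sum evaluation of int K^2(u) du and int (K*K)^2(u) du on a
    # symmetric grid; Gaussian tails beyond +-half_width are negligible.
    grid = np.linspace(-half_width, half_width, n_points)
    du = grid[1] - grid[0]
    Kv = K(grid)
    int_K2 = np.sum(Kv ** 2) * du
    # convolution (K*K)(u) evaluated pointwise on the grid
    conv = np.array([np.sum(Kv * K(u - grid)) * du for u in grid])
    int_KK2 = np.sum(conv ** 2) * du
    return int_K2, int_KK2

int_K2, int_KK2 = kernel_constants(gaussian_kernel)
# Gaussian closed forms: int K^2 = 1/(2 sqrt(pi)), int (K*K)^2 = 1/(2 sqrt(2 pi)).
```

With these values, for the Gaussian kernel one would have $m_n \approx 0.2821\, c_{1,\Psi}\, h_n^{-1}$ and $v_n^2 \approx 0.3989\, c_{2,\Psi}\, h_n^{-1}$; for any other kernel the same routine applies unchanged.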

Finally, we will study the asymptotic behavior of $R_3$. The expected value of $R_3$ equals $0$, so it is enough to study its variance. Straightforward calculations lead to $\mbox{Var}(R_3) = A_1 + A_2 + A_3$, where

\begin{eqnarray*}
A_1 &=& \frac{(n-1)(n-2)}{n^3}\, E\left( a^2(t_1)\, w_2^2(x_1)\, \frac{\left[H'(x_1^{\rm t}\beta_0+\eta_0(t_1))\right]^4}{V^2\left(H(x_1^{\rm t}\beta_0+\eta_0(t_1))\right)}\, K_{h_n}^2(t_2-t_1)\, K_{h_n}^2(t_3-t_1)\, w_1^2(x_2)\, w_1^2(x_3)\, \sigma^2(x_2,t_2)\, \sigma^2(x_3,t_3) \right) \\
A_2 &=& \frac{(n-1)(n-2)}{n^3}\, E\left( a^2(t_1)\, w_2^2(x_1)\, \frac{\left[H'(x_1^{\rm t}\beta_0+\eta_0(t_1))\right]^4}{V^2\left(H(x_1^{\rm t}\beta_0+\eta_0(t_1))\right)}\, K_{h_n}^2(t_2-t_1)\, K_{h_n}^2(t_3-t_1)\, w_1^2(x_2)\, w_1^2(x_3) \left[\sigma^2(x_2,t_2,t_1)\, \sigma^2(x_3,t_3,t_1) - \sigma^2(x_2,t_2)\, \sigma^2(x_3,t_3)\right] \right) \\
A_3 &=& 2\,\frac{(n-1)(n-2)}{n^3}\, E\bigg( a(t_1)\, a(t_2)\, w_2(x_1)\, w_2(x_2)\, \frac{\left[H'(x_1^{\rm t}\beta_0+\eta_0(t_1))\right]^2}{V\left(H(x_1^{\rm t}\beta_0+\eta_0(t_1))\right)}\, \frac{\left[H'(x_2^{\rm t}\beta_0+\eta_0(t_2))\right]^2}{V\left(H(x_2^{\rm t}\beta_0+\eta_0(t_2))\right)} \\
&& \qquad \times\; K_{h_n}(t_3-t_1)\, K_{h_n}(t_4-t_1)\, K_{h_n}(t_3-t_2)\, K_{h_n}(t_4-t_2)\, E\left(V_{3,1} V_{3,2} \,|\, (x_3,t_3,t_2,t_1)\right) E\left(V_{4,1} V_{4,2} \,|\, (x_4,t_4,t_2,t_1)\right) \bigg).
\end{eqnarray*}
Let $b_H(t_1) = E\left( w_2^2(x_1)\, \frac{\left[H'(x_1^{\rm t}\beta_0+\eta_0(t_1))\right]^4}{V^2\left(H(x_1^{\rm t}\beta_0+\eta_0(t_1))\right)} \,\Big|\, t_1\right)$ and $\sigma^2(t_1) = E\left(w_1^2(x_1)\,\sigma^2(x_1,t_1) \,|\, t_1\right)$; thus
\[
h_n A_1 = \frac{h_n}{n\, h_n^2} \int a^2(t_1)\, b_H(t_1)\, \sigma^2(t_1+u h_n)\, \sigma^2(t_1+v h_n)\, K^2(u)\, K^2(v)\, du\, dv\, dt_1 = O(1)\, \frac{1}{n h_n},
\]
and then we have that $h_n A_1 \to 0$. On the other hand, using (P.3), we get the following bound
\[
|A_2| \le \frac{1}{n h_n} \int a^2(t_1)\, b_H(t_1)\, K^2(u)\, K^2(v)\, f(t_1)\, f(u h_n+t_1)\, f(v h_n+t_1)\, dt_1\, du\, dv = O(1)\, \frac{1}{n h_n},
\]

which implies that $h_n A_2 \to 0$. Finally, straightforward calculations show that $h_n A_3$ converges to
\[
2\, E\left(a^2(t)\, b^2(t)\, c^2(t,t,t)\, f^3(t)\right) \int (K*K)^2(u)\, du.
\]
Using that $c^2(t_2,t_2,t_2) = E\left(E\left(V_{2,2}^2 \,|\, (x_2,t_2)\right) \,|\, t_2\right) = E\left(w_1^2(x_2)\,\sigma^2(x_2,t_2) \,|\, t_2\right)$, we get that
\begin{eqnarray*}
E\left(a^2(t_2)\, b^2(t_2)\, c^2(t_2,t_2,t_2)\, f^3(t_2)\right) &=& E\left(a^2(t_2)\, b^2(t_2)\, E\left(w_1^2(x_2)\,\sigma^2(x_2,t_2) \,|\, t_2\right) f^3(t_2)\right) \\
&=& E\left[ \frac{w^2(t_1)}{v_0^4(t_1)\, f(t_1)}\, E\left( w_2(x_1)\, \frac{\left[H'(x_1^{\rm t}\beta_0+\eta_0(t_1))\right]^2}{V\left(H(x_1^{\rm t}\beta_0+\eta_0(t_1))\right)} \,\Big|\, t_1\right)^2 E\left(w_1^2(x_1)\,\sigma^2(x_1,t_1) \,|\, t_1\right)\right] = c_{2,\Psi}\,,
\end{eqnarray*}
concluding the proof.
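For the reader's convenience, the conclusions established for $R_1,\dots,R_4$ can be collected in a single display; this is a recap of the steps already proved above, not an additional result:

```latex
% Contributions of the four terms of R = R_1 + R_2 + R_3 + R_4:
%   h_n^{1/2} R_1 \to 0 and h_n^{1/2} R_2 \to 0 in probability,
%   h_n^{1/2}\big(R_4 - m_n\big) \to 0 in probability, with m_n = c_{1,\Psi} h_n^{-1}\int K^2(u)\,du,
%   h_n \mathrm{Var}(R_3) \to 2\, c_{2,\Psi} \int (K*K)^2(u)\,du = h_n v_n^2.
% Hence, since v_n^{-1} = O(h_n^{1/2}),
\[
  v_n^{-1}\left(\mathcal{T}_1 - m_n\right)
    = v_n^{-1} R_3 + v_n^{-1}\left(R_4 - m_n\right) + o_p(1)
    \stackrel{w}{\longrightarrow} N(0,1).
\]
```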

Acknowledgements

This research was partially supported by Grants 20020100100276 and 20020100300057 from the Universidad de Buenos Aires, PIP 112-200801-00216 from CONICET and PICT 0821 from ANPCyT, Argentina, and also by Spanish Grant MTM2008-03010/MTM of the Ministerio Español de Ciencia e Innovación and XUNTA Grant PGIDIT07PXIB207031PR. This work began while Graciela Boente and Daniela Rodriguez were visiting the Departamento de Estadística e Investigación Operativa of the Universidad de Santiago de Compostela. The authors wish to thank the Editor, the Associate Editor and two anonymous referees for valuable comments which led to an improved version of the original paper.

References

Bianco, A. and Boente, G. (2004). Robust estimators in semiparametric partly linear regression models. J. Statist. Plan. Inference, 122, 229–252.
Bianco, A., Boente, G. and Martínez, E. (2006). Robust tests in semiparametric partly linear models. Scand. J. Statist., 33, 435–450.
Bianco, A., García Ben, M. and Yohai, V. (2005). Robust estimation for linear regression with asymmetric errors. Canad. J. Statist., 33, 1–18.
Bianco, A. and Yohai, V. (1995). Robust estimation in the logistic regression model. Lecture Notes in Statistics, 109, 17–34. Springer-Verlag, New York.
Boente, G., Cao, R., González-Manteiga, W. and Rodriguez, D. (2012). Testing in generalized partly linear models: A robust approach. Technical Report, available at http://www.ic.fcen.uba.ar/preprints.pdf
Boente, G., He, X. and Zhou, J. (2006). Robust estimates in generalized partially linear models. Ann. Statist., 34, 2856–2878.
Boente, G. and Rodriguez, D. (2010). Robust inference in generalized partially linear models. Comp. Statist. Data Anal., 54, 2942–2966.
Cantoni, E. and Ronchetti, E. (2001). Robust inference for generalized linear models. J. Amer. Statist. Assoc., 96, 1022–1030.
Carroll, R., Fan, J., Gijbels, I. and Wand, M. (1997). Generalized partially linear single-index models. J. Amer. Statist. Assoc., 92, 477–489.
Croux, C. and Haesbroeck, G. (2002). Implementing the Bianco and Yohai estimator for logistic regression. Comp. Statist. Data Anal., 44, 273–295.
Diebolt, J. (1995). A nonparametric test for the regression function: Asymptotic theory. J. Statist. Plan. Inference, 44, 1–17.
Gao, J. (1997). Adaptive parametric test in a semiparametric regression model. Comm. Statist. Theory Methods, 26, 787–800.
Gao, J. and Shi, P. (1997). M-type smoothing splines in nonparametric and semiparametric regression models. Statist. Sinica, 7, 1155–1169.

González-Manteiga, W. and Aneiros-Pérez, G. (2003). Testing in partial linear regression models with dependent errors. J. Nonparametr. Statist., 15, 93–111.
Härdle, W., Liang, H. and Gao, J. (2000). Partially Linear Models. Springer-Verlag.
Härdle, W. and Mammen, E. (1993). Comparing nonparametric versus parametric regression fits. Ann. Statist., 21, 1926–1947.
Härdle, W., Mammen, E. and Müller, M. (1998). Testing parametric versus semiparametric modeling in generalized linear models. J. Amer. Statist. Assoc., 93, 1461–1474.
He, X. (2002). Robust tests in statistics – a review of the past decade. Estadística, 54, 29–46.
He, X., Zhu, Z. and Fung, W. (2002). Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika, 89, 579–590.
Heritier, S. and Ronchetti, E. (1994). Robust bounded-influence tests in general parametric models. J. Amer. Statist. Assoc., 89, 897–904.
Kojadinovic, I. and Yan, J. (2011). A goodness-of-fit test for multivariate multiparameter copulas based on multiplier central limit theorems. Statistics and Computing, 21, 17–30.
Koul, H. and Ni, P. (2004). Minimum distance regression model checking. J. Statist. Plan. Inference, 119, 109–142.
Koul, H. L. and Song, W. (2008). Regression model checking with Berkson measurement errors. J. Statist. Plan. Inference, 138, 1615–1628.
Koul, H. L. and Song, W. (2010). Model checking in partial linear regression models with Berkson measurement errors. Statistica Sinica, 20, 1551–1579.
Koul, H. L. and Stute, W. (1998). Lack of fit tests in regression with non-random design. Applied Statistical Science, 3, 53–69.
Künsch, H., Stefanski, L. and Carroll, R. (1989). Conditionally unbiased bounded-influence estimation in general regression models with applications to generalized linear models. J. Amer. Statist. Assoc., 84, 460–466.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd edition. London: Chapman and Hall.
Müller, M. (2001). Estimation and testing in generalized partial linear models: A comparative study. Statistics and Computing, 11, 299–309.
Pollard, D. (1984). Convergence of Stochastic Processes. New York: Springer-Verlag.
Rémillard, B. and Scaillet, O. (2009). Testing for equality between two copulas. J. Multivariate Anal., 100, 377–386.
Robinson, P. (1988). Root-N-consistent semiparametric regression. Econometrica, 56, 931–954.
Rodriguez, D. (2008). Estimación robusta en modelos parcialmente lineales generalizados [Robust estimation in generalized partially linear models]. Ph.D. thesis, University of Buenos Aires. Available at http://cms.dm.uba.ar/doc/tesisdanielarodriguez.pdf
Rodríguez Campos, M. C., González-Manteiga, W. and Cao, R. (1998). Testing the hypothesis of a generalized linear regression model using nonparametric regression estimation. J. Statist. Plan. Inference, 67, 99–122.
Severini, T. and Staniswalis, J. (1994). Quasi-likelihood estimation in semiparametric models. J. Amer. Statist. Assoc., 89, 501–511.
Stefanski, L., Carroll, R. and Ruppert, D. (1986). Bounded score functions for generalized linear models. Biometrika, 73, 413–424.
Stone, C. (1977). Consistent nonparametric regression. Ann. Statist., 5, 595–645.
Wang, L. and Qu, A. (2007). Robust tests in regression models with omnibus alternatives and bounded influence. J. Amer. Statist. Assoc., 102, 347–358.
Zhu, L. (2005). Nonparametric Monte Carlo tests and their applications. Lecture Notes in Statistics, 182. Springer.
Zhu, L. and Zhang, H. (2004). Hypothesis testing in mixture regression models. J. Royal Statist. Soc., Ser. B, 66, 3–16.
