Statistica Sinica 14(2004), 547-570
KURTOSIS AND CURVATURE MEASURES FOR NONLINEAR REGRESSION MODELS
L.M.Haines,T.E.O’BrienandG.P.Y.Clarke
University of KwaZulu-Natal, Loyola University Chicago and Agriculture Western Australia
Abstract: An expression for the second-order approximation to the kurtosis asso- ciated with the least squares estimate of an individual parameter in a nonlinear regression model is derived, and connections between this and various other mea- sures of curvature are made. Furthermore a means of predicting the reliability of the commonly-used Wald confidence intervals for individual model parameters, based on measures of skewness and kurtosis, is developed. Numerous examples illustrating the theoretical results are provided.
Key words and phrases: Bias, confidence intervals, parameter-effects curvature, relative overlap, skewness.
1. Introduction There has been considerable interest over the past twenty years in devel- oping measures of curvature for nonlinear regression models which in some way quantify the deviation of the model from linearity. Specifically, in a landmark paper in 1980, Bates and Watts built on the seminal work of Beale (1960) and in- troduced relative intrinsic and parameter-effects curvatures which provide global measures of the nonlinearity of the model. However these measures are not al- ways helpful when the individual parameters in the model are of interest (see e.g., Cook and Witmer (1985)) and as a consequence a number of researchers have developed measures of curvature which are specifically associated with the individual parameters. In particular Ratkowsky (1983) suggested examining the skewness and kurtosis of the least squares estimates of the parameters by means of simulation and Hougaard (1985) reinforced this idea by deriving a formula for the second-order approximation to skewness. Further Cook and Goldberg (1986) and Hamilton (1986) extended the ideas of Bates and Watts (1980) to accommo- date individual parameters, while Clarke (1987) introduced a marginal curvature measure derived from the second-order approximation to the profile likelihood. The list seems rather daunting and the curvature measures diverse and uncon- nected. In fact Clarke (1987), and more recently Kang and Rawlings (1998), have 548 L. M. HAINES, T. E. O’BRIEN AND G. P. Y. CLARKE identified certain relationships between the measures, but these connections are limited. The usefulness of measures of curvature for individual model parameters can best be gauged by how well the measures predict the coincidence between the Wald and the likelihood-based confidence intervals. Ratkowsky (1983, 1990) provided rules of thumb relating to the closeness-to-normality of the individual parameter estimates, and thus to the accuracy of the Wald intervals, which are based on the skewness and kurtosis of the parameter estimates obtained by sim- ulation and which are supported by a wealth of practical experience. Addition- ally Cook and Goldberg (1986) suggested a connection between their individual parameter-effects curvature measure and the reliability of the Wald confidence intervals but this claim has to a greater extent been refuted by the findings of van Ewijk and Hoekstra (1994). Clarke (1987) specifically designed his marginal curvature measure to reflect the closeness or otherwise of the Wald and the profile likelihood-based intervals, but the measure is approximate and does not perform well in all cases. In summary, none of the existing curvature measures associated with individual model parameters are entirely satisfactory in terms of assess- ing the reliability of the Wald confidence intervals. Interestingly Cook and Tsai (1990) approached the problem of assessing the closeness or otherwise of Wald and profile likelihood intervals more directly, by invoking a method introduced by Hodges (1987) which is based on determining the confidence levels associ- ated with the smallest Wald interval containing, and the largest Wald interval contained in, the likelihood-based interval. The aim of the present study is two-fold, first to develop a formula for the second-order approximation to kurtosis for the least squares estimates of the individual model parameters and to relate this to other measures of curvature, and second to develop a measure based on skewness and kurtosis which predicts well the closeness or otherwise of the Wald and the likelihood-based confidence intervals. The paper is organized as follows. Relevant notation and background are given in the next section and the new results on kurtosis are derived in Section 3. Various measures of curvature are related to the second-order approximations to skewness and kurtosis in Section 4, and a composite measure for assessing the accuracy of the Wald confidence intervals is developed in Section 5. Several illustrative examples are provided in Section 6 and the Fieller-Creasy problem is treated in detail in Section 7. Some brief conclusions are drawn in Section 8.
2. Preliminaries Consider the nonlinear regression model
yi = η(xi,θ)+i,i=1,...,n, KURTOSIS AND CURVATURE MEASURES 549 where yi is an observation taken at a value xi of the explanatory variable x, θ is a p × 1 vector of unknown parameters, η(xi,θ) is a function nonlinear in and three times differentiable with respect to θ,andthei are error terms independently 2 distributed as N(0,σ ). Let D be the (n × p)matrixwith(i, a)th entry ∂ηi/∂θa T where ηi = η(xi,θ), let G = D D, and introduce a matrix K, not necessarily unique, such that KT DT DK = I. Furthermore, define B to be the 3-dimensional array of order (p × p × p)with(l, rs)th element given by 2 ∂ηi ∂ ηi Bl,rs = KlaKrbKsc , a,b,c i ∂θa ∂θb∂θc and let Bl denote the lth face of that array for l =1,...,p. Also define C to be the 4-dimensional array of order (p × p × p × p)with(l, rst)th element given by 3 ∂ηi ∂ ηi Cl,rst = KlaKrbKscKtd , a,b,c,d i ∂θa ∂θb∂θc∂θd and let Cl denote the lth 3-dimensional symmetric array within C for l =1,...,p. Next, consider an individual parameter θj with maximum likelihood estimate T T θˆj and define tj = θˆj − θj for j =1,...,p.Letz = K D e and w = He, with H an orthonormal matrix such that KT DT H = 0. Clearly z ∼ N(0,σ2I) independently of w. Then, following Clarke (1980), tj canbeexpandedina power series in terms of z and w. Specifically, suppose that terms in w are neglected, since these are associated with the intrinsic nonlinearity of the model, and suppose also that terms in z of degree four and higher are ignored. Then the series can be written as b c tj = s10 + s20 + s30 + s30 + ···, (2.1) where the subscripts u and v in the terms suv denote the degree in z and in w, respectively, and the superscripts b and c for s30 identify terms involving the T matrices B and C separately. Furthermore, let kj denote the jth row of K, b T the vector with lth element z Blz = r,s Bl,rszrzs,andc the vector with lth element r,s,t Cl,rstzrzszt for l =1,...,p. Then it immediately follows from T − T − T Clarke (1980) that s10 = kj z and that s20 = (1/2)kj b = (1/2)z Mz,where M = l KjlBl.Also b 1 T 1 T T s30 = b Mz = (z Blz)(ml z), (2.2) 2 2 l T where ml denotes the lth row of M,and c 1 T 1 s30 = − kj c = − KjlCl,rstzrzszt 6 6 r,s,t l 1 T 1 T T = − (z Nrz)zr = − (z Nrz)(er z), (2.3) 6 r 6 r 550 L. M. HAINES, T. E. O’BRIEN AND G. P. Y. CLARKE