
Journal of Modern Applied Statistical Methods
November 2006, Vol. 5, No. 2, 283–308
Copyright © 2006 JMASM, Inc. 1538–9472/06/$95.00

Recommended citation: Boik, Robert J. and Haaland, Ben (2005). "Second-Order Accurate Inference on Simple, Partial, and Multiple Correlations," Journal of Modern Applied Statistical Methods: Vol. 5, Iss. 2, Article 2. DOI: 10.22237/jmasm/1162353660. Available at: http://digitalcommons.wayne.edu/jmasm/vol5/iss2/2

Invited Article

Second-Order Accurate Inference on Simple, Partial, and Multiple Correlations

Robert J. Boik
Mathematical Sciences, Montana State University–Bozeman

Ben Haaland
Statistics, University of Wisconsin–Madison

This article develops confidence interval procedures for functions of simple, partial, and squared multiple correlation coefficients. It is assumed that the observed multivariate data represent a random sample from a distribution that possesses finite moments, but there is no requirement that the distribution be normal. The coverage error of conventional one-sided large sample intervals decreases at rate 1/√n as n increases, where n is an index of sample size. The coverage error of the proposed intervals decreases at rate 1/n as n increases. The results of a simulation study that evaluates the performance of the proposed intervals are reported, and the intervals are illustrated on a real data set.

Key words: bootstrap, confidence intervals, Cornish-Fisher expansion, Edgeworth expansion, second-order accuracy

Introduction

Author note: Robert Boik ([email protected]) is Professor of Statistics. His research interests include linear models, multivariate analysis, and large sample methods. Ben Haaland ([email protected]) is a Ph.D. student. His research interests include multivariate analysis, rates of convergence, and nonparametric techniques.

Accurate inference procedures for simple, partial, and multiple correlation coefficients depend on accurate approximations to the distributions of estimators of these coefficients. In this article, Edgeworth expansions for these distributions are derived. The Edgeworth expansions, in turn, are used to construct confidence intervals that are more accurate than conventional large sample intervals.

Denote the p × p sample correlation matrix based on a random sample of size N from a p-variate distribution by R and denote the corresponding population correlation matrix by ∆. The p² × 1 vectors obtained by stacking the columns of R and ∆ are denoted by r and ρ and are obtained by applying the vec operator. That is, r def= vec R and ρ def= vec ∆. The exact joint distribution of the components of r,

when sampling from a multivariate normal distribution, was derived by Fisher (1962), but the distribution is very difficult to use in practice because it is expressed in integral form unless p = 2. If sample size is sufficiently large, however, then one may substitute the asymptotic distribution of r, but with some loss of accuracy.

In this article, the big O (pronounced oh) notation (Bishop, Fienberg, & Holland, 1975, §14.2) is used to index the magnitude of a quantity. Let un be a quantity that depends on n = N − rx, where rx is a fixed constant and N is sample size. Then, un = O(n^−k) if n^k |un| is bounded as n → ∞. Note that if un = O(n^−k) and k > 0, then un converges to zero as n → ∞. Also, the rate of convergence to zero is faster for large k than for small k. An approximation to the distribution of a statistic is said to be jth-order accurate if the difference between the approximating cumulative distribution function (cdf) and the exact cdf has magnitude O(n^−j/2). For example, the cdf of a sample correlation coefficient, rij, is F_rij(t) def= P(rij ≤ t). Suppose that F̂_rij is an estimator of F_rij. Then, F̂_rij is first-order accurate if |F_rij(t) − F̂_rij(t)| = O(n^−1/2) for all t.

Pearson and Filon (1898) showed that the first-order accurate asymptotic joint distribution of the components of √n(r − ρ), when sampling from a multivariate normal distribution, is itself multivariate normal with mean zero and finite covariance matrix. Pearson and Filon also derived expressions for the components of the asymptotic covariance matrix. A matrix expression for the asymptotic covariance matrix was derived by Nell (1985). Niki and Konishi (1984) derived the Edgeworth and Cornish-Fisher expansions for √n[Z(r) − Z(ρ)], where Z is Fisher's (1921) Z transformation, with error only O(n^−9/2) when sampling from a bivariate normal distribution.

If the requirement of multivariate normality is relaxed, then the distributions of r and of functions of r are substantially more complicated. Fortunately, the asymptotic distribution of √n(r − ρ) still is multivariate normal with mean zero and finite covariance matrix whenever the parent distribution has finite fourth-order moments. Expressions for the scalar components of the asymptotic covariance matrix of √n(r − ρ) when sampling from non-normal distributions were derived by Hsu (1949) and Steiger and Hakstian (1982, 1983). Corresponding matrix expressions were derived by Browne and Shapiro (1986) and Neudecker and Wesselman (1990). The asymptotic bias of r when sampling from non-normal distributions was obtained by Boik (1998).

In the bivariate case, Cook (1951) obtained scalar expressions for the moments of r with error O(n^−5/2). These moments could be used to compute the first three cumulants of √n(r − ρ). The first four cumulants and the corresponding Edgeworth expansion for the distribution of √n(r − ρ) were obtained by Nakagawa and Niki (1992). Some distributional results for √n(r − ρ) have been obtained under special non-normal conditions. Neudecker (1996), for example, obtained a matrix expression for the asymptotic covariance matrix in the special case of elliptical parent distributions. Also, Yuan and Bentler (1999) gave the asymptotic distribution of correlation and multiple correlation coefficients when the p-vector of random variables can be written as a linear function of zv, where z is a p-vector of independent components and v is a scalar that is independent of z.

This article focuses on confidence intervals for functions of ∆. Specifically, second-order accurate interval estimators for simple, partial, and squared multiple correlation coefficients, as well as for differences among simple, partial, and squared multiple correlation coefficients, are constructed. A confidence interval is said to be jth-order accurate if the difference between the actual coverage and the stated nominal coverage has magnitude O(n^−j/2). In general, large sample confidence intervals are first-order accurate under mild validity conditions.

First-order accurate confidence intervals for correlation functions are not new. Olkin and Siotani (1976) and Hedges and Olkin (1983) used the delta method (Rao, 1973, §6a.2) together with the asymptotic distribution of √n(r − ρ) when sampling from a multivariate normal distribution to derive the asymptotic distribution of partial and multiple correlation coefficient estimators. Olkin and Finn (1995) used these results to obtain explicit expressions for asymptotic standard errors of estimators of simple, partial, and squared multiple correlation coefficients, as well as expressions for standard errors of differences among these coefficients. The major contribution of Olkin and Finn (1995) was that they demonstrated how to use existing moment expressions to compute first-order accurate confidence intervals for correlation functions when sampling from multivariate normal distributions.

To avoid complicated expressions for derivatives, Olkin and Finn (1995) gave confidence intervals for partial correlations and their differences only in the special case when the effects of a single variable are partialed out. Similarly, confidence intervals for differences among squared multiple correlation coefficients (estimated using a single sample) are given only for the special cases when either one or two explanatory variables are employed. Graf and Alf (1999) used numerical derivatives to extend the results of Olkin and Finn to allow conditioning on any set of variables rather than a single variable and to allow squared multiple correlations to be based on an arbitrary number of explanatory variables. Alf and Graf (1999) gave scalar equations for the standard error of the difference between two squared sample multiple correlations. This article also extends the results of Olkin and Finn (1995), but does so using relatively simple and easily computed (with a computer) matrix expressions for the required derivatives. As in the previous articles, this article relies on derivatives of correlation functions. Unlike Olkin and Finn, however, simple, partial, and multiple correlations are treated as functions of the covariance matrix, Σ, instead of the correlation matrix, ∆. The advantage of the current approach is simplicity of the resulting expansions.

This article also extends the results of Olkin and Finn (1995), Graf and Alf (1999), and Alf and Graf (1999) to be applicable to non-normal as well as normal distributions. This extension is straightforward. Denote the sample covariance matrix based on a random sample of size N by S. Then, one needs only to replace the asymptotic covariance matrix of vec S derived under normality with the asymptotic covariance matrix derived under general conditions.

In summary, the contributions of this article are as follows: (a) easily computed matrix expressions for the required derivatives are given (see Theorems 2 and 3); (b) the proposed intervals are asymptotically distribution free (ADF); and (c) the accuracy of the confidence intervals is extended from first-order to second-order. In addition, the proposed interval estimators are illustrated on a real data set, and a small simulation study that evaluates the performance of the proposed intervals is conducted.

Before describing the new results, the Olkin and Finn (1995) article will be briefly revisited. It appears that the methods described in Olkin and Finn have begun to be incorporated into statistical practice (34 citations from 1997 to June 2005, most of which are reports of empirical studies). Unfortunately, the derivatives reported in the article contain several errors. The effects of these errors on the examples given by Olkin and Finn are minimal, but their effects in practice are unknown. A list of corrections to these derivatives is given in the next section.

Corrections to Olkin and Finn

Below is a list of corrections to the derivatives reported in Olkin and Finn (1995).

Model B. Components a1 and a3 in model B on page 159 should be the following:

a1 = 2[ρ01(ρ12² − ρ13²) + ρ03ρ13 − ρ02ρ12 + ρ12ρ13(ρ02ρ13 − ρ03ρ12)] ÷ [(1 − ρ12²)(1 − ρ13²)]

and

a3 = −2(ρ03 − ρ01ρ13)/(1 − ρ13²).

The nominal 95% confidence interval for ρ0(12)² − ρ0(13)² should be [−0.19, 0.015] rather than [−0.22, 0.018].

Model C. Components a1 and a3 in model C on page 160 should be the following:

a1 = √[(1 − ρ02²)(1 − ρ12²)] − 1

and

a3 = (ρ02 − ρ01ρ12)/(1 − ρ12²).

The nominal 95% confidence interval for ρ01 − ρ01·2 should be [−0.0018, 0.0073] rather than [−0.0017, 0.0080].

Model D. Components a1, a3, and a5 in model D on page 161 should be the following:

a1 = 1/√[(1 − ρ02²)(1 − ρ12²)] − 1/√[(1 − ρ03²)(1 − ρ13²)],

a3 = (ρ13 − ρ01ρ03)/[(1 − ρ03²)^(3/2) (1 − ρ13²)^(1/2)],

and

a5 = (ρ03 − ρ01ρ13)/[(1 − ρ03²)^(1/2) (1 − ρ13²)^(3/2)].

The nominal 95% confidence interval for ρ01·2 − ρ01·3 should be [0.020, 0.054] rather than [0.017, 0.057].

Notation and Preliminary Remarks

In some applications, an investigator may have one or more fixed explanatory variables (e.g., group or treatment variables) whose effects must be removed before estimating the correlation matrix. A linear model that is suitable for this purpose is described in this section. The sample covariance matrix, S, is based on the residuals from the fitted model. The confidence interval procedures that are constructed later in this article require estimates of second- and third-order moments of S. Expressions for the population moments as well as consistent estimators of these moments are described in this article. Sufficient validity conditions were described in Boik (2005, §5.1).

Denote the response vector for the ith subject by yi and let Y be the N × p response matrix whose ith row is yi′.

That is, Y′ = (y1 y2 ⋯ yN). The joint model for the N subjects is

Y = XB + E,  (1)

where X is an N × q matrix of fixed explanatory variables and E is an N × p random matrix. Denote the ith row of E by εi′. It is assumed that εi for i = 1, …, N are independently and identically distributed with mean E(εi) = 0 and Var(εi) = Σ.

The components of Σ are denoted by σij for i = 1, …, p and j = 1, …, p. Note that the variance of the ith variable is σii, not σii², and that the correlation between the ith and jth variables is ρij = σij/√(σii σjj). To represent the matrix of correlations as a function of Σ, a notation for diagonal matrices is needed. Suppose that M is a q × q matrix with positive diagonal components m11, m22, …, mqq and that b is a real number. Then the diagonal matrix with diagonal components m11^b, m22^b, …, mqq^b is denoted by (M)D^b. That is,

(M)D^b = Σ_{i=1}^q ei^q mii^b ei^{q′},  (2)

where ei^q is the ith column of Iq. Using this notation, the p × p correlation matrix is

∆ = (Σ)D^{−1/2} Σ (Σ)D^{−1/2}.

Estimators of simple, partial, and multiple correlation coefficients will be based on the sample covariance matrix, namely

S = n⁻¹ Y′AY,  (3)

where A = I_N − X(X′X)⁻X′, (·)⁻ denotes an arbitrary generalized inverse, n = N − rx, and rx = rank(X). It is readily shown that if the rows of Y have finite variance, then S is unbiased for Σ.

Moments of the Sample Covariance Matrix

If fourth-order moments of Y are finite, then the central limit theorem ensures that

√n(s − σ) →dist N(0, Ω22,∞),  (4)

where Ω22,∞ = Υ22 − σσ′, Υ22 = E(εiεi′ ⊗ εiεi′), s def= vec S, σ def= vec Σ, and εi′ is the ith row of E in (1). Boik (1998; 2005, eq. 12) showed that if fourth-order moments of Y are finite, then the finite-sample variance of √n(s − σ) is

Ω22,n def= Var[√n(s − σ)]  (5)
      = (c1/n)(Υ22 − σσ′) + (1 − c1/n) 2Np(Σ ⊗ Σ),

where c1 = Σ_{i=1}^N aii², aij is the ijth component of A in (3), Υ22 is defined in (4), Np = (I_{p²} + I_{(p,p)})/2, and I_{(a,b)} is the commutation matrix (MacRae, 1974). Magnus and Neudecker (1979; 1999, §3.7) denoted I_{(a,b)} by K_{b,a}.

If data are sampled from a multivariate normal distribution, then the finite-sample variance and an unbiased estimator of the finite-sample variance of √n(s − σ) are

Ω22,n = 2Np(Σ ⊗ Σ)  and

Ω̂22,n = [n²/((n − 1)(n + 2))] 2Np(S ⊗ S) − [2n/((n − 1)(n + 2))] ss′,

respectively. From Boik (1998; 2005, eq. 13), an unbiased estimator of Ω22,n under general conditions is

Ω̂22,n = â1Υ̂22 + â2 2Np(S ⊗ S) + â3 ss′,  (6)

where
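As a concrete sketch (the helper names are ours, not from the article), the residual-based estimator S in (3) and the map from Σ to ∆ translate directly into NumPy; a pseudoinverse stands in for the arbitrary generalized inverse (X′X)⁻:

```python
import numpy as np

def residual_covariance(Y, X):
    """S = n^{-1} Y'AY with A = I_N - X(X'X)^- X' and n = N - rank(X), as in (3)."""
    N = Y.shape[0]
    A = np.eye(N) - X @ np.linalg.pinv(X.T @ X) @ X.T   # residual projection matrix
    n = N - np.linalg.matrix_rank(X)
    return (Y.T @ A @ Y) / n

def correlation_from_covariance(Sigma):
    """Delta = (Sigma)_D^{-1/2} Sigma (Sigma)_D^{-1/2}."""
    d = np.diag(1.0 / np.sqrt(np.diag(Sigma)))
    return d @ Sigma @ d
```

When X is a single column of ones (an intercept only), A mean-centers the rows of Y and n = N − 1, so `residual_covariance` reduces to the usual unbiased sample covariance matrix.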

â1 = n²c1/d,  â2 = n²(c1 − nc2)/[(n − 1)d],  â3 = −n[2nc2 + (n − 3)c1²]/[(n − 1)d],

c2 = Σ_{i=1}^N Σ_{j=1}^N aij⁴,  d = n(n + 2)c2 − 3c1²,

Υ̂22 = n⁻¹ Σ_{i=1}^N ε̂iε̂i′ ⊗ ε̂iε̂i′,

c1 is defined in (5), and ε̂i = Y′Aei^N is the p-vector of residuals from the fitted model for subject i.

Two additional covariance-related quantities are needed in this article, namely

Ω42,n def= Cov[√n(ω̂ − ω), √n(s − σ)] = nE[(ω̂ − ω)(s − σ)′]  (7)

and

Ω222,n def= n^{3/2} E[(s − σ)(s − σ)′ ⊗ (s − σ)],

where ω = vec Ω22,n and ω̂ = vec Ω̂22,n. Using the results of Boik (2005), it can be shown that if sixth-order moments are finite and the column space generated by X in (1) contains the N-vector of ones, then Ω42,n and √nΩ222,n can be written as follows:

Ω42,n = Υ42 − 2Np²(Ω22,n ⊗ σ) − (Υ22 ⊗ σ) − 4Np²(Np ⊗ Ip²)(Υ21 ⊗ vec Υ21′) + O(n⁻¹)  (8)

and

√nΩ222,n = Ω42,n − (Υ21 ⊗ Υ21′) 2Np + O(n⁻¹),

where Υ42 = E(εiεi′ ⊗ εiεi′ ⊗ εiεi′) and Υ21 = E(εi ⊗ εiεi′).

If the data are a random sample from a multivariate normal distribution, then to order O(n⁻¹), Ω42,n and √nΩ222,n simplify to

Ω42,n = (2Np ⊗ 2Np)(Σ ⊗ σ ⊗ Σ) 2Np  (9)

and √nΩ222,n = Ω42,n.

It will be seen later in this article that the proposed inference procedures depend on Ω222,n and Ω42,n only through functions that can be written as (ψ̇σ ⊗ ψ̇σ ⊗ ψ̇σ)′ vec Ω222,n and (ψ̇σ ⊗ ψ̇σ ⊗ ψ̇σ)′ vec Ω42,n, where ψ̇σ is a p² × 1 vector that satisfies ψ̇σ′σ = 0. Theorem 1 gives consistent estimators of the required quantities. The results in Theorem 1 follow from the consistency of Ω̂42,n and Ω̂222,n described by Boik (2005).

Theorem 1. Suppose that ψ̇σ is a p² × 1 vector function of Σ that satisfies ψ̇σ′σ = 0. Then, the required sixth-order quantities can be consistently estimated as

(ψ̇s ⊗ ψ̇s ⊗ ψ̇s)′ vec Ω̂42,n = n⁻¹ Σ_{i=1}^N [ψ̇s′(ε̂i ⊗ ε̂i)]³ − 4ψ̇s′Υ̂21Ψ̇sΥ̂21′ψ̇s

and

√n (ψ̇s ⊗ ψ̇s ⊗ ψ̇s)′ vec Ω̂222,n = (ψ̇s ⊗ ψ̇s ⊗ ψ̇s)′ vec Ω̂42,n − 2ψ̇s′Υ̂21Ψ̇sΥ̂21′ψ̇s,
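The commutation matrix I_{(p,p)} and the symmetrizer Np = (I_{p²} + I_{(p,p)})/2 appear throughout (5)–(9) and can be built explicitly. A minimal sketch, assuming the column-major vec convention; the function names are ours:

```python
import numpy as np

def commutation_matrix(p):
    """I_(p,p): the permutation with K vec(M) = vec(M') for any p x p matrix M."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            K[i * p + j, j * p + i] = 1.0   # column-major vec positions
    return K

def Np(p):
    """Symmetrizer N_p = (I_{p^2} + I_(p,p)) / 2; maps vec(M) to vec((M + M')/2)."""
    return 0.5 * (np.eye(p * p) + commutation_matrix(p))

def omega22_normal(Sigma):
    """Normal-theory finite-sample variance Omega_{22,n} = 2 N_p (Sigma kron Sigma)."""
    p = Sigma.shape[0]
    return 2.0 * Np(p) @ np.kron(Sigma, Sigma)
```

Because I_{(p,p)} is a symmetric involution, Np is a symmetric idempotent projection, which the checks below exploit.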

where Υ̂21 = n⁻¹ Σ_{i=1}^N ε̂i ⊗ ε̂iε̂i′, ψ̇s is a consistent estimator of ψ̇σ, Ψ̇s is a p × p matrix that satisfies vec Ψ̇s = ψ̇s, and ε̂i is defined in (6). If the rows of Y have a multivariate normal distribution, then the required sixth-order quantities can be consistently estimated as

(ψ̇s ⊗ ψ̇s ⊗ ψ̇s)′ vec Ω̂42,n = √n (ψ̇s ⊗ ψ̇s ⊗ ψ̇s)′ vec Ω̂222,n = 8 trace[(SΨ̇s)³].  (10)

Specific structures for ψ̇σ and ψ̇s are described below.

Derivatives

The main theoretical tool used in this article is the Taylor series expansion of random quantities. The first-order linear expansion is the basis of the delta method. To construct second-order accurate interval estimators, it is necessary to include the quadratic term in the expansion as well. Accordingly, both first and second derivatives of correlation functions are required. Derivatives of partial correlations as well as of squared multiple correlations are given in this article.

Partial Correlations

Suppose that an investigator is interested in the correlation between variables i and j, controlling for a subset of variables indexed by k = (k1 k2 ⋯ kq)′. This partial correlation, denoted as ρij·k, is the off-diagonal component in the correlation matrix ∆ij·k, where

∆ij·k = [ 1  ρij·k ; ρij·k  1 ] = (Σij·k)D^{−1/2} Σij·k (Σij·k)D^{−1/2},  (11)

Σij·k = [ σii·k  σij·k ; σij·k  σjj·k ] = Eij′ [Σ − ΣHkΣ] Eij,  Eij = (ei^p  ej^p),

Ek = (e_{k1}^p  e_{k2}^p  ⋯  e_{kq}^p),  Hk = Ek(Ek′ΣEk)⁻¹Ek′,

and the remaining terms are defined in (2). It is required that neither i nor j be a component of k; otherwise ρij·k = 0 and no inference is necessary. The parameter ρij·k can be estimated by rij·k, where rij·k is obtained by substituting S for Σ in (11).

Theorem 2 gives first and second derivatives of ρij·k with respect to σ. The derivatives are given without proof because they follow from established matrix calculus results (Graham, 1981).

Theorem 2. The first two derivatives of ρij·k with respect to σ are

ρ̇ij·k def= ∂ρij·k/∂σ = Np[(γi·k ⊗ γj·k) − ((γi·k ⊗ γi·k) + (γj·k ⊗ γj·k)) ρij·k/2]

and

ρ̈ij·k def= ∂²ρij·k/(∂σ ⊗ ∂σ)
      = Np²[(γi·k ⊗ γi·k ⊗ γj·k ⊗ γj·k) ρij·k/2 + ((γi·k ⊗ γi·k ⊗ γi·k ⊗ γi·k) + (γj·k ⊗ γj·k ⊗ γj·k ⊗ γj·k)) 3ρij·k/4]
        − (Np ⊗ Np) Np²[(γi·k ⊗ γj·k ⊗ γj·k ⊗ γj·k) + (γj·k ⊗ γi·k ⊗ γi·k ⊗ γi·k)]
        − 2(Np ⊗ Np)(Ip ⊗ hk ⊗ Ip) ρ̇ij·k,

where γi·k = (Ip − HkΣ) ei^p/√σii·k, hk = vec(Hk), Hk is defined in (11), Np is defined in (5), and σii·k is defined in (11). If k is empty, then ρij·k becomes the simple correlation ρij and γi·k becomes γi = ei^p/√σii.

Multiple Correlations

Suppose that an investigator has interest in the multiple correlation between variable i and the variables indexed by k = (k1 k2 ⋯ kq)′. The squared multiple correlation, denoted as ρi(k)², can be written as

ρi(k)² = ei^{p′} ΣHkΣ ei^p / σii,  (12)

where ei^p is defined in (2) and Hk is defined in (11). It is required that i not be a component of k; otherwise ρi(k)² = 1 and no inference is necessary. The parameter ρi(k)² can be estimated by ri(k)², where ri(k)² is obtained by substituting S for Σ in (12).

Theorem 3 gives first and second derivatives of ρi(k)² with respect to σ. The derivatives are given without proof because they follow from established matrix calculus results (Graham, 1981).

Theorem 3. The first two derivatives of ρi(k)² with respect to σ are

ρ̇i(k)² def= ∂ρi(k)²/∂σ = (γi ⊗ γi)(1 − ρi(k)²) − γi(k) ⊗ γi(k)

and

ρ̈i(k)² def= ∂²ρi(k)²/(∂σ ⊗ ∂σ)
       = 2Np²(γi(k) ⊗ γi(k) ⊗ γi ⊗ γi) + 2(Np ⊗ Np)(γi(k) ⊗ hk ⊗ γi(k)) − 2(γi ⊗ γi ⊗ γi ⊗ γi)(1 − ρi(k)²),

where γi(k) = (Ip − HkΣ) ei^p/√σii, γi = ei^p/√σii, Hk is defined in (11), and hk is defined in Theorem 2.

Derivatives of Differences

In practice, a linear function of correlation coefficients rather than a correlation coefficient itself could be of interest. Let ψ be a linear function of simple, partial, and/or squared multiple correlation coefficients. For example, if the difference between a simple and a partial correlation is of interest, then ψ = ρij − ρij·k, where k is a non-empty vector of index values. By properties of derivatives, the derivative of a linear function is the linear function of the derivatives. Accordingly, Theorems 2 and 3 can be used to obtain the first two derivatives of any linear function, ψ, with respect to σ.

Denote the first two derivatives of ψ with respect to σ by ψ̇σ and ψ̈σ. That is,

ψ̇σ def= ∂ψ/∂σ  and  ψ̈σ def= ∂²ψ/(∂σ ⊗ ∂σ),  (13)

where ψ is a linear function of simple, partial, and/or squared multiple correlation coefficients. The specific structure of these derivatives for individual correlation coefficients, as well as for linear functions that correspond to extensions of the first four models in Olkin and Finn (1995), is listed below. The fifth model in Olkin and Finn is discussed separately later in this article.
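Definitions (11) and (12) translate directly into code. A hedged NumPy sketch (function names are ours; k holds zero-based column indices):

```python
import numpy as np

def h_matrix(Sigma, k):
    """H_k = E_k (E_k' Sigma E_k)^{-1} E_k' from (11); an empty k gives H_k = 0."""
    p = Sigma.shape[0]
    if len(k) == 0:
        return np.zeros((p, p))
    Ek = np.eye(p)[:, list(k)]
    return Ek @ np.linalg.inv(Ek.T @ Sigma @ Ek) @ Ek.T

def partial_correlation(Sigma, i, j, k):
    """rho_{ij.k}: off-diagonal element of (11); empty k gives the simple correlation."""
    H = h_matrix(Sigma, k)
    Sk = Sigma - Sigma @ H @ Sigma          # residual covariance after removing k
    return Sk[i, j] / np.sqrt(Sk[i, i] * Sk[j, j])

def squared_multiple_correlation(Sigma, i, k):
    """rho^2_{i(k)} = e_i' Sigma H_k Sigma e_i / sigma_ii, eq. (12)."""
    H = h_matrix(Sigma, k)
    return (Sigma @ H @ Sigma)[i, i] / Sigma[i, i]
```

Substituting the sample covariance matrix S for Σ yields rij·k and ri(k)², exactly as the text prescribes.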

1. Single Partial Correlation Coefficient. If ψ = ρij·k, then ψ̇σ = ρ̇ij·k and ψ̈σ = ρ̈ij·k, where the derivatives are given in Theorem 2. If k is empty, then ψ becomes ψ = ρij.

2. Difference Between Partial Correlation Coefficients. Denote the set that consists of the components of k by {k}. If ψ = ρij·k1 − ρtu·k0, then ψ̇σ = ρ̇ij·k1 − ρ̇tu·k0 and ψ̈σ = ρ̈ij·k1 − ρ̈tu·k0, where k1 and/or k0 could be empty, {i, j} = {t, u} ⟹ {k0} ≠ {k1}, and the derivatives are given in Theorem 2. This difference is an extension of Olkin and Finn's Models C and D. Examples include ψ = ρij·k1 − ρij·k0, ψ = ρij − ρij·k, ψ = ρij·k − ρtu·k, and ψ = ρij − ρtu.

3. Single Squared Multiple Correlation Coefficient. If ψ = ρi(k)², then ψ̇σ = ρ̇i(k)² and ψ̈σ = ρ̈i(k)², where the derivatives are given in Theorem 3.

4. Difference Between Squared Multiple Correlation Coefficients. If ψ = ρi(k1)² − ρj(k0)², then ψ̇σ = ρ̇i(k1)² − ρ̇j(k0)² and ψ̈σ = ρ̈i(k1)² − ρ̈j(k0)², where i = j ⟹ {k0} ≠ {k1}, and the derivatives are given in Theorem 3. This difference is an extension of Olkin and Finn's Models A and B.

Second-Order Accurate Confidence Intervals

Let ψ be a linear function of simple, partial, or squared multiple correlation coefficients and denote the estimator obtained by substituting S for Σ by ψ̂. Define σψ² as σψ² def= n Var(ψ̂). An application of the delta method reveals that

σψ² = ψ̇σ′ Ω22,n ψ̇σ + O(n⁻¹),  (14)

where ψ̇σ is defined in (13) and Ω22,n is defined in (5). The central limit theorem ensures that

P(ψ̂ ≤ u) = Φ(√n(u − ψ)/σψ) + O(n^{−1/2}),

where Φ(·) is the cdf of the standard normal distribution.

The proposed confidence intervals are based on the asymptotically pivotal quantity

T = √n(ψ̂ − ψ)/σ̂ψ,  (15)

where σ̂ψ² is a consistent estimator of σψ². The quantity ψ̇s′ Ω̂22,n ψ̇s, for example, is consistent for σψ², where Ω̂22,n is defined in (6) and ψ̇s is ψ̇σ after substituting S for Σ. It follows from the central limit theorem and Slutsky's Theorem (Casella & Berger, 2002, Theorem 5.5.17) that T ∼ N(0, 1) to first-order accuracy. Inverting the cdf of T yields first-order accurate one-sided lower and upper intervals:

(ψ̂ − σ̂ψ z1−α/√n, ∞)  and  (−∞, ψ̂ − σ̂ψ zα/√n),

respectively, where zα is the 100α percentile of the standard normal distribution. The standard normal critical value zα can be replaced by the t critical value, tα,n, without affecting the asymptotic properties of the intervals.

Edgeworth Expansion of the Distribution of T

To construct a second-order accurate approximation to the distribution of T, it is necessary to obtain the mean (bias), variance, and skewness of T. These quantities and estimators of these quantities are summarized in Theorem 4. A proof is sketched in the Appendix.

Theorem 4. If ψ is a function of a single covariance matrix, Σ, and ψ̂ is the same function of a single sample covariance matrix, S, then the first three cumulants of T in (15) can be written as follows:

E(T) = κ1/√n + O(n^{−3/2}),

Var(T) = 1 + O(n⁻¹), and

E[(T − κ1/√n)³] = κ3/√n + O(n^{−3/2}),

where

κ1 = m1/σψ − m11/(2σψ³),  κ3 = m3/σψ³ − 3m11/σψ³,

m1/√n = E[√n(ψ̂ − ψ)] + O(n^{−3/2}),

m3/√n = E{[√n(ψ̂ − ψ) − m1/√n]³} + O(n^{−3/2}),  and

m11 = n Cov(ψ̂ − ψ, σ̂ψ² − σψ²) + O(n⁻¹).

Specific solutions for m1, m3, and m11 are

m1 = (1/2) ψ̈σ′ vec Ω22,n,

m3 = √n (ψ̇σ ⊗ ψ̇σ ⊗ ψ̇σ)′ vec Ω222,n + 3(ψ̇σ′Ω22,n ⊗ ψ̇σ′Ω22,n) ψ̈σ,  and

m11 = (ψ̇σ ⊗ ψ̇σ ⊗ ψ̇σ)′ vec Ω42,n + 2(ψ̇σ′Ω22,n ⊗ ψ̇σ′Ω22,n) ψ̈σ,

where Ω42,n and Ω222,n are described in (7) and (8), and ψ̇σ and ψ̈σ are defined in (13). Furthermore, ψ̇σ′σ = 0 as required in Theorem 1, and κ1 and κ3 can be estimated consistently by

κ̂1 = m̂1/σ̂ψ − m̂11/(2σ̂ψ³)  and  κ̂3 = m̂3/σ̂ψ³ − 3m̂11/σ̂ψ³,  where

σ̂ψ² = ψ̇s′ Ω̂22,n ψ̇s + U/n,

m̂1 = (1/2) ψ̈s′ vec Ω̂22,n,

m̂3 = n⁻¹ Σ_{i=1}^N [ψ̇s′(ε̂i ⊗ ε̂i)]³ − 6ψ̇s′Υ̂21Ψ̇sΥ̂21′ψ̇s + 3(ψ̇s′Ω̂22,n ⊗ ψ̇s′Ω̂22,n) ψ̈s,  and

m̂11 = n⁻¹ Σ_{i=1}^N [ψ̇s′(ε̂i ⊗ ε̂i)]³ − 4ψ̇s′Υ̂21Ψ̇sΥ̂21′ψ̇s + 2(ψ̇s′Ω̂22,n ⊗ ψ̇s′Ω̂22,n) ψ̈s.

ψ̇s and ψ̈s are ψ̇σ and ψ̈σ after substituting S for Σ, U is any Op(1) random variable, and the remaining terms are defined in Theorem 1.
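Once m̂1, m̂3, m̂11, and σ̂ψ are available, the cumulant estimates in Theorem 4 are simple arithmetic. A sketch, assuming those inputs have been computed elsewhere and using the relations stated above:

```python
def kappa_estimates(m1, m3, m11, sigma_psi):
    """Return (kappa1, kappa3) from Theorem 4:
    kappa1 = m1/sigma - m11/(2 sigma^3),  kappa3 = m3/sigma^3 - 3 m11/sigma^3."""
    s3 = sigma_psi ** 3
    k1 = m1 / sigma_psi - m11 / (2.0 * s3)
    k3 = m3 / s3 - 3.0 * m11 / s3
    return k1, k3
```

The outputs κ̂1 and κ̂3 feed directly into the Cornish-Fisher percentile and the monotone transformations of Corollaries 5.1 through 5.3.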

Asymptotically, the choice of U in Theorem 4 is arbitrary because lim_{n→∞} U/n = 0 and the asymptotic properties of the intervals are not affected by U/n. Nonetheless, the finite-sample properties of the intervals are affected by the choice of U. From experience, it appears that ψ̇s′Ω̂22,nψ̇s often underestimates the variance of √n(ψ̂ − ψ). Accordingly, choosing U to be a positive quantity might be beneficial. In this article, the quantity U is chosen by using the second-order Taylor series expansion of ψ̂ under the condition that ψ̇σ = 0. Specifically, the second-order Taylor series expansion of ψ̂ is

√n(ψ̂ − ψ) = ψ̇σ′ √n(s − σ) + (1/(2√n)) ψ̈σ′ [√n(s − σ) ⊗ √n(s − σ)] + Op(n⁻¹).

If ψ̇σ = 0, then the variance of √n(ψ̂ − ψ) is

Var[√n(ψ̂ − ψ)] = (1/(2n)) ψ̈σ′ (Ω22,n ⊗ Ω22,n) ψ̈σ + O(n⁻²).

Accordingly, U is chosen as U = ψ̈s′(Ω̂22,n ⊗ Ω̂22,n)ψ̈s/2. For this choice, the quantity U/n represents an estimate of one of the two leading terms in the O(n⁻¹) component of (14).

The primary device for constructing the interval estimator is a second-order Edgeworth expansion of the distribution of T. This expansion is summarized in Theorem 5. A proof is sketched in the Appendix.

Theorem 5. The probability density function (pdf) and the cdf of T in (15) can be written as

fT(t) = φ(t)[1 + κ1 t/√n + κ3(t³ − 3t)/(6√n)] + O(n⁻¹)

and

FT(t) def= P(T ≤ t) = Φ(t) − φ(t)[κ1/√n + κ3(t² − 1)/(6√n)] + O(n⁻¹),

respectively, where κ1/√n and κ3/√n are the mean and skewness of T (see Theorem 4), and φ(t) and Φ(t) are the standard normal pdf and cdf.

Proposed Interval Estimators

The expansion of FT in Theorem 5 can be used to express the percentiles of T in terms of those of a standard normal random variable and to obtain a polynomial transformation of T that has distribution N(0, 1) up to O(n⁻¹). These results, known as Cornish-Fisher expansions (Pace and Salvan, 1997, §10.6), are summarized in Corollary 5.1.

Corollary 5.1. Denote the 100α percentile of T by tα. Then,

tα = zα + κ1/√n + κ3(zα² − 1)/(6√n) + O(n⁻¹)  and  t̂α = tα + Op(n⁻¹),

where zα is the 100α percentile of the N(0, 1) distribution,

t̂α = zα + κ̂1/√n + κ̂3(zα² − 1)/(6√n),

and κ̂1 and κ̂3 are consistent estimators of κ1 and κ3 in Theorem 4. The normal critical value, zα, can be replaced by the t critical value, tα,n, without affecting the second-order property of t̂α. Also, define T1 and T̂1 as

T1 def= T − κ1/√n − κ3(T² − 1)/(6√n)  and

T̂1 def= T − κ̂1/√n − κ̂3(T² − 1)/(6√n).

Then, P(T1 ≤ t) = Φ(t) + O(n⁻¹) and P(T̂1 ≤ t) = Φ(t) + O(n⁻¹).
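The Cornish-Fisher percentile t̂α of Corollary 5.1 and the resulting one-sided limits need only the standard library. A sketch (our function names; κ̂1 and κ̂3 are assumed to have been estimated already):

```python
from statistics import NormalDist

def cornish_fisher_percentile(alpha, k1_hat, k3_hat, n):
    """t_hat_alpha = z_a + k1/sqrt(n) + k3 (z_a^2 - 1) / (6 sqrt(n)), Corollary 5.1."""
    z = NormalDist().inv_cdf(alpha)
    rn = n ** 0.5
    return z + k1_hat / rn + k3_hat * (z * z - 1.0) / (6.0 * rn)

def one_sided_limits(psi_hat, sigma_hat, n, alpha, k1_hat, k3_hat):
    """Endpoints of the lower interval (psi - t_{1-a} s/sqrt(n), inf) and the
    upper interval (-inf, psi - t_a s/sqrt(n))."""
    rn = n ** 0.5
    t_hi = cornish_fisher_percentile(1.0 - alpha, k1_hat, k3_hat, n)
    t_lo = cornish_fisher_percentile(alpha, k1_hat, k3_hat, n)
    return psi_hat - t_hi * sigma_hat / rn, psi_hat - t_lo * sigma_hat / rn
```

With κ̂1 = κ̂3 = 0 the percentile collapses to zα and the limits reduce to the first-order intervals given after (15).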

A straightforward application of Corollary 5.1 yields second-order accurate intervals for ψ. Note that t̂α in Corollary 5.1 is a second-order accurate estimator of the 100α percentile of the distribution of T. Accordingly, (ψ̂ − t̂1−α σ̂ψ/√n, ∞) is a second-order accurate 100(1 − α)% lower confidence interval for ψ and (−∞, ψ̂ − t̂α σ̂ψ/√n) is a second-order accurate 100(1 − α)% upper confidence interval for ψ. One drawback of these intervals, however, is that t̂α is a quadratic function of zα and, therefore, is not monotonic in α. The same issue exists if T̂1 is used as an asymptotic pivotal quantity. It is not monotone in T and, therefore, its cdf cannot be inverted for arbitrary α. Methods for correcting this non-monotone deficiency are summarized in Corollaries 5.2 and 5.3. The method in Corollary 5.2 is based on Hall's (1992) cubic transformation, whereas the method in Corollary 5.3 is based on the exponential transformation described by Boik (2006).

Corollary 5.2. Define T̂2 as

T̂2 def= T − κ̂1/√n − κ̂3(T² − 1)/(6√n) + κ̂3²T³/(108n).

Then, T̂2 = T̂1 + Op(n⁻¹) and T̂2 is monotone in T. Inverting the cdf of T̂2 yields the following lower and upper 100(1 − α)% second-order accurate confidence intervals:

(ψ̂ − σ̂ψ t̂2,1−α/√n, ∞)  and  (−∞, ψ̂ − σ̂ψ t̂2,α/√n),

where

t̂2,α = κ̂1/√n + (6√n/κ̂3){1 − [1 + (κ̂3/(2√n))(κ̂3/(6√n) − zα)]^{1/3}}.

Corollary 5.3. Define T̂3 as

T̂3 def= T − κ̂1/√n − κ̂3(T² e^{−d̂T²} − 1)/(6√n),  where  d̂ = [κ̂3²(31 − 7√17)/(72n)] e^{−(5−√17)/2}.

Then, T̂3 = T̂1 + Op(n^{−3/2}) and T̂3 is monotone in T. Inverting the cdf of T̂3 yields the following lower and upper 100(1 − α)% second-order accurate confidence intervals:

(ψ̂ − σ̂ψ t̂3,1−α/√n, ∞)  and  (−∞, ψ̂ − σ̂ψ t̂3,α/√n),

where t̂3,α is the solution for T to T̂3 = zα or T̂3 = tα,n. The solution can be computed using the modified Newton method described in the Appendix.

The intervals described in Corollaries 5.2 and 5.3 require consistent estimators of κ1 and κ3, the mean and skewness of T. If ψ is a function of a single covariance matrix, then the estimators described in Theorem 4 can be used. In some situations, investigators are interested in comparing correlation functions based on two covariance matrices. For example, if a comparison between squared multiple correlation coefficients from two populations is of interest, then one could define ψ as ψ = ρi(k)²(Σ1) − ρi(k)²(Σ2), where ρi(k)²(Σu) is ρi(k)² computed on Σu, for u = 1, 2. Of course, the comparison need not be restricted to squared multiple correlations. One could, for example, define ψ as ψ = ρij·k(Σ1) − ρij·k(Σ2) or ψ = ρij·k(Σ1) − ρtu·k0(Σ2). These comparisons are extensions of Olkin and Finn's (1995) Model E. The mean and skewness
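Hall's cubic transformation and the closed-form percentile t̂2,α of Corollary 5.2 can be checked numerically: T̂2 is monotone in T, and t̂2,α approximately inverts it at zα. A sketch (our names; κ̂1 and κ̂3 assumed given):

```python
def t2_transform(T, k1, k3, n):
    """T2 = T - k1/sqrt(n) - k3 (T^2 - 1)/(6 sqrt(n)) + k3^2 T^3/(108 n), Corollary 5.2."""
    rn = n ** 0.5
    return T - k1 / rn - k3 * (T * T - 1.0) / (6.0 * rn) + k3 ** 2 * T ** 3 / (108.0 * n)

def t2_percentile(z_alpha, k1, k3, n):
    """Closed-form percentile of Corollary 5.2:
    t2_a = k1/sqrt(n) + (6 sqrt(n)/k3) * (1 - [1 + (k3/(2 sqrt(n))) (k3/(6 sqrt(n)) - z_a)]^{1/3})."""
    rn = n ** 0.5
    inner = 1.0 + (k3 / (2.0 * rn)) * (k3 / (6.0 * rn) - z_alpha)
    return k1 / rn + (6.0 * rn / k3) * (1.0 - inner ** (1.0 / 3.0))
```

The derivative of T̂2 with respect to T is (1 − κ̂3T/(6√n))², which is nonnegative everywhere; this is why the cubic repair restores monotonicity, and the inversion holds up to the stated O(n⁻¹) error.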

as well as estimators of the mean and skewness of T for comparisons such as these are given in Theorem 6. A proof is sketched in the Appendix.

Theorem 6. Let ψ_i be a correlation function computed on Σ_i for i = 1, 2 and let ψ̂_i be ψ_i computed on S_i, where S_i is a sample covariance matrix based on n_i degrees of freedom from sample i, and samples 1 and 2 are independent. Define T as

    T = √(n_1 n_2)(ψ̂ − ψ)/√δ̂, where

    ψ = ψ_1 − ψ_2,  ψ̂ = ψ̂_1 − ψ̂_2,  δ̂ = n_2 σ̂_1² + n_1 σ̂_2²,

and σ̂_i², ψ̇_{s_i}, and ψ̈_{s_i} are σ̂_ψ², ψ̇_s, and ψ̈_s based on sample i. Let m_{1;i}, m_{3;i}, and m_{11;i} be m_1, m_3, and m_11 of Theorem 4 for population i. Then, the first three cumulants of T are

    E(T) = κ_1*/√δ + O(n^{−3/2}),

    Var(T) = 1 + O(n⁻¹), and

    E[T − E(T)]³ = κ_3*/√δ + O(n^{−3/2}),

where δ = n_2 σ_1² + n_1 σ_2²,

    κ_1* = (n_2 m_{1;1} − n_1 m_{1;2})/√(n_1 n_2) − n_2^{3/2} m_{11;1}/(2δ√n_1) + n_1^{3/2} m_{11;2}/(2δ√n_2),

    κ_3* = n_2^{3/2} m_{3;1}/(δ√n_1) − n_1^{3/2} m_{3;2}/(δ√n_2) − 3 n_2^{5/2} σ_1² m_{11;1}/(δ²√n_1) + 3 n_1^{5/2} σ_2² m_{11;2}/(δ²√n_2),

and n = min(n_1, n_2). Furthermore, consistent estimators of κ_1* and κ_3* are obtained by substituting σ̂_i², m̂_{1;i}, m̂_{3;i}, and m̂_{11;i} for σ_i², m_{1;i}, m_{3;i}, and m_{11;i}, respectively. Estimators of m_{1;i}, m_{3;i}, and m_{11;i} are given by m̂_1, m̂_3, and m̂_11 in Theorem 4 computed on sample i.

Second-order accurate confidence intervals for ψ in Theorem 6 can be computed by substituting κ̂_1*/√δ̂ and κ̂_3*/√δ̂ for κ̂_1/√n and κ̂_3/√n in Corollaries 5.2 and 5.3.

Simulation Studies

The finite sample properties of the proposed intervals were evaluated by computing the intervals on samples from normal, normal mixture, and affine lognormal distributions. Let C be a p × p nonsingular matrix and denote a randomly selected p-vector from the N(0, I_p) distribution as z. Denote a randomly selected p-vector from a normal, normal mixture, or affine lognormal distribution with mean 0 and covariance Σ = CC′ as y. Then z can be transformed to y by

    y = Cz,

    y = Cz [U + k(1 − U)] / √(θ + k²(1 − θ)), and

    y = C (e^{zm} − 1_p e^{m²/2}) / √(e^{m²}(e^{m²} − 1))

for the normal, normal mixture, and affine lognormal distributions, respectively, where U is a Bernoulli(θ) random variable distributed independently of z, k and m are scalar constants, and e^{zm} is a p-vector whose ith component is e^{z_i m}.

The parameter values for the normal mixture were θ = 0.7 and k = 3. For this choice, marginal kurtosis is 3.488 and multivariate kurtosis (Bilodeau and Brenner, 1999, §13.2) is 1.1626. The normal mixture is a member of the class of elliptically contoured distributions. Accordingly, asymptotic standard errors based on normal theory are smaller than the correct standard errors by a factor of 1/√2.1626, and coverage of one-sided nominal 100(1 − α)% normal theory intervals converges to Φ(z_{1−α}/√2.1626), where z_{1−α} is the 100(1 − α) percentile of the standard normal distribution. For α = 0.05, coverage of normal theory intervals converges to 0.868 rather than 0.95.

The parameter value for the affine lognormal distribution was m = 0.8326. For this choice, the marginal skewness and kurtosis for each component of the lognormal random vector e^{zm} are 4.00 and 38. This distribution is not a member of the class of elliptically contoured distributions.

Generation of Covariance Matrices

To examine the performance of intervals for ψ = ρ_{ij} − ρ_{ij·k}, ten 4-dimensional covariance matrices were constructed such that ρ_{ij} ∈ L_1, ρ_{ij·k} ∈ L_1, and ρ_{ij} ≤ ρ_{ij·k}, where L_1 = {1/6, 1/3, 2/3, 5/6}. The covariance matrices were constructed as follows:

    Σ = CC′, where

    C = [ 1 0 0 0
          0 1 0 0
          1 1 1 1
          1 v w 1 ],

    w = [2ρ_{ij·k} √(1 − ρ²_{ij·k}) − 1] / (1 − 2ρ²_{ij·k}),

    v = [2√h − (2 + w)] / (1 − 4ρ²_{ij}),

    h = 2ρ²_{ij}(3 + 2w + w² − 4ρ²_{ij} − 2ρ²_{ij} w²),

    i = 3, j = 4, and k = (1, 2)′.

To examine the performance of intervals for ψ = ρ²_{i(k)} − ρ²_{j(k)}, ten 4-dimensional covariance matrices were constructed such that ρ²_{i(k)} ∈ L_2, ρ²_{j(k)} ∈ L_2, and ρ²_{j(k)} ≤ ρ²_{i(k)}, where L_2 = {0.2, 0.4, 0.6, 0.8}. The covariance matrices were constructed as follows:

    Σ = CC′, where

    C = [ 1 0 0 0
          0 1 0 0
          v v 1 0
          w w 1 1 ],

    v = √[ρ²_{i(k)} / (2(1 − ρ²_{i(k)}))],  w = √[ρ²_{j(k)} / (1 − ρ²_{j(k)})],

    i = 3, j = 4, and k = (1, 2)′.

For each covariance matrix, 5,000 realizations of the N × 4 matrix Y were sampled from each of the three distributions (normal, normal mixture, affine lognormal) and one-sided nominal 95% confidence intervals for ψ were computed. The simulation study was repeated for sample sizes N ∈ {25, 50, 100, 200}.

Results

Each of Figures 1–3 displays four panels of 16 plots each. The upper two panels display the coverage of normal theory and ADF confidence intervals for ψ = ρ_{ij} − ρ_{ij·k}. Coverage of one-sided intervals for ψ = ρ²_{i(k)} − ρ²_{j(k)} is displayed in the lower two panels. The 10 plots in the upper triangle of each panel display the coverage of upper one-sided intervals and the 10 plots in the lower triangle display the coverage of lower one-sided intervals. The four plots on the diagonal in each panel display coverage for both lower and upper intervals.

In the upper two panels, row and column labels correspond to ρ_{ij} and ρ_{ij·k}, respectively, for upper intervals and to ρ_{ij·k} and ρ_{ij}, respectively, for lower intervals. In the lower two panels, row and column labels correspond to ρ²_{j(k)} and ρ²_{i(k)}, respectively, for upper intervals and to ρ²_{i(k)} and

ρ²_{j(k)}, respectively, for lower intervals. In all panels, plot symbols distinguish the four interval types: first-order lower, first-order upper, second-order lower, and second-order upper. Second-order intervals were based on the exponential method described in Corollary 5.3. Within each plot, the coverage of intervals based on sample sizes of 25, 50, 100, and 200 is plotted from left to right. Figures 2 and 3 display coverage of intervals when sampling from normal mixtures and from affine lognormal distributions.
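The simulation design above is straightforward to reproduce. The sketch below (all function names are ours, not the authors' code) draws samples under the three distributions and builds the two families of covariance matrices, each of which can be checked against its target correlations.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_y(C, dist, n, theta=0.7, k=3.0, m=0.8326):
    """Draw n rows y (a transform of z, postmultiplied by C') with covariance C C'."""
    p = C.shape[0]
    z = rng.standard_normal((n, p))
    if dist == "normal":
        x = z
    elif dist == "mixture":                      # scale mixture of normals
        u = rng.binomial(1, theta, size=(n, 1))  # U ~ Bernoulli(theta)
        x = z * (u + k * (1 - u)) / np.sqrt(theta + k**2 * (1 - theta))
    elif dist == "lognormal":                    # affine lognormal, standardized
        x = (np.exp(z * m) - np.exp(m**2 / 2)) / np.sqrt(
            np.exp(m**2) * (np.exp(m**2) - 1))
    else:
        raise ValueError(dist)
    return x @ C.T

def sigma_simple_partial(r_ij, r_ijk):
    """Sigma with corr(x3, x4) = r_ij and partial corr(x3, x4 | x1, x2) = r_ijk."""
    w = (2 * r_ijk * np.sqrt(1 - r_ijk**2) - 1) / (1 - 2 * r_ijk**2)
    h = 2 * r_ij**2 * (3 + 2 * w + w**2 - 4 * r_ij**2 - 2 * r_ij**2 * w**2)
    v = (2 * np.sqrt(h) - (2 + w)) / (1 - 4 * r_ij**2)
    C = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 1], [1, v, w, 1]], float)
    return C @ C.T

def sigma_multiple(r2_i, r2_j):
    """Sigma with squared multiple correlations r2_i (x3) and r2_j (x4) on (x1, x2)."""
    v = np.sqrt(r2_i / (2 * (1 - r2_i)))
    w = np.sqrt(r2_j / (1 - r2_j))
    C = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [v, v, 1, 0], [w, w, 1, 1]], float)
    return C @ C.T

def corr(S, i, j):
    return S[i, j] / np.sqrt(S[i, i] * S[j, j])

def pcorr(S, i, j, k):
    """Partial correlation from the conditional covariance of (i, j) given k."""
    Sc = S[np.ix_([i, j], [i, j])] - S[np.ix_([i, j], k)] @ np.linalg.solve(
        S[np.ix_(k, k)], S[np.ix_(k, [i, j])])
    return Sc[0, 1] / np.sqrt(Sc[0, 0] * Sc[1, 1])

def r2mult(S, i, k):
    """Squared multiple correlation of variable i with the set k."""
    b = np.linalg.solve(S[np.ix_(k, k)], S[np.ix_(k, [i])])
    return (S[np.ix_([i], k)] @ b).item() / S[i, i]
```

Both constructions can be verified symbolically or numerically; for example, `sigma_simple_partial(1/6, 1/3)` reproduces ρ_{34} = 1/6 and ρ_{34·12} = 1/3 exactly.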

Figure 1: Coverage when Sampling from Normal Distributions

[Figure 1 consists of four panels of 16 coverage plots each: Normal Theory and ADF panels for ψ = ρ_{ij} − ρ_{ij·k} (row and column labels 1/6, 1/3, 2/3, 5/6) and Normal Theory and ADF panels for ψ = ρ²_{i(k)} − ρ²_{j(k)} (row and column labels 0.2, 0.4, 0.6, 0.8); coverage axes range from about 0.85 to 1.]

Figure 2: Coverage when Sampling from Normal Mixture Distributions

[Figure 2 consists of the same four panels as Figure 1 (Normal Theory and ADF, for ψ = ρ_{ij} − ρ_{ij·k} and for ψ = ρ²_{i(k)} − ρ²_{j(k)}); coverage axes range from about 0.8 to 1.]

It is apparent in Figure 1 that the coverage of second-order intervals is superior to that of first-order intervals when sampling from multivariate normal distributions. Furthermore, the second-order ADF intervals perform nearly as well as do the second-order intervals that are explicitly based on normal theory.

Figure 2 confirms that normal theory based intervals perform poorly when sampling from normal mixtures. As expected from theory, coverage approaches 0.86 as N increases in each plot. The ADF intervals fare much better. Also, the coverage of second-order accurate ADF intervals is equal to or superior to that of first-order

Figure 3: Coverage when Sampling from Affine Lognormal Distributions

[Figure 3 consists of the same four panels as Figures 1 and 2, for sampling from affine lognormal distributions; coverage axes range from about 0.7 to 1.]

accurate ADF intervals whenever the coverage of first-order accurate intervals departs substantially from 0.95. See, for example, plot (4, 1) in the upper right-hand panel of Figure 2. First-order accurate ADF intervals perform well if the bias of ψ̂ is small and the distribution of ψ̂ is nearly symmetric. In these cases, second-order accurate intervals cannot improve on first-order accurate intervals and can even perform slightly worse if sample size is small. The performance degradation is due to the difficulty of estimating higher-order moments from small samples. Variability in these estimates induces variability in the interval endpoints. Nonetheless, Figure 2 reveals that the second-order ADF intervals never performed much worse and sometimes performed much better than first-order accurate intervals.

Figure 3 displays coverage of one-sided intervals when sampling from affine lognormal distributions. The performance of normal theory intervals is quite poor. The performance of first-order ADF intervals varies depending on the covariance matrix, and coverage is as low as 0.78; see plot (3, 4) in the lower right-hand panel. Figure 3 also reveals that second-order ADF intervals never performed much worse and sometimes performed much better than first-order accurate intervals.

Illustration

Heller, Judge, and Watson (2002) conducted a survey of university employees to examine relationships among personality traits, job satisfaction, and life satisfaction. They obtained six broad personality trait measures, three job satisfaction measures, and three life satisfaction measures on each of N = 134 subjects. A subset of their data from a random sample of 15 subjects is listed in Table 1.

Table 1: Sample of Size N = 15 from Heller

Variable:   1     2     3     4     5     6     7     8
Case        N     E     C     CSE   PA    NA    LSO   JSO
 1         2.75  2.50  3.92  2.44  3.80  2.30  3.00  3.25
 2         1.83  4.42  4.25  2.55  4.10  1.30  4.20  4.63
 3         2.00  3.25  4.83  2.92  4.00  1.40  2.60  3.88
 4         2.17  2.50  4.17  2.53  3.80  1.50  3.60  3.88
 5         2.92  3.50  4.08  1.96  3.10  2.20  4.00  3.25
 6         1.58  3.67  3.67  2.89  4.00  1.30  4.00  4.00
 7         2.00  3.67  4.00  2.68  4.20  1.50  4.00  4.38
 8         1.58  4.17  4.75  2.90  4.00  1.30  2.20  2.38
 9         2.58  3.17  4.08  2.34  2.80  1.30  3.40  4.38
10         2.92  3.00  3.92  2.21  3.40  1.50  3.60  3.13
11         2.33  2.08  4.67  2.17  2.40  1.70  3.00  3.88
12         2.25  3.67  4.92  2.74  3.90  1.60  3.80  3.00
13         2.08  3.33  3.50  2.39  3.30  1.60  4.00  3.88
14         3.92  2.42  3.00  1.60  2.50  3.10  3.60  4.13
15         1.67  4.17  4.58  2.93  4.30  1.60  4.00  3.63

Notes: N = Neuroticism, E = Extroversion, C = Conscientiousness, CSE = Core Self-Evaluations, PA = Positive Affectivity, NA = Negative Affectivity, LSO = Life Satisfaction—Significant Other Report, JSO = Job Satisfaction—Significant Other Report.

Appreciation is expressed to Danny Heller, who provided the raw data for this example.

They expected that job and life satisfaction would be positively related, and that this relationship, in part, would be due to the influence of stable personality traits. Specifically, they hypothesized that (a) zero-order correlation coefficients between job and life satisfaction would be positive, and that (b) partial correlation coefficients between job and life satisfaction, controlling for personality traits, would be smaller than the zero-order coefficients.
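Hypothesis (b) involves the partial correlation r_{7,8·k}. As a sanity check for computer routines, the partial correlation can be computed in two equivalent ways: from the inverse of the correlation matrix, or as the correlation of least-squares residuals. The sketch below uses synthetic data of the same shape as Table 1 (15 cases, 8 variables); the data and function names are ours, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((15, 8)) @ rng.standard_normal((8, 8))  # synthetic stand-in

def partial_corr_inverse(X, i, j, k):
    """r_{ij.k} from the inverse of the correlation matrix of (i, j, k)."""
    idx = [i, j] + list(k)
    P = np.linalg.inv(np.corrcoef(X[:, idx], rowvar=False))
    return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

def partial_corr_residual(X, i, j, k):
    """r_{ij.k} as the correlation of residuals after regressing out columns k."""
    Z = np.column_stack([np.ones(len(X)), X[:, list(k)]])
    H = Z @ np.linalg.pinv(Z)          # projection onto the span of Z
    ri = X[:, i] - H @ X[:, i]
    rj = X[:, j] - H @ X[:, j]
    return np.corrcoef(ri, rj)[0, 1]
```

Applied to the actual Table 1 data with i = 7, j = 8, and k = (1, ..., 6), both routines should agree with each other and reproduce the r_{7,8·k} entry reported in Table 2.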

Table 2: Confidence Intervals Computed on Heller's Data

                                    Normal Theory           ADF
ψ                     Quantity     N = 15   N = 134     N = 15   N = 134
ρ_{7,8} − ρ_{7,8·k}
                      r_{7,8}       0.476    0.477       0.476    0.477
                      r_{7,8·k}     0.410    0.398       0.410    0.398
                      σ̂_ψ          0.889    0.473       0.966    0.562
                      ω̂_1         −0.086   −0.821      −0.281   −1.597
                      ω̂_3         −0.183   −2.171      −0.082   −3.597
                      t̂_{3,0.05}  −1.803   −1.797      −1.845   −1.937
                      t̂_{3,0.95}   1.722    1.543       1.679    1.459
                      L(1)         −0.353    0.110      −0.389   −0.001
                      U(1)          0.484    0.147       0.520    0.160
                      L(2)         −0.344    0.016      −0.368    0.008
                      U(2)          0.494    0.153       0.542    0.174
                      t_{0.05,n}   −1.761   −1.656      −1.761   −1.656
                      t_{0.95,n}    1.761    1.656       1.761    1.656
ρ²_{7(k)} − ρ²_{8(k)}
                      r²_{7(k)}     0.468    0.249       0.468    0.249
                      r²_{8(k)}     0.263    0.109       0.263    0.109
                      σ̂_ψ          1.191    0.818       1.343    0.822
                      ω̂_1         −0.269   −0.110      −0.347   −0.212
                      ω̂_3         −0.095   −0.010      −0.147    0.124
                      t̂_{3,0.05}  −1.984   −1.771      −2.203   −1.924
                      t̂_{3,0.95}   1.512    1.543       1.392    1.423
                      L(1)         −0.355    0.023      −0.426    0.023
                      U(1)          0.766    0.258       0.838    0.259
                      L(2)         −0.276    0.031      −0.294    0.039
                      U(2)          0.837    0.266       0.996    0.278

L(1) and U(1) are lower and upper limits of first-order accurate intervals. L(2) and U(2) are lower and upper limits of second-order accurate intervals.

One way to test hypothesis (b) is to compute a confidence interval for the difference between the simple and partial correlation coefficients. The upper portion of Table 2 displays first- and second-order accurate intervals for ρ_{7,8} − ρ_{7,8·k}, where k consists of indices 1, . . . , 6. This interval was not examined by Heller, Judge, and Watson, but is comparable to intervals that were examined. Intervals based on the subsample of N = 15 cases and intervals based on the full sample of N = 134 cases are reported in Table 2. Results of intermediate calculations also are given to aid investigators who wish to verify the accuracy of computer routines. Endpoints of second-order accurate ADF intervals are somewhat shifted to the right compared to first-order accurate ADF intervals, both for N = 15 and for N = 134. Notice that the first-order accurate interval based on N = 134 contains zero, whereas the second-order accurate interval does not contain zero.

The lower portion of Table 2 displays intermediate computations and intervals for ρ²_{7(k)} − ρ²_{8(k)}, both for the subsample and for the full sample. Again, the second-order intervals are shifted to the right compared to the first-order accurate intervals.

Concluding Remarks

Some caution must be exercised when interpreting the proposed intervals computed on nonnormal data. If the population is multivariate normal, then all relationships among variables are linear and the traditional correlation coefficients adequately summarize the existing relationships. If the population is nonnormal, however, then relationships among variables need not be linear. Traditional correlation coefficients still summarize the linear relationships, but they do not reflect nonlinear components of the existing relationships.

In some cases, confidence interval procedures can be improved by applying a normalizing transformation to the correlation estimator. Let W_n be a statistic computed on a random sample of size N, where n = N − r and r is a fixed positive integer. It is assumed that the moments of W_n are finite at least to order three. Denote the mean and variance of W_n by μ_W and σ²_W, respectively. A normalizing transformation of W_n, if it exists, is a function of W_n, say g(W_n), chosen such that the skewness of √n[g(W_n) − g(μ_W)] has magnitude O(n^{−3/2}). Konishi (1981) showed that if data are sampled from a multivariate normal distribution, then

    Z_{ρ_{ij}}(r_{ij}) = (1/2) ln[(1 + r_{ij})/(1 − r_{ij})] and

    Z_{ρ²_{i(k)}}(r²_{i(k)}) = (1/2) ln[(1 + √(r²_{i(k)}))/(1 − √(r²_{i(k)}))]

are normalizing transformations for r_{ij} and r²_{i(k)}, respectively. Also,

    Z_{ρ_{ij·k}}(r_{ij·k}) = (1/2) ln[(1 + r_{ij·k})/(1 − r_{ij·k})]

is a normalizing transformation for r_{ij·k}. If data are sampled from multivariate normal distributions, then conventional confidence intervals for ρ_{ij·k} or ρ²_{i(k)} that are based on the above normalizing transformations are only first-order accurate, however, because the normalizing transformation corrects for skewness but not for bias.

Derivatives of functions of ρ_{ij·k} or ρ²_{i(k)} are readily obtained by one or more applications of the chain rule. For simple and partial correlations,

    ∂Z_{ρ_{ij·k}}(ρ_{ij·k})/∂σ = ρ̇_{ij·k}/(1 − ρ²_{ij·k}) and

    ∂²Z_{ρ_{ij·k}}(ρ_{ij·k})/(∂σ ⊗ ∂σ) = ρ̈_{ij·k}/(1 − ρ²_{ij·k}) + 2ρ_{ij·k}(ρ̇_{ij·k} ⊗ ρ̇_{ij·k})/(1 − ρ²_{ij·k})².

Also, for squared multiple correlation coefficients,

    ∂Z_{ρ²_{i(k)}}(ρ²_{i(k)})/∂σ = ρ̇²_{i(k)}/[2√(ρ²_{i(k)}) (1 − ρ²_{i(k)})] and

    ∂²Z_{ρ²_{i(k)}}(ρ²_{i(k)})/(∂σ ⊗ ∂σ) = ρ̈²_{i(k)}/[2√(ρ²_{i(k)}) (1 − ρ²_{i(k)})]
        + (ρ̇²_{i(k)} ⊗ ρ̇²_{i(k)})(3ρ²_{i(k)} − 1)/[4(ρ²_{i(k)})^{3/2} (1 − ρ²_{i(k)})²].

Employing the above transformations when data have not been sampled from multivariate normal distributions does not necessarily reduce skewness. Nonetheless, their use can be advantageous because the endpoints of the confidence intervals obtained by back-transforming the intervals for Z_{ρ_{ij·k}} are guaranteed to be in (−1, 1) and endpoints obtained by back-transforming the intervals for Z_{ρ²_{i(k)}} are guaranteed to be in (0, 1). Intervals computed directly on the correlation coefficients need not satisfy this property. In small samples, it might be better to transform r²_{i(k)} as

    Z*(r²_{i(k)}) = (1/2) ln[√(r²_{i(k)})/(1 − √(r²_{i(k)}))]

because this transformation maps (0, 1) to (−∞, ∞), whereas the normalizing transformation for r²_{i(k)} maps (0, 1) to (0, ∞).

Fisher's Z transformation also can be employed when ψ is a difference between correlation coefficients. For example, suppose that ψ = (ρ_{ij} − ρ_{ij·k})/2. The divisor 2 is used so that ψ ∈ (−1, 1). Then, ψ̂ can be transformed by

    Z_ψ(ψ̂) = (1/2) ln[(1 + ψ̂)/(1 − ψ̂)].

Even if skewness is not reduced by this transformation, the endpoints of the back-transformed interval will be in (−1, 1). The derivatives of this function are

    ∂Z_ψ(ψ)/∂σ = ψ̇_σ/(1 − ψ²) and

    ∂²Z_ψ(ψ)/(∂σ ⊗ ∂σ) = ψ̈_σ/(1 − ψ²) + 2ψ(ψ̇_σ ⊗ ψ̇_σ)/(1 − ψ²)².

The simulation study also examined properties of first- and second-order intervals that are based on Fisher's Z transformation. The results are very similar to those displayed in Figures 1–3. The performance of the second-order intervals was comparable to that of the first-order intervals whenever the first-order intervals did not perform too badly. The second-order intervals improved on the first-order intervals in cases where coverage of the first-order intervals deviated substantially from 1 − α.

An investigator might prefer to bootstrap correlation functions and thereby dispense with the requirement of explicitly estimating the bias, variance, and skewness of the relevant sampling distributions. One-sided percentile bootstrap intervals for functions of simple, partial, and multiple correlations, however, are only first-order accurate. Often these intervals have poor coverage and can be inferior to normal theory intervals even when multivariate normality is violated (Rasmussen, 1987, 1988; Strube, 1988). Also see Efron (1988) for a discussion of this issue. Second-order accurate confidence intervals can be obtained by bootstrapping the asymptotic pivotal quantity T in (15); i.e., percentile-t intervals. Hall, Martin, and Schucany (1989) examined percentile, percentile-t, transformed percentile-t, and coverage-corrected iterated bootstrap intervals for zero-order correlation coefficients. Fisher's transformation was employed for the transformed percentile-t intervals. They found that in very small samples (N ≤ 20), the iterated intervals as well as the percentile-t intervals worked well. Percentile-t intervals that employed the jackknife to estimate the standard error of the (transformed) sample correlation coefficient were superior to those that employed the delta method. One could employ jackknife estimates of bias, variance, and skewness in the proposed second-order accurate intervals. This issue remains to be explored.

Abramovitch and Singh (1985) and Zhou and Gao (2000) showed that bootstrapping corrected asymptotic pivotal quantities such as T̂_2 in Corollary 5.2 or T̂_3 in Corollary 5.3 to obtain quantiles of their respective distributions yields third-order accurate confidence intervals. This bootstrap adjustment procedure was examined for the conditions corresponding to plot (3, 4) in the lower right-hand panel of Figure 3. A sample size of N = 25 was employed. The results are summarized in Table 3. The coverage values in the first two lines of Table 3 are those plotted in Figure 3. The remaining lines are based on an additional 2,000 samples. For each of these samples, 1,000 bootstrap samples were drawn and the bootstrap sampling distributions of T and T̂_3 were approximated. Quantiles of these distributions were substituted for t_{α,n} in the computation of confidence intervals. Table 3 demonstrates that bootstrapping T yields second-order accurate confidence intervals with coverage similar to those based on T̂_3. Bootstrapping T̂_3 slightly improved the coverage of the upper interval, but over-adjusted the endpoint of the lower interval.
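The back-transformation argument above can be made concrete. The sketch below (the function name is ours) forms the interval on the Z scale and maps the endpoints back with tanh, so both endpoints necessarily lie in (−1, 1); here se denotes the estimated standard error of the transformed estimate, and t_lo and t_hi are the chosen, possibly skewness-corrected, percentiles.

```python
import math

def fisher_interval(psi_hat, se, t_lo, t_hi, n):
    """Back-transformed confidence interval for a correlation-type parameter.

    The interval is built on the scale Z(x) = 0.5 * log((1 + x) / (1 - x)),
    i.e., atanh, and mapped back with tanh, so endpoints are in (-1, 1).
    """
    z = math.atanh(psi_hat)
    lower = math.tanh(z - t_hi * se / math.sqrt(n))
    upper = math.tanh(z - t_lo * se / math.sqrt(n))
    return lower, upper
```

Even a very wide interval on the Z scale back-transforms to endpoints inside (−1, 1), which an interval computed directly on the correlation scale need not satisfy.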

Table 3: Coverage of Bootstrap-Adjusted Confidence Intervals for ψ = ρ²_{i(k)} − ρ²_{j(k)} when Sampling from a Lognormal Distribution

Pivotal    Bootstrap    Coverage,          Coverage,          Number of
Quantity   Adjusted     Lower Intervals    Upper Intervals    Samples
T          No           0.972              0.781              5000
T̂_3       No           0.966              0.869              5000
T          No           0.968              0.786              2000
T̂_3       No           0.959              0.877              2000
T          Yes          0.930              0.869              2000
T̂_3       Yes          0.934              0.886              2000
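The percentile-t idea behind Table 3 can be sketched for the simplest case, a zero-order correlation, with the delta-method standard error (1 − r²)/√n standing in for the ADF standard error. The function name and design below are ours, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def percentile_t_interval(x, y, B=1000, alpha=0.05):
    """Two-sided percentile-t interval for corr(x, y) via the bootstrapped pivot."""
    n = len(x)
    r_hat = np.corrcoef(x, y)[0, 1]
    se_hat = (1.0 - r_hat**2) / np.sqrt(n)
    t_star = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)              # resample cases with replacement
        rb = np.corrcoef(x[idx], y[idx])[0, 1]
        se_b = (1.0 - rb**2) / np.sqrt(n)
        t_star[b] = (rb - r_hat) / se_b          # studentized pivot
    q_lo, q_hi = np.quantile(t_star, [alpha, 1.0 - alpha])
    # Invert the pivot at the bootstrap quantiles to get the endpoints.
    return r_hat - q_hi * se_hat, r_hat - q_lo * se_hat
```

A Fisher-Z version would studentize atanh(r) instead, and the same resampling loop applied to a corrected pivot such as T̂_3 gives the bootstrap-adjusted intervals summarized in Table 3.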

The intervals constructed in this article have reasonable coverage properties under most of the conditions that were examined. In some conditions, however, the coverage of the second-order accurate intervals is less accurate than desired, even though it represents a substantial improvement over the coverage of first-order accurate intervals. A bootstrap adjustment to the second-order accurate intervals might improve coverage slightly, but this issue needs to be examined more thoroughly.

References

Abramovitch, L., & Singh, K. (1985). Edgeworth corrected pivotal statistics and the bootstrap. The Annals of Statistics, 13, 116–132.

Alf, E.F., & Graf, R.G. (1999). Asymptotic confidence limits for the difference between two squared multiple correlations: A simplified approach. Psychological Methods, 4, 70–75.

Bilodeau, M., & Brenner, D. (1999). Theory of multivariate statistics. New York: Springer.

Bishop, Y.M.M., Fienberg, S.F., & Holland, P.W. (1975). Discrete multivariate analysis: Theory and practice. Cambridge: MIT Press.

Boik, R.J. (1998). A local parameterization of orthogonal and semi-orthogonal matrices with applications. Journal of Multivariate Analysis, 67, 244–276.

Boik, R.J. (2005). Second-order accurate inference on eigenvalues of covariance and correlation matrices. Journal of Multivariate Analysis, 96, 136–171.

Boik, R.J. (2006). Accurate confidence intervals in regression analyses of non-normal data. Annals of the Institute of Statistical Mathematics, accepted.

Browne, M.W., & Shapiro, A. (1986). The asymptotic covariance matrix of sample correlation coefficients under general conditions. Linear Algebra & its Applications, 82, 169–176.

Casella, G., & Berger, R.L. (2002). Statistical inference (Second ed.). Pacific Grove, CA: Duxbury.

Cook, M.B. (1951). Two applications of bivariate k-statistics. Biometrika, 38, 368–376.

Efron, B. (1988). Bootstrap confidence intervals: Good or bad? Psychological Bulletin, 104, 293–296.

Fisher, R.A. (1921). On the probable error of a coefficient of correlation deduced from a small sample. Metron, 1, 3–32.

Fisher, R.A. (1962). The simultaneous distribution of correlation coefficients. Sankhyā, A, 24, 1–8.

Graf, R.G., & Alf, E.A., Jr. (1999). Correlations redux: Asymptotic confidence limits for partial and squared multiple correlations. Applied Psychological Measurement, 23, 116–119.

Graham, A. (1981). Kronecker products and matrix differentiation with applications. Chichester: Ellis Horwood Limited.

Hall, P. (1992). On the removal of skewness by transformation. Journal of the Royal Statistical Society, Series B, 54, 221–228.

Hall, P., Martin, M.A., & Schucany, W.R. (1989). Better nonparametric bootstrap confidence intervals for the correlation coefficient. Journal of Statistical Computation and Simulation, 33, 161–172.

Hedges, L.V., & Olkin, I. (1983). Joint distribution of some indices based on correlation coefficients. In S. Karlin, T. Amemiya, & L.A. Goodman (Eds.), Studies in econometrics, time series, and multivariate analysis (pp. 437–454). New York: Academic Press.

Heller, D., Judge, T.A., & Watson, D. (2002). The confounding role of personality and trait affectivity in the relationship between job and life satisfaction. Journal of Organizational Behavior, 23, 815–835.

Hsu, P.L. (1949). The limiting distribution of functions of sample means and applications to testing hypotheses. In J. Neyman (Ed.), Proceedings of the Berkeley Symposium on Mathematical Statistics & Probability (pp. 359–402). Berkeley: University of California Press.

Konishi, S. (1981). Normalizing transformations of some statistics in multivariate analysis. Biometrika, 68, 647–651.

MacRae, E.C. (1974). Matrix derivatives with an application to an adaptive linear decision problem. Annals of Statistics, 2, 337–346.

Magnus, J.R., & Neudecker, H. (1979). The commutation matrix: Some properties and applications. Annals of Statistics, 7, 381–394.

Magnus, J.R., & Neudecker, H. (1999). Matrix differential calculus with applications in statistics and econometrics (Rev. ed.). Chichester: John Wiley & Sons.

Nakagawa, S., & Niki, N. (1992). Distribution of the sample correlation coefficient for nonnormal populations. Journal of the Japanese Society of Computational Statistics, 5, 1–19.

Neudecker, H. (1996). The asymptotic variance matrices of the sample correlation matrix in elliptical and normal situations and their proportionality. Linear Algebra & its Applications, 237/238, 127–132.

Neudecker, H., & Wesselman, A.M. (1990). The asymptotic variance matrix of the sample correlation matrix. Linear Algebra & its Applications, 127, 589–599.

Niki, N., & Konishi, S. (1986). Higher-order asymptotic expansions for the distribution of the sample correlation coefficient. Communications in Statistics—Simulation and Computation, 13, 169–182.

Olkin, I., & Finn, J.D. (1995). Correlations redux. Psychological Bulletin, 118, 155–164.

Olkin, I., & Siotani, M. (1976). Asymptotic distributions of functions of a correlation matrix. In S. Ikeda, T. Hayakawa, H. Hudimoto, M. Okamoto, S. Siotani, & S. Yamamoto (Eds.), Essays in probability and statistics (pp. 235–251). Tokyo: Shinko Tsusho.

Pace, L., & Salvan, A. (1997). Principles of statistical inference. Singapore: World Scientific.

Pearson, K., & Filon, L.N.G. (1898). Mathematical contributions to the theory of evolution. IV. On the probable errors of frequency constants and on the influence of random selection on variation and correlation. Philosophical Transactions of the Royal Society of London, A 191, 229–311.

Rao, C.R. (1973). Linear statistical inference and its applications (Second ed.). New York: John Wiley & Sons.

Rasmussen, J.L. (1987). Estimating correlation coefficients: Bootstrap and parametric approaches. Psychological Bulletin, 101, 136–139.

Rasmussen, J.L. (1988). "Bootstrap confidence intervals: Good or bad": Comments on Efron (1988) and Strube (1988) and further evaluation. Psychological Bulletin, 104, 297–299.

Steiger, J.H., & Hakstian, A.R. (1982). The asymptotic distribution of the elements of a correlation matrix: Theory and application. British Journal of Mathematical & Statistical Psychology, 35, 208–215.

Steiger, J.H., & Hakstian, A.R. (1983). A historical note on the asymptotic distribution of correlations. British Journal of Mathematical & Statistical Psychology, 36, 157.

Strube, M.J. (1988). Bootstrap type I error rates for the correlation coefficient: An examination of alternate procedures. Psychological Bulletin, 104, 290–292.

Zhou, X.-H., & Gao, S. (2000). One-sided confidence intervals for means of positively skewed distributions. The American Statistician, 54, 100–104.

Appendix

Proof of Theorem 4

Taylor series expansions for √n(ψ̂ − ψ), [√n(ψ̂ − ψ)]², and [√n(ψ̂ − ψ)]³ around s = σ can be written as follows:

    √n(ψ̂ − ψ) = ψ̇_σ′ √n(s − σ) + (2√n)⁻¹ ψ̈_σ′ [√n(s − σ) ⊗ √n(s − σ)] + O_p(n⁻¹),

    [√n(ψ̂ − ψ)]² = [ψ̇_σ′ √n(s − σ)]² + O_p(n^{−1/2}), and

    [√n(ψ̂ − ψ)]³ = [ψ̇_σ′ √n(s − σ)]³ + (3/(2√n)) [ψ̇_σ′ √n(s − σ)]² ψ̈_σ′ [√n(s − σ) ⊗ √n(s − σ)] + O_p(n⁻¹).

Furthermore, expanding √n(σ̂_ψ² − σ_ψ²) around s = σ reveals that

    √n(σ̂_ψ² − σ_ψ²) = (ψ̇_σ ⊗ ψ̇_σ)′ √n(ω̂ − ω) + 2 [√n(s − σ)′ ⊗ ψ̇_σ′ Ω_{22,n}] ψ̈_σ + O_p(n^{−1/2}),

where ω = vec Ω_{22,n}. The claimed result is obtained by taking expectations, using ψ̇_σ′σ = 0, and collecting terms of like order.

To verify that ψ̇_σ′σ = 0, note that ρ_{ij·k} and ρ²_{i(k)} are scale-invariant functions of Σ. That is, ρ_{ij·k}(Σ) = ρ_{ij·k}(αΣ) and ρ²_{i(k)}(Σ) = ρ²_{i(k)}(αΣ), where α is any positive scalar constant. It follows that the derivatives of ρ_{ij·k}(αΣ) and ρ²_{i(k)}(αΣ) with respect to α each are zero. Using the chain rule,

    0 = ∂ρ_{ij·k}(αΣ)/∂α = (∂ασ/∂α)′ (∂ρ_{ij·k}/∂σ_α) = σ′ ρ̇_{ij·k} (1/α) and

    0 = ∂ρ²_{i(k)}(αΣ)/∂α = (∂ασ/∂α)′ (∂ρ²_{i(k)}/∂σ_α) = σ′ ρ̇²_{i(k)} (1/α).

The result follows because ψ̇_σ is a linear function of ρ̇_{ij·k} and/or ρ̇²_{i(k)} terms.

Proof of Theorem 5

Expand T as

    T = [√n(ψ̂ − ψ)/σ_ψ] × [1 − √n(σ̂_ψ² − σ_ψ²)/(2√n σ_ψ²) + O_p(n⁻¹)]

and then take expectations of T, T², and T³ to verify the claimed expressions for the mean, variance, and skewness of T. The sizes of the various remainder terms differ because the expectation of O_p(1) terms that are even-order in √n(s − σ) or √n(ω̂ − ω) have magnitude O(1), whereas the expectation of O_p(1) terms that are odd-order in √n(s − σ) or √n(ω̂ − ω) have magnitude O(n^{−1/2}). It follows that the cumulant generating function of T can be expanded as follows:

    C_T(u) = ln E(e^{iuT}) = κ_1 iu/√n − u²/2 + κ_3 (iu)³/(6√n) + O(n⁻¹).

Inverting the characteristic function, exp{C_T(u)}, yields the density function, f_T, and integrating f_T yields the distribution function, F_T.

Inversion of Exponential Functions

This section describes a modified Newton algorithm for finding the value of t̂_{3,α} that satisfies h(t̂_{3,α}) = 0, where

    h(t̂_{3,α}) = t̂_{3,α} − κ̂_1/√n − κ̂_3(t̂²_{3,α} e^{−d̂ t̂²_{3,α}/2} − 1)/(6√n) − z_α,

and d̂ is defined in Corollary 5.3. The solution is unique because h(t̂_{3,α}) is a monotonic function of t̂_{3,α} by construction. An initial guess for t̂_{3,α} is t̂_{3,α,0} = t̂_α, where t̂_α is defined in Corollary 5.2. At iteration i + 1 of the algorithm, the value of t̂_{3,α} is

    t̂_{3,α,i+1} = t̂_{3,α,i} − u_i, where

    u_i = −t̂_{3,α,i}                              if h(t̂_{3,α,i})/[t̂_{3,α,i} h⁽¹⁾(t̂_{3,α,i})] < −1,

    u_i = t̂_{3,α,i}/2                             if h(t̂_{3,α,i})/[t̂_{3,α,i} h⁽¹⁾(t̂_{3,α,i})] > 1/2,

    u_i = h(t̂_{3,α,i})/h⁽¹⁾(t̂_{3,α,i})           otherwise,

and

    h⁽¹⁾(t̂) = 1 − κ̂_3 t̂(2 − d̂ t̂²) e^{−d̂ t̂²/2}/(6√n).

Proof of Theorem 6

Define Z_i as

    Z_i = √n_i (ψ̂_i − ψ_i)/σ_i,

where all terms are defined in Theorem 6. Then, the first-order Taylor series expansion of T in Theorem 6 around σ̂_i² = σ_i² for i = 1, 2 is

    T = [(g^{−1/2} σ_1 Z_1 − σ_2 Z_2)/(g^{−1} σ_1² + σ_2²)^{1/2}] × [1 − V_1/(2√n_1) − V_2/(2√n_2)] + O_p(n⁻¹), where

    g = n_1/n_2,  n = min(n_1, n_2),

    V_1 = √n_1 (σ̂_1² − σ_1²)/(σ_1² + g σ_2²), and

    V_2 = √n_2 (σ̂_2² − σ_2²)/(g^{−1} σ_1² + σ_2²).

The claimed results are obtained by using Theorem 4 to obtain expectations of T, [T − E(T)]², and [T − E(T)]³.
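The modified Newton iteration can be implemented directly. The sketch below (names are ours) uses the forms of h and h⁽¹⁾ as reconstructed above; for simplicity the initial guess is z_α rather than the Corollary 5.2 percentile.

```python
import math

def solve_t3(z_alpha, k1, k3, d, n, tol=1e-10, max_iter=100):
    """Damped Newton iteration for the percentile t3 solving h(t3) = 0."""
    rn = math.sqrt(n)

    def h(t):
        return (t - k1 / rn
                - k3 * (t * t * math.exp(-d * t * t / 2.0) - 1.0) / (6.0 * rn)
                - z_alpha)

    def h1(t):  # derivative of h with respect to t
        return 1.0 - k3 * t * (2.0 - d * t * t) * math.exp(-d * t * t / 2.0) / (6.0 * rn)

    t = z_alpha                      # simple initial guess
    for _ in range(max_iter):
        step = h(t) / h1(t)
        ratio = step / t if t != 0.0 else 0.0
        if ratio < -1.0:             # damp steps as in the three-case rule above
            step = -t
        elif ratio > 0.5:
            step = t / 2.0
        t -= step
        if abs(h(t)) < tol:
            return t
    raise RuntimeError("Newton iteration did not converge")
```

With κ̂_3 = 0 the correction vanishes and the solution reduces to z_α + κ̂_1/√n, which provides a convenient check on an implementation.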