Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

[J. Res. Natl. Inst. Stand. Technol. 116, 539-556 (2011)] Maximum Likelihood and Restricted Likelihood Solutions in Multiple-Method Studies

Volume 116 Number 1 January-February 2011

Andrew L. Rukhin A formulation of the problem of considered to be known an upper bound combining data from several sources is on the between-method variance is National Institute of Standards discussed in terms of random effects obtained. The relationship between and Technology, models. The unknown measurement likelihood and moment-type Gaithersburg, MD 20899-8980 precision is assumed not to be the same equations is also discussed. for all methods. We investigate maximum likelihood solutions in this model. By representing the likelihood equations as Key words: DerSimonian-Laird simultaneous equations, estimator; Groebner basis; [email protected] the exact form of the Groebner basis for heteroscedasticity; interlaboratory studies; their stationary points is derived when iteration scheme; Mandel-Paule algorithm; there are two methods. A parametrization meta-analysis; parametrized solutions; of these solutions which allows their polynomial equations; random effects comparison is suggested. A numerical model. method for solving likelihood equations is outlined, and an alternative to the maximum likelihood method, the restricted Accepted: August 20, 2010 maximum likelihood, is studied. In the situation when methods variances are Available online: http://www.nist.gov/jres

1. Introduction: Meta Analysis and difficult when the unknown measurement precision Interlaboratory Studies varies among methods whose summary results may not seem to conform to the same measured property. This article is concerned with mathematical aspects Reference [6] provides some practical suggestions for of an ubiquitous problem in applied statistics: how to dealing with the problem. combine measurement data nominally on the same In this paper we investigate the ML solutions of the property by methods, instruments, studies, medical random effects model which is formulated in Sec. 2. By centers, clusters or laboratories of different precision. representing the likelihood equations as simultaneous One of the first approaches to this problem was sug- polynomial equations, the so-called Groebner basis gested by Cochran [1], who investigated maximum for them is derived when there are two sources. A likelihood (ML) estimates for one-way, unbalanced, parametrization of such solutions is suggested in heteroscedastic random-effects model. Cochran re- Sec. 2.1. The maxima of the likelihood are turned to this problem throughout his career [2, 3], and compared for positive and zero between-labs variance. Ref. [4] reviews this work. Reference [5] discusses A numerical method for solving likelihood equations applications in metrology and gives more references. by reducing them to an optimization problem for a The problem of combining data from several sources homogeneous objective function is given in Sec. 2.2. is central to the broad of meta-analysis. It is most An alternative to the ML method, the restricted maxi-

539 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

σ^ 2 mum likelihood is considered in Sec. 3. An explicit i = 0. In order to find these estimates one can replace formula for the restricted likelihood estimator is μ in (2) by discovered in Sec. 3.2 in the case of two methods. −1 Section 4 deals with the situation when methods ∑ x ()σσ22+ μ = i ii, (3) variances are considered to be known, and an upper σσ221+ − ∑ ()i bound on the between-method variance is obtained. i The Sec. 5 discusses the relationship between likeli- which reduces the number of parameters from p +2 to hood equations and moment-type equations, and p +1. Sec. 6 gives some conclusions. All auxiliary material Our goal is to represent the set of all stationary points related to an optimization problem and to elementary of the likelihood equations as solutions to simultaneous symmetric is collected in the Appendix. polynomial equations. To that end, note that

−=μ 22 − 2 2. ML Method and Polynomial Equations ∑∑∑∑wi() x i wx ii ( wx ii )/() w i iiii(4) =−2 To model interlaboratory testing situation, denote by ∑∑wwij()/(). x i x j w i 1≤

σ^ 2 ()x − μ 2 QP/ Q It follows from (2) that if i is the ML estimator of ∑ k ==. (8) σ 2 σ^ 2 σσ22+ i , then i > 0. However, it is quite possible that k k P'/ P P'

540 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

σ 2 The negative log-likelihood function (2) in this variables, say, 2 . With n = n1 + n2, f = n + n2, σ 2 − 2 υ σ 2 − 2 2 − 2 notation is u = 1 /(x1 x2) , = 2 /(x1 x2) , z1 = v1s1 /(x1 x2) 2 − 2 and z2 = v2 s /(x1 x2) , under lexicographic order, 2 2 2 2 Q vs 2 σ σ LP=+log +∑∑ii + v logσ . (9) 1 > 2 , the Groebner basis for equations (13) written σ 2 ii P' iii in the form

22 2 +− +υ = Π σ +σ 2 uznuu()()0,11 Let Pi = k≠i ( k ) be the partial of P (14) σ 2 with regard to i ; denote by Qi the same partial deriv- ′ ∑ 22 ative of Q and by Pi = j:j≠i Pij the derivative of υυυ+−()()0,zn u + = (15) ′ (σ2, σ2,…,σ 2 22 P p ). By differentiating (9), we see that the stationary points of (2) satisfy polynomial equations, consists of two polynomials,

Gu(,υ )=++++ cu25 c υυυυ c 4 c 3 c 2 , σσ42+−+ σ 2 σ 4 2 101234 (16) iiiii()()()QP' QP' P' (10) cnzfz=++23[(1) nz ], +−vs()()0.σσσ222 + 2 P' = 012 1 12 ii i i =−23 + + 3 − 23 c121121222 nnfz nnn(3) nnz nnf , σ 2, =−32 + 32 + + 23 + Each of these polynomials has degree 4 in i cnnfznnnnnnnnzz22121121221222(442) 2 2 i =1,…, p. When σ = 0, P = ∏σ , and the equations 22222 i −+++nn(46) n n nn n z f n n z (10) simplify to 1121222 121 −++43 22234 + − n(5 n112121222 5 nn 8 nn 4 nn 4 n ) z

222 +++22 2 σσP()() Q P'−+ QP' P P' fn21(2 n nn 12 2), n 2 ii i ii (11) +−σ 22 2 = cfnznnnnnnnzz=−3332 +(4 + + 5 232 + 4) vsPP'ii()()0. ii 321211212212 −+43 + 22342 + + (2nnnnnnnnzz1121212212 8 12 7 2 ) These polynomials have degree 3 in σ 2. If σ2 > 0, in i +−+++33(3106)nnn23 z f n 32 z n 3 n n 22 n nn 3 zz 2 addition to (10) one has 12 2 21 1 2 1 2 12 12 ++−222 nn21(5 n 7 nn 1222 2 n ) z =+−′′′′′3 = (12) FP() ( QPQPP ) 0, ++++−−332233 3(264),fnz21 n 2 n 1 nn 1 2 nn 12 n 2 z 2 fn 2 2 =−22 + + and F has degree 3p−3 in σ . cnzzfznz4112112[(1) ], In both cases the collection of all stationary points and forms an affine variety whose structure can be studied υ=++++ υυυυυ5432(17) via the Groebner basis of the ideal of polynomials (11) Gu201234(, ) du d d d d , or (10) and (12) which vanish on this variety. The dnzfznz=++2[(1)2 ], Groebner basis allows for successive elimination of 012112 =− 22 variables leading to a description of the points in the dnfn12 , =−22 + + + 2 variety, i.e., to the characterization of all (complex) dnnfznfnnnnz221112222(34) solutions. There are powerful numerical algorithms for 22 2 +++fn(2 nnn 2), evaluation of such bases [8]. Many polynomial likeli- 1122 =−22 + 2 + + 2 hood equations are reviewed in Ref. [9]. We determine dnfzfnnnnzz3212( 112212 2 2 ) the Groebner basis for equations (11) when p = 2 in the −++−n(3 n22223 6 nn 4 n ) z 2 nf z − 4 nnfz − fn , next section. 21 1222 2 1 22 2 =++++ dnzzzf41212(1 )[ (1 znz 1 ) 12 ] . 2.1 ML Method: p =2 This fact can be derived from the definition of the − 2 σ 2 When p = 2, Q =(x1 x2) . If = 0, the polynomial Groebner basis and confirmed by existing computa- equations (11) take the form tional software. υ υ It follows that for stationary points u, , G1(u, )= −−−+==24σσ 2 2 σσ 2 22 (13) G (u,υ) = 0. All these points can be found by express- ()(xx12iiiii n vs )()0,1,2. 1 2 i 2 ing u through υ via (17), The Groebner basis is useful for solving these =−υυυυ32 + + + equations as it allows elimination of one of the uddddd()/,12340 (18)

541 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

substituting this expression in (16), and solving then −−22 [(xx12 ) /4 s 1 ]++ , x= max(0, x ). In this andonly in the resulting sextic for υ, σ 22 this situation (whose probability is zero), ˆii =s . The cd()υυυ32+++ d d d 2 equation (21) implies that 2yu++≤υ 1/2, so that 01 2 3 4 (19) ++++23υυυ 2 dc01()=0. c 2 c 3 c 4 ()xx− 2 σˆ 2 ≤ 12. (24) 4 Thus, there are 3 × 6 = 18 complex root pairs out of which positive roots u, υ are to be chosen. Although in Unfortunately, the Groebner basis for equations practice most all these roots are complex or nega- (21)-(23) has a much more complicated structure: the tive and are not meaningful, sometimes the number form of the entering into the basis poly-

of positive roots is fairly large. For example, x1= nomials depends on z1, z2, n1 and n2 . To find solutions −==22 to (21)-(23) we start with conditions on u and υ for 0.391,xssnn21 = 0.860, = 0.075, 2 = 0.102, 12 3 , which y > 0 in (21)-(23). υ there are three positive solutions,(u = 0.2 63, = 0.053), For fixed u and υ the behavior of the derivative of (9) uu= 0.138,υυ = 0.106), ( = 0.035, = 0.312). In this with respect to σ 2 is determined by that of the cubic particular case one has to compare the likeli- polynomial (12) which now takes the form, hood function evaluated at these solutions with the σ 2 likelihood evaluated whenˆ > 0, in which case the 3 Fy()=(2 y++ uυυ ) − 2( y + u )( y + ). (25) stationary points are (uz =υ = 0.065, = 0.230), υυ (uzu = 0.246, = 0.055, = 0.003), ( = 0.042, = This derivative is positive if and only if F(y)>0. 0.234,zu = 0.019),( = 0.048,υ = 0.065, z = 0.193) . As is easy to check, the derivative of F has two roots: (The likelihood is maximized at the last solution.) −(u + υ)/2 and y* = 1/6 − (u + υ)/2. The polynomial σσσ222= F has no positive roots if and only if either y* ≤ 0, Ifˆˆˆ 0, the estimators12 , satisfy the equations, F(0) ≥ 0 or if y* > 0, F( y*) ≥ 0 . We rewrite these ()xx− 222 nνν s n s conditions as follows: (21) has a positive root if and 12==, 1−− 112 22 σσ2222+ σ σ 4 σ 2 σ 4 (20) only if either ()12 1 1 2 2

1 3 which imply the inequalitiesνσsn22 / <ˆ < [( x− x ) 2 uuu+≥υυυ,( + ) <2 , (26) ii i i 12 3 2 + vi si ]/ni . Evaluation of second shows that or 2()nxxnnσσˆˆ22−≤ 2 ( σ ˆ 2 + σ ˆ 2 ). 12 1 2 121 2 11 These solutions are to be compared with the solutions uu+−υυ<,| |< . (27) 3 33 when σσˆ 222 > 0, in which case withyxx = /(− ) > 0, 12 If (26) holds, there is a unique positive stationary ++υυ3 − + + (2yu ) 2( yuy )( ) = 0, (21) point. When condition (27) is met, only the largest root of F(y) = 0 (i.e., the one exceeding y*) can be the ML uu2 (−+υν ) 2( z − uyuy )( + )( + υ ) = 0, (22) 11 estimator σ^2. Analysis of second derivatives shows that ( y, u, υ), y > 0, can provide the minimum only υυ2 (−+uz ) 2( − νυ )( yuy + )( + υ ) = 0. (23) − 22 if 2y + u + υ > √3|u − υ | and ( y + u)(y + υ) min (v ( y + u)/u, v ( y + υ)/υ) ≥ y|u − υ |. Notice that (21)-(23) imply that if σˆ 2 > 0, then the 1 2 Figure 1 shows the region where (21) has a positive inequalities ss2222≥≤σσˆˆ , , and σσ ˆˆ 22 ≤ , are all υ 1122 1 2 root in the (u, ) plane. Its boundary− is formed by two straight segments where 3√3|u − υ | = 1, 3(u + υ) ≤ 1, σˆ 22=,s equivalent. The fact excludes the possibility that ii and by a cubic curve when u + υ ≥ 1/3. The largest 22σσ 2 222 σ possible value of u or υ such that σ^2> 0 is 8/27. unlessss12 = , in which caseˆˆ 1 = 21 = s , ˆ=

542 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

Fig. 1. The region where (21) has a positive root. The squares mark the points where the boundary changes from linear to cubic. Two points of this region at which u or υ take the largest possible value (8/27) are marked by o.

To study further the form of solutions to (21)-(23), The conditions (26) and (27) reduce to the inequalities, we use the fact that ⎛⎞1 zs+− z(1 s ) 2(1ssn−−≥ )⎜⎟12 , (30) ⎝⎠Knss(1− ) arg min L (28) and =arg min ++ ++ 1 zuz12//1/(2)= v yuv n |1−− 2ww | (1 w ) < . (31) []ννυ+++++ υ 63 12loguyuy log log( ) log( ) , The of the (29) reveals ≠− +−−3/2 which can be shown by the direct calculation or by the that whensszszsn 1/ 2, |1 2 | [12 (1 )] < ( 2) argument from Sec. 2.2. Thus assuming the condition ss(1− )/(3 3 n ), and (29) has three real roots, two of υ υ z1 / u + z2 / + 1/(2y + u + )=n, one can parametrize them of the same sign as 1−= 2sIfs . 1/ 2, then we the solutions of Eqs. (21)-(23) as follows: for s +− − havewy=≥ 1/ 2. Indeed the condition 0 leads to the in the unit interval, u = (zs12 z (1 s ))/[ sn ( 1/ K )], − υ = (zs+− z (1 s ))/[(1 −− s )( n 1/ K )] , u + y = 2 w (1 − w )2 , following restriction on the range of s :|1 2 w | 12 ≤ |1− 2 s| with a strict inequality when y >0. υυ+−2 ++− yw=2 (1 w )with K =2 yu =2 ww (1 )>1/, n The end points of this domain can be found as where w, 0 < w < 1, solves the cubic equation u − υ = solutions to the equation, 2(1−− 2ww ) (1 w ) or ss(1−− )(1 2 s )2 − (2)(1)nss−−2(zs+− z (1 s )) + 12 =0, (32) −− nn 3 (nw 2)(1 2 ) (1−− 2w ) + which is a in ξ =1−2s, n −+− 2(1 2szsz )(12 (1 s )) −Δ−+ =0. (29) 422(nzn 1) 8 8 2 (33) ns(1− s ) ξξξ−−−=0. nnn

543 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

σ 2 Fig.2. The region whereˆ > 0 whennn12 = 2, = 2 .

+Δ− Δ Δ Herezzz = (12 )/2, = ( zz 21 )/2,| |< z . _ that | | > 1/2. Then the derivative takes negative If the equation (33) has two distinct roots ξ_, ξ in values at the end points of the interval (−1, 1). the interval (−1, 1), then the _range of s-values is the If ξ0 is the point of maximum of the polynomial (33) _ _ 3 interval [_s, s ], 0 < _s =(1− ξ )/2 < s = (1−ξ_)/2<1, in (−1, 1), then 4ξ − 4(n − 1) ξ0 /n =8Δ/n, and the _ _ 0 and for s ∈ (_s, s ) the polynomial in (32) is negative. roots ξ_, ξ exist if and only if Therefore, if there are two roots of (29) in the interval −Δ−+ − ξξξ42−−−2(nzn 1) 8 8 2 (0, 1) which have the same sign as 1 2s, the one with 000 = the smallest absolute value of |1 − 2w|, i.e., the one with nn n −−+ ξξ42+−2(nzn 1) 8 2 (34) |1−− 2wn |< ( 2)/(3 n ), is to be chosen. Thus, the solu- 3>0.00 nn _ tions of equations (21)-(23) are parameterized by s∈ [_s, s ]. To determine_ conditions for the equation (33) to have According to inequality (34), two roots ξ_, ξ , in the interval (−1, 1), observe first that nnnnz−−+−1481242 it cannot have more than two roots there. Indeed the ||ξ 2 −≤ ,(35) polynomial in (33) assumes negative values at the end 0 39nn2 points, and it has exactly two roots if and only if there is a point in this region at which that polynomial is and (35) shows thatznn≤−+ (423/2 8 1)/(24 nn ) ≤− ( 1) positive. /(3 3nnn ) . One must have |ξ |≤− ( 1) /(3 ), as the If the derivative of this polynomial, 4ξ 3− 4(n − 1) 0 derivative decreases only in this sub-interva l of (− 1,1). ξ/n − 8Δ/n, does not vanish in (−1, 1), then (33) Thus cannot have two roots there. This happens when this derivative has just one real root, or equivalently the nnnnz−−+−148124⎡⎤2 discriminant of the cubic polynomial is negative which ξ 2 ≥−⎢⎥(36) 3/2 0 1.2 means that | Δ |>[(n − 1)/(3n)] and which implies 3n ⎣⎦⎢⎥(1)n −

544 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

σ 2 Fig.3. The region whereˆ > 0 whennn12 = 4, = 4 .

σ 2 Fig.4. The region whereˆ > 0 whennn12 = 8, = 3.

545 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

22 If 8zn <−−+−− 2 (i.e., 4 n 8 n_ 1 24 nzn > ( 1) ), Eq. (33) minLL= min σσ22 σ 2 λσλσ 2 2 λσ 2 must have two roots ξ_, ξ as it takes a positive value ,,,11pp , ,, when ξ = 0. Since sign(ξ )=−sign (Δ), assuming that ⎡ ()xs− μν22 0 iii+ = min ⎢∑∑22 2 82,wegetzn≥− λσ22 σ σ 2 λσ()+ σ λσ ,,,,1  p ⎣ iiii ⎤ 2|Δ− |n 1 +++λσ22 σ ν λσ 2 −ξξ2 ∑∑log( (iii )) log( )⎥ =(00 )| | ⎦ nn ii 2 3/2 ⎡ − μ ⎛⎞−−+−⎡⎤2 1 ()xi ≥+nnnnz148124 =[(min⎢ min ∑ ⎜⎟⎢⎥2 λ 22 2 σσ22,,, σ 2⎣ λσσ+ ⎝⎠3(1)nn⎣⎦⎢⎥− (37) 1 p i i 2 1/2 ν s (39) ⎡⎤2 −+− +++∑∑ii)n logλσσ ] log ()22 + 48124nn nz σ 2 i ⎢⎥1,− iii ⎢⎥(1)n − 2 ⎣⎦ ⎤ + ν σ 2 ∑ i log i ⎥ i ⎦ which means that ⎡ ⎛⎞()xs− μν22 =logmin ⎢n ⎜⎟∑∑iii+ 23/2 yy,,, y⎢ ⎝⎠yy+ y (4nn−+− 8 1 24 nz ) 01 p ⎣ ii0 ii (38) +−2 −+− ⎤ 3(nnnnz 1)(4 8 1 24 ) +++∑∑log(yy )ν log y⎥ 0 iii⎦ ≥−−Δ≥4[(nn 1)32 27 ] 0 . ii +−nnlog n . The condition (38) can_ also be shown to be sufficient λλ−+ μ2 The minimizing value of is00 =(∑ (xyyii ) /( ) ξ_, ξ i for the existence of . +∑ν sy2 / )/ n . Thus the objective function in (39) This parametrization and the Groebner basis for (13) i ii i becomes homogeneous of degree 0 in yy01, , , yp allow for given z1, z2 to compare the minimal values of (9) when y > 0 with that when y = 0. Figures 2-4 show which reduces the problem to minimization of such σ^ 2 bounded regions where > 0 in the space (z1, z2) when a function. Ifyy01 , , , yp is a solution to the problem n = n = 2, n = n = 4, or n = 8, n = 3. When 22 2 1 2 1 2 1 2 σλˆˆ σλ σλ ˆ ≥ (39), then =00yy , 1 = 01 , ,pp = 0 y is a ML n1 = n 3, this region is a triangle, z1 + z2 < c; for ≠ 2 solution. n1 n2 it is more complicated. We summarize the obtained results. To minimize the objective function in (39)

Theorem 2.1. When p = 2, the Groebner basis for (13) Fy(,,,01 y yp ) is given by (16), (17). The solutions of (21)-(23) satisfy ⎛⎞− μν22 ()xsiii conditions (26) and (27). These solutions can be =logn ⎜⎟∑∑+ (40) _ ⎝⎠yy+ y parametrized via (29) by s ∈ (_s, s ) where the end points ii0 ii +++ν can be found from (32) given that (38) holds. ∑∑log(yy0 iii ) log y , ii …, We conclude by noticing the relationship of the prob- one can impose a restriction on y0, y1, yp such as ∑ =1 λ =1. ∞ lem discussed here to the likelihood solutions to the yi or 0 Note that F tends to + when classical Behrens-Fisher problem [10] which assumes y → 0 or y → +∞ for any fixed y > 0 . Since i _ i 0 σ 2 ∑ ( − )/( ) = 0, that =0. i xi μ y0 + yi one sees that

()xs− μν22 2.2 Solving Likelihood Equations Numerically iii+ ∂ + 22 ν ()yy0 ii y 1 i In view of the difficulty in evaluating the Groebner Fn=,−++(41) ∂+yyyy⎡⎤()xs− μν22 basis for p > 2, an iterative method to solve the opti- iii∑ ⎢⎥jjj+ 0 + mization problem in (2) is of interest. Notice that the j ⎣⎦⎢⎥yy0 jj y sum of the first two terms in (2) is a homogeneous σ 2 σ 2 …, σ 2 − function of , 1, p, of degree 1. Therefore, which must be positive for large yi and negative for ∑ with n = i ni , sufficiently small yi .

546 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

If y > 0, by equating this derivative to zero, we obtain (1)k + 2 0 ++1 ⎡⎤υμ()x − ys(1)kk=,⎢⎥∑∑iik+1 + νυ 2(1) 0 +υ (1)k + ii i (48) n ⎣⎦ii1 i sn2 λλ− λ iii−+00=0, (42) k = 0, 1, …. 2 ν yi yyii0 As in [11], (46) defines a sequence converging to a λμ−++22 ν Σ λλ where=()/()/,sothat=.iixyysyn00 iiii i stationary point, and at each step the value of (40) is The equation (42) must have a positive root, which decreased. λ means that for i evaluated at a stationary point, Theorem 2.2 Successive-substitution iteration defined νλλyn− iii00≥ , (43) by equations (46), (47) and (48) converges to a station- 22λ 4si 0 ary point of (40) so that at each step this function decreases. At any stationary point inequalities and (43) and (44) hold and (45) is valid. s λλλλλλλ22nn− i =<.000++ii 000 ++ i(44) Notice that Vangel and Rukhin tacitly assume in [11] 22νν ysii2244ssii i y00 s i i y that y0 > 0. The case of the global minimum attained at the boundary, i.e., when y0 = 0, can be handled as 2 λ It follows that for any stationary point, 2si > 0yi, i.e., follows. Equating (41) to zero gives

2s2 −+μν22 i σ 22 ()xsiii <<2.ˆiis y ∝ , (49) 4ns2 (45) i n 11++ii i νσ2 i ˆ and a simpler version of an iterative scheme λ λ For a fixed value of 0, say, 0 = 1, the Eq. (42) leads to an iteration_ scheme, with specified initial values of (0) μ y0 and 0. We take these to be the estimates arrived at ()xs−+μν22 by the method of moments as described later in Sec. 5, ik ii n (1)k + i μ but once they are given, one can solve the cubic yik=,+1 (50) υ ()xs−+μν22 equation for i = y0 / yi , jk jj ∑ ()k j y j υυν+−+22 υν 2 ⎡⎤−1 iiiiii(1 )sy (1 ) 0 x 1 (46) =,∑∑j ⎢⎥ −+(1υμ )yvx + ( − )2 = 0. (1)kk++ (1) iii0 jjyyjj⎣⎦⎢⎥

… (0) 2 It is easy to see that each of these equations has k =0,1, , yj = sj , y0 = 0, converges fast. However either one or three positive roots. If there is just one the solutions obtained via (46), (47), (48) and (50) υ root, then it uniquely defines i. In case of three positive are to be compared by evaluating L. To assure that a roots, the root which minimizes (40) is chosen. Under global minimum has been found, several starting values υ υ (k) (k) this_ agreement,_ at stage k in (46) i = i , y0 = y0 , should be tried. μ μ = k , and after solving (46) we put The maximum likelihood estimator and the restrict- ed maximum likelihood estimator discussed in the next υ (1)k + x ∑ jj section can be computed via their R-language imple- 1+υ (1)k + mentation through the lme function from nlme library μ =,j j (47) k +1 υ (1)k + [12]. However this routine has a potential (false or ∑ j singular) convergence problem occurring in some 1+υ (1)k + j j practical situations. and

547 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

3. Restricted Maximum Likelihood An application of (54) shows that

The possible drawback of the maximum likelihood ′′′ ′′ ′′ method is that it may lead to biased estimators. Indeed FPPPQPQPPH≥+−()=.(55) the maximum likelihood estimators of σ ’s do not take into account the loss in degrees of freedom that results from estimating μ. This undesirable feature is eliminat- Since P (σ 2) > 0, if F has a positive root σ 2, then ed in the restricted maximum likelihood (REML) H also must have a positive root τ 2. Moreover, method which maximizes the likelihood corresponding τ 2 ≤ σ 2, so that the ML estimator of σ 2 is always small- to the distribution associated with linear combinations er than the REML estimator of the same parameter for of observations whose expectation is zero. This method the same data. discussed in detail by Harville [13] has gained consid- The polynomial equations for the restricted likeli- erable popularity and is employed now as a tool of hood method have the form, choice by many statistical software packages. The negative restricted likelihood function has the σνσ4222′′−+ ′ ′′ + − ′ form ii()()()=0,QP QPPP i i ii s i P (56) ++σσ221− RL=log(( L ∑ i ))= i with each of these polynomials being of degree 3 in σ 2, − 2 −1 i ()xxij ⎡⎤− …, σ 2 ∑∑⎢⎥()σσ221+ i =1, p. When > 0, Eqs. (56) have to be aug- σσσσ2222++ i 1<≤≤ijp()()ij⎣⎦ i mented by the equation H =0. ++++σσ221− σσ 22 log(())log()∑∑ii ii 3.1 Solving REML Equations ν s2 ++∑∑ii νσlog2 . (51) To solve the optimization problem in (51) in practice σ 2 ii iii one can use a method similar to that in Sec. 2.2. Indeed, By using notation (5) and (6), we can rewrite ⎡ ()xs− μν22 iii+ minRL = min ⎢∑∑22 2 2 σσ22,,, σ 2 λσσ ,,,, 22 σ 2⎣ λσ()+ σ λσ Q ν s 11ppiiii ++′ ii +νσ2 (52) RL=log P ∑∑2 ii log. P′ σ 1 22 iii +++log(∑∑ ) log(λσ ( σ )) λσ22+ σ i ii()i σ 2 σ 2 …, To estimate and i , i =1, p one has to find the ⎤ ′ ′ + ∑νλσlog(2 )⎥ minimal value of Q/P + log P . The derivative of this ii⎦ function in σ 2 is proportional to the polynomial H of i − ⎡ 1 ()xs− μν22 degree 2p 3 in this variable, iii+ =[()min⎢ min ∑∑22 2 σσ22 σ 2 λ λ σσ+ σ ,,,1  p ⎣ iiii HPPQPQP=,′′′+− ′′ ′′ (53) ⎛ 1 ⎞ +−(n 1)logλ ] + log ⎜⎟∑ ++∑log ()σσ22 σσ22+ i ⎝ i i ⎠ i as opposed to 3p−3 which is the degree of the corre- ⎤ + νσ2 sponding polynomial F in (12) under the ML scenario. ∑ iilog ⎥ σ 2 …, i ⎦ For fixed i , i =1, p , all p roots of the poly- nomial P = P (σ 2) are real numbers, so that ⎡ ⎛⎞()xs− μν22 =(1)logmin ⎢ n −+⎜⎟∑∑iii yy,,, y + 01 p ⎢⎣ ⎝⎠iiyy0 ii y − ′′p 1 ′22 ′ PP≤≤[] P [] P (54) ⎛⎞1 ⎤ p ++++log⎜⎟∑∑ log(yy ) ∑ν log y⎥ + 0 iii ⎝⎠iiyy0 i i⎦⎥ [14, 3.3.29]. +−−nn1 ( − 1)log( n − 1) . (57)

548 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

_ λ λ ∑ ( −μ )2 The minimizing value of is 0 =( i xi / The gradient of the function in (61) ∑ 2 − (y0 + yi)+ i visi /yi)/(n 1), and the objective function  2(nz− 1) en in (57) is homogeneous of degree 0 in y’s. If −−(2)n − , (62) …, TT y0, y1, yp is a solution to the problem (57), then zz zez σ∼ 2 λ σ∼ 2 λ …, σ∼ 2 λ = 0 y0, 1 = 0 y1, p = 0 yp is a REML solution. and its Hessian is When y > 0, the function in (57) takes the form 0 −−TT − 2 2(nnzznee 1) 4( 1) ( 2) 22 ∇−+F = ⎛⎞()xs− μν TT22 T (1)logn −+⎜⎟∑∑iii zz() zz () ze +  ⎝⎠iiyy0 ii y (58) ⎛⎞n (63) +diag ⎜⎟. ⎛⎞⎛⎞ ⎝⎠2 +−+11ν z log⎜⎟⎜⎟∑∑ log ∑ii logy . ++ → ⎝⎠⎝⎠iiyy00ii yy i where n / z is the vector with coordinates ni / zi , i =1,…, p, leads to the iteration scheme,

Its derivative with respect to yi is  nnz2(− 1) ()k ()xs− μν22 ∝−−(2),ne iii+ zzz(1)kkTk+ () () () + 22 (64) ()yy0 ii y −−(1)n s−2 ⎡⎤()xs− μν22 zez(0)=,i Tk ( ) =1, ∑ ⎢⎥jij+ i ∑s−2 + k j ⎣⎦⎢⎥yy0 jj y k 1 (59) k =0,1,…, which converges fast. + 2 ν The solutions obtained via (60) and (64) are to be ()yy0 ii1 −++. compared by evaluating RL in (57). 1 yyy+ ∑ 0 ii + j yy0 j 3.2 REML p =2 λ σ∼ 2 σ∼ 2 As in Sec. 2.2, for a fixed value 0 = 1 we can use an To find the REML solutions i and when p =2 iterative scheme which is based on (59). However,_ now notice that (51) takes the form (0) μ one has to specify initial values of y0 , 0 and of − (0) ∑ 1 2 A = (y0 + yj) after which the cubic equations for ⎛⎞()xx− 12 +++σσ22 υ = y / y , RL=log(2)⎜⎟22 y 12 i 0 i ⎝⎠2y ++σσ 12 (65) υυν(1+−+−+ )22syy (1 υν ) 2 (1 υ ) ⎛⎞⎛⎞ss22 iiiiii00 i ++νσνσ⎜⎟⎜⎟12log22 ++ log , ++υυ − μ2 (60) 1122σσ22 iii/()=0Ax ⎝⎠⎝⎠12 − 2 ≥ 2 2 which shows that if (x1 x2) s1 + s2, are solved for i =1,…, p, with updating as in (47) and σ 22 (48). Each of these equations has either one or three ii=,=1,2,si (66) positive roots. If there is just one root, then it defines vi , out of three roots the largest is taken. 1 T σ 2⎡⎤−−− 222 If y = 0, then with the vectors e=(1, …, 1) ,  =(⎣⎦xx12 ) s 1 s 2 . (67) ,…, T σ 2 × 2 z =(z1 (zp) , zi =1/ i , and p p symmetric matrix − 2 2 2 defined by positive elements ((xi xj) +vi si +vj sj )/2, (58) takes the form Indeed this choice simultaneously minimizes each of three bracketed terms in (65) guaranteeing the global ⎡⎤minimum at this point. When (x − x )2 < s 2 + s 2, as we −−−−TT 1 2 1 2 min ⎢⎥(nzznzenz 1)log( ) ( 2)log( )∑ ii log . σ 2 σ 2 will see, = 0. To find i , i = 1, 2, one can employ zz1,, p >0⎣⎦i (61) the Groebner basis of the polynomial equations (56) which in the notation of Sec. 2.1 takes the form

549 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

22+−υν + − + υ All stationary points of the restricted likelihood uu(1)()()=011 u z u (68) equations can be found by solving the cubic equation υ υ G5 ( ) = 0 for > 0, and substituting this solution in 22+−υνυυ + − + vu(1)()()=0.22 z u (69) (71), allowing one to express u through v,

−+++υυυυ32 This Groebner basis consists of three polynomials ubbbbb=(12340 )/.

υ25++ υυυυ 43 ++ 2 Gu301234(, )= au a a a a , (70) Thus, in this case there are 3 complex non-zero roots pairs u, v. whose coefficients are not needed here, Another approach is to use the argument in Sec. 3.1 which for a fixed sum 2y + y + y = 1, leads to the Guv(,)= buυυυυυ++ b5432 b ++ b b , 0 1 2 401234 optimization problem, −+−+−+−2 b01211221121= 2 nznzznznzn [ 2( 1) ( 2) 2], −−− +−2 ⎡⎤⎛⎞zz bnnn12=(1)(2)(22), nn 12 −+++12 ν min ⎢⎥(ny 1)log⎜⎟ 1∑ ii log , (73) 22 yy+≤1 ⎢⎥ bnn=(−+ 2 − 2)(2 nnnnnz + 2 −− 4 + 2) 12 ⎣⎦⎝⎠yy12 212 122121 yy12>0, >0 ++ −32 + + 23 + − 2 (22)(4nn12 nnnnnnn 1121221 7 45 If (x − x )2 < s 2 + s 2, then according to Lemma 1 in −−++2 − 1 2 1 2 14nn12 12 n 2 8 n 1 124)nz22 the Appendix, for a fixed sum y1 + y2 = y, the minimum ++ −22 + + 2 − − + ≤ (22)(2nn12 nnnnnn 1 12212 2342), in (73) monotonically decreases when 0 < y 1, so that 22 this minimum is attained at the boundary, y1 + y2 = 1, in bnn=(−+ 2 − 2)ν z ∼ 31221 which case indeed σ 2 = 0. Thus, one can solve the ++−22 + +−−+ − 2(nn12 2 2)( nnnnnn 1 2 12212 2 3 4 2) zz 12 cubic equation for w =1 2y1 , −++−−−+2232 2 (3nn12 6 nn 12 4 n 2 n 1 12 nn 12 12 n 2 4 n 1 (2)(42)(8nw−−Δ+−+Δ−−32δδ wn 4 nzw 2) +−12nz 4) 2 (74) 22 −Δ4(nz − 1)2(14)=0, −δ + −+−−+−ν 2 2(21nn 2 2 2) z 1 2(2 nn 1 2 2) + 2 −− + δ −+Δ (2nn12 2 n 2 nn1242) z 2 where = (nn21 )/2, and as before z = ( zz 12 )/2, = ()/2.zz− −+−2(ν nn 2 2),2 21 21 2 An alternative method is to use the iteration scheme ++ + − + − bnzzzn412121= (1 )[( 2 n 2 2) z 1 ( n 1 2) z 2 (64) to solve (57). ++ − nn1222], (71) andand 4. Known Variances υυυυυ65++ 4 + 3 (72) Gu5(, )= c 0123 c c c , 4.1 Conditions and Bounds for Strictly Positive withwith Variance Estimates −− cnnn02=( 1)(2), In view of the rather complicated nature of likeli- cnnnnnz=(22 +−−+ 2 4 2) 1212121 hood equations, in many situations it is desirable to −+22 + −− + (34nnnnnn121212 362) z 2 have a simpler estimating method for the common −−nnnn2222 − +34nn+− 2, mean μ. The most straightforward way is to assume 121212 that the variances σ 2, i =1,…, p, are known. In this ν 2 −+− i cznnzz2211=(244) 2 12 case, essentially suggested in Ref. [15], but also pur- ++−2 +νν + +− + (2nn12 3 3) z 2 2 2112 z (2 nn 4 4) z 22 , sued in Refs. [16, 17], these variances are taken to be s 2 (or a multiple thereof). Because of (3), the only czzz=[()2()1].−−+++2 zz i 3212 12 parameter to estimate is y = σ 2.

550 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

Qa() () We give here upper bounds on the ML estimator ≤+(1)(p −− 1) σ^2and on the REML estimator σ~2. ! ⎡⎤()xx− 2 Pa(1)+ () σ 2 ⎢⎥ij Theorem 4.1 Assume that the variances i , max 22 1<≤≤ijp⎢⎥2a ++σσ (1)! + i =1,…, p, are known. Then ⎣⎦ij (83)

22 1 ⎡⎤(1)(pxx−− ) ⎡⎤− 2 (1)+ σσσ222≤−+ij ()xxij Pa() ˆ max ⎢⎥(),ij (75) −− ⎢⎥ ≠ =(p  1)max 22 . 2 ij ⎢⎥p 1<≤≤ijp ++σσ ⎣⎦+ ⎣⎦⎢⎥2a ij ! and

22221 The argument, as before, shows now that if σσσ ≤−−−+⎡⎤(1)()(pxx ). 2221− max ⎣⎦ij i j+ (76) −++≤−σσ 2 ij≠ max 1<≤≤ijp()/(2)(1),thenxxij a i j p Ha()k ( )≥− 0, k = 0, , 2 p 3 . It follows thatσ 2 ≤ a , In particular, if − so that (76) is valid withapxx = 212 max[(−− 1)( ) − 2 ij ()xxij p −+σσ22 ≤ (ij )]+ . The proof of (75) is similar . + max 22 2, (77) 1<≤≤ijpσσ+−(1)p ij When p = 2, the bounds (75) and (76) are sharp as then σ^2=0.If (24) and (67) show. When p increases, their accuracy decreases. Section 4.2 gives necessary and sufficient ()xx− 2 ij≤ 1 (78) conditions for σ~ 2 = 0 when p = 3. It is possible to get max 22 , 1<≤≤ijpσσ+ − ijp 1 better estimates under additional conditions. For exam- σ~ 2 σ− 2 (σ2 σ2 =0. ple, if the ordering of sequences p −  ( ij ) i + j ) − 2 σ2 σ2 −1 Proof: We prove first that (notation of Lemma 2) and (xi xj) ( i + j ) is the 2 2 2 2 same, then the maximum of (x − x ) (σ + σ ), ()xx− i j i j ′′−≤− ′′ij ′′′ (79) 1 ≤ i < j ≤ p, in (77) and (78) can be replaced by their ||(1).QP Q P pmax 22 P P 1<≤≤ijpσσ+ ij average. Indeed, one can write 4.2 Example: Restricted Maximum Likelihood for ′′−−− ′ ′ k +− 2 (80) QP Q P=(1)∑ k k qppk−−2  − y , p =3 k,

When p = 3, Q(y)=q0y + q1, is a polynomial of and Lemma 3 in the Appendix shows that since degree one, and (p − 1 − )| k − 1 − | ≤ (p − 1)(k − 1), (79) holds. Hy()=18 y32+− 3(6 q ) y One gets for the polynomial H in (53), 10 ++−2 + +− (6211 4 6qy ) 2  210211 q 2 q  ⎡⎤− 2 (84) ()xxij 3 ≥−−′′′⎢⎥ k HPP1( p 1)max 22 , (81) 1<≤≤ijpσσ+ = ∑hy3−k ⎣⎦⎢⎥ij 0 σ~ 2 which is positive under condition (78). Then =0. is a polynomial of degree three. If H(0) = h3 < 0, it has Similarly (77) and (54) imply that a positive root, which means that σ~ 2 > 0. Otherwise the existence of a positive root is related to the sign of the discriminant ⎡⎤()xx− 2 FPP≥−−′′⎢⎥()2 ( p 1)ij PP ′′ ≥ 0.(82) 22−− 3 3 max 22 Dhhhh=4412 02 hh 13 ⎢⎥1<≤≤ijpσσ+ ⎣⎦ij −+22 (85) 27hh03 18 hhhh 0123 . ≥ If D < 0 and h3 0, there is just one real root which ∼ To prove (76) observe that with y replaced by y + a must be negative, and then σ 2 = 0. If D ≥ 0, then there ≥ a formula like (79) holds for any positive a. The only are three real roots. The condition h3 0 implies that σ 2 …, σ 2 modification is that p −  ( 1 , p ) is replaced by either two of them are positive and one is negative (so σ 2 …, σ 2 (k) p −  (a + 1 , a + p ) and the H (a)/k! represent that at least one of the coefficients h1 or h2 is negative) ∼ polynomial coefficients. Then a version of Lemma 3 in and σ 2 > 0, or all three roots are negative (so that ∼ the Appendix states that σ 2 > 0.)

551 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

Separation between these cases depends on the ratio 5. Moment-Type Equations λ 222≤ =qq10 / . LetE 1 = 10 / q , E 2 = 20 / q . Then E 2 E 1 /3, 5.1 Weighted Means Statistics hhqE= 18, / = 18(−+− 1/6), hqEE /22 = 6( 2 /3λ ), 0101 2021 σ 2 3 When the within-lab and between-lab variances i hq/ = 2 EE+− E 2λ E . Actually, as follows from 2 30 12 2 1 and σ are known, the best (in terms of the mean ≥ λ the proof of Theorem 4.1, E1 , but we do not use squared error) unbiased estimator of the treatment this fact here. effect μ in the model (1) is a weighted means statistic 0 σ 2 σ 2 2 (3) with wi = wi = 1/( + i ). Even without the Algebra shows that whenEE21 = /3, one has D = 2 normality assumption for these optimal weights, 544(EEh−+− 3λλ ) (3 1/16 ) , when = 0, i.e., if − 113E ∑ w 0 (x − μ )2 = p − 1, and λλ++−32 2 i i i EEE211= /( 0.5), then D = 36(4 EE 11 3 ) 32−−+++λ − 3 (8EEE111 20 10 1 48 )(1 2 E 1 ) , and on the curve λ − 2 h2 = 0, i.e., when E2 = 2 E1 /3, D factorizes as fol- 11 E()==μμ− 2 . lows,DEE = 36(432+− 2 3λ )(108 EEE 3 − 162 2 − 18 0221σσ+ − (88) 11 1 1 1 ∑∑wii() + λ ˆ ii 81 ). LetE1 be the smallest positive root of the cubic equation

σ 2 σ 2 If Var(xi)= i + , but the weights wi are arbitrary, 32−−++λ (86) 82010148=0,EEE111 λ …  p which is defined for < 0.5137 , and let E1 be the σσ222+ p ∑()iiw (only) positive root of the equation −+−μσσ2221 Ewx∑∑ii()=() i w i p i 1 ∑w 32+−λ (87) i 423=0.EE11 1 (89)

The discriminant of (87) is negative, and according ⎡⎤pp ⎢⎥∑∑ww222σ to Descartes’s rule there are always two complex roots, ppiii   =.σσ22⎢⎥∑∑ww−+11 − and one positive root. It is easy to check that E = E (λ) iiipp  1 1 ⎢⎥ λ ⎢⎥11∑∑ww is monotonically increasing in < 0 and E1(1/16) = 1/4. ⎣⎦ii λ ≤ ≤ 11 In this notation, when 1/16 (so that h2 0 ≤ ≤ 2 implies that h1 0), the region {(E1, E2):E2 E1 /3, ≥ σ∼ h3 0, 2 > 0}, is formed by three curves: (i ) h3 =0 σ 2 +−λλ +− In particular, when wi =1/ i between the point M1 = (( 1 48 1)/4, (1 24 1+ 48λ )/24), wherehE = 0 intersects the curve = 32 EEEh22/3, and the point ( ,λ − 2 /3) on = 0, ()ii E = 11122 ⎡⎤p 1 EM2 /3 between and the point ME = ( = 3λ + 1/6, ⎢⎥∑ 11 21 ()x − μσ24p 1 (90) ED22/3) where= 0 intersects EE = /3, and by ( i i i ) Ep∑∑ii=1−+σ 2 ⎢⎥ −1 . 121 σσ22⎢⎥p i ii1 1 the cubic inE2 curve corresponding to the equatio n ⎢⎥∑ = σ 2 D 0, which connects the pointMM23 and the point = ⎣⎦1 i (3λλ+< 1/6,E 2 /3). See Fig. 5. If 2 /81, this region also 1  ≤ ˆ includes the part of the curvehEE311 = 0 where , but the probability of having the likelihood solution The simplest estimate of the within-trials vari- there is zero. ances σ 2 is by the available s 2 , but the problem of  i i Whenλ >>= 1/16, one hasEE 1/ 4 (1/16), so that estimating the between-trial component of variance 11 >>>>λλ σ2 remains. hEEh12120. Also then /3, , so that 0. σ 2 >≥ Thus in this situation 0 if and only ifh 3 0.

552 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

≥ σ∼ 2 λ ≥ Fig. 5. The region where h 3 0 and > 0 when = 1/27. The solid line is h 3 0, the dash-dotted line 2 is E2 = E1 / 3 , and the dotted line is D = 0. The point M1 is marked by +, the point M2 is marked by a square, the point M3 by o. 5.2 DerSimonian-Laird Procedure 5.3 Mandel-Paule Method By employing the idea behind the method of The Mandel-Paule algorithm uses weights of the moments, DerSimonian and Laird [18] made use of the form (91) as well, but now y = yMP , which is designed identity (90) as an estimating equation for μ and σ 2, to approximate σ 2, is found from the moment-type σ 2 2 provided that i are estimated by si , in the following estimating equation, way. For weights of the form p ()xx−  2 i − ∑ 2 =1.p (93) 1 ys+ w =, (91) i=1 MP i i ys+ 2 i See Refs. [15], [19]. The modified Mandel-Paule pro- − determine a non-negative y = yDL from the formula, cedure with y = yMMP is defined as above, but p 1 in the right-hand side of (93) is replaced by p, i.e., ⎡⎤ ⎢⎥∑sxx−202()−−+ p 1 p ()xx−  2 ⎢⎥ii ∑ i =.p i (92) + 2 (94) yDL =max0,⎢⎥− i=1 ysMMP i ppp⎛⎞1 ⎢⎥−−−242 sss− Notice that when p = 2, the DerSimonian-Laird ⎢⎥∑∑∑iii⎜⎟ ⎣⎦iii=1 =1⎝⎠ =1 estimator coincides with the Mandel-Paule rule, as, in this case,

1 222 pp22−− 22 yy==max0,()⎡⎤ xxss−−− ,(95) andput=(())/().x ∑∑ xys++ ys MP DL ⎣⎦12 1 2 DLii=1 i DL i =1 DL i 2 pp Herexxss022 = (∑∑−− )/ , is one of traditional ii=1ii =1 i so that this estimator is similar to the REML estimates estimators for the common mean (the Graybill-Deal estimates (66) and (67). In the general case, both of estimator). In other words, the statistic x~ 0 and the weights these rules set y = 0, when −22 ws= corresponding toσ = 0, are used to evaluate p 2 ii ()xx−  the sum∑ sx−202 (− x ) which is then employed to ∑ i ≤−p 1. (96) i ii s2 estimateσ 2 via (90). i=1 i

553 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

It was shown in Ref. [20] that the modified Mandel- − Paule is characterized by the following fact: the ML − 2 ⎡⎤1 ()xxi  1 σ^ 2 σ 2 ∑∑⎢⎥ estimator of coincides with yMMP , if in the repa- 22 2 ()ys++⎣⎦≠ ys rametrized version of the likelihood equation the δ =.ikkiik: (98) 1 weights w admit representation (91). As a conse- ∑ i + 2 quence, the corresponding weighted means statistic (3) i ysi must be close to the ML estimator, so that the modified Mandel-Paule estimator approximates its ML counter- Simulations show that (98) gives good confidence in- part. Thus, the modified Mandel-Paule estimator can be tervals of the form,xt ±− ( p 1)δ . For the Mandel- interpreted as a procedure which uses weights of the α /2 2 2 form 1/(y + si ), instead of solutions of the likelihood Paule rule or the DerSimonian-Laird procedure they equation that are more difficult to find, and still main- σ 2  tains the same estimate of as ML. outperform the intervals,xt±− ( p 1) Varx ( ). Still A similar interpretation holds for the original α /2 DL ≈ ~ Mandel-Paule rule and the REML function [21]. For when all sample means are close xi x , then yDL =0 ∑ −2 −1 this reason both Mandel-Paule rules are quite natural. It and the uncertainty estimate ( i si ) may be a more is also suggestive to use the weights (91) with satisfactory answer than δ ≈ 0. y = yMP determined from the Mandel-Paule equation When p =2, (93) as a first approximation when solving the likelihood equations via the iterative scheme (47) and (48). (x−++ x )()()22 ys ys 2 (99) δ =.12 1 2 ++222 (2ys12 s ) 5.4 Uncertainty Assessment It is tempting to use formula (88) to obtain an estima- In this case δ is an increasing function of y with the ~ − 2 →∞. tor of the variance of x . For example, DerSimonian and largest value (x1 x2) /4 attained when y Laird [18] suggested an approximate formula for the An alternative method of obtaining confidence estimate of the variance of their estimator, intervals for μ on the basis of REML estimators was suggested in Ref. [24] for an adjusted estimator of  Var()μ based on the inverse of the Fisher informa-  1 Var()= x . (97) tion matrix. This (not GUM consistent) estimator is DL ∑()ys+ 21− DL i more complicated. i 6. Conclusions Similarly motivated by (88), Paule and Mandel [19] ∑ 2 −1 −1 suggested to use [ i (yMP + si ) ] to estimate the The original motivation for this work was an attempt ~ variance of xMP . However these estimators typically to employ modern computational algebra techniques by underestimate the true variance, Ref. [5]. They are not evaluating the Groebner bases of likelihood polynomi- GUM consistent [22] in the sense that the variance esti- al equations for the random effects model. While this mate is not representable as a quadratic form in devia- attempt leads to an explicit answer when there are two − ~ tions xi x . labs with no between-lab effect, it was not successful in Horn, Horn and Duncan [23] in the more general a more general situation. The classical iterative algo- context of linear models have suggested the following rithms appear to be more efficient in this application GUM consistent estimate of Var(x~), which has the although they do not guarantee the global optimum. ∑ ω 2 − ~ 2 −ω ω ∑ form i i (xi x ) /(1 i), with i = wi / k wk . Simplified, method of moments based procedures, ω 2 −1 When the plug-in weights are i =(y + si ) / especially the DerSimonian-Laird method, deserve ∑ 2 −1 … k (y + sk ) , i =1, , p, this estimate is much wider use in interlaboratory studies.

554 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

7. Appendix 7.2 Elementary Symmetric Polynomials: 7.1 Lemma 1 Lemmas 2 and 3

zz We note the following identities for elementary Lemma 1 If12+ >1, then the following function ≤≤ νν symmetric polynomials. For 1ijp < , define 12 σσσσσσσ22 22222 of y, ij=(,,11111 i−+ , i ,, j − , j + ,,  p ).Since Py( ) = ( y222++ (σσ ) y + σσ 22 )∏ ( y + σ 2 ), one ij ijkij≠ , k ⎡⎤⎛⎞zz −+++12 ν has min ⎢⎥(ny 1)log⎜⎟ 1∑ ii log , yy+ = y (100) 12 ⎣⎦⎢⎥⎝⎠yy12 σσσ222++ σ 2 pk−−=()() pk ij i j  pk −−1 () ij is monotonically nondecreasing in the interval (0, 1) (105) +σσ22 σ 2 ijpk −−2 (). ij − Proof: With B =(1+z1 / y1 + z2 / y2)/(n 1), any stationary point (y1, y2) in (100) satisfies the simultane- Comparison of the coefficients of the polynomial p ous equations, Py′′ =∑  (− 1) −2 with its other expression, =2 p− 2∑ ∏ (y +σ 2 ), shows that zzνν (101) 1<≤≤ijp kij≠ , k 11−−==, 21ζ yB22yy yB 1212 ⎛⎞ σ 2 ∑ pij−−()=⎜⎟ p . (106) ≤≤ ⎝⎠2 where ζ is the Lagrange multiplier. By multiplying 1 1. ⎛⎞p −  To prove Lemma 1 it suffices to show that the deriv- σσ22 σ 2 ∑ ijp−−2 ()= ij⎜⎟ p − . (107) ative of (100) is negative. This fact follows from the 1<≤≤ijp ⎝⎠2 inequality Combined with (105), identities (106) and (107) (1)nB−−′ ννyy′′ B 1 demonstrate the following formula (108) (which the ++11 2 2=<0, (102) author was unable to find in the literature.) ByyBy12 Lemma 2 Under the notation above, for 11,≤≤ p − which can be proven by differentiation of (101). Indeed for i =1,2, σσ22+− σ 2 ∑ ()()=().ijpij−−1 p p − (108) ν yzByByByB′′(1)−− ′′ 2(1) 1<≤≤ijp ii1 ++−=. i i i 22 2 (103) yyBiiyB y B yB The following result Lemma 2 gives an upper bound on the coefficients of polynomial Q. ′ ′ Since y1 + y2 = y, and y1 + y2 = 1, by adding these iden- tities we get Lemma 3 For  =0,1,, …, p − 2,

′′ ′ − 2 ννyy⎛⎞ zz − ()xxij 11+−++− 2 2BB 1 2 1 (104) ≤+ −− (109) =1⎜⎟ . qppp−−21(1)( 1) −− max 22 . 2 1<≤≤ijpσσ+ yy12B ⎝⎠ yyBy 12 ij

555 Volume 116, Number 1, January-February 2011 Journal of Research of the National Institute of Standards and Technology

Indeed, [19] R. C. Paule and J. Mandel, Consensus values and weighting factors, J. Res. Natl. Bureau Stand. 87, 377-385 (1982). [20] A. L. Rukhin and M. G. Vangel, Estimation of a common mean 22 and weighted means statistics, J. Am. Stat. Assoc. 93, 303-308 qxx−−=()∑ −  −− ()σ p22 ijp ij (1998). 1<≤≤ijp (110) [21] A. L. Rukhin, B. Biggerstaff, and M. G. Vangel, Restricted 2 ()xx− maximum likelihood estimation of a common mean and =()().∑ ijσσ22+  σ 2 22ijpij−−2  Mandel-Paule algorithm, J. Stat. Plann. Inf. 83, 319-330 1<≤≤ijpσσ+ ij (2000). [22] International Organization for Standardization, Guide to the Expression of Uncertainty in Measurement. Geneva, Switzerland, (1993). [23] S. A. Horn, R. A. Horn, and D. B. Duncan, Estimating 8. References heteroscedastic variance in linear models. J. Am. Stat. Assoc. 70, 380-385 (1975). [1] W. G. Cochran, Problems arising in the analysis of a series of [24] M. G. Kenward and J. H. Roger, Small sample inference for similar experiments, J. Royal Statist. Soc. Supplement 4, 102- fixed effects from restricted maximum likelihood. Biometrics, 118 (1937). 53, 983-997 (1997). [2] W. G. Cochran, The combination of estimates from different experiments, Biometrics 10, 101-129 (1954). [3] P. S. R. S. Rao, J. Kaplan, and W. G. Cochran, Estimators for About the author: Andrew L. Rukhin is a Mathe- the one-way random effects model with unequal error vari- matical Statistician in the Statistical Engineering ances, J. Am. Stat. Assoc. 76, 89-97 (1981). Division of the NIST Information Technology [4] P. S. R. S. Rao, Cochran’s contributions to variance component models for combining estimates, in P. Rao and J. Sedransk, Laboratory. He got his M.S. in Mathematics (Honors) edis., W. G. Cochran’s Impact on Statistics, Wiley, New York, from the Leningrad State University (Russia) in 1967 (1981), pp. 203-222. and his Ph.D. from the Steklov Mathematical Institute [5] A. L. Rukhin, Weighted means statistics in interlaboratory in 1970. After emigrating from the USSR he worked as studies, Metrologia 46, 323-331 (2009). a professor at Purdue University (1977-1987), at [6] D. R. Cox, Combination of data, in Encyclopedia of Statistical Sciences, Vol 2, S. Kotz, C. B. Read, N. Balakrishnan, and University of Massachusetts, Amherst (1987-1989) and B. Vidakovic, eds., Wiley, New York (2006), pp. 1074-1081. at the University of Maryland at Baltimore County [7] E. F. Beckenbach and R. Bellman, Inequalities, Springer, (1989-2008). Now Rukhin is a Professor Emeritus at Berlin, (1961), p. 3. this university. Rukhin is a Fellow of the Institute of [8] D. Cox, J. Little, and D. O’Shea, Ideals, Varieties and Mathematical Statistics, and a Fellow of the American Algorithms, 2nd ed, Springer, New York (1997), pp. 96-97. [9] M. Drton, B. Sturmfels, and S. Sullivant, Lectures on Algebraic Statistical Association. He won Senior Distinguished Statistics, Birkhauser, (2009), pp. 15-55. Scientist Award By Alexander von Humboldt- [10] A. Belloni and G. Didier, On the Behrens-Fisher problem: a Foundation in 1990, and the Youden Prize for globally convergent algorithm and a finite-sample study of the Interlaboratory Studies in 1998 and in 2008. The Wald, LR and LM tests, Ann. Statist. 36, 2377-2408 (2008). National Institute of Standards and Technology is an [11] M. G. Vangel and A. L. Rukhin, Maximum likelihood analysis for heteroscedastic one-way random effects ANOVA in inter- agency of the U.S. Department of Commerce. laboratory studies, Biometrics 55, 129-136 (1999). [12] R Development Core Team R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria (2009) ISBN 3-900051-07-0, URL http://www.R-project.org. [13] D. Harville, Maximum likelihood approaches to variance com- ponents estimation and related problems, J. Am. Stat. Assoc. 72, 320-340 (1977). [14] D. S. Mitrinovich, Analytic Inequalities, Springer, Berlin, (1970) p 227. [15] J. Mandel and R. C. Paule, Interlaboratory evaluation of a material with unequal number of replicates, Anal. Chem. 42, 1194-1197 (1970). [16] R. Willink, Statistical determination of a comparison reference value using hidden errors, Metrologia 39, 343-354 (2002). [17] E. Demidenko, Mixed Models, Wiley, New York, (2004), pp. 253-255. [18] R. DerSimonian and N. Laird, Meta-analysis in clinical trials, Control. Clin. Trials 7, 177-188 (1986).

556