V1701015 D-Optimality for Regression Designs: a Review
Total Page:16
File Type:pdf, Size:1020Kb
TECHNOMETRICSO, VOL. 17, NO. 1, FEBRUARY 1975 D-Optimality for Regression Designs: A Review R. C. St. John and N. R. Draper Bowling Green State University University of Wisconsin Department of Quantitative Statistics Department Analysis and Control 1210 W. Dayton Street Bowling Green, Ohio 43403 Madison, Wisconsin 53706 After stating the model and the design problem, we briefly present the results for regression design prior to the work of Kiefer and Wolfowitz. We then review the major results of Kiefer and Wolfowitz, particularly those on the theory of design, as well as the way the criterion has been extended to non-linear models. Finally, we discuss algorithms for constructing D-optimum designs. KEY WORDS The design problem consists of selecting vectors D-Optimum xi 7 i = 1, 2, *** , n from x such that the design Design defined by these n vectors is, in some defined sense, optimal. By and large, solutions to this problem consist of developing some sensible criterion based 1. MODEL AND NOTATION on the model (2), and using it to obtain optimal In this section we introduce our notation for the designs. linear regression model, and the assumptions about the model and the design region. We consider the 2. BACKGROUND linear model Smith (1918) was one of the first to state a criterion and obtain optimal experimental designs yi = f’(x,)@ + ei , i = 1, 2, * * * ) n 7 (1) for regression problems. For polynomial regression which we express in matrix notation as of order p - 1 in one variable over the design region x = [-1, 11, she proposed the criterion y= x0+&. (2) min max z@(x)). (7) The vector y is an n X 1 vector of observations; ,x;,i=1.2 ,..., n, =cx X is an n X p matrix, with row i containing f’(x;); xi is a Q X 1 vector of predictor variables; f’(xi) is a This criterion was later called G-optimality by p X 1 vector which depends on the form of the Kiefer and Wolfowitz (1959), and it has taken on response function assumed; 0 is a p X 1 vector of considerable importance in the theory and construc- unknown parameters; e is an n X 1 vector of inde- tion of optimal designs. Using this criterion Smith pendently and identically distributed random vari- obtained designs for various values of p. Guest ables, with mean zero and variance a2. The experi- (1958) showed that, for a polynomial of degree mental region is denoted by x, and we assume that p - 1, the allocation of points according to Smith’s x is compact and that fi(xi)‘s are continuous on x. min max criterion (7) could be obtained by finding We assume that least squares estimates of the the zeroes of the derivative of a Legcndre poly- parameters @are to be obtained. These are given by nomial. Wald (1943) proposed the criterion of maximizing 6 = (x’x)-‘X’y, (3) the determinant of X’X as a means of maximizing and the variance-covariance matrix of 0 is the local power of the F-ratio for testing a linear hypothesis on the parameters of certain fixed-effects V(O) = g”(x’x)-‘. (4) analysis of variance models. Mood (1946) proposed Then, at point xex, the predicted response is the same criterion for obtaining weighing designs. Kiefer and Wolfowitz (1959) later called this B(x) = f’(x>e^, (5) criterion D-optimality and extended its use to . with variance regression models in general. De la Garza (1954) showed that, for a polynomial of degree p - 1, v(Q(x)) = ~“f’(x)(x’x)-‘f(X). (6) there exists a design with p distinct points which has the same X’X matrix as a design with more than Received July, 1974; revised October, 1974 p distinct points (Kiefer (1959) subsequently 15 16 R. C. ST. JOHN AND N. R. DRAPER corrected De la Garza’s proof). Hoe1 (1958) obtained Dejnition 2. .$* is G-optimal if and only if the optimal allocation for a polynomial of degree min max d(x, l) = max d(x, l*). (13 p - 1 using both the min max criterion of Smith and t x*x x=,-i the determinant criterion of Wald. He showed that A sufficient condition for {* to satisfy (12) is the two criteria gave the same results, thus hinting at the equivalence theorem of Kiefer and Wolfowitz max d(x, .$*) = p. (13) (1959), which we discuss in the next section. xex Various other properties of the X’X matrix were The equivalence of D- and G-optimality is es- suggested as being an appropriate criterion for tablished in the general equivalence theorem of design. Elfving (1952) and Chcrnoff (1953) mini- Kiefer and Wolfowitz, which can be formally stated mized the trace of (X’X)-’ to obtain regression as follows: designs. Ehrenfeld (1955) suggested that maximizing Theorem 1. Conditions (ll), (12), and (13) are the minimum eigenvalue of X’X be used as a equivalent, and the set of all 5 satisfying these design criterion. conditions is convex. The implication of t,his result is that we can 3. DESIGN THEORY use (13) to verify whether or not a specific design Kiefer (1959, 1961a, 1961b, 1962 and 1970a) and is D-optimal. That is, if (13) is satisfied, then the Kiefer and Wolfowitz (1959, 1960) made a number design is D-optimal. Note that D-optimality is of contributions to the theory of regression design. essentially a parameter estimation criterion, whereas We discuss the two main results: the idea of design G-optimality is a response estimation criterion. The measure and their general equivalence theorem. equivalence theorem says these two design criteria A design can be viewed as a measure E on x. Suppose are identical when the design is expressed as a measure we have an n-point design with n; observations at on x. the site xi (note that Zni = n). Let [ be a probability As defined above, the criterion of D-optimality is measure on x such that E(xi) = 0 if there are to be applicable when all p parameters are to be estimated. no observations at point xi , and such that t(x;) = However, in many cases one is interested in only ni/n if there are to be ni > 0 observations at s < p parameters, the other parameters being point x, . For a discrete n-point design, E takes on nuisance parameters. Kiefer and Wolfowitz (1959) values which are multiples of l/n, and defines an discussed this problem and Kiefer (1961a) gave exact design on x. Removing the restriction that [ be definitions of D,-optimality and G,-optimality (i.e., a multiple of l/n, we can extend this idea to a optimality when interest is in s < p parameters) design measure which satisfies, in general: which are analogous to the above definitions of D- and G-optimality. Kiefer (1961a, 1961b, and 1962) later extended the general equivalence theorem to this situation. ‘l(dx) = 1. (8) InI Karlin and Studden (1966a) gave various forms of the response model (I) for .which the D-optimal With respect to model (a), we can define a matrix designs were known. In addition, Karlin and Studden analogous to X’X for design 5. Let extended and simplified Kiefer’s results for D,- and G.-optimality. In particular they discuss the problem miCEI = / f~WiWG-W, i, j = 1, 2, . , P (9) x of estimating s < p parameters when M(t) is singular. and Atwood (1969) gave further results in the theory of D-optimality. In particular he gave more general definitions of D,- and G,-optimality, and he discussed Note that, for an exact design, nM([) = X’X. the role of symmetry in optimal design. Similarly, a generalization of the variance function Silvey and Titterington (1973) have given a (6) is given by geometric interpretation of optimal design. Using d(x, 0 = f’(x)M-‘@f(x). (10) results from Strong Lagrangian Theory they established duality theorems which state the From this following general definitions of D-opti- Kiefer-Wolfowitz equivalence theorem in terms of mality and G-optimality. a design measure on the function space (the image Dejkition 1. [* is D-optimal if and only if of x under the model). Moreover, they suggest that M({*) is nonsingular and this duality mav lead to efficient algorithms- for constructing D-optimal designs, a point which we my det (M(Q) = dct (M(t*)). (11) discuss below. TECHNOMETRICSO, VOL. 17, NO. 1. FEBRUARY 1975 D-OPTIMALITY FOR REGRESSION DESIGNS 17 Sibson (1973) and (1974), following his contribu- of obtaining D-optimal designs which are based on tion to the discussion of the paper of Wynn (1972), the same procedure as employed by Box and Hunter. is developing linear programming algorithms for Atkinson and Hunter (1968) extend the BOX- the explicit solution of D-opt,imality problems, based Lucas results to the case where the design is to have on these duality theorems. He has been able to avoid more than p points. Draper and Hunter (1967a and the unboundedness problems involved by imposing 1967b), discussed the problem of selecting prior extra constraints on the total mass allocation to distributions on the parameters to obtain designs points, namely, that no point should have mass for non-linear models. greater than l/p. M. J. BOX (1968, 1970, 1971) has given a number Kiefer (1974) has recently given further equi- of additional results for non-linear design, particu- valence results between D-optimality and other larly on the subjects of replicated design points design criteria.