Uniform Supersaturated Design and Its Construction
中国科技论文在线 http://www.paper.edu.cn
SCIENCE IN CHINA (Series A), Vol. 45 No. 8, August 2002

FANG Kaitai$^1$, GE Gennian$^2$ & LIU Minqian$^3$
1. Department of Mathematics, Hong Kong Baptist University, Hong Kong, China;
2. Department of Mathematics, Suzhou University, Suzhou 215006, China;
3. Department of Statistics, Nankai University, Tianjin 300071, China
Correspondence should be addressed to Fang Kaitai (email: [email protected])
Received February 2, 2002

Abstract Supersaturated designs are factorial designs in which the number of main effects is greater than the number of experimental runs. In this paper, a discrete discrepancy is proposed as a measure of uniformity for supersaturated designs, and a lower bound of this discrepancy is obtained as a benchmark of design uniformity. A construction method for uniform supersaturated designs via resolvable balanced incomplete block designs is also presented, along with an investigation of the properties of the resulting designs. The construction method shows a strong link between these two different kinds of designs.

Keywords: discrepancy, resolvable balanced incomplete block design, supersaturated design, uniformity.

Recently, supersaturated designs have aroused increasing interest, as their potential for saving runs and their technical novelty have begun to be recognized. A supersaturated design is a factorial design whose run size is not large enough for estimating all the main effects represented by the columns of the design matrix. If many factors are to be investigated (e.g. in a screening study) and runs are very expensive, economic considerations may compel the investigators to adopt a supersaturated design. For example, a physical experiment may require the making of expensive prototypes; a computer experiment using finite element analysis can be time-consuming and expensive.
In practice, the data collected from supersaturated designs are analyzed under the assumption of effect sparsity, i.e. the response of interest depends mainly on the effects of a few dominant factors, while the interactions and the effects of the remaining factors are relatively negligible. Various fields of research may benefit from the use of supersaturated designs, including computer and medical experiments[1], and industrial and engineering experiments[2,3].

The problem of constructing supersaturated designs was originally considered by Satterthwaite[4]. Booth and Cox[5] first examined these designs systematically. Since then, many researchers[1-3, 6-14] have investigated two-level supersaturated designs. These two-level designs can be used for screening the factors in simple linear models. However, multi-level designs are often required for exploring nonlinear effects of the factors. Such works include refs. [15-17] for three-level supersaturated designs, and refs. [18, 19] for multi-level supersaturated designs. A common characteristic of the existing q-level supersaturated designs ($q \ge 3$), except those of Yamada and Lin[15], is that they are constructed from given orthogonal designs through computer search, while those of Yamada and Lin[15] are generated from two-level orthogonal or supersaturated designs.

All existing supersaturated designs are evaluated by measures such as $E(s^2)$, ave $\chi^2$ and $E(d^2)$. In fact, a supersaturated design is a kind of U-type design[20,21]. An important criterion for evaluating U-type designs is the uniformity criterion, which has gained popularity in recent years and has been shown to be intimately connected to many other design criteria. For example, Fang and Mukerjee[22] provided an analytic connection between uniformity and orthogonality in two-level factorial designs.
When the experimental domain is a unit hypercube, many measures of uniformity, such as the star discrepancy, the centered $L_2$-discrepancy and the wrap-around $L_2$-discrepancy, have been proposed[23-25].

The main purpose of this paper is to provide a class of multi-level supersaturated designs from the uniformity viewpoint. In sec. 1, the discrepancy measure of uniformity is introduced, a discrete discrepancy for measuring the uniformity of factorial designs is defined by using a reproducing kernel in Hilbert space, and a lower bound of the discrete discrepancy, together with a necessary and sufficient condition for achieving it, is obtained. This lower bound can be used as a benchmark of design uniformity. Some justification for using the discrete discrepancy as a design criterion is also provided in sec. 1. In sec. 2, a new construction method for multi-level supersaturated designs via resolvable balanced incomplete block designs (RBIBDs) is proposed. The RBIBD is an important object in both experimental design theory and combinatorial design theory. It has many good statistical and combinatorial properties and has played a crucial role in the construction of other combinatorial configurations[26]. The resulting designs are shown to be uniform supersaturated designs. Thus the construction method serves as an important bridge between these two different kinds of designs, i.e. uniform supersaturated designs and RBIBDs. With this method, the existence of some infinite classes of uniform supersaturated designs is also established in this section without any computer search.

1 The discrepancy measure

We first introduce some background on supersaturated designs. A U-type symmetric design (a U-type design for simplicity) $X$ is an $n \times m$ matrix with symbols $\{1, \dots, q\}$ such that the $q$ symbols in each column appear equally often. A U-type design can be regarded as a design with $n$ runs and $m$ factors, each having $q$ levels.
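The U-type condition can be checked mechanically. The following is a minimal Python sketch, assuming the design is stored as a list of rows with integer levels 1, ..., q; the function name `is_u_type` and the small example designs are illustrative and do not come from the paper.

```python
from collections import Counter

def is_u_type(design, q):
    """Return True if each column of `design` contains every symbol 1..q equally often.

    `design` is a list of n rows, each a sequence of m integer levels.
    (Hypothetical helper for illustration only.)
    """
    n = len(design)
    # Equal frequencies are impossible unless q divides the number of runs.
    if n == 0 or n % q != 0:
        return False
    m = len(design[0])
    for j in range(m):
        counts = Counter(row[j] for row in design)
        # Every symbol 1..q must occur exactly n/q times in column j.
        if any(counts.get(s, 0) != n // q for s in range(1, q + 1)):
            return False
    return True

# Illustrative 4-run, 3-factor, 2-level design: each column has two 1s and two 2s.
balanced = [(1, 1, 1), (1, 2, 2), (2, 1, 2), (2, 2, 1)]
# First column has three 1s and one 2, so it is not U-type.
unbalanced = [(1, 1), (1, 2), (1, 1), (2, 2)]

print(is_u_type(balanced, 2))    # True
print(is_u_type(unbalanced, 2))  # False
```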
Obviously, the number of levels $q$ should be a divisor of $n$. The set of all such designs is denoted by $U(n; q^m)$[27]. A U-type design is called an orthogonal design if, for every pair of design columns, all level combinations appear equally often. In this case, $m(q-1) \le n-1$. When $m(q-1) = n-1$, the design is saturated. When $m(q-1) > n-1$, the design does not have enough degrees of freedom for estimating all the factorial effects simultaneously and is called a supersaturated design, denoted by $S(n; q^m)$.

1.1 Definition of the discrepancy

Note that a U-type design may not be a good design. An important and popular measure of uniformity of U-type designs is the discrepancy. The discrepancy can be defined in terms of a kernel function. Let $\mathcal{X}$ be a measurable subset of $R^m$. A kernel function $K(\mathbf{x}, \mathbf{w})$ is a real-valued function defined on $\mathcal{X} \times \mathcal{X}$ that is symmetric in its arguments and non-negative definite:

$$K(\mathbf{x}, \mathbf{w}) = K(\mathbf{w}, \mathbf{x}), \quad \text{for any } \mathbf{x}, \mathbf{w} \in \mathcal{X}, \tag{1}$$

$$\sum_{i,j=1}^{n} a_i a_j K(\mathbf{x}_i, \mathbf{x}_j) \ge 0, \quad \text{for any } a_i \in R,\ \mathbf{x}_i \in \mathcal{X},\ i = 1, \dots, n. \tag{2}$$

In fact, $K(\mathbf{x}, \mathbf{w})$ is the reproducing kernel for some unique Hilbert space of real-valued functions on $\mathcal{X}$, but that property is not needed here. For a more detailed discussion of reproducing kernels, see refs. [28, 29].

Let $F_*$ denote the uniform distribution function over $\mathcal{X}$. Let $P = \{\mathbf{z}_1, \dots, \mathbf{z}_n\} \subseteq \mathcal{X}$ be a set of design points and let $F_n$ denote the associated empirical distribution,

$$F_n(\mathbf{x}) = \frac{1}{n} \sum_{\mathbf{z} \in P} 1_{\{\mathbf{z} \le \mathbf{x}\}},$$

where $1_A$ is the indicator function of $A$, and $\mathbf{z} = (z_1, \dots, z_m) \le \mathbf{x} = (x_1, \dots, x_m)$ means that $z_j \le x_j$ for all $j$.
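As a small illustration of the empirical distribution $F_n$ just defined, the following Python sketch counts the fraction of design points that are componentwise no larger than a given point. The function name `empirical_cdf` and the example point set are assumptions made for illustration, not part of the paper.

```python
def empirical_cdf(points, x):
    """Evaluate F_n(x) = (1/n) * #{z in points : z_j <= x_j for all j}.

    `points` is a non-empty list of m-dimensional design points and `x` is an
    m-dimensional point. (Hypothetical helper for illustration only.)
    """
    n = len(points)
    # A point z contributes to the sum only if every coordinate satisfies z_j <= x_j.
    hits = sum(1 for z in points if all(zj <= xj for zj, xj in zip(z, x)))
    return hits / n

# Illustrative point set: the full 2 x 2 factorial with levels 1 and 2.
P = [(1, 1), (1, 2), (2, 1), (2, 2)]

print(empirical_cdf(P, (1, 2)))  # 0.5: only (1, 1) and (1, 2) are <= (1, 2)
print(empirical_cdf(P, (2, 2)))  # 1.0: every point is <= (2, 2)
```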
For a given kernel function $K(\mathbf{x}, \mathbf{w})$, the discrepancy of $P$ is defined by[30]

$$D(P; K) = \left\{ \int_{\mathcal{X}^2} K(\mathbf{x}, \mathbf{w})\, d[F_*(\mathbf{x}) - F_n(\mathbf{x})]\, d[F_*(\mathbf{w}) - F_n(\mathbf{w})] \right\}^{1/2}$$
$$= \left\{ \int_{\mathcal{X}^2} K(\mathbf{x}, \mathbf{w})\, dF_*(\mathbf{x})\, dF_*(\mathbf{w}) - \frac{2}{n} \sum_{\mathbf{z} \in P} \int_{\mathcal{X}} K(\mathbf{x}, \mathbf{z})\, dF_*(\mathbf{x}) + \frac{1}{n^2} \sum_{\mathbf{z}, \mathbf{z}' \in P} K(\mathbf{z}, \mathbf{z}') \right\}^{1/2}. \tag{3}$$

From this definition, it is clear that the discrepancy measures how far the empirical distribution $F_n$ is from $F_*$. From a uniformity point of view, for a fixed number of points $n$, a design with low discrepancy is preferable[31].

For any q-level design $X \in U(n; q^m)$, the domain $\mathcal{X} = \{1, \dots, q\}^m$ comprises all possible level combinations of the $m$ factors, and $F_*$ simply assigns probability $q^{-m}$ to each member of $\mathcal{X}$. Let

$$\tilde{K}(x, w) = \begin{cases} a & \text{if } x = w, \\ b & \text{if } x \ne w, \end{cases} \quad \text{for } x, w \in \{1, \dots, q\},\ a > b > 0, \tag{4}$$

$$K_d(\mathbf{x}, \mathbf{w}) = \prod_{j=1}^{m} \tilde{K}(x_j, w_j), \quad \text{for any } \mathbf{x}, \mathbf{w} \in \mathcal{X}. \tag{5}$$

Then $K_d(\mathbf{x}, \mathbf{w})$ is a kernel function satisfying conditions (1) and (2). The corresponding discrete discrepancy can be used for measuring the uniformity of factorial design points. Note that the existing discrepancies mentioned above are defined on a unit hypercube and are used for measuring the uniformity of points corresponding to continuous variables. For a factorial design, however, the factor levels are discrete rather than continuous, and a q-level factor needs $q-1$ degrees of freedom for the estimation of its main effects, not one degree of freedom as for a continuous variable. The discrete discrepancy defined here is partly motivated by this consideration.

Given the kernel $K_d(\mathbf{x}, \mathbf{w})$ defined by (4) and (5), the discrete discrepancy $D(P; K_d)$ can be computed from (3) as follows:

$$D^2(P; K_d) = -\left[ \frac{a + (q-1)b}{q} \right]^m + \frac{1}{n^2} \sum_{k,l=1}^{n} \prod_{j=1}^{m} \tilde{K}(z_{kj}, z_{lj}), \tag{6}$$

where $\mathbf{z}_k = (z_{k1}, \dots, z_{km})$.

1.2 The lower bound of $D^2(P; K_d)$

For any $X \in U(n; q^m)$, let $\lambda_{kl}$ be the number of coincidences between the kth and lth rows $\mathbf{x}_k$ and $\mathbf{x}_l$, i.