<<

European Journal of Operational Research 215 (2011) 629–638

Contents lists available at ScienceDirect

European Journal of Operational Research

journal homepage: www.elsevier.com/locate/ejor

Stochastics and Generating and improving orthogonal designs by using mixed integer programming

a, b a a Hélcio Vieira Jr. ⇑, Susan Sanchez , Karl Heinz Kienitz , Mischel Carmen Neyra Belderrain a Technological Institute of Aeronautics, Praça Marechal Eduardo Gomes, 50, 12228-900, São José dos Campos, Brazil b Naval Postgraduate School, 1 University Circle, Monterey, CA, USA article info abstract

Article history: Analysts faced with conducting involving quantitative factors have a variety of potential Received 22 November 2010 designs in their portfolio. However, in many experimental settings involving discrete-valued factors (par- Accepted 4 July 2011 ticularly if the factors do not all have the same number of levels), none of these designs are suitable. Available online 13 July 2011 In this paper, we present a mixed integer programming (MIP) method that is suitable for constructing orthogonal designs, or improving existing orthogonal arrays, for experiments involving quantitative fac- Keywords: tors with limited numbers of levels of interest. Our formulation makes use of a novel linearization of the Orthogonal design creation correlation calculation. The orthogonal designs we construct do not satisfy the definition of an orthogonal array, so we do not Statistics advocate their use for qualitative factors. However, they do allow analysts to study, without sacrificing balance or , a greater number of quantitative factors than it is possible to do with orthog- onal arrays which have the same number of runs. Ó 2011 Elsevier B.V. All rights reserved.

1. Introduction 1999, p. 23). Orthogonality is important because without this prop- erty it is impossible to model the effect of one factor independently One of the few tools used in almost all branches of science is of the other factors. The orthogonality definition given above guar- experimental testing. Areas as diverse as medicine, chemistry, psy- antees that even qualitative factors can be analyzed. chology, engineering and biology have been using experiments to The study of orthogonal arrays was introduced by Rao (1945) in develop and prove their theories. Although experimental design the 1940s. The main techniques used to construct orthogonal ar- originated with the need for experiments in agriculture, it has been rays are: (1) Galois fields and finite geometries; (2) Codes; (3) Dif- widely used in many other domains, such as manufacturing and ference Schemes; (4) Hadamard Matrices; (5) orthogonal Latin industrial applications, clinical trials, military test and evaluation squares and F-Squares. For a review of orthogonal design creation, (T&E), and simulation. Hedayat et al. (1999, p. 298) credit we refer the reader to Hedayat et al. (1999). The study of how to approaches from Taguchi, in the main, and Deming, as leading to improve or create designs is still an active research topic, as can the ‘‘explosive increase in the use of design for product quality be seen, for example, in Lejeune (2003), Kleijnen (2005), van Beers improvement and monitoring in industrial processes...’’. and Kleijnen (2008) and Hernandez et al. (2011). Taguchi’s most important contribution to quality improvement In this paper, we limit the scope of our studies to problems in techniques is his emphasis on investigating both mean perfor- which all the factors are quantitative (either continuous or dis- mance and performance variation (Koch, 2002, p. 2). To this end, crete). If an array has the property that the maximum absolute Taguchi created several orthogonal arrays. These designs, which pairwise correlation between any two columns equals zero, we have been widely used and are incorporated in several statistical will refer to this as an orthogonal design. Note that orthogonal software packages, are described in more detail in Section 3. arrays (under the classical definition of Dey and Mukerjee (1999) Orthogonal arrays have several definitions according to the area seen above) are, by construction, orthogonal designs—but there of application. A common definition is the following: ‘‘An orthogo- are orthogonal designs that are not orthogonal arrays. nal array OA (N,n,m mn,g), having N rows, n(P2) columns, Analysts faced with conducting experiments involving quanti- 1 ÂÁÁÁ m m (P2) symbols and strength g( n), is a N n array, tative factors have a variety of potential designs in their portfolio. 1 ÂÁÁÁ n 6  with elements in the ith column from a set of mi distinct symbols This includes orthogonal arrays; two-level designs such as factorial (1 6 i 6 n), in which all possible combinations of symbols appear and fractional factorial designs, or Plackett–Burman designs; equally often as rows in every N g subarray’’ (Dey and Mukerjee, central composite designs; and space-filling designs like Latin  hypercubes. One reason for promoting two-level designs is their efficiency in terms of minimizing the variances of the estimated Corresponding author. Tel.: +55 12 39473671. ⇑ E-mail addresses: [email protected] (H. Vieira), [email protected] effects. Central composite designs that extend two-level designs (S. Sanchez), [email protected] (K.H. Kienitz), [email protected] (M.C.N. Belderrain). leverage this efficiency, as well as allowing quadratic effects to

0377-2217/$ - see front matter Ó 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2011.07.005 630 H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638 be explored. Space-filling designs are advantageous when little is the construction method works best with fewer groups of discrete known about the nature of the response surface, and when factor factors. combinations in the interior of the region of exploration are of po- Our approach does allow direct construction of mixed designs. tential interest. We generalize Hernandez’s continuous-factor MIP formulation to However, in many experimental settings involving discrete-val- deal with mixed designs, and to assure the balance and space-filling ued factors, none of these designs are suitable. Designs for the spe- properties that good designs should have. A design is said to be bal- cific combination of factor levels may not be readily available— anced if, for each factor, the number of objects in each of its levels is particularly if the factors do not all have the same number of levels. equal. The balance property is important from a mathematical per- The orthogonal arrays defined and constructed as above may re- spective because it allows correct analysis of heteroscedastic exper- quire more runs than the allowed budget, in part because the fac- iments (see Bathke, 2004). Balance is also desirable from a practical tor levels may mandate a large number of runs (estimating main perspective on several counts, particularly for exploratory investi- effects requires c 1 degrees of freedom for a qualitative factor gations over broad factor ranges where linearity and constant var- À with c levels, while a single degree of freedom suffices for a quan- iance assumptions are often inappropriate. First, it is rare for an titative factor). At the same time, focusing on two-level designs (or analyst to know a priori the relationship between the factors and even central composite designs) may be undesirable because they the response variability if they do not already know the relationship do not sample at all levels of interest for each factor, and the space- between the factors and the expected response, so constructing a filling designs in the literature are intended for continuous-valued design that accommodates known heteroscedastic behavior is gen- factors or factors with the same number of levels. Rounding the erally not an option. By using a balanced design, an analyst need not specified levels to accommodate discrete-valued factors can ad- make assumptions about the nature of the response variability. Sec- versely affect the orthogonality properties of the design (Sanchez ond, exploratory studies often involve more than one potential re- and Wan, 2009). A flexible method for constructing efficient sponse, and it may not be apparent ahead of time which one(s) orthogonal designs is needed. will wind up being of most interest. For example, in a disaster relief In this paper, we present a mixed integer programming (MIP) scenario, the primary response might be the total amount of food method that is suitable for constructing orthogonal designs. We and medical supplies distributed, and the factors might represent also show cases where our approach can improve existing orthog- various delivery resource levels and protocols. However, if the onal arrays. Our formulation generalizes and extends that of Her- alternatives are similar in terms of the total goods distributed, the nandez (2008), and makes use of a novel linearization of the analyst might then be interested in examining results from the correlation calculation. We describe our approach in Section 2, same design for other responses, such as the time or cost associated and present some numerical results in Section 3. In Section 4 we with the disaster relief efforts. Third, balance makes it easier for an summarize our findings and briefly discuss the benefits of this analyst to focus on specific region(s) of the design space that appear approach. interesting. For example, local analyses of regions where trade-offs between cost and effectiveness can be examined in detail may be more informative than global analyses that show effectiveness 2. Our approach tends to improve as resource levels increase. Balance and space-filling behavior are both useful design char- We base our approach on the work of Hernandez (2008) and acteristics when there is an interest in the interior of the region of Hernandez et al. (2011), who use MIP to create nearly orthogonal exploration or when little is known about the nature of the re- Latin Hypercubes with any number of design points n, provided sponse surface. This latter situation (when the response surface that the designs are non-saturated (i.e., there are fewer than n col- is unknown) is a common concern in the literature, leading to umns). They start with a randomly-generated Latin hypercube, the following statement by Santner et al. (2003, p. 124): ‘‘[B]ecause where each column vector’s values are the integers 1 to n (without we don’t know the true relation between the response and inputs, ties), which are considered to be the ranked values of the continu- designs should allow one to fit a variety of models and should pro- ous factors. The use of non-tied ranks allow them to express the vide about all portions of the experimental region.’’ standard sample pairwise correlation formula as a linear function Balanced, space-filling designs have this desirable characteristic, of the n values for a single column, when all other columns are while classical designs (such as 2k factorials, central composite de- fixed. Their MIP iteratively selects a single column to optimize, signs, balanced incomplete block designs, etc.) do not. and converts randomly-generated Latin hypercubes to nearly orthogonal (or in some cases, orthogonal) Latin hypercubes. While the work of Hernandez (2008) and Hernandez et al. 2.1. Mathematical formulation (2011) greatly expands the catalogue of OLHs and NOLHs available Given a design matrix M =[arc]n j, the sample pairwise correla- to experimenters, it does not accommodate mixed designs (matri-  ces in which factors have different number of levels—e.g., factor 1 tion between the columns x and y is given by (1) has 10 levels, factor 2 has 4 levels, factor 3 has 6 levels, etc.). n r 1 arx x ary y Although Hernandez (2008) discusses mixed designs, he does not q ¼ ð À Þð À Þ 1 xy ¼ n 1 s s ð Þ propose a direct way to construct them using an optimization ap- P ð À Þ x y proach. Instead, he suggests using his MIP formulation to construct designs for small subsets of non-binary discrete variables (possible where arc is the (r,c)th entry of the matrix M which has n rows and j when these are grouped by their number of levels), as well as a  1 n  1 n 1 n  2 columns, x n r 1arx; y n r 1ary; sx n 1 r 1 arx x ; ¼ ¼ ¼ ¼ ¼ À ¼ ð À Þ base design for all continuous-valued factors. He randomly stacks qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 n P  2 P P the small designs, joins these with the base design, randomly adds and sy n 1 r 1 ary y . ¼ À ¼ ð À Þ columns for binary factors through random permutations of {0,1}, If oneq considersffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiP the values of arc as unknowns to be determined and measures the resulting maximum pairwise correlation; the by a mathematical programming approach, the standard sample base design may need to be altered, or variations of the larger com- pairwise correlation formula makes the problem non-linear in com- bined design may need to be constructed and stacked, in order to plicated ways: in the numerator we have multiple terms of the form achieve the desired level of near-orthogonality. Hernandez a a , and the standard deviations in the denominator are also rx  ry acknowledges that these steps ‘‘require more specificity’’ and that non-linear in the arx and ary. As, in general, linear problems can be H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638 631 solved much more readily than non-linear ones, we make some Here, the design matrix under analysis has n rows and j columns; changes enabling the problem to be solved using linear techniques. column x is the one under optimization; the decision variable xr is The approach we adopt to overcome the numerator non-linear- the value of the rth row of column x (arx); the binary decision vari- ity follows that of Hernandez (2008); we fix the values of all factors ables hrl have value 1 if xr = l and 0 otherwise; v maxy–x qà is ¼ xy but one and optimize this one, non-fixed factor. To address the proportional to the maximum absolute pairwise correlation be- non-linearity in the denominator, we optimize a transformation tween column x and all the other columns; bx is the number of lev- of the correlation coefficient rather than the coefficient itself. These els of column x; sc and c are, respectively, the standard deviation simple—but novel—changes allow us to reformulate the problem and mean of the values arc of column c. into a MIP that is computationally tractable. There are seven sets of constraints. The first two constraint sets Let z denote the column values of x that minimize the maxi- ensure that v P qà and v P qà , i.e., v P qà regardless of the xy À xy j xyj mum absolute pairwise correlation between column x and all the sign of qxyà , for all y – x. The third constraint set assures that only T other columns. Mathematically, z =(a1x a2x ... anx) = arg min one of the bx levels will be assigned to xr. The translation from bin- maxy–x qxy , where k means the absolute value of k. ary indicator values to equivalent integer factor levels (i.e., from hrl j j j j We define qxyà as: to xr) is accomplished by the fourth constraint set. The design bal- n ance is guaranteed by the fifth constraint set, which requires that r 1 arx x ary y qxyà qxy n 1 sx ¼ ð À Þð À Þ 2 the number of times factor x assumes each of its possible values ¼ ð À Þ ¼ sy ð Þ P is constant. The last two constraint sets ensure that hrl can assume If we constrain the vector z to be balanced, the order in which its only the values 0 or 1, and that xr can assume only positive values. components are arranged (e.g., as a result of the optimization pro- If the experimenter wishes discrete values that are not equally cess) does not change the value of its standard deviation sx. This spaced (e.g., one factor with 4 levels: 1, 2, 5 and 11), then it is nec- essary to replace the fourth constraint of (3) by (4), where means that qà q , and that optimizing zà arg min maxy–x qà xy / xy ¼ xy p p1; p2; ...; pb is the set of the bx levels that the factor x is equivalent to optimizing z = arg min maxy–x qxy . In the process ¼ f x g j j can assume. of optimizing arg min maxy–x qxyà , the values of sy, ary and y will be constants for y – x. This fact makes z⁄ a linear function of the bx integer-valued entries in column x. The optimization of z⁄ becomes xr plhrl; r 1; 2; ...; n 4 a MIP problem, with the formulation given in Eq. (3). ¼ l 1 ¼ ð Þ X¼ Min v The use of mathematical programming for the construction of n 1 n  designs is not new: Lee (2000) shows examples of using semidefi- r 1 xr n k 1xk arc c s:t: v P ¼ À ¼ ð À Þ; c 1;2;...;x 1;x 1;...;j s ¼ À þ nite programming for generating E-,D- and A-optimal designs; P ÀÁPc n 1 n  Jones and Nediak (2005) use non-linear programming to create r 1 xr n k 1xk arc c v P ¼ À ¼ ð À Þ; c 1;2;...;x 1;x 1;...;j D-optimal designs; Appa et al. (2006) combine integer program- À s ¼ À þ P ÀÁPc ming and constraint programming to find mutually orthogonal La- bx tin squares; van Dam et al. (2009) and van Dam et al. (2010) use hrl 1; r 1;2;...;n l 1 ¼ ¼ MIP to, respectively, find bounds and create Maximin Latin Hyper- X¼ bx cube Designs; non-linear mixed integer programming was used to xr lhrl; r 1;2;...;n construct balanced incomplete block designs by Yokoyaa and Yam- ¼ ¼ l 1 ¼ ada (2011); and, as previously mentioned, Hernandez (2008) cre- n X n ates nearly orthogonal Latin hypercubes by using mixed integer hrl ; l 1;2;...;bx r 1 ¼ bx ¼ programming. However, to the best of our knowledge, the transfor- X¼ mation of the correlation function we use to linearize our MIP is hrl 0;1 ; r 1;2;...;n; l 1;2;...;bx 2f g ¼ ¼ unique, as is our ability to construct new orthogonal designs, and xr P 0; r 1;2;...;n ¼ improve existing orthogonal arrays, to obtain efficient, space-fill- 3 ð Þ ing, balanced designs for discrete-valued factors.

Table 1 Designs created or improved by using MIP.

Name Status Number of rows Maximum number Number of Max. # of factors by # of levels of factors factors added 23456789 H15a New 15 9 – – 9 – – – – – – H15b Improved 15 8 6 – 7 – 1 – – – – H16a Improved 16 15 6 9 – 6 – – – – – H16b Improved 16 15 10 5 – 10 – – – – – H18a New 18 7 – – 4 – – 1 – – 2 H18b New 18 6 – – 2 – – – – – 4 H18c New 18 6 – – – – – – – – 6 H18d Improved 18 10 2 1 9 – – – – – – H21a New 21 14 – – 11 – – – 3 – – H21b New 21 6 – – – – – – 6 – – H24a New 24 9 – – – 2 – 7 – – – H24b New 24 8 – – 1 1 – – – 6 – H24c Improved 24 20 5 13 4 3 – – – – – H27 Improved 27 17 4 – 17 – – – – – – H30a New 30 12 – – 11 – 1 – – – – H30b New 30 11 – – 5 – 5 1 – – – H36 Improved 36 20 3 17 1 1 – – – – 1

Max. means ‘‘maximum’’ and # means ‘‘number’’. 632 H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638

Table A.1 Design H15a (39).

Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 Factor 7 Factor 8 Factor 9 3 levels 3 levels 3 levels 3 levels 3 levels 3 levels 3 levels 3 levels 3 levels 312313222 113232211 332331331 333221112 121111321 121322121 213211333 231312213 233123312 122333323 212231123 132122133 223213131 321133232 311122212

Table A.4 Design H18c (96). Table A.2 Design H18a (346192). Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 9 levels 9 levels 9 levels 9 levels 9 levels 9 levels Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 Factor 7 3 levels 3 levels 3 levels 3 levels 6 levels 9 levels 9 levels 569421 274894 1311194 175245 2323129 618636 1113573 945989 1131237 426764 1121448 351196 3232391 889389 1213512 423318 3112285 967172 3133233 593843 3211656 297727 3221614 736533 2222122 881418 2323468 314657 2122377 752951 3312346 148575 2233689 632262 2331455 1332561

Table A.5 Design H21a (31173).

F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 7L 7L 7L Table A.3 Design H18b (3294). 2223323333 2 1 4 1 3322121321 3 3 3 7 Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 2133231322 2 7 7 4 3 levels 3 levels 9 levels 9 levels 9 levels 9 levels 3331321121 1 1 4 2 137656 1311332332 2 7 1 4 231874 2223223121 3 4 2 1 131215 3122311133 3 6 5 3 138729 3221122213 3 7 2 2 229931 1231312213 3 2 3 7 225187 2313232213 3 1 7 5 212887 1332112332 2 4 4 2 212921 2122121312 1 2 3 1 217346 1232213212 1 6 6 5 326463 3111213331 2 3 7 6 124492 2133332221 2 4 1 7 119252 1313211121 2 6 6 4 316548 3213213212 1 5 1 5 325739 2211332212 1 5 6 3 334363 1121133121 3 3 5 3 338695 3332133133 1 5 5 6 113578 1112121133 1 2 2 6 323114 Fx means Factor x and yL means y Levels. H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638 633

2.2. Implementation In either case, if the formulation in (3) returns a v⁄ – 0, then no new column with the desired levels exists that is orthogonal to the Our proposal can be used to generate new orthogonal designs, existing columns. However, this does not mean that it is impossible or to improve existing ones by adding more columns. to find new orthogonal columns with different desired levels. For

In order to improve an orthogonal design O =[arc]n j 1, one has example, suppose that after three steps, we construct a design with T  À to make M =[OX]n j, where X =(x1 x2 ...xn) is an n 1 matrix, and four two-level orthogonal columns, but the formulation (3) returns   apply formulation (3). If the original design matrix is orthogonal a v⁄ – 0 in the fourth step. This fact means that no new two-level and, after applying (3), the optimized và maxy qà 0, then a orthogonal column can be added to the design. However, this does ¼ xy ¼ new orthogonal column has been found. If this column is added not mean that a new three-, four-, five-, etc., level orthogonal col- to the original orthogonal design, the new design will still be an umn may not exist. orthogonal one. We remark that with this iterative design construction, our for- To generate a new design, one can randomly create a single-col- mulation in (3) could be simplified so the column under optimiza- umn matrix with the desired levels and, sequentially, add new tion (x) is always column j. We leave the more general formulation orthogonal columns to it until the formulation (3) is unable to to accommodate situations where the analyst seeks to modify an continue. existing design without starting from scratch if the desired number of levels for one (or more) of the factors change.

Table A.6 3. Numerical results Design H21b (76).

Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 3.1. New designs 7 levels 7 levels 7 levels 7 levels 7 levels 7 levels 246655 Sloane (2007) maintains a web page with an extensive library of 211773 known orthogonal designs (‘‘arrays’’ in Sloane’s web-page termi- 676252 nology). The only designs listed by him with: 515147 653524 461163 Fifteen runs is the complete factorial with one three-level factor  152631 and one five-level factor; 623365 Eighteen runs are the MA.18.3.6.6.1 (six three-level factors and 757776  474427 one six-level factor) and the OA.18.7.3.2 (seven three-level 172257 factors); 563434 Twenty-four runs are:  334443 – OA (24, 22041) with twenty-two-level factors and one four- 722716 level factor; 365614 115326 – MA.24.2.13.3.1.4.1 with thirteen two-level factors, one 546572 three-level factor and one four-level factor; 427111 – OA (24, 212121) with twelve two-level factors and one 334265 twelve-level factor; 731331 11 1 1 247542 – OA (24, 2 4 6 ) with eleven two-level factors, one four-level factor and one six-level factor.

Table A.7 Design H24a (4267).

Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 Factor 7 Factor 8 Factor 9 4 levels 4 levels 6 levels 6 levels 6 levels 6 levels 6 levels 6 levels 6 levels 124125353 332362414 333156532 416324433 221155225 232614166 223141151 413666266 144336131 135463544 421441361 425215243 314221416 222532651 233411544 245522115 446244655 315443323 116653322 134233666 346654442 111536624 342362235 441515512 634 H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638

Table A.8 Design H24b (314186).

Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 Factor 7 Factor 8 3 levels 4 levels 8 levels 8 levels 8 levels 8 levels 8 levels 8 levels 12388582 32227172 11343837 13468822 23654336 22634475 22857111 14533364 21321771 24475646 34563358 12725654 23755548 34486884 22736636 34111723 14841283 31644465 31872513 33518745 11272458 31266267 23182211 13117127

Table A.9 Table A.10 Design H30a (31151). Design H30b (355561).

F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 5L 3L 3L 3L 3L 3L 5L 5L 5L 5L 5L 6L 2111133123 2 4 1221154532 1 1111212222 1 2 3212114222 5 2331322322 3 1 1122311225 3 3213311133 2 4 2232341334 4 1232333131 3 3 3132323334 4 2311111131 2 3 1213351412 6 3323223223 2 1 1311153335 3 3323232331 2 2 1312213224 3 2331213132 2 5 2212222452 5 1112321222 3 4 2223254253 4 3123321212 2 4 2322335512 6 3131231331 2 4 3333132342 5 2323232222 1 5 3121151111 6 2133111131 3 2 1132135343 5 3121333111 1 1 2233141115 1 1212232222 3 2 3111222444 2 1133113333 1 1 2333325111 1 2221113311 2 5 2111235111 2 1212323333 3 3 3213235155 6 3213113311 3 4 3321322441 2 1313132111 2 2 2133123541 1 3321312213 3 3 3131355525 4 2123323222 1 5 2221213244 2 2232222222 1 5 3313244525 1 2322222222 1 3 2222343253 3 3132122113 3 1 1331244153 4 1232131313 3 5 3322131433 3 1332221113 1 3 1123114524 5 3211131333 1 2 1331212433 6 1222311311 1 1 1113342351 2

Fx means Factor x and yL means y Levels. Fx means Factor x and yL means y Levels.

Also, Sloane lists no twenty-one-run design and no thirty-run 1. Design H15a: Nine three-level factors; design. 2. Design H18a: Four three-level factors, one six-level factor By using our MIP method, we create one more orthogonal de- and two nine-level factors; sign with fifteen runs, three with eighteen runs, two with 3. Design H18b: Two three-level factors and four nine-level twenty-one runs, two with twenty-four runs and two with thirty factors; runs: 4. Design H18c: Six nine-level factors; H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638 635

5. Design H21a: Eleven three-level factors and three seven- With our approach, we are able to add: level factors; 6. Design H21b: Six seven-level factors; 1. Six more three-level factors to the fifteen-run complete factorial 7. Design H24a: Two four-level factors and seven six-level design; factors; 2. Three more two-level factors and three more four-level factors 8. Design H24b: One three-level factor, one four-level factor to the MA.16.2.6.4.3 design; and six eight-level factors; 3. Three more three-level factors and two more four-level factors 9. Design H30a: Eleven three-level factors, and one five-level to the MA.24.2.13.3.1.4.1. factor; 10. Design H30b: Three three-level factors, four five-level fac- Kuhfeld (2010) also maintains a home-page with several known tors and one six-level factor. orthogonal designs. Our approach is able to improve the design OA (36, 91216,2) — proposed by Kuhfeld and Suen (2005) — by adding 3.2. Improved designs one more two-level factor, one more three-level factor and one more four-level factor. Sloane (2007) lists the following designs in his previously men- Additionally, we are able to improve some of Taguchi’s designs tioned home-page: by adding:

Fifteen-run complete factorial with one three-level factor and 5. Five two-level factors and five four-level factors to L16b design,  one five-level factor (as seen before); which originally had five four-level factors; Sixteen-run MA.16.2.6.4.3 with six two-level factors and three 6. Two three-level factors to L18 design, which originally had one  four-level factors; two-level factor and seven three-level factors; Twenty-four-run MA.24.2.13.3.1.4.1, with thirteen two-level 7. Four three-level factors to L27 design, which originally had thir-  factors, one three-level factor and one four-level factor (as seen teen three-level factors. before).

Table A.11 Design H15b (3751).

Original design Factors added Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6 Factor 7 Factor 8 5 levels 3 levels 3 levels 3 levels 3 levels 3 levels 3 levels 3 levels 11231131 12113112 13211333 21222321 22132212 23223322 31323213 32221223 33333132 41312321 42331213 43312121 51112133 52133332 53121211

Table A.12 Design H16a (2946).

Original Design Factors added F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 2L 2L 2L 2L 2L 2L 4L 4L 4L 2L 2L 2L 4L 4L 4L 1111111111 2 1 2 3 3 2212121332 2 2 1 3 4 1222213131 1 2 1 4 3 2121223312 1 1 2 4 4 1122221441 2 1 3 3 2 2221211222 2 2 4 3 1 1211123421 1 2 4 4 2 2112113242 1 1 3 4 1 2211224141 2 1 3 2 3 1112214322 2 2 4 2 4 2122122121 1 2 4 1 3 1221112342 1 1 3 1 4 2222114411 2 1 2 2 2 1121124232 2 2 1 2 1 2111212431 1 2 1 1 2 1212222212 1 1 2 1 1

Fx means Factor x and yL means y Levels. 636 H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638

Table A.13 Design H24c (2133443).

Original design Factors added F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16 F17 F18 F19 F20 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 3L 4L 3L 3L 3L 4L 4L 1111111111 1 1 1 1 1 3 1 2 2 2 1221122212 1 2 2 1 1 1 1 1 2 2 1212112221 2 1 1 1 3 1 3 1 4 1 1121211222 2 2 2 1 3 3 3 2 1 1 1222111122 1 1 2 2 2 2 2 3 4 4 1212221112 1 2 1 2 3 2 2 3 3 1 1221212111 2 1 1 2 4 2 2 3 1 3 1122121211 2 2 2 2 4 2 2 3 4 3 1122222121 1 1 2 3 1 1 3 2 1 2 1112212212 1 2 1 3 2 3 3 1 3 4 1211221221 2 1 2 3 2 3 1 1 3 4 1111122122 2 2 1 3 4 1 1 2 2 3 2222222222 1 1 1 1 4 3 1 2 2 3 2112211121 1 2 2 1 4 1 1 1 2 3 2121221112 2 1 1 1 2 1 3 1 4 4 2212122111 2 2 2 1 2 3 3 2 1 4 2111222211 1 1 2 2 3 2 2 3 4 1 2121112221 1 2 1 2 2 2 2 3 3 4 2112121222 2 1 1 2 1 2 2 3 1 2 2211212122 2 2 2 2 1 2 2 3 4 2 2211111212 1 1 2 3 4 1 3 2 1 3 2221121121 1 2 1 3 3 3 3 1 3 1 2122112112 2 1 2 3 3 3 1 1 3 1 2222211211 2 2 1 3 1 1 1 2 2 2

Fx means Factor x and yL means y Levels.

Table A.14 Design H36 (217314191).

Original design Factors added F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16 F17 F18 F19 F20 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 2L 9L 2L 3L 4L 1111112112 1 1 2 2 1 1 2 1 1 1 1111121111 1 2 1 2 1 2 1 2 2 1 1112111222 2 1 1 2 2 1 1 2 2 3 1112222122 2 2 2 1 2 2 2 2 3 4 1121122221 2 2 2 1 1 2 5 1 1 2 1121212121 2 1 1 2 1 2 8 2 3 4 1121221122 2 1 1 1 2 1 4 1 2 1 1122211212 1 2 1 1 1 1 5 2 1 4 1122222211 1 2 2 2 2 1 3 1 2 3 1211211211 1 1 2 1 2 2 4 2 1 3 1211212121 1 2 1 1 2 1 9 1 3 1 1212121212 2 2 2 1 2 1 8 1 3 2 1212122211 2 1 1 1 1 2 9 2 2 1 1212211111 2 2 2 2 1 1 6 2 2 2 1221112212 2 1 1 2 2 2 3 1 3 2 1221221222 1 1 2 2 1 2 6 2 3 3 1222111122 1 2 1 1 1 2 7 1 1 4 1222122121 1 1 2 2 2 1 7 1 1 4 2111212212 2 1 2 1 1 1 7 1 2 4 2111221211 2 2 1 2 2 2 7 1 2 4 2112112221 1 2 2 2 1 2 4 1 3 3 2112122222 1 1 1 1 2 2 6 2 1 2 2112221121 1 1 1 1 1 1 3 1 3 2 2121111222 1 2 2 2 2 1 9 2 3 1 2121112111 2 2 1 1 2 1 6 2 1 3 2122111111 1 1 2 1 2 2 8 2 3 2 2122221112 2 1 2 2 1 2 9 1 1 1 2211111122 2 2 2 1 1 2 3 1 2 3 2211121121 2 1 2 2 2 1 5 2 1 4 2211222222 1 2 1 2 1 1 8 2 1 2 2212212112 1 1 1 2 2 2 5 1 2 3 2221121211 1 1 1 1 1 1 2 1 3 4 2221222112 1 2 2 1 2 2 1 2 2 2 2222122112 2 2 1 2 1 1 4 2 3 3 2222211221 2 2 1 2 2 2 2 1 1 1 2222212221 2 1 2 1 1 1 1 2 2 1 H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638 637

All our designs are provided in Appendix A. The times required at different stations in a production facility, the number of vehicles to solve the MIPs associated with these designs ranged from less of different types within a convoy, the days of supply of different than five minutes to more than two hours on a laptop computer types of commodities carried by these convoy vehicles for disaster using CPLEX without the parallel mode. The computational relief efforts, and the sizes of families within a simulation of social requirements are clearly problem driven; problems involving large networks, are just a few examples of discrete-valued factors with numbers of design points with small numbers of factors tend to be limited numbers of levels that may arise in simulation solved quickly, while those involving small numbers of design experiments. points and large numbers of factors may take substantially longer. We remark that the orthogonal designs we construct do not sat- We believe that using CPLEX with the parallel mode could substan- isfy the definition of an orthogonal array, so we do not advocate tially decrease the MIP solution time. Other potential approaches their use for qualitative factors. However, they do allow analysts to decreasing the solution time, such as adding valid inequalities, to study, without sacrificing balance or orthogonality, a greater could also be explored. One benefit of tabulating our designs is that number of quantitative factors than it is possible to do with they are readily available for use by those without access to opti- orthogonal arrays which have the same number of runs. In actual- mization software. ity, the benefits of being able to construct customized designs are even more pronounced for larger-scale experiments. Statisticians accustomed to conducting physical experiments might consider 4. Concluding remarks 15–20 factors to be a large number for exploration, except in a screening . However, the simulation setting is quite dif- We propose a new approach to the orthogonal design creation ferent, as Kleijnen et al. (2005) discuss. Among other things, simu- problem: to use the mixed integer programming technique to gen- lation models often have hundreds or thousands of inputs and erate new orthogonal designs, or to improve existing ones by add- ing more columns. The resulting designs are, by construction, Table A.16 balanced and space-filling ones. We illustrate this approach by cre- Design H18d (2139). ating ten new orthogonal designs and improving seven existing ones, including three well-known Taguchi designs. Table 1 summa- Original design Factors added rizes the designs created or improved in this work. F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 We have focused on the creation of designs for factors with 2L 3L 3L 3L 3L 3L 3L 3L 3L 3L three or more levels. Three levels is the minimum that ‘‘allows 121122333 3 the relationship between the response and the design factors to 111111111 1 be modeled as a quadratic [function]’’ (Montgomery, 2005, p. 348). 223123211 2 132321312 1 This is more useful, for many practical applications, than simply 131213232 1 identifying the important main effects. For example, determining 112222221 3 whether increasing a factor is associated with constant, diminish- 222312131 3 ing, or increasing rates of change in the response may help deci- 232131232 1 133132122 3 sion-makers make trade-offs between the cost and effectiveness 221231321 3 of various factor-level combinations. With a larger number of fac- 122233113 2 tor levels, response surfaces characterized by more-complex 213221133 1 behavior (such as thresholds) can be explored. 113333331 1 The orthogonal designs we show in Section 3 indicate that our 211332213 2 233212312 2 method can be used successfully to construct customized designs, 231323122 2 taking into account the total sampling budget, the number of 123311223 3 (quantitative) factors, and the number of levels of interest. The 212113323 2

flexibility of our approach is particularly valuable for computer Fx means Factor x and yL means y Levels. simulation experiments. The number of pieces of major equipment

Table A.15 Design H16b (25410).

Original design Factors added F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 4L 4L 4L 4L 4L 2L 2L 2L 2L 2L 4L 4L 4L 4L 4L 1333322121 1 4 3 4 1 3421322212 1 2 4 1 2 1222222121 4 1 3 1 4 4231412221 3 1 2 4 2 3243111122 2 1 4 2 1 1444411211 3 2 3 2 3 3134222212 4 3 4 4 3 2341221222 4 4 1 2 2 1111111211 2 3 3 3 2 4142321111 1 3 2 2 4 2432112112 2 2 1 4 4 4324112221 2 4 2 1 3 2123412112 3 3 1 1 1 2214321222 1 1 1 3 3 4413221111 4 2 2 3 1 3312411122 3 4 4 3 4

Fx means Factor x and yL means y Levels. 638 H. Vieira Jr. et al. / European Journal of Operational Research 215 (2011) 629–638

Table A.17 Design H27 (317).

Original design Factors added F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16 F17 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3L 3321213132 3 2 1 1 2 2 3 2123231231 2 3 1 2 2 3 2 1333333222 1 1 1 2 1 3 1 3132321321 3 2 1 2 1 1 3 1333222111 3 3 3 1 2 1 2 3213213321 1 3 2 1 3 2 1 3132213213 2 1 3 3 2 1 2 2231312123 2 3 1 3 1 3 1 1111111111 1 1 1 1 1 1 1 3321321213 1 3 2 1 3 2 2 1222222333 1 1 1 2 3 1 3 1333111333 2 2 2 2 2 1 2 1222333111 2 2 2 3 3 1 3 2312231123 3 1 2 2 3 2 1 1222111222 3 3 3 3 2 3 2 3213321132 2 1 3 2 2 2 2 2231123231 3 1 2 1 3 3 2 2312123312 2 3 1 3 2 2 2 1111222222 2 2 2 3 3 3 3 2123123123 1 2 3 1 1 3 3 2123312312 3 1 2 2 3 3 1 3213132213 3 2 1 2 1 2 3 3132132132 1 3 2 3 3 1 1 3321132321 2 1 3 3 1 2 1 1111333333 3 3 3 1 1 1 1 2312312231 1 2 3 3 1 2 3 2231231312 1 2 3 1 2 3 3

Fx means Factor x and y L means y Levels. embedded parameters that could be explored systematically, and Jones, D.H., Nediak, M.S., 2005. Optimal on-line calibration of testlets. In: Berger, the complexity of the response behavior make space-filling designs M.P.F., Wong, W.K. (Eds.), Applied Optimal Designs. John Wiley & Sons. Kleijnen, J.P.C., 2005. An overview of the design and analysis of simulation useful for uncovering and exploring the model’s response surfaces. experiments for sensitivity analysis. European Journal of Operational Research Others may assume that with enough computer processing power, 164, 287–300. it is simpler to examine all possible combinations of factor levels Kleijnen, J.P.C., Sanchez, S.M., Lucas, T.W., Cioppa, T.M., 2005. A user’s guide to the brave new world of designing simulation experiments. INFORMS Journal on than to construct a custom design. Yet even using the fastest com- Computing 17 (3), 263–289. puters in large computing clusters cannot overcome the ‘‘curse of Koch, P.N., 2002. Probabilistic design: optimizing for six sigma quality. In: 43rd dimensionality,’’ so a brute-force approach will not work when AIAA/ASME/ASCE/AHS Structures, Structural Dynamics, and Materials Conference, 4th AIAA Non-Deterministic Approaches Forum, pp. 1–11. the number of factors is large. These all suggest that having a Kuhfeld, W.F., 2010. Orthogonal Arrays. SAS. . 1 p of runs is very useful as part of a portfolio of designs that facilitate Kuhfeld, W.F., Suen, C.Y., 2005. Some new orthogonal arrays oa(4r,r 2 ,2). Statistics & Probability Letters 75, 169–178. large-scale simulation experiments (see Sanchez et al. (2010) for Lee, J., 2000. Semidefinite programming in experimental design. In: Wolkowicz, H., further discussion and applications). Saigal, R., Vandenberghe, L. (Eds.), Handbook of Semidefinite Programming, Our ongoing research focuses on the creation of larger orthogo- International Series in Operations Research and Management Science, vol. 27. nal designs (more than 100 factors); on the study of the imbalance Kluwer. Lejeune, M.A., 2003. Heuristic optimization of experimental designs. European effect on the achievement of design orthogonality; and on the Journal of Operational Research 147, 484–498. incorporation of qualitative factors into the designs. Montgomery, D.C., 2005. Design and Analysis of Experiments, 6th ed. John Wiley & Sons. Rao, C.R., 1945. Finite geometries and certain derived results in number theory. Appendix A. Designs created or improved by using our approach Proceedings of the National Institute of Sciences of India 11, 136–149. Sanchez, S.M., Wan, H., 2009. Better than a petaflop: the power of efficient experimental design. In: Rosseti, M.D., Hill, R.R., Johansson, B., Dunkin, A., Tables A.1, A.2, A.3, A.4, A.5, A.6, A.7, A.8, A.9, A.10, A.11, A.12, Ingalls, R.G. (Eds.), Proceedings of the 2009 Winter Simulation Conference, pp. A.13, A.14, A.15, A.16, A.17. 60–74. Sanchez, S.M., Lucas, T.W., Sanchez, P., Nannini, C.J., Wan, H., 2010. Designs for References large-scale simulation experiments, with applications to defense and homeland security. Design and Analysis of Experiments: Special Designs and Applications, vol. 3. John Wiley & Sons. Appa, G., Magos, D., Mourtos, I., 2006. Searching for mutually orthogonal Latin Santner, T.J., Williams, B.J., Notz, W.I., 2003. The Design and Analysis of Computer squares via integer and constraint programming. European Journal of Experiments. Springer. Operational Research 173, 519–530. Sloane, N.J., 2007. Orthogonal arrays. .  Bathke, A., 2004. The ANOVA F test can still be used in some balanced designs with van Beers, W.C.M., Kleijnen, J.P.C., 2008. Customized sequential designs for random unequal variances and nonnormal data. Journal of Statistical Planning and simulation experiments: Kriging metamodeling and bootstrapping. European Inference 2, 413–422. Journal of Operational Research 186, 1099–1113. Dey, A., Mukerjee, R., 1999. Fractional Factorial Plans. John Wiley & Sons. van Dam, E.R., Rennen, G., Husslage, B., 2009. Bounds for maximin Latin hypercube Hedayat, A.S., Sloane, N.J., Stufken, J., 1999. Orthogonal Arrays: Theory and designs. Operations Research 57 (3), 595–608. Applications. Springer-Verlag. van Dam, E.R., Husslage, B., Hertog, D., 2010. One-dimensional nested maximin Hernandez, A.S., 2008. Breaking barriers to design dimensions in nearly orthogonal designs. Journal of Global Optimization 46 (2), 287–306. Latin hypercubes. Ph.D. Thesis, Naval Postgraduate School. Yokoyaa, D., Yamada, T., 2011. A mathematical programming approach to the Hernandez, A.S., Carlyle, M., Lucas, T.W., 2011. Enabling nearly orthogonal Latin construction of bibds. International Journal of Computer Mathematics 88 (5), hypercube construction for any non-saturated run-variable combination. 1067. Working paper, Operations Research Department, Naval Postgraduate School.