Optimization Methods
From Theory to Design Scientific and Technological Aspects in Mechanics
Bearbeitet von Marco Cavazzuti
1. Auflage 2012. Buch. xviii, 262 S. Hardcover ISBN 978 3 642 31186 4 Format (B x L): 15,5 x 23,5 cm Gewicht: 584 g
Weitere Fachgebiete > Mathematik > Numerik und Wissenschaftliches Rechnen > Angewandte Mathematik, Mathematische Modelle Zu Inhaltsverzeichnis
schnell und portofrei erhältlich bei
Die Online-Fachbuchhandlung beck-shop.de ist spezialisiert auf Fachbücher, insbesondere Recht, Steuern und Wirtschaft. Im Sortiment finden Sie alle Medien (Bücher, Zeitschriften, CDs, eBooks, etc.) aller Verlage. Ergänzt wird das Programm durch Services wie Neuerscheinungsdienst oder Zusammenstellungen von Büchern zu Sonderpreisen. Der Shop führt mehr als 8 Millionen Produkte. Chapter 2 Design of Experiments
All life is an experiment. The more experiments you make the better. Ralph Waldo Emerson, Journals
2.1 Introduction to DOE
Within the theory of optimization, an experiment is a series of tests in which the input variables are changed according to a given rule in order to identify the reasons for the changes in the output response. According to Montgomery [4] “Experiments are performed in almost any field of enquiry and are used to study the perfor- mance of processes and systems. […] The process is a combination of machines, methods, people and other resources that transforms some input into an output that has one or more observable responses. Some of the process variables are controllable, whereas other vari- ables are uncontrollable, although they may be controllable for the purpose of a test. The objectives of the experiment include: determining which variables are most influential on the response, determining where to set the influential controllable variables so that the response is almost always near the desired optimal value, so that the variability in the response is small, so that the effect of uncontrollable variables are minimized.” Thus, the purpose of experiments is essentially optimization and RDA. DOE, or experimental design, is the name given to the techniques used for guiding the choice of the experiments to be performed in an efficient way. Usually, data subject to experimental error (noise) are involved, and the results can be significantly affected by noise. Thus, it is better to analyze the data with appro- priate statistical methods. The basic principles of statistical methods in experimental design are replication, randomization, and blocking. Replication is the repetition of the experiment in order to obtain a more precise result (sample mean value) and to estimate the experimental error (sample standard deviation). Randomization
M. Cavazzuti, Optimization Methods: From Theory to Design,13 DOI: 10.1007/978-3-642-31187-1_2, © Springer-Verlag Berlin Heidelberg 2013 14 2 Design of Experiments refers to the random order in which the runs of the experiment are to be performed. In this way, the conditions in one run neither depend on the conditions of the previous run nor predict the conditions in the subsequent runs. Blocking aims at isolating a known systematic bias effect and prevent it from obscuring the main effects [5]. This is achieved by arranging the experiments in groups that are similar to one another. In this way, the sources of variability are reduced and the precision is improved. Attention to the statistical issue is generally unnecessary when using numerical simulations in place of experiments, unless it is intended as a way of assessing the influence the noise factors will have in operation, as it is done in MORDO analysis. Due to the close link between statistics and DOE, it is quite common to find in literature terms like statistical experimental design, or statistical DOE. However, since the aim of this chapter is to present some DOE techniques as a mean for collecting data to be used in RSM, we will not enter too deeply in the statistics which lies underneath the topic, since this would require a huge amount of work to be discussed. Statistical experimental design, together with the basic ideas underlying DOE, was born in the 1920s from the work of Sir Ronald Aylmer Fisher [6]. Fisher was the statistician who created the foundations for modern statistical science. The second era for statistical experimental design began in 1951 with the work of Box and Wilson [7] who applied the idea to industrial experiments and developed the RSM. The work of Genichi Taguchi in the 1980s [8], despite having been very controversial, had a significant impact in making statistical experimental design popular and stressed the importance it can have in terms of quality improvement.
2.2 Terminology in DOE
In order to perform a DOE it is necessary to define the problem and choose the variables, which are called factors or parameters by the experimental designer. A design space, or region of interest, must be defined, that is, a range of variability must be set for each variable. The number of values the variables can assume in DOE is restricted and generally small. Therefore, we can deal either with qualitative discrete variables, or quantitative discrete variables. Quantitative continuous vari- ables are discretized within their range. At first there is no knowledge on the solution space, and it may happen that the region of interest excludes the optimum design. If this is compatible with design requirements, the region of interest can be adjusted later on, as soon as the wrongness of the choice is perceived. The DOE technique and the number of levels are to be selected according to the number of experiments which can be afforded. By the term levels we mean the number of different values a variable can assume according to its discretization. The number of levels usually is the same for all variables, however some DOE techniques allow the differentiation of the number of levels for each variable. In experimental design, the objective func- tion and the set of the experiments to be performed are called response variable and sample space respectively. 2.3 DOE Techniques 15
2.3 DOE Techniques
In this section some DOE techniques are presented and discussed. The list of the techniques considered is far from being complete since the aim of the section is just to introduce the reader into the topic showing the main techniques which are used in practice.
2.3.1 Randomized Complete Block Design
Randomized Complete Block Design (RCBD) is a DOE technique based on blocking. In an experiment there are always several factors which can affect the outcome. Some of them cannot be controlled, thus they should be randomized while performing the experiment so that on average, their influence will hopefully be negligible. Some other are controllable. RCBD is useful when we are interested in focusing on one particular factor whose influence on the response variable is supposed to be more relevant. We refer to this parameter with the term primary factor, design factor, control factor, or treatment factor. The other factors are called nuisance factors or disturbance factors. Since we are interested in focusing our attention on the primary factor, it is of interest to use the blocking technique on the other factors, that is, keeping constant the values of the nuisance factors, a batch of experiments is performed where the primary factor assumes all its possible values. To complete the randomized block design such a batch of experiments is performed for every possible combination of the nuisance factors. Let us assume that in an experiment there are k controllable factors X1,...Xk and one of them, Xk, is of primary importance. Let the number of levels of each factor be L1, L2,...,Lk.Ifn is the number of replications for each experiment, the overall number of experiments needed to complete a RCBD (sample size)is N = L1 · L2 · ...· Lk · n. In the following we will always consider n = 1. Let us assume: k = 2, L1 = 3, L2 = 4, X1 nuisance factor, X2 primary factor, thus N = 12. Let the three levels of X1 be A, B, and C, and the four levels of X2 be α, β, γ, and δ. The set of experiments for completing the RCBD DOE is shown in Table2.1. Other graphical examples are shown in Fig.2.1.
2.3.2 Latin Square
Using a RCBD, the sample size grows very quickly with the number of factors. Latin square experimental design is based on the same idea as the RCBD but it aims at reducing the number of samples required without confounding too much the importance of the primary factor. The basic idea is not to perform a RCBD but rather a single experiment in each block. Latin square design requires some conditions to be respected by the problem for being applicable, namely: k = 3, X1 and X2 nuisance factors, X3 primary factor, 2 L1 = L2 = L3 = L. The sample size of the method is N = L . 16 2 Design of Experiments
Table 2.1 Example of RCBD experimental design for k = 2, L1 = 3, L2 = 4, N = 12, nuisance factor X1, primary factor X2
(a) (b)
Fig. 2.1 Examples of RCBD experimental design
For representing the samples in a schematic way, the two nuisance factors are divided into a tabular grid with L rows and L columns. In each cell, a capital latin letter is written so that each row and each column receive the first L letters of the alphabet once. The row number and the column number indicate the level of the nuisance factors, the capital letters the level of the primary factor. Actually, the idea of Latin square design is applicable for any k > 3, however the technique is known with different names, in particular: • if k = 3: Latin square, • if k = 4: Graeco-Latin square, • if k = 5: Hyper-Graeco-Latin square. Although the technique is still applicable, it is not given a particular name for k > 5. In the Graeco-Latin square or the Hyper-Graeco-Latin square designs, the 2.3 DOE Techniques 17
Table 2.2 Example of Latin square experimental design for k = 3, L = 3, N = 9
additional nuisance factors are added as greek letters and other symbols (small letters, numbers or whatever) to the cells in the table. This is done in respect of the rule that in each row and in each column the levels of the factors must not be repeated, and to the additional rule that each factor must follow a different letters/numbers pattern in the table. The additional rule allows the influence of two variables not to be onfounded completely with each other. To fulfil this rule, it is not possible a Hyper-Graeco-Latin square design with L = 3 since there are only two possible letter pattern in a 3 × 3 table; if k = 5, L must be ≥4. The advantage of the Latin square is that the design is able to keep separated several nuisance factors in a relatively cheap way in terms of sample size. On the other hand, since the factors are never changed one at a time from sample to sample, their effect is partially confounded. For a better understanding of the way this experimental design works, some exam- ples are given. Let us consider a Latin square design (k = 3) with L = 3, with X3 primary factor. Actually, for the way this experimental design is built, the choice of the primary factor does not matter. A possible table pattern and its translation into a list of samples are shown in Table2.2. The same design is exemplified graphically in Fig. 2.2. Two more examples are given in Table2.3, which shows a Graeco-Latin square design with k =4, L =5, N =25, and a Hyper-Graeco-Latin square design with k =5, L =4, N =16. Designs with k > 5 are formally possible, although they are usually not discussed in the literature. More design tables are given by Box et al. in [9].
2.3.3 Full Factorial
Full factorial is probably the most common and intuitive strategy of experimental design. In the most simple form, the two-levels full factorial, there are k factors and L = 2 levels per factor. The samples are given by every possible combination of the factors values. Therefore, the sample size is N = 2k. Unlike the previous DOE 18 2 Design of Experiments
Fig. 2.2 Example of Latin square experimental design for k = 3, L = 3, N = 9
Table 2.3 Example of Graeco-Latin square and Hyper-Graeco-Latin square experimental design
methods, this method and the following ones do not distinguish anymore between nuisance and primary factors apriori. The two levels are called high (“h”) and low (“l”) or, “+1” and “−1”. Starting from any sample within the full factorial scheme, the samples in which the factors are changed one at a time are still part of the sample space. This property allows for the effect of each factor over the response variable not to be confounded with the other factors. Sometimes, in literature, it happens to encounter full factorial designs in which also the central point of the design space is added to the samples. The central point is the sample in which all the parameters have a value which is the average between their low and high level and in 2k full factorial tables can be individuated with “m” (mean value) or “0”. Let us consider a full factorial design with three factors and two levels per factor (Table2.4). The full factorial is an orthogonal experimental design method. The term orthogonal derives from the fact that the scalar product of the columns of any two-factors is zero. We define the main interaction M of a variable X the difference between the average response variable at the high level samples and the average response at the 2.3 DOE Techniques 19
Table 2.4 Example of 23 full factorial experimental design Experiment Factor level Response Two- and three-factors interactions number X1 X2 X3 variable X1 · X2 X1 · X3 X2 · X3 X1 · X2 · X3
1 −1(l) −1(l) −1(l) yl,l,l +1 +1 +1 −1 2 −1(l) −1(l) +1(h) yl,l,h +1 −1 −1 +1 3 −1(l) +1(h) −1(l) yl,h,l −1 +1 −1 +1 4 −1(l) +1(h) +1(h) yl,h,h −1 −1 +1 −1 5 +1(h) −1(l) −1(l) yh,l,l −1 −1 +1 +1 6 +1(h) −1(l) +1(h) yh,l,h −1 +1 −1 −1 7 +1(h) +1(h) −1(l) yh,h,l +1 −1 −1 −1 8 +1(h) +1(h) +1(h) yh,h,h +1 +1 +1 +1
low level samples. In the example in Table2.4,forX1 we have
yh,l,l + yh,l,h + yh,h,l + yh,h,h yl,l,l + yl,l,h + yl,h,l + yl,h,h MX = − . (2.1) 1 4 4
Similar expressions can be derived for MX2 and MX3 .Theinteraction effect of two or more factors is defined similarly as the difference between the average responses at the high level and at the low level in the interaction column. The two-factors interaction effect between X1 and X2 following Table2.4 is
yl,l,l + yl,l,h + yh,h,l + yh,h,h yh,l,l + yh,l,h + yl,h,l + yl,h,h MX ,X = − . (2.2) 1 2 4 4 The main and the interaction effects give a quantitative estimation of the influence the factors, or the interaction of the factors, have upon the response variable. The number of main and interaction effects in a 2k full factorial design is 2k − 1; it is also said that a 2k full factorial design has 2k − 1 degree of freedom. The subdivision of the number of main and interaction effects follows the Tartaglia triangle [10], also known as Khayyam triangle, or Pascal triangle: in a 2k full factorial design there are ! ! ( − ) k = k = k main effects, k = k = k k 1 two-factors interactions, 1 1!(k−1)! 2 2!(k−2)! 2 k = k! j j!(k− j)! j-factors interactions, and so on. The idea of the 2k full factorial experimental designs can be easily extended to the general case where there are more than two factors and each of them have a different number of levels. The sample size of the adjustable full factorial design ,..., ,..., = k with k factors X1 Xk,havingL1 Lk levels, is N i=1 Li . At this point, the careful reader has probably noted that the sample space of the adjustable full factorial design is equivalent to the one of the RCBD. Therefore, we could argue that the RCBD is essentially the more general case of a full factorial design. It is true, however, that in the RCBD the focus is generally on a single variable (the primary factor), and a particular stress is put on blocking and randomization techniques. It is not just a problem of sampling somehow a design space since, in fact, the order of the experiments and the way in which they are performed matter. 20 2 Design of Experiments
In adjustable full factorial designs, it is still possible to compute an estimation of the main and the interaction effects, but the definitions given above must be reformulated in terms of sums of squares. Let us consider the case of k = 4, the average of the response variable for all the N samples is