AN EXPERT SYSTEM FOR DESIGNING
STATISTICAL EXPERIMENTS.
A Thesis Presented to
The Faculty of The College of Engineering And Technology
Ohio University
In Partial Fulfillment of
The Requirements for the Degree of
Master of Science
Mustafa S. Shraim
ACKNOWLEDGEMENT
I would like to express my thanks to my advisor, Dr. W. Robert Terry, for his help and encouragement throughout this research. Also, I would like to extend my thanks to my committee members, Dr. Francis Bellezza and Prof. Ken Cutright for their guidance and advice.
This thesis is dedicated to my parents, Saadeddin and Etaf Shraim, for their continuous love and support and to my girlfriend, Ellen Battistone, without whom .....
TABLE OF CONTENTS
Chapter One: INTRODUCTION
1.1 Experimental Design Review
1.1.1 Basic Experimental Design Principles
1.1.2 Experimental Design Procedure
1.1.3 Statistical Techniques in Experimental Design
1.2 Experimental Design and Quality Improvement
1.3 Statistical Software
1.4 Knowledge Engineering Background
Chapter Two: NEEDS ANALYSIS
Chapter Three: TECHNICAL AND STATISTICAL TERMINOLOGY
3.1 Definitions of Technical Terms
3.2 Definitions of Experimental Designs
3.3 Definitions of Tests
Chapter Four: EXPERIMENTAL DESIGN
4.1 Introduction
4.2 Basic Experimental Design Principles
4.3 Screening Experiments
4.4 Model Adequacy
Chapter Five: KNOWLEDGE REPRESENTATION
5.1 Introduction
5.2 Experimental Design Rules
5.2.1 Rules for Single Factor Experiments
5.2.2 Rules for Factorial Experiments
Chapter Six: SYSTEM EVALUATION
6.1 Introduction
6.2 Evaluation Experiments
6.3 Conclusions
6.4 Recommendations
Evaluation Test
Data Sheet
Point Distribution
Data Summary
Score Averages
Paired t-test
Instructions for Running the Program
Sample Output
TABLE OF FLOW CHARTS
CHAPTER ONE
INTRODUCTION
1.1 EXPERIMENTAL DESIGN REVIEW
A "true experiment" may be defined as a study in which some independent variables are manipulated (Hicks (1982)). Their effects on the response, or dependent variable, are determined, and the levels of the independent variables are controlled rather than chosen at random. The various combinations of levels of the independent variables are referred to as treatments. These treatments are run in random order.
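The notion of treatments as combinations of factor levels, run in random order, can be sketched in a few lines of Python. This is only an illustration: the factor names, their levels, and the helper functions are hypothetical and are not taken from the expert system described in this thesis.

```python
import itertools
import random

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "temperature": [100, 120],
    "material": ["A", "B", "C"],
}

def enumerate_treatments(factors):
    """Return every combination of factor levels, one dict per treatment."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in itertools.product(*factors.values())]

def randomized_run_order(treatments, seed=None):
    """Return the treatments in a random run order without modifying the input."""
    rng = random.Random(seed)
    order = list(treatments)
    rng.shuffle(order)
    return order

treatments = enumerate_treatments(factors)
run_order = randomized_run_order(treatments, seed=1)
for run_number, treatment in enumerate(run_order, start=1):
    print(run_number, treatment)
```

With two levels of one factor and three of the other, the sketch produces six treatments; the seed is fixed here only so that the run order can be reproduced.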
The main reason for designing experiments is to obtain accurate information about the study at the lowest cost. Poorly planned experiments are often costly because unnecessary experimental runs are performed. Moreover, it is often necessary to study situations in which the effects of two or more independent variables combine in a nonadditive fashion. These nonadditive effects are called interaction effects. If the experiment is performed by changing the level of one independent variable at a time while holding the others constant, then interaction effects cannot be determined. Another reason for designing an experiment is the need to measure the experimental error, which is a crucial component in conducting statistical tests. These statistical tests allow the experimenter to interpret the results of the data analysis.
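The following Python sketch illustrates why a one-factor-at-a-time experiment cannot detect an interaction. The response function and its coefficients are invented for the illustration; only the general argument comes from the text.

```python
def response(a, b):
    """Hypothetical response; the 3*a*b term is the interaction."""
    return 10 + 2 * a + 4 * b + 3 * a * b

# One-factor-at-a-time: each factor is changed from the baseline (0, 0) only.
base = response(0, 0)
effect_a_ofat = response(1, 0) - base
effect_b_ofat = response(0, 1) - base
ofat_prediction = base + effect_a_ofat + effect_b_ofat

# Factorial: the combination (1, 1) is run as well.
actual = response(1, 1)

# Interaction: the effect of a at b = 1 minus the effect of a at b = 0.
interaction = (response(1, 1) - response(0, 1)) - (response(1, 0) - response(0, 0))

print("one-factor-at-a-time prediction at (1, 1):", ofat_prediction)
print("observed response at (1, 1):", actual)
print("interaction estimate:", interaction)
```

The additive prediction misses the observed response at (1, 1) by exactly the interaction term, which only the fourth run of the factorial arrangement reveals.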
Steel and Torrie (1960) stated that "the experimenter who contacts the consulting statistician at the planning stage of the experiment rather than after its completion improves the chances of achieving his aims". According to them, the statistician is often faced with data that provide only biased estimates and differences. The conclusions drawn from the analysis of such data may not apply to the population the experimenter had in mind. However, if the experimenter applies proper experimental design, then valid conclusions based on precise estimates can be drawn.
1.1.1 BASIC EXPERIMENTAL DESIGN PRINCIPLES
Often, after the independent variables and the response(s) of an experiment have been defined, the data are collected without any consideration of how they are collected. At other times, the experiment is conducted with little or no consideration given to the extraneous variability that can contribute to the experimental error. For example, if temperature is known to affect the outcome (response) of a certain experiment, then performing the experimental runs under different temperatures could inflate the experimental error. As a result, the validity of the conclusions drawn is affected.
There are three basic principles in designing experiments: (1) blocking, (2) randomization, and (3) replication. Blocking is a method used in experimental design to draw valid conclusions from an experiment. This is done by performing the experimental runs in subexperiments, or blocks. Each block contains experimental runs that are performed under homogeneous conditions. The variability between blocks can then be removed from the experimental error, since the extraneous conditions of one block may differ from those of another. Randomization is the key principle in designing statistical experiments. It means that the experimental runs are selected and performed in a random order. The emphasis on randomization comes from the fact that statistical methods require that the observations, which contain the experimental error, be independently distributed random variables. Replication refers to the repetition of the basic experimental runs. It is an important aspect of designing statistical experiments for two reasons. First, it provides the experimenter with an estimate of the experimental error, which is an important element in conducting statistical tests. Second, it provides the experimenter with a more precise estimate of the mean response.
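As a rough illustration of the three principles, the Python sketch below builds a run plan in which every treatment appears once in each block and the run order is randomized separately within each block. It also shows how replication sharpens the estimate of a mean, using the standard result that the standard error of a mean of n independent observations is sigma / sqrt(n). The function names, block labels, and numbers are hypothetical.

```python
import math
import random

def randomized_block_plan(treatments, blocks, seed=None):
    """Return {block: treatments in a random order}, one full set per block."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # randomization is carried out within each block
        plan[block] = order
    return plan

def standard_error_of_mean(sigma, n_replicates):
    """Standard error of a mean of n independent runs with error std. dev. sigma."""
    return sigma / math.sqrt(n_replicates)

plan = randomized_block_plan(["A", "B", "C"], ["day 1", "day 2"], seed=0)
for block, order in plan.items():
    print(block, order)

# Quadrupling the number of replicates halves the standard error.
print(standard_error_of_mean(2.0, 1), standard_error_of_mean(2.0, 4))
```

Here each block (for example, a day) doubles as a replicate of the full set of treatments, so the plan provides blocking, randomization, and replication at once.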
A detailed discussion on these basic experimental design principles and other aspects will be presented in chapter three.
1.1.2 EXPERIMENTAL DESIGN PROCEDURE
In this section, the procedure for experimentation is described. It covers the design, the performance, the analysis and the conclusions.
(1) Statement of the problem: The objective of an experiment must be clearly defined.
(2) Choice of variables: This includes both the independent variables and the response(s). Levels for the independent variables over which the investigation will be performed must also be defined. The dependent variable or response must be selected so that it (a) provides useful information about the study, and (b) can be measured.
(3) Choice of experimental design: The experimenter must take the basic principles mentioned above into consideration when selecting an experimental design. The objective of the experiment and budget constraints are also important issues.
(4) Performing the experimental runs: The experimenter must ensure that randomization, measurement, and the homogeneity of the environment in which the experiment takes place are proceeding as planned.
(5) Analyzing the experiments: The data collected from the experiment should be analyzed using proper statistical techniques. Graphical methods are also helpful, especially in checking the adequacy of the model.
(6) Drawing conclusions: Statistical techniques provide guidance in drawing conclusions from the analysis of the experimental data. Statistically significant effects and differences can be identified and used as the basis for recommendations on the findings.
1.1.3 STATISTICAL TECHNIQUES IN EXPERIMENTAL DESIGN
The experimenter can safely use statistical techniques when the requirements of the experimental design are met. Namely, recognizing the basic principles mentioned earlier and building the design upon them enables the experimenter to use statistical methodology in a straightforward manner. The following are some considerations for the experimenter to keep in mind.
(1) The choice of the experimental design and analysis should be kept simple, since simple designs and analyses are almost always best (Montgomery (1976)).
(2) The experimenter should recognize the sequential approach in experimentation. That is, the initial experiment should not be comprehensive, since variables may be added or dropped as the investigation proceeds. This is especially true of screening experiments, where the experimenter is trying to separate the important variables from the unimportant ones.
(3) Interpretation of statistical significance should be accompanied by practical knowledge about the experimental situation. For example, an engineer may determine that a modified automobile carburetor improves fuel economy by 0.1 mile/gallon, a statistically significant gain. However, if the cost of modification is $1000/carburetor, then it is obvious that it is not worth the effort, especially when the price of gas is low.
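The carburetor example in item (3) can be checked with simple arithmetic. In the Python sketch below, the gain of 0.1 mile/gallon and the cost of $1000 come from the text; the baseline fuel economy, the gasoline price, and the lifetime mileage are assumptions made only for this illustration.

```python
def fuel_cost(miles, mpg, price_per_gallon):
    """Cost of the fuel needed to drive the given distance."""
    return miles / mpg * price_per_gallon

MODIFICATION_COST = 1000.0  # from the text: $1000 per carburetor
MPG_GAIN = 0.1              # from the text: 0.1 mile/gallon
BASELINE_MPG = 25.0         # assumption
PRICE_PER_GALLON = 1.00     # assumption
LIFETIME_MILES = 100_000    # assumption

savings = (fuel_cost(LIFETIME_MILES, BASELINE_MPG, PRICE_PER_GALLON)
           - fuel_cost(LIFETIME_MILES, BASELINE_MPG + MPG_GAIN, PRICE_PER_GALLON))

print(f"lifetime fuel savings: ${savings:.2f}")
print("worth it" if savings > MODIFICATION_COST else "not worth it")
```

Under these assumed figures the lifetime savings amount to a few dollars, far below the cost of the modification, so the statistically significant effect has no practical value.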
1.2 EXPERIMENTAL DESIGN AND QUALITY IMPROVEMENT
It is important to consider the role that experimental design has been playing in quality improvement. Nearly all manufacturing firms use some type of quality control procedure to achieve quality products. However, these procedures may be costly to implement, resulting in overpriced products. They may also be ineffective due to the lack of suitable experimental design methodologies. Therefore, some type of consultation system through which experimental design expertise can be conveyed to quality control personnel is needed. Engineers cannot rely on existing statistical analysis software packages alone, since these do not help in planning the right experiment.
In the past five years, there has been a great change in the way American firms think when it comes to quality improvement. Dr. Genichi Taguchi, a Japanese quality consultant, has introduced new methodologies called "Off-line Quality Control" to achieve quality improvement. Viewing Taguchi's quality control methods from the perspective of Japanese industrial success, some American firms readily adopted them. His philosophy is divided into two main categories: (1) on-line quality control methods and (2) off-line quality control methods. His on-line quality control methods consist of statistical process control, prediction and correction, and sampling inspection. The expert system developed in this research is concerned only with off-line quality control, since it deals with designing experiments.
Box (1985) has observed that the method of off-line quality control advocated by Taguchi, which utilizes experimental design, suffers from the following shortcomings:
(1) The orthogonal array design ignores the existence of
interactions between the controllable variables and
the uncontrollable or noise variables.
(2) Taguchi seems to ignore the possibility of data
transformation. In practice, the possibility of
dealing with inadequate models should always be kept
in mind.
(3) The sequential approach in experimentation is not
referred to by Taguchi despite its proven efficiency.
1.3 STATISTICAL SOFTWARE
Traditional experimental design software may be misused for two major reasons, as stated by Hand (1984). The first is the difficulty of operating the software, which requires a comprehensive understanding that can only be achieved by studying its manuals in detail. The second is that some users mistakenly analyze their data using a wrong or less suitable experimental design. The second reason tends to be more harmful than the first because it may lead to invalid conclusions. This problem is likely to occur when users are so eager to analyze the data that they sometimes use the easiest design available. However, lack of knowledge about experimental design and failure to realize the consequences of poorly designed experiments are very important reasons, too.
Analytical software packages assume that the user knows what he or she is doing with regard to selecting the right computerized algorithm. This assumption can be invalid if the user lacks this expertise. Therefore, the need for an expert system that answers the user's questions on the selection of the right experimental design concerns many researchers in this field. Experimental design has not received much attention in the development of such systems, for the following reasons:
(1) Experimental design is used in many domains such as
psychology, agriculture, engineering, physical science
and others which makes decision making rather
difficult. These domains differ in their requirements
in planning and making inferences from results. For
example, in complex high technology, a large number of
experimental runs can not be afforded because of the
high cost of a run. In fields like agriculture or
psychology, however, experimental runs may be
replicated several times.
(2) The field of experimental design is so broad
that it is difficult to cover effectively in
one consultation system.
(3) Some researchers (Hand (1984)) argue that the difficulty also comes from the enormous amount of
technical terms that the user must be familiar
with before efficiently using the consultation system.
For example, blocking is an important aspect of
experimental design that must be understood by the
user. If he or she is not familiar with this concept,
the whole experimental design will be affected and,
as a result, the conclusions drawn from the experiment
become affected. Also, cost of conducting experiments
may be increased as well as risk of error.
(4) It has proven difficult to develop a
system that combines symbolic knowledge with
powerful number manipulation, owing to the lack of a
computer language that effectively suits both. Powerful
number manipulation languages such as Fortran are not
efficient in representing symbolic knowledge.
The points mentioned above are the major difficulties that researchers face in trying to develop a computer program that is able to emulate the human expert in reasoning and decision making with regard to designing experiments. The domain of experimental design can not be compared to knowledge of geography, for example. The latter can be represented easily in a knowledge-based system because of its static pieces of information. On the other hand, designing an experiment is usually done in several phases. One phase may totally depend on the information obtained from a previous one as is the case in the sequential approach in experimentation.
There are two major questions that need to be answered when it comes to developing an expert system for designing experiments. The first is: For what purpose is the system developed? The second is: For whom will the system be designed?
The first question has two possible answers: (1) for the selection of a suitable experimental design and (2) for number manipulation. The second purpose has been served for many years. The market is full of software packages that compete with one another on how fast the analysis is carried out, how easy it is to input the data, and how much data the system can handle at one time. The first purpose, however, is hardly served by the software market despite its crucial role in effective experimental design. This point is most forcefully made by Hahn (1977) when he stated "the world's best statistical analysis can not rescue a poorly planned experimental program".
The second question, which deals with the group of people that the system is designed for, must be answered clearly so that the system developed is effective. An expert system in this field cannot be effective for a wide variety of people. The novice's expectations of the system differ significantly from those of the well-informed person. The well-informed person does not want to be bothered with elementary questions and explanations, while the novice is likely to require a lot of help throughout the session. The novice is likely to appreciate an expert system that does everything, from planning the experiment through the analysis to inferences on the findings. Attempts to satisfy the requirements of both extremes can seriously reduce the effectiveness of the system. Consequently, identifying the group of people for whom the system will be designed is an essential step in the development of an expert system for designing experiments.
1.4 KNOWLEDGE ENGINEERING BACKGROUND
Knowledge engineering is basically the reduction of a knowledge domain into sets of facts and rules. The knowledge engineering process consists of two major phases: knowledge acquisition and knowledge representation. The two phases together can be described as the process of acquiring knowledge from an expert in a particular domain, coding this knowledge, and computerizing it to achieve a system that effectively emulates the human expert in the decision making process. Before the knowledge engineer starts acquiring knowledge, he or she must define the problem and the domain in which the knowledge will be acquired. There are three major requirements that need to be satisfied when it comes to identifying the domain:
(1) The domain must be narrow to achieve a more
effective system.
(2) The domain must have sufficient expertise. For
example, an attempt to develop an expert system on
astrology will probably fail due to serious lack of
expertise in that domain.
(3) The domain should contain reliable knowledge that
does not significantly change over time.
Knowledge-based systems are different from other types of systems. The following features of knowledge-based systems distinguish them from other systems:
(1) Knowledge-based systems or expert systems are
limited to a specific domain of expertise.
(2) The reasoning mechanism of one expert system may be
applied to other expert systems.
(3) Expert systems can apply deductive reasoning.
(4) Expert systems provide symbolic output for a
better interaction with the user.
Traditional computer systems are better at procedural programming, such as numerical and algorithmic processes, whereas expert systems have the ability to emulate the human expert in a particular domain. There are many situations in which expert systems are required, ranging from scarcity or inconsistent availability of expertise to providing education and training facilities. They can also replace the human expert when obtaining his or her expertise is time-consuming or expensive.
While expert systems are not able to think, they still have some advantages over the human expert: they are consistent in delivering answers, and they do not jump to conclusions without considering all details. In other words, unlike human experts, expert systems are not subject to the emotional influences on decision making that sometimes lead to inconsistent judgement.
An expert system usually consists of two major parts: (1) the knowledge base and (2) the inference engine. The knowledge base consists of the knowledge that is specific to the domain of application, including facts about the domain and rules that describe relations in the domain. The inference engine performs two tasks. It examines the facts and determines whether or not they are satisfied using the rules in the rule base. It also determines the order in which these rules are scanned and fired.
The order in which the rules are examined is an important issue for the knowledge engineer to keep in mind when developing an expert system. The system must have a starting point from which the searching procedure begins.
This starting point depends upon the objective of the function. In any case, there are two scanning procedures to consider: forward chaining and backward chaining.
In forward chaining, the inference engine starts with the facts and works its way forward in an attempt to find a conclusion that matches the given facts. In backward chaining, the inference engine starts with an assumed conclusion and works its way backward in an attempt to find facts that support that conclusion. For example, in medical diagnostics, the goal is to identify symptoms (facts) from a given state of illness (conclusion). It should be mentioned here that some expert systems employ a combination of backward and forward chaining.
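The two scanning procedures can be sketched as follows. The sketch is written in Python rather than in the Turbo Prolog used for the expert system itself, and the rules and fact names are toy examples invented for the illustration; they are not the rule base of this thesis.

```python
# Each rule is (set of premises, conclusion). Hypothetical toy rules.
RULES = [
    ({"single_factor", "two_nuisance_sources"}, "latin_square_candidate"),
    ({"latin_square_candidate", "equal_number_of_levels"}, "use_latin_square"),
]

def forward_chain(facts, rules):
    """Start from the facts and fire rules until nothing new can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts, rules, _seen=None):
    """Start from an assumed conclusion and look for facts that support it."""
    seen = set() if _seen is None else _seen
    if goal in facts:
        return True
    if goal in seen:  # guard against circular rules
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )

facts = {"single_factor", "two_nuisance_sources", "equal_number_of_levels"}
print(sorted(forward_chain(facts, RULES)))
print(backward_chain("use_latin_square", facts, RULES))
```

Forward chaining derives every conclusion the facts allow, while backward chaining examines only the rules that bear on the assumed conclusion. Prolog's built-in resolution strategy works in the backward direction, from the goal toward the facts.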
Expert system programs can be written in many
languages including old ones such as Basic and Fortran.
However, artificial intelligence programs are best
optimized if they are written in languages that provide
the appropriate data structure. This means selecting the
language that is most suitable for symbolic representation
in which modifying and making logical inferences are easy
to accomplish.
The most popular languages for expert systems are Lisp and Prolog. They utilize logical declarations specifying what the program will do. The program of this expert system is written in Turbo Prolog, Borland International's fast, compiler-based implementation of the Prolog programming language. This language is designed to provide answers it deduces from its powerful internal routines. In comparison with other languages, Turbo Prolog proves efficient in the sense that it can use only a few lines to accomplish a task that other languages require several pages of code to accomplish. It should be mentioned here that Turbo Prolog runs on an IBM PC or compatible and requires DOS to start.
CHAPTER TWO
NEEDS ANALYSIS
2.1 STATEMENT OF NEED
Quality improvement relies heavily on applying suitable experimental design. Most quality engineers lack the training in experimental design necessary for designing cost-effective off-line quality control experiments, as indicated by Box et al. (1988). In Japan, there is a great emphasis on experimental design methodologies for achieving better product quality. The Japanese are convinced that quality improvement can be achieved without necessarily spending excessive amounts of money. Instead of jumping to tolerance design, which is mainly concerned with the purchase of equipment, to correct a quality problem, they first go through parameter design, which applies experimental design techniques to determine the causes of deviation from standards and correct them.
Therefore, an expert system that will enable engineers to select a suitable design is needed. This expert system consists of three parts: (1) designing experiments for a single factor (or comparing more than two treatment means), (2) designing experiments with two or more factors, and (3) an explanation facility that provides a brief explanation of basic statistical and technical terms.
The program is written in Turbo Prolog (Version 1.1) and requires 640K of memory on an IBM PC or compatible, as well as a DOS diskette to start it up. Production rules are employed to represent the knowledge in this expert system.
2.2 SYSTEM LIMITATIONS
(1) The program is limited to guiding users to a suitable design in linear situations. If tests performed by the user show that nonlinearity exists, the system will advise the user to employ a non-linear model but does not go any further.
(2) The system is also limited to guiding the user to a suitable design; it does not perform numerical calculations.
(3) The system is limited to advising the user on controlled experiments. Therefore, unplanned experiments such as regression are not included in this expert system.
2.3 SYSTEM FEATURES
(1) The system is user friendly because of its
clear language and menu-driven structure. The user
may also leave the system at any point in the program.
(2) The system employs the approach of sequential
experimentation for more effective experimental
design.
(3) The system provides an explanation facility in
case the user wants to refresh his or her memory
about certain technical terms.
(4) The system can be easily modified or expanded due to
its modifiable inference engine.
(5) The system can be interfaced with other computer
languages such as Pascal or Fortran. This is an
important issue for future work if we want to link
this expert system with a powerful number
manipulating language like Pascal (Version 4.0).
CHAPTER THREE
TECHNICAL AND STATISTICAL TERMINOLOGY
3.1 DEFINITIONS OF STATISTICAL TERMS:
This section is concerned with refreshing the user's memory with regard to the basic statistical and other technical terminology used in the expert system. These definitions form the help facility, which is a part of the expert system.
(1) Type I error (α): often called the "level of statistical
significance", this is the probability of rejecting
the null hypothesis when it is actually true.
(2) Type II error (β): the probability of accepting the
null hypothesis when it is actually false. This is
more serious than a Type I error. The quantity
(1 - β) is the power of the test.
(3) Independent variables: sometimes referred to as
"experimental variables" or "factors". The objective
of an experiment is usually to investigate their
effects on the dependent variable (response) or
possibly to find the settings of these variables over
which the response is optimized or the variability of
the response is minimized. The independent variables
can be either qualitative or quantitative depending on
their nature.
(4) Levels of independent variables: In planned
experiments, the experimenter wishes to investigate
the effects over specific settings within the range of
each of the independent variables. Usually, the
experimenter assigns two levels within the range of an
independent variable if the relationship between that
variable and the response is known to be linear. Two
levels of the independent variables are also used when
performing screening of the independent variables
which is simply separating the important variables
from the unimportant ones. Three levels will provide
sufficient data for fitting a quadratic relationship.
(5) Dependent variable (response): It is the criterion
used to measure the experimental outcome. For example,
the strength of some material may be characterized as
the dependent variable if the experiment is to
investigate the effects of temperature and raw
materials on it.
(6) Interaction: When the difference in the outcome of the
dependent variable (response) between levels of one
factor is not the same at all levels of another
factor, these two factors are said to interact in
determining the response.
(7) Experimental run: When the experimenter sets the
independent variables each at a specified level and
measures the outcome (yield) by recording the value of
the dependent variable, it is called an experimental
run.
(8) Randomization: A set of observations is said to be
randomized when arranged in a random order. The
experimental runs should be randomly assigned to
prevent inflating the variance resulting from the
contribution of uncontrollable variables, learning
effect on the part of the experimenter, and other
reasons. Employing the randomization process simply
averages out these extraneous effects. Randomization
is an essential step prior to collecting the data.
(9) Homogeneous conditions: When experimental runs are
performed under the same conditions, this situation is
called homogeneous conditions. This means that all
runs were subjected to the same set of uncontrollable
variables.
(10) Uncontrollable (noise) variables: These are extraneous
variables that cannot be controlled by the
experimenter. Examples of variables that are usually
uncontrollable are temperature and humidity. Ignoring them
may affect the conclusions drawn from the experiment
because they contribute to inflating the variance.
(11) Blocking: This experimental design concept is the key
to situations where not all experimental runs can be
performed under homogeneous conditions. Blocking the
experimental runs will eliminate the error contributed
by the difference of conditions so that experimental
runs in one block are performed under homogeneous
conditions.
(12) Lack of fit: The model representing the experimental
data should be adequate. However, sometimes this
adequacy is not achieved because the
linearity assumption is violated. The experimenter should run a
lack of fit test to determine whether the model is
adequate or not before further analyses are carried
out.
(13) Transformation: When the model does not adequately
represent the data in the original scale or metric,
the experimenter may need to transform the data into
another scale so there is a good fit.
(14) Confounding (or "confusing" effects with each other):
When the experimenter uses fractional factorial
designs, he or she may lack the degrees of freedom to
estimate all effects included in the model.
In this case, confounding makes it possible to
estimate the main effects and the important
interactions (such as two-factor interactions) by
sacrificing the unimportant interactions (usually
higher-order interactions). It
is helpful in the sense that the experimenter decides
which effects are to be confounded. Confounding is usually
used in screening experiments, where the objective is
to identify the important main effects, ignoring the
interaction effects. The choice of the generator in
fractional factorials determines which effects are
confounded with each other.
(15) Replication: This is simply the repetition of an
experimental run which increases the number of degrees
of freedom available resulting in a better estimate of
the error. It is also essential in obtaining more
precise estimates of the effects. In reality,
replication of experimental runs may not be feasible
due to budget or time constraints.
(16) Analysis of Variance (ANOVA): Total variation
displayed by a set of observations is measured by the
sum of squares of deviation from the mean. It can be
broken up into components associated with defined
sources of variation.
(17) Resolution: The property of a fractional factorial that
determines how the effects are confounded. It is stated as a number
following the word "Resolution".
Different types of resolution will be discussed later
in this chapter.
(18) Generator: A high order interaction term that is
usually negligible is used to generate fractional
factorial designs (Box and Hunter (1961)).
(19) Levels of an independent variable: The values within
the feasible range of an independent variable over
which the experiment is conducted. For example, if the
feasible range of the variable temperature is between
100 F and 120 F, the experimenter may assign the value
of 100 F to be the low level and the value of 120 F to
be the high level. The experimenter should be cautious
when choosing levels for independent variables. For
example, variables whose relationship with the response
contains a curvature should be identified.
If the curvature consists of an increase followed by a decrease, or
vice versa, then the response may be similar for the high
and low levels when they are assigned at the extreme points of the
range. In this case, the experimenter is advised to
identify the point where the slope is zero and choose
the levels so that one level represents that point.
Prior knowledge about the nature of the independent
variables with regard to their relationship with the
response is helpful in detecting such situations.
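The warning in definition (19) can be illustrated numerically. The Python sketch below assumes a purely hypothetical quadratic response that peaks at 110 F, the midpoint of the 100 F to 120 F range used in the example above; the function and its coefficients are invented for the illustration.

```python
def response(temp_f):
    """Hypothetical quadratic response that peaks at 110 F."""
    return 50.0 - 0.25 * (temp_f - 110.0) ** 2

def two_level_effect(low, high):
    """Estimated effect of moving the factor from its low to its high level."""
    return response(high) - response(low)

# Levels at the extremes of the range: the two responses coincide.
effect_extremes = two_level_effect(100.0, 120.0)

# One level placed at the point where the slope is zero (110 F).
effect_with_vertex = two_level_effect(100.0, 110.0)

print("effect with levels 100 F and 120 F:", effect_extremes)
print("effect with levels 100 F and 110 F:", effect_with_vertex)
```

With the levels at the two extremes the estimated effect is zero even though temperature strongly influences the response, whereas placing one level at the zero-slope point reveals the effect.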
3.2 DEFINITIONS OF EXPERIMENTAL DESIGNS:
In this section, some important experimental designs
will be briefly reviewed. Again, this is not intended to
cover all specifications of these designs but only to give
a general idea along with references:
(1) Latin Square design: This type of design is utilized
if the following conditions are satisfied: (a)
there is a single factor under study, (b) there are
two extraneous sources of variability, and (c) each of
the extraneous variables has a number of levels equal
to the number of levels or treatments of the single
factor.
(2) Graeco Latin Square design: This is the same as the
regular Latin Square except that the extraneous
variability comes from three sources instead of two.
(3) Hyper-Graeco Latin Square design: This is the same as
the regular Latin Square except that the extraneous
variability comes from more than three sources.
(4) Youden Square design: This type of design is the same
as the Latin Square but it is "incomplete". This means
that the number of levels of one extraneous variable
differs from either (a) the number of levels of the
other extraneous variable or (b) the number of
treatments (levels of the single factor under study).
See Smith and Hartley (1948) for more detail.
(5) Factorial design: An experiment that investigates all
possible treatment combinations of two or more
factors. Each of the factors is investigated at two or
more levels. Efficiency and the ability to estimate
the interaction effects are the major advantages of
factorial experiments over one-factor-at-a-time
experiments (Box, Hunter and Hunter (1978)).
(6) Factorial design at two levels (2^k): This type of
factorial is obtained when each factor has two levels.
The 2^k design assumes a linear relationship between
each independent variable and the dependent variable
(response), since two levels cannot reveal curvature.
(7) Factorial design at three levels (3^k): This type of
design is used when there is a significant curvature
in the relationship between the dependent and the
independent variables.
(8) Fractional factorial design at two levels (2^(k-p)): An
important feature of factorial design is that the
experimenter can run a fraction of it and still obtain
valid conclusions. In the expression 2^(k-p), k is the
number of factors or independent variables and p
determines the size of the fraction, which is 1/2^p of
the full factorial. p must be a positive integer
smaller than k, so the number of runs, 2^(k-p), is
always a power of two. Fractional factorials are most
useful when a screening experiment is to be conducted
and when budget or time constraints do not allow
the experimenter to run a full factorial (Box and
Hunter (1961)).
(9) Fractional factorials at three levels (3^(k-p)):
These are designs that serve the same purpose as
fractional factorials at two levels except that each
variable has three levels instead of two. As
indicated earlier, this is used when nonlinearity due
to quadratic effects exists.
(10) Central Composite (Box) Design: A type of design that
is used when the experimenter needs to fit a second
order model. In particular, this design is mostly used
when the experimenter decides to go to a higher order
design and already has the runs from a 2^k design (Box
and Draper (1979)).
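As an illustration of definition (1) above, the following sketch builds a Latin Square by cyclic shifting and checks its defining property. The cyclic construction and the function names are assumptions chosen for brevity; in practice the rows, columns, and treatment labels of such a square would also be randomized before use.

```python
def latin_square(treatments):
    """Return a cyclic Latin square: row i is the treatment list shifted by i."""
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """Check that every treatment occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return len(symbols) == n and rows_ok and cols_ok


# Rows and columns stand for the two extraneous sources of variability.
square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

Each extraneous variable has as many levels as there are treatments, which is condition (c) of the definition.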
3.3 DEFINITIONS OF TESTS:
In some situations, the experimenter would like to
compare the individual treatments (or levels of the single
factor). In this section, a review of methods for
comparing individual treatments will be conducted.
(1) Orthogonal Contrasts method: This method is used to
compare individual treatments or levels of single
factor experiments. It is used when contrasts are set
prior to running the analysis of variance so that the
experimenter does not set contrasts that reflect large
observed differences in means.
(2) Duncan's multiple range test: Developed by Duncan
(1955), this is probably the most popular test for
comparing all possible pairs of treatments. The test
indicates that two means are not significantly
different from each other if they fall between two
other means that are not significantly different from
each other.
(3) Least Significant Difference method (LSD): This method
is used when the F-statistic of all treatments or
levels combined is significant at α = 5%.
(4) Dunnett's method: If the experimenter wishes to compare
individual treatments with a control treatment, then
the method developed by Dunnett (1964) is appropriate.
It is suggested to use more observations for the
control treatment so that the results are more
reliable. It should be noted that no more than
(number of treatments - 1) comparisons can be made
using this method.
(5) Scheffé's method: This is an alternative to the
orthogonal contrasts method when the experimenter has
no prior knowledge of what contrasts to compare.
Scheffé (1953) developed this powerful test to
compare all possible contrasts without inflating the
type I error.
CHAPTER FOUR
EXPERIMENTAL DESIGN
4.1 INTRODUCTION
The bottom line for an effective experimental design is to take all related aspects into consideration as early as possible when planning an experiment. Failure to do so may increase the risk of drawing invalid conclusions. For example, suppose there is a source of variability which the experimenter is not able to control by physical means.
If this source of variability is not accounted for by the experimental design, then the experimental error will be inflated by it. This inflation of the experimental error will increase the risk of drawing an invalid conclusion.
In addition, if the experimenter ignores a basic principle such as randomization, the omission could be costly, if not impossible, to correct at later stages. The focus of this chapter will be on the basic principles of designing experiments.
4.2 BASIC EXPERIMENTAL DESIGN PRINCIPLES
In this section, the emphasis will be placed on the key principles that the experimenter must take into consideration when designing an experiment. The major principles that will be discussed are: (1) blocking, (2) randomization, and (3) replication.
(1) Blocking: In many situations, not all experimental
runs can be performed under homogeneous conditions due
to extraneous variability that cannot be controlled
by the experimenter. As a result, the error term is
likely to be inflated, leading to the possibility of
drawing the wrong conclusions. If this occurs, the
experiment should be divided into subexperiments
(blocks) such that the runs within each block are
performed under homogeneous conditions. This is done
to remove from the error term the variability
contributed by the uncontrollable factors. Each block
contains a portion of the experimental runs that can
be run under similar conditions.
The experimenter may not be able to detect the
existence of extraneous variability at the beginning
because of his or her lack of knowledge about the
experiment. In this case, the experimenter should
check the consistency of the observed response. Lack
of consistency indicates the possibility that the
experiment is being affected by uncontrollable
variability. When variability has been detected, the
experimental runs can be performed in blocks as
indicated earlier.
The use of blocks ranges from simple comparative experiments to somewhat complicated multifactor ones.
In a simple comparative experiment, the experimenter wishes to compare two methods. Each pair of experimental units is therefore treated as a block that is kept separate from the others. To illustrate this, consider comparing two types of shoe materials (Box, Hunter, and Hunter (1978)). The experimenter weighs the shoes before and after they are worn by subjects. The weight lost through wear is recorded, so that less loss in material means better material. The subjects, however, differ in their body weights and in the length of time that they wear their shoes. The experimenter would therefore like to eliminate the variability contributed by the subjects, which cannot be controlled (it is probably difficult for the experimenter to find subjects with the same body weight who are also similar with respect to how long they keep their shoes on). The experimenter can eliminate this variability by blocking, that is, by comparing the two materials within each subject so that differences between subjects in body weight and in wearing time cancel out of the comparison. In such a situation, blocking is the best tool for eliminating the extraneous variability contributed by the subjects.
In single factor experiments, blocking can also
be used if uncontrollable variability exists. If one
source of variability exists, the experimenter may run
a randomized complete block design. If two
uncontrollable sources of variability exist, then he
or she may use a Latin Square design if the requirements
for using it are satisfied as previously mentioned.
The use of blocks in factorial designs is a
little more complicated. This is primarily due to the
lack of enough degrees of freedom if the experiment is
not sufficiently replicated. This problem often occurs
in fractional factorials, where main effects can be
mistakenly confounded with blocks. The experimenter
must therefore exercise caution when he or she uses
blocks in fractional factorials. In particular, blocks
must not be confounded with main effects because, if
they are, it will be difficult to determine whether
the outcome represents the main effect or the block
effect.
(2) Randomization: Employing a proper randomization
procedure is a crucial step in designing an experiment.
Randomization is used in experimental design to
average out the effect of variables that cannot be
controlled by the experimenter. Such averaging does
not remove their effects completely, but it assures
that the experimental errors are randomly distributed.
Montgomery (1976) illustrates this by giving the
following example. A metallurgical engineer is
studying the effects of two different hardening
processes, oil quenching and saltwater quenching, on
an aluminum alloy. The objective of this experiment is
to determine the quenching solution that produces the
maximum hardness for this alloy. The specimens used for
testing differ in their thickness. The engineer would
systematically favor one quenching medium over the
other if the specimens subjected to one medium were
thicker than those subjected to the other medium. This
problem can be easily solved by assigning the specimens
randomly to the quenching solutions.
(3) Replication: This concept means repetition of the
basic experiment. It usually serves two purposes: (a)
to obtain more precision in the estimated effects and
(b) to estimate the error term for testing
significance. In some situations, where the estimates
of the effects are very important, it is necessary to
replicate the experimental runs to achieve precision.
For example, more precision is needed when developing
statistical models for forecasting purposes than for
other purposes because the objective is to predict
future events precisely. Failure to meet this
objective will probably be more costly than performing
more experimental runs. Replication is also necessary
to estimate the error term. More reliable conclusions
can be drawn when there are enough replicates. This is
because replicates increase the degrees of freedom
which, in turn, provide a better estimate of the
error. In short, it is always recommended to replicate
the experimental runs as many times as possible.
Performing replicate runs does not come at an early
stage of the design due to the sequential nature of
most experiments. In other words, and in many
situations, the important variables, their ranges, and
the levels to be assigned are yet to be explored.
Therefore, the experimenter should replicate the
experiment when the ambiguity about the experiment is
eliminated. As a result, the replicates will
contribute to the precision of estimates of effects
and the estimation of error term after the screening
experiment is over.
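The three principles above can be sketched together for the quenching example. The sketch assumes, purely for illustration, that the specimens have been grouped into blocks of similar thickness; the block labels, the seed, and the function name are hypothetical. Within each block the run order of the treatments is randomized, and each block provides one replicate of the basic experiment.

```python
import random


def randomized_block_layout(treatments, blocks, seed=None):
    """Return {block: run order}, shuffling the treatments independently per block."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # randomization within the block
        layout[block] = order
    return layout


treatments = ["oil", "saltwater"]
blocks = ["thin specimens", "medium specimens", "thick specimens"]  # hypothetical blocks
layout = randomized_block_layout(treatments, blocks, seed=1)
for block, order in layout.items():
    print(block, order)
```

Because every block contains every treatment, thickness can no longer favor one quenching medium over the other, and the number of blocks equals the number of replicates.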
4.3 SCREENING EXPERIMENTS
In many situations, little or no prior knowledge about the situation to be investigated is available. In such situations, the experimenter would like to be able to determine which of a large number of possible factors are most important. In screening experiments, there is likely to be a large number of independent variables, and it is expected that not all of them are important. The experimenter's first goal is to separate the important ones from the rest without using a large amount of the budget to achieve this. Therefore, a quick and economical method to screen these variables is needed so that the experimenter can concentrate on the important factors throughout the rest of the study. Without screening, the experimenter would spend time and money investigating variables that should not have been included in the first place.
Fractional factorials at two levels (2^(k-p)) can be utilized for screening purposes as indicated by Box and Hunter (1961). These designs can economically and efficiently guide the experimenter in determining which factors (and possibly interactions) to proceed with in the analysis of the experiment.
A screening experiment is conducted at an early stage of the design. The information obtained from that stage will be used as the input for the next stage and so on in an iterative approach. However, some experts do not seem to appreciate this approach despite its clear efficiency.
Genichi Taguchi, who is considered a leading expert in this field, has not mentioned the concept of iterative experimentation. However, in reality, some variables that were considered at an early stage of the design may be dropped if found unimportant and other variables may be included, as indicated by Box (1985). Also, the ranges of the variables over which the experiment is carried out may change.
Screening of experimental variables should be economical. Box, Hunter, and Hunter (1978) have suggested that usually not more than one quarter of the budget should be invested for screening purposes. After the "screening" stage, the experimenter will have enough information with regard to which variables are important.
As indicated earlier, fractional factorial designs at two levels (2^(k-p)) are cost effective in exploring information about the experimental variables at an early stage of the design. For example, if an experiment involves five variables and it is unknown which ones are important, the experimenter may use a one quarter fractional factorial at two levels (2^(5-2)), which involves 8 runs, instead of the full factorial at two levels (2^5), which involves 32 runs. If the number of experimental variables is sufficiently large, a smaller fraction can be run.
The experimenter must be careful in deciding what type (or, as it is often called, resolution) of fractional factorial to use. The following is a brief review of the major alternatives:
(a) Resolution III: This type is selected when the
experimenter wishes to estimate only the main effects.
It guarantees that the main effects can be estimated
without being confounded with one another. However,
a design of this resolution may have its main effects
confounded with two-factor interactions. Therefore, if
the experimenter wishes to estimate two-factor
interactions as well as main effects, then this
resolution should not be used. If it is used,
however, then the experimenter cannot determine
whether an estimate reflects a main effect or a
two-factor interaction because they are confounded
with each other. For example, if AB refers to the
interaction between factor A and factor B, then the
two-factor interaction (AB) may be confounded with
factor C. This means that it will not be possible to
determine whether a significant effect is due to the
main effect C or to the two-factor interaction (AB) or
both.
(b) Resolution IV: This type is used when the
experimenter is interested in estimating the main
effects separately from the two-factor interactions.
This is simply a result of not having main effects
confounded with each other or with two-factor
interactions. However, the experimenter's decision on
which of the two-factor interactions are
significant must be made with caution because
two-factor interactions are confounded with each other.
It should be mentioned here that resolution IV
fractional factorials can be constructed by combining
a resolution III design with its "sign-reversed"
partner. In other words, the experimenter will only
have to reverse all signs (+,-) in the resolution III
design and combine the result with the original. This
is often referred to as a fold-over design. Again, the
use of fold-over designs was also missed by Taguchi, as
mentioned by Box (1985), despite its ease of
application as well as its usefulness. For example,
suppose that it is desired to separate the two-factor
interactions from main effects. This can be done by
folding over a Resolution III design without having to
set up a Resolution IV design from scratch.
(c) Resolution V: This type of design is useful when the
experimenter wishes to estimate two-factor
interactions which are not confounded with each other.
This also includes estimating main effects separately
from each other and from two-factor interactions.
This is usually done when the experimenter knows that two-factor interactions are important.
The above mentioned resolutions are commonly used when fractional factorial designs at two levels (2^(k-p)) are considered. However, resolution III is the one to use when screening a large number of variables since the interest is primarily in dropping those variables that are not important. If the experimenter knows that some interactions are important, then resolution IV can be used.
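The fold-over construction described under Resolution IV can be checked numerically. The sketch below assumes an illustrative 8-run resolution III design for five factors in which D is set equal to AB and E to AC; this particular choice and the function names are assumptions, not taken from the references above. Levels are coded as -1 and +1, and a main effect is reported as confounded with a two-factor interaction when their columns are not orthogonal.

```python
from itertools import combinations, product


def base_design():
    """8-run resolution III design for five factors, with D = AB and E = AC."""
    return [(a, b, c, a * b, a * c) for a, b, c in product((-1, 1), repeat=3)]


def fold_over(runs):
    """Append the sign-reversed partner of every run."""
    return runs + [tuple(-x for x in run) for run in runs]


def aliased_pairs(runs):
    """List (main effect, two-factor interaction) pairs with non-orthogonal columns."""
    names = "ABCDE"
    found = []
    for m in range(5):
        for i, j in combinations(range(5), 2):
            dot = sum(run[m] * run[i] * run[j] for run in runs)
            if dot != 0:
                found.append((names[m], names[i] + names[j]))
    return found


res3 = base_design()
res4 = fold_over(res3)
print(len(res3), aliased_pairs(res3))  # main effects confounded with interactions
print(len(res4), aliased_pairs(res4))  # no such confounding after folding over
```

Doubling the runs from 8 to 16 frees every main effect from the two-factor interactions, although the two-factor interactions remain confounded with one another.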
All fractional factorial designs require "generator(s)" for constructing them. The number of generators required depends on the size of the fraction being constructed. For example, if the experimenter wishes to construct a one half fraction, then one generator is required. Two generators are required for a one quarter fraction, and so on. In other words, the number of generators required is equal to the parameter p in the 2^(k-p) expression. However, the experimenter may find it easier to refer to a book such as the one by Montgomery (1986) for constructing a fractional factorial of a given resolution.
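The role of generators can be sketched as follows. The function below is a hypothetical helper, not part of the expert system: it writes a full factorial in the k - p base factors and defines each of the p remaining factors as the product of the base columns named in its generator. The generators used in the example (D = ABC for a half fraction; D = AB and E = AC for a quarter fraction) are illustrative assumptions.

```python
from itertools import product


def fractional_factorial(base_factors, generators):
    """Build a 2^(k-p) design with levels coded -1 and +1.

    base_factors: names of the k - p factors that form a full factorial.
    generators:   dict mapping each of the p added factors to the string of
                  base factor names whose product defines it.
    """
    names = list(base_factors) + list(generators)
    runs = []
    for levels in product((-1, 1), repeat=len(base_factors)):
        run = dict(zip(base_factors, levels))
        for new_factor, word in generators.items():
            value = 1
            for name in word:
                value *= run[name]
            run[new_factor] = value
        runs.append([run[name] for name in names])
    return names, runs


# One generator gives a one half fraction of four factors: 2^(4-1) = 8 runs.
half_names, half_runs = fractional_factorial("ABC", {"D": "ABC"})
# Two generators give a one quarter fraction of five factors: 2^(5-2) = 8 runs.
names, runs = fractional_factorial("ABC", {"D": "AB", "E": "AC"})
print(names)
for run in runs:
    print(run)
```

The number of entries in `generators` is p, and the number of runs is 2^(k-p) in both cases.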
4.4 MODEL ADEQUACY
In some situations, the data collected on the original scale fail to satisfy the assumptions upon which statistical tests are based. As a result, the conclusions drawn from experiments in such situations are of doubtful validity. In this section, the different statistical assumptions and the consequences of violating them are discussed. A review of existing tests for detecting these violations is then provided, as well as remedies to correct them.
The analysis of variance (ANOVA) model is used to analyze the results of an experiment. The ANOVA model is based on several statistical assumptions: (1) linearity, (2) additivity, (3) normality, (4) homogeneity of variance, and (5) independence. Violating one or more of these assumptions may have serious consequences. Therefore, the possibility that one or more of the above assumptions will be violated must not be ignored. In particular, statistical tests should be employed to determine whether or not they have been violated. Certain types of transformation can be recommended for remedying the violation of these assumptions.
However, Taguchi's experimental design does not refer to transformation as a remedy to violation of one or more of the above mentioned assumptions. In his discussion on Kackar's paper regarding Taguchi's methodology, Box (1985) criticizes the absence of the concept of transformation from Taguchi's quality improvement philosophy among other issues.
Throughout his books and articles, Box acknowledges the occasional need of transformation as indicated in Box and Cox (1964). This probably motivated him to develop techniques for best selection of transformation function.
When data are properly transformed, the degree to which the assumptions of the statistical model are satisfied will improve. Usually, transforming the dependent variable (response) can improve the adequacy of the model in planned experiments. However, Box extends the concept of transformation to unplanned experiments, where transforming the independent variables can greatly simplify the form with which the experimenter is working. As indicated by Box and Tidwell (1962), transforming the independent variables and/or the response in regression, and especially in response surfaces, can greatly ease handling the experimental data.
When attempting to find an appropriate transformation function, it is not always necessary to determine which assumption has been violated. In particular, the method developed by Box and Cox (1964), called "maximum likelihood estimator", enables the experimenter to select the appropriate transformation to satisfy the additivity, the homogeneity of variance, and the normality assumptions all together if necessary. In other words, the functional form used for this transformation accounts for all three of these assumptions.
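A minimal sketch of the Box and Cox idea is given below. It assumes the simplest possible case, a single sample of positive observations, whereas in an actual experiment the variance term would come from the residuals of the fitted ANOVA model. The power family and the profile log-likelihood are the standard Box and Cox forms; the grid of candidate values, the sample, and the function names are illustrative assumptions.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transformation of a positive value y."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def log_likelihood(data, lam):
    """Profile log-likelihood of lam for a single sample (constants dropped)."""
    n = len(data)
    z = [box_cox(y, lam) for y in data]
    mean = sum(z) / n
    variance = sum((v - mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(y) for y in data)


def best_lambda(data, grid):
    """Return the grid value of lam with the largest log-likelihood."""
    return max(grid, key=lambda lam: log_likelihood(data, lam))


grid = [i / 4 for i in range(-8, 9)]             # candidate values -2.0, -1.75, ..., 2.0
data = [math.exp(x) for x in (-2, -1, 0, 1, 2)]  # hypothetical sample, symmetric on the log scale
print(best_lambda(data, grid))
```

For this sample the search selects the logarithmic transformation (lam = 0), without the experimenter having to decide beforehand which assumption was violated.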
It is important, as indicated by Box (1985), to confirm that the assumptions mentioned earlier are satisfied before proceeding further with the investigation. Failure to do so may seriously affect the validity of conclusions drawn from the experiment. As mentioned earlier, the assumptions to be confirmed are:
(1) Linearity: This assumption indicates that the
dependent variable or response can be expressed as a
linear function of the independent variables. The
assumption is said to be satisfied when no significant
curvature exists in the relationship between the
response and the independent variables. If a
significant curvature exists, then it will be
necessary to include high order terms in the model.
Failure to do so may dramatically inflate the error
sum of squares. When this occurs, the power of the test
is reduced.
(2) Additivity: This assumption states that the model,
which expresses the dependent variable as a function
of the independent variables, does not contain any
cross product (interaction) terms. This will not be
true in situations in which nonadditivity or
interactions are actually part of the model which
generated the data. Therefore, one must not assume
additivity when there is not enough knowledge about the
phenomena being studied to support such an assumption.
If the experimenter falsely assumes additivity, then
the sum of squares due to nonadditivity will be
incorrectly included in the error sum of squares.
This inflation of the error term will reduce the power
of the test.
(3) Normality: When the differences between the actual
observations and the predicted values (i.e. residuals)
are plotted on a dot diagram, they should approximate
a normal distribution with mean centered at zero. Such
a distribution will be symmetrical and will have its
mode equal to the mean. If there is a significant
departure from normality then the significance level
will be seriously distorted. This will, in turn,
affect the validity of the conclusions which are based
on the level of significance for the test.
(4) Homogeneity of variance: This assumption states that
the error variance should be nearly constant over
the different combinations of the independent
variables. In other words, the plot of residuals
against predicted values must not follow a particular
pattern and should be structureless. However,
sometimes the experimenter encounters the problem of
heterogeneity of variance. If this occurs then the
assumption of homogeneous variance is said to be
violated. This type of problem is better
prevented than corrected because of its
serious consequences for the validity of the conclusions
drawn in such situations. For example, a proper
randomization process can greatly help in preventing
this assumption from being violated. Randomization
will simply average out the effects contributed by
uncontrollable variables that could systematically
build up if no randomization was applied. When this
assumption is violated, the consequences are similar
to those of violating the normality assumption: the
level of significance is distorted and the power of
the test may be reduced.
(5) Independence: This assumption states that the
residuals are independent of each other. In other
words, there must not be a functional relationship
between one residual and another. If this assumption
is violated, the variance of residuals may be biased.
A detailed discussion of the consequences of violating
this assumption and of methods for dealing with it is
provided in the paper written by Glen and Terry (1984).
It is recommended to refer to Eisenhart (1947) for a
complete discussion of the assumptions underlying the
analysis of variance. Cochran (1947) also provides a
discussion of the consequences of violating these assumptions.
It is essential to identify the various tests that
can be used to verify the validity of these assumptions.
When there is more than one test to verify an
assumption, only the most common ones will be identified.
The linearity assumption can be verified using the
lack of fit test in Draper and Herzberg (1971). It should
be noted that performing this test requires at least one
replicate to provide an estimate of the pure error term.
However, it is always advisable to replicate as many times as possible when conducting this test.
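The need for replicates in the lack of fit test can be seen from a small sketch. It uses the usual textbook partition of the residual sum of squares of a straight-line fit into pure error and lack of fit, which is assumed here to match the test cited above; the data and the function name are hypothetical.

```python
def lack_of_fit(groups):
    """Lack-of-fit F statistic for a straight-line fit.

    groups: dict mapping each x level to its list of replicated responses.
    Returns (F, df_lack_of_fit, df_pure_error).
    """
    xs = [x for x, reps in groups.items() for _ in reps]
    ys = [y for reps in groups.values() for y in reps]
    n = len(ys)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    ss_pure_error = 0.0
    ss_lack_of_fit = 0.0
    for x, reps in groups.items():
        mean = sum(reps) / len(reps)
        # Spread of the replicates around their own mean: pure error.
        ss_pure_error += sum((y - mean) ** 2 for y in reps)
        # Distance of the level mean from the fitted line: lack of fit.
        ss_lack_of_fit += len(reps) * (mean - (intercept + slope * x)) ** 2

    df_lof = len(groups) - 2  # distinct levels minus two fitted parameters
    df_pe = n - len(groups)   # observations minus distinct levels
    f_stat = (ss_lack_of_fit / df_lof) / (ss_pure_error / df_pe)
    return f_stat, df_lof, df_pe


curved = {1: [0.0, 2.0], 2: [3.0, 5.0], 3: [8.0, 10.0]}  # hypothetical data
print(lack_of_fit(curved))
```

The resulting F value is compared with the tabulated F distribution for the two degrees of freedom returned. Without replicates the pure error sum of squares has zero degrees of freedom and the ratio cannot be formed.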
The assumption of additivity can be verified using
"Tukey's one degree of freedom test for additivity"
developed by Tukey (1949).
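The computation behind the one degree of freedom test can be sketched for a two-way table with one observation per cell. The formula used is the common textbook form of the nonadditivity sum of squares; the tables and the function name are hypothetical.

```python
def tukey_nonadditivity(table):
    """Return (SS_nonadditivity, SS_interaction_residual) for a two-way table
    with one observation per cell (rows by columns)."""
    a, b = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (a * b)
    row_eff = [sum(row) / b - grand for row in table]
    col_eff = [sum(table[i][j] for i in range(a)) / a - grand for j in range(b)]

    cross = sum(row_eff[i] * col_eff[j] * table[i][j]
                for i in range(a) for j in range(b))
    denom = sum(r * r for r in row_eff) * sum(c * c for c in col_eff)
    ss_nonadd = cross * cross / denom  # carries one degree of freedom

    ss_resid = sum((table[i][j] - grand - row_eff[i] - col_eff[j]) ** 2
                   for i in range(a) for j in range(b))
    return ss_nonadd, ss_resid


additive = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
multiplicative = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
print(tukey_nonadditivity(additive))
print(tukey_nonadditivity(multiplicative))
```

The test statistic is the nonadditivity sum of squares divided by the mean square of what remains of the residual, which has (a - 1)(b - 1) - 1 degrees of freedom. In the multiplicative table the whole residual is attributed to nonadditivity, while in the additive table both quantities are zero.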
Homogeneity of variance can be verified using the
test developed by Bartlett (1937). But since this test is
sensitive to the assumption of normality, another test
developed by Bartlett and Kendall (1946) should be used when the assumption of normality is questionable.
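Bartlett's statistic can be computed directly from the group variances. The sketch below uses the usual textbook form of the statistic, which is referred to a chi-square distribution with (number of groups - 1) degrees of freedom; the data and the function name are hypothetical.

```python
import math


def bartlett_statistic(groups):
    """Bartlett's chi-square statistic for equality of k group variances."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    variances = []
    for g in groups:
        mean = sum(g) / len(g)
        variances.append(sum((y - mean) ** 2 for y in g) / (len(g) - 1))
    pooled = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(s2) for n, s2 in zip(sizes, variances))
    correction = 1.0 + (sum(1.0 / (n - 1) for n in sizes)
                        - 1.0 / (total - k)) / (3.0 * (k - 1))
    return numerator / correction


equal_spread = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0], [5.0, 6.0, 7.0]]
unequal_spread = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
print(bartlett_statistic(equal_spread))
print(bartlett_statistic(unequal_spread))
```

A statistic near zero is consistent with homogeneous variances, while a value above the tabulated chi-square critical value indicates heterogeneity. As noted above, the result should be trusted only when normality is not in question.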
The assumption of normality can be tested using the Chi-square goodness of fit test. However, the experimenter may also use the Kolmogorov-Smirnov test, which is argued by some authors to be more powerful. Also, Shapiro, Wilk and Chen (1968) developed an alternative test which is more attractive since it does not require the inclusion of the mean and the variance as a part of the hypothesis.
The following table summarizes tests and references to test the statistical assumptions that were mentioned above.
Table 1: Summary of tests for validating assumptions*
------------------------------------------------------------
Assumption         Reference Guidelines
------------------------------------------------------------
Linearity          Lack of fit test in Draper
                   and Herzberg (1971)
Additivity         Tukey's one degree of freedom
                   test for additivity in Tukey
                   (1949). Also Tukey and Moore
                   (1954)
Homogeneity of     Bartlett's test in Bartlett
variance           (1937). Also Bartlett and
                   Kendall (1946)
Normality          Goodness of fit test,
                   Kolmogorov-Smirnov test,
                   or Shapiro, Wilk and Chen
                   (1968)
------------------------------------------------------------
* Refer to Glen and Terry (1984) for testing the assumption of independence.
When one of the tests mentioned above shows that an
assumption has been violated, it is essential to correct
this violation by applying the appropriate type of
transformation. There are a variety of transformations
that can be used when the normality assumption is
violated. The most appropriate transformation depends on
the nature of the nonnormality. The method developed by
Box and Cox (1964) allows the experimenter to select the
right transformation regardless of whether or not the
experimenter has knowledge about the nature of the
nonnormality. The same is true for correcting the
nonhomogeneity of the variance since these two violations
can usually be remedied together.
The additivity assumption will be violated when the
true effects are multiplicative or proportional. This
violation can be corrected by applying the logarithmic
transformation, which is a part of the Box and Cox method
of maximum likelihood transformation.
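The effect of the logarithmic transformation on multiplicative effects can be verified with a small sketch. The row and column factors below are hypothetical, and the helper simply sums the squared interaction residuals of a two-way table.

```python
import math


def interaction_ss(table):
    """Sum of squared interaction residuals of a two-way table (one value per cell)."""
    a, b = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (a * b)
    row_mean = [sum(row) / b for row in table]
    col_mean = [sum(table[i][j] for i in range(a)) / a for j in range(b)]
    return sum((table[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
               for i in range(a) for j in range(b))


row_factor = [1.0, 2.0, 4.0]  # hypothetical multiplicative row effects
col_factor = [1.0, 3.0, 9.0]  # hypothetical multiplicative column effects
raw = [[r * c for c in col_factor] for r in row_factor]
logged = [[math.log(y) for y in row] for row in raw]

print(interaction_ss(raw))     # clearly non-zero on the original scale
print(interaction_ss(logged))  # essentially zero after the log transformation
```

Since the logarithm of a product is the sum of the logarithms, the transformed table is additive and the interaction disappears.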
Violating the assumption of linearity can be
corrected by applying the method of Box and Cox (1964).
In unplanned experiments, such as the case in regression,
the experimenter may have to transform some of the
independent variables. Refer to Box and Tidwell (1962)
for a complete discussion on transforming the
independent variables.
CHAPTER FIVE
KNOWLEDGE REPRESENTATION
5.1 INTRODUCTION
In this chapter, the focus will be on developing the
production rules needed for this expert system. The
listing of the production rules in this chapter, however,
is not the same as the program code.
This expert system consists of production rules that
are coded in Turbo Prolog as segments. Each segment
consists of a condition (the 'IF' part) or a conclusion
(the 'THEN' part) about some type of experiment. The
conditions of a rule are connected through predicate
names. An array is used to keep track of the conditions
and the conclusion of the same rule. This makes it simple to modify the program when needed. For example, the user may have to answer a question by choosing from 'y', 'n', 'c'
(for yes, no, and continue, respectively). If the name of
the predicate is sub1, then sub1('y') will lead to another
condition (question) or a conclusion (answer) which is to
follow that particular answer.
Although this type of coding (using arrays) can
consume a lot of memory, it is the easiest way to check the validity of the system. In other words, if the number of conditions for one rule is large and if arrays are not used, then each time one of these conditions has to be checked, the preceding conditions of that rule must also be checked. On the other hand, only the condition at hand has to be considered if arrays are used.
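The chaining of conditions through answers can be sketched in Python. This is only an analogue of the idea and not the Turbo Prolog code of the system: the node names, the dictionary layout, and the simplified questions are all hypothetical, and only the answers 'y' and 'n' are modeled.

```python
# Each node is either a question with answer -> next-node links, or a conclusion.
# Node names and wording are hypothetical and simplified for illustration.
NODES = {
    "start": {"question": "Is there an extraneous source of variability?",
              "y": "one_source", "n": "crd"},
    "one_source": {"question": "Is there only one extraneous source?",
                   "y": "rcbd", "n": "two_sources"},
    "two_sources": {"question": "Are there exactly two sources, each with as "
                                "many levels as there are treatments?",
                    "y": "latin", "n": "other"},
    "crd": {"conclusion": "Run a completely randomized design."},
    "rcbd": {"conclusion": "Run a randomized complete block design."},
    "latin": {"conclusion": "Run a Latin Square design."},
    "other": {"conclusion": "Consult the remaining rules."},
}


def consult(answers, node="start"):
    """Follow the answers ('y' or 'n') from `node` until a conclusion is reached."""
    answers = iter(answers)
    while "conclusion" not in NODES[node]:
        node = NODES[node][next(answers)]
    return NODES[node]["conclusion"]


print(consult(["y", "n", "y"]))
```

Because every node is looked up by name, a single condition can be checked or modified without re-evaluating the conditions that precede it, which is the advantage claimed above for the array-based coding.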
5.2 EXPERIMENTAL DESIGN RULES
In this section, lists of the production rules for all types of experiments covered in this expert system will be provided. Conclusions or 'THEN' parts for some rules can be used as conditions or 'IF' parts for other rules.
Rules for this expert system are divided into two categories: single factor experiments and factorial experiments. They were developed after the knowledge acquisition phase had been completed and verified. It should be mentioned here that these rules do not contain the help facility options since the definitions of various technical terms and designs were included in chapter three.
5.2.1 RULES FOR SINGLE FACTOR EXPERIMENTS
------RULE 1
IF 1. The experiment is a single factor one,
and 2. There is no extraneous source of variability,
THEN * Run a Completely Randomized Design.
------RULE 2
IF 1. There is one source of extraneous variability
and 2. Data is complete (no missing values),
THEN * Run a Randomized Complete Block Design.
------RULE 3
IF 1. There are two sources of extraneous
variability,
and 2. The number of levels for each source is equal
to the number of treatments under study,
and 3. Data is complete (no missing values),
THEN * Run a Latin Square Design.
------RULE 4
IF 1. There are three sources of extraneous
variability,
and 2. The number of levels for each source is equal
to the number of treatments under study,
and 3. Data is complete (no missing values),
THEN * Run a Graeco Latin Square Design.
------RULE 5
IF 1. There are more than three sources of
extraneous variability,
and 2. The number of levels for each source is equal
to the number of treatments under study,
and 3. Data is complete (no missing values),
THEN * Run a Hyper-Graeco Latin Square Design.
------RULE 6
IF 1. A Completely Randomized Design is used,
and 2. Comparing individual treatments is desired,
and 3. It is known what treatments to compare and
contrasts are set,
THEN * Use Orthogonal Contrasts Method.
------RULE 7
IF 1. A Completely Randomized Design is used,
and 2. Comparing individual treatments is desired,
and 3. Contrasts are not set prior to analysis,
and 4. Comparing all possible pairs of treatments is
desired,
and 5. The F-test in the ANOVA for all treatments
combined is significant at 5%,
THEN * Use the Least Significant Difference (LSD) method.
------RULE 8
IF 1. A Completely Randomized Design is used,
and 2. Comparing individual treatments is desired,
and 3. Contrasts are not set prior to analysis,
and 4. Comparing all possible pairs of treatments is
desired,
and 5. The F-test in the ANOVA for all treatments
combined is not necessarily significant,
THEN * Use Duncan's multiple range test.
RULE-?
IF 1. A Complete Randomized design is used,
and 2. Comparing individual treatments is desired,
and 3. Contrasts are not set prior to analysis,
and 4. Comparing all possible pairs of treatments is
not necessary,
and 5. Comparing treatments with a control treatment
is not desired,
THEN * Use Scheffe's method.
------RULE 10
IF 1. A Complete Randomized design is used,
and 2. Comparing individual treatments is desired,
and 3. Contrasts are not set prior to analysis,
and 4. Comparing all possible pairs of treatments is
not necessary,
and 5. Comparing treatments with a control treatment
is desired,
THEN * Use Dunnett's method.
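The comparison rules above (RULES 6 to 10) can be read as a small decision procedure. The following Python sketch is only an illustration of that reading; the function name and its boolean parameters are hypothetical and are not part of the expert system itself, which was written in Turbo Prolog.

```python
def choose_comparison_method(contrasts_preset, all_pairs,
                             control_comparison=False,
                             f_test_significant=False):
    """Illustrative mapping of RULES 6-10 to a comparison method.

    All argument names are assumptions made for this sketch.
    A Complete Randomized Design is taken as given.
    """
    if contrasts_preset:
        return "Orthogonal Contrasts"                 # RULE 6
    if all_pairs:
        if f_test_significant:
            return "Least Significant Difference (LSD)"  # RULE 7
        return "Duncan's multiple range test"         # RULE 8
    if control_comparison:
        return "Dunnett's method"                     # RULE 10
    return "Scheffe's method"                         # RULE 9
```

For example, `choose_comparison_method(False, True)` returns Duncan's multiple range test, which corresponds to the case where all pairs are compared without requiring a significant overall F-test.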
------RULE 11 IF 1. There are two sources of extraneous
variability,
and 2. The number of levels for each source is equal
to the number of treatments under study,
and 3. Data is not complete,
THEN * Use Latin Square Design taking into consideration that every pair of treatments
must appear together the same number of
times.
------RULE 12 IF 1. There are two sources of extraneous
variability,
and 2. The number of levels for at least one source
is not equal to the number of treatments
under study,
THEN * Use Youden Square Design but be sure that data is balanced.
5.2.2 RULES FOR FACTORIAL EXPERIMENTS
------RULE 13 IF 1. All independent variables are known to be
important,
and 2. Number of levels is predetermined for each
independent variable,
and 3. Number of levels is not the same for each of
the independent variables,
THEN * Run a full factorial design. If you are considering all possible interactions, then
you will need at least one replicate to
estimate the error term. Otherwise, use the
unimportant interactions for that purpose.
------RULE 14
IF 1. All independent variables are known to be
important,
and 2. Number of levels is predetermined for each
independent variable,
and 3. Number of levels is equal to two for each of
the independent variables,
THEN * Run a factorial experiment at two levels
(2^k).
------RULE 15 IF 1. All independent variables are known to be
important,
and 2. Number of levels is predetermined for each
independent variable,
and 3. Number of levels is equal to three for each
of the independent variables,
THEN * Run a factorial experiment at three levels (3^k).
------RULE 16 IF 1. All independent variables are known to be
important,
and 2. Number of levels is predetermined for each
independent variable,
and 3. Number of levels is greater than three for
each of the independent variables,
THEN * Run an N^k factorial experiment, where N represents the number of levels and k the number of independent variables.
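RULES 13 to 16 differ only in how the predetermined numbers of levels compare across variables. As a rough sketch, assuming the levels are supplied as a list of integers (the function name and return format are hypothetical), the choice and the number of treatment combinations in one replicate could be computed as follows.

```python
from math import prod

def factorial_design(levels):
    """Illustrative reading of RULES 13-16.

    `levels` lists the predetermined number of levels of each
    independent variable (an assumption of this sketch).
    Returns the design label and the runs in one replicate.
    """
    runs = prod(levels)  # one run per treatment combination
    if len(set(levels)) > 1:
        # RULE 13: levels differ between variables
        return "full factorial (mixed levels)", runs
    n = levels[0]
    # RULE 14 (n = 2), RULE 15 (n = 3), RULE 16 (n > 3)
    return f"{n}^k factorial", runs
```

With three, three, and two levels, `factorial_design([3, 3, 2])` gives a mixed-level full factorial with 18 runs per replicate; with three factors at two levels each, it gives a 2^k factorial with 8 runs.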
------RULE 17
IF 1. Not all independent variables are known to be
important,
and 2. The number of independent variables is large,
THEN * Divide each independent variable into high and low levels. Run a highly fractionated
factorial to identify the important main
effects. This is a screening experiment that
is used to separate the important variables
from the unimportant ones. The rest of the
experiment will be carried out with the
identified important variables.
------RULE 18 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is prior knowledge indicating that
two-factor interactions are not important,
and 4. It is possible to perform a full factorial
experiment at two levels (2^k),
and 5. All experimental runs can be performed under
homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels. Randomize runs, perform experiment, and
collect data. Fit the first order model and
check the adequacy of the model. Proceed to
RULE 34.
------RULE 19 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is no prior knowledge indicating that
two-factor interactions are not important,
and 4. It is possible to perform a full factorial
experiment at two levels (2^k),
and 5. All experimental runs can be performed under
homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels. Randomize runs, perform experiment, and
collect data. Fit the first order model that
includes the two-factor interaction terms and
check the adequacy of the model. Proceed to
RULE 35.
------RULE 20 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is prior knowledge indicating that
two-factor interactions are important,
and 4. It is possible to perform a full factorial
experiment at two levels (2^k),
and 5. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels in blocks. Randomize runs, perform
experiment, and collect data. Fit the first
order model that includes the two-factor
interaction terms and check the adequacy of
the model. Proceed to RULE 34.
------RULE 21 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is prior knowledge indicating that
two-factor interactions are important,
and 4. It is possible to perform a full factorial
experiment at two levels (2^k),
and 5. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels in blocks. Randomize runs, perform
experiment, and collect data. Fit the first
order model and check the adequacy of the
model. Proceed to RULE 35.
------RULE 22 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is prior knowledge indicating that
two-factor interactions are not important,
and 4. It is not possible to perform a full
factorial experiment at two levels (2^k),
and 5. All experimental runs can be performed under
homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at
two levels. Randomize runs, perform
experiment, and collect data. Fit the first
order model and check the adequacy of the
model. Proceed to RULE 34.
------RULE 23 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is no prior knowledge indicating that
two-factor interactions are not important,
and 4. It is not possible to perform a full
factorial experiment at two levels (2^k),
and 5. All experimental runs can be performed under
homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at
two levels. Randomize runs, perform
experiment, and collect data. Fit the first
order model that includes all two-factor
interaction terms and check the adequacy of
the model. Proceed to RULE 35.
------RULE 24 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is prior knowledge indicating that
two-factor interactions are not important,
and 4. It is not possible to perform a full
factorial experiment at two levels (2^k),
and 5. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at two levels in blocks. Randomize runs, perform
experiment, and collect data. Fit the first
order model and check the adequacy of
the model. Be sure that blocks are not
confounded with main effects and main effects
are not confounded with each other. Proceed
to RULE 34.
------RULE 25 IF 1. All independent variables are known to be
important,
and 2. Number of levels for each of the independent
variables is not predetermined,
and 3. There is no prior knowledge indicating that
two-factor interactions are not important,
and 4. It is not possible to perform a full factorial
experiment at two levels (2^k),
and 5. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at two levels in blocks. Randomize runs, perform
experiment, and collect data. Fit the first
order model that includes two-factor
interaction terms and check the adequacy of
the model. Be sure that blocks are not
confounded with main effects and main effects
are not confounded with each other or with
two-factor interactions. Proceed to RULE 35.
------RULE 26
IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that the
two-factor interactions are not important,
and 3. It is possible to perform a full (2^k)
factorial at two levels,
and 4. All experimental runs can be performed under
homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels. Randomize runs, perform experiment, and
collect data. Fit the first order model and
check the adequacy of the model. Proceed to
RULE 34.
------RULE 27
IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that the
two-factor interactions are not important,
and 3. It is possible to perform a full (2^k)
factorial at two levels,
and 4. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels in blocks. Randomize runs, perform
experiment, and collect data but be sure
that blocks are not confounded with main
effects. Fit the first order model and check
the adequacy of the model. Proceed to RULE
34.
------RULE 28
IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that
the two-factor interactions are important,
and 3. It is possible to perform a full (2^k)
factorial at two levels,
and 4. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels in blocks but be sure that blocks are not
confounded with main effects. Randomize
runs, perform experiment, and collect data.
Fit the first order model that includes the
two-factor interaction terms and check the
adequacy of the model. Proceed to RULE 35.
------RULE 29 IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that
the two-factor interactions are important,
and 3. It is possible to perform a full (2^k)
factorial at two levels,
and 4. All experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^k) factorial design at two levels.
Randomize runs, perform experiment, and
collect data. Fit the first order model that
includes the two-factor interaction terms
and check the adequacy of the model. Proceed
to RULE 35.
------RULE 30
IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that the
two-factor interactions are not important,
and 3. It is not possible to perform a full (2^k)
factorial at two levels,
and 4. All experimental runs can be performed under
homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at two levels. Randomize runs, perform experiment, and collect data. Fit the first
order model without two-factor interactions
and check the adequacy of the model.
Proceed to RULE 34.
------RULE 31 IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that the
two-factor interactions are not important,
and 3. It is not possible to perform a full (2^k)
factorial at two levels,
and 4. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at two levels in blocks. Randomize runs,
perform experiment, and collect data. Fit
the first order model and check the
adequacy of the model. Be sure that blocks
are not confounded with main effects and
main effects are not confounded with each
other. Proceed to RULE 34.
------RULE 32
IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that the two-factor interactions are important,
and 3. It is not possible to perform a full (2^k)
factorial at two levels,
and 4. Not all experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at two levels in blocks. Randomize runs,
perform experiment, and collect data. Fit
the first order model that includes two-
factor interaction terms and check the
adequacy of the model. Be sure that blocks
are not confounded with main effects and
main effects are not confounded with each
other or with two-factor interactions.
Proceed to RULE 35.
------RULE 33
IF 1. A screening experiment to identify the
important variables has been conducted,
and 2. There is prior knowledge indicating that
the two-factor interactions are important,
and 3. It is not possible to perform a full (2^k)
factorial at two levels,
and 4. All experimental runs can be performed
under homogeneous conditions,
THEN * Set up (2^(k-p)) fractional factorial design at two levels. Randomize runs, perform
experiment, and collect data. Fit the first
order model that includes two-factor
interaction terms and check the adequacy of
the model. Proceed to RULE 35.
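RULES 26 to 33 cover every combination of three yes/no conditions that follow a screening experiment. The sketch below tabulates them in Python as an illustration only; the dictionary, function name, and returned field names are hypothetical, while the rule numbers, designs, models, and follow-up rules are taken from the rules as stated above.

```python
# Hypothetical lookup keyed by
# (interactions_important, full_possible, homogeneous).
RULE_NUMBER = {
    (False, True, True): 26,
    (False, True, False): 27,
    (True, True, False): 28,
    (True, True, True): 29,
    (False, False, True): 30,
    (False, False, False): 31,
    (True, False, False): 32,
    (True, False, True): 33,
}

def follow_up_design(interactions_important, full_possible, homogeneous):
    """Illustrative summary of RULES 26-33 (after a screening experiment)."""
    key = (interactions_important, full_possible, homogeneous)
    # Full 2^k when affordable, otherwise a 2^(k-p) fraction.
    design = "2^k factorial" if full_possible else "2^(k-p) fractional factorial"
    if not homogeneous:
        design += " in blocks"
    model = "first order"
    if interactions_important:
        model += " with two-factor interactions"
    return {
        "rule": RULE_NUMBER[key],
        "design": design,
        "model": model,
        # Models without interactions continue at RULE 34, others at RULE 35.
        "next_rule": 35 if interactions_important else 34,
    }
```

Laying the rules out this way makes it easy to confirm that each of the eight combinations is handled by exactly one rule.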
------RULE 34 IF 1. There is not an adequate fit of the model,
THEN * The assumption of no interactions in the model is possibly wrong. Include the most
likely two-factor interaction terms and fit
the new model. Proceed to RULE 36.
------RULE 35 IF 1. There is not an adequate fit of the model,
THEN * The assumption of no higher than two-factor interactions is possibly wrong. Include the most
likely three-factor interaction terms and
fit a new model. Proceed to RULE 36.
------RULE 36 IF 1. There is not an adequate fit of the new
model,
THEN * The data may need to be transformed into
another scale. Use Box & Cox (1964) 'Maximum
Likelihood method' to identify the best
transformation. Fit the model with the
transformed data. Proceed to RULE 37.
------RULE 37
IF 1. The fit of the model with the
transformed data is not adequate,
THEN * At this stage, it is possible that a higher
order model is needed. Start by expanding
the first order into a central composite
design (CCD).
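RULES 34 to 37 form a chain of remedies, each applied only if the fit is still inadequate after the previous step. A minimal Python sketch of that chain follows; the table and function names are hypothetical, and the action strings merely paraphrase the rules.

```python
# Each entry: rule number -> (remedy, rule to try next if the fit is
# still inadequate). None marks the end of the chain.
REMEDIES = {
    34: ("add the most likely two-factor interaction terms", 36),
    35: ("add the most likely three-factor interaction terms", 36),
    36: ("transform the data (Box and Cox, 1964)", 37),
    37: ("expand to a central composite design (CCD)", None),
}

def remedy_path(start_rule):
    """List the remedies tried in order, assuming every refit stays inadequate."""
    path = []
    rule = start_rule
    while rule is not None:
        action, next_rule = REMEDIES[rule]
        path.append((rule, action))
        rule = next_rule
    return path
```

Starting from RULE 34 or RULE 35, the chain passes through the transformation step of RULE 36 before a higher order model is considered in RULE 37.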
------RULE 38
IF 1. There is an adequate fit of the model,
and 2. More precision on estimates is desired,
THEN * Replicate the experimental runs as many times as possible and estimate the effects
considering randomization in conducting the
experiment.
------RULE 39
IF 1. The experimenter does not know whether or
not it is possible to perform a full (2^k) factorial,
and 2. Budget and cost per run are known,
THEN * Calculate total cost=(2^k)*(cost per run). Compare total cost with budget.
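The calculation in RULE 39 can be sketched in a few lines of Python. The function name is hypothetical, and treating the full factorial as possible whenever the total cost does not exceed the budget is an assumption of this sketch; the rule itself only says to compare the two.

```python
def full_factorial_affordable(k, cost_per_run, budget):
    """Total cost of one replicate of a 2^k design and whether it fits the budget.

    Assumption: the full factorial counts as possible when
    total cost <= budget.
    """
    total_cost = (2 ** k) * cost_per_run
    return total_cost, total_cost <= budget
```

For instance, five factors at 100 per run cost 32 x 100 = 3200, which exceeds a budget of 2500, so under this assumption a fractional factorial would be considered instead.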
FIGURE 1. (CONTINUED)
FIGURE 2. EXPERIMENTS WITH TWO OR MORE INDEPENDENT VARIABLES.
CHAPTER SIX
SYSTEM EVALUATION
6.1 INTRODUCTION
The utility of the expert system for designing experiments can be measured in terms of the ability to improve the performance of users in designing experiments.
In this chapter, the effectiveness of the system will be evaluated. Conclusions and recommendations for future studies will also be presented.
Feedback from the user is a very important step in evaluating such a system. Therefore, an experiment was conducted to determine whether there is a significant improvement in users' ability to solve experimental design problems when using the expert system.
6.2 EVALUATION EXPERIMENT
A simple experiment was designed to determine whether the expert system enhances the ability of students to solve experimental design problems (see Appendix). The experiment simply compares the answers obtained by solving the problems with and without the system.
Since the problems vary in difficulty, it is important to block this variation in order to obtain precise results. Therefore, a paired t-test was employed to analyze the data, so that the difference in problem difficulty contributes no variation to the comparison. Also, the distribution of points is not the same for all problems since they are not equally important.
Ten students from both graduate and undergraduate classes (five from each) were chosen at random to participate in the experiment. Each student had a minimum of one introductory course in experimental design. The experiment was constructed so that each student took the test on his or her own prior to taking it with the expert system. This was done to ensure that students would not use knowledge gained from the expert system in answering the problems on their own. The knowledge that students may gain from taking the test on their own is negligible since they cannot use any literature.
Points were given to the problems based on the possibilities available for answering. For example, if a student answered "factorial" without specifying "factorial at two levels", a point was taken off, so the student received one point instead of two.
Specifications for breaking up points are provided in the Appendix.
As stated in the Appendix, the hypothesis to be tested is that there is a significant improvement in scores when solving the given experimental design problems with the expert system as opposed to solving them without it. Data were collected and analyzed (see Appendix).
The results of this analysis indicate that use of the expert system produced a significant improvement in the subjects' ability to solve the experimental design problems. Specifically, the calculated t-value is 6.62, while the critical value at α = 0.05 with 7 degrees of freedom is 1.895. This indicates that the expert system was helpful in guiding the user to the appropriate experimental design in the situations tested. Moreover, an examination of the mean score for each of the problems revealed that the scores are higher when the expert system is used.
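The reported t-value can be reproduced from the summary figures given in the Appendix (average difference 0.975, standard deviation of the differences 0.4166, and 8 matched pairs). The following Python sketch shows the arithmetic only; the function name is hypothetical, and the critical value is copied from the Appendix rather than computed.

```python
from math import sqrt

def paired_t_statistic(mean_diff, sd_diff, n_pairs):
    """t0 = mean difference / (sd of differences / sqrt(number of pairs))."""
    return mean_diff / (sd_diff / sqrt(n_pairs))

# Summary values as reported in the Appendix.
t0 = paired_t_statistic(0.975, 0.4166, 8)
t_critical = 1.895  # one-sided, alpha = 0.05, 7 degrees of freedom (quoted)

# The null hypothesis is rejected when t0 exceeds the critical value.
reject_null = t0 > t_critical
```

Rounded to two decimals, `t0` agrees with the value 6.62 quoted above, and `reject_null` is true.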
6.3 CONCLUSIONS
Expert systems have been widely used in a variety of fields. Experimental design is one of the few fields that have not received much attention when it comes to developing intelligent systems. Currently available statistical software packages deal only with numerical calculations and ignore the important problem of deciding the appropriate experimental design to run. This expert system can be effectively used as a teaching aid in experimental design classes. Students will find this system very helpful in the design stage of an experiment. In fact, the help facility can be accessed from any point in the system where help may be needed.
If the system is used to design experiments in real situations, it is preferred that the user have basic experimental design knowledge. That is, the user will be in charge of setting up or referring to the type of design provided by the system.
The purpose of the on-screen help facility is not to teach the subject of designing experiments but to refresh the user's memory on particular designs or technical terms. It is only a dictionary for selected designs and technical terms.
Unlike statistical software packages, this system does not perform any statistical calculations. Its objective is simply to provide guidance to the user when searching for the appropriate design. Therefore, the system is not expected to appeal to those who look for software that does everything from planning through analysis to conclusions. In short, developing this expert system is an attempt to place more emphasis on the design stage of the field because wrong conclusions may be drawn from experiments that are poorly designed.
6.4 RECOMMENDATIONS
1) Future programs can be concerned with relevant subjects
in the field such as Regression, Response Surface
Methodology, experiments with multiple responses, simple
comparative experiments and others. It may be feasible
then to integrate them into one system. For example,
the main menu in the integrated system can be the
individual subjects mentioned above.
2) Future versions can be concerned with a limited
interface with a powerful number calculation language
for analysis of experimental data. For example, this
expert system can be interfaced with Turbo Pascal in
its recent version.
3) Future versions can also be concerned with providing
the user of the design set-up with data collection
sheets and the randomized order of treatment
combinations. This, too, needs some calculation
power. For example, if a factorial design at two
levels has been decided upon, the system may ask the
user to enter variable names and actual number of
experimental runs affordable. The system will then
provide the user with the design set-up in random
order. It can go further to provide the adequate high
and low levels of the controllable variables.
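The third recommendation can be illustrated with a short Python sketch that lists the treatment combinations of a two-level factorial design in random run order. This is not part of the present system; the function name, the coding of low and high levels as -1 and +1, and the optional seed are all assumptions made for the example.

```python
import itertools
import random

def randomized_two_level_design(factor_names, seed=None):
    """Return all 2^k treatment combinations in random run order.

    Levels are coded -1 (low) and +1 (high); `seed` only makes the
    shuffle repeatable and is an assumption of this sketch.
    """
    runs = [
        dict(zip(factor_names, levels))
        for levels in itertools.product((-1, +1), repeat=len(factor_names))
    ]
    random.Random(seed).shuffle(runs)
    return runs

# Example: a data collection sheet for three hypothetical factors.
for run_number, run in enumerate(
        randomized_two_level_design(["A", "B", "C"], seed=1), start=1):
    print(run_number, run)
```

A fuller version would also accept the number of affordable runs and fall back to a fractional design, as the recommendation suggests.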
I would very much recommend a future program that includes the third suggestion without worrying about the numerical calculations for data analysis. In fact, there is no need to develop a system that includes the design and the analysis because data collections and modifications usually lie between the two stages. It should be remembered that the nature of experimentation is sequential and divided into several stages. Therefore, one program that includes all stages may not be effective.
REFERENCES
1) Aleksander, I., Designing Intelligent Systems, UNIPUB, New York, 1984.
2) Baker, T. B., "Quality Engineering By Design: Taguchi's Philosophy", Quality Progress, PP. 32-42, Dec. 1986.
3) Bartlett, M. S., "The Use of Transformation", Proc. Roy. Soc. A, 160:268, 1937.
4) Bartlett, M. S., and Kendall, D. G., "The Statistical Analysis of Variance-Heterogeneity and The Logarithmic Transformation", Journal of Royal Statistical Society Supplement, 8:128, 1946.
5) Black, W. J., Intelligent Knowledge Based Systems: An Introduction, Van Nostrand Reinhold (UK) Co. Ltd., England, 1986.
6) Box, G. E. P., "Discussion on Paper by Kackar (1985)", Journal of Quality Technology, Vol.17, No.4, Oct. 1985.
7) Box, G. E. P. and Cox, D. R., "An Analysis of Transformations", J. R. Statist. Soc. Ser. B, 26:211-243, 1964.
8) Box, G. E. P., Kackar, R. N., Nair, V. N., Phadke, M., Shoemaker, A. C., and Wu, C. F., "Quality Practice in Japan", Quality Progress, PP. 37-41, Mar. 1988.
9) Box, G. E. P., and Tidwell, P. W., "Transformation of Independent Variables", Technometrics, Vol.4, No.4, 1962.
10) Box, G. E. P., and Draper, N. R., Evolutionary Operation, Wiley, New York, 1969.
11) Box, G. E. P. and Hunter, J. S., "The 2^(k-p) Fractional Factorial Designs, Part I", Technometrics, Vol.3, No.3, PP. 311-351, 1961.
12) Box, G. E. P. and Hunter, J. S., "The 2^(k-p) Fractional Factorial Designs, Part II", Technometrics, Vol.3, No.4, PP. 449-459, 1961.
13) Box, G. E. P., Hunter, W. G., and Hunter, J. S., Statistics for Experimenters, John Wiley & Sons, Inc., New York, 1978.
14) Bratko, I., Prolog Programming for Artificial Intelligence, Addison-Wesley Publishers Limited, Great Britain, 1986.
15) Cochran, W. G., "Some Consequences When The Assumptions For The Analysis Of Variance Are Not Satisfied", Biometrics, Vol.3, PP. 22-38, 1947.
16) Dixon, W. J. and Massey, F. J., Jr., Introduction to Statistical Analysis, McGraw Hill Book Company, New York, 1983.
17) Draper, N. R., and Herzberg, A. M., "On Lack of Fit", Technometrics, 13:231-241, 1971.
18) Duncan, D. B., "Multiple Range and Multiple F Tests", Biometrics, Vol.11, PP. 1-42, 1955.
19) Dunnett, C. W., "New Tables for Multiple Comparisons with a Control", Biometrics, Vol.20, PP. 482-491, 1964.
20) Eisenhart, C., "The Assumptions Underlying the Analysis of Variance", Biometrics, Vol.3, PP. 1-21, 1947.
21) Fikes, R. and Kehler, T., "The Role of Frame-Based Representation in Reasoning", Communications of the ACM, Vol.28, No.9, PP. 904-920, September 1985.
22) Gale, W. A., "Knowledge-Based Knowledge Acquisition for a Statistical Consulting System", International Journal of Man-Machine Studies, 26, PP. 55-64, 1987.
23) Glen, T. M., and Terry, W. R., "Statistical problems and treatments in assessing the long term impact of human performance productivity improvement", Proceedings, Seventh International Conference on Production Research, PP. 906-911, 1984.
24) Hahn, G. J., "More Intelligent Statistical Softwares and Statistical Expert Systems: Future Studies", The American Statistician, Vol.39, PP. 1-16, Feb. 1985.
25) Hahn, G. J., "Some Things Engineers Should Know About Experimental Design", Journal of Quality Technology, Vol.9, No.1, PP. 13-20, Jan. 1977.
26) Hand, D. J., "Statistical Expert Systems: Design", The Statistician, Vol.33, PP. 351-369, 1984.
27) Hand, D. J., "Statistical Expert Systems: Necessary Attributes", Journal of Applied Statistics, Vol.12, No.1, PP. 19-27, 1985.
28) Hicks, C. R., Fundamental Concepts in the Design of Experiments, CBS College Publishing, New York, 1982.
29) Hines, W. W. and Montgomery, D. C., Probability and Statistics in Engineering and Management Science, John Wiley & Sons, Inc., New York, 1972.
30) Hunter, S. J., "Statistical Design Applied to Product Design", Journal of Quality Technology, Vol.17, No.4, PP. 210-219, 1985.
31) Johnson, P. E., Nachtsheim, C. J., and Zuakeran, I. A., "Consultant Expertise", Expert Systems, Vol.4, No.3, PP. 180-188, Aug. 1987.
32) Jones, B., "The Computer as a Statistical Consultant", BIAS, Vol.7, No.2, PP. 168-195, 1980.
33) Kackar, R. N., "Off-Line Quality Control, Parameter Design, and the Taguchi Method", Journal of Quality Technology, Vol.17, No.4, PP. 176-188, Oct. 1985.
34) Kackar, R. N., "Taguchi's Philosophy: Analysis and Commentary", Quality Progress, PP. 21-29, Dec. 1986.
35) Kennard, R. W. and Stone, L. A., "Computer-Aided Design of Experiments", Technometrics, Vol.11, No.1, PP. 137-148, 1969.
36) Mendenhall, W., Introduction to Linear Models and The Design and Analysis of Experiments, Duxbury Press, California, 1968.
37) Miller, R. K. and Walker, T. C., Artificial Intelligence Applications in Engineering, SEAI Technical Publishing, U.S.A, 1988.
38) Montgomery, D. C., Design and Analysis of Experiments, John Wiley & Sons, Inc., New York, 1976.
39) Nelson, L. S., "Extreme Screening Designs", Journal of Quality Technology, Vol.14, No.2, PP. 99-100, April 1982.
40) Shapiro, S. S., Wilk, M. B., and Chen, H. J., Journal of American Statistical Association, 63:1343, 1968.
41) Scheffe', H., "A Method for Judging All Contrasts in the Analysis of Variance", Biometrika, Vol.40, PP. 87-104, 1953.
42) Smith, C. A. B., and Hartley, H. O., "Construction of Youden Squares", Journal of the Royal Statistical Society, B, Vol.10, PP. 262-264.
43) Snee, R. D., "Computer-Aided Design of Experiments, Some Practical Experiences", Journal of Quality Technology, Vol.17, No.4, PP. 222-236, Oct. 1985.
44) Steel, R. G. D., and Torrie, J. H., Principles & Procedures of Statistics, McGraw Hill, New York, 1960.
45) Sullivan, L. P., "The Power of Taguchi Methods", Quality Progress, PP. 76-79, June 1987.
46) Tukey, J. W., "One Degree of Freedom for Non-Additivity", Biometrics, Vol.5, PP. 232-242, 1949.
47) Youden, W. J., "Partial Confounding in Factorial Replication", Technometrics, Vol.3, No.3, PP. 353-358, August 1961.
APPENDIX
Evaluation Test
Data Sheet
Point Distribution
Data Summary
Score Averages
Paired t-test
Instructions for Running the Program
Sample Output
EFFICIENCY TEST
The following problems were presented to a number of students, who were asked to determine what type of experimental design is appropriate for each. Each student first had to determine the answer on his/her own and was then allowed to determine the answer by using the expert system. These problems were selected from Montgomery (1976) and Box, Hunter, and Hunter (1978).
------PROBLEM 1 An engineer wishes to determine whether or not four tips produce different readings on a hardness test machine. The machine operates by pressing the tip into a metal specimen, and from the depth of the resulting depression, the hardness of the specimen can be determined. There is only one factor to consider: tip type.
(a) If the metal specimens differ slightly in their hardness, what type of experimental design should be used to eliminate the uncontrolled variability of specimen hardness?
(b) After the type of design has been decided upon, the engineer is looking for a way to compare all pairs of individual tip types. What test do you think he/she should use?
------PROBLEM 2 The percentage of hardwood concentration in raw pulp, the vat pressure, and the cooking time of pulp are being investigated for their effects on strength of paper. Three levels of hardwood concentration, three levels of pressure, and two cooking times are selected. What type of design should be used?
------PROBLEM 3 An experimenter is studying the effect of five different formulations of an explosive mixture used in the manufacture of dynamite on the observed explosive force.
Each formulation is mixed from a batch of materials that is only large enough for five formulations to be tested.
The formulations are prepared by different operators, and there may be substantial differences in the skills and experience of the operators. Five operators are chosen at random for this experiment. It should be noted that this single factor experiment has two uncontrolled variables: operators and batches. What type of experimental design should be run?
------PROBLEM 4 Three factors are studied for their effects on the horsepower necessary to remove a cubic inch of metal per minute. The factors are feed rate at two levels, tool condition at two levels, and tool type at two levels. It should be mentioned that all the factors are important.
What type of design should be used?
------PROBLEM 5 An engineer is studying the effect on filtration time of some product. The variables under consideration are:
(1) Water supply
(2) Raw materials
(3) Temperature of filtration
(4) Hold up time prior to filtration
(5) Recycle
(6) Rate of addition of caustic soda
(7) Type of filter cloth
(a) The engineer does not know for certain which of the above variables are important. What type of design do you recommend to identify the important ones?
(b) Suppose the experimenter has identified variables (1) water supply, (3) temperature of filtration, and (6) rate of addition of caustic soda as important and suspects that two-factor interactions among them are important, too. What is the next step the engineer should take?
(c) If involving the three main effects and their two-factor interactions provided an adequate model, what should the experimenter do to obtain more precision on their estimates?
DATA SHEET
USING SYSTEM (Y/N)? ------SUBJECT # ----
ISE 307/507 ------
PROBLEM 1
PROBLEM 2
PROBLEM 3
PROBLEM 4
PROBLEM 5
POINT DISTRIBUTION
Points for the problems will be distributed as follows:
PROBLEM 1 (a) Complete Randomized Block Design                              2
              Complete Randomized Design                                    1
              Otherwise                                                     0
          (b) Duncan's Multiple Range Test
              Otherwise
PROBLEM 2     Full Factorial Design
              Factorial Design
              Otherwise
PROBLEM 3     Latin Square Design
              Factorial Design
              Otherwise
PROBLEM 4     Factorial Design at 2 levels
              Factorial Design
              Otherwise
PROBLEM 5 (a) Fractional factorial at 2 levels                              2
              Factorial                                                     1
              Otherwise                                                     0
          (b) Factorial at 2 levels + Inclusion of 2-factor interactions    2
              Factorial at 2 levels                                         1
              Otherwise                                                     0
          (c) Replication of experimental runs                              1
              Otherwise                                                     0
DATA SUMMARY
Using the Expert System? No
[Table: scores by problem and subject]
DATA SUMMARY
Using the Expert System? Yes
[Table: scores by problem and subject]
SCORE AVERAGES
                 Average without    Average with    Difference
                 system (1)         system (2)      (2) - (1)
Problem 1 (a)
          (b)
Problem 2
Problem 3
Problem 4
Problem 5 (a)
          (b)
          (c)
PAIRED T-TEST
H0: u1 = u2
HA: u1 < u2
Where: u1 refers to the mean score obtained when solving the problems without using the system.
       u2 refers to the mean score obtained when solving the problems using the system.
Statistic: t0 = da/(sd/sqrt(n))
Where: da is the average of the differences between pairs,
       sd is the standard deviation of the differences, and
       n is the number of matched pairs.
alpha = 0.05
da = 0.975
sd = 0.4166
t0 = 0.975/(0.4166/sqrt(8)) = 6.62
t(0.05, 8-1) = 1.895
t0 > t(0.05, 7)
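The arithmetic of the test statistic can be checked with a few lines of Python. This is a minimal sketch that works only from the summary values reported above (da, sd, n) and the tabulated critical value quoted in the text; the function name `paired_t_statistic` is introduced here for illustration and is not part of the expert system.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n_pairs):
    """Return t0 = mean_diff / (sd_diff / sqrt(n_pairs)) for a paired t-test."""
    return mean_diff / (sd_diff / math.sqrt(n_pairs))


# Summary values reported in this appendix.
t0 = paired_t_statistic(mean_diff=0.975, sd_diff=0.4166, n_pairs=8)

# One-sided critical value t(0.05, 7) as quoted in the text.
T_CRITICAL = 1.895

reject_h0 = t0 > T_CRITICAL
print(f"t0 = {t0:.2f}, reject H0: {reject_h0}")
```

Working from the raw paired scores instead of the summary values would give the same statistic, provided sd is the sample standard deviation of the differences.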
Conclusion: u2 is greater than u1 (reject the null hypothesis and conclude that using the system produces significantly higher scores).

INSTRUCTIONS FOR RUNNING THE PROGRAM
1) Insert a DOS disk in drive A and get the A> prompt
2) Remove the DOS disk from drive A, insert the program
(Turbo Prolog) disk, and type prolog
3) Insert the data disk in drive B
4) Move the cursor to option Files in the main menu and hit
<enter>
5) Move the cursor to option Load and hit <enter>
6) Type b:expert and hit <enter>
7) After the file has been loaded, move the cursor to option
Run in the main menu and hit <enter>
EXPERT SYSTEM FOR DESIGNING STATISTICAL EXPERIMENTS
Developed by: Mustafa Shraim
Advisor: Dr. W. Robert Terry
Department of Industrial & Systems Engineering
Ohio University
Athens, Ohio
Press space bar to continue
THIS PROGRAM IS DESIGNED TO HELP THE USER IN SELECTING THE APPROPRIATE EXPERIMENTAL DESIGN. IT DOES NOT PROVIDE ANY NUMERICAL CALCULATIONS OF EXPERIMENTAL DATA. IT SHOULD BE NOTED THAT THIS PROGRAM APPLIES TO SINGLE DEPENDENT VARIABLE OR RESPONSE AND FIXED EFFECTS SITUATIONS ONLY
Do you wish to start the program (y/n)? y
1 -- There is only one factor (independent variable) to investigate
2 -- There are two or more factors (independent variables) to investigate
Enter number of choice or 'q' to quit or 'h' for help: 1
Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?
Enter answer or 'q' to quit or 'h' for help : h
Uncontrollable variability: variability that comes from extraneous sources over which the experimenter has no control. Example: Temperature in a field experiment
Press space bar to continue
Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?
Enter answer or 'q' to quit or 'h' for help : n
Run a complete randomized design to compare K>2 treatment means or levels of the single factor
Enter 'c' to continue or 'h' for help h
Complete Randomized Design: A very simple form of experimental design in which the treatments involved are allocated to the experimental units purely on a random basis
Press space bar to continue
Do you wish to compare individual treatment means (y/n)?
Enter answer (or 'q' to quit): y
Is it known what treatments to compare and contrasts are set (y/n)?
Enter answer or 'q' to quit or 'h' for help: n
Do you wish to compare all possible pairs of treatment means (levels in single factor experiment) (y/n)?
Enter answer ('q' to quit): y
Is F-test for all treatments (levels) combined significant at 5% (y/n)?
Enter answer ('q' to quit): n
Use Duncan's multiple range test
Enter 'c' to continue or 'h' for help h
Duncan's multiple range test is a powerful tool to compare treatment means. Refer to Duncan (1955)
Press space bar to continue
1 -- There is only one factor (independent variable) to investigate
2 -- There are two or more factors (independent variables) to investigate
Enter number of choice or 'q' to quit or 'h' for help: 1
Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?
Enter answer or 'q' to quit or 'h' for help : y
1 -- Source of uncontrollable variability comes from one factor
2 -- Source of uncontrollable variability comes from more than one factor
Enter number of choice ('q' to quit ): 2
1 -- Uncontrollable variability comes from two factors
2 -- Uncontrollable variability comes from three factors
3 -- Uncontrollable variability comes from more than three factors
Enter number of choice ('q' to quit): 1
Is number of levels for each of the extraneous factors equal to the number of treatments (or levels) to be compared (y/n)?
Enter answer ('q' to quit): y
Is data complete (i.e. no missing values) (y/n)?
Enter answer ('q' to quit): y
Use Latin square design
Enter 'c' to continue or 'h' for help h
Latin square design is used when the uncontrollable variability comes from two sources which may be identified as rows and columns. One Roman letter is used in each cell of the square with no repetition in the same row and column. For example, if we would like to investigate three treatments A, B, and C with two uncontrollable variables each at three levels then the square is
A C B
C B A
B A C
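The defining property in this help screen (each letter once per row and once per column) is easy to verify programmatically. The sketch below checks the 3x3 square quoted in the help text and also builds a square by cyclic shifting, which is one common construction and not necessarily how the expert system or the thesis produced its example. The function names are illustrative assumptions.

```python
def is_latin_square(square):
    """True if every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    # Every row must be a permutation of the same n symbols.
    if not all(len(row) == n and set(row) == symbols for row in square):
        return False
    # Every column must also contain each symbol exactly once.
    return all({row[j] for row in square} == symbols for j in range(n))


def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one place per row."""
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]


# The square shown in the help screen.
help_screen_square = [list("ACB"), list("CBA"), list("BAC")]
print(is_latin_square(help_screen_square))

for row in cyclic_latin_square(["A", "B", "C"]):
    print(" ".join(row))
```

In practice the rows, columns, and treatment letters of such a square are usually randomized before the experiment is run.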
Press space bar to continue
1 -- There is only one factor (independent variable) to investigate
2 -- There are two or more factors (independent variables) to investigate
Enter number of choice or 'q' to quit or 'h' for help: 2
Are all independent variables known to be important (y/n)?
Enter answer or 'q' to quit or 'h' for help: y
Is number of levels for each of the independent variables predetermined (y/n)?
Enter answer or 'q' to quit or 'h' for help: y
Is number of levels different from variable to variable (y/n)?
Enter answer ('q' to quit): y
Run a full factorial design. If you are considering all possible interactions then you will need at least two replicates of one treatment combination. Otherwise you may use the unimportant interactions to estimate the error term
Enter 'c' to continue or 'h' for help h
Full factorial: Involves all possible treatment combinations in two or more factor experiments. For example, if three factors (independent variables) are considered in an experiment and each is at two levels then the full factorial involves 2x2x2 = 8 treatment combinations.
Interaction: Is a measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the level(s) of another or others
Press space bar to continue
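The count of treatment combinations in the full factorial help screen can be reproduced by enumerating the Cartesian product of the factor levels. This is a small sketch using Python's standard `itertools.product`; the factor names A, B, C and the level labels are placeholders chosen for illustration, not taken from the thesis.

```python
from itertools import product


def full_factorial(levels_per_factor):
    """Return every treatment combination as a dict of factor -> level."""
    names = list(levels_per_factor)
    level_lists = [levels_per_factor[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Three factors, each at two levels, as in the help screen example.
design = full_factorial({
    "A": ["low", "high"],
    "B": ["low", "high"],
    "C": ["low", "high"],
})
print(len(design))  # 2 x 2 x 2 treatment combinations

# The same function handles unequal numbers of levels per factor.
mixed = full_factorial({"A": [1, 2, 3], "B": [1, 2]})
print(len(mixed))
```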
Run a full factorial design. If you are considering all possible interactions then you will need at least two replicates of one treatment combination. Otherwise you may use the unimportant interactions to estimate the error term
Enter 'c' to continue or 'h' for help c
1 -- There is only one factor (independent variable) to investigate
2 -- There are two or more factors (independent variables) to investigate
Enter number of choice or 'q' to quit or 'h' for help: 2
Are all independent variables known to be important (y/n)?
Enter answer or 'q' to quit or 'h' for help: n
Divide each independent variable into high and low levels. Run a highly fractionated factorial design at two levels. This will serve as a screening experiment for identifying the set of the significant variables
Enter 'c' to continue or 'h' for help h
Screening: The process of separating the important independent variables from the unimportant. Screening is applied at an early stage of experimental design when a large number of variables is present
Press space bar to continue
Divide each independent variable into high and low levels. Run a highly fractionated factorial design at two levels. This will serve as a screening experiment for identifying the set of the significant variables
Enter 'c' to continue or 'h' for help c
Is there any prior knowledge indicating that any of the two-factor interactions are important (y/n)?
Enter answer or 'q' to quit or 'h' for help: n
Is it economically possible to perform all 2^k experimental runs given the budget and cost per run (y/n)?
Enter answer or 'q' to quit or 'h' for help: y
Can all runs be performed under homogeneous condition (y/n)?
Enter answer or 'q' to quit or 'h' for help: y
Set up a full factorial design at two levels (2^k). Randomize runs, perform experiment and collect data. Fit the first order model that includes two-factor interactions. Conduct a lack of fit test (use higher order interactions to estimate error if experiment is not sufficiently replicated)
Enter 'c' to continue or 'h' for help h
Randomization: A set of treatments applied to a set of units is said to be randomized when the treatment applied to any given unit is chosen at random from those available and not already allocated
First order model: (Linear model) contains only the terms that are linear in nature
Press space bar to continue
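The "randomize runs" step recommended above can be sketched as follows: list the 2^k runs of a two-level design in coded units and shuffle their order. The -1/+1 coding for low/high and the `seed` parameter are assumptions made for this illustration; the expert system itself gives advice only and does not generate run orders.

```python
import random
from itertools import product


def randomized_two_level_design(k, seed=None):
    """Return the 2**k runs of a two-level full factorial in random order.

    Each run is a tuple of coded levels, -1 (low) or +1 (high), one per factor.
    """
    runs = list(product((-1, 1), repeat=k))
    rng = random.Random(seed)  # a seed makes the run order reproducible
    rng.shuffle(runs)
    return runs


run_order = randomized_two_level_design(k=3, seed=1)
for run_number, levels in enumerate(run_order, start=1):
    print(run_number, levels)
```

Fixing the seed is useful for documenting the run order that was actually used; leaving it as `None` draws a fresh order each time.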
2^k: (Factorial design at two levels) refers to k factors, each being investigated at two levels. This is the most popular factorial design because it does not require a large number of runs. However, this design estimates the linear effects of the independent variables and does not account for nonlinear effects
Interaction: Is a measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the level(s) of another or others
Press space bar to continue
Set up a full factorial design at two levels (2^k). Randomize runs, perform experiment and collect data. Fit the first order model that includes two-factor interactions. Conduct a lack of fit test (use higher order interactions to estimate error if experiment is not sufficiently replicated)
Enter 'c' to continue or 'h' for help c
Is there a good fit of the model (y/n)?
Enter answer or 'q' to quit or 'h' for help: h
The model is adequate and has a good fit if there are no significant violations in one or more of the following statistical assumptions: Linearity, Normality, Independence, Homogeneity of variance, and Additivity
Press space bar to continue
Linearity
Definition: The dependent variable (response) can be expressed as a linear function of the independent variables. Such an assumption will be valid when the surface which represents the relationship between the dependent variable and the independent variables has no significant curvature
Testing: Linearity can be tested by employing the lack of fit test in Draper & Herzberg (1971)
Remedy: If the linearity assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)
Press space bar to continue
Normality
Definition: The differences between predicted and actual values, i.e. the residuals, must follow a normal distribution with the mean centered at zero
Testing: Normality can be tested by employing chi-square goodness of fit test, Kolmogorov-Smirnov test, or the W test developed by Shapiro, Wilk & Chen (1968)
Remedy: If the normality assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)
Press space bar to continue
Independence
Definition: Residuals should be independently distributed random variables. This means that the knowledge of the magnitude of a given residual does not contain any information for predicting the values of other residuals
Testing: Independence assumption can be tested by employing Durbin-Watson test
Remedy: Refer to Glen & Terry (1984)
Press space bar to continue
Homogeneity Of Variance
Definition: The variances of the distributions of the residuals are constant for all combinations of values of the independent variables
Testing: The assumption of homogeneous variance can be tested by employing a test developed by Bartlett (1937) or the one by Bartlett and Kendall (1946)
Remedy: If homogeneity of variance is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)
Press space bar to continue
Additivity
Definition: The model, which expresses the dependent variable as a function of the independent variables, must not contain any cross product or interaction terms. However, the assumption that interactions are not present will not always be valid
Testing: The additivity assumption can be tested by employing 'One degree of freedom test for additivity' by Tukey (1949)
Remedy: If the additivity assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)
Press space bar to continue
Is there a good fit of the model (y/n)?
Enter answer or 'q' to quit or 'h' for help: n
The assumption of no higher than two-factor interactions is possibly wrong. Include the most likely three-factor interaction terms in the model. Fit the first order model
Enter 'c' to continue or 'h' for help h
3-factor interaction: A measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the levels of the other two factors
Press space bar to continue
The assumption of no higher than two-factor interactions is possibly wrong. Include the most likely three-factor interaction terms in the model. Fit the first order model
Enter 'c' to continue or 'h' for help c
Is there a good fit of the model (y/n)?
Enter answer or 'q' to quit or 'h' for help: n
Transformation of the data may be applied to achieve model adequacy. The lack of adequacy is a result of violation of one or more of the statistical assumptions
Enter 'c' to continue or 'h' for help h
Transformation: The metrics in which data is recorded are chosen merely as a matter of convenience. However, they may not be those in which the system is most simply modeled. In such cases, transforming the response or the independent variable(s) or both on another scale provides us with a simple model
Press space bar to continue
Is there a good fit on the transformed data (y/n)?
Enter answer or 'q' to quit: y
Is more precision on estimates of effects desired (y/n)?
Enter answer or 'q' to quit: y
Replicate the experimental runs as many times as possible to achieve the precision desired
Enter 'c' to continue or 'h' for help
Replication: The execution of an experiment or a survey more than once so as to increase precision and to obtain a closer estimation of sampling error
Press space bar to continue
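The link between replication and precision described in the help screen can be illustrated numerically. The sketch below assumes independent runs with a common error standard deviation sigma, in which case the standard error of a treatment mean is sigma/sqrt(r) for r replicates. The value sigma = 1.0 is an arbitrary illustration, not a figure from the thesis.

```python
import math


def standard_error_of_mean(sigma, replicates):
    """Standard error of a treatment mean based on `replicates` independent runs."""
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    return sigma / math.sqrt(replicates)


# Illustrative error standard deviation (an assumption, not thesis data).
SIGMA = 1.0

for r in (1, 2, 4, 8):
    print(r, round(standard_error_of_mean(SIGMA, r), 3))
```

Under these assumptions, halving the standard error requires four times as many replicates, which is why the budget and cost per run limit how much precision replication can buy.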