AN EXPERT SYSTEM FOR DESIGNING

STATISTICAL EXPERIMENTS.

A thesis Presented to

The Faculty of The College of Engineering And Technology

Ohio University

In Partial Fulfillment of

The Requirements for the Degree of

Masters of Science

Mustafa S. Shraim

ACKNOWLEDGEMENT

I would like to express my thanks to my advisor, Dr. W. Robert Terry, for his help and encouragement throughout this research. Also, I would like to extend my thanks to my committee members, Dr. Francis Bellezza and Prof. Ken Cutright for their guidance and advice.

This thesis is dedicated to my parents, Saadeddin and Etaf Shraim, for their continuous love and support and to my girlfriend, Ellen Battistone, without whom .....

TABLE OF CONTENTS

Chapter One: INTRODUCTION
1.1 Experimental Design Review
1.1.1 Basic Experimental Design Principles
1.1.2 Experimental Design Procedure
1.1.3 Statistical Techniques in Experimental Design
1.2 Experimental Design and Quality Improvement
1.3 Statistical Software
1.4 Knowledge Engineering Background

Chapter Two: NEEDS ANALYSIS

Chapter Three: TECHNICAL AND STATISTICAL TERMINOLOGY
3.1 Definitions of Technical Terms
3.2 Definitions of Experimental Designs
3.3 Definitions of Tests

Chapter Four: EXPERIMENTAL DESIGN
4.1 Introduction
4.2 Basic Experimental Design Principles
4.3 Screening Experiments
4.4 Model Adequacy

Chapter Five: KNOWLEDGE REPRESENTATION
5.1 Introduction
5.2 Experimental Design Rules
5.2.1 Rules for Single Factor Experiments
5.2.2 Rules for Factorial Experiments

Chapter Six: SYSTEM EVALUATION
6.1 Introduction
6.2 Evaluation Experiments
6.3 Conclusions
6.4 Recommendations

Appendices:
Evaluation Test
Data Sheet
Point Distribution
Data Summary
Score Averages
Paired t-test
Instructions for Running the Program
Sample Output

TABLE OF FLOW CHARTS

CHAPTER ONE

INTRODUCTION

1.1 EXPERIMENTAL DESIGN REVIEW

A "true experiment" may be defined as a study in which some independent variables are manipulated

(Hicks (1982)). Their effects on the response or the

dependent variable are determined and levels of the

independent variables are controlled and not chosen at random. The various combinations of levels of independent variables are referred to as treatments. These treatments are run in random order.

The main reason for designing experiments is to obtain accurate information about the study at the lowest cost. Poorly planned experiments are often costly because unnecessary experimental runs are performed. Moreover, it is often necessary to study situations in which the effects of two or more independent variables combine in a nonadditive fashion. These nonadditive effects are called interaction effects. If the experiment is performed by changing the level of one independent variable at a time while holding the others constant, then interaction effects can not be determined. Another reason for designing an experiment is the need to measure the experimental error, which is a crucial component in conducting statistical tests. These statistical tests give the experimenter a sound basis for interpreting the results of the data analysis.
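To make the idea of an interaction concrete, the following short Python sketch (hypothetical yield values chosen purely for illustration, not data from this study) computes the main effect of one factor and the two-factor interaction in a 2 x 2 experiment as a "difference of differences".

# Hypothetical 2 x 2 experiment: factors A and B, each at a low (-) and
# high (+) level, with one response value per treatment combination.
yield_ = {
    ('-', '-'): 60.0,
    ('+', '-'): 72.0,
    ('-', '+'): 65.0,
    ('+', '+'): 90.0,
}

# Main effect of A: average change in response when A moves from low to high.
effect_A = ((yield_[('+', '-')] - yield_[('-', '-')]) +
            (yield_[('+', '+')] - yield_[('-', '+')])) / 2

# Interaction AB: half the "difference of differences" -- how much the effect
# of A changes when B is moved from its low level to its high level.
effect_A_at_B_low = yield_[('+', '-')] - yield_[('-', '-')]    # 12.0
effect_A_at_B_high = yield_[('+', '+')] - yield_[('-', '+')]   # 25.0
interaction_AB = (effect_A_at_B_high - effect_A_at_B_low) / 2

print(effect_A)         # 18.5
print(interaction_AB)   # 6.5; a nonzero value means the effects are nonadditive

A one-factor-at-a-time plan that never runs the (+, +) combination could not estimate this interaction at all.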

Steel and Torrie (1960) stated that "the experimenter who contacts the consulting statistician at the planning stage of the experiment rather than after its completion improves the chances of achieving his aims". According to them, the statistician is often faced with data that provide only biased estimates of effects and differences. The conclusions drawn from the analysis of such data may not apply to the population the experimenter had in mind. However, if the experimenter applies proper experimental design, then valid conclusions based on precise estimates can be drawn.

1.1.1 BASIC EXPERIMENTAL DESIGN PRINCIPLES

Often, after defining the independent variables and the response(s) of an experiment, the data are collected without careful consideration of how they should be collected. At other times, the experiment is conducted with little or no attention to the extraneous variability that can contribute to the experimental error. For example, if temperature is known to have an effect on the outcome (response) of a certain experiment, then performing the experimental runs under different temperatures could inflate the experimental error. As a result, the validity of the conclusions drawn is affected.

There are three basic principles in designing experiments: (1) blocking, (2) randomization, and (3) replication. Blocking is a method used in experimental design to draw valid conclusions from an experiment. This is done by performing the experimental runs in subexperiments, or blocks. Each block contains the experimental runs that are performed under homogeneous conditions. The variability between blocks can then be eliminated, since the extraneous conditions of one block may differ from those of another. Randomization is the key principle in designing statistical experiments. It means that the experimental runs are selected and performed in a random order. The emphasis on randomization comes from the fact that statistical methods require that the observations, which contain the experimental error, be independently distributed random variables. Replication refers to the repetition of the basic experimental runs. It is important in designing statistical experiments for two reasons. First, it provides the experimenter with an estimate of the experimental error, which is an essential element in conducting statistical tests. Second, it provides the experimenter with a more precise estimate of the mean response.

A detailed discussion of these basic experimental design principles and other aspects will be presented in Chapter Four.

1.1.2 EXPERIMENTAL DESIGN PROCEDURE

In this section, a description of the procedure for experimentation will be presented. This includes the design, the performance, the analysis, and the conclusions.

(1) Statement of the problem: The objective of an experiment must be clearly defined.

(2) Choice of variables: this includes both the independent variables and the response(s). The levels of the independent variables over which the investigation will be performed must also be defined. The dependent variable, or response, must be selected so that it (a) provides useful information about the study, and (b) can be measured.

(3) Choice of experimental design: The experimenter must take the basic principles mentioned above into consideration when he or she selects an experimental design. The objective of the experiment as well as budget constraints are also important issues.

(4) Performing the experimental runs: the experimenter must ensure that proper randomization, measurement, and homogeneity of the environment in which the experiment is taking place are proceeding as planned.

(5) Analyzing the experiments: the data collected from the experiment should be analyzed using proper statistical techniques. Graphical methods are also helpful, especially in checking the adequacy of the model.

(6) Drawing conclusions: statistical techniques provide guidance in drawing conclusions from the analysis of the experimental data. Statistical significance of effects and differences can be easily recognized for recommendations on the findings.

1.1.3 STATISTICAL TECHNIQUES IN EXPERIMENTAL DESIGN

The experimenter can safely use statistical techniques when the requirements of the experimental design are met. Namely, recognizing the basic principles mentioned earlier and building the design upon them enables the experimenter to use statistical methodology in a straightforward manner. The following are some considerations for the experimenter to keep in mind.

(1) The choice of the experimental design and analysis should be kept simple; simple designs and analyses are almost always best (Montgomery (1976)).

(2) The experimenter should recognize the sequential approach in experimentation. That is, the initial experiment need not be comprehensive, since variables may later be added or dropped. This is always true in screening experiments, where the experimenter is trying to separate the important variables from the unimportant ones.

(3) Interpretation of statistical significance should be accompanied by practical knowledge about the experimental situation. For example, an engineer may determine that a modified carburetor of an automobile can save 0.1 mile/gallon, which is statistically significant. However, if the cost of modification is $1000/carburetor, then it is obviously not worth the effort, especially when the price of gas is low.

1.2 EXPERIMENTAL DESIGN AND QUALITY IMPROVEMENT

It is important to consider the role that experimental design has been playing in quality improvement. Nearly all manufacturing firms use some type of quality control procedure to achieve quality products. However, these procedures may be costly to implement, resulting in overpriced products. They may also be ineffective due to the lack of suitable experimental design methodologies. Therefore, some type of consultation system through which experimental design expertise can be conveyed to quality control personnel is needed. Engineers can not simply use existing statistical analysis software packages, which do not help in planning the right experiment.

In the past five years, there has been a great change in the way American firms think when it comes to quality improvement. Dr. Genichi Taguchi, a Japanese quality consultant, has introduced new methodologies, called "Off-line Quality Control", to achieve quality improvement.

Viewing Taguchi's quality control methods from the perspective of the Japanese industrial success, some American firms readily adopted them. His philosophy is divided into two main categories: (1) on-line quality control methods and (2) off-line quality control methods. His on-line quality control methods consist of statistical process control, prediction and correction, and inspection. The expert system developed in this research is concerned only with off-line quality control, since it deals with designing experiments.

Box (1985) has observed that the method of off-line quality control advocated by Taguchi, which utilizes experimental design, suffers from the following:

(1) The orthogonal array design ignores the existence of interactions between the controllable variables and the uncontrollable or noise variables.

(2) Taguchi seems to ignore the possibility of data transformation. In practice, the possibility of dealing with inadequate models should always be kept in mind.

(3) The sequential approach in experimentation is not referred to by Taguchi despite its proven efficiency.

1.3 STATISTICAL SOFTWARE

Traditional experimental design software may be misused for two major reasons, as stated by Hand (1984). The first is the difficulty in operating the software: it requires a comprehensive understanding that can only be achieved by studying the details in its manuals. The second reason comes from the fact that some users mistakenly analyze their data using a wrong or less suitable experimental design. The second reason tends to be more harmful than the first because it may lead to invalid conclusions. This problem is likely to occur when users are so eager to analyze the data that they tend to use the easiest design available. However, lack of knowledge about experimental design and failure to realize the consequences of poorly designed experiments are also very important reasons.

Analytical software packages assume that the user knows what he or she is doing with regard to selecting the right computerized algorithm. This assumption can be invalid if the user lacks this expertise. Therefore, the need for an expert system to answer the user's questions on the selection of the right experimental design concerns many researchers in this field. The field of experimental design is one that has not been given a lot of attention when it comes to the development of such systems, for the following reasons:

(1) Experimental design is used in many domains such as psychology, agriculture, engineering, physical science and others, which makes decision making rather difficult. These domains differ in their requirements for planning and for making inferences from results. For example, in complex high technology, a large number of experimental runs can not be afforded because of the high cost of a run. In fields like agriculture or psychology, however, experimental runs may be replicated several times.

(2) The field of experimental design has a wide range, which makes it somewhat ineffective to cover in one consultation system.

(3) Some researchers (Hand (1984)) argue that the difficulty also comes from the enormous amount of technical terms that the user must be familiar with before efficiently using the consultation system. For example, blocking is an important aspect of experimental design that must be understood by the user. If he or she is not familiar with this concept, the whole experimental design will be affected and, as a result, so will the conclusions drawn from the experiment. The cost of conducting experiments may also be increased, as well as the risk of error.

(4) It has proven difficult to develop a system that contains symbolic knowledge as well as powerful number manipulation, due to the lack of a computer language that effectively suits both. Powerful number manipulation languages such as Fortran are not efficient in representing symbolic knowledge.

The points mentioned above are the major difficulties that researchers face in trying to develop a computer program that is able to emulate the human expert in reasoning and decision making with regard to designing experiments. The domain of experimental design can not be compared to knowledge of geography, for example. The latter can be represented easily in a knowledge-based system because of its static pieces of information. On the other hand, designing an experiment is usually done in several phases. One phase may totally depend on the information obtained from a previous one as is the case in the sequential approach in experimentation.

There are two major questions that need to be answered when it comes to developing an expert system for designing experiments. The first is: for what purpose is the system being developed? The second is: for whom will the system be designed?

The first question has two possible answers: (1) for the selection of a suitable experimental design and (2) for number manipulation. The second need has been addressed for many years. The market is full of software packages that compete with one another on how fast the analysis is carried out, how easy it is to input the data, and how much data the system can handle at one time. The first need, however, is hardly addressed by the software market, despite its crucial role in effective experimental design. This point is most forcefully made by Hahn (1977) when he stated "the world's best statistical analysis can not rescue a poorly planned experimental program".

The second question, which deals with the group of people for whom the system is designed, must be answered clearly so that the system developed is effective. An expert system in this field can not be effective for a wide variety of people. The novice's expectations of the system differ significantly from those of the well-informed person. The well-informed person does not want to be bothered with elementary questions and explanations, while the novice is likely to require a lot of help throughout the session. The novice is likely to appreciate an expert system that does everything, from planning the experiment through the analysis to drawing inferences from the findings. Attempts to satisfy the requirements of both extremes can dramatically affect the effectiveness of the system. Consequently, identifying the group of people for whom the system will be designed is an essential step in the development of an expert system for designing experiments.

1.4 KNOWLEDGE ENGINEERING BACKGROUND

Knowledge engineering is basically the reduction of a knowledge domain into sets of facts and rules. The knowledge engineering process consists of two major phases: knowledge acquisition and knowledge representation. Together, the two phases can be described as the process of acquiring knowledge from an expert on a particular domain, coding this knowledge, and computerizing it to achieve a system that effectively emulates the human expert in the decision making process. Before the knowledge engineer starts acquiring knowledge, he or she must define the problem and the domain on which the knowledge will be acquired. There are three major requirements that need to be satisfied when it comes to identifying the domain:

(1) The domain must be narrow, to achieve a more effective system.

(2) The domain must have sufficient expertise. For example, an attempt to develop an expert system on astrology will probably fail due to a serious lack of expertise in that domain.

(3) The domain should contain reliable knowledge that does not significantly change over time.

Knowledge-based systems are different from other types of systems. The following features of knowledge-based systems distinguish them from other systems:

(1) Knowledge-based systems, or expert systems, are limited to a specific domain of expertise.

(2) The reasoning mechanism of one expert system may be applied to other expert systems.

(3) Expert systems can apply deductive reasoning.

(4) Expert systems provide symbolic output for better interaction with the user.

Traditional computer systems are better at procedural programming, such as numerical and algorithmic processes, whereas expert systems have the ability to emulate the human expert in a particular domain. There are many situations in which expert systems are needed, ranging from scarce or inconsistent availability of expertise to providing education and training facilities. They can also replace the human expert when obtaining his or her expertise is time-consuming or expensive.

While expert systems are not able to think, they still have some advantages over the human expert: consistency in delivering answers and not jumping to conclusions without considering all details. In other words, unlike human experts, expert systems are not emotionally influenced in the decision making process, an influence which sometimes leads to inconsistent judgement.

An expert system usually consists of two major parts: (1) the knowledge base and (2) the inference engine. The knowledge base consists of the knowledge that is specific to the domain of application, including facts about the domain and rules that describe relations in the domain. The inference engine performs two tasks: it examines the facts and determines whether or not they are satisfied using the rules in the rulebase, and it determines the order in which these rules are scanned and fired.

The order in which the rules are examined is an important issue for the knowledge engineer to keep in mind when developing an expert system. The system must have a starting point from which the searching procedure begins. This starting point depends upon the objective of the application. In any case, there are two scanning procedures to consider: forward chaining and backward chaining.

In forward chaining, the inference engine starts with the facts and works its way forward in an attempt to find a conclusion that matches the given facts. In backward chaining, the inference engine starts with an assumed conclusion and works its way backward in an attempt to find facts that support that conclusion. For example, in medical diagnostics, the goal is to identify the symptoms (facts) that support a given state of illness (conclusion). It should be mentioned here that some expert systems employ a combination of backward and forward chaining.
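As an illustration of these two scanning procedures, the following short Python sketch (not part of the Turbo Prolog system described in this thesis; the rules and facts are invented for illustration) applies the same small rule base in both directions.

# Each rule: IF all premises hold THEN the conclusion holds.
RULES = [
    ({"single_factor", "one_nuisance_source"}, "use_randomized_block_design"),
    ({"single_factor", "two_nuisance_sources"}, "use_latin_square_design"),
    ({"several_factors", "screening_stage"}, "use_fractional_factorial"),
]

def forward_chain(facts):
    """Start from the known facts and fire rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts):
    """Start from an assumed conclusion and look for facts and rules that support it."""
    if goal in facts:
        return True
    return any(all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)

facts = {"single_factor", "two_nuisance_sources"}
print(forward_chain(facts))                              # derives use_latin_square_design
print(backward_chain("use_latin_square_design", facts))  # True
print(backward_chain("use_fractional_factorial", facts)) # False

The forward pass derives every conclusion supported by the facts, while the backward pass confirms or rejects one assumed conclusion at a time.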

Expert system programs can be written in many languages, including old ones such as Basic and Fortran. However, artificial intelligence programs are best optimized if they are written in languages that provide the appropriate data structures. This means selecting the language that is most suitable for symbolic representation, in which modifying knowledge and making logical inferences are easy to accomplish.

The most popular languages for expert systems are Lisp and Prolog. They utilize logical declarations specifying what the program will do. The program of this expert system is written in Turbo Prolog, which is Borland International's fast, compiler-based implementation of the Prolog programming language. This language is designed to provide answers it deduces from its powerful internal routines. In comparison with other declarative languages, Turbo Prolog proves efficient in the sense that it can use only a few lines to accomplish a task that other languages require several pages of code to accomplish. It should be mentioned here that Turbo Prolog runs on an IBM PC or compatible and requires DOS to start.

CHAPTER TWO

NEEDS ANALYSIS

2.1 STATEMENT OF NEED

Quality improvement relies heavily on applying suitable experimental design. Most quality engineers lack the training in experimental design necessary for designing cost effective off-line quality control experiments, as indicated by Box et al. (1988). In Japan, there is a great emphasis on experimental design methodologies for achieving better quality products. The Japanese are convinced that quality improvement can be achieved without necessarily spending excessive amounts of money. Instead of jumping to tolerance design, which is mainly concerned with the purchase of equipment, to correct a quality problem, they go through parameter design, which is concerned with applying experimental design techniques to determine the causes of deviation from standards and correct them.

Therefore, an expert system that will enable engineers to select the suitable design is needed. This expert system consists of three parts: (1) designing experiments for a single factor (or comparing more than two treatment means), (2) designing experiments with two or more factors, and (3) an explanation facility that will provide a brief explanation of basic statistical and technical terms.

The program is written in Turbo Prolog (Version 1.1) and requires 640 K of memory on an IBM PC or compatible, as well as a DOS diskette to start it up. Production rules are employed to represent the knowledge in this expert system.

2.2 SYSTEM LIMITATIONS

(1) The program is limited to guiding users to a suitable design in linear situations. If tests performed by the user show that nonlinearity exists, the system will advise the user to employ a non-linear model but does not go any further.

(2) The system is also limited to guiding the user to the suitable design; it does not perform numerical calculations.

(3) The system is limited to advising the user on controlled experiments. Therefore, unplanned experiments such as regression are not included in this expert system.

2.3 SYSTEM FEATURES

(1) The system is user friendly because of its clear language and menu-driven structure. The user may also leave the system at any point in the program.

(2) The system employs the approach of sequential experimentation for more effective experimental design.

(3) The system provides an explanation facility in case the user wants to refresh his or her memory about a certain technical term.

(4) The system can be easily modified or expanded due to its modifiable inference engine.

(5) The system can be interfaced with other computer languages such as Pascal or Fortran. This is an important issue for future work if we want to link this expert system with a powerful number manipulating language like Pascal (Version 4.0).

CHAPTER THREE

TECHNICAL AND STATISTICAL TERMINOLOGY

3.1 DEFINITIONS OF STATISTICAL TERMS:

This section is concerned with refreshing the user's memory with regard to basic statistical and other technical terminology used in the expert system. These definitions form the help facility, which is a part of the expert system.

(1) Type I error (α): often called the "level of statistical significance"; it is the probability of rejecting the null hypothesis when it is actually true.

(2) Type II error (β): the probability of accepting the null hypothesis when it is actually false. This is more serious than a Type I error. The magnitude of the expression (1 - β) determines the power of the test.

(3) Independent variables: sometimes referred to as "experimental variables" or "factors". The objective of an experiment is usually to investigate their effects on the dependent variable (response), or possibly to find the settings of these variables at which the response is optimized or the variability of the response is minimized. The independent variables can be either qualitative or quantitative, depending on their nature.

(4) Levels of independent variables: In planned experiments, the experimenter wishes to investigate the effects at specific settings within the range of each of the independent variables. Usually, the experimenter assigns two levels within the range of an independent variable if the relationship between that variable and the response is known to be linear. Two levels of the independent variables are also used when performing screening of the independent variables, which is simply separating the important variables from the unimportant ones. Three levels will provide sufficient data for fitting a quadratic relationship.

(5) Dependent variable (response): the criterion used to measure the experimental outcome. For example, the strength of some material may be characterized as the dependent variable if the experiment is to investigate the effects of temperature and raw materials on it.

(6) Interaction: When the difference in the outcome of the dependent variable (response) between levels of one factor is not the same at all levels of another factor, these two factors are said to interact in determining the response.

(7) Experimental run: When the experimenter sets the independent variables each at a specified level and measures the outcome (yield) by recording the value of the dependent variable, this is called an experimental run.

(8) Randomization: A set of observations is said to be randomized when arranged in a random order. The experimental runs should be randomly assigned to prevent inflating the variance through the contribution of uncontrollable variables, learning effects on the part of the experimenter, and other causes. Employing the randomization process simply averages out these extraneous effects. Randomization is an essential step prior to collecting the data.

(9) Homogeneous conditions: When experimental runs are performed under the same conditions, this situation is called homogeneous conditions. This means that all runs were subjected to the same uncontrollable system variables.

(10) Uncontrollable (noise) variables: These are extraneous variables that can not be controlled by the experimenter. Examples of usually uncontrollable variables are temperature and humidity. Ignoring them may affect the conclusions drawn from the experiment because of their contribution to inflating the variance.

(11) Blocking: This experimental design concept is the key to situations where not all experimental runs can be performed under homogeneous conditions. Blocking the experimental runs eliminates the error contributed by the difference in conditions, so that the experimental runs within one block are performed under homogeneous conditions.

(12) Lack of fit: The model representing the experimental data should be adequate. However, sometimes this adequacy is not achieved because the linearity assumption is violated. The experimenter should run a lack of fit test to determine whether or not the model is adequate before further analyses are carried out.

(13) Transformation: When the model does not adequately represent the data in the original scale or metric, the experimenter may need to transform the data into another scale so that there is a good fit.

(14) Confounding (or "confusing" effects with each other): When the experimenter is using fractional factorial designs, he or she may lack the degrees of freedom to estimate all effects included in the model. In this case, the concept of confounding is very helpful: the unimportant interactions (usually higher order interactions) are sacrificed in order to estimate the important effects, such as the main effects and two-factor interactions. It is helpful in the sense that the experimenter decides which effects are to be confounded. Confounding is usually used in screening experiments, where the objective is to identify the important main effects while ignoring the interaction effects. The choice of the generator in fractional factorials determines which effects are confounded with each other.

(15) Replication: This is simply the repetition of an experimental run, which increases the number of degrees of freedom available, resulting in a better estimate of the error. It is also essential in obtaining more precise estimates of the effects. In reality, replication of experimental runs may not be feasible due to budget or time constraints.

(16) Analysis of Variance (ANOVA): The total variation displayed by a set of observations is measured by the sum of squares of deviations from the mean. It can be broken up into components associated with defined sources of variation (a small numerical sketch of this decomposition is given at the end of this list).

(17) Resolution: The type of fractional factorial, which determines the confounding of the effects. A number, written as a Roman numeral, is associated with the word "Resolution". Different types of resolution will be discussed in Chapter Four.

(18) Generator: A high order interaction term, usually negligible, that is used to generate fractional factorial designs (Box and Hunter (1961)).

(19) Levels of an independent variable: The values within the feasible range of an independent variable at which the experiment is conducted. For example, if the feasible range of the variable temperature is between 100 F and 120 F, the experimenter may assign the value of 100 F to be the low level and the value of 120 F to be the high level. The experimenter should be cautious when assigning levels to independent variables. For example, variables whose relationship with the response contains a curvature should be identified. If the curvature contains an increase then a decrease, or vice versa, then the response may be similar for the high and low levels when they are assigned at the extreme points of the range. In this case, the experimenter is advised to identify the point where the slope is zero and assign the levels so that one level represents that point. Prior knowledge about the nature of the independent variables with regard to their relationship with the response is helpful in detecting such situations.
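To illustrate the sum-of-squares decomposition mentioned in definition (16), the following short Python sketch (hypothetical single-factor data, three treatments with four replicates each) splits the total sum of squares into a treatment component and an error component.

# Hypothetical single-factor experiment: 3 treatments, 4 replicates each.
data = {
    "treatment_1": [20.1, 21.4, 19.8, 20.7],
    "treatment_2": [24.3, 25.0, 23.8, 24.6],
    "treatment_3": [21.9, 22.4, 21.1, 22.8],
}

all_obs = [y for ys in data.values() for y in ys]
grand_mean = sum(all_obs) / len(all_obs)

# Total variation: squared deviations of every observation from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)

# Between-treatment component: squared deviations of treatment means from the grand mean.
ss_treatment = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                   for ys in data.values())

# Within-treatment (error) component: squared deviations of observations from their own treatment mean.
ss_error = sum((y - sum(ys) / len(ys)) ** 2
               for ys in data.values() for y in ys)

print(round(ss_total, 3))
print(round(ss_treatment, 3))
print(round(ss_error, 3))
print(round(ss_treatment + ss_error, 3))   # equals ss_total

The F statistic reported by the analysis of variance is then the treatment mean square divided by the error mean square.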

3.2 DEFINITIONS OF EXPERIMENTAL DESIGNS:

In this section, some important experimental designs will be briefly reviewed. Again, this is not intended to cover all specifications of these designs but only to give a general idea along with references:

(1) Latin Square design: this type of design is utilized if the following conditions are satisfied: (a) there is a single factor under study, (b) there are two extraneous sources of variability, and (c) each of the extraneous variables has a number of levels equal to the number of levels or treatments of the single factor.

(2) Graeco Latin Square design: This is the same as the regular Latin Square except that the extraneous variability comes from three sources instead of two.

(3) Hyper-Graeco Latin Square design: This is the same as the regular Latin Square except that the extraneous variability comes from more than three sources.

(4) Youden Square design: This type of design is the same as the Latin Square but it is "incomplete". This means that the number of levels of one extraneous variable differs from either (a) the number of levels of the other extraneous variable or (b) the number of treatments (levels of the single factor under study). See Smith and Hartley (1948) for more detail.

(5) Factorial design: An experiment that investigates all possible treatment combinations of two or more factors. Each of the factors is investigated at two or more levels. Efficiency and the ability to estimate the interaction effects are the major advantages of factorial experiments over one-factor-at-a-time experiments (Box, Hunter and Hunter (1978)).

(6) Factorial design at two levels (2^k): This type of factorial is achieved when each factor has two levels. The 2^k design assumes a linear relationship between the independent variables on one side and the dependent variable (response) on the other.

(7) Factorial design at three levels (3^k): This type of design is used when there is a significant curvature in the relationship between the dependent and the independent variables.

(8) Fractional factorial design at two levels (2^(k-p)): An important feature of the factorial design is that the experimenter can run a fraction of it and still obtain valid conclusions. The size of the fraction is determined by p in the expression 2^(k-p), where k is the number of factors or independent variables; the design uses a 1/2^p fraction of the full factorial, so the number of runs is always a power of two. p must be a positive integer smaller than k. Fractional factorials are most appreciated when a screening experiment is to be conducted and when budget and/or time constraints do not allow the experimenter to run a full factorial (Box and Hunter (1961)).

(9) Fractional factorials at three levels (3^(k-p)): These are designs that serve the same purpose as fractional factorials at two levels except that there are three levels for each variable instead of two. As indicated earlier, this design is used when nonlinearity due to quadratic effects exists.

(10) Central Composite (Box) design: A type of design that is used when the experimenter needs to fit a second-order model. In particular, this design is mostly used when the experimenter decides to go to a higher-order design and already has the runs from a 2^k design (Box and Draper (1979)).

3.3 DEFINITIONS OF TESTS:

In some situations, the experimenter would like to compare the individual treatments (or levels of the single factor). In this section, a review of methods for comparing individual treatments is presented.

(1) Orthogonal Contrasts method: This method is used to compare individual treatments or levels in single factor experiments. The contrasts are set prior to running the analysis of variance, so that the experimenter does not set contrasts that merely reflect large observed differences in means.

(2) Duncan's multiple range test: This is probably the most popular test for comparing all possible pairs of treatments, developed by Duncan (1955). The test indicates that two means are not significantly different from each other if they fall within two other means that are not significantly different from each other.

(3) Least Significant Difference (LSD) method: This method is used when the F-statistic for all treatments or levels combined is significant at α = 5%.

(4) Dunnett's method: If the experimenter wishes to compare individual treatments with a control treatment, then the method developed by Dunnett (1964) is appropriate. It is suggested that more observations be used for the control treatment so that the results are more reliable. It should be noted that no more than (number of treatments - 1) comparisons can be made using this method.

(5) Scheffe's method: This is an alternative to the orthogonal contrasts method when the experimenter has no prior knowledge of which contrasts to compare. Scheffe (1953) developed this powerful test to compare all possible contrasts without inflating the Type I error.

CHAPTER FOUR

EXPERIMENTAL DESIGN

4.1 INTRODUCTION

The bottom line for an effective experimental design is taking all related aspects into consideration as early as possible when planning an experiment. Failure to do so may increase the risk of drawing invalid conclusions. For example, suppose there is a source of variability which the experimenter is not able to control by physical means. If this source of variability is not accounted for by the experimental design, then the experimental error will be inflated by it. This inflation of the experimental error will increase the risk of drawing an invalid conclusion. In addition, if the experimenter ignores a basic principle such as randomization, the omission could be costly if not impossible to correct at later stages. The focus of this chapter will be on the basic principles of designing experiments.

4.2 BASIC EXPERIMENTAL DESIGN PRINCIPLES

In this section, the emphasis will be placed on the key principles that the experimenter must take into consideration when designing an experiment. The major principles that will be discussed are: (1) blocking, (2) randomization, and (3) replication.

(1) Blocking: In many situations, not all experimental runs can be performed under homogeneous conditions, due to extraneous variability that can not be controlled by the experimenter. As a result, the error term is likely to be inflated, leading to the possibility of drawing the wrong conclusions. If this occurs, the experiment should be divided into subexperiments (blocks) such that the runs within each subexperiment are performed under homogeneous conditions. This is done to eliminate the variability contributed by the uncontrollable factors. Each block contains a portion of the experimental runs that can be run under similar conditions.

The experimenter may not be able to detect the existence of extraneous variability at the beginning because of his or her lack of knowledge about the experiment. In this case, the experimenter should check the consistency of the observed response. Lack of consistency indicates the possibility that the experiment is being affected by uncontrollable variability. When such variability has been detected, the experimental runs can be performed in blocks as indicated earlier. The use of blocks ranges from simple comparative experiments to somewhat complicated multifactor ones.

In a simple comparative experiment, the experimenter wishes to compare two methods; therefore each pair of experimental materials is blocked from the others. To illustrate this, consider comparing two types of shoe material (Box, Hunter, and Hunter (1978)). The experimenter weighs the shoes before and after they are worn by subjects. The weight lost through wear is recorded, so that less loss in material means a better material. The subjects, however, differ in their body weights and in the length of time that they wear their shoes. Therefore, the experimenter would like to eliminate the variability contributed by the subjects, which can not be controlled (it is probably difficult to find subjects with the same body weight who are also similar with respect to how long they keep their shoes on). The experimenter can eliminate this variability by blocking, that is, by eliminating the differences between body weights and the lengths of time that the subjects wear their shoes. In such a situation, blocking is the best tool for eliminating the extraneous variability.

In single factor experiments, blocking can also

be used if uncontrollable variability exists. If one source of variability exists, the experimenter may run a randomized complete block design. If two uncontrollable sources of variability exist, then he or she may use a Latin Square design, provided the requirements for using it, as previously mentioned, are satisfied.

The use of blocks in factorial designs is a little more complicated. This is primarily due to the lack of enough degrees of freedom if the experiment is not sufficiently replicated. This problem often occurs in fractional factorials, where main effects can be mistakenly confounded with blocks. So the experimenter must exercise caution when he or she uses blocks in fractional factorials. In particular, blocks must not be confounded with main effects because, if they are, it will be difficult to figure out whether the outcome represents the main effect or the block effect.

(2) Randomization: Employing a proper randomization procedure is a crucial step in designing an experiment. Randomization is used in experimental design to average out the effects of variables that can not be controlled by the experimenter. Such averaging does not remove their effects completely, but it assures that the experimental errors are randomly distributed. Montgomery (1976) illustrates this with the following example. A metallurgical engineer is studying the effects of two different hardening processes, oil quenching and saltwater quenching, on an aluminum alloy. The objective of this experiment is to determine the quenching solution that produces the maximum hardness for this alloy. The specimens used for testing differ in their thickness. The engineer would continually handicap one quenching medium over the other if the specimens subjected to one medium were thicker than those subjected to the other medium. This problem can be easily solved by assigning the specimens randomly to the quenching solutions. (A small sketch of randomizing the run order appears at the end of this section.)

(3) Replication: This concept means repetition of the basic experiment. It usually serves two purposes: (a) to obtain more precision in the estimated effects and (b) to estimate the error term for testing significance. In some situations, where the estimates of the effects are very important, it is necessary to replicate the experimental runs to achieve precision. For example, more precision is needed when developing statistical models for forecasting purposes, because the objective is to precisely predict events in the future; failure to meet this objective will probably be more costly than performing more experimental runs. Replication is also necessary to estimate the error term. More reliable conclusions can be drawn when there are enough replicates, because replicates increase the degrees of freedom, which, in turn, provide a better estimate of the error. In short, it is always recommended to replicate the experimental runs as many times as possible. Performing replicate runs does not come at an early stage of the design, due to the sequential nature of most experiments. In other words, in many situations the important variables, their ranges, and the levels to be assigned are yet to be explored. Therefore, the experimenter should replicate the experiment once the ambiguity about the experiment has been eliminated. As a result, the replicates will contribute to the precision of the estimates of effects and to the estimation of the error term after the screening experiment is over.
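The following short Python sketch (hypothetical factor names and block labels, not taken from this thesis) illustrates the randomization principle from item (2) together with the blocking principle from item (1): the treatment combinations of a small two-factor experiment are replicated once per block, and the run order within each block is randomized before any data are collected.

import itertools
import random

random.seed(42)  # fixed seed only so the example is reproducible

# Hypothetical experiment: two factors at two levels each, replicated twice,
# run in two blocks (e.g., two days) of one full replicate each.
factors = {"temperature": ["low", "high"], "catalyst": ["A", "B"]}
treatments = list(itertools.product(*factors.values()))  # 4 treatment combinations

for block, day in enumerate(["day_1", "day_2"], start=1):
    runs = treatments[:]          # one full replicate per block
    random.shuffle(runs)          # randomize the run order within the block
    print("Block", block, "(", day, "):")
    for order, combo in enumerate(runs, start=1):
        settings = dict(zip(factors.keys(), combo))
        print("  run", order, ":", settings)

Randomizing within each block averages out uncontrolled effects such as day-to-day differences, while the blocking itself keeps those differences from inflating the experimental error.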

4.3 SCREENING EXPERIMENTS

In many situations, little or no prior knowledge about the situation to be investigated is available. In such situations, the experimenter would like to be able to determine which of a large number of possible factors are most important. In screening experiments, there are likely to be a large number of independent variables, and it is expected that not all of them are important. The experimenter's first goal is to separate the important ones from the rest without utilizing a large amount of the budget to achieve this. Therefore, a quick and economical method to screen these variables is needed, so that the experimenter can concentrate only on those factors that are important throughout the rest of the study. Had the experimenter not screened the variables, he or she would be spending time and money investigating variables that should not be there in the first place.

Fractional factorials at two levels (2^(k-p)) can be utilized for screening purposes, as indicated by Box and Hunter (1961). These designs can economically and efficiently set the guidelines for the experimenter to determine which factors (and possibly interactions) to proceed with in the analysis of the experiment.

A screening experiment is conducted at an early stage of the design. The information obtained from that stage is used as the input for the next stage, and so on, in an iterative approach. However, some experts do not seem to appreciate this approach despite its clear efficiency. Genichi Taguchi, who is considered a leading expert in this field, has not mentioned the concept of iterative experimentation. In reality, however, some variables that were considered at an early stage of the design may be dropped if found unimportant, and other variables may be included, as indicated by Box (1985). Also, the ranges of the variables over which the experiment is carried out may change.

Screening of experimental variables should be economical. Box, Hunter, and Hunter (1978) have suggested that usually not more than one quarter of the budget should be invested for screening purposes. After the "screening" stage, the experimenter will have enough information with regard to which variables are important.

As indicated earlier, fractional factorial designs at two levels (2^(k-p)) are cost effective in exploring information about the experimental variables at an early stage of the design. For example, if an experiment involves five variables and it is unknown which ones are important, the experimenter may use a one-quarter fractional factorial at two levels (2^(5-2)), which involves 8 runs, instead of the full factorial at two levels (2^5), which involves 32 runs. If the number of experimental variables is sufficiently large, a smaller fraction can be run. The experimenter must be careful in deciding what type (or, as it is often called, resolution) of fractional factorial to use. The following is a brief review of the major alternatives:

(a) Resolution III: This type is selected when the experimenter wishes to estimate only the main effects. It guarantees that the main effects can be estimated without being confounded with one another. However, this resolution may have its main effects confounded with two-factor interactions. Therefore, if the experimenter wishes to estimate two-factor interactions as well as main effects, then this resolution should not be used. If it is used anyway, the experimenter can not tell whether an estimated effect reflects a main effect or a two-factor interaction, because they are confounded with each other. For example, if AB refers to the interaction between factor A and factor B, then the two-factor interaction AB may be confounded with factor C. This means that it will not be possible to determine whether a significant effect is due to the main effect C, to the two-factor interaction AB, or to both.

(b) Resolution IV: This type is used when the experimenter is interested in estimating the two-factor interactions separately from the main effects. This is simply a result of not having main effects confounded with each other or with two-factor interactions. However, the experimenter's decision on which of the two-factor interactions are significant must be made with caution, because two-factor interactions are confounded with each other. It should be mentioned here that resolution IV fractional factorials can be constructed by combining a resolution III design with its "sign-reversed" partner. In other words, the experimenter only has to reverse the signs (+,-) in the resolution III design and combine it with the original. This is often referred to as a fold-over design. Again, utilizing fold-over designs was also missed by Taguchi, as mentioned by Box (1985), despite its ease of application as well as its usefulness. For example, suppose that it is desired to separate the two-factor interactions from the main effects. This can be done by folding over a resolution III design without having to set up a new resolution IV design from scratch.

(c) Resolution V: This type of design is useful when the experimenter wishes to estimate the two-factor interactions without them being confounded with each other. This also includes estimating main effects separately from each other and from two-factor interactions. This design is usually used when the experimenter knows that two-factor interactions are important.

The above mentioned resolutions are the ones commonly used when fractional factorial designs at two levels (2^(k-p)) are considered. Resolution III is the one to use when screening a large number of variables, since the interest is primarily in dropping those variables that are not important. If the experimenter knows that some interactions are important, then resolution IV can be used.

All fractional factorial designs require "generator(s)" for their construction. The number of generators required depends on the size of the fraction being constructed. For example, if the experimenter wishes to construct a one-half fraction, then one generator is required; two generators are required for a one-quarter fraction, and so on. In other words, the number of generators required is equal to the parameter p in the 2^(k-p) expression. However, the experimenter may find it easier to refer to a textbook such as Montgomery (1986) for the generators of a fractional factorial of a given resolution. (A small sketch of constructing a fractional factorial from its generators follows.)
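As an illustration of how generators define a fractional factorial, the following short Python sketch constructs the eight-run 2^(5-2) design mentioned earlier in this section by writing a full 2^3 factorial in factors A, B, and C and then generating D = AB and E = AC. The factor labels and the particular generators are illustrative choices only; standard generator tables are given by Box and Hunter (1961) and Montgomery (1986).

import itertools

# Full 2^3 factorial in the "basic" factors A, B, C, coded as -1 / +1.
basic_factors = ["A", "B", "C"]
base_design = list(itertools.product([-1, 1], repeat=len(basic_factors)))

# Generators for the extra factors (illustrative choice): D = A*B, E = A*C.
# Each generated column is the product of the basic columns that define it.
generators = {"D": ("A", "B"), "E": ("A", "C")}

runs = []
for levels in base_design:
    run = dict(zip(basic_factors, levels))
    for new_factor, parents in generators.items():
        product = 1
        for parent in parents:
            product *= run[parent]
        run[new_factor] = product
    runs.append(run)

# Eight runs instead of the 32 required by the full 2^5 factorial.
for i, run in enumerate(runs, start=1):
    print(i, run)

Because D is set equal to AB, the main effect of D is aliased (confounded) with the AB interaction; this is exactly the price paid for running a quarter fraction, which is why such a design is suited to the screening purposes discussed above.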

4.4 MODEL ADEQUACY

In some situations, the data collected on the original scale fail to satisfy the assumptions upon which statistical tests are based. As a result, the conclusions drawn from experiments in such situations are of doubtful validity. In this section, the different statistical assumptions and the consequences of violating them will be discussed. A review of existing tests for detecting these violations will then be provided, as well as remedies to correct them.

The analysis of variance (ANOVA) model is used to analyze the results of an experiment. The ANOVA model is based on several statistical assumptions. Violating one or more assumptions, such as (1) linearity, (2) additivity, (3) normality, (4) homogeneity of variance and (5) independence, may occur, with serious consequences. Therefore, the possibility that one or more of the above assumptions will be violated must not be ignored. In particular, statistical tests should be employed to determine whether or not they have been violated. Certain types of transformation can be recommended for remedying the violation of these assumptions.

However, Taguchi's experimental design methodology does not refer to transformation as a remedy for the violation of one or more of the above mentioned assumptions. In his discussion of Kackar's paper regarding Taguchi's methodology, Box (1985) criticizes, among other issues, the absence of the concept of transformation from Taguchi's quality improvement philosophy.

Throughout his books and articles, Box acknowledges the occasional need for transformation, as indicated in Box and Cox (1964). This probably motivated him to develop techniques for the best selection of a transformation function. When data are properly transformed, the degree to which the assumptions of the model are satisfied improves. Usually, transforming the dependent variable (response) can improve the adequacy of the model in planned experiments. However, Box extends the concept of transformation to unplanned experiments, where transforming the independent variables can greatly simplify the form of the model with which the experimenter is working. As indicated by Box and Tidwell (1962), transforming the independent variables and/or the response in regression, and especially in response surface work, can greatly ease handling the experimental data.

When attempting to find an appropriate transformation function, it is not always necessary to determine which assumption has been violated. In particular, the method developed by Box and Cox (1964), based on a maximum likelihood estimator, enables the experimenter to select the appropriate transformation to satisfy the additivity, homogeneity of variance, and normality assumptions all together if necessary. In other words, the functional form used for this transformation accounts for all three of the above assumptions.
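As a rough illustration, the following Python sketch (hypothetical right-skewed response values) applies the Box-Cox maximum likelihood procedure as implemented in scipy.stats; the lambda value it returns plays the role of the transformation parameter discussed by Box and Cox (1964).

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical, strongly right-skewed response values.
# Box-Cox requires strictly positive data.
response = rng.lognormal(mean=2.0, sigma=0.8, size=40)

# Box-Cox chooses lambda by maximum likelihood:
#   y(lambda) = (y**lambda - 1) / lambda  for lambda != 0,  log(y) for lambda = 0.
transformed, lam = stats.boxcox(response)

print(round(lam, 3))                          # for lognormal data, near 0 (log transform)
print(round(stats.skew(response), 3))         # skewness before transformation
print(round(stats.skew(transformed), 3))      # skewness after transformation

A lambda near zero corresponds to the logarithmic transformation, a lambda of one half to the square root, and a lambda of one to no transformation at all.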

It is important, as indicated by Box (1985), to confirm that the assumptions mentioned earlier are satisfied before proceeding further with the investigation. Failure to do so may seriously affect the validity of conclusions drawn from the experiment. As mentioned earlier, the assumptions to be confirmed are:

(1) Linearity: This assumption indicates that the dependent variable, or response, can be expressed as a linear function of the independent variables. The assumption is said to be satisfied when no significant curvature exists in the relationship between the response and the independent variables. If a significant curvature exists, then it will be necessary to include higher order terms in the model. Failure to do so may dramatically inflate the error sum of squares. When this occurs, the power of the test is reduced.

(2) Additivity: This assumption states that the model, which expresses the dependent variable as a function of the independent variables, must not contain any cross product (interaction) terms. This will not be true in situations in which nonadditivity, or interactions, may actually be part of the model which generated the data. Therefore, one must not assume additivity when there is not enough knowledge about the phenomenon being studied to support such an assumption. If the experimenter falsely assumes additivity, then the sum of squares due to nonadditivity will be incorrectly included in the error sum of squares. This inflation of the error term will reduce the power of the test.

(3) Normality: When the differences between the actual observations and the predicted values (i.e., the residuals) are plotted on a dot diagram, they should approximate a normal distribution with mean centered at zero. Such a distribution is symmetrical and has its mode equal to the mean. If there is a significant departure from normality, then the significance level will be seriously distorted. This will, in turn, affect the validity of the conclusions, which are based on the level of significance of the test.

(4) Homogeneity of variance: This assumption states that the error variance should be nearly constant over the different combinations of the independent variables. In other words, the plot of residuals against predicted values must not follow a particular pattern and should be structureless. Sometimes, however, the experimenter encounters the problem of heterogeneity of variance; if this occurs, the assumption of homogeneous variance is said to be violated. This is a type of problem that is better prevented than corrected, because of its serious consequences for the validity of the conclusions drawn in such situations. For example, a proper randomization process can greatly help in preventing this assumption from being violated. Randomization will simply average out the effects contributed by uncontrollable variables that could systematically build up if no randomization were applied. When this assumption is violated, the consequences are similar to those of violating the normality assumption: the level of significance is distorted, reducing the power of the test.

(5) Independence: This assumption states that the residuals are independent of each other. In other words, there must not be a functional relationship between the residuals. If this assumption is violated, the variance of the residuals may be biased. A detailed discussion of the consequences of violating this assumption, and of methods for dealing with such violations, is provided in a paper by Glen and Terry (1984).

It is recommended to refer to Eisenhart (1947) for a complete discussion of the assumptions underlying the analysis of variance. Cochran (1947) also provides a discussion of the consequences of violating these assumptions.

It is essential to identify the various tests that

can be used to verify the validity of these assumptions.

When more than one test is available to verify an
assumption, only the most common ones will be identified.

The linearity assumption can be verified using the

lack of fit test in Draper and Herzberg (1971). It should

be noted that performing this test requires at least one

replicate to estimate the pure error term. However, it is

always advisable to replicate as many times as possible when conducting this test.
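As a hedged illustration only (not part of the thesis software, which performs no calculations), the following Python sketch shows how such a lack of fit F-test might be computed with numpy and scipy; the replicated observations and factor levels are hypothetical.

import numpy as np
from scipy import stats

# hypothetical replicated observations at four settings of a single factor
groups = [np.array([5.1, 4.9]),
          np.array([7.2, 7.0]),
          np.array([9.8, 10.4]),
          np.array([12.1, 11.7])]
x = np.array([1.0, 2.0, 3.0, 4.0])        # factor level at each design point

# fit the first order (straight line) model to all observations
X = np.repeat(x, [len(g) for g in groups])
Y = np.concatenate(groups)
b1, b0 = np.polyfit(X, Y, 1)
fitted = b0 + b1 * x                      # fitted value at each design point

# pure error: scatter of the replicates about their own group means
ss_pe = sum(((g - g.mean()) ** 2).sum() for g in groups)
df_pe = sum(len(g) - 1 for g in groups)

# lack of fit: scatter of the group means about the fitted line
ss_lof = sum(len(g) * (g.mean() - f) ** 2 for g, f in zip(groups, fitted))
df_lof = len(groups) - 2                  # design points minus model parameters

F = (ss_lof / df_lof) / (ss_pe / df_pe)
p_value = stats.f.sf(F, df_lof, df_pe)
print(F, p_value)                         # a small p-value signals curvature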

The assumption of additivity can be verified using "

Tukey's one degree of freedom test for additivity"

developed by Tukey (1949).

Homogeneity of variance can be verified using the

test developed by Bartlett (1937). However, since this test is

sensitive to the assumption of normality, another test

developed by Bartlett and Kendall (1946) should be used when the assumption of normality is questionable.
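As a minimal sketch, assuming hypothetical treatment groups and the tests available in scipy (the thesis system itself performs no such calculations):

from scipy import stats

group_a = [12.1, 11.8, 12.5, 12.0]
group_b = [14.3, 13.9, 14.8, 14.1]
group_c = [9.7, 10.2, 9.9, 10.4]

# Bartlett's (1937) test for homogeneity of variance
stat, p = stats.bartlett(group_a, group_b, group_c)
print(stat, p)            # a small p-value indicates heterogeneous variances

# Bartlett's test is sensitive to nonnormality; scipy does not provide the
# Bartlett and Kendall (1946) procedure, but Levene's test is a commonly
# used robust alternative when normality is questionable.
stat, p = stats.levene(group_a, group_b, group_c)
print(stat, p)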

The assumption of normality can be tested using the

Chi-square goodness of fit test. The experimenter may also use the Kolmogorov-Smirnov test, which some authors argue is more powerful. Shapiro, Wilk and Chen (1968) also developed an alternative test which is attractive because it does not require the mean and the variance to be specified as part of the hypothesis.
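For illustration, and assuming the residuals are available as a numeric array, two of the checks named above might be run in Python with scipy as follows; the residual values are hypothetical, and scipy's shapiro function implements the closely related Shapiro-Wilk W test rather than the exact procedure of Shapiro, Wilk and Chen (1968).

import numpy as np
from scipy import stats

residuals = np.array([0.3, -0.5, 0.1, 0.7, -0.2, -0.4, 0.6, -0.6, 0.0, 0.2])

w, p_sw = stats.shapiro(residuals)        # Shapiro-Wilk W test

# Kolmogorov-Smirnov test against a standard normal after standardizing;
# note that estimating the mean and variance from the data makes the
# standard KS p-value only approximate.
z = (residuals - residuals.mean()) / residuals.std(ddof=1)
d, p_ks = stats.kstest(z, 'norm')

print(p_sw, p_ks)     # small p-values indicate departure from normality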

The following table summarizes the tests and references for checking the statistical assumptions mentioned above.

Table 1: Summary of tests for validating assumptions*

    Assumption                 Reference / Guidelines
    -----------------------    ----------------------------------------------
    Linearity                  Lack of fit test in Draper and Herzberg (1971)
    Additivity                 Tukey's one degree of freedom test for
                               additivity, Tukey (1949); also Tukey and
                               Moore (1954)
    Homogeneity of variance    Bartlett's test, Bartlett (1937); also
                               Bartlett and Kendall (1946)
    Normality                  Chi-square goodness of fit test,
                               Kolmogorov-Smirnov test, or Shapiro, Wilk
                               and Chen (1968)

    * Refer to Glen and Terry (1984) for testing the assumption of
      independence.

When one of the tests mentioned above shows that an

assumption has been violated, it is essential to correct

this violation by applying the appropriate type of

transformation. There are a variety of transformations

that can be used when the nonnormality assumption is

violated. The most appropriate transformation depends on

the nature of the nonnormality. The method developed by

Box and Cox (1964) allows the experimenter to select the

right transformation regardless of whether or not the

experimenter has knowledge about the nature of the

nonnormality. The same is true for correcting the

nonhomogeneity of the variance since these two assumptions

can usually be remedied together.
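A minimal sketch of selecting such a transformation by maximum likelihood in Python with scipy, assuming hypothetical (strictly positive) response data:

import numpy as np
from scipy import stats

y = np.array([1.2, 3.4, 2.2, 8.9, 4.1, 15.7, 6.3, 23.0])   # positive responses

y_transformed, lam = stats.boxcox(y)      # lam is the estimated lambda
print(lam)
# lam near 0 suggests a log transformation, lam near 0.5 a square root,
# and lam near 1 suggests that no transformation is needed.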

The additivity assumption will be violated when the

true effects are multiplicative or proportional. This

violation can be corrected by applying the logarithmic

transformation which is a part of Box and Cox method of

Maximum Likelihood transformation.

A violation of the linearity assumption can be
corrected by applying the Box and Cox (1964) transformation. In unplanned
experiments, such as is the case in regression, the

experimenter may have to transform some of the independent

variables. Refer to Box and Tidwell (1962) for a complete

discussion on transforming the independent variables.

CHAPTER FIVE

KNOWLEDGE REPRESENTATION

5.1 INTRODUCTION

In this chapter, the focus will be on developing the

production rules needed for this expert system. The

listing of the production rules in this chapter, however,

is not the same as the program code.

This expert system consists of production rules that

are coded in Turbo Prolog as segments. Each segment

consists of a condition (the 'IF' part) or a conclusion
(the 'THEN' part) about some type of experiment. The conditions of a rule are connected in series through predicate

names. An array is used to keep track of conditions and

conclusion of the same rule. This makes it simple to modify the program when needed. For example, the user may have to answer a question by choosing from 'y', 'n', or 'c'

(for yes, no, and continue, respectively). If the name of

the predicate is sub1, then sub1('y') will lead to another

condition (question) or a conclusion (answer) which is to

follow that particular answer.

Although this type of coding (using arrays) can
consume a lot of memory, it is the easiest way to check the validity of the system. In other words, if the number of conditions for one rule is large and arrays are not used, then each time one of these conditions is checked, all preceding conditions of that rule must be checked as well. With arrays, only the condition at hand needs to be considered.
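To make the chaining idea concrete, the fragment below is a hedged Python sketch rather than the author's Turbo Prolog code: each named condition maps the user's answer either to the next condition or to a conclusion, loosely following Rules 1 and 2 of the next section. The names and the simplified logic are invented for illustration.

# A condition is a question; a conclusion is plain advice text.
QUESTIONS = {
    "single_factor": "Is there only one factor to investigate (y/n)? ",
    "extraneous":    "Is there an extraneous source of variability (y/n)? ",
}

# (condition, answer) -> next condition name or a final conclusion
RULES = {
    ("single_factor", "y"): "extraneous",
    ("single_factor", "n"): "Consult the factorial experiment rules.",
    ("extraneous", "n"):    "Run a Complete Randomized Design.",
    ("extraneous", "y"):    "Run a Complete Randomized Block Design.",
}

def consult(start="single_factor"):
    node = start
    while node in QUESTIONS:                   # still a condition: ask the user
        answer = input(QUESTIONS[node]).strip().lower()
        node = RULES.get((node, answer), node) # unknown answer: ask again
    return node                                # a conclusion (recommended design)

if __name__ == "__main__":
    print(consult())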

5.2 EXPERIMENTAL DESIGN RULES

In this section, lists of the production rules for all types of experiments covered in this expert system will be provided. Conclusions or 'THEN' parts for some rules can be used as conditions or 'IF' parts for other rules.

The rules for this expert system are divided into two categories: single factor experiments and factorial experiments. They were developed after the knowledge acquisition phase had been completed and verified. It should be mentioned here that these rules do not contain the help facility options, since the definitions of the various technical terms and designs were included in chapter three.

5.2.1 RULES FOR SINGLE FACTOR EXPERIMENTS

------RULE 1 IF 1. The experiment is a single factor one,

and 2. There is no extraneous source of variability,

THEN * Run a Complete Randomized Design.

------RULE 2

IF 1. There is one source of extraneous variability

and 2. Data is complete (no missing values),

THEN * Run a Complete Randomized Block Design.

------RULE 3

IF 1. There are two sources of extraneous

variability,

and 2. The number of levels for each source is equal

to the number of treatments under study,

and 3. Data is complete (no missing values),

THEN * Run a Latin Square Design.

------RULE 4 IF 1. There are three sources of extraneous

variability,

and 2. The number of levels for each source is equal

to the number of treatments under study, and 3. Data is complete (no missing values),

THEN * Run a Graeco Latin Square Design.

------RULE 5 IF 1. There are more than three sources of

extraneous variability,

and 2. The number of levels for each source is equal

to the number of treatments under study,

and 3. Data is complete (no missing values),

THEN * Run a Hyper-Graeco Latin Square Design.

------RULE 6 IF 1. A Complete Randomized Design was used,

and 2. Comparing individual treatments is desired,

and 3. It is known what treatments to compare and

contrasts are set,

THEN * Use Orthogonal Contrasts Method.

------RULE 7 IF 1. A Complete Randomized Design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

desired,

and 5. The F-test in the ANOVA for all treatments

combined is significant at 5%,
THEN * Use the Least Significant Difference (LSD) method.

------RULE 8

IF 1. A complete Randomized Design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

desired,

and 5. The F-test in the ANOVA for all treatments

combined is not necessarily significant,

THEN * Use Duncan's multiple range test.

------RULE 9

IF 1. A Complete Randomized design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

not necessary,

and 5. Comparing treatments with a control treatment

is not desired,

THEN * Use Scheffe's method.

------RULE 10
IF 1. A Complete Randomized design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

not necessary,

and 5. Comparing treatments with a control treatment

is desired,
THEN * Use Dunnett's method.

------RULE 11
IF 1. There are two sources of extraneous

variability,

and 2. The number of levels for each source is equal

to the number of treatments under study,

and 3. Data is not complete, THEN * Use Latin Square Design taking into consideration that every pair of treatments

must appear together the same number of

times.

------RULE 12 IF 1. There are two sources of extraneous

variability,

and 2. The number of levels for at least one source

is not equal to the number of treatments

under study,

THEN * Use Youden Square Design but be sure that data is balanced.

5.2.2 RULES FOR FACTORIAL EXPERIMENTS

------RULE 13 IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is not the same for each of

the independent variables,

THEN * Run a full factorial design. If you are considering all possible interactions, then

you will need at least one replicate to

estimate the error term. Otherwise, use the

unimportant interactions for that purpose.
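As a hedged illustration of what running a full factorial design involves, the sketch below enumerates and randomizes the treatment combinations for three factors with unequal numbers of levels; the factor names and level values are hypothetical and are not prescribed by the rule.

import itertools
import random

levels = {
    "hardwood_pct": [5, 10, 15],       # three levels
    "pressure":     [400, 500, 650],   # three levels
    "cook_time":    [3, 4],            # two levels
}

runs = list(itertools.product(*levels.values()))   # 3 x 3 x 2 = 18 treatments
random.shuffle(runs)                               # randomize the run order

for i, run in enumerate(runs, 1):
    print(i, dict(zip(levels.keys(), run)))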

------RULE 14

IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is equal to two for each of

the independent variables,

THEN * Run a factorial experiment at two levels

(2-k).

------RULE 15
IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is equal to three for each

of the independent variables,

THEN * Run a factorial experiment at three levels (3-k).

------RULE 16 IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is greater than three for

each of the independent variables,

THEN * Run an N^k factorial experiment where N represents the number of levels.

------RULE 17

IF

1. Not all independent variables are known to be

important,

and 2. The number of independent variables is large

THEN * Divide each independent variable into high and low levels. Run a highly fractionated

factorial to identify the important main

effects. This is a screening experiment that

is used to separate the important variables

from the unimportant ones. The rest of the

experiment will be carried out with the

identified important variables.
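A minimal sketch of the kind of highly fractionated design Rule 17 refers to, assuming a 2^(7-4) fraction in eight runs built from three base factors with the common generators D = AB, E = AC, F = BC, and G = ABC (the generator choice is illustrative, not something the expert system prescribes):

import itertools
import numpy as np

base = np.array(list(itertools.product([-1, 1], repeat=3)))   # 8 runs of A, B, C
A, B, C = base[:, 0], base[:, 1], base[:, 2]

# generated columns give seven factors in only eight runs (resolution III)
design = np.column_stack([A, B, C, A*B, A*C, B*C, A*B*C])
print(design)      # each row is one run; columns are factors A..G at coded levels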

------RULE 18 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions, THEN * Set up (2-k) factorial design at two levels. Randomize runs, perform experiment, and

collect data. Fit the first order model and

check the adequacy of the model. Proceed to

RULE 34. ------RULE 19 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is no prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels. Randomize runs, perform experiment, and

collect data. Fit the first order model that

includes the two-factor interaction terms and

check the adequacy of the model. Proceed to

RULE 35.
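For illustration, fitting a first order model that includes a two-factor interaction term is an ordinary least squares problem; the sketch below uses hypothetical data from a replicated 2^2 design and is not part of the expert system, which performs no calculations.

import numpy as np

# coded factor levels for a twice-replicated 2^2 design (hypothetical data)
x1 = np.array([-1,  1, -1,  1, -1,  1, -1,  1])
x2 = np.array([-1, -1,  1,  1, -1, -1,  1,  1])
y  = np.array([20.1, 27.9, 23.2, 35.0, 19.7, 28.4, 22.8, 34.6])

# model columns: intercept, x1, x2, x1*x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)                 # b0, b1, b2, b12
residuals = y - X @ beta    # examine these against the assumptions of Chapter 4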

------RULE 20 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first

order model and check the adequacy of the model.
Proceed to RULE 34.

------RULE 21
IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first
order model that includes the two-factor interaction
terms and check the adequacy of the model. Proceed to RULE 35.

------RULE 22
IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is not possible to perform a full

factorial experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2"k-p) fractional factorial design at

two levels. Randomize runs, perform

experiment, and collect data. Fit the first

order model and check the adequacy of the

model. Proceed to RULE 34.

------RULE 23 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is no prior knowledge indicating that

two-factor interactions are not important,

and 4. It is not possible to perform a full

factorial experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at

two levels. Randomize runs, perform

experiment, and collect data. Fit the first

order model that includes all two-factor

interaction terms and check the adequacy of

the model. Proceed to RULE 35.

------RULE 24 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is not possible to perform a full

factorial experiment at two level (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first

order model and check the adequacy of

the model. Be sure that blocks are not

confounded with main effects and main effects

are not confounded with each other. Proceed

to RULE 34.

------RULE 25 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is no prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first

order model that includes two-factor

interaction terms and check the adequacy of

the model. Be sure that blocks are not

confounded with main effects and main effects

are not confounded with each other or with

two-factor interactions. Proceed to RULE 35.

------RULE 26

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is possible to perform a full (2^k)

factorial at two levels,

and 4. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels. Randomize runs, perform experiment, and

collect data. Fit the first order model and

check the adequacy of the model. Proceed to

RULE 34.

------RULE 27

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is possible to perform a full (2-k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data but be sure

that blocks are not confounded with main

effects. Fit the first order model and check

the adequacy of the model. Proceed to RULE

34.

------RULE 28

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that

the two-factor interactions are important,

and 3. It is possible to perform a full (2^k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks but be sure that blocks are not

confounded with main effects. Randomize

runs, perform experiment, and collect data.

Fit the first order model that includes the

two-factor interaction terms and check the

adequacy of the model. Proceed to RULE 35.

------RULE 29 IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that

the two-factor interactions are important,

and 3. It is possible to perform a full (2-k)

factorial at two levels,

and 4. All experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels.

Randomize runs, perform experiment, and

collect data. Fit the first order model that

includes the two-factor interaction terms

and check the adequacy of the model. Proceed

to RULE 35.

------RULE 30

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels . Randomize runs, perform experiment, and collect data. Fit the first

order model without two-factor interactions

and check the adequacy of the model.

proceed to RULE 34.

------RULE 31 IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions, THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs,

perform experiment, and collect data. Fit

the first order model and check the

adequacy of the model. Be sure that blocks

are not confounded with main effects and

main effects are not confounded with each

other. Proceed to RULE 34.

------RULE 32

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the two-factor interactions are important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs,

perform experiment, and collect data. Fit

the first order model that includes two-

factor interaction terms and check the

adequacy of the model. Be sure that blocks

are not confounded with main effects and

main effects are not confounded with each

other or with two-factor interactions.

Proceed to RULE 35.

------RULE 33

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that

the two-factor interactions are important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. All experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels. Randomize runs, perform

experiment, and collect data. Fit the first

order model that includes two-factor

interaction terms and check the adequacy of

the model. Proceed to RULE 35.

------RULE 34 IF 1. There is not an adequate fit of the model, THEN * The assumption of no interactions in the model is possibly wrong. Include the most

likely two-factor interaction terms and fit

the new model. Proceed to RULE 36.

------RULE 35 IF 1. There is not an adequate fit of the model, THEN * The assumption of no higher than two-factor interactions is possibly wrong. Include most

likely three-factor interaction terms and

fit a new model. Proceed to RULE 36.

------RULE 36 IF 1. There is not an adequate fit of the new

model, THEN * The data may need to be transformed into

another scale. Use Box & Cox (1964) 'Maximum

Likelihood method' to identify the best

transformation. Fit the model with the

transformed data. Proceed to RULE 37.

------RULE 37

IF 1. The fit of the model with the

transformed data is not adequate, THEN * At this stage, it is possible that a higher

order model is needed. Start by expanding

the first order into a central composite

design (CCD).
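As a hedged sketch of the expansion mentioned in Rule 37, the code below augments a 2^2 factorial with axial and center points to form a central composite design; the axial distance and the number of center runs are illustrative choices, not values fixed by the system.

import itertools
import numpy as np

k = 2
factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
alpha = (2 ** k) ** 0.25                 # rotatable axial distance for a 2^k core
axial = np.vstack([d * np.eye(k) for d in (-alpha, alpha)])
center = np.zeros((3, k))                # e.g. three center runs

ccd = np.vstack([factorial, axial, center])
print(ccd)          # 4 factorial + 4 axial + 3 center = 11 runs for k = 2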

------RULE 38

IF 1. There is an adequate fit of the model,

and 2. More precision on estimates is desired, THEN * Replicate the experimental runs as many times as possible and estimate the effects

considering randomization in conducting the

experiment.

------RULE 39

IF 1. The experimenter does not know whether or

not it is possible to perform a full (2^k),

and 2. Budget and cost per run are known, THEN * Calculate total cost=(2^k)*(cost per run). Compare total cost with budget.
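A tiny illustration of this check, with hypothetical numbers:

k = 5                    # number of factors at two levels
cost_per_run = 120.0     # hypothetical cost of one experimental run
budget = 3000.0          # hypothetical budget

total_cost = (2 ** k) * cost_per_run     # 32 runs x 120 = 3840
print(total_cost, "affordable" if total_cost <= budget else "consider a fraction")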

[Flow charts omitted: FIGURE 1 (continued); FIGURE 2: EXPERIMENTS WITH TWO OR MORE INDEPENDENT VARIABLES, continued over several pages.]

CHAPTER SIX

SYSTEM EVALUATION

6.1 INTRODUCTION

The utility of the expert system for designing experiments can be measured in terms of the ability to improve the performance of users in designing experiments.

In this chapter, the effectiveness of the system will be evaluated. Conclusions and recommendations for future studies will also be presented.

Feedback from the user is a very important step in evaluating such a system. Therefore, an experiment will be conducted to determine if there is a significant improvement in the users' ability to solve experimental design problems when using the expert system.

6.2 EVALUATION EXPERIMENT

A simple experiment was designed to determine whether the expert system enhances the ability of students to solve experimental design problems (see Appendix). The experiment simply compares the answers obtained by solving the problems with and without the system.

Since the problems vary in difficulty, it is important to block this variation to obtain precision in the results of the experiment. Therefore, a paired t-test was employed to analyze the data; with pairing, differences in problem difficulty contribute no variation to the comparison. Also, the distribution of points is not the same for all problems, since they are not equally important.

Ten students from both graduate and undergraduate classes (five from each) were chosen at random to participate in the experiment. Each student had taken at least one introductory course in experimental design. The experiment was constructed so that each student took the test on his or her own prior to taking it using the expert system. This was done to ensure that students would not use knowledge gained from the expert system in answering the problems on their own. The knowledge that students may gain from taking the test on their own is negligible, since they cannot use any literature.

Points were given to the problems based on the possibilities available for answering. For example, if a student answered "factorial" without specifying "factorial at two levels", then a point was taken off; as a result the student would receive one point instead of two.

Specifications for breaking up points are provided in the Appendix.

As stated in the Appendix, the hypothesis to be tested is that there is a significant improvement in scores when the given experimental design problems are solved with the expert system as opposed to without it. Data was collected and analyzed (Appendix).

The results of this analysis indicate that use of the expert system produced a significant improvement in the subjects' ability to solve the experimental design problems. The calculated t-value is 6.62, while the critical value at alpha = 0.05 is 1.895. This indicates that the expert system was helpful in guiding the user to the appropriate experimental design in the situations tested. Moreover, an examination of the mean score for each problem revealed that scores are higher when the expert system is used.
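For reference, the reported value can be reproduced from the summary statistics given in the Appendix (d_bar = 0.975, s_d = 0.4166, n = 8 matched problems); the short Python sketch below is only a recomputation and is not part of the expert system.

import math
from scipy import stats

d_bar, s_d, n = 0.975, 0.4166, 8
t0 = d_bar / (s_d / math.sqrt(n))
t_crit = stats.t.ppf(0.95, n - 1)        # one sided test at alpha = 0.05

print(round(t0, 2), round(t_crit, 3))    # about 6.62 and 1.895, as reported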

6.3 CONCLUSIONS

Expert systems have been widely used in a variety of fields. Experimental design is one of the few fields that has not received much attention when it comes to developing intelligent systems. Currently available statistical software packages deal only with numerical calculations and ignore the important problem of deciding on the appropriate experimental design to run. This expert system can be effectively used as a teaching aid in experimental design classes. Students will find this system very helpful in the design stage of an experiment. In fact, the help facility can be accessed from any point in the system where help may be needed.

If the system is used to design experiments in real situations, it is preferred that the user have basic experimental design knowledge. That is, the user will be in charge of setting up or referring to the type of design provided by the system.

The purpose of the on-screen help facility is not to teach the subject of designing experiments but to refresh the user's memory on particular designs or technical terms. It is only a dictionary for selected designs and technical terms.

Unlike statistical software packages, this system does not perform any statistical calculations. Its objective is simply to provide guidance to the user when searching for the appropriate design. Therefore, the system is not expected to appeal to those who look for software that does everything from planning through analysis to conclusions. In short, developing this expert

system is an attempt to place more emphasis on the design

stage of the field because wrong conclusions may be drawn

from experiments that are poorly designed.

6.4 RECOMMENDATIONS

1) Future programs can be concerned with relevant subjects

in the field such as Regression, Response Surface

Methodology, experiments with multiple response, simple

comparative experiments and others. It may be feasible

then to integrate them into one system. For example,

the main menu in the integrated system can be the

individual subjects mentioned above.

2) Future versions can be concerned with a limited

interface with a powerful number calculation language

for analysis of experimental data. For example, this

expert system can be interfaced with Turbo Pascal in

its recent version.

3) Future versions can also be concerned with providing

the user of the design set-up with data collection

sheets and the randomized order of treatment

combinations. This, too, needs some calculation

power. For example, if a factorial design at two

levels has been decided upon, the system may ask the

user to enter variable names and actual number of

experimental runs affordable. The system will then

provide the user with the design set-up in random

order. It can go further to provide the adequate high

and low levels of the controllable variables.

I would very much recommend a future program that includes the third suggestion, without worrying about the numerical calculations for data analysis. In fact, there is no need to develop a system that includes both the design and the analysis, because data collection and modifications usually lie between the two stages. It should be remembered that the nature of experimentation is sequential and divided into several stages. Therefore, one program that includes all stages may not be effective.

REFERENCES

1) Aleksander, I., Designing Intelligent Systems, UNIPUB, New York, 1984.

2) Baker, T. B., "Quality Engineering by Design: Taguchi's Philosophy", Quality Progress, pp. 32-42, Dec. 1986.

3) Bartlett, M. S., "The Use of Transformations", Proc. Roy. Soc. A, 160:268, 1937.

4) Bartlett, M. S., and Kendall, D. G., "The Statistical Analysis of Variance-Heterogeneity and the Logarithmic Transformation", Journal of the Royal Statistical Society Supplement, 8:128, 1946.

5) Black, W. J., Intelligent Knowledge Based Systems: An Introduction, Van Nostrand Reinhold (UK) Co. Ltd., England, 1986.

6) Box, G. E. P., "Discussion on Paper by Kackar (1985)", Journal of Quality Technology, Vol. 17, No. 4, Oct. 1985.

7) Box, G. E. P., and Cox, D. R., "An Analysis of Transformations", J. R. Statist. Soc. Ser. B, 26:211-243, 1964.

8) Box, G. E. P., Kackar, R. N., Nair, V. N., Phadke, M. S., Shoemaker, A. C., and Wu, C. F., "Quality Practice in Japan", Quality Progress, pp. 37-41, Mar. 1988.

9) Box, G. E. P., and Tidwell, P. W., "Transformation of the Independent Variables", Technometrics, Vol. 4, No. 4, 1962.

10) Box, G. E. P., and Draper, N. R., Evolutionary Operation, Wiley, New York, 1969.

11) Box, G. E. P., and Hunter, J. S., "The 2^(k-p) Fractional Factorial Designs, Part I", Technometrics, Vol. 3, No. 3, pp. 311-351, 1961.

12) Box, G. E. P., and Hunter, J. S., "The 2^(k-p) Fractional Factorial Designs, Part II", Technometrics, Vol. 3, No. 4, pp. 449-459, 1961.

13) Box, G. E. P., Hunter, W. G., and Hunter, J. S., Statistics for Experimenters, John Wiley & Sons, Inc., New York, 1978.

14) Bratko, I., Prolog Programming for Artificial Intelligence, Addison-Wesley Publishers Limited, Great Britain, 1986.

15) Cochran, W. G., "Some Consequences When the Assumptions for the Analysis of Variance Are Not Satisfied", Biometrics, Vol. 3, pp. 22-38, 1947.

16) Dixon, W. J., and Massey, F. J., Jr., Introduction to Statistical Analysis, McGraw-Hill Book Company, New York, 1983.

17) Draper, N. R., and Herzberg, A. M., "On Lack of Fit", Technometrics, 13:231-241, 1971.

18) Duncan, D. B., "Multiple Range and Multiple F Tests", Biometrics, Vol. 11, pp. 1-42, 1955.

19) Dunnett, C. W., "New Tables for Multiple Comparisons with a Control", Biometrics, Vol. 20, pp. 482-491, 1964.

20) Eisenhart, C., "The Assumptions Underlying the Analysis of Variance", Biometrics, Vol. 3, pp. 1-21, 1947.

21) Fikes, R., and Kehler, T., "The Role of Frame-Based Representation in Reasoning", Communications of the ACM, Vol. 28, No. 9, pp. 904-920, September 1985.

22) Gale, W. A., "Knowledge-Based Knowledge Acquisition for a Statistical Consulting System", International Journal of Man-Machine Studies, 26, pp. 55-64, 1987.

23) Glen, T. M., and Terry, W. R., "Statistical Problems and Treatments in Assessing the Long Term Impact of Human Performance Productivity Improvement", Proceedings, Seventh International Conference on Production Research, pp. 906-911, 1984.

24) Hahn, G. J., "More Intelligent Statistical Software and Statistical Expert Systems: Future Directions", The American Statistician, Vol. 39, pp. 1-16, Feb. 1985.

25) Hahn, G. J., "Some Things Engineers Should Know About Experimental Design", Journal of Quality Technology, Vol. 9, No. 1, pp. 13-20, Jan. 1977.

26) Hand, D. J., "Statistical Expert Systems: Design", The Statistician, Vol. 33, pp. 351-369, 1984.

27) Hand, D. J., "Statistical Expert Systems: Necessary Attributes", Journal of Applied Statistics, Vol. 12, No. 1, pp. 19-27, 1985.

28) Hicks, C. R., Fundamental Concepts in the Design of Experiments, CBS College Publishing, New York, 1982.

29) Hines, W. W., and Montgomery, D. C., Probability and Statistics in Engineering and Management Science, John Wiley & Sons, Inc., New York, 1972.

30) Hunter, J. S., "Statistical Design Applied to Product Design", Journal of Quality Technology, Vol. 17, No. 4, pp. 210-219, 1985.

31) Johnson, P. E., Nachtsheim, C. J., and Zualkernan, I. A., "Consultant Expertise", Expert Systems, Vol. 4, No. 3, pp. 180-188, Aug. 1987.

32) Jones, B., "The Computer as a Statistical Consultant", BIAS, Vol. 7, No. 2, pp. 168-195, 1980.

33) Kackar, R. N., "Off-Line Quality Control, Parameter Design, and the Taguchi Method", Journal of Quality Technology, Vol. 17, No. 4, pp. 176-188, Oct. 1985.

34) Kackar, R. N., "Taguchi's Philosophy: Analysis and Commentary", Quality Progress, pp. 21-29, Dec. 1986.

35) Kennard, R. W., and Stone, L. A., "Computer-Aided Design of Experiments", Technometrics, Vol. 11, No. 1, pp. 137-148, 1969.

36) Mendenhall, W., Introduction to Linear Models and the Design and Analysis of Experiments, Duxbury Press, California, 1968.

37) Miller, R. K., and Walker, T. C., Artificial Intelligence Applications in Engineering, SEAI Technical Publishing, U.S.A., 1988.

38) Montgomery, D. C., Design and Analysis of Experiments, John Wiley & Sons, Inc., New York, 1976.

39) Nelson, L. S., "Extreme Screening Designs", Journal of Quality Technology, Vol. 14, No. 2, pp. 99-100, April 1982.

40) Shapiro, S. S., Wilk, M. B., and Chen, H. J., "A Comparative Study of Various Tests for Normality", Journal of the American Statistical Association, 63:1343, 1968.

41) Scheffe, H., "A Method for Judging All Contrasts in the Analysis of Variance", Biometrika, Vol. 40, pp. 87-104, 1953.

42) Smith, C. A. B., and Hartley, H. O., "Construction of Youden Squares", Journal of the Royal Statistical Society, B, Vol. 10, pp. 262-264.

43) Snee, R. D., "Computer-Aided Design of Experiments, Some Practical Experiences", Journal of Quality Technology, Vol. 17, No. 4, pp. 222-236, Oct. 1985.

44) Steele, R. G. D., and Torsie, J. H., Principles and Procedures of Statistics, McGraw-Hill, New York, 1960.

45) Sullivan, L. P., "The Power of Taguchi Methods", Quality Progress, pp. 76-79, June 1987.

46) Tukey, J. W., "One Degree of Freedom for Non-Additivity", Biometrics, Vol. 5, pp. 232-242, 1949.

47) Youden, W. J., "Partial Confounding in Factorial Replication", Technometrics, Vol. 3, No. 3, pp. 353-358, August 1961.

APPENDIX

Evaluation Test ......

Data Sheet

Point Distribution ......

Data Summary ......

Score Averages ------

Paired t-test ......

Instructions for Running the Program ------

Sample Output ......

EFFICIENCY TEST

The following problems were presented to a number of students to determine what type of experimental design is appropriate for each of them. Each student first had to determine the answer on his or her own, and then was allowed to determine the answer by using the expert system. These problems were selected from Montgomery

(1976) and Box, Hunter, and Hunter (1978).

------PROBLEM 1 An engineer wishes to determine whether or not four tips produce different readings on a hardness test machine. The machine operates by pressing the tip into a metal specimen, and from the depth of the resulting depression, the hardness of the specimen can be determined. There is only one factor to consider: tip type. (a) If the metal specimens differ slightly in their hardness, what type of experimental design should be used to eliminate the uncontrolled variability of specimen hardness? (b) After the type of design has been decided upon, the engineer is looking for a way to compare all pairs of individual tip types; what test do you think he/she should use?

------PROBLEM 2 The percentage of hardwood concentration in raw pulp, the vat pressure, and the cooking time of pulp are being investigated for their effects on the strength of paper. Three levels of hardwood concentration, three levels of pressure, and two cooking times are selected. What type of design should be used?

------PROBLEM 3 An experimenter is studying the effect of five different formulations of an explosive mixture used in the manufacture of dynamite on the observed explosive force.

Each formulation is mixed from a batch of materials that is only large enough for five formulations to be tested.

The formulations are prepared by different operators, and there may be substantial differences in the skills and experience of the operators. Five operators are chosen at random for this experiment. It should be noted that this single factor experiment has two uncontrolled variables: operators and batches. What type of experimental design should be run?

------PROBLEM 4 Three factors are studied for their effects on the horsepower necessary to remove a cubic inch of metal per minute. The factors are feed rate at two levels, tool condition at two levels, and tool type at two levels. It should be mentioned that all the factors are important.

What type of design should be used?

------PROBLEM 5 An engineer is studying the effect on filtration time of some product. The variables under consideration are:

(1) Water supply

(2) Raw materials

(3) Temperature of filtration

(4) Hold up time prior to filtration

(5) Recycle

(6) Rate of addition of caustic soda

(7) Type of filter cloth

(a) The engineer does not know for certain which of the above variables are important. What type of design do you recommend to identify the important ones? (b) If the experimenter has identified that variables (1) water supply, (3) temperature of filtration, and (6) rate of adding caustic soda are important, and he/she suspects that two-factor interactions among them are important too,

what is the next step the engineer should take? (c) If a model involving the three main effects and their two-factor interactions provides an adequate fit, what should the experimenter do to obtain more precision on their estimates?

------DATA SHEET

USING SYSTEM (Y/N)? ------SUBJECT # ----

ISE 307/507 ------

PROBLEM 1

PROBLEM 2

PROBLEM 3

PROBLEM 4

PROBLEM 5 POINT DISTRIBUTION

Points for the problems will be distributed as the following:

PROBLEM 1 (a) Complete Randomized Block Design 2 Complete Randomized Design 1 Otherwise 0

(b) Duncan's Multiple Range Test Otherwise

PROBLEM 2 Full Factorial Design Factorial Design Otherwise

PROBLEM 3 Latin Square Design Factorial Design Otherwise

PROBLEM 4 Factorial Design at 2 levels Factorial Design Otherwise

PROBLEM 5 (a) Fractional factorial at 2 levels 2 Factorial 1 Otherwise 0

(b) Factorial at 2 levels + Inclusion of 2-factor interactions 2 Factorial at 2 levels 1 Otherwise 0

(c) Replication of experimental runs 1 Otherwise 0

DATA SUMMARY

Using the Expert System? No

[Table of subjects' scores by problem without the expert system.]

DATA SUMMARY

Using the Expert System? Yes

[Table of subjects' scores by problem with the expert system.]

SCORE AVERAGES

                 Average without      Average with       Difference
                 system (1)           system (2)         (2) - (1)

Problem 1 (a)

(b)

Problem 2

Problem 3

Problem 4

Problem 5 (a)

(b)

(c)

PAIRED T-TEST

H0: u1 = u2

HA: u1 < u2

Where: u1 refers to the mean score obtained when solving the problems without using the system, and u2 refers to the mean score obtained when solving the problems using the system.

Statistic: t0 = d_bar / (s_d / sqrt(n))

Where: d_bar is the average of the differences between pairs, s_d is the standard deviation of the differences, and n is the number of matched pairs.

alpha = 0.05
d_bar = 0.975
s_d = 0.4166

t0 = 0.975 / (0.4166 / sqrt(8)) = 6.62

t(0.05, 8-1) = 1.895, so t0 > t(0.05, 7)

Conclusion: u2 is greater than u1 (reject the null hypothesis and conclude that using the system produces significantly higher scores).

INSTRUCTIONS FOR RUNNING THE PROGRAM

1) Insert a DOS disk in drive a and get A>

2) Remove the DOS disk from drive a and insert the program

(Turbo Prolog) disk and type prolog

3) Insert the data disk in drive b

4) Move the cursor to option Files in the main menu and hit <Enter>

5) Move the cursor to option Load and hit <Enter>

6) Type b:expert and hit <Enter>

7) After the file has been loaded, move the cursor to option Run in the main menu and hit <Enter>

SAMPLE OUTPUT

EXPERT SYSTEM FOR DESIGNING STATISTICAL EXPERIMENTS Developed by: Mustafa Shraim Advisor: Dr. W. Robert Terry

Department of Industrial & Systems Engineering Ohio University Athens, Ohio

Press space bar to continue

THIS PROGRAM IS DESIGNED TO HELP THE USER IN SELECTING THE APPROPRIATE EXPERIMENTAL DESIGN. IT DOES NOT PROVIDE ANY NUMERICAL CALCULATIONS OF EXPERIMENTAL DATA. IT SHOULD BE NOTED THAT THIS PROGRAM APPLIES TO SINGLE DEPENDENT VARIABLE OR RESPONSE AND FIXED EFFECTS SITUATIONS ONLY

Do you wish to start the program (y/n)?y 1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: 1

Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?

Enter answer or 'q' to quit or 'h' for help : h

Uncontrollable variability: variability that comes from extraneous sources over which the experimenter has no control. Example: Temperature in a field experiment. Press space bar to continue. Are there any sources of extraneous uncontrollable factors in the experiment (y/n)?

Enter answer or 'q' to quit or 'h' for help : n

Run a complete randomized design to compare K>2 treatment means or levels of the single factor

Enter 'c' to continue or 'h' for help h

Complete Randomized Design: A very simple form of experimental design in which the treatments involved are allocated to the experimental units purely on a random basis

Press space bar to continue Do you wish to compare individual treatment means (y/n)?

Enter answer (or 'q' to quit): y

Is it known what treatments to compare and contrasts are set (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Do you wish to compare all possible pairs of treatment means (levels In single factor experiment) (y/n)?

Enter answer ('q' to quit): y

Is F-test for all treatments (levels) combined significant at 5% (y/n)?

Enter answer ('q' to quit): n

Use Duncan's multiple range test

Enter 'c' to continue or 'h' for help h

Duncan's multiple range test is a powerful tool to compare treatment means. Refer to Duncan (1955)

Press space bar to continue 1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: I

Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?

Enter answer or 'q' to quit or 'h' for help : y

1 -- Source of uncontrollable variability comes from one factor 2 -- Source of uncontrollable variability comes from more than one factor

Enter number of choice ('q' to quit): 2

1 -- Uncontrollable variability comes from two factors 2 -- Uncontrollable variability comes from three factors 3 -- Uncontrollable variability comes from more than three factors

Enter number of choice ('q' to quit): 1  Is number of levels for each of the extraneous factors equal to the number of treatments (or levels) to be compared (y/n)? Enter answer ('q' to quit): y

Is data complete (i.e. no missing values) (y/n)?

Enter answer ('q' to quit): y

Use Latin square design

Enter 'c' to continue or 'h' for help h

Latin square design is used when the uncontrollable variability comes from two sources which may be identified as rows and columns. One Roman letter is used in each cell of the square with no repetition in the same row and column. For example, if we would like to investigate three treatments A, B, and C with two uncontrollable variables each at three levels then the square is A C B / C B A / B A C

Press space bar to continue. 1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: 2

Are all independent variables known to be important (y/n)?

Enter answer or 'q' to quit or 'h' for help: y

Is number of levels for each of the independent variables predetermined (y/n)?

Enter answer or 'q' to quit or 'h' for help: y

Is number of levels different from variable to variable (y/n)?

Enter answer ('q' to quit): y  Run a full factorial design. If you are considering all possible interactions then you will need at least two replicates of one treatment combination. Otherwise you may use the unimportant interactions to estimate the error term

Enter 'c' to continue or 'h' for help h

Full factorial: Involves all possible treatment combinations in two or more factor experiments. For example, if three factors (independent variables) are considered in an experiment and each is at two levels then the full factorial involves 2 x 2 x 2 = 8 treatment combinations. Interaction: Is a measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the level(s) of another or others

Press space bar to continue

Run a full factorial design. If you are considering all possible interactions then you will need at least two replicates of one treatment combination. Otherwise you may use the unimportant interactions to estimate the error term

Enter 'c' to continue or 'h' for help c  1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: 2

Are all independent variables known to be important (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Divide each independent variable into high and low levels. Run a highly fractionated factorial design at two levels. This will serve as a screening experiment for identifying the set of the significant variables

Enter 'c' to continue or 'h' for help h

Screening: The process of separating the important independent variables from the unimportant. Screening is applied at an early stage of experimental design when a large number of variables is present

Press space bar to continue. Divide each independent variable into high and low levels. Run a highly fractionated factorial design at two levels. This will serve as a screening experiment for identifying the set of the significant variables

Enter 'c' to continue or 'h' for help c

Is there any prior knowledge indicating that any of the two-factor interactions are important (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Is it economically possible to perform all 2-k experimental runs given the budget and cost per run (y/n)?

Enter answer or 'q' to quit or 'h' for help: y

Can all runs be performed under homogeneous conditions (y/n)?

Enter answer or 'q' to quit or 'h' for help y

Set up a full factorial design at two levels (2-k). Randomize runs, perform experiment and collect data. Fit the first order model that includes two-factor interactions. Conduct a lack of fit test (use higher order interactions to estimate error if experiment is not sufficiently replicated)

Enter 'c' to continue or 'h' for help h  Randomization: A set of treatments applied to a set of units is said to be randomized when the treatment applied to any given unit is chosen at random from those available and not already allocated. First order model: (Linear model) contains only the terms that are linear in nature

Press space bar to continue

2-k: (Factorial design at two levels) refers to k factors, each being investigated at two levels. This is the most popular factorial design because it does not require a large number of runs. However, this design estimates the linear effects of the independent variables and does not account for nonlinear effects. Interaction: Is a measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the level(s) of another or others

Press space bar to continue

Set up a full factorial design at two levels (2-k). Randomize runs, perform experiment and collect data. Fit the first order model that includes two-factor interactions. Conduct a lack of fit test (use higher order interactions to estimate error if experiment is not sufficiently replicated)

Enter 'c' to continue or 'h' for help c

Is there a good fit of the model (y/n)?

Enter answer or 'q' to quit or 'h' for help: h  The model is adequate and has a good fit if there are no significant violations of one or more of the following statistical assumptions: Linearity, Normality, Independence, Homogeneity of variance, and Additivity

Press space bar to continue

Linearity Definition: The dependent variable (response) can be expressed as a linear function of the independent variables. Such an assumption will be valid when the surface which represents the relationship between the dependent variable and the independent variables has no significant curvature. Testing: Linearity can be tested by employing the lack of fit test in Draper & Herzberg (1971). Remedy: If the linearity assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Normality Definition: The difference between predicted and actual values, i.e. the residuals, must follow a normal distribution with the mean centered at zero. Testing: Normality can be tested by employing the chi-square goodness of fit test, the Kolmogorov-Smirnov test, or the W test developed by Shapiro, Wilk & Chen (1968). Remedy: If the normality assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Independence Definition: Residuals should be independently distributed random variables. This means that the knowledge of the magnitude of a given residual does not contain any information for predicting the values of other residuals. Testing: The independence assumption can be tested by employing the Durbin-Watson test. Remedy: Refer to Glen & Terry (1984)

Press space bar to continue. Homogeneity Of Variance Definition: The variances of the distributions of the residuals are constant for all combinations of values of the independent variables. Testing: The assumption of homogeneous variance can be tested by employing a test developed by Bartlett (1937) or the one by Bartlett and Kendall (1946). Remedy: If homogeneity of variance is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Additivity Definition: The model, which expresses the dependent variable as a function of the independent variables, must not contain any cross product or interaction terms. However, the assumption that interactions are not present will not always be valid. Testing: The additivity assumption can be tested by employing the 'One degree of freedom test for additivity' by Tukey (1949). Remedy: If the additivity assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Is there a good fit of the model (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

The assumption of no higher than two-factor interactions is possibly wrong. Include the most likely three-factor interaction terms in the model. Fit the first order model

Enter 'c' to continue or 'h' for help h 3-factor interaction: A measure of the extent to which the effect ( upon the dependent variable) of changing the level of one factor depends on the levels of the other two factors

Press space bar to continue

The assumption of no higher than two-factor interactions is possibly wrong. Include the most likely three-factor interaction terms in the model. Fit the first order model

Enter 'c' to continue or 'h' for help c

Is there a good fit of the model (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Transformation of the data may be applied to achieve model adequacy. The lack of adequacy is a result of violation of one or more of the statistical assumptions

Enter 'c' to continue or 'h' for help h

Transformation: The metrics in which data is recorded are chosen merely as a matter of convenience. However, they may not be those in which the system is most simply modeled. In such cases, transforming the response or the independent variable(s) or both onto another scale provides us with a simple model

Press space bar to continue. Is there a good fit on the transformed data (y/n)?

Enter answer or 'q' to quit: y

Is more precision on estimates of effects desired (y/n)?

Enter answer or 'q' to quit: y

Replicate the experimental runs as many times as possible to achieve the precision desired

Enter 'c' to continue or 'h' for help

Replication: The execution of an experiment or a survey more than once so as to increase precision and to obtain a closer estimation of sampling error

Press space bar to continue