AN EXPERT SYSTEM FOR DESIGNING

STATISTICAL EXPERIMENTS.

A thesis Presented to

The Faculty of The College of Engineering And Technology

Ohio University

In Partial Fulfillment of

The Requirements for the Degree of

Masters of Science

Mustafa S. Shraim

ACKNOWLEDGEMENT

I would like to express my thanks to my advisor, Dr. W. Robert Terry, for his help and encouragement throughout this research. Also, I would like to extend my thanks to my committee members, Dr. Francis Bellezza and Prof. Ken Cutright for their guidance and advice.

This thesis is dedicated to my parents, Saadeddin and Etaf Shraim, for their continuous love and support and to my girlfriend, Ellen Battistone, without whom .....

TABLE OF CONTENTS

Chapter One: INTRODUCTION
1.1 Experimental Design Review
1.1.1 Basic Experimental Design Principles
1.1.2 Experimental Design Procedure
1.1.3 Statistical Techniques in Experimental Design
1.2 Experimental Design and Quality Improvement
1.3 Statistical Software
1.4 Knowledge Engineering Background

Chapter Two: NEEDS ANALYSIS

Chapter Three: TECHNICAL AND STATISTICAL TERMINOLOGY
3.1 Definitions of Technical Terms
3.2 Definitions of Experimental Designs
3.3 Definitions of Tests

Chapter Four: EXPERIMENTAL DESIGN
4.1 Introduction
4.2 Basic Experimental Design Principles
4.3 Screening Experiments
4.4 Model Adequacy

Chapter Five: KNOWLEDGE REPRESENTATION
5.1 Introduction
5.2 Experimental Design Rules
5.2.1 Rules for Single Factor Experiments
5.2.2 Rules for Factorial Experiments

Chapter Six: SYSTEM EVALUATION
6.1 Introduction
6.2 Evaluation Experiments
6.3 Conclusions
6.4 Recommendations

Appendices:
Evaluation Test
Data Sheet
Point Distribution
Data Summary
Score Averages
Paired t-test
Instructions for Running the Program
Sample Output

TABLE OF FLOW CHARTS

CHAPTER ONE

INTRODUCTION

1.1 EXPERIMENTAL DESIGN REVIEW

A "true experiment" may be defined as a study in which some independent variables are manipulated

(Hicks (1982)). Their effects on the response or the

dependent variable are determined and levels of the

independent variables are controlled and not chosen at random. The various combinations of levels of independent variables are referred to as treatments. These treatments are run in random order.

The main reason for designing experiments is to obtain accurate information about the study at the lowest cost. Poorly planned experiments are often costly because unnecessary experimental runs are performed. Moreover, it is often necessary to study situations in which the effects of two or more independent variables combine in a nonadditive fashion. These nonadditive effects are called interaction effects. If the experiment is performed by changing the level of one independent variable at a time while holding the others constant, then interaction effects can not be determined. Another reason for designing an experiment is the need to measure the experimental error, which is a crucial component in conducting statistical tests. These statistical tests give the experimenter a sound basis for interpreting the results of the data analysis.
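To make the idea of an interaction concrete, the following short Python sketch (hypothetical yield values chosen purely for illustration, not data from this study) computes the main effect of one factor and the two-factor interaction in a 2 x 2 experiment as a "difference of differences".

# Hypothetical 2 x 2 experiment: factors A and B, each at a low (-) and
# high (+) level, with one response value per treatment combination.
yield_ = {
    ('-', '-'): 60.0,
    ('+', '-'): 72.0,
    ('-', '+'): 65.0,
    ('+', '+'): 90.0,
}

# Main effect of A: average change in response when A moves from low to high.
effect_A = ((yield_[('+', '-')] - yield_[('-', '-')]) +
            (yield_[('+', '+')] - yield_[('-', '+')])) / 2

# Interaction AB: half the "difference of differences" -- how much the effect
# of A changes when B is moved from its low level to its high level.
effect_A_at_B_low = yield_[('+', '-')] - yield_[('-', '-')]    # 12.0
effect_A_at_B_high = yield_[('+', '+')] - yield_[('-', '+')]   # 25.0
interaction_AB = (effect_A_at_B_high - effect_A_at_B_low) / 2

print(effect_A)         # 18.5
print(interaction_AB)   # 6.5; a nonzero value means the effects are nonadditive

A one-factor-at-a-time plan that never runs the (+, +) combination could not estimate this interaction at all.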

Steel and Torrie (1960) stated that "the experimenter who contacts the consulting statistician at the planning stage of the experiment rather than after its completion improves the chances of achieving his aims". According to them, the statistician is often faced with data that provide only biased estimates of effects and differences. The conclusions drawn from the analysis of such data may not apply to the population the experimenter had in mind. However, if the experimenter applies proper experimental design, then valid conclusions based on precise estimates can be drawn.

1.1.1 BASIC EXPERIMENTAL DESIGN PRINCIPLES

Often, after defining the independent variables and the response(s) of an experiment, the data are collected without careful consideration of how they should be collected. At other times, the experiment is conducted with little or no attention to the extraneous variability that can contribute to the experimental error. For example, if temperature is known to have an effect on the outcome (response) of a certain experiment, then performing the experimental runs under different temperatures could inflate the experimental error. As a result, the validity of the conclusions drawn is affected.

There are three basic principles in designing experiments: (1) blocking, (2) randomization, and (3) replication. Blocking is a method used in experimental design to draw valid conclusions from an experiment. This is done by performing the experimental runs in subexperiments, or blocks. Each block contains the experimental runs that are performed under homogeneous conditions. The variability between blocks can then be eliminated, since the extraneous conditions of one block may differ from those of another. Randomization is the key principle in designing statistical experiments. It means that the experimental runs are selected and performed in a random order. The emphasis on randomization comes from the fact that statistical methods require that the observations, which contain the experimental error, be independently distributed random variables. Replication refers to the repetition of the basic experimental runs. It is important in designing statistical experiments for two reasons. First, it provides the experimenter with an estimate of the experimental error, which is an essential element in conducting statistical tests. Second, it provides the experimenter with a more precise estimate of the mean response.

A detailed discussion of these basic experimental design principles and other aspects will be presented in Chapter Four.

1.1.2 EXPERIMENTAL DESIGN PROCEDURE

In this section, a description of the procedure for experimentation will be presented. This includes the design, the performance, the analysis, and the conclusions.

(1) Statement of the problem: The objective of an experiment must be clearly defined.

(2) Choice of variables: this includes both the independent variables and the response(s). The levels of the independent variables over which the investigation will be performed must also be defined. The dependent variable, or response, must be selected so that it (a) provides useful information about the study, and (b) can be measured.

(3) Choice of experimental design: The experimenter must take the basic principles mentioned above into consideration when he or she selects an experimental design. The objective of the experiment as well as budget constraints are also important issues.

(4) Performing the experimental runs: the experimenter must ensure that proper randomization, measurement, and homogeneity of the environment in which the experiment is taking place are proceeding as planned.

(5) Analyzing the experiments: the data collected from the experiment should be analyzed using proper statistical techniques. Graphical methods are also helpful, especially in checking the adequacy of the model.

(6) Drawing conclusions: statistical techniques provide guidance in drawing conclusions from the analysis of the experimental data. Statistical significance of effects and differences can be easily recognized for recommendations on the findings.

1.1.3 STATISTICAL TECHNIQUES IN EXPERIMENTAL DESIGN

The experimenter can safely use statistical techniques when the requirements of the experimental design are met. Namely, recognizing the basic principles mentioned earlier and building the design upon them enables the experimenter to use statistical methodology in a straightforward manner. The following are some considerations for the experimenter to keep in mind.

(1) The choice of the experimental design and analysis should be kept simple; simple designs and analyses are almost always best (Montgomery (1976)).

(2) The experimenter should recognize the sequential approach in experimentation. That is, the initial experiment need not be comprehensive, since variables may later be added or dropped. This is always true in screening experiments, where the experimenter is trying to separate the important variables from the unimportant ones.

(3) Interpretation of statistical significance should be accompanied by practical knowledge about the experimental situation. For example, an engineer may determine that a modified carburetor of an automobile can save 0.1 mile/gallon, which is statistically significant. However, if the cost of modification is $1000/carburetor, then it is obviously not worth the effort, especially when the price of gas is low.

1.2 EXPERIMENTAL DESIGN AND QUALITY IMPROVEMENT

It is important to consider the role that experimental design has been playing in quality improvement. Nearly all manufacturing firms use some type of quality control procedure to achieve quality products. However, these procedures may be costly to implement, resulting in overpriced products. They may also be ineffective due to the lack of suitable experimental design methodologies. Therefore, some type of consultation system through which experimental design expertise can be conveyed to quality control personnel is needed. Engineers can not simply use existing statistical analysis software packages, which do not help in planning the right experiment.

In the past five years, there has been a great change in the way American firms think when it comes to quality improvement. Dr. Genichi Taguchi, a Japanese quality consultant, has introduced new methodologies, called "Off-line Quality Control", to achieve quality improvement.

Viewing Taguchi's quality control methods from the perspective of the Japanese industrial success, some American firms readily adopted them. His philosophy is divided into two main categories: (1) on-line quality control methods and (2) off-line quality control methods. His on-line quality control methods consist of statistical process control, prediction and correction, and inspection. The expert system developed in this research is concerned only with off-line quality control, since it deals with designing experiments.

Box (1985) has observed that the method of off-line quality control advocated by Taguchi, which utilizes experimental design, suffers from the following:

(1) The orthogonal array design ignores the existence of interactions between the controllable variables and the uncontrollable or noise variables.

(2) Taguchi seems to ignore the possibility of data transformation. In practice, the possibility of dealing with inadequate models should always be kept in mind.

(3) The sequential approach in experimentation is not referred to by Taguchi despite its proven efficiency.

1.3 STATISTICAL SOFTWARE

Traditional experimental design software may be misused for two major reasons, as stated by Hand (1984). The first is the difficulty in operating the software: it requires a comprehensive understanding that can only be achieved by studying the details in its manuals. The second reason comes from the fact that some users mistakenly analyze their data using a wrong or less suitable experimental design. The second reason tends to be more harmful than the first because it may lead to invalid conclusions. This problem is likely to occur when users are so eager to analyze the data that they tend to use the easiest design available. However, lack of knowledge about experimental design and failure to realize the consequences of poorly designed experiments are also very important reasons.

Analytical software packages assume that the user knows what he or she is doing with regard to selecting the right computerized algorithm. This assumption can be invalid if the user lacks this expertise. Therefore, the need for an expert system to answer the user's questions on the selection of the right experimental design concerns many researchers in this field. The field of experimental design is one that has not been given a lot of attention when it comes to the development of such systems, for the following reasons:

(1) Experimental design is used in many domains such as psychology, agriculture, engineering, physical science and others, which makes decision making rather difficult. These domains differ in their requirements for planning and for making inferences from results. For example, in complex high technology, a large number of experimental runs can not be afforded because of the high cost of a run. In fields like agriculture or psychology, however, experimental runs may be replicated several times.

(2) The field of experimental design has a wide range, which makes it somewhat ineffective to cover in one consultation system.

(3) Some researchers (Hand (1984)) argue that the difficulty also comes from the enormous amount of technical terms that the user must be familiar with before efficiently using the consultation system. For example, blocking is an important aspect of experimental design that must be understood by the user. If he or she is not familiar with this concept, the whole experimental design will be affected and, as a result, so will the conclusions drawn from the experiment. The cost of conducting experiments may also be increased, as well as the risk of error.

(4) It has proven difficult to develop a system that contains symbolic knowledge as well as powerful number manipulation, due to the lack of a computer language that effectively suits both. Powerful number manipulation languages such as Fortran are not efficient in representing symbolic knowledge.

The points mentioned above are the major difficulties that researchers face in trying to develop a computer program that is able to emulate the human expert in reasoning and decision making with regard to designing experiments. The domain of experimental design can not be compared to knowledge of geography, for example. The latter can be represented easily in a knowledge-based system because of its static pieces of information. On the other hand, designing an experiment is usually done in several phases. One phase may totally depend on the information obtained from a previous one as is the case in the sequential approach in experimentation.

There are two major questions that need to be answered when it comes to developing an expert system for designing experiments. The first is: for what purpose is the system being developed? The second is: for whom will the system be designed?

The first question has two possible answers: (1) for the selection of a suitable experimental design and (2) for number manipulation. The second need has been addressed for many years. The market is full of software packages that compete with one another on how fast the analysis is carried out, how easy it is to input the data, and how much data the system can handle at one time. The first need, however, is hardly addressed by the software market, despite its crucial role in effective experimental design. This point is most forcefully made by Hahn (1977) when he stated "the world's best statistical analysis can not rescue a poorly planned experimental program".

The second question, which deals with the group of people for whom the system is designed, must be answered clearly so that the system developed is effective. An expert system in this field can not be effective for a wide variety of people. The novice's expectations of the system differ significantly from those of the well-informed person. The well-informed person does not want to be bothered with elementary questions and explanations, while the novice is likely to require a lot of help throughout the session. The novice is likely to appreciate an expert system that does everything, from planning the experiment through the analysis to drawing inferences from the findings. Attempts to satisfy the requirements of both extremes can dramatically affect the effectiveness of the system. Consequently, identifying the group of people for whom the system will be designed is an essential step in the development of an expert system for designing experiments.

1.4 KNOWLEDGE ENGINEERING BACKGROUND

Knowledge engineering is basically the reduction of a knowledge domain into sets of facts and rules. The knowledge engineering process consists of two major phases: knowledge acquisition and knowledge representation. Together, the two phases can be described as the process of acquiring knowledge from an expert on a particular domain, coding this knowledge, and computerizing it to achieve a system that effectively emulates the human expert in the decision making process. Before the knowledge engineer starts acquiring knowledge, he or she must define the problem and the domain on which the knowledge will be acquired. There are three major requirements that need to be satisfied when it comes to identifying the domain:

(1) The domain must be narrow, to achieve a more effective system.

(2) The domain must have sufficient expertise. For example, an attempt to develop an expert system on astrology will probably fail due to a serious lack of expertise in that domain.

(3) The domain should contain reliable knowledge that does not significantly change over time.

Knowledge-based systems are different from other types of systems. The following features of knowledge-based systems distinguish them from other systems:

(1) Knowledge-based systems, or expert systems, are limited to a specific domain of expertise.

(2) The reasoning mechanism of one expert system may be applied to other expert systems.

(3) Expert systems can apply deductive reasoning.

(4) Expert systems provide symbolic output for better interaction with the user.

Traditional computer systems are better at procedural programming, such as numerical and algorithmic processes, whereas expert systems have the ability to emulate the human expert in a particular domain. There are many situations in which expert systems are needed, ranging from scarce or inconsistent availability of expertise to providing education and training facilities. They can also replace the human expert when obtaining his or her expertise is time-consuming or expensive.

While expert systems are not able to think, they still have some advantages over the human expert: consistency in delivering answers and not jumping to conclusions without considering all details. In other words, unlike human experts, expert systems are not emotionally influenced in the decision making process, an influence which sometimes leads to inconsistent judgement.

An expert system usually consists of two major parts: (1) the knowledge base and (2) the inference engine. The knowledge base consists of the knowledge that is specific to the domain of application, including facts about the domain and rules that describe relations in the domain. The inference engine performs two tasks: it examines the facts and determines whether or not they are satisfied using the rules in the rulebase, and it determines the order in which these rules are scanned and fired.

The order in which the rules are examined is an important issue for the knowledge engineer to keep in mind when developing an expert system. The system must have a starting point from which the searching procedure begins. This starting point depends upon the objective of the application. In any case, there are two scanning procedures to consider: forward chaining and backward chaining.

In forward chaining, the inference engine starts with the facts and works its way forward in an attempt to find a conclusion that matches the given facts. In backward chaining, the inference engine starts with an assumed conclusion and works its way backward in an attempt to find facts that support that conclusion. For example, in medical diagnostics, the goal is to identify the symptoms (facts) that support a given state of illness (conclusion). It should be mentioned here that some expert systems employ a combination of backward and forward chaining.
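As an illustration of these two scanning procedures, the following short Python sketch (not part of the Turbo Prolog system described in this thesis; the rules and facts are invented for illustration) applies the same small rule base in both directions.

# Each rule: IF all premises hold THEN the conclusion holds.
RULES = [
    ({"single_factor", "one_nuisance_source"}, "use_randomized_block_design"),
    ({"single_factor", "two_nuisance_sources"}, "use_latin_square_design"),
    ({"several_factors", "screening_stage"}, "use_fractional_factorial"),
]

def forward_chain(facts):
    """Start from the known facts and fire rules until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts):
    """Start from an assumed conclusion and look for facts and rules that support it."""
    if goal in facts:
        return True
    return any(all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)

facts = {"single_factor", "two_nuisance_sources"}
print(forward_chain(facts))                              # derives use_latin_square_design
print(backward_chain("use_latin_square_design", facts))  # True
print(backward_chain("use_fractional_factorial", facts)) # False

The forward pass derives every conclusion supported by the facts, while the backward pass confirms or rejects one assumed conclusion at a time.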

Expert system programs can be written in many languages, including old ones such as Basic and Fortran. However, artificial intelligence programs are best optimized if they are written in languages that provide the appropriate data structures. This means selecting the language that is most suitable for symbolic representation, in which modifying knowledge and making logical inferences are easy to accomplish.

The most popular languages for expert systems are Lisp and Prolog. They utilize logical declarations specifying what the program will do. The program of this expert system is written in Turbo Prolog, which is Borland International's fast, compiler-based implementation of the Prolog programming language. This language is designed to provide answers it deduces from its powerful internal routines. In comparison with other declarative languages, Turbo Prolog proves efficient in the sense that it can use only a few lines to accomplish a task that other languages require several pages of code to accomplish. It should be mentioned here that Turbo Prolog runs on an IBM PC or compatible and requires DOS to start.

CHAPTER TWO

NEEDS ANALYSIS

2.1 STATEMENT OF NEED

Quality improvement relies heavily on applying suitable experimental design. Most quality engineers lack the training in experimental design necessary for designing cost effective off-line quality control experiments, as indicated by Box et al. (1988). In Japan, there is a great emphasis on experimental design methodologies for achieving better quality products. The Japanese are convinced that quality improvement can be achieved without necessarily spending excessive amounts of money. Instead of jumping to tolerance design, which is mainly concerned with the purchase of equipment, to correct a quality problem, they go through parameter design, which is concerned with applying experimental design techniques to determine the causes of deviation from standards and correct them.

Therefore, an expert system that will enable engineers to select the suitable design is needed. This expert system consists of three parts: (1) designing experiments for a single factor (or comparing more than two treatment means), (2) designing experiments with two or more factors, and (3) an explanation facility that will provide a brief explanation of basic statistical and technical terms.

The program is written in Turbo Prolog (Version 1.1) and requires 640 K of memory on an IBM PC or compatible, as well as a DOS diskette to start it up. Production rules are employed to represent the knowledge in this expert system.

2.2 SYSTEM LIMITATIONS

(1) The program is limited to guiding users to a suitable design in linear situations. If tests performed by the user show that nonlinearity exists, the system will advise the user to employ a non-linear model but does not go any further.

(2) The system is also limited to guiding the user to the suitable design; it does not perform numerical calculations.

(3) The system is limited to advising the user on controlled experiments. Therefore, unplanned experiments such as regression are not included in this expert system.

2.3 SYSTEM FEATURES

(1) The system is user friendly because of its clear language and menu-driven structure. The user may also leave the system at any point in the program.

(2) The system employs the approach of sequential experimentation for more effective experimental design.

(3) The system provides an explanation facility in case the user wants to refresh his or her memory about a certain technical term.

(4) The system can be easily modified or expanded due to its modifiable inference engine.

(5) The system can be interfaced with other computer languages such as Pascal or Fortran. This is an important issue for future work if we want to link this expert system with a powerful number manipulating language like Pascal (Version 4.0).

CHAPTER THREE

TECHNICAL AND STATISTICAL TERMINOLOGY

3.1 DEFINITIONS OF STATISTICAL TERMS:

This section is concerned with refreshing the user's memory with regard to basic statistical and other technical terminology used in the expert system. These definitions form the help facility, which is a part of the expert system.

(1) Type I error (α): often called the "level of statistical significance"; it is the probability of rejecting the null hypothesis when it is actually true.

(2) Type II error (β): the probability of accepting the null hypothesis when it is actually false. This is more serious than a Type I error. The magnitude of the expression (1 - β) determines the power of the test.

(3) Independent variables: sometimes referred to as "experimental variables" or "factors". The objective of an experiment is usually to investigate their effects on the dependent variable (response), or possibly to find the settings of these variables at which the response is optimized or the variability of the response is minimized. The independent variables can be either qualitative or quantitative, depending on their nature.

(4) Levels of independent variables: In planned experiments, the experimenter wishes to investigate the effects at specific settings within the range of each of the independent variables. Usually, the experimenter assigns two levels within the range of an independent variable if the relationship between that variable and the response is known to be linear. Two levels of the independent variables are also used when performing screening of the independent variables, which is simply separating the important variables from the unimportant ones. Three levels will provide sufficient data for fitting a quadratic relationship.

(5) Dependent variable (response): the criterion used to measure the experimental outcome. For example, the strength of some material may be characterized as the dependent variable if the experiment is to investigate the effects of temperature and raw materials on it.

(6) Interaction: When the difference in the outcome of the dependent variable (response) between levels of one factor is not the same at all levels of another factor, these two factors are said to interact in determining the response.

(7) Experimental run: When the experimenter sets the independent variables each at a specified level and measures the outcome (yield) by recording the value of the dependent variable, this is called an experimental run.

(8) Randomization: A set of observations is said to be randomized when arranged in a random order. The experimental runs should be randomly assigned to prevent inflating the variance through the contribution of uncontrollable variables, learning effects on the part of the experimenter, and other causes. Employing the randomization process simply averages out these extraneous effects. Randomization is an essential step prior to collecting the data.

(9) Homogeneous conditions: When experimental runs are performed under the same conditions, this situation is called homogeneous conditions. This means that all runs were subjected to the same uncontrollable system variables.

(10) Uncontrollable (noise) variables: These are extraneous variables that can not be controlled by the experimenter. Examples of usually uncontrollable variables are temperature and humidity. Ignoring them may affect the conclusions drawn from the experiment because of their contribution to inflating the variance.

(11) Blocking: This experimental design concept is the key to situations where not all experimental runs can be performed under homogeneous conditions. Blocking the experimental runs eliminates the error contributed by the difference in conditions, so that the experimental runs within one block are performed under homogeneous conditions.

(12) Lack of fit: The model representing the experimental data should be adequate. However, sometimes this adequacy is not achieved because the linearity assumption is violated. The experimenter should run a lack of fit test to determine whether or not the model is adequate before further analyses are carried out.

(13) Transformation: When the model does not adequately represent the data in the original scale or metric, the experimenter may need to transform the data into another scale so that there is a good fit.

(14) Confounding (or "confusing" effects with each other): When the experimenter is using fractional factorial designs, he or she may lack the degrees of freedom to estimate all effects included in the model. In this case, the concept of confounding is very helpful: the unimportant interactions (usually higher order interactions) are sacrificed in order to estimate the important effects, such as the main effects and two-factor interactions. It is helpful in the sense that the experimenter decides which effects are to be confounded. Confounding is usually used in screening experiments, where the objective is to identify the important main effects while ignoring the interaction effects. The choice of the generator in fractional factorials determines which effects are confounded with each other.

(15) Replication: This is simply the repetition of an experimental run, which increases the number of degrees of freedom available, resulting in a better estimate of the error. It is also essential in obtaining more precise estimates of the effects. In reality, replication of experimental runs may not be feasible due to budget or time constraints.

(16) Analysis of Variance (ANOVA): The total variation displayed by a set of observations is measured by the sum of squares of deviations from the mean. It can be broken up into components associated with defined sources of variation (a small numerical sketch of this decomposition is given at the end of this list).

(17) Resolution: The type of fractional factorial, which determines the confounding of the effects. A number, written as a Roman numeral, is associated with the word "Resolution". Different types of resolution will be discussed in Chapter Four.

(18) Generator: A high order interaction term, usually negligible, that is used to generate fractional factorial designs (Box and Hunter (1961)).

(19) Levels of an independent variable: The values within the feasible range of an independent variable at which the experiment is conducted. For example, if the feasible range of the variable temperature is between 100 F and 120 F, the experimenter may assign the value of 100 F to be the low level and the value of 120 F to be the high level. The experimenter should be cautious when assigning levels to independent variables. For example, variables whose relationship with the response contains a curvature should be identified. If the curvature contains an increase then a decrease, or vice versa, then the response may be similar for the high and low levels when they are assigned at the extreme points of the range. In this case, the experimenter is advised to identify the point where the slope is zero and assign the levels so that one level represents that point. Prior knowledge about the nature of the independent variables with regard to their relationship with the response is helpful in detecting such situations.
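To illustrate the sum-of-squares decomposition mentioned in definition (16), the following short Python sketch (hypothetical single-factor data, three treatments with four replicates each) splits the total sum of squares into a treatment component and an error component.

# Hypothetical single-factor experiment: 3 treatments, 4 replicates each.
data = {
    "treatment_1": [20.1, 21.4, 19.8, 20.7],
    "treatment_2": [24.3, 25.0, 23.8, 24.6],
    "treatment_3": [21.9, 22.4, 21.1, 22.8],
}

all_obs = [y for ys in data.values() for y in ys]
grand_mean = sum(all_obs) / len(all_obs)

# Total variation: squared deviations of every observation from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)

# Between-treatment component: squared deviations of treatment means from the grand mean.
ss_treatment = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                   for ys in data.values())

# Within-treatment (error) component: squared deviations of observations from their own treatment mean.
ss_error = sum((y - sum(ys) / len(ys)) ** 2
               for ys in data.values() for y in ys)

print(round(ss_total, 3))
print(round(ss_treatment, 3))
print(round(ss_error, 3))
print(round(ss_treatment + ss_error, 3))   # equals ss_total

The F statistic reported by the analysis of variance is then the treatment mean square divided by the error mean square.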

3.2 DEFINITIONS OF EXPERIMENTAL DESIGNS:

In this section, some important experimental designs will be briefly reviewed. Again, this is not intended to cover all specifications of these designs but only to give a general idea along with references:

(1) Latin Square design: this type of design is utilized if the following conditions are satisfied: (a) there is a single factor under study, (b) there are two extraneous sources of variability, and (c) each of the extraneous variables has a number of levels equal to the number of levels or treatments of the single factor.

(2) Graeco Latin Square design: This is the same as the regular Latin Square except that the extraneous variability comes from three sources instead of two.

(3) Hyper-Graeco Latin Square design: This is the same as the regular Latin Square except that the extraneous variability comes from more than three sources.

(4) Youden Square design: This type of design is the same as the Latin Square but it is "incomplete". This means that the number of levels of one extraneous variable differs from either (a) the number of levels of the other extraneous variable or (b) the number of treatments (levels of the single factor under study). See Smith and Hartley (1948) for more detail.

(5) Factorial design: An experiment that investigates all possible treatment combinations of two or more factors. Each of the factors is investigated at two or more levels. Efficiency and the ability to estimate the interaction effects are the major advantages of factorial experiments over one-factor-at-a-time experiments (Box, Hunter and Hunter (1978)).

(6) Factorial design at two levels (2^k): This type of factorial is achieved when each factor has two levels. The 2^k design assumes a linear relationship between the independent variables on one side and the dependent variable (response) on the other.

(7) Factorial design at three levels (3^k): This type of design is used when there is a significant curvature in the relationship between the dependent and the independent variables.

(8) Fractional factorial design at two levels (2^(k-p)): An important feature of the factorial design is that the experimenter can run a fraction of it and still obtain valid conclusions. The size of the fraction is determined by p in the expression 2^(k-p), where k is the number of factors or independent variables; the design uses a 1/2^p fraction of the full factorial, so the number of runs is always a power of two. p must be a positive integer smaller than k. Fractional factorials are most appreciated when a screening experiment is to be conducted and when budget and/or time constraints do not allow the experimenter to run a full factorial (Box and Hunter (1961)).

(9) Fractional factorials at three levels (3^(k-p)): These are designs that serve the same purpose as fractional factorials at two levels except that there are three levels for each variable instead of two. As indicated earlier, this design is used when nonlinearity due to quadratic effects exists.

(10) Central Composite (Box) design: A type of design that is used when the experimenter needs to fit a second-order model. In particular, this design is mostly used when the experimenter decides to go to a higher-order design and already has the runs from a 2^k design (Box and Draper (1979)).

3.3 DEFINITIONS OF TESTS:

In some situations, the experimenter would like to compare the individual treatments (or levels of the single factor). In this section, a review of methods for comparing individual treatments is presented.

(1) Orthogonal Contrasts method: This method is used to compare individual treatments or levels in single factor experiments. The contrasts are set prior to running the analysis of variance, so that the experimenter does not set contrasts that merely reflect large observed differences in means.

(2) Duncan's multiple range test: This is probably the most popular test for comparing all possible pairs of treatments, developed by Duncan (1955). The test indicates that two means are not significantly different from each other if they fall within two other means that are not significantly different from each other.

(3) Least Significant Difference (LSD) method: This method is used when the F-statistic for all treatments or levels combined is significant at α = 5%.

(4) Dunnett's method: If the experimenter wishes to compare individual treatments with a control treatment, then the method developed by Dunnett (1964) is appropriate. It is suggested that more observations be used for the control treatment so that the results are more reliable. It should be noted that no more than (number of treatments - 1) comparisons can be made using this method.

(5) Scheffe's method: This is an alternative to the orthogonal contrasts method when the experimenter has no prior knowledge of which contrasts to compare. Scheffe (1953) developed this powerful test to compare all possible contrasts without inflating the Type I error.

CHAPTER FOUR

EXPERIMENTAL DESIGN

4.1 INTRODUCTION

The bottom line for an effective experimental design is taking all related aspects into consideration as early as possible when planning an experiment. Failure to do so may increase the risk of drawing invalid conclusions. For example, suppose there is a source of variability which the experimenter is not able to control by physical means. If this source of variability is not accounted for by the experimental design, then the experimental error will be inflated by it. This inflation of the experimental error will increase the risk of drawing an invalid conclusion. In addition, if the experimenter ignores a basic principle such as randomization, the omission could be costly if not impossible to correct at later stages. The focus of this chapter will be on the basic principles of designing experiments.

4.2 BASIC EXPERIMENTAL DESIGN PRINCIPLES

In this section, the emphasis will be placed on the key principles that the experimenter must take into consideration when designing an experiment. The major principles that will be discussed are: (1) blocking, (2) randomization, and (3) replication.

(1) Blocking: In many situations, not all experimental runs can be performed under homogeneous conditions, due to extraneous variability that can not be controlled by the experimenter. As a result, the error term is likely to be inflated, leading to the possibility of drawing the wrong conclusions. If this occurs, the experiment should be divided into subexperiments (blocks) such that the runs within each subexperiment are performed under homogeneous conditions. This is done to eliminate the variability contributed by the uncontrollable factors. Each block contains a portion of the experimental runs that can be run under similar conditions.

The experimenter may not be able to detect the existence of extraneous variability at the beginning because of his or her lack of knowledge about the experiment. In this case, the experimenter should check the consistency of the observed response. Lack of consistency indicates the possibility that the experiment is being affected by uncontrollable variability. When such variability has been detected, the experimental runs can be performed in blocks as indicated earlier. The use of blocks ranges from simple comparative experiments to somewhat complicated multifactor ones.

In a simple comparative experiment, the experimenter wishes to compare two methods; therefore each pair of experimental materials is blocked from the others. To illustrate this, consider comparing two types of shoe material (Box, Hunter, and Hunter (1978)). The experimenter weighs the shoes before and after they are worn by subjects. The weight lost through wear is recorded, so that less loss in material means a better material. The subjects, however, differ in their body weights and in the length of time that they wear their shoes. Therefore, the experimenter would like to eliminate the variability contributed by the subjects, which can not be controlled (it is probably difficult to find subjects with the same body weight who are also similar with respect to how long they keep their shoes on). The experimenter can eliminate this variability by blocking, that is, by eliminating the differences between body weights and the lengths of time that the subjects wear their shoes. In such a situation, blocking is the best tool for eliminating the extraneous variability.

In single factor experiments, blocking can also

be used if uncontrollable variability exists. If one source of variability exists, the experimenter may run a randomized complete block design. If two uncontrollable sources of variability exist, then he or she may use a Latin Square design, provided the requirements for using it, as previously mentioned, are satisfied.

The use of blocks in factorial designs is a little more complicated. This is primarily due to the lack of enough degrees of freedom if the experiment is not sufficiently replicated. This problem often occurs in fractional factorials, where main effects can be mistakenly confounded with blocks. So the experimenter must exercise caution when he or she uses blocks in fractional factorials. In particular, blocks must not be confounded with main effects because, if they are, it will be difficult to figure out whether the outcome represents the main effect or the block effect.

(2) Randomization: Employing a proper randomization procedure is a crucial step in designing an experiment. Randomization is used in experimental design to average out the effects of variables that can not be controlled by the experimenter. Such averaging does not remove their effects completely, but it assures that the experimental errors are randomly distributed. Montgomery (1976) illustrates this with the following example. A metallurgical engineer is studying the effects of two different hardening processes, oil quenching and saltwater quenching, on an aluminum alloy. The objective of this experiment is to determine the quenching solution that produces the maximum hardness for this alloy. The specimens used for testing differ in their thickness. The engineer would continually handicap one quenching medium over the other if the specimens subjected to one medium were thicker than those subjected to the other medium. This problem can be easily solved by assigning the specimens randomly to the quenching solutions. (A small sketch of randomizing the run order appears at the end of this section.)

(3) Replication: This concept means repetition of the basic experiment. It usually serves two purposes: (a) to obtain more precision in the estimated effects and (b) to estimate the error term for testing significance. In some situations, where the estimates of the effects are very important, it is necessary to replicate the experimental runs to achieve precision. For example, more precision is needed when developing statistical models for forecasting purposes, because the objective is to precisely predict events in the future; failure to meet this objective will probably be more costly than performing more experimental runs. Replication is also necessary to estimate the error term. More reliable conclusions can be drawn when there are enough replicates, because replicates increase the degrees of freedom, which, in turn, provide a better estimate of the error. In short, it is always recommended to replicate the experimental runs as many times as possible. Performing replicate runs does not come at an early stage of the design, due to the sequential nature of most experiments. In other words, in many situations the important variables, their ranges, and the levels to be assigned are yet to be explored. Therefore, the experimenter should replicate the experiment once the ambiguity about the experiment has been eliminated. As a result, the replicates will contribute to the precision of the estimates of effects and to the estimation of the error term after the screening experiment is over.
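The following short Python sketch (hypothetical factor names and block labels, not taken from this thesis) illustrates the randomization principle from item (2) together with the blocking principle from item (1): the treatment combinations of a small two-factor experiment are replicated once per block, and the run order within each block is randomized before any data are collected.

import itertools
import random

random.seed(42)  # fixed seed only so the example is reproducible

# Hypothetical experiment: two factors at two levels each, replicated twice,
# run in two blocks (e.g., two days) of one full replicate each.
factors = {"temperature": ["low", "high"], "catalyst": ["A", "B"]}
treatments = list(itertools.product(*factors.values()))  # 4 treatment combinations

for block, day in enumerate(["day_1", "day_2"], start=1):
    runs = treatments[:]          # one full replicate per block
    random.shuffle(runs)          # randomize the run order within the block
    print("Block", block, "(", day, "):")
    for order, combo in enumerate(runs, start=1):
        settings = dict(zip(factors.keys(), combo))
        print("  run", order, ":", settings)

Randomizing within each block averages out uncontrolled effects such as day-to-day differences, while the blocking itself keeps those differences from inflating the experimental error.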

4.3 SCREENING EXPERIMENTS

In many situations, little or no prior knowledge about the situation to be investigated is available. In such situations, the experimenter would like to be able to determine which of a large number of possible factors are most important. In screening experiments, there are likely to be a large number of independent variables, and it is expected that not all of them are important. The experimenter's first goal is to separate the important ones from the rest without utilizing a large amount of the budget to achieve this. Therefore, a quick and economical method to screen these variables is needed, so that the experimenter can concentrate only on those factors that are important throughout the rest of the study. Had the experimenter not screened the variables, he or she would be spending time and money investigating variables that should not be there in the first place.

Fractional factorials at two levels (2^(k-p)) can be utilized for screening purposes, as indicated by Box and Hunter (1961). These designs can economically and efficiently set the guidelines for the experimenter to determine which factors (and possibly interactions) to proceed with in the analysis of the experiment.

A screening experiment is conducted at an early stage of the design. The information obtained from that stage is used as the input for the next stage, and so on, in an iterative approach. However, some experts do not seem to appreciate this approach despite its clear efficiency. Genichi Taguchi, who is considered a leading expert in this field, has not mentioned the concept of iterative experimentation. In reality, however, some variables that were considered at an early stage of the design may be dropped if found unimportant, and other variables may be included, as indicated by Box (1985). Also, the ranges of the variables over which the experiment is carried out may change.

Screening of experimental variables should be economical. Box, Hunter, and Hunter (1978) have suggested that usually not more than one quarter of the budget should be invested for screening purposes. After the "screening" stage, the experimenter will have enough information with regard to which variables are important.

As indicated earlier, fractional factorial designs at two levels (2^(k-p)) are cost effective in exploring information about the experimental variables at an early stage of the design. For example, if an experiment involves five variables and it is unknown which ones are important, the experimenter may use a one-quarter fractional factorial at two levels (2^(5-2)), which involves 8 runs, instead of the full factorial at two levels (2^5), which involves 32 runs. If the number of experimental variables is sufficiently large, a smaller fraction can be run. The experimenter must be careful in deciding what type (or, as it is often called, resolution) of fractional factorial to use. The following is a brief review of the major alternatives:

(a) Resolution III: This type is selected when the experimenter wishes to estimate only the main effects. It guarantees that the main effects can be estimated without being confounded with one another. However, this resolution may have its main effects confounded with two-factor interactions. Therefore, if the experimenter wishes to estimate two-factor interactions as well as main effects, then this resolution should not be used. If it is used anyway, the experimenter can not tell whether an estimated effect reflects a main effect or a two-factor interaction, because they are confounded with each other. For example, if AB refers to the interaction between factor A and factor B, then the two-factor interaction AB may be confounded with factor C. This means that it will not be possible to determine whether a significant effect is due to the main effect C, to the two-factor interaction AB, or to both.

(b) Resolution IV: This type is used when the experimenter is interested in estimating the two-factor interactions separately from the main effects. This is simply a result of not having main effects confounded with each other or with two-factor interactions. However, the experimenter's decision on which of the two-factor interactions are significant must be made with caution, because two-factor interactions are confounded with each other. It should be mentioned here that resolution IV fractional factorials can be constructed by combining a resolution III design with its "sign-reversed" partner. In other words, the experimenter only has to reverse the signs (+,-) in the resolution III design and combine it with the original. This is often referred to as a fold-over design. Again, utilizing fold-over designs was also missed by Taguchi, as mentioned by Box (1985), despite its ease of application as well as its usefulness. For example, suppose that it is desired to separate the two-factor interactions from the main effects. This can be done by folding over a resolution III design without having to set up a new resolution IV design from scratch.

(c) Resolution V: This type of design is useful when the experimenter wishes to estimate the two-factor interactions without them being confounded with each other. This also includes estimating main effects separately from each other and from two-factor interactions. This design is usually used when the experimenter knows that two-factor interactions are important.

The above mentioned resolutions are the ones commonly used when fractional factorial designs at two levels (2^(k-p)) are considered. Resolution III is the one to use when screening a large number of variables, since the interest is primarily in dropping those variables that are not important. If the experimenter knows that some interactions are important, then resolution IV can be used.

All fractional factorial designs require "generator(s)" for their construction. The number of generators required depends on the size of the fraction being constructed. For example, if the experimenter wishes to construct a one-half fraction, then one generator is required; two generators are required for a one-quarter fraction, and so on. In other words, the number of generators required is equal to the parameter p in the 2^(k-p) expression. However, the experimenter may find it easier to refer to a textbook such as Montgomery (1986) for the generators of a fractional factorial of a given resolution. (A small sketch of constructing a fractional factorial from its generators follows.)
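As an illustration of how generators define a fractional factorial, the following short Python sketch constructs the eight-run 2^(5-2) design mentioned earlier in this section by writing a full 2^3 factorial in factors A, B, and C and then generating D = AB and E = AC. The factor labels and the particular generators are illustrative choices only; standard generator tables are given by Box and Hunter (1961) and Montgomery (1986).

import itertools

# Full 2^3 factorial in the "basic" factors A, B, C, coded as -1 / +1.
basic_factors = ["A", "B", "C"]
base_design = list(itertools.product([-1, 1], repeat=len(basic_factors)))

# Generators for the extra factors (illustrative choice): D = A*B, E = A*C.
# Each generated column is the product of the basic columns that define it.
generators = {"D": ("A", "B"), "E": ("A", "C")}

runs = []
for levels in base_design:
    run = dict(zip(basic_factors, levels))
    for new_factor, parents in generators.items():
        product = 1
        for parent in parents:
            product *= run[parent]
        run[new_factor] = product
    runs.append(run)

# Eight runs instead of the 32 required by the full 2^5 factorial.
for i, run in enumerate(runs, start=1):
    print(i, run)

Because D is set equal to AB, the main effect of D is aliased (confounded) with the AB interaction; this is exactly the price paid for running a quarter fraction, which is why such a design is suited to the screening purposes discussed above.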

4.4 MODEL ADEQUACY

In some situations, the data collected on the original scale fail to satisfy the assumptions upon which statistical tests are based. As a result, the conclusions drawn from experiments in such situations are of doubtful validity. In this section, the different statistical assumptions and the consequences of violating them will be discussed. A review of existing tests for detecting these violations will then be provided, as well as remedies to correct them.

The analysis of variance (ANOVA) model is used to analyze the results of an experiment. The ANOVA model is based on several statistical assumptions. Violating one or more assumptions, such as (1) linearity, (2) additivity, (3) normality, (4) homogeneity of variance and (5) independence, may occur, with serious consequences. Therefore, the possibility that one or more of the above assumptions will be violated must not be ignored. In particular, statistical tests should be employed to determine whether or not they have been violated. Certain types of transformation can be recommended for remedying the violation of these assumptions.

However, Taguchi's experimental design methodology does not refer to transformation as a remedy for the violation of one or more of the above mentioned assumptions. In his discussion of Kackar's paper regarding Taguchi's methodology, Box (1985) criticizes, among other issues, the absence of the concept of transformation from Taguchi's quality improvement philosophy.

Throughout his books and articles, Box acknowledges the occasional need for transformation, as indicated in Box and Cox (1964). This probably motivated him to develop techniques for the best selection of a transformation function. When data are properly transformed, the degree to which the assumptions of the model are satisfied improves. Usually, transforming the dependent variable (response) can improve the adequacy of the model in planned experiments. However, Box extends the concept of transformation to unplanned experiments, where transforming the independent variables can greatly simplify the form of the model with which the experimenter is working. As indicated by Box and Tidwell (1962), transforming the independent variables and/or the response in regression, and especially in response surface work, can greatly ease handling the experimental data.

When attempting to find an appropriate transformation function, it is not always necessary to determine which assumption has been violated. In particular, the method developed by Box and Cox (1964), based on a maximum likelihood estimator, enables the experimenter to select the appropriate transformation to satisfy the additivity, homogeneity of variance, and normality assumptions all together if necessary. In other words, the functional form used for this transformation accounts for all three of the above assumptions.
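As a rough illustration, the following Python sketch (hypothetical right-skewed response values) applies the Box-Cox maximum likelihood procedure as implemented in scipy.stats; the lambda value it returns plays the role of the transformation parameter discussed by Box and Cox (1964).

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical, strongly right-skewed response values.
# Box-Cox requires strictly positive data.
response = rng.lognormal(mean=2.0, sigma=0.8, size=40)

# Box-Cox chooses lambda by maximum likelihood:
#   y(lambda) = (y**lambda - 1) / lambda  for lambda != 0,  log(y) for lambda = 0.
transformed, lam = stats.boxcox(response)

print(round(lam, 3))                          # for lognormal data, near 0 (log transform)
print(round(stats.skew(response), 3))         # skewness before transformation
print(round(stats.skew(transformed), 3))      # skewness after transformation

A lambda near zero corresponds to the logarithmic transformation, a lambda of one half to the square root, and a lambda of one to no transformation at all.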

It is important, as indicated by Box (1985), to confirm that the assumptions mentioned earlier are satisfied before proceeding further with the investigation. Failure to do so may seriously affect the validity of conclusions drawn from the experiment. As mentioned earlier, the assumptions to be confirmed are:

(1) Linearity: This assumption indicates that the dependent variable, or response, can be expressed as a linear function of the independent variables. The assumption is said to be satisfied when no significant curvature exists in the relationship between the response and the independent variables. If a significant curvature exists, then it will be necessary to include higher order terms in the model. Failure to do so may dramatically inflate the error sum of squares. When this occurs, the power of the test is reduced.

(2) Additivity: This assumption states that the model, which expresses the dependent variable as a function of the independent variables, must not contain any cross product (interaction) terms. This will not be true in situations in which nonadditivity, or interactions, may actually be part of the model which generated the data. Therefore, one must not assume additivity when there is not enough knowledge about the phenomenon being studied to support such an assumption. If the experimenter falsely assumes additivity, then the sum of squares due to nonadditivity will be incorrectly included in the error sum of squares. This inflation of the error term will reduce the power of the test.

(3) Normality: When the differences between the actual observations and the predicted values (i.e., the residuals) are plotted on a dot diagram, they should approximate a normal distribution with mean centered at zero. Such a distribution is symmetrical and has its mode equal to the mean. If there is a significant departure from normality, then the significance level will be seriously distorted. This will, in turn, affect the validity of the conclusions, which are based on the level of significance of the test.

(4) Homogeneity of variance: This assumption states that the error variance should be nearly constant over the different combinations of the independent variables. In other words, the plot of residuals against predicted values must not follow a particular pattern and should be structureless. Sometimes, however, the experimenter encounters the problem of heterogeneity of variance; if this occurs, the assumption of homogeneous variance is said to be violated. This is a type of problem that is better prevented than corrected, because of its serious consequences for the validity of the conclusions drawn in such situations. For example, a proper randomization process can greatly help in preventing this assumption from being violated. Randomization will simply average out the effects contributed by uncontrollable variables that could systematically build up if no randomization were applied. When this assumption is violated, the consequences are similar to those of violating the normality assumption: the level of significance is distorted, reducing the power of the test.

(5) Independence: This assumption states that the residuals are independent of each other. In other words, there must not be a functional relationship between the residuals. If this assumption is violated, the variance of the residuals may be biased. A detailed discussion of the consequences of violating this assumption, and of methods for dealing with such violations, is provided in a paper by Glen and Terry (1984).

It is recommended to refer to Eisenhart (1947) for a complete discussion of the assumptions underlying the analysis of variance. Cochran (1947) also provides a discussion of the consequences of violating these assumptions.

It is essential to identify the various tests that

can be used to verify the validity of these assumptions.

When more than one test is available to verify an
assumption, only the most common ones will be identified.

The linearity assumption can be verified using the

lack of fit test in Draper and Herzberg (1971). It should

be noted that performing this test requires at least one

replicate to estimate the pure error term. However, it is

always advisable to replicate as many times as possible when conducting this test.
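As a hedged illustration only (not part of the thesis software, which performs no calculations), the following Python sketch shows how such a lack of fit F-test might be computed with numpy and scipy; the replicated observations and factor levels are hypothetical.

import numpy as np
from scipy import stats

# hypothetical replicated observations at four settings of a single factor
groups = [np.array([5.1, 4.9]),
          np.array([7.2, 7.0]),
          np.array([9.8, 10.4]),
          np.array([12.1, 11.7])]
x = np.array([1.0, 2.0, 3.0, 4.0])        # factor level at each design point

# fit the first order (straight line) model to all observations
X = np.repeat(x, [len(g) for g in groups])
Y = np.concatenate(groups)
b1, b0 = np.polyfit(X, Y, 1)
fitted = b0 + b1 * x                      # fitted value at each design point

# pure error: scatter of the replicates about their own group means
ss_pe = sum(((g - g.mean()) ** 2).sum() for g in groups)
df_pe = sum(len(g) - 1 for g in groups)

# lack of fit: scatter of the group means about the fitted line
ss_lof = sum(len(g) * (g.mean() - f) ** 2 for g, f in zip(groups, fitted))
df_lof = len(groups) - 2                  # design points minus model parameters

F = (ss_lof / df_lof) / (ss_pe / df_pe)
p_value = stats.f.sf(F, df_lof, df_pe)
print(F, p_value)                         # a small p-value signals curvature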

The assumption of additivity can be verified using "

Tukey's one degree of freedom test for additivity"

developed by Tukey (1949).

Homogeneity of variance can be verified using the

test developed by Bartlett (1937). However, since this test is

sensitive to the assumption of normality, another test

developed by Bartlett and Kendall (1946) should be used when the assumption of normality is questionable.
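As a minimal sketch, assuming hypothetical treatment groups and the tests available in scipy (the thesis system itself performs no such calculations):

from scipy import stats

group_a = [12.1, 11.8, 12.5, 12.0]
group_b = [14.3, 13.9, 14.8, 14.1]
group_c = [9.7, 10.2, 9.9, 10.4]

# Bartlett's (1937) test for homogeneity of variance
stat, p = stats.bartlett(group_a, group_b, group_c)
print(stat, p)            # a small p-value indicates heterogeneous variances

# Bartlett's test is sensitive to nonnormality; scipy does not provide the
# Bartlett and Kendall (1946) procedure, but Levene's test is a commonly
# used robust alternative when normality is questionable.
stat, p = stats.levene(group_a, group_b, group_c)
print(stat, p)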

The assumption of normality can be tested using the

Chi-square goodness of fit test. The experimenter may also use the Kolmogorov-Smirnov test, which some authors argue is more powerful. Shapiro, Wilk and Chen (1968) also developed an alternative test which is attractive because it does not require the mean and the variance to be specified as part of the hypothesis.
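For illustration, and assuming the residuals are available as a numeric array, two of the checks named above might be run in Python with scipy as follows; the residual values are hypothetical, and scipy's shapiro function implements the closely related Shapiro-Wilk W test rather than the exact procedure of Shapiro, Wilk and Chen (1968).

import numpy as np
from scipy import stats

residuals = np.array([0.3, -0.5, 0.1, 0.7, -0.2, -0.4, 0.6, -0.6, 0.0, 0.2])

w, p_sw = stats.shapiro(residuals)        # Shapiro-Wilk W test

# Kolmogorov-Smirnov test against a standard normal after standardizing;
# note that estimating the mean and variance from the data makes the
# standard KS p-value only approximate.
z = (residuals - residuals.mean()) / residuals.std(ddof=1)
d, p_ks = stats.kstest(z, 'norm')

print(p_sw, p_ks)     # small p-values indicate departure from normality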

The following table summarizes the tests and references for checking the statistical assumptions mentioned above.

Table 1: Summary of tests for validating assumptions*

    Assumption                 Reference / Guidelines
    -----------------------    ----------------------------------------------
    Linearity                  Lack of fit test in Draper and Herzberg (1971)
    Additivity                 Tukey's one degree of freedom test for
                               additivity, Tukey (1949); also Tukey and
                               Moore (1954)
    Homogeneity of variance    Bartlett's test, Bartlett (1937); also
                               Bartlett and Kendall (1946)
    Normality                  Chi-square goodness of fit test,
                               Kolmogorov-Smirnov test, or Shapiro, Wilk
                               and Chen (1968)

    * Refer to Glen and Terry (1984) for testing the assumption of
      independence.

When one of the tests mentioned above shows that an

assumption has been violated, it is essential to correct

this violation by applying the appropriate type of

transformation. There are a variety of transformations

that can be used when the nonnormality assumption is

violated. The most appropriate transformation depends on

the nature of the nonnormality. The method developed by

Box and Cox (1964) allows the experimenter to select the

right transformation regardless of whether or not the

experimenter has knowledge about the nature of the

nonnormality. The same is true for correcting the

nonhomogeneity of the variance since these two assumptions

can usually be remedied together.
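A minimal sketch of selecting such a transformation by maximum likelihood in Python with scipy, assuming hypothetical (strictly positive) response data:

import numpy as np
from scipy import stats

y = np.array([1.2, 3.4, 2.2, 8.9, 4.1, 15.7, 6.3, 23.0])   # positive responses

y_transformed, lam = stats.boxcox(y)      # lam is the estimated lambda
print(lam)
# lam near 0 suggests a log transformation, lam near 0.5 a square root,
# and lam near 1 suggests that no transformation is needed.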

The additivity assumption will be violated when the

true effects are multiplicative or proportional. This

violation can be corrected by applying the logarithmic

transformation which is a part of Box and Cox method of

Maximum Likelihood transformation.

A violation of the linearity assumption can be
corrected by applying the Box and Cox (1964) transformation. In unplanned
experiments, such as is the case in regression, the

experimenter may have to transform some of the independent

variables. Refer to Box and Tidwell (1962) for a complete

discussion on transforming the independent variables.

CHAPTER FIVE

KNOWLEDGE REPRESENTATION

5.1 INTRODUCTION

In this chapter, the focus will be on developing the

production rules needed for this expert system. The

listing of the production rules in this chapter, however,

is not the same as the program code.

This expert system consists of production rules that

are coded in Turbo Prolog as segments. Each segment

consists of a condition (the 'IF' part) or a conclusion
(the 'THEN' part) about some type of experiment. The conditions of a rule are connected in series through predicate

names. An array is used to keep track of conditions and

conclusion of the same rule. This makes it simple to modify the program when needed. For example, the user may have to answer a question by choosing from 'y', 'n', or 'c'

(for yes, no, and continue, respectively). If the name of

the predicate is sub1, then sub1('y') will lead to another

condition (question) or a conclusion (answer) which is to

follow that particular answer.

Although this type of coding (using arrays) can
consume a lot of memory, it is the easiest way to check the validity of the system. In other words, if the number of conditions for one rule is large and arrays are not used, then each time one of these conditions is checked, all preceding conditions of that rule must be checked as well. With arrays, only the condition at hand needs to be considered.
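To make the chaining idea concrete, the fragment below is a hedged Python sketch rather than the author's Turbo Prolog code: each named condition maps the user's answer either to the next condition or to a conclusion, loosely following Rules 1 and 2 of the next section. The names and the simplified logic are invented for illustration.

# A condition is a question; a conclusion is plain advice text.
QUESTIONS = {
    "single_factor": "Is there only one factor to investigate (y/n)? ",
    "extraneous":    "Is there an extraneous source of variability (y/n)? ",
}

# (condition, answer) -> next condition name or a final conclusion
RULES = {
    ("single_factor", "y"): "extraneous",
    ("single_factor", "n"): "Consult the factorial experiment rules.",
    ("extraneous", "n"):    "Run a Complete Randomized Design.",
    ("extraneous", "y"):    "Run a Complete Randomized Block Design.",
}

def consult(start="single_factor"):
    node = start
    while node in QUESTIONS:                   # still a condition: ask the user
        answer = input(QUESTIONS[node]).strip().lower()
        node = RULES.get((node, answer), node) # unknown answer: ask again
    return node                                # a conclusion (recommended design)

if __name__ == "__main__":
    print(consult())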

5.2 EXPERIMENTAL DESIGN RULES

In this section, lists of the production rules for all types of experiments covered in this expert system will be provided. Conclusions or 'THEN' parts for some rules can be used as conditions or 'IF' parts for other rules.

The rules for this expert system are divided into two categories: single factor experiments and factorial experiments. They were developed after the knowledge acquisition phase had been completed and verified. It should be mentioned here that these rules do not contain the help facility options, since the definitions of the various technical terms and designs were included in chapter three.

5.2.1 RULES FOR SINGLE FACTOR EXPERIMENTS

------RULE 1 IF 1. The experiment is a single factor one,

and 2. There is no extraneous source of variability,

THEN * Run a Complete Randomized Design.

------RULE 2

IF 1. There is one source of extraneous variability

and 2. Data is complete (no missing values),

THEN * Run a Complete Randomized Block Design.

------RULE 3

IF 1. There are two sources of extraneous

variability,

and 2. The number of levels for each source is equal

to the number of treatments under study,

and 3. Data is complete (no missing values),

THEN * Run a Latin Square Design.

------RULE 4 IF 1. There are three sources of extraneous

variability,

and 2. The number of levels for each source is equal

to the number of treatments under study, and 3. Data is complete (no missing values),

THEN * Run a Graeco Latin Square Design.

------RULE 5 IF 1. There are more than three sources of

extraneous variability,

and 2. The number of levels for each source is equal

to the number of treatments under study,

and 3. Data is complete (no missing values),

THEN * Run a Hyper-Graeco Latin Square Design.

------RULE 6 IF 1. A Complete Randomized Design was used,

and 2. Comparing individual treatments is desired,

and 3. It is known what treatments to compare and

contrasts are set,

THEN * Use Orthogonal Contrasts Method.

------RULE 7 IF 1. A Complete Randomized Design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

desired,

and 5. The F-test in the ANOVA for all treatments

combined is significant at 5%,
THEN * Use the Least Significant Difference (LSD) method.

------RULE 8

IF 1. A complete Randomized Design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

desired,

and 5. The F-test in the ANOVA for all treatments

combined is not necessarily significant,

THEN * Use Duncan's multiple range test.

------RULE 9

IF 1. A Complete Randomized design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

not necessary,

and 5. Comparing treatments with a control treatment

is not desired,

THEN * Use Scheffe's method.

------RULE 10
IF 1. A Complete Randomized design is used,

and 2. Comparing individual treatments is desired,

and 3. Contrasts are not set prior to analysis,

and 4. Comparing all possible pairs of treatments is

not necessary,

and 5. Comparing treatments with a control treatment

is desired,
THEN * Use Dunnett's method.

------RULE 11
IF 1. There are two sources of extraneous

variability,

and 2. The number of levels for each source is equal

to the number of treatments under study,

and 3. Data is not complete, THEN * Use Latin Square Design taking into consideration that every pair of treatments

must appear together the same number of

times.

------RULE 12 IF 1. There are two sources of extraneous

variability,

and 2. The number of levels for at least one source

is not equal to the number of treatments

under study,

THEN * Use Youden Square Design but be sure that data is balanced.

5.2.2 RULES FOR FACTORIAL EXPERIMENTS

------RULE 13 IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is not the same for each of

the independent variables,

THEN * Run a full factorial design. If you are considering all possible interactions, then

you will need at least one replicate to

estimate the error term. Otherwise, use the

unimportant interactions for that purpose.
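As a hedged illustration of what running a full factorial design involves, the sketch below enumerates and randomizes the treatment combinations for three factors with unequal numbers of levels; the factor names and level values are hypothetical and are not prescribed by the rule.

import itertools
import random

levels = {
    "hardwood_pct": [5, 10, 15],       # three levels
    "pressure":     [400, 500, 650],   # three levels
    "cook_time":    [3, 4],            # two levels
}

runs = list(itertools.product(*levels.values()))   # 3 x 3 x 2 = 18 treatments
random.shuffle(runs)                               # randomize the run order

for i, run in enumerate(runs, 1):
    print(i, dict(zip(levels.keys(), run)))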

------RULE 14

IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is equal to two for each of

the independent variables,

THEN * Run a factorial experiment at two levels

(2-k).

------RULE 15
IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is equal to three for each

of the independent variables,

THEN * Run a factorial experiment at three levels (3-k).

------RULE 16 IF 1. All independent variables are known to be

important,

and 2. Number of levels is predetermined for each

independent variable,

and 3. Number of levels is greater than three for

each of the independent variables,

THEN * Run an N^k factorial experiment where N represents the number of levels.

------RULE 17

IF

1. Not all independent variables are known to be

important,

and 2. The number of independent variables is large

THEN * Divide each independent variable into high and low levels. Run a highly fractionated

factorial to identify the important main

effects. This is a screening experiment that

is used to separate the important variables

from the unimportant ones. The rest of the

experiment will be carried out with the

identified important variables.
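A minimal sketch of the kind of highly fractionated design Rule 17 refers to, assuming a 2^(7-4) fraction in eight runs built from three base factors with the common generators D = AB, E = AC, F = BC, and G = ABC (the generator choice is illustrative, not something the expert system prescribes):

import itertools
import numpy as np

base = np.array(list(itertools.product([-1, 1], repeat=3)))   # 8 runs of A, B, C
A, B, C = base[:, 0], base[:, 1], base[:, 2]

# generated columns give seven factors in only eight runs (resolution III)
design = np.column_stack([A, B, C, A*B, A*C, B*C, A*B*C])
print(design)      # each row is one run; columns are factors A..G at coded levels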

------RULE 18 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions, THEN * Set up (2-k) factorial design at two levels. Randomize runs, perform experiment, and

collect data. Fit the first order model and

check the adequacy of the model. Proceed to

RULE 34. ------RULE 19 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is no prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels. Randomize runs, perform experiment, and

collect data. Fit the first order model that

includes the two-factor interaction terms and

check the adequacy of the model. Proceed to

RULE 35.
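For illustration, fitting a first order model that includes a two-factor interaction term is an ordinary least squares problem; the sketch below uses hypothetical data from a replicated 2^2 design and is not part of the expert system, which performs no calculations.

import numpy as np

# coded factor levels for a twice-replicated 2^2 design (hypothetical data)
x1 = np.array([-1,  1, -1,  1, -1,  1, -1,  1])
x2 = np.array([-1, -1,  1,  1, -1, -1,  1,  1])
y  = np.array([20.1, 27.9, 23.2, 35.0, 19.7, 28.4, 22.8, 34.6])

# model columns: intercept, x1, x2, x1*x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)                 # b0, b1, b2, b12
residuals = y - X @ beta    # examine these against the assumptions of Chapter 4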

------RULE 20 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first

order model and check the adequacy of the model.
Proceed to RULE 34.

------RULE 21
IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first
order model that includes the two-factor interaction
terms and check the adequacy of the model. Proceed to RULE 35.

------RULE 22
IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is not possible to perform a full

factorial experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2"k-p) fractional factorial design at

two levels. Randomize runs, perform

experiment, and collect data. Fit the first

order model and check the adequacy of the

model. Proceed to RULE 34.

------RULE 23 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is no prior knowledge indicating that

two-factor interactions are not important,

and 4. It is not possible to perform a full

factorial experiment at two levels (2^k),

and 5. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at

two levels. Randomize runs, perform

experiment, and collect data. Fit the first

order model that includes all two-factor

interaction terms and check the adequacy of

the model. Proceed to RULE 35.

------RULE 24 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is prior knowledge indicating that

two-factor interactions are not important,

and 4. It is not possible to perform a full

factorial experiment at two level (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first

order model and check the adequacy of

the model. Be sure that blocks are not

confounded with main effects and main effects

are not confounded with each other. Proceed

to RULE 34.

------RULE 25 IF 1. All independent variables are known to be

important,

and 2. Number of levels for each of the independent

variables is not predetermined,

and 3. There is no prior knowledge indicating that

two-factor interactions are not important,

and 4. It is possible to perform a full factorial

experiment at two levels (2^k),

and 5. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data. Fit the first

order model that includes two-factor

interaction terms and check the adequacy of

the model. Be sure that blocks are not

confounded with main effects and main effects

are not confounded with each other or with

two-factor interactions. Proceed to RULE 35.

------RULE 26

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is possible to perform a full (2^k)

factorial at two levels,

and 4. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels. Randomize runs, perform experiment, and

collect data. Fit the first order model and

check the adequacy of the model. Proceed to

RULE 34.

------RULE 27

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is possible to perform a full (2-k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks. Randomize runs, perform

experiment, and collect data but be sure

that blocks are not confounded with main

effects. Fit the first order model and check

the adequacy of the model. Proceed to RULE

34.

------RULE 28

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that

the two-factor interactions are important,

and 3. It is possible to perform a full (2^k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels in blocks but be sure that blocks are not

confounded with main effects. Randomize

runs, perform experiment, and collect data.

Fit the first order model that includes the

two-factor interaction terms and check the

adequacy of the model. Proceed to RULE 35.

------RULE 29 IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that

the two-factor interactions are important,

and 3. It is possible to perform a full (2-k)

factorial at two levels,

and 4. All experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k) factorial design at two levels.

Randomize runs, perform experiment, and

collect data. Fit the first order model that

includes the two-factor interaction terms

and check the adequacy of the model. Proceed

to RULE 35.

------RULE 30

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. All experimental runs can be performed under

homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels . Randomize runs, perform experiment, and collect data. Fit the first

order model without two-factor interactions

and check the adequacy of the model.

proceed to RULE 34.

------RULE 31 IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the

two-factor interactions are not important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions, THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs,

perform experiment, and collect data. Fit

the first order model and check the

adequacy of the model. Be sure that blocks

are not confounded with main effects and

main effects are not confounded with each

other. Proceed to RULE 34.

------RULE 32

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that the two-factor interactions are important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. Not all experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels in blocks. Randomize runs,

perform experiment, and collect data. Fit

the first order model that includes two-

factor interaction terms and check the

adequacy of the model. Be sure that blocks

are not confounded with main effects and

main effects are not confounded with each

other or with two-factor interactions.

Proceed to RULE 35.

------RULE 33

IF 1. A screening experiment to identify the

important variables has been conducted,

and 2. There is prior knowledge indicating that

the two-factor interactions are important,

and 3. It is not possible to perform a full (2-k)

factorial at two levels,

and 4. All experimental runs can be performed

under homogeneous conditions,

THEN * Set up (2-k-p) fractional factorial design at two levels. Randomize runs, perform

experiment, and collect data. Fit the first

order model that includes two-factor

interaction terms and check the adequacy of

the model. Proceed to RULE 35.

------RULE 34 IF 1. There is not an adequate fit of the model, THEN * The assumption of no interactions in the model is possibly wrong. Include the most

likely two-factor interaction terms and fit

the new model. Proceed to RULE 36.

------RULE 35 IF 1. There is not an adequate fit of the model, THEN * The assumption of no higher than two-factor interactions is possibly wrong. Include most

likely three-factor interaction terms and

fit a new model. Proceed to RULE 36.

------RULE 36 IF 1. There is not an adequate fit of the new

model, THEN * The data may need to be transformed into

another scale. Use Box & Cox (1964) 'Maximum

Likelihood method' to identify the best

transformation. Fit the model with the

transformed data. Proceed to RULE 37.

------RULE 37

IF 1. The fit of the model with the

transformed data is not adequate, THEN * At this stage, it is possible that a higher

order model is needed. Start by expanding

the first order into a central composite

design (CCD).
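As a hedged sketch of the expansion mentioned in Rule 37, the code below augments a 2^2 factorial with axial and center points to form a central composite design; the axial distance and the number of center runs are illustrative choices, not values fixed by the system.

import itertools
import numpy as np

k = 2
factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
alpha = (2 ** k) ** 0.25                 # rotatable axial distance for a 2^k core
axial = np.vstack([d * np.eye(k) for d in (-alpha, alpha)])
center = np.zeros((3, k))                # e.g. three center runs

ccd = np.vstack([factorial, axial, center])
print(ccd)          # 4 factorial + 4 axial + 3 center = 11 runs for k = 2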

------RULE 38

IF 1. There is an adequate fit of the model,

and 2. More precision on estimates is desired, THEN * Replicate the experimental runs as many times as possible and estimate the effects

considering randomization in conducting the

experiment.

------RULE 39

IF 1. The experimenter does not know whether or

not it is possible to perform a full (2^k),

and 2. Budget and cost per run are known, THEN * Calculate total cost=(2^k)*(cost per run). Compare total cost with budget.
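A tiny illustration of this check, with hypothetical numbers:

k = 5                    # number of factors at two levels
cost_per_run = 120.0     # hypothetical cost of one experimental run
budget = 3000.0          # hypothetical budget

total_cost = (2 ** k) * cost_per_run     # 32 runs x 120 = 3840
print(total_cost, "affordable" if total_cost <= budget else "consider a fraction")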

[Flow charts omitted: FIGURE 1 (continued); FIGURE 2: EXPERIMENTS WITH TWO OR MORE INDEPENDENT VARIABLES, continued over several pages.]

CHAPTER SIX

SYSTEM EVALUATION

6.1 INTRODUCTION

The utility of the expert system for designing experiments can be measured in terms of the ability to improve the performance of users in designing experiments.

In this chapter, the effectiveness of the system will be evaluated. Conclusions and recommendations for future studies will also be presented.

Feedback from the user is a very important step in evaluating such a system. Therefore, an experiment will be conducted to determine if there is a significant improvement in the users' ability to solve experimental design problems when using the expert system.

6.2 EVALUATION EXPERIMENT

A simple experiment was designed to determine whether the expert system enhances the ability of students to solve experimental design problems (see Appendix). The experiment simply compares the answers obtained by solving the problems with and without the system.

Since the problems vary in difficulty, it is important to block this variation to obtain precision in the results of the experiment. Therefore, a paired t-test was employed to analyze the data; with pairing, differences in problem difficulty contribute no variation to the comparison. Also, the distribution of points is not the same for all problems, since they are not equally important.

Ten students from both graduate and undergraduate classes (five from each) were chosen at random to participate in the experiment. Each student had taken at least one introductory course in experimental design. The experiment was constructed so that each student took the test on his or her own prior to taking it using the expert system. This was done to ensure that students would not use knowledge gained from the expert system in answering the problems on their own. The knowledge that students may gain from taking the test on their own is negligible, since they cannot use any literature.

Points were given to the problems based on the possibilities available for answering. For example, if a student answered "factorial" without specifying "factorial at two levels", then a point was taken off; as a result the student would receive one point instead of two.

Specifications for breaking up points are provided in the Appendix.

As stated in the Appendix, the hypothesis to be tested is that there is a significant improvement in scores when the given experimental design problems are solved with the expert system as opposed to without it. Data was collected and analyzed (Appendix).

The results of this analysis indicate that use of the expert system produced a significant improvement in the subjects' ability to solve the experimental design problems. The calculated t-value is 6.62, while the critical value at alpha = 0.05 is 1.895. This indicates that the expert system was helpful in guiding the user to the appropriate experimental design in the situations tested. Moreover, an examination of the mean score for each problem revealed that scores are higher when the expert system is used.
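For reference, the reported value can be reproduced from the summary statistics given in the Appendix (d_bar = 0.975, s_d = 0.4166, n = 8 matched problems); the short Python sketch below is only a recomputation and is not part of the expert system.

import math
from scipy import stats

d_bar, s_d, n = 0.975, 0.4166, 8
t0 = d_bar / (s_d / math.sqrt(n))
t_crit = stats.t.ppf(0.95, n - 1)        # one sided test at alpha = 0.05

print(round(t0, 2), round(t_crit, 3))    # about 6.62 and 1.895, as reported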

6.3 CONCLUSIONS

Expert systems have been widely used in a variety of fields. Experimental design is one of the few fields that has not received much attention when it comes to developing intelligent systems. Currently available statistical software packages deal only with numerical calculations and ignore the important problem of deciding on the appropriate experimental design to run. This expert system can be effectively used as a teaching aid in experimental design classes. Students will find this system very helpful in the design stage of an experiment. In fact, the help facility can be accessed from any point in the system where help may be needed.

If the system is used to design experiments in real situations, it is preferred that the user have basic experimental design knowledge. That is, the user will be in charge of setting up or referring to the type of design provided by the system.

The purpose of the on-screen help facility is not to teach the subject of designing experiments but to refresh the user's memory on particular designs or technical terms. It is only a dictionary for selected designs and technical terms.

Unlike statistical software packages, this system does not perform any statistical calculations. Its objective is simply to provide guidance to the user when searching for the appropriate design. Therefore, the system is not expected to appeal to those who look for software that does everything from planning through analysis to conclusions. In short, developing this expert

system is an attempt to place more emphasis on the design

stage of the field because wrong conclusions may be drawn

from experiments that are poorly designed.

6.4 RECOMMENDATIONS

1) Future programs can be concerned with relevant subjects

in the field such as Regression, Response Surface

Methodology, experiments with multiple response, simple

comparative experiments and others. It may be feasible

then to integrate them into one system. For example,

the main menu in the integrated system can be the

individual subjects mentioned above.

2) Future versions can be concerned with a limited

interface with a powerful number calculation language

for analysis of experimental data. For example, this

expert system can be interfaced with Turbo Pascal in

its recent version.

3) Future versions can also be concerned with providing

the user of the design set-up with data collection

sheets and the randomized order of treatment

combinations. This, too, needs some calculation

power. For example, if a factorial design at two

levels has been decided upon, the system may ask the

user to enter variable names and actual number of

experimental runs affordable. The system will then

provide the user with the design set-up in random

order. It can go further to provide the adequate high

and low levels of the controllable variables.

I would very much recommend a future program that includes the third suggestion, without worrying about the numerical calculations for data analysis. In fact, there is no need to develop a system that includes both the design and the analysis, because data collection and modifications usually lie between the two stages. It should be remembered that the nature of experimentation is sequential and divided into several stages. Therefore, one program that includes all stages may not be effective.

REFERENCES

1) Aleksander, I., Designing Intelligent Systems, UNIPUB, New York, 1984.

2) Baker, T. B., "Quality Engineering by Design: Taguchi's Philosophy", Quality Progress, pp. 32-42, Dec. 1986.

3) Bartlett, M. S., "The Use of Transformations", Proc. Roy. Soc. A, 160:268, 1937.

4) Bartlett, M. S., and Kendall, D. G., "The Statistical Analysis of Variance-Heterogeneity and the Logarithmic Transformation", Journal of the Royal Statistical Society Supplement, 8:128, 1946.

5) Black, W. J., Intelligent Knowledge Based Systems: An Introduction, Van Nostrand Reinhold (UK) Co. Ltd., England, 1986.

6) Box, G. E. P., "Discussion on Paper by Kackar (1985)", Journal of Quality Technology, Vol. 17, No. 4, Oct. 1985.

7) Box, G. E. P., and Cox, D. R., "An Analysis of Transformations", J. R. Statist. Soc. Ser. B, 26:211-243, 1964.

8) Box, G. E. P., Kackar, R. N., Nair, V. N., Phadke, M. S., Shoemaker, A. C., and Wu, C. F., "Quality Practice in Japan", Quality Progress, pp. 37-41, Mar. 1988.

9) Box, G. E. P., and Tidwell, P. W., "Transformation of the Independent Variables", Technometrics, Vol. 4, No. 4, 1962.

10) Box, G. E. P., and Draper, N. R., Evolutionary Operation, Wiley, New York, 1969.

11) Box, G. E. P., and Hunter, J. S., "The 2^(k-p) Fractional Factorial Designs, Part I", Technometrics, Vol. 3, No. 3, pp. 311-351, 1961.

12) Box, G. E. P., and Hunter, J. S., "The 2^(k-p) Fractional Factorial Designs, Part II", Technometrics, Vol. 3, No. 4, pp. 449-459, 1961.

13) Box, G. E. P., Hunter, W. G., and Hunter, J. S., Statistics for Experimenters, John Wiley & Sons, Inc., New York, 1978.

14) Bratko, I., Prolog Programming for Artificial Intelligence, Addison-Wesley Publishers Limited, Great Britain, 1986.

15) Cochran, W. G., "Some Consequences When the Assumptions for the Analysis of Variance Are Not Satisfied", Biometrics, Vol. 3, pp. 22-38, 1947.

16) Dixon, W. J., and Massey, F. J., Jr., Introduction to Statistical Analysis, McGraw-Hill Book Company, New York, 1983.

17) Draper, N. R., and Herzberg, A. M., "On Lack of Fit", Technometrics, 13:231-241, 1971.

18) Duncan, D. B., "Multiple Range and Multiple F Tests", Biometrics, Vol. 11, pp. 1-42, 1955.

19) Dunnett, C. W., "New Tables for Multiple Comparisons with a Control", Biometrics, Vol. 20, pp. 482-491, 1964.

20) Eisenhart, C., "The Assumptions Underlying the Analysis of Variance", Biometrics, Vol. 3, pp. 1-21, 1947.

21) Fikes, R., and Kehler, T., "The Role of Frame-Based Representation in Reasoning", Communications of the ACM, Vol. 28, No. 9, pp. 904-920, September 1985.

22) Gale, W. A., "Knowledge-Based Knowledge Acquisition for a Statistical Consulting System", International Journal of Man-Machine Studies, 26, pp. 55-64, 1987.

23) Glen, T. M., and Terry, W. R., "Statistical Problems and Treatments in Assessing the Long Term Impact of Human Performance Productivity Improvement", Proceedings, Seventh International Conference on Production Research, pp. 906-911, 1984.

24) Hahn, G. J., "More Intelligent Statistical Software and Statistical Expert Systems: Future Directions", The American Statistician, Vol. 39, pp. 1-16, Feb. 1985.

25) Hahn, G. J., "Some Things Engineers Should Know About Experimental Design", Journal of Quality Technology, Vol. 9, No. 1, pp. 13-20, Jan. 1977.

26) Hand, D. J., "Statistical Expert Systems: Design", The Statistician, Vol. 33, pp. 351-369, 1984.

27) Hand, D. J., "Statistical Expert Systems: Necessary Attributes", Journal of Applied Statistics, Vol. 12, No. 1, pp. 19-27, 1985.

28) Hicks, C. R., Fundamental Concepts in the Design of Experiments, CBS College Publishing, New York, 1982.

29) Hines, W. W., and Montgomery, D. C., Probability and Statistics in Engineering and Management Science, John Wiley & Sons, Inc., New York, 1972.

30) Hunter, J. S., "Statistical Design Applied to Product Design", Journal of Quality Technology, Vol. 17, No. 4, pp. 210-219, 1985.

31) Johnson, P. E., Nachtsheim, C. J., and Zualkernan, I. A., "Consultant Expertise", Expert Systems, Vol. 4, No. 3, pp. 180-188, Aug. 1987.

32) Jones, B., "The Computer as a Statistical Consultant", BIAS, Vol. 7, No. 2, pp. 168-195, 1980.

33) Kackar, R. N., "Off-Line Quality Control, Parameter Design, and the Taguchi Method", Journal of Quality Technology, Vol. 17, No. 4, pp. 176-188, Oct. 1985.

34) Kackar, R. N., "Taguchi's Philosophy: Analysis and Commentary", Quality Progress, pp. 21-29, Dec. 1986.

35) Kennard, R. W., and Stone, L. A., "Computer-Aided Design of Experiments", Technometrics, Vol. 11, No. 1, pp. 137-148, 1969.

36) Mendenhall, W., Introduction to Linear Models and the Design and Analysis of Experiments, Duxbury Press, California, 1968.

37) Miller, R. K., and Walker, T. C., Artificial Intelligence Applications in Engineering, SEAI Technical Publishing, U.S.A., 1988.

38) Montgomery, D. C., Design and Analysis of Experiments, John Wiley & Sons, Inc., New York, 1976.

39) Nelson, L. S., "Extreme Screening Designs", Journal of Quality Technology, Vol. 14, No. 2, pp. 99-100, April 1982.

40) Shapiro, S. S., Wilk, M. B., and Chen, H. J., "A Comparative Study of Various Tests for Normality", Journal of the American Statistical Association, 63:1343, 1968.

41) Scheffe, H., "A Method for Judging All Contrasts in the Analysis of Variance", Biometrika, Vol. 40, pp. 87-104, 1953.

42) Smith, C. A. B., and Hartley, H. O., "Construction of Youden Squares", Journal of the Royal Statistical Society, B, Vol. 10, pp. 262-264.

43) Snee, R. D., "Computer-Aided Design of Experiments, Some Practical Experiences", Journal of Quality Technology, Vol. 17, No. 4, pp. 222-236, Oct. 1985.

44) Steele, R. G. D., and Torsie, J. H., Principles and Procedures of Statistics, McGraw-Hill, New York, 1960.

45) Sullivan, L. P., "The Power of Taguchi Methods", Quality Progress, pp. 76-79, June 1987.

46) Tukey, J. W., "One Degree of Freedom for Non-Additivity", Biometrics, Vol. 5, pp. 232-242, 1949.

47) Youden, W. J., "Partial Confounding in Factorial Replication", Technometrics, Vol. 3, No. 3, pp. 353-358, August 1961.

APPENDIX

Evaluation Test ......

Data Sheet

Point Distribution ......

Data Summary ......

Score Averages ------

Paired t-test ......

Instructions for Running the Program ------

Sample Output ......

EFFICIENCY TEST

The following problems were presented to a number of students to determine what type of experimental design is appropriate for each of them. Each student first had to determine the answer on his or her own, and then was allowed to determine the answer by using the expert system. These problems were selected from Montgomery

(1976) and Box, Hunter, and Hunter (1978).

------PROBLEM 1 An engineer wishes to determine whether or not four tips produce different readings on a hardness test machine. The machine operates by pressing the tip into a metal specimen, and from the depth of the resulting depression, the hardness of the specimen can be determined. There is only one factor to consider: tip type. (a) If the metal specimens differ slightly in their hardness, what type of experimental design should be used to eliminate the uncontrolled variability of specimen hardness? (b) After the type of design has been decided upon, the engineer is looking for a way to compare all pairs of individual tip types; what test do you think he/she should use?

------PROBLEM 2 The percentage of hardwood concentration in raw pulp, the vat pressure, and the cooking time of pulp are being investigated for their effects on the strength of paper. Three levels of hardwood concentration, three levels of pressure, and two cooking times are selected. What type of design should be used?

------PROBLEM 3 An experimenter is studying the effect of five different formulations of an explosive mixture used in the manufacture of dynamite on the observed explosive force.

Each formulation is mixed from a batch of materials that is only large enough for five formulations to be tested.

The formulations are prepared by different operators, and there may be substantial differences in the skills and experience of the operators. Five operators are chosen at random for this experiment. It should be noted that this single factor experiment has two uncontrolled variables: operators and batches. What type of experimental design should be run?

------PROBLEM 4 Three factors are studied for their effects on the horsepower necessary to remove a cubic inch of metal per minute. The factors are feed rate at two levels, tool condition at two levels, and tool type at two levels. It should be mentioned that all the factors are important.

What type of design should be used?

------PROBLEM 5 An engineer is studying the effect on filtration time of some product. The variables under consideration are:

(1) Water supply

(2) Raw materials

(3) Temperature of filtration

(4) Hold up time prior to filtration

(5) Recycle

(6) Rate of addition of caustic soda

(7) Type of filter cloth

(a) The engineer does not know for certain which of the above variables are important. What type of design do you recommend to identify the important ones? (b) If the experimenter has identified that variables (1) water supply, (3) temperature of filtration, and (6) rate of adding caustic soda are important, and he/she suspects that two-factor interactions among them are important too,

what is the next step the engineer should take? (c) If a model involving the three main effects and their two-factor interactions provides an adequate fit, what should the experimenter do to obtain more precision on their estimates?

------DATA SHEET

USING SYSTEM (Y/N)? ------SUBJECT # ----

ISE 307/507 ------

PROBLEM 1

PROBLEM 2

PROBLEM 3

PROBLEM 4

PROBLEM 5 POINT DISTRIBUTION

Points for the problems will be distributed as the following:

PROBLEM 1 (a) Complete Randomized Block Design 2 Complete Randomized Design 1 Otherwise 0

(b) Duncan's Multiple Range Test Otherwise

PROBLEM 2 Full Factorial Design Factorial Design Otherwise

PROBLEM 3 Latin Square Design Factorial Design Otherwise

PROBLEM 4 Factorial Design at 2 levels Factorial Design Otherwise

PROBLEM 5 (a) Fractional factorial at 2 levels 2 Factorial 1 Otherwise 0

(b) Factorial at 2 levels + Inclusion of 2-factor interactions 2 Factorial at 2 levels 1 Otherwise 0

(c) Replication of experimental runs 1 Otherwise 0

DATA SUMMARY

Using the Expert System? No

[Table of subjects' scores by problem without the expert system.]

DATA SUMMARY

Using the Expert System? Yes

[Table of subjects' scores by problem with the expert system.]

SCORE AVERAGES

                 Average without      Average with       Difference
                 system (1)           system (2)         (2) - (1)

Problem 1 (a)

(b)

Problem 2

Problem 3

Problem 4

Problem 5 (a)

(b)

(c)

PAIRED T-TEST

H0: u1 = u2

HA: u1 < u2

Where: u1 refers to the mean score obtained when solving the problems without using the system, and u2 refers to the mean score obtained when solving the problems using the system.

Statistic: t0 = d_bar / (s_d / sqrt(n))

Where: d_bar is the average of the differences between pairs, s_d is the standard deviation of the differences, and n is the number of matched pairs.

alpha = 0.05
d_bar = 0.975
s_d = 0.4166

t0 = 0.975 / (0.4166 / sqrt(8)) = 6.62

t(0.05, 8-1) = 1.895, so t0 > t(0.05, 7)

Conclusion: u2 is greater than u1 (reject the null hypothesis and conclude that using the system produces significantly higher scores).

INSTRUCTIONS FOR RUNNING THE PROGRAM

1) Insert a DOS disk in drive a and get A>

2) Remove the DOS disk from drive a and insert the program

(Turbo Prolog) disk and type prolog

3) Insert the data disk in drive b

4) Move the cursor to option Files in the main menu and hit <Enter>

5) Move the cursor to option Load and hit <Enter>

6) Type b:expert and hit <Enter>

7) After the file has been loaded, move the cursor to option Run in the main menu and hit <Enter>

SAMPLE OUTPUT

EXPERT SYSTEM FOR DESIGNING STATISTICAL EXPERIMENTS Developed by: Mustafa Shraim Advisor: Dr. W. Robert Terry

Department of Industrial & Systems Engineering Ohio University Athens, Ohio

Press space bar to continue

THIS PROGRAM IS DESIGNED TO HELP THE USER IN SELECTING THE APPROPRIATE EXPERIMENTAL DESIGN. IT DOES NOT PROVIDE ANY NUMERICAL CALCULATIONS OF EXPERIMENTAL DATA. IT SHOULD BE NOTED THAT THIS PROGRAM APPLIES TO SINGLE DEPENDENT VARIABLE OR RESPONSE AND FIXED EFFECTS SITUATIONS ONLY

Do you wish to start the program (y/n)?y 1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: 1

Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?

Enter answer or 'q' to quit or 'h' for help : h

Uncontrollable variability: variability that comes from extraneous sources over which the experimenter has no control. Example: Temperature in a field experiment. Press space bar to continue. Are there any sources of extraneous uncontrollable factors in the experiment (y/n)?

Enter answer or 'q' to quit or 'h' for help : n

Run a complete randomized design to compare K>2 treatment means or levels of the single factor

Enter 'c' to continue or 'h' for help h

Complete Randomized Design: A very simple form of experimental design in which the treatments involved are allocated to the experimental units purely on a random basis

Press space bar to continue Do you wish to compare individual treatment means (y/n)?

Enter answer (or 'q' to quit): y

Is it known what treatments to compare and contrasts are set (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Do you wish to compare all possible pairs of treatment means (levels In single factor experiment) (y/n)?

Enter answer ('q' to quit): y

Is F-test for all treatments (levels) combined significant at 5% (y/n)?

Enter answer ('q' to quit): n

Use Duncan's multiple range test

Enter 'c' to continue or 'h' for help h

Duncan's multiple range test is a powerful tool to compare treatment means. Refer to Duncan (1955)

Press space bar to continue 1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: I

Are there any sources of extraneous uncontrollable Factors in the experiment (y/n)?

Enter answer or 'q' to quit or 'h' for help : y

1 -- Source of uncontrollable variability comes from one factor 2 -- Source of uncontrollable variability comes from more than one factor

Enter number of choice ('q' to quit): 2

1 -- Uncontrollable variability comes from two factors 2 -- Uncontrollable variability comes from three factors 3 -- Uncontrollable variability comes from more than three factors

Enter number of choice ('q' to quit): 1  Is number of levels for each of the extraneous factors equal to the number of treatments (or levels) to be compared (y/n)? Enter answer ('q' to quit): y

Is data complete (i.e. no missing values) (y/n)?

Enter answer ('q' to quit): y

Use Latin square design

Enter 'c' to continue or 'h' for help h

Latin square design is used when the uncontrollable variability comes from two sources which may be identified as rows and columns. One Roman letter is used in each cell of the square with no repetition in the same row and column. For example, if we would like to investigate three treatments A, B, and C with two uncontrollable variables each at three levels then the square is A C B / C B A / B A C

Press space bar to continue. 1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: 2

Are all independent variables known to be important (y/n)?

Enter answer or 'q' to quit or 'h' for help: y

Is number of levels for each of the independent variables predetermined (y/n)?

Enter answer or 'q' to quit or 'h' for help: y

Is number of levels different from variable to variable (y/n)?

Enter answer ('q' to quit): y  Run a full factorial design. If you are considering all possible interactions then you will need at least two replicates of one treatment combination. Otherwise you may use the unimportant interactions to estimate the error term

Enter 'c' to continue or 'h' for help h

Full factorial: Involves all possible treatment combinations in two or more factor experiments. For example, if three factors (independent variables) are considered in an experiment and each is at two levels then the full factorial involves 2 x 2 x 2 = 8 treatment combinations. Interaction: Is a measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the level(s) of another or others

Press space bar to continue

Run a full factorial design. If you are considering all possible interactions then you will need at least two replicates of one treatment combination. Otherwise you may use the unimportant interactions to estimate the error term

Enter 'c' to continue or 'h' for help c  1 -- There is only one factor (independent variable) to investigate 2 -- There are two or more factors (independent variables) to investigate

Enter number of choice or 'q' to quit or 'h' for help: 2

Are all independent variables known to be important (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Divide each independent variable into high and low levels. Run a highly fractionated factorial design at two levels. This will serve as a screening experiment for identifying the set of the significant variables

Enter 'c' to continue or 'h' for help h

Screening: The process of separating the important independent variables from the unimportant. Screening is applied at an early stage of experimental design when a large number of variables is present

Press space bar to continue. Divide each independent variable into high and low levels. Run a highly fractionated factorial design at two levels. This will serve as a screening experiment for identifying the set of the significant variables

Enter 'c' to continue or 'h' for help c

Is there any prior knowledge indicating that any of the two-factor interactions are important (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Is it economically possible to perform all 2-k experimental runs given the budget and cost per run (y/n)?

Enter answer or 'q' to quit or 'h' for help: y

Can all runs be performed under homogeneous conditions (y/n)?

Enter answer or 'q' to quit or 'h' for help y

Set up a full factorial design at two levels (2-k). Randomize runs, perform experiment and collect data. Fit the first order model that includes two-factor interactions. Conduct a lack of fit test (use higher order interactions to estimate error if experiment is not sufficiently replicated)

Enter 'c' to continue or 'h' for help h  Randomization: A set of treatments applied to a set of units is said to be randomized when the treatment applied to any given unit is chosen at random from those available and not already allocated. First order model: (Linear model) contains only the terms that are linear in nature

Press space bar to continue

2-k: (Factorial design at two levels) refers to k factors, each being investigated at two levels. This is the most popular factorial design because it does not require a large number of runs. However, this design estimates the linear effects of the independent variables and does not account for nonlinear effects. Interaction: Is a measure of the extent to which the effect (upon the dependent variable) of changing the level of one factor depends on the level(s) of another or others

Press space bar to continue

Set up a full factorial design at two levels (2-k). Randomize runs, perform experiment and collect data. Fit the first order model that includes two-factor interactions. Conduct a lack of fit test (use higher order interactions to estimate error if experiment is not sufficiently replicated)

Enter 'c' to continue or 'h' for help c

Is there a good fit of the model (y/n)?

Enter answer or 'q' to quit or 'h' for help: h  The model is adequate and has a good fit if there are no significant violations of one or more of the following statistical assumptions: Linearity, Normality, Independence, Homogeneity of variance, and Additivity

Press space bar to continue

Linearity Definition: The dependent variable (response) can be expressed as a linear function of the independent variables. Such an assumption will be valid when the surface which represents the relationship between the dependent variable and the independent variables has no significant curvature. Testing: Linearity can be tested by employing the lack of fit test in Draper & Herzberg (1971). Remedy: If the linearity assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Normality Definition: The difference between predicted and actual values, i.e. the residuals, must follow a normal distribution with the mean centered at zero. Testing: Normality can be tested by employing the chi-square goodness of fit test, the Kolmogorov-Smirnov test, or the W test developed by Shapiro, Wilk & Chen (1968). Remedy: If the normality assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Independence Definition: Residuals should be independently distributed random variables. This means that the knowledge of the magnitude of a given residual does not contain any information for predicting the values of other residuals. Testing: The independence assumption can be tested by employing the Durbin-Watson test. Remedy: Refer to Glen & Terry (1984)

Press space bar to continue. Homogeneity Of Variance Definition: The variances of the distributions of the residuals are constant for all combinations of values of the independent variables. Testing: The assumption of homogeneous variance can be tested by employing a test developed by Bartlett (1937) or the one by Bartlett and Kendall (1946). Remedy: If homogeneity of variance is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Additivity Definition: The model, which expresses the dependent variable as a function of the independent variables, must not contain any cross product or interaction terms. However, the assumption that interactions are not present will not always be valid. Testing: The additivity assumption can be tested by employing the 'One degree of freedom test for additivity' by Tukey (1949). Remedy: If the additivity assumption is violated, appropriate transformation can be achieved by employing 'Maximum likelihood method' in Box & Cox (1964)

Press space bar to continue

Is there a good fit of the model (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

The assumption of no higher than two-factor interactions is possibly wrong. Include the most likely three-factor interaction terms in the model. Fit the first order model

Enter 'c' to continue or 'h' for help h 3-factor interaction: A measure of the extent to which the effect ( upon the dependent variable) of changing the level of one factor depends on the levels of the other two factors

Press space bar to continue

The assumption of no higher than two-factor interactions is possibly wrong. Include the most likely three-factor interaction terms in the model. Fit the first order model

Enter 'c' to continue or 'h' for help c

Is there a good fit of the model (y/n)?

Enter answer or 'q' to quit or 'h' for help: n

Transformation of the data may be applied to achieve model adequacy. The lack of adequacy is a result of violation of one or more of the statistical assumptions

Enter 'c' to continue or 'h' for help h

Transformation: The metrics in which data is recorded are chosen merely as a matter of convenience. However, they may not be those in which the system is most simply modeled. In such cases, transforming the response or the independent variable(s) or both onto another scale provides us with a simple model

Press space bar to continue. Is there a good fit on the transformed data (y/n)?

Enter answer or 'q' to quit: y

Is more precision on estimates of effects desired (y/n)?

Enter answer or 'q' to quit: y

Replicate the experimental runs as many times as possible to achieve the precision desired

Enter 'c' to continue or 'h' for help

Replication: The execution of an experiment or a survey more than once so as to increase precision and to obtain a closer estimation of sampling error

Press space bar to continue