ABSTRACT
Statistical Monitoring of a Process with Autocorrelated Output and Observable Autocorrelated Measurement Error
Jesús Cuéllar Fuentes, Ph.D.
Co Mentors: John W. Seaman, Jr., Ph.D., Jack D. Tubbs, Ph.D.
Our objective in this work is to monitor a production process yielding output that is correlated and contaminated with autocorrelated measurement error. Often, the elimination of the causes of the autocorrelation of the measurement error and the reduction of the measurement error to a negligible level is not feasible because of regulatory restrictions, technological limitations, or the expense of requisite modifications. In this process, reference material is measured to verify the performance of the measurement process, before the product material is measured.
We propose the use of a transfer function to account for measurement error in the
product measurements. We obtain the base production signal and use a modified version of the common cause (CC) chart and the special cause control (SCC) chart, originally proposed by Alwan and Roberts (1988), to monitor the base production process. We incorporate control limits in the CC chart as suggested by Alwan (1991) and
Montgomery and Mastrangelo (1991) and add an MR-chart to the original SCC chart.
The common cause control (CCC) chart and SCC charts comprise a flexible monitoring scheme capable of detecting not only changes in the process mean, but also shifts in the mean and the variance of the random shocks that generate the base process.
Statistical Monitoring of a Process with Autocorrelated Output and Observable Autocorrelated Measurement Error
by
Jesús Cuéllar Fuentes, M.S.
A Dissertation
Approved by the Department of Statistical Science
______Jack D. Tubbs, Ph.D., Chairperson
Submitted to the Graduate Faculty of Baylor University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Approved by the Dissertation Committee
______John W. Seaman, Jr., Ph.D., Co-Chairperson
______Jack D. Tubbs, Ph.D., Co-Chairperson
______Jane L. Harvill, Ph.D.
______Elisabeth M. Umble, Ph.D.
______Dean M. Young, Ph.D.
Accepted by the Graduate School May 2008
______J. Larry Lyon, Ph.D., Dean
Page bearing signatures is kept on file in the Graduate School. Copyright © 2008 by Jesús Cuéllar Fuentes
All rights reserved
TABLE OF CONTENTS
List of Figures viii
List of Tables xii
List of Abbreviations xv
Acknowledgments xvi
Dedication xvii
CHAPTER 1 – Introduction 1
1.1 Control Charts and Autocorrelated Observations 4
1.2 Schemes to Monitor Processes with Autocorrelated Output with Unobserved Measurement Error 8
1.2.1 Methods that Reduce or Remove Autocorrelation 9
1.2.2 Methods that Account for Autocorrelation 11
1.2.2.1 Observation-based methods. 11
1.2.2.2 Residual-based methods. 13
1.2.2.3 Specialized monitoring schemes. 17
1.3 Schemes to Monitor Processes with Autocorrelated Output and Observed Correlated Measurement Error 18
1.4 Summary and Discussion 29
CHAPTER 2 – Monitoring Schemes 30
2.1 Characteristics of a Successful Monitoring Scheme 31
2.2 Issues to Consider when Monitoring a Process with an Autocorrelated Output and no Observed Measurement Error 32
2.2.1 Special Causes and Types of Disturbances 33
2.2.2 Dynamic Behavior of the Residuals after a Change in the Process Mean 34
2.2.3 Assessing Performance of a Control Scheme 34
2.2.4 Calculation of Control Limits: Moving Range vs. Standard Deviation 36
2.3 Selection of a Monitoring Scheme for a Process with Autocorrelated Output and Unobserved Measurement Error 37
2.3.1 Construction of the Common-Cause Control Chart 39
2.3.2 The Special-Cause Charts and Control Limits. 40
2.3.3 Procedure to Implement the Selected Monitoring Scheme for a Process with Autocorrelated Output and Unobserved Measurement Error 42
2.4 Monitoring Scheme for a Process with Autocorrelated Output and Observed Autocorrelated Measurement Error 44
2.4.1 Procedure to Implement the CCC and SCC Charts to Monitor a Process with Autocorrelated Output and Observed Autocorrelated Measurement Error 47
2.5 Summary and Discussion 49
CHAPTER 3 – Behavior of Observations, Forecasts, and Residuals after Changes in the Mean and Variance 51
3.1 Behavior of the Lead-One Forecasts and Residuals for Processes with Correlated Output and Unobserved Measurement Error 51
3.1.1 Behavior of the Residuals after a Step Shift in the Process Mean 53
3.1.2 Behavior of the Residuals after a Step Change in the Process Variance 60
3.1.3 Behavior of the Observations, Lead-One Forecasts, and Residuals after a Step Change in the Variance of the Random Shocks 62
3.1.4 Behavior of the Observations, Lead-One Forecasts, and Residuals after a Step Change in the Mean of the Random Shocks 67
3.2 Behavior of the Lead-One Forecasts and Residuals for Processes with Correlated Output and with Observed Correlated Measurement Error 69
3.2.1 Behavior of the Residuals after a Step Shift in the Mean of the Base Process 70
3.2.2 Behavior of the Residuals after a Step Shift in the Variance of the Base Process 77
3.2.3 Behavior of the Residuals after a Step Shift in the Mean of the Random Shocks of the Base Process 78
3.2.4 Behavior of the Residuals after a Step Shift in the Variance of the Random Shocks of the Base Process 83
3.3 Summary and Discussion 88
CHAPTER 4 – Illustration of the Monitoring of a Process with Autocorrelated Output and Unobservable Measurement Error 90
4.1 Phase I Implementation 91
4.1.1 Identification of the ARIMA Model and Estimation of Its Parameters 92
4.1.2 Obtaining Lead-One Forecasts 107
4.1.3 Construction of the Common Cause Control Chart 111
4.1.4 Construction of the Special Cause Control Charts 111
4.2 Phase II Implementation – Process Monitoring 116
4.3 Detection of Out-of-Control Conditions 119
4.3.1 Procedure to Generate Data with Out-of-Control Conditions 120
4.3.2 Effect of a Step Shift in the Mean of the Process 123
4.3.3 Effect of a Step Shift in the Mean of the Random Shocks 131
4.3.4 Effect of a Step Shift in the Variance of the Random Shocks 138
4.4 Summary and Discussion 145
CHAPTER 5 – Illustration of the Monitoring of a Process with Autocorrelated Output and with an Observable Autocorrelated Measurement Error 147
5.1 Phase I of the Implementation of the CCC and SCC Charts 148
5.1.1 Identification of the Transfer Function and Estimation of its Parameters 152
5.1.2 Identification of the Base Process Model and Estimation of its Parameters 158
5.1.3 Construction of the CCC and SCC Charts 168
5.2 Phase II – Implementation of the Monitoring Scheme for the Base Process 173
5.3 Detection of Out-of-Control Conditions 176
5.3.1 Detection of Step Shifts in the Mean of the Base Process 178
5.3.2 Detection of Step Shifts in the Mean of the Random Shocks of the Base Process 185
5.3.3 Detection of Step Shifts in the Variance of the Random Shocks of the Base Process 190
5.4 Summary and Discussion 198
CHAPTER 6 – Performance of the CCC and SCC Charts when the Mean or the Variance of the Base Process or the Random Shock Shift 199
6.1 Description of the Simulation Study 200
6.1.1 Simulation Procedure to Obtain Run Lengths 204
6.2 Determination of the Control Limits for the I-Chart and the MR-Chart 206
6.3 Performance of the CCC and SCC Charts for a Step Shift in the Mean of the Base Process 209
6.4 Performance of the CCC and SCC Charts for a Step Shift in the Mean of the Random Shocks of the Base Process 216
6.5 Performance of the CCC and SCC Charts for a Step Shift in the Variance of the Random Shocks of the Base Process 222
6.6 Summary and Discussion 225
CHAPTER 7 – Conclusions and Future Research 227
APPENDICES 232
Appendix A – Literature Summary 233
Appendix B – Variance of the Base Process 240
Appendix C – SAS Code 242
REFERENCES 263
LIST OF FIGURES
Figure 1. Diagram of the measurement of the product and reference materials and corresponding signals of the base pharmaceutical process (P), the measurement process (X), and overall process (Y). 2
Figure 2. Aggregation of the measurement process signal $X_t^{(\mathrm{prod})}$ and the intrinsic production process signal Pt into the output signal Yt. 18
Figure 3. Change in the measurement process from measuring the reference materials. 24
Figure 4. Diagram representing the use of the transfer function as a filter to obtain the base process signal. 28
Figure 5. Diagram of the measurement of the product and reference materials and corresponding signals of the base pharmaceutical process and the measurement process. 30
Figure 6. Diagram of the measurement of the product and reference materials and corresponding signals of the base pharmaceutical process and the measurement process. 90
Figure 7. Time plot of Phase I measurement process data. 93
Figure 8. ACF of the measurement process series. 95
Figure 9. PACF of the measurement process series. 95
Figure 10. ACF and PACF of the residuals from model (4.10). 105
Figure 11. ACF and PACF of the residuals from model (4.11). 105
Figure 12. Histogram and normal probability plot of the residuals from model (4.11). 106
Figure 13. Observed versus forecasted measurement process series. 110
Figure 14. CCC chart for the Phase I measurement process data. 112
Figure 15. I-chart and MR-chart of the residuals from model (4.11). 114
Figure 16. Distribution gamma(1, 4) fitted to the observed MR values. 116
Figure 17. CCC Chart of Phase II Data 119
Figure 18. I-chart and MR-chart of Phase II data. 120
Figure 19. CCC and SCC charts with a 3σW step shift in mean of the process 125
Figure 20. Expected behavior of the residuals after a shift in the mean of the measurement process. 128
Figure 21. CCC and SCC Charts with a 2σW Step Shift in Mean of the Process. 129
Figure 22. CCC and SCC charts with a 1.5σW step shift in mean of the process. 130
Figure 23. CCC and SCC Charts with a 3σa step shift in the mean of the random shocks. 132
Figure 24. CCC and SCC charts with a 2σa step shift in the mean of the random shocks. 136
Figure 25. CCC and SCC charts with a 1.5σa step shift in the mean of the random shocks. 137
Figure 26. CCC and SCC charts with a 3σa step shift in the variance of the random shocks. 139
Figure 27. CCC and SCC charts with a 2σa step shift in the variance of the random shocks. 143
Figure 28. CCC and SCC charts with a 0.5σa step shift in the variance of the random shocks. 144
Figure 29. Time plot of the measurement process and overall process observations. 153
Figure 30. ACF and PACF of the overall process series. 154
Figure 31. CCF of the prewhitened Xt and Yt series. 155
Figure 32. ACF and PACF of the residuals after model (5.8) is fitted. 158
Figure 33. ACF and PACF of the residuals ut from model (3.35). 164
Figure 34. Time plot of the overall process observations and their lead-one forecasts. 166
Figure 35. Histogram and normal probability plot of the residuals from model (5.11). 167
Figure 36. CCC chart for the overall process observations. 170
Figure 37. SCC charts of the residuals of the overall process. 172
Figure 38. Phase II CCC chart of the overall process observations. 175
Figure 39. Phase II I-chart and MR-chart of residuals. 175
Figure 40. CCC and SCC charts of the overall process with a 3σu shift in the mean of the base process. 179
Figure 41. Expected behavior of the overall process observations, lead-one forecasts, and residuals after a 3σu shift in the mean of the base process. 182
Figure 42. CCC and SCC charts of the overall process with a 2σu shift in the mean of the base process. 183
Figure 43. CCC and SCC charts of the overall process with a 1.5σu shift in the mean of the base process. 184
Figure 44. CCC and SCC charts of the overall process with a 3σu shift in the mean of the random shocks of the base process. 186
Figure 45. CCC and SCC charts of the overall process with a 2σu shift in the mean of the random shocks of the base process. 189
Figure 46. CCC and SCC charts of the overall process with a 1.5σu shift in the mean of the random shocks of the base process. 190
Figure 47. CCC and SCC charts of the overall process with a 3σu shift in the variance of the random shocks of the base process. 191
Figure 48. CCC and SCC charts of the overall process with a 2σu shift in the variance of the random shocks of the base process. 192
Figure 49. CCC and SCC charts of the overall process with a 0.5σu shift in the variance of the random shocks of the base process. 193
Figure 50. Probability and cumulative probability distribution functions of the run length for all charts combined for φbp = – 0.95 and θbp = – 0.90 a step shift in the mean of the base process. 210
Figure 51. Behavior of the observed, lead-one forecast, and residuals values after a shift in the mean of the random shocks at observation 100 and for combinations of φbp and θbp. 221
Figure 52. CDF plots of the run length for all charts combined for different shifts in the mean of the random shocks of the base process. 221
Figure 53. CDF plots of the run length for all charts combined for different shifts in the variance of the random shocks of the base process. 225
LIST OF TABLES
Table 1. Description of the Special-Cause Rules 6
Table 2. Classification of Methods to Monitor Processes with Autocorrelated Data 9
Table 3. Decisions when the Overall and the Measurement Processes are monitored separately 21
Table 4. Transient Behavior and Steady State Level of Residuals for Various Process Models 59
Table 5. Guidelines to identify a model based on the ACF and PACF 96
Table 6. Conditional Least Squares Estimates of the Coefficient in Model with Components at Lags 1, 2, 3, 4, 10, 11, 12, and 13 102
Table 7. Conditional Least Squares Estimates of the Coefficients in Model (4.10) 102
Table 8. Conditional Least Squares Estimates of the Coefficients in Model (4.11) 103
Table 9. Portmanteau Test of Autocorrelation of the Residuals of Model (4.10) 103
Table 10. Portmanteau Test of Autocorrelation of the Residuals of Model (4.11) 104
Table 11. Comparison of Model Selection Criteria 106
Table 12. Tests of Normality of the Residuals from Model (4.11) 106
Table 13. Out-of-Control Conditions 121
Table 14. Portmanteau Test of the Cross-Correlation of the Residuals for the Transfer Function $\hat{v}(B) = \hat{\omega}_0$ 156
Table 15. Portmanteau Test of the Cross-Correlation of the Residuals For the Transfer function (3.31) 157
Table 16. Conditional Least Squares Estimates of the parameters of Model (5.10). 159
Table 17. Portmanteau Lack of Fit Test that the Autocorrelation of the Residuals of Model (5.7) are Equal to Zero 160
Table 18. Portmanteau Lack of Fit Test that the Cross-Correlation of the Residuals of Model (5.10) and the Xt series are Equal to Zero 161
Table 19. Conditional Least Squares Estimates of the parameters of Model (5.11) 162
Table 20. Portmanteau Lack of Fit Test that the Autocorrelation of the Residuals of Model (5.11) are Equal to Zero 162
Table 21. Portmanteau Lack of Fit Test that the Cross-Correlation of the Residuals of Model (5.11) and the Xt series are Equal to Zero 163
Table 22. Comparison of Variance Estimates and Information Criteria for Models (5.10) and (5.11) 165
Table 23. Correlation between Parameter Estimates of Model (5.11) 165
Table 24. Tests for Normality of the Residuals from Model (3.35) 168
Table 25. Types of Step Shifts and their Magnitudes 176
Table 26. Expected values of the lead-one forecasts and the Residuals of the Overall Process 182
Table 27. Types of Step Shifts and their Magnitude 201
Table 28. Simulation Design Points for the Parameters φbp, and θbp each at 3 Levels 202
Table 29. Designed Experiment of the Parameters φbp, and θbp each at 3 Levels 208
Table 30. Run Length Summary Statistics for step shifts in the mean of the base process 211
Table 31. Run Length Summary Statistics for Step Shifts in the Mean of the Random Shocks 217
Table 32. Run Length Summary Statistics for Step Shifts in the Variance of the Random Shocks 223
LIST OF ABBREVIATIONS
Acronym     Meaning                                          Page where first appears
ACF         Autocorrelation function                         17
AR          Autoregressive                                   17
ARL         Average run length                               16
ARMA        Autoregressive moving average                    10
ARIMA       Autoregressive integrated moving average         9
CC chart    Common cause chart                               37
CCC chart   Common cause control chart                       37
CCF         Cross-correlation function                       149
CDF         Cumulative distribution function                 202
CL          Center line                                      6
CSE         Combined Shewhart and EWMA chart                 14
CUSUM       Cumulative sum                                   8
EWMA        Exponentially weighted moving average            10
EWMV        Exponentially weighted moving variance           12
EWMS        Exponentially weighted root mean square          12
GLRT        Generalized likelihood ratio test                14
I-chart     Individual observation control chart             7
IID         Independent and identically distributed          6
IMA         Integrated moving average                        14
IQRRL       Run length interquartile range                   202
LCL         Lower control limit                              5
MCAP        Max-CUSUM for autocorrelated processes           15
MR-chart    Moving range control chart                       7
MRL         Median run length                                202
OGLF        Optimal generalized linear filter                16
OSLF        Optimal second-order linear filter               16
PACF        Partial autocorrelation function                 92
PID         Proportional-Integral-Derivative                 14
RL          Run length                                       34
RMA         Reverse moving average                           15
SCC chart   Special cause control chart                      37
SCR         Special cause rule                               4
SDRL        Standard deviation of the run length             34
SPC         Statistical process control                      2
UCL         Upper control limit                              5
ACKNOWLEDGMENTS
My wife and my daughters deserve my deepest gratitude for their encouragement, support, and patience. Dr. Seaman, Dr. Tubbs, and Dr. Young, thank you for your continuous encouragement and valuable conversations. Also, I want to thank the faculty and staff of the Department of Statistical Science for their teaching and help.
To Yolanda, Victoria, Amanda, and Julieta. The essence of my life!
To my parents. My foundation!
CHAPTER ONE
Introduction
The subject of this work was motivated by the need to monitor a pharmaceutical
process. An assay is performed to assess the properties of the active pharmaceutical ingredient and the final drug product that the patient will use. In this assay the pharmaceutical product is dissolved in a buffered solution and placed into a transparent holder (called a cuvette). A beam of light of a specific wavelength is shone onto the sample and the amount of light absorbed is measured. The performance of the measurement process is assessed by measuring a sample of a reference or control material with an absorbance that is known to be within a specified tolerance. The measurement process is affected by variations in the batches of buffered solutions, by the deterioration of the instrument’s lamp and optical detectors, by the analyst’s technique, by the instrument setup, and by the variation of the physical characteristics of the cuvettes.
We will use Yt to represent the absorbance of the product solution, Xt to represent
the absorbance of the reference material, and Pt to represent the intrinsic absorbance of
the product generated by the base production process. The subscript t represents the time
period at which the absorbance value is determined. The quantity Pt cannot be measured
unless it goes through the measurement process. The measurement process
“contaminates” the actual absorbance value, Pt, with a measurement error, $X_t^{(\mathrm{prod})}$, producing the measurement of the product material, Yt. We will refer to Yt as the output
of the overall process because it contains the intrinsic absorbance value and the product measurement error. Also, we will refer to Xt as the output of the measurement process; it represents the measurement error of the reference material. Figure 1 depicts this pharmaceutical process.
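[Figure 1 diagram: the product from the base process (signal P) and the reference material each pass through the measurement process, yielding the overall process signal Y and the measurement process signal X, respectively.]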
Figure 1. Diagram of the measurement of the product and reference materials and corresponding signals of the base pharmaceutical process (P), the measurement process (X), and overall process (Y).
The overall process and the measurement process generate values that are naturally autocorrelated. In the real-world pharmaceutical problem that motivated this research, removing the cause or causes of the autocorrelation of the measurement process and reducing its contribution to the overall process measurements to a negligible level are not feasible because of regulatory constraints and the expense of the requisite modifications.
The objective of the monitoring of the pharmaceutical process is to detect out-of-control conditions of the base production process and eliminate their causes, so that the base process is maintained in control. This requires that we remove the measurement error from the overall process observation.
Contamination of a base process by a measurement process is certainly not unique
to pharmaceutical production. For example, another process where the product material
and a reference material are measured is in the production of metallic pigments. Thin
metal plates are ground while bathed in oil to provide cooling and a suspension for the
fine metal particles, which are the final product. The key characteristic of the product is assessed by first measuring the particle size of a reference suspension with metal particles of known size, Xt, and then measuring a sample of the suspension generated by the base
process, Yt. The objective of measuring the reference suspension first is to verify the
performance of the measurement process. In both examples the overall process and the
measurement process observations are autocorrelated. Another motivation for this work
is to present a practical monitoring scheme that can be used in an industrial setting. In
fact, we believe that the most important contribution of this work is to provide a practical
perspective for a user of statistical process control (SPC) in a typical manufacturing
operation with autocorrelated output; that is, we provide a usable methodology for
someone with knowledge of basic process improvement principles, SPC control charts,
and regression analysis such as those holding Six Sigma Green or Black Belts.
This chapter is organized as follows. In Section 1.1 we discuss control charts in
general and how autocorrelation affects their performance. In Section 1.2 we summarize
the different methods that have been proposed to monitor processes with an autocorrelated output, but with an unobserved measurement error. In Section 1.3 we discuss possible ways to monitor the base process. In Section 1.4 we summarize and discuss our findings. The specific contributions made in this chapter are:
• A synthesis of the methods that have been proposed to monitor processes with an
autocorrelated output over the last 25+ years.
• A summary and a discussion of the issues that affect the performance of control
charts when data are correlated.
• A discussion of different approaches to monitor processes with autocorrelated output
and an observable autocorrelated measurement error.
The remainder of the dissertation is organized as follows. In Chapter 2 we
discuss the characteristics of a successful monitoring scheme and, based on these, we
propose a monitoring scheme for the measurement process and the base process. In
Chapter 3 we develop the expressions that describe the behavior of the observations, forecasts, and residuals after the process is affected by different types of disturbances.
The implementation of the monitoring of the measurement process is illustrated in
Chapter 4 and in Chapter 5 we illustrate the implementation of the monitoring of the base
process. In Chapter 6 we present a simulation study to investigate the detection
capability of the proposed monitoring scheme. Finally, in Chapter 7 we summarize our
findings, establish conclusions, and suggest future research.
1.1 Control Charts and Autocorrelated Observations
A desired property of a process is that it is in a state of control. A process is said
to be in a state of control, or stable, when based on past observations the process output
characteristic can be expected to vary within specified limits (Alwan and Roberts, 1988).
That is, the location, variation, and shape of the distribution of the observations of a
process output characteristic remain constant over time. The measurement of the output
characteristic at time t, Xt, of a stable process can be represented as
$$X_t = \mu + \varepsilon_t, \tag{1.1}$$
where μ represents the mean of the process output and $\varepsilon_t$ is a sequence of independent
and identically distributed random variables with $E[\varepsilon_t] = 0$ and $\mathrm{Var}[\varepsilon_t] = \sigma_\varepsilon^2$ (Harris and
Ross, 1991; Montgomery and Mastrangelo, 1991; Box and Luceño, 1997, p. 13).
However, in most cases only Xt is observed; therefore, a stable process will have $E[X_t] = \mu$
and $\mathrm{Var}[X_t] = \sigma_X^2 = \sigma_\varepsilon^2$.
Statistical Process Control (SPC) is the methodology used to achieve process
stability. SPC is generally implemented in two phases. In Phase I data are collected
when the process is believed to be stable. These data are used to calculate control limits
and construct control charts to monitor the location and the dispersion of the distribution
of the output characteristic.
In Phase II, the process is monitored using the control charts with the control
limits calculated in Phase I. If an observation violates a Special-Cause Rule (SCR), the
process is said to be out of control. Table 1 summarizes the SCRs (Nelson, 1984). SCR
1 is always applied. The other rules are applied according to the effect of potential
special causes on the quality characteristic Y. For example, SCR 3 may be implemented
to accelerate the detection of a change in the mean of the process, μ.
If an observation is larger than the upper control limit (UCL), smaller than the
lower control limit (LCL), or violates a specified SCR, the process is said to be out of
control. Generally, when an out-of-control condition is detected, the process is stopped,
the cause of the out-of-control condition is identified and permanently eliminated, and
the process operation is resumed assuming that the process has been returned to its original state of control (Woodall and Montgomery, 1999; Montgomery, 1996; Jensen et
al., 2006).
Table 1. Description of the Special-Cause Rules
Special-Cause Rule No.   Description
1                        One point outside the control limits
2                        Nine points in a row above or below the center line
3                        Six points in a row steadily increasing or decreasing
4                        Fourteen points in a row alternating up and down
5                        Two out of three points in the region between 2 and 3 standard deviations above or below the center line
6                        Four out of five points in a row in the region beyond 1 standard deviation above or below the center line
7                        Fifteen points in a row within ±1 standard deviation of the center line
8                        Eight points in a row on both sides of the center line with none within ±1 standard deviation of the center line
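As an illustration only, the first two rules of Table 1 can be checked mechanically. The following Python sketch is not part of the dissertation's SAS code; the function names `scr1` and `scr2`, the control limits, and the data are hypothetical and chosen only to make the rules concrete.

```python
def scr1(values, lcl, ucl):
    """Indices of points outside the control limits (SCR 1)."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


def scr2(values, center, run=9):
    """Indices at which a run of `run` or more points on one side of the center line is present (SCR 2)."""
    signals = []
    streak = 0   # length of the current same-side run
    side = 0     # +1 above the center line, -1 below, 0 exactly on it
    for i, v in enumerate(values):
        s = (v > center) - (v < center)
        if s != 0 and s == side:
            streak += 1
        else:
            streak = 1 if s != 0 else 0
        side = s
        if streak >= run:
            signals.append(i)
    return signals


# Hypothetical data: alternating in-control noise, then a sustained upward shift.
data = [0.2, -0.4, 0.1, -0.3, 0.5, -0.1,
        0.6, 0.9, 0.4, 1.1, 0.7, 0.8, 0.5, 1.0, 0.6, 3.4]

print("SCR 1 signals at indices:", scr1(data, lcl=-3.0, ucl=3.0))
print("SCR 2 signals at indices:", scr2(data, center=0.0))
```

In this made-up series the sustained shift is flagged by SCR 2 one observation before any point leaves the control limits, which is the kind of accelerated detection that the supplementary rules are meant to provide.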
A control chart is a time-ordered plot of a statistic of the observed data (means, a
single observation, standard deviations, ranges, etc.) with a fixed upper control limit,
lower control limit, and center line (CL). The UCL and LCL are calculated such that 99 to 100% of the observations fall between these limits when the process is stable. Control
charts are used in pairs. One control chart monitors the location of the process (e.g. the
mean) and the second chart monitors the variation of the process (e.g. the standard
deviation). The CL is just the mean of the plotted statistic. The control limits
are traditionally placed 3 standard deviations above and below the mean. The generic
definition of the control limits of a control chart is:
$$\begin{aligned} \mathrm{UCL} &= \mu + 3\sigma_\varepsilon \\ \mathrm{CL} &= \mu \\ \mathrm{LCL} &= \mu - 3\sigma_\varepsilon \end{aligned} \tag{1.2}$$
where μ is the mean and σε is the standard deviation of the plotted statistic. For example, if the plotted statistic is normally distributed, then about 99.73% of the values will be within the UCL and LCL calculated using (1.2).
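The 99.73% figure can be verified numerically. The short Python sketch below assumes a normally distributed, independent plotted statistic; the helper names `coverage` and `control_limits` and the numerical inputs are ours, introduced only for illustration. It also reports the reciprocal of the false-alarm probability, which under these assumptions is the in-control average run length (ARL).

```python
import math


def coverage(k):
    """P(mu - k*sigma <= statistic <= mu + k*sigma) for a normally distributed statistic."""
    return math.erf(k / math.sqrt(2.0))


def control_limits(mu, sigma, k=3.0):
    """Return (LCL, CL, UCL) following the generic definition in (1.2)."""
    return mu - k * sigma, mu, mu + k * sigma


p_inside = coverage(3.0)
in_control_arl = 1.0 / (1.0 - p_inside)   # valid only for independent observations

print("limits for mu = 10, sigma = 2:", control_limits(10.0, 2.0))
print("coverage of 3-sigma limits (%):", round(100 * p_inside, 2))
print("in-control ARL:", round(in_control_arl, 1))
```

The ARL calculation relies on independence between plotted values, which is precisely the assumption that fails for autocorrelated output.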
This specification of the control limits assumes that the values of the statistic
plotted on a control chart are randomly selected from a common distribution with a mean
μ and a standard deviation σε. That is, the plotted values of the statistic are independent
and identically distributed (IID). The data collected during Phase I are used to obtain
estimates of μ and σε. In Phase II the estimates of μ and σε are traditionally used as if
they were fixed and equal to the actual parameters of the distribution of the plotted
statistic; i.e. the estimates of μ and σε are not updated using Phase II observations.
If the values of the plotted statistic are not independent, a higher number of false
out-of-control signals are observed (Faltin, Mastrangelo, Runger, Ryan, 1997). For
example, Maragah (1989) showed that for an autoregressive process of order 1, AR(1),
the control limits are narrower with positive autocorrelation and wider with negative
autocorrelation than the control limits for independent data, leading, respectively, to a higher or lower number of false out-of-control signals than for a process with independent observations.
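A small simulation can make this effect visible. The sketch below is not a reproduction of Maragah's study: it assumes an AR(1) process with standard normal shocks, estimates the standard deviation from the average moving range (using the conventional constant d2 = 1.128), and counts the fraction of in-control points that fall outside the resulting 3-sigma limits. The values of φ, the series length, and the seed are arbitrary choices.

```python
import random
import statistics


def simulate_ar1(phi, n, rng, burn_in=200):
    """Zero-mean AR(1) series x_t = phi * x_{t-1} + a_t with standard normal shocks."""
    x, out = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out


def false_alarm_rate(series):
    """Fraction of points outside I-chart limits computed from the average moving range."""
    mr_bar = statistics.fmean(abs(b - a) for a, b in zip(series, series[1:]))
    sigma_hat = mr_bar / 1.128            # d2 constant for moving ranges of span 2
    center = statistics.fmean(series)
    lcl, ucl = center - 3 * sigma_hat, center + 3 * sigma_hat
    return sum(v < lcl or v > ucl for v in series) / len(series)


rng = random.Random(2008)                 # arbitrary seed for reproducibility
rates = {phi: false_alarm_rate(simulate_ar1(phi, 50_000, rng))
         for phi in (-0.7, 0.0, 0.7)}

for phi, rate in rates.items():
    print(f"phi = {phi:+.1f}: fraction outside limits = {rate:.4f}")
```

Under these assumptions the moving range underestimates the process standard deviation when φ is positive and overestimates it when φ is negative, so the in-control signal rate moves away from the nominal value in opposite directions.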
There are processes where a single measurement of the quality characteristic (n =
1) is sufficient to assess the performance of the process. The pharmaceutical process that motivated our research is an example of this. In this case the statistic to plot on the
control chart to monitor the mean is the individual observation (the Individuals control
chart, or I-chart), and the moving range is plotted on a second chart to monitor the process
variation (the moving range chart, or MR-chart). There are other processes where
more than one measurement (n > 1) is needed to assess their performance, for example
measuring five consecutive pieces of product or measuring the product on five different
locations (n = 5). In these cases, the mean of the n observations is plotted on the control
chart to monitor the process location (X-bar control chart), and the standard deviation of
the n observations on the control chart to monitor the process variation (S control chart).
If the individual observations or the means are autocorrelated and the control limits are calculated using (1.2), then the I-chart or the X-bar chart will either generate more false alarms than expected or fail to detect real out-of-control situations.
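For readers less familiar with charts for individual observations, the following Python sketch shows one conventional way to compute I-chart and MR-chart limits from Phase I data. The constants 2.66 (that is, 3/1.128) and 3.267 are the usual textbook values for moving ranges of span 2 and are not taken from this dissertation; the function name `i_mr_limits` and the data are hypothetical.

```python
def i_mr_limits(x):
    """Return (LCL, CL, UCL) for the I-chart and the MR-chart from individual observations."""
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    x_bar = sum(x) / len(x)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "I": (x_bar - 2.66 * mr_bar, x_bar, x_bar + 2.66 * mr_bar),
        "MR": (0.0, mr_bar, 3.267 * mr_bar),
    }


# Hypothetical Phase I observations (n = 1 per time period).
phase1 = [10.0, 10.4, 9.8, 10.1, 10.3]
limits = i_mr_limits(phase1)

print("I-chart  (LCL, CL, UCL):", limits["I"])
print("MR-chart (LCL, CL, UCL):", limits["MR"])
```

These limits presume independent observations; with autocorrelated data they inherit the problems described above.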
1.2 Schemes to Monitor Processes with Autocorrelated Output with Unobserved Measurement Error
A plethora of approaches have been proposed to deal with processes that naturally generate serially dependent observations, where the measurement error is not considered and where the removal of the sources of autocorrelation is not practical or economically
feasible. Frequent sampling of continuous processes, for example water treatment
(Berthouex, Hunter, and Pallesen, 1978), nuclear reactions (Ermer, Chow, and Wu, 1979),
and clean room air purification (Ramírez, 1998), generates serially correlated data. Also, batch processes with carryover effects, like
biochemical processes (Winkel and Zhang, 2004) or measurement processes such as the
one in our motivating pharmaceutical process, generate serially correlated observations.
There have been a large number of methods proposed to monitor processes with
autocorrelated observations. These methods have been classified as residual-based
control charts or observation-based control charts with modified control limits (Lu and
Reynolds, 1999a; Lee, 2004). A more discriminating classification of methods proposed for this problem seems to be in order. Specifically, we distinguish between methods that reduce, remove, or account for autocorrelation. We now summarize these methods. In the following, refer to the taxonomy presented in Table 2.
1.2.1 Methods that Reduce or Remove Autocorrelation
In our discussion of the methods summarized in Table 2, we shall refer to the
author and date of publication, as usual, as well as the reference number used in the table.
The references are organized alphabetically at the end of the dissertation, and
chronologically in Appendix A. Yashchin (1993, 21) proposed monitoring a sequence of
score transforms with a cumulative sum (CUSUM) chart. These scores are the logarithm
of the ratio of the on- and off-target density functions, which substantially reduce the
magnitude of the serial correlation.
Runger and Willemain (1995, 27) proposed two monitoring schemes for
processes that generate a large amount of serial data. The first scheme constructs
weighted averages of consecutive data that are uncorrelated. The weights are derived
based on the underlying autoregressive integrated moving average (ARIMA) process and
the batch size is selected to detect a specified shift in the process mean. In the second
scheme, called un-weighted batch means, consecutive data are grouped into batches. The
batch size is selected to reduce the lag-1 autocorrelation of the means to 0.1 or less. The
advantage of this approach is that a time series model is not needed. Similarly,
Willemain and Runger (1995, 27) introduced a monitoring scheme based on the run
lengths of observations above and below a target value. Since these run lengths are IID,
any of the standard control charts for IID data (Shewhart, CUSUM, exponentially weighted moving average (EWMA)) can be used to monitor the mean of the process.
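The batch-size rule of the un-weighted batch means scheme can be sketched as follows. This is only an illustration under stated assumptions: the function names, the requirement of at least 20 batches, and the simple increasing search over batch sizes are hypothetical choices made here, not details taken from Runger and Willemain.

```python
from statistics import fmean

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    m = fmean(x)
    den = sum((v - m) ** 2 for v in x)
    if den == 0:
        return 0.0
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    return num / den

def batch_means(x, size):
    """Means of consecutive, non-overlapping batches; a trailing partial batch is dropped."""
    n_batches = len(x) // size
    return [fmean(x[i * size:(i + 1) * size]) for i in range(n_batches)]

def smallest_batch_size(x, max_r1=0.1, min_batches=20):
    """Smallest batch size whose batch means have lag-1 autocorrelation <= max_r1.

    Returns None when no batch size leaving at least min_batches batches qualifies
    (min_batches is an assumption of this sketch).
    """
    size = 1
    while len(x) // size >= min_batches:
        if lag1_autocorr(batch_means(x, size)) <= max_r1:
            return size
        size += 1
    return None
```

The resulting batch means would then be plotted on a standard chart for IID data; no time series model is fitted at any point.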
Table 2. Classification of Methods to Monitor Processes with Autocorrelated Data (numbers in parentheses are the reference numbers in Appendix A; for an explanation of the acronyms used in the table, see the accompanying text).

Approach: Reduce/Remove Autocorrelation
    Transformation (21)
    Run Sums (42)
    Run Lengths (42)
    Weighted batch means (28)
    Un-weighted batch means (28)

Approach: Account for Autocorrelation
  Observation-based Methods
    Adjust X, X-bar, S chart limits (5, 15, 26, 49)
    Adjust CUSUM (2, 13, 27, 33, 41, 53, 56)
    Adjust EWMA (26, 33, 39, 45)
    ARMA chart (48)
    T2 control chart (55)
    EWMV and EWMS (22)
  Residual-based Methods
    Fit ARIMA - Residual & forecast (7, 14)
    Fit EWMA - Residual forecast & Observed (16)
    Fit ARIMA - CSE (29)
    Fit ARIMA - Generalized likelihood ratio test chart (44)
    Fit ARIMA - log res2, res2 charts (46)
    Fit PID model - PID control chart (58)
    Fit ARIMA - Combined X-bar and S2 EWMA charts (59)
    Fit ARIMA - Worst-case EWMA (62, 65)
    Fit ARIMA - Reverse moving average chart (63)
    Fit ARIMA - MCAP for mean and variance (66)
    Fit ARIMA - OSLF chart (70)
    Fit ARIMA - OGLF chart (71)
  Specialized Applications
    Spectral control chart (8)
    Detect outliers or level shifts (40)
    Detect changes in autocorrelation structure (57)
    Adaptive EWMA (61)
Also, Willemain and Runger (1998, 41) proposed the use of run sums to monitor the mean of a process with autocorrelated data. The performance of control charts of run sums is equivalent to that of control charts of IID data.
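To make the run-based statistics concrete, the sketch below splits a series into runs above and below a target and reports the length and the sum of each run. It assumes that a run sum is the sum of the deviations from the target within a run and that an observation exactly on the target stays in the current run; both are assumptions of this illustration rather than definitions quoted from Willemain and Runger.

```python
def runs_about_target(x, target):
    """Split x into maximal runs of observations on the same side of target.

    Returns a list of (side, length, run_sum) tuples, where side is +1 (above)
    or -1 (below) and run_sum is the sum of deviations from target in the run.
    """
    runs = []
    side, length, total = 0, 0, 0.0
    for value in x:
        dev = value - target
        if dev > 0:
            new_side = 1
        elif dev < 0:
            new_side = -1
        else:
            new_side = side or 1  # tie: stay in the current run (assumption)
        if length > 0 and new_side != side:
            runs.append((side, length, total))  # close the previous run
            length, total = 0, 0.0
        side = new_side
        length += 1
        total += dev
    if length > 0:
        runs.append((side, length, total))
    return runs
```

The sequence of run lengths (second element) or run sums (third element) is what would be plotted on a chart designed for IID data.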
The proponents of the methods to reduce or eliminate autocorrelation claim that their schemes perform better than the schemes that use a Shewhart chart of individual forecast errors from an autoregressive moving average (ARMA) model. However,
Yashchin’s (1993, 21) approach is cumbersome and requires simulation to tune the parameters of his CUSUM chart. Some of the methods proposed by Willemain and
Runger (1995, 27) have the advantage of not requiring the fit of an ARIMA model, but the batching of consecutive observations and the counting or summing of run lengths hide the actual dynamic behavior of the process. Furthermore, the statistic plotted on the control chart cannot be interpreted directly to identify the potential cause of out-of-control conditions.
1.2.2 Methods that Account for Autocorrelation
These methods can be further classified as observation-based, residual-based, or specialized monitoring schemes.
1.2.2.1 Observation-based methods. These methods plot statistics of the observed data on control charts with control limits adjusted for the autocorrelation of the data.
Modifications of the individuals (X), mean ($\bar{X}$), and standard deviation (S) Shewhart charts have been proposed by Vasilopoulos and Stamboulis (1978, 5), English,
Krishnamurti, and Sastri (1991, 15), and Kramer and Schmid (2000, 48). Also, Wardell,
Moskowitz, and Plante (1994b, 25) suggest adding a likelihood ratio statistic to an X
chart to aid in deciding if a point that violates the SCR 1 represents an actual out-of-control
condition.
Procedures to design observation-based CUSUM charts to detect specific changes
in the mean for specific models have been proposed by Johnson and Bagshaw (1975, 2),
Harris and Ross (1991, 13), Runger, Willemain, and Prabhu (1995, 26), VanBrackle and
Reynolds (1997, 32), Timmer, Pignatiello, and Longnecker (1998, 40), Lu and Reynolds
(2001, 52), and Atienza, Tang, and Ang (2002a, 55).
Similarly, modifications of the standard, observation-based EWMA chart to
account for specific autocorrelation were proposed by Wardell, Moskowitz, and Plante
(1994b, 25), VanBrackle and Reynolds (1997, 32), Zhang (1998, 38), and Lu and
Reynolds (1999a, 44).
Jiang, Tsui, and Woodall (2000, 47) introduced an observation-based ARMA
chart that plots a statistic based on the first-order autoregressive moving average, or
ARMA(1, 1), model. The variance of the autocorrelated process at different lags is used
to compute the control limits. Also, Apley and Tsung (2002, 54) proposed the
autoregressive $T^2$ chart. This control chart monitors Hotelling's $T^2$ statistic
constructed from vectors of a specified size of successive observations, which are updated
as new observations are obtained. The upper control limit is the $100(1 - \alpha)$th percentile of a $\chi^2$ distribution with degrees of freedom equal to the dimension of the vectors.
MacGregor and Harris (1993, 22) presented the exponentially weighted moving variance (EWMV) and the exponentially weighted root mean square (EWRMS) charts to monitor the variation of a process that generates individual autocorrelated observations.
These charts use statistics based on the squared deviation of an observation from the
mean of the process. The EWRMS statistic uses the squared deviation of the
observations from the known process mean or from a specified target value. The EWMV
statistic uses the squared deviations of the observations from an estimate of the process
mean.
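The difference between the two statistics can be illustrated with a short sketch. The recursions below are one plausible reading of the description above and are not quoted from MacGregor and Harris: the smoothing constants, the starting values, and the choice to take the EWMV deviation from the mean estimate available before the current observation are all assumptions of this example.

```python
import math

def ewrms(x, target, s2_0, lam=0.1):
    """EWRMS sketch: root of an exponentially weighted mean of squared
    deviations from a known mean or target. s2_0 is the starting value."""
    out, s2 = [], s2_0
    for value in x:
        s2 = (1 - lam) * s2 + lam * (value - target) ** 2
        out.append(math.sqrt(s2))
    return out

def ewmv(x, m0, v0, lam_mean=0.1, lam_var=0.1):
    """EWMV sketch: exponentially weighted mean of squared deviations from an
    EWMA estimate of the process mean (m0 and v0 are starting values)."""
    out, mean_est, var_est = [], m0, v0
    for value in x:
        # Deviation from the mean estimate formed before this observation (assumption).
        var_est = (1 - lam_var) * var_est + lam_var * (value - mean_est) ** 2
        mean_est = (1 - lam_mean) * mean_est + lam_mean * value
        out.append(var_est)
    return out
```

Because the EWMV tracks the mean as it goes, a slow drift in level inflates the EWRMS statistic more than the EWMV statistic.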
The main advantage of the observation-based methods is that there is no need to
fit a model to represent the autocorrelation of the data. Another advantage is that the
actual observations or statistics based on the observations are displayed on the control chart, facilitating the interpretation of out-of-control signals. However, as Alwan (1991,
14) points out, the adjustments of the control limits are devised for specific autocorrelation structures and cannot be generalized. Also, the modified control limits of
Shewhart-type charts consider only violations of SCR 1; because of the nonrandom behavior of the data plotted on the control chart, using the other SCRs would likely generate false alarms.
1.2.2.2 Residual-based methods. The common characteristic of these methods is
the use of an ARIMA or other time-dependent model to generate a sequence of IID lead-one,
or one-step-ahead, forecast errors or residuals (from here on the term residual will
refer to the lead-one forecast error). These residuals are plotted on control charts for IID
data. Several authors have reported the use of this approach (Ermer, Chow, and Wu,
1979; Berthouex, Hunter, and Pallesen, 1978, 4; Notohardjono and Ermer, 1986, 6); however,
most of the recent papers refer to the approach proposed by Alwan and Roberts (1988, 7).
The method proposed by these authors consists of the following steps: 1) fit an
ARIMA model (Box, Jenkins, and Reinsel, 1994); 2) construct the common-cause (CC)
chart, which is a time-ordered plot of the lead-one forecasts, $\hat{X}_{t-1}(1)$; and 3) construct
the special-cause control (SCC) chart, which is an individuals Shewhart control chart of
the residuals, $e_t = X_t - \hat{X}_{t-1}(1)$, with control limits calculated using (2.1).
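The three steps can be sketched in a few lines. This is a simplified illustration, not the procedure of Alwan and Roberts itself: an AR(1) model fitted by least squares stands in for a general ARIMA fit, and the SCC limits are taken as the residual mean plus or minus three times the average moving range divided by 1.128, the usual individuals-chart estimate, because the limits of (2.1) are not reproduced in this section. The function names are hypothetical.

```python
from statistics import fmean

def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi)."""
    prev, curr = x[:-1], x[1:]
    mp, mc = fmean(prev), fmean(curr)
    sxx = sum((p - mp) ** 2 for p in prev)
    sxy = sum((p - mp) * (q - mc) for p, q in zip(prev, curr))
    phi = sxy / sxx
    return mc - phi * mp, phi

def cc_scc_charts(x, k=3.0):
    """Return the series plotted on the CC and SCC charts plus SCC limits."""
    c, phi = fit_ar1(x)                                      # step 1: fit the model
    forecasts = [c + phi * p for p in x[:-1]]                # step 2: CC chart values
    residuals = [v - f for v, f in zip(x[1:], forecasts)]    # step 3: SCC chart values
    moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
    sigma = fmean(moving_ranges) / 1.128                     # d2 for ranges of two points
    center = fmean(residuals)
    lcl, ucl = center - k * sigma, center + k * sigma
    # residuals[i] belongs to time i + 1, since the first point has no forecast
    signals = [i + 1 for i, e in enumerate(residuals) if not lcl <= e <= ucl]
    return {"forecasts": forecasts, "residuals": residuals,
            "lcl": lcl, "ucl": ucl, "signals": signals}
```

The CC chart is read as a picture of the predictable movement of the process, while points in `signals` mark residuals that the fitted model does not explain.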
Alwan (1991, 14) presented an example of the construction of the CCC and SCC
charts and proposed the use of a single control chart, instead of two, of lead-one forecasts with
control limits calculated as in (2.2), but with μ taken to be equal to $\hat{X}_{t-1}(1)$ and $\sigma_\varepsilon$ estimated by the standard deviation of the residuals.
Montgomery and Mastrangelo (1991, 16) suggested a similar approach as Alwan
and Roberts (1988, 7), but they proposed optimizing the parameter of an EWMA forecast
to approximate the lead-one forecasts of the actual ARIMA model that describes the
process. This approximation is adequate for processes with positive autocorrelation and a
slowly moving mean. They suggested constructing two control charts. The first is a time-ordered
plot of the observations, their corresponding EWMA forecasts, Zt, and control
limits constructed around the forecasted value Zt. These control limits are the same as
those proposed by Alwan (1991, 14). The second chart is the same as the SCC chart
proposed by Alwan and Roberts (1988, 7).
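A minimal sketch of the EWMA-forecast idea follows. It assumes that "optimizing the parameter" means choosing the smoothing constant that minimizes the sum of squared one-step forecast errors over a grid, and that the limits around each forecast are plus or minus three times the root mean squared forecast error; the grid, the starting value, and the function names are illustrative assumptions.

```python
import math

def ewma_forecasts(x, lam, z0=None):
    """One-step-ahead EWMA forecasts: element t is the forecast of x[t]
    made before x[t] is observed. The first forecast defaults to x[0]."""
    z = x[0] if z0 is None else z0
    forecasts = []
    for value in x:
        forecasts.append(z)
        z = lam * value + (1 - lam) * z
    return forecasts

def sum_squared_errors(x, lam):
    return sum((v - f) ** 2 for v, f in zip(x, ewma_forecasts(x, lam)))

def best_lambda(x, grid=None):
    """Grid search for the smoothing constant with the smallest squared error."""
    grid = grid or [i / 100 for i in range(1, 101)]
    return min(grid, key=lambda lam: sum_squared_errors(x, lam))

def forecast_limits(x, lam, k=3.0):
    """Moving control limits centred on each EWMA forecast."""
    sigma = math.sqrt(sum_squared_errors(x, lam) / len(x))
    return [(f - k * sigma, f + k * sigma) for f in ewma_forecasts(x, lam)]
```

The forecast errors `x[t] - ewma_forecasts(x, lam)[t]` play the role of the residuals on the second chart.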
Several control charts of residuals have been proposed in an effort to detect small
shifts of the process mean and take advantage of its effect on the residuals (this is
discussed in detail in Chapter 3) or changes in the variance for different ARMA or IMA
(integrated moving average) models.
• CSE Chart – Lin and Adams (1996, 28) proposed a combined residual-based
Shewhart and EWMA (CSE) chart. The Shewhart chart detects large changes in the
residuals that occur immediately after a shift in the process mean and the EWMA
detects smaller changes in the residuals that occur several observations after the
mean shift.
• GLRT Chart – Apley and Shi (1999, 43) suggested the use of the residual-based
generalized likelihood ratio test (GLRT) control chart. The GLRT statistic plotted
on the chart is computed from the fault signature that accounts for the transient
behavior of the residuals after a change in the process mean has occurred.
• log(e²) EWMA Chart – Lu and Reynolds (1999b, 45) used the EWMA chart of the
logarithm of the squared residuals, as well as an individuals chart of the squared
residuals, to monitor the variance of a process with autocorrelated output.
• PID Chart – Jiang, Wu, Tsung, Nair, and Tsui (2002, 57) introduced the
Proportional-Integral-Derivative (PID) chart based on the three-term controller
equation. The PID statistic consists of the proportional (P) term, which considers the
direct effect of the forecast error; the integral (I) term, which considers the additive effect of
current and previous forecast errors; and the derivative (D) term, which accounts for the
effect due to the difference between the current and the previous forecast error (see
Box, Jenkins, and Reinsel, 1994; p. 493). The authors “tune” the PID statistic to
detect a specific change in the mean and to achieve a particular in-control average
run length.
• $\bar{X}$-$S^2$ EWMA Chart – This chart was introduced by Knoth and Schmid (2002, 58)
to monitor the mean and the variance of a process that generates autocorrelated
observations. This approach plots the EWMA statistic of the mean of the residuals
and the EWMA of the sample variance of the residuals on the same chart. Two-sided
control limits are used for the EWMA chart on the mean and an upper control
limit is used for the EWMA on the variance. If either EWMA statistic exceeds the
corresponding control limits, then the process is declared out of control.
• Worst-Case EWMA Chart – Apley and Lee (2003, 61) and Lee (2004, 64) presented
a modified residuals-based EWMA chart to account for the uncertainty of the
estimated parameters of the ARMA model fitted to the data. This worst-case EWMA
chart has wider limits than the EWMA of residuals based on the assumption that
there is no estimation error.
• RMA Chart – Dyer, Adams, and Conerly (2003, 62) proposed the reverse moving
average (RMA) control chart. The RMA is an average of previous residuals. The
number of residuals to average is determined through simulations of the observations
from a specified ARMA model and for specific changes in the process mean.
• MCAP Chart – The Max-CUSUM for autocorrelated processes or MCAP chart was
introduced by Cheng and Thaga (2005, 65) to simultaneously monitor the mean and
the standard deviation. They calculate a statistic, Zi, for the mean that is equal to the
Z-score based on the mean and the standard deviation of the stable process. Also,
they calculate a statistic, Yi, for the standard deviation. This statistic is equal to the
inverse function of the normal cumulative distribution of the ratio of the mean-
squared of the residuals and the standard deviation of the stable process. Then,
CUSUM statistics for Zi and Yi are computed. Since Zi and Yi are independent and
have the same distribution, a single statistic, defined as the maximum of the Zi and Yi
CUSUM statistics, is plotted on the CUSUM chart that is used to monitor the
process.
• OSLF Chart – Chin and Apley (2006, 69) proposed the optimal second-order linear
filter (OSLF) control chart. The statistic plotted on the chart corresponds to an
ARMA(2, 1) filter of the residuals from an ARMA model. The parameters of the
filter are obtained by a numerical minimization of the out-of-control average run length (ARL) for a
specified in-control ARL and for a specific change in the process mean (shifts,
spikes, and sinusoidal changes).
• OGLF Charts – The optimal generalized linear filter (OGLF) control chart was also
introduced by Apley and Chin (2007, 70). The statistic plotted on the control chart is
a truncated sum of the weighted residuals. The weights are found by the numerical
minimization of the out-of-control ARL for a specified in-control ARL and for a
specific shift in the process mean.
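As an illustration of how the residual-based charts in this list combine statistics, the CSE idea of Lin and Adams can be sketched as below. The sketch assumes that the residual standard deviation `sigma` is known, that the residuals are centred at zero, and that the EWMA uses its asymptotic limits; the default constants are illustrative and are not the design values recommended by the authors.

```python
import math

def cse_signals(residuals, sigma, lam=0.2, k_shewhart=3.0, k_ewma=3.0):
    """Combined Shewhart-EWMA check on residuals.

    Returns a list of (index, source) pairs, where source names the chart
    that signalled: 'shewhart', 'ewma', or 'both'.
    """
    shewhart_limit = k_shewhart * sigma
    ewma_limit = k_ewma * sigma * math.sqrt(lam / (2 - lam))  # asymptotic EWMA limit
    z, out = 0.0, []
    for t, e in enumerate(residuals):
        z = lam * e + (1 - lam) * z
        by_shewhart = abs(e) > shewhart_limit
        by_ewma = abs(z) > ewma_limit
        if by_shewhart and by_ewma:
            out.append((t, "both"))
        elif by_shewhart:
            out.append((t, "shewhart"))
        elif by_ewma:
            out.append((t, "ewma"))
    return out
```

A single large residual is caught by the Shewhart limit, whereas a run of moderately shifted residuals accumulates in the EWMA and signals a few observations later.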
1.2.2.3 Specialized monitoring schemes. These are methods that monitor specific characteristics of a process.
• Spectral Chart – This chart was introduced by Beneke, Leemis, Schlegel, and Foote
(1988, 8). It is based on the periodogram, and its objective is to detect cyclical
patterns in the process data.
• λLS Chart – Atienza, Tang, and Ang (1998, 39) proposed plotting test statistics,
computed using the estimated parameters of an ARMA model, which detect additive
outliers, innovative outliers, or level shifts.
• ACF and Q Charts – The autocorrelation function (ACF) chart and the Box-Ljung-Pierce
statistic (Q) chart were conceived by Atienza, Tang, and Ang (2002b, 56)
to detect changes in the stochastic model that describes the
autocorrelation of the process.
• AEWMA Chart – The adaptive EWMA chart devised by Nembhard and Kao (2003,
60) monitors the output of a process during a process transition (e.g. a warm-up
period). The EWMA parameter and the first-order autoregressive or AR(1) process
parameter are adjusted dynamically to represent the signal of the process during the
transition period. At the end of the transition the AR(1) parameter is restarted so that
the EWMA continues to monitor the output of the process.
At this point one is confronted with the difficult decision to select an appropriate
monitoring scheme. We will discuss this in Chapter 2.
1.3 Schemes to Monitor Processes with Autocorrelated Output and Observed Correlated Measurement Error
As mentioned above, the overall process observations, Yt, contain information
about the base process that generates the product, Pt, and the measurement process that
generates the measurement, $X_t^{(prod)}$ (see Figure 2). In processes like this the objective of
the monitoring scheme is to detect out-of-control conditions in the base process, as well as to maintain the measurement process under control.
[Figure 2 diagram: Base Process, Product, Measurement Process; signals Pt, $X_t^{(prod)}$, Yt]
Figure 2. Aggregation of the measurement process signal $X_t^{(prod)}$ and the intrinsic production process signal Pt into the output signal Yt.
It is important to note that Xt represents the signal of the measurement process
obtained from measuring the reference material and $X_t^{(prod)}$ represents the unobservable
contribution of the measurement process contained in the measurement of the product
material Yt.
The measurement of the reference material, Xt, is used to verify the performance
of the measurement process and it is assumed to represent the measurement error
contained in the measurement of the product.
The construction of an appropriate monitoring system requires that the
measurement process signal $X_t^{(prod)}$ either be removed from the overall output measurement or
be accounted for so that the behavior of the base process can be observed.
MacGregor and Harris (1993) and Lu and Reynolds (1999a) considered the
situation where the process output can be described as
$X_t = \mu_t + \varepsilon_t$. (1.3)
The mean of the random process, $\mu_t$, is represented by the first-order autoregressive or
AR(1) model
$(1 - \phi_X B)(\mu_t - \mu_X) = b_t$,
where $E[X_t] = \mu_X$ is the process mean or process level, $\phi_X$ is the first-order autoregressive
coefficient, and B is the backward shift operator, defined by $B X_t = X_{t-1}$. The $b_t$ are called
the random shocks and form a white noise series, that is, a sequence of independent
random variables with mean zero and variance $\sigma_b^2$. The $\varepsilon_t$ are also independent normal
random variables with mean zero and variance $\sigma_\varepsilon^2$ that represent the variation introduced
by the measurement system. It has been shown that an AR(1) process with added white
noise corresponds to a first-order autoregressive moving average, or ARMA(1, 1), process
(Box, Jenkins, and Reinsel, 1994; p. 174). Therefore, Xt in (1.3) can be represented by the model
$(1 - \phi_X B)(X_t - \eta) = (1 - \theta_X B) a_t$, (1.4)
where $\theta_X$ is the first-order moving average coefficient and the $a_t$ form a white noise series of
independent random shocks with mean zero and variance $\sigma_a^2$. Lu and Reynolds (1999a) used the standard deviation of the residuals from model (1.4) to calculate the control limits for the I-chart of residuals. The variance of the residuals from (1.4) is a function of
the variance of the random shocks, $\sigma_b^2$, and of the measurement error, $\sigma_\varepsilon^2$. However, to monitor the stability of the underlying AR(1) process, i.e. excluding the variation introduced by the measurement system, the I-chart should be constructed using the residuals from the AR(1) model with control limits calculated using the standard deviation of these residuals, $\sigma_b$. The I-chart used by Lu and Reynolds (1999a) would be appropriate when the measurement error is small compared to the process variance.
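The dependence of the residual variance on both variance components can be made explicit with a small calculation. Applying the AR(1) operator to (1.3) leaves the moving-average part b_t + ε_t − φ_X ε_{t−1}, whose lag-0 and lag-1 autocovariances must match those of (1 − θ_X B) a_t. The sketch below solves those two matching conditions; it is a standard derivation written out for illustration, and the function name is hypothetical.

```python
import math

def arma11_from_ar1_plus_noise(phi, var_b, var_eps):
    """Return (theta, var_a) of the ARMA(1, 1) form of an AR(1) process
    observed with independent white measurement noise.

    Matching conditions for w_t = b_t + eps_t - phi * eps_{t-1}:
        (1 + theta^2) * var_a = var_b + (1 + phi^2) * var_eps   (lag 0)
        theta * var_a         = phi * var_eps                    (lag 1)
    """
    g0 = var_b + (1 + phi ** 2) * var_eps
    g1 = phi * var_eps
    if g1 == 0:
        return 0.0, g0          # no moving-average term: pure AR(1) or white noise
    r = g1 / g0                 # equals theta / (1 + theta^2), so |r| <= 1/2
    disc = max(0.0, 1 - 4 * r * r)
    theta = (1 - math.sqrt(disc)) / (2 * r)   # invertible root, |theta| <= 1
    return theta, g1 / theta
```

For example, with φ_X = 0.8 and σ_b² = σ_ε² = 1 the function returns θ_X of roughly 0.34 and σ_a² of roughly 2.37, more than twice σ_b², which illustrates why limits based on the ARMA(1, 1) residuals are wider than limits based on σ_b alone.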
In the processes that we are considering the overall process observations are autocorrelated as well as the observations from the measurement process. The results of
Lu and Reynolds (1999a) are not applicable in this case. However, we will follow their approach to study the behavior of the residuals after special causes affect the base process
(see Chapter 3).
As far as we know there are no other methods proposed to monitor processes with autocorrelated output and with an observed autocorrelated measurement error that must be accommodated. Therefore, we will discuss three possible approaches to monitor the process described in Figure 1 and select one to monitor the base process.
The first approach consists of monitoring the measurement process to ensure that it is maintained in control. Then, an ARIMA model is fitted to the overall process
observations, Yt. The lead-one forecasts $\hat{Y}_{t-1}(1)$ and the residuals $Y_t - \hat{Y}_{t-1}(1)$ can be used to monitor the overall process output. This monitoring scheme will lead us to the situations described in Table 3 below.
If Xt is unrelated to $X_t^{(prod)}$, this scheme is not appropriate. The decisions in the
body of Table 3 are based on the assumption that Xt is at least proportional to $X_t^{(prod)}$. Also, from
Table 3 we see that when the control charts of the measurement and the overall processes detect an out-of-control condition at the same time, we cannot be entirely sure that the out-of-control condition detected in the overall process control chart is solely due to an out-of-control condition in the measurement process. This out-of-control condition could be due to both the base and the measurement process being out of control. In this situation, both processes need to be investigated to identify the process that is out of control. This monitoring scheme is not very practical and will not be considered further.
Table 3. Decisions when the Overall and the Measurement Processes are monitored separately

                          Overall process (Yt)
Measurement process (Xt)  In control                Out of control
In control                Base process in control   Base process is out of control
Out of control            Base process in control   Base process may be in control or may not be in control
A second approach is to obtain the signal from the base process by subtracting the
measurements of the reference material from the measurements of the product material, that is
$P_t = Y_t - X_t$. (1.5)
This approach requires the following two assumptions. First, the behavior of the
measurement process when the product material is measured must be the same as when
the reference material is measured; that is, $X_t^{(prod)} = X_t$ with probability one. Second, the
base process must be independent of the measurement process.
If the foregoing assumptions hold, then an ARIMA model is fitted to the values
obtained from (1.5). The lead-one forecasts, $\hat{P}_{t-1}(1)$, and the residuals $P_t - \hat{P}_{t-1}(1)$ can be used to monitor the base process. If an out-of-control condition is detected by these control charts, then we know that the base process is out of control.
This approach is not appropriate if $X_t^{(prod)}$ is not equal to Xt with probability one, since use of equation (1.5) implicitly assumes that
$P_t = Y_t - X_t^{(prod)} = Y_t - X_t$.
A third approach to monitor the base process is to provide for a more general
relationship between $X_t^{(prod)}$ and Xt. Ideally we would determine the bias and precision
when the product and reference material are measured. This allows us to model the
relationship as
$X_t^{(prod)} = \kappa_1 + \kappa_2 X_t$, (1.6)
where $\kappa_1$ corrects for the difference in bias and $\kappa_2$ for the difference in precision. This
expression then can be used to obtain the base process signal
$P_t = Y_t - (\kappa_1 + \kappa_2 X_t)$. (1.7)
In many cases (1.6) cannot be established because tests are destructive or because they are expensive. In this work we will assume that relationship (1.6) cannot be established and that the only information available is the autocorrelated data from the measurements of the product and the reference material.
This situation forces us to view the problem differently. A relationship similar to
(1.7) can be established by recognizing that the measurements of the reference and product material are not equally affected by the transitory and non-transitory behavior of the measurement process, and that the measurement of the reference material can influence the measurement of the product material (or vice versa if the order of measurement is inverted). The consecutive measurement of a reference material and a product material causes a dependency between the two values. Such a dependency arises, for example, when the depth of an amorphous metal connector in a semiconductor is measured by running a needle over the surface and measuring the drop of the needle. As the needle moves over the surface it accumulates a minute amount of metal, so the following measurement is slightly biased because the size of the tip of the needle has changed, and the bias increases as more measurements are taken. Other examples of dependency between measurements occur when the raw material needed in the measurement is slowly depleted, when taking a measurement causes an unavoidable physical change (e.g. warm-up effects caused by the deterioration of an optical system), or when a chemical or biochemical contamination is introduced by measuring the reference material sample (e.g. when the same buffered solution is used to generate the reference and product samples). Furthermore, it may
be impractical or expensive to restore the measurement process to its original state after a
measurement is performed. Figure 3 depicts this view of the measurement process where
the shaded box indicates that the measurement of the reference material affects the
behavior of the measurement process when the product material is measured.
The third approach is to assume that the effect of the measurement of the reference material, Xt, on the measurement of the product, Yt, is linear, so that their
dependency can be represented as
$Y_t = v(B) X_t$, (1.8)
where $v(B) = v_0 + v_1 B + v_2 B^2 + \cdots$ is a discrete linear transfer function. The $v_j$ are called
the impulse response weights, and $\sum_{j=0}^{\infty} |v_j| < \infty$. It is possible that the dependency between the reference and product measurements is nonlinear, but we do not consider this situation.
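The action of the transfer function in (1.8) on a series can be shown with a short sketch. It assumes that the infinite sum is truncated to a finite list of impulse response weights and that values of Xt before the start of the record are treated as zero; both are simplifications made for illustration, and the function name is hypothetical.

```python
def apply_transfer_function(x, v):
    """Compute y[t] = sum_j v[j] * x[t - j] for a finite list of impulse
    response weights v, treating x before the start of the record as zero."""
    y = []
    for t in range(len(x)):
        n_terms = min(len(v), t + 1)
        y.append(sum(v[j] * x[t - j] for j in range(n_terms)))
    return y
```

Feeding in a unit impulse, `apply_transfer_function([1, 0, 0, 0], [0.5, 0.3, 0.1])`, returns the weights themselves followed by zero, which is why the v_j are called impulse response weights.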
[Figure 3 diagram: the reference material enters the measurement process and yields Xt; the product material then enters the changed measurement process and yields Yt.]
Figure 3. Change in the measurement process from measuring the reference materials.
Assuming that the reference material is always measured before the product
materials, then (1.8) is a causal transfer function, meaning that Xt affects Yt (or vice
versa, if the product material is measured before the reference material), and that if Xt is
stationary, Yt is also stationary (Zhang, 1997). If Xt is not stationary, it can be
transformed into a stationary process.
The transfer function in (1.8) can be parsimoniously parameterized as (Box,
Jenkins, and Reinsel, 1994; p. 415)
$Y_t = \frac{\omega(B) B^b}{\delta(B)} X_t$, (1.9)