Machine Learning for Automated Anomaly Detection in Semiconductor Manufacturing
by
Michael Daniel DeLaus

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, June 2019

© Massachusetts Institute of Technology 2019. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, May 24, 2019
Certified by: Duane S. Boning, Clarence J. LeBel Professor, Electrical Engineering and Computer Science, Thesis Supervisor
Accepted by: Katrina LaCurts, Chairman, Master of Engineering Thesis Committee

Abstract

In the realm of semiconductor manufacturing, detecting anomalies during manufacturing processes is crucial. However, current methods of anomaly detection often rely on simple excursion detection methods, and manual inspection of machine sensor data to determine the cause of a problem. In order to improve semiconductor production line quality, machine learning tools can be developed for more thorough and accurate anomaly detection. Previous work on applying machine learning to anomaly detection focused on building reference cycles, and using clustering and time series forecasting to detect anomalous wafer cycles. We seek to improve upon these techniques and apply them to related domains of semiconductor manufacturing. The main focus is to develop a process for automated anomaly detection by combining the previously used methods of cluster analysis and time series forecasting and prediction.
We also explore detecting anomalies across multiple semiconductor manufacturing machines and recipes.

Thesis Supervisor: Duane S. Boning
Title: Clarence J. LeBel Professor, Electrical Engineering and Computer Science

Acknowledgments

I would like to take this opportunity to express my great appreciation to all of the people who helped me throughout this project. This journey would not have been possible without the love and support of my parents, Mike and Susan, and two brothers, Robert and Dahn. Thank you for encouraging me every step of the way and for always being there for me.

I would like to thank Prof. Duane Boning for his guidance throughout this entire project and all of his valuable advice. Without his support, this project would not have been possible.

I would also like to thank Mr. Jack Dillon, Mr. Dennis Murphy, Mr. Alan Bowers, and Mr. Adrian McKiernan for their help in getting access to the resources that I needed from Analog Devices.

Contents

1 Introduction
  1.1 Current Anomaly Detection Practices
  1.2 Thesis Goals
  1.3 Methodology
  1.4 Thesis Outline
2 Literature Review
  2.1 Anomaly Detection
  2.2 Previous Work
    2.2.1 Reference Cycles
    2.2.2 Cluster Analysis
    2.2.3 Time Series Forecasting with Neural Networks
3 Dataset
  3.1 Data characteristics
  3.2 Anomalies in Data
4 Methods
  4.1 Time-Series Averaging for Reference Cycle
    4.1.1 Dynamic Time Warping
    4.1.2 DTW Barycenter Averaging
  4.2 Clustering Algorithms
    4.2.1 K-Means
    4.2.2 K-Medoids
    4.2.3 CLARA
    4.2.4 Agglomerative
    4.2.5 Divisive
  4.3 Neural Networks
    4.3.1 Multi-Layer Perceptron
    4.3.2 Long Short-Term Memory
  4.4 Anomaly Detection Pipeline
5 Automated Anomaly Detection Experiments
  5.1 Reference Cycle
  5.2 Clustering Analysis
  5.3 Anomaly Detection
    5.3.1 Training the Models
    5.3.2 Identification of Anomalous Points
  5.4 Distribution Validation
    5.4.1 Empirical Probability Density Function
    5.4.2 Empirical PDF Experimental Results
6 Further Experiments
  6.1 Deviation Scores
  6.2 Across Different Machines
    6.2.1 Clustering different recipes
    6.2.2 Time-series forecasting across recipes
7 Future Work
  7.1 Experiment Recommendations
  7.2 Predicting Machine Failures
8 Conclusion

List of Figures

2-1 Example of Cluster Analysis [17]
2-2 Example of Time Series Forecasting for Stock Prices [11]
2-3 Architecture of a Long Short-Term Memory Model [5]
3-1 Difference between recipe 920 and 945 for Parameter 19, for normal (good) runs
3-2 Drift between cycles for parameter 17 [5]
3-3 Difference between recipe 920 and 945 for Parameter 19 [5]
3-4 Normal (right) vs. Anomalous (left) data [5]
4-1 Mapping between Time-Series A and B [16]
4-2 Clustering techniques used
4-3 Automated Anomaly Detection Model
4-4 Reference Cycle Example
4-5 Clustering Plasma Etcher Data
5-1 Data format [14]
5-2 K-Means method applied to 150 wafer cycles
5-3 K-Medoids method applied to 70 wafer cycles
5-4 CLARA method applied to 150 wafer cycles
5-5 Agglomerative clustering method applied to 20 wafer cycles
5-6 Divisive clustering method applied to 20 wafer cycles
5-7 MLP forecast one cycle (parameter 19)
5-8 LSTM forecast one cycle (parameter 19)
5-9 MLP-single model
5-10 MLP-uni model
5-11 LSTM-single model
5-12 LSTM-uni model
5-13 Forecasting and residual plots of LSTM model trained on recipe 920 and tested on recipe 945 (Parameter 19)
5-14 Normal Q-Q plot and histogram of residuals for parameter 5
5-15 Normal Q-Q plot and histogram of residuals for parameter 17
5-16 Normal Q-Q plot and histogram of residuals for parameter 19
5-17 Empirical PDF frequency plots for Parameter 5, Recipe 920
6-1 Sum of residual values for normal (green) and anomalous (red) cycle for one cycle of parameter 5 and parameter 19 (Recipe 920)
6-2 Deviation plots for ideal (blue), normal (green) and anomalous (red) cycles for parameters 5 and 19 (Recipe 920)
6-3 Recipe 920 (Blue) and Recipe 120 (Green), one cycle, parameter 19
6-4 K-Means clustering for 30 cycles of recipe 920 and 10 cycles of recipe 120 (parameter 19)
6-5 LSTM trained on recipe 920, forecasting recipe 120 (parameter 19). The black line represents the actual values and the green line is the predicted values

List of Tables

3.1 Anomalous Parameters in Plasma Etcher Data
5.1 Cluster Validation Results
5.2 Anomalous Time-steps in Recipe 920 Parameters
5.3 Anomalous Time-steps in Recipe 920 Parameters
5.4 Anomalous Time-steps in Recipe 945 Parameters
5.5 Anomalous Time-steps in Recipe 920 Downsampled Parameters
5.6 Anomalous Time-steps Train on 920, Test on 945
5.7 Performance comparison of empirical PDF vs. Gaussian for anomaly detection

Chapter 1
Introduction

Manufacturing offers a great environment for the application of machine learning and artificial intelligence techniques, particularly in the realm of semiconductor manufacturing. Semiconductor manufacturing facilities are equipped with many sensors which monitor the manufacturing process and the semiconductors that are made. In order to reduce costs, companies use the data generated by the sensors on the various machines to further optimize their manufacturing processes. Currently, much of the data that is generated is only used for troubleshooting when a problem arises. A single manufacturing process has hundreds, if not thousands, of parameters from sensors, so efficiently determining the source of a problem in a process is difficult. The wafers that semiconductors are manufactured on go through multiple process cycles.
This process is long, and when a cycle goes wrong, it is hard to detect the anomaly in time, so the process continues until it is finished. These wafers are expensive to produce, so a process failure can cause a substantial loss in both cost and time. This is why machine learning offers great potential for anomaly detection in semiconductor manufacturing. If anomalies in the manufacturing process could be detected, or even predicted, earlier, then a manufacturing facility could halt the process and correct the affected machine. This would increase process yield and reduce costs, both of which are of extreme interest to semiconductor manufacturers.

1.1 Current Anomaly Detection Practices

The primary focus of this study is on semiconductor manufacturing data collected by Analog Devices (ADI). Presently, ADI has had limited success in using Statistical Process Control (SPC) and limits monitoring in its fabrication process. These methods are often unable to reliably detect out-of-control processes and temporal anomalies. This is due to the complex nature of the manufacturing process, including multiple recipes and parameters, which makes it difficult to set individual thresholds and limits for each data channel. A single semiconductor manufacturing process has hundreds of parameters from many different sensors, making it infeasible to monitor every parameter effectively. Thus, ADI has depended on a reactive rather than proactive approach to anomalous events, mostly using the data to troubleshoot an issue after it has occurred rather than flagging and analyzing anomalies as they occur. Even then, it can be very difficult to manually identify the specific parameters of a process that were responsible for an anomalous event.

1.2 Thesis Goals

The current anomaly detection protocol at ADI presents a promising opportunity for the application of machine learning based anomaly detection methods.