EMPIRICAL INVESTIGATION OF DECISION TREE EXTRACTION FROM NEURAL NETWORKS

A thesis presented to the faculty of the Fritz J. and Dolores H. Russ College of Engineering and Technology of Ohio University

In partial fulfillment of the requirements for the degree Master of Science

Maimuna H. Rangwala
June 2006

This thesis entitled EMPIRICAL INVESTIGATION OF DECISION TREE EXTRACTION FROM NEURAL NETWORKS by MAIMUNA H. RANGWALA has been approved for the Department of Industrial and Manufacturing Systems Engineering and the Russ College of Engineering and Technology by

Gary R. Weckman
Associate Professor of Industrial & Manufacturing Systems Engineering

R. Dennis Irwin
Dean, Fritz J. and Dolores H. Russ College of Engineering and Technology

ABSTRACT

RANGWALA, MAIMUNA H., M.S., June 2006. Industrial and Manufacturing Systems Engineering
EMPIRICAL INVESTIGATION OF DECISION TREE EXTRACTION FROM NEURAL NETWORKS (201 pp.)
Director of Thesis: Gary R. Weckman

The purpose of this thesis is to develop heuristics for employing Trepan, an algorithm for extracting decision trees from neural networks. Typically, several parameters must be chosen to obtain satisfactory performance from the algorithm, and the interactions between these parameters are not well understood. By empirically evaluating the performance of the algorithm on a test set of databases chosen from benchmark machine learning and real-world problems, several heuristics are proposed to explain and improve the performance of the algorithm. The experiments are further validated using statistical performance measures. The algorithm is extended to work for multi-class regression problems, and its ability to comprehend generalized feedforward networks is investigated. This work thus provides improvements to the algorithm, a better understanding of its behavior, and heuristics for choosing parameters that yield better performance.

Approved: Gary R.
Weckman
Associate Professor of Industrial and Manufacturing Systems Engineering

Dedicated to my Father Prof. H. T. RANGWALA (1942-2006) and my Sister Fatema Rangwala (1988-2006)

TABLE OF CONTENTS

ABSTRACT ... 3
LIST OF TABLES ... 8
LIST OF FIGURES ... 11
CHAPTER 1. INTRODUCTION ... 13
  1.1 MACHINE LEARNING ... 13
  1.2 CLASSIFICATION ALGORITHMS ... 14
  1.3 RESEARCH OBJECTIVES ... 14
  1.4 THESIS OVERVIEW ... 16
CHAPTER 2. BACKGROUND AND LITERATURE REVIEW ... 17
  2.1 ARTIFICIAL NEURAL NETWORKS ... 17
    2.1.1 Neural Network Architecture ... 17
    2.1.2 Neural Network Training ... 24
    2.1.3 Neural Networks for Classification and Regression ... 25
    2.1.4 Rule Extraction from Neural Networks ... 26
  2.2 DECISION TREES ... 27
    2.2.1 Decision Tree Classification ... 27
    2.2.2 Decision Tree Applications ... 29
  2.3 C4.5 ALGORITHM ... 30
    2.3.1 Information Gain, Entropy Measure and Gain Ratio ... 31
  2.4 TREPAN ALGORITHM ... 37
    2.4.1 M-of-N Splitting tests ... 39
    2.4.2 Single Test TREPAN and Disjunctive TREPAN ... 40
CHAPTER 3. METHODOLOGY ... 41
  3.1 PHASE 1 ... 43
  3.2 PHASE 2 ... 44
    3.2.1 Datasets ... 45
    3.2.2 Neural Network Modeling ... 56
  3.3 PHASE 3 ... 57
  3.4 PHASE 4 ... 60
  3.5 PERFORMANCE MEASURES ... 62
    3.5.1 Classification Accuracy ... 62
    3.5.2 Comprehensibility ... 64
CHAPTER 4. RESULTS AND DISCUSSION ... 65
  4.1 INVESTIGATE AND EXTEND TREPAN ... 65
  4.2 DATASET ANALYSIS ... 65
    4.2.1 Corrosion ... 65
    4.2.2 Outages ... 79
    4.2.3 Iris ... 89
    4.2.4 Body Fat ... 94
    4.2.5 Saginaw Bay ... 101
    4.2.6 Admissions ... 110
CHAPTER 5. CONCLUSIONS AND FUTURE RESEARCH ... 118
  5.1 SUMMARY AND DISCUSSION ... 118
    5.1.1 Accuracy ... 119
    5.1.2 Comprehensibility ... 120
  5.2 HEURISTICS ... 120
  5.3 CONCLUSIONS ... 122
  5.4 FUTURE RESEARCH ... 123
REFERENCES ... 125
APPENDIX A: WEIGHTS AND NETWORK FILE FORMATS (GFF) ... 131
APPENDIX B: CORROSION RESULTS ... 134
APPENDIX C: OUTAGES RESULTS ... 143
APPENDIX D: IRIS RESULTS ... 153
APPENDIX E: BODY FAT RESULTS ... 157
APPENDIX F: SAGINAW BAY RESULTS ... 161
APPENDIX G: ADMISSIONS RESULTS ... 178

LIST OF TABLES

Table 2.1: Activation Functions used in Neural Networks ... 21
Table 2.2: Play Tennis Example Dataset ... 34
Table 3.1: Dataset Summary ... 44
Table 3.2: Iris -- Sample Dataset ... 45
Table 3.3: Body Fat -- Sample Dataset ... 47
Table 3.4: Body Fat Class Labels ... 47
Table 3.5: Saginaw Bay -- Sample Dataset ... 50
Table 3.6: Chlorophyll Level Class Labels ... 50
Table 3.7: Corrosion -- Sample Dataset ... 52
Table 3.8: Corrosion Class labels ...