Nova Southeastern University NSUWorks

CEC Theses and Dissertations College of Engineering and Computing

2017. Automated Feature Engineering for Deep Neural Networks with Genetic Programming. Jeff T. Heaton, Nova Southeastern University, [email protected]

This document is a product of extensive research conducted at the Nova Southeastern University College of Engineering and Computing.


All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.

NSUWorks Citation Jeff T. Heaton. 2017. Automated Feature Engineering for Deep Neural Networks with Genetic Programming. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, College of Engineering and Computing. (994) https://nsuworks.nova.edu/gscis_etd/994.

This Dissertation is brought to you by the College of Engineering and Computing at NSUWorks. It has been accepted for inclusion in CEC Theses and Dissertations by an authorized administrator of NSUWorks. For more information, please contact [email protected].

Automated Feature Engineering for Deep Neural Networks with Genetic Programming

by

Jeff Heaton

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science

College of Engineering and Computing Nova Southeastern University

2017

We hereby certify that this dissertation, submitted by Jeff Heaton, conforms to acceptable standards and is fully adequate in scope and quality to fulfill the dissertation requirements for the degree of Doctor of Philosophy.

______James D. Cannady, Ph.D. Date Chairperson of Dissertation Committee

______Sumitra Mukherjee, Ph.D. Date Dissertation Committee Member

______Paul Cerkez, Ph.D. Date Dissertation Committee Member

Approved:

______Yong X. Tao, Ph.D., P.E., FASME Date Dean, College of Engineering and Computing

College of Engineering and Computing Nova Southeastern University

2017

An Abstract of a Dissertation Submitted to Nova Southeastern University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Automated Feature Engineering for Deep Neural Networks with Genetic Programming

by Jeff Heaton 2017

Feature engineering is a process that augments the feature vector of a model with calculated values designed to enhance the accuracy of the model’s predictions. Research has shown that the accuracy of models such as deep neural networks, support vector machines, and tree/forest-based algorithms sometimes benefits from feature engineering. These engineered features are usually created from expressions that combine one or more of the original features. The choice of the exact structure of an engineered feature depends on the type of machine learning model in use. Previous research demonstrated that various model families benefit from different types of engineered features. Random forests, gradient-boosting machines, or other tree-based models might not see the same accuracy gain that an engineered feature allowed neural networks, generalized linear models, or other dot-product based models to achieve on the same data set.

This dissertation presents a genetic programming-based algorithm that automatically engineers features that increase the accuracy of deep neural networks for some data sets. For a genetic programming algorithm to be effective, it must prioritize the search space and efficiently evaluate what it finds. This dissertation algorithm faced a potential search space composed of all possible mathematical combinations of the original feature vector. Five experiments were designed to guide the search process to efficiently evolve good engineered features. The result of this dissertation is an automated feature engineering (AFE) algorithm that is computationally efficient, even though a neural network is used to evaluate each candidate feature. This approach gave the algorithm a greater opportunity to specifically target deep neural networks in its search for engineered features that improve accuracy. Finally, a sixth experiment empirically demonstrated the degree to which this algorithm improved the accuracy of neural networks on data sets augmented by the algorithm’s engineered features.

Acknowledgements

There are several people who supported and encouraged me through my dissertation journey at Nova Southeastern University.

First, I would like to extend my thanks to Dr. James Cannady for serving as my dissertation committee chair. He provided guidance and advice at every stage of the dissertation process. He also gave me several additional research opportunities during my time at NSU. I would also like to thank my committee members Dr. Sumitra Mukherjee and Dr. Paul Cerkez. Their advice and guidance through this process were invaluable.

My sincere gratitude and thanks to Tracy, my wife and graduate school partner. Her help editing this dissertation was invaluable. We took a multi-year graduate journey as she completed her Master’s degree in Spanish at Saint Louis University and I worked on my Ph.D. It was great that she could accompany me to Florida for each of the cluster meetings and I could join her in Madrid for summer classes. The GECCO conference that I attended in Madrid was instrumental in the development of this research. Her love and encouragement made this project all possible.

Thank you to Dave Snell for encouraging me to pursue a Ph.D. and for guidance at key points in my career.

I would like to thank my parents. From the time that they bought me my first Commodore 64 and through all my educational journeys, they have always supported me.

Table of Contents

Approval ii
Acknowledgements iv
List of Tables vii
List of Figures viii

1. Introduction 1
   Problem Statement 6
   Dissertation Goal 7
   Relevance and Significance 8
   Barriers and Issues 9
   Definitions of Terms 12
   List of Acronyms 22
   Summary 23

2. Literature Review 25
   Feature Engineering 25
   Neural Networks 30
   Deep Learning 39
   Evolutionary Programming 48
   Summary 59

3. Methodology 60
   Introduction 60
   Algorithm Contract and Specification 61
   Algorithm Design Scope 65
   Experimental Design 68
   Measures 68
   Experiment 1: Limiting the Search Space 70
   Experiment 2: Establishing Baseline 73
   Experiment 3: Genetic Ensembles 74
   Experiment 4: Population Analysis 76
   Experiment 5: Objective Function Design 79
   Experiment 6: Automated Feature Engineering 81
   Real World Data Sets 85
   Synthetic Data Sets 86
   Resources 87
   Summary 88

4. Results 90
   Introduction 90
   Experiment 1 Results 90
   Experiment 2 Results 94
   Experiment 3 Results 96
   Experiment 4 Results 97
   Experiment 5 Results 101
   AFE Algorithm Design 103
   Experiment 6 Results 116
   Summary 123

5. Conclusions, Implications, Recommendations, and Summary 124
   Conclusions 124
   Implications 125
   Recommendations 126
   Summary 129

References 182

Appendixes

A. Algorithms 138
B. Hyperparameters 144
C. Third Party Libraries 152
D. Data Set Descriptions 156
E. Synthetic Data Set Generation 163
F. Engineered Features for the Data Sets 167
G. Detailed Experiment Results 173
H. Dissertation Source Code Availability 179


List of Tables

Tables

1. Common neural network transfer functions 41

2. Experiment 1 results format, neural network and genetic programming 72

3. Experiment 2 result format, neural network baseline 74

4. Experiment 3 results format, neural network genetic program ensemble 76

5. Experiment 4 results format, patterns in genetic programs 79

6. Experiment 5 results format, evaluating feature ranking 80

7. Experiment 6 results format, engineered feature effectiveness 84

8. Experiment 1 neural network results for select expressions 92

9. Experiment 1 genetic program results for select expressions 93

10. Experiment 2 baselines for data sets (RMSE) 95

11. Experiment 3, Ensemble, GP & neural network results 97

12. Experiment 4, patterns from genetic programs 100

13. Experiment 6, feature engineered vs non-augmented accuracy 120

14. Experiment 6, T-Test with data sets that showed improvement 122

15. Data sets used by the dissertation experiments 156


List of Figures

Figures

1. Regression and classification network (original features) 3

2. Neural network engineered features 5

3. Elman neural network 35

4. Jordan neural network 35

5. Long short-term memory (LSTM) 37

6. Dropout layer in a neural network 44

7. Expression tree for genetic programming 50

8. Point crossover 52

9. Subtree mutation 53

10. Feature engineering to linearly separate two classes 58

11. Algorithm high level design and contract 64

12. Ensemble of genetic programs for a neural network 75

13. Generate candidate solutions 77

14. Branches with common structures 78

15. High-level overview of AFE algorithm 82

16. Dissertation algorithm evaluation 83

17. Overview of AFE algorithm 105

18. Input neuron weight connections 109

19. Data flow/mapping for the AFE algorithm 112

20. One step of the AFE algorithm 113

21. Collapsing trees 115


22. Experiment 6, feature engineering benefits to data set 121



Chapter 1

Introduction

The research conducted in this dissertation created an algorithm to automatically engineer features that might increase the accuracy of deep neural networks for certain data sets. The research built upon, but did not duplicate, prior published research by the author of this dissertation. In 2008, the Encog Machine Learning Framework was created; it includes advanced neural network and genetic programming algorithms (Heaton, 2015). The Encog genetic programming framework introduced an innovative algorithm that allows dynamically generated constant nodes for tree-based genetic programming. Thus, constants in Encog genetic programs can assume any value, rather than being chosen from a fixed constant pool.

Research was performed that demonstrated the types of manually engineered features most likely to increase the accuracy of several machine learning models (Heaton, 2016). The research presented here builds upon this earlier research by leveraging the Encog genetic programming framework as a key component of the dissertation algorithm that automatically engineered features for a feedforward neural network that might contain many layers. This type of neural network is commonly referred to as a deep neural network (DNN). Although it would be possible to perform this research with any customizable genetic programming framework or deep neural network framework, Encog is well suited for the task because it provides both components.

This dissertation report begins by introducing both neural networks and feature engineering. The dissertation problem statement is defined, and a clear goal is established. Building upon this goal, the relevance of this study is demonstrated and includes a discussion of the barriers and issues previously encountered by the scientific community. A brief review of literature shows how this research continued previous investigations of deep learning. In addition to the necessary resources and the methods, the research approach to achieve the dissertation goal is outlined.

Most machine learning models, such as neural networks, support vector machines (Smola & Vapnik, 1997), and tree-based models, accept a vector of input data and then output a prediction based on this input. For these models, the inputs are called features, and the complete set of inputs is called a feature vector. Most business applications of neural networks must map the input neurons to columns in a database; these inputs allow the neural network to make a prediction. For example, an insurance company might use columns for age, income, height, weight, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, and triglyceride level (TGL) to make suggestions about an insurance applicant (B. F. Brown, 1998).

Inputs such as HDL, LDL, and TGL are all named quantities. This can be contrasted to high-dimensional inputs such as pixels, audio samples, and some time-series data. For consistency, this dissertation report will refer to lower-dimensional data set features that have specific names as named features. This dissertation focused upon such named features. High-dimensional inputs that do not assign specific meaning to individual features fell outside the scope of this research.

Classification and regression are the two most common applications of neural networks. Regression networks predict a number, whereas classification networks assign a non-numeric class. For example, the maximum policy face amount is the maximum amount that the regression neural network suggests for an individual. This is a dollar amount, such as $100,000. Similarly, a classification neural network can suggest the non-numeric underwriting class for an individual, such as preferred, standard, substandard, or decline. Figure 1 shows both types of neural network.

Figure 1. Regression and classification network (original features)

The left neural network performs a regression and uses the six original input features to set the maximum policy face amount to issue an applicant. The right neural network executes a classification and utilizes the same six input features to place the insured into an underwriting class. The weights (shown as arrows) establish the final output. A training algorithm fixes the weights through many sets of inputs that all have a known output. In this way, the neural network learns from existing data to predict future data. Furthermore, for simplicity, the above networks have a single hidden layer. Deep neural networks typically have more than two hidden layers between the input and output layers (Bengio, 2009). Every layer except the output layer can also receive a bias neuron that always outputs a consistent value (commonly 1.0). Bias neurons enhance the neural network’s learning ability (B. Cheng & Titterington, 1994).

Output neurons provide the neural network’s numeric result. Before the output neurons can be determined, the values of previous neurons must be calculated. The following equation can determine the value of each neuron:

f(x, \theta, b) = \phi \left( \sum_{i} \theta_i x_i + b \right) \qquad (1)

The function phi (ϕ) represents the transfer function, and it is typically either a rectified linear unit (ReLU) or one of the sigmoidal functions. The vectors θ and x represent the weights and input; the variable b represents the bias weight. Calculating the weighted sum of the input vector (x) is the same as taking the dot product of the two vectors. This calculation is why neural networks are often considered to be part of a larger class of machine learning algorithms that are dot-product based.

For input neurons, the vector x comes directly from the data set. Now that the input neuron values are known, the first hidden layer can be calculated, using the input neurons as x. Each subsequent layer is calculated with the previous layer’s output as x. Ultimately, the output neurons are determined, and the process is complete.
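To make the calculation concrete, the following short sketch applies equation (1) layer by layer with NumPy. It is an illustrative example only, not the dissertation’s implementation; the layer sizes, weight values, and the choice of a ReLU transfer function are assumptions.

    # Illustrative sketch of equation (1): phi(sum_i theta_i * x_i + b) for every neuron in a layer.
    import numpy as np

    def layer_output(x, weights, bias, phi=lambda z: np.maximum(z, 0.0)):
        # x:       input vector (the feature vector, or the previous layer's output)
        # weights: one row of weights (theta) per neuron in this layer
        # bias:    bias weight b for each neuron
        # phi:     transfer function (ReLU by default)
        return phi(weights @ x + bias)

    # Forward propagation simply chains this calculation layer by layer.
    x = np.array([0.5, 1.2, -0.3])                                   # hypothetical input features
    hidden = layer_output(x, np.ones((4, 3)) * 0.1, np.zeros(4))      # hidden layer
    output = layer_output(hidden, np.ones((1, 4)) * 0.1, np.zeros(1),
                          phi=lambda z: z)                            # linear output neuron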

Feature engineering adds calculated features to the input vector (Guyon, Gunn, Nikravesh, & Zadeh, 2008). It is possible to use feature engineering for both classification and regression neural networks. Engineered features are essentially calculated fields that are dependent on the other fields. Calculated fields are common in business applications and can help human users understand the interaction of several fields in the original data set. For example, insurance underwriters benefit from combining height and weight to calculate body mass index (BMI). Likewise, insurance underwriters often use a ratio of the HDL, TGL and LDL cholesterol levels. These calculations allow a single number to represent an aspect of the health of the applicant. These calculations might also be useful to the neural network. If the BMI and HDL/LDL ratios were engineered as features, the classification network would look like Figure 2.

Figure 2. Neural network engineered features

In Figure 2, the BMI and HDL/LDL ratio values are appended to the feature vector along with the original input features. This calculation produces an augmented feature vector that is provided to the neural network. These additional features might help the neural network to calculate the maximum face amount of a life insurance policy. Similarly, these two features could also augment the feature vector of a classification neural network. BMI and the HDL/LDL ratio are typical of the types of features that might be engineered for a neural network. Such features are often ratios, summations, and powers of other features. Adding BMI and the HDL/LDL ratio is not complicated because these are well-known calculations. Similar calculations might also benefit other data sets. Feature engineering often involves combining original features with ratios, summations, differences, and power functions. BMI is a type of engineered feature that involves multiple original features and is derived manually using intuition about the data set.
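The following minimal sketch illustrates this kind of augmentation for the insurance example above. It is not taken from the dissertation’s code; the column names and sample values in the pandas DataFrame are hypothetical.

    # Sketch: append engineered columns (BMI, HDL/LDL ratio) to the original features.
    import pandas as pd

    applicants = pd.DataFrame({
        "height_m": [1.75, 1.60],
        "weight_kg": [82.0, 55.0],
        "hdl": [55.0, 70.0],
        "ldl": [130.0, 90.0],
    })

    # Engineered features appended to the original feature vector
    applicants["bmi"] = applicants["weight_kg"] / applicants["height_m"] ** 2
    applicants["hdl_ldl_ratio"] = applicants["hdl"] / applicants["ldl"]

    # The augmented frame (original + engineered columns) would then be fed to the network.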

Problem Statement

There is currently no automated means of engineering features specifically designed for deep neural networks that are a combination of multiple named features from the original vector. Previous automated feature engineering (AFE) work focused primarily upon the transformation of a single feature or upon models other than deep learning (Box & Cox, 1964; Breiman & Friedman, 1985; Freeman & Tukey, 1950). Although model-agnostic genetic programming-based algorithms have been proposed (Guo, Jack, & Nandi, 2005; Neshatian, 2010), they do not tailor engineered features to deep neural networks. Feature engineering research for deep learning has primarily dealt with high-dimensional image and audio data (Blei, Ng, & Jordan, 2003; M. Brown & Lowe, 2003; Coates, Lee, & Ng, 2011; Coates & Ng, 2012; Le, 2013; Lowe, 1999; Scott & Matwin, 1999).

Machine learning performance depends on the representation of the feature vector. Feature engineering is an important but labor-intensive component of machine learning applications (Bengio, 2013). As a result, much of the actual effort in deploying machine learning algorithms goes into the design of preprocessing pipelines and data transformations (Bengio, 2013).

Deep neural networks (Hinton, Osindero, & Teh, 2006) can benefit from feature engineering. Most research into feature engineering in the deep learning field has been in the areas of image and speech recognition (Bengio, 2013). Such techniques are successful in the high-dimension space of image processing and often amount to dimensionality-reduction techniques (Hinton & Salakhutdinov, 2006) such as Principal Component Analysis (PCA) (Timmerman, 2003) and auto-encoders (Olshausen & Field, 1996).

Dissertation Goal

The goal of this dissertation was to use genetic programming to analyze a data set of named features and to automatically create engineered features that would produce a more accurate deep neural network. These engineered features consisted of mathematical transformations of one or more of the existing features from the data set. Feature engineering will only improve accuracy when the engineered feature exposes an interaction that the neural network could not determine from the data set (W. Cheng, Kasneci, Graepel, Stern, & Herbrich, 2011). Consequently, feature engineering will not benefit all data sets. To mitigate this problem, several real-world data sets, as well as synthetic data sets designed specifically to benefit from feature engineering, were evaluated.

The research focused on data sets consisting of named features, as opposed to data sets that contain large amounts of pixels or audio sampling data. Fraud monitoring, sales forecasting, and intrusion detection are predictive modeling applications in which the input is composed of these named values. Real-world data sets that provide a basis for predictive modeling can often consist of fewer than 100 such named features. Therefore, the algorithm developed would be ineffective for the high-dimension tasks of computer vision and hearing applications. Such data sets were outside of the scope of this dissertation research.

Relevance and Significance

Since the introduction of multiple linear regression, statisticians have been employing creative means to transform input to enhance model accuracy (Anscombe & Tukey, 1963; Stigler, 1986). These transformations usually applied a single mathematical expression to an individual feature. For example, one feature might apply a logarithm; another feature might be raised to the third power. Transformations that apply an expression to a single original feature can significantly increase the accuracy of certain model types (Kuhn & Johnson, 2013).

Researchers have focused much attention on automated derivation of single-feature transformations. Freeman and Tukey (1950) reported on a number of useful transformations for linear regression. Box and Cox (1964) conducted the seminal work on automated feature transformation and invented a stochastic ad hoc algorithm that recommends transformations that might improve the results of linear regression. Their work became known as the Box-Cox transformation. Although the work by Box and Cox can obtain favorable transformations for linear regression, it often did not converge to the best one for an individual feature because of its stochastic nature. Numerous other transformations were created that were based on similar stochastic sampling techniques (Anscombe & Tukey, 1963; Mosteller & Tukey, 1977; Tukey, Laurner, & Siegel, 1982).


Each of these algorithms focused on transformations of single features for the linear regression model.
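As an illustration of this kind of single-feature transformation, the sketch below applies a Box-Cox transformation with SciPy’s stats.boxcox, which selects the transformation parameter by maximum likelihood. The synthetic, skewed feature is an assumption used only for demonstration; it is not one of the dissertation’s data sets.

    # Sketch of a single-feature Box-Cox transformation using SciPy.
    import numpy as np
    from scipy import stats

    x = np.random.lognormal(mean=0.0, sigma=1.0, size=1000)   # skewed, strictly positive feature
    x_transformed, fitted_lambda = stats.boxcox(x)            # lambda chosen by maximum likelihood
    print(f"selected Box-Cox lambda: {fitted_lambda:.3f}")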

Breiman and Friedman (1985) took a considerable step forward by introducing the alternating conditional expectation (ACE) algorithm that guaranteed optimal transformations for linear regression. Even though all the aforementioned algorithms were designed for linear regression, they can also assist other model types. Because ACE was designed for linear regression, it cannot guarantee optimal transformations for other model types (Ziehe, Kawanabe, Harmeling, & Müller, 2001). Additionally, ACE transforms the entire feature vector by using separate transformations for each input feature, as well as the output. Engineered features do not need to come from the transformation of a single feature from the original vector; several original input features can be transformed collectively to produce favorable results (H.-F. Yu et al., 2011).

Feature engineering played an important role in several winning Kaggle and ACM’s KDD Cup submissions. H.-F. Yu et al. (2011) reported on the successful application of feature engineering to the KDD Cup 2010 competition. Ildefons and Sugiyama (2013) won the Kaggle Algorithmic Trading Challenge with an ensemble of models and feature engineering. A manual process created the features engineered for these competitions.

Barriers and Issues

It was a difficult and challenging problem to create an algorithm that would automatically engineer features for a deep neural network. Most AFE algorithms are built for linear regression models. Furthermore, most of these automated methods typically engineer features that are based upon only a single feature from the original vector. The few AFE algorithms that are compatible with neural networks often do not make use of an actual neural network or do not exploit characteristics unique to neural networks. Neural networks are often excluded from the objective function because neural networks are computationally expensive. The unique characteristics of neural networks can be exploited by tailoring the search space to expression types that neural networks cannot easily synthesize on their own.

A linear regression model is a multi-term, first-degree expression. Because of its simplicity, several mathematical techniques can verify and assure optimal transformations (Breiman & Friedman, 1985). Deep neural networks have a much more complex structure than linear regression. Therefore, engineering features for a deep neural network presents different challenges than those for a linear regression.

This dissertation created an algorithm that combined multiple original features. As the number of original features considered increases, so does the search space of the algorithm. This expanded search space faced the curse of dimensionality (Bellman, 1957) and required novel solutions to limit and prioritize the search space. However, ignoring the curse of dimensionality in favor of a simple theoretical approach to AFE can generate every possible expression based on the original feature set. Obviously, this is a large task. Even with a small number of features and one expression type, the search space is still large. As an example, consider searching every combination of an engineered rational difference as demonstrated by the following expression:

\frac{a - b}{c - d} \qquad (2)

If the data set were modestly large and contained 100 features in the original data set, the complete search space would be over 47 million permutations:

0.5 \, \mathrm{P}(100, 4) = 0.5 \, \frac{100!}{(100 - 4)!} = 4.705 \times 10^{7} \qquad (3)

The above permutation is divided by two to account for the fact that the resulting expression is algebraically the same as the first if the order of both the numerator and denominator’s differences are flipped. This example is only one of the many types of features that the algorithm needed to check. These expressions were prioritized so that the most likely successful candidates are explored first. It is possible to prioritize candidate features for evaluation by the analysis of the importance of the original features.
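The count in equation (3) can be verified with a few lines of arithmetic; this check is only illustrative and is not part of the dissertation algorithm.

    # Quick check of equation (3): distinct rational-difference features (a - b) / (c - d)
    # that can be built from 100 original features.
    from math import factorial

    permutations = factorial(100) // factorial(100 - 4)   # P(100, 4): ordered choices of a, b, c, d
    distinct = permutations // 2                          # flipping both differences yields the same feature
    print(distinct)                                       # 47,054,700, roughly 4.705 x 10^7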

Although it was necessary to reduce the search space, finding an efficient means of evaluating candidate feature transformations was also crucial. An obvious test of a feature transformation would have been to fit a neural network with the candidate feature and then fit a neural network without it. However, neural networks start from random weights, so it is advisable to fit several times and calculate the lowest error across those runs to mitigate the effects of a bad set of starting weights. It might have also been advantageous to try several neural network hidden weight architectures to ensure that a candidate-engineered feature was not simply reacting badly to a poorly architected neural network. Obviously, it was not possible to provide such an exhaustive evaluation for each candidate engineered feature. The evaluation code needed to be quick and able to run in parallel. Additionally, the neural networks had to be structured so that they gave the best assessment possible for each candidate-engineered feature. These problems had to be overcome to give the best possible ranking of the candidate-engineered features.
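A deliberately simplified sketch of this evaluation idea appears below: fit small networks with and without a candidate column, keep the lowest validation RMSE over a few random restarts, and compare. It uses scikit-learn’s MLPRegressor purely for illustration; it is not the objective function developed in this dissertation, and the network size, restart count, and function names are assumptions.

    # Illustrative ranking of a candidate engineered feature by validation RMSE.
    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import mean_squared_error

    def best_rmse(x_train, y_train, x_val, y_val, restarts=3):
        errors = []
        for seed in range(restarts):                      # several random weight initializations
            net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=500, random_state=seed)
            net.fit(x_train, y_train)
            errors.append(np.sqrt(mean_squared_error(y_val, net.predict(x_val))))
        return min(errors)                                # lowest error across restarts

    def candidate_improvement(x_train, x_val, y_train, y_val, cand_train, cand_val):
        baseline = best_rmse(x_train, y_train, x_val, y_val)
        augmented = best_rmse(np.column_stack([x_train, cand_train]), y_train,
                              np.column_stack([x_val, cand_val]), y_val)
        return baseline - augmented                       # positive means the candidate helped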

The fact that not all data sets benefit from feature engineering created another complication. As a result, it was necessary to use a variety of data sets from the UCI Machine Learning Repository (Newman & Merz, 1998). It was also essential to generate several synthetic data sets that were designed to benefit from certain types of engineered features. Considerable experimentation was required to try all selected real and synthetic data sets against the dissertation algorithm.

Definitions of Terms

Automated Feature Engineering (AFE): The process where an algorithm automatically

performs the normally manual task of feature engineering.

Adaptive Moment Estimation (Adam): A training update rule that automatically scales

individual learning rates for each weight, deals well with sparse gradients, and can

handle a stochastic objective function.

Alternating Conditional Expectation (ACE): A form of automatic feature transformation

that is applied to both the input and output of a linear regression model.

Auto-Encoder: A neural network that can learn efficient encodings for data to perform

dimension reduction. An auto-encoder will always have the same number of

input and output neurons.

Bias Neuron: A neuron that is added to the input and hidden layers that always applies a

fixed weight. The bias neuron performs a similar function as the y-intercept in

linear regression.

Box-Cox Transformation: An automatic feature transformation for linear regression

that can find helpful transformations for individual features.

Classification: A neural network, or other model, that is trained to classify its input into

a predefined number of classes. A classification neural network will typically use

a softmax function on its output and have the same number of output neurons as

the total number of classes.


Constant Node: A terminal node in a genetic program that holds a constant value.

Constant Pool: A fixed-length pool of numbers that can be assigned to constant nodes in

a genetic program.

Cross Entropy Loss Function: A neural network loss function that often provides

superior training results for classification when compared to the quadratic loss

function. For regression problems, root mean square error (RMSE) should be

considered.

Crossover: An evolutionary operator, inspired by the biological concept of sexual

reproduction in which two genomes combine traits to produce offspring.

Crossover performs the exploitation function of an evolutionary algorithm by

creating new genomes based on fit parents.

Data Set: For supervised training, a data set is a collection of input values (x) and the

expected output values (y). The presented research only deals with supervised

training. The overall data set is usually divided into training and validation data

sets.

Deep Learning: A collection of training techniques and neural network architecture

innovations that make it possible to effectively train neural networks with three or

more layers.

Deep Neural Network: A neural network with three or more layers.

Directed Acyclic Graph (DAG): A graph where all connections contain directions and

there are no loops. The genetic programs found in this research are represented as

DAGs.


Dot Product Based Model: A model that uses dot products, or weighted sums, as a

primary component of their calculation. Neural networks, linear regressions, and

support vector machines for regression are all examples of dot product based

models.

Dynamic Constant Node: A special type of constant node found in Encog genetic

programs that can change its value as genetic programming training progresses.

This is different than the constant pool found in many genetic programming

frameworks.

Elitism: When a certain number of the top genomes of a population are chosen to always

survive into the next generation.

Engineered Feature: A feature that is added to the feature vector of a model, such as a

neural network, that is a mathematical transformation of one or more features

from the original feature vector.

Epoch: A unit in an iterative training algorithm where the entire training set has been

processed.

Evolutionary Algorithm: An optimization algorithm that evolves better-suited

individuals from a population by applying mutation and crossover. The algorithm

must balance between exploring new solutions and exploiting existing solutions to

make them better.

Expression: A mathematical representation that involves operators, constants, and

variables. Expressions were the genomes that the algorithm manipulated in this

dissertation.


Exploitation: The process in a search algorithm where the evaluation is constrained to

regions close to the best solution discovered so far.

Exploration: The process in a search algorithm where the search is broadened to new

regions farther away from the best solution discovered so far.

Feature Engineering: The process of creating new features by applying mathematical

transformations to one or more of the original features.

Feature Importance: A measurement of the importance of a single feature to the neural

network, relative to the other features.

Feature Selection: The process of choosing the most important features to the neural

network.

Feature Vector: The complete set of inputs to a neural network. The feature vector must

be the same size as the input layer.

Feature: A single value from the feature vector of a neural network or another model.

The features in a feature vector are directly connected to the input neurons of a

neural network.

Feedforward Neural Network: A neural network that contains only forward

connections between the layers.

Gated Recurrent Unit (GRU): A computationally efficient form of Long Short-Term

Memory (LSTM).

Genetic Programming (GP): An evolutionary algorithm that develops programs to

optimize their output for an objective function.

Genome: An individual in an evolutionary program.


Gradient: A partial derivative of the loss function of a neural network with respect to a

single weight. The gradient is a key component in many neural network-training

algorithms.

Hidden Layer: A neural network layer that occurs between the input and output layers.

A neural network can contain zero or more hidden layers.

Input Layer: The first layer of neurons that receives the input to the neural network.

Neural networks have a single input layer that must be the same length as the

feature vector to the neural network.

Iteration: One unit of work in an iterative algorithm. If an iteration only processes a

batch that is not the entire training set, then the iteration is called a step. A series

of steps that process an entire data set are called an epoch.

Interior Node: A node in a genetic program that is not a terminal node or root node; it

has both parents and children.

Kaggle: A website where individuals compete to achieve the best model for posted data

sets. Kaggle is considered as a source of benchmarking data sets in this research.

KDD Cup: An annual competition, hosted by the Association for Computing Machinery

(ACM) where individuals compete for the best model fit on a provided data set.

Latent Dirichlet Allocation (LDA): A natural language processing generative statistical

model.

Layer: A collection of related neurons in a neural network. A neural network typically

has an input, output, and zero or more hidden layers.


Learning Rate: A numeric constant that controls how quickly a model, such as a neural

network, learns. For backpropagation, the learning rate is multiplied by the

gradient to determine the value to change the weight.

Linear Discriminant Analysis (LDA): Generalization of the classification algorithm

originally proposed by Fisher for the iris data set.

Linear Regression: A simple model that computes its output as the weighted sum of the

input plus an intercept value.

Local Optima: A potential solution to an optimization problem that has no better

solutions nearby in the search space. This local solution can prevent an

optimization algorithm from finding other better solutions. These solutions are

sometimes called local minima or local maxima, depending on if the optimization

algorithm is seeking minimization or maximization.

Long Short-Term Memory (LSTM): A type of recurrent neural network that uses a

series of gates to learn patterns spanning much larger sequences than those that

regular simple recurrent networks are capable of learning.

Loss Function: A function that measures the degree to which the actual output of a

neural network (ŷ) is different than the expected output (y).

Momentum: A numeric constant that attempts to prevent a neural network from falling

into a local minimum. This value is multiplied by the previous iteration’s weight

change to determine a value to add to the current iteration’s weight change.

Mutation: An evolutionary operator, inspired by the biological concept of asexual

reproduction, in which a single genome produces offspring with a slightly altered


set of traits than the single parent. Because mutation introduces new stochastic

information, it fulfills the exploration component of an evolutionary algorithm.

Named Feature: A description used in this dissertation for a feature that represents a

specific value, such as a measurement or a characteristic of the data. An image is

a high-dimension input that would contain pixels, as opposed to named features.

Natural Language Processing: Machine learning algorithms that are designed to

understand human language.

Nesterov Momentum: A more advanced form of classic momentum that attempts to

mitigate the effects of over correcting to a bad training batch.

Neural Network: A model inspired by the human brain that is composed of an input

layer, zero or more hidden layers, and an output layer.

Moment: Measure of the points in a probability density function. The zeroth moment is

the total probability (i.e. one); the first moment is the mean; the second central

moment is the variance; the third moment is the skewness; the fourth moment

(with normalization and shift) is the kurtosis. The ADAM training algorithm,

used in this dissertation, estimates the first and second moments of the gradients.

Objective Function: A function that evaluates a genetic program (genome) and produces

a score. The evolutionary algorithm will try to create genomes that either

maximize or minimize this score. Most evolutionary algorithms can be

configured to either maximize or minimize.

One-Versus-Rest: A technique that allows multiple binary classification models to

perform multiclass classification by training one classification model per class to

classify between that class and the rest of the classes.


Output Layer: The final layer of a neural network. This layer produces the output. A

regression neural network will typically have a single output neuron. A binary

classification neural network will also have a single output neuron. A

classification neural network with three or more classes will have an output

neuron for each class.

Preprocessing: An algorithm that prepares data for a neural network or another model.

The feature engineering explored in this paper would function as a part of data

preprocessing for a neural network.

Principal Component Analysis (PCA): A form of dimension reduction that can shrink

the size of an input vector to a smaller encoding with minimal loss of accuracy to

the neural network.

Quadratic Loss Function: A simple neural network loss function that uses the

difference between the expected output and actual output of a neural network.

The quadratic loss function should be the first choice for regression neural

networks; however, the cross-entropy loss function should be the choice for

classification neural networks.

Recurrent Neural Network: A neural network that contains backwards connections

from layers to previous layers.

Regression: A neural network or other model that is trained to produce a continuous

value as its output. A regression neural network will use a linear transfer function

on its output and have a single output neuron.

Root Mean Square Error (RMSE) or Root Mean Square Deviation (RMSD): A

neural network loss function that reports error in the same units as the data set


target values. Although RMSE/RMSD is more common in regression problems,

it can be used to measure the accuracy of the probabilities reported by

classification problems. RMSE is the primary evaluation method used in this

dissertation.

Root Node: The node in a tree that is an ancestor of all other nodes. The root node for a

single-node tree is also a terminal node.

Selection: An algorithm that chooses fit genomes for evolutionary operations such as

crossover and mutation.

Sigmoidal: Something that is s-shaped.

Simple Recurrent Network (SRN): A network with only a single recurrent connection,

such as an Elman or Jordan network.

Softmax: An algorithm that ensures that all outputs of a neural network sum to 1.0,

thereby allowing the outputs to be considered as probabilities.

Step: A unit of work in an iterative training algorithm. A step does not process the entire

training set, only a batch.

Stochastic Gradient Descent (SGD): A variant of the backpropagation algorithm that

uses a mini-batch that is randomly sampled each training iteration. SGD has

proven itself to be one of the most effective training algorithms, and it was the

neural network training method for this dissertation, along with the Adam update

rule.

Symbolic Expression (S-Expression): Notation for nested list (tree-structured) data,

invented for the Lisp programming language, which uses it for source code as

well as data.


Synthetic Data set: A data set that was generated to test a specific characteristic of an

algorithm.

Terminal Node: A node in a tree that has no children. Terminal nodes are also referred

to as leaf nodes.

Training Data set: The data on which the model was trained. Usually, validation data

are also kept so that the model can be evaluated on different data than it was

trained with.

Transfer Function: Applied to the weighted summations performed by the layers of a

neural network. All layers of a neural network have transfer functions except the

input layer. Transfer functions are sometimes referred to as activation functions.

Tree-Based Genetic Program: A genetic program that is represented as a tree of nodes.

The research performed by this dissertation used tree-based genetic programs.

Turing Complete: A system of data-manipulation rules that simulates any single-taped

Turing machine. Also, referred to as computationally universal.

Universal Approximation Theorem: A theorem that states that a feedforward network

with a single hidden layer containing a finite number of neurons can approximate

continuous functions.

Update Rule: The method by which a training algorithm updates the weights of a neural

network. Common update rules include: Momentum, Nesterov accelerated

gradient, Adagrad, Adadelta, RMSprop, and Adam.

Validation Data set: The portion of the data set that validates model predictions on data

that are outside of the training data set. This data set is sometimes referred to as

out-of-sample data.


Xavier Weight Initialization: A neural network weight initialization algorithm that

produces relatively quick convergence for backpropagation and limited variance

of required iteration counts for repeated training of a neural network.

YAML Ain't [sic] Markup Language (YAML): A common configuration file format

that communicates operating parameters to the algorithm provided by this

research.

List of Acronyms

ACE: Alternating Conditional Expectation

AFE: Automated Feature Engineering

ANN: Artificial Neural Network

CGP: Cartesian Genetic Programming

DAG: Directed Acyclic Graph

DBNN: Deep Belief Neural Network

DNN: Deep Neural Network

GP: Genetic Programming

GRU: Gated Recurrent Unit

IDS: Intrusion Detection System

KDD: Knowledge Discovery in Databases

LDA: Latent Dirichlet Allocation or Linear Discriminant Analysis

LSTM: Long Short-Term Memory

NEAT: NeuroEvolution of Augmenting Topologies

ReLU: Rectified Linear Unit

RDBMS: Relational Database Management System


SD: Standard Deviation

SGD: Stochastic Gradient Descent

SRN: Simple Recurrent Neural Network

SIFT: Scale-Invariant Feature Transform

TF-IDF: Term Frequency–Inverse Document Frequency

T-SNE: t-Distributed Stochastic Neighbor Embedding

XOR: Exclusive Or

YAML: YAML Ain't [sic] Markup Language

Summary

This chapter introduced research that was conducted into AFE. The resulting algorithm leveraged the ability of genetic programming to generate expressions that might become useful features for neural networks. The features engineered by this algorithm consisted of expressions that utilized one or more features from the data set’s original feature vector. The resulting engineered features were demonstrated to increase the accuracy of the neural network for some data sets.

The primary challenge of this research was to limit the large search space of expressions that combined the features of the data set. This goal was accomplished by defining the types of expressions that most benefitted a neural network, determining the importance of a feature, and prioritizing the expressions to be searched by the algorithm.

The algorithm was also designed to run in parallel to utilize multiple cores on the host computer system.

Additionally, the algorithm was designed to include an efficient objective function to evaluate candidate-engineered features against each other. Such a function was based on fitting a neural network model with candidate-engineered features. This objective function was designed to be efficient so that it could determine in minimal time how effective one candidate engineered feature was compared to another. Genetic programming uses this type of objective function to decide the best genomes (candidate-engineered features) to form the next generation of genomes.

The remainder of this dissertation report is organized as follows: Chapter 2 provides a review of the literature that directly influenced this research. Chapter 3 presents the methodology applied to implement an algorithm for AFE. Chapter 4 summarizes the results of this research. Chapter 5 offers up conclusions based on this research.


Chapter 2

Literature Review

The research conducted for this dissertation focused primarily upon feature engineering and how to apply it to deep neural networks. Because of its ability to manipulate expressions, genetic programming enabled the dissertation algorithm to recommend engineered features. The following areas of literature were important to this dissertation research:

• Feature engineering
• Neural networks
• Deep learning
• Genetic programming

There is considerable research community interest in these areas. The following sections review current literature in these areas as it pertains to the dissertation research:

Feature Engineering

The feature vector of a predictive model can be augmented or transformed to enhance predictive performance. In literature, this process is often referred to as feature engineering, feature modification, or feature extraction. Automated variants of these processes are sometimes referred to as automated feature engineering or automated feature learning. For consistency, this dissertation report will use the term feature engineering to refer to this augmentation. This report will use automated feature engineering (AFE) to refer to an algorithm that automates this feature engineering.


The Box and Cox (1964) algorithm relied upon a stochastic sampling of the data and does not necessarily guarantee an optimal set of transformations. Breiman and Friedman (1985) introduced the alternating conditional expectation (ACE) algorithm that could ensure optimal transformations for linear regression. The ACE algorithm finds a set of optimal transformations for each of the predictor features, as well as the output for linear regression. Although the resulting transformations were originally intended for linear regression, they work for other model types as well (B. Cheng & Titterington, 1994).

Splines are a common means of feature transformation for most machine learning model types. By fitting a spline to individual features, it is possible to smooth the data and reduce overfitting. The number and position of knots inside the spline is a hyperparameter that must be determined for this transformation technique. Splines have the capability of taking on close approximations of the shape of other functions. Brosse, Lek, and Dauba (1999) used splines to transform data for a neural network to predict the distribution of fish populations.

Machine vision has been a popular application of feature engineering. A relatively early form of feature engineering for computer vision was the Scale-Invariant Feature Transform (SIFT) (Lowe, 1999). This transformation attempts the recognition of images at multiple scales, which is a common problem in computer vision. A machine learning model that learns to recognize digits might not perceive these same digits if their size is doubled. SIFT preprocesses the data and provides them in a form where images at multiple scales produce features that are similar. These types of features can be generalized for many problems and applications in machine vision, including object recognition and panorama stitching (M. Brown & Lowe, 2003).


Text classification is another popular application of machine learning algorithms. Scott and Matwin (1999) utilized feature engineering to enhance the performance of rules learning for text classification. These transformations allow structure and frequency of the text to be generalized to a few features. Representing textual data to a machine learning model produces a considerable number of dimensions. Text classification commonly uses feature engineering to reduce these dimensions.

Another application of feature engineering to text classification is the latent Dirichlet allocation (LDA) engineered feature. This method transforms a corpus of documents into document-topic mappings (Blei et al., 2003). LDA has subsequently been applied to spam filtering, among several document classification tasks (Bíró, Szabó, & Benczúr, 2008) and article recommendation (C. Wang & Blei, 2011).

Many data are stored in relational database management systems (RDBMS) and consist of several different tables that form links, or relations, between them. The relationships between these tables can be of various cardinalities, leading to relationships including one-to-one, one-to-many, or many-to-many. Machine learning models typically require a fixed-length feature vector. Mapping the RDBMS linked data into a feature vector that is suitable for a machine learning model can be difficult. Automated mapping of RDBMS data is an active area of research. Bizer, Heath, and Berners-Lee (2009) created a system where the data are structured in a way that they can be accessed with semantic queries.

Feature engineering has proven to be valuable in the Kaggle and KDD Cup competitions. In fact, one team utilized feature engineering and an ensemble of machine learning models to win the KDD Cup 2010 competition (H.-F. Yu et al., 2011).


Histograms of oriented gradients were other features presented in this competition. W. Cheng et al. (2011) developed an automated feature generation algorithm for data organized with domain-specific knowledge. These technologies have found many applications. For example, Ildefons and Sugiyama (2013) were able to win the Kaggle Algorithmic Trading Challenge with an ensemble of models and feature engineering. The features engineered for these competitions were created manually or with knowledge about the specific competition problem. Such knowledge is referred to as domain-specific knowledge and requires human intuition that cannot currently be replicated by a machine.

Feature engineering has also advanced natural language processing (NLP). An example of an engineered feature for NLP is the term frequency inverse document frequency (TF-IDF). This engineered feature is essentially the ratio of the frequency of a word in a document compared to its occurrence in the corpus of documents (Rajaraman & Ullman, 2011). TF-IDF has proven popular for text mining, text classification, and NLP.
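One common formulation of TF-IDF is sketched below for a toy corpus; real systems differ in the exact weighting and smoothing, and the example documents are hypothetical.

    # Sketch of one common TF-IDF formulation.
    import math

    corpus = [
        "the policy covers the applicant",
        "the applicant requested a new policy",
        "cholesterol levels affect the premium",
    ]

    def tf_idf(term, document, corpus):
        words = document.split()
        tf = words.count(term) / len(words)                        # term frequency in this document
        docs_with_term = sum(term in doc.split() for doc in corpus)
        idf = math.log(len(corpus) / docs_with_term)               # rarity of the term in the corpus
        return tf * idf

    print(tf_idf("policy", corpus[0], corpus))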

Researchers have examined the ability of machine learning algorithms to perform automated feature learning. These algorithms are often unsupervised in that they examine the data without regard to an expected outcome. Coates et al. (2011) implemented an unsupervised single-layer neural network for feature engineering. Principal component analysis (PCA) (Timmerman, 2003) and t-distributed stochastic neighbor embedding (T-SNE) (Van der Maaten & Hinton, 2008) are dimension-reduction algorithms that have also proven to be successful for AFE. Other unsupervised machine learning algorithms have also been applied to feature engineering. Coates and Ng (2012) utilized k-means clustering for feature engineering.

Deep neural networks have many different layers to learn complex interactions in the data. Despite this advanced learning capability, deep learning also benefits from feature engineering. Bengio (2013) demonstrated that feature engineering is useful for speech recognition, computer vision, classification, and signal processing. Le (2013) engineered high-level features using unsupervised techniques to construct a deep neural network for signal processing.

Lloyd, Duvenaud, Grosse, Tenenbaum, and Ghahramani (2014) employed feature engineering to create the Automatic Statistician project. This system spontaneously models regression problems and produces readable reports. This system can determine the types of transformations that might benefit individual features.

Kanter and Veeramachaneni (2015) invented a technique called deep feature synthesis that automatically transforms relational database tables into the feature vector needed by the typical machine learning model. The feature vector input to a neural network cannot directly encode the one-to-many and many-to-many relationships that are common in RDBMSs. The deep feature synthesis algorithm employs SQL-like transformations, such as MIN, MAX, and COUNT to summarize and encode these relationships into a feature vector. The authors of deep feature synthesis reported on their algorithm’s ability to outperform some competitors in three data science competitions.

Researchers have implemented AFE for specific domains. Davis and Foo (2016) applied automated feature detection to modeling tasks involving HTTP traffic and tunnels. Cuayáhuitl (2016) created SimpleDS, a system for text document processing that avoids manual feature engineering by using a deep reinforcement learning system.

Manual feature engineering remains popular, and researchers have explored its value in various contexts. Bahnsen, Aouada, Stojanovic, and Ottersten (2016) provide guidelines and examples of feature engineering for credit card fraud detection. This application is an example of domain-specific knowledge. Zhang, Huan, and Jiang (2016) investigated feature engineering for phishing detection. Although these systems are effective in their specific domains, they do not address the problem statement given by this dissertation research of creating a generic AFE system.

Neural Networks

Neural networks are a biologically-inspired class of algorithms that McCulloch and Pitts (1943) introduced as networks composed of MP-Units. Although neural networks contain connections and neurons, they do not attempt to emulate every aspect of biological neurons. Modern neural networks are more of a mathematical model than a biological simulator. The seminal neural network algorithm of McCulloch and Pitts specifies the calculation of a single neuron, called an MP-Unit, as the weighted sum of the neuron’s inputs. This weighted sum is a mathematical dot product. Nearly all neural networks created since their introduction in 1943 are based upon feeding dot product calculations to transfer functions over layers of neurons. Deep neural networks simply have more layers of neurons (MP-Units).

Initially, the weights of neural networks were handcrafted to create networks capable of solving simple problems. The research community has shown great interest in automating neural network weight selection to achieve a particular objective. Hebb (1949) defined a process to describe how the connection strengths between biological neurons change as learning occurs. When the organism performs actions, connections increase between the neurons necessary for that action. This process became Hebb’s rule, and it is often informally stated as, “neurons that fire together wire together.”

Rosenblatt (1962) introduced the perceptron, which became the seminal neural network that contained input and output layers. The perceptron is a two-layer neural network with an input layer that contains weighted forward-only connections to an output layer. The transfer function defined for the classic perceptron is a simple function that performs a threshold: it returns the value 1 if the neuron’s weighted inputs reach a value above a specified threshold; otherwise, it returns the value 0. Minsky and Papert (1969) described severe limitations in the perceptron in their monograph. They demonstrated that perceptrons were incapable of learning the Exclusive Or (XOR) operator, a non-linearly separable problem.
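The behavior of such a threshold unit can be sketched in a few lines; the weights and threshold below are illustrative values only, chosen so that the unit realizes a logical AND, which a single unit of this kind can represent, unlike XOR.

    # Minimal sketch of a classic perceptron unit: weighted sum followed by a hard threshold.
    def perceptron_output(inputs, weights, threshold=0.0):
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        return 1 if weighted_sum > threshold else 0

    # A two-input unit that fires only when both inputs are active (logical AND).
    print(perceptron_output([1, 1], [0.6, 0.6], threshold=1.0))   # 1
    print(perceptron_output([1, 0], [0.6, 0.6], threshold=1.0))   # 0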

Research continued for the automatic derivation of the weights of a neural network, beginning with the work of Werbos (1974), who mentioned in his Ph.D. thesis that gradient descent could be used for the training of neural networks. Previously, researchers only used gradient descent to find minimums of functions. Rumelhart, Hinton, and Williams (1985) were the first to apply gradient descent to neural network training. Their algorithm is called the backward propagation of errors, or backpropagation. The gradient of each weight is calculated and determines a change that should occur in the weight for the current training iteration. The gradient of each weight is the partial derivative of the loss function for that weight with all other weights held constant. Therefore, backpropagation applies gradient descent to neural network training.

Standard backpropagation computes a weight correction (vt) for iteration (t) that minimizes the error function (J) by updating the previous iteration's weight (θt-1) with the weight's gradient scaled by the learning rate (η):

v_t = -\eta \cdot \nabla_{\theta_{t-1}} J(\theta_{t-1})   (4)

The weight correction (vt) is applied to the previous neural network weight (θt-1) to give the weight (θt) for the current iteration:

\theta_t = \theta_{t-1} + v_t   (5)
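As a concrete illustration, the following minimal sketch (written in Python with NumPy purely for exposition; it is not the Encog implementation used in this research) applies the weight correction of Equation 4 and the update of Equation 5 to a simple one-dimensional loss function.

    import numpy as np

    def gradient_descent_step(theta_prev, grad_fn, learning_rate=0.1):
        """One backpropagation-style update (Equations 4 and 5)."""
        v_t = -learning_rate * grad_fn(theta_prev)   # Equation 4: weight correction
        return theta_prev + v_t                      # Equation 5: apply the correction

    # Toy example: minimize J(theta) = theta^2, whose gradient is 2 * theta.
    theta = np.array([1.0])
    for _ in range(50):
        theta = gradient_descent_step(theta, lambda t: 2.0 * t)
    print(theta)  # approaches 0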

Backpropagation was initially ineffective at training neural networks with significantly more than two hidden layers (Bengio, 2009). Furthermore, it was not known if neural networks actually benefited from many layers. Cybenko (1989) formulated the universal approximation theorem and proved that a single hidden-layer neural network could approximate any function. Hornik (1991) continued this research by showing that the multilayer feedforward architecture – and not the specific choice of the transfer function – gave neural networks the potential of being universal approximators.

The universal approximation theorem implies that additional hidden layers are unnecessary because a single hidden-layer neural network can theoretically learn any problem. Although feedforward neural networks are universal approximators, they are not Turing complete (Graves, Wayne, & Danihelka, 2014; Turing, 1936) without extension. In other words, a feedforward neural network can emulate any function, but the neural network cannot replicate the operation of any computer program.

Just as for other statistical models, it is often necessary to explain why a neural network produced the output that it did. One means of explaining a neural network is to use an algorithm to rank the importance of each of the network's inputs. Garson (1991) created this type of algorithm, which ranks the importance of the input neurons by partitioning the hidden-to-output connection weights into components associated with each input neuron, using the absolute values of the connection weights. Olden and Jackson (2002) achieved greater accuracy than the Garson algorithm by introducing the connection weights algorithm, which uses only the product of the input-to-hidden and hidden-to-output connection weights between each input neuron and output neuron and sums these products across all hidden neurons. The Garson and connection weights algorithms are both dependent on the model being a single-hidden-layer neural network. In fact, many models have feature-ranking algorithms that are specific to that model. These ranking algorithms are called model dependent because they work only for a single type of model. Because the connection weights and Garson algorithms assume a single-hidden-layer neural network, they are not compatible with deep learning.

Breiman (2001) introduced a feature importance algorithm in his seminal paper on random forests. Although he presented this algorithm in conjunction with random forests, it is model-independent and appropriate for any model. This algorithm, known as the input perturbation algorithm, works by evaluating a trained model’s accuracy with each of the inputs individually shuffled from a data set. Shuffling an input causes it to become useless—effectively removing it from the model. More important inputs will produce a less accurate score when they are removed by shuffling them. This process makes sense, because important features will contribute to the accuracy of the model.
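The following sketch illustrates the input perturbation idea; the model object, scoring metric, and data are placeholders (any trained model exposing a predict() method could be substituted), and the code is not the implementation used by this dissertation.

    import numpy as np

    def perturbation_importance(model, x, y, metric):
        """Rank features by the error increase caused by shuffling each input column."""
        baseline = metric(y, model.predict(x))
        importances = []
        for col in range(x.shape[1]):
            shuffled = x.copy()
            np.random.shuffle(shuffled[:, col])        # destroy this column's information
            score = metric(y, model.predict(shuffled))
            importances.append(score - baseline)       # larger increase = more important input
        return importances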

Several feature ranking algorithms, including Garson and input perturbation, were empirically compared by Olden, Joy, and Death (2004). Input perturbation was found to be among the most accurate feature ranking algorithms surveyed. The best algorithm found by their research was the connection weights algorithm. However, because the connection weights algorithm is incompatible with deep learning, it was not useful to this dissertation research. Because the input perturbation algorithm is compatible with any model, including deep neural networks, it played a prominent role in this dissertation. The algorithm developed by this dissertation used an objective function partly based on the ranking of the features engineered by a genetic programming algorithm.

By definition, a feedforward neural network contains only forward connections. Thus, the input layer connects only to the first hidden layer, the first hidden layer connects only to the second hidden layer, and the final hidden layer connects only to the output layer. However, recurrent neural networks allow connections to previous layers.

Elman (1990) and Jordan (1997) began the research into recurrent neural networks, introducing the simple recurrent networks (SRNs) now known as the Elman and Jordan neural networks. Figure 3 shows an Elman neural network, and Figure 4 shows a Jordan neural network.


Figure 3. Elman neural network

Figure 4. Jordan neural network


Both the Elman and Jordan networks contain a context layer (C) that initially outputs 0. The context layer remembers the input it received the previous time the network was calculated and outputs that stored value on the next calculation. In this way, the context layer in a SRN functions somewhat like a time-loop: its output is always its input from the previous iteration.

A technique known as backpropagation through time can usually train recurrent neural networks (Mozer, 1989; Robinson & Fallside, 1987; Werbos, 1988). This technique is like backpropagation except that it unfolds the recurrent layers to make the recurrent neural network appear as one large feedforward network. Backpropagation through time has a configuration parameter that specifies the number of time slices that the program can unfold into the network; the virtual network contains a number of virtual layers equal to that parameter. The same backpropagation algorithm that was used for feedforward networks trains the virtual network.

Feedforward neural networks will always produce the same output for a given feature vector. However, recurrent neural networks maintain state from previous computations. This state affects the neural network output; therefore, the order in which feature vectors are presented to the neural network will affect the output. This capability makes recurrent neural networks applicable to time series prediction. For recurrent neural networks, a series of events, represented as individual feature vectors, produces an output, rather than the single feature vector used by regular feedforward neural networks. While recurrent neural networks are particularly adept at handling time series, there are other alternatives; Balkin and Ord (2000) demonstrated encoding methods to use regular non-recurrent feedforward neural networks for time series prediction.

One issue with SRN networks, such as Elman and Jordan, is that the longer a time series becomes, the less relevant the context layer becomes. To overcome this problem, Hochreiter and Schmidhuber (1997) introduced the long short-term memory (LSTM) network, which is shown in Figure 5.

Figure 5. Long short-term memory (LSTM)

It is important to note that this figure shows only a single neuron of a LSTM network. These LSTM neurons can be placed inside of regular feedforward neural networks; usually, LSTM neurons are placed as an entire layer of such neurons. On a conceptual level, the LSTM neuron functions similarly to the context neurons of the SRNs. The C-labeled node, near the center of the figure, represents the context memory of the node. However, unlike SRNs, the LSTM does not simply copy the previous neural network computation to its internal state. The LSTM uses three gates to control when the input is accepted, when the internal state is forgotten, and when the internal state is fed to the next unit. These gates are activated when the input reaches a threshold specified by a trained internal weight. The backpropagation through time algorithm trains the gate threshold weights along with every other weight. Because it controls the internal state, input, and output, the LSTM is considerably more effective at recalling time series than a regular SRN.

The inability to learn large numbers of hidden layers was not the only barrier to widespread neural network adoption. One problem that remains for neural networks is the large number of hyper-parameters that a neural network contains. A neural network practitioner must decide the number of layers for the network as well as the quantity of neurons that each of the hidden layers will contain. Prior research in the field of neural networks reveals that researchers have long aspired to create an algorithm that automatically determines the optimal structure for neural networks. Stanley and Miikkulainen (2002) invented the NeuroEvolution of Augmenting Topologies (NEAT) neural network that utilizes a genetic algorithm to optimize the neural network structure. The genetic algorithm searches for the best neural network structure and weight values to minimize the loss function.

Although most neural network research has shifted towards deep learning, some research remains on classical neural network structures. Chea, Grenouillet, and Lek (2016) used a self-organizing map (SOM) to predict water quality. A SOM allowed M. Wang et al. (2016) to identify the mixtures of Chinese herbal medicines. Sobkowicz (2016) implemented a NEAT neural network to perform sentiment analysis in the Polish language.

Deep Learning

While it is theoretically possible for a single-hidden layer neural network to learn any problem (Hornik, 1991), this outcome will not necessarily occur in practice. Additional hidden layers can allow the neural networks to learn hierarchies of features, thereby simplifying the search space. They can also find the optimal set of weights with less training. Unfortunately, no method to train these networks existed until the development of a series of innovations that introduced new transfer functions and training methods. McCulloch and Pitts (1943) introduced the seminal neural network calculation, shown in Equation 1. Many new technologies have built upon this core calculation.

Hinton et al. (2006) first implemented a deep neural network with favorable results. They created a learning algorithm that could train deep network variants like those that Fukushima (1980) introduced for belief networks. This discovery renewed interest in deep neural networks. Several additional technologies, such as stochastic gradient descent (SGD) (Bertsekas, 1999), rectified linear units (ReLU) (Glorot, Bordes, & Bengio, 2011), and Nesterov momentum (Sutskever, Martens, Dahl, & Hinton, 2013), have made training of deep neural networks more efficient. Taken together, these and other technologies are referred to as deep learning.

Neural networks with many layers will often experience a problem in which the gradients become 0 with certain transfer functions. Hochreiter (1991) was the first to describe this vanishing gradient problem in his Ph.D. thesis. Prior to deep learning, most neural networks used a simple quadratic error function on the output layer (Bishop, 1995). De Boer, Kroese, Mannor, and Rubinstein (2005) introduced the cross-entropy error function, which often achieves better results than the simple quadratic because it addresses the vanishing gradient problem by allowing errors to change weights even when a neuron's transfer function saturates (its derivative is close to 0). It also provides a more granular means of error representation than the quadratic error function for classification neural networks.

Weight Initialization

Neural networks must start with random weights (Bengio, 2009). These random weights are frequently sampled within a specific range, such as (-1,1). However, this simple range initialization can occasionally produce a set of weights that are difficult for backpropagation to train. As a result, researchers have shown interest in weight initialization algorithms that provide a good set of starting weights for backpropagation (Nguyen & Widrow, 1990). Glorot and Bengio (2010) introduced what has become one of the most popular methods, called the Xavier weight initialization algorithm. Because of its ability to produce consistently performing weights suitable for backpropagation training, the research in this dissertation used the Xavier weight initialization algorithm.
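The uniform-range form of Xavier initialization sketched below is one common variant of the algorithm, shown here only for illustration; it is not a transcription of the Encog routine used in this research.

    import numpy as np

    def xavier_init(n_in, n_out, rng=np.random.default_rng()):
        """Sample a weight matrix whose scale depends on the layer's fan-in and fan-out."""
        limit = np.sqrt(6.0 / (n_in + n_out))
        return rng.uniform(-limit, limit, size=(n_in, n_out))

    w_hidden = xavier_init(10, 20)   # for example, 10 inputs feeding 20 hidden neurons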

Backpropagation relies on the derivatives of the transfer functions to incrementally calculate, or propagate, error corrections from the output neurons back through the weights of a neural network. Prior to 2011, most neural network hidden layers used the hyperbolic tangent or the logistic transfer function, which are sigmoidal transfer functions. The derivatives of both of these functions saturate to 0 as x approaches either positive or negative infinity, causing these transfer functions to exhibit the vanishing gradient problem. Glorot et al. (2011) introduced the rectified linear unit (ReLU) transfer function to address this problem.

Transfer Functions for Deep Learning

The ReLU transfer function usually achieves better training results for deep neural networks than the sigmoidal transfer functions. According to current research (Bastien et al., 2012), the type of transfer function to use for deep neural networks is well defined for each layer type. For their hidden layers, deep neural networks employ the ReLU transfer function. For their output layer, most deep neural networks utilize a linear transfer function for regression, and a softmax transfer function for classification. No transfer function is needed for the input layer. Bastien et al. (2012) present research that follows this form and uses the ReLU transfer function for hidden layers and either linear or softmax for the output layer.

Table 1 summarizes the logistic, hyperbolic tangent, ReLU, linear, and softmax transfer functions:

Table 1. Common neural network transfer functions

Name              Expression (Forward)                        Derivative (Backward)             Range
Logistic/Sigmoid  φ(x) = 1 / (1 + e^(-x))                     φ'(x) = e^x / (1 + e^x)^2         [0, 1]
HTan              φ(x) = htan(x)                              φ'(x) = 1 - φ^2(x)                [-1, 1]
ReLU              φ(x) = max(0, x)                            φ'(x) = 1 if x > 0; 0 if x ≤ 0    [0, +∞)
Linear            φ(x) = x                                    φ'(x) = 1                         (-∞, +∞)
Softmax           φ(x)_j = e^(x_j) / Σ_{k=1..‖x‖} e^(x_k)     NA                                (0, 1)

(The original table also includes a graph column that plots each transfer function as a solid black line and its derivative as a dotted red line; those plots are not reproduced here.)

This table indicates why ReLU often achieves better performance than a logistic or hyperbolic tangent transfer function. The graph column of the table plots both the transfer function and the derivative of that transfer function; the solid black line is the transfer function output, and the dotted red line is the derivative. Both sigmoid-shaped transfer functions have derivatives that quickly saturate to 0 as x approaches either positive or negative infinity. The ReLU, on the other hand, does not saturate as positive infinity is approached. Additionally, the much wider range of the ReLU lessens the need for the common practice of normalizing the inputs to values closer to the range of the sigmoidal transfer function in use.
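A minimal sketch of the forward and derivative calculations from Table 1 follows (Python/NumPy, for illustration only); evaluating the derivatives at a few points shows the saturation behavior described above.

    import numpy as np

    def logistic(x):
        return 1.0 / (1.0 + np.exp(-x))

    def logistic_deriv(x):
        s = logistic(x)
        return s * (1.0 - s)            # algebraically equal to e^x / (1 + e^x)^2

    def htan_deriv(x):
        return 1.0 - np.tanh(x) ** 2

    def relu(x):
        return np.maximum(0.0, x)

    def relu_deriv(x):
        return (x > 0).astype(float)    # 1 for x > 0, otherwise 0

    x = np.linspace(-6.0, 6.0, 5)
    print(logistic_deriv(x))   # saturates toward 0 at both extremes
    print(relu_deriv(x))       # stays at 1 for every positive input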

Regularization

Overfitting is a frequent problem for neural networks (Masters, 1993). A neural network is said to be overfit when it has been trained to the point that the network begins to learn the outliers in the data set. This neural network is learning to memorize, not generalize (Russell & Norvig, 1995). A class of algorithms designed to combat overfitting is called regularization algorithms. One of the most common forms of neural network regularization is to simply add a scaled summation of the weights of the neural network to the loss function. This calculation will cause the training algorithm to attempt to lower the weights of the neural network along with the output error. Two of the most common forms of this weight regularization are L1 and L2 (Ng, 2004).

L1 regularization, shown in the following equation, sums the weights of the neural network (w) and produces an error value (E1) that is added to the loss function of the neural network.

E_1 = \frac{\lambda_1}{n} \sum_{w} |w|   (6)

It is important to note that the w vector includes only actual weights and not bias weights. The value λ1 is a scaling factor for the effect of the L1 regularization. If λ1 is too high, the objective of lowering the weights will overwhelm the objective of achieving a low training error. This situation causes a failure of the neural network to converge to a low error. The value n represents the number of training set elements. L2 regularization is defined similarly to L1 and is provided by the following equation:

E_2 = \frac{\lambda_2}{n} \sum_{w} w^2   (7)

Both L1 and L2 regularization sum the weights without regard to their sign. This magnitude-oriented approach is accomplished by an absolute value for L1 and a square for L2. The weights are pushed towards 0 in both cases. However, L1 has a greater likelihood of pushing the weights entirely to 0 and effectively pruning the weighted connection (Ng, 2004). This pruning feature of L1 is especially valuable for the dissertation research because it can function as a type of feature selection: L1 will indicate worthless engineered features by pruning them.
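The penalty terms of Equations 6 and 7 reduce to a few lines of code; the sketch below is illustrative only, with an assumed λ of 0.01 and an assumed training set size of 100, and shows how the penalties would be added to an unregularized loss value.

    import numpy as np

    def l1_penalty(w, lam1, n):
        return (lam1 / n) * np.sum(np.abs(w))    # Equation 6

    def l2_penalty(w, lam2, n):
        return (lam2 / n) * np.sum(w ** 2)       # Equation 7

    w = np.array([0.5, -1.2, 0.0, 2.0])          # non-bias weights only
    unregularized_loss = 0.37                    # hypothetical value
    total_loss = (unregularized_loss
                  + l1_penalty(w, lam1=0.01, n=100)
                  + l2_penalty(w, lam2=0.01, n=100))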


L1 and L2 are not the only forms of regularization. Srivastava, Hinton, Krizhevsky, Sutskever, and Salakhutdinov (2014) introduced dropout as a simple regularization technique for deep neural networks. Dropout is typically implemented as a single dropout layer, as demonstrated by Figure 6:

Figure 6. Dropout layer in a neural network

The dashed lines in Figure 6 represent the dropped neurons. In each training iteration, some neurons are removed from the dropout layer. However, neither the bias neurons nor the input and output neurons are ever removed. When a neuron is discarded, the training iteration occurs as if that neuron and all its connections are not present. However, the drop is only temporary; the neuron and its connections will return in the next iteration, and a different set is removed. In this way, dropout decreases overfitting by preventing the network from becoming too dependent on any set of neurons. Once training is complete, all the neurons return. Dropout affects neural networks only during the training phase.
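The following sketch shows one common way to implement a dropout layer during training; the inverted-dropout scaling used here is an assumption, as scaling conventions differ between frameworks, and this is not the Encog implementation.

    import numpy as np

    def dropout(activations, drop_prob=0.5, training=True, rng=np.random.default_rng()):
        if not training:
            return activations                              # dropout affects training only
        mask = rng.random(activations.shape) >= drop_prob   # True = neuron kept this iteration
        return activations * mask / (1.0 - drop_prob)       # inverted-dropout scaling

    layer_output = np.array([0.2, 0.9, -0.4, 1.1])
    print(dropout(layer_output))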

Enhancements to Backpropagation

There have been several important enhancements to the basic backpropagation weight update rule. Momentum has been a significant component of backpropagation training for some time. Polyak (1964) introduced the seminal momentum algorithm, a technique that accelerates gradient ascent/descent. Momentum backpropagation adds a portion of the previous iteration's weight change to the current iteration's weight change.

v_t = \gamma v_{t-1} - \eta \nabla_{\theta_{t-1}} J(\theta_{t-1})   (8)

Consequently, the weight updates have the necessary momentum to push through local minima and to continue descending the loss function.

Nesterov momentum (Nesterov, 1983) further enhances the momentum calculation and increases the effectiveness of the random mini-batches selected by SGD. Nesterov momentum decreases the likelihood that a particularly bad mini-batch will change the weights into an irreparable state. A neural network update rule using Nesterov momentum was introduced by Sutskever et al. (2013).

Researchers have developed several innovations beyond the classic backpropagation and Nesterov momentum update rules. Classic backpropagation, even with Nesterov momentum, requires that researchers choose learning rate and momentum training parameters that are applied across all weights in the neural network. It is usually advantageous to decay the learning rate as the neural network trains (Bottou, 2012).


Additionally, each weight in a neural network might benefit from a different learning rate (Riedmiller & Braun, 1992). Duchi, Hazan, and Singer (2011) introduced the Adaptive Gradient algorithm (AdaGrad) to both decay the learning rate and vary this rate per weight. Zeiler (2012) attempted to mitigate the aggressive monotonic learning rate decay of AdaGrad by introducing the Adadelta update rule, which uses only a window of recent gradients to determine the learning rate. Tieleman and Hinton (2012) independently proposed RMSprop to address the aggressive learning rate decay of AdaGrad; the algorithm determines the learning rate by dividing each gradient by a root mean square of recent gradients.

Kingma and Ba (2014) introduced the Adam update rule that derives its name from the adaptive moment estimates that it uses. Adam estimates the first (mean) and second (variance) moments to determine the weight corrections. Adam begins with exponentially decaying averages of past gradients (m):

m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t   (9)

This average accomplishes a similar goal to that of the classic momentum update; however, its value is calculated automatically based on the current gradient (gt). The update rule then calculates the second moment (vt):

v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2   (10)

The values mt and vt are estimates of the first moment (the mean) and the second moment (the un-centered variance) of the gradients. However, they will have a strong bias towards zero in the initial training cycles. The first moment’s bias is corrected as follows:


\hat{m}_t = \frac{m_t}{1 - \beta_1^t}   (11)

Similarly, the second moment is also corrected:

\hat{v}_t = \frac{v_t}{1 - \beta_2^t}   (12)

These bias-corrected first and second moment estimates are applied to the ultimate Adam update rule, as follows:

\theta_t = \theta_{t-1} - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon} \hat{m}_t   (13)

This dissertation featured the Adam update rule for all neural network training due to the rule's robustness to the initial learning rate (η) and other training parameters. Kingma and Ba (2014) propose default values of 0.9 for β1, 0.999 for β2, and 10^-8 for ε, which were used in this dissertation.
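For illustration, the sketch below applies Equations 9 through 13 to a toy loss function using the default hyperparameters cited above; it is not the Encog training code used for the experiments.

    import numpy as np

    def adam_step(theta, grad, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        m = beta1 * m + (1 - beta1) * grad                      # Equation 9: first moment
        v = beta2 * v + (1 - beta2) * grad ** 2                 # Equation 10: second moment
        m_hat = m / (1 - beta1 ** t)                            # Equation 11: bias correction
        v_hat = v / (1 - beta2 ** t)                            # Equation 12: bias correction
        theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)    # Equation 13: weight update
        return theta, m, v

    theta = np.array([1.0, -2.0])
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, 101):
        grad = 2.0 * theta                                      # gradient of J(theta) = ||theta||^2
        theta, m, v = adam_step(theta, grad, m, v, t)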

Researchers have also utilized deep learning for recurrent neural networks. The research community has recently shown considerable interest in deep LSTM networks. Kalchbrenner, Danihelka, and Graves (2015) used a grid of LSTM units to achieve greater accuracy. Chung, Gulcehre, Cho, and Bengio (2015) introduced the gated recurrent unit (GRU) and added an output gate, which allows greater accuracy as the time series increases in length. Unlike feedforward neural networks, LSTM and GRU are recurrent networks that can function as Turing machines (Graves et al., 2014).


Evolutionary Programming

Holland (1975) introduced evolutionary algorithms. Later, Deb (2001) extended this work to present genetic algorithms as they are known today. The genetic algorithm is a generic population-based metaheuristic technique to find solutions to many real-world search and optimization problems. Darwinian evolution inspired these algorithms. A population of potential solutions is evolved as the fittest population members produce subsequent generations through the genetic operators of crossover and mutation. Each of these potential solutions is referred to as either a genome or a chromosome (depending on the implementation). This evolutionary process is a search for the classic balance between exploitation and exploration. Mutation and crossover provide the genetic algorithm with the ability to explore and exploit the search space. Exploration occurs when the mutation genetic operator introduces randomness to the population. The crossover genetic operator exploits by creating new members containing traits from the best members of the population (Holland, 1975).

The genetic algorithm represents potential solutions as fixed-length vectors. This vector might be the weights of a neural network, coefficients of an expression, or any other fixed-length vector that must be optimized against an objective function. Mutation occurs by randomly perturbing the elements of a vector. Crossover is achieved by splicing together the vectors of two or more fit parent vectors.

An objective function serves to evaluate the population. The loss function of a neural network is somewhat like the evolutionary algorithm's objective function. Both the loss function and objective function provide a numeric value to be optimized. Some evolutionary algorithms also allow the objective function to be maximized. The choice between minimization and maximization depends on the domain of the problem. Although the loss function and objective function both accomplish similar goals, it is standard procedure to refer to the evaluation function for an evolutionary algorithm as an objective function.

While many problems can be modeled as a fixed-length vector, this representation is a limiting factor for classic genetic algorithms. They will never improve the underlying neural network algorithm even though they evolve the weights to produce better results. To evolve better algorithms, the computer programs themselves must become the genomes that are evolved (Poli, Langdon, & McPhee, 2008). Genetic programming is an answer to the fixed-length limitation of classic genetic algorithms.

Rather than evolving a fixed-length vector, genetic programming evolves representations of actual computer programs to achieve an optimal score from an objective function. Koza (1992) popularized this active area of research to automatically generate programs to solve specific problems. Much of Koza's research represents the genetic programs as trees. Although most genetic programming research has been focused on tree representation (White et al., 2013), there are other representations of genetic programs, such as grids and probabilistic structures (Banzhaf, Francone, Keller, & Nordin, 1998). This dissertation research used only the tree representation of genetic programs.

A tree-based genetic program is implemented as a directed acyclic graph (DAG). The tree is composed of connected nodes. The tree starts with a parentless root node, which connects to other nodes that in turn point to still more nodes. Interior nodes have at least one child; terminal nodes have no children. Ultimately, the tree reaches all of the terminal nodes, which represent the variables and constants. The interior nodes, in turn, represent the operators applied to those variables and constants. The following expression could be represented as a tree for genetic programming:

\frac{x - 1}{2} + 3\cos(y)   (14)

It is common in computer science to represent these expressions as trees. Lisp and Scheme are early programming languages that utilized similar tree representations called S-Expressions (Sussman, Abelson, & Sussman, 1983). Figure 7 shows the expression mentioned above as a tree:

Figure 7. Expression tree for genetic programming

It is also possible to express entire computer programs as trees. The branching nature of a tree can encode if-statements and loops. The programs encoded into these types of trees can be Turing complete (Teller, 1994), which means they can theoretically compute anything (Turing, 1936). Additionally, nodes can be created to change the values of variables. However, trees are not the only representation for genetic programs.


Much of the research into genetic programming has involved the best way to represent the genetic programs (Poli et al., 2008).
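To illustrate the tree representation, the sketch below defines a small node class and evaluates Equation 14, (x - 1)/2 + 3cos(y), as the tree pictured in Figure 7. The class is a simplified stand-in for illustration and does not mirror Encog's data structures.

    import math

    class Node:
        def __init__(self, op, children=None, value=None):
            self.op = op                    # operator name, "const", or "var"
            self.children = children or []
            self.value = value              # constant value or variable name

        def evaluate(self, variables):
            if self.op == "const":
                return self.value
            if self.op == "var":
                return variables[self.value]
            args = [child.evaluate(variables) for child in self.children]
            ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
                   "*": lambda a, b: a * b, "/": lambda a, b: a / b,
                   "cos": lambda a: math.cos(a)}
            return ops[self.op](*args)

    # (x - 1) / 2 + 3 * cos(y), the expression of Equation 14
    tree = Node("+", [
        Node("/", [Node("-", [Node("var", value="x"), Node("const", value=1.0)]),
                   Node("const", value=2.0)]),
        Node("*", [Node("const", value=3.0), Node("cos", [Node("var", value="y")])]),
    ])
    print(tree.evaluate({"x": 5.0, "y": 0.0}))   # prints 5.0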

All evolutionary algorithms must have processes for exploitation and exploration. For genetic programming, crossover and mutation can accomplish these functions. The representation of the underlying data dictates the exact nature of the mutation and crossover algorithms. Koza (1992) defined several possible algorithms for crossover and mutation of genetic programming trees. The two most popular are point crossover and subtree mutation. This dissertation includes point crossover and subtree mutation for its genetic programming solution. Figure 8 shows point crossover:


Figure 8. Point crossover

To implement point crossover, two parents are chosen. The algorithm chooses a random target node for parent 1 and a random source node for parent 2. Then it creates the offspring by cloning parent 1 and replacing the target node on the offspring with a copy of the subtree at the source node on parent 2. The crossover operation does not modify either parent, and the offspring may be more fit than the parents. Most importantly, crossover does not introduce new information; it only recombines existing information. Thus, crossover exploits rather than explores.


Koza also defined subtree mutation, as shown in Figure 9:

Figure 9. Subtree mutation

Mutation uses only a single parent, and, like crossover, the operation does not modify the parent. The algorithm chooses a random insertion node on the parent and generates a new random branch. Cloning the single parent and grafting the randomly generated branch onto the offspring at the previously chosen random insertion node creates the offspring. It is important to note that the mutation operator does add new information to the genome and, therefore, explores rather than exploits.
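Assuming the simple node structure sketched earlier, point crossover and subtree mutation can be illustrated as follows; the random node selection is deliberately simplified, and this is not the Encog implementation of the operators.

    import copy
    import random

    def all_nodes(node):
        yield node
        for child in node.children:
            yield from all_nodes(child)

    def point_crossover(parent1, parent2, rng=random):
        offspring = copy.deepcopy(parent1)                 # neither parent is modified
        target = rng.choice(list(all_nodes(offspring)))    # random target node in the clone
        source = copy.deepcopy(rng.choice(list(all_nodes(parent2))))
        target.op, target.children, target.value = source.op, source.children, source.value
        return offspring                                   # recombines existing information only

    def subtree_mutation(parent, random_branch_fn, rng=random):
        offspring = copy.deepcopy(parent)
        insertion = rng.choice(list(all_nodes(offspring)))
        branch = random_branch_fn()                        # newly generated random subtree
        insertion.op, insertion.children, insertion.value = branch.op, branch.children, branch.value
        return offspring                                   # introduces new information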


Speciation

It can be difficult to produce viable offspring from a crossover operation between two dissimilar genomes. This fact is true in real life as well; two animals are considered to be from the same species if they are capable of producing viable offspring (Dawkins, 1976). In genetic programming, if the two parent genomes are similar, the probability increases that the crossover operator will produce offspring superior to the parents (Poli et al., 2008). Although there are several speciation strategies, this dissertation uses the strategy included with the Encog framework, which is similar to the speciation algorithm of Stanley and Miikkulainen (2002) for the NEAT neural network. In that algorithm, the larger of the two parent trees is chosen, and the percentage of the nodes of the larger tree that are also in the smaller tree is calculated. Genomes that have a percent similarity above a configurable threshold are in the same species. The species divisions are recalculated at the end of each training iteration; thus, a child is not necessarily in the same species as its parents.
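The node-overlap test described above can be sketched as follows, again assuming the simple tree nodes used in the earlier sketches; the similarity threshold of 0.75 and the use of (operator, value) pairs as node identity are assumptions for illustration, not the exact Encog speciation code.

    from collections import Counter

    def node_signatures(tree):
        counts = Counter()
        stack = [tree]
        while stack:
            node = stack.pop()
            counts[(node.op, node.value)] += 1
            stack.extend(node.children)
        return counts

    def same_species(tree_a, tree_b, threshold=0.75):
        a, b = node_signatures(tree_a), node_signatures(tree_b)
        larger, smaller = (a, b) if sum(a.values()) >= sum(b.values()) else (b, a)
        shared = sum(min(count, smaller[sig]) for sig, count in larger.items())
        similarity = shared / sum(larger.values())   # fraction of the larger tree's nodes shared
        return similarity >= threshold               # threshold is a configurable parameter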

Other Genetic Program Representations

Trees are not the only means of representing genetic programs. Modern computer hardware represents computer programs as a series of linear instructions (Knuth, 1997), not as trees. Although these programs can be written as trees, it is often not practical because the linear nature of programming will create tall, unbalanced trees. This problem led a number of researchers to investigate a linear representation of the genetic programs. Poli et al. (2008), Banzhaf (1993), Perkis (1994), and Diplock (1998) sought to implement genetic programming in a fashion that mirrored the linear computer architecture. Nordin (1994), Nordin, Banzhaf, and Francone (1999), Crepeau (1995), and Miller and Thomson (2000) went even further and evolved bit patterns that represented actual CPU machine language instruction codes. The linear genetic programming systems create code that closely resembles pseudocode. Thus, a human programmer can more easily interpret linear genetic programs than tree-based genetic programs.

Cartesian Genetic Programming (CGP) (Miller & Thomson, 2000) represents the evolvable genetic programs as two-dimensional grids of nodes. CGP easily encodes computer programs, electronic circuits, neural networks, mathematical expressions, and other computational structures. The grids are encoded as fixed-length integer vectors, to which the crossover and mutation genetic operators are applied.

Many genetic programs operate exclusively on floating point data. While some research into genetic programming has introduced Booleans, strings, and other types, often only a single type is used. Such types are important, as traditional computer programs frequently use many different data types. If genetic programming is to evolve actual computer programs, it is important to offer this same multi-type flexibility. Unfortunately, data types such as integer, string, and structure complicate genetic programming because not every operator can accept all types. For example, the subtraction operator ("-") can easily be applied to integers and floating-point numbers, but it is generally undefined for strings or Booleans. To overcome this limitation, Worm and Chiu (2013) created a system of grammar rules that document dependencies between operators and data. These rules restrict the crossover and mutation operators and ensure that new programs are valid.


Genetic Programming for AFE

Examples of the application of genetic programming to feature engineering exist in previous machine learning literature. Xue, Zhang, Browne, and Yao (2016) provide a very thorough and recent survey of evolutionary computation applied to AFE in the literature. They divide genetic programming based AFE algorithms into two categories: filter and wrapper approaches. A wrapper approach evaluates features produced by the AFE with an actual model type, such as a neural network. A filter approach scores individual genomes with a proxy algorithm. Because neural networks, and other models, can be computationally expensive, filter approaches provide a means of evaluating search spaces that would be too large for a wrapper approach. The AFE developed by this dissertation is a wrapper approach.

Xue et al. (2016) further classified wrapper genetic programming based AFEs as being either single-objective or multi-objective. Single-objective approaches simply search for any combination of features that improves accuracy. Their paper did not find any prior research on multi-objective wrapper genetic programming based AFE algorithms. The AFE provided by this dissertation is multi-objective because it seeks engineered features that neural networks cannot easily synthesize on their own. Because of this, the AFE developed by this dissertation is a novel approach. Xue et al. (2016) found 31 other papers that made use of wrapper based genetic programming. However, unlike the algorithm presented in this dissertation, none were specifically designed for deep neural networks.

Although these examples are similar to the AFE developed by this dissertation, the method in this dissertation is unique. Specifically, this dissertation leverages unique characteristics of deep neural networks to constrain the large search space of potential engineered features. Additionally, the algorithm in this dissertation introduces a novel objective function for genome selection and evaluates many potential engineered feature genomes simultaneously.

Guo et al. (2005) demonstrated that feature engineering could take advantage of genetic programming within the domain of fault classification. These researchers employed the Fisher criterion as the fitness function for the genetic program. This criterion is typically used in conjunction with Linear Discriminant Analysis (LDA) (Fisher, 1936). The criterion also measures the inter-class scatter between two classes in a classification problem. The work of Guo et al. experimented with expressions of original features to linearly separate pairs of classification outcomes in the training set. This approach differs from this dissertation research because it does not attempt to exclude engineered features that the neural network can easily synthesize on its own. In other words, the research of Guo et al. is not specifically tailored to deep neural networks. Guo et al. evaluated the results of feature engineering using a shallow 14-neuron network and a support vector machine.

Additionally, Neshatian (2010) provided another application of genetic programming to feature engineering. He demonstrated that objective functions that utilize an underlying predictive algorithm are computationally expensive. Rather than directly targeting one specific predictive model, the Neshatian algorithm uses a decision stump, which is essentially a single node of a decision tree. This single stump learns a threshold value that separates two classes with minimal overlap. In this way, the approaches of both Guo et al. (2005) and Neshatian (2010) are similar. To see the combined approach, consider a linear separation between two iris species in the classic Iris data set (Fisher, 1936), shown in Figure 10.

Figure 10. Feature engineering to linearly separate two classes

As observed, the genetic programming algorithm is seeking an engineered feature expression that isolates two of the classes (in this case, the iris species Setosa versus the other two species). The commonly used Iris data set contains the species and four measurements of 150 iris flowers of three species. The engineered feature becomes what is commonly referred to as a one-versus-rest model that distinguishes one classification from the rest. Both Neshatian (2010) and Guo et al. (2005) demonstrated that this model benefits feature engineering for decision trees and shallow neural networks. However, Neshatian (2010) noted the inability of the algorithm to engineer features that enhanced neural network prediction, attributing this failing to the fact that neural networks synthesize comparable engineered features on their own. However, deep neural networks do benefit from feature engineering (Bengio, 2013; Blei et al., 2003; M. Brown & Lowe, 2003; Coates et al., 2011; Coates & Ng, 2012). Prior research also empirically demonstrated the benefits of feature engineering to deep neural networks (Heaton, 2016). Thus, the main purpose of this dissertation was to find engineered features that increased the accuracy of deep neural networks.


Summary

Genetic programming and deep neural networks both accomplish similar tasks. They accept an input vector and produce either a numeric or categorical output. However, internally, they function differently. Deep learning adjusts large matrices of weights to produce the desired output. Genetic programming constructs computer programs to yield the desired output. A genetic program can be Turing complete; however, a feedforward neural network is not generally Turing complete.

Feature engineering is a technique that preprocesses the data set to transform it into a new data set designed to better fit a model and to achieve better accuracy. Typically, feature engineering is a manual process. However, research interest in automating aspects of feature engineering exists. Because genetic programming is capable of evolving programs and expressions from its input features, it is logical to make it the basis for a feature engineering algorithm. Creating this algorithm was the goal of this dissertation. The exact methodology that was used to build this algorithm is discussed in the next chapter.


Chapter 3

Methodology

Introduction

For this dissertation research, an algorithm was built that was capable of the automated creation of engineered features that could increase the accuracy of a deep neural network. This algorithm was designed as a metaheuristic search over combinations of the original feature set, with the goal of returning features from this search space that had the potential of increasing the accuracy of a deep neural network. Because the search space consists of mathematical expressions, genetic programming was a natural choice of metaheuristic.

This chapter includes the description of five experiments that provided background information for the design of this AFE algorithm. These experiments provided insights into search space constraints, objective function design, and other areas important to the creation of a genetic programming solution. The sixth experiment described at the end of the chapter provided a means for benchmarking the AFE algorithm.


Algorithm Contract and Specification

For the purposes of this research, a deep neural network composed of the following standard components was targeted:

 Layered Feedforward Neural Network (Rumelhart et al., 1985).

 ReLU or Sigmoid Transfer Function for Hidden Layers (Glorot et al., 2011).

 Xavier Weight Initialization (Glorot & Bengio, 2010).

 Adam Update Rule (Kingma & Ba, 2014) with Stochastic Gradient Descent (Bottou, 2012).

For a complete listing and description of the algorithms used for the AFE algorithm, refer to Appendix A, "Algorithms." The neural network algorithms were not modified to achieve the goal of this research. The feature engineering algorithm produced by this research is compatible with a standard feedforward deep neural network supported by any common deep learning framework, such as Theano (Bastien et al., 2012; Bergstra et al., 2010), CNTK (D. Yu et al., 2014), TensorFlow (Abadi et al., 2016), or Encog (Heaton, 2015).

It was important to use a standard deep feedforward neural network because the algorithm would have limited application if its engineered features could only increase the accuracy of non-standard implementations of deep learning. The accuracy achieved by different deep learning frameworks, such as Encog, Theano, or TensorFlow, will likely differ due to implementation details in these frameworks. Encog was the only deep learning framework in the scope of this dissertation.


The algorithm developed accepts an operating specification encoded as a YAML file, which the following example shows:

    input_dataset: input.csv
    input_x: [ 'acceleration', 'horsepower', 'cylinders' ]
    input_y: [ 'mpg' ]
    input_headers: true
    output_features: features.csv
    output_augmented_dataset: augmented_input.csv
    prediction_type: regression
    transfer_hidden: relu
    ...other configuration settings...

The above file specified the configuration to analyze the UCI AutoMPG data set. The features acceleration, horsepower, and cylinders allowed the model to make a prediction. In this case, the prediction was mpg. The features and prediction were all column names in the original file.

The configuration items have the following purposes:

 input_dataset: The name of the data set that contained the predictors (x) and expected outcomes (y). The algorithm used only the specified columns, except the target column(s), as the input.

 input_x: The list of column names used as predictors (x). The names could be either symbolic or the zero-based index of the column.

 input_y: The list of column names used as the target(s) (y). The names could be either symbolic or the zero-based index of the column.

 input_headers: Boolean that indicated if the input CSV had headers. If there were no headers, the columns were referenced by their zero-based index.

 output_features: A CSV file that contained a summary of the engineered features.

 output_augmented_dataset: The output CSV file that contained the input CSV file data, along with the new engineered features.

 prediction_type: The type of prediction; either classification or regression.

 transfer_hidden: The type of hidden transfer function targeted; either relu, sigmoid, or htan.

The algorithm's input file was the same input file that would be used to train a neural network. The output file contained the input data with the engineered feature columns appended, and this output file was then used to train a neural network. All common neural network frameworks could train from files of this format.
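For illustration, a driver program could read the specification with a standard YAML parser such as PyYAML; the sketch below assumes the field names shown in the example file and a placeholder filename.

    import yaml   # assumes the PyYAML package is installed

    with open("spec.yaml") as f:                 # "spec.yaml" is a placeholder filename
        spec = yaml.safe_load(f)

    predictors = spec["input_x"]                 # e.g. ['acceleration', 'horsepower', 'cylinders']
    targets = spec["input_y"]                    # e.g. ['mpg']
    is_regression = spec["prediction_type"] == "regression"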

The goal of this research was to show that a neural network trained from the algorithm's augmented input would allow a more accurate prediction for some data sets. Figure 11 summarizes this high-level design:


Figure 11. Algorithm high level design and contract

Other than z-score standardization, the AFE algorithm did not perform any preprocessing on the input data set. All values in this data set were assumed to be numeric, and any missing values were properly handled beforehand. For information on how data sets were preprocessed for this dissertation, refer to Appendix D, "Data Set Descriptions."


Algorithm Design Scope

The AFE algorithm utilized the genetic programming components of Encog, which implemented a genetic programming framework according to Koza (1992). The software developed centered primarily on the following aspects of genetic programming:

 Constraints to narrow the search domain.

 Objective function to effectively evaluate candidate features.

 Post processing analysis of genetic population.

The following sections highlight the approach for each of these components:

Narrowing the Search Domain

Placing search constraints upon a genetic programming algorithm is possible (Gruau, 1996; Janikow, 1996). These limits are sometimes implemented as an additional objective, producing a multi-objective genetic program. The primary objective for a genetic programming algorithm is to achieve a favorable score from the deep learning loss function. In prior research, it was determined that certain classes of transformations, such as simple power functions, ratios, differences, and counts, do not benefit deep neural networks (Heaton, 2016). Therefore, a second objective for a genetic algorithm is to avoid evaluating these transformations. This previous research was expanded in the dissertation to determine undesirable expressions so that the AFE algorithm could focus on the most beneficial structures for deep learning.

Genetic programming requires a palette of operators from which to construct expressions. The selection of these operators is critical to a good solution. For example, some problems may benefit from adding the trigonometry functions; however, the trigonometry functions might be unnecessary for other problems. Sometimes problem-specific functions are added to the operator set for genetic programming. For the research performed, the operator palette was small:

 Addition (+)

 Subtraction (-)

 Multiplication (*)

 Protected Division (typically represented as the % symbol)

 Exponent (^)

Neural networks are calculated by applying a weighted sum at each neuron, as demonstrated by Equation 1. Examining the equation reveals that neural networks can inherently multiply and sum. As a result, neural networks do not tend to benefit as much from engineered features involving simple multiplication and addition (Heaton, 2016). Therefore, the algorithm does not expend considerable time on the optimization of constants. It might not be worth the time to discover an engineered feature as complex as the following equation:

f_e = \frac{3 f_1 f_2}{2 f_3}   (15)

In the above equation, an engineered feature (fe) was calculated using three of the original features. Because neural networks can perform their own multiplication, it might be possible to simplify to the following equation:


f_e = \frac{f_1 f_2}{f_3}   (16)

The previous two equations are not mathematically equivalent; however, the second equation might be sufficient because the neural network can multiply the engineered feature by 3/2 on its own. Since the research conducted was targeted at deep learning, engineering parts of a feature that the neural network could easily learn during training was avoided. Additionally, other ways to handle constants were determined: constant coefficients of the engineered features were optimized using gradient descent.

Creating an Efficient Objective Function

Evolutionary algorithms require an objective function to score genomes. These scores determine which genomes will have the opportunity to mate and mutate, thus contributing to the next generation. Because the goal of this dissertation was to engineer features that improve the accuracy of a neural network, the objective function is based on an actual neural network that is trained with the candidate engineered features (genomes) produced by the genetic programming algorithm. These candidate engineered features are generated from the data set, just like the final set of engineered features recommended by the dissertation algorithm. Because neural networks are computationally expensive to train, this objective function was kept as efficient as possible. To achieve this goal, the neural network had an input neuron for every genome in the population. This makes sense because engineered features ultimately become the inputs to a neural network.

Mapping each candidate genome in the population to a neural network input allowed all genomes to be evaluated simultaneously by measuring their feature importance to a trained neural network. The exact process of this objective function will be discussed later in this report.
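At a high level, this objective function can be pictured as in the sketch below: every candidate genome contributes one engineered column, a single network is trained on all candidates at once, and perturbation importance yields one score per genome. The helper functions train_network and perturbation_importance are placeholders for framework-specific routines (the dissertation used Encog), and genome.evaluate(row) is an assumed interface.

    import numpy as np

    def score_population(genomes, x_original, y, train_network, perturbation_importance):
        # One engineered feature column per candidate genome.
        candidate_features = np.column_stack(
            [np.array([genome.evaluate(row) for row in x_original]) for genome in genomes])
        model = train_network(candidate_features, y)              # one network for all genomes
        importances = perturbation_importance(model, candidate_features, y)
        return importances                                        # one score per genome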

Experimental Design

To overcome the issues and barriers previously mentioned, a series of experiments were conducted. The results from the first five experiments identified the design characteristics of the genetic programming elements specified in the previous section. Specifically, these experiments showed how to design constraints, measure the success of engineered features, and analyze the population. The experiments conducted are listed here:

 Experiment 1: Limiting the Search Space

 Experiment 2: Establishing Baseline

 Experiment 3: Genetic Ensembles

 Experiment 4: Population Analysis

 Experiment 5: Objective Function Design

 Experiment 6: Automated Feature Engineering

Each of these experiments was measured as described in this chapter.

Measures

Neural networks are a stochastically trained model, and measuring the relative performance of individual training runs can be difficult. This randomness first occurs because neural networks are initialized with random weights. The randomness continues with the stochastic gradient descent training algorithm, which chooses randomly sampled mini-batches to train the neural network. This randomness in the training process prevents two neural network training runs from producing the same training result. To mitigate the effects of this randomness, multiple neural network cycles were conducted and the average network error was considered.

Feature importance and network error were two primary metrics considered for this dissertation. Feature importance determines how important an individual member of the feature vector is to the predictions of the neural network. Neural network error, the second metric, determines the accuracy of a neural network. Both metrics are described in this section.

Feature importance can be measured for a model through model-agnostic calculations and model-dependent approaches. A model-agnostic calculation measures feature importance purely by querying the model; internal analysis of the model is not needed. A model-agnostic metric works with any type of model, including a neural network, random forest, or support vector machine (SVM). Perturbation feature importance (Breiman, 2001) is a model-agnostic calculation technique that was utilized in this research. Additionally, the model-dependent approach of analyzing the weights of the neural network was also considered. The perturbation algorithm was chosen for its accuracy and compatibility with deep learning (Olden et al., 2004).

It is also necessary to measure the accuracy of the neural network. The Root Mean Square Error (RMSE) was employed for both classification and regression problems. Although RMSE is sometimes called Root Mean Square Deviation (RMSD), the two terms are equivalent. RMSE/RMSD (McKinney, 2012) is given in the following equation:


\mathrm{RMSE} = \mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}   (17)

The vector ŷ represents the actual output vector from the neural network, and the vector y represents the expected output from the neural network. The variable n represents the number of elements compared. The RMSE reports the error in the same units as the data set. For regression experiments, the RMSE is essentially the average magnitude difference between the predictions and expected outputs. For classification problems, the RMSE is the difference between the predicted and actual probabilities of each class. For a classification problem, the expected probability is 1.0 for the correct class and 0.0 for all other classes.
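For reference, the RMSE of Equation 17 is a one-line calculation; the sketch below is illustrative only.

    import numpy as np

    def rmse(y_pred, y_true):
        y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
        return np.sqrt(np.mean((y_pred - y_true) ** 2))

    print(rmse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))   # approximately 0.408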

RMSE values from the same data set can simply be compared because they are in the same units. However, RMSE values from two different data sets cannot be compared directly. Because the focus of this dissertation is improvement in accuracy for a neural network on a single data set, there is no need to compare across data sets. For this reason, RMSE was chosen as the error evaluation formula for both neural networks and genetic programs. Engineered features were considered successful if they decreased the RMSE (increased the accuracy) of a data set by an amount that was deemed statistically significant with a Student's T-Test.

Experiment 1: Limiting the Search Space

The first experiment continued feature engineering research performed prior to this dissertation (Heaton, 2016). This earlier research demonstrated that neural networks, decision trees, random forests, and gradient boosting machines benefited from different types of engineered feature. Additionally, the research showed that there were several types of engineered feature that were not particularly easy for a neural network to learn on its own. If a neural network could learn to synthesize an engineered feature on its own, adding this feature would not be particularly helpful to the neural network.

The search space of a genetic programming-based algorithm can be narrowed down. The process entails constraining the genomes to expressions similar to those that are known to be difficult for neural networks to synthesize. Heaton (2016) tested the effectiveness of a deep neural network at learning to synthesize several types of manually engineered features. The neural network learned single-feature transformations, such as log, polynomial, rational polynomial, power, quadratic, and square root, without problem. However, multi-feature transformations were not as simple. The neural network could calculate counts and differences; rational differences and simple ratios were more difficult for the neural network.

Experimental Design

The first experiment examined the effectiveness of other expression types with neural networks using the same technique as the previous investigation (Heaton, 2016). This method for testing an expression involved generating a data set whose outcome was the result of the expression being tested. Randomly sampled values were used as the input (x) for these expressions. If the neural network converged to a low RMSE score, then the neural network learned the expression. Furthermore, engineered features in the same format as that expression were more likely to produce more accurate neural networks. RMSE scores were not range normalized because the desire was to see how closely a neural network or genetic program could approximate a function. Because neural networks are stochastic, it is important to perform many experiment cycles and evaluate the mean and standard deviation of the RMSE. For more information on the number of experiment cycles, refer to Appendix B, "Hyperparameters."
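The following sketch illustrates the learnability test: a synthetic data set whose target is a candidate expression (here a simple ratio) is generated, a small feedforward network is fit, and the held-out RMSE is inspected. scikit-learn is used here purely for illustration; the dissertation's experiments used Encog, and the hyperparameters shown are assumptions.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(42)
    x = rng.uniform(1.0, 10.0, size=(1000, 2))
    y = x[:, 0] / x[:, 1]                              # target expression: x1 / x2

    net = MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=2000, random_state=42)
    net.fit(x[:800], y[:800])
    pred = net.predict(x[800:])
    print(np.sqrt(np.mean((pred - y[800:]) ** 2)))     # RMSE on the held-out rows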

Results Format

The results from this experiment were reported in tabular format with the following columns: experiment number, name, expression, standard deviation (SD), minimum error, and mean error. The results from the neural network and genetic programming were reported in different tables. The result format from this experiment for neural networks appeared as Table 2:

Table 2. Experiment 1 results format, neural network and genetic programming

#     Name                Expression               SD     Min Error   Mean Error
1-1   Ratio               x1 / x2                  ###    ###         ###
1-2   Ratio Difference    (x1 - x2) / (x3 - x4)    ###    ###         ###
...   ...                 ...                      ...    ...         ...

Genetic programs can be taught to approximate expressions, just like neural networks. While genetic programming was not evaluated in the original research, seeing the error for the genetic program gave an indication of the ability of genetic programming to synthesize the expressions. It was expected that genetic programming could easily synthesize any of the expressions and achieve a low error. However, this result was not always the case when the experiments were run. Genetic programming hyperparameters were tuned to achieve this low error. Cases where the genetic programming algorithm did not achieve a low error rate were due to convergence to local optima.


Experiment 2: Establishing Baseline

For the dissertation algorithm to be effective, engineered features from this algorithm needed to enhance neural network accuracy. To measure this performance, several public and synthetic data sets were used. It was necessary to collect a baseline

RMSE for a deep neural network trained on each data set without any help from the dissertation algorithm. The neural network topology, or hyperparameters, were determined by the data set dimensions, as described in Appendix B, “Hyperparameters.”

It was important that the topology included enough hidden neurons so that the neural network could learn the data set with reasonable accuracy.

Experimental Design

The baseline experiment provided a neural network result against which to compare the results from the dissertation algorithm. Not all data sets were expected to benefit from feature engineering, so it was important to select several real-world and synthetic data sets. The list of data sets is provided later in this chapter and described in detail in Appendix D,

“Data Set Descriptions” and Appendix E, “Synthetic Data Set Generation.” Each neural network was trained until the validation error no longer improved, using an early stopping algorithm.

Results Format

The results from this experiment were reported in tabular format with the following columns: experiment number, name, neural network minimum error, neural network mean error, genetic programming minimum error, and genetic programming mean error. Additionally, the t-statistic and p-value from a Student's t-test are reported between the neural network and

genetic programming algorithm. To truly evaluate the effectiveness of the AFE algorithm, it was important to ensure that genetic programming alone could not produce a statistically significant improvement that was equal to or greater than the dissertation algorithm's gains. The result format from this experiment appeared as Table 3:

Table 3. Experiment 2 result format, neural network baseline

#     Name       Neural Min   Neural Mean   GP Min   GP Mean   T-Statistic   P-Value
2-1   Abalone    ###          ###           ###      ###       ###           ###
2-2   auto-mpg   ###          ###           ###      ###       ###           ###
2-3   Bupa       ###          ###           ###      ###       ###           ###

Experiment 3: Genetic Ensembles

Neural networks and genetic programs can both function as regression models. It is possible to combine models that are trained to accomplish the same objective into ensembles (Dietterich, 2000). A simple blending ensemble takes the output from several models and uses another model type such as a neural network, generalized linear model

(GLM), or simple average to combine the results of the member models. It is important to ensure that the AFE algorithm produces results that are better than a simple ensemble of genetic programs. To accomplish this objective, Experiment 3 established a second baseline.

This experiment sought only to use existing genetic programming technology to achieve this baseline. The Encog machine learning framework only supports genetic programming for regression problems. Because of this limitation, only regression data sets were considered for the ensembles experiment. This limitation exists only for Experiment 3—the AFE algorithm created for this dissertation supported both classification and regression.

Experimental Design

Experiment 3 paralleled Experiment 2 to some degree. However, instead of evaluating individual neural networks, ensembles of neural networks and genetic programs were evaluated for each of the data sets. The ensemble for this experiment was created by training 10 genetic programs to fit the data set with the same inputs and expected output as in Experiment 2. A new data set was created that uses these 10 outputs as the input to a neural network. The expected output for the neural network remains the same as the original data set. The structure of this ensemble is shown in Figure 12:

Figure 12. Ensemble of genetic programs for a neural network
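The construction can be sketched in a few lines of Python. This is only a hedged illustration: the ensemble members below are ordinary scikit-learn regressors standing in for the Encog genetic programs, and the synthetic data set, member count, and model sizes are assumptions made purely for the example.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=2000, n_features=8, noise=5.0, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=7)

# Train 10 member models on the original inputs. In Experiment 3 these members were
# genetic programs fit by Encog; shallow decision trees of varying depth are stand-ins here.
members = [DecisionTreeRegressor(max_depth=3 + i % 3, random_state=i).fit(X_tr, y_tr)
           for i in range(10)]

# The members' outputs become a new 10-column data set that a neural network blends.
Z_tr = np.column_stack([m.predict(X_tr) for m in members])
Z_te = np.column_stack([m.predict(X_te) for m in members])

blender = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=7).fit(Z_tr, y_tr)
rmse = mean_squared_error(y_te, blender.predict(Z_te)) ** 0.5
print(f"Ensemble RMSE: {rmse:.4f}")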

Results Format

As with previous experiments, the mean and minimum of the error over several runs were given. The results from this experiment were reported in tabular format with the following columns: experiment name and number, ensemble minimum error, ensemble mean error, and neural network mean error. Like Experiment 2, the results of a Student's

T-Test were given between the ensemble and neural network results from Experiment 2,

thereby demonstrating the statistical significance of the results. This experiment ensured that a genetic ensemble alone could not achieve the same results as the AFE algorithm that was evaluated in Experiment 6. All error values were reported as RMSE. The result format from this experiment appeared as Table 4:

Table 4. Experiment 3 result format, neural network genetic program ensemble

#     Name       Ensemble Min   Ensemble Mean   Neural Mean   T-Statistic   P-Value
3-1   Abalone    ###            ###             ###           ###           ###
3-2   auto-mpg   ###            ###             ###           ###           ###
3-3   Bupa       ###            ###             ###           ###           ###

This experiment gave early indications that genetic programming could automatically add information to the data set that the neural network alone could not determine. The results of Experiment 3 can be compared to those of Experiment 2. For the data sets where the ensemble performed better than the neural network alone, there was the expectation that a genetic programming-based algorithm could engineer a viable feature for further accuracy improvement. If Experiment 3 achieved a better result with the ensemble than Experiment 2 did with a neural network alone, this showed that a viable feature had essentially already been engineered by the ensemble. These comparisons were performed as part of Experiment 6.

Experiment 4: Population Analysis

Previous experiments fit genetic programs to function as complete models. These complete models had the potential to use the entire feature vector and be complex.

Features that are engineered by human analysis are usually simple combinations of a handful of the original features. The purpose of this experiment was to determine if the

complex genetic programming models could be distilled to a representation that provided insight on how to combine elements of the original feature vector into viable, engineered features. These simpler features might perform enough of the calculation to help augment the data for a neural network.

Experimental Design

The input data set allowed the generation of several candidate solutions. This process to generate the ensemble members was the same as in Experiment 3. Figure 13 shows the overall flow for generating the candidate solutions:

Figure 13. Generate candidate solutions

Unlike Experiment 3, the value of N was larger; for this experiment, a value of 100 was used. More candidate solutions create more data from which the program can find a pattern. The program searches for pairs of input features that are often connected by a common operator higher in the tree. Figure 14 shows a hypothetical example of a pattern.


Figure 14. Branches with common structures

This figure shows three branches that might be present on trees from a much larger forest. The three branches always have the features a and h united by a division operator higher in the branch. Of course, these branches are part of complete trees in the forest of candidate solutions. Partial branches are shown for clarity and brevity. The fact that a and h often occur together in a ratio might indicate that final engineered features use these two in a ratio. This information could allow the construction of the features outright. It could also narrow the search space for a given data set.

Results Format

The results from this experiment were reported in tabular format with the following columns: data set number, name, and the patterns found. Patterns were reported in a format such as “a/b”, “a*b”, or “(a+b)/c”. The result format from this experiment is like

Table 5:


Table 5. Experiment 4 results format, patterns in genetic programs

#     Name                      Patterns
4-1   Auto MPG                  (x1/x2), (x2*x3), ((x1*x2)+x1)
4-2   Wisconsin Breast Cancer   (x2*x3)
...   ...                       ...

Experiment 5: Objective Function Design

Evolutionary algorithms work through a process that is based on natural selection

(Holland, 1975). To select from among the genomes, it is necessary to evaluate the effectiveness of the candidate engineered features relative to each other. Several algorithms can determine the importance of features to a neural network. For this research, a feature-ranking algorithm needed to be relatively stable, despite the stochastic nature of neural networks. Feature ranking stability was measured by the degree to which multiple neural networks produced the same ranking of features for the same data set.

The goal of this experiment was to determine an effective feature ranking algorithm for the dissertation research. The two main considerations for feature ranking were the following:

• Which feature-ranking algorithm is the most stable?
• How far should a neural network be trained?

It was desirable to keep the objective function as computationally inexpensive as possible. The faster the objective function executed, the larger the search space that could be covered. To evaluate feature ranking, it was not necessary to train the neural networks to their maximum accuracy. Instead, it was crucial to train the neural network to the point

that the ranking algorithm could give a stable assessment of the relative importance of each of the features. This experiment provided answers to both concerns.

Experimental Design

This experiment evaluated the ranking of the original feature vector for each of the data sets considered for this dissertation. Each data set was trained multiple times with its feature ranking evaluated at each training iteration. A validation holdout allowed the training algorithm to know the stopping point. Once the validation set’s error no longer improved, training stopped. The percent validation improvement was reported for the iteration where the feature ranking first stabilized to the same order as the final feature ranking. The idea was that training could be stopped at the stage where the validation error improvement hit that point, and a reasonably accurate feature rank could be determined.

Results Format

Each of the real world (non-synthetic) data sets was evaluated, and the minimum validation set improvements were reported. Additionally, the final feature rank was also reported. The results appeared in the format of Table 6:

Table 6. Experiment 5 results format, evaluating feature ranking

#     Name                      Rank                 Weight   Permutation
5-1   Auto MPG                  x3,x2,x1,x10,x11     ###      ###
5-2   Wisconsin Breast Cancer   x3,x2,x4,x10,x1      ###      ###
...   ...                       ...                  ...      ...


Experiment 6: Automated Feature Engineering

The final experiment brought the results of the previous five experiments together into the AFE algorithm, thereby accomplishing the objective of this dissertation. The goal of Experiment 6 was to compare, for each data set, the validation error of a baseline neural network with that of the same neural network augmented with features that the dissertation algorithm engineered.

Experimental Design

This experiment was conducted similarly to Experiment 2, except that the neural networks were augmented with features that were engineered from each of the data sets.

The dissertation algorithm's effectiveness was evaluated by comparing the neural network's accuracy on the augmented data set with the baseline result from Experiment 2.

Comparing the results of the dissertation algorithm to the Experiment 2 baseline showed that some of the data sets had improved results when augmented with engineered features.

The feature-engineering algorithm’s development was guided by Experiments 1, 3,

4, and 5. Experiment 2 established a baseline for comparing the algorithm. Experiment 6 underwent several iterations as the algorithm was enhanced and its performance measured. The final algorithm, measured by Experiment 6, is given in Chapter 4,

“Results.” The framework for fitting the experiment results into the preliminary algorithm is given by Figure 15.


Figure 15. High-level overview of AFE algorithm

This algorithm used two neural networks. Neural Network 1 evaluated a population of potential engineered features. This population was constrained using information learned from Experiment 1. The objective function came directly from Experiment 5.

Experiment 4 provided information to refine the candidate expressions from the population into potential engineered features. These engineered features were ranked and evaluated using Neural Network 2 and examined to join a list of top engineered expressions. The above abstract flow chart evolved into the final AFE algorithm given in

the next chapter. The goal was to engineer features that could boost the baseline accuracy (Experiment 2) of the data sets.

Two neural networks were trained. The first neural network was trained on the original predictors (x) and the second on the augmented feature vector. Both were trained to produce the outcome (y) that this algorithm did not alter. Figure 16 shows the setup for the final evaluation.

Figure 16. Dissertation algorithm evaluation


This figure closely resembles Figure 11. Essentially, the program forked the input data set to be fed to two neural networks for training. The leftmost neural network trained without the AFE algorithm, and the neural network on the right trained with the algorithm. The objective was to determine if the neural network with augmented predictors could perform better than a neural network with only the original features.

Results Format

The results of this experiment showed which features could be engineered for each of the data sets in the dissertation. The results contained the following columns: experiment number, data set name, neural network mean RMSE, augmented (auto) neural network mean RMSE, and the improvement from using the algorithm. The neural network error column reports the same values as Experiment 2 because this experiment compares the dissertation algorithm to the neural network baseline. Table 7 shows the anticipated format for the results of Experiment 6:

Table 7. Experiment 6 results format, engineered feature effectiveness

#     Name       Neural Mean   Auto Mean   Change Difference   Change Percent
6-1   Abalone    ###           ###         ###                 ###
6-2   auto-mpg   ###           ###         ###                 ###
6-3   Bupa       ###           ###         ###                 ###

It was assumed that the AFE algorithm introduced in this dissertation would not improve accuracy for every data set. The statistical significance of the change in this table was evaluated for all data sets that showed a negative percent change. Additionally, this change was compared against the change in accuracy observed for the genetic programming algorithms in Experiment 2 and the ensembles in Experiment 3. This result

demonstrated that the AFE algorithm could provide more accuracy than genetic programming alone.

Real World Data sets

The success of an AFE algorithm was ultimately measured by its ability to generate useful features for real-world data sets. The following were the primary sources for real-world data sets:

• UCI Machine Learning Repository (Newman & Merz, 1998)
• PROBEN1 Datasets (Prechelt, 1994)

All real-world data sets came from the UCI machine learning repository. Data sets that had the following attributes were favored:

• No image or audio data sets
• At least 10 numeric (continuous) features
• Features should be named quantities, such as measurements, money, or counts

The following UCI data sets were used for this research:

• Abalone Data Set (Abalone)
• Auto MPG Data Set (Auto-mpg)
• Liver Disorders Data Set (Bupa)
• Covertype Data Set (Covtype)
• Credit Approval Data Set (Crx)
• Forest Fires Data Set (Forestfires)
• Glass Identification Data Set (Glass)
• Heart Disease Data Set (Heart)
• Hepatitis Data Set (Hepatitis)
• Horse Colic Data Set (Horse)
• Housing Data Set (Housing)
• Pima Indians Diabetes Data Set (Pima)
• Soybean (Large) Data Set (Soybean)
• Breast Cancer Wisconsin P (Prognostic) (WcBreastP)
• Breast Cancer Wisconsin D (Diagnostic) (WcBreastD)
• Wine Data Set (Wine)

Prechelt (1994) introduced the PROBEN1 collection of data sets. This collection contained 13 standardized data sets from the UCI repository. The paper that presented

PROBEN1 had neural network benchmark results. Additionally, the PROBEN1 paper has over 900 citations, many of which publish additional neural network results on these data sets. These characteristics made the PROBEN1 data sets good candidates for comparison in this study. For more information on how these UCI data sets were preprocessed for this dissertation refer to Appendix D, “Data Set Descriptions.”

Synthetic Data sets

Not all data sets saw increased accuracy from engineered features, especially if the underlying data did not contain relationships that feature engineering can expose. Thus, it was necessary to create data sets that were designed to include features that are known to benefit deep neural networks. A program was created that generated data sets that contained outcomes that were designed to benefit from feature engineering of varying degrees of complexity. It was necessary to choose engineered features that the deep neural networks could not easily learn for themselves. The goal was to engineer features

that helped the deep neural network—not features that would have been trivial for the network to learn on its own.

In previous research, Heaton (2016) formulated a simple way to learn the types of features that benefit a deep neural network. Training sets were generated in which the expected output was the output of the engineered feature. If the model could learn to synthesize the output of the engineered feature, then adding this feature would not benefit the neural network. This process was like the common example of teaching a neural network to become an XOR operator. Because neural networks can easily learn to perform as XOR operators, the XOR operation between any two original features would not make a relevant engineered feature. For more information on how the synthetic data sets used in this dissertation were created, refer to Appendix E, “Synthetic

Data Set Generation.”
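The idea can be sketched with a few lines of Python. This is not the generator described in Appendix E; the number of rows, the ratio relationship between x3 and x8, the noise level, and the file name are illustrative assumptions. A ratio is used because Experiment 1 showed that neural networks struggle to learn it on their own.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

# Ten ordinary predictors plus an outcome that depends on a ratio of two of them.
X = rng.uniform(1.0, 10.0, size=(n, 10))
y = X[:, 2] / X[:, 7] + rng.normal(0.0, 0.05, size=n)   # small noise term

cols = [f"x{i + 1}" for i in range(10)]
frame = pd.DataFrame(X, columns=cols)
frame["y"] = y
frame.to_csv("synthetic_ratio.csv", index=False)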

Resources

The hardware and software components necessary for this dissertation were all standard, readily available, off-the-shelf personal computer components and software. Two quad-core Intel i7 (Broadwell) machines with 16 gigabytes of RAM were available for this research. These systems performed many of the computations needed to support this research. In addition, the author made considerable use of Amazon EC2/AWS. An AWS c3.8xlarge instance type with 32 processing cores and 60 gigabytes of RAM was the primary resource. Funds were also budgeted to purchase AWS cloud time. Given the computational complexity of the AFE algorithm, a high-end compute-optimized AWS instance was invaluable. There were no financial costs to Nova Southeastern University for this research.


The Java programming language (Arnold, Gosling, & Holmes, 1996) served as the primary programming language for this research. The Java 8 version (JDK 1.8) provided the specific implementation of the programming language. Furthermore, Python 3.5

(Van Rossum, 1995) was used in conjunction with SciPy (Jones, Oliphant, Peterson, et al., 2001), scikit-learn (Pedregosa et al., 2011), and TensorFlow (Abadi et al., 2016) for additional deep learning experimentation. The Python machine learning packages were used to compare select neural networks and feature combinations with the Encog library.

Encog version 3.4 (Heaton, 2015) provided the deep learning and genetic programming portions of this research. Encog gives extensive support for both deep learning and genetic programming. Additionally, Encog is available for both the Java and C# platforms. The author of this dissertation wrote much of the code behind Encog and has extensive experience with the Encog framework. For a complete list of third party libraries used in this dissertation, refer to Appendix C, “Third Party Libraries.”

Summary

This dissertation leveraged genetic programming to create an algorithm that could engineer features that increased the accuracy of a deep neural network. Not all data sets contained features that could be engineered into a better feature vector for the neural network. Thus, it was important to use several different data sets to evaluate the AFE algorithm. The effectiveness of the algorithm was determined by evaluating the change in error between two neural networks—one had access to the algorithm’s engineered features and the other did not.

The algorithm was created by combining the knowledge gained from five experiments. The sixth experiment performed a side-by-side benchmark between data

sets augmented with features engineered by the algorithm and those that were not. The effectiveness of the algorithm was measured by the degree to which the error decreased for the feature-engineered data set, when compared to an ordinary data set.

The five experiments evaluated how to leverage different aspects of genetic programming for neural network feature engineering. Expressions that were beneficial to neural networks were explored. Objective function design was examined. Neural network feature ranking was optimized for the quickest results. Ensembles could detect data sets that benefitted the most from feature engineering. Information gained from these experiments guided the algorithm design.

For reasons of confidentiality, the source code will not be publicly distributed prior to formal publication of the dissertation report. At that point, the source code necessary to reproduce this research will be placed on the author’s GitHub1 repository. To see the complete implementation in Java source code, refer to Appendix H, “Dissertation Source

Code Availability.” Upon completion of the project, the final dissertation report will be distilled and submitted as an academic paper to a journal or conference.

1 http://www.github.com/jeffheaton


Chapter 4

Results

Introduction

The five experiments were conducted, and data from their results were compiled.

These results helped to guide the creation of the final algorithm that automatically engineers features for deep neural networks. The algorithm was developed and improved by rerunning the experiments and optimizing the neural network and genetic programming hyperparameters. The final set of hyperparameters that were used across all experiments is provided in Appendix B,

“Hyperparameters.” The complete source code for all experiments and the final algorithm is available at the author’s GitHub account. For more information on obtaining the source code refer to Appendix H, “Dissertation Source Code Availability.”

Finally, the sixth experiment was conducted to demonstrate the degree to which the

AFE algorithm increased the accuracy of deep neural networks. This chapter begins with a summary of the results obtained from each of the five experiments. The complete detail is provided in Appendix G, “Detailed Experiment Results.” Next, the final AFE algorithm is defined and the rationale for its design explained. Finally, this chapter concludes with the results of Experiment 6, where the effectiveness of the AFE algorithm is demonstrated.

Experiment 1 Results

The purpose of the first experiment was to continue the research of the author

(Heaton, 2016) and empirically demonstrate which types of features have the greatest

potential to increase a deep neural network's predictive accuracy. This result was accomplished by finding several expression types that neural networks cannot easily learn independently. Because the neural networks cannot easily learn these expression types on their own, adding them to the input feature vector has the potential of increasing the neural network accuracy. Of course, the expression must expose some meaning for the data set, such as an interaction between two features. Thus, an increase in accuracy was not observed for every data set investigated.

Experiment 1 investigated how well neural networks and genetic programming could fit to a synthetic data set made up of several mathematical expression types. A lower

RMSE indicates that the neural network or genetic program could easily replicate the expression without the need to engineer it. A higher RMSE indicates that the neural network or genetic program had a more difficult task replicating the expression. The fact that a neural network has a more difficult time replicating a given expression generally indicates an expression type that might be a valuable type of engineered feature for the neural network. For more information about how the synthetic expression data set was generated, refer to Appendix E, “Synthetic Data Set Generation.”

The results from Experiment 1 are presented in two tables. Table 8 shows the results that were observed from fitting a neural network to the synthetic expressions dataset.

Because neural networks are stochastic, it is important to perform many experimentation cycles and evaluate the mean and standard deviation of accuracy. Similarly,


Table 9 shows the results that were observed from fitting a genetic program to the expressions data set with error measured in RMSE:

Table 8. Experiment 1 neural network results for select expressions

#      Name                Expression                    SD       Min Error   Mean Error
1-1    Difference          x1 - x2                       0.3767   0.0006      0.3152
1-2    Log                 log(x1)                       0.3314   0.0419      0.4762
1-3    Polynomial          8x1^2 + 5x1 + 1               1.4432   0.0585      2.5
1-4    Polynomial2         5x1^2*x2^2 + 4x1*x2 + 2       0.392    0.1534      0.7735
1-5    Power               x1^2                          0.0928   0.0216      0.2414
1-6    Ratio Polynomial    1/(8x1^2 + 5x1 + 1)           0.0586   9.9919      10.231
1-7    Ratio               x1/x2                         0.0181   27.3997     27.4702
1-8    Ratio Difference    (x1 - x2)/(x3 - x4)           0.0042   22.973      22.9831
1-9    Ratio Polynomial2   1/(5x1^2*x2^2 + 4x1*x2 + 2)   0.0722   0.0287      0.1558
1-10   Square Root         sqrt(x1)                      0.0975   0.0075      0.1703


Table 9. Experiment 1 genetic program results for select expressions

#      Name                Expression                    SD       Min Error   Mean Error
1-11   Difference          x1 - x2                       0.1712   0           0.0624
1-12   Log                 log(x1)                       0.1221   0.0229      0.2
1-13   Polynomial          8x1^2 + 5x1 + 1               0.4141   0.0002      0.9434
1-14   Polynomial2         5x1^2*x2^2 + 4x1*x2 + 2       0.252    0.2         0.7008
1-15   Power               x1^2                          0.0348   0           0.0057
1-16   Ratio               x1/x2                         6.5254   0           2.4261
1-17   Ratio Difference    (x1 - x2)/(x3 - x4)           0.0158   22.906      22.9732
1-18   Ratio Polynomial    1/(8x1^2 + 5x1 + 1)           0.2091   7.7973      7.8369
1-19   Ratio Polynomial2   1/(5x1^2*x2^2 + 4x1*x2 + 2)   0.0406   0.0812      0.1587
1-20   Square Root         sqrt(x1)                      0.038    0           0.0373

It is evident from Table 8 that neural networks are not particularly adept at synthesizing the expression types of rational difference and ratio. It can also be observed that genetic programming can often synthesize several features perfectly, such as the difference, power, ratio, and square root. The genetic programming algorithm (Table 9) did sometimes attain a perfect 0 RMSE (as evident from the minimum column), whereas the neural networks (Table 8) never achieved a perfect 0 RMSE. The fact that rows whose minimum did hit zero still show a much higher mean score indicates that genetic programming frequently converges to local optima; even so, genetic programming clearly has the potential to engineer features that are not easily synthesized by neural networks.


There were two primary conclusions from Experiment 1. First, the genetic programming algorithms are prone to settling into local optima. Second, genetic programs can synthesize certain expressions that neural networks are not able to synthesize. This result is consistent with prior research. Furthermore, this dissertation extends previous investigations (Heaton, 2016) by demonstrating that genetic programs can overcome these limitations of neural networks for some data sets.

These observations influenced the algorithm design in two ways. First, the AFE algorithm aggregated many genetic programming solutions together to mitigate the effects of local optima. Second, the search space of potential expressions was biased to avoid expression types that neural networks could synthesize without the need for feature engineering.

Experiment 2 Results

The purpose of the second experiment was to create a baseline of neural network and genetic programming performance over several real world and synthetic data sets. This baseline was compared against the AFE algorithm in Experiment 6. Improving the accuracy that neural networks attained in Experiment 2 was ultimately the objective of

Experiment 6 as it evaluated the effectiveness of the AFE algorithm. Although the focus of Experiment 2 is to provide baseline neural network results, genetic programming results were also obtained. These results were useful to ensure that genetic programming alone was not sufficient to improve the neural network accuracy. Furthermore, classification problems were not considered with genetic programming because Encog does not currently support genetic programming classification. This Encog limitation

does not affect the AFE algorithm's ability to handle classification problems because the dissertation algorithm only uses regression genetic programming internally. The baseline results are presented in Table 10:

Table 10. Experiment 2 baselines for data sets (RMSE)

#      Name             Neural Min   Neural Mean   GP Min   GP Mean    T-Statistic   P-Value
2-1    Abalone          2.0992       2.6523        2.2149   2.5843     1.4487        0.149
2-2    auto-mpg         2.2207       3.3376        3.0231   5.4287     -9.3793       0
2-3    Bupa             0.4005       0.4552        0.4032   0.4746     -4.0954       0.0001
2-4    Covtype          0.3382       0.3999
2-5    Crx              0.3062       0.3593        0.3168   0.3877     -7.9337       0
2-6    Forestfires      0.057        0.0608
2-7    Glass            0.2743       0.3377
2-8    Heart            0.3016       0.3378
2-9    Hepatitis        0.3973       0.4576        0.3922   0.4422     4.8029        0
2-10   horse-colic      0.3248       0.366
2-11   Housing          0.0445       0.0447
2-12   Iris             0.034        0.3182
2-13   Pima             0.3791       0.4164        0.4019   8.36E+15   -1.0051       0.3161
2-14   soybean_large    0.068        0.1936
2-15   wcbreast_wdbc    0.0803       0.0989        0.1827   0.3033     -38.5506      0
2-16   wcbreast_wpbc    0.293        0.3812        0.4049   0.436      -13.1433      0

The first column provides the experiment and test number. The second and third columns give the neural network minimum and mean across all experiment cycles. The fourth and fifth columns state the genetic programming algorithm's minimum and mean RMSE scores across all experiment cycles. Finally, the sixth and seventh columns indicate the results of a Student's t-test to investigate the statistical significance of differences between the effectiveness of a neural network and genetic program for the given data set. The null hypothesis is that genetic programming and neural networks perform the same for a given data set. A p-value that is less than 0.05

indicates that the null hypothesis can be rejected. Positive values in the t-statistic column correspond to a genetic programming result that was superior to the neural network result.

All values are given in RMSE, which means that the errors reported are in the same unit as the data set's target variable. Thus, RMSE errors between two rows cannot be compared. There were only two data sets where genetic programming performed better than the neural network, and both were relatively small improvements. The improvement noted in the Abalone data set was not statistically significant, as indicated by the p-value above 0.05. Genetic programming did produce a small but statistically significant improvement for the hepatitis data set. Experiment 6 demonstrated this result when it evaluated the automated feature engineering algorithm's effectiveness on this data set. All the other data sets did not perform as well with genetic programming as they did with neural networks. Many of these cases of poorer genetic programming performance were statistically significant, as evident from their p-values. The fact that the AFE algorithm increased the accuracy of some of these data sets is further evidence that the algorithm has capabilities beyond traditional genetic programming.
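For reference, a comparison like the ones in Table 10 can be computed with SciPy's implementation of the Student's t-test; the RMSE arrays below are illustrative placeholders, not values from the actual experiment cycles.

import numpy as np
from scipy import stats

# RMSE values collected over the experiment cycles for one data set (illustrative numbers).
neural_rmse = np.array([2.61, 2.58, 2.70, 2.66, 2.63])
gp_rmse     = np.array([2.55, 2.59, 2.62, 2.57, 2.60])

# Null hypothesis: the two model families perform the same for this data set.
t_stat, p_value = stats.ttest_ind(neural_rmse, gp_rmse)
print(f"t = {t_stat:.4f}, p = {p_value:.4f}")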

Experiment 3 Results

Experiment 3 demonstrated that neural networks and genetic programs could function together as ensembles. In some cases, an ensemble of genetic programs and a neural network could provide greater accuracy than a neural network alone. Experiment

3 can be considered an early prototype of the final AFE algorithm. However, Experiment

3 did not create an algorithm. Experiment 3 simply used off-the-shelf genetic programming technology to fit a series of genetic programs to a data set. An off-the-shelf

deep neural network was used to ensemble the output from the genetic programs into a final output. The accuracy of that final output was evaluated and is reported in Table 11:

Table 11. Experiment 3, Ensemble, GP & neural network results

#     Name             Ensemble Min   Ensemble Mean   Neural Mean   T-Statistic   P-Value
3-1   Abalone          2.0783         2.6275          2.6523        0.3819        0.7029
3-2   auto-mpg         2.1895         3.7463          3.3376        -1.3475       0.1794
3-3   Bupa             0.3957         0.4503          0.4552        0.9474        0.3446
3-4   Crx              0.2966         0.3539          0.3593        1.2481        0.2135
3-5   Hepatitis        0.4185         0.4586          0.4576        -0.323        0.747
3-6   Pima             0.3762         0.4122          0.4164        0.9676        0.3344
3-7   wcbreast_wdbc    0.0803         0.0979          0.0989        0.367         0.714
3-8   wcbreast_wpbc    0.2656         0.3869          0.3812        -1.0239       0.3071

The first column indicates the experiment and test number. The second column provides the data set name. The third and fourth columns give the minimum and mean

RMSE scores attained by the ensemble over the experiment cycles. The fifth column states the neural network mean RMSE score attained over the experiment cycles.

An ensemble of genetic programs yielded small accuracy increases when compared with a single neural network in several of the cases in Table 11. The AFE algorithm built upon this result to provide additional accuracy.

Experiment 4 Results

Experiment 4 analyzed many genetic programs to find common patterns for individual data sets. As Experiment 2 demonstrated, genetic programming tends to fall into local optima. Thus, genetic programming algorithms will not always return the same result for the same data set. Many runs of a genetic programming algorithm will return

many different solution outcomes. Experiment 4 demonstrated that patterns can be found in these solutions that the genetic programming algorithm returned.

To facilitate Experiment 4, a simple pattern-finding algorithm was produced that accepted a genetic program and returned all combinations of input variables and operators that it could find. This algorithm was implemented as a recursive function that operated on a genetic program represented as a tree. The input of this function was a genetic programming tree, and the output was a set of strings that represented every combination of input variable and operator that was found. Every combination of two or more variables linked by one or more operators was returned. Constant nodes were not considered.

The following listing shows this algorithm, which was invoked by calling the find function:

vars_found = set()   # variables seen while rendering the current subtree
patterns = {}        # rendered sub-expression -> number of times it was seen

def find_at_node(parent):
    vars_found.clear()
    # render_at_node prunes constants and renders the subtree as an infix
    # expression, recording every variable it encounters in vars_found
    found_expression = render_at_node(parent)

    # only record sub-expressions that combine two or more variables
    if len(vars_found) >= 2:
        patterns[found_expression] = patterns.get(found_expression, 0) + 1

    for child in parent.children:
        find_at_node(child)

def find(prg):
    find_at_node(prg.root)
    return patterns

The above pseudocode was implemented as the class FindPatternsGP in the Java source code provided with this dissertation. The function render_at_node was not shown

for brevity, as it simply pruned constants and rendered tree structures to infix expressions. To see the complete implementation in Java source code, refer to Appendix

H, “Dissertation Source Code Availability.”

The following listing shows how the above algorithm would encode several expressions:

Expression: (x*y)              Patterns found: (x*y)
Expression: (x*y)*z            Patterns found: ((x*y)*z), (x*y)
Expression: ((y/1)+(x*2))/1    Patterns found: (x+y)

These expressions can be thought of as a tree representation with all constant nodes pruned. The purpose is to find common operator patterns across many expressions; the constants are not considered for these patterns. The results above are what remain after such a pruning. Table 12 shows the most common patterns found by

Experiment 4 using the above algorithm:


Table 12. Experiment 4, patterns from genetic programs

#     Name         Rank   Pattern
4-1   Abalone      0.02   (diameter-shucked)
                   0.02   (-(diameter)*shucked)
                   0.02   ((height+shell)*sex)
                   0.01   (((diameter-shucked)-shucked)+shell)
                   0.01   ((diameter-shucked)-shucked)
4-2   Auto-mpg     0.03   (acceleration-(cylinders-origin))
                   0.01   (cylinders-origin)
                   0.01   ((-(cylinders)+origin)+acceleration)
                   0.01   (-(cylinders)+origin)
                   0.01   ((cylinders+horsepower)+acceleration)
4-3   Bupa         0.04   ((((-(alkphos)*drinks)*(alkphos+gammagt))-mcv)+gammagt)
                   0.02   (mcv/(((((-(alkphos)*drinks)*(alkphos+gammagt))-mcv)+gammagt)*gammagt))
                   0.01   (sgpt^sgot)
                          (((((-(alkphos)*drinks)*(alkphos+gammagt))-mcv)+gammagt)*gammagt)
                   0.01   (((-(alkphos)*drinks)*(alkphos+gammagt))-mcv)
4-4   Crx          0.02   (a12+a9)
                   0.01   (a11+a9)
                   0.01   ((-(a4)+a2)^-(a9))
                   0.01   (-(a4)+a2)
                   0.01   (a9-((a10/a10)/a9))
4-5   Hepatitis    0.04   (spiders-sex)
                   0.01   -((sex/((malaise^(spiders-sex))-albumin)))
                   0.01   (sex/((malaise^(spiders-sex))-albumin))
                   0.01   ((malaise^(spiders-sex))-albumin)
                   0.01   (malaise^(spiders-sex))
4-6   Pima         0.03   (insulin^diastolic)
                   0.02   (age+bmi)
                   0.02   ((age-bmi)-diastolic)
                   0.02   (age-bmi)
                   0.01   (age^(diabetes/-(diabetes)))
4-7   WcBreastP    0.03   (mean_concave_points+worst_perimeter)
                   0.01   (-(se_area)^-(((mean_concave_points+worst_perimeter)*mean_concavity)))
                   0.01   -(((mean_concave_points+worst_perimeter)*mean_concavity))
                   0.01   ((mean_concave_points+worst_perimeter)*mean_concavity)
                   0.01   ((se_concavity^((-(worst_radius)*worst_concave_points)*mean_concavity))^(-((se_area^se_symmetry))/-(se_symmetry)))
4-8   WcBreastD    0.02   ((se_concavity^worst_concavity)-(worst_texture/((se_compactness+se_texture)-mean_radius)))
                   0.02   (se_concavity^worst_concavity)
                   0.01   (worst_texture/((se_compactness+se_texture)-mean_radius))
                   0.01   ((se_compactness+se_texture)-mean_radius)
                   0.01   (se_compactness+se_texture)

This table was generated by running a genetic programming algorithm 100 times on each data set, producing 100 different candidate solution expressions per data set. These candidate solutions were then run through the pattern-finding algorithm. As the above table demonstrates, several common patterns emerged for each data set.

The fact that common patterns exist among the candidate genetic programs was very important to the design of the AFE algorithm that was created for this dissertation. The

AFE algorithm used many genetic programs to derive the final set of engineered features that it recommended. Results from Experiment 4 and the algorithm presented in this section were important to the design of the final AFE algorithm.

Experiment 5 Results

Experiment 5 provided results to determine the most effective means of ranking the relative importance of features in a neural network’s feature vector. The goal of

Experiment 5 was to evaluate several feature ranking algorithms to determine if the

rankings stabilized before the neural network was fully trained. Because the automated feature ranking algorithm must evaluate many features, considerable time could be saved by not fully training the neural networks that are being used to judge candidate engineered features.

Two feature ranking algorithms were considered for Experiment 5. The first was the perturbation ranking algorithm (Breiman, 2001). The second was a weight analysis algorithm, based on research performed by Garson (1991), that evaluated the weights between the input and first hidden layers. A neural network was trained, using an early stopping strategy, to measure the stability of feature ranking during training.
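A minimal sketch of the perturbation (permutation) ranking idea is shown below, using scikit-learn in place of Encog; the synthetic data set and network size are assumptions made only for illustration.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=1000, n_features=6, noise=2.0, random_state=1)
net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=1).fit(X, y)

rng = np.random.default_rng(1)
base_rmse = mean_squared_error(y, net.predict(X)) ** 0.5

# Perturbation ranking: shuffle one column at a time and measure how much the
# error increases; larger increases indicate more important features.
importance = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    importance.append(mean_squared_error(y, net.predict(X_perm)) ** 0.5 - base_rmse)

ranking = np.argsort(importance)[::-1]   # feature indices, most important first
print(ranking)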

Unfortunately, neither feature ranking algorithm was stable at any phase of training, on any data set, not even a few steps before training was stopped. This result can be observed by examining the ranking for several training steps of the auto MPG data set, as demonstrated below:

Epoch #1,  Train Error:13.927148, Perm(0):2,8,7,0,1,5,3,4,6, Weight(0):4,1,7,5,0,2,6,3,8, Validation Error: 23.959034, Stagnant: 0
Epoch #2,  Train Error:26.961522, Perm(0):3,8,6,4,7,1,2,0,5, Weight(0):4,7,1,5,0,2,6,8,3, Validation Error: 23.959034, Stagnant: 0
Epoch #3,  Train Error:25.796014, Perm(0):3,1,8,7,0,6,4,5,2, Weight(0):4,7,1,5,0,6,2,8,3, Validation Error: 23.959034, Stagnant: 0
Epoch #4,  Train Error:19.821996, Perm(0):1,3,8,7,0,4,6,2,5, Weight(0):4,7,1,0,5,6,8,2,3, Validation Error: 23.959034, Stagnant: 0
Epoch #5,  Train Error:16.163487, Perm(0):3,8,1,7,4,0,6,2,5, Weight(0):4,7,0,1,5,6,8,2,3, Validation Error: 23.959034, Stagnant: 0
Epoch #6,  Train Error:24.637161, Perm(0):3,8,7,1,4,0,6,5,2, Weight(0):4,7,0,1,6,5,8,2,3, Validation Error: 23.959034, Stagnant: 0
Epoch #7,  Train Error:26.744201, Perm(0):1,3,8,7,6,0,4,2,5, Weight(0):4,0,7,6,1,8,5,2,3, Validation Error: 23.357838, Stagnant: 0
Epoch #8,  Train Error:26.495994, Perm(0):3,8,1,7,6,0,4,2,5, Weight(0):4,0,7,6,8,1,5,2,3, Validation Error: 23.357838, Stagnant: 0
Epoch #9,  Train Error:35.010211, Perm(0):1,3,8,0,7,4,6,2,5, Weight(0):4,0,7,6,8,5,1,2,3, Validation Error: 23.357838, Stagnant: 0
Epoch #10, Train Error:24.311009, Perm(0):3,1,0,8,7,4,6,2,5, Weight(0):4,0,6,8,7,5,1,2,3, Validation Error: 23.357838, Stagnant: 0

The feature ranking shuffled as the individual training steps were executed. One can conclude from Experiment 5 that it was difficult to evaluate partially trained networks to obtain a definitive ranking early in training. Early-stabilizing feature ranking algorithms remain an open area of research. While feature ranking was an important aspect of the

AFE algorithm, several steps were taken to mitigate the effects of unstable feature ranking algorithms.

AFE Algorithm Design

Experiments 1 through 5 provided the information that was needed to design the AFE algorithm. This section presents the design of the AFE algorithm that Experiment 6 evaluated.

Experiments 1 through 5 yielded the following observations:


• Larger data sets take considerable time for the GP to fit.
• Feature ranking is unstable until neural network training has finalized.
• GP is very prone to finding many local optima.
• GP produces many different candidate solutions with pronounced sub-patterns.
• Coefficients will vary greatly in solutions created by genetic algorithms.

The following sections describe the AFE algorithm and how these observations influenced its design.

High Level Overview

This section presents a high-level overview of the AFE algorithm that was developed for this dissertation and is presented in Figure 17. Subsequent sections will further elaborate on the design of the components summarized in Figure 17.


Figure 17. Overview of AFE algorithm

The numbered arrows in the above figure show the computer's progress through the algorithm. At the highest level, the algorithm is broken into two phases that are enclosed by dashed lines. The top box, which indicates the main loop/iterations, is where most of

the processing time is spent by the algorithm. The bottom box, which indicates the post-processing steps, refines and aggregates the potential engineered features created by the first phase.

An input data set (before arrow 1) is provided to the algorithm. For this example, the input data set specifies a feature vector with three input variables: x1, x2 and x3. A genetic programming population is created that contains expressions built of mathematical operators/functions, constants and these variables (arrow 1). This genetic programming algorithm evolves this population each iteration to provide patterns for the eventual set of engineered features. For this example, the population is of size 100. This population of expressions is used to generate an intermediate engineered feature data set that has 100 input variables that correspond to these expressions and the same expected values (y) as the original data set (arrow 2).

These input variables are standardized to z-scores and then mapped to the same 100 input variables of a deep neural network that is used to evaluate these 100 expressions in the population (arrow 3). The neural network shown in the above figure contains only a single hidden layer; however, for a real problem a neural network would be created per

Appendix B, “Hyperparameters.” Any expressions that existed in the previous algorithm iteration were mapped to the same input neuron. The algorithm maps new expressions to input neurons whose expressions no longer exist in the population. These input neurons will have their weights reset to average values so that weights from ancestor expressions do not provide an invalid influence upon them. In this way, the neural network is constantly augmented from the previous generation and a new neural network is never

built from scratch. The number of input neurons is always constrained to match the population size (100 for this example).

The neural network is trained to produce the same expected values (y) as the original data set until the neural network’s weights stabilize (arrow 5). The RMSE of this neural network is tracked, and the input neurons are ranked per the perturbation ranking algorithm (arrow 6). These ranks become the scores for the corresponding 100 population members, thus filling the role of the objective function for the genetic program and completing an iteration for the AFE algorithm. At the end of each iteration, the top population members are scanned for common patterns (arrow 7). Patterns replace constants with a single token named c—thus patterns that differ only by their constants will not be considered discrete patterns. These constants were optimized in the post processing phase of the AFE algorithm. Iterations continue until the list of most common patterns does not change between iterations or some predefined maximum number of iterations occurs.

Once the iterations are complete, the pattern list is sent to the post processing phase

(arrow 8). A configuration setting defines the number of the top patterns to optimize to become the final engineered feature set (arrow 9). Gradient descent optimizes the coefficient values of these top expressions to minimize the error of a neural network trained to produce the expected output (y) from the original data set (arrow 10). These optimized expressions become the set of engineered features produced by the AFE algorithm (arrow 11). An augmented data set is produced that contains the original data set inputs (x), the augmented features (fe), and the original data set’s expected outcome


(y). This augmented data set can be used to train any neural network framework, such as

Encog or TensorFlow (arrow 12).

Feature Evaluation and Stochastic Scoring

Neural networks use stochastic gradient descent to decrease both training time and overfitting. Mini-batches are employed for each training iteration rather than the entire data set. These mini-batches allow neural networks to be trained against large data sets.

This stochastic sampling technique can also sample the data sets of genetic programming algorithms (Tu & Lu, 2004). The AFE algorithm trains with stochastic batches for both neural network and genetic programming fitting. For more information on which hyperparameters were used in the experiments, refer to Appendix B, “Hyperparameters.”

The perturbation feature ranking algorithm (Breiman, 2001) was selected as the core feature ranking technology for the AFE algorithm. By ranking features, the entire population can be evaluated simultaneously. The relative position of each candidate engineered feature in the feature rank becomes the score that the genetic selection operators evaluate when choosing the most fit candidate engineered feature (genome).

Neural Network Augmentation

A single neural network is trained through the entire run of the AFE algorithm to evaluate features. The single network prevents the need to retrain a neural network for each genetic programming step. However, the use of a common neural network requires some direct modification to the weights of the feature evaluation neural network as genomes are born and die over the genetic programming algorithm steps. When a new genome is born, it assumes some location in the feature vector of the evaluation neural network. The weights of that input neuron correspond to the old genome/feature that

used to occupy that spot. This former occupant did not have a high enough ranking to remain in the next genetic algorithm step and was thus removed. To simply put the new genome/feature into this spot would likely disadvantage it. If the previous owner had a low enough score to be removed, then its weights were likely low. The weights of the input neuron that is about to receive the new genome must be set to neutral values.

Consider Figure 18 that shows the weights that will need adjustment.

Figure 18. Input neuron weight connections

The new weights for an input neuron, which is about to be replaced, were set to the mean of the corresponding weights on all other input neurons. Bias neurons are not considered for this calculation. Each of a new input neuron’s weights can be determined from the following formula:


W_{I_j H_k} = \frac{\sum_{m \neq j} W_{I_m H_k}}{|I| - 1} \qquad (18)

The variable W represents the weights between an input neuron (I) and first layer hidden neuron (H). The denominator specifies the number of input neurons minus 1.

There would never be 1 or fewer input neurons, as the number of input neurons equals the genetic programming algorithm’s population size, which has an established minimum of 10.
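Equation 18 can be illustrated with a short NumPy sketch; the array shape and the neuron index are arbitrary example values, not values used by the algorithm.

import numpy as np

def reset_input_neuron_weights(w_input_hidden, j):
    """Reset the outgoing weights of input neuron j to the mean of the corresponding
    weights of all other input neurons (equation 18). Bias weights are assumed to be
    stored separately and are not touched. Shape: (input_count, hidden_count)."""
    other = np.delete(w_input_hidden, j, axis=0)   # all rows except input neuron j
    w_input_hidden[j, :] = other.mean(axis=0)      # average of the remaining |I| - 1 rows
    return w_input_hidden

# Example: 5 input neurons, 3 hidden neurons; input neuron 2 receives a new genome.
w = np.arange(15, dtype=float).reshape(5, 3)
print(reset_input_neuron_weights(w, 2))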

Parameters for the AFE Algorithm

This section describes the usage of the AFE algorithm that was created for this dissertation research. The input to the AFE algorithm is a CSV file that specifies the inputs to a neural network (feature vector) and the expected output. The input/output values are assumed to be completely numeric. Any non-numeric columns, such as categorical values, should be encoded during preprocessing. It is assumed that all preprocessing has been carried out prior to running the AFE algorithm.

There are several parameters that are used to specify the operation of the AFE algorithm:

• Number of Engineered Features Desired (5) – The number of engineered features desired from the algorithm.
• Population Size (100) – How many candidate solutions are in the genetic programming population. A minimum population size of 10 is required.
• Neural Mini-Batch Size (32) – The size of each stochastic training batch for neural network training.
• GP Batch Size (32) – The size of each stochastic training batch for genetic programming.
• Evaluation Steps (5) – Evaluate feature importance after this number of steps.
• Ranking Stability Threshold (0.32) – When the Euclidean distance between two feature rankings is below this threshold, consider the ranking stabilized.

Default values are shown in parentheses. Other hyperparameters for the neural network and genetic programming algorithms were set the same as they were for the experiments. Refer to Appendix B, “Hyperparameters,” for more details.

The algorithm begins by initializing a random population of expressions that form the initial generation of features. The genetic algorithm has a feature vector size equal to the feature vector size of the original data set. The neural network has a feature vector size equal to the population size of the genetic program. The flow of data through the feature vectors of the genetic programming algorithm and neural network is summarized in Figure 19.


Figure 19. Data flow/mapping for the AFE algorithm

As the above figure illustrates, there are two feature vectors. The first feature vector comes from the original data set and feeds directly into the individual genetic programming expressions that become the engineered features. The outputs from these engineered features create a second feature vector that is used to train the neural network.
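The mapping can be sketched as follows; the three hand-written lambda expressions stand in for evolved genomes and are assumptions made purely for illustration.

import numpy as np

rng = np.random.default_rng(3)

# Original data set: rows of (x1, x2, x3), as in the example in Figure 19.
X = rng.uniform(1.0, 5.0, size=(200, 3))

# A tiny population of candidate expressions (hypothetical genomes); the real population
# is evolved by the genetic programming algorithm and is much larger (e.g., 100).
population = [
    lambda x: x[:, 0] / x[:, 1],
    lambda x: x[:, 0] - x[:, 2],
    lambda x: (x[:, 1] + x[:, 2]) * x[:, 0],
]

# Each expression's output becomes one column of the second feature vector,
# which is standardized to z-scores before it is fed to the evaluation network.
Z = np.column_stack([expr(X) for expr in population])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
print(Z.shape)   # (200, 3): one column per population member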

AFE Algorithm Iteration

The AFE algorithm is iterative. Because an iteration only deals with a batch of the data, these iterations are steps (as opposed to epochs). Each step of the AFE algorithm is summarized by Figure 20.


Figure 20. One step of the AFE algorithm

An iteration begins by training the neural network for one step. If the number of neural training steps has exceeded the evaluation steps parameter, then the feature rank is evaluated. The Euclidean distance is taken between this feature rank and the previous one. If this Euclidean distance is not below the ranking stability threshold parameter, the iteration ends.


If the rank stabilizes, then the top genomes in the population are evaluated for patterns and a genetic programming step is run. Common patterns accumulate over iterations and contribute to the final set of engineered features. The genetic programming step performs genome selection and crossover/mutation and produces the next generation of features. The neural network is then adjusted for this next generation of features. These features/genomes are sorted so that genomes that survive to the next step remain in the same location and thus retain their weights. For new features being added, it is important that their weights be initialized to values in a range consistent with the current training level of the neural network. The process for this initialization was covered earlier in the section, “Neural Network

Augmentation.”
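The step just described can be summarized in a short sketch. It is illustrative only: net and gp stand for the evaluation neural network and the genetic programming population, and every method shown on them (train_one_step, perturbation_ranking, top_genome_patterns, step, remap_inputs, population) is a hypothetical name introduced for this sketch, not part of Encog or any other library.

import numpy as np

def afe_step(net, gp, batch, state, evaluation_steps=5, stability_threshold=0.32):
    """One step of the AFE algorithm (a sketch under the assumptions stated above)."""
    net.train_one_step(batch)                      # train the evaluation network on one mini-batch
    state["steps"] += 1

    if state["steps"] < evaluation_steps:
        return                                     # too early to evaluate feature importance

    rank = np.asarray(net.perturbation_ranking(batch), dtype=float)
    previous = state.get("previous_rank")
    state["previous_rank"] = rank

    # The ranking is considered stable once its Euclidean distance to the previous
    # ranking drops below the stability threshold parameter.
    if previous is None or np.linalg.norm(rank - previous) >= stability_threshold:
        return

    state["patterns"].update(gp.top_genome_patterns())   # accumulate common patterns
    gp.step(scores=rank)                                 # selection, crossover, and mutation
    net.remap_inputs(gp.population())                    # keep survivors' weights, reset new ones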

Finding Patterns

Because genetic programs usually return a variety of solutions, often representing different local optima, it is important that the features be aggregated together from many runs. To accomplish this objective, the top ranked genomes were combined when they have similar form. This process is like the algorithm that was used to find patterns in

Experiment 4. However, unlike Experiment 4, the constant values were averaged together, rather than eliminated. Averaging the constants together is not ideal; however, gradient descent later optimizes the constant averages in the final step of the AFE algorithm. Figure 21 shows how several features are combined.


Figure 21. Collapsing trees

In Figure 21, the structure of the trees is maintained; however, the constants are combined by averaging. The algorithm selects the tree patterns that occur the most often as finalists to proceed to the last part of the algorithm. This aggregation also ensures that the final set of engineered features will have dissimilar structure—because common structures were aggregated together.
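A simplified sketch of this collapsing step is shown below. Real genomes are expression trees; here they are represented as strings purely for illustration, and the regular expression used to isolate constants is an assumption of the sketch.

import re
from collections import defaultdict

def pattern_key(expression):
    """Replace every standalone numeric constant with the token 'c' so that expressions
    that differ only in their constants collapse to the same pattern."""
    return re.sub(r"(?<![\w.])\d+(\.\d+)?", "c", expression)

def collapse(expressions):
    """Group candidate expressions by pattern and average their constants position by
    position, mirroring the idea in Figure 21."""
    groups = defaultdict(list)
    for expr in expressions:
        constants = [float(c) for c in re.findall(r"(?<![\w.])\d+(?:\.\d+)?", expr)]
        groups[pattern_key(expr)].append(constants)
    collapsed = {}
    for key, constant_lists in groups.items():
        averaged = [sum(vals) / len(vals) for vals in zip(*constant_lists)]
        collapsed[key] = (len(constant_lists), averaged)
    return collapsed

print(collapse(["(x1/x2)*3.0", "(x1/x2)*5.0", "(x1-x2)+1.0"]))
# {'(x1/x2)*c': (2, [4.0]), '(x1-x2)+c': (1, [1.0])}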

Optimizing the Final Set of Engineered Features

Once the patterns stabilize and the final set of engineered features can be defined, the coefficients of these engineered features can be optimized with gradient descent. This process is repeated for each of the set of final features. The constant coefficients of a genetic program can be optimized using gradient descent in the same way as the weights

of a neural network (Topchy & Punch, 2001). This process can save the genetic search from a costly mutation-based search of the constant values.

Currently Encog does not support symbolic differentiation of genetic programs.

Thus, finite difference is used to obtain derivatives of genetic programs. The following formula summarizes the finite differentiation process:

f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \qquad (19)

The value x represents the variable with respect to which the ordinary or partial derivative is taken. The result of the limit approaches the derivative as the value of h approaches zero. The finite difference formula is used to take the partial derivative of the objective function with respect to each of the coefficients of the genetic program. Once the genetic programming algorithm converges to a stable solution, iterations of gradient descent are applied to further lower the objective function by finding more suitable values for the coefficients.
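The following sketch applies equation (19) to tune the constants of a hypothetical engineered feature of the form c0*x1 + c1; the data points, feature form, learning rate, and iteration count are all illustrative assumptions.

def finite_difference_gradient(f, coeffs, h=1e-6):
    """Approximate the partial derivative of f with respect to each coefficient
    using the forward finite difference of equation (19)."""
    base = f(coeffs)
    grads = []
    for i in range(len(coeffs)):
        bumped = list(coeffs)
        bumped[i] += h
        grads.append((f(bumped) - base) / h)
    return grads

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.0, 7.2, 8.9]            # roughly 2*x + 1

def objective(coeffs):
    c0, c1 = coeffs
    return sum((c0 * x + c1 - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

coeffs = [0.0, 0.0]
for _ in range(5000):                 # plain gradient descent on the constants
    grads = finite_difference_gradient(objective, coeffs)
    coeffs = [c - 0.01 * g for c, g in zip(coeffs, grads)]

print(coeffs)                         # converges near c0 = 1.96, c1 = 1.15 for these points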

Experiment 6 Results

Experiment 6 evaluated the overall effectiveness of the AFE algorithm. Thus,

Experiment 6 is the culmination of this research and demonstrates that the objectives of this dissertation have been met. Experiment 6 measured the amount of accuracy that was gained for neural networks with certain data sets. To ensure that this result was statistically significant, 100 experiment cycles were performed for each data set both with and without engineered features. In total, 200 runs were completed per data set. Finally, a Student's t-test determined whether the change in mean accuracy between the augmented and non-augmented data sets was statistically significant.


To perform this evaluation, the process of Experiment 2 was repeated with each data set being augmented with up to five engineered features from the algorithm. If the feature engineering algorithm was effective for a data set, adding the augmented features decreased the error for that data set. For example, the AFE algorithm suggested the following five engineered features for the wine data set:


Equations (20) through (24): the five engineered features suggested for the wine data set.

The variables in the above equations are defined as follows:

• a - alcohol
• ah - ash
• alh - alcalinity_ash
• ci - color_intensity
• f - flavanoids
• h - hue
• ma - malic_acid
• mg - magnesium
• nfp - nonflavanoid_phenols
• od - od28_od315
• p - proanthocyanins
• pr - proline
• tp - total_phenols

The genetic programming algorithm used by Encog performs minimal algebraic simplification. The above equations were simplified and formatted using the Python SymPy algebra system. Additionally, the constants above were truncated to two decimal places for readability. For a complete list of engineered features across all data sets, refer to Appendix F, "Engineered Features for the Data Sets."

The AFE algorithm always suggests the number of features that it was configured to return; the default is five features. These features are not guaranteed to improve neural network accuracy. The returned features are the ones that were measured to be the most important to neural network training. To determine whether these features produce greater accuracy, the AFE algorithm was compared with the baseline RMSE values from Experiment 2.

Table 13 shows the baseline neural network mean accuracy and the mean accuracy with five automatically engineered features added.


Table 13. Experiment 6, feature engineered vs non-augmented accuracy

#     Name            Neural Mean   Auto Mean   Difference   Change Percent
6-1   Abalone         2.6523        2.8611       0.2087       0.07872
6-2   auto-mpg        3.3376        3.2369      -0.100       -0.03016
6-3   Bupa            0.4551        0.4490      -0.006       -0.01357
6-4   Covtype         0.3999        0.4140       0.0141       0.035392
6-5   Crx             0.3592        0.3428      -0.0164      -0.04575
6-6   Forestfires     0.0607        0.0619       0.0011       0.019726
6-7   Glass           0.3376        0.3306      -0.007       -0.02078
6-8   Heart           0.3377        0.3480       0.0102       0.03038
6-9   Hepatitis       0.4576        0.4392      -0.018       -0.0401
6-10  Horse           0.3659        0.3792       0.0133       0.0363
6-11  Housing         0.0446        0.0446       0            -0.0009
6-12  Iris            0.3182        0.3345       0.0162       0.05118
6-13  pima            0.4163        0.4212       0.0048       0.01176
6-14  soybean_large   0.1936        0.1779      -0.0157      -0.0811
6-15  wcbreast_wdbc   0.0988        0.1014       0.0025       0.02587
6-16  wcbreast_wpbc   0.3811        0.3752      -0.0059      -0.0156
6-17  wine            0.2520        0.2036      -0.0484      -0.1922

The first and second columns identify the experiment and data set. The third column shows the RMSE for a neural network with a regular data set (Experiment 2). The fourth column provides the results of data sets augmented with engineered features. Values shown in the fifth column give the amount that the error increased with the addition of the engineered features—a negative value is considered a good result. The sixth column summarizes this change as a percentage—likewise, a negative value is considered a good result. As this table demonstrates, several of the data sets benefited from feature engineering (FE). These results are shown graphically as Figure 22.
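For clarity, the fifth and sixth columns appear to follow directly from the two mean columns, as this small check against the abalone row suggests (values taken from Table 13):

// Illustrative check of the Difference and Change Percent columns in Table 13,
// using the abalone row (6-1).
public class Table13Check {
    public static void main(String[] args) {
        double baselineRmse = 2.6523;    // third column: Experiment 2 neural mean
        double augmentedRmse = 2.8611;   // fourth column: with engineered features
        double difference = augmentedRmse - baselineRmse;  // ~0.2088 (positive = worse)
        double changePercent = difference / baselineRmse;  // ~0.0787
        System.out.printf("difference=%.4f, percent=%.4f%n", difference, changePercent);
    }
}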


Figure 22. Experiment 6, feature engineering benefits to data set

Green bars in the above chart indicate decreases in the RMSE for these data sets:

• Wine
• Soybean_Large
• CRX
• Hepatitis
• Auto-MPG
• Glass
• WCBreast-WPBC
• BUPA

While these data sets showed a decrease in error (an increase in accuracy), it is important to determine whether this change is statistically significant. To make this determination, a Student's t-test was performed between the experiment cycle results of Experiment 2 and the neural network accuracy obtained when the data sets were augmented with engineered features in Experiment 6. Table 14 shows the results of this t-test for the data sets where the percent change in error was negative.

Table 14. Experiment 6, T-Test with data sets that showed improvement

#     Name            T-Statistic   P-Value
6-2   auto-mpg        0.38309       0.702064
6-3   bupa            1.190632      0.235223
6-5   crx             4.115739      0
6-7   glass           1.857998      0.064654
6-9   hepatitis       6.679419      0
6-11  housing         0.088687      0.92942
6-14  soybean_large   3.038647      0.002697
6-16  wcbreast_wpbc   1.125605      0.261695
6-17  wine            3.030937      0.002764

The null hypothesis is that the engineered features made no difference in the accuracy of the neural network. For any row in Table 14 where the p-value is below 0.05, the null hypothesis can be rejected. Thus, the AFE algorithm produced a statistically significant increase in accuracy for the following data sets:

• crx
• hepatitis
• soybean_large
• wine

As previously stated, there was no expectation that the AFE algorithm would increase the accuracy for all data sets. Because the AFE algorithm produced a statistically significant increase in accuracy for four data sets, the objective of this dissertation was met.


Summary

This chapter provided the results of all six experiments conducted for this dissertation. The first five experiments provided guidance for the creation of the AFE algorithm. Experiment 1 looked at which types of engineered features might benefit a neural network. Experiment 2 provided a baseline to compare data set results. Experiment 3 showed that neural networks and genetic programs can function together as ensembles. Experiment 4 analyzed many genetic programs to find common patterns for individual data sets. Experiment 5 provided results to determine the most effective means of ranking the relative importance of features in a neural network's feature vector.

The information from the five experiments enabled the creation of the AFE algorithm. This algorithm used genetic programming to evolve candidate engineered features. A single neural network evaluated the entire population of potential features at once. As the neural network was trained, unfit features died, fit features survived, and new features were introduced. An analysis of the top features from the population produced the final feature set. Finally, Experiment 6 was run to evaluate the effectiveness of the algorithm. Experiment 6 compared the results of data sets augmented with engineered features to the baselines established in both the second and third experiments. These results demonstrated that the AFE algorithm increased the neural network accuracy of four data sets by a statistically significant amount, fulfilling the objective of this dissertation.


Chapter 5

Conclusions, Implications, Recommendations, and Summary

Conclusions

Neural networks will often benefit from feature engineering, which is the process of augmenting the feature vector input to a neural network with calculated expressions that are based on the other values in the feature vector. Genetic programming is a machine learning technique that builds mathematical expressions and computer programs to achieve an optimal score for an objective function. It was demonstrated in this dissertation that genetic programming can engineer features that increase the accuracy of neural networks. Feature engineering is not always effective for every data set, as was shown in this research. However, for some of the data sets evaluated in this research, the accuracy of a neural network was improved by a statistically significant amount with features engineered by the AFE algorithm. Thus, the goal of this research was achieved.

The AFE algorithm developed for this dissertation focused primarily upon relatively lower-dimension predictive analytics where the feature vector is made up of named input columns. Higher-dimension data sets, such as those used in computer vision, audio recognition, and time series analysis, were outside the scope of this dissertation. This research also focused on feedforward deep neural networks. Other neural network architectures, such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, were likewise outside the scope of the research conducted for this dissertation.

Feature engineering research for lower-dimension predictive modeling using named fields (Anscombe & Tukey, 1963; Breiman & Friedman, 1985; Mosteller & Tukey, 1977; Neshatian, 2010; Tukey et al., 1982) and feature engineering for high-dimension images, audio streams, and time series (Blei et al., 2003; M. Brown & Lowe, 2003; Coates et al., 2011; Coates & Ng, 2012; Le, 2013; Lowe, 1999; Scott & Matwin, 1999) have progressed along considerably different paths. Prior research into feature engineering for high-dimension data has often involved dimension reduction or the extraction of features from the pixels of images. However, feature extraction from data sets with lower-dimension named features often revolves around expressions among several input features. For this reason, this dissertation narrowed its scope to named fields for predictive modeling.

Data sets with many rows caused early versions of the AFE algorithm to run unacceptably slowly. Through experimentation and research, it was determined that stochastic sampling of the training data for the genetic objective function yielded two advantages. First, processing only a sample of the data required less runtime for the same data set. Second, not processing the entire data set helped decrease overfitting by sometimes preventing the genetic programming algorithm from settling into local optima.

Stochastic sampling of mini-batches is a common neural network technique, and it yielded the same benefits for genetic programming. The algorithm developed for this dissertation found stochastic mini-batches to be of value to both neural networks and genetic programming algorithms. Even though the AFE algorithm was implemented with the Encog machine learning framework, features engineered by this algorithm could be used by any deep learning framework.
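A minimal sketch of this idea is shown below: rather than scoring a candidate genome on every training row, the objective function samples a small random batch of rows each time it is called. The Candidate interface and method names are hypothetical placeholders, not Encog API.

import java.util.Random;

// Illustrative sketch: score a candidate engineered feature/genome on a random
// mini-batch of rows instead of the full training set.
public class StochasticObjective {

    interface Candidate {
        double evaluate(double[] row);   // value produced by the genome for one row
    }

    // RMSE of a candidate on a randomly sampled mini-batch of rows.
    static double miniBatchScore(Candidate candidate, double[][] rows,
                                 double[] targets, int batchSize, Random rnd) {
        double sumSq = 0;
        for (int i = 0; i < batchSize; i++) {
            int r = rnd.nextInt(rows.length);        // sample a row with replacement
            double diff = candidate.evaluate(rows[r]) - targets[r];
            sumSq += diff * diff;
        }
        return Math.sqrt(sumSq / batchSize);
    }
}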

Implications

This research contributes several new directions to genetic programming research for deep neural networks. Prior research into genetic programming-based feature engineering for predictive modeling did not specifically target the unique characteristics of deep neural networks (Neshatian, 2010). Recent research has focused primarily upon using deep neural networks to engineer features for themselves (Davis & Foo, 2016; Kanter & Veeramachaneni, 2015; Le, 2013). Simplified proxy objective functions were often used by prior research due to the high computational cost of using neural networks in the objective function. The AFE algorithm developed in this dissertation addressed this computation-time problem by using feature ranking to evaluate the entire population simultaneously.

An additional difficulty in using genetic programming-based feature engineering is the tendency of genetic programming to fall into local optima and produce many different candidate solutions for a given objective function. This issue was observed in Experiments 1 and 2. To remedy this problem, the candidate features output by the genetic programming algorithm were combined and reduced to form the set of engineered features that the algorithm returns. The constant coefficients of this final set of features were then optimized using gradient descent.

Recommendations

Feature engineering has seen considerable interest in the machine learning community. The results of this dissertation open additional areas of research for the application of genetic programming to feature engineering for neural networks. The following list represents some potential areas of research (in no specific order):

• Symbolic algebraic simplification could eliminate duplicate features when searching for common patterns in candidate engineered features.

• Symbolic differentiation could allow faster and more accurate optimization of the coefficients in the final step of the AFE algorithm than the finite-difference technique currently employed.

• Evaluate the effectiveness of this algorithm for recurrent neural networks, such as long short-term memory (LSTM) networks. Support for LSTM would require enhancement of the Encog machine learning framework, which does not currently support LSTM.

• Currently the AFE algorithm returns the requested number of engineered features. This complete set of engineered features may not improve the accuracy of the neural network. It would be valuable to extend the algorithm to return only features that will increase the accuracy of a neural network.

• The AFE algorithm can be computationally intense. While the dissertation code makes every attempt to be scalable across threads (vertical scalability), no provision is currently made for execution on a grid of computers (horizontal scalability).

• The research conducted in this dissertation only used genetic programming to evolve expressions. Moving from expressions to small programs might enable the genetic program to evolve routines that can extract features from images and time series.

• Provide an automated means of setting several of the configuration parameters that are necessary for the AFE algorithm.

• This dissertation focused only on the Encog machine learning framework. Other machine learning frameworks, such as TensorFlow, Theano, and the Microsoft Cognitive Toolkit (CNTK), each have different implementation details and optimizations of deep learning training algorithms. It would be possible to engineer features that are optimized for a specific deep learning framework by using that framework's neural network implementation to evaluate candidate features in the genetic programming objective function. This could result in greater accuracy for the specific deep learning framework being evaluated.


Summary

Introduction

Raw data are often preprocessed to become the input, or feature vector, to a neural network. Feature engineering is an important preprocessing step that augments the feature vector of a neural network with calculated values that are designed to enhance the accuracy of the neural network’s predictions. Research has shown that the accuracy of deep neural networks sometimes benefits from feature engineering. Expressions that combine one or more of the original features usually create these engineered features.

The choice of the exact structure of an engineered feature is dependent on the type of machine learning model in use. For example, engineered features that increase the accuracy of a deep neural network might not have a similar effect on a support vector machine (SVM). Typically, feature engineering is a manual process. However, research interest exists around automating aspects of feature engineering. A genetic programming-based means of AFE was developed for this dissertation research.

Genetic programming and deep neural networks accomplish similar tasks: they accept an input vector and produce either a numeric or categorical output. Internally, however, each functions differently. Deep learning adjusts large matrices of weights and biases to produce the desired output. Genetic programming constructs computer programs and expressions to yield the desired output. Because genetic programming is capable of evolving programs and expressions from its input features, it is logical to make it the basis for a feature engineering algorithm.

The primary challenge of this research was to limit the large search space of expressions that combine the features of the data set. This challenge was overcome by defining the types of expressions that most benefit a deep neural network, determining the importance of each feature, and prioritizing the expressions to be searched by the algorithm. Parallel processing was also employed to allow the algorithm to utilize multiple cores on the host computer system.

Additionally, the algorithm was designed to include an efficient objective function to evaluate candidate engineered features against each other. Such a function was based on fitting a deep neural network model with the candidate engineered features. This objective function was designed to be efficient so that it could determine, in minimal time, how effective one candidate engineered feature was compared to another. Genetic programming uses this type of objective function to decide which genomes (candidate engineered features) should form the next generation. These design goals drove the methodology of this research.

Methodology

Six experiments in total were conducted to create an AFE algorithm. The first five experiments provided data that assisted in the design of the AFE algorithm. Characteristics such as feature ranking, common patterns in genetic expressions, and genetic ensembles were all evaluated by these experiments. The sixth experiment evaluated the ability of the AFE algorithm to increase the accuracy of deep neural networks by a statistically significant amount.

The first experiment provided information that helped to limit the search space for the genetic programming algorithm. Deep neural networks do not benefit equally from all expression types. For an engineered feature to increase the accuracy of a neural network, two conditions must be satisfied. First, the engineered feature must be an expression type that the neural network cannot easily synthesize on its own. Second, the engineered feature must expose some interaction between two or more original features. For example, BMI exposes an interaction between height and weight.

The synthetic data set used for this experiment was made up of random inputs and the results from several common expression types. A high error (low accuracy) for a given expression indicates that the neural network is not particularly adept at synthesizing that feature. The results of this experiment provided information to focus the search space on expression types that most benefit a deep neural network.

The second experiment established a baseline with a variety of real-world data sets. This baseline establishes the accuracy that can be expected from a regular deep neural network whose data set has not been augmented with engineered features. Real-world data sets from the University of California, Irvine machine learning repository were used. Because not all data sets benefit from feature engineering, it is important to use many data sets. The results from the second experiment were compared with the final experiment to determine what effect engineered features had on the accuracy of the neural network for a given data set.

The third experiment looked at the accuracy increase that might be achieved by using a neural network to combine ensembles of genetic programs. Combining multiple models together can be an effective means of increasing accuracy. This experiment was conducted to see the degree to which neural networks and genetic programs work as an ensemble. The results from this experiment were also used with the sixth experiment to ensure that any engineered features are performing beyond what could have been achieved with a simple ensemble of neural networks and genetic programs.


The fourth experiment demonstrated that common patterns emerge when many genetic programs are fit to the same data set. Genetic programming algorithms are prone to solutions falling into local optima; the results of both Experiment 1 and Experiment 2 empirically demonstrated this. This experiment evaluated the effect of finding patterns over large numbers of genetic programs fitted to the same data set. If certain patterns occur more frequently, they might indicate useful relationships between the features.

The fifth experiment evaluated two feature ranking algorithms to look at stability as the neural network was trained by backpropagation. Genetic programming algorithms must evaluate genomes to determine which are the most fit. The genomes used in this dissertation were candidate engineered features. A design goal of the AFE algorithm was to use a neural network to evaluate the candidates. To perform this evaluation, the entire population of genomes became the input to a neural network and feature ranking algorithms are used to determine the order of fitness among the genomes. Two feature ranking algorithms were considered: perturbation and weight analysis. A goal of the fifth experiment was to see which ranking algorithm stabilized fastest as a neural network was trained. Earlier stabilization meant that part of the costly training process for the neural network could be skipped. Unfortunately, neither algorithm was stable. Thus, the feature engineering algorithm was designed to use a neural network that was continuously trained as evolved features were added and unfit features removed. This process allowed the ranking algorithm to be used without costly retrains of the neural network.

Finally, the sixth experiment evaluated the degree to which the AFE algorithm increased the accuracy of neural networks for several data sets. The same data sets from Experiments 2 and 3 were augmented with engineered features created by the algorithm during this research. The best result from either Experiment 2 or 3 was compared against the result from the augmented data set. The same type of neural network was used in Experiments 2, 3, and 6. A data set was considered to have its neural network accuracy improved if the Experiment 6 result was more accurate than the results of Experiments 2 and 3.

Results

Experiment 1 demonstrated that certain expression types are difficult for neural networks to synthesize, while others are relatively easy. Expressions that were difficult for neural networks were often synthesized easily by genetic programming. This result was an early indication that genetic programming would be useful for feature engineering. The results of Experiment 1 indicate the types of expressions that should be searched early.

Experiment 2 established the baseline for the ultimate evaluation of the AFE algorithm that was conducted in Experiment 6. Real-world data from the University of California, Irvine machine learning repository were used. Both neural network and genetic programming algorithms were fit to these data. In most cases, the neural networks outperformed the genetic programming algorithm. This result was expected.

Experiment 3 showed the effect of using neural networks together with genetic programming in an ensemble. The ensembles produced mixed results: for some data sets, a simple ensemble of neural networks and genetic programs could increase accuracy over neural networks alone. The purpose of this experiment was to establish a second baseline for ensembles of neural networks and genetic programs. By showing that the AFE algorithm exceeded both baselines, it was demonstrated that the AFE algorithm is not simply an ensemble technique.

Experiment 4 dealt with the fact that genetic programming algorithms will often return a variety of solutions to the same data set over multiple runs. This experiment demonstrated that despite this behavior, the solutions returned over many runs of the genetic programming algorithm do converge to patterns. This result was a guiding factor in the AFE algorithm’s ability to aggregate many genetic programming genomes together as the neural network trains.

Experiment 5 evaluated two feature ranking algorithms to see which was the most stable as the neural network was trained. This experiment concluded that neither the perturbation nor weight analysis algorithms were particularly stable. Thus, the perturbation algorithm was chosen because it considers the entire neural network, rather than simply the weights of the input neurons. Weight analysis algorithms consider only the input neuron weights. Perturbation has the advantage of analyzing the output of a neural network, thus all layers of the deep neural network contributed to the final feature ranking. It was also concluded from this experiment that features should be evaluated from a continuous run of a neural network, rather than individual runs for each genetic programming epoch.

The results of the first five experiments were combined to produce the AFE algorithm. This algorithm works by feeding the data set input to a genetic programming algorithm. A new data set is generated from the outputs of each of the population members and is used to train a neural network to produce the expected output of the original data set. Perturbation feature ranking is performed after a specified number of training steps, and this ranking is used for selection during a genetic programming step. After the genetic programming step, the top features/genomes are scanned for common patterns, and a growing dictionary of common patterns is kept. The penultimate set of engineered features from the algorithm are the most common entries in this dictionary. The coefficients of these features are optimized with gradient descent and become the ultimate set that the algorithm returns.

The results of the sixth experiment demonstrated that data sets that are augmented with engineered features from the algorithm often achieve greater accuracy. Of the 18 data sets evaluated, four of them achieved statistically significant greater accuracy with the AFE algorithm. In none of these cases did an ensemble (Experiment 3) achieve more accuracy than the augmented features. In this respect, the objective of this dissertation was achieved.

Future Work

One area of future work is to improve the genetic programming algorithm so that it does not fall into local optima as easily. Ideally, the genetic programming algorithm should be able to consistently achieve an RMSE of 0 for all data sets in the first experiment. Decreasing overfitting and avoiding local optima have been areas of research for genetic programming algorithms (Gao & Hu, 2006; Rocha & Neves, 1999; Spector & Klein, 2006). Such an enhancement to this algorithm would allow much quicker convergence to a set of engineered features. Additionally, a larger search space would be possible, perhaps allowing engineered features to be found that allow even greater accuracy.


Another area of future work is to integrate symbolic differentiation and algebraic simplification. Symbolic differentiation would allow faster and more accurate gradient descent than the current finite-difference based differentiation. Current frameworks that support differentiation for Java should be investigated (Filice, Castellucci, Croce, & Basili, 2015). Additionally, more advanced algebraic simplification could allow duplicate patterns to be eliminated. This duplicate elimination would allow more robust location of patterns and a simplified form of the final engineered features. Current frameworks that provide algebraic simplification should be investigated (Albrecht, Buchberger, Collins, & Loos, 2012; Pinkus & Winitzki, 2002).


Appendixes


Appendix A

Algorithms

This appendix provides an overview of the artificial intelligence algorithms that were used to implement the AFE algorithm developed for this dissertation. The AFE algorithm was built upon standard algorithms, without substantial modification to them. Relatively minor differences exist among the implementations of many artificial intelligence algorithms, so this appendix describes the assumptions made about the algorithms used and their sources.

Feedforward Neural Network

A neural network accepts a fixed length input vector that is of the same length as the number of input neurons to the neural network. Typically, a synthetic value of 1 is appended to the end of the input feature vector to provide for a bias neuron. This bias neuron is essentially a y-intercept and allows the neural network to recognize patterns when all the other inputs are 0. The input layer will be connected to either the first hidden layer or directly to the output layer. Neural networks with no hidden layers are rare and ineffective for all but the simplest problems. The layer type dictates how the input layer is connected to the first hidden layer. There are many different layer types, such as dense, convolutional, dropout, maxpooling, contextual and spiking. This dissertation constrained its scope to only the dense layer type.

A dense layer contains connections from all input neurons (including the bias) to all neurons in the next layer (excluding that layer's bias). Bias neurons have a fixed output of 1 and do not have input connections; because a bias neuron always outputs 1, input connections would serve no purpose. All connections in a feedforward neural network are weighted. The value of a first-hidden-layer neuron is calculated by summing the weighted connections from all input layer neurons to that hidden neuron. A transfer function is applied to this summation before the values are assigned to the first hidden layer neurons. The most common transfer functions for neural networks are the linear, softmax, sigmoidal, hyperbolic tangent, and rectified linear unit (ReLU) functions.

The software provided for the AFE algorithm supports these transfer functions. However, the results demonstrated by this dissertation made the following assumptions: ReLU (Glorot et al., 2011) was used for hidden layers, a linear transfer function was used for the output layer of regression networks, and softmax was used for the output layer of classification networks.

This process is repeated for all layers of the neural network. The neuron values for the second hidden layer (including the bias value of 1) become the input to the third hidden layer or the output layer. Each layer of densely connected neurons is calculated from the preceding layer, all the way back to the input neurons. Once the output layer is calculated, the process stops. The values of the output neurons become the output of the neural network. The output layer does not have a bias neuron.

The process described in this section is the standard implementation of a feedforward neural network. This is the implementation used by Encog, TensorFlow, and many others. For more information on basic neural network calculation, refer to Goodfellow, Bengio, and Courville (2016).
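The following sketch restates the dense-layer calculation described above for a single layer with a ReLU transfer function. It assumes a simple weight-matrix layout and is illustrative only, not the Encog implementation.

// Illustrative sketch of the dense-layer forward pass: each neuron sums the
// weighted outputs of the previous layer plus a bias, then applies ReLU.
public class DenseLayerSketch {

    // weights[j][i] is the weight from input neuron i to output neuron j;
    // biasWeights[j] is the weight from the bias neuron (fixed output of 1).
    static double[] forward(double[] inputs, double[][] weights, double[] biasWeights) {
        double[] outputs = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = biasWeights[j];              // bias neuron always outputs 1
            for (int i = 0; i < inputs.length; i++) {
                sum += weights[j][i] * inputs[i];
            }
            outputs[j] = Math.max(0, sum);            // ReLU transfer function
        }
        return outputs;
    }
}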

Xavier Weight Initialization

Neural networks start with randomly initialized weights. Considerable research has been devoted to initializing weights so that they converge quickly to a solution.

140

Different sets of random starting weights can produce neural networks of varying levels of success. Any such variance in the success of an algorithm can make empirical measurement difficult. Because of this it was important to find a weight initialization algorithm that produced the most consistent results available for feedforward neural networks. For this reason, the Xavier weight initialization algorithm performed the initialization of all neural networks in this dissertation research.

The Xavier weight initialization algorithm has been demonstrated to provide weights that converge to good solutions with minimal variance (Glorot & Bengio, 2010). This algorithm initializes the weights between layers of a neural network with a uniform random number distribution and a variance per the following equation.

\operatorname{Var}(W) = \frac{2}{n_{in} + n_{out}}    (25)

The variable W represents the weights between two layers, where the input layer neuron count is represented by n_in and the output layer neuron count by n_out. Bias neurons are counted for the input layer but not the output layer. Weight matrices are always sandwiched between two layers: an input and an output layer (relative to the weight matrix).
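A sketch of this initialization, under the assumption that the uniform distribution is symmetric about zero, is shown below. A uniform range of [-a, a] has variance a^2/3, so a = sqrt(6 / (n_in + n_out)) yields the variance in equation (25). This is illustrative only, not Encog's XaiverRandomizer class.

import java.util.Random;

// Illustrative sketch of Xavier (Glorot) weight initialization per eq. (25).
public class XavierInitSketch {

    static double[][] initialize(int nIn, int nOut, Random rnd) {
        double limit = Math.sqrt(6.0 / (nIn + nOut));
        double[][] weights = new double[nOut][nIn];
        for (int j = 0; j < nOut; j++) {
            for (int i = 0; i < nIn; i++) {
                // uniform in [-limit, limit] has variance 2 / (nIn + nOut)
                weights[j][i] = (rnd.nextDouble() * 2 - 1) * limit;
            }
        }
        return weights;
    }
}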

Mini Batch Stochastic Gradient Descent

Backpropagation is a very common technique for neural network training. This technique works by taking the partial derivative of the error function of the neural network with respect to each weight, with all other weights held constant. Each weight's individual partial derivative is evaluated, and the weight is modified per the update rule in use. There are several different update rules that can be used with backpropagation. This dissertation performed all training using the Adaptive Moment Estimation (Adam) update rule, which is described in the next section.

This dissertation makes use of backpropagation with stochastic mini-batches, using the Adam update rule to perform gradient descent. The term mini-batch implies that the neural network weights are updated from relatively small batches of training data. A typical mini-batch size is under 100 training set elements. The partial derivatives of the error function with respect to the weights are evaluated and summed over the mini-batch. These summations of one or more partial derivatives are called the gradients. There is one gradient for every weight and bias weight in the neural network. Once the gradients are obtained, the update rule specifies how the weights are to be modified. For more information on the mini-batch size and parameters used by this dissertation, refer to Appendix B, "Hyperparameters."

Adaptive Moment Estimation (Adam) Update Rule

The traditional momentum-based backpropagation update rule requires that learning rate and momentum parameters be set to values to govern the training. The learning rate specifies how much of a gradient should be applied to the corresponding weight. The momentum parameter specifies how much of the previous step’s weight delta should be applied to the current step. As its name implies, momentum causes the direction of weight modification to be more difficult to change and has been demonstrated to help the neural network weights move through local optima.

There are two primary problems with the traditional approach to learning rate and momentum. The first issue is that the same learning rate and momentum must be applied to all weights. While there are algorithms that will change the learning rate and momentum as training progresses, what was needed was a means of allowing each weight to have its own learning rate. The literature review of this dissertation surveyed several approaches to individual learning rates. The chosen update rule is Adaptive Moment Estimation (Adam). Adam uses several statistical measures to estimate a proper learning rate for each weight. Though Adam does have an initial learning rate as one of its training parameters, the algorithm will quickly adjust this to a more optimal rate for each weight. Additionally, Adam is considered (by its implementers) to be forgiving of non-optimal training parameter settings. This dissertation used a standard implementation of Adam, provided by Encog, with all training parameters set as recommended by the Adam seminal paper (Kingma & Ba, 2014).
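For reference, the sketch below restates the Adam update rule applied to a flat weight array, with beta1, beta2, and epsilon set to the defaults recommended by Kingma and Ba (2014) and the initial learning rate matching the LEARNING_RATE constant in Appendix B. It is a restatement of the published rule, not the Encog AdamUpdate class.

// Illustrative sketch of the Adam update rule for a flat weight array.
public class AdamSketch {
    private final double learningRate = 1e-2;   // LEARNING_RATE from Appendix B
    private final double beta1 = 0.9, beta2 = 0.999, eps = 1e-8;
    private final double[] m, v;                 // first and second moment estimates
    private int t = 0;                            // time step

    AdamSketch(int weightCount) {
        m = new double[weightCount];
        v = new double[weightCount];
    }

    // Apply one Adam step given the current gradients.
    void update(double[] weights, double[] gradients) {
        t++;
        for (int i = 0; i < weights.length; i++) {
            m[i] = beta1 * m[i] + (1 - beta1) * gradients[i];
            v[i] = beta2 * v[i] + (1 - beta2) * gradients[i] * gradients[i];
            double mHat = m[i] / (1 - Math.pow(beta1, t));   // bias correction
            double vHat = v[i] / (1 - Math.pow(beta2, t));
            weights[i] -= learningRate * mHat / (Math.sqrt(vHat) + eps);
        }
    }
}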

Perturbation Feature Ranking

Feature ranking algorithms are critical to the AFE algorithm that was presented by this dissertation. The dissertation algorithm maps input neurons to the population of candidate engineered features. A feature ranking algorithm is used to determine which candidate engineered features are the most important to the neural network achieving a high accuracy rate. The feature ranking algorithm effectively becomes the objective function for the genetic algorithm code that evolves the candidate engineered feature population.

The feature ranking algorithm chosen for this dissertation is the perturbation feature ranking algorithm (Breiman, 2001). This algorithm is relatively simple, but effective (Olden et al., 2004). To evaluate the importance of the individual features, the data set is evaluated once with each feature perturbed and all other features left alone. Features are perturbed by shuffling their contents. This makes the perturbed feature essentially useless for prediction, yet the column still retains the same statistical properties of minimum, maximum, mean, and standard deviation. The perturbation algorithm is useful because a new neural network does not need to be trained for each of the features. The feature being evaluated remains in the neural network; changes are made only to the data.

The feature perturbation algorithm used by the AFE algorithm was provided by the Encog framework. Encog's implementation of perturbation feature ranking is based on the work of Breiman (2001) and Olden et al. (2004).
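The following sketch illustrates the perturbation idea: one column at a time is shuffled, the trained model's error is re-measured, and the increase in error over the baseline becomes that feature's importance. The model is abstracted as an error function over the data matrix; this is not the Encog implementation.

import java.util.Random;
import java.util.function.Function;

// Illustrative sketch of perturbation (permutation) feature ranking.
public class PerturbationRankingSketch {

    static double[] rank(Function<double[][], Double> modelError,
                         double[][] data, Random rnd) {
        double baseError = modelError.apply(data);
        double[] importance = new double[data[0].length];
        for (int col = 0; col < importance.length; col++) {
            double[] saved = new double[data.length];
            for (int row = 0; row < data.length; row++) saved[row] = data[row][col];

            shuffleColumn(data, col, rnd);                     // perturb one feature
            importance[col] = modelError.apply(data) - baseError;

            for (int row = 0; row < data.length; row++) data[row][col] = saved[row];
        }
        return importance;                                      // larger = more important
    }

    private static void shuffleColumn(double[][] data, int col, Random rnd) {
        for (int i = data.length - 1; i > 0; i--) {            // Fisher-Yates shuffle
            int j = rnd.nextInt(i + 1);
            double tmp = data[i][col];
            data[i][col] = data[j][col];
            data[j][col] = tmp;
        }
    }
}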

Genetic Programming

Genetic programming is not as popular a technique as deep learning. Because of this, there are many different implementations that follow a variety of different conventions. The genetic programming algorithm that this dissertation used was provided by the Encog framework. Encog implements tree-based genetic programming based on the work of Poli et al. (2008), which in turn builds on the seminal work of Koza (1992).

The algorithm makes use of tournament selection to choose parents. Trees are expressed internally as Java object structures but can be imported from infix or reverse polish notation (RPN). Encog can export these trees to infix, RPN, or LaTeX. The initial population is created with the ramped half-and-half algorithm (Koza, 1992). The dissertation algorithm uses threshold speciation, provided by Encog, which is based on research done by Stanley and Miikkulainen (2002).
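As an illustration of tournament selection, the sketch below samples a small number of genomes at random and keeps the one with the best (lowest) score as a parent. The fitness array stands in for whatever score the objective function assigns; it is not the Encog implementation.

import java.util.Random;

// Illustrative sketch of tournament selection over an array of fitness scores.
public class TournamentSelectionSketch {

    static int selectParent(double[] fitness, int tournamentSize, Random rnd) {
        int best = rnd.nextInt(fitness.length);
        for (int i = 1; i < tournamentSize; i++) {
            int challenger = rnd.nextInt(fitness.length);
            if (fitness[challenger] < fitness[best]) {   // lower error wins
                best = challenger;
            }
        }
        return best;                                      // index of the chosen parent
    }
}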


Appendix B

Hyperparameters

Hyperparameters are settings that govern the operation of machine learning models. For a neural network, hyperparameters include the number of hidden layers and the choice of transfer function. Training parameters, such as the learning rate and momentum, are also considered hyperparameters. Hyperparameter selection can have a considerable impact on the performance of a machine learning model, such as a neural network or genetic programming algorithm. This appendix describes the hyperparameters that were chosen for this dissertation.

Main Hyperparameters Configuration

Most hyperparameters used by the AFE algorithm and the experiment runner framework are defined in the following Java source file:

com.jeffheaton.dissertation.JeffDissertation

The settings that were used to generate the findings reported in this dissertation are given in this appendix. Because the dissertation source code is hosted on a GitHub server, the author may have made improvements to these settings for increased performance. Because such improvements might be made after the publication of this dissertation, the original settings are provided here. Additionally, a Git tag captures the final source code as of the publication of this dissertation. For more information on the dissertation source code, refer to Appendix H, "Dissertation Source Code."

/** How many times should neural network experiments be repeated.
 *  The mean, min, max and standard deviation of these runs are reported. */
public static final int NEURAL_REPEAT_COUNT = 100;

/** How many times should genetic programming experiments be repeated.
 *  The mean, min, max and standard deviation of these runs are reported. */
public static final int GENETIC_REPEAT_COUNT = 100;

/** The random seed that is used for test/training splits. Note that neural
 *  networks and genetic programs are initialized without a seed (seed is
 *  based on current time). This ensures that each repeated run is different weights. */
public static final long RANDOM_SEED = 42;

/** The starting learning rate for the Adam update. */
public static final double LEARNING_RATE = 1e-2;

/** How many steps before a neural network is considered stagnant
 *  and early stopping should apply. */
public static final int STAGNANT_NEURAL = 100;

/** How many steps before a genetic program is considered stagnant
 *  and early stopping should apply. */
public static final int STAGNANT_GENETIC = 100;

/** L1 regularization for neural network training. */
public static final double L1 = 0;

/** L2 regularization for neural network training. */
public static final double L2 = 1e-8;

/** The percent of data to use for training. Validation data are used
 *  for early stopping. */
public static final double TRAIN_VALIDATION_SPLIT = 0.75;

/** The mini batch size for stochastic gradient descent. */
public static final int MINI_BATCH_SIZE = 32;

/** The population size to use for genetic programming. */
public static final int POPULATION_SIZE = 100;

/** The update rule to use. */
public static final Class UPDATE_RULE = AdamUpdate.class;

/** The minimum improvement amount to consider for early stopping. */
public static final double MINIMUM_IMPROVE = 0.01;

/** The number of GP runs to use to look for patterns in experiment 4. */
public static final int EXP4_PATTERN_COUNT = 10;

Neural Network Architectures

A neural network's architecture has a great impact on its performance. For transfer functions, this dissertation followed the advice of Goodfellow et al. (2016), Glorot et al. (2011), and Bottou (2012). Hidden layers made use of the rectified linear unit (ReLU). For classification data sets, a softmax transfer function was used on the output layer; for regression data sets, a linear transfer function was used on the output layer. Xavier weight initialization was used to set the weights to initial random values. Data were divided into training and validation sets: the training set was made up of 75% of the original data set, and the remaining 25% became the validation set. These splits were produced by a Mersenne Twister random number generator using a seed of 42.

The first hidden layer of the neural network contains twice the number of neurons in the input layer. Each subsequent hidden layer contains half of the previous layer's neurons. The Encog neural network for all experiments and the AFE algorithm is created by the following code:

// Create the neural network and input layer
BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(null, true, inputCount));

// First hidden layer is twice the number of inputs
int hiddenCount = inputCount * 2;

// Create the first hidden layer
network.addLayer(new BasicLayer(new ActivationReLU(), true, hiddenCount));

// Create the second hidden layer
hiddenCount = Math.max(2, hiddenCount / 2);
network.addLayer(new BasicLayer(new ActivationReLU(), true, hiddenCount));

// Create the third hidden layer
hiddenCount = Math.max(2, hiddenCount / 2);
network.addLayer(new BasicLayer(new ActivationReLU(), true, hiddenCount));

// Create the fourth hidden layer
hiddenCount = Math.max(2, hiddenCount / 2);
network.addLayer(new BasicLayer(new ActivationReLU(), true, hiddenCount));

// Create the output layer
if (regression) {
    network.addLayer(new BasicLayer(new ActivationLinear(), false, outputCount));
} else {
    network.addLayer(new BasicLayer(new ActivationSoftMax(), false, outputCount));
}

// Finalize the network and randomize the weights
network.getStructure().finalizeStructure();
(new XaiverRandomizer()).randomize(network);
return network;

Neural Network Training Parameters

Neural network training parameters can have a substantial effect on the accuracy of the neural network. For this dissertation, training parameters were chosen to minimize the variance between experiment cycles. To accomplish this goal, stochastic gradient descent with the Adam update rule was used. An early stopping rule was set up using the 25% validation set. The code used to create the neural network trainer made use of the constants defined earlier in this appendix. The actual code used to create all neural network trainers in this dissertation is given here:

StochasticGradientDescent train =
    new StochasticGradientDescent(network, trainingSet);
train.setUpdateRule(JeffDissertation.factorUpdateRule());
train.setBatchSize(JeffDissertation.MINI_BATCH_SIZE);
train.setL1(JeffDissertation.L1);
train.setL2(JeffDissertation.L2);
train.setLearningRate(JeffDissertation.LEARNING_RATE);

EarlyStoppingStrategy earlyStop = null;
if (validationSet != null) {
    earlyStop = new EarlyStoppingStrategy(validationSet,
        5, JeffDissertation.STAGNANT_NEURAL);
    earlyStop.setSaveBest(true);
    earlyStop.setMinimumImprovement(
        JeffDissertation.MINIMUM_IMPROVE);
    train.addStrategy(earlyStop);
}
return new DissertationNeuralTraining(train, earlyStop);

Genetic Programming Training Parameters

This dissertation used the Encog genetic programming framework at several points. The first step in using the Encog genetic programming framework is to define the set of operators used. The input variables must also be defined in the Encog context. The following code provides this:

EncogProgramContext context = new EncogProgramContext();

// Define variables
for (String field : fields) {
    context.defineVariable(field);
}

// Define functions used
FunctionFactory factory = context.getFunctions();
factory.addExtension(StandardExtensions.EXTENSION_VAR_SUPPORT);
factory.addExtension(StandardExtensions.EXTENSION_CONST_SUPPORT);
factory.addExtension(StandardExtensions.EXTENSION_NEG);
factory.addExtension(StandardExtensions.EXTENSION_ADD);
factory.addExtension(StandardExtensions.EXTENSION_SUB);
factory.addExtension(StandardExtensions.EXTENSION_MUL);
factory.addExtension(StandardExtensions.EXTENSION_DIV);
factory.addExtension(StandardExtensions.EXTENSION_POWER);

Once the context is set up, the population can be created and the objective function defined. The trainer is created, and a standard ratio of mutation and crossover is defined. Algebraic rewrite rules are added to simplify some expressions. Early stopping is defined, based on the validation set. A new random population is created using the ramped half-and-half initializer. Finally, the results are returned to whatever experiment is making use of the genetic programming algorithm. The following code listing demonstrates this process:

// Create the population
PrgPopulation pop = new PrgPopulation(context, populationSize);

// Create the objective function
MultiObjectiveFitness score = new MultiObjectiveFitness();
score.addObjective(1.0, new TrainingSetScore(trainingSet));

// Create the trainer
TrainEA genetic = new TrainEA(pop, score);
genetic.setCODEC(new PrgCODEC());
genetic.addOperation(0.5, new SubtreeCrossover());
genetic.addOperation(0.25, new ConstMutation(context, 0.5, 1.0));
genetic.addOperation(0.25, new SubtreeMutation(context, 4));
genetic.addScoreAdjuster(new ComplexityAdjustedScore(10, 20, 50, 100.0));

// Add algebra rewrite rules
pop.getRules().addRewriteRule(new RewriteConstants());
pop.getRules().addRewriteRule(new RewriteAlgebraic());
pop.getRules().addConstraintRule(new SimpleGPConstraint());
genetic.setSpeciation(new PrgSpeciation());
genetic.setThreadCount(1);

// Define early stopping
EarlyStoppingStrategy earlyStop = null;
if (validationSet != null) {
    earlyStop = new EarlyStoppingStrategy(
        validationSet, 5, JeffDissertation.STAGNANT_GENETIC);
    earlyStop.setMinimumImprovement(
        JeffDissertation.MINIMUM_IMPROVE);
    genetic.addStrategy(earlyStop);
}

// Create new population
(new RampedHalfAndHalf(context, 1, 6)).generate(new Random(), pop);
genetic.setShouldIgnoreExceptions(false);
return new JeffDissertation.DissertationGeneticTraining(
    pop, earlyStop, genetic);


Appendix C

Third Party Libraries

This dissertation made use of several third-party libraries for the Python and Java programming languages. This appendix lists these libraries and provides the appropriate version and source for each package. All third-party libraries were used as they were provided; no source code changes were made. Any customization of these libraries was accomplished using configuration settings.

Java Third Party Libraries

Java 1.8 was the primary programming language used for this dissertation research. This section describes the third-party libraries used in conjunction with Java.

Gradle

Gradle 3.2 was used to build the Java framework project that ran all the experiments for the dissertation research. Gradle can create a fat JAR that contains all the dependencies for the dissertation experiment framework. This allowed easy execution on the Amazon Elastic Cloud by uploading a single JAR file. For more information on how to obtain and run the experiment framework, refer to Appendix H, "Dissertation Source Code Availability."

Gradle allows dependencies to be specified using a special encoding string. This allows very precise specification of the versions of the third-party libraries that were used. The Gradle dependency string is provided in the following sections for each of the third-party library packages discussed.

2 Available from https://www.java.com/


Encog

Encog 3.4 is the machine learning framework that provided the deep learning and genetic programming components for this research. The author of this dissertation is a major contributor to the Encog project. The Gradle dependency string for the version of Encog used in this dissertation is 'org.encog:encog-core:3.4.0'.

OpenCSV

OpenCSV 2.3 provided all comma-separated value (CSV) file reading and writing functions for this dissertation research. All data files needed for this dissertation are provided in CSV format. Additionally, output reports from the dissertation experiment framework were written in CSV format. The Gradle dependency string for the version of OpenCSV used is 'net.sf.opencsv:opencsv:2.3'.

JSON

The JSON library, version 20160212, provided the ability to read the configuration files for this dissertation research. The Gradle dependency string for the version of the JSON library that was used for this research is: 'org.json:json:20160212'.

JAMA

JAMA 1.0.3 (Boisvert et al., 1998) is a software library for performing numerical linear algebra tasks. JAMA was created at the National Institute of Standards and Technology in 1998 and is similar in functionality to LAPACK. JAMA provided some of the linear algebra support needed by these experiments. The Gradle dependency string for the version of JAMA that was used for this research is 'gov.nist.math:jama:1.0.3'.

3 Available from http://www.encog.org
4 Available from http://opencsv.sourceforge.net/
5 Available from http://math.nist.gov/javanumerics/jama/


Python Third Party Libraries

The Anaconda distribution of Python 3.5.2 produced the charts and table statistics for the experiment runs. Additionally, Python was used to occasionally validate the results of Encog against other machine learning frameworks.

TensorFlow

TensorFlow 0.8 was used to occasionally validate Encog deep learning network algorithms. Google released TensorFlow as an open source software library for machine learning.

Pandas

Pandas 0.19.1 provided the data set handling for the data obtained when the dissertation experiments were executed. By working directly with Scikit-Learn, NumPy, and Matplotlib, Pandas allows the experimental result data to be summarized and presented in its final form.

Scikit-Learn

Scikit-Learn 0.18 is a free software machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. This dissertation used Scikit-Learn to interface to the SKFlow framework in TensorFlow.

6 Available from https://www.continuum.io/downloads
7 Available from https://www.tensorflow.org/
8 Available from http://pandas.pydata.org/
9 Available from http://scikit-learn.org/stable/


SciPy

SciPy 0.18.1 is an open source Python library used for scientific and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers, and other tasks common in science and engineering. The statistical analyses of the results of this dissertation were calculated with SciPy.

Numpy

NumPy 1.11.1 is an extension to the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these structures. Aggregation of the many experiment cycles was performed using NumPy.

MatplotLib

Matplotlib 1.5.3 is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like wxPython, Qt, or GTK+. Charts used to present experiment results were produced using Matplotlib.

10 Available from https://www.scipy.org/
11 Available from http://www.numpy.org/
12 Available from http://matplotlib.org/


Appendix D

Data Set Descriptions

Several real-world and synthetic data sets allowed the evaluation of the AFE algorithm that was developed for this dissertation. The complete list of data sets is shown in Table 15.

Table 15. Data sets used by the dissertation experiments

Type   Name             Y               X                              Experiments
r      abalone          rings                                          2,3,4,5,6
r      auto-mpg         mpg                                            2,3,4,5,6
r      bupa             drinks                                         2,3,4,5,6
c      covtype          cover_type                                     2,3,4,5,6
r      crx              a16                                            2,3,4,5,6
r      feature_eng      ratio_poly-y0   ratio_poly-x0                  1
r      feature_eng      ratio_diff-y0   ratio_diff-x0, -x1, -x2, -x3   1
r      feature_eng      ratio_poly2-y0  ratio_poly2-x0, -x1            1
r      feature_eng      sqrt-y0         sqrt-x0                        1
r      feature_eng      log-y0          log-x0                         1
r      feature_eng      poly2-y0        poly2-x0, -x1                  1
r      feature_eng      poly-y0         poly-x0                        1
r      feature_eng      pow-y0          pow-x0                         1
r      feature_eng      diff-y0         diff-x0, -x1                   1
r      feature_eng      coef_ratio-y0   coef_ratio-x0, -x1             1
r      feature_eng      ratio-y0        ratio-x0, -x1                  1
c      forestfires      area                                           2,3,4,5,6
c      glass            type                                           2,3,4,5,6
c      heart_cleveland  num                                            2,3,4,5,6
c      Hepatitis        class                                          2,3,4,5,6
c      horse-colic      outcome                                        2,3,4,5,6
c      housing          Crim                                           2,3,4,5,6
c      iris             Species                                        2,3,4,5,6
c      pima-indians     Class                                          2,3,4,5,6
c      soybean_large    Type                                           2,3,4,5,6
c      wcbreast_wdbc    Diagnosis                                      2,3,4,5,6
c      wcbreast_wpbc    Outcome                                        2,3,4,5,6
c      Wine             Class                                          2,3,4,5,6


The first column specifies whether the data set is used for classification (c) or regression (r). The second column provides the name of the data set. The third column provides the target (y) column for the data set. If all remaining columns are used as predictors, the fourth column is blank; if only a subset of the columns is used as predictors, those columns are listed in the fourth column. Finally, the fifth column lists the experiments that make use of each data set.

The synthetic data sets are combined into the data set named feature_eng. All remaining data sets are real-world data sets from the UCI Machine Learning Repository (Newman & Merz, 1998). The synthetic data sets are described in greater detail in Appendix E, "Synthetic Data Set Generation." The real-world data sets are described in the following sections.

Abalone Data Set (Abalone)

Data from a study that provides physical measurements of abalone used to predict the age. The actual age of an abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope. Other measurements, which are easier to obtain, are used to predict the age. For this dissertation, a regression model was built between the rings (age) and the other predictors.

Auto MPG Data Set (Auto-mpg)

Attributes that describe automobiles are used to predict the miles per gallon (MPG) for these automobiles. This data set is a slightly modified version of the data set provided in the StatLib library. Additionally, 8 of the original instances were removed because they had unknown values for the "mpg" attribute. For this dissertation, a regression model was built between the car's miles per gallon (MPG) rating and the other predictors.

13 Available from: https://archive.ics.uci.edu/ml/datasets/Abalone
14 Available from: https://archive.ics.uci.edu/ml/datasets/Auto+MPG

Liver Disorders Data Set (Bupa)

A data set that examines the correlation between the number of daily alcoholic beverages and 5 blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each line in the data set constitutes the record of a single male individual. For this dissertation, a regression model was built between the number of beverages consumed and the other predictor columns.

Covertype Data Set (Covtype)

This data set examines the correlation between forest cover type and cartographic variables. The actual forest cover type for a given observation (a 30 x 30-meter cell) was determined from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. For this dissertation, a classification model was built between the cover type and the other predictor columns.

Credit Approval Data Set (Crx)

This data set investigated predicting a credit decision class based on several redacted fields from credit card applications. All attribute names and values have been changed to meaningless symbols to protect the confidentiality of the data. This data set is a good mix of attributes: continuous, nominal with small numbers of values, and nominal with larger numbers of values. There are also a few missing values. For this dissertation, a classification model was built between the redacted a16 field (class) and the other predictor columns.

15 Available from: https://archive.ics.uci.edu/ml/datasets/Liver+Disorders
16 Available from: https://archive.ics.uci.edu/ml/datasets/Covertype
17 Available from: https://archive.ics.uci.edu/ml/datasets/Credit+Approval

Forest Fires Data Set (Forestfires)18

This data set examines the correlation between the number of burned acres in specific regions and other attributes of those regions. For this dissertation, a regression model was built between the area column and the other predictor columns.

Glass Identification Data Set (Glass)19

This data set investigates the correlation between the type of glass and its refractive index and the percentages of magnesium, aluminum, silicon, potassium, calcium, barium, and iron. For this dissertation, a classification model was built between the glass type column and the other predictor columns.

Heart Disease Data Set (Heart)20

This data set contains 76 attributes; however, all published experiments use a subset of 14 of them. This dissertation used the Cleveland database, as it is the only one that has been widely used by machine learning researchers to date. The goal column refers to the presence of heart disease in the patient and is integer valued from 0 (no presence) to 4 (presence).

Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0). For this dissertation, a classification model was built between the goal column and the other predictor columns.

18 Available from: https://archive.ics.uci.edu/ml/datasets/Forest+Fires
19 Available from: https://archive.ics.uci.edu/ml/datasets/Glass+Identification
20 Available from: https://archive.ics.uci.edu/ml/datasets/Heart+Disease


Hepatitis Data Set (Hepatitis)21

This data set shows correlations between several medical tests/observations and the survival of patients with hepatitis. For this dissertation, a classification model was built between the patient's survival (live or die) and the other predictor columns.

Horse Colic Data Set (Horse)22

This data set looks at the outcome of horses that were treated for colic. The outcome was one of three values: lived, died, or euthanized. For this dissertation, a classification model was built between the treatment outcome and the other predictor columns.

Housing Data Set (Housing)23

This data set was taken from the StatLib library, which is maintained at Carnegie Mellon University. For this dissertation, a classification model was built between the crime column and the other predictor columns.

Iris Data Set (iris)24

The classic Iris Data Set allows models to predict iris species from four measurements. This classic data set was included primarily for testing purposes. For this dissertation, a classification model was built between the iris species column and the other predictor columns.

21 Available from: https://archive.ics.uci.edu/ml/datasets/Hepatitis
22 Available from: https://archive.ics.uci.edu/ml/datasets/Horse+Colic
23 Available from: https://archive.ics.uci.edu/ml/datasets/Housing
24 Available from: http://archive.ics.uci.edu/ml/datasets/Iris


Pima Indians Diabetes Data Set (Pima)25

This data set contains the results of a study to predict the presence of diabetes in female patients of Pima Indian heritage who are at least 21 years old. For this dissertation, a classification model was built between the class column (a value of 1 indicates diabetes) and the other predictor columns.

Soybean (Large) Data Set (Soybean)26

The Soybean Disease Diagnosis data set consists of 19 disease classes, only the first 15 of which have a significant number of training samples. The predictor columns contain 35 categorical attributes, some nominal and some ordered. The values for attributes are encoded numerically, with the first value encoded as "0", the second as "1", and so forth. All unknown values were encoded as "?". For this dissertation, a classification model was built between the disease type column and the other predictor columns.

Breast Cancer Wisconsin P & D (WcBreastP & WcBreastD)27

This dissertation uses two breast cancer data sets that were provided to the University of California, Irvine machine learning repository by the University of Wisconsin, Madison. These two data sets are known as the Wisconsin Diagnostic Breast Cancer (WDBC) data set and the Wisconsin Prognostic Breast Cancer (WPBC) data set. Both use various predictors to determine the diagnosis and the outcome, respectively, and both are treated as classification problems in this dissertation.

25 Available from: https://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes
26 Available from: https://archive.ics.uci.edu/ml/datasets/Soybean+(Large)
27 Available from: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)


Wine Data Set (Wine)28

The wine data set contains data that are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines, labeled class 1, 2 and 3. For this dissertation, a classification model was built between the class column and the other predictor columns.

28 Available from: https://archive.ics.uci.edu/ml/datasets/Wine


Appendix E Synthetic Data Set Generation

Experiment 1 made use of a synthetic data set that generates random samples from several equation types. This data set is used to test the ability of a neural network to synthesize the feature types that were shown in Table 8. This data set was generated with the following Python script:

import inspect
import numpy as np
import pandas as pd
import math

NUM_ROWS = 50000
SEED = 42
OUTLIER_THRESH = 2.5

NAME = 'name'
FN = 'fn'

# Each entry defines one equation type: a name and the expression used to
# compute its y0 column from randomly generated x columns.
GENERATED_FEATURES = [
    {NAME: 'ratio_diff', FN: lambda a, b, c, d: (a - b) / (c - d)},
    {NAME: 'diff', FN: lambda a, b: (a - b)},
    {NAME: 'ratio', FN: lambda a, b: (a / b)},
    {NAME: 'ratio_poly2', FN: lambda a, b: 1.0 / ((5 * (a ** 2) * (b ** 2)) + (4 * a * b) + 2)},
    {NAME: 'coef_ratio', FN: lambda a, b: (a / b)},
    {NAME: 'poly2', FN: lambda a, b: (5 * (a ** 2) * (b ** 2)) + (4 * a * b) + 2},
    {NAME: 'ratio_poly', FN: lambda x: 1 / (5 * x + 8 * x ** 2)},
    {NAME: 'poly', FN: lambda x: 1 + 5 * x + 8 * x ** 2},
    {NAME: 'sqrt', FN: lambda x: np.sqrt(x)},
    {NAME: 'log', FN: lambda x: np.log(x)},
    {NAME: 'pow', FN: lambda x: x ** 2}
]


def generate_dataset(rows):
    df = pd.DataFrame(index=range(1, rows + 1))

    predictor_columns = []
    y_columns = []

    # Generate the uniformly distributed predictor (x) columns on [-1, 1).
    for f in GENERATED_FEATURES:
        arg_count = len(inspect.signature(f[FN]).parameters)
        for arg_idx in range(arg_count):
            col_name = "{}-x{}".format(f[NAME], arg_idx)
            predictor_columns.append(col_name)
            df[col_name] = (2 * np.random.random(rows)) - 1
        y_columns.append("{}-y0".format(f[NAME]))

    # Generate the target (y0) columns by applying each expression to its x columns.
    idx = 0
    generated_columns = []

    for f in GENERATED_FEATURES:
        col_name = "{}-y0".format(f[NAME])
        generated_columns.append(col_name)
        arg_count = len(inspect.signature(f[FN]).parameters)
        a = [df.iloc[:, idx + x] for x in range(arg_count)]
        df[col_name] = f[FN](*a)
        idx += arg_count
    return df, predictor_columns, generated_columns


def generate_report(df, generated_columns):
    report = pd.DataFrame(index=range(len(generated_columns)))
    y = df.loc[:, generated_columns]
    report['name'] = generated_columns
    report['max'] = np.amax(y, axis=0).tolist()
    report['min'] = np.amin(y, axis=0).tolist()
    report['range'] = report['max'] - report['min']
    report['mean'] = np.mean(y, axis=0).tolist()
    report['std'] = np.std(y, axis=0).tolist()
    return report


def generate_predictor_report(df, predictor_columns):
    report = pd.DataFrame(index=range(len(predictor_columns)))
    y = df.loc[:, predictor_columns]
    report['name'] = predictor_columns
    report['max'] = np.amax(y, axis=0).tolist()
    report['min'] = np.amin(y, axis=0).tolist()
    return report


def remove_outliers(df, target_columns, sdev):
    # Keep only rows whose target values fall within sdev standard deviations of the mean.
    for col in target_columns:
        df = df[np.abs(df[col] - df[col].mean()) <= (sdev * df[col].std())]
    return df


def main():
    np.random.seed(SEED)

    # Generate extra rows so that NUM_ROWS remain after outlier removal.
    df, predictor_columns, generated_columns = generate_dataset(int(NUM_ROWS * 4))

    len1 = len(df)
    df = remove_outliers(df, generated_columns, OUTLIER_THRESH)
    len2 = len(df)

    print("Removed {} ({}%) outliers.".format(
        len1 - len2, 100.0 * ((len1 - len2) / len1)))

    df = df.head(NUM_ROWS)
    df = df.reset_index(drop=True)
    df.index = np.arange(1, len(df) + 1)
    df.to_csv("/Users/jeff/temp/feature_eng.csv", index_label='id')

    report = generate_report(df, generated_columns)
    print(report)


main()
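The generated file follows a simple naming convention: each equation type contributes one or more predictor columns named {feature}-x{i} and a single target column named {feature}-y0. As a brief illustration (not part of the generation script above), the following sketch loads the generated CSV and separates the predictors and targets for one equation type; the file path is the one written by the script.

import pandas as pd

# Illustrative only: load the generated synthetic data set and split it into
# predictor (-x) and target (-y0) columns using the naming convention above.
df = pd.read_csv("/Users/jeff/temp/feature_eng.csv", index_col="id")

predictor_columns = [c for c in df.columns if "-x" in c]
target_columns = [c for c in df.columns if c.endswith("-y0")]

# For example, the inputs and target of the 'ratio_poly2' equation type:
X = df[["ratio_poly2-x0", "ratio_poly2-x1"]]
y = df["ratio_poly2-y0"]
print(len(predictor_columns), len(target_columns), X.shape, y.shape)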


Appendix F Engineered Features for the Data Sets

The AFE algorithm could generate engineered features for the following data sets that increased the neural network accuracy by a statistically significant amount when trained by Encog:

 crx

 hepatitis

 soybean_large

 wine

This appendix provides the features that were engineered by the algorithm, in the form that the algorithm provided them. Note that the percentage operator (%) signifies a genetic programming style protected division operator (Koza, 1992); protected division returns the value of 1 when the denominator is 0.

No algebraic simplification was performed on the expressions presented in this appendix; they are in exactly the form returned by the algorithm. The top 5 expressions are given for each data set, and not all of them were necessarily important to improving the accuracy of the neural network.
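For clarity, the following minimal sketch shows the behavior of the protected division operator in Python with NumPy. It is an illustration only, not the dissertation's Java implementation.

import numpy as np

def protected_div(numerator, denominator):
    # GP-style protected division: a % b behaves like a / b,
    # except that the result is 1 wherever the denominator is 0.
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    safe = np.where(denominator == 0.0, 1.0, denominator)
    return np.where(denominator == 0.0, 1.0, numerator / safe)

print(protected_div(7.0, 2.0))  # 3.5
print(protected_div(7.0, 0.0))  # 1.0 (protected)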

Credit Approval Data Set (Crx)

The first engineered feature was:

((6+(-(a7-2)+-12.11574089980383))--(-((a9%7.370904240137026))))

The second engineered feature was:

((-7.326922223231946%a14)+-17)

The third engineered feature was:

(a8+11.663433332763784)

The fourth engineered feature was:

(7.169314926840066%(-8-(-(a7-3)^1)))

The fifth engineered feature was:

(((3.425788247640814-((3.702263697584752+a4-3)+4))+((a6-0%-3)+(a6-14*-7)))^(((a6-9--7.496444956829384)+(a1-0*-2))*a6-5))
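To illustrate how such expressions augment the original feature vector, the following sketch appends the second and third engineered features as new columns. It is illustrative only: it assumes the crx predictors have already been numerically encoded under the column names used above (such as a8 and a14), the example values are hypothetical, and the new column names afe_2 and afe_3 are arbitrary.

import numpy as np
import pandas as pd

# Hypothetical frame holding two of the encoded crx predictors (illustrative values).
crx = pd.DataFrame({"a8": [0.25, 1.5, 3.0], "a14": [0.0, 120.0, 260.0]})

# Protected division, as described above: a % b is a / b unless b is 0, in which case it is 1.
den = crx["a14"].to_numpy(dtype=float)
prot_div = np.where(den == 0.0, 1.0, -7.326922223231946 / np.where(den == 0.0, 1.0, den))

# Second engineered feature: ((-7.326922223231946 % a14) + -17)
crx["afe_2"] = prot_div + -17

# Third engineered feature: (a8 + 11.663433332763784)
crx["afe_3"] = crx["a8"] + 11.663433332763784

print(crx)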

Hepatitis Data Set (Hepatitis)

The first engineered feature was:

(protime%(((-1^8.509188665805073)+(((antivirals%8.057340918628082)-1)+((-9.142526195604635*varices-2)+(ascites-0%-6.569018395243251))))%((histology%spiders-1)-1)))

The second engineered feature was:

-((-2+(-(spiders-1)%35.192491116187306)))

The third engineered feature was:

(-((1%(-3.7605168920114496%fatigue-1)))^(((-4*bilirubin)+fatigue-1)+9))

The fourth engineered feature was:

-((-2+(-(spiders-1)%38.427838412028905)))

The fifth engineered feature was:

-((-2+(-(spiders-1)%(4*(35.70732174870545+((3.5698964173816687*varices-0)%(varices-2-5)))))))

Soybean (Large) Data Set (Soybean)

The first engineered feature was:

(-((0.5921349881849763-(fruiting-bodies- 2%0)))%(((7.0813545306629635+((seed-tmt-3*seed-tmt- 1)^7.016568220732647))*(int-discolor-2^((area-damaged- 4^1.9356737945298135)%-18)))*(((((((- 9.789431828678385^4.132015432022388)*(external-decay- 1*(72.99613855528524*((- 8.041427390682856*(2.947873622783181-mold-growth- 0))+2))))%(0.49332060852966997+((6.316274327060544%-(seed- size-0))^6)))^((38.453229753978654*(-8.396758075416077- (0+canker-lesion-3)))%((seed-0*(4.187375275907437%(- 3.737282999919236+stem-cankers- 4)))^(((8.193801130738532%crop-hist-0)%- 6.0244018957874435)%12.021301866008182))))^(9.0555303322901 57^temp-0))+(-2*(9%fruit-pods-2)))*((2.152281195372362-- (stem-cankers-0))*(precip-1*2.8026979383297794))))) The second engineered feature was:

(-((-0.9189201772638895-(fruiting-bodies- 2%0)))%(((6.9011685373870355+((seed-tmt-3*seed-tmt- 1)^6.484642442989197))*(int-discolor-2^((area-damaged- 4^2.434787503401039)%-18)))*(((((((- 8.926238730978975^3.765680210308123)*(external-decay-1*- 5.955737205802789))%(1.3743879365608411+((6.727787749950141 %-(seed-size-0))^6)))^((38.453229753978654*(- 7.874590346278849-(0+canker-lesion-3)))%((seed- 0*(4.3747690156462955%(((1.443741829421029-(0+(int- discolor-0*-5.225612923226253)))^(((hail-1^- 5.7693803703794355)^1.0424275795768527)^(0.1668314181187967 6%(plant-stand-0+0))))+stem-cankers- 4)))^(((8.193801130738532%crop-hist-0)%- 6.0244018957874435)%11.070479446663265))))^(8.4216248421574 88^temp-0))+(-2*(9%fruit-pods-2)))*((2.0562327191587917-- (stem-cankers-0))*(precip-1*1.7751408695346078)))))


The third engineered feature was:

(-((-0.5437368350827702-(fruiting-bodies- 2%0)))%(((7.834703649991203+((seed-tmt-3*seed-tmt- 1)^7.016568220732647))*(9.244435628694996+((-(lodging-1)- (8*sclerotia-2))^(8%(0.763732880656049-canker-lesion- 0)))))*(((((((- 8.926238730978975^5.062896207808844)*(external-decay-1*- 4.522327547659902))%(1.1146549878212895+((5.673326760938446 %-(seed-size-0))^6)))^((38.453229753978654*(- 8.074311457309028-(0+canker-lesion-3)))%((seed- 0*(4.2620186033338845%(-3.737282999919236+stem-cankers- 4)))^(((7.88938474277853%crop-hist-0)%- 4.488095747846364)%14.202405955494319))))^(7.01438345330483 95^temp-0))+(-2*(9%fruit-pods-2)))*((1.5473942401936414-- (stem-cankers-0))*(precip-1*2.8640081369243258))))) The fourth engineered feature was:

(-((0.37806897734595035-(fruiting-bodies- 2%0)))%(((8.226472078599103+((seed-tmt-3*seed-tmt- 1)^5.377881336046352))*(11.818276880555793+((-(lodging-1)- (8*sclerotia-2))^(8%(2.4707743114800387-canker-lesion- 0)))))*(((((((- 8.972690306696542^3.765680210308123)*(external-decay-1*- 4.24513736228948))%(1.1146549878212895+((6.807335268032035% -(seed-size-0))^6)))^((38.453229753978654*(- 7.305972062601912-(0+canker-lesion-3)))%((seed- 0*(1.9717851404686058%(-2.723951379378996+stem-cankers- 4)))^(((6.957834989845951%crop-hist-0)%- 6.0244018957874435)%10.78692383813486))))^(8.42162484215748 8^temp-0))+(-2*(9%fruit-pods-2)))*((1.2161900162821047-- (stem-cankers-0))*(precip-1*1.7751408695346078))))) The fifth engineered feature was:

(-((-0.5437368350827702-(fruiting-bodies- 2%0)))%(((8.226472078599103+((seed-tmt-3*seed-tmt- 1)^7.016568220732647))*(9.878964913870824+((-(lodging-1)- (8*sclerotia-2))^(8%(2.4707743114800387-canker-lesion- 0)))))*(((((((- 8.926238730978975^3.765680210308123)*(external-decay-1*- 4.24513736228948))%(1.1146549878212895+((6.316274327060544% -(seed-size-0))^6)))^((38.453229753978654*(- 7.305972062601912-(0+canker-lesion-3)))%((seed-


0*(4.187375275907437%(-3.737282999919236+stem-cankers- 4)))^(((8.193801130738532%crop-hist-0)%- 6.0244018957874435)%11.070479446663265))))^(8.4216248421574 88^temp-0))+(-2*(9%fruit-pods-2)))*((1.5473942401936414-- (stem-cankers-0))*(precip-1*1.7751408695346078)))))

Wine Data Set (Wine)

The first engineered feature was:

((((((color_intensity^total_phenols)+0)+72.99971530425738)% (((1.6767113954095336%alcalinity_ash)^(1- ((flavanoids^(2.7122345849207767-- (malic_acid)))^((4%(od28_od315%4.119741319687906))+((- 5%alcohol)*1.2800619275259493)))))*-11.940837228486675))^- ((-(total_phenols)+8)))%(((alcohol%6)*- 0.37222639609811914)+-((flavanoids-ash)))) The second engineered feature was:

(((magnesium+((7.542468531094034^(total_phenols^2.969043114 577805))-((6--(ash))-(((((5-flavanoids)+(- 1.887330658434184*color_intensity))^-0.018866254325722687)- -6)+(alcalinity_ash%9)))))^(ash%hue))%(((alcohol%6)*- 0.37222639609811914)+((8.576795897083196^((hue*- 3.268729090560387)^0))%(0%hue)))) The third engineered feature was:

(((magnesium+((7.542468531094034^(total_phenols^2.969043114 577805))-((6--(ash))- (alcalinity_ash+(alcalinity_ash%9)))))^(ash%hue))%(((((5.60 1278729430382*(proanthocyanins%-7.381645916985297))- (magnesium-proanthocyanins))%6)*- 0.37222639609811914)+((8.576795897083196^((hue*- 3.268729090560387)^0))%(0%hue))))

The fourth engineered feature was:

((-6.191083816011499^-214.93571903026356)+(-27^(-((7- color_intensity))%((0.10866723540463635-(-


0.2908594216097491%(total_phenols-proanthocyanins)))^- (((0^total_phenols)%0)))))) The fifth engineered feature was:

((-15.162959286967478^-8.914653447205758)+((- 18.28953365078993^flavanoids)%((1.1151052593044681+(11083.5 50907581806%(proline+-3)))-2.6617334438560087)))


Appendix G Detailed Experiment Results

The raw data obtained from the experiments in this dissertation are available in the author's GitHub repository. For more information on obtaining these files, refer to Appendix H, "Dissertation Source Code Availability." The result files are stored as CSV files:

 report-exp1.csv

 report-exp2.csv

 report-exp3.csv

 report-exp4.csv

 report-exp5.csv

 report-exp6.csv

 workload.csv

The first six files, report-exp1.csv through report-exp6.csv, are summaries of the training cycles that were run. The final file, workload.csv, which is much longer, contains the individual cycle results for all experiments; these results were summarized to produce the individual experiment reports.

The format of each of these files is given in the following sections.
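As an illustration of how the report files relate to workload.csv, the following sketch re-aggregates the per-cycle raw errors into a report-style summary. It assumes the column names described later in this appendix and is not part of the dissertation code.

import pandas as pd

# Illustrative only: rebuild a report-style summary from the per-cycle results.
workload = pd.read_csv("workload.csv")

summary = (workload
           .groupby(["name", "dataset", "algorithm"])["result-raw"]
           .agg(["min", "max", "mean", "std"])
           .rename(columns={"std": "sd"})
           .reset_index())

print(summary.head())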


Format for report-exp1.csv

This file holds the results for experiment 1. The min/max/mean errors were reported in RMSE for all rows in this experiment. The fields contained in this file are as follows:

 "experiment" – The experiment run, this is “experiment 1” for all rows in this

report.

 "algorithm" – The algorithm being used, either neural network or GP. Also,

specifies if this is classification or regression.

 "dataset" – The data set being used.

 "min" – The minimum error.

 "max" – The maximum error.

 "mean" – The mean error.

 "sd" – The standard deviation of this error.

 "elapsed" – The average time to run one cycle of this dataset/algorithm.

Format for report-exp2.csv

This file holds the results for experiment 2. The min/max/mean errors were reported in normalized RMSE for all rows in this experiment; a short sketch of one way to compute normalized RMSE follows the field list below. The fields contained in this file are as follows:

 "experiment" – The experiment run, this is “experiment 2” for all rows in this

report.

 "algorithm" – The algorithm being used, either neural network or GP. Also,

specifies if this is classification or regression.

 "dataset" – The data set being used.

175

 "min" – The minimum error.

 "max" – The maximum error.

 "mean" – The mean error.

 "sd" – The standard deviation of this error.

 "elapsed" – The average time to run one cycle of this dataset/algorithm.

Format for report-exp3.csv

This file holds the results for experiment 3. The min/max/mean errors were reported in normalized RMSE for all rows in this experiment. The fields contained in this file are as follows:

 "experiment" – The experiment run, this is experiment 3 for all rows in this

report.

 "algorithm" – The is “ensemble” for either classification or regression, for

this experiment.

 "dataset" – The data set being used.

 "min" – The minimum error.

 "max" – The maximum error.

 "mean" – The mean error.

 "sd" – The standard deviation of this error.

 "elapsed" – The average time to run one cycle of this dataset/algorithm.


Format for report-exp4.csv

This file holds the results for experiment 4. The fields contained in this file are as follows:

 "experiment" – The experiment run, this is experiment 4 for all rows in this

report.

 "algorithm" – The is “patterns” for this experiment.

 "dataset" – The data set being used.

 "result" – The patterns found.

 "elapsed" – The time it took to run this dataset/algorithm.

Format for report-exp5.csv

This file holds the results for experiment 5. The fields contained in this file are as follows:

 "experiment" – The experiment run, this is “experiment 5” for all rows in this

report.

 "algorithm" – The is “importance” for this experiment.

 "dataset" – The data set being used.

 "IMP-W" – How many cycles was the weight-based ranking algorithm stable

for.

 "IMP-P" – How many cycles was the perturbation-base ranking algorithm

stable for.

 "elapsed" – The time it took to run this dataset/algorithm.


Format for report-exp6.csv

This file holds the results for experiment 6. The min/max/mean errors were reported in normalized RMSE for all rows in this experiment. The fields contained in this file are as follows:

 "experiment" – The experiment run, this is “experiment 6” for all rows in this

report.

 "algorithm" – The is “autofeature” for either classification or regression, for

this experiment.

 "dataset" – The data set being used.

 "min" – The minimum error.

 "max" – The maximum error.

 "mean" – The mean error.

 "sd" – The standard deviation of this error.

 "elapsed" – The average time to run one cycle of this dataset/algorithm.

Format for workload.csv

The workload.csv file contains the detailed results from all cycles. Most experiment parts are repeated 100 times, and the average error is reported. Because of this, the workload file will have nearly 2,000 lines. The columns reported in this file are as follows:

 "name" – The name of the experiment that was run.

 "status" – The status of this experiment cycle, this should be “complete” for

file that has run to completion.

178

 "dataset" – The data set that was used for this experiment, cycle and

algorithm.

 "algorithm" – The algorithm that was used.

 "predictors" – The predictors to use, or empty to use them all.

 "cycle" – The cycle number.

 "result" – The result of this experiment, given in the format specified by that

experiment.

 "result-raw" – The raw (RMSE) result from this experiment. This is usually

the error.

 "i1" – An experiment specific integer that can be reported.

 "i2" – Another experiment specific integer that can be reported.

 "iterations" – The number of training iterations that were completed for this

experiment.

 "elapsed" – The elapsed time (seconds) for this experiment.

 "info" – Any additional information for this experiment.


Appendix H Dissertation Source Code Availability

The complete source code for this dissertation is available from the author’s GitHub repository.29 The files in this dissertation consist primarily of Java source code, Python source code, and other generated files.

Java Source Code

Java was used to create the actual dissertation algorithm. Java allowed the creation of efficient multi-threaded code that could leverage advanced Amazon AWS instances, such as the c3.8xlarge instance type with 32 processing cores and 60 gigabytes of RAM.

The dissertation algorithm could keep all 32 cores at 90%-100% utilization for many data sets. All Java code was located at the following path:

./gp_java

All code can be compiled using the Gradle build system. Running the tasks directive will show the tasks that can be performed.

Jeffreys-MacBook-Pro:gp_java jeff$ gradle tasks

...
runAutoFeature - Run the auto feature engineering algorithm
runExperiment1 - Run experiment 1
runExperiment2 - Run experiment 2
runExperiment3 - Run experiment 3
runExperiment4 - Run experiment 4
runExperiment5 - Run experiment 5
runExperiment6 - Run experiment 6
runExperimentAll - Run all experiments.
runGP - Run an experiment
runNeuralXOR - Run an experiment
runSimpleGP - Run an experiment

29 Available from: https://github.com/jeffheaton/phd-dissertation


To run all experiments, use the following command:

gradle runExperimentAll

Running all experiments could take hours, depending on the speed of your computer. Running the examples requires a configuration file. This file should be in the user's home directory and named as follows:

~/.dissertation_jheaton

The file has two configuration lines:

host: macbook-pro
project: /Users/jeff/temp/dissertation

The host line simply specifies the name of the computer that the experiments were run on; the host is written to several output files. The project line specifies where the dissertation report files should be written.

To run this dissertation’s AFE algorithm for any other data sets, run the following command in a directory that has a proper autofeature.json file. Refer to Chapter 3 for the format of this file.

gradle runAutoFeature

Python Source Code

Python source code was used for various tasks. All of the final tables and charts from Chapter 4 were produced using a Python Jupyter notebook. Another Python file generated the synthetic data set used in Experiment 1. The following Python files were used:

 dissertation_tables.ipynb – Generate the dissertation tables for Chapter 4.

 engineered_feature_dataset.py – Generate the synthetic (engineered-feature) data set for Experiment 1.

Other Files

The complete results from the final run of this dissertation are stored in the file final-output.zip, which is located in the root of the GitHub repository. The data sets used for evaluation are at the following location:

~/gp_java/src/main/resources

These data sets are bundled into the JAR file that is created to run the experiments.

If you have any questions about running the dissertation code, please contact [email protected].


References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., . . . Devin, M. (2016). TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv preprint arXiv:1603.04467.

Albrecht, R., Buchberger, B., Collins, G. E., & Loos, R. (2012). Computer algebra: symbolic and algebraic computation (Vol. 4): Springer Science & Business Media.

Anscombe, F. J., & Tukey, J. W. (1963). The examination and analysis of residuals. Technometrics, 5(2), 141-160.

Arnold, K., Gosling, J., & Holmes, D. (1996). The Java programming language (Vol. 2): Addison-Wesley Reading.

Bahnsen, A. C., Aouada, D., Stojanovic, A., & Ottersten, B. (2016). Feature Engineering Strategies for Credit Card Fraud Detection. Expert Systems with Applications.

Balkin, S. D., & Ord, J. K. (2000). Automatic neural network modeling for univariate time series. International Journal of Forecasting, 16(4), 509-515.

Banzhaf, W. (1993). Genetic programming for pedestrians. Paper presented at the Proceedings of the 5th International Conference on Genetic Algorithms, ICGA-93, University of Illinois at Urbana-Champaign.

Banzhaf, W., Francone, F. D., Keller, R. E., & Nordin, P. (1998). Genetic programming: an introduction: on the automatic evolution of computer programs and its applications: Morgan Kaufmann Publishers Inc.

Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow, I., Bergeron, A., . . . Bengio, Y. (2012). Theano: new features and speed improvements. arXiv preprint arXiv:1211.5590.

Bellman, R. (1957). Dynamic Programming. Princeton, NJ, USA: Princeton University Press.

Bengio, Y. (2009). Learning deep architectures for AI. Foundations and trends in Machine Learning, 2(1), 1-127.

Bengio, Y. (2013). Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.


Bergstra, J., Breuleux, O., Bastien, F., Lamblin, P., Pascanu, R., Desjardins, G., . . . Bengio, Y. (2010). Theano: a CPU and GPU math expression compiler. Paper presented at the Proceedings of the Python for Scientific Computing Conference (SciPy).

Bertsekas, D. P. (1999). Nonlinear programming.

Bíró, I., Szabó, J., & Benczúr, A. A. (2008). Latent Dirichlet allocation in web spam filtering. Paper presented at the Proceedings of the 4th international workshop on Adversarial information retrieval on the web.

Bishop, C. M. (1995). Neural networks for pattern recognition: Oxford University Press.

Bizer, C., Heath, T., & Berners-Lee, T. (2009). Linked data-the story so far. Semantic Services, Interoperability and Web Applications: Emerging Concepts, 205-227.

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research, 3, 993-1022.

Boisvert, R., Hicklin, J., Miller, B., Moler, C., Pozo, R., Remington, K., & Webb, P. (1998). JAMA: A Java matrix package. URL: http://math.nist.gov/javanumerics/jama.

Bottou, L. (2012). Stochastic gradient descent tricks Neural Networks: Tricks of the Trade (pp. 421-436): Springer.

Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2), pp. 211-252.

Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.

Breiman, L., & Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391), 580-598.

Brosse, S., Lek, S., & Dauba, F. (1999). Predicting fish distribution in a mesotrophic lake by hydroacoustic survey and artificial neural networks. Limnology and Oceanography, 44(5), 1293-1303.

Brown, B. F. (1998). Life and health insurance underwriting: Life Office Management Association.

Brown, M., & Lowe, D. G. (2003). Recognising panoramas. Paper presented at the International Conference on Computer Vision (ICCV).

Chea, R., Grenouillet, G., & Lek, S. (2016). Evidence of Water Quality Degradation in Lower Mekong Basin Revealed by Self-Organizing Map. PloS one, 11(1).


Cheng, B., & Titterington, D. M. (1994). Neural networks: a review from a statistical perspective. Statistical science, 2-30.

Cheng, W., Kasneci, G., Graepel, T., Stern, D., & Herbrich, R. (2011). Automated feature generation from structured knowledge. Paper presented at the Proceedings of the 20th ACM International Conference on Information and Knowledge Management.

Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2015). Gated feedback recurrent neural networks. arXiv preprint arXiv:1502.02367.

Coates, A., Lee, H., & Ng, A. Y. (2011). An analysis of single-layer networks in unsupervised feature learning. Paper presented at the Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics.

Coates, A., & Ng, A. Y. (2012). Learning feature representations with k-means Neural Networks: Tricks of the Trade (pp. 561-580): Springer.

Crepeau, R. L. (1995). Genetic evolution of machine language software. Paper presented at the Proceedings of the Workshop on Genetic Programming: From Theory to Real-World Applications, Tahoe City, California, USA.

Cuayáhuitl, H. (2016). SimpleDS: A Simple Deep Reinforcement Learning Dialogue System. arXiv preprint arXiv:1601.04574.

Davis, J. J., & Foo, E. (2016). Automated feature engineering for HTTP tunnel detection. Computers & Security, 59, 166-185.

Dawkins, R. (1976). The selfish gene: Oxford university press.

De Boer, P.-T., Kroese, D. P., Mannor, S., & Rubinstein, R. Y. (2005). A tutorial on the cross-entropy method. Annals of operations research, 134(1), 19-67.

Deb, K. (2001). Multi-objective optimization using evolutionary algorithms: John Wiley & Sons, Inc.

Dietterich, T. G. (2000). Ensemble methods in machine learning Multiple classifier systems (pp. 1-15): Springer.

Diplock, G. (1998). Building new spatial interaction models by using genetic programming and a supercomputer. Environment and Planning, 30(10), 1893-1904.

Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul), 2121-2159.


Elman, J. L. (1990). Finding structure in time. Cognitive science, 14(2), 179-211.

Filice, S., Castellucci, G., Croce, D., & Basili, R. (2015). Kelp: a kernel-based learning platform for natural language processing. ACL-IJCNLP 2015, 19.

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of eugenics, 7(2), 179-188.

Freeman, M. F., & Tukey, J. W. (1950). Transformations related to the angular and the square root. The Annals of Mathematical Statistics, 607-611.

Fukushima, K. (1980). Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological cybernetics, 36(4), 193- 202.

Gao, L., & Hu, Y. (2006). Multi-target matching based on Niching genetic algorithm. International Journal of Computer Science Network Security, 6, 215-220.

Garson, D. G. (1991). Interpreting neural network connection weights.

Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. Paper presented at the International Conference on Artificial Intelligence and Atatistics.

Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. Paper presented at the International Conference on Artificial Intelligence and Statistics.

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge, MA: MIT Press.

Graves, A., Wayne, G., & Danihelka, I. (2014). Neural turing machines. arXiv preprint arXiv:1410.5401.

Gruau, F. (1996). On using syntactic constraints with genetic programming. Paper presented at the Advances in Genetic Programming.

Guo, H., Jack, L. B., & Nandi, A. K. (2005). Feature generation using genetic programming with application to fault classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 35(1), 89-99.

Guyon, I., Gunn, S., Nikravesh, M., & Zadeh, L. A. (2008). Feature extraction: foundations and applications (Vol. 207): Springer.


Gybenko, G. (1989). Approximation by superposition of sigmoidal functions. Mathematics of Control, Signals and Systems, 2(4), 303-314.

Heaton, J. (2015). Encog: library of interchangeable machine learning models for java and c#. Journal of Machine Learning Research, 16, 1243-1247.

Heaton, J. (2016). An empirical analysis of feature engineering for predictive modeling. Paper presented at the IEEE Southeastcon 2016, Norfolk, VA.

Hebb, D. O. (1949). The organization of behavior: New York: Wiley.

Hinton, G., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computing, 18(7), 1527-1554. doi:10.1162/neco.2006.18.7.1527

Hinton, G., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504-507.

Hochreiter, S. (1991). Untersuchungen zu dynamischen neuronalen Netzen. Diploma, Technische Universität München.

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.

Holland, J. H. (1975). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. University of Michigan Press.

Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural networks, 4(2), 251-257.

Ildefons, M. D. A., & Sugiyama, M. (2013). Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering. IEICE transactions on information and systems, 96(3), 742-745.

Janikow, C. Z. (1996). A methodology for processing problem constraints in genetic programming. Computers & Mathematics with Applications, 32(8), 97-113.

Jones, E., Oliphant, T., Peterson, P., & al., e. (2001). SciPy: open source scientific tools for Python. Retrieved from http://www.scipy.org/

Jordan, M. I. (1997). Serial order: A parallel distributed processing approach. Advances in psychology, 121, 471-495.


Kalchbrenner, N., Danihelka, I., & Graves, A. (2015). Grid long short-term memory. arXiv preprint arXiv:1507.01526.

Kanter, J. M., & Veeramachaneni, K. (2015). Deep feature synthesis: towards automating data science endeavors. Paper presented at the IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2015. 36678 2015.

Kingma, D., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Knuth, D. E. (1997). The art of computer programming, volume 1 (3rd ed.): fundamental algorithms: Addison Wesley Longman Publishing Co., Inc.

Koza, J. R. (1992). Genetic programming: on the programming of computers by means of natural selection: MIT Press.

Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. New York, NY: Springer.

Le, Q. V. (2013). Building high-level features using large scale unsupervised learning. Paper presented at the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Lloyd, J. R., Duvenaud, D., Grosse, R., Tenenbaum, J. B., & Ghahramani, Z. (2014). Automatic construction and natural-language description of nonparametric regression models. arXiv preprint arXiv:1402.4304.

Lowe, D. G. (1999). Object recognition from local scale-invariant features. Paper presented at the The Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999.

Masters, T. (1993). Practical neural network recipes in C++: Morgan Kaufmann.

McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics, 5(4), 115-133.

McKinney, W. (2012). Python for data analysis: Data wrangling with Pandas, NumPy, and IPython: O'Reilly Media, Inc.

Miller, J., & Thomson, P. (2000). Cartesian genetic programming Lecture Notes in Computer Science (Vol. 1802, pp. 121-132): Springer.

Minsky, M. L., & Papert, S. A. (1969). Perceptrons. an introduction to computational geometry. Science, 165(3895).


Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression: a second course in statistics. Addison-Wesley Series in Behavioral Science: Quantitative Methods.

Mozer, M. C. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex systems, 3(4), 349-381.

Neshatian, K. (2010). Feature Manipulation with Genetic Programming.

Nesterov, Y. (1983). A method of solving a convex programming problem with convergence rate O (1/k2). Paper presented at the Soviet Mathematics Doklady.

Newman, C. L. B. D. J., & Merz, C. J. (1998). UCI repository of machine learning databases.

Ng, A. Y. (2004). Feature selection, L 1 vs. L 2 regularization, and rotational invariance. Paper presented at the Proceedings of the twenty-first international conference on Machine learning.

Nguyen, D. H., & Widrow, B. (1990). Neural networks for self-learning control systems. Control Systems Magazine, IEEE, 10(3), 18-23.

Nordin, P. (1994). A compiling genetic programming system that directly manipulates the machine code. In K. E. Kinnear, Jr. (Ed.), Advances in Genetic Programming (pp. 311- 331): MIT Press.

Nordin, P., Banzhaf, W., & Francone, F. D. (1999). Efficient evolution of machine code for CISC architectures using instruction blocks and homologous crossover. In L. Spector, W. B. Langdon, U.-M. O'Reilly, & P. J. Angeline (Eds.), Advances in Genetic Programming (Vol. 3, pp. 275--299). Cambridge, MA, USA: MIT Press.

Olden, J. D., & Jackson, D. A. (2002). A comparison of statistical approaches for modelling fish species distributions. Freshwater biology, 47(10), 1976-1995.

Olden, J. D., Joy, M. K., & Death, R. G. (2004). An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecological Modelling, 178(3), 389-397.

Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607--609.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., . . . Dubourg, V. (2011). Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research, 12, 2825-2830.


Perkis, T. (1994). Stack-Based Genetic Programming. Paper presented at the International Conference on Evolutionary Computation.

Pinkus, A. Z., & Winitzki, S. (2002). YACAS: A do-it-yourself symbolic algebra environment Artificial Intelligence, Automated Reasoning, and Symbolic Computation (pp. 332-336): Springer.

Poli, R., Langdon, W. B., & McPhee, N. F. (2008). A Field Guide to Genetic Programming: Lulu Enterprises, UK Ltd.

Polyak, B. T. (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5), 1-17.

Prechelt, L. (1994). Proben1: A set of neural network benchmark problems and benchmarking rules.

Rajaraman, A., & Ullman, J. D. (2011). Mining of massive datasets: Cambridge University Press.

Riedmiller, M., & Braun, H. (1992). RPROP-A fast adaptive learning algorithm. Paper presented at the Proc. of ISCIS VII), Universitat.

Robinson, A., & Fallside, F. (1987). The utility driven dynamic error propagation network: University of Cambridge Department of Engineering.

Rocha, M., & Neves, J. (1999). Preventing premature convergence to local optima in genetic algorithms via random offspring generation. Paper presented at the International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems.

Rosenblatt, F. (1962). Principles of neurodynamics.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1985). Learning internal representations by error propagation.

Russell, S., & Norvig, P. (1995). Artificial intelligence: a modern approach.

Scott, S., & Matwin, S. (1999). Feature engineering for text classification. Paper presented at the ICML.

Smola, A., & Vapnik, V. (1997). Support vector regression machines. Advances in neural information processing systems, 9, 155-161.


Sobkowicz, A. (2016). Automatic Sentiment Analysis in Polish Language Machine Intelligence and Big Data in Industry (pp. 3-10): Springer.

Spector, L., & Klein, J. (2006). Trivial geography in genetic programming Genetic programming theory and practice III (pp. 109-123): Springer.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.

Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary computation, 10(2), 99-127.

Stigler, S. M. (1986). The history of statistics: the measurement of uncertainty before 1900: Belknap Press of Harvard University Press.

Sussman, G., Abelson, H., & Sussman, J. (1983). Structure and interpretation of computer programs: MIT Press, Cambridge, Mass.

Sutskever, I., Martens, J., Dahl, G., & Hinton, G. (2013). On the importance of initialization and momentum in deep learning. Paper presented at the Proceedings of the 30th International Conference on Machine Learning (ICML-13).

Teller, A. (1994). Turing completeness in the language of genetic programming with indexed memory. Paper presented at the Proceedings of the First IEEE Conference on Evolutionary Computation, 1994. IEEE World Congress on Computational Intelligence.

Tieleman, T., & Hinton, G. (2012). Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4(2).

Timmerman, M. E. (2003). Principal component analysis (2nd Ed.). I. T. Jolliffe. Journal of the American Statistical Association, 98, 1082-1083.

Topchy, A., & Punch, W. F. (2001). Faster genetic programming based on local gradient search of numeric leaf values. Paper presented at the Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001).

Tu, Z., & Lu, Y. (2004). A robust stochastic genetic algorithm (StGA) for global numerical optimization. IEEE Transactions on Evolutionary Computation, 8(5), 456-470.

Tukey, J. W., Laurner, J., & Siegel, A. (1982). The use of smelting in guiding re-expression Modern Data Analysis (pp. 83-102): Academic Press New York.


Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Journal of Math, 58(345-363), 5.

Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9(2579-2605), 85.

Van Rossum, G. (1995). Python tutorial, May 1995. CWI Report CS-R9526.

Wang, C., & Blei, D. M. (2011). Collaborative topic modeling for recommending scientific articles. Paper presented at the Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and .

Wang, M., Li, L., Yu, C., Yan, A., Zhao, Z., Zhang, G., . . . Gasteiger, J. (2016). Classification of Mixtures of Chinese Herbal Medicines Based on a Self‐Organizing Map (SOM). Molecular Informatics.

Werbos, P. (1974). Beyond regression: new tools for prediction and analysis in the behavioral sciences.

Werbos, P. (1988). Generalization of backpropagation with application to a recurrent gas market model. Neural networks, 1(4), 339-356.

White, D. R., Mcdermott, J., Castelli, M., Manzoni, L., Goldman, B. W., Kronberger, G., . . . Luke, S. (2013). Better GP benchmarks: community survey results and proposals. Genetic Programming and Evolvable Machines, 14(1), 3-29.

Worm, T., & Chiu, K. (2013). Prioritized grammar enumeration: symbolic regression by dynamic programming. Paper presented at the Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, Amsterdam, The Netherlands.

Xue, B., Zhang, M., Browne, W. N., & Yao, X. (2016). A survey on evolutionary computation approaches to feature selection. IEEE Transactions on Evolutionary Computation, 20(4), 606-626.

Yu, D., Eversole, A., Seltzer, M., Yao, K., Huang, Z., Guenter, B., . . . Wang, H. (2014). An introduction to computational networks and the computational network toolkit.

Yu, H.-F., Lo, H.-Y., Hsieh, H.-P., Lou, J.-K., McKenzie, T. G., Chou, J.-W., . . . Chen-Wei, H. (2011). Feature engineering and classifier ensemble for KDD cup 2010. Paper presented at the JMLR: Workshop and Conference Proceedings 1: 1-16.

Zeiler, M. D. (2012). ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701.


Zhang, W., Huan, R., & Jiang, Q. (2016). Application of Feature Engineering for Phishing Detection. IEICE transactions on information and systems, 99(4), 1062-1070.

Ziehe, A., Kawanabe, M., Harmeling, S., & Müller, K.-R. (2001). Separation of post-nonlinear mixtures using ACE and temporal decorrelation. Paper presented at the Proceedings of the International Workshop on Independent Component Analysis and Blind Signal Separation (ICA2001).