The Use of Genetic Programming for Detecting the Incorrect Predictions of Classification Models


The Use of Genetic Programming for Detecting the Incorrect Predictions of Classification Models

Adrianna Maria Napiórkowska

NOVA Information Management School
Instituto Superior de Estatística e Gestão de Informação
Universidade Nova de Lisboa

Dissertation presented as partial requirement for obtaining the Master's degree in Advanced Analytics
Advisor: Leonardo Vanneschi
Lisbon, November 27th, 2019

Abstract

Companies around the world use Advanced Analytics to support their decision-making process. Traditionally they used Statistics and Business Intelligence for that, but as technology advances, more complex models are gaining popularity. The main reason for the increasing interest in Machine Learning and Deep Learning models is the fact that they reach high prediction accuracy. On the other hand, good performance comes with an increasing complexity of the programs. Therefore a new area of research was introduced, called Explainable AI. The idea is to create models that can be understood by business users, or models that explain the predictions of other models. We therefore propose a study in which we create a separate model that serves as a verifier for the predictions of a machine learning model. This work falls into the area of post-processing of model outputs. For this purpose we select Genetic Programming, which has proven successful in various applications. In the scope of this research we investigate whether GP can evaluate the predictions of other models. This area of application has not been explored yet, therefore in the study we explore the possibility of evolving an individual that validates another model. We focus on classification problems and select four machine learning models (logistic regression, decision tree, random forest and perceptron) and three different datasets. This setup is used to ensure that the conclusions drawn during the research hold across different problems. The performance of 12 Genetic Programming experiments indicates that in some cases it is possible to create a successful model for error prediction. During the study we discovered that the performance of GP programs is mostly connected to the dataset on which the experiment is conducted; the type of predictive model does not influence the performance of GP. Although we managed to create good classifiers of errors, during the evolution process we faced the problem of overfitting, which is common in problems with imbalanced datasets. The results of the study confirm that GP can be used for this new type of problem and successfully predict the errors of machine learning models.

Keywords: Machine Learning, Explainable AI, Post-processing, Classification, Genetic Programming, Errors Prediction

Table of contents

List of Figures
List of Tables
1 Introduction
2 Machine Learning
  2.1 Models interpretability
  2.2 Explainable AI
3 Genetic Programming
  3.1 General structure
  3.2 Initialization
  3.3 Selection
  3.4 Replication and Variation
  3.5 Applications
4 Experimental study
  4.1 Research Methodology
    4.1.1 Data Flow in a Project
    4.1.2 Predictive Models Used in the Study
    4.1.3 Datasets Used in the Study
  4.2 Experimental settings
  4.3 Experimental results
5 Conclusions and future work
Bibliography

List of Figures

2.1 Types of machine learning problems
2.2 Machine learning process
2.3 Deep Learning solution
2.4 Modified machine learning process
3.1 Example of a tree generation process using the full method
3.2 Example of a tree generation process using the grow method
3.3 Example of subtree crossover
3.4 Example of subtree mutation
4.1 Visualization of the test cases preparation
4.2 Dataset transformations used in the experimental study and steps applied in the process
4.3 Distribution of the dependent variable in the Breast Cancer Wisconsin dataset
4.4 Distribution of the dependent variable in the Bank Marketing dataset
4.5 Distribution of the target variable in the Polish Companies Bankruptcy dataset before and after up-sampling
4.6 Implementation of the research idea
4.7 Data split conducted in the project
4.8 Summary of the results for the Breast Cancer Wisconsin dataset test cases
4.9 Summary of the results for the Bank Marketing dataset test cases
4.10 Summary of the results for the Polish Companies Bankruptcy dataset test cases
4.11 Comparison of the performance of the best GP programs from different runs calculated on the test set
4.12 Average of maximum train fitness summarized by model and test case

List of Tables

4.1 Summary of the predictions used as test cases
4.2 Comparison between confusion matrices obtained by two different fitness functions
4.3 Summary of the parameters selected for test cases
4.4 Best individuals found for Breast Cancer Wisconsin dataset test cases
4.5 Best individuals found for Bank Marketing dataset test cases
4.6 Best individuals found for Polish Companies Bankruptcy dataset test cases

Chapter 1: Introduction

The history of algorithms begins in the 19th century, when Ada Lovelace, a mathematician and poet, wrote an article describing a concept that would allow the Analytical Engine to repeat a series of instructions. This method is known nowadays as a loop, widely used in computer programming. In her work she also described how code could be written for a machine to handle not only numbers but also letters and commands. She is considered the author of the first algorithm and the first computer programmer. Although Ada Lovelace did not have a computer as we have today, the ideas she developed are present in various algorithms and methods used nowadays. Since that time, researchers and scientists have focused on the optimization of work and the automation of repetitive tasks, and over the years they have developed a wide range of methods for that purpose. In addition, the objective of much research has been to allow computer programs to learn. This ability could help in various areas, from learning how to treat diseases based on medical records, to applying predictive models in areas where classic approaches are not effective, to creating a personal assistant that can learn and optimize our daily tasks. All of the mentioned concepts can be described as machine learning. According to Mitchell (1997), an understanding of how to make computers learn would create new areas for customization and development.
In addition, detailed knowledge of machine learning algorithms and the way they work might lead to a better comprehension of human learning abilities. Many computer programs were developed by implementing useful types of learning and started to be used in commercial projects. According to research, these algorithms were outperforming other methods in various areas, such as speech or image recognition, knowledge discovery in large databases, or creating programs able to act like humans, e.g. chatbots and game-playing programs.

On the one hand, intelligent systems are very accurate and have high predictive power. On the other hand, they are described by a large number of parameters, hence it is more difficult to draw direct conclusions from the models and trust their predictions. Therefore research in the area of explainable AI has become very popular, and there is a need for analysis of the output of predictive models. Some areas of study and business applications especially require transparency of the applied models, e.g. banking and the process of loan approval. One reason for this is new regulation protecting personal data, such as the General Data Protection Regulation (GDPR), which requires businesses to be able to delete sensitive personal data upon request and protects consumers with a new right, the Right to Explanation. It has affected business in Europe since May 2018 and is increasing the importance of the field of Explainable AI, as mentioned in the publication "Current Advances, Trends and Challenges of Machine Learning and Knowledge Extraction: From Machine Learning to Explainable AI" by Holzinger et al. (2018). The application of AI in many fields is very successful, but as stated in the mentioned article: "We are reaching a new AI spring. However, as fantastic current approaches seem to be, there are still huge problems to be solved: the best performing models lack transparency, hence are considered to be black boxes. The general and worldwide trends in privacy, data protection, safety and security make such black box solutions difficult to use in practice."

Therefore, in order to align with this regulation and provide trustworthy predictions, an additional step of post-processing of the predictions is applied in many cases. A good model should generate decisions with high certainty. The first indication of that is high performance observed during the training phase. Secondly, the results of evaluation on the test and validation sets should not diverge significantly, proving the stability of the solution. In this area, post-processing of outputs can be very beneficial. If the model predicts loans that will not be repaid, the cost of a wrong prediction can be very high when a loan is granted to a bad customer. Therefore banking institutions spend a lot of time and resources on improving their decision-making process.
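The post-processing idea described above can be made concrete with a small verifier experiment: train a base classifier, label each of its predictions as correct or incorrect, and then train a second model to recognize the inputs on which the base model tends to fail. The sketch below is a minimal illustration of that data flow, assuming scikit-learn and gplearn are available; the thesis's own GP configuration, datasets and fitness function differ, and SymbolicClassifier is only a stand-in for the evolved verifier.

```python
# Minimal sketch of the "verifier" idea: a GP-based classifier is trained to
# predict when a base model's classification is wrong. Assumes scikit-learn
# and gplearn; this is an illustration, not the thesis's exact setup.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from gplearn.genetic import SymbolicClassifier

X, y = load_breast_cancer(return_X_y=True)
X_base, X_rest, y_base, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)

# 1) Train the base classification model on one half of the data.
base_model = LogisticRegression(max_iter=5000).fit(X_base, y_base)

# 2) Build the error-detection dataset: the new target is 1 when the base
#    model's prediction is incorrect, 0 when it is correct.
errors = (base_model.predict(X_rest) != y_rest).astype(int)

X_ver_train, X_ver_test, e_train, e_test = train_test_split(
    X_rest, errors, test_size=0.3, random_state=0)

# 3) Evolve a GP program that tries to flag the incorrect predictions.
verifier = SymbolicClassifier(population_size=500, generations=30,
                              parsimony_coefficient=0.001, random_state=0)
verifier.fit(X_ver_train, e_train)

print("verifier accuracy on held-out predictions:",
      verifier.score(X_ver_test, e_test))
```

Because the errors of a reasonably good base model are rare, the verifier's training set is typically imbalanced, which is exactly the overfitting difficulty the abstract mentions.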
Recommended publications
  • Metaheuristics
    Metaheuristics. Kenneth Sörensen, University of Antwerp, Belgium; Fred Glover, University of Colorado and OptTek Systems, Inc., USA. 1 Definition: A metaheuristic is a high-level problem-independent algorithmic framework that provides a set of guidelines or strategies to develop heuristic optimization algorithms (Sörensen and Glover, to appear). Notable examples of metaheuristics include genetic/evolutionary algorithms, tabu search, simulated annealing, and ant colony optimization, although many more exist. A problem-specific implementation of a heuristic optimization algorithm according to the guidelines expressed in a metaheuristic framework is also referred to as a metaheuristic. The term was coined by Glover (1986) and combines the Greek prefix meta- (metá, beyond, in the sense of high-level) with heuristic (from the Greek heuriskein or euriskein, to search). Metaheuristic algorithms, i.e., optimization methods designed according to the strategies laid out in a metaheuristic framework, are, as the name suggests, always heuristic in nature. This fact distinguishes them from exact methods, which come with a proof that the optimal solution will be found in a finite (although often prohibitively large) amount of time. Metaheuristics are therefore developed specifically to find a solution that is "good enough" in a computing time that is "small enough". As a result, they are not subject to combinatorial explosion, the phenomenon in which the computing time required to find the optimal solution of NP-hard problems increases as an exponential function of the problem size. Metaheuristics have been demonstrated by the scientific community to be a viable, and often superior, alternative to more traditional (exact) methods of mixed-integer optimization such as branch and bound and dynamic programming.
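As a concrete illustration of the "good enough solution in small enough time" idea in the entry above, the sketch below implements a tiny simulated-annealing style search for a toy one-dimensional function. It is a generic example written for this listing, not code from the cited chapter.

```python
# Generic simulated-annealing sketch: a metaheuristic that accepts worse moves
# with decreasing probability in order to escape local optima. Toy example only.
import math
import random

def objective(x):
    # A one-dimensional function with several local minima.
    return x * x + 10.0 * math.sin(3.0 * x)

def simulated_annealing(steps=5000, temp0=5.0, cooling=0.999):
    random.seed(0)
    current = random.uniform(-10.0, 10.0)
    best = current
    temp = temp0
    for _ in range(steps):
        candidate = current + random.gauss(0.0, 0.5)   # small random move
        delta = objective(candidate) - objective(current)
        # Always accept improvements; accept worse moves with probability exp(-delta/temp).
        if delta < 0 or random.random() < math.exp(-delta / temp):
            current = candidate
        if objective(current) < objective(best):
            best = current
        temp *= cooling                                 # cool down over time
    return best, objective(best)

print(simulated_annealing())
```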
  • Genetic Programming: Theory, Implementation, and the Evolution of Unconstrained Solutions
    Genetic Programming: Theory, Implementation, and the Evolution of Unconstrained Solutions
    Alan Robinson. Division III Thesis. Thesis committee: Lee Spector (Hampshire College), Jaime Davila, Mark Feinstein. May 2001.
    Contents (Part I: Background)
    1 Introduction
      1.1 Background – Automatic Programming
      1.2 This Project
      1.3 Summary of Chapters
    2 Genetic Programming Review
      2.1 What is Genetic Programming: A Brief Overview
      2.2 Contemporary Genetic Programming: In Depth
      2.3 Prerequisite: A Language Amenable to (Semi) Random Modification
      2.4 Steps Specific to Each Problem
        2.4.1 Create fitness function
        2.4.2 Choose run parameters
        2.4.3 Select function / terminals
      2.5 The Genetic Programming Algorithm in Action
        2.5.1 Generate random population
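The problem-specific steps listed in that overview (choose a function/terminal set, set run parameters, generate a random population) can be illustrated with a small tree-initialization routine. The sketch below uses the grow method to build random expression trees; it is a generic, self-contained example, not code from the referenced thesis.

```python
# Generic sketch of GP population initialization with the "grow" method:
# nodes are drawn from the function set or the terminal set at random until
# the maximum depth is reached, where only terminals are allowed.
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2}           # function set with arities
TERMINALS = ["x", "y", 1.0, 2.0, 3.0]          # terminal set (variables and constants)

def grow_tree(max_depth, rng):
    # At depth 0, or with 50% probability, pick a terminal; otherwise a function.
    if max_depth == 0 or rng.random() < 0.5:
        return rng.choice(TERMINALS)
    func = rng.choice(list(FUNCTIONS))
    children = [grow_tree(max_depth - 1, rng) for _ in range(FUNCTIONS[func])]
    return (func, *children)

def init_population(size, max_depth=4, seed=0):
    rng = random.Random(seed)
    return [grow_tree(max_depth, rng) for _ in range(size)]

def to_infix(tree):
    # Terminals are printed as-is; function nodes are printed in infix notation.
    if not isinstance(tree, tuple):
        return str(tree)
    func, left, right = tree
    return f"({to_infix(left)} {func} {to_infix(right)})"

for individual in init_population(5):
    print(to_infix(individual))
```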
  • A Hybrid LSTM-Based Genetic Programming Approach for Short-Term Prediction of Global Solar Radiation Using Weather Data
    Processes article. A Hybrid LSTM-Based Genetic Programming Approach for Short-Term Prediction of Global Solar Radiation Using Weather Data. Rami Al-Hajj 1,*, Ali Assi 2, Mohamad Fouad 3 and Emad Mabrouk 1. 1 College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait; [email protected] 2 Independent Researcher, Senior IEEE Member, Montreal, QC H1X1M4, Canada; [email protected] 3 Department of Computer Engineering, University of Mansoura, Mansoura 35516, Egypt; [email protected] * Correspondence: [email protected] or [email protected] Abstract: The integration of solar energy in smart grids and other utilities is continuously increasing due to its economic and environmental benefits. However, the uncertainty of available solar energy creates challenges regarding the stability of the generated power and the consistency of the supply-demand balance. An accurate global solar radiation (GSR) prediction model can ensure overall system reliability and power generation scheduling. This article describes a nonlinear hybrid model based on Long Short-Term Memory (LSTM) models and the Genetic Programming technique for short-term prediction of global solar radiation. The LSTMs are Recurrent Neural Network (RNN) models that are successfully used to predict time-series data. We use these models as base predictors of GSR using weather and solar radiation (SR) data. Genetic programming (GP) is an evolutionary heuristic computing technique that enables automatic search for complex solution formulas. We use the GP in a post-processing stage to combine the LSTM models' outputs to find the best prediction of the
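The combination step described in that abstract (GP as a post-processor over the base predictors' outputs) can be sketched independently of the paper's actual pipeline. Below, the LSTM forecasts are assumed to be already available as columns of a matrix and are mocked with noisy data, and gplearn's SymbolicRegressor stands in for the GP combiner; names and parameters are illustrative, not the authors' implementation.

```python
# Generic sketch: evolve a symbolic formula that combines the forecasts of
# several base predictors into one improved prediction (stacking via GP).
# Assumes gplearn; the base LSTM forecasts are simulated with noisy data here.
import numpy as np
from gplearn.genetic import SymbolicRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 1000
y_true = rng.uniform(0, 1000, n_samples)               # "measured" target values

# Pretend three base models produced noisy forecasts of y_true.
base_preds = np.column_stack([y_true + rng.normal(0, s, n_samples) for s in (40, 60, 80)])

X_train, X_test, y_train, y_test = train_test_split(base_preds, y_true, random_state=0)

combiner = SymbolicRegressor(population_size=500, generations=20,
                             function_set=("add", "sub", "mul", "div"),
                             parsimony_coefficient=0.001, random_state=0)
combiner.fit(X_train, y_train)

print("evolved combination:", combiner._program)
print("R^2 of GP combiner on test data:", combiner.score(X_test, y_test))
```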
  • Long Term Memory Assistance for Evolutionary Algorithms
    Mathematics article. Long Term Memory Assistance for Evolutionary Algorithms. Matej Črepinšek 1,*, Shih-Hsi Liu 2, Marjan Mernik 1 and Miha Ravber 1. 1 Faculty of Electrical Engineering and Computer Science, University of Maribor, 2000 Maribor, Slovenia; [email protected] (M.M.); [email protected] (M.R.) 2 Department of Computer Science, California State University Fresno, Fresno, CA 93740, USA; [email protected] * Correspondence: [email protected] Received: 7 September 2019; Accepted: 12 November 2019; Published: 18 November 2019. Abstract: Short term memory that records the current population has been an inherent component of Evolutionary Algorithms (EAs). As hardware technologies advance, inexpensive memory with massive capacities could become a performance boost for EAs. This paper introduces a Long Term Memory Assistance (LTMA) that records the entire search history of an evolutionary process. With LTMA, individuals already visited (i.e., duplicate solutions) do not need to be re-evaluated, and thus resources originally designated to fitness evaluations can be reallocated to continue search space exploration or exploitation. Three sets of experiments were conducted to prove the superiority of LTMA. In the first experiment, it was shown that LTMA recorded at least 50% more duplicate individuals than a short term memory. In the second experiment, ABC and jDElscop were applied to the CEC-2015 benchmark functions. By avoiding fitness re-evaluation, LTMA improved the execution time of the most time-consuming problems, F03 and F05, by between 7% and 28% and between 7% and 16%, respectively. In the third experiment, a hard real-world problem for determining soil models' parameters, LTMA improved execution time by between 26% and 69%.
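The core mechanism in that paper, remembering already-evaluated individuals so their fitness is never recomputed, can be approximated with a simple lookup table around the fitness function. The sketch below is a generic memoization example, not the LTMA implementation from the article.

```python
# Generic sketch of long-term-memory style fitness caching: individuals seen
# before are looked up instead of re-evaluated, saving fitness evaluations.
import random

def expensive_fitness(individual):
    # Stand-in for a costly evaluation (e.g., a simulation).
    return sum(x * x for x in individual)

class CachedFitness:
    def __init__(self, fitness_fn):
        self.fitness_fn = fitness_fn
        self.archive = {}          # maps genotype -> fitness
        self.hits = 0              # evaluations avoided thanks to the archive

    def __call__(self, individual):
        key = tuple(individual)    # genotypes must be hashable to be archived
        if key in self.archive:
            self.hits += 1
        else:
            self.archive[key] = self.fitness_fn(individual)
        return self.archive[key]

random.seed(0)
cached = CachedFitness(expensive_fitness)
population = [[random.randint(-3, 3) for _ in range(5)] for _ in range(200)]
scores = [cached(ind) for ind in population]
print("duplicate evaluations avoided:", cached.hits)
```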
  • Investigating the Parameter Space of Evolutionary Algorithms
    Sipper et al. BioData Mining (2018) 11:2, https://doi.org/10.1186/s13040-018-0164-x. Research, Open Access. Investigating the parameter space of evolutionary algorithms. Moshe Sipper 1,2,*, Weixuan Fu 1, Karuna Ahuja 1 and Jason H. Moore 1. * Correspondence: [email protected] 1 Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia 19104-6021, PA, USA. 2 Department of Computer Science, Ben-Gurion University, Beer Sheva 8410501, Israel. Abstract: Evolutionary computation (EC) has been widely applied to biological and biomedical data. The practice of EC involves the tuning of many parameters, such as population size, generation count, selection size, and crossover and mutation rates. Through an extensive series of experiments over multiple evolutionary algorithm implementations and 25 problems we show that parameter space tends to be rife with viable parameters, at least for the problems studied herein. We discuss the implications of this finding in practice for the researcher employing EC. Keywords: Evolutionary algorithms, Genetic programming, Meta-genetic algorithm, Parameter tuning, Hyper-parameter. Introduction: Evolutionary computation (EC) has been widely applied to biological and biomedical data [1–4]. One of the crucial tasks of the EC practitioner is the tuning of parameters. The fitness-select-vary paradigm comes with a plethora of parameters relating to the population, the generations, and the operators of selection, crossover, and mutation. It seems natural to ask whether the myriad parameters can be obtained through some clever methodology (perhaps even an evolutionary one) rather than by trial and error; indeed, as we shall see below, such methods have been previously devised. Our own interest in the issue of parameters stems partly from a desire to better understand evolutionary algorithms (EAs) and partly from our recent investigation into the design and implementation of an accessible artificial intelligence system [5].
  • A Genetic Programming-Based Low-Level Instructions Robot for Realtimebattle
    entropy Article A Genetic Programming-Based Low-Level Instructions Robot for Realtimebattle Juan Romero 1,2,* , Antonino Santos 3 , Adrian Carballal 1,3 , Nereida Rodriguez-Fernandez 1,2 , Iria Santos 1,2 , Alvaro Torrente-Patiño 3 , Juan Tuñas 3 and Penousal Machado 4 1 CITIC-Research Center of Information and Communication Technologies, University of A Coruña, 15071 A Coruña, Spain; [email protected] (A.C.); [email protected] (N.R.-F.); [email protected] (I.S.) 2 Department of Computer Science and Information Technologies, Faculty of Communication Science, University of A Coruña, Campus Elviña s/n, 15071 A Coruña, Spain 3 Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, Campus Elviña s/n, 15071 A Coruña, Spain; [email protected] (A.S.); [email protected] (A.T.-P.); [email protected] (J.T.) 4 Centre for Informatics and Systems of the University of Coimbra (CISUC), DEI, University of Coimbra, 3030-790 Coimbra, Portugal; [email protected] * Correspondence: [email protected] Received: 26 November 2020; Accepted: 30 November 2020; Published: 30 November 2020 Abstract: RealTimeBattle is an environment in which robots controlled by programs fight each other. Programs control the simulated robots using low-level messages (e.g., turn radar, accelerate). Unlike other tools like Robocode, each of these robots can be developed using different programming languages. Our purpose is to generate, without human programming or other intervention, a robot that is highly competitive in RealTimeBattle. To that end, we implemented an Evolutionary Computation technique: Genetic Programming.
  • Computational Creativity: Three Generations of Research and Beyond
    Computational Creativity: Three Generations of Research and Beyond. Debasis Mitra, Department of Computer Science, Florida Institute of Technology, [email protected]. Abstract: In this article we have classified computational creativity research activities into three generations. Although the respective system developers were not necessarily targeting their research for computational creativity, we consider their works as contributions to this emerging field. Possibly, the first recognition of the implication of intelligent systems toward creativity came with an AAAI Spring Symposium on AI and Creativity (Dartnall and Kim, 1993). We have here tried to chart the progress of the field by describing some sample projects. Our hope is that this article will provide some direction to interested researchers and help create a vision for the community. 1. Introduction: One of the meanings of the word "create" is "to produce by imaginative skill" and that of the word "creativity" is "the ability to create," according to the Webster Dictionary. 2. Philosophical Angles: Philosophers try to understand creativity from the historical perspectives – how different acts of creativity (primarily in science) might have happened. Historical investigation of the process involved in scientific discovery relied heavily on philosophical viewpoints. Within philosophy there is an ongoing old debate regarding whether the process of scientific discovery has a normative basis. Within the computing community this question transpires in asking if analyzing and computationally emulating creativity is feasible or not. In order to answer this question artificial intelligence (AI) researchers have tried to develop computing systems to mimic scientific discovery processes (e.g., BACON, KEKADA, etc. that we will discuss), almost since the beginning of the inception of
  • Geometric Semantic Genetic Programming Algorithm and Slump Prediction
    Geometric Semantic Genetic Programming Algorithm and Slump Prediction. Juncai Xu 1, Zhenzhong Shen 1, Qingwen Ren 1, Xin Xie 2, and Zhengyu Yang 2. 1 College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing 210098, China. 2 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA. Abstract: Research on the performance of recycled concrete as a building material is an important subject in the current world. Given the complex composition of recycled concrete, conventional methods for forecasting slump scarcely obtain satisfactory results. Based on the theory of nonlinear prediction methods, we propose a recycled concrete slump prediction model based on geometric semantic genetic programming (GSGP) and combine it with recycled concrete features. Tests show that the model can accurately predict the recycled concrete slump: the established prediction model is used to calculate the slump of recycled concrete with different mixing ratios in practical projects, and the predicted values are compared with the experimental values. By comparing the model with several other nonlinear prediction models, we can conclude that GSGP has higher accuracy and reliability than conventional methods. Keywords: recycled concrete; geometric semantics; genetic programming; slump. 1. Introduction: The rapid development of the construction industry has resulted in a huge demand for concrete, which, in turn, has caused overexploitation of natural sand and gravel as well as serious damage to the ecological environment. Such demand produces a large amount of waste concrete in construction, entailing high costs for dealing with these wastes [1-3]. In recent years, various properties of recycled concrete were validated by researchers from all over the world to protect the environment and reduce processing costs.
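Geometric semantic genetic programming, which the entry above builds on, differs from standard GP in that its operators act on the semantics (the vectors of program outputs) rather than on syntax. A common formulation, due to Moraglio et al., defines crossover of parents T1 and T2 as TR·T1 + (1−TR)·T2 for a random program TR with outputs in [0, 1]. The sketch below shows these operators at the semantics level only; it is a simplified illustration, not the GSGP system used in the paper.

```python
# Simplified sketch of geometric semantic operators applied directly to the
# semantics (vectors of program outputs on the training cases).
import numpy as np

rng = np.random.default_rng(0)
n_cases = 8

# Semantics of two parent programs: their outputs on the training cases.
sem_t1 = rng.uniform(0, 10, n_cases)
sem_t2 = rng.uniform(0, 10, n_cases)

def gs_crossover(s1, s2, rng):
    # Offspring semantics lie on the segment between the parents:
    # TR * T1 + (1 - TR) * T2, with TR a random program whose outputs are in [0, 1].
    tr = rng.uniform(0.0, 1.0, s1.shape)
    return tr * s1 + (1.0 - tr) * s2

def gs_mutation(s, rng, step=0.1):
    # Offspring = parent + step * (TR1 - TR2), a bounded random perturbation.
    tr1 = rng.uniform(0.0, 1.0, s.shape)
    tr2 = rng.uniform(0.0, 1.0, s.shape)
    return s + step * (tr1 - tr2)

child = gs_crossover(sem_t1, sem_t2, rng)
mutant = gs_mutation(sem_t1, rng)

# The child's output on every case stays between the two parents' outputs.
assert np.all(child >= np.minimum(sem_t1, sem_t2) - 1e-12)
assert np.all(child <= np.maximum(sem_t1, sem_t2) + 1e-12)
print("parent 1:", np.round(sem_t1, 2))
print("parent 2:", np.round(sem_t2, 2))
print("child   :", np.round(child, 2))
```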
  • A Genetic Programming Strategy to Induce Logical Rules for Clinical Data Analysis
    Processes article. A Genetic Programming Strategy to Induce Logical Rules for Clinical Data Analysis. José A. Castellanos-Garzón 1,2,*, Yeray Mezquita Martín 1, José Luis Jaimes Sánchez 3, Santiago Manuel López García 3 and Ernesto Costa 2. 1 Department of Computer Science and Automatic, Faculty of Sciences, BISITE Research Group, University of Salamanca, Plaza de los Caídos, s/n, 37008 Salamanca, Spain; [email protected] 2 CISUC, Department of Computer Engineering, ECOS Research Group, University of Coimbra, Pólo II - Pinhal de Marrocos, 3030-290 Coimbra, Portugal; [email protected] 3 Instituto Universitario de Estudios de la Ciencia y la Tecnología, University of Salamanca, 37008 Salamanca, Spain; [email protected] (J.L.J.S.); [email protected] (S.M.L.G.) * Correspondence: [email protected] Received: 31 October 2020; Accepted: 26 November 2020; Published: 27 November 2020. Abstract: This paper proposes a machine learning approach dealing with genetic programming to build classifiers through logical rule induction. In this context, we define and test a set of mutation operators across different clinical datasets to improve the performance of the proposal for each dataset. The use of genetic programming for rule induction has generated interesting results in machine learning problems. Hence, genetic programming represents a flexible and powerful evolutionary technique for the automatic generation of classifiers. Since logical rules disclose knowledge from the analyzed data, we use such knowledge to interpret the results and filter the most important features from clinical data as a process of knowledge discovery. The ultimate goal of this proposal is to provide the experts in the data domain with prior knowledge (as a guide) about the structure of the data and the rules found for each class, especially to track dichotomies and inequality.
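To make the idea of evolving logical rules and mutating them concrete, the sketch below represents a classification rule as a conjunction of attribute comparisons and applies a simple threshold-shifting mutation. All names and thresholds are hypothetical, and this is a generic illustration, not the operators defined in the paper.

```python
# Generic sketch: a logical rule as a conjunction of (feature, operator, threshold)
# conditions, plus one example mutation operator that perturbs a threshold.
import random

RULE = [("age", ">", 50.0), ("blood_pressure", "<=", 140.0), ("glucose", ">", 6.5)]

def rule_matches(rule, record):
    # A record satisfies the rule only if every condition holds (logical AND).
    for feature, op, threshold in rule:
        value = record[feature]
        if op == ">" and not value > threshold:
            return False
        if op == "<=" and not value <= threshold:
            return False
    return True

def mutate_threshold(rule, rng, scale=0.1):
    # Pick one condition and shift its threshold by a small relative amount.
    new_rule = list(rule)
    i = rng.randrange(len(new_rule))
    feature, op, threshold = new_rule[i]
    new_rule[i] = (feature, op, threshold * (1.0 + rng.uniform(-scale, scale)))
    return new_rule

rng = random.Random(0)
patient = {"age": 63, "blood_pressure": 130, "glucose": 7.2}
print("rule fires:", rule_matches(RULE, patient))
print("mutated rule:", mutate_threshold(RULE, rng))
```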
  • Evolving Evolutionary Algorithms Using Linear Genetic Programming
    Evolving Evolutionary Algorithms Using Linear Genetic Programming. Mihai Oltean, [email protected], Department of Computer Science, Babeş-Bolyai University, Kogalniceanu 1, Cluj-Napoca 3400, Romania. Abstract: A new model for evolving Evolutionary Algorithms is proposed in this paper. The model is based on the Linear Genetic Programming (LGP) technique. Every LGP chromosome encodes an EA which is used for solving a particular problem. Several Evolutionary Algorithms for function optimization, the Traveling Salesman Problem and the Quadratic Assignment Problem are evolved by using the considered model. Numerical experiments show that the evolved Evolutionary Algorithms perform similarly and sometimes even better than standard approaches for several well-known benchmarking problems. Keywords: Genetic algorithms, genetic programming, linear genetic programming, evolving evolutionary algorithms. 1 Introduction: Evolutionary Algorithms (EAs) (Goldberg, 1989; Holland, 1975) are new and powerful tools used for solving difficult real-world problems. They have been developed in order to solve some real-world problems that the classical (mathematical) methods failed to successfully tackle. Many of these unsolved problems are (or could be turned into) optimization problems. Solving an optimization problem means finding solutions that maximize or minimize a criterion function (Goldberg, 1989; Holland, 1975; Yao et al., 1999). Many Evolutionary Algorithms have been proposed for dealing with optimization problems. Many solution representations
  • Genetic Programming for Julia: Fast Performance and Parallel Island Model Implementation
    Genetic Programming for Julia: fast performance and parallel island model implementation. Morgan R. Frank. November 30, 2015. Abstract: I introduce a Julia implementation for genetic programming (GP), which is an evolutionary algorithm that evolves models as syntax trees. While some abstract high-level genetic algorithm packages, such as GeneticAlgorithms.jl, already exist for Julia, this package is not optimized for genetic programming, and I provide a relatively fast implementation here by utilizing the low-level Expr Julia type. The resulting GP implementation has a simple programmatic interface that provides ample access to the parameters controlling the evolution. Finally, I provide the option for the GP to run in parallel using the highly scalable "island model" for genetic algorithms, which has been shown to improve search results in a variety of genetic algorithms by maintaining solution diversity and explorative dynamics across the global population of solutions. 1 Introduction: Evolving human beings and other complex organisms from single-cell bacteria is indeed a daunting task, yet biological evolution has been able to provide a variety of solutions to the problem. Evolutionary algorithms [1–3] refers to a field of algorithms that solve problems by mimicking the structure and dynamics of genetic evolution. [...] be inflexible when attempting to handle diverse problems. Julia's easy vectorization, access to low-level data types, and user-friendly parallel implementation make it an ideal language for developing a GP for symbolic regression. Finally, I will discuss a powerful and widely used method for utilizing multiple computing cores to improve solution discovery using evolutionary algorithms. Since
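The island model mentioned in that abstract runs several sub-populations and periodically migrates the best individuals between them to maintain diversity. The sketch below is a generic, single-process illustration of that migration scheme on a toy bit-string problem; it is not the Julia package's implementation.

```python
# Generic island-model sketch: several sub-populations evolve independently and
# exchange their best individuals every few generations (ring migration).
import random

rng = random.Random(0)
GENOME_LEN, N_ISLANDS, POP_PER_ISLAND = 30, 4, 20

def fitness(bits):
    return sum(bits)                        # toy "OneMax" objective

def evolve_one_generation(pop):
    # Tournament selection plus per-bit flip mutation.
    new_pop = []
    for _ in range(len(pop)):
        parent = max(rng.sample(pop, 3), key=fitness)
        child = [b ^ (rng.random() < 1.0 / GENOME_LEN) for b in parent]
        new_pop.append(child)
    return new_pop

islands = [[[rng.randint(0, 1) for _ in range(GENOME_LEN)]
            for _ in range(POP_PER_ISLAND)] for _ in range(N_ISLANDS)]

for generation in range(1, 101):
    islands = [evolve_one_generation(pop) for pop in islands]
    if generation % 10 == 0:                # migration interval
        # Send each island's best individual to the next island in a ring,
        # replacing that island's worst individual.
        migrants = [max(pop, key=fitness) for pop in islands]
        for i, pop in enumerate(islands):
            worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
            pop[worst] = migrants[(i - 1) % N_ISLANDS]

print("best fitness:", max(fitness(ind) for pop in islands for ind in pop))
```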
  • PyGAD: An Intuitive Genetic Algorithm Python Library
    PyGAD: An Intuitive Genetic Algorithm Python Library. Ahmed Fawzy Gad, School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, Canada, [email protected]. Abstract: This paper introduces PyGAD, an open-source easy-to-use Python library for building the genetic algorithm. PyGAD supports a wide range of parameters to give the user control over everything in its life cycle. This includes, but is not limited to, population, gene value range, gene data type, parent selection, crossover, and mutation. PyGAD is designed as a general-purpose optimization library that allows the user to customize the fitness function. Its usage consists of 3 main steps: build the fitness function, create an instance of the pygad.GA class, and call the pygad.GA.run() method. The library supports training deep learning models created either with PyGAD itself or with frameworks like Keras and PyTorch. Given its stable state, PyGAD is also in active development to respond to the user's requested features and enhancements received on GitHub: github.com/ahmedfgad/GeneticAlgorithmPython. PyGAD comes with documentation at pygad.readthedocs.io for further details and examples. Keywords: genetic algorithm, evolutionary algorithm, optimization, deep learning, Python, NumPy, Keras, PyTorch. I. Introduction: Nature has inspired computer scientists to create algorithms for solving computational problems. These naturally-inspired algorithms are called evolutionary algorithms (EAs) [1] where the initial solution (or individual) to a given problem is evolved across multiple iterations aiming to increase its quality. The EAs can be categorized by different factors like the number of solutions used. Some EAs evolve a single initial solution (e.g.
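The three usage steps listed in that abstract map directly onto a few lines of code. The sketch below is a minimal illustration that evolves a weight vector whose weighted sum matches a target value, assuming a recent PyGAD release (in PyGAD 3.x the fitness function receives the GA instance as its first argument; older 2.x versions omit it).

```python
# Minimal PyGAD sketch following the three steps from the paper's abstract:
# 1) build the fitness function, 2) create a pygad.GA instance, 3) call run().
# Assumes PyGAD 3.x (pip install pygad); in 2.x the fitness function takes
# only (solution, solution_idx).
import numpy
import pygad

inputs = numpy.array([4, -2, 3.5, 5, -11, -4.7])
desired_output = 44.0

def fitness_func(ga_instance, solution, solution_idx):
    # Higher fitness for weight vectors whose weighted sum is closer to the target.
    output = numpy.sum(solution * inputs)
    return 1.0 / (abs(output - desired_output) + 1e-6)

ga_instance = pygad.GA(num_generations=100,
                       num_parents_mating=4,
                       sol_per_pop=10,
                       num_genes=len(inputs),
                       fitness_func=fitness_func)
ga_instance.run()

solution, solution_fitness, _ = ga_instance.best_solution()
print("best weights :", solution)
print("weighted sum :", numpy.sum(solution * inputs))
```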