GPDL: A Framework-Independent Problem Definition Language for Grammar-Guided Genetic Programming

Gabriel Kronberger (gabriel.kronberger@fh-hagenberg.at)
Michael Kommenda (michael.kommenda@fh-hagenberg.at)
Stefan Wagner (stefan.wagner@fh-hagenberg.at)
Heinz Dobler (heinz.dobler@fh-hagenberg.at)

University of Applied Sciences Upper Austria
School for Informatics, Communications and Media
Softwarepark 11, 4232 Hagenberg, Austria

ABSTRACT

Defining custom problem types in genetic programming (GP) software systems is a tedious task that usually involves the implementation of custom classes and methods including framework-specific code. Users who want to solve a custom problem have to know the details of the targeted framework, for instance cloning semantics, and often have to write a lot of boilerplate code in order to implement the necessary functionality correctly. This can lead to frustration and hinders new developments and the application of GP to solve interesting problems.

In this contribution we propose a framework-independent definition language for GP problems that can reduce the required effort and facilitate the integration of new problem types. We draw a parallel between the implementation of compilers for programming languages and the implementation of GP problems and reuse the well-established concept of attributed grammars with semantic actions to define computational symbols, semantics and structural constraints for GP. This goes beyond previous work in the area of context-free-grammar GP and grammatical evolution, because we also interweave the definition of symbol semantics and the target function with the definition of the grammar.

This paper describes the proposed GP problem definition language (GPDL) and exemplary definitions of two popular benchmark problems using GPDL. We also describe a reference implementation of a GPDL compiler for HeuristicLab.

Categories and Subject Descriptors

I.2.2 [Automatic Programming]: Program synthesis; I.2.8 [Problem Solving, Control Methods, and Search]: Heuristic Methods

Keywords

Domain Specific Languages, Evolutionary Computation Software Systems, Genetic Programming

1. MOTIVATION

Most genetic programming software systems are not simple enough to be readily used as standard tools for solving optimization problems. In many GP systems (e.g., HeuristicLab, ECJ, JGAP, and OpenBEAGLE) the definition of a custom GP problem usually includes the implementation of a number of classes, including classes for symbols, classes for constraints and classes for evaluation. In this process, users must also write a lot of framework-specific boilerplate code, e.g., for cloning, persistence, or algorithm analysis. As a result, a large part of the source code is not directly related to the problem definition and is only necessary to fit the classes into the existing framework. Users therefore have to know implementation details to be able to implement the functionality correctly.

As a result, the necessary effort for implementing even simple problems (e.g., symbolic regression or artificial ant) is comparatively large and can be estimated at several days for inexperienced users. The effort for larger and more complex problems is often much higher, especially if the symbol semantics or structural constraints are complex. This is too much effort for users who are primarily interested in solving a given problem using GP, or only experimenting if GP would be a usable method at all. As a result, GP systems are not used frequently, because the effort is prohibitively large.

Another important aspect is that several sophisticated and powerful GP systems are available, but a user would have to implement a problem for each framework separately in order to try which system is most suitable.
Our suggestion to improve the current situation is to define a domain specific language (DSL) for the definition of GP problems. This language should be easy to understand, and independent of the framework as well as the programming platform. Using this language, it should be possible to define GP problems in an abstract way so that it is possible to solve these problems with various popular GP software systems.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. GECCO’13 Companion, July 6–10, 2013, Amsterdam, The Netherlands. Copyright 2013 ACM 978-1-4503-1964-5/13/07 ...$15.00.

A similar development has occurred in the closely related area of mathematical programming or constraint programming. A large number of solvers using sophisticated algorithmic machinery have been developed and partially canned into software products (e.g., CPLEX solver). Domain specific languages for mathematical programming (e.g., OPL or AMPL) have been developed to allow the formulation of such optimization problems independently from specific solvers. An important aspect of such languages is the separation of the problem formulation from the solver algorithm. Many solvers support standard languages so that it is easy to experiment with different solvers.

Generally, GP can be seen as a solver for problems where the goal is to find a solution in form of a program which is optimal with respect to the given target function. There are many different approaches to GP which all fit into this general description, and also other kinds of solvers which are applicable to this problem.

In this contribution we describe the first implementation of a framework-independent language for GP problem definitions (GPDL). We aim to raise awareness of the issues discussed above and to facilitate discussion on the viability and usefulness of the proposed solution among the community of GP software system developers. We also describe the reference implementation of a compiler for GPDL targeting the HeuristicLab framework [11], [6]. The reference implementation should help developers who want to implement the language for other GP software systems.

2. RELATED WORK

Context-free grammars (CFG) have already been used to define the set of possible solution candidates in context-free-grammar GP [13] and grammatical evolution [9]. This is a powerful idea that also seems natural because the promise of GP is to find programs solving a given problem and programs are usually specified in a programming language with syntax defined by a grammar. A recent survey of grammar guided GP approaches is given in [8].

Some popular GP paradigms, for instance stack-based GP systems, most notably PushGP, or other forms of linear GP that evolve assembler, binary or byte code directly, use a very simple syntax. These systems rely more on the semantics of operations; the set of computational symbols is predefined and fixed, so the evolutionary operators, crossover and mutation, are designed to take operation semantics into account. In these GP systems the set of possible solution candidates is also defined by a grammar but the semantics are much more important, so the grammar-based approach for problem definition does not fit very well. Similarly, Cartesian GP is also not well-suited as it uses a graph-

[...] language. The language specification is usually based on an attributed grammar [5] with semantic actions. The grammar defines the syntax and the language semantics are defined through a combination of semantic actions and symbol attributes. Typically, semantic actions are specified using source code in the target programming language. The compiler generator interweaves the source code for semantic actions with the generated source code for parsing the input token stream.

In the definite clause translation grammar GP (DCTG-GP) [10] system the concept of attributed grammars with semantic actions was already used for the definition of GP problems to reduce the implementation effort. As in other CFG-GP or GE systems, the problem description encompasses the set of computational symbols and the grammar for solution candidates. However, the definition of symbol semantics is also included in the problem description document. Including semantic actions directly into the problem definition has the benefit that it is possible to specify all details of the GP problem in one self-contained document without writing boilerplate code. McKay et al. state that definition of semantics in DCTG-GP “is substantially simpler than for a standard GP system”, and that “the programming cost of targeting a new problem is generally small – in the case of DCTG-GP, quite typically ten or twenty lines of code” [8].

However, the DCTG-GP problem definition language is rather hard to read and specific to Prolog; additionally, the fitness function has to be implemented separately. The language format has not been used in any other GP systems besides DCTG-GP.

3. DESIGN CONSIDERATIONS

Based on our research of previous work and the initial motivation of our idea, our design goals for GPDL are simplicity, expressivity, and generality:

• Simplicity: Users should be able to define custom problems in less time, so that it is possible to quickly try if the problem can be solved by GP and to try different variants of the problem formulation. In particular, it should not be necessary to familiarize oneself with the details of the target framework and it should not be necessary to write boilerplate code to integrate a new problem. Furthermore, it should not be necessary to separately recompile code when adapting or extending the problem definition. In an exaggerated scenario, it should be possible to simply call a “solver” executable with a GP problem definition so that the solver produces a solution and its fitness value after a specified time interval.