libCudaOptimize: an Open Source Library of GPU-based Metaheuristics

To cite this version:

Youssef S.G. Nashed, Roberto Ugolotti, Pablo Mesejo, Stefano Cagnoni. libCudaOptimize: an Open Source Library of GPU-based Metaheuristics. 14th Genetic and Evolutionary Computation Conference companion (GECCO'12), Jul 2012, Philadelphia, United States. pp. 117-124. DOI: 10.1145/2330784.2330803.

HAL Id: hal-01221652 (https://hal.inria.fr/hal-01221652), submitted on 28 Oct 2015.


Youssef S.G. Nashed, Roberto Ugolotti, Pablo Mesejo, Stefano Cagnoni
Dept. of Information Engineering, University of Parma, Italy

ABSTRACT

Evolutionary Computation techniques and other metaheuristics have been increasingly used in recent years for solving many real-world tasks that can be formulated as optimization problems. Among their numerous strengths, a major one is their natural predisposition to parallelization.

In this paper, we introduce libCudaOptimize, an open source library which implements some metaheuristics for continuous optimization: presently Particle Swarm Optimization, Differential Evolution, Scatter Search, and Solis&Wets local search. This library allows users either to apply these metaheuristics directly to their own fitness function or to extend it by implementing their own parallel optimization techniques. The library is written in CUDA-C to make extensive use of the parallelization capabilities offered by Graphics Processing Units.

After describing the library, we consider two practical case studies: the optimization of a fitness function for the automatic localization of anatomical brain structures in histological images, and the parallel implementation of Simulated Annealing as a new module, which extends the library while keeping code compatibility with it, so that the new method is readily available for future use within the library as an alternative optimization technique.

Categories and Subject Descriptors

I.2.5 [Artificial Intelligence]: Programming Languages and Software; D.2.13 [Reusable Software]: Reusable libraries

General Terms

Algorithms, Design

Keywords

Open Source Library, GPGPU, CUDA, Particle Swarm Optimization, Differential Evolution, Scatter Search, Solis and Wets local search

1. INTRODUCTION

In the last decades, several metaheuristics have been developed (among others, Particle Swarm Optimization (PSO) [6], Differential Evolution (DE) [16], Ant Colony Optimization (ACO) [3], and Scatter Search (SS) [5]) and applied to a large variety of problems and fields. To facilitate the use of Evolutionary Computation (EC) methods in optimization problems, several software environments and libraries have been developed, like HeuristicLab [20], Matlab Optimization Toolbox [17], CILib [13], jMetal [4] or JCLEC [19]. Recently, these metaheuristics have also been implemented on Graphics Processing Units (GPU) [8, 10], fully exploiting their intrinsic parallelism and obtaining significant speedups (up to 30 times) compared to single-thread CPU implementations. However, no open-source software has been released to easily take advantage of this aspect.

The main idea behind our work is to offer users the chance to apply metaheuristics to their own problems of interest as simply and quickly as possible, exploiting the parallelization opportunities offered by modern GPUs as much as possible. To the best of our knowledge, there are no other software tools in which the entire optimization process, from exploration operators to function evaluation, is completely developed on the GPU, and which allow one to develop both local and global optimization methods. Only in the last years have some packages, like ParadisEO [9], started to use GPUs to speed up their algorithms; however, as it stands, this support is limited to parallel data access during fitness evaluation, while the method itself is still executed sequentially. Also, the GPU implementations available so far in ParadisEO are strictly limited to local optimization/search methods.
We present libCudaOptimize, a GPU-based open source library that allows users to run their methods in parallel to optimize a fitness function, introduce a new optimization algorithm, or easily modify/extend existing ones. In the first case, the only thing one needs to do is to write the new fitness function in C++ or CUDA-C, while in the second and third cases, one can take advantage of the framework offered by the library to avoid dealing with low-level implementation issues, especially regarding parallel code.

libCudaOptimize is expected to be used by users who have, at least, a basic knowledge of C++. No explicit understanding of CUDA-C, or even of metaheuristics, is required, although it is certainly useful; one can use this library just by writing a C++ fitness function and launching one of the optimization techniques already implemented (to date PSO, DE, SS and the Solis&Wets local search (SW) [15]). This allows one to:

• implement commonly successful techniques with limited effort;

• easily compare the results obtained by running different techniques on different functions;

• analyze the effects of changing the values of the parameters which regulate the behavior of the optimization techniques on user-defined problems;

• run high-dimensional optimization experiments on consumer-level hardware, thanks to the efficient CUDA-C parallel implementation.

The remainder of this work is organized as follows: in section 2 libCudaOptimize is described, then the operations needed to start working with the library are presented in section 3, followed by two case studies in section 4 and by conclusions in section 5.

2. THE PACKAGE

2.1 Implemented Methods

In the present version, the library implements three different global optimization methods (PSO, DE and SS) and one local search technique (SW), which was added to demonstrate that the possibilities of this tool go beyond the implementation of population-based metaheuristics.

2.1.1 Particle Swarm Optimization

Particle Swarm Optimization [6] is a bio-inspired optimization algorithm based on the simulation of the social behavior of bird flocks. In the last fifteen years PSO has been applied to a very large variety of problems [14] and numerous variants of the algorithm have been presented [1].

During the execution of PSO, a set of particles moves within the function domain searching for the optimum of the function (best fitness value). The motion of each particle is driven by the best positions visited so far by the particle itself and by the entire swarm (gbest PSO) or by some pre-defined neighborhood of the particle (lbest PSO). Consequently, each particle relies both on "individual" and on "swarm" intelligence, and its motion can be described by the following two simple equations which regulate the velocity and position updates:

vn(t) = w · vn(t − 1) + c1 · rand() · (BPn − Pn(t − 1)) + c2 · rand() · (BLPn − Pn(t − 1))

Pn(t) = Pn(t − 1) + vn(t)

where Pn(t) and vn(t) are the position and velocity of the nth particle in iteration t; c1, c2 and w (inertia factor) are positive constants, rand() returns random values uniformly distributed in [0, 1], BPn is the best-fitness position visited so far by the particle, and BLPn is the best-fitness position visited so far by any particle of a neighborhood of the particle (which may be as large as the current swarm: in this case, this position would be the global best).

In particular, the PSO version implemented in this library is the one described in [10]: an lbest PSO relying on a ring topology with two neighbors and a constant inertia factor.
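To illustrate how these update equations map onto the GPU, the toy CUDA-C kernel below performs the velocity and position updates with one block per particle and one thread per dimension. This is only a sketch written for this description, not the library's actual kernel: the function name, the flat array layout, and the curand state array are all our assumptions.

    #include <curand_kernel.h>

    // Illustrative sketch (not library code): one block per particle, one
    // thread per coordinate; arrays are laid out as [particle * dims + dim].
    __global__ void psoUpdate(float* positions, float* velocities,
                              const float* bestPositions,      // BPn
                              const float* localBestPositions, // BLPn
                              curandState* states,
                              float w, float c1, float c2)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // this coordinate

        curandState s = states[i];
        float v = w * velocities[i]
                + c1 * curand_uniform(&s) * (bestPositions[i] - positions[i])
                + c2 * curand_uniform(&s) * (localBestPositions[i] - positions[i]);

        velocities[i] = v;   // velocity update
        positions[i] += v;   // position update
        states[i] = s;       // persist the RNG state
    }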

2.1.2 Differential Evolution

Differential Evolution [16] recently gained credit as one of the most successful evolutionary algorithms. DE perturbs the current population members with the scaled differences of other individuals. Every element of the population acts as a parent vector and, for each of them, a donor vector is created. In the basic version of DE, the donor vector for the ith parent (Xi) is generated by combining three random and distinct elements Xr1, Xr2 and Xr3. The donor vector Vi is calculated as:

Vi = Xr1 + F · (Xr2 − Xr3)

where F (scale factor) is a parameter that strongly influences DE's performance and typically lies in the interval [0.4, 1]. After mutation, every parent-donor pair generates an offspring (called the trial vector) by means of a crossover operation, regulated by a second control parameter Cr, the crossover rate. The trial vector is then evaluated and, if its fitness is better than the parent's, it replaces the parent in the population.

The library offers the choice between the two most commonly used kinds of crossover (binomial, also called uniform, and exponential). With respect to mutation schemes [2], DE/rand/1 (explained above), DE/target-to-best/1, and DE/best/1 are available.
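As a concrete illustration of DE/rand/1 with binomial crossover, the sketch below shows one possible CUDA-C kernel, with one block per parent and one thread per dimension. It is our own example, not the library's implementation; in particular, the random indices r1, r2 and r3 are assumed to be drawn on the host, distinct from each other and from the parent.

    #include <curand_kernel.h>

    // Illustrative sketch (not library code) of DE/rand/1 mutation plus
    // binomial crossover. D = blockDim.x is the problem dimensionality.
    __global__ void deRand1Binomial(const float* population, float* trials,
                                    const int* r1, const int* r2, const int* r3,
                                    curandState* states, float F, float Cr)
    {
        int parent = blockIdx.x;
        int dim = threadIdx.x;
        int D = blockDim.x;
        int i = parent * D + dim;

        curandState s = states[i];

        // Donor: Vi = Xr1 + F * (Xr2 - Xr3), one coordinate per thread.
        float donor = population[r1[parent] * D + dim]
                    + F * (population[r2[parent] * D + dim]
                         - population[r3[parent] * D + dim]);

        // Binomial crossover: take the donor gene with probability Cr (a full
        // implementation also forces at least one dimension from the donor).
        trials[i] = (curand_uniform(&s) < Cr) ? donor : population[i];
        states[i] = s;
    }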
2.1.3 Scatter Search

Scatter Search [5] is a population-based algorithm relying on a systematic combination of solutions (instead of a randomized one, as usually happens in EC) taken from a smaller pool of evolved solutions named the reference set R (usually around ten times smaller than typical EC population sizes). R is drawn from a randomly initialized population, and it is composed of:

• a Best Set comprising the |B1| individuals of the initial population having the best fitness values;

• a Diverse Set with the |B2| individuals farthest from the Best Set (with |R| = |B1| + |B2|).

New offspring are generated by combining elements of the two sets. After this step, a local search is performed to improve the offspring, and the two sets are eventually updated by including the best and the most diverse elements, respectively. If, after an iteration, no elements of R are replaced, a new population is created and the individuals that are most distant from the Best Set are used to replace the Diverse Set. These operations are repeated until a stopping criterion is met.

The current implementation of SS in this library uses the BLX-α crossover as combination method and Solis&Wets as improvement technique, while, for the update, two strategies are allowed: the former preserves only the best solutions, the latter preserves both the best ones and the most diverse ones with respect to R.

2.1.4 Solis&Wets local search

The Solis&Wets local search method [15] is a randomized hill-climber with adaptive step size. Each step starts at a point x. A perturbation p is randomly chosen from a Gaussian distribution with standard deviation ρ. If either x + p or x − p has a better fitness than x, a move to the better point is performed and a success is recorded; otherwise the position does not change and a failure is recorded. After N+ consecutive successes ρ is increased, to approach the local optimum faster, while after N− consecutive failures ρ is correspondingly decreased.

In the library, SW starts with a different ρ for every solution, equal to half the distance to its nearest neighbor. The user can set the parameters N+ and N−.
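The self-contained C++ sketch below summarizes one SW step for a single solution, as described above. It is our own illustration, not library code: the function names are invented, minimization is assumed, and the adaptation factors of 2.0 and 0.5 are placeholders (the paper does not specify the values used).

    #include <cstddef>
    #include <random>
    #include <vector>

    // Illustrative sketch of a single Solis&Wets step for one solution.
    void solisWetsStep(std::vector<float>& x, float& rho,
                       int& successes, int& failures, int nPlus, int nMinus,
                       float (*fitness)(const std::vector<float>&))
    {
        static std::mt19937 gen{std::random_device{}()};
        std::normal_distribution<float> gauss(0.0f, rho);

        // Sample one perturbation p and test both x + p and x - p.
        std::vector<float> plus(x), minus(x);
        for (std::size_t d = 0; d < x.size(); ++d) {
            float p = gauss(gen);
            plus[d] += p;
            minus[d] -= p;
        }

        float fx = fitness(x), fp = fitness(plus), fm = fitness(minus);
        if (fp < fx || fm < fx) {            // improvement found: move and
            x = (fp < fm) ? plus : minus;    // record a success
            ++successes; failures = 0;
        } else {                             // no improvement: record a failure
            ++failures; successes = 0;
        }

        if (successes >= nPlus)  { rho *= 2.0f; successes = 0; }  // expand
        if (failures  >= nMinus) { rho *= 0.5f; failures  = 0; }  // contract
    }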
2.2 General-Purpose Computation on Graphics Hardware

General-purpose programming on GPU (GPGPU) is the practice of using a graphics card, which typically handles computations only for computer graphics and gaming, to execute applications traditionally managed by the Central Processing Unit (CPU).

CUDA (Compute Unified Device Architecture) is a GPGPU environment, comprising a parallel computing architecture and a programming model, developed by nVIDIA. This programming model requires the problem under consideration to be partitioned into sub-problems, which are solved independently in parallel by blocks of threads. In turn, each sub-problem is partitioned into finer pieces, which can be solved cooperatively in parallel by all threads within the same block. Blocks are organized into a one-dimensional, two-dimensional, or three-dimensional grid of thread blocks, as illustrated in Figure 1.

Figure 1: Grid of Thread Blocks [11].

In particular, nVIDIA CUDA-C [11] is an extension of the C language for the development of GPU routines (called kernels), which, when called, are executed N times in parallel by N different CUDA threads. Kernels run on the device (GPU) while the rest of the code runs on the host (CPU). It is also important to notice that in CUDA, host and device have separate memory spaces and, in order to execute a kernel, the programmer needs to explicitly allocate memory on the device and, if needed, transfer data from and back to the host. These transfers are the main bottleneck: to optimize code for speed, the programmer should reduce their amount as much as possible.

Regarding the programming model, and in order to understand the implementation details and the case studies under consideration, there are four variables which are particularly useful for users who want to dig deeply into the library:

• gridDim, which stores the dimensions of the grid as specified during kernel invocation;

• blockIdx, which refers to the block index within the grid;

• blockDim, which contains the dimensions of the block; and

• threadIdx, which stores the thread index within the block.

Each of these variables is a dim3, a CUDA-C data type that represents a three-dimensional vector whose elements can be accessed as x, y and z (a toy kernel illustrating these variables is sketched at the end of this section).

2.3 Implementation details

libCudaOptimize is entirely written in C++ and CUDA-C and relies on two classes: IOptimizer and SolutionSet (see Figure 2). The former is an abstract class that includes all the methods used for evolving a set of solutions (or population/swarm, where every single solution is an individual/particle, depending on the terminology used), for setting the evolution parameters, and a reference to the set itself (more than one set can be evolved in parallel), represented by an instance of the class SolutionSet. Every optimizer is implemented as a sub-class of IOptimizer. All these classes (see some examples at the bottom of Figure 2) have methods that allow a user to set the parameters of the metaheuristic. Moreover, most of the relevant parameters can be passed to the optimizer at the moment of its instantiation.

The class SolutionSet represents one or more sets of solutions and can be accessed in the user-defined fitness function, where it is used to read the elements of the population and to update their fitnesses after evaluation. There are methods that allow users to access the solutions, and their corresponding fitnesses, both on the device and on the host. In this way, the user can easily employ this information in both C++ and CUDA-C functions.

Figure 2: UML diagram. For every class, the most important methods are shown.
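To make the four built-in variables above concrete, the following minimal CUDA-C example (our own toy program, not library code) launches one block per solution and one thread per dimension, the same configuration used by the case studies in section 4, and shows the flat indexing pattern this layout implies.

    #include <cuda_runtime.h>

    // Toy kernel: one block per solution, one thread per coordinate.
    __global__ void scaleSolutions(float* positions, float factor)
    {
        int solution = blockIdx.x;               // which solution
        int dim = threadIdx.x;                   // which coordinate
        int i = solution * blockDim.x + dim;     // flat index into the array
        positions[i] *= factor;
    }

    int main()
    {
        const int nSolutions = 64, nDims = 16;
        float* d_positions;
        cudaMalloc(&d_positions, nSolutions * nDims * sizeof(float));
        // ... fill d_positions with candidate solutions ...
        scaleSolutions<<<nSolutions, nDims>>>(d_positions, 0.5f);
        cudaDeviceSynchronize();
        cudaFree(d_positions);
        return 0;
    }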

3. RUNNING THE LIBRARY

In this section we briefly describe the operations needed to install the library and to start working with it.

3.1 How to install it

libCudaOptimize has been tested on Windows 7 and Ubuntu Linux, using graphics cards with compute capability (CC) ranging from 1.3 to 2.1. GPUs with a lower CC impose some limitations on the use of the library due to hardware limits: e.g. using a GPU with CC 2.0 or higher, the limit on the dimensions of a solution is 1024, while with a GPU with CC 1.3 this limit is lowered to 512.

In order to install libCudaOptimize, the following steps should be followed:

1. Download from the project webpage on Sourceforge (http://sourceforge.net/p/libcudaoptimize/) the two packages: libCudaOptimize, the library proper, and testLibCudaOptimize, a small program which tests whether the installation has been performed correctly. Both source code and binary installers are provided.

2. Build the library from source code: the easiest way to do so is to use CMake (www.cmake.org). To build testLibCudaOptimize, one has to link against the previously installed libCudaOptimize static library and the CUDA utilities library.

3.2 How to use the library

Basically, there are two ways to use this library. The first and most direct one is to apply the included heuristics to optimize a user-defined fitness function. All one needs to do, in this simplest case, is to write the function in C++ or, to fully exploit the parallelization potential of the package, in CUDA-C. Then, one must select the heuristic, set its parameters, run it, and retrieve the solution(s) found. An example of this usage is shown in section 4.1.

The second way to use the library is to design and implement a new optimization technique, taking advantage of the structure of the algorithms implemented in libCudaOptimize. Since several EC methods share a similar structure, one can extend the superclass IOptimizer, or one of its children, in order to create a new optimizer. To do so, a mandatory step is to implement the four protected functions of IOptimizer shown in Figure 2 (a skeleton is sketched at the end of this section):

• initSolutions initializes the candidate solutions within the search space, e.g. by random initialization;

• step defines how the optimizer generates new potential solutions from the current population;

• fitnessEvaluation calls the user's fitness function;

• update is called after fitness evaluation and should update the population according to the results obtained: replace current individuals, update personal best locations, check constraints, and so on.

An example of this usage can be found in section 4.2. It is important to notice that the user does not have to handle memory allocation and release, nor grid and kernel configuration, since these operations are taken care of by the library core.
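For illustration, the outline below shows what a new optimizer could look like. The four method names come from Figure 2; the class body and signatures are our assumptions, not the library's exact declarations.

    // Assumed skeleton of a custom optimizer (signatures are illustrative).
    class MyOptimizer : public IOptimizer
    {
    protected:
        void initSolutions();      // e.g. uniform random init within the bounds
        void step();               // produce new candidate solutions
        void fitnessEvaluation();  // invoke the user-supplied fitness function
        void update();             // accept/reject candidates, update bests
    };

Section 4.2 walks through a concrete instance of this pattern, using Simulated Annealing as the new optimizer.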
4. CASE STUDIES

4.1 Optimization of a fitness function

The first case study examined is a method for the automatic localization of the hippocampus in mouse brain histological images [18]. The localization uses two parametric models that are moved and deformed towards the image, in order to maximize a fitness function which is proportional to their overlap with two major structures of the hippocampus (see Figure 3), using PSO to generate candidate solutions. In the first implementation of this method, in which PSO was coded in Matlab and the fitness function in C++, these two parts were executed sequentially one after the other, but now the process can be run independently in parallel by two swarms.

Figure 3: Example of a hippocampus and the result of the localization phase.

The fitness function takes as inputs: (i) the image that contains the hippocampus, (ii) some vectors that represent the limits of the deformations, and (iii) another vector that represents the coordinates of the model. The values of this last vector (usually having between 14 and 16 elements) are optimized by the methods provided by libCudaOptimize.

Thanks to the library features, we can evolve two swarms in parallel, each of which is responsible for one model. The first step was the conversion of the fitness function from C++ to CUDA-C, which is quite straightforward and will not be explained here. It is worth noting that one needs to supply a C++ function pointer to the optimizer in order to call the CUDA-C kernel (the fitness function). This C++ function receives as input a pointer to the SolutionSet object, from which it can access the coordinates and the fitnesses of the swarm(s), which are passed to the CUDA-C kernel along with any other data that is needed:

    void localizeHippocampus(SolutionSet* s)
    {
        localizeHippocampus_kernel<<<...,...>>>
            (s->getDevicePositions(),
             s->getDeviceFitnesses(),
             ...);
    }

In the main function, all one has to do is choose the optimization technique, e.g. PSO, and create an instance of the corresponding extension of the class IOptimizer, setting, if necessary, some parameters (during instantiation, or later using the functions listed in Figure 2). In this case, these are a pointer to the fitness function, the dimension of the problem (16), the number of swarms (2), and the number of solutions (64).

One can also set other parameters for the optimization, like the termination criterion (which can be total execution time, number of fitness evaluations, number of generations/iterations, or a required fitness value). After all the parameters have been set, one just has to start the optimization process and retrieve the results:

    int main()
    {
        PSO_Optimizer p(&localizeHippocampus, 16, 2, 64);
        p.optimize();
        float* myResults = p.getBestSolution();
    }

As one can see, a C++ (or CUDA-C) fitness function can be bound to libCudaOptimize by adding fewer than 10 lines of code.

Figures 4 and 5 show the execution times we obtained using libCudaOptimize while changing the number of PSO generations (from 100 to 10000) and the number of particles in the swarms (from 16 to 512), respectively. Three different implementations have been compared:

1. both fitness function and PSO implemented in C++ (CPU);

2. fitness function implemented in C++, combined with our parallel PSO (CPU+GPU);

3. both fitness function and PSO implemented in CUDA-C (GPU).

Tests were run on a 64-bit Intel Core i7 CPU running at 2.67 GHz, using CUDA v. 4.1 on an nVIDIA GeForce GTS450 graphics card with 1 GB of DDR memory and compute capability 2.1. For the first test, whose results are plotted in Figure 4, the optimization was run 10 times for each configuration, for a total of 1000 × 3 runs. As for the second, depicted in Figure 5, a total of 1250 × 3 experiments was performed.

Figure 4: Mean execution time (ms) versus PSO generations, swarm size fixed at 64 particles.

Figure 5: Mean execution time (ms) versus number of particles, termination criterion set as 200 generations.

The plots clearly show the GPU version outperforming the other two implementations. In Figure 5, the CPU version appears to perform better than the CPU+GPU one with smaller swarms (fewer than 60 particles). This is due to the overhead introduced by the data transfer between host and device memory. However, when dealing with more complex configurations this overhead is overshadowed by the advantages of the parallel PSO execution.

4.2 Implementation of a new optimization method

As a second case study, we extended the library by implementing a metaheuristic which was not included in its basic set of optimizers: Simulated Annealing (SA) [7]. This technique tries to mimic the physical annealing process, where a material is heated and then slowly cooled into a uniform structure. SA may even perform bad moves (i.e. changes which lead to a worse fitness), according to a probability distribution dependent on the temperature of the system: a move is selected at random and, as the temperature decreases, the probability of accepting a bad move decreases (when the temperature is zero, no bad moves are accepted, i.e. the algorithm behaves like a hill climber).

The main building blocks of the algorithm are three: the candidate solution generation method (sometimes referred to as the neighborhood function), the acceptance probability function mentioned earlier, and the cooling schedule.

To make SA conform to the notions of a population-based technique, we adapted the method to consider more than one candidate solution at the same time. For the sake of simplicity, the neighborhood function simply generates random solutions around the current solution positions with a standard deviation of 0.1. The acceptance probability function is defined as follows:

P(x, x′, T) = exp((f(x) − f(x′)) / T)

where P is the probability function, x the current solution, x′ the candidate solution, T the temperature of the system, and f(x) the fitness of a solution. For example, assuming minimization, a candidate whose fitness is worse than the current one by 0.5 is accepted with probability exp(−0.5) ≈ 0.61 at T = 1. As for the cooling schedule, the temperature decreases geometrically: as shown in the step function below, it is multiplied by a factor of 0.98 at every iteration.

Solution initialization follows the DE_Optimizer and PSO_Optimizer classes, in which the positions of the population of solutions are set randomly over the search space. The only functions that had to be implemented from scratch were step and update, which specify the neighborhood generating procedure and the acceptance probability function, respectively. Each of these functions calls a CUDA-C kernel to execute the appropriate method in parallel, specifying the required thread block configuration: every thread block performs the calculations for one solution in a set, while every thread in the block represents one dimension of the problem under optimization. Code snippets of the host and device functions for step, used for creating the new candidate solutions that will be evaluated by fitnessEvaluation, are presented below:

    void SA_Optimizer::step()
    {
        h_cudaSA_step(m_solutionSet.getDevicePositions(),
                      m_neighbors.getDevicePositions(), ...);
        m_temperature *= 0.98f;
    }

    __global__ void sa_step(float* positions, float* neighborPositions, ...)
    {
        int tid = threadIdx.x;
        int solutionID = (blockIdx.x * gridDim.y) + blockIdx.y;
        int posID = (solutionID * blockDim.x) + tid;
        curandState localState = devStates[posID];
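        // The original snippet is truncated at this point. What follows is
        // our reconstruction, not the authors' code, based on the
        // neighborhood function described above: each thread perturbs one
        // coordinate of its solution with Gaussian noise of sigma = 0.1.
        if (tid < dims)  // dims: assumed kernel argument (problem dimensionality)
        {
            float noise = 0.1f * curand_normal(&localState);
            neighborPositions[posID] = positions[posID] + noise;
            devStates[posID] = localState;  // persist RNG state for the next call
        }
    }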