Permuted Orthogonal Block-Diagonal Transformation Matrices for Large Scale Optimization Benchmarking Ouassim Ait Elhara, Anne Auger, Nikolaus Hansen
To cite this version:
Ouassim Ait Elhara, Anne Auger, Nikolaus Hansen. Permuted Orthogonal Block-Diagonal Transformation Matrices for Large Scale Optimization Benchmarking. GECCO 2016, Jul 2016, Denver, United States. pp. 189-196, 10.1145/2908812.2908937. hal-01308566v3
HAL Id: hal-01308566 https://hal.inria.fr/hal-01308566v3 Submitted on 11 May 2016
Ouassim Ait ElHara (LRI, Univ. Paris-Sud, Université Paris-Saclay; TAO Team, Inria), Anne Auger (Inria; LRI, Univ. Paris-Sud, Université Paris-Saclay), Nikolaus Hansen (Inria; LRI, Univ. Paris-Sud, Université Paris-Saclay). E-mail: firstname.ait [email protected], [email protected], [email protected]
ABSTRACT

We propose a general methodology to construct large-scale testbeds for the benchmarking of continuous optimization algorithms. Our approach applies an orthogonal transformation on raw functions that involves only a linear number of operations in order to obtain large scale optimization benchmark problems. The orthogonal transformation is sampled from a parametrized family of transformations that are the product of a permutation matrix times a block-diagonal matrix times a permutation matrix. We investigate the impact of the different parameters of the transformation on its shape and on the difficulty of the problems for separable CMA-ES. We illustrate the use of the above defined transformation in the BBOB-2009 testbed as a replacement for the expensive orthogonal (rotation) matrices. We also show the practicability of the approach by studying the computational cost and its applicability in a large scale setting.

Keywords

Large Scale Optimization; Continuous Optimization; Benchmarking.

1. INTRODUCTION

The general context of this paper is numerical optimization in a so-called black-box scenario. More precisely, one is interested in estimating the optimum of a function

    x_opt = arg min f(x) ,  (1)

with x ∈ S ⊆ R^d and d ≥ 1 the dimension of the problem. The black-box scenario means that the only information on f that an algorithm can use is the objective function value f(x) at a given queried point x. There are two conflicting objectives: (i) estimating x_opt as precisely as possible with (ii) the least possible cost, that is, the least number of function evaluations.

Benchmarking is a compulsory task to evaluate the performance of optimization algorithms designed to solve (1). In the context of optimization, it consists in running an algorithm on a set of test functions and extracting different performance measures from the generated data. To be meaningful, the functions on which the algorithms are tested should relate to real-world problems. The COCO platform is a benchmarking platform for comparing continuous optimizers [3]. In COCO, the philosophy is to provide meaningful and quantitative performance measures. The test functions are comprehensible, modeling well-identified difficulties encountered in real-world problems such as ill-conditioning, multi-modality, non-separability, skewness and the lack of a structure that can be exploited.

In this paper, we are interested in designing a large-scale test suite for benchmarking large-scale optimization algorithms. For this purpose we build on the BBOB-2009 testbed of the COCO platform. We consider a continuous optimization problem to be of large scale when the number of variables is in the hundreds or thousands. Here, we start by extending the dimensions used in COCO (up to 40) and consider values of d up to at least 640.

There are already some attempts at large scale continuous optimization benchmarking. The CEC special sessions and competitions on large-scale global optimization have extended some of the standard functions to the large scale setting [9, 8]. In the CEC large scale testbed, four classes of functions are introduced depending on their level of separability: (a) separable functions; (b) partially separable functions with a single group of dependent variables; (c) partially separable functions with several independent groups, each group comprised of dependent variables; and (d) fully non-separable functions. In the partially separable functions, variables are divided into groups and the overall fitness is defined as the sum of the fitness of one or more functions applied to each of these groups independently. Dependencies are obtained by applying an orthogonal matrix to the variables of a group. By limiting the sizes of the groups, the orthogonal matrices remain of reasonable size, which allows their application in a large scale setting.

The approach we consider in this paper is to start with the BBOB-2009 testbed [6] from the COCO platform and replace the transformations that do not scale linearly with the problem dimension d with more efficient variants, while trying to preserve these functions' properties.

The paper is organized as follows: in Section 2 we motivate and introduce our particular transformation matrix for large scale benchmarking and identify its different parameters. Section 3 treats the effects of the different parameters on the transformed problem and its difficulty. We choose possible default values for the parameters in Section 4 and give some implementation details and complexity measures in Section 5, where we also illustrate how the transformation can be applied in COCO. We sum up in Section 6.

GECCO '16, July 20-24, 2016, Denver, CO, USA. © 2016 ACM. ISBN 978-1-4503-4206-3/16/07. DOI: http://dx.doi.org/10.1145/2908812.2908937

2. BBOB-2009 RATIONALE AND THE CASE FOR LARGE SCALE

Our building of large-scale functions within the COCO framework is based on the BBOB-2009 testbed [6]. We explain in this section how this testbed is built and how we intend to make it large-scale friendly.

2.1 The BBOB-2009 Testbed

The BBOB-2009 testbed relies on a number of raw functions from which 24 different problems are generated. Here, the notion of raw function designates functions in their basic form, applied to a non-transformed (canonical basis) search space. For example, we call the raw ellipsoid function the convex quadratic function with a diagonal Hessian matrix whose ith element equals 2 × 10^{6(i−1)/(d−1)}, that is, f_Elli(x) = Σ_{i=1}^{d} 10^{6(i−1)/(d−1)} x_i^2. On the other hand, the separable ellipsoid function used in BBOB-2009 (f2) sees the search space transformed by a translation and an oscillation T_osz (4); that is, it is called on z = T_osz(x − x_opt) instead of x, with x_opt ∈ R^d. Table 1 shows normalized and generalized examples of such widely used raw functions.

The 24 BBOB-2009 test functions are obtained by applying a series of transformations to such raw functions that serve two main purposes: (i) have non-trivial problems that cannot be solved by simply exploiting some of their properties (separability, optimum at a fixed position, ...) and (ii) allow to generate different instances, ideally of similar difficulty, of a same problem. The second item is made possible by the random nature of some of the transformations (a combination of the function identifier and the instance number is provided as seed to the random number generator). All functions have at least one random transformation, which allows this instantiation to work on all of them.

First, the optimal solution x_opt and its fitness f_opt are generated randomly. These allow the optimal solution and its fitness to be non-centered at zero:

    f(x) = f_raw(z) + f_opt ,  (2)

with z = x − x_opt and f_raw a raw function.

In addition to the translation by x_opt, the search space might be perturbed using one or both of the deterministic transformations T_asy and T_osz, defined coordinate-wise for i = 1 ... d as

    [T_asy^β(x)]_i = x_i^{1 + β (i−1)/(d−1) √(x_i)} if x_i > 0, and x_i otherwise ,  (3)

    [T_osz(x)]_i = sign(x_i) exp(x̂_i + 0.049 (sin(c1 x̂_i) + sin(c2 x̂_i))) ,  (4)

where [x]_i = x_i, x̂_i = log(|x_i|) if x_i ≠ 0 and 0 otherwise, and (c1, c2) = (10, 7.9) when x_i > 0, (c1, c2) = (5.5, 3.1) otherwise. T_osz applies a non-linear oscillation to the coordinates while T_asy introduces a non-symmetry between the positive and negative values.

The variables are also scaled using a diagonal matrix Λ^α whose ith diagonal element equals α^{(1/2)(i−1)/(d−1)}.

In order to introduce non-separability and coordinate system independence, another transformation consists in applying an orthogonal matrix to the search space: x ↦ z = Rx, with R ∈ R^{d×d} an orthogonal matrix.

If we take the Rastrigin function (f15) from [6] (and normalize it in our case), it combines all the transformations defined above:

    f15(x) = f_raw^Rastrigin(z) + f_opt ,  (5)

where z = R Λ^{10} Q T_asy^{0.2}(T_osz(R (x − x_opt))), f_raw^Rastrigin as defined in Table 1, and R and Q two d × d orthogonal matrices.

2.2 Extension to Large Scale

Complexity-wise, all the transformations above, bar the application of an orthogonal matrix, can be computed in linear time, and thus remain applicable in practice in a large scale setting. The problems in COCO are defined, in most cases, by combining the above defined transformations. Additional transformations are of linear cost and can be applied in a large scale setting without affecting computational complexity.

Our idea is to derive a computationally feasible large scale optimization test suite from the BBOB-2009 testbed [6], while preserving the main characteristics of the original functions. To achieve this goal, we replace the computationally expensive transformations identified above, namely full orthogonal matrices, with orthogonal transformations of linear computational complexity: permuted orthogonal block-diagonal matrices.

2.2.1 Objectives

The main properties we want the replacement transformation matrix to satisfy are to:

1. Have (almost) linear cost: both the amount of memory needed to store the matrix and the computational cost of applying the transformation matrix to a solution must scale, ideally, linearly with d, or at most in d log(d) or d^{1+ε} with ε ≪ 1.

2. Introduce non-separability: the desired scenario is to have a parameter or set of parameters that allows to control the difficulty and level of non-separability of the resulting problem in comparison to the original, non-transformed, problem.

3. Preserve, apart from separability, the properties of the raw function: as in the case when using a full orthogonal matrix, we want to preserve the condition number and eigenvalues of the original function when it is convex-quadratic.

2.2.2 Block-Diagonal Matrices

Sparse matrices, and more specifically band matrices, satisfy Property 1. However, a full band matrix (all elements in the band are non-zero) cannot be orthogonal, and thus does not satisfy Property 3, unless the band-width is 1 (see the appendix). This means that the matrix is diagonal, which contradicts Property 2.

A special case of band matrices that can be non-diagonal and still orthogonal are block-diagonal matrices. A block-diagonal matrix B is a matrix of the form

    B = [ B1  0  ···  0
          0  B2  ···  0
          0   0   ⋱   0
          0   0  ···  Bnb ] ,  (6)

where nb is the number of blocks and Bi, 1 ≤ i ≤ nb, are square matrices of sizes si × si satisfying si ≥ 1 and Σ_{i=1}^{nb} si = d.

Property 1 relies upon the number of non-zero entries of B, which is

    Σ_{i=1}^{d} Σ_{j=1}^{d} 1_{b(i,j)≠0} = Σ_{i=1}^{nb} si^2 ,  (7)

where 1_{b(i,j)≠0} is the indicator function that equals 1 when b(i,j) ≠ 0 and 0 otherwise, and b(i,j) is the ith row, jth column element of B. If we simplify and consider blocks of equal sizes s, bar possibly the last block, then nb = ⌈d/s⌉. We end up with at most d × s non-zero entries. Then, in order to satisfy Property 1 with a number of non-zero elements linear in d, s needs to be independent of d. In the case of different block-sizes, the same reasoning can be applied to the largest block-size instead of s.

For Property 2, as long as the matrices Bi in (6) are not diagonal, the functions that are originally separable become non-separable, since correlations between variables are introduced.

Now for Property 3, B is orthogonal if and only if all the matrices Bi are orthogonal. We want to have the matrices Bi uniformly distributed in the set of orthogonal matrices of the same size (the orthogonal group O(si)). We first generate square matrices with entries i.i.d. standard normally distributed. Then we apply the Gram-Schmidt process to orthogonalize these matrices. The resulting Bi, and thus B, are orthogonal and satisfy Property 3.

Orthogonal block-diagonal matrices are the raw transformation matrices for our large scale functions. Their parameters are:

• d, which defines the size of the matrix,

• {s1, ..., snb}, the block sizes, where nb is the number of blocks.

It is also possible to consider rank-deficient matrices with a limited number, d_eff, of non-zero columns. We end up with d × d_eff non-zero entries. Such matrices result in transformed problems with a low effective dimension. This approach was introduced, tested and discussed in another work.

2.2.3 Permutations

With the model above, we have seen that we do introduce non-separability. However, dependencies remain limited to the blocks of B, so variables from different blocks have no chance of being correlated (unless they already are in the raw function). Such a property seems rather artificial and could be exploited by algorithms.

To prevent this, two permutations are applied, one to the rows of B and the other to its columns. The purpose of these permutations is to hide the block-diagonal structure of B. The overall transformation matrix R becomes

    R = P_left B P_right ,  (8)

with P_left, P_right two permutation matrices, that is, matrices that have one and only one non-zero element (equal to 1) per row and per column. Property 3 remains satisfied since permutation matrices are orthogonal, and the product of orthogonal matrices is an orthogonal matrix.

In effect, the permutation P_left shuffles the variables in which the raw function is defined¹. Hence, regularities coming from the ordering of variables in this definition, e.g. monotonously increasing coefficients, are perturbed. As a result, the permutation determines which coefficients are used within the blocks defined by B and, consequently, the block condition numbers and the difficulty of the problem (see the appendix). The permutation P_right shuffles the variables which are accessed by the optimization algorithm, thereby hiding the block-structure of B. Most numerical optimization algorithms are invariant to the choice of P_right.

¹ Given a separable quadratic function f(x) = x^T D x, then f(P_left B P_right x) = (P_right x)^T B^T [P_left^T D P_left] B (P_right x), which illustrates that P_left permutes the coefficients of the diagonal matrix D ("variables in which the raw function is defined").

Generating the Random Permutations.

When applying the permutations, especially P_left, one wants to remain in control of the difficulty of the resulting problem. Ideally, the permutation should have a parameterization that easily allows to control the difficulty of the transformed problem.

We define our permutations as series of ns successive swaps. To have some control over the difficulty, we want each variable to travel, on average, a fixed distance from its starting position. For this to happen, we consider truncated uniform swaps.

In a truncated uniform swap, the second swap variable is chosen uniformly at random among the variables that are within a fixed range rs of the first swap variable. Let i be the index of the first variable to be swapped and j that of the second swap variable; then

    j ∼ U({lb(i), ..., ub(i)} \ {i}) ,  (9)

where U(S) is the uniform distribution over the set S, lb(i) = max(1, i − rs) and ub(i) = min(d, i + rs).

When rs ≤ (d − 1)/2, the average distance between the first and second swap variable ranges from (√2 − 1) rs + 1/2 to rs/2 + 1/2. It is maximal when the first swap variable is at least rs away from both extremes or is one of them.

Algorithm 1 describes the process of generating a permutation using a series of truncated uniform swaps. The parameters for generating these permutations are:

• d, the number of variables,

• ns, the number of swaps. Values proportional to d allow to make the next parameter the only free one,

• rs, the swap range, eventually the only free parameter. The swap range can be equivalently defined in the form rs = ⌊rr d⌋, with rr ∈ [0, 1]. Each variable moves on average about rr × 50% of the maximal distance d.

The indexes of the variables are taken in a random order thanks to the permutation π. This is done to avoid any bias with regards to which variables are selected as first swap variables when fewer than d swaps are applied.
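The construction described above (normally distributed blocks orthogonalized by Gram-Schmidt, assembled into B, then row- and column-permuted as in (8)) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the COCO implementation; the function names are ours, and the permutations are passed as index vectors rather than explicit permutation matrices:

```python
import numpy as np

def random_orthogonal_block(s, rng):
    """Sample an s x s orthogonal matrix: Gram-Schmidt
    orthogonalization of a matrix with i.i.d. standard
    normal entries."""
    a = rng.standard_normal((s, s))
    q = np.empty_like(a)
    for k in range(s):
        v = a[:, k].copy()
        for j in range(k):
            # subtract the projection onto the already-built basis
            v -= (q[:, j] @ a[:, k]) * q[:, j]
        q[:, k] = v / np.linalg.norm(v)
    return q

def block_diagonal(d, s, rng):
    """Orthogonal block-diagonal matrix B, cf. Eq. (6), with blocks
    of equal size s (a possibly smaller last block)."""
    B = np.zeros((d, d))
    for start in range(0, d, s):
        size = min(s, d - start)
        B[start:start + size, start:start + size] = \
            random_orthogonal_block(size, rng)
    return B

def permuted_block_diagonal(d, s, p_left, p_right, rng):
    """R = P_left B P_right, cf. Eq. (8): p_left permutes the rows
    of B and p_right its columns (index-vector form)."""
    B = block_diagonal(d, s, rng)
    return B[p_left][:, p_right]
```

Since permuting rows and columns neither changes the number of non-zero entries nor breaks orthogonality, the resulting R stays orthogonal with at most d × s non-zero entries, so storing and applying it costs O(d s) instead of O(d^2).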
These stationary d Output: a vector p ∈ N , defining a permutation. values are independent of the relative swap range rr, except 1: p ← (1, . . . , d) for relatively small dimensions in particular with d swaps. 2: generate a uniformly random permutation π This is due to the fact that the probability of swapping a variable back to its original position increases as the swap 3: for 1 ≤ k ≤ ns do range decreases. This probability is also affected by the 4: i ← π(k), xπ(k) is the first swap variable actual value of the swap range r = br dc: the larger the 5: lb ← max(1, i − rs) s r swap range, the smaller the probability of a variable being 6: ub ← min(d, i + rs) swapped back to its original position. 7: S ← {lb, . . . , ub} r {i} 8: Sample j uniformly in S With this, we know that for large enough dimensions, only few variables are left unmoved. We fix the number of swaps 9: swap pi and pj 10: end for to ns = d, leaving only rs as free parameter. 11: return p 3. DIFFICULTY SCALING WITH MATRIX 1.0 AND FUNCTION FEATURES Now that we have decided upon the transformation ma- rs =⌊d/1⌋ 0.8 trix that we want to apply to the raw functions, namely rs =⌊d/3⌋ permuted orthogonal block-diagonal matrix (8), and identi- fied its free parameters, (s ) and r , we are interested r =⌊d/10⌋ i 1≤i≤nb s 0.6 s in how these two parameters, in addition to the problem rs =⌊d/25⌋ dimension d, affect the transformed problem. 0.4 rs =⌊d/50⌋ 3.1 Impact of the Parameters on Structure of the Transformation Matrix 0.2 rs =⌊d/75⌋
touched vars ratio vars touched First, we are interested in the effect of (si)1≤i≤nb , rs and d rs =⌊d/100⌋ 0.0 on the shape of the sparse orthogonal matrix defined in (8). 20 40 80 160 320 640 1280 2560 5120 To this end, we investigate the dependency of the sum of the d distances to the diagonal of the non-zero elements (dToD) on Figure 1: Proportion of variables that are moved these parameters: from their original position when applying Algo- d d rithm1 versus problem dimension d. Each graph X X 1 dToD(R) = r(i,j)6=0|i − j| , (10) corresponds to a different swap range rs indicated i=1 j=1 in the legend. Solid lines: d swaps; dashed lines: d/2 where 1r 6=0 is the indicator function on the condition swaps. When rs = 0, no swap is performed. The (i,j) averages of 51 repetitions per data point are shown. r(i,j) 6= 0 and r(i,j) is element (i, j) of R. This helps us mea- Standard deviations (not shown) are at most 10% of sure how different from a diagonal matrix the transformed the mean. matrix is, which gives us an idea of the non-separability of the resulting problem. In Figure2, we show a normalized dToD of a matrix R = variables when less than d swaps are applied. We start with PleftB with blocks of equal sizes depending on the values of p initially the identity permutation. We apply the swaps d, s and rs. For a full square matrix of size d, the corre- sponding dToD is defined in (9) taking pπ(1), pπ(2), . . . , pπ(ns), successively, as first swap variable (that replace i in (9)). The resulting 1 d (full) = d(d − 1)(d + 1) . (11) vector p is returned as the desired permutation. ToD 3 Which means that for a block-diagonal matrix B with blocks Impact of the Number of Swaps and Variable Pool Type. of equal sizes s: Given the way the permutations are generated, some vari- d s 1 dToD(B) = (s − 1)(s + 1) + c(c − 1)(c + 1) , (12) ables may end up being swapped more than once. 
This leads s 3 3 to variables potentially moving further than rs away from where c = d − s × bd/sc (which accounts in case for the last their starting positions or (with low probability) even end- block of size c < s). ing up in their original position after being swapped back. In order to better understand the effect of each parameter, In Figure1, we look into the proportion of variables that we considered the following model for r ≥ 1/d: are affected by the swap strategy. That is variables that end r up in positions different from their initial ones. A different dˆ (α, d, s, r ) = α (d + α )α2 × (s + α )α4 × (r + α )α6 approach (that we call static while the current version is ToD r 0 1 3 r 5 s α8 dynamic) was tried where Line9 in Algorithm1 swaps, in- × ( + α7) , (13) −1 −1 −1 rr d stead, pi with pj , where p is the, dynamically updated, inverse permutation of p. This has shown no significantly where α = (α0, . . . , α8) are the free parameters of the fit to different results. be optimized. 3 10 blocks are relatively small. It is invariant to the coordinate s =2 permutation Pright which allows us to restrict our study to 2 s =16
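Algorithm 1 and the dToD measure (10) translate directly into Python; the following is a minimal sketch under our own naming, using 0-based indices instead of the paper's 1-based ones:

```python
import numpy as np

def truncated_uniform_permutation(d, n_swaps, r_s, rng):
    """Algorithm 1: build a permutation of 0..d-1 as a series of
    n_swaps truncated uniform swaps of range r_s."""
    p = np.arange(d)                 # start from the identity permutation
    if r_s == 0:                     # no swap is performed
        return p
    order = rng.permutation(d)       # pi: unbiased choice of first swap variables
    for k in range(n_swaps):
        i = order[k % d]             # first swap variable (wraps if n_swaps > d)
        lo, hi = max(0, i - r_s), min(d - 1, i + r_s)
        candidates = [x for x in range(lo, hi + 1) if x != i]
        j = rng.choice(candidates)   # second swap variable, cf. Eq. (9)
        p[i], p[j] = p[j], p[i]
    return p

def d_to_d(R):
    """Sum of the distances to the diagonal of the non-zero
    entries of R, cf. Eq. (10)."""
    i, j = np.nonzero(R)
    return int(np.abs(i - j).sum())
```

As a sanity check, `d_to_d` on a full d × d matrix reproduces the closed form (11), d(d − 1)(d + 1)/3.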
s) 10 Pleft and B. By comparing to the original, non transformed × s =128 d
( and separable problem, we get an idea of how non-separable
/ 1 r =0 10 s the problem becomes. This helps us decide on the parame- ToD r =⌊d/100⌋ d s ter choice. Considering that sep-CMA-ES is a rather trivial 0