
Metaheuristic Optimization, Machine Learning, and AI Virtual Workshop March 8-12, 2021

SPEAKER TITLES/ABSTRACTS

Christian Blum Spanish National Research Council

“Hybrid Metaheuristics”

During the last decade, research on metaheuristics in the context of combinatorial optimization has experienced an interesting shift towards the hybridization of metaheuristics with other techniques for optimization. At the same time, the focus of research has changed from being rather algorithm-oriented to being more problem-oriented. The focus is nowadays on solving the problem at hand, and not so much on promoting a certain algorithm. This has led to an enormously fruitful cross-fertilization of different areas of optimization, algorithmics, mathematical modelling, operations research, simulation, and other fields. This cross-fertilization has resulted in a multitude of powerful hybrid algorithms that were obtained by combining metaheuristics with mathematical programming, constraint programming, and lately also with machine learning. Nowadays nearly all high-performance optimization techniques in combinatorial optimization are hybrids. In this talk, I will provide a short glimpse of some recent developments in this field.

Ray-Bing Chen National Cheng Kung University

“Particle Swarm Stepwise (PaSS) Algorithm for Information Criterion Based Variable Selections”

A new stochastic search algorithm is proposed for solving information-criterion-based variable selection problems. The idea behind the proposed algorithm is to search for the best model under a prespecified information criterion using multiple search particles. These particles simultaneously explore the candidate model space and communicate with each other to share search information. A new stochastic stepwise procedure is proposed to update each particle's model during the search by adding or deleting variables. The proposed algorithm can also be used to generate variable selection ensembles efficiently. Several examples are used to demonstrate the performance of the proposed algorithm, and a parallel version is introduced to reduce computation time.
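
To make the flavor of the search concrete, here is a minimal sketch of the multi-particle stochastic stepwise idea (a simplification for illustration, not the authors' implementation; BIC stands in for the information criterion, and all names are placeholders):

```python
import numpy as np

def bic(X, y, subset):
    """BIC of an ordinary least-squares fit using only the columns in `subset`."""
    n = len(y)
    if not subset:
        return n * np.log(np.sum((y - y.mean()) ** 2) / n)
    Xs = X[:, sorted(subset)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    return n * np.log(rss / n) + len(subset) * np.log(n)

def pass_search(X, y, n_particles=10, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    particles = [set(rng.choice(p, 2, replace=False)) for _ in range(n_particles)]
    best = min(particles, key=lambda s: bic(X, y, s))
    for _ in range(n_iters):
        for i, s in enumerate(particles):
            cand = set(s)
            if rng.random() < 0.5 and len(cand) < p:
                # add step: prefer variables present in the shared best model
                pool = list((best - cand) or (set(range(p)) - cand))
                cand.add(int(rng.choice(pool)))
            elif cand:
                cand.remove(int(rng.choice(list(cand))))  # delete step
            if bic(X, y, cand) < bic(X, y, s):            # keep if criterion improves
                particles[i] = cand
        best = min(particles + [best], key=lambda s: bic(X, y, s))
    return best
```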

Carlos Coello Coello CINVESTAV-IPN

“An Overview of Evolutionary Multi-Objective Optimization”

Multi-objective optimization refers to solving problems having two or more (often conflicting) objectives at the same time. Such problems have no single optimal solution; their solution is instead a set of solutions representing the best possible trade-offs among the objectives. Evolutionary algorithms are particularly suitable for solving multi-objective problems because they are population-based and require little domain-specific information to conduct the search. Due to these advantages, the development of so-called multi-objective evolutionary algorithms (MOEAs) has significantly increased in the last 15 years.

In this talk, we will provide a general overview of the field, including the main algorithms in current use as well as some of their many applications.
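
As a small illustration of the trade-off notion underlying this field, the following sketch (a toy example, not any particular MOEA) extracts the non-dominated set from a collection of objective vectors:

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return np.all(a <= b) and np.any(a < b)

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    pts = np.asarray(points, dtype=float)
    keep = [not any(dominates(q, p) for q in pts if not np.array_equal(q, p))
            for p in pts]
    return pts[keep]

print(pareto_front([[1, 5], [2, 2], [4, 1], [3, 3]]))  # drops the dominated [3, 3]
```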

Abhishek Gupta Agency for Science, Technology and Research (A*STAR)

“Transfer and Multi-task Evolutionary Computation”

Optimization is at the heart of problem-solving in domains spanning the natural sciences, engineering, operations research, and even machine learning algorithms. In all of these domains, problems encountered are often repetitive in nature; e.g., optimizing products or processes to operate under different environments, solving partial differential equations (which can be seen as residual minimization problems) under varying initial / boundary conditions, etc. However, traditional optimization solvers, including those of evolutionary computation, are only designed to handle a single task at a time, assuming a zero prior knowledge state. Unlike humans who continually learn with experience, the capabilities of such solvers do not grow despite repeatedly solving similar problems.

In this talk, I shall present recent advances towards a new breed of probabilistic model-guided (evolutionary) optimization algorithms that are able to learn across tasks. Instead of limiting the updates of the search process to information acquired from a single optimization problem, information is transferred and adaptively reused across multiple tasks, leading to faster convergence. Algorithmic realizations of this idea shall be discussed, with diverse applications ranging from rapid training of physics-informed neural networks to manufacturing systems optimization.
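
The following toy sketch illustrates the general transfer idea under strong simplifying assumptions (it is not the speaker's algorithm): offspring are drawn from a mixture of the target task's search distribution and a distribution retained from a related source task, and the mixture weight adapts to how useful the transferred samples prove:

```python
import numpy as np

rng = np.random.default_rng(1)

def target_f(x):                         # toy target task
    return np.sum((x - 2.0) ** 2)

# search distribution retained from a previously solved, related task
# (hypothetical: its optimum sat near 1.8 in every coordinate)
src_mu, src_sd = 1.8 * np.ones(5), 0.3

mu, sd, w = np.zeros(5), 1.0, 0.5        # target model and transfer weight
for it in range(50):
    n = 40
    use_src = rng.random(n) < w          # which offspring come from the source
    pop = np.where(use_src[:, None],
                   rng.normal(src_mu, src_sd, (n, 5)),
                   rng.normal(mu, sd, (n, 5)))
    f = np.array([target_f(x) for x in pop])
    order = np.argsort(f)[: n // 4]      # elite quarter
    # transfer weight adapts to how many elites the source model produced
    w = 0.9 * w + 0.1 * use_src[order].mean()
    mu, sd = pop[order].mean(axis=0), pop[order].std() + 1e-3
```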

Yaochu Jin University of Surrey

“Data-driven Bayesian Evolutionary Optimization”

Bayesian optimization has been shown to be successful in handling black-box, computationally expensive optimization problems. However, traditional Bayesian optimization is limited to low-dimensional single-objective problems. This talk presents some recent advances in extending Bayesian optimization to high-dimensional, multi-objective optimization by integrating machine learning with evolutionary algorithms, two techniques in artificial intelligence. Advanced machine learning methods such as ensembles and deep neural networks are adopted to replace the Gaussian process in order to reduce time complexity, and evolutionary algorithms are employed to solve high-dimensional (up to 100 decision variables) and multi-objective (up to 20 objectives) expensive optimization problems.
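
A minimal sketch of the surrogate-assisted loop described above, with a bagged tree ensemble standing in for the Gaussian process and ensemble disagreement standing in for predictive uncertainty (illustrative assumptions, not the talk's algorithms):

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
def expensive_f(x):                      # stand-in for a costly simulation
    return np.sum(x ** 2, axis=-1)

d = 10
X = rng.uniform(-5, 5, (20, d))          # initial design
y = expensive_f(X)
for it in range(20):                     # each iteration spends one true evaluation
    ens = BaggingRegressor(DecisionTreeRegressor(), n_estimators=30).fit(X, y)
    cand = rng.uniform(-5, 5, (500, d))  # offspring of an EA would go here
    preds = np.stack([m.predict(cand) for m in ens.estimators_])
    acq = preds.mean(0) - preds.std(0)   # lower confidence bound acquisition
    x_new = cand[np.argmin(acq)]
    X = np.vstack([X, x_new])
    y = np.append(y, expensive_f(x_new))
```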

Seongho Kim Wayne State University/Karmanos Cancer Institute

“Multi-Objective Optimization on Phase II Single-Arm Trial Designs”

Simon’s two-stage minimax and optimal designs are widely used for Phase II single-arm trials with a binary endpoint. The initial step in finding these designs is to construct a set of feasible solutions that are constrained by the desired type I and II error rates. The minimax and optimal designs are then the feasible solutions that have the smallest total sample size and the smallest expected total sample size under a null response rate, respectively. That is, these designs are not optimized with regard to the type I and II error rates themselves, so the attained error rates often deviate from the desired ones. We here develop a new class of designs called Pareto designs. The developed designs use multi-objective optimization (MOO) to find a Pareto frontier that makes the attained error rates as close as possible to the desired rates, which increases the probability of early termination under a null response rate. Furthermore, we will show applications of genetic algorithm (GA)-based MOO to find Pareto designs.
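
For concreteness, the following sketch computes the operating characteristics that such a MOO trades off for a candidate two-stage design (the design parameters below are illustrative, not a recommendation):

```python
from scipy.stats import binom

def two_stage(r1, n1, r, n, p):
    """P(reject H0) and expected sample size for true response rate p."""
    pet = binom.cdf(r1, n1, p)           # probability of early termination
    p_rej = sum(binom.pmf(x1, n1, p) * binom.sf(r - x1, n - n1, p)
                for x1 in range(r1 + 1, n1 + 1))
    return p_rej, n1 + (1 - pet) * (n - n1)

p0, p1 = 0.2, 0.4                        # null and alternative response rates
alpha, en0 = two_stage(4, 19, 15, 54, p0)   # attained type I error, E[N | p0]
power, _ = two_stage(4, 19, 15, 54, p1)     # attained power
# a GA-based MOO searches over (r1, n1, r, n), jointly pushing |alpha - 0.05|
# and |(1 - power) - 0.10| toward zero while keeping en0 small
```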

Jinglai Li University of Birmingham

“Entropy Estimation via Normalizing Flow”

Entropy estimation is an important problem in information theory and statistical science. Many popular entropy estimators suffer from fast-growing estimation bias with respect to dimensionality, rendering them unsuitable for high-dimensional problems. In this work we propose a transform-based method for high-dimensional entropy estimation, which consists of the following two main ingredients.

First, by modifying the k-NN based entropy estimator, we propose a new estimator which enjoys small estimation bias for samples that are close to a uniform distribution. Second, we design a normalizing flow based mapping that pushes samples toward a uniform distribution; the relation between the entropy of the original samples and that of the transformed ones is also derived. As a result, the entropy of a given set of samples is estimated by first transforming them toward a uniform distribution and then applying the proposed estimator to the transformed samples. Numerical experiments demonstrate the effectiveness of the method for high-dimensional entropy estimation problems.
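
For reference, here is a sketch of the classical Kozachenko-Leonenko k-NN entropy estimator that this line of work builds on (the proposed modification and the normalizing-flow transform are not reproduced here):

```python
import numpy as np
from scipy.special import digamma, gammaln
from scipy.spatial import cKDTree

def knn_entropy(x, k=3):
    """Kozachenko-Leonenko estimate of differential entropy (in nats)."""
    n, d = x.shape
    dist, _ = cKDTree(x).query(x, k + 1)   # k+1 because each point finds itself
    eps = dist[:, -1]                      # distance to the k-th true neighbor
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)  # unit-ball volume
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(eps))

x = np.random.default_rng(0).normal(size=(5000, 2))
print(knn_entropy(x))   # true entropy of a 2-D standard normal is about 2.84
```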

Dietmar Maringer University of Basel

"Meta-Heuristics in Finance"

Many quantitative problems in finance come with demanding optimization problems that defy traditional numerical methods. This has often led to models that rely on simplifying assumptions, and it is not always clear how much these simplifications deteriorate the quality of the results.

Metaheuristics are therefore a welcome extension to the financial economist’s toolbox: they can deal with non-convex and discontinuous search spaces and many types of constraints, which adds substantial flexibility and allows for more realistic models. In combination with machine learning techniques, the scope of solvable quantitative and computational problems has increased further.

This talk presents how meta-heuristics and related methods can be employed in financial modelling and decision making, and provides examples for topics such as portfolio optimization, algorithmic trading, and price modelling.

Soumya Mohanty University of Texas Rio Grande Valley

“Particle Swarm Optimization in Statistical Regression: Adaptive spline fitting and other case studies”

We discuss some instances of parametric and nonparametric regression where optimization roadblocks were removed successfully with particle swarm optimization (PSO). In nonparametric regression, which constitutes the principal part of the talk, we discuss adaptive spline fitting with free knots and how PSO solves the high-dimensional non-convex optimization barrier that it presents. This allows the virtues of free knot placement, such as better fit quality and unified handling of smooth and (a certain class of) non-smooth curves, to be realized. Besides standard benchmarks, we present applications of the adaptive spline fitting method to real-world problems drawn from the newborn field of gravitational wave (GW) astronomy. The same application domain presents a host of computationally expensive parametric regression problems as well that have been addressed using PSO. In particular, it drastically reduces the cost of a matched filter search for merging binaries of compact stars in data from a worldwide GW detector network, and enables a new approach to resolving binaries against the confusion noise background of their own making. Among the attractive features of PSO are its robustness and the small number of parameters that require tuning. In this context, we present a practical strategy for tuning PSO that has worked well across all of the above applications.
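
For readers unfamiliar with PSO, a minimal global-best variant is sketched below (illustrative only; the talk's tuning strategy and spline-knot encoding are not reproduced):

```python
import numpy as np

def pso(f, dim, n=30, iters=200, lo=-5, hi=5, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n, dim))          # particle positions
    v = np.zeros((n, dim))                     # particle velocities
    pbest, pval = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[pval.argmin()].copy()            # global best
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pval
        pbest[improved], pval[improved] = x[improved], fx[improved]
        g = pbest[pval.argmin()].copy()
    return g, pval.min()

# e.g., for free-knot splines, f would map a knot vector to the residual error
best, val = pso(lambda z: np.sum((z - 1.0) ** 2), dim=5)
```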

Jonas Mueller Amazon Web Services

“Metaheuristics in Automated Machine Learning with AutoGluon”

AutoGluon (auto.gluon.ai) is an open-source library that automates supervised learning tasks with text, image, or tabular data. It provides a fully automated end-to-end ML pipeline that translates raw data into highly accurate predictions, without any manual oversight. Compared to other AutoML systems, AutoGluon is substantially more accurate and offers greater levels of automation (it is easier to use). In multiple Kaggle prediction competitions, AutoGluon outperforms almost all human data scientists after merely training for a few hours on the raw competition data. AutoGluon is not only accurate, but also allows you to trade off prediction accuracy against inference latency, a critical feature for many Amazon teams that use this system in their production applications. This talk will cover various metaheuristics used inside of a successful AutoML system, as well as the metaheuristics we employed to design this system.
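
A typical AutoGluon tabular workflow looks roughly as follows (a sketch; the file names and label column are placeholders):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train = TabularDataset("train.csv")                  # any tabular file or DataFrame
predictor = TabularPredictor(label="class").fit(train)   # fully automated pipeline
test = TabularDataset("test.csv")
print(predictor.predict(test).head())                # predictions on new data
print(predictor.leaderboard(test))                   # accuracy of each trained model
```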

Frederick Kin Hing Phoa Academia Sinica

“A Two-Step Approach to the Search of Minimum Energy Designs via Swarm Intelligence”

The Swarm Intelligence Based (SIB) method has been used in recent years on many problems whose solutions fall in discrete and continuous domains. SIB 1.0 converges efficiently to an optimal solution, but its particle size is fixed and pre-defined; SIB 2.0 allows the particle size to change during the procedure, but it takes longer to converge. This talk introduces a two-step SIB method that combines the advantages of the two. The first step, via SIB 2.0, serves as a preliminary study to determine the optimal particle size, and the second step, via SIB 1.0, serves as a follow-up study to obtain the optimal solution. The method is applied to the search for optimal minimum energy designs, a new class of uniform designs with useful applications in sensor allocation and hotspot detection. The result from the proposed method outperforms those from either SIB 1.0 or SIB 2.0 alone in terms of cost efficiency.

Christina Ramirez University of California, Los Angeles

“SARS-CoV-2 Worldwide Replication Drives Rapid Rise of Mutations across the Viral Genome and their Selection”

Scientists and the public were alarmed at the first large viral variant of SARS-CoV-2 reported in December 2020. We have followed the time course of emerging viral mutants and variants during the SARS-CoV-2 pandemic in ten countries on four continents. We examined complete SARS-CoV-2 nucleotide sequences in GISAID (Global Initiative on Sharing All Influenza Data) with sampling dates extending until January 20, 2021. These sequences originated from ten different countries: United Kingdom, South Africa, Brazil, USA, India, Russia, France, Spain, Germany, and China. Among the novel mutations, some previously reported mutations waned and some increased in prevalence over time. VUI-202012/01 (B.1.1.7) and 501Y.V2 (B.1.351), the so-called UK and South Africa variants, respectively, and two variants from Brazil, 484K.V2, now called P.1 and P.2, increased in prevalence. Despite lockdowns, worldwide active replication in genetically and socio-economically diverse populations facilitated the selection of new mutations. The data on mutant and variant SARS-CoV-2 strains provided here comprise a global resource for easy access to the myriad mutations and variants detected to date globally. Rapidly evolving new variant and mutant strains might give rise to escape variants capable of limiting the efficacy of vaccines, therapies, and diagnostic tests.

Marzie Rasekh Boston University

“Identifying Optimal Subgroups in Clinical Trials using Optimization Algorithms”

In clinical trials, the treatment response differs across groups of patients due to patient heterogeneity. In order to identify patients with better clinical outcomes, many covariates and biomarkers are collected at the time of patient recruitment, with the intention to later identify subgroups defined by these covariates. A subgroup is defined by a cutoff on one or more covariates, e.g., Age > 65. Finding subgroups is not trivial: the large number of covariates makes the search space intractable, multiple-testing corrections make reaching significance impossible, and different types of outcomes and statistics must be properly handled.

Here we propose a novel method to identify subgroups with different treatment responses using Particle Swarm Optimization (PSO). PSO is an evolutionary algorithm that can find local optima and can be configured to handle any type of data points or outcomes. To demonstrate the application of PSO to the problem of subgroup identification we developed an R package named sidtoolbox. This package allows the user to load their clinical data, explore the relations between the covariates, and identify subgroups using PSO along with two other state-of-the-art methods, TSDT and Virtual Twins. On simulated data, PSO outperforms TSDT and is comparable to Virtual Twins. However, Virtual Twins can only handle binary outcomes, while PSO is more flexible. To date, this is the first application of evolutionary algorithms to subgroup identification in randomized clinical trials.
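
As an illustration of how a particle could be scored (the encoding and test below are hypothetical, not necessarily those of sidtoolbox), a particle position may hold one cutoff per covariate and be scored by the treatment-effect statistic in the induced subgroup:

```python
import numpy as np
from scipy import stats

def subgroup_fitness(cutoffs, X, treat, outcome, min_size=30):
    """Score the candidate subgroup rule X[:, j] > cutoffs[j] for all j."""
    mask = np.all(X > cutoffs, axis=1)
    if mask.sum() < min_size:
        return -np.inf                    # penalize tiny subgroups
    a = outcome[mask & (treat == 1)]
    b = outcome[mask & (treat == 0)]
    if min(len(a), len(b)) < 2:
        return -np.inf
    return stats.ttest_ind(a, b, equal_var=False).statistic

# a PSO (as in the sketch given earlier for the Mohanty abstract) then
# maximizes this fitness over the vector of cutoffs
```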

Mitchell Schepps University of California, Los Angeles

"Optimizing Patient Enrollment in Clinical Trials by Metaheuristics"

Choosing the number of clinical trial sites worldwide is a fundamental issue when designing a Phase II or Phase III trial. The designers need to balance minimizing cost against maintaining good enrollment at the sites. Metaheuristics navigate this complex cost surface well. This talk covers some results for this problem.

Christine Shoemaker National University of Singapore

" of Homogeneously Noisy, Expensive Multimodal Functions with RBF Surrogates"

Evaluation of a noisy function f(x) returns a value f(x) + ε, where ε is a random variable with variance σ² and f(x) is the mean value of the function. We consider expensive functions f(x), meaning that the number of objective function evaluations available is limited. To reduce the number of objective function evaluations required to accurately solve the problem, an RBF surrogate approximation (based on previously evaluated values of the function f(x)) is created. The variance σ² is unknown, but it is the same for all x (i.e., homogeneous noise). Selecting the optimal x for a noisy function f(x) is especially challenging because it may be necessary to evaluate the expensive objective function more than once at the same x in order to get an estimate of f(x) accurate enough to come adequately close to the optimal value of x. We consider the optimization of multimodal, computationally expensive, continuous noisy functions f(x) with the assistance of a Radial Basis Function (RBF) surrogate approximation of f(x). This surrogate is then used to help guide the optimization search and thereby reduce the number of objective function evaluations required. Because the function is noisy, it is necessary to resample some points to estimate the noise at previously evaluated locations. The proposed TRIM algorithm hence decides iteratively where to sample next, and when and where to make repeated evaluations at a previously evaluated point. The results outperform the Bayesian optimization methods ISSO and EQI for noisy functions on test problems, and the approach is more flexible in the choice of objective function. (Draft paper by Shen and Shoemaker.)
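
The following sketch captures the replication-plus-surrogate idea in a simplified form (this is not the TRIM algorithm; the resampling rule and candidate search below are illustrative placeholders):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
def noisy_f(x, sigma=0.5):               # homogeneous noise: same sigma everywhere
    return np.sum((x - 1.0) ** 2, axis=-1) + rng.normal(0.0, sigma, np.shape(x)[:-1])

X = rng.uniform(-5, 5, (15, 2))
sums, counts = noisy_f(X), np.ones(len(X))
for it in range(30):
    means = sums / counts                # replication-averaged observations
    sur = RBFInterpolator(X, means, smoothing=1.0)   # smoothing absorbs noise
    if it % 3 == 2:                      # occasionally re-evaluate the incumbent
        i = int(np.argmin(means))
        sums[i] += noisy_f(X[i]); counts[i] += 1
    else:                                # otherwise sample where the surrogate is low
        cand = rng.uniform(-5, 5, (400, 2))
        x_new = cand[np.argmin(sur(cand))]
        X = np.vstack([X, x_new])
        sums = np.append(sums, noisy_f(x_new)); counts = np.append(counts, 1.0)
```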

Noah Simon University of Washington

“Optimizing Designs for Adaptive Enrichment Clinical Trials”

The biomedical field has recently focused on developing targeted therapies, designed to be effective in only some subset of the population with a given disease. However, for many new treatments, characterizing this subset has been a challenge. Often, at the start of large-scale trials the subset is only rudimentarily understood. This leads practitioners to either 1) run an all-comers trial without use of the biomarker or 2) use a poorly characterized biomarker that may miss parts of the true target population.

In this talk we will discuss a class of adaptive enrichment designs: clinical trial designs that allow the simultaneous construction and use of a biomarker, during an ongoing trial, to adaptively enrich the enrolled population. For poorly characterized biomarkers, these trials can significantly improve power while still controlling type one error. However, there are additional challenges in this framework: In particular, how do we adapt our enrollment criteria in an “optimal” way?

We characterize optimality in a Bayesian framework: we put priors on all aspects of our generative model, and select a “utility” to optimize. To make decisions during the trial we aim to use the decision strategy that optimizes our prespecified utility. Unfortunately, the optimal decision strategy is analytically intractable. Instead we discuss the use of reinforcement learning to optimize our strategy. More specifically, we express decisions as the output of a neural network; by simulating trials from our generative model/prior we can take stochastic steps to optimize decisions made by this network.

Similar ideas were used (to great acclaim) in AlphaZero, an AI that achieved superhuman strength at the game of Go. In addition, these ideas also appear in the computer science literature on the so-called “contextual bandit” problem.
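
As a toy illustration of the approach (the prior, sample sizes, and utility below are invented for the sketch and are not the talk's design), a small logistic "policy" can be improved by score-function (REINFORCE) gradients over trials simulated from the prior:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(3)                      # policy weights on [1, z_pos, z_all]

def simulate_trial(theta):
    eff_pos = rng.normal(0.4, 0.2)       # prior draw: effect in biomarker-positives
    eff_neg = rng.normal(0.0, 0.2)       # prior draw: effect in biomarker-negatives
    se1 = np.sqrt(2 / 50)                # interim standard error (toy sizes)
    z_pos = (eff_pos + rng.normal(0, se1)) / se1
    z_all = ((eff_pos + eff_neg) / 2 + rng.normal(0, se1)) / se1
    s = np.array([1.0, z_pos, z_all])
    p = 1 / (1 + np.exp(-theta @ s))     # probability of enriching
    enrich = rng.random() < p
    eff2 = eff_pos if enrich else (eff_pos + eff_neg) / 2
    se2 = np.sqrt(2 / 200)
    z_final = (eff2 + rng.normal(0, se2)) / se2
    utility = float(z_final > 1.96)      # toy utility: final trial "success"
    grad_logp = (s if enrich else -s) * ((1 - p) if enrich else p)
    return utility, grad_logp

for step in range(5000):                 # stochastic ascent on expected utility
    u, g = simulate_trial(theta)
    theta += 0.02 * u * g                # REINFORCE step (no baseline, for brevity)
```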

Ponnuthurai Suganthan Nanyang Technological University

“Randomization Based Deep and Shallow Learning Algorithms”

This talk will first introduce the origins of randomization-based feedforward neural networks, such as the popular instantiation called the random vector functional link (RVFL) neural network, which originated in the early 1990s. Subsequently, a performance comparison among randomization-based feedforward neural network models will be presented. The talk will also cover ensemble and deep randomization-based neural networks. Another randomization-based paradigm is the random forest, which exhibits highly competitive performance; the talk will briefly describe heterogeneous oblique random forests, and kernel ridge regression will also be briefly introduced. The talk will also present extensive benchmarking studies using tabular classification datasets.
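
A minimal RVFL fit, shown for concreteness (a standard textbook construction, not the speaker's code): the random hidden weights are never trained, and the output weights are solved in closed form by ridge regression on the concatenation of inputs and random features:

```python
import numpy as np

def rvfl_fit(X, y, n_hidden=100, lam=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # fixed random hidden weights
    b = rng.normal(size=n_hidden)
    H = np.hstack([X, np.tanh(X @ W + b)])        # direct links + random features
    beta = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
    return W, b, beta

def rvfl_predict(X, W, b, beta):
    return np.hstack([X, np.tanh(X @ W + b)]) @ beta
```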

Ye Tian Hong Kong Polytechnic University

“Mining Information from Biological Networks via Metaheuristics”

Many real-world complex systems, including biological systems, can be represented by networks, in which the nodes denote objects (e.g., genes) and the edges denote connections between objects. Many tasks in bioinformatics aim to mine useful information from such networks, which is challenging due to the high-dimensional and highly discrete decision space. This presentation introduces several advanced metaheuristics for mining biological networks, which use techniques from large-scale optimization, multi-objective optimization, and sparse optimization for module identification, community detection, and critical node detection in biological networks.

Xin Tong National University of Singapore

“Exploration Enhancement and Parameter Tuning for Nature-Inspired Metaheuristic Optimization Algorithms”

Nature-inspired metaheuristic algorithms have been widely shown to be effective in tackling high-dimensional and complex optimization problems across many disciplines. They are general-purpose optimization algorithms: easy to use and implement, flexible, and derivative-free. However, many such algorithms suffer from a common but serious drawback of premature convergence to local optima, and so fail to converge to a global minimum. In this talk, we propose a general, simple, and effective approach that adds a noise component to an algorithm to enhance its exploration ability. We formulate and prove rigorously that the modified version of the algorithm converges almost surely to a global minimum, using the small-set argument from stochastic analysis. We illustrate this approach on three widely applied nature-inspired metaheuristic optimization algorithms: particle swarm optimization (PSO), competitive swarm optimization (CSO), and the bat algorithm (BAT). We also discuss how to apply various techniques from reinforcement learning to find optimal parameters for our metaheuristic algorithms.
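
As a sketch of the noise-injection idea (the decay schedule below is illustrative, not the provably convergent scheme from the talk), one can add a slowly decaying perturbation to the standard PSO update so that every region keeps a positive probability of being visited:

```python
import numpy as np

def noisy_pso_step(x, v, pbest, gbest, t, rng,
                   w=0.7, c1=1.5, c2=1.5, sigma0=1.0, decay=0.995):
    """One PSO update with an added, slowly decaying exploration noise."""
    n, d = x.shape
    v = (w * v + c1 * rng.random((n, d)) * (pbest - x)
               + c2 * rng.random((n, d)) * (gbest - x))
    noise = sigma0 * decay ** t * rng.normal(size=(n, d))  # exploration term
    return x + v + noise, v
```

Alan R. Vazquez University of California, Los Angeles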

“Constructing Optimal Screening Designs for Effective Experimentation using Metaheuristics”

Design of Experiments (DoE) is a branch of statistics that deals with the planning and analysis of controlled experiments to study complex processes. The goal is to establish cause-effect relationships between the controllable factors of the process and its performance measures through sound statistical models. DoE involves theory and methods to find experimental designs that tell experimenters which tests to perform given their budgets. An important class of experimental designs is two-level screening designs, which permit an economical evaluation of a large number of factors at two different settings or levels. Traditional two-level screening designs involve numbers of tests or runs that are powers of two; constructing attractive designs for other numbers of runs is challenging, especially when these are large. In this talk, we use a metaheuristic called iterated local search to find flexible optimal two-level screening designs which can accommodate many factors and runs. Our designs optimize a criterion that measures the average estimation efficiency across many potential statistical models. We show that iterated local search combined with expert knowledge can find highly efficient designs which fill the gaps between the traditional screening designs. We illustrate the applicability of the flexible two-level screening designs for planning drug combination experiments and calibrating machine learning algorithms.
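
A sketch of iterated local search on a two-level design is given below, with plain D-optimality for the main-effects model as the criterion (simpler than the model-averaging criterion used in the talk):

```python
import numpy as np

def logdet(D):
    """D-criterion: log det of the information matrix for intercept + main effects."""
    X = np.hstack([np.ones((len(D), 1)), D])
    sign, val = np.linalg.slogdet(X.T @ X)
    return val if sign > 0 else -np.inf

def ils(n_runs=12, n_factors=8, n_restarts=40, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.choice([-1, 1], (n_runs, n_factors))
    best, best_val = D.copy(), logdet(D)
    for _ in range(n_restarts):
        improved = True
        while improved:                   # first-improvement local search
            improved = False
            for i in range(n_runs):
                for j in range(n_factors):
                    D[i, j] *= -1         # try flipping one entry
                    if logdet(D) > best_val + 1e-9:
                        best, best_val, improved = D.copy(), logdet(D), True
                    else:
                        D[i, j] *= -1     # revert
        D = best.copy()                   # perturb the incumbent: flip 5 entries
        D.flat[rng.integers(0, n_runs * n_factors, 5)] *= -1
    return best, best_val
```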

Stefan Wager Stanford University

“Machine Learning for Causal Inference”

Given advances in machine learning over the past decades, it is now possible to accurately solve difficult non-parametric prediction problems in a way that is routine and reproducible. In this talk, I'll discuss how machine learning tools can be rigorously integrated into observational study analyses, and how they interact with classical statistical ideas around randomization, semiparametric modeling, double robustness, etc. I'll also survey some recent advances in methods for treatment heterogeneity. When deployed carefully, machine learning enables us to develop causal estimators that reflect an observational study design more closely than basic linear regression based methods.
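
One example of such an estimator is augmented inverse-propensity weighting (AIPW), which is doubly robust: it is consistent if either the outcome model or the propensity model is correct. A simplified, non-cross-fitted sketch with random forests as plug-in nuisance models:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

def aipw_ate(X, w, y):
    """Doubly robust ATE estimate; w is a 0/1 treatment indicator."""
    e = (RandomForestClassifier().fit(X, w)
         .predict_proba(X)[:, 1].clip(0.01, 0.99))         # propensity scores
    mu1 = RandomForestRegressor().fit(X[w == 1], y[w == 1]).predict(X)
    mu0 = RandomForestRegressor().fit(X[w == 0], y[w == 0]).predict(X)
    return np.mean(mu1 - mu0
                   + w * (y - mu1) / e
                   - (1 - w) * (y - mu0) / (1 - e))
# a production version would cross-fit: estimate the nuisance models on
# held-out folds to avoid own-observation overfitting bias
```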

Xin-She Yang Middlesex University London

“Nature-Inspired Algorithms for Optimization: Challenges and Open Problems”

Nature-inspired algorithms have become promising in dealing with nonlinear problems in optimization, data mining and machine learning. However, there are some important problems that need to be resolved. This talk introduces some recent nature-inspired algorithms and their main characteristics. Some open problems will be highlighted and discussed for future research.

George Yin University of Connecticut

“Using Machine Learning in Stochastic Control and Filtering”

In this talk, we present some of our recent work on using the machine learning approach to solve stochastic control and filtering problems. The motivation reflects our effort in designing feasible and efficient algorithms. We examine the convergence of the algorithms and provide computational examples.

References:

H.J. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2nd Ed., Springer-Verlag, New York, 2003.

Z. Jin, H. Yang, and G. Yin, A hybrid deep learning method for optimal insurance strategies: algorithms and convergence analysis, Insurance: Mathematics and Economics, 96 (2021), 262-275.

W. Qiu, Q. Song, and G. Yin, Solving elliptic Hamilton-Jacobi-Bellman equations in a value space, IEEE Control Systems Letters, 5 (2021), 55-60.

L.Y. Wang, G. Yin, and Q. Zhang, Deep filtering, to appear in Communications in Information and Systems.

Qingfu Zhang City University of Hong Kong

“Introduction to Decomposition Based Multiobjective Evolutionary Algorithms (MOEA/D)”

Many real-world optimization problems are multiobjective by nature. MOEA/D and its variants have been widely used for solving multiobjective optimization problems. MOEA/D decomposes a multiobjective optimization problem into a set of subproblems and then solves them in a collaborative manner. In this talk, I will briefly explain the basic ideas behind MOEA/D and discuss some design issues in MOEA/D.
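
A minimal sketch of the decomposition idea (toy bi-objective problem and simplified variation operators, for illustration only): uniform weight vectors define Tchebycheff subproblems, each subproblem mates within its neighborhood, and a child may replace any neighboring solution it improves.

```python
import numpy as np

rng = np.random.default_rng(0)
def f(x):                                # toy bi-objective, x in [0, 1]^d
    return np.array([x[0], 1 - np.sqrt(x[0]) + np.sum(x[1:] ** 2)])

N, d, T = 50, 5, 8                       # subproblems, dimension, neighborhood size
W = np.stack([np.linspace(0, 1, N), np.linspace(1, 0, N)], axis=1)   # weights
B = np.argsort(((W[:, None] - W[None]) ** 2).sum(-1), axis=1)[:, :T] # neighbors
X = rng.random((N, d))
F = np.array([f(x) for x in X])
z = F.min(axis=0)                        # ideal point
tch = lambda fv, w: np.max(w * np.abs(fv - z))   # Tchebycheff aggregation
for gen in range(100):
    for i in range(N):
        a, b = X[rng.choice(B[i], 2, replace=False)]   # mate within neighborhood
        child = np.clip((a + b) / 2 + 0.1 * rng.normal(size=d), 0, 1)
        fc = f(child)
        z = np.minimum(z, fc)                          # update ideal point
        for j in B[i]:                                 # neighborhood replacement
            if tch(fc, W[j]) < tch(F[j], W[j]):
                X[j], F[j] = child, fc
```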