CONTRIBUTED RESEARCH ARTICLES 27

Differential Evolution with DEoptim

An Application to Non-Convex Portfolio Optimiza- tween upper and lower bounds (defined by the user) tion or using values given by the user. Each generation involves creation of a new population from the cur- by David Ardia, Kris Boudt, Peter Carl, Katharine M. rent population members {xi |i = 1,. . .,NP}, where Mullen and Brian G. Peterson i indexes the vectors that make up the population. This is accomplished using differential mutation of Abstract The R package DEoptim implements the population members. An initial mutant parame- the Differential Evolution algorithm. This al- ter vector vi is created by choosing three members of gorithm is an evolutionary technique similar to the population, xi1 , xi2 and xi3 , at random. Then vi is classic genetic algorithms that is useful for the generated as solution of problems. In this note we provide an introduction to the package . vi = xi + F · (xi − xi ), and demonstrate its utility for financial appli- 1 2 3 cations by solving a non-convex portfolio opti- where F is a positive scale factor, effective values for mization problem. which are typically less than one. After the first mu- tation operation, mutation is continued until d mu- tations have been made, with a crossover probability Introduction CR ∈ [0,1]. The crossover probability CR controls the fraction of the parameter values that are copied from Differential Evolution (DE) is a search heuristic intro- the mutant. Mutation is applied in this way to each duced by Storn and Price(1997). Its remarkable per- member of the population. If an element of the trial formance as a global optimization algorithm on con- parameter vector is found to violate the bounds after tinuous numerical minimization problems has been mutation and crossover, it is reset in such a way that extensively explored (Price et al., 2006). DE has the bounds are respected (with the specific protocol also become a powerful tool for solving optimiza- depending on the implementation). Then, the ob- tion problems that arise in financial applications: for jective function values associated with the children fitting sophisticated models (Gilli and Schumann, are determined. If a trial vector has equal or lower 2009), for performing model selection (Maringer and objective function value than the previous vector it Meyer, 2008), and for optimizing portfolios under replaces the previous vector in the population; oth- non-convex settings (Krink and Paterlini, 2011). DE erwise the previous vector remains. Variations of is available in R with the package DEoptim. this scheme have also been proposed; see Price et al. In what follows, we briefly sketch the DE algo- (2006). rithm and discuss the content of the package DEop- Intuitively, the effect of the scheme is that the tim. The utility of the package for financial applica- shape of the distribution of the population in the tions is then explored by solving a non-convex port- search space is converging with respect to size and folio optimization problem. direction towards areas with high fitness. The closer the population gets to the global optimum, the more the distribution will shrink and therefore reinforce Differential Evolution the generation of smaller difference vectors. For more details on the DE strategy, we refer DE belongs to the class of genetic algorithms (GAs) the reader to Price et al.(2006) and Storn and Price which use biology-inspired operations of crossover, (1997). mutation, and selection on a population in order to minimize an objective function over the course The package DEoptim of successive generations (Holland, 1975). As with other evolutionary algorithms, DE solves optimiza- DEoptim (Ardia et al., 2011) was first published on tion problems by evolving a population of candi- CRAN in 2005. Since becoming publicly available, it date solutions using alteration and selection opera- has been used by several authors to solve optimiza- tors. DE uses floating-point instead of bit-string en- tion problems arising in diverse domains. We refer coding of population members, and arithmetic op- the reader to Mullen et al.(2011) for a detailed de- erations instead of logical operations in mutation, in scription of the package. DEoptim contrast to classic GAs. DEoptim consists of the core function Let NP denote the number of parameter vectors whose arguments are: ∈ Rd (members) x in the population, where d de- • fn: the function to be optimized (minimized). notes dimension. In order to create the initial genera- tion, NP guesses for the optimal value of the param- • lower, upper: two vectors specifying scalar real eter vector are made, either using random values be- lower and upper bounds on each parameter to

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 28 CONTRIBUTED RESEARCH ARTICLES

be optimized. The implementation searches be- > set.seed(1234) tween lower and upper for the global optimum > DEoptim(fn = Rastrigin, of fn. + lower = c(-5, -5), + upper = c(5, 5), • control: a list of tuning parameters, among + control = list(storepopfrom = 1)) which: NP (default: 10 · d), the number of pop- ulation members and itermax (default: 200), The above call specifies the objective function to the maximum number of iterations (i.e., pop- minimize, Rastrigin, the lower and upper bounds ulation generations) allowed. For details on on the parameters, and, via the control argument, the other control parameters, the reader is re- that we want to store intermediate populations from ferred to the documentation manual (by typ- the first generation onwards (storepopfrom = 1). ing ?DEoptim). For convenience, the function Storing intermediate populations allows us to exam- DEoptim.control() returns a list with default ine the progress of the optimization in detail. Upon elements of control. initialization, the population is comprised of 50 ran- dom values drawn uniformly within the lower and • ...: allows the user to pass additional argu- upper bounds. The members of the population gen- ments to the function fn. erated by the above call are plotted at the end of The output of the function DEoptim is a member different generations in Figure2. DEoptim consis- of the S3 class DEoptim. Members of this class have a tently finds the minimum of the function (market plot and a summary method that allow to analyze the with an open white circle) within 200 generations us- optimizer’s output. ing the default settings. We have observed that DE- Let us quickly illustrate the package’s usage with optim solves the Rastrigin problem more efficiently the minimization of the Rastrigin function in R2, than the simulated annealing method available in the which is a common test for global optimization: R function optim (for all annealing schedules tried). Note that users interested in exact reproduction of > Rastrigin <- function(x) { results should set the seed of their random number + sum(x^2 - 10 * cos(2 * pi * x)) + 20 generator before calling DEoptim (using set.seed). } DE is a randomized algorithm, and the results may The global minimum is zero at point x = (0,0)0.A vary between runs. perspective plot of the function is shown in Figure1. Finally, note that DEoptim relies on repeated evaluation of the objective function in order to move the population toward a global minimum. Therefore, users interested in making DEoptim run as fast as possible should ensure that evaluation of the objec- tive function is as efficient as possible. Using pure R code, this may often be accomplished using vector- ization. Writing parts of the objective function in a lower-level language like C or Fortran may also in- crease speed.

f(x_1,x_2) Risk allocation portfolios

Mean-risk models were developed in early fifties for the portfolio selection problem. Initially, variance was used as a risk measure. Since then, many al- x_2 ternative risk measures have been proposed. The x_1 question of which risk measure is most appropriate is still the subject of much debate. Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) are the most popular measures of downside risk. VaR is the neg- ative value of the portfolio return such that lower returns will only occur with at most a preset prob- ability level, which typically is between one and five percent. CVaR is the negative value of the mean of Figure 1: Perspective plot of the Rastrigin function. all return realizations that are below the VaR. There is a voluminous literature on portfolio optimization The function DEoptim searches for a minimum problems with VaR and CVaR risk measures, see, of the objective function between lower and upper e.g., Fábián and Veszprémi(2008) and the references bounds. A call to DEoptim can be made as follows: therein.

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 CONTRIBUTED RESEARCH ARTICLES 29

Generation 1 Generation 20

4 4

2 2

2 0 2 0 x x

−2 −2

−4 −4

−4 −2 0 2 4 −4 −2 0 2 4

x1 x1

Generation 40 Generation 80

4 4

2 2

2 0 2 0 x x

−2 −2

−4 −4

−4 −2 0 2 4 −4 −2 0 2 4

x1 x1

Figure 2: The population (market with black dots) associated with various generations of a call to DEoptim as it searches for the minimum of the Rastrigin function at point x = (0,0)0 (market with an open white circle). The minimum is consistently determined within 200 generations using the default settings of DEoptim.

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 30 CONTRIBUTED RESEARCH ARTICLES

Modern portfolio selection considers additional age estimators, and their implementation in portfolio criteria such as to maximize upper-tail skewness and allocation, we refer the reader to Würtz et al.(2009, liquidity or minimize the number of securities in the Chapter 4). These estimators might yield better per- portfolio. In the recent portfolio literature, it has formance in the case of small samples, outliers or de- been advocated by various authors to incorporate partures from normality. risk contributions in the portfolio allocation prob- lem. Qian’s (2005) Risk Parity Portfolio allocates port- > library("quantmod") folio variance equally across the portfolio compo- > tickers <- c("GE", "IBM", "JPM", "MSFT", "WMT") > getSymbols(tickers, nents. Maillard et al.(2010) call this the Equally- + from = "2000-12-01", Weighted Risk Contribution (ERC) Portfolio. They de- + to = "2010-12-31") rive the theoretical properties of the ERC portfolio > P <- NULL and show that its volatility is located between those > for(ticker in tickers) { of the minimum variance and equal-weight portfo- + tmp <- Cl(to.monthly(eval(parse(text = ticker)))) lio. Zhu et al.(2010) study optimal mean-variance + P <- cbind(P, tmp) portfolio selection under a direct constraint on the + } contributions to portfolio variance. Because the re- > colnames(P) <- tickers sulting optimization model is a non-convex quadrat- > R <- diff(log(P)) ically constrained quadratic programming problem, > R <- R[-1,] they develop a branch-and-bound algorithm to solve > mu <- colMeans(R) > sigma <- cov(R) it. Boudt et al.(2010a) propose to use the contribu- We first compute the equal-weight portfolio. This tions to portfolio CVaR as an input in the portfolio is the portfolio with the highest weight diversifica- optimization problem to create portfolios whose per- tion and often used as a benchmark. But is the risk centage CVaR contributions are aligned with the de- exposure of this portfolio effectively well diversified sired level of CVaR risk diversification. Under the across the different assets? This question can be an- assumption of normality, the percentage CVaR con- swered by computing the percentage CVaR contribu- tribution of asset i is given by the following explicit . tions with function ES in the package Performance- w = (w w )0 function of the vector of weights 1,..., d , Analytics (Carl and Peterson, 2011). These percent- =. ( )0 mean vector µ µ1,...,µd and covariance matrix age CVaR contributions indicate how much each as- Σ : set contributes to the total portfolio CVaR. h ( ) ( ) i w −µ + √Σw i φ zα > library("PerformanceAnalytics") i i w0Σw α √ , > pContribCVaR <- ES(weights = rep(0.2, 5), 0 0 φ(zα) −w µ + w Σw α + method = "gaussian", + portfolio_method = "component", with zα the α-quantile of the standard normal distri- + mu = mu, bution and φ(·) the standard normal density func- + sigma = sigma)$pct_contrib_ES tion. Throughout the paper we set α = 5%. As we > rbind(tickers, round(100 * pContribCVaR, 2)) show here, the package DEoptim is well suited to [,1] [,2] [,3] [,4] [,5] solve these problems. Note that it is also the evo- tickers "GE" "IBM" "JPM" "MSFT" "WMT" lutionary optimization strategy used in the package "21.61" "18.6" "25.1" "25.39" "9.3" PortfolioAnalytics (Boudt et al., 2010b). To illustrate this, consider as a stylized example a We see that in the equal-weight portfolio, 25.39% five-asset portfolio invested in the stocks with tickers of the portfolio CVaR risk is caused by the 20% GE, IBM, JPM, MSFT and WMT. We keep the dimen- investment in MSFT, while the 20% investment in sion of the problem low in order to allow interested WMT only causes 9.3% of total portfolio CVaR. The users to reproduce the results using a personal com- high risk contribution of MSFT is due to its high stan- puter. More realistic portfolio applications can be dard deviation and low average return (reported in obtained in a straightforward manner from the code percent): below, by expanding the number of parameters opti- > round(100 * mu , 2) mized. Interested readers can also see the DEoptim GE IBM JPM MSFT WMT package vignette for a 100-parameter portfolio opti- -0.80 0.46 -0.06 -0.37 0.0 mization problem that is typical of those encountered > round(100 * diag(sigma)^(1/2), 2) in practice. GE IBM JPM MSFT WMT We first download ten years of monthly data us- 8.90 7.95 9.65 10.47 5.33 ing the function getSymbols of the package quant- mod (Ryan, 2010). Then we compute the log-return We now use the function DEoptim of the package series and the mean and covariance matrix estima- DEoptim to find the portfolio weights for which the tors. For on overview of alternative estimators of the portfolio has the lowest CVaR and each investment covariance matrix, such as outlier robust or shrink- can contribute at most 22.5% to total portfolio CVaR

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 CONTRIBUTED RESEARCH ARTICLES 31 risk. For this, we first define our objective function from the last two lines, this minimum risk portfolio to minimize. The current implementation of DEop- has a higher expected return than the equal weight tim allows for constraints on the domain space. To portfolio. The following code illustrates that DEop- include the risk budget constraints, we add them to tim yields superior results than the gradient-based the objective function through a penalty function. As optimization routines available in R. such, we allow the search algorithm to consider in- > out <- optim(par = rep(0.2, 5), feasible solutions. A portfolio which is unacceptable + fn = obj, for the investor must be penalized enough to be re- + method = "L-BFGS-B", jected by the minimization process and the larger the + lower = rep(0, 5), violation of the constraint, the larger the increase in + upper = rep(1, 5)) the value of the objective function. A standard ex- > out$value pression of these violations is α × |violation|p. Crama [1] 0.1255093 and Schyns(2003) describe several ways to calibrate > out <- nlminb(start = rep(0.2, 5), the scaling factors α and p. If these values are too + objective = obj, small, then the penalties do not play their expected + lower = rep(0, 5), role and the final solution may be infeasible. On the + upper = rep(1, 5)) > out$objective other hand, if α and p are too large, then the CVaR [1] 0.1158250 term becomes negligible with respect to the penalty; thus, small variations of w can lead to large varia- Even on this relatively simple stylized example, tions of the penalty term, which mask the effect on the optim and nlminb routines converged to local 3 the portfolio CVaR. We set α = 10 and p = 1, but minima. recognize that better choices may be possible and de- Suppose now the investor is interested in the pend on the problem at hand. most risk diversified portfolio whose expected re- turn is higher than the equal-weight portfolio. This > obj <- function(w) { amounts to minimizing the largest CVaR contribu- + if (sum(w) == 0) { w <- w + 1e-2 } + w <- w / sum(w) tion subject to a return target and can be imple- + CVaR <- ES(weights = w, mented as follows: + method = "gaussian", > obj <- function(w) { + portfolio_method = "component", + if(sum(w) == 0) { w <- w + 1e-2 } + mu = mu, + w <- w / sum(w) + sigma = sigma) + contribCVaR <- ES(weights = w, + tmp1 <- CVaR$ES + method = "gaussian", + tmp2 <- max(CVaR$pct_contrib_ES - 0.225, 0) + portfolio_method = "component", + out <- tmp1 + 1e3 * tmp2 + mu = mu, + } + sigma = sigma)$contribution + tmp1 <- max(contribCVaR) The penalty introduced in the objective func- + tmp2 <- max(mean(mu) - sum(w * mu), 0) tion is non-differentiable and therefore standard + out <- tmp1 + 1e3 * tmp2 gradient-based optimization routines cannot be + } used. In contrast, DEoptim is designed to consistently > set.seed(1234) find a good approximation to the global minimum of > out <- DEoptim(fn = obj, the optimization problem: + lower = rep(0, 5), + upper = rep(1, 5)) > set.seed(1234) > wstar <- out$optim$bestmem > out <- DEoptim(fn = obj, > wstar <- wstar / sum(wstar) + lower = rep(0, 5), > rbind(tickers, round(100 * wstar, 2)) + upper = rep(1, 5)) par1 par2 par3 par4 par5 > out$optim$bestval tickers "GE" "IBM" "JPM" "MSFT" "WMT" [1] 0.1143538 "17.38" "19.61" "14.85" "15.19" "32.98" > wstar <- out$optim$bestmem > 100 * (sum(wstar * mu) - mean(mu)) > wstar <- wstar / sum(wstar) [1] 0.04150506 > rbind(tickers, round(100 * wstar, 2)) par1 par2 par3 par4 par5 This portfolio invests more in the JPM stock and tickers "GE" "IBM" "JPM" "MSFT" "WMT" less in the GE (which has the lowest average return) "18.53" "21.19" "11.61" "13.37" "35.3" compared to the portfolio with the upper 22.5% per- > 100 * (sum(wstar * mu) - mean(mu)) centage CVaR constraint. We refer to Boudt et al. [1] 0.04827935 (2010a) for a more elaborate study on using CVaR al- locations as an objective function or constraint in the Note that the main differences with the equal- portfolio optimization problem. weight portfolio is the low weights given to JPM and A classic risk/return (i.e., CVaR/mean) scatter MSFT and the high weight to WMT. As can be seen chart showing the results for portfolios tested by

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 32 CONTRIBUTED RESEARCH ARTICLES

DEoptim is displayed in Figure3. Gray elements for the solution of global optimization problems in depict the results for all tested portfolios (using a wide variety of fields. We have referred interested hexagon binning which is a form of bivariate his- users to Price et al.(2006) and Mullen et al.(2011) for togram, see Carr et al.(2011); higher density regions a more extensive introduction, and further pointers are darker). The yellow-red line shows the path of to the literature on DE. The utility of using DEoptim the best member of the population over time, with was further demonstrated with a simple example of a the darkest solution at the end being the optimal port- stylized non-convex portfolio risk contribution allo- folio. We can notice how DEoptim does not spend cation, with users referred to PortfolioAnalytics for much time computing solutions in the scatter space portfolio optimization using DE with real portfolios that are suboptimal, but concentrates the bulk of the under non-convex constraints and objectives. calculation time in the vicinity of the final best port- The latest version of the package includes the folio. Note that we are not looking for the tangency adaptive Differential Evolution framework by Zhang portfolio; simple mean/CVaR optimization can be and Sanderson(2009). Future work will also be di- achieved with standard optimizers. We are looking rected at parallelization of the implementation. The here for the best balance between return and risk con- DEoptim project is hosted on R-forge at https:// centration. r-forge.r-project.org/projects/deoptim/. We have utilized the mean return/CVaR concen- tration examples here as more realistic, but still styl- ized, examples of non-convex objectives and con- Acknowledgements straints in portfolio optimization; other non-convex objectives, such as drawdown minimization, are also The authors would like to thank Dirk Eddelbuettel, common in real portfolios, and are likewise suitable Juan Ospina, and Enrico Schumann for useful com- to application of Differential Evolution. One of the ments and Rainer Storn for his advocacy of DE and key issues in practice with real portfolios is that a making his code publicly available. The authors are portfolio manager rarely has only a single objective also grateful to two anonymous reviewers for help- or only a few simple objectives combined. For many ful comments that have led to improvements of this combinations of objectives, there is no unique global note. Kris Boudt gratefully acknowledges financial optimum, and the constraints and objectives formed support from the National Bank of Belgium. lead to a non-convex search space. It may take sev- eral hours on very fast machines to get the best an- swers, and the best answers may not be a true global Disclaimer optimum, they are just as close as feasible given poten- tially competing and contradictory objectives. The views expressed in this note are the sole respon- When the constraints and objectives are relatively sibility of the authors and do not necessarily reflect simple, and may be reduced to quadratic, linear, or those of aeris CAPITAL AG, Guidance Capital Man- conical forms, a simpler optimization solver will pro- agement, Cheiron Trading, or any of their affiliates. duce answers more quickly. When the objectives Any remaining errors or shortcomings are the au- are more layered, complex, and potentially contra- thors’ responsibility. dictory, as those in real portfolios tend to be, DE- optim or other global optimization algorithms such as those integrated into PortfolioAnalytics provide Bibliography a portfolio manager with a feasible option for opti- mizing their portfolio under real-world non-convex D. Ardia, K. M. Mullen, B. G. Peterson and J. Ulrich. constraints and objectives. DEoptim: Differential Evolution Optimization in R, The PortfolioAnalytics framework allows any ar- 2011. URL http://CRAN.R-project.org/package= bitrary R function to be part of the objective set, and DEoptim. R package version 2.1-0. allows the user to set the relative weighting that they want on any specific objective, and use the appropri- K. Boudt, P. Carl, and B. G. Peterson. Portfolio op- ately tuned optimization solver algorithm to locate timization with conditional value-at-risk budgets. portfolios that most closely match those objectives. Working paper, 2010a. K. Boudt, P. Carl, and B. G. Peterson. Port- folioAnalytics: Portfolio Analysis, including Nu- Summary meric Methods for Optimization of Portfolios, 2010b. URL http://r-forge.r-project.org/projects/ In this note we have introduced DE and DEoptim. returnanalytics/. R package version 0.6. The package DEoptim provides a means of applying the DE algorithm in the R language and environment P. Carl and B. G. Peterson. PerformanceAnalytics: for statistical computing. DE and the package DE- Econometric tools for performance and risk analy- optim have proven themselves to be powerful tools sis., 2011. URL http://r-forge.r-project.org/

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 CONTRIBUTED RESEARCH ARTICLES 33

0.4

0.2

0 monthly mean (in %)

−0.2

−0.4

14 16 18 20 22 24 monthly CVaR (in %)

Figure 3: Risk/return scatter chart showing the results for portfolios tested by DEoptim.

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859 34 CONTRIBUTED RESEARCH ARTICLES

projects/returnanalytics/. R package version J. A. Ryan. quantmod: Quantitative Financial Modelling 1.0.3.3. Framework, 2010. URL http://CRAN.R-project. org/package=quantmod. R package version 0.3-16. D. Carr, N. Lewin-Koh and M. Maechler. hexbin: http:// Hexagonal binning routines, 2011. URL O. Scaillet. Nonparametric estimation and sensitiv- CRAN.R-project.org/package=hexbin . R package ity analysis of expected shortfall. Mathematical Fi- version 1.26.0 nance, 14(1):74–86, 2002. Y. Crama and M. Schyns. Simulated annealing for complex portfolio selection problems. European R. Storn and K. Price. Differential Evolution – A sim- Journal of Operations Research, 150(3):546–571, 2003. ple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimiza- C. Fábián and A. Veszprémi. Algorithms for han- tion, 11:341–359, 1997. dling CVaR constraints in dynamic stochastic pro- gramming models with applications to finance. D. Würtz, Y. Chalabi, W. Chen and A. Ellis Portfolio Journal of Risk, 10(3):111–131, 2008. Optimization with R/Rmetrics. Rmetrics Association M. Gilli and E. Schumann. Heuristic optimisa- and Finance Online, 2009. tion in financial modelling. COMISEF wps-007 J. Zhang and A. C. Sanderson. JADE: Adap- 09/02/2009, 2009. tive Differential Evolution with optional external J. H. Holland. Adaptation in Natural Artificial Systems. archive. IEEE Transactions on Evolutionary Compu- University of Michigan Press, 1975. tation, 13(5):945–958, 2009.

T. Krink and S. Paterlini. Multiobjective optimization S. Zhu, D. Li, and X. Sun. Portfolio selection with using Differential Evolution for real-world portfo- marginal risk control. Journal of Computational Fi- lio optimization. Computational Management Sci- nance, 14(1):1–26, 2010. ence, 8:157–179, 2011. S. Maillard, T. Roncalli, and J. Teiletche. On the prop- erties of equally-weighted risk contributions port- David Ardia folios. Journal of Portfolio Management, 36(4):60–70, aeris CAPITAL AG, Switzerland 2010. [email protected] D. G. Maringer and M. Meyer. Smooth transition Kris Boudt autoregressive models: New approaches to the Lessius and K.U.Leuven, Belgium model selection problem. Studies in Nonlinear Dy- [email protected] namics & Econometrics, 12(1):1–19, 2008. K. M. Mullen, D. Ardia, D. L. Gil, D. Windover, and Peter Carl J. Cline. DEoptim: An R package for global opti- Guidance Capital Management, Chicago, IL mization by Differential Evolution. Journal of Sta- [email protected] tistical Software, 40(6):1–26, 2011. Katharine M. Mullen K. V. Price, R. M. Storn, and J. A. Lampinen. Differ- National Institute of Standards and Technology ential Evolution: A Practical Approach to Global Opti- Gaithersburg, MD mization. Springer-Verlag, Berlin, Heidelberg, sec- [email protected] ond edition, 2006. ISBN 3540209506. E. Qian. Risk parity portfolios: Efficient portfolios Brian G. Peterson through true diversification of risk. PanAgora As- Cheiron Trading, Chicago, IL set Management working paper, 2005. [email protected]

The R Journal Vol. 3/1, June 2011 ISSN 2073-4859