<<

Simulated Annealing

A Basic Introduction

Lecture 10

Olivier de Weck, Ph.D. Massachusetts Institute of Technology

1 Outline

• Background in Statistical Mechanics – Atom Configuration Problem – Metropolis • Simulated Annealing Algorithm • Sample Problems and Applications • Summary

2 Learning Objectives

• Review background in Statistical Mechanics: configuration, ensemble, entropy, heat capacity • Understand the basic assumptions and steps in Simulated Annealing (SA) • Be able to transform design problems into a combinatorial optimization problem suitable to SA • Understand strengths and weaknesses of SA

3 Heuristics

4 What is a ?

• A Heuristic is simply a rule of thumb that hopefully will find a good answer.

• Why use a Heuristic?

– Heuristics are typically used to solve complex (large, nonlinear, non- convex (i.e. contain local minima)) multivariate combinatorial optimization problems that are difficult to solve to optimality.

• Unlike gradient-based methods in a convex design space, heuristics are NOT guaranteed to find the true global optimal solution in a single objective problem, but should find many good solutions (the mathematician's answer vs. the engineer’s answer)

• Heuristics are good at dealing with local optima without getting stuck in them while searching for the global optimum.

5 Types of Heuristics

• Heuristics Often Incorporate Randomization

• Two Special Cases of Heuristics – Construction Methods • Must first find a feasible solution and then improve it. – Improvement Methods • Start with a feasible solution and just try to improve it.

• 3 Most Common Heuristic Techniques – Simulated Annealing – Genetic – New Methods: Particle Swarm Optimization, etc…

6 Origin of Simulated Annealing (SA)

• Definition: A heuristic technique that mathematically mirrors the cooling of a set of atoms to a state of minimum energy.

• Origin: Applying the field of Statistical Mechanics to the field of Combinatorial Optimization (1983)

• Draws an analogy between the cooling of a material (search for minimum energy state) and the solving of an optimization problem.

• Original Paper Introducing the Concept – Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671- 680.

7 MATLAB® “peaks” function

• Difficult due to plateau at z=0, local maxima

• SAdemo1 Peaks 3 • x =[-2,-2] o 2 • Optimum at Maximum 1 • x*=[ 0.012, 1.524]

y 0 • z*=8.0484 -1

-2 Start Point -3 -3 -2 -1 0 1 2 3 x

8 “peaks” convergence • Initially ~ nearly random search • Later ~ gradient search SA convergence history 2 current configuration new best configuration 0

-2

-4 System Energy -6

-8

-10 0 50 100 150 200 250 300 Iteration Number 9 Statistical Mechanics

10 The Analogy

• Statistical Mechanics: The behavior of systems with many degrees of freedom in thermal equilibrium at a finite temperature.

• Combinatorial Optimization: Finding the minimum of a given function depending on many variables.

• Analogy: If a liquid material cools and anneals too quickly, then the material will solidify into a sub-optimal configuration. If the liquid material cools slowly, the crystals within the material will solidify optimally into a state of minimum energy (i.e. ground state). – This ground state corresponds to the minimum of the cost function in an optimization problem.

11 Sample Atom Configuration

Original Configuration Perturbed Configuration Atom Configuration - Sample Problem Atom Configuration - Sample Problem 7 7

6 6

5 5

4 3 4

3 4 3 4 2 2 2 3 2

1 1 1 1

0 0

-1 -1 -1 0 1 2 3 4 5 6 7 -1 0 1 2 3 4 5 6 7 E=133.67 E=109.04 Perturbing = move a random atom Energy of original (configuration) to a new random (unoccupied) slot 12 Configurations

• Mathematically describe a configuration – Specify coordinates of each “atom”

1 3 4 5 r with i=1,..,4 r , r , r , r i 11 2 2 3 4 4 3

– Specify a slot for each atom

rRi with i=1,..,4 1 12 19 23

13 Energy of a state

• Each state (configuration) has an energy level associated with it

H q,,,, q t qii p L q q t Hamiltonian i

HTVEtot Energy of configuration = Objective function value of design kinetic potential

Energy 14 Energy sample problem

• Define energy function for “atom” sample problem 1 N 222 Ei mgy i x i x j y i y j potential ji energy kinetic energy

N Absolute and relative position of E R Eii r i 1 each atom contributes to Energy

15 Compute Energy of Config. A

• Energy of initial configuration Atom Configuration - Sample Problem 7 E 1 10 1 5 18 20 20.95 6 1 5

4 E2 1 10 2 5 5 5 26.71 3 3 4 E3 1 10 4 18 5 2 47.89 2 2 1 1 E4 1 10 3 20 5 2 38.12 0

-1 -1 0 1 2 3 4 5 6 7

Total Energy Configuration A: E({rA})= 133.67

16 Boltzmann

P! Number of configurations NR PN! P=# of slots=25 N =6,375,600 N=# of atoms =4 R

What is the likelihood that a particular configuration will exist in a large ensemble of configurations?

Er({ }) Boltzmann probability Pr{ } exp depends on energy kTB and temperature

17 Boltzmann Collapse at low T T=100 T=10 T=1

T=100 - E =100.1184 T=10 - E =58.9245 T=1 - E =38.1017 avg avg avg 14000 14000

12000 12000 12000

10000 10000 10000

8000 8000 8000

6000

Occurences 6000

Occurences 6000 Occurences

4000 4000 4000

2000 2000 2000

0 0 20 40 60 80 100 120 140 160 180 200 0 0 20 40 60 80 100 120 140 160 180 200 0 Energy 0 20 40 60 80 100 120 140 160 180 200 Energy Energy T high T low

Boltzmann Distribution collapses to the lowest energy state(s) in the limit of low temperature

Basis of search by Simulated Annealing

18 Partition Function Z

• Ensemble of configurations can be described statistically • Partition function, Z, generates the ensemble average

EENR Z Tr expii exp kBB Ti 1 k T

Initially (at T>>0) equal to the number of possible configurations

19 Free Energy

NR Average Energy of all EETETavg() i Configurations in an i 1 ensemble

F T kB Tln Z E ( T ) TS

NR ETi exp ETi i 1 kTB dZln EETavg () Z d1 kB T

Relates average Energy at T with Entropy S

20 Specific Heat and Entropy

• Specific Heat

NR 1 2 2 ETETi () 2 2 dE() T NR i 1 ETET() CT 22 dT kBB T k T • Entropy T 1 CT() dS()() T C T S()() T S T1 dT T T dT T

Entropy is ~ equal to ln(# of unique configurations)

21 Low Temperature Statistics

Specific Heat C and Entropy S 5 10 10

T Entropy S 8 decreases 6 with lower T 4

2 # # of configurations, N

# configs 0 Ht. Capacity 10 0 0 2 4 0 2 4 decreases 10 10 10 10 10 10 Temperature Temperature

150 150 g

100 v 100 a 50 50

0 0

-50 -50 Approaches Free Energy, F Free

-100 Energy, Average E -100 E* from Approaches -150 -150 above 0 2 4 0 2 4 E* from 10 10 10 10 10 10 Temperature Temperature below 22 Minimum Energy Configurations • Sample Atom Placement Problem Uniqueness?

Atom Configuration - Sample Problem Atom Configuration - Sample Problem 7 7

6 6

5 E*=60 5 E*=14.26 4 4

3 3

2 2 3 2

1 2 1 4 3 1 4 1

0 0

-1 -1 -1 0 1 2 3 4 5 6 7 -1 0 1 2 3 4 5 6 7 Strong gravity g=10 Low gravity g=0.1 23 Simulated Annealing

24 Dilemma

• Cannot compute energy of all configurations ! – Design space often too large – Computation time for a single function evaluation can be large • Use Metropolis Algorithm, at successively lower temperatures to find low energy states – Metropolis: Simulate behavior of a set of atoms in thermal equilibrium (1953) – Probability of a configuration existing at T  Boltzmann Probability P(r,T)=exp(-E(r)/T)

25 The SA Algorithm • Terminology: – X (or R or ) = Design Vector (i.e. Design, Architecture, Configuration) – E = System Energy (i.e. Objective Function Value) – T = System Temperature – = Difference in System Energy Between Two Design Vectors

• The Simulated Annealing Algorithm

1) Choose a random Xi, select the initial system temperature, and specify the cooling (i.e. annealing) schedule

2) Evaluate E(Xi) using a simulation model

3) Perturb Xi to obtain a neighboring Design Vector (Xi+1)

4) Evaluate E(Xi+1) using a simulation model

5) If E(Xi+1)< E(Xi), Xi+1 is the new current solution

6) If E(Xi+1)> E(Xi), then accept Xi+1 as the new current solution with a (- /T) probability e where = E(Xi+1) - E(Xi). 7) Reduce the system temperature according to the cooling schedule. 8) Terminate the algorithm. 26 SA Block Diagram

Define initial Evaluate energy Start To configuration Ro E(Ro)

Compute energy Perturb configuration Evaluate energy End Tmin y T Ri+1 E(Ri+1) E = E(Ri+1)-E(Ri) n

Reduce temperature Reached y equilbrium Accept Ri+1 as y Ti+1 =Tj - T new configuration E < 0? at Tj ? y n Keep R as current i E Create random configuration n exp - > v? T number v in [0,1] Metropolis step SA Block Diagram

Image by MIT OpenCourseWare.

27 MATLAB Function: SA.m

Best Configuration Initial Configuration SA.m xbest Ebest xo Simulated Annealing Option Flags Algorithm History options xhist

evaluation perturbation function function file_eval.m file_perturb.m

28 Exponential Cooling

k • Typically (T1/To)~0.7-0.9 T TT1 Simulated Annealing Evolution kk1 10 C-specific heat To 9 S-entropy T/10-temperature 8 7 Can also do linear 6 or step-wise cooling 5

4

3 …

2 But exponential cooling 1 often works best.

0 0 5 10 15 20 25 30 35 40 45 Temperature Step

29 Key Ingredients for SA

1. A concise description of a configuration (architecture, design, topology) of the system (Design Vector).

2. A random generator of rearrangements of the elements in a configuration (Neighborhoods). This generator encapsulates rules so as to generate only valid configurations. Perturbation function.

3. A quantitative objective function containing the trade-offs that have to be made (Simulation Model and Output Metric(s)). Surrogate for system energy.

4. An annealing schedule of the temperatures and/or the length of times for which the system is to be evolved.

30 Sample Problems

31 Traveling Salesman Problem

• N cities arranged randomly on [-1,1] • Choose N=15 • SAdemo2.m • Minimize “cost” of route (length, time,…) • Visit each city once, return to start city

N 22 22 l R xj R i11 x j R i x j R x j R N i1 j 1 j 1

visit N cities return home to first city

32 TSP Problem (cont.)

Initial (Random) Route Final (Optimized) Route Length: 17.43 Length: 8.24

Length of TSP Route: 17.4309 Length of TSP Route: 8.2384 1 2 1 2 7 10 7 10 0.8 11 0.8 11

0.6 Start1 0.6 1

0.4 8 0.4 8 3 3 0.2 14 12 0.2 14 Start12 9 9 0 0

-0.2 -0.2

-0.4 -0.4 6 6 -0.6 5 -0.6 5 4 4

-0.8 13 -0.8 13

-1 15 -1 15 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 Shortest Route Result with SA 33 Structural Optimization

• Define: – Design Domain find i iN 1,..., – Boundary Conditions T minC f u (i ) – Loads s.t. u K1 f – Mass constraint N

• Subdivide domain s.t. Vmii max – N x M design “cells” i 1 – Cell density =1 or =0 • Where to put material to minimize compliance?

34 Structural Topology Optimization (II) “Energy” = strain energy = compliance Computed via Finite Element Analysis

41 4242 43 44 45 46 47 48 49 50 43 44 45 28 29 30 31 32 4633 34 35 36 47 31 3232 33 34 35 36 37 38 48 39 40 33 34 49 35 19 20 21 22 23 3624 25 26 27 50 37 38 21 22 23 24 25 26 27 28 29 30 24 25 39 10 11 12 13 14 26 15 16 17 18 40 Applied Load 27 11 1212 13 14 15 16 17 1828 19 20 13 14 29 15 1 2 3 4 5 16 6 7 8 9 30 17 1 22 3 4 5 6 7 188 9 10 3 4 5 19 6 20 7 Cantilever 8 9 F=-2000 Boundary 10 condition

Deformation not drawn to scale

35 Structural Toplogy Optimization

Initial Structural Toplogy “Final” Toplogy Initial Structural Configuration: x o “island” 9 Structural Configuration 8

7

6

5

4

3

2

1

0

-1 0 2 4 6 8 10 Compliance C=236.68 Compliance C=593.0 Not “optimal” – often need post-processing 36 Structural Optimization – Convergence Analysis Simulated Annealing Evolution 14 45 C-specific heat T=7.4 S-entropy 40 12 ln(T)-temperature Don’t terminate T=36 35 T=19 when C is still 10 30 T=6.7 T=10T=11 large ! T=13 T=6 8 25 T=21 T=9.2T=8.2 T=16 T=29T=68 T=93 20 T=24 T=1.8e+002 6 T=55 Occurences T=40 T=3e+002 T=17 T=1.9e+002 T=32 T=2.2e+002 T=2.7e+002 15 T=14 T=1.3e+002T=1.6e+002 T=2.4e+002 T=61T=45T=1e+002 4 T=49T=84 T=26 T=1.1e+002 10 T=75 T=1.4e+002

2 5

0 0 0 5 10 15 20 25 30 35 40 200 400 600 800 1000 1200 1400 1600 1800 2000 2200 Temperature Step Energy Boltzmann Distributions Evolution of - Entropy, Temperature, Specific Heat

37 Premature termination

SA convergence history 2200 current configuration 2000 new best configuration

1800

1600

1400

1200

1000 System Energy

800

600

400

200 0 500 1000 1500 2000 2500 3000 Iteration Number

Indicator: Best Configuration found only shortly before Simulated Annealing terminated. 38 Final Example: Telescope Array Optimization

• Place N=27 stations in xy within a 200 km radius • Minimize UV density metric • Ideally also minimize cable length

Cable Length = 1064.924 UV Density = 0.67236 200 100

100 50

0 0 Initial Configuration -100 -50

-200 -100 -200 -100 0 100 200 -100 -50 0 50 100

39 Optimized Solution

Cable Length = 1349.609 UV Density = 0.33333 200 100

100 50

0 0

-100 -50

-200 -100 -200 -100 0 100 200 -100 -50 0 50 100

Simulated Annealing Improved UV density from 0.67 to 0.33

• Simulated Annealing transforms the array: – Hub-and-Spoke  Circle-with-arms Cohanim B. E., Hewitt J. N., and de Weck O.L., “The Design of Radio Telescope Array Configurations using Multiobjective Optimization: Imaging Performance versus Cable Length”, The Astrophysical Journal, Supplement Series, 154, 705-719, October 2004 40 Summary

41 Summary: Steps of SA

• The Simulated Annealing Algorithm

1) Choose a random Xi, select the initial system temperature, and outline the cooling (ie. annealing) schedule

2) Evaluate E(Xi) using a simulation model

3) Perturb Xi to obtain a neighboring Design Vector (Xi+1)

4) Evaluate E(Xi+1) using a simulation model

5) If E(Xi+1)< E(Xi), Xi+1 is the new current solution

6) If E(Xi+1)> E(Xi), then accept Xi+1 as the new current solution with a (- /T) probability e where = E(Xi+1) - E(Xi). 7) Reduce the system temperature according to the cooling schedule. 8) Terminate the algorithm.

42 Recent Research in SA

• Alternative Cooling Schedules and Termination criteria • Adaptive Simulated Annealing (ASA) – determines its own cooling schedule • Hybridization with other Heuristic Search Methods (GA, Tabu Search …) • Multiobjective Optimization with SA

43 References

• Cerny, V., "Thermodynamical Approach to the Traveling Salesman Problem: An Efficient Simulation Algorithm", J. Opt. Theory Appl., 45, 1, 41-51, 1985 .

• de Weck O. “System Optimization with Simulated Annealing (SA), Memorandum SA.pdf

• Cohanim B. E., Hewitt J. N., and de Weck O.L., “The Design of Radio Telescope Array Configurations using Multiobjective Optimization: Imaging Performance versus Cable Length”, The Astrophysical Journal, Supplement Series, 154, 705-719, October 2004

• Jilla, C.D., and Miller, D.W., “Assessing the Performance of a Heuristic Simulated Annealing Algorithm for the Design of Distributed Satellite Systems,” Acta Astronautica, Vol. 48, No. 5-12, 2001, pp. 529-543.

• Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P., “Optimization by Simulated Annealing,” Science, Volume 220, Number 4598, 13 May 1983, pp. 671-680.

• Metropolis,N., A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller, "Equation of State Calculations by Fast Computing Machines", J. Chem. Phys.,21, 6, 1087-1092, 1953.

44 MIT OpenCourseWare http://ocw.mit.edu

ESD.77 / 16.888 Multidisciplinary System Design Optimization Spring 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.