Examination of Particle Tails

Hindawi Publishing Corporation Journal of Artificial Evolution and Applications Volume 2008, Article ID 893237, 10 pages doi:10.1155/2008/893237 Research Article Examination of Particle Tails Tim Blackwell and Dan Bratton Department of Computing Goldsmiths, University of London, New Cross, London SE14 6NW, UK Correspondence should be addressed to Tim Blackwell, [email protected] Received 20 July 2007; Accepted 29 November 2007 Recommended by Riccardo Poli The tail of the particle swarm optimisation (PSO) position distribution at stagnation is shown to be describable by a power law. This tail fattening is attributed to particle bursting on all length scales. The origin of the power law is concluded to lie in multiplicative randomness, previously encountered in the study of first-order stochastic difference equations, and generalised here to second-order equations. It is argued that recombinant PSO, a competitive PSO variant without multiplicative randomness, does not experience tail fattening at stagnation. Copyright © 2008 T. Blackwell and D. Bratton. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. INTRODUCTION full model, or can be imposed for the purposes of the oretical analysis. In general, particles are expected to search in the im- This paper focuses on the effective sampling distribution of mediate vicinity of the attractors, and indeed this behaviour particle swarm optimisation (PSO). PSO is a population- is necessary for convergence. The central portion of the sam- based global optimisation technique. The algorithm is pling distribution is therefore coincident with this region. made of two essential steps: particle movement (dynamics) However, the swarm must retain an ability to explore outside through a search space of solutions and an indirect parti- the attractor vicinity if premature convergence, a problem in cle interaction which is mediated by information sharing complex environments, is to be avoided. This exploration is within a social network. Particles place personal “attractors” due to the finite tail of the sampling distribution; there is a in the search space at positions corresponding to the best lo- small probability that a particle will be displaced far from the cation (as determined by the evaluation of an objective func- region of convergence, and this probability increases with tail tion) that they have visited. The dynamical update rule which fatness. specifies the acceleration of a particle i is a sum of an attrac- Some insights into the PSO sampling distribution have tion towards a personal attractor and an attraction towards been provided by a study of “bare bones” formulations [4]. the best attractor amongst other particles in i’s social neigh- In this work, Kennedy replaced the particle update rule with bourhood [1–3]. The particle acceleration is added to parti- sampling from a Gaussian distribution with mean at the cen- cle velocity in a discretisation of particle kinematics; the ve- troid of the personal and neighbourhood best positions of locity is added in turn to particle position and a new trial each particle, and standard deviation was set to the sepa- solution is achieved. Despite the simplicity of this scheme, ration between them. Performance comparisons with a few the algorithm is effective over the standard benchmark prob- benchmark problems were disappointing (later confirmed lems and has increasing numbers of real-world applications. in [5]), but the addition by hand of particle “bursts” ame- However, little is known about how a PSO achieves its results. liorated the situation somewhat, indicating that the tails of The sampling distribution provides the probability den- this Gaussian distribution function are too thin to enable es- sity of particle placement between and around fixed attrac- cape from stagnation. Further evidence for this conclusion tors. This scenario occurs when the swarm “stagnates,” the has been provided by a study of Levy´ bare bones [5]where swarm is not improving, the network of attractors is fixed, more extensive trials showed that fat tails, as provided by the and the particles are moving independently of each other, Levy´ distribution, improve bare bones performance so that that is, the particles decouple. Stagnation can occur in the it becomes effectively equivalent to standard PSO. A recent 2 Journal of Artificial Evolution and Applications gether in a concluding section and some open research ques- 0. initialise swarm tions are outlined. FOR EACH particle i = randomise vi, xi,setpi xi 2. PSO AND STOCHASTIC DIFFERENCE EQUATIONS FOR EACH particle i 1. find neighbourhood best This section recasts PSO as a stochastic difference equation = arg min ( ( ), ∈ ) gi f pj j Ni (SDE). Since this is an unfamiliar form of PSO, a few re- FOR EACH dimension d 2. move particle lations from the literature governing parameter choice are gathered together and presented here. F(p):xid ← xid 3. update memory Each PSO particle i in the swarm has dynamic variables IF f (xi) <f(pi)THEN position and velocity, xi and vi,andamemorypi of a past pi ← xi position visited. Furthermore, each particle is embedded in END a social, rather than spatial, neighbourhood Ni of informers. END The algorithm is outlined in Algorithm 1. A general dynamic rule F(p) for a single particle position update is Algorithm 1: Particle swarm opimisation. K vid(t +1)= wvid(t)+ Φj (t) pjd − xid(t) , j=1 (1) = theoretical analysis of bare bones PSO has been given in [6]. xid(t +1) xid(t)+vid(t +1). However, a Gaussian bare bones version of a “fully informed” particle swarm (FIPS) was able to deliver good performance The sum in (1)isoverK informers pi and for each di- overasmalltestbedoffunctions[7], which suggests that par- mension d. In standard PSO [11], however, K = 2 and the ticle tails might not be so important, at least for FIPS. two attractors pi and pgi are the personal and neighbourbood It is known that decoupled PSO exhibits bursts of outliers best positions. In this case, [8]. Bursts are temporary excursions, along a coordinate axis, = of the particle to large distances from the attractors. A burst Φ1,2 φ1,2u1,2,(2) will typically grow to a maximum and then return through ∼ a number of damped oscillations to the region of the attrac- where φ1,2 are “acceleration” constants and u1,2 U(0, 1) are tors. The addition of bursts to Gaussian bare bones increases random numbers drawn from the uniform distribution on the chance of particle displacement away from the attractors, the unit interval. This “inertia weight” (IW) formulation of fattening the tail of the overall sampling distribution. Levy´ PSO [12]derivesitsnamefromtheparameterw which imi- distributions also produce particle outliers, but these outliers tates inertia in the sense that it weights the tendency to move = are not correlated; they do not occur in sequence along a in a straight line at constant speed (w 1) to the tendency to single axis. At the moment, the circumstances, when a se- move erratically about the attractors Φ. Other formulations quence of correlated outliers (as manifest, for example, in of PSO include Kennedy and Mendes’ [13] fully informed PSO bursts) might prove to be fortuitous, are unknown. This particle swarm (FIPS), wherein a particle is influenced by = paper investigates the tail of the PSO sampling distribution. K>2neighbours,Φj (1/K)φ j uj , and the “constricted” Our results show that this tail is described by power laws and Clerc-Kennedy (CK) swarm [14] which is equivalent to (1) = = CK is therefore fatter than Gaussian. This fattening is indeed due with the identification χ w, φ1,2 χφ1,2 . to bursting, the origin of which is concluded to lie in a pro- At stagnation, we only need to consider a single parti- cess known as multiplicative randomness. cle in one dimension, so particle labels i will be dropped. By Bursting is already known to occur in first-order stochas- virtue of v(t) = x(t) − x(t − 1), (1)canberewrittenasa tic difference equations [9]. This paper generalises the first second-order stochastic difference equation (SDE) order results to second order difference equations with multiplicative randomness, a class of equations that includes x(t +1)+a(t)x(t)+bx(t − 1) = c(t)(3) standard PSO. Since the amount of bursting is dependent on the range of the probability distributions, the paper begins by with discussing possible parameter regimes. This is accomplished = − − by a formulation of PSO as a finite difference equation. The a(t) Φj (t) w 1, j paper continues with a series of experiments which reveal the b = w, (4) power law tails of the sampling distribution. Section 4 intro- duces multiplicative randomness and power law tails in first- c(t) = Φj (t)pj , order processes and extends these results to second-order dif- j ference equations. A competitive reformulation PSO without multiplicative randomness has recently been proposed (re- where the parameters a and c are stochastic variables because combinant PSO, [10]). Section 5 investigates decoupled re- of the presence of random numbers in (2). (However, they combinant PSO and demonstrates the consequent thinness are not independent because the same random numbers u1 of the distribution tail. The results of this paper are drawn to- and u2 appear in a and c.) T. Blackwell and D. Bratton 3 Constant parameter SDEs have been studied by many The CK condition for complex eigenvalues and oscilla- authors in a number of domains (e.g., [15] gives references tion, a2 < 4b,becomesa = χφ, b = χ, to population dynamics, epidemics, ARCH(1) processes, in- vestment portfolios, immigrant populations, and internet |χ| < 1, ff usage). A constant parameter second-order di erence equa- 1 2 1 2 (13) tion, a(t) = a, c(t) = a, Φ(t) = Φ, is obtained by replac- 1+ − √ <φCK < 1+ − √ .

Load more