Using Differential Evolution to Improve Pheromone-based Coordination of Swarms of Drones for Collaborative Target Detection

Mario G. C. A. Cimino, Alessandro Lazzeri and Gigliola Vaglini Department of Information Engineering, Università di Pisa, Largo Lazzarino 1, Pisa, Italy

Keywords: Differential Evolution, Parametric Adaptation, Collaborative Target Detection, Marker-based Stigmergy, .

Abstract: In this paper we propose a novel algorithm for adaptive coordination of drones, which performs collaborative target detection in unstructured environments. Coordination is based on digital pheromones released by drones when detecting targets, and maintained in a virtual environment. Adaptation is based on the Differential Evolution (DE) and involves the parametric behaviour of both drones and environment. More precisely, attractive/repulsive pheromones allow indirect communication between drones in a flock, concerning the availability/unavailability of recently found targets. The algorithm is effective if structural parameters are properly tuned. For this purpose DE combines different parametric solutions to increase the swarm performance. We focus first on the study of the principal parameters of the DE, i.e., the crossover rate and the differential weight. Then, we compare the performance of our algorithm with three different strategies on six simulated scenarios. Experimental results show the effectiveness of the approach.

1 INTRODUCTION AND In our approach, stigmergy and flocking are exploited to enable self-coordination in the MOTIVATION navigation of unstructured environments. Such environments typically contain a number of The Differential Evolution algorithm (DE) is a obstacles such as trees and buildings. Targets are stochastic population based algorithm, very suited placed according to different patterns. More for numerical and multi-modal optimization specifically, stigmergy implies the use of an problems. Recently, DE has been applied in many attractive digital pheromone to locally coordinate research and application areas. Compared with other drones. When a drone detects a new piece of target, population-based algorithms, such as genetic it releases the attractive pheromone. Other drones in algorithm and particle swarm optimization, DE the neighbourhood can sense and follow the exhibits excellent performance both in unimodal, pheromone gradient to cooperate in detecting the multimodal, separable, and non-separable problems. pattern of targets. With respect to (Cimino, 2015b) Moreover, it is much simpler to use because it has in this study we sensibly improve the performance only three parameters (Das, 2011): the population N, of the algorithm, by adopting the parametric DE- the differential weight F, and the crossover rate CR. based adaptation, by introducing repulsive When managing the DE algorithm, the main effort is pheromone and scattering mechanisms. Basically, to properly set the parameters in order to avoid flocking implies a set of three behavioural rules, premature convergence towards a non-optimal named separate, align, and scatter. It maintains the solution. As reported by literature, there is no drones in groups enhancing the stigmergy when dominant setting, and for each problem a proper occurring. In contrast, the repulsive pheromone study of the behaviour of DE is needed to find the helps the drones to avoid multiple exploration of the correct parameterization. same zone. Indeed, the drone releases a repulsive In this paper, we adopt DE algorithm to optimize pheromone whereas it does not sense a target. the behaviour of a swarm of aerial drones carrying Finally, the scattering rule makes a drone to turn out collaborative target detection (Cimino, 2015b).

605

Cimino, M., Lazzeri, A. and Vaglini, G. Using Differential Evolution to Improve Pheromone-based Coordination of Swarms of Drones for Collaborative Target Detection. DOI: 10.5220/0005732606050610 In Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2016), pages 605-610 ISBN: 978-989-758-173-1 Copyright c 2016 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods

by an angle when it tails another drone. position of each robot, the next position, and the In the literature there are two design methods to possible collision are provided to DE. In the develop collective behaviours in swarm systems: decentralized formulation, each robot runs DE for behaviour-based design and automatic design itself considering the information of neighbour (Brambilla, 2013). The former implies the robots. Authors conclude that the decentralized developers to implement, study, and improve the approach needs less time in comparison to the behaviour of each single individual until the desired centralized one; moreover the performance is collective behaviour is achieved. This is the comparable to PSO. In our approach, we consider to approach adopted in (Cimino, 2015b). The latter is use DE offline to find a proper and general purpose usually used to reduce the effort of the developers. parameter tuning for the swarms. Moreover, in our Automatic design methods can be furtherly divided formulation drones have a limited computing in two categories: reinforcement learning and capability, and then an online execution of DE is not evolutionary robotics (Brambilla, 2013). The first feasible. implies a definition at the individual level of positive In (Cruz-Alvarez, 2013) DE/1/rand/bin is used and repulsive reinforce to give reward to the with F=0.7, CR=1.0, N=120 for 250 generations, and individual. In general, it is usually hard for the another variant called DE/1/best/bin is used with developer to decompose the collective output of the F=0.8, CR=1.0, N=150 for 200 generations to tune swarm in individual rewards. Evolutionary robotics the behaviour of a robot in wall-following task. Here implies evolutionary techniques inspired by the it seems that DE/1/best/bin is able to find a slightly Darwinian principle of selection and evolution. better solution than DE/1/rand/bin. However, Generally in these methods each swarm consists of authors used different parameters settings (F, N and individuals with the same behaviour. A population number of generations) for each variant, thus a of swarms is then computed, where each population comparative analysis is difficult. In our approach we member has a particular behaviour. A simulation is focus on DE/1/rand/bin variant and evaluate several made for each member and a fitness function is combinations of CR and F. computed. Then through a mutation and crossover procedure a new generation is computed. This process iteratively repeats improving the 3 SWARM BEHAVIORAL MODEL performance of the swarm population. In this section we improve the swarm algorithm of (Cimino, 2015b). We refer to the time unit as a tick, 2 RELATED WORK i.e., an update cycle of both the environment and the drones. Each drone is equipped with: (a) wireless DE has been used in several domains for communication device for sending and receiving optimization and parameterization tasks (Das, 2011). information from a ground station; (b) self-location As an example, in (Nikolos, 2005) the authors used a capability, e.g. based on global position system classical DE variant, namely DE/1/rand/bin, to (GPS); (c) a sensor to detect a target in proximity of coordinate multiple drones navigating from a known the drone; (d) processor with limited computing initial position to a predetermined target location. capability; (e) a sensor to detect obstacles. Here, DE is set up with N=50, F=1.05 and CR=0.85. The environment and the pheromone dynamics The algorithm was defined to terminate in 200 We consider a predefined area that contains a set generations, but it usually converges in 30 iterations. of targets to be identified. The environment is Our problem sensibly differs, because the target modelled by a digital grid corresponding to the position is unknown, and our approach is physical area. The grid has C2 cells, each identified independent of the initial position. by (x, y) coordinates with x, y ∈ {1,…,C}. The In (Chakraborty, 2008) the authors confront DE actual size of the area and the granulation of the grid and Particle Swarm Optimization (PSO) for co- depend on the domain application. Figure 1 shows operative distributed multi-robot path planning Pheromone dynamics in an urban scenario. Here, the problem. As for (Nikolos and Brintaki, 2005) initial intensity of the pheromone is represented as a dark position of the robots and final position are known. colour, and each target is represented by an “X”. A Here, both centralized and decentralized darker gradation means higher pheromone intensity. formulations are proposed. In the centralized At the beginning, the pheromone is in one cell at its approach, DE minimizes the distance for the next maximum intensity, and then it diffuses to step of each robot. In this case all information of the

606 Using Differential Evolution to Improve Pheromone-based Coordination of Swarms of Drones for Collaborative Target Detection

neighbouring cells. After a certain time the Every tick period, represented by the hourglass, pheromone evaporates, disappearing from the the drone performs two parallel processes: (a) the environment. When a drone detects a piece of the target detection, which corresponds to the release of overall target, a pheromone of intensity I is released the attractive pheromone on the digital grid or rather in the cell corresponding to the drone location. Then, of the repulsive pheromone; (b) the movement at each tick the pheromone diffuses to the procedure, which starts with the obstacle and neighbouring cells according to a diffusion rate boundary detection. If a close obstacle (i.e., object or δ ∈[0,1]. At the same time the pheromone drone) is detected (b.1), the drone points toward a evaporates decreasing its intensity, by an free direction. If a free direction is found, it moves evaporation rateε ∈[0,1]. forward, otherwise it stays still. If there are no close objects detected (b.2), the drone tries to sense positive pheromone in its neighbouring cells and, if detected, the drone points toward the maximum intensity of it. Differently, if no pheromone is detected (b.3), the drone tries to detect surrounding drones in order to stay with the flock. Then, the drone checks if a repulsive pheromone is in nearby cells (b.4). If there is, the drone moves toward the minimum intensity of it. Finally, if there are no surrounding drones (b.5), it performs a random turn and move forward.

Figure 1: Pheromone dynamics in an urban scenario.

The drone behaviour The drone behaviour is structured into five-layer logic, where each layer has an increasing level of priority. Starting from the lowest: the random fly behaviour entails the exploration in absence of stimuli/information; the repulsive pheromone-based coordination reduces the redundancy of exploration; the flocking behaviour assembles the flock enabling local pheromone interactions; the positive pheromone-based coordination exploits the digital Figure 2: Overall representation of the drone behaviour. pheromone and directs the swarm toward potential new targets; finally the highest priority level, the At the highest priority (b.1), the drone has to objects/boundary avoidance, which prevents the avoid collision and to exit the search area. For this drone from exiting the area or colliding with purpose the drone is able to detect objects (i.e., other obstacles and other drones. drones or obstacle) at the object sensing distance (ο) Fig. 2 shows a UML activity diagram of the on its trajectory. If a potential collision is detected drone behaviour. Here, a rectangular box represents the drone chooses the minimum rotation angle to the sensors (left) and actuators (right), a rounded-corner left or right to avoid the obstacle. If it finds a free box represents an activity (a procedure), solid and trajectory it moves forward, if there is no free dashed arrows represent control and data flows, direction within 180 degree it stays still one tick. respectively. The boundaries of the area are considered as a “wall” of obstacles.

607 ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods

The second priority is to follow the positive different targets distribution. In DE algorithm, a pheromone gradient (b.2), provided by the ground population member is a real n-dimensional vector, station. Given the drone position (i.e., the cell on the where n is the number of parameters to tune. DE grid), the cell with the maximum intensity of starts with a population of N members, injected or positive pheromone and the minimum intensity of randomly generated. In the literature, the population the repulsive pheromone (b.4) in its 8 neighbouring size spreads from a minimum of 2n to a maximum cells is returned to the drone. Then, the drone of 40n (Mallipeddi, 2011). A large population follows the maximum pheromone intensity. increases the chance of finding an optimal solution The flocking behaviour (b.3) consists in three but it is very time consuming. To balance speed and rules, represented by Fig.3. A drone considers as reliability we use N=20. At each iteration, and for flock mates all drones in a flock visibility radius (ρ). each population member (target), a mutant vector is The first rule is the separation: more specifically, if created by mutation of selected members and then a the drone senses another drone closer than the flock trial vector is created by crossover of mutant and mobility distance (μ), it turns away by a flock target. Finally, the best fitting among trial and target separation angle (σ). The second rule is the replaces the target. alignment: the drone computes the average direction In addition to DE, other classes of optimization of its flock mates and it adapts its direction by a methods, such as Particle Swarm Optimization flock alignment angle (α). The third rule is the (PSO) attracted attention in the last decade. The scatter: the follower drone considers the closer interested reader is referred to (Cimino, 2015a) for following drone in front of itself. If the follower is in further details. Several DE strategies have been the tail sector of the following defined by angle flock designed, by combining different structure and tail angle (γ) the follower turns away the following parameterization of mutation and crossover by a flock scatter angle (τ). These three rules operators (Mezura, 2006 and Zaharie, 2007). As determine the structure of the swarm, permitting the noted by (Das, 2011), practitioners mostly prefer to drones to navigate in coordinated group, exploring use a classical DE variant like DE/1/rand/bin. The the environment and, most important, sensing differential weight F ϵ [0,2] mediates the generation pheromone released by flock mates. of the mutant vector. F is usually set in [0.4-1), and The fourth priority (b.4) is to move against the a frequently used starting value is 0.8 (Mezura, repulsive pheromone gradient. The drone received 2006). There are different crossover methods in DE. the information of the lowest cell from the ground A competitive approach is the binomial crossover station (b.2). Finally, if nothing is detected (b.5) then (Zaharie, 2007). With binomial crossover, a the drone is in its basic behaviour (random fly). It component of a vector is taken with probability CR randomly turns by an angle smaller than the from the mutant vector and with probability 1-CR maximum rand-fly turn angle (θ). from the target vector. A good value for CR is between 0.3 and 0.9, and a frequently used starting value is 0.7 (Mallipeddi, 2011). The fitness measure to evaluate the effectiveness of a solution is the average time (in ticks), over 5 runs, which the swarm of drones take to find 95% of the targets of the scenario.

(a) separation (b) alignment (c) scattering Figure 3: Flock visibility radius and other parameters in 5 EXPERIMENTAL RESULTS flocking behaviour. We implemented our algorithm using NetLogo, a leading simulation platform for swarm intelligence 4 ADAPTATION WITH (ccl.northwestern.edu/netlogo), and MATLAB for the DIFFERENTIAL EVOLUTION adaptation algorithm (www.mathworks.com). We evaluated our algorithm by comparing four different The swarm algorithm presented in Section 3 strategies: the adaptive stigmergic and flocking involves a number of structural parameters to be behaviour (S+F*) presented in Section 3; the appropriately set for each given application scenario. stigmergic and flocking behaviour (S+F) presented Determining such correct parameters is not a simple in (Cimino, 2015b); the stigmergic behaviour (S), in task since different areas have different topology and which the pheromone grid maintains only the

608 Using Differential Evolution to Improve Pheromone-based Coordination of Swarms of Drones for Collaborative Target Detection

positive pheromone intensity information and the Figure 4 (a) does not contain obstacles at all, but drones do not perform flocking behaviour; finally, contains 5 clusters of 10 targets each. Figure 4 (b) the random fly behaviour (R), in which drones represents a timber with 1 cluster of 20 targets with randomly explore the area with no pheromone nor tree obstacles. Figure 4 (c) shows an industrial area flocking behaviour. We tested the swarm algorithm with 2 clusters of 55 targets, each representing a on six different scenarios. For each scenario, each of pollutant lost. The two scenarios Rural minefield and the three strategies has been specifically optimized Urban minefield, represented in Figure 4 (d) and by the adaptation module. To compare the Figure 4 (e), respectively, are two mined areas of approaches, 10 trials have been carried out for each 80,000 m2 derived from real-world maps near considered approach. We also determined that the Sarajevo, in Bosnia-Herzegovina (www.see- resulting performance indicator samples are well- demining.org). The last scenario, Illegal Dump, is a modelled by a normal distribution, using a graphical real-world example of illegal dumping taken from an normality test. Hence, we calculated the 95% area of 80,000 m2 near the town of Paternò, Italy confidence intervals. (www.trashout.me). The total number of drones is 80; each drone is Figure 5 shows the evolutions of DE algorithm represented as a black triangle, arranged in 4 swarms against generations, with two different values of CR placed in the corners of the map, except for Illegal and F. Here, the best and the mean solution (solid Dumps in which 3 swarms are placed in 3 corners and dashed line, respectively) have been calculated due to the topology. Characteristics of the topology over 5 runs. The optimization is characterized by a of targets and obstacles are summarised in Table 2. very good improvement of the initial solution and by Figure 4 represents the initial configuration for a fast convergence: only 20 generations are needed all scenarios. The three scenarios Field, Forest, and to reduce the average time by 20%. More precisely, Urban represent respectively an area of 40,000 m2. Table 1 shows that the final best performance is achieved with CR=0.5 and F=0.7.

(a) Field (b) Forest

Figure 5: Evolution of the DE algorithm against generations for different DE parameters. (c) Urban (d) Rural minefield Table 1: Final results of the evolution of Figure 5 in numerical terms. F\CR 0.3 0.5 0.7 0.5 933,67±14,16 903,85±29,77 950,80±16,36 0.7 966,60±15,53 900,65±19,06 907,15±12,96 0.9 951,40±15,67 930,60±22,36 933,73±54,14

Table 2 characterizes each scenario with the results, in the form “mean ± confidence interval”. It (e) Urban minefield (f) Illegal dumps is worth noting that the proposed adaptive algorithm Figure 4: Maps of 3 synthetic and 3 real-world scenarios. “S+F*” outperforms the algorithms presented in (Cimino, 2015b) in all scenarios. More specifically,

609 ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods in Field and Forest scenarios, the two algorithms ACKNOWLEDGEMENTS “S+F” and “S+F*” are comparable. This is due to the simple layout of both scenarios, which does not This work is partially supported by the Tuscany allow a good exploitation of the “S+F*” features. Region, Italy, via the SCIADRO research project. Indeed, in the other scenarios, with a more complex topology, the advantages of the “S+F*” strategy are substantial. REFERENCES Table 2: Features and numerical results of each scenario. Brambilla, M. Ferrante, E. Birattari & M. Dorigo, M., N° of targets / Type / n° of Completion time Scenario 2013, ‘Swarm robotics: a review from the swarm clusters obstacles (ticks) engineering perspective,’ Swarm Intelligence, vol. 7, R 1664±220 pp. 1-41. Trees: 0 S 656±101 Field 50 / 5 Chakraborty, J., Konar, A., Chakraborty, U. K. & Jain, L. Buildings: 0 S+F 589±86 S+F* 582±121 C., 2008, ‘Distributed cooperative multi-robot path R 1862±356 planning using differential evolution’, in IEEE Trees: 400 S 615±67 , World Congress on Forest 20 / 1 Buildings: 0 S+F 602±124 Computational Intelligence, pp. 718-725. S+F* 593±146 Cimino, M.G.C.A., Lazzeri, A. & Vaglini, G., 2015a, R 2049±148 ‘Improving the analysis of context-aware information Trees: 0 S 998±61 via marker-based stigmergy and differential Urban 110 / 2 Buildings: 7 S+F 890±93 evolution’, Proceeding of the international S+F* 666±100 Conference on and Soft R 1588±216 Computing (ICAISC 2015), in Springer LNAI, Rural Trees: 281 S 1570±158 28 / 28 Vol. 9120, Part II, pp. 1-12, 2015. Mines Buildings: 3 S+F 1530±225 Cimino, M.G.C.A., Lazzeri, A. & Vaglini, G., 2015b, S+F* 1123±116 ‘Combining stigmergic and flocking behaviors to R 1844±140 coordinate swarms of drones performing target Urban Trees: 54 S 1733±169 40 / 40 search’, Proceedings of the 6th International Mines Buildings: 28 S+F 1704±225 S+F* 1025±76 Conference on Information, Intelligence, Systems and R 1548±207 Application (IISA 2015), Springer, in press. Illegal Trees: 140 S 971±160 Cruz-Álvarez, V. R., Montes-Gonzalez, F., Mezura- 42 / 11 Dumps Buildings:19 S+F 934±216 Montes, E. & Santos, J., 2013, ‘Robotic behavior S+F* 757±112 implementation using two different differential evolution variants’, in Advances in Artificial Intelligence, Springer Berlin Heidelberg, pp. 216-226. Mallipeddi, R., Suganthan, P.N., Pan, Q.K. & Tasgetiren, 6 CONCLUSIONS M.F., 2011, ‘Differential evolution algorithm with ensemble of parameters and mutation strategies’, in In this paper, we have presented an algorithm to Applied Soft Computing, vol. 11, no. 2, pp.1679-1696. adapt, via the DE algorithm, the coordination of a Mezura-Montes, E., Velázquez-Reyes, J. & Coello, C.A., swarm of drones performing target detection on the 2006, ‘A comparative study of differential evolution basis of stigmergy and flocking. We first evaluated variants for ,’ Proceedings of the 8th Annual Conference on Genetic and evolutionary several combinations of the structural parameters of computation, ACM, pp.485-482. the DE. Results show that a crossover rate (CR) of Nikolos, I. K. & Brintaki, A. N., 2005, ‘Coordinated UAV 0.5 and a differential weight (F) of 0.7 produce path planning using differential evolution’, in better solutions. Then, to test the effectiveness and Intelligent Control, in IEEE Proceedings of the 2005 the reliability of the approach, we compared our International Symposium on Control and Automation, algorithm with three search strategies over real- pp. 549-556. world and synthetic scenarios. As a result, our Das, S. and Suganthan, P. N., 2011, ‘Differential approach resulted dominant in all scenarios. Future evolution: a survey of the state-of-the-art’, in IEEE work will (i) investigate our approach on additional Transactions on Evolutionary Computation, Vol. 15.1, pp. 4-31. scenarios, (ii) use other optimization methods for the Zaharie, D., 2007, ‘A comparative analysis of crossover adaptation, and (iii) include non-functional variants in differential evolution’, in Proceedings of requirements in the algorithm, such as computing IMCSIT, 2nd International Symposium Advances in power and endurance of drones. Artificial Intelligence and Applications, pp. 171-181.

610