Behaviour-Based Decentralised Cooperative Control for

Unmanned Systems

Haoyang Cheng

A thesis in fulfilment of the requirements for the degree of

Doctor of Philosophy

School of Mechanical and Manufacturing Engineering

Faculty of Engineering

August 2013

THE UNIVERSITY OF NEW SOUTH WALES Thesis/Dissertation Sheet

Surname or Family name: Cheng

First name: Haoyang Other name/s:

Abbreviation for degree as given in the University calendar: PhD

School: Mechanical and Manufacturing Engineering Faculty: Engineering

Title: Behaviour-Based Decentralised Cooperative Control for Unmanned Systems

Abstract

The study of swarm intelligence has provided researchers with powerful tools to apply biologically inspired problem-solving techniques to the cooperative control of multi-agent systems. The coordination mechanism, based on interactions among the lower-level components, makes the solution more flexible, robust, adaptive and scalable when compared to a centralised approach. Previous research applying swarm methodology was limited to relatively simple scenarios. The objective of this thesis is twofold: first, to explore the feasibility of using a decentralised framework to coordinate a group of agents in complex mission scenarios; and second, to identify the areas of development that can have the maximum impact on the performance of the swarm.

The primary contribution of this research is the development and analysis of a behaviour-based cooperative controller for unmanned systems. In the cooperative moving target engagement problem, the proposed controller enables each Unmanned Aerial Vehicle (UAV) to switch between multiple behaviour states, each of which contains a set of rules. The rules control the agent-level interactions through the combination of direct and indirect interaction and assign the UAVs to time-dependent cooperative tasks. The simulation results that are presented demonstrate the applicability of the method and indicate that the performance depends on the complexity of the coupled task constraints. A predictive model was then integrated into the controller to let the agents estimate the intentions of their neighbours and choose activities which enhance the overall team utility. Additionally, the same methodology is used to address the problem of repositioning spacecraft within a swarm in order to balance the fuel consumption of the individual spacecraft. The proposed controller guides the spacecraft in high-fuel-consumption positions to switch with those in low-fuel-consumption positions. The coordination is driven by the local environment without explicit external communication. From the simulation data, an extension of mission lifetime can be observed.

This research extends the current literature on swarm intelligent systems by considering complex mission scenarios. A deeper understanding of the performance of the decentralised controllers is developed from the analysis of the results.

Declaration relating to disposition of project thesis/dissertation

I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.

I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only).

Signature                    Witness                    Date

The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research.

FOR OFFICE USE ONLY    Date of completion of requirements for Award:

COPYRIGHT STATEMENT

'I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'

Signed

Date

AUTHENTICITY STATEMENT

'I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.'

Signed

Date

ORIGINALITY STATEMENT

'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.'

Signed                    Date

Abstract

The study of swarm intelligence has provided researchers with powerful tools to apply biologically inspired problem-solving techniques to the cooperative control of multi-agent systems. The coordination mechanism, based on interactions among the lower-level components, makes the solution more flexible, robust, adaptive and scalable when compared to a centralised approach. Previous research applying swarm methodology was limited to relatively simple scenarios. The objective of this thesis is twofold: first, to explore the feasibility of using a decentralised framework to coordinate a group of agents in complex mission scenarios; and second, to identify the areas of development that can have the maximum impact on the performance of the swarm.

The primary contribution of this research is the development and analysis of a behaviour-based cooperative controller for unmanned systems. In the cooperative moving target engagement problem, the proposed controller enables each Unmanned Aerial Vehicle (UAV) to switch between multiple behaviour states, each of which contains a set of rules. The rules control the agent-level interactions through the combination of direct and indirect interaction and assign the UAVs to time-dependent cooperative tasks. The simulation results that are presented demonstrate the applicability of the method and indicate that the performance depends on the complexity of the coupled task constraints. A predictive model was then integrated into the controller to let the agents estimate the intentions of their neighbours and choose activities which enhance the overall team utility.

Additionally, the same methodology is used to address the problem of repositioning spacecraft within a swarm in order to balance the fuel consumption of the individual spacecraft. The proposed controller guides the spacecraft in high-fuel-consumption positions to switch with those in low-fuel-consumption positions. The coordination is driven by the local environment without explicit external communication. From the simulation data, an extension of mission lifetime can be observed.

This research extends the current literature on swarm intelligent systems by considering complex mission scenarios. A deeper understanding of the performance of the decentralised controllers is developed from the analysis of the results.

I

Acknowledgements

I wish to express my sincerest thanks to my supervisors, Mr John Page and Dr John Olsen, whose support, counsel and encouragement have guided me through these years. I would also like to thank Dr Nathan Kinkaid for his valuable suggestions and comments throughout the entire program.

I would like to thank my family for their love and support throughout this journey. I would especially like to thank my parents for giving me the courage and inspiration that it takes to complete this program. Lastly, my wife has been exceptionally understanding and supportive over the years.

II

List of Publications

The following is a list of publications produced throughout the duration of the thesis research.

Journal Publications:

H. Cheng, J. Page, and J. Olsen, ‘Cooperative Control of UAV Swarm via Information Measures’, International Journal of Intelligent Unmanned Systems, Vol. 1, Iss. 3, June 2013.

H. Cheng, J. Page, and J. Olsen, ‘Dynamic Mission Control for UAV Swarm via Task Stimulus Approach’, American Journal of Intelligent Systems, Vol. 2, Iss. 7, December 2012.

Conference Publications:

H. Cheng, J. Page, N. Kinkaid, and J. Olsen, ‘Behaviour-Based Control Law for Spacecraft Swarm Operation’, in the 5th International Conference on Spacecraft Formation Flying Missions and Technologies, Munich, Germany, 29-31 May 2013.

H. Cheng, J. Page, and J. Olsen, ‘Information Theory Applied to a Self-Organised, Unmanned Aerial System’, in the 28th Congress of the International Council of the Aeronautical Sciences, Brisbane, Australia, 23-28 September 2012.

H. Cheng, J. Page, N. Kinkaid, and J. Olsen, ‘Coordination of Mission Roles for Semi-Autonomous UAVs in a Simulated Combat Environment’, in the Fourteenth Australian International Aerospace Congress, December 2011.

T. Chi, H. Cheng, J. Page and J. Olsen, ‘Debris Avoidance through Autonomous Spacecraft Scattering’, in the 12th Australian Space Science Conference, Melbourne, 24-26 September 2012.

III

Table of Contents

Page
Abstract ...... I
Acknowledgements ...... II
List of Publications ...... III
List of Figures ...... VI

1. Introduction ...... 1
1.1. Cooperative Control Overview ...... 1
1.2. Research Objectives ...... 6
1.3. Outline of the Thesis ...... 7

2. Related Work ...... 8
2.1. Centralised Approach ...... 8
2.1.1. Task Assignment Problem ...... 8
2.1.2. Path Planning Problem ...... 10
2.2. Distributed Approach ...... 12
2.2.1. Distributed Coordination and Control ...... 12
2.2.2. Distributed Data Fusion ...... 15
2.3. Swarm Intelligence Approach ...... 16
2.3.1. Bio-Inspired Swarm ...... 17
2.3.2. Artificial Swarm ...... 20

3. Behaviour-Based Multiple UAVs Cooperative Control ...... 22
3.1. Nomenclature ...... 22
3.2. Problem Statement ...... 23
3.3. Centralised Approach ...... 27
3.4. Decentralised Approach ...... 29
3.4.1. Information Theoretic Cooperative Tracking ...... 30
3.4.2. Four Basic Behaviours ...... 34

IV

3.5. Simulation and Analysis ...... 39
3.5.1. Lyapunov Potential Field vs. Information Measures ...... 40
3.5.2. Centralised Control vs. Swarm Control ...... 42
3.5.3. Predictive Swarm ...... 47
3.6. Summary ...... 54

4. Behaviour-Based Control Law for Spacecraft Swarm Operations ...... 55
4.1. Problem Statement ...... 56
4.1.1. Initial Conditions ...... 57
4.1.2. Unbalanced Fuel Consumption ...... 60
4.2. Behaviour-Based Control Law ...... 62
4.3. Simulation Results ...... 67
4.4. Summary ...... 73

5. Conclusion and Future Work ...... 74

Appendix A. Multiple UAV Simulator ...... 76

Appendix B. Approximation Methods of Collision Avoidance Constraints ...... 81

Appendix C. Distributed Spacecraft Manoeuvre Planning ...... 88

Bibliography ...... 103

V

List of Figures

Page
Figure 1.1: Autonomous Capability Levels ...... 2
Figure 1.2: Chains of ant workers ...... 4
Figure 1.3: Wasp nest ...... 5
Figure 2.1: Simulation of ant using NetLogo ...... 18
Figure 3.1: Schematic diagram for cooperative tracking ...... 24
Figure 3.2(a): Target position and velocity RMS estimation error ...... 34
Figure 3.2(b): Target entropic information ...... 34
Figure 3.3(a-c): Guidance law for cruise mode ...... 36
Figure 3.4: Standby mode ...... 36
Figure 3.5: Attack mode ...... 36
Figure 3.6: Schematic block diagram of data fusion ...... 38
Figure 3.7: Pseudo code for behaviour-based control architecture ...... 39
Figure 3.8(a): Cooperative tracking via Lyapunov potential field ...... 40
Figure 3.8(b): Cooperative tracking via information measures ...... 41
Figure 3.9(a-c): Simulated scenario ...... 43-44
Figure 3.10: Position estimation error and standard deviation (SD) of target 1 ...... 44
Figure 3.11(a-b): Monte Carlo results of scenarios 1 & 2 ...... 46-47
Figure 3.12: Predictive decentralised controller ...... 48
Figure 3.13(a-h): Simple responsive swarm vs. predictive swarm ...... 49-53
Figure 3.14: Repeated scenario 2 with predictive decentralised controller included ...... 53
Figure 4.1(a-b): The initial position of the spacecraft swarm ...... 56-57
Figure 4.2: PRO with drift ...... 58
Figure 4.3(a-b): Relative orbits obtained using Hill's equations (50 orbits) ...... 59
Figure 4.3(c-d): Relative orbits obtained using GA (50 orbits) ...... 59
Figure 4.4: Fuel cost map ...... 61
Figure 4.5: Movement in descent mode ...... 63
Figure 4.6(a-d): Movements of the spacecraft swarm with 200 agents ...... 68-69
Figure 4.7: Mission lifetime against initial (Scenario 1) ...... 70
Figure 4.8(a-d): Movements of the spacecraft swarm with 300 agents ...... 71-72
Figure 4.9: Mission lifetime against initial (Scenario 2) ...... 73

VI

Chapter 1

Introduction

With recent technological advances in autonomous control and communication, multi-agent systems are receiving a great deal of attention due to their increased ability to carry out complex tasks in a superior manner when compared to single-agent systems. Multi-agent systems can be defined as a collection of loosely coupled autonomous agents that interact in order to achieve a common objective which is usually beyond the agents’ individual capabilities [1]. The basic functions performed by each individual include: sensing its immediate environment; communicating with other agents; processing the information gathered; and taking local actions in response. When conflicts of interest arise, negotiation procedures can be instigated to improve cooperation.
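The sense, communicate, process and act cycle described above can be illustrated with a minimal agent loop. The class, method names and toy one-dimensional environment below are our own illustrative choices, not part of any cited framework:

```python
import random

class Agent:
    """Minimal sketch of the basic functions of a cooperative agent:
    sensing locally, communicating with neighbours, and acting."""

    def __init__(self, agent_id, position):
        self.agent_id = agent_id
        self.position = position
        self.inbox = []          # messages received from other agents

    def sense(self, environment):
        # Observe only the cells adjacent to the agent's own position.
        return {cell: value for cell, value in environment.items()
                if abs(cell - self.position) <= 1}

    def communicate(self, neighbours):
        # Share locally gathered information with the other agents.
        for other in neighbours:
            other.inbox.append((self.agent_id, self.position))

    def act(self, local_view):
        # Toy local rule: move to the most valuable nearby cell.
        if local_view:
            self.position = max(local_view, key=local_view.get)
        self.inbox.clear()

environment = {i: random.random() for i in range(10)}
agents = [Agent(k, position=k) for k in range(3)]
for a in agents:                                   # communication phase
    a.communicate([b for b in agents if b is not a])
for a in agents:                                   # sensing and action phase
    a.act(a.sense(environment))
```

Each agent only ever touches its own neighbourhood of the environment; global behaviour, if any, emerges from the repetition of this cycle.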

Intensive research has been conducted on the cooperative control of groups of unmanned systems. The results of such research on Unmanned Aerial Systems (UASs) can be found in Search and Rescue (SAR) and Intelligence, Surveillance and Reconnaissance (ISR) [2-5]. Moreover, the flying of autonomous formations of spacecraft (or spacecraft clusters) has also been identified as a fruitful research area for multi-agent systems. This concept enables the transformation of a large monolithic spacecraft into a network of smaller elements, offering improved scientific return through longer baseline observations and a high degree of redundancy in mission-critical systems. Examples include the System F6 program at the Defense Advanced Research Projects Agency (DARPA), the Solar Imaging Radio Array (SIRA), the Magnetospheric Multiscale Mission (MMS), and the Exoplanet Exploration Program (EEP) missions at NASA [6-9]. The problem addressed in this area is to explore effective cooperative control methods. In this study, the potential of applying the behaviour-based control method to multi-agent systems is investigated.

This chapter defines the scope for the entire thesis. It introduces the cooperative control methods applied to multi-agent systems; the research objectives and approaches are then overviewed; and finally, the structure of this thesis is outlined.

- 1 -

1.1. Cooperative Control Overview

To date most unmanned aerial vehicles (UAVs) used in civil or military missions, such as the Predator, have been remotely controlled. These vehicles are operated by a remote pilot located in a central control system who has essentially the same authority over the vehicle's behaviour as a traditional pilot would in a manned vehicle. The human factors associated with the UAV operators' workload have been identified as one of the key limitations to increasing future remotely piloted UAS effectiveness [10]. Full-authority remote control not only increases the expense of UAV operations, but also makes coordination among UAVs more complex. For these reasons, research is being undertaken to investigate methods of increasing UAV autonomy and interoperability, while reducing global communication and human operator reliance. In the ‘Unmanned Aircraft Systems Roadmap 2005-2030’ released by the U.S. Department of Defense, cooperative control technologies for groups of UASs, such as bio-inspired control, have been identified as one of the development objectives for future missions [11].

Figure 1.1: Autonomous Capability Levels [11].

- 2 -

As shown in Figure 1.1, the higher the autonomy level, the more decentralised the control becomes. The ongoing J-UCAS program has demonstrated level 6 autonomy, which includes coordinated navigation planning, communication plan reassignment and the accumulation of data from the entire group. Level 10 autonomy implies that the agents will operate in a fully decentralised manner.

In general, the complexity of the cooperative team problem is a result of task coupling, uncertainty, and communication constraints [12]. Moreover, each agent has only locally sensed information and often limited computational capability. Due to the complexity of the inherent problems, centralised decision and control algorithms are traditionally adopted to optimise timing and task constraints. However, a centralised approach requires intensive computation and robust communication. The decision maker, which could be either the control centre or one of the agents within the group, requires access to the information gathered by all agents, and there is often a communication cost in distributing locally gathered information. Even if the information is centrally available, the inherent complexity of the decision problem makes solving the centralised problem in a timely manner computationally infeasible. To avoid this computational infeasibility, distributed algorithms based on decomposition have been proposed [13]. Such an approach decomposes the optimisation problem into subproblems by taking advantage of the system structure, so that the subproblems can be solved independently. In general, each agent finds the optimal solution to its subproblem, assuming that the decisions made by the other agents are fixed, and then communicates its solution to the other agents. This process is performed by each agent iteratively until a sufficiently good solution to the original problem is found. Distributed algorithms are more scalable than centralised algorithms; however, their performance is still affected by a potentially large number of interacting agents. The terms ‘distributed’ and ‘decentralised’ are both often used in the literature to describe such algorithms. The term ‘decentralised’ has also been used to describe the fully decentralised systems presented in the following sections. To remove ambiguity, we use the term ‘distributed’ only to describe multi-agent optimisation via decomposition. The differences between the ‘distributed’ and ‘decentralised’ frameworks are presented in detail in chapter 2.
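The iterative decomposition scheme just described, in which each agent optimises its own subproblem with the other agents' decisions held fixed and then exchanges solutions, can be sketched on a toy coupled quadratic objective. The problem and all numbers below are invented for illustration and are not taken from the cited works:

```python
# Toy team objective with coupled decisions x_0..x_3:
#   f(x) = sum_i (x_i - a_i)^2 + lam * sum_i (x_i - x_{i+1})^2
# Each agent owns one decision variable x_i and a local target a_i.
a = [0.0, 4.0, 8.0, 12.0]
lam = 1.0
x = [0.0] * len(a)

def best_response(i, x):
    # Agent i's subproblem: minimise f over x_i alone, with the other
    # agents' decisions fixed (closed form, since f is quadratic in x_i).
    nbrs = [x[j] for j in (i - 1, i + 1) if 0 <= j < len(x)]
    return (a[i] + lam * sum(nbrs)) / (1 + lam * len(nbrs))

# Jacobi-style iteration: every agent solves its subproblem, then all
# agents exchange their new decisions; repeat until a fixed point.
for _ in range(200):
    x = [best_response(i, x) for i in range(len(x))]
```

For this convex problem the iteration converges to the team optimum; in general, such decomposition schemes only converge under suitable convexity and coupling assumptions, which is part of why the distributed approach does not simply supersede the centralised one.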

- 3 -

As opposed to centralised solutions, mission planning is accomplished using emergent behaviour, rather than as the outcome of an optimisation problem. This approach focuses on decentralised architectures, interactions among simple agents, adaptability, and robustness. As often observed, biological systems such as swarms of ants, bees, fish and birds reveal some amazing cooperative behaviour. As shown in Figure 1.2, weaver ant workers form chains of their own bodies to allow them to cross wide gaps and pull leaf edges together to make a nest. Figure 1.3 shows the complex nest built by wasps.

Figure 1.2: Chains of ant workers.

- 4 -

Figure 1.3: Wasp nest.

Every single insect that lives in a colony seems to appear in the right place and do the right task, so that the whole colony appears to have some global organisation which, in fact, it does not. The continuous integration of all individual activities does not seem to require any supervision [14]. Such systems are often referred to as Self-Organised (SO) systems or swarm intelligent systems. The terms ‘SO system’ and ‘swarm intelligent system’ are used interchangeably in this thesis. After decades of research, the mechanism that drives SO systems has been identified. Recent research suggests that the individuals within a swarm system are neither able to assess the global situation nor aware of the tasks to be carried out by the other agents; there is thus no central control, i.e. no supervisor, in a swarm system. Their actions are determined by sets of behaviour rules on the basis of local information. For example, each time an agent performs an action, the local environment is modified by this action. The modified environmental configuration then influences the actions of the other agents.
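This environment-mediated form of indirect interaction, often called stigmergy, can be sketched with a toy one-dimensional pheromone model. The grid size, deposit amount and evaporation rate below are arbitrary choices of ours, not parameters from the literature:

```python
import random

random.seed(1)
GRID = 20
pheromone = [0.0] * GRID                        # the shared environment
positions = [random.randrange(GRID) for _ in range(30)]

def step(positions, pheromone):
    # Each agent marks its current cell, then moves towards the adjacent
    # cell with the strongest marker: one agent's action modifies the
    # environment that the other agents subsequently respond to.
    for i, p in enumerate(positions):
        pheromone[p] += 1.0
        neighbours = [(p + d) % GRID for d in (-1, 0, 1)]
        positions[i] = max(neighbours, key=lambda c: pheromone[c])
    for c in range(GRID):
        pheromone[c] *= 0.9                     # evaporation

for _ in range(50):
    step(positions, pheromone)
```

No agent ever observes the global state, yet the positive feedback through the shared pheromone field is typically enough to make the agents aggregate onto a few cells.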

The discovery of the underlying mechanism of SO systems has provided powerful tools in the field of cooperative controller design. The SO system is undoubtedly a decentralised structure. Apart from being highly scalable, the most attractive feature of an SO system when applied to the cooperative control of multi-agent systems is that it can operate in a very flexible and robust manner. This flexibility allows the system to adapt to a changing environment, which is extremely difficult for centralised control algorithms. The robustness limits the system's degradation due to the failure of some individuals.

- 5 -

1.2. Research Objectives

The design of cooperative controllers by means of SO can be found in the literature [15-18]. The applications include: persistent coverage, wide area search, communication relay, formation control, and target set engagement. The previous research that applies swarm methodology was limited to relatively simple scenarios which lack complex task constraints. The reasons for the lack of successful applications in relatively complex scenarios are as follows:

First, swarm intelligent systems are hard to develop. As opposed to centralised algorithms, the path to problem solving is not the minimisation of an objective function, but the emergent properties of the system arising from the interactions among agents and between agents and the environment. The emergent properties can only be observed through the simulation of multiple scenarios. Therefore, to develop a control algorithm based on swarm intelligence, not only must the individual behaviours be determined, but also the interactions required to generate the desired global behaviours [14].

Second, it is difficult to enforce task constraints using any decentralised controller. This is mainly due to the absence of global communication and coordinating information. Optimality is impossible to achieve if high-level cooperation among the agents is required.

The objective of this thesis is to address these difficulties and investigate a decentralised control framework that enables efficient operations for a large-scale multi-agent system performing a cooperative task when a centralised decision maker cannot or does not exist. This thesis includes two applications of swarm intelligent systems. In the first application, the problem of assigning multiple UAVs to time-dependent cooperative tasks is investigated. This is a typical cooperative control problem, which requires multiple agents to perform separate subtasks simultaneously, or within the available time windows of cooperating agents. This problem has not been addressed by decentralised strategies due to the optimality issue.

- 6 -

The sacrifice in optimality is to be expected, since a centralised strategy has the potential to take into account more complex dependencies that exist between the tasks performed by agents, and the effects of their actions. The guiding principle of decentralisation is not to achieve the same optimality as the centralised counterpart, but to increase flexibility, robustness and simplicity. This study developed a decentralised controller, compared it against a centralised strategy and identified areas of development that could have the most significant impact on performance. This is a critical step in creating swarm intelligent systems that can be successfully applied in different scenarios. In the second application, the problem of repositioning a spacecraft swarm in order to balance the fuel consumption among the agents is considered. In space missions, fuel efficiency becomes much more important because the amount of propellant that can be carried by each spacecraft is limited and the system life depends on the fuel remaining in the spacecraft that has consumed the most fuel. In this study, not only are the emergent properties of the system required, but the efficiency of the individual agents is also important. Through the investigation of these two scenarios, we expect to explore the validity of using behaviour-based methods to achieve cooperation in complex operational scenarios, and to analyse the performance of decentralised strategies.

1.3. Outline of the Thesis

An introduction to multi-agent systems and the objectives of this thesis is given in Chapter 1. Chapter 2 provides a summary of the current state of research in the field of cooperative control. Chapter 3 investigates the use of decentralised controllers for cooperative moving target engagement missions. The simulation study quantitatively evaluates the performance of the decentralised controller against the centralised controller. Chapter 4 investigates the performance of a behaviour-based controller in terms of extending the mission lifetime of a spacecraft swarm operation. Chapter 5 provides a summary of the research and suggestions for future research in the field of swarm intelligence applied to unmanned systems.

- 7 -

Chapter 2

Related Work

This chapter presents a review of previous work relating to the control of distributed multi-agent systems. The previous work in the field of cooperative control is characterised into three classes. Section 2.1 covers the classical centralised methods. A summary of distributed control and computation based on the decomposition of centralised problems is presented in section 2.2. Section 2.3 describes the modelling techniques that generate desirable collective behaviours and the swarm intelligent systems that have been successfully applied in the context of unmanned systems.

2.1. Centralised Approach

The traditional centralised approach defines a mathematical model that captures a certain amount of information about the problem domain and then seeks the optimal solution to that abstract model. Hence the key aspect of the centralised approach is to extract the mathematical representation from the cooperative mission. The centralised approach can be implemented in two different ways:

There exists a central agent, which could be one of the agents, i.e. the team leader, or a control centre. The coordination variables from the distributed agents are sent to the central agent, where the optimisation problem is solved and plans are made for all agents.

The coordination variables are synchronised across all group members. Each agent solves the same optimisation problem for the whole group but only executes its part of the plan [12]. This increases the redundancy, addressing one of the major limitations of central control, but results in high communication cost.

2.1.1. Task Assignment Problem

One major research area related to multi-agent systems is to develop cooperative decision algorithms which allow efficient allocation of the overall group's resources over multiple targets. The problem of assigning cooperative tasks to a group of flying munitions has been investigated by Schumacher and Chandler [19].

- 8 -

In this scenario, a group of flying munitions perform a parallel-path search over the mission area. When targets are detected, the munitions are required to classify, attack and perform battle damage assessment of the targets. The assignment problem is formulated as a network flow optimisation problem, i.e. finding the flow with optimal total cost on a directed network:

maximise   \sum_{i=1}^{N_v} \sum_{j=1}^{N_t} c_{ij} x_{ij}

subject to   \sum_{j=1}^{N_t} x_{ij} \le 1, \quad i = 1, \ldots, N_v    (2.1)

and   \sum_{i=1}^{N_v} x_{ij} \le 1, \quad j = 1, \ldots, N_t

and   x_{ij} \in \{0, 1\}

where x_{ij} is a binary decision variable representing the assignment of the i-th vehicle to the j-th task, and the positive value c_{ij} is the value of this assignment. The values of the different tasks are calculated dynamically based on the target value, the probability of successfully performing the tasks, the remaining fuel of each vehicle, etc. The objective of the optimisation problem is to maximise the sum of the value of the tasks performed by all vehicles. The main limitation of this method is that only one task can be assigned to each vehicle at a time, without taking into account the succeeding tasks that will need to be performed. This method is further improved by solving the network flow problem iteratively [20, 21]. In this alternative method, only the assigned task with the shortest estimated time of arrival is finalised in each iteration. This process repeats until all tasks have been assigned. An advanced formulation of this problem uses a Mixed Integer Linear Program (MILP) [22]. The advantage of the MILP formulation is that both binary and continuous variables can be included, which are used to represent task order and timing constraints respectively. It provides a solution to the multiple task assignment requirements.

- 9 -

Because of the complexity of the problem, the MILP algorithm takes a long time to set up and solve. Bayen and Tomlin show that the problem of scheduling air traffic flow subject to airspace and airport metering constraints can be formulated as a MILP [23]. They derive an approximation algorithm which consists of solving two reduced subproblems [24]. The main algorithm alternately approximates the two objectives with factors of 5 and 3, respectively. An assignment tree generation algorithm has been proposed that produces the optimal solution to the assignment problem based on piecewise optimal trajectories [25]. This algorithm generates a tree of feasible assignments and then chooses the optimal assignment. To avoid the computational complexity, a genetic algorithm (GA) has been used to search for the optimal solution to the assignment problem represented by a tree structure [26]. Turra et al. add an additional complication to this problem by considering a moving target [27]. Their approach is to use a faster numerical procedure that runs every time the target states change.
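The assignment-tree idea can be illustrated at toy scale by enumerating every complete vehicle-to-task pairing (the leaves of the tree) and keeping the cheapest. The cost figures below are invented, and this brute force is precisely what becomes intractable at realistic problem sizes, motivating heuristic searches such as the GA of [26]:

```python
from itertools import permutations

# cost[i][j]: illustrative cost of assigning vehicle i to task j.
cost = [[4.0, 2.0, 9.0],
        [3.0, 7.0, 5.0],
        [8.0, 1.0, 6.0]]

def best_assignment(cost):
    # Each permutation is one leaf of the assignment tree: a complete,
    # feasible one-task-per-vehicle assignment.
    n = len(cost)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best, best_cost = perm, c
    return best, best_cost

assignment, total_cost = best_assignment(cost)
# assignment[i] is the task given to vehicle i.
```

With n vehicles and n tasks there are n! leaves, so exhaustive enumeration is only viable for very small teams.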

The task assignment problems discussed above require that the tasks be performed in a specified order, i.e. the start time of the attack cannot be earlier than the completion time of classification. A different scenario, cooperative moving target engagement, is investigated in subsequent research. In this scenario, the target must be continuously tracked by two or more vehicles during the attack phase while an additional UAV launches a guided weapon. This requirement increases the complexity of the problem. Multiple agents need to perform separate subtasks simultaneously, or within the available time windows of cooperating agents. Kingston and Schumacher solved this problem with a MILP that addresses the task timing constraints and agent dynamic constraints to generate a flyable path for each UAV [28]. To reduce the size of the optimisation problem to a more reasonable level, a stationary target assumption is added in their study. The dynamic variant of this scenario, in which the target motion is described by a stochastic process, is investigated in Chapter 3.
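In its simplest form, the coupled timing requirement of this scenario reduces to an interval check: the attack window must lie inside the availability windows of at least two trackers. The following feasibility test is our own simplified formulation, not the MILP of [28]:

```python
def engagement_feasible(tracker_windows, attack_window, min_trackers=2):
    """Return True if at least min_trackers tracking windows each cover
    the whole attack interval [t_start, t_end]."""
    t_start, t_end = attack_window
    covering = [w for w in tracker_windows
                if w[0] <= t_start and t_end <= w[1]]
    return len(covering) >= min_trackers

# Two of the three tracker windows cover the attack interval (4, 8).
ok = engagement_feasible([(0, 10), (3, 12), (9, 15)], (4, 8))
# Only one window covers (10, 14), so that engagement is infeasible.
bad = engagement_feasible([(0, 10), (3, 12), (9, 15)], (10, 14))
```

The full problem is much harder because the windows themselves depend on the flyable paths chosen for each UAV, which is what couples the assignment and trajectory decisions.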

2.1.2. Path Planning Problem


The cooperative path planning problem has been an active research area. The objective is to generate feasible trajectories to guide a fleet of vehicles to their destinations while avoiding collisions with obstacles and each other. Example applications include rendezvous and formation initialisation and reconfiguration. A hierarchical approach has been presented to guide multiple UAVs to jointly reach a target area while minimising their exposure to radar [29, 30]. In the first step, the initial path is computed using a Voronoi diagram and a conventional graph search algorithm, such as the A* algorithm; fuel and threat considerations determine the cost assigned to each edge of the graph. In the second step, the initial path is refined into a flyable path by incorporating the vehicles' flight dynamics constraints. Coordination is achieved by calculating the earliest estimated time of arrival (ETA) that is common to all UAVs. The team members then select the trajectory that corresponds to this ETA.

Collision avoidance is a critical aspect of multiple vehicle path planning problems. The exact representation of these constraints is nonlinear and non-convex, e.g. a prohibited zone represented by a sphere as given in Equation 2.2.

$$\left\| \mathbf{p}_i(k) - \mathbf{p}_j(k) \right\| \geq R, \quad \forall\, i \neq j, \qquad (2.2)$$

where $\mathbf{p}_i(k)$ is the position component of the state vector of the $i$-th vehicle at time step $k$, and $R$ is the safety distance. It has been shown that this problem can be rewritten in linear form by introducing binary slack variables and additional linear constraints that account for collision avoidance [31, 32]. Another method, based on sequential convex programming (SCP), transfers the collision avoidance constraints into linear form [33]. It uses multiple iterations to ensure that the convex approximations of the non-convex constraints are enforced. A comparison of these two approximation methods is presented in Appendix B; the simulation results show that the SCP method performs better in terms of fuel consumption and solver time. In references [34, 35], a method for generating fuel-efficient, collision-free reconfiguration trajectories for spacecraft formation flying in deep space is presented. This method assumes that the trajectories are piecewise cubic polynomials passing through a set of waypoints. A gradient-based algorithm is then used to select the waypoint locations and velocities so that collisions are avoided. As in most spacecraft trajectory planning methods, linearised and time-discretised spacecraft dynamics are adopted. To ensure that collision avoidance constraints are not violated between the time intervals of the discrete dynamics model, an additional constraint is introduced that forces the spacecraft to move on separate planes [36].
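The binary slack variable reformulation can be sketched as follows; this is the standard big-M construction, given here for the planar case with the spherical zone of Equation 2.2 approximated by a box of side $2R$, and is not necessarily the exact form used in [31, 32]. For vehicles $i$ and $j$ at time step $k$, with a sufficiently large constant $M$ and binary variables $c_1, \ldots, c_4$:

$$x_i(k) - x_j(k) \geq R - M c_1,$$
$$x_j(k) - x_i(k) \geq R - M c_2,$$
$$y_i(k) - y_j(k) \geq R - M c_3,$$
$$y_j(k) - y_i(k) \geq R - M c_4,$$
$$\sum_{m=1}^{4} c_m \leq 3, \quad c_m \in \{0, 1\}.$$

Each binary $c_m$ relaxes one linear separation constraint, and the final inequality forces at least one of the four constraints to remain active, so the vehicles are kept at least $R$ apart along at least one axis direction. The three-dimensional case uses six constraints with $\sum_m c_m \leq 5$.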

Because of the distributed nature of the multi-vehicle path planning problem, in most cases each vehicle has information only about the states of neighbouring vehicles. This has motivated research into distributed approaches that determine the trajectories based on partial information.

2.2. Distributed Approach

In Section 2.1, the problem is modelled and solved globally by a centralised algorithm. However, centralised algorithms require consistency of global information, typically scale poorly, and are too computationally intensive for practical control of systems with a large number of agents. When a centralised controller is not available, the individual agents must cooperate to achieve system-wide or independent objectives while operating under both local and coupled constraints.

2.2.1. Distributed Coordination and Control

In general, the distributed coordination problem is to simultaneously optimise a collection of objective functions, i.e. it is a multi-objective optimisation. In many cases the team objectives and individual objectives conflict, and the solution often requires sacrifices by some team members to obtain better overall performance. In a typical spacecraft formation flying scenario, the vehicles are arranged as part of a passive aperture. Changing the viewing mode of the fleet requires reconfiguration of the formation, moving from one aperture to another. The main issue is to determine, in a fuel-efficient way, which spacecraft should move to each new location on the new aperture. A distributed coordination scheme has been developed via an auction and selection process [37, 38]. Each spacecraft makes bidding decisions on every possible final location based on its fuel usage. Given the individual bidding decisions, the coordinator performs the selection and collision avoidance check.


This method distributes the computational complexity over the whole group, but it still needs a central coordinator to recombine the results. Distributed task allocation based on auctions has also been used to solve the problem of assigning multiple UAVs to targets [39]. A heuristic selection method is proposed to reduce the number of auction cycles required to achieve the assignment. This leads to a fast decision-making process and a decrease in communication cost.
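As a concrete illustration of the auction idea (a minimal sketch, not the exact protocol of [37-39]; the cost matrix, the bidding rule, and the tie-breaking below are all hypothetical), a sequential auction in which each free agent bids its cheapest remaining target and the lowest bidder wins each round might look like:

```python
# Minimal sketch of auction-based task allocation (illustrative only).
def auction_assign(costs):
    """costs[i][j]: cost for agent i to service target j.
    Each round, every unassigned agent bids for its cheapest remaining
    target; the agent with the lowest bid wins that target."""
    n_agents, n_targets = len(costs), len(costs[0])
    assignment = {}
    free_agents = set(range(n_agents))
    taken = set()
    while free_agents and len(taken) < n_targets:
        # each free agent bids for its cheapest available target
        bids = {}
        for i in free_agents:
            j = min((t for t in range(n_targets) if t not in taken),
                    key=lambda t: costs[i][t])
            bids[i] = (j, costs[i][j])
        # the lowest bid wins its target this round
        winner = min(bids, key=lambda i: bids[i][1])
        target, _ = bids[winner]
        assignment[winner] = target
        free_agents.remove(winner)
        taken.add(target)
    return assignment
```

In a truly decentralised implementation the "winner selection" step would itself be resolved by message passing rather than by the central loop used here for brevity.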

Distributed control has been successfully applied to the path planning problem for multiple unmanned vehicles. In this type of problem, the system is normally comprised of multiple vehicles, each with independent dynamics and disturbances but with coupled constraints. The distributed approaches presented in the literature can be divided into two main categories: non-cooperative and cooperative approaches.

In the non-cooperative approach, the centralised problem is decomposed into smaller subproblems. In each subproblem, an agent optimises only its own control variables, and all subproblems are solved sequentially in a predetermined order. Because each subproblem contains fewer decision variables and constraints, the computation time is much lower than for the original problem. Since the solution of a subproblem does not account for the objectives of the other agents, the approach is non-cooperative. Applications of the non-cooperative approach can be found in references [40, 41], where a distributed form of model predictive control is presented. At initialisation, this algorithm solves the centralised problem and applies the first control input to the vehicles. Each vehicle then solves a subproblem for its own plan sequentially at each subsequent step. Sequential planning yields fast computation without iteration and reduces the amount of data that must be exchanged between vehicles, but it biases the solution against lower-numbered agents in the planning sequence. This algorithm has been further extended to enable much shorter planning horizons while still preserving robust feasibility, with plans communicated only within a local neighbourhood rather than across the entire fleet [42].

A cooperative approach, on the other hand, allows an agent to sacrifice its local cost if doing so brings greater benefit to the other vehicles. References [43, 44] propose a distributed optimisation algorithm that uses penalty functions to treat infeasible constraints imposed by agents. This method requires an iterative bargaining process: the agents negotiate by proposing a solution and receiving a counter-offer when the other subsystems in the neighbourhood change their individual moves. The offer proposed by an agent is the solution that minimises its local objective function, assuming the control variables of the other agents are fixed (using values received in a previous iteration). It is shown that this method converges to a Pareto optimal solution. Algorithm 2.1 provides the general structure of the cooperative approach for a distributed system; the main difference between algorithms is the form of the local optimisation problem in steps 3 and 6. The iteration can be implemented in two different ways: simultaneous and sequential [45]. In the simultaneous approach, all of the agents update at the same time using the solutions from the last step. In the sequential approach, the agents update one at a time using the most recently computed solutions. A similar approach can be found in an application of multiple vehicle formation control in which the subsystem dynamics are nonlinear and heterogeneous [46]. Kuwata and How also used negotiation to tighten the constraints for the agents [47]. The parameter used in negotiation is then subtracted from the local cost function to reflect the potential benefit to the other agent. Moreover, this algorithm exploits the sparse structure of the active coupling constraints of the trajectory optimisation by including only the active coupling constraints in each subproblem. In a formation flying application, a slightly different decentralised algorithm is presented [48]: each vehicle computes its optimal trajectory based on how much it deviates from the common objective, and the algorithm iterates until the deviation converges.

Algorithm 2.1: Cooperative approach for a distributed system [49].

1: Each agent determines a locally feasible solution (problem initialisation), ignoring any interconnected constraints.

2: Agents communicate their solutions to all agents in the neighbourhood and assemble the received solutions.

3: Each agent solves its local optimisation problem while the states of the other vehicles are held fixed.

4: Agents communicate the solution of the local optimisation problem to the others.

5: while the solutions have not converged do

6: Each agent solves its local optimisation problem with each received solution held fixed.

7: Each agent selects its preferred solution.

8: Agents communicate the selected solution to all agents in the neighbourhood.

9: end while
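The sequential variant of this iteration can be sketched numerically. In the toy example below, each agent's local objective is a hypothetical quadratic (attraction to its own goal plus a coupling penalty to the other agents' current solutions), so the step-6 local optimisation has a closed form; the goals, the coupling weight, and the scalar decision variables are all illustrative, not from [49].

```python
# Illustrative sketch of the sequential cooperative iteration in Algorithm 2.1.
# Each agent minimises a local quadratic cost (goal attraction plus a coupling
# term to its neighbours) with the other agents' solutions held fixed.
def cooperative_iterate(goals, coupling=0.5, tol=1e-9, max_iter=1000):
    x = list(goals)                      # step 1: locally feasible initial solutions
    for _ in range(max_iter):            # step 5: iterate until convergence
        max_delta = 0.0
        for i in range(len(x)):          # step 6: sequential local optimisation
            others = [x[j] for j in range(len(x)) if j != i]
            # closed-form minimiser of (x - g_i)^2 + coupling * sum_j (x - x_j)^2
            new_xi = (goals[i] + coupling * sum(others)) / (1 + coupling * len(others))
            max_delta = max(max_delta, abs(new_xi - x[i]))
            x[i] = new_xi                # steps 7-8: adopt and "communicate"
        if max_delta < tol:              # convergence check
            break
    return x

positions = cooperative_iterate([0.0, 10.0])
```

With goals 0 and 10, the coupling pulls the two agents toward each other and the iteration converges to the fixed point (2.5, 7.5); each agent has sacrificed some local cost for the joint objective, which is the defining feature of the cooperative approach.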

2.2.2. Distributed Data Fusion

Sensor management in a distributed network is a major challenge in multiple-agent robotic systems. It relies on data fusion techniques to manage and coordinate the use of sensing resources so that the most useful set of information can be obtained. In Chapter 3, an information-based data fusion technique is used to coordinate UAVs tracking ground targets; a review of previous work in this field is presented in this section. Mutambara and Durrant-Whyte proposed a distributed information filter and extended it to solve the distributed control problem [50, 51]. They later considered sensor management and control for maximising the information gain of the network as a whole [52]. Using information-theoretic measures as controller objective functions has particular value in distributed sensor management, since information measures provide an intuitive way of describing tasks such as surveillance, localisation and general information gathering.

Ryan et al. develop a receding horizon control formulation with multiple-step look-ahead for a team of UAV sensors to cooperatively search for and track a target; this has the potential to produce better solutions by accounting for sensor motion constraints and delayed payoff [53]. The Fisher information matrix has also been used to quantify the information provided by measurements, leading to trajectory designs for a UAV tracking a ground target with a vision-based sensor [54]. A method utilising information-theoretic measures together with a particle filter has been investigated to optimise control inputs for cooperative localisation with a nonlinear observation model and non-Gaussian noise [55]. Information-theoretic measures have also been applied in exploration missions to generate informative paths for unmanned platforms [56, 57]. A combination of particle filtering for nonparametric density estimation, information-theoretic measures for comparing possible action sequences, and artificial physics for reducing the search space has been presented for coordinating a large dynamic sensor network [58].
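The core of information-theoretic sensor management can be shown in a toy scalar case. For a Gaussian state, the mutual information between the state and a measurement is half the log-ratio of prior to posterior variance, so selecting the action that maximises mutual information reduces to comparing these ratios; the prior and sensor noise values below are hypothetical.

```python
import math

# Toy illustration of information-theoretic sensor selection: for a scalar
# Gaussian target state, pick the sensor whose measurement maximises the
# mutual information I(x; z) = 0.5 * ln(prior_var / posterior_var).
def mutual_information(prior_var, noise_var):
    # posterior variance after a scalar Kalman update
    posterior_var = prior_var * noise_var / (prior_var + noise_var)
    return 0.5 * math.log(prior_var / posterior_var)

def best_sensor(prior_var, sensor_noise_vars):
    # choose the sensing action with the largest expected information gain
    return max(range(len(sensor_noise_vars)),
               key=lambda i: mutual_information(prior_var, sensor_noise_vars[i]))

choice = best_sensor(prior_var=4.0, sensor_noise_vars=[2.0, 0.5, 8.0])
```

In this scalar case the least-noisy sensor always wins; the measure becomes genuinely useful in the vector case, where different sensors inform different state components and the log-determinant ratio trades them off.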

Distributed control based on Bayesian filtering can also be found in the literature [2, 3]. In a search and rescue mission, the search vehicles build an equivalent representation of the target state probability density function using recursive Bayesian filtering, which enables them to coordinate their actions without exchanging any information about their plans. The control inputs of the search vehicles are those that maximise the cumulative probability of detection. In another cooperative control mission, where two aerial munitions attack a ground target, the individual target location estimates are fused in a weighted least squares solution [59, 60]. Both a calculus of variations and a dynamic programming approach are proposed to find the optimal trajectories that minimise the variances of the target states. Scerri et al. used a binary Bayesian grid filter to represent the probability that a radio frequency emitter lies within a global map and applied a Rapidly-exploring Random Tree path planner to determine UAV paths for locating targets [61].

2.3. Swarm Intelligence Approach


Swarm intelligence provides an alternative way to develop cooperative controllers for multi-agent systems. The inspiration behind swarm intelligence based systems comes from the biological study of self-organised behaviours in insect colonies. Much like social insects, swarm robotic systems are generally defined as decentralised systems comprised of relatively simple agents equipped with the restricted set of communication, computation and sensing abilities required to accomplish a given task [62]. In general, it is difficult to design swarm systems that follow an explicit global behaviour. Unlike optimising a predefined objective function, the solution to the problem is emergent in self-organised (SO) systems, resulting from simultaneous interactions among agents and between agents and their environment. Thus, developing swarm intelligence based systems usually adopts a bottom-up process and focuses on the development of interaction sets and the rules that govern them.

2.3.1. Bio-Inspired Swarm

One possible way to develop the agent-level interactions is to use an evolutionary computing process. The genetic algorithm has been used to evolve the local field functions that drive the interactions of cells that ultimately produce a user-desired shape [63, 64]. In that study, the interactions of the agents are based on the chemotaxis-driven aggregation behaviours exhibited by actual living cells. The local field function that directs the movement of the agents is represented by a tree structure, the nodes of which are mathematical operators. Genetic programming searches the space of possible combinations of mathematical operators and evolves the field function through evolution and natural selection. In the study of swarming behaviour, an evolutionary algorithm has been utilised to develop control parameters representing the weightings of the acceleration vectors and the maximum velocity and acceleration [65]. A similar approach has also been used to develop the behaviours of a UAV swarm capable of searching for and attacking targets [66]. As can be seen, evolutionary algorithms are mainly used to develop very simple interactions that can be represented by a single field function, or to tune a set of parameters embedded in the behaviours.


In the work by Bonabeau et al. [14], several abstract models of the collective behaviour of social insects are presented. In general, the mechanism by which individuals coordinate their activities relies on stigmergy. Stigmergy is the mechanism by which individual behaviour modifies the environment, which in turn modifies the behaviour of other individuals [14]. In particular, pheromone trails, self-assembly and response thresholds are the models most closely related to swarm robotic applications. The coordination mechanism using pheromones can be demonstrated by ant foraging activities. The ants initially move in random directions looking for food. When ants find a food source, they carry the food back to their nest, depositing a chemical substance called pheromone as they move; other foragers follow such pheromone trails. As more ants carry food to the nest, more chemical is deposited, and hence the higher the probability of recruiting additional ants. Also, since ants that use the shortest route return to both food and nest more quickly, the shortest route becomes the most heavily impregnated with pheromone. Figure 2.1 shows a simulation of trail-laying/trail-following behaviour when ants are foraging. The use of digital pheromones for controlling and coordinating swarms of unmanned vehicles can be found in the literature [16, 67-69]. The pheromone information, exchanged through the shared environment, changes dynamically over time via propagation and evaporation; this allows new information to be spread and old information to be removed. The dynamic nature of the mechanism yields solutions that can adapt to rapidly changing environments.


Figure 2.1: Simulation of ant foraging using NetLogo [70].
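The deposit-evaporate-propagate cycle of a digital pheromone map can be sketched in a few lines; the grid representation, rate constants, and the four-neighbour propagation stencil below are hypothetical choices, not taken from [16, 67-69].

```python
# Minimal sketch of a digital pheromone map update (deposit, evaporation,
# propagation) in the spirit of the mechanisms described above.
def update_pheromone(grid, deposits, evaporation=0.1, propagation=0.05):
    rows, cols = len(grid), len(grid[0])
    new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # evaporation removes old information
            level = grid[r][c] * (1.0 - evaporation)
            # propagation spreads a fraction to each of the four neighbours
            spread = level * propagation
            new[r][c] += level - 4 * spread
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    new[nr][nc] += spread
                # the fraction spilling outside the grid is simply lost
    # agents deposit fresh pheromone at their current cells
    for (r, c), amount in deposits.items():
        new[r][c] += amount
    return new
```

Running this update once per time step gives exactly the dynamics described in the text: evaporation forgets stale information while propagation and fresh deposits spread new information outward from the depositing agents.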

A set of stigmergic algorithms that allows a swarm of simple agents to build nest-like structures, inspired by wasp colonies, has been presented [71, 72]. In these models, the agents do not communicate and have no global representation of the structure they are building. The agents move randomly on the structure and deposit building material depending on its local configuration. The behaviours of the builders are defined by a specified set of rules. Coherent and well-organised structures have been produced in simulation using similar models. Such bio-inspired algorithms have been used to coordinate autonomous multi-robot systems to build user-specified structures [73, 74].

The model based on response thresholds has been employed to design decentralised task allocation algorithms. It describes the likelihood of reacting to task-associated stimuli, taking inspiration from the behaviour of insect societies. In this model, each agent has a response threshold for every task. Agents switch from one task to another when the level of the task stimulus exceeds their threshold. Control algorithms based on this model have been designed to control a group of robots in an object retrieval task [75-77].
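The threshold model above is commonly written as a sigmoid response probability; the sketch below uses the usual form with exponent n = 2 and hypothetical stimulus and threshold values, and leaves the random draws external so the decision rule itself stays deterministic and testable.

```python
# Sketch of the response-threshold model for decentralised task allocation:
# an agent's probability of taking up a task grows with the task stimulus s
# and is gated by the agent's own threshold theta.
def response_probability(stimulus, threshold, n=2):
    s_n = stimulus ** n
    return s_n / (s_n + threshold ** n)

def choose_task(stimuli, thresholds, random_values):
    """Return indices of the tasks the agent engages, comparing each
    response probability against an externally supplied random draw."""
    return [i for i, (s, th, u) in enumerate(zip(stimuli, thresholds, random_values))
            if response_probability(s, th) > u]
```

Heterogeneous thresholds across the swarm produce the division of labour observed in insect societies: low-threshold agents take up a task early, which reduces the shared stimulus and keeps high-threshold agents on other tasks.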

Applications that involve deploying large numbers of mobile agents in an environment for search and rescue or information gathering have increased the interest in swarm control algorithms. Inspired by the collective behaviour of flocks of birds, schools of fish and herds of animals, the biological principles of such systems have been used to develop decentralised controllers for motion planning and formation control. The heuristic rules used to create coordinated swarm motion (separation, alignment and cohesion) were introduced by Reynolds [78]. To generate stable aggregation of autonomous vehicles, artificial potential functions have been used to model the inter-vehicle interactions [17, 79, 80]. The motion of the group can be directed by virtual leaders, i.e. moving reference points that influence vehicles in their neighbourhood by means of additional artificial potentials [81]. A theoretical framework for the design and analysis of swarm flocking algorithms is provided in the literature [82]. It can be demonstrated that flocking with agent pairwise potentials and velocity consensus, which is equivalent to the three rules Reynolds proposed, is insufficient for the creation of flocking behaviour and leads to regular fragmentation; flocking algorithms with additional terms that account for the group objective and obstacle avoidance yield the desired results. The flocking behaviour has been extended to pattern formation using a bifurcating potential field function, which allows a transition between different patterns through a simple parameter switch [83]. Izzo and Pettazzi present a path planning technique that enables multiple spacecraft to form complex geometries [84]. In this approach, the potential field of an agent is defined as a sum of weighted contributions: gather, dock and avoid. The weighting parameters can be evaluated using the theory of symmetric groups. This approach has also been used to control spacecraft cluster collision avoidance manoeuvres: in the event of a potential impact threat the cluster reconfigures, and it returns to its original configuration after the threat has passed [85].
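Reynolds' three rules can be written as a single acceleration command per agent; the gains, the minimum separation distance, and the inverse-square separation term in the sketch below are hypothetical tuning choices, not Reynolds' original parameters.

```python
# Minimal 2-D sketch of Reynolds' separation / alignment / cohesion rules.
# Positions and velocities are (x, y) tuples; gains are illustrative.
def boid_acceleration(me_pos, me_vel, neigh_pos, neigh_vel,
                      k_sep=1.0, k_ali=0.5, k_coh=0.1, min_dist=1.0):
    n = len(neigh_pos)
    if n == 0:
        return (0.0, 0.0)
    cx = sum(p[0] for p in neigh_pos) / n
    cy = sum(p[1] for p in neigh_pos) / n
    vx = sum(v[0] for v in neigh_vel) / n
    vy = sum(v[1] for v in neigh_vel) / n
    # cohesion: steer toward the neighbourhood centroid
    ax = k_coh * (cx - me_pos[0])
    ay = k_coh * (cy - me_pos[1])
    # alignment: match the neighbourhood mean velocity
    ax += k_ali * (vx - me_vel[0])
    ay += k_ali * (vy - me_vel[1])
    # separation: push away from neighbours closer than min_dist
    for px, py in neigh_pos:
        dx, dy = me_pos[0] - px, me_pos[1] - py
        d2 = dx * dx + dy * dy
        if 0 < d2 < min_dist ** 2:
            ax += k_sep * dx / d2
            ay += k_sep * dy / d2
    return (ax, ay)
```

The fragmentation result cited above [82] corresponds to running exactly this rule set with no group-objective term: disconnected neighbourhoods drift apart because nothing pulls separate clusters back together.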

2.3.2. Artificial Swarm


As shown in the last section, studies of how insect colonies collectively perform specific tasks provide researchers with useful tools for the design of swarm intelligent robotics. The stigmergic algorithms that mimic the behaviours of the insects have been successfully transferred to robotic systems. A bottom-up design process, the subsumption architecture, was proposed by Brooks for the development of behaviour-based robots [86, 87]. The underlying idea of this approach is decomposition by task-achieving behaviour. Each behaviour depends on a pattern of interactions with the environment. All the behaviours are processed in parallel, but only one behaviour dominates the agent at a given time, and the dominant behaviour can change frequently in response to environmental sensing. The advantage of this approach is that it gives an incremental path from simple behaviours to complex autonomous intelligent systems.

Nowak and Lamont applied this decomposition design process to the development of SO-based autonomous vehicles [88]. In their design process, the problem domain is first decomposed into sub-objectives, from which a set of SO rules is developed. In contrast to the development of behaviours for a single robot, the interactions among agents need to be taken into consideration.

Numerous research studies have investigated the feasibility of applying swarm-based solutions to the control of unmanned systems. The concept of employing a swarm of autonomous weapons is explored by Frelinger et al. [89]. From simulation results of attacking ground targets using a simple swarming algorithm, they conclude, first, that communications used in conjunction with swarming algorithms can compensate for the low capability of weapon sensors and, second, that the swarm of weapons exhibits robustness to target location errors. Gaudiano et al. investigate two different missions in their study: a search mission and a suppression mission [90]. In the search mission, several basic swarm strategies are presented to control the UAV swarm flying over the target area looking for targets. In the suppression mission, the swarming algorithm is extended so that each agent has multiple behaviour states and stochastic transition rules for switching between them. In a more realistic environment, the searching vehicles must deal with communication constraints. A future path projection method has been proposed to mitigate the impact of limited communication range [91]. In this method, each vehicle constructs an individual probability distribution map that describes the likelihood of detecting targets within the search space. When the vehicles move beyond their communication range, they update the probability distribution map based on a prediction of the paths of their neighbours, given the last received position and heading. A decentralised control framework for multiple UAVs to cooperatively search, detect, locate and destroy ground-based radio-frequency-emitting mobile targets has also been presented [92, 93]. The control framework is based on a behaviour-based state machine: the UAVs switch from one state to another based on sensor readings and communicated data. Inspired by this previous research, the ground moving target engagement scenario is investigated in Chapter 3. That study adopts a quantitative methodology to evaluate the decentralised control framework against a centralised one, in order to identify the aspects of the problem that have the most impact on the performance of the decentralised approach.
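A behaviour-based state machine of the kind used in [92, 93] can be sketched as a pure transition function; the state names, the two-tracker condition, and the transition rules below are hypothetical illustrations, not the actual design from those references.

```python
# Hypothetical sketch of a behaviour-based state machine for a single UAV:
# the next state depends only on the current state, local sensor readings,
# and data received from neighbours. States and conditions are illustrative.
def next_state(state, target_detected, trackers_on_target, weapon_ready):
    if state == "search":
        return "track" if target_detected else "search"
    if state == "track":
        # attack requires two cooperative trackers and an armed partner
        if target_detected and trackers_on_target >= 2 and weapon_ready:
            return "attack"
        return "track" if target_detected else "search"
    if state == "attack":
        return "attack" if target_detected else "search"
    raise ValueError(f"unknown state: {state}")
```

Because every input is locally sensed or locally communicated, each UAV can evaluate this function independently, which is what makes the framework decentralised.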

Chapter 3

Behaviour-Based Multiple UAVs Cooperative Control

In this chapter, the problem of assigning multiple UAVs to time-dependent cooperative tasks is investigated. A time-dependent cooperative task is one that requires multiple agents to perform separate subtasks simultaneously, or within the available time windows of cooperating agents (i.e. two or more agents are assigned subtasks within the same mission, and the agents have limited time windows to complete their tasks).


Such problems have mainly been investigated using centralised task assignment algorithms. With a centralised controller, all the state information from the distributed vehicles is sent to a centralised agent, where the control or assignment decisions for each individual vehicle are made by an optimisation program. However, the problem's size and the cost of global computation and information can be prohibitive for this centralised approach. This gives rise to the need for local solutions that dynamically assign agents to cooperative tasks. The aim of this research has been to design a method and algorithm that enable efficient operation of a large-scale UAS performing a cooperative task when a centralised decision maker cannot or does not exist.

The remainder of this chapter is structured as follows. Following the problem statement in Section 3.2, Section 3.3 describes the behaviour-based control framework. Simulation results and analysis are presented in Section 3.4 and a summary of the study is given in Section 3.5.

3.1. Nomenclature

$F$: Transition matrix

$z$: Observation vector

$\nu$, $Q$: Process noise and its covariance, respectively

$w$: Measurement noise

$i$, $I$: Information state contribution and its associated information matrix, respectively

$R$: Observation noise covariance matrix

$y$, $Y$: Information state and information matrix, respectively

$x$, $P$: Target state and state covariance, respectively

$H(X)$, $h(x)$: Entropy of discrete random variable $X$ and continuous random variable $x$, respectively

$I(x;z)$: Mutual information obtained about a random variable $x$ with respect to a random variable $z$

$\psi$: Bearing angle of the target measured by the vehicle

$x$, $y$: Distances of the target in $x$ and $y$ directions

$\dot{x}$, $\dot{y}$: Velocities of the target in $x$ and $y$ directions

3.2. Problem Statement

The problem of interest is a dynamic variant of the known cooperative moving target engagement mission (described and analysed in [28]). In this scenario, a group of UAVs operates over a defined area, tracking and attacking several moving ground targets. Instead of assuming the targets are stationary relative to the UAVs, as in previous research, the target motion is represented using a state-space model with motion uncertainty. Each UAV is assumed to be equipped with a radar-based Ground Moving Target Indicator (GMTI) to measure the locations and range rates of the ground targets relative to the UAV, and a sensor to detect the presence of other UAVs within a predefined distance. The mission requires that a target be continuously tracked by two vehicles during the attack phase while an additional UAV launches a guided weapon. The information from the tracking vehicles is fused to form a precise target location for targeting the weapon.

The GMTI sensor footprint in this study is assumed to be a circular sector shape with a minimum and maximum radius. The GMTI sensors are Doppler based, which means that a moving ground target can only be detected and tracked if its range rate relative to the vehicle is above some minimum detection velocity. Thus the tracking UAVs have to stay within some offset angle from the heading of the moving target in order to maintain this relative velocity. Figure 3.1 shows the heading of the target and the associated target region of detectability (light grey area). Within the target detection region, the onboard GMTI radars can detect and track the target's motion. The sensor footprint (dark blue area) points out of the side of the UAV. As the UAV flies along arcs of a circle centred at the target, the sensor footprint can remain locked on the target. To further complicate the problem, the difference in bearing angles of the tracking UAVs to the target is required to be greater than a certain value to reduce the target location estimation error. The target kinematic model and GMTI measurement model are presented in the following.
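The two detectability conditions described above (the target must lie inside the annular footprint and its range rate relative to the UAV must exceed the minimum detection velocity) can be expressed as a simple geometric test; the radii and velocity threshold below are hypothetical values, and the UAV is treated as stationary over the measurement instant for simplicity.

```python
import math

# Hedged sketch of the GMTI detectability conditions: annular range check
# plus a Doppler (range rate) check. All numerical limits are illustrative.
def gmti_detectable(uav_pos, target_pos, target_vel,
                    r_min=2000.0, r_max=8000.0, v_min=3.0):
    dx = target_pos[0] - uav_pos[0]
    dy = target_pos[1] - uav_pos[1]
    r = math.hypot(dx, dy)
    if not (r_min <= r <= r_max):
        return False   # outside the sensor footprint radii
    # range rate: projection of target velocity onto the line of sight
    r_dot = (dx * target_vel[0] + dy * target_vel[1]) / r
    return abs(r_dot) >= v_min
```

The Doppler condition is what forces the tracking geometry in Figure 3.1: a UAV broadside to the target's motion sees near-zero range rate and loses the track, so the trackers must stay within an offset angle of the target's heading.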

Figure 3.1: Schematic diagram for cooperative tracking.

The primary objective of target tracking is to estimate the states of the target, which is usually treated as a point object in tracking problems. The kinematic model of the target describes the evolution of the target states. The target kinematic model considered in this study is a commonly used model called the discrete white noise acceleration model [94]. The corresponding state-space model is given by:

$$\mathbf{x}_{k+1} = F\,\mathbf{x}_k + \Gamma\,\nu_k, \qquad (1)$$

where the state vector is $\mathbf{x}_k = [x_k \;\; \dot{x}_k \;\; y_k \;\; \dot{y}_k]^T$. $x_k$ and $y_k$ are the distances of the target in the $x$ and $y$ directions from the origin point, and $\dot{x}_k$ and $\dot{y}_k$ are the corresponding velocities. The velocity is assumed constant over each sampling period $T$; $\Gamma\nu_k$ is the increase in the velocity of the target during this period, so that $\nu_k$ is the average acceleration of the target over the period. $\nu_k$ is modelled as a white Gaussian noise sequence, $\nu_k \sim \mathcal{N}(0, Q)$, where:

$$Q = \operatorname{diag}\left(\sigma_{\nu_x}^2,\; \sigma_{\nu_y}^2\right). \qquad (2)$$

The standard deviations of the process noise, $\sigma_{\nu_x}$ and $\sigma_{\nu_y}$, are set to 0.5 metres per second squared. The transition matrix is:

$$F = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad (3)$$

and the gain matrix multiplying the process noise is given by:

$$\Gamma = \begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{bmatrix}. \qquad (4)$$
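The discrete white noise acceleration model of equations (1)-(4) can be propagated numerically as follows; the state ordering [x, x_dot, y, y_dot], the sample period, and the use of Python's standard generator are implementation choices for illustration.

```python
import random

# Numerical sketch of the discrete white noise acceleration model:
# x_{k+1} = F x_k + Gamma nu_k, written out component-wise for the
# state [x, x_dot, y, y_dot] with sample period T.
def dwna_step(state, T, sigma=0.5, rng=None):
    rng = rng or random.Random()
    x, vx, y, vy = state
    nu_x = rng.gauss(0.0, sigma)   # process noise, std 0.5 m/s^2 in the text
    nu_y = rng.gauss(0.0, sigma)
    return (x + T * vx + 0.5 * T * T * nu_x,
            vx + T * nu_x,
            y + T * vy + 0.5 * T * T * nu_y,
            vy + T * nu_y)
```

With the noise switched off (sigma = 0) the model reduces to straight-line constant-velocity motion, which makes the role of the process noise as the sole source of manoeuvre uncertainty explicit.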

To simplify the problem, we assume that the dynamic model of the target is known to the UAVs. In many practical situations, the dynamic model of the target being tracked is not available to the tracker, or it may be time-varying, e.g. for manoeuvring targets. Techniques for dealing with such problems can be found in [95].

In this study, the UAVs are assumed to be equipped with GMTI sensors [96]. The measurement equation is given by:

$$\mathbf{z}_k = H_k\,\mathbf{x}_k + \mathbf{w}_k. \qquad (5)$$

The measurement vector comprises the positions in the $x$ and $y$ directions and the range rate, where:

$$\mathbf{z}_k = \begin{bmatrix} x_k & y_k & \dot{x}_k \sin\psi_k + \dot{y}_k \cos\psi_k \end{bmatrix}^T, \qquad (6)$$

where $\psi_k$ is the bearing angle of the target measured by the vehicle at time $k$.

The sensor measurement matrix is given by:

$$H_k = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & \sin\psi_k & 0 & \cos\psi_k \end{bmatrix}. \qquad (7)$$

$\mathbf{w}_k$ in equation (5) is the Gaussian measurement noise vector, $\mathbf{w}_k \sim \mathcal{N}(0, R_k)$, where:

$$R_k = \begin{bmatrix} \sigma_x^2 & \sigma_{xy} & 0 \\ \sigma_{xy} & \sigma_y^2 & 0 \\ 0 & 0 & \sigma_{\dot{r}}^2 \end{bmatrix}. \qquad (8)$$

The error statistics for the sensor measurements are given in terms of the range standard deviation $\sigma_r$, the range rate standard deviation $\sigma_{\dot{r}}$ and the bearing angle standard deviation $\sigma_\psi$, which are known. In this exercise, they are set to 5 metres, 1 metre per second and 0.001 radians, respectively. With these, the position variances and covariance can be calculated as:

$$\sigma_x^2 = r^2 \sigma_\psi^2 \cos^2\psi_k + \sigma_r^2 \sin^2\psi_k, \qquad (9)$$

$$\sigma_y^2 = r^2 \sigma_\psi^2 \sin^2\psi_k + \sigma_r^2 \cos^2\psi_k, \qquad (10)$$

$$\sigma_{xy} = \left(\sigma_r^2 - r^2 \sigma_\psi^2\right) \sin\psi_k \cos\psi_k, \qquad (11)$$

where the range $r$ of the target measured from each vehicle is evaluated at $r = \sqrt{(x_k - x_v)^2 + (y_k - y_v)^2}$, with $(x_v, y_v)$ the position of the vehicle.

3.3. Centralised Approach

A centralised algorithm based on mixed integer linear programming (MILP) has been proposed for solving cooperative control problems. It is presented briefly here, since it is used as the baseline strategy against which the proposed decentralised approach is compared. Full details of this method are available in [12, 28].

In this cooperative scenario, the ability to assign cooperative tasks (i.e. two or more agents are assigned subtasks at the same target) is critical to the mission's success. Furthermore, each agent may have only limited time windows in which to perform a specific task. For example, if a coordinated task requires one UAV to track a target while another attacks it, then one UAV may need to adjust its path to coordinate with its partner. A MILP formulation using a time availability window approach allows these constraints to be addressed. This approach results in a three-stage process for the full mission planning problem:

First, the flight paths of the UAVs are abstracted into time availability windows in which a task can be performed. All potential task execution times are reduced to sets of time windows, e.g. UAV 1 can only perform a tracking task on the target between 40 and 60 seconds after the mission begins.

Second, the MILP problem is formulated and solved using these time availability windows as a mathematical abstraction of the path planning. The MILP problem includes both continuous and binary decision variables. The continuous timing variables represent the time at which a task is initiated on a target, constrained by the time windows obtained from the first step. The binary decision variables indicate which agent performs which task at which target in which time window, and which pair of UAVs is assigned to cooperatively track a target in a given time window. These binary variables are used in the non-timing constraints, which ensure that each agent receives one task at a time, each target is served at most once, and any target that receives one task receives all tasks. The cost function can be defined so as to minimise the time required to destroy every target. Solving this MILP problem yields the task assignments and task schedules.

Finally, specific flight paths are calculated to meet the assigned task schedule. If there are not sufficient UAVs to fully destroy all targets without multiple tasks per UAV, the assignment algorithm must be iterated. At each iteration, only the earliest target prosecution time is selected. The target corresponding to that time is removed from the target list, and the UAVs assigned to that target have their states updated to reflect the time it will take to complete their respective tasks. New time windows are computed with the updated positions and shifted by the amount of time committed to earlier stages. The assignment algorithm is iterated until all targets have been assigned.

To reduce the size of the above optimisation problem to a more manageable level, the following four assumptions were made by the investigators in previous research when formulating the problem:

1. Targets have constant heading.

2. Targets have constant speed.

3. The tracking UAVs place the target in the centre of the sensor footprint.

4. Weapons are launched at a fixed distance from the target, and the flight time of the weapon is known.
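The time-availability-window abstraction can be illustrated with a toy brute-force search in place of a full MILP solver. The window bounds, agent and target names, and the cost used below (latest task-initiation time) are all invented for illustration:

```python
from itertools import permutations

# Toy time-availability windows (seconds): windows[agent][target] = (start, end).
# All values are illustrative only.
windows = {
    "UAV1": {"T1": (40, 60), "T2": (90, 120)},
    "UAV2": {"T1": (50, 70), "T2": (30, 55)},
}

def best_assignment(windows):
    """Brute-force stand-in for the MILP: assign each agent one target so
    that the latest earliest-feasible task-initiation time is minimised."""
    agents = list(windows)
    targets = sorted({t for w in windows.values() for t in w})
    best, best_cost = None, float("inf")
    for perm in permutations(targets, len(agents)):
        # skip assignments for which an agent has no window on its target
        if any(perm[i] not in windows[a] for i, a in enumerate(agents)):
            continue
        starts = [windows[a][perm[i]][0] for i, a in enumerate(agents)]
        cost = max(starts)  # mission ends when the last task is initiated
        if cost < best_cost:
            best, best_cost = dict(zip(agents, perm)), cost
    return best, best_cost

assignment, cost = best_assignment(windows)
```

For realistic problem sizes this enumeration would of course be replaced by a MILP solver such as CPLEX, as in the centralised baseline.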


3.4. Decentralised Approach

The centralised approach discussed above formulated this problem as a MILP. Since MILP is NP-complete, the computation time of this algorithm grows exponentially with the number of UAVs. Moreover, in the presence of moving targets, the whole solution must be recalculated as the states of the targets change, making this approach infeasible as a real-time solution. Finally, constraints on information flow, such as communication delay, can result in uncoordinated assignments. The centralised assignment algorithms discussed in [21], [22] and [97] can be computed either by a centralised agent (e.g. a command centre or team leader) or redundantly by each of the UAVs. With either approach, up-to-date information on each UAV's state, consisting of position, heading and velocity, has to be shared across the whole network. Any fault in vehicle state measurements or vehicle-to-vehicle communications might cause uncoordinated assignments or inconsistent team decision making.

The main contribution of this research is the investigation of a decentralised control framework that enables efficient operation of a large scale UAS performing a cooperative task when a centralised decision maker cannot or does not exist. The proposed methodology is designed around agent level interactions through the combination of direct interaction and indirect interaction. Direct interactions are explicit: agents send out requests in given circumstances, and the agents that receive the requests respond. Indirect interactions are made via stigmergy, the mechanism by which individual behaviour modifies the environment, which in turn modifies the behaviour of other individuals [14]. In this study, stigmergy is used to coordinate the behaviour of the UAVs during the tracking stage via mutual information.

3.4.1. Information Theoretic Cooperative Tracking

In order to track the targets, the UAVs in track mode have to satisfy the sensor geometry constraints and maintain separated line-of-sight angles to the target. The estimation error in the position of the moving target can be reduced by multiple sensors with separated line-of-sight angles to the target, with preferably orthogonal views [97]. This was treated in previous research by constraining the difference in bearing angles of the UAVs to the target to be greater than 45 degrees. Unlike the previous research, here the angle separation requirement is not enforced by constraints. Since the performance of the target estimation depends on the positions from which measurements are taken relative to the target, and the information content of a set of measurements can be quantified and recursively computed by an information filter, our methodology is to let the tracking UAVs fly trajectories that locally maximise the information gain for the next measurement. By doing this, the angle separation forms automatically.

The standard version of the Kalman filter propagates the state estimate and calculates the gain through a recursive computation of the state covariance [98]. The information filter is derived from the Kalman filter in terms of the information state vector $\hat{\mathbf{y}}$ and the information matrix $\mathbf{Y}$, defined as:

$\hat{\mathbf{y}}(i|j) = \mathbf{Y}(i|j)\,\hat{\mathbf{x}}(i|j)$ , (12)

$\mathbf{Y}(i|j) = \mathbf{P}^{-1}(i|j)$ , (13)

where $\hat{\mathbf{x}}(i|j)$ and $\mathbf{P}(i|j)$ are the state estimate and its covariance at time $i$ given observations up to time $j$.

Because the target motion state and the sensor measurement used in our study have a nonlinear relationship (as given in equation (5)), we utilise an Extended Information Filter (EIF) [99]. The prediction step of the filter is given by:

$\hat{\mathbf{y}}(k|k-1) = \mathbf{Y}(k|k-1)\,\hat{\mathbf{x}}(k|k-1)$ , (14)

and

$\mathbf{Y}(k|k-1) = \big[\mathbf{F}(k)\,\mathbf{Y}^{-1}(k-1|k-1)\,\mathbf{F}^{T}(k) + \mathbf{Q}(k)\big]^{-1}$ , (15)

where

$\hat{\mathbf{x}}(k|k-1) = \mathbf{f}\big(\hat{\mathbf{x}}(k-1|k-1)\big)$ , (16)

$\mathbf{F}(k) = \dfrac{\partial \mathbf{f}}{\partial \mathbf{x}}\bigg|_{\hat{\mathbf{x}}(k-1|k-1)}$ , (17)

and $\mathbf{Q}(k)$ is the process noise covariance,

$\mathbf{Q}(k) = E\big[\mathbf{w}(k)\,\mathbf{w}^{T}(k)\big]$ , (18)

while the estimation step can be written as:

$\hat{\mathbf{y}}(k|k) = \hat{\mathbf{y}}(k|k-1) + \mathbf{i}(k)$ , (19)

$\mathbf{Y}(k|k) = \mathbf{Y}(k|k-1) + \mathbf{I}(k)$ . (20)

The information state contribution $\mathbf{i}(k)$ and its associated information matrix $\mathbf{I}(k)$ are given, respectively, as:

$\mathbf{i}(k) = \mathbf{H}^{T}(k)\,\mathbf{R}^{-1}(k)\big[\boldsymbol{\nu}(k) + \mathbf{H}(k)\,\hat{\mathbf{x}}(k|k-1)\big]$ (21)

and

$\mathbf{I}(k) = \mathbf{H}^{T}(k)\,\mathbf{R}^{-1}(k)\,\mathbf{H}(k)$ , (22)

where the Jacobian $\mathbf{H}(k) = \partial \mathbf{h}/\partial \mathbf{x}$ is evaluated at $\hat{\mathbf{x}}(k|k-1)$ and the innovation $\boldsymbol{\nu}(k) = \mathbf{z}(k) - \mathbf{h}\big(\hat{\mathbf{x}}(k|k-1)\big)$ is likewise evaluated at $\hat{\mathbf{x}}(k|k-1)$. Compared with a Kalman filter, the estimation step of the information filter simplifies, while the complexity of the prediction step increases. The initial condition for the information filter is set as $\mathbf{Y}(0|0) = \mathbf{0}$, regardless of $\hat{\mathbf{x}}(0|0)$ [95].
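As a sanity check on the filter structure, the one-dimensional (scalar, linear) special case of the information-form predict/update cycle can be written in a few lines. The constants F, Q, H, R and the measurement sequence are invented, and the nonlinear Jacobians of the EIF reduce here to constants:

```python
def eif_step(y, Y, z, F=1.0, Q=0.5, H=1.0, R=2.0):
    """One predict/update cycle of a scalar information filter (a 1-D
    sketch of the information-form equations, with invented constants)."""
    # Prediction: propagate in covariance form, then re-invert
    x = y / Y                       # recover state estimate from (y, Y)
    P_pred = F * (1.0 / Y) * F + Q
    Y_pred = 1.0 / P_pred
    y_pred = Y_pred * (F * x)
    # Update: information contributions are simply added
    I_k = H * (1.0 / R) * H
    x_pred = y_pred / Y_pred
    nu = z - H * x_pred             # innovation
    i_k = H * (1.0 / R) * (nu + H * x_pred)
    return y_pred + i_k, Y_pred + I_k

y, Y = 0.0, 1e-2                    # near-uninformative prior
for z in [4.8, 5.1, 5.0, 4.9]:      # noisy measurements of a state near 5
    y, Y = eif_step(y, Y, z)
x_hat = y / Y
```

Each measurement simply adds its information contribution, which is the additive structure exploited later in the decentralised data fusion step.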


Information filters play an important role in decentralised data fusion problems that involve multiple information sources. In a decentralised system, data fusion and vehicle control occur locally at each individual agent, using the local observation and the information communicated from neighbouring agents. In conventional approaches to state estimation, it is difficult to capture the statistical relationships that exist between estimates produced by combinations of observations [52]. The information filter formulation has an additive structure that can be exploited by distributed decision-making algorithms.

For $N$ sensor information sources, the posterior information state and information matrix are obtained from:

$\hat{\mathbf{y}}(k|k) = \hat{\mathbf{y}}(k|k-1) + \displaystyle\sum_{j=1}^{N} \mathbf{i}_{j}(k)$ , (23)

$\mathbf{Y}(k|k) = \mathbf{Y}(k|k-1) + \displaystyle\sum_{j=1}^{N} \mathbf{I}_{j}(k)$ , (24)

where $\mathbf{I}_{j}(k)$ is the information matrix contribution and $\mathbf{i}_{j}(k)$ represents the information state contribution of sensor $j$. This assumes that the sensors are fully connected with no communication failures or delays. The posterior target state estimate is obtained from:

$\hat{\mathbf{x}}(k|k) = \mathbf{Y}^{-1}(k|k)\,\hat{\mathbf{y}}(k|k)$ . (25)

The entropic information of the estimated target states, i.e. the negative entropy of the $n$-dimensional Gaussian estimate, is given by:

$\mathcal{I}(k) = -\tfrac{1}{2}\log\big[(2\pi e)^{n}\,\big|\mathbf{Y}^{-1}(k|k)\big|\big]$ . (26)

The mutual information gain for vehicle $i$ tracking target $j$ is calculated from:

$\mathcal{I}_{ij}(k) = \tfrac{1}{2}\log\dfrac{\big|\mathbf{Y}(k|k)\big|}{\big|\mathbf{Y}(k|k-1)\big|}$ . (27)

From equation (22), it can be seen that the information matrix contribution results directly from the geometric position of a UAV relative to the target. Thus, the mutual information gives an a priori measure of the information to be gained from potential sensing actions.

To test and verify this methodology, we set up a simulation with two UAVs tracking one target from different line-of-sight angle separations. The simulation results for the track estimation error against the separation angle, using the sensor model presented in the previous subsection, are as follows.

Figure 3.2(a) shows that the minimum estimation error was achieved with a line-of-sight separation angle between the two UAVs of approximately 90 degrees. Figure 3.2(b) shows the entropic information plotted against separation angle. The figure suggests that the information measure on the target is directly related to the estimation error. It follows that the more information we have on the target, the more accurate the estimation of both its position and velocity, provided that the sensors are appropriately displaced. This result verifies that navigating towards the location from where the observation is more informative, i.e. towards the maximum of the mutual information gain calculated from equation (27), leads to more accurate estimation of the target's states and also maintains the desired angle separation requirement.
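The separation-angle result can be reproduced qualitatively with a minimal model: treat each bearing measurement as contributing information only in the cross-range direction, add the two contributions (as in the additive fusion step), and take the entropic information as the negative entropy of the resulting Gaussian. The sensor numbers below are illustrative, not the thesis's radar model:

```python
import math

def fused_entropic_information(sep_deg, sigma=0.001, r=1000.0):
    """Entropic information (negative entropy) of a 2-D position estimate
    fused from two bearing-only sensors whose lines of sight differ by
    sep_deg degrees. Each bearing constrains the cross-range direction
    with standard deviation sigma*r metres (illustrative values)."""
    w = 1.0 / (sigma * r) ** 2
    Y = [[0.0, 0.0], [0.0, 0.0]]
    for ang in (0.0, math.radians(sep_deg)):
        # unit vector perpendicular to the line of sight at angle `ang`
        u = (-math.sin(ang), math.cos(ang))
        for i in range(2):
            for j in range(2):
                Y[i][j] += w * u[i] * u[j]
    detY = Y[0][0] * Y[1][1] - Y[0][1] * Y[1][0]
    # negative entropy of a 2-D Gaussian with covariance Y^{-1}
    return 0.5 * math.log(detY / (2 * math.pi * math.e) ** 2)

info = {d: fused_entropic_information(d) for d in (10, 45, 90, 135, 170)}
```

The fused information determinant is proportional to the squared sine of the separation angle, so the entropic information peaks at a 90 degree separation, matching the shape of the curves in Figure 3.2.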

Figure 3.2(a): Target position Root Mean Square (RMS) estimation error and target velocity RMS estimation error versus separation angle (in degrees).

Figure 3.2(b): Target entropic information versus separation angle (in degrees).

3.4.2. Four Basic Behaviours

In this section, a behaviour-based control framework which effectively generates cooperative behaviours of multiple UAVs is presented. A swarm system can be defined as a decentralised system made of autonomous agents that are distributed in the environment and follow simple probabilistic stimulus-response behaviours [100]. The individuals within a swarm system are not able to assess the global situation or control the tasks to be carried out by the other agents, i.e. there is no supervisor in a self-organising swarm system. Instead, each time an agent performs an action, the local environment is modified by this action. The new environmental configuration then influences the future actions of other agents. This process leads to the emergent properties of the colony at the system level. In this behaviour-based approach, each vehicle can switch between four behavioural states: cruise, standby, attack and track. In each state, the behaviour of the UAV is governed by local rules. The switch between behavioural states is triggered by changes in the local environment, e.g. target states and requests received from neighbours.

A. Cruise

A UAV is in cruise mode if it is not engaged with any target. This behaviour has three guidance rules that navigate the UAV, depending on the requests it receives.


If no request is received, as shown in Figure 3.3(a), the UAV will fly toward the nearest target and switch to standby mode once it reaches the standoff distance.

If the UAV receives a request for tracking support from the standby vehicle, as shown in Figure 3.3(b), it will fly toward the tracking region of the target and switch to track mode once it arrives. If the UAV is already in the tracking region when it receives the request, it will switch to track mode immediately.

If the UAV receives a request for tracking support and at the same time also receives the local track data from another tracking vehicle, as shown in Figure 3.3(c), it will run the track data through the data fusion process and fly toward the location from where the observation can be most informative. Once it arrives, it switches to track mode.

Figure 3.3(a): Guidance law for cruise mode. Figure 3.3(b): Guidance law for cruise mode. Figure 3.3(c): Guidance law for cruise mode.

B. Standby

In the standby mode, the UAV circles around the target at a predefined standoff distance. It receives local track data from all the tracking UAVs and uses the decentralised data fusion algorithm to process the track (Figure 3.4). Once the target track is defined accurately enough for the guided weapon, the standby UAV switches to the attack state. This accuracy is measured by the entropic information on the target.

Figure 3.4: Standby mode Figure 3.5: Attack mode

C. Attack

The attack mode is simply modelled as the UAV heading toward the target and launching the weapon from a predefined distance (Figure 3.5). The UAVs tracking the target must continue to track it for the duration of the weapon's flight. Once the target is destroyed, all the UAVs engaged with this target switch back to the cruise state.

D. Track

When a UAV is in the track mode, its motion is governed by two rules:

The first rule the tracking UAVs follow is to place the target within the sensor footprint while remaining within the detectable line-of-sight angle to the target heading.

The second rule is to increase the entropic information on the target. The control and estimation process is conducted in discrete time. The tracking UAVs fly at constant velocity with their heading as the decision variable. They request the local track information, i.e. the information state and the information matrix, from the other tracker, and compute the desired heading command that maximises the mutual information gain for the next observation step. This process repeats recursively until the track is accurate enough to trigger the attack.

In terms of communication, the track information is only shared among the two tracking UAVs and the attacking UAV. Figure 3.6 shows a schematic block diagram of the sensor and communication network for engaging a target. As shown in the diagram, the two tracking UAVs make observations on the target and maintain local tracks. The local tracks are shared between the two tracking UAVs and used for calculating mutual information. The information states and information matrices are also sent to the attacking UAV to fuse a global track. Algorithm 1 summarises the behaviours followed by each UAV and the conditions under which the UAVs switch modes.

Figure 3.6: Schematic block diagram of data fusion.

Algorithm 1

1: switch behaviour


2: case Cruise (default behaviour at the beginning of the mission)

3: if (no request received)

4: Fly toward the nearest target; Change to standby once it reaches the standoff distance.

5: elseif (tracking request received)

6: Fly toward tracking region of the target; Change to track once it arrives.

7: elseif (tracking request and local track data from another tracking UAV received)

8: Compute the desired heading command that maximise the mutual information gain for the next observation step; Change to track once it arrives.

9: end if

10: case Standby

11: Circle around the target; Send request for tracking support; Process the target track; Change to attack when the level of accuracy is achieved.

12: case Attack

13: Release weapon; Change back to cruise.

14: case Track

15: if (target tracked by single UAV)

16: Track target by maintaining sensor coverage.

17: elseif (target tracked by cooperative UAVs)


18: Perform cooperative tracking (step 8); Change back to cruise once the target is destroyed.

19: end if

20: end switch

Figure 3.7: Pseudo code for behaviour-based control architecture.
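The mode-switching logic of Algorithm 1 can be sketched as a small state machine. The event names in `env` below are invented labels for the local conditions described above, not identifiers from the thesis:

```python
from enum import Enum

class Mode(Enum):
    CRUISE = 1
    STANDBY = 2
    ATTACK = 3
    TRACK = 4

def next_mode(mode, env):
    """Mode-switching logic in the spirit of Algorithm 1; `env` is a dict
    of local events with invented key names."""
    if mode is Mode.CRUISE:
        if env.get("tracking_request") and env.get("in_tracking_region"):
            return Mode.TRACK
        if env.get("at_standoff") and not env.get("tracking_request"):
            return Mode.STANDBY
        return Mode.CRUISE
    if mode is Mode.STANDBY:
        # switch to attack once the fused track is accurate enough
        return Mode.ATTACK if env.get("track_accurate") else Mode.STANDBY
    if mode is Mode.ATTACK:
        return Mode.CRUISE  # weapon released, revert to cruise
    # TRACK: keep tracking until the target is destroyed
    return Mode.CRUISE if env.get("target_destroyed") else Mode.TRACK

# A typical engagement sequence for one UAV
m = Mode.CRUISE
m = next_mode(m, {"at_standoff": True})      # reaches standoff -> STANDBY
m = next_mode(m, {"track_accurate": True})   # track good enough -> ATTACK
m = next_mode(m, {})                         # weapon released -> CRUISE
```

Because every transition depends only on locally observable events, the same function runs unchanged on every agent, which is the essence of the decentralised design.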

3.5. Simulation and Analysis

This section presents the simulation results and analysis of the performance of the behaviour-based method, which led to the investigation of a predictive decentralised controller. The development of the multiple-UAV simulator can be found in Appendix A.

3.5.1. Lyapunov Potential Field vs. Information Measures

In this cooperative moving target engagement scenario, the ability to generate accurate tracks of the targets is critical to mission performance. The guidance rules have to enable the tracking UAVs to fly trajectories that increase the amount of information provided by the measurements and improve the target state estimation, resulting in proper target tracking and an accurate target location estimate, without violating the dynamics constraints and sensor restrictions. In this study, we compare the proposed cooperative tracking method with a decentralised method based on a potential field. Potential field methods have been investigated to generate guidance vector fields that provide the desired velocity of unmanned systems [101] [102] [103]. The specific method we adopted in this comparison is from Frew and Lawrence's work [104]. Their method is based on a Lyapunov guidance vector field that produces stable convergence to a loiter circle of predefined radius. Cooperative tracking by multiple unmanned aircraft is achieved through an additional phasing control law which adjusts the speed of the vehicles to maintain the desired relative phase on the loiter circle. The phase separation is set to 90 degrees to minimise the achievable error variance [105].
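A minimal sketch of a Lyapunov guidance vector field of this type (the standard loiter-circle form from the literature, not the thesis's implementation) illustrates the stable convergence to a loiter circle of predefined radius:

```python
import math

def lyapunov_field(x, y, v0=1.0, rd=10.0):
    """Lyapunov guidance vector field (standard loiter-circle form):
    commands a constant speed v0 and converges to a circle of radius rd
    about the origin. A sketch, with invented parameter values."""
    r = math.hypot(x, y)
    c = -v0 / (r * (r * r + rd * rd))
    return (c * (x * (r * r - rd * rd) + 2.0 * y * r * rd),
            c * (y * (r * r - rd * rd) - 2.0 * x * r * rd))

# Integrate a vehicle starting well outside the loiter circle
x, y, dt = 40.0, 5.0, 0.05
for _ in range(4000):
    vx, vy = lyapunov_field(x, y)
    x, y = x + vx * dt, y + vy * dt
radius = math.hypot(x, y)
```

The commanded speed is constant by construction, the radial component is inward outside the circle and outward inside it, so the vehicle spirals onto the loiter circle and then orbits it, which is the behaviour visible in Figure 3.8(a).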


Figure 3.8 (a): Cooperative tracking via Lyapunov potential field.

Figure 3.8 (b): Cooperative tracking via information measures.

Figure 3.8 shows the flight trajectories generated by these two methods while tracking a ground moving target from a stand-off distance. It can be seen from Figure 3.8(a) that, by following the guidance vector field, the UAVs are attracted to the loiter circle about the target vehicle and maintain the desired phase angle separation. Such coordination is achieved by constantly sharing positions between the two UAVs. In contrast, the UAVs using the proposed method head toward the tracking region in the early stage without any knowledge of the other UAV. As UAV 2 is initially located nearer to the target, it starts tracking the target earlier than UAV 1. Figure 3.8(b) shows that UAV 1 receives track data from UAV 2 and plans its next move to maximise information, which causes UAV 1 to loiter counter-clockwise about the target.

In order to investigate the performance of these two techniques, the Monte Carlo method is used with a total sample size of 100. The two UAVs are randomly generated between 40 kilometres and 55 kilometres from the target. We record the time when the entropic information on the target reaches -8.8, which indicates the accuracy of the estimation. The simulation results show that the UAVs spend on average 748 seconds using the potential field method versus 348 seconds using information measures. This indicates the effectiveness of the information theoretic framework applied to cooperative tracking. Even without coupled objective functions and dynamics, i.e. with no planning information shared between UAVs, the UAVs can still demonstrate the capability to perform cooperative tracking in a time critical scenario. In contrast, the Lyapunov potential field method is more suitable for persistent tracking, as it can produce asymptotic stability to the desired loiter pattern.

3.5.2. Centralised Control vs. Swarm Control

In this section, the simulation results of the complete cooperative moving target engagement scenario are presented. First, an example scenario is used to explain the proposed behaviour-based control framework. This scenario includes 7 UAVs and 3 ground targets. The mission area is a square with an area of 100 square kilometres. The UAVs fly at a constant speed, governed by flight dynamics, of 110 meters per second. Their behaviours are governed by the rule sets presented in Section 3.4. The targets were randomly distributed over the mission area and their motions are defined by equation (1).

Figures 3.9 (a), 3.9 (b) and 3.9 (c) display stages of the simulated scenario at 3 points in time.

At 188 seconds, UAVs 1 (red) and 4 (yellow) were designated to assist in tracking target 1 in cooperation with UAV 7 (orange). The coloured areas display the sensor footprints of the tracking UAVs; each sensor footprint has the same colour as its UAV. It can be noted that UAV 1 changed its heading once UAV 4 started tracking, as a result of the tracking rule, i.e. heading toward the location where the measurement could yield more information. This action caused UAVs 1 and 4 to maintain an appropriate line-of-sight separation angle while tracking the target. On the right side, UAV 6 (cyan) was tracking target 3 while UAV 5 (magenta) was in standby mode waiting for tracking support from a second vehicle. As can be seen from the flight path, UAV 2 (green) was heading toward target 2 initially, and then changed its course to help track target 3 after it received the tracking request from standby UAV 5.

Figure 3.9 (a): Simulated scenario at time 188 seconds.


Figure 3.9 (b): Simulated scenario at time 213 seconds.

By 213 seconds, target 1 had been destroyed and UAV 1 had started tracking target 3, and the track information became accurate enough for UAV 5 to attack target 3. UAV 2 switched back to target 2 because tracking support was no longer needed for target 3.

Figure 3.9 (c): Simulated scenario at time 279 seconds.



Figure 3.10: Position estimation error and standard deviation (SD) of target 1.

After 279 seconds, UAVs 2 and 3 were tracking target 2 cooperatively while UAV 7 was attacking.

The information filter results for target 1 are presented in Figure 3.10. It shows the position estimation error and standard deviation bounds for the x and y axes. As can be seen from the lower plot, the y axis estimation results deteriorated as the line-of-sight of UAV 1 to target 1 approached horizontal, due to the fact that cross-range velocity is difficult to detect using a Doppler based radar. This is unavoidable when a single vehicle tracks the target, because of the circular motion of the vehicle around the target. Both axis estimation results improved rapidly after 160 measurements, when UAV 4 joined UAV 1 in tracking the same target, especially the y axis estimation. This result shows that the proposed method can coordinate the trajectories of the tracking UAVs so as to increase the accuracy of the estimation.

The Monte Carlo method with a sample size of 100 is used to evaluate the proposed control framework against the centralised control strategy proposed in [28] in two scenarios. Each scenario includes seven cases with the number of UAVs progressively increasing from 6 to 12. In scenario one, 3 targets and the whole team of UAVs were randomly distributed over the mission area at the beginning of the simulation. In scenario two, the UAV swarm enters the mission area from the left side in a tight formation and the number of targets is increased to 4. At the end of each simulation the time for eliminating all targets was recorded. Computational intensity is one of the main issues with the centralised algorithm. It was found that problems of 12 vehicles (including both UAVs and targets) are solvable in less than 30 seconds using the CPLEX solver on a desktop computer. As the size of the problem increased to 20 vehicles, the time spent finding solutions became 90 seconds, and it would take 20 minutes to optimise problems of 30 vehicles. Thus a maximum of 12 UAVs and 4 targets is considered in our study for the comparison of centralised and decentralised algorithms.

The averages of the final task completion time for each simulated scenario are summarised in Figure 3.11. The error bars show the 95 percent confidence interval for the mean value based on the sample data. It can be seen from Figure 3.11(a) that the curve representing the behaviour-based UAV swarm descends faster than that of the centrally controlled swarm, i.e. the advantage of the centralised controller becomes less significant as the number of UAVs increases. For a small group of agents, the swarm is less efficient than the centralised controller, especially for mission scenarios with strong timing and coupling constraints. Such results are expected, as the centralised controller optimises the common objective shared by the whole team and is able to resolve conflicts if the tasks are coupled, while the decentralised controller can only operate under local, environment-driven rules in the absence of global coordination information. As the number of UAVs increases, the UAVs only need to engage the targets that are near to them. This means that the tasks the UAVs are to perform become less coupled than in the situation where one UAV has to perform multiple tasks on multiple targets.

The simulation results of scenario 2 are given in Figure 3.11(b). As can be seen, the gap between the two curves does not decrease as the number of UAVs increases. This is mainly because in scenario 2 the whole team of UAVs starts at one side of the mission area in close formation. Such a setup causes the UAVs to lose the advantage of being spread over the mission area. Therefore, the task assignment problem is more tightly coupled than in scenario 1, which makes it impossible for the decentralised controller to achieve the performance of the centralised controller, provided the centralised problem can be solved in real time. This drawback of the swarm system leads to the investigation of integrating a predictive model into the individual agents to enhance the mission performance.

Figure 3.11(a): Monte Carlo results of scenario 1. Final task completion time for all targets (in seconds) versus number of UAVs (6 to 12), for the swarm and centralised controllers.

Figure 3.11(b): Monte Carlo results of scenario 2. Final task completion time for all targets (in seconds) versus number of UAVs (6 to 12), for the swarm and centralised controllers.

3.5.3. Predictive Swarm

In Section 3.4, the cooperative moving target engagement problem was solved by using a decentralised swarm system consisting of simple agents, and the rules employed by these agents were presented. The key principle of the swarm system is the simplicity of the agents. As discussed above, the behaviour of the UAVs is governed by rule sets which are executed on the basis of local information, without knowledge being required about the state of the entire group. Also, an individual agent only manages its actions to maximise its individual utility, as under the rules of cooperative tracking. In the swarm system, an individual agent neither controls nor is aware of the tasks to be carried out by the other agents. Even though this design enables agents to perform separate subtasks cooperatively, the achievable performance is low. This is mainly because in the early stage of the mission the UAVs fly toward the nearest target without knowing that other UAVs may arrive earlier and destroy the target first. This causes an excessive number of UAVs to waste time on a target which could be better engaged by other members of the team.

To address this issue, a predictive decentralised controller was considered. This type of controller was explored in [106] [107]. In that research, the agents run filters on the states of teammates and estimate their cost to execute tasks. The decision estimation methodology they designed lets agents operate in a decentralised manner, but it works as a redundant centralised controller with a decentralised state estimator rather than treating the group as an agent-based system. In this study, we preserve the behaviour-based system designed in Section 3.4 and add a predictive model which lets the UAVs predict the intentions of their neighbours in order to improve team performance. This predictive model uses the mixed integer linear programming formulation and works as a preliminary assignment algorithm.

When a UAV is in cruise mode, it senses its neighbours' states, such as position, heading and velocity, and uses them as the input to compute the assignment. This process is demonstrated in Figure 3.12.


Figure 3.12: Predictive decentralised controller

As can be seen from Figure 3.12, the green UAV sensed the states of four UAVs around it. The predictive model then returned the result that the three UAVs above it (in the large circle) were in a better position to engage the target (red triangle) in the top right corner. Thus the green UAV would discard the target nearest to it and fly toward the target at the bottom right. If the UAVs did not have the predictive model, they would all head toward the nearest target, which would cause the target at the top to be over-served and delay the time to finish the second target.

The prediction model runs at the beginning of the mission to generate an initial assignment. After that, it is triggered by three events: the UAV changes back to cruise mode from any of the other three modes; it detects new neighbours; or the neighbours' movements deviate from the previous prediction. By setting up these rules, the computational load is expected to reduce and the UAVs to adapt better to uncertainty. Even though this prediction model uses the same formulation as the centralised approach in previous research, the two operate differently in the following ways. First, the prediction model is only concerned with the UAV and its near neighbours, while the centralised approach runs for the whole group. This makes the predictive decentralised controller scale as well as the decentralised controller. Second, the assignment resulting from the prediction model is only a temporary assignment.

Since the UAVs are in the cruise mode after receiving assignments from the prediction model, they will still respond to any request to turn into trackers. Although the cooperation of the swarm is still reliant on the interactions that take place among the group, the prediction model gives the agents a certain level of situation assessment ability. Thus the agents will accept an assignment that does not maximise their individual utility if it maximises the overall team utility. For instance, the agents will choose targets which may not be the nearest, but whose engagement would potentially reduce the overall time to finish the mission. The following example demonstrates the increased performance of the predictive swarm compared with the simple swarm presented earlier.
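The predictive target choice can be caricatured in a few lines: a UAV skips its nearest target when enough sensed neighbours are better placed to fill the required team slots on it. The team size per target and the geometry below are invented for illustration:

```python
import math

def choose_target(me, neighbours, targets, team_size=3):
    """Predictive target choice sketch: discard the nearest target if
    team_size sensed neighbours are already better placed to serve it,
    and move to the next target instead. All parameters are illustrative."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    # consider targets in order of my own distance to them
    for tgt in sorted(targets, key=lambda t: dist(me, t)):
        closer = sum(1 for n in neighbours if dist(n, tgt) < dist(me, tgt))
        if closer < team_size:   # a slot on this target is still worth taking
            return tgt
    return min(targets, key=lambda t: dist(me, t))  # fall back to nearest

me = (0.0, 0.0)
neighbours = [(1.0, 9.0), (2.0, 9.0), (0.0, 8.0)]   # all nearer to target A
targets = [(0.0, 10.0), (20.0, 0.0)]                # A above, B to the right
pick = choose_target(me, neighbours, targets)
```

Here the UAV at the origin yields target A to the three better-placed neighbours and heads for target B instead, which is the splitting behaviour visible in Figure 3.13(b).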

Figure 3.13(a): Simple responsive swarm at 90 seconds


Figure 3.13 (b): Predictive swarm at 90 seconds

As shown in Figure 3.13 (a, b), the UAVs with the simple swarm controller were all flying toward target 1, because the rule set tells them to engage the nearest target if no request is received. In contrast, the UAVs with the predictive swarm controller split into two groups of three: one group headed toward target 1 and the other chose target 2.

Figure 3.13 (c): Simple responsive swarm at 250 seconds


Figure 3.13 (d): Predictive swarm at 250 seconds

As can be seen from Figure 3.13 (c), not until target 1 was destroyed did the UAVs begin to move toward target 2. On the other hand, as shown in Figure 3.13 (d), the three UAVs that chose target 2 at the beginning of the mission were already halfway to target 2.

Figure 3.13 (e): Simple responsive swarm at 590 seconds


Figure 3.13 (f): Predictive swarm at 500 seconds

After target 1 was destroyed, the UAVs with the simple swarm controller all chose target 2. As for the UAVs with the predictive swarm controller, the three UAVs previously engaged with target 1 decided to go to target 3 after they attacked target 1. As can be seen from Figure 3.13 (e, f), it took one and a half minutes longer for the UAVs with the simple swarm controller to reach target 2.

Figure 3.13 (g): Simple responsive swarm at 1115 seconds


Figure 3.13 (h): Predictive swarm at 850 seconds

Finally, target 3 was destroyed. The UAVs with the predictive swarm controller finished the mission almost five minutes earlier. The simulation study of scenario 2 was then repeated, this time including the predictive decentralised controller. As can be seen from Figure 3.14, the predictive decentralised controller outperformed the simple responsive swarm in all cases.

[Figure: final task completion time for all targets (seconds, 2100–3300) against number of UAVs (6–12), comparing the swarm, predictive swarm, and centralised controllers.]

Figure 3.14: Repeated scenario 2 with predictive decentralised controller included.
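The target-selection difference between the two controllers can be sketched as follows. This is a minimal illustration, not the thesis implementation: the functions `nearest_target` and `predictive_choice`, the coordinates, and the capacity of three engagers per target are all illustrative assumptions.

```python
import math

def nearest_target(uav, targets):
    """Simple responsive rule: engage the nearest target."""
    return min(targets, key=lambda t: math.dist(uav, targets[t]))

def predictive_choice(uav, team, targets, capacity):
    """Predictive rule (sketch): each UAV ranks the whole team by distance
    to each target, as a proxy for predicting its neighbours' intentions,
    and claims a target only if it is among the `capacity` closest
    candidates; otherwise it yields that target to the others."""
    for t in sorted(targets, key=lambda t: math.dist(uav, targets[t])):
        closest = sorted(team, key=lambda u: math.dist(u, targets[t]))[:capacity]
        if uav in closest:
            return t
    return nearest_target(uav, targets)   # fall back if every slot is taken

targets = {1: (-50.0, 0.0), 2: (100.0, 0.0)}
team = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
# The simple rule sends every UAV to target 1 (the nearest for all of them);
# the predictive rule splits the team three and three, as in Figure 3.13(b).
simple = [nearest_target(u, targets) for u in team]
predictive = [predictive_choice(u, team, targets, capacity=3) for u in team]
```

The split emerges without any negotiation: each UAV runs the same deterministic ranking, so the three UAVs worst-placed for target 1 yield it and take target 2.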

3.6. Summary

This chapter investigated the use of decentralised controllers for cooperative moving target engagement missions. Unlike a centralised task assignment algorithm, the cooperation of the agents is entirely implicit. The behaviour of the UAVs is governed by rule sets, which ultimately leads to highly cooperative group performance without the need for a centralised control authority. Information-theoretic measures are adopted to estimate the value of possible future actions and make the UAVs fly trajectories that locally maximise the information content of the next measurement. Two types of decentralised controllers were proposed: the responsive swarm controller and the predictive swarm controller.


The simulation study quantitatively evaluates the performance of the decentralised controller against the centralised controller. The results show that the performance of the decentralised controller depends on the complexity of the coupled task constraints. If an increased number of UAVs makes the tasks less coupled, the decentralised controller becomes competitive with the centralised controller on large problems. If the complexity of the coupled task constraints does not reduce as the size of the swarm increases, a predictive model can be integrated into the decentralised controller to let the agents estimate the intentions of their neighbours and choose activities that enhance the overall team utility accordingly. The resulting performance improvement of the predictive swarm controller was shown in the simulation results. However, the benefit of redundancy and robustness offered by a swarm system has not been demonstrated in the current design. In future work, the adaptive ability of these two types of controllers under uncertainties such as communication constraints, manoeuvring targets and threats will be investigated.

Chapter 4

Behaviour-Based Control Law for Spacecraft Swarm Operations

This chapter presents a behaviour-based control methodology to reposition a swarm of spacecraft in order to balance the fuel consumption among the agents. The problem of interest is to maintain a swarm configuration which contains a large number (hundreds or even thousands) of femtosatellites (100-gram-class spacecraft). Under the influence of the J2 perturbation, agents with larger orbital element differences with respect to the reference orbit consume more energy in maintaining their position within the configuration. This means the fuel consumption of individual spacecraft will not be uniform across the swarm, as it depends on the initial conditions and the location of each spacecraft within the formation. As the fuel capacity is limited, some spacecraft within the swarm will deplete their fuel reserves earlier than others. This will cause premature degradation of the mission if the unbalanced fuel consumption is not managed. The objective of this study is to investigate the feasibility of using behaviour-based control laws based on agent-level interaction to achieve the desired coordination, balancing the fuel consumption while maintaining the desired formation.

The concept of self-organising (SO) systems has been applied to formation-flying spacecraft [85, 108], in which the spacecraft, modelled as a swarm of agents, follow biological rules of 'avoidance' of each other and of threats, 'gathering' to maintain the formation, and 'attraction' towards target locations according to pre-defined artificial potential functions. In this study, the scenario is more complicated than converging to target locations. The agents are therefore required to switch between multiple modes, and their behaviours are governed by a different rule set under each mode. The proposed control law guides the spacecraft in the high-fuel-consumption positions to switch locations with those in the low-fuel-consumption orbits. This approach does not require any centralised command and control; the coordination is achieved through agent-level interactions.

The remainder of this study is structured as follows. Section 4.1 describes the numerical method that was used to compute the initial conditions for each spacecraft within the swarm, and the unbalanced fuel consumption among the spacecraft. In section 4.2, the behaviour-based control framework is presented. The results of the simulation are given in section 4.3, while section 4.4 summarises the study.

4.1. Problem Statement

In this study, the initial configuration of the swarm of spacecraft is considered to follow a random distribution centred at the reference orbit. Each individual spacecraft on the periodic relative orbits (PROs) will be rotating with respect to the centre of the relative frame.


Different types of PROs of the spacecraft within the formation have been investigated [109]. The shape of the projection of the PROs perpendicular to the radial direction is of interest for ground-observation missions. A projected circular PRO is used in this study: its primary advantage is that the spacecraft are separated by a fixed distance when the formation is projected onto the along-track/cross-track plane, since the ellipse of relative motion projects onto that plane as a circle. Figure 4.1 shows the configuration of a swarm comprising 200 spacecraft. By definition, the relative displacements are measured in the radial, in-track, and cross-track directions respectively in the local vertical/local horizontal (LVLH) frame.

Figure 4.1(a): The initial position of the spacecraft swarm.

Figure 4.1(b): The initial position of the spacecraft swarm on the along-track/cross-track plane.
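The projected-circle property can be sketched with the closed-form Clohessy-Wiltshire (Hill) solution for a circular reference orbit. The function names, the orbit period, and the phase value below are illustrative assumptions, not the thesis code; the J2 perturbation is deliberately omitted here.

```python
import math

def cw_state(t, n, s0):
    """Closed-form Clohessy-Wiltshire (Hill) solution for a circular
    reference orbit; s0 = (x, y, z, vx, vy, vz) in the LVLH frame with
    x radial, y along-track, z cross-track, and n the mean motion."""
    x0, y0, z0, vx0, vy0, vz0 = s0
    s, c = math.sin(n * t), math.cos(n * t)
    x = (4 - 3 * c) * x0 + (s / n) * vx0 + 2 * (1 - c) / n * vy0
    y = (6 * (s - n * t) * x0 + y0 + 2 * (c - 1) / n * vx0
         + (4 * s - 3 * n * t) / n * vy0)
    z = z0 * c + (vz0 / n) * s
    return x, y, z

def pco_initial_state(rho, alpha, n):
    """Initial conditions for a projected circular orbit of radius rho:
    the condition vy0 = -2*n*x0 removes the secular along-track drift, and
    the along-track/cross-track projection becomes a circle of radius rho."""
    x0 = 0.5 * rho * math.sin(alpha)
    return (x0,
            rho * math.cos(alpha),          # y0
            rho * math.sin(alpha),          # z0
            0.5 * rho * n * math.cos(alpha),  # vx0
            -2.0 * n * x0,                  # vy0 (no-drift condition)
            rho * n * math.cos(alpha))      # vz0

n = 2 * math.pi / 5964.0          # mean motion for a ~5964 s orbital period
s0 = pco_initial_state(100.0, 0.3, n)
# Sample the projected separation over two orbits: it stays at rho = 100 m.
radii = [math.hypot(*cw_state(t, n, s0)[1:3]) for t in range(0, 12000, 600)]
```

In the unperturbed model the projected radius is exactly constant; it is the J2 perturbation, introduced next, that breaks this invariance and motivates the GA-refined initial conditions.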


4.1.1. Initial Conditions

The secular drift among the spacecraft in the formation caused by the Earth oblateness (J2) effects has been studied [110]. The initial conditions must be carefully selected to generate relative orbits that are not only bounded but also remain close to the desired trajectory. In this study, a numerical approach, the genetic algorithm (GA), is used to determine the initial conditions. Firstly, approximate initial conditions that satisfy the relative motions are obtained from Hill's equations [111]. Then, these initial conditions are used to generate an initial population for the GA. The fitness function used to rank the individuals in the GA is given by:

(4.1) .

The first term is the drift in the along-track direction (see Figure 4.2), measured as the drift after 5 periods. In the presence of the J2 perturbation, the period-matching conditions obtained from Hill's equations are not valid, due to the differences in the precession rates of the spacecraft [112]. The secular drift in the along-track direction therefore has to be accounted for, to minimise the propellant used to correct the orbits. The second term of the fitness function represents the deviation of the PRO from the desired projected circular formation. It is the sum of the Euclidean distances between the control points on the desired path and the corresponding points on the PRO. The nonlinear dynamics model of the relative motion is used to propagate the dynamics [113]. Individuals with lower fitness scores therefore represent better solutions. Following the initial phase, the main cycle of the GA refines the solutions iteratively by applying the natural selection process.
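The shape of the fitness in equation (4.1) can be sketched as follows. The `propagate` callable stands in for any dynamics model mapping (candidate, time) to a position; the toy drift-only dynamics and all names are illustrative assumptions, not the thesis implementation.

```python
import math

def fitness(candidate, propagate, control_points, period, n_periods=5):
    """Sketch of equation (4.1): the along-track drift after n_periods,
    plus the summed Euclidean distance between control points on the
    desired path and the corresponding propagated points."""
    drift = abs(propagate(candidate, n_periods * period)[1])  # along-track
    deviation = sum(math.dist(propagate(candidate, t), point)
                    for t, point in control_points)
    return drift + deviation

# Toy dynamics: pure along-track drift at a rate set by the candidate gene.
toy = lambda cand, t: (0.0, cand[0] * t, 0.0)
points = [(1.0, (0.0, 0.0, 0.0)), (2.0, (0.0, 0.0, 0.0))]
drifting = fitness((1.0,), toy, points, period=1.0)  # poor candidate
bounded = fitness((0.0,), toy, points, period=1.0)   # ideal candidate
```

The GA simply ranks the population by this scalar: candidates whose propagated orbits drift less and hug the desired path score lower and survive selection.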


Figure 4.2: PRO with drift.

The solution obtained from the GA is presented in the following example. The reference orbit with the following initial mean orbital elements is considered:

.

The resulting relative orbit with the stated initial relative position is shown in Figure 4.3. As can be seen from Figure 4.3(a & b), the initial conditions obtained from Hill's equations result in drift in both the along-track and cross-track directions over 50 orbits. Figure 4.3(c & d) shows the relative orbit resulting from initial conditions generated by the GA after 100 generations. It is clear that the relative motion in the along-track direction is bounded. The growth of the cross-track oscillation is reduced but cannot be eliminated; without control inputs to maintain the spacecraft within the tolerance of the desired states, it will distort the swarm configuration in the long term.


Figure 4.3(a): Relative orbits obtained using Hill's equations (50 orbits). Figure 4.3(b): Relative orbits obtained using Hill's equations (along-track/cross-track plane).

Figure 4.3(c): Relative orbits obtained using GA (50 orbits). Figure 4.3(d): Relative orbits obtained using GA (along-track/cross-track plane).

4.1.2 Unbalanced Fuel Consumption


To prevent the distortion of the swarm configuration, control effort is required to counter the effect of the perturbation. The linearised dynamics model for J2-perturbed relative motion with a mean circular reference orbit is used in this study:

. (4.2)

The entries of the transition matrix can be found in [114]. Equation (4.2) can be compactly represented as a linear time varying (LTV) state-space equation:

ẋ(t) = A(t) x(t) + B(t) u(t). (4.3)

The discrete equivalent state-space equation with a sampling period is given by:

x_{k+1} = A_k x_k + B_k u_k, (4.4)

where

, and

.

With the initial state and impulsive control inputs, the final states can be written recursively as:


, (4.5)

where

.

Based on the linearised system model, the formation-keeping problem can be formulated as a linear programming (LP) problem to minimise the overall delta-V, which is assumed to be a direct function of the fuel required to keep the spacecraft within the position tolerance [115]. In the simulation, the LP planning horizon is set to one orbit. The relative dynamics are discretised with a 5.964-second time step, so that one orbit contains 1000 time steps. Figure 4.4 shows the fuel cost per orbit for each initial position within the swarm configuration. As can be seen, the fuel consumption is not uniform across the swarm; it is related to each individual's orbital element differences with respect to the reference orbit. If a coordination strategy is not applied to balance the fuel consumption, the mission lifetime will suffer, as the spacecraft on the outer perimeter of the configuration will run out of fuel before those nearer the centre.

Figure 4.4: Fuel cost map.
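The discrete propagation that the LP stacks into its constraints, equations (4.4)-(4.5), can be checked in miniature. The 2-state double-integrator matrices below are illustrative only; the thesis uses the J2-perturbed transition matrices from [114]. The sketch verifies that step-by-step recursion and the batch form of (4.5) agree.

```python
def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

dt = 1.0
A = [[1.0, dt], [0.0, 1.0]]   # toy position/velocity propagation
B = [[0.0], [1.0]]            # impulsive velocity change enters the rate

def propagate(x0, inputs):
    """Recursive propagation of x_{k+1} = A x_k + B u_k, as in (4.4)."""
    x = list(x0)
    for u in inputs:
        x = [xi + B[i][0] * u for i, xi in enumerate(mat_vec(A, x))]
    return x

def batch_final_state(x0, inputs):
    """Batch form of (4.5): x_N = A^N x0 + sum_k A^{N-1-k} B u_k."""
    N = len(inputs)
    powers = [[[1.0, 0.0], [0.0, 1.0]]]
    for _ in range(N):
        powers.append(mat_mul(A, powers[-1]))
    x = mat_vec(powers[N], x0)
    for k, u in enumerate(inputs):
        contrib = mat_vec(powers[N - 1 - k], [B[0][0] * u, B[1][0] * u])
        x = [a + b for a, b in zip(x, contrib)]
    return x

x0 = [10.0, -1.0]
u = [0.3, -0.2, 0.1]
final_recursive = propagate(x0, u)
final_batch = batch_final_state(x0, u)
```

The LP then treats the stacked batch form as equality constraints and minimises the sum of the impulse magnitudes, which is why the batch/recursive equivalence matters.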

4.2. Behaviour-Based Control Law


In this study we investigated the possibility of using the limited local sensing capabilities of each individual spacecraft to coordinate the swarm and achieve a common objective. In particular, it is assumed that each spacecraft can sense the presence of its neighbours and measure the inter-spacecraft distances. The objective is to extend the mission lifetime by allowing the spacecraft in the high-fuel-consumption positions to switch with those in the low-fuel-consumption positions. A behaviour-based control law was developed to achieve this: each agent carries out reactive behaviours driven by the local environment. At the beginning of each planning cycle (set to one orbit in the simulation), an individual spacecraft selects one of four behaviour modes to determine its next move. These behaviours are position-hold, descend, ascend, and formation-keep.

A. Position-Hold

Position-hold is the default behaviour mode of the agents at the beginning of the mission. In this mode, the spacecraft maintain their relative positions within the swarm. Disturbances will generate perturbations that cause the spacecraft to drift from the designed PRO. The control scheme discussed in section 4.1.2 is therefore applied to move the satellite from the disturbed state back to the desired state and keep the spacecraft within the tolerance of that state. If an agent switches back to position-hold mode from any other mode, it needs to recalculate the initial relative velocity at its current location that satisfies the partial J2-invariance conditions, and then achieve that initial state.

B. Descend

As analysed in section 4.1, the spacecraft on the outer perimeter of the swarm consume more propellant to maintain the desired PRO. To prevent them from running out of propellant much earlier than the rest, they change to the descend mode and move to a position where the fuel consumption is lower. A switch to the descend mode is triggered when the amount of remaining propellant falls below a pre-set value.


As shown in Figure 4.5, a spacecraft in the descend mode (represented by the black circle) first constructs a fuel consumption map of its surrounding area. The size and resolution of the map are determined by the local sensory information and the on-board computational resources. The spacecraft then moves to the cell with the lowest fuel consumption rate that does not violate the safety distance to its neighbours (represented by white circles). If the agent cannot find any cell on the map with a lower value than its current cell, it changes to the position-hold mode. Furthermore, the descend mode is temporarily disabled if the agent senses that the surrounding space has become too crowded. An empirical threshold is used in the simulation: for a swarm of 200 spacecraft, a spacecraft stops moving if it senses more than 4 neighbours within a 50-metre radius. If this situation lasts for more than 10 orbits, the agent changes to the formation-keep mode automatically. The purpose of this logic is, firstly, to leave enough space for the agents in the ascend mode to move outward and, secondly, to prevent the agents from crowding into the minima of the fuel consumption map. The trajectory shown in the diagram simply indicates the start and final positions of one move; the more detailed path planning algorithm used in this research can be found in Appendix C.

Figure 4.5: Movement in the descend mode.
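The descend-mode rule above can be sketched as a single decision step. The map layout, distances, and thresholds below are illustrative stand-ins for the empirical values quoted in the text, not the thesis code.

```python
import math

def descend_step(pos, fuel_map, neighbours,
                 safe_dist=10.0, max_close=4, sense_radius=50.0):
    """Sketch of the descend rule: move to the lowest-cost feasible cell,
    hold position if no cheaper cell exists, and stand down when the
    surrounding space is too crowded."""
    close = [n for n in neighbours if math.dist(pos, n) <= sense_radius]
    if len(close) > max_close:
        return pos, 'crowded'            # descend temporarily disabled
    feasible = {c: v for c, v in fuel_map.items()
                if all(math.dist(c, n) >= safe_dist for n in neighbours)}
    if not feasible:
        return pos, 'hold'
    best = min(feasible, key=feasible.get)
    if feasible[best] >= fuel_map[pos]:
        return pos, 'hold'               # no cheaper cell: revert to position-hold
    return best, 'move'

pos = (40.0, 0.0)
fuel_map = {(40.0, 0.0): 2.8, (30.0, 0.0): 2.1, (20.0, 0.0): 1.5}
target, action = descend_step(pos, fuel_map, neighbours=[(10.0, 0.0)])
```

With one distant neighbour the agent moves to the cheapest cell; packing five neighbours within sensing range instead would trip the crowding guard and suspend the descent.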


C. Ascend

As the agents with high fuel consumption rates relocate, the empty spaces they leave must be filled by other agents in order to maintain the original swarm configuration. The ascend mode, an inverse of the descend mode, is therefore necessary. This behaviour mode is triggered when an agent detects an increased number of neighbours. In the simulation, a sensor range of 50 metres is assumed, and the ascend behaviour is activated when an agent detects that its number of neighbours has increased by two. It is worth noting that only agents with enough remaining fuel are able to change to the ascend mode.

In ascend mode, the control law is similar to the one used in the descend mode, except that the agents seek the cell with a high fuel consumption value. The movement stops when the agents enter less-occupied space. After 10 orbits without motion, the agents change to the formation-keep mode.
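The triggers that move an agent between the four modes can be gathered into one switching function. Note that the mapping of the quoted thresholds (0.5 m/s, 70 percent of initial delta-V, 10 orbits) to specific triggers is my reading of the text, not an explicit specification, and the dictionary layout is an illustrative assumption.

```python
def next_mode(state):
    """Sketch of the mode-switching logic of section 4.2.  `state` holds:
    mode, fuel_frac (remaining delta-V as a fraction of the initial budget),
    d_neighbours (change in sensed neighbour count), idle_orbits (orbits
    without motion), and fk_spent (delta-V used in formation-keep, m/s)."""
    m = state['mode']
    if m == 'position-hold':
        if state['fuel_frac'] < 0.7:
            return 'descend'             # running low: seek a cheaper cell
        if state['d_neighbours'] >= 2:
            return 'ascend'              # fill vacated outer positions
    elif m in ('descend', 'ascend'):
        if state['idle_orbits'] >= 10:
            return 'formation-keep'      # settled: restore the spacing
    elif m == 'formation-keep':
        if state['fk_spent'] >= 0.5:
            return 'position-hold'       # spacing budget exhausted
    return m
```

Each agent evaluates this once per planning cycle (one orbit), so the system-level reshuffling in section 4.3 emerges purely from these local transitions.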

D. Formation-keep

In the formation-keep mode, the agents adjust the inter-spacecraft distances after the reconfiguration. The movement of the agents in this mode is coordinated according to artificial potential functions (APF) [17, 79, 116]. With this method, the movements of all the spacecraft follow the local gradient of the potential field and converge to the desired states. The swarm configuration under consideration is assumed to lie on the projected circular plane (the global objective), while at the same time the agents maintain a separation distance from each other (the local objective). The potential function of each agent is:

(4.6)

where

(4.7)

the terms depend on the position of the agent, its set of neighbours, the radius of the swarm configuration, and a set of scaling factors. The first term in the potential function represents the repulsion potential between agents, and the second term represents the global potential which forces the agents to stay within the desired region. It should be noted that the projection of the movement of the individual agents onto the along-track/cross-track plane follows a circular trajectory at the same angular rate for every agent. As a result, the projection of the swarm configuration appears as a disk rotating about the radial axis. To simplify the calculation, the state vector of the agents in the potential function is defined in a new frame fixed in the rotating disk.

Typically, the control input in the APF approach is calculated by defining a virtual force that drives the system along the negative gradient of the potential function, which results in a continuous control history [117]. This approach is not suited to cases involving discrete control, such as impulsive thrust. The use of a discrete control scheme for the path planning problem of an on-orbit servicing vehicle has been considered [118], and this study adopts a similar approach. In this case, the time derivative of the potential of each agent is given by

(4.8)

where the rate of change of the potential depends on the velocity of the agent. The states of the agents propagate freely until the thrust actuator fires, which results in step changes in the agents' velocities. The control action is triggered if the time derivative of the potential is greater than zero.


This condition defines switching times for thrust actuators:

(4.9)

The impulsive velocity change made to the agent is computed by

(4.10)

where the vehicle velocity immediately prior to the application of the impulse appears, and a positive definite function is used to limit the velocity of the manoeuvring spacecraft. Define the Lyapunov function for the swarm system as

(4.11)

where M is the number of agents in the formation-keep mode. Using the control defined by equation (4.10), the derivative of the Lyapunov function along the trajectories of the system is given by

(4.12)

for all time, implying a decrease in the Lyapunov function along the trajectories of every agent i = 1, ..., M. Therefore, the states of the agents converge to the desired terminal states with the formation constraints enforced.
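The impulsive APF scheme of equations (4.6)-(4.10) can be sketched end to end. The functional forms of the repulsion and containment terms, the gains, and all names below are illustrative assumptions rather than the thesis potentials; the sketch only preserves the structure: coast while the potential decreases, fire an impulse down the gradient when it would increase.

```python
import math

def potential(p, neighbours, R=100.0, ka=1.0, kr=500.0):
    """Illustrative stand-in for (4.6)-(4.7): inter-agent repulsion plus a
    global term pulling agents inside a disc of radius R."""
    rep = sum(kr / max(math.dist(p, q), 1e-6) for q in neighbours)
    r = math.hypot(*p)
    return rep + ka * max(r - R, 0.0) ** 2

def grad(p, neighbours, h=1e-4):
    """Central-difference gradient of the potential."""
    g = []
    for i in range(len(p)):
        dp, dm = list(p), list(p)
        dp[i] += h
        dm[i] -= h
        g.append((potential(tuple(dp), neighbours)
                  - potential(tuple(dm), neighbours)) / (2 * h))
    return g

def impulse(p, v, neighbours, vmax=0.5):
    """Sketch of (4.8)-(4.10): fire only when dJ/dt = grad(J).v > 0, and
    reset the velocity to point down the gradient with bounded speed, so
    the Lyapunov function of (4.11) is non-increasing."""
    g = grad(p, neighbours)
    if sum(gi * vi for gi, vi in zip(g, v)) <= 0.0:
        return tuple(0.0 for _ in v)     # coasting already decreases J
    norm = math.hypot(*g)
    v_des = tuple(-vmax * gi / norm for gi in g)
    return tuple(vd - vi for vd, vi in zip(v_des, v))

# Agent outside the disc moving outward: the trigger fires a braking impulse.
dv = impulse((120.0, 0.0), (0.1, 0.0), [(110.0, 5.0)])
# Agent drifting away from its only neighbour: dJ/dt <= 0, so no impulse.
coast = impulse((0.0, 0.0), (-0.1, 0.0), [(50.0, 0.0)])
```

The `vmax` bound plays the role of the positive definite limiting function in (4.10), keeping each correction impulse small.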

Since the mission scenario does not require an exact location for each agent, only an even distribution on the circular plane, reaching the equilibrium point defined by the potential function is not necessary. Therefore, in this study the maximum amount of propellant that agents in the formation-keep mode can use is capped at a pre-set value. The agents change back to the position-hold mode when this propellant limit is reached.

4.3. Simulation Results

The simulation results of applying the behaviour-based control law to a spacecraft swarm are presented in this section. The parameters used to trigger the corresponding behaviours are chosen as 2 m/s, 0.5 m/s, and 70 percent of the initial delta-V. Without loss of generality, two scenarios with different swarm sizes are simulated.

Scenario 1: Spacecraft swarm with 200 agents

Figure 4.6 shows the movements of the spacecraft swarm with 200 agents at four stages. The agents whose fuel consumption exceeds 2.5 m/s per orbit in the initial configuration are shown in red. Delta-V is produced by the use of propellant to generate thrust that accelerates the spacecraft, so for simplicity the fuel consumption is measured as a change in velocity. As can be seen, the movements of the agents started around 600 orbits after the initiation of the mission. The red agents gradually moved toward the locations with low fuel consumption and pushed the blue agents outward to fill the empty space. The whole process finished after 1000 orbits, with all the red agents occupying the low-fuel-consumption positions. Moreover, the final configuration of the swarm retained all of the mission-essential features, such as the size of the formation and the agent spacing.


Figure 4.6(a): Initial swarm.

Figure 4.6(b): Swarm after 600 orbits


Figure 4.6(c): Swarm after 800 orbits

Figure 4.6(d): Swarm after 1100 orbits

The objective of this study is to balance the fuel consumption among the agents and thus extend the mission lifetime. Figure 4.7 summarises the increase in mission lifetime obtained with the proposed control law. The useful life of the swarm is regarded to have ended when 10 percent of the swarm has run out of fuel, as we assume this would degrade mission performance to non-useful levels. The mission lifetime increases by 14.2 percent when the initial delta-V is 5 m/s, and increases further with higher initial delta-V, as the delta-V spent on the manoeuvres becomes a smaller portion of the total. The increase in mission lifetime reaches 33.4 percent when the initial delta-V is set to 10 m/s.

[Figure: mission lifetime (number of orbits, 0-4500) against initial delta-V (5-10 m/s), comparing the static swarm and the dynamic swarm.]

Figure 4.7: Mission lifetime against initial delta-V (Scenario 1).
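The lifetime metric behind Figure 4.7 can be sketched directly: sort the agents by how many orbits their budget lasts and read off the orbit at which the 10-percent failure fraction is reached. The budgets and per-orbit rates below are illustrative numbers, not the thesis values.

```python
import math

def mission_lifetime(budgets, rates, fail_fraction=0.1):
    """Sketch of the lifetime metric: the mission is considered over once
    `fail_fraction` of the swarm has exhausted its delta-V budget.
    `budgets` are per-agent initial delta-V values (m/s) and `rates` the
    per-orbit consumption (m/s per orbit)."""
    orbits_left = sorted(b / r for b, r in zip(budgets, rates))
    k = max(int(len(budgets) * fail_fraction) - 1, 0)
    return math.floor(orbits_left[k])

budgets = [100.0] * 10
static = mission_lifetime(budgets, [0.25] * 5 + [0.125] * 5)  # unbalanced rates
balanced = mission_lifetime(budgets, [0.1875] * 10)           # same total rate
```

With the same total consumption, equalising the per-agent rates delays the first failures and extends the lifetime, which is the effect the behaviour-based reshuffling aims for.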

Scenario 2: Spacecraft swarm with 300 agents

Figure 4.8 shows the movements of the spacecraft swarm with 300 agents at four stages. The mission lifetime is plotted in Figure 4.9. The results are very similar to those in Scenario 1.


Figure 4.8(a): Initial swarm.

Figure 4.8(b): Swarm after 600 orbits


Figure 4.8(c): Swarm after 800 orbits

Figure 4.8(d): Swarm after 1200 orbits


[Figure: mission lifetime (number of orbits, 0-4500) against initial delta-V (5-10 m/s), comparing the static swarm and the dynamic swarm.]

Figure 4.9: Mission lifetime against initial delta-V (Scenario 2).

4.4. Summary

This study investigated the feasibility of applying behaviour-based control laws to coordinate a spacecraft swarm. Four basic behaviours were developed: position-hold, descend, ascend, and formation-keep. The control actions applied by each swarm agent are governed by the rule sets under each behaviour mode. The simulation results show that coordination at the system level emerges as each agent enacts simple reactive rules. The objective, balancing the fuel consumption while maintaining the swarm configuration, is achieved using the proposed method.


Chapter 5

Conclusions and Future Work

In this thesis the feasibility of applying decentralised control techniques to swarms of unmanned systems was investigated. The main contribution of this work is the application of a quantitative methodology to evaluate the optimality of decentralised control techniques under complex mission scenarios. The proposed decentralised controller is based on a behaviour-based state machine, in which each agent operates in one of multiple behaviour states and switches between states when triggering conditions are met. Each behaviour uses an independent rule set to achieve independent objectives. The coordination among agents uses a combination of direct interactions (messages that convey information from one agent to another, or from one agent to a number of agents) and indirect interactions (stigmergy, communicating 'through the environment').

It is generally assumed that a decentralised controller has difficulty enforcing complex constraints due to the lack of central coordination information. However, there is a poor systematic understanding of such limitations and of the trade-offs involved in using swarm systems compared with a centralised counterpart. In order to extend the current knowledge base, two problems involving UAVs and spacecraft were considered in this thesis. The dynamic variant of the cooperative moving target engagement scenario was investigated; this problem involves assigning multiple UAVs to time-dependent cooperative tasks, and was previously solved using centralised algorithms. In the spacecraft swarm problem, the premature degradation of the mission due to the unbalanced fuel consumption caused by the J2 perturbation was investigated and a solution proposed; this required a coordination scheme to reallocate the agents within the configuration. The behaviour-based decentralised controller has been shown to be a feasible way of dealing with complex cooperative missions. The main findings of this research can be summarised as follows:

The simulation study evaluated the performance of the decentralised controller against the optimal solution generated by a centralised algorithm. The results indicate that the performance of the decentralised controller depends highly on the complexity of the coupled task constraints.

The simple responsive swarm does not provide satisfactory performance when dealing with complex dependencies. It was then demonstrated that the loss of mission efficiency due to the absence of globally coordinated information can be mitigated, to some extent, by giving the agents a certain level of prediction capability.

The design of SO systems requires tweaking of parameters to achieve the desired emergent behaviours. More importantly, heuristic methods are necessary to reduce the duplication of effort when the fuel capacity is limited, as in the case of the space swarm.

In a natural swarm system, the individual behaviours are mostly probabilistic. For example, in the response threshold model, when the stimulus or demand for a certain task increases, the agents start performing that task with an increased probability. Such non-deterministic behaviour is the reason that swarm-based solutions are robust and self-adaptive. In future studies, the rule sets that govern the agent behaviours can be designed on a probabilistic basis. As is well known, collective behaviour emerges from the interaction of a large number of individuals. This leads to a new question: what is the critical size of a swarm with which unmanned vehicles following probabilistic rules can accomplish a specific mission? The predictive swarm considered in this thesis allows the agents to sacrifice their individual utility if the action enhances the overall team utility. However, one of the key principles in the notion of a swarm is the simplicity of the agents. The second question worth considering is: what is the right balance between increasing the intelligence of the individuals and increasing the size of the swarm? With deterministic behaviour, the decentralisation, locality, scalability and parallelism of a swarm system have been demonstrated in this thesis. The other features of a swarm system, such as flexibility and adaptivity, remain to be investigated once probabilistic rules are applied. This raises the third question: how well can a swarm perform in a complex mission when uncertainties are present? By answering these questions, a more comprehensive understanding of intelligent swarm systems will be developed.


Appendix A

Multiple UAV Simulator

In this appendix, the development of the simulator is presented. This simulator is built using the MATLAB/SIMULINK simulation software and enables users to implement cooperative control algorithms. The development of this simulator adopted the hierarchical design approach used in MultiUAV, a multiple vehicle simulator originally developed by the U.S. Air Force Research Laboratory (AFRL) [119].

A.1. Top level model block

Figure A1: Top level model block.


Figure A1 shows the top level model block, which includes four targets and ten UAVs. The numbers of targets and UAVs can be defined by the user. The UAVs are independent modules, so the simulator can simulate a heterogeneous swarm simply by replacing the control functions. The true states of the targets' motion are sent to the UAV modules via a signal bus. However, such information is not directly available to the UAVs during the simulated scenarios: it first passes through the sensor model, which is integrated into the UAV control unit. If a target remains outside the sensor range, the information is discarded. If a target is inside the sensor range, the observed target motion, whose accuracy depends on the sensor model, is passed to the UAV.

A.2. Target

Figure A2: Target block

The ground moving target block is shown in Figure A2. A non-manoeuvring motion model is used for the targets. In the simulation, the targets follow nearly straight and level motion at a constant velocity, with small acceleration noise along each horizontal direction.
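A constant-velocity model with small acceleration noise, in the spirit of the target block above, can be sketched in a few lines. The time step, noise level, and seed are illustrative assumptions; the simulator itself implements this in SIMULINK rather than in code like the following.

```python
import random

def step_target(state, dt=1.0, accel_sigma=0.05, rng=random):
    """One step of a non-manoeuvring target: constant velocity plus small
    zero-mean Gaussian acceleration noise on each horizontal axis.
    State is (x, y, vx, vy); units are illustrative."""
    x, y, vx, vy = state
    ax = rng.gauss(0.0, accel_sigma)
    ay = rng.gauss(0.0, accel_sigma)
    return (x + vx * dt + 0.5 * ax * dt ** 2,
            y + vy * dt + 0.5 * ay * dt ** 2,
            vx + ax * dt,
            vy + ay * dt)

rng = random.Random(0)                  # fixed seed for repeatability
s = (0.0, 0.0, 10.0, 0.0)               # heading east at 10 m/s
for _ in range(100):
    s = step_target(s, rng=rng)
```

After 100 seconds the target sits near (1000, 0) with only a small random wander, which is the "nearly straight and level" behaviour the UAV sensor models then observe.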


A.3. UAV

Figure A3: UAV block

Figure A3 shows the UAV block, which consists of the vehicle dynamics module and the behaviour-based control module.

Figure A4: UAV dynamics


The UAV dynamics block, shown in Figure A4, reads the autopilot commands, such as heading and velocity, and runs the vehicle dynamics simulation. The runtime data of the UAVs is organised and stored in global MATLAB structures. The ReadVehicleStatusS s-function retrieves the commands from the global structures, which are updated by the vehicle control unit at each time step. The vehicle dynamics is aggregated in a single s-function, TacticalVehicle, developed in MultiUAV. This dynamics model is based on inputs from a file of aerodynamic forces, moments, and damping derivatives. The aerodynamic parameters are used, along with physical parameters, in a nonlinear six-degree-of-freedom equations-of-motion simulation to generate the vehicle dynamics.

Figure A5: UAV control unit

Figure A5 shows the subsystem of the behaviour-based control block. The user-defined cooperative control algorithm is implemented in the s-function named ControlUnitS. There are two sources of inputs to the ControlUnitS s-function: block inputs, including the updated vehicle dynamics and target states, and global structures that contain the messages received from other UAVs. The ControlUnitS s-function contains the code that describes the cooperative behaviour logic and the rule sets that govern the operations of the UAVs in the corresponding behaviour modes. The SendMessageS s-function sends inter-vehicle messages to UAVs within communication range. The message data sent to a certain UAV is stored in that UAV's message inbox with a unique identifier including the sender, the time sent, and the type of the message.
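The message layout and range-limited delivery described above can be sketched as follows. This is an illustrative data model only, not the MultiUAV or SendMessageS implementation: the class, function names, the 2D positions, and the 5 km communication range are all assumptions.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Message:
    """Inter-vehicle message carrying the identifier fields described in
    the text: sender, time sent, and message type."""
    sender: int
    time_sent: float
    msg_type: str
    payload: dict = field(default_factory=dict)

def send_message(inboxes, positions, sender, msg, comm_range=5000.0):
    """Deliver the message to every UAV within communication range of the
    sender, appending it to that UAV's inbox."""
    sx, sy = positions[sender]
    for uav, (x, y) in positions.items():
        if uav != sender and math.hypot(x - sx, y - sy) <= comm_range:
            inboxes.setdefault(uav, []).append(msg)

positions = {1: (0.0, 0.0), 2: (1000.0, 0.0), 3: (9000.0, 0.0)}
inboxes = {}
send_message(inboxes, positions, 1, Message(1, 0.0, 'engage-request'))
```

UAV 2, within range, receives the request; UAV 3, outside range, never sees it, which is how the communication constraint shapes the direct-interaction channel.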


Appendix B

Approximation Methods of Collision Avoidance Constraints

Collision avoidance is a critical aspect of multiple-vehicle path planning problems. The exact representation of these constraints is nonlinear and non-convex. Two methods have been proposed in the literature to transform such constraints into linear form: mixed integer linear programming (MILP) [31, 32] and sequential convex programming (SCP) [33]. In this appendix, a comparison of these two methods applied to spacecraft manoeuvring problems is presented.

The spacecraft is assumed to be flying in a low Earth circular orbit at a radius of 7000 km. The linearised equations of relative motion are used. The maximum impulsive velocity change along each axis is set to 0.1 m/s. The relative dynamics for the satellites are discretised with a 30 s time step, and the total number of control steps is set to 60.

Representing collision avoidance constraints in the MILP formulation requires two steps. First, simplify the obstacle with polygons. For example, the inequalities that represent the constraints of a rectangular obstacle in the two-dimensional case are given by (B.1), where one corner pair defines the bottom left of the rectangle and the other defines the top right. These inequalities, joined by logical 'or' operators, guarantee that the point lies in the area outside the obstacle.


Second, transform these inequalities into mixed-integer form. As the logical operator 'or' cannot be included in a mathematical programming problem, binary variables b_i \in \{0, 1\} and a large positive constant M are introduced to transform 'or' into 'and'. The inequalities in MILP form are given by [120]:

x \le x_{min} + M b_1  and   (B.2)

x \ge x_{max} - M b_2  and

y \le y_{min} + M b_3  and

y \ge y_{max} - M b_4  and

b_1 + b_2 + b_3 + b_4 \le 3.

Setting b_i = 1 relaxes the corresponding inequality; the final constraint ensures that at least one of the four original inequalities remains active.
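To illustrate the big-M transformation, the sketch below (pure Python; the rectangle and big-M constant are hypothetical values, not taken from the thesis) enumerates the binary assignments of (B.2) to test whether a point is collision-free. A MILP solver would branch over these binaries rather than enumerate them; this is only a check of the logic.

```python
from itertools import product

# Hypothetical rectangular obstacle (x_min, y_min, x_max, y_max) and big-M constant.
RECT = (-10.0, -10.0, 10.0, 10.0)
M = 1e6

def outside_rectangle(x, y, rect=RECT, big_m=M):
    """Big-M form of (B.2): the point is collision-free iff some binary
    assignment with b1 + b2 + b3 + b4 <= 3 satisfies all four inequalities."""
    x_min, y_min, x_max, y_max = rect
    for b in product((0, 1), repeat=4):
        if sum(b) > 3:
            continue  # at least one original inequality must remain active
        if (x <= x_min + big_m * b[0] and
                x >= x_max - big_m * b[1] and
                y <= y_min + big_m * b[2] and
                y >= y_max - big_m * b[3]):
            return True
    return False
```

A point to the right of the rectangle satisfies the x >= x_max branch (the other three constraints relaxed), while a point inside the rectangle fails every assignment, matching the disjunction in (B.1).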

On the other hand, the SCP method approximates the exact constraints by convex functions and solves the problem in an iterative process. The simplified constraint can be written as [33]:

(\bar{x} - x_{obs})^{T} (x - x_{obs}) \ge R \, \lVert \bar{x} - x_{obs} \rVert,   (B.3)

where x is the state vector of the i-th vehicle, x_{obs} is the state vector of the obstacle (which is assumed to be stationary), R is the safety distance, and \bar{x} is the state vector from the previous iteration.
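The left-hand side of (B.3) is linear in the decision variable, so the constraint is convex and cheap to evaluate or impose inside an SCP iteration. A minimal sketch of evaluating it, with illustrative function and argument names:

```python
import math

def scp_avoidance_ok(x, x_obs, x_prev, safety_r):
    """Evaluate the convexified avoidance constraint (B.3):
    (x_prev - x_obs) . (x - x_obs) >= safety_r * ||x_prev - x_obs||.
    x, x_obs, x_prev are position tuples; the LHS is linear in x."""
    d = [a - b for a, b in zip(x_prev, x_obs)]
    norm = math.sqrt(sum(c * c for c in d))
    lhs = sum(di * (xi - oi) for di, xi, oi in zip(d, x, x_obs))
    return lhs >= safety_r * norm
```

Geometrically, the constraint keeps the candidate point on the far side of a half-plane tangent to the safety sphere, as seen from the previous iterate.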

Case 1: The obstacle is at the origin with a radius of 30 meters. The start and end positions of the spacecraft are and .


A. MILP

Figure B.1: Solutions to trajectories of the spacecraft using the MILP method (rectangle approximation). The total fuel use is equivalent to a Δv of 0.178 m/s. The solver time is 0.687 seconds.

Figure B.2: Solutions to trajectories of the spacecraft using the MILP method (hexagon approximation). The total fuel use is equivalent to a Δv of 0.158 m/s. The solver time is 0.813 seconds.


The results of using the MILP method in Case 1 are summarised in Figures B.1 and B.2. As can be seen, the hexagon approximation is closer to the exact constraint represented by the circle; thus, the fuel consumption of the manoeuvre is lower than with the rectangle approximation. However, using the hexagon introduces more decision variables and inequalities, so the solver time is longer than with the rectangle approximation.

B. SCP

Figure B.3: Solutions to trajectories of the spacecraft using the SCP method. The total fuel use is equivalent to a Δv of 0.151 m/s. The solver time is 0.14 seconds.

The SCP method results in a better overall performance than the MILP method.

Case 2: Two obstacles are centred at the origin and at , respectively. The start and end positions of the spacecraft are and .


A. MILP

Figure B.4: Solutions to trajectories of the spacecraft using the MILP method (rectangle approximation). The total fuel use is equivalent to a Δv of 0.3241 m/s. The solver time is 18.156 seconds.


As shown in Figure B.4, the space between the two obstacles becomes infeasible because the approximations of the avoidance regions overlap. The solution therefore contains a velocity component in the out-of-plane direction.

B. SCP


Figure B.5: Solutions to trajectories of the spacecraft using the SCP method. The total fuel use is equivalent to a Δv of 0.153 m/s. The solver time is 0.173 seconds.

The solution given by the SCP method plans a trajectory that passes between the two obstacles, which results in much lower fuel consumption and solver time.


Appendix C

Distributed Spacecraft Manoeuvre Planning

In Chapter 4, a behaviour-based control law was developed to coordinate the spacecraft swarm.

The proposed control law guides the spacecraft in the high-fuel-consumption positions to switch with those in the low-fuel-consumption positions. During this process, the spacecraft need to manoeuvre through a cluttered environment. In this appendix, a distributed multiple-spacecraft motion planning algorithm is presented. The proposed algorithm is inspired by the work in [121, 122], which used a game-theoretic approach to formulate the distributed optimisation problem. The simulation scenarios used to validate this algorithm include spacecraft formation reconfiguration and close-proximity manoeuvres around obstacles.

C.1. Problem Formulation

In this section, the method for formulating the reconfiguration-manoeuvre planning problem is described. The problem of interest is a typical minimum-fuel, fixed-time manoeuvre for a cluster of spacecraft. The individual spacecraft are governed by vehicle dynamics together with individual and inter-vehicle constraints. The individual vehicle constraints may be due to restrictions on the control input, obstacle avoidance and final states. Inter-vehicle constraints considered in this study include collision avoidance and plume avoidance.

A. Discrete Relative Dynamics

For the problem addressed in this study, the spacecraft fly in close proximity to a reference point that executes a circular orbit, as is the case in a close formation flight or rendezvous manoeuvre. Ignoring perturbations such as atmospheric drag, third-body gravitation, etc., the Clohessy-Wiltshire equations

are used to describe the spacecraft dynamics [111]:

\ddot{x} - 2n\dot{y} - 3n^2 x = u_x ,   (C.1a)

\ddot{y} + 2n\dot{x} = u_y ,   (C.1b)

\ddot{z} + n^2 z = u_z ,   (C.1c)

where n is the angular velocity of the reference orbit; u_x, u_y and u_z are the applied accelerations provided by the thrusters (assuming that the attitude of the spacecraft is maintained in such a way that the three thrust components are vectors along the reference axes); and x, y and z are the relative displacements in the radial, in-track, and cross-track directions, respectively. These are linearised equations of relative motion and are thus generally only valid for small relative displacements (although it is worth noting that the Clohessy-Wiltshire relations remain valid even for arbitrary relative displacements in the in-track direction). One may then take the state vector of the spacecraft as X = [x, y, z, \dot{x}, \dot{y}, \dot{z}]^T, where the superposed dot indicates the time derivative of each respective displacement. The relative motion of a spacecraft in the local vertical/local horizontal (LVLH) frame is then given as a linear time-invariant (LTI) equation, namely:

\dot{X} = A X + B u ,   (C.2)

where

A = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 3n^2 & 0 & 0 & 0 & 2n & 0 \\ 0 & 0 & 0 & -2n & 0 & 0 \\ 0 & 0 & -n^2 & 0 & 0 & 0 \end{bmatrix}

and

B = \begin{bmatrix} 0_{3 \times 3} \\ I_{3 \times 3} \end{bmatrix} .

The continuous-time system can be discretised with a zero-order hold at a sampling period T. The discrete equivalent state-space equation is:

X_{k+1} = A_d X_k + B_d u_k ,   (C.3)

where

A_d = e^{AT}

and

B_d = \int_0^T e^{A\tau} \, d\tau \; B .

With the initial state X_0 and control sequence \{u_0, u_1, \ldots, u_{N-1}\}, the final state X_N can be written recursively as:

X_N = A_d^N X_0 + \sum_{k=0}^{N-1} A_d^{N-1-k} B_d \, u_k .   (C.4)

This equation (C.4) expresses the relative position and velocity of the spacecraft in the reference frame at any future time as a linear function of the initial conditions and the control sequence. The control input that appears in equation (C.4) is assumed to be applied continuously throughout the sampling period. For satellite systems, it can be more accurate to model the control input (which is often applied by impulsive thrusters) as being applied impulsively at the beginning of each sampling period, which results in an impulsive velocity change \Delta v_k. As a consequence, equations (C.3) and (C.4) become:

X_{k+1} = A_d \left( X_k + B \, \Delta v_k \right) ,   (C.5)

X_N = A_d^N X_0 + \sum_{k=0}^{N-1} A_d^{N-k} B \, \Delta v_k .   (C.6)
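The impulsive model of (C.5) can be propagated with the well-known closed-form Clohessy-Wiltshire state transition matrix instead of a numerical matrix exponential. The sketch below is illustrative (pure Python; the function names are not from the thesis code): it applies an impulsive Δv at the start of a period and then propagates the free dynamics.

```python
import math

def cw_stm(n, t):
    """Closed-form Clohessy-Wiltshire state transition matrix for the state
    X = [x, y, z, vx, vy, vz] (radial, in-track, cross-track), mean motion n."""
    s, c = math.sin(n * t), math.cos(n * t)
    return [
        [4 - 3 * c,      0, 0,  s / n,            2 * (1 - c) / n,      0],
        [6 * (s - n * t), 1, 0, -2 * (1 - c) / n, (4 * s - 3 * n * t) / n, 0],
        [0,              0, c,  0,                0,                    s / n],
        [3 * n * s,      0, 0,  c,                2 * s,                0],
        [-6 * n * (1 - c), 0, 0, -2 * s,          4 * c - 3,            0],
        [0,              0, -n * s, 0,            0,                    c],
    ]

def propagate_impulsive(x, dv, n, dt):
    """One step of equation (C.5): add the impulsive dv to the velocity
    components, then propagate the free CW dynamics for dt seconds."""
    x = list(x[:3]) + [x[3] + dv[0], x[4] + dv[1], x[5] + dv[2]]
    A = cw_stm(n, dt)
    return [sum(A[i][j] * x[j] for j in range(6)) for i in range(6)]
```

A quick sanity check of the dynamics: a pure in-track offset with zero relative velocity is an equilibrium of the CW equations, so propagating it should leave the state unchanged.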

B. Centralised Approach to Multiobjective Optimisation

The objective of this multiple-spacecraft control problem is to minimise the fuel consumption of each individual vehicle. Such a problem for a total of N_v spacecraft can be posed as the following centralised optimisation:

minimise \{ J_1(u^1), J_2(u^2), \ldots, J_{N_v}(u^{N_v}) \}

subject to

f_i(X^i, u^i) = 0, \quad i = 1, \ldots, N_v ,   (C.7)

and g_i(X^i, u^i) \le 0, \quad i = 1, \ldots, N_v ,

and h_{ij}(X^i, X^j) \le 0, \quad i \ne j ,

where f_i represents the dynamics constraints given by equations (C.3) & (C.4) for vehicle i, g_i represents the local constraints for vehicle i, i.e. obstacle avoidance and the magnitude of the impulsive velocity change, and h_{ij} represents the inter-vehicle constraints, i.e. collision avoidance and plume avoidance between vehicles i and j. The objective function of the i-th spacecraft is expressed as:


J_i = \sum_{k=0}^{N-1} \left( |\Delta v_{x,k}^i| + |\Delta v_{y,k}^i| + |\Delta v_{z,k}^i| \right),

which is the sum of the absolute values of the control inputs along each axis. The objective function can also be expressed as a linear combination of the control inputs by introducing two sets of slack variables for its positive and negative parts [123].
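The slack-variable construction referred to here is the standard linear-programming device; a sketch of it (notation mine, not from the thesis) is:

```latex
\begin{align}
\Delta v_k &= \Delta v_k^{+} - \Delta v_k^{-},
  & \Delta v_k^{+} \ge 0,\ \Delta v_k^{-} \ge 0, \\
|\Delta v_k| &= \Delta v_k^{+} + \Delta v_k^{-}
  \quad \text{(at the optimum, minimisation drives one part to zero)}, \\
J_i &= \sum_{k=0}^{N-1} \bigl(\Delta v_k^{+} + \Delta v_k^{-}\bigr).
\end{align}
```

This replaces the non-smooth absolute value with a linear objective and linear constraints, at the cost of doubling the number of control variables.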

To solve the multiobjective problem, the weighting method is frequently used, especially for practical problems. The idea of the weighting method is to associate each objective function with a weighting coefficient and minimise the weighted sum of the objectives. In this way, the multiple objective functions are transformed into a single objective function:

minimise \sum_{i=1}^{N_v} w_i J_i   (C.8)

where w_i \ge 0 for all i = 1, \ldots, N_v and \sum_{i=1}^{N_v} w_i = 1. It has been proved that the solution of this weighting problem is Pareto optimal [124], which implies that there is no other solution that makes every player at least as well off and at least one player strictly better off.

One of the major drawbacks of the centralised approach is that it scales poorly with the number of agents. The decision variables related to inter-vehicle constraints increase exponentially as the size of the problem increases. Limited onboard computing resources may render such computationally complex approaches impracticable. Alternatively, for the system in which a centralised controller is not essential, the adoption of a distributed approach may be able to reach the global objectives while distributing the computational load.

C.2. Distributed Motion Planning Algorithm

This section presents the distributed motion planning algorithm. In order to formulate the distributed algorithm, an assumption is made that communication among vehicles within a neighbourhood can be established automatically. In this distributed approach, the centralised problem is broken down into a local optimisation for each agent. The local optimisation for an individual vehicle is computed assuming that the states of the other vehicles in the neighbourhood are fixed. The local optimisation problem for vehicle i is defined as:

minimise J_i(u^i)

subject to

f_i(X^i, u^i) = 0 ,   (C.9)

and g_i(X^i, u^i) \le 0 ,

and h_{ij}(X^i, \bar{X}^j) \le 0, \quad j \ne i ,

where the notation \bar{X}^j is used to represent the local solution received from neighbour j. As the local optimisation lets each vehicle minimise its own cost, the problem is transformed into a repeated game in which each player searches for its own optimal strategy through an iterative process, in response to the strategies of the other players. Figure C.1 presents the algorithm for vehicle i within a system of N_v vehicles.

Algorithm C.1

1: Determine the initial control input u^i(0) by solving the local optimisation while ignoring all other vehicles.

2: Communicate u^i(0) to all vehicles in the system and form \bar{U}(0) from the received control inputs.

3: Solve the local optimisation problem (C.9) for u^i(1) while the states of the other vehicles are fixed.

4: Communicate the local control input u^i(1) to all vehicles.

5: while \lVert u^i(q) - u^i(q-1) \rVert > \varepsilon

6: Solve the local optimisation problem (C.9) for u^i(q+1).

7: Update the local control input: u^i \leftarrow u^i(q+1).

8: Communicate the local control input to all vehicles.

9: end while

Figure C.1: The distributed motion planning algorithm for multiple spacecraft
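To make the iteration concrete, the following toy sketch (hypothetical, not the thesis implementation) runs the sequential variant of this scheme on two agents with scalar decisions. Each agent's cost is its distance to a preferred value, the coupling constraint is a minimum separation d, and the local problem is simple enough to solve in closed form, so the best-response loop can be observed directly.

```python
def local_opt(target, other, d):
    """Local problem for one agent: minimise |u - target| subject to the
    coupling constraint |u - other| >= d, with the neighbour's plan fixed."""
    if abs(target - other) >= d:
        return float(target)  # unconstrained optimum is already feasible
    # Otherwise project onto the nearest feasible boundary point.
    lo, hi = other - d, other + d
    return lo if abs(target - lo) <= abs(target - hi) else hi

def distributed_plan(targets, d=1.0, eps=1e-9, max_iter=100):
    """Sequential (one-agent-at-a-time) iteration on a two-agent toy problem:
    plan ignoring the other vehicle, then repeat best responses until no
    agent changes its decision by more than eps."""
    u = list(targets)  # initial plans ignore the coupling constraint
    for _ in range(max_iter):
        change = 0.0
        for i in range(2):
            new = local_opt(targets[i], u[1 - i], d)
            change = max(change, abs(new - u[i]))
            u[i] = new
        if change <= eps:  # termination test of the while-loop
            break
    return u
```

When the two preferred values already satisfy the separation, the loop terminates after one pass; when they conflict, one agent yields to the nearest feasible point and the iteration settles there.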

The iteration in Algorithm C.1 can be implemented in two different ways: simultaneous and sequential [45]. In the simultaneous iteration approach, all of the agents update simultaneously using the solutions from the last step. In the sequential iteration approach, the agents update one at a time using the most recently computed solutions. In either implementation, Algorithm C.1 terminates in a finite number of iterations, which occurs when none of the vehicles can change its local control input by more than the predetermined level ε. The proof of the global convergence of Algorithm C.1 is provided at the end of this appendix. The advantage of the proposed algorithm over the penalty method [43, 44] for solving the spacecraft manoeuvre planning problem is that it does not require a nonlinear penalty term to be added to the objective function. This allows the optimisation problem (C.9) to be formulated as a mixed-integer linear programming problem, and the global optimum can be found using commercial software.

C.3. Simulation Results

To investigate the effectiveness of the distributed algorithm, three scenarios were considered. First, two different reconfiguration problems involving two spacecraft were simulated. Then the problem of three spacecraft manoeuvring in a cluttered environment was considered. In the simulations, the spacecraft were assumed to be flying in a circular low Earth orbit (LEO) at a radius of 7000 km. The maximum impulsive velocity change along each axis is set to 0.1 m/s. The relative dynamics for the satellites are discretised on a 60 s time step. The obstacle and plume-impingement regions are modelled using hexahedrons. These constraints can be transformed into mixed-integer form by introducing binary slack variables [32]. The simulation was implemented using Matlab, from which the CPLEX solver is called through its Matlab interface to solve the local optimisation problems.

A. Scenario 1: Two-spacecraft reconfiguration

First we considered a reconfiguration of two spacecraft flying in formation. The initial and final positions are all on the y-axis (in-track), but the directions of the manoeuvres are reversed. The safety distance between the two spacecraft in this scenario is set at 100 meters. Figure C.2(a) shows the solution sequence of the distributed algorithm. The Pareto optimal front is calculated in a centralised manner using the weighting method (equation (C.8)). We observe that the solution of the distributed algorithm converges to the Pareto optimal state after five iterations. Figure C.2(b) shows the required manoeuvre where the spacecraft move symmetrically in the x-y plane while maintaining safety distance between them. The black lines indicate the direction of thrust firing, with length proportional to magnitude. Figure C.2(c) shows the solutions from each iteration.


Figure C.2(a): Cost for spacecraft A and B of each iteration in a two-spacecraft reconfiguration.

Figure C.2(b): Solution to trajectories of the spacecraft in Scenario 1.

Figure C.2(c): Solutions to trajectories of the spacecraft generated during each iteration in a two-spacecraft reconfiguration.

B. Scenario 2: Two-spacecraft reconfiguration with plume avoidance

This section presents the simulation results for a two-spacecraft reconfiguration problem with plume avoidance introduced between the spacecraft. We assume that the plume-impingement region is defined by a hexahedron with a length of 100 meters and a width of 20 meters. The collision-avoidance distance is 15 meters. In this scenario, the initial and final positions are all on the y-axis, and the directions of the manoeuvres are the same. To initiate an in-track motion, the spacecraft in front, A (green colour), fires backward, causing the plume of its thrusters to impinge upon the other spacecraft. To avoid this, as the solution in Figure C.3(b) shows, the green vehicle delays firing its thrusters until the red vehicle (B) moves out of the plume-avoidance region. This causes the green vehicle to consume more energy in order to finish the manoeuvre in the given time, as can be seen in Figure C.3(a).

Figure C.3(a): Cost for spacecraft A and B of each iteration in a two-spacecraft reconfiguration with plume avoidance.

Figure C.3(b): Solution to trajectories of the spacecraft in Scenario 2.


Figure C.3(c): Solutions to trajectories of the spacecraft generated during each iteration in a two-spacecraft reconfiguration with plume avoidance.

C. Scenario 3: Three spacecraft manoeuvring in a cluttered environment

In the final example, the proposed algorithm is applied to three spacecraft manoeuvring in a cluttered environment. The red vehicle starts from rest and performs a rendezvous manoeuvre to the other side of the structure. The other two vehicles are on the neighbouring orbits (dashed lines) of the structure and rendezvous with the structure at different locations. The resulting trajectories are summarised in Figure C.4(a). Figure C.4(b) shows the minimum distance between the three vehicles during each iteration. As shown in the figure, the inter-vehicle distance converged after 10 iterations and satisfied the collision-avoidance constraint.


Figure C.4(a): Solution to the spacecrafts' trajectories in a cluttered environment. The blue boxes represent a structure in the reference orbit.

Figure C.4(b): Minimum distance between the spacecraft during the iterations.

C.4. Summary

In this study, we have presented a distributed approach to the motion planning problem for multiple spacecraft. This approach decomposes the centralised problem into smaller subproblems, where each agent computes a local optimisation while treating the decision variables of the other agents as constants. The agents communicate their local solutions with each other to achieve the global objective of the system. An iterative scheme is applied to solve this distributed optimisation problem. The main advantage of this approach is that it adopts a linear form of the objective function. This allows the local optimisation problem to be formulated as a mixed-integer linear programming problem, which can be quickly solved using commercial software. Simulations were presented to demonstrate the algorithm. The results showed the fast convergence of the proposed algorithm, while local and inter-vehicle constraints were maintained.

C.5. Proof of the global convergence of Algorithm C.1

Since the local optimisation problem (C.9) is computed given that the states of the other vehicles are fixed, the constraints imposed by the other vehicles are not treated in the local optimisation. To prove the global convergence of Algorithm C.1, we consider the penalty-augmented global cost function used in reference [43], which includes all the inter-vehicle constraints, i.e.:

J_G = \sum_{i=1}^{N_v} J_i + M \sum_{(j,k)} P_{jk} ,   (C.10)

where N_v is the total number of vehicles, M is a large positive constant which tightens the constraints, the second sum is taken over all pairs of vehicles (j, k), and P_{jk} denotes the penalty function that punishes the violation of the inter-vehicle constraints. P_{jk} is defined as:

P_{jk} = \sum_{t=1}^{N} \max\left( 0, \; R - d_{jk}(t) \right) ,   (C.11)

where N is the total number of time steps, d_{jk}(t) denotes the distance between the pair of vehicles (j, k) at time step t, and R is the safety distance. It is obvious that the global cost function is convex.
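A penalty of this kind (assuming the hinge form max(0, R − d), which is one simple choice consistent with "punishing the violation" of the safety distance) can be sketched as follows; the function and argument names are illustrative only:

```python
import math

def penalty(traj_j, traj_k, safety_r):
    """Inter-vehicle penalty term: sum over time steps of the safety-distance
    violation max(0, R - d_jk(t)) between two position trajectories."""
    total = 0.0
    for p, q in zip(traj_j, traj_k):
        d = math.dist(p, q)           # distance at this time step
        total += max(0.0, safety_r - d)
    return total
```

The penalty is zero whenever the trajectories keep the safety distance at every step, and grows with the depth of each violation.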

Therefore, the local optimisation problem (C.9) becomes:

minimise J_i + M \sum_{j \ne i} P_{ij}

subject to f_i(X^i, u^i) = 0   (C.12)

and g_i(X^i, u^i) \le 0 .

If we assume that problem (C.12) has the same solution as problem (C.9) (ideally, the solution of problem (C.12) converges to the solution of (C.9) as the penalty parameter M \to \infty), then from step 7 of the algorithm we have:

J_G(U(q+1)) \le J_G(U(q)) ,

where U(q) denotes the collection of all local control inputs at iteration q. By convexity of J_G, each local minimisation performed with the other vehicles held fixed cannot increase the global cost. Since the global cost function is non-increasing with iteration, given the fact that J_G is bounded below, the solution sequence of Algorithm C.1 converges.


Bibliography

[1] B. Dunin-Keplicz and R. Verbrugge, Teamwork in Multi-Agent Systems: A Formal Approach: Wiley, 2011.
[2] F. Bourgault, et al., "Coordinated decentralized search for a lost target in a Bayesian world," in Intelligent Robots and Systems, 2003. (IROS 2003). Proceedings. 2003 IEEE/RSJ International Conference on, 2003, pp. 48-53 vol.1.
[3] T. Furukawa, et al., "Recursive Bayesian search-and-tracking using coordinated UAVs for lost targets," in Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, 2006, pp. 2521-2526.
[4] N. Nigam and I. Kroo, "Persistent Surveillance Using Multiple Unmanned Air Vehicles," in Aerospace Conference, 2008 IEEE, 2008, pp. 1-14.
[5] N. Ceccarelli, et al., "Micro UAV Path Planning for Reconnaissance in Wind," in American Control Conference, 2007. ACC '07, 2007, pp. 5310-5315.
[6] DARPA. (2012, 3 April). System F6. Available: http://www.darpa.mil/Our_Work/TTO/Programs/System_F6.aspx
[7] NASA. (2012, 3 April). Solar Imaging Radio Array. Available: http://sira.gsfc.nasa.gov/
[8] NASA. (2012, 3 April). The Magnetospheric Multiscale (MMS) mission. Available: http://mms.gsfc.nasa.gov/index.html
[9] NASA. (2012, 3 April). Exoplanet Exploration Program (ExEP). Available: http://exep.jpl.nasa.gov/
[10] K. Heffner and F. Hassaine, "Towards Intelligent Operator Interfaces in Support of Autonomous UVS Operations," in 16th International Command and Control Research and Technology Symposium, Quebec, Canada, 2011.
[11] S. Cambone, et al., "Unmanned Aircraft Systems Roadmap 2005-2030," ed: U.S. Department of Defence, 2005.
[12] T. Shima and S. Rasmussen, UAV Cooperative Decision and Control: Challenges and Practical Approaches: SIAM, Society for Industrial and Applied Mathematics, 2009.
[13] J. Shamma, Cooperative Control of Distributed Multi-Agent Systems: Wiley, 2008.
[14] E. Bonabeau, et al., Swarm Intelligence: From Natural to Artificial Systems: Oxford University Press, 1999.
[15] S. Hauert, et al., "Evolved swarming without positioning information: an application in aerial communication relay," Autonomous Robots, vol. 26, p. 11, 2009.


[16] C. A. Erignac, "An Exhaustive Swarming Search Strategy based on Distributed Pheromone Maps," presented at the AIAA Infotech@Aerospace 2007 Conference and Exhibit, Rohnert Park, California, 2007.
[17] V. Gazi, "Swarm aggregations using artificial potentials and sliding-mode control," Robotics, IEEE Transactions on, vol. 21, pp. 1208-1214, 2005.
[18] D. J. Nowak, "Exploitation of self organization in UAV swarms for optimization in combat environments," Master of Science in Computer Science, Department of Electrical and Computer Engineering, Air Force Institute of Technology, 2008.
[19] C. Schumacher, et al., "Task allocation for wide area search munitions," in American Control Conference, 2002. Proceedings of the 2002, 2002, pp. 1917-1922 vol.3.
[20] C. Schumacher, et al., "Task Allocation for Wide Area Search Munitions via Iterative Network Flow," in AIAA Guidance, Navigation, and Control Conference and Exhibit, Monterey, California, 2002.
[21] C. Schumacher, et al., "Task allocation for wide area search munitions with variable path length," in American Control Conference, 2003. Proceedings of the 2003, 2003, pp. 3472-3477 vol.4.
[22] C. Schumacher, et al., "UAV Task Assignment with Timing Constraints via Mixed-Integer Linear Programming," in the AIAA Unmanned Unlimited Conference, Chicago, Illinois, 2004.
[23] A. M. Bayen and C. J. Tomlin, "Real-time discrete control law synthesis for hybrid systems using MILP: application to congested airspace," in American Control Conference, 2003. Proceedings of the 2003, 2003, pp. 4620-4626 vol.6.
[24] A. M. Bayen, et al., "An approximation algorithm for scheduling aircraft with holding time," in Decision and Control, 2004. CDC. 43rd IEEE Conference on, 2004, pp. 2760-2767 Vol.3.
[25] S. Rasmussen, et al., "Optimal vs. Heuristic Assignment of Cooperative Autonomous Unmanned Air Vehicles," in the AIAA Guidance, Navigation, and Control Conference, Austin, TX, 2003.
[26] T. Shima, et al., "UAV cooperative multiple task assignments using genetic algorithms," in American Control Conference, 2005. Proceedings of the 2005, 2005, pp. 2989-2994 vol.5.
[27] D. Turra, et al., "Real Time UAVs Task Allocation with Moving Targets," in AIAA Guidance, Navigation, and Control Conference and Exhibit, Providence, Rhode Island, 2004.
[28] D. B. Kingston and C. J. Schumacher, "Time-dependent cooperative assignment," in American Control Conference, 2005. Proceedings of the 2005, 2005, pp. 4084-4089 vol.6.
[29] P. R. Chandler, et al., "UAV Cooperative Path Planning," in AIAA Guidance, Navigation, and Control Conference and Exhibit, Denver, CO, 2000.


[30] T. W. McLain and R. W. Beard, "Coordination variables, coordination functions, and cooperative timing missions," Journal of Guidance, Control, and Dynamics, vol. 28, pp. 150-161, 2005.
[31] A. Richards and J. P. How, "Aircraft trajectory planning with collision avoidance using mixed integer linear programming," in American Control Conference, 2002. Proceedings of the 2002, 2002, pp. 1936-1941 vol.3.
[32] A. Richards, et al., "Spacecraft Trajectory Planning with Avoidance Constraints Using Mixed-Integer Linear Programming," AIAA Journal on Guidance, Control, and Dynamics, vol. 25, p. 9, 2002.
[33] D. Morgan, et al., "Spacecraft Swarm Guidance Using a Sequence of Decentralized Convex Optimizations," in AIAA/AAS Astrodynamics Specialist Conference, Minneapolis, Minnesota, 2012.
[34] C. Sultan, et al., "Energy Suboptimal Collision-Free Path Reconfiguration for Spacecraft Formation Flying," Journal of Guidance, Control, and Dynamics, vol. 29, 2006.
[35] C. Sultan, et al., "Energy optimal reconfiguration for large scale formation flying," in American Control Conference, 2004. Proceedings of the 2004, 2004, pp. 2986-2991 vol.4.
[36] B. Ackmese, et al., "A convex guidance algorithm for formation reconfiguration," in AIAA Guidance, Navigation, and Control Conference and Exhibit, Keystone, Colorado, 2006.
[37] I. Gokhan, et al., "Precise formation flying control of multiple spacecraft using carrier-phase differential GPS," in Guidance, Control and Navigation Conference, 2000.
[38] M. Tillerson, et al., "Co-ordination and control of distributed spacecraft systems using convex optimization techniques," International Journal of Robust and Nonlinear Control, vol. 12, pp. 207-242, 2002.
[39] S. Baliyarasimhuni and R. Beard, "Multiple UAV Task Allocation Using Distributed Auctions," in AIAA Guidance, Navigation and Control Conference and Exhibit, ed: American Institute of Aeronautics and Astronautics, 2007.
[40] A. Richards and J. How, "A decentralized algorithm for robust constrained model predictive control," in American Control Conference, 2004. Proceedings of the 2004, 2004, pp. 4261-4266 vol.5.
[41] A. Richards and J. How, "Decentralized model predictive control of cooperating UAVs," in Decision and Control, 2004. CDC. 43rd IEEE Conference on, 2004, pp. 4286-4291 Vol.4.
[42] Y. Kuwata, et al., "Distributed Robust Receding Horizon Control for Multivehicle Guidance," Control Systems Technology, IEEE Transactions on, vol. 15, pp. 627-641, 2007.


[43] G. Inalhan, et al., "Decentralized optimization, with application to multiple aircraft coordination," in Decision and Control, 2002, Proceedings of the 41st IEEE Conference on, 2002, pp. 1147-1155 vol.1.
[44] S. L. Waslander, et al., "Theory and algorithms for cooperative systems," ed: World Scientific Pub Co Inc, 2004.
[45] D. P. Bertsekas and J. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 1 ed.: Athena Scientific, 1997.
[46] W. B. Dunbar and R. M. Murray, "Distributed receding horizon control for multi-vehicle formation stabilization," Automatica, vol. 42, pp. 549-558, 2006.
[47] Y. Kuwata and J. How, "Decentralized Cooperative Trajectory Optimization for UAVs with Coupling Constraints," in Decision and Control, 2006 45th IEEE Conference on, 2006, pp. 6820-6825.
[48] R. L. Raffard, et al., "Distributed optimization for cooperative agents: Application to formation flight," in 43rd IEEE Conference on Decision and Control, Atlantis, Paradise Island, Bahamas, 2004.
[49] S. L. Waslander, "Multi-Agent Systems Design for Aerospace Applications," Doctor of Philosophy, Department of Aeronautics and Astronautics, Stanford University, 2007.
[50] A. G. O. Mutambara and H. Durrant-Whyte, "Distributed decentralized robot control," in American Control Conference, 1994, 1994, pp. 2266-2267 vol.2.
[51] A. G. O. Mutambara and H. F. Durrant-Whyte, "A formally verified modular decentralized robot control system," in Intelligent Robots and Systems '93, IROS '93. Proceedings of the 1993 IEEE/RSJ International Conference on, 1993, pp. 2023-2030 vol.3.
[52] H. Durrant-Whyte and B. Grocholsky, "Management and control in decentralised networks," in Information Fusion, 2003. Proceedings of the Sixth International Conference of, 2003, pp. 560-566.
[53] A. D. Ryan, et al., "Information-theoretic sensor motion control for distributed estimation," in ASME 2007 International Mechanical Engineering Congress and Exposition, Seattle, USA, 2007.
[54] S. S. Ponda, et al., "Trajectory Optimization for Target Localization Using Small Unmanned Aerial Vehicles," in AIAA Guidance, Navigation, and Control Conference, Chicago, Illinois, 2009.
[55] G. Hoffmann, et al., "Distributed Cooperative Search Using Information-Theoretic Costs for Particle Filters, with Quadrotor Applications," in AIAA Guidance, Navigation, and Control Conference and Exhibit, ed: American Institute of Aeronautics and Astronautics, 2006.


[56] A. Singh, et al., "Efficient informative sensing using multiple robots," J. Artif. Int. Res., vol. 34, pp. 707-755, 2009.
[57] K. Yang, et al., "A Gaussian process-based RRT planner for the exploration of an unknown and cluttered environment with a UAV," Advanced Robotics, vol. 27, pp. 431-443, 2013.
[58] C. M. Kreucher, et al., "An Information-Based Approach to Sensor Management in Large Dynamic Networks," Proceedings of the IEEE, vol. 95, pp. 978-999, 2007.
[59] E. Doucette, et al., "Simultaneous Localization and Planning for Cooperative Air Munitions Via Dynamic Programming," in Optimization and Cooperative Control Strategies. vol. 381, M. Hirsch, et al., Eds., ed: Springer Berlin Heidelberg, 2009, pp. 69-79.
[60] A. Sinclair, et al., "Simultaneous Localization and Planning for Cooperative Air Munitions," in Advances in Cooperative Control and Optimization. vol. 369, P. Pardalos, et al., Eds., ed: Springer Berlin Heidelberg, 2007, pp. 81-93.
[61] P. Scerri, et al., "Geolocation of RF Emitters by Many UAVs," in AIAA Infotech@Aerospace, Rohnert Park, California, 2007.
[62] Y. Altshuler, et al., "Swarm Intelligence — Searchers, Cleaners and Hunters," in Swarm Intelligent Systems, N. Nedjah and L. D. M. Mourelle, Eds., ed Berlin, Heidelberg: Springer, 2006, pp. 93-132.
[63] B. Linge, et al., "Self-organizing primitives for automated shape composition," in Shape Modeling and Applications, 2008. SMI 2008. IEEE International Conference on, 2008, pp. 147-154.
[64] B. Linge, et al., "An Emergent System for Self-Aligning and Self-Organizing Shape Primitives," in Self-Adaptive and Self-Organizing Systems, 2008. SASO '08. Second IEEE International Conference on, 2008, pp. 445-454.
[65] H. Kwong and C. Jacob, "Evolutionary exploration of dynamic ," in Evolutionary Computation, 2003. CEC '03. The 2003 Congress on, 2003, pp. 367-374 Vol.1.
[66] I. C. Price, "Evolving self-organized behavior for homogeneous and heterogeneous UAV or UCAV swarms," Master of Science, Department of Electrical and Computer Engineering, Air Force Institute of Technology, 2006.
[67] J. Sauter, et al., "Swarming Unmanned Air and Ground Systems for Surveillance and Base Protection," in AIAA Infotech@Aerospace Conference, ed: American Institute of Aeronautics and Astronautics, 2009.


[68] J. A. Sauter, et al., "Effectiveness of Digital Pheromones Controlling Swarming Vehicles in Military Scenarios," Journal of Aerospace Computing, Information, and Communication, vol. 4, pp. 753-769, 2007.
[69] H. Parunak and B. Sven, "Stigmergic Networking Methods for Swarms of Unpiloted Vehicles," in AIAA 3rd "Unmanned Unlimited" Technical Conference, Workshop and Exhibit, ed: American Institute of Aeronautics and Astronautics, 2004.
[70] U. Wilensky. (26 June). NetLogo. Available: http://ccl.northwestern.edu/netlogo/
[71] G. Theraulaz and E. Bonabeau, "Modelling the Collective Building of Complex Architectures in Social Insects with Lattice Swarms," Journal of Theoretical Biology, vol. 177, pp. 381-400, 1995.
[72] I. Karsai and Z. Pénzes, "Comb Building in Social Wasps: Self-organization and Stigmergic Script," Journal of Theoretical Biology, vol. 161, pp. 505-525, 1993.
[73] K. Petersen, et al., "TERMES: An Autonomous Robotic System for Three-Dimensional Collective Construction," in Robotics: Science and Systems, Los Angeles, CA, USA, 2011.
[74] Q. Lindsey, et al., "Construction with quadrotor teams," Autonomous Robots, vol. 33, pp. 323-336, 2012.
[75] M. J. B. Krieger and J.-B. Billeter, "The call of duty: Self-organised task allocation in a population of up to twelve mobile robots," Robotics and Autonomous Systems, vol. 30, p. 19, 2000.
[76] T. H. Labella, et al., "Division of labor in a group of robots inspired by ants' foraging behavior," ACM Trans. Auton. Adapt. Syst., vol. 1, pp. 4-25, 2006.
[77] A. Brutschy, et al., "Self-organized task allocation to sequentially interdependent tasks in swarm robotics," Autonomous Agents and Multi-Agent Systems, pp. 1-25, 2012.
[78] C. Reynolds, "Flocks, herds and schools: A distributed behavioral model," in Proceedings of the 14th annual conference on Computer graphics and interactive techniques, 1987, pp. 25-34.
[79] V. Gazi and K. M. Passino, "Stability analysis of swarms," Automatic Control, IEEE Transactions on, vol. 48, pp. 692-697, 2003.
[80] C. M. Saaj, et al., "Spacecraft Swarm Navigation and Control Using Artificial Potential Field and Sliding Mode Control," in Industrial Technology, 2006. ICIT 2006. IEEE International Conference on, 2006, pp. 2646-2651.
[81] N. E. Leonard and E. Fiorelli, "Virtual leaders, artificial potentials and coordinated control of groups," in Decision and Control, 2001. Proceedings of the 40th IEEE Conference on, 2001, pp. 2968-2973 vol.3.


[82] R. Olfati-Saber, "Flocking for multi-agent dynamic systems: algorithms and theory," Automatic Control, IEEE Transactions on, vol. 51, pp. 401-420, 2006.
[83] D. J. Bennet, et al., "Autonomous Three-Dimensional Formation Flight for a Swarm of Unmanned Aerial Vehicles," Journal of Guidance, Control, and Dynamics, vol. 34, pp. 1899-1908, 2011.
[84] D. Izzo and L. Pettazzi, "Autonomous and Distributed Motion Planning for Satellite Swarm," Journal of Guidance, Control, and Dynamics, vol. 30, 2007.
[85] S. Nag and L. Summerer, "Behaviour based, autonomous and distributed scatter manoeuvres for satellite swarms," Acta Astronautica, 2012.
[86] R. Brooks, "New Approaches to Robotics," Science, vol. 253, pp. 1227-1232, 1991.
[87] R. A. Brooks, "Intelligence without representation," Artificial Intelligence, vol. 47, pp. 139-159, 1991.
[88] D. J. Nowak and G. B. Lamont, "Autonomous Self Organized UAV Swarm Systems," in Aerospace and Electronics Conference, 2008. NAECON 2008. IEEE National, 2008, pp. 183-189.
[89] D. R. Frelinger, et al., Proliferated Autonomous Weapons: An Example of Cooperative Behavior. Santa Monica, CA: RAND Corporation, 1998.
[90] P. Gaudiano, et al., "Control of UAV Swarms: What The Bugs Can Teach Us," presented at the 2nd AIAA "Unmanned Unlimited" Systems, Technologies, and Operations, San Diego, California, 2003.
[91] P. DeLima and D. Pack, "Maximizing Search Coverage Using Future Path Projection for Cooperative Multiple UAVs with Limited Communication Ranges," in Optimization and Cooperative Control Strategies, vol. 381, M. Hirsch, et al., Eds. Springer Berlin Heidelberg, 2009, pp. 103-117.
[92] D. J. Pack and G. W. P. York, "Developing a Control Architecture for Multiple Unmanned Aerial Vehicles to Search and Localize RF Time-Varying Mobile Targets: Part I," in Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on, 2005, pp. 3954-3959.
[93] D. J. Pack, et al., "Cooperative Control of UAVs for Localization of Intermittently Emitting Mobile Targets," Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol. 39, pp. 959-970, 2009.
[94] X. Rong Li and V. P. Jilkov, "Survey of maneuvering target tracking. Part I. Dynamic models," Aerospace and Electronic Systems, IEEE Transactions on, vol. 39, pp. 1333-1364, 2003.


[95] Y. Bar-Shalom, et al., Estimation with Applications to Tracking and Navigation: Wiley-Interscience, 2001.
[96] Y. Bar-Shalom, Multitarget/Multisensor Tracking: Applications and Advances, vol. 3: Artech Print on Demand, 2000.
[97] C. Schumacher, "Ground moving target engagement by cooperative UAVs," in American Control Conference, 2005. Proceedings of the 2005, 2005, pp. 4502-4505.
[98] J. Crassidis, "Sequential State Estimation," in Optimal Estimation of Dynamic Systems: Chapman and Hall/CRC, 2004, pp. 243-342.
[99] A. G. O. Mutambara, "Information based estimation for both linear and nonlinear systems," in American Control Conference, 1999. Proceedings of the 1999, 1999, pp. 1329-1333.
[100] S. Garnier, et al., "The biological principles of swarm intelligence," Swarm Intelligence, vol. 1, pp. 3-31, 2007.
[101] E. Rimon and D. E. Koditschek, "Exact robot navigation using artificial potential functions," Robotics and Automation, IEEE Transactions on, vol. 8, pp. 501-518, 1992.
[102] D. A. Lawrence, "Lyapunov Vector Fields for UAV Flock Coordination," in 2nd AIAA "Unmanned Unlimited" Conf. and Workshop & Exhibit, American Institute of Aeronautics and Astronautics, 2003.
[103] D. R. Nelson, et al., "Vector field path following for small unmanned air vehicles," in American Control Conference, 2006, 2006.
[104] E. W. Frew, et al., "Coordinated Standoff Tracking of Moving Targets Using Lyapunov Guidance Vector Fields," Journal of Guidance, Control, and Dynamics, vol. 31, 2008.
[105] G. Gu, et al., "Optimal Cooperative Sensing using a Team of UAVs," Aerospace and Electronic Systems, IEEE Transactions on, vol. 42, pp. 1446-1458, 2006.
[106] T. Shima, et al., "UAV team decision and control using efficient collaborative estimation," in American Control Conference, 2005. Proceedings of the 2005, 2005, pp. 4107-4112.
[107] T. Shima, et al., "Decentralized Estimation for Cooperative Phantom Track Generation," in Cooperative Systems, vol. 588, D. Grundel, et al., Eds. Springer Berlin Heidelberg, 2007, pp. 339-350.
[108] D. Izzo and L. Pettazzi, "Autonomous and Distributed Motion Planning for Satellite Swarm," Journal of Guidance, Control, and Dynamics, vol. 30, 2007.
[109] C. Sabol, et al., "Satellite Formation Flying Design and Evolution," Journal of Spacecraft and Rockets, vol. 38, 2001.


[110] H. Schaub and J. L. Junkins, Analytical Mechanics of Space Systems: American Institute of Aeronautics and Astronautics, 2003.
[111] H. Curtis, Orbital Mechanics: For Engineering Students, 1st ed.: Butterworth-Heinemann, 2005.
[112] S. R. Vadali, et al., "An intelligent control concept for formation flying satellites," International Journal of Robust and Nonlinear Control, vol. 12, 2002.
[113] G. Xu and D. Wang, "Nonlinear Dynamic Equations of Satellite Relative Motion Around an Oblate Earth," Journal of Guidance, Control, and Dynamics, vol. 31, 2008.
[114] S. R. Vadali, "Model for Linearized Satellite Relative Motion About a J2-Perturbed Mean Circular Orbit," Journal of Guidance, Control, and Dynamics, vol. 32, 2009.
[115] M. Tillerson and J. P. How, "Advanced Guidance Algorithms for Spacecraft Formation-keeping," in American Control Conference, Anchorage, AK, 2002.
[116] C. C. Cheah, et al., "Region-based shape control for a swarm of robots," Automatica, vol. 45, pp. 2406-2411, 2009.
[117] W. S. Levine, The Control Handbook: Control System Applications: CRC Press, 2010.
[118] C. Robert, "Autonomous path planning for on-orbit servicing vehicles," Journal of the British Interplanetary Society, vol. 53, pp. 26-38, 2000.
[119] S. J. Rasmussen, et al., "Introduction to the MultiUAV2 simulation and its application to cooperative control research," in American Control Conference, 2005. Proceedings of the 2005, 2005, pp. 4490-4501.
[120] A. G. Richards, "Trajectory optimization using Mixed-Integer Linear Programming," Master of Science thesis, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, 2002.
[121] A. N. Venkat, et al., "Distributed MPC Strategies With Application to Power System Automatic Generation Control," Control Systems Technology, IEEE Transactions on, vol. 16, pp. 1192-1206, 2008.
[122] L. Giovanini, "Game approach to distributed model predictive control," Control Theory & Applications, IET, vol. 5, pp. 1729-1739, 2011.
[123] A. Robertson, et al., "Spacecraft formation flying control design for the Orion mission," in AIAA Guidance, Navigation, and Control Conference and Exhibit, Portland, OR, 1999.
[124] K. Miettinen, Nonlinear Multiobjective Optimization, 1st ed.: Springer, 1998.
