
M.Sc. Thesis

School of Aerospace, Transport and Manufacturing

Simulation of a Swarm Formed by Unmanned Aerial Vehicles in a Combat Scenario

Supervisors: Hyo-Sang Shin & Antonios Tsourdos


Student: Francisco de Borja Serra Planelles
f.serra-planelles@cranfield.ac.uk

MSc in Autonomous Vehicle Dynamics & Control

Cranfield University - 9th December 2016


M.Sc. Autonomous Vehicle Control & Dynamics Cranfield University

Abstract
In this thesis, a global analysis of the modern air-combat battlefield is made to
study the implementation of fleets of unmanned aerial vehicles acting as a swarm to
confront other aircraft. Previous approaches tackle the application of UAV swarms to
environments less demanding than air combat, such as reconnaissance or search and
rescue. Hence, a task allocation algorithm is introduced into this context and simulated
under the desired circumstances, aiming to embrace new combat tactics and procedures.
The main inspiration is the behaviour of insects such as Temnothorax albipennis, used
to develop a stochastic-policy-based algorithm. Simulations are run to test the viability
of this proposal.

Keywords: UAV · Swarm · Task Allocation · Stochastic Policies · Air Combat.

2015 - 2016 ii

Acknowledgements
First, I would like to thank BAE Systems for the opportunity to work in such an
interesting field as the design of artificial intelligence algorithms for fleets of unmanned
vehicles. I would also like to thank my supervisor, Dr. Hyo-Sang Shin, for his help
and supervision during my research. Finally, I would like to acknowledge my family's
and friends' support during my stay at Cranfield University.


Contents

Abstract. ii

Acknowledgements. iv

1 Introduction. 1
1.1 Context. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Aims & Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Contribution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

2 Literature Review. 2
2.1 Definition of Modern Air-Combat Scenario. . . . . . . . . . . . . . . 2
2.1.1 Beyond-Visual-Range Engagements. . . . . . . . . . . . . . . 2
2.1.2 Within-Visual-Range Engagements. . . . . . . . . . . . . . . 2
2.2 Modern Aircraft Evasion Tactics. . . . . . . . . . . . . . . . . . . . . 3
2.2.1 Defense against short-range, infrared-guided missiles. . . . . 4
2.3 Unmanned Aerial Vehicles as Actors in Aerial Combat. . . . . . . . 6
2.3.1 High maneuvering capability. . . . . . . . . . . . . . . . . . . 6
2.3.2 Detecting Equipment. . . . . . . . . . . . . . . . . . . . . . . 8
2.3.3 How key air combat needs affect unmanned aerial vehicles
design and behaviour. . . . . . . . . . . . . . . . . . . . . . . 12
2.3.4 Unmanned Aerial Vehicle issues when considered for the mod-
ern battlefield. . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.5 Summary of Possible Unmanned Aircraft Candidate for an
Air-Combat Scenario. . . . . . . . . . . . . . . . . . . . . . . 16

3 Introduction to Unmanned Aerial Vehicles. 17


3.1 UAV Swarm Characterization. . . . . . . . . . . . . . . . . . . . . . 17
3.2 Task Allocation in UAV Swarms. . . . . . . . . . . . . . . . . . . . . 18
3.3 Biologically inspired task allocation based on stochastic policies. . . 19

4 Battlefield Actors. 20
4.1 Engagement Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.2 Graphical Representation. . . . . . . . . . . . . . . . . . . . . . . . . 21

5 Decision Making, Navigation, & Control. 22


5.1 Decision Making. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2 Navigation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.2.1 Algorithm based on Potential Fields. . . . . . . . . . . . . . . 23
5.3 Control. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

6 Task Allocation. 25
6.1 Task Allocation Problem Definition. . . . . . . . . . . . . . . . . . . 25
6.2 Optimized Stochastic Policies for Task Allocation in Swarms of Robots. 25
6.2.1 Definitions and Assumptions. . . . . . . . . . . . . . . . . . . 25
6.2.2 Base model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
6.2.3 Extended Base Model: Including Quorum Concept. . . . . . 27
6.2.4 Agent Implementation. . . . . . . . . . . . . . . . . . . . . . 28


7 Simulation Analysis. 29
7.1 First Case: Different simulation duration. . . . . . . . . . . . . . . . 29
7.2 Second Case: Different matrix K values. . . . . . . . . . . . . . . . . 30
7.3 Third Case: Different number of swarm agents. . . . . . . . . . . . . 32

8 Conclusions & Further Work. 34

References 35




List of Figures
1 Boyd’s OODA loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 BAE Taranis, and Eurofighter Typhoon at the back. . . . . . . . . . 7
3 Eurofighter Typhoon radar: Euroradar Captor-E. . . . . . . . . . . . 8
4 Fighter F-22 Raptor and Bomber F-117 Nighthawk shape comparison. 9
5 Radio signal reflection comparative. . . . . . . . . . . . . . . . . . . 10
6 Attacker/Tactical Bomber Lockheed F-117 Nighthawk. . . . . . . . . 10
7 Tactical Bomber Northrop Grumman B-2 Spirit. . . . . . . . . . . . 10
8 Lockheed RQ-170 Sentinel. . . . . . . . . . . . . . . . . . . . . . . . 14
9 Graphic representation of the simulation after 10 seconds. . . . . . . 21
10 Graphic representation of the simulation after 100 seconds. . . . . . 21
11 Decision Making Flow Chart. . . . . . . . . . . . . . . . . . . . . . . 22
12 Potential Fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
13 Kinematics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
14 Strongly connected graph. . . . . . . . . . . . . . . . . . . . . . . . . 26
15 Enemies killed with configuration 1. . . . . . . . . . . . . . . . . . . 29
16 Swarm agents survived with configuration 1. . . . . . . . . . . . . . . 30
17 Enemies killed with configuration 2. . . . . . . . . . . . . . . . . . . 31
18 Swarm agents survived with configuration 2. . . . . . . . . . . . . . . 31
19 Enemies killed with configuration 3. . . . . . . . . . . . . . . . . . . 32
20 Swarm agents survived with configuration 3. . . . . . . . . . . . . . . 33


List of Tables
1 Aircraft comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 Department of Defense Selected Acquisition Reports of some UAV
and manned aircraft models. . . . . . . . . . . . . . . . . . . . . . . 13
3 Agents involved in simulations. . . . . . . . . . . . . . . . . . . . . . 20
4 Agents’ representation. . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5 Simulation 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
6 Simulation 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
7 Simulation 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32


1 Introduction.
1.1 Context.
The development of intelligent unmanned systems to be implemented in aircraft
presents great potential in terms of efficiency and cost reduction. Unmanned air
vehicle design tends towards specialisation, which means that decision-making
and task-allocation algorithms will need to produce cooperative behaviours in order
to achieve such an improvement in performance. Swarm systems present high
robustness against the loss of units, because the behaviour that accomplishes the
system objective arises from a collective perspective. Moreover, the conduct of a
single agent may make no sense in isolation, but as part of a larger group it produces
the desired result.

Another important asset of swarm behaviours based on stochastic policies is the
simplicity of the mechanism behind them. Their stochastic nature makes the whole
system unpredictable, hence it is impossible to establish its evolution in a deterministic
way. Consequently, the system becomes almost completely decentralised, improving
scalability.

1.2 Aims & Objectives.


Throughout this thesis, a study of the modern air battlefield is carried out to define
the features to implement in a swarm fleet and to determine the simulation approach
used to test the bio-inspired system defined later. The simulation of a stochastic
policy-based system will be tested under the circumstances of an air-combat scenario.
Adaptation of previous approaches related to less demanding models will be the key
to implementing a successful procedure that achieves acceptable results. Finally,
statistical analysis of the results will show whether the system defined accomplishes
its original purpose: whether the proposed swarm fleet based on stochastic rates is
successful, or whether it needs more development.

1.3 Contribution.
This thesis proposes a behaviour model to characterise a swarm of unmanned air
vehicles with attack purposes. The core is a task allocation algorithm based on
stochastic policies. Previous models tackle problems such as static scenarios where
the number of tasks does not change. The author develops a new procedure to
confront a dynamic situation where the number of tasks changes over time. Moreover,
the development includes decision-making and basic control features for each unit.
The full model arises from the study of the possible needs of today's UAVs when
introduced into air combat.


2 Literature Review.
2.1 Definition of Modern Air-Combat Scenario.
The aim of this work is to study new behaviour approaches for fleets of UAVs facing
combat scenarios. Nowadays, air-to-air engagement procedures distinguish between
two main phases: Beyond-Visual-Range (BVR) and Within-Visual-Range (WVR)
engagements. This division of the air fight into two parts responds to the different
needs of each phase; the weapons and maneuvers involved differ considerably from
one to the other. Furthermore, traditional aircraft-based combat superiority rests on
three main features that define how combat develops. Assuming that pilots have the
maximum level of training possible for the aircraft they are flying, these features are
weapon capability, detection measures, and aircraft maneuvering capability. These
are the main bases of the modern basic fight principles defined by CNATRA [12].

Next, the two main types of air combat engagement will be described in general
terms to define the context of the scenario proposed by this thesis, including its
combat assumptions and suppositions. Moreover, the comparison between the
characteristics of some current UAV models and modern fighters will lead to the
definition of a possible combat or interception scenario that may include the use of
unmanned aircraft.

2.1.1 Beyond-Visual-Range Engagements.


Modern detection techniques make it possible to learn enemies' positions before visual
confirmation, radar being the main tool that allows it. Detection methods, along with
guided-missile technologies, allow pilots to attack enemies at this stage. An example
of this equipment is the AN/APG-68 radar from Northrop Grumman on the Lockheed
Martin F-16 fighter, with a detection range of up to 295 km. A common guided missile
used by Western fighters such as the Eurofighter Typhoon is the medium-to-long-range
guided missile AIM-120 AMRAAM from Hughes/Raytheon, with an attack range of
up to 25 km.

The process involving a long-range attack with guided missiles is divided into three
parts. The first step is detection using the radar. The second step is lock-up: target
tracking is fixed on the objective, allowing the last part. The pilot is then able to
shoot the projectile without needing to control the trajectory to the enemy aircraft;
the electronics on the missile, in communication with the main radar of the airplane,
predict the path to follow until the collision.

Despite the obvious benefits of this way of attacking in theory, tests and combat
experience have demonstrated that degrading factors such as weather, aggressive
maneuvering and electronic countermeasures, among others, cause hit probability to
decrease substantially [13]. The success values obtained from combat show that WVR
shots are much more effective.

2.1.2 Within-Visual-Range Engagements.


In close-quarters combat, the weapons used may differ from those of the BVR phase.
When acting at visual range, the gun, heat-guided missiles, radar-guided missiles
and rockets are the possible weapons to use. Heat-guided missiles like the AIM-9
Sidewinder from Raytheon are examples of short-range missiles; other alternatives are
rockets and machine-gun fire. As said before, when the distance is reduced, hit
probability increases and more weapon options become available.

The principal danger for pilots when closing the distance to enemies is obvious: the
closer the enemy is, the more probable it is that they will spot you with their radar,
or even that you come into their visual range.

2.2 Modern Aircraft Evasion Tactics.


As exposed above, the main weapons of modern fighter aircraft involved in air-
superiority engagements are the missiles they carry. Fighter pilots must defend
themselves from a variety of threats, but in most cases these menaces come down to
a missile attack. The use of on-board cannons occurs when there are no missiles left
to launch. The aggression process consists of three parts [14]:

1. Detection: The aircraft's radar signature appears on the counterpart's radar.

2. Lock-on & tracking: The aircraft's position is fixed by the radar.

3. Missile launch.

The following defensive procedures, defined and well explained in [14], try to
avoid the attack as early as possible by staying "invisible" to the enemy radar.
If the aircraft is detected and under missile fire, the goal is to get away from the
missile by combining maneuvering with obscuring the missile's electronic sensors
that are tracking the target. The main procedures are explained in the next sections.

Denying your location to the enemy.


The most useful defense is to never let the enemy spot you. Stealth technology is
the natural step beyond today's military aircraft design guidelines, but it is far from
being infallible. Fighters that include these design principles are the F-22 Raptor from
Lockheed Martin or the T-50 from Sukhoi. Because of the price and complexity of
developing this kind of solution to reduce radar cross-sections, other simpler
techniques are also used, such as approaching the enemy from a direction he is not
expecting. Low-altitude flight and the use of terrain masking give an advantage
against the enemy due to the inherent difficulty for the radar in look-down modes:
the aircraft's return must be separated from the ground clutter.

Aircraft are also equipped with electronic countermeasures (ECM), which include
radar jammers. Since jammers produce a radar signal, they highlight the aircraft's
position to enemy receivers. The purpose of this signal is to obstruct the enemy radar
when it locks onto and tracks your aircraft. Because of this, it is usual to avoid using
the jammer while concealing position from the enemy is the priority, but prudent to
begin using it once the aircraft is detected.

Early detection of long-range, radar-guided missiles.


Radar-guided missiles can be launched at targets from distances of 50 kilometers
and even further. These missiles can be passively guided, which means following a
radar signal, produced by the launch platform, that is reflected off the target aircraft.
Missiles can also be actively guided, using an on-board radar.

Radar warning receivers (RWR) are the main tool to detect radar-guided missiles.
They detect and classify incoming radar signals. When a target of interest is being
tracked, the opposing radar marks it with a pulsed signal. This is detected by the
RWR, which informs the pilot of the direction and distance of the signal's origin.
Then, if the enemy launches a passive radar-guided missile, the incoming signal will
switch from a pulsed to a continuous one. This change will be noticed by the RWR,
informing the pilot of an incoming missile, but it will not give any missile position,
since it is the enemy aircraft and not the missile that produces the signal.

For active radar guided missiles, the missile’s radar itself is detected by the RWR.
The RWR recognizes the radar’s waveform as that of a known kind of enemy missile,
and sounds the missile launch warning. In this case, the RWR can also plot the
estimated location of the missile.

Usually, long-range air-to-air missiles are semi-active radar-guided: the aircraft's
on-board radar signal is used by the missile until it is close enough to activate its
own radar. In this situation there is no warning of the missile until its own radar
activates, and by then the missile is already dangerously close. At that moment the
pilot must assume the worst, that he is being tracked by an incoming missile, and he
has to fly defensively. This leads to the following point.

Early defense against long-range, radar-guided missiles.


Radar-guided missiles can be affected by electronic countermeasures (jamming), as
well as by a simpler form of countermeasure called chaff. Chaff consists of ejecting
small metal strips that reflect the radar signal and obscure the missile's radar. In
addition, the pilot can "notch" the missile by positioning it directly at his 3 or 9
o'clock. At this relative position the aircraft's apparent velocity is zero, since its
motion is perpendicular to the missile. This can make the aircraft get "notched out"
by the terrain-rejection feature of the missile's radar, because terrain also has zero
apparent speed.
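The notch works because a pulse-Doppler radar measures only the radial component of the target's velocity, v·cos θ, which vanishes at a 90° aspect exactly as it does for the stationary terrain. A minimal sketch of that geometry follows (the function and the numbers are illustrative, not part of the thesis):

```python
import math

def radial_speed(true_speed, aspect_deg):
    """Speed component along the missile radar's line of sight.

    aspect_deg = 0 means flying straight at the radar; 90 means the
    radar sits at the aircraft's 3 or 9 o'clock (the notch position).
    """
    return true_speed * math.cos(math.radians(aspect_deg))

head_on = radial_speed(300.0, 0)    # full 300 m/s closure rate
notched = radial_speed(300.0, 90)   # essentially zero, like the terrain
```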

Another alternative consists of kinematically defeating the missile. The aim is to
force the missile to bleed off as much energy as possible through aggressive maneuvering.
The most effective movement is a drag. It is based on sharp 45° turns with the missile
located at 6 o'clock, forcing it to make energy-losing turns to keep tracking the target
aircraft. The disadvantage of this maneuver is that the plane's nose moves away
from the fight, provoking a totally defensive stance. The pilot can also combine
notching movements with climbs and descents that make the missile climb and
descend as well. This option is less effective than a drag, but it keeps the pilot's nose
closer to the fight, making him more offensive.

2.2.1 Defense against short-range, infrared-guided missiles.


In closer fights, the use of heat-seeking missiles arises. This kind of missile can be
detected by missile warning sensors (MWS): small cameras situated around the
aircraft that detect the smoke trace of enemy missiles and warn of the direction of
the threat. The answer can be to launch flares to spoof the missile's infrared sensors,
and/or to reduce the aircraft's own heat signature by powering back the engines.

Missile end-game defense.


If it is not possible to kinematically defeat the menace, the missile will close on the
aircraft until impact is imminent. At this moment the end-game defense begins.
Usually this consists of high-speed, high-g descending turns that allow the aircraft
to achieve the highest line-of-sight rate across the missile's field of view, forcing the
missile into high-g turns in order to keep its trajectory. The more energy the missile
bleeds off before the end-game, the less energy it will have available to make these
turns. Ideally the missile dynamics will end in one of two outcomes: either the missile
does not have enough energy to complete the turn and falls short of the aircraft, or
it travels so fast that it cannot turn sharply enough to stay with the aircraft in the
maneuver, finally overshooting and then destroying itself.

The jammer is not used during the end-game. The main reason is that the vast
majority of modern radar-guided missiles have a home-on-jam feature: if the missile
perceives that its own radar signal is being jammed, it will switch off its own radar
and simply home in on the jammer's signal to pursue the target. In this situation,
the jammer acts as a guide for the missile.

Guns defense.
Eventually, if the contenders run out of missiles, a dogfight may ensue. When an
aircraft becomes defensive in a dogfight, there are two options: transform the defensive
engagement into an offensive one, or try to run away.

A defensive engagement is always based on turning, as a straight-and-level aircraft
becomes an easy target. To run away, however, the pilot needs to fly straight to
achieve as much speed as possible. Thus, running away is a hazardous option: it needs
enough separation from the enemy to deny a missile shot right up the tailpipe while
fleeing, and enough of a speed advantage that the enemy cannot simply chase the
aircraft down. A fighter pilot is constantly evaluating this escape window.

The last option available to the pilot is to convert the engagement from defensive to
offensive. In a defensive situation, the defensive aircraft's nose faces away from the
enemy, while the enemy's nose points directly at the counterpart aircraft. Reaching
an offensive position involves "gaining angles" on the enemy, which requires out-turning
him. If the aircraft has better turn performance than the enemy, then it should be
possible to slowly build up an offensive advantage. It is assumed that the enemy will
try to deny this opportunity by performing the necessary maneuvers.


2.3 Unmanned Aerial Vehicles as Actors in Aerial Combat.
There is enough evidence to recognize that no single sensor or aerodynamic design
feature is capable of ensuring air dominance by itself. The effectiveness of an air-
superiority aircraft is based on the successful combination of maneuvering performance,
including wing loading and thrust-to-weight ratio, with avionics, electronic warfare
and weapons integration. Moreover, operational tactics and aircrew training must be
developed to exploit the complete potential of the weapon system. This last feature
is the key when UAVs come in: artificial intelligence has an obvious potential to
work out operational tactics faster and more efficiently than a human. Furthermore,
fleets of UAVs have the potential to redefine the tactics applied nowadays. The
current technological step is to jump from remotely piloted unmanned aircraft to
fully autonomous ones, capable of deploying on missions alone with no continuous
human supervision.

2.3.1 High maneuvering capability.


The fundamental principle of air combat is the Observation-Orientation-Decision-
Action cycle, or OODA loop, defined by USAF fighter pilot and military strategist
John Boyd. The first step consists of observation of the situation. Then the pilot
orients the aircraft based on the already-available data. In the following phase, he
decides what action should come next. Finally he acts and waits to observe the
counterpart's reaction. The key to victory is breaking the enemy's OODA loop, or
running one's own loop faster than him. Breaking the loop consists of denying vital
information. To achieve this, the aircraft includes a combination of detection sensors,
a special shape to reduce visual and radar signature and, at the same time, responsive
maneuvering capability, i.e. fast performance.

According to Boyd, decision-making occurs in a recurring cycle of observe-orient-
decide-act. An entity (whether an individual or an organization) that can process
this cycle quickly, observing and reacting to unfolding events more rapidly than
an opponent, can thereby "get inside" the opponent's decision cycle and gain the
advantage.

Figure 1: Boyd’s OODA loop.
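As a toy illustration only (none of this code appears in the thesis), the loop can be sketched as a fixed cycle of four phase handlers; whichever side completes more cycles per unit time is effectively operating inside the opponent's loop:

```python
# Illustrative sketch of Boyd's OODA loop. The four phase names come from
# the text; the handler mechanism itself is a hypothetical toy.
PHASES = ("observe", "orient", "decide", "act")

def run_ooda(state, handlers, cycles=1):
    """Thread `state` through the four phases, in order, `cycles` times."""
    for _ in range(cycles):
        for phase in PHASES:
            state = handlers[phase](state)
    return state

# Toy handlers that simply record which phase touched the state.
handlers = {p: (lambda name: lambda s: s + [name])(p) for p in PHASES}
trace = run_ooda([], handlers, cycles=2)  # two full decision cycles
```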


When dealing with modern fighters, three solutions are on the table to gain the
nose-pointing capability needed to outperform the adversary's OODA loop and achieve
combat advantage:

• Extremely high short-term sustained Angle of Attack values.

• High values of thrust-to-weight ratio.

• Thrust Vectoring.

Nowadays there are still no UAV models in service specifically developed for combat.
All the principal military aircraft producers are immersed in a technical race to satisfy
this need. Good examples of these programs are the BAE Systems Taranis, figure
2, or the Dassault nEUROn. The main mission of these programmes is not air
superiority; rather, their principal goals range from tactical attacks based on long
range and stealth to surveillance and reconnaissance. The problem for UAVs in
air combat is that, despite the advances in control techniques and flight designs,
UAV maneuvering performance is far from that of traditional fighters.

Figure 2: BAE Taranis, and Eurofighter Typhoon at the back.

In table 1, with information obtained from [15, 16, 17, 18], different kinds of UAV
models are compared with a modern fighter model and its future substitute. It
is almost impossible for civilian users to obtain the flight envelopes of these aircraft,
since they are designed for military purposes. Comparing those documents would be
the optimal way to establish a rigorous technical approach, but for the scope of this
thesis the information gathered is enough to form the big picture.

The large difference in thrust-to-weight ratio between UAVs and fighters is obvious.
Modern air-superiority aircraft are designed with a value of this ratio above one
(around 1.35 for the F35A, and 1.6 for the F16C/D). Values below one for the
unmanned vehicles give a general perspective of their considerably lower maneuvering
capability, even before wing loading, aerodynamic geometry and other aspects are
considered.
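The gap can be checked with a back-of-the-envelope computation from the table values. This is a hedged sketch: thrust figures are converted from lbf and weights are taken at MTOW, which is why the manned fighters come out below the ~1.35 and ~1.6 values quoted above for lighter combat weights.

```python
# Thrust-to-weight at MTOW, from the Table 1 figures (illustrative only;
# the 1.35/1.6 values quoted in the text refer to lighter combat weights).
LBF_TO_N = 4.44822   # pounds-force to newtons
G = 9.80665          # standard gravity, m/s^2

def thrust_to_weight(thrust_lbf, mtow_kg):
    return (thrust_lbf * LBF_TO_N) / (mtow_kg * G)

ratios = {
    name: round(thrust_to_weight(t_lbf, m_kg), 2)
    for name, (t_lbf, m_kg) in {
        "Taranis": (6_480, 8_000),
        "F35A":    (43_000, 31_750),
        "F16C/D":  (27_000, 16_875),
    }.items()
}
# Even at MTOW, the manned fighters clearly out-rank the combat UAV.
```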


2.3.2 Detecting Equipment.


The main technology used by modern war aircraft is onboard radars. The ones on
actual fighters are from the kind active electronically scanned array (AESA), also
known as active phased array radar (APAR). They are a type of phased array radar
formed by numerous solid-state transmit/receive modules that do trans-receiver
functions. AESA radars focus the signal produced by emitting detached waves
from everyone of the modules that conform the radar. These signals are added up
constructively at certain angles. Furthermore, by spreading the emissions along a
frequency band instead of just one value, these signals become harder to detect over
the background noise. This feature makes moder fighters equipped with this type or
radar elements virtually stealthy when broadcasting radar signals.

Figure 3: Eurofighter Typhoon radar: Euroradar Captor-E.

Aircraft radars usually have two modes: search and track. In search mode, the radar
produces signals in a zig-zag pattern; when these signals are reflected by possible
targets, an indication appears on the radar display. In this mode, the possible targets
are not tracked. When the pilot wants to lock onto an aircraft, the radar switches
to track mode and aims more energy at the particular target. Because of this, the
pilot obtains more information about the aimed-at aircraft, but at the cost of
information about other targets in the area.

An advanced characteristic of modern radars is the set of situational awareness modes
(SAM). A radar configured in such a mode combines tracking and scanning, permitting
it to follow one or a small number of crucial targets without losing information about
the other objectives: the radar signal sweeps the sky and, at the same time, briefly
and regularly stops scanning to control the locked targets. When considering
heat-seeking missiles, radar is not strictly necessary; locking is used in this case to
direct the seeker head to the target. Even if the seeker is not locked, the missile can
be launched and will scan the sky for heat sources. The radar just decreases the time
needed for a launch.
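The attention trade-off between these modes can be caricatured in a few lines of code. The numbers are entirely hypothetical (real radar time-sharing is far more involved); only the mode names come from the text above:

```python
# Toy time-budget model of the radar modes described above. The 10% revisit
# share per locked target in SAM mode is a made-up illustrative figure.
def attention(mode, locked_targets):
    """Return (scan_fraction, per_locked_target_fraction) of radar time."""
    if mode == "SEARCH":
        return 1.0, 0.0                      # sweep everything, track nothing
    if mode == "TRACK":
        return 0.0, 1.0 / max(1, len(locked_targets))  # all energy on locks
    if mode == "SAM":
        revisit = 0.1 * len(locked_targets)  # brief, regular pauses per lock
        return 1.0 - revisit, 0.1
    raise ValueError(f"unknown mode: {mode}")

scan_share, per_lock = attention("SAM", ["bandit-1", "bandit-2"])
```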

Radar-guided missiles are divided into two types: active and passive. Active radar
missiles have their own on-board radar. It is a one-way-signal radar, hence the missile
needs assistance from the launching aircraft until it is close enough to lock the target
actively by itself; however, such missiles can be fired with no radar lock. Passive
radar missiles, on the other hand, need the aircraft's signal at all times, since they
locate the target by following the signal reflected off it.


Even though an aircraft can only scan a region corresponding to a cone in front of it,
with the apex at the radar, it can detect incoming signals from any direction: a
digital processor listens for distinctive signals from the sources around the aircraft
and displays their azimuth angle.

To avoid radar detection, the most modern generations of fighters, like the Lockheed
Martin F-35 Lightning II or the F-22 Raptor, include stealth technology. This consists
of a combination of technologies that try to reduce the distance at which the aircraft
can be detected. The most important work goes in the direction of radar cross-section
(RCS) reduction, but acoustic, thermal and other aspects are also addressed.

Figure 4: Fighter F-22 Raptor and Bomber F-117 Nighthawk shape comparison.

Large radar cross-section reductions occur when most of the incoming radar signals
are absorbed instead of being reflected. The best way to reflect as little as possible is
to avoid forming orthogonal surfaces. These usually appear at the tail in traditional
aircraft configurations, where the vertical and horizontal parts are set at right angles
(figure 5). Stealth designs such as the tactical bomber Lockheed F-117 Nighthawk tilt
the tail surfaces to avoid corner reflections as much as possible (figures 6 and 4). The
most radical designs, such as the Northrop Grumman B-2 Spirit or the BAE Systems
Taranis, have no tail at all (figure 7). These kinds of shapes present excellent low-drag
properties and greatly reduce the aircraft's radar profile. They try to resemble the
so-called flat plate, which is the most efficient profile in terms of RCS reduction, as
there are no angles to reflect radar signals.

Another design principle is the parallel alignment of edges or even surfaces. For
example, on the F-22 Raptor the leading edges of the tail and wing lie at the same
angle. This returns a narrow signal in a specific direction instead of a diffuse signal
detectable at many angles. But all these concessions in the aircraft's shape have a
drawback: its aerodynamic properties are seriously compromised. Such aircraft are
usually inherently unstable and therefore cannot be flown without a fly-by-wire
control system.


Figure 5: Radio signal reflection comparative.

Figure 6: Attacker/Tactical Bomber Lockheed F-117 Nighthawk.

Figure 7: Tactical Bomber Northrop Grumman B-2 Spirit.

Manufacturer | General Atomics | Northrop Grumman | BAE Systems | Lockheed Martin | Lockheed Martin
Model | MQ-9 Reaper | RQ-4 Global Hawk | Taranis | F35A CTOL | F16C/D block 50-52
Thrust [lbf] | 900 [hp] | 7,600 | 6,480 | 43,000 | 27,000
Weight [kg] | 2,223 | 14,628 (MTOW) | 8,000 (MTOW) | 31,750 (MTOW) | 16,875 (MTOW)
Wingspan [m] | 20.1 | 39.8 | 9.94 | 10.7 | 9.8
Length [m] | 11 | 14.5 | 11.35 | 15.7 | 14.8
Height [m] | 3.8 | 4.7 | 4 | 4.38 | 4.8
Range [km] | 1,852 | 22,780 | - | 2,200 | 3,200
Speed [kn] | 200 (cruise) | 310 | High subsonic | Mach 1.6 | Mach 2
Ceiling [m] | 15,240 | 18,288 | - | 15,000 | 15,240
Payload [kg] | 1,701 | 1,360 | - | 8,160 | 7,800
Armament | Air-to-ground missiles | None | - | Multi-role | Multi-role
Unit cost | $64.2 million | $168 million | - | $103 million | $18.8 million
Flying cost | $3,600 per hour | - | - | - | $20,000 per hour
Operating start [year] | 2007 | 2011 (Block 30) | Development | Testing phase | 1994

Table 1: Aircraft comparison.
M.Sc. Autonomous Vehicle Control & Dynamics Cranfield University

2.3.3 How key air combat needs affect unmanned aerial vehicle design
and behaviour.
From the information gathered in table 1, some important facts arise. Current
attack UAV models such as the MQ-9 Reaper are much cheaper in flying costs than the
multi-role fighters that carry out the same kind of missions in certain scenarios,
such as air-to-ground strikes. At present, the unit cost comparison falls on one side
or the other depending on the fighter model studied, the type of missions flown, and
the logistic and personnel costs related to the deployment of these units. For
example, the Lockheed F-16 is much cheaper per unit nowadays because the programme
has been in production and improvement since the 1970s. Furthermore, it is the most
exported fighter model according to the manufacturer, so its production costs are
very low.

However, the USAF is currently replacing the F-16 with the F-35, which is
considerably more expensive at all levels. Cost reduction is hard to predict,
especially with the appearance of UAVs that are taking over, in a more efficient way,
the traditional missions these kinds of aircraft perform.

The principal advantages of using unmanned aerial vehicles on the battlefield are
the following:

Costs.
UAVs may be much cheaper than fighters. This obviously depends on the models being
compared, but the lack of a human on board usually allows the design to omit elements
that increase prices: the cockpit, and pilot expenses such as salary and training.
Also, as they are remotely controlled, operators can take turns, extending mission
length. This is notable when looking at the RQ-4 Global Hawk's range, table 1.

Probably the most significant factor in the expansion of UAV programmes is the
cost advantage this technology presents over traditional aircraft and related
operations. UAVs are unmanned, but the operations that involve them still need
operators, maintenance teams, and large networks of equipment and personnel to
guarantee the intelligence and legal procedures required to run missions. Opponents
of UAV use argue that this need for crews, together with the significantly higher
accident rate, makes unmanned air vehicles more expensive to governments in the long
term than conventional air combat assets.

From a global perspective, UAVs are marginally cheaper to buy and operate than
manned aircraft. But something often goes unnoticed: the operational advantage on
top of cost effectiveness. The usefulness of unmanned vehicles in sensitive foreign
operations outweighs the possible threat posed by higher accident rates and growing
counter-reaction in target environments. In table 2, a general comparison of some
UAV and manned aircraft purchases, obtained from Selected Acquisition Reports of the
US Department of Defense, shows that UAV programmes are usually less expensive to buy
and operate than traditional fighter aircraft.


| Model | Units per SAR | APUC ($M) [1] | Cost per aircraft ($M) [2] | O&S-AAC per aircraft ($K) [3][4] | O&S-AAC per flight hr ($K) | Base year [5] |
UAVs:
| MQ-1 Predator | 4 | N/A | N/A | 1,210.0 [6] | 1.32 | 2010 |
| MQ-1C Gray Eagle | 4 | 106.49 | 26.62 | 7,960.0 | N/A | 2010 |
| MQ-9 Reaper | 4 | 25.93 | 6.48 | 2,988.0 | 3.25 | 2008 |
| RQ-4 Global Hawk | 1 | 103.04 | 103.04 | 15,591.10 | 31.12 | 2000 |
Fighters:
| F-15C | 1 | N/A | N/A | 7,681.11 [7] | 25.69 | N/A |
| F-16C/D | 1 | N/A | N/A | 4,039.80 [8] | 13.47 | N/A |
| F-22 | 1 | 185.73 | 185.73 | 11,256 [9] | 11.26 | 2005 |
| F-35 | 1 | 90.77 | 90.77 | 4,927.5 [10] | 16.43 | 2002 |

(Data is for DoD assets for use in overt missions only. Data current as of December 2011, except for the F-15 and F-22 SARs, which are current as of December 2010.)
[1] APUC is the Average Procurement Unit Cost, denoted in millions of dollars.
[2] Cost per Aircraft is the APUC divided by the number of aircraft (if multiple) comprising one unit, denoted in millions of dollars.
[3] O&S Average Annual Cost per Aircraft is the cost of Operations and Support per Unit, including Unit-level Manpower, Unit Operations, Maintenance, Sustaining Support, Continuing System Improvements, Indirect Support, and Other, denoted here in thousands of dollars and adjusted to reflect costs on a per-aircraft basis.
[4] O&S per Aircraft is the preceding value divided by the number of aircraft (if multiple) comprising one unit, denoted in thousands of dollars.
[5] Base year is the initial year in which the acquisition was valued. Variations due to different cost base years should be taken into consideration.
[6] Cost obtained from the MQ-9 Reaper SAR. The MQ-1 Predator is antecedent to the MQ-9 Reaper.
[7] Cost obtained from the F-22 SAR, updated to base year 2010, and provided per squadron of 18 aircraft. This value represents the per-aircraft cost, obtained by dividing the total annual cost per squadron of $141.5 million by 18 aircraft, denoted in thousands of dollars.
[8] Cost obtained from the F-35 SAR. The F-16C/D is antecedent to the F-35.
[9] Cost is provided per squadron of 18 aircraft. This value represents the per-aircraft cost, obtained by dividing the total annual cost per squadron of $202.6 million by 18 aircraft, denoted in thousands of dollars.
[10] Cost calculated based on an estimated average of 300 flight hours per year, denoted in thousands of dollars.

Table 2: Department of Defense Selected Acquisition Reports of some UAV
and manned aircraft models.
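The per-aircraft derivations in notes [9] and [10] can be checked with a couple of lines (using only the squadron size and flight-hour figures quoted in the notes):

```python
# Sanity-check the per-aircraft O&S figures quoted in the table notes.

# [9] F-22: $202.6M annual O&S per 18-aircraft squadron, spread per aircraft.
f22_per_aircraft_k = 202.6e6 / 18 / 1e3      # thousands of dollars
print(round(f22_per_aircraft_k))             # 11256, matching the table

# [10] F-35: $4,927.5K annual O&S per aircraft over ~300 flight hours.
f35_per_hour_k = 4927.5 / 300                # 16.425, quoted as 16.43 $K/hr
print(f35_per_hour_k)
```

The F-15C figure in note [7] does not reproduce exactly from the $141.5M/18 division quoted there, which may reflect a rounding or transcription issue in the source report.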

It is important to note that the F-16 and F-15 programmes can no longer be
purchased, so their acquisition costs are of limited relevance when compared with
UAVs. Their operating costs are still important, however, since they are the main
USAF fighter and strike models in combat zones nowadays.

Regarding staff requirements, traditional aircraft need one pilot and a weapons
operator, or even just a pilot depending on the model. Typical estimates for UAVs
suggest that around 80 people are required to operate one MQ-1 Predator unit from
beginning to end.

It is a fact that UAVs currently crash more often than fighters. The US
Congressional Budget Office states that the Predator's mishap rate has improved from
28 crashes per 100,000 flying hours to 7.6. But these values are still far from those
presented by the F-15: in 2011 that fighter had a mishap rate of 2.36 per 100,000
flight hours. The USAF counted total losses from Predator crashes in 2011 of $48
million, occurring mainly overseas. This amount is less than the figure for the F-16
in the same period, $57.3 million, with most of those crashes at home.

While the monetary losses arising from crashes might be just a small part of the
whole budget, security issues are critical. In 2011, the mishap of a Lockheed RQ-170
Sentinel UAV, figure 8, in Iran showed that this kind of accident risks compromising
missions and security, since the enemy can capture sensitive technological material
that may later be sold to hostile countries.

There will always be variations in military technology cost assessments; however,
it seems that the most common UAV programmes present a cost advantage over fighters.
Nevertheless, the strategic advantages unmanned vehicles contribute should be
evaluated in equal measure. While manned aircraft are dynamically more capable and
versatile, UAVs offer various security and strategic advantages that fighters do not
currently provide.

By their nature, UAVs can enter dangerous regions without risking pilots' lives.
Moreover, they are not constrained by operators' shift schedules or pilot endurance.
They gather more intelligence and reconnaissance data than any manned aircraft, and
can even attack selected targets. It is very hard to quantify these advantages, or
the negative reaction from local civilians and foreign countries when UAVs are
deployed. There are still no overall objective measurements to evaluate the strategic
advantages and possible inconveniences of using unmanned aerial vehicles in fighters'
typical missions. This type of information is what is most lacking, and where efforts
should go to better implement UAV programmes.

Figure 8: Lockheed RQ-170 Sentinel.

Swarm capability & Potential Air Superiority.


This concept is tied to the relatively low costs compared with traditional
fighters. In a combat situation, deploying a number of UAVs that largely outnumbers
the enemy aircraft squadrons is a significant advantage. Exceeding the target
capacity of the enemy aircraft's radars and available weapons is a considerable
benefit: even the most sophisticated detection systems would be overwhelmed, as there
would be too many possible targets for them to engage with a reasonable probability
of success. Moreover, swarm combat at a scale of, for example, 500 versus 500 units
implies that in less than a second a single aircraft may be engaged by tens of enemy
units as it flies through, each trying to fire when an available firing solution and
the current scheme of maneuver match within milliseconds.

Autonomous capability and integrated artificial intelligence give UAV groups a new
range of capabilities. John Boyd stated that an aircraft could gain the advantage in
air combat by being able to rapidly change speed, angle of attack, and position.
Boyd's studies produced some of the fighter design principles applied in modern
models like the Lockheed F-16. But this approach assumes that a human operator is
inside the Observe-Orient-Decide-Act (OODA) cycle. For UAVs, the OODA loop is
executed by software, so the speed of execution is simply not comparable.

Moreover, in warfare, enemy operating systems will become important targets of
intelligence collection. Confronting an unknown scheme of maneuver, vulnerability
data will begin to roll in within seconds of engagement, at which point human or
machine analysis will be able to evaluate potential platform and software weaknesses.
At that point, we will need the ability to reprogram our swarms on the fly to
capitalise on enemy weaknesses: an OODA loop at the operating system level.

2.3.4 Unmanned Aerial Vehicle issues when considered for the modern
battlefield.
The advantages of introducing unmanned aerial vehicles into missions traditionally
assigned to manned aircraft can at the same time be disadvantages. The reason is the
current state of the technology that is supposed to meet the needs of today's air
combat, air-to-ground attack, reconnaissance, and so on.

Current UAV control loops do not match a human's reaction speed or range of
responses. This point is crucial in air combat, as fast maneuvering and
counter-action are the key factors in gaining advantage within the OODA loop. Solving
this speed problem may take considerable time. Nevertheless, the greatest potential
of unmanned aerial vehicles is not to be used deterministically like manned aircraft,
but as a swarm. This means sacrificing individual performance while improving the
overall result by putting a large number of individuals to work together. However,
the artificial intelligence algorithms needed to achieve swarm behaviours are still
in development. Efficient swarming translates into an algorithm that guides the group
rather than the individual, letting each unit decide by itself, as opposed to central
command.

Hence, UAVs are already prepared to carry out missions such as tactical
bombardment or reconnaissance and surveillance, which require neither fast dynamics
nor split-second decisions. But large improvements are needed before they can operate
in missions where transient phases are reduced to seconds or less.

Furthermore, the way current UAVs are controlled, not fully automated but remotely
piloted, degrades the OODA loop. The lack of the pilot's situational awareness makes
the OODA cycle slower and less efficient. Off-board operators are not prepared to
react to unexpected circumstances as pilots are. And there is another issue: the
possibility of communication problems.

2.3.5 Summary of a Possible Unmanned Aircraft Candidate for an
Air-Combat Scenario.
Summarizing, the features the proposed unmanned vehicle should have are:

• Stealth.

• Detection range.

• Performance.

The proposed model would be something similar to the BAE Systems Taranis. The size
and shape should be as small as possible to avoid detection. Given the inherently
better performance of a manned aircraft, due to the pilot's faster OODA loop compared
with a modern autonomous vehicle and a fighter's much better dynamics compared with
UAVs, the best engagement procedure for an unmanned air vehicle attacking an aircraft
would be to approach the target as closely as possible without being detected and
then launch a missile or attempt a direct hit with the vehicle itself.

The model should present reliable performance, with a minimum maneuvering
capability: at least enough acceleration to be able to hit the target once the
distance is so short that detection is certain. Moreover, the UAV should be fully
autonomous, eliminating the communication delay between the aircraft and the control
station and the operator's limited sensor awareness, thereby improving the UAV's
OODA loop. The main point is clear: in a one-versus-one engagement the UAV stands no
chance, but narrowing the performance gap as much as possible significantly increases
the probability of a kill. The key factor, however, is the number of UAV agents
forming a swarm; a reasonable number would be around five units per enemy.

Moreover, communications within a fleet of at least 20 units become an important
issue: the bandwidth of modern communication equipment does not permit continuous
communication between all units and with the exterior. Hence, the strategy followed
should promote the communication independence of the actors as much as possible,
reducing communication to a central controller at certain moments.


3 Introduction to Unmanned Aerial Vehicles.


Unmanned aerial vehicles (UAVs) are nowadays a robust tool capable of carrying out
almost the same missions as traditional aircraft, but at a significantly lower cost.
Missions that take a long time, require several units, or are repetitive are ideal
for UAV systems: surveillance, aerial coverage for ground units, or large-scale
attacks are examples. One step beyond the current state of technology is to introduce
into UAV fleets simple agent behaviours that together produce a collective purpose.

Current UAV models are remotely controlled, manually or semi-automatically, by
pilots. The Pathfinder Raven, Reaper, and Predator are examples of this form of
control. They are similar in size to standard fighters, e.g. the Eurofighter Typhoon,
and cost about ten times less, but they still operate somewhat like manned airplanes
when missions are defined. With the appearance of even smaller and cheaper UAVs, new
mission approaches can be designed: completely autonomous fleets capable of carrying
out missions without continuous human assistance. Hence, decision-making
responsibility should lie with each agent composing the system.

3.1 UAV Swarm Characterization.


A large number of robots acting together can be defined as a UAV swarm. The main
characteristics that define these systems are scalability, simple agent behaviour,
and local communication. The collective behaviour of the complete system emerges
from the interactions between agents and with the environment [1].

These kinds of systems should complete their goals while being able to recover
from faults and responding predictably to human supervision at any time. Large fleets
of agents make centralised task management and control approaches unworkable. The
main issue is the communication network: such a large number of agents transmitting
and receiving information simultaneously would make a centralised system very
unstable and computationally demanding, so this approach is not acceptable.

This is where a decentralised design arises. The key is to obtain a desired group
behaviour, focused on achieving a common goal, by defining simple individual
behaviours. The individual stimuli should be just local information, such as sensor
data and communication with nearby agents. The three main domains that challenge the
development of this kind of system are:

• Task allocation.

• Guidance & control.

• Communication networks.

The simulations produced in this thesis focus mainly on the task allocation
aspect, studying approaches that handle large numbers of agents and applying them to
the scenario of interest: air battlefields. This does not mean that the other fields
are forgotten, just that the solutions applied are already proven methods or model
simplifications.


3.2 Task Allocation in UAV Swarms.


Task allocation is defined as the process of assigning agents to specific tasks,
in an optimal concentration related to the current state. The principal
characteristic of this process is its individual-centric nature: it relies on
individual decisions about which task to carry out. The two main sources of stimuli
that produce changes in agent behaviour are inter-individual interactions and the
environment [2].

Optimal coordination among the UAV swarm is the key to mission success, the main
reason being the distributed nature of the agents: assigning them properly leads to
the maximum efficiency of the system. Hence, to maintain flexibility when
communications may not be available, decision-making should be distributed. The
allocation problem obtained from this system configuration becomes complex, requiring
trade-offs in the structure of the resolution algorithms. When designing algorithms
to solve the allocation problem, a compromise between two goals is needed: the
optimality of the solution and the time taken to obtain it. Task allocation problems
are for the most part NP-hard, demanding exponential time to obtain a satisfactory
solution.

In the last decades, two main approaches to designing decentralised task
allocation frameworks have been developed: the Consensus-Based Bundle Algorithm
(CBBA) [3] and the Stochastic Clustering Auction (SCA) [4]. The CBBA framework uses
an auction-based decision strategy for decentralised task allocation, with a
consensus routine based on local communication as the mechanism to resolve conflicts,
achieving agreement on the winning bid values. SCA enables improvement of the global
cost of task allocations obtained from fast greedy algorithms; it is a Markov chain
Monte Carlo method.

The main characteristic of the CBBA algorithm is that the compromise between
computation time and solution optimality is fixed by the structure of the algorithm;
it cannot be tuned to the demands of each mission. This means the performance is
guaranteed only if the mission parameters satisfy the CBBA conditions; if not, there
is no certainty of obtaining a solution, which makes it unsuitable for several real
scenarios. SCA, in contrast, uses stochastic transfers of tasks between agents and
provides configurable algorithm parameters that permit trade-offs between solution
optimality and computational requirements. This is its main difference from the CBBA
algorithm.

These approaches work well when the number of agents in the swarm is not very
large, around 10 or 20 units. But when this number increases, the communication needs
of auction algorithms become unaffordable for networks of that size. Furthermore,
size is not the only issue: in real scenarios, communication losses appear due to the
topology of the terrain and other factors. Hence, new approaches are needed. In this
situation, with groups of hundreds of agents and limited communications, new
approaches inspired by insect behaviour arise. The self-organised behaviour of social
insects like ants translates to agents with limited on-board resources, scalable in
number. They form systems that are considerably robust to changes in population and
scalable in units and tasks [5]. This framework is explained in the next section of
this thesis, as background for the model that will be developed.


3.3 Biologically Inspired Task Allocation Based on Stochastic
Policies.

In [7] to [9], a reliable approach based on stochastic policies is proposed to
solve the task allocation problem for groups of hundreds of agents, presenting a way
of dynamically assigning a homogeneous swarm of robots to various tasks. This model
serves as inspiration in this thesis for building a stochastic policy-based approach
to a combat scenario.

Decentralised consensus behaviours can be found in nature in groups of ants,
honeybees, and other social insects. In [11], Temnothorax albipennis ants start a
collective decision-making process when they must choose between two options for a
new nest location. Initially, undecided ants choose one site. Then an ant that has
already decided acts as a recruiter: it returns to the old nest and recruits a new
ant for its site through a tandem run, during which the second ant learns the path by
following the recruiter. Once the path is learned, the second ant acts as a new
recruiter. When a critical population level (a quorum) is reached, the recruiters
favour convergence to a single option: instead of recruiting new ants, they directly
transport them to the selected location.
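This recruitment-and-quorum dynamic can be illustrated with a toy stochastic simulation (a minimal sketch, not the model of [11]; all rates and thresholds here are invented for demonstration):

```python
import random

def simulate_nest_choice(n_ants=60, quorum=15, steps=600, seed=1):
    """Toy Temnothorax-style nest choice: 0 = undecided, 1 / 2 = committed site."""
    rng = random.Random(seed)
    state = [0] * n_ants
    for _ in range(steps):
        counts = {1: state.count(1), 2: state.count(2)}
        for i, s in enumerate(state):
            if s == 0:
                # Independent discovery: an undecided ant commits to a site.
                if rng.random() < 0.01:
                    state[i] = rng.choice((1, 2))
            else:
                # Recruitment: a committed ant converts one undecided ant.
                # Past the quorum, direct transport is faster than tandem runs.
                rate = 0.05 if counts[s] >= quorum else 0.01
                if rng.random() < rate:
                    undecided = [j for j, t in enumerate(state) if t == 0]
                    if undecided:
                        state[rng.choice(undecided)] = s
    return state.count(1), state.count(2)

print(simulate_nest_choice())   # the quorum feedback skews the split to one site
```

The positive feedback of the quorum rule is what drives the colony towards a single choice, the same mechanism exploited by the stochastic policies discussed above.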


4 Battlefield Actors.
The strategy adopted is inspired by the UAV model proposed from the author's
interpretation of the characteristics of a modern air combat scenario. To simplify
the case, the air combat studied is between a swarm of 30 agents and 10 enemy units.
Given the limited scope of the thesis, the most important characteristic of the swarm
to test is the task allocation algorithm. Hence, the enemy considered is another set
of unmanned aircraft, with speed and maneuvering capability similar to those of the
swarm agents. To win the battle, then, the only advantage the swarm presents is the
obvious one: the larger number of units.

The aim is to propose a UAV with the smallest quantity of on-board resources,
forming a system that is highly scalable both in the number of units composing it and
in the number of enemies to attack. It should be robust to changes in these numbers,
since in combat there will obviously be losses that change the system size. These
characteristics are typical of decentralised systems, in which agents switch between
simple behaviours modified by external interactions with the other elements. To
achieve this, a distributed algorithm using stochastic rules to distribute targets
across the whole swarm is applied.

| Unit type | Amount | Speed | Attack | Hit prob. | Defense | Integrity |
| Swarm | 30 units | 300 kts | Suicide | 60 % | No defense | 1 hit |
| Enemy | 10 units | 300 kts | No attack | - | Run away | 3 hits |

Table 3: Agents involved in simulations.
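The parameters of table 3 might be captured in the simulation as simple data structures (a sketch; the field names are taken from the table headings and are not from any existing codebase):

```python
from dataclasses import dataclass

@dataclass
class AgentType:
    amount: int          # units at the start of the engagement
    speed_kts: float     # cruise speed in knots
    hit_prob: float      # probability a suicide attack damages the target
    hits_to_kill: int    # impacts the unit can absorb before going down

# Values taken from table 3; the enemy does not attack, so its hit
# probability is set to zero.
SWARM = AgentType(amount=30, speed_kts=300, hit_prob=0.60, hits_to_kill=1)
ENEMY = AgentType(amount=10, speed_kts=300, hit_prob=0.0,  hits_to_kill=3)
```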

4.1 Engagement Procedure.


As explained later, the swarm should be aware of the enemies before engaging. It
is therefore assumed that some kind of detection agent, e.g. an RQ-4 Global Hawk,
escorts the swarm. The procedure is as follows. The reconnaissance agent sends the
enemy count and locations to the swarm (in the simulation this step is assumed to
have already happened). Next, the task allocation algorithm running on each swarm
agent decides which enemy to attack. Each unit then follows the target allocated to
it using its on-board radar. When the distance between the target and the agent falls
below 5 meters, the agent explodes. The probability that each agent damages the enemy
is 60 percent, and each enemy unit can suffer up to three impacts.

The enemy units are assumed to have a defense mechanism consisting of a proximity
detection system with a given radius of action. When a swarm agent enters this
radius, the enemy UAV flies away from it, trying to avoid collision or attack by
increasing the distance between the two actors involved in the engagement as much as
possible. This mechanism is based on the potential field algorithm explained later on.


4.2 Graphical Representation.


Figure 10 shows how the swarm and the enemy are represented during the
simulations. The blue crosses represent the swarm agents; the blue circles are the
enemy targets. When an enemy is killed, its circle turns red. When several crosses
lie over a circle that is not red, the swarm units attacking it have failed.

Figure 9: Graphic representation of the simulation after 10 seconds.

Figure 10: Graphic representation of the simulation after 100 seconds.

| Agent | Swarm | Enemy | Base |
| Operational | × | ◦ | ◦ |
| Dead | ∗ | ◦ | - |

Table 4: Agents' representation.

Figures 9 and 10 show the simulation after 10 and 100 seconds. It can be observed
that two enemies have been killed, and around 18 swarm agents have failed during
their attacks. In the bottom-left corner, the green circle represents the base where
the swarm starts.


5 Decision Making, Navigation, & Control.


5.1 Decision Making.
The decision-making algorithm considers the integrity of swarm agents and enemies.
To do so, swarm units are assumed to be able to detect whether the designated target
is dead or alive. Making this distinction is not trivial, but the on-board sensor
needed for such a task is assumed not to influence the initial UAV proposal.

Given the kind of attacking system being designed, there is almost no
communication between agents, or with the exterior of the cluster they form, so the
decision-making scheme reduces to just the agent's and the enemy's integrity.
Furthermore, in the simulations this algorithm is implemented together with the task
allocation algorithm.

Figure 11: Decision Making Flow Chart.

The algorithm goes as follows. After initialisation, the agent receives from the
reconnaissance UAV the number of enemy units, their initial coordinates, and other
parameters affecting the task allocation algorithm that are explained later on. The
agent then runs the task allocation procedure to select a target. Once the target is
known, the swarm unit starts the chase. When the distance is short enough to detect
the state of the enemy, the agent evaluates the target's integrity. If the target is
operative, the swarm member flies against it trying to score a hit, in typical
suicide behaviour. If the enemy is already down because of other agents' attacks, the
agent runs the task allocation algorithm once more to select a new target and
continues as before. It is important to note that if the attack is unsuccessful, the
agent is considered killed, as its attack was suicidal; the swarm UAV has just one
opportunity to hit the target. The probability of a successful hit is a parameter
tested during the simulations.
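The loop just described might be sketched as follows (a simplified illustration of figure 11; the `Unit` class and the random `allocate_target` placeholder are invented here, standing in for the stochastic allocation of section 6):

```python
import math
import random
from dataclasses import dataclass

KILL_RANGE = 5.0   # metres: detonation distance (section 4.1)
HIT_PROB = 0.60    # probability a suicide attack damages the target (table 3)
ENEMY_HITS = 3     # impacts an enemy can absorb before going down (table 3)

@dataclass
class Unit:
    x: float
    y: float
    alive: bool = True
    hits_taken: int = 0
    target: "Unit | None" = None

def distance(a, b):
    return math.hypot(a.x - b.x, a.y - b.y)

def allocate_target(agent, live_enemies, rng):
    # Placeholder for the stochastic task allocation routine of section 6:
    # here simply a uniform random choice among the live enemies.
    return rng.choice(live_enemies) if live_enemies else None

def agent_step(agent, enemies, speed=1.0, rng=random):
    """One decision-making iteration for a single swarm agent (figure 11)."""
    if agent.target is None or not agent.target.alive:
        live = [e for e in enemies if e.alive]
        agent.target = allocate_target(agent, live, rng)
        if agent.target is None:
            return  # no enemies left
    # Chase: here a straight-line step (potential-field navigation, section 5.2).
    d = distance(agent, agent.target)
    if d > 0:
        agent.x += speed * (agent.target.x - agent.x) / d
        agent.y += speed * (agent.target.y - agent.y) / d
    if distance(agent, agent.target) < KILL_RANGE:
        # Suicide attack: the agent is expended whether or not it hits.
        if rng.random() < HIT_PROB:
            agent.target.hits_taken += 1
            if agent.target.hits_taken >= ENEMY_HITS:
                agent.target.alive = False
        agent.alive = False
```

A full simulation would call `agent_step` for every live swarm agent at each time step, alongside the enemy evasion behaviour of section 4.1.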


5.2 Navigation.
An air combat scenario includes two groups of agents fighting each other in the
air, in engagements ranging from one-versus-one battles to several-versus-several.
Because of the obviously large space of the battlefield and the relatively small size
of the units moving in it, the choice of navigation algorithm is straightforward.
Bug algorithms are discarded because they usually only work in environments with
static obstacles. Exact algorithms such as Voronoi diagrams obtain a mathematically
determined answer, but they need precise sensing of the obstacles and are therefore
mostly applied to avoiding static ones. Grid-based algorithms are the best option for
dynamic obstacles, as their need for sensor precision is not as critical as in the
other methods.

Hence, an algorithm based on potential fields is applied in this thesis for the
agents' movement. Potential fields belong to the larger family of grid algorithms;
they emulate the spreading of magnetic fields. When applying this concept to robot
navigation, two main objects arise in the environment: goal points and obstacles. The
goal point attracts the agent to its position; the obstacles, conversely, repel the
agent.

5.2.1 Algorithm Based on Potential Fields.

To apply the algorithm, each swarm agent's goal is the enemy UAV it has been
assigned to attack, and the obstacles are the rest of the units in the air. For each
agent there is therefore a sum of repulsive forces due to the other units, plus the
attractive force corresponding to the target. The total force is the source of
movement of each element and is derived from the potential field generated.
F(q) = −∇U(q)

where U(q) is the sum of the attractive and repulsive fields:

U(q) = U_att(q) + U_rep(q)

with:

U_att(q) = ξ ‖q − q_goal‖

U_rep(q) = ½ ν (1/ρ(q, q_obst) − 1/ρ₀)²,   for ρ(q, q_obst) ≤ ρ₀

These potential fields lead to the following forces:

F_att(q) = −∇U_att(q) = −ξ (q − q_goal) / ‖q − q_goal‖

F_rep(q) = −∇U_rep(q) = ν (1/ρ(q, q_obst) − 1/ρ₀) · (1/ρ(q, q_obst)²) · (q − q_obst) / ρ(q, q_obst)

ξ and ν are the attractive and repulsive gains, ρ(q, q_obst) is the distance from the
agent to the obstacle, and ρ₀ is the maximum distance at which the obstacle's
repulsive field has effect.


Therefore, defining the battlefield as a two-dimensional space, the total force on
an agent is:

F(x, y) = F_target(x, y) + Σ_{i=1}^{N} F_agent_i(x, y)

Figure 12: Potential Fields.

In our case, as the obstacles are small relative to the distances between bodies,
and all of them are moving, the agents cannot fall into the local-minimum traps that
would leave them stuck between obstacles, unable to move towards the goal position.
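The force expressions above can be written directly in code (a 2-D sketch; the gains ξ, ν and the cutoff ρ₀ are arbitrary demonstration values, and the function names are invented here):

```python
import math

def attractive_force(q, q_goal, xi=1.0):
    """F_att = -xi (q - q_goal)/||q - q_goal||: a unit pull towards the goal."""
    dx, dy = q[0] - q_goal[0], q[1] - q_goal[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)
    return (-xi * dx / d, -xi * dy / d)

def repulsive_force(q, q_obst, nu=1.0, rho0=10.0):
    """F_rep pushes the agent away from an obstacle within range rho0."""
    dx, dy = q[0] - q_obst[0], q[1] - q_obst[1]
    rho = math.hypot(dx, dy)
    if rho == 0.0 or rho > rho0:
        return (0.0, 0.0)   # outside the obstacle's field of influence
    mag = nu * (1.0 / rho - 1.0 / rho0) / rho**2
    return (mag * dx / rho, mag * dy / rho)

def total_force(q, q_target, obstacles, xi=1.0, nu=1.0, rho0=10.0):
    """Target attraction plus the repulsion of every other unit in the air."""
    fx, fy = attractive_force(q, q_target, xi)
    for q_obst in obstacles:
        rx, ry = repulsive_force(q, q_obst, nu, rho0)
        fx, fy = fx + rx, fy + ry
    return (fx, fy)
```

Note that the repulsive magnitude grows without bound as ρ → 0, which is what keeps agents from colliding with the units they must avoid.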

5.3 Control.
The control applied to the swarm agents is a basic simplification of rigid body
dynamics. In this simulation, the units are treated as particles, so no rotation is
taken into account and the resulting movement of each element is purely
translational. The cause of the movement is the total force produced by the
attractive and repulsive fields over each UAV. The kinematics are:

φ_i = arctan(F_y / F_x)
x_i = x_{i−1} + v · dt · cos φ_i
y_i = y_{i−1} + v · dt · sin φ_i

The subscript i represents the iteration and dt is the time step between iterations.
Hence the new position of each element is the previous position plus the velocity
times the time step, in the direction of the force that produces the movement.
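A sketch of this particle update in Python, under the same assumptions (constant speed `v`, heading taken from the total force); `math.atan2` is used rather than arctan(Fy/Fx) so the heading is resolved correctly in all four quadrants:

```python
import math

def step(x, y, fx, fy, v=1.0, dt=0.1):
    """One kinematic update: move at constant speed v along the direction
    of the total force (fx, fy) acting on the particle."""
    phi = math.atan2(fy, fx)            # heading of the total force
    return x + v * dt * math.cos(phi), y + v * dt * math.sin(phi)
```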

Figure 13: Kinematics.


6 Task Allocation.
The problem of assigning subsets of agents from the total swarm in an optimal
proportion to each task is an instance of the single-task, multi-robot task allocation
problem (ST-MR), also known as the coalition formation problem when the agents are
software agents. Some approaches, such as market-based ones, need extensive agent
communication and cooperation, and are therefore often costly. Approaches that arise
from the behaviour of insects, such as swarms of ants, are based on optimized
stochastic policies and rely on little to no communication between agents; this is
the kind of task allocation algorithm implemented in this thesis to tackle the
assignment of targets to agents.

6.1 Task Allocation Problem Definition.


[6] The multi-robot task allocation problem is presented as follows:
Given a set of tasks M, a set of robots N, and for each subset of robots n \subseteq N
a function c_n : 2^M \to \mathbb{R}^+ \cup \{\infty\} specifying the cost (or utility) of completing
each subset of tasks, find the allocation A^* \in \mathbb{R}^M that minimises or maximises a
global objective function C : \mathbb{R}^M \to \mathbb{R}^+ \cup \{\infty\}.

6.2 Optimized Stochastic Policies for Task Allocation in Swarms of Robots.
6.2.1 Definitions and Assumptions.
Consider a group of N robots that has to be allocated among M tasks, the enemies in
our case. Denote by ni(t) the number of agents assigned to enemy i ∈ {1, ..., M} at
time t, a non-negative integer, and by ndi the desired number of robots assigned to
each enemy i. The population fractions associated with these variables are
xi(t) = ni(t)/N and xdi = ndi/N. The population fraction vector is
x(t) = [x1(t), ..., xM(t)]^T, and the target distribution is the vector of desired
swarm fractions assigned to each enemy: xd = [xd1, ..., xdM]^T. Defining quantities
as fractions instead of integers is better for scaling, and it is also useful for
applications in which losses of robots are common.

The constraints between tasks are defined by a directed graph G = (V, E), where V,
the set of M vertices, corresponds to the enemies {1, ..., M}, and E, the set of NE
edges, corresponds to the possible transitions between tasks. Enemies i and j are
considered adjacent, written i ∼ j, if an agent assigned to enemy i can switch to j;
the edge set is E = {(i, j) ∈ V × V | i ∼ j}. In the thesis scenario, G models the
enemy inter-connectivity: V is the set of M targets, and each edge (i, j) is a
one-way path that agents can use to go from i to j. The P existing routes from i
to j are represented by the edges (i, j)m, where m = 1, ..., P.

To apply this algorithm it is necessary that G be strongly connected; this means
that a directed path exists between any pair of vertices, figure 14. This allows the
units to attack any enemy starting from any other target. In this case the graph
is also fully connected, because every vertex is adjacent to every other vertex.
This is natural since the targets are flying aircraft, so flying from one to another
is not restricted by any physical obstacle, as it could be over terrain with mountains,


buildings, etc. This permits the swarm elements to travel directly from one target to
another, instead of going through a sequence of intermediate steps before reaching
the desired enemy.
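As an illustration, the fully connected digraph over the M targets can be represented simply as the set of all ordered pairs; this is a hypothetical sketch, not the thesis implementation:

```python
def fully_connected(targets):
    """Build the edge set E of a fully connected directed graph over the targets:
    every ordered pair (i, j) with i != j is a one-way transition agents may take."""
    return {(i, j) for i in targets for j in targets if i != j}

# A fully connected digraph on M vertices has M*(M-1) directed edges.
E = fully_connected(range(4))
```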

Figure 14: Strongly connected graph.

x(t) corresponds to the state distribution of a Markov process over G: V is the
state space and E is the set of available transitions. Each edge (i, j) has a
transition rate kij that represents the probability per unit time for one agent to
switch from enemy i to enemy j. The stochastic policies arise because the agents are
programmed to go from target i to j with probability kij δt in every time step δt;
the number of transitions between i and j in a time ∆t then follows a Poisson
distribution with parameter kij ∆t. Finding the optimal values of kij forces the
agents to distribute themselves among the enemies as quickly as possible to achieve
the desired ratios specified in xd. The use of constant kij values is imperative to
associate the system with a linear continuous model. It is assumed that the values
kij, xd and the complete graph G are known by every agent before starting.

6.2.2 Base Model.

The swarm model is a function of the rates kij and is expressed in terms of x(t).
In the limit N → ∞, the system is described by the following linear ordinary
differential equation (ODE):

\frac{dx_i(t)}{dt} = \sum_{\forall (j,i) \in E} k_{ji} x_j(t) - \sum_{\forall (i,j) \in E} k_{ij} x_i(t) \quad (1)

Where i = 1, ..., M. Each transition from i to j moves a fraction of agents per unit
time from enemy i to enemy j. The model defined by equation 1 thus specifies the rate
of change of the swarm fraction xi(t) as the difference between the inflow and
outflow of agents at target i. It represents elements as changing immediately from
one enemy to another, not considering the time that each element needs to transit
between tasks. Because of the constant value of kij, some agents still travel between
tasks when xd is achieved; this feature improves the robustness of the system [9].

Equation 1 can be summarised as a linear model:


\frac{dx}{dt} = Kx \quad (2)

Where K is a matrix with the following properties:

K \in \mathbb{R}^{M \times M}
K^T \mathbf{1} = 0
K_{ij} \ge 0 \quad \forall \, i \ne j

The matrix resulting from these characteristics has the form:

K_{ij} = \begin{cases} k_{ji}, & \text{if } i \ne j, \ (j,i) \in E \\ 0, & \text{if } i \ne j, \ (j,i) \notin E \\ -\sum_{(i,l) \in E} k_{il}, & \text{if } i = j \end{cases} \quad (3)
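The linear model dx/dt = Kx can be sketched as follows: the off-diagonal entries of K hold the inflow rates kji, the diagonal entries the negative sum of the outgoing rates, so the columns sum to zero and the total swarm fraction is conserved. The rate values used here are illustrative, not the optimised ones:

```python
import numpy as np

def build_K(M, rates):
    """Build the matrix K of dx/dt = Kx from the edge rates.
    rates maps an edge (i, j) to k_ij, the rate of switching from target i to j.
    K[j, i] collects the inflow into j from i; K[i, i] the negative total outflow."""
    K = np.zeros((M, M))
    for (i, j), k in rates.items():
        K[j, i] += k   # inflow into j from i at rate k_ij
        K[i, i] -= k   # outflow from i
    return K

def integrate(K, x0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = Kx."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (K @ x)
    return x

# Hypothetical uniform rates on a fully connected 3-target graph (illustrative only).
rates = {(i, j): 1.0 for i in range(3) for j in range(3) if i != j}
K = build_K(3, rates)
x = integrate(K, [1.0, 0.0, 0.0])   # whole swarm starts at target 0
```

Because the columns of K sum to zero (K^T 1 = 0), the total fraction is conserved at every step; with uniform rates the swarm relaxes towards the uniform distribution.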

6.2.3 Extended Base Model: Including the Quorum Concept.

The quorum is defined as a threshold value for each enemy: when the agent
concentration at a target rises above this threshold, the system transfers units to
adjacent enemies at a larger rate. Every enemy has an associated quorum quantity qi,
defined as a fraction of the desired agent limit x̄i at each enemy.

\frac{dx_i(t)}{dt} = \sum_{\forall (j,i) \in E} \varphi_{ji}(t) - \sum_{\forall (i,j) \in E} \varphi_{ij}(t) \quad (4)

Where φij(t) is the flow of swarm units travelling from target i to j. If the quorum
of enemy i is exceeded, the transition rate from i to the adjacent objectives j is
set to a multiple of the existing transition rate, αkij, with α > 0 satisfying
αkij ≤ kij^max. Resulting in:

\varphi_{ij}(t) = k_{ij} x_i(t) + \sigma_i(x_i, q_i)(\alpha - 1) k_{ij} x_i(t) \quad (5)

\bar{x}_i k_{ij} = \bar{x}_j k_{ji} \quad \forall (i, j) \in E


Where σi is the analytic switching function given by:

\sigma_i(x_i, q_i) = \left( 1 + e^{\gamma (q_i - x_i/\bar{x}_i)} \right)^{-1}, \quad \sigma_i \in [0, 1] \quad (6)

This analytic switching function acts as a smooth threshold, like the one described
in [10]. In [8] it is demonstrated that introducing the quorum concept into the base
model speeds up the allocation process and allows the wireless communication
procedure implemented in [7] to be eliminated.

In the simulation, as the swarm units attack as suicide projectiles trying to hit
the assigned target, the quorum concept in this model is equivalent to the number
of impacts each enemy can withstand.
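A sketch of the switching function of equation 6 and the amplified flow of equation 5; the steepness γ (`gamma`) and amplification α (`alpha`) are illustrative assumptions:

```python
import math

def sigma(x_i, q_i, xbar_i, gamma=20.0):
    """Analytic switching function of equation 6:
    approximately 0 below the quorum q_i, approximately 1 above it."""
    return 1.0 / (1.0 + math.exp(gamma * (q_i - x_i / xbar_i)))

def flow(k_ij, x_i, q_i, xbar_i, alpha=4.0, gamma=20.0):
    """Flow from target i to j (equation 5): the base rate k_ij * x_i,
    amplified towards alpha * k_ij * x_i once the quorum at i is exceeded."""
    s = sigma(x_i, q_i, xbar_i, gamma)
    return k_ij * x_i + s * (alpha - 1.0) * k_ij * x_i
```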


6.2.4 Agent Implementation.

To programme each agent of the swarm to act as part of the overall group, it is
necessary to develop a controller that matches the behaviour of the ODE model using
individual stochastic policies. The transitions defined here follow a Poisson
process. Considering i as the current target that an agent is attacking and j as any
adjacent one, kij is the rate at which the agent switches from i to j per second.
At each ∆t, the agent performs a random draw with two possible results, 0 or 1,
where the probability of obtaining 1 is kij × ∆t. If the result is 1 the agent
switches to attack the adjacent enemy j; if it is 0, the agent keeps attacking the
current target i.

The number of agents assigned to a target i remaining after a time t follows the
expression:

N_i(t) = N_{i,initial} \, e^{-kt}

Applying this methodology to each agent, the task allocation algorithm arises. Each
unit randomly draws the time at which to change target: every time an agent arrives
at a target i, it generates a random value ts,j for every adjacent enemy j according
to a Poisson process with rate k, so the waiting times are exponentially distributed.
While the current time is less than the minimum ts,j the agent keeps attacking the
current target; once it is reached, the agent switches to the target with the
minimum ts. The implementation is written next [8]:

Data: Given enemy topology G and transition matrix K

Initialization;
Estimate the fraction of units assigned to enemy i, xi/x̄i;
if xi/x̄i is less than qi then
    Compute ts,j using k = kij ∀ adjacent enemies
else
    Compute ts,j using k = kij^max ∀ adjacent enemies
end
if current time t is less than or equal to the minimum ts,j then
    Attack the current target
else
    Change attack to the enemy with minimum ts,j at time t = min ts,j
end
Algorithm 1: Original T.A. algorithm at agent level.

To apply the quorum, agents should be able to estimate the quorum level corresponding
to each enemy; the transition of each unit is synchronised with its internal clock.
Moreover, as the objective is to adapt this algorithm to our air combat scenario,
the swarm agents can die as well as the enemies. This means that the algorithm
suffers some modifications to introduce the concept of agents' integrity.

if t > min ts,j or enemy integrity == 0 then
    Change attack to the enemy with minimum ts,j at time t = min ts,j
else
    Attack the current target
end
Algorithm 2: Modified part of the new algorithm.
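At agent level, the switching times ts,j of Algorithms 1 and 2 can be drawn from exponential distributions, since the waiting time between events of a Poisson process with rate k is exponentially distributed. A minimal sketch, with a hypothetical quorum amplification factor `alpha`:

```python
import random

def draw_switch_times(current, rates, quorum_exceeded, alpha=4.0):
    """Draw a candidate switching time t_s_j for every enemy j adjacent to the
    current target, as in Algorithm 1. random.expovariate(k) samples the
    exponential waiting time of a Poisson process with rate k. If the quorum of
    the current target is exceeded, the rates are amplified by alpha."""
    times = {}
    for (i, j), k in rates.items():
        if i != current:
            continue
        rate = alpha * k if quorum_exceeded else k
        times[j] = random.expovariate(rate)
    return times

def next_target(current, rates, quorum_exceeded=False):
    """The agent switches to the adjacent enemy with the minimum drawn time."""
    times = draw_switch_times(current, rates, quorum_exceeded)
    j = min(times, key=times.get)
    return j, times[j]
```

In a full simulation the agent would keep attacking `current` until its clock reaches the minimum drawn time (or the enemy's integrity drops to zero, as in Algorithm 2), and only then invoke `next_target`.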


7 Simulation Analysis.
7.1 First Case: Different simulation durations.
This simulation measures the killing speed by counting how many targets are
eliminated in simulations of different duration. Each simulation is run 5 times,
while the other parameters, such as the number of swarm agents and the matrix K,
are kept constant.

Case    Time [sec]    K    Swarm Units
1       250           1    30
2       400           1    30
3       500           1    30
4       600           1    30

Table 5: Simulation 1.

Figure 15: Enemies killed with configuration 1.

Figure 15 presents the average number of enemies killed for four different cases,
varying the time duration of the simulation in each of them. The results show that
the killing rate does not seem to be affected by the duration, as it presents high
dispersion with no correlation with time. This non-correlation is reinforced by
figure 16: the number of friendly units lost is similar in average and dispersion
no matter the case.


Figure 16: Swarm agents survived with configuration 1.

The results show that a priori the system is fast enough to kill, but the
effectiveness with this configuration is not very high: no more than 5 kills in the
best case, no matter the duration of the simulation. This shows that the time
intervals considered are not very relevant with the current configuration of K
values and number of swarm agents. It may be that, as the algorithm uses a
randomisation function, the resulting uncertainty does not permit clear results
with the current number of simulations (5 per case); a possible solution is to
increase this number.

7.2 Second Case: Different matrix K values.

As specified in [7], modifying the matrix K should yield a faster convergence time,
at the cost of agents floating between targets at equilibrium. The difference in
this situation is that targets and agents can now die, hence the graph dimensions
tend to vary. This means that the K values would have to be recalculated on-line to
recover the optimal values. The author of this thesis did not go deeper into the
optimisation of the matrix K than a simple static approximation that obtains
acceptable results in different configurations.

Case    Time [sec]    K       Swarm Units
1       300           0.5     30
2       300           0.85    30
3       300           1       30
4       300           1.2     30

Table 6: Simulation 2.


Figure 17: Enemies killed with configuration 2.

In this experiment the correlation between the matrix K and the killing rate is
clear: the cases with K values different from the design value present almost no
kills. The survivability of the swarm is also enhanced when K is the design value;
other values increase the number of elements lost.

Figure 18: Swarm agents survived with configuration 2.

This case shows the sensitivity of the algorithm to the values of K. Every value
far from the optimal working one prevents the agents from switching targets in
pursuit of the goal concentration. Almost all the agents that died had been
designated to the initial target, probably the first one to be destroyed.


7.3 Third Case: Different number of swarm agents.

In this part, the simulation focuses on the importance of the swarm size. The
expected results can be observed in the plots: the number of kills rises with the
number of agents forming the swarm. The negative point arising from these results is
that even with a proportion of 60 agents versus 10 targets, the average score is not
close to 100 per cent of kills. This may be caused by the hit probability, but a
further revision of the algorithm would be advisable. It is strange that a ratio of
only 1.5 agents per target suffices to kill between 5 and 8 targets, while with 6
agents per target the results are between 5 and 9. Quadrupling the number of agents
to kill only around two more enemies is completely inefficient from a cost
perspective.

Case    Time [sec]    K    Swarm Units
1       500           1    15
2       500           1    30
3       500           1    40
4       500           1    60

Table 7: Simulation 3.

Figure 19: Enemies killed with configuration 3.

In figure 19 the killing rate grows as the number of elements forming the swarm
scales up. The deviation varies from case to case, so the robustness is still
affected: for example, for 40 units the killing rate averages 7 with a deviation of
1 unit, while for 60 units the average drops to 6, with one simulation in which 9
enemy units died.


Figure 20: Swarm agents survived with configuration 3.

Figure 20 shows that, naturally, the survivability of the swarm increases with the
number of agents working together. The average number of friendly units lost goes
down very slowly as the swarm size increases, as can be appreciated in the graph,
but the dispersion is greatly reduced.


8 Conclusions & Further Work.


To finish this work, some conclusions about the results obtained from the
simulations of the proposed algorithm are presented below as a closure of this
thesis. The fields touched in this thesis are numerous enough that further
improvements can go in several directions.

• The research carried out led to defining the characteristics and features required
  for an unmanned aerial vehicle able to confront other aircraft in the air
  battlefield. Moreover, a decision-making and a task allocation algorithm were
  implemented, inspired by swarm behaviours coming from insects' interactions.

• Stochastic policies are a strong candidate to be the base of decision-making and
  task allocation algorithms when the number of agents to organize and control rises
  substantially. Bio-inspired algorithms have shown that they can achieve
  significant results. However, when the scenario is dynamically challenging,
  presenting not only transients but also considerable changes in the initial
  parameters, the stochastic rates have to be recalculated on the run to achieve
  optimal results.

• Although the results of this work are initially promising, further work is needed
  on the optimization of the matrix K to find values that boost performance. The
  algorithm also needs to be reviewed to make it more robust against the changes
  that arise in an air engagement.

• Further improvement of the model introduced in this thesis would make it more
  realistic. More advanced dynamics would help obtain more reliable simulations and
  results. More capabilities can also be implemented for the agents or enemies, such
  as missile launches; furthermore, other kinds of enemies, such as SAM sites, can
  be added to make the scenario even more challenging.


References
[1] N. Correll and D. Rus. Architectures and control of networked robotic systems.
Handbook of Collective Robotics, pp. 81-104, Pan Stanford, Singapore, 2013.

[2] Deborah M. Gordon. The organization of work in social insect colonies.
Nature, 380(6570): 121-124, 1996.

[3] Han-Lim Choi, Luc Brunet, and Jonathan P. How. Consensus-Based Decentralized
Auctions for Robust Task Allocation. IEEE Transactions on Robotics, Vol. 25, No. 4,
2009.

[4] Kai Zhang, Emmanuel G. Collins Jr, and Adrian Barbu. An Efficient Stochastic
Clustering Auction for Heterogeneous Robotic Collaborative Teams. Springer
Science+Business Media Dordrecht, 2013.

[5] N. Franks, S. C. Pratt, N. F. Britton, E. B. Mallon, and D. T. Sumpter.
Information flow, opinion-polling and collective intelligence in house-hunting
social insects. Philos. Trans.: Biol. Sci., vol. 357, pp. 1567-1584, 2002.

[6] M. B. Dias, R. Zlot, N. Kalra, and A. Stentz. Market-Based Multirobot
Coordination: A Survey and Analysis. Proceedings of the IEEE, 94(7), 1257-1270,
2006.

[7] Á. Halász, M. Ani Hsieh, S. Berman, and V. Kumar. Dynamic Redistribution of a
Swarm of Robots Among Multiple Sites. Proceedings of the 2007 IEEE/RSJ
International Conference on Intelligent Robots and Systems, San Diego, CA, USA,
Oct 29 - Nov 2, 2007.

[8] M. Ani Hsieh, Á. Halász, S. Berman, and V. Kumar. Biologically inspired
redistribution of a swarm of robots among multiple sites. Springer
Science+Business Media, LLC, 2008.

[9] S. Berman, Á. Halász, M. Ani Hsieh, and V. Kumar. Navigation-based
Optimization of Stochastic Strategies for Allocating a Robot Swarm among Multiple
Sites. Proceedings of the 47th IEEE Conference on Decision and Control, Cancun,
Mexico, Dec. 9-11, 2008.

[10] W. Agassounon and A. Martinoli. Efficiency and robustness of threshold-based
distributed allocation algorithms in multi-agent systems. In Proceedings of the
First International Joint Conference on Autonomous Agents and Multi-Agent Systems
(AAMAS '02), pp. 1090-1097. New York: ACM, 2002.

[11] N. Franks, S. C. Pratt, N. F. Britton, E. B. Mallon, and D. T. Sumpter.
Information flow, opinion-polling and collective intelligence in house-hunting
social insects. Philosophical Transactions B: Biological Sciences, 357(1429),
1567-1584, 2002.

[12] Department of the Navy, Chief of Naval Air Training. Flight Training
Instruction: Basic Fighter Maneuvering & Section Engaged Maneuvering. CNATRA
P-1289, N715, January 2016.

[13] Lt Col Patrick Higby. Promise and Reality: Beyond Visual Range (BVR)
Air-To-Air Combat. Air War College (AWC) Electives Program, Air Power Theory,
Doctrine, and Strategy: 1945-Present. Maxwell AFB, AL, March 2005.

[14] Robert L. Shaw. Fighter Combat: Tactics & Maneuvering. Naval Institute Press,
Annapolis, Maryland, 1985.

[15] www.UAVglobal.com.

[16] United States Government Accountability Office. Observations on the Costs and
Benefits of an Increased Department of Defense Role in Helping to Secure the
Southwest Land Border. Washington, DC, September 2011.

[17] Lockheed Martin. F-35 Lightning II Program Status and Fast Facts, 4Q 2015.
2016.

[18] U.S. Air Force. F-35A Lightning II, RQ-4 Global Hawk, F-16 Fighting Falcon &
MQ-9 Reaper fact sheets. Air Combat Command Public Affairs Office, April 2014.

