A visual tracking system for honeybee 3D flight trajectory reconstruction and analysis

A thesis submitted to The University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science and Engineering

2020

Cong Sun

School of Engineering Department of Electrical & Electronic Engineering

Table of Contents

Abstract ...... 7

Declaration ...... 8

Copyright statement ...... 9

Acknowledgements ...... 10

1. Chapter 1 – Introduction ...... 11

1.1 The main character of the story - bees ...... 12

1.1.1 Characteristics and behaviours ...... 12

1.1.2 The unquestionable value ...... 16

1.1.3 Current situation and threats ...... 17

1.1.4 Strategies to maintain effective pollination ...... 19

1.2 Main contributions ...... 20

1.3 Chapter summary ...... 21

2 Chapter 2 – Literature review ...... 24

2.1 An overview of insect behaviour ...... 24

2.2 Ground-based passive observation methods ...... 27

2.3 Airborne netting and sampling platforms ...... 28

2.4 Entomological radar ...... 29

2.4.1 Traditional scanning radar / marine radar ...... 30

2.4.2 Vertical-looking radar (VLR) ...... 31

2.4.3 Conventional entomological scanning radar (without harmonic transponders) ...... 34

2.4.4 Harmonic radar...... 35

2.4.5 Radio-frequency identification...... 39

2.5 Optical imaging systems ...... 42

2.5.1 In-field visual tracking with conventional instrumentation ...... 43

2.5.2 Insect Monitoring with Light Detection and Ranging (LiDAR) ...... 45

2.5.3 Laboratory observation ...... 47

2.5.4 Object tracking inspired by modern computer vision algorithms ...... 49

2.5.4.1 Overview of modern object tracking ...... 49

2.5.4.2 Small object tracking ...... 51

2.6 Summary ...... 52

3 Chapter 3 - Relevant Theory...... 54

3.1 Gaussian mixture model ...... 54

3.2 Kalman filter ...... 57

3.3 Hungarian algorithm ...... 59

3.4 3D space triangulation & epipolar geometry ...... 60

4 Chapter 4 – Hardware configuration and data collection ...... 66

4.1 Exploration of the imaging principle ...... 66

4.2 First experiment (2D determination) ...... 71

4.3 Camera parameters ...... 73

4.4 Simulation of 3D flights ...... 75

4.5 Second experiment session (3D determination) ...... 82

4.6 Conclusion ...... 86

5 Chapter 5 – Software development ...... 88

5.1 Definition of the problem ...... 88

5.2 Processing procedure of the software ...... 90

5.3 Flowchart of the software processing procedure ...... 91

5.4 Software development environment ...... 92

5.5 Camera calibration ...... 92

5.6 Pre-processing of the raw video ...... 96

5.7 Background subtraction ...... 97

5.8 Morphological operations & further filtering ...... 103

5.9 Software simulation ...... 109

5.10 Epipolar geometry system and 3D triangulation ...... 113

5.11 Motion estimation and the generation of flight tracks ...... 116

5.12 Quantified system evaluation ...... 119

5.13 Conclusion ...... 121

6 Chapter 6 – Results and data analysis ...... 122

6.1 Initial cleaning and reduction of data ...... 122

6.2 Analysis of individual flights ...... 126

6.3 Error analysis ...... 136

6.4 Analysis of the entire cluster ...... 138

6.4.1 Analysis of flight tendency ...... 138

6.4.2 Density distribution of worker bees at hive entrance ...... 148

6.5 Conclusion ...... 154

7 Chapter 7 - Conclusion ...... 155

7.1 Overview ...... 155

7.2 Summary of system development and findings ...... 156

7.3 Further work ...... 158

7.3.1 Improved placement of the cameras and recording strategy ...... 158

7.3.2 Upgrading the imaging equipment ...... 161

7.3.3 Collaboration with honeybee experts and entomologists ...... 162

7.3.4 Analysis with other environmental variables ...... 162

7.4 Conclusion ...... 163

Bibliography ...... 164

Appendix ...... 174

List of Figures

Figure 1.1 World distribution map of honeybees (retrieved from: https://www.apiservices.biz/) ...... 12
Figure 1.2 Adult honeybee anatomy ...... 14
Figure 2.1 Radar PPI display on a morning in 1988 at Jiangpu, showing a large number of take-offs of micro- (Riley, et al., 1991) ...... 31
Figure 2.2 (A) A ground-based harmonic radar for low-altitude insect tracking (photo by A. D. Smith). (B) Side view of a hand-held harmonic radar detector (O'Neal, Landis, Rothwell, Kempel, & Reinhard, 2004). (C) Photograph of a vertical-looking radar (VLR) sited on the rooftop of a building at Rothamsted (Chapman, Reynolds, & Smith, 2003) ...... 35
Figure 2.3 (A) An RFID tag attached to a billbug (Silcox, Doskocil, Sorenson, & Brandenburg, 2011). (B) Bumblebee individual attached with a 200 mg transmitter (Hagen, Wikelski, & Kissling, 2011). (C) Attachment of a 300 mg radio transmitter to the thorax of a green darner (Wikelski, et al., 2006). (D) Honeybee attached with a radar transponder (photo courtesy of M. Garcia-Alonso). (E) Butterfly equipped with a UHF RFID tag (Särkkä, Viikari, Huusko, & Jaakkola, 2012) ...... 36
Figure 2.4 (A) A honeybee with a 3-mg passive RFID tag glued to its back. (B) The RFID readers installed at the entrance of the hive for detecting returned marked forager bees (Henry, et al., 2012) ...... 42
Figure 2.5 (A) The illumination source used in the IRADIT system. It was fitted with an infra-red high-pass filter and powered by a pulsed xenon light source. (B) The superimposed enhanced images showing the manoeuvre of an insect against the bright sky. (C) The computer-processed image of (B) where the noise was mostly removed and the complete track of an insect is shown (Schaefer & Bent, 1984) ...... 45
Figure 2.6 Example of airborne LiDAR data, showing the colour coded point cloud as per height (left) and the light intensity (right) (courtesy Optech Inc.) ...... 46
Figure 2.7 (A) The architecture of a typical in-lab insect observation system. (B) The enclosure used in other research for the analysis of the 2D movement of cockroaches. (C) The view from the camera fitted on the ceiling of the enclosure in (B), superimposed with the colour coded tracks of the targets (Wilkinson, Lebon, Wood, Rosser, & Gouagna, 2014; Correll, et al., 2006) ...... 49
Figure 3.1 (A) Demonstration of a two-camera imaging system (C and C'). The real-world point X and its projection points on each camera are given by x and x'. (B) Once the epipoles are found (e and e'), based on the position of a point x on the left camera, an epipolar line l' on the other camera can be derived, indicating that all the possible locations of the projected point of the same real-world location X must lie on this line (Hartley & Zisserman, 2003) ...... 61
Figure 4.1 (A) 785 nm streamline Class 3B laser and the optical scanner. (B) 850 nm IR emitter consisting of 100 LEDs. (C) The forward scattering imaging configuration for both systems ...... 67
Figure 4.2 Divergence curve of the beam size at focus (blue solid curve) and the depth of focus (black dashed curve) vs. distance from laser at (A) short range (100 – 500 mm) and (B) long range (500 – 2000 mm) ...... 69
Figure 4.3 Signal-to-noise ratio (dB) vs. shifted elevation angle of the emitter ...... 70
Figure 4.4 (A) Image of a hanging bee in bright sky, 1 m distant, with the emitter off; (B) Image of the same bee from 1 m with the emitter turned on; (C) Image of the bee from 5 m distance. All these images are enhanced equally ...... 70
Figure 4.5 System setup at the beehive for 2D flight track determination of honeybees ...... 71
Figure 4.6 Schematic of the camera setup (top view) ...... 72
Figure 4.7 Dimensions of the GoPro Hero 7 Black (no protection case) ...... 74
Figure 4.8 Schematics of (A) the orthogonal setup and (B) the parallel setup for a dual camera imaging system ...... 76
Figure 4.9 (A) Simulation of the orthogonal configuration of a two-camera imaging system, looking at 20 virtual spheres at the same time. (B) Scene captured by the right (front) camera showing the motion of 20 spheres at a given instant; the blue curve shows the pre-set moving path of one sphere. (C) Simulation of the parallel (stereo) configuration of a two-camera imaging system ...... 78
Figure 4.10 Inverse projection of the measurement error from the 2D camera view plane to 3D, shown in (A) the single camera case and (B) the dual-camera case following the orthogonal setup ...... 79
Figure 4.11 Image captured by the right (front) camera, showing the background, the target beehive and the location of the other camera (not entirely visible from this camera) ...... 83
Figure 4.12 The custom-made calibration MDF chessboard for the orthogonal calibration ...... 84
Figure 4.13 (A) Cameras undergoing stereo calibration using a chessboard pattern after each recording. (B) The interface of the pose estimation calibration software, with alignment lines shown in different colours ...... 85
Figure 5.1 Flowchart of the bee tracking software and the output images of each intermediate phase ...... 91
Figure 5.2 (A) The impact of radial distortion, represented by the displacement of pixels induced by such distortion. (B) The impact of tangential distortion (Bouguet, 2015) ...... 93
Figure 5.3 The performance of six studied background subtractors for one frame taken from the bee flight recordings (figures have been cropped for visualisation): (A) original frame (B) MOG2 (C) MOG (D) CNT (E) GMG (F) GSOC (G) KNN ...... 100
Figure 5.4 Output of the MOG subtractor on one frame from the bee flight recordings, showing bee contours (marked with red circles) and a large amount of noise from the tree leaves ...... 102
Figure 5.5 The accumulated probability distribution of moving pixels across a 24-second recording, represented as the foreground mask image ...... 104
Figure 5.6 The binarized image of the foreground mask, showing image components whose pixel intensity lies within 10-50 in the original mask ...... 105
Figure 5.7 A simplified demonstration of the effect of (i) the dilation operation and (ii) the erosion operation on continuous shapes. In this example, a rectangle A is being transformed with a round structuring element B. The shapes with solid outlines are before the operation and the dashed-lined ones are the resulting shapes (Owens, 1997) ...... 106
Figure 5.8 The effect of (i) the dilation operation and (ii) the erosion operation on digital shapes. A 3 × 3 square structuring element is applied (Efford, 2000) ...... 107
Figure 5.9 The effect of the opening morphological operation on a scene which contains bees flying among leaves ...... 108
Figure 5.10 One example of a generated path where k1 = 4.6, k2 = 2.5, k3 = 8.3 ...... 110
Figure 5.11 (A) Stacked locations of three moving virtual bees. (B) Object locations estimated by the Kalman filter (red dots) and ground truth of the detection (green dots) ...... 111
Figure 5.12 The 3D simulation scene for the dual-camera setup constructed in 3ds Max ...... 112
Figure 5.13 The simulated dual-camera imaging system and 20 moving spheres. The purple curve is the moving path of one of the spheres ...... 113
Figure 5.14 A frame pair from the left (A) and right (B) camera recordings, demonstrating the workings of the epipolar geometry system ...... 114
Figure 5.15 Schematic of the 3D triangulation ...... 116
Figure 5.16 Flowchart of the motion estimation and flight track generation module based on the coordinates of detected objects in each frame ...... 117
Figure 5.17 (A) Stacked computed dots of target locations in a 24-second-long video clip. (B) The colour coded bee flight tracks generated, which is also the final result of the software ...... 119
Figure 6.1 3D projection view (i) and top view (ii) of the reconstructed flight tracks in a 9.5 minute video. Yellow cube represents the location of the beehive ...... 124
Figure 6.2 The virtual 'outside gate' in front of the beehive (top view) ...... 125
Figure 6.3 3D flight track (track A) of an individual bee colour coded with the magnitude of instantaneous velocity, shown in (i) the 3D projection view, (ii) the side view, (iii) the front view and (iv) the top view (unit: mm) ...... 126
Figure 6.4 3D flight track (track B) of another individual bee colour coded with the magnitude of instantaneous velocity, shown in (i) the 3D projection view, (ii) the side view, (iii) the front view and (iv) the top view (unit: mm) ...... 127
Figure 6.5 Top view of the relative location of the two tracks in Figure 6.3 and Figure 6.4, the beehive, and the cameras ...... 129
Figure 6.6 The magnitude of velocity of track A vs. time ...... 130
Figure 6.7 The magnitude of velocity of track B vs. time ...... 131
Figure 6.8 The x (i), y (ii) and z (iii) components of the instantaneous velocity vector of track A vs. time ...... 132
Figure 6.9 Diagram of the possible situations when large measurement error occurs ...... 133
Figure 6.10 (i) The magnitude of acceleration of track A vs. time (with large measurement error deviation). (ii) Averaged magnitude of acceleration (across 10 adjacent measurement points) of track A vs. time. (ii – x, y, z) The x, y, z components of the averaged acceleration vector of track A vs. time ...... 135
Figure 6.11 Averaged magnitude of acceleration of track B vs. time ...... 136
Figure 6.12 Demonstration of two flight track categories ...... 139
Figure 6.13 Low curvature tracks shown in (i) 3D projection view, (ii) side view and (iii) top view. Yellow rectangle represents the location of the hive entrance ...... 145
Figure 6.14 (i) Side view and (ii) top view of the tracks with large curvature. Yellow rectangle represents the location of the hive entrance ...... 147
Figure 6.15 3D distribution of all detection points in the 9.5 min recording. Red cube represents the location of the beehive entrance ...... 148
Figure 6.16 Density map of flying honeybees around the hive (top view), showing (i) all tracks, (ii) 'straight tracks' and (iii) 'curved tracks' ...... 150
Figure 6.17 Density map of flying honeybees around the hive (side view), showing (i) all tracks, (ii) 'straight tracks' and (iii) 'curved tracks' ...... 152
Figure 7.1 (A) Positioning of the cameras during the field experiment. (B) A possible optimised positioning of the cameras. (Gray areas are the common field of view) ...... 160

List of Tables

Table 4.1 2D bee flight monitoring study 2018 ...... 73
Table 4.2 Bee flight monitoring study 2019 ...... 82
Table 6.1 Flight statistics of the entire bee swarm in the recorded video ...... 141

Abstract

Although apiculture has been practised for thousands of years, many aspects of the honeybee, including its responses to the ambient environment through the central nervous system and its behaviours at the individual level, especially flight behaviours, are yet to be understood thoroughly. The dramatic decline in honeybee populations across the world has made such studies all the more significant.

The development of the instrumentation and software of an automated 3D insect detection and tracking system is presented. This inexpensive system comprises two orthogonally mounted video cameras, with an observation volume of over 250 m³, and an offline analysis software system that outputs 3D space trajectories and inflight statistics of the target honeybees. The imaging devices record at 2.7K resolution and 60 frames per second and require no human intervention once set up. The software module uses several forms of modern image processing with GPU-enabled acceleration; it is able to minimise the effect of highly cluttered stationary background objects and of moving artefacts whose characteristics are distinguishable from those of bees. The statistics of the bees' flight activity are presented and discussed. This system provides a streamlined and low-cost approach to the study of the inflight behaviours of bees and other insects. It will find applications in the optimisation of pollination strategy, population dynamics and hive health, accurate knowledge of which is predicated upon a better understanding of the inflight behaviours of bees at the individual level.


Declaration

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.


Copyright statement

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the "Intellectual Property") and any reproductions of copyright works in the thesis, for example graphs and tables ("Reproductions"), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=24420), in any relevant Thesis restriction declarations deposited in the University Library, the University Library's regulations (see http://www.library.manchester.ac.uk/about/regulations/) and in the University's policy on the Presentation of Theses.


Acknowledgements

I would like to express my profound thanks to my supervisor, Prof. Patrick Gaydecki, for his patient guidance and thoughtful support throughout the PhD research and the writing of this thesis. I have also been impressed and motivated by his attitude towards life and career, which has illuminated the path I wish to pursue in the future.

I would like to thank all my fellows in the research group – Muhammed Abdulaal, Antony Barton, Lorenza Zaira Curetti, Sheheera Ismail, Gareth Jones, Thomas Lloyd, Lubos Marcinek, Jiaxuan Wang and Wenyang Xie for their professionalism, support and friendship that helped me a lot. In particular, I would like to thank Erdem Atbas, Zhenzhou Yang, Jiayan Huang and my lovely parents for their encouragement and support that brought me happiness and strength to overcome this difficult time in 2020.

I would like to thank everyone at the School of Electrical and Electronic Engineering for their support and guidance during my six years as an overseas student. I received an unexpectedly warm welcome and hospitality from all the people I met, and it was my honour to be part of such an academic environment full of enthusiasm and wisdom.


1. Chapter 1 – Introduction

Insects are a large group of invertebrates (phylum: Arthropoda) and occupy a significant position with respect to biological balance and diversity, considering their vast numbers of individuals and species and their richness in genetic variety. Insects are vital components of the global ecosystem, but from an anthropocentric perspective, some insect species do severe damage to crops and spread disease, posing threats to the wellbeing of humanity and other forms of life. Other species provide honey, silk and medical materials as by-products, and contribute to pollination and biological pest control. The honeybee was chosen as the main subject of this research, considering the enormous economic value created by its pollination services as well as the threats that honeybees currently face. These threats include attacks by parasitic mites, damage from the use of insecticides and acaricides, the loss of wild habitats and behavioural abnormalities caused by climate change. Understanding the life habits and flight behaviours of bees is already a major topic in entomological research. However, the flight behaviour of the honeybee has not been investigated systematically and remains a relatively unworked area, mainly because the tools to do so have until recently been limited. There is also a paucity of knowledge regarding how a bee's nervous system interacts with its physical behaviours. In addition, given the urgent need to prevent bee populations from declining further, thorough studies of them are of considerable significance. The aim of such studies is not to achieve full manipulation of bees, but to monitor them using imaging and tracking techniques and to gain knowledge that helps with pollination management, in terms of optimising pollination capacity across a given area, especially in regions where wild bee species have suffered significant damage. Specifically, the system described in this thesis addresses a research gap: by improving tracking approaches, it allows hive abnormalities to be identified early and diagnosed precisely from the large amount of data the system can provide, thereby contributing to efforts to halt the decline in species abundance.

In general, this research involves the design and construction of instrumentation for monitoring the flight behaviour of bees with video cameras, the development of software for extracting flight statistics from the recordings, and the analysis of the resulting data. This thesis details the exploration of several imaging methods with different types of equipment and the setup of field trials, in which videos were recorded at up to 2.7K resolution and 60 frames per second. The reconstruction of honeybee 3D flight tracks through multiple noise-removal algorithms is then described, followed by analysis, with separate filtering, of flight tracks grouped according to various criteria at both the individual and colony levels. The results reveal several newly identified behaviours of honeybees, benefitting from recording and analysis of unprecedented detail and precision.

1.1 The main character of the story - bees

1.1.1 Characteristics and behaviours

It is estimated that there are approximately 20,000 species of bee; they are very widely distributed and can be found on every continent except Antarctica (Figure 1.1) (Mortensen, Schmehl, & Ellis, 2013). They have important ecological and economic roles as pollinators, and for these reasons have been the focus of a major corpus of research. Taxonomically, bees belong to the clade Anthophila within the superfamily Apoidea. Among all the bee species, the Western honeybee (European honeybee), Apis mellifera, is the best-known and the primary species maintained by beekeepers.

Figure 1.1 World distribution map of honeybees (retrieved from: https://www.apiservices.biz/)


The life cycle of bees follows complete metamorphosis, comprising four stages: egg, larva, pupa and winged adult. The sex of a bee is determined by whether or not the egg is fertilised. Unfertilised eggs develop into male bees (drones); fertilised eggs that are fed royal jelly initially and switched to honey and pollen after three days develop into worker bees, while a fertilised egg that is fed only royal jelly develops into a queen bee (Snodgrass, 1956). It typically takes eight days from the time the egg is laid until the development of a mature larva (Nelson, Sturtevant, & Lineburg, 1924). The length of the pupal stage depends on the type of individual: worker bee, nine days; queen, five days; drone, eight to nine days (Bertholf, 1925). Most of the transformation processes, such as the development of organs and the reconstruction of the musculature, begin in the pupal stage, and some in the late larval stage. Inside a honeybee community, or specifically inside every single beehive, there exists a hierarchy and a set of strict social regulations, termed eusociality. Eusociality is the highest level of organisation of sociality and is defined by four characteristics: adults living in groups, cooperative care of juveniles (individuals care for broods that are not their own), reproductive division of labour (not all individuals reproduce), and overlap of generations (Wilson, 1971). In a typical bee colony, this social structure comprises three types of bee: one queen, around 200 fertile males (drones) and between 20,000 and 200,000 sterile females (workers). The queen and the drones are mainly responsible for mating and producing offspring; almost all other tasks are assigned to the worker bees. These tasks include feeding the larvae, building and cleaning the hive, foraging for food and defending the colony. These hard-working worker bees are the main subject of this research.


Figure 1.2 Adult honeybee anatomy

The anatomy of a typical worker bee is shown in Figure 1.2. It depicts the structure of the adult worker bee's exoskeleton and the major body segments: the head, thorax and abdomen. The bee is provided with a range of external organs: the antennae, ocelli and compound eyes enable perception of the environment; flight is made possible by the forewings and hindwings, with vision the major sense used for navigation; the mouthparts and the corbiculae (pollen baskets) are used for collecting and carrying food; and the sting is, of course, used for defence.

There are two types of muscle in an adult bee. Besides the ordinary muscles, which are reconstructed from larval muscles, the wing muscles are specially formed for flight and are connected to the tergum of the thorax (Morison, 1928; Tiegs, 1955). The forewing and hindwing are connected by a hook structure, termed the hamuli, when the wings are stretched out. During flight, the motion of the bee's wings resembles the motion of a boat oar treading water. The full wing stroke consists of three phases: mid-stroke, rearward stroke and forward stroke. At the end of both the rearward and forward strokes, the bee's wing rotates to create lift forces via rotation, acceleration and wake capture, while during the mid-stroke the wing maintains the correct angle of attack (the wing's orientation relative to the direction of motion) to prevent stall (turbulent air interrupting lift) (Altshuler, Dickson, Vance, Roberts, & Dickinson, 2005).

The bee uses its antennae for the senses of touch, smell and taste, and even for hearing. The antennae feature mechanoreceptors that respond to the movement of air particles, so the bee can 'hear' sound, as well as olfactory receptors to detect odours. There are three simple eyes, the ocelli, located on the head; these act as simple light sensors and also assist with navigation during daytime, since they provide information on the orientation of the sun. When bees are out foraging, they can recognise a desired compass direction based on several cues, including the sun and the polarisation pattern of the blue sky. The sun is the preferred or main compass; the other mechanisms are used under cloudy skies or inside a dark beehive. Bees also have a pair of compound eyes comprising many eye units, termed ommatidia. The compound eye is the main organ of vision and is sensitive to the polarisation of light. The compound eyes are more sensitive to blue and purple, enabling desirable flowers to be recognised much more easily through ultraviolet patterning, in addition to the detection of floral odours (Snodgrass, 1956).

Food source investigation and foraging are optimised in the worker bee, which has a pair of corbiculae (pollen baskets) on its hind legs (Figure 1.2), specially adapted for carrying large quantities of pollen back to the colony. Worker bees also have sensory bristles across their bodies and two pairs of wings (forewings and hindwings) that beat at an average frequency of around 230 Hz.

Information is transferred between honeybee individuals through two main means: pheromones and the dance language. The dance language indicates the distance, situation and direction of a food source (Von Frisch, 1967). The frequency of the waggle indicates the quality of the discovered foraging site, which encourages other workers to travel to the site discovered by the pioneer. Pheromones are complex chemical compounds produced by glands; they play a significant role in almost all behaviours of Western honeybees, including foraging, brood recognition, nest recognition, building, alarm, mating and defence (Mucignat-Caretta, 2014). There are two types of pheromone: primer pheromones and releaser pheromones. Primer pheromones act as a driving force in maintaining homeostasis and regulating factors across the entire colony, triggering complex and long-term responses; these are mainly released by the queen (Conte & Hefetz, 2008). Releaser pheromones have a weaker effect and trigger simple, transitory responses at the behavioural level, such as sexual, aggregation, alarm, trail, territorial and recognition behaviours (Ali & Morgan, 1990).

Like all other insects, the honeybee is very sensitive to ambient temperature. The Western honeybee must reach an internal body temperature of 35 °C before it can fly. Further, research suggests that the optimal air temperature for foraging is 22-25 °C. Below 7-10 °C bees are generally immobile, and above 38 °C their activity slows. Western honeybees can sustain temperatures of up to 50 °C for short periods (Heinrich, 1979).

The limited space inside the colony restricts the movement range of the bees, although behaviour within the colony is nevertheless complex. Research aiming to observe honeybee behaviour inside the colony involves less uncertainty and more controllable variables. For beekeepers and researchers, therefore, observations conducted inside the colony are much easier than tracking a swarm of bees out foraging and monitoring their inflight behaviours.

1.1.2 The unquestionable value

Humans have been collecting honey produced by honeybees for thousands of years, dating back to around 7000 BCE (Mallory, 1997). Historical sources suggest that the domestication of Western honeybees started in ancient Egypt. Beekeepers have long selected Western honeybees for desirable features, including increased honey production, reduced aggressiveness and resistance to disease (Weber, 2013). The honeybee produces a large variety of by-products, including honey, propolis, beeswax and royal jelly. However, the economic value of these bee products accounts for less than one-tenth of the overall value created by honeybees through their pollination activity. Insect pollination is responsible for around 80% of commercially grown crops and an estimated 35% of world crop production (Allsopp, 2008; Klein, 2007). These crops include tomatoes, beans, onions, apples and a huge range of other agricultural products. In the UK, insect pollination contributes an estimated £400m per year to the economy, with bees the most important pollinators (Wentworth, 2010). According to the American Beekeeping Federation and the United States Department of Agriculture (USDA), managed Western honeybees are responsible for the pollination of approximately $20 billion worth of crops across the United States, with another $4 billion attributable to wild bees (Chisel, 2015).

1.1.3 Current situation and threats

In the UK, the capacity of honeybees to act as pollinators has fallen by more than 50%, from a peak of 70.3% in 1984 to 34.1% by 2007 (Klein, 2007). Moreover, the area under cultivation has continued to increase while the honeybee population has declined dramatically. Currently, only about 2% of UK hives (~5,700) are known to be professionally managed for pollination services. Other beekeepers are unlikely to move their hives often enough to accommodate the flowering seasons of different plants, and instead leave the hives less intensively managed (Breeze, 2011). In the UK, most beekeeping is conducted on a small scale and geared towards honey production. There are only enough honeybee hives within the UK to pollinate a third of crops; the main pollinators are therefore wild bees and other insects (Breeze, 2011). Between 1985 and 2005 there was a significant reduction in the number of beekeepers in Europe, a fall of 31% (Potts, 2010). Insect-dependent crops could, in theory, be pollinated by hand, but the cost of this is prohibitive (initial estimates of labour costs suggest a figure of £1.5 billion a year) (Wentworth, 2010). The effect of losing pollinator species can be explored through computer models of plant-pollinator networks. These suggest that networks are fairly robust to the removal of specialist pollinators: plant species diversity can still be maintained until 70% of pollinator species are lost, providing the most specialised pollinators are lost first. Losing generalist pollinators such as honeybees, however, has much more severe consequences for biological diversity (Wentworth, 2010).

The main causes behind the decline of bee populations can be summarised as follows:

• Varroa mite (Varroa destructor) – an external parasitic mite that attacks and feeds on honeybees. It was responsible for disastrous infestations of Western honeybees in the early 1990s and is so far believed to be the biggest threat to the honeybee population. Infestation between managed hives can be controlled to some extent, but wild bees suffer infection rates of 95%-100% within the spreading radius (Goulson, 2015).

• Insecticides and acaricides – bees that are chronically exposed to various types of pesticide suffer damage to the central nervous system, compromising their ability to navigate and causing overstimulation, paralysis and death (Johnson, 2010). Neonicotinoids, in particular, have been shown to cause a wide range of harm to honeybees and wild bees, including damage to memory and navigation ability, and have been banned from outdoor use in the EU since 2018, due in part to contamination of water and soil (Carrington, 2018; Carrington, 2012). However, the actual mechanism is still poorly understood, since honeybees' foraging behaviours are not tracked continuously and the neurotoxins take a long time to act on an entire colony compared with the relatively short life cycle of a worker bee.

• Colony collapse disorder (CCD) (Weber, 2013) – unusually high rates of honeybee colony loss over a specific period. A global CCD event occurred between 2006 and 2009, with huge honeybee colony losses in North America, ranging from 30% to 90% in different US states (Johnson, 2010). The actual causes of CCD are poorly understood; researchers consider it "a syndrome of many different factors", including the use of pesticides, a new parasite and existing stresses that may compromise the immune system of bees (Tomizawa, 2005).

• Climate change – widely accepted as one of the biggest threats to biodiversity, and bees cannot escape it either. Climate change can severely disturb the balance of ecosystems, natural vegetation and the food chain, causing a spatial mismatch between plants and pollinators. Extreme weather also poses threats to overwintering honeybee colonies, making it difficult for them to survive colder periods (Goulson, 2015).

• Agricultural intensification and urbanisation – the conversion of semi-natural land into both intensively farmed arable land and urban landscapes has resulted in a dramatic reduction in the quantity and variety of food sources and available habitats for honeybees. As a dramatic example, the UK has lost 97% of its wildflower meadows since the 1930s (Wentworth, 2010). This also makes it difficult for honeybees to travel to foraging sites, since habitats are often fragmented and long travel distances are required.

1.1.4 Strategies to maintain effective pollination

The UK's Department for Environment, Food & Rural Affairs published a 10-year plan in 2014 called the National Pollinator Strategy (NPS) (UKNS, 2018), aiming to bring together the efforts of government, beekeepers, researchers and industry and to provide guidance for improving the status of pollinating insect species. The NPS implementation plan proposed four types of action:

• Strengthening the evidence base – detailed monitoring and a better understanding of pollinators, providing information on the pressures faced by these pollinators and their impact on population decline.

• Managing our land – the preservation of high-quality, pollinator-friendly habitats; engagement with farmers and beekeepers on pollinator management and effective on-farm measures; and guidance on the use of pesticides and pest management.

• Bee health – the identification and control of bee pests and diseases, the surveillance of invasive species and the protection of honeybee health.

• Engaging people – raising awareness of the current situation and encouraging ordinary people to take action to protect endangered pollinators.

For a research team answering this call, actions should focus on understanding the behaviours of pollinators, using all available facilities to extract useful data, and on translating the results into assessments of bee health and living conditions, as well as into measures that genuinely support the pollinators.

1.2 Main contributions

• Previously published studies on the monitoring of insect behaviour were reviewed widely, and alternative approaches to the same goal, made possible by present-day technology, were investigated. The monitoring of insects requires the investigation of multiple remote sensing techniques, including radar monitoring, radio-frequency approaches and active/passive visual imaging. The advantages and disadvantages of these methods, as well as the trade-off between performance and cost, are fully explained and discussed in the thesis.

• An imaging system was developed that is capable of extracting 3D flight trajectories of honeybees with acceptable resolution and accuracy, and with a trackable volume larger than has previously been achieved. The system is low-cost, fully operational in the field and able to compensate for a certain level of noise, which is important if it is to be used by an ordinary beekeeper. Further, prior to the final design of the system, the physiology of honeybees, their flight characteristics, movement range and foraging preferences were studied thoroughly, and reasonable assumptions were made based on related studies, so that the efficiency and reliability of the system could be assured. The imaging system also provides parameter presets for different application scenarios, to minimise the time required for initialisation and calibration before monitoring starts.


• The results obtained from the system have shown great application value in monitoring bee health and protecting bees. The system is able to observe and extract information on honeybee inflight behaviour at a precision that has, as far as is known, never been achieved before. Innovative interpretations and new perspectives on bee behaviour were therefore revealed, including a more detailed understanding of the division of duties at different stages of the colony. This may include the foraging preferences of worker bees, swarming and population variations in response to environmental change, or micro-level monitoring of the defence and recognition mechanisms among honeybee individuals. Further information can be revealed by more specific and targeted analysis to monitor the health status and activity intensity of the target colony, so that damage caused by external factors can be identified early and determined accurately.

1.3 Chapter summary

Chapter 2 reviews the literature on insect observation strategies. It provides an evaluation of techniques ranging from traditional light traps and modified marine radar to advanced vertical-looking radar and state-of-the-art harmonic radar. It also discusses the efforts of researchers in the area of visual tracking, both under laboratory conditions and in field trials, and the development of image processing techniques in artificial intelligence, surveillance and several other areas. The literature provided inspiration and ideas on what is important in the observation of insects and on the possibilities of bringing new techniques into insect tracking.

Chapter 3 provides the fundamental theory and relevant mathematical derivations that support the subsequent processing algorithms. This includes the mathematical model behind the applied image segmentation method, the Mixture of Gaussians model, and the architecture of a typical Kalman filter as applied to object tracking. Subsequently, the Hungarian algorithm, which is used to allocate incoming measurements to existing tracks, is explained, together with the necessary execution strategy. The principle of converting 2D coordinate pairs to 3D coordinates is also described. This is based on appropriate definitions of the information from a multi-camera imaging system and the manipulation of matrices to calculate the relation between the cameras and the target, so that 3D triangulation can be performed accurately.
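To make the role of the assignment step concrete, the following is a minimal sketch (not the software described in later chapters) of how detections might be allocated to existing tracks with the Hungarian algorithm. It assumes SciPy's linear_sum_assignment as the solver, and the gating threshold max_dist is an illustrative parameter rather than a value taken from this work.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_detections_to_tracks(predicted, detections, max_dist=50.0):
    """Match predicted track positions to new detections (both Nx2 pixel
    coordinates) by minimising total Euclidean distance; max_dist is an
    illustrative gating threshold, not a value taken from the thesis."""
    if len(predicted) == 0 or len(detections) == 0:
        return [], list(range(len(predicted))), list(range(len(detections)))

    # Cost matrix: distance between every track prediction and every detection.
    cost = np.linalg.norm(predicted[:, None, :] - detections[None, :, :], axis=2)

    # Hungarian algorithm: optimal one-to-one assignment.
    rows, cols = linear_sum_assignment(cost)

    matches, unmatched_tracks, unmatched_dets = [], [], []
    for t, d in zip(rows, cols):
        if cost[t, d] <= max_dist:
            matches.append((t, d))
        else:
            unmatched_tracks.append(t)
            unmatched_dets.append(d)
    unmatched_tracks += [t for t in range(len(predicted)) if t not in rows]
    unmatched_dets += [d for d in range(len(detections)) if d not in cols]
    return matches, unmatched_tracks, unmatched_dets

# Example: two existing tracks, three new detections.
pred = np.array([[100.0, 200.0], [400.0, 120.0]])
dets = np.array([[102.0, 198.0], [600.0, 600.0], [395.0, 125.0]])
print(assign_detections_to_tracks(pred, dets))
```

In a tracking loop of this kind, unmatched tracks are typically coasted on the Kalman prediction for a few frames, and unmatched detections spawn candidate new tracks.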

Chapter 4 covers the investigation of appropriate imaging principles, the development of the hardware, the conduct of the experiments and the collection of the video data. It first describes the obstacles encountered when imaging bees at an early stage of the research, mainly with respect to several active infrared imaging methods. From these early investigations it was decided that visible-range passive imaging was the correct approach, and a set of experiments was designed and conducted. The chapter describes the experimental procedure in detail and provides evidence of the success of tracking tiny objects such as honeybees using ordinary digital video cameras. Next, the selection of the camera model for the final trials is presented, together with a discussion of the trade-off between a higher resolution and a higher frame rate given a fixed transmission bandwidth. The construction of a simulated 3D imaging scenario in a 3D rendering tool is also described. Finally, the procedure for the final experiments, in which two cameras recorded synchronously, is provided.
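As a rough illustration of that resolution-versus-framerate trade-off, the short calculation below compares the raw pixel throughput of two candidate recording modes. The frame dimensions are nominal GoPro values assumed for illustration (2.7K ≈ 2704 × 1520, 4K ≈ 3840 × 2160) and are not taken from the hardware chapter.

```python
# Rough pixel-throughput comparison for two candidate recording modes; the frame
# dimensions are nominal GoPro values assumed for illustration only.
modes = {
    "2.7K @ 60 fps": (2704, 1520, 60),
    "4K   @ 30 fps": (3840, 2160, 30),
}
for name, (w, h, fps) in modes.items():
    print(f"{name}: {w * h * fps / 1e6:.0f} Mpixel/s")
# Both modes demand a similar raw throughput (~247-249 Mpixel/s), so under a fixed
# bandwidth the choice is effectively spatial detail versus temporal resolution.
```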

Chapter 5 represents the most significant part of the thesis, describing the development of the entire 3D tracking software. The chapter includes the definition of the problem and the possible approaches to processing the data. Several commonly used image segmentation algorithms are compared and investigated, and the best approach for processing small-scale targets such as honeybees is selected. The chapter also describes the removal of the different types of noise in the video frames, including static background objects, large-scale foreground moving objects and foliage such as grass, which was the most challenging part. The concluding part of the chapter covers the implementation of the Kalman filter for the tracking and motion estimation of objects moving in 3D space, and the conversion from 2D coordinate pairs to 3D coordinates following epipolar geometry principles.
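As an illustration of that 2D-to-3D conversion step, the sketch below (a simplified example, not the implementation described in Chapter 5) triangulates one matched point pair from two calibrated cameras using OpenCV's linear triangulation. The projection matrices, camera poses and the test point are all assumed values chosen so the result can be checked by hand; in the real system they would come from the calibration procedure.

```python
import cv2
import numpy as np

# Illustrative projection matrices (3x4) for two cameras in an orthogonal layout;
# in the real system these would come from the intrinsic/extrinsic calibration.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])              # reference camera at the origin
R, _ = cv2.Rodrigues(np.array([[0.0], [-np.pi / 2], [0.0]]))  # second camera rotated 90 degrees
t = np.array([[1000.0], [0.0], [0.0]])                      # ...and offset (units: mm)
P2 = np.hstack([R, t])

# One matched detection pair in normalised image coordinates (2x1 arrays),
# generated here from the known world point (200, 150, 800) mm for checking.
x1 = np.array([[0.25], [0.1875]])
x2 = np.array([[1.0], [0.75]])

# Linear triangulation: returns a 4xN array of homogeneous 3D points.
X_h = cv2.triangulatePoints(P1, P2, x1, x2)
X = (X_h[:3] / X_h[3]).ravel()
print("Reconstructed 3D point (mm):", X)   # approximately [200. 150. 800.]
```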

Chapter 6 presents the results produced by the software and covers the analysis of the flight statistics at both the individual and the swarm level. In the analysis of individual bee tracks, two randomly selected tracks are analysed in detail and shown in perspective view in 3D space. The instantaneous velocity, acceleration and orientation of each track in every frame are calculated and presented, showing when each bee accelerated and decelerated and how turns were accomplished. The swarm-level analysis discusses the general movement trends of the worker bees flying around the hive, including the number of exiting and entering bees and the general flight direction and average velocity of the swarm at different times. A clear level of order is extracted and made evident from the seemingly chaotic flights. Besides the presentation of the 3D data and the basic analysis of the flight statistics, the chapter also discusses the density distribution of the bees using heatmaps. The information yielded by this analysis is discussed with respect to decision making, leading and guidance during flights, division of duties and adaptation to environmental changes affecting the entire bee community. The capabilities and overall performance of the entire system with regard to various practical applications are also discussed.
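Since the per-frame velocities and accelerations referred to above follow directly from the reconstructed 3D coordinates and the frame rate, the following is a minimal sketch of how such statistics can be derived with finite differences. It assumes evenly spaced samples at a fixed 60 fps; the synthetic track is purely illustrative.

```python
import numpy as np

def flight_statistics(points_mm, fps=60.0):
    """Estimate per-frame velocity and acceleration (mm/s, mm/s^2) from an
    Nx3 array of reconstructed 3D positions sampled at a fixed frame rate."""
    dt = 1.0 / fps
    velocity = np.gradient(points_mm, dt, axis=0)      # central differences
    acceleration = np.gradient(velocity, dt, axis=0)
    speed = np.linalg.norm(velocity, axis=1)            # magnitude of velocity
    return velocity, acceleration, speed

# Example with a synthetic, slightly curved 1-second track.
t = np.linspace(0.0, 1.0, 61)
track = np.stack([500.0 * t, 100.0 * np.sin(2 * np.pi * t), 50.0 * t], axis=1)
vel, acc, spd = flight_statistics(track)
print("mean speed (mm/s):", spd.mean())
```

In practice the raw differences are noisy, which is why averaging across adjacent measurement points (as in the acceleration plots of Chapter 6) or Kalman-smoothed estimates are preferred.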

Chapter 7 summarises the previous chapters and emphasises the significance and novelty of this research. Additionally, it provides several suggestions for optimisation and further improvement of the system and outlines possibilities for further work, including upgrades to the imaging equipment, improvements to the experimental plan, the measurement of other environmental statistics and collaboration with entomologists and bee experts.


2 Chapter 2 – Literature review

2.1 An overview of insect behaviour

The study of insect behaviour is a very broad and widely researched topic. It concerns the largest group within the arthropod phylum: how insects live their lives in various environments, how they react to the behaviours of other animals (including humans), and how they influence the ecological equilibrium and the way the world works. In this research, bee flight behaviour is explored a little further.

An established definition of behaviour is the action an individual takes in response to the surrounding environment and certain stimuli, based on the information acquired from it (A Dictionary of Biology, n.d.). For an insect, these actions include brooding, breeding, defensive behaviour, foraging, preying, migration, repopulation and clustering under extreme weather conditions. Although many of these topics have been studied intensively over the years through laboratory experiments, often accompanied by advanced genetic and microscopy techniques, some still lag behind the progress made in the study of larger vertebrates; this is especially the case for the movement behaviours of insects.

Considering the movement of individual insects, the broad topic can be subdivided into three main areas: first, short-range movements, which refer to movement within or around nests and colonies (for example, the waggle dance of honeybees); second, foraging and preying behaviours involving travel of up to several hundred metres (food-resource searching strategies, return flights, forager recruitment); and third, long-range movements, including individual and swarm migration.

Unlike the study of large vertebrates, the study of insect movement has its own limitations and advantages. Most insects are too small to be tracked individually with conventional equipment. Although the migrations of several emblematic insect species, such as the monarch butterfly (Danaus plexippus) (Garland & Davis, 2002; Knight, Brower, & Williams, 1999; Howard & Davis, 2009) and the dragonfly (Anax junius) (Wikelski, et al., 2006), have been studied well, since they are among the insect species with the largest bodies, the general and ultimate motivation behind insect migration is often unclear. And unlike most vertebrates, insects do not undertake return migrations; they tend to repopulate at certain milestones along the migration route and at the destination (Holland, Wikelski, & Wilcove, 2006). This can make insect tracking a difficult, time-consuming and heavily labour-dependent task. In addition, the range of movement and manoeuvrability of insects depends largely on the size of the insect and the use of wings. An ant (Cataglyphis fortis) will typically travel approximately 30-60 m for food on plains with an average level of resources (e.g. vegetation, food sources, water) (Sommer & Wehner, 2004), while a typical swarm of locusts can travel as far as 150 kilometres in a day (Steedman, 1990). Generally, the larger the body of the insect, the greater its movement ability and the broader its movement range can be. But this is not always true; winged aphids can rise as high as 600 m on updrafts and become widely dispersed, although their body sizes normally lie within the range of 2-5 mm. The large range of insect sizes and their ability to travel long distances bring many challenges to the study and observation of insect movement. Although small-sized insects often have limited ranges of movement, they are harder to detect using conventional tracking methodologies (such as entomological radar). Further, sensors such as tracking transponders, which can be glued to larger insects, are too large or heavy for insects such as Diptera. These insects often blend into the ambient environment and even prefer to dig holes in the soil and hide underground, using leaves and roots as shelter to avoid direct sunlight and the risk of visual contact with predators. This presents a challenge to monitoring strategies. However, even for large insect species, tracking devices (either passive imaging cameras or radar) face challenges in covering entire flight tracks; in most cases the obtained trajectories and positioning data are fragmented and incomplete, making conclusions less reliable. For many eusocial species, the tasks and patterns of behaviour vary dramatically between individuals, depending on their sex, caste and lifecycle stage. The experimental routine and instrumentation must therefore be designed differently for different castes.


On the other hand, observing insect behaviour is often more efficient and economical because insects are amenable to experimental manipulation: they are relatively easy to catch, breed and handle. Large-scale population studies are feasible, and the short lifecycles allow researchers to quickly restart or readjust experiments upon failure (Chapman, Reynolds, & Wilson, 2015).

Generally, animal behavioural research follows two main approaches. In the first, experimental subjects are placed in an artificial environment in the laboratory, specially designed to provide only a small number of possible stimuli. The reactions the animals can make are restricted and prescribed by the experimental design, similar to psychological experiments; the effect of uncontrolled variables on those under study is thus neglected. In the second approach, the animal is observed under natural conditions using appropriate measurement and recording equipment (Green & Swets, 1966). In such field conditions, the richness of the statistics is guaranteed, but much more ambient noise is introduced, and this type of disturbance can make the results unpredictable (Stephens, Johnson-Kerner, Bialek, & Ryu, 2008). In the case of insect behaviour, instead of observing the insects directly, many researchers focus on the consequences of movement, looking for cues and evidence of ongoing or completed migrations, while others rely more on recent advances in imaging and data acquisition techniques, in which case direct observation of behaviour is possible.

The ultimate goal of studying insect behaviour is to understand motivation or causation: the response to different external stimuli and the main triggers of certain crucial behaviours. This has a significant role in, for example, pest management, where the purpose is to suppress reproduction before external environmental factors alter the internal state of a swarm, or in applying more efficient, more comprehensive and more economical methods to the protection and breeding of beneficial insects. Note that the observation and tracking of insect behaviour is not merely for the sake of the trajectories and traces themselves, but provides strong evidence towards a thorough understanding of the mechanisms that underlie them.


2.2 Ground-based passive observation methods

There are two main approaches to obtaining evidence of possible insect movement with ground-based measurements. One is the use of light traps, a very popular way of sampling nocturnal insects. In a major study, a research group conducted a large set of experiments, capturing night-flying noctuid moths (Lepidoptera: Noctuidae) in a mountain valley in south-eastern Spain overnight for 170 nights over two sampling years (1991-1992), using light traps and bait traps (Yela & Holyoak, 1997). To increase the rate of successful catches, they also placed a bait trap, using a mixture of wine and sugar, alongside the blacklight (UV-A) source. By comparing the number of specimens caught on different days, the team determined the effects of moon phase and temperature on the activities of adult noctuid moths. Another major study, in south-western Japan, counted the nightly catches from a 60 W light trap during the autumns of 1980-85 and found evidence of rice planthoppers (N. lugens Stal and S. furcifera Horvath) migrating from China to Japan, as well as a possible cause of the invasions by the same species in early summer (Wada, Seino, Ogawa, & Nakasuga, 1987). Light traps provide valuable information about the distribution and population of certain insect species and are a very efficient way of catching insect specimens, but they rely heavily on insect phototaxis, which means that which species will be attracted or repelled is very uncertain. Their performance can also be affected by many factors, including the trap design, the bulb type, the timing of the experiments and the moon phase (Sheikh & Thomas, 2018).

Another commonly used method requires the marking of the studied insects. Insects in such studies are often captured, marked with specific markers and released into the field; some are recaptured at the same or different locations after certain time intervals. The choice of marker depends on the insect species being tracked and the environment the insects will encounter (Hagler & Jackson, 2001). They can either be marked individually, with physical labels or stickers (Howard & Davis, 2009), or in groups, with dust or spray paint (Hagler & Jackson, 2001). Other marking methods, such as genetic marking and pollen marking, require no contact with the insects at all. In pollen marking, the insects mark themselves by collecting pollen that has been marked beforehand (Suchan, Talavera, Sáez, Ronikier, & Vila, 2019), while in genetic marking, researchers record visible genetic mutations of the observed individuals so that they can be distinguished from the wider population (Lowe & Allendorf, 2010; Nagoshi, Meagher, & Hay-Roe, 2012; Lukhtanov, Pazhenkova, & Novikova, 2016). Neither technique requires the insects to be captured and handled, which minimises the possibility of affecting their genuine behaviour and decision making during the coming migration. The mark-capture and mark-release-recapture methods are a powerful tool for understanding insect population dynamics, dispersal and other ecological interactions. They provide cues to possible migration directions on a much larger spatial and temporal scale, often spanning generations, and help to generate hypothetical migration routes based on several observations (Talavera & Vila, 2017). These methods require very little training and are non-destructive to the insects.

The biggest disadvantage of ground-based observation and capture is the lack of information on the height of migration, which indicates the origin of the migratory populations, given the varying wind speeds and directions at different heights. These methods can also easily be affected by weather and lunar cycles. Light traps only sample nocturnal insects, and marking and recapture struggles to yield high-resolution data. In addition, ground-based methods are often very costly, tedious and time-consuming, and the results are heavily influenced by the timing of the experiments and are, compared with remote sensing techniques, often very subjective.

2.3 Airborne netting and sampling platforms

Since the ground-based sampling methods do not provide information on the height of migration, researchers have also exploited airborne sampling methods at higher altitude for scientific studies since 1930s (Hardy & Milne, 1938) to collect insect specimens when the migrants ascend beyond the “flight boundary layer”, which is defined as the maximum height at which the windspeed is still less than the maximum airspeed of any given insect in it. This may extend to

several hundred metres above the ground. Air netting using tethered balloons or kytoons is very useful for determining taxonomic composition. A variant of this method involved the use of a remotely piloted vehicle to sample the aerial density of potato leafhoppers (Empoasca fabae); this performed better than ordinary weather balloons in heavily turbulent conditions (Shields & Testa, 1999). Recent studies use air netting as supplementary material for the species identification of the monitored insects, for example netting suspended from kytoons alongside high-frequency radar scans (Riley, et al., 1991) and a compound system comprising vertical-looking radar, aerial netting and a network of ground light traps (Chapman J. W., et al., 2002).

However, airborne netting is expensive and not practical for long-term monitoring. Further, the lack of adequate sampling volume to track less numerous species, and the limited capability to monitor insect movement at different heights simultaneously, restrict its usefulness in many aspects of insect movement characterisation.

2.4 Entomological radar

Before radar was introduced to entomological research, the only methods for obtaining information on insect movement were indirect, collecting cues and traces of historical positions through specimen catches, foraging evidence and visual contacts recorded by researchers at different locations. The use of entomological radar offers the prospect of supplanting these labour-intensive and sometimes unreliable methods, and it has made a huge contribution to the study of insect migration phenomena. Since the first successful observation of insects in 1949, the remote sensing capability of the various versions of radar has extended from detection ranges of several kilometres down to several metres above ground, with vertical-looking radars, harmonic radars and other radio telemetry systems.


2.4.1 Traditional scanning radar / marine radar

The first reliable report of radar echoes from insects dates to 1949 (Crawford, 1949). The report suggested the potential use of radar for detecting insects. In 1954, a swarm of locusts was detected by a radar system (Rainey, 1955). The pioneering work of G. W. Schaefer, who deployed a radar specially designed for locust observations in 1968, announced a new era of monitoring insect migration (Schaefer G., 1972). Compared with the conventional indirect insect sampling approaches, radar can detect insect behaviours in both diurnal and nocturnal studies. Since the insects do not perceive radar waves, the observation results are undistorted and representative of natural activity. The radar described by Schaefer employed a scanning pencil beam with a wavelength of 9 mm, which could be used to estimate the target’s speed and orientation of movement. Much similar research was conducted during the 1970s-1980s. Some of the systems used designs similar to the Schaefer radar. Others used X-band scanning radars with a wavelength of 3.2 cm, based on modified marine equipment, to study smaller species such as planthoppers (Riley J. R., 1992). Aircraft-mounted radars were also investigated, in which the vertical profiles of insect density and of average alignment were obtained along the flight path of the aircraft over hundreds of kilometres (Schaefer G. W., 1979; Greenbank, Schaefer, & Rainey, 1980). This significantly extended the range of observation over ground-based radars (though the latter already have a good monitoring range), and also provided transects of a migration.

The range of observation with traditional scanning radars is usually from around 1.5 km to several tens of kilometres for medium-bodied insect individuals (~100 mg) in a swarm. Traditionally, the monitor used to display the detected targets of a ground-based radar is called a plan position indicator (PPI) (Figure 2.1). The flight path can be constructed by connecting sequential registrations of the same target on the PPI from each circular scan. Target identification and recognition is often achieved in two steps: first, the echoes returned by insects are distinguished from those of other animals or precipitation by comparing movement speeds and wingbeat frequencies, which is straightforward; however, the discrimination of different insect species is extremely difficult

due to the large overlap in wingbeat frequencies both within and between species (Riley J., 1980). The main observation targets are usually medium to large-sized insect species, such as grasshoppers, locusts, and pyralid and tortricid moths. As mentioned above, an important advantage of scanning radar is that insect flight behaviour is not affected by the radar beams. However, scanning radars rely heavily on evidence from other insect observation methods to determine the species; these methods are mostly time-consuming and labour-intensive, and thus not suitable for long-period observations, despite a certain level of automation. Over the last 30-40 years, radar has revealed much about insect flight behaviour, including the determination of the boundary layers of insects’ common activity and migration, prediction of the movement direction of an insect swarm, the population distribution in the horizontal plane and with height, and the insects’ response to temperature and wind profiles during migration (Drake, 1984; Drake & Farrow, 1983).

Figure 2.1 Radar PPI display on a morning in 1988 at Jiangpu, showing a large number of take-offs of micro-insects (Riley, et al., 1991).

2.4.2 Vertical-looking radar (VLR)

This form of radar, specifically designed for entomological research, emerged in the late 1990s. VLR systems electronically record echoes from the targets as they pass through a stationary vertical-pointing beam, as the name suggests, showing

a gradually rising and falling signal echo. VLR systems allow continuous and autonomous monitoring over long time periods and typically employ a wavelength of 3.2 cm.

Early versions of VLR systems could estimate the number of targets passing by in a given time period and their speed, determined by the rate of the rise and fall of the echo signal. However, the identification capabilities were very limited. More sophisticated designs emerged, called nutating VLRs (Figure 2.2 C), which provided more detailed information. They are able to measure the target’s displacement vector, heading direction, wingbeat frequency, size and shape; further, a VLR can derive the aerial densities, migration fluxes and orientation distributions of the migrants by transmitting a signal that incorporates target interrogation (Harman & Drake, 2004; Smith, Riley, & Gregory, 1993).

The nutating VLRs emit a circularly symmetric, vertically looking beam in which the plane of linear polarization is continuously rotated by mechanically turning the upward-pointing wave-guide feed about the vertical axis. In addition, the feed is offset by a very small angle (0.18°) around the vertical axis, making the beam oscillate slightly. This arrangement produces nutation – a rocking motion along the axis of rotation, as can be observed in the motion of a spinning top. The inclusion of beam nutation and rotating linear polarisation allows information related to the shape and body alignment of overflying insects to be deduced (Chapman, Reynolds, & Smith, 2003).

A VLR system is much easier to set up than a traditional radar. The range of observation extends from approximately 150 m to 1200 m above the radar’s antenna in a series of height bands (~15 bands for a typical design, with each band around 45 m deep). Target identification is achieved by first establishing the mass of the insect body, through a comparison of the maximum and minimum backscattered radar reflectivity of the insect, together with the wingbeat frequency. VLRs operate with a very high level of automation and are typically computer controlled. This means it is practical to set up a VLR in remote regions with a minimum of human intervention. The siting choices are very flexible, and the datasets usually extend over years. This has yielded millions of records of insects and many migration swarms. A network of VLR systems can provide a

comprehensive view of the migration activities of various species covering an entire region. The migration routes, and the timing and lateral extent of migrations, can be validated with much higher accuracy and robustness than with many other methods; hence the influences of ambient conditions, such as wind or weather, on migration behaviour can be established with greater confidence.

As a recently developed technique, VLR systems have shown considerable potential in entomological research, but they also have clear limitations:

• The range of insect species for which VLRs are suitable is very limited, since micro-insects such as ants and cereal aphids are too small to be detected, and these predominate in the wild. The main subjects of study are restricted to larger species such as hoverflies. Insects weighing 1 mg or less are very unlikely to be detected properly, and only those weighing more than 15 mg can be detected throughout the entire detection range. Millimetre-wavelength radar produces stronger echoes, so the signals reflected by smaller targets are strong enough to be detected. This type of radar proved its usefulness in some studies, in one of which a Q-band 8 mm wavelength radar was employed together with aerial netting and ground trapping equipment (Riley & Reynolds, 1987). This suggests that using a smaller wavelength (higher frequency) might solve the issue, but such a system would be much more expensive and might even require custom-made components.

• The blind zone of a VLR extends from approximately 150 m above the ground downwards, because of the time interval (1 millisecond) required to process the information. An alternative approach is to use a frequency-modulated continuous-wave (FMCW) radar that can work in conjunction with a normal VLR system to provide better low-altitude performance and greater vertical resolution. But it is very expensive and, used alone, lacks the target parameterization capabilities that are essential for identification (Bean, Mcgavin, Chadwick, & Warner, 1971; Atlas, Metcalf, Richter, & Gossard, 1970).

• Data on wingbeat frequency can only be extracted when the radar nutation cycle is periodically stopped, which means the body mass and


the wingbeat frequency cannot be measured simultaneously in a single session of operation; the VLR needs to be switched between nutating and non-nutating modes periodically (Chapman, Reynolds, & Smith, 2003).

• The definitive determination of the species involved (i.e. the ground truth) still requires the capture of insect specimens at high altitude using meteorological balloons or remotely piloted drones. In the past few decades, most of the successful research on observing migration phenomena of pest and beneficial insects has depended on the integration of information from other sources, for example aerial netting catches, ground trapping networks and biometeorology.

• The trade-off between the performance of the radar system and its cost is always a factor with such complex and expensive systems.

• The observation range in the horizontal plane is limited, since many radars have low or no mobility.

2.4.3 Conventional entomological scanning radar (without harmonic transponders)

The observation volumes of conventional marine radar, vertical-looking radar and Doppler weather radar are mainly focused on high-altitude observations because of their dead-zone limitations. However, the change in detection range with wavelength inspired the use of radar for low-altitude observations, such as of foraging behaviour. The echoes from ground vegetation (so-called radar “clutter”) are much stronger than those from the target insects during low-altitude observations (Loper & Wolf, 1986). Data can only be obtained by carefully choosing the siting of the receiver to provide a good field of view and high-quality observations.




Figure 2.2 (A) A ground-based harmonic radar for low-altitude insect tracking (photo by A. D. Smith). (B) Side view of a hand-held harmonic radar detector (O'Neal, Landis, Rothwell, Kempel, & Reinhard, 2004). (C) Photograph of a vertical-looking radar (VLR) sited on the rooftop of a building at Rothamsted (Chapman, Reynolds, & Smith, 2003).

2.4.4 Harmonic radar

Harmonic radar receives the returned signals from a rectifier circuit, commonly termed a harmonic tag or transponder, which is attached to the tracked animal (Figure 2.3 B – D). Echoes returned from the transponders have exactly half of the wavelength of the transmitted signal. The receiver is selectively tuned to the reflected signal and thus detects the target even with cluttered background.





Figure 2.3 (A) An RFID tag attached to a billbug (Silcox, Doskocil, Sorenson, & Brandenburg, 2011). (B) Bumblebee individual with a 200 mg transmitter attached (Hagen, Wikelski, & Kissling, 2011). (C) Attachment of a 300 mg radio transmitter to the thorax of a green darner (Wikelski, et al., 2006). (D) Honeybee with a radar transponder attached (photo courtesy of M. Garcia-Alonso). (E) Butterfly equipped with a UHF RFID tag (Särkkä, Viikari, Huusko, & Jaakkola, 2012).


There are two different designs of harmonic radar. One is the sophisticated ground-based radar station (Figure 2.2 A) and the other is a simpler handheld transmitter/receiver system (Figure 2.2 B) that provides good portability. For both designs the principle is the same: the signals returned from the transponders are very easy to distinguish because there are almost no naturally occurring rectifiers. The harmonic tags operate passively, so no battery is required, which means the tags can be made even smaller and lighter. Normally the detection range extends to 750 m and a scan completes in 3 seconds. The range resolution is around ±3 m and the azimuth resolution is ±1.25 m, which covers a very large portion of the movement range of many insect species.

The use of harmonic radar in entomological research was first reported by Daniel Mascanzoni and Henrik Wallin in 1985. The tracing capability of this technique was tested by tracking a nocturnal carabid species, Pterostichus melanarius Illiger (Mascanzoni & Wallin, 1986). The harmonic radar was deployed in a cereal field in Sweden and the team noticed that the carabid beetles were capable of dispersing long distances in a short time. After that, harmonic radar became a common tool for low-altitude insect tracing and contributed to a greater understanding of insect behaviour. One research team based in the UK, led by E. T. Cant, carried out a set of field observations in 1999 with harmonic radar to track the flight paths of five butterfly species within an agricultural landscape (Cant, Smith, Reynolds, & Osborne, 2005). The observation area was approximately 500 m × 400 m on the Rothamsted estate in the UK. There were several linear features in the field, including pasture, fences and scattered trees. Information was gathered regarding flight straightness, duration, displacement, ground speed, foraging and the influence of linear landscape features on flight direction. Before their study, butterfly mobility was only documented by visual observation and mark-recapture experiments. Harmonic radar has also been employed to study the waggle dance of honeybees (Riley, Greggers, Smith, Reynolds, & Menzel, 2005). Radar was used to test how effectively recruited bees translate the encoded information after observing the waggle dance performed by other worker bees. During these experiments, bees that were about to exit the hive were caught and fitted with tiny harmonic

transponders, each weighing 6-20 mg (depending on the degree of mechanical robustness required), and set free to forage. The number of exits was counted after each occurrence of a waggle dance inside the hive. In another study, the research group discovered that bees take multiple orientation flights before they start foraging, gradually learning the appearance of their hive and the surrounding landscape from different perspectives, so that they are able to navigate back once foraging is complete (Capaldi, et al., 2000). Other studies have found that social bees achieve this using a mechanism known as path integration: the flight vector is continuously updated at each landing position and, by integrating all the distances covered and angles steered for each displacement segment, the bees can eventually find their way back home (Collett & Collett, 2000). This form of navigation, also termed “dead reckoning”, was used by early aviation pioneers.

The limitations of harmonic radar centre on the use of harmonic tags, specifically in the following ways:

• The terrain and vegetation in the experimental field, although not sources of echo, often act as barriers and create shadow areas within which tags are undetectable.

• The weight of the tag normally ranges from 1 mg up to 15 mg. The tags produce extra drag on the flying insect and, because of the long antenna, alter the insect’s centre of gravity, making them unsuitable for small insects (Colpitts & Boiteau, 2004). The dipole needs to be at least 10 mm long to be detectable; the length that creates the largest harmonic cross-section lies around 12 mm (Boiteau & Colpitts, 2001). Studies suggest that, for the technique to have minimal impact on the number and quality of upward flights taken, harmonic radar tags should weigh no more than 23-33% of the insect’s acceptable extra loading, which is normally reserved for carrying pollen and nectar. The extra weight attached to the insect body can result in significantly increased energy consumption, affecting the navigation strategy and the insect’s decision-making during flight. So far, very few studies have addressed this effect.


• The glue used to attach the transponders to the insects can be toxic. Droplets of 0.1 mg of Krazy Glue, Loctite and Bowman FSA applied to the pronotum had no effect on the survival of the Colorado potato beetle or the plum curculio after 5 and 7 days, but caused more than 40% mortality after only four hours in both the western and northern corn rootworms. The three glues created an effective bond between the harmonic radar tag and the insect lasting 4-5 days in more than 85% of cases for the Colorado potato beetle and in almost 50% of cases for the plum curculio (Boiteau, Meloche, Vincent, & Leskey, 2009).

• Very light transponders are much less robust and reliable than heavier ones and are very hard to fabricate; but the heavier the transponder, the greater the burden attached to the insect. Therefore, there is an inevitable trade-off between weight and power (i.e. detection range).

• The returned signal contains no identification information. If multiple targets carrying transponders are tracked, it is difficult to uniquely identify each target since they all return the same type of echo. This limits the ability to track multiple targets simultaneously. For tracked individuals that fly beyond the maximum detectable range, it is almost impossible to re-track them when they return after several days.

• The harmonic tags require removal and reattachment for every trip of the insect, which is labour-intensive.

2.4.5 Radio-frequency identification

Radio-frequency identification (RFID) is a technology that involves a receiver unit, known as a ‘reader’, and an RFID tag attached to the object to be tracked. The reader emits an electromagnetic interrogation pulse to power the RFID tag, which returns unique information corresponding to the item. This technology has been used widely for many years and for many purposes, including access management, tracking of goods and deliveries and electronic travel documents.

For entomological research, since the tags are to be attached to insect individuals, they must not affect the insect’s mobility. Passive RFID tags can be

made very small and light, and they are inexpensive. Active tags have on-board batteries, which make them much heavier and larger, normally weighing more than the insect itself. So far, very few studies have used active tags on insects, despite the success of tracking dobsonfly larvae in shallow fresh water (Hayashi & Nakane, 1989). In that study, a crystal-controlled active radio tag powered by a silver oxide battery, weighing around 185 mg under water, was tied to the side of the prothorax. The receiver was installed just above the water surface. There have also been designs of much smaller active tags in recent years, such as the 95 mg active RFID tag with CMOS implementation made by Meera Kumari and S. M. Rezaul Hasan (Kumari & Hasan, 2020).

Unlike the signals returned from harmonic radar tags, which contain no identification information, the signal emitted by an RFID tag contains a unique inventory number; hence multiple targets can be tracked and distinguished simultaneously.

There are a number of insect studies using the RFID technique. Studies using radio telemetry have mainly focused on habitat use and movement: habitat selection, movement paths and distances. Very few have addressed foraging behaviours, migration strategies or flying patterns (Kissling, Pattemore, & Hagen, 2014). Movement studies of certain insect species have provided estimates of the average and maximum travel distances under natural conditions, including winged insects such as bumblebees (Hagen, Wikelski, & Kissling, 2011) and ground-living beetles (Figure 2.3 A) (Riecken & Raths, 1996; Negro, Casale, Migliore, Palestrini, & Rolando, 2008). Ultra-high frequency RFID systems have also been deployed to follow the activity and movements of a butterfly species (Melitaea cinxia) (Figure 2.3 E) (Särkkä, Viikari, Huusko, & Jaakkola, 2012). Another research team in Italy used RFID to study the habitat use of a ground beetle species (Carabus olympiae) and found a relationship between habitat selection, repopulation sites and the availability of food sources around the site. There have also been RFID-based studies on the foraging, resting and preying behaviours of dobsonfly larvae (Protohermes grandis) (Hayashi & Nakane, 1989) and bumblebees (Hagen, Wikelski, & Kissling, 2011). However, the extra weight of the transmitter had a very high likelihood of influencing the

insect’s behaviour. The migration paths of the green darner dragonfly (Anax junius) have been tracked using a battery-powered tag attached to the insect and a receiver carried by an aircraft working together with a ground team.

Many studies have been conducted on the behaviour of honeybees, which also form the main subject of this PhD project. In most of these studies, passive RFID tags were used, weighing about 3 mg (see Figure 2.4 A). This is relatively low, considering that a honeybee is typically able to carry at most about 70 mg of nectar plus 10 mg of pollen (Decourtye, et al., 2011; Gill, Ramos-Rodriguez, & Raine, 2012; Henry, et al., 2012). RFID readers were installed at the entrance of the hive (see Figure 2.4 B) to monitor the number of entries and exits of tracked honeybee individuals, since the working range is usually less than 1 cm. One team also put readers at several possible food resource sites (Schneider, Tautz, Grünewald, & Fuchs, 2012). They tested the effect of pesticides, including thiamethoxam, on homing ability, and the relationship between pesticide poisoning and bee mortality, by analysing the decline in the number of returning bees.

The biggest constraint on using RFID tags comes from the power of the signal. If a passive tag is used, the transmitter signal must be roughly one thousand times stronger than that required for an active tag. For an active tag, however, the extra weight limits its use. The battery of such tags normally lasts between 7 and 21 days, making long-term tracking impossible. Compared with harmonic radar or vertical-looking radar, the tracking range of the signal from an active tag lies between 100 and 500 m, while for a passive tag the distance between the tag and the transceiver must be under 1 m, or even several millimetres in some cases (Silcox, Doskocil, Sorenson, & Brandenburg, 2011). In addition, the impact of attaching a tag to an insect, especially to a very small one, has not yet been thoroughly studied. Research suggests that, compared with bumblebee (Bombus terrestris) individuals without transmitters, the worker bees fitted with transmitters exhibited significantly lower flower visitation rates and dramatically increased energy consumption. The bumblebees with transmitters tended to rest longer than those without after landing and ended up spending more time foraging on individual flower heads (Hagen, Wikelski, & Kissling, 2011).


Figure 2.4 (A) A honeybee with a 3-mg passive RFID tag glued to its back. (B) The RFID readers installed at the entrance of the hive for detecting returning marked forager bees (Henry, et al., 2012).

2.5 Optical imaging systems

As discussed in the previous sections, both entomological radar and low-altitude radio telemetry techniques have made vital contributions to entomological research, but each has its own disadvantages. Entomological radar, represented by vertical-looking radar, detects insect swarms rather than individuals, and is therefore mainly used for monitoring insect migration activities. Networks of multiple VLRs can expand the range of observation, mitigating the lack of portability to some extent. Entomological radar also has difficulty identifying the observed insect species and provides little information other than the speed and orientation of movement. For low-altitude observations, harmonic radar and RFID scanners have made significant contributions to studies of large-bodied insects including butterflies, locusts, beetles, moths and dragonflies. However, for smaller insects such as honeybees and aphids, the weight of the transponders represents a significant barrier to usage.

To obtain information on insect movement that is representative of normal behaviour, researchers have also exploited optical imaging techniques, including visual range passive optical cameras and infra-red active imaging systems. Because of rapid progress in imaging technology in recent decades, insect

activities can be recorded with much higher resolution and speed, and the processing of video data is also less labour-intensive.

2.5.1 In-field visual tracking with conventional instrumentation

Historically, optical imaging devices have been widely used for insect observation and detection, even before the advent of modern image processing techniques. In 1979, swarms of locusts filmed with high-speed cameras (500 frames per second) were analysed (Baker, Gewecke, & Cooter, 1981). The videos were recorded at midday during March, in Australia and New Guinea. Wingbeat frequency (~22.9 Hz), flight speed (~4.6 m/s), body angle and ascent angle were calculated. The work also suggested that tethered flight experiments conducted in controlled laboratory conditions may yield biased results. In July and August 1982, another research group recorded the flights of multiple gypsy moths with a camera mounted at a height of 27 m (David, Kennedy, & Ludlow, 1983). The moths were dusted with pink fluorescent powder to make them more visible. The observation area was around 400 m² and the positions of the moths in each frame were marked manually by projecting each video frame onto a drawing table. Video techniques have also been employed to observe two tsetse fly species (G. pallidipes Aust. and G. morsitans morsitans Westw.) (Gibson & Brady, 1985). The camera was positioned 2.5 m above ground looking vertically downward; the field of view was 2 m × 2.5 m. The ground was covered with black cotton cloth to provide the necessary contrast for the camera to detect the light-coloured flies (the viewing and analysis were done on a black and white monitor).

Image intensifiers have also been used to increase contrast against the background in low-light situations (Murlis & Bettany, 1977; Riley, Armes, Reynolds, & Smith, 1992; Stephens, Johnson-Kerner, Bialek, & Ryu, 2008). In one of the studies, a cascaded set of photomultiplier tubes gave a light magnification of approximately ×40,000 (Murlis & Bettany, 1977), with two 55 W car spotlights fitted with infrared filters as the illumination source. The

positions of the subjects, which were moths, were then superimposed to generate the flight track.

Seminal work was conducted by Schaefer et al., who designed a field system for infra-red active determination of insect flight (IRADIT) in 1979 to study natural insect behaviour (Schaefer & Bent, 1984). In order to successfully detect the rose grain aphid (Metopolophium dirhodum) against the bright afternoon sky, the illumination system comprised an adapted military xenon arc light source within a searchlight parabolic reflector, fitted with a high-pass filter with a lower cut-off in the infra-red (see Figure 2.5 A). The imaging system featured an image intensifier fitted with an 823 nm narrow-band filter to minimise the effect of solar illuminance and to provide a high SNR. The results (Figure 2.5 B & C) confirmed the success of the imaging system and the feasibility of separating moving tracks from other noise in the image.

In the late 20th century, the power of desktop computers limited the range and sophistication of image processing; human labour was needed to record the positions of the tracked insects, despite a certain level of automation in deriving insect motion. Extra calculations regarding the effect of ambient factors, such as wind speed and temperature, placed an even greater burden on computation. Now, with advances in imaging techniques and electronic hardware, visual tracking systems can be made fully automated and can process the data more efficiently, offering a powerful tool for the analysis of insect in-flight behaviours.




Figure 2.5 (A) The illumination source used in the IRADIT system. It was fitted with an infra-red high-pass filter and powered by a pulsed xenon light source. (B) The superimposed enhanced images showing the manoeuvre of an insect against the bright sky. (C) The computer-processed image of (B) where the noise was mostly removed and the complete track of an insect is shown (Schaefer & Bent, 1984)

2.5.2 Insect Monitoring with Light Detection and Ranging (LiDAR)

LiDAR is a method that measures the distance to a target using ultraviolet, visible or near-infrared light, imaging non-metallic objects via active backscattering. The distance is determined from the return time of each signal, which means a LiDAR system outputs a picture of distance instead of colour. Conventional designs comprise a collimated laser beam and a point sensor; the observation field is illuminated point by point, so the efficiency is low. Modern designs use a wide diverging laser beam and a 1D or 2D sensor array. Since the entire scene is illuminated with one pulse, the system is not sensitive to platform motion and information from thousands of pixels can be acquired simultaneously (Medina, Gayá, & Del Pozo, 2006). Compared with conventional radar systems, LiDAR often has a better range and a wider field of view. Robustness can be improved by fusing LiDAR measurements with other types of sensors.
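To make the ranging principle concrete, the short sketch below (added here purely as an illustration, not taken from any of the cited systems) converts a pulse's round-trip time into a target distance; the example return time is an invented value.

```python
# Toy illustration of LiDAR time-of-flight ranging: distance from round-trip time.
C = 299_792_458.0  # speed of light in m/s

def range_from_return_time(t_round_trip_s: float) -> float:
    # The pulse travels to the target and back, hence the division by 2
    return C * t_round_trip_s / 2.0

print(range_from_return_time(333e-9))  # a ~333 ns return corresponds to ~50 m
```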

LiDAR systems have been drawing extensive attention in many research fields. Airborne and satellite-based LiDAR systems are commonly used in the generation of topographic maps, canopy heights, digital elevation maps and

landslide investigation (Jaboyedoff, et al., 2012; Lohani & Ghosh, 2017). There are also atmospheric LiDAR systems used to quantify various atmospheric components (e.g. profiling clouds, measuring winds, studying aerosols). Terrestrial and mobile LiDAR systems are currently widely studied in robotics and transportation, for perception of the environment and object classification. LiDAR has found application in scanning road surfaces (Lam, Kusevic, Mrstik, Harrap, & Greenspan., 2010), lane markings and traffic signs, based on the reflectivity of the surface (Zhang, 2010; Soilán, Riveiro, Martínez-Sánchez, & Arias, 2016), and roadside objects (Pu, Rutzinger, Vosselman, & Elberink, 2011). Being able to generate high-resolution maps makes it possible for a LiDAR system to detect flatness defects of road or concrete surfaces and to monitor landslides, rock falls and debris flows (Jaboyedoff, et al., 2012).

Figure 2.6 Example of airborne LiDAR data, showing the point cloud colour-coded by height (left) and by light intensity (right) (courtesy Optech Inc.).

Applications of LiDAR to insect monitoring have also been investigated in recent years. Since the backscattered signals can be sampled at up to several kHz, LiDAR is commonly used to determine insect species from their wing-beat frequencies. Gebru et al. (Alem Gebru, 2017) designed a multispectral polarimetric optical detection system that sampled backscattered infrared signals at 4 kHz to determine mosquito species and sex in flight; the main parameter used for this determination was the wing-beat frequency. In another notable study, the autofluorescence of the target insects, and their fluorescence after being dusted with different dyes, were investigated (Mei, et al., 2012). The species of different planthoppers (Hemiptera) and moths (Lepidoptera) were determined from the peaks in the returned signal spectrum, based on the reflectivity measured with laser beams of different wavelengths. The target distance was around 50 m and

the system was able to detect insects as small as 20 mm. A LiDAR system has also been used to monitor the abundance of insects, bats and birds simultaneously by measuring population changes over time (Malmqvist, et al., 2018). The LiDAR transect covered approximately 2.5 m³ in volume and 520 m in length. The size of the target can be determined from the blobs shown in the depth image generated by the LiDAR. A scanning polarised LiDAR has also been used to monitor the density of trained bees in order to locate buried land mines (Shaw, et al., 2005).

The limitations and difficulties reside in reconstructing point-cloud data in poor weather conditions, as the laser signal is easily refracted by raindrops (Hasirlioglu, Kamann, Doric, & Brandmeier, 2016; Zhu, et al., 2017), and in the false positives caused by planar metallic surfaces or other types of reflective surface (Soilán, Riveiro, Martínez-Sánchez, & Arias, 2016). The monitoring of small objects such as insects also requires a relatively dark background to guarantee sensitivity to smaller targets (Malmqvist, et al., 2018).

2.5.3 Laboratory observation

Complementary to conducting observations of insects in the field, many studies employ artificial laboratory conditions. For motion detection, this reduces the burden associated with post-processing of the recorded videos since background noise is strictly controlled.

There is a large body of work describing the behaviours of insects in specially designed enclosures, covering themes such as social behaviours in eusocial insect communities, wingbeat kinematics, interactions with external factors, insect swarm intelligence, the effects of odour and pheromones, and insect neurology (Balch, Khan, & Veloso, 2001; Ristroph, Berman, Bergou, Wang, & Cohen, 2009; Noldus, Spink, & Tegelenbosch, 2002; Correll, et al., 2006; Sokolowski, Moine, & Naassila, 2012; Wilkinson, Lebon, Wood, Rosser, & Gouagna, 2014). In one notable study, the body and wing motions of fruit flies were recorded in a flight chamber using a high-speed camera recording at 8000 frames per second with a resolution of 512×512 (Ristroph, Berman, Bergou, Wang, & Cohen, 2009). The system was able to extract full 3D body and wing motions and discern the

kinematic strategies used by a single insect, with minimal errors from measurement and inconsistent variation. Another research group designed a real-time tracking system that could track up to hundreds of ants at the same time in a 2D plane, despite problems such as occlusion and clumping of the targets (Balch, Khan, & Veloso, 2001).

Compared with field experiments, laboratory experiments only allow the involvement of a limited number of variables, usually one, with all the other variables suppressed or completely removed. Since the experimental space is specially designed for the convenience of conducting observations, it is easier to track tens or even hundreds of moving targets at the same time (Balch, Khan, & Veloso, 2001). Laboratory environments ensure adequate illumination and an evenly coloured background; the number of insects released is controllable; and the expected movement volume is normally fixed if the observation is carried out in a flight cage or chamber, which dramatically reduces the computational load and the noise from cluttered backgrounds (Wilkinson, Lebon, Wood, Rosser, & Gouagna, 2014; Thoma, Hansson, & Knaden, 2015). The experiments can also be designed to observe a single type of behaviour in each session, making the analysis much more efficient and reliable. Additionally, there is no need for species recognition, for working under extreme weather conditions or for postponing experiments due to adverse weather.

However, since the essence of such experimentation is manipulation, these studies rely heavily on the integrity of the experimental design. The variables under study should be limited, and the interference between different variables must be considered, which requires evidence from other experiments. In many cases, insects act in response to the joint effect of many factors, and studying only some of these factors may lead to inaccurate interpretations. In such cases, the results obtained in the laboratory do not accurately reflect the behaviours of insects in the wild; hence conclusions can sometimes be similarly unwarranted.

Insect tracking algorithms often work only in the 2D plane, especially for multiple-target tracking. Further, the lack of height data is a significant limitation during laboratory observations of winged insects. The range of movement of the studied targets is very limited, usually within metres (ants, cockroaches,

mosquitos). For insects with higher mobility, for example moths, beetles or bees, it is almost impossible to conduct meaningful flight observations.



Figure 2.7 (A) The architecture of a typical in-lab insect observation system. (B) The enclosure used in other research for the analysis of the 2D movement of cockroaches. (C) The view from the camera fitted on the ceiling of the enclosure in (B), superimposed with the colour coded tracks of the targets (Wilkinson, Lebon, Wood, Rosser, & Gouagna, 2014; Correll, et al., 2006).

2.5.4 Object tracking inspired by modern computer vision algorithms

2.5.4.1 Overview of modern object tracking

The main task of any desired honeybee tracking system is to extract the flight trajectories. If radar, mounted trackers or indoor monitoring cannot meet the

requirements, the best approach is observation using visual data. Thus, it becomes a visual object tracking problem.

Visual object tracking is widely used in many applications, including surveillance, autonomous vehicles and human-computer interaction. It is a very popular, yet challenging, subject. The difficulties arise from illumination variation, camera orientation, occlusions between target objects or between target and background, shadows, non-rigid object structures, and the movement and rotation of objects.

Compared with object recognition in a single image, visual tracking involves analysis of a video, often in real time, and utilises information not only within a frame (to identify the target) but also across frames, taking into account differences in the foreground while the background remains unchanged, or the displacement of the same object in adjacent frames. In general, most visual tracking methods follow three key steps: filtering out unwanted information in each frame and identifying the object of interest; keeping track of the object in each following frame and recording its state; and analysing the object tracks together with interpretation of behaviour.

Researchers have proposed a large number of tracking methods in recent years, often integrated with artificial intelligence. These trackers are continually improved and maintained by many researchers, yielding ever higher accuracy and reliability. In addition, there are visual tracking benchmarks, such as OTB (Wu, Lim, & Yang, 2015) and the VOT challenge, which evaluate existing trackers and provide unbiased reviews of the performance and accuracy of real-time trackers.

In the field of visual object tracking, there are two main methods, namely generative and discriminative methods. Generative methods create a model of the target object in the current frame together with its attributes, and search for the most similar area in the following frames (Kalman filter, particle filter, mean shift filter). In contrast, discriminative methods extract the features of the whole image, with targets being the positive samples and background clutter being the negative samples. Then classifiers are trained using machine learning methods. These are applied to the next frame, searching for the most likely area.


Prior to 2012, the most widely employed trackers were Struck (Hare, et al., 2016) and TLD (Kalal, Mikolajczyk, & Matas, 2012). Struck is an online training tracker that uses a kernelized structured output support vector machine (SVM). The TLD tracker outperforms Struck in long-term tracking. In summary, both had impressive benchmark performances; however, the emergence of correlation filters and deep learning methods has made a huge difference in visual tracking. Some of these, such as the KCF algorithm (Henriques, Caseiro, Martins, & Batista, 2015) and MDNet (Nam & Han, 2016), are exceptional, and the potential of deep learning has not yet been fully exploited.

2.5.4.2 Small object tracking

Although visual tracking is developing rapidly, most of the methods, especially those based on machine learning and feature matching, are very dependent on the richness of object surface texture. The reason why most of the cutting-edge trackers cannot contribute much to this bee tracking project is that the target objects, namely the flying bees, are small in size and less variable in shape, compared with pedestrians or cars (for autonomous driving algorithms). Less texture yields less information, meaning modern techniques such as machine learning can be difficult to apply. Many machine learning or shallow deep learning networks are designed for a very specific task. For example, the Scale Invariant Feature Transform (SIFT) is very powerful for matching and detection of stationary objects in different projections. The Histogram of Oriented Gradient (HOG) method is designed to detect objects with certain shapes or textures and is commonly applied to pedestrian detection, since walking human bodies tend to be variant along the horizontal axis and relatively constant vertically. To date, there are few algorithms designed to detect flying insects or even small objects. However, the algorithms discussed provide inspiring ideas and the kernel or core algorithms can be adapted for these purposes.

On the other hand, the morphological attributes and relative positions between different parts of the same object, for example, arms waving, legs stretching, eyes blinking, a person’s mouth opening and shutting, as well as occasions when some

parts of the object remain stationary while others move or rotate (e.g. nodding or shaking the head), all add to the complexity of target behaviours. This is also one of the biggest challenges affecting the accuracy of visual tracking. Clearly, the wingbeat of bees and their orientation can also make a difference to how they are observed, but in a much simpler way.

An object such as a honeybee can be simplified into a fundamental shape (sphere, ellipsoid, cube) or a basic combination of them, rather than a complex non-rigid outline. The object occlusion issue can be ignored most of the time because the tracked objects are relatively small, making them less likely to overlap with other objects, especially in a 3D recording system. The scale and range of movement are much larger than the size of the object itself, which provides more information on object displacement traces. Therefore, it is possible to apply trajectory-based algorithms and to train classifiers with historical trajectory position data, although active insects such as bees may rush into and leave the recording frame frequently, which makes it hard to track them and keep their tracking IDs consistent.

In the literature, simple methods such as multi-level thresholding (Zhou & Wang, 2007) or infra-red background estimation (Sanna & Lamberti, 2014) are not appropriate for fields of view with very cluttered backgrounds. This is because both small objects and above-ground vegetation are associated with the high spatial frequencies of the image spectrum, making high-pass filtering undesirable; further, the infrared radiation emitted by bees is very weak, providing a poor SNR against the background. The swaying of grass and tree leaves also makes it impossible to apply methods such as optical flow (Ballard, 1983) or other specialised processing methods designed for stationary backgrounds.

2.6 Summary

The tools and techniques used when observing insects of different species, swarms with complex patterns of behaviours and activities, different insect castes in the same insect community, or even individuals within the same caste whose actions are affected by ambient conditions, should be carefully selected.


Therefore, it is crucial to understand the advantages and limitations of all the commonly used techniques for different types of target.

The main experimental subjects in this research are European honeybees. The detailed information of the instrumentation and design of software, as well as the analysis based on the video data, are covered in the following chapters.


3 Chapter 3 - Relevant Theory

Before considering the technical details of the system developed, there are a number of concepts and equations to consider, especially with regard to object recognition and filtering. Unlike other commonly studied or tracked objects, the main subject in this project, the western honeybee, presents particular challenges. Therefore, the algorithms were carefully tested and selected to produce reliable and repeatable results with respect to background subtraction, 3D triangulation, trajectory reconstruction and other crucial stages.

3.1 Gaussian mixture model

There are several popular and well-developed background subtraction approaches, such as CNT, which is aimed at high-speed processing on low-specification hardware, and GMG, which models the background with a combination of Bayesian inference and Kalman filters. These either use efficient search algorithms to enable real-time implementation or are embedded with pre-trained deep learning models to enhance accuracy. However, as mentioned previously, honeybees are small and have impressive flight ability, which makes the tracking task somewhat different from pedestrian or vehicle tracking. A typical background subtraction algorithm is the Mixture of Gaussians model (MOG), which was implemented in this project. The MOG, as its name implies, models the data distribution as a linear combination of a set of Gaussian distribution functions. For a variable $x$ that follows a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$, the probability density function is given by:

$$ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \qquad (3.1) $$

When an image is modelled by a Gaussian distribution, $x$ represents the intensity or colour of any pixel, and the function $f$ describes the distribution of the number of pixels having the same intensity across the entire brightness range (usually 0-255). An arbitrary image cannot, in general, be modelled correctly by a single Gaussian distribution; this is why multiple Gaussian

distributions with different means and covariances are required for a better approximation.

If there are $K$ Gaussian distributions, each with mean $\mu_k$ ($k = 1 \dots K$) and standard deviation $\sigma_k$, then the probability of a variable $x$ under the complete mixture model is:

$$ p(\mathbf{x}_N) = \sum_{k=1}^{K} w_k \, \mathcal{N}(\mathbf{x}_N \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \qquad (3.2) $$

where $\mathbf{x}_N$ represents the value of $x$ at time $N$, $w_k$ is the weight parameter of the $k$-th Gaussian component, $\boldsymbol{\Sigma}_k = \sigma_k^2 I$ is the covariance of the $k$-th component, and $\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$ is the normal distribution of the $k$-th component, given by:

$$ \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) = \frac{1}{(2\pi)^{D/2} \, |\boldsymbol{\Sigma}_k|^{1/2}} \, e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_k)^{T} \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}-\boldsymbol{\mu}_k)} \qquad (3.3) $$

When implemented in image processing, each pixel in the scene is modelled by the probability function $p(\mathbf{x}_N)$, with $p(\mathbf{x}_N)$ being the probability that a given pixel has value $\mathbf{x}_N$ at time $N$. The $K$ distributions are ordered by the fitness value $w_k/\sigma_k$ and the first $B$ distributions are used as the model of the background. The criterion for selecting $B$ is given by:

$$ B = \arg\min_{b} \left( \sum_{j=1}^{b} w_j > T \right) \qquad (3.4) $$

where $T$ is a threshold defining the minimum allowed fraction of the model.
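As a purely numerical illustration of equations (3.2)-(3.4) (with invented component values, not data from this project), the following sketch orders the components by the fitness value $w_k/\sigma_k$ and selects the first $B$ components whose cumulative weight exceeds $T$:

```python
# Small numerical sketch of selecting the B background components of a Gaussian
# mixture. The weights, standard deviations and threshold are invented values.
import numpy as np

w = np.array([0.55, 0.25, 0.15, 0.05])      # component weights (sum to 1)
sigma = np.array([4.0, 6.0, 20.0, 30.0])    # component standard deviations
T = 0.7                                      # minimum fraction modelled as background

order = np.argsort(w / sigma)[::-1]          # highest-weight, lowest-variance first
w_sorted = w[order]
B = int(np.argmax(np.cumsum(w_sorted) > T)) + 1  # smallest b with sum(w_1..b) > T
print(order[:B])                             # indices of the background components
```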

The algorithm checks every pixel in the scene to establish whether it lies more than 2.5 standard deviations from all of the $B$ background distributions; background subtraction is performed by marking such pixels as foreground pixels. The parameters of the first Gaussian component that matches the test value are updated by the following equations:

$$ \hat{w}_k^{\,N+1} = (1-\alpha)\,\hat{w}_k^{\,N} + \alpha\,\hat{p}(\omega_k \mid \mathbf{x}_{N+1}) \qquad (3.5) $$

$$ \hat{\boldsymbol{\mu}}_k^{\,N+1} = (1-\alpha)\,\hat{\boldsymbol{\mu}}_k^{\,N} + \rho\,\mathbf{x}_{N+1} \qquad (3.6) $$

$$ \hat{\boldsymbol{\Sigma}}_k^{\,N+1} = (1-\alpha)\,\hat{\boldsymbol{\Sigma}}_k^{\,N} + \rho\,(\mathbf{x}_{N+1}-\hat{\boldsymbol{\mu}}_k^{\,N+1})(\mathbf{x}_{N+1}-\hat{\boldsymbol{\mu}}_k^{\,N+1})^{T} \qquad (3.7) $$

$$ \rho = \alpha\,\mathcal{N}(\mathbf{x}_{N+1} \mid \hat{\boldsymbol{\mu}}_k^{\,N}, \hat{\boldsymbol{\Sigma}}_k^{\,N}) \qquad (3.8) $$

$$ \hat{p}(\omega_k \mid \mathbf{x}_{N+1}) = \begin{cases} 1 & \text{if } \omega_k \text{ is the first matched Gaussian component} \\ 0 & \text{otherwise} \end{cases} \qquad (3.9) $$

where $\omega_k$ is the $k$-th Gaussian component and $\alpha$ is the learning rate.

The above update equations are derived from the EM (Expectation-Maximization) algorithm. This algorithm takes initial random guesses of the mean, covariance and weight parameters of all $K$ Gaussian components, then calculates the log-likelihood function of these parameters; the values converge and the iteration halts when there is no significant change. If there is no successful match, the Gaussian component with the lowest probability is replaced by a new distribution with the current pixel value as its mean and an initially high variance (KaewTraKulPong P. &., 2002; The EM Algorithm for Gaussian Mixtures). The whole model switches to an $L$-recent window version after the number of historical samples reaches $L$. The sample window size is then fixed at $L$, so that the update equations prioritise recent data and can adapt to newer changes in the environment.
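For illustration only, a Gaussian mixture can be fitted to a set of pixel intensity samples with an off-the-shelf EM implementation; the sketch below uses scikit-learn on synthetic data and is not the background-modelling code used in this project.

```python
# Brief illustration of fitting a K-component Gaussian mixture with EM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic intensities: a dark, mostly static background plus brighter foreground
pixels = np.concatenate([rng.normal(60, 5, 800), rng.normal(180, 10, 200)])

gmm = GaussianMixture(n_components=3, max_iter=100).fit(pixels.reshape(-1, 1))
print(gmm.weights_, gmm.means_.ravel(), np.sqrt(gmm.covariances_).ravel())
```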

The MOG subtractor requires absolute stillness of the camera and is very sensitive to tiny variations between video frames. When a low threshold is set, it keeps as much useful information as possible, which is desirable in small object tracking, but it also keeps the noise. Targets such as aeroplanes flying across a clear sky or ants walking over sand dunes are perfect for the MOG subtractor, since all the varying pixels belong to the target objects and the background, whether busy or not, is completely stationary. Tracking bees is more difficult, since an outdoor scene may contain moving objects in the background, such as grass and trees blowing in the wind. Although they belong to the background, they create motion that is mixed with the useful data when processed by a background subtractor. Therefore, for more accurate subtraction, the historical data sampling window $L$ and the motion scale threshold require careful tuning. Although the MOG subtractor requires a perfectly stable video input, an advantage of a stationary camera is that the region of interest (ROI) within a given frame will be the same throughout all the frames of the video. This makes it convenient to ignore specific parts of the frame for which the probability of the target object

entering is close to zero. Setting a proper ROI not only reduces the computational load but also makes the processing and debugging much more efficient.
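As an indicative sketch of how such a subtractor and a fixed ROI might be combined (using OpenCV's MOG2 implementation; the file name, ROI coordinates and parameter values are illustrative assumptions rather than the settings used in this work):

```python
# Minimal sketch: MOG-type background subtraction restricted to a fixed ROI.
import cv2

cap = cv2.VideoCapture("hive_entrance.avi")   # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(
    history=500,          # L: number of recent frames used to model the background
    varThreshold=16,      # sensitivity: lower keeps more detail (and more noise)
    detectShadows=False)

x0, y0, x1, y1 = 100, 50, 1180, 700           # hypothetical ROI in pixels

while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[y0:y1, x0:x1]                 # restrict processing to the ROI
    fg_mask = subtractor.apply(roi)           # foreground pixels (candidate bees)
    # small-blob extraction and morphological filtering would follow here
cap.release()
```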

3.2 Kalman filter

The extracted coordinate data in each video frame are filtered by a Kalman filter, which is a very powerful tool in object tracking. It makes predictions based on the historical position data and the current measurement. Since the time interval between video frames is known and fixed, the extracted position data contain not only the displacement of the target but also its velocity. Hence the state vector of the implemented Kalman filter contains both displacement and velocity along all three axes. With the Kalman filter applied to predict the trend of movement of a target, the resultant motion trajectory is much smoother; further, the degradation due to frame losses and measurement error can be reduced. The Kalman filter also handles the continuity of multiple tracked targets after occlusion impressively well. The Kalman filter is usually implemented in two main phases, “predict” and “update”, as shown below.

PREDICT

The predicted state estimate is given by:

$$ \hat{\mathbf{x}}_{k|k-1} = \mathbf{F}_k \hat{\mathbf{x}}_{k-1|k-1} + \mathbf{B}_k \mathbf{u}_k \qquad (3.10) $$

where $\hat{\mathbf{x}}_{k|k}$ is the posteriori state estimate at time $k$ given observations up to and including time $k$, $\mathbf{F}_k$ is the state transition matrix, $\mathbf{B}_k$ is the control input model and $\mathbf{u}_k$ is the control vector ($\mathbf{B}_k \mathbf{u}_k$ is omitted in the following sections, $\mathbf{u}_k$ being zero in this case).

The predicted error estimate is given by:

$$ \mathbf{P}_{k|k-1} = \mathbf{F}_k \mathbf{P}_{k-1|k-1} \mathbf{F}_k^{T} + \mathbf{Q}_k \qquad (3.11) $$

where $\mathbf{P}_{k|k}$ is the posteriori error covariance matrix and $\mathbf{Q}_k$ is the process noise covariance matrix.


UPDATE

$$ \mathbf{S}_k = \mathbf{H}_k \mathbf{P}_{k|k-1} \mathbf{H}_k^{T} + \mathbf{R}_k \qquad (3.12) $$

$$ \mathbf{K}_k = \mathbf{P}_{k|k-1} \mathbf{H}_k^{T} \mathbf{S}_k^{-1} \qquad (3.13) $$

$$ \hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k (\tilde{\mathbf{y}}_k - \mathbf{H}_k \hat{\mathbf{x}}_{k|k-1}) \qquad (3.14) $$

$$ \mathbf{P}_{k|k} = \mathbf{P}_{k|k-1} - \mathbf{K}_k \mathbf{S}_k \mathbf{K}_k^{T} \qquad (3.15) $$

where $\mathbf{S}_k$ is the innovation covariance, $\mathbf{H}_k$ is the observation matrix, $\mathbf{R}_k$ is the observation noise matrix, $\mathbf{K}_k$ is the optimal Kalman gain and $\tilde{\mathbf{y}}_k$ is the vector of observations.

At every iteration, the filter gives a prediction of the current $\hat{\mathbf{x}}$ (the starting $\hat{\mathbf{x}}$ is initialised with the initial position of the tracked object and zero velocity) and $\mathbf{P}$ based on historical measurements. Then $\hat{\mathbf{x}}$ is returned to the tracking module and $\mathbf{P}$ is updated internally. The state vector of the Kalman filter includes the coordinates and velocity components along the x-axis (horizontal) and y-axis (vertical) in the 2D implementation, plus the z-axis when extended to the 3D implementation. The element values along the diagonal of the observation noise covariance matrix ($\mathbf{R}$) and the process noise covariance matrix ($\mathbf{Q}$) can be set to be either large or small depending on the noise level of the current measurement.

In this work, the algorithm was initially evaluated using simulated data. During this simulation stage, since the imported data were artificial and the noise levels were lower than for field data, the $\mathbf{R}$ and $\mathbf{Q}$ coefficients were set to be relatively small (0.1), i.e. the incoming measurement data and the prediction generated by the filter itself at each iteration had equal weight in updating the filter coefficients. The values in the error covariance matrix ($\mathbf{P}$) are determined by how close the initialisation is to the ground truth; this is an important factor in determining the time required for the filter to reach equilibrium. These three matrices were carefully tuned to handle the field data later on; the details of both the simulation and the field data analysis are discussed in the following chapters.
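A minimal constant-velocity sketch of the predict and update equations (3.10)-(3.15) is given below for the 2D case; the frame rate and the $\mathbf{Q}$, $\mathbf{R}$ and $\mathbf{P}$ values are illustrative assumptions and do not reproduce the tuned values used in the software.

```python
# Constant-velocity Kalman filter sketch for a 2D detection stream.
import numpy as np

dt = 1 / 25.0                          # assumed frame interval (25 fps)
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)    # state transition matrix F_k
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)    # observation matrix H_k (positions only)
Q = np.eye(4) * 0.1                    # process noise covariance
R = np.eye(2) * 0.1                    # observation noise covariance

x = np.array([0.0, 0.0, 0.0, 0.0])     # initial state: first detection, zero velocity
P = np.eye(4)                          # initial error covariance

def step(x, P, z):
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # optimal Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ S @ K.T
    return x_new, P_new

x, P = step(x, P, np.array([2.0, 1.5]))  # feed one measured (x, y) detection
```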


3.3 Hungarian algorithm

The Hungarian algorithm is an assignment algorithm that was developed by Harold Kuhn in 1955 largely based on the works of two Hungarian mathematicians: Dénes Kőnig and Jenő Egerváry (Kuhn, 1955).

This algorithm aims to solve problems with the following template:

If several tasks must be completed and there are several workers, each of whom can do any of the tasks but is paid differently for each one, find an assignment of workers to tasks that completes all the tasks at the minimum total wage outlay.

The Hungarian algorithm consists of the four steps below. The first two steps are executed once, while the last two are repeated until an optimal assignment is found. The input is usually a square matrix (e.g. the numbers of tasks and workers are the same) filled with non-negative elements.

Step 1: Subtract row minima

For each row, find the lowest entry and subtract it from every other entry in that row. This makes the smallest entry in each row now zero.

Step 2: Subtract column minima

Similarly, for each column, subtract the lowest entry from every other entry in that column. This makes the smallest entry in each column now zero.

Step 3: Cover all zeros with a minimum number of lines

Cover all zeros in the resulting matrix using a minimum number of horizontal and vertical lines.

If there are n lines drawn, an optimal assignment exists among the zeros and the algorithm is finished.

If there are fewer than n lines drawn, continue with Step 4.


Step 4: Create additional zeros

Find the smallest entry that is not covered by any line drawn in Step 3. Subtract it from all uncovered entries and add it to every entry that is covered by two lines. Then return to Step 3.

The input matrix used in the project contained the existing tracks generated from previous video frames and the current extracted coordinates ready to be assigned. After the assignment by the Hungarian algorithm, each newly detected target is assigned to its closest existing track. Such data can then be fed into the 3-state Kalman filter for further correction and calibration.

Clearly, there are situations in which the numbers of existing tracks and detections do not match, and the algorithm was adjusted to handle such cases. When there are fewer detections than existing tracks, which means the tracked targets are either missing or have left the scene, the assignment procedure will try to assign all the new detections to tracks. When the number of detections is larger than the number of existing tracks, the algorithm will attempt to assign every track a detection at minimum total cost; the unassigned detections are treated as new targets, so new tracks are generated for them.
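The shape of this assignment step can be sketched as follows. The code builds the square cost matrix of Euclidean distances between predicted track positions and new detections, padding with a large dummy cost when the counts differ; for brevity the optimum is found by brute force over permutations rather than by the Hungarian algorithm itself, which gives the same assignment for the small numbers of targets involved. All names are illustrative and not taken from the project software.

```cpp
#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

// Illustrative assignment of detections to existing tracks via a padded,
// square cost matrix of Euclidean distances.
std::vector<int> assignDetectionsToTracks(const std::vector<cv::Point3f>& trackPredictions,
                                          const std::vector<cv::Point3f>& detections)
{
    const size_t n = std::max(trackPredictions.size(), detections.size());
    const double DUMMY = 1e6;                      // cost of leaving a slot unassigned
    std::vector<std::vector<double>> cost(n, std::vector<double>(n, DUMMY));
    for (size_t t = 0; t < trackPredictions.size(); ++t)
        for (size_t d = 0; d < detections.size(); ++d) {
            cv::Point3f diff = trackPredictions[t] - detections[d];
            cost[t][d] = std::sqrt(diff.x * diff.x + diff.y * diff.y + diff.z * diff.z);
        }

    // Brute-force search over permutations (adequate only for small n).
    std::vector<int> perm(n), best(n);
    std::iota(perm.begin(), perm.end(), 0);
    double bestCost = std::numeric_limits<double>::max();
    do {
        double c = 0.0;
        for (size_t t = 0; t < n; ++t) c += cost[t][perm[t]];
        if (c < bestCost) { bestCost = c; best = perm; }
    } while (std::next_permutation(perm.begin(), perm.end()));

    // best[t] is the detection index assigned to track t; indices beyond the
    // real detection count correspond to missed tracks, and detections not
    // assigned to any real track start new tracks.
    return best;
}
```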

3.4 3D space triangulation & epipolar geometry

This theory concerns the conversion from 2D analysis to 3D space. At this stage, a very important concept termed epipolar geometry is introduced. It requires information about the cameras used (camera matrices) and the conversion from pixel widths on the image to real-world length units.

In the geometrical system used, there are several vital concepts that need explanation before discussing the equations.

• The epipoles are the intersection points of the straight line connecting the two camera centres with the scene plane of each camera (e and e’ in Figure 3.1 (B)). They are also the projections of each camera centre on the other camera’s scene plane.
• An epipolar plane is a plane containing the baseline; there are therefore infinitely many epipolar planes.
• The epipolar lines are the intersection lines of an epipolar plane with the two camera scene planes.
• The fundamental matrix is a translation or mapping matrix between a point on one camera image plane and its corresponding epipolar line on the other camera image plane. It contains information about the cameras’ intrinsic matrices and the relative positions of the two cameras.

Given a camera’s internal parameters and the relative positions of the cameras, the fundamental matrix and the epipolar lines for each point on one of the camera scenes can be calculated. They are then used to match the projection points on both the left and right cameras to find the points that belong to the same real-world 3D position.


Figure 3.1 (A) Demonstration of a two-camera imaging system (C and C’). The real-world point X and its projection points on each camera are given by x and x’. (B) Once the epipoles are found (e and e’), based on the position of a point x on the left camera, an epipolar line l’ on the other camera can be derived, indicating that all the possible locations of the projection point of the same real-world location X must lie on this line. (Hartley & Zisserman, 2003)


If a real-world point X is represented in homogeneous coordinates as:

$$\mathbf{X} = (X_S, Y_S, Z_S, 1)^{\mathrm{T}} \tag{3.16}$$

and its projection (scene point on the camera) as:

$$\mathbf{x} = (u', v', w')^{\mathrm{T}} \tag{3.17}$$

with $x_{pix} = u'/w'$ and $y_{pix} = v'/w'$, we have

$$\mathbf{x} = \mathbf{P}\mathbf{X} \tag{3.18}$$

P is a 3x4 matrix that represents the mapping from the scene to the image and is therefore called a “camera”.

The linear transformation represented by homogeneous vectors is given by:

$$x_i = f\frac{x_S}{z_S}, \qquad y_i = f\frac{y_S}{z_S} \tag{3.19}$$

which is

$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_S \\ y_S \\ z_S \\ 1 \end{bmatrix} \tag{3.20}$$

where $(x_i, y_i, f)$ is the image point and $x_i = u/w$, $y_i = v/w$.

Given the principal point $(x_0, y_0)$ (usually the centre point of the image) and scaling factors $k_x$ and $k_y$ (from physical to pixel magnitude), we define $\alpha_x = f k_x$, $\alpha_y = -f k_y$, and $x_{pix} = u'/w'$, $y_{pix} = v'/w'$; then

$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = \begin{bmatrix} \alpha_x & s & x_0 & 0 \\ 0 & \alpha_y & y_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_S \\ y_S \\ z_S \\ 1 \end{bmatrix} \tag{3.21}$$

where $s$ represents the skew coefficient between the x and y axes and is often 0. This equation is normally called the camera projection equation.

The matrix

$$\mathbf{K} = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \alpha_x & 0 & 1920/2 \\ 0 & \alpha_y & 1080/2 \\ 0 & 0 & 1 \end{bmatrix} \;(\text{in an ideal 1080p situation}) \tag{3.22}$$

is called a calibration matrix. It is a 3 × 3 matrix with five degrees of freedom.

To convert from world coordinates to camera coordinates, we have

$$\begin{bmatrix} x_S \\ y_S \\ z_S \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{R} & \mathbf{T} \\ \mathbf{0}_3^{\mathrm{T}} & 1 \end{bmatrix} \begin{bmatrix} X_S \\ Y_S \\ Z_S \\ 1 \end{bmatrix} \;(\text{coordinate transformation equation}) \tag{3.23}$$

where R is the rotation matrix, T is the translation vector, and T can be substituted by

$$\mathbf{T} = -\mathbf{R}\tilde{\mathbf{C}} \tag{3.24}$$

where $\tilde{\mathbf{C}}$ represents the coordinates of the camera centre with respect to the real-world origin O.

R and T are defined as:

$$\mathbf{R}_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_x & -\sin\theta_x \\ 0 & \sin\theta_x & \cos\theta_x \end{bmatrix} \tag{3.25}$$

$$\mathbf{R}_y = \begin{bmatrix} \cos\theta_y & 0 & \sin\theta_y \\ 0 & 1 & 0 \\ -\sin\theta_y & 0 & \cos\theta_y \end{bmatrix} \tag{3.26}$$

$$\mathbf{R}_z = \begin{bmatrix} \cos\theta_z & -\sin\theta_z & 0 \\ \sin\theta_z & \cos\theta_z & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{3.27}$$

$$\mathbf{R} = \mathbf{R}_x\mathbf{R}_y\mathbf{R}_z \tag{3.28}$$

$$\mathbf{T} = \begin{pmatrix} t_x \\ t_y \\ t_z \end{pmatrix} \tag{3.29}$$

Combining the above two transformation equations, we have the projection matrix P, which is defined as:

$$\mathbf{P} = \mathbf{K}[\mathbf{I}_3|\mathbf{0}_3]\begin{bmatrix} \mathbf{R} & -\mathbf{R}\tilde{\mathbf{C}} \\ \mathbf{0}_3^{\mathrm{T}} & 1 \end{bmatrix} = \mathbf{K}\mathbf{R}[\mathbf{I}_3|-\tilde{\mathbf{C}}] \tag{3.30}$$

thus

$$\mathbf{x} = \mathbf{P}\mathbf{X} \tag{3.31}$$


The corresponding skew-symmetric matrix of a vector $\mathbf{a} = (a_1, a_2, a_3)^{\mathrm{T}}$ is defined as

$$[\mathbf{a}]_{\times} = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \tag{3.32}$$

and can be used to replace the cross product of two vectors:

$$\mathbf{a} \times \mathbf{b} = [\mathbf{a}]_{\times}\mathbf{b} \tag{3.33}$$

The fundamental matrix F can be computed from correspondences between image points alone without knowledge of camera internal parameters or relative pose information. For any pair of corresponding points 퐱푖 and 퐱′푖 in the two images, the following condition always applies:

$$\mathbf{x}'^{\mathrm{T}}_i\mathbf{F}\mathbf{x}_i = 0 \tag{3.34}$$

Given an image point $\mathbf{x}$ in one of the cameras $\mathbf{P}$, select any scene (3D) point $\mathbf{X}$ on the ray of $\mathbf{x}$ in camera $\mathbf{P}$.

$$\mathbf{P} = \mathbf{K}[\mathbf{I}|\mathbf{0}], \qquad \mathbf{P}' = \mathbf{K}'[\mathbf{R}|\mathbf{T}] \tag{3.35}$$

(assuming $\mathbf{K} = \mathbf{K}'$ when two identical cameras are used)

$$\mathbf{x} = \mathbf{P}\mathbf{X} \tag{3.36}$$

so

$$\mathbf{X} = \mathbf{P}^{+}\mathbf{x} = \mathbf{P}^{\mathrm{T}}(\mathbf{P}\mathbf{P}^{\mathrm{T}})^{-1}\mathbf{x} \tag{3.37}$$

Find the projection point $\mathbf{x}'$ of $\mathbf{X}$ in the view plane of camera $\mathbf{P}'$:

$$\mathbf{x}' = \mathbf{P}'\mathbf{X} = \mathbf{P}'\mathbf{P}^{+}\mathbf{x} \tag{3.38}$$

Find the epipole $\mathbf{e}'$ as the projection of $\mathbf{C}$ in camera $\mathbf{P}'$, which is given by $\mathbf{e}' = \mathbf{P}'\mathbf{C}$, where

$$\mathbf{C} = \begin{pmatrix} \mathbf{0} \\ 1 \end{pmatrix} \tag{3.39}$$

$$\mathbf{e}' = \mathbf{P}'\mathbf{C} \tag{3.40}$$

Find the epipolar line $\mathbf{l}'$ connecting $\mathbf{e}'$ and $\mathbf{x}'$ in $\mathbf{P}'$ as a function of $\mathbf{x}$.

The symbol $\mathbf{P}^{+}$ denotes the Moore–Penrose pseudoinverse of matrix $\mathbf{P}$, which has some of the properties of the actual inverse. It is commonly used to avoid the inversion problem when the original matrix is singular or not square.


$$\mathbf{l}' = \mathbf{e}' \times (\mathbf{P}'\mathbf{P}^{+}\mathbf{x}) = [\mathbf{e}']_{\times}\mathbf{P}'\mathbf{P}^{+}\mathbf{x} \tag{3.41}$$

The fundamental matrix is defined by

$$\mathbf{l}' = \mathbf{F}\mathbf{x} \tag{3.42}$$

$$\mathbf{F} = [\mathbf{P}'\mathbf{C}]_{\times}\mathbf{P}'\mathbf{P}^{+} \tag{3.43}$$

Since $\mathbf{x}'$ lies on $\mathbf{l}'$, $\mathbf{x}'^{\mathrm{T}}\mathbf{l}' = 0$ and thus $\mathbf{x}'^{\mathrm{T}}\mathbf{F}\mathbf{x} = 0$.

The epipolar line $\mathbf{l}' = (l_a, l_b, l_c)$ represents the line $l_a x + l_b y + l_c = 0$.

For the ideal 90-degree case, the rotation matrix and translation vector are: (w.r.t. the left camera)

$$\mathbf{R} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix} \tag{3.44}$$

$$\mathbf{T} = \begin{pmatrix} -N \\ 0 \\ N \end{pmatrix} \tag{3.45}$$

(N can be any scalar value as long as the indicated direction remains the same)
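To make these relations concrete, the sketch below (illustrative only; the function and variable names are not from the tracking software) builds the fundamental matrix from two known 3 × 4 projection matrices following Equations 3.37–3.43 and triangulates one matched point pair with OpenCV's cv::triangulatePoints.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>

// Skew-symmetric matrix [a]x of a 3-vector (Equation 3.32).
static cv::Mat skew(const cv::Mat& a)
{
    return (cv::Mat_<double>(3, 3) <<
                      0, -a.at<double>(2),  a.at<double>(1),
        a.at<double>(2),                0, -a.at<double>(0),
       -a.at<double>(1),  a.at<double>(0),                0);
}

// Given two 3x4 CV_64F projection matrices P1 and P2, build the fundamental
// matrix F = [P2 C]x P2 P1+ (Equations 3.37-3.43) and triangulate one pair of
// corresponding image points into a 3D location.
cv::Point3d triangulateOnePair(const cv::Mat& P1, const cv::Mat& P2,
                               const cv::Point2d& x1, const cv::Point2d& x2,
                               cv::Mat& F)
{
    // Moore-Penrose pseudoinverse P1+ = P1^T (P1 P1^T)^-1 (Equation 3.37).
    cv::Mat P1plus = P1.t() * (P1 * P1.t()).inv();

    // Camera centre of the first camera in homogeneous coordinates (Eq. 3.39).
    cv::Mat C = (cv::Mat_<double>(4, 1) << 0, 0, 0, 1);
    cv::Mat e2 = P2 * C;                 // epipole in the second view (Eq. 3.40)
    F = skew(e2) * P2 * P1plus;          // Equation 3.43

    // Triangulate the matched pair; points are passed as 2x1 matrices.
    cv::Mat pts1 = (cv::Mat_<double>(2, 1) << x1.x, x1.y);
    cv::Mat pts2 = (cv::Mat_<double>(2, 1) << x2.x, x2.y);
    cv::Mat X4;
    cv::triangulatePoints(P1, P2, pts1, pts2, X4);
    X4.convertTo(X4, CV_64F);            // ensure double access below

    // De-homogenise the 4x1 result.
    return cv::Point3d(X4.at<double>(0) / X4.at<double>(3),
                       X4.at<double>(1) / X4.at<double>(3),
                       X4.at<double>(2) / X4.at<double>(3));
}
```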


4 Chapter 4 – Hardware configuration and data collection

4.1 Exploration of the imaging principle

The basic idea of this project was inspired by an infra-red active imaging system for automatic determination of insect flight tracks, namely IRADIT, designed by Schaefer and Bent in the 1980s. Before the system was built, Schaefer's team compared several approaches for in-field insect flight track extraction, including radar scanning, passive imaging at both visual and infra-red wavelengths, and infra-red active detection, and then decided to implement a near-infra-red active detection method to create high contrast between the tracked insects and the background with minimal interference with the flight behaviour of the targets.

Considering the spectral distribution of radiation from a 6000 K blackbody radiator, which approximates the solar radiance, and the technological level of imaging devices at that time, the illumination system comprised a xenon arc light source and a search-light parabolic reflector fitted with a high-pass filter at the infra-red illuminator and an 823 nm narrow-band filter at the detector. The system was capable of capturing the tracks of flying aphids against the clear bright afternoon sky.

The performance of the IRADIT system was impressive but still limited by the imaging and data processing techniques available at the time. Therefore, it was proposed that for this work the principle of the IRADIT system be improved and its application expanded to a broader field, employing cutting-edge techniques available at low cost. The desired system should be highly automated, record much greater volumes of data, process them at higher speed and be able to work properly under variable conditions.

During the initial design and exploration stage of the new imaging system, the team investigated two main forward scattering configurations. The first one involved a 785 nm 120 mW infra-red laser with a high-frequency resonant scanning reflecting mirror fitted in front of the laser beam to generate a

pyramidal light beam with a very small beam angle (Figure 4.1 A). The second one was a low-power infra-red LED array comprising 100 850 nm narrow-band LEDs positioned evenly on a square PCB (Figure 4.1 B), forming a cohesive beam and a roughly circular illumination area. With both approaches, the wings of the in-flight insects would be illuminated, and the light rays forward scattered by the wings would be detected by the camera on the other side of the target (Figure 4.1 C). These two types of illuminator were designed to work with a high-speed (170 frames per second) infra-red camera with a narrow-band optical filter fitted in front of the camera lens.


Figure 4.1 (A) 785 nm streamline Class 3B laser and the optical scanner (B) 850 nm IR emitter consisting of 100 LEDs. (C) The forward scattering imaging configuration for both systems.


After several tests, including indoor low-light recordings and a number of bright-illumination field trials, these two plans were discarded one at a time. For the laser beam scanning method, although the beam size increased linearly with working distance, it was still around 700 µm even at a distance of 2000 mm (Figure 4.2), so the energy density was significant. The power of the laser diode was 120 mW, which classified it as a Class 3B laser. This type of laser will cause permanent ocular damage to human eyes with exposures of 1/100th of a second and could burn and damage a bee's compound eyes during the observation. Since it emitted invisible light, the damage zone was not observable. Furthermore, it was very difficult to synchronise the scanning frequency with the frame rate of the camera. The 785 nm wavelength provided an adequate signal-to-noise ratio but was also very expensive.


Figure 4.2 Divergence curve of the beam size at focus (blue solid curve) and the depth of focus (black dashed curve) vs. distance from laser at (A) short range (100 – 500 mm) and (B) long range (500 – 2000 mm).

The infra-red LED array emitted light with a wavelength of 850 nm; further, it was much cheaper, safer and easier to implement. But the challenge lay in the alignment of all the LEDs. Although the average power of the emitter was 72 W m-2 with the LEDs aligned, improperly aligned LEDs greatly affected the final output power of the emitter, and perfect beam collimation for 100 LED diodes was very difficult to achieve. Furthermore, in a forward scattering configuration, good imaging results rely on the diffraction, nonhomogeneous refraction and diffuse reflection of the target. Although the angle requirements are not as strict as those for specular reflection, the target still required proper illumination across the entire observation volume. The narrow emission angle of the LED diode was ±3°, and during testing the SNR declined dramatically as the elevation angle offset increased (Figure 4.3). Since the elevation angles of the emitter and the camera were fixed, the beam could not follow the target, causing poor and uneven illumination across all the targets in the observation volume. In addition, the pulsed emitter had to be connected to the camera shutter with a long (at least twice the length from the emitter to the target) BNC cable to guarantee minimum synchronisation latency. This limited the system's portability for further field trials. In general, the beam collimation of the emitter, the fixed elevation angle, and the synchronisation between the emitter and the camera shutter all contributed to the difficulty of obtaining a satisfactory SNR.


Figure 4.3 Signal-to-noise ratio (dB) vs. shifted elevation angle of the emitter.

Figure 4.4 shows the images captured by the monochromic high-speed camera, for a bee specimen hanging in the air against bright sunlight at a distance of 1 m (A), (B) and 5 m (C) using the LED array emitter. The only difference between the emitter being on or off was the lower part of the body of the bee being illuminated. As the observation distance increased, the light from the emitter could not compete effectively with the sunlight.


Figure 4.4 (A) Image of a hanging bee against a bright sky, 1 m distant, with the emitter off; (B) Image of the same bee from 1 m with the emitter turned on; (C) Image of the bee from a 5 m distance. All images are enhanced equally.

Both visual and near-infrared video data were collected during the first visit to the beehives. After the processing of the video recorded by a DSLR camera and a mobile phone camera (described in detail in the following sections), the movement of the honeybees was much more recognisable, and the flight statistics were preserved to a great extent. Despite the presence of a large amount of ambient noise in the images, such as the grass on the ground and tree leaves behind the beehive in the camera scene (Figure 4.5), the videos recorded by cameras operating at visible wavelengths were of superior quality and hence amenable to post-processing. Therefore, the priority of work moved from active near-infra-red imaging to passive imaging using visible wavelengths.


Figure 4.5 System setup at beehive for 2D flight track determination of honeybees.

4.2 First experiment (2D determination)

During the second year of this research, frequent field work was conducted, during which video data of bee flight in the vicinity of the hive were recorded. Instead of observing flying honeybees on their way to food sources, the beehive represented the epicentre of interest regarding flight behaviour. By monitoring the space surrounding a beehive, density statistics could be obtained within a single frame, including the changing flight speeds, from outside the entrance of the hive to the very end of the camera's visual plane. The activities of dozens of bees gathering at the entrance to the hive, landing and taking off at the same time within a volume of less than one cubic decimetre, could be collected and analysed, although the large numbers posed a significant challenge to the multiple-target tracking system. The imaging of such an observation volume provided a very broad range of both motion statistics and population activity of the honeybees.

The video data were collected by a DSLR camera 4 metres away from the beehive and a mobile phone 2 metres away, both mounted on tripods for better stability (Figure 4.6). The resolution and framerate of the videos were 1080p60 and 4K30 respectively. Since the exposure, focusing and white balance settings were not customisable, certain areas in the camera scene were vulnerable to saturation and not all of the targets were evenly in focus. There was a moderate amount of grass on the ground around the observed beehive, which served as the main source of dynamic noise. The recordings were taken during three afternoons in May and June 2018 (Table 4.1). The weather varied from clear and bright to overcast. However, during this experimental session, no wind speed data were collected.


Figure 4.6 Schematic of the camera setup (top view)


Table 4.1 2D bee flight monitoring study 2018

Date      Equipment              Monitoring Period   Target/Camera Separation (m)   Resolution & Framerate   Weather
24 May    Mobile phone camera    13.00-13.10         2.0                            4K30                     Moderate amount of clouds to clear
                                 13.20-13.30         2.0                            4K30
          DSLR camera            13.00-13.10         4.0                            1080p60
                                 13.20-13.30         4.0                            1080p60
5 June    Mobile phone camera    14.00-14.20         3.0                            4K30                     Overcast
                                 14.25-14.35         3.0                            4K30
                                 14.45-14.55         3.0                            4K30

The recordings taken in the second year of this project were mainly for feasibility verification; hence the cameras used were chosen for initial data evaluation and remained subject to review. During these early trials, there was still a certain level of lens distortion due to incomplete camera calibration; further, the location mapping from image pixels to the real world was also, as yet, unknown. However, despite such limitations, a large corpus of good quality video data was recorded successfully; this was sufficient to enable the development of the 2D tracking software.

4.3 Camera parameters

Before a final decision on choice of camera was made, a selection of different resolutions and framerates was investigated to determine the most suitable model to image a moving object as small as a honeybee. It was decided that a pair of action cameras (compact cameras that are designed to record actions) such as the GoPro (GoPro, 2020) met the requirements for resolution and data transmission speed. In addition, action cameras are smaller, cheaper and much more rugged compared with industrial cameras or custom-made cameras. The

significant parameters, such as white balance, ISO and aperture, are controllable. Moreover, they also operate under adverse weather conditions including rain and extremes of temperature. The model used in this project, the GoPro Hero 7 Black, is waterproof to a depth of 10 metres; the standard operating temperature is 4-40 degrees Celsius.


Figure 4.7 Dimensions of the GoPro Hero 7 Black (no protection case)

The camera supports frame rates and resolutions of up to 1080p120 and 2.7K60 respectively, when employing a linear lens mode (the mode with minimum radial distortion, compared with fisheye mode). The lens had a focal length of 35 mm, the equivalent of 17 mm on a full-frame image sensor, and a vertical and horizontal field of view angle of 71° and 86° respectively. Thus, the scalar α from a pixel-wise location on an image to a real-world location is given by:

$$\alpha = \frac{N_{pix,x}/2}{l_{focal}\tan(\theta_{horizontal}/2)} \tag{4.1}$$

where $N_{pix,x}$ is the total number of pixels on the horizontal axis of the image, $l_{focal}$ is the focal length and $\theta_{horizontal}$ is the horizontal field of view angle. Hence, at a distance of 3 metres away from the camera in 2.7K resolution, a 1 mm long object occupies 2.067 pixels, while in 1080p resolution, that number is 1.468 pixels. A western honeybee is normally 15 mm in length; so for a honeybee to occupy at least two pixels in the camera scene, it cannot fly farther than 10 m away from the camera in the 2.7K case and 7 metres in the 1080p case.


The key problem in the framerate debate lies in the gap between two sequential captures of a given object and the motion blur such a framerate generates. The top flight speed of a worker bee is typically in the range of 15-20 mph. If observed at a framerate of 60 frames per second, approximately 9 frames will capture a bee crossing the observation volume horizontally at a distance of 3 metres; around 60 frames would cover the horizontal field of view at 10 metres. Although 9 locations seem inadequate to determine the flight track of a flying object, the software and filters employed proved to be much better at connecting the historical locations of a given object than at recognising it from a blurred image.

To summarise, given a fixed data transmission bandwidth it is more important to record with a high resolution than with a high framerate, especially for the early-stage raw video processing, considering the capability of the post-processing software.

4.4 Simulation of 3D flights

For a 3D recording system, the most significant concern is the setup of the cameras. There are two possible options. One is the orthogonal configuration (Figure 4.8 (A)), in which the optical axes (the line passing through the centre of the image sensor and the centre of curvature of the lens) of the two cameras are orthogonal and lie within the same horizontal plane; the cameras then observe the same object from two different directions at a 90-degree angle. Such a setup ensures the acquisition of the depth information of an object in the 3D world and enables the straightforward reconstruction of a complete 3D flight track of the target. The other approach is similar to the observational mode of human eyes, in which the two cameras are set up very close together (Figure 4.8 (B)), perhaps 10-20 cm apart, and the optical axes are nearly parallel to each other, i.e. forming a very small angle, facing the same direction. The distance between the target object and the cameras must be at least 30 times the distance between the two cameras to achieve good performance with respect to 3D discrimination.


Figure 4.8 Schematics of (A) the orthogonal setup and (B) the parallel setup for a dual-camera imaging system. In (A) the horizontal view angle is 86°, the angle between the optical axes is 90° and the target distance is 4 m; in (B) the horizontal view angle is 86° and the angle between the optical axes is 6°.

The virtual experimental scene was constructed using Autodesk 3ds Max software (Autodesk, 2020). The setups of both orthogonal (Figure 4.9 A) and parallel configurations (Figure 4.9 C) were simulated by observing 20 spheres with a radius of 10 mm moving under pre-set patterns (Figure 4.9 B). The videos

recorded by these virtual cameras were then exported and analysed by an image processing software suite, details of which are discussed in the following chapters.


Figure 4.9 (A) Simulation of the orthogonal configuration of a two-camera imaging system, looking at 20 virtual spheres at the same time. (B) Scene captured by the right (front) camera showing the motion of the 20 spheres at a given instant; the blue curve shows the pre-set moving path of one sphere. (C) Simulation of the parallel (stereo) configuration of a two-camera imaging system.

The orthogonal and the parallel approaches each have their own advantages and disadvantages, which are discussed in detail below.

Orthogonal setup

In a single camera situation, where no depth information can be obtained, if the displacement of a target between two adjacent frames is detected, the projected displacement vector along one of the axes, say the horizontal axis, on the image sensor is given by

$$\mathbf{S}_{horizontal,2D} = \alpha \cdot \mathbf{S}_{horizontal,3D} \cdot \sin\theta_{horizontal} \tag{4.2}$$

where $\mathbf{S}_{horizontal,2D}$ and $\mathbf{S}_{horizontal,3D}$ are the horizontal components of the displacement vectors on the 2D view plane and in 3D space respectively, $\alpha$ is the scaling factor from real-world space to the image sensor introduced by the magnification of the camera lens, and $\theta_{horizontal}$ is the angle between the optical axis of the camera and $\mathbf{S}_{horizontal,3D}$.


Then, for any measurement error 퐞ℎ표푟푖푧표푛푡푎푙,2퐷 on the 2D image, the estimated error in the 3D space 퐞ℎ표푟푖푧표푛푡푎푙,3퐷 is given by

$$\mathbf{e}_{horizontal,3D} = \frac{\mathbf{e}_{horizontal,2D}}{\alpha \cdot \sin\theta_{horizontal}} \tag{4.3}$$


Figure 4.10 Inverse projection of the measurement error from 2D camera view plane to 3D, shown in (A) the single camera case and (B) the dual-camera case following the orthogonal setup.


This means that when the object's moving path is parallel to the image plane, this equation yields the minimum 3D error. However, in the worst-case scenario, the 3D error can be infinite if the object is moving along the optical axis of the camera; in such cases, the detected displacement is invalid. Therefore, the scalar of the 3D error calculated from the 2D measurement is in the range of 1/sin 90° to 1/sin 0° (i.e. 1 to ∞).

The introduction of the second camera in the orthogonal setup solves this issue straightforwardly (Figure 4.10). This is achieved by calculating the error before projection (i.e. the 3D error) based on the measurement of both the cameras independently and taking the smaller one as the final error. The final error along the horizontal plane is given by

$$\mathbf{e}_{horizontal,3D} = \min\{\mathbf{e}_{horizontal,3D,left},\ \mathbf{e}_{horizontal,3D,right}\} \tag{4.4}$$

where $\mathbf{e}_{horizontal,3D,left}$ and $\mathbf{e}_{horizontal,3D,right}$ are the estimated 3D errors generated from the left and right cameras respectively.

The measurement errors generated from the two channels are relatively independent, and the magnitudes of the errors are very close to each other, given that the cameras are the same model and the same image processing operations are applied to both channels. Therefore, the 2D-to-3D error scalar is now theoretically in the range of 1/sin 90° to 1/sin 45° (i.e. 1 to √2), with the largest possible error occurring when the object is moving 45° off the optical axis.
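A minimal numerical sketch of Equations 4.3 and 4.4 is given below; the function names, and the assumption that the two viewing angles differ by exactly 90°, are illustrative.

```cpp
#include <algorithm>
#include <cmath>

// Illustrative back-projection of a 2D pixel-level error onto the 3D estimate
// (Equation 4.3). alpha is the lens scaling factor and theta is the angle, in
// radians, between the camera's optical axis and the motion component.
double error3DFromOneCamera(double error2D, double alpha, double theta)
{
    return error2D / (alpha * std::sin(theta));
}

// In the orthogonal setup the two viewing angles differ by 90 degrees, so the
// smaller of the two independent estimates is taken as the final 3D error
// (Equation 4.4).
double error3DOrthogonal(double error2D, double alpha, double thetaLeft)
{
    const double pi = 3.14159265358979323846;
    double left  = error3DFromOneCamera(error2D, alpha, thetaLeft);
    double right = error3DFromOneCamera(error2D, alpha, pi / 2.0 - thetaLeft);
    return std::min(left, right);
}
```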

In addition, the conversion between the coordinates obtained from the two cameras and 3D real-world coordinates is not complicated, and the reference system is straightforward to construct.

However, using the orthogonal configuration, the overlapping volume of the fields of view of the cameras is smaller than with the other approach, making it difficult to determine whether an object is within the common volume or can only be seen by one of the cameras. With the parallel design, by contrast, if an object can be seen by one of the cameras, it is very likely that it can be seen by the other one as well. In addition, the background clutter is significantly different in the views

of each camera. Hence, in the background removal stage, the foreground and background modelling cannot be shared between the two cameras, increasing the computational burden.

Parallel setup

For the parallel setup, a number of advanced computer vision techniques can be implemented to match the features between the two channels and extract the depth information, such as the Scale-Invariant Feature Transform (SIFT) or others based on a similar feature extraction principle. Since the background scenes captured by the two cameras are very similar, it is more straightforward to perform feature matching and generate the disparity map – the mapping of the pixel difference or motion between a pair of stereo images, from which depth data are accessible and in which the depth of the objects in the entire scene is encoded as pixel intensity.

If the two cameras are the same model and they are perfectly synchronised during recording, the depth information can be converted into point cloud data to reconstruct the 3D surfaces of the scene. The reconstruction of the 3D shapes is normally achieved by extracting the silhouettes from multiple perspectives of the same object after segmentation. Not only is it difficult to segment tiny image objects such as honeybees, but the 3D shape of a flying bee is also difficult to obtain since the wings beat constantly at very high frequency (~230 Hz), unless a high-speed camera is used (Ristroph, Berman, Bergou, Wang, & Cohen, 2009). It is impractical to implement for the tracking of multiple flying targets; therefore, in most cases, the flying bees would be treated as background noise. Further, as stated above, placing two cameras close together, perhaps 10-20 cm apart, may result in large errors when calculating the 3D position of an object moving along the optical axes. The error estimation is very similar to the single-camera situation; the only difference lies in the upper limit of the error scalar. For example, in a parallel setup where the angle between the optical axes of the two cameras is 6°, the final 3D error, according to Equation 4.3, lies in the range of 1/sin 90° – 1/sin 3° (i.e. 1 – 19.107), which is still very large and unacceptable. The error also accumulates during calculations, ultimately leading to unusable track information.


Additionally, when the two approaches are compared in terms of portability, the parallel plan has natural advantages: once the cameras are installed and calibrated, there is no need to configure the system in the field. It also requires very little space and is more adaptable to uneven ground.

It was concluded that positioning the cameras orthogonally was more suitable, since coordinate determination was easier and provided the minimum measurement error, with an adequate observation volume for honeybee tracking.

4.5 Second experiment session (3D determination)

The second phase of the field trials was conducted during the third year of this project, between June and July 2019. The experimental site was the same as the year before, except that this time there was almost no grass on the ground due to maintenance of the hive and the introduction of a second hive. A group of trees was deliberately included within the background (Figure 4.11), because the type of noise that the tree leaves generated was very similar to that of the grass. On some occasions, tree leaves can be more complicated to process than grass, since they are often 2 - 10 m above ground and cover a vertical range through which a honeybee could fly. Both hives were included in the observations, which provided statistics of two different honeybee communities led by two different queens.

Table 4.2 Bee flight monitoring study 2019

Date      Equipment                 Monitoring Period   Target/Camera Separation (m)   Resolution & Framerate   Weather
6 June    GoPro Hero 7 Black × 2    14.45-15.00         4.0                            2.7K60                   Clear
                                    15.00-15.25         4.0                            2.7K60
11 July                             10.35-10.55         4.0                            2.7K60                   Overcast
                                    11.05-11.20         4.0                            2.7K60



Figure 4.11 Image captured by the right (front) camera, showing the background, the target beehive and the location of the other camera (not entirely visible from this camera).

Two GoPro Hero 7 Black cameras were used in this setup. For long-term field trials they required no connection to a PC to work properly, and the battery usually lasted for more than three hours, which was acceptable in this case. The resolution and framerate were set to 2.7K (2704 by 1520) and 60 frames per second, the maximum bandwidth for this camera.

From the initial findings of the image processing stage, the visibility of the flying honeybees was not greatly affected by light intensity, white balance, or the hue and saturation of colours; the reasons for this are discussed in the following chapters. During a two-month period, two visits were paid to the beehives, during which the weather varied between clear and bright, overcast, and drizzle. Video recording was performed for several hours during the visits, with each clip lasting approximately nine minutes.

To reconstruct the offline 3D coordinates and 3D trajectories, the system used the orthogonal configuration as described above. The two cameras were set at the same height (1.2 m) to the front and the right of the beehive respectively and the optical axes of the two cameras intersected at the target and formed a 90°

angle. The combination of the 2D coordinate pairs was then achieved through appropriate 3D geometry mathematics.

Before the cameras were installed at the beehive site, they were calibrated in the laboratory using an 11 (column) × 9 (row) monochromic chessboard pattern to obtain the camera’s intrinsic matrix and the distortion coefficients. The dimension of the chessboard must be big enough to guarantee the visibility of the pattern during the edge and corner detection of the chessboard tiles. The two cameras were then calibrated to remove the lens distortions. The cameras used were the same model, which offered great convenience in the later processing stages.

Figure 4.12 The custom-made calibration MDF chessboard for the orthogonal calibration.

Another set of calibrations was conducted when the cameras were properly set up and ready for recording. To guarantee the relative position of the cameras, i.e. to ensure the angle formed by the two optical axes was 90° and the view planes of the cameras were perpendicular to the ground surface, a custom-made calibration rig was fabricated (Figure 4.12). This comprised two panels of MDF, 680 mm × 500 mm in dimensions attached to each other, forming a 90° angle. They were supported by two wooden bars at the back to ensure rigidity. On both panels, two chessboard patterns (total number of tiles: 7 (column) × 10 (row))

were painted with waterproof paint. The rig was placed next to the beehive before recording, with one panel facing the front and the other to the right. During the experiments, the cameras were positioned 3 m from the calibration panels; this was approximately 4 m from the target beehive. The alignment started with activating the calibration software (Figure 4.13 B), and the position and orientation of the cameras were adjusted based on the relative lengths of the axes drawn along the chessboard pattern, namely 3D pose estimation. When the green (depth) axis reached its shortest length (aiming at a length of less than 3 pixels) and the red (horizontal) axis and the blue (vertical) axis were perfectly aligned with the calibration lines (the absolute vertical and horizontal lines) marked on the view plane, it was considered that the visual plane of the camera was parallel to the calibration board and that the principal axis coincided with the normal vector from the origin point on the board. This meant that the two cameras were in an ideal orthogonal configuration. The boards were then removed for the recording phase. Videos from the left and right channels were synchronised using a momentary pulse from a mobile phone's flashlight, observable by both cameras in the overlapping volume (equivalent to a clapperboard strike). To double-check the accuracy of the angle and minimise the error, after each recording session one of the calibration panels was raised manually and moved across the overlapping field of view in different orientations, as a reference for further stereo calibration (Figure 4.13 A).
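The pose-estimation check can be sketched as follows. This is illustrative only: the inner-corner count, square size and function names are assumptions, and the axis colours follow OpenCV's own convention, which may differ from the colours quoted above. The sketch detects the chessboard corners, estimates the board pose with cv::solvePnP and overlays the coordinate axes so the operator can adjust the camera until the depth axis almost vanishes.

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

// Illustrative pose check against one calibration panel.
bool drawBoardPose(cv::Mat& frame, const cv::Mat& K, const cv::Mat& distCoeffs,
                   cv::Size patternSize = cv::Size(6, 9), float squareSize = 0.06f)
{
    std::vector<cv::Point2f> corners;
    if (!cv::findChessboardCorners(frame, patternSize, corners))
        return false;

    // Object points of the inner board corners in the board's own plane (z = 0).
    std::vector<cv::Point3f> objectPoints;
    for (int r = 0; r < patternSize.height; ++r)
        for (int c = 0; c < patternSize.width; ++c)
            objectPoints.emplace_back(c * squareSize, r * squareSize, 0.0f);

    cv::Mat rvec, tvec;
    if (!cv::solvePnP(objectPoints, corners, K, distCoeffs, rvec, tvec))
        return false;

    // Overlay the board's coordinate axes on the live view.
    cv::drawFrameAxes(frame, K, distCoeffs, rvec, tvec, 3 * squareSize);
    return true;
}
```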


Figure 4.13 (A) Cameras undergoing stereo calibration using a chessboard pattern after each recording. (B) The interface of the pose estimation calibration software, with alignment lines shown in different colours.


Theoretically, the angle formed by the cameras can be any angle between 0 and 180°, which can be substituted into the subsequent calculations once properly measured; the 3D pose estimation of the cameras with the two-piece chessboard is therefore not strictly necessary. However, both the 3D pose estimation and the stereo calibration were implemented, first to cross-validate the formed angle and second to ensure that the positions of the cameras were not affected by wind or other disturbances. The reason for using 90° was to maintain consistency from simulations to field trials, so that the findings and estimates generated by the simulated results could be further verified. Although it took more effort to form a specific angle during the experiments, it provided greater convenience during processing.

For security purposes, and to avoid damage, the equipment including the calibrated cameras and chessboards had to be removed after each experiment session. This clearly increased the amount of work associated with each field session. But for any commercial system, weatherproof cameras could be permanently installed once properly calibrated; the collection of data only requires the replacement of memory cards. Therefore, other types of camera mounting system could be used, which would provide better stability and durability (rather than normal tripods).

In addition to the video data, a 2D ultrasonic anemometer was introduced to monitor the wind speed around the beehive. However, the anemometer failed to synchronise with the cameras due to its low sampling rate. Thus, the collected wind speed data were not used to calculate the true flight speed of the bees against the wind; only the ground speed was derived. During periods when the wind speed shown on the anemometer was close to zero, the calculated flight speed was treated as the true speed.

4.6 Conclusion

This chapter provided a detailed description of the video recording system for honeybee flight capture, the design of each experiment and a brief introduction to the simulation scenario of the dual-imaging system. The first experiment

mainly focused on collecting 2D bee flight data for the initial processing of the video and the investigation of possible image processing techniques to apply. The second experiment used a mature imaging system, with two fully calibrated cameras recording simultaneously so that the 3D statistics could be calculated. The operating conditions and the advantages as well as the disadvantages of the system were also discussed. In the next chapter, the algorithms used to process the data and reconstruct the 3D flight paths of the bees will be explained, along with a detailed description of the simulation scenario.


5 Chapter 5 – Software development

5.1 Definition of the problem

The video recorded by a camera includes a large amount of information, stored as pixels in each video frame. Therefore, to reduce the computational burden of the software, it is essential to extract the regions of interest (ROI), i.e. the bee flight data, before any further processing. The removal of irrelevant information mainly relies on image segmentation via digital image processing. The goal of image segmentation is to partition all the pixels in an image into different sets, based on the relationships of neighbouring pixels, such as the consistency of surface texture and motion pattern. Once the segmentation is done, the image can be analysed not pixel by pixel but in terms of larger elements within it; this makes the interpretation of the information contained in an image much easier, especially for the processing of large amounts of data such as video. The image segmentation of honeybees, in terms of the software design, normally follows one of two main approaches:

• Pre-trained object classifier. This method first builds a 3D model of the target (i.e. the honeybee) based on several silhouettes or perspectives from various angles on a 2D plane, which can be regarded as the training stage of the classifier. The classifier is then applied to each incoming video frame, looking for image contours that match the features of the bee model within a search window of a certain size. However, several factors make this approach challenging to apply in an insect tracking situation. First, the wings of bees generate much more motion than the main body, and the wingbeat frequency of a bee (~230 Hz) is much higher than the frame rate of a common camera. During the training phase of the bee model, a large amount of data containing the orientation and posture of the wings is required, making the whole procedure more complicated and time-consuming. Second, at the available camera resolutions, a flying bee occupies merely a tiny area in the scene if a relatively large observation volume is targeted; the surface texture of the bee is therefore impossible to extract fully, since the search window has to be very small to fit the size of a bee. Additionally, since the number of bees in the same scene could easily be in double figures, the time required to finish the search is unacceptable. Therefore, this approach is not appropriate for the tracking of insects.

• Background subtraction. This was the method implemented in this research, which enabled high-speed image segmentation. As the name suggests, this method extracts the foreground moving objects by subtracting sequential frames from each other, thus eliminating the static background. Although the static background might include hovering or landed bees in some cases, here attention is paid only to the flight status and tracks of the bees. Background subtraction removed the majority of noise at the initial processing phase, so that the following modules were able to focus on the useful information with higher efficiency.

The types of background noise that were present in video data can be divided into four different categories:

1. Purely static background objects (such as buildings, rocks or fixtures) or objects with very slight motion (e.g. clouds). This type of noise was almost completely absent after the background subtraction operation and was very easy to remove.
2. Non-stationary objects bearing no resemblance to the bees in terms of shape and body size, e.g. the research team members, cars, twine that secured the beehive cells, or birds. This type of noise could not be removed simply by background subtraction but was segmented with filters based on contour area, colour or convexity.
3. Vegetation (e.g. grass and tree leaves). This type of noise required much more detailed filtering criteria than the previous one. Vegetation occupied similar areas to the bees and the bees were very likely to blend into grass or bushes, especially when searching for food sources. Hence the segmentation of vegetation required information extracted from sequential video frames, based on moving patterns, movement range, velocity and correlation with the wind speed.
4. Other insects or dimensionally small animals. This type of noise was very difficult for the system to separate from the bees based on the video information alone, since their appearance closely resembles that of a bee.

5.2 Processing procedure of the software

The image segmentation and flight track extraction proceeded according to the following phases (Figure 5.1), from raw recordings of bee flight to coordinate extraction for each detected track and its visualisation. The entire software processing chain was fully automated (following calibration of the cameras and pre-tuning of each filter).

1. The raw videos of bee flight were recorded and the calibration of the cameras performed.
2. Pre-processing of the raw videos, including the synchronisation of the two channels, video clipping and greyscale conversion.
3. Background subtraction.
4. Morphological operations.
5. Further noise filtering based on moving patterns and contour shapes.
6. Calculation of the centre of mass of all remaining image contours.
7. Mapping from the dual-channel 2D coordinates to 3D coordinates based on the epipolar geometry principle.
8. Inputting the collection of 3D locations of each video frame into a 3-state Kalman filter for the generation of the 3D flight track and its estimation.
9. Visualisation of the 3D dots and tracks with VTK (the Visualization Toolkit).


5.3 Flowchart of the software processing procedure

Figure 5.1 Flowchart of the bee tracking software and the output images of each intermediate phase


5.4 Software development environment

The main offline task was to process the large amount of video data with high efficiency and accuracy. Hence, many computer vision and image processing algorithms were required, as well as their fast implementation within the programme.

The OpenCV library (Bradski, 2000) was identified as an appropriate tool, considering the requirements of this bee tracking project. This is a library of programming functions for image processing and computer vision tasks, including feature extraction, facial recognition, image segmentation and motion understanding. It is designed for computational efficiency and to provide a simple-to-use computer vision infrastructure which allows the programmer to develop complex applications quickly. OpenCV also features multiple threads of execution, deep learning frameworks and full GPU acceleration in some functions. In this research, the development of the image segmentation (i.e. background subtraction) module mainly used functions available in the OpenCV library; these were carefully optimised for deployment in this research. The software development environment was Visual Studio Community 2017 and the coding language was C++. The version of OpenCV used in this project was 4.0.1, though some early-stage testing programmes were written using previous versions.

5.5 Camera calibration

No matter how accurately a camera lens is manufactured, some optical distortion is inevitable. Although some distortion may be arbitrary, most lens distortion can be modelled with mathematical equations and calibrated (corrected) using software. There are two main categories of distortion: radial distortion and tangential distortion (Figure 5.2).

Radial distortion is caused by light rays bending around the edge of a lens: the further the light ray is from the lens' optical centre, the worse the distortion. Radial distortion is present in every type of lens, especially long-range zoom lenses and wide-angle lenses. Even for a high-quality prime lens (fixed

focal length lens), the distortion may be up to 10% around the edges of the camera scene. This is a significant error when considering tracking insects at pixel-level accuracy. The radial distortion can be characterised by the first few (usually 2 or 3) terms of a Taylor series expansion around the optical centre of the imager.


Figure 5.2 (A) The impact of radial distortion, represented by the displacement of pixels induced by such distortion. (B) The impact of tangential distortion. (Bouguet, 2015)


For typical cameras, it is often sufficient to employ the first two terms (Weng, Cohen, & Herniou, 1992), i.e.:

$$x_{corrected} = x \cdot (1 + k_1 r^2 + k_2 r^4) \tag{5.1}$$

$$y_{corrected} = y \cdot (1 + k_1 r^2 + k_2 r^4) \tag{5.2}$$

where (x, y) is the location of the pixel with distortion, $(x_{corrected}, y_{corrected})$ are the corresponding distortion-free coordinates, $k_1, k_2, (k_3, \dots)$ are the distortion coefficients and $r$ is the distance between (x, y) and the centre of the radial distortion.

For highly distorted lenses such as fisheye lenses, a third radial distortion term is used,

$$x_{corrected} = x \cdot (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \tag{5.3}$$

$$y_{corrected} = y \cdot (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \tag{5.4}$$

Tangential distortion occurs when the lens and the image sensor are not parallel to each other. It can be minimally characterised by two additional parameters p1 and p2,

$$x_{corrected} = x + [2 p_1 x y + p_2 (r^2 + 2x^2)] \tag{5.5}$$

and

$$y_{corrected} = y + [p_1 (r^2 + 2y^2) + 2 p_2 x y] \tag{5.6}$$
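For illustration only, the short routine below applies the radial and tangential terms of Equations 5.1–5.6 to a single normalised point; in practice the correction is performed by OpenCV (e.g. cv::undistort), and the function name here is hypothetical.

```cpp
#include <opencv2/core.hpp>

// Illustrative forward application of the distortion model in Equations
// 5.1-5.6 to a normalised, centred point (x, y). k1, k2, k3 are the radial
// coefficients and p1, p2 the tangential coefficients.
cv::Point2d distortPoint(const cv::Point2d& p,
                         double k1, double k2, double k3,
                         double p1, double p2)
{
    const double r2 = p.x * p.x + p.y * p.y;    // squared radius from the centre
    const double radial = 1.0 + k1 * r2 + k2 * r2 * r2 + k3 * r2 * r2 * r2;

    const double xd = p.x * radial + 2.0 * p1 * p.x * p.y + p2 * (r2 + 2.0 * p.x * p.x);
    const double yd = p.y * radial + p1 * (r2 + 2.0 * p.y * p.y) + 2.0 * p2 * p.x * p.y;
    return cv::Point2d(xd, yd);
}
```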

Before initiating the recording of the experimental target, the cameras were calibrated to remove the distortion effect as far as possible. This was done by placing a rigid flat panel mounted with a calibration pattern (a chessboard pattern) in front of the camera and recording the entire pattern at different orientations and locations. The edges and corners of the chessboard tiles were extracted and analysed, especially when the panel was placed close to the edge of the scene. Then the camera intrinsic matrix K was computed. It contains information on the camera's focal length, the location of its optical centre and the skew coefficient. The intrinsic matrix is defined as


$$\mathbf{K} = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \tag{5.7}$$

where $f_x$ and $f_y$ are the focal lengths along the x-axis and y-axis (equal if the pixels of the image sensor are square), $s$ is the skew coefficient, which is non-zero only when the image axes are not perpendicular, and $c_x$ and $c_y$ are the coordinates of the optical centre of the sensor in pixel units.

Additionally, during the calibration process, the distortion vector D was also calculated, taking the form of

$$\mathbf{D} = [k_1\; k_2\; p_1\; p_2] \;\text{ or }\; [k_1\; k_2\; p_1\; p_2\; k_3] \;\text{ or }\; [k_1\; k_2\; p_1\; p_2\; k_3\; k_4\; k_5\; k_6] \tag{5.8}$$

Since the camera matrix and the distortion coefficients are independent of the location or orientation of the camera, the calibration was conducted in the laboratory rather than at the observation site. As long as the resolution setting remains unchanged, the camera matrix remains valid, and the distortion coefficients can be directly applied to any recording.
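A minimal sketch of this laboratory calibration with OpenCV is shown below; the pattern size, square size and function names are placeholders rather than the exact values used in this project.

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

// Illustrative single-camera calibration from chessboard views: corner
// detection on each recorded view, then cv::calibrateCamera to recover the
// intrinsic matrix K and the distortion vector D.
void calibrateFromChessboards(const std::vector<cv::Mat>& views,
                              cv::Size patternSize,     // inner corners, e.g. 10 x 8
                              float squareSize,         // tile edge length in metres
                              cv::Mat& K, cv::Mat& D)
{
    std::vector<std::vector<cv::Point2f>> imagePoints;
    std::vector<std::vector<cv::Point3f>> objectPoints;

    // Ideal board corner positions in the board's own plane (z = 0).
    std::vector<cv::Point3f> board;
    for (int r = 0; r < patternSize.height; ++r)
        for (int c = 0; c < patternSize.width; ++c)
            board.emplace_back(c * squareSize, r * squareSize, 0.0f);

    for (const cv::Mat& view : views) {
        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(view, patternSize, corners)) {
            imagePoints.push_back(corners);
            objectPoints.push_back(board);
        }
    }

    std::vector<cv::Mat> rvecs, tvecs;
    cv::calibrateCamera(objectPoints, imagePoints, views.front().size(),
                        K, D, rvecs, tvecs);
}

// Usage on a single frame: cv::Mat undistorted; cv::undistort(frame, undistorted, K, D);
```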

To minimise the processing burden and the inherent distortion, the linear lens mode was used instead of the wide-angle (fisheye) mode for the recording of bee flight at an acceptable resolution (2.7K), even though the fisheye mode of the action camera used in this project is more powerful and supports up to 4K recording.

Two different alignment methods of the cameras and the calibration panel were used, before and after the recording of bee flights respectively. The first used the object pose estimation approach, as described in the previous chapter, with two connected MDF boards containing the chessboard patterns. Once this calibration had been performed, the optical axes of the cameras were aligned to form an angle of 90° and their view planes were both aligned perpendicular to the ground surface. Another calibration was conducted when the recording of bee flight finished. Unlike the previous one, this type of calibration, namely stereo calibration, used only one chessboard pattern and the board was placed in the

public field of view of the two cameras with the cameras remaining stationary. During the calibration, the board was moved manually; this was very similar to the calibration process conducted inside the laboratory for one camera. Stereo calibration calculates the geometrical relationship between the two cameras, known as the extrinsic parameters, consisting of the rotation matrix R and the translation vector T (described in Chapter 3). In such a case, one of the cameras is regarded as the origin; the position and orientation of the second camera can then be derived from the extrinsic parameters.
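A sketch of this step using OpenCV's stereo calibration routine is given below, assuming the per-camera intrinsics from the laboratory calibration are already available; the wrapper function and its argument names are illustrative.

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

// Illustrative stereo calibration: given matched chessboard corner sets seen
// simultaneously by both cameras, recover the rotation R and translation T of
// the second camera relative to the first (the extrinsic parameters).
void stereoExtrinsics(const std::vector<std::vector<cv::Point3f>>& objectPoints,
                      const std::vector<std::vector<cv::Point2f>>& cornersLeft,
                      const std::vector<std::vector<cv::Point2f>>& cornersRight,
                      cv::Mat K1, cv::Mat D1, cv::Mat K2, cv::Mat D2,
                      cv::Size imageSize, cv::Mat& R, cv::Mat& T)
{
    cv::Mat E, F;   // essential and fundamental matrices are returned as well
    cv::stereoCalibrate(objectPoints, cornersLeft, cornersRight,
                        K1, D1, K2, D2, imageSize, R, T, E, F,
                        cv::CALIB_FIX_INTRINSIC);
}
```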

5.6 Pre-processing of the raw video

The raw videos were first checked for any dropped frames. One possible reason for frame dropping was incomplete initialisation of the camera. Although frame dropping rarely happened during the project, and occurred only at the very beginning rather than in the main section of a video clip, the video integrity check was retained for stability purposes. As briefly mentioned in the previous chapter, the synchronisation of the two video channels was achieved with a momentary pulse from a mobile phone's flashlight, which was placed in the common volume of the camera views. The synchronisation took place twice, before and after the recording session. According to a frame-by-frame inspection of the video, the flashlight gave two sequential pulses of different duration and intensity after a single click of the shutter. The first was short, with a duration very close to the exposure (1/60 s), and thus could only be captured in a single frame. The second pulse occurred 5 frames after the first and was brighter and longer. Therefore, to achieve a more accurate synchronisation, the video frames from both cameras were aligned according to the position of the first light pulse. The total numbers of frames between the two synchronisation points in the two channels were then compared, and video pairs were only accepted if these numbers were equal.
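The detection of the synchronisation pulse can be sketched as a simple search for a single-frame jump in mean image intensity, as below; the threshold value and function name are illustrative and not taken from the actual software.

```cpp
#include <opencv2/videoio.hpp>
#include <opencv2/imgproc.hpp>
#include <string>

// Illustrative detection of the flashlight synchronisation pulse: returns the
// index of the first frame whose mean intensity jumps above the previous
// frame's level by more than `jump` grey levels, or -1 if no pulse is found.
int findFlashFrame(const std::string& videoPath, double jump = 20.0)
{
    cv::VideoCapture cap(videoPath);
    cv::Mat frame, grey;
    double previousMean = -1.0;
    for (int index = 0; cap.read(frame); ++index) {
        cv::cvtColor(frame, grey, cv::COLOR_BGR2GRAY);
        double m = cv::mean(grey)[0];
        if (previousMean >= 0.0 && m - previousMean > jump)
            return index;                       // first frame of the short pulse
        previousMean = m;
    }
    return -1;
}
```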

Data pre-processing was required after the synchronisation of the dual-channel video recordings. This comprised the validation of necessary video elements, such as the proper image contrast, resolution, the integrity of files and the

compensation for severe vibration or occlusion, as well as other operations including video format conversion and cropping of the files. The videos were saved as several 5-minute, 24-bit uncompressed .avi video clips.

5.7 Background subtraction

Background subtraction (Piccardi, 2004) is very common with respect to computer vision applications. The goal is to recognise and segment the foreground individual objects from the background, so that the movement of the foreground objects can be tracked with higher accuracy and speed. The computation burden is also greatly reduced, due to the exclusion of redundant background information. Background subtraction yields impressive results in surveillance applications, in which the camera positions are often fixed and the field of view of the camera is static.

Normally, there are three steps in a typical background subtraction procedure. The first is to build a model of the background, based on the first few frames of the video. Next, the subtractor compares each new incoming frame with the model and extracts the foreground information by subtraction. In the final stage, the background model is updated to mitigate effects such as changes in illumination; a previous foreground object that blends into the background during processing is also relabelled as a background object. Some background subtraction algorithms can also compensate for camera shake and rotation.

The conventional background subtraction approach reads in a pure background image before processing starts, which serves as the comparison criterion. The absolute difference in intensity for every single pixel is then calculated by subtracting the background image from each video frame. This difference is then compared with a threshold, and pixels are marked as either background or foreground based on the thresholding result. It is given by

$$\mathbf{F}_t(x, y) = \begin{cases} 1, & \text{if } |\mathbf{I}_t(x, y) - \mathbf{B}(x, y)| > Threshold \\ 0, & \text{if } |\mathbf{I}_t(x, y) - \mathbf{B}(x, y)| \le Threshold \end{cases} \tag{5.9}$$

where $\mathbf{F}_t(x, y)$ is the motion label of the pixel at coordinate (x, y) at time t, $\mathbf{I}$ is the pixel value and $\mathbf{B}$ is the background model.

This conventional approach is commonly called Static Frame Difference. However, the accuracy of the result is guaranteed only when the background is perfectly static - no change is permitted, including a slight shake of the camera or a change in illumination. Such requirements are not realistic in practical applications. Therefore, the Static Frame Difference method has very limited use in common image segmentation tasks. It is, however, the basic idea behind background subtraction.
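A minimal sketch of Equation 5.9 using OpenCV primitives is shown below; the threshold value and function name are illustrative.

```cpp
#include <opencv2/imgproc.hpp>

// Illustrative static frame difference (Equation 5.9): pixels whose absolute
// difference from a fixed background image exceeds the threshold are marked
// as foreground (255), all others as background (0).
cv::Mat staticFrameDifference(const cv::Mat& greyFrame, const cv::Mat& greyBackground,
                              double thresholdValue = 25.0)
{
    cv::Mat diff, foregroundMask;
    cv::absdiff(greyFrame, greyBackground, diff);
    cv::threshold(diff, foregroundMask, thresholdValue, 255, cv::THRESH_BINARY);
    return foregroundMask;
}
```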

To apply a certain amount of adaptivity, the next level of background subtraction algorithm uses the frame prior to the current one, or the mean pixel intensity of previous frames, instead of a static image, as the background model. With a pre-set learning rate α, the initialisation and update of the background model are given by

Initialisation:

$$\mathbf{B}_t = \frac{1}{N}\sum_{n=1}^{N}\mathbf{I}_{t-n} \tag{5.10}$$

Update:

$$\mathbf{B}_t = (1-\alpha)\mathbf{B}_{t-1} + \alpha\mathbf{I}_t \tag{5.11}$$

where $\mathbf{B}_t$ is the background model at time t and $\mathbf{I}_t$ is the greyscale form of the video frame at time t.
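This running-average update maps directly onto an OpenCV call, sketched below under illustrative names; the learning rate shown is arbitrary.

```cpp
#include <opencv2/imgproc.hpp>

// Illustrative running-average background model (Equation 5.11) built on
// cv::accumulateWeighted; `model` must be a CV_32F image of the same size as
// the greyscale frames and `alpha` is the learning rate.
void updateBackgroundModel(const cv::Mat& greyFrame, cv::Mat& model, double alpha = 0.05)
{
    if (model.empty())
        greyFrame.convertTo(model, CV_32F);          // initialise with the first frame
    cv::accumulateWeighted(greyFrame, model, alpha); // model = (1-alpha)*model + alpha*frame
}
```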

As background subtraction algorithms evolved, algorithms based on the Gaussian distribution and the mixture of multiple Gaussian distributions were derived. The mixture of Gaussian model is integrated with edge detection and texture extraction algorithms in the segmentation of the foreground objects. The background subtractor implemented in this research was an improved version of the mixture of Gaussian model, namely MOG2 (Zivkovic, Improved adaptive Gaussian mixture model for background subtraction, 2004). It is highly efficient, very robust and easy to implement. Several other background subtractors that are available in the OpenCV library were also investigated, including the MOG


(the first generation of mixture of Gaussian model) (KaewTraKulPong & Bowden, 2002), CNT (CouNT) (Zeevi, 2017), GSOC (an algorithm implemented during one Google Summer of Code), GMG (Godbehere & Goldberg, 2012) and KNN (K- nearest neighbours method) (Zivkovic & Heijden, 2006).

The investigation of the appropriate background subtraction method was conducted by analysing the performance of each method on the recorded video data of bee flight around the hive. The objects present in the scene included the target beehive and the flying bees around it, the walking researcher, the buildings in the distance and the tree leaves next to the hive.


Figure 5.3 The performance of six studied background subtractors for one frame taken from the bee flight recordings. (Figures have been cropped for visualisation): (A)original frame (B)MOG2 (C)MOG (D)CNT (E)GMG (F)GSOC (G)KNN.

Figure 5.3 shows the binarized output image produced by each background subtractor for a given input image. The white areas represent estimated foreground objects that are marked with positive motion labels, while the black areas represent the background. Each subtractor was applied using its default settings (length of historical frames - 400, threshold - 25).

Unlike the first version of the Mixture of Gaussians (MOG) subtractor, which creates the same number of Gaussian distributions (usually 5) for every pixel in the image, MOG2 is more adaptive, selecting the appropriate number of Gaussian distributions based on the history of each pixel. This makes the algorithm much more computationally efficient in this bee tracking application, because most of the image tends to remain static. Furthermore, the MOG2 subtractor enables the detection of object shadows and is capable not only of detecting the edges and contours of fairly large objects, but also of preserving dimensionally small features, such as the locations of the bees.

The CNT subtractor is very powerful in detecting large objects and has high tolerance to camera shake. However, it has a tendency to remove small objects or singularities, which results in very significant losses of bee trajectories. The CNT subtractor requires very little processing power and is therefore commonly used on low-power single-board computers such as the Raspberry Pi.


The GMG subtractor requires a relatively long initialisation using the initial (120 by default) frames for background modelling. It employs a probabilistic foreground segmentation algorithm and detects foreground objects using Bayesian inference. But the results of the GMG subtractor on the bee flight video yielded too much noise, making it very difficult to recover the bee locations.

The GSOC subtractor is extremely powerful in segmenting large objects and recovering the interior part of the object even when the object is single coloured. However, as can be seen, almost no other information is left besides the researcher contour. It also requires the longest processing time compared with other subtractors in the list.

The KNN subtractor is very efficient when the number of foreground pixels is low. Its output quality is very close to that of the MOG2 subtractor and it even requires fewer morphological operations in later tuning. Its biggest drawbacks, however, are that it requires considerably more processing time and has no shadow detection feature.

The objects studied in this project were honeybees. At a distance of 3 - 5 m from the cameras, the maximum area occupied by a bee was tens of pixels. This meant that the desired foreground objects were all small-sized contours; there was therefore no requirement to recover the interior of an object, and small-sized features were to be preserved as much as possible without introducing too much grain into the image. Therefore, the MOG2 subtractor was chosen as the main algorithm, with the KNN subtractor used to cross-validate the results of MOG2 in several cases.

Figure 5.4 shows the performance of the MOG2 subtractor in a practical application after certain parameter adjustments. The length of historical frame reference was set to 200 and the threshold to 16. The threshold of 16 was chosen to preserve as much useful information as possible (i.e. the bee locations), whilst rejecting irrelevant particle noise. If the history length was too high, the historical locations of the foreground objects tended to stay longer in the image, leaving very long trails, and the update of low-speed objects was also very slow; if it was too low, each identified object would flicker more, due to abrupt changes in its displacement, which is adverse to the generation of a stable contour of the object.

A Gaussian blur filter with a kernel size of 5 by 5 was applied to every video frame before the background subtraction, to reduce the sensor grain associated with the ISO setting of the image sensor.
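The corresponding OpenCV calls, with the parameter values stated above (history 200, threshold 16, 5 x 5 Gaussian kernel), are sketched below; the file name and loop structure are illustrative only and do not reproduce the full processing pipeline.

import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16,
                                                detectShadows=True)
cap = cv2.VideoCapture("left_camera.mp4")   # hypothetical input file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(grey, (5, 5), 0)   # suppress sensor grain before subtraction
    fg_mask = subtractor.apply(blurred)           # binarized foreground (shadows marked as 127)
cap.release()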

This stage carried the heaviest computational load of the whole procedure and was the most time consuming. Each 5-minute 2.7K 60 fps video required approximately 15 minutes to process on a laptop featuring an Intel i7-8750H CPU and an Nvidia RTX2070-MQ GPU, with GPU acceleration enabled during the processing. It is worth noting that without GPU acceleration, the processing would have required approximately ten hours. The main factors determining the computational load were the number of Gaussian models in the subtractor and the number of historical video frames included.

Figure 5.4 Output of the MOG2 subtractor on one frame from the bee flight recordings, showing bee contours (marked with red circles) and a large amount of noise from the tree leaves.


5.8 Morphological operations & further filtering

After the application of the background subtractor, there was still a large number of positive motion labels left in the image. The majority of them came from the swaying tree leaves and grass, and the locations of flying bees were, of course, among them too. Although it was still challenging to distinguish the bees from other noise when inspecting the images frame by frame by eye, once the binarized image sequence was played as a video the flight tracks of the bees could already be observed. In theory, the extraction of bee locations could have been done manually from this point. However, the workload would have been huge, especially for such a multi-object tracking task. It was therefore imperative to make the entire process more automated; hence, to remove the noise generated by ground vegetation, another set of operations was required. This comprised morphological operations and further filters based on various criteria.

Morphologically, the shapes of bees and tree leaves showed no obvious difference after the background subtraction. But they have very distinct motion patterns. The movement range of bees was much greater and the probability of the same bee flying past the same location in the scene in a short amount of time was extremely low. In contrast, since a leaf is attached to a branch, even with the help of wind, its movement is highly constrained to within a small region of space. This means within a short video clip, a certain leaf was very likely to remain around a position, or at most a very limited neighbourhood of that position; whereas a bee could appear in any location in the scene. Hence its probability distribution of appearance in the entire scene was much more evenly distributed, although higher in the vicinity of the entrance to the beehive.

To exploit these motion differences, the entire video was divided equally into clips of 10 s duration. The motion labels (i.e. 1 for non-static and 0 for static) of every pixel in a frame were accumulated in a 16-bit unsigned integer image (the required bit depth depends on the total number of frames in a clip). The accumulated values were then normalised back to an 8-bit image (i.e. pixel value range 0-255), namely the foreground mask image, to ensure that the mask image could be

applied to the original 8-bit video without overflowing. This mask image could also be saved in a common image format for easier visualisation.

Figure 5.5 The accumulated probability distribution of moving pixels across a 24-second recording, represented as the foreground mask image.

In the mask image shown in Figure 5.5, the pixel intensities of most of the locations where the bees once were are below 10, whereas the pixel values of objects moving within a very small space, such as the tree leaves, are all higher than 50. The mask image was then passed through two thresholding functions. The first was a normal binary thresholding, suppressing pixel values lower than thresh_1 down to 0. The second was a to-zero-inverse form of thresholding, suppressing pixel values higher than thresh_2 down to 0. After this thresholding, only pixel values between thresh_1 and thresh_2 were retained; these were then lifted to 255 to generate a complete binarized mask image. Note that thresh_1 and thresh_2 were set to 10 and 50 respectively. The thresholding operations are defined by

THRESH_BINARY:

output(x, y) = \begin{cases} Max, & \text{if } input(x, y) > Threshold \\ 0, & \text{otherwise} \end{cases} \quad (5.12)

104

THRESH_TOZERO_INV:

output(x, y) = \begin{cases} 0, & \text{if } input(x, y) > Threshold \\ input(x, y), & \text{otherwise} \end{cases} \quad (5.13)
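A sketch of this accumulate-and-band-pass step in Python with OpenCV is shown below. Note that the to-zero-inverse threshold is applied before the binary threshold in this sketch so that the band between thresh_1 and thresh_2 survives, as described above; the function name and the normalisation call are assumptions.

import cv2
import numpy as np

def vegetation_mask(binary_frames, thresh_1=10, thresh_2=50):
    # Accumulate the 0/1 motion labels of a clip in a 16-bit unsigned image
    acc = np.zeros(binary_frames[0].shape, dtype=np.uint16)
    for f in binary_frames:
        acc += (f > 0).astype(np.uint16)
    mask = cv2.normalize(acc, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Equation 5.13: zero everything brighter than thresh_2 (fast-moving vegetation)
    _, mask = cv2.threshold(mask, thresh_2, 255, cv2.THRESH_TOZERO_INV)
    # Equation 5.12: lift the remaining values above thresh_1 to 255
    _, mask = cv2.threshold(mask, thresh_1, 255, cv2.THRESH_BINARY)
    return mask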

Figure 5.6 The binarized image of the foreground mask, showing image components whose pixel intensity lies within 10-50 in the original mask.

An obvious difference can be seen in the mask image shown in Figure 5.6. The majority of the noise has been removed and the tracks of the bees are clearly visible. Although a small number of tree leaf artefacts remain, they can easily be filtered out since their contours are smaller than those of the bees. The removal of such noise was done using several morphological operations.

Morphological operations comprise a set of non-linear operations that alter the shape of objects in an image; they are most often applied to binary images, in which the relative ordering of pixel values matters more than their absolute numerical values. Before the operation, a probe shape, usually called the structuring element, is selected; it defines the shape (e.g. rectangle, ellipse) and size of the probe. There are two main types of morphological operation: erosion and dilation.

Mathematically, the erosion of the binary image I by the structuring element S is defined by

I \ominus S = \{(i, j) : S_{ij} \subseteq I\} \quad (5.14)

where (i, j) is any position in the image and S_{ij} is the structuring element centred at position (i, j).

The erosion operation searches for every position that meets the above condition, and the collection of all such positions becomes the output area. The effect of erosion is that the outline of the target area is "eroded" by the radius of the structuring element (in the case of a circular one).

Similarly, the definition of the dilation operation is

I \oplus S = \{(i, j) : S_{ij} \cap I \neq \emptyset\} \quad (5.15)

thus the resulting area after dilation is the input area "dilated" by the radius of the structuring element, making it slightly larger around the boundary.


Figure 5.7 A simplified demonstration of the effect of (i) the dilation operation and (ii) the erosion operation on continuous shapes. In this example, a rectangle A is being transformed with a round structuring element B. The shapes with solid outlines are before the operation and the dashed-lined ones are the resulting shapes. (Owens, 1997)


For the practical application onto a digital image, it is much more straightforward since the pixels are normally square shaped. Thus, during the erosion operation, the probe examines the neighbourhood of all the pixels in an image by the size and shape of the structuring element. If the pixel values under the structuring element are all ‘1’(or ‘255’ in 8-bit images), then this examined pixel is set to ‘1’, otherwise it is eroded, i.e. set to ‘0’. And for the dilation operation, the examined pixel is set to ‘0’ only if the neighbouring pixel values are all ‘0’, otherwise ‘1’.


Figure 5.8 The effect of (i) the dilation operation and (ii) the erosion operation on digital shapes. A 3 × 3 square structuring element is applied. (Efford, 2000)

The erosion operation removes dimensionally small objects and reduces the sizes of larger objects, yielding cleaner images, while the dilation operation makes everything larger and the weaker signals more obvious. The shape of the structuring element can be a basic rectangle, or an ellipse to form rounded edges around the region of interest and smoother transition of different shapes.

There are several variants of the morphological operation based on these two basic types, for example opening and closing, which are an erosion followed by a dilation and a dilation followed by an erosion respectively. Because erosion and dilation are non-linear, these compound operations are not merely the two basic ones applied in a different order; they enhance the image to a greater extent. The opening operation essentially

removes unwanted particle noise while leaving the size of the remaining desired signals unchanged. The closing operation reunites the small-scale details around larger objects, since they often belong to different parts of the same object, and removes isolated small noise components.

Figure 5.9 The effect of the opening morphological operation on a scene containing bees flying among leaves.

After performing MOG2 background subtraction on the raw videos, a dilation operation with a 3 × 3 elliptical structuring element (kernel) was applied. This operation amplified all of the information in the image, both the useful information and the noise. Meanwhile, the mask image obtained in the previous section was processed with an opening operation using a rectangular 3 × 3 kernel, to remove the particle noise remaining after the thresholding and to keep the results of bee recognition at the same level of clarity (Figure 5.9). The opening operation also united previously isolated neighbouring contours into one, such as the body and wings belonging to the same individual bee. This obviated the risk of repeated detections of the same object. However, in some cases two or more bees that were very close to each other were combined into one segment, so the Kalman filter was employed to resolve such cases and provide correct estimates of the bee motion trajectories.
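These two operations can be expressed with the OpenCV morphology API as in the following sketch; the function and argument names are illustrative.

import cv2

def clean_masks(subtracted_frame, vegetation_mask):
    ellipse_3x3 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    rect_3x3 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    # Dilation of the background-subtracted frame amplifies the weak bee contours
    dilated = cv2.dilate(subtracted_frame, ellipse_3x3, iterations=1)
    # Opening of the accumulated mask removes residual particle noise
    opened = cv2.morphologyEx(vegetation_mask, cv2.MORPH_OPEN, rect_3x3)
    return dilated, opened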

Image contour division and detection was then implemented based on contour area and contour convexity: contours with an area of between 10 and 150 pixels and a shape that was rounded rather than strip-like were identified. Thresholds were set to ensure smooth contours and to identify bees within the field of view. The centre of mass of every valid contour, which was considered to be the target location, was computed and saved as the input for the next processing module.
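A simplified version of this contour filtering stage is sketched below (Python, OpenCV 4 interface); for brevity the convexity/roundness test is omitted and only the area criterion and centroid computation are shown.

import cv2

def bee_centres(binary_frame, min_area=10, max_area=150):
    contours, _ = cv2.findContours(binary_frame, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centres = []
    for c in contours:
        if min_area <= cv2.contourArea(c) <= max_area:   # contours of 10-150 pixels
            m = cv2.moments(c)
            if m["m00"] > 0:
                # Centre of mass of the contour, taken as the target location
                centres.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centres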

5.9 Software simulation

After the software development of the object segmentation on 2D images was finished, the priority of the work changed to the estimation of flight tracks based on the historical positions, so that multiple targets in the same scene could be tracked at the same time. The application of this system in 3D practical scenarios was also addressed. Therefore, three different simulation scenes were built, including the construction of the Kalman filter and its adjustment, the exploration of the parameter settings and the expected observing volume of the dual-camera monitoring system, and the feature point matching of 2D coordinates pairs to 3D coordinates.

Before the final decision on the choice of camera was made, the video data was not yet available for the software to process. The performance of the image filters and the background subtractor was tested using simulated bee flight data. The simulation was supposed to mimic the relatively irregular flight patterns of a honeybee so that the recognition module could be tested on multiple moving objects with various velocities. The displacement vectors of the virtual bees were generated and then fed into the pre-built 2-state Kalman filter. Hence as soon as the field data was available, the Kalman filter could be applied subject to minimum adjustment.

The virtual bees followed the path of an auto-generated polar curve. The coefficients of the curve were pseudo-random, and the curve took the functional form

\theta = \omega \times N_{frame} \quad (5.16)

\rho = (7 - 3\cos k_1\theta)\sin k_2\theta + \cos k_3\theta \quad (5.17)

where θ and ρ are respectively the angular and radial coordinates of the current position of the object in the polar coordinate system, ω is the angular velocity of the object, N_frame is the serial number of the current frame, and k_1, k_2 and k_3 are random coefficients ranging between 0.0 - 4.9, 0.0 - 4.9 and 0.0 - 9.9 respectively, which vary after a certain number of frames. Even though the angular velocity was constant, since ρ was varying, the linear velocity of the object was also constantly changing.

To ensure the simulated results approximated the real data as closely as possible, the maximum curvature and linear velocity of a moving point was limited, based on the approximate maximum acceleration and speed a bee could achieve.
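The path generator of Equations 5.16 and 5.17 can be sketched as follows (Python with NumPy); the angular velocity value is an arbitrary illustrative choice and the k coefficients are those of the example in Figure 5.10.

import numpy as np

def virtual_bee_path(n_frames, omega=0.05, k1=4.6, k2=2.5, k3=8.3):
    n = np.arange(n_frames)
    theta = omega * n                                                             # Equation 5.16
    rho = (7 - 3 * np.cos(k1 * theta)) * np.sin(k2 * theta) + np.cos(k3 * theta)  # Equation 5.17
    # Convert the polar path to Cartesian coordinates for rendering and tracking tests
    return np.stack([rho * np.cos(theta), rho * np.sin(theta)], axis=1)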

An example of the generated trajectory of one virtual honeybee is shown in Figure 5.10 below.

Figure 5.10 One example of generated path where k1 = 4.6, k2 = 2.5, k3 = 8.3

The stored version of the simulated data used a pure black background. The moving objects (i.e. the virtual bees) were set to be solid coloured disks with a very small diameter. For the convenience of visualisation, they were also painted with different colours. However, the colour information was completely ignored during contour detection, hence the recognition of different targets was purely based on their moving patterns rather than any other cues.


Figure 5.11 (A) shows one example of simulated situations in which three virtual bees were “released” into the scene, the flight track of each virtual bee was generated independently, then superimposed onto the same plane. There were also cases with occlusion and overlapping of two targets. Figure 5.11 (B) shows the estimation of the 2D Kalman filter on one of the targets.


Figure 5.11 (A) Stacked locations of three moving virtual bees. (B) Object locations estimated by the Kalman filter (red dots) and ground truth of the detection (green dots).

The second simulation scenario was constructed using a 3D modelling and rendering tool, Autodesk 3ds Max. In this simulation a second camera was introduced, and the goal was to define the appropriate distance between the cameras and the subject, and the proper resolution and frame rate. Considering the size of the field trial site and the average capability of readily available cameras, which determines the visibility of bees, the target distance was set to 3000 mm. Additionally, in 3ds Max, models can be rendered with custom surface textures and realistic illumination. A virtual chessboard was also made, which could be moved and rotated freely (Figure 5.12). With the 3D chessboard model it was also possible to validate the R (rotation), T (translation) and F (fundamental) matrices in the epipolar geometry system with very high accuracy. The proper size of the chessboard pattern for camera calibration, and its valid detection range, were also investigated.


Figure 5.12 The 3D simulation scene for the dual-camera setup constructed in 3ds Max.

The third simulation was conducted using the scene built in the second simulation. Altogether 20 spheres with the same radius (10 mm) were placed randomly in the scene. They all followed different 3D moving paths that were configured in advance. The animation of the spheres and the recording by both virtual cameras started simultaneously. Because the spheres were identical in size and colour, and they moved inside a 3D space, their projected sizes on the camera view planes varied. There were frequent instances of overlap, occlusion and inclusion of different spheres, and some were even shaded by others in some cases. The nature of bee flight in the real world was therefore believed to be adequately modelled by these simulated data, which were more than sufficient to reflect the motion of multiple subjects in the same 3D space. Since the simulated data required no object segmentation or noise filtering (there was no noise in the simulated case), and the coordinates were extracted with a simple contour detection method, the result was only used for the investigation of 3D triangulation and the expansion of the Kalman filter from the 2D plane to 3D space, i.e. the 3-state Kalman filter.


Figure 5.13 The simulated dual-camera imaging system and 20 moving spheres. The purple curve is the moving path of one of the spheres.

5.10 Epipolar geometry system and 3D triangulation

The basic principle of the epipolar geometry was described in Chapter 3, therefore in this section, this method will be demonstrated with an example of aligning and matching the locations of the corners of a ground tile in the field trial site (shown in Figure 5.14 below).

Before the processing of the bee flight videos, the epipoles e and e', and the fundamental matrix F, which encodes the relative camera positions, resolutions and focal lengths of the camera configuration used during the field experiment, were calculated from the relative positions of the calibrated cameras. During the mapping of each subsequent frame pair, the left camera provided the coordinates of all the detected objects in its scene, serving as the input for calculating the epipolar lines. These corresponding epipolar lines were then drawn in the view plane of the other camera.
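The epipolar lines and the point-to-line distances used for the correspondence matrix can be obtained as in the sketch below, where F is the fundamental matrix estimated during stereo calibration; the helper names are hypothetical.

import cv2
import numpy as np

def epilines_in_right(points_left, F):
    # Lines a*x + b*y + c = 0 in the right view, one per detection in the left view
    pts = np.asarray(points_left, dtype=np.float32).reshape(-1, 1, 2)
    return cv2.computeCorrespondEpilines(pts, 1, F).reshape(-1, 3)

def point_line_distance(point, line):
    a, b, c = line
    x, y = point
    return abs(a * x + b * y + c) / np.hypot(a, b)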



Figure 5.14 A frame pair from the left (A) and right (B) camera recordings, demonstrating the workings of the epipolar geometry system.

As is shown in Figure 5.14 (A), the corners of the red tile in the left camera were marked manually with four different colours. Their coordinates were then fed into the epipolar line calculation. The positions of these corners in the right scene were also marked manually as the “ground truth”. The coloured lines crossing through them are the calculated epipolar lines based on the detection in the other camera. Since the epipolar lines are defined as lines passing through the right epipole e’ (the position where all the epipolar lines converge to in


Figure 5.14 (B)) and through the coordinates corresponding to the left detections, the shortest distances from every detected point in the right scene to every epipolar line were calculated, converted and saved in a correspondence matrix. If the epipolar lines of the red and blue dots were very close in the right camera, the system made the assignments based on the distances from each dot to each undetermined line. In extreme cases, when two or more epipolar lines overlap completely, the system fails for the current frame pair only; correct assignment resumes in the next frame, and assignments with a large offset are removed in the trajectory estimation phase. Furthermore, the pairing conditions required a point not only to be the closest point to the epipolar line, but also to lie within the correspondence threshold (in this case 0.01). Points in the right scene that satisfied both conditions were considered to be the matched points of the left target locations. The real world coordinates of point P(x_p, y_p, z_p) are given by

x_p = \frac{S \alpha x_r (f + \alpha x_l)}{f^2 - \alpha^2 x_r x_l} \quad (5.18)

y_p = \frac{S \alpha x_l (f + \alpha x_r)}{f^2 - \alpha^2 x_r x_l} \quad (5.19)

z_p = \frac{S \alpha y_l (f + \alpha x_r)}{f^2 - \alpha^2 x_r x_l} \quad \text{or} \quad z_p = \frac{S \alpha y_r (f + \alpha x_l)}{f^2 - \alpha^2 x_r x_l} \quad (5.20)

\alpha = \frac{W_{pix}/2}{f \tan(\theta_{horizontal}/2)} \quad \text{or} \quad \alpha = \frac{H_{pix}/2}{f \tan(\theta_{vertical}/2)} \quad (5.21)

where S = 3000 mm is the target distance from the calibration chessboard to each camera, α is the millimetre-to-pixel width scalar, f = 17 mm is the focal length of the camera, (x_l, y_l) and (x_r, y_r) are the 2D coordinates of the target object on the left and right view planes respectively, W_pix = 2704 and H_pix = 1520 are the width and height (in pixels) of a 2.7K frame, and θ_horizontal = 86° and θ_vertical = 71° are the horizontal and vertical field of view angles of the camera.
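A direct transcription of Equations 5.18-5.21 is sketched below; the assumption that the image coordinates (x_l, y_l) and (x_r, y_r) are measured from the optical centre of each view, and the use of the first form of Equation 5.20, are illustrative choices.

import math

W_PIX, F_MM, S_MM = 2704, 17.0, 3000.0     # 2.7K width (px), focal length (mm), target distance (mm)
THETA_H = math.radians(86)                 # horizontal field of view
ALPHA = (W_PIX / 2) / (F_MM * math.tan(THETA_H / 2))   # Equation 5.21

def triangulate(xl, yl, xr, yr):
    # Equations 5.18-5.20
    denom = F_MM ** 2 - ALPHA ** 2 * xr * xl
    xp = S_MM * ALPHA * xr * (F_MM + ALPHA * xl) / denom
    yp = S_MM * ALPHA * xl * (F_MM + ALPHA * xr) / denom
    zp = S_MM * ALPHA * yl * (F_MM + ALPHA * xr) / denom
    return xp, yp, zp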


Figure 5.15 Schematic of the 3D triangulation

5.11 Motion estimation and the generation of flight tracks

The final part of the software comprised a 3-state Kalman filter to smooth the moving trajectories and to predict the flight pattern of the target. Based on the historical velocity, motion direction and current orientation of existing tracks, as well as the abilities of bees to make sharp turns, accelerate and decelerate, the 3D tracking module distinguished multiple targets during their convergence and kept track of the correct one. The flowchart of this motion estimation module is shown in Figure 5.16.

The Kalman filtering model was built with three state variables and their derivatives (3D coordinates and 3D velocity) as the state vector. The covariance matrix of the process noise, Q, and the covariance matrix of the observation noise, R, were defined based on a set of simulation results. They were further adjusted to weight the historical data and the current measurement differently. The final values used were 10^-1 and 10^-4 for every element of Q and R respectively.
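A constant-velocity filter of this form can be assembled with the OpenCV KalmanFilter class as sketched below. The text states that every element of Q and R was set to 10^-1 and 10^-4 respectively; a diagonal form is assumed here to keep the covariance matrices well conditioned.

import cv2
import numpy as np

def build_kalman(dt=1.0 / 60.0):
    kf = cv2.KalmanFilter(6, 3)               # state (x, y, z, vx, vy, vz); measurement (x, y, z)
    kf.transitionMatrix = np.float32([[1, 0, 0, dt, 0, 0],
                                      [0, 1, 0, 0, dt, 0],
                                      [0, 0, 1, 0, 0, dt],
                                      [0, 0, 0, 1, 0, 0],
                                      [0, 0, 0, 0, 1, 0],
                                      [0, 0, 0, 0, 0, 1]])
    kf.measurementMatrix = np.hstack([np.eye(3), np.zeros((3, 3))]).astype(np.float32)
    kf.processNoiseCov = np.eye(6, dtype=np.float32) * 1e-1       # Q (diagonal form assumed)
    kf.measurementNoiseCov = np.eye(3, dtype=np.float32) * 1e-4   # R (diagonal form assumed)
    return kf

def track_step(kf, detection_mm):
    # detection_mm: np.float32 column vector [[x], [y], [z]] of a new 3D measurement
    kf.predict()
    return kf.correct(detection_mm)[:3].ravel()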


Figure 5.16 Flowchart of the motion estimation and flight track generation module based on the coordinates of detected objects in each frame.

The whole group of discrete 3D coordinates in every video frame was fed into the filter to initialise it and, in turn, optimise the filter itself. After initialisation of the model, new detections grabbed from the current frame were allocated to existing tracks by the Hungarian algorithm, which calculated a 2D cost (distance) matrix between every newly detected object and all the existing tracks and output an allocation strategy with minimum overall cost. In the first instance, this attached every object in the current frame to its nearest existing track based on the cost matrix. The maximum accepted distance

offset in the programme was set to 50 pixels, which means any new detection that was more than 50 pixels away from every existing track was put into a vector named unallocated points; because their distances from all the existing tracks were too large, these were considered the starting points of new tracks. Existing tracks that obtained no allocation were marked as "frame skipped/dropped" and received more attention as the filtering process continued, to establish whether their tracked targets had disappeared permanently or merely for a couple of frames. The threshold for frame drops was 20: if the tracked object was lost, the Kalman filter continued to estimate future locations of the target for at most the next 20 frames. If the target could not be retrieved after that, the tracking of the current object was terminated and the last (up to 20) locations were removed from the track, since they were pure estimates.
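The assignment step can be reproduced with an off-the-shelf Hungarian solver, as in the sketch below (SciPy); the gating distance of 50 follows the text and the function name is hypothetical.

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def assign_detections(predicted_positions, detections, max_cost=50.0):
    cost = cdist(np.asarray(predicted_positions), np.asarray(detections))
    rows, cols = linear_sum_assignment(cost)            # minimum-cost allocation
    matches, unallocated = [], set(range(len(detections)))
    for t, d in zip(rows, cols):
        if cost[t, d] <= max_cost:                      # reject assignments beyond the 50-pixel gate
            matches.append((t, d))
            unallocated.discard(d)
    skipped = set(range(len(predicted_positions))) - {t for t, _ in matches}
    # Unallocated detections seed new tracks; skipped tracks are marked frame skipped/dropped
    return matches, sorted(unallocated), sorted(skipped)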

The processing of the dual-channel binarized video yielded a set of relatively complete 3D tracks over a 5 minute period. The final result was visualised with a colour-coded interface using VTK (The Visualization Toolkit), shown in Figure 5.17. The coordinates of each extracted bee flight track, with time stamps, were stored as .xml files. The velocities, accelerations and decelerations of each track at every instant were calculated from the coordinates.


Figure 5.17 (A) Stacked computed dots of target locations in a 24-second-long video clip. (B) The colour-coded bee flight tracks generated, which are also the final result of the software.

5.12 Quantified system evaluation

In order to evaluate the performance of the software, different metrics were defined to reflect its quality characteristics. Let TP stand for true positives, the number of objects correctly labelled as bees and assigned to their own tracks; TN stand for true negatives, the number of objects correctly labelled as background; FP stand for false positives, the number of non-bee objects labelled as bees or of bees assigned incorrectly; and FN stand for false negatives, the number of bees incorrectly labelled. The evaluation metrics are then defined as:

Recall\ (Re) = \frac{TP}{TP + FN} \quad (5.22)

Precision\ (Pr) = \frac{TP}{TP + FP} \quad (5.23)

The raw video was processed with a range of background subtraction thresholds/sensitivities (9, 12, 16, 25), morphological operation kernel sizes (3, 5, 7), probabilistic filtering thresholds (3 - 10 / 255) and other filtering parameters, to generate the binarized video. Two 5 s duration image sequences (300 frames each) were randomly selected from each minute of the entire video and were carefully examined frame by frame using the different filtering parameter sets. The examination of bee locations from the raw videos was impossible for human eyes, but from the binarized video it was much easier; bees could be distinguished from other moving objects by sweeping back and forth through adjacent binarized frames. The examination focused on the decrease of TP and FP, and on unusual patterns in the mask images (such as abrupt changes in direction, or long gaps between adjacent measurement points). The following list gives the settings of the main parameters used for the optimal performance, with a high Pr value and a relatively lower Re value, since it was contradictory to maximise both and Pr is usually considered more significant in the study of recovering insect flight tracks:
• maximum accepted correspondence in epipolar line searching (i.e. the epipolar constraint) - 0.01
• length of referred historical frames in background subtraction - 400, threshold/sensitivity - 25
• morphological operation kernel shape - ellipse, kernel size - 5×5
• Gaussian filter kernel size - 5×5
• probabilistic filtering threshold - 4/255.
Over the examined image sequences, the reduction of TP was lower than 12% on average, i.e. Re was higher than 0.88 within the effective observation volume. The proportion of frames in which FP was non-zero was around 8%, yielding a Pr value of over 0.9. After Kalman filtering, location outliers were further excluded. The maximum accepted number of skipped frames was set to 6, which terminated tracks that could no longer be detected for 6 sequential frames. Additionally, only tracks with more than 50 measurement points and fewer than 10 missing frames in total were accepted for analysis. These non-linear operations increased Pr further but reduced Re.

The following list defines the main situations in which the bees cannot be properly detected, or the 3D pairing fails to work:

1. The target leaves the observation volume of the system.


2. The target inside the volume cannot be observed by one or both cameras due to occlusion.

3. The target shows very little difference in pixel intensity against the background.

4. The target's motion slows and no longer triggers the background subtractor.

Most of the above situations could be addressed with better equipment, e.g. cameras that support higher resolution, higher recording frame rates, or greater pixel bit depth (10-bit or higher).

5.13 Conclusion

This chapter has described the entire workflow of the software developed, including the principles that each module follows and the result of each processing phase. The framework was built gradually, with each module carefully designed and tuned based on the success and performance of the previous one. The software worked closely with the hardware, and the final results confirmed that the system was capable of handling large amounts of data accurately.


6 Chapter 6 – Results and data analysis

As described in the last chapter, the final output after the entire processing by the software were the 3D historical coordinates of all the detected flying honeybees within the time period when the videos were recorded, together with their 3D tracks with a certain level of estimation and correction applied. The two types of data were stored as 3D vectors in xml files – extensible markup language files that retain the format of matrices and data types. The 3D matrices of measurement points and generated tracks were saved in the following formats,

[\mathbf{Dot}] = \left[ \begin{bmatrix} \mathbf{P}_{11} \\ \mathbf{P}_{12} \\ \vdots \\ \mathbf{P}_{1M_1} \end{bmatrix}, \begin{bmatrix} \mathbf{P}_{21} \\ \mathbf{P}_{22} \\ \vdots \\ \mathbf{P}_{2M_2} \end{bmatrix}, \cdots, \begin{bmatrix} \mathbf{P}_{F1} \\ \mathbf{P}_{F2} \\ \vdots \\ \mathbf{P}_{FM_F} \end{bmatrix} \right] \quad (6.1)

[\mathbf{Track}] = \left[ \begin{bmatrix} \mathbf{Q}_{11} \\ \mathbf{Q}_{12} \\ \vdots \\ \mathbf{Q}_{1L_1} \end{bmatrix}, \begin{bmatrix} \mathbf{Q}_{21} \\ \mathbf{Q}_{22} \\ \vdots \\ \mathbf{Q}_{2L_2} \end{bmatrix}, \cdots, \begin{bmatrix} \mathbf{Q}_{N1} \\ \mathbf{Q}_{N2} \\ \vdots \\ \mathbf{Q}_{NL_N} \end{bmatrix} \right] \quad (6.2)

where P_{FM_F} is the 3D coordinate of the M_F-th measurement point in the F-th frame, and Q_{NL_N} is the 3D coordinate of the L_N-th point of the N-th track.

6.1 Initial cleaning and reduction of data

During each field trial, dual-channel videos with a length of 15 minutes were recorded, which reduced to approximately 10 minutes after synchronisation and cropping. The resulting video (strictly speaking, the one discussed in the following sections was 9 minutes 30 seconds long) was then divided into several 1-minute video clips for the convenience of display and processing. Figure 6.1 presents the 3D projection view and top view of the

extracted 3D flight tracks in the space across the entire video, giving an overall view of the amount of data. The beehive (yellow cube in Figure 6.1) is located roughly at the western end of the scene, with its entrance facing east. The images show the activity of the honeybees, most of which are flying in front of the hive. Further analysis and findings are discussed in the following sections.


Figure 6.1 3D projection view (i) and the top view (ii) of the reconstructed flight tracks in a 9.5 minute video. Yellow cube represents the location of the beehive.

Each 1-minute video was handled following the same approach, and the useful data were then extracted for further analysis. First, any track consisting of fewer than 20 measurement points was deleted, on the basis that tracks must be long enough to adequately reflect the real state and flying patterns of the target. After this thresholding of track length, 605 tracks remained. Close to the entrance of the beehive (within 150 mm of the entrance), there was a very high density of bees, usually flying slowly. Slower motion created weaker contrast for background subtraction, and the multiple appearances of bees in a small space made the extraction of each individual target even more challenging. Although it was still possible to capture flight tracks within that volume, the accuracy of image extraction within such a small space could not be guaranteed, and there was no other evidence to support the data; this part of the space was therefore given less attention. However, beyond 150 mm to the east of the beehive entrance, most of the bee tracks could be accurately captured and recovered. It is also noticeable that although some of the tracks were relatively short, they were still considered valid data: after the thresholding of track length, visually short tracks merely implied low flight speed, while their duration was still sufficient to show the behaviour of the bee.

Based on the pre-set volume of valid detection, which extended from 150 mm in front of the beehive outwards, each set of data was further divided by another threshold, set at 300 mm from the entrance. This threshold acted as an 'outside gate' (shown in Figure 6.2): bees were classified as either flying entirely within the gate or crossing the gate during their flights, whether entering or exiting. The purpose was to divide the data according to the allocated task of the worker bees and to observe the groups separately. For example, short distance tasks include defending and patrolling, while long distance tasks include scouting and foraging (Robinson, 1992).

Figure 6.2 The virtual ‘outside gate’ in front of the beehive (top view)

The analysis of the data focused on two aspects. The first was the analysis of the entire track at the individual level, including the track of a bee flying from the boundary of the observation volume to the beehive entrance, or a bee making a return flight within the neighbourhood of the beehive; this was not affected by the thresholding of the ‘outside gate’. The other aspect focused on the overall activity of the worker bee swarm and included only the bees flying across the ‘outside gate’ carrying out long distance tasks. The activity level of the swarm within a specific time period, statistics of the flight data and how these results reflected the hidden attributes of the honeybee colony were analysed.


6.2 Analysis of individual flights


Figure 6.3 3D flight track (track A) of an individual bee, colour coded with the magnitude of instantaneous velocity, shown in (i) the 3D projection view, (ii) the side view, (iii) the front view and (iv) the top view. (unit: mm)



Figure 6.4 3D flight track (track B) of another individual bee, colour coded with the magnitude of instantaneous velocity, shown in (i) the 3D projection view, (ii) the side view, (iii) the front view and (iv) the top view. (unit: mm)


Figures 6.3 and 6.4 show two randomly selected 3D flight tracks (tracks A and B) from the 9.5 min video, in the 3D projection, side, front and top views respectively. The lengths (numbers of measurement points) of these two tracks are 85 (duration: 1.42 s) and 53 (duration: 0.88 s) respectively. The black arrows indicate the direction of flight and each track is colour coded according to instantaneous velocity: slow track segments are depicted in blue, fast ones in red. The movement ranges are given by the limits of each axis. These two sets of figures provide very detailed views of the 3D flight tracks from all perspectives, showing the turns that the bees took at every moment, the curvature of each displacement segment and the speed. Track A is possibly from a patrolling bee that investigated the surrounding area of the beehive and returned to the hive at the end, while track B is from a returning bee flying along a spiral path. Figure 6.5 shows the top view of the field trial site and the relative positions of the cameras, the beehive and these two trajectories (blue - track A, green - track B). The entrance to the beehive faced east and the x, y, z axes point north, west and vertically up respectively.



Figure 6.5 Top view of the relative locations of the two tracks shown in Figure 6.3 and Figure 6.4, the beehive, and the cameras.

The magnitude of the velocity at every instant is given by

v = \frac{|\mathbf{P}_{cur} - \mathbf{P}_{pre}|}{T} \quad (6.3)

where P_cur and P_pre are the 3D coordinates of the current and the previous measurement points of the bee location and T is the time interval determined by the frame rate (i.e. 1/60 s). The magnitudes of the velocities of the two tracks against time are shown in Figure 6.6 and Figure 6.7. The horizontal axis represents the duration of the track; since the video was recorded at 60 frames per second, the step size along the horizontal axis is 1/60 s. In this 3D coordinate system, the location of the beehive was (0, 1500, 0) mm. The starting and ending locations of track A were (233, 1394, 97) and (-200, 1090, 74) respectively. The farthest location from the beehive along the entire track was the 33rd measurement point, (-497, 231, 470), whose separation from the entrance of the beehive was 1441.43 mm. It can be seen from the 3D track that the bee first left the entrance, made a turn at approximately 1441.43 mm from the hive and then returned. According to Figure 6.6, the speed of the bee was approximately 2.7 m s-1 when first detected. It then decelerated to approximately 1 m s-1 when it made the first turn. One second after departure from the entrance, the bee made a final turn, gradually accelerated to approximately 2 m s-1 and exited the field of view of the cameras.

Figure 6.6 The magnitude of velocity of track A vs. time.

Similarly, track B started from location (439, -978, 360) and terminated at (-538, 1166, -232). The farthest location of the track from the beehive entrance was the ninth measurement point, (28, -1106, 294), being approximately 2623 mm from the hive. This flight track started from the boundary of the observation area opposite to the hive side and this bee gradually approached the beehive along a spiral curve flight path. It started with a flight speed of approximately 2 m s-1, accelerated and decelerated several times in a short time, then stabilised at around 2 m s-1 at the 22nd location. The deceleration probably means that the bee was navigating and determining the precise location of the beehive entrance. It travelled at a nearly constant speed for the next 15 frames, then gradually accelerated to around 2.8 m s-1, indicating that the entrance was found and the bee started to quickly decelerate close to the entrance. Then the system stopped tracking due to the insufficient motion generated by the bee after landing.


Figure 6.7 The magnitude of velocity of track B vs. time.

Similar analysis was conducted on the remaining 603 tracks; it was found that the top speed of flying bees around the hive rarely exceeded 5 m s-1, and approximately 91% of bees flew at speeds between 1 m s-1 and 5 m s-1. The two tracks shown above clearly fall within this range.

Figure 6.8 shows the curve of the x, y, z components of the instantaneous velocity vector against time of track A. They are given by

(v_x, v_y, v_z) = \frac{\mathbf{P}_{cur} - \mathbf{P}_{pre}}{T} \quad (6.4)

The sign of each component indicates the moving direction in the real world (e.g. North, West or upward) at any given instant.


Figure 6.8 The x (i), y (ii) and z (iii) component of the instantaneous velocity vector of track A vs. time.

Measurement error cannot be neglected when it comes to the calculation of flight accelerations. Figure 6.10 (i) shows the relationship between the magnitude of the instantaneous acceleration/deceleration and time; the acceleration obviously fluctuates greatly. Considering the maximum acceleration that a honeybee can achieve or sustain during flight, such fluctuating values seem unrealistic.


Figure 6.9 Diagram of the possible situations when large measurement error occurs

Figure 6.9 illustrates the situations in which the measured bee location is extracted with a large deviation. A bee may consist of several contours after background subtraction. As described in Chapter 5, morphological operations were applied to merge the neighbouring contours of the same object and avoid multiple detections of it. This operation has a limitation - the estimated centre of mass can be shifted when the wings or other body parts dominate the detection, resulting in errors in the detected bee location.

The average body length of a Western honeybee is around 15 mm. Ideally, the location of the bee should be stored as the coordinates of roughly the middle point of the body, i.e. 7.5 mm from either end. In the worst case, however, taking the detection of the head or abdomen of a bee as its location results in the maximum possible error 푒푚푎푥 = 7.5 푚푚. The maximum errors in velocity and acceleration are therefore

e_{v,max} = \frac{e_{max}}{T} = \frac{7.5\ mm}{1/60\ s} = 0.45\ m\ s^{-1} \quad (6.5)

e_{acc,max} = \frac{e_{v,max}}{T} = \frac{0.45\ m\ s^{-1}}{1/60\ s} = 27\ m\ s^{-2} \quad (6.6)

A small error in the displacement measurement therefore results in a large error in the calculated acceleration, because of the very short time interval T between frames; the acceleration estimates become highly unreliable and exaggerated, as shown in Figure 6.10 (i). However, since the mean of the deviation is zero, when the time base is increased and the acceleration/deceleration is calculated across a longer period, for example 10 measurement points, the errors tend to cancel and some of the uncertainty is removed, as shown in Figure 6.10 (ii).
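A sketch of this smoothing step is given below (Python with NumPy); the window length of 10 samples follows the text, while the conversion factors assume positions in mm at 60 fps.

import numpy as np

def smoothed_acceleration(track_mm, fps=60, window=10):
    v = np.diff(np.asarray(track_mm), axis=0) * fps / 1000.0   # velocity in m/s (Equations 6.3-6.4)
    a = np.diff(v, axis=0) * fps                               # raw acceleration in m/s^2
    kernel = np.ones(window) / window
    # Moving average over 10 adjacent samples suppresses the +/-27 m/s^2 localisation error
    return np.column_stack([np.convolve(a[:, i], kernel, mode="valid") for i in range(3)])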

Figure 6.10 (i) The magnitude of acceleration of track A vs. time (with large measurement error deviation). (ii) Averaged magnitude of acceleration (across 10 adjacent measurement points) of track A vs. time. (ii - x, y, z) The x, y, z components of the averaged acceleration vector of track A vs. time.

Combining the acceleration graphs with the magnitude of velocity of track A, it was found that at approximately 0.35 s from the start of the track, although the velocity appears relatively stable, the acceleration reached its maximum, at over 40 m s-2. Figure 6.3 shows that this increase in acceleration was due to an abrupt change in flight direction. Additionally, as shown in Figure 6.10 (ii) - x, this change of flying direction occurred mainly along the x axis.

Unlike track A, the acceleration curve of track B shows more stability in Figure 6.11.


Figure 6.11 Averaged magnitude of acceleration of track B vs. time.

The results in Figure 6.7 and Figure 6.11 also reveal the effect on acceleration of abrupt turns under otherwise roughly constant linear velocity. The acceleration of the second bee stayed mainly in the range 10 - 20 m s-2, reflecting the rather normal state of a flying bee, while track A possibly shows the extreme case of a sharp turn, resulting in a high acceleration over a short instant.

6.3 Error analysis

The following list describes the main sources of measurement errors during the data acquisition and image processing:

1. Resolution of the image sensor: with the current resolution setting (2704 × 1520), the length of one pixel on the image sensor represents around 0.48 mm on a plane 3 m away from the camera and around 1.61 mm at 10 m. The reliability of 3D measurement points beyond 10 m from either camera could not be guaranteed, therefore the boundary of the trackable volume was set at 10 m and measurement points generated beyond this range were removed. This resolution is acceptable considering that the average body length of a Western honeybee is around 15 mm.

2. Bias in the estimation of the object centre of mass during image processing: as discussed previously, taking the detection of the head or the tip of the wing of the bee as its location results in a maximum possible error of 7.5 mm.


3. Image distortion from imperfectly manufactured camera lenses: both cameras were calibrated with 50 chessboard images to minimise the effect of lens distortion. Although camera calibration cannot remove the distortion completely, the operation suppressed this error to the minimum range possible. The remaining error was also further validated during the 3D pairing stage and controlled by the epipolar constraint.

4. Imperfect synchronisation of the cameras: the maximum temporal offset between the cameras is 1/120 s. Considering the highest flight speed of a honeybee (~7.25 m/s) (Wenner, 1963), such an offset would result in an error of around 60 mm. However, this error can be controlled through the epipolar constraint during 3D pairing; the relatively strict threshold used in the system (0.01) was able to detect large synchronisation offsets and only accepted video pairs with an error smaller than approximately twice the length of a bee (30 mm). This type of error could obviously be reduced using a higher frame rate or wired synchronisation, which would however increase the cost and setup difficulty of the system and reduce its portability.

5. Imperfect measurement of the angle between the cameras: as discussed in Chapter 4, the preferred angle between the optical axes of the cameras was 90°, to minimise the error in the depth information. The alignment of the cameras to form a 90° angle was achieved using the calibration software shown in Figure 4.13 (B), and the rotation matrix was then optimised using 10 feature points in the scene (four of them can be seen in Figure 5.14 (B)). The average alignment error over 8 trials was ±0.23° (validated through stereo calibration), which was within the epipolar constraint.

To summarise, the measurement errors of this system were controlled first by calibrating the cameras separately, and then by the epipolar constraint during 3D pairing, which limits the accumulation of error from the 2D image processing. As described in Section 5.12, the reliability of the results was quantified and validated carefully, frame by frame, by a human observer, since the system already outperformed human eyes in real-time processing. The performance of the system could easily be improved using better imaging equipment and processing platforms.


6.4 Analysis of the entire cluster

The visual tracking system developed in this research was not only able to detect and extract the flight tracks of individual honeybees in flight, but also to acquire data that can be used to analyse the off-hive activities of bee swarms. This section presents processed results from the same video data considered in the previous section.

6.4.1 Analysis of flight tendency

As mentioned previously, flight tracks that lay entirely within the 'outside gate' were removed (these represented a minority of the data). The remaining tracks were then categorised into two groups based on whether the bee was leaving or approaching the hive, i.e. exiting and entering tracks. This was done by calculating and comparing the distances from the starting point and the ending point of a track to the location of the beehive entrance (Figure 6.12). If the ending point is farther from the entrance than the starting point, the track is labelled as an exiting track, otherwise as an entering track. It is worth mentioning that although these two groups of tracks were named 'exiting' and 'entering' tracks, the instant of a bee entering or exiting through the gap of the entrance was never captured; the names only indicate the tendency of the bees' moving direction and can be understood as 'flying away from' and 'approaching' the beehive. Therefore, any bee merely flying past the hive, neither entering nor exiting through the entrance, was also categorised by this criterion.
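This classification rule amounts to a one-line comparison, sketched below; the entrance coordinates follow the coordinate system defined in Section 6.2 and the function name is hypothetical.

import numpy as np

def classify_track(track_mm, entrance=(0.0, 1500.0, 0.0)):
    entrance = np.asarray(entrance)
    start = np.linalg.norm(np.asarray(track_mm[0]) - entrance)
    end = np.linalg.norm(np.asarray(track_mm[-1]) - entrance)
    # Ending point farther from the entrance than the starting point -> exiting track
    return "exiting" if end > start else "entering"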



Figure 6.12 Demonstration of two flight track categories.

The following statistics were then calculated for the two groups, and the results are shown in Table 6.1. The total number of extracted tracks:

N = N_{enter} + N_{exit} \quad (6.7)

the accumulated magnitude of the velocities over all historical locations of the entering and exiting flights:

I_{enter} = \sum_{n=1}^{N_{enter}} \sum_{l=1}^{L_{enter,n}} v_{enter}(n, l) \quad (6.8)

I_{exit} = \sum_{n=1}^{N_{exit}} \sum_{l=1}^{L_{exit,n}} v_{exit}(n, l) \quad (6.9)

the accumulated velocity vectors over all historical locations of the entering and exiting flights:

\mathbf{I}_{enter} = \sum_{n=1}^{N_{enter}} \sum_{l=1}^{L_{enter,n}} \mathbf{v}_{enter}(n, l) \quad (6.10)

\mathbf{I}_{exit} = \sum_{n=1}^{N_{exit}} \sum_{l=1}^{L_{exit,n}} \mathbf{v}_{exit}(n, l) \quad (6.11)

the normalised orientation unit vectors of the entering and exiting flights:

\mathbf{O}_{enter} = \frac{\mathbf{I}_{enter}}{|\mathbf{I}_{enter}|} \quad (6.12)

\mathbf{O}_{exit} = \frac{\mathbf{I}_{exit}}{|\mathbf{I}_{exit}|} \quad (6.13)

and the ratio between the average velocities of the exiting and entering flights:

r_{exit-to-enter} = \frac{I_{exit}/N_{exit}}{I_{enter}/N_{enter}} \quad (6.14)

where:

N_{enter}, N_{exit}: the number of tracks of entering and exiting bees respectively, in a 1 min long video,

L_{enter,n}, L_{exit,n}: the length of track n, in terms of the number of historical locations,

v(n, l): the magnitude of the instantaneous velocity at the l-th location of track n,

\mathbf{v}(n, l): the instantaneous velocity vector at the l-th location of track n.
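For one group of tracks in one 1-minute segment, these quantities can be computed as in the sketch below (positions in mm at 60 fps assumed); the same function is applied separately to the entering and the exiting groups, and the ratio of Equation 6.14 follows from the two results.

import numpy as np

def swarm_statistics(tracks, fps=60):
    speeds, vectors = [], []
    for t in tracks:
        v = np.diff(np.asarray(t), axis=0) * fps / 1000.0    # instantaneous velocity vectors, m/s
        speeds.append(np.linalg.norm(v, axis=1).sum())       # inner sums of Equations 6.8 / 6.9
        vectors.append(v.sum(axis=0))                        # inner sums of Equations 6.10 / 6.11
    I = float(np.sum(speeds))                                # accumulated speed magnitude
    I_vec = np.sum(vectors, axis=0)                          # accumulated velocity vector
    O = I_vec / np.linalg.norm(I_vec)                        # Equations 6.12 / 6.13
    return len(tracks), I, O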

Table 6.1 Flight statistics of the entire bee swarm in the recorded video

Time segment (min) | N_enter | I_enter (m/s) | O_enter | N_exit | I_exit (m/s) | O_exit | I_enter/N_enter | I_exit/N_exit | r_exit-to-enter
0 - 1   | 68  | 3934.3  | (0.5397, 0.8340, -0.1150) | 70  | 4180.58 | (-0.0232, -0.9673, 0.2526) | 57.8584 | 59.7226 | 1.0322
1 - 2   | 28  | 1703.8  | (0.1787, 0.9260, -0.3326) | 40  | 2463.00 | (0.0388, -0.9578, 0.2847)  | 60.8521 | 61.5750 | 1.0119
2 - 3   | 26  | 1425.2  | (0.0247, 0.9150, -0.4027) | 19  | 1253.25 | (0.3233, -0.9369, 0.1326)  | 54.8173 | 65.9605 | 1.2033
3 - 4   | 46  | 2370.7  | (0.1889, 0.9501, -0.2482) | 30  | 1678.23 | (0.4520, -0.8672, 0.2091)  | 51.5385 | 55.9410 | 1.0854
4 - 5   | 32  | 1510.7  | (0.5025, 0.8419, -0.1969) | 38  | 2273.03 | (0.0900, -0.9719, 0.2175)  | 47.2100 | 59.8166 | 1.2670
5 - 6   | 37  | 2230.1  | (0.5638, 0.8254, 0.02850) | 17  | 884.71  | (0.2238, -0.8898, 0.3976)  | 60.2751 | 52.0415 | 0.8634
6 - 7   | 31  | 2025.8  | (0.5717, 0.7968, -0.1958) | 11  | 708.77  | (0.3034, -0.8384, 0.4527)  | 65.3490 | 64.4336 | 0.9860
7 - 8   | 33  | 1728.1  | (0.4377, 0.8817, -0.1763) | 14  | 825.69  | (0.0705, -0.9378, 0.3400)  | 52.3673 | 58.9781 | 1.1262
8 - 9   | 22  | 1105.9  | (0.5989, 0.7651, -0.2365) | 11  | 653.40  | (-0.4448, -0.8808, 0.1624) | 50.2718 | 59.4004 | 1.1816
9 - 9.5 | 17  | 856.45  | (0.4397, 0.7743, -0.4551) | 15  | 850.74  | (0.2501, -0.9478, -0.1977) | 50.3795 | 56.7161 | 1.1258
Sum     | 340 | 18891.5 | -                         | 265 | 15771.0 | -                          | -       | -       | -
Mean    | -   | -       | -                         | -   | -       | -                          | 55.5633 | 59.5147 | 1.0711


The division of duties in a honeybee colony is largely determined by the worker bee's age. Worker bees perform brood care and colony maintenance tasks when they are young (the first 2-3 weeks of their adult life) and venture outside as 'field bees', looking for food, scouting for new colony sites or defending the nest, when they are older (the remaining 1-3 weeks of adult life) (Robinson, Regulation of division of labor in insect societies, 1992) (Robinson, 2002). In this research, the major focus was the activity of workers in the vicinity of the hive entrance. Table 6.1 shows how the number of worker bees leaving the hive to carry out tasks such as foraging and scouting changed over a specific period, giving a general picture of how the activity level of the bee swarm changes at different times of day.

The directions of the accumulated instantaneous velocity of the swarm, O_enter and O_exit, show the directions that most worker bees head towards and return from in different time intervals. The lengths of O_enter and O_exit are short when the bees are performing relatively random flights and become longer when the flights of the entire swarm start to follow a certain regularity, showing a clearer tendency in flight direction. Furthermore, if the lengths of O_enter and O_exit stay relatively large over time while pointing in the same direction, a 'flight tunnel' has been established between the beehive and a location along the direction of O_enter or O_exit, with more bees flying towards or returning from that place, for example a food source. When there is a significant difference between O_enter and O_exit, it is very likely that the bees visited at least two places during their trip, with O_enter and O_exit indicating the directions of the first and the last visited places.

From the fourth minute until the end of the table, the values of O_enter fluctuate within a very small range, which means the bees were all returning from roughly the same direction. A reasonable speculation is that the worker bees had found a profitable food source and were all returning to the hive to report it, or that foraging was already ongoing and the bees were busy carrying pollen and nectar back; the likelihood of each case depends on the duration of such flight behaviour. In addition, some research (Collett & Collett, 2000) has found that bees can estimate the travelled distance and the direction of the beehive through the integration of displacement segments and orientation vectors.


Thus the accumulated outbound and return directions differ by a full 180° and the displacements form a closed loop. The statistics of the departing and returning directions of the worker bees therefore provide supporting evidence for that finding.

Looking at the change of N_enter and N_exit over time, the fact that N_enter outnumbered N_exit over a certain period (from the second minute to the end) suggests that bees were heading back to the hive and sharing information with the worker bees that remained on standby or were doing other tasks. More worker bees then started to forage after watching the pioneers' waggle dances. The subsequent increase in the number of exiting bees possibly indicates an increased number of foragers, and the total increase in exits could possibly serve as an indicator of the quality of the food source.

The statistics in the table also suggest that, if the observation period were extended, it should be possible to determine the onset and offset of flight activity in aggregate. By comparing this period with the corresponding wind speed, temperature, weather, or sunrise and sunset times, the relationship between the activity level of the bees and these ambient factors can be found. The activity level might also be used as a proxy for the health of the colony, showing the proportion of foragers and other field bees among all the bee castes - a caste being a group of bee individuals that perform similar tasks and are of similar age. The statistics of different hives located in different environments could be compared further.

It can be seen from the table that in most cases the average speed of an exiting track, $I_{\text{exit}}/N_{\text{exit}}$, is slightly higher than that of an entering track, $I_{\text{enter}}/N_{\text{enter}}$. It was assumed that, first, the returning bees probably carried a certain amount of food and the increased load reduced flight speed; secondly, a long outbound flight depletes a bee's stamina even when no pollen is carried (Wolf, Schmid-Hempel, Ellington, & Stevenson, 1989); thirdly, a returning bee requires frequent navigation updates and corrections, and is more cautious when making turns or approaching a familiar landmark, so it slows on the way back. Such observations can be combined with the change in the numbers of returning and leaving bees, as discussed above, to determine the likelihood of each assumption.
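
A minimal sketch of this comparison is given below, assuming that each track is stored as an array of 3D positions sampled at a known frame interval, and that $I$ denotes the sum of per-track average speeds while $N$ is the number of tracks in the group; the names and data layout are illustrative, not the thesis software's actual structures.

```python
import numpy as np

FRAME_INTERVAL = 1.0 / 60.0  # assumed sampling period in seconds

def mean_track_speed(positions, dt=FRAME_INTERVAL):
    """Average speed of one track, from frame-to-frame displacements."""
    step_lengths = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    return step_lengths.mean() / dt

def group_average_speed(tracks, dt=FRAME_INTERVAL):
    """Average of the per-track mean speeds for a group of tracks (I/N)."""
    return np.mean([mean_track_speed(p, dt) for p in tracks])

# Hypothetical usage:
# avg_exit = group_average_speed(exiting_tracks)
# avg_enter = group_average_speed(entering_tracks)
```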


The flight data also required monitoring of the wind speed, in order to remove the effect of wind on the measured flight speed of the bees. The data was collected at times when the wind was relatively weak; although the sampling frequency of the anemometer used was very low (1 sample per 5 min), the influence of wind on the bee flights was assumed to be negligible in this research.

Apart from the categorisation based on entering or leaving, analysis based on the curvature of the tracks was also conducted. This was done by calculating the change in flight orientations across the whole track, given by

$$\alpha_i = \cos^{-1}\left(\frac{\mathbf{v}_i \cdot \mathbf{v}_{i+1}}{|\mathbf{v}_i| \cdot |\mathbf{v}_{i+1}|}\right), \qquad i = 1, 2, \ldots, N-1 \tag{6.15}$$

$$\mathbf{A} = [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_{N-1}] \tag{6.16}$$

where $\alpha_i$ is the angle between $\mathbf{v}_i$ and $\mathbf{v}_{i+1}$, the instantaneous velocities at measurement points $i$ and $i+1$ respectively, $N$ is the length of the track and $\mathbf{A}$ is the vector containing the values of $\alpha$ across the entire track.

If more than 90% of the elements in $\mathbf{A}$ are smaller than 5°, the track is marked as a 'straight track'; otherwise it is a 'curved track'. There are altogether 204 'straight tracks' out of 605 tracks, shown in Figure 6.13.
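
The classification can be expressed compactly in code. The following Python sketch evaluates Equations 6.15 and 6.16 for one track and applies the 90% / 5° rule; the thresholds follow the text, but the function and variable names are illustrative rather than the thesis software's actual implementation.

```python
import numpy as np

def turning_angles(positions):
    """Equations 6.15-6.16: angles (degrees) between consecutive
    instantaneous velocity vectors along one track."""
    v = np.diff(positions, axis=0)                    # instantaneous velocities
    v1, v2 = v[:-1], v[1:]
    cos_a = np.einsum('ij,ij->i', v1, v2) / (
        np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1))
    cos_a = np.clip(cos_a, -1.0, 1.0)                 # guard against rounding errors
    return np.degrees(np.arccos(cos_a))               # the vector A

def is_straight(positions, angle_threshold=5.0, fraction=0.9):
    """A track is 'straight' if more than 90% of the elements of A
    are below 5 degrees; otherwise it is 'curved'."""
    a = turning_angles(positions)
    return np.mean(a < angle_threshold) > fraction
```

Applied to each of the 605 reconstructed tracks, a rule of this form separates the 204 'straight tracks' from the remaining 'curved tracks'.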

Figure 6.13 Low curvature tracks shown in (i) 3D projection view, (ii) side view and (iii) top view. Yellow rectangle represents the location of the hive entrance.


Out of the 204 tracks, only 12 have a negative elevation angle from the entrance, which means the bees rarely flew lower than the bottom of the hive (the hive was positioned on a metal frame at a height of approximately 600 mm above the ground, shown in Figure 5.13 in Chapter 5). The elevation angle ranged from -7.9° to 50.8°, with an average of 19.2°, as shown in Figure 6.13 (ii). When observed from the top (Figure 6.13 (iii)), the angle of the sector covering the tracks ranged from -55.4° to 52.4° (taking north as the positive direction), with an average of -6.5°.

Figure 6.14 (i) Side view and (ii) top view of the tracks with large curvature. Yellow rectangle represents the location of the hive entrance.

Figure 6.14 shows the side and top views of the remaining tracks, i.e. the 'curved tracks'. It is interesting to see that the curved tracks seem to have no correlation with the location of the beehive entrance, while most straight tracks either started or terminated at the entrance. This may seem obvious, but it also shows that honeybees performing long-distance duties, such as foraging, tend to make fewer turns for ease of navigation and, possibly, energy saving.


6.4.2 Density distribution of worker bees at hive entrance

Figure 6.15 3D Distribution of all detection points in the 9.5 min recording. Red cube represents the location of the beehive entrance.

Figure 6.15 shows all the detected locations of honeybees that were successfully allocated to a flight track in the 9.5 min recording. Isolated points and non-honeybee tracks have been removed. Each measurement point is represented by a blue point and the red cube represents the beehive entrance (0, 1500, 0 mm). The total number of measurement points is 12,596. Owing to the limitations of visualising such data in a 2D plane, the points appear rather chaotic, although several tracks can be seen emerging from the cloud of points.

Since the size of a measurement point (1 mm³) is much smaller than the dimensions of the observation space, each point was expanded to cover a larger area for the generation of heatmaps; this blurs the density map but reveals hitherto hidden patterns of behaviour and distribution.
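
One simple way to produce such a density map is sketched below, under the assumption that the detections are binned onto a regular grid and each point is then spread by a Gaussian kernel; the bin size and kernel width are illustrative choices, not the values used in the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points_xy, extent, bin_mm=20, sigma_bins=2):
    """Top-view density map: bin detection points onto a 2D grid, then
    blur the grid so that each 1 mm^3 point covers a larger area.

    points_xy : (M, 2) array of x, y coordinates in mm
    extent    : ((xmin, xmax), (ymin, ymax)) observation area in mm
    """
    (xmin, xmax), (ymin, ymax) = extent
    x_edges = np.arange(xmin, xmax + bin_mm, bin_mm)
    y_edges = np.arange(ymin, ymax + bin_mm, bin_mm)
    hist, _, _ = np.histogram2d(points_xy[:, 0], points_xy[:, 1],
                                bins=[x_edges, y_edges])
    return gaussian_filter(hist, sigma=sigma_bins)   # blurred occupancy counts
```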


Figure 6.16 Density map of flying honeybees around the hive (top view), showing (i) all tracks, (ii) 'straight tracks' and (iii) 'curved tracks'.

Figure 6.16 shows the monochrome heatmaps of the density of honeybees in the observation space. Darker areas were less visited while brighter areas were more populated. Figure 6.16 (i) shows highly populated clusters not only at the entrance (cluster A) but also approximately 1.5 – 2 m away from it (clusters B and C). Clusters B and C are located on either side of the approach to the entrance. These clusters are more obvious in the heatmap containing only the curved tracks (Figure 6.16 (iii)), while the straight flights are relatively evenly distributed across the area (Figure 6.16 (ii)).


Figure 6.17 Density map of flying honeybees around the hive (side view), showing (i) all tracks, (ii) 'straight tracks' and (iii) 'curved tracks'.

Figure 6.17 shows the heatmap of honeybee density from a side view. It can be seen that cluster A (the same cluster as in Figure 6.16 (i)) is lower than cluster D, and cluster D is most likely the superposition of clusters B and C, since they are located at the same height. The connection between clusters A and D is very narrow, which possibly means that bees tended to choose the same path when flying between clusters. Further, the location of cluster D is not on the main route to the hive; instead, it suggests hovering by patrolling bees.

Figure 6.16 (i) and (iii) also show several flight tracks that are more frequently taken – 'popular tracks'. Previous studies of honeybee foraging behaviour have found that when a bee discovers a profitable food source and starts to carry food back to the hive, the foraging target does not change until it is abandoned (Karaboga, 2005). The bee will make several round trips to the same food source and also share the information with other bees through waggle dances, so that a stable food-carrying chain is established. Such a bee is often referred to as an 'employed' bee. The 'popular tracks' shown in Figure 6.16 (i) and (ii) are probably the tracks of employed bees and also indicate the direction of the food source.

There are two reasonable speculations on the function of these sub-clusters within the overall distribution:

1. Studies show that bees rely largely on their eyesight to localise the position of the hive when they are not far from it. Other senses, e.g. the sense of the sun's direction, come into use mainly during long-distance flights such as foraging (Butler, 1949). Although the exact distance from the hive at which a bee switches senses was not established in that study, the bee is more likely to hover in front of the entrance and form the clusters shown above, possibly making several circular flights to navigate before finding its way back. Out of the 204 straight tracks analysed during the 9.5-minute observation, 97 were leaving the entrance and the remaining 107 were returning. The difference was not significant, meaning that many foragers also made their way back without much effort. According to another study, the longer worker bees have been foragers, the better their navigation ability becomes, especially for localising the hive, or even the entrance; experienced foragers take less effort returning, flying into the hive without hesitation (Karaboga, 2005). As an aside, it would be interesting to see how the behaviour or the numbers change if the beehive is rotated while the foragers are in the field, as was done in other research, where it was found that returning foragers hovered around the beehive to relocate the new position of the entrance (Butler, 1949).

2. Apart from foraging, another crucial duty of workers outside the nest is guarding. The proportion of bees performing defensive tasks is much smaller than for foraging – only around 15% of worker bees ever take part in defensive tasks during their lives, and the guarding caste is usually middle-aged (Arechavaleta-Velasco, Hunt, & Emore, 2003). The defensive behaviours involve patrolling around the hive, mainly in front of the entrance; if intruders are identified, the role switches from patrol to defence and attack (Breed, Guzmán-Novoa, & Hunt, 2004). Therefore, the clusters identified in the video analysis may also lie within the patrol range of these guard bees, which routinely examine returning workers and identify intruders (Stabentheiner, Kovac, & Schmaranzer, 2002). Such scrutiny requires back and forth flights of the examiners, leading to the formation of clusters as above.

6.5 Conclusion

The 3D bee tracking system was capable of extracting the 3D coordinates of all the detected honeybees. Further analysis based on such measurements not only yields flight statistics such as velocity and acceleration, but also provides a clearer view of the inflight behaviour of a bee at any point along its track. Although the flight of one individual may appear random at first sight, the analysis of the entire swarm provides strong supporting evidence for conclusions and interpretations from other studies, and the data volume is much larger in this research. The software was capable of uniquely extracting the flight trajectories of individual bees, obviating the requirement for physical marking using coloured dots or tracking tags, which inevitably influences behaviour. A wide range of behaviours within the detection range was recorded and analysed with unprecedented detail and precision in comparison to other studies. Continued use of the system may in time lead to the discovery of new phenomena.


7 Chapter 7 - Conclusion

7.1 Overview

Although the study of bee behaviour has long been a popular topic, the findings and actual understanding are relatively limited when it comes to capturing the true nature of bees' activities outside the colony, especially their flight behaviour, without introducing too much disturbance. The technical barriers to monitoring very small targets such as insects still remain. Detailed studies of inflight behaviour and the precise determination of population health are sparse; much of the available information is very general or vague, and confined to bee swarms or community-level activity. Therefore, the observation and analysis of specific bee behaviours is urgently needed.

Prior to this research, the instrumentation and monitoring systems used for other studies were neither appropriate nor optimised for the study of bee behaviour, especially flight behaviour. For example, vertical-looking radar (VLR) is mainly used to monitor the migration of insects, but bees do not migrate in cold weather; instead they form winter clusters to keep warm, hence VLR is not an option. For harmonic radar, which is used for tracking individual insects at low altitude, the weight and length of the transponders are a significant burden on the flying bee. This influences their natural flight behaviour, making it difficult to derive accurate or meaningful statistics or interpretations. Gluing a transponder to each subject also limits the technique to a small number of targets. Observations conducted in laboratory environments lack the integrated impact of multiple ambient factors, and the constrained nature of the space diminishes the accuracy of any comparisons that might be made with bee activity in the field.

Despite recent advances in real-time digital video image processing, its application to the analysis of insect behaviour has not been widely adopted. By exploiting the evolution of image processing techniques, the processing power of recent hardware and the capacity of storage media, an automated 3D visual tracking and analysis system for the study of insect flight was designed and constructed in this research. The system was designed to work in the field and did not interfere with normal bee flight or activity. The output of the system was accurate and reliable; it therefore proved the feasibility of applying classical image processing techniques to the study of insect motion.

7.2 Summary of system development and findings

The fundamental idea of this imaging system was inspired by previous studies on both active and passive imaging of moving insects. The initial aim was to design and build a fully automated imaging system that did not interfere with the flight of honeybees during foraging or their activities around the hive. The final constructed system consists of a dual-channel video recording system and software developed for fully automated processing of the videos and data analysis. Operating the system requires very little labour and cost, and the results have a high potential for further applications.

Two sets of experiments were conducted as the research progressed, the first in May and June 2018 and the second in June and July 2019. The experiments in 2018 produced several videos of honeybee flight that showed not only the flight activities of all the observed bees around the hive, but also all the types of background noise and the achievable field of view of the camera. These recordings were then used to determine the appropriate image processing techniques for removing each type of noise and the optimal separation between the camera and the target, which required balancing the extent of the field of view against the size of each honeybee in the view plane (i.e. the accuracy of measurement). The second set of experiments expanded the observation of flying honeybees from 2D to 3D with the introduction of a second camera. The optical axes of the cameras formed a 90° angle so that the actual 3D locations of the bees could be calculated from the 2D projections on each view plane, and the measurement errors from 3D projection were compensated across the two viewpoints. The equipment was very easy to install and calibrate. The cameras use replaceable lithium batteries and mass storage media to guarantee long-term recording in the field, with no requirement for any external power supply. They were highly reliable and able to withstand extreme weather such as strong sunlight, rain, or low temperature.


Software development progressed in step with each change in imaging equipment, and the approach was carefully adjusted based on the results of each experiment. The automated processing of the raw video data and the extraction of multiple honeybee flight paths followed this procedure (a minimal sketch of the pipeline is given after the list):

• Synchronising the recordings from both cameras.
• Removing static background or objects with very slow motion through background subtraction.
• Filtering out noise from grass, leaves and other dynamic sources based on their motion patterns and contour shapes.
• Restoring 3D coordinates through the mapping of the two viewpoints and 3D triangulation.
• Estimating the 3D flight tracks using Kalman filtering.
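
A minimal sketch of the detection stage of this pipeline is shown below, assuming OpenCV 4.x in Python; the MOG2 background subtractor, the area thresholds and all function names are illustrative stand-ins rather than the exact components used in the thesis software.

```python
import cv2
import numpy as np

# Illustrative detection stage for one camera: background subtraction
# followed by morphological clean-up and contour filtering.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))

def detect_bees(frame, min_area=5, max_area=500):
    """Return the image coordinates of candidate bee blobs in one frame."""
    mask = subtractor.apply(frame)                          # remove static background
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # suppress speckle noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centres = []
    for c in contours:
        area = cv2.contourArea(c)
        if min_area <= area <= max_area:                    # reject foliage-sized blobs
            m = cv2.moments(c)
            centres.append((m['m10'] / m['m00'], m['m01'] / m['m00']))
    return np.array(centres)

# The per-frame detections from the two synchronised cameras would then be
# matched across views, triangulated to 3D, and linked over time with a
# Kalman filter and an assignment step (e.g. the Hungarian algorithm).
```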

The software extracted the 3D coordinates of all the observed honeybees at each point along their flights and generated the flight tracks by associating each new detection with the historical locations of the bees. Using the cameras' maximum supported transmission bandwidth, the processing of the video data was accurate to within 10 millimetres in 3D space and 1/60 s in terms of temporal resolution. This was unprecedented for analysing bee behaviour in outdoor conditions.

The data was analysed at both individual and colony level. A large number of singularities and 3D flight tracks of over 60 path points were extracted in the observation volume. Each track reflected every decision the bee made during its flight within a very short time period, including speed changes and adjustments in orientation. The swarm-level analysis revealed the flight preferences of the worker bee group at different times of the day, including the changes in the number of bees that were either leaving or entering the hive as well as the overall direction of such movement. The density distribution of worker bees around the hive was also analysed in detail. The generated tracks were divided into two groups – 'straight tracks' and 'curved tracks' – based on curvature analysis. The density maps of both groups, viewed from multiple perspectives, revealed certain tendencies and preferences of bees in the vicinity of the hive, refuting the notion that the localised flight behaviour was disorganised or chaotic.

The findings from the analysis can be summarised as:

• The returning workers have a smaller average velocity than the departing ones, possibly due to the load they carry and stamina depletion during foraging.
• Compared with those departing from the hive, returning workers normally require only slightly more navigation to locate the entrance when they are near the hive; the margin between the number of bees leaving and the number entering the hive directly was very small. This ability can possibly be trained as the worker becomes more experienced.
• The workers hovering in front of the hive entrance tend to form small clusters that are slightly higher than the entrance. Additionally, bees flying from one cluster to another often move along a more popular track, forming what has been named a 'flight tunnel'.

The results confirm the very considerable potential of the system for a wide range of applications involving other insect species; it extracts the flight statistics of the honeybee with impressive precision and analyses the swarm with unprecedented detail. As such, it is evident that such a tool would be a valuable addition to the field of entomological research.

7.3 Further work

7.3.1 Improved placement of the cameras and recording strategy

During the experiments in which two cameras were used, the common field of view was not optimised (Figure 7.1 A). The field of view extended from the bottom left (southeast) to the top right corner (northwest), covering an area of approximately 20 m² horizontally in front of the beehive. This space covered only a small proportion of the area that the beehive entrance faced. From the results shown in the previous chapter, it is known that most of the bee activity was concentrated in front of the beehive and extended further in a fan-shaped area; there were rarely flight tracks to the back of the beehive. Therefore, the main reason behind the termination of the extracted tracks was that the bees flew beyond the detection range, rather than the bees being too far away and too small to be detected.

Figure 7.1 (A) Positioning of the cameras during the field experiment. (B) A possible optimised positioning of the cameras. (Gray areas are the common field of view; the beehive and the left and right cameras are marked in each panel.)

Figure 7.1 A also shows that part of the view in each camera was blocked by the beehive due to its position in the view planes, which resulted in a loss of information. Therefore, with the field of view angles unchanged, adjusting the positioning of the cameras and minimising blockage by other objects as much as possible (Figure 7.1 B) would considerably extend the length of the recorded tracks, so that the trajectory of an individual bee could be fully observed and quantified.

Further, the observation of bee behaviour should not be limited to the immediate vicinity of the beehive. Bee activity around the hive is undoubtedly frequent and intensive; in this work, flying bees were detected in almost every frame of the video. This intensive activity is advantageous since it allows useful data to be collected over a short time, yielding a higher data acquisition efficiency. However, such observations should also be carried out in other locations that bees visit, including foraging sites such as gardens, flowering crops and orchards, and the landscapes over which bees navigate on their return flights. The type of activity and task bees carry out differs with location, and their flight patterns and behaviours change accordingly. Hence, including data from different habitats would enrich the statistics, yielding interpretations that are more comprehensive and informative. This work has already confirmed that the system is robust to noise generated by foliage or flowers provided there is no large obstruction in the field of view, and it is thus capable of processing flight data recorded in other locations.

Further, the recording strategy can be optimised according to the type of flight statistics required, by adjusting the framerate or the length of each video clip. For example, if the main interest is how the number of bees entering or exiting the hive changes over the day, the framerate can be reduced and the analysis of the bees' inflight posture or instantaneous velocity simplified. The duration of each recording can also be reduced while the frequency of recording is increased, so that the observation spans a longer period.


7.3.2 Upgrading the imaging equipment

One of the most critical factors affecting the recognition accuracy of the target is image quality, which is not completely determined by resolution. The lens used on a phone camera, or even on the action camera used in this research, is inferior to a professional lens on a typical DSLR in terms of image sharpness and adaptability to illumination change. A sharper image, which is largely determined by the aperture (i.e. the physical size of the lens), captures objects with clearer edges, benefiting the visibility of the target and the accuracy of the generated flight tracks. On the other hand, a higher resolution means more pixels are contained in a single frame. If the distance between the target and the image sensor remains unchanged, a bee occupies more pixels, and its calculated centre of mass in the view plane is thus more accurate. Recording at a higher resolution also expands the effective detection range, making it much easier to cover the desired observation space.

However, the computational burden grows rapidly with increasing resolution, and the time required to process such highly resolved data may not be acceptable in practical use. Therefore, aiming for a recording with higher resolution is not necessarily desirable and depends on context.

Another possible improvement concerns the stability of the camera mounting system. In this research the two cameras were not physically connected, so before each experiment their relative position required calibration and adjustment. This can be very inconvenient if the equipment is frequently moved and set up at different locations. In Chapter 4, the merits and demerits of the orthogonal and parallel configurations of the cameras were discussed, as well as the estimated range of measurement errors. In fact, the angle formed by the cameras can take any value between 0° and 90°. Modified software would accept this angle as a function argument and return the 3D coordinates from full 3D triangulation. If the cameras were moved slightly closer together, with a decreased angle of perhaps 45°, a rigid support structure could maintain the relative position between the cameras, and the requirement for repeated calibration would therefore be obviated.
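
A sketch of such a generalisation is given below, using OpenCV's triangulation routine in Python; the camera intrinsics, the 45° example angle, the baseline value and the function names are assumptions for illustration, not the configuration actually used in the experiments.

```python
import cv2
import numpy as np

def triangulate(K1, K2, angle_deg, baseline_mm, pts1, pts2):
    """Triangulate 3D points from two cameras whose optical axes form an
    arbitrary angle. Camera 1 defines the world frame; camera 2 has pose
    [R | t], with R a rotation about the vertical (y) axis by angle_deg."""
    # Projection matrix of camera 1: identity pose.
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])

    # Projection matrix of camera 2: P2 = K2 [R | t].
    a = np.radians(angle_deg)
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0,       1.0, 0.0      ],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([[baseline_mm], [0.0], [0.0]])
    P2 = K2 @ np.hstack([R, t])

    # pts1, pts2: (N, 2) arrays of matched image points from each view.
    X_h = cv2.triangulatePoints(P1, P2, pts1.T.astype(float), pts2.T.astype(float))
    return (X_h[:3] / X_h[3]).T     # homogeneous -> Euclidean 3D coordinates

# Hypothetical usage with a 45 degree camera angle:
# points_3d = triangulate(K_left, K_right, 45.0, 1000.0,
#                         detections_left, detections_right)
```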


7.3.3 Collaboration with honeybee experts and entomologists

The visual tracking system in this project was designed and built entirely from an engineer's perspective, after a detailed study of the habits and flight characteristics of bees. Although the system attempted to extract many quantified features describing flight behaviour, the attention paid to each metric was evenly distributed, and the scientific correctness and rigour of the methodology from an entomological perspective still require further consideration and self-examination. This means the interpretation of the flight data, such as historical locations, velocities and orientations, might be inadequate or insufficient. Therefore, collaboration with a bee expert or an entomologist is urgently needed. Such collaboration would provide guidance on the design and execution of experiments, the appropriate weighting of the data analysis, and judgements on which parts of the system are practically useful and which need improvement. More specific in-field applications include characterising the departure angles and angle shifts of foragers to determine their foraging routes and estimate the locations of food sources. The monitoring cycle could also be extended to cover the life cycle of worker bees and to analyse the abundance and reproductive activity of the entire colony, which would contribute to the monitoring of diseases and early warning of possible population decline. The system is also capable of distinguishing old and new foragers based on the spiral behaviour, i.e. the need for frequent navigation, on their return flights.

7.3.4 Analysis with other environmental variables

The data collected during each experiment was mainly limited to video recordings; wind speed was merely an indicator of the usability of the video, providing only a simple reference. However, it is known that environmental and meteorological factors have a considerable impact on the flight patterns and decision making of the bees. Future research would consider factors such as wind speed (measured at a high sampling frequency), temperature, humidity, elevation angle of the sun and solar illumination intensity; these would be measured synchronously with the video data of bee flight. This would facilitate more complex and comprehensive analysis, perhaps revealing new flight or behavioural strategies related to such variables. The recording of these factors could be further investigated and may lead to uncharted territory in the study of insect behaviour.

7.4 Conclusion

This PhD research proved the feasibility of applying computer vision techniques to the tracking and analysis of honeybees. A fully automated 3D visual tracking system for Western honeybees was designed and built at low cost. The 3D flight tracks, generated with advanced image processing techniques and high accuracy, were analysed at both individual and swarm level. The instrumentation and software represent an opportunity to continue the research at a more sophisticated level. It is hoped that more subtle and interesting secrets may be revealed about the important role bees play in nature and in human economic activity; perhaps this work may also serve to draw attention to the severe situation currently faced by these essential insects. We also hope that more researchers and bee lovers will take part in the future study and protection of bees. After all, to protect nature is to protect ourselves.


Bibliography

A Dictionary of Biology. (n.d.). Retrieved from Encyclopedia.com: https://www.encyclopedia.com/science/dictionaries-thesauruses-pictures-and-press-releases/animal-behaviour-1

Alem Gebru, S. J. (2017). Multispectral polarimetric modulation spectroscopy for species and sex determination of Malaria disease vectors. CLEO: Applications and Technology, ATh1B-2.

Ali, M. F., & Morgan, E. D. (1990). Chemical communication in insect communities: A guide to insect pheromones with special emphasis on social insects. Biol Rev Camb Philos, 65(3):227–47.

Allsopp, M. H. (2008). Valuing insect pollination services with cost of replacement. PloS one, 3(9), e3128.

Altaf Hussain Sheikh, Moni Thomas, R. (2018). Light Trap and Insect Sampling: an Overview. International Journal of Current Research, 8(11), 40868-40873.

Altshuler, D. L., Dickson, W. B., Vance, J. T., Roberts, S. P., & Dickinson, M. H. (2005). Short-amplitude high-frequency wing strokes determine aerodynamics of honeybee flight. Proceedings of the National Academy of Sciences, 102(50), 18213-18218.

Arechavaleta-Velasco, M. E., Hunt, G. J., & Emore, C. (2003). Quantitative trait loci that influence the expression of guarding and stinging behaviors of individual honey bees. Behavior Genetics, 33(3), 357-364.

Atlas, D., Metcalf, J. I., Richter, J., & Gossard, E. E. (1970). The birth of “CAT” and microscale turbulence. Journal of the Atmospheric Sciences,, 903-913.

Autodesk, I. (2020). Autodesk 3ds Max 2019. Retrieved from Autodesk.co.uk: https://www.autodesk.co.uk/campaigns/3ds-max

Baker, P., Gewecke, M., & Cooter, R. (1981). The natural flight of the migratory locust, Locusta migratoria L. - III. Wing-beat frequency, flight speed and attitude. Journal of Comparative Physiology A, 141(2), 233-237.

Balch, T., Khan, Z., & Veloso, M. (2001). Automatically tracking and analyzing the behavior of live insect colonies. Proceedings of the International Conference on Autonomous Agents, 521-528.

Ballard, D. H. (1983). Rigid body motion from depth and optical flow. Computer Vision, Graphics, and Image Processing, (pp. 95-115).

Bean, B., Mcgavin, R., Chadwick, R., & Warner, B. (1971). Preliminary results of utilizing the high resolution FM radar as a boundary-layer probe. Boundary-Layer Meteorology, 1(1971), 466-473.

Bertholf, L. M. (1925). The moults of the honeybee. Journal of Economic Entomology, 18(2), 380-384.


Boiteau, G., & Colpitts, B. (2001). Electronic tags for the tracking of insects in flight: Effect of weight on flight performance of adult Colorado potato beetles. Entomologia Experimentalis et Applicata, 100(2), 187-193.

Boiteau, G., Meloche, F., Vincent, C., & Leskey, T. (2009). Effectiveness of Glues Used for Harmonic Radar Tag Attachment and Impact on Survival and Behavior of Three Insect Pests. Environmental Entomology, 38(1), 168-175.

Bouguet, J.-Y. (2015, October 14). First calibration example - Corner extraction, calibration, additional tools. Retrieved from Camera Calibration Toolbox for Matlab: http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/example.html

Bradski, G. (2000). The OpenCV Library. Dr. Dobb's Journal of Software Tools.

Breed, M. D., Guzmán-Novoa, E., & Hunt, G. J. (2004). Defensive behavior of honey bees: organization, genetics, and comparisons with other bees. Annual Reviews in Entomology, 49(1), 271-298.

Breeze, T. D. (2011). Pollination services in the UK: How important are honeybees? Agriculture, Ecosystems & Environment, 142(3-4), 137-143.

Butler, C. G. (1949). Bee behaviour. Nature, 163(4134), 120-122.

Cant, E., Smith, A., Reynolds, D., & Osborne, J. (2005). Tracking butterfly flight paths across the landscape with harmonic radar. Proceedings of the Royal Society B: Biological Sciences, 272(1565), 785-790.

Capaldi, E. A., Smith, A. D., Osborne, J. L., Fahrbach, S. E., Farris, S. M., Reynolds, D. R., & Riley, J. R. (2000). Ontogeny of orientation flight in the honeybee revealed by harmonic radar. Nature, 537-540.

Carrington, D. (2012, March 29). Pesticides linked to honeybee decline. Retrieved from The Guardian.

Carrington, D. (2018, April 27). EU agrees total ban on bee-harming pesticides. Retrieved from The Guardian.

Chapman, J. W., Reynolds, D. R., & Smith, A. D. (2003). Vertical-looking radar: a new tool for monitoring high-altitude insect migration. Bioscience, 503-511.

Chapman, J. W., Reynolds, D. R., Smith, A. D., Riley, J. R., Pedgley, D. E., & Woiwod, I. P. (2002). High‐altitude migration of the diamondback moth Plutella xylostella to the UK: a study using radar, aerial netting, and ground trapping. Ecological Entomology, 641-650.

Chapman, J., Reynolds, D., & Wilson, K. (2015). Long-range seasonal migration in insects: Mechanisms, evolutionary drivers and ecological consequences. Ecology Letters, 18(3), 287-302.

Chisel, J. T. (2015). Honey Bees’ Impact on the U.S. Economy . Economics Theses, 98.

Collett, M., & Collett, T. S. (2000). How do insects use path integration for their navigation? Biological cybernetics, 245-259.


Colpitts, B., & Boiteau, G. (2004). Harmonic radar transceiver design: Miniature tags for insect tracking. IEEE Transactions on Antennas and Propagation, 52(11), 2825-2832.

Conte, Y. L., & Hefetz, A. (2008). Primer pheromones in social Hymenoptera. Annu Rev Entomol., 53:523–42.

Correll, N., Sempo, G., De Meneses, Y., Halloy, J., Deneubourg, J., & Martinoli, A. (2006). SwisTrack: A tracking tool for multi-unit robotic and biological systems. IEEE International Conference on Intelligent Robots and Systems, 2185-2191.

Crawford, A. B. (1949). Radar reflections in the lower atmosphere. Proceedings of the Institute of Radio Engineers, 404-405.

David, C. T., Kennedy, J. S., & & Ludlow, A. R. (1983). Finding of a sex pheromone source by gypsy moths released in the field. Nature, 303, 804-806.

Decourtye, A., Devillers, J., Aupinel, P., Brun, F., Bagnis, C., Fourrier, J., & Gauthier, M. (2011). Honeybee tracking with microchips: A new methodology to measure the effects of pesticides. Ecotoxicology, 20(2), 429-437.

Drake, V. A. (1984). The vertical distribution of macro-insects migrating in the nocturnal boundary layer: a radar study. Boundary-layer meteorology, 353-374.

Drake, V. A., & Farrow, R. A. (1983). The nocturnal migration of the Australian plague locust, Chortoicetes terminifera (Walker)(Orthoptera: Acrididae): quantitative radar observations of a series of northward flights. Bulletin of Entomological Research, 567-585.

Efford, N. (2000). Digital Image Processing: A Practical Introduction Using Java. Addison-Wesley Longman Publishing Co., Inc.

Garland, M. S., & Davis, A. K. (2002). An Examination of Monarch Butterfly (Danaus plexippus) Autumn Migration in Coastal Virginia. The American Midland Naturalist, 147(1), 170-174.

Gibson, G., & Brady, J. (1985). 'Anemotactic' flight paths of tsetse flies in relation to host odour: a preliminary video study in nature of the response to loss of odour. Physiological Entomology, 395-406.

Gill, R., Ramos-Rodriguez, O., & Raine, N. (2012). Combined pesticide exposure severely affects individual-and colony-level traits in bees. Nature, 491(7422), 105-108.

Godbehere, A. B., & Goldberg, K. (2012). Visual tracking of human visitors under variable-lighting conditions for a responsive audio art installation. American Control Conference, 4305-4312.

GoPro, I. (2020). GoPro.com. Retrieved from GoPro Hero 7 Black: https://gopro.com/en/gb/shop/cameras/hero7-black/CHDHX-701-master.html

Goulson, D. N. (2015). Bee declines driven by combined stress from parasites, pesticides, and lack of flowers. Science, 347(6229).

Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.


Greenbank, D. O., Schaefer, G. W., & Rainey, R. C. (1980). Spruce budworm (Lepidoptera: Tortricidae) moth flight and dispersal: new understanding from canopy observations, radar and air craft. The Memoirs of the Entomological Society of Canada, 1-49.

Hagen, M., Wikelski, M., & Kissling, W. (2011). Space use of bumblebees (Bombus spp.) revealed by radio-tracking. PLoS ONE, 6(5).

Hagler, J., & Jackson, C. (2001). Methods for marking insects: Current Techniques and Future Prospects. Annual Review of Entomology, 46(1), 511-543.

Hardy, A., & Milne, P. (1938). Studies in the Distribution of Insects by Aerial Currents. Journal of Animal Ecology, 199-229.

Hare, S., Golodetz, S., Saffari, A., Vineet, V., Cheng, M., Hicks, S., & Torr, P. (2016). Struck: Structured Output Tracking with Kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(10), 2096-2109.

Harman, I. T., & Drake, V. A. (2004). Insect monitoring radar: analytical time-domain algorithm for retrieving trajectory and target parameters. Computers and electronics in agriculture, 23-41.

Hartley, R., & Zisserman, A. (2003). Epipolar Geometry and the Fundamental Matrix. In R. Hartley, & A. Zisserman, Multiple view geometry in computer vision (pp. 239-261). Cambridge University Press. Retrieved from University of Oxford: https://www.robots.ox.ac.uk/~vgg/hzbook/hzbook2/HZepipolar.pdf

Hasirlioglu, S., Kamann, A., Doric, I., & Brandmeier, T. (2016). Test methodology for rain influence on automotive surround sensors. IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), (pp. 2242-2247). Rio de Janeiro, Brazil.

Hayashi, F., & Nakane, M. (1989). Radio tracking and activity monitoring of the dobsonfly larva, Protohermes grandis (Megaloptera: Corydalidae). Oecologia, 78(4), 468-472.

Heinrich, B. (1979). Keeping a cool head: honeybee thermoregulation. Science, 205(4412), pp.1269-1271.

Henriques, J., Caseiro, R., Martins, P., & Batista, J. (2015). High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3), 583-596.

Henry, M., Béguin, M., Requier, F., Rollin, O., Odoux, J.-f., Aupinel, P., . . . Decourtye, A. (2012). A Common Pesticide Decreases Foraging Success and Survival in Honey Bees. Science (New York, N.Y.), 336(April), 3-5.

Holland, R., Wikelski, M., & Wilcove, D. (2006). How and why do insects migrate? Science, 313(5788), 794-796.

Howard, E., & Davis, A. (2009). The fall migration flyways of monarch butterflies in eastern North America revealed by citizen scientists. Journal of Insect Conservation, 13(3), 279-286.


Jaboyedoff, M., Oppikofer, T., Abellán, A., Derron, M. H., Loye, A., Metzger, R., & Pedrazzini, A. (2012). Use of LIDAR in landslide investigations: a review. Natural hazards, 5-28.

Johnson, R. (2010). Honey Bee Colony Collapse Disorder. Washington: Congressional Research Service.

KaewTraKulPong, P., & Bowden, R. (2002). An improved adaptive background mixture model for real-time tracking with shadow detection. Video-based surveillance systems , 135-144.

Kalal, Z., Mikolajczyk, K., & Matas, J. (2012). Tracking-learning-detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(7), 1409-1422.

Karaboga, D. (2005). An idea based on honey bee swarm for numerical optimization. Technical report-tr06, Erciyes university, engineering faculty, computer engineering department., Vol. 200, pp. 1-10.

Kissling, W. D., Pattemore, D. E., & Hagen, M. (2014). Challenges and prospects in the telemetry of insects. Biological Reviews, 511-530.

Klein, A. M.-D. (2007). Importance of pollinators in changing landscapes. Proceedings of the royal society B: biological sciences, 274(1608), 303-313.

Knight, A., Brower, L., & Williams, E. (1999). Spring remigration of the monarch butterfly, Danaus plexippus (Lepidoptera: Nymphalidae) in north-central Florida: Estimating population parameters using mark-recapture. Biological Journal of the Linnean Society, 68(4), 531-556.

Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval research logistics quarterly, 83-97.

Kumari, M., & Hasan, S. R. (2020). A New CMOS Implementation for Miniaturized Active RFID Insect Tag and VHF Insect Tracking. IEEE Journal of Radio Frequency Identification, 4(2), 124-136.

Lam, J., Kusevic, K., Mrstik, P., Harrap, R., & Greenspan., M. (2010). Urban scene extraction from mobile ground based lidar data. Proceedings of 3DPVT, 1-8.

Lohani, B., & Ghosh, S. (2017). Airborne LiDAR Technology: A Review of Data Collection and Processing Systems. Proc. Natl. Acad. Sci., India, Sect. A Phys. Sci., 567-579.

Loper, G. M., & Wolf, W. W. (1986). Using the USDA-ARS insect radar unit to monitor honeybee drone activity. American bee journal.

Lowe, W., & Allendorf, F. (2010). What can genetics tell us about population connectivity? Molecular Ecology, 19(15), 3038-3051.

Lukhtanov, V., Pazhenkova, E., & Novikova, A. (2016). Mitochondrial chromosome as a marker of animal migratory routes: DNA barcoding revealed Asian (non-African) origin of a tropical migrant butterfly Junonia orithya in south Israel. Comparative Cytogenetics, 10(4), 671-677.

Mallory, J. P. (1997). Encyclopedia of Indo-European Culture. Taylor & Francis.

Malmqvist, E., Jansson, S., Zhu, S., Li, W., Svanberg, K., Svanberg, S., . . . Åkesson, S. (2018). The bat–bird–bug battle: Daily flight activity of insects and their predators over a rice field revealed by high-resolution scheimpflug lidar. Royal Society open science, 172303.

Mascanzoni, D., & Wallin, H. (1986). The harmonic radar: a new method of tracing insects in the field. Ecological entomology, 387-390.

Medina, A., Gayá, F., & Del Pozo, F. (2006). Compact laser radar and three-dimensional camera. J. Opt. Soc. Am. A., 800-805.

Mei, L., Guan, Z. G., Zhou, H. J., Lv, J., Zhu, Z. R., Cheng, J. A., . . . Somesfalean, G. (2012). Agricultural pest monitoring using fluorescence lidar techniques. Applied Physics B, 733-740.

Morison, G. D. (1928). Memoirs: The Muscles of the Adult Honey-bee (Apis mellifera L.). Journal of Cell Science, 2(284), 563-651.

Mortensen, A., Schmehl, D., & Ellis, J. (2013, August). Entomology and Nematology Department. Retrieved from University of Florida: http://entnemdept.ufl.edu/creatures/MISC/BEES/euro_honey_bee.htm

Mucignat-Caretta, C. (. (2014). Neurobiology of chemical communication. CRC Press.

Murlis, J., & Bettany, B. W. (1977). Night flight towards a sex pheromone source by male Spodoptera littoralis (Boisd.)(Lepidoptera, Noctuidae). Nature, 433-435.

Nagoshi, R., Meagher, R., & Hay-Roe, M. (2012). Inferring the annual migration patterns of fall armyworm (Lepidoptera: Noctuidae) in the United States from mitochondrial haplotypes. Ecology and Evolution, 2(7), 1458-1467.

Nam, H., & Han, B. (2016). Learning Multi-domain Convolutional Neural Networks for Visual Tracking. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016-Decem, 4293-4302.

Negro, M., Casale, A., Migliore, L., Palestrini, C., & Rolando, A. (2008). Habitat use and movement patterns in the endangered ground beetle species, Carabus olympiae (Coleoptera: Carabidae). European Journal of Entomology, 105(1), 105-112.

Nelson, J. A., Sturtevant, A. P., & Lineburg, B. (1924). Growth and feeding of honeybee larvae. US Department of Agriculture.

Noldus, L., Spink, A., & Tegelenbosch, R. (2002). Computerised video tracking, movement analysis and behaviour recognition in insects. Computers and Electronics in Agriculture, 35(2-3), 201-227.

O'Neal, M., Landis, D., Rothwell, E., Kempel, L., & Reinhard, D. (2004). Tracking insects with harmonic radar: A case study. American Entomologist, 50(4), 212-218.


Owens, R. (1997, October 29). Mathematical Morphology. Retrieved from The University of Edinburgh: http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT3/node3.html

Piccardi, M. (2004). Background subtraction techniques: a review. IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No. 04CH37583), vol. 4, pp. 3099-3104.

Potts, S. G. (2010). Declines of managed honey bees and beekeepers in Europe. Journal of apicultural research, 49(1), 15-22.

Pu, S., Rutzinger, M., Vosselman, G., & Elberink, S. O. (2011). Recognizing basic structures from mobile laser scanning data for road inventory studies. ISPRS Journal of Photogrammetry and Remote Sensing, 528-539.

Rainey, R. C. (1955). Observation of desert locust swarms by radar. Nature, 77-77.

Riecken, U., & Raths, U. (1996). Use of radio telemetry for studying dispersal and habitat use of Carabus coriaceus L. Annales Zoologici Fennici, 33(1), 109-116.

Riley, J. (1980). Radar as an Aid to the Study of Insect Flight. Pergamon Press Ltd.

Riley, J. R. (1992). A millimetric radar to study the flight of small insects. Electronics & communication engineering journal, 43-48.

Riley, J. R., & Reynolds, D. R. (1987). The migration of Nilapar vata lugens (Stal) (Delphacidae) and other Hemiptera associated with rice during the dry season in the Philippines - a study using radar, visual observations, aerial netting and ground trapping. Bullitin of Entomological Research, 145-169.

Riley, J. R., Xia-nian, C., Xiao-xi, Z., Reynolds, D. R., Guo-min, X., Smith, A. D., . . . Bao-ping, Z. (1991). The long‐distance migration of Nilaparvata lugens (Stål) (Delphacidae) in China: radar observations of mass return flight in the autumn. Ecological Entomology, 16(4), 471-489.

Riley, J., Armes, N., Reynolds, D., & Smith, A. (1992). Nocturnal observations on the emergence and flight behaviour of Helicoverpa armigera (Lepidoptera: Noctuidae) in the post-rainy season in central India. Bulletin of Entomological Research, 82(2), 243-256.

Riley, J., Greggers, U., Smith, A., Reynolds, D., & Menzel, R. (2005). The flight paths of honeybees recruited by the waggle dance. Nature, 435(7039), 205-207.

Ristroph, L., Berman, G., Bergou, A., Wang, Z., & Cohen, I. (2009). Automated hull reconstruction motion tracking (HRMT) applied to sideways maneuvers of free-flying insects. Journal of Experimental Biology, 212(9), 1324-1335.

Robinson, G. E. (1992). Regulation of division of labor in insect societies. Annual review of entomology, 37(1), 637-665.

Robinson, G. E. (2002). Genomics and integrative analyses of division of labor in honeybee colonies. The American Naturalist, 160(S6), S160-S172.


Sanna, A., & Lamberti, F. (2014). Advances in target detection and tracking in forward-looking infrared (FLIR) imagery. Sensors (Switzerland), 14(11), 20297-20303.

Särkkä, S., Viikari, V., Huusko, M., & Jaakkola, K. (2012). Phase-based UHF RFID tracking with nonlinear Kalman filtering and smoothing. IEEE Sensors Journal, 12(5), 904-910.

Schaefer, G. (1972). Radar detection of individual locusts and swarms. International Study Conference on the Current and Future Problems of Acridology, 379-380.

Schaefer, G. W. (1979). An airborne radar technique for the investigation and control of migrating pest insects. Philosophical Transactions of the Royal Society of London. B, Biological Sciences, 287(1022), 459-465.

Schaefer, G., & Bent, G. (1984). An infra-red remote sensing system for the active detection and automatic determination of insect flight trajectories (IRADIT). Bulletin of Entomological Research, 74(2), 261-278.

Schneider, C., Tautz, J., Grünewald, B., & Fuchs, S. (2012). RFID tracking of sublethal effects of two neonicotinoid insecticides on the foraging behavior of Apis mellifera. PLoS ONE, 7(1), 1-9.

Shaw, J. A., Seldomridge, N. L., Dunkle, D. L., Nugent, P. W., Spangler, L. H., Bromenshenk, J. J., . . . Wilson, J. J. (2005). Polarization lidar measurements of honey bees in flight for locating land mines. Optics express, 5853-5863.

Shields, E. J., & Testa, A. M. (1999). Fall migratory flight initiation of the potato leafhopper, Empoasca fabae (Homoptera: Cicadellidae): observations in the lower atmosphere using remote piloted vehicles. Agricultural and Forest Meteorology, 317-330.

Silcox, D., Doskocil, J., Sorenson, C., & Brandenburg, R. (2011). Radio frequency identification tagging: A novel approach to monitoring surface and subterranean insects. American Entomologist, 57(2), 86-93.

Smith, A. D., Riley, J. R., & Gregory, R. D. (1993). A method for routine monitoring of the aerial migration of insects by using a vertical-looking radar. Philosophical Transactions of the Royal Society of London, 393-404.

Snodgrass, R. E. (1956). Anatomy of the Honey Bee. Cornell University Press.

Soilán, M., Riveiro, B., Martínez-Sánchez, J., & Arias, P. (2016). Traffic sign detection in MLS acquired point clouds for geometric and image-based semantic inventory. ISPRS Journal of Photogrammetry and Remote Sensing, 92-101.

Sokolowski, M., Moine, M., & Naassila, M. (2012). "Beetrack": A software for 2D open field locomotion analysis in honey bees. Journal of Neuroscience Methods, 207(2), 211-217.

Sommer, S., & Wehner, R. (2004). The ant's estimation of distance travelled: Experiments with desert ants, Cataglyphis fortis. Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, 190(1), 1-6.

Stabentheiner, A., Kovac, H., & Schmaranzer, S. (2002). Honeybee nestmate recognition: the thermal behaviour of guards and their examinees. Journal of Experimental Biology, 205(17), 2637-2642.


Steedman, A. (1990). Locust Handbook (3rd edition). Chatham: Natural Resources Institute.

Stephens, G., Johnson-Kerner, B., Bialek, W., & Ryu, W. (2008). Dimensionality and dynamics in the behavior of C. elegans. PLoS Computational Biology, 4(4).

Suchan, T., Talavera, G., Sáez, L., Ronikier, M., & Vila, R. (2019). Pollen metabarcoding as a tool for tracking long-distance insect migrations. Molecular Ecology Resources, 19(1), 149-162.

Talavera, G., & Vila, R. (2017). Discovery of mass migration and breeding of the painted lady butterfly Vanessa cardui in the Sub-Sahara: The Europe-Africa migration revisited. Biological Journal of the Linnean Society, 120(2), 274-285.

The EM Algorithm for Gaussian Mixtures. (n.d.). Probabilistic Learning: Theory and Algorithms, CS 274A.

Thoma, M., Hansson, B., & Knaden, M. (2015). High-resolution Quantification of Odor-guided Behavior in Drosophila melanogaster Using the Flywalk Paradigm. Journal of Visualized Experiments(106), 1-10.

Tiegs, O. W. (1955). The flight muscles of insects-their anatomy and histology; with some observations on the structure of striated muscle in general. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 238(656), 221-348.

Tomizawa, M., & Casida, J. E. (2005). Neonicotinoid insecticide toxicology: Mechanisms of selective action. Annu. Rev. Pharmacol. Toxicol., 45, 247-268.

UKNS, T. (2018). Retrieved from Department for Environment Food & Rural Affairs: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/766200/nps-implementation-plan-2018-2021.pdf

Von Frisch, K. (1967). The dance language and orientation of bees.

Wada, T., Seino, H., Ogawa, Y., & Nakasuga, T. (1987). Evidence of autumn overseas migration in the rice planthoppers, Nilaparvata lugens and Sogatella furcifera: analysis of light trap catches and associated weather patterns. Ecological Entomology, 321-330.

Weber, E. (2013). Apis mellifera: The domestication and spread of European honey bees for agriculture in North America.

Weng, J., Cohen, P., & Herniou, M. (1992). Camera calibration with distortion models and accuracy evaluation. IEEE Transactions on Pattern Analysis & Machine Intelligence, 965-980.

Wenner, A. M. (1963). The flight speed of honeybees: a quantitative approach. Journal of Apicultural Research, 25-32.

Wentworth, J. (2010, January). Insect pollination. Retrieved from UK Parliament POST: https://post.parliament.uk/research-briefings/post-pn-348/

Wikelski, M., Moskowitz, D., Adelman, J., Cochran, J., Wilcove, D., & May, M. (2006). Simple rules guide dragonfly migration. Biology Letters, 2(3), 325-329.


Wilkinson, D., Lebon, C., Wood, T., Rosser, G., & Gouagna, L. (2014). Straightforward multi-object video tracking for quantification of mosquito flight activity. Journal of Insect Physiology, 71, 114-121.

Wilson, E. O. (1971). The Insect Societies. Belknap Press of Harvard University Press.

Wolf, T. J., Schmid-Hempel, P., Ellington, C. P., & Stevenson, R. D. (1989). Physiological correlates of foraging efforts in honey-bees: oxygen consumption and nectar load. Functional Ecology, 417-424.

Wu, Y., Lim, J., & Yang, M. (2015). Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(9), 1834-1848.

Yela, J., & Holyoak, M. (1997). Effects of Moonlight and Meteorological Factors on Light and Bait Trap Catches of Noctuid Moths (Lepidoptera: Noctuidae). Environmental Entomology, 26(6), 1283-1290.

Zeevi, S. (2017). The BackgroundSubtractorCNT project. Retrieved from Github: https://github.com/sagi-z/BackgroundSubtractorCNT

Zhang, W. (2010). LIDAR-based road and road-edge detection. IEEE Intelligent Vehicles Symposium, (pp. 845-848). San Diego, CA.

Zhou, X., & Wang, Y. (2007). Real-Time Detecting and Tracking of IR Small Target with Complex Background.

Zhu, S., Malmqvist, E., Li, W., Jansson, S., Li, Y., Duan, Z., . . . Brydegaard, M. (2017). Insect abundance over Chinese rice fields in relation to environmental parameters, studied with a polarization-sensitive CW near-IR lidar system. Applied Physics B, 1-11.

Zivkovic, Z. (2004). Improved adaptive Gaussian mixture model for background subtraction. Proceedings of the 17th International Conference on Pattern Recognition, 28-31.

Zivkovic, Z., & Heijden, F. V. (2006). Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern recognition letters, 773-780.


Appendix

Journal Publication (submitted to Journal of Insect Science)

A visual tracking system for honeybee, Apis mellifera, 3D flight trajectory reconstruction and analysis
