Long-range Outdoor Monocular Localization with Active Features for Ship Air Wake Measurement

by

Brandon J. Draper

B.S., Aerospace Engineering, Univ. of Maryland, College Park (2016)

Submitted to the Department of Aeronautics and Astronautics in partial fulfillment of the requirements for the degree of

Master of Science in Aerospace Engineering

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2019

© Massachusetts Institute of Technology 2019. All rights reserved.

Author: Signature redacted
    Department of Aeronautics and Astronautics
    October 1, 2019

Certified by: Signature redacted
    Jonathan P. How
    R. C. Maclaurin Professor of Aeronautics and Astronautics, MIT
    Thesis Supervisor

Accepted by: Signature redacted
    Sertac Karaman
    Associate Professor of Aeronautics and Astronautics, MIT
    Chair, Graduate Program Committee

Long-range Outdoor Monocular Localization with Active Features for Ship Air Wake Measurement by Brandon J. Draper

Submitted to the Department of Aeronautics and Astronautics on October 1, 2019, in partial fulfillment of the requirements for the degree of Master of Science in Aerospace Engineering

Abstract

Monocular pose estimation is a well-studied aspect of computer vision with a wide array of applications, including camera calibration, autonomous navigation, object pose tracking, augmented reality, and numerous other areas. However, some unexplored areas of camera pose estimation remain academically interesting. This thesis provides a detailed description of the system hardware and software that permits operation in one application area in particular: long-range, precise monocular pose estimation in feature-starved environments. The novel approach to pose extraction uses special hardware, including active LED features and a bandpass-interference optical filter, to significantly simplify the image processing step of the Perspective-n-Point (PnP) problem. The PnP problem describes the calculation of pose from n extracted image points corresponding to n known 3D world points. The proposed method operates in tandem with a tethered unmanned aerial vehicle (UAV) and a mobile ground control station (GCS). The integrated localization and flight system serves as a platform for future U.S. Navy air flow research. Indoor tests at the RAVEN flight space of MIT's Aerospace Controls Lab and outdoor tests at a grass strip runway demonstrate the system's efficacy in providing an accurate and precise pose estimate of the UAV relative to the mobile GCS.

Thesis Supervisor: Jonathan P. How Title: R. C. Maclaurin Professor of Aeronautics and Astronautics, MIT

Acknowledgments

I would first like to thank my advisor, Jon How, both for the opportunity to advance myself here at MIT and for providing insight and advice at each obstacle I faced in the project. Your guidance proved invaluable in the completion of the project and in my personal development. I have learned such a tremendous amount in such a short time. You have helped me to improve both technically and analytically, giving me the skills that provide access to much greater success in the future.

I would like to thank Ben Cameron, John Walthour, Will Fisher, and many others of Creare, LLC. for their technical expertise and roles in the completion of the project.

Thank you to all of my fellow ACL members for their support during my time at MIT. Justin Miller took me under his wing when I first joined the lab, helping me transition into grad school and offering support in the early stages of the project. Brett Lopez has provided extensive hardware support, and has propped me up whenever I felt particularly discouraged. Michael Everett was always my go-to when it came to Robot Operating System (ROS) issues, and has played a vital role in the completion of the project. Lastly, my UROP, Ryan Scerbo, helped tremendously in the final stages of the project, implementing software updates and providing much-needed flight test support.

Thanks to my parents, who encourage me to set the bar high and give it my all. You're the reason for my ambition and dedication, and I owe you much of my success.

I want to thank my incredible partner, Dema Tzamaras, for her unending support, patience, and resilience through our long-distance relationship. Thank you for encouraging me when I faltered, being the rock that I relied on, and being a constant source of joy. I hope I bring as much happiness to your life as you do to mine.

Lastly, I would be remiss not to acknowledge the sources of funding for the project and my educational pursuits at MIT: Creare, LLC., the Small Business Technology Transfer (STTR) program, and the Office of Naval Research.

Contents

1 Introduction
  1.1 Project Overview
  1.2 Motivation
  1.3 Related Work
    1.3.1 Operational Environment Overview
    1.3.2 Non-camera-based Localization Methods
    1.3.3 Camera-based Localization Methods
  1.4 Thesis Contributions

2 Monocular Camera Model and Perspective-n-Point Problem
  2.1 Monocular Camera Model
    2.1.1 Pinhole Camera Model
    2.1.2 Real Lens Calibration
  2.2 Perspective-n-Point Problem
    2.2.1 PnP Description
    2.2.2 The OpenCV Iterative PnP Method

3 Monocular Localization System
  3.1 System Requirements
  3.2 Monocular Localization Choice
  3.3 Hardware
    3.3.1 Active Features
    3.3.2 Camera
    3.3.3 Power Considerations
  3.4 Software
    3.4.1 Initial Centroid Extraction
    3.4.2 First Centroid Filter
    3.4.3 Second Centroid Filter
    3.4.4 Assign Points to Rows
    3.4.5 Solve PnP
    3.4.6 Summary

4 Flight System
  4.1 Hardware
    4.1.1 GCS Hardware
    4.1.2 UAV Hardware
    4.1.3 Special Hardware Considerations
  4.2 Software
    4.2.1 GCS Software
    4.2.2 UAV Software
    4.2.3 Waypoint Commands in a Relative System
    4.2.4 A Note on Communications
    4.2.5 Summary

5 Testing and Results
  5.1 Monocular Localization System Ground Tests
    5.1.1 Vicon
    5.1.2 Outdoor
  5.2 Flight System Ground Tests
    5.2.1 Software-in-the-Loop Simulations
    5.2.2 Hardware Validation
  5.3 Integrated System Tests
    5.3.1 Initial Verification
    5.3.2 Stationary GCS Flights
    5.3.3 Mobile GCS Flights
    5.3.4 Summary

6 Conclusion
  6.1 Summary
  6.2 Future Work

A Equipment

List of Figures

1-1 Project Overview

2-1 Pinhole Camera Model
2-2 Camera Distortion Examples
2-3 Camera Calibration Process
2-4 P3P Problem Diagram

3-1 Sunlight Energy Spectrum at Sea Level
3-2 Mobile Ground Station Beacon Mounting
3-3 Camera Setting Comparison
3-4 Optical Filter Transmission Spectrum
3-5 Pose Extraction Block Diagram
3-6 Truncated Search Space and Feature Extraction Process

4-1 Hardware Interconnect Diagram
4-2 Ground Control Station Hardware
4-3 Assembled UAV
4-4 Anti-vibration Mount and Plot
4-5 Electromagnetic Interference Shielding
4-6 Tether Tensioner
4-7 Ground Control Station Console
4-8 ROS Architecture for Flight
4-9 Yaw Command Generation Example
4-10 Mission Transform Example

5-1 1:35 Scale Model
5-2 Vicon Ground Test 1: Position
5-3 Vicon Ground Test 1: Orientation
5-4 Vicon Ground Test 2: Position
5-5 Vicon Ground Test 2: Orientation
5-6 Noise Filter Example
5-7 Briggs Field Beacon Installation
5-8 Outdoor Validation: Position
5-9 Outdoor Validation: Orientation
5-10 Mobile Flight Platform
5-11 Stationary GCS Flight: Position
5-12 Stationary GCS Flight: Orientation
5-13 Mobile GCS Flight: Position
5-14 Mobile GCS Flight: Orientation

List of Tables

3.1 Specifications for PointGrey Chameleon 3 Monochrome Camera
3.2 Operational camera settings

4.1 Transfer parameter values and associated meanings

Chapter 1

Introduction

1.1 Project Overview

This thesis presents a monocular localization method and its application to an instrumented, outdoor unmanned aerial vehicle (UAV) that is tethered to a ground control station (GCS). The purpose of this project is to develop a system capable of measuring the air flow aft of a large naval carrier in order to provide a baseline to a computational fluid dynamics model. The monocular localization method provides a high-accuracy camera pose (position and orientation) estimate in an outdoor environment, permitting precise positioning of the air flow data. The tether allows the GCS to tow the UAV, in addition to providing a simple recovery method.

Figure 1-1: A tethered UAV instrumented with an air probe collects data behind a naval ship with active features.

Chapter 2 provides a description of the Perspective-n-Point (PnP) problem and a derivation of the solution method that this project utilizes. The well-studied PnP problem is the basis for the monocular localization method developed in this thesis. Chapter 3 details the constraints that the monocular localization system developed in this thesis satisfies. It discusses the reasoning for the hardware choices and describes the technical specifications of each subsystem. This chapter also describes the software of the monocular localization system, explaining each image processing step. Chapter 4 explains the actual flight system, including system hardware and software. As in the previous chapter, this chapter examines the role of each hardware subsystem and notes any special considerations. It also describes in depth the GCS and UAV software, in addition to the unique communication method of this project. Chapters 5 and 6 review testing procedures, results, and conclusions of this thesis. Initial tests display results of the monocular localization system for indoor and outdoor environments, while later tests show the effectiveness of the integrated flight and monocular localization system on a UAV with both a stationary and mobile GCS.

1.2 Motivation

Camera pose (position and orientation) estimation is a well-studied area of computer vision. It finds applications in camera calibration [25], autonomous navigation [14, 27], object pose tracking and augmented reality [10, 23], and many other areas. A significant body of work applies the ability to extract the pose of a camera. However, there still exist nuanced application areas worth investigating.

Well-calibrated systems in laboratory environments can produce extremely precise pose estimates, with some achieving sub-millimeter accuracy [17]. Intuitively, highly accurate and precise pose estimates are harder to achieve in non-laboratory settings. Methods to extract pose in unknown environments typically create a map (usually simultaneously, à la simultaneous localization and mapping (SLAM)) [3, 20] in order to generate a knowledge of the area. This requires significant computational effort and onboard storage, and, in feature-starved environments, still results in absolute errors greater than one meter [12].

Aside from lacking accuracy and precision in unknown environments, common camera pose estimation techniques are limited in terms of maximum sensing distance. Methods, such as Vicon or the Microsoft Kinect, that utilize a camera with a built-in source of illumination are ineffective at distances beyond ten meters. Camera pose estimation algorithms that detect and label features in the image are shown to have significant (>1 meter) absolute translation errors at distances greater than 100 meters (or worse performance at smaller distances) [9].

This thesis examines a camera pose extraction method capable of reporting accurate and precise pose estimates in unknown, feature-starved environments. The developed technique has an operational distance greater than 300 meters. It achieves this performance with the implementation of active features in the form of 100 Watt infrared LEDs combined with an optical-filter-equipped camera. The method is used to post-process video from an autonomous unmanned aerial vehicle (UAV).

1.3 Related Work

This section discusses the most common and also the most recent pose estimation methods and how those methods fall short in this project's application environment. First, this section examines non-camera localization methods, such as GPS, LIDAR, and ultra-wide band (UWB) positioning. Then it explores camera-based techniques, such as vision-enabled simultaneous localization and mapping (SLAM), visual-inertial navigation (VIN), and stereo-camera-based approaches. In each method examined, this review discusses the aspect of the approach that causes it to fall short when applying it to the operational environment considered in this thesis.

1.3.1 Operational Environment Overview

This overview provides a brief description of this project's operational environment to better-inform the discussion of the shortcomings of related pose estimation methods. The goal of the localization system presented in this thesis is to provide pose estimates of an unmanned aerial vehicle (UAV) aft of a naval ship, which creates a number of unique constraints. First of all, the flight environment is feature-starved, with open ocean on three sides of the ship and a distant landmass on the remaining horizon. The only interesting nearby object is the ship itself. The UAV flies from the deck of the naval ship to a distance of up to 100 meters aft of the ship, and pose estimates are needed for all ranges. Standard UAV weight considerations require that the onboard portion of the localization system be small and lightweight. The U.S. Navy imposes additional system constraints, including the prohibition of any lasers and the minimization of all radio signals. Finally, the system must provide an absolute translation error smaller than one meter.

1.3.2 Non-camera-based Localization Methods

GPS, DGPS, and IMU

The most obvious localization method for outdoor relative positioning is differential GPS (DGPS) or real-time kinematic (RTK) GPS. It is reasonable to assume that placing a GPS on the ground control station (GCS) and one on the UAV will provide decent relative positioning. Beyond that, the inertial measurement unit (IMU) on the UAV can provide pitch and roll information while a magnetometer (or compass) provides the yaw. However, this approach has implementation drawbacks that prevent it from being a viable option for this project.

Both DGPS and RTK GPS provide highly accurate relative position estimates, with horizontal position errors around one meter and half of a meter, respectively [4]. Each of these systems utilizes a stationary base station whose position is known to extremely high accuracy. Due to the reliance on a stationary base station, neither of these methods satisfies the project constraints. Additionally, testing of an RTK GPS system displayed frequent signal dropouts and subsequent recalibration times of about fifteen minutes. These frequent dropouts and long recalibration times reaffirm the impracticality of RTK GPS for this project.

Aside from GPS issues, the magnetometer onboard the UAV is unable to provide accurate results in this environment. The magnetometer suffers from large errors when near any steel objects, such as the naval ship. Magnetometer tests displayed absolute errors greater than 20 degrees and heading drift of more than 15 degrees in only five minutes. Since much of the UAV's operation will be above or near the ship, this project cannot rely on a magnetometer for accurate pose estimates. As a result of these shortcomings, GPS position combined with IMU data is not sufficient for this project's operational environment.

Laser-based

Lasers are prohibited outright by U.S. Navy restrictions when operating on a naval vessel. Even if the U.S. Navy were to permit the use of lasers, current laser-based pose estimation methods cannot satisfy all other constraints. Velodyne makes a LIDAR sensor popular in research applications. The smallest version, the Velodyne Puck Lite, weighs 590 grams without any cabling [30], a significant weight for a UAV. While this LIDAR has a sensing range of 100 meters, more weight-affordable options have a much lower sensing range. The prohibition of lasers, and the unsatisfactory balance of weight and range for LIDAR, means that this project cannot use laser-based pose estimation methods.

Ultra-wide Band

Ultra-wideband (UWB) positioning methods are extremely accurate, with some boasting millimeter accuracy [15, 29]. They are relatively low-power and do not interfere with radio signals of the same wavelengths. Unfortunately, their low power leads to limited range [15]. Most UWB applications involve indoor environments with multiple sensors spread out over the area. Due to the low range of UWB systems, they are not suitable for this project.

1.3.3 Camera-based Localization Methods

Simultaneous Localization and Mapping (SLAM)

Simultaneous localization and mapping (SLAM) is a commonly used localization method when operating in unknown environments. There exist numerous SLAM methods in operation today, and SLAM remains a hot topic in the academic community. For brevity, the discussion here focuses on ORB-SLAM2, currently considered the state of the art.

ORB-SLAM2 [21] is a feature-based SLAM method that utilizes feature points extracted from the image for tracking, mapping, relocalization, and loop closing. It operates in real-time and performs effectively indoors and somewhat less so outdoors, with errors of less than one centimeter and a few meters, respectively [20]. Unfortunately, the outdoor performance of ORB-SLAM2 is insufficient for this project. Additionally, it relies on a testing environment different from that implicit in this project, needing a variety of viewing angles and interesting image features in order to perform well. The poor outdoor performance of ORB-SLAM2 and the feature-starved operational environment of this thesis rule out ORB-SLAM2 for this project.

Visual-inertial Navigation

Visual-inertial navigation (VIN) methods utilize a camera (or set of cameras) and an inertial navigation system (INS) to estimate camera pose. The pose solution updates based upon the observed optical flow and inertial measurements. Due to solution drift over time, visual-inertial techniques generally create a map of the environment in a SLAM-like approach. Monocular VIN suffers from scale ambiguity, which is true for all monocular systems in fully-unknown environments. Stereo VIN also has limitations that become prohibitive in this thesis's operational environment. Namely, the stereo VIN solution degrades to the monocular case when the scene features are much more distant from the camera than the stereo baseline [19]. For example, when the only environmental features are a distant horizon and a naval ship that takes up very little image area, the stereo VIN methods will degrade to the monocular case.

Due to these constraints, visual-inertial navigation methods are inadequate for the application described in this thesis.

1.4 Thesis Contributions

The contributions of this thesis are the design and implementation of a long-range monocular localization technique, the creation of an autonomous unmanned aerial vehicle (UAV) platform, and the delivery of an air flow dataset to the U.S. Navy.

The long-range monocular localization technique determines the position and orientation of a camera relative to a setup of high-power LED beacons. The system comprises an active feature setup, a camera equipped with an optical filter, and software that processes a video feed to extract the 3D pose of the camera at each frame of the video. This system operates at distances greater than 300 meters from the active feature points and is capable of operation even in direct sunlight.

The autonomous UAV platform comprises a ground control station (GCS) and a UAV. The GCS serves as a command center for a human operator, allowing control over the UAV. The UAV is equipped with a Pixhawk autopilot and an ODROID companion computer and responds to high-level commands from the GCS. This system can easily be expanded to support multi-aircraft control from the single GCS.

The air flow dataset delivered to the U.S. Navy provides a baseline to a computational fluid dynamics (CFD) model. This model provides the U.S. Navy with the turbulent air flow aft of the superstructure of a naval helicopter carrier in numerous wind conditions, informing pilots of potentially dangerous areas of flight.

Chapter 2

Monocular Camera Model and Perspective-n-Point Problem

The monocular localization system builds on a monocular camera model and the well-studied Perspective-n-Point (PnP) problem. The monocular camera model relates a 3D object to its corresponding 2D image, and this thesis represents it as a pinhole camera with radial and tangential distortion. The PnP problem relates 3D feature points to 2D image points, estimating the camera's position and orientation relative to the known feature points using a nonlinear least-squares method.

2.1 Monocular Camera Model

This thesis utilizes a monocular camera model in order to properly calibrate the image stream from the camera. In particular, a pinhole camera model serves as the base model, which is then augmented with measured radial and tangential distortions in order to eliminate the errors caused by the real lenses.

This section depicts a pinhole camera model as described by the OpenCV documentation for Camera Calibration and 3D Reconstruction [25].

2.1.1 Pinhole Camera Model

The pinhole camera model describes the relationship between 3D real-world coordinates and 2D image coordinates using an infinitely small aperture and no lenses. This perspective transformation does not take into account real lens distortion, which Section 2.1.2 describes how to remove in order to achieve a true image.

The perspective transformation of the pinhole model is described as:

$$s\,m' = A\,[R\,|\,t]\,M'$$

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

where $m' = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ and $M' = \begin{bmatrix} X & Y & Z & 1 \end{bmatrix}^T$.

" s is a scaling factor " (u, v) are image coordinates of the projected 3D world coordinates " fx, fv are the focal lengths of the camera expressed in pixel units " (cx, cy) is the principal point on the image plane " [R t] is the joint rotation-translation matrix of extrinsic parameters * (X, Y, Z) are the 3D world coordinates

As long as the projected 3D point is not on the image plane (i.e., when $z \neq 0$), the above model equates to:

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + t$$

$$x' = x/z, \qquad y' = y/z$$

$$u = f_x x' + c_x, \qquad v = f_y y' + c_y$$

These equations use homogeneous coordinates to easily represent the 3D-to-2D projective transformation with matrices. Figure 2-1 shows this pinhole camera model.

Figure 2-1: The pinhole camera model relates 3D coordinates to their respective 2D image coordinates. Diagram from [25].
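To make the projection concrete, the following short Python sketch (NumPy-based) projects a 3D world point into pixel coordinates with the pinhole model above. All numeric values and variable names are illustrative assumptions, not the calibration of the camera used in this thesis.

```python
import numpy as np

# Intrinsic matrix A built from assumed focal lengths and principal point (pixels).
fx, fy, cx, cy = 800.0, 800.0, 644.0, 482.0
A = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# Assumed extrinsics: rotation R (identity here) and translation t in meters.
R = np.eye(3)
t = np.array([0.0, 0.0, 10.0])  # world origin 10 m in front of the camera

# A 3D world point M = (X, Y, Z).
M = np.array([1.0, -0.5, 0.0])

# Transform into the camera frame, then apply the perspective division.
xyz = R @ M + t
x_p, y_p = xyz[0] / xyz[2], xyz[1] / xyz[2]

# Apply the intrinsics to obtain pixel coordinates (u, v).
u = fx * x_p + cx
v = fy * y_p + cy
print(u, v)
```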

2.1.2 Real Lens Calibration

The pinhole camera model does not account for distortions caused by real lenses. Since no real lens features the ideal behavior of an infinitely small aperture, all camera lenses introduce some amount of radial and tangential distortion to the image. A simple calibration process removes this distortion according to simple mathematical calculations. Figure 2-2 demonstrates the various types of possible radial distortion.

Extending the pinhole model described in Section 2.1.1 to account for distortion results in the following model:

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + t$$

$$x' = x/z, \qquad y' = y/z$$

$$x'' = x'\,\frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + 2 p_1 x' y' + p_2 (r^2 + 2 x'^2)$$

$$y'' = y'\,\frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y'$$

$$r^2 = x'^2 + y'^2$$

$$u = f_x x'' + c_x, \qquad v = f_y y'' + c_y$$

- $k_1, k_2, k_3, k_4, k_5, k_6$ are radial distortion coefficients
- $p_1, p_2$ are tangential distortion coefficients

Figure 2-2: Real camera images generally experience some type of image distortion. Left: Positive radial (barrel) distortion. Middle: No distortion. Right: Negative radial (pincushion) distortion.

This thesis uses a Robot Operating System (ROS) [6] camera calibration process to calculate and remove the radial and tangential distortions. The ROS calibration in turn uses OpenCV functions to calibrate the camera and commit that calibration to the camera driver. The camera-lens combination featured here exhibits positive radial distortion, which causes the calibrated image to feature sunken-in edges. Figure 2-3 provides an example of the results of camera calibration.

Figure 2-3: ROS camera calibration removes distortion from real lenses. Left: Uncalibrated image. Middle: Calibrated image. Right: Calibrated image truncated to remove null areas.
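For illustration, the sketch below shows how such a calibration result is applied to undistort a frame with OpenCV. The intrinsic matrix and distortion coefficients are placeholder values, not the calibration obtained in this thesis.

```python
import cv2
import numpy as np

# Placeholder intrinsics and distortion coefficients (k1, k2, p1, p2, k3);
# real values come from the ROS/OpenCV calibration procedure.
camera_matrix = np.array([[800.0, 0.0, 644.0],
                          [0.0, 800.0, 482.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.array([-0.25, 0.08, 0.0, 0.0, 0.0])

raw = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Remove radial and tangential distortion from the raw image.
undistorted = cv2.undistort(raw, camera_matrix, dist_coeffs)
cv2.imwrite("frame_undistorted.png", undistorted)
```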

2.2 Perspective-n-Point Problem

In 1981, Fischler and Bolles formally introduced the term Perspective-n-Point [5] as a way to describe the problem of camera location determination based upon n control points with known, static relative 3D positions. This section discusses the Perspective-n-Point problem and the iterative solution method that this thesis utilizes.

2.2.1 PnP Description

The Perspective-n-Point problem generally refers to the calculation of camera position and orientation based upon n control points. Figure 2-4¹ displays the P3P problem.

At low n values, the PnP problem is academically uninteresting. In terms of degrees of freedom (DoF), P0P (n = 0) maintains all six degrees of freedom, with three in rotation and three in translation. P1P constrains the two translational degrees of freedom orthogonal to the optical axis, resulting in a four-DoF system. P2P further constrains the system to two degrees of freedom: a scale factor and a rotation. At P2P, one can imagine the known control points each constrained to a ray from the perspective origin through the control point's image coordinate.

At n = 3, the constraints result in a zero-DoF problem with a finite number of solutions, unlike the PnP problems so far, which each have infinitely many solutions. Figure 2-4 displays the P3P problem, which results in a system of polynomial equations with eight solutions:

¹This figure is taken from [5] and modified to add angle labels.


Figure 2-4: The P3P problem, where $L$ is the aperture. $R_{ab}$, $R_{bc}$, and $R_{ac}$ are known.

$$R_{ab}^2 = a^2 + b^2 - 2ab\cos(\theta_{ab})$$

$$R_{ac}^2 = a^2 + c^2 - 2ac\cos(\theta_{ac})$$

$$R_{bc}^2 = b^2 + c^2 - 2bc\cos(\theta_{bc})$$

However, four of the solutions turn out to be invalid due to their location behind the camera, resulting in four valid solutions. Appendix A in the Fischler and Bolles paper [5] presents the algebraic derivation of solutions to the P3P system of equations.

At n = 4 or 5, there are generally at least two solutions. However, with n = 4, all control points on a common plane, and no three control points collinear, an analytical technique produces one unique solution in front of the camera.

At n ≥ 6, there is always a unique solution. At high n values, the system is overconstrained and requires a least squares method to obtain a solution. In practice, using more control points leads to higher solution accuracy.

Many techniques address the PnP problem, each with drawbacks and advantages. Some PnP solution methods focus on improving accuracy, while others offer faster solutions at the cost of accuracy. This thesis employs an iterative least-squares method, as described in Section 2.2.2, that provides moderate speed and accuracy.

2.2.2 The OpenCV Iterative PnP Method

The OpenCV Iterative PnP method attempts to find a camera pose that minimizes reprojection error using a combination of Direct Linear Transformation (DLT) and Levenberg-Marquardt optimization [13, 16]. Reprojection error is a metric describing the accuracy of the solution to the nonlinear PnP problem. The OpenCV Iterative PnP method simulates the projection of the known 3D control points onto the image plane using the current camera pose estimate, and attempts to minimize the reprojection error, i.e., the sum of squared distances between the simulated and actual image coordinates. If the algorithm is provided with an initial guess of the pose, it uses that guess to refine the solution that it finally returns. With no initial guess, the algorithm uses the DLT method to produce an initial guess, then refines the guess using Levenberg-Marquardt optimization, also known as the damped least squares (DLS) method.
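A minimal sketch of this call is shown below, assuming the eight beacon coordinates and their detected pixel centroids are already known. All numeric values (beacon spacing, pixel locations, intrinsics) are illustrative placeholders, not the actual geometry or calibration from this project.

```python
import cv2
import numpy as np

# 3D beacon coordinates in the feature frame (two rows of four LEDs), in meters.
object_points = np.array([[x, y, 0.0]
                          for y in (0.0, 0.5)
                          for x in (0.0, 0.4, 0.8, 1.2)], dtype=np.float64)

# Corresponding 2D pixel centroids extracted from one video frame.
image_points = np.array([[610.0, 500.0], [640.0, 501.0], [670.0, 502.0], [700.0, 503.0],
                         [611.0, 470.0], [641.0, 471.0], [671.0, 472.0], [701.0, 473.0]])

camera_matrix = np.array([[800.0, 0.0, 644.0],
                          [0.0, 800.0, 482.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)  # assume the frame was already undistorted

# Iterative (DLT + Levenberg-Marquardt) solution for the camera pose.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs,
                              flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)  # rotation vector -> rotation matrix
print(ok, tvec.ravel())
```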

Chapter 3

Monocular Localization System

The monocular localization system uses a single camera onboard the unmanned aerial vehicle (UAV) to extract the full 3D pose. High-power LED beacons act as active feature points in a video recorded during a flight, and the pose extraction software post-processes the video and outputs the full pose of the UAV. For each frame of the video, the software applies a custom noise-reduction algorithm and then determines the pose of the UAV relative to the ground control station (GCS).

3.1 System Requirements

The primary objective of the Aerowake project is to gather air flow information for the area aft of the superstructure of a naval ship. The positioning of the air flow data relative to the ship is paramount to the success of the project. The water-based environment and desired unmanned aerial vehicle (UAV) flight range also create conditions that the monocular localization system must satisfy.

The United States Navy strictly limits the quantity and type of transmissions aboard the naval vessel. In particular, the project must keep radio transmissions to a minimum and may not use any type of laser. These stipulations limit the pose extraction methods available. Lidar, for example, does not satisfy these requirements.

The data requested by the U.S. Navy needs to be accurately and precisely positioned relative to the naval vessel. In order to meet the accuracy and precision requirements, the localization system must have a fine measurement resolution with a small error. This factor rules out GPS as a possible localization method. Differential GPS (DGPS) is also not a valid option due to the relatively frequent signal dropouts and subsequent re-syncing times.

The project's operating environment is following a ship in open water. This environment is feature-starved and limits possible vision-based localization methods. Additionally, the project requires operation between 10 and 150 meters from the ship, which also places constraints on vision-based localization methods. Finally, the components of the localization system onboard the UAV must be lightweight to prevent adversely affecting the flight time or capability of the UAV.

3.2 Monocular Localization Choice

The various system requirements for the Aerowake project lead to the selection of a monocular localization method using infrared (IR) LED active feature points. A subsequent section of this document covers the specific hardware required in depth. Monocular localization with active features satisfies each of the listed system requirements in the following ways:

1. It does not require any transmissions between the ground control station (GCS) and unmanned aerial vehicle (UAV) except in the form of IR light,
2. It allows for accurate and precise positioning of air flow data relative to the naval vessel using a simple Perspective-n-Point algorithm,
3. It requires a once-per-day setup time cost to mount the LEDs,
4. It creates feature points clearly visible from more than 150 meters, and
5. It adds only a small amount of weight to the UAV in the form of a single camera.

3.3 Hardware

Hardware specially chosen to meet the system requirements promotes simple and straightforward pose extraction. The camera has a moderate resolution to provide an accurate pose without generating excessively large video files, and the infrared (IR) LEDs generate a wavelength less likely to suffer from noise in an outdoor environment.

Figure 3-1: Sunlight is full-spectrum, but has relatively low intensity at 850 nanometers. The dip at wavelengths just above 850 nanometers is due to H2O absorption. Figure is modified from [24].

3.3.1 Active Features

Two arrays of 100 Watt, 850 nanometer infrared LEDs mounted on the ground control station (GCS) serve as feature points for the pose extraction algorithm. The power, wavelength, and positioning of these beacons simplify the task of identifying and labeling them within the video.

The 100 Watt power allows the LED beacons to shine brighter than any other source of 850 nanometer light in the environment. This allows the camera to pick up the beacons from distances greater than 300 meters, even on very sunny days.

The 850 nanometer wavelength of the LEDs is in a range of sunlight that is naturally low in intensity relative to visible light, furthering the camera's ability to pick up only the beacons. Figure 3-1 shows the sunlight spectrum.

Figure 3-2: The beacons are arranged in two arrays of four equally-spaced LEDs.

Table 3.1: Specifications for PointGrey Chameleon 3 Monochrome Camera

  Resolution:  1.3 MP
  Image Size:  1288 x 964
  Weight:      54.9 g
  Dimensions:  44 x 35 x 19.5 mm

The LEDs are arranged in a manner that allows for rapid identification within the pose extraction algorithm. Section 3.4 provides an in-depth discussion of the software, including how LED placement assists identification within the pose extraction algorithm. Figure 3-2 shows the arrangement of the beacons.

3.3.2 Camera

A PointGrey Chameleon 3 monochrome camera [7] with a lens and an in-line optical filter records video of the active features for the pose extraction algorithm. Table 3.1 lists the camera specs, and Table 3.2 lists the operational settings.

Table 3.2: Operational camera settings

             Exposure            Shutter Speed        Gain                  Frame Rate
  Range      -7.58 to 2.41 EV    0.046 to 32.75 ms    -11.00 to 23.99 dB    1 to 30 FPS
  Setting    -7.58 EV            0.046 ms             -2.5 dB               30 FPS

Figure 3-3: Left: Active features captured with standard (automatic) camera settings. Right: Active features captured with minimal camera settings.

The active features are extremely bright compared to the surrounding environment, which allows for camera settings that capture very little light. By doing so, the majority of noise is filtered out before it is captured in the video. The camera exposure and shutter speed are set to their minimums, each of which limits the amount of light captured by the sensor. The gain is set low to further reduce the impact of each photon that reaches the camera. Figure 3-3 shows the effect of the minimal camera settings. In addition to software settings that reduce noise, an in-line band-interference filter eliminates all wavelengths of light except for those expected from the active features. Figure 3-4 shows the transmission spectrum of the band-interference filter. The lens provides aperture adjustment capabilities. With the aperture at its smallest setting, little light reaches the sensor. This setting reduces the noise in the image and also allows the image to remain focused over long ranges. The camera, lens, and optical filter are paramount to the success of the pose extraction algorithm. With the setup described here, the active features stand out in the video and are very easy to automatically extract and label for pose estimation.

3.3.3 Power Considerations

With eight 100 Watt beacons, the power consumption of the monocular localization system is significant. Although nominally 100 Watts, each LED is driven with three amps at 15 Volts¹. This means an actual power draw of 45 Watts per beacon, for a total of 360 Watts. A gas generator satisfies this power consumption during testing. Despite being cumbersome, the added complexity of the high-power beacons and generator is greatly outweighed by the ease of pose extraction afforded by this system.

¹Nominal power being twice the actual power is a common quirk of high-power IR LEDs.

Figure 3-4: The Bi-850 optical filter removes all wavelengths of light except for those expected from the active features. Figure from [18].

3.4 Software

The monocular localization software² uses Python to implement a Perspective-n-Point pose (position and orientation) extraction algorithm relative to feature points of known dimensions. ROS (Robot Operating System) [6], OpenCV [26], and noise reduction code allow for quick and accurate determination of the camera's pose. The pose extraction algorithm comprises multiple steps to take in an initial image, process it to filter noise and recover feature points, and finally calculate the position and orientation of the camera relative to the detected feature points.

²All software for this project is open-source, and is available at https://github.com/creare-com/aerowake and https://bitbucket.org/Brndn004/creare/src/master/.

Figure 3-5: The block diagram of the pose extraction process: Obtain Initial Centroids, First Centroid Filter, Second Centroid Filter, Assign Points to Rows, and Solve PnP.

Figure 3-5 shows the process of computing the relative position from an image.

When an image arrives, it passes through a number of blocks in order to extract the image coordinates of the feature points. Once the program detects and assigns corresponding real-world points to each image point, OpenCV's solvePnP function calculates the relative position and orientation.

The software described herein is specific to a feature with two rows of four feature points each, as shown on the right in Figure 3-3. With minor modifications, the software can operate on features with various layouts.

3.4.1 Initial Centroid Extraction

When an image arrives from the camera, the Obtain Initial Centroids block attempts to find all potential feature points in the image. Since the feature points are created with very bright IR LEDs, this corresponds to finding the brightest pixels and returning their image coordinates. If a previous solution exists, meaning the algorithm has identified the feature on a previous image, then the search region is truncated to an area surrounding the previous solution points. Figure 3-6 shows this truncation on an example image. If a previous solution does not exist, then the program searches for bright pixels in the original image. This truncation serves two purposes:

1. Decrease the amount of data to process, and

2. Eliminate any noise in the truncated space.

To search for feature points within an image, the following functions are performed to produce a list of image coordinates (see Figure 3-6):

- Threshold and binarize the image by pixel intensity to find the brightest pixels
- "Close" the resulting contours by dilating and then eroding the image
- Return the coordinates of the centroids of the resulting contours

Figure 3-6: Top: Image truncation in the case of a previous solution. Top Left: Original image. Top Right: Truncated image with significantly reduced search space. Bottom: The feature extraction process. Bottom Left: Original image. Bottom Middle: Thresholded, binarized, and morphed to simplify finding the brightest pixels. Bottom Right: Pixels of interest identified.

If the program identifies fewer coordinates than there are known feature points, the algorithm fails and returns that it cannot find the feature. If it identifies the same number of coordinates as feature points, then it passes the coordinates to the Assign Points to Rows block. If it identifies too many coordinates, then it passes them to the First Centroid Filter block. An obstructed feature or poorly-aimed camera can cause too few coordinates, while reflections from the sun or LED's can cause too many coordinates. The first and second centroid filters robustly identify the correct coordinates in the case of noise. However, it is still possible for noise to cause an incorrect solution or no solution. In practice, an incorrect solution is typically thousands of meters from the expected solution, making it easy to filter out, and no solution occurs sparsely and only for single frames at a time.
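In OpenCV terms, the three extraction steps listed above might look like the following sketch. The threshold value and kernel size are illustrative assumptions rather than the tuned parameters of the project software.

```python
import cv2
import numpy as np

def extract_centroids(gray, intensity_thresh=240):
    """Return (u, v) centroids of the brightest blobs in a grayscale frame."""
    # Threshold and binarize by pixel intensity to keep only the brightest pixels.
    _, binary = cv2.threshold(gray, intensity_thresh, 255, cv2.THRESH_BINARY)

    # "Close" the resulting blobs: dilate, then erode, to merge fragmented pixels.
    kernel = np.ones((5, 5), np.uint8)
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Compute the centroid of each remaining contour.
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:
            centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids
```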

38 3.4.2 First Centroid Filter

The program contains two noise-reduction blocks, called the First and Second Centroid Filters. Each uses a different technique to remove false-positive detections. The First Centroid Filter block is called upon only if there are more coordinates returned than there are known feature points. This block eliminates false positives by clustering the coordinates along the x and y image axes. The idea is that the correct feature points will be strongly correlated, while any noise will likely not be. The First Centroid Filter process is as follows:

1. Cluster by x coordinate
2. Cluster by y coordinate
3. Return coordinates that exist in both clusters

If this block succeeds in reducing the number of detected points to the number of actual feature points, then the algorithm passes the coordinates to the Assign Points to Rows block. If too many coordinates remain, then they are passed to the Second Centroid Filter block. If there are fewer coordinates than feature points after this block, then the algorithm returns the original list of centroids to be processed by the Second Centroid Filter block.
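A minimal sketch of this x/y clustering idea is shown below. The gap-based one-dimensional clustering helper, the choice of the largest cluster on each axis, and the max-gap value are assumptions made for illustration, since the thesis describes the approach only at a high level.

```python
def cluster_1d(values, max_gap):
    """Group scalar values into clusters separated by gaps larger than max_gap."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters, current = [], [order[0]]
    for prev, idx in zip(order, order[1:]):
        if values[idx] - values[prev] > max_gap:
            clusters.append(current)
            current = []
        current.append(idx)
    clusters.append(current)
    return clusters

def first_centroid_filter(centroids, max_gap=50.0):
    """Keep centroids that fall in both the dominant x-cluster and y-cluster."""
    xs = [c[0] for c in centroids]
    ys = [c[1] for c in centroids]
    best_x = max(cluster_1d(xs, max_gap), key=len)
    best_y = max(cluster_1d(ys, max_gap), key=len)
    keep = sorted(set(best_x) & set(best_y))
    return [centroids[i] for i in keep]
```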

3.4.3 Second Centroid Filter

The Second Centroid Filter block is called after the first if there are still more coordinates than feature points. This block eliminates false positives by clustering the slopes of each possible pair of coordinates. The cluster method that this step uses is a simple algorithm that places the input data into groups based upon the gap size and an input max-gap argument. The actual feature points are precisely arranged in two rows in which the slope of each row is the same. Additionally, any two points within the same row will create a line with a slope that is the same as that of the overall row. The Second Centroid Filter process is as follows:

1. Create all possible pairs of coordinates

2. Cluster pairs by slope between points
3. Return all centroids that exist in the cluster of size 12

There are two rows of four feature points, leading the Second Centroid Filter to expect a cluster of 12 pairs of points. Each row of four has $\binom{4}{2} = 6$ pairs of points with nearly the same slope, for a total of 12 pairs in the cluster of interest.

If there are fewer coordinates than feature points after this block, then the algorithm returns that it cannot find the feature. Otherwise, the algorithm passes the coordinates to the Assign Points to Rows block.
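A sketch of this pairwise-slope test is shown below. It reuses the gap-based cluster_1d helper from the previous sketch, and the slope tolerance is an assumed value rather than the project's tuned parameter.

```python
from itertools import combinations

def second_centroid_filter(centroids, slope_gap=0.05):
    """Keep centroids belonging to the cluster of 12 near-equal-slope pairs."""
    pairs = list(combinations(range(len(centroids)), 2))
    slopes = []
    for i, j in pairs:
        dx = centroids[j][0] - centroids[i][0]
        dy = centroids[j][1] - centroids[i][1]
        slopes.append(dy / dx if abs(dx) > 1e-9 else float("inf"))

    # Two rows of four points yield 2 * C(4, 2) = 12 pairs with nearly the same slope.
    for cluster in cluster_1d(slopes, slope_gap):
        if len(cluster) == 12:
            keep = {idx for k in cluster for idx in pairs[k]}
            return [centroids[i] for i in sorted(keep)]
    return None  # feature not found
```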

3.4.4 Assign Points to Rows

To apply OpenCV's solvePnP function, each real-world feature point must be assigned an image coordinate. The Assign Points to Rows block accomplishes this task. In practice, there are almost always the correct number of coordinates remaining when this block is called; however, it is capable of correctly assigning points even if noise coordinates exist in the input coordinates.

The Assign Points to Rows process is as follows:

1. Create subsets of four centroids
2. Perform a linear fit on each subset
3. Keep the two lowest-residual subsets
4. Assign each as the top or bottom row based on position in the image
5. Sort each row from left to right in the image
6. Return the two sorted rows

In practice, the code differs slightly from the above process in order to assign the points more efficiently. Also, the code will rarely need to perform a linear fit on every subset of four points since finding one low-residual subset identifies the other. The program passes the returned sorted rows to the Solve PnP block.
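One possible realization of this row assignment is sketched below. It is a brute-force version that ignores the efficiency shortcuts mentioned above, and it uses np.polyfit for the linear fits; the details are illustrative, not the exact project implementation.

```python
from itertools import combinations
import numpy as np

def assign_points_to_rows(centroids):
    """Split eight centroids into a top and bottom row, each sorted left to right."""
    scored = []
    for subset in combinations(range(len(centroids)), 4):
        pts = np.array([centroids[i] for i in subset])
        # Residual of a degree-1 fit v = a*u + b over the four points.
        _, residuals, *_ = np.polyfit(pts[:, 0], pts[:, 1], 1, full=True)
        scored.append((float(residuals[0]) if len(residuals) else 0.0, subset))
    scored.sort()

    # Keep the two lowest-residual, non-overlapping subsets.
    row_a = set(scored[0][1])
    row_b = next(set(s) for _, s in scored[1:] if not set(s) & row_a)

    def sorted_row(idx_set):
        return sorted((centroids[i] for i in idx_set), key=lambda c: c[0])

    rows = [sorted_row(row_a), sorted_row(row_b)]
    # The row with the smaller mean v coordinate is higher in the image (top row).
    rows.sort(key=lambda r: np.mean([c[1] for c in r]))
    top_row, bottom_row = rows
    return top_row, bottom_row
```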

40 3.4.5 Solve PnP

A Perspective-n-Point (PnP) problem is one in which the pose of a camera is determined from n points in an image that correspond to n known real-world coordinates. With three image points, there are eight possible solutions for the pose. By including a fourth point, the ambiguity is removed and only one solution remains. Additional image points serve to reduce error due to noisy measurements. This project utilizes eight image points in order to balance speed and accuracy. The Assign Points to Rows block pairs each real-world feature point with an image coordinate. The program then calls the OpenCV function solvePnP. This function is an iterative least squares method that finds the pose which minimizes reprojection error. It takes as inputs the 2D image coordinates and the 3D real-world coordinates of the feature points and returns the 3D camera pose. The Solve PnP block returns the position and orientation as calculated by the solvePnP function.

3.4.6 Summary

The monocular localization system described in this section is a combination of hardware and software that allows for simple camera pose extraction from significant distances in feature-starved environments. The active features with known real-world coordinates work together with the optical filter to considerably reduce noise before any light even reaches the camera sensor. With the feature geometry in mind, the image processing software easily discards any noise that does reach the camera image. The result is an accurate pose solution with minimal required image processing.

Chapter 4

Flight System

The flight system comprises all hardware and software required for autonomous flight and data capture, including a ground control station (GCS) and unmanned aerial vehicle (UAV). A GCS operator mans a laptop connected to the GCS to command the UAV, and an active tether connects the GCS to the UAV. A Pixhawk on the GCS and another on the UAV allow for simple communication between the GCS and UAV, both for navigation and for updating the mission.

4.1 Hardware

The primary hardware systems are the ground control station (GCS) and the unmanned aerial vehicle (UAV). The GCS tethers the UAV using an active reel and allows an operator to command the UAV. These commands include takeoff and land, waypoint navigation, and other standard flight procedures such as arming the UAV. Figure 4-1 shows all hardware of the flight system.

4.1.1 GCS Hardware

The primary GCS hardware includes an ODROID microcomputer, a tether with a reel motor controller, and a GPS-enabled Pixhawk autopilot. A laptop connects to the ODROID microcomputer via an ethernet cable to allow an operator to issue commands to the system. Figure 4-2 shows the GCS.

Figure 4-1: This diagram shows all flight system hardware and associated connections.

The ODROID microcomputer controls the reel motor controller and interfaces with the Pixhawk autopilot. This ODROID serves as a low-cost, low-power platform for controlling the ground station. Running off of an eMMC chip, a lightweight version of Ubuntu executes all GCS software, including the reel motor controller code and mission commanding code. The GCS operator uses a laptop to SSH into the GCS ODROID in order to control the system.

The reel motor controller provides active tether manipulation during flight. An emergency disable switch will immediately stop the reel motor if needed. The tether is a braided ultra high molecular weight (UHMW) polyethylene line. This type of line is common for fishing and features the low weight, small diameter, and high strength perfect for this application. The tether is controlled to vary its length during flight so that there is enough tether to reach the desired locations, but not so much that the tether risks becoming tangled in the UAV rotors. The tether length is automatically actuated based upon the distance between the GCS and the current mission waypoint.

The Pixhawk autopilot serves as a source of GPS location for relative navigation and provides a low-latency communication method for issuing commands to the UAV. Section 4.2.4 describes this communication method in detail.

Figure 4-2: Left: The GCS is contained within a splash-resistant box. Right: The weatherproof box contains all GCS hardware, including the power supply, reel motor controller and tether, ODROID microcomputer, and Pixhawk autopilot.

Figure 4-3: The fully-assembled UAV.

4.1.2 UAV Hardware

The primary UAV hardware includes an ODROID microcomputer, a GPS-enabled Pixhawk autopilot, and the vehicle itself. Figure 4-3 shows the UAV.

The ODROID microcomputer on board the UAV connects to both the GCS Pixhawk and the UAV Pixhawk. The connection to the GCS Pixhawk provides low-latency communication between the GCS and UAV, as described in Section 4.2.4. Once the ODROID has the GCS location from the GCS Pixhawk, it calculates the desired GPS coordinates of the UAV based upon the current relative mission waypoint. It sends this desired GPS coordinate to the UAV Pixhawk as a navigation command, and the UAV then navigates to the specified location. Section 4.2.3 describes the relative-to-absolute coordinate calculation in detail.

The UAV's Pixhawk autopilot listens to the UAV ODROID for navigation commands and controls the motors to achieve the desired position. The Pixhawk [28] is an open-hardware project that provides effective, low-cost autopilot capabilities.

The UAV itself is a quadcopter built on a Tarot Iron Man 650 frame. It features four 400KV motors and a six-cell 6600 milliamp-hour lithium polymer battery, which provides a 30 minute flight time. The UAV has a gross takeoff weight of approximately 2.5 kilograms (5.5 pounds). The monocular localization system described in Chapter 3 uses a 1.3 megapixel PointGrey Chameleon 3 monochrome camera. Other UAV hardware includes telemetry radios, electromagnetic interference (EMI) shielding, etc. Appendix A lists a full bill of materials.

4.1.3 Special Hardware Considerations

Certain challenges of this project have spawned unique or otherwise interesting solutions that warrant discussion. Vibration issues, electromagnetic interference, and tether behavior have each created obstacles common to this application and to UAV flight in general. This subsection discusses this thesis's solutions to these problems.

The first interesting challenge concerns any application utilizing accelerometers and gyroscopes. During its first flights, the UAV exhibited errant behavior, occasionally showing controlled flight into terrain. The logs showed high vibration values. After 3D printing and installing an anti-vibration mount, the autopilot displayed manageable vibration levels. Figure 4-4 shows the vibration mount [22] and a before-and-after plot of the vibration levels.

Figure 4-4: Vehicle vibration leads to large covariances within the Pixhawk autopilot. The anti-vibration mount (Top) reduces vibration of the autopilot during flight by a factor of more than 3 (Bottom).

The second interesting challenge comes as a result of using a microcomputer on board the UAV. The ODROID XU4 features two USB 3.0 ports, which are notorious for electromagnetic interference (EMI) [8]. This EMI causes GPS signal degradation that causes the UAV to hold its position less accurately. Installing EMI shielding, as shown in Figure 4-5, helps to reduce this interference and increase position accuracy.

Figure 4-5: Electromagnetic interference shielding around the ODROID reduces degradation of GPS signal accuracy.

The final notable piece of hardware is a tether tensioner. The plumbing of the tether within the GCS box requires constant tension or else it slips off the tension sensor and other routing pulleys. The tether tensioner is a set of simple spring-actuated pulleys that maintain light tension on the portion of the tether within the GCS box. Figure 4-6 shows the tether tensioner in the GCS box.

4.2 Software

This project's software allows an operator to autonomously navigate the unmanned aerial vehicle (UAV) to a position relative to the ground control station (GCS) based upon a high-level command. All code for this project is open-source¹.

¹All source code for this project is available at https://github.com/creare-com/aerowake and https://bitbucket.org/Brndn004/creare/src/master/.

48 Figure 4-6: The tether tensioner is a set of spring-actuated pulleys that maintains light tension within the GCS box, preventing the tether from slipping off of the tension sensor and other pulleys.

4.2.1 GCS Software

The GCS software comprises reel motor controller code and a text-based interface that allows the operator to quickly command the UAV. The GCS software automatically logs all issued commands in addition to tether status and GCS location. The reel motor controller code² fine-tunes the behavior of the reel motor for optimal performance. Adjustable motor acceleration, deceleration, and RPM establish the behavior that either deals out or reels in line to the desired tether length. A simple custom API makes precise motor control available to the main GCS program. The main GCS software, aptly named main.py, interfaces with the GCS operator, providing easy-to-understand UAV and reel command functionality. Figure 4-7 shows the operator interface. When the operator issues one of the allowed inputs, the GCS code communicates with either the reel motor controller, the GCS Pixhawk, or both. Through this interface, the operator can manually control the reel and command flight objectives. Flight objectives go through the GCS Pixhawk as explained in Section 4.2.4, while reel commands go directly to the reel motor controller.

²John Walthour of Creare, LLC. wrote and tested all low-level reel motor control code.

49 Allowed input:

  -                        Kill UAV motors when input starts with minus sign
  listen                   Tell UAV to start listening to commands
  arm                      Command UAV to arm throttle
  disarm                   Command UAV to disarm throttle
  takeoff                  Command UAV to takeoff to 10 m
  Waypoint Number (0-1)    Navigate to designated waypoint
  clear                    Clear current waypoint
  land                     Command UAV to land
  help                     Show this list of allowed inputs
  quit                     Terminate program

Enter command:

Figure 4-7: The main GCS code interfaces with the GCS operator via a simple text-based command line.

4.2.2 UAV Software

The UAV software utilizes the Robot Operating System (ROS) [6] and comprises a yaw-commanding ROS node and the main program that issues commands to the UAV Pixhawk. The UAV software automatically logs all issued commands, all Pixhawk variables including UAV location, and video from the onboard camera. ROS is a "collection of tools, libraries, and conventions that aim to simplify the task of creating complex and robust robot behavior" [6]. In essence, ROS provides Python-based tools that enable straightforward communication between various Python scripts. This project utilizes ROS on the UAV in order to employ automatic multi-threaded behavior without the headache of programming multiple threads by hand. In particular, the main UAV code is a ROS node that subscribes to a yaw-commanding node in order to yaw the UAV based upon what the camera sees. Figure 4-8 displays how the UAV uses ROS.

The first ROS node of interest is the camera node. This node first sets camera parameters and calibration settings and then captures an image from the camera. It publishes this image on the /camera/image_mono topic. Any other node that wishes to use this image need simply subscribe to the /camera/image_raw topic.

Figure 4-8: Multiple ROS nodes capture, process, and act upon the camera image.

The yaw-commanding node is the first node to subscribe to the image topic. Whenever an image is delivered to the /camera/image_raw topic, a callback of the yaw-commanding node processes the image to generate a yaw command in degrees. This yaw command is then published to the /yaw_deg topic. The yaw value is calculated based upon the location of the active features in the image. The magnitude of the yaw increases linearly with the distance of the centroid of the lights from the center of the image. Figure 4-9 shows an example of this calculation.
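A sketch of how such a yaw-commanding node could look; the proportional gain, the brightness threshold, and the node name are illustrative assumptions rather than the project's actual values.

    #!/usr/bin/env python
    # Sketch of a yaw-commanding node: yaw magnitude grows linearly with the horizontal
    # offset of the detected-feature centroid from the image center.  Gain, threshold,
    # and node name are illustrative assumptions.
    import rospy
    import cv2
    import numpy as np
    from cv_bridge import CvBridge
    from sensor_msgs.msg import Image
    from std_msgs.msg import Float32

    DEG_PER_PIXEL = 0.05  # assumed proportional gain

    class YawCommander(object):
        def __init__(self):
            self.bridge = CvBridge()
            self.pub = rospy.Publisher('/yaw_deg', Float32, queue_size=1)
            rospy.Subscriber('/camera/image_raw', Image, self.image_callback, queue_size=1)

        def image_callback(self, msg):
            img = self.bridge.imgmsg_to_cv2(msg, desired_encoding='mono8')
            # With the bandpass filter, the active features are the brightest pixels.
            _, mask = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY)
            ys, xs = np.nonzero(mask)
            if xs.size == 0:
                return  # no features detected; issue no yaw command
            centroid_x = float(xs.mean())
            offset = centroid_x - 0.5 * img.shape[1]   # pixels right (+) or left (-) of center
            self.pub.publish(Float32(DEG_PER_PIXEL * offset))

    if __name__ == '__main__':
        rospy.init_node('image_yaw_creator')
        node = YawCommander()
        rospy.spin()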

The next node of interest is the main node, which, like the GCS program, is named main.py. This node issues flight commands, subscribes to the /yaw_deg topic, and listens to the GCS Pixhawk for mission updates, as described in Section 4.2.4. When the GCS operator issues a command, the UAV main node interprets it and commands the UAV Pixhawk accordingly. At the same time, this node is subscribed to the /yaw_deg topic. Whenever the yaw-commanding node issues a yaw command, the UAV main node takes that value and directs the UAV Pixhawk to the specified yaw. ROS makes this complex interaction between systems and scripts simple.

The final node that subscribes to the /camera/image_raw topic is rosbag. The rosbag command allows logging of all messages published to a specified topic. In this way, ROS provides the functionality to capture video during flight.

Figure 4-9: The horizontal distance (in pixels) from the centroid of the detected points to the center of the image determines the magnitude of the yaw command.

4.2.3 Waypoint Commands in a Relative System

In the system thus far, the GCS operator can issue waypoint commands and the UAV will navigate to the specified waypoint relative to the GCS. This section explains the non-trivial relative-position calculation.

The first aspect of relative waypoint navigation is the mission. The mission is a simple Python file located on the GCS and UAV that describes the desired 3D position of the UAV relative to the GCS at each waypoint index. The GCS uses this file in order to automatically command the reel to an appropriate length, while the UAV uses this file to generate navigation objectives. This file is described in the GCS coordinate frame with the x-axis aft of the ship, y-axis abeam to port, and z-axis down. Figure 4-10 shows the GCS coordinate frame.
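The thesis does not show the mission file itself, so the structure below is only an assumed illustration: a list of waypoints expressed in the GCS frame plus a helper deriving a tether length from each offset.

    # Hypothetical mission.py: desired UAV position relative to the GCS at each waypoint
    # index, expressed in the GCS frame (x aft of the ship, y abeam to port, z down).
    # The project's actual file format is not specified; this is illustrative only.
    import math

    MISSION = [
        # (x_aft [m], y_port [m], z_down [m])
        (20.0, -10.0, -8.0),   # waypoint 0
        (20.0,   0.0, -8.0),   # waypoint 1
        (20.0,  10.0, -8.0),   # waypoint 2
    ]

    def tether_length(waypoint):
        """The GCS can derive an appropriate tether length from the waypoint offset."""
        x, y, z = waypoint
        return math.sqrt(x * x + y * y + z * z)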

Once described in human-understandable GCS coordinates, a program rotates the mission by the ship's bearing in order to place the mission in the North-East-Down (NED) coordinate frame. Figure 4-10 shows an example mission in GCS coordinates and NED coordinates.

Figure 4-10: This top view of the ship shows the mission in the GCS coordinate frame and the North-East-Down (NED) coordinate frame after rotation by the ship's bearing (Waypoint 1: GCS (20, -10), NED (1.3, -22.3); Waypoint 2: GCS (20, 0), NED (10.0, -17.3); Waypoint 3: GCS (20, 10), NED (18.7, -12.3)). GCS coordinates are more intuitive and easier to program before each flight, but the Pixhawk autopilot uses NED coordinates for navigation.

This rotation is completed on the UAV before flight begins so that the UAV navigates to the correct position when the GCS operator issues a waypoint. The rotation can be fine-tuned during flight from the GCS.
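A sketch of the GCS-to-NED rotation; the sign convention below reproduces the example waypoints in Figure 4-10 for a ship bearing of 120 degrees, though the project's actual implementation may differ.

    import math

    def gcs_to_ned(waypoint, bearing_deg):
        """Rotate a GCS-frame waypoint (x aft, y port, z down) into North-East-Down
        using the ship's bearing.  Sign conventions are assumed for illustration."""
        x_aft, y_port, z_down = waypoint
        b = math.radians(bearing_deg)
        # Aft points opposite the ship's heading; port points 90 degrees to its left.
        north = -x_aft * math.cos(b) + y_port * math.sin(b)
        east = -x_aft * math.sin(b) - y_port * math.cos(b)
        return north, east, z_down

    # Example: gcs_to_ned((20.0, 10.0, 0.0), 120.0) gives roughly (18.7, -12.3, 0.0),
    # matching Waypoint 3 in Figure 4-10.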

With a rotated mission described in the NED coordinate frame, the UAV is ready to fly. When the GCS operator commands a waypoint during flight, the UAV ODROID calculates the desired GPS coordinate of the UAV relative to the GCS. First the UAV ODROID polls the GCS Pixhawk for its GPS location and altitude. Then the UAV ODROID performs the following great-circle calculations to determine the desired UAV location and altitude. Finally, the UAV ODROID commands the UAV Pixhawk to navigate to the determined latitude, longitude, and altitude. This calculation is performed multiple times per second so that the UAV will maintain its position relative to the GCS even when the GCS is moving in absolute coordinates.

Variables:

Lat_GCS = GCS latitude, Lat_UAV = desired UAV latitude
Lon_GCS = GCS longitude, Lon_UAV = desired UAV longitude
Alt_GCS = GCS altitude, Alt_UAV = desired UAV altitude
dAft = desired distance aft of GCS, dPort = desired distance port of GCS
dDown = desired distance down from GCS, R = radius of Earth

Equations:

$$dLat = \frac{dAft}{R}, \qquad dLon = \frac{dPort}{R \cos\!\left(\frac{\pi \, Lat_{GCS}}{180}\right)}$$

$$Lat_{UAV} = Lat_{GCS} + dLat \cdot \frac{180}{\pi} \;[\text{deg}], \qquad Lon_{UAV} = Lon_{GCS} + dLon \cdot \frac{180}{\pi} \;[\text{deg}]$$

$$Alt_{UAV} = Alt_{GCS} - dDown \;[\text{meters}]$$
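A minimal Python sketch of the offset calculation above; the Earth radius constant is an assumed value, and, as written in the thesis, dAft maps to the latitude offset and dPort to the longitude offset.

    import math

    EARTH_RADIUS_M = 6371000.0  # mean Earth radius; the project's exact value is not specified

    def relative_waypoint(lat_gcs, lon_gcs, alt_gcs, d_aft, d_port, d_down):
        """Compute the desired UAV latitude, longitude, and altitude from the GCS
        position and the desired offsets, following the equations above."""
        d_lat = d_aft / EARTH_RADIUS_M
        d_lon = d_port / (EARTH_RADIUS_M * math.cos(math.radians(lat_gcs)))
        lat_uav = lat_gcs + math.degrees(d_lat)
        lon_uav = lon_gcs + math.degrees(d_lon)
        alt_uav = alt_gcs - d_down
        return lat_uav, lon_uav, alt_uav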

4.2.4 A Note on Communications

The GCS operator issues flight commands to the UAV, such as arm/disarm, takeoff, go to waypoint, etc. As such, there must be some connection between the GCS and the UAV Pixhawk, which controls the UAV motor outputs. The obvious method for communicating with the UAV for control is to connect the GCS to the UAV Pixhawk via radio and issue flight commands directly from the GCS. This allows the GCS to poll both the GCS's and UAV's positions and subsequently issue relative navigation commands. Unfortunately, this method carries latency of up to three seconds between command and action. Such high latency is unacceptable, especially in emergency scenarios that require immediate motor shutdown. Instead, a special communication method provides low-latency control of the UAV.

Table 4.1: Transfer parameter values and associated meanings.

Value | Meaning
0-99  | Go to waypoint with the same index
100   | UAV will stop listening to GCS commands
101   | UAV will begin listening to GCS commands
102   | UAV will clear its current waypoint
103   | UAV will kill its motors immediately
356   | UAV will land according to its pre-programmed landing protocol
357   | UAV will takeoff according to its pre-programmed takeoff protocol
358   | UAV will disarm its motors
359   | UAV will arm its motors

Instead of the GCS ODROID connecting to the UAV Pixhawk, the UAV ODROID connects to the GCS Pixhawk. Figure 4-1 displays the exact connection path. This allows the UAV to poll both the GCS's and UAV's positions for relative navigation commands. In order for the GCS operator to issue flight commands to the UAV, a transfer parameter is set on the GCS Pixhawk.

The transfer parameter is an otherwise unused parameter on the GCS Pixhawk, such as "PIVOT_TURN_ANGLE" or "PIVOT_TURN_RATE". This parameter has no effect on GCS performance since the GCS Pixhawk is not connected to any actuators. The GCS ODROID sets the GCS Pixhawk parameter value according to the desired action, and then the UAV ODROID reads that parameter from the GCS Pixhawk and enacts the associated action. Table 4.1 lists each possible parameter value and corresponding action. In effect, an integer value is sent over radio instead of a complicated, high-latency Pixhawk command.

The drawback to this method is that each desired action must be programmed before flight into both the GCS and UAV software - with a radio connection from the GCS to the UAV Pixhawk, the GCS could use standard Pixhawk commands directly. However, the near-instant communication of this method makes up for this minor hindrance. The same strategy can be utilized in reverse on a separate GCS parameter to allow the UAV to acknowledge receipt of commands.
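The transfer-parameter mailbox could be driven with pymavlink as sketched below; the connection handling, the reuse of PIVOT_TURN_ANGLE, and the helper names are assumptions for illustration, not the project's actual implementation (which may use a different MAVLink wrapper).

    # Sketch of the transfer-parameter scheme using pymavlink.  'master' is an open
    # mavutil connection to the GCS Pixhawk on each side of the link.
    from pymavlink import mavutil

    TRANSFER_PARAM = 'PIVOT_TURN_ANGLE'  # unused Rover parameter repurposed as a mailbox

    def gcs_send(master, value):
        """GCS side: write a command code from Table 4.1 into the GCS Pixhawk parameter."""
        master.mav.param_set_send(master.target_system, master.target_component,
                                  TRANSFER_PARAM.encode('ascii'), float(value),
                                  mavutil.mavlink.MAV_PARAM_TYPE_REAL32)

    def uav_poll(master):
        """UAV side: read the parameter back from the GCS Pixhawk over the radio link."""
        master.mav.param_request_read_send(master.target_system, master.target_component,
                                           TRANSFER_PARAM.encode('ascii'), -1)
        msg = master.recv_match(type='PARAM_VALUE', blocking=True, timeout=2)
        return int(msg.param_value) if msg is not None else None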

4.2.5 Summary

The flight system described in this section allows for relative navigation while using a Pixhawk flight controller. The unmanned aerial vehicle (UAV) keeps its position relative to the ground control station (GCS) according to the commands sent by the GCS operator. This relative navigation system allows the sensor payload, a camera and omni-directional airprobe, to collect data and precisely position it via the monocular localization system described in Chapter 3.

Chapter 5

Testing and Results

The monocular localization and flight systems completed numerous validation tests. The monocular localization system exhibits accuracy and precision during tests within a Vicon [31] motion capture space and during outdoor tests. Before full-fledged flight testing, Software-in-the-Loop (SITL) simulations proved the integrity of the flight system software. Later, ground tests exhibited the correct integration of the monocular localization and flight systems, leading to numerous successful stationary and mobile ground control station flights.

5.1 Monocular Localization System Ground Tests

The monocular localization system design provides high accuracy at long ranges. Before conducting long-range outdoor tests, however, the system underwent development and verification with the help of RAVEN (Real-time indoor Autonomous Vehicle test ENvironment) in the Aerospace Controls Lab at MIT [1]. Once operational indoors with RAVEN's Vicon system as ground truth, the monocular localization system performed long-range outdoor ground tests with GPS serving as a position estimate comparison and the onboard IMU serving as a ground truth for orientation.

The tests discussed in this section demonstrate the monocular localization system's: 1) solution stability, 2) solution accuracy, and 3) ability to provide pose estimates for an entire simulated flight.

5.1.1 Vicon

The monocular localization system experienced initial testing and verification in MIT's RAVEN facility. RAVEN is an experiment facility featuring a Vicon motion capture system that reports sub-centimeter-accuracy position and orientation in real time. In the past, student researchers have used RAVEN for rapidly prototyping flight controllers, developing navigation and trajectory-planning methods, and producing vision-based sensing algorithms, among other applications [1]. This project utilizes Vicon to provide a ground truth value for the calculated pose estimate. In order to provide a pose estimate, Vicon must first see the desired object. To do so, Vicon uses infrared cameras and highly reflective object markers. A set of reflective markers attached to each object of interest provides a number of feature points that allow for pose calculation. The Vicon system is capable of tracking multiple uniquely-marked objects at the same time.

Hardware

Monocular localization hardware for Vicon tests includes a 1:35 scale model of the ground control station (GCS) outfitted with 5 mm LED's, a PointGrey camera on a hand-held mount, and artificial sources of noise (detailed at the end of Section 5.1.1) to test robustness. Reflective markers placed on the scale GCS and camera mount allow Vicon to produce pose estimates for each. These pose estimates provide a ground truth value for the relative pose of the camera and scale GCS.

The 1:35 scale model of the GCS comprises an extruded aluminum frame equipped with LED's that serve as active feature points. Since RAVEN uses infrared cameras to track objects, the scale features cannot be infrared LED's as in the outdoor system. Instead, the LED's on the scale model are green. The infrared wavelength of the full-scale system is optimized to overcome the noise from direct sunlight; since there is not significant sunlight indoors, the green LED's are plenty bright for indoor detection and exhibit no other differences. The 1:35 scale comes from the size of the scale features. Standard 5 mm LED's are almost exactly 1/35th the size of the active features in the full-size system.

58 "~CLa Aerospace Controls Lab

Figure 5-1: The 1:35 scale ground control station uses green LED's during indoor Vicon testing. features in the full-size system. By using this 1:35 scale for spacing, the GCS scale model ensures that the system is entirely to-scale'. Figure 5-1 shows the 1:35 scale GCS with green active features. The camera mount uses the same extruded aluminum to form a hand-held struc- ture for the camera. The ODROID microcomputer from the flight system records video. The camera specs are detailed in Section 3.3.2. The only difference between the camera system used in Vicon tests and that used in outdoor tests is the optical filter. The Vicon test setup has a 525 nanometer bandpass filter while the outdoor test camera uses an 850 nanometer bandpass-interference filter. The hand-held mount allows for simple movement of the camera relative to the scale GCS.

Testing Process

The testing process comprises mission planning, system setup, and data collection. In this context, mission planning refers to choosing the movement of the camera mount during data logging.

¹If the spacing and LED size have a different scale relative to the full-size system, then the LED's could be either too large or too small in the image. Either case produces potential for incorrect assumptions about the system, which might lead to a sub-standard algorithm.

Different types of tests can provide results on different aspects of the system. To thoroughly test the system, video is taken: 1) near the feature and far from the feature, 2) at various angles relative to the feature, and 3) while stationary and while moving. Testing both near to and far from the feature identifies distortion problems if the feature takes up the entire image. Video from multiple angles shows the ability of the pose solution to decouple position and orientation. Finally, a stationary test shows stability of the solution over time, while a moving test shows that the pose solution is maintained while the camera moves.

Vicon, the camera, and the scale GCS must be set up to record data. For Vicon, this involves calibrating the motion capture cameras and setting up the object models for the scale GCS and camera mount. A Robot Operating System (ROS) [6] node connects to the Vicon system and publishes pose information to a ROS topic at 100 Hz for each identified object. Setting up the camera to record data also involves using the ODROID for calibration and then starting a ROS node to access the camera and publish images to a ROS topic. The scale GCS is placed on the floor and plugged in to power the LED's. Tape marks on the floor indicate parallel and circular lines extending from the scale GCS to assist with accurately moving the camera mount.

At this point, the ROS nodes are publishing Vicon pose estimates and camera images, but nothing is recording the data. In order to actually save the data, the ODROID and Vicon computer utilize ROS's rosbag functionality. Rosbags can subscribe to any number of ROS topics, recording every message that is sent over each topic. In this case, the rosbag saves every image published to the camera image topic and every Vicon pose solution. Once the system is recording data, the user carries the camera mount according to the chosen mission. Finally, the rosbag is closed and the user processes the data.
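For post-processing, the recorded bag can be read back with the rosbag Python API, for example as sketched below; the bag file name and the Vicon topic name are assumptions.

    # Sketch of post-processing a recorded rosbag: collecting camera images and Vicon
    # poses for comparison.  Topic and file names are illustrative assumptions.
    import rosbag
    from cv_bridge import CvBridge

    bridge = CvBridge()
    bag = rosbag.Bag('vicon_test.bag')  # hypothetical file name
    images, vicon_poses = [], []
    for topic, msg, t in bag.read_messages(topics=['/camera/image_raw',
                                                   '/vicon/camera_mount/pose']):
        if topic == '/camera/image_raw':
            images.append((t.to_sec(), bridge.imgmsg_to_cv2(msg, 'mono8')))
        else:
            vicon_poses.append((t.to_sec(), msg))
    bag.close()
    print('%d images, %d Vicon poses recorded' % (len(images), len(vicon_poses)))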

Results

Two Vicon tests demonstrate the monocular localization system. The first involves movement purely in the XY-plane in order to get an initial system validation. The second Vicon test exhibits movement similar to an actual flight. In each of these tests, the Vicon pose estimate serves as ground truth for the vision solution.

[Plot: Vicon Test, Movement in XY Plane. Translation: XY Plane; Vision vs. Vicon; X and Y Position in meters.]

Figure 5-2: The relative position of the camera with respect to the active feature. The vision solution degrades when the feature is close to the camera, as the feature takes up much more of the image and distortion is more difficult to eliminate.


For the first Vicon test, the UAV travels on the ground from point to point, covering the entirety of the Vicon test space. It begins and ends at the same location in the middle of the space. This test exhibits the behavior of the monocular localization system both near to and far from the feature.

Figures 5-2 and 5-3 show the relative translation (top-down view) and rotation (yaw), respectively, of the camera with respect to the feature during the first Vicon test.

The vision solution matches the Vicon estimate well for most of this test. When the camera is near to the feature, the position estimate given by the monocular localization system degrades. This is likely caused by distortion in the image. As the feature takes up more of the image, it is affected more by the radial distortion of the lens. With an ideal camera calibration, this distortion would not affect the solution; in this case, however, the small amount of remaining distortion skews the vision solution. The distortion does not appear to cause any degradation of the yaw solution given by vision.
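One way to suppress residual radial distortion before the PnP solve is to undistort the detected feature points with OpenCV; the intrinsics and distortion coefficients below are placeholders rather than the project's calibration, so this is only an illustrative sketch.

    # Removing residual lens distortion from detected feature points before the PnP
    # solve.  The camera matrix and distortion coefficients here are placeholders; the
    # real values come from the camera calibration discussed in Section 3.3.2.
    import numpy as np
    import cv2

    K = np.array([[800.0, 0.0, 640.0],
                  [0.0, 800.0, 480.0],
                  [0.0, 0.0, 1.0]])                # placeholder intrinsics
    dist = np.array([-0.25, 0.08, 0.0, 0.0, 0.0])  # placeholder radial/tangential terms

    pixels = np.array([[[652.3, 401.7]], [[688.1, 402.0]]], dtype=np.float32)  # detected LED centers
    # P=K re-projects the undistorted normalized points back into pixel coordinates.
    undistorted = cv2.undistortPoints(pixels, K, dist, P=K)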

[Plot: Vicon Test, Movement in XY Plane. Rotation: Yaw; Vision vs. Vicon; Time [s].]

Figure 5-3: The relative yaw of the UAV with respect to the feature.

The final thing to note for the first Vicon test is the decoupling of rotation and translation. The camera yaws significantly from point-to-point during this test, but the position estimate does not skew in any noticeable way as a result of this yawing. This suggests that the vision solution is successfully decoupling the position and orientation estimates.

In the second Vicon test, the camera moves similarly to how it would during a real flight. Carrying the camera through a flight-like path demonstrates that the position and orientation estimates are reliable in all axes.

Figures 5-4 and 5-5 show the results of the second Vicon test. The translation is far less skewed than in the first test, owing largely to the fact that there is little Y-movement near the feature. In the YZ-plane, the vision solution tracks the altitude (Z Distance) accurately. As in the first Vicon test, the rotation results from vision are near-identical to the Vicon estimates.

[Plots: Vicon Test, Similar to Flight. Translation: XY Plane and YZ Plane; Vision vs. Vicon; X, Y, and Z Distance in meters.]

Figure 5-4: The relative position of the camera with respect to the active feature.

[Plots: Vicon Test, Similar to Flight. Rotation: Yaw, Pitch, and Roll; Vision vs. Vicon; Time [s].]

Figure 5-5: The relative yaw, pitch, and roll of the camera with respect to the active feature.

Figure 5-6: The filters of the pose extraction software overcome the noise and identify the feature in the image. Left: Original image with all detected features. Right: Processed image with all noise eliminated.

Noise Rejection Results

The monocular localization system has many hardware features that attempt to reduce the noise that reaches the camera sensor. However, it is still possible for some false-positive readings to occur. The first and second noise filters, described in Sections 3.4.2 and 3.4.3 respectively, reduce any noise that ends up in the image. This section demonstrates the effectiveness of that noise reduction software.

In order to test noise reduction performance, there must first exist some noise in the image. Multiple green light sources generate noise for the Vicon tests. These lights have various sizes, intensities, and patterns, with some rotating or flashing. The noise sources create a large number of false-positive detections in the pose extraction algorithm. The pose extraction algorithm easily overcomes the noise and correctly identifies the feature. Figure 5-6 displays the effect of the noise filters².

The noise filters are generally very good at removing unwanted noise, but there are situations where noise makes feature identification impossible. The first filter calculates x and y clusters as a way to reduce noise. The active features are evenly spaced, and so tend to cluster with each other more often than with noise.

²https://youtu.be/UHmD5zGV13g shows a video of the real-time noise rejection.

However, if the image features just the right amount of noise in the correct image locations, it can throw off the clustering filter and potentially eliminate true feature points from the image. The second filter performs a linear regression fit and keeps the points with the lowest residual error. The active features are carefully arranged into two rows with the same slope, so the residual error is typically extremely small. If noise in the image lines up perfectly with the true features or falls in a straight line, it can throw off the filter and cause it to misidentify the true feature points. In outdoor tests, it is extremely unlikely for any noise to make it into the image. However, if noise does enter the image during an outdoor test, it is typically only a single false-positive detection that the noise filters easily sort out.
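A simplified sketch of the two filters described above; the thresholds are assumptions, and the single-line regression shown here glosses over the two-row beacon arrangement (the actual filter presumably handles each row, or the shared slope, explicitly).

    # Sketch of the two software noise filters: (1) keep points that cluster in x and y
    # with other points, and (2) fit a line and keep the points with the lowest residuals.
    import numpy as np

    def cluster_filter(points, radius=40.0, min_neighbors=2):
        """Keep points with at least min_neighbors other points within radius pixels in
        both x and y; evenly spaced beacons cluster, while stray noise does not."""
        points = np.asarray(points, dtype=float)
        keep = []
        for p in points:
            dx = np.abs(points[:, 0] - p[0])
            dy = np.abs(points[:, 1] - p[1])
            neighbors = np.sum((dx < radius) & (dy < radius)) - 1  # exclude the point itself
            if neighbors >= min_neighbors:
                keep.append(p)
        return np.array(keep)

    def regression_filter(points, n_keep=8):
        """Fit a line y = m*x + b to the candidates and keep the n_keep points with the
        smallest residuals; the beacons are arranged in rows sharing the same slope."""
        points = np.asarray(points, dtype=float)
        m, b = np.polyfit(points[:, 0], points[:, 1], 1)
        residuals = np.abs(points[:, 1] - (m * points[:, 0] + b))
        return points[np.argsort(residuals)[:n_keep]]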

5.1.2 Outdoor

With the monocular localization system performing as desired during indoor tests in a laboratory environment, outdoor tests begin. The tests discussed in this section demonstrate the monocular localization system's ability to: 1) operate at large distances, 2) operate in bright sunlight, and 3) correctly recover the solution when one or more features leave the image frame.

Hardware

Monocular localization outdoor hardware includes the active features and camera described in Section 3.3. It also includes the unmanned aerial vehicle (UAV) and a significant number of extension cords to provide power to all eight beacons. The active features create a full-scale model of the final system. Four beacons hang in a row from a chain-link fence, and four others sit in a row on the ground. Figure 5-7 shows the active features installed at the outdoor test location. The camera uses the 850 nanometer bandpass-interference optical filter since the outdoor beacons are infrared. The UAV serves as a camera mount and as a GPS-recording device. The UAV does not fly in these initial monocular localization outdoor ground tests, but still captures video and GPS data.

Figure 5-7: Four beacons hang from a chain-link fence and four sit on the ground.

Testing Process

The testing process comprises mission planning, system setup, and data collection. The tests performed match those from the Vicon testing described in Section 5.1.1. Additionally, mission planning for outdoor monocular localization system testing is the same as for indoor testing (see Section 5.1.1).

Outdoor system setup has a large initial time cost due to the effort of setting up each beacon and its associated extension cord. First, each LED attaches to a wooden platform. The hanging platforms install at the top of a chain-link fence, and the ground platforms are set on the ground. Precise measurements ensure that each beacon sits where it needs to. Once all beacons are in place, extension cords are run from the generator (or an outdoor wall plug) to each beacon. The UAV is typically fully assembled and requires no extra effort on testing day. The user records the precise relative position of each beacon.

The UAV and an additional Pixhawk collect all relevant test data. The extra Pixhawk captures the GPS coordinates and altitude of the beacon that marks the origin of the feature set, while the UAV records its coordinates, altitude, and video.

Results

To validate the monocular localization system outdoors, a reliable measurement method must provide ground truth information. This information is normally provided by GPS; however, the GPS results (later in this section) are abysmal compared to the vision results.

[Plot: Ground Validation Test. Translation: XY Plane; Measured vs. Vision vs. EKF; X and Y Distance in meters.]

Figure 5-8: The relative position of the UAV with respect to the active feature during the outdoor validation test.

As such, a number of locations measured with a 60 meter open-reel measuring tape provide the ground truth for this test. Cones mark each point of the test at which the user holds the UAV for a short duration. At each point, the user significantly rolls, pitches, and yaws the UAV while holding it in place relative to the active feature. The vision system generates a position and orientation estimate for this test. The onboard extended Kalman filter (EKF) also logs a pose estimate using its barometer, magnetometer, GPS, and inertial measurement unit (IMU) as sensors. The EKF position estimate is plotted alongside vision to demonstrate its unsuitability as ground truth.

Figures 5-8 and 5-9 show the results of the outdoor validation ground test. The vision solution lines up nicely with the measured positions. Conversely, the EKF position estimate created from GPS measurements is inaccurate by over 5 meters in either direction. The orientation estimates from vision are just as accurate as during the Vicon tests. The yaw solution from EKF diverges from the vision solution - not the other way around - as a result of a drifting magnetometer measurement.

[Plots: Ground Validation Test. Rotation: Yaw, Pitch, and Roll; Vision vs. EKF; Time [s].]

Figure 5-9: The relative orientation of the UAV with respect to the active feature during the outdoor validation test.

5.2 Flight System Ground Tests

The flight system serves as a method of collecting air-flow data and capturing video for post-processing to obtain precise relative positioning. Simulations prove the efficacy of the flight system software and permit rapid debugging. Then hardware tests validate all communication connections between the ground control station (GCS) and unmanned aerial vehicle (UAV).

5.2.1 Software-in-the-Loop Simulations

The flight system utilizes custom software to command and control the UAV. A Software-in-the-Loop (SITL) simulation reproduces the characteristics of a flight using the flight software. This allows rapid testing and debugging of flight code without risking any of the flight hardware.

Software-in-the-Loop [2] is a simulation environment provided by ArduPilot that simulates a Pixhawk autopilot and the environment in which it operates. It provides substantial testing capabilities for custom software, including the ability to simulate and connect multiple instances of simulated Pixhawks.

SITL simulations prove indispensable with regard to testing the flight system software. By simulating a Pixhawk for both the GCS and UAV, SITL tests replicate the system architecture. This decouples hardware issues from software issues, reducing debugging time substantially.

5.2.2 Hardware Validation

When a project utilizes hardware, the complexity increases substantially. Component incompatibilities, improper wiring, and the inherent stochasticity of the real world create ample opportunities for frustrating errant behavior. For this reason, each hardware subsystem of this project undergoes detailed testing before full system integration. GCS tests demonstrate correct reel behavior, UAV tests prove the vehicle airworthy, and combined GCS/UAV tests show that the radios work as expected and that the reel does not cause any instability in UAV flight.

GCS testing allows thorough characterization and tuning of the reel motor controller before attaching the tether to the UAV. Reel analysis characterized rotational speed limits, which in turn set the maximum linear payout and reel-in speeds. Tensioner adjustment, combined with tension sensor measurements, established how hard the UAV must pull on the tether before the reel pays out. This consideration prevents tangles inside the GCS box. Finally, temperature measurements ensure that the GCS electronics stay below their maximum operational temperature during operation.

UAV testing demonstrates operation at the simplest level before attempting control via custom flight software. Manual flights show that the UAV is airworthy. These flights verify that the Pixhawk is responding appropriately to hand-held transmitter commands, confirming that the flight-essential hardware is connected correctly. Then flights using Mission Planner, Pixhawk's GUI tool for autonomous flight, indicate that the Pixhawk can properly interpret and act upon autonomous commands³. After this step, the UAV is ready for flight with the custom software.

The next hardware validation involves connecting the GCS and UAV via tether and radio. First, the GCS and UAV software communicate to certify that the radio connection is reliable at various distances. Then the UAV completes a Mission Planner flight while tethered to the GCS. During this flight, the UAV navigates to numerous waypoints to ensure that the tether does not cause flight instabilities.

Finally, the flight system conducts a flight using the custom control software. The GCS software automatically sets tether length based upon the current mission objective and allows a GCS operator to issue commands to the UAV. The UAV software listens for commands, then acts accordingly. This validation flight shows that the UAV can autonomously arm, take off, navigate through mission waypoints, and land without interference from the tether.

³Mission Planner uses MAVLink messages to communicate with the Pixhawk. The custom control software utilizes these same messages to control the UAV.

5.3 Integrated System Tests

With both the monocular localization and flight systems independently operational, integrated system tests begin. Integrated tests combine the monocular localization and flight systems to ensure that: 1) hardware interfaces are available and positioned as expected, 2) concurrently running systems do not conflict and cause unexpected errors, and 3) computation of one system does not affect performance of the other.

5.3.1 Initial Verification

Integrated system verification tests establish the ability of the monocular localization and flight systems to operate concurrently without issue. Shared hardware and high computation costs lead to potential system issues worth investigating.

The unmanned aerial vehicle's (UAV's) camera interfaces with both the monocular localization and flight systems. The monocular localization system saves the video for post-processing, and the flight system uses the video during flight to yaw the UAV towards the ground control station (GCS). To ensure that these uses do not interfere with each other, the vision-based yaw script outputs to a screen while the UAV saves video. Since the image capture process is handled by ROS, there should be no issues caused by two processes interacting with the image feed; this scenario is exactly the situation for which ROS is built. Indeed, this test shows that saving video does not interfere with vision-based yaw capabilities, and vice versa.

The next consideration for integrated system verification is the computational cost of each system. Saving video to long-term storage demands significant CPU power, which is already limited since the UAV uses an ODROID microcomputer. To test this potential issue, the full flight system must operate while the UAV saves video. The flight system conducts a flight with the custom control software while the UAV is constantly recording video. Around 60 seconds after takeoff, the UAV stopped responding to GCS commands. Investigations show that the UAV saved over 4 GB of video in just 60 seconds. To overcome this issue, the UAV compresses the video at the same time as it captures it.

5.3.2 Stationary GCS Flights

In order to incrementally increase test complexity, the integrated system performs flights featuring a stationary GCS before utilizing a mobile GCS. The tests discussed in this section demonstrate the integrated system's ability to: 1) follow high-level commands to autonomously navigate the UAV relative to the stationary GCS, 2) record video during autonomous flight, and 3) post-process the recorded video to provide an accurate pose estimate for the flight.

Hardware

Stationary GCS flight test hardware includes all monocular localization and flight hardware mounted to a flatbed trailer. Wooden scaffolding attached to the trailer serves as a mounting point for various pieces of hardware. Figure 5-10 shows the hardware setup for stationary GCS flights.

Testing Process

The testing process comprises mission planning, system setup, and data collection. Stationary GCS mission planning is the same as for Vicon tests (see Section 5.1.1).

Stationary GCS system initial setup is a complex process due to the quantity of hardware. First, each beacon attaches to a precise location on the wooden scaffolding. A ratchet strap anchors the GCS box to a frame set into the scaffolding, and the GCS tether routes through metal eyelets and attaches to the UAV sitting in the rear center of the trailer bed. The GCS's LAN router fastens to the scaffolding with velcro straps. More ratchet straps secure the generator to the front of the trailer, and power strips transmit electricity from the generator to all of the powered components. Velcro straps hold down any loose wiring.

The integrated system collects all commands sent and received from the GCS and UAV, the GCS and UAV locations, images from the camera, and yaw commands from the vision-based yaw code. The Pixhawks on the GCS and UAV automatically collect command and location information, while a rosbag on the UAV collects yaw commands and camera images.

Figure 5-10: A flatbed trailer with wooden scaffolding provides the flight platform and mounting locations for all integrated system hardware.

An extra laptop uses Mission Planner to connect to the UAV to record the flight as an extra data source and to provide UAV status during flight. At the end of each day of flight testing, the operator collects all data logs and saves them to an external hard drive.

Results

The stationary GCS flight results show the vision solution during a real flight of a tethered UAV. During this test, the tether restrained the UAV such that it could not reach its commanded mission coordinates. This test also shows the autonomous flight system in action.

Figures 5-11 and 5-12 show the results of the stationary GCS flight. The vision position estimate is much less clean than in previous tests because this is a real flight with more staggered movement. Even so, the vision solution appears to give a decent estimate. The accuracy of this test cannot be verified due to the lack of a trustworthy ground truth. The vision orientation solution matches very closely to that of the EKF (from the IMU and magnetometer).

[Plots: Stationary GCS Flight. Translation: XY Plane and YZ Plane; Mission vs. Vision vs. EKF; X and Y Position in meters.]

Figure 5-11: The relative position of the UAV with respect to the stationary GCS.

[Plots: Stationary GCS Flight. Rotation: Yaw, Pitch, and Roll; Vision vs. EKF; Time [s].]

Figure 5-12: The relative orientation of the UAV with respect to the stationary GCS.

5.3.3 Mobile GCS Flights

The final assessment of the integrated system is mobile GCS flights. Completion of mobile GCS tests signifies successful completion of the primary project objective: to create a system capable of measuring the air wake aft of a naval vessel. The tests discussed in this section demonstrate the integrated system's ability to post-process the recorded video to provide an accurate pose estimate for the entire flight.

The unreliability of GPS-based navigation (as seen in Figure 5-8) caused issues with the tether during mobile GCS tests. The tether would frequently become too slack and then too taut as a result of drifting GPS position estimates. As a result, the tests in this section feature a manually-flown UAV, and the plots in this section do not feature a mission as in the stationary GCS section.

Hardware and Testing Process

Mobile GCS flight hardware and the testing process are the same as described for stationary GCS tests in Section 5.3.2. The only difference is that a pickup truck tows the trailer during each flight.

Results

The mobile GCS flight demonstrates the ability to post-process a vision solution for relative positioning of the UAV with respect to the mobile GCS. The EKF solution in this case is actually an EKF difference. Since the UAV and GCS travel over a kilometer down the runway, it would be impossible to plot just the EKF position of the UAV as in previous sections. In this section, the EKF position estimate of the GCS is subtracted from the EKF estimate of the UAV in order to produce comparable plots.

Figures 5-13 and 5-14 show the results from the mobile GCS flight. In this test, the UAV flew the entire first arc (at 11 meters) with a taut tether. The vision solution on the XY-plane plot backs up this observation, strongly suggesting that the vision solution is very accurate. As in previous plots, the EKF position estimate proves unreliable relative to the vision solution. The orientation plots show that the vision solution matches the EKF very closely, with the expected drifting EKF yaw from an inaccurate magnetometer.

[Plots: Mobile GCS Flight. Translation: XY Plane and YZ Plane; Vision vs. EKF Difference; X and Y Distance in meters.]

Figure 5-13: The relative position of the UAV with respect to the mobile GCS. Since the EKF position estimate of the UAV covers a large distance down the runway, the EKF solution of the ground control station is subtracted from that of the UAV to produce a usable plot.

[Plots: Mobile GCS Flight. Rotation: Yaw, Pitch, and Roll; Vision vs. EKF; Time [s].]

Figure 5-14: The relative orientation of the UAV with respect to the mobile GCS.


5.3.4 Summary

The monocular localization system and autonomous flight system work together to produce a flight system capable of accurately measuring and positioning air flow relative to a mobile ground control station. The monocular localization system produces very accurate positioning unless the camera is very close to the feature, since lens distortion has a greater effect when the active feature takes up most of the image. With an ideal camera featuring perfect calibration, this would not occur. The flight system ended up being unusable for mobile GCS testing since the state information for positioning (GPS) is too inaccurate for tethered flight. With a more powerful onboard computer, the monocular localization system could likely provide the real-time state information necessary for autonomous flight.

Chapter 6

Conclusion

6.1 Summary

This thesis details the motivation, design, testing, and results of a monocular localization system applied to an unmanned aerial vehicle (UAV). The system overcomes the environmental constraints to provide high-accuracy relative pose estimation in a setting where typical pose estimation systems fall short. The project successfully completed flights to collect the air flow data that the U.S. Navy requested. The active features, in the form of high-power infrared LED's, serve many purposes:

1. Provide features in the feature-starved ocean environment,
2. Simplify feature recognition by utilizing a single, known wavelength,
3. Ensure feature recognition at large distances by using very bright LED's,
4. Provide pose extraction capabilities with minimal transmissions, and
5. Reduce the weight of system architecture placed on board the UAV.

The camera hardware - in particular the 850 nanometer bandpass-interference filter - also simplifies feature recognition by removing unwanted light before it even reaches the camera sensor. The flight system hardware provides thirty-minute-duration autonomous flights with straightforward launch and recovery and remote commanding. The total system culminates in a high-accuracy pose extraction method applied to a simple autonomous flight system.

The main contributions of this thesis are the monocular localization system, the autonomous UAV platform, and the air flow dataset delivered to the United States Navy. The monocular localization system allows pose estimation in a unique environment, while the UAV platform offers an uncomplicated command-and-fly structure.

6.2 Future Work

This section discusses future work as it relates to the two primary topics of this thesis: pose estimation and the flight platform itself.

The pose estimation technique of monocular localization that this thesis presents emerges from the lack of pose estimation systems that provide high-accuracy results relative to a mobile target in outdoor environments. Outdoor relative pose determination remains a difficult task. Differential global positioning system (DGPS) techniques are a promising relative pose method if combined with inertial data; however, further development is necessary to reduce synchronization time and lower the frequency of solution dropouts.

The flight system dynamics model does not currently incorporate the tether, resulting in less-efficient flight. Incorporation of tether dynamics into the control system would result in more robust unmanned aerial vehicle (UAV) flights. Also, the current flight platform supports only a single UAV. Incorporation of multiple agents requires significant work in terms of coordination and ground station software, but provides potential for fleet-wide control.

There are many applications to which both the monocular localization system and the autonomous flight platform apply, such as precisely landing package delivery UAVs on a moving truck for rapid delivery and resupply. There is significant potential for future work in terms of applications of the ideas that this thesis presents.

Appendix A

Equipment

UAV

Item | Brand | Part/Model # | QTY | Notes
Camera | Pointgrey | CM3-U3-13S2M-CS | 1 | Chameleon 3
Camera Lens | Fujinon | YV2.8x2.8SA-SA2 | 1 |
Optical Filter | MidOPT | Bi850 | 1 | 850 nm Interference Bandpass
Telemetry Radio | 3DR | | 2 | 915 MHz
Handheld Transmitter | Spektrum | DX6e | 1 |
Pixhawk Autopilot | 3DR | | 1 |
Pixhawk Buzzer | 3DR | | 1 |
Pixhawk Safety Switch | 3DR | | 1 |
Pixhawk GPS Module | 3DR | | 1 |
Pixhawk Power Module | 3DR | | 1 |
Motor | RCMC | 3407 400KV V2 | 4 |
Propeller | HQ Prop | 12x4.5 | 4 | 2x CW and 2x CCW
ESC | RCMC | | 4 | 30HV Simonk
Carbon Fiber Frame | Tarot | Iron Man 650 | 1 | Folding frame (4-axis)
Mounting Plate | Delrin | Acetal Resin Sheet | 1 | Cut on 3-axis CNC machine
Microcomputer | ODROID | XU4 | 1 | Ubuntu 14.04 ARM7 Version
Memory Chip | eMMC | | 1 | 32GB
BEC | Castle | | 1 | 10A 6S
Battery | Turnigy | | 1 | 6S, 10C, 6600mAh

GCS

Item | Brand | Part/Model # | QTY | Notes
Laptop | Any | Any | 1 | Must have SSH capabilities
Secondary Laptop | Any | Any | 1 | Windows with Mission Planner installed
Generator | Any | Any | 1 | Must provide at least 1500 W
Splashproof Box | Pelican | Stormcase iM2620 | 1 |
Router | Any | Any | 1 | Allows SSH connection via ethernet
Box Internal Structure | 8020 | 2525 Extruded Bars | Multiple |
Reel Motor Controller | EPOS2 | 70/10 375711 | 1 |
Spectra Fiber Tether | Power Pro | Braided Line | Multiple | 100 lb Test, at least 100 meters
Microcomputer | ODROID | XU4 | 1 |
Memory Chip | eMMC | | 1 | 32GB
Telemetry Radio | 3DR | | 1 | 915 MHz, 1 to UAV
Pixhawk Autopilot | 3DR | | 1 |
Pixhawk Buzzer | 3DR | | 1 |
Pixhawk GPS Module | 3DR | | 1 |
Pixhawk Power Module | 3DR | | 1 |

Beacon System

Item | Brand | Part/Model # | QTY | Notes
IR LED | Hontiey | 100W High Power | 8 | 850 nm wavelength
Lens | TX | | 8 | Wide angle
Heatsink | TX | | 8 |
Fan | TX | | 8 |
AC-DC Power Adapter | Mean Well | PLN-45-15 | 8 | Constant current (3A, 15V)
Mounting | Unistrut | 1 5/8" L-bracket | 8 | Mounted to 1 5/8" Unistrut beams

Bibliography

[1] ACL. Aerospace Controls Laboratory | Massachusetts Institute of Technology. Online, August 2018. http://acl.mit.edu/. Accessed August 2018.

[2] ArduPilot. SITL simulator (Software in the Loop). Online, August 2018. http://ardupilot.org/dev/docs/sitl-simulator-software-in-the-loop.html. Accessed August 2018.

[3] I. Arvanitakis, K. Giannousakis, and A. Tzes. Mobile robot navigation in unknown environment based on exploration principles. In 2016 IEEE Conference on Control Applications (CCA), pages 493-498, Sept 2016.

[4] Kairos Autonomi. GPS correction comparisons - RTK vs DGPS. Online, January 2000. http://www.kairosautonomi.com/uploads/files/129/Bulletin---RTK-vs-DGPS-010400.pdf. Accessed August 2018.

[5] Martin A. Fischler and Robert C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381-395, June 1981.

[6] Open Source Robotics Foundation. About ROS. Online, August 2018. http://www.ros.org/about-ros/. Accessed August 2018.

[7] FLIR Integrated Imaging Solutions Inc. Chameleon3 1.3 MP Mono USB3 Vision (Sony ICX445). Online, August 2018. https://www.ptgrey.com/chameleon3-13-mp-mono-usb3-vision. Accessed August 2018.

[8] Intel Corporation. USB 3.0* radio frequency interference impact on 2.4 GHz wireless devices. Technical Report 327216-001, Intel Corporation, 2012. https://www.intel.com/content/www/us/en/io/universal-serial-bus/usb3-frequency-interference-paper.html. Accessed August 2018.

[9] Patrick Irmisch. Camera-based distance estimation for autonomous vehicles. Master's thesis, Technische Universität Berlin, December 2017.

[10] H. Kato, M. Billinghurst, I. Poupyrev, K. Imamoto, and K. Tachibana. Virtual object manipulation on a table-top AR environment. In Proceedings IEEE and ACM International Symposium on Augmented Reality (ISAR 2000), pages 111-119, Oct 2000.

[11] Michael R. Klinker. Tethered UAV flight using a spherical position controller. Master's thesis, Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, June 2016.

[12] Rainer Kümmerle, Bastian Steder, Christian Dornhege, Michael Ruhnke, Giorgio Grisetti, Cyrill Stachniss, and Alexander Kleiner. On measuring the accuracy of SLAM algorithms. Autonomous Robots, 27(4):387, Sep 2009.

[13] Kenneth Levenberg. A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics, 2:164-168, 1944.

[14] Brett T. Lopez. Low-latency trajectory planning for high-speed navigation in unknown environments. Master's thesis, Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, September 2016.

[15] M. R. Mahfouz, M. J. Kuhn, Y. Wang, J. Turnmire, and A. E. Fathy. Towards sub-millimeter accuracy in UWB positioning for indoor medical environments. In 2011 IEEE Topical Conference on Biomedical Wireless Technologies, Networks, and Sensing Systems, pages 83-86, Jan 2011.

[16] D. Marquardt. An algorithm for least-squares estimation of nonlinear parameters. Journal of the Society for Industrial and Applied Mathematics, 11(2):431-441, 1963.

[17] Pierre Merriaux, Yohan Dupuis, Rémi Boutteau, Pascal Vasseur, and Xavier Savatier. A study of Vicon system positioning performance. Sensors, 17(7):1591, 2017.

[18] MidOpt. Bi850 near-IR interference bandpass filter. Online, October 2018. http://midopt.com/filters/bi850/. Accessed August 2018.

[19] Aqel MOA, Marhaban MH, Saripan MI, and Ismail NB. Review of visual odometry: types, approaches, challenges, and applications. SpringerPlus, 5(1):1897, Oct 2016.

[20] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardós. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5):1147-1163, Oct 2015.

[21] R. Mur-Artal and J. D. Tardós. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 33(5):1255-1262, Oct 2017.

[22] Navio2. Anti-vibration mount. Online, August 2018. https://docs.emlid.com/navio2/ardupilot/hardware-setup/. Accessed August 2018.

[23] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon. KinectFusion: Real-time dense surface mapping and tracking. In 2011 10th IEEE International Symposium on Mixed and Augmented Reality, pages 127-136, Oct 2011.

[24] Nick84. Solar_spectrum_en.svg. Online via Wikimedia Commons, October 2018. https://commons.wikimedia.org/wiki/File:Solar_spectrum_en.svg. Accessed August 2018. Modified and republished with the following license: https://creativecommons.org/licenses/by-sa/3.0/deed.en.

[25] OpenCV. Camera calibration and 3D reconstruction. Online, August 2018. https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html. Accessed August 2018.

[26] OpenCV. OpenCV library. Online, August 2018. https://opencv.org/. Accessed August 2018.

[27] Meir Pachter, Nicola Ceccarelli, and Phillip R. Chandler. Vision-based target geolocation using micro air vehicles. Journal of Guidance, Control, and Dynamics, 31(3):597-615, 2008.

[28] Pixhawk. Home page. Online, August 2018. http://pixhawk.org/. Accessed August 2018.

[29] N. C. Rowe, A. E. Fathy, M. J. Kuhn, and M. R. Mahfouz. A UWB transmit-only based scheme for multi-tag support in a millimeter accuracy localization system. In 2013 IEEE Topical Conference on Wireless Sensors and Sensor Networks (WiSNet), pages 7-9, Jan 2013.

[30] Velodyne. VLP-16 (Puck LITE). Online, August 2018. https://velodynelidar.com/vlp-16-lite.html. Accessed August 2018.

[31] VICON. Motion capture systems. Online, August 2018. http://www.vicon.com/. Accessed August 2018.