Diego Brito dos Santos Cesar

A 2 1/2 D Visual Controller for Autonomous Underwater Vehicle

Brazil
May, 2017

Diego Brito dos Santos Cesar

A 2 1/2 D Visual Controller for Autonomous Underwater Vehicle

Presented to the Master's Program in Electrical Engineering of the Federal University of Bahia in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering.

Federal University of Bahia - UFBA Master’s Program in Electrical Engineering

Supervisor: Prof. André Gustavo Scolari Conceição Co-supervisor: Dr. Jan Christian Albiez

Brazil
May, 2017

Cesar, Diego Brito dos Santos A 2 1/2D Visual Controller for Autonomous Underwater Vehicle / Diego Brito dos Santos Cesar. -- Salvador, 2017. 107 f. : il

Orientador: André Gustavo Scolari Conceição. Coorientador: Jan Christian Albiez. Dissertação (Mestrado - Programa de Pós-Graduação em Engenharia Elétrica) -- Universidade Federal da Bahia, Universidade Federal da Bahia, Escola Politécnica, 2017.

1. Controle Servo Visual. 2. Marco Fiducial Artificial. 3. Veículo Autônomo Submarino. I. Conceição, André Gustavo Scolari. II. Albiez, Jan Christian. III. Título.

To Laila Civatti, my beloved wife

Acknowledgements

As this work comes to an end, I realize just how many people helped, directly or indirectly, to make it possible. I would first like to thank my parents, Rose and Cesar, who did their best to give me the conditions to go further and become a better person and a better professional. I thank my brother, Gabriel, for his company over all these years. I thank my wife and partner in all moments, good or bad, Laila, the one who heard my complaints, suffered each setback and celebrated each small success during this journey. I will be eternally grateful. To my supervisors André Scolari and Jan Albiez, who encouraged me to do my best and guided me with their experience and knowledge. To the whole BIR and the FlatFish team, the people I have spent more time with than anyone else in these last years: thanks for making my days pleasant. A special thanks to my friends Gustavo, for helping me so many times, and Geovane, for supporting me and encouraging me towards this goal. Marco Reis, thank you for believing in my potential and giving me the opportunity to be part of this amazing group. Thanks to all colleagues from DFKI who somehow contributed to this work. Special thanks to Christopher Gaudig, for the goodwill and support with the experiments at the DFKI facilities. To my friends Pedro Xavier, João Britto, Rafael Saback, Danilo Farias and Livia Assunção, for their support, encouragement, help and empathy during this whole process. Last but not least, thanks to Shell for the financial funding of the FlatFish project via the Brazilian Industrial Research and Innovation Corporation (EMBRAPII) and the Brazilian National Agency of Petroleum (ANP).

"The most exciting phrase to hear in science, the one that heralds the most discoveries, is not 'Eureka!' but 'That's funny...'" (Isaac Asimov)

Resumo

Navegação submarina é afetada pela falta de GPS, devido à atenuação de ondas eletromag- néticas. Por causa disso, os robôs submarinos baseiam-se em sistemas de navegação via odometria e sensores inerciais. Contudo, a localização via esse tipo de abordagem possui uma incerteza associada que cresce com o passar do tempo. Por isso sensores visuais e acústicos são utilizados para aumentar a precisão da navegação de veículos submarinos. Nesse contexto, a utilização de um controlador visual aumenta a precisão dos sistemas robóticos quando se locomovem em relação a um objeto alvo. Esse tipo de precisão é requerida para manipulação de objetos, inspeção, monitoramento e docagem submarina. Esse trabalho tem como objetivo projetar e avaliar um controlador visual híbrido para um veículo submarino autônomo (AUV) utilizando como referência marcos visuais artificiais. Os marcos artificiais são alvos planares projetados para serem facilmente detectados por sistemas de visão computacional, sendo capazes de fornecer meios para estimação da posição do robô em relação ao marco. As suas características de alta taxa de detecção e baixa taxa de falsos positivo são desejáveis para tarefas de controle servo visual. Este trabalho analisou, portanto, dentre os marcos mais populares e de código aberto, aquele que apresenta o melhor desempenho em ambientes submarinos, em termos de taxa de detecção, número de falsos positivos, máxima distância e ângulo para detecção. Posteriormente, o marco que apresentou melhor performance foi utilizado para aplicação de controle visual em um robô submarino. Os primeiros ensaios foram realizados na plataforma de simulação robótica Gazebo e, posteriormente, em um protótipo de AUV real, o FlatFish. Testes em um tanque de água salgada foram realizados visando avaliar a solução proposta utilizando um ganho estático e um ganho adaptativo para o controlador visual. Finalmente, testes no mar foram realizados utilizando o controlador que apresentou os melhores resultados no ambiente controlado, a fim de verificar seu desempenho em um ambiente real. Os testes mostraram que o controlador visual foi capaz de manter o veículo em frente aos marcos visuais artificiais e que o ganho adaptativo trouxe vantagens, principalmente por suavizar a movimentação do robô no início da missão.

Palavras-chave: Controle Servo Visual, Marco Fiducial Artificial, Veículo Autônomo Submarino.

Abstract

Underwater navigation is affected by the lack of GPS due to the attenuation of electromagnetic signals. Thereby, underwater vehicles rely on dead reckoning as their main navigation system. However, localization via dead reckoning accumulates uncertainty over time. Consequently, visual and acoustic sensors have been used to increase the accuracy of robotic navigation, especially when the vehicle moves in relation to a target object. This level of precision is required, for instance, for object manipulation, inspection, monitoring and docking. This work aims to develop and assess a hybrid visual controller for an autonomous underwater vehicle (AUV) using artificial fiducial markers as reference. Artificial fiducial markers are planar targets designed to be easily detected by computer vision systems and to provide means to estimate the robot's pose with respect to the marker. They usually have a high detection rate and a low false-positive rate, which are desirable properties for visual servoing tasks. This master's thesis evaluated, from among the most popular open-source marker systems, the one that presents the best performance in underwater environments in terms of detection rate, false-positive rate, and maximum distance and angle for successful detection. Afterwards, the best marker was used for visual servoing purposes on an underwater robot. The first experiments were performed in the Gazebo robot simulation environment and, after that, on a real prototype, the FlatFish. Tests in a saltwater tank were performed in order to assess the controller using static and adaptive gains. Finally, sea trials were performed using the controller that behaved best in the controlled environment, in order to assess its performance in a real environment. The tests have shown that the visual controller was capable of station-keeping in front of an artificial fiducial marker. Additionally, the adaptive gain brought improvements, mainly because it smooths the robot's motion at the beginning of the task.

Keywords: Visual Servoing, Artificial Fiducial Marker, Autonomous Underwater Vehicle.

List of Figures

Figure 1 – Mobile robots
Figure 2 – Typical Underwater Vehicles
Figure 3 – Different marker systems
Figure 4 – Classification of different underwater vehicles
Figure 5 – Remotely Operated Vehicles
Figure 6 – Commercial Mobile robots
Figure 7 – Images of POODLE, the first ROV
Figure 8 – Early ROVs
Figure 9 – SPURV
Figure 10 – AUSS
Figure 11 – Hugin AUV
Figure 12 – FlatFish AUV
Figure 13 – FlatFish thruster configuration
Figure 14 – Representation of the three components of light received by the camera
Figure 15 – Color absorption by function of distance
Figure 16 – Picture of a miniaturized parking lot taken with a camera close to the objects
Figure 17 – Pinhole camera geometry
Figure 18 – Diagram showing the position of the pixel in the ideal case and the effect of the radial distortion (δr) and tangential distortion (δt)
Figure 19 – Studied marker systems
Figure 20 – Structure of dynamic look-and-move IBVS
Figure 21 – Structure of dynamic look-and-move PBVS
Figure 22 – Example of adaptive curve using λ(0) = 0.2, λ(∞) = 0.1 and λ̇(0) = 0.05
Figure 23 – Camera head mounted on a 3D gantry robot inside the black tank
Figure 24 – Placement of markers on the support structure. The first row features an AprilTags and an ARToolKit marker. In the second row the ArUco marker is followed by smaller size versions of all markers
Figure 25 – AprilTags at different turbidity levels in a deep sea lighting scenario
Figure 26 – Angle steps used for the maximum detectable angle test
Figure 27 – Smallest detectable size (in pixels) for ArUco, AprilTags and ARToolKit libraries in a shallow sea scenario at different turbidity levels
Figure 28 – Smallest detectable size (in pixels) for ArUco, AprilTags and ARToolKit libraries in a deep sea scenario at different turbidity levels
Figure 29 – The largest camera-to-marker distance for successful detection at different turbidity levels for the shallow sea lighting scenario
Figure 30 – The largest camera-to-marker distance for successful detection at different turbidity levels for the deep sea lighting scenario
Figure 31 – Maximum detected angle for the three libraries in a shallow sea lighting scenario at different turbidity levels
Figure 32 – Maximum detected angle for the three libraries in a deep sea lighting scenario at different turbidity levels
Figure 33 – The detection time for ArUco, AprilTags and ARToolKit
Figure 34 – FlatFish inspecting a shipwreck on Gazebo
Figure 35 – Starting and ending position of the vehicle in the world [(a) and (b)] and starting and ending marker positions on the image plane [(c) and (d)]
Figure 36 – Reference following without ocean current effects
Figure 37 – Pixel trajectory on the image plane when ocean currents are not present
Figure 38 – Reference following with ocean current effects
Figure 39 – Pixel trajectory on the image plane in the presence of ocean currents
Figure 40 – Maritime Exploration Hall - DFKI
Figure 41 – Setup scenario on the DFKI's big basin
Figure 42 – Cascade PID diagram
Figure 43 – Visual control chain diagram
Figure 44 – Coordinate systems from FlatFish's body, camera and marker
Figure 45 – Vehicle's camera detecting the Apriltag marker at two different distances on the tank
Figure 46 – Controller performance in surge during setpoint changing in surge (case 1)
Figure 47 – Controller performance in sway during setpoint changing in surge (case 1)
Figure 48 – Controller performance in heave during setpoint changing in surge (case 1)
Figure 49 – Controller performance in yaw during setpoint changing in surge (case 1)
Figure 50 – Pixel trajectory on the image plane during motion in surge (case 1)
Figure 51 – Controller performance in surge during setpoint changing in surge (case 2)
Figure 52 – Controller performance in sway during setpoint changing in surge (case 2)
Figure 53 – Controller performance in heave during setpoint changing in surge (case 2)
Figure 54 – Controller performance in yaw during setpoint changing in surge (case 2)
Figure 55 – Pixel trajectory on the image plane during motion in surge (case 2)
Figure 56 – Controller performance in sway during setpoint changing in sway (case 3)
Figure 57 – Controller performance in surge during setpoint changing in sway (case 3)
Figure 58 – Controller performance in heave during setpoint changing in sway (case 3)
Figure 59 – Controller performance in yaw during setpoint changing in sway (case 3)
Figure 60 – Pixel trajectory on the image plane during motion in sway (case 3)
Figure 61 – Controller performance in heave during setpoint changing in heave (case 4)
Figure 62 – Controller performance in surge during setpoint changing in heave (case 4)
Figure 63 – Controller performance in sway during setpoint changing in heave (case 4)
Figure 64 – Controller performance in yaw during setpoint changing in heave (case 4)
Figure 65 – Pixel trajectory on the image plane during motion in heave (case 4)
Figure 66 – Controller performance in yaw during setpoint changing in yaw (case 5)
Figure 67 – Controller performance in surge during setpoint changing in yaw (case 5)
Figure 68 – Controller performance in sway during setpoint changing in yaw (case 5)
Figure 69 – Displacement of sway when performing Yaw control
Figure 70 – Controller performance in heave during setpoint changing in yaw (case 5)
Figure 71 – Pixel trajectory on the image plane during motion in yaw (case 5)
Figure 72 – Brazilian experiments location (Google Maps)
Figure 73 – FlatFish AUV in Todos os Santos bay
Figure 74 – Apriltag marker ID=5 placed on metal pedestal
Figure 75 – Vehicle's camera detecting the Apriltag marker at two different distances
Figure 76 – Surge controller performance in the sea during several setpoint changes
Figure 77 – Sway controller performance in the sea during several setpoint changes
Figure 78 – Heave controller performance in the sea during several setpoint changes
Figure 79 – Yaw controller performance in the sea during several setpoint changes
Figure 80 – Pixel trajectory on the image plane during the entire mission in the sea
Figure 81 – Surge controller performance on interval [770-900] seconds during mission in the sea
Figure 82 – Sway controller performance on interval [770-900] seconds during mission in the sea
Figure 83 – Heave controller performance on interval [770-900] seconds during mission in the sea
Figure 84 – Yaw controller performance on interval [770-900] seconds during mission in the sea
Figure 85 – Pixel trajectory on the image plane on the interval [770-900] seconds during the mission in the sea

List of Tables

Table 1 – FlatFish Specifications
Table 2 – Thrust Configuration Matrix
Table 3 – Model parameters of AUV FlatFish
Table 4 – Gains for the simulated visual servoing task
Table 5 – PID Coefficients
Table 6 – Setpoint for the test cases scenarios on the basin
Table 7 – PID Coefficients on the ocean experiments

List of abbreviations and acronyms

ANP Brazilian National Agency of Petroleum

AUSS Advanced Unmanned Search System

AUV Autonomous Underwater Vehicle

CTD Conductivity, Temperature and Depth

CURV Cable-controlled Undersea Recovery Vehicles

DFKI German Research Center for Artificial Intelligence

DVL Doppler Velocity Logger

EMBRAPII Brazilian Industrial Research and Innovation Corporation

HVS Hybrid Visual Servoing

IBVS Image Based Visual Servoing

IMU Inertial Measurement Unit

LBL Long Baseline

PBVS Position Based Visual Servoing

PID Proportional–Integral–Derivative

PnP Perspective-n-Point

RIC Robotics Innovation Center

ROCK Robotic Construction Kit

ROV Remotely Operated Vehicle

SPURV Special Purpose Underwater Research Vehicle

TCM Thrust Configuration Matrix

UAV Unmanned Aerial Vehicle

USBL Ultra-short Baseline

Visp Visual Servoing Platform

List of symbols

λ Greek letter Lambda

τ Greek letter Tau

η Greek letter Eta

ρ Greek letter Rho

∈ Math symbol "element of"

∀ Math symbol "for all"

Contents

1 INTRODUCTION
1.1 Motivation
1.1.1 Vision-based control and underwater vehicles
1.2 General Objective
1.3 Specific Objectives
1.4 Thesis Outline

2 AUTONOMOUS UNDERWATER VEHICLES
2.1 Autonomous Underwater Vehicles
2.1.1 Underwater Vehicles
2.1.2 Historical Context
2.2 FlatFish
2.2.1 Sensors and Actuators
2.2.1.1 Navigation System
2.2.1.2 Inspection System
2.2.1.3 Propellers
2.2.2 Vehicle Model
2.2.3 Software Layer

3 COMPUTER VISION
3.1 Underwater Image Processing
3.2 Perspective Transform
3.2.1 Pinhole camera model
3.2.2 Distortion parameters
3.2.3 Camera Calibration
3.2.4 Pose Estimation with Single Camera
3.3 Artificial Fiducial Markers
3.3.1 The evaluated marker systems

4 VISUAL CONTROLLER
4.1 IBVS
4.1.1 Interaction Matrix for Points
4.1.2 Estimation of Interaction Matrix - L̂s
4.2 PBVS
4.3 Hybrid Visual Servoing - 2 1/2D
4.3.1 Interaction Matrix
4.4 Adaptive Gain
4.5 Stability Analysis
4.5.1 Stability for PBVS
4.5.2 Stability for IBVS
4.5.3 Stability of 2 1/2D visual servoing

5 EVALUATION OF FIDUCIAL MARKERS
5.1 Experimental Setup
5.2 Methodology
5.3 Results and Discussion
5.4 Conclusions

6 GAZEBO SIMULATION
6.1 Experimental Setup and Methodology
6.2 Results
6.3 Conclusions

7 EXPERIMENTS WITH THE REAL VEHICLE
7.1 Experiments in the big basin
7.1.1 Experimental Setup
7.1.1.1 Control Chain
7.1.2 Methodology
7.1.3 Results and Discussion
7.1.3.1 Case 1 - Surge
7.1.3.2 Case 2 - Surge
7.1.3.3 Case 3 - Sway
7.1.3.4 Case 4 - Heave
7.1.3.5 Case 5 - Yaw
7.1.4 Conclusions
7.2 Experiments in the Sea
7.2.1 Experimental Setup
7.2.2 Methodology
7.2.3 Results and Discussion
7.2.3.1 Entire servoing mission
7.2.3.2 Interval Analysis
7.2.4 Conclusions

8 FINAL CONSIDERATIONS

BIBLIOGRAPHY

1 Introduction

This chapter presents the importance of subsea exploration and the potential of unmanned underwater vehicles for this task. It also presents the limitations of current underwater robots and shows how computer vision and vision-based control have been used to overcome real problems. The objectives of this work are also presented. Finally, the structure of this master's thesis is shown.

1.1 Motivation

The introduction of robots into the manufacturing industries in the 1960s revolutionized the way human beings produce goods. The ability to perform repetitive tasks with high accuracy and speed has increased industrial productivity, reduced costs and improved product quality. Nowadays, manipulator robots are so widely used in many industrial sectors that it is unthinkable for companies in those segments to remain competitive and survive without them.

Although still far from the technological maturity of industrial robots, mobile robots are currently making enormous progress. Advances in sensing equipment, computing power, signal processing and artificial intelligence have enabled practical applications of mobile robots and revealed their potential for more complex tasks. Mobile robots have been applied to replace humans in dangerous situations, to extend human perception and to act in inhospitable areas, e.g., analysis of surface materials on Mars [1], search and rescue of victims of natural disasters [2], observation of active volcanoes [3] and inspection of power transmission lines [4]. Figure 1 shows some examples of space, terrestrial and aerial mobile robots.

Figure 1 – Mobile robots: (a) Sojourner [5]; (b) Aeryon Scout [6]; (c) Asimo [7]

Moreover, mobile robots have also been used in underwater environments to perform tasks previously done by divers, as well as to enable operations at depths greater than humans can tolerate. Underwater vehicles are mainly classified either as Remotely Operated Vehicles (ROV) or Autonomous Underwater Vehicles (AUV). The former is a tethered vehicle guided by a pilot, who relies on information coming from sensors such as sonars and cameras to control the movement of the vehicle. The latter, on the other hand, performs missions without human intervention, using internal and external positioning sources to localize itself and navigate in the environment. Figure 2 shows some underwater vehicles.

Figure 2 – Typical Underwater Vehicles: (a) ROV Nexxus [8]; (b) ROV Magnum [9]; (c) AUV Leng [10]; (d) AUV Sabertooth [11]

Underwater robots have been used for the inspection of man-made submarine structures such as pipelines, cables and columns of offshore platforms [12]; localization and grasping of objects [13]; seafloor mapping; maintenance and surveillance of observatories; and sampling of marine chemistry, geology and biology [14].

1.1.1 Vision-based control and underwater vehicles

The absence of GPS in underwater environments prompts most underwater vehicles to rely on visual and acoustic sensors for navigation. Acoustic sensors are not affected by visibility and light conditions. However, they are expensive, operate at a low update frequency and their weight might not be suitable for certain AUVs. On the other hand, visual sensors such as cameras are inexpensive and standard equipment in most underwater vehicles, providing rich information at a high update rate in addition to being passive sensors [15]. Moreover, cameras have a higher resolution than acoustic sensors [16], which allows more accuracy in the retrieved data. Thus, visual sensors have

been applied to the estimation of abundances of marine organisms [17], 3D reconstruction [18], failure detection in hydroelectric dams [19], simultaneous localization and mapping (SLAM) [20] and visual servoing [21, 22, 23].

Visual servoing, or vision-based control, is a control scheme that uses visual information to control the motion of a robot. It is a multidisciplinary field that covers areas such as kinematics, dynamics, image processing and control theory [24]. Visual servoing mimics the way humans use their visual system to perform simple actions such as grabbing objects, positioning the body to enter through a door or sitting on a chair. On mobile robots it allows motion in a local, target-referenced frame, thus providing higher accuracy when compared to inertial navigation, and enables closer target-relative movements. Although visual servoing was first developed for manufacturing robots, it has been widely used in mobile robots, for instance for automatic landing of Unmanned Aerial Vehicles (UAV) [25], for accuracy enhancement of medical micro-robots [26] and for assisting pilots of remotely operated spacecraft in satellite maintenance [27]. In underwater environments, displacement relative to a target is desired in tasks such as docking [28], pipeline following [29] and structure inspection [30]. Moreover, underwater visual servoing has been used to help robot inspection and 3D reconstruction [18], to assist ROV pilots [31] and for valve manipulation [32].

Visual servoing controllers are classified in two forms according to their input. Image Based Visual Servoing (IBVS) receives its input in the image domain (pixel coordinates), so it is also called 2D visual servoing. Position Based Visual Servoing (PBVS) receives its input in Cartesian space, so it is also referred to as 3D visual control. While IBVS is robust to camera calibration errors and more computationally efficient, it has singularities that make the system only locally asymptotically stable. On the other hand, global asymptotic stability can be proved for PBVS. However, PBVS requires an additional computational step to estimate the camera pose, and it demands previous knowledge of the camera calibration parameters and of the 3D model of the target object. Hybrid Visual Servoing (HVS), or the 2 1/2D visual controller, combines the advantages of both strategies while avoiding their drawbacks [33].

Regarding the image processing layer, visual servoing relies on the extraction of visual features such as points, lines or ellipses. Those are either the input of the controller (IBVS) or used to compute the camera pose that feeds the controller (PBVS). In that sense, this work takes advantage of artificial fiducial markers for feature extraction. Artificial fiducial markers are similar to QR codes, having a unique ID encoded in their pattern and being especially designed to be easily detected by computer vision systems. Figure 3 shows some fiducial marker systems.

Figure 3 – Different marker systems [34]

Given these characteristics, artificial fiducial markers have high detection rates and low false-positive and false-negative rates. The use of fiducial markers is suitable when object recognition or pose determination is needed with high reliability and when the environment can be modified to affix them [34]. These characteristics are desirable for control tasks, which motivates their use in this work. Examples of possible applications are docking stations for AUVs and structures of the oil and gas industry. A few works have used planar targets for visual servoing in underwater robots [35, 22], but to the best of the author's knowledge, artificial fiducial markers have not been used for visual servoing in underwater environments.
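To make the classification above more concrete, the sketch below illustrates the standard proportional image-based law v = -λ L̂⁺(s - s*) with the classical point-feature interaction matrix from the visual-servoing literature. This is only an illustrative sketch: the 2 1/2D controller actually used in this work is developed in Chapter 4, and all numeric values here (feature positions, depths, gain) are made up.

```python
import numpy as np

def ibvs_point_interaction(x, y, Z):
    """Interaction matrix of a single image point (x, y) at depth Z,
    as used in classic IBVS (normalized image coordinates)."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def ibvs_velocity(features, desired, depths, lam=0.2):
    """Proportional law v = -lambda * pinv(L) * e for a set of points."""
    L = np.vstack([ibvs_point_interaction(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    e = (np.asarray(features) - np.asarray(desired)).ravel()
    return -lam * np.linalg.pinv(L) @ e

# Four tracked points, slightly off from their desired positions.
s = [(0.10, 0.10), (-0.10, 0.10), (-0.10, -0.10), (0.10, -0.10)]
s_star = [(0.12, 0.12), (-0.12, 0.12), (-0.12, -0.12), (0.12, -0.12)]
print(ibvs_velocity(s, s_star, depths=[2.0] * 4))  # 6-DOF camera velocity
```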

1.2 General Objective

The main objective of this work is to develop and assess the performance of a 2 1/2D visual controller when controlling an autonomous underwater vehicle using the information provided by a single camera detecting a fiducial marker in a real environment.

1.3 Specific Objectives

• Assay the particularities of visual servoing tasks in Autonomous Underwater Vehicles;

• Assay the performance of artificial fiducial marker systems in underwater environments;

• Define, among the known open-source artificial fiducial marker systems, the one that best suits underwater applications;

• Analyze the performance of a hybrid visual servoing approach in a simulated environment;

• Evaluate the performance difference between adaptive gains and static gains;

• Validate the results on the real vehicle in a big basin;

• Evaluate the performance of the proposed controller in the ocean;

• Investigate the challenges of the application in a real environment.

1.4 Thesis Outline

This document is composed of this chapter, which presents a brief history of robotics, the evolution of visual servoing tasks from manipulators to mobile robots, and the motivation and goals of this work. Chapter 2 describes the particularities of autonomous underwater vehicles and introduces the FlatFish AUV, the robot used for the experiments of this work. Chapter 3 introduces the basic concepts of image processing, the challenges of image processing in underwater environments and a bibliographic review of artificial fiducial markers. Chapter 4 is dedicated to the background related to visual servoing, with details of the theory behind the different classes of visual controllers. Chapters 5, 6 and 7 present the methodology, results and discussion of the performed experiments. Chapter 5 shows an evaluation of artificial fiducial markers in underwater environments. Chapter 6 shows the AUV performing visual servoing in a simulated environment. Chapter 7 details the visual servoing experiments with the real vehicle, both in a basin and at sea. Finally, Chapter 8 presents the conclusions of the work and suggestions for future work.

2 Autonomous Underwater Vehicles

This chapter focuses on the aspects of autonomous underwater vehicles. A historical panorama of mobile robots is presented aiming to contextualize the development and applications of underwater vehicles. The state of the art of this technology is also introduced. A section is reserved to explain the dynamic model of an autonomous underwater vehicle. The remainder of the chapter introduces the AUV of interest in this work, the FlatFish AUV, with hardware details and characteristics of the embedded software.

2.1 Autonomous Underwater Vehicles

2.1.1 Underwater Vehicles

Underwater vehicles have been developed in different sizes and for different purposes. Figure 4 illustrates the main categories of this kind of vehicle.

Figure 4 – Classification of different underwater vehicles [36]

The main division concerns whether the vehicle is able to carry people or not. The ones that carry humans are classified as manned underwater vehicles, while the ones that do not are classified as unmanned underwater vehicles. In all manned vehicles human lives are at risk, so the construction must be very robust. Additionally, periodic inspections and maintenance by trained and qualified people become indispensable. Thus, manned vehicles are very costly to build, operate and maintain [36]. On the other hand, unmanned underwater vehicles do not bring any risk to humans, and therefore part of the costs is reduced. Moreover, these vehicles permit operation at greater depths and for extended periods, and require fewer people involved, which reduces the operational costs. Such vehicles are either remotely controlled or operate on their own.

Remotely operated vehicles are the most common unmanned underwater vehicles, and their operation relies on a pilot who uses the vehicle's sensors, such as cameras, sonars and depth sensors, to make decisions on how to move and who sends commands to the robot via a cable, usually called tether or umbilical. ROV sizes vary from small enough to be carried by one person up to the size of a garage, weighing several tons (see Figure 5). ROV applications include scientific research, educational uses, and inspection of difficult or dangerous environments. The bigger ones are usually equipped with manipulators and are used to inspect and maintain oil rigs and pipelines [36].

Figure 5 – Remotely Operated Vehicles: (a) ROV Seabotix LBV150 [37]; (b) ROV Doc Ricketts [38]

Autonomous Underwater Vehicles, in turn, are tetherless robots guided by a preprogrammed computer, which uses data such as speed, depth and orientation to control the vehicle and localize it in the environment. Additional sensors may detect potential obstacles and record data for later human inspection. AUVs have their own power supply system and need to recharge it periodically. This is usually performed by retrieving the vehicle and bringing it back to the vessel. Some AUVs, also called resident AUVs, recharge their battery at a submerged docking station. In addition, the docking station permits communication with the topside, so it can be used to transfer the inspection data and to load new missions [36].

AUV technology is viable thanks to advances in many areas, such as higher energy density batteries, which enable operation for extended periods of time and over larger ranges. The quality of sensors such as sonars and cameras enables not only navigation abilities such as obstacle avoidance, but also significantly improves the quality of the collected data. Moreover, sensor packaging permits merging magnetometers, accelerometers and gyroscopes into a single package that can be embedded in the vehicle, improving navigation quality in AUVs [39].

Navigation is one of the most critical areas when it comes to AUVs. Since radio signals are strongly attenuated underwater, the standard way an AUV navigates is by dead reckoning. This is performed by combining data such as velocity samples

from a Doppler Velocity Logger (DVL), accelerations and angular velocities from an Inertial Measurement Unit (IMU), depth from a pressure sensor and velocities from the dynamic model. Moreover, AUVs can use perception sensors such as cameras and sonars to detect known landmarks and resolve their position in the environment. Additionally, some operations allow the deployment of multiple acoustic positioning transponders on the seafloor, giving the robot its position in relation to an inertial frame; these are known as Long Baseline (LBL) systems. Another option is to use an Ultra-short Baseline (USBL), which provides the vehicle's position by triangulation between two closely spaced antennas on the reference frame and the antenna on the vehicle [39].
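As an illustration of the dead-reckoning principle described above, the following is a minimal planar sketch, assuming a DVL that gives body-frame surge/sway velocity and a gyroscope that gives yaw rate; the function name and all numbers are invented for the example and do not come from this thesis.

```python
import numpy as np

def dead_reckon(x, y, yaw, v_body, yaw_rate, dt):
    """Propagate a planar dead-reckoning estimate by one time step.

    x, y, yaw  : current pose estimate (m, m, rad)
    v_body     : body-frame velocity from the DVL, (surge, sway) in m/s
    yaw_rate   : angular rate about z from the IMU gyroscope (rad/s)
    dt         : time step (s)
    """
    # Rotate the body-frame velocity into the world frame.
    c, s = np.cos(yaw), np.sin(yaw)
    vx = c * v_body[0] - s * v_body[1]
    vy = s * v_body[0] + c * v_body[1]
    # Simple Euler integration; errors accumulate without external fixes.
    return x + vx * dt, y + vy * dt, yaw + yaw_rate * dt

# Example: 10 s of constant 0.5 m/s surge while turning slowly.
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = dead_reckon(*pose, v_body=(0.5, 0.0), yaw_rate=0.02, dt=0.1)
print(pose)
```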

2.1.2 Historical Context

In the last decades robots have played an important role in the development of modern society. Day after day they become more present in the daily life of humans and more indispensable in industry. Given its multidisciplinary nature, robotics depends on the progress of areas such as electronics, materials engineering, computer science and control theory. Advances in these areas have permitted the development of reliable robots for commercial purposes. Figure 6 shows the Automated Guided Vehicle (AGV) from Swisslog, which transports material in warehouses, the TUG from Aethon, used in hospitals for transporting medicines, and the RC3000 autonomous vacuum from Kärcher for cleaning domestic environments [40].

Figure 6 – Commercial Mobile robots: (a) AGV from Swisslog [41]; (b) TUG from Aethon [42]; (c) RC3000 from Kärcher [43]

Terrestrial robots, as well as space and aerial robots, have progressed rapidly, so many of them are currently able to perform autonomous tasks. Underwater robots, however, did not experience the same expressive progress. The materials required for such a harsh environment, the high research costs and the complexity of the experiments did not attract researchers to this area, and therefore it was developed almost exclusively for military purposes. Although humans have used the sea and boats for traveling since history started to be recorded, the development of vehicles that can go underwater came late. The first

Figure 7 – Images of POODLE, the first ROV [36]

mention of an underwater vehicle design is credited to Leonardo da Vinci, registered as a military underwater vehicle in his collection of documents, the Codice Atlanticus, written between 1480 and 1518. It is also said he worked on his invention but later destroyed it because he judged it could be too dangerous [44]. The development of manned underwater vehicles such as submarines started in the early 1600s and grew concurrently with the Industrial Revolution (1776), so that by 1914, during World War I, they were already widely used as military weapons and a threat to navies [36].

The development of the first unmanned underwater vehicle, however, only happened in 1953, when the ROV POODLE was built (Figure 7) [44]. The nautical archaeologist Dimitri Rebikoff improved his torpedo-shaped machine, which he used for propulsion when diving, equipped it with a camera and controlled it remotely, using the ROV to search for shipwrecks. In parallel, the British and American navies began to use ROVs to locate and retrieve expensive materials that had been lost in the water. During the 1950s a British ROV called Cutlet and the American Cable-controlled Undersea Recovery Vehicles (CURV) were used to recover mines and torpedoes [36]. Between the 1950s and 1960s, ROVs evolved mainly for military purposes. Two memorable events in ROV history are the recovery of a hydrogen bomb lost off the coast of Spain in 1966 and the rescue of the trapped crew of the submersible Pisces III in 1973. Both tasks were performed by the U.S. Navy's CURV series of ROVs [36, 45]. Figure 8 shows the first ROVs and an illustration of the ROV CURV III recovering the hydrogen bomb.

During the 1970s, the growing demand for offshore oil extraction propelled investment in underwater vehicles, as they had proven to be quite useful for underwater missions. Thus, scientific and commercial development of ROVs grew significantly. The requirements of the Oil & Gas industry pushed the development of increasingly complex robots, more efficient and capable of operating reliably at greater depths [36]. The need to operate in deep water became a problem for the use of tethers and encouraged the commercial development of tetherless robots with the ability to work autonomously, i.e., without human intervention.

Figure 8 – Early ROVs: (a) Cutlet [45]; (b) CURV II [45]; (c) illustration of CURV recovering the hydrogen bomb [46]

The first autonomous underwater vehicle was the Special Purpose Underwater Research Vehicle (SPURV), built in 1957 at the University of Washington and shown in Figure 9. Its purpose was to study underwater communication and to track submarines [47]. The 1960s and 1970s were an experimental period in which many proofs of concept were developed; despite the successes and failures, considerable advances were made in this emerging technology. Only in the 1980s was major attention given to tetherless vehicles. In those years, most laboratories began to develop test platforms. During these years, many technological advances in electronics and computing, such as lower power consumption, larger memory and more processing power, in addition to the adoption of software engineering practices, helped create more complex robots. Besides that, vision systems, decision-making and navigation became more popular and reliable in AUVs [49]. One highlight of the 1980s is the Advanced Unmanned Search System (AUSS), presented in Figure 10. It was developed by the Naval Ocean Systems Center, in the USA.

Figure 9 – SPURV [48]

AUSS was launched in 1983, and even during the 1990s many reports and publications about it were still being published. It could operate at 6000 meters depth, with an autonomy of 10 hours, and it was able to communicate via acoustic sensors, which were used to transmit images through the water. The vehicle's weight was 907 kg. A remarkable event was its detection of an American WWII bomber while operating off San Diego [50, 51].

Figure 10 – AUSS [51]

In the 1990s, the proofs of concept became the first generation of operational systems. The highlights of this period are HUGIN I and HUGIN II, developed by Kongsberg Maritime in cooperation with the Norwegian Defense Research Establishment; the Odyssey AUV, created by the MIT Sea Grant College AUV Lab; and REMUS (Remote Environmental Monitoring Unit System), developed by the Woods Hole Oceanographic Systems Lab. At the beginning of the 21st century, the first commercial vehicles became available. They are capable of performing tasks such as marine search and rescue, oceanographic research and environmental monitoring. One of the leaders in the AUV business, Kongsberg Maritime, has in its portfolio the HUGIN, one of the most famous commercial AUVs. Figure 11 shows this vehicle.

Figure 11 – Hugin AUV [52]

In terms of comparison with the performance of the aforementioned AUSS, HUGIN is available in versions that can operate at 6000 meters depth, and its battery lasts 100 hours when operating at 4 knots (about 2 m/s). It is typically equipped with side-scan sonar, multibeam echo sounder, sub-bottom profiler, camera, and a Conductivity, Temperature and Depth (CTD) sensor [52]. It is used for military purposes, such as mine countermeasures (MCM), intelligence, surveillance and reconnaissance (ISR) and rapid environmental assessment (REA), and for commercial purposes, such as seabed mapping and imaging, geophysical inspection, pipeline and subsea structure inspection, oceanographic surveys, environmental monitoring, marine geological surveys and search operations [52].

Underwater vehicles have evolved from a purely military role to being largely used in the Oil and Gas industry. Although oil production will continue to play an important role in our lives for a while, the future of underwater vehicles is not limited to it. Deep-sea mining and offshore alternative energy production will demand underwater vehicles and prompt their development to fulfill the requirements of new businesses [39].

2.2 FlatFish

FlatFish is an ongoing project for Royal Dutch Shell, funded by the Brazilian Government via the Brazilian National Agency of Petroleum (ANP) and the Brazilian Industrial Research and Innovation Corporation (EMBRAPII). It is developed by SENAI CIMATEC, at the Brazilian Institute of Robotics, in cooperation with the Robotics Innovation Center (RIC), which is part of the German Research Center for Artificial Intelligence (DFKI). FlatFish is shown in Figure 12.

Figure 12 – FlatFish AUV

The goal of the FlatFish project is to develop an AUV for the inspection of assets of the Oil & Gas industry, such as pipelines, manifolds and SSIVs. FlatFish is a subsea-resident AUV, i.e., it has a docking station where the vehicle can park, recharge its battery, send data to shore and load new missions. This enables FlatFish to operate submerged for extended periods. When compared to ROV operations, this ability significantly reduces costs, since the robot does not depend on the weather, does not require a dedicated support vessel and permits an increase in the frequency of inspections, given that the vehicle will be available 24/7. Thus, a subsea-resident AUV has a lower cost per operation and allows early failure detection. According to the scope of the project, one vehicle was built in Germany and another in Brazil. This strategy speeds up project development since it allows the involved scientists to exchange knowledge during the project. Table 1 provides an overview of the FlatFish features.

Table 1 – FlatFish Specifications

Depth rating: 300 m
Weight (in air): 275 kg
Size (LWH): 220 cm x 105 cm x 50 cm
Propulsion: 6x 60 N Enitech ring thrusters (120 N in each direction)
Battery: Lithium-Ion battery, 5.8 kWh (11.6 kWh) @ 48 V
Communication (surface): Rock7mobile RockBlock Iridium satellite modem (1.6 GHz); Digi XBee-Pro-868 (868 MHz); ubiquiti PicoStation M2 HP WLAN module (2.4 GHz)
Communication (submerged): Evologics S2CR 48/78 kHz, usable as USBL transponder
Communication (tethered): 10 GBit/s optical fibre; 1 GBit/s Cat5e (max. 50 m)
Light: 4x Bowtech LED-K-3200 (3200 lumen each)
Laser line projector: 2x Picotronic LD532-20-3(20x80)45-PL line lasers, 20 mW each @ 532 nm
Sonar: BlueView MB1350-45 Multibeam Profiler (inspection sonar); Tritech Gemini 720i Multibeam Imager (navigation sonar); 2x Tritech Micron Sonar (obstacle avoidance)
Camera: 4x Basler ace acA2040-gc25, 2048x2048 at 25 frames/s, colour, Gigabit Ethernet
Depth Sensor: Paroscientific 8CDP700-I
INS/AHRS: KVH 1750 IMU
DVL: Rowe SeaProfiler DualFrequency 300/1200 kHz

2.2.1 Sensors and Actuators

2.2.1.1 Navigation System

One of the problems of underwater vehicles is how to localize themselves precisely in the environment. The absence of a global positioning system causes FlatFish to rely mostly on dead reckoning for navigation. It is performed by fusing data from the accelerometers and gyroscopes of the Inertial Navigation System (INS) with the velocity measured by the Doppler Velocity Logger (DVL). The dead-reckoning error, however, grows without bound over time. Thus, FlatFish also takes advantage of visual and acoustic sensors to extract features of known submerged structures and correct the localization error. FlatFish is also equipped with an Ultra-Short Baseline (USBL), a sensor placed on the docking station that obtains the position of the vehicle from the time-of-flight of an emitted acoustic signal. The USBL is able to track the vehicle position within a 1 km radius. The vehicle is also equipped with two mechanical single-beam sonars able to detect obstacles around the vehicle.

2.2.1.2 Inspection System

The inspection sensors play an important role on FlatFish, as inspection is the core of FlatFish's objective. The vehicle's payload is composed of a mix of visual and acoustic sensors, chosen so that data can be acquired even under adverse environmental conditions, such as high turbidity. The visual group is composed of four cameras forming two stereo camera systems that can record colour video at 25 fps and 2K resolution (2048x2048). In addition, two lasers can project green lines on the image to extract depth information and use it, for instance, for 3D model reconstruction. Concerning the acoustic inspection sensor, FlatFish uses a high-resolution profiling multibeam sonar, which provides the means to create a 3D model of a structure even in bad visibility.

2.2.1.3 Propellers

FlatFish uses six thrusters with 60 N of force each. These are able to control the vehicle in five DOF, which characterizes the AUV as an overactuated system. The distribution of the thrusters is shown in Figure 13. This distribution results in the Thrust Configuration Matrix (TCM) shown in Table 2 [53]. The TCM is a matrix that relates the forces on the thrusters to the efforts in each degree of freedom.

Figure 13 – FlatFish thruster configuration [53]

Table 2 – Thrust Configuration Matrix [53]

        T1      T2      T3       T4      T5       T6
Surge   1       1       0        0       0        0
Sway    0       0       -1       -1      0        0
Heave   0       0       0        0       1        1
Roll    0       0       0        0       0        0
Pitch   0       0       0        0       -0.4235  0.556
Yaw     0.44    -0.4    -0.5735  0.936   0        0
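As a small illustration of how the TCM is used, the sketch below maps a hypothetical set of thruster forces to generalized efforts (tau = TCM · f) and performs a simple pseudo-inverse allocation in the opposite direction. It only illustrates the role of the matrix in Table 2 and is not the vehicle's actual thruster-allocation code; the force values are made up.

```python
import numpy as np

# Thrust Configuration Matrix from Table 2 (rows: surge, sway, heave,
# roll, pitch, yaw; columns: thrusters T1..T6).
TCM = np.array([
    [1.00,  1.00,  0.0000,  0.000,  0.0000, 0.000],
    [0.00,  0.00, -1.0000, -1.000,  0.0000, 0.000],
    [0.00,  0.00,  0.0000,  0.000,  1.0000, 1.000],
    [0.00,  0.00,  0.0000,  0.000,  0.0000, 0.000],
    [0.00,  0.00,  0.0000,  0.000, -0.4235, 0.556],
    [0.44, -0.40, -0.5735,  0.936,  0.0000, 0.000],
])

# Forward mapping: individual thruster forces (N) -> generalized efforts.
f = np.array([10.0, 10.0, 0.0, 0.0, 5.0, 5.0])
tau = TCM @ f
print("efforts [surge sway heave roll pitch yaw]:", tau)

# Allocation: desired efforts -> thruster forces via the pseudo-inverse.
tau_des = np.array([20.0, 0.0, 10.0, 0.0, 0.0, 1.0])
f_cmd = np.linalg.pinv(TCM) @ tau_des
print("commanded thruster forces:", f_cmd)
```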

2.2.2 Vehicle Model

The dynamic behavior of underwater vehicles is essentially non-linear. A Newton-Euler formulation for underwater vehicles proposed by [54] is expressed in Equation 2.1.

$$M\dot{v} + C(v)v + D(v)v + g(\eta) = \tau_e + \tau \qquad (2.1)$$

where:

• M = inertia matrix, including the added mass;

• C(v) = matrix of Coriolis and centripetal terms, including added mass;

• D(v) = damping matrix;

• g(η) = vector of gravitational forces and moments;

• τe = vector of environmental forces;

• τ = vector of control inputs.

Note that C(v) and D(v) are non-linear, velocity-dependent components. Additionally, the velocities described in the aforementioned model are coupled, which means that the velocity in one degree of freedom contributes to the movement of the vehicle in a different DOF. Ocean currents and umbilical dynamics (in ROVs) are usually present; they are typically not considered in the controller design, but regarded as disturbances. Moreover, underwater vehicles are subject to delay and saturation of the propellers [55]. For vehicles operating at small velocities and symmetrical in the three planes, a common practice is to simplify the model by assuming the added mass constant and disregarding the off-diagonal and coupling terms [56]. Thus, Equation 2.1 can be rewritten as:

$$m_i\dot{v}_i(t) + d_Q v_i(t)\,|v_i(t)| + d_L v_i(t) + b_i = \tau_i(t) \qquad (2.2)$$

where each degree of freedom i has its own added mass (m), damping terms (dQ and dL), buoyancy and control forces. The FlatFish model parameters (from Equation 2.2) were identified via the least-squares method, by collecting the poses and the thrusters' efforts while the vehicle performed sinusoidal movements in each degree of freedom. The results are shown in Table 3.

Table 3 – Model parameters of AUV FlatFish [56]

DOF     Inertia (m)          Quad. Damp. (dQ)     Linear Damp. (dL)        Buoyancy (b)
Surge   851.05 [kg]          8.62 [kg/m]          39.57 [kg/s]             -0.81 [N]
Sway    976.05 [kg]          157.52 [kg/m]        64.96 [kg/s]             3.00 [N]
Heave   1511.53 [kg]         1911.21 [kg/m]       -70.29 [kg/s]            -12.42 [N]
Yaw     301.60 [kg.m²/rad]   279.39 [kg.m²]       26.92 [kg.m²/(rad.s)]    -0.0909 [N.m]
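To illustrate how the decoupled model of Equation 2.2 behaves with the identified surge parameters of Table 3, here is a minimal explicit-Euler sketch; the 60 N effort, the time step and the simulation length are arbitrary choices for the example, not values used in the thesis.

```python
# Identified surge parameters from Table 3 (decoupled 1-DOF model, Eq. 2.2).
m, dQ, dL, b = 851.05, 8.62, 39.57, -0.81

def surge_step(v, tau, dt=0.05):
    """One explicit-Euler step of m*v_dot + dQ*v|v| + dL*v + b = tau."""
    v_dot = (tau - dQ * v * abs(v) - dL * v - b) / m
    return v + v_dot * dt

# Apply a constant 60 N surge effort and let the velocity settle.
v = 0.0
for _ in range(2000):        # 100 s of simulated time
    v = surge_step(v, tau=60.0)
print(f"surge velocity after 100 s: {v:.2f} m/s")
```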

2.2.3 Software Layer

The software layer is based on the Robotic Construction Kit (ROCK), an open-source, model-based software framework for building robots. The framework provides standard tools for development, such as logging and log replay, data visualization and state monitoring [57]. The Rock framework also integrates with the robot simulator Gazebo, which permits testing the software integration, the algorithms and the behavior of the system under faulty conditions. The FlatFish team enabled in Gazebo a robust simulation of the physical effects of water, such as buoyancy, drag and added mass, the simulation of different types of sonar, and the simulation of water effects on cameras (color attenuation and depth of view), which has proved very useful for testing control and computer vision algorithms.

The Rock components, such as drivers, navigation and image processing algorithms, and controllers, are managed by the system coordinator Syskit, which is responsible for binding the components together to provide a given function. For instance, in a visual servoing task, Syskit handles the connection of the camera driver component with the image feature extractor and the controller. Syskit also manages missions as a whole, with their action state machines, and takes care of emergency procedures in case of failures, from resetting a single component to sending the vehicle to the surface in case of a critical failure.

3 Computer Vision

This chapter introduces basic computer vision concepts related to this master's thesis. It starts with the particularities of underwater image formation and the challenges related to underwater image processing. Afterwards, the pinhole camera model is presented, together with the basic concepts of perspective geometry required for its understanding. Then, the process of pose estimation with a single camera is introduced. Finally, we introduce artificial fiducial markers, their general concepts and the specificities of the marker systems used in this work.

3.1 Underwater Image Processing

Underwater image analysis is of interest for many applications, such as inspection of man-made structures, observation of marine fauna, monitoring and seabed studies [58]. Nonetheless, unlike normal images taken in air, the physical characteristics of the water medium cause underwater images to be affected by limited visibility range, low contrast, non-uniform lighting, blurring, bright artifacts and noise [59]. Light attenuation in the water is caused by light absorption and light scattering; the former reduces the energy of the light while the latter changes its direction. It directly affects visibility, which can vary from twenty meters down to five meters or even less [60]. Light attenuation is not caused only by light propagation in the water medium, but also by dissolved and floating particles in the water, whose interactions increase the absorption and scattering effects. The underwater image formation model is based on the assumption that the light received by the camera is composed of the superposition of three components: the direct component (Ed), the forward-scattered component (Ef) and the backscatter component (Eb) [60], thus:

ET = Ed + Ef + Eb (3.1)

The direct component is the light directly reflected by the object. The forward-scattered component is also reflected by the object, but scattered at small angles. The backscatter component is light reflected by objects not present in the scene, for instance suspended particles, that still enters the camera [60]. Figure 14 shows these three components. The forward-scattered component contributes to the blurring effect in the final image, while backscattering affects the contrast.

Figure 14 – Representation of the three components of light received by the camera [60]

Figure 15 – Color absorption by function of distance [59]

The absorption of the wavelengths corresponding to red is higher than for the other colors, so the red component quickly decreases after a few meters. The wavelengths corresponding to blue are the least affected, which gives underwater images a bluish appearance. In the deep ocean, the absence of light requires an artificial lighting source; however, it tends to add a bright spot at the center of the image and low luminosity at the corners [61]. Figure 15 shows color absorption as a function of distance [62].

Research groups have been making efforts to turn underwater images more comprehensible to humans and even to machines, since that would permit the use of standard image processing algorithms [60]. Underwater image processing is mainly divided into two groups: image restoration and image enhancement. The former uses a model of the original image formation and a degradation model to recover the original image; it involves the knowledge of many parameters, such as attenuation and diffusion coefficients. The latter does not use any physical model and relies on techniques such as contrast stretching and color space conversion to transform the degraded image into a more legible one [60].
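To give a feeling for the wavelength-dependent falloff of the direct component, the following toy sketch applies a Beer-Lambert style attenuation per color channel as a function of distance. The per-channel coefficients are invented purely for illustration and are not measurements from this work, and the sketch ignores the forward-scatter and backscatter components.

```python
import numpy as np

# Illustrative per-channel attenuation coefficients (1/m); red is absorbed
# fastest, blue the least. Values are made up for the example.
c = {"red": 0.60, "green": 0.12, "blue": 0.05}

def attenuate(rgb, distance):
    """Apply a simple exponential falloff E = E0 * exp(-c*d) per channel."""
    return np.array([
        rgb[0] * np.exp(-c["red"] * distance),
        rgb[1] * np.exp(-c["green"] * distance),
        rgb[2] * np.exp(-c["blue"] * distance),
    ])

white = np.array([1.0, 1.0, 1.0])
for d in (1, 5, 10):
    print(d, "m ->", np.round(attenuate(white, d), 3))
```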

3.2 Perspective Transform

3.2.1 Pinhole camera model

This section introduces the simplest and most specialized camera model, the pinhole model. A camera model is a matrix that maps a 3D world point into a 2D image point [63]. In that transformation the depth information is lost, i.e., without any prior knowledge about the object size it is impossible to discern, just by looking at the image, whether the object is small or the camera is positioned very far from the object [64]. Figure 16.a shows a picture of a parking lot; however, only with the people in the background of Figure 16.b is it possible to realize that it is a miniaturized city.

Figure 16 – Picture of a miniaturized parking lot taken with a camera close to the objects.

The pinhole model represents the phenomenon in which light goes through a tiny orifice and an inverted image is projected on the wall of a dark room [64]. The process of image formation in humans and in CCD cameras is similar to the image formation in a pinhole camera: the inverted image is projected onto the retina, or the sensor, and flipped by the brain or by the microprocessor. Unlike the pinhole camera, humans and cameras have convex lenses that allow more light to pass through and, consequently, produce brighter images [64]. The pinhole camera is a central projection camera, that is, any line through a point in the world and its projection on the image plane passes through the same point in the camera frame, namely the camera center. Figure 17.a illustrates the central projection model. The non-inverted image is projected onto a plane parallel to the x-y plane at Z = f, the image plane. As was said, the origin of the coordinate system, C, is the camera center or optical center. The line from the camera center perpendicular to the image plane is the principal axis or principal ray, and the plane through C parallel to the image plane is called the principal plane. Finally, the principal axis crosses the image plane at the principal point [63].

Figure 17 – Pinhole camera geometry: (a) central projection model; (b) triangle similarities

The mapping of the point P = (X, Y, Z)⊤ to the image plane point p = (x, y) is obtained by triangle similarity, according to Figure 17.b, thus:

$$x = f\frac{X}{Z} \qquad y = f\frac{Y}{Z} \qquad (3.2)$$

This projection is a mapping from $\mathbb{R}^3$ to $\mathbb{R}^2$ and has the following properties [64]:

1. Straight lines in the world keep straight in the image plane.

2. Parallel lines in the world intersect in the image plane at the vanishing point (point at infinity), except for lines parallel to the image plane, which do not converge.

3. Conics (circles, ellipses, parabolas, hyperbolas) are projected as conics in the image plane.

4. For a given projected point p = (x, y), there is no unique solution for retrieving the world point P = (X, Y, Z). The point P can be anywhere along the ray CP (shown in Figure 17).

5. The internal angles are not preserved, so the shape is not preserved.

Equation 3.2 assumes that the origin of the image plane is at the principal point, which may not be the case. So, letting the coordinates of the principal point be (px, py)⊤, the following mapping gives a more general relation:

$$x = f\frac{X}{Z} + p_x \qquad y = f\frac{Y}{Z} + p_y \qquad (3.3)$$

The pinhole model assumes that image coordinates have the same scale in both directions. However, CCD cameras may have non-square pixels. Taking \rho_w and \rho_h as the pixel width and pixel height, respectively, the point on the image plane expressed in pixel coordinates is given by:

u = \frac{x}{\rho_w} + u_0, \qquad v = \frac{y}{\rho_h} + v_0 \qquad (3.4)

where u_0 and v_0 are the coordinates of the principal point expressed in pixels. Writing this in homogeneous form:

X u αx 0 u0 0 Y  v = 0 α v 0   (3.5)    y 0        Z  w  0 0 1 0          1  f f where αx = /ρw and αx = /ρw. In short:

p = K[I|0]P (3.6) with:

αx 0 u0 K =  0 αy v0 (3.7)      0 0 1  where K is the camera calibration matrix [63] and its elements are called the intrinsic parameters.

3.2.2 Distortion parameters

The equations above assume perfect lenses. However, real lenses, especially low-cost ones, have imperfections. These add several distortions to the formed image, such as chromatic aberration, variation of focus across the scene and geometric distortions, which cause points to be projected at a position different from where they should be [64]. In robotic applications, geometric distortions are the most problematic, consisting of radial and tangential components. The radial distortion displaces an image point along the circumference centered at the principal point with radius equal to the distance between the point and the principal point. Tangential distortions occur at right angles to the radial direction and are usually less significant than the radial ones [64]. Figure 18 illustrates the effect of geometric distortions during the image formation process.

Figure 18 – Diagram showing the position of the pixel in the ideal case and the effect of the radial distortion (δr) and tangential distortion (δt)

The radial distortion is modeled by a polynomial approximation:

\delta_r = k_1 r^3 + k_2 r^5 + k_3 r^7 + \ldots \qquad (3.8)

where r is the distance to the principal point. Therefore, the coordinates (u, v) of a point with distortion are given by:

\begin{bmatrix} \delta_u \\ \delta_v \end{bmatrix} = \begin{bmatrix} u(k_1 r^2 + k_2 r^4 + k_3 r^6 + \ldots) \\ v(k_1 r^2 + k_2 r^4 + k_3 r^6 + \ldots) \end{bmatrix} + \begin{bmatrix} 2 p_1 u v + p_2 (r^2 + 2u^2) \\ p_1 (r^2 + 2v^2) + 2 p_2 u v \end{bmatrix} \qquad (3.9)

Usually three parameters are enough to characterize the radial distortion. Thus the distortion model is parameterized by (k_1, k_2, k_3, p_1, p_2), and these parameters are considered additional intrinsic parameters [64].

3.2.3 Camera Calibration

The process of determining the intrinsic parameters is known as camera calibration. It relies on the correspondence between world points and their projections on the image plane. Bouguet's calibration method is widely applied and requires samples of a planar chessboard in different positions and orientations [64]. A tool that automatically detects the chessboard and computes the calibration matrix is available in both OpenCV and Matlab. In practice, it is common to obtain the camera intrinsic parameters through the calibration process and then undistort the image based on the distortion parameters.
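As an illustration of the calibration procedure described above, the sketch below uses OpenCV's chessboard-based routine (Bouguet-style); the folder name, board size and square size are assumptions, not the setup actually used in this work.

    import glob
    import cv2
    import numpy as np

    board = (9, 6)                                  # inner corners of the assumed chessboard
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * 0.025  # 25 mm squares (assumed)

    obj_points, img_points = [], []
    for fname in glob.glob("calibration/*.png"):    # hypothetical image folder
        gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # Estimates K and the distortion vector (k1, k2, p1, p2, k3)
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)

    # Undistorting an image with the estimated parameters
    undistorted = cv2.undistort(cv2.imread("calibration/sample.png"), K, dist)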

3.2.4 Pose Estimation with Single Camera

The problem of estimating the position and orientation of a camera with respect to an object is known as pose estimation, and it requires the camera intrinsic parameters and the 3D geometric model of the object. In addition, it demands the correspondence of N points on the image (u_i, v_i) with points in the world (X_i, Y_i, Z_i), where i \in [1, N]. This problem is known as Perspective-n-Point (PnP). Many methods to address this problem have been proposed; they are divided into analytical and least-squares solutions. There is a solution for a system with three points, though it yields multiple solutions. A unique solution is possible with four coplanar, but non-collinear, points. The correspondence of six or more points also has a unique solution and provides not only the relation between camera and object, but also the intrinsic camera parameters [24].
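A common way to solve the PnP problem in practice is OpenCV's solvePnP; the fragment below is only a sketch, where the marker size, the pixel coordinates and the calibration matrix are assumed values, and the chosen flag assumes a recent OpenCV version.

    import cv2
    import numpy as np

    # 3D corners of a hypothetical square marker of 0.4 m side, in the marker frame
    s = 0.4 / 2
    object_points = np.array([[-s,  s, 0], [ s,  s, 0],
                              [ s, -s, 0], [-s, -s, 0]], dtype=np.float32)

    # Corresponding pixel coordinates returned by a marker detector (illustrative values)
    image_points = np.array([[512, 300], [720, 305],
                             [715, 510], [508, 505]], dtype=np.float32)

    K = np.array([[800.0, 0.0, 640.0],
                  [0.0, 800.0, 512.0],
                  [0.0, 0.0, 1.0]])
    dist = np.zeros(5)                       # assuming an already undistorted image

    # Perspective-n-Point: pose of the marker frame expressed in the camera frame
    ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist,
                                  flags=cv2.SOLVEPNP_IPPE_SQUARE)
    R, _ = cv2.Rodrigues(rvec)               # rotation matrix from the rotation vector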

3.3 Artificial Fiducial Markers

Artificial fiducial markers are landmarks designed to be easily detected by computer vision systems. They have a unique ID encoded in their structure following a pre-defined pattern system and provide a means for reliable pose estimation of the marker with respect to the camera. Because of these features, fiducial markers have many applications in augmented reality [65] and [66], where they can be used in addition to natural landmarks to improve navigation performance. They can also be employed to label specific structures [67], such as a docking station, or areas of interest, such as sampling points. In underwater environments, fiducial markers can also be applied to robot mapping and localization [68][69], to the monitoring of underwater sensors [70], to assist ROV operators in robot control [71] and as an interface for human-robot communication [72][73]. Their advantages have aroused the interest of the scientific community and several marker systems have been proposed. MAtrix [74] and ARToolKit [75] were two of the first marker systems created, and their quadratic form has inspired other markers such as ARTag [76], ArUco [77] and AprilTags [78]. Circular shapes were proposed by Rune-Tag [79]. The encoding systems also vary between marker systems. For instance, Mono-spectrum Markers [80] and Fourier Tags [81] rely on a frequency analysis of the image to decode their patterns. The use of fiducial markers in underwater environments is challenging because the acquired image is subject to the many degradation factors mentioned in Section 3.1. The following section details the three marker systems chosen to be evaluated in this work. ARToolKit, AprilTags and ArUco were chosen because they are released under open-source licenses and provide ready-to-use implementations.

3.3.1 The evaluated marker systems

ARToolKit [75] is one of the most popular marker systems and has been used in several applications over the years. During its detection stage, ARToolKit uses a global threshold to create a binary image and an image correlation method in which the payload is compared with a predefined database. Although this feature allows the user to create markers with intuitive patterns, as shown in [67], it increases the computational effort when a large database is used, since each potential marker has to be checked against the entire database. AprilTags [78] uses a graph-based image segmentation with local gradients to estimate lines and introduces a quad detection method designed to handle occlusions. Once the lines are detected, AprilTags relies on a spatially-varying threshold using known values of black and white pixels to decode the payload. This makes the detection system robust against lighting variations within the observed marker. Its encoding guarantees a certain Hamming distance between the codewords and for this reason decreases the inter-marker confusion rate. ArUco [77] proposes a method to create a dictionary with a configurable number of markers and number of bits. This method maximizes the bit transitions and the inter-marker difference to reduce the false positive and inter-marker confusion rates, respectively. The ArUco library also features a method for error correction. Its detection process consists of applying an adaptive threshold to a grayscale image and then finding the marker candidates by discarding the contours that cannot be approximated by a rectangle. Subsequently, the code extraction, marker identification and error correction stages are applied. Figure 19 shows the presented marker systems.
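For reference, the fragment below sketches marker detection with the aruco module shipped with OpenCV; note that the evaluation in Chapter 5 uses the standalone ArUco 1.2.5 library, so this is only an accessible stand-in, and the dictionary choice and file name are assumptions.

    import cv2

    # Dictionary compatible with the original ArUco markers (assumed choice)
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_ARUCO_ORIGINAL)

    frame = cv2.imread("frame.png")               # hypothetical captured frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Adaptive thresholding, contour filtering, code extraction and error correction
    # happen inside detectMarkers, as outlined above (API of OpenCV <= 4.6).
    corners, ids, rejected = cv2.aruco.detectMarkers(gray, dictionary)

    if ids is not None:
        print("Detected marker IDs:", ids.ravel())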

(a) ARToolKit [75] (b) Apriltag [78] (c) ArUco [77]

Figure 19 – Studied marker systems

4 Visual Controller

Visual servoing is a control scheme that uses visual information to control the motion of a robot. It uses one or multiple cameras to extract features from an image and map them into control commands. Vision-based control has been applied to increase the accuracy and flexibility of robotic systems [24, 82, 33]. The first efforts to control robots using visual information date from 1973, when camera information was used to place a block inside a box [83]. In that work, the difference between the current and the desired position is computed and, based on that, the motion command is provided. There, the robot moves towards the goal pose without any visual feedback. When the robot stops, it takes another picture to check whether the desired position was achieved; if it was not, the process is repeated. The term visual servoing was first introduced in 1979 and it characterizes a system with a faster image feature extractor which provides real-time information and enables a closed-loop control chain [84]. However, despite the efforts during the 1970s, only in the 1990s did the number of publications increase significantly, mainly due to the considerable advances in processing power and computer vision techniques [24]. Visual servoing can be deployed in several configurations. It can either have a camera attached to the robot's body (eye-in-hand) or an external camera mounted in the environment (eye-to-hand). It can also be classified regarding the level of actuation: the dynamic look-and-move approach provides a setpoint to the low-level controller as velocity commands, while direct visual servo uses the visual controller to directly command the actuators [24]. Since most mobile robots have an embedded camera, the majority of visual servoing applications are eye-in-hand. Nevertheless, eye-to-hand has potential use in collaborative robots and a mixed approach has been used in humanoid robots [85, 86]. Concerning the control level, most implementations adopt the dynamic look-and-move approach, mainly because it requires a lower update rate than direct visual servo. In addition, most robots already have a low-level controller implemented. Thus, the modularity added by the use of dynamic look-and-move grants simplicity and portability to the controller, since it gives a kinematic control character to the visual controller [24]. The classical visual servoing literature, such as tutorials and reviews [24, 87, 88, 89], classifies eye-in-hand visual servoing into two big categories according to the nature of the controller input. Image Based Visual Servoing (IBVS) compares the pixel coordinates of the features with the pixel coordinates that the features will assume when the robot achieves the desired pose. Since the task is defined on the image plane, this approach is sometimes referred to as 2D visual servoing [33]. On the other hand, Position Based Visual Servoing (PBVS) estimates the 3D camera pose based on the extracted features and tries to minimize the difference to the desired pose. Given that the task function is defined in Cartesian space, PBVS is also referred to as 3D visual servoing [33]. Figure 20 and Figure 21 illustrate the structure of both approaches.

Figure 20 – Structure of dynamic look-and-move IBVS

Figure 21 – Structure of dynamic look-and-move PBVS

Both IBVS and PBVS present advantages and disadvantages. The image-based visual controller does not require any 3D model of the target, it is robust to errors in the camera calibration [33] and may reduce the computational time, since the pose estimation step is not performed [87]. On the other hand, it introduces a challenge to the controller design because the process is non-linear and highly coupled [87]. The position-based scheme is highly sensitive to camera calibration errors and to uncertainties in the 3D object model [33]. Since PBVS is defined in Cartesian space, the camera performs an optimal trajectory in that space, but this can result in the loss of features, since there is no direct regulation of the image features in the image space. Conversely, IBVS is defined in the image space, so there is no direct control of the camera motion in 3D space, and the IBVS controller may cross a singularity or reach a local minimum [64, 88]. A mixed approach introduced by [33] aims to compensate for the shortcomings of both strategies while taking advantage of their strengths. As it combines aspects of 3D and 2D visual controllers, it is called 2 1/2 D visual servoing or hybrid visual servoing (HVS). The 2 1/2 D approach does not need any previous knowledge of the geometric model of the targeted object and, unlike IBVS, it ensures convergence of the control law over the whole task space [33].

All mentioned visual servoing approaches aim to minimize the error function e(t) defined by Equation 4.1 [88]:

e(t) = s(m(t), a) - s^* \qquad (4.1)

where s is a k-dimensional vector of features that represents image measurements, such as coordinates of points of interest, image coordinates of region centroids and angles. Furthermore, m(t) and a provide additional information, such as the geometric dimensions of an object and the camera intrinsic parameters. In turn, s^* is the vector of desired features, here considered constant during the task, i.e., time independent. In order to design a velocity controller, in a dynamic look-and-move fashion, it is necessary to relate the velocity of the camera v_c, defined by its linear and angular parts v_c = [v_c, \omega_c], with the velocity of the features \dot{s}. This is done via Equation 4.2:

\dot{s} = L_s v_c \qquad (4.2)

where L_s \in \mathbb{R}^{k \times 6} is the interaction matrix and correlates the velocity of the camera with the velocity of the features. The interaction matrix is also called in the literature feature Jacobian, image Jacobian or feature sensitivity matrix. Therefore, the error variation is obtained directly from 4.1 and 4.2:

\dot{e} = L_e v_c \qquad (4.3)

where L_e = L_s. In order to ensure an exponential decrease of the error, we can impose \dot{e} = -\lambda e and apply it to 4.3. Thus we have the control law defined in terms of the error, expressed by Equation 4.4:

v_c = -\lambda L_e^+ e \qquad (4.4)

where L_e^+ \in \mathbb{R}^{6 \times k}, defined as L_e^+ = (L_e^\top L_e)^{-1} L_e^\top, is the Moore-Penrose pseudo-inverse. This ensures, when L_e is not full rank, a minimal value for \|v_c + \lambda L_e^+ e\|. In the case of \det(L_e) \neq 0, the matrix is invertible, which results in L_e^+ = L_e^{-1} [88]. In practical visual servoing tasks, it is not possible to know exactly either the interaction matrix L_e or its pseudo-inverse L_e^+, so an approximation or estimation is performed [88]. The approximation of the pseudo-inverse, or the pseudo-inverse of the approximation, will be denoted as \hat{L}_e^+. Therefore, the control law of Equation 4.4 is redefined as:

v_c = -\lambda \hat{L}_e^+ e \qquad (4.5)
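Equation 4.5 boils down to a single matrix-vector operation; the minimal sketch below (NumPy, illustrative only) computes the commanded camera twist from an arbitrary error vector and an approximated interaction matrix.

    import numpy as np

    def control_law(L_hat, error, lam=0.1):
        """v_c = -lambda * pinv(L_hat) * e  (Equation 4.5)."""
        return -lam * np.linalg.pinv(L_hat) @ error

    # Example with a hypothetical interaction matrix and error vector
    L_hat = np.eye(6)
    e = np.array([0.2, -0.1, 0.4, 0.05, 0.0, -0.02])
    v_c = control_law(L_hat, e)       # 6-vector (v_x, v_y, v_z, w_x, w_y, w_z)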

The remainder of the chapter is dedicated to defining the form of s and the respective L_e for IBVS, PBVS and, finally, 2 1/2 D visual servoing. A stability analysis is also presented at the end of the chapter.

4.1 IBVS

As previously mentioned, IBVS is a visual servoing scheme that uses information on the image plane to feed the controller. The most common applications use the coordinates of image points, but other features, such as geometric primitives (line segments, straight lines, spheres, circles and cylinders), can also be utilized. This work focuses on the use of coordinates of image points.

4.1.1 Interaction Matrix for Points

Consider a camera that moves with velocity v_c = (v_c, \omega_c), where v_c and \omega_c are the linear and angular velocities with respect to an inertial reference frame, a fixed point P = (X, Y, Z) and its projection on the image plane p = (x, y). The velocity of P relative to the camera is defined in Equation 4.6 [64, 88]:

\dot{P} = -\omega \times P - v \qquad (4.6)

which can be expanded in the form of Equations 4.7, 4.8 and 4.9:

\dot{X} = Y\omega_z - Z\omega_y - v_x \qquad (4.7)

\dot{Y} = Z\omega_x - X\omega_z - v_y \qquad (4.8)

\dot{Z} = X\omega_y - Y\omega_x - v_z \qquad (4.9)

From Chapter 3, the differentiation of Equation 3.2 is given by Equation 4.10 and Equation 4.11:

\dot{x} = f\frac{\dot{X}Z - X\dot{Z}}{Z^2} \qquad (4.10)

\dot{y} = f\frac{\dot{Y}Z - Y\dot{Z}}{Z^2} \qquad (4.11)

By rearranging Equation 3.2:

X = \frac{xZ}{f}, \qquad Y = \frac{yZ}{f}

Substituting Equations 4.7, 4.8 and 4.9, together with the rearranged Equation 3.2, into Equations 4.10 and 4.11, the interaction matrix for points of Equation 4.12 is found:

vx  2 2 vy  f x xy f + x   − 0 − y    x˙  Z Z f f vz  =   (4.12)  f y f 2 + y2 xy    y˙   ωx  0 − − −x    Z Z f f    ωy     ωz  Since the camera provides the information in terms of image coordinates, using the 3.4 it is possible to rewrite 4.12, as it is shown by Equation 4.13.

vx  2 2 2 vy  f u¯ ρhu¯v¯ f + ρwu¯   ˙ − 0 − v¯    u¯ ρwZ Z f ρwf vz   2 2 2    (4.13) ˙ f v¯ f + ρ v¯ ρwu¯v¯   v¯  h  ωx  0 − − −u¯    ρhZ Z ρhf f    ωy     ωz  where u¯˙ = u − u0, v¯˙ = v − v0 and ρw and ρh are the sensor width and height, respectively. Compactly:

\dot{p} = \dot{s} = L_{s1} v_c \qquad (4.14)

Note that the matrix of Equation 4.13 is defined for a single point, thus it belongs to \mathbb{R}^{2 \times 6}. A minimum of three non-coincident and non-collinear points is needed in order to ensure the controllability of all degrees of freedom [88]; in that case L_s \in \mathbb{R}^{6 \times 6}, given that each point provides two parameters (u and v). Therefore, the vector of features is s = (s_1, s_2, s_3) = (u_1, v_1, u_2, v_2, u_3, v_3). Finally, L_s is built from the matrices shown in Equation 4.13.

Ls1 Ls = Ls2 (4.15)   Ls3  6×6 Nonetheless, the use of only three points to build the interaction matrix shown on Equation 4.15 leads to some configurations in which Ls is singular. On that cases four Chapter 4. Visual Controller 46 global minima exists, which means that there are four distinct poses in which the error is zero. For that reason, usually more than three points are adopted [88].

4.1.2 Estimation of the Interaction Matrix - \hat{L}_s

In real visual servoing problems it is not possible to know perfectly the real value of L_s, so an estimation of its parameters must be performed [88]. The estimated interaction matrix is denoted here as \hat{L}_s and its pseudo-inverse as \hat{L}_s^+.

When the depth of all points is known, the direct approach is to use \hat{L}_s = L_s. The depth estimation must be performed at each cycle. Another popular choice is \hat{L}_s = L_{s^*}, i.e., the interaction matrix is constant and uses the depth information of the features at the desired position. In this approach, no 3D estimation is required. A third approach computes the mean of both previous strategies and defines \hat{L}_s = \frac{1}{2}(L_s + L_{s^*}). In that case, the depth of each point has to be estimated.

Although with \hat{L}_s = L_{s^*} the system converges even for large displacements, it shows poor behavior in terms of pixel motion on the image plane and 3D camera motion. With \hat{L}_s = L_s, the pixel trajectories on the image plane are almost straight lines; however, the 3D camera motion shows a worse performance than with \hat{L}_s = L_{s^*}: the contribution of \hat{L}_s at the beginning of the task is large, which implies large camera velocities and a 3D camera trajectory that is far from a straight line. The \hat{L}_s = \frac{1}{2}(L_s + L_{s^*}) approach, however, has presented good performance, which means small oscillations and smooth trajectories both on the image plane and in 3D space.

4.2 PBVS

In position-based visual servoing, the vector of features s is defined in Cartesian space as the camera pose relative to the object. In that case, both the camera calibration values and the 3D geometric model of the object are required. The problem of pose estimation from the features tracked on the image plane was briefly introduced in Section 3.2.4. The concept of pose estimation is broad, since after a pose estimation the camera can be considered a regular position sensor and its measurements may be fed to controllers such as a Proportional-Integral-Derivative (PID) controller. However, [88] presented two PBVS schemes that fit the general visual servoing formulation introduced at the beginning of this chapter.

Given a set of three coordinate frames, F_c the camera frame, F_{c^*} the desired camera frame and F_o the frame attached to the object, the vector of features s = (t, \theta u), where t is a translation vector and \theta u is the angle-axis parametrization of the rotation, is here defined in two forms.

The first approach defines the vector t relative to the object frame Fo, thus:

s = ({}^{c}t_o, \theta u), \qquad s^* = ({}^{c^*}t_o, 0), \qquad e = ({}^{c}t_o - {}^{c^*}t_o, \theta u) \qquad (4.16)

For this case, the interaction matrix is defined as:

c −I3 [ to]× Ls = (4.17)  0 Lθu  and its inverse is:

\hat{L}_s^+ = \begin{bmatrix} -I_3 & [{}^{c}t_o]_\times L_{\theta u}^{-1} \\ 0 & L_{\theta u}^{-1} \end{bmatrix} \qquad (4.18)

where L_{\theta u} is defined in Equation 4.19.

L_{\theta u} = I_3 - \frac{\theta}{2}[u]_\times + \left(1 - \frac{\operatorname{sinc}\theta}{\operatorname{sinc}^2\frac{\theta}{2}}\right)[u]_\times^2 \qquad (4.19)

in which [\cdot]_\times is the skew-symmetric matrix, shown in 4.20. Interesting properties of L_{\theta u} are that L_{\theta u} \approx I_3 for small values of \theta and that L_{\theta u}^{-1}\theta u = \theta u [33].

[u]_\times = \begin{bmatrix} 0 & -u_z & u_y \\ u_z & 0 & -u_x \\ -u_y & u_x & 0 \end{bmatrix} \qquad (4.20)

Applying \hat{L}_s^+ from Equation 4.18 to Equation 4.5 gives the following control law:

c⋆ c c vc = −λ(( to − to) + [ to]×θu)  (4.21) ωc = −λθu  Another approach is to consider the translation vector t as the displacement of the camera frame (Fc) relative to the desired camera frame (Fc⋆ ). Thus:

s = ({}^{c^*}t_c, \theta u), \qquad s^* = 0, \qquad e = s \qquad (4.22)

The interaction matrix for this case is shown in Equation 4.23:

R 0  Ls = (4.23)  0 Lθu and its inverse:

T ˆ + R 0  Ls = −1 (4.24)  0 Lθu  since R−1 = R⊤ Replacing Equation 4.24 on Equation 4.5 it is given the control law defined in Equation 4.25

c⋆ vc = −λR tc  (4.25) ωc = −λθu  On the first approach the origin trajectory of the object follows a straight line as opposed to the camera trajectory on the 3D space. On the other hand, on the second approach the camera trajectory is a straight line if the camera parameters are perfectly estimated as opposed to that of the pixel trajectory on the image plane.

4.3 Hybrid Visual Servoing - 2 1/2D

4.3.1 Interaction Matrix

In contrast to the IBVS and PBVS schemes, in 2 1/2 D visual servoing the interaction matrix can be decoupled into translational and rotational components, which means L_s = [L_v \; L_\omega]. Therefore, Equation 4.2 can be rewritten according to Equation 4.26:

\dot{s} = L_s v_c = [L_v \; L_\omega]\,[v_c \; \omega_c]^\top = L_v v_c + L_\omega \omega_c \qquad (4.26)

Similarly, forcing the exponential decrease of the error, \dot{e} = -\lambda e:

-\lambda e = \dot{e} = \dot{s} = L_v v_c + L_\omega \omega_c \qquad (4.27)

which directly implies the control law for the linear velocities, as follows:

v_c = -L_v^+(\lambda e + L_\omega \omega_c) \qquad (4.28)

where \omega_c is the same as in the PBVS scheme of Equation 4.21, i.e., \omega_c = -\lambda\theta u. Thus, Equation 4.29 defines the control law for the 2 1/2 D visual controller [89]:

+ vc = −Lv (λe + Lωωc)  (4.29) ωc = −λθu  where the partitioned interaction matrix is defined as shown by Equation 4.30 and Equation Z 4.31, x and y are the image coordinates and ρ = : z Z⋆

L_v = \frac{1}{Z\rho_z}\begin{bmatrix} -1 & 0 & x \\ 0 & -1 & y \\ 0 & 0 & -1 \end{bmatrix} \qquad (4.30)

L_\omega = \begin{bmatrix} xy & -(1 + x^2) & y \\ 1 + y^2 & -xy & -x \\ -y & x & 0 \end{bmatrix} \qquad (4.31)

In the global representation:

Lv Lw  Ls = (4.32)  0 Lθu and its inverse:

− −1 ˆ Lv −LvLωLθu1 Lv −LvLω Ls = − = (4.33)  0 Lθu1   0 I3 

An important feature of \hat{L}_s^{-1} is that it is an upper triangular matrix without any singularity in the whole task space [33].
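The 2 1/2 D control law of Equation 4.29, with the partitioned blocks of Equations 4.30-4.31, can be sketched as follows (NumPy, illustrative only); in the actual work this computation is delegated to ViSP, so the function below is merely a didactic stand-in with assumed inputs.

    import numpy as np

    def hvs_control(x, y, Z, Z_star, theta_u, e_t, lam=0.1):
        """2 1/2 D control law (Equation 4.29) with L_v and L_w from Eqs. 4.30-4.31.

        (x, y)    : image coordinates of the reference point
        Z, Z_star : current and desired depths (rho_z = Z / Z_star)
        theta_u   : axis-angle rotation error (3-vector)
        e_t       : translational part of the error vector (3-vector)
        """
        rho_z = Z / Z_star
        L_v = (1.0 / (Z * rho_z)) * np.array([[-1.0, 0.0,  x],
                                              [0.0, -1.0,  y],
                                              [0.0,  0.0, -1.0]])
        L_w = np.array([[x * y,    -(1 + x**2),  y],
                        [1 + y**2, -x * y,      -x],
                        [-y,        x,           0.0]])
        w_c = -lam * np.asarray(theta_u)
        v_c = -np.linalg.pinv(L_v) @ (lam * np.asarray(e_t) + L_w @ w_c)
        return v_c, w_c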

4.4 Adaptive Gain

One way to improve the performance of the visual servoing task is to apply an adaptive gain to the control law instead of the regular static gain proposed above [90]. Therefore, the gain \lambda of the control law is modified during the task according to Equation 4.34:

\lambda(x) = a\,e^{-bx} + c \qquad (4.34)

where x = \|e\|_2, i.e., the 2-norm of the error vector. The values a, b and c are computed indirectly according to Equation 4.35:

a = \lambda(0) - \lambda(\infty), \qquad b = \dot{\lambda}(0)/a, \qquad c = \lambda(\infty) \qquad (4.35)

where \lambda(0) is the value that the gain assumes when x = 0, \lambda(\infty) is the gain value when x \to \infty, and \dot{\lambda}(0) is the slope of the curve at x = 0. Figure 22 shows an example of gain curve for \lambda(0) = 0.2, \lambda(\infty) = 0.1 and \dot{\lambda}(0) = 0.05.


Figure 22 – Example of adaptive gain curve using \lambda(0) = 0.2, \lambda(\infty) = 0.1 and \dot{\lambda}(0) = 0.05

The use of an adaptive gain makes the gain small when the error is large, reducing the control signal at the beginning of the task. As the error decreases, the gain increases, which improves the response time. This approach is particularly useful for visual servoing tasks because it avoids abrupt movements and, consequently, the loss of the marker at the beginning of the task, since the control signal produces smoother vehicle motion.
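A possible implementation of Equations 4.34-4.35 is sketched below (Python, illustrative only), using the curve parameters of Figure 22 as default values.

    import numpy as np

    def adaptive_gain(error, lam_0=0.2, lam_inf=0.1, lam_dot_0=0.05):
        """Adaptive gain of Equations 4.34-4.35."""
        a = lam_0 - lam_inf           # a = lambda(0) - lambda(inf)
        b = lam_dot_0 / a             # b = lambda_dot(0) / a
        c = lam_inf                   # c = lambda(inf)
        x = np.linalg.norm(error)     # x = ||e||_2
        return a * np.exp(-b * x) + c

    print(adaptive_gain(np.zeros(6)))        # -> 0.2, i.e. lambda(0)
    print(adaptive_gain(np.full(6, 10.0)))   # approaches lambda(inf) = 0.1 for large errors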

4.5 Stability Analysis

In this section, a stability analysis of the proposed visual servoing schemes is discussed according to Lyapunov theory. Given a dynamic system \dot{x} = f(x, t), with x(t_0) = x_0 and x \in \mathbb{R}^n, the point x^* is defined as an equilibrium point if f(x^*, t) \equiv 0, i.e., \dot{x}(x^*) \equiv 0. A system is said to be stable if a trajectory starting near x^* remains close to x^* for all time. The equilibrium point is asymptotically stable if it is stable and \lim_{t \to \infty} x(t) = x^* [91].

In the sense of Lyapunov, given a Lyapunov candidate function L : D \to \mathbb{R}, where D \subset \mathbb{R}^n, the equilibrium point x^* is:

• stable if L(0) = 0, L(x) > 0 \;\forall x \neq 0 and \dot{L}(x) \leq 0 \;\forall x \neq 0; and

• asymptotically stable if L(0) = 0, L(x) > 0 \;\forall x \neq 0 and \dot{L}(x) < 0 \;\forall x \neq 0.

Moreover, when L : \mathbb{R}^n \to \mathbb{R} and the conditions above are satisfied, the system is said to be globally stable and globally asymptotically stable in the sense of Lyapunov, respectively. In terms of the visual controllers introduced in the previous sections of this chapter, the Lyapunov candidate function is defined as:

L = \frac{1}{2}\|e\|^2 \qquad (4.36)

So L > 0 for e \neq 0 and L(0) = 0. The differentiation of Equation 4.36 gives:

\dot{L} = e^\top \dot{e} \qquad (4.37)

Applying Equation 4.3 to Equation 4.37:

\dot{L} = e^\top L_s v_c \qquad (4.38)

Substituting Equation 4.5 in Equation 4.38

\dot{L} = -\lambda e^\top L_s \hat{L}_s^+ e \qquad (4.39)

Given the quadratic form of \dot{L} expressed in Equation 4.39, the stability criterion is achieved if and only if:

L_s \hat{L}_s^+ > 0 \qquad (4.40)

since it ensures:

\dot{L} < 0, \quad \forall e \neq 0 \qquad (4.41)

We have seen that the controllers introduced in this chapter differ regarding the interaction matrix, so the particularities of each one are detailed in the remainder of this section.

4.5.1 Stability for PBVS

For PBVS, the stability analysis is quite direct. Given that L_{\theta u} is singular only for \theta = 2k\pi, \forall k \in \mathbb{Z}^*, which is outside the possible workspace, we have L_s\hat{L}_s^+ = I_6, which assures global asymptotic stability. Of course, this is only ensured when L_s = \hat{L}_s, which is only achieved if the pose parameters are perfectly estimated. This holds for both presented PBVS controllers, since L_s is full rank over the whole workspace. However, [88] warns that accuracy and stability can be strongly affected by small errors in the pose estimation, caused either by errors in the 3D model of the object or by a coarse camera calibration [33].

4.5.2 Stability for IBVS

The stability analysis of IBVS has some particularities. We have seen that the use of more than three features to compose the interaction matrix is advised in order to avoid singularities. This implies L_s \in \mathbb{R}^{k \times n} with k > n. Since n is the number of actuated degrees of freedom of the camera, the maximum value of n is six, which means that \max(\operatorname{rank}(L_s)) = 6. Therefore, there exist e \in \operatorname{Ker}(\hat{L}_s^+), that is, local minima in which feature states s \neq s^* satisfy \hat{L}_s^+(s - s^*) = 0. For that reason, it is said that only local asymptotic stability is obtained for IBVS [88]. In order to study stability when k > 6, it is necessary to define a new error e' [88].

e' = \hat{L}_s^+(s - s^*) = \hat{L}_s^+ e \qquad (4.42)

The differentiation of Equation 4.42 is expressed as follows:

\dot{e}' = \dot{\hat{L}}_s^+ e + \hat{L}_s^+ \dot{e} \qquad (4.43)

By replacing Equation 4.3 on Equation 4.43:

\dot{e}' = \dot{\hat{L}}_s^+ e + \hat{L}_s^+ L_s v_c = (\hat{L}_s^+ L_s + O)\,v_c \qquad (4.44)

where O \in \mathbb{R}^{6 \times 6} and it is equal to 0 when e = 0, regardless of the choice of \hat{L}_s^+. Thus, we define a new control law v_c = -\lambda e' and a new Lyapunov candidate function L' = \frac{1}{2}\|e'\|^2. Following the same steps as before we have:

\dot{L}' = -\lambda\, e'^\top (\hat{L}_s^+ L_s + O)\, e' \qquad (4.45)

which gives us a new stability criterion:

\hat{L}_s^+ L_s + O \geq 0 \qquad (4.46)

and, since a small neighborhood around e = e^* = 0 is considered, we can rewrite it as:

\hat{L}_s^+ L_s \geq 0, \quad \text{when } k > n \qquad (4.47)

In fact, when the features are chosen so that the rank of L_s is six and the camera calibration and other approximations are not too coarse, the criterion established in 4.47 is satisfied. Determining the size of the neighborhood where asymptotic stability is ensured is still an open problem in the visual servoing field; however, practical experiments have shown that this region is quite large [88].

4.5.3 Stability of 2 1/2 D visual servoing

The stability analysis of hybrid visual servoing matches the PBVS case described in Section 4.5.1, that is, the number of features is equal to the number of actuated degrees of freedom (k = n). Thus, since there are no singularities over the entire workspace, the stability criterion of Equation 4.40 is satisfied and global asymptotic stability is obtained.

The triangular form of the interaction matrix L_s in hybrid visual servoing makes it possible to analyze stability against errors in the camera calibration [89]. Details of the proof can be found in [33].

5 Evaluation of Fiducial Markers

This chapter describes the experiments performed to evaluate the performance of artificial fiducial markers in underwater environments. ARToolKit, AprilTags and ArUco were compared according to six criteria: performance at different lighting and turbidity levels, minimum marker size (in pixels) needed for detection, maximum distance between camera and marker (in meters), maximum angle for successful detection, and lastly, required processing time.

5.1 Experimental Setup

The experiments were conducted in a 3.4 m x 2.6 m x 2.2 m tank with black walls which block external light. It is equipped with a 3D gantry robot with millimeter positioning precision. A camera head featuring a 2-DOF roll-tilt unit [92] was mounted on the gantry. This configuration allows the camera to be rotated 360° around its z-axis and from -90° to 90° around its x-axis. The image is captured by a Prosilica GigE GE1900 camera, featuring full HD resolution at 30 fps and a CCD sensor, with a PENTAX TV lens with 12.5 mm focal length and a relative aperture of f/1.4. To compensate for the small tank size and cover a maximum range of marker sizes in the image, a set of three different physical marker dimensions is used for each tested library. The largest marker has a size of 18.3 cm x 18.3 cm, followed by a medium size marker with 5.5 cm x 5.5 cm and a small marker with 2.0 cm x 2.0 cm. With this configuration it is possible to capture markers in the image with sizes ranging from 21 x 21 pixels to 880 x 880 pixels. The lab is equipped with several fluorescent lamps, and two additional halogen lamps were added in order to simulate shallow sea light conditions. In addition, deep sea conditions are achieved by switching off all fluorescent and halogen lamps and turning on an LED underwater lamp attached to the camera head. The LED lamp provides 3,200 lumens at a 6,600 K color temperature. Additionally, a photometer and a turbidity sensor were used to measure the lighting and turbidity conditions. Figure 23 shows the roll-tilt camera head and the LED lamp attached to the 3D gantry robot. The following library versions are used: ArUco 1.2.5, AprilTags 2014-10-20 with tag family 36h11 and ARToolKit 2.72.1 with a pattern created by ARToolKit Patternmaker (http://www.cs.utah.edu/gdc/projects/augmentedreality/). The libraries' parameters are configured to achieve a high detection rate in a clear water scenario. Since during a typical robot mission these parameters are not changed, they were kept constant during all the experiment phases. All data processing is performed on a laptop with an Intel Core i7-3540M CPU @ 3.00 GHz x 2 and 8 GB of RAM running Linux Mint 17 Cinnamon 64-bit.

Figure 23 – Camera head mounted on the 3D gantry robot inside the black tank

5.2 Methodology

In this work, the three different marker systems are compared according to the six criteria previously described. In order to create the same conditions in terms of lighting, turbidity and positions for all three libraries, the markers are mounted side by side on a metal structure (see Figure 24). This arrangement allows the markers to be analyzed at the same time and in the same captured frame. The top row features the AprilTags and ARToolKit markers and the bottom row holds the ArUco marker and smaller size versions of all markers. For the data acquisition, the three libraries are integrated into Rock, mentioned in Section 2. The experiments simulate two lighting scenarios, namely shallow and deep sea. The shallow sea lighting conditions are created using external lamps that provide a luminance of 35 lux in the marker plane (in clear water). The deep sea lighting conditions are simulated by switching off the external lamps and blocking all natural light. The result is a luminance level undetectable by the photometer sensor. Here an underwater LED lamp is used to simulate the operating conditions of underwater vehicles. A luminance of 102 to 668 lux is measured (in clear water), depending on the distance between the LED lamp and the markers. Concerning the turbidity conditions, the experiment starts in clear water with 0.3 FTU (Formazin Turbidity Unit). A small amount of powdered clay is added to the water in order to increase the turbidity and create five additional scenarios with 1.1 FTU, 2.0 FTU, 3.6 FTU, 5.0 FTU and 7.0 FTU. Every measurement is performed with a turbidity sensor fifteen minutes after adding the powdered clay (to allow an even distribution). Figure 25 shows the AprilTags markers at the six tested turbidity levels for a 1.2 m camera-marker distance in the deep sea lighting scenario.

Figure 24 – Placement of markers on the support structure. The first row features an AprilTags and an ARToolKit marker. In the second row the ArUco marker is followed by smaller size versions of all markers.

(a) 0.3FTU (b) 1.1FTU (c) 2.0FTU

(d) 3.6FTU (e) 5.0FTU (f) 7.0FTU

Figure 25 – AprilTags at different turbidity levels in a deep sea lighting scenario

In order to evaluate the capability of the marker systems to detect small markers in turbid underwater environments, the camera is centered in front of the markers and then moved further away, increasing the distance between the camera and the markers and therefore decreasing the marker size in pixels. The distance between the camera and the markers affects the detection rate not only due to the marker size in the image but also because of the visibility range, which decreases as the turbidity level rises. The procedure to evaluate the smallest detectable marker size and the maximum camera-to-marker distance starts by placing the camera at 31.5 cm from the markers and increasing the distance until reaching the back wall of the tank. This motion results in a maximum distance from the markers of 215.3 cm. Due to this, the marker size ranges from 21 x 21 px to 880 x 880 px. Using this configuration, the smallest detectable marker size in pixels is analyzed for each library. Due to the image degradation caused by turbidity, the distance between the marker and the camera plays an important role in the detection rate, since blurring and noise level in the image increase with the camera-to-marker distance. This way, even markers with a size larger than the "minimum detectable size" might not be detected at larger distances. For this reason, the maximum camera-to-marker distances for each marker system are also compared. Here, this number is defined as the maximum distance at which the camera detects a marker of any size for the respective marker system. For the maximum detectable angle evaluation (assuming a camera coordinate system, where the z-axis points forward and the y-axis points down) the camera is moved to 120.0 cm in the z-direction and is rotated around the y-axis in 13 steps up to a maximum of 85° according to Figure 26.

Figure 26 – Angle steps used for the maximum detectable angle test

To evaluate the required computation time, 800 frames are processed (clear water and shallow sea lighting) and the difference between the system time immediately before and after the detector function call is analyzed. Neither the time for image capturing nor the pose estimation time is taken into account here. After processing the frames, the mean and the standard deviation of the time values are calculated to establish the required computation time for each library.
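The timing measurement can be reproduced with a simple wrapper such as the one below (Python, illustrative only); detect stands for whichever library's detector function is under test and frames for the 800 captured images, both hypothetical names.

    import time
    import numpy as np

    def time_detector(detect, frames):
        """Record the time spent inside the detector call for each frame, in ms."""
        durations = []
        for frame in frames:
            t0 = time.perf_counter()       # timestamp immediately before the call
            detect(frame)                  # placeholder for the library's detector
            durations.append((time.perf_counter() - t0) * 1000.0)
        return np.mean(durations), np.std(durations)

    # Usage (hypothetical): mean_ms, std_ms = time_detector(apriltags_detect, frames)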

5.3 Results and Discussion

The analysis of the smallest detectable marker size in pixels for each library in a shallow sea scenario is shown in Figure 27. For all tested libraries, the required size for detection increases as the turbidity level increases. This behavior was expected since particles suspended in the water reduce the image quality. The ArUco library presents a higher sensitivity to the turbidity level since the required size increases at a larger rate than for the other libraries. During this experiment, it could be observed that the ArUco library was not able to detect the small marker (2.0x2.0 cm) at any turbidity level. It was only able to detect the medium size marker (5.0x5.0 cm). For this reason, the ArUco library presented the worst relative performance compared to the other libraries. On the other hand, AprilTags showed much less sensitivity to turbidity changes (within the tested range) and was able to detect the 2.0x2.0cm marker in all scenarios. ARToolKit detected the 2.0x2.0cm markers only at the two lowest turbidity levels.


Figure 27 – Smallest detectable size (in pixels) for ArUco, AprilTags and ARToolKit libraries in a shallow sea scenario at different turbidity levels.

Figure 28 shows the obtained results for the deep sea lighting scenario at different turbidity levels. Here, all libraries were able to detect markers at smaller sizes than in the shallow sea lighting scenario. This might be due to the larger luminance in the marker plane provided by the LED lamp. An important observation is that ARToolKit did not detect any marker at the highest tested turbidity level and ArUco needed a size larger than 81 pixels and for this Chapter 5. Evaluation of Fiducial Markers 59

reason it did not detect the 2.0x2.0cm marker at any turbidity level. The AprilTags library showed better performance than the other libraries, since it was able to detect the smaller markers at all tested turbidity levels.


Figure 28 – Smallest detectable size (in pixels) for ArUco, AprilTags and ARToolKit libraries in a deep sea scenario at different turbidity levels.

For each marker library, the maximum detection distance was analyzed for the shallow sea and deep sea scenarios. The results are shown in Figure 29 and Figure 30, respectively. For successful marker detection, both graphs show that the distance between camera and marker needs to be smaller as the turbidity level increases. In the shallow sea case (except for the first two turbidity levels), ARToolKit needs to be closer to the marker compared to the other tested libraries. This behavior is even more noticeable in the deep sea scenario, where ARToolKit does not detect any marker at a turbidity of 7.0 FTU. The ArUco and AprilTags libraries delivered identical results at the first three turbidity levels in both lighting scenarios. They were able to detect the markers at a distance of 215.3 cm, which was the maximum distance tested. However, at higher turbidity levels, AprilTags presented a better performance than ArUco, being able to detect the markers at larger distances. The analysis of the maximum angle for detection was performed up to 85° as shown in Figure 26. For the shallow sea lighting scenario, the results are shown in Figure 31. Here, all tested libraries were able to detect the markers at the maximum tested angle at the first two turbidity levels. ArUco detected the markers at a maximum of 70°, while AprilTags and ARToolKit even detected the markers at 80°. In the 5.0 FTU scenario, only AprilTags was able to detect the markers, and at the highest tested turbidity level none of the libraries were able to detect the markers at any angle. The deep sea case is shown in Figure 32. ArUco and AprilTags detected the maximum tested angles at the first three turbidity levels, while ARToolKit detected the markers at a maximum of 70°. At the 3.6 FTU level, ArUco and ARToolKit detected up to a maximum of 55° while AprilTags reached 60°. AprilTags was the only library which could detect markers at the 5.0 FTU turbidity level (up to 50°) and no library could detect the markers at the highest turbidity level.


Figure 29 – The largest camera-to-marker distance for successful detection at different turbidity levels for the shallow sea lighting scenario


Figure 30 – The largest camera-to-marker distance for successful detection at different turbidity levels for the deep sea lighting scenario


Figure 31 – Maximum detected angle for the three libraries in a shallow sea lighting scenario at different turbidity levels


Figure 32 – Maximum detected angle for the three libraries in a deep sea lighting scenario at different turbidity levels

The required processing time for each library is shown in Figure 33. The ArUco library delivers the fastest performance and the smallest standard deviation (of the compute time per frame) among the three libraries. On the other hand, AprilTags has the slowest performance and a larger variation from the mean time-per-frame value. For real-time applications this has to be taken into account since the detection time for one frame can in some cases be approximately three times as long as the mean value. This can be verified for instance for frame 333 where the detection time was 149.8 ms.

(Figure 33 legend: ArUco mean = 27.07 ms, std dev = 5.39 ms; AprilTags mean = 57.86 ms, std dev = 10.85 ms; ARToolKit mean = 37.28 ms, std dev = 10.31 ms)

Figure 33 – The detection time for ArUco, AprilTags and ARToolKit

5.4 Conclusions

Among the tested libraries, AprilTags presented the best performance for most of the analyzed criteria, since it is able to detect smaller markers, even at larger angles, compared to the other candidates. This is true for all tested turbidity and lighting levels. However, depending on the application, its processing time might be a problem, since it presented the slowest performance during the tests and in certain frames it may take around 150 ms to detect the marker. Several variables directly affect the detection rate, such as camera parameters (resolution, sensor quality, focal length, aperture size) and library configuration parameters. For this reason, the results discussed here do not represent an absolute evaluation of these marker systems, but a relative comparison of the tested libraries. In other words, it is possible that the libraries might present better results under different conditions, but the relative analysis is still of value since all tested libraries were used under the same conditions.

6 Gazebo Simulation

6.1 Experimental Setup and Methodology

The simulation of the test scenario was performed in the Gazebo simulation environment (www.gazebo.org). Gazebo is a powerful open-source robotic simulator maintained by the Open Source Robotics Foundation, able to accurately and efficiently simulate indoor and outdoor environments with a robust physics engine. In this work it was used to simulate an ocean, taking into account fluid statics and dynamics such as density, viscosity and ocean currents. Moreover, Gazebo is integrated with Rock Robotics (www.rock-robotics.org), as already mentioned in Section 2.2.3. This allows task portability and parameter configuration between the real and the simulated vehicle [93]. Thereby, the vehicle in Gazebo matches the real vehicle in terms of dimensions, dynamics, sensors and propellers, as already shown in Section 2.2. Figure 34 shows FlatFish in the simulated environment.

Figure 34 – FlatFish inspecting a shipwreck on Gazebo

In this work, the simulated camera was set to a 720p (1280x720) resolution in order to reduce the marker detection time and therefore increase the control chain update rate. Given that the adopted visual servoing strategy was monocular, the front left camera was chosen. As mentioned in Section 2.2, the roll and pitch degrees of freedom are passively controlled, so the visual servoing was designed to control all linear DOF (surge, sway and heave) as well as the yaw angular DOF. Regarding the artificial fiducial marker system, according to the results of Chapter 5 the AprilTags marker system showed the best performance in underwater environments and was therefore chosen for this stage [94]. Thus, for the simulation, an AprilTags marker with ID = 1 and a size of 0.4 m x 0.4 m was added to Gazebo.

The ocean environment was simulated in two conditions: sheltered water and under ocean currents with a magnitude of 0.4 m/s, similar to the average current observed in the Todos os Santos Bay [95]. The ocean current was decomposed equally into the X and Y components of the vehicle frame, with a magnitude of 0.283 m/s each.

For each scenario, two variations of the 2 1/2 D visual servoing controller were assessed. The first used the parameter \lambda of Equation 4.4 as a static gain, while the second varied \lambda according to the error x, as shown in Equation 4.34, where x is the 2-norm of the difference between the desired and current features, i.e., \|s - s^*\|_2. The visual controller relied on the Visual Servoing Platform (ViSP, https://visp.inria.fr/) and was tuned according to Table 4.

Static Gain: \lambda = 0.1
Adaptive Gain: \lambda(0) = 0.5, \lambda(\infty) = 0.08, \dot{\lambda}(0) = 0.5

Table 4 – Gains for the simulated visual servoing task

In all tests the robot started at position X = -1.0, Y = 0.6, Z = 3.0 and yaw = 30°, and the desired position was X = 0.0, Y = 0.0, Z = 1.0 and yaw = 0°. Figure 35 shows the vehicle's position in the world and the marker on the image plane at the starting and desired positions.

(a) (b)

(c) (d)

Figure 35 – Starting and ending positions of the vehicle in the world [(a) and (b)] and starting and ending marker positions on the image plane [(c) and (d)]

6.2 Results

The results obtained in the simulation when no ocean currents were present are shown in Figure 36 and Figure 37. Figure 36 shows the vehicle's position in the four controlled degrees of freedom. For each of these cases, it can be observed that the static gain controller presents a settling time of 55 seconds. In terms of this indicator, the performance is improved by 63% when the adaptive gain is adopted, since the settling time in that case is approximately 20 seconds.


Figure 36 – Reference following without ocean current effects

Figure 37 shows that the trajectories of both controllers were similar and that the final position was reached by both of them. Figure 38 and Figure 39 show the controller behavior when a constant disturbance is included. A steady-state error can be observed in Figure 38 when the static gain was applied, whereas when the adaptive gain was adopted a null steady-state error and a slightly higher settling time were observed. The difference in error is better seen in Figure 39, where the continuous line, which represents the pixel positions on the image plane for the static gain, does not reach the desired position. In that case, as the error decreases the velocity commands become very small, so the vehicle is unable to overcome the external disturbance. On the other hand, the adaptive gain tends to increase the control signal when the error is small, which commands higher velocities that are able to reject the disturbance.


Figure 37 – Pixel trajectories on the image plane when ocean currents are not present


Figure 38 – Reference following with ocean current effects


Figure 39 – Pixel trajectories on the image plane in the presence of ocean currents

The static gain could be set to a higher value, except that at the beginning of the mission the higher commanded velocities would increase the controller aggressiveness, which might lead to the loss of the marker from the camera field of view. In the present work, the markers started close to the image border (see Figure 35), so the controller was tuned conservatively.

6.3 Conclusions

The main contribution of this section was to conclude that there is a trade-off when tuning the static gain, since large values overcome the current when the error is small but increase the chance of marker loss at the beginning of the task. The adaptive gain has therefore been shown to be an alternative to the static gain, since it produces a smooth behavior at the beginning of the mission while commanding higher velocities at the end of the task, which results in a null steady-state error, unlike the static gain approach. Moreover, the adaptive gain also presented a smaller settling time than the static gain strategy.

7 Experiments with the Real Vehicle

This chapter presents the methodology, results and discussion of the experiments performed on the FlatFish AUV. Initially, the tests took place in a big saltwater tank, aiming to validate the simulation results in a safe and controllable environment. Later, the best-performing controller design was tested in the ocean. At the end of each section, the conclusions of the trials, the challenges and suggestions for improvements are presented.

7.1 Experiments in the big basin

7.1.1 Experimental Setup

As mentioned in Section 2.2, two prototypes of FlatFish were built. In these experiments, the facilities of the DFKI and the German vehicle were used. The experiments took place at the Maritime Exploration Hall located in Bremen, Germany. It is a big hall comprising a basin with dimensions of 23 m x 19 m x 8 m and approximately 3 million liters of saltwater (Figure 40). The tank allows experiments in a controlled environment, with transparent water and minimal external disturbances. This environment is excellent for testing, since the researcher can see from outside what is happening with the vehicle. The process of launching and retrieving the vehicle is simple and can be performed by a single person and, most importantly, there is no risk of losing the vehicle.

Figure 40 – Maritime Exploration Hall - DFKI [96]

Similarly to the simulation experiments, only a single camera was used for the tests; here, the right frontal camera was chosen. Although the camera is able to capture 2040x2040 frames at 25 frames per second, for these experiments the images were downscaled by a factor of 50% and captured at 15 frames per second. The exposure time and gain were set to automatic, so that changes in illumination can be compensated. Despite the camera frame rate, the output of AprilTags is slower than the input data, which means it effectively delivers information to the controller at 7.4 Hz on average.

The camera was calibrated using a chessboard; the camera parameters are shown in Equation 7.1 and the distortion coefficients in Equation 7.2.

K = \begin{bmatrix} 1991.8 & 0 & 972.148 \\ 0 & 1993.069 & 1067.18 \\ 0 & 0 & 1 \end{bmatrix} \qquad (7.1)

\text{distcoeff} = (k_1, k_2, p_1, p_2) = (0.08653039,\; 0.1632879,\; -0.000002090889,\; -0.001396539) \qquad (7.2)

One AprilTags marker with ID 285 was placed on one of the basin walls, fixed by a rope. Figure 41 shows the marker on the wall.

Figure 41 – Setup scenario on the DFKI’s big basin

Regarding the computational equipment, FlatFish has two computers: one is dedicated to processing the large data volumes coming from the cameras and sonars, while the other is dedicated to navigation purposes, for instance, computing dead reckoning and control commands. The payload PC is an Intel(R) Core(TM) i5-6400 CPU @ 2.70 GHz with 16 GB of RAM and the navigation PC is an Intel(R) Core(TM) i5-4250U CPU @ 1.30 GHz with 16 GB of RAM.

7.1.1.1 Control Chain

The vehicle control is currently performed by a cascade PID, where the outer loop controls position and provides a velocity setpoint for the inner loop. The velocity PID controller, in turn, outputs the efforts for each degree of freedom. These are then mapped to the required efforts for each thruster. The mapping is performed using the thrust configuration matrix, already presented in Table 2. Figure 42 shows the data flow of the cascade PID control chain.

Figure 42 – Cascade PID diagram

The feedback for the outer PID is the position represented in the world frame and it comes from the vehicle's navigation system. The velocity feedback comes from a fusion of the DVL information and the vehicle motion model (which estimates the vehicle's velocity at each sampling time based on the thruster efforts). The control chain operates at a frequency of 10 Hz. In this experiment, the aforementioned control chain is used to move the vehicle until the marker detector finds the marker. Then the control chain is reconfigured for a dynamic look-and-move visual servoing scheme. In other words, the position PID is replaced by the visual controller, which becomes responsible for sending velocity requests (setpoints) to the inner loop. The major change affects the feedback information: in the visual servoing control chain, the vehicle receives the pixel coordinates, which are used to feed the controller directly as well as to compute the orientation of the marker with respect to the camera. Figure 43 shows the data flow of the visual servoing control chain.

Figure 43 – Visual control chain diagram

Another difference concerns the controlled frame. The visual controller controls the position of the marker with respect to the vehicle body frame. A few static transformations are involved in the process, since ViSP expects the setpoint in the camera frame. Thus, the transformation between the camera and the body frame is applied. Figure 44 shows the coordinate frames graphically.

Let us assume F_c is the camera frame, F_m is the marker frame and F_b is the vehicle's body frame. The static transformations between these frames are represented by homogeneous matrices, where {}^{a}H_b is the representation of frame b in frame a. From the marker corners, the pose estimation provides the marker pose in the camera frame ({}^{c}H_m); this transformation is dynamic and is updated as the camera moves.

Figure 44 – Coordinate systems from FlatFish's body, camera and marker

In order to control the marker's position with respect to the vehicle's body frame, the static transformation between the camera and the body is needed. Equation 7.3 shows the transformation between the frames:

{}^{b}H_{m} = {}^{b}H_{c}\,{}^{c}H_{m} \quad (7.3)

where:

{}^{b}H_{c} = \begin{bmatrix} 0 & 0 & 1 & 0.979 \\ -1 & 0 & 0 & -0.131 \\ 0 & -1 & 0 & -0.421 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (7.4)

The PIDs from Figure 42 and Figure 43 were tuned to the values shown in Table 5.

Table 5 – PID Coefficients

             Position PID        Velocity PID
             Kp    Ki   Kd       Kp    Ki   Kd
Surge        0.3   0    0        600   0    0
Sway         0.3   0    0        400   0    0
Heave        0.3   0    0        600   0    0
Yaw          0.3   0    0        500   0    0

The PIDs running on the vehicle are actually P-only controllers, since the integral and derivative parameters are null. The derivative term was nullified to prevent the influence of the noisy feedback signal. The null integral term dates from a time when the filters involved in the velocity estimation added significant delay; even after this had been fixed, the P-only configuration still presented satisfactory results and was therefore kept.

7.1.2 Methodology

The experimental procedure consisted of varying the setpoint of a single DOF at a time. A different pose was requested and the vehicle was then asked to return to the initial pose. Except for surge, no significant difference was perceived between moving towards the setpoint and moving back; thus, both directions are discussed only for surge, while for the other DOFs only one direction is shown. Table 6 shows the setpoints of all five case scenarios:

Table 6 – Setpoints for the test case scenarios on the basin

             Case 1       Case 2       Case 3            Case 4              Case 5
Surge (m)    2.5 to 4.0   4.0 to 2.5   4.0               4.0                 4.0
Sway (m)     -0.131       -0.131       -0.831 to 0.631   -0.131              -0.131
Heave (m)    -0.421       -0.421       -0.421            -0.421 to -0.921    -0.421
Yaw (deg)    90           90           90                90                  55 to 90

The values adopted for sway and heave correspond to the displacement of the camera with respect to the body frame; with these values, the marker is expected to be kept in the middle of the image. The camera is placed 0.979 m along the body frame x-axis, which means that a setpoint of 3 m in surge actually places the camera at 2.021 m (3 m − 0.979 m) from the marker. Regarding the gains, a set of two static and two adaptive gains was applied, with the following values (a minimal sketch of the adaptive-gain law is given after the list):

• Static: λ = 0.1

• Static: λ = 0.2

• Adaptive: λ(0) = 0.2, λ(∞) = 0.1 and λ̇(0) = 0.01

• Adaptive: λ(0) = 0.3, λ(∞) = 0.1 and λ̇(0) = 0.01
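A minimal sketch of the adaptive-gain law, parameterized by λ(0), λ(∞) and λ̇(0) following the exponential form commonly used (e.g., in ViSP's vpAdaptiveGain), is shown below; the error-norm values in the usage lines are arbitrary examples.

```python
import math

def adaptive_gain(error_norm, lam_0, lam_inf, lam_dot_0):
    """lam(x) = (lam_0 - lam_inf) * exp(-(lam_dot_0 / (lam_0 - lam_inf)) * x) + lam_inf"""
    a = lam_0 - lam_inf
    return a * math.exp(-(lam_dot_0 / a) * error_norm) + lam_inf

# Large error at the start of the task -> gain decays towards lam_inf (gentler motion);
# small error near convergence -> gain approaches lam_0.
print(adaptive_gain(10.0, lam_0=0.2, lam_inf=0.1, lam_dot_0=0.01))  # ~0.137
print(adaptive_gain(0.5,  lam_0=0.2, lam_inf=0.1, lam_dot_0=0.01))  # ~0.195
```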

7.1.3 Results and Discussion

This section presents the results and discussion of the experiments performed in the big saltwater basin. Figure 45 shows the target Apriltag being detected by the vehicle during a visual servoing mission; note the transparency of the water. In the first image the vehicle is 3 meters from the marker, and in the second it is 7 meters from the marker. Since detection at 7 meters represents extremely good and unlikely visibility conditions, the results are restricted to distances of up to 4 meters.


Figure 45 – Vehicle’s camera detecting the Apriltag marker at two different distances on the tank

7.1.3.1 Case 1 - Surge

For case 1, a step of 1.5 meters was applied to the vehicle in the surge degree of freedom. At the beginning, the vehicle was placed at 2.5 m and a setpoint of 4 m was requested. The results are shown in Figure 46.


Figure 46 – Controller performance in surge during setpoint changing in surge (case 1).

The curves show a smooth transition, in which the adaptive gain λ = [0.3 0.1 0.01] showed the quickest response, while the static gain λ = 0.1 presented the slowest behavior among them. The quicker response is the result of larger efforts applied at the beginning of the task, which can be noticed in the velocity and effort graphs. It is also possible to see a dead zone of approximately 0.7 seconds, mostly due to the image processing stage.


Figure 47 – Controller performance in sway during setpoint changing in surge (case 1).

The sway graph shows that the vehicle kept its position during the whole mission. On the orange effort line, oscillations in the thruster effort signal can be seen after 17 seconds. This is because, at this moment, the vehicle is 1.5 m away from the start position, so the marker becomes smaller in the image and a small variation in the pixels makes the detection noisier.


Figure 48 – Controller performance in heave during setpoint changing in surge (case 1).

The heave analysis is shown in Figure 48. None of the curves reached the setpoint in steady state. The orange line is quite oscillatory, while λ = [0.2 0.1 0.01] has a performance similar to the static gain λ = 0.2. The oscillation is caused by the higher friction generated by FlatFish's communication tower, located above the vehicle. It provokes an undesired pitch movement when the vehicle moves back and forth and, since the vehicle's controller assumes null pitch, the efforts on thrusters T1 and T2 (Figure 13) cause a displacement along the Z axis, i.e. heave. Currently, pitch is passively controlled by the buoyancy of the vehicle; an alternative would be to actively control pitch using thrusters T5 and T6 shown in Figure 13.


Figure 49 – Controller performance in yaw during setpoint changing in surge (case 1).

The yaw performance is shown in Figure 49. No changes in yaw are noticed during the movement. Looking at the effort graph, it is possible to see that the orange line is sensitive to measurement noise, as already perceived in the sway case. The pixel trajectory on the image plane is shown in Figure 50. The marker starts larger in the image and, as the vehicle moves to the 4 m setpoint, its projected size decreases. In general, all the squares have a size similar to the desired one and are well positioned in the center of the image, except for the red square (λ = 0.1), which has a significant offset along the image v-axis due to the poor performance in heave.

7.1.3.2 Case 2 - Surge

Case 2 has the same step amplitude (1.5 m), but in the opposite direction. Here the starting point is at 4 m and the vehicle is required to stay at 2.5 m from the marker while keeping the marker in the image center.


Figure 50 – Pixel trajectory on the image plane during motion in surge (case 1).

Figure 51 – Controller performance in surge during setpoint changing in surge (case 2).

By analyzing Figure 51 one can see an overshoot of 0.25 m when the vehicle moves towards the marker, regardless of the applied gain. This was not observed in the previous case, even though the error has the same magnitude. The absence of overshoot in the previous case might be due to the vehicle's asymmetry with respect to the Z–Y plane, which causes less drag in the forward movement and higher damping when moving backwards.


Figure 52 – Controller performance in sway during setpoint changing in surge (case 2).

Figure 52 shows the sway performance during the setpoint change in surge. It shows that the vehicle kept its position and exhibits the same behavior presented in Case 1, in which there is more oscillation when the vehicle is far from the marker, making the measurements noisier. The heave performance shown in Figure 53 is similar to that of the previous case, with small oscillations due to the resistance caused by the vehicle's communication tower. The yaw performance shown in Figure 54 presents no relevant changes in yaw during the surge movements. In Figure 55, the pixel trajectory is shown as the camera moves towards the marker. As expected, the marker size in the image is larger than in Case 1. All markers present a size similar to that of the setpoint; however, there is a vertical displacement due to the heave controller. The case λ = 0.2 also presents an offset due to the performance of the surge controller.

7.1.3.3 Case 3 - Sway

The setpoint change in sway, shown in Figure 56, caused a smooth transition from -0.831 m to 0.631 m, a range of 1.462 m, which moves the marker from border to border horizontally in the image. The performance of λ = 0.2 and λ = [0.2 0.1 0.01] was similar, while λ = 0.1 was slower than the others.


Figure 53 – Controller performance in heave during setpoint changing in surge (case 2).

Figure 54 – Controller performance in yaw during setpoint changing in surge (case 2).

Figure 55 – Pixel trajectory on the image plane during motion in surge (case 2).


Figure 56 – Controller performance in sway during setpoint changing in sway (case 3).

In surge, according to Figure 57, the vehicle kept its position at 4 m, and it is possible to see the effect of the noisy measurements reflected in the control efforts; the gain λ = [0.3 0.1 0.01] appeared to be more sensitive to them. In heave, Figure 58, all the controllers presented a slight oscillation and a non-zero steady-state error, with the vehicle staying closer to the setpoint for the controllers with higher gains.


Figure 57 – Controller performance in surge during setpoint changing in sway (case 3).

Figure 58 – Controller performance in heave during setpoint changing in sway (case 3).

In yaw, shown in Figure 59, the setpoint was kept during the movements, with no significant differences among the controllers.


Figure 59 – Controller performance in yaw during setpoint changing in sway (case 3).


Figure 60 – Pixel trajectory on the image plane during motion in sway (case 3).

Figure 60 illustrates the pixel trajectory on the image plane during the sway movements. One can see that the markers kept their size and are aligned along the u-axis, which is due to the controller performance in sway.

7.1.3.4 Case 4 - Heave

The experiments in heave were performed starting at -0.421 m and going to -0.921 m, a range of 0.5 m. The performance is shown in Figure 61. One can see that λ = 0.1 presents the slowest behavior, λ = 0.2 and λ = [0.2 0.1 0.01] present similar behavior, and λ = [0.3 0.1 0.01] presents the fastest response, but also the most oscillatory one among the controllers. An offset between the desired and achieved positions can also be noticed. This is highlighted at the beginning of the task, when a setpoint of -0.421 m was given but all cases started at -0.220 m. In heave, in particular, the velocity curves also have an offset, which suggests that the position error in this case is related to poor tuning of the heave velocity PID controller.


Figure 61 – Controller performance in heave during setpoint changing in heave (case 4).

The surge behavior is shown in Figure 62: the vehicle kept its position with a slight offset from the setpoint. The sway control shown in Figure 63 also kept the desired position. In yaw, Figure 64, the visual controller was able to keep the vehicle's angular position aligned with the marker; the unexpected displacement at the beginning of the task, represented by the orange and blue lines, is due to previous experiments in which the setpoint was changed before the vehicle had stabilized. Figure 65 shows the trajectory of the pixels as the vehicle moves in heave. In general the markers stay close to the desired position, except for a small offset in surge.


Figure 62 – Controller performance in surge during setpoint changing in heave (case 4).

Figure 63 – Controller performance in sway during setpoint changing in heave (case 4).


Figure 64 – Controller performance in yaw during setpoint changing in heave (case 4).

Figure 65 – Pixel trajectory on the image plane during motion in heave (case 4).

7.1.3.5 Case 5 - Yaw

In the yaw experiment, a rotation from 55 degrees to 90 degrees was demanded. Figure 66 shows the performance of the controllers: all of them were able to move the vehicle to the desired angle. Here it is again possible to see the slower performance of λ = 0.1 compared to the other gains, while the other three presented similar performances.


Figure 66 – Controller performance in yaw during setpoint changing in yaw (case 5).

During the yaw movement, surge was kept at 4 m from the marker. Figure 67 shows small deviations in the vehicle's position, which rapidly return to the setpoint.

Figure 67 – Controller performance in surge during setpoint changing in yaw (case 5).

In Figure 68, the sway performance is shown. One can see that an error appears in sway during the yaw movements, which is explained by the rotation of the vehicle's frame.


Figure 68 – Controller performance in sway during setpoint changing in yaw (case 5).

Since the setpoint is given in terms of the marker position with respect to FlatFish's frame, a rotation in the vehicle's yaw causes a displacement along the y-axis, as better explained in Figure 69. In the end, the vehicle moves along a circumference whose radius equals the surge setpoint. This is an expected and desired behavior, because it allows rotations through larger angles without losing the marker in the image.

Figure 69 – Displacement of the vehicle when performing yaw control. (a) The vehicle is positioned so the camera is aligned with the marker. (b) A small change in yaw provokes a displacement in the y-axis, then (c) the vehicle actuates in surge to regulate the error.
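The circular path can be verified with a small planar sketch (an illustration only, not part of the control chain): assuming a fixed marker and a constant surge setpoint, every yaw angle maps the vehicle to a point at the same radius around the marker.

```python
import numpy as np

def vehicle_position(marker_world, yaw, surge_setpoint):
    """World position the vehicle must occupy so the marker sits at
    (surge_setpoint, 0) in the body frame (planar approximation)."""
    R = np.array([[np.cos(yaw), -np.sin(yaw)],
                  [np.sin(yaw),  np.cos(yaw)]])
    return marker_world - R @ np.array([surge_setpoint, 0.0])

marker = np.array([0.0, 0.0])
for yaw_deg in (55, 70, 90):
    p = vehicle_position(marker, np.radians(yaw_deg), surge_setpoint=4.0)
    print(yaw_deg, p, np.linalg.norm(p - marker))  # the distance stays at 4.0 m
```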

In heave, one can see that the setpoint was not reached by any of the controllers. Again, this is likely due to the velocity controller's tuning parameters, since an offset can also be noticed in the velocity graphs. The oscillatory behavior of the controller with λ = [0.3 0.1 0.01] is generated during the yaw movement, but after the vehicle reaches the desired yaw angle, the heave curve converges.


Figure 70 – Controller performance in heave during setpoint changing in yaw (case 5).

Figure 71 shows a cluttered pixel trajectory for the yaw movements. As discussed before, a setpoint change in yaw implies a movement in sway, since we want to keep the marker in the center of the image. At the beginning, the yaw movement shifts the marker towards the left of the image, but later the marker moves back to the image center.

Figure 71 – Pixel trajectory on the image plane during motion in yaw (case 5).

7.1.4 Conclusions

In general, the adaptive set λ = [0.3 0.1 0.01], i.e., the orange line, usually produced the fastest response. However, this comes at the price of greater sensitivity to noise, which caused oscillatory movements at various moments during the experiments. The adaptive λ = [0.2 0.1 0.01], i.e., the blue line, presented a behavior similar to the static gain λ = 0.2 in most of the cases. Indeed, since a single degree of freedom was tested at a time, the range of operation kept the error norm small and therefore the adaptive gain operated in a region close to λ(0). Even though both presented similar results, the use of the adaptive gain tends to prevent the loss of the marker at the beginning of the task. The results also showed poor behavior of the heave controller. This is most likely due to the PID tuning parameters, since there is an offset in the velocity graphs as well, which means that the position controller was still demanding control efforts but the velocity controller was not able to follow them.

7.2 Experiments in the Sea

7.2.1 Experimental Setup

The sea trials took place in the Todos os Santos Bay in Salvador, Brazil, aboard the boat Lady Catarina, at the GPS coordinates 12°56’51"S 38°30’16"W. Figure 72 shows the experiment location on a map of Brazil, along with a zoomed-in view of the location on a map of Salvador.

(a) Location on Brazil's map; (b) Location on Salvador's map

Figure 72 – Brazilian experiments location (Google Maps)

Figure 73 shows the vehicle hooked to a crane in the Todos os Santos bay, from where it was launched into the sea.

Figure 73 – FlatFish AUV in Todos os Santos bay

The Apriltag marker was attached to a metal pedestal, so that the marker sat 1.8 m above the pedestal's base. Figure 74 shows the marker, which was positioned 15 meters below sea level.


Figure 74 – Apriltag marker ID=5 placed on metal pedestal

7.2.2 Methodology

The camera used was the same model as the one used in Germany, capturing images at 15 frames per second and at 2K resolution. The frames were also downscaled by a factor of 50% in order to reduce the detection time, and the camera operated with auto exposure. The camera calibration parameters are different and are shown in Equations 7.5 and 7.6.

1993.04 0 995.586 K =  0 1998.89 1030.42 (7.5)      0 0 1  Chapter 7. Experiments with the Real Vehicle 90

\mathrm{distcoeff} = (k_1, k_2, p_1, p_2) = (0.0870984,\ 0.225211,\ -0.00145666,\ -0.000153369) \quad (7.6)

The control chain architecture is the same as described in the previous section; however, the PID parameters are different. They are presented in Table 7.

Table 7 – PID coefficients for the ocean experiments

             Position PID        Velocity PID
             Kp    Ki   Kd       Kp     Ki   Kd
Surge        0.3   0    0        600    0    0
Sway         0.3   0    0        400    0    0
Heave        0.7   0    0        1000   0    0
Yaw          0.5   0    0        300    0    0

The experimental procedure consisted of launching the vehicle from the vessel, moving it to the front of the marker and then hovering. After this, the visual servoing task is started and the control chain from Figure 42 is replaced by the one from Figure 43. The results in the tank showed that the controller with λ = [0.3 0.1 0.01] had faster responses than the others, but was more sensitive to measurement noise; the controller with λ = [0.2 0.1 0.01] was less sensitive to noise when the error is small, while still presenting a satisfactory time response. Thus, only the controller with the adaptive gain λ = [0.2 0.1 0.01] was tested at sea. In order to demonstrate the capability of the vehicle to keep a fixed position in front of a visual marker, the vehicle was kept servoing for 15 minutes, with different surge setpoints during this period.

7.2.3 Results and Discussion

This section shows the results of FlatFish performing visual servoing in the Todos os Santos bay. Figure 75 shows images from the vehicle's camera while detecting the marker and performing station keeping at different distances; the black borders are due to the distortion-removal process. The results are first presented as graphs of the whole mission, which comprises approximately 15 minutes of the vehicle performing visual servoing on the marker. During the process, the setpoint was changed in order to evaluate the capability of the controller to hover at different distances. Afterwards, specific parts of the graphs are presented separately so they can be analyzed in detail.

7.2.3.1 Entire servoing mission

Figure 76 shows the performance of the controller during 900 seconds, moving to different distances from the marker. The maximum distance tested was 3.5 m and the closest one was 2 m. The controller was able to move the vehicle to the desired setpoint during the entire time. An oscillation is also observed, which caused oscillations in the velocity requests and consequently in the actuators.


Figure 75 – Vehicle's camera detecting the Apriltag marker at two different distances.


Figure 76 – Surge controller performance in the sea during several setpoint changes

Similar behavior is shown in Figure 77, in which the sway setpoint was kept at zero which, due to the translation of the camera with respect to the body frame, implied that the marker projection was displaced along the u-axis of the image plane. Despite the oscillations, the vehicle stayed in the neighborhood of the required sway position the whole time.

In Figure 78, one can see the performance in heave. Note that at the beginning of the task the setpoint was changed in order to move the vehicle down and place the marker in the center of the image along the v-axis of the image plane. In general, the controller also managed to keep the vehicle at the heave position. Here, a behavior similar to that of the tank tests was also perceived, in which the heave controller was unable to nullify the steady-state error.

Figure 79 shows the controller performance in yaw. At the beginning of the task the vehicle was slightly rotated, but it quickly regulated the yaw angle to the desired one. In general, the graph shows that the vehicle maintained its orientation. The apparent high-frequency vibrations shown in these graphs and in the previous ones are the result of a large amount of time being compressed into a single graph; in the following graphs, where only part of the mission is shown, this becomes more evident.

In Figure 80, the pixel trajectory during the entire mission is shown. Although apparently confusing, some conclusions can be extracted from it. First, the path at the top of the image represents the beginning of the task, when the vehicle was at 2 meters and the heave setpoint was zero; it shows the marker was close to the border and on the verge of leaving the field of view. After the correction, the trajectory was placed at the center of the image, reducing the risk of marker loss. In addition, the dense red area in the image suggests that the oscillations represented in the previous graphs actually correspond to an oscillation of approximately 50 pixels; given the image size and considering the environmental disturbances, this can be considered a satisfactory result. Finally, a small rotation around the vehicle's x-axis is observed in the final position. It is due to a bend in the metal pedestal where the Apriltag was attached (see Figure 75); since the roll DOF is not actively controlled, this displacement is not corrected.

Figure 77 – Sway controller performance in the sea during several setpoint changes


Figure 78 – Heave controller performance in the sea during several setpoint changes

Figure 79 – Yaw controller performance in the sea during several setpoint changes


Figure 80 – Pixel trajectory on the image plane during the entire mission in the sea

7.2.3.2 Interval Analysis

In order to have a better view of the controller performance, the region from 770 to 900 seconds was selected; it is shown in Figures 81-85. All the intervals presented similar behavior, so there is no special reason for this choice. Figure 81 shows the surge performance of the controller in the sea during the selected interval: the controller maintained the desired position during the whole process. It is also possible to notice that the oscillation observed in Figure 76 appears smoother here; the apparently higher frequency in the full-mission plot is an effect of the compressed time axis. Figure 82 shows that the controller also kept the vehicle in the desired sway position. The controller performance in heave is illustrated in Figure 83: the vehicle stayed at its desired position with small variations. In Figure 84, the signals for the yaw analysis are shown. In this case, there are high-frequency components in the measurements, which suggests measurement noise. This is explained by the fact that the image was noisier and the turbidity level higher than in the basin, which degraded the measurement quality at shorter ranges. Even so, the vehicle's orientation was maintained during the whole analysis period.

7.2.4 Conclusions

The experiments have shown that the controller successfully kept the vehicle in front of the fiducial marker in the sea, regardless of external influences. The oscillations in the graphs of the entire mission seem to have a high frequency, but the analysis of specific intervals shows that the oscillations actually occur over a longer period, as a result of external disturbances. It is believed that the oscillations are also a result of delay in the control loop; a system capable of handling this delay could improve the disturbance rejection and the overall controller performance.

Marker detection in the sea environment is subject to several factors that degrade the image, as discussed in Section 3, which reduces the range of operation of visual servoing missions. An alternative to increase the range of operation is to compensate for the unfavorable characteristics of underwater images with processing before the image is passed to the marker detector. Preliminary results running algorithms to improve contrast, correct non-uniform lighting and filter noise showed an improvement of 70% in the marker detection rate; however, the image pre-processing increased the overall detection time by roughly 30%, which would add more delay to the control process. This opens a field to be addressed in future work.
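As an indication of what such a pre-processing stage could look like, the sketch below applies contrast-limited adaptive histogram equalization (CLAHE) on the luminance channel with OpenCV before the frame is handed to the marker detector. The function name and parameter values are illustrative assumptions, not the exact pipeline evaluated in the preliminary tests.

```python
import cv2

def preprocess_for_detection(bgr_frame):
    """Enhance contrast on the luminance channel as one possible
    pre-processing step before running the marker detector."""
    lab = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)
```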


Figure 81 – Surge controller performance on interval [770-900] seconds during mission in the sea

Figure 82 – Sway controller performance on interval [770-900] seconds during mission in the sea


Figure 83 – Heave controller performance on interval [770-900] seconds during mission in the sea

Figure 84 – Yaw controller performance on interval [770-900] seconds during mission in the sea

Figure 85 – Pixel trajectory on the image plane on the interval [770-900] seconds during the mission in the sea

8 Final Considerations

In this work, a visual controller for an underwater vehicle was designed for use with artificial fiducial markers. As an introduction, the main aspects of the related areas and a historical context of autonomous underwater vehicles were presented. The FlatFish AUV was then introduced in terms of dynamic model, sensors, actuators and embedded software. The challenges of underwater image processing were also shown, covering the reasons for underwater image degradation and the general appearance of such images. Additionally, the concepts of perspective transformation and camera calibration were developed. Finally, the background of artificial fiducial markers and the studied marker systems were discussed, as well as the background of visual servoing, its historical context and applications on mobile robots, and the main groups of visual servoing schemes: IBVS, PBVS and the combination of both, the HVS or 2 1/2 D. The control law and stability analysis of each of them were presented.

In terms of experiments, the first goal was to define the best marker system for underwater applications. The main open-source marker systems were tested: Aruco, Apriltags and ARToolKit. Among the tested libraries, the Apriltags marker system showed the best performance in terms of minimum detectable marker size, maximum distance to the marker and maximum angle with successful detection, in both shallow and deep sea scenarios. On the other hand, it presented the slowest detection time, being roughly two times slower than Aruco, for instance. Although processing time is an important factor, it has to be weighed against the detection rate, because it is better to have slower but consistent detections than quicker but intermittent measurements. In the tests, an average update rate of 7.4 Hz was perceived, which, given the dynamics of the system, is satisfactory.

The experiments in Gazebo were fundamental to validate the whole software integration, to correct bugs and to make sure that the error would generate efforts in the right degree of freedom and in the right direction. Afterwards, the results showed that the simple application of a static gain could lead the vehicle to lose the marker because of the aggressiveness at the beginning of the mission, when the error is larger; it can create an undesired pitch movement due to the dynamics of the vehicle and cause the camera to miss the marker. The use of the adaptive gain smoothed the beginning of the mission while keeping a significant gain value when the marker is closer and, therefore, achieving null error in steady state. An alternative to marker loss would be to apply a state filter capable of predicting the marker position even when it is not detected, which would let the vehicle correct its pose and find the marker again.

The results on the real vehicle have shown that the vehicle was able to keep its position in front of the marker and follow setpoints in all degrees of freedom. Both static and adaptive gains were evaluated. It was not possible to notice a significant difference between them because the error norm was small, which makes the adaptive parameter λ(0) dominate. The controller with the higher adaptive parameters gave the robot a quicker response in comparison with the others, but also more oscillation in steady state, and it also showed more sensitivity to measurement noise.
At sea, the visual controller was able to keep the vehicle's position in front of the marker for 15 minutes and to move to different positions during this time. The analysis has shown that the artificial marker is a good visual reference and can be used for visual servoing purposes in the sea, and the adaptive gain tuned in the pool was able to control the vehicle even under natural disturbances. In the plots it is possible to notice a delay between the commands and the vehicle's response, mostly due to the image processing time; this may also have impacted the controller's performance, causing oscillatory movements in steady state. On the other hand, preliminary results have shown that a pre-processing stage that increases contrast and reduces non-uniform lighting can increase the detection rate by approximately 70%, which mainly translates into more consistent samples and a larger range of operation in terms of distance and turbidity level; unfortunately, it adds about 30% to the processing time. Therefore, future work on improving image quality before the detection process, together with control strategies to handle delay such as the Smith predictor, might bring interesting results for underwater visual servoing missions.
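As an outline of the delay-handling direction suggested above (not implemented in this work), the sketch below wraps a proportional controller in a discrete-time Smith predictor. The first-order plant model and the delay length are placeholder assumptions, not identified FlatFish parameters.

```python
from collections import deque

class SmithPredictorP:
    """Discrete-time Smith predictor around a P controller (outline only)."""

    def __init__(self, kp, a, b, delay_steps):
        self.kp, self.a, self.b = kp, a, b  # P gain and model y[k+1] = a*y[k] + b*u[k]
        self.y_model = 0.0                  # undelayed model output
        self.delay_line = deque([0.0] * delay_steps)

    def step(self, setpoint, measurement):
        y_model_delayed = self.delay_line.popleft()
        # Feed back the undelayed model output plus the mismatch between the
        # real (delayed) measurement and the delayed model output.
        feedback = self.y_model + (measurement - y_model_delayed)
        command = self.kp * (setpoint - feedback)
        self.delay_line.append(self.y_model)
        self.y_model = self.a * self.y_model + self.b * command
        return command
```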
