IMAGE- AND POINT CLOUD-BASED DETECTION OF DAMAGE IN ROBOTIC AND

VIRTUAL ENVIRONMENTS

By

CALVIN DOBRIN

A thesis submitted to the

School of Graduate Studies

Rutgers, The State University of New Jersey

In partial fulfillment of the requirements

For the degree of

Master of Science

Graduate Program in Mechanical and Aerospace Engineering

Written under the direction of

Aaron Mazzeo

And approved by

______

______

______

______

New Brunswick, New Jersey

May, 2021

ABSTRACT OF THE THESIS

IMAGE- AND POINT CLOUD-BASED DETECTION OF DAMAGE IN ROBOTIC AND

VIRTUAL ENVIRONMENTS

By CALVIN DOBRIN

Thesis Director:

Aaron D. Mazzeo

Repair procedures are vital to maintain the integrity of long-term structures such as bridges, roads, and space habitats. To reduce the burden of manual inspection and repair of long-term environments, the proposed solution is an autonomous repair system used for damage detection and damage repair with very little human intervention. The primary purpose of this thesis is to lay the groundwork for the introductory steps related to detection of damage and creation of a virtual map for navigation in this system. It covers the process of initial detection of damage on a structure, confirmation of damage with detailed red-green-blue-depth (RGB-D) scanning, and development of a virtual map of the structure for navigation and localization of important features. We begin by reviewing numerous damage detection methods and establishing a case for optical 2D stereo imaging and 3D scanning. We applied image-processing and point cloud-processing methods to isolate damage in image and point cloud data. The potential of automating operation and data processing without human intervention is also discussed. To lay the groundwork of an autonomous system, a robot was set up to navigate in a virtual ROS environment and relay sensory information of its surroundings. This process establishes a framework for navigation and damage detection in such a system.


ACKNOWLEDGEMENTS

I would like to express my gratitude to my advisor, Professor Mazzeo, for his constant

feedback, patience, and encouragement over the course of this project. His feedback during our meetings gave me new angles to approach the project with that I never would have considered. I would like to thank Dr. Patrick Hull for overseeing my progress and guiding the goals of the project during my internship at NASA’s Marshall Space Flight Center. Whenever I hit a roadblock, he would actively help me brainstorm the best way to overcome the problem and get me in touch with colleagues who could help. I would also like to acknowledge Noah Harmatz and

Declan O’Brien from the Mazzeo Research group for their support on this project. Noah actively worked with me to integrate my research into damage detection with his work on patching and path planning and Declan graciously shared some of his datasets from working with his test

articles. It was a pleasure brainstorming with them both on the larger scope of our project.

Finally, I would like to thank my grandparents for their emotional support and encouragement.

There were many occasions where I needed to step away and clear my head and they were always

there for me.


TABLE OF CONTENTS

ABSTRACT OF THE THESIS ...... ii

ACKNOWLEDGEMENTS ...... iii

LIST OF FIGURES ...... vi

Chapter 1. Introduction ...... 1

1.1 Methods for Imaging, Mapping, and Detecting Damage ...... 1

1.2 In-Space Manufacturing and Repair ...... 4

1.3 Potential Obstacles to Characterizing Structural Integrity Autonomously ...... 4

1.4 Overview of Envisioned System ...... 5

1.5 Overview of Work Described in Thesis ...... 7

Chapter 2. Camera- and Laser-Based Acquisition of Real-World Environments ...... 8

2.1 Cameras with Depth Information Acquisition ...... 8

2.1.1 Depth Information with ZED ...... 8

2.1.2 Depth Information with Azure Kinect DK ...... 12

2.1.3 Depth Information with Intel RealSense L515 ...... 13

2.2 Structured Light Scanning with Einscan Pro 2X Plus ...... 15

2.3 Localization with ZED Position Tracking ...... 17

2.4 SLAM-based Mapping: RTAB-Map ...... 19

2.4.1 Mapping with ZED ...... 19

2.4.2 Mapping with Azure Kinect ...... 21

Chapter 3. Characterization of Features for Repairing Damage in Civil Infrastructure and Space-Relevant Habitats ...... 24

3.1 Image Feature Extraction with RGB Images ...... 24

3.1.1 Surface features ...... 24


3.1.2 Edge Detection ...... 29

3.2 Point Cloud Analysis ...... 37

3.2.1 K-Nearest Neighbor Approach ...... 37

3.2.2 Fitting To 3D Surfaces ...... 45

Chapter 4. Simulation/Virtual Environment ...... 52

4.1 Mapping ...... 52

4.2 Post-Processing of Image and Depth Images ...... 59

4.3 Localization in Map ...... 61

Chapter 5. Conclusions ...... 62

Bibliography ...... 63


LIST OF FIGURES

Figure 1: The SRMS and OBSS system. Image from [13] ...... 3

Figure 2: System model of a concept for an autonomous repair system. The system is divided into a scan and patching phase. The IMU, Laser/LiDAR, and depth camera move over the surface to be scanned. The scan phase obtains image and point cloud data using an RGB-D camera for damage detection in MATLAB. The mapping software uses the joint IMU data, laser scans, and image data from the IMU, laser/LiDAR, and camera, respectively. If damage is detected, the robot attempts to re-localize itself to where it found damage and proceeds to the patching stage. . 6

Figure 3: A ZED stereo camera. Image from [19]...... 8

Figure 4: A foam board test article used for damage detection identification. Damage on the surface consists of dropped items and punctured holes. The top right corner of the foam board shows five markers, used for feature detection, around a single crater...... 9

Figure 5: A confidence map of the upright foam board taken with ZED Depth Viewer. The shaded areas in the confidence map correspond to uncertainty in the depth map. Confidence values were set to approximately 85% confidence...... 10

Figure 6: A colored point cloud of the foam board surface displayed in MATLAB. The background points were filtered out...... 10

Figure 7: A point cloud of the foam board surface taken with the ZED Depth Viewer. The color-coding is based on values in the z-axis, where blue points are the lowest and yellow points are the highest...... 11

Figure 8: An Azure Kinect DK camera. Image from [20]...... 12

Figure 9: The poster board used for scanning with the Azure Kinect DK is shown (a) as a point cloud and (b) as a reference image. Image and point cloud courtesy of Declan O’Brien...... 12

Figure 10: An Intel RealSense L515 camera. Image from [21]...... 13


Figure 11: A view from above of the cardboard box surface and its point cloud. (a) A cardboard

box test article with a single hole. (b) A point cloud of the cardboard box as generated from the

Intel RealSense L515 and visualized in MATLAB...... 14

Figure 12: A view from the side of the surface of the cardboard box and its point cloud. a) A

point cloud of the box surface as seen from the side. b) The cardboard box with dimensions 41

cm x 16.5 cm x 15.9 cm, (L x W x H) at the same angle for reference...... 14

Figure 13: An Einscan Pro 2X Plus scanner. Image from: [22] ...... 15

Figure 14: A scan of a corner of the foam board taken with an Einscan Pro 2X Plus and the

ExScan Pro interface. The markers on the foam board were detected and highlighted in green. .. 16

Figure 15: The back of the foam board with a hole carved out and applied markers...... 16

Figure 16: A point cloud of the surface of the foam board taken with the Einscan Pro 2X Plus and

visualized in MATLAB viewed (a) from above and (b) from the side. The shape of the area of

damage appears as an indent into the surface...... 17

Figure 17: The camera view (top) and position graph (bottom) of the ZED camera as it is moving

forward...... 18

Figure 18: The mapping process with ZED odometry. a) A point cloud map of a room generated

inside of ROS’s RViz interface. b) A sample right image returned from the ZED camera to ROS.

...... 20

Figure 19: A point cloud map of a room generated using an Azure Kinect camera using RGB-D

odometry as shown in the right window of the RTAB-Map standalone interface. The left

windows show returned RGB and depth images used for the mapping process...... 21

Figure 20: A point cloud map of a room generated using LiDAR odometry from the Intel

RealSense L515 camera as shown in the right window of the RTAB-Map standalone interface.

Visual odometry was not used for mapping and no images appear in the left windows...... 22

Figure 21: Maps generated of the house from two trials. The path taken by the camera during

mapping appears in light blue. (a) The full map of the house as seen from above taken during the


faster mapping session. (b) The full map of the house as seen from above taken during the slower

mapping session. (c) An enlarged image of the bed point cloud from the slower mapping session.

(d) The path taken by the camera in the slower mapping session as seen from inside the bedroom.

...... 23

Figure 22: A picture of a corner of the foam board test article used for feature matching and

subtraction. This image corresponds to the “undamaged” case before an extra hole was applied. 25

Figure 23: Images of the corner of the foam board during processing with the surface features method. Features detected by the Harris Features method in the image before application of new damage (left) and in the image after application of new damage (right) ...... 25

Figure 24: Images of the corner of the foam board during processing with the surface features method. (a) Matched features between the before and after images, where the red-tinted features belong to the “before” image and blue-tinted features belong to the “after” image. Each red circle and green cross pair represents matched features. (b) Before and after images are overlaid on top of each other after transforming the “after” image using the matrix returned from the MSAC algorithm. Feature detection is run again to confirm the features are in the same locations...... 26

Figure 25: Images of the corner of the foam board during processing with the surface features method. (a) Subtracting the matrix corresponding to the before image from the “after” image results in a difference image. Here, the white circle represents damage that was not found in the first image. (b) Magnified image of the pixels corresponding to the hole...... 27

Figure 26: Results of comparing two offset images of a foam board with no changes applied to the surface. (a) “Before” image. (b) “After” image. (c) The difference image returned after overlaying the features in a and b and subtracting the two images...... 27

Figure 27: Three different grayscale images of the cardboard box test article used for image analysis: (i) without any applied damage, (ii) with a small hole of applied damage, (iii) with two holes of applied damage...... 30

Figure 28: Binary images of the cardboard box in case iii after additive application of image

processing operations: (a) Roberts edge detector with a threshold value of 0.093 b) bridging and

dilation with a 1-pixel radius disk structural element (c) hole filling (d) subtraction of the hole-

filled image with the previous image (e) filtering objects less than 0.01% of the area of the image.

...... 32

Figure 29: Remaining images after subtraction of hole-filled and non-hole-filled binary images

and filtering objects less than 0.01% of the area of the image. Results for cardboard box in (a)

case i (b) case ii...... 32

Figure 30: The image of the back of the cardboard box and its edges after application of the

Roberts method with a 0.093 threshold. (a) Image of back side of the cardboard box, taken with

the Intel RealSense L515 camera. (b) Image of the cardboard box after application of the Roberts

method with a threshold of 0.093 to the image of the cardboard box. (c) Enlarged image of b. ... 33

Figure 31: Binary images of the back of the cardboard box after additive application of image

processing operations. (a) Cropped image of LoG edge detector with a threshold value of 0.0043

(b) cropped image after bridging and dilation with a 1-pixel radius disk structural element (c)

cropped image after hole filling (d) full image after subtraction of the hole-filled image with the

previous image...... 34


Figure 32: Grayscale image of laminated poster used for edge detection analysis. Image courtesy of Declan O’Brien ...... 35

Figure 33: Binary images of the poster board after additive application of image processing operations. (a) Canny edge detector with a threshold value of 0.049 (b) bridging and dilation with a 1-pixel radius disk structural element (c) hole filling (d) subtraction of the hole-filled image with the previous image (e) filtering objects less than 0.01% of the area of the image...... 36

Figure 34: Point clouds taken with the Einscan Pro 2X Plus scanner of the foam board with (a) nothing placed on its surface and (b) a block placed in the center of its surface. The point clouds are shown upside down and at a tilt. They are color coded by height, and a little noise appears on both images outside of the point cloud surfaces...... 38

Figure 35: The ICP algorithm in MATLAB. The steps consist of iteratively matching points between two point clouds, removing incorrect matches, recovering rotation and translation transforms, and checking if the error is within a tolerance. Image from: [32]...... 39

Figure 36: The resulting point clouds after plotting the undamaged foam board with only the indices returned from the KNN algorithm. (a) The resulting point cloud after running the KNN algorithm between the pristine test article and the test article with a block placed on top. (b) The resulting point cloud after running the KNN algorithm between two different point clouds of the pristine test article...... 40

Figure 37: The back of the foam board after carving out damage and application of markers for structured light scanning (a) A picture of the foam board with damage from carving (b) The point cloud of the surface of the foam board...... 41

Figure 38: The pristine foam board point cloud plotted with only the points returned in the K-

Nearest Neighbor algorithm between the pristine and carved point clouds. (a) Slightly zoomed in

...... 41

Figure 39: Point clouds of the cardboard box taken with the Intel RealSense L515 camera. The point clouds correspond to the cardboard box with (i) no damage, (ii) one area of damage, and (iii) two areas of damage...... 41

Figure 40: Point clouds of the surface of the cardboard box with the background points removed.

The point clouds correspond to the cardboard box with (i) no damage, (ii) one area of damage, and

(iii) two areas of damage...... 43

Figure 41: Two pairs of point clouds of the cardboard box surface are shown after being

registered using the ICP algorithm. The pairs of point clouds are (a) the undamaged point cloud i

and the point cloud with one area of damage ii and (b) the undamaged point cloud i and the point

cloud with two areas of damage iii...... 44

Figure 42: The undamaged cardboard box point cloud plotted with only the points returned in the

K-Nearest Neighbor algorithm between (a) the undamaged point cloud i and the point cloud with

one area of damage ii and (b) the undamaged point cloud i and the point cloud with two areas of

damage iii...... 45

Figure 43: The remaining point clouds of the carved foam board after fitting to a plane. (a) A

point cloud of all the inlier points that belong on the surface of the plane and (b) the outlier points

that do not fit to a plane. The light blue region in the right point cloud corresponds to the

damaged area...... 46

Figure 44: A region of outliers in the point cloud corresponding to the shape of damage...... 46

Figure 45: The outliers of MATLAB’s fit to plane method for each of the cases of damage for the

cardboard box using a max distance threshold of 0.01. The point clouds correspond to the cardboard box with (i) no damage, (ii) one area of damage, and (iii) two areas of damage...... 47


Figure 46: The outliers of MATLAB’s fit to plane method for case ii: cardboard box surface with one area of damage using a max distance threshold value of 0.003...... 48

Figure 47: Point cloud of the poster paper taken with the Azure Kinect DK camera...... 48

Figure 48: Poster paper point cloud during the fit-to-cylinder operation. (a) The point cloud after removing inliers of the pcfitplane function. (b) The final point cloud after fitting to a cylinder.

...... 49

Figure 49: The isolated point cloud cluster representing the poster paper. This cluster was

manually selected from a group of clusters separated by an MSAC algorithm...... 49

Figure 50: The point cloud of a cross-section of the cylinder, corresponding to all points within

0.001 of the average y-value, viewed from the side. The markers are enlarged 50 x...... 50

Figure 51: The RANSAC polynomial fit vs the point cloud. (a) A plot of the points at the cross-

section of the cylinder vs the polynomial returned by MSAC. b) The 3D plot of the point cloud vs

the polynomial returned by MSAC...... 51

Figure 52: Images of the Gazebo interface with (a) the truss bridge world, consisting of a

damaged truss bridge model, the modified Turtlebot robot, and a Kinect depth camera,

(b) A closer view of the robot and virtual camera, and (c) a closer view of the damaged area in the

bridge for detection...... 53

Figure 53: A reconstruction of the truss bridge environment in RViz and RTAB-Mapviz at the

robot’s initial state is shown. The map is reconstructed through RTAB-Map using laser odometry.

(a) The reconstruction of the bridge environment in RViz. (b) The RTAB-Map interface during

mapping showing the latest received image (left column), the 3D map with superimposed image

data (center column), and graph view of the map (right column). (c) Enlarged view of 3D map. 54

Figure 54: Example images of the view from the Microsoft Kinect camera in the virtual

environment, as seen in RViz. The images shown are (a) an RGB image of the robot and ground

as seen from above (b) a depth image of the robot and ground as seen from above, where the

darkest regions are closest to the camera and the lightest regions are farthest from the camera. .. 55


Figure 55: The location of the robot after moving to one end of the bridge environment as seen in

(a) the virtual environment in Gazebo and (b) the visualization of the reconstructed map in RViz

...... 56

Figure 56: The RTAB-Mapviz interface during mapping is pictured. In the 3D Map window, the blue lines represent the boundary of the bridge and the path taken by the robot. Each image received from the virtual camera is superimposed on the map by location where it was taken. ... 57

Figure 57: An image returned by the virtual camera of the unobscured bridge surface after moving the robot out of sight of the camera...... 57

Figure 58: The image and map data after navigating to the location of damage. (a) The RTAB-

Mapviz interface with the image of the triangular region of damage in the left column, the 3D map of the environment with superimposed RGB images in the center column, and the graph view in the right column. (b) A close-up view of the image of the damaged region...... 58

Figure 59: Binary images of the damaged region after additive application of image processing operations (a) LoG edge detector with a threshold value of 0.001 (b) bridging and dilation with a

1-pixel radius disk structural element (c) hole filling (d) subtraction of the hole-filled image with the previous image ...... 59

Figure 60: (a) Point cloud data returned by the virtual Microsoft Kinect camera and sent to

MATLAB. (b) The inliers of the fit to plane method returned as a point cloud. (c) The outliers of the fit to plane method returned as a point cloud...... 60

Figure 61: The map of the environment reconstructed in RViz. The edges of the bridge in the generated map do not match the location of the edges in the sensory data in the new map...... 61



Chapter 1. Introduction

While visual inspection for damage on structures like bridges or roads is usually performed manually, this process can be hazardous, especially on bridges with low accessibility [1] [2]. At the same time, defects on the surfaces of bridges are an indicator of structural damage and must not be overlooked [3]. Furthermore, cracks and potholes on road surfaces are a safety hazard for drivers [4]. Even more hazardous are damage detection and repair procedures for space structures. Currently, repair procedures are carried out through extravehicular activities (EVAs), which expose astronauts to the vacuum of space, extreme temperature variations, radiation, and other hazards [5]. With plans to establish bases on the moon in the near future, damage detection and repair of space habitats will become an increasing concern. It is estimated that 10 impacts of particles heavier than 10⁻¹⁸ g occur every second per square meter on the surface of the moon. Since these particles travel at relatively large velocities, micrometeorites are able to cause substantial damage upon impact [6]. To reduce the risks associated with damage inspection and repair procedures, an automatic patching system responsible for all steps of repair, from damage detection to application of a patch, is proposed. In this thesis, the framework for such a system is

developed, encompassing damage detection and identification with a camera/laser system and

navigation in a 3D map.

1.1 Methods for Imaging, Mapping, and Detecting Damage

Damage is a loss of material as a result of a physical process [7]. While 2D/3D data

collection of pavement surfaces might be performed autonomously, detection and classification of damage are generally performed manually by technicians [8]. Modern methods of detecting such damage are optical-based and can be split into the categories of 2D image/disparity map analysis and 3D point cloud/model analysis.


Optical methods relying on 2D image analysis generally utilize computer vision and

machine learning to detect areas of interest in images or videos [4]. These methods may include

taking a picture of an area of interest, such as a pothole on a road, and applying 2D image detection algorithms to isolate damage [9]. More advanced damage detection methods such as

machine learning are also gaining traction; these train neural networks to perform image segmentation. Image processing can also be applied to disparity maps, which are created from the slight differences between two offset images, usually taken by a stereo or RGB-D camera. A disparity map shades the scene in an image based on the distance of each surface from the camera. Stereo cameras consist of two lenses that find depth information by measuring the displacement of pixels between the left and right images. RGB-D cameras such as the Microsoft Kinect and

Intel RealSense series use a combination of RGB (red-green-blue) visual information and infrared light (IR) to determine depth. The second category of optical-based analysis relies on point clouds to extrapolate depth information of the environment being scanned, where 3D point clouds are collections of data points in a 3D space that may represent an object or an environment. Point clouds can be created using disparity maps as bases, with laser scanners such as LiDAR, or with other scanners. The point cloud data can then be fit to a model of a geometric undamaged surface.

Using an algorithm to compare the measured point cloud to a baseline, deviations from the undamaged model may correspond to damage on the scanned surface [4]. As an alternative to laser scanners, structured light scanners can also generate point clouds of environments.

Structured light scanners perform scanning by projecting a pattern of light onto a surface and analyzing the deformation of the pattern [10].

To obtain the location of damage on a large structure, it is vital to map the entire surface of the structure and then localize the damage. Many approaches fall under the umbrella of

Simultaneous Localization and Mapping (SLAM), which accomplishes the goal of creating a map and localizing a robot within the map. There are three methods of measuring the environment to localize a robot in a 3D space: laser scanners, sonar devices, and cameras. [11] Out of these


methods, visual and LiDAR-based SLAM approaches are the most popular. SLAM approaches

utilizing visual data generally need only a stereo or RGB-D camera, and approaches using

LiDAR generally can work using only 2D or 3D LiDAR. It is possible to use proprioceptive

sensors such as inertial measurement units (IMUs) for better accuracy [12].

Presently, concepts for autonomous damage detection on road and bridge surfaces are gaining traction. In Pothole Detection Based on Disparity Transformation and Road Surface

Modeling by R. Fan et al., an experimental setup consisting of a stereo camera mounted on top of the hood of a car is used to obtain stereo road images. These images are then processed for damage using their image segmentation algorithm [9].

As a part of a post-Columbia-accident mission, NASA’s Orbiter Boom Sensor System

(OBSS) was developed to survey the space shuttle for damage by inspecting its Thermal

Protection System. The OBSS is a 50-foot-long boom with an instrument package consisting of

Manipulator System (SRMS) to perform scans [13]. However, the damage repair procedures

through this system are only achievable through EVA, or manual operation [2].

Figure 1: The SRMS and OBSS system. Image from [13]


1.2 In-Space Manufacturing and Repair

Repair procedures in space environments may take the form of 3D printing components

needed to patch holes of damage. The first three-dimensional printer, 3DP, was launched to the

ISS in 2014 in an effort to understand 3D printing in microgravity [14]. This process

demonstrated that 3D polymer printing was possible in microgravity and that files for printing in

space could be uplinked from the ground. The successor to this mission, the Additive Manufacturing Facility (AMF), allows for multimaterial polymeric printing and has printed many parts for NASA and other organizations [15].

Concepts of future missions for in-space manufacturing are currently underway, such as

Archinaut, CIRAS, and Dragonfly [16]. One such concept, Archinaut, is a suite for additive

manufacturing and robotic assembly for in-space construction developed by Made in Space [10].

Commercial Infrastructure for Robotic Assembly and Services (CIRAS) by Orbital ATK, now known as Northrop Grumman, aims to develop repeatable modular interfaces for in-space assembly. Finally, Dragonfly by Space Systems/Loral aims to demonstrate robotic operation of interfaces originally designed for EVA operations [16].

1.3 Potential Obstacles to Characterizing Structural Integrity Autonomously

Some obstacles to characterizing damage autonomously are due to weaknesses in damage

inspection methods. RGB-based methods like image segmentation are affected by environmental

conditions such as low lighting [4]. This can especially impact results in environments with

insufficient lighting, such as over the International Space Station [17]. Some other approaches

such as disparity/depth map segmentation have overcome problems caused by poor illumination

conditions [4]. Another consideration for image analysis is the effect of noise caused by factors

such as motion or atmospheric conditions [18]. RGB-D sensors such as the Microsoft Kinect suffer in direct sunlight due to infrared saturation [9]. This may impact the quality of depth maps


taken over outside structures such as bridges. Another concern to keep in mind is that laser-

scanning equipment is generally expensive to purchase and maintain [9].

Apart from obstacles related to damage detection, other factors to consider in the

proposed system are obstacles to localization of damage on a map. Such obstacles include

dynamic environments, changes in illumination, changes of geometry, and repetitive

environments. These can lead to errors during the mapping process [12]. Finally, for a fully

autonomous system, it is important to bridge each of the subsystems, such as damage detection,

navigation, and patching. Section 1.4 introduces the envisioned final system proposed by this

project and section 1.5 goes over the work done in this thesis as part of this proposed system.

1.4 Overview of Envisioned System

A diagram of a concept system model for an autonomous in-space repair system is shown in Figure 2. The envisioned system would consist of an RGB-D camera, a laser or LiDAR device, an IMU, and a robot manipulator with a 3D-printing end effector connected to a computer on the

ROS (Robot Operating System) network. Before any inspection takes place, the robot will perform a one-time baseline scan of the structure to generate a map of the environment through

RTAB-Map (Real-Time Appearance-Based Mapping). During the baseline scan, the robot will travel along a 1-DOF path and use either RGB-D, 2D laser, or LiDAR-based odometry for mapping, as discussed later in Chapter 4. After an initial map is generated, the robot performs routine scans to assess the surface of the structure, using the same odometry as the baseline scan while also obtaining RGB and depth data with the camera. While the map is being generated, each independent image is saved and sent to MATLAB for preliminary damage detection. After a run has finished, each image is analyzed in MATLAB using an edge detection algorithm, such as the Laplacian of Gaussian (LoG) filter. If a hole has been identified within a specific tolerance, the system moves on to the localization step.


During this step, the robot moves to the location in the map corresponding to the image with damage. This shifts the system into Phase 2: Patching Phase. First, the robot takes a high-quality stationary scan of the area with damage using a structured light scanner. The point cloud information is sent to MATLAB again for analysis, where the scanned point cloud is compared to a baseline surface. The outliers, or points that do not match between both datasets, are segmented and the point cloud cluster corresponding to damage is isolated. The final step of the process is

3D printing the patch and using the robot to apply it on the area of damage.
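As a point of reference, the baseline comparison described above is carried out later in this thesis with MATLAB's plane-fitting routine (see Figures 43–46). A minimal sketch of that outlier-segmentation step is shown below, assuming the scanned surface is approximately planar; the file name and the distance threshold are placeholders.

% Minimal MATLAB sketch of the plane-fit outlier segmentation (the file name
% and the distance threshold are placeholders).
pc = pcread('scanned_surface.ply');              % high-quality scan of the area

maxDistance = 0.01;                              % max point-to-plane distance
[~, inlierIdx, outlierIdx] = pcfitplane(pc, maxDistance);

surfacePC = select(pc, inlierIdx);               % points on the fitted plane
damagePC  = select(pc, outlierIdx);              % candidate damage points

figure; pcshow(surfacePC); title('Inliers: undamaged surface');
figure; pcshow(damagePC);  title('Outliers: candidate damage');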

Figure 2: System model of a concept for an autonomous repair system. The system is divided into a scan and patching phase. The IMU, Laser/LiDAR, and depth camera move over the surface to be scanned. The scan phase obtains image and point cloud data using an RGB-D camera for damage detection in MATLAB. The mapping software uses the joint IMU data, laser scans, and image data from the IMU, laser/LiDAR, and camera, respectively. If damage is detected, the robot attempts to re-localize itself to where it found damage and proceeds to the patching stage.


1.5 Overview of Work Described in Thesis

This thesis discusses camera- and laser-based acquisition of real-world environments, characterizing features for damage detection, and methods for localization in large environments.

Thus, it covers only the Detection Phase in the system model of Figure 2. The quality of point clouds generated by stereo and RGB-D cameras is considered for damage detection and mapping, and the quality of point clouds generated with a structured light scanner is considered for high-accuracy damage detection. Several methods for damage detection are considered and a few are selected as candidates for application to the automatic system. The chosen methods are then used on sensory information collected from a simple simulation environment in ROS’s

Gazebo software. The process of integrating each subsystem is also discussed, suggesting how to send information from sensors, the robot, and the computer running damage detection analysis.

Finally, future steps of the project are discussed.


Chapter 2. Camera- and Laser-Based Acquisition of Real-World

Environments

2.1 Cameras with Depth Information Acquisition

When choosing a camera for computer vision applications, a few application-dependent considerations are necessary, such as depth sensing, motion tracking, and available

Application Programming Interfaces (APIs) or software development kits (SDKs). Three types of cameras were considered for this study: a ZED stereo camera, an Azure Kinect DK, and an Intel

RealSense L515. All three of these cameras came with their own SDKs, or tools that support integration with other applications and development. Each device had integration with both

MATLAB and ROS for applications such as data transfer.

2.1.1 Depth Information with ZED

The ZED camera by StereoLabs is a stereo camera that obtains depth information and visual odometry using stereoscopic vision. It distinguishes itself from other 3D sensors by introducing outdoor long-range sensing. Some official uses for the ZED camera include RGB image acquisition, development of disparity maps, and development of point clouds. ZED’s SDK comes with an official ROS wrapper for publishing image and depth map information through the

ROS network. It also comes with a MATLAB interface. This interface allows for obtaining depth, point cloud, and positional tracking data in MATLAB.

Figure 3: A ZED stereo camera. Image from [19].


To demonstrate whether the camera could detect damage on a surface, the camera was set up to take pictures of a high-density foam board with damage across its surface. Figure 4 shows the foam board test article used.

Figure 4: A foam board test article used for damage detection identification. Damage on the surface consists of dropped items and punctured holes. The top right corner of the foam board shows five markers, used for feature detection, around a single crater.

The official ZED Depth Viewer software was used to view the depth map of the environment in the camera’s field of view in real time. Each frame of the real-time depth video corresponded to a single pair of left and right images. In each test, the depth map returned by the camera’s software was qualitatively evaluated and compared to the actual article. The following figure shows a depth map of the foam board with the confidence manually reduced until every edge corresponding to a hole was accounted for in the depth map. This type of depth map, known as a confidence map, is used to display the accuracy in determining depth. Confidence values of approximately 85% were used instead of the full-confidence depth map because 100% confidence failed to mark every crater.


Figure 5: A confidence map of the upright foam board taken with ZED Depth Viewer. The shaded areas in the confidence map correspond to uncertainty in the depth map. Confidence values were set to approximately 85% confidence.

Using the depth map as a basis, the ZED camera software automatically generates a 3D point cloud of the environment. Because each point cloud is based on static images taken at a single point in time, the point cloud only has information of the foam board and environment from one perspective. The following figure shows a colored point cloud of the foam board generated using

ZED Depth Viewer and with the background points filtered out.

Figure 6: A colored point cloud of the foam board surface displayed in MATLAB. The background points were filtered out.


To properly assess the accuracy of the point cloud, the point cloud was color-coded to correspond to the position of each point in the z-axis. Figure 7 shows a slightly zoomed in version of the previous point cloud, where blue points correspond to a lower position while yellow points are the highest.

Figure 7: A point cloud of the foam board surface taken with the ZED Depth Viewer. The color-coding is based on values in the z-axis, where blue points are the lowest and yellow points are the highest.

This plot demonstrates the high degree of error in the point cloud, which appears to exaggerate the relief of the foam board surface even though the board is relatively flat. Numerous trials were conducted to improve the accuracy of the point cloud by varying the ambient lighting, the distance from the camera, and the test articles being scanned. Some general conditions were observed to improve accuracy, such as a brighter environment, a distance of 150 cm between the camera and test article, and a monochromatic test article. However, the camera still failed to create point clouds at the desired level of accuracy.

A potential explanation for the inconsistent results is that the camera is tailored toward outdoor use. Without an infrared sensor, the ZED camera relies on bright light to obtain accurate depth information. Furthermore, with improved accuracy at distances farther from the lenses,

ZED is better at gaining depth information from large areas.
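For reference, the background filtering and height-based color-coding shown in Figures 6 and 7 can be reproduced in MATLAB roughly as follows. This is only a sketch: the file name and depth cutoff are placeholders, the exported point cloud is assumed to contain per-point color, and the camera's depth axis is assumed to lie along z.

% Sketch of background filtering and z-based color-coding for an exported
% point cloud (file name and 1.6 m depth cutoff are placeholders; the .ply is
% assumed to contain per-point color).
pc  = pcread('zed_foam_board.ply');
loc = pc.Location;

keep      = loc(:,3) < 1.6;                      % drop far-away background points
surfacePC = pointCloud(loc(keep,:), 'Color', pc.Color(keep,:));
figure; pcshow(surfacePC); title('Filtered surface point cloud');

% Re-plot the same points color-coded by their z-values
figure;
scatter3(loc(keep,1), loc(keep,2), loc(keep,3), 1, loc(keep,3), '.');
colorbar; title('Point cloud color-coded by z');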


2.1.2 Depth Information with Azure Kinect DK

At the time of writing, the Azure Kinect DK camera is a fairly new RGB-D camera that is slowly gaining integration with popular software. The Azure Kinect contains an RGB camera, a time-of-flight depth camera, IR emitters, an inertial measurement unit (IMU), and a microphone array. The camera is compatible with the Azure Kinect SDK, which comes with a viewer, recorder, and application development tools. A separate Azure Kinect ROS driver allows for publishing sensor data to ROS. Due to the presence of an IMU, the Azure Kinect has the potential to act as both an RGB-D camera and an IMU in the system model from Figure 2.

Figure 8: An Azure Kinect DK camera. Image from [20].

A sample point cloud taken with the Azure Kinect DK of a poster board is shown in

Figure 9 next to a reference image of the poster board.

Figure 9: The poster board used for scanning with the Azure Kinect DK is shown (a) as a point cloud and (b) as a reference image. Image and point cloud courtesy of Declan O’Brien.


2.1.3 Depth Information with Intel RealSense L515

The Intel RealSense L515 is another new RGB-D camera. It is made up of an RGB

camera, an IMU consisting of an accelerometer and gyroscope, and LiDAR depth technology

with depth accuracy from ~5 mm to ~14 mm through 9 m². The camera is supported by Intel’s RealSense SDK, which offers code samples for example programs and wrappers for integration

with other software, such as MATLAB and ROS. It has been optimized for indoor lighting. Due

to the presence of an IMU and a LiDAR scanner, the Intel RealSense L515 has the potential to act

as an RGB-D camera, an IMU, and a LiDAR scanner in the system model from Figure 2.

Figure 10: An Intel RealSense L515 camera. Image from [21].

The camera was used to scan the surface of a cardboard box, with dimensions 41 cm x

16.5 cm x 15.9 cm (L x W x H), to resolve damage. A picture of the cardboard box is shown in

Figure 11 a. A point cloud of the environment was generated and fragmented to leave only the points corresponding to the surface of the box. The point cloud, shown in Figure 11 b, has a dip corresponding to the location of the hole in the actual box.



Figure 11: A view from above of the cardboard box surface and its point cloud. (a) A cardboard box test article with a single hole. (b) A point cloud of the cardboard box as generated from the

Intel RealSense L515 and visualized in MATLAB.

An alternate angle of the point cloud of the surface is presented in Figure 12 a to show the roughness of the surface as indicated by the point cloud. The same angle of the cardboard box is presented in Figure 12 b for reference.



Figure 12: A view from the side of the surface of the cardboard box and its point cloud. a) A point cloud of the box surface as seen from the side. b) The cardboard box with dimensions 41 cm x 16.5 cm x 15.9 cm, (L x W x H) at the same angle for reference.


2.2 Structured Light Scanning with Einscan Pro 2X Plus

The Einscan Pro 2X Plus scanner, as shown in Figure 13, was used for structured light scanning to create high quality point clouds.

Structured light scanners such as the Einscan Pro are generally much more expensive options than

RGB-D cameras. The Einscan uses proprietary software to perform scanning in three modes: fixed, HD, and rapid. Rapid mode was chosen for scanning because fixed mode requires the scanner to remain stationary and HD mode requires markers placed on the surface. Rapid mode allows the scanner to move and offers the option of marker or feature alignment during scanning. The foam board in Figure 4 was used again as the test article for structured light scanning. The following figure shows a sample scan of a corner of the foam board, where five markers were applied to facilitate scanning. Attempts to scan using only feature alignment and no markers resulted in loss of tracking. The process of scanning with the structured light scanner differed from scanning with RGB-D and stereo cameras by requiring that the scanner be moved in repetitive motions over the surface. After scanning was complete, the point cloud was meshed and saved as a point cloud file. EXScan Pro also allows the option to save the mesh as a CAD file, which may be useful for the process of modeling a patch to place over the damage.

Figure 13: An Einscan Pro 2X Plus scanner. Image from: [22]


Figure 14: A scan of a corner of the foam board taken with an Einscan Pro 2X Plus and the

ExScan Pro interface. The markers on the foam board were detected and highlighted in green.

A separate scan was done of the back side of the foam board, where damage was located

in one concentrated area. In this case, damage was carved out to create a more unique shape as

opposed to a round crater. The following figure shows a close-up of the foam board with the damaged crater and markers surrounding it.

Figure 15: The back of the foam board with a hole carved out and applied markers.

The entire surface of the foam board was scanned with the Einscan and exported as a .ply file for visualization in MATLAB. Figure 16 shows a magnified version of the point cloud with the damage and markers and a side profile of the surface to show the shape of the damage.




Figure 16: A point cloud of the surface of the foam board taken with the Einscan Pro 2X Plus and visualized in MATLAB viewed (a) from above and (b) from the side. The shape of the area of damage appears as an indent into the surface.

The point clouds generated by scans using the Einscan structured light scanner appear to be the most detailed out of the tests conducted. However, the need to include markers for surfaces with few unique features means this type of scanning may not produce viable scans without human handling. Furthermore, since the scan can only be run manually using proprietary software, there is not yet support for integration with MATLAB and ROS.

2.3 Localization with ZED Position Tracking

An important consideration when performing image- or point cloud-based damage detection is localizing the individual image or point cloud in a map of the entire structure. This


step allows the patching robot to navigate to the location of damage and apply its patch. Two methods were considered for achieving this objective: ZED position tracking and SLAM-based mapping. The ZED camera comes with a compatible position tracking script that tracks the location of the camera in space relative to its original location.

The ZED camera was connected to a laptop running MATLAB via USB connection.

Next, several trials were conducted in which the camera was moved along the side of a platform while running the position tracking script. The position data was recorded in MATLAB and compared to physical measurements of the platform taken with a tape measure. Figure 17 shows the camera view and the position graph of the camera in a trial run where the camera was moved forward.

Figure 17: The camera view (top) and position graph (bottom) of the ZED camera as it is moving forward.

Across several trials, the maximum percent error between the distance traveled as recorded by MATLAB and as measured with the tape measure was 3.3%. To provide more accurate readings, future tests may involve clamping the camera to a 1-degree-of-freedom track to prevent noise induced by bouncing.


The position tracking script was modified to save an image into a MATLAB cell at each

iteration of the algorithm. This allows the indices of the location data saved in an array to correspond to the indices of the images in the cell. Image-based damage detection can then be carried out on each individual image. If damage is detected, the index of the image with damage is matched with the index of the corresponding values in the stored position data.
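A sketch of this bookkeeping is given below. The grab and pose calls are hypothetical stand-ins for the ZED MATLAB interface; only the index-matching logic described above is illustrated.

% Sketch of the image/position bookkeeping described above. grabZEDImage and
% getZEDPose are hypothetical stand-ins for the ZED MATLAB interface calls;
% only the index-matching logic is illustrated.
numFrames = 200;
images    = cell(numFrames, 1);                  % image k pairs with position row k
positions = zeros(numFrames, 3);                 % [x y z] of the camera per iteration

for k = 1:numFrames
    images{k}      = grabZEDImage();             % hypothetical: current left image
    positions(k,:) = getZEDPose();               % hypothetical: current camera position
end

% If damage is later detected in image k, the matching camera location is
% simply positions(k,:), which can be handed to the navigation step.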

2.4 SLAM-based Mapping: RTAB-Map

Another method used to localize individual images on a map of the entire surface is

through SLAM-based mapping with RTAB-Map. RTAB-Map’s versatility allows many varied

odometry inputs for mapping. Furthermore, RTAB-Map has a ROS-compatible package for communicating with sensors and robots through the ROS network. The following methods for mapping were attempted with each of the previously discussed depth cameras through RTAB-

Map’s ROS package: mapping with ZED using stereo odometry, mapping with Azure Kinect using RGB-D odometry, and mapping with the Intel RealSense L515 using LiDAR odometry.

2.4.1 Mapping with ZED

Mapping with stereo cameras such as the ZED camera relies on interpreting depth information from two offset cameras: stereo odometry. RTAB-Map allows for the option of mapping using proprietary visual odometry by StereoLabs or using open source visual odometry by RTAB-Map. RTAB-Map’s visual odometry process involves three steps: feature detection, feature matching, and motion prediction, whereas ZED’s is proprietary [12] [23]. The mapping process generates a point cloud of the environment. After the full point cloud is generated, the database is saved by RTAB-Map and can be reopened for localization in the same environment.

The ZED camera was connected to a computer running RTAB-Map via USB and set up to send left and right stereo images to RTAB-Map through the ROS network. The camera was


then moved around the room as it performed mapping using ZED odometry. Figure 18 a shows

the mapping in progress of a room using the ZED camera and ZED odometry inside of RViz,

RTAB-Map’s native visualizer. The red-, blue-, and green-colored axes represent the location of the two lenses and their estimated movement in space during the mapping process. The center panel in the RViz interface visualizes the data coming in through ROS. This data, called displays, is listed in the left panel of the interface and it can include images, point clouds, the robot state, etc. An extra panel was added in the bottom left to display the most recent right stereo image used for mapping. The right panel is the views panel, which allows for changing view of the 3D space between the orbital camera, first-person camera, and top-down orthographic. The default orbital camera is used, keeping the markers for the ZED camera at the focal point of the 3D world.

Figure 18 b provides an enlarged view of the image of the room captured by the ZED camera.



Figure 18: The mapping process with ZED odometry. a) A point cloud map of a room generated inside of ROS’s RViz interface. b) A sample right image returned from the ZED camera to ROS.


2.4.2 Mapping with Azure Kinect

The Azure Kinect was connected to a computer running RTAB-Map via USB and set up

to send RGB images and depth images to RTAB-Map through the ROS network. The mapping

process used RTAB-Map’s RGBD odometry mode, which computes odometry by matching

features from RGB images and depth information from depth/disparity images. The Kinect was then moved around a bedroom as it performed mapping. Figure 19 shows the mapping in progress of the room using the Azure Kinect camera inside the default RTAB-Map interface. The right column shows the 3D point cloud and the left column shows image and depth data received from the camera.

Figure 19: A point cloud map of a room generated using an Azure Kinect camera using RGB-D odometry as shown in the right window of the RTAB-Map standalone interface. The left windows show returned RGB and depth images used for the mapping process.

2.4.3 Mapping with Intel RealSense and LIDAR

When mapping with the L515 camera, both RGB-D odometry and LiDAR odometry were considered for creating point clouds of a room in separate tests. With RGB-D odometry,

RTAB-Map uses a RANSAC approach to compute the transform between consecutive images


recorded by the camera [24]. While RGB-D odometry is dependent on images, LiDAR odometry

takes point clouds as input. To assemble a point cloud, incoming point cloud data is registered to

point cloud data already received by RTAB-Map using iterative-closest-point (ICP) [12]. To take advantage of the L515’s LiDAR sensor, mapping was done using LiDAR odometry. Figure 20 shows the RTAB-Map standalone app interface during mapping and the point cloud of a bedroom

being mapped. Since RTAB-Map is using LiDAR odometry, no RGB images are being received

by RTAB-Map in the left panels.
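The registration step mentioned above can also be illustrated outside of RTAB-Map with MATLAB's ICP implementation, which is used again for the point cloud comparisons in Chapter 3; the file names below are placeholders.

% Minimal ICP sketch: register a new scan onto a reference scan (file names
% are placeholders). pcregistericp iteratively matches points, rejects poor
% matches, and recovers the rigid rotation/translation between the clouds.
fixedPC  = pcread('reference_scan.ply');
movingPC = pcread('new_scan.ply');

tform     = pcregistericp(movingPC, fixedPC);    % estimated rigid transform
alignedPC = pctransform(movingPC, tform);        % apply it to the new scan

pcshowpair(alignedPC, fixedPC);                  % overlay the registered clouds
title('New scan registered to the reference scan');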

Figure 20: A point cloud map of a room generated using LiDAR odometry from the Intel

RealSense L515 camera as shown in the right window of the RTAB-Map standalone interface.

Visual odometry was not used for mapping and no images appear in the left windows.

The shape of the bed and the walls can be clearly made out by interacting with and moving the point cloud around in RTAB-Map’s interface. The points

corresponding to the bed frame are likely not included because the reflective material of the bed

frame interfered with the LiDAR sensor.

The Intel RealSense L515 camera was next used to generate a larger map spanning the

bedroom and living room of a house using the same procedure. The camera was manually turned


slowly around the bedroom, moved around the living room and returned to the bedroom. Figure

21 shows two maps generated from two sessions and enhanced versions of the second session. In the first session, the camera was moved around more quickly to generate a map, where the path (marked in light blue) can be seen from above. The top view of this map is shown in Figure

21 a. The second map, shown in Figure 21 b, was created during a mapping session conducted more slowly, allowing a map with more detail to develop. The path taken by the camera is obstructed by the ceiling of the map. Figure 21 c shows an enhanced view of the bed and Figure

21 d shows the path taken from the perspective of the bedroom, both in the second map.



Figure 21: Maps generated of the house from two trials. The path taken by the camera during mapping appears in light blue. (a) The full map of the house as seen from above taken during the faster mapping session. (b) The full map of the house as seen from above taken during the slower mapping session. (c) An enlarged image of the bed point cloud from the slower mapping session.

(d) The path taken by the camera in the slower mapping session as seen from inside the bedroom.


Chapter 3. Characterization of Features for Repairing Damage in Civil

Infrastructure and Space-Relevant Habitats

3.1 Image Feature Extraction with RGB Images

Two types of damage detection philosophies were considered when performing image-

based damage detection: before and after image comparison and edge detection. Initially the goal

was to determine the existence of damage with a high degree of certainty using only RGB or

depth images. To achieve this, surface feature matching and comparison was tested on before-

and-after image pairs. When the goal shifted to using point cloud-based methods to classify and

detect damage, the focus with RGB images became finding potential areas of damage and edge

detection methods became a consideration.

3.1.1 Surface features

One attempt to detect areas of damage was feature detection and matching. The premise

of this method is finding an area in a new image that differs from the same area in a baseline

image. This process can be accomplished by taking two separate images, one before damage

occurs, and one after damage occurs. If features differ between the two images, these differences are designated as damage. Since before and after images are not likely to

be taken at the exact same orientation, the first step in this process is to align features in the two

images. This was tested using the Harris-Stephens algorithm for feature matching in MATLAB.

The Harris-Stephens algorithm (shortened to Harris features) detects corner points, or features, in a 2D image [25] [26]. Two images were taken of the test article, one before damage was applied and one after damage was applied. It was noted that if there are marks or edges on the foam blocks present in both photos, the detection and matching process has a good chance of finding features that can be matched. Figure 22 shows a portion of the foam board shown earlier in Figure 4.

matched. Figure 22 shows a portion of the foam board shown earlier in Figure 4.

Figure 22: A picture of a corner of the foam board test article used for feature matching and subtraction. This image corresponds to the “undamaged” case before an extra hole was applied.

Two separate photos were taken of this portion of the foam board: one as the foam board appears in Figure 22 and one with an added hole. Both images were converted to grayscale and the Harris Features method was run on both images. Figure 23 shows the output of this algorithm on both images, where the green plus markers correspond to features detected in each image.

Figure 23: Images of the corner of the foam board during processing with the surface features

method. Features detected by the Harris Features method in the image before application of new

damage (left) and in the image after application of new damage (right)


The “before” and “after” images were overlaid and displayed in Figure 24 a. The red-tinted features belong to the before image and the blue-tinted features belong to the after image. Each red circle and green cross pair represents matched features. The next step in the process is aligning the features in both images. To find the translation and rotation between the “before” and “after” images, an M-estimator Sample Consensus (MSAC) algorithm was applied. MSAC belongs to the family of M-estimators, a class of robust estimators useful for estimating the parameters of mathematical models [27], [28], [29]. This instance of the algorithm was used to compute the transformation matrix representing the offset of the features in the “after” image from the “before” image. By finding the inverse of this transformation matrix, the scale and rotation angle between the two images are recovered. Applying a transformation matrix with this information to the “after” image lined up the features of both images. Figure 24 b demonstrates how these images look when overlaid.


Figure 24: Images of the corner of the foam board during processing with the surface features method. (a) Matched features between the before and after images, where the red-tinted features belong to the “before” image and blue-tinted features belong to the “after” image. Each red circle and green cross pair represents matched features. (b) Before and after images are overlaid on top of each other after transforming the “after” image using the matrix returned from the MSAC algorithm. Feature detection is run again to confirm the features are in the same locations.


The light blue hole on the right corner still appears because it failed to match with any feature from the before image.

Finally, the “before” image was subtracted from the “after” image. The result is a

difference image with black pixels indicating features that canceled out and lighter pixels

indicating regions that did not match between both images as shown in Figure 25. A closer view of the lighter pixels corresponding to the hole is also shown.
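As a rough illustration, the workflow above can be sketched in MATLAB with Computer Vision Toolbox functions. The file names are hypothetical placeholders, and this sketch estimates the transform from the “after” image directly into the “before” frame, which is equivalent to inverting the offset transform described above.

% Minimal sketch of the before/after feature matching and subtraction
% workflow; file names are hypothetical placeholders.
before = rgb2gray(imread('board_before.jpg'));
after  = rgb2gray(imread('board_after.jpg'));

% Detect and match Harris corner features in both grayscale images
ptsBefore = detectHarrisFeatures(before);
ptsAfter  = detectHarrisFeatures(after);
[featB, validB] = extractFeatures(before, ptsBefore);
[featA, validA] = extractFeatures(after,  ptsAfter);
pairs = matchFeatures(featB, featA);

% Estimate the offset between the images with MSAC (similarity transform)
% and warp the "after" image into the "before" frame
matchedB = validB(pairs(:, 1));
matchedA = validA(pairs(:, 2));
tform = estimateGeometricTransform(matchedA, matchedB, 'similarity');
afterAligned = imwarp(after, tform, 'OutputView', imref2d(size(before)));

% Subtract the aligned images; bright pixels mark unmatched regions
diffImage = imabsdiff(afterAligned, before);
imshow(diffImage)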


Figure 25: Images of the corner of the foam board during processing with the surface features method. (a) Subtracting the matrix corresponding to the before image from the “after” image results in a difference image. Here, the white circle represents damage that was not found in the first image. (b) Magnified image of the pixels corresponding to the hole.

The pixels in this image that are not pure black represent features that did not match between both images. While the new hole appears isolated through this method, features that were not intentionally changed between the two images also appeared in the difference image, including a portion of the sharp rounded edge around the top center of the image and the wooden pattern outside of the foam board.

To further investigate this, two images of the same features on a foam board were compared using the same feature detection and matching process. Figure 26 a and b show two separate pictures of another corner of the foam board, which was not changed between the two photographs. After applying the image comparison process described above, the final difference image, shown in Figure 26 c, contained leftover fragments instead of being pure black. Ideally, the features in both images would perfectly align and cancel after the final subtraction step. The presence of white pixels in the difference image suggests the features in the two images were not matched completely.

Two limitations of the surface feature matching method are that it requires identifiable surface features and that slight perceived differences between two images may result in false positives for damage detection.


Figure 26: Results of comparing two offset images of a foam board with no changes applied to the surface. (a) “Before” image. (b) “After” image. (c) The difference image returned after overlaying the features in a and b and subtracting the two images.


3.1.2 Edge Detection

Another image processing tool for damage detection considered in this thesis is edge

detection. Edge detection methods detect outlines, or edges, in images. They are used for tasks such as finding boundaries around objects, detecting objects, and distinguishing objects from the background [18]. MATLAB offers several methods for detecting edges in grayscale images. These include the Sobel, Prewitt, Canny, and Roberts methods, which use a gradient-based approach to detect edges: they search for regions in an image where the gradient, or the directional change in the intensity of the image, is a maximum or local maximum. Two other methods, Laplacian of Gaussian (LoG) and zerocross, find edges by looking for zero-crossings, or points where a function changes sign (such as positive to negative), in the grayscale image [30]. Each of these edge detection methods was tested on test articles with different damage.

Further, each method was tested under different thresholds to retain only the desired edges. The following series of images demonstrates a method for isolating damage on a cardboard box. Three images of the box were taken: case i, before external damage was applied; case ii, after a small hole was added; and case iii, after a second hole in the shape of a triangle was cut out. The images were converted to grayscale as shown in Figure 27. The pictures were taken approximately 50 cm from the surface of the box with a Samsung Galaxy S20+ phone and lowered to resolutions ranging from 974 x 543 px to 1074 x 599 px. The distances of the camera from the test article and the resolutions were varied slightly to test the damage detection script under different conditions.
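For reference, the edge detection methods named above can be compared side by side in MATLAB with a short sketch such as the following. The image file name is a hypothetical placeholder, and no thresholds are passed, so MATLAB selects one automatically for each method; the thresholds reported in this section were tuned manually per use case.

% Minimal sketch comparing MATLAB's edge detection methods on one of the
% grayscale test images; the file name is a hypothetical placeholder.
gray = rgb2gray(imread('box_case_i.jpg'));
methods = {'sobel', 'prewitt', 'roberts', 'canny', 'log', 'zerocross'};

figure
for k = 1:numel(methods)
    % With no threshold argument, edge() chooses one automatically for
    % each method; the thresholds in this section were tuned manually.
    subplot(2, 3, k)
    imshow(edge(gray, methods{k}))
    title(methods{k})
end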

A method that produced consistent results for isolating desirable edges in all three images was the Roberts method with a threshold value of 0.093, expressed as a numeric scalar. This method produced a decent balance between edges corresponding to potential damage and extra edges that did not correspond to damage. The result is a binary image, or an image composed of a matrix of 0 and 1 values, color-coded as black and white pixels, respectively. The application of this method to case iii, with two cut-out holes, is shown in Figure 28 a.



Figure 27: Three different grayscale images of the cardboard box test article used for image analysis: (i) without any applied damage, (ii) with a small hole of applied damage, (iii) with two holes of applied damage.

Running the Roberts method with a lower threshold includes more noise in the results, while a larger threshold may potentially mark less of the damage of interest.

After obtaining the image with edges, it is desired to fill the holes inside of fully closed edges. Before application of a hole-filling method, the boundary lines around damage must be connected for the area to be recognized as a hole. To connect the lines in the image, first a bridge morphological operation was applied using MATLAB’s bwmorph method. The operation


bridges unconnected pixels, or sets 0-value pixels to 1 if they are between two 1-value pixels.

Next, MATLAB’s imdilate function was used to dilate the image with a disk structural element.

The radius of the structural element was chosen to be 1 unit, which translates to 1 pixel in the image. The image returned after bridging and dilating can be seen in Figure 28 b. To fill the holes, MATLAB’s hole-filling function, imfill, is applied. This function detects object

boundaries and performs a flood-fill operation to convert empty background pixels to “filled” foreground pixels [31]. Figure 28 c shows the image after applying the hole-filling method. To

isolate the holes in this image, the images in Figure 28 c and Figure 28 b were subtracted by a

simple matrix subtraction. Figure 28 d shows the resulting image after subtraction. Finally, to

remove the extra noise, MATLAB’s bwareafilt function was used to filter out objects smaller than 0.01% of the area of the image, measured in pixels. Figure 28 e shows the final image isolating edges corresponding to damage. Using MATLAB’s bwlabel function, the number of regions is automatically found to be 3, each of which is taken to correspond to a separate area of damage. In

this case, the three regions corresponded to the two holes applied as damage and the right edge of

the box.
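A minimal MATLAB sketch of this edge-detection and hole-isolation pipeline is shown below. The file name is a hypothetical placeholder, and bwareaopen is used here in place of bwareafilt for the same small-object filtering step; the remaining calls follow the steps described above.

% Minimal sketch of the edge-based hole-isolation pipeline for the case iii
% image; the file name is a hypothetical placeholder.
gray = rgb2gray(imread('box_case_iii.jpg'));

edges   = edge(gray, 'roberts', 0.093);          % Roberts edges, threshold 0.093
bridged = bwmorph(edges, 'bridge');              % connect nearly touching pixels
dilated = imdilate(bridged, strel('disk', 1));   % dilate with a 1-pixel disk
filled  = imfill(dilated, 'holes');              % flood-fill fully closed regions
holes   = logical(filled - dilated);             % keep only the filled interiors

% Remove objects smaller than 0.01% of the image area (bwareaopen stands in
% for bwareafilt in this sketch)
minArea = ceil(0.0001 * numel(gray));
cleaned = bwareaopen(holes, minArea);

% Count the remaining connected regions, each treated as candidate damage
[~, numRegions] = bwlabel(cleaned);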

Applying the same processing techniques and the exact same parameters to the images in cases i and ii of Figure 27, we obtain the images shown in Figure 29 after the final filtering operation. The final image after processing for case i is a completely black, or empty, image, as desired. The final image after processing for case ii contains one filtered region. While the

image processing operations for case i and ii did not identify the right edge of the box as a region

of damage, the result of processing for case iii did.

A potential explanation for this inconsistency is the dilation applied before filling thin, long holes. Because the dilation step enlarges the borders of objects, it has the

potential to leave less space for the hole-filling operation, especially if the edges of the hole are

close together. This leaves a much smaller region after the subtraction step.



Figure 28: Binary images of the cardboard box in case iii after additive application of image processing operations: (a) Roberts edge detector with a threshold value of 0.093, (b) bridging and dilation with a 1-pixel radius disk structural element, (c) hole filling, (d) subtraction of the hole-filled image with the previous image, (e) filtering objects less than 0.01% of the area of the image.

For images of different surfaces or distances, the parameters must be calibrated to yield more accurate results. The same process was applied to another image of the cardboard box,


Figure 29: Images after subtraction of hole-filled and non-hole-filled binary images and filtering objects less than 0.01% of the field of view. Results for cardboard box in (a) case i (b) case ii.


obtained during mapping of a room with the Intel RealSense L515 camera at a resolution of 1920 x 1080 px. Figure 30 a shows the opposite side of the cardboard box with a much smaller hole and other objects in the field of view. The same parameters used to isolate the damage for the three cases in Figure 27 do not achieve similar results for the image in Figure 30 a. Application of the Roberts method with a 0.093 threshold value returns the image in Figure 30 b. The left half of this image is enlarged and displayed in Figure 30 c to better view the edge boundary around the


Figure 30: The image of the back of the cardboard box and its edges after application of the

Roberts method with a 0.093 threshold. (a) Image of back side of the cardboard box, taken with the Intel RealSense L515 camera. (b) Image of the cardboard box after application of the Roberts method with a threshold of 0.093 to the image of the cardboard box. (c) Enlarged image of b.


area of damage.

To better isolate the hole, the parameters in the script were varied. Instead of the Roberts method, the Laplacian of Gaussian method was used with a threshold of 0.0043. Figure 31 shows the output of this edge detection method, the image after the bridge and dilate operations, the image after the hole-filling operation, and finally the entire zoomed-out image after the subtraction operation.


Figure 31: Binary images of the back of the cardboard box after additive application of image processing operations. (a) Cropped image of LoG edge detector with a threshold value of 0.0043

(b) cropped image after bridging and dilation with a 1-pixel radius disk structural element (c) cropped image after hole filling (d) full image after subtraction of the hole-filled image with the previous image.

Due to the small size of the region (about 0.0017% of the FOV), no filter was applied because this is lower than the arbitrary threshold for noise filtering used for other image


processing trials. This demonstrates the ability of this code to isolate very small damage, but at

the risk of not filtering out unwanted noise at the same scale.

The final test article for image analysis was a laminated poster paper slightly compressed

to bulge outward in the shape of a half-cylinder. Several cutouts were made in the article with

varying shapes and sizes to indicate damage and the edge detection process was applied again.

Figure 32 shows the grayscale image of the cylinder used for processing.

Figure 32: Grayscale image of laminated poster used for edge detection analysis. Image courtesy of Declan O’Brien

The Canny edge method was applied with a threshold of 0.049. Figure 33 shows the output of the Canny edge detector, the image after application of the bridge and dilate operations, the application of the hole filling method, the subtraction operation, and the remaining image after filtering noise.



Figure 33: Binary images of the poster board after additive application of image processing operations. (a) Canny edge detector with a threshold value of 0.049 (b) bridging and dilation with a 1-pixel radius disk structural element (c) hole filling (d) subtraction of the hole-filled image with the previous image (e) filtering objects less than 0.01% of the area of the image.


The number of regions detected in the final image is 8: four of these correspond to the cutouts and the other four correspond to pieces of tape holding the poster on the wall, with one piece of tape identified as two regions. While this method was effective in finding the biggest cutouts, including the very long and thin hole, it did not detect the smallest holes in the image. Although the edge detection step found the edges corresponding to these holes, they were removed, along with other undesired edges, during the subsequent steps of the algorithm.

3.2 Point Cloud Analysis

Due to the inconsistency in some results of the image-based methods, point cloud

processing methods were sought to detect areas of damage in point clouds of object surfaces. Two

separate methods were tested for point cloud processing: the K-Nearest Neighbor approach and fitting to 3D surfaces.

3.2.1 K-Nearest Neighbor Approach

The K-Nearest Neighbor method is useful for comparing two sets of query points, such as two matrices of point coordinates. This method was used to compare point clouds of a surface before and after damage. The back of the foam board test article shown in Figure 15 was used for testing before and after application of the damage. The Einscan Pro 2X Plus 3D scanner was used to obtain 3D point clouds of the test article for processing. Multiple scans of the foam board were taken before and after damaging the article. Figure 34 shows point clouds of the foam board generated by the 3D scanner, with the left point cloud corresponding to an undamaged foam board and the right point cloud corresponding to the foam board with a block placed on top of it. The point clouds are color coded based on height. While the surface of the actual test article is flat, the point clouds show an incline due to the default orientation assigned to the scanned point clouds by the scanner. Due to this orientation, the undamaged and damaged point clouds appear upside down


and at a tilt. It should be noted that what appears as a dip in the right point cloud is a protrusion on top of the surface caused by the test block rather than a dip inside the surface.


Figure 34: Point clouds of the foam board taken with the Einscan Pro 2X Plus scanner with (a) nothing placed on its surface and (b) a block placed in the center of its surface. The point clouds are shown upside down and at a tilt. They are color coded by height, and a small amount of noise appears in both images outside of the point cloud surfaces.

Except for some noise on the corners of the point clouds, the shapes of the point clouds closely matched the actual shape of the scanned test article. It should be noted that the slight variation in shape of the point clouds from each other is due to some edges of the point cloud not being scanned over during the scanning process.

After obtaining these scans, an ICP algorithm was run to align the before and after point clouds. The ICP workflow, as used in MATLAB, is shown in Figure 35. The ICP algorithm aims to iteratively reduce the distance between two offset point clouds. It first searches for the nearest neighbor of each point in one point cloud and matches it to a point in the second point cloud. Outliers, or incorrect matches, are checked for and filtered out. After this, translation and rotation transforms are tested on the second point cloud to minimize the distance between corresponding points in the two point clouds. This process continues until a distance tolerance is met [32].


Figure 35: The ICP algorithm in MATLAB. The steps consist of iteratively matching points between two point clouds, removing incorrect matches, recovering rotation and translation transforms, and checking if the error is within a tolerance. Image from: [32].

After both point clouds were aligned with the ICP method, MATLAB’s K-Nearest

Neighbor (KNN) algorithm was run between the two point clouds. K-Nearest Neighbor finds the closest point in one set to a query point from a different set. This search can be repeated for every point in a set, such as a point cloud. Given a point from one dataset, the algorithm finds the k closest points to it in another dataset. In this example, the base dataset is the pristine point cloud and the query dataset is one of the other point clouds from the trials. Since it is undesirable to match damaged areas in the damaged point cloud with points on the surface of the pristine point cloud, the k-value was set to 1 to avoid extra matches.
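A condensed MATLAB sketch of the ICP alignment followed by the nearest-neighbor comparison is shown below; the point cloud file names are hypothetical placeholders.

% Minimal sketch of the ICP alignment followed by the nearest-neighbor
% comparison; the point cloud file names are hypothetical placeholders.
pcPristine = pcread('foam_pristine.ply');
pcDamaged  = pcread('foam_damaged.ply');

% Align the damaged cloud to the pristine cloud with ICP
tform = pcregistericp(pcDamaged, pcPristine);
pcAligned = pctransform(pcDamaged, tform);

% For every point in the aligned cloud, find its single (k = 1) nearest
% neighbor in the pristine cloud
idx = knnsearch(pcPristine.Location, pcAligned.Location, 'K', 1);

% Re-plot the pristine cloud using only the matched indices; unmatched
% (damaged) regions appear as sparser areas
pcMatched = select(pcPristine, unique(idx));
pcshow(pcMatched)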

The KNN algorithm was run to find the nearest neighbors in the pristine point cloud for each point in the test point cloud with a block placed on top. The output of this run was a list of indices with matching neighbors between both sets of point clouds. The undamaged point cloud was then re-plotted only with the returned indices. The resulting point cloud of the pristine surface contained a less dense area around where the block was placed in the test point cloud, as


shown in Figure 36 a. To test the accuracy of this procedure, the ICP and KNN procedures were repeated on a pair of independent point clouds of the pristine foam board surface. One of the two pristine point clouds was then plotted with only the indices returned from this run. Figure 36 b shows this output. The returned point cloud appears similar to the pristine point cloud in Figure 34 a, but with less noise.


Figure 36: The resulting point clouds after plotting the undamaged foam board with only the indices returned from the KNN algorithm. (a) The resulting point cloud after running the KNN

algorithm between the pristine test article and the test article with a block placed on top. (b) The

resulting point cloud after running the KNN algorithm between two different point clouds of the

pristine test article.

In the second test, the back side of the foam board was carved as seen in Section 2.2.

Figure 37 shows a zoomed in image of the surface of the foam board and a reconstructed point

cloud showing the area where the surface was carved. The ICP algorithm was run between the

pristine point cloud of the back of the foam board and the new damaged point cloud,

followed by the K-Nearest Neighbor algorithm.

Plotting the pristine point cloud with only the indices returned by this algorithm produces Figure 38. The point cloud was slightly rotated and zoomed in to the area of interest. There are areas in the point cloud with a less dense assortment of points, such as the areas corresponding to the small markers on the foam board and the larger hole corresponding to the carved section. The presence of these features indicates that differences between the pristine and damaged point clouds were caught by the algorithm and not included in the returned indices.

Figure 37: The back of the foam board after carving out damage and application of markers for structured light scanning. (a) A picture of the foam board with damage from carving. (b) The point cloud of the surface of the foam board.


Figure 38: The pristine foam board point cloud plotted with only the points returned in the K-

Nearest Neighbor algorithm between the pristine and carved point clouds. (a) Slightly zoomed in and rotated point cloud (b) Larger zoom around the damaged area of the point cloud.

This method was repeated with point clouds of a cardboard box generated with an Intel

RealSense L515 RGB-D camera. Point clouds of the box in Figure 27 were generated for each of


the three cases of i) no damage, ii) one area of damage, and iii) two areas of damage. Figure 39

shows three point clouds of the cardboard box under the three different conditions along with the surrounding environment. The surface of the box appears to float above the floor because each point cloud was captured from directly above, without the sides of the box in view.


Figure 39: Point clouds of the cardboard box taken with the Intel RealSense L515 camera. The point clouds correspond to the cardboard box with (i) no damage, (ii) one area of damage, and (iii) two areas of damage.


To isolate the surface of the cardboard box, each point cloud was segmented into clusters based on the distance between points, after which the surfaces were isolated from the background. Figure 40 shows the isolated surfaces of each of these point clouds.
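The segmentation step can be sketched in MATLAB with pcsegdist as shown below. The file name and the minimum cluster distance are assumed values, and keeping the largest cluster is used here only as a simple stand-in for isolating the box surface from the background.

% Minimal sketch of distance-based clustering with pcsegdist; the file name
% and minimum cluster distance are assumed, and the largest cluster is
% taken here as a simple stand-in for the box surface.
pcScene = pcread('box_scene.ply');
labels = pcsegdist(pcScene, 0.02);                    % cluster by point spacing

counts = accumarray(double(labels(labels > 0)), 1);   % points per cluster
[~, largest] = max(counts);
pcSurface = select(pcScene, find(labels == largest)); % assumes an unorganized cloud
pcshow(pcSurface)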


Figure 40: Point clouds of the surface of the cardboard box with the background points removed.

The point clouds correspond to the cardboard box with (i) no damage, (ii) one area of damage, and

(iii) two areas of damage.

After isolating each surface, two sets of point cloud pairs were registered using the ICP algorithm to align the point clouds. The two registration trials were between the point cloud in


Figure 40 i and Figure 40 ii and between the point cloud in Figure 40 i and Figure 40 iii. The

results of registration via ICP for both point cloud pairs are shown in Figure 41.


Figure 41: Two pairs of point clouds of the cardboard box surface are shown after being registered using the ICP algorithm. The pairs of point clouds are (a) the undamaged point cloud i and the point cloud with one area of damage ii and (b) the undamaged point cloud i and the point

cloud with two areas of damage iii.

An ideal fit between two point cloud surfaces would be full overlap between both

surfaces. Since in both cases of Figure 41, the two point clouds are not fully lined up, matching

surface features by location will not produce meaningful results. Indeed, the point cloud of the

undamaged board plotted with only the indices returned by the KNN algorithm appears

perforated due to the low density of points, as seen in Figure 42. While the triangular hole from

the point cloud in Figure 42 b appears more prominently than the noise, the smaller hole cannot


be detected and little data can be gleaned about the location of damage relative to the rest of the point cloud.


Figure 42: The undamaged cardboard box point cloud plotted with only the points returned in the

K-Nearest Neighbor algorithm between (a) the undamaged point cloud i and the point cloud with one area of damage ii and (b) the undamaged point cloud i and the point cloud with two areas of damage iii.

3.2.2 Fitting To 3D Surfaces

The alternative point-cloud-based damage detection method is fitting to 3D surfaces. In this example, the high-quality point cloud of the foam board surface from Figure 37 b was used again for processing. The point cloud was fit to a plane using MATLAB’s plane-fitting function, which uses a variant of the MSAC algorithm [33]. This function returns inlier points that fit to a plane within a specified maximum distance. Figure 43 shows the inliers and outliers returned by the fit-to-plane function. In this example, the inliers return the undamaged surface and the outliers return


the area of damage and extra noise. Figure 44 shows the isolated damage region. The shape of the

damage is captured in detail with the point cloud scan, with two clearly defined peaks.
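A minimal MATLAB sketch of this plane-fitting step is shown below; the file name and the maximum point-to-plane distance are assumed values.

% Minimal sketch of isolating damage as the outliers of a plane fit; the
% file name and the maximum point-to-plane distance are assumed.
pc = pcread('foam_carved.ply');
maxDistance = 0.01;
[planeModel, inlierIdx, outlierIdx] = pcfitplane(pc, maxDistance);

pcSurface = select(pc, inlierIdx);    % undamaged surface
pcDamage  = select(pc, outlierIdx);   % carved region plus residual noise
pcshowpair(pcSurface, pcDamage)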


Figure 43: The remaining point clouds of the carved foam board after fitting to a plane. (a) A

point cloud of all the inlier points that belong on the surface of the plane and (b) the outlier points

that do not fit to a plane. The light blue region in the right point cloud corresponds to the

damaged area.

Figure 44: A region of outliers in the point cloud corresponding to the shape of damage.

Isolating damage using the fit to surface method was also attempted with point clouds

obtained from RGB-D cameras. The point clouds from Figure 40 were fit to planes using a maximum distance threshold of 0.01. The outliers can be seen in

Figure 45. To properly isolate damage, it is desirable that the outlier points correspond only to regions of damage. However, points corresponding to the sides of the box, which do not represent a real surface, were also included as outliers. This is an issue when scanning small surfaces, but may not be an issue when scanning much larger surfaces, as discussed in Chapter 4.


Figure 45: The outliers of MATLAB’s fit to plane method for each of the cases of damage for the cardboard box using a max distance threshold of 0.01. The point clouds correspond to the cardboard box with (i) no damage, (ii) one area of damage, and (iii) two areas of damage.

While the larger triangular hole from case iii was captured in the form of outlier points, the smaller hole did not appear isolated in the outliers of either case ii or case iii. A more aggressive plane-fitting threshold was used in an attempt to isolate the smaller hole. Figure 46 shows the outliers for the point cloud in case ii using a max distance threshold of 0.003. Since the point cloud recorded the hole as a small dent in the surface, the points corresponding to the hole did not diverge from the plane any more than the points at the border of the box.



Figure 46: The outliers of MATLAB’s fit to plane method for case ii: cardboard box surface with

one area of damage using a max distance threshold value of 0.003.

Point cloud fitting methods were also investigated for nonplanar surfaces. A point cloud

was taken of the poster paper from Figure 32 with the Azure Kinect DK camera. MATLAB’s

pcfitcylinder function was then used to fit this point cloud to a geometric cylinder. Figure 47

shows the point cloud as seen in MATLAB.

Figure 47: Point cloud of the poster paper taken with the Azure Kinect DK camera.

To isolate the poster paper of interest, a plane fit was applied to the entire point cloud.

The inliers, corresponding to points on the wall, were then removed. The remaining point cloud is shown in Figure 48 a. Next, the fit to cylinder function was applied with a maximum distance of

0.09, a maximum angular distance of 10 degrees, and a reference vector of [0, 1, 0] to the remaining point cloud. The inliers to the cylinder fit are shown in Figure 48 b.
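A MATLAB sketch of this two-stage fit is shown below, using the maximum distance, angular distance, and reference vector stated above; the file name and the plane-fit tolerance used to remove the wall are assumed values.

% Minimal sketch of the two-stage fit: remove the wall with a plane fit,
% then fit the remaining points to a cylinder. The file name and the
% plane-fit tolerance are assumed; the cylinder parameters are those above.
pcScene = pcread('poster_scene.ply');

[~, wallIdx, restIdx] = pcfitplane(pcScene, 0.01);    % wall as plane inliers
pcRest = select(pcScene, restIdx);

refVector = [0, 1, 0];
[cylModel, cylInlierIdx] = pcfitcylinder(pcRest, 0.09, refVector, 10);
pcCylinder = select(pcRest, cylInlierIdx);
pcshow(pcCylinder)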

The methods for fitting to geometric shapes discussed thus far have been applicable for planar and cylindrical surfaces. To test the possibility of expanding this method to more shapes,


the MSAC algorithm was considered for fitting a polynomial to the shape of the surface we would like to scan.


Figure 48: Poster paper point cloud during the fit-to-cylinder operation. (a) The point cloud after removing the inliers of the pcfitplane function. (b) The final point cloud after fitting to a cylinder.

This algorithm was applied in the form of MATLAB’s fitPolynomialRANSAC function.

Instead of being fit to a cylinder, the point cloud in Figure 47 was clustered with a minimum distance of 0.02 and the cluster corresponding to the cylinder was manually selected as a new reference point cloud. Figure 49 shows the isolated cluster.

Figure 49: The isolated point cloud cluster representing the poster paper. This cluster was manually selected from a group of clusters separated by a minimum distance between points.

Next, a cross-section of the cylinder was sought for fitting a polynomial. The cross-section was selected at the mean y-value, y_avg, in the direction along the height of the cylinder. To allow for more data points in finding the cross-section, all points within a range of y_avg + 0.001 and y_avg − 0.001 were selected and approximated to lie in the same plane. Figure 50 shows the point cloud of only the points selected within the range of y-values. The markers for each point in the point cloud are enlarged 50x for clearer viewing.

Figure 50: The point cloud of a cross-section of the cylinder, corresponding to all points within

0.001 of the average y-value, viewed from the side. The markers are enlarged 50 x.

The fitPolynomialRANSAC function was applied to these points to approximate a second-degree polynomial with a maximum allowed distance of 1, which is the maximum distance for a point to be considered an inlier. The curve of best fit and the points from the cross-section were plotted together, as shown in Figure 51 a. The polynomial corresponding to this fit, as found by the MSAC estimator, was approximately 1.97x² − 0.559x + 0.837. The polynomial fit line was plotted together with the full point cloud of the cylinder, as seen in Figure 51 b. Successive runs of the function generated different fits for the data that occasionally did not fit the data points as closely. As a result, more research must be done before applying these methods on surfaces of interest.



Figure 51: The RANSAC polynomial fit vs. the point cloud. (a) A plot of the points at the cross-section of the cylinder vs. the polynomial returned by MSAC. (b) The 3D plot of the point cloud vs. the polynomial returned by MSAC.


Chapter 4. Simulation/Virtual Environment

The final topic considered in this thesis is the application of the previously discussed techniques to a virtual environment. The goal of this virtual environment is to perform damage detection procedures through messages relayed from robot sensors and to localize damage on a large surface. The virtual environment presents an opportunity to test the system model presented at the beginning of the thesis without needing a robot or a large structure to scan. A virtual environment was created in ROS’s Gazebo software to replicate a setting for damage detection. A truss bridge was used as the surface on which to perform damage detection, and a TurtleBot Burger model was placed in the center of the bridge. The TurtleBot Burger model is equipped with a 360 Laser Distance Sensor LDS-01, which is emulated using ROS’s hls_lfcd_lds_driver. While another model, the TurtleBot Waffle, comes with its own camera, that camera faces forward, which complicates image acquisition of the bridge surface. To remedy this, a default TurtleBot Burger model was selected and edited to include a Microsoft Kinect camera facing downward and held up by a basic arm. The Kinect camera is emulated with an OpenNI driver. Since Gazebo does not have sensor plugins for a LiDAR camera, the 2D LDS sensor was used for mapping.

4.1 Mapping

An environment was created in Gazebo using a 15.4 m x 4.78 m (L x W) truss bridge model from Gazebo’s model repository [34]. The bridge was edited in the open-source software Blender, and a small triangular hole measuring 59 mm x 52 mm x 70 mm was applied to the surface. The edited TurtleBot model with the Kinect camera was then placed in the center of the bridge and configured to send laser scan and image data through ROS. Figure 52 a shows the


Gazebo interface with the Truss bridge environment and robot. Figure 52 b shows a closer view of the robot and Figure 52 c shows a closer look at the damaged area in the bridge.


Figure 52: Images of the Gazebo interface with (a) the truss bridge world, consisting of a damaged truss bridge model, the modified Turtlebot robot, and a Microsoft Kinect depth camera,

(b) A closer view of the robot and virtual camera, and (c) a closer view of the damaged area in the

bridge for detection.

The virtual camera was set up to simultaneously take RGB and depth images of its field

of view and send them as messages through ROS at regular intervals. Two methods were used for


obtaining the image data from the virtual camera: subscribing with MATLAB and subscribing with RTAB-Map, both through the ROS network. Mapping was set up with RTAB-Map to subscribe to both laser scans and RGB-D (image and depth) data. To visualize the mapping process, RViz was set up to subscribe to the image data from the camera and the map point cloud from RTAB-Map. Figure 53 a shows a visualization of the map of the environment as reconstructed from laser scans before the robot has moved, and Figure 53 b shows

the interface of the standalone RTAB-Map window (RTAB-Mapviz) during mapping. In the left column of RTAB-Mapviz, the latest returned image is displayed and labeled. The center image interface shows the 3D map, enlarged in Figure 53 c.


Figure 53: A reconstruction of the truss bridge environment in RViz and RTAB-Mapviz at the robot’s initial state is shown. The map is reconstructed through RTAB-Map using laser odometry.

(a) The reconstruction of the bridge environment in RViz. (b) The RTAB-Map interface during mapping showing the latest received image (left column), the 3D map with superimposed image data (center column), and graph view of the map (right column). (c) Enlarged view of 3D map.


In this case, the image seen by the camera is hovering in front of the edge of the bridge, which is marked by a purple line. The rightmost column shows a grid view of the map of the recorded environment.

The images published by the virtual camera were also visualized in real-time in a separate

RViz window. Figure 54 presents an example RGB image and an example depth image as visualized in RViz. In the depth image, the lighter shading corresponds to objects farthest away from the camera and darker shading corresponds to objects closer to the camera.


Figure 54: Example images of the view from the Microsoft Kinect camera in the virtual environment, as seen in RViz. The images shown are (a) an RGB image of the robot and ground as seen from above (b) a depth image of the robot and ground as seen from above, where the darkest regions are closest to the camera and the lightest regions are farthest from the camera.

The Turtlebot teleoperation ROS package was used to move the robot in the Gazebo environment. First, the robot was moved to the front side of the bridge (the right side in the visualization), where there is no damage to be found. Figure 55 a shows the location of the robot


in the Gazebo environment after moving to the end of the bridge and Figure 55 b shows the

updated visualization of the map in RViz. The right side of the reconstructed map has now expanded after the robot traversed the bridge in the Gazebo environment.


Figure 55: The location of the robot after moving to one end of the bridge environment as seen in

(a) the virtual environment in Gazebo and (b) the visualization of the reconstructed map in RViz

Figure 56 shows the RTAB-Mapviz interface during the mapping process. In the 3D map

window, the boundary of the environment and the path taken by the robot are visualized from information taken from laser scans. The images taken at each location are superimposed on the map. Since each image is indexed by ID, it is possible to track the location in 3D space at which each image was taken.


Figure 56: The RTAB-Mapviz interface during mapping is pictured. In the 3D Map window, the blue lines represent the boundary of the bridge and the path taken by the robot. Each image received from the virtual camera is superimposed on the map by location where it was taken.

To gather unobscured images of the surface of the bridge, the position of the camera was moved 0.8 meters to the side of the robot. RTAB-Map’s standalone software was set up to gather image data at each point in the path of the robot during mapping. This time the robot was moved across the back side of the bridge (the left side as recorded by the visualizer). The robot was first turned and moved in an approximately straight path toward the damaged region. Across most of the bridge, the images returned were of the grey surface as seen in Figure 57.

Figure 57: An image returned by the virtual camera of the unobscured bridge surface after moving the robot out of sight of the camera.


Furthermore, MATLAB was simultaneously set up to receive point cloud messages through ROS from the Kinect camera. This was achieved by initiating a ROS global node in

MATLAB and subscribing to messages on the Kinect’s /points topic. This point cloud data was not visualized in ROS but was visualized in MATLAB during post-processing.
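A minimal sketch of this subscription using MATLAB’s ROS Toolbox is shown below. The full topic name is assumed (the text specifies only that it ends in /points), and the ROS master address is assumed to be available through the standard ROS environment configuration.

% Minimal sketch of receiving the virtual Kinect point cloud in MATLAB over
% ROS; the full topic name and master configuration are assumed.
rosinit                                        % connect to the ROS master
sub = rossubscriber('/camera/depth/points');   % assumed topic ending in /points
msg = receive(sub, 10);                        % wait up to 10 s for a message

xyz = readXYZ(msg);                            % N-by-3 matrix of XYZ points
pc  = pointCloud(xyz);                         % convert for point cloud processing
pcshow(pc)
rosshutdown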

The mapping session was manually halted when the damaged area came into view of the camera. Figure 58 a shows the RTAB-Mapviz interface with the damaged region in view in the

New ID window and Figure 58 b shows the RGB image enlarged. The images in the 3D Map


Figure 58: The image and map data after navigating to the location of damage. (a) The RTAB-

Mapviz interface with the image of the triangular region of damage in the left column, the 3D map of the environment with superimposed RGB images in the center column, and the graph view in the right column. (b) A close-up view of the image of the damaged region.


window appear lined up in front of each other, indicating RTAB-Map recognizes they were taken

while the robot was moving approximately straight forward.

4.2 Post-Processing of Image and Depth Images

The images, depth images, and map data were all saved at the end of the mapping

session. The images taken when the robot passed over the area of damage were specifically

examined in MATLAB using the edge detection method outlined in 3.1.2. The LoG edge detection method with a threshold of 0.001 was applied to the image of the damaged region from

Figure 58. The output of the edge method is shown in Figure 59 a. The bridge operation and dilate operation with a disk structural element of radius 1 were applied to obtain Figure 59 b. The result of hole filling can be seen in Figure 59 c and the leftover area after subtraction is shown in

Figure 59 d. The leftover region is then classified as damage.


Figure 59: Binary images of the damaged region after additive application of image processing operations (a) LoG edge detector with a threshold value of 0.001 (b) bridging and dilation with a

1-pixel radius disk structural element (c) hole filling (d) subtraction of the hole-filled image with the previous image


The images returned by the virtual camera at locations outside of the region of damage were also

processed. Running the images of the undamaged region, such as the one from Figure 57, through

the same set of algorithms resulted in a completely black binary image, indicating no detected damage.

The point cloud data published by the camera during mapping and received by MATLAB

was processed using the fit to plane method discussed in Chapter 3. Figure 60 a shows a point

cloud from the virtual Kinect camera while it was hovering over the damaged area. A plane fit with a max distance threshold of 0.0005 was applied and returned the point cloud in Figure 60 b.

The outliers were isolated in Figure 60 c. This demonstrates that the methods discussed in

Chapter 3 are applicable to processing of data from virtual environments.


Figure 60: (a) Point cloud data returned by the virtual Microsoft Kinect camera and sent to

MATLAB. (b) The inliers of the fit to plane method returned as a point cloud. (c) The outliers of the fit to plane method returned as a point cloud.


4.3 Localization in Map

After confirming damage in MATLAB, it is desired to localize the robot over the area of

damage on the surface for either more comprehensive damage detection methods or for patching.

For this, RTAB-Map’s localization feature is the proposed solution. The robot was loaded into the

truss bridge environment and set to localization mode in RTAB-Map instead of mapping mode. It

was desired that after the robot moved around enough in its environment, it would localize itself

within the already generated map of the bridge. However, the location of the edges of the bridge

as sensed during the localization process did not line up with the edges in the already

reconstructed map of the bridge. Figure 61 shows the localization in progress as seen in RViz.

While the issues with localization in the virtual environment must be corrected first, the framework for damage detection, mapping, and communication between the subsystems has been demonstrated to work. Thus, it should be possible to apply similar configurations of ROS, RTAB-Map, and MATLAB in future virtual environments and real-world environments.

Figure 61: The map of the environment reconstructed in RViz. The edges of the bridge in the previously generated map do not match the location of the edges in the sensory data of the new map.


Chapter 5. Conclusions

Methods for damage detection with RGB images and 3D point clouds were used on a

number of physical test articles. Of the RGB image methods, edge detection-based methods appeared to provide a more accurate means of isolating damage by returning regions of possible damage. However, edge detection methods need to be calibrated to their use case to avoid using too high or too low a threshold. Point cloud analysis methods were also considered for more accurate

damage detection. Two methods of point cloud analysis were considered: matching before-and-

after point cloud pairs using the K-Nearest Neighbor approach and fitting to surfaces. Both methods showed potential to isolate craters as damage in high-quality point clouds taken with an Einscan Pro 2X Plus scanner. However, point clouds taken with RGB-D and stereo cameras had more error and could not be aligned well through the standard ICP approach. Fitting point clouds created with the Intel RealSense L515 RGB-D camera to geometric shapes and isolating damage was possible if specific tolerances were used during the fitting procedure. However, holes in hollow objects could not be resolved clearly through point cloud analysis.

Both edge detection and plane fitting approaches were used on images and point cloud information returned from a simulation in ROS of a robot moving across a bridge. Images and point clouds corresponding to damage recorded by a virtual Microsoft Kinect were sent to MATLAB for analysis. Damage applied to the surface of the bridge was isolated using both methods. The indices of the images were also compared to the indices stored in RTAB-Map, which relate each image to the location in the map where it was taken. This process establishes a framework for robotic navigation using ROS for detecting and localizing damage in a large and repetitive environment such as a bridge or space habitat. Future work in this project may involve methods to obtain clearer point clouds with RGB-D devices and automation of the patching of damage.


Bibliography

[1] M. Asce, A. Chatterjee, and M. Asce, “Pothole Detection and Classification Using 3D Technology and Watershed Method,” p. 7.
[2] C. E. Stewart, “EVA HAZARDS DUE TO TPS INSPECTION AND REPAIR,” p. 29.
[3] C. Zhang, C. Chang, and M. Jamshidi, “Concrete bridge surface damage detection using a single‐stage detector,” Comput. Civ. Infrastruct. Eng., vol. 35, no. 4, pp. 389–409, Apr. 2020, doi: 10.1111/mice.12500.
[4] R. Fan and M. Liu, “Road Damage Detection Based on Unsupervised Disparity Map Segmentation,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 11, pp. 4906–4911, Nov. 2020, doi: 10.1109/TITS.2019.2947206.
[5] B. Johnson, “National Aeronautics and Space Administration,” p. 158, Sep. 1988.
[6] M. I. Allende, J. E. Miller, B. A. Davis, E. L. Christiansen, M. D. Lepech, and D. J. Loftus, “Prediction of micrometeoroid damage to lunar construction materials using numerical modeling of hypervelocity impact events,” Int. J. Impact Eng., vol. 138, p. 103499, Apr. 2020, doi: 10.1016/j.ijimpeng.2020.103499.
[7] “What is Mechanical Damage? - Definition from Corrosionpedia,” Corrosionpedia. http://www.corrosionpedia.com/definition/1455/mechanical-damage (accessed Mar. 10, 2021).
[8] Y. Tsai, Y. Wu, and Z. Lewis, “Full-Lane Coverage Micromilling Pavement-Surface Quality Control Using Emerging 3D Line Laser Imaging Technology,” J. Transp. Eng., vol. 140, no. 2, p. 04013006, Feb. 2014, doi: 10.1061/(ASCE)TE.1943-5436.0000604.
[9] R. Fan, U. Ozgunalp, B. Hosking, M. Liu, and I. Pitas, “Pothole Detection Based on Disparity Transformation and Road Surface Modeling,” IEEE Trans. Image Process., vol. 29, pp. 897–908, 2020, doi: 10.1109/TIP.2019.2933750.
[10] A. Georgopoulos, C. Ioannidis, and A. Valanis, “ASSESSING THE PERFORMANCE OF A STRUCTURED LIGHT SCANNER,” p. 7, 2010.
[11] S. Riisgaard and M. R. Blas, “SLAM for Dummies.” [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=7AA0B94AB86ECAE4A64D70BC4A63F6C5?doi=10.1.1.208.6289&rep=rep1&type=pdf.
[12] M. Labbé and F. Michaud, “RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation,” J. Field Robot., vol. 36, no. 2, pp. 416–446, Mar. 2019, doi: 10.1002/rob.21831.
[13] G. Jorgensen and E. Bains, “SRMS History, Evolution and Lessons Learned,” presented at the AIAA SPACE 2011 Conference & Exposition, Long Beach, California, Sep. 2011, doi: 10.2514/6.2011-7277.
[14] S. Litkenhous, “In-Space Manufacturing,” NASA, Apr. 25, 2019. http://www.nasa.gov/oem/inspacemanufacturing (accessed Jan. 29, 2021).
[15] T. J. Prater and N. J. Werkheiser, “Summary Report on Phase I and Phase II Results From the 3D Printing in Zero-G Technology Demonstration Mission, Volume II,” p. 120.
[16] “In-Space Robotic Manufacturing and Assembly,” p. 19.
[17] M. Zhang, Y. Zhang, Z. Jiang, X. Lv, and C. Guo, “Low-Illumination Image Enhancement in the Space Environment Based on the DC-WGAN Algorithm,” Sensors, vol. 21, no. 1, Jan. 2021, doi: 10.3390/s21010286.
[18] J. R. Parker, Algorithms for Image Processing and Computer Vision. New York, United States: John Wiley & Sons, Incorporated, 2010.
[19] “ZED Stereo Camera | Stereolabs.” https://www.stereolabs.com/zed/ (accessed Mar. 11, 2021).
[20] “Buy the Azure Kinect developer kit – Microsoft.” https://www.microsoft.com/en-us/p/azure-kinect-dk/8pp5vxmd9nhq (accessed Mar. 11, 2021).
[21] “Intel® RealSense™ LiDAR Camera L515,” Intel® RealSense™ Depth and Tracking Cameras. https://www.intelrealsense.com/lidar-camera-l515/ (accessed Mar. 11, 2021).
[22] “EinScan Pro 2X Plus - Handheld Industrial Scanner,” EinScan. https://www.einscan.com/handheld-3d-scanner/2x-plus/ (accessed Mar. 11, 2021).
[23] “RTAB-Map,” RTAB-Map. http://introlab.github.io/rtabmap/ (accessed Mar. 11, 2021).
[24] “rtabmap_ros - ROS Wiki.” http://wiki.ros.org/rtabmap_ros (accessed Mar. 16, 2021).
[25] “Detect corners using Harris–Stephens algorithm and return cornerPoints object - MATLAB detectHarrisFeatures.” https://www.mathworks.com/help/vision/ref/detectharrisfeatures.html (accessed Mar. 14, 2021).
[26] C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” in Procedings of the Alvey Vision Conference 1988, Manchester, 1988, pp. 23.1–23.6, doi: 10.5244/C.2.23.
[27] P. H. S. Torr and A. Zisserman, “MLESAC: A New Robust Estimator with Application to Estimating Image Geometry,” Comput. Vis. Image Underst., vol. 78, no. 1, pp. 138–156, Apr. 2000, doi: 10.1006/cviu.1999.0832.
[28] “Find matching features - MATLAB matchFeatures.” https://www.mathworks.com/help/vision/ref/matchfeatures.html (accessed Mar. 16, 2021).
[29] D. Q. F. de Menezes, D. M. Prata, A. R. Secchi, and J. C. Pinto, “A review on robust M-estimators for regression analysis,” Comput. Chem. Eng., vol. 147, p. 107254, Apr. 2021, doi: 10.1016/j.compchemeng.2021.107254.
[30] “Find edges in intensity image - MATLAB edge.” https://www.mathworks.com/help/images/ref/edge.html (accessed Mar. 14, 2021).
[31] “Fill image regions and holes - MATLAB imfill.” https://www.mathworks.com/help/images/ref/imfill.html (accessed Mar. 14, 2021).
[32] “Register two point clouds using ICP algorithm - MATLAB pcregistericp.” https://www.mathworks.com/help/vision/ref/pcregistericp.html (accessed Mar. 12, 2021).
[33] “Fit plane to 3-D point cloud - MATLAB pcfitplane.” https://www.mathworks.com/help/vision/ref/pcfitplane.html (accessed Mar. 17, 2021).
[34] “osrf/gazebo_models,” GitHub. https://github.com/osrf/gazebo_models (accessed Mar. 19, 2021).