IEEE JOURNAL OF OCEAN ENGINEERING, VOL. 28, NO. 4, OCTOBER 2003
Mosaic Based Navigation for Autonomous Underwater Vehicles
Nuno Gracias, Sjoerd van der Zwaan, Alexandre Bernardino and José Santos-Victor, Member, IEEE
Abstract: We propose an approach for vision-based navigation of underwater robots that relies on the use of video mosaics of the sea bottom as environmental representations for navigation. We present a methodology for building high quality video mosaics of the sea bottom, in a fully automatic manner, that ensures global spatial coherency. During navigation, a set of efficient visual routines is used for the fast and accurate localization of the underwater vehicle with respect to the mosaic. These visual routines were developed taking into account the operating requirements of real-time position sensing, error bounding and computational load. A visual servoing controller, based on the vehicle kinematics, is used to drive the vehicle along a computed trajectory, specified in the mosaic, while maintaining constant altitude. The trajectory towards a goal point is generated online to avoid undefined areas in the mosaic. We have conducted a large set of sea trials, under realistic operating conditions. This paper demonstrates that, without resorting to additional sensors, visual information can be used to create environment representations of the sea bottom (mosaics) and support long runs of navigation in a robust manner.

Index Terms: Underwater computer vision, video mosaics, visual servoing, trajectory reconstruction, uncertainty estimation

Manuscript received ; revised . This work was supported by the EU-Esprit-LTR Proj. 30185 - NARVAL and by the Portuguese Foundation for Science and Technology PRAXIS XXI BD/13772/97.
The authors are with the VisLab, Instituto de Sistemas e Robótica, Instituto Superior Técnico, Av. Rovisco Pais, 1049-001 Lisbon, Portugal. Email: {ngracias, sjoerd, alex, jasv}@isr.ist.utl.pt

I. INTRODUCTION

This paper describes a methodology for mosaic-based visual navigation of underwater autonomous vehicles, navigating close to the sea floor. A high-quality video mosaic is automatically built and used as a representation of the sea bottom. A visual servoing strategy is adopted to drive the vehicle along a specified trajectory (indicated by waypoints) relative to the mosaic. The control errors are defined by comparing (registering) the instantaneous views acquired by the vehicle and the mosaic. The proposed approach was tested at sea with an underwater vehicle.

The autonomous navigation of underwater vehicles is a growing research and application field. A contributing factor is the increasing need for sub-sea data for activities such as environmental monitoring or geological surveying. Recent interest has been devoted to the development of smart sensors, where data acquisition and navigation are intertwined. These systems aim at freeing the human operator from low-level requirements, such as path planning, obstacle avoidance and homing. By providing the platforms with such a level of human independence, these systems reduce the operating costs and broaden the group of potential end users. The user's main tasks are the definition of the mission primitives to be carried out and higher-level mission control.

The underwater environment poses a difficult challenge for precise vehicle positioning. The absence of electromagnetic signal propagation prevents the use of long range beacon networks. Aerial or land robot navigation can rely upon the Global Positioning System to provide real-time updates with errors of just a few centimeters, anywhere around the world. The underwater acoustic equivalent is severely limited both in range and accuracy, thus requiring the prior deployment of carefully located beacons, and restricting the vehicle operating range to the area in between. Sonar equipment provides range data and is increasingly being used in topographic matching for navigation, but the resolution is too low for precise, sub-metric navigation. Vision can provide precise positioning if an adequate representation of the environment exists, but is limited to short distances to the floor due to visibility and lighting factors. However, for mission scenarios where the working locations change often and are restricted to relatively small areas, vision-based positioning can be the most appropriate.

The methodology in this paper considers mission scenarios where an autonomous platform is required to map an area of interest and navigate upon it afterwards, as illustrated in Fig. 1.

For building the video mosaic, the quality constraints are very strict, as the mosaic will be the basis for global navigation. Hence, highly accurate image registration methods are needed. The proposed mosaic creation method deals with long image sequences, the automatic inference of the path topology, and the 3-D recovery of the overall geometry. The quality of the mosaic is improved by ensuring global data consistency, using overlapping image regions from loop trajectories or zig-zag scanning patterns.

During navigation, the performance depends heavily upon the ability of the vehicle to localize itself with respect to the previously constructed map. Two important requirements are the bounding of localization errors and the real-time availability of position estimates. To address this, two distinct visual routines are run simultaneously: image-to-mosaic registration and inter-image motion estimation.

The first routine, also referred to as mosaic matching, is devised for accurate positioning and error bounding. It runs at a low frequency, and uses robust model-based feature matching to register a current image over the mosaic. The outcome of this routine is the 3-D position of the camera in a
world coordinate frame associated with the map. The second routine computes the image motion between camera images in real time. It provides updates over the last successful mosaic matching, but is prone to accumulate errors if not combined with the mosaic matching.

Fig. 1. The operation modes for the proposed mosaic-based navigation system (panels: Mosaic Creation, Mosaic Servoing).

Depending on the use, the localization information can either be expressed directly in terms of an image frame (e.g. pixel location on the mosaic) or converted to an explicit position and orientation of the vehicle in 3D.

To avoid driving the vehicle over areas where the mosaic matching may be too difficult (such as near the borders of the region covered by the mosaic), a trajectory generation module was implemented. This module provides a set of waypoints between the current and final locations, seeking a short travel path while keeping away from the mosaic borders.

A final control module consists of translating all these data into control errors and designing the controllers that drive the vehicle propellers.

This paper builds upon our previous work on mosaic creation, pose estimation and station keeping. In [1], [2] the basic process of sequential image processing is detailed for the creation of underwater mosaics, along with the pose estimation and error propagation. The present paper extends this approach to loop trajectories and distant superpositions, which are essential for the creation of spatially coherent mosaics over large areas.

In [3] a fast template-based pose estimation method and a vision-based controller are derived. These methods are especially suited for underwater station keeping [4]. We use the same controller structure, but extend it to navigate to distant points with respect to the starting position. Furthermore, a trajectory generation procedure is also implemented. A key point is the extension and integration of several methods for mosaic creation, pose estimation and visual servoing that complement each other.

Part of this work was done in the context of the European Project NARVAL [5]. The main scientific goal was the design and implementation of reliable navigation systems for mobile robots in unstructured environments, without resorting to global positioning methods. The algorithms and results described in this paper, where large mosaics are created and used for subsequent navigation, constitute a major achievement towards this goal.

A. Related Work and Contributions

This paper relates to the work of a number of authors. The most important differences are now highlighted.

1) Mosaic Construction with Global Registration: The application of mosaicing techniques to underwater imagery is a topic of increasing research interest, not only as a visualization tool for covering large areas [6], [7], [8], [9], but also as a spatial representation for underwater robotics [10], [11], [12]. Comparative results on vehicle positioning and mosaicing for long image sequences are reported in [13], where calibrated testing of the algorithm presented in this paper is included.

Considering the topic of global registration, several approaches have been proposed using topology inference of neighboring frames [14], [15], and restricted parameterizations for the projection matrices [16]. Recent methods allow the fast computation of globally consistent linear strip mosaics [17], and use Kalman filtering for closing the trajectory loops [18]. The main differences between our approach and all of the above are twofold: (1) the parameterization of the homographies with complete and meaningful 3D pose parameters, and (2) the inclusion of the unknown single world plane constraint.

2) Mosaics for Navigation: One of the early references to the idea of using mosaics as visual maps is the work of Zheng [19], where panoramic representations were applied to route recognition and outdoor navigation. However, the visual representations neither preserve geometric characteristics nor correspond to visually correct mosaics. This constitutes a drawback, as the representation is not fit for human perception, which is important for mission definition. Recently, Kelly [20] has addressed the feasibility and implementation issues of using large mosaics for robot guidance, predicting a large impact of these techniques on industrial environments. In this case, the problem is simplified by assuming that the image plane is parallel to the mosaiced areas and the motion of the vehicles is restricted to the ground plane.

Xu [21] investigated the use of seafloor mosaics, constructed using temporal image gradients, in the context of concurrent mapping and localization, for real-time applications. Despite careful compensation for systematic errors, possible loops in the camera path are neither taken into account nor used for compensating for the accumulated error, which prevents the use of the method for covering large areas. Huster [22] described a navigation interface using live-updated mosaics, and illustrated the advantages of using it as a visual representation for human operation. However, since the mosaic is not used in the navigation control loop, there is no guarantee that the vehicle is driven to the desired position.

One of the works most closely related to ours is [11], in the sense that it combines spatially consistent mosaics with underwater ROV navigation. However, in their approach, the navigation system requires additional sensors to provide heading, pitch and yaw information, whereas our work relies solely upon vision to provide information for all the relevant degrees of freedom. The integration of several different sensors benefits the robustness of the overall navigation system, provided that realistic estimates exist for the uncertainty associated with each sensor. However, it is of scientific relevance to know how far underwater vision systems can go when used alone, and to have ways of computing the uncertainty of the pose. Our paper is directed towards this goal.

Regarding navigation, our approach differs from the concurrent mapping and localization approaches (CM&L, SLAM) in the sense that the map is created in full prior to its use. Some authors have successfully implemented (and extended) CM&L for underwater navigation with mosaics [21], [11], which has the advantage of using the mosaic while it is being constructed, but leads to less accurate mosaics due to the real-time constraints.

3) Contributions: Our paper presents the first results on using large mosaics for servoing at sea, which validates the servoing approach. Previous experiments on visual trajectory following using mosaics at sea were restricted to proof-of-concept demonstrations, using very small mosaics and runs of a few meters [21], [11].

Our work contributes to the field of visual underwater navigation in several ways:
• The mosaic creation is approached in a fully automated and integrated way where global spatial consistency is imposed by estimating the image neighboring topology.
• All the degrees of freedom arising from the mosaic geometry are taken into account and parameterized as geometrically meaningful entities: pose parameters and world plane description. As a result, we retrieve the vehicle trajectory over time, together with the associated uncertainty.
• Different techniques of inter-image motion and image-to-mosaic matching were devised and used in a combined manner, together with robust estimation methods, providing the necessary degree of accuracy and robustness.
• A simple and effective visual servoing control scheme is used to drive the vehicle. The error signals are defined exclusively from image measurements.
• The appropriateness of the approach is demonstrated by successful experimental testing in the challenging, real-world conditions of the underwater environment (at sea).

B. Paper Organization

Section II reviews some useful entities and methods related to the mosaic geometry. Section III details the algorithm used for creating a navigation map through image mosaicing, along with illustrative results. Section IV describes the mosaic navigation, namely the algorithms implemented for the vehicle localization, the image-based control law and selected results from the mosaic servoing experiments. Section V draws conclusions and establishes directions for future work.

II. VISUAL GEOMETRY AND MOTION ESTIMATION

This section reviews some important geometric entities and estimation methods related to the mosaicing process.

A. Camera model

The adopted camera model is the standard pin-hole model, which performs a linear projective mapping of the 3D world into the image frame. The camera is assumed to be calibrated, with a known intrinsic parameter matrix $K$ [23], [24]. In this work, the radial distortion was estimated and used for the off-line creation of displacement maps. During image acquisition, these maps are used as look-up tables for the on-line correction of the distortion.

B. Homographies

We assume that the working area of the sea bottom can be approximated by a plane. Two views of the same 3-D plane are related by a homography [25] (also referred to as a planar transformation), which is represented by a $3 \times 3$ matrix $H$ defined up to a scale factor. A homography performs a point-to-point mapping between the homogeneous coordinates of the image points $x$ and $x'$, such that $x' = Hx$. It has, at most, eight degrees of freedom, which are illustrated in Fig. 2. The estimation of $H$ requires at least four pairs of corresponding points. In the case of more than four correspondences, it can be estimated by least-squares [26].

Fig. 2. Degrees of freedom of the planar projective transformation on images: (a) horizontal translation, (b) vertical translation, (c) rotation, (d) scaling, (e) shear, (f) aspect ratio, (g) projective distortion along the horizontal image axis, (h) projective distortion along the vertical image axis.

C. Visual Motion Estimation

The starting point for the creation of mosaics is the estimation of image motion between consecutive frames of a video sequence.

For each image $I_k$, a set of point features is extracted using the Harris corner detector [27] and matched over the following image $I_{k+1}$, through a correlation-based procedure. The correlation is performed using a fast implementation [28] of the Lucas-Kanade point tracker [29]. A robust estimation technique is used to filter out matching outliers and to estimate the homography $H_{k,k+1}$ that relates the coordinate frames of $I_k$ and $I_{k+1}$. A variant of random sampling LMedS, detailed in [1], is used.

A different approach to the computation of image motion was also investigated for station keeping applications. In this approach, the tracking of an image region is performed by combining optic flow information with template matching, which relies on a set of pre-computed motion models to describe the image deformations. The robustness of the method is increased by adapting the set of used models to the most commonly observed camera motions [3], [30].

where $s(\cdot)$ and $c(\cdot)$ represent the sine and cosine functions of the rotation angles.

Without loss of generality, we assume that all the world points belong to the plane defined by ${}^{W}z = 0$. An image-to-world homography has the form:

\[
\Psi(K, \Theta) = K \cdot {}^{C}R_{W} \cdot
\begin{bmatrix}
1 & 0 & -{}^{W}_{C}t_{x} \\
0 & 1 & -{}^{W}_{C}t_{y} \\
0 & 0 & -{}^{W}_{C}t_{z}
\end{bmatrix}. \tag{4}
\]

If a set of matches exists between image point projections and their world coordinates, then the camera pose can be estimated [2], by minimizing the following error function