
Underwater SLAM in Man-Made Structured Environments

David Ribas and Pere Ridao, Grup de Visio per Computador i Robotica, Universitat de Girona, Girona 17071, Spain. e-mail: [email protected], [email protected]

Juan Domingo Tardos and Jose Neira, Grupo de Robotica y Tiempo Real, Universidad de Zaragoza, Zaragoza 50018, Spain. e-mail: [email protected], [email protected]

Received 9 January 2008; accepted 30 May 2008

This paper describes a navigation system for autonomous underwater vehicles (AUVs) in partially structured environments, such as dams, harbors, marinas, and marine platforms. A mechanically scanned imaging sonar is used to obtain information about the location of vertical planar structures present in such environments. A robust voting algorithm has been developed to extract line features, together with their uncertainty, from the continuous sonar data flow. The obtained information is incorporated into a feature-based simultaneous localization and mapping (SLAM) algorithm running an extended Kalman filter. Simultaneously, the AUV's position estimate is provided to the feature extraction algorithm to correct the distortions that the vehicle motion produces in the acoustic images. Moreover, a procedure to build and maintain a sequence of local maps and to posteriorly recover the full global map has been adapted for the application presented. Experiments carried out in a marina located in the Costa Brava (Spain) with the Ictineu AUV show the viability of the proposed approach. © 2008 Wiley Periodicals, Inc.

1. INTRODUCTION

Simultaneous localization and mapping (SLAM) is one of the fundamental problems that needs to be solved before achieving truly autonomous vehicles. For this reason, in recent years it has been the focus of a great deal of attention (see Bailey & Durrant-Whyte, 2006; Durrant-Whyte & Bailey, 2006, and references therein). Multiple techniques have shown promising results in a variety of different applications and scenarios. Some of them perform SLAM indoors (Castellanos, Montiel, Neira, & Tardos, 1999), outdoors (Guivant, Nebot, & Durrant-Whyte, 2000), and even in the air (Kim & Sukkarieh, 2003). However, the underwater environment is still one of the most challenging scenarios for SLAM because of the reduced sensorial possibilities and the difficulty in finding reliable features. Acoustic devices are the most common choice. Many approaches extract features from acoustic data produced by imaging sonars (Leonard, Carpenter, & Feder, 2001; Tena, Petillot, Lane, & Salson, 2001) or sidescan sonars (Tena, Reed, Petillot, Bell, & Lane, 2003).


Some introduce artificial beacons to deal with complex environments (Newman & Leonard, 2003; Williams, Newman, Rosenblatt, Dissanayake, & Durrant-Whyte, 2001). An alternative approach is analyzing the three-dimensional (3D) structure of the environment to perform SLAM, either using bathymetric scans (Roman & Singh, 2005) or producing a tessellated representation of the scenario (Fairfield, Jonak, Kantor, & Wettergreen, 2007). The use of cameras is limited to applications in which the vehicle navigates in clear water and very near to the seafloor (Eustice, Singh, Leonard, Walter, & Ballard, 2005). On the other hand, the visual information can also be combined with acoustic data to improve the overall reliability of the SLAM system (Williams & Mahon, 2004).

This article focuses on underwater SLAM applied to man-made environments (harbors, marinas, marine platforms, dams, etc.) where structures composed of vertical planes are present and produce reliable features in acoustic images. Although most of the previous work done in this field focuses on open sea and coastal applications, obtaining an accurate positioning in such scenarios would notably increase autonomous underwater vehicle (AUV) capabilities. For instance, an AUV could use a harbor as an outpost for oceanography research if it is able to localize itself and navigate through it with enough accuracy to safely perform the leaving and returning operations (Griffiths, McPhail, Rogers, & Meldrum, 1998). Maintenance and inspection of underwater structures (Martins, Matos, Cruz, & Pereira, 1999) and even surveillance of marine installations are examples of other applications that can benefit from such a system.

We have chosen an extended Kalman filter (EKF)–based implementation of the stochastic map to perform SLAM. The algorithm relies on a mechanically scanned imaging sonar (MSIS) for feature extraction. Although these mechanically actuated devices usually have a low scanning rate, they are quite popular because of their reduced cost. When working with MSIS, it is not uncommon to assume that the robot remains static or moves slowly enough to neglect the induced acoustic image distortion. Here, the static assumption is removed. Vehicle position estimates from the filter are introduced to reduce the effects of motion-induced distortion in the resulting acoustic data. Typically, approaches using imaging sonars have focused on the use of point features (Tena et al., 2001; Williams et al., 2001). This work proposes the use of line features in underwater environments as a representation of the cross sections produced when a sonar scan intersects with existing planar structures. As a result, a two-dimensional (2D) map of the environment is produced by the SLAM algorithm. Most of the applications at hand involve only trajectories performed at a constant depth, and therefore, a 2D map is sufficient for navigation. Moreover, it is generally viable to apply the present method when 3D motion is also required because the predominance of vertical structures in the target scenarios makes the 2D map valid for different operating depths.

In recent years, many different authors have proposed methods to carry out SLAM by building sequences of local maps (Bosse et al., 2003; Bosse, Newman, Leonard, & Teller, 2004; Clemente, Davison, Reid, Neira, & Tardos, 2007; Estrada, Neira, & Tardos, 2005; Leonard & Feder, 2001; Leonard & Newman, 2003; Newman, Leonard, & Rikoski, 2003; Ni, Steedly, & Dellaert, 2007; Tardos, Neira, Newman, & Leonard, 2002; Williams, Dissanayake, & Durrant-Whyte, 2002). The main objective of such techniques is limiting the cost of updating the full covariance matrix, which has O(n^2) complexity (Guivant & Nebot, 2001). By working on local maps of limited size, the cost of these algorithms remains constant most of the time. In this work we consider conditionally independent local maps (Pinies & Tardos, 2007) because they allow sharing vital information between consecutive maps (in this case the underwater vehicle state). Strong empirical evidence also suggests that the use of local maps improves the consistency of EKF-based SLAM algorithms (Castellanos, Neira, & Tardos, 2004; Huang & Dissanayake, 2007).

A data set obtained during an experiment performed with the Ictineu AUV serves as a test for the proposed SLAM algorithm. The vehicle, a low-cost research platform of reduced dimensions developed at the Underwater Robotics Laboratory of the University of Girona (Ribas, Palomer, Ridao, Carreras, & Hernandez, 2007), performed a 600-m trajectory in an abandoned marina situated in the Spanish Costa Brava, near St. Pere Pescador (see Figure 1). The resulting data set includes measurements from many different sensors. A Tritech Miniking MSIS provides the acoustic imagery that feeds the feature extraction algorithm. The velocity measurements from a SonTek Argonaut Doppler velocity log (DVL) are introduced in the filter to perform dead reckoning, and a compass and a pressure sensor provide absolute measurements for the vehicle heading and depth.


Figure 1. The Ictineu AUV (left) and an abandoned marina used as test bed for the presented SLAM algorithm, situated near St. Pere Pescador, Spain (right).

Figure 2. Diagram of the proposed SLAM approach.

A diagram of the SLAM system can be seen in Figure 2.

This article is organized as follows. Sections 2 and 3 present a feature extraction algorithm that searches for line features in the data arriving continuously from the MSIS, removing the effect of motion-induced distortions, and estimates the feature uncertainty from its representation in the acoustic image (a preliminary version appeared first in Ribas, Neira, Ridao, & Tardos, 2006). The SLAM algorithm in Section 4 incorporates a method that allows integrating compass measurements independently from the orientation of the map base reference. This has opened the door to implement a method to build local maps for improving scalability and accuracy, as described in Section 5. The results obtained with the abandoned marina data set are presented in Section 6. Finally, the conclusions and further work are given in Section 7.

2. WORKING WITH ACOUSTIC IMAGES FROM A MSIS

MSISs perform scans in a 2D plane by rotating a fan-shaped sonar beam through a series of small-angle steps (Figure 3). For each emitted beam, an echo intensity profile is returned from the environment and discretized into a set of bins (distance vs. echo-amplitude values).


Figure 3. Representation of a mechanically scanned imaging sonar mode of operation. Beams are emitted sequentially at given transducer head angles (fan-shaped silhouettes). For each beam, the echo intensity return profile is discretized in a sequence of bins (arcs at different ranges along the most recently emitted beam).

Accumulating this information along a complete 360-deg sector produces an acoustic image of the surroundings [see the polar representation of a 360-deg scan sector in Figure 4(a) and its corresponding Cartesian representation in Figure 4(c)]. The beam typically has a large vertical beamwidth (for our sensor, about 40 deg), which makes possible the detection of obstacles at different heights, although at the cost of inducing small imprecisions in the range measurements. On the other hand, a narrow horizontal beamwidth (about 3 deg) increments the resolution of the device and improves the sharpness of the acoustic images. It is worth noting that although the sensor can detect the presence of tridimensional objects, it is not able to determine their position in the vertical plane. Therefore, only a 2D representation of the environment is produced.

2.1. Motion-Induced Distortions

Commonly, MSISs have a slow scanning rate. A Tritech Miniking sonar head needs a minimum time of 6 s to complete a 360-deg scan. Depending on the range and resolution settings, the time can be much longer (e.g., during the abandoned marina experiment, the required time was 15 s). Therefore, during the acquisition of a scan, the vehicle position can change considerably. Not taking this motion into account may induce important distortions in the resulting acoustic data. Figure 4(c) represents this effect for a set of measurements obtained in the scenario shown in Figure 4(b). Note the misplacement of the walls on the left and top of the scan and how the rotation of the vehicle curves the one on the right. Extracting line features from distorted data is difficult and produces inaccuracies that disturb data association and, as a consequence, yields poor results. Therefore, the first step of the procedure consists of merging the raw sensor data with information regarding the vehicle motion. This information is provided by the SLAM algorithm (see Section 4), which runs simultaneously with the feature extraction algorithm. Incorporating the displacements and rotations of the sensor into the positional information of each sonar measurement leads to an undistorted acoustic image such as the one in Figure 4(d). As can be appreciated when comparing with Figure 4(b), this results in a better representation of the real scenario.

2.2. Beam Segmentation

Because objects present in the environment appear as high-echo-amplitude returns in acoustic images, only part of the information stored in each beam is useful for feature extraction [see Figure 5(a)]. Therefore, a segmentation process can be applied to obtain the most significant information. This process is carried out beam by beam and consists of two steps. First, only those bins with an intensity value over a threshold are selected and stored. This procedure separates the acoustic imprint left by an object in the image from the noisy background data [Figure 5(b)]. The resulting imprint is used to estimate the feature uncertainty, as explained in Section 3.2. The second step is to select, among the beam's thresholded data, those bins that are local maxima and satisfy a "minimum distance between them" criterion. This means that if two or more of these bins are too close within the beam, they must correspond to the detection of the same object and hence are redundant. Then, the ones with the lowest intensity value are discarded [see the result in Figure 5(c)]. The selected local high-intensity bins are the ones that most likely correspond to objects present in the scene. Thus, they are especially well suited as input to the feature extraction algorithm (Section 3) while, at the same time, the computational efficiency is improved because a small number of bins is involved.
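A minimal sketch of this two-step segmentation, written in Python; the threshold value, the minimum peak separation, and the function name are illustrative assumptions, not the authors' code:

import numpy as np

def segment_beam(intensities, threshold=30.0, min_separation=5):
    """Two-step segmentation of one sonar beam.

    intensities: 1-D array with one echo amplitude per bin.
    Returns (thresholded_idx, maxima_idx): the imprint used for
    uncertainty estimation and the high-intensity bins used for voting.
    """
    # Step 1: keep only the bins whose intensity exceeds the threshold.
    thresholded_idx = np.flatnonzero(intensities > threshold)

    # Step 2: among those bins, greedily keep the strongest ones while
    # enforcing a minimum distance; a candidate too close to an already
    # accepted bin is treated as a redundant echo of the same object.
    maxima_idx = []
    for i in sorted(thresholded_idx, key=lambda j: intensities[j], reverse=True):
        if all(abs(i - j) >= min_separation for j in maxima_idx):
            maxima_idx.append(i)
    return thresholded_idx, np.array(sorted(maxima_idx))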

2.3. Dealing with a Stream of Beams

To deal with the stream of measurements produced by the continuous arrival of beams, a data buffer is set to store the beams contained within the most recent 180-deg scan sector.


Figure 4. Effect of motion-induced distortion on acoustic images: (a) Raw data represented in polar coordinates. (b) Orthophotomap of the real environment where the sonar data were gathered. (c) The same data set represented in Cartesian coordinates. (d) Undistorted image obtained after taking into account the vehicle motion.

Whenever new beams corresponding to an unexplored zone arrive, old beams that fall outside the scan sector are discarded. The choice of a 180-deg sector is not arbitrary, because this is the maximum zone that a single line can cover within a sonar scan. Because calculations to search for features are performed with every new beam (Section 3), the buffer should contain the bins that are local maxima [Figure 5(c)] for the line detection process, the segmented beams [Figure 5(b)] for uncertainty estimation, and all their associated positions in the world coordinate system, to deal with the motion-induced distortions.
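The buffer management can be sketched as follows, assuming each beam arrives with an unwrapped (monotonically increasing) transducer angle and the vehicle pose estimated at emission time; the class and field names are our own:

from collections import deque
import math

class ScanBuffer:
    """Beams of the most recent 180-deg scan sector (illustrative sketch)."""

    def __init__(self):
        self.beams = deque()

    def add(self, angle, maxima_bins, segmented_bins, pose):
        """angle: unwrapped transducer angle (rad); pose: vehicle pose
        estimate at the moment the beam was emitted."""
        self.beams.append({"angle": angle, "maxima": maxima_bins,
                           "segmented": segmented_bins, "pose": pose})
        # Discard old beams once the stored sector exceeds 180 deg.
        while self.beams[-1]["angle"] - self.beams[0]["angle"] > math.pi:
            self.beams.popleft()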


Figure 5. Acoustic data segmentation: (a) Raw sensor data represented in polar coordinates. (b) Data after applying a threshold. (c) Selection of the local maxima bins of each beam.

3. LINE FEATURE EXTRACTION

The cross section of a sonar scan with walls and other planar structures results in line-shaped features in the acoustic images. The Hough transform (Illingworth & Kittler, 1988) is a feature extraction technique that is especially well suited for this kind of situation. This algorithm accumulates the information from the sensor data into a voting table that is a parameterized representation of all the possible feature locations. Those features that receive a great number of votes are the ones with a relevant set of compatible sensor measurements and thus the ones that most likely correspond to a real object in the environment. In our application, line features are described by two parameters, ρ^B and θ^B (distance and orientation with respect to a base frame B). Hence, the resulting Hough space (HS) is a 2D space where the voting process and the search for maxima can be done efficiently. The base reference frame B can be set arbitrarily. However, our choice for B is the current position of the sensor head at the moment the voting is performed. Because in this implementation the voting is triggered by the arrival of new beams from the sensor, the most recently stored position in the data buffer (the one corresponding to the last beam) defines the position of B. An advantage of choosing this base is that, when a line feature is detected after the voting, its parameters are already represented in the sensor coordinate frame and hence it can be integrated directly into the SLAM framework. It is worth noting that B is not a fixed coordinate frame. As the parameterization in the HS is performed in polar coordinates, setting the reference in a fixed position would produce resolution loss with the increment in range. To avoid this, we need to resituate B according to the vehicle's motion. Unfortunately, this requires recomputing the HS with each change in the position of B. Although it may seem a great deal of computation, in fact the number of bins involved in the voting is not large (fewer than 100 bins during the tests performed) and the calculations can be executed quite rapidly. Moreover, as will be explained in the next section, there are situations in which recalculating the HS can be avoided. Another key issue is the quantization of the HS. In our case, we have observed that selecting the quantization equal to the angular and linear resolutions of our sensor (typically, 1.8 deg and 0.1 m) works well.


A higher resolution does not necessarily increase the quality of the detection, because the sonar resolution limits its precision. On the other hand, a lower resolution would produce a rough observation.

The general execution of the feature extraction process consists of several steps. First, with each beam arrival, the HS is initialized using the current sensor position as the base frame B. Next, all the bins stored in the buffer are referenced to B so that they can be used to vote in the space. It is worth noting that the stored beam positions are taken into account when transforming to B. Hence, the data are undistorted. Then, the votes corresponding to each bin are assigned to the candidate lines by means of a sonar model. Finally, a search for winning candidates is performed.

3.1. Voting Method for Line Detection

Each bin represents the strength of the echo intensity return in a particular place within the insonified area. Owing to the uncertainty produced by the horizontal beamwidth, a measurement cannot be assigned to a single point in space. A common approach (Leonard & Durrant-Whyte, 1992; Tardos et al., 2002) is to consider the measurement as an arc whose aperture represents the beamwidth uncertainty. Moreover, as a high-intensity return is typically produced when the acoustic wave hits a surface perpendicularly, we can infer that all the surfaces tangent to the arc can explain the high-intensity return. Although this simple model is well suited for air sonar ranging systems, it is not able to explain the acoustic images gathered with a MSIS. A careful analysis of these images reveals that their object detection capability is not limited to the arc-tangent surfaces but that those beams intersecting the surface within the limits defined by a certain maximum incidence angle also produce a discernible return. On the other hand, those beams with a shallower angle are completely reflected and do not perceive the surface. To obtain a better description of this situation, an extended model of the imaging sonar has been adopted (Figure 6). Basically, given a horizontal beamwidth angle α (in our sensor, α = 3 deg) and a maximum incidence angle β (generally, more than 60 deg), the set of line features compatible with a particular bin is composed not only of the lines tangent to the arc defined by α but also of all the lines that intersect the arc with an incidence angle smaller than ±β. Before performing a voting, this set of lines must be determined for each bin stored in the data buffer. This process will now be described using as reference the illustration in Figure 6.

Figure 6. Model of the sonar sensor relating a bin and its compatible candidate lines. B is the base reference frame, and S is a reference frame attached to the sonar transducer head.


Let the reference frame S define the position of the transducer head at the moment a particular bin was obtained, with [x_S^B, y_S^B, θ_S^B] being the transformation that defines the position of S with respect to the chosen base reference B and ρ_S the range at which the bin was measured from the sensor. Both the transformation and the range values can be obtained from the information in the data buffer. To emulate the effect of the horizontal beamwidth, a set of i values are taken at a given resolution within an aperture of ±α/2 around the direction in which the transducer is oriented, also referred to as θ_S^B:

$$\theta_S^B - \alpha/2 \leq \theta_i^B \leq \theta_S^B + \alpha/2. \quad (1)$$

Each value of θ_i^B represents the bearing parameter for a line tangent to the arc that models the horizontal beamwidth. As stated earlier, not only are the lines tangent to the arc candidates, but also the ones inside the maximum incidence angle limits of ±β. For this reason, k values are taken at a given resolution for each value of θ_i^B and within an aperture of ±β:

$$\theta_i^B - \beta \leq \theta_{i,k}^B \leq \theta_i^B + \beta. \quad (2)$$

The result of this operation is i × k different values of θ_{i,k}^B. These are the bearings for a set of lines that are a representation of all the possible candidates compatible with the bin. The final step is to determine the range parameter ρ_{i,k}^B corresponding to each one of the θ_{i,k}^B bearings obtained. Given the geometry of the problem, they are calculated as

$$\rho_{i,k}^B = x_S^B \cos(\theta_{i,k}^B) + y_S^B \sin(\theta_{i,k}^B) + \rho_S \cos(\theta_{i,k}), \quad (3)$$

where θ_{i,k} is the incidence angle of the candidate line measured with respect to the beam direction.

This set of lines can now be used to determine the cells in the voting space that should receive a single vote from this particular bin. It is assumed that the resolutions chosen during the generation of the i × k lines are sufficient to ensure a correct exploration of the grid cells and hence that the zone in the discretized space corresponding to the compatible candidates is correctly determined. This process is repeated for all the bins stored in the data buffer. Figure 7 shows how the set of voters looks when assigned to the HS. Note that each selected cell of the HS can receive only one vote from any particular bin and that those cells containing multiple votes therefore represent lines compatible with different individual bins.
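The vote generation for a single bin can be sketched as below, under our reading of Eq. (3) in which the last cosine term takes the incidence offset within ±β; the angular step reuses the sensor quantization discussed above, and the function name is ours:

import numpy as np

def candidate_lines(x_s, y_s, theta_s, rho_s,
                    alpha=np.deg2rad(3.0), beta=np.deg2rad(60.0),
                    step=np.deg2rad(1.8)):
    """All (rho_B, theta_B) line candidates compatible with one bin.

    (x_s, y_s, theta_s): transducer pose in the base frame B.
    rho_s: range at which the bin was measured along the beam.
    """
    lines = []
    # Sweep the horizontal beamwidth, Eq. (1) ...
    for theta_i in np.arange(theta_s - alpha / 2, theta_s + alpha / 2 + 1e-9, step):
        # ... and, around each tangent direction, the incidence fan, Eq. (2).
        for dk in np.arange(-beta, beta + 1e-9, step):
            theta_ik = theta_i + dk
            rho_ik = (x_s * np.cos(theta_ik) + y_s * np.sin(theta_ik)
                      + rho_s * np.cos(dk))            # Eq. (3)
            lines.append((rho_ik, theta_ik))
    # In the full algorithm each HS cell receives at most one vote per bin,
    # so duplicates landing in the same cell would be counted once.
    return np.asarray(lines)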

Coinciding with the arrival of a new beam, a new voting space is generated to look for winning line candidates. A winning line must be detected only once it has been completely observed (i.e., further beams cannot provide more votes to the candidate). In the voting space, the zone in which the winning lines can exist is completely determined by the subset of all the candidate lines contained in the most recent 180-deg scan sector that do not intersect with the last beam (shaded zones in Figure 7). Any line candidate found within this zone with a number of votes above a given threshold is declared a winner. This threshold is chosen as a compromise between avoiding false detections and maintaining the capacity of sensing short walls. (A count of 18 votes has shown good results in different tests.) Performing the detection in this way ensures that the algorithm detects the lines as soon as they are completely visible. After a line detection, all the bins involved in the election of the selected candidate are removed from the buffer so that they do not interfere with the detection of further features.

It is worth mentioning that, in order to reduce the computational cost of the process, some votings can be skipped. After each voting, it is possible to determine the cell with the largest number of votes and therefore to calculate the number of supplementary votes required to produce a winner. Because additional votes can be obtained only from newly measured bins, it is not necessary to perform more votings before the minimum required number of bins has been measured and introduced in the buffer.

3.2. Uncertainty Estimation

The process to estimate a feature's uncertainty is based on relating the probability of an object existing in a particular place with the measured intensities in the acoustic image representing the same location. There is a high probability that there will be an object in a zone where large intensity values have been measured (e.g., the light shapes in Figure 4), whereas the probability in the zones with lower intensity measurements gradually decreases to zero (the dark zones in the figure). Given this, the process of applying a threshold to segment the acoustic data can be considered analogous to defining a particular confidence interval for a probability distribution. In other words, a line feature will fall inside the thresholded zone in the acoustic image with a particular confidence level. To make the problem tractable, the probability distribution of a line feature represented in the acoustic image will be approximated by a bivariate Gaussian distribution on its ρ and θ parameters.


Figure 7. Sequence representing the voting process. The scan sector stored in the buffer is represented together with its corresponding voting space. The line with triangular shapes marks the position of the most recent beam. The shaded zone represents where the candidates have received all the possible votes. (a) Part of the target line is still outside the scan sector and can receive more votes in the future. (b) The line can now be detected because it has been fully observed and more votes cannot be added. (c) Those votes corresponding to the detected line, as well as the old ones that fall outside the 180-deg scan sector, are removed from the HS so they cannot interfere with future line detections.

(An example justifying that this approximation is suitable can be found in Section 3.3.) Therefore, the process to estimate the feature uncertainty consists of determining the Gaussian that best fits the segmented data representing a probability distribution for a given confidence level. A simple description of this process is shown in Algorithm 1. After the detection of a line feature with the voting algorithm, the uncertainty estimation process begins with the assignment of a feasible confidence coefficient to the imprint left after the segmentation. [For instance, it is realistic to assume that the segmented data in Figure 5(b) will contain the real feature in 95% of the cases.] Because the winning candidate line has received a considerable number of votes, it must be one of the lines contained within the confidence interval defined by the segmented imprint. The next step of the process consists of finding a number of compatible lines belonging to the neighborhood of the winning candidate that overlap the segmented data in the same way. The objective of this is to obtain a set of line realizations representative of the population contained within the defined confidence interval (i.e., a set of lines that "fill" the segmented area).


Algorithm 1: get_measurements([ρc, θc], scan, confidence_level)

/* Initialization of the polar grid space that will contain the segmented sonar data */
boolean last180scan[ρ_resolution, θ_resolution];
[last180scan] = init_scan(scan);

/* Set the paradigm with the candidate line from the voting */
[ηc] = get_overlap_ratio([ρc, θc], last180scan);

/* Search for compatible lines */
lines2check = {[ρc, θc]}; accepted = ∅; rejected = ∅;
while lines2check ≠ ∅ do
    [ρi, θi] = get_candidate(lines2check);
    [ηi] = get_overlap_ratio([ρi, θi], last180scan);
    if accept_line(ηc, ηi) then
        accepted = accepted ∪ {[ρi, θi]};
        lines2check = lines2check \ {[ρi, θi]};
        lines2check = lines2check ∪ (neighbour8connectivity([ρi, θi]) \ (rejected ∪ accepted));
    else
        rejected = rejected ∪ {[ρi, θi]};
        lines2check = lines2check \ {[ρi, θi]};
    end if
end while

/* Given the set of lines, determine the ellipse that contains the area where they exist */
[major_axis, minor_axis, ρ_mean, θ_mean, α] = get_ellipse(accepted);

/* Given the ellipse and the confidence level related to the segmentation, find the mean and covariance */
z_V = [ρ_mean, θ_mean];
R = get_covariance(major_axis, minor_axis, ρ_mean, θ_mean, confidence_level);
return [z_V, R];

Estimating the Gaussian distribution from a set of lines is not straightforward; however, it is worth noting that lines described by their ρ and θ parameters can also be represented as points in a polar ρ–θ space. Representing the set of lines in such a space will result in a cloud of points (the lines are similar) with an elliptic form. This particular elliptic disposition of the ρ–θ points suggests that the approximation of the line feature by a Gaussian distribution is correct. Although the space has changed, the set still represents a population of lines within the previously defined confidence interval. This fact is used to estimate the uncertainty of the line feature. This is achieved by approximating the area occupied by the set of points to the area enclosed in the ellipse that a bivariate Gaussian distribution would generate at the same given confidence. By knowing the confidence coefficient, the major and minor axes of the ellipse, and their orientation, it is possible to recover the covariance matrix. Moreover, the mean value of a ρ–θ pair defining the line feature can also be obtained from the center of the ellipse. Figure 8 illustrates the different steps involved in the process of estimating the feature uncertainty. The image in Figure 8(a) reproduces a voting space that has just obtained a winning candidate (marked with the small box). The corresponding sonar measurements appear in Figure 8(b) and are represented in the same B-based discrete polar space as the HS. Because the data are represented in polar coordinates, the line feature appears as an arch whose thickness is related to its uncertainty. Note that the ρ–θ pair representing the winning candidate line in the HS can also be represented in this space. In fact, to parameterize the line, we use its point with the smallest distance to the origin (again, represented with the same small box in the figure). Applying a threshold and assigning a confidence coefficient to the segmented data results in the space represented in Figure 8(c). At this point, and using the winning candidate line as a paradigm, the search for lines contained within the segmented imprint is performed. The resulting set of lines is contained inside the bounds represented as black arcs, and the place occupied by their ρ–θ pairs is represented as a black shape at the apex of the arc. The final step of the procedure consists of finding the ellipse containing this area and extracting the covariance matrix given the predefined confidence coefficient. Finally, Figure 8(d) represents the estimated feature over a Cartesian representation of the scan sector. The line in the center corresponds to the ρ–θ mean value, and the lines at the sides represent the uncertainty bounds at 95% confidence.
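Recovering the covariance from the fitted ellipse can be sketched as follows, assuming the axes and orientation come from a routine like get_ellipse in Algorithm 1; the chi-square quantile for two degrees of freedom supplies the scale between the ellipse and the covariance:

import numpy as np
from scipy.stats import chi2

def covariance_from_ellipse(a, b, phi, confidence=0.95):
    """2x2 covariance of a line feature from its confidence ellipse.

    a, b: semi-major and semi-minor axes of the ellipse enclosing the
          accepted (rho, theta) pairs; phi: orientation of the major axis.
    A bivariate Gaussian's ellipse at this confidence has semi-axes
    sqrt(k * lambda_i) with k the chi-square quantile, so
    lambda_i = axis^2 / k.
    """
    k = chi2.ppf(confidence, df=2)
    lam = np.array([a ** 2 / k, b ** 2 / k])   # eigenvalues of the covariance
    c, s = np.cos(phi), np.sin(phi)
    rot = np.array([[c, -s], [s, c]])          # eigenvector basis
    return rot @ np.diag(lam) @ rot.T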

3.3. Validation of the Feature Extraction Algorithm

To validate the feature extraction algorithm, several tests with both synthetic and real data were carried out. By generating synthetic data, we seek two objectives. The first one is to justify the use of a bivariate ρ–θ Gaussian distribution to represent the uncertain features present in the acoustic images. The second is to have a way of comparing the output from the algorithm with the paradigm, which makes it possible to confirm the correctness of the estimation.


Figure 8. Process for uncertainty estimation. (a) Winning candidate in the HS. (b) Polar representation of the sonar data. (c) Segmented data with the zone occupied by line features inside the confidence level. (d) Estimated feature represented over the scan sector.

To obtain the synthetic data, a large population of ρ–θ pairs was generated following a given probability distribution. Then, the lines represented by each pair were projected into a polar space analogous to those produced by the measurements from a MSIS. Each cell from this space represents a bin, and its echo intensity value is assigned according to the number of lines that cross its area. The resulting synthetic data set is represented in polar and Cartesian coordinates in Figures 9(a) and 9(d). In spite of the large uncertainty assigned with the goal of improving the visualization of the uncertainty estimation process, the example has sufficient points in common with the real acoustic images to consider this model as valid. It can be observed how the high-intensity zone in the center corresponds with the major concentration of lines, whereas the dispersion on the sides, caused by the angular uncertainty, produces an effect similar to the loss of intensity and precision affecting the beams with large incidence angles. Figures 9(b) and 9(c) illustrate the voting and the uncertainty estimation process. The elliptic-shaped zone representing the population of compatible lines reflects the Gaussianity of the estimated feature. As can be observed in Figure 9(d), the estimated line feature is a good representation of what appears in the synthetic data. Additional verification of the method can be seen in Figure 10, where the cloud of ρ–θ pairs initially used to generate the synthetic data is plotted together with an ellipse representing the original Gaussian distribution (dashed line) and another one representing the one estimated with the proposed method (solid line). When comparing the two ellipses, it can be appreciated that they are almost coincident except for a small angular misalignment. It is important to note that correlated data, such as those in this example, have turned out to be the most difficult scenario for the proposed uncertainty estimation method, and therefore one could expect even better estimates when working with less-correlated data.

A second set of tests was carried out with real data acquired with the Ictineu AUV. Under real working conditions, it is not possible to obtain reliable references to test the performance of the method. Therefore, only the direct visualization of the estimated line feature represented over the acoustic images can be used as an indicator. The first example in Figure 11(a) shows the features extracted from a data set obtained in a real application scenario; in particular, in the same marina environment that served as the test bed for the SLAM algorithm. The second example is represented in Figure 11(b). It corresponds to an experiment performed in the water tank of the Underwater Robotics Research Center at the University of Girona. This confined environment with highly reflective concrete walls produces noisy data with many reflections and phantoms. The results in both cases are consistent with the representation of the walls in the acoustic images and, moreover, show a reliable behavior when working with noisy data, filtering linear features from shapeless phantoms.


Figure 9. Testing the algorithm with synthetic data. (a) Raw sensor data generated from ρ and θ given a normally distributed uncertainty. Some correlation affects the two variables to increase the difficulty of the test. (b) The voting space clearly identifies the line. (c) Uncertainty estimation using the segmented data. The black elliptic shape corresponds to the lines with compatible overlapping and represents the uncertainty of ρ and θ. (d) The estimated line feature fits almost perfectly with the synthetic one.

4. THE SLAM ALGORITHM

An EKF integrates the vehicle's navigation sensors to provide an estimate of its position and retains the estimates of the previously observed features in order to build a map. We favor the use of the EKF over the potentially more efficient information filter (Thrun et al., 2004) because the availability of covariances without any additional computations or approximations is very important for data association. This filter is an implementation of the stochastic map (Smith, Self, & Cheeseman, 1990) in which the estimates of the position of both the vehicle x_V and the set of map features

Figure 10. Comparison between the bivariate Gaussian distribution used to produce the synthetic data and the output from the algorithm. The ellipse with the dashed line represents the Gaussian distribution at 95% confidence, and the solid one is the output of the algorithm at the same confidence level.


Figure 11. Testing the algorithm with real data. (a) Line features extracted from acoustic data gathered in a marina environment. (b) Line features obtained from acoustic data gathered in a small water tank. The lines on the right-hand side are not estimated, as they are split between the start and the end of the scan.

F = {x_1 ... x_n} are stored in the state vector x:

$$x(k) = [x_V(k), x_1(k) \ldots x_n(k)]^T. \quad (4)$$

The covariance matrix P describes the covariance of the vehicle and the features as well as their respective cross correlations:

$$P(k) = E\left\{[x(k) - \hat{x}(k)][x(k) - \hat{x}(k)]^T \mid Z(k)\right\}. \quad (5)$$

The vehicle's state x_V has dimension 9, which defines the minimum size of the state vector x at the beginning of the execution. The features are represented in polar coordinates, and therefore the state will be increased by 2 with each new incorporation in the map.

4.1. Map Initialization

When creating a new stochastic map at step 0, a base local reference frame L must be selected (Figure 12). In this approach, the initial vehicle position is chosen to set this base location and thus is initialized with perfect knowledge. The vehicle's state x_V is represented as

$$x_V = [x\; y\; z\; \psi\; u\; v\; w\; r\; \psi_{L0}]^T, \quad (6)$$

where, as defined in Fossen (2002), [x y z ψ] represent the position and heading of the vehicle in the local reference frame L and [u v w r] are their corresponding linear and angular velocities in the vehicle's coordinate frame V. The term ψ_{L0} represents the angle between the initial vehicle heading at step 0

Figure 12. Representation of the different reference coordinate frames.


(orientation of L) and magnetic north in the earth global frame E. This term works as a sensor bias and allows us to initialize the vehicle heading ψ in the local frame L, making it possible to use compass measurements (angle to the north in the E frame) for its estimation as shown in Section 4.4. Assuming that the vehicle is not moving at step 0, the state is initialized as

$$x(0) = x_V(0) = [0\; 0\; 0\; 0\; 0\; 0\; 0\; 0\; \psi_{L0}]^T; \qquad P(0) = P_V(0) = \begin{bmatrix} 0_{8\times8} & 0_{8\times1} \\ 0_{1\times8} & \sigma^2_{\psi_{L0}} \end{bmatrix}, \quad (7)$$

where ψ_{L0} takes its value from the first available compass measurement and σ^2_{ψ_{L0}} is initialized according to the sensor's precision. It is worth noting that at the beginning of the execution the map does not contain any feature, and hence the state x contains only the vehicle's state x_V.

4.2. Prediction

A simple four-degree-of-freedom, constant-velocity kinematics model is used to predict the state of the vehicle. Because AUVs are commonly operated describing rectilinear transects at constant speed during survey missions, we believe that such a model is a simple but realistic way to describe the motion:

$$x_V(k) = f[x_V(k-1), s_V], \quad (8)$$

$$\begin{bmatrix} x \\ y \\ z \\ \psi \\ u \\ v \\ w \\ r \\ \psi_{L0} \end{bmatrix}_{(k)} = \begin{bmatrix} x + \left(uT + s_u \frac{T^2}{2}\right)\cos(\psi) - \left(vT + s_v \frac{T^2}{2}\right)\sin(\psi) \\ y + \left(uT + s_u \frac{T^2}{2}\right)\sin(\psi) + \left(vT + s_v \frac{T^2}{2}\right)\cos(\psi) \\ z + wT + s_w \frac{T^2}{2} \\ \psi + rT + s_r \frac{T^2}{2} \\ u + s_u T \\ v + s_v T \\ w + s_w T \\ r + s_r T \\ \psi_{L0} \end{bmatrix}_{(k-1)}, \quad (9)$$

where s_V = [s_u s_v s_w s_r] represents an acceleration white noise, additive in the velocity, with zero mean and covariance Q_V, which is propagated through integration to the position. On the other hand, as features correspond to fixed objects from the environment, we can assume that they are stationary. Hence, the whole state can be predicted as

$$x(k|k-1) = \{f[x_V(k-1)]\;\; x_1(k-1) \ldots x_n(k-1)\}^T \quad (10)$$

and its covariance matrix updated as

$$P(k|k-1) = \begin{bmatrix} F_V(k) & 0 \\ 0 & I \end{bmatrix} P(k-1) \begin{bmatrix} F_V(k) & 0 \\ 0 & I \end{bmatrix}^T + \begin{bmatrix} G_V(k) \\ 0 \end{bmatrix} Q_V \begin{bmatrix} G_V(k) \\ 0 \end{bmatrix}^T, \quad (11)$$

where F_V and G_V are the Jacobian matrices of partial derivatives of the nonlinear model function f with respect to the state x_V and the noise s_V, respectively.
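A sketch of this prediction step: the mean follows Eq. (9) with the zero-mean noise dropped, and the covariance follows Eq. (11); the state layout is that of Eq. (6), and the function itself is our illustration:

import numpy as np

def predict(x, P, Q, T):
    """EKF prediction, Eqs. (8)-(11); state = [x y z psi u v w r psiL0, ...]."""
    x = x.copy()
    psi, u, v, w, r = x[3], x[4], x[5], x[6], x[7]
    c, s = np.cos(psi), np.sin(psi)
    # Mean prediction, Eq. (9), with the noise terms set to zero.
    x[0] += (u * c - v * s) * T
    x[1] += (u * s + v * c) * T
    x[2] += w * T
    x[3] += r * T
    # F_V: Jacobian of f w.r.t. the 9-dim vehicle state.
    FV = np.eye(9)
    FV[0, 3] = (-u * s - v * c) * T
    FV[0, 4], FV[0, 5] = c * T, -s * T
    FV[1, 3] = (u * c - v * s) * T
    FV[1, 4], FV[1, 5] = s * T, c * T
    FV[2, 6] = FV[3, 7] = T
    # G_V: Jacobian of f w.r.t. the acceleration noise [su sv sw sr].
    GV = np.zeros((9, 4))
    h = T ** 2 / 2
    GV[0, 0], GV[0, 1] = c * h, -s * h
    GV[1, 0], GV[1, 1] = s * h, c * h
    GV[2, 2] = GV[3, 3] = h
    GV[4, 0] = GV[5, 1] = GV[6, 2] = GV[7, 3] = T
    # Features are stationary, Eq. (10): only the vehicle block changes.
    n = x.size
    F = np.eye(n); F[:9, :9] = FV
    G = np.zeros((n, 4)); G[:9, :] = GV
    P = F @ P @ F.T + G @ Q @ G.T              # Eq. (11)
    return x, P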

4.3. DVL Update

A SonTek Argonaut DVL unit provides bottom-tracking and water velocity measurements at a frequency of 1.5 Hz. The unit also includes a pressure sensor allowing depth estimation. The model prediction is updated by the standard KF equations with each new DVL measurement:

$$z_D = [u_b\; v_b\; w_b\; u_w\; v_w\; w_w\; z_{depth}]^T, \quad (12)$$

where index b stands for bottom-tracking velocity and w for through-water velocity. The measurement model is

$$z_D(k) = H_D(k)x(k) + s_D, \quad (13)$$

$$H_D(k) = \begin{bmatrix} 0_{3\times3} & 0_{3\times1} & I_{3\times3} & 0_{3\times2} & 0_{3\times2n} \\ 0_{3\times3} & 0_{3\times1} & I_{3\times3} & 0_{3\times2} & 0_{3\times2n} \\ 0\;0\;1 & 0 & 0_{1\times3} & 0_{1\times2} & 0_{1\times2n} \end{bmatrix}, \quad (14)$$

where s_D (measurement noise) is a zero-mean white noise with covariance R_D. In some situations, for instance, while operating close to the bottom or in confined spaces, the DVL is unable to produce correct velocity measurements. The bottom-tracking measurements are more prone to fail than those with respect to the water. For this reason, and under the condition that no water currents are present, the velocities with respect to the water are introduced in the estimation process. The DVL automatically determines the quality of the received signals and provides a status measurement for the velocities. Then,


different versions of the H_D matrix are used to fuse one (removing the second block row of Eq. (14)), the other (removing the first block row), or both (using the full matrix) readings. In the particular case of the abandoned marina experiment, only 3% of the bottom-tracking measurements received a bad status indicator. Therefore, it was not necessary to rely on water velocities during the execution.
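A sketch of the DVL update with this status-driven row selection (the matrix layout follows Eq. (14); flag and function names are ours):

import numpy as np

def dvl_update(x, P, z, R, use_bottom=True, use_water=True):
    """Linear KF update for z = [ub vb wb uw vw ww z_depth], Eq. (12)."""
    n = x.size
    H = np.zeros((7, n))
    H[0:3, 4:7] = np.eye(3)      # bottom-tracking rows observe [u v w]
    H[3:6, 4:7] = np.eye(3)      # water-tracking rows observe [u v w]
    H[6, 2] = 1.0                # pressure row observes depth z
    rows = ([0, 1, 2] if use_bottom else []) + \
           ([3, 4, 5] if use_water else []) + [6]
    Hs, zs, Rs = H[rows], z[rows], R[np.ix_(rows, rows)]
    # Standard KF update with the selected rows only.
    S = Hs @ P @ Hs.T + Rs
    K = P @ Hs.T @ np.linalg.inv(S)
    x = x + K @ (zs - Hs @ x)
    P = (np.eye(n) - K @ Hs) @ P
    return x, P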

4.4. Compass Update

A compass provides measurements at a frequency of 10 Hz. This allows updating the attitude estimate from the model prediction with the standard KF equations. As can be observed in Figure 12, the compass measurement z_C corresponds to the addition of the heading of the vehicle ψ with respect to the local reference frame L and the orientation of this frame ψ_{L0}. The resulting measurement model is

$$z_C(k) = H_C(k)x(k) + s_C, \quad (15)$$

$$H_C(k) = [0\; 0\; 0\; 1\; 0\; 0\; 0\; 0\; 1\; 0_{1\times2n}], \quad (16)$$

where s_C (measurement noise) is an additive zero-mean white noise with covariance R_C. Working with compass data can be a difficult task in some situations. The effect of electromagnetic fields, such as those produced by the thrusters, and the presence of large structures with ferromagnetic materials can considerably distort measurements and render them unusable. For this reason, it is important to avoid operating close to walls (generally, 1–2 m is sufficient) and to perform a calibration before each mission. On the other hand, a compass is an especially useful sensor for SLAM because it provides absolute orientation measurements, unlike the dead-reckoning sensors normally used in SLAM such as wheel encoders, gyros, or, in our case, the DVL. The effect of using a compass is threefold:

1. The error in vehicle orientation will not increase during the SLAM process.

2. Vehicle orientation introduces nonlinearity in the SLAM problem, so the loss of precision because of linearization effects will also be limited.

3. Vehicle orientation errors in a certain step become position errors in future steps. Bounding the errors in orientation will also result in a reduction in the rate of increase of vehicle position errors.

Figure 13 shows the evolution of a vehicle's position and orientation using the DVL velocity data together with the rate-of-turn measurements from gyros (solid line) and using absolute attitude information from the compass (dashed line). We can see that the error in orientation remains constant. There is also a reduction in the rate of increase of the error in the direction transverse to the vehicle's direction of motion.
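In code, the compass update of Eqs. (15) and (16) reduces to a scalar KF update; one practical detail not spelled out in the text is wrapping the innovation to (−π, π], which we add here (a sketch; function names are ours):

import numpy as np

def compass_update(x, P, z_c, r_c):
    """KF update for one compass reading; prediction is psi + psiL0."""
    n = x.size
    H = np.zeros((1, n))
    H[0, 3] = H[0, 8] = 1.0                  # Eq. (16)
    nu = z_c - (x[3] + x[8])
    nu = np.arctan2(np.sin(nu), np.cos(nu))  # wrapped innovation (our addition)
    S = (H @ P @ H.T)[0, 0] + r_c
    K = (P @ H.T).ravel() / S
    x = x + K * nu
    P = P - np.outer(K, H @ P)               # (I - K H) P
    return x, P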

4.5. Imaging Sonar Beam Arrival

The Tritech Miniking imaging sonar produces beams at a 10–30-Hz rate, depending on the settings of the sensor. Each new beam is stored together with the current vehicle position estimate in a data buffer and fed to the feature extraction algorithm as shown in Sections 2 and 3. Eventually, the information added by a new beam arrival is sufficient to produce a line feature detection. In this case, the ρ–θ pair obtained is represented in a B frame that is placed in the sonar head. For the sake of simplicity, let us assume that the transformation between B and the vehicle's coordinate system is known. Hence, we can represent a new measurement i with respect to the vehicle's frame V as z_i^V = [ρ_i^V θ_i^V]^T. Of course, the same transformation should be applied to the covariance matrix obtained from the uncertainty estimation method. This transformation will result in the covariance matrix R_i. The next step is to solve the data association problem; that is, to determine whether the measured line z_i^V corresponds to any of the features F_j, j = 1 ... n, already existing in the map and should be used to update the system or, on the contrary, is new and has to be incorporated into the map. The result of the data association process is a hypothesis H = j_i associating the measurement z_i^V with one of the map features F_j (j_i = 0 indicates that z_i^V has no correspondence with the existing features). Finding the correct hypothesis is a process involving the analysis of the discrepancy between the actual line measurement and its prediction. This prediction is obtained from the nonlinear measurement function h_j, which relates the measurement i with the state vector x(k) containing the locations of the vehicle and the feature j:

$$z_i^V(k) = h_j[x(k)] + s_i, \quad (17)$$

$$\begin{bmatrix} \rho_i^V \\ \theta_i^V \end{bmatrix} = \begin{bmatrix} \rho_j - x\cos\theta_j - y\sin\theta_j \\ \theta_j - \psi \end{bmatrix} + s_i, \quad (18)$$



Figure 13. Estimated position covariance plots (2σ bounds) for X (m), Y (m), and heading (rad) over time (s). The data correspond to the first minutes of the abandoned marina experiment executing the EKF with the updates from dead-reckoning sensors. The results using a cheap, inaccurate gyro sensor are represented with a solid line. The ones using absolute data from a compass are indicated with a dashed line.

where s_i, the noise affecting the line feature observation, is a zero-mean white noise with covariance R_i. To calculate the discrepancy between the measurement and its prediction, the innovation term ν_ij and its associated covariance matrix S_ij are obtained as

$$\nu_{ij}(k) = z_i^V(k) - h_j[x(k|k-1)], \quad (19)$$

$$S_{ij}(k) = H_j(k)P(k|k-1)H_j(k)^T + R_i, \quad (20)$$

where H_j represents the Jacobian matrix that linearizes the nonlinear measurement function h_j around the best available estimation of the state x(k|k−1). To determine whether the correspondence is valid, an individual compatibility (IC) test using the Mahalanobis distance is carried out:

$$D_{ij}^2 = \nu_{ij}(k)^T S_{ij}(k)^{-1} \nu_{ij}(k) < \chi^2_{d,\alpha}, \quad (21)$$

where d = dim(h_j) and α is the desired confidence level. It is possible for multiple hypotheses relating the measurement with different map features to satisfy the IC test. Then, in order to select the best candidate, the nearest neighbor criterion is applied. Finally, after the correspondence has been decided, it is used to update the state estimate by means of the EKF update equations:

$$K_{ij}(k) = P(k|k-1)H_j(k)^T S_{ij}(k)^{-1}, \quad (22)$$

$$x(k) = x(k|k-1) + K_{ij}(k)\nu_{ij}(k), \quad (23)$$

$$P(k) = [I - K_{ij}(k)H_j(k)]P(k|k-1). \quad (24)$$
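A sketch of the gated nearest-neighbor association followed by the update of Eqs. (22)-(24); here h and H_jac evaluate the measurement model of Eqs. (17)-(18) and its Jacobian for a given feature index, and their signatures are our assumption:

import numpy as np
from scipy.stats import chi2

def associate_and_update(x, P, z, R, n_features, h, H_jac, alpha=0.95):
    """IC test, Eq. (21), nearest-neighbor choice, EKF update, Eqs. (22)-(24).
    Returns (x, P, j) with j = None when no feature is compatible (H = 0)."""
    gate = chi2.ppf(alpha, df=2)              # chi-square gate, d = 2
    best, best_d2 = None, np.inf
    for j in range(n_features):
        nu = z - h(x, j)                      # innovation, Eq. (19)
        nu[1] = np.arctan2(np.sin(nu[1]), np.cos(nu[1]))   # wrap theta
        Hj = H_jac(x, j)
        S = Hj @ P @ Hj.T + R                 # Eq. (20)
        d2 = nu @ np.linalg.solve(S, nu)      # Mahalanobis distance, Eq. (21)
        if d2 < gate and d2 < best_d2:        # gated nearest neighbor
            best, best_d2 = j, d2
            best_nu, best_H, best_S = nu, Hj, S
    if best is None:
        return x, P, None                     # candidate for a new feature
    K = P @ best_H.T @ np.linalg.inv(best_S)  # Eq. (22)
    x = x + K @ best_nu                       # Eq. (23)
    P = (np.eye(x.size) - K @ best_H) @ P     # Eq. (24)
    return x, P, best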

In case there is no valid hypothesis relating the measured line with any of the features from the map (i.e., H = 0), this measurement can be added to the current state vector as a new feature. However, this cannot be done directly, because this new feature needs to be represented in the map reference frame. The change of reference is done by compounding (Smith et al., 1990; Tardos et al., 2002) the line feature with the current vehicle position as follows:

$$x(k) = \begin{bmatrix} x_V(k) \\ x_1(k) \\ \vdots \\ x_n(k) \end{bmatrix} \Rightarrow x(k)^+ = \begin{bmatrix} x_V(k) \\ x_1(k) \\ \vdots \\ x_n(k) \\ x_V(k) \oplus z_i^V(k) \end{bmatrix}. \quad (25)$$


Augmenting the state vector also requires updating the estimated error covariance matrix:

$$P(k) = F(k)P(k)F(k)^T + G(k)R_iG(k)^T, \quad (26)$$

$$F(k) = \begin{bmatrix} I & 0 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & I \\ J_{1\oplus} & 0 & \ldots & 0 \end{bmatrix}, \qquad G(k) = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ J_{2\oplus} \end{bmatrix}, \quad (27)$$

where J_{1⊕} and J_{2⊕} are the Jacobian matrices of the compounding transformation.
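A sketch of the augmentation of Eqs. (25)-(27): the compounding x_V ⊕ z_i^V below inverts the measurement model of Eq. (18) for a polar line, and the Jacobians J_{1⊕} and J_{2⊕} are written out for this parameterization (our derivation, to be checked against the original references):

import numpy as np

def add_feature(x, P, z, R):
    """Augment the state with a new line feature observed as z = [rho_v, th_v]."""
    px, py, psi = x[0], x[1], x[3]
    rho_v, th_v = z
    th = th_v + psi                                # line bearing in map frame
    rho = rho_v + px * np.cos(th) + py * np.sin(th)
    drho_dth = -px * np.sin(th) + py * np.cos(th)  # chain rule through th
    n = x.size
    # J1: Jacobian of the compounding w.r.t. the full state (nonzero only
    # in the vehicle pose entries x, y, psi).
    J1 = np.zeros((2, n))
    J1[0, 0], J1[0, 1], J1[0, 3] = np.cos(th), np.sin(th), drho_dth
    J1[1, 3] = 1.0
    # J2: Jacobian w.r.t. the measurement [rho_v, th_v].
    J2 = np.array([[1.0, drho_dth],
                   [0.0, 1.0]])
    x_new = np.concatenate([x, [rho, th]])
    P_new = np.zeros((n + 2, n + 2))
    P_new[:n, :n] = P
    P_new[n:, :n] = J1 @ P                         # cross covariance
    P_new[:n, n:] = (J1 @ P).T
    P_new[n:, n:] = J1 @ P @ J1.T + J2 @ R @ J2.T  # new feature covariance
    return x_new, P_new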

5. SLAM WITH LOCAL MAPS

The main advantages of building sequences of local maps are the limitation of the cost associated with the update of a full covariance matrix and the improvement of the system's consistency. In the present case, an additional advantage is obtained from using local maps. The parameterization of line features using polar coordinates is the most adequate approach for our type of (polar) sensor. However, it is not the best choice for referencing the features in a large map. Some issues appear when an observation of a new feature is translated from the sensor frame to the map base frame, particularly in those situations in which the map base and the sensor base are far from each other, because a small variation in the θ parameter of a feature with a large ρ value translates into large changes in Cartesian coordinates. Using local maps overcomes this issue, as their area is smaller and the reference changes are less critical.

An important restriction of most local mapping methods is that the local maps must be statistically independent (no information can be shared between them) to avoid introducing inconsistency when recovering the global map. As a consequence, vehicle states such as velocities or estimated sensor biases cannot be transferred between maps. Recently, Pinies and Tardos (2007) presented a technique that overcomes this limitation and makes sharing information between local maps possible, while remaining conditionally independent. This is especially useful in our case because it allows us to keep information about the vehicle's state. This method was chosen to implement the local map sequencing in the present work. Although this section summarizes the main characteristics of our particular implementation of the algorithm, a more detailed presentation of the method can be found in the bibliographic reference mentioned.

5.1. Local Map Building

The local map-building process relies on defining a set of state variables that are common to two consecutive maps. This commonality serves as a link to transmit the information from one map to the other while maintaining their conditional independence. In the application at hand, this link makes it possible to use the estimates of the vehicle's velocities and the compass bias obtained at the end of a map to initialize the next local map. Moreover, after new measurements modify the estimate of these terms, it is also possible to update their estimated values in the previous map through back-propagation.

The procedure to build the local maps begins by initializing the filter presented in Section 4. Then, the vehicle moves through the scenario, acquiring sensor information regarding its own state and the position of existing features. After a certain time period, the state vector x will contain the current estimate of the state of the vehicle x_V as well as the position of several map features F = {x_1 ... x_n}. At a given instant k, the current local map is finished and a new one is initialized by defining a new state x containing only the current vehicle state x_V as follows:

$$x(k) = [x_V(k)\;\; Tx_V(k)]^T, \quad (28)$$

where the first term is a clone of the vehicle's state that will serve as a link between the two local maps and the second term represents the initialization of the vehicle's state in the new map after performing a change of the base reference defined by the linear transformation function T:

$$Tx_V(k) = [0\; 0\; 0\; 0\; u\; v\; w\; r\; \psi + \psi_{L0}]^T, \quad (29)$$

$$T = \begin{bmatrix} 0_{4\times4} & 0_{4\times4} & 0_{4\times1} \\ 0_{4\times4} & I_{4\times4} & 0_{4\times1} \\ 0\;0\;0\;1 & 0_{1\times4} & 1 \end{bmatrix}. \quad (30)$$

This transformation sets the current vehicle location as the base reference of the new local map, while its velocity estimates (represented in the vehicle's frame) are preserved. It is important to note that the compass bias term is also updated to make integrating compass measurements with respect to the new base possible. The resulting state vector has a dimension of 18. To complete the initialization process, the state covariance matrix P has to be set


accordingly:

$$P(k) = \begin{bmatrix} P_V(k) & P_V(k)T^T \\ TP_V(k) & TP_V(k)T^T \end{bmatrix}, \quad (31)$$

where P_V is the submatrix corresponding to the vehicle's state in the full covariance matrix of the first map. At this point, the filter is ready to begin the estimation of the new local map using the equations presented in Section 4. Of course, those equations should be adapted to the presence of the common state variables representing the link between the maps.
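A sketch of this initialization, Eqs. (28)-(31): the 9×9 transform T zeroes the pose, keeps the body-frame velocities, and folds the old heading into the compass bias; the function name is ours:

import numpy as np

def start_new_local_map(xV, PV):
    """Clone the 9-dim vehicle state and re-base it, Eqs. (28)-(31)."""
    T = np.zeros((9, 9))
    T[4:8, 4:8] = np.eye(4)    # keep u, v, w, r in the vehicle frame
    T[8, 3] = 1.0              # new psiL0 = psi ...
    T[8, 8] = 1.0              # ... + old psiL0, Eq. (29)
    x_new = np.concatenate([xV, T @ xV])          # [link clone, new state]
    P_new = np.block([[PV,     PV @ T.T],
                      [T @ PV, T @ PV @ T.T]])    # Eq. (31)
    return x_new, P_new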

5.2. Local Map Joining

The map-building procedure will result in a sequence of local maps of the form

$$M_i = (x^i, P^i); \quad \text{with } x^i = [x_V^{i-1}\; x_1^i \ldots x_n^i\; x_V^i]^T. \quad (32)$$

Each local map M_i contains the term x_V^{i−1}, a copy of the vehicle's state at the end of the previous map M_{i−1}, which represents the common part connecting the two maps. It also contains a set of features {x_1^i ... x_n^i}, which have been added to the state vector during the generation of the map, and, finally, the term x_V^i, which represents the estimate of the vehicle's state throughout the creation of the map and whose final value will serve to initialize the M_{i+1} local map.

The process of joining local maps into a single global map is described here using a notation similar to the one presented in Pinies and Tardos (2007). Consider two consecutive local maps defined as

$$M_A = \left(\begin{bmatrix} x_A \\ x_{Ca} \end{bmatrix}, \begin{bmatrix} P_A & P_{ACa} \\ P_{CaA} & P_{Ca} \end{bmatrix}\right), \quad (33)$$

$$M_B = \left(\begin{bmatrix} x_{Cb} \\ x_B \end{bmatrix}, \begin{bmatrix} P_{Cb} & P_{CbB} \\ P_{BCb} & P_B \end{bmatrix}\right). \quad (34)$$

The part common to both maps is represented by x_Ca, which corresponds to the state of the vehicle at the end of M_A, and x_Cb, which is initialized as an exact clone of x_Ca during the creation of the M_B map but evolves because of the updates propagated through the correlation terms during the generation of M_B. The rest of the information stored in the maps is represented by x_A and x_B. According to the general form described in Eq. (32), x_A will contain the common term representing the link with a previous map and all the features in M_A, and x_B will contain the features in M_B and the estimate of the vehicle's state at the end of the map.

The objective of the map-joining process is to obtain a single global map containing the information from all the local maps. In this example, the global map is represented by

$$
M_{AB} = \left( \begin{bmatrix} x'_A \\ x_{Cb} \\ x_B \end{bmatrix},\;
\begin{bmatrix}
P'_A & P'_{ACb} & P'_{AB} \\
P'_{CbA} & P_{Cb} & P_{CbB} \\
P'_{BA} & P_{BCb} & P_B
\end{bmatrix} \right). \qquad (35)
$$

The last two blocks of the global map coincide exactly with $M_B$ (they are already up to date). Therefore, only the terms related to $x_A$ need to be updated (a prime is used to denote those terms). This is because the first map has been updated only with its own measurements and does not contain any information obtained during the generation of the second map. To transmit the effect of those measurements to the estimates in the $M_A$ map, a back-propagation procedure is carried out:

$$
K = P_{ACa}\,P_{Ca}^{-1}, \qquad (36)
$$

$$
P'_{ACb} = K\,P_{Cb}, \qquad (37)
$$

$$
P'_A = P_A + K\left(P'_{CbA} - P_{CaA}\right), \qquad (38)
$$

$$
x'_A = x_A + K\left(x_{Cb} - x_{Ca}\right). \qquad (39)
$$

Moreover, in order to recover the full covariance matrix of the global map, it is necessary to calculate the correlation term relating the two local maps:

$$
P'_{AB} = P'_{ACb}\,P_{Cb}^{-1}\,P_{CbB} = K\,P_{CbB}. \qquad (40),\ (41)
$$

At this point, all the elements in $M_{AB}$ have been determined. It is important to note that this map-joining procedure is applicable to sequences of more than two local maps. After each union, the resulting map still contains the common elements that serve as a link with the adjacent ones; therefore, the same procedure can be applied.
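For concreteness, the back-propagation and joining of Eqs. (36)–(41) can be sketched as follows. Block names follow Eqs. (33)–(35); this is an illustrative NumPy sketch under those definitions, not the authors' code:

```python
import numpy as np

def join_maps(x_A, P_A, P_ACa, P_Ca, x_Ca,
              x_Cb, P_Cb, P_CbB, x_B, P_B):
    """Join two conditionally independent local maps M_A and M_B
    into a global map M_AB, following Eqs. (36)-(41).
    Block names follow Eqs. (33)-(35); a sketch only.
    """
    # Eq. (36): gain relating the non-common part of M_A to the link.
    K = P_ACa @ np.linalg.inv(P_Ca)

    # Eqs. (37)-(39): transmit the updates accumulated in x_Cb (the
    # evolved copy of x_Ca) back into the M_A estimates.
    P_ACb_new = K @ P_Cb                            # P'_ACb
    P_A_new = P_A + K @ (P_ACb_new.T - P_ACa.T)     # P'_A
    x_A_new = x_A + K @ (x_Cb - x_Ca)               # x'_A

    # Eqs. (40)-(41): recover the cross-covariance between the maps.
    P_AB_new = K @ P_CbB                            # P'_AB

    # Assemble the joined map M_AB of Eq. (35).
    x_AB = np.concatenate([x_A_new, x_Cb, x_B])
    P_AB = np.block([
        [P_A_new,      P_ACb_new,  P_AB_new],
        [P_ACb_new.T,  P_Cb,       P_CbB],
        [P_AB_new.T,   P_CbB.T,    P_B],
    ])
    return x_AB, P_AB
```

Note that the only matrix inversion involves the covariance of the common part, whose size is that of the cloned vehicle state (9 × 9 in this application), which keeps the joining step inexpensive regardless of the number of features in each map.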

Each element of the resulting global map is still represented in the base frame of its respective local map. Moreover, it is possible that some features have been observed from different local maps and are therefore repeated. The final part of this procedure consists of transforming all the features to a common coordinate frame. Data association can then be carried out and, after obtaining the correspondences, the global map can be updated to produce a better estimate. In the context of this work, the Joint Compatibility Branch and Bound (JCBB) data association algorithm (Neira & Tardos, 2001) has been used to obtain the hypothesis relating features from different local maps. Then, an implicit measurement equation representing the equivalence between paired features is used to perform the update (Castellanos & Tardos, 1999).
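To illustrate this last step, the sketch below performs an EKF update with the implicit measurement $h(x) = x_i - x_j = 0$ for one pair of matched features. It assumes the paired features are already expressed in a common frame and that $h$ is linear in the state, which is a simplification of the ρ–θ line case treated in the paper; indices, dimensions, and names are illustrative assumptions:

```python
import numpy as np

def fuse_paired_features(x, P, i, j, dim=2):
    """Update the global map with the implicit measurement
    h(x) = x_i - x_j = 0 for two features declared equivalent by
    data association. Linear sketch only; i and j are the starting
    indices of the two feature blocks in the state vector.
    """
    n = x.size
    # Jacobian of h(x) = x_i - x_j with respect to the full state.
    H = np.zeros((dim, n))
    H[:, i:i + dim] = np.eye(dim)
    H[:, j:j + dim] = -np.eye(dim)

    # Standard EKF update with a zero-valued pseudo-measurement.
    S = H @ P @ H.T                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)
    x_new = x + K @ (0.0 - H @ x)    # innovation is -h(x)
    P_new = (np.eye(n) - K @ H) @ P
    return x_new, P_new
```

Since the pseudo-measurement carries no noise, in practice a small measurement covariance is often added to S for numerical robustness.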

6. EXPERIMENTAL RESULTS

To test the reliability of the proposed algorithm, we carried out an extensive experiment in an abandoned marina on the Costa Brava (Spain). The Ictineu AUV gathered a data set along a 600-m operated trajectory that included a small loop around the principal water tank and a 200-m straight path through an outgoing canal. The vehicle moved at about 0.2 m/s, and the experiment lasted 50 min. The data set included measurements from the DVL, the compass, and the imaging sonar, which was set to a range of 50 m with resolutions of 0.1 m and 1.8 deg. To minimize the perturbations in the compass, a calibration was performed before starting the experiment. Because the bottom was natural and a minimum distance to the walls was always maintained, the calibration was not affected by important distortions. However, to deal with occasional disturbances, a pessimistic compass uncertainty model was adopted. For validation purposes, the vehicle was operated close to the surface, attached to a buoy equipped with a global positioning system (GPS) receiver used for registering the trajectory. Figure 14 represents the trajectory obtained during the generation of the different submaps (solid black line), which is a good approximation of the one measured with the GPS (dashed line). In our implementation, a new local map is started when the vehicle is more than 75 m from the map base frame (these limits appear as circles in the figure).

Figure 14. Sequence of local maps. The SLAM trajectory is represented with a solid line and the GPS with a dashed line. The circles represent the limits of each local map, and their centers coincide with the corresponding base frames.


Figure 15. The resulting global map together with the dead-reckoning (dash–dotted line), GPS (dashed line), and SLAM (solid line) trajectories represented over a satellite image of the scenario.

This resulted in the creation of five local maps whose dimensions made a correct representation of the features in polar coordinates possible. The line segments corresponding to the walls were also obtained by analyzing the overlap between the measured line features and their corresponding imprints in the thresholded acoustic data. It is important to clarify that only the lines are included in the state vector and hence estimated with the filter; the endpoints of the segments should therefore be taken only as an approximation. A remarkable resemblance can be observed between the SLAM trajectory estimated during the creation of the different maps and the reference data from the GPS. At approximately [−40, 25], a sudden position change appears as a consequence of reobserving, in the second local map and after performing a small loop, the features at the beginning of the long canal. Given the shape and dimensions of the scenario and the range setting of the sonar, the few places where a loop closure could happen are limited to the principal tank. The path followed toward the top part of this tank is split between the first two local maps. Therefore, the only place where a loop closure could happen is in the lower part of the tank, when the features at each side go out of sight. In these loop-closing situations, a discontinuity is introduced in the trajectory stored in the sonar data buffer. It is, however, uncommon for such strong position corrections to affect the feature extraction process: the update that produces the discontinuity generally takes place just after the complete observation of a feature and during the initial moments of the next one, so most of the new bins introduced into the buffer are obtained on the already corrected track. It can also be observed that the discrepancy with the GPS data increases when the vehicle moves through the canal. This is mainly caused by the absence of features placed perpendicular to the direction of the canal axis, which makes it difficult to correct the errors that accumulate in this direction.


Figure 16. Error plots (2σ bounds) for the resulting estimated trajectory after the local map joining. The GPS data were used as ground truth.

The global map and the estimated trajectory (solid line) obtained after the joining are plotted in Figure 15, layered over a satellite image. For comparison, the GPS trajectory (dashed line) and a dead-reckoning trajectory (dash–dotted line) obtained by executing the filter with only the measurements from the DVL and the compass are also represented. As can be observed, the dead-reckoning data suffer from an appreciable drift (even causing the trajectory to leave the canal), whereas the SLAM estimated trajectory follows the GPS track with considerable precision. The resulting map is also a good approximation, matching almost perfectly the real position of the marina's boundaries. A problem with the position of a feature is observed in the upper-left-hand part of the map. This effect is due to the similarity between the two intersecting lines: the small intersection angle makes it difficult for the feature extraction to discern between them and, eventually, they are interpreted as a single (slightly distorted) line. Of course, this also affects the measurement of the segment endpoints, as it complicates determining the overlap with the thresholded data and tends to produce longer segments. Some minor problems with the measurement of the segment endpoints are also observed at the small channel entrance in the lower-left-hand part of the map. They appear mainly because of the polar parameterization used for the line features, which, in some particular situations, produces a misplacement of the segment endpoints. Figure 16 represents the error plots for the estimated trajectory obtained after the local map joining, with the GPS data used as ground truth. As can be seen, the error is contained within the 2σ limits, confirming the correct operation of the SLAM. Additional results validating the algorithm are shown in Figure 17, which reproduces two acoustic images generated by placing the sonar measurements from the complete data set according to the dead-reckoning and the SLAM estimated trajectories. An averaged representation of all the overlapping scans was used; the diffuse appearance of the dead-reckoning image is therefore the expected result of the dispersion induced by the erroneous trajectory. Using the SLAM trajectory, on the other hand, provides a more accurate placement of the measurements, which results in a sharper image. Only the acquisition of the sensor data was performed in real time by the computers onboard the Ictineu.


Figure 17. Acoustic map obtained after an averaged composition of the sonar readings along the dead-reckoning estimated trajectory (left) and the SLAM estimated trajectory (right).

The SLAM approach itself was implemented in Matlab and executed offline on an ordinary desktop computer. The execution time was shorter than the duration of the real experiment, so we expect that a more optimized implementation would be able to operate onboard.

7. CONCLUSIONS AND FURTHER WORK

An algorithm to perform SLAM in partially structured underwater environments has been presented. It takes advantage of walls and other planar structures typically present in some scenarios of interest, from which reliable line features can be extracted from acoustic images. The main contributions of this work include a feature extraction method capable of working with the continuous stream of data from the MSIS while dealing with the distortions that the vehicle movement induces in the acoustic images, a new method for estimating the uncertainty of those features, and the application domain itself, with experimental results supporting the viability of the proposal. Moreover, the EKF–SLAM algorithm has been complemented with a local map-building method that makes it possible to share information between maps while maintaining their independence. An additional contribution that complements this work is the publication of the marina data set, which includes logs from the MSIS, DVL, compass, and GPS sensors. The data set can be downloaded from Ribas (2006).

Further work can be done to improve the feature representation in the state vector by replacing the current ρ–θ parameterization with a more reliable approach such as the SP-map (Castellanos & Tardos, 1999). This SLAM approach would also benefit from the integration of additional types of features: the presented algorithm can be adapted to extract any other kind of feature susceptible to parameterization, for instance, point features representing the corners at the intersection of two walls. The possibility of building local maps also opens the door to mapping larger environments and to further improvements such as multirobot SLAM. All these improvements should conclude with the testing of the algorithm in more challenging environments.

ACKNOWLEDGMENTS

The authors thank Marc Carreras, Andres El-Fakdi, Narcís Palomeras, and Emili Hernandez for their help with the experimental setup and the acquisition of the marina data set. This work has been funded in part by projects DPI2005-09001-C03-01 and DPI2006-13578 of the Direccion General de Investigacion of Spain.

REFERENCES

Bailey, T., & Durrant-Whyte, H. F. (2006). Simultaneous localization and mapping (SLAM): Part II, State of the art. IEEE Robotics and Automation Magazine, 13(3), 108–117.
Bosse, M., Newman, P., Leonard, J., Soika, M., Feiten, W., & Teller, S. (2003). An Atlas framework for scalable mapping. In Proceedings of the IEEE International Conference on Robotics and Automation, Taipei, Taiwan (pp. 1899–1906). Thousand Oaks, CA: Sage.
Bosse, M., Newman, P., Leonard, J. J., & Teller, S. (2004). SLAM in large-scale cyclic environments using the Atlas framework. International Journal of Robotics Research, 23(12), 1113–1139.
Castellanos, J., Montiel, J., Neira, J., & Tardos, J. (1999). The SPmap: A probabilistic framework for simultaneous localization and map building. IEEE Transactions on Robotics and Automation, 15(5), 948–953.
Castellanos, J. A., Neira, J., & Tardos, J. D. (2004). Limits to the consistency of EKF-based SLAM. In Proceedings of the 5th IFAC Symposium on Intelligent Autonomous Vehicles, Lisbon, Portugal (pp. 1244–1249).
Castellanos, J. A., & Tardos, J. D. (1999). Mobile robot localization and map building: A multisensor fusion approach. Boston, MA: Kluwer Academic Publishers.
Clemente, L., Davison, A. J., Reid, I. D., Neira, J., & Tardos, J. D. (2007). Mapping large loops with a single hand-held camera. In Proceedings of Robotics Science and Systems, Atlanta, GA.
Durrant-Whyte, H. F., & Bailey, T. (2006). Simultaneous localization and mapping (SLAM): Part I, The essential algorithms. IEEE Robotics and Automation Magazine, 13(2), 99–108.
Estrada, C., Neira, J., & Tardos, J. D. (2005). Hierarchical SLAM: Real-time accurate mapping of large environments. IEEE Transactions on Robotics, 21(4), 588–596.
Eustice, R., Singh, H., Leonard, J., Walter, M., & Ballard, R. (2005). Visually navigating the RMS Titanic with SLAM information filters. In Proceedings of Robotics Science and Systems, Cambridge, MA (pp. 57–64). Cambridge, MA: MIT Press.
Fairfield, N., Jonak, D., Kantor, G. A., & Wettergreen, D. (2007). Field results of the control, navigation, and mapping systems of a hovering AUV. In Proceedings of the 15th International Symposium on Unmanned Untethered Submersible Technology, Durham, NH.
Fossen, T. I. (2002). Marine control systems: Guidance, navigation & control of ships, rigs and underwater vehicles (1st ed.). Trondheim, Norway: Marine Cybernetics AS.
Griffiths, G., McPhail, S., Rogers, R., & Meldrum, D. (1998). Leaving and returning to harbour with an autonomous underwater vehicle. In Proceedings of Oceanology International, Brighton, UK (volume 3, pp. 75–87). New Malden, UK: Spearhead Exhibitions Ltd.
Guivant, J., Nebot, E., & Durrant-Whyte, H. F. (2000). Simultaneous localization and map building using natural features in outdoor environments. In Proceedings of the International Conference on Intelligent Autonomous Systems, Venice, Italy (pp. 581–588).
Guivant, J. E., & Nebot, E. M. (2001). Optimization of the simultaneous localization and map-building algorithm for real-time implementation. IEEE Transactions on Robotics and Automation, 17(3), 242–257.
Huang, S., & Dissanayake, G. (2007). Convergence and consistency analysis for extended Kalman filter based SLAM. IEEE Transactions on Robotics, 23(5), 1036–1049.
Illingworth, J., & Kittler, J. (1988). A survey of the Hough transform. Computer Vision, Graphics, and Image Processing, 44(1), 87–116.
Kim, J. H., & Sukkarieh, S. (2003). Airborne simultaneous localisation and map building. In Proceedings of the IEEE International Conference on Robotics and Automation, Taipei, Taiwan (pp. 406–411).
Leonard, J. J., Carpenter, R. N., & Feder, H. J. S. (2001). Stochastic mapping using forward look sonar. Robotica, 19(5), 467–480.
Leonard, J. J., & Durrant-Whyte, H. F. (1992). Directed sonar sensing for mobile robot navigation. Norwell, MA: Kluwer Academic Publishers.
Leonard, J. J., & Feder, H. (2001). Decoupled stochastic mapping. IEEE Journal of Oceanic Engineering, 26(4), 561–571.
Leonard, J. J., & Newman, P. M. (2003). Consistent, convergent and constant-time SLAM. In Proceedings of the International Joint Conference on Artificial Intelligence, Acapulco, Mexico (pp. 1143–1150). Morgan Kaufmann.
Martins, A., Matos, A., Cruz, N., & Pereira, F. L. (1999). IES: An open system for underwater inspection. In Proceedings of Oceans MTS/IEEE, Seattle, WA (volume 2, pp. 549–554).
Neira, J., & Tardos, J. D. (2001). Data association in stochastic mapping using the joint compatibility test. IEEE Transactions on Robotics and Automation, 17(6), 890–897.
Newman, P. M., & Leonard, J. J. (2003). Pure range-only sub-sea SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Taipei, Taiwan (volume 2, pp. 1921–1926).
Newman, P. M., Leonard, J. J., & Rikoski, R. J. (2003). Towards constant-time SLAM on an autonomous underwater vehicle using synthetic aperture sonar. In Proceedings of the 11th International Symposium on Robotics Research, Siena, Italy (volume 15, pp. 409–420). Berlin: Springer.
Ni, K., Steedly, D., & Dellaert, F. (2007). Tectonic SAM: Exact, out-of-core, submap-based SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Rome, Italy (pp. 1678–1685).
Pinies, P., & Tardos, J. (2007). Scalable SLAM building conditionally independent local maps. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA (pp. 3466–3471).
Ribas, D. (2006). Dataset obtained in an abandoned marina, St. Pere Pescador (Spain). http://eia.udg.es/%7Edribas (online; accessed July 12, 2007).
Ribas, D., Neira, J., Ridao, P., & Tardos, J. D. (2006). SLAM using an imaging sonar for partially structured environments. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China (pp. 5040–5045).
Ribas, D., Palomer, N., Ridao, P., Carreras, M., & Hernandez, E. (2007). Ictineu AUV wins the first SAUC-E competition. In Proceedings of the IEEE International Conference on Robotics and Automation, Rome, Italy (pp. 151–156).
Roman, C., & Singh, H. (2005). Improved vehicle based multibeam bathymetry using sub-maps and SLAM. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, Canada (pp. 3662–3669).
Smith, R., Self, M., & Cheeseman, P. (1990). Estimating uncertain spatial relationships in robotics. In Autonomous robot vehicles (pp. 167–193). New York: Springer-Verlag.
Tardos, J. D., Neira, J., Newman, P., & Leonard, J. (2002). Robust mapping and localization in indoor environments using sonar data. International Journal of Robotics Research, 21(4), 311–330.
Tena, I., Petillot, Y., Lane, D. M., & Salson, C. (2001). Feature extraction and data association for AUV concurrent mapping and localisation. In Proceedings of the IEEE International Conference on Robotics and Automation, Seoul, Korea (pp. 2785–2790).
Tena, I., Reed, S., Petillot, Y., Bell, J., & Lane, D. M. (2003). Concurrent mapping and localisation using side-scan sonar for autonomous navigation. In Proceedings of the 13th International Symposium on Unmanned Untethered Submersible Technology, Durham, NH.
Thrun, S., Liu, Y., Koller, D., Ng, A. Y., Ghahramani, Z., & Durrant-Whyte, H. F. (2004). Simultaneous localization and mapping with sparse extended information filters. International Journal of Robotics Research, 23(7–8), 693–716.
Williams, S., & Mahon, I. (2004). Simultaneous localisation and mapping on the Great Barrier Reef. In Proceedings of the IEEE International Conference on Robotics and Automation, New Orleans, LA (volume 2, pp. 1771–1776).
Williams, S. B., Dissanayake, G., & Durrant-Whyte, H. F. (2002). An efficient approach to the simultaneous localisation and mapping problem. In Proceedings of the IEEE International Conference on Robotics and Automation, Washington, DC (pp. 406–411).
Williams, S. B., Newman, P. M., Rosenblatt, J., Dissanayake, G., & Durrant-Whyte, H. (2001). Autonomous underwater navigation and control. Robotica, 19(5), 481–496.