
A Hybrid Vision + Ladar Rural Road Follower
Christopher Rasmussen
University of Delaware
[email protected]

Abstract— We present a vision- and ladar-based approach to autonomous driving on rural and desert roads that has been tested extensively in a closed-loop system. The vision component uses Gabor wavelet filters for texture analysis to find ruts and tracks from which the road vanishing point can be inferred via Hough-style voting, yielding a direction estimate for steering control. The ladar component projects detected obstacles along the road direction onto the plane of the front of the vehicle and tracks the 1-D obstacle “gap” due to the road to yield a lateral offset estimate. Several image- and state-based tests to detect failure conditions such as off-road poses (i.e., there is no road to follow) and poor lighting due to sun glare or distracting shadows are also explained. The system’s efficacy is demonstrated with full control of a vehicle over 10+ miles of difficult roads at up to 25 mph, as well as analysis of logged data in diverse situations.

I. INTRODUCTION

With the running of the DARPA Grand Challenge (DGC) robot races in March, 2004 and October, 2005, there has been a heightened interest in algorithms for autonomously following “difficult” unpaved paths and roads. Nearly all of the 2004 and 2005 courses were narrowly constrained to a series of flat or hilly desert roads, many of which were unpaved and offered little color contrast to the surrounding sandy environment. The DGC courses also went through roadless areas, requiring sensors and methods for general obstacle avoidance and slope analysis as studied in [1], [2], and thus requiring some decision mechanism as to whether a vehicle is on- or off-road.

In this paper, we describe a perceptual module for following marginal roads that uses a monocular, grayscale camera in conjunction with SICK LMS ladars to rapidly obtain an estimate of oncoming road structure and transmit appropriate steering and throttle commands to the vehicle controller. The module, which we will call VP FOLLOW, was developed to operate as part of the overall system for the 2005 DGC team of Caltech, and functions related to higher-level navigation, structural map update, off-road steering, direct control of the vehicle, hardware fault monitoring, etc. reside in modules developed by other team members. In the 2005 DGC, the ultimate action taken by the vehicle in any given situation was a complex function of its state and sensor inputs that is beyond the scope of this paper to describe; the full system is covered in [3]. Thus, we focus here on the workings of VP FOLLOW in isolation and in conjunction with two simplified versions of the vehicle controller which essentially allow it to completely control the vehicle.

Many complementary strategies for visual road following have been developed based on certain assumptions about the characteristics of the road scene. For example, edge-based methods such as those described in [4], [5], [6] are often used to identify lane lines or road borders, which are fit to a model of the road curvature, width, and so on. These algorithms typically work best on well-engineered roads such as highways which are paved and/or painted, resulting in a wealth of high-contrast contours suited for edge detection. Another popular set of methods for road tracking is region-based [6], [7], [8], [9]. These approaches use characteristics such as color or texture measured over local neighborhoods in order to formulate and threshold on a likelihood that pixels belong to the road area vs. the background. When there is good contrast for the cue chosen, there is no need for the presence of sharp or unbroken edges, which tends to make these methods more appropriate for unpaved rural roads.

We have found empirically that many desert road scenes present difficulties for these approaches, as for example a dirt road through a dry environment is delineated by neither strong sharp edges nor contrasting local color or texture characteristics (e.g., Figure 1(a)). One cue that often identifies such roads unambiguously, however, is their overall banding pattern. This banding, often due to ruts and tire tracks left by previous vehicles driven by humans who knew the way, is aligned with the road direction and thus most apparent because of the strong grouping cue imposed by its vanishing point. The percept of the vanishing point is reinforced by other oriented texture such as road border intensity edges or painted lines if they are present, and thus is an almost invariant feature of road images taken from the driver’s perspective regardless of road width or surface properties.

For a straight road segment on planar ground, there is a unique vanishing point associated with the road. Its horizontal image position indicates the road direction, and its vertical position marks the horizon line of the road plane (see Figure 1(d) for an example of VP FOLLOW’s output on the previously referenced image). The significance of the former, of course, is that the difference between it and the vehicle’s current direction of travel yields an angular error suitable for input to a low-level steering control and acceleration module. As the vanishing point is associated with the road’s tangent, there is no unique vanishing point for curving or undulating road segments [10]. One empirical result we show here is that a constantly updated “mean” vanishing point tangent to approximately the middle of the curve ahead–that is, only a fast-changing linear shape estimate–suffices (along with a lateral offset control discussed below) to negotiate non-trivially curved roads at moderate vehicle speeds.

In addition to the image-derived direction information, the second key part of VP FOLLOW is extraction of the vehicle’s lateral displacement from the road midline. Alignment of vehicle direction with the road direction does not assure that the vehicle is on the road–only that it is moving parallel to it. Without a corrective centering impulse, the vehicle may drift off of a straight road over time or cut off or overshoot curves. Image-based road segmentation is one possible approach here [7], [9], and we have experimented with several methods using the vanishing point as a powerful shape constraint [11], [12], but ultimately have found them insufficiently robust. Rather, VP FOLLOW uses a simple obstacle map derived from SICK LMS ladars to estimate the lateral offset of the road midline relative to the vehicle. By tracking the 1-D “gap” between obstacles (positive and/or negative) on the left and right sides of the road, the vehicle can center itself while decreasing directional error. This approach is insensitive to the road surface material and lighting conditions, and there is no need to tune, learn, or adapt parameters as the algorithm runs on a variety of roads. In flat areas without obstacles to constrain lateral offset (i.e., only visual appearance demarcates the road), any drifting is of less concern than elsewhere because by definition the off-road area is less hazardous.

Fig. 1. (a) Captured road image; (b) Dominant orientation at each pixel ([0, π] radians → [0, 255] intensity values); (c) Vote function for vanishing point; (d) Distribution of particles and estimated vanishing point location with horizon line indicated.

A critical system state is whether the vehicle currently sees a road, in which case its “advice” should be followed, or no road is visible, in which case the vehicle controller should ignore spurious steering commands. The final component of VP FOLLOW is a suite of methods for detecting failure situations in order to shut itself off as well as to restart itself when visual conditions warrant. The first failure situation we examine is whether the vehicle is currently on a road or not. This is a simple procedure for a GPS-equipped vehicle on highways and urban streets which have been digitized in vector form, but the unnamed access roads and desert paths we are interested in exploiting are mostly unmapped. VP FOLLOW thus includes an image-based function to discriminate images that contain a strong vanishing point–assumed to belong to a road–from those that do not. Other failure detection functions relate to poor lighting conditions such as sun glare, distracting shadows, and darkness that may prevent the module from recognizing a road even if it is on one.

In the next sections, we will describe the steps of how VP FOLLOW estimates road shape and introduce its failure detection methods, after which we will show results demonstrating the system’s capabilities and discuss extensions we are currently working on.

II. ROAD SHAPE ESTIMATION

There are three significant stages to road shape estimation which we describe in the following subsections. First, a texture analysis is performed by computing dominant texture orientations over the current image. Second, a linear approximation to the road direction is measured by having all dominant orientations in the image vote for a single best road vanishing point. Finally, the vehicle’s lateral offset from the road center is estimated from ladar data.

A. Dominant Orientations

The dominant orientation θ(p) of an image at pixel p = (x, y) is the direction that describes the strongest local parallel structure or texture flow. We estimate θ(p) by convolving the image with a bank of Gabor wavelet filters [13] parametrized by orientation θ, wavelength λ, and odd or even phase. To generate a k × k Gabor kernel (we use k = ⌊10λ/π⌋), we calculate:

g_odd(x, y, θ, λ) = exp[−(4a² + b²) / (8σ²)] sin(2πa/λ)   (1)

where x = y = 0 is the kernel center, a = x cos θ + y sin θ, b = −x sin θ + y cos θ, σ = k/9, and the “sin” changes to “cos” for g_even. The actual convolution kernel g is then obtained by subtracting g’s DC component (i.e., mean value) from itself and normalizing the result so that g’s L2 norm is 1.

To best characterize local texture properties including step and roof edge elements at an image pixel I(x, y), we examine the complex response of the Gabor filter given by I_complex(x, y) = (g_odd ∗ I)(x, y)² + (g_even ∗ I)(x, y)² for a set of n evenly spaced Gabor filter orientations. The dominant orientation θ(x, y) is chosen as the filter orientation which elicits the maximum complex response at that location.

For all of the results in this paper except where otherwise noted, the image has been scaled via an image pyramid down to 80 × 60 resolution, the number of Gabor orientations used is n = 36, and a single wavelength λ = 4 resulting in a kernel size of 12 × 12 is used. The FFTW Fourier transform library [14] at single precision is used to calculate dominant orientations speedily, taking ∼55 ms on a 3.0 GHz Pentium IV for a 160 × 120 image.
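
As a concrete illustration, the following is a minimal Python sketch of the dominant-orientation computation described above, assuming a grayscale image stored as a float NumPy array; SciPy's FFT-based convolution stands in for the FFTW pipeline used in the paper, and the function names are ours, not the system's.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(theta, lam, phase):
    """k x k Gabor kernel per Eq. (1): k = floor(10*lam/pi), sigma = k/9."""
    k = int(np.floor(10 * lam / np.pi))
    sigma = k / 9.0
    half = k // 2
    y, x = np.mgrid[-half:k - half, -half:k - half]   # x = y = 0 at the kernel center
    a = x * np.cos(theta) + y * np.sin(theta)
    b = -x * np.sin(theta) + y * np.cos(theta)
    carrier = np.sin if phase == 'odd' else np.cos
    g = np.exp(-(4 * a**2 + b**2) / (8 * sigma**2)) * carrier(2 * np.pi * a / lam)
    g = g - g.mean()                                  # remove DC component
    return g / np.linalg.norm(g)                      # unit L2 norm

def dominant_orientations(image, n=36, lam=4.0):
    """Per-pixel dominant orientation in [0, pi) and the winning complex response."""
    thetas = np.pi * np.arange(n) / n
    responses = np.empty((n,) + image.shape)
    for i, t in enumerate(thetas):
        odd = fftconvolve(image, gabor_kernel(t, lam, 'odd'), mode='same')
        even = fftconvolve(image, gabor_kernel(t, lam, 'even'), mode='same')
        responses[i] = odd**2 + even**2               # complex Gabor energy
    best = responses.argmax(axis=0)
    return thetas[best], responses.max(axis=0)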

Figure 1(b) shows the calculated dominant orientations for the image in Figure 1(a). Gray level intensities proportional to an estimated angle from 0 to 180 (in 36 discrete steps) are shown. Observe that most parallel structure is in the dirt road on the right.

B. Vanishing Point Detection

The possible vanishing points for an image pixel p with dominant orientation θ(p) are all of the points (x, y) along the ray defined by r_p = (p, θ(p)). Intuitively, the best estimate for the vanishing point v_max is that point lying on or near the most such dominant orientation rays (see [15], [16], [17], [18] for recent work on vanishing point finding). In [10], we formulated an objective function votes(v) to evaluate the support of road vanishing point candidates v over a search region C roughly the size of the image itself. An efficient and relatively accurate (given enough orientations) voting scheme, which we call raster voting, is to draw a “ray of votes” r_p per voter in an additive accumulation buffer A in which each pixel is a vanishing point candidate v. After rendering every vote ray, the pixel in A (which represents C at a fixed resolution) with the maximum value is v_max. Graphics hardware accelerates this voting operation, though 8-bit accumulation buffers limit “elections” to a maximum of 256 votes per candidate, which is not quite enough for our image resolution.

Fig. 2. Steps of ladar “gap” tracker for lateral offset estimation. Ladar-identified obstacles (red points) are projected along the road direction (diagonal blue line) onto the axis defined by the front axle (green line) of the vehicle (purple rectangle). The estimated obstacle density in this projection is graphed in red below, with the particle filter distribution and likelihoods shown in yellow below that, and the weighted sum of the particles indicating the gap estimate shown as a blue line segment at the bottom of the figure. [Source image is shown in Figure 6(c); one ladar mounted on the bumper parallel to the ground was used here]
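
A minimal CPU sketch of the raster-voting idea follows, assuming the accumulator covers the image itself; the hardware-accelerated ray rendering is approximated by stepping along each voter's ray, and restricting votes to candidates at or above the voter is a simplification of this sketch.

import numpy as np

def vote_vanishing_point(theta):
    """theta: per-pixel dominant orientation in [0, pi); returns the vote accumulator."""
    h, w = theta.shape
    votes = np.zeros((h, w), dtype=np.float32)
    for y0 in range(h):
        for x0 in range(w):
            t = theta[y0, x0]
            dx, dy = np.cos(t), -np.sin(t)    # step toward the horizon (decreasing image y)
            x, y = float(x0), float(y0)
            while 0 <= round(x) < w and 0 <= round(y) < h:
                votes[round(y), round(x)] += 1.0
                x += dx
                y += dy
    return votes

# The raw estimate v_max is simply the accumulator cell with the most votes:
# vy, vx = np.unravel_index(votes.argmax(), votes.shape)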

The raw maximum of votes(v) is noisy, and since the vanishing point shifts only slightly between frames as the vehicle moves, we smooth the estimate using a particle filter [19], [5], [6]. Particles are initially distributed uniformly in order to coarsely localize the vanishing point. Weak dynamics p(v_t | v_{t−1}) (e.g., a low-variance, circular Gaussian) then limit the search region to track the vanishing point closely, reducing the chance of misidentification due to a false peak elsewhere in the image. Finally, the averaging effect of filtering also mitigates saturation by returning the middle of a region of saturated votes as the max, which generally correlates with where the unsaturated maximum would be.
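
One predict/update/resample cycle of this smoothing step might look like the following sketch, where particles are 2-D vanishing point candidates weighted by the vote function; the particle count, dynamics noise, and clipping are illustrative assumptions rather than the system's actual settings.

import numpy as np

def track_vanishing_point(particles, votes, sigma=2.0, rng=np.random):
    """particles: (n, 2) array of (x, y) candidates; returns resampled particles and estimate."""
    h, w = votes.shape
    # predict: weak, low-variance circular-Gaussian dynamics
    particles = particles + rng.normal(0.0, sigma, particles.shape)
    particles[:, 0] = np.clip(particles[:, 0], 0, w - 1)
    particles[:, 1] = np.clip(particles[:, 1], 0, h - 1)
    # update: weight each particle by the vote function at its location
    weights = votes[particles[:, 1].astype(int), particles[:, 0].astype(int)] + 1e-9
    weights = weights / weights.sum()
    estimate = (particles * weights[:, None]).sum(axis=0)   # weighted mean = vanishing point
    # resample for the next frame
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], estimate

# Particles would initially be drawn uniformly over the image, e.g.:
# particles = np.column_stack([np.random.uniform(0, w, 100), np.random.uniform(0, h, 100)])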

Figure 1(c) shows the vanishing point candidate function for the image in Figure 1(a). The current particle distribution and their weighted mean (i.e., the estimated vanishing point location) are shown in Figure 1(d). The vertical cyan line in Figure 1(d) indicates the vehicle direction. When the road-following camera is perfectly aligned with the vehicle direction, it is midway across the image. In cases where there is some yaw offset between the camera and vehicle coordinate systems, such as with the data in Figure 6, the vehicle direction line may be off the image center.

C. Ladar-based Lateral Offset Estimation

Given the road direction from the visual vanishing pointtracker, our goal is to find a maximum a posteriori estimateof the lateral offset of a “gap” in the obstacle field in frontof the vehicle that corresponds to the road. The obstacle fieldis defined as the set of ladar hit points (over all registeredladars) in vehicle coordinates that pass a “danger” test. Herethe danger test simply consists of having an absolute elevationdifference from the bottom of the vehicle’s tires of ≥ 0.5 m.
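
A minimal sketch of this danger test, assuming the registered ladar hits are an (n, 3) array in vehicle coordinates with z measured relative to the plane of the tire bottoms:

import numpy as np

def obstacle_field(ladar_hits, elevation_threshold=0.5):
    """Keep hits whose absolute elevation difference from the tire bottoms is >= 0.5 m."""
    return ladar_hits[np.abs(ladar_hits[:, 2]) >= elevation_threshold]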

Figure 2 diagrams the steps of the lateral offset calculation from an overhead perspective with the vehicle (shown as a purple rectangle) traveling up the page. At the top of the figure is the obstacle field graphed as a set of red points (the source image corresponding to this data is Figure 6(c)). Obstacle points are projected via parallel projection along the road direction (the blue diagonal line) onto the line defined by the front axle of the vehicle (indicated by a green line in the figure).

We seek the segment–aka the road “gap”–with the lowest 1-D obstacle density along this line that is both wider than the vehicle width w and in the vehicle’s neighborhood under the assumption that it is currently on the road. The weight of each obstacle point falls off exponentially with its distance from the vehicle so that nearby obstacles count considerably more. The projected obstacle density is graphed in red in Figure 2 under the projection axis–note the two humps corresponding to the berms and vegetation that flank each side of the road.

A particle filter, the “gap tracker,” is used to estimate the obstacle density online. Each particle (n = 100 here) consists of a hypothetical 2w-wide gap randomly chosen no further than w from the vehicle’s current edges. The purpose of limiting gap hypotheses to the immediate vicinity of the vehicle is to prevent attractively empty areas far outside the road from causing the vehicle to swerve abruptly toward them. The likelihood of each particle is inversely proportional to the obstacle density within it. The distribution of gap particle centers for this data is graphed in yellow beneath the projected obstacle density, with more likely particles drawn with longer vertical lines. Finally, the most likely gap estimate derived from the particle distribution is shown as a blue 2w-wide segment at the bottom of the figure.
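
The following Python sketch captures the gap tracker's main steps under the assumptions that obstacle points are given in vehicle coordinates (x lateral, y forward) and that the road direction comes from the vanishing point tracker; the distance-decay constant, hypothesis jitter, and clipping bounds are illustrative, not the system's tuned values.

import numpy as np

def project_obstacles(obstacles, road_angle):
    """Parallel-project obstacles along the road direction onto the front-axle axis (y = 0)."""
    x, y = obstacles[:, 0], obstacles[:, 1]
    lateral = x - y * np.tan(road_angle)          # slide each point along the road direction
    weights = np.exp(-np.hypot(x, y) / 10.0)      # nearby obstacles count considerably more
    return lateral, weights

def gap_likelihoods(centers, lateral, weights, gap_width):
    """Likelihood of each hypothesized gap, inversely related to the obstacle density inside it."""
    density = np.array([weights[np.abs(lateral - c) < gap_width / 2.0].sum() for c in centers])
    return 1.0 / (1.0 + density)

def track_gap(centers, obstacles, road_angle, w=2.0, rng=np.random):
    """One cycle of the gap particle filter; centers holds the hypothesized 2w-wide gap centers."""
    lateral, pt_weights = project_obstacles(obstacles, road_angle)
    # keep hypotheses no further than w from the vehicle's current edges (edges at +/- w/2)
    centers = np.clip(centers + rng.normal(0.0, 0.3, centers.shape), -1.5 * w, 1.5 * w)
    like = gap_likelihoods(centers, lateral, pt_weights, 2.0 * w)
    like = like / like.sum()
    offset = (centers * like).sum()               # estimated lateral offset of the road midline
    centers = centers[rng.choice(len(centers), size=len(centers), p=like)]
    return centers, offset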

The center of this segment determines where VP FOLLOW thinks the nominal road centerline (the blue line in the figure) intersects its projection axis, which is treated as the front of the vehicle. The road centerline functions as a linear trajectory for the vehicle controller to follow. Its constantly changing direction and lateral translation are what allow the vehicle to go around turns while maintaining clearance from the road edges.

The road centerline can be further augmented with width estimates at discrete distances along it. These are obtained by expanding circles centered at 2 m intervals along the trajectory until k = 3 obstacle points are enclosed within them. The maximum circle radius is limited here to 5 m.
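
A minimal sketch of this width-posting step, assuming the centerline postings and obstacle points share the same ground-plane coordinates:

import numpy as np

def width_postings(centerline_pts, obstacles, k=3, max_radius=5.0):
    """Radius (capped at 5 m) at which an expanding circle at each posting encloses k = 3 obstacles."""
    radii = []
    for c in centerline_pts:
        d = np.sort(np.hypot(*(obstacles - c).T))   # distances from this posting to all obstacles
        r = d[k - 1] if len(d) >= k else max_radius
        radii.append(min(r, max_radius))
    return np.array(radii)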

III. FAILURE DETECTION

There are a number of visual situations that can cause the vanishing point estimator described above to fail. These fall into two general categories: (1) non-road images, and (2) poor lighting conditions. In the first case, the vehicle may leave the road–for example, when driven manually, or because the DGC route description compels it to–and VP FOLLOW must recognize that it should no longer output a road centerline to follow (in the larger DGC system, other modules would continue to offer their opinions of what to do). In the second case, the vehicle may still be on a road, but because of darkness, sun glare, or confusing shadows it may not be able to accurately infer the road vanishing point. Again, we would like to recognize such eventualities and gate the road follower’s output.

a) On-off road inference: For most road scenes, especially rural ones, the vanishing point due to the road is the only one in the image. In rural scenes, there is very little other coherent parallel structure besides that due to the road. The dominant orientations of much off-road texture such as vegetation, rocks, etc. are randomly and uniformly distributed with no strong points of convergence. Even in urban scenes with non-road parallel structure, such texture is predominantly horizontal and vertical, and hence the associated vanishing points are located well outside the image.

The “sharpness” of the vanishing point peak in the vote function over C, which is generally only slightly larger than the input image, is intuitively an indicator of the reliability of the estimate. There are a number of ways to measure this, but we have found that the Kullback-Leibler (KL) divergence [20] between the vote function and a uniform distribution of the vote totals (256 possible values for 8-bit accumulation buffers) correlates well with this intuition. Low KL values are obtained when many different vote totals are observed in the candidate region, while high values are measured with bunching of vote totals at either the high or low end. A similar approach using the likelihood ratio was used to decide whether a scene had vanishing points or not in [18].

We decide that an input image is “road-like” when the KL of votes(v) is over a threshold and the estimated vanishing point can be considered reliable. To smooth this decision, we store a history of the last 5 seconds of results of this threshold comparison and require 1/2 of them to be over threshold for the estimated road geometry to be passed on to the vehicle controller.
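
One reading of this test is sketched below, assuming an 8-bit vote accumulator; the KL threshold and the window length (roughly 5 s of frames) are illustrative assumptions.

import numpy as np

def road_confidence(votes, num_bins=256):
    """KL divergence between the histogram of vote totals and a uniform distribution over them."""
    hist, _ = np.histogram(votes, bins=num_bins, range=(0, num_bins))
    p = hist / hist.sum()
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] * num_bins))   # D_KL(p || uniform), uniform = 1/num_bins

def is_road(history, kl, threshold=2.0, window=50):
    """Require half of the last `window` threshold comparisons (about 5 s of frames) to pass."""
    history.append(kl > threshold)
    del history[:-window]
    return sum(history) >= window // 2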

There is not space in this paper to examine this criterion in depth, but some results demonstrating its efficacy are shown in Figure 3. A vehicle was driven on, off, and around a road for several hundred meters, crossing the road several times. The output of the KL on-off road test is graphed over the vehicle’s track with red indicating off-road segments and green on-road. Note that the test is not strictly about where the vehicle currently is, but rather what it sees immediately ahead. Thus the end of the road at the T-intersection is anticipated, and the road is not re-recognized until the 90-degree turn is completed.

Fig. 3. On/off road decisions for an arbitrarily-driven segment (direction of travel is up and to the left). Top figure is 300 m by 150 m; red segments are where images were classified as “off road”; green are “on road”. Bottom images correspond to the first frame of each off-road segment (marked by blue circles).

Fig. 4. (a) Blooming saturation caused by direct viewing of the sun; we recognize such conditions and turn off road following during them; (b) A few frames later the sun is just out of view. The sky is still saturated, but the vertical stripe is not there. VP FOLLOW accepts such images; (c) A problematic image with the vehicle shadow cast into the road in front of it; (d) Same image as (c) with outline of predicted shadow location in image overlaid.

b) Sun glare: Although the situation is less common, vanishing point estimation can be fooled by spurious edges caused by saturated pixels blooming on the camera CCD from the sun being directly visible in the image. An example is shown in Figure 4(a). VP FOLLOW uses a straightforward image processing procedure to recognize such occurrences. This consists of finding saturated pixels in the current image, doing a dilation, and then checking whether any column is nearly entirely saturated (more than 0.8 of its pixels). This instantaneous test is then temporally filtered. Simply computing the fractional area of an image that is saturated does not work, as the sky is often saturated without significantly affecting the contrast of ground pixels, as in Figure 4(b).
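
A minimal sketch of the blooming test, assuming an 8-bit grayscale image; the saturation level and dilation size are illustrative, while the 0.8 column fraction follows the text.

import numpy as np
from scipy.ndimage import binary_dilation

def sun_glare(image, sat_level=250, column_fraction=0.8):
    """Flag frames where a nearly fully saturated column suggests CCD blooming from the sun."""
    saturated = binary_dilation(image >= sat_level, iterations=2)
    column_sat = saturated.mean(axis=0)           # fraction of saturated pixels in each column
    return bool((column_sat > column_fraction).any())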

Fig. 5. Robotic vehicle used for testing in this paper. SICK LMS ladars are mounted on the bumper and roof; the road following camera is one of a stereo pair over the windshield.

c) Shadows and darkness: There is a rare situation in which the assumption that the road vanishing point is the only strong vanishing point in the image is contradicted. It occurs when the sun is behind the vehicle and low enough to cast the vehicle’s own shadow far along the road in front of it (see Figure 4(c)). If the actual road vanishing point is close enough to the point of the triangular shadow, the two may be conflated, leading to a road departure. Detecting and removing shadows in images is still an open topic [21], but we have found that a calculation of the local sun angle from the vehicle state (northing, easting, heading, pitch, date, and time), plus rough knowledge of the vehicle’s 3-D shape and an assumption of planar ground, suffices for predicting the vehicle shadow in the image–e.g., Figure 4(d). In practice, we simply threshold on the azimuthal angle between the predicted tip of the shadow and the current vanishing point estimate. If they are too close, VP FOLLOW’s output is considered untrustworthy and it is disregarded. A corollary computation for failure due to darkness that does not require image processing is the local sun elevation angle. If it is too low, we assume that it is too dark for VP FOLLOW to operate.
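
The two gates can be sketched as below, assuming the sun azimuth and elevation come from a separate solar-ephemeris computation and that the shadow tip has already been projected into azimuth; the angular thresholds are illustrative assumptions.

def shadow_conflict(shadow_tip_azimuth_deg, vp_azimuth_deg, min_separation_deg=10.0):
    """True if the predicted shadow tip is too close in azimuth to the vanishing point estimate."""
    diff = (shadow_tip_azimuth_deg - vp_azimuth_deg + 180.0) % 360.0 - 180.0
    return abs(diff) < min_separation_deg

def too_dark(sun_elevation_deg, min_elevation_deg=5.0):
    """True if the sun is so low that it is assumed too dark for VP FOLLOW to operate."""
    return sun_elevation_deg < min_elevation_deg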

IV. RESULTS

We will first show results for various components of the system, then present results for the entire system running autonomously. For all of the results given here, the vehicle used is a fully-actuated Ford E350 with a modified suspension and 4-wheel drive. Only a single bumper-mounted SICK LMS ladar was used for lateral offset and road width estimation. The vehicle is pictured in Figure 5.

A sample of VP FOLLOW running passively while the vehicle is under manual control is shown in Figure 6. The leftmost image is a detail of an aerial photograph of the test area in which the vehicle is driven from right to left through an S-turn. The five pairs of camera images and ladar obstacle maps to the right correspond to the vehicle locations marked with red dots in the aerial image. The green road regions are based on the estimated road centerline trajectories and the discrete width estimates, with the lookahead distance of the road clipped when a width posting is ≤ 2 m. Note the shortening of the road region as the turns are entered in Figures 6(c) and (e), and its lengthening as the road straightens in Figures 6(d) and (f). In the camera images, VP FOLLOW can be seen to smoothly anticipate both the left and the right turn.

Fig. 7. (a) Approximate traces of autonomous trajectory follower runs with latitude, longitude markings overlaid. Map covers about an 11 km by 6.5 km area. (b) GPS trace of trajectory follower run over hill (lower right segment from (a)) overlaid on terrain model rendered from digital elevation model, aerial imagery.

VP FOLLOW has been tested in autonomous mode in a number of situations. In one experiment, the road centerline was directly used by the vehicle as a trajectory to follow. Modulo several pauses to adjust the maximum vehicle speed, the vehicle was able to follow a long curving road in gently sloped terrain quite smoothly (the diagonal leg in Figure 7) using only the single camera and a bumper ladar. A second segment was driven that approached, ascended, and descended a significant hill (the horizontal segment in Figure 7 and the 3-D rendering) under full autonomous control save manual throttle control on the descent (the vehicle was not pitch-aware in setting its speed limits).

Fig. 6. Estimated road shape from manual driving around an S-turn. The leftmost image is a satellite photograph of the testing area with the vehicle track overlaid. The scale of the obstacle field images is 40 m wide, 50 m high (vehicle is 2 m wide, 5 m long).

In more recent experiments, we have controlled the vehicle less directly, through painting of “costs” in a 2-D map that a planning module uses to generate nonlinear trajectories. The details of this are beyond the scope of this paper, but suffice it to say that several more autonomous miles have been logged on different desert roads in this fashion with VP FOLLOW exercising primary control, and more still with it as part of a larger ensemble of “opinion providers.” One positive aspect of this less-direct method of vehicle control is that higher-level matters such as DGC course boundaries can override found roads. For example, a sample course visiting all three vertices of the triangle of roads in Figure 7 is untraversable in full by the trajectory following method because it requires a turn at each vertex off of the main road onto a secondary road, and VP FOLLOW’s tendency is to stay on whatever road it starts on.

V. CONCLUSION

We have presented a system for road following on desert and unpaved roads that relies on road texture analyzed from an on-board camera and ladar-based structural information to robustly identify and track the road. The on-board component recovers the road vanishing point in near real-time for many kinds of surface materials with no tuning, and it analyzes its own performance and automatically turns off when the vehicle is not near a road.

A major area of ongoing work is incorporating aerial imagery and digital elevation data into longer-range planning and anticipation. We have begun promising preliminary work using skeletonization and watershed image processing techniques to extract a road network in the vicinity of the vehicle, offering more choices to the vehicle and possibly enabling graph-based path planning.

ACKNOWLEDGMENTS

Many thanks to Team Caltech for their generous provision of vehicle access and data.

REFERENCES

[1] A. Stentz, A. Kelly, P. Rander, H. Herman, O. Amidi, R. Mandelbaum, G. Salgian, and J. Pedersen, “Real-time, multi-perspective perception for unmanned ground vehicles,” in AUVSI, 2003.

[2] K. Kluge and M. Morgenthaler, “Multi-horizon reactive and deliberative path planning for autonomous cross-country navigation,” in SPIE Defense and Security Symposium, 2004.

[3] L. Cremean, T. Foote, J. Gillula, G. Hines, D. Kogan, K. Kriechbaum, J. Lamb, J. Leibs, L. Lindzey, A. Stewart, J. Burdick, and R. Murray, “Alice: An information-rich autonomous system for high speed desert navigation,” Submitted to Journal of Field Robotics, 2006.

[4] C. Taylor, J. Malik, and J. Weber, “A real-time approach to stereopsis and lane-finding,” in Proc. IEEE Intelligent Vehicles Symposium, 1996.

[5] B. Southall and C. Taylor, “Stochastic road shape estimation,” in Proc. Int. Conf. Computer Vision, 2001, pp. 205–212.

[6] N. Apostoloff and A. Zelinsky, “Robust vision based lane tracking using multiple cues and particle filtering,” in Proc. IEEE Intelligent Vehicles Symposium, 2003.

[7] J. Crisman and C. Thorpe, “UNSCARF, a color vision system for the detection of unstructured roads,” in Proc. IEEE Int. Conf. Robotics and Automation, 1991, pp. 2496–2501.

[8] C. Rasmussen, “Combining laser range, color, and texture cues for autonomous road following,” in Proc. IEEE Int. Conf. Robotics and Automation, 2002.

[9] J. Zhang and H. Nagel, “Texture-based segmentation of road images,” in Proc. IEEE Intelligent Vehicles Symposium, 1994.

[10] C. Rasmussen, “Grouping dominant orientations for ill-structured road following,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2004.

[11] C. Rasmussen, “Texture-based vanishing point voting for road shape estimation,” in Proc. British Machine Vision Conf., 2004.

[12] C. Rasmussen and T. Korah, “On-vehicle and aerial texture analysis for vision-based desert road following,” in IEEE Workshop on Machine Vision for Intelligent Vehicles, 2005.

[13] T. Lee, “Image representation using 2D Gabor wavelets,” IEEE Trans. PAMI, vol. 18, no. 10, pp. 959–971, 1996.

[14] M. Frigo and S. Johnson, “FFTW: An adaptive software architecture for the FFT,” in Proc. 1998 IEEE Intl. Conf. Acoustics, Speech, & Sig. Proc., 1998, vol. 3, pp. 1381–1384.

[15] T. Tuytelaars, M. Proesmans, and L. Van Gool, “The cascaded Hough transform,” in Proc. IEEE Int. Conf. on Image Processing, 1998.

[16] A. Polk and R. Jain, “A parallel architecture for curvature-based road scene classification,” in Vision-Based Vehicle Guidance, I. Masaki, Ed., pp. 284–299. Springer, 1992.

[17] S. Se and M. Brady, “Vision-based detection of staircases,” in Proc. Asian Conf. Computer Vision, 2000, pp. 535–540.

[18] J. Coughlan and A. Yuille, “Manhattan world: Orientation and outlier detection by Bayesian inference,” Neural Computation, vol. 15, no. 5, pp. 1063–1088, 2003.

[19] M. Isard and A. Blake, “Contour tracking by stochastic propagation of conditional density,” in Proc. European Conf. Computer Vision, 1996, pp. 343–356.

[20] T. Cover and J. Thomas, Elements of Information Theory, John Wiley and Sons, 1991.

[21] M. Ollis and A. Stentz, “Vision-based perception for an autonomous harvester,” in Proc. Int. Conf. Intelligent Robots and Systems, 1997, pp. 1838–1844.