
Rough Terrain Autonomous Mobility - Part 2: An Active Vision, Predictive Control Approach

Alonzo Kelly and Anthony Stentz

Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213
[email protected], http://www.frc.ri.cmu.edu/~alonzo

[email protected], http://www.frc.ri.cmu.edu/~axs

Abstract. Off-road autonomous navigation is one of the most difficult automation challenges from the point of view of constraints on mobility, speed of motion, lack of environmental structure, density of hazards, and typical lack of prior information. This paper describes an autonomous navigation software system for outdoor vehicles which includes perception, mapping, obstacle detection and avoidance, and goal seeking. It has been used on several vehicle testbeds including autonomous HMMWVs and planetary rover prototypes. To date, it has achieved speeds of 15 km/hr and excursions of 15 km.

We introduce algorithms for optimal processing and computational stabilization of range imagery for terrain mapping purposes. We formulate the problem of trajectory generation as one of predictive control, searching trajectories expressed in command space. We also formulate the problem of goal arbitration in local autonomous mobility as an optimal control problem. We emphasize the modeling of vehicles in state space form. The resulting high fidelity models stabilize coordinated control of a high speed vehicle for both obstacle avoidance and goal seeking purposes.

An intermediate predictive control layer is introduced between the typical high-level strategic or artificial intelligence layer and the typical low-level servo control layer. This layer incorporates some deliberation and some environmental mapping, as do deliberative AI planners, yet it also emphasizes the real-time aspects of the problem, as do minimalist reactive architectures.

Keywords: mobile robots, autonomous vehicles, rough terrain mobility, terrain mapping, obstacle avoidance, goal-seeking, trajectory generation, requirements analysis

1. Introduction

In our companion paper in this issue (Kelly and Stentz, 1998), the authors derived a set of requirements for autonomous off-road mobility that also suggest an approach to meeting those requirements. This paper is concerned with the design and implementation of a system that learns from the results of our earlier theoretical analysis. We have called this system RANGER - for Real-Time Autonomous Navigator with a Geometric Engine¹.

1. The name RANGER is also being used currently for an unrelated free-flying space vehicle under development by David Akin at the University of Maryland.

We emphasize the real-time nature of high speed autonomous mobility and, as a result, have been very concerned with such matters as efficiency, speed, throughput, and response time.

Our approach is based fundamentally on the state space representation of a multi-input / multi-output dynamical system and is a departure from precedent in the following ways:

• We use an active and adaptive approach to perception that minimizes the amount of perceptual data processed.

• We use a predictive control formulation of trajectory generation and search.

• We use an optimal control formulation of goal arbitration.

1.1. Overview

The paper is organized into five more sections followed by a summary and conclusions section. Section 2 presents the problem and our approach at a systems and architectural level of detail. We discuss autonomous mobility from a historical perspective and attempt to define it.

Then, we introduce two views of the overall architecture of our solution - a hierarchical (layered) view, and a data flow (object-oriented) view - and state our high level results for two very different vehicles. The major subsystems of the design are identified as perception, terrain mapping, path planning, and goal seeking and arbitration, and each of these is covered in subsequent sections.

Section 3 deals with our active approach to perception. We show how a few well-justified assumptions make it possible to implement an efficient test for membership of range pixels in a region of interest on the ground.

We then introduce a three part generic algorithm for near optimal active vision for terrain mapping in stationary environments and state our results when we implement it for both laser rangefinders and stereo vision. In the latter case, the algorithm is embedded inside the stereo algorithm so that it also significantly improves the efficiency of range image generation.

Section 4 covers terrain mapping - the process of modeling the environment based on the range imagery generated and processed by perception. Our experience with this problem had identified map traversal, copying, and interpolation as expensive and largely unnecessary operations which ultimately limit vehicle performance.

We introduce an abstract data structure and an approach to obstacle detection which make it possible to avoid all of the above expensive operations. Our results indicate a mapping overhead that is now an insignificant component of run-time. Further, we show a composite terrain map generated from a few hundred stereo range images.

Section 5 deals with path planning and obstacle avoidance. We formulate this problem in terms of forward simulation based on a state space vehicle model and identify certain advantages associated with this approach. The nonlinear vehicle model is formulated and then used in a predictive control context to accurately test candidate vehicle trajectories for relative safety. Our results include a simulation that demonstrates that 97% of C space is kinematically and dynamically infeasible for our vehicle in typical situations.

Section 6 deals with goal seeking and goal arbitration. We formulate the former as a model reference control problem and the latter as a standard optimal control problem. We identify the advantages of forward simulation in path tracking performance and introduce an active perception approach to searching the terrain map based on knowledge of vehicle maneuverability.

2. Local Autonomous Mobility

Our system addresses what we will call the local navigation problem for autonomous vehicles. That is, the problem of deciding what to do based only on what can be seen at the moment within the field of view of the environmental sensors. The problem of global planning is outside our scope in this paper.

2.1. Introduction

Consider the task of path planning for an autonomous vehicle travelling cross country over rough terrain at high speeds.

Scope of Problem. We can scope the problem in terms of several parameters:

• Overall Goal. In general, the vehicle must achieve some useful goal. The goal may be to move from an initial position to some other distant position, to map an entire area, or to search an area for objects.

• Degree of Optimization. Standards for what constitutes achievement of the goal may vary from satisfaction of certain constraints (e.g. avoid collision with obstacles) to optimization of an arbitrary utility function (e.g. fuel consumption or distance travelled).

• Difficulty of Terrain. In realistic terrain, the vehicle is challenged by regions that would cause tipover, trapped wheels, or loss of traction. Some regions are not traversable at all and others may cause disastrous system failures such as falling into an abyss.

• Depth of Prior Knowledge. Both the goal to be achieved and the characteristics of the environment may be expressed and/or known to varying degrees of detail. The goal may be expressed as a point to achieve, a path to follow, an object to find, or something much more abstract. The environment may be completely known or partially mapped at various levels of detail and richness of expression.

Problem. Within these parameters, we can characterize the problem we have addressed in the following terms:

• The overall goal is to follow a predefined path that, because our goal trajectory is specified outside our system, is assumed to be free of local planning minima.

• We attempt to follow this path as closely as possible while travelling as fast as possible and avoiding any obstacles that may appear.

• We attempt to operate on barren, rolling terrain that may contain ravines, cliffs, and regions that would tip the vehicle.

• We have no prior knowledge of the environment beyond the path we are to follow.

Previous Work. Groundbreaking work on this problem has been conducted at Hughes (Daily et al., 1988; Keirsey et al., 1988; Chang et al., 1986), Lockheed-Martin (Dunlay and Morgenthaler, 1986a, 1986b), JPL (Thompson, 1997; Matthies, 1992), CMU (Feng et al., 1990; Gowdy et al., 1990; Hebert et al., 1988, 1991), INRIA (Hotz, Zhang and Fua, 1993) and LAAS-CNRS (Chatila and Lacroix, 1995), among many others. Generally speaking, early systems were slower than later ones, and speed, excursion, and run time have improved with the general improvement in component technology and algorithms. In 1978, the Stanford Cart thought more than it moved (Moravec, 1980). In 1988, the ALV achieved continuous motion (Olin and Tseng, 1991). In 1994, the vehicles of the ARPA UGV program achieved a few meters per second of speed (Brumitt et al., 1992).

While speed is but one measure of performance, attempting to navigate successfully at high speed requires that many other performance issues be resolved as well. Work on the related problem of road following has also been conducted for a decade in the U.S. and Europe (Dickmanns, 1986, 1995). The German UBM group has achieved speeds of 130 km/hr on the autobahn. This work has addressed dynamic modeling issues in order to achieve high speeds, just as we do in this paper.

Solution. Our general architectural approach to the problem has been to insert an architectural layer between strategic planning and actuator control which we will call tactical control. This layer, being faster than planning yet more intelligent than control, will be able to understand vehicle maneuverability sufficiently well for robust path tracking and will be able to react fast enough for robust obstacle avoidance.

We will describe this technique in the context of mobile robot layered architectures and the deliberative / reactive architectural spectrum. However, such techniques as model-referenced adaptive control and trajectory tracking (Kaufman et al., 1994), model-based predictive control (Clark, 1994), optimal control (Kirk, 1970), and state-space modeling (Ogata, 1967) are well-known in the control engineering literature. An excellent reference on all of these topics is (Levine, 1996).

None of the control techniques we propose are new. However, as the emphasis in mobile robot research begins to shift from answering fundamental formulation questions to achieving real-life performance, classical control engineering techniques promise to point part of the way.

2.2. Preliminaries

Conventions. The paper will introduce many new terms as a device to foster brevity and precision. New terms will be defined and highlighted at their first appearance in the text.

The angular coordinates of a pixel will be expressed in terms of horizontal angle or azimuth ψ, and vertical angle or elevation θ. Three orthogonal axes are considered to be oriented along the vehicle body axes of symmetry. Generally, we will arbitrarily choose z up, y forward, and x to the right:

• x - crossrange, in the groundplane, normal to the direction of travel.

• y - downrange, in the groundplane, along the direction of travel.

• z - vertical, normal to the groundplane.

We will carefully distinguish the range R, measured in 3D from a range sensor, and the projection Y of range onto the groundplane. Generally, both will be measured forward from the sensor unless otherwise noted.

Certain vehicle dimensions that will be important are summarized in Figure 1. One distinguished point on the vehicle body will be designated the vehicle control point. The position of this point and the orientation of the associated coordinate system is used to designate the pose of the vehicle.

The wheelbase is L, and the wheel radius is r. The height of the sensor above the groundplane is designated h, and its offset rear of the vehicle nose is p. The height of the undercarriage above the groundplane is c. Range measured from the sensor is designated R.

[Figure 1: Important Vehicle Dimensions. Many of these dimensions will be used throughout the paper.]

Terminology. Any vehicle which attempts to navigate autonomously in the presence of unknown obstacles must exhibit performance that satisfies a basic set of requirements. At the highest level, if the system is to survive on its own, the vehicle control system must implement a policy of guaranteed safety.

This requirement to guarantee safety can be further broken down into four other requirements on performance and functionality expressed in terms of timing, speed, resolution, and accuracy. In order to survive on its own, an autonomous vehicle must implement the four policies of:

• guaranteed response: It must respond fast enough to avoid an obstacle once it is perceived.

• guaranteed throughput: It must update its model of the environment at a rate commensurate with its speed.


• guaranteed detection: It must incorporate high enough resolution sensors and computations to enable it to detect the smallest event or feature that can present a hazard.

• guaranteed localization: It must incorporate sufficiently high fidelity models of itself and the environment to enable it to make correct decisions and execute them sufficiently accurately.

Subproblems. Analysis shows (Kelly, 1995) that traditional configuration space planning techniques applied to cross-country navigation suffer from problems of poor reliability and stability, and poor computational efficiency. Indeed, our experience has demonstrated regular collisions with obstacles that were seen and reacted to. There were two general reasons for this behavior:

• computational inefficiency: there was not enough time to decide what to do.

• command following problem: the specific trajectory used to avoid the obstacle could not be executed reliably or stably.

Yet another common problem is the well-known local minimum problem. It arises from the use of local rather than global optimization strategies. This problem is considered outside the scope of our work here, though one of the authors (Stentz, 1995) addresses this problem in our target environment in other writings.

2.3. Standard Architectural Model

Consider the hierarchical architectural model of Figure 2. This is a convenient view for organizing the description of the system.

[Figure 2: Standard Model. This type of hierarchical architecture is common. Higher layers tend to be more deliberative etc. whereas lower layers tend to be more reactive etc. Our control formulation of mobility fits in at the "tactical" level.]

Spectrum of Characteristics. Higher levels of the hierarchy tend to be characterized by computation that is more symbolic, logical, search-oriented, sequential, deliberative, and abstract than lower layers. Lower layers tend to be characterized by computation that is more spatial or temporal, arithmetic, repetitive, parallel, reactive, and concrete than higher layers. As a general rule, higher layers exhibit longer reaction times and longer cycle times.

The policy layer concerns itself with the generation and monitoring of mission level objectives such as "stay alive", "find the bomb", etc. Generally, the goals of this level are not subject to much compromise. Policy does not change often and is usually constant over the duration of a mission. Policy is often imparted permanently to a system by its human designers.

The strategic layer corresponds to the deliberative, logical, goal-generating component of autonomous systems. It concerns itself with the larger picture within the confines of policy; with avoiding local planning minima, with overall optimality, and with modeling and memory of the environment. Relative to lower layers, it is often obliged to consume more time in its deliberations of longer term strategic concerns.

By contrast, the control layer corresponds to the real-time command following component of autonomous systems. It concerns itself with the immediate low level issues of doing exactly what it is told to do to the best of its ability. Relative to higher layers, it is often obliged to consume more bandwidth as it reacts almost instantly to short term immediate concerns.

It is clear that both longer term strategic and shorter term reactive concerns will contend for computational resources. Plainly, there are limits to the degree to which any system can be both smart and fast. Faced with this reality, the design problem becomes one of making the best use of available resources.

2.4. Tactical Control Layer

Much of our work resides in the layer we are calling tactical control. We identify this layer in order to more effectively connect the strategic and control layers, and to provide a place to solve many of the problems mentioned earlier.

Strategic - Control Connection. A direct connection of the strategic layer (say, the global path planner) to the control layer (actuator control) becomes less feasible as speeds increase. Further, there are times when one or the other generates incorrect output and the other is either not intelligent enough or not fast enough to compensate.

There are certainly times when the goals specified by the strategic layer must be ignored because it is not aware of the immediate environment but the control layer is not intelligent enough to compensate. There are also times when the control layer is unable to follow its commands but the strategic layer is too slow to alter the command.

Our solution to this problem is to have a third layer more intelligent than control and faster than planning. This layer:

• Views the goals from the strategic layer as recommendations that may be overridden when the situation demands it.

• Incorporates sufficient bandwidth to ensure vehicle survival at the coordinated actuator control level.

• Incorporates a sufficiently accurate model of vehicle dynamics that it understands and adapts to the inability of the controller to follow its commands.

This three-layer architecture imparts a degree of autonomy to the layer below the strategic to allow it to implement basic survival and temporarily disregard strategic imperatives.

Intelligent Predictive Control. We characterize the navigator as an intelligent predictive controller because it closes the overall perceive-think-act loop for a robot vehicle based on intelligent assessment of both the surrounding environment and the predicted abilities of the vehicle to maneuver.

The system shares many characteristics with strategic planning:

• It performs an amount of search, and heuristics are employed to reduce that search.

• It models the environment and responds to it, so it merits the designation intelligent.

• It considers alternatives in an abstract space that is a transformation of reality - in this case, command space instead of the configuration space commonly used in strategic motion planning.

• It considers the consequences of its actions using time continuous precedence information in the form of a system dynamics model.

• It employs some memory of the state of the environment and of the vehicle.

Likewise, the system also shares characteristics with controllers:

• It models the vehicle with a multivariate differential equation.

• It is very concerned with response time and throughput management, as is common of real-time systems.

• It is concerned with latencies and time tags and the precise timing of events.

• It concerns itself with command following - although it may temporarily override its strategic level commands in order to ensure survival.

2.5. Architecture

At the highest level, the system can be considered to consist of five modules, as shown in the data flow diagram of Figure 3.

[Figure 3: System Data Flow. Five components make up the tactical control layer.]

Position Estimator. The Position Estimator is responsible for integrating diverse navigation sensor indications into a single consistent indication of vehicle state. Vehicle state information includes the positions of all actuators and some of their derivatives, and the 3D state of motion of the vehicle body. This module may be the built-in navigation Kalman filter or another system which generates the same output.

Map Manager. The Map Manager integrates discrete samples of terrain geometry or other properties into a consistent terrain map which can be presented to the vehicle controller as the environmental model. It maintains a current record of the terrain immediately in front of the vehicle which incorporates all images necessary, and which automatically scrolls as the vehicle moves.

Vehicle. The Vehicle object is both the control loop system model element and an abstract data structure which encapsulates the vehicle state. It incorporates FIFO queues which store a short time history of vehicle states and commands. Old state information is required by the map manager in order to register images in space. The current vehicle state and old commands are used in the dynamic simulation.
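As a concrete illustration, the following is a minimal sketch of such a time-tagged state queue. All names are hypothetical; the actual RANGER data structures are not given in the paper:

    #include <stddef.h>

    #define HISTORY_LEN 64

    /* Hypothetical sketch of the Vehicle state history FIFO: a ring
       buffer of time-tagged states, so that the map manager can look
       up the pose that was current when a given image was acquired. */
    typedef struct {
        double t;                /* time tag of this state        */
        double x, y, z;          /* position of the control point */
        double roll, pitch, yaw; /* attitude of the body frame    */
    } VehicleState;

    typedef struct {
        VehicleState buf[HISTORY_LEN];
        size_t head;             /* index of the newest entry */
        size_t count;            /* number of valid entries   */
    } StateHistory;

    void history_push(StateHistory *h, VehicleState s)
    {
        h->head = (h->head + 1) % HISTORY_LEN;
        h->buf[h->head] = s;
        if (h->count < HISTORY_LEN)
            h->count++;
    }

    /* Return the stored state whose time tag is closest to t_img,
       e.g. the acquisition time of an image to be registered.     */
    const VehicleState *history_nearest(const StateHistory *h, double t_img)
    {
        const VehicleState *best = NULL;
        double best_dt = 1e300;
        for (size_t k = 0; k < h->count; k++) {
            size_t i = (h->head + HISTORY_LEN - k) % HISTORY_LEN;
            double dt = h->buf[i].t > t_img ? h->buf[i].t - t_img
                                            : t_img - h->buf[i].t;
            if (dt < best_dt) { best_dt = dt; best = &h->buf[i]; }
        }
        return best;
    }

A command queue of the same shape would store the time-tagged commands used by the dynamic simulation.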

Controller. The Controller object is responsible for coordinated control of all actuators. This module includes regulators for sensor head control, an obstacle avoidance tactical controller, and a path following strategic controller. It also incorporates an arbiter to resolve disagreements between the latter two controllers.

Perception. The Perception module is responsible for understanding or interpreting input images and putting them in a form suitable for the Map Manager to process. Examples of perceptual preprocessing include stereoscopy (stereo vision), which computes range from two or more images taken from disparate viewpoints, and terrain-typing, which labels each pixel in an image as rock, road, shrubbery, or tree. Stereo vision is the only perception that is currently supported, although laser range images can be fed directly to the map manager.

2.6. Results

The system has been tested on the vehicles shown in Figure 4. We have used laser range data and stereo range data to build maps of the terrain over which the vehicle must travel. In the former case, excursions of 15 kilometers and instantaneous speeds of 15 km/hr have been achieved while tracking a coarsely specified path. Average speed was on the order of 7 km/hr.

[Figure 4: Some Navigation Testbeds. A modified military HMMWV and a RATLER planetary rover prototype.]

3. Perception

This section discusses the motivation behind, and implementation of, the 3D perception algorithm for extracting relevant geometry from a range image sequence. We propose a relatively simple method of approaching the minimum required perceptual throughput in a terrain mapping system, and hence the fastest possible update of the environmental model. The technique proposed will be relevant to any application that models the environment with a terrain map or other 2-1/2 D representation.

3.1. Introduction

The surface of the surrounding terrain can be sensed by any number of means, but the two most commonly used ones in outdoor scenarios are laser rangefinders and stereo vision. We represent the surface of the surrounding terrain by a sampled, uniform density data structure often called a terrain map or cartesian elevation map.

Problem. When attempting to navigate over rough terrain, few assumptions about the shape of the terrain ahead can be made. It can be necessary to convert images into a full description of the geometry of the scene at relatively high rates. As a result, the speed of rough terrain navigation is typically limited by the throughput of the perception system. We will call this predicament the perceptual throughput problem. Perceptual throughput can be expressed in units of range or intensity pixels measured per second, or its equivalent.

We address here this typical performance limitation of autonomous outdoor vehicles. Our companion analysis (Kelly and Stentz, 1998) suggests that much of the computational resources used to image and interpret the environment can be wasted in mobility scenarios. This waste occurs for three principal reasons:

• The vertical field of view is often too wide from a throughput perspective. Obstacles and other hazards normally appear in the field of view long before they can be resolved, and long after they cannot be avoided.

• Sensor frame rate is often too fast. The sensor vertical field of view is normally aligned with the direction of travel, so that image sequences normally contain much redundant information.

• Square pixels are not optimally shaped. The projection of image pixels on the groundplane is normally elongated in the wrong direction for robust obstacle detection and minimum throughput.

From the days of the Stanford Cart to the Autonomous Land Vehicle, vehicle speed has been limited, at least in part, by limited perceptual throughput. We will show how to eliminate much of this inefficiency in order to easily meet the perceptual throughput requirements.

Solution. One approach to reducing redundant information is the use of laser and video line scanners. These have seen use in specialized high-speed inspection applications for some time. In satellite applications, synthetic aperture radar has used vehicle motion to provide the scanning motion of the sensor along the direction of travel. The essential principle involved in these examples is to avoid scanning the sensor when either the motion of the vehicle or the motion of the environment already accomplishes the scanning.

However, the use of line-scanned sensors is difficult on rough terrain because abrupt attitude changes of the vehicle body cause holes in the coverage of the sensor. Software adaptation provides the best of both worlds because it gives the ideally focussed attention necessary for high speed and the wide field of view necessary for rough terrain.

In our companion paper, we showed that required perceptual throughput is polynomial in the reaction distance. We have also shown that straightforward techniques promise to significantly increase the overall efficiency of terrain mapping perception algorithms. This improvement can be accomplished by exploiting constraints and several assumptions that are valid in most outdoor mobility scenarios.

The basic idea is to selectively process only the data that matters in range imagery. Known here as adaptive perception, the technique also has the beneficial side effects of automatically adapting to changes in vehicle speed and attitude, and to the local slope of the imaged terrain. Through this technique we achieve near minimum perceptual throughput and hence near maximum safe vehicle speeds.

A fundamental tenet of active vision (Aloimonos et al., 1988; Bajcsy, 1988) is to direct attention to the part of the scene that is relevant to the task at hand, rather than to interpret and model the scene. Our work provides a concrete example of at least one aspect of active vision. Although we do model the scene, we actively search for the data we need.

3.2. Preliminaries

We will use two primary techniques for reduction of the perceptual inefficiencies mentioned above:

• We will actively maintain a focus of attention and process perceptual data only in a region of interest that contains the most useful information.

• We will actively and intelligently subsample the data within that region of interest for adequate - but not unnecessarily high - resolving power.

These two strategies will be referred to collectively as adaptive perception - the organizing principle of our approach to terrain mapping for high speed mobility.

Terminology. We will call a region of space for which sensory data is required a region of interest, abbreviated ROI.

It also will be important to distinguish the coordinate system implied by the sensor image - called the image plane - from a set of coordinates attached to the terrain - called the ground plane.

An ROI defined on the groundplane will be called a ground plane ROI. Such a region will have an image in the image plane which will be called an image plane ROI.

Let θ and ψ be the elevation and azimuth coordinates of an image pixel. Computing derivatives of the range image of flat terrain leads to the differential relationships between groundplane (x, y) resolution and image plane resolution (dθ, dψ):

dy = (R²/h) dθ        dx = R dψ

[Figure 5: Regions of Interest. A region of interest in the ground plane forms a corresponding image in the image plane.]
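To give a sense of scale, assume a hypothetical sensor height of h = 2 m (a representative value; the actual height is not stated here) and the 0.75 m map resolution used in our results. At a range of R = 30 m, the required image plane resolutions are approximately

dθ = (h/R²) dy = (2 × 0.75)/30² ≈ 1.7 mrad ≈ 0.1°        dψ = dx/R = 0.75/30 = 25 mrad ≈ 1.4°

so elevation resolution must be more than an order of magnitude finer than azimuth resolution at this range - the same anisotropy noted earlier in the discussion of pixel footprints.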


The completely correct transformations also depend on the local terrain gradients. These are unknown a priori because terrain geometry is the very thing the sensor is supposed to measure.

A quantity of central concern to us will be the distance that the vehicle requires to react to an external event such as the appearance of an obstacle in the sensor field of view. This distance will be called the response distance, and its precise value will depend on:

• the speed of the vehicle when the event happens
• when the response is considered to be complete
• the maneuver chosen as the response

Subproblems. We have, at this point, nominated adaptive perception as a solution to the perceptual throughput problem. Unfortunately, this leads to a new set of problems, but we will be able to solve them with additional strategies and clearly identified assumptions.

From the point of view of responding robustly to obstacles, it is best to detect obstacles early, or equivalently, at high range from the vehicle. However, from the point of view of sensor resolving power, it is best to detect obstacles as close as possible to the vehicle where data quality and spatial resolution tend to be highest. In other words, the farther away an obstacle is detected, the easier it is to avoid, but the harder it is to detect it robustly. When either resolution or range is limited, we can detect an obstacle robustly or avoid it robustly, but not both. This is the response-resolution tradeoff.

We will manage this tradeoff by explicitly computing the minimum distance required for robust obstacle avoidance and looking for obstacles only beyond this distance. This technique will be called adaptive lookahead.

The mapping from the groundplane ROI to the image plane ROI is both nonlinear (a projective transform) and a function of the unknown shape of the terrain. It seems, therefore, that it is not at all straightforward to efficiently find the image plane ROI. Consider, for example, the straightforward solution of converting coordinates of all pixels in the image and then comparing their positions to the groundplane ROI. After pixels that are not in the groundplane ROI are eliminated, one is left with the image plane ROI. While this would certainly work, it can be far too inefficient to be useful.

For terrain mapping, the largest computational cost of a range pixel is the conversion of its coordinates from the image plane to the ground plane. In attempting to select only the data of interest by converting the coordinates of all pixels, one has already done most of the perception task anyway. Any straightforward attempt to selectively process data in a region of interest apparently falters because the problem of selection is as difficult as the problem of perception.

We will use assumptions to decouple these problems. When the assumptions are combined with an appropriate choice of the groundplane ROI, we will be able to partially infer the shape of the image plane ROI and compute its position by very efficient image plane search. The algorithm for doing this will be called adaptive sweep.

The sampling problem is the nonuniform and anisotropic distribution of pixels on the groundplane which corresponds to a uniform and isotropic distribution of the corresponding pixels in the image plane. The Jacobian matrix which relates the two distributions depends on both the image projective transform and the local terrain slope at each point. The impact of this problem is that not only is the shape of the image plane ROI distorted and of unknown position, but the local pixel density required to sample the groundplane uniformly is both unknown and different everywhere in the image plane ROI.

This variation in pixel density is shown in Figure 6 for flat terrain. Each ellipse represents the footprint of a pixel. It is the variation in density which we are illustrating, not the density itself, so the images were subsampled to avoid clutter.

[Figure 6: Sampling Problem. Equally spaced image pixels are not equally spaced on the groundplane - even for flat terrain. The situation is worse for rough terrain.]

We will solve this problem to some degree by choosing the best compromise and, at other times, by actively computing the required image plane resolution from extrapolation. The algorithm for doing this will be called adaptive scan.

Assumptions. Certain assumptions will be key components of our approach - either because they must be made or because they can be made with little or no loss of generality.


One of our most fundamental assumptions will be that the environment is self stationary. That is, the environment will be supposed to consist of rigid bodies whose relative positions are fixed - at least while they are in the field of view of the environmental sensor. While the bodies comprising the environment are self stationary, our vehicle is in motion with respect to them. The value of this assumption is that it allows us to image a point in the environment only once and, because only the vehicle moves, its subsequent position relative to the vehicle at any later time can be inferred solely from the vehicle motion.

We will use the term small incidence angle assumption to refer to the situation where image pixels intersect a theoretical flat world at glancing angles. This is guaranteed to be the case if:

• the sensor is mounted on the vehicle roof, and
• pixels inside the response distance are ignored, and
• the vehicle speed is relatively high

because, under these conditions, the sensor height is small relative to the range of any pixel.

In the imaging geometry of Figure 7, the range R, its projection Y, and the sensor height h are related by Y² + h² = R², and this assumption implies the validity of the following approximations:

h/R ≪ 1   and   Y ≈ R

We will call R the range and Y the range projection. It is easy to show that the relative error incurred in assuming that these two quantities are the same is the square of the ratio h/R.

[Figure 7: Imaging Geometry. The height of the sensor above the ground plane is normally small compared to the ranges measured.]
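For completeness, a short derivation of this error bound using the binomial approximation - added here as a worked step, since only the result is stated above:

(R − Y)/R = 1 − √(1 − (h/R)²) ≈ (1/2)(h/R)²

so the relative error is on the order of the square of h/R. For a hypothetical sensor height of 2 m at a range of 20 m, it amounts to only about 0.5%.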

A final important assumption is the assumption that the environment is 2-1/2 dimensional with respect to the direction of gravity. That is, at all points, a line aligned with gravity pierces the first reflecting surface of the environment at most once. This assumption justifies a terrain map representation, and it also allows us to assume that range is a near monotonic function of image elevation angle. If we eliminate vegetation which overhangs its own supports from consideration, the worst case violation of this monotone range assumption is the reduction in range that occurs when a vertical surface is scanned, as shown in Figure 8.

[Figure 8: Nonmonotone Range. Range is a nonmonotone function of image elevation angle at a vertical or near-vertical surface: θ2 > θ1 and R2 > R1, but θ3 > θ2 while R3 < R2.]

The computational advantage of the assumption is that once the maximum range is found in an image, all pixels above it in the same column of the image can be safely assumed to be beyond that range. It will turn out later that this assumption will only be used in laser rangefinder implementations of adaptive perception. Stereo vision will not require it.

Design. Our basic technique for efficient extraction of the groundplane ROI data is to convert the membership test to image coordinates, rather than to convert each data point to world coordinates. The relationships involved are nonlinear. Constant thresholds in one space are functions when transformed to the other, as shown in Figure 9.

[Figure 9: Converting Membership Test Coordinates. It is much more efficient to convert coordinates of the ROI membership test from world to image coordinates than to convert the coordinates of all range data from image to world coordinates.]

However, if we are willing to incur a small error and allocate safety margins appropriately, constant minimum and maximum thresholds can easily be computed for range, azimuth, and elevation which correspond to a given cube in world coordinates.

The fact that the small incidence angle assumption is valid implies several things of importance:

• the extra volume of the world coordinate ROI shown in Figure 9 is minimal
• the vertical field of view required to image the groundplane ROI is small¹
• range almost equals range projection

1. It is not necessary to scan for obstacles that are higher than the smallest height which will be considered hazardous, so the vertical angular width can remain small. Plainly, after the bottom few feet of a tree are perceived, the question of whether there is more tree above is irrelevant.


These derived assumptions allow us to inexpensively decouple the problem of selection from that of perception because reasonably accurate constant thresholds can be derived in image space. Only those pixels which satisfy the resulting inexpensive image plane ROI membership test need have their coordinates converted for mapping purposes.
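A minimal sketch of this threshold computation follows, assuming flat ground, a forward-looking sensor at height h, and a groundplane ROI spanning the downrange interval [Ymin, Ymax] with azimuthal half-width Xmax. All names are hypothetical, and the safety margins mentioned above are omitted:

    #include <math.h>

    /* Hypothetical sketch: convert a groundplane ROI into constant
       image plane gates, assuming flat ground. The range gate uses
       R = sqrt(Y*Y + h*h); elevation is measured as depression below
       the horizontal, tan(theta) = h/Y; azimuth obeys tan(psi) = X/Y. */
    typedef struct {
        double r_min, r_max;    /* range gate                  */
        double th_min, th_max;  /* elevation (depression) gate */
        double psi_max;         /* azimuth half-width gate     */
    } ImageRoi;

    ImageRoi roi_thresholds(double y_min, double y_max,
                            double x_max, double h)
    {
        ImageRoi roi;
        roi.r_min = sqrt(y_min * y_min + h * h);
        roi.r_max = sqrt(y_max * y_max + h * h);
        roi.th_min = atan2(h, y_max);       /* far edge: smallest depression */
        roi.th_max = atan2(h, y_min);       /* near edge: largest depression */
        roi.psi_max = atan2(x_max, y_min);  /* widest azimuth at near edge   */
        return roi;
    }

A pixel at elevation θ, azimuth ψ, and measured range R then passes the membership test with a handful of comparisons against these constant gates, with the thresholds padded in practice to cover the extra volume noted above.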

Our adaptive perception algorithm confines the processing of range geometry in any cycle of computations to an image plane ROI with the following properties:

• It extends beyond the vehicle response distance.
• Its size is the distance moved since the last cycle.

The algorithm has three conceptual parts, as outlined below.

Adaptive lookahead means the process of adapting the position of the groundplane ROI to assure that there is sufficient time to react to hazards. There is some minimum range inside of which it is unnecessary to look because the vehicle is already committed to travel there. Also, there is some maximum range beyond which it is unnecessary to look because there will be time to look there later. In detailed implementation, the algorithm can set the minimum range to the response distance, or alternately, set the maximum range to the response distance plus the distance travelled per cycle.

Adaptive sweep is the process of adapting the width of the groundplane ROI to assure that there are no holes or excessive overlaps in the coverage of the sensor. The ROI width is set to the distance travelled since the last computational cycle. This determines both the maximum and minimum range projections in the groundplane, and they are trivially converted to the image plane ROI.

Adaptive scan is the process of managing resolution within the image plane ROI in order to achieve uniform groundplane resolution. For the data of interest, it will be possible to compute an approximate mapping from groundplane resolution to image plane resolution, and images will be subsampled by appropriate factors to achieve near uniform groundplane resolution.

Implications. Certain implications of using the adaptive perception algorithm are worth noting here. The minimum computational cost of this approach to perception has implications for the real-time performance of autonomous vehicles. The maximum useful range of a perception sensor is often limited by reasons of eye safety, computational cost, limited angular resolution, etc. Given this limit, the highest safe vehicle speeds are normally achieved by minimizing reaction times.

The only element of reaction time that can be changed easily is often the component due to the time required to process imagery or perform other computations. Therefore, to the degree that our approach minimizes the computational cost of perception, it also increases the vehicle speeds that can be achieved.

Our software adaptive approach to perception has the side effect of computationally pointing the sensor vertical field of view by responding to both changes in the vehicle attitude and changes in the shape of the imaged terrain. While the shape of the range window may be very irregular in image space, it always corresponds to a regular semi-annulus in the ground plane. If the vertical field of view is wide enough and the range sensor is fast enough in terms of range pixel rate, this software adaptation is superior to the technique of physically stabilizing the sensor because it responds instantaneously.

3.3. Adaptive Lookahead

The three techniques described in the previous section can be applied to any range image generated by an imaging laser or radar sensor or a stereo vision system. It is also possible to embed adaptive perception into a stereo vision algorithm - which will be the subject of a special section. For both classes of imagery, range imagery and stereo pairs, the adaptive lookahead algorithm is common.

A vehicle may attempt to turn to avoid obstacles and maintain its forward speed, it may elect to stop completely, or it may choose any other arbitrary trajectory. The choice of trajectory determines the details of computing the response distance. For our purposes, adaptive lookahead is implemented by computing the distance required to execute a 90° turn at the current speed. This gives the maximum range of the range window.
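One plausible form of this computation, sketched below, charges the vehicle for the distance covered during the perceive-think-act latency plus the path length of a 90° turn at the tightest radius the current speed allows. The latency T and the maximum lateral acceleration a_lat are assumed parameters, not values taken from this paper:

    /* Hypothetical sketch of the adaptive lookahead distance: the
       distance covered during the system latency T plus the arc
       length of a 90 degree turn at speed v, where the turn radius
       is bounded by the maximum lateral acceleration a_lat.       */
    double response_distance(double v, double T, double a_lat)
    {
        const double HALF_PI = 1.5707963267948966;
        double radius = (v * v) / a_lat;  /* tightest safe turn radius  */
        double arc = HALF_PI * radius;    /* path length of a 90° turn  */
        return v * T + arc;               /* max range of range window  */
    }

At the 3 m/s speed of our results, with hypothetical values T = 0.5 s and a_lat = 2 m/s², the window would extend to roughly 8.6 m.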

The groundplane ROI must be defined very precisely in terms of distances from some specific point on the vehicle at some specific time. The problem of finding the data in this region in an image taken previously involves several aspects of time delays and geometric offsets.

• The sensor is not mounted at the vehicle reference point, so the ROI is adjusted for this offset.

• The vehicle is not itself a point, so the ROI must be enlarged to provide data at the positions of the wheels forward and aft of the reference point.

• There may be significant delay associated with the acquisition of an image, so the ROI must be adjusted for the age of the image.

• The most recent vehicle state estimate is itself somewhat old, and computation takes finite time. The ROI may need to be adjusted for these effects depending on the instant with respect to which the ROI is defined.

3.4. Adaptive Sweep/Scan - Range Imagery

If one starts with a dense range image, the algorithm consists of the mapping of the range window into image space and the extraction of the data.

Adaptive Sweep. Terrain roughness and nonzero vehicle roll mean that the position of the range window in the image is different for each column, so the range window is processed on a per column basis. In order to robustly find the range window, each column is processed in the bottom-to-top direction.

A conceptual C code fragment is given in Figure 10. The image itself is of dimensions rows by cols. A constant rectangular subwindow of the image is searched which is delimited by the image plane coordinates start_row, start_col, end_row, and end_col. This region is known to always contain the ROI as discussed above.

The monotone range assumption appears as the break statement after the first conditional of the inner loop. The start_col and end_col variables implement a fixed azimuth and elevation angle window within which the range window always lies on typical terrain.

Adaptive Scan. The variables row_skip and col_skip have values corresponding to the constant image subsampling factors that give the most acceptable groundplane resolution. In the case of range images, adaptive scan is implemented by a literal subsampling of the image. Also, this subsampling applies to both the data in the ROI and the data below the ROI that is not processed. That is, adapting the resolution can benefit the speed of handling both the processed and the unprocessed data.

Because the differential transformation from the image plane to the groundplane is unknown, a perfectly robust, optimal subsampling solution is not available. However, a spectrum of approaches to resolution management is available based on the frequency of update of the row_skip and col_skip variables and how they vary with range for an assumed flat world. They can be computed based on:

• the highest projected value of the ROI maximum range, Rmax, based on the known speed limits of the vehicle.

• the value of the ROI maximum range, Rmax, for the current computational cycle.

• the instantaneous value of range, R, at the current pixel.

These options have been listed in order of increasing speed and decreasing robustness.

In the least adaptive form of adaptive scan, the number of pixels skipped in the horizontal and vertical directions can be set based on the average or worst case expected value of the maximum range.

In the next most adaptive form, the image plane resolutions are recomputed for each image based on the current ROI maximum range:

dψ = dx / Rmax        dθ = (h / Rmax²) dy

In the most adaptive form, image plane resolutions can be recomputed based on the instantaneous range image values:

dψ = dx / Rmax        dθ = (h / R²) dy

However, it can be awkward to vary the azimuth resolution as a function of range if one chooses to process the image by columns.

The ratio of maximum to minimum range is normally small, so the variation in row_skip is also small. Under this assumption, a good compromise is to use the worst case azimuth resolution and the instantaneously computed elevation resolution.
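A minimal sketch of that compromise follows. It derives the skip factors by dividing the required angular steps by the sensor's per-pixel angular resolutions (values such as those in Table 1, converted to radians); all names are hypothetical:

    /* Hypothetical sketch of adaptive scan subsampling factors.
       dx, dy: desired groundplane resolution (m); h: sensor height (m);
       dpsi_pix, dtheta_pix: sensor angular resolution (rad/pixel).     */
    int col_skip(double dx, double r_max, double dpsi_pix)
    {
        double dpsi = dx / r_max;           /* worst case azimuth step */
        int skip = (int)(dpsi / dpsi_pix);
        return skip > 1 ? skip : 1;         /* never skip every pixel  */
    }

    int row_skip(double dy, double r, double h, double dtheta_pix)
    {
        double dtheta = (h * dy) / (r * r); /* instantaneous elevation step */
        int skip = (int)(dtheta / dtheta_pix);
        return skip > 1 ? skip : 1;
    }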

Although the flat world assumption may seem inappropriate on rough terrain, the use of it in adaptive scan works well in practice.

    j = start_col;
    while (j <= end_col + col_skip) {
        i = end_row;
        while (i >= start_row - row_skip) {
            R = range(i, j);
            if (R > Rmax) {
                /* Monotone range assumption: every pixel above this one
                   in the column can be assumed to be beyond Rmax. */
                break;
            } else if (R < Rmin) {
                /* Not yet inside the range window; keep climbing. */
                i -= row_skip;
                continue;
            } else {
                process_pixel_into_map();
            }
            i -= row_skip;
        }
        j += col_skip;
    }

[Figure 10: Adaptive Sweep Algorithm. The range window is processed on a per column basis in order to robustly extract the data of interest.]

3.5. Adaptive Sweep/Scan - Stereo Imagery

The principles of the earlier section could be applied directly to the output of a stereo vision system. Yet, because stereo also consumes computational resources, it seems worthwhile to investigate whether similar techniques can be employed inside of the stereo algorithm itself in order to avoid computing range pixels that would subsequently be eliminated anyway.

Traditionally, the stereo problem is cast as one of determining the range for every pixel in the image. Traditional stereo finds the range for each possible angular pixel position. Conversely, our adaptive approach to stereo finds the angular positions in the image plane of groups of pixels with each possible range value. It determines those pixels whose range value falls within a small range window, and it does so without computing the ranges of pixels which are not of interest. This principle is sometimes called range gating in laser rangefinders which employ it.

The motivation for the approach in the case of stereo is the observation that the region of terrain which is beyond the vehicle response distance usually corresponds to a very narrow range in stereo disparity space, as shown in Figure 11.

[Figure 11: Disparity Window. For high speed motion, the range window is distant from the vehicle. Quadratic growth in stereo range resolution implies that the entire range window is spanned by a few disparities.]

The nonlinear relationship between range and disparity also implies that range resolution is relatively poor at high ranges, so the computation of the range of low range pixels can be wasteful. In one sense, we are obliged to accept poor range resolution, so we may as well take computational advantage of it where it helps.

Embedded Adaptive Sweep in Stereo Vision. For stereo ranging systems, the basic principle of the range window can be converted to a disparity window¹ for a stereo system because the range and disparity are related by the stereo baseline. However, as before, the problem of selection - of determining membership in a range gate without computing the range - seems difficult.

The basic stereo configuration for perfectly aligned cameras is given in Figure 12. It is useful to remove the dependence of disparity on the focal length by expressing disparity as an angle. Define the normalized disparity thus:

δ = d/f = b/Y

[Figure 12: Stereo Triangulation. The relationship between disparity d, range Y, baseline b, and focal length f is derived from similar triangles.]

Then, for a range window between 25 meters and 30 meters, and a stereo baseline of 1 meter, the angular width of the corresponding disparity window is:

Δδ = 1/25 − 1/30 = 0.0067 rad = 0.38°

Thus, the range of disparities which corresponds to a typical range window is roughly 1% of a typical camera field of view (40°). In other words, the image coordinates of corresponding points in both images are very close to each other if the range of the point is beyond the response distance.
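In pixel units, the same window follows directly from the triangulation relation d = bf/Y. A minimal sketch, assuming perfectly aligned cameras and a focal length f expressed in pixels:

    /* Hypothetical sketch: convert a range window [y_min, y_max] into
       a stereo disparity search window using d = b * f / Y (similar
       triangles). b: baseline in meters; f: focal length in pixels.  */
    void disparity_window(double y_min, double y_max, double b, double f,
                          double *d_min, double *d_max)
    {
        *d_min = b * f / y_max;  /* far edge: smallest disparity  */
        *d_max = b * f / y_min;  /* near edge: largest disparity  */
    }

For the 25 to 30 meter window above, with b = 1 m and a hypothetical f of 600 pixels, the search spans only disparities 20 through 24 - a handful of candidates per pixel.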

In traditional area-based stereo, correlations (or any of a number of other measures of similarity of two image subwindows) are computed for a wide range of disparities. Then the algorithm searches along the curve generated for each pixel for the disparity, d*, corresponding to the global correlation maximum. The case for normalized image crosscorrelation is illustrated in Figure 13.

[Figure 13: Disparity Search. The global maximum correlation is the best match. Limiting the search can lead to the wrong answer.]

1. There is a slight difference in the geometry of a stereo range image (perspective) compared to a rangefinder image (spherical polar). Therefore, a disparity window corresponds to a window on the y coordinate and not the true polar range. In most circumstances, this distinction can be safely ignored.


If, however, the search were limited to the disparity window whose boundaries are dmin and dmax in Figure 13, the point of maximum correlation that would be found would be only a local maximum. No information other than the absolute value of the measure of similarity would indicate this. If a range image were generated based on the results of this limited disparity search, the image would contain:

• correct ranges for pixels whose true range happened to fall within the range window searched.

• incorrect ranges for pixels like the one illustrated above which defeated our best attempts to identify them at this stage of processing.

Nevertheless, the environment is often smooth, and this smoothness leads to the property that correct ranges tend to form large smooth regions whereas incorrect ones do not, as illustrated in Figure 14.

[Figure 14: Spurious Disparities. Correctly ranged pixels tend to form large connected smooth regions. Incorrect ones do not.]

It is well known that spurious matches occur fundamentally because regions which do not correspond physically actually look more or less the same. Several solutions to this repetitive texture problem help the situation somewhat, but the simple technique of computing connected components and removing small regions (Matthies et al., 1996) works effectively and is computationally free because a disparity image cleanup pass is often required even when a wide disparity range is searched.
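To illustrate, the cleanup pass can be as simple as labelling connected regions of valid disparity and deleting the small ones. The sketch below (in Python) is a minimal version of this idea; the region size threshold and the use of a generic labelling routine are our illustrative assumptions, not details of the cited implementation.

    import numpy as np
    from scipy import ndimage

    def clean_disparity(disparity, min_region_size=50):
        # Label connected regions of pixels that produced a disparity;
        # invalid pixels are marked NaN.
        valid = np.isfinite(disparity)
        labels, num_regions = ndimage.label(valid)
        # Count the pixels in each region; small regions are presumed
        # to be spurious matches and are removed.
        sizes = np.bincount(labels.ravel())
        cleaned = disparity.copy()
        for region in range(1, num_regions + 1):
            if sizes[region] < min_region_size:
                cleaned[labels == region] = np.nan   # mark as "no range"
        return cleaned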

Embedded Adaptive Scan in Stereo Vision. In the case of stereo vision, the situation for adaptively changing resolution is more complex because range resolution and angular resolution are coupled. That is, once angular resolution is fixed, range resolution is also fixed, yet each has independent constraints imposed on it by the application. It is not possible, for instance, to aggressively reduce horizontal image resolution (as would be done with a range image) at the input to stereo, because range resolution will also be dramatically and unacceptably degraded.

The least that can be done, however, is to compute the degree to which the output range image would be subsampled; the latter stages of stereo (the stages past the correlation computation) can then simply ignore the unwanted pixels. Before correlation, those unwanted pixels may still be needed to participate in computing the correlations.

3.6. Results

The following two sections present performance results for adaptive perception based on laser range images and stereo vision. For these results, the vehicle speed is 3 meters/second and the resolution of the generated terrain map is 0.75 meters in both horizontal directions. An oversampling factor of 2 is also incorporated into adaptive scan as a safety margin to protect against terrain undersampling.

While adaptive perception resamples a range image for optimum coverage of the terrain, the specific attributes of the range sensor and cameras used for the following results are given in Table 1:

Range Image Adaptive Perception. In a typical image, the pixels that are actually processed by the adaptive perception algorithm form a horizontal band that is jagged-edged and of varying width. The width of the band decreases if the vehicle speed increases, because adaptive lookahead will move the window up in the image, where a smaller width projects onto the same groundplane distance.

The following figure gives a sequence of range images for a run of our navigation system simulator1 on very rough terrain using a simulated rangefinder, where the pixels that were actually processed fall between the thin black lines. On average, only 75 range pixels out of the available 10,000 (less than 1%) were processed per image.


Figure 14: Spurious Disparities. Correctly ranged pixels tend to form large connected smooth regions. Incorrect ones do not.


Table 1: Sensor Parameters

    Attribute              ERIM rangefinder    CCD camera
    Image Rows             64                  640
    Image Cols             256                 486
    Hor. Field of View     80°                 20°
    Vert. Field of View    30°                 20°
    Hor. Angular Res       0.3125°             0.0412°
    Vert. Angular Res      0.4688°             0.0312°
    Frame Rate             2 Hz                30 Hz

1. The system performs identically on real images but simulated ones were used here in order to illustrate several points in limited space.


In terms of area imaged per second, the system throughput is increased by a factor of 100, or two orders of magnitude, over the method of simply processing all image data.

There are five range images arranged vertically on the left. These are rendered as intensity images where darker greys indicate increasing distance from the sensor. The terrain map constructed by the perception system is rendered on the right. The top figure shows the map as an image where lighter greys indicate higher elevations. In the center of the map is the vehicle at the position where the 5th image was captured. The lower right figure is the same terrain map rendered as a wireframe surface from the vantage point of the initial position.

There are three hills in the scene whose range shadows are clearly visible in the terrain map. In the first image, the vehicle is accelerating but still travelling relatively slowly. The range window is relatively wide and positioned near the bottom of the image. The first hill is in the range window. In the second image, the second hill is in the range window and the first hill has already been processed. In the third image, the third hill is now in the range window. In the fourth image, the vehicle is driving past the first hill and is rolled to the right because of it. This rolls the image to the left and the algorithm compensates appropriately. In the fifth image, the range window has moved past the third hill to the flats beyond, and a fourth hill is barely visible in the distance.

Actual perception performance is given in Table 2 for a series of images of flat terrain. In the table, the nonadaptive value corresponds to the result obtained by processing all pixels in the ERIM range image. The adaptive value is the value obtained by our range image algorithm:

The results do not scale linearly with pixels processed because the adaptive result includes a constant setup time. Nonetheless, the adaptive result is 16 times faster than the nonadaptive result, and if the ERIM sensor had higher angular resolution, the improvement would be proportionally better. The system uses barely adequate spatial resolution and eliminates redundant measurements and hence achieves minimum throughput.

Stereo Vision. The following figure illustrates the operation of embedded adaptive stereo on two horizontal baseline input images. These are images of a barren ravine road near CMU taken from inside the ravine. The initial input images appear at the left. To the right of these are the nonadaptively processed disparity and range images. To the extreme right are the adaptively processed disparity and range images. The disparity images are shown to demonstrate the spurious matches which are caused by incorrectly chosen extrema in the correlation versus disparity curves.

Figure 15: Adaptive Rangefinder Perception. The processing of five range images is illustrated as the vehicle drives through an obstacle course of three hills.


Table 2: Rangefinder Adaptive Perception Performance (SPARC 20)

    Attribute                     Nonadaptive    Adaptive
    Pixels Processed Per Image    16384          75
    Run Time                      0.352 secs     0.022 secs

Figure 16: Adaptive Horizontal Baseline Stereo. The incorrect disparities due to incorrect matches are cleaned up with an efficient filter.



A breakdown of this run is shown in Table 3:

Table 3: Stereo Adaptive Perception Performance (SPARC 20)

    Attribute         Nonadaptive    Adaptive
    Output Rows       120            48
    Output Cols       128            128
    Disparities       60             10
    Preprocessing     102 msecs      41 msecs
    Correlation       683 msecs      69 msecs
    Postprocessing    754 msecs      74 msecs
    Total Runtime     1539 msecs     203 msecs

4. Terrain Mapping

This section discusses the motivation behind, and implementation of, our highly efficient approach to terrain mapping. We discuss methods for eliminating the need to copy and/or interpolate the data structure to incorporate incoming new data, methods to compensate for sensor motion, and methods for the representation of terrain shape uncertainty.

4.1. Introduction

Terrain mapping is the process by which surface descriptions, obtained from different vantage points, are accumulated into a consistent environmental model (Hebert et al., 1988). In order to provide the rest of the system with a single, coherent, uniform density data structure, we transform images into a regularly-spaced cartesian grid called a Cartesian Elevation Map (CEM).

Problem. In our early attempts to map terrain for a fast moving outdoor vehicle, we encountered severe computational inefficiency problems for several reasons:

• The treatment of the motion of the vehicle through the environmental model necessitated a physical shift of data that was very expensive.

• Interpolation of the values of unknown cells from their neighbours was very expensive.

• Massive distortions of reality due to sensor motion were introduced as the vehicle speed increased.

Solution. We have developed methods to manage these problems that include:

• A special terrain map data structure and access routines.

• An approach to interpolation based on interpolating predicted vehicle state rather than terrain geometry.

• Real-time methods for processing sensor data.

4.2. Preliminaries

Terminology. In the map, each cell encodes an elevation zij, where the z coordinate is unique for any pair i,j and is referenced to some fixed coordinate system, called the navigation coordinate system, with respect to which the vehicle moves. Individual elevation buckets in a terrain map are called cells to distinguish them from range image pixels.

Subproblems. Once we have implemented a wrapping terrain map data structure (discussed below), we will face a new problem in distinguishing data from two different regions of space that happen to fall into the same cell. We will be able to manage this problem through the introduction of a new data field - the “age” of a cell.

Assumptions. Under some circumstances, natural outdoor terrain is well approximated by a surface expressed as z = f(x,y), where the z axis is aligned with the local gravity vector. An important exception to this assumption is trees and other large vegetation. We will assume that either we operate in barren terrain or that we can safely fill in the space beneath branches in our models. Thus, the use of a terrain map normally means that the 2-1/2 D world assumption is being adopted.

Design. Our implementation includes the following elements:

• A 2D ring-buffer implementation of a terrain map that accommodates vehicle motion through modulo arithmetic indexing.

• Methods for processing perceptual data that never require copying, traversal, or interpolation of the terrain map.

• Straightforward methods to compensate incoming geometry for range camera motion.

Implications. Before these methods were first adopted, over half of our processor time was consumed in simply managing the terrain map. That is, map management was more expensive than perception and planning combined. After they were adopted, the cost of terrain map management became so small that it was insignificant.

4.3. Wrappable Map

Using 30 meters of lookahead in planning, 1/6 meter resolution, and 20 bytes of memory per map cell, over 1/2 megabyte of memory is required to store a typical map. If this map is stored as a physically coherent block of memory, it must be physically shifted and copied after the acquisition of each image in order to account for the relative motion between the vehicle and the terrain.





2D FIFO Queue. Our solution to this problem is a classical one from computer science - the FIFO queue. A simple array accessed with modulo arithmetic suffices to logically scroll the map as the vehicle moves by physically wrapping around in memory. As in all FIFOs, the queue size must be chosen to exceed the worst case amount of memory required.

Let the rows and columns of the terrain map be aligned with the (x, y) axes of the navigation frame and be divided into cells of resolution dx by dy. Let the map width and height be W and H respectively. The indices into the array are determined by modulo arithmetic as follows:

$$i = \left\lfloor \frac{\mathrm{rem}(x/W)}{dx} \right\rfloor \qquad j = \left\lfloor \frac{\mathrm{rem}(y/H)}{dy} \right\rfloor$$

The operator ⌊·⌋ is the least integer function and rem(·) is the floating point remainder function.

The operation of the technique when applied to three successive images is indicated in Figure 18 for both a physically scrolling and a wrappable map data structure.
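A minimal sketch of this indexing in Python follows (function and parameter names are ours). The lookup never copies or shifts memory; the aliasing of distant points into the same cell is exactly the problem addressed by the cell tags discussed next.

    def map_indices(x, y, W, H, dx, dy):
        # rem(x/W) in the notation above; Python's % operator returns a
        # non-negative remainder for positive W, even for negative x.
        i = int((x % W) / dx)
        j = int((y % H) / dy)
        return i, j

    # For a 10 m by 10 m map, the world points (15, 25) and (5, 15)
    # alias into the same cell:
    # map_indices(15.0, 25.0, 10.0, 10.0, 0.5, 0.5)
    #     == map_indices(5.0, 15.0, 10.0, 10.0, 0.5, 0.5)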

Cell Tags. This approach creates new problems. The mapping from world coordinates to map indices is many-to-one, and therefore the inverse mapping is not a function. In mathematical terms, the coordinate transform is not one-to-one.

An infinity of points in global coordinates corresponds to a single cell in the map, so remnants of images of arbitrary age may remain in the map indefinitely. Suppose the elevation at the point (15, 25) is needed and the map is 10 by 10. Then the point (5, 15) may also be in the map. A query for the elevation at (15, 25) may get the elevation at (5, 15) instead.

We manage this problem in a very simple way. Although all data remains in the map until it is overwritten, each entry is tagged with the distance that the vehicle had travelled since the start of the mission at the time the pixel was measured. The interface routines then perform two important hidden functions:

• If the tag of the last update is too old, the interface routines report the cell as empty. This makes it impossible for old data to poke through the holes in new data.

• When the tag of new incoming data is significantly different from the one in the cell, it indicates wraparound, so the statistical accumulators in the cell are first cleared. This ensures that two physically distinct regions of space are not confused and merged together (both functions are sketched below).
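The sketch below layers both functions on a per-cell record holding statistical accumulators and a tag; the two distance thresholds are illustrative assumptions, not the values used in RANGER.

    MAX_AGE = 50.0    # metres of travel after which a cell reads empty (assumed)
    WRAP_GAP = 25.0   # tag jump indicating wraparound into a reused cell (assumed)

    def read_cell(cell, distance_travelled):
        # Hide stale data so that old terrain cannot poke through the
        # holes in new data.
        if cell is None or distance_travelled - cell["tag"] > MAX_AGE:
            return None
        return cell

    def write_cell(cell, z, distance_travelled):
        # A large tag gap means the cell now covers a physically distinct
        # region of space, so its accumulators are cleared first.
        if cell is None or distance_travelled - cell["tag"] > WRAP_GAP:
            cell = {"n": 0, "sum_z": 0.0, "tag": distance_travelled}
        cell["n"] += 1
        cell["sum_z"] += z
        cell["tag"] = distance_travelled
        return cell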

4.4. Sensor Motion Compensation

By the time an image is received by the perception system, the vehicle may have moved a considerable distance since the image was acquired. So, the processing of the geometry in the image must account for the exact position of the vehicle when the image was taken.

Further, some sensors such as scanning laser rangefinders may require significant time to scan the laser beam over the environment. In the worst case, there is a distinct vehicle pose associated with each pixel in a ladar image. If this motion is not accounted for, the terrain maps computed from images will be grossly in error.

Smear and Offset. The worst case is a high angular velocity turn. If rangefinder scanning takes about 0.5 secs and the vehicle is travelling at 6 mph and turning sharply, its angular velocity can be as high as 1 rad/sec, so an obstacle can be smeared by 30° in a rangefinder image at high speed. Similarly, if the input latency is 0.5 secs and it is not accounted for, objects will also be shifted by 30° in a rangefinder image at high speed. Finally, the range to objects will also be overestimated by the distance travelled in 1 second.


Figure 17: Wrappable Map Indexing. Using modular arithmetic, all of 2D space maps, with wraparound, into a finite map.


Figure 18: Wrappable Terrain Map. Data remains in the map until it is overwritten by incoming data from somewhere else that happens to fall in the same place in memory.


Pose History and Lookup. We remove this distortion of range images by maintaining a history of vehicle poses sampled at regular intervals for the last few minutes of execution. When a pixel is processed, we search and interpolate the vehicle pose FIFO for the precise vehicle position at which each range pixel was measured.
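A sketch of this mechanism appears below; class and method names are ours, and angular components would in practice be interpolated modulo 2π, which is omitted here for brevity.

    import bisect

    class PoseHistory:
        def __init__(self):
            self.times = []   # monotonically increasing timestamps
            self.poses = []   # pose tuples, e.g. (x, y, z, roll, pitch, yaw)

        def record(self, t, pose):
            self.times.append(t)
            self.poses.append(pose)

        def lookup(self, t):
            # Find the two recorded samples that bracket the measurement
            # time of the pixel and interpolate linearly between them.
            k = bisect.bisect_left(self.times, t)
            if k == 0 or k == len(self.times):
                raise ValueError("time outside recorded pose history")
            t0, t1 = self.times[k - 1], self.times[k]
            a = (t - t0) / (t1 - t0)
            p0, p1 = self.poses[k - 1], self.poses[k]
            return tuple((1.0 - a) * u + a * v for u, v in zip(p0, p1))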

4.5. Interpolation

The terrain map is not interpolated at all, because interpolation requires a complete traversal, which is too expensive to perform. Instead, the responsibility for interpolation is left with the users of the map.

Impact of Vehicle Maneuverability. Spatial interpolation of the entire map is wasteful because vehicle maneuverability constraints may prevent many places from being reachable. Hence, the data in such regions is not necessary at all and interpolating there is a waste of resources.

Impact of Occlusion. Note that occlusion is inevitable in rough terrain, so spatial interpolation can never succeed fully without unjustified and harmful smoothness assumptions.

Temporal State Interpolation. We will see later that the path planner interpolates vehicle state in time instead of interpolating the map in space. Further, the assessment of hazards is based on a time signal which may or may not be known at a particular point in time. The system is robust by design to unknown signal values and, as a by-product of its processing, computes an assessment of how much geometry is actually unknown and reacts accordingly.

4.6. Errors and Uncertainty

Practical solutions require methods to deal with both systematic and random error sources that corrupt the incoming data and its eventual representation.

Image Registration. A simple image registration algorithm is used in situations where edge artifacts are introduced by various forms of position and range sensor errors. The basic mechanism is to compute and remove the average elevation deviation between the overlapping regions of consecutive images.

Currently, only the elevation, z, coordinate is matched, and this seems to work best in practice. When the z deviation of two consecutive images is computed, it is applied to all incoming geometry samples in order to remove the mismatch error.

Elevation Uncertainty. After the mean mismatch error is removed, there are still random errors in the elevation data. In order to represent the variation in geometry in a single map cell and to improve signal to noise ratios, a scatter matrix is computed (Hotz et al., 1993) incrementally as each new range pixel is merged into the map. The scatter matrix is defined as:

$$S = \begin{bmatrix} \sum xx & \sum xy & \sum xz \\ \sum yx & \sum yy & \sum yz \\ \sum zx & \sum zy & \sum zz \end{bmatrix}$$

The advantage of this incremental approach is that the mean and standard deviation of the evolving 3D distribution are available at any point in time from some simple formulas. Specifically, the deviation in z is useful for computing the uncertainty in the hazard estimates generated by the path planner.
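A minimal incremental accumulator of this kind is sketched below (names are ours); only the z statistics are extracted, since the deviation in z is what the planner consumes.

    import numpy as np

    class CellScatter:
        def __init__(self):
            self.n = 0
            self.sums = np.zeros(3)      # running sums of (x, y, z)
            self.S = np.zeros((3, 3))    # running sums of outer products

        def add(self, point):
            # Merge one range pixel into the cell.
            p = np.asarray(point, dtype=float)
            self.n += 1
            self.sums += p
            self.S += np.outer(p, p)

        def sigma_z(self):
            # Var(z) = E[z^2] - E[z]^2, read directly off the scatter matrix.
            mean_z = self.sums[2] / self.n
            var_z = self.S[2, 2] / self.n - mean_z * mean_z
            return np.sqrt(max(var_z, 0.0))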

4.7. Results

Although RANGER has its own built-in stereo algorithm, it has been integrated with a stereo vision system at the Jet Propulsion Laboratory (Matthies, 1992; Matthies et al., 1996) on another HMMWV. Figure 19 shows a short autonomous excursion along a dirt road bounded by trees and bushes on the right and a ravine on the left. The sequence of images to the left are the stereo range images. To the right are intensity images of the scene corresponding to the range images.

The images are positioned in correspondence with their associated position in the terrain map. The terrain map, drawn in the center, is rendered with intensity proportional to elevation. The path followed is drawn leading to the position of the vehicle near the end of the run. The run terminates at the end of the road. Two distinct obstacle avoidance maneuvers occur. The first is a left turn to avoid a large tree and the second is a recovery right turn to prevent falling into the ravine.

5. Path Planning

This section discusses the motivation behind, and the implementation of, our predictive control approach to trajectory representation, generation and search. We will call the collection of these three capabilities path planning in our context.

Our approach will be a departure from precedent that formulates the classical planning problem of deciding where to go largely in terms of predictive control. This approach will have advantages whenever speed is high enough for dynamics to matter or when nonholonomic motion constraints are operative.




Our approach is similar to the approach to cluttered environment planning adopted in (Feiten and Bauer, 1994), and it echoes earlier work presenting a duality between feedforward control and deliberative planning (Passino and Antsaklis, 1989).

5.1. Introduction

The local intelligent mobility problem can be characterized in terms of a search of the immediately visible environment, scanning for hazards, and seeking a goal while simultaneously avoiding any hazards that appear. Regardless of many other design variables, all or part of the local environment is typically searched.

Problem. Our initial attempts to search trajectories were founded on classical C-space techniques (Lozano-Perez and Wesley, 1979). These attempts were very slow, brittle, and inelegant, but they were educational. In our early work, we encountered the following major problems:

• computational inefficiency: There was not enough time to decide what to do. Conversely, the inefficiency of computations limited the vehicle speeds that could be safely achieved.

• command following problem: Specific trajectories used to avoid obstacles often could not be executed reliably or stably.

Our computational inefficiency problem was caused by a treatment of vehicle trajectories that was expensive and often wasteful. Our command following problem arose from issuing commands to the vehicle that were either wrong or unrealistic.

Further, consideration of these unrealistic trajectories in search tended to waste computational resources, thereby increasing reaction time and aggravating the first problem of computational inefficiency.

After conducting a study of some related real-time issues, we concluded that classical C-space planning techniques were ineffective in our domain. A new approach was necessary.

Solution. Motion planning is a problem involving search. Recall that heuristic search efficiency can be improved by appropriate ordering of constraints because some have more power to limit search than others. Our predictive control formulation amounts to a constraint ordering heuristic that improves the efficiency of search. The elements of our approach are:

• We represent trajectories implicitly in terms of the commands that are issued to the vehicle, and...

• The corresponding spatial trajectory is computed from a highly accurate state space vehicle model that guarantees mechanical feasibility by construction.

Through this technique we completely bypass many of the difficulties of trajectory generation and search for nonholonomic vehicles with real actuator response characteristics. The same technique has been used in seminal works (Olin and Tseng, 1991) with less formally declared roots.

Figure 19: A short cross country excursion. (a) shows a sequence of range images from a stereo vision system mounted on a HMMWV vehicle. (c) shows a sequence of intensity images from one of the cameras. (b) and (d) show an overhead view of an elevation map that was generated.




5.2. Preliminaries

Before proceeding to describe our technique, a few necessary terms will be defined.

Terminology. We will use a single point on the vehicle called the reference point to describe its motion. A spatial description of the continuous sequence of positions achieved or considered will be called a path, and when the time dimension is added, a trajectory.

The kinematics constraint is a term used to express the fact that steering mechanisms may be unable to achieve arbitrarily small radii of curvature. Any path which respects these mechanical limitations of the steering system is said to be kinematically feasible.

The dynamics constraint is a catchall term used to express the fact that system behavior is governed by differential equations. Any trajectory which satisfies this set of constraints is said to be dynamically feasible.

A trajectory is mechanically feasible if it is both kinematically and dynamically feasible. Such a trajectory describes a physically achievable motion.

A trajectory which is safe for the vehicle to execute (i.e. free from significant hazards) is called admissible.

It will be necessary to distinguish several alternate forms of trajectory representation. In general, the system has available to it at any time a space of possible commands that it can send to the vehicle controller for execution. This space of commands will be called the command space, and a point in this space is a command vector or control vector.

Commands may or may not map directly onto the vehicle actuators. The space of directly actuated variables is the actuation space. In our case, vehicle commands map more or less directly onto the speed and steering actuators.

When these commands are applied to the vehicle actuators, the vehicle kinematics, dynamics, and the mechanics of terrain following cause it to traverse a unique trajectory. We can represent this trajectory in terms of the time evolution of a vector quantity called the vehicle state vector, which spans an abstract state space.

Nominally, the state vector includes the position and orientation of the vehicle body, and the positions of any articulations. Depending on the order of the system dynamics, it will also include some number of time derivatives.

If time dependence is removed from the description by representing only the geometry of the motion, and time derivatives are eliminated from the description, the resulting description turns out to be a configuration space (C-space).

A distinction which is independent from representation is the distinction between command and response. The first is a specification of requested motion whereas the second is the actual response to that request. The degree to which these two agree is one measure of the fidelity of control that has been achieved.

It will be useful to distinguish two approaches to dynamic system modeling that are analogous to the duality between actions and states that has been well discussed in the AI literature (Hendler et al., 1990).

In a state based representation, system motion is viewed as a series of states that are altered by events. In an event based representation, the state of the system is derived implicitly from its initial state and all events that have occurred up to a particular point.

While the choice of one representation over another does not affect the capability for expression, it does affect the relative ease with which certain properties are represented and reasoned about.

Robot planning has commonly used an abstraction known as configuration space (C-space) - a space spanned by any set of parameters that uniquely describe the configuration of the robot.

Relatively speaking, the determination of trajectory admissibility (safety) is fairly trivial in C-space, but significant work is required to ensure feasibility.

C-space methods, like state based methods, require an inverse model that computes the command trajectory that corresponds to the chosen C-space trajectory.

Models may not be easily inverted and may not be invertible at all for an arbitrary C-space curve. While C-space planning is a powerful paradigm, it is not an effective technique in problems where the system model is difficult to invert. Such situations include cases where the system:

• requires a dynamic model1, or
• is nonlinear, or
• is underdetermined (nonholonomic).

1. We will consider any situation where a derivative is required in a state vector to compute accurate trajectories to be one where a “dynamic” model is required. The need for one increases as speed increases.


The command space described earlier represents the space of alternative commands that the vehicle can receive. Such commands can always be executed in the sense that a legitimate, unique, computable response exists for all commands - although the response may not follow the command closely.

In this case, the determination of trajectory feasibility is fairly trivial, whereas it requires some work to ensure that it is admissible.

Command space methods, like event based methods, require a forward model that computes the C-space trajectory that corresponds to the chosen command trajectory. Both forms of model are indicated in Figure 20.

Subproblems. The command following problem arises either because the command itself was infeasible to begin with, or the controller performance is inadequate. If we choose to blame the trajectory rather than the controller, we will call the predicament the trajectory generation problem.

Legitimate constraints are imposed on trajectories by vehicle limitations that amount to very strong constraints on the feasibility of arbitrary trajectories expressed in configuration space. Such constraints include:

• actuator and plant kinematic and dynamic limits in the form of braking and steering maneuverability

• underactuation of the vehicle
• processing and communication delays

Consider a simple situation where the vehicle command space consists of speed V(t) and curvature κ(t), and the configuration space consists of position x(t), y(t) and heading ψ(t) in two dimensions.

In attempting to generate trajectories that are feasible, several difficulties will emerge, because the relationship between a C-space trajectory and its command space equivalent is multivariate, nonlinear, coupled, and underdetermined.

The equations which map command space to C-space in a flat 2D world are given below, in integral form:

$$x(t) = x_0 + \int_0^t V(t)\cos\psi(t)\,dt$$
$$y(t) = y_0 + \int_0^t V(t)\sin\psi(t)\,dt$$
$$\psi(t) = \psi_0 + \int_0^t V(t)\,\kappa(t)\,dt$$

or, equivalently, in differential form:

$$\dot{x}(t) = V(t)\cos\psi(t) \qquad \dot{y}(t) = V(t)\sin\psi(t) \qquad \dot{\psi}(t) = V(t)\,\kappa(t)$$

We will call these equations a forward model. By contrast, an inverse model would be a solution to the problem of computing the command space curve from the C-space curve. That is, the inverse model generates clothoid curves for Ackerman steered vehicles.

The generation of clothoids is a difficult problem that is further aggravated by the need to account for dynamics and latencies. Note, however, that the forward model provides the inverse correspondence trivially, and in its integral form, is simply the equations of dead reckoning. The trajectory generation problem here is only difficult in one direction.
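In the easy direction, the model is no more than a numerical integration. The sketch below Euler-integrates the equations above for arbitrary command functions V(t) and κ(t); the step size and horizon are illustrative assumptions.

    import math

    def forward_model(V, kappa, x0=0.0, y0=0.0, psi0=0.0, dt=0.05, horizon=5.0):
        # Integrate the dead reckoning equations for command functions
        # V(t) (speed) and kappa(t) (curvature).
        x, y, psi, t = x0, y0, psi0, 0.0
        path = [(x, y, psi)]
        while t < horizon:
            x += V(t) * math.cos(psi) * dt
            y += V(t) * math.sin(psi) * dt
            psi += V(t) * kappa(t) * dt
            t += dt
            path.append((x, y, psi))
        return path

    # e.g. a constant 4.5 m/s run at constant 0.1 1/m curvature:
    # arc = forward_model(lambda t: 4.5, lambda t: 0.1)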

It is often necessary for the system to understand the degree to which a particular motion can be accomplished. Poor fidelity of this aspect of the vehicle model means that the system will not understand its own motion. This in turn will lead to:

• unreliability of obstacle avoidance
• instability of path following1

The perceptual horizon of any vehicle cannot exceed the maximum range of the perception sensor. At moderate speeds (<10 mph), maximum useful sensor range tends to be roughly equal to the worst case distance it takes for the steering actuator to move to its commanded position.

When dramatic steering changes are required to avoid a hazard, the navigation system operates almost entirely in the regime where curvature is continuously changing. This observation leads to the conclusion that the use of arc rather than clothoid models of Ackerman steering is incorrect at even moderate speeds.

Figure 22 indicates accurately the difference between an arc model and a clothoid model of vehicle response to a command to switch from a hard left to a hard right turn. The maximum rate of the steer angle of the front wheels is 30° per second.

Figure 20: Forward and Inverse Models. These have analogous definitions in robot manipulator dynamics.


1. This depends on the feedback law used. Even with poor models, stable low performance paths may be obtained.


Figure 21: Mapping from Command Space to C-space. These are also the equations of dead reckoning.



Suppose an obstacle exists to the left of the vehicle. The obstacle could be barely avoided if the vehicle continued on its hard left turn trajectory. Deciding to play it safe, the vehicle issues a hard right command at the position shown (0,0). The command is represented by the arc to the right of the vehicle. The response to such a command is the clothoid shown, which remains to the left of the vehicle. Under these conditions, the vehicle would drive directly into the obstacle. Ultimately, poor understanding of its own dynamics has caused an incorrect decision and led to a collision. The correct decision was to continue turning left, because curvature cannot be changed fast enough for a turn to the right to avoid the obstacle.

Design. Our approach to managing these problems is to:

• Invert the order in which the common planning constraints of feasibility and admissibility are satisfied, through a forward modeling approach to trajectory generation.

• Employ an accurate state space model of vehicle response as the forward model.

It turns out that it is far more efficient to search for an admissible trajectory in a space of feasible ones than vice-versa, and that sufficiently accurate models of vehicle motion are relatively easy to generate in state space form.

Implications. An important implication of the approach is that the generated trajectories are mechanically feasible by construction. While the input commands to the model respect only the maximum curvature constraint, the output state estimate is consistent with the response of all actuators and the body kinematics of motion over rough terrain.

Thus, we have simplified the computation of the correspondence between command and response. This simple correspondence, available through our control formulation, combined with high fidelity models, introduces the following benefits:

• Reliability: System reliability is enhanced because dynamic feasibility is inherent in forward modeling approaches.

• Accuracy: Higher fidelity models make it possible to drive close to hazards when necessary and to track paths with low error.

• Stability: Vehicle control, whether for the purpose of obstacle avoidance or goal-seeking, remains stable at high speeds.

• Performance: Computational complexity of planning is reduced because the dynamics constraint is a valuable heuristic to limit search. This reduction in complexity leads to enhanced response times and higher speed motion.

5.3. Trajectory Representation & Generation

We will represent response trajectories implicitly in terms of the commands to which they correspond. When necessary, we will use a forward model to generate the response from the command through a system simulation process based on numerical integration of the state space model.

Linear State Space Model. For a linear system, the conventional state space model of a system is the following two matrix equations:

$$\frac{dx}{dt} = Ax + Bu \qquad y = Cx + Du$$

Note in particular that the first equation is a differential one. This kind of model is known classically as a multivariate state space system. It can be mapped onto our problem as follows. Let us assume the system is linear and describe the function of the matrices and vectors. The system of equations can be represented in a block diagram as shown in Figure 23.


Figure 22: Model Fidelity at 5 m/s speed. The clothoid model correctly accounts for how long it really takes to reverse curvature for an Ackerman steer vehicle.





The system dynamics matrix, A, models actuator constraints, kinematics and dynamics, and body dynamics. It propagates the state of the vehicle forward in time. Our system model is based on the assumption that velocity can be considered constant for a small period of time.

The input distribution matrix, B, transforms the control vector into its influences on the state vector.

The control vector u includes vehicle steering and speed commands as well as command signals to any articulated sensor heads. Generally, alternative commands can be any time-varying control vector u(t).

The terrain disturbances ud model the terrain contact constraint1. Alternately, an abstract kinematic equation of the form g(x) = 0 can be used. Terrain geometry is represented in a terrain map data structure that is generated by the perception system.

The state vector x includes the vehicle steering, speed, and the state of motion of the vehicle body and any articulated sensor heads. It includes the 3D position and 3-axis orientation of the vehicle body as well as its linear and angular velocity.

The output vector y can be any function of both state and inputs. It will be discussed later in the context of obstacle avoidance.

Nonlinear State Space Model. The actual system model used is nonlinear. Fundamentally, this is because the actuators move with the vehicle, so the transformation from command space to state space depends on vehicle attitude and hence on state.

The nonlinear model computes trajectories resulting from vehicle commands. The inputs to the model are the steering angle αc (corresponding indirectly to the desired path curvature), the throttle Vc (corresponding to desired speed), and an elevation map of the terrain ahead of the vehicle.

The commands are first delayed through a FIFO queue to account for communications and processing delays and passed through a model of the actuator dynamics. In the case of the throttle (speed), the influence of the gravitational load is so significant that it must be modeled.

The predicted steer angle response αresponse is passed through a model of the steering column to predict the actual curvature κ of the path traversed. The product of curvature and speed provides the component of angular velocity β̇ directed along the vehicle vertical axis (which may not be aligned with gravity).

The linear velocity V is converted to world coordinates to generate the components of vehicle velocity along the world frame axes. We assume that the slip angle (the angle between vehicle orientation and its velocity vector) is zero, so that velocity is oriented along the body forward axis. Position (x, y, z) is determined through integration of the velocity vector.

Pitch θ and roll φ are determined by placing the vehicle wheels over the terrain map and allowing the vehicle to settle. Heading ψ is computed by integrating the angular velocity after converting coordinates to the world frame.

1. Of course, at sufficiently high speeds, the vehicle need not remain in contact with the terrain.


Figure 23: Multivariate Linear System Block Diagram. This model can be used to represent a vehicle driving over terrain.


Figure 24: Forward Model. This model is used to predict vehicle state given actuator commands (αc, Vc). The output state contains the body frame z component of angular velocity (β̇), vehicle velocity (V), heading (ψ), pitch (θ), roll (φ) and position (x, y, z).




State Space Simulator. The basic simulation loop can be written as follows. At each time step:

• simulate suspension - determine attitude from terrain geometry and position

• simulate propulsion - determine new speed from command, state, and attitude

• simulate steering - determine angular velocity from steering and speed

• simulate body - dead reckon from linear and angular velocity and time step

The positions of distinguished points on the body, called reference points, are maintained in navigation coordinates throughout the simulation. The suspension model that is used is based on assumptions of rigid terrain and suspension, and it computes the attitude of the vehicle which is consistent with terrain contact.

Propulsion is modeled as a proportional controller with gravity compensation. The steering model is based on an angular velocity limit on the steering wheel and the bicycle model of steering kinematics. Body dynamics are simulated using the 3D dead reckoning equations.
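A skeletal single pass through this loop is sketched below. The terrain settle call, the controller gain, and the steering rate limit are illustrative stand-ins for the actual vehicle parameters, and the gravity term simply includes the slope load in the speed response.

    import math

    STEER_RATE = math.radians(30.0)   # steer angle rate limit (assumed)
    K_SPEED = 1.0                     # proportional speed gain (assumed)
    G = 9.81

    def simulate_step(state, cmd, terrain, dt):
        # suspension: attitude and elevation consistent with terrain contact
        state.pitch, state.roll, state.z = terrain.settle(state.x, state.y)
        # propulsion: proportional speed control with the gravitational
        # load along the body forward axis (pitch positive nose-up)
        accel = K_SPEED * (cmd.speed - state.v) - G * math.sin(state.pitch)
        state.v += accel * dt
        # steering: rate-limited steer angle and the bicycle model
        step = max(-STEER_RATE * dt, min(STEER_RATE * dt, cmd.steer - state.alpha))
        state.alpha += step
        kappa = math.tan(state.alpha) / state.wheelbase
        # body: dead reckon from linear and angular velocity
        state.psi += state.v * kappa * dt
        state.x += state.v * math.cos(state.psi) * dt
        state.y += state.v * math.sin(state.psi) * dt
        return state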

5.4. Trajectory Search

The basic search process used is generate and test. We employ this technique while conducting a command space search over the feasible set of response trajectories.

The system considers a number of command space alternatives which span the entire set of commands available for the vehicle at some gross resolution. These are then converted to response state space and C-space trajectories through model-based simulation and subsequently evaluated by both obstacle detection and goal seeking.

For a rigid-bodied vehicle moving in three dimensional space, the C-space can be considered to be a subset of state space - that is, the coordinates of the vehicle control point expressed as (x, y, z, roll, pitch, yaw). The command space for a conventional automobile is spanned by the variables of speed and path curvature, and these variables map more or less directly to the controls of throttle and steering.

Simulation. By the definition of state, the system can be projected arbitrarily far forward in time based only on the control vector, terrain contact constraint, and time. Hence, a computer implementation of an integration of the system model equations constitutes a state space simulator, as shown in Figure 25.

The set of trajectories xi(t) which:

• satisfies the model equations (dx/dt = A x + B u)
• maintains contact with rigid terrain (g(x) = 0)

is called the feasible set.

The constraints are satisfied by construction through simulation of the system dynamics and altering the vehicle attitude at each step in the simulation to enforce terrain contact.

Predictive Control vs. C-Space Planning. The differences between classical C-space planning and command space planning are indicated in Figure 26. On the left of the figure, the search of planning alternatives is expressed in configuration space. Obstacles can be represented as regions in this space. When a clear region or set of points has been found in front of the vehicle, a trajectory generation algorithm is invoked to map C-space onto the vehicle command space, and these commands are sent to the hardware for execution.

The right of the figure represents the predictive control (Soeterboek, 1992; Clark et al., 1987; Clark, 1994) approach. Note first that the direction of the arrows is reversed.

In our case, unlike in more rigorous controller design methodologies, we do not explicitly compute a control law. We use generate and test. The inverse system model is never evaluated explicitly. The system simply remembers the correspondence of command to response trajectories and inverts this list of ordered pairs.
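The loop below shows this generate-and-test search in its simplest form; the simulate and evaluate callables and the gross command grid are our stand-ins for the corresponding system components.

    def generate_and_test(simulate, evaluate, speeds, curvatures):
        # Forward-simulate every command alternative; the inverse model
        # is never evaluated. We keep the command whose simulated
        # response is admissible and ranks best for goal seeking.
        best_cmd, best_cost = None, float("inf")
        for v in speeds:
            for k in curvatures:
                response = simulate(v, k)
                safe, goal_cost = evaluate(response)
                if safe and goal_cost < best_cost:
                    best_cmd, best_cost = (v, k), goal_cost
        return best_cmd   # None means nothing admissible (e.g. stop)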

It is clear from the figure that state space is, in fact, a superset of configuration space - including all C-space variables plus any derivatives that appear in the system dynamic model.

Figure 25: State Space Simulator. Because a state space model retains state, it can be used to project motion that corresponds to a command arbitrarily far into the future. The inverse model on the right is never explicitly evaluated.




5.5. Results

While it is difficult to compute the shapes of regions in configuration space in closed form, it is relatively easy to write a computer program to enumerate all possibilities and fill in boxes in a discrete grid which represents C-space at reduced resolution. The three dimensional C-space for an Ackerman steer vehicle for an impulse turn at 4.5 m/s was generated by this forward technique.

The results are plotted in Figure 27 in heading slices of 1/16 of a revolution. Symmetry generates mirror images along the heading axis, so two slices are plotted on each graph. The maneuver is a turn from zero curvature to the maximum, issued at time t = 0. A dot at a particular point (x,y) in any graph indicates that the heading of the slice is obtainable at that position. There are 16 slices in total, of which 6 are completely empty (i.e. the vehicle cannot turn around completely in 20 meters). The total percent occupancy of C-space is the ratio of the total occupied cells to the total number of cells. This can be computed from the figure to be 3.1%.

So 97% of the C-space of the vehicle is infeasible if the limited maneuverability of the vehicle is modeled. The maneuverability is limited by both the nonzero minimum turn radius and the steering actuator response. Note that occupancy of C-space does not account for higher level dynamics. There are severe constraints on the ability to “connect the dots” in these graphs which aggravate the situation further.

6. Goal Arbitration

This section discusses the motivation behind, and the implementation of, our optimal control approach to goal arbitration. The proposed approach is similar in formulation to work in classical optimal control (Athans and Falb, 1966) in that it seeks to determine control signals that will both satisfy constraints and optimize a performance index.

Again, unlike in classical controller design techniques, we do not explicitly compute an optimal control law; we use a much simpler generate and test technique. This sampling of command space implies that suboptimal solutions are generated.


Figure 26: Planning Through Predictive Control. Obstacles are most naturally represented in C-space, but generating commands that avoid them involves a difficult inverse model. Instead, a representative number of command space alternatives are transformed, through simulation, to state space, and then to C-space, and then checked for intersection with obstacles.


Figure 27: Ackerman Steer Configuration Space. For a sharp turn to the left or right at 4.5 m/s, starting from zero curvature, 97% of the configuration points in front of the vehicle's initial position are not feasible - meaning the vehicle cannot achieve the (x, y, heading) represented by the point within the finite time horizon simulated.


6.1. Introduction

In addition to trajectory search, the local intelligent mobility problem involves an aspect of goal arbitration. For example, given a goal path to follow and the simultaneous goal of avoiding obstacles, it is likely and frequently the case that these goals will conflict. More plainly, an obstacle may appear directly on the goal path.

We will discuss here our mechanism for dealing with this conflict as well as the manner in which candidate trajectories are ranked for both their obstacle avoidance and goal-seeking potential.

Problem. Let us define the strategic goal as some path or position to be followed or achieved, and the tactical goal as that of simultaneously avoiding all hazardous conditions. If both goals are implemented as independent behaviors, they will naturally disagree on the commands to the actuators. This legitimate and inevitable conflict will be called the actuator contention problem, also known elsewhere as a control prioritization problem.

Solution. Solutions to this problem must decide how to either:

• Merge both commands together to generate a third.

• Give one behavior priority over the other.

Regardless of how this is done, the general technique involved is arbitration. Several approaches have been used, ranging from subsumption of one behavior in favor of another (Brooks, 1987) to consensus-building and voting techniques (Rosenblatt and Payton, 1989).

A spectrum of approaches exists with extremes that correspond roughly to bureaucracy and democracy. Our approach is somewhat intermediate between these extremes. It recognizes that:

• Some behaviors, like obstacle avoidance, must be given absolute veto power over unsafe trajectories.

• Others, like goal seeking, can profitably optimize performance through search and ranking of the remaining alternative trajectories.

6.2. Preliminaries

Terminology. Any number of potential hazardous situations may exist along a trajectory. Some of these hazards include:

• discrete obstacles like rocks, holes, and trees that would damage the vehicle through collision.

• hazardous configurations like extreme pitch and roll angles that would damage the vehicle through tipover.

• hazardous states like extreme lateral or longitudinal acceleration which could damage the vehicle through tipover or loss of traction and control.

• traction traps like wheel-sized holes, high centering terrain, and regions of ice or slippery terrain that would prevent further vehicle motion.

• catastrophic falls like ravines and cliffs that could damage the vehicle through complete loss of terrain contact, followed by ballistic motion, followed by catastrophic collision.

A path is admissible if it would be safe for the vehicle to traverse it.

Design. Our basic approach is to notice that the mobility problem can be expressed in the familiar terms of optimal control theory, and to then apply the associated techniques and abstractions of this theory.

Architecturally, the tactical control layer consists of coexisting hazard avoidance and goal seeking behaviors that are managed by an arbiter to avoid conflict, as shown in Figure 28.

In fact, the seeking and/or avoidance occur when the arbiter chooses an alternative and sends it to the control layer. The other two entities simply rank the candidate trajectories that are given to them.

Implications. Our approach has the advantage that some limited degree of local intelligent behavior emerges naturally. For example, wall or other extended feature following emerges because optimization will keep the vehicle close to an obstacle between it and the goal. However, once a break in the extended feature appears, the system will immediately seize the opportunity to reacquire the goal path.

6.3. Arbitration

Note that the satisfaction of either goal may be expressed as a constraint or as some utility function to be optimized. For example, we could optimize safety by choosing the safest overall candidate trajectory, or we might rigidly enforce a path to follow as a constraint. We believe the reverse is more appropriate. That is, hazard avoidance is a constraint and goal seeking is to be optimized.


Figure 28: Tactical Control Layer. The tactical control layer consists of two purposeful behaviors, and an arbiter to coordinate them.




Hence, the task of safe navigation can be expressed in classical optimization terms. The navigator must achieve some useful goal while simultaneously satisfying the constraint of avoiding damage to the vehicle or its payload and the constraints of limited vehicle maneuverability.

Feasible Set. We will express this optimal control problem as follows. Let ui(t) be a candidate control function. The response to this candidate command, denoted xi(t), is generated from the nonlinear system dynamics and terrain following models:

$$\dot{x} = f(x, u) \quad \text{(System Dynamics)} \qquad g(x) = 0 \quad \text{(Terrain Following)}$$

Let the set of mechanically feasible trajectories, denoted Xf, be those which satisfy the above two equations. The response trajectory is a member of this feasible set:

$$x_i(t) \in X_f \quad \text{(Feasibility)}$$

Note that the space of possible commands to issue is continuous. Rather than deal with variational calculus, we will sample the feasible set of trajectories at some practical resolution that ensures adequate coverage of the set, and search the alternatives exhaustively.

Output Vector. The hazard vector, y = h(x), consists of rankings of every point along a response trajectory for its relative safety in terms of several of the hazard conditions mentioned earlier. By and large, safety can be determined kinematically from the state and the terrain map, though such issues as predicting rollover are exceptions. Each element of the vector corresponds to a different hazard.

Let us define the safety threshold vector, ysafe, as the constant vector whose elements are the maximum safe values of each element of the hazard vector. A trajectory is safe when:

$$|y| < y_{safe} \quad \text{(Safety)}$$

Let the set of safe trajectories, denoted Xa, be those whose associated hazard vectors satisfy the above equation. Such a response trajectory is a member of the admissible set:

$$x_i(t) \in X_a \quad \text{(Admissibility)}$$

Optimal Control Problem. While there are many potential forms of strategic goal that might be assigned to the vehicle, let us assume, without loss of generality, that a goal trajectory, xgoal, or reference trajectory of some form has been assigned.

Further, let us define a goal functional, or performance index L[xi(t)], to be an arbitrary expression of how well a particular trajectory follows the goal trajectory. If we use the integrated length of the vector difference, we might write:

$$L[x_i(t)] = \|x_i(t) - x_{goal}(t)\| = \int_{t_1}^{t_2} \left[x_i(t) - x_{goal}(t)\right]^2 dt$$

The optimal control problem can now be represented as follows:

$$\begin{aligned} \underset{u_i(t)}{\text{Minimize:}}\quad & L[x_i(t)] = \|x_i(t) - x_{goal}(t)\| && \text{(Goal Proximity)} \\ \text{Subject to:}\quad & \dot{x} = f(x, u) && \text{(System Dynamics)} \\ & g(x) = 0 && \text{(Terrain Following)} \\ & y = h(x) && \text{(Hazard Kinematics)} \\ & |y| < y_{safe} && \text{(Safety Constraint)} \end{aligned}$$

6.4. Adaptive Regard

Although our gross resolution command space search manages the complexity of trajectory search, it does nothing to minimize the cost of evaluating a particular trajectory for its safety and/or goal seeking potential. Efficient trajectory evaluation is the subject of this section.

We minimize wasted computation in trajectory evaluation by selectively processing only the data that matters along the trajectory in the environmental model. Known here as adaptive regard, the technique is the analog of our adaptive approach to perception in that it minimizes references to the environmental model just as adaptive perception minimizes references to images.

Detection Zone. The immediately material information forms a region in the environmental model which we call the detection zone - that region of the near environment which the vehicle can reach but is not already committed to travelling, and about which an immediate decision of traversability is needed to continue moving.


Figure 29: Optimal Control Arbitration. Driving safely toward a goal can be formulated in terms of optimizing a functional over a set of trajectories that are both safe and mechanically feasible.




Remaining Zones. Thus, in our search for hazards, we do not look in the following regions:

• free zone: where the vehicle cannot go
• dead zone: where the vehicle is committed to go
• horizon zone: where the decision can be delayed

The precise location of the detection zone is obtained trivially from a time window into the response trajectories computed by the state space model.

Planning Window. The planning window is analogous to the range window in perception. It confines the search for hazards to the detection zone. One of the highest level system requirements is to attempt to maintain continuous motion, so adaptive regard is based on turning maneuvers (which consume more space) rather than braking maneuvers.

An impulse turn is a turn from zero curvature to the maximum allowed curvature. The planning window is computed by predicting the distance required to execute an impulse turn at the current speed with the best available estimates of the output latencies that will apply.

Precision in computation of the planning window requires careful treatment of time. The planning window is measured from the position where the vehicle will be when the steering actuator starts moving.

Real-Time Latency Modeling. The planning problem has latency concerns similar to those of perception. Without latency models, obstacle avoidance significantly underestimates the distance required to react, for two reasons:

• Position estimate input latency implies that the vehicle is actually much closer to an obstacle than the last available position measurement suggests.

• Command output latency and actuator dynamics imply that it actually takes much more distance to turn than would be expected from instantaneous response models.

A model of these latencies is accomplished with the following mechanisms:

• all input and output is time-stamped
• all input and output is stored in internal FIFO queues
• all sensor latencies are modeled
• all actuator latencies are modeled

These FIFO queues do not introduce artificial delay - they are used to model the delays which already exist in the system.
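A minimal sketch of such a queue for output commands follows; the fixed depth, the omission of queue trimming, and all names are illustrative assumptions rather than details of the RANGER software:

    #define QLEN 32

    typedef struct { double time; double curvature; } stamped_cmd;
    typedef struct { stamped_cmd buf[QLEN]; int head, tail; } cmd_fifo;

    /* Record a command at the moment it is issued. */
    void fifo_put(cmd_fifo *q, double t, double kappa)
    {
        q->buf[q->tail].time = t;
        q->buf[q->tail].curvature = kappa;
        q->tail = (q->tail + 1) % QLEN;
    }

    /* The command actually acting on the vehicle at time t is the     */
    /* most recent one issued at least one output latency earlier.     */
    double fifo_effective(const cmd_fifo *q, double t, double latency)
    {
        double kappa = 0.0;
        for (int i = q->head; i != q->tail; i = (i + 1) % QLEN)
            if (q->buf[i].time + latency <= t)
                kappa = q->buf[i].curvature;
        return kappa;
    }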

6.5. Hazard Detection

The process of hazard detection is the process of computing the hazard output vector. The hazard vector moves over time in a hazard space in response to the movement over time of the vehicle state vector in state space. That is, there is a hazard trajectory yi(t) which corresponds to each state trajectory xi(t), which in turn corresponds to each command trajectory ui(t).

Once the vehicle state trajectory xi(t) is known, a set of hazard models is used to compute its relative safety with respect to the associated terrain information in the environmental model.

Types of Hazards. Each element of the hazard vector corresponds to a different hazardous condition. Each scalar hazard yj(t) is a function of the vehicle state, the terrain on which it rests, and the input commands. Some typical hazards are:

• Tipover: The movement of the weight vector outside of the support polygon formed from the wheel contact points.

• Body Collision: Collision of the underbody with the terrain.

• Discrete Obstacles: Regions of locally high terrain gradient.

• Unknown Terrain: Regions that are occluded, outside the field of view of the sensors, unknown from poor measurement accuracy or resolution, or devoid of matter (such as the region beyond a cliff edge).

Figure 30: Detection Zone For Braking Maneuver. In any candidate vehicle maneuver a swath of ground is covered. The detection zone is the region which must be verified to be safe in the current computational cycle to prevent collisions. (Regions shown: free zones, dead zone, detection zone, horizon zone.)

Hazard Signals. As an example of a hazard signal, consider expressing proximity to tipover in terms of excessive pitch or roll angle. Let the elevations of the front and rear points on the vehicle longitudinal axis after enforcing terrain contact be z_f(t) and z_r(t). The pitch tipover hazard signal is then given by:

y_{pitch}(t) = \frac{z_f(t) - z_r(t)}{L}

The other three hazards above are computed from the volume under the belly, the local terrain gradient, and the total trajectory length for which the terrain is unknown, respectively.

Trajectory Safety. In order to assess the safety of an entire trajectory, it is necessary to integrate out the time dimension and merge all of the predictions of different hazard types together. This process generates a holistic estimate of the degree of safety expected if the associated command is executed.

Although many alternatives for doing this present themselves, we have achieved acceptable performance by ranking the whole trajectory using the worst hazard at the worst point in time. The output of this process is the safety ratings of all trajectories - which is supplied later to the optimal control arbiter.
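As a sketch, assuming the hazard vector has been sampled along the trajectory into a flat array, the ranking is a double maximization; the array layout and names are illustrative:

    /* Rank a trajectory by its worst hazard at its worst point in    */
    /* time. y holds nstep samples of an nhaz-element hazard vector.  */
    double trajectory_hazard_rank(const double *y, int nstep, int nhaz)
    {
        double worst = 0.0;
        for (int t = 0; t < nstep; t++)
            for (int j = 0; j < nhaz; j++) {
                double h = y[t * nhaz + j];  /* hazard j at step t */
                if (h > worst)
                    worst = h;               /* keep single worst  */
            }
        return worst;  /* the whole trajectory is ranked by this value */
    }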

6.6. Goal Seeking

The process of goal-seeking starts with the process of computing the goal functional. Many techniques for computing the proximity of two trajectories are possible, but the one we chose here is a modification of the classical pure pursuit algorithm.

The most general form of path-based strategic goal is a literal trajectory to follow.1 Other types of goals, such as headings, points, and curvatures, can be extracted as trivial subcases of the more general path tracking problem.

Basic Pure Pursuit. The pure pursuit algorithm (Shin et al. 1991) is a proportional controller formed on the heading error computed from the current heading and the heading derived from the current vehicle position and a goal point on the path.

The goal point is computed by finding the point on the path which is a predetermined distance from the current vehicle position.

Heading is measured at the center of the rear axle. The proportional gain Kp is normalized by the lookahead distance L. This can be viewed as an adaptive element or, more simply, as a unit conversion from heading error to curvature, because the ratio ψerr/L is the average curvature required to reacquire the path at the goal point.
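In code, the control law is a one-line proportional rule; the function and argument names below are illustrative:

    /* Classical pure pursuit: commanded curvature is the heading    */
    /* error scaled by the gain and normalized by the lookahead L.   */
    double pure_pursuit_curvature(double Kp, double psi_err, double L)
    {
        return Kp * psi_err / L;  /* psi_err / L is the average      */
                                  /* curvature reacquiring the path  */
    }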

Rough Terrain Pure Pursuit. A few modifications are introduced to adapt pure pursuit for rough terrain. Extremely large tracking errors must be acceptable to the servo without causing instability if obstacles are to be avoided robustly. This is made possible by two devices indicated in the following figure.

The system maintains a running estimate of the point on the path which is closest to the current vehicle position in order to avoid an expensive search of the entire path each iteration.

1. The arbiter can easily integrate the real-time inputs of a human supervisor through this mechanism.

Figure 31: Basic Pure Pursuit. A proportional controller formed on the heading error to acquire a goal point at a given lookahead distance. (Block diagram elements: gain Kp/L from heading error ψerr to curvature command κcmd; steering column; commanded and resultant headings ψcmd, ψres.)

Figure 32: Adaptive Pure Pursuit. The point closest to the vehicle on the path replaces the vehicle position. (Labels: vehicle (x1,y1), closest point (x2,y2), goal point (x3,y3); distances L12, L13, L23; error distance; adaptive and nonadaptive lookahead.)


This closest point is used instead of the current vehicle position as the origin of the lookahead vector.

The lookahead distance is adaptive to the current tracking error - increasing as the error increases - as indicated in the accompanying code fragment.

The first while loop is responsible for maintaining a running record of the close point, point 2. It searches through an arc length window which adapts to the path tracking error. As the error gets larger, this loop will cause the close point to jump over high curvature kinks in the path as they become less relevant at the resolution of the tracking error.

The second while loop computes the goal point in an identical manner. It basically moves point 3 forward until it falls outside a circle centered at the vehicle whose radius is the sum of the error distance L12 and the nonadaptive lookahead.

In this way, when an obstacle avoidance maneuver causes significant path error, the algorithm will search to reacquire the path on the other side of the obstacle instead of causing a return to the front of the obstacle.

Under normal circumstances, when the vehicle is tracking the path, the close point is at the vehicle position, the error distance is close to zero, and the adaptive lookahead is the nonadaptive lookahead. Hence, the algorithm gracefully degenerates to classical pure pursuit when obstacle avoidance is not necessary.

Model-Based Adaptive Pure Pursuit. The devices of the previous section account for obstacle avoidance. However, basic pure pursuit also suffers from speed related problems. Instability results from large tracking errors, too short a lookahead distance, or too high a gain.

The goal functional is generated by evaluating, at each point on each candidate trajectory, the distance from the vehicle to the goal point. The minimum distance over the entire trajectory is the functional value associated with it.
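A minimal sketch of this evaluation for a single candidate trajectory, with illustrative names:

    #include <math.h>

    /* Goal functional: the closest approach of a candidate trajectory */
    /* (sampled at n points) to the goal point (gx, gy).               */
    double goal_functional(const double *x, const double *y, int n,
                           double gx, double gy)
    {
        double best = HUGE_VAL;
        for (int i = 0; i < n; i++) {
            double d = hypot(x[i] - gx, y[i] - gy);
            if (d < best)
                best = d;   /* minimum distance over the trajectory */
        }
        return best;        /* smaller values seek the goal better  */
    }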

The following figure shows how accurate models of response stabilize goal seeking.

In the above situation, if an arc-based model were used, the system would issue a hard left command. However, the more accurate clothoid model reveals that such a command would actually increase the tracking error, leading to even more overcorrection. A tracker based on a more accurate clothoid model would recognize the situation and issue a zero curvature command that would correctly acquire the goal at the lookahead distance.

This is nothing more than a simple version of classical model referenced control (MRC) (Landau 1979; Whitaker et al. 1958) implemented through command space sampling. In MRC, a reference model (our simulator here) describes the desired input-output characteristics of the closed-loop plant. The controller (our command space generate and test algorithm) attempts to cause the real plant to agree with the reference model.
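To make the distinction concrete, the heading developed over a travelled distance can be forward-integrated under a rate-limited (clothoid) curvature response; the arc model is the limiting case of an infinite slew rate. This is a minimal sketch with illustrative names and step sizes, not the RANGER simulator:

    #include <math.h>

    /* Heading developed after travelling distance s at speed V when  */
    /* curvature slews from kappa0 toward kappa_cmd at kappa_rate.    */
    /* With kappa_rate unbounded this reduces to the arc model,       */
    /* psi = kappa_cmd * s.                                           */
    double heading_after(double s, double V, double kappa0,
                         double kappa_cmd, double kappa_rate)
    {
        double psi = 0.0, kappa = kappa0;
        const double ds = 0.01;                  /* integration step  */
        for (double trav = 0.0; trav < s; trav += ds) {
            double slew = kappa_rate * (ds / V); /* curvature change  */
            if (fabs(kappa_cmd - kappa) <= slew)
                kappa = kappa_cmd;
            else
                kappa += (kappa_cmd > kappa) ? slew : -slew;
            psi += kappa * ds;                   /* dpsi/ds = kappa   */
        }
        return psi;
    }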

6.7. Results

In the RANGER navigator, command alternatives are expressed in terms of constant speed, constant curvature commands.

Figure 33: Adaptive Pure Pursuit Algorithm. The lookahead is adapted to allow for large tracking errors when obstacles are avoided.

/* First loop: maintain a running record of the close point        */
/* (point 2) over an arc length window that adapts to the          */
/* tracking error L12. The distances L12, L13, L23 are assumed     */
/* to be recomputed as the points change.                          */
min = HUGE; way_pt = close;
while( L23 < L12 + lookahead )
  {
  x3 = path_x[way_pt]; y3 = path_y[way_pt];
  if( L13 < min )
    { min = L13; close = way_pt; }
  way_pt++;
  }

/* Second loop: move point 3 forward until it falls outside a      */
/* circle about the vehicle of radius L12 + lookahead.             */
way_pt = close;
while( L13 < L12 + lookahead )
  {
  x3 = path_x[way_pt]; y3 = path_y[way_pt];
  goal_pt = way_pt; way_pt++;
  }

Figure 34: Adaptive Pure Pursuit. Inaccurate models of response lead to servo instability. Note that the vehicle is in a sharp left turn and happens to have zero heading at t=0. (Labels: Arc Model - unstable; Clothoid Model - stable.)


Several times a second, the optimal control arbiter considers approximately ten steering angles to use during the next control cycle. The forward model simulates the effect of using these steering angles over a short period of time and evaluates each of the resultant paths. Any steering angles that result in paths that go near or through hazardous vehicle configurations are discarded. The steering angle that results in a path that is optimal based on several criteria is chosen.

In the figure below, the system issues a left turn command to avoid a hill to its right. The histograms represent the votes for each candidate trajectory (higher values indicate safer trajectories). The hazards are:

• ROLL: excessive roll
• PITCH: excessive pitch
• BODY: collision with the body
• WHEEL: collision with the wheels

The tactical vote (TACT) is the overall vote of hazard avoidance. The strategic vote (STRAT) is the goal seeking vote. The arbiter chooses the third trajectory from the left because it is closest to the strategic vote maximum (straight) while exceeding the threshold for safety.
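A minimal sketch of that final choice, assuming each candidate already carries its tactical and strategic votes (all names are illustrative):

    typedef struct { double kappa, tact, strat; } candidate;

    /* Choose a steering angle: discard candidates whose tactical    */
    /* (safety) vote falls below threshold, then take the remaining  */
    /* candidate with the best strategic (goal seeking) vote.        */
    int arbitrate(const candidate *c, int n, double threshold)
    {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (c[i].tact < threshold)
                continue;                      /* unsafe: discarded */
            if (best < 0 || c[i].strat > c[best].strat)
                best = i;
        }
        return best;  /* chosen candidate index, or -1 if none safe */
    }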

7. Summary and Conclusions

This section summarizes our perspectives and conclusions.

7.1. Perspectives

Our implementation of a tactical control layer implies a perspective on the need for deliberation and reactivity in autonomous vehicles. It seems that both reactivity and deliberation have their place in our approach to local intelligent mobility.

Memory. From a real-time response point of view, high-speed navigators cannot afford the computation necessary to continually process the same scene geometry at sufficiently high resolution. Thus, for high-speed navigators, the memory involved in the mapping of the environment is an essential system capability.

Deliberation. For high-speed navigators, models of dynamics take the place of models of logical precedence used in AI in that they limit the states reachable in a small period of time from any given state. In such navigators, the deliberative reasoning about the future impact of current actions that is implemented in forward models is an essential capability - at least to the extent that the system must understand its own motion.

Reaction. The latency models developed in the paper and the precision timekeeping that has been incorporated in the software seem more applicable to the control systems of fighter aircraft than to autonomous vehicles. It seems that high-speed navigators must reason about their ability to respond.

7.2. Conclusions

This section presents a short list of conclusions which seem most significant to the problem and most relevant to the more general problem of autonomous mobility.

Tactical Control Layer. A modification of a standard architectural model of robotic systems has been proposed which connects strategic geometric reasoning to dynamic reactive control in an effective manner when the system under control exhibits poor command following. The problem is solved in this intermediate layer between AI and control, between reactive and deliberative approaches.

Adaptive Perception. The throughput problem of autonomous navigation can be managed at contemporary speeds by computational stabilization of the sensor sweep and active control of resolution through intelligent image subsampling.

Computational Image Stabilization. Adaptive perception techniques which computationally stabilize the vertical field of view provide the best of both worlds.

Figure 35: Optimal Control Arbitration. The system chooses a steering angle from a set of candidate trajectories. The histograms represent “votes” for each candidate trajectory where higher values indicate preferred trajectories. (Histograms: Roll, Pitch, Body, Wheel, Tact, Strat; the preferred path turns left to avoid the hill.)


They provide the high throughput necessary for high-speed motion and the wide field of view necessary for rough terrain.

Adaptive Regard. In a manner similar to the use of a focus of attention in perception, a focus of attention can be computed for obstacle avoidance that reflects the capacity of the vehicle to react at any given time. Adaptive regard places a limit on how close a vehicle should look for hazards because it cannot react inside of some distance. Thus, adaptive regard calls for all data inside some lower limit to be ignored, and limits are placed on the extent of data processed beyond this minimum limit by considerations of both minimum planning throughput and range data quality.

Command Space Search. Dynamics of many kinds imply that the local planning problem is actually relatively easy from the point of view of search complexity. The planning “state space”1 of the high-speed Ackerman vehicle is degenerate. Ordering heuristics generally optimize search by imposing the most constraining limits first and reducing the size of the search space as fast as possible. This principle is used here because, once dynamically infeasible paths are eliminated, only a few remaining alternatives are spatially distinct enough to warrant consideration.

1. In AI, the term “state space” has a slightly different connotation than in control theory.

Dynamic Models. The incorporation of dynamics models has generated a local navigation system which remains stable past the limits beyond which our kinematically modeled systems became unstable. The use of dynamic models also makes obstacle avoidance more reliable in general by imparting to the system a more accurate understanding of its ability to respond.

Forward Modeling. Physical dynamics amounts to an overwhelming constraint on the maneuverability of a high-speed vehicle. For a vehicle or operating regime for which classical path generation based on via points becomes difficult, forward modeling has the advantage that generated trajectories are feasible by construction.

Arbitration. The simultaneous satisfaction of hazard avoidance and goal seeking can cause contention for the absolute control of vehicle actuators, and in a practical system, this contention must be resolved through some arbitration mechanism. The problems of goal seeking, local path planning, and hazard avoidance have been unified into an optimal control context. In this context, a functional computed over the feasible set of response trajectories serves as the quantity to be optimized, and the hazard avoidance mechanism specifies the confines of the feasible set.

Hazard Space. In strict mathematical terms, the configuration space (C-space) of AI planning is a subset of the state space (S-space) of multivariate control. It is a well-established technique to abstract a mobile robot into a C-space point in six-dimensional position and attitude coordinates. The significance of hazard space (H-space) is that it performs the same function for dynamic planning that C-space performs for kinematic planning. In hazard space, the point representing the vehicle follows a trajectory which corresponds to some particular command trajectory and response trajectory.

Acknowledgments

This research was sponsored by ARPA under contracts “Perception for Outdoor Navigation” (contract number DACA76-89-C-0014, monitored by the US Army Topographic Engineering Center) and “Unmanned Ground Vehicle System” (contract number DAAE07-90-C-RO59, monitored by TACOM).

References

Aloimonos, J., Weiss, I. and Bandyopadhyay, A. 1988. Active Vision. Int. J. Computer Vision, 1:333-356.

Athans, M. and Falb, P. 1966. Optimal Control, McGraw-Hill: New York, NY.

Bajcsy, R. 1988. Active Perception. Proc. IEEE, 76(8):996-1005.

Bekker, M.G. 1960. Off-the-Road Locomotion, The University of Michigan Press.

Bekker, M.G. 1956. The Theory of Land Locomotion, The University of Michigan Press.

Blochl, B. 1994. System Requirements and Architectures for Vision-Guided Autonomous Robots. In Proc. of the 12th European Meeting on Cybernetics and Systems Research, Vienna, Austria, pp. 5-8.

Brumitt, B., Coulter, R.C. and Stentz, A. 1992. Dynamic Trajectory Planning for a Cross-Country Navigator. In Proc. of the SPIE Conference on Mobile Robots, Boston, MA.

Brooks, R. 1987. A Hardware-Retargettable Distributed Layered Architecture for Mobile Robot Control. In Proc. of the IEEE International Conference on Robotics and Automation.

Chang, T.S., Qui, K. and Nitao, J.J. 1986. An Obstacle Avoidance Algorithm for an Autonomous Land Vehicle. In Proc. of the SPIE Conference on Mobile Robots, pp. 117-123.

Chatila, R. and Lacroix, S. 1995. Adaptive Navigation for Autonomous Mobile Robots. 7th International Symposium on Robotics Research, Munich, Germany.

Clark, D. (ed). 1994. Advances in Model-Based Predictive Control, Oxford University Press: Oxford, UK.

Clarke, D.W., Mohtadi, C. and Tuffs, P.S. 1987. Generalized Predictive Control - Part 1: The Basic Algorithm. Automatica, 23(2):149.

Daily, M. et al. 1988. Autonomous Cross Country Navigation with the ALV. In Proc. of the IEEE International Conference on Robotics and Automation, Philadelphia, PA, pp. 718-726.

Dickmanns, E.D. 1988. Dynamic Computer Vision for Mobile Robot Control. In Proc. of the 19th International Symposium and Exposition on Robots, pp. 314-327.

Dickmanns, E.D. 1995. Parallel Use of Differential and Integral Representations for Realising Efficient Mobile Robots. 7th International Symposium on Robotics Research, Munich, Germany.

Dickmanns, E.D. and Zapp, A. 1986. A Curvature-Based Scheme for Improving Road Vehicle Guidance by Computer Vision. In Proc. of the SPIE Conference on Mobile Robots.

Dunlay, R.T. and Morgenthaler, D.G. 1986. Obstacle Detection and Avoidance from Range Data. In Proc. of the SPIE Conference on Mobile Robots, Cambridge, MA.

Dunlay, R.T. and Morgenthaler, D.G. 1986. Obstacle Avoidance on Roadways Using Range Data. In Proc. of the SPIE Conference on Mobile Robots, Cambridge, MA.

Feiten, W. and Bauer, R. 1994. Robust Obstacle Avoidance in Unknown and Cramped Environments. In Proc. of the IEEE International Conference on Robotics and Automation, San Diego, CA.

Feng, D., Singh, S. and Krogh, B. 1990. Implementation of Dynamic Obstacle Avoidance on the CMU Navlab. In Proc. of the IEEE Conference on Systems Engineering.

Fikes, R. and Nilsson, N. 1971. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving. Artificial Intelligence, Vol. 2:189-208.

Franke, U., Fritz, H. and Mehring, S. 1991. Long Distance Driving with the Daimler-Benz Autonomous Vehicle VITA. PROMETHEUS Workshop, Grenoble, France.

Georgeff, M. and Lansky, A. 1986. Reasoning about Actions and Plans. In Proc. of the AAAI Workshop.

Gowdy, J., Stentz, A. and Hebert, M. 1990. Hierarchical Terrain Representation for Off-Road Navigation. In Proc. of the SPIE Conference on Mobile Robots.

Grandjean, P. and Matthies, L. 1993. Perception Control for Obstacle Detection by a Cross-Country Rover. In Proc. of the IEEE International Conference on Robotics and Automation.

Hebert, M., Kanade, T. and Kweon, I. 1988. 3-D Vision Techniques for Autonomous Vehicles. The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, Technical Report CMU-RI-TR-88-12.

Hebert, M. and Krotkov, E. 1991. Imaging Laser Radars: How Good Are They? IROS.

Hendler, J., Tate, A. and Drummond, M. 1990. AI Planning: Systems and Techniques. AI Magazine, Vol. 11(2).

Hoffman, R. and Krotkov, E. 1992. Terrain Mapping for Long Duration Autonomous Walking. In Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Raleigh, NC.

Horn, B.K. and Harris, J.G. 1991. Rigid Body Motion from Range Image Sequences. Image Understanding, Vol. 53(1):1-13.

Hotz, B., Zhang, Z. and Fua, P. 1993. Incremental Construction of Local DEM for an Autonomous Planetary Rover. Workshop on Computer Vision for Space Applications, Antibes.

Jordan, M. and Jacobs, R. 1990. Learning to Control an Unstable System with Forward Modeling. In Advances in Neural Information Processing Systems 2, Morgan Kaufmann: San Francisco, CA.

Kaufman, H., Bar-Kana, I. and Sobel, K. 1994. Direct Adaptive Control Algorithms, Theory and Applications, Springer Verlag: New York, NY.

Keirsey, D., Payton, D. and Rosenblatt, J. 1988. Autonomous Navigation in Cross-Country Terrain. In Proc. of the Image Understanding Workshop.

Kelly, A., Stentz, A. and Hebert, M. 1998. Rough Terrain Autonomous Mobility: Part 1: A Theoretical Analysis of Requirements. Autonomous Robots, Vol. 5(2).

Kelly, A., Stentz, A. and Hebert, M. 1992. Terrain Map Building for Fast Navigation on Rough Terrain. In Proc. of the SPIE Conference on Mobile Robots, Boston, MA.

Kelly, A. 1995. An Intelligent Predictive Control Approach to the High-Speed, Cross-Country Autonomous Navigation Problem. Ph.D. Thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.

Kelly, A. and Stentz, A. 1997. Analysis of Requirements of Rough Terrain Autonomous Mobility: Part I: Throughput and Response. IEEE International Conference on Robotics and Automation, Albuquerque, NM.

Kelly, A. and Stentz, A. 1997. Analysis of Requirements of Rough Terrain Autonomous Mobility: Part II: Resolution and Accuracy. IEEE International Conference on Robotics and Automation, Albuquerque, NM.

Kelly, A. and Stentz, A. 1997. Computational Complexity of Terrain Mapping Perception in Autonomous Mobility. IEEE International Conference on Robotics and Automation, Albuquerque, NM.

Kirk, D.E. 1970. Optimal Control Theory, Prentice Hall: Englewood Cliffs, NJ.

Krotkov, E. and Hebert, M. 1995. Mapping and Positioning for a Prototype Lunar Rover. In Proc. of the IEEE International Conference on Robotics and Automation.

Kweon, I.S. 1990. Modelling Rugged Terrain by Mobile Robots with Multiple Sensors. Ph.D. Thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.

Landau, I.D. 1979. Adaptive Control: The Model Reference Approach, Marcel Dekker: New York, NY.

Levine, W. (ed). 1996. The Control Handbook, CRC Press: Boca Raton, FL.

Lozano-Perez, T. and Wesley, M.A. 1979. An Algorithm for Planning Collision Free Paths Among Polyhedral Obstacles. Communications of the ACM, Vol. 22(10):560-570.

Matthies, L. 1992. Stereo Vision for Planetary Rovers. International Journal of Computer Vision, 8(1):71-91.

Matthies, L. and Grandjean, P. 1994. Stochastic Performance Modelling and Evaluation of Obstacle Detectability with Imaging Range Sensors. IEEE Transactions on Robotics and Automation, Vol. 10(6):783-791.

Matthies, L., Kelly, A. and Litwin, T. 1995. Obstacle Detection for Unmanned Ground Vehicles: A Progress Report. 7th International Symposium on Robotics Research, Munich, Germany.

Moravec, H.P. 1980. Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Ph.D. Thesis, Stanford University, Stanford, CA.

Olin, K. and Tseng, D. 1991. Autonomous Cross Country Navigation. IEEE Expert, 16-30.

Ogata, K. 1967. State Space Analysis of Control Systems, Prentice Hall: Englewood Cliffs, NJ.

Passino, K.M. and Antsaklis, P.J. 1989. A System and Control Theoretic Perspective on Artificial Intelligence Planning Systems. Applied Artificial Intelligence, 3:1-32.

Rosenblatt, J. and Payton, D. 1989. A Fine-Grained Alternative to the Subsumption Architecture. In Proc. of the AAAI Symposium on Robot Navigation.

Shin, D.H., Singh, S. and Shi, W. 1991. A Partitioned Control Scheme for Mobile Robot Path Planning. In Proc. of the IEEE Conference on Systems Engineering, Dayton, OH.

Shkel, A.M. and Lumelsky, V.J. 1995. The Jogger's Problem: Accounting for Body Dynamics in Real-Time Motion Planning. In Proc. of the 1995 International Conference on Intelligent Robots and Systems, Pittsburgh, PA.

Singh, S. et al. 1991. FastNav: A System for Fast Navigation. Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, Technical Report CMU-RI-TR-91-20.

Soeterboek, R. 1992. Predictive Control, a Unified Approach, Prentice Hall International: Englewood Cliffs, NJ.

Stentz, A. 1993. Optimal and Efficient Path Planning for Unknown and Dynamic Environments. Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, Technical Report CMU-RI-TR-93-20.

Stentz, A. 1995. Optimal and Efficient Path Planning for Unknown and Dynamic Environments. International Journal of Robotics and Automation, Vol. 10(3).

Stentz, A. and Hebert, M. 1995. A Complete Navigation System for Goal Acquisition in Unknown Environments. Autonomous Robots, Vol. 2(2).

Spofford, J. and Gothard, B. 1994. Stopping Distance Analysis of Ladar and Stereo for Unmanned Ground Vehicles. In Proc. of the International Society of Optical Engineering, Mobile Robots IX, Boston, MA.

Thompson, A. 1977. The Navigation System of the JPL Robot. In Proc. of the International Joint Conference on Artificial Intelligence, pp. 749-757.

Whitaker, H.P., Yamron, J. and Kezer, A. 1958. Design of Model Reference Adaptive Control Systems for Aircraft. Instrumentation Laboratory, Massachusetts Institute of Technology, Cambridge, MA, Report R-164.

Alonzo Kelly is a project scientist at the Robotics Institute of Carnegie Mellon University. He received his B.A.Sc. in Aerospace Engineering from the University of Toronto in 1984, his B.Sc. in Computer Science from York University in 1990, and his Masters and Ph.D. degrees in Robotics from Carnegie Mellon University in 1994 and 1996, respectively. His research interests include perception, planning, control, simulation, and operator interfaces for indoor and outdoor mobile robots.

Anthony Stentz is a senior research scientist at the Robotics Institute of Carnegie Mellon University. He received his Ph.D. in computer science from Carnegie Mellon University in 1989. He received his M.S. in computer science from CMU in 1984 and his B.S. in physics from Xavier University of Ohio in 1982. His research interests include mobile robots, path planning, computer vision, system architecture, and artificial intelligence in the context of fieldworthy robotic systems.

List of Symbols

Lowercase Alphabetics

b ................. baseline
c ................. undercarriage clearance
d ................. disparity
d* ................ correct disparity
dmin .............. minimum disparity
dmax .............. maximum disparity
f ................. focal length
h ................. sensor height
p ................. sensor / vehicle nose offset
r ................. wheel radius
s ................. arc length, distance travelled
t ................. time
x ................. crossrange coordinate
y ................. downrange coordinate
z ................. vertical coordinate
zf ................ front elevation
zr ................ rear elevation

Uppercase Alphabetics

Cd ................ correlation
Kp ................ proportional gain
L ................. vehicle wheelbase
R ................. range
Rmax .............. maximum range
Rmin .............. minimum range
S ................. scatter matrix
T ................. time, time interval
V ................. vehicle speed
W ................. vehicle width, swath width
Y ................. groundplane projected range
Ymin .............. min groundplane projected range
Ymax .............. max groundplane projected range
XL, YL ............ left camera frame axes
XR, YR ............ right camera frame axes

Bold Alphabetics

f(x,u) ............ nonlinear system dynamics model
g(x) .............. terrain following relationship
h(x) .............. hazard model
x, x(t), xi(t) .... state vector, candidate state vector
y, y(t), yi(t) .... output vector, candidate output vector
ysafe ............. safety threshold vector
u, u(t), ui(t) .... control vector, candidate command vector
ud ................ terrain disturbance vector
A ................. system dynamics matrix
B ................. input distribution matrix
Xf ................ set of mechanically feasible trajectories
Xa ................ set of admissible trajectories
L[x] .............. goal functional

Greek Alphabetics

α ................. steer angle
β̇ ................. angular velocity (z component)
δ ................. normalized disparity
κ ................. curvature
ψ ................. yaw, pixel azimuth, vehicle yaw
ψ̇ ................. vehicle yaw rate
θ ................. pitch, elevation
φ ................. roll

Increments and Differentials

dx, ∆x ............ crossrange incremental distance
dy, ∆y ............ downrange incremental distance
dz, ∆z ............ vertical incremental distance
dθ, ∆θ ............ pitch/elevation increment or error
dψ, ∆ψ ............ yaw/azimuth increment or error
∆δ ................ change in normalized disparity