
Navigating Congested Environments with Risk Level Sets

Alyssa Pierson1, Wilko Schwarting1, Sertac Karaman2, and Daniela Rus1

Abstract: In this paper, we address the problem of navigating in a cluttered environment by introducing a congestion cost that maps the density and motion of objects to an occupancy risk. We propose that an agent can choose a “risk level set” from this cost function and construct a planning space from this set. In choosing different levels of risk, the agent adjusts its interactions with the other agents. From the assumption that agents are self-preserving, we show that any agent planning within their risk level set will avoid collisions with other agents. We then present an application of planning with risk level sets in the framework of an autonomous vehicle driving along a highway. Using the risk level sets, the agent can determine safe zones when planning a sequence of lane changes. Through simulations in Matlab, we demonstrate how the choice of risk threshold manifests as aggressive or conservative behavior.

I. INTRODUCTION

As autonomous vehicles become more prevalent in today’s society, they must learn to operate in congested and cluttered environments. To successfully navigate a dynamic environment, it is important to first quantify the level of congestion in the environment. Once the environmental congestion is known, the agent must choose a control law such that it avoids collisions with the other obstacles and agents in the environment. This paper proposes using “risk level sets” to navigate a cluttered environment. We first introduce a cost function that quantifies the level of congestion from both the location of other obstacles and agents, as well as their movement through the environment. This cost function allows us to measure occupancy for points in the environment and map it to a measure of risk for an agent attempting to navigate to that point. We allow the agents to choose a risk threshold from calculated level sets of the cost function. By construction, a conservative agent chooses a lower risk threshold, while an aggressive agent chooses a higher risk threshold. Agents are allowed to make any control action within this set, and in doing so, they are guaranteed to avoid collisions with other agents and obstacles. Our approach is useful to a number of applications in multi-agent systems. Using risk level sets reduces the control space for an agent, which guarantees collision avoidance. This is particularly useful in applications that require exploring a cluttered environment, or navigating in crowds or traffic. In situations where the number of obstacles may change, our approach scales with the density of the environment, both providing a metric of congestion to the agent and adjusting the control space.

1 Computer Science & Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA [apierson, wilkos, rus]@mit.edu

2 Laboratory of Information and Decision Systems (LIDS), MIT, Cambridge, MA, USA [email protected]

This work was supported in part by NSF Grant IIS-1724058, the Office of Naval Research (ONR) Grant N00014-12-1-1000, and by Toyota Research Institute (TRI). This article solely reflects the opinions and conclusions of its authors and not TRI or any other Toyota entity. Their support is gratefully acknowledged.

A main focus of this paper is how these risk level sets may be applied to autonomous vehicles. For self-driving cars to fully integrate into our daily lives, they must interact with human drivers and respond to everyday traffic scenarios. In the case of highway navigation, an autonomous vehicle will need to adapt to varying levels of traffic and congestion along the road. We use the problem of an autonomous vehicle planning a sequence of lane changes along a highway to demonstrate our planning algorithm. Using our congestion cost, the vehicle quickly assesses the level of congestion on the road. To plan the sequence of lane changes, we map the cost to a weighted graph along a planning horizon, and use Dijkstra’s Algorithm to find the fastest route through traffic while avoiding collisions with other cars. Using our risk level set formulation, we also demonstrate how an agent’s choice of risk threshold manifests as varying behavior: conservative drivers with lower risk thresholds will make fewer lane changes, while aggressive drivers with a higher risk threshold attempt more lane changes.

A. Related Work

We focus on two core concepts: planning with guaranteed collision avoidance, then applying this to planning sequences of lane changes. Within multi-agent systems, there is extensive work on collision avoidance [1], but most of it assumes simple approximations of the vehicle dynamics. In [2], the authors compare the reachable sets of vehicles and avoid collisions by choosing actions that avoid other vehicles while satisfying an objective function. While this works well for a small number of vehicles with reliable communication, our problem requires planning around large numbers of obstacles without communication to the human drivers. Another approach is to use a “global dynamic window” for high speed navigation [3], [4], which checks admissible velocities for motion planning against obstacles in view; however, this work is limited to static obstacles. In [5], the authors discretize a non-convex environment into polygons for navigating around static obstacles.

Our paper also draws inspiration from recent work in barrier functions for safety. In [6], the authors unify control barrier functions and Lyapunov functions in order to provide actuator bounds that guarantee safe performance. Further extensions look at other dynamics, as well as encoding collision avoidance and control guarantees for distributed systems with a single function [7], [8]. Here, we define a cost function that allows us to choose various risk thresholds for the agents.

In Section IV, we discuss how the risk level sets can be used for autonomous vehicles, and present an example of navigating in traffic along a highway. We build upon the authors’ previous work, wherein lane changes among a group of vehicles are planned recursively [9]. In designing lane change algorithms, there are a multitude of approaches for autonomous vehicles. One method is to model the behavior by traffic flow, setting rules that correspond to flux [10]. Similarly, in [11], each lane has a pre-defined maximum speed, and cars “hop” between lanes to lower the probability of getting caught behind a slower vehicle. Before making any change, a maneuver must be deemed safe. One method is to determine a safe gap for merging [12], [13]; however, these gaps may change as the density of traffic and driver personalities change [14]. Other methods attempt to mimic human behavior [15], or use reinforcement learning on lane switching [16]. To address risk in maneuvers, one approach is to use Scenario Model Predictive Control to account for uncertainty [17], while others explore a series of trajectories and choose the least-risky approach [18]. In this paper, we account for risk by defining a threshold to avoid collisions, and then allowing the agents to choose a level set of that risk, which scales with the density of the environment.

2018 IEEE International Conference on Robotics and Automation (ICRA) May 21-25, 2018, Brisbane, Australia

978-1-5386-3080-8/18/$31.00 ©2018 IEEE

The remainder of the paper is organized as follows: in Section II, we present our problem formulation and assumptions. Section III defines the risk level sets as a series of thresholds, and gives a proof of collision avoidance under these risk thresholds. We apply these concepts to the problem of highway navigation for autonomous vehicles in Section IV, and Section V details our simulation results of the highway navigation scenarios. We present our conclusions in Section VI.

II. PROBLEM FORMULATION

In this section, we present our formulation of occupancy cost, then state our assumption of self-preserving agents. Our approach treats all objects and agents in the environment as dynamic obstacles. Consider an environment $Q \subset \mathbb{R}^N$, with points in $Q$ denoted $q$. For $n$ agents in the environment, we denote the position of each agent as $x_i$, for $i \in \{1, ..., n\}$. We define the ego agent as the planning agent, with position denoted $x_e$. We assume agents have double-integrator dynamics, such that

$$\ddot{x}_i = u_i,$$

where $u_i$ is the control input for each agent. We also assume that all agents have maximum speed and acceleration, such that $\|\ddot{x}_i\| < a_i$ and $\|\dot{x}_i\| < v_i$, where $a_i$ and $v_i$ are the maximum acceleration and velocity, respectively.

We introduce an occupancy risk cost, $H$, which captures both the density of agents at a location, as well as their intended direction of motion. The cost is calculated based on all agents in the environment, defined

$$H(q, x, \dot{x}) = \sum_{i=1}^{n} \frac{\exp\left(-(q - x_i)^T \Omega (q - x_i)\right)}{1 + \exp\left(-\alpha \dot{x}_i^T (q - x_i)\right)}, \tag{1}$$

where $\Omega$ is the diagonal matrix of the inverse square of the standard deviation. In $\mathbb{R}^2$, $\Omega = \mathrm{diag}\left\{\frac{1}{\sigma_x^2}, \frac{1}{\sigma_y^2}\right\}$. Note that the construction of $H$ is a Gaussian peak multiplied by a logistic function. The Gaussian peak represents an estimate of an agent’s position, while the logistic function skews this estimate in the direction of the velocity. By skewing the position by the velocity, the occupancy cost $H$ is higher along points in the direct path of the agent’s motion, and lower otherwise. Figure 1 illustrates the mapping from object positions and velocities to contours of $H$.

Fig. 1: (a) Agents in environment with velocity direction indicated by blue lines. (b) Contour of $H$, noting the skew in the direction of the velocity.
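To make the shape of the cost concrete, the following is a minimal sketch of Eq. (1) in the plane, using only the Python standard library. The function name `occupancy_cost`, the default values of `sigma` and `alpha`, and the single-agent example are illustrative assumptions, not values taken from the paper.

```python
import math

def occupancy_cost(q, positions, velocities, sigma=(1.0, 1.0), alpha=1.0):
    """Evaluate the occupancy cost H of Eq. (1) at a point q in R^2.

    sigma = (sigma_x, sigma_y) and alpha are illustrative defaults (assumptions).
    """
    total = 0.0
    for (px, py), (vx, vy) in zip(positions, velocities):
        dx, dy = q[0] - px, q[1] - py
        # Gaussian peak: exp(-(q - x_i)^T Omega (q - x_i)), Omega = diag(1/sx^2, 1/sy^2)
        gauss = math.exp(-(dx * dx / sigma[0] ** 2 + dy * dy / sigma[1] ** 2))
        # Logistic skew along the velocity: 1 / (1 + exp(-alpha * xdot_i^T (q - x_i)))
        logistic = 1.0 / (1.0 + math.exp(-alpha * (vx * dx + vy * dy)))
        total += gauss * logistic
    return total

# Hypothetical example: one agent at the origin moving in the +x direction.
agents = [(0.0, 0.0)]
velocities = [(1.0, 0.0)]
cost_ahead = occupancy_cost((0.5, 0.0), agents, velocities)
cost_behind = occupancy_cost((-0.5, 0.0), agents, velocities)
```

In this example the point ahead of the agent receives a higher cost than the mirrored point behind it, which is the skew shown in Fig. 1(b). For very large values of `alpha` times the velocity projection, `math.exp` may overflow, so a production version would need a numerically stable logistic.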

We use the cost function as a proxy for the intention of the other agents in the system. In highly-cluttered and congested environments, it is impractical for the ego agent to know the control laws of all other agents in the system. Instead, we propose an algorithm where the ego agent only needs to calculate $H$ and from this, can generate a conservative control plan that avoids collision with other agents. To calculate $H$, we assume the ego agent is able to observe the state (position) of the other agents, and can also estimate the velocity and acceleration of the other vehicles from prior observations. Since $H$ is defined in the local reference frame of the ego agent, these measurements only need to be relative. We assume the other agents in the system are self-preserving, formalized in Assumption 1.

Assumption 1. All agents in the environment are “self-preserving,” defined as avoiding collisions with other agents when possible. To avoid collisions, the agents require a minimum stopping distance, as determined by their dynamics. If two agents’ planned paths intersect, the agents will stop before colliding, given sufficient stopping distance.

For our assumption of “self-preserving” agents, we only assume all agents have sufficient stopping distance at any given time to avoid collisions with other obstacles and agents. We do not assume an advanced collision-avoidance planner of any agent, and instead incorporate the stopping distance into determining our thresholds for the congestion cost. For each agent, from its maximum velocity and acceleration, we can find the stopping radius $r_i$ needed for each agent to avoid collisions. While this assumption is valid in many scenarios, it does not allow for malicious agents in the system who may intentionally try to collide with other agents. Future work will explore this and other relaxations of the self-preserving assumption. The following section details how we determine risk thresholds for the agents under the self-preserving assumption.
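The text does not give a formula for the stopping radius. One common choice, shown here purely as an assumption, is the constant-deceleration kinematic relation $r_i = v_i^2 / (2 a_i)$, which is the distance covered while braking from the maximum speed at the maximum deceleration. The function name `stopping_radius` is hypothetical.

```python
def stopping_radius(v_max, a_max):
    """Distance needed to brake from v_max to rest at constant deceleration a_max.

    Assumption: r = v_max^2 / (2 * a_max); the paper only states that the radius
    follows from the maximum velocity and acceleration.
    """
    if a_max <= 0:
        raise ValueError("a_max must be positive")
    return v_max ** 2 / (2.0 * a_max)

# Hypothetical numbers: 20 m/s top speed, 5 m/s^2 maximum braking.
r_i = stopping_radius(20.0, 5.0)
```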

III. DETERMINING RISK THRESHOLD

Under the assumption of self-preserving agents, we find the appropriate risk thresholds for the agents to guarantee collision avoidance. We propose that there is some cost threshold such that an agent that stays below it is guaranteed to avoid collisions. This section defines the various risk thresholds for both collision avoidance and planning costs. Let $H_c$ define the cost at which a collision occurs. For self-preserving agents, the goal is for an agent to never exceed this cost. Given the dynamics of the system, we define $H_T < H_c$ as a maximum risk threshold, wherein if the agent stays within this cost, it will always have sufficient stopping distance to avoid collisions. Finally, we define $H_P \le H_T$ as the planning threshold for an agent. If an agent chooses the maximum allowable risk, then $H_P = H_T$, but for any agent choosing a lower risk, $H_P < H_T$.

We construct a risk level set within the environment by choosing all points whose cost satisfies $H \le H_P$, defined

$$L_p = \left\{ q \mid H(q - x_e, \dot{x}_i) \le H_P \right\}. \tag{2}$$
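As a rough illustration of Eq. (2), the sketch below filters a grid of candidate points by a cost threshold. The stand-in cost `single_agent_cost` (one stationary agent at the origin with $\Omega = I$), the grid, and the threshold value 0.1 are all assumptions chosen for the example.

```python
import math

def single_agent_cost(q):
    # Hypothetical stand-in for H: one stationary agent at the origin, Omega = I.
    # With zero velocity the logistic factor equals 1/2.
    return math.exp(-(q[0] ** 2 + q[1] ** 2)) / 2.0

def risk_level_set(points, cost, h_p):
    """Return the sampled points q with cost(q) <= h_p, a discrete version of L_p."""
    return [q for q in points if cost(q) <= h_p]

# 5 x 5 integer grid around the agent; h_p = 0.1 is an arbitrary planning threshold.
grid = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
planning_set = risk_level_set(grid, single_agent_cost, h_p=0.1)
```

Here the agent's own cell and its four axis-aligned neighbors are excluded, and the remaining grid points form the sampled planning set. Lowering `h_p` removes more points around the agent.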

We will demonstrate that an agent planning actions within its risk level set $L_p$ will avoid collisions with all other agents and obstacles. To do so, we present a series of proofs that define the various risk thresholds of the system. Using two agents, we first solve for the maximum cost before collision $H_c$ in Lemma 1. Next, we define the maximum threshold $H_T$ in Lemma 2. To construct the planning set, we require that the planning threshold $H_P \le H_T$. In Theorem 1, we show that if an agent starts inside $L_p$ and plans all future actions within its risk level set, the cost will never exceed $H_c$, thus avoiding collisions. From the two-agent system, we extend these results to multiple agents and obstacles in the system. We also discuss the situation when the risk level set $L_p$ is empty, and the implications on the agent. This only occurs when agents in the environment have varying planning thresholds or when an agent is braking to avoid collision. However, from the assumption of self-preserving agents, collisions will still be avoided given the design of the risk thresholds.

A. Two-Agent System

Consider a two-agent system comprising the ego agent, located at $x_e$, and another agent, located at $x_i$. Without loss of generality, we can express the dynamics of the system such that the ego agent appears stationary at the origin. In this modified reference frame, we define $v_i$ as the velocity of $x_i$, and $a_i$ as the acceleration of $x_i$. Using these dynamics, we define $R_b$ as the braking distance of $x_i$ when traveling at its maximum velocity, $\dot{x}_i$, and $r_c$ as the safety radius of the vehicle enclosing its extent. A collision is thus defined when $\|d\| \le r_c$.

To choose a control action, we need the cost calculated at the location of the ego agent. Let $d = (x_e - x_i)$, expressed in the modified coordinate frame such that $x_e$ is the origin. We write the cost from (1) for the two-agent system as

$$H(d, \dot{x}_i) = \frac{\exp\left(-d^T \Omega d\right)}{1 + \exp\left(-\alpha \dot{x}_i^T d\right)}. \tag{3}$$

For our ego agent to avoid collisions, we know the distance between the two agents must satisfy $\|d\| > r_c$. In Theorem 1, we show that by the properties of $H$, collision avoidance also means the cost $H$ never exceeds a maximum cost $H_c$. From $r_c$, the safety radius of the vehicle, the following Lemma summarizes the calculation of $H_c$.

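Lemma 1. For a two-agent system with cost $H$ defined in (3), the maximum cost before collision $H_c$ is defined

$$H_c = \frac{\exp\left(-\frac{r_c^2}{\sigma_m^2}\right)}{1 + \exp\left(-\alpha v_i r_c\right)}, \tag{4}$$

where $\frac{1}{\sigma_m^2} = \min\left\{\frac{1}{\sigma_x^2}, \frac{1}{\sigma_y^2}\right\}$.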

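Proof. To calculate the maximum value of $H_c$ before collision, we evaluate $H$ when $\|d\| = r_c$. Let $u$ represent the unit vector $u = \frac{d}{\|d\|}$. The safety radius can be written $r_c u$. Substituting this into the expression of cost (3),

$$H(r_c u, \dot{x}_i) = \frac{\exp\left(-r_c^2 u^T \Omega u\right)}{1 + \exp\left(-\alpha r_c \dot{x}_i^T u\right)}.$$

Note that the value of $u^T \Omega u$ depends on the direction of $u$. Define $\frac{1}{\sigma_m^2} = \min\left\{\frac{1}{\sigma_x^2}, \frac{1}{\sigma_y^2}\right\}$. We know

$$\frac{1}{\sigma_m^2} \le u^T \Omega u,$$

thus $\exp\left(-\frac{r_c^2}{\sigma_m^2}\right) \ge \exp\left(-r_c^2 u^T \Omega u\right)$. The value of $H_c$ will depend on the value of $\left(1 + \exp\left(-\alpha r_c \dot{x}_i^T u\right)\right)$ in the denominator. To find the largest value of $H_c$, we need to find the smallest value of the denominator. Note the value of $\exp\left(-\alpha r_c \dot{x}_i^T u\right)$ also depends on the relationship between $\dot{x}_i$ and $u$. If $x_i$ is moving away from $x_e$, then $\dot{x}_i^T u < 0$, which drives the overall value of $H$ toward zero. We see that $H(r_c u, \dot{x}_i)$ increases when $\dot{x}_i = v u$, where $v$ is the magnitude of the velocity. Given a maximum velocity $v_i$, we find

$$H(r_c u, \dot{x}_i) \le \frac{\exp\left(-\frac{r_c^2}{\sigma_m^2}\right)}{1 + \exp\left(-\alpha r_c v_i\right)}.$$

Since $H_c$ is the maximum value of $H(r_c u, \dot{x}_i)$, we find

$$H_c = \frac{\exp\left(-\frac{r_c^2}{\sigma_m^2}\right)}{1 + \exp\left(-\alpha r_c v_i\right)},$$

which is equivalent to (4).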
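The bound in Lemma 1 can be spot-checked numerically. The sketch below evaluates Eq. (4) and compares it against a brute-force sweep of Eq. (3) over directions $u$ and velocity headings at the safety radius. The function names and all parameter values are illustrative assumptions.

```python
import math

def collision_cost(r_c, v_max, sigma_x, sigma_y, alpha):
    """H_c from Eq. (4), with 1/sigma_m^2 = min(1/sigma_x^2, 1/sigma_y^2)."""
    inv_sigma_m_sq = min(1.0 / sigma_x ** 2, 1.0 / sigma_y ** 2)
    return math.exp(-r_c ** 2 * inv_sigma_m_sq) / (1.0 + math.exp(-alpha * v_max * r_c))

def cost_at_safety_radius(theta, phi, r_c, v_max, sigma_x, sigma_y, alpha):
    """H(r_c u, xdot) from Eq. (3): u at angle theta, velocity of magnitude v_max at angle phi."""
    ux, uy = math.cos(theta), math.sin(theta)
    vx, vy = v_max * math.cos(phi), v_max * math.sin(phi)
    quad = ux * ux / sigma_x ** 2 + uy * uy / sigma_y ** 2  # u^T Omega u
    return math.exp(-r_c ** 2 * quad) / (1.0 + math.exp(-alpha * r_c * (vx * ux + vy * uy)))

# Hypothetical parameters, chosen only for the check.
params = dict(r_c=1.0, v_max=2.0, sigma_x=1.0, sigma_y=2.0, alpha=0.5)
h_c = collision_cost(**params)
angles = [2.0 * math.pi * k / 36 for k in range(36)]
worst = max(cost_at_safety_radius(t, p, **params) for t in angles for p in angles)
```

With these values the sampled maximum is attained when $u$ points along the axis with the larger standard deviation and the velocity is aligned with $u$, which is the case the proof identifies.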

From Lemma 1, we see that the maximum cost $H_c$ occurs when an agent is traveling at its maximum velocity at its safety radius. For vehicles with single-integrator dynamics that can stop instantaneously, our agents could generate a level set based on this threshold and avoid collisions. However, if the agents cannot stop instantaneously, then a cost of $H_c$ would cause a collision. Instead, we also need to find the value of cost that prevents collisions given a minimum braking distance. For a braking distance $R_b$, Lemma 2 defines the maximum threshold cost $H_T$. We use $H_T$ when choosing our planned risk threshold $H_P \le H_T$.

Lemma 2. For the two-agent system with cost $H$ defined in (3), the maximum threshold to avoid collisions $H_T$ is defined

$$H_T = \frac{\exp\left(-\frac{R_b^2}{\sigma_u^2}\right)}{2}, \tag{5}$$

where $\sigma_u = \max\{\sigma_x, \sigma_y\}$.

Proof. Similar to Lemma 1, we can find the maximum threshold cost by looking at the requirements to guarantee the safe braking distance. Recall that $R_b$ is the braking distance of $x_i$, such that it will never collide with $x_e$. To guarantee the self-preservation, the threshold cost will be the minimum value of $H$ at this distance $\|d\| = R_b$. For the unit vector $u = \frac{d}{\|d\|}$, we can express

$$H(R_b u, \dot{x}_i) = \frac{\exp\left(-R_b^2 u^T \Omega u\right)}{1 + \exp\left(-\alpha R_b \dot{x}_i^T u\right)}.$$

In Lemma 1, we found the maximum value of $H$ to calculate $H_c$. Here, we find the minimum value to calculate $H_T$. Note that the value of $u^T \Omega u$ depends on the orientation and values of $\sigma_x$ and $\sigma_y$. However, we know

$$H(R_b u, \dot{x}_i) \ge \frac{\exp\left(-\frac{R_b^2}{\sigma_u^2}\right)}{1 + \exp\left(-\alpha R_b \dot{x}_i^T u\right)},$$

for $\sigma_u = \max\{\sigma_x, \sigma_y\}$. Here, we observe that the value of $H$ decreases as the denominator increases. Note that for some positive constant $c$, $\exp(-c) \le 1$. The largest value of the denominator occurs when $\dot{x}_i = 0$. Any velocity in the direction of $u$ would result in a larger value of $H$. Thus,

$$H(R_b u, \dot{x}_i) \ge \frac{\exp\left(-\frac{R_b^2}{\sigma_u^2}\right)}{2}.$$

By choosing the smallest value of $H(R_b u, \dot{x}_i)$ as $H_T$, we find the expression in (5).
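A short sketch of how Eq. (5) and a planning threshold $H_P \le H_T$ might be computed is given below. The helper names, the braking distance, and the standard deviations are assumptions; the fractions 0.9 and 0.5 mirror the aggressive and conservative settings used later in the simulations.

```python
import math

def threshold_cost(r_b, sigma_x, sigma_y):
    """H_T from Eq. (5), with sigma_u = max(sigma_x, sigma_y)."""
    sigma_u = max(sigma_x, sigma_y)
    return math.exp(-r_b ** 2 / sigma_u ** 2) / 2.0

def planning_threshold(h_t, fraction):
    """Pick H_P as a fraction of H_T so that H_P <= H_T always holds."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    return fraction * h_t

# Hypothetical values: braking distance 3, standard deviations 1 and 2.
h_t = threshold_cost(r_b=3.0, sigma_x=1.0, sigma_y=2.0)
h_p_aggressive = planning_threshold(h_t, 0.9)
h_p_conservative = planning_threshold(h_t, 0.5)
```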

We propose that to avoid collisions, the ego agent stays within $L_p$, its risk level set, defined in (2) by some choice of $H_P \le H_T$. By construction, we see that the chosen value of $H_P$ correlates with how close the ego agent will get to other agents. Intuitively, a smaller value of $H_P$ corresponds to larger separations between the ego agent and other agents, hence a “less risky” maneuver. In Theorem 1, we demonstrate that for agents starting in $L_p$ and only planning actions that stay within this risk level set, the cost will never exceed $H_c$.

Theorem 1. For an ego agent starting with some initial cost $H < H_P$ and operating within the set of points defined by $L_p$, the total cost will never exceed $H_c$.

Proof. Consider a two-agent system, with ego vehicle $x_e$ and other vehicle $x_i$, and all dynamics written such that the ego vehicle appears stationary at the origin. Given the initial cost is below $H_T$, we know that $\|d\| > R_b$, the braking distance for the other vehicle. To prove that the cost will never exceed $H_c$, we examine its dynamics. Recall

$$H(d, \dot{x}_i) = \frac{\exp\left(-d^T \Omega d\right)}{1 + \exp\left(-\alpha \dot{x}_i^T d\right)}.$$

Now consider the function

$$W(r) = \frac{\exp\left(-\frac{r^2}{\sigma_m^2}\right)}{1 + \exp\left(-\alpha v_{max} r\right)},$$

where $v_{max}$ is the maximum magnitude of velocity of $x_i$ and $r$ is the distance between the two agents, $r = \|d\|$. By inspection,

$$W(\|d\|) \ge H(d, \dot{x}_i),$$

and for $\|d\| = r_c$, $W(r_c) = H_c$. The derivative of $W$ is

$$\dot{W} = \frac{-2 r \dot{r} \exp\left(-\frac{r^2}{\sigma_m^2}\right)}{\sigma_m^2 \left(1 + \exp\left(-\alpha v_{max} r\right)\right)} - \frac{\left(-\alpha v_{max} r \dot{r}\right) \exp\left(-\alpha v_{max} r - \frac{r^2}{\sigma_m^2}\right)}{r \left(1 + \exp\left(-\alpha v_{max} r\right)\right)^2}.$$

From the derivative, $W$ is not always decreasing as $r$ increases. We can rewrite $\dot{W}$ as

$$\dot{W} = \left[\frac{-2}{\sigma_m^2} + \frac{\alpha v_{max} \exp\left(-\alpha v_{max} r\right)}{r \left(1 + \exp\left(-\alpha v_{max} r\right)\right)}\right] \beta,$$

where

$$\beta = \frac{r \dot{r} \exp\left(-\frac{r^2}{\sigma_m^2}\right)}{1 + \exp\left(-\alpha v_{max} r\right)} \ge 0.$$

The sign of $\dot{W}$ then depends on choosing an $\alpha$ such that

$$\frac{\alpha v_{max} \exp\left(-\alpha v_{max} r_c\right)}{1 + \exp\left(-\alpha v_{max} r_c\right)} < \frac{2 r_c}{\sigma_m^2}.$$

Thus, given the values of $r_c$, $\sigma_m$, and $v_{max}$, we can choose $\alpha$ such that $\dot{W} < 0$ for $r > r_c$.

Since $W(r)$ is always greater than $H$ for $\|d\| > r_c$, and $\dot{W} < 0$ for increasing $r$, then $H(d, \dot{x}_i) < H_c$ for any $\|d\| > r_c$. Thus, the cost $H(d, \dot{x}_i)$ will never exceed $H_c$ unless the agents collide. For agents operating within $L_p$, to stay in $L_p$ requires only choosing points such that $H(\cdot) < H_P < H_c$, thus any agent starting in $L_p$ will never choose an action that results in $H > H_c$, completing the claims of Theorem 1.
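The condition on $\alpha$ in the proof is easy to check for a given set of parameters. The sketch below tests the inequality at $r = r_c$ and then samples $W(r)$ on a grid beyond $r_c$ to see whether it decreases. The function names and the numeric values are assumptions made for illustration.

```python
import math

def alpha_condition_holds(alpha, v_max, r_c, sigma_m):
    """Check alpha*v*exp(-alpha*v*r_c) / (1 + exp(-alpha*v*r_c)) < 2*r_c / sigma_m^2."""
    e = math.exp(-alpha * v_max * r_c)
    return alpha * v_max * e / (1.0 + e) < 2.0 * r_c / sigma_m ** 2

def bounding_cost(r, alpha, v_max, sigma_m):
    """W(r) from the proof of Theorem 1."""
    return math.exp(-r ** 2 / sigma_m ** 2) / (1.0 + math.exp(-alpha * v_max * r))

# Hypothetical parameters that satisfy the condition.
alpha, v_max, r_c, sigma_m = 1.0, 1.0, 1.0, 2.0
radii = [r_c + 0.1 * k for k in range(41)]
values = [bounding_cost(r, alpha, v_max, sigma_m) for r in radii]
is_decreasing = all(later < earlier for earlier, later in zip(values, values[1:]))
```

A wide Gaussian combined with a small safety radius, for example `sigma_m = 10.0` and `r_c = 0.1` with the same `alpha` and `v_max`, violates the inequality, which suggests that `alpha` would need to be retuned in that regime.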

As shown in Theorem 1, an agent starting within $L_p$ will never exceed the collision threshold $H_c$. If all agents in the system stay within their respective thresholds $L_{p,i}$, then all agents will never leave these sets. However, we do not assume all agents in the system use the same control laws, only that they are self-preserving. Due to this property, there are situations in which the ego agent experiences a cost greater than $H_P$, but never exceeding $H_T$. Consider the scenario of the ego agent directly following the other agent in the environment. If the other agent brakes suddenly, the ego agent will experience an increase in cost as it moves towards the other agent. However, since $H_T$ is chosen to include a braking distance, then as soon as the ego agent sees $H > H_P$, it can begin braking to avoid collision and maintain $H < H_c$.

B. Multiple Agents

While the previous section provides collision avoidance guarantees for two agents, our goal is to find an algorithm to work in highly cluttered environments with many agents. Using the claims from Theorem 1 for two agents, this section extends those results to multiple agents in the environment. For two agents, we introduced $H$, which maps the position of the other agent to an occupancy cost in the environment, allowing us to define a risk threshold and $L_p$. When considering multiple agents, a simple approach is to consider all the agents as many pairwise interactions. For $i = 1, ..., n$ other agents, let $d_i = (x_e - x_i)$, $H_{T,i}(d_i, \dot{x}_i)$ be the threshold between two agents, and $H_{c,i}$ be the collision cost. The risk level set $L_p$ can then be expressed as the intersection of all pairwise level sets,

$$L_p = \bigcap_{i=1}^{n} L_{p,i}, \tag{6}$$

where $L_{p,i}$ is defined in (2). When defined in this fashion, Corollary 1 extends the claims of Theorem 1 to the multi-agent scenario.

Fig. 2: Illustration of the hexagonal pattern of circle packing. The blue agent in the center is the ego agent.

Corollary 1. For an ego agent $x_e$ and a group of $n$ agents with positions $x_i$, let $L_p$ represent the planning set of agent $x_e$. By planning within $L_p$, agent $x_e$ avoids collisions with all other $n$ agents in the environment.

Proof. By construction, $L_p$ is the intersection of all pairwise risk sets. For a point $q$ to be in this set, the individual agent cost $H_i(q - x_i, \dot{x}_i) < H_T$. By Theorem 1, when the ego agent plans within this risk threshold, the maximum pairwise cost will never exceed $H_{c,i}$, ensuring the ego agent does not collide with the agent $x_i$. With this pairwise approach, the ego agent avoids collisions with all agents.
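The pairwise construction in Eq. (6) can be sketched as a membership test: a point belongs to $L_p$ only if every pairwise cost stays at or below the threshold. The function names, the two stationary agents, and the threshold 0.1 are assumptions for illustration.

```python
import math

def pairwise_cost(q, position, velocity, sigma=(1.0, 1.0), alpha=1.0):
    """Single-agent term of Eq. (1), evaluated at q for one other agent."""
    dx, dy = q[0] - position[0], q[1] - position[1]
    gauss = math.exp(-(dx * dx / sigma[0] ** 2 + dy * dy / sigma[1] ** 2))
    logistic = 1.0 / (1.0 + math.exp(-alpha * (velocity[0] * dx + velocity[1] * dy)))
    return gauss * logistic

def in_planning_set(q, agents, h_p, sigma=(1.0, 1.0), alpha=1.0):
    """True if q lies in every pairwise level set L_{p,i}, i.e. in their intersection."""
    return all(pairwise_cost(q, pos, vel, sigma, alpha) <= h_p for pos, vel in agents)

# Two hypothetical stationary agents, given as (position, velocity) pairs.
agents = [((0.0, 0.0), (0.0, 0.0)), ((3.0, 0.0), (0.0, 0.0))]
midpoint_ok = in_planning_set((1.5, 0.0), agents, h_p=0.1)
near_agent_ok = in_planning_set((0.5, 0.0), agents, h_p=0.1)
```

The midpoint between the two agents passes both pairwise tests, while a point close to the first agent fails its test and is therefore excluded from the intersection.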

While Corollary 1 demonstrates a method for avoiding collisions in a pairwise fashion, computing $L_p$ as the intersection of all individual sets becomes intractable with many agents. Instead, we wish to compute a single cost across the environment, and find a single risk level set within that cost. Consider the cost as the sum of all agents in the environment. Let

$$H = \sum_{i=1}^{n} H_i(d_i, \dot{x}_i) = \sum_{i=1}^{n} \frac{\exp\left(-d_i^T \Omega d_i\right)}{1 + \exp\left(-\alpha \dot{x}_i^T d_i\right)}, \tag{7}$$

where $n$ is the number of other agents in the field, and $d_i = (x_e - x_i)$, the pairwise distance between the ego agent $x_e$ and other agent $x_i$. Here, we assume that $\Omega$, $\alpha$, $v_{max}$, and the collision radius $r_c$ are equal for all agents in the group. From the assumption that all agents have the same safety radius, we show that the level set of cost can be determined from a circle-packing problem. It turns out that for all agents with safety radius $r_c$, the hexagonal packing pattern is the densest arrangement of agents. In this pattern, at most six other agents of radius $r_c$ are positioned around the ego agent, and another twelve agents are arranged within $2r_c$ distance from the ego agent, as illustrated in Figure 2.

From this hexagonal pattern in Figure 2, we can approximate the maximum cost $H_c$ our ego agent may encounter, based on a worst-case scenario of all agents packed together. We find

$$\begin{aligned} H_c \approx {} & \sum_{i=1}^{6} \frac{\exp\left(-\frac{r_c^2}{\sigma_m^2}\right)}{1 + \exp\left(-\alpha v_{max} r_c\right)} + \sum_{i=7}^{12} \frac{\exp\left(-\frac{(1.5 r_c)^2}{\sigma_m^2}\right)}{1 + \exp\left(-1.5 \alpha v_{max} r_c\right)} \\ & + \sum_{i=13}^{18} \frac{\exp\left(-\frac{(2 r_c)^2}{\sigma_m^2}\right)}{1 + \exp\left(-2 \alpha v_{max} r_c\right)} + \dots \end{aligned} \tag{8}$$

In this summation, six agents are arranged at $r_c$, shaded with a crosshatch in Figure 2. Six more agents are a distance $1.5 r_c$, shaded with dots, and six agents are a distance $2 r_c$ away, shown in white in Figure 2. We can continue calculating this by adding more agents around the ego agent in the hexagonal pattern. Note the approximation of $H_c$ is determined by the number of agents and the packing density. For practical purposes in highly-cluttered environments, $H_c$ may be approximated by a fixed number of agents. Similarly, we can approximate $H_T$ from (5) for multiple agents as

$$H_T \approx \sum_{i=1}^{6} \frac{\exp\left(-\frac{R_b^2}{\sigma_u^2}\right)}{2} + \sum_{i=7}^{12} \frac{\exp\left(-\frac{(1.5 R_b)^2}{\sigma_m^2}\right)}{2} + \sum_{i=13}^{18} \frac{\exp\left(-\frac{(2 R_b)^2}{\sigma_m^2}\right)}{2} + \dots \tag{9}$$

For planning purposes, approximating $H_T$ by truncating the series expansion will yield a more conservative estimate. For example, letting

$$H_P \le \frac{3 \exp\left(-\frac{R_b^2}{\sigma_u^2}\right)}{2}$$

will guarantee that $H_P < H_T$ for any group size of $n \ge 6$ other agents in the environment.
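The truncated sums in (8) and (9) can be evaluated directly. The sketch below uses the ring distances $1$, $1.5$, and $2$ times the base radius with six agents per ring, as written in the equations. It assumes a single standard deviation `sigma` for every term (under the stated definitions $\sigma_m$ and $\sigma_u$ both equal $\max\{\sigma_x, \sigma_y\}$); the function names and numeric values are illustrative.

```python
import math

RING_FACTORS = (1.0, 1.5, 2.0)  # distance multipliers appearing in Eqs. (8) and (9)
AGENTS_PER_RING = 6

def approx_collision_cost(r_c, v_max, sigma, alpha, rings=3):
    """Truncated version of Eq. (8) using the first `rings` rings of agents."""
    total = 0.0
    for factor in RING_FACTORS[:rings]:
        d = factor * r_c
        total += AGENTS_PER_RING * math.exp(-d ** 2 / sigma ** 2) / (
            1.0 + math.exp(-alpha * v_max * d))
    return total

def approx_threshold_cost(r_b, sigma, rings=3):
    """Truncated version of Eq. (9) using the first `rings` rings of agents."""
    return sum(AGENTS_PER_RING * math.exp(-(factor * r_b) ** 2 / sigma ** 2) / 2.0
               for factor in RING_FACTORS[:rings])

# Hypothetical values: R_b = 1 and sigma = 1.
one_ring = approx_threshold_cost(r_b=1.0, sigma=1.0, rings=1)
three_rings = approx_threshold_cost(r_b=1.0, sigma=1.0, rings=3)
example_bound = 3.0 * math.exp(-1.0) / 2.0  # the example bound on H_P stated above
```

Because every term is positive, keeping fewer rings gives a smaller, hence more conservative, estimate of $H_T$, and the example bound on $H_P$ sits below even the one-ring estimate.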

IV. PLANNING IN RISK LEVEL SETS

The previous section introduced our definition of risk level sets, and proposed that an agent can use the risk level sets to guarantee collision avoidance. In this section, we provide an example of one such planning algorithm. For this example, consider the problem of an autonomous vehicle planning a sequence of lane changes along a highway. The ego vehicle wants to minimize its travel time along a segment, but also must avoid collisions with the other cars. We use the risk level sets to both check for collision avoidance during individual lane changes, as well as provide a metric of congestion for overall route planning.

As the previous section discussed, the ego agent uses the cost $H$ to determine some planning threshold, $H_P$, less than the threshold cost $H_T$ required for self-preserving agents. Figure 3 demonstrates various contours of $H$ around obstacle cars. In Figure 3, the value of $H_c$ corresponds to the smallest yellow region, while lower values of $H$ create a larger region around the obstacle. By choosing a smaller $H_P$, the ego car leaves more room around the other cars, resulting in less risky maneuvers. Also note the contours of $H$ are all skewed along the forward travel direction of the vehicle. Intuitively, when an agent is planning to change lanes, it needs to avoid cutting off a vehicle in front.

Fig. 3: Illustration of various contour levels. Each color represents a different level of $H$, with $H_c$ denoted in yellow. As $H$ decreases, the region around the car increases.

Our planner works in several stages, summarized in Algorithm 1. Using a short planning horizon, we discretize the horizon into a graph, with edges representing straight travel and lane changes. Using the calculated cost, we map the values of the cost function to weights along the edges. Using Dijkstra’s Algorithm, we find the shortest path along the planning horizon, representing the sequence of desired lane changes that accounts for travel time and risk in changing lanes. From the suggested sequence of lane changes, the ego agent only makes an individual lane change when it can do so without violating any risk thresholds. Since we update the planned route regularly, the path will change based on variations in traffic and relative lane speeds.

Algorithm 1 Sequential Lane Changes

1: Compute overall cost $H$ (7)
2: Compute $L_p$ (2)
3: Create graph within planning horizon
4: Map $H$ and $L_p$ to edge weights
5: Use Dijkstra’s Algorithm to compute sequence of lane changes
6: Check that single lane change does not violate $L_p$

Within our planner, the overall cost $H$ is computed across the planning horizon from observed traffic. By construction, our ego vehicle does not need perfect localization of the other vehicles, but can instead use an estimated position skewed by the observed velocity. From $H$, the agent calculates $L_p$, based on a chosen planning threshold $H_P$. In Section V, we demonstrate in simulation how the choice of $H_P$ affects the perceived aggressiveness of the ego vehicle.

To calculate the sequence of lane changes, we choose to discretize the horizon into a weighted graph and calculate the lowest-cost path. Along the planning horizon, nodes are placed in a grid pattern, centered in each lane and spaced a distance $D$ horizontally. We classify the edges as either “straight travel” or “lane change.” As the naming suggests, straight segments connect nodes sequentially within a lane, while the lane change edges connect nodes diagonally to the adjacent lanes. To assign weights, the value of straight edges is based on whether the node is occupied (indicated by the cost function), and the difference between the vehicle travel speed and some desired speed. For unoccupied nodes, straight segments are assigned a base value of $B$. For the lane changes, the edge weight is calculated from the occupancy between nodes, given by the cost $H$. For segments connecting unoccupied lane changes, the base value is $2B$, ensuring that a lane change will have a higher cost than a straight segment when no other cars are present. Figure 4 illustrates an example of the cost and the planned trajectory.

Fig. 4: Snapshot of planning algorithm with (a) values of $H > H_P$ shaded in gray, (b) the planning graph with the colors of the edges corresponding to their relative weight, and (c) computed lane change for car. Panels: (a) Calculation of $H$ around ego agent (blue); (b) Graph of planning horizon; (c) Planned route.
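The graph search can be sketched with a small lane-by-step grid and Dijkstra's algorithm from the standard library `heapq` module. This is a simplified illustration, not the planner used in the paper: nodes outside $L_p$ are simply removed (passed in as a hypothetical `blocked` set) rather than weighted by $H$, the speed-difference term is omitted, and the function name `plan_lane_sequence` is an assumption. Straight edges cost $B$ and lane changes cost $2B$, as described above.

```python
import heapq

def plan_lane_sequence(n_lanes, n_steps, start_lane, blocked, base=1.0):
    """Dijkstra over (lane, step) nodes; returns (total cost, lane index per step).

    blocked: set of (lane, step) nodes treated as outside L_p (simplifying assumption).
    """
    start = (start_lane, 0)
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best[node]:
            continue  # stale queue entry, a cheaper route was already found
        lane, step = node
        if step == n_steps - 1:
            # First node settled at the end of the horizon is the cheapest one.
            lanes = [lane]
            while node in parent:
                node = parent[node]
                lanes.append(node[0])
            return cost, lanes[::-1]
        for shift in (0, -1, 1):  # straight travel, then the two lane changes
            nxt = (lane + shift, step + 1)
            if not 0 <= nxt[0] < n_lanes or nxt in blocked:
                continue
            new_cost = cost + (base if shift == 0 else 2.0 * base)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    return float("inf"), []

free_cost, free_lanes = plan_lane_sequence(3, 4, 1, blocked=set())
detour_cost, detour_lanes = plan_lane_sequence(3, 4, 1, blocked={(1, 2)})
```

With no blocked nodes the plan stays in its lane; blocking a node ahead in the ego lane forces one lane change and raises the path cost by the difference between $2B$ and $B$.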

V. SIMULATIONS

To demonstrate the performance of our route-planning algorithm, we conducted simulations in Matlab. A video of the simulations is included with the submission of this paper. First, we present a single example of the lane change algorithm, illustrating how the car adjusts its route based on local cars and its planning horizon. Next, we present results summarizing the behavior in various traffic densities and risk thresholds. Finally, we comment on the extension to multiple cars simultaneously planning routes, and show that the cars naturally spread out and re-adjust based on the behavior of the other agents.

Fig. 5: Simulation of the ego car (blue) planning a path through traffic. The planned sequence of lane changes is shown by the dotted blue line. The contour of $H > H_P$ is shaded in grey. Panels: (a) t = 10 s; (b) t = 15 s; (c) t = 20 s; (d) t = 25 s; (e) t = 30 s.

Figure 5 illustrates a car (blue) navigating a 4-lane highway. In the figure, the planned sequence of lane changes is drawn as the dashed blue line. The contour of $H > H_P$ is shaded grey, representing points outside $L_p$. From Figure 5, we see the route changes as the agent moves through traffic and as openings appear on its planning horizon.

A. Impact of Risk Threshold and Traffic Density

To demonstrate how using different risk level sets affects performance, we compared several scenarios for a “conservative” driver and an “aggressive” driver. For each scenario, the ego car travels 2000 m on a 4-lane segment of highway. To increase the number of lane changes, the ego car has a maximum desired speed of 40 m/s. The other cars are given a maximum speed based on their lane assignment, such that the fastest lane travels at 29 m/s and the slowest lane travels at 17 m/s. Without any traffic, the expected travel time along this segment for the ego car is 50 seconds, while the expected travel time for an obstacle car in the “fast” lane is 69 seconds. We benchmark the performance of the ego car by its travel time along the segment, which demonstrates how effective it is at weaving through traffic.

Fig. 6: The difference in Lp for the conservative and aggressive driver, shown for (a) HP = .5HT and (b) HP = .9HT. The gray zone indicates values of H > HP. Note the car in (b) has a much larger planning space Lp (white area) than the car in (a).

TABLE I: Average travel time for the ego car across 100 trials.

          HP = .9HT   HP = .5HT
n = 100   56.9 s      62.7 s
n = 150   63.4 s      68.1 s
n = 200   67.2 s      69.2 s
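The free-flow travel times quoted for the 2000 m segment follow from dividing distance by speed. The short check below reproduces them; the slow-lane figure is not stated in the text and is derived here only for comparison, and the function and constant names are illustrative.

```python
# Free-flow travel time over the test segment: t = d / v.
SEGMENT_LENGTH_M = 2000.0   # segment length from the text
EGO_MAX_SPEED = 40.0        # ego car's maximum desired speed, m/s
FAST_LANE_SPEED = 29.0      # fastest obstacle lane, m/s
SLOW_LANE_SPEED = 17.0      # slowest obstacle lane, m/s


def free_flow_time(distance_m, speed_mps):
    """Travel time in seconds at constant speed with no traffic."""
    return distance_m / speed_mps


ego_time = free_flow_time(SEGMENT_LENGTH_M, EGO_MAX_SPEED)
fast_lane_time = free_flow_time(SEGMENT_LENGTH_M, FAST_LANE_SPEED)
slow_lane_time = free_flow_time(SEGMENT_LENGTH_M, SLOW_LANE_SPEED)

print(f"ego: {ego_time:.1f} s, fast lane: {fast_lane_time:.1f} s, "
      f"slow lane: {slow_lane_time:.1f} s")
# prints "ego: 50.0 s, fast lane: 69.0 s, slow lane: 117.6 s"
```

The 50 s value is a lower bound on the ego car's travel time, and the 69 s value is what the ego car would achieve by simply following the fast lane, so the averages in Table I fall between the two in most scenarios.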

To compare the performance across scenarios, we examine both the average time to travel the distance and the average number of lane changes. Table I summarizes the average time, and Table II summarizes the average number of lane changes. For each scenario, we ran 100 trials with randomized initial conditions. The ego car computes Lp from either HP = .9HT for the “aggressive” driver, or HP = .5HT for the “conservative” driver. Figure 6 illustrates the difference in Lp for the variations in HP. In Figure 6, the higher value of HP results in a gap to change lanes, whereas with the lower value of HP, the car does not have a gap. We also vary the number of obstacle cars over n = 100, 150, 200 to simulate moderate to high congestion.
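To make the effect of HP on the planning space concrete, the sketch below samples a level set Lp = {q : H(q) ≤ HP} on a coarse grid for the two thresholds used here. It is an illustration only: the cost is a stand-in sum of Gaussian bumps around each car, not the congestion cost defined earlier in the paper, and the car positions, lane centers, spreads, and the value HT = 1 are all hypothetical.

```python
import math


def congestion_cost(x, y, cars, sigma_x=8.0, sigma_y=1.5):
    """Stand-in cost H: one Gaussian bump per car (an assumption, not the
    paper's congestion cost, which also accounts for motion)."""
    return sum(
        math.exp(-((x - cx) ** 2) / (2 * sigma_x ** 2)
                 - ((y - cy) ** 2) / (2 * sigma_y ** 2))
        for cx, cy in cars
    )


def planning_set(cars, h_p, xs, ys):
    """Grid points with H <= h_p, i.e. a sampled version of L_p."""
    return {(x, y) for x in xs for y in ys
            if congestion_cost(x, y, cars) <= h_p}


# Hypothetical scene: three obstacle cars on a 4-lane road.
cars = [(20.0, 2.0), (35.0, 6.0), (45.0, 2.0)]
xs = [float(i) for i in range(0, 61)]   # longitudinal samples, m
ys = [2.0, 6.0, 10.0, 14.0]             # hypothetical lane centers, m
H_T = 1.0                               # hypothetical threshold H_T

conservative = planning_set(cars, 0.5 * H_T, xs, ys)   # HP = .5HT
aggressive = planning_set(cars, 0.9 * H_T, xs, ys)     # HP = .9HT

print(len(conservative), len(aggressive))
```

Because the threshold only moves upward, the conservative set is always contained in the aggressive one. Points close behind or ahead of a car, where the stand-in cost lies between .5HT and .9HT, are available only to the aggressive driver, which mirrors the extra lane-change gap visible in Figure 6(b).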

From Table I, for both values of HP, we see that the average travel time increases as the congestion increases. By n = 200, the average travel time is almost equal to the expected travel time of the “fast” lane. We expect this result, as in dense traffic, the car does not have sufficient empty lane space to travel at its faster desired speed and pass other cars. From Table II, we see a key difference between the aggressive and conservative driver, namely, the aggressive driver attempts more lane changes. This result is expected, as the higher HP value corresponds to a larger Lp set, and as we saw in Figure 6, more opportunities to make lane changes.

TABLE II: Average number of lane changes the ego car makes across 100 trials.

          HP = .9HT   HP = .5HT
n = 100   13.4        9.7
n = 150   14.2        8.4
n = 200   12.8        7.0

Fig. 7: For 6 agents using the route planner, each car treats the other cars as obstacles, shown at (a) t = 5s, (b) t = 10s, and (c) t = 15s. Over time, the cars adjust and disperse, based on lane changes occurring ahead.
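The comparison between the two drivers can be read off Tables I and II directly. The snippet below only restates the reported averages and takes their differences; the variable and function names are illustrative.

```python
# Values copied from Tables I and II (averages over 100 trials).
# Each entry is n -> (aggressive HP = .9HT, conservative HP = .5HT).
travel_time_s = {
    100: (56.9, 62.7),
    150: (63.4, 68.1),
    200: (67.2, 69.2),
}
lane_changes = {
    100: (13.4, 9.7),
    150: (14.2, 8.4),
    200: (12.8, 7.0),
}


def gaps(table):
    """Per-density difference between the two drivers, rounded to 0.1."""
    return {n: round(abs(a - c), 1) for n, (a, c) in table.items()}


time_saved = gaps(travel_time_s)
extra_changes = gaps(lane_changes)

for n in sorted(travel_time_s):
    print(f"n = {n}: aggressive driver is {time_saved[n]} s faster "
          f"with {extra_changes[n]} more lane changes")
```

In these averages the travel-time advantage of the aggressive driver shrinks from 5.8 s at n = 100 to 2.0 s at n = 200, while it still makes several more lane changes at every density.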

B. Multiple Agents

Our algorithm is able to handle multiple cars simultaneously planning their individual routes. In this case, the other cars are still treated as dynamic obstacles, and when forward cars change lanes, the planner adjusts its path accordingly. Figure 7 shows 6 agents simultaneously planning routes along the same section of highway, with the active cars shown in various colors, and the other traffic shown in black. Notice that while the purple car in front stays in the fast lane, the rear cars perform more passing maneuvers and updates as the cars in front change lanes. Since this algorithm is based on the density, we see the cars spread out among the different lanes instead of all moving to the “fast” lane.

VI. CONCLUSIONS

This paper addresses the problem of navigating a cluttered environment by first introducing a congestion cost, then choosing planners that operate within “risk level sets” of this cost. The congestion cost maps the density and motion of the objects to an occupancy risk for the agent. In choosing different levels of risk, the agent adjusts its interactions with other agents. We show that any agent planning within its risk level set avoids collisions with other agents. We demonstrate planning in congestion through the example of an autonomous vehicle planning a sequence of lane changes along a highway. Using the risk level sets, the agent determines safe zones in which to make lane changes. In simulations, we illustrate how a higher risk threshold results in more aggressive behavior of the vehicle, as well as an increase in the number of lane changes as it weaves through traffic. Similarly, a lower risk threshold results in more conservative behavior with fewer lane changes. Future work will relax the assumption of self-preserving agents and examine how the agents interact in other environments. We will also examine advanced dynamics on the agents, and how level set calculations may change with more information on the reachable set of an agent.

REFERENCES

[1] C. Goerzen, Z. Kong, and B. Mettler, “A survey of motion planningalgorithms from the perspective of autonomous UAV guidance,” Jour-nal of Intelligent and Robotic Systems, vol. 57, no. 1-4, pp. 65–100,2010.

[2] C. Schlegel, “Fast local obstacle avoidance under kinematic anddynamic constraints for a mobile robot,” in Intelligent Robots andSystems, 1998. Proceedings., 1998 IEEE/RSJ International Conferenceon, vol. 1. IEEE, 1998, pp. 594–599.

[3] O. Brock and O. Khatib, “High-speed navigation using the globaldynamic window approach,” in Robotics and Automation, 1999. Pro-ceedings. 1999 IEEE International Conference on, vol. 1, 1999, pp.341–346.

[4] P. Ogren and N. E. Leonard, “A convergent dynamic window approachto obstacle avoidance,” Robotics, IEEE Transactions on, vol. 21, no. 2,pp. 188–195, 2005.

[5] C. Belta, V. Isler, and G. J. Pappas, “Discrete abstractions for robotmotion planning and control in polygonal environments,” IEEE Trans-actions on Robotics, vol. 21, no. 5, pp. 864–874, Oct 2005.

[6] A. D. Ames, J. W. Grizzle, and P. Tabuada, “Control barrier functionbased quadratic programs with application to adaptive cruise control,”in 53rd IEEE Conference on Decision and Control, Dec 2014, pp.6271–6278.

[7] U. Borrmann, L. Wang, A. D. Ames, and M. Egerstedt, “Controlbarrier certificates for safe swarm behavior,” IFAC-PapersOnLine,vol. 48, no. 27, pp. 68 – 73, 2015, analysis and Design of HybridSystems ADHS.

[8] D. Panagou, D. M. Stipanovi, and P. G. Voulgaris, “Distributedcoordination control for multi-robot networks using lyapunov-likebarrier functions,” IEEE Transactions on Automatic Control, vol. 61,no. 3, pp. 617–632, March 2016.

[9] W. Schwarting and P. Pascheka, “Recursive conflict resolution forcooperative motion planning in dynamic highway traffic,” in 17thInternational IEEE Conference on Intelligent Transportation Systems(ITSC), Oct 2014, pp. 1039–1044.

[10] P. Wagner, K. Nagel, and D. E. Wolf, “Realistic multi-lane trafficrules for cellular automata,” Physica A: Statistical Mechanics and itsApplications, vol. 234, no. 3, pp. 687 – 698, 1997.

[11] D. Chowdhury, D. E. Wolf, and M. Schreckenberg, “Particle hoppingmodels for two-lane traffic with two kinds of vehicles: Effects of lane-changing rules,” Physica A: Statistical Mechanics and its Applications,vol. 235, no. 3, pp. 417 – 439, 1997.

[12] K. Ahmed, M. Ben-Akiva, H. Koutsopoulos, and R. Mishalani,“Models of freeway lane changing and gap acceptance behavior,”Transportation and traffic theory, vol. 13, pp. 501–515, 1996.[Online]. Available: http://dx.doi.org/

[13] H. Jula, E. B. Kosmatopoulos, and P. A. Ioannou, “Collision avoidanceanalysis for lane changing and merging,” IEEE Transactions onVehicular Technology, vol. 49, no. 6, pp. 2295–2308, Nov 2000.

[14] P. Hidas, “Modelling lane changing and merging in microscopic trafficsimulation,” Transportation Research Part C: Emerging Technologies,vol. 10, no. 56, pp. 351 – 371, 2002.

[15] J. E. Naranjo, C. Gonzalez, R. Garcia, and T. de Pedro, “Lane-changefuzzy control in autonomous vehicles for the overtaking maneuver,”IEEE Transactions on Intelligent Transportation Systems, vol. 9, no. 3,pp. 438–450, Sept 2008.

[16] S. Shalev-Shwartz, S. Shammah, and A. Shashua, “Safe,multi-agent, reinforcement learning for autonomous driving,”CoRR, vol. abs/1610.03295, 2016. [Online]. Available:http://arxiv.org/abs/1610.03295

[17] G. Schildbach and F. Borrelli, “Scenario model predictive control forlane change assistance on highways,” in 2015 IEEE Intelligent VehiclesSymposium (IV), June 2015, pp. 611–616.

[18] A. Lawitzky, D. Althoff, C. F. Passenberg, G. Tanzmeister, D. Woll-herr, and M. Buss, “Interactive scene prediction for automotive ap-plications,” in 2013 IEEE Intelligent Vehicles Symposium (IV), June2013, pp. 1028–1033.
