planning algorithms: when optimal just isn’t good enoughruml/papers/uml-2011-talk.pdf ·...

Wheeler Ruml (UNH) When Optimal Just Isn’t Good Enough – 1 / 33

Planning Algorithms: When Optimal Just Isn’t Good Enough

Wheeler Ruml

Department of Computer Science

Joint work with the UNH AI Group. Funded by NSF and DARPA.

Where are the Walking Talking Robots?

Introduction

■ Robots?

■ Toddler Steps

■ An Agent

Planning as Search

Bounded Suboptimal

Conclusion


[Figure: Lt. Commander Data (Star Trek: TNG), Lt. Sharon Valerii (Battlestar Galactica), Zoe Greystone (Caprica)]

Toddler Steps

The Role of Planning

[Figure: agent architecture. Labels: world model, planner, search, agent, world, actions, sensing.]

Planning as Heuristic Graph Search

Introduction

Planning as Search

■ Planning

■ Graph Search

■ Example

■ The Problem

■ 3 Algorithms

■ Uniform Search

■ Dijkstra Behavior

■ Warcraft

■ Knowledge

■ Lower Bounds

■ Heuristic Search

■ A* Behavior

■ Dijkstra vs A*

■ Lower Bound

■ Problems with A*

■ Problem Settings

Bounded Suboptimal

Conclusion


Planning

Given:

■ current state of the world
■ models of available actions: preconditions, effects, costs

■ desired state of the world (partially specified?)

Find:

■ cheapest plan

Graph Search

Given:

■ start state: an explicit node
■ expand function: generates children and their costs (assume arc cost ≥ 0)
■ goal test: predicate on nodes

Find:

■ cheapest path to a goal node

[Figure: the current state and its next states, reached via actions a and b.]

[Figure: search tree growing from root, with each node branching on actions a and b.]

[Figure: the same search tree annotated with costs; root = 0, other labels range from 1 to 3.]

Example: Motion Planning

(S. LaValle)

The Problem: Exponential Growth

Ever wonder why no one draws trees more than 3 levels deep?

(number of actions)^(plan length)

$14^{100} > 10^{80}$
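
The inequality can be checked directly with arbitrary-precision integers. The sketch below assumes 14 is the number of actions and 100 the plan length; the function name `num_action_sequences` is introduced here for illustration only.

```python
def num_action_sequences(num_actions: int, plan_length: int) -> int:
    """Count the distinct action sequences of a given length."""
    return num_actions ** plan_length


plans = num_action_sequences(14, 100)
digits = len(str(plans))   # number of decimal digits in 14**100
print(digits)              # 115
print(plans > 10 ** 80)    # True
```

Even a modest branching factor therefore rules out enumerating every plan, which is why the rest of the talk concentrates on guiding the search.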

A Tale of Three Algorithms

1. Uniform Cost Search (Dijkstra, 1959)

2. A* Search (Hart, Nilsson, and Raphael, 1968)

3. Explicit Estimation Search (Thayer and Ruml, 2011)

Uniform Cost Search (Dijkstra, 1959)

Explore nodes in increasing order of cost-so-far (g(n)):

Q ← ordered list containing the initial state
Loop
    If Q is empty, return failure
    Node ← pop cheapest node off Q
    If Node is a goal, return it (or path to it)
    Children ← Expand(Node)
    Merge Children into Q, keeping sorted by g(n)
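
The loop above can be sketched in Python using a binary heap as the ordered list. This is a minimal illustration, not the speaker's implementation: the `best_g` table for skipping duplicate states is an addition beyond the slide's pseudocode, and the `GRAPH` dictionary is a made-up toy problem.

```python
import heapq
from itertools import count


def uniform_cost_search(start, expand, is_goal):
    """Return (cost, path) for a cheapest path to a goal, or None on failure.

    expand(state) must yield (child, arc_cost) pairs with arc_cost >= 0.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(0, next(tie), start, [start])]
    best_g = {start: 0}  # cheapest known cost-so-far per state (assumption: states are hashable)
    while frontier:
        g, _, state, path = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper route to this state was found later
        if is_goal(state):
            return g, path
        for child, cost in expand(state):
            new_g = g + cost
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                heapq.heappush(frontier, (new_g, next(tie), child, path + [child]))
    return None


# Hypothetical toy graph: state -> list of (child, arc cost).
GRAPH = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("g", 6)], "b": [("g", 1)], "g": []}

result = uniform_cost_search("s", lambda n: GRAPH[n], lambda n: n == "g")
print(result)  # (4, ['s', 'a', 'b', 'g'])
```

The goal test is applied when a node is popped, not when it is generated, so the first goal returned is a cheapest one.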

[Figure: the example search tree annotated with costs.]

Dijkstra Behavior

Dijkstra: Pathfinding in Warcraft

Problem-Specific Background Knowledge

Lower Bounds on Cost-to-go

$h(n) \le h^*(n)$, where $h^*(n)$ is the true cheapest cost-to-go from n

Heuristic Search: A* (Hart, Nilsson, and Raphael, 1968)

Cost should include both cost-so-far and cost-to-go:

Q ← ordered list containing the initial state
Loop
    If Q is empty, return failure
    Node ← pop cheapest node off Q
    If Node is a goal, return it (or path to it)
    Children ← Expand(Node)
    Merge Children into Q, keeping sorted by f(n)

f(n) = g(n) + h(n)
g(n) = cost incurred so far
h(n) = lower bound on cost to goal
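
A minimal Python sketch of this loop follows. The only change from uniform cost search is the sort key, f(n) = g(n) + h(n). The duplicate-handling table `best_g`, the toy `GRAPH`, and the heuristic table `H` are assumptions made for illustration; `H` was chosen by hand so that it never exceeds the true cost-to-go in this toy graph.

```python
import heapq
from itertools import count


def a_star(start, expand, is_goal, h):
    """Return (cost, path) for a cheapest path to a goal, or None on failure.

    h(state) must be a lower bound on the cost-to-go from state.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper route to this state was found later
        if is_goal(state):
            return g, path
        for child, cost in expand(state):
            new_g = g + cost
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                f = new_g + h(child)  # cost-so-far plus lower bound on cost-to-go
                heapq.heappush(frontier, (f, next(tie), new_g, child, path + [child]))
    return None


# Hypothetical toy graph and heuristic values.
GRAPH = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("g", 6)], "b": [("g", 1)], "g": []}
H = {"s": 3, "a": 2, "b": 1, "g": 0}

result = a_star("s", lambda n: GRAPH[n], lambda n: n == "g", lambda n: H[n])
print(result)  # (4, ['s', 'a', 'b', 'g'])
```

With h(n) = 0 everywhere, this reduces to uniform cost search.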

A* Behavior

Dijkstra vs A*: Pathfinding in Warcraft

Quick Note: A Lower Bound on Solution Cost

f(n) = g(n) + h(n)

g(n) is actual cost-so-far, so when h(n) is a lower bound on cost-to-go, f(n) is a lower bound on final solution cost

lowest f(n) on frontier gives lower bound for entire problem!

$best_f = \arg\min_{n \in open} f(n)$
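
The step from a per-node bound to a bound for the whole problem can be spelled out as a short worked argument. This is the standard reasoning for lower-bound heuristics rather than something shown on the slide; $f^*(opt)$ denotes the optimal solution cost, and $n'$ is a name introduced here for an open node on an optimal path.

```latex
% Assumes h is a lower bound on cost-to-go and that a solution exists.
% Until a goal is returned, some open node n' lies on an optimal path
% and has been reached with its optimal cost-so-far g(n').
\begin{align*}
f(n') &= g(n') + h(n') \le f^*(opt) \\
f(best_f) &= \min_{n \in open} f(n) \le f(n') \le f^*(opt)
\end{align*}
```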

Problems with A*

A* takes exponential memory

Can sometimes be fixed: see ‘iterative deepening’

A* takes exponential time

We must trade cost for time.

A New Generation of Problem Settings

optimal: minimize solution cost (must expand all nodes with f(n) < f∗(opt))

greedy: minimize solving time

bounded suboptimal: minimize time subject to relative cost bound (factor of optimal)

bounded cost: minimize time subject to absolute cost bound

contract: minimize cost subject to absolute time bound

utility function: maximize utility function of cost and time, e.g., goal achievement time = plan makespan + search time

Bounded Suboptimal Search

Introduction

Planning as Search

Bounded Suboptimal

■ Intuition

■ Three Heuristics

■ EES

■ Expansion Order

■ Bound

■ EES Performance

Conclusion


The Intuition Behind EES (Thayer and Ruml, 2011)

■ unbiased estimates can be more informed than lower bounds
■ nearest goal is the easiest to find


[Figure: Heavy Vacuum. Total raw CPU time (axis ticks 200, 400) versus problem size (axis ticks 10, 20, 30); curves labeled admiss. h, inadmiss. h, and d.]



Explicit Estimation Search (EES) combines these ideas to minimize solving time subject to cost ≤ w·optimal:

pursue the nearest goal within the bound

need more information than just lower bound on cost (h(n))!

Three Heuristic Sources of Information

1. h: a lower bound on cost-to-go
   f(n) = g(n) + h(n)
   the traditional optimal A* lower bound


2. $\hat{h}$: an estimate of cost-to-go
   unbiased estimates can be more informed
   $\hat{f}(n) = g(n) + \hat{h}(n)$
   (Thayer and Ruml, ICAPS-11)


3. $\hat{d}$: an estimate of distance-to-go
   nearest goal is the easiest to find
   (Pearl and Kim, IEEE PAMI 1982; Thayer et al., ICAPS-09)

Finding $best_{\hat{d}}$

$best_f$: open node giving lower bound on cost

$\arg\min_{n \in open} f(n)$

$best_{\hat{f}}$: open node giving estimated optimal cost

$\arg\min_{n \in open} \hat{f}(n)$

$best_{\hat{d}}$: estimated w-suboptimal node with minimum $\hat{d}$

$\arg\min_{n \in open \,\wedge\, \hat{f}(n) \le w \cdot \hat{f}(best_{\hat{f}})} \hat{d}(n)$
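
The three definitions can be illustrated with a plain linear scan over the open list. This is only a sketch: the `Node` fields, the function name `best_nodes`, and the example values are assumptions made here, and a practical implementation would maintain ordered data structures instead of rescanning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    g: float      # cost incurred so far
    h: float      # lower bound on cost-to-go
    h_hat: float  # estimate of cost-to-go (may overestimate)
    d_hat: float  # estimate of distance-to-go (number of actions)

    @property
    def f(self) -> float:
        return self.g + self.h

    @property
    def f_hat(self) -> float:
        return self.g + self.h_hat


def best_nodes(open_list, w):
    """Return (best_f, best_f_hat, best_d_hat) for a non-empty open list and w >= 1."""
    best_f = min(open_list, key=lambda n: n.f)
    best_f_hat = min(open_list, key=lambda n: n.f_hat)
    # Nodes estimated to lead to a solution within a factor w of the estimated optimum.
    within = [n for n in open_list if n.f_hat <= w * best_f_hat.f_hat]
    best_d_hat = min(within, key=lambda n: n.d_hat)
    return best_f, best_f_hat, best_d_hat


# Hypothetical open list.
open_list = [
    Node("a", g=2, h=3, h_hat=5, d_hat=4),
    Node("b", g=4, h=2, h_hat=2.5, d_hat=1),
    Node("c", g=1, h=5, h_hat=5, d_hat=6),
]
names = [n.name for n in best_nodes(open_list, w=1.5)]
print(names)  # ['a', 'c', 'b']
```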

EES Expansion Order

$best_f$: open node giving lower bound on cost
$best_{\hat{f}}$: open node giving estimated optimal cost

node to expand next:

1. pursue the shortest solution that is within the bound
2. pursue the estimated optimal solution
3. raise the lower bound on optimal solution cost

in other words:

1. if $\hat{f}(best_{\hat{d}}) \le w \cdot f(best_f)$ then $best_{\hat{d}}$
2. else if $\hat{f}(best_{\hat{f}}) \le w \cdot f(best_f)$ then $best_{\hat{f}}$
3. else $best_f$

see paper for further justification
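
The three-way rule can be sketched as a small selection function. This is a hedged illustration of the expansion order as read from the slide, with hats (estimates) written as a `_hat` suffix; the `Candidate` type and the numbers in the example are assumptions made here, not values from the talk.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    f: float      # g + h, a lower bound on solution cost through this node
    f_hat: float  # g + h_hat, an estimate of solution cost through this node


def select_node(best_f, best_f_hat, best_d_hat, w):
    """Choose the node to expand next for suboptimality bound w >= 1."""
    bound = w * best_f.f  # w times the lower bound for the entire problem
    if best_d_hat.f_hat <= bound:
        return best_d_hat  # 1. pursue the shortest solution within the bound
    if best_f_hat.f_hat <= bound:
        return best_f_hat  # 2. pursue the estimated optimal solution
    return best_f          # 3. raise the lower bound


# Hypothetical candidates.
best_f = Candidate("best_f", f=10, f_hat=14)
best_f_hat = Candidate("best_f_hat", f=11, f_hat=12)
best_d_hat = Candidate("best_d_hat", f=12, f_hat=16)

for w in (2.0, 1.5, 1.1):
    print(w, select_node(best_f, best_f_hat, best_d_hat, w).name)
```

Loosening the bound (larger w) lets the search chase the nearest goal more often, while a tight bound forces it back to the node with the lowest lower bound.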

EES Respects the Suboptimality Bound

how does $\hat{f}(n) \le w \cdot f(best_f)$ ensure the suboptimality bound?

$f(n) \le \hat{f}(n)$ (f(n) is a lower bound for n)

$\hat{f}(n) \le w \cdot f(best_f)$ (expansion criterion)

$w \cdot f(best_f) \le w \cdot f^*(opt)$ (because $f(best_f)$ is a lower bound for the entire problem)

$f(n) \le w \cdot f^*(opt)$ (suboptimality bound)

EES Performance

bounded suboptimal search: minimize time subject to relative cost bound (factor of optimal)

[Figure: Dock Robot. Total raw CPU time (axis ticks 100, 200, 300) versus suboptimality bound (axis ticks 1.6, 2.4) for A* eps, wA*, Optimistic, Skeptical, EES, and EES Opt.]

[Figure: Vacuum World. log10 total raw CPU time (axis ticks -0.6, 0, 0.6) versus suboptimality bound (axis ticks 2, 4) for A* eps, Optimistic, Skeptical, wA*, EES, and EES Opt.]

[Figure: Heavy Vacuum World. log10 total raw CPU time (axis ticks -1, 0, 1) versus suboptimality bound (axis ticks 8, 16) for wA*, Optimistic, Skeptical, A* eps, EES, and EES Opt.]

Conclusion

Introduction

Planning as Search

Bounded Suboptimal

Conclusion

■ Summary

■ The AI Vision

■ Algs as Agents

■ UNH


Summary

Who’s afraid of big bad NP-hardness?

bounded suboptimal planning

■ fast solving
■ control over cost

a new generation of planning algorithms

■ beyond optimal
■ estimates, in addition to lower bounds
■ new sources of heuristic information
■ principled, provable performance

The AI Vision

[Figure: agent architecture. Labels: world model, planner, search, agent, world, actions, sensing.]

Search Algorithms as Agents

[Figure: a search algorithm viewed as an agent. Labels: model of space, meta-reasoner, open list, search algorithm, domain, expand, children, g, h values.]

The University of New Hampshire

tell your students to apply to grad school in CS at UNH!

■ friendly faculty
■ funding
■ individual attention
■ beautiful campus
■ low cost of living
■ easy access to Boston, White Mountains
■ strong in AI, infoviz, networking, bioinformatics
