15-780: Graduate AI
TRANSCRIPT
![Page 1: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/1.jpg)
15-780: Graduate AI
Lecture 14: Spatial Search; Uncertainty
Geoff Gordon (this lecture), Tuomas Sandholm
TAs: Sam Ganzfried, Byron Boots
1
![Page 2: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/2.jpg)
Spatial Planning
2
![Page 3: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/3.jpg)
Plans in Space…
A* can be used for many things. Here: A* for spatial planning (in contrast to, e.g., job-shop scheduling).
(The slide shows a page from the Anytime Dynamic A* paper, including its Figure 10: the environment used in that paper's second experiment, the optimal solution and end-effector trajectory, and plots of solution cost and state expansions for the three algorithms compared.)
3
![Page 4: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/4.jpg)
What’s wrong w/ A* guarantees?
(optimality) A* finds a solution of cost g*
(efficiency) A* expands no nodes that have f(node) > g*
4
![Page 5: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/5.jpg)
What’s wrong with A*?
Discretized space into tiny little chunks
a few degrees rotation of a joint
Lots of states ⇒ lots of states w/ f ≤ g*
Discretized actions too
one joint at a time, discrete angles
Results in jagged paths
(The same Anytime Dynamic A* paper page is shown again.)
5
![Page 6: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/6.jpg)
What’s wrong with A*?
6
![Page 7: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/7.jpg)
Snapshot of A*
7
![Page 8: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/8.jpg)
Wouldn’t it be nice…
… if we could break things up based more on the real geometry of the world?
(image: Robot Motion Planning by Jean-Claude Latombe)
8
![Page 9: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/9.jpg)
Physical system
A moderate number of real-valued coordinates
Deterministic, continuous dynamics
Continuous goal set (or a few pieces)
Cost = time, work, torque, …
9
![Page 10: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/10.jpg)
Typical physical system
10
![Page 11: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/11.jpg)
A kinematic chain
Rigid links connected by joints
revolute or prismatic
Configuration q = (q1, q2, …)
qi = angle or length of joint i
Dimension of q = “degrees of freedom”
11
![Page 12: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/12.jpg)
Mobile robots
Translating in space = 2 dof
12
![Page 13: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/13.jpg)
More mobility
Translation + rotation = 3 dof
13
![Page 14: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/14.jpg)
Q: How many dofs?
3d translation & rotation
14
![Page 15: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/15.jpg)
credit: Andrew Moore
15
![Page 16: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/16.jpg)
Robot kinematic motion planning
Now let’s add obstacles
16
![Page 17: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/17.jpg)
Configuration space
For any configuration q, can test whether it intersects obstacles
Set of legal configs is “configuration space” C (a subset of a dof-dimensional vector space)
Path is a continuous function from [0,1] into C with q(0) = qs and q(1) = qg
17
![Page 18: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/18.jpg)
Note: dynamic planning
Includes inertia as well as configuration: q, q̇
Harder, since twice as many dofs
More later…
18
![Page 19: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/19.jpg)
C-space example
19
![Page 20: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/20.jpg)
More C-space examples
20
![Page 21: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/21.jpg)
Another C-space example
image: J Kuffner
21
![Page 22: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/22.jpg)
Topology of C-space
Topology of C-space can be something other than the familiar Euclidean world
E.g. set of angles = unit circle = SO(2)
not [0, 2π)!
Ball & socket joint (3d angle) ⊆ unit sphere = S² (not SO(3), the full set of 3-d rotations)
22
![Page 23: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/23.jpg)
Topology example
Compare L to R: 2 planar angles v. one solid angle — both 2 dof (and neither the same as Euclidean 2-space)
23
![Page 24: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/24.jpg)
Back to planning
Complaint with A* was that it didn’t break up space intelligently
How might we do better?
Lots of roboticists have given lots of answers!
24
![Page 25: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/25.jpg)
Shortest path in C-space
25
![Page 26: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/26.jpg)
Shortest path in C-space
26
![Page 27: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/27.jpg)
Shortest path
Suppose a planar polygonal C-space
Shortest path in C-space is a sequence of line segments
Each segment’s ends are either start or goal or one of the vertices in C-space
In 3-d or higher, might lie on edge, face, hyperface, …
27
![Page 28: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/28.jpg)
Visibility graph
http://www.cse.psu.edu/~rsharma/robotics/notes/notes2.html
28
![Page 29: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/29.jpg)
Naive algorithm
For i = 1 … points
  For j = 1 … points
    included = t
    For k = 1 … edges
      if segment ij intersects edge k
        included = f
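The loop above can be made concrete. Here is a minimal Python sketch (my own, not from the lecture; `ccw`, `segments_cross`, and `visibility_graph` are illustrative names) of the O(n³) scan:

```python
def ccw(a, b, c):
    """Twice the signed area of triangle abc; the sign gives orientation."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, p3, p4):
    """True if segments p1p2 and p3p4 properly intersect."""
    d1, d2 = ccw(p3, p4, p1), ccw(p3, p4, p2)
    d3, d4 = ccw(p1, p2, p3), ccw(p1, p2, p4)
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def visibility_graph(points, edges):
    """O(n^3): for every point pair, test against every obstacle edge."""
    graph = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            included = True
            for (a, b) in edges:
                if segments_cross(points[i], points[j], a, b):
                    included = False
                    break
            if included:
                graph.append((i, j))
    return graph
```

For example, with one obstacle edge from (0,−1) to (0,1), the segment from (−2,0) to (2,0) is blocked; with no edges it is visible.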
29
![Page 30: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/30.jpg)
Complexity
Naive algorithm is O(n³) in planar C-space
For algorithms that run faster, O(n²) and O(k + n log n), see [Latombe, pg 157]
k = number of edges that wind up in visibility graph
Once we have graph, search it!
30
![Page 31: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/31.jpg)
Discussion of visibility graph
Good: finds shortest path
Bad: complex C-space yields long runtime, even if problem is easy
e.g., get my 23-dof manipulator to move 1 mm when the nearest obstacle is 1 m away
Bad: no margin for error
31
![Page 32: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/32.jpg)
Getting bigger margins
Could just pad obstacles
but how much is enough? might make the problem infeasible…
What if we try to stay as far away from obstacles as possible?
32
![Page 33: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/33.jpg)
Voronoi graph
33
![Page 34: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/34.jpg)
Voronoi graph
Given a set of point obstacles
Find all places that are equidistant from two or more of them
Result: network of line segments, called the Voronoi graph
Each line stays as far away as possible from two obstacles while still going between them
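As a rough numeric illustration of this definition (my own sketch, not the lecture's): sample a grid and keep the samples whose two nearest point obstacles are nearly equidistant. The surviving samples approximate the Voronoi graph.

```python
import math

def approx_voronoi(obstacles, xs, ys, tol=0.05):
    """Return grid samples lying (within tol) on the Voronoi graph."""
    graph_pts = []
    for x in xs:
        for y in ys:
            d = sorted(math.hypot(x - ox, y - oy) for (ox, oy) in obstacles)
            if len(d) >= 2 and d[1] - d[0] < tol:
                graph_pts.append((x, y))
    return graph_pts

# two point obstacles: the Voronoi graph is their perpendicular bisector x = 1
obstacles = [(0.0, 0.0), (2.0, 0.0)]
xs = [i * 0.1 for i in range(-10, 31)]
ys = [i * 0.1 for i in range(-10, 11)]
pts = approx_voronoi(obstacles, xs, ys, tol=0.02)
```

Every surviving sample hugs the bisector x = 1, the line that stays as far as possible from both obstacles.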
34
![Page 35: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/35.jpg)
Voronoi from polygonal C-space
35
![Page 36: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/36.jpg)
Voronoi from polygonal C-space
Set of points which are equidistant from 2 or more closest points on border of C-space
Polygonal C-space in 2d yields lines & parabolas intersecting at points
lines from 2 points
parabolas from line & point
36
![Page 37: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/37.jpg)
Voronoi method for planning
Compute Voronoi diagram of C-space
Go straight from start to nearest point on diagram
Plan within diagram to get near goal (e.g., with A*)
Go straight to goal
37
![Page 38: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/38.jpg)
Discussion of Voronoi
Good: stays far away from obstacles
Bad: assumes polygons
Bad: gets kind of hard in higher dimensions (but see Howie Choset’s web page and book)
38
![Page 39: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/39.jpg)
Voronoi discussion
Bad: kind of gun-shy about obstacles
39
![Page 40: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/40.jpg)
(Approximate) cell decompositions
40
![Page 41: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/41.jpg)
Planning algorithm
Lay down a grid in C-space
Delete cells that intersect obstacles
Connect neighbors
A*
If no path, double resolution and try again
never know when we’re done
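The steps above can be sketched directly. A hedged Python sketch (assumed details, not from the slides: 4-connected cells, unit step cost, Manhattan heuristic):

```python
import heapq

def astar_grid(blocked, start, goal, w, h):
    """A* over a w x h grid; blocked is a set of deleted (x, y) cells."""
    def hfun(c):  # Manhattan-distance heuristic (admissible on a 4-connected grid)
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])

    open_q = [(hfun(start), 0, start)]  # entries are (f, g, cell)
    g = {start: 0}
    while open_q:
        f, gc, cur = heapq.heappop(open_q)
        if cur == goal:
            return gc
        if gc > g.get(cur, float("inf")):
            continue  # stale queue entry
        x, y = cur
        for nb in [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
            if 0 <= nb[0] < w and 0 <= nb[1] < h and nb not in blocked:
                if gc + 1 < g.get(nb, float("inf")):
                    g[nb] = gc + 1
                    heapq.heappush(open_q, (gc + 1 + hfun(nb), gc + 1, nb))
    return None  # no path at this resolution: double it and try again
```

E.g., on a 5×5 grid with a wall at x = 2 except for the cell (2, 4), the path from (0, 0) to (4, 0) must detour through (2, 4) and costs 12.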
41
![Page 42: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/42.jpg)
Approximate cell decomposition
This decomposition is what we were using for A* in the examples from above
Works pretty well except:
need high resolution near obstacles
want low res away from obstacles
42
![Page 43: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/43.jpg)
Fix: variable resolution
Lay down a coarse grid
Split cells that intersect obstacle borders
empty cells good
full cells also don’t need splitting
Stop at fine resolution
Data structure: quadtree
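A minimal quadtree sketch of this splitting rule (my own illustration; it assumes a single axis-aligned rectangular obstacle so the free/full/mixed test is easy):

```python
def classify(cell, obstacle):
    """Cell vs. one axis-aligned rectangular obstacle; both are (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = cell
    ox0, oy0, ox1, oy1 = obstacle
    if x1 <= ox0 or ox1 <= x0 or y1 <= oy0 or oy1 <= y0:
        return "free"   # disjoint from the obstacle
    if ox0 <= x0 and x1 <= ox1 and oy0 <= y0 and y1 <= oy1:
        return "full"   # entirely inside the obstacle
    return "mixed"      # intersects the obstacle border

def quadtree(cell, obstacle, min_size):
    """Split only 'mixed' cells; stop at the finest resolution."""
    kind = classify(cell, obstacle)
    x0, y0, x1, y1 = cell
    if kind != "mixed" or (x1 - x0) <= min_size:
        return [(cell, kind)]
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    leaves = []
    for sub in [(x0, y0, mx, my), (mx, y0, x1, my),
                (x0, my, mx, y1), (mx, my, x1, y1)]:
        leaves += quadtree(sub, obstacle, min_size)
    return leaves

leaves = quadtree((0, 0, 4, 4), (0, 0, 1, 1), 1)
```

Here only the quadrant touching the obstacle gets split further, giving 7 leaves (3 coarse free cells plus 4 fine cells, one of which is full).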
43
![Page 44: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/44.jpg)
44
![Page 45: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/45.jpg)
45
![Page 46: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/46.jpg)
46
![Page 47: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/47.jpg)
47
![Page 48: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/48.jpg)
Discussion
Works pretty well, except:
Still don’t know when to stop
Won’t find shortest path
Still doesn’t really scale to high-d
48
![Page 49: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/49.jpg)
Better yet
Adaptive decomposition
Split only cells that actually make a difference
are on path from start
make a difference to our policy
49
![Page 50: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/50.jpg)
An adaptive splitter: parti-game
G
Start
Goal
G
G
G
50
![Page 51: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/51.jpg)
9dof planar arm
Fixed base
Start
Goal
85 partitions total
51
![Page 52: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/52.jpg)
Parti-game paper
Andrew Moore and Chris Atkeson. The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces http://www.autonlab.org/autonweb/14699.html
52
![Page 53: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/53.jpg)
Randomness in search
53
![Page 54: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/54.jpg)
Rapidly-exploring Random Trees
Put landmarks into C-space
Break up C-space into Voronoi regions around landmarks
Put landmarks densely only if high resolution is needed to find a path
Will not guarantee optimal path (*)
54
![Page 55: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/55.jpg)
RRT assumptions
RANDOM_CONFIG
samples from C-space
EXTEND(q, q’)
local controller; heads toward q’ from q; stops before hitting obstacle
FIND_NEAREST(q, Q)
searches current tree Q for point near q
55
![Page 56: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/56.jpg)
Path Planning with RRTs
[ Kuffner & LaValle , ICRA’00]
RRT = Rapidly-Exploring Random Tree
BUILD_RRT(qinit) {
  T = qinit
  for k = 1 to K {
    qrand = RANDOM_CONFIG()
    EXTEND(T, qrand)
  }
}

EXTEND(T, q) {
  qnear = FIND_NEAREST(q, T)
  qnew = EXTEND(qnear, q)
  T = T + (qnear, qnew)
}
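The pseudocode can be instantiated directly. A sketch for a 2-D point robot with no obstacles (the bounds, step size, and function names mirror the slides but the details are my own choices):

```python
import math
import random

def random_config():
    """RANDOM_CONFIG: uniform sample from a 10 x 10 C-space."""
    return (random.uniform(0, 10), random.uniform(0, 10))

def find_nearest(q, tree):
    """FIND_NEAREST: linear scan over current tree vertices."""
    return min(tree, key=lambda v: math.dist(v, q))

def extend(q_near, q, step=0.5):
    """Local controller: move from q_near toward q, at most one step."""
    d = math.dist(q_near, q)
    if d <= step:
        return q
    t = step / d
    return (q_near[0] + t * (q[0] - q_near[0]),
            q_near[1] + t * (q[1] - q_near[1]))

def build_rrt(q_init, K=200, seed=0):
    random.seed(seed)
    tree = [q_init]
    parent = {q_init: None}
    for _ in range(K):
        q_rand = random_config()
        q_near = find_nearest(q_rand, tree)
        q_new = extend(q_near, q_rand)
        tree.append(q_new)
        parent[q_new] = q_near
    return tree, parent
```

After K iterations the tree has K + 1 vertices, and every edge is at most one controller step long.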
56
![Page 57: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/57.jpg)
Path Planning with RRTs
qinit
[ Kuffner & LaValle , ICRA’00]
RRT = Rapidly-Exploring Random Tree
BUILT_RRT(qinit) { T = qinit for k = 1 to K { qrand = RANDOM_CONFIG() EXTEND(T, qrand); }}
EXTEND(T, q) { qnear = FIND_NEAREST(q, T) qnew = EXTEND(qnear, q) T = T + (qnear, qnew)}
56
![Page 58: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/58.jpg)
Path Planning with RRTs
qinitqrand
[ Kuffner & LaValle , ICRA’00]
RRT = Rapidly-Exploring Random Tree
BUILT_RRT(qinit) { T = qinit for k = 1 to K { qrand = RANDOM_CONFIG() EXTEND(T, qrand); }}
EXTEND(T, q) { qnear = FIND_NEAREST(q, T) qnew = EXTEND(qnear, q) T = T + (qnear, qnew)}
56
![Page 59: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/59.jpg)
Path Planning with RRTs
qnearqinitqrand
[ Kuffner & LaValle , ICRA’00]
RRT = Rapidly-Exploring Random Tree
BUILT_RRT(qinit) { T = qinit for k = 1 to K { qrand = RANDOM_CONFIG() EXTEND(T, qrand); }}
EXTEND(T, q) { qnear = FIND_NEAREST(q, T) qnew = EXTEND(qnear, q) T = T + (qnear, qnew)}
56
![Page 60: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/60.jpg)
Path Planning with RRTs
qnear
qnew
qinitqrand
[ Kuffner & LaValle , ICRA’00]
RRT = Rapidly-Exploring Random Tree
BUILT_RRT(qinit) { T = qinit for k = 1 to K { qrand = RANDOM_CONFIG() EXTEND(T, qrand); }}
EXTEND(T, q) { qnear = FIND_NEAREST(q, T) qnew = EXTEND(qnear, q) T = T + (qnear, qnew)}
56
![Page 61: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/61.jpg)
RRT example
Planar holonomic robot
57
![Page 62: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/62.jpg)
RRTs explore coarse to fine
Tend to break up large Voronoi regions
higher probability of qrand being in them
Limiting distribution of vertices given by RANDOM_CONFIG
as RRT grows, probability that qrand is reachable with local controller (and so immediately becomes a new vertex) approaches 1
58
![Page 63: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/63.jpg)
59
![Page 64: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/64.jpg)
60
![Page 65: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/65.jpg)
61
![Page 66: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/66.jpg)
62
![Page 67: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/67.jpg)
63
![Page 68: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/68.jpg)
RRT example
64
![Page 69: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/69.jpg)
RRT for a car (3 dof)
65
![Page 70: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/70.jpg)
Planning with RRTs
Build RRT from start until we add a node that can reach goal using local controller
(Unique) path: root → last node → goal
Optional: cross-link tree by testing local controller, search within tree using A*
Optional: grow forward and backward
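The path-extraction step can be sketched with the parent map built while growing the tree (as in the RRT pseudocode earlier; `extract_path` is my own illustrative name):

```python
def extract_path(parent, last_node, goal):
    """Walk parent pointers from last_node back to the root, then reverse."""
    path = [goal]           # goal is reached from last_node by the local controller
    node = last_node
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()          # root -> ... -> last_node -> goal
    return path

# toy tree: (0,0) is the root; (1,1) was the node found able to reach the goal
parent = {(0, 0): None, (1, 0): (0, 0), (1, 1): (1, 0)}
print(extract_path(parent, (1, 1), (2, 1)))
# [(0, 0), (1, 0), (1, 1), (2, 1)]
```

Since each vertex has exactly one parent, this path is unique, matching the slide.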
66
![Page 71: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/71.jpg)
67
![Page 72: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/72.jpg)
68
![Page 73: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/73.jpg)
69
![Page 74: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/74.jpg)
70
![Page 75: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/75.jpg)
71
![Page 76: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/76.jpg)
What you should know
C-space
Ways of splitting up C-space
Visibility graph
Voronoi
Cell decomposition
Variable resolution or adaptive cells (quadtree, parti-game)
RRTs
72
![Page 77: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/77.jpg)
AI + Uncertainty
73
![Page 78: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/78.jpg)
Uncertainty is ubiquitous
Random outcomes (coin flip, who’s behind the door, …)
Incomplete observations (occlusion, sensor noise)
Behavior of other agents (intentions, observations, goals, beliefs)
74
![Page 79: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/79.jpg)
Propositional v. lifted
We’ve talked about both propositional and lifted logical representations
SAT v. FOL
Can add uncertainty to both
We’ll do just propositional; uncertainty in lifted representations is a topic of current research
75
![Page 80: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/80.jpg)
Review: probability
76
![Page 81: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/81.jpg)
Random variables
Uncertain analog of propositions in SAT or variables in MILP
Tomorrow’s weather, change in AAPL stock price, grade on HW4
Observations convert random variables into fixed realizations
77
![Page 82: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/82.jpg)
Notation
X=x means r.v. X is realized as x
P(X=x) means probability of X=x
if clear from context, may omit “X=”: instead of P(Weather=rain), just P(rain)
P(X) means a function: x → P(X=x)
P(X=x, Y=y) means probability that both realizations happen simultaneously
78
![Page 83: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/83.jpg)
Discrete distributions

HW4 \ 15-780    A     A-    B+
A               .21   .17   .07
A-              .17   .15   .06
B+              .07   .06   .04

Weather \ AAPL price    up    same   down
sun                     .09   .15    .06
rain                    .21   .35    .14
79
![Page 84: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/84.jpg)
Joint distribution
Atomic event: a joint realization of all random variables of interest
Joint distribution: assigns a probability to each atomic event

Weather \ AAPL    up    same   down
sun               .09   .15    .06
rain              .21   .35    .14

e = (sun, same): P(e) = 0.15
80
![Page 85: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/85.jpg)
Events
An event is a set of atomic events
E = { (sun, same), (rain, same) }
P(E) = ∑_{e ∈ E} P(e) = .15 + .35 = .5
This one is written “AAPL = same”

Weather \ AAPL    up    same   down
sun               .09   .15    .06
rain              .21   .35    .14
81
![Page 86: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/86.jpg)
AND, OR, NOT
For events E, F:
E AND F means E ∩ F
E OR F means E ∪ F
NOT E means U – E, where U is the set of all possible joint realizations
82
![Page 87: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/87.jpg)
Marginal: eliminating unneeded r.v.s

Weather \ AAPL price    up    same   down
sun                     .09   .15    .06
rain                    .21   .35    .14

P(sun) = .09 + .15 + .06 = .3
P(rain) = .21 + .35 + .14 = .7

Marginal of Weather:
sun    0.3
rain   0.7
83
![Page 88: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/88.jpg)
Law of Total Probability
For any X, Y:
P(X) = P(X, Y=y1) + P(X, Y=y2) + …
84
![Page 89: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/89.jpg)
Conditional: incorporating observations
15-780 (rows) vs. HW4 (columns):
      A    A-   B+
A    .21  .17  .07
A-   .17  .15  .06
B+   .07  .06  .04

P(15-780=A | HW4=B+) = .07 / (.07 + .06 + .04) ≈ .41

Conditioning on HW4 = B+ gives the distribution over 15-780:
A    .41
A-   .35
B+   .24
85
![Page 90: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/90.jpg)
Conditioning
In general, divide a row or column by its sum
P(X | Y=y) = P(X, Y=y) / P(Y=y)
P(Y=y) is a marginal probability, gotten by a row or column sum
P(X | Y=y) is a row or column of the table
Thought experiment: what happens if we condition on an event of zero probability?
86
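The divide-a-slice-by-its-sum recipe can be sketched as follows (table values are from the Weather/AAPL slides; the function name and error handling are my own):

```python
# Conditioning: keep the slice of the joint where Y=y, then renormalize
# by its sum, which is the marginal P(Y=y).
joint = {
    ("sun", "up"): 0.09, ("sun", "same"): 0.15, ("sun", "down"): 0.06,
    ("rain", "up"): 0.21, ("rain", "same"): 0.35, ("rain", "down"): 0.14,
}

def condition(joint, axis, value):
    slice_ = {k: p for k, p in joint.items() if k[axis] == value}
    total = sum(slice_.values())  # marginal probability of the observation
    if total == 0:
        # the slide's thought experiment: conditioning on a
        # zero-probability event is undefined (0/0)
        raise ZeroDivisionError("conditioned on a zero-probability event")
    return {k: p / total for k, p in slice_.items()}

cond = condition(joint, 1, "same")  # P(Weather | AAPL=same)
# cond[("sun","same")] ≈ 0.3, cond[("rain","same")] ≈ 0.7
```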
![Page 91: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/91.jpg)
Notation
P(X | Y) is a function: x, y → P(X=x | Y=y)
Expressions are evaluated separately for each realization:
P(X | Y) P(Y) means the function x, y → P(X=x | Y=y) P(Y=y)
87
![Page 92: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/92.jpg)
Independence
X and Y are independent if, for all possible values of y, P(X) = P(X | Y=y)
equivalently, for all possible values of x, P(Y) = P(Y | X=x)
equivalently, P(X, Y) = P(X) P(Y)
Knowing X or Y gives us no information about the other
88
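The product test P(X, Y) = P(X) P(Y) is easy to check mechanically. A sketch (names are my own); as it happens, the Weather/AAPL table from the earlier slides passes the test, since each entry equals the product of its row and column marginals:

```python
# Independence check: P(X=x, Y=y) must equal P(X=x) P(Y=y) for every pair.
joint = {
    ("sun", "up"): 0.09, ("sun", "same"): 0.15, ("sun", "down"): 0.06,
    ("rain", "up"): 0.21, ("rain", "same"): 0.35, ("rain", "down"): 0.14,
}

def independent(joint, tol=1e-9):
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return all(abs(p - px[x] * py[y]) < tol for (x, y), p in joint.items())

print(independent(joint))  # True
```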
![Page 93: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/93.jpg)
Bayes Rule
For any X, Y, C:
P(X | Y, C) P(Y | C) = P(Y | X, C) P(X | C)
Simple version (without context):
P(X | Y) P(Y) = P(Y | X) P(X)
Proof: both sides are just P(X, Y)
P(X | Y) = P(X, Y) / P(Y) (by def’n of conditioning)
89
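The proof above can be verified numerically: both sides of the simple Bayes rule reduce to the joint entry. A sketch on the Weather/AAPL table (variable names are my own):

```python
# Check P(W=w | A=a) P(A=a) = P(A=a | W=w) P(W=w): both sides are P(W=w, A=a).
joint = {
    ("sun", "up"): 0.09, ("sun", "same"): 0.15, ("sun", "down"): 0.06,
    ("rain", "up"): 0.21, ("rain", "same"): 0.35, ("rain", "down"): 0.14,
}
p_w = {w: sum(p for (w2, _), p in joint.items() if w2 == w) for w in ("sun", "rain")}
p_a = {a: sum(p for (_, a2), p in joint.items() if a2 == a) for a in ("up", "same", "down")}

for (w, a), p in joint.items():
    lhs = (p / p_a[a]) * p_a[a]   # P(W=w | A=a) P(A=a)
    rhs = (p / p_w[w]) * p_w[w]   # P(A=a | W=w) P(W=w)
    assert abs(lhs - rhs) < 1e-12 and abs(lhs - p) < 1e-12
```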
![Page 94: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/94.jpg)
Continuous distributions
What if X is real-valued?
Atomic event: (X ≤ x), (X ≥ x)
any* other subset via AND, OR, NOT
X=x has zero probability*
P(X=x, Y=y) means lim ε→0+ P(x ≤ X ≤ x+ε, Y=y) / ε
90
![Page 95: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/95.jpg)
Factor Graphs
91
![Page 96: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/96.jpg)
Factor graphs
Strict generalization of SAT to include uncertainty
Factor graph = (variables, constraints)
variables: set of discrete r.v.s
constraints: set of (possibly) soft or probabilistic constraints called factors or potentials
92
![Page 97: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/97.jpg)
Factors
Hard constraint: X + Y ≥ 3
Soft constraint: X + Y ≥ 3 is more probable than X + Y < 3, all else equal
Domain: set of relevant variables
{X, Y} in this case
93
![Page 98: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/98.jpg)
Factors
Soft factor for X + Y ≥ 3 (X rows, Y columns):
     0  1  2
0    1  1  1
1    1  1  3
2    1  3  3

Hard factor for X + Y ≥ 3 (X rows, Y columns):
     0  1  2
0    0  0  0
1    0  0  1
2    0  1  1
94
![Page 99: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/99.jpg)
Factor graphs
Variables and factors connected according to domains
95
![Page 100: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/100.jpg)
Factor graphs
Omit single-argument factors from graph to reduce clutter
96
![Page 101: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/101.jpg)
Factor graphs: probability model
97
![Page 102: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/102.jpg)
Normalizing constant
Also called partition function
98
![Page 103: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/103.jpg)
Factors
Ra  O  W  Φ1
T   T  T  3
T   T  F  1
T   F  T  3
T   F  F  3
F   T  T  3
F   T  F  3
F   F  T  3
F   F  F  3

W  M  Ru  Φ2
T  T  T   3
T  T  F   1
T  F  T   3
T  F  F   3
F  T  T   3
F  T  F   3
F  F  T   3
F  F  F   3
99
![Page 104: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/104.jpg)
Unary factors
Ra  Φ3
T   1
F   2

W  Φ4
T  1
F  1

O  Φ5
T  5
F  2

M  Φ6
T  10
F  1

Ru  Φ7
T   1
F   3
100
![Page 105: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/105.jpg)
Inference Qs
Is Z > 0?
What is P(E)?
What is P(E1 | E2)?
Sample a random configuration according to P(.) or P(. | E)
Hard part: taking sums of Φ (such as Z)
101
![Page 106: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/106.jpg)
Example
What is P(T, T, T, T, T)?
102
![Page 107: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/107.jpg)
Example
What is P(Rusty=T | Rains=T)?
This is P(Rusty=T, Rains=T) / P(Rains=T)
P(Rains) is a marginal of P(…)
So is P(Rusty, Rains)
Note: Z cancels, but still have to sum lots of entries to get each marginal
103
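This computation can be carried out by brute-force enumeration. The sketch below assumes the factor tables Φ1–Φ7 from the preceding slides (each ternary factor is 3 everywhere except a single 1 at (T, T, F)); function names and the variable ordering are my own. It builds the unnormalized weight of each assignment, sums to get Z, and forms the conditional from two marginals, where Z indeed cancels.

```python
# Brute-force inference in the Ra/O/W/M/Ru factor graph from the slides.
from itertools import product

def phi1(ra, o, w):   # Φ1(Ra, O, W): 1 at (T, T, F), else 3
    return 1 if (ra, o, w) == (True, True, False) else 3

def phi2(w, m, ru):   # Φ2(W, M, Ru): 1 at (T, T, F), else 3
    return 1 if (w, m, ru) == (True, True, False) else 3

unary = {  # Φ3..Φ7 as (value if T, value if F)
    "Ra": (1, 2), "W": (1, 1), "O": (5, 2), "M": (10, 1), "Ru": (1, 3),
}

def weight(ra, o, w, m, ru):
    """Product of all factors at one assignment (unnormalized probability)."""
    u = 1.0
    for name, val in zip(("Ra", "O", "W", "M", "Ru"), (ra, o, w, m, ru)):
        u *= unary[name][0 if val else 1]
    return u * phi1(ra, o, w) * phi2(w, m, ru)

assignments = list(product([True, False], repeat=5))
Z = sum(weight(*a) for a in assignments)  # the partition function

# P(Rusty=T | Rains=T): Z cancels, but we still sum many joint entries.
num = sum(weight(ra, o, w, m, ru) for ra, o, w, m, ru in assignments if ru and ra)
den = sum(weight(ra, o, w, m, ru) for ra, o, w, m, ru in assignments if ra)
print(num / den)
```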
![Page 108: 15-780: Graduate AI](https://reader030.vdocuments.net/reader030/viewer/2022012503/617d7da328fe0f316577f982/html5/thumbnails/108.jpg)
Relationship to SAT, ILP
Easy to write a clause or a linear constraint as a factor: 1 if satisfied, 0 o/w
Feasibility problem: is Z > 0?
more generally, count satisfying assignments (determine Z)
NP-complete or #P-complete (respectively)
Sampling problem: return a satisfying assignment uniformly at random
104
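The clause-to-factor translation above can be sketched as follows; the literal encoding (variable index plus polarity) is my own choice, not from the lecture:

```python
# A SAT clause as a hard factor: 1 if the clause is satisfied, 0 otherwise.
def clause_factor(literals):
    """literals: list of (var_index, polarity) pairs; a clause is satisfied
    when at least one literal matches the assignment."""
    def phi(assignment):
        return 1 if any(assignment[i] == pol for i, pol in literals) else 0
    return phi

# Clause (x0 OR NOT x2):
phi = clause_factor([(0, True), (2, False)])
print(phi((False, True, False)))  # 1  (x2 is False, so NOT x2 holds)
print(phi((False, True, True)))   # 0  (neither literal holds)
```

With all factors 0/1, Z is exactly the number of satisfying assignments, which is why determining Z is #P-complete.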