Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding
Martin Sachenbacher, July 1, 2003. Exact vs. Approximate ME. Problems of ME with incomplete belief state: dead ends (no solutions), incorrect leading solutions, incorrect probabilities of solutions.
![Page 1: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/1.jpg)
Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding
Martin Sachenbacher
July 1, 2003
![Page 2: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/2.jpg)
Exact vs. Approximate ME
Problems of ME with incomplete belief state
– Dead ends (no solutions)
– Incorrect leading solutions
– Incorrect probabilities of solutions
Usefulness of ME with complete belief state
– As accuracy reference
– As performance reference
– As a starting point for approximations
Key: Compact representation of belief state
– Map to semiring-based CSP
– Decompose hypergraph into hypertree
– Encode tree nodes symbolically as ADDs
![Page 3: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/3.jpg)
Outline
SCSPs (Semiring-based CSPs)
Mapping State Constraints to SCSPs
Mapping Transition Constraints to SCSPs
ADDs (Algebraic Decision Diagrams)
Hypertree Decompositions of SCSPs
Solving Tree-structured SCSPs
Exact Mode Estimation for POMDPs as Decomposition/ADD-based SCSP Solving
Demonstration: Two Switches Example
![Page 4: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/4.jpg)
SCSPs (Semiring-based CSPs)
Generalization of CSPs [Bistarelli et al. 97]
Domain D, Variables V, Set S, Type T ⊆ V
Constraints are mappings D^k → S
Operations ⊗ (for join) and ⊕ (for projection) on S
(S, ⊕, ⊗, 0, 1) must form a c-semiring
Dynamic Programming applicable to all SCSPs
Examples
– ({0,1}, ∨, ∧, 0, 1): Classical CSPs
– (R+, min, +, +∞, 0): Weighted CSPs
– ([0,1], max, *, 0, 1): Probabilistic CSPs
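The three example semirings can be written down directly as data. The following is a minimal sketch, not the author's implementation: the names `Semiring`, `combine`, and `project`, and the sample ratings, are illustrative assumptions. It shows the same two generic operations yielding feasibility, minimum cost, or maximum probability depending on the semiring plugged in.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # additive operation, used for projection
    times: Callable[[Any, Any], Any]  # multiplicative operation, used for join
    zero: Any                         # neutral element of plus
    one: Any                          # neutral element of times


CLASSICAL = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
WEIGHTED = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
PROBABILISTIC = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def combine(sr, values):
    """Join: fold the values that several constraints give to one assignment."""
    out = sr.one
    for v in values:
        out = sr.times(out, v)
    return out


def project(sr, values):
    """Projection: fold the values of all assignments being summarized."""
    out = sr.zero
    for v in values:
        out = sr.plus(out, v)
    return out


# Hypothetical data: two candidate assignments, each rated by two constraints.
prob_ratings = [[0.99, 0.9], [0.01, 1.0]]
cost_ratings = [[1.0, 2.0], [4.0, 0.0]]
sat_ratings = [[True, False], [False, False]]

best_prob = project(PROBABILISTIC, [combine(PROBABILISTIC, r) for r in prob_ratings])
best_cost = project(WEIGHTED, [combine(WEIGHTED, r) for r in cost_ratings])
satisfiable = project(CLASSICAL, [combine(CLASSICAL, r) for r in sat_ratings])
```

Here `best_prob` is the value of the most probable assignment, `best_cost` the cheapest total cost, and `satisfiable` whether any assignment satisfies all constraints.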
![Page 5: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/5.jpg)
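Encoding States as SCSPs
Example: Or-Gate, P(Or=ok) = 99%, P(Or=fty) = 1%

| xt | in1 | in2 | out | f |
|---|---|---|---|---|
| ok | lo | lo | lo | 0.99 |
| ok | lo | hi | hi | 0.99 |
| ok | hi | lo | hi | 0.99 |
| ok | hi | hi | hi | 0.99 |
| fty | * | * | * | 0.01 |

[Figure: Or-gate symbol (≥ 1) labeled Or]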
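As a rough illustration of this encoding, the Or-gate constraint can be written as a function from tuples to semiring values. This is a sketch under the assumption that tuples not listed for mode ok get value 0 and that `*` stands for any value; the function and variable names are made up for this example.

```python
from itertools import product

LEVELS = ("lo", "hi")


def or_gate(mode, in1, in2, out):
    """Value f of the tuple (mode, in1, in2, out) in the probabilistic semiring."""
    if mode == "fty":
        return 0.01  # faulty gate: every input/output combination is allowed
    expected = "hi" if "hi" in (in1, in2) else "lo"
    return 0.99 if out == expected else 0.0


# Explicit table over all tuples; the wildcard row expands to 8 tuples.
table = {t: or_gate(*t) for t in product(("ok", "fty"), LEVELS, LEVELS, LEVELS)}
nonzero = {t: v for t, v in table.items() if v > 0}
```

The explicit table has 16 tuples, of which 4 (mode ok) plus 8 (mode fty) carry a nonzero value.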
![Page 6: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/6.jpg)
Encoding Observations as SCSPs
Example: (Probabilistic) Observation
[Figure: bar chart of P over xi = 0, 1, 2, 3; distribution over values for xi]

| xi | f |
|---|---|
| 0 | 0.6 |
| 1 | 0.9 |
| 2 | 0.3 |
| 3 | 0.0 |
![Page 7: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/7.jpg)
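Encoding Transitions as SCSPs
Example: (Probabilistic) CCA
[Figure: automaton with states 0 and 1; transitions labeled cmd=off and cmd=on, each with probability 0.9]
Transition Function

| xt | cmd | xt+1 | f |
|---|---|---|---|
| 0 | off | 0 | 0.9 |
| 0 | on | 0 | 0.1 |
| 0 | off | 1 | 0.1 |
| 0 | on | 1 | 0.9 |
| 1 | off | 0 | 0.9 |
| 1 | on | 0 | 0.1 |
| 1 | off | 1 | 0.1 |
| 1 | on | 1 | 0.9 |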
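The transition function can be held as a plain lookup table keyed by (xt, cmd, xt+1). The sketch below copies the values from the slide; the helper name `successor_distribution` is an assumption introduced for illustration. It also checks that, for each state and command, the values over the successor states form a distribution.

```python
# Transition values f(xt, cmd, xt+1) as listed on the slide.
T = {
    (0, "off", 0): 0.9, (0, "on", 0): 0.1,
    (0, "off", 1): 0.1, (0, "on", 1): 0.9,
    (1, "off", 0): 0.9, (1, "on", 0): 0.1,
    (1, "off", 1): 0.1, (1, "on", 1): 0.9,
}


def successor_distribution(x, cmd):
    """Distribution over xt+1 given the current state and the command."""
    return {x1: T[(x, cmd, x1)] for x1 in (0, 1)}


# Sanity check: the values for each (state, command) pair sum to 1.
row_sums = {
    (x, cmd): sum(successor_distribution(x, cmd).values())
    for x in (0, 1)
    for cmd in ("off", "on")
}
```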
![Page 8: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/8.jpg)
Algebraic Decision Diagrams
ADDs: Symbolic (graph-based) representation of functions {0,1}^n → R
Generalization of BDDs (functions {0,1}^n → {0,1})
Canonicity of representation (as for BDDs)
Efficient package: CUDD
[Figure: ADD over variables A, B, C with leaf values 0, 1, 2, 3]
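To make the sharing and canonicity idea concrete, here is a toy reduced, ordered decision diagram with numeric leaves. It is only a sketch: the class `ToyADD` and its methods are invented for this example and are unrelated to the CUDD API, and the builder enumerates all assignments, which a real package avoids.

```python
class ToyADD:
    """Reduced, ordered decision diagram with numeric leaves."""

    def __init__(self, order):
        self.order = order   # variable order, root first
        self.unique = {}     # unique table of internal nodes
        self.leaves = {}     # unique table of leaf nodes

    def leaf(self, value):
        return self.leaves.setdefault(value, ("leaf", value))

    def node(self, var, lo, hi):
        if lo == hi:         # redundant test: both branches agree
            return lo
        key = (var, lo, hi)
        return self.unique.setdefault(key, key)

    def build(self, f, prefix=()):
        """Build the diagram of f by Shannon expansion along the variable order."""
        if len(prefix) == len(self.order):
            return self.leaf(f(*prefix))
        lo = self.build(f, prefix + (0,))
        hi = self.build(f, prefix + (1,))
        return self.node(self.order[len(prefix)], lo, hi)


def evaluate(node, env):
    """Follow one path from the root to a leaf."""
    while node[0] != "leaf":
        var, lo, hi = node
        node = hi if env[var] else lo
    return node[1]


mgr = ToyADD(("A", "B", "C"))
root = mgr.build(lambda a, b, c: a + b + c)   # f counts the variables set to 1
const_mgr = ToyADD(("A", "B", "C"))
const_root = const_mgr.build(lambda a, b, c: 7)
```

For f = A + B + C the 8 table rows collapse to 6 internal nodes and 4 leaves, and a constant function collapses to a single leaf.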
![Page 9: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/9.jpg)
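ADD Join Operations
Multiplication, addition, maximum, …
Generalization of BDD operations

| ABC | f | g | f+g | f*g | 5*f | f>1 | max(f,g) |
|---|---|---|---|---|---|---|---|
| 000 | 0 | 3 | 3 | 0 | 0 | 0 | 3 |
| 001 | 1 | 2 | 3 | 2 | 5 | 0 | 2 |
| 010 | 1 | 0 | 1 | 0 | 5 | 0 | 1 |
| 011 | 2 | 1 | 3 | 2 | 10 | 1 | 2 |
| 100 | 1 | 0 | 1 | 0 | 5 | 0 | 1 |
| 101 | 2 | 0 | 2 | 0 | 10 | 1 | 2 |
| 110 | 2 | 0 | 2 | 0 | 10 | 1 | 2 |
| 111 | 3 | 1 | 4 | 3 | 15 | 1 | 3 |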
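The join operations are pointwise operations on the two functions. A small sketch over explicit tables (dictionaries standing in for ADDs, with the helper names `apply_op` and `column` chosen for this example) reproduces them for the f and g of this slide:

```python
import operator
from itertools import product

KEYS = list(product((0, 1), repeat=3))            # assignments to (A, B, C)
f = dict(zip(KEYS, [0, 1, 1, 2, 1, 2, 2, 3]))
g = dict(zip(KEYS, [3, 2, 0, 1, 0, 0, 0, 1]))


def apply_op(op, x, y):
    """Pointwise binary operation on two functions over the same variables."""
    return {k: op(x[k], y[k]) for k in x}


def column(h):
    """Values of h in the row order 000, 001, ..., 111."""
    return [h[k] for k in KEYS]


f_plus_g = apply_op(operator.add, f, g)
f_times_g = apply_op(operator.mul, f, g)
f_max_g = apply_op(max, f, g)
five_f = {k: 5 * v for k, v in f.items()}          # scalar multiplication
f_gt_1 = {k: int(v > 1) for k, v in f.items()}     # thresholding to a 0/1 function
```

On an ADD the same operations are applied recursively on the graph rather than row by row, so shared subgraphs are processed once.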
![Page 10: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/10.jpg)
Example
Summation of ADD f, ADD g
[Figure: two ADDs over variables A, B, C, one with leaves 3 2 1 0 and one with leaves 0 1 2 3, and their sum, an ADD with leaves 4 3 2 1]
![Page 11: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/11.jpg)
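ADD Projection Operations
Σ(f,X) (and Π(f,X)) obtained by summing (multiplying) values of tuples that differ only w.r.t. X

| ABC | f |
|---|---|
| 000 | 0 |
| 001 | 1 |
| 010 | 1 |
| 011 | 2 |
| 100 | 1 |
| 101 | 2 |
| 110 | 2 |
| 111 | 3 |

| AB | Σ(f,{C}) | Π(f,{C}) |
|---|---|---|
| 00 | 1 | 0 |
| 01 | 3 | 2 |
| 10 | 3 | 2 |
| 11 | 5 | 6 |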
![Page 12: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/12.jpg)
ADD Projection Operations
For optimization, we require an operation max(f,X) that yields the maximum value of tuples differing only w.r.t. X
Not part of CUDD, but easy to implement as a variant of Σ(f,X) / Π(f,X)

| ABC | f |
|---|---|
| 000 | 0 |
| 001 | 1 |
| 010 | 1 |
| 011 | 2 |
| 100 | 1 |
| 101 | 2 |
| 110 | 2 |
| 111 | 3 |

| AB | Σ(f,{C}) | Π(f,{C}) | max(f,{C}) |
|---|---|---|---|
| 00 | 1 | 0 | 1 |
| 01 | 3 | 2 | 2 |
| 10 | 3 | 2 | 2 |
| 11 | 5 | 6 | 3 |
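All three projections follow one pattern: group the tuples that agree on the remaining variables, then reduce each group with sum, product, or max. A sketch over an explicit table (the helper `project_out` is a name chosen here, and a dictionary stands in for the ADD):

```python
from itertools import product
from math import prod

VARIABLES = ("A", "B", "C")
f = {k: sum(k) for k in product((0, 1), repeat=3)}   # f = A + B + C, as in the table


def project_out(h, variables, drop, reduce_fn):
    """Eliminate the variables in `drop`, reducing each group of tuples with reduce_fn."""
    keep = [i for i, v in enumerate(variables) if v not in drop]
    groups = {}
    for key, value in h.items():
        groups.setdefault(tuple(key[i] for i in keep), []).append(value)
    return {k: reduce_fn(vals) for k, vals in groups.items()}


sum_c = project_out(f, VARIABLES, {"C"}, sum)
prod_c = project_out(f, VARIABLES, {"C"}, prod)
max_c = project_out(f, VARIABLES, {"C"}, max)
```

Swapping the reduction function is the only difference between the three, which is why a max variant is a small change to an existing sum or product abstraction.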
![Page 13: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/13.jpg)
Solving SCSPs using Decomposition
Transform SCSP into hypertree H = (T, χ, λ)
Compute constraint φ(v) for each node v
Bottom-up phase for computing values
Top-down phase for extracting solutions
![Page 14: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/14.jpg)
Pseudocode for Bottom-Up Phase
Function solve(v)
  For Each child ∈ children(v)
    φ(v) ← φ(v) * max(φ(child), χ(child) \ χ(v))
  Next child
  Return φ(v)
Generalization of (Semi-)Join Operation
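The bottom-up phase can be sketched with explicit tables in place of ADDs. Everything below is an illustration under stated assumptions: constraints are (variables, rows) pairs, children are solved recursively before their constraint is used, and the gate fault probabilities (0.99/0.01 for Or-gates, 0.995/0.005 for And-gates) are inferred from the U values shown in the polycell example that follows. The node layout v0 to v3 and the observed values are taken from that example.

```python
from itertools import product


def table(variables, domains, fn):
    """Tabulate fn over the domains, keeping only tuples with nonzero value."""
    rows = {}
    for tup in product(*(domains[v] for v in variables)):
        val = fn(dict(zip(variables, tup)))
        if val > 0:
            rows[tup] = val
    return (tuple(variables), rows)


def multiply(c1, c2):
    """Join two constraints, multiplying the values of compatible tuples."""
    (v1, r1), (v2, r2) = c1, c2
    extra = [v for v in v2 if v not in v1]
    out = {}
    for t1, x1 in r1.items():
        a1 = dict(zip(v1, t1))
        for t2, x2 in r2.items():
            a2 = dict(zip(v2, t2))
            if all(a1[v] == a2[v] for v in v2 if v in a1):
                out[t1 + tuple(a2[v] for v in extra)] = x1 * x2
    return (v1 + tuple(extra), out)


def max_out(c, drop):
    """max(f, X): eliminate the variables X, keeping the best value per remaining tuple."""
    vs, rows = c
    keep = [i for i, v in enumerate(vs) if v not in drop]
    out = {}
    for t, x in rows.items():
        k = tuple(t[i] for i in keep)
        out[k] = max(out.get(k, 0.0), x)
    return (tuple(vs[i] for i in keep), out)


def solve(v, phi, children):
    """Bottom-up phase; assumption: each child is solved before it is absorbed."""
    for child in children.get(v, []):
        msg = solve(child, phi, children)
        private = set(msg[0]) - set(phi[v][0])       # variables of child not in v
        phi[v] = multiply(phi[v], max_out(msg, private))
    return phi[v]


def gate(op, p_ok, p_fty):
    def f(mode, a, b, out):
        if mode == "fty":
            return p_fty                              # faulty: unconstrained
        return p_ok if out == op(a, b) else 0.0
    return f


OR = gate(lambda a, b: a | b, 0.99, 0.01)             # assumed fault probabilities
AND = gate(lambda a, b: a & b, 0.995, 0.005)

modes, bits = ("ok", "fty"), (0, 1)
dom = {m: modes for m in ("O1", "O2", "O3", "A1", "A2")}
dom.update({"A": (1,), "B": (1,), "C": (1,), "D": (1,), "E": (0,), "F": (0,), "G": (1,)})
dom.update({"X": bits, "Y": bits, "Z": bits})

phi = {
    "v0": table(("O3", "A1", "C", "E", "F", "X", "Y", "Z"), dom,
                lambda a: OR(a["O3"], a["C"], a["E"], a["Z"]) * AND(a["A1"], a["X"], a["Y"], a["F"])),
    "v1": table(("A2", "G", "Y", "Z"), dom, lambda a: AND(a["A2"], a["Y"], a["Z"], a["G"])),
    "v2": table(("O2", "B", "D", "Y"), dom, lambda a: OR(a["O2"], a["B"], a["D"], a["Y"])),
    "v3": table(("O1", "A", "C", "X"), dom, lambda a: OR(a["O1"], a["A"], a["C"], a["X"])),
}
initial_max = max(phi["v0"][1].values())
root = solve("v0", phi, {"v0": ["v1", "v2", "v3"]})
u_max = max(root[1].values())
```

With these assumptions the leading value at the root starts at about .98505 and ends at about .0097 once all three children have been absorbed.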
![Page 15: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/15.jpg)
Example
Boolean Polycell
[Figure: Boolean polycell circuit with gates Or1, Or2, Or3, And1, And2, internal variables X, Y, Z, and values A = 1, B = 1, C = 1, D = 1, E = 0, F = 0, G = 1]
![Page 16: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/16.jpg)
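Example
Hypertree Decomposition of Boolean Polycell
v0 (root): O3 A1 C E F X Y Z
v1: A2 G Y Z, connected to v0 via Y, Z
v2: O2 B D Y, connected to v0 via Y
v3: O1 A C X, connected to v0 via C, X

Tuples at v0 (O3 A1 C E F X Y Z), leading value U=.98505:
ok ok 1 0 0 0 0 1
ok ok 1 0 0 0 1 1
ok ok 1 0 0 1 0 1
…

Tuples at v1 (A2 G Y Z), values U=.995 and U=.005:
ok 1 1 1
fty 1 1 1
fty 1 0 1
fty 1 1 0
fty 1 0 0

Tuples at v2 (O2 B D Y), values U=.99 and U=.01:
ok 1 1 1
fty 1 1 1
fty 1 1 0

Tuples at v3 (O1 A C X), values U=.99 and U=.01:
ok 1 1 1
fty 1 1 1
fty 1 1 0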
![Page 17: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/17.jpg)
Example
Initial φ(v0)
v0: O3 A1 C E F X Y Z
Tuples shown:
fty ok 1 0 0 0 1 1
fty ok 1 0 0 0 1 0
fty ok 1 0 0 1 0 0
fty ok 1 0 0 1 0 1
fty ok 1 0 0 0 1 1
fty ok 1 0 0 1 0 1
…
ok ok 1 0 0 0 0 1
ok ok 1 0 0 0 1 1
ok ok 1 0 0 1 0 1
Leaf values: U=.98505, U=.00995, U=.00495, U=.00005
ADD with 20 nodes, 5 leaves
![Page 18: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/18.jpg)
Example
After multiplication with max(φ(v1),{A2,G})
v0: O3 A1 C E F X Y Z
Tuples shown:
ok ok 1 0 0 0 1 1
fty ok 1 0 0 0 1 1
ok ok 1 0 0 0 0 1
ok ok 1 0 0 1 0 1
…
Leaf values: U=.98012, U=.00990, U=.00492, U=4.9E-5, U=2.4E-5, U=2.5E-7
ADD with 28 nodes, 7 leaves
![Page 19: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/19.jpg)
Example
After multiplication with max(φ(v2),{O2,B,D})
v0: O3 A1 C E F X Y Z
Tuples shown:
ok ok 1 0 0 0 1 1
fty ok 1 0 0 0 1 1
ok fty 1 0 0 0 1 1
…
Leaf values: U=.97032, U=.00980, U=.00487, U=4.9E-5, U=4.9E-7, U=2.4E-7, U=2.5E-9
ADD with 30 nodes, 8 leaves
![Page 20: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/20.jpg)
Example
After multiplication with max(φ(v3),{O1,A})
v0: O3 A1 C E F X Y Z
Tuples shown:
ok ok 1 0 0 0 1 1
ok fty 1 0 0 1 1 1
fty ok 1 0 0 0 1 1
…
Leaf values: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=4.9E-7, U=2.4E-7, U=4.9E-9, U=2.4E-9, U=2.5E-11
ADD with 35 nodes, 10 leaves
Best Solution: Umax = .0097
![Page 21: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/21.jpg)
Pseudocode for Top-Down Phase
Function extractSolutions(vroot)
  E ← edges(vroot)
  σ ← φ(vroot)
  σ ← max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))
  While E ≠ ∅ Do
    e ← choose(E)
    v ← son-node(e)
    E ← (E \ e) ∪ edges(v)
    σ0-1 ← (σ > 0)
    div ← max(σ0-1 * φ(v), vars(σ))
    σ ← (σ * φ(v)) / div
    σ ← max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))
  End While
Notes:
– div is the “divisor”
– The max projection restricts σ to decision and shared variables
– No search queue necessary
![Page 22: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/22.jpg)
Example
Initial σ = max(φ(vroot),{E,F})
Variables: O3 A1 C X Y Z
Tuples shown:
ok ok 1 0 1 1
ok fty 1 1 1 1
fty ok 1 0 1 1
…
Leaf values: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=4.9E-7, U=2.4E-7, U=4.9E-9, U=2.4E-9, U=2.5E-11
ADD with 21 tuples, 33 nodes, 10 leaves
![Page 23: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/23.jpg)
Example
After processing edge(v0,v3)
Variables: O1 O3 A1 Y Z
Tuples shown:
fty ok ok 1 1
ok ok fty 1 1
fty fty ok 1 1
…
Leaf values: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=4.9E-7, U=2.4E-7, U=4.9E-9, U=2.4E-9, U=2.5E-11
ADD with 21 tuples, 32 nodes, 10 leaves
![Page 24: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/24.jpg)
Example
After processing edge(v0,v2)
Variables: O1 O2 O3 A1 Y Z
Tuples shown:
fty ok ok ok 1 1
ok ok ok fty 1 1
fty fty ok ok 1 1
fty ok fty ok 1 1
…
Leaf values: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=9.9E-7, U=4.9E-7, U=2.5E-11
ADD with 30 tuples, 47 nodes, 11 leaves
![Page 25: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/25.jpg)
Example
After processing edge(v0,v1)
Variables: O1 O2 O3 A1 A2
Tuples shown:
fty ok ok ok ok
ok ok ok fty ok
fty fty ok ok ok
fty ok fty ok ok
…
Leaf values: U=.00970, U=.00482, U=9.8E-5, U=4.8E-5, U=2.4E-5, U=9.9E-7, U=2.5E-11
ADD with 26 tuples, 35 nodes, 12 leaves
#Solutions = 26
Easy to focus on leading solutions.
![Page 26: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/26.jpg)
Application: Exact ME for POMDPs
Given: POMDP (Feasible States, Observables, Control Actions, Transitions), Observations
Approach: Complete representation of belief state (through decomposition and symbolic encoding)
Benefit: Allows for exploiting Markov property
[Figure: belief state over states S0, S1, …, Sn at time t, carried over to states S0, S1, …, Sn at time t+1]
![Page 27: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/27.jpg)
Algorithm: Exact ME for POMDPs
Construct Hypertree (offline)
Construct State-ADDs for each node (offline)
Construct Transition-ADDs for each node (offline)
Repeat for each time step:
– Multiply nodes with Obs-ADDs (“Condition on Observations”)
– Establish consistency in the tree (Bottom-up)
– Extract leading solution(s) from the tree (Top-down)
– Multiply nodes with Transition-ADDs, project on xt+1, set xt = xt+1, multiply with State-ADDs (“Transition Expansion”)
Complexity: polynomial for bounded hypertree width (exponential in the width in the worst case)
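The per-time-step loop can be illustrated on a single two-state component with a flat belief table, without any decomposition or ADDs. This is a sketch under several assumptions: the transition values are those of the earlier two-state CCA example, the observation likelihoods are invented for illustration, the projection onto xt+1 is done by summation, and the State-ADD multiplication is omitted because this toy component has no state constraint.

```python
# Transition values f(xt, cmd, xt+1) of the two-state CCA example.
T = {
    (0, "off", 0): 0.9, (0, "on", 0): 0.1,
    (0, "off", 1): 0.1, (0, "on", 1): 0.9,
    (1, "off", 0): 0.9, (1, "on", 0): 0.1,
    (1, "off", 1): 0.1, (1, "on", 1): 0.9,
}


def condition(belief, obs_likelihood):
    """Condition on observations: multiply the belief with the observation values."""
    return {x: p * obs_likelihood[x] for x, p in belief.items()}


def leading(belief):
    """Extract the leading solution (most likely state)."""
    return max(belief, key=belief.get)


def expand(belief, cmd):
    """Transition expansion: multiply with T, project onto xt+1 (assumed: by sum)."""
    nxt = {0: 0.0, 1: 0.0}
    for x, p in belief.items():
        for x1 in nxt:
            nxt[x1] += p * T[(x, cmd, x1)]
    return nxt


def estimate(belief, steps):
    """Run the loop over (observation likelihood, command) pairs."""
    trace = []
    for obs_likelihood, cmd in steps:
        belief = condition(belief, obs_likelihood)
        trace.append(leading(belief))
        belief = expand(belief, cmd)
    return trace, belief


# Hypothetical observation likelihoods, one per time step.
steps = [({0: 0.8, 1: 0.2}, "on"), ({0: 0.3, 1: 0.7}, "off")]
trace, final_belief = estimate({0: 0.5, 1: 0.5}, steps)
```

The belief is left unnormalized, as in the slides' U values; because the whole belief table is carried forward, no hypothesis is dropped between time steps.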
![Page 28: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/28.jpg)
Example
Adapted from Jim Kurien’s thesis
[Figure: switches Sw1 and Sw2, each with input hi, feeding the Or-gate Or]
– t0: Sw1.cmd = on
– t1: Or.out = lo, Sw1.cmd = idl, Sw2.cmd = on
– t2: Or.out = lo
Switches more likely to fail than Or-Gate
![Page 29: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/29.jpg)
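Example
Switch Model
[Figure: switch automaton with modes on, off, and fty. Transitions under cmd=on, cmd=off, and idl have probability 0.95; on and off each move to fty with probability 0.05; fty stays fty with probability 1.0. Mode on constrains (t1, t2) to lo lo or hi hi; mode off allows lo lo, lo hi, hi lo, hi hi; mode fty has constraint true.]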
![Page 30: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/30.jpg)
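Example
Switch Model

| xt | t1 | t2 | f |
|---|---|---|---|
| on | lo | lo | 1.0 |
| on | hi | hi | 1.0 |
| off | * | * | 1.0 |
| fty | * | * | 1.0 |

| xt | cmd | xt+1 | f |
|---|---|---|---|
| on | on | on | 0.95 |
| on | off | off | 0.95 |
| on | idl | on | 0.95 |
| on | * | fty | 0.05 |
| off | on | on | 0.95 |
| off | off | off | 0.95 |
| off | idl | off | 0.95 |
| off | * | fty | 0.05 |
| fty | * | fty | 1.0 |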
![Page 31: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/31.jpg)
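Example
Or-Gate Model
[Figure: Or-gate automaton with modes ok and fty. ok stays ok with probability 0.99 and moves to fty with probability 0.01; fty stays fty with probability 1.0. Mode ok constrains (in1, in2, out) to the Or truth table; mode fty has constraint true.]

| xt | in1 | in2 | out | f |
|---|---|---|---|---|
| ok | lo | lo | lo | 1.0 |
| ok | lo | hi | hi | 1.0 |
| ok | hi | lo | hi | 1.0 |
| ok | hi | hi | hi | 1.0 |
| fty | * | * | * | 1.0 |

| xt | xt+1 | f |
|---|---|---|
| ok | ok | 0.99 |
| ok | fty | 0.01 |
| fty | fty | 1.0 |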
![Page 32: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/32.jpg)
Example
Initial belief state (chosen):
– p(Sw=on) = p(Sw=off) = 0.475, p(Sw=fty) = 0.05
– p(Or=ok) = 0.99, p(Or=fty) = 0.01
Observations/Commands:
– t0: Sw1.cmd=on
– t1: Or.out=lo, Sw1.cmd=idl, Sw2.cmd=on
– t2: Or.out=lo
Leading Solutions:
– t0: Sw1=on/off, Sw2=on/off, Or=ok
– t1: Sw1=fty, Sw2=off, Or=ok
– t2: Sw1=on, Sw2=on, Or=fty
![Page 33: Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding](https://reader035.vdocuments.net/reader035/viewer/2022062309/568150eb550346895dbf0408/html5/thumbnails/33.jpg)
Conclusion
SCSPs are an elegant and general representation
ADD encoding of SCSPs is efficient in the average case, exponential in the number of variables in the worst case
Decomposition factors the problem into a set of ADDs, each confined to a small number of variables
The two methods complement each other well
How far can we get with this combination?