

Dynamic Bayesian Networks

CSE 473

© Daniel S. Weld 2

473 Topics

Agency
Problem Spaces
Search
Knowledge Representation
  Logic
  Probability
Inference
Planning
Learning
  Reinforcement Learning


Last Time
• Basic notions
• Bayesian networks
• Statistical learning
  - Parameter learning (MAP, ML, Bayesian learning)
  - Naïve Bayes
  - Structure learning
  - Expectation Maximization (EM)
• Dynamic Bayesian networks (DBNs)
• Markov decision processes (MDPs)


Generative Planning

Input
• Description of (initial state of) world (in some KR)
• Description of goal (in some KR)
• Description of available actions (in some KR)

Output
• Controller, e.g.:
  - Sequence of actions
  - Plan with loops and conditionals
  - Policy f: states -> actions


Simplifying Assumptions

[Diagram: agent loop. The agent receives percepts from the environment, asks "What action next?", and emits actions.]

• Static vs. Dynamic
• Fully Observable vs. Partially Observable
• Deterministic vs. Stochastic
• Instantaneous vs. Durative
• Full vs. Partial satisfaction
• Percepts: Perfect vs. Noisy
• Actions: Deterministic vs. Stochastic


"Classical Planning" assumes: Static, Deterministic, Fully Observable, Instantaneous, Propositional.

Relaxing each assumption leads to a different planning technique:
• Dynamic → Replanning / Situated Plans
• Durative → Temporal Reasoning
• Continuous → Numeric Constraint Reasoning (LP/ILP)
• Stochastic → Contingent/Conformant Plans, Interleaved Execution; MDP Policies; POMDP Policies
• Partially Observable → Contingent/Conformant Plans, Interleaved Execution; Semi-MDP Policies


Uncertainty

• Disjunctive: the next state could be one of a set of states.
• Probabilistic: the next state is a probability distribution over the set of states.

Which is the more general model?


STRIPS Action Schemata

(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1)))

• Instead of defining ground actions pickup-A, pickup-B, …
• Define a schema.
  Note: STRIPS doesn't allow derived effects; you must be complete!
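The grounding step the bullets allude to (one schema instantiated into pickup-A, pickup-B, …) can be sketched in Python. The function and data shapes here are illustrative, not from any particular planner:

```python
# Sketch: ground a STRIPS-style schema into one action per object binding.
# Schema and object names are illustrative.
from itertools import product

def ground(schema_name, params, objects):
    """Instantiate a schema for every combination of objects."""
    actions = []
    for combo in product(objects, repeat=len(params)):
        binding = dict(zip(params, combo))
        actions.append((schema_name, binding))
    return actions

# One ground action per block: pick-up with ?ob1 bound to A, B, or C.
actions = ground("pick-up", ["?ob1"], ["A", "B", "C"])
```

A real planner would also substitute the binding into the precondition and effect lists; the sketch only enumerates the bindings.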


Nondeterministic “STRIPS”?

[Diagram: action pick_up_A with preconditions arm_empty, clear_A, on_table_A; one outcome is unlabeled ("?") and another has effects + arm_empty, - on_table_A, - holding_A.]


Probabilistic STRIPS?

[Diagram: action pick_up_A with preconditions arm_empty, clear_A, on_table_A; with probability 0.9 the effects are + arm_empty, - on_table_A, - holding_A.]

But what does this all mean anyway?


Defn: Markov Model

• Q: set of states
• an initial probability distribution over Q
• A: transition probability distribution


Probability Distribution, A

• Forward causality: the probability of s_t does not depend directly on values of future states.
• The probability of the new state could depend on the whole history of states visited: Pr(s_t | s_{t-1}, s_{t-2}, …, s_0)
• Markovian assumption: Pr(s_t | s_{t-1}, s_{t-2}, …, s_0) = Pr(s_t | s_{t-1})
• Stationary model assumption: Pr(s_t | s_{t-1}) = Pr(s_k | s_{k-1}) for all k.
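The two assumptions can be sketched directly in Python: one fixed transition table (stationarity) and a step function that looks only at the current state (Markov assumption). The 3-state chain and its probabilities are made up for illustration:

```python
# Sketch of a stationary Markov chain. The transition table A is fixed for
# all time steps, and each step conditions only on the current state.
import random

A = {
    "s0": [("s0", 0.5), ("s1", 0.5)],
    "s1": [("s1", 0.2), ("s2", 0.8)],
    "s2": [("s0", 1.0)],
}

def step(state):
    # Sample the next state from Pr(s_t | s_{t-1}) alone; no history needed.
    targets, probs = zip(*A[state])
    return random.choices(targets, weights=probs)[0]

def simulate(start, n):
    states = [start]
    for _ in range(n):
        states.append(step(states[-1]))
    return states
```

Each row of A is a distribution over successor states, so its probabilities must sum to 1.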


Representing A

Q: set of states
init prob distribution
A: transition probabilities
How can we represent these?

[Diagram: states s0 … s6 and a 7×7 matrix indexed by s0 … s6; entry p12 is the probability of transitioning from s1 to s2. What must each row sum to (∑ = ?)]


Factoring Q

• Represent Q simply as a set of states?
• Is there internal structure?
  - Consider a robot domain.
  - What is the state space?


A Factored Domain

• Variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)
• How many states? 2^6 = 64
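The count is easy to check by enumerating the joint assignments of the six binary variables; the enumeration itself is just an illustrative sketch:

```python
# Sketch: the joint state space of 6 binary variables, enumerated explicitly.
from itertools import product

VARS = ["huc", "hrc", "w", "u", "r", "o"]  # variable names from the slide

# One state per assignment of True/False to each variable: 2**6 of them.
states = [dict(zip(VARS, bits)) for bits in product([False, True], repeat=len(VARS))]
```

This is exactly why flat state enumeration blows up: each new binary variable doubles the list.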


Representing Compactly

Q: set of states
init prob distribution
How can we represent this efficiently?

With a Bayes net (of course!)

[Diagram: a Bayes net over the state variables (r, u, hrc, w, …) replacing the explicit state list s0 … s6.]


Representing A Compactly

Q: set of states
init prob distribution
A: transition probabilities

[Diagram: the state-by-state transition matrix, now over all 64 states, with entries like p12.]

How big is the matrix version of A? 64 × 64 = 4096 entries.


Dynamic Bayesian Network

[Two-slice diagram: variables huc, hrc, w, u, r, o at time T and again at time T+1, with arcs from slice T into slice T+1.]

Per-variable CPT sizes: 8, 4, 16, 4, 2, 2.

Total values required to represent the transition probability table = 36, vs. 4096 required for a complete state-to-state probability table.
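The 36-vs-4096 arithmetic can be sketched in Python. For a binary variable with k parents, the CPT needs 2^k probabilities (one P(X=T | parents) per parent configuration). The parent counts below are assumptions chosen only to reproduce the CPT sizes listed on the slide, not the actual network structure:

```python
# Sketch: why the factored (DBN) transition model is compact.
# Assumed parent counts per next-state variable, chosen to match the slide's
# CPT sizes (8, 4, 16, 4, 2, 2); the real arcs may differ.
parents = {"huc": 3, "hrc": 2, "w": 4, "u": 2, "r": 1, "o": 1}

# Each binary variable needs one probability per parent configuration.
cpt_sizes = {v: 2 ** k for v, k in parents.items()}
dbn_total = sum(cpt_sizes.values())   # 8 + 4 + 16 + 4 + 2 + 2 = 36
flat_total = 2 ** 6 * 2 ** 6          # full 64x64 state-to-state matrix = 4096
```

The saving comes entirely from each variable having few parents; with all 6 variables as parents of every next-state variable, the factored form would be no smaller.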


Dynamic Bayesian Network

[Same two-slice diagram over huc, hrc, w, u, r, o at times T and T+1.]

Also known as a Factored Markov Model. Defined formally as:
• a set of random variables
• a BN for the initial state
• a 2-layer DBN for transitions


This is Really a “Schema”

[Two-slice diagram over huc, hrc, w, u, r, o; the T → T+1 slice is a template that is "unrolled" over time.]


Semantics: A DBN = A Normal Bayesian Network

[Diagram: the 2-layer DBN unrolled into an ordinary Bayes net with time slices 0, 1, 2, 3, each containing huc, hrc, w, u, r, o.]


Multiple Actions

• Variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)
• Actions: buy_coffee, deliver_coffee, get_umbrella, move
• We need an extra variable
• (Alternatively, a separate 2-layer DBN per action)


Actions in DBN

[Diagram: the two-slice DBN over huc, hrc, w, u, r, o, with an additional action variable a at time T influencing the T+1 variables.]


What can we do with DBNs?

• State estimation
• Planning, if you add rewards (next class)


State Estimation

[Diagram: state variables O, R, Huc, Hrc, U, W tracked over time; state values unknown ("?"), with known actions Move and BuyC between steps.]


Noisy Observations

[Two-slice DBN over huc, hrc, w, u, r, o (T and T+1), with a sensor node S_o attached to o in each slice.]

• True state variables: unobserved
• S_o: noisy sensor reading of o
  Pr(S_o = T | o = T) = 0.9;  Pr(S_o = T | o = F) = 0.2
• a: action variable, known with certainty


State Estimation

[Diagram: state variables O, R, Huc, Hrc, U, W over time with known actions Move and BuyC, now also conditioning on the observation S_o = T, …]


Maybe We Don't Know the Action!

[Same two-slice DBN with sensor node S_o attached to o.]

• True state variables: unobserved
• S_o: noisy sensor reading of o
  Pr(S_o = T | o = T) = 0.9;  Pr(S_o = T | o = F) = 0.2
• a: action variable (another agent's action), now also unobserved


State Estimation

[Diagram: state variables O, R, Huc, Hrc, U, W over time; both states and actions unknown ("? ?"), conditioning on S_o = T, …]


Inference

• Expand the DBN (schema) into a BN
  - Use inference as shown last week
  - Problem?
• Particle filters


Particle Filters

• Create 1000s of "particles"
  - Each one has a "copy" of each random variable
  - These have concrete values
  - The distribution is approximated by the whole set
• Approximate technique, but fast!
• Simulate
  - Each time step, iterate per particle
    • Use the 2-layer DBN + a random number generator to predict the next value of each random variable
  - Resample!


Resampling

• Simulate
  - Each time step, iterate per particle
    • Use the 2-layer DBN + a random number generator to predict the next value of each random variable
  - Resample:
    • Given observations (if any) at the new time point
    • Calculate the likelihood of each particle
    • Create a new population of particles by sampling from the old population, proportionally to likelihood
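The simulate/resample loop can be sketched for a single binary state variable. The sensor probabilities (0.9 and 0.2) come from the noisy-observation slide; the transition probability and the observation sequence are made-up assumptions for illustration:

```python
# Minimal particle-filter sketch for one binary state variable o, with the
# sensor model Pr(So=T | o=T) = 0.9, Pr(So=T | o=F) = 0.2 from the slides.
import random

N = 1000
P_STAY = 0.8                              # assumed Pr(o_t = o_{t-1}); illustrative
P_SO_GIVEN_O = {True: 0.9, False: 0.2}    # sensor model from the slides

def predict(particles):
    # Sample each particle's next state from the transition model.
    return [p if random.random() < P_STAY else (not p) for p in particles]

def resample(particles, observation):
    # Weight each particle by the likelihood of the observation, then draw a
    # new population of the same size, proportionally to those weights.
    weights = [P_SO_GIVEN_O[p] if observation else 1 - P_SO_GIVEN_O[p]
               for p in particles]
    return random.choices(particles, weights=weights, k=len(particles))

particles = [random.random() < 0.5 for _ in range(N)]
for obs in [True, True, False]:           # a made-up observation sequence
    particles = predict(particles)
    particles = resample(particles, obs)

estimate = sum(particles) / N             # approximate Pr(o = T | observations)
```

The fraction of particles with o = T approximates the posterior; accuracy improves with more particles, which is the approximate-but-fast trade-off the slide describes.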
