Two Approximate Algorithms for Belief Updating

Mini-Clustering - MC
Robert Mateescu, Rina Dechter, Kalev Kask. "Tree Approximation for Belief Updating", AAAI-2002

Iterative Join-Graph Propagation - IJGP
Rina Dechter, Kalev Kask and Robert Mateescu. "Iterative Join-Graph Propagation", UAI 2002




What is Mini-Clustering?

Mini-Clustering (MC) is an approximate algorithm for belief updating in Bayesian networks

MC is an anytime version of join-tree clustering

MC applies message passing along a cluster tree

The complexity of MC is controlled by a user-adjustable parameter, the i-bound

Empirical evaluation shows that MC is a very effective algorithm, in many cases superior to other approximate schemes (IBP, Gibbs Sampling)


Belief networks

A belief network is a quadruple $BN = \langle X, D, G, P \rangle$, where:

$X = \{X_1, \ldots, X_n\}$ is a set of random variables

$D = \{D_1, \ldots, D_n\}$ is the set of their domains

$G$ is a DAG (directed acyclic graph) over $X$

$P = \{p_1, \ldots, p_n\}$, with $p_i = P(X_i \mid pa_i)$, are CPTs (conditional probability tables)

[Figure: example belief network, a DAG over the variables A, B, C, D, E, F, G]

The belief updating problem is the task of computing the posterior probability $P(Y \mid e)$ of query nodes $Y \subseteq X$ given evidence $e$. We focus on the basic case where $Y$ is a single variable $X_i$.
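The factorization of a belief network into CPTs, and the belief updating task itself, can be made concrete by brute force on the seven-variable example. The sketch below is only illustrative: the parent sets are read off the CPTs p(a), p(b|a), p(c|a,b), p(d|b), p(f|c,d), p(e|b,f), p(g|e,f) used in the later slides, while the binary domains, the random CPT values, and all function names are assumptions.

```python
import itertools
import random

# Parent sets of the example network, read off its CPTs.
PARENTS = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["B"],
           "F": ["C", "D"], "E": ["B", "F"], "G": ["E", "F"]}

def random_cpts(parents, seed=0):
    """Hypothetical binary CPTs: cpts[X][parent_values] = P(X=1 | parents)."""
    rng = random.Random(seed)
    return {x: {pv: rng.random()
                for pv in itertools.product((0, 1), repeat=len(ps))}
            for x, ps in parents.items()}

def joint(assign, parents, cpts):
    """P(x_1, ..., x_n) as the product of P(x_i | pa_i) over all variables."""
    prob = 1.0
    for x, ps in parents.items():
        p1 = cpts[x][tuple(assign[p] for p in ps)]
        prob *= p1 if assign[x] == 1 else 1.0 - p1
    return prob

def joint_with_evidence(var, evidence, parents, cpts):
    """Brute-force P(var, e): sum the joint over all other variables."""
    names = list(parents)
    result = {0: 0.0, 1: 0.0}
    for values in itertools.product((0, 1), repeat=len(names)):
        assign = dict(zip(names, values))
        if all(assign[k] == v for k, v in evidence.items()):
            result[assign[var]] += joint(assign, parents, cpts)
    return result

cpts = random_cpts(PARENTS)
unnormalized = joint_with_evidence("B", {"G": 1}, PARENTS, cpts)  # P(B, G=1)
p_e = sum(unnormalized.values())                                   # P(G=1)
posterior = {v: p / p_e for v, p in unnormalized.items()}          # P(B | G=1)
```

Brute force enumerates all 2^7 assignments, which is exactly the cost that the tree-based algorithms in the following slides are designed to avoid.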


Tree decompositions

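A tree decomposition for a belief network $BN = \langle X, D, G, P \rangle$ is a triple $\langle T, \chi, \psi \rangle$, where $T = (V, E)$ is a tree and $\chi$ and $\psi$ are labeling functions, associating with each vertex $v \in V$ two sets, $\chi(v) \subseteq X$ and $\psi(v) \subseteq P$, satisfying:

1. For each function $p_i \in P$ there is exactly one vertex such that $p_i \in \psi(v)$ and $scope(p_i) \subseteq \chi(v)$

2. For each variable $X_i \in X$ the set $\{v \in V \mid X_i \in \chi(v)\}$ forms a connected subtree (running intersection property)

[Figure: the belief network over A, B, C, D, E, F, G (left) and a tree decomposition of it (right)]

The tree decomposition in the figure is a chain of four clusters:

Cluster 1: variables A B C; functions p(a), p(b|a), p(c|a,b)

Cluster 2: variables B C D F; functions p(d|b), p(f|c,d)

Cluster 3: variables B E F; functions p(e|b,f)

Cluster 4: variables E F G; functions p(g|e,f)

Separators: BC between clusters 1 and 2, BF between clusters 2 and 3, EF between clusters 3 and 4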
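The two conditions of the definition can be checked mechanically. The following is a minimal sketch on the four-cluster example; the dictionary layout and function names are assumptions made for illustration, and the check takes for granted that the edge list already describes a tree.

```python
# Example decomposition: CHI = cluster variables, PSI = functions placed there.
CHI = {1: {"A", "B", "C"}, 2: {"B", "C", "D", "F"},
       3: {"B", "E", "F"}, 4: {"E", "F", "G"}}
PSI = {1: ["p(a)", "p(b|a)", "p(c|a,b)"], 2: ["p(d|b)", "p(f|c,d)"],
       3: ["p(e|b,f)"], 4: ["p(g|e,f)"]}
EDGES = [(1, 2), (2, 3), (3, 4)]
SCOPES = {"p(a)": {"A"}, "p(b|a)": {"A", "B"}, "p(c|a,b)": {"A", "B", "C"},
          "p(d|b)": {"B", "D"}, "p(f|c,d)": {"C", "D", "F"},
          "p(e|b,f)": {"B", "E", "F"}, "p(g|e,f)": {"E", "F", "G"}}

def is_connected(nodes, edges):
    """True if `nodes` induce a connected subgraph of the given edges."""
    nodes = set(nodes)
    if not nodes:
        return True
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        for a, b in edges:
            if a == v and b in nodes:
                stack.append(b)
            elif b == v and a in nodes:
                stack.append(a)
    return seen == nodes

def is_tree_decomposition(chi, psi, edges, scopes):
    # Condition 1: each function sits in exactly one vertex covering its scope.
    for f, scope in scopes.items():
        homes = [v for v in psi if f in psi[v]]
        if len(homes) != 1 or not scope <= chi[homes[0]]:
            return False
    # Condition 2: running intersection property for every variable.
    variables = set().union(*chi.values())
    return all(is_connected([v for v in chi if x in chi[v]], edges)
               for x in variables)

valid = is_tree_decomposition(CHI, PSI, EDGES, SCOPES)
```

Adding G to cluster 1, for instance, would break the running intersection property, because the clusters containing G (1 and 4) would no longer be connected.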


Cluster Tree Elimination

Cluster Tree Elimination (CTE) is an exact algorithm

It works by passing messages along a tree decomposition

Basic idea:

Each node sends only one message to each of its neighbors

Node u sends a message to its neighbor v only when u has received messages from all its other neighbors
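The sending rule fixes a partial order on the messages. A minimal sketch of one valid schedule is given below; the function name `cte_schedule` and the round-based bookkeeping are assumptions for illustration, not part of the original algorithm description.

```python
from collections import defaultdict

def cte_schedule(edges):
    """Order the 2*|E| directed messages so that u -> v is sent only after
    u has received the messages from all of its other neighbors."""
    nbrs = defaultdict(set)
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    received = defaultdict(set)  # received[u] = neighbors that already sent to u
    pending = {(u, v) for u in nbrs for v in nbrs[u]}
    order = []
    while pending:
        ready = sorted((u, v) for (u, v) in pending
                       if nbrs[u] - {v} <= received[u])
        if not ready:
            raise ValueError("no message is ready; the graph contains a cycle")
        for u, v in ready:
            order.append((u, v))
            received[v].add(u)
            pending.discard((u, v))
    return order

# Chain 1 - 2 - 3 - 4 from the running example.
schedule = cte_schedule([(1, 2), (2, 3), (3, 4)])
# -> [(1, 2), (4, 3), (2, 3), (3, 2), (2, 1), (3, 4)]
```

Leaves send first, and every edge carries exactly one message in each direction.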


Cluster Tree Elimination

Previous work on tree clustering:

Lauritzen, Spiegelhalter - '88 (probabilities)

Jensen, Lauritzen, Olesen - '90 (probabilities)

Shenoy, Shafer - '90, Shenoy - '97 (general)

Dechter, Pearl - '89 (constraints)

Gottlob, Leone, Scarcello - '00 (constraints)


Belief Propagation

[Figure: node u with neighbors x_1, x_2, ..., x_n and v; each neighbor x_j sends h(x_j, u) to u, and u sends h(u, v) to v]

$cluster(u) = \psi(u) \cup \{h(x_1, u), h(x_2, u), \ldots, h(x_n, u), h(v, u)\}$

Compute the message:

$h(u, v) = \sum_{elim(u,v)} \prod_{f \in cluster(u) - \{h(v,u)\}} f$
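The message h(u,v) is a product of all functions in cluster(u) except h(v,u), followed by summing out elim(u,v). A minimal sketch with dictionary-based factors follows; the factor representation, the helper names, and the CPT numbers are all assumptions chosen for illustration.

```python
import itertools

def multiply(factors, domain=(0, 1)):
    """Product of factors. A factor is (variables, table), where
    table[values] is a number and values are ordered like variables."""
    scope = sorted(set().union(*(set(vs) for vs, _ in factors)))
    table = {}
    for values in itertools.product(domain, repeat=len(scope)):
        assign = dict(zip(scope, values))
        prod = 1.0
        for vs, t in factors:
            prod *= t[tuple(assign[v] for v in vs)]
        table[values] = prod
    return tuple(scope), table

def sum_out(factor, elim):
    """Marginalize the variables in `elim` out of a factor."""
    vs, t = factor
    keep = tuple(v for v in vs if v not in elim)
    out = {}
    for values, p in t.items():
        key = tuple(x for v, x in zip(vs, values) if v not in elim)
        out[key] = out.get(key, 0.0) + p
    return keep, out

def message(cluster_u, h_from_v, elim):
    """h(u,v): multiply cluster(u) minus h(v,u), then sum out elim(u,v)."""
    factors = [f for f in cluster_u if f is not h_from_v]
    return sum_out(multiply(factors), elim)

# Hypothetical binary CPTs for cluster 1 = {A, B, C} of the running example.
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_a = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
p_c_ab = (("A", "B", "C"), {(a, b, c): (0.9 if c == (a ^ b) else 0.1)
                            for a in (0, 1) for b in (0, 1) for c in (0, 1)})

# Cluster 1 is a leaf, so h(1,2)(b,c) = sum_a p(a) p(b|a) p(c|a,b).
h_12 = message([p_a, p_b_a, p_c_ab], None, {"A"})
```

With no evidence, h_12 is the marginal P(b,c), so its entries sum to 1.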


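Cluster Tree Elimination - example

[Figure: the belief network over A, B, C, D, E, F, G and its cluster tree ABC -BC- BCDF -BF- BEF -EF- EFG, with clusters numbered 1 to 4]

$h_{(1,2)}(b,c) = \sum_a p(a)\, p(b|a)\, p(c|a,b)$

$h_{(2,1)}(b,c) = \sum_{d,f} p(d|b)\, p(f|c,d)\, h_{(3,2)}(b,f)$

$h_{(2,3)}(b,f) = \sum_{c,d} p(d|b)\, p(f|c,d)\, h_{(1,2)}(b,c)$

$h_{(3,2)}(b,f) = \sum_e p(e|b,f)\, h_{(4,3)}(e,f)$

$h_{(3,4)}(e,f) = \sum_b p(e|b,f)\, h_{(2,3)}(b,f)$

$h_{(4,3)}(e,f) = p(G = g_e \mid e,f)$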


Cluster Tree Elimination - the messages

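Cluster 1 (A B C): p(a), p(b|a), p(c|a,b)

$h_{(1,2)}(b,c) = \sum_a p(a)\, p(b|a)\, p(c|a,b)$

Cluster 2 (B C D F): p(d|b), p(f|c,d), $h_{(1,2)}(b,c)$

$h_{(2,3)}(b,f) = \sum_{c,d} p(d|b)\, p(f|c,d)\, h_{(1,2)}(b,c)$

Cluster 3 (B E F): p(e|b,f), $h_{(2,3)}(b,f)$

Cluster 4 (E F G): p(g|e,f)

Separators: BC (between 1 and 2), BF (between 2 and 3), EF (between 3 and 4)

sep(2,3)={B,F}

elim(2,3)={C,D}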
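The separator and eliminator follow directly from the cluster labels: sep(u,v) is the intersection of the two clusters and elim(u,v) is what remains of cluster u. A short sketch, with the dictionary `CHI` and the function names as illustrative assumptions:

```python
# Variables of each cluster in the running example.
CHI = {1: {"A", "B", "C"}, 2: {"B", "C", "D", "F"},
       3: {"B", "E", "F"}, 4: {"E", "F", "G"}}

def sep(chi, u, v):
    """Separator: variables shared by clusters u and v."""
    return chi[u] & chi[v]

def elim(chi, u, v):
    """Eliminator: variables of u that are summed out when sending to v."""
    return chi[u] - sep(chi, u, v)

sep_23 = sep(CHI, 2, 3)    # {"B", "F"}
elim_23 = elim(CHI, 2, 3)  # {"C", "D"}
```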


Cluster Tree Elimination - properties

Correctness and completeness: Algorithm CTE is correct, i.e. it computes the exact joint probability of a single variable and the evidence.

Time complexity: $O(deg \cdot (n + N) \cdot d^{w^*+1})$

Space complexity: $O(N \cdot d^{sep})$

where

deg = the maximum degree of a node

n = number of variables (= number of CPTs)

N = number of nodes in the tree decomposition

d = the maximum domain size of a variable

w* = the induced width

sep = the separator size
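As a rough worked instance, the two bounds can be evaluated on the four-cluster example tree. This assumes binary variables (d = 2), which the slides do not state for the example. The chain has deg = 2, n = 7, N = 4, the largest cluster BCDF has four variables so w* = 3, and every separator has two variables so sep = 2:

```latex
\begin{align*}
\text{time:}  \quad & deg \cdot (n + N) \cdot d^{\,w^*+1} = 2 \cdot (7 + 4) \cdot 2^{4} = 352 \\
\text{space:} \quad & N \cdot d^{\,sep} = 4 \cdot 2^{2} = 16
\end{align*}
```

These are operation and table-entry counts up to constant factors; the point is only that the exponent w* + 1 dominates as clusters grow.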


Mini-Clustering - motivation

Time and space complexity of Cluster Tree Elimination depend on the induced width w* of the problem

When the induced width w* is large, CTE becomes infeasible


Mini-Clustering - the basic idea

Try to reduce the size of the cluster (the exponent); partition each cluster into mini-clusters with fewer variables

Accuracy parameter i = maximum number of variables in a mini-cluster

The idea was explored for variable elimination (Mini-Bucket)


Mini-Clustering

Suppose cluster(u) is partitioned into p mini-clusters: mc(1),…,mc(p), each containing at most i variables

TC computes the 'exact' message:

$h_{(u,v)} = \sum_{elim(u,v)} \prod_{k=1}^{p} \prod_{f \in mc(k)} f$

We want to process each $\prod_{f \in mc(k)} f$ separately


Mini-Clustering

$h_{(u,v)} = \sum_{elim(u,v)} \prod_{k=1}^{p} \prod_{f \in mc(k)} f$

Approximate each $\prod_{f \in mc(k)} f$, k=2,…,p and take it outside the summation

How to process the mini-clusters to obtain approximations or bounds:

Process all mini-clusters by summation - this gives an upper bound on the joint probability

A tighter upper bound: process one mini-cluster by summation and the others by maximization

Can also use mean operator (average) - this gives an approximation of the joint probability
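A small numeric check of these processing options, simplified to two nonnegative functions f and g of a single eliminated variable with no separator variables. The values are random placeholders rather than data from the paper, and the variable names are assumptions.

```python
import random

rng = random.Random(1)
f = [rng.random() for _ in range(8)]  # hypothetical mini-cluster product f(x)
g = [rng.random() for _ in range(8)]  # hypothetical mini-cluster product g(x)

exact = sum(fx * gx for fx, gx in zip(f, g))  # sum_x f(x) g(x)
sum_sum = sum(f) * sum(g)                     # both processed by summation
sum_max = sum(f) * max(g)                     # summation for f, maximization for g
sum_mean = sum(f) * (sum(g) / len(g))         # mean operator for g

# For nonnegative f and g: exact <= sum_max <= sum_sum.
# sum_mean is an approximation and may fall on either side of exact.
```

The inequality exact <= sum_max holds because every g(x) is at most max(g), and sum_max <= sum_sum because max(g) is at most the sum of the nonnegative g values.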


Idea of Mini-Clustering

Split a cluster into mini-clusters => bound complexity

$h^X \le g^X$

Exponential complexity decrease: $O(e^n) \rightarrow O(e^r) + O(e^{n-r})$
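To give a sense of scale for the exponential decrease, consider a cluster over n = 20 variables split into two mini-clusters of r = 10 and n - r = 10 variables. The numbers below assume binary variables and take 2 as the base of the exponent; both are illustrative assumptions:

```latex
\begin{align*}
\text{one cluster:}       \quad & 2^{n} = 2^{20} = 1{,}048{,}576 \\
\text{two mini-clusters:} \quad & 2^{r} + 2^{\,n-r} = 2^{10} + 2^{10} = 2{,}048
\end{align*}
```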


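Mini-Clustering - example

[Figure: cluster tree ABC -BC- BCDF -BF- BEF -EF- EFG, with clusters numbered 1 to 4, annotated with the mini-clustering messages below]

$H_{(1,2)}$: $h^1_{(1,2)}(b,c) := \sum_a p(a)\, p(b|a)\, p(c|a,b)$

$H_{(2,1)}$: $h^1_{(2,1)}(b) := \sum_{d,f} p(d|b)\, h^1_{(3,2)}(b,f)$ and $h^2_{(2,1)}(c) := \max_{d,f} p(f|c,d)$

$H_{(2,3)}$: $h^1_{(2,3)}(b) := \sum_{c,d} p(d|b)\, h^1_{(1,2)}(b,c)$ and $h^2_{(2,3)}(f) := \max_{c,d} p(f|c,d)$

$H_{(3,2)}$: $h^1_{(3,2)}(b,f) := \sum_e p(e|b,f)\, h^1_{(4,3)}(e,f)$

$H_{(3,4)}$: $h^1_{(3,4)}(e,f) := \sum_b p(e|b,f)\, h^1_{(2,3)}(b)\, h^2_{(2,3)}(f)$

$H_{(4,3)}$: $h^1_{(4,3)}(e,f) := p(G = g_e \mid e,f)$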


Mini-Clustering - the messages, i=3

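Cluster 1 (A B C): p(a), p(b|a), p(c|a,b)

$h^1_{(1,2)}(b,c) = \sum_a p(a)\, p(b|a)\, p(c|a,b)$

Cluster 2, split into two mini-clusters:

Mini-cluster B C D: p(d|b), $h^1_{(1,2)}(b,c)$

Mini-cluster C D F: p(f|c,d)

sep(2,3)={B,F}

elim(2,3)={C,D}

$h^1_{(2,3)}(b) = \sum_{c,d} p(d|b)\, h^1_{(1,2)}(b,c)$

$h^2_{(2,3)}(f) = \max_{c,d} p(f|c,d)$

Cluster 3 (B E F): p(e|b,f), $h^1_{(2,3)}(b)$, $h^2_{(2,3)}(f)$

Cluster 4 (E F G): p(g|e,f)

Separators: BC (between 1 and 2), BF (between 2 and 3), EF (between 3 and 4)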
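The split of cluster 2 into the mini-clusters {B, C, D} and {C, D, F} for i=3 can be reproduced with a simple greedy pass. The slides do not say which partitioning heuristic is used, so the first-fit rule below, the function name `partition`, and the factor labels are assumptions for illustration only.

```python
def partition(functions, i_bound):
    """Greedy first-fit partitioning into mini-clusters.

    `functions` is a list of (name, scope) pairs. Each function goes into the
    first mini-cluster whose combined scope stays within i_bound variables.
    """
    minis = []  # list of (scope_set, [function names])
    for name, scope in functions:
        if len(scope) > i_bound:
            raise ValueError(f"{name} alone exceeds the i-bound")
        for mscope, names in minis:
            if len(mscope | scope) <= i_bound:
                mscope.update(scope)
                names.append(name)
                break
        else:
            minis.append((set(scope), [name]))
    return minis

# Functions in cluster 2 when sending to cluster 3 (h(3,2) is excluded).
cluster2 = [("p(d|b)", {"B", "D"}),
            ("h1_(1,2)(b,c)", {"B", "C"}),
            ("p(f|c,d)", {"C", "D", "F"})]

minis_i3 = partition(cluster2, 3)  # two mini-clusters: {B,C,D} and {C,D,F}
minis_i4 = partition(cluster2, 4)  # one mini-cluster: the exact CTE cluster
```

Raising the i-bound to the full cluster size recovers the exact computation, which is what makes the i-bound a trade-off between accuracy and cost.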


Cluster Tree Elimination vs. Mini-Clustering

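[Figure: the same cluster tree ABC -BC- BCDF -BF- BEF -EF- EFG, with clusters numbered 1 to 4, shown twice: once annotated with the CTE messages and once with the MC messages]

Messages sent along each directed edge:

1 to 2: CTE sends $h_{(1,2)}(b,c)$; MC sends $H_{(1,2)} = \{h^1_{(1,2)}(b,c)\}$

2 to 1: CTE sends $h_{(2,1)}(b,c)$; MC sends $H_{(2,1)} = \{h^1_{(2,1)}(b),\ h^2_{(2,1)}(c)\}$

2 to 3: CTE sends $h_{(2,3)}(b,f)$; MC sends $H_{(2,3)} = \{h^1_{(2,3)}(b),\ h^2_{(2,3)}(f)\}$

3 to 2: CTE sends $h_{(3,2)}(b,f)$; MC sends $H_{(3,2)} = \{h^1_{(3,2)}(b,f)\}$

3 to 4: CTE sends $h_{(3,4)}(e,f)$; MC sends $H_{(3,4)} = \{h^1_{(3,4)}(e,f)\}$

4 to 3: CTE sends $h_{(4,3)}(e,f)$; MC sends $H_{(4,3)} = \{h^1_{(4,3)}(e,f)\}$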


Mini-Clustering

Correctness and completeness: Algorithm MC(i) computes a bound (or an approximation) on the joint probability P(Xi,e) of each variable and each of its values.

Time & space complexity: $O(n \cdot hw^* \cdot d^{i})$

where $hw^* = \max_u |\{f \mid scope(f) \cap \chi(u) \neq \emptyset\}|$


Normalization

Algorithms for the belief updating problem compute, in general, the joint probability:

$P(X_i, e)$, where $X_i$ is the query node and $e$ is the evidence

Computing the conditional probability:

$P(X_i \mid e)$, where $X_i$ is the query node and $e$ is the evidence

is easy to do if exact algorithms can be applied

becomes an important issue for approximate algorithms


Normalization

MC can compute an (upper) bound $\overline{P}(X_i, e)$ on the joint $P(X_i, e)$

Deriving a bound on the conditional $P(X_i \mid e)$ is not easy when the exact $P(e)$ is not available

If a lower bound $\underline{P}(e)$ were available, we could use $\overline{P}(X_i, e) / \underline{P}(e)$ as an upper bound on the posterior

In our experiments we normalized the results and regarded them as approximations of the posterior $P(X_i \mid e)$
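The normalization step used in the experiments amounts to dividing each estimate of P(X_i = x, e) by the sum over all values x. A minimal sketch, where the function name and the input numbers are hypothetical:

```python
def normalize(joint_estimates):
    """Turn estimates of P(X_i = x, e) into an approximate P(X_i = x | e)."""
    total = sum(joint_estimates.values())
    if total <= 0:
        raise ValueError("estimates must have positive total mass")
    return {x: p / total for x, p in joint_estimates.items()}

# Hypothetical MC outputs for the two values of a binary variable X_i.
approx_posterior = normalize({0: 0.03, 1: 0.09})  # roughly {0: 0.25, 1: 0.75}
```

Note that after normalization the result is no longer a bound, even if the inputs were upper bounds on the joint; it is only an approximation of the posterior.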


Experimental results

Algorithms:

Exact

IBP

Gibbs sampling (GS)

MC with normalization (approximate)

Networks (all variables are binary):

Coding networks

CPCS 54, 360, 422

Grid networks (MxM)

Random noisy-OR networks

Random networks

We tested MC with max and mean operators

Measures:

Normalized Hamming Distance (NHD)

BER (Bit Error Rate)

Absolute error

Relative error

Time


Random networks - Absolute error

[Figure: two plots of absolute error against the i-bound (0 to 10) comparing MC, Gibbs Sampling and IBP on random networks, N=50, P=2, k=2, w*=10, 50 instances. Left panel: evid=0. Right panel: evid=10]


Coding networks - Bit Error Rate

[Figure: two plots of Bit Error Rate against the i-bound (0 to 12) comparing MC and IBP on coding networks, N=100, P=4, w*=12, 50 instances. Panels: sigma=.22 and sigma=.51]


Noisy-OR networks - Absolute error

[Figure: two plots of absolute error (log scale, 1e-5 to 1e+0) against the i-bound (0 to 16) comparing MC, IBP and Gibbs Sampling on noisy-OR networks, N=50, P=3, w*=16, 25 instances. Left panel: evid=10. Right panel: evid=20]


CPCS422 - Absolute error

[Figure: two plots of absolute error against the i-bound (2 to 18) comparing MC and IBP on CPCS 422, w*=23, 1 instance. Left panel: evid=0. Right panel: evid=10]


Grid 15x15 - 0 evidence

[Figure: four plots against the i-bound (0 to 18) comparing MC and IBP on Grid 15x15, evid=0, w*=22, 10 instances: NHD, absolute error, relative error, and time (seconds)]


Grid 15x15 - 10 evidence

[Figure: four plots against the i-bound (0 to 18) comparing MC and IBP on Grid 15x15, evid=10, w*=22, 10 instances: NHD, absolute error, relative error, and time (seconds)]


Grid 15x15 - 20 evidence

[Figure: four plots against the i-bound (0 to 18) comparing MC, IBP and Gibbs Sampling on Grid 15x15, evid=20, w*=22, 10 instances: NHD, absolute error and relative error (log scale, 0.001 to 1), and time (seconds)]


Conclusion

MC extends the partition-based approximation from mini-buckets to general tree decompositions for the problem of belief updating

Empirical evaluation demonstrates its effectiveness and superiority (for certain types of problems, with respect to the measures considered) relative to other existing algorithms