1 structured region graphs: morphing ep into gbp max welling tom minka yee whye teh

52
1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

Upload: godfrey-foster

Post on 01-Jan-2016

220 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

1

Structured Region Graphs: Morphing EP into GBP

Max Welling

Tom Minka

Yee Whye Teh

Page 2: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

2

GBP and EP

• Approximate inference in large graphical models

• Generalized belief propagation– Minimize Kikuchi free energy

• Expectation propagation– Minimize local KL-divergence

• Require choosing approximation structure– Kikuchi clusters, exponential family

• Need a constructive framework…

[Yedidia, Freeman, Weiss, NIPS 2000]

[Minka, UAI 2001]

Page 3: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

3

Structured Region Graphs

• A general representation for both GBP and EP approximations

• Reveals equivalence between GBP/EP– Can convert between equivalent GBP/EP

algorithms

• Simple tests ensure good performance: non-singularity, R cR = 1, maximality

• A framework for constructing good SRGs for any graphical model

Page 4: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

4

A simple graphical model

5

4 7

3

8

6 9

1

2

),(),(),(),()( 5225411432232112 xxfxxfxxfxxfp x

Want single-variable marginals p(x1), p(x2), …

Page 5: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

5

Belief propagation

1

2

1

41

2 4 93

2

3

8

9

52

)()(

),(

)( 114111

2112

11122 xmsg

xq

xxq

xmsg x

),( 2112 xxq

)( 11 xq

Iterate until all marginals match

Page 6: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

6

Fully-factorized EP

5

4 7

3

8

6 9

1

2

5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2

)(12 xq

)(xq

)(

)(marginals 12

xq

q

Iterate until all marginals match

Equivalent to BP

Page 7: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

7

Generalized belief propagation

4

5

5

525

6

85

5

41

2

5

3 6

2 5 8

6 9

),( 5445 xxq

)( 55 xq

),,,( 54211245 xxxxq

4 7

5 8

Page 8: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

8

Tree-structured EP

5

4 7

3

8

6 9

1

2

)(12 xq

)(xq

Iterate until all pairwise marginals on the tree match

Equivalent to GBP on squares

5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2

)(

)(nalstree.margi 12

xq

q

Page 9: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

9

Common theme

• GBP and EP approximate p(x) in a distributed fashion

• Factors are allocated to local regions

• Each region has a distribution of a specific form, tied together by constraints

• Regions pass messages until they meet the constraints

Page 10: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

10

Approximation choices

1. Number of regions

2. Allocation of factors to regions

3. Number of parameters per region

4. Which regions to constrain

5. What type of constraints

• How can we reason about these choices?

Page 11: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

11

Outline

• Structured region graphs

• Equivalence operators

• Design criteria

• Design examples

Page 12: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

12

Structured Region Graph

• A general representation for GBP and EP approximations

• A DAG of regions, each with a graph structure, and a set of factors

• Graph structure defines the form of qR(xR)• Links define constraints – parent and child

have the same clique-marginals• Extends region graph formalism of

[Yedidia,Freeman,Weiss, 2002]

Page 13: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

13

Structured Region Graph

),,()( 431),,(\ 431

xxxqq Dxxx

RR x

x

Parent must be super-graph of child

qR must match qD on (1,3,4) and (3,4,5):

4

3 51

2 6

4

3 51

R

D

Cliques (1,3,4)(3,4,5)

),,()( 543),,(\ 543

xxxqq Dxxx

RR x

x

Page 14: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

14

GBP region graphs

• All inner regions are complete [Yedida,Freeman,Weiss, 2002]

• Thus qR(xR) is not factorized

4

3 51

2

4

31

2

4

3 5

6

4

3 5

Outer regions (no parents)

Inner regions4

3 5

6

1

2

Originalgraph

Page 15: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

15

EP region graphs

• Only one inner region

• Every region contains all variables

4

3 51

2 4

3 5

66

1

2

4

3 5

6

1

24

3 5

6

1

2

Originalgraph

Page 16: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

16

Free energy

• Each region has counting number

• Free energy:

)(an

1RAAR cc

R x RR

RRRRR

Rxf

xqxqcpqF

)(

)(log)()||(

subject to the parent-child marginal constraints

Applies to both GBP and EP (special cases)

Page 17: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

17

• Parent-child algorithm (for discrete variables):

• D relays this to other parents

• Iterate until all constraints satisfied

• Fixed point of msg passing = critical point of free energy

Generalized EP messages

)(

)(marginals.clique)(

DD

RDDR q

qmsg

xx

4

3 51

2 6

4

3 51

R

D

Page 18: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

18

Outline

• Structured region graphs

• Equivalence operators

• Design criteria

• Design examples

Page 19: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

19

Equivalence operators

• Graphical operators that preserve the critical points of the free energy:1. Region-Drop

2. Region-Merge

3. Region-Split

4. Link-Death

5. Clique-Grow/Shrink

6. Factor-Move

Page 20: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

20

• A region with one parent can be dropped (replaced by direct links)

Region Drop

0c

Page 21: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

21

Region Merge

• Linked regions with the same structure can be merged

Page 22: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

22

• Any region can be split into two regions plus a separator

• Separator must be complete

• Pieces must be super-graphs of children

Region split

4

3 5

6 4

3 5

4

5

64

5

4

3 5

6 4

3 5

4

5

64

5

Page 23: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

23

Equivalence of BP and fully-factorized EP

Page 24: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

24

Fully-factorized EP

5

4 7

3

8

6 9

1

2

5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2

Page 25: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

25

SPLIT

5

4 7

3

8

6 9

1

2

5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2

Page 26: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

26

MERGE

5

4 7

3

8

6 9

1

2

1

2

41

8

9

5

4 7

3

8

6 9

5

7

3

8

6 9

2 5

4 7

3 6

1

2

Page 27: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

27

Belief propagation graph

1

2

1

41

2 4 93

2

3

8

9

52

BP and fully-factorized EP have the same fixed points

Page 28: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

28

Equivalence of GBP-squares and tree-structured EP

Page 29: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

29

Tree-structured EP

5

4 7

3

8

6 9

1

2

5

4 7

3

8

6 9

1

2 5

4 7

3

8

6 9

1

2

Page 30: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

30

SPLIT

5 82

3 6 9

2

4 71

25

4 7

8

1

2

2

5

3

8

6 9

2

24 71

2

3 6 9

2

2

2

Page 31: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

31

MERGE

5 82

5

4 7

8

1

2

3 6 9

2

2

5

3

8

6 9

2

4 71

2

24 71

2

3 6 9

2

2

2

Page 32: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

32

SPLIT

5

4 7

8

1

2

1

2

2

1 4 741 4

5 852 5

2

3

3 6 963 6

5

3

8

6 9

2

5

6

5

3 6

2 5 8

6 9

4

5

4

2 5

1 4 7

5 8

Page 33: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

33

DROP

5 852 5

5

6

5

3 6

2 5 8

6 9

4

5

4

2 5

1 4 7

5 8

Page 34: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

34

GBP-squares region graph

5 852 5

5

6

5

3 6

2 5 8

6 9

4

5

4

2 5

1 4 7

5 8

• The chosen TreeEP region graph has the same fixed points as GBP-squares• Extends to any grid

Page 35: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

35

When does EP reduce to GBP?

• When all variables are discrete, and inner region is triangulated (i.e. approximation family is decomposable)

• E.g. TreeEP always reduces to GBP

• Proof: split all inner regions, starting at the bottom, until only complete regions are left

• But EP is often faster – (10x faster in [Minka & Qi, NIPS 2003])

Page 36: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

36

Outline

• Structured region graphs

• Equivalence operators

• Design criteria

• Design examples

Page 37: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

37

Good region graphs

• Consider 2 extreme cases:– maximally correlated variables (strong factors)– uniform variables (weak factors)

• Want approx to be exact in (at least) these cases [Yedidia,Freeman,Weiss, 2004]

Factor strength:maximal

(deterministic)none

(uniform)

Non-singularSRG exact iff: R cR = 1

Page 38: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

38

Non-singularity

• Def: All fixed points are uniform when the factors are uniform

• Not true for all region graphs

• Equivalent def: No ‘redundant’ regions– create spurious fixed points– analogous to singular matrix

• E.g. all triples in K4 = singular

Page 39: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

39

Simple test for non-singularity

• Non-singularity is preserved by equivalence operators

• Theorem: SRG is non-singular iff reduces to single-variable regions when all factors are removed

Page 40: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

40

Example: Squares graph

5 852 5

5

6

5

3 6

2 5 8

6 9

4

5

4

2 5

1 4 7

5 8

1. Remove factors

Page 41: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

41

Example: Squares graph

5 5 82 5

5

6

4

5

1. Remove factors

2. Split 14

52 5

4

55 8

7

5 85

697

5

6

2 5

Page 42: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

42

Example: Squares graph

5 5 82 5

5

6

4

5

1 7

93

1. Remove factors

2. Split

3. Merge

4. Clique-shrink

1. Remove factors

2. Split

3. Merge

Page 43: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

43

Example: Squares graph

5

1 7

93

1. Remove factors

2. Split

3. Merge

4. Clique-shrink

5. Split & merge

2

4

6

8

The squares graph is non-singular

Page 44: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

44

Example: An extra loop

• Adding any extra loop (and overlap edges) to the squares graph makes it singular

• Squares graph is maximal wrt loops

5

4

3 6

1

2 5 852 5

5

6

5

3 6

2 5 8

6 9

4

5

4

2 5

1 4 7

5 8

+ = Singular

Page 45: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

45

General results

• Every acyclic SRG (no cycles of regions) is non-singular and has R cR = 1

– EP-graphs are acyclic

• If all regions contain at most one loop, then non-singular & R cR = 1 implies maximal wrt loops– E.g. squares graph

• Makes it easy to construct good RGs

Page 46: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

46

Outline

• Structured region graphs

• Equivalence operators

• Design criteria

• Design examples

Page 47: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

47

Region graph design

• Join graph method [Dechter et al, UAI 2002] does not ensure R cR = 1

• Want non-singular, R cR = 1, maximal

• Two approaches:

1. Start with EP-graph and reduce

2. Start with BP-graph and add regions (region pursuit)

Page 48: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

48

Star graph

• Non-singular, R cR = 1, maximal

• Closed under intersection

• Very effective on dense graphs

Originalgraph

Page 49: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

49

Region pursuit

• Start with edge regions only

• Greedily add the most “significant” cluster– changes free energy the most

[Welling, UAI 2004]

• Performs poorly when too many clusters are added

• New twist: Skip clusters which would make the graph singular (tested automatically)

Page 50: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

50

7-node complete graph

Page 51: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

51

Summary

• A general formalism for GBP and EP approximations

• Equivalence operators between SRGs– equivalences between EP and GBP

• Simple tests ensure good performance: non-singularity, R cR = 1, maximality

Page 52: 1 Structured Region Graphs: Morphing EP into GBP Max Welling Tom Minka Yee Whye Teh

52

Future work

• More design principles– strength of actual factors– closed under intersection

• General test for maximality• Generalized EP on continuous variables

[Heskes & Zoeter, AISTATS 2003]