Introduction to Belief Propagation and its Generalizations

Max Welling
Donald Bren School of Information and Computer Sciences
University of California, Irvine



Graphical Models

A ‘marriage’ between probability theory and graph theory.

Why probabilities?
• Reasoning with uncertainties, confidence levels.
• Many processes are inherently ‘noisy’: robustness issues.

Why graphs?
• Provide necessary structure in large models:
  - Designing new probabilistic models.
  - Reading out (conditional) independencies.
• Inference & optimization:
  - Dynamic programming
  - Belief propagation

Types of Graphical Model

Undirected graph (Markov random field):

$$P(\mathbf{x}) = \frac{1}{Z} \prod_i \psi_i(x_i) \prod_{(ij)} \psi_{ij}(x_i, x_j)$$

with local evidence $\psi_i(x_i)$ at each node $i$ and compatibility $\psi_{ij}(x_i, x_j)$ on each edge $(ij)$.

Directed graph (Bayesian network):

$$P(\mathbf{x}) = \prod_i P\big(x_i \mid x_{\mathrm{Parents}(i)}\big)$$
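The MRF factorization above can be checked numerically by brute force on a tiny model. The following sketch uses a hypothetical 3-node binary chain with made-up potentials; all numbers are for illustration only.

```python
import itertools

import numpy as np

# Hypothetical 3-node binary chain MRF: x0 - x1 - x2 (potentials made up).
edges = [(0, 1), (1, 2)]
psi_pair = np.array([[2.0, 1.0],
                     [1.0, 2.0]])       # psi_ij(x_i, x_j): favors agreement
psi_local = [np.array([1.0, 1.0]),       # psi_0: no local evidence
             np.array([0.7, 0.3]),       # psi_1: evidence favoring x1 = 0
             np.array([1.0, 1.0])]       # psi_2: no local evidence

def unnormalized(x):
    """Product of all local and pairwise factors for configuration x."""
    p = 1.0
    for i, psi in enumerate(psi_local):
        p *= psi[x[i]]
    for i, j in edges:
        p *= psi_pair[x[i], x[j]]
    return p

# The partition function Z sums over all 2^3 joint configurations.
configs = list(itertools.product([0, 1], repeat=3))
Z = sum(unnormalized(x) for x in configs)
P = {x: unnormalized(x) / Z for x in configs}

assert abs(sum(P.values()) - 1.0) < 1e-12   # P is a proper distribution
assert P[(0, 0, 0)] > P[(0, 1, 0)]          # agreement + evidence for x1 = 0
```

Enumerating all configurations is exact but exponential in the number of variables, which is precisely why the message-passing algorithms later in these slides matter.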

[Figure: factor-graph representation, with nodes for variables and factors for interactions]

Example 1: Undirected Graph

[Figure: image-labeling example; nodes carry neighborhood information, with high-information and low-information regions, and ambiguous patches labeled "air or water?"]

Undirected Graphs (cont’d)

Nodes encode hidden information (patch identity).

They receive local information from the image (brightness, color).

Information is propagated through the graph over its edges.

Edges encode ‘compatibility’ between nodes.

Example 2: Directed Graphs

[Figure: topic model; topics such as "war", "animals", "computers" generate words such as "Iraqi", "the", "Matlab"]

Why do we need it?
• Answer queries:
  - Given past purchases, in which book genres is a client interested?
  - Given a noisy image, what was the original image?
• Learning probabilistic models from examples (expectation maximization, iterative scaling).
• Optimization problems: min-cut, max-flow, Viterbi, …

Inference in Graphical Models

Example: P(patch = sea | image)?

Inference:
• Answer queries about unobserved random variables, given values of observed random variables.
• More generally: compute their joint posterior distribution $P(u \mid o)$ or the posterior marginals $\{P(u_i \mid o)\}$.

[Figure: learning vs. inference]

Approximate Inference

Inference is computationally intractable for large graphs (with cycles).

Approximate methods:

• Markov chain Monte Carlo sampling.
• Mean field and more structured variational techniques.
• Belief propagation algorithms.

Belief Propagation on trees

Message from node $i$ to node $j$:

$$M_{ij}(x_j) = \sum_{x_i} \psi_{ij}(x_i, x_j)\, \psi_i(x_i) \prod_{k \in N(i) \setminus j} M_{ki}(x_i)$$

with compatibilities (interactions) $\psi_{ij}$ and external evidence $\psi_i$.

Belief (approximate marginal probability):

$$b_i(x_i) \propto \psi_i(x_i) \prod_{k \in N(i)} M_{ki}(x_i)$$
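The message and belief updates can be written in a few lines. The sketch below runs sum-product BP on a hypothetical 3-node chain (a tree) with made-up potentials, and checks against brute-force enumeration that the beliefs match the exact marginals.

```python
import itertools

import numpy as np

# Sum-product BP on a 3-node binary chain (a tree); potentials are made up.
n = 3
edges = [(0, 1), (1, 2)]
psi_pair = np.array([[2.0, 1.0],
                     [1.0, 2.0]])
psi_local = [np.array([0.9, 0.1]),
             np.array([0.5, 0.5]),
             np.array([0.2, 0.8])]

nbrs = {i: set() for i in range(n)}
for i, j in edges:
    nbrs[i].add(j)
    nbrs[j].add(i)

# M[(i, j)] is the message from i to j, a function of x_j; start uniform.
M = {(i, j): np.ones(2) for i in nbrs for j in nbrs[i]}
for _ in range(n):                         # a few sweeps suffice on this tree
    for (i, j) in list(M):
        m = np.zeros(2)
        for xj in range(2):
            for xi in range(2):
                prod = psi_local[i][xi]
                for k in nbrs[i] - {j}:    # all neighbors of i except j
                    prod *= M[(k, i)][xi]
                m[xj] += psi_pair[xi, xj] * prod
        M[(i, j)] = m / m.sum()            # normalize for numerical stability

def belief(i):
    """b_i(x_i) proportional to psi_i(x_i) times incoming messages."""
    b = psi_local[i].copy()
    for k in nbrs[i]:
        b = b * M[(k, i)]
    return b / b.sum()

# On a tree, BP beliefs equal the exact marginals (brute-force check).
joint = np.zeros((2, 2, 2))
for x in itertools.product([0, 1], repeat=3):
    p = np.prod([psi_local[i][x[i]] for i in range(n)])
    for i, j in edges:
        p *= psi_pair[x[i], x[j]]
    joint[x] = p
joint /= joint.sum()
exact = [joint.sum(axis=(1, 2)), joint.sum(axis=(0, 2)), joint.sum(axis=(0, 1))]
for i in range(n):
    assert np.allclose(belief(i), exact[i])
```

The brute-force check costs $O(2^n)$, while the message passing costs only $O(n)$ for fixed state size, which is the entire point of BP on trees.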

Belief Propagation on loopy graphs

The very same message and belief updates are now applied to a graph with cycles and simply iterated until the messages (hopefully) stop changing. Convergence is no longer guaranteed, and the resulting beliefs are only approximate marginal probabilities.
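The same sketch can be run on a small loopy graph to see this concretely. The graph and potentials below are made up; on this weakly coupled 3-node cycle the iteration converges, and the beliefs come out close to, but no longer equal to, the exact marginals.

```python
import itertools

import numpy as np

# Sum-product BP iterated on a 3-node binary cycle (a loopy graph).
n = 3
edges = [(0, 1), (1, 2), (0, 2)]
psi_pair = np.array([[2.0, 1.0],
                     [1.0, 2.0]])
psi_local = [np.array([0.9, 0.1]),
             np.array([0.5, 0.5]),
             np.array([0.2, 0.8])]

nbrs = {i: set() for i in range(n)}
for i, j in edges:
    nbrs[i].add(j)
    nbrs[j].add(i)

M = {(i, j): np.ones(2) / 2 for i in nbrs for j in nbrs[i]}
for it in range(500):
    delta = 0.0
    for (i, j) in list(M):
        m = np.zeros(2)
        for xj in range(2):
            for xi in range(2):
                prod = psi_local[i][xi]
                for k in nbrs[i] - {j}:
                    prod *= M[(k, i)][xi]
                m[xj] += psi_pair[xi, xj] * prod
        m /= m.sum()
        delta = max(delta, np.abs(m - M[(i, j)]).max())
        M[(i, j)] = m
    if delta < 1e-12:                       # messages stopped changing
        break

def belief(i):
    b = psi_local[i].copy()
    for k in nbrs[i]:
        b = b * M[(k, i)]
    return b / b.sum()

# Exact marginals by brute force over the 8 configurations.
joint = np.zeros((2, 2, 2))
for x in itertools.product([0, 1], repeat=3):
    p = np.prod([psi_local[i][x[i]] for i in range(n)])
    for i, j in edges:
        p *= psi_pair[x[i], x[j]]
    joint[x] = p
joint /= joint.sum()
exact = [joint.sum(axis=(1, 2)), joint.sum(axis=(0, 2)), joint.sum(axis=(0, 1))]

# Beliefs are approximate on a loopy graph: small but nonzero error.
err = max(np.abs(belief(i) - exact[i]).max() for i in range(n))
assert err < 0.1
```

With stronger couplings the iteration can oscillate or converge to a poor fixed point, which motivates the convergent and generalized variants listed later.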

Some facts about BP

• BP is exact on trees.

• If BP converges, it has reached a local minimum of an objective function (the Bethe free energy; Yedidia et al. ’00, Heskes ’02), which is often a good approximation.

• If it converges, convergence is fast near the fixed point.

• Many exciting applications:
  - error-correcting decoding (MacKay, Yedidia, McEliece, Frey)
  - vision (Freeman, Weiss)
  - bioinformatics (Weiss)
  - constraint satisfaction problems (Dechter)
  - game theory (Kearns)
  - …

BP Related Algorithms

• Convergent alternatives (Welling & Teh ’02, Yuille ’02, Heskes ’03)

• Expectation Propagation (Minka ’01)

• Convex alternatives (Wainwright ’02, Wiegerinck & Heskes ’02)

• Linear Response Propagation (Welling & Teh ’02)

• Generalized Belief Propagation (Yedidia, Freeman & Weiss ’01)

• Survey Propagation (Braunstein, Mezard, Weigt & Zecchina ’03)

Generalized Belief Propagation

$$M_{ij}(x_j) = \sum_{x_i} \psi_{ij}(x_i, x_j)\, \psi_i(x_i) \prod_{k \in N(i) \setminus j} M_{ki}(x_i)$$

Idea: To guess the distribution of one of your neighbors, you ask your other neighbors to guess your distribution. Opinions get combined multiplicatively.

[Figure: the same update drawn for BP (messages between single nodes) and for GBP (messages between clusters of nodes)]

Marginal Consistency

Patch marginals $P_A(x_A)$ and $P_B(x_B)$ must agree on their overlap:

$$\sum_{x_{A \setminus B}} P_A(x_A) \;=\; P_{A \cap B}(x_{A \cap B}) \;=\; \sum_{x_{B \setminus A}} P_B(x_B)$$

Solve the inference problem separately on each “patch”, then stitch the solutions together using “marginal consistency”.
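Marginal consistency is easy to verify on a toy example. The sketch below takes an arbitrary joint over three binary variables, forms the patch marginals $P_A(x_0, x_1)$ and $P_B(x_1, x_2)$, and checks that both collapse to the same distribution over the shared variable $x_1$.

```python
import numpy as np

# An arbitrary normalized joint P(x0, x1, x2) over three binary variables.
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2))
joint /= joint.sum()

P_A = joint.sum(axis=2)            # patch marginal P_A(x0, x1)
P_B = joint.sum(axis=0)            # patch marginal P_B(x1, x2)

overlap_from_A = P_A.sum(axis=0)   # sum out x0  ->  P(x1)
overlap_from_B = P_B.sum(axis=1)   # sum out x2  ->  P(x1)
assert np.allclose(overlap_from_A, overlap_from_B)
```

When the patch distributions are exact marginals of one joint, consistency holds automatically; in GBP the patches are solved approximately and this constraint is what stitches them together.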

Region Graphs (Yedidia, Freeman, Weiss ’02)

[Figure: a region graph; the outer regions carry counting number C = 1, and the subregions carry counting numbers C determined by their overlaps]

Region: collection of interactions & variables.

Counting numbers satisfy

$$c_R = 1 - \sum_{A \in \mathrm{Anc}(R)} c_A$$

where $\mathrm{Anc}(R)$ denotes the ancestors of region $R$.
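The counting-number recursion can be sketched directly. The region hierarchy below is made up for illustration: three outer regions, two pairwise overlaps, and one variable region, with all names hypothetical.

```python
# Counting numbers c_R = 1 - sum of c_A over ancestor regions Anc(R),
# on a made-up region hierarchy (all region names hypothetical).
ancestors = {
    "AB": set(),             # outer regions have no ancestors, so c = 1
    "BC": set(),
    "AC": set(),
    "B":  {"AB", "BC"},      # overlap of regions AB and BC
    "C":  {"BC", "AC"},      # overlap of regions BC and AC
    "b":  {"AB", "BC", "B"}  # a variable contained in AB, BC and B
}

# Iterating in insertion order guarantees every ancestor's counting
# number is computed before its descendants'.
c = {}
for region, anc in ancestors.items():
    c[region] = 1 - sum(c[a] for a in anc)

# Each variable is counted exactly once across the regions containing it.
assert c["AB"] + c["BC"] + c["B"] + c["b"] == 1
```

Here the outer regions get c = 1, the overlaps get c = -1, and the innermost region gets c = 0, so nothing is double-counted when local solutions are stitched together.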

Stitching together solutions on local clusters by enforcing “marginal consistency” on their intersections.