# MRFs UC Berkeley

Post on 08-Apr-2018


8/7/2019 MRFs UC Berkeley


Prof. Trevor Darrell

trevor@eecs.berkeley.edu

Lecture 20: Markov Random Fields


Slides from Yuri Boykov and others, as noted.


Suppose we have a system of five variables; some are observed and some are not. There's some probabilistic relationship between the 5 variables, described by their joint probability P(x1, x2, x3, x4, x5).

If we want to find out what the likely state of variable x1 is (say, the position of the hand of some person we are observing), what can we do?

Two reasonable choices are: (a) find the value of x1 (and of all the other variables) that maximizes the joint probability; this gives the MAP solution.

Or (b) marginalize over all the other variables and then take the mean or the maximum of the marginal over x1. Marginalizing then taking the mean is equivalent to finding the MMSE solution. Marginalizing, then taking the max, is called the max marginal solution and is sometimes a useful thing to do.


To marginalize, we must sum the joint probability over all the other variables:

P(x1) = Σ_{x2, x3, x4, x5} P(x1, x2, x3, x4, x5)

If the system really is high dimensional, that will quickly become intractable. But if there is some modularity in P(x1, x2, x3, x4, x5) then things become tractable again.

Suppose the variables form a Markov chain: x1 causes x2, which causes x3, etc.

x1 → x2 → x3 → x4 → x5
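The tractability gain from the chain structure can be sketched numerically. This is a minimal sketch with made-up probabilities for 5 binary variables; it only illustrates pushing the sums inward, one variable at a time, instead of summing over all joint configurations:

```python
import itertools

# Made-up chain: P(x1..x5) = P(x1) P(x2|x1) P(x3|x2) P(x4|x3) P(x5|x4),
# all variables binary, with one shared transition table for simplicity.
p1 = {0: 0.6, 1: 0.4}                     # P(x1)
cond = {(0, 0): 0.7, (0, 1): 0.3,         # P(x_{i+1} = b | x_i = a)
        (1, 0): 0.2, (1, 1): 0.8}

def joint(xs):
    p = p1[xs[0]]
    for a, b in zip(xs, xs[1:]):
        p *= cond[(a, b)]
    return p

# Brute-force marginal P(x5): sum the joint over all 2^4 settings of x1..x4.
# Cost grows exponentially with the number of summed-out variables.
brute = {x5: sum(joint(rest + (x5,))
                 for rest in itertools.product([0, 1], repeat=4))
         for x5 in [0, 1]}

# Chain structure: push each sum inward so every step sums one variable only.
msg = dict(p1)
for _ in range(4):
    msg = {b: sum(msg[a] * cond[(a, b)] for a in [0, 1]) for b in [0, 1]}

print(brute, msg)  # identical marginals, linear vs. exponential cost
```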


Directed graphical models

A directed, acyclic graph.

Nodes are random variables. Can be scalars or vectors, continuous or discrete.

The direction of the edge tells the parent-child relation:

parent → child

Each node i has a conditional probability given its parent nodes, P(xi | x_pa(i)). The joint probability is the product of those conditional probabilities:

P(x1, ..., xn) = Π_{i=1}^{n} P(xi | x_pa(i))
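The product-of-conditionals rule can be sketched on a tiny made-up DAG (a → b, a → c, all binary); the node names and table entries below are invented for illustration:

```python
# Hypothetical 3-node DAG: a -> b, a -> c, with made-up conditional tables.
parents = {"a": [], "b": ["a"], "c": ["a"]}
cpt = {
    "a": {(): {0: 0.3, 1: 0.7}},                             # P(a)
    "b": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.4, 1: 0.6}},   # P(b|a)
    "c": {(0,): {0: 0.5, 1: 0.5}, (1,): {0: 0.2, 1: 0.8}},   # P(c|a)
}

def joint(assign):
    """Product of each node's conditional given its parents' values."""
    p = 1.0
    for node, pa in parents.items():
        pa_vals = tuple(assign[q] for q in pa)
        p *= cpt[node][pa_vals][assign[node]]
    return p

# Because each conditional table is normalized, the joint sums to 1
# over all assignments -- no extra normalizing constant is needed.
total = sum(joint({"a": a, "b": b, "c": c})
            for a in [0, 1] for b in [0, 1] for c in [0, 1])
print(total)
```

Note the contrast with the undirected case below, where the factors are unnormalized compatibilities and a normalizing constant must be computed.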


Another modular probabilistic structure, more common in vision problems, is an undirected graph:

x1 — x2 — x3 — x4 — x5

The joint probability for this graph is given by:

P(x1, x2, x3, x4, x5) = Φ(x1, x2) Φ(x2, x3) Φ(x3, x4) Φ(x4, x5)

where Φ(x1, x2) is called a compatibility function. We can define compatibility functions that give the same joint probability as for the directed graph described in the previous slides; for that example, we could use either form.
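A minimal sketch of this pairwise factorization, with made-up compatibility values: unlike conditional probability tables, the Φ factors need not sum to one, so the product must be divided by a normalizing constant Z to become a probability.

```python
import itertools

# Made-up pairwise compatibility Phi(a, b), shared by all four chain edges.
phi = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 3.0}

def unnorm(xs):
    """Unnormalized product of compatibilities along the 5-node chain."""
    p = 1.0
    for a, b in zip(xs, xs[1:]):
        p *= phi[(a, b)]
    return p

# Z sums the unnormalized product over every joint configuration.
Z = sum(unnorm(xs) for xs in itertools.product([0, 1], repeat=5))

def P(xs):
    return unnorm(xs) / Z

total = sum(P(xs) for xs in itertools.product([0, 1], repeat=5))
print(total)  # sums to 1 only after dividing by Z
```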


Undirected graphical models

A set of nodes joined by undirected edges.

The graph makes conditional independencies explicit: If two nodes are

not linked, and we condition on every other node in the graph, then

those two nodes are conditionally independent.

Conditionally independent, because they are not connected by a line in the undirected graphical model.


Undirected graphical models: cliques

Clique: a fully connected set of nodes

(Figure: a clique and a set of nodes that is not a clique.)

A maximal clique is a clique that can't include more nodes of the graph without losing the clique property.

(Figure: a maximal clique and a non-maximal clique.)
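The two definitions can be sketched directly as set tests; the 4-node graph below (a triangle 1-2-3 plus edge 3-4) is made up for illustration:

```python
from itertools import combinations

# Hypothetical graph: triangle 1-2-3 plus the edge 3-4.
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
nodes = {1, 2, 3, 4}

def is_clique(s):
    """A clique: every pair of nodes in s is joined by an edge."""
    return all(frozenset(p) in edges for p in combinations(s, 2))

def is_maximal(s):
    """A maximal clique: no further node can be added without breaking it."""
    return is_clique(s) and not any(is_clique(s | {v}) for v in nodes - s)

print(is_clique({1, 2, 3}), is_maximal({1, 2, 3}))  # True True
print(is_clique({2, 3}), is_maximal({2, 3}))        # True False: extends to {1,2,3}
print(is_clique({1, 4}))                            # False: no edge 1-4
```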


Undirected graphical models: Hammersley-Clifford theorem

The Hammersley-Clifford theorem addresses the pdf factorization implied by a graph: a distribution has the Markov structure implied by an undirected graph iff it can be represented in the factored form

P(x) = (1/Z) Π_{c ∈ C} Φ_c(x_c)

where Z is a normalizing constant, C is the set of maximal cliques, the Φ_c are potential functions, and x_c are the states of the variables in maximal clique c.

So for this graph, the factorization is ambiguous (something called factor graphs makes it explicit).


Markov Random Fields

Allows rich probabilistic models for images.

But built in a local, modular way. Learn local relationships, get global effects out.


MRF nodes as pixels

Winkler, 1995, p. 32


MRF nodes as image patches

(Figure: image patches; Φ(xi, yi) relates a scene node to the image, and Ψ(xi, xj) relates neighboring scene nodes.)


Network joint probability

P(x, y) = (1/Z) Π_{i,j} Ψ(xi, xj) Π_i Φ(xi, yi)

where x is the scene and y is the image; Ψ(xi, xj) is the scene-scene compatibility function between neighboring scene nodes, and Φ(xi, yi) is the image-scene compatibility function between local observations and scene nodes.
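This joint can be sketched on a tiny made-up example: a 2×2 grid of binary scene nodes with a fixed observed image, smoothness potentials Ψ that favor equal neighbors, and data potentials Φ that favor agreeing with the observation. All numbers below are invented for illustration:

```python
import itertools

# Made-up potentials: Psi rewards equal neighboring scene values,
# Phi rewards a scene value matching its observation.
psi = {(0, 0): 2.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}   # scene-scene
phi = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}   # image-scene
pairs = [(0, 1), (2, 3), (0, 2), (1, 3)]   # 4-neighbor edges on a 2x2 grid
y = (0, 0, 1, 1)                           # fixed observed image

def score(x):
    """Unnormalized P(x, y): product of all pairwise and unary factors."""
    s = 1.0
    for i, j in pairs:
        s *= psi[(x[i], x[j])]
    for xi, yi in zip(x, y):
        s *= phi[(xi, yi)]
    return s

Z = sum(score(x) for x in itertools.product([0, 1], repeat=4))
best = max(itertools.product([0, 1], repeat=4), key=score)
print(best, score(best) / Z)  # a MAP scene under these toy potentials
```

With these particular numbers the smoothness term is strong enough that an all-equal scene beats the scene that matches the observation exactly, which is the usual smoothing-versus-data tension in MRF models.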


In order to use MRFs:

Given observations y, and the parameters of the MRF, how do we infer the hidden variables x?

How do we learn the parameters of the MRF?


Outline of MRF section

Inference in MRFs:

- Gibbs sampling, simulated annealing
- Iterated conditional modes (ICM)
- Variational methods
- Belief propagation
- Graph cuts

Vision applications of inference in MRFs.

Learning MRF parameters:

- Iterative proportional fitting (IPF)




Derivation of belief propagation

(Figure: a chain of hidden nodes x1, x2, x3, each with an observed node y1, y2, y3; the compatibilities Φ(x1, y1), Φ(x2, y2), Φ(x3, y3) link observations to hidden nodes, and Ψ(x1, x2), Ψ(x2, x3) link neighboring hidden nodes.)

x1_MMSE = mean_{x1} Σ_{x2} Σ_{x3} P(x1, x2, x3 | y1, y2, y3)


The posterior factorizes

x1_MMSE = mean_{x1} Σ_{x2} Σ_{x3} P(x1, x2, x3 | y1, y2, y3)

= mean_{x1} Σ_{x2} Σ_{x3} Φ(x1, y1) Φ(x2, y2) Φ(x3, y3) Ψ(x1, x2) Ψ(x2, x3)

= mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)


Propagation rules

x1_MMSE = mean_{x1} Σ_{x2} Σ_{x3} P(x1, x2, x3 | y1, y2, y3)

= mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)


Propagation rules

x1_MMSE = mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)

The innermost sum, M_2^3(x2) = Σ_{x3} Φ(x3, y3) Ψ(x2, x3), is the message from node 3 to node 2; folding it in gives the message from node 2 to node 1:

M_1^2(x1) = Σ_{x2} Ψ(x1, x2) Φ(x2, y2) M_2^3(x2)




Belief propagation: the nosey neighbor rule

"Given everything that I know, here's what I think you should think."

(Given the probabilities of my being in different states, and how my states relate to your states, here's what I think the probabilities of your states should be.)


Belief propagation messages

A message: can be thought of as a set of weights on each of your possible states.

To send a message: Multiply together all the incoming messages, except from the node you're sending to, multiply by the compatibility function, and sum over the sender's states:

M_i^j(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_j^k(x_j)


To find a node's beliefs: Multiply together all the messages coming in to that node:

b_j(x_j) = Π_{k ∈ N(j)} M_j^k(x_j)


b_j(x_j) = Π_{k ∈ N(j)} M_j^k(x_j)

M_i^j(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_j^k(x_j)

(Figure: pictorial form of the belief and message-update equations.)
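The message and belief equations can be sketched end to end on the 3-node chain from the derivation. The potentials below are made-up; on a chain, the belief computed this way should equal the exact posterior marginal:

```python
# Made-up potentials for the chain x1 - x2 - x3 with fixed observations y:
# Psi(xi, xj) favors equal neighbors; phi[i] plays the role of Phi(xi, yi).
states = [0, 1]
psi = {(0, 0): 1.0, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 1.0}
phi = [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}, {0: 0.2, 1: 0.8}]

def send(src_phi, incoming):
    """Message update: sum over the sender's states of Psi * evidence * incoming."""
    return {a: sum(psi[(a, b)] * src_phi[b] * incoming[b] for b in states)
            for a in states}

ones = {0: 1.0, 1: 1.0}        # leaf nodes start with a uniform message
M2 = send(phi[2], ones)        # message from node 3 to node 2
M1 = send(phi[1], M2)          # message from node 2 to node 1

# Belief at node 1: local evidence times all incoming messages, normalized.
b1 = {a: phi[0][a] * M1[a] for a in states}
z = sum(b1.values())
b1 = {a: v / z for a, v in b1.items()}

# Brute-force check against the exact posterior marginal of x1.
def joint(x):
    return (phi[0][x[0]] * phi[1][x[1]] * phi[2][x[2]]
            * psi[(x[0], x[1])] * psi[(x[1], x[2])])

marg = {a: sum(joint((a, b, c)) for b in states for c in states) for a in states}
z = sum(marg.values())
marg = {a: v / z for a, v in marg.items()}

print(b1, marg)  # on a chain, the BP belief equals the exact marginal
```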


Optimal solution in a chain or tree: Belief Propagation

"Do the right thing" Bayesian algorithm.

For Gaussian random variables over time: Kalman filter.

For hidden Markov models: forward/backward algorithm (and its MAP variant is Viterbi).


No factorization with loops!

x1_MMSE = mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3) Ψ(x1, x3)

(Figure: hidden nodes x1, x2, x3 connected in a loop, with observations y2, y3 shown.)

The extra factor Ψ(x1, x3) couples the sum over x3 to x1, so the sums no longer factor into independent messages.


Justification for running belief propagation

Experimental results:

- Error-correcting codes: Kschischang and Frey, 1998; McEliece et al., 1998
- Vision applications: Freeman and Pasztor, 1999; Frey, 2000

Theoretical results:

- For Gaussian processes, means are correct: Weiss and Freeman, 1999
- Large neighborhood local maximum for MAP: Weiss and Freeman, 2000
- Equivalent to Bethe approx. in statistical physics: Yedidia, Freeman, and Weiss, 2000
- Tree-weighted reparameterization: Wainwright, Willsky, Jaakkola, 2001


Results from Bethe free energy analysis

Fixed point of belief propagation equations iff Bethe approximation stationary point.

Belief propagation always has a fixed point.

Connection with variational methods for inference: both minimize approximations to the Free Energy:

- variational: usually use primal variables.
- belief propagation: fixed-point equations for dual variables.

Kikuchi approximations lead to improved belief propagation algorithms.

Yu