# MRFs UC Berkeley

Post on 08-Apr-2018


8/7/2019 MRFs UC Berkeley


Prof. Trevor Darrell

trevor@eecs.berkeley.edu

Lecture 20: Markov Random Fields


Slides from Yuri Boykov and others, as noted.


Suppose we have a system of five variables; some are observed and some are not. There's some probabilistic relationship between the 5 variables, described by their joint probability P(x1, x2, x3, x4, x5).

If we want to find out what the likely state of variable x1 is (say, the position of the hand of some person we are observing), what can we do?

Two reasonable choices are: (a) find the value of x1 (and of all the other variables) that maximizes the joint probability; this gives the MAP solution.

Or (b) marginalize over all the other variables and then take the mean or the maximum of the marginal over x1. Marginalizing then taking the mean is equivalent to finding the MMSE solution. Marginalizing, then taking the max, is called the max marginal solution and is sometimes a useful thing to do.


To marginalize, we must sum the joint probability over all the other variables:

P(x1) = Σ_{x2, x3, x4, x5} P(x1, x2, x3, x4, x5)

If the system really is high dimensional, that will quickly become intractable. But if there is some modularity in P(x1, x2, x3, x4, x5) then things become tractable again.

Suppose the variables form a Markov chain: x1 causes x2, which causes x3, etc.

x1 → x2 → x3 → x4 → x5
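The tractability gain from the chain structure can be sketched numerically. This is a minimal sketch with made-up probabilities for 5 binary variables; it only illustrates pushing the sums inward, one variable at a time, instead of summing over all joint configurations:

```python
import itertools

# Made-up chain: P(x1..x5) = P(x1) P(x2|x1) P(x3|x2) P(x4|x3) P(x5|x4),
# all variables binary, with one shared transition table for simplicity.
p1 = {0: 0.6, 1: 0.4}                     # P(x1)
cond = {(0, 0): 0.7, (0, 1): 0.3,         # P(x_{i+1} = b | x_i = a)
        (1, 0): 0.2, (1, 1): 0.8}

def joint(xs):
    p = p1[xs[0]]
    for a, b in zip(xs, xs[1:]):
        p *= cond[(a, b)]
    return p

# Brute-force marginal P(x5): sum the joint over all 2^4 settings of x1..x4.
# Cost grows exponentially with the number of summed-out variables.
brute = {x5: sum(joint(rest + (x5,))
                 for rest in itertools.product([0, 1], repeat=4))
         for x5 in [0, 1]}

# Chain structure: push each sum inward so every step sums one variable only.
msg = dict(p1)
for _ in range(4):
    msg = {b: sum(msg[a] * cond[(a, b)] for a in [0, 1]) for b in [0, 1]}

print(brute, msg)  # identical marginals, linear vs. exponential cost
```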


Directed graphical models

A directed, acyclic graph.

Nodes are random variables. Can be scalars or vectors, continuous or discrete.

The direction of the edge tells the parent-child relation:

parent → child

Each node i has a conditional probability given its parent nodes, P(xi | x_pa(i)). The joint probability is the product of those conditional probabilities:

P(x1, ..., xn) = Π_{i=1}^{n} P(xi | x_pa(i))
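The product-of-conditionals rule can be sketched on a tiny made-up DAG (a → b, a → c, all binary); the node names and table entries below are invented for illustration:

```python
# Hypothetical 3-node DAG: a -> b, a -> c, with made-up conditional tables.
parents = {"a": [], "b": ["a"], "c": ["a"]}
cpt = {
    "a": {(): {0: 0.3, 1: 0.7}},                             # P(a)
    "b": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.4, 1: 0.6}},   # P(b|a)
    "c": {(0,): {0: 0.5, 1: 0.5}, (1,): {0: 0.2, 1: 0.8}},   # P(c|a)
}

def joint(assign):
    """Product of each node's conditional given its parents' values."""
    p = 1.0
    for node, pa in parents.items():
        pa_vals = tuple(assign[q] for q in pa)
        p *= cpt[node][pa_vals][assign[node]]
    return p

# Because each conditional table is normalized, the joint sums to 1
# over all assignments -- no extra normalizing constant is needed.
total = sum(joint({"a": a, "b": b, "c": c})
            for a in [0, 1] for b in [0, 1] for c in [0, 1])
print(total)
```

Note the contrast with the undirected case below, where the factors are unnormalized compatibilities and a normalizing constant must be computed.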


Another modular probabilistic structure, more common in vision problems, is an undirected graph:

x1 — x2 — x3 — x4 — x5

The joint probability for this graph is given by:

P(x1, x2, x3, x4, x5) = Φ(x1, x2) Φ(x2, x3) Φ(x3, x4) Φ(x4, x5)

where Φ(x1, x2) is called a compatibility function. We can define compatibility functions that give the same joint probability as for the directed graph described in the previous slides; for that example, we could use either form.
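A minimal sketch of this pairwise factorization, with made-up compatibility values: unlike conditional probability tables, the Φ factors need not sum to one, so the product must be divided by a normalizing constant Z to become a probability.

```python
import itertools

# Made-up pairwise compatibility Phi(a, b), shared by all four chain edges.
phi = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 3.0}

def unnorm(xs):
    """Unnormalized product of compatibilities along the 5-node chain."""
    p = 1.0
    for a, b in zip(xs, xs[1:]):
        p *= phi[(a, b)]
    return p

# Z sums the unnormalized product over every joint configuration.
Z = sum(unnorm(xs) for xs in itertools.product([0, 1], repeat=5))

def P(xs):
    return unnorm(xs) / Z

total = sum(P(xs) for xs in itertools.product([0, 1], repeat=5))
print(total)  # sums to 1 only after dividing by Z
```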


Undirected graphical models

A set of nodes joined by undirected edges.

The graph makes conditional independencies explicit: If two nodes are

not linked, and we condition on every other node in the graph, then

those two nodes are conditionally independent.

Conditionally independent, because they are not connected by a line in the undirected graphical model.


Undirected graphical models: cliques

Clique: a fully connected set of nodes

(Figure: a clique and a set of nodes that is not a clique.)

A maximal clique is a clique that can't include more nodes of the graph without losing the clique property.

(Figure: a maximal clique and a non-maximal clique.)
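The two definitions can be sketched directly as set tests; the 4-node graph below (a triangle 1-2-3 plus edge 3-4) is made up for illustration:

```python
from itertools import combinations

# Hypothetical graph: triangle 1-2-3 plus the edge 3-4.
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
nodes = {1, 2, 3, 4}

def is_clique(s):
    """A clique: every pair of nodes in s is joined by an edge."""
    return all(frozenset(p) in edges for p in combinations(s, 2))

def is_maximal(s):
    """A maximal clique: no further node can be added without breaking it."""
    return is_clique(s) and not any(is_clique(s | {v}) for v in nodes - s)

print(is_clique({1, 2, 3}), is_maximal({1, 2, 3}))  # True True
print(is_clique({2, 3}), is_maximal({2, 3}))        # True False: extends to {1,2,3}
print(is_clique({1, 4}))                            # False: no edge 1-4
```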


Undirected graphical models: Hammersley-Clifford theorem

The Hammersley-Clifford theorem addresses the pdf factorization implied by a graph: a distribution has the Markov structure implied by an undirected graph iff it can be represented in the factored form

P(x) = (1/Z) Π_{c ∈ C} Φ_c(x_c)

where Z is a normalizing constant, C is the set of maximal cliques, the Φ_c are potential functions, and x_c are the states of the variables in maximal clique c.

So for this graph, the factorization is ambiguous (something called factor graphs makes it explicit).


Markov Random Fields

Allows rich probabilistic models for images.

But built in a local, modular way. Learn local relationships, get global effects out.


MRF nodes as pixels

Winkler, 1995, p. 32


MRF nodes as image patches

(Figure: image patches; Φ(xi, yi) relates a scene node to the image, and Ψ(xi, xj) relates neighboring scene nodes.)


Network joint probability

P(x, y) = (1/Z) Π_{i,j} Ψ(xi, xj) Π_i Φ(xi, yi)

where x is the scene and y is the image; Ψ(xi, xj) is the scene-scene compatibility function between neighboring scene nodes, and Φ(xi, yi) is the image-scene compatibility function between local observations and scene nodes.
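This joint can be sketched on a tiny made-up example: a 2×2 grid of binary scene nodes with a fixed observed image, smoothness potentials Ψ that favor equal neighbors, and data potentials Φ that favor agreeing with the observation. All numbers below are invented for illustration:

```python
import itertools

# Made-up potentials: Psi rewards equal neighboring scene values,
# Phi rewards a scene value matching its observation.
psi = {(0, 0): 2.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}   # scene-scene
phi = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}   # image-scene
pairs = [(0, 1), (2, 3), (0, 2), (1, 3)]   # 4-neighbor edges on a 2x2 grid
y = (0, 0, 1, 1)                           # fixed observed image

def score(x):
    """Unnormalized P(x, y): product of all pairwise and unary factors."""
    s = 1.0
    for i, j in pairs:
        s *= psi[(x[i], x[j])]
    for xi, yi in zip(x, y):
        s *= phi[(xi, yi)]
    return s

Z = sum(score(x) for x in itertools.product([0, 1], repeat=4))
best = max(itertools.product([0, 1], repeat=4), key=score)
print(best, score(best) / Z)  # a MAP scene under these toy potentials
```

With these particular numbers the smoothness term is strong enough that an all-equal scene beats the scene that matches the observation exactly, which is the usual smoothing-versus-data tension in MRF models.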


In order to use MRFs:

Given observations y, and the parameters of the MRF, how do we infer the hidden variables x?

How do we learn the parameters of the MRF?


Outline of MRF section

Inference in MRFs:

- Gibbs sampling, simulated annealing
- Iterated conditional modes (ICM)
- Variational methods
- Belief propagation
- Graph cuts

Vision applications of inference in MRFs.

Learning MRF parameters:

- Iterative proportional fitting (IPF)




Derivation of belief propagation

(Figure: a chain of hidden nodes x1, x2, x3, each with an observed node y1, y2, y3; the compatibilities Φ(x1, y1), Φ(x2, y2), Φ(x3, y3) link observations to hidden nodes, and Ψ(x1, x2), Ψ(x2, x3) link neighboring hidden nodes.)

x1_MMSE = mean_{x1} Σ_{x2} Σ_{x3} P(x1, x2, x3 | y1, y2, y3)


The posterior factorizes

x1_MMSE = mean_{x1} Σ_{x2} Σ_{x3} P(x1, x2, x3 | y1, y2, y3)

= mean_{x1} Σ_{x2} Σ_{x3} Φ(x1, y1) Φ(x2, y2) Φ(x3, y3) Ψ(x1, x2) Ψ(x2, x3)

= mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)


Propagation rules

x1_MMSE = mean_{x1} Σ_{x2} Σ_{x3} P(x1, x2, x3 | y1, y2, y3)

= mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)


Propagation rules

x1_MMSE = mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3)

The innermost sum, M_2^3(x2) = Σ_{x3} Φ(x3, y3) Ψ(x2, x3), is the message from node 3 to node 2; folding it in gives the message from node 2 to node 1:

M_1^2(x1) = Σ_{x2} Ψ(x1, x2) Φ(x2, y2) M_2^3(x2)




Belief propagation: the nosey neighbor rule

"Given everything that I know, here's what I think you should think."

(Given the probabilities of my being in different states, and how my states relate to your states, here's what I think the probabilities of your states should be.)


Belief propagation messages

A message: can be thought of as a set of weights on each of your possible states.

To send a message: Multiply together all the incoming messages, except from the node you're sending to, multiply by the compatibility function, and sum over the sender's states:

M_i^j(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_j^k(x_j)


To find a node's beliefs: Multiply together all the messages coming in to that node:

b_j(x_j) = Π_{k ∈ N(j)} M_j^k(x_j)


b_j(x_j) = Π_{k ∈ N(j)} M_j^k(x_j)

M_i^j(x_i) = Σ_{x_j} Ψ(x_i, x_j) Π_{k ∈ N(j)\i} M_j^k(x_j)

(Figure: pictorial form of the belief and message-update equations.)
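The message and belief equations can be sketched end to end on the 3-node chain from the derivation. The potentials below are made-up; on a chain, the belief computed this way should equal the exact posterior marginal:

```python
# Made-up potentials for the chain x1 - x2 - x3 with fixed observations y:
# Psi(xi, xj) favors equal neighbors; phi[i] plays the role of Phi(xi, yi).
states = [0, 1]
psi = {(0, 0): 1.0, (0, 1): 0.3, (1, 0): 0.3, (1, 1): 1.0}
phi = [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}, {0: 0.2, 1: 0.8}]

def send(src_phi, incoming):
    """Message update: sum over the sender's states of Psi * evidence * incoming."""
    return {a: sum(psi[(a, b)] * src_phi[b] * incoming[b] for b in states)
            for a in states}

ones = {0: 1.0, 1: 1.0}        # leaf nodes start with a uniform message
M2 = send(phi[2], ones)        # message from node 3 to node 2
M1 = send(phi[1], M2)          # message from node 2 to node 1

# Belief at node 1: local evidence times all incoming messages, normalized.
b1 = {a: phi[0][a] * M1[a] for a in states}
z = sum(b1.values())
b1 = {a: v / z for a, v in b1.items()}

# Brute-force check against the exact posterior marginal of x1.
def joint(x):
    return (phi[0][x[0]] * phi[1][x[1]] * phi[2][x[2]]
            * psi[(x[0], x[1])] * psi[(x[1], x[2])])

marg = {a: sum(joint((a, b, c)) for b in states for c in states) for a in states}
z = sum(marg.values())
marg = {a: v / z for a, v in marg.items()}

print(b1, marg)  # on a chain, the BP belief equals the exact marginal
```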


Optimal solution in a chain or tree: Belief Propagation

"Do the right thing" Bayesian algorithm.

For Gaussian random variables over time: Kalman filter.

For hidden Markov models: forward/backward algorithm (and its MAP variant is Viterbi).


No factorization with loops!

x1_MMSE = mean_{x1} Φ(x1, y1) Σ_{x2} Φ(x2, y2) Ψ(x1, x2) Σ_{x3} Φ(x3, y3) Ψ(x2, x3) Ψ(x1, x3)

(Figure: hidden nodes x1, x2, x3 connected in a loop, with observations y2, y3 shown.)

The extra factor Ψ(x1, x3) couples the sum over x3 to x1, so the sums no longer factor into independent messages.


Justification for running belief propagation

Experimental results:

- Error-correcting codes: Kschischang and Frey, 1998; McEliece et al., 1998
- Vision applications: Freeman and Pasztor, 1999; Frey, 2000

Theoretical results:

- For Gaussian processes, means are correct: Weiss and Freeman, 1999
- Large neighborhood local maximum for MAP: Weiss and Freeman, 2000
- Equivalent to Bethe approx. in statistical physics: Yedidia, Freeman, and Weiss, 2000
- Tree-weighted reparameterization: Wainwright, Willsky, Jaakkola, 2001


Results from Bethe free energy analysis

Fixed point of belief propagation equations iff Bethe approximation stationary point.

Belief propagation always has a fixed point.

Connection with variational methods for inference: both minimize approximations to the Free Energy:

- variational: usually use primal variables.
- belief propagation: fixed-point equations for dual variables.

Kikuchi approximations lead to improved belief propagation algorithms.

Yu