MRFs UC Berkeley


  • Slide 1/159

    Prof. Trevor Darrell
    trevor@eecs.berkeley.edu

    Lecture 20: Markov Random Fields

  • Slide 2/159

    Yuri Boykov and others, as noted.

  • Slide 3/159

  • Slide 4/159

    Suppose we have a system of five variables, some of which are
    observed and some are not. There is some probabilistic relationship between
    the 5 variables, described by their joint probability P(x_1, x_2, x_3, x_4, x_5).

    If we want to find out what the likely state of variable x_1 is (say, the position
    of the hand of some person we are observing), what can we do?

    Two reasonable choices are: (a) find the value of x_1 (and of all the other
    variables) that maximizes the joint probability P(x_1, x_2, x_3, x_4, x_5); that is
    the MAP solution.

    Or (b) marginalize over all the other variables and then take the mean or the
    maximum of the resulting marginal over x_1. Marginalizing then taking the mean is
    equivalent to finding the MMSE solution. Marginalizing, then taking the
    max, is called the max marginal solution and is sometimes a useful thing to do.
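    In symbols (a compact restatement of the two choices above, using the
    mean_{x_1} notation that appears in the later derivation slides):

        \hat{x}_1^{\text{MAP}} = \arg\max_{x_1} \max_{x_2, \dots, x_5} P(x_1, x_2, x_3, x_4, x_5)

        \hat{x}_1^{\text{MMSE}} = \operatorname{mean}_{x_1} \sum_{x_2, \dots, x_5} P(x_1, x_2, x_3, x_4, x_5)

        \hat{x}_1^{\text{max marg}} = \arg\max_{x_1} \sum_{x_2, \dots, x_5} P(x_1, x_2, x_3, x_4, x_5)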

  • Slide 5/159

    To find the marginal probability of x_1, we have to sum the joint over all the
    other variables:

        P(x_1) = \sum_{x_2, x_3, x_4, x_5} P(x_1, x_2, x_3, x_4, x_5)

    If the system really is high dimensional, that will quickly become
    intractable. But if there is some modularity in P(x_1, x_2, x_3, x_4, x_5) then
    things become tractable again.

    Suppose the variables form a Markov chain: x_1 causes x_2, which causes x_3,
    and so on.

    [Figure: chain graph x_1 - x_2 - x_3 - x_4 - x_5.]
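    Under the Markov chain assumption the joint factorizes, and a marginal can be
    computed by pushing each sum inside as far as it will go; a sketch of the
    saving, assuming each variable takes k discrete states:

        P(x_1, x_2, x_3, x_4, x_5) = P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_2)\, P(x_4 \mid x_3)\, P(x_5 \mid x_4)

        P(x_5) = \sum_{x_4} P(x_5 \mid x_4) \sum_{x_3} P(x_4 \mid x_3) \sum_{x_2} P(x_3 \mid x_2) \sum_{x_1} P(x_2 \mid x_1)\, P(x_1)

    Each nested sum is a k-by-k matrix-vector product, so this marginal costs on
    the order of 4k^2 operations rather than the k^4 terms of the brute-force sum.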

  • Slide 6/159

  • Slide 7/159

    Directed graphical models

    A directed, acyclic graph.

    Nodes are random variables. Can be scalars or vectors, continuous or
    discrete.

    The direction of an edge tells the parent-child relation:
    parent -> child.

    Attached to each node i is the conditional probability of x_i given its
    parent nodes. That conditional probability is P(x_i | x_{pa(i)}).

    The joint probability is the product of all the conditional probabilities:

        P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_{pa(i)})
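    For instance (a small illustration, not on the slide), if x_1 is the parent of
    both x_2 and x_3, the product rule above gives:

        P(x_1, x_2, x_3) = P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_1)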

  • Slide 8/159

    Another modular probabilistic structure, more common in vision
    problems, is an undirected graph:

    [Figure: chain graph x_1 - x_2 - x_3 - x_4 - x_5.]

    The joint probability for this graph is given by:

        P(x_1, x_2, x_3, x_4, x_5) = \Psi(x_1, x_2)\, \Psi(x_2, x_3)\, \Psi(x_3, x_4)\, \Psi(x_4, x_5)

    where \Psi(x_1, x_2) is called a compatibility function. We can define the
    compatibility functions to reproduce the joint probability of the directed
    graph described in the previous slides; for that example,
    we could use either form.
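    One way to realize "either form" for the five-node chain (a sketch; the slide
    does not spell out this choice) is to fold the directed conditionals into the
    pairwise compatibilities:

        \Psi(x_1, x_2) = P(x_1)\, P(x_2 \mid x_1), \qquad \Psi(x_i, x_{i+1}) = P(x_{i+1} \mid x_i) \;\text{ for } i = 2, 3, 4

    so that the product of the four compatibilities equals the directed joint
    P(x_1)\, P(x_2 \mid x_1) \cdots P(x_5 \mid x_4).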

  • Slide 9/159

    Undirected graphical models

    A set of nodes joined by undirected edges.

    The graph makes conditional independencies explicit: if two nodes are
    not linked, and we condition on every other node in the graph, then
    those two nodes are conditionally independent.

    [Figure: two unlinked nodes are conditionally independent, because they are
    not connected by a line in the undirected graphical model.]
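    Stated symbolically, for any two nodes i and j that are not joined by an edge:

        x_i \perp x_j \mid \{ x_k : k \neq i, j \}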

  • Slide 10/159

    Undirected graphical models: cliques

    Clique: a fully connected set of nodes.

    [Figure: a clique and a set of nodes that is not a clique.]

    A maximal clique is a clique that can't include more nodes of the graph
    without losing the clique property.

    [Figure: a maximal clique and a non-maximal clique.]

  • Slide 11/159

    Undirected graphical models: Hammersley-Clifford theorem

    The Hammersley-Clifford theorem addresses the pdf factorization implied
    by a graph: a distribution has the Markov structure implied by an
    undirected graph iff it can be represented in the factored form

        P(x) = \frac{1}{Z} \prod_{c \in \text{maximal cliques}} \Psi_c(x_c)

    where Z is a normalizing constant and each \Psi_c is a potential function of
    the states of the variables in its maximal clique.

    So for this graph, the factorization is ambiguous (something called
    factor graphs makes it explicit).
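    As a concrete illustration of the ambiguity (not on the slide): a fully
    connected three-node graph has the single maximal clique {x_1, x_2, x_3}, so
    the theorem only promises the form on the left below, yet a distribution built
    from pairwise terms, as on the right, is Markov with respect to the same
    graph; the graph alone does not say which factorization is meant, which is
    what factor graphs make explicit.

        P(x) = \frac{1}{Z}\, \Psi_{123}(x_1, x_2, x_3)
        \qquad \text{vs.} \qquad
        P(x) = \frac{1}{Z}\, \Psi_{12}(x_1, x_2)\, \Psi_{23}(x_2, x_3)\, \Psi_{13}(x_1, x_3)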

  • Slide 12/159

    Markov Random Fields

    Allows rich probabilistic models for images.

    But built in a local, modular way: learn local relationships, get global
    effects out.

  • Slide 13/159

    MRF nodes as pixels

    [Figure: MRF defined over the pixel lattice. From Winkler, 1995, p. 32.]

  • Slide 14/159

    MRF nodes as image patches

    [Figure: MRF over image patches; each scene node x_i is linked to its local
    image evidence (x_i, y_i), and neighboring scene nodes x_i, x_j are linked to
    each other.]

  • Slide 15/159

    Network joint probability

        P(x, y) = \frac{1}{Z} \prod_{i,j} \Psi(x_i, x_j) \prod_{i} \Phi(x_i, y_i)

    where x is the scene and y is the image; \Psi(x_i, x_j) is the scene-scene
    compatibility function between neighboring scene nodes, and \Phi(x_i, y_i) is
    the image-scene compatibility function between local observations and scene
    nodes.

  • Slide 16/159

    In order to use MRFs:

    Given observations y, and the parameters of the MRF, how do we infer the
    hidden variables x?

    How do we learn the parameters of the MRF?
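    Written out (one common formulation, with \theta standing for the parameters
    of the compatibility functions \Psi and \Phi):

        \text{inference:} \quad \hat{x} = \arg\max_x P(x \mid y, \theta), \;\text{ or the marginals } P(x_i \mid y, \theta)

        \text{learning:} \quad \hat{\theta} = \arg\max_{\theta} \prod_{(x, y) \in \mathcal{D}} P(x, y \mid \theta)

    where \mathcal{D} is a set of training pairs; the following outline treats the
    two questions separately (inference methods first, then learning via IPF).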

  • Slide 17/159

    Outline of MRF section

    Inference in MRFs:
      Gibbs sampling, simulated annealing
      Iterated conditional modes (ICM)
      Variational methods
      Belief propagation
      Graph cuts

    Vision applications of inference in MRFs.

    Learning MRF parameters:
      Iterative proportional fitting (IPF)

  • Slide 18/159

    Outline of MRF section (repeated from the previous slide).

  • Slide 19/159

    Derivation of belief propagation

    [Figure: three-node chain with hidden nodes x_1, x_2, x_3, observations
    y_1, y_2, y_3, local evidence \Phi(x_i, y_i) on each observation link, and
    compatibilities \Psi(x_1, x_2), \Psi(x_2, x_3) on the chain links.]

        x_1^{MMSE} = \operatorname{mean}_{x_1} \sum_{x_2} \sum_{x_3} P(x_1, x_2, x_3, y_1, y_2, y_3)

  • Slide 20/159

    The posterior factorizes

        x_1^{MMSE} = \operatorname{mean}_{x_1} \sum_{x_2} \sum_{x_3} P(x_1, x_2, x_3, y_1, y_2, y_3)

                   = \operatorname{mean}_{x_1} \sum_{x_2} \sum_{x_3}
                     \Phi(x_1, y_1)\, \Psi(x_1, x_2)\, \Phi(x_2, y_2)\, \Psi(x_2, x_3)\, \Phi(x_3, y_3)

                   = \operatorname{mean}_{x_1}\, \Phi(x_1, y_1) \sum_{x_2} \Psi(x_1, x_2)\, \Phi(x_2, y_2)
                     \sum_{x_3} \Psi(x_2, x_3)\, \Phi(x_3, y_3)

    [Figure: the three-node chain with \Phi(x_1, y_1), \Phi(x_2, y_2), \Phi(x_3, y_3)
    on the observation links and \Psi(x_1, x_2), \Psi(x_2, x_3) on the chain links.]

  • Slide 21/159

    Propagation rules

        x_1^{MMSE} = \operatorname{mean}_{x_1} \sum_{x_2} \sum_{x_3} P(x_1, x_2, x_3, y_1, y_2, y_3)

                   = \operatorname{mean}_{x_1}\, \Phi(x_1, y_1) \sum_{x_2} \Psi(x_1, x_2)\, \Phi(x_2, y_2)
                     \sum_{x_3} \Psi(x_2, x_3)\, \Phi(x_3, y_3)

    [Figure: the three-node chain with its \Phi and \Psi potentials.]

  • Slide 22/159

    Propagation rules

        x_1^{MMSE} = \operatorname{mean}_{x_1}\, \Phi(x_1, y_1) \sum_{x_2} \Psi(x_1, x_2)\, \Phi(x_2, y_2)
                     \sum_{x_3} \Psi(x_2, x_3)\, \Phi(x_3, y_3)

        M_1^2(x_1) = \sum_{x_2} \Psi(x_1, x_2)\, \Phi(x_2, y_2)\, M_2^3(x_2)

    [Figure: the three-node chain; the message M_2^3(x_2) summarizes the sum over
    x_3, and M_1^2(x_1) summarizes the sum over x_2.]

  • Slide 23/159

    Propagation rules (repeated from the previous slide).

  • Slide 24/159

    The neighbor rule

    "Given everything that I know, here's what I think you should think."

    (Based on what I know about my different states, and how my states relate to
    yours, here's what I think the probabilities of your states should be.)

  • Slide 25/159

    Belief propagation messages

    A message: can be thought of as a set of weights on
    each of your possible states.

    To send a message: multiply together all the incoming messages, except from
    the node you're sending to, multiply by the compatibility function, and sum
    over the sender's states:

        M_i^j(x_i) = \sum_{x_j} \Psi(x_i, x_j) \prod_{k \in N(j) \setminus i} M_j^k(x_j)

    [Figure: node j sends the message M_i^j to node i, combining the messages
    arriving from its other neighbors k in N(j) \setminus i.]

  • Slide 26/159

    To find a node's beliefs: multiply together all the
    messages coming in to that node:

        b_j(x_j) = \prod_{k \in N(j)} M_j^k(x_j)

  • Slide 27/159

        b_j(x_j) = \prod_{k \in N(j)} M_j^k(x_j)

        M_i^j(x_i) = \sum_{x_j} \Psi(x_i, x_j) \prod_{k \in N(j) \setminus i} M_j^k(x_j)

    [Figure: graphical depiction of the belief and message equations on the chain.]
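    To make the message and belief equations concrete, here is a minimal numerical
    sketch (not from the slides) of sum-product belief propagation on the
    three-node chain used in the derivation. It uses NumPy; the compatibility
    matrices psi and the evidence vectors phi are made-up example values, and the
    local evidence \Phi(x_j, y_j) is written explicitly rather than treated as a
    message from the observation node (the two bookkeeping choices are equivalent).

        import numpy as np

        # Three hidden nodes x1, x2, x3 (indexed 0..2), each with K discrete states.
        K = 2

        # Made-up example potentials; any positive values would do.
        # psi[(i, j)][a, b] = Psi(x_i = a, x_j = b) for the edge i - j.
        psi = {(0, 1): np.array([[1.0, 0.5],
                                 [0.5, 1.0]]),
               (1, 2): np.array([[1.0, 0.2],
                                 [0.2, 1.0]])}
        # phi[i][a] = Phi(x_i = a, y_i): local evidence from the observation y_i.
        phi = [np.array([0.7, 0.3]),
               np.array([0.4, 0.6]),
               np.array([0.9, 0.1])]

        neighbors = {0: [1], 1: [0, 2], 2: [1]}

        def pairwise(i, j):
            """Return Psi for edge (i, j) as a matrix indexed by [x_i, x_j]."""
            return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

        # Messages M[j -> i](x_i): start uniform, then sweep until they settle.
        messages = {(j, i): np.ones(K) for j in neighbors for i in neighbors[j]}
        for _ in range(10):  # a chain this short converges in a couple of sweeps
            for (j, i) in list(messages):
                # Evidence at j times all incoming messages except the one from i.
                prod = phi[j].copy()
                for k in neighbors[j]:
                    if k != i:
                        prod *= messages[(k, j)]
                # Sum over the sender's states:
                #   M_i^j(x_i) = sum_{x_j} Psi(x_i, x_j) * prod(x_j)
                m = pairwise(i, j) @ prod
                messages[(j, i)] = m / m.sum()  # normalize for numerical stability

        # Beliefs: local evidence times all incoming messages, normalized.
        beliefs = []
        for j in neighbors:
            b = phi[j].copy()
            for k in neighbors[j]:
                b *= messages[(k, j)]
            beliefs.append(b / b.sum())

        # Sanity check against the brute-force marginal of the joint
        # P(x1, x2, x3) proportional to phi1 * phi2 * phi3 * psi12 * psi23.
        joint = np.einsum('a,b,c,ab,bc->abc', phi[0], phi[1], phi[2],
                          psi[(0, 1)], psi[(1, 2)])
        joint /= joint.sum()
        print("BP belief at node 0:     ", beliefs[0])
        print("Exact marginal at node 0:", joint.sum(axis=(1, 2)))

    On a chain (a tree), the beliefs computed this way match the exact marginals,
    which is the point of the "optimal solution in a chain or tree" slide that
    follows.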

  • Slide 28/159

    Optimal solution in a chain or tree: Belief Propagation

    "Do the right thing" Bayesian algorithm.

    For Gaussian probabilities: the Kalman filter.

    For discrete-state chains: the forward/backward algorithm (and the MAP
    version is the Viterbi algorithm).

  • Slide 29/159

    No factorization with loops!

        x_1^{MMSE} = \operatorname{mean}_{x_1}\, \Phi(x_1, y_1) \sum_{x_2} \Psi(x_1, x_2)\, \Phi(x_2, y_2)
                     \sum_{x_3} \Psi(x_2, x_3)\, \Phi(x_3, y_3)\, \Psi(x_1, x_3)

    The extra compatibility \Psi(x_1, x_3) closes a loop: the sum over x_3 now
    depends on x_1 as well, so the computation no longer factors into local
    messages.

    [Figure: loopy graph over x_1, x_2, x_3 with observations y_1, y_2, y_3.]

  • Slide 30/159

    Justification for running belief propagation

    Experimental results:

      Error-correcting codes: Kschischang and Frey, 1998; McEliece et al., 1998

      Vision applications: Freeman and Pasztor, 1999; Frey, 2000

    Theoretical results:

      For Gaussian processes, means are correct: Weiss and Freeman, 1999

      Large neighborhood local maximum for MAP: Weiss and Freeman, 2000

      Equivalent to Bethe approximation in statistical physics: Yedidia, Freeman,
      and Weiss, 2000

      Tree-weighted reparameterization: Wainwright, Willsky, Jaakkola, 2001

  • Slide 31/159

    Results from Bethe free energy analysis

    Fixed point of belief propagation equations iff Bethe approximation is at a
    stationary point.

    Belief propagation always has a fixed point.

    Connection with variational methods for inference: both minimize
    approximations to the free energy.

      variational: usually use primal variables.
      belief propagation: fixed point equations for dual variables.

    Kikuchi approximations lead to more accurate belief propagation algorithms.

    Yu