Graphical Models and Belief Propagation (6.869.csail.mit.edu/fa19/lectures/L10MRF2019.pdf)


Page 1:

6.869/6.819 Advances in Computer Vision, Fall 2019
Bill Freeman, Antonio Torralba, Phillip Isola

Lecture 10

Graphical Models and Belief Propagation

October 8, 2019

Page 2:

Lecture 10, Oct. 8, 2019: Belief Propagation and Graphical Models

Only the first 10 slides will be presented in class; the rest are just included for reference. Most of the class will be on the blackboard.

Page 3:

Page 4:

Page 5:

Identical local evidence...

Page 6:

…different interpretations

Page 7:

Information must propagate over the image.

Local information... ...must propagate

Probabilistic graphical models are a powerful tool for propagating information over an image, and these tools are now used throughout computer vision.

Page 8:

From a random sample of 6 papers from CVPR 2014, half had figures that look like this...


http://www.cvpapers.com/cvpr2014.html

Page 9:


http://hci.iwr.uni-heidelberg.de/Staff/bsavchyn/papers/swoboda-GraphicalModelsPersistency-with-Supplement-cvpr2014.pdf

Partial Optimality by Pruning for MAP-inference with General Graphical Models, Swoboda et al.

Page 10:


Active flattening of curved document images via two structured beams, Meng et al.

Page 11:

A Mixture of Manhattan Frames: Beyond the Manhattan World, Straub et al.


http://www.jstraub.de/download/straub2014mmf.pdf

Page 12:

MRF nodes as patches

[Figure: Markov network in which observed image patches y_i connect to hidden scene patches x_i through Φ(x_i, y_i), and neighboring scene patches connect through Ψ(x_i, x_j).]

Page 13:

Network joint probability

P(x, y) = \frac{1}{Z} \prod_{(i,j)} \Psi(x_i, x_j) \prod_i \Phi(x_i, y_i)

Here x are the scene nodes and y the image nodes: Ψ(x_i, x_j) is the scene-scene compatibility function between neighboring scene nodes, and Φ(x_i, y_i) is the image-scene compatibility function tying each scene node to its local observation.
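To make the factorization concrete, here is a minimal Python sketch (not code from the lecture) that evaluates this joint probability on a tiny chain of scene nodes; the sizes and random compatibility tables are purely illustrative.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n_nodes, n_states = 5, 4                       # 5 scene nodes, 4 candidate states each (illustrative)
phi = rng.random((n_nodes, n_states)) + 1e-3   # phi[i, xi]     ~ local evidence Phi(x_i, y_i)
psi = rng.random((n_nodes - 1, n_states, n_states)) + 1e-3  # psi[i, xi, xj] ~ Psi(x_i, x_{i+1})

def unnormalized(x):
    """Product of all compatibility functions for one joint labeling x."""
    p = np.prod([phi[i, x[i]] for i in range(n_nodes)])
    p *= np.prod([psi[i, x[i], x[i + 1]] for i in range(n_nodes - 1)])
    return p

# The partition function Z sums over every joint state -- only feasible for toy models.
Z = sum(unnormalized(x) for x in product(range(n_states), repeat=n_nodes))
print("P(x, y) for one labeling:", unnormalized((0, 1, 2, 3, 0)) / Z)
```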

Page 14:

Energy formulation

E(x, y) = k + \sum_{(i,j)} \beta(x_i, x_j) + \sum_i \alpha(x_i, y_i)

As before, the first sum runs over neighboring scene nodes and the second over the local observations: β(x_i, x_j) is the scene-scene compatibility term and α(x_i, y_i) is the image-scene compatibility term.
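One way to connect this to the previous slide (an assumption about notation, not something stated on the slide): if α and β are taken to be the negative logs of the compatibility functions, the energy is just the negative log of the joint probability.

```latex
% With \alpha(x_i,y_i) = -\log \Phi(x_i,y_i) and \beta(x_i,x_j) = -\log \Psi(x_i,x_j):
E(x,y) \;=\; -\log P(x,y)
       \;=\; \log Z \;+\; \sum_{(i,j)} \beta(x_i, x_j) \;+\; \sum_i \alpha(x_i, y_i)
```

Under that reading, the constant k plays the role of log Z, and MAP inference (maximizing P) is the same as minimizing the energy E.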

Page 15:

Outline of MRF section

• Inference in MRF's
  – Gibbs sampling, simulated annealing
  – Iterated conditional modes (ICM)
  – Loopy belief propagation
    • Application example: super-resolution
  – Graph cuts
  – Variational methods
• Learning MRF parameters
  – Iterative proportional fitting (IPF)

Page 16:

Belief and message update rules are just local operations, and can be run whether or not the network has loops.

[Figure: the message passed from node j to node i, and the belief at node i, are computed from quantities local to that edge.]
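To illustrate how local these updates are, here is a rough sum-product sketch on a toy chain model; the data structures and schedule are illustrative, not the lecture's code.

```python
import numpy as np

def run_bp(phi, psi, n_iters=10):
    """Sum-product BP on a chain: phi[i, xi] local evidence, psi[e, xi, xj] pairwise tables."""
    n_nodes, n_states = phi.shape
    # One message per directed edge, initialized uniformly.
    msgs = {(i, j): np.ones(n_states)
            for i in range(n_nodes) for j in (i - 1, i + 1) if 0 <= j < n_nodes}

    for _ in range(n_iters):
        new = {}
        for (i, j) in msgs:
            # Local evidence at i times messages arriving at i from neighbors other than j.
            incoming = phi[i].copy()
            for k in (i - 1, i + 1):
                if 0 <= k < n_nodes and k != j:
                    incoming *= msgs[(k, i)]
            pair = psi[min(i, j)]                                  # table for the edge between i and j
            m = pair.T @ incoming if i < j else pair @ incoming    # sum over x_i
            new[(i, j)] = m / m.sum()                              # normalize for numerical stability
        msgs = new

    # Belief at each node: local evidence times all incoming messages, then normalize.
    beliefs = phi.copy()
    for (k, i) in msgs:
        beliefs[i] *= msgs[(k, i)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
beliefs = run_bp(rng.random((5, 4)) + 1e-3, rng.random((4, 4, 4)) + 1e-3)
print(beliefs.argmax(axis=1))   # most probable state per node under the beliefs
```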

Page 17:

Justification for running belief propagation in networks with loops

• Experimental results:
  – Comparison of methods (Szeliski et al., 2008; http://vision.middlebury.edu/MRF/)
  – Error-correcting codes (Kschischang and Frey, 1998; McEliece et al., 1998)
  – Vision applications (Freeman and Pasztor, 1999; Frey, 2000)
• Theoretical results:
  – For Gaussian processes, means are correct (Weiss and Freeman, 1999)
  – Large neighborhood local maximum for MAP (Weiss and Freeman, 2000)
  – Equivalent to the Bethe approximation in statistical physics (Yedidia, Freeman, and Weiss, 2000)
  – Tree-weighted reparameterization (Wainwright, Willsky, Jaakkola, 2001)

Page 18:

testMRF.m

Show program comparing some methods on a simple MRF

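testMRF.m itself is not reproduced in these notes. As a stand-in, here is a hedged Python sketch in the same spirit: it runs iterated conditional modes (ICM) on a tiny chain MRF with random compatibilities and compares the result against brute-force MAP.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
n_nodes, n_states = 6, 3
phi = rng.random((n_nodes, n_states)) + 1e-3                 # local evidence
psi = rng.random((n_nodes - 1, n_states, n_states)) + 1e-3   # pairwise compatibilities

def log_score(x):
    s = sum(np.log(phi[i, x[i]]) for i in range(n_nodes))
    return s + sum(np.log(psi[i, x[i], x[i + 1]]) for i in range(n_nodes - 1))

# ICM: repeatedly re-label one node at a time, holding its neighbors fixed.
x = list(phi.argmax(axis=1))                 # initialize from local evidence alone
for _ in range(10):
    for i in range(n_nodes):
        scores = np.log(phi[i])
        if i > 0:
            scores = scores + np.log(psi[i - 1, x[i - 1], :])
        if i < n_nodes - 1:
            scores = scores + np.log(psi[i, :, x[i + 1]])
        x[i] = int(scores.argmax())

best = max(product(range(n_states), repeat=n_nodes), key=log_score)
print("ICM:      ", x, round(log_score(x), 3))
print("exact MAP:", list(best), round(log_score(best), 3))
```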

Page 19:

Outline of MRF section

• Inference in MRF's
  – Gibbs sampling, simulated annealing
  – Iterated conditional modes (ICM)
  – Belief propagation
    • Application example: super-resolution
  – Graph cuts
  – Variational methods
• Learning MRF parameters
  – Iterative proportional fitting (IPF)

Page 20:

Super-resolution

• Image: low-resolution image
• Scene: high-resolution image

[Figure: image node and scene node; the ultimate goal is to infer the high-resolution scene from the low-resolution image.]

Page 21:

Polygon-based graphics images are resolution independent; pixel-based images are not.

[Figure: zoomed comparison of pixel replication, cubic spline, cubic spline with sharpening, and training-based super-resolution.]

Page 22:

3 approaches to perceptual sharpening

(1) Sharpening: boost existing high frequencies.
(2) Use multiple frames to obtain a higher sampling rate in a still frame.
(3) Estimate high frequencies not present in the image, although implicitly defined.

In this talk, we focus on (3), which we'll call "super-resolution".

[Figure: amplitude vs. spatial frequency sketches illustrating the approaches.]

Page 23:

Super-resolution: other approaches

• Schultz and Stevenson, 1994
• Pentland and Horowitz, 1993
• Fractal image compression (Polvere, 1998; Iterated Systems)
• Astronomical image processing (e.g. Gull and Daniell, 1978; "pixons", http://casswww.ucsd.edu/puetter.html)
• Follow-on: Jianchao Yang, John Wright, Thomas S. Huang, Yi Ma: Image super-resolution as sparse representation of raw image patches. CVPR 2008

Page 24:

Training images, ~100,000 image/scene patch pairs

Images from two Corel database categories: “giraffes” and “urban skyline”.


Page 25:

Do a first interpolation

Zoomed low-resolution

Low-resolution

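A minimal sketch of this first interpolation step, assuming a generic cubic zoom (the lecture does not specify the exact interpolation used):

```python
import numpy as np
from scipy.ndimage import zoom

low_res = np.random.rand(85, 51)          # stand-in for an 85x51 low-resolution image
zoomed = zoom(low_res, 4, order=3)        # cubic interpolation, 4x in each dimension
print(low_res.shape, "->", zoomed.shape)  # (85, 51) -> (340, 204)
```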

Page 26:

Zoomed low-resolution

Low-resolution

Full frequency original


Page 27:

[Figure: zoomed low-freq. image, representation, and full freq. original.]

Page 28:

[Figure: zoomed low-freq. image, representation, and full freq. original; below, the low-band input (contrast normalized, PCA fitted) and the true high freqs.]

(To minimize the complexity of the relationships we have to learn, we remove the lowest frequencies from the input image and normalize the local contrast level.)
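A rough sketch of that preprocessing, assuming simple Gaussian filters for the band-pass and the local contrast normalization (filter widths are illustrative, and the PCA fitting step is omitted):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bandpass_contrast_normalize(zoomed, sigma=3.0, eps=1e-4):
    low = gaussian_filter(zoomed, sigma)                 # lowest spatial frequencies
    band = zoomed - low                                  # keep mid/high frequencies only
    local_std = np.sqrt(gaussian_filter(band ** 2, sigma)) + eps
    return band / local_std                              # contrast-normalized band-pass image
```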

Page 29:

Training data samples (magnified)

Gather ~100,000 patches: low freqs. paired with high freqs.

Page 30:

[Figure: input low freqs., true high freqs., training data samples (magnified), and the nearest neighbor estimate of the high freqs.]
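The nearest-neighbor lookup can be sketched as follows; the database here is random stand-in data and the patch sizes are illustrative. Keeping the k best matches gives the candidate scene patches for one MRF node.

```python
import numpy as np

rng = np.random.default_rng(0)
train_low = rng.standard_normal((100_000, 7 * 7))    # low-freq. training patches (flattened)
train_high = rng.standard_normal((100_000, 5 * 5))   # paired high-freq. patches

def candidate_high_patches(query_low, k=10):
    """Return the high-freq. patches paired with the k closest low-freq. matches."""
    d2 = ((train_low - query_low) ** 2).sum(axis=1)   # squared distance to every database patch
    nearest = np.argsort(d2)[:k]
    return train_high[nearest]

print(candidate_high_patches(rng.standard_normal(7 * 7)).shape)   # (10, 25): candidates for one node
```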

Page 31:

[Figure: input low freqs., training data samples (magnified), and the nearest neighbor estimate of the high freqs.]

Page 32:

Example: input image patch, and closest matches from database

Input patch

Closest image patches from database

Corresponding high-resolution patches from database


Page 33:

Page 34:

Scene-scene compatibility function, Ψ(x_i, x_j)

Assume the overlapped regions, d, of the hi-res. patches differ by Gaussian observation noise.

Uniqueness constraint, not smoothness.
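A hedged sketch of what such a compatibility could look like for two horizontally adjacent candidate patches, scored by agreement on the overlap region d under a Gaussian noise model (the overlap width and sigma are illustrative):

```python
import numpy as np

def psi(patch_left, patch_right, overlap=1, sigma=1.0):
    """Psi(x_i, x_j): agreement of two candidate hi-res patches on their overlap region d."""
    d_left = patch_left[:, -overlap:]     # right edge of the left candidate
    d_right = patch_right[:, :overlap]    # left edge of the right candidate
    return float(np.exp(-np.sum((d_left - d_right) ** 2) / (2 * sigma ** 2)))
```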

Page 35:

Image-scene compatibility function, Φ(x_i, y_i)

Assume Gaussian noise takes you from the observed image patch y to the synthetic sample x.
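And a matching sketch of the image-scene term, comparing the observed low-frequency patch with the low-frequency patch stored alongside candidate x, again under an illustrative Gaussian noise model:

```python
import numpy as np

def phi(candidate_low, observed_low, sigma=1.0):
    """Phi(x_i, y_i): Gaussian match between a candidate's low-freq. patch and the observation."""
    return float(np.exp(-np.sum((candidate_low - observed_low) ** 2) / (2 * sigma ** 2)))
```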

Page 36:

Markov network

[Figure: image patches y_i linked to scene patches x_i by Φ(x_i, y_i); neighboring scene patches linked by Ψ(x_i, x_j).]

Page 37:

Belief Propagation

[Figure: input and the estimates at iterations 0, 1, and 3.]

After a few iterations of belief propagation, the algorithm selects spatially consistent high-resolution interpretations for each low-resolution patch of the input image.

Page 38:

Zooming 2 octaves

[Figure: 85x51 input; cubic spline zoom to 340x204; maximum likelihood zoom to 340x204.]

We apply the super-resolution algorithm recursively, zooming up 2 powers of 2, i.e. a factor of 4 in each dimension.

Page 39:

[Figure: original 50x58; true 200x232.]

Now we examine the effect of the prior assumptions made about images on the high-resolution reconstruction. First, cubic spline interpolation (a cubic spline implies a thin-plate prior).

Page 40:

[Figure: original 50x58; cubic spline reconstruction; true 200x232. A cubic spline implies a thin-plate prior.]

Page 41:

[Figure: original 50x58; true image; training images.]

Next, train the Markov network algorithm on a world of random-noise images.

Page 42:

[Figure: original 50x58; Markov network reconstruction; true image; training images.]

The algorithm learns that, in such a world, random noise is added when zooming to a higher resolution.

Page 43:

[Figure: original 50x58; true image; training images.]

Next, train on a world of vertically oriented rectangles.

Page 44:

[Figure: original 50x58; Markov network reconstruction; true image; training images.]

The Markov network algorithm hallucinates the vertical rectangles it was trained on.

Page 45:

[Figure: original 50x58; true image; training images.]

Now train on a generic collection of images.

Page 46:

[Figure: original 50x58; Markov network reconstruction; true image; training images.]

The algorithm makes a reasonable guess at the high-resolution image, based on its training images.

Page 47:

Generic training images

Next, train on a generic set of training images: photographs taken with the same camera as the test image, but otherwise a random collection.

Page 48:

[Figure: original 70x70; cubic spline; Markov net (training: generic); true 280x280.]

Page 49:

Kodak Imaging Science Technology Lab test.

3 test images, 640x480, to be zoomed up by 4 in each dimension.

8 judges, making 2-alternative forced-choice comparisons.

Page 50:

Algorithms compared

• Bicubic Interpolation
• Mitra's Directional Filter
• Fuzzy Logic Filter
• Vector Quantization
• VISTA

Page 51:

[Figure: bicubic spline vs. Altamira vs. VISTA.]

Page 52:

[Figure: bicubic spline vs. Altamira vs. VISTA.]

Page 53:

User preference test results

“The observer data indicates that six of the observers ranked Freeman’s algorithm as the most preferred of the five tested algorithms. However the other two observers rank Freeman’s algorithm as the least preferred of all the algorithms….

Freeman’s algorithm produces prints which are by far the sharpest out of the five algorithms. However, this sharpness comes at a price of artifacts (spurious detail that is not present in the original scene). Apparently the two observers who did not prefer Freeman’s algorithm had strong objections to the artifacts. The other observers apparently placed high priority on the high level of sharpness in the images created by Freeman’s algorithm.”


Page 54:

Page 55:

Page 56:

Training images


Page 57:

Training image


Page 58:

Processed image


Page 60:

Motion application

[Figure: Markov network with image patches and scene patches, as in the super-resolution setup.]

Page 61:

What behavior should we see in a motion algorithm?

• Aperture problem
• Resolution through propagation of information
• Figure/ground discrimination

Page 62:

The aperture problem

http://web.mit.edu/persci/demos/Motion&Form/demos/one-square/one-square.html


Page 63:

The aperture problem


Page 64:

motion program demo


Page 65:

Motion estimation results (maxima of scene probability distributions displayed)

[Figure: image data; inference at iterations 0 and 1.]

Initial guesses only show motion at edges.

Page 66:

Motion estimation results (maxima of scene probability distributions displayed)

[Figure: inference at iterations 2 and 3.]

Figure/ground still unresolved here.

Page 67:

Motion estimation results (maxima of scene probability distributions displayed)

[Figure: inference at iterations 4 and 5.]

The final result compares well with the vector-quantized true (uniform) velocities.