Image Parsing: Unifying Segmentation and Detection. Z. Tu, X. Chen, A.L. Yuille and S-C. Zhu. ICCV 2003 (Marr Prize) & IJCV 2005. Sanketh Shetty



Page 1: Image Parsing: Unifying Segmentation and Detection

Image Parsing: Unifying Segmentation and Detection

Z. Tu, X. Chen, A.L. Yuille and S-C. Zhu

ICCV 2003 (Marr Prize) & IJCV 2005

Sanketh Shetty

Page 2

Outline

• Why Image Parsing?
• Introduction to Concepts in DDMCMC
• DDMCMC applied to Image Parsing
• Combining Discriminative and Generative Models for Parsing
• Results
• Comments

Page 3

Image Parsing

Image I

Parse Structure W

Optimize p(W|I)
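
Assuming the usual Bayesian reading of this slide (the deck itself writes the generative model as p(I|W)p(W) on a later slide), the objective can be sketched as a posterior maximization. The symbol W* for the optimal parse is introduced here only for illustration:

```latex
W^{*} \;=\; \arg\max_{W}\, p(W \mid I)
      \;=\; \arg\max_{W}\, \frac{p(I \mid W)\, p(W)}{p(I)}
      \;=\; \arg\max_{W}\, p(I \mid W)\, p(W)
```

The evidence p(I) does not depend on W, which is why it drops out of the maximization.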

Page 4

Properties of Parse Structure

• Dynamic and reconfigurable
  – Variable number of nodes and node types
• Defined by a Markov Chain
  – Data Driven Markov Chain Monte Carlo (earlier work in segmentation, grouping and recognition)

Page 5

Key Concepts

• Joint model for Segmentation & Recognition
  – Combine different modules to obtain cues
• Fully generative explanation for Image generation
  – Uses Generative and Discriminative Models + DDMCMC framework
  – Concurrent Top-Down & Bottom-Up Parsing

Page 6

Pattern Classes

62 characters

Faces

Regions

Page 7

MCMC: A Quick Tour

• Key Concepts:
  – Markov Chains
  – Markov Chain Monte Carlo
    • Metropolis-Hastings [Metropolis 1953, Hastings 1970]
    • Reversible Jump [Green 1995]
  – Data Driven Markov Chain Monte Carlo

Page 8

Markov Chains

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005

Page 9

Markov Chain Monte Carlo

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005

Page 10

Metropolis-Hastings Algorithm

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005

Page 11

Metropolis-Hastings Algorithm

Proposal Distribution

Invariant Distribution

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005
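
The slide's equations did not survive in this transcript, so the following is only a minimal sketch of a generic Metropolis-Hastings loop, showing how the proposal distribution and the invariant (target) distribution enter the acceptance test. The standard normal target, the Gaussian random-walk proposal and every function name are assumptions chosen for illustration; none of them come from the slides.

```python
import math
import random


def metropolis_hastings(log_target, propose, log_q, x0, n_steps, rng):
    """Run Metropolis-Hastings and return the list of visited states.

    log_target(x)       : log of the (unnormalized) invariant density
    propose(x, rng)     : draws a candidate from the proposal q(. | x)
    log_q(x_to, x_from) : log q(x_to | x_from), up to a shared constant
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        cand = propose(x, rng)
        # log of p(x') q(x | x') / (p(x) q(x' | x))
        log_ratio = (log_target(cand) + log_q(x, cand)
                     - log_target(x) - log_q(cand, x))
        # accept with probability min(1, ratio); otherwise stay at x
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = cand
        samples.append(x)
    return samples


# Toy setup (assumed): standard normal target, random-walk proposal.
STEP = 1.0


def log_target(x):
    return -0.5 * x * x


def propose(x, rng):
    return x + rng.gauss(0.0, STEP)


def log_q(x_to, x_from):
    return -0.5 * ((x_to - x_from) / STEP) ** 2


samples = metropolis_hastings(log_target, propose, log_q, 0.0, 20000,
                              random.Random(0))
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

For a symmetric proposal such as this one the two `log_q` terms cancel; they are kept so that the same loop also works with asymmetric proposals.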

Page 12

Reversible Jump MCMC

• Many competing models to explain data
  – Need to explore this complicated state space

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005

Page 13

DDMCMC Motivation

Notes: Slides by Zhu, Dellaert and Tu at ICCV 2005

Page 14

DDMCMC Motivation

Generative Model: p(I|W) p(W)

State Space

Page 15

DDMCMC Motivation

Generative Model: p(I|W) p(W)

State Space

Discriminative Model: q(w_j | I)

Dramatically reduce the search space by focusing sampling on highly probable states.
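
To make the motivation concrete, here is a toy sketch (not the authors' model) that compares two proposals inside the same independence Metropolis-Hastings sampler on a discrete state space: a uniform proposal, and a "data-driven" proposal that concentrates its mass where a hypothetical bottom-up cue fires. The state space, the target weights and the cue are all invented for illustration.

```python
import random


def independence_mh(p, q, n_steps, rng):
    """Independence Metropolis-Hastings on states 0..len(p)-1.

    p : unnormalized target weights (all > 0)
    q : proposal probabilities (all > 0, summing to 1)
    Returns the fraction of proposals that were accepted.
    """
    states = list(range(len(p)))
    x = rng.choices(states, weights=q)[0]
    proposals = rng.choices(states, weights=q, k=n_steps)
    accepted = 0
    for y in proposals:
        # acceptance ratio p(y) q(x) / (p(x) q(y))
        ratio = (p[y] * q[x]) / (p[x] * q[y])
        if rng.random() < min(1.0, ratio):
            x = y
            accepted += 1
    return accepted / n_steps


N = 200
p = [0.001] * N            # almost all states are very unlikely
p[40], p[150] = 5.0, 3.0   # two states explain the "image" well

# Blind search: every state is proposed equally often.
q_uniform = [1.0 / N] * N

# Hypothetical bottom-up cue: 90% of the proposal mass goes to the two
# states the cue fires on, 10% is spread uniformly so q stays positive.
q_cue = [0.1 / N] * N
q_cue[40] += 0.45
q_cue[150] += 0.45

rng = random.Random(0)
acc_uniform = independence_mh(p, q_uniform, 5000, rng)
acc_cue = independence_mh(p, q_cue, 5000, rng)
print(f"acceptance, uniform proposal:     {acc_uniform:.3f}")
print(f"acceptance, data-driven proposal: {acc_cue:.3f}")
```

Both samplers target the same distribution; the only difference is how often a proposal lands on a plausible state, which is the effect the slide attributes to q(w_j | I).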

Page 16

DDMCMC Framework

• Moves:
  – Node Creation
  – Node Deletion
  – Change Node Attributes
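
The three move types can be sketched on a deliberately simplified parse structure. Everything below is hypothetical: the flat list of nodes, the `Node` class, the single `scale` attribute and the way moves are chosen are stand-ins, and the sketch only generates proposals (it does not compute acceptance probabilities).

```python
import random
from dataclasses import dataclass, field

KINDS = ("face", "text", "region")  # pattern classes named in the deck


@dataclass
class Node:
    kind: str
    attrs: dict = field(default_factory=dict)


def propose_move(parse, rng):
    """Return (move_name, new_parse); the input list is left untouched."""
    moves = ["create"]
    if parse:  # deletion and attribute change need an existing node
        moves += ["delete", "change"]
    move = rng.choice(moves)
    new_parse = [Node(n.kind, dict(n.attrs)) for n in parse]
    if move == "create":
        new_parse.append(Node(rng.choice(KINDS), {"scale": rng.random()}))
    elif move == "delete":
        new_parse.pop(rng.randrange(len(new_parse)))
    else:  # "change": resample one attribute of one node
        new_parse[rng.randrange(len(new_parse))].attrs["scale"] = rng.random()
    return move, new_parse


rng = random.Random(1)
parse = []
for _ in range(5):
    move, parse = propose_move(parse, rng)
    print(move, [n.kind for n in parse])
```

Creation and deletion change the number of nodes, so a sampler built on such moves has to jump between spaces of different dimension, which is where the reversible-jump machinery from the earlier slides comes in.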

Page 17

Transition Kernel

Satisfies the detailed balance equation

Full Transition Kernel
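
The kernel equations are missing from this transcript. As a small numerical illustration under assumed values, the sketch below builds two Metropolis-Hastings sub-kernels for the same 4-state target, checks that each satisfies detailed balance, p(x) K(x, y) = p(y) K(y, x), and checks that a fixed mixture of them (standing in for a full kernel composed of sub-kernels) leaves the target invariant. The target, both proposals and the mixing weights are invented.

```python
def mh_kernel(p, Q):
    """Metropolis-Hastings transition matrix for target p and proposal Q."""
    n = len(p)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and Q[i][j] > 0.0:
                accept = min(1.0, (p[j] * Q[j][i]) / (p[i] * Q[i][j]))
                K[i][j] = Q[i][j] * accept
        K[i][i] = 1.0 - sum(K[i])  # rejected proposals stay in place
    return K


def detailed_balance_gap(p, K):
    """Largest violation of p[i] K[i][j] == p[j] K[j][i]."""
    n = len(p)
    return max(abs(p[i] * K[i][j] - p[j] * K[j][i])
               for i in range(n) for j in range(n))


def invariance_gap(p, K):
    """Largest violation of sum_i p[i] K[i][j] == p[j]."""
    n = len(p)
    return max(abs(sum(p[i] * K[i][j] for i in range(n)) - p[j])
               for j in range(n))


p = [0.1, 0.2, 0.3, 0.4]
n = len(p)

# Sub-kernel 1: propose one of the two ring neighbours.
Q_ring = [[0.5 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)]
          for i in range(n)]
# Sub-kernel 2: propose any state uniformly.
Q_unif = [[1.0 / n] * n for _ in range(n)]

K_ring = mh_kernel(p, Q_ring)
K_unif = mh_kernel(p, Q_unif)

# Mixture of the sub-kernels with assumed weights 0.3 and 0.7.
rho = (0.3, 0.7)
K_full = [[rho[0] * K_ring[i][j] + rho[1] * K_unif[i][j] for j in range(n)]
          for i in range(n)]

for name, K in (("ring", K_ring), ("uniform", K_unif), ("full", K_full)):
    print(name, detailed_balance_gap(p, K), invariance_gap(p, K))
```

Detailed balance is preserved under such mixtures, so verifying it for each sub-kernel is enough to know that the combined kernel has the same invariant distribution.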

Page 18

Convergence to p(W|I)

Monotonically at a geometric rate
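
A toy illustration of this kind of convergence, assuming a 3-state chain rather than the parsing posterior: starting from a point mass, the total variation distance to the invariant distribution is tracked while the kernel is applied repeatedly. The target and the Metropolis kernel below are invented for the example.

```python
def step(mu, K):
    """One application of the kernel: returns the row vector mu K."""
    n = len(mu)
    return [sum(mu[i] * K[i][j] for i in range(n)) for j in range(n)]


def total_variation(mu, p):
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, p))


# Assumed toy target and a Metropolis kernel that proposes each of the
# other two states with probability 0.5.
p = [0.2, 0.3, 0.5]
n = len(p)
K = [[0.5 * min(1.0, p[j] / p[i]) if i != j else 0.0 for j in range(n)]
     for i in range(n)]
for i in range(n):
    K[i][i] = 1.0 - sum(K[i])

mu = [1.0, 0.0, 0.0]  # start in state 0 with certainty
tv = [total_variation(mu, p)]
for _ in range(20):
    mu = step(mu, K)
    tv.append(total_variation(mu, p))

for k in range(6):
    print(k, tv[k])
```

In this example the distance never increases from one step to the next and eventually shrinks by a roughly constant factor per step, which is the qualitative behaviour the slide describes.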

Page 19

Criteria for Designing Transition Kernels

Page 20

Image Generation Model

Regions:
• Constant Intensity
• Textures
• Shading

State of parse graph

Page 21

62 characters

Faces

3 Regions

Page 22

Uniform

Designed to penalize high model complexity

Page 23

Shape Prior

Faces

3 Regions

Page 24

Shape Prior: Text

Page 25

Intensity Models

Page 26

Intensity Model: Faces

Page 27

Discriminative Cues Used

• AdaBoost Trained
  – Face Detector
  – Text Detector
• Adaptive Binarization Cues
• Edge Cues
  – Canny at 3 scales
• Shape Affinity Cues
• Region Affinity Cues

Page 28

Transition Kernel Design

• Remember

Page 29

Possible Transitions

1. Birth/Death of a Face Node
2. Birth/Death of Text Node
3. Boundary Evolution
4. Split/Merge Region
5. Change node attributes

Page 30

Face/Text Transitions

Page 31

Region Transitions

Page 32

Change Node Attributes

Page 33

Basic Control Algorithm

Page 34
Page 35

Results

Page 36
Page 37

Comments

• Well motivated but very complicated approach to THE HOLY GRAIL problem in vision
  – Good global convergence results for inference with very minor dependence on initial W.
  – Extensible to larger set of primitives and pattern types.
• Many details of the algorithm are missing, and it is hard to understand the motivation for the choice of values for some parameters.
• Unclear if the p(W|I)’s for configurations with different class compositions are comparable.
• Derek’s comment on AdaBoost false positives and their failure to report their exact improvement
• No quantitative results/comparison to other algorithms and approaches
  – It should be possible to design a simple experiment to measure performance on recognition/detection/localization tasks.

Page 38

Thank You