![Page 1: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/1.jpg)
Combinatorial Optimization and
Computer Vision
Philip Torr
![Page 2: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/2.jpg)
Story
• How an attempt to solve one problem lead into many different areas of computer vision and some interesting results.
![Page 3: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/3.jpg)
Aim• Given an image, to segment the object
Segmentation should (ideally) be• shaped like the object e.g. cow-like• obtained efficiently in an unsupervised manner• able to handle self-occlusion
Segmentation
ObjectCategory
Model
Cow Image Segmented Cow
![Page 4: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/4.jpg)
Challenges
Self Occlusion
Shape Variability
Appearance Variability
![Page 5: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/5.jpg)
Motivation
Current methods require user intervention• Object and background seed pixels (Boykov and Jolly, ICCV 01)• Bounding Box of object (Rother et al. SIGGRAPH 04)
Cow Image
Object Seed Pixels
![Page 6: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/6.jpg)
Motivation
Current methods require user intervention• Object and background seed pixels (Boykov and Jolly, ICCV 01)• Bounding Box of object (Rother et al. SIGGRAPH 04)
Cow Image
Object Seed Pixels
Background Seed Pixels
![Page 7: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/7.jpg)
Motivation
Current methods require user intervention• Object and background seed pixels (Boykov and Jolly, ICCV 01)• Bounding Box of object (Rother et al. SIGGRAPH 04)
Segmented Image
![Page 8: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/8.jpg)
Motivation
Current methods require user intervention• Object and background seed pixels (Boykov and Jolly, ICCV 01)• Bounding Box of object (Rother et al. SIGGRAPH 04)
Cow Image
Object Seed Pixels
Background Seed Pixels
![Page 9: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/9.jpg)
Motivation
Current methods require user intervention• Object and background seed pixels (Boykov and Jolly, ICCV 01)• Bounding Box of object (Rother et al. SIGGRAPH 04)
Segmented Image
![Page 10: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/10.jpg)
Problem • Manually intensive
• Segmentation is not guaranteed to be ‘object-like’
Non Object-like Segmentation
Motivation
![Page 11: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/11.jpg)
MRF for Image Segmentation
EnergyMRF
Pair-wise Terms MAP SolutionUnary likelihoodData (D)
Unary likelihood Contrast Term Pair-wise terms(Potts Model)
Boykov and Jolly [ICCV 2001]
Maximum-a-posteriori (MAP) solution x* = arg min E(x)x
=
![Page 12: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/12.jpg)
GraphCut for Inference
Cut: A collection of edges which separates the Source from the Sink
MinCut: The cut with minimum weight (sum of edge weights)
Solution: Global optimum (MinCut) in polynomial time
Image
Sink
Source
Foreground
Background
Cut
![Page 13: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/13.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2)
Sink (1)
Source (0)
a1 a2
Graph Construction for Boolean Random Variables
![Page 14: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/14.jpg)
Energy Minimization using Graph cuts
Sink (1)
Source (0)
a1 a2
EMRF(a1,a2) = 2a1
2t-edges
(unary terms)
![Page 15: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/15.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1
Sink (1)
Source (0)
a1 a2
2
5
![Page 16: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/16.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2
Sink (1)
Source (0)
a1 a2
2
5
9
4
![Page 17: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/17.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2
Sink (1)
Source (0)
a1 a2
2
5
9
4
2
n-edges(pair-wise term)
![Page 18: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/18.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2 + ā1a2
Sink (1)
Source (0)
a1 a2
2
5
9
4
2
1
![Page 19: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/19.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2 + ā1a2
Sink (1)
Source (0)
a1 a2
2
5
9
4
2
1
![Page 20: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/20.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2 + ā1a2
Sink (1)
Source (0)
a1 a2
2
5
9
4
2
1
a1 = 1 a2 = 1
EMRF(1,1) = 11
Cost of st-cut = 11
![Page 21: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/21.jpg)
Energy Minimization using Graph cuts
EMRF(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2 + ā1a2
Sink (1)
Source (0)
a1 a2
2
5
9
4
2
1
a1 = 1 a2 = 0
EMRF(1,0) = 8
Cost of st-cut = 8
![Page 22: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/22.jpg)
• The Max-flow Problem- Edge capacity and flow balance constraints
Computing the st-mincut from Max-flow algorithms
• Notation- Residual capacity (edge capacity – current flow)
• Simple Augmenting Path based Algorithms- Repeatedly find augmenting paths and push flow.- Saturated edges constitute the st-mincut. [Ford-Fulkerson Theorem]
Sink (1)
Source (0)
a1 a2
2
5
9
42
1
![Page 23: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/23.jpg)
Minimum s-t cuts algorithms
Augmenting paths [Ford & Fulkerson, 1962]
Push-relabel [Goldberg-Tarjan, 1986]
![Page 24: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/24.jpg)
“Augmenting Paths”
• Find a path from S to T along non-saturated edges
“source”
A graph with two terminals
S T
“sink” Increase flow along
this path until some edge saturates
![Page 25: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/25.jpg)
“Augmenting Paths”
• Find a path from S to T along non-saturated edges
“source”
A graph with two terminals
S T
“sink” Increase flow along
this path until some edge saturates
Find next path… Increase flow…
![Page 26: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/26.jpg)
“Augmenting Paths”
• Find a path from S to T along non-saturated edges
“source”
A graph with two terminals
S T
“sink” Increase flow along
this path until some edge saturates
Iterate until … all paths from S to T have at least one saturated edge
MAX FLOW MIN CUT
![Page 27: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/27.jpg)
MRF, Graphical Model
Probability for a labelling consists of• Likelihood Unary potential based on colour of pixel• Prior which favours same labels for neighbours (pairwise potentials)
Prior Ψxy(mx,my)
Unary Potential Φx(D|mx)
D (pixels)
m (labels)
Image Plane
x
y
mx
my
![Page 28: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/28.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Prior
x …
y …
…
…
x …
y …
…
…
Φx(D|obj)
Φx(D|bkg)Ψxy(mx,my)
Likelihood Ratio (Colour)
![Page 29: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/29.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Pair-wise TermsLikelihood Ratio (Colour)
![Page 30: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/30.jpg)
Contrast-Dependent MRF
Probability of labelling in addition has• Contrast term which favours boundaries to lie on image edges
D (pixels)
m (labels)
Image Plane
Contrast Term Φ(D|mx,my)
x
y
mx
my
![Page 31: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/31.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Pair-wise Term
x …
y …
…
…
x …
y …
…
…
Likelihood Ratio (Colour)
Ψxy(mx,my)+Φ(D|mx,my)
Φx(D|obj)
Φx(D|bkg)
![Page 32: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/32.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Prior + ContrastLikelihood Ratio (Colour)
![Page 33: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/33.jpg)
Object Graphical Model
Probability of labelling in addition has• Unary potential which depend on distance from Θ (shape parameter)
D (pixels)
m (labels)
Θ (shape parameter)
Image Plane
Object CategorySpecific MRFx
y
mx
my
Unary PotentialΦx(mx|Θ)
![Page 34: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/34.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Prior + ContrastDistance from Θ
Shape Prior Θ
![Page 35: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/35.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Prior + ContrastLikelihood + Distance from Θ
Shape Prior Θ
![Page 36: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/36.jpg)
Example
Cow Image Object SeedPixels
Background SeedPixels
Prior + ContrastLikelihood + Distance from Θ
Shape Prior Θ
![Page 37: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/37.jpg)
Thought
• We can imagine rather than using user input to define histograms we use object detection.
![Page 38: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/38.jpg)
Shape Model
• BMVC 2004
![Page 39: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/39.jpg)
Pictorial Structure
Fischler & Elschlager, 1973
Yuille, ‘91 Brunelli & Poggio, ‘93 Lades, v.d. Malsburg et al. ‘93 Cootes, Lanitis, Taylor et al. ‘95 Amit & Geman, ‘95, ‘99 Perona et al. ‘95, ‘96, ’98, ‘00
![Page 40: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/40.jpg)
Layered Pictorial Structures (LPS)• Generative model
• Composition of parts + spatial layout
Layer 2
Layer 1
Parts in Layer 2 can occlude parts in Layer 1
Spatial Layout(Pairwise Configuration)
![Page 41: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/41.jpg)
Layer 2
Layer 1
Transformations
Θ1
P(Θ1) = 0.9
Cow Instance
Layered Pictorial Structures (LPS)
![Page 42: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/42.jpg)
Layer 2
Layer 1
Transformations
Θ2
P(Θ2) = 0.8
Cow Instance
Layered Pictorial Structures (LPS)
![Page 43: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/43.jpg)
Layer 2
Layer 1
Transformations
Θ3
P(Θ3) = 0.01
Unlikely Instance
Layered Pictorial Structures (LPS)
![Page 44: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/44.jpg)
How to learn LPS
• From video via motion segmentation see Kumar Torr and Zisserman ICCV 2005.
• Graph cut based method.
![Page 45: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/45.jpg)
Examples
![Page 46: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/46.jpg)
LPS for Detection• Learning
– Learnt automatically using a set of examples
• Detection– Matches LPS to image using Loopy Belief Propagation
– Localizes object parts
![Page 47: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/47.jpg)
Detection
• Like a proposal process.
![Page 48: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/48.jpg)
Pictorial Structures (PS)
PS = 2D Parts + Configuration
Fischler and Eschlager. 1973
Aim: Learn pictorial structures in an unsupervised manner
• Identify parts• Learn configuration• Learn relative depth of parts
Parts +Configuration +Relative depth
LayeredPictorialStructures(LPS)
![Page 49: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/49.jpg)
MotivationMatching Pictorial Structures - Felzenszwalb et al - 2001
Part likelihood Spatial Prior
Outline
Texture
Image
P1 P3
P2
(x,y,,)
MRF
![Page 50: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/50.jpg)
Motivation
Image
P1 P3
P2
(x,y,,)
MRF
• Unary potentials are negative log likelihoods
Valid pairwise configuration
Potts Model
Matching Pictorial Structures - Felzenszwalb et al - 2001
12
YES NO
![Page 51: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/51.jpg)
Motivation
P1 P3
P2
(x,y,,)
Pr(Cow)Image
• Unary potentials are negative log likelihoodsMatching Pictorial Structures - Felzenszwalb et al - 2001
Valid pairwise configuration
Potts Model
12
YES NO
![Page 52: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/52.jpg)
Bayesian Formulation (MRF)
• D = image.
• Di = pixels Є pi , given li
• (PDF Projection Theorem. )
z = sufficient statistics
• ψ(li,lj) = const, if valid configuration
= 0, otherwise.
Pott’s model
![Page 53: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/53.jpg)
Combinatorial Optimization
• SDP formulation (Torr 2001, AI stats), best bound
• SOCP formulation (Kumar, Torr & Zisserman this conference), good compromise of speed and accuracy.
• LBP (Huttenlocher, many), worst bound.
![Page 54: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/54.jpg)
Defining the likelihood
• We want a likelihood that can combine both the outline and the interior appearance of a part.
• Define features which will be sufficient statistics to discriminate foreground and background:
![Page 55: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/55.jpg)
Features
• Outline: z1 Chamfer distance
• Interior: z2 Textons
• Model joint distribution of z1 z2 as a 2D Gaussian.
![Page 56: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/56.jpg)
Chamfer Match Score
• Outline (z1) : minimum chamfer distances over multiple outline exemplars
• dcham= 1/n Σi min{ minj ||ui-vj ||, τ }
Image Edge Image Distance Transform
![Page 57: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/57.jpg)
Texton Match Score
• Texture(z2) : MRF classifier – (Varma and Zisserman, CVPR ’03)
• Multiple texture exemplars x of class t
• Textons: 3 X 3 square neighbourhood
• VQ in texton space
• Descriptor: histogram of texton labelling
• χ2 distance
![Page 58: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/58.jpg)
Bag of Words/Histogram of Textons
• Having slagged off BoW’s I reveal we used it all along, no big deal.
• So this is like a spatially aware bag of words model…
• Using a spatially flexible set of templates to work out our bag of words.
![Page 59: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/59.jpg)
2. Fitting the Model
• Cascades of classifiers– Efficient likelihood evaluation
• Solving MRF– LBP, use fast algorithm– GBP if LBP doesn’t converge– Could use Semi Definite Programming (2003)– Recent work second order cone programming
method best CVPR 2006.
![Page 60: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/60.jpg)
Efficient Detection of parts
• Cascade of classifiers
• Top level use chamfer and distance transform for efficient pre filtering
• At lower level use full texture model for verification, using efficient nearest neighbour speed ups.
![Page 61: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/61.jpg)
Cascade of Classifiers-for each part
Y. Amit, and D. Geman, 97?; S. Baker, S. Nayer 95
![Page 62: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/62.jpg)
High Levels based on Outline
(x,y)
![Page 63: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/63.jpg)
Low levels on Texture
• The top levels of the tree use outline to eliminate patches of the image.
• Efficiency: Using chamfer distance and pre computed distance map.
• Remaining candidates evaluated using full texture model.
![Page 64: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/64.jpg)
Efficient Nearest Neighbour• Goldstein, Platt and Burges (MSR Tech Report, 2003)
Conversion from fixeddistance to rectangle search
• bitvectorij(Rk) = 1
= 0• Nearest neighbour of x• Find intervals in all dimensions• ‘AND’ appropriate bitvectors• Nearest neighbour search on pruned exemplars
Rk Є Iiin dimension j
![Page 65: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/65.jpg)
Inspiration
• ICCV 2003, Stenger et al.
• System developed for tracking articulated objects such as hands or bodies, based on efficient detection.
![Page 66: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/66.jpg)
Evaluation at Multiple Resolutions
Tree: 9000 templates of hand pointing, rigid
![Page 67: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/67.jpg)
Templates at Level 1
![Page 68: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/68.jpg)
Templates at Level 2
![Page 69: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/69.jpg)
Templates at Level 3
![Page 70: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/70.jpg)
Tracking Results
![Page 71: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/71.jpg)
Marginalize out Pose
• Get an initial estimate of pose distribution.
• Use EM to marginalize out pose.
![Page 72: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/72.jpg)
SegmentationImage
ResultsUsing LPS Model for Cow
![Page 73: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/73.jpg)
In the absence of a clear boundary between object and background
SegmentationImage
ResultsUsing LPS Model for Cow
![Page 74: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/74.jpg)
SegmentationImage
ResultsUsing LPS Model for Cow
![Page 75: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/75.jpg)
SegmentationImage
ResultsUsing LPS Model for Cow
![Page 76: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/76.jpg)
SegmentationImage
ResultsUsing LPS Model for Horse
![Page 77: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/77.jpg)
SegmentationImage
ResultsUsing LPS Model for Horse
![Page 78: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/78.jpg)
Our Method Leibe and SchieleImage
Results
![Page 79: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/79.jpg)
Thoughts
Object models can help segmentation.
But good models hard to obtain.
![Page 80: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/80.jpg)
Do we really need accurate models?
• Segmentation boundary can be extracted from edges
• Rough 3D Shape-prior enough for region disambiguation
![Page 81: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/81.jpg)
Energy of the Pose-specific MRFEnergy to be
minimizedUnary term
Shape prior
Pairwise potential
Potts model
But what should be the value of θ?
![Page 82: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/82.jpg)
The different terms of the MRF
Original image
Likelihood of being foreground given a
foreground histogram
Grimson-Stauffer
segmentation
Shape prior model
Shape prior (distance transform)
Likelihood of being foreground
given all the terms
Resulting Graph-Cuts
segmentation
![Page 83: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/83.jpg)
Can segment multiple views simultaneously
![Page 84: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/84.jpg)
Solve via gradient descent
• Comparable to level set methods
• Could use other approaches (e.g. Objcut)
• Need a graph cut per function evaluation
![Page 85: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/85.jpg)
Formulating the Pose Inference Problem
![Page 86: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/86.jpg)
But…But…
… to compute the MAP of E(x) w.r.t the pose, it means that the unary terms will be changed at EACHEACH iteration and the maxflow recomputed!
However…However… Kohli and Torr showed how dynamic graph cuts can
be used to efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV05).
![Page 87: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/87.jpg)
Dynamic Graph Cuts
PB SB
cheaperoperation
computationally
expensive operation
Simplerproblem
PB*
differencesbetweenA and B
A and Bsimilar
PA SA
solve
![Page 88: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/88.jpg)
Dynamic Image Segmentation
Image
Flows in n-edges Segmentation Obtained
![Page 89: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/89.jpg)
9 + α
4 + α
Adding a constant to both thet-edges of a node does not change the edges constituting the st-mincut.
Key Observation
Sink (1)
Source (0)
a1 a2
2
5
2
1
E (a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2 + ā1a2
E*(a1,a2 ) = E(a1,a2) + α(a2+ā2)
= E(a1,a2) + α [a2+ā2 =1]
Reparametrization
![Page 90: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/90.jpg)
9 + α
4
All reparametrizations of the graph are sums of these two types.
Other type of reparametrization
Sink (1)
Source (0)
a1 a2
2
5 + α
2 + α
1 - α
Reparametrization, second type
E* (a1,a2) = E (a1,a2) + α ā1+ α a2 + α a1ā2 - α ā1a2
E* (a1,a2) = E (a1,a2) + α (ā1+ a2 + a1(1-a2) - ā1a2)
E* (a1,a2) = E (a1,a2) + α
![Page 91: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/91.jpg)
9 + α
4
All reparametrizations of the graph are sums of these two types.
Other type of reparametrization
Sink (1)
Source (0)
a1 a2
2
5 + α
2 + α
1 - α
Reparametrization, second type
Both maintain the solution and add a constant α to the energy.
![Page 92: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/92.jpg)
Reparametrization
• Nice result (easy to prove)
• All other reparametrizations can be viewed in terms of these two basic operations.
• Proof in Hammer, and also in one of Vlad’s recent papers.
![Page 93: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/93.jpg)
s
Gt
original graph
0/9
0/7
0/5
0/2 0/4
0/1
xi xj
flow/residual capacity
Graph Re-parameterization
![Page 94: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/94.jpg)
s
Gt
original graph
0/9
0/7
0/5
0/2 0/4
0/1
xi xj
flow/residual capacity
Graph Re-parameterization
t residual graph
xi xj0/12
5/2
3/2
1/0
2/0 4/0st-mincut
ComputeMaxflow
Gr
Edges cut
![Page 95: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/95.jpg)
Update t-edge Capacities
s
Gr
t residual graph
xi xj0/12
5/2
3/2
1/0
2/0 4/0
![Page 96: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/96.jpg)
Update t-edge Capacities
s
Gr
t residual graph
xi xj0/12
5/2
3/2
1/0
2/0 4/0
capacitychanges from
7 to 4
![Page 97: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/97.jpg)
Update t-edge Capacities
s
G`t
updated residual graph
xi xj0/12
5/-1
3/2
1/0
2/0 4/0
capacitychanges from
7 to 4
edge capacityconstraint violated!(flow > capacity)
= 5 – 4 = 1
excess flow (e) = flow – new capacity
![Page 98: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/98.jpg)
add e to both t-edges connected to node i
Update t-edge Capacities
s
G`t
updated residual graph
xi xj0/12
3/2
1/0
2/0 4/0
capacitychanges from
7 to 4
edge capacityconstraint violated!(flow > capacity)
= 5 – 4 = 1
excess flow (e) = flow – new capacity
5/-1
![Page 99: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/99.jpg)
Update t-edge Capacities
s
G`t
updated residual graph
xi xj0/12
3/2
1/0
4/0
capacitychanges from
7 to 4
excess flow (e) = flow – new capacity
add e to both t-edges connected to node i
= 5 – 4 = 1
5/0
2/1
edge capacityconstraint violated!(flow > capacity)
![Page 100: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/100.jpg)
Update n-edge Capacities
s
Gr
t
residual graph
xi xj0/12
5/2
3/2
1/0
2/0 4/0
• Capacity changes from 5 to 2
![Page 101: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/101.jpg)
Update n-edge Capacities
s
t
Updated residual graph
xi xj0/12
5/2
3/-1
1/0
2/0 4/0
G`
• Capacity changes from 5 to 2- edge capacity constraint violated!
![Page 102: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/102.jpg)
Update n-edge Capacities
s
t
Updated residual graph
xi xj0/12
5/2
3/-1
1/0
2/0 4/0
G`
• Capacity changes from 5 to 2- edge capacity constraint violated!
• Reduce flow to satisfy constraint
![Page 103: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/103.jpg)
Update n-edge Capacities
s
t
Updated residual graph
xi xj0/11
5/2
2/0
1/0
2/0 4/0
excess
deficiency
G`
• Capacity changes from 5 to 2- edge capacity constraint violated!
• Reduce flow to satisfy constraint- causes flow imbalance!
![Page 104: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/104.jpg)
Update n-edge Capacities
s
t
Updated residual graph
xi xj0/11
5/2
2/0
1/0
2/0 4/0
deficiency
excess
G`
• Capacity changes from 5 to 2- edge capacity constraint violated!
• Reduce flow to satisfy constraint- causes flow imbalance!
• Push excess flow to/from the terminals
• Create capacity by adding α = excess to both t-edges.
![Page 105: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/105.jpg)
Update n-edge Capacities
Updated residual graph
• Capacity changes from 5 to 2- edge capacity constraint violated!
• Reduce flow to satisfy constraint- causes flow imbalance!
• Push excess flow to the terminals
• Create capacity by adding α = excess to both t-edges.
G`
xi xj0/11
5/3
2/0
2/0
3/0 4/1
s
t
![Page 106: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/106.jpg)
Update n-edge Capacities
Updated residual graph
• Capacity changes from 5 to 2- edge capacity constraint violated!
• Reduce flow to satisfy constraint- causes flow imbalance!
• Push excess flow to the terminals
• Create capacity by adding α = excess to both t-edges.
G`
xi xj0/11
5/3
2/0
2/0
3/0 4/1
s
t
![Page 107: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/107.jpg)
First segmentation problem MAP solution
Ga
Our Algorithm
Gb
second segmentation problem
Maximum flow
residual graph (Gr)
G`
differencebetween
Ga and Gbupdated residual
graph
![Page 108: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/108.jpg)
Dynamic Graph Cut vs Active Cuts
• Our method flow recycling
• AC cut recycling
• Both methods: Tree recycling
![Page 109: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/109.jpg)
Experimental Analysis
MRF consisting of 2x105 latent variables connected in a 4-neighborhood.
Running time of the dynamic algorithm
![Page 110: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/110.jpg)
Experimental Analysis
Image resolution: 720x576 static: 220 msec dynamic (optimized): 50 msec
Image segmentation in videos (unary & pairwise terms)
Graph CutsDynamic Graph Cuts
EnergyMRF =
![Page 111: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/111.jpg)
Segmentation Comparison
Gri
mson
-G
rim
son
-S
tau
ffer
Sta
uff
er
Bath
ia0
Bath
ia0
44O
ur
Ou
r m
eth
od
meth
od
![Page 112: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/112.jpg)
Segmentation + Pose inference
[Images courtesy: M. Black, L. Sigal]
![Page 113: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/113.jpg)
Segmentation + Pose inference
[Images courtesy: Vicon]
![Page 114: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/114.jpg)
Max-Marginals for Parameter Learning
• Use Max-marginals instead of Pseudo marginals from LBP (from Sanjiv Kumar)
![Page 115: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/115.jpg)
Volumetric Graph cuts
Source
Sink
Min cut
Can apply to 3D
![Page 116: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/116.jpg)
Results
• Model House
![Page 117: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/117.jpg)
Results• Stone carving
![Page 118: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/118.jpg)
Results
• Haniwa
![Page 119: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/119.jpg)
Conclusion
• Combining pose inference and segmentation worth investigating.
• Lots more to do to extend MRF models
• Combinatorial Optimization is a very interesting and hot area in vision at the moment.
• Algorithms are as important as models.
![Page 120: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/120.jpg)
Ask Pushmeet for code Demo
![Page 121: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/121.jpg)
![Page 122: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/122.jpg)
![Page 123: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/123.jpg)
![Page 124: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/124.jpg)
![Page 125: Combinatorial Optimization and Computer Vision Philip Torr](https://reader036.vdocuments.net/reader036/viewer/2022062223/5515eb38550346dd6f8b5177/html5/thumbnails/125.jpg)