ICCV 2009: MAP Inference in Discrete Models: Part 5
TRANSCRIPT
Course Program
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:15-17.30 Recent Advances: Dual-decomposition, higher-order, etc. (Carsten Rother + Pawan Kumar)
All material will be online (after the conference): http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
Comparison of Optimization Methods
Carsten Rother
Microsoft Research Cambridge
Why is good optimization important?
[Data courtesy of Oliver Woodford]
Problem: Minimize a binary 4-connected pair-wise MRF (choose a colour-mode at each pixel)
Input: Image sequence
Output: New view
[Fitzgibbon et al. '03]
Why is good optimization important?
Belief Propagation; ICM; Simulated Annealing
Ground Truth
QPBOP [Boros '06, Rother '07]
Global Minimum
Graph Cut with truncation [Rother et al. '05]
Comparison papers
• Binary, highly-connected MRFs [Rother et al. '07]
• Multi-label, 4-connected MRFs [Szeliski et al. '06, '08] (all online: http://vision.middlebury.edu/MRF/)
• Multi-label, highly-connected MRFs [Kolmogorov et al. '06]
Random MRFs
o Three important factors:
o Connectivity (av. degree of a node)
o Unary strength w (see the energy below)
o Percentage of non-submodular terms (NS)
E(x) = w ∑ θi (xi) + ∑ θij (xi,xj)
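The three factors above can be wired into a small random-MRF generator; a minimal sketch (all function names and the sampling scheme are illustrative assumptions, not the setup of the comparison paper):

```python
import random

def random_binary_mrf(n, degree, unary_strength, frac_nonsub, seed=0):
    """Generate unaries and pairwise tables for a random binary MRF.

    A pairwise table t is submodular iff t(0,0) + t(1,1) <= t(0,1) + t(1,0);
    frac_nonsub controls the fraction of non-submodular (NS) terms.
    """
    rng = random.Random(seed)
    # Unary strength w scales the data terms theta_i.
    unary = [[unary_strength * rng.uniform(-1, 1) for _ in range(2)]
             for _ in range(n)]
    edges = {}
    n_edges = n * degree // 2  # average node degree = connectivity
    while len(edges) < n_edges:
        i, j = rng.sample(range(n), 2)
        e = (min(i, j), max(i, j))
        if e in edges:
            continue
        t = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
        # Force (non-)submodularity by swapping the columns if needed.
        sub = t[0][0] + t[1][1] <= t[0][1] + t[1][0]
        want_sub = rng.random() >= frac_nonsub
        if sub != want_sub:
            t[0][0], t[0][1] = t[0][1], t[0][0]
            t[1][0], t[1][1] = t[1][1], t[1][0]
        edges[e] = t
    return unary, edges

def energy(x, unary, edges):
    """E(x) = w * sum_i theta_i(x_i) + sum_ij theta_ij(x_i, x_j)
    (w is already folded into the unary tables here)."""
    e = sum(u[xi] for u, xi in zip(unary, x))
    e += sum(t[x[i]][x[j]] for (i, j), t in edges.items())
    return e
```
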
Computer Vision Problems
[Plots: percentage of unlabeled pixels vs. time (sec); energy vs. time (sec)]
Conclusions:
• Connectivity is a crucial factor
• Simple methods like Simulated Annealing are sometimes best
Diagram Recognition [Szummer et al ‘04]
71 nodes; 4.8 con.; 28% non-sub; 0.5 unary strength
Ground truth
GraphCut E=119 (0 sec); ICM E=999 (0 sec); BP E=25 (0 sec)
QPBO: 56.3% unlabeled (0 sec); QPBOP (0 sec) – Global Min.; Sim. Ann. E=0 (0.28 sec)
• 2700 test cases: QPBO solved nearly all
(QPBOP solves all)
Binary Image Deconvolution
50x20 nodes; 80 con.; 100% non-sub; 109 unary strength
Ground Truth Input
MRF: 80 connectivity – illustration
5x5 blur kernel (uniform, weight 0.2 per tap):
0.2 0.2 0.2 0.2 0.2
0.2 0.2 0.2 0.2 0.2
0.2 0.2 0.2 0.2 0.2
0.2 0.2 0.2 0.2 0.2
0.2 0.2 0.2 0.2 0.2
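The 80-connectivity follows from the kernel size: two pixels interact whenever some 5x5 window covers both, i.e. whenever their offset lies in a 9x9 box, giving 9·9 − 1 = 80 neighbours per node. A minimal sketch of the forward blur model (pure Python; names illustrative):

```python
def blur(img, k=0.2, r=2):
    """Convolve a 2D binary image with a uniform (2r+1)x(2r+1) kernel
    of weight k per tap (zero padding), as in the slide's 5x5 kernel."""
    H, W = len(img), len(img[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            s = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        s += img[yy][xx]
            out[y][x] = k * s
    return out
```

Matching the blurred input in a squared-error data term then couples every pair of pixels whose 5x5 windows overlap, which is what makes this MRF so densely connected.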
Binary Image Deconvolution
50x20 nodes; 80 con.; 100% non-sub; 109 unary strength
Ground Truth; Input; QPBO 80% unlab. (0.1 sec)
ICM E=6 (0.03 sec); QPBOP 80% unlab. (0.9 sec); GC E=999 (0 sec)
BP E=71 (0.9 sec); QPBOP+BP+I E=8.1 (31 sec); Sim. Ann. E=0 (1.3 sec)
Comparison papers
• Binary, highly-connected MRFs [Rother et al. '07]
  Conclusion: low connectivity tractable: QPBO(P)
• Multi-label, 4-connected MRFs [Szeliski et al. '06, '08] (all online: http://vision.middlebury.edu/MRF/)
• Multi-label, highly-connected MRFs [Kolmogorov et al. '06]
Multiple labels – 4 connected
[Szeliski et al. '06, '08]
stereo
Panoramic stitching
Image segmentation; de-noising; in-painting
“Attractive Potentials”
Stereo
Conclusions:
– Solved by alpha-expansion and TRW-S (within 0.01%-0.9% of the lower bound – true for all tests!)
– Expansion-move always better than swap-move
[Images: input, ground truth, TRW-S result]
De-noising and in-painting
Conclusion:
– Alpha-expansion has problems with smooth areas (potential solution: fusion-move [Lempitsky et al. '07])
Noisy input; Ground truth; TRW-S; Alpha-exp.
Panoramic stitching
• Unordered labels are (slightly) more challenging
Comparison papers
• Binary, highly-connected MRFs [Rother et al. '07]
  Conclusion: low connectivity tractable (QPBO)
• Multi-label, 4-connected MRFs [Szeliski et al. '06, '08] (all online: http://vision.middlebury.edu/MRF/)
  Conclusion: solved by expansion-move; TRW-S (within 0.01-0.9% of the lower bound)
• Multi-label, highly-connected MRFs [Kolmogorov et al. '06]
Multiple labels – highly connected
Stereo with occlusion:
Each pixel is connected to D pixels in the other image
E(d): {1,…,D}^2n → R
[Kolmogorov et al. '06]
Multiple labels – highly connected
• Alpha-exp. considerably better than message passing
Tsukuba: 16 labels Cones: 56 labels
Potential reason: smaller connectivity in one expansion-move
Comparison: 4-con. versus highly-con.
Conclusion: highly connected graphs are harder to optimize
(lower bound scaled to 100%)
             Tsukuba (E)   Map (E)     Venus (E)
highly-con.  103.09%       103.28%     102.26%
4-con.       100.004%      100.056%    100.014%
Comparison papers
• Binary, highly-connected MRFs [Rother et al. '07]
  Conclusion: low connectivity tractable (QPBO)
• Multi-label, 4-connected MRFs [Szeliski et al. '06, '08] (all online: http://vision.middlebury.edu/MRF/)
  Conclusion: solved by alpha-exp.; TRW (within 0.9% of the lower bound)
• Multi-label, highly-connected MRFs [Kolmogorov et al. '06]
  Conclusion: challenging optimization (alpha-exp. best)
How to efficiently optimize general highly-connected (higher-order) MRFs is still an open question
Advanced Topics – Optimizing Higher-Order MRFs
Carsten Rother
Microsoft Research Cambridge
Challenging Optimization Problems
• How to solve higher-order MRFs:
• Possible Approaches:
- Convert to Pairwise MRF (Pushmeet has explained)
- Branch & MinCut (Pushmeet has explained)
- Add global constraint to LP relaxation
- Dual Decomposition
Add global constraints to LP
Basic idea: for a pixel set T, add the constraint ∑i Є T xi ≥ 1
References:
[K. Kolev et al. ECCV '08] silhouette constraint
[Nowozin et al. CVPR '09] connectivity prior
[Lempitsky et al. ICCV '09] bounding box prior (see talk on Thursday)
Dual Decomposition
• Well known in the optimization community [Bertsekas '95, '99]
• Other names: "Master-Slave" [Komodakis et al. '07, '09]
• Examples of dual-decomposition approaches:
  – Solve the LP of TRW [Komodakis et al. ICCV '07]
  – Image segmentation with connectivity prior [Vicente et al. CVPR '08]
  – Feature matching [Torresani et al. ECCV '08]
  – Optimizing higher-order clique MRFs [Komodakis et al. CVPR '09]
  – Marginal Probability Field [Woodford et al. ICCV '09]
  – Jointly optimizing appearance and segmentation [Vicente et al. ICCV '09]
Dual Decomposition
min_x E(x) = min_x [ E1(x) + θTx + E2(x) – θTx ]   (E hard to optimize)
≥ min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2] = L(θ)   ("lower bound"; each subproblem possible to optimize)
• θ is called the dual vector (same size as x)
• Goal: max_θ L(θ) ≤ min_x E(x)
• Properties:
  • L(θ) is concave (the optimal bound can be found)
  • If x1 = x2 then the problem is solved (not guaranteed)
Why is the lower bound a concave function?
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2] = L1(θ) + L2(θ),  L(θ): Rn → R
For each fixed labelling x1, the term E1(x1) + θTx1 is linear in θ (the plot shows the lines θTx'1, θTx''1, θTx'''1); L1(θ) is their lower envelope.
L(θ) concave since a sum of concave functions
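The envelope argument can be written out in one line; for any fixed labelling the bracketed term is affine in θ:

```latex
% For every fixed x_1 the bracket is affine in theta, so L_1 is a
% pointwise minimum of affine functions, hence concave:
L_1\big(\lambda\theta + (1-\lambda)\theta'\big)
  = \min_{x_1}\Big[\lambda\big(E_1(x_1)+\theta^\top x_1\big)
                 + (1-\lambda)\big(E_1(x_1)+\theta'^\top x_1\big)\Big]
  \;\ge\; \lambda\,L_1(\theta) + (1-\lambda)\,L_1(\theta'),
  \qquad \lambda \in [0,1].
```

The inequality holds because the minimum of a sum is at least the sum of the minima; the same argument applies to L2, and a sum of concave functions is concave.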
How to maximize the lower bound?
If L(θ) were differentiable, we would use gradient ascent.
L(θ) is not differentiable … so use a subgradient approach [Shor '85]
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2],  L(θ): Rn → R
A subgradient g at θ' comes from the minimizers of the subproblems: for L1 alone, g = x'1 (the slope of the active line θTx'1); for the full bound, g = x1 – x2.
Update step: θ'' = θ' + λ g = θ' + λ (x1 – x2)
Dual Decomposition
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2]
Subgradient optimization:
"Master": update θ = θ + λ (x1 – x2), send θ to the slaves
"Slaves":
Subproblem 1: x1 = argmin_x1 [E1(x1) + θTx1]
Subproblem 2: x2 = argmin_x2 [E2(x2) – θTx2]
Example optimization
• Guaranteed to converge to the optimal bound L(θ*)
• Choose the step width λ correctly ([Bertsekas '95])
• Pick the solution x as the better of x1 and x2
• E and L can both increase and decrease during optimization
• Each step: θ gets closer to the optimal θ*
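The master-slave loop can be sketched end to end on a toy problem. Everything here is an illustrative assumption, not the authors' implementation: the slaves are solved by brute force (toy sizes only), the step width is the classic diminishing λ = 1/t, and the primal solution is the best of x1, x2 seen so far.

```python
from itertools import product

def dual_decomposition(E1, E2, n, iters=200):
    """Maximize L(theta) = min_x1 [E1(x1)+theta.x1] + min_x2 [E2(x2)-theta.x2]
    by subgradient ascent: theta <- theta + lam * (x1 - x2)."""
    def slave(f, sign, theta):
        # Brute-force slave: exact minimizer over {0,1}^n.
        best, best_e = None, float('inf')
        for x in product((0, 1), repeat=n):
            e = f(x) + sign * sum(t * xi for t, xi in zip(theta, x))
            if e < best_e:
                best, best_e = x, e
        return best, best_e

    theta = [0.0] * n
    best_bound, best_x, best_E = -float('inf'), None, float('inf')
    for t in range(1, iters + 1):
        x1, e1 = slave(E1, +1, theta)
        x2, e2 = slave(E2, -1, theta)
        best_bound = max(best_bound, e1 + e2)   # L can go down; keep the best
        for x in (x1, x2):                      # best primal solution seen
            if E1(x) + E2(x) < best_E:
                best_E, best_x = E1(x) + E2(x), x
        lam = 1.0 / t                           # diminishing step width
        theta = [th + lam * (a - b) for th, a, b in zip(theta, x1, x2)]
    return best_x, best_E, best_bound
```

By weak duality every value e1 + e2 is a valid lower bound on min E, so tracking the maximum over iterations is safe even though L(θ) is not monotone.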
Why can the lower bound go down?
L(θ) is the lower envelope of planes (one plane per labelling; in 3D: planes over a 2D θ), so a subgradient step along the currently active plane can land at a θ' with L(θ') ≤ L(θ).
Analyse the model
L(θ) = min_x1 [E1(x1) + θTx1] + min_x2 [E2(x2) – θTx2]
Update step: θ'' = θ' + λ (x1 – x2)
Look at pixel p:
Case 1: x1p = x2p, then θ''p = θ'p (no change)
Case 2: x1p = 1, x2p = 0, then θ''p = θ'p + λ: pushes x1p towards 0 and x2p towards 1
Case 3: x1p = 0, x2p = 1, then θ''p = θ'p – λ: pushes x1p towards 1 and x2p towards 0
Example 1: Segmentation and Connectivity
Foreground object must be connected:
User input; Standard MRF; Standard MRF + h
Zoom in
E(x) = ∑ θi (xi) + ∑ θij (xi,xj) + h(x)
h(x) = ∞ if x is not 4-connected, 0 otherwise
[Vicente et al. '08]
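The hard constraint h(x) is straightforward to evaluate with a flood fill over the foreground; a minimal sketch (function names are illustrative):

```python
from collections import deque

def is_4connected(mask):
    """True iff the foreground (1-pixels) of a binary 2D mask forms a single
    4-connected component (an empty foreground counts as connected)."""
    H, W = len(mask), len(mask[0])
    fg = [(y, x) for y in range(H) for x in range(W) if mask[y][x]]
    if not fg:
        return True
    seen = {fg[0]}
    q = deque([fg[0]])
    while q:                                   # BFS over 4-neighbours
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            p = (y + dy, x + dx)
            if (0 <= p[0] < H and 0 <= p[1] < W
                    and mask[p[0]][p[1]] and p not in seen):
                seen.add(p)
                q.append(p)
    return len(seen) == len(fg)

def h(mask):
    """Connectivity prior from the slide: infinity if not 4-connected, else 0."""
    return 0.0 if is_4connected(mask) else float('inf')
```

Evaluating h is easy; the hard part, which dual decomposition addresses, is minimizing the full energy subject to it.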
Example 1: Segmentation and Connectivity
E(x) = E1(x) + h(x), where E1(x) = ∑ θi (xi) + ∑ θij (xi,xj) and h(x) = ∞ if x is not 4-connected, 0 otherwise
Derive lower bound:
min_x E(x) = min_x [ E1(x) + θTx + h(x) – θTx ]
≥ min_x1 [E1(x1) + θTx1] + min_x2 [h(x2) – θTx2] = L(θ)
Subproblem 1: unary + pairwise terms. Global minimum: GraphCut
Subproblem 2: unary terms + connectivity constraint. Global minimum: Dijkstra
But: the lower bound was not tight for any example.
Example 1: Segmentation and Connectivity
E(x) = ∑ θi (xi) + ∑ θij (xi,xj) + h(x), with h(x) = ∞ if x is not 4-connected, 0 otherwise; x' is the indicator vector of all pairwise terms
Derive lower bound (h can appear twice since h Є {0, ∞}):
min_x E(x) = min_x [ E1(x) + θTx + θ'Tx' + h(x) – θTx + h(x) – θ'Tx' ]
≥ min_x1,x'1 [E1(x1) + θTx1 + θ'Tx'1] + min_x2 [h(x2) – θTx2] + min_x3,x'3 [h(x3) – θ'Tx'3] = L(θ, θ')
Subproblem 1: unary + pairwise terms. Global minimum: GraphCut
Subproblem 2: unary terms + connectivity constraint. Global minimum: Dijkstra
Subproblem 3: pairwise terms + connectivity constraint. Lower bound: based on minimal paths on a dual graph
Results: Segmentation and Connectivity
Global optimum in 12 out of 40 cases.
[Images: input, extra input, GraphCut, global minimum]
Heuristic method, DijkstraGC, which is faster and gives empirically the same or better results.
[Vicente et al. '08]
Example 2: Dual of the LP Relaxation (from Pawan Kumar's part)
[Wainwright et al., 2001]
Decompose the grid graph (nodes Va … Vi) into trees i = 1, …, 6, e.g. the rows and the columns; each tree i gets its own parameter vector θi.
Dual of LP:
max over {θi} of ∑i q*(θi) = L({θi}),  subject to ∑i θi = θ
where q*(θi) = min_xi θiT xi
(overcomplete vectors: θ = (θa0, θa1, …, θab00, θab01, …), x = (xa0, xa1, …, xab00, xab01, …))
Example 2: Dual of the LP Relaxation
min_x θTx = min_x ∑i θiT x   ("original problem")
≥ ∑i min_xi θiT xi = ∑i q*(θi) = L({θi})   ("lower bound", over the i different trees)
subject to ∑i θi = θ
Why subgradient? q*(θi) = min_xi θiT xi is concave w.r.t. θi (a lower envelope of the linear functions θiT x) but not differentiable.
Projected subgradient method:
θi = [θi + λ xi]_Ω,  Ω = {θi | ∑i θi = θ}
Guaranteed to get the optimal lower bound!
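The projection [·]_Ω onto the affine set Ω has a closed form: after the gradient step, subtract the average violation of ∑i θi = θ from each θi (this is the Euclidean projection onto that affine subspace). A minimal sketch with hypothetical names:

```python
def projected_subgradient_step(thetas, xs, theta, lam):
    """One step of the projected subgradient method from the slide:
    theta_i <- [theta_i + lam * x_i]_Omega, Omega = {theta_i : sum_i theta_i = theta}.

    thetas: list of m tree parameter vectors theta_i
    xs:     list of m tree minimizers x_i (the subgradients)
    theta:  the original parameter vector; lam: step width
    """
    m = len(thetas)
    # Gradient step on every tree.
    stepped = [[t + lam * x for t, x in zip(th, xi)]
               for th, xi in zip(thetas, xs)]
    # Project back onto Omega: subtract the average constraint violation.
    total = [sum(col) for col in zip(*stepped)]
    corr = [(tot - th) / m for tot, th in zip(total, theta)]
    return [[t - c for t, c in zip(th, corr)] for th in stepped]
```

After the step, the tree vectors again sum to θ, so the bound L({θi}) remains valid.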
Example 2: optimize the LP of TRW
[Komodakis et al. '07]
TRW-S:
• Not guaranteed to get the optimal bound (DD is)
• Lower bound always goes up (in DD it does not)
• Needs min-marginals (DD does not)
• DD is parallelizable (every tree in DD can be optimized separately)
Not NP-hard (Pushmeet Kohli's part)
Example 3: A global perspective on low-level vision
[Woodford et al. ICCV '09] (see poster on Friday)
Add a global term which enforces a match with the marginal statistic:
E(x) = ∑i θi (xi) + ∑i,jЄN θij (xi,xj) + f(∑i xi)
(the cost f is a function of the count ∑i xi, ranging from 0 to n)
Global unary, err. 12.8%
Split the energy into E1 + E2 and "solve with dual-decomposition"
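The energy with the global marginal-statistic term can be evaluated directly; a minimal sketch (all names are illustrative; here f simply penalizes deviation of the foreground count from a target statistic):

```python
def mpf_energy(x, unary, edges, pair_cost, f):
    """Marginal Probability Field energy from the slide:
    E(x) = sum_i theta_i(x_i) + sum_{(i,j) in N} theta_ij(x_i, x_j) + f(sum_i x_i).

    The global term f is a function of the foreground count sum_i x_i
    (between 0 and n), enforcing a match with the marginal statistic."""
    e = sum(unary[i][xi] for i, xi in enumerate(x))
    e += sum(pair_cost(x[i], x[j]) for i, j in edges)
    return e + f(sum(x))
```

Because f couples all variables at once, this term is what makes the model higher-order and motivates the dual-decomposition split into E1 + E2.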
Example 3: A global perspective on low-level vision
Image synthesis [Kwatra '03]: input; global colour distribution prior
Image de-noising: noisy input; pairwise MRF; global gradient prior; ground truth; gradient strength
Example 4: Solve GrabCut globally optimally
[Vicente et al. ICCV '09] (see poster on Tuesday)
E(x, w), with w a colour model: a highly connected MRF
E(x,w) = ∑ θi (xi, w) + ∑ θij (xi,xj),  E(x,w): {0,1}n x {GMMs} → R
E'(x) = min_w E(x, w): a higher-order MRF
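Minimizing out w turns the joint energy into a higher-order energy in x alone. A minimal sketch that takes the min over a finite candidate set of colour models (the finite set and all names are illustrative assumptions; the paper optimizes over the full model class):

```python
def reduced_energy(x, models, pairwise):
    """E'(x) = min_w E(x, w): minimize the joint segmentation/appearance
    energy over a candidate set of colour models w.

    models:   list of candidates, each a function w(i, x_i) giving the
              data term theta_i(x_i, w) for pixel i
    pairwise: function returning sum_ij theta_ij(x_i, x_j); it does not
              depend on w, so it is added outside the min
    """
    data = min(sum(w(i, xi) for i, xi in enumerate(x)) for w in models)
    return data + pairwise(x)
```

The min over w makes every data term depend on the whole labelling (through the fitted model), which is exactly the higher-order coupling the slide points out.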
Example 4: Solve GrabCut globally optimally
E(x) = g(∑i xi) + ∑b fb(∑i xib) + ∑i,jЄN θij (xi,xj)
• g convex (minimum at n/2): prefers an "equal area" segmentation
• fb concave (over [0, max]): each colour bin b is either fore- or background
Split the energy into E1 + E2 and "solve with dual-decomposition"
Example 4: Solve GrabCut globally optimally
[Vicente et al. ICCV '09] (see poster on Tuesday)
Globally optimal in 60% of cases, such as…
Summary
• Dual Decomposition is a powerful technique for challenging MRFs
• Not guaranteed to give globally optimal energy
• … but for several vision problems we get tight bounds
END
… unused slides
Texture Restoration (table 1)
256x85 nodes; 15 con.; 36% non-sub; 6.6 unary strength
Training image; Test image; GC E=999 (0.05 sec)
QPBO, 16.5% unlab. (1.4 sec)
QPBOP, 0% unlab. (14 sec) – Global Minimum. Sim. Ann., ICM, BP, BP+I, C+BP+I (visually similar)
New-View Synthesis [Fitzgibbon et al ‘03]
385x385 nodes; 8 con.; 8% non-sub; 0.1 unary strength
QPBO 3.9% unlabelled (black) (0.7 sec)
QPBOP – Global Min. (1.4 sec); P+BP+I, BP+I
Ground Truth; Sim. Ann. E=980 (50 sec); ICM E=999 (0.2 sec) (visually similar)
Graph Cut E=2 (0.3sec)
BP E=18 (0.6sec)
Image Segmentation – region & boundary brush
321x221 nodes; 4 con.; 0.006% non-sub; 0 unary strength
Sim. Ann. E=983 (50sec) ICM E=999 (0.07sec)
GraphCut E=873 (0.11sec) BP E=28 (0.2sec) QPBO 26.7% unlabeled (0.08sec)
QPBOP Global Min (3.8sec)
Input Image User Input
Non-truncated cost function
[HBF; Szeliski '06] Fast linear system solver in the continuous domain (then discretised)
original input; HBF
[Plot: cost as a function of |di – dj|]