Various Regularization Methods in Computer Vision
Min-Gyu Park
Computer Vision Lab.
School of Information and Communications
GIST
Vision Problems (intro)
• Vision problems such as stereo matching, optical flow estimation, de-noising, and segmentation are typically ill-posed.
  – Because they are inverse problems.
• Properties of well-posed problems.
  – Existence: a solution exists.
  – Uniqueness: the solution is unique.
  – Stability: the solution depends continuously on the input data.
Vision Problems (intro)
• For vision problems, it is difficult to compute the solution directly.
  – Then, how do we find a meaningful solution to such a hard problem?
• Impose prior knowledge on the solution.
  – This means we restrict the space of possible solutions to physically meaningful ones.
Vision Problems (intro)
• This seminar is about imposing our prior knowledge on the solution or on the scene.
• There are various kinds of approaches:
  – Quadratic regularization,
  – Total variation,
  – Piecewise smooth models,
  – Stochastic approaches,
  – each with either L1 or L2 data fidelity terms.
• We will study the properties of different priors.
Bayesian Inference & Probabilistic Modeling
• We will look at a simple de-noising problem.
$$f(x, y) = u(x, y) + n(x, y)$$
  – f is the noisy input image, u is the noise-free (de-noised) image, and n is Gaussian noise.
• Our objective is to find the u that maximizes the posterior distribution,
$$u^* = \arg\max_u \, p(u \mid f)$$
  – where the posterior distribution can be estimated directly or through Bayes' rule,
$$p(u \mid f) = \frac{p(f \mid u)\, p(u)}{p(f)}$$
Bayesian Inference & Probabilistic Modeling
• Probabilistic modeling
• Depending on how we model p(u), the solution will be significantly different.
$$p(u \mid f) = \frac{p(f \mid u)\, p(u)}{p(f)} \propto p(f \mid u)\, p(u)$$
  – p(f | u): likelihood term (data fidelity term)
  – p(u): prior term
  – p(f): evidence (does not depend on u)
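Because the evidence does not depend on u and the logarithm is monotonic, the MAP estimate can equivalently be written as the minimizer of a negative log-posterior. This is a standard identity, sketched here in the notation of the slides:

```latex
u^* = \arg\max_u \, p(f \mid u)\, p(u)
    = \arg\min_u \, \bigl[ -\log p(f \mid u) - \log p(u) \bigr]
```

The first term then plays the role of a data fidelity energy and the second that of a regularization energy.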
De-noising Problem
• Critical issue.
  – How do we smooth the input image while preserving important features such as image edges?
[Figure: input (noisy) image and the image de-noised with an L1 regularization term.]
De-noising Problem
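• Formulation.
$$p(f \mid u) = \prod_{(x,y) \in D} \frac{1}{\sqrt{2\pi}\,\sigma_d} \exp\!\left( -\frac{(f(x,y) - u(x,y))^2}{2\sigma_d^2} \right)$$
$$p(u) = \prod_{(x,y) \in D} \frac{1}{\sqrt{2\pi}\,\sigma_p} \exp\!\left( -\frac{|\nabla u(x,y)|^2}{2\sigma_p^2} \right)$$
$$\nabla u = \left( \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y} \right)^T, \qquad |\nabla u| = \sqrt{ \left( \frac{\partial u}{\partial x} \right)^2 + \left( \frac{\partial u}{\partial y} \right)^2 }$$
Quadratic smoothness of the first-order derivatives.
  – First order: flat surface
  – Second order: quadratic surface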
De-noising Problem
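• Combining the likelihood and prior terms,
$$p(f \mid u)\, p(u) = \prod_{(x,y) \in D} \frac{1}{2\pi \sigma_d \sigma_p} \exp\!\left( -\left[ \frac{(f(x,y) - u(x,y))^2}{2\sigma_d^2} + \frac{|\nabla u(x,y)|^2}{2\sigma_p^2} \right] \right)$$
This is exactly a Gibbs distribution.
• Thus, maximizing p(f|u)p(u) is equivalent to minimizing the free energy of the Gibbs distribution,
$$E(u) = \sum_{(x,y) \in D} \left[ \frac{(f(x,y) - u(x,y))^2}{2\sigma_d^2} + \frac{|\nabla u(x,y)|^2}{2\sigma_p^2} \right] \propto \sum_{(x,y) \in D} \left[ \frac{1}{2\lambda} (f(x,y) - u(x,y))^2 + \frac{1}{2} |\nabla u(x,y)|^2 \right]$$
$$\lambda = \frac{\sigma_d^2}{\sigma_p^2}$$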
How to minimize the energy function?
• Directly solve the Euler-Lagrange equations.
  – This works because the energy function is convex (it has a unique global minimum).
$$u^{t+1} = u^t - dt \, \frac{\partial E}{\partial u}$$
Initially, $u^0 = f$.
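The descent step above can be sketched on a 1-D signal. This is a minimal illustration, not the presenter's implementation: it assumes the discrete energy E(u) = sum (u_i - f_i)^2 / (2*lam) + 0.5 * sum (u_{i+1} - u_i)^2 with replicated boundary samples, and the names `quadratic_energy`, `quadratic_denoise_1d`, `lam`, `dt`, and `steps` are hypothetical.

```python
def quadratic_energy(u, f, lam):
    """Discrete 1-D energy: data term / (2*lam) plus half the squared differences."""
    data = sum((ui - fi) ** 2 for ui, fi in zip(u, f)) / (2.0 * lam)
    smooth = 0.5 * sum((b - a) ** 2 for a, b in zip(u, u[1:]))
    return data + smooth


def quadratic_denoise_1d(f, lam=1.0, dt=0.2, steps=200):
    """Explicit gradient descent u <- u - dt * dE/du, starting from u^0 = f."""
    u = list(f)
    n = len(u)
    for _ in range(steps):
        new_u = []
        for i in range(n):
            left = u[i - 1] if i > 0 else u[i]        # replicate boundary sample
            right = u[i + 1] if i < n - 1 else u[i]
            laplacian = left - 2.0 * u[i] + right     # discrete second derivative
            grad_e = (u[i] - f[i]) / lam - laplacian  # dE/du at sample i
            new_u.append(u[i] - dt * grad_e)
        u = new_u
    return u


step_edge = [0.0] * 10 + [1.0] * 10   # hypothetical clean step signal
smoothed = quadratic_denoise_1d(step_edge)
largest_jump = max(abs(b - a) for a, b in zip(smoothed, smoothed[1:]))
print(largest_jump)  # smaller than the original jump of 1.0: the edge is blurred
```

Even on a noise-free step the quadratic prior spreads the jump over several samples. The step size `dt` must stay small enough for the explicit scheme to remain stable; 0.2 is a conservative choice for `lam = 1`.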
The Result of a Quadratic Regularizer
[Figure: input (noisy) image and the de-noised result. Noise is removed (smoothed), but edges are also blurred.]
The result is not satisfactory.
Why?
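• Due to the bias against discontinuities.
[Figure: two 1-D intensity profiles f(x) and g(x) over x in [0, 6], each rising by 5 in total, one through a sharp discontinuity and one through a gradual slope.]
$$\int_0^6 |\nabla f| \, dx = \int_0^6 |\nabla g| \, dx = 5$$
With a quadratic (L2) penalty, discontinuities are penalized more, whereas the L1 norm (total variation) treats both profiles the same.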
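A small numeric check makes the bias concrete. The two signals below are hypothetical stand-ins for the profiles in the figure: both rise by 5, one gradually and one in a single jump.

```python
def l1_penalty(signal):
    """Sum of absolute neighbour differences (discrete total variation)."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


def l2_penalty(signal):
    """Sum of squared neighbour differences (discrete quadratic smoothness)."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))


ramp = [0, 1, 2, 3, 4, 5]   # gradual rise of 5
step = [0, 0, 0, 5, 5, 5]   # single jump of 5

print(l1_penalty(ramp), l1_penalty(step))  # 5 5
print(l2_penalty(ramp), l2_penalty(step))  # 5 25
```

The L1 penalty cannot tell the two apart, while the quadratic penalty charges the step five times as much, so a quadratic regularizer prefers to turn sharp edges into ramps.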
Pros & Cons
• If the result has no discontinuities (e.g., a smooth depth map, surface, or noise-free image), a quadratic regularizer is a good choice.
  – The L2 regularizer is biased against discontinuities.
  – Easy to solve! Gradient descent will find the solution.
• Quadratic problems have a unique global solution.
  – Meaning the regularized problem is well-posed.
  – But we cannot guarantee that the solution is truly correct.
Introduction to Total Variation
• If we use the L1 norm for the smoothness prior,
$$p(u) = \prod_{(x,y) \in D} \frac{1}{Z} e^{-|\nabla u(x,y)| / \sigma_p} \propto e^{-\sum_{(x,y) \in D} |\nabla u(x,y)| / \sigma_p}$$
• Furthermore, if we assume $\sigma_p = 1$, then
$$p(u) \propto e^{-\sum_{(x,y) \in D} |\nabla u(x,y)|}$$
Introduction to Total Variation
• Then, the free energy is defined as the total variation of the function u.
[Figure: plot of a 1-D function u(x) against x.]
Definition of total variation:
$$TV(u) = \int_\Omega |\nabla u| \, d\Omega$$
subject to the integral being finite ($TV(u) < \infty$). Such functions are said to have bounded variation (BV).
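On a pixel grid the integral becomes a sum. The following is one common discretization (isotropic TV with forward differences and zero differences at the last row and column), given as an assumption rather than the definition used in the seminar; `total_variation` is a hypothetical helper name.

```python
import math


def total_variation(image):
    """Isotropic discrete TV of a 2-D image given as a list of equal-length rows."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for y in range(rows):
        for x in range(cols):
            # Forward differences, set to zero past the last column/row.
            dx = image[y][x + 1] - image[y][x] if x + 1 < cols else 0.0
            dy = image[y + 1][x] - image[y][x] if y + 1 < rows else 0.0
            tv += math.hypot(dx, dy)  # |grad u| at this pixel
    return tv


flat = [[2.0] * 4 for _ in range(3)]               # constant image
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]    # vertical unit edge, 3 rows

print(total_variation(flat))  # 0.0
print(total_variation(edge))  # 3.0
```

For the edge image, TV equals the jump height times the edge length, regardless of how sharp the edge is.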
Characteristics of Total Variation
• Advantages:
  – No bias against discontinuities.
  – Contrast invariant without explicitly modeling the lighting conditions.
  – Robust under impulse noise.
• Disadvantages:
  – The TV term is convex but not strictly convex, and it is not differentiable where ∇u = 0.
  – In that sense, TV problems lie between convex and non-convex problems.
How to solve it?
• With L1 or L2 data terms, we can use
  – Variational methods
    • Explicit time marching
    • Linearization of the Euler-Lagrange equation
    • Nonlinear primal-dual method
    • Nonlinear multi-grid method
  – Graph cuts
  – Convex optimization (first-order schemes)
  – Second-order cone programming
• To solve the original non-convex problems.
Variational Methods
• Definition.
  – Informally speaking, they are based on solving Euler-Lagrange equations.
• Problem definition (constrained problem).
$$\min_u \int_\Omega |\nabla u| \, d\Omega \quad \text{s.t.} \quad \int_\Omega (u - f)^2 \, d\Omega = \sigma^2$$
This was the first total-variation-based approach in computer vision. It is named after Rudin, Osher, and Fatemi and is known as the ROF model (1992).
Variational Methods
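• Unconstrained (Lagrangian) model
$$\max_u \, p(u \mid f) \;\Leftrightarrow\; \min_u \int_\Omega |\nabla u| \, d\Omega + \frac{1}{2\lambda} \int_\Omega (u - f)^2 \, d\Omega$$
• Can be solved by an explicit time marching scheme,
$$u^{t+1} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{\lambda} (u - f) \right)$$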
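A 1-D sketch of explicit time marching for the TV-L2 (ROF) model follows. It is an illustration under stated assumptions, not the presenter's code: the term 1/|∇u| is replaced by 1/sqrt(|∇u|^2 + eps^2) to avoid division by zero (a common smoothing that the slides do not mention), and `rof_explicit_1d`, `tv_1d`, `lam`, `dt`, `eps`, and `steps` are hypothetical names and values.

```python
import math


def tv_1d(signal):
    """Discrete 1-D total variation."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


def rof_explicit_1d(f, lam=1.0, dt=0.01, eps=0.1, steps=2000):
    """Explicit time marching: u <- u + dt * (div(grad u / |grad u|_eps) - (u - f) / lam)."""
    u = list(f)
    n = len(u)
    for _ in range(steps):
        # Normalised forward differences between samples i and i + 1.
        p = []
        for i in range(n - 1):
            d = u[i + 1] - u[i]
            p.append(d / math.sqrt(d * d + eps * eps))
        new_u = []
        for i in range(n):
            # Backward difference of p is the discrete divergence (p = 0 outside).
            right = p[i] if i < n - 1 else 0.0
            left = p[i - 1] if i > 0 else 0.0
            div_p = right - left
            new_u.append(u[i] + dt * (div_p - (u[i] - f[i]) / lam))
        u = new_u
    return u


# Hypothetical test signal: a unit step with alternating +/-0.1 perturbations.
noisy_step = [(0.0 if i < 10 else 1.0) + (0.1 if i % 2 == 0 else -0.1)
              for i in range(20)]
denoised = rof_explicit_1d(noisy_step)
left_mean = sum(denoised[:10]) / 10
right_mean = sum(denoised[10:]) / 10
print(tv_1d(noisy_step), tv_1d(denoised))
print(right_mean - left_mean)  # most of the step height is retained
```

The small time step is the price of the explicit scheme: the smaller `eps` is, the closer the model is to true TV, but the smaller `dt` has to be for stability.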
Variational Methods
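• What happens if we change the data fidelity term to the L1 norm?
$$\min_u \int_\Omega |\nabla u| \, d\Omega + \frac{1}{2\lambda} \int_\Omega |u - f| \, d\Omega$$
• More difficult to solve (the data term is no longer differentiable and the energy is not strictly convex), but robust against outliers such as occlusions.
This formulation is called the TV-L1 framework.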
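The robustness of an L1 data term can be seen in a scalar analogy with no regularizer at all: fitting a single value u to several observations. The L2 cost is minimized by the mean and the L1 cost by the median. The observations below are hypothetical, with the last one acting as an outlier.

```python
import statistics

samples = [10.0, 10.2, 9.9, 10.1, 50.0]   # hypothetical data; 50.0 is an outlier


def l2_cost(u):
    """Sum of squared residuals."""
    return sum((u - s) ** 2 for s in samples)


def l1_cost(u):
    """Sum of absolute residuals."""
    return sum(abs(u - s) for s in samples)


u_l2 = statistics.mean(samples)     # minimiser of l2_cost
u_l1 = statistics.median(samples)   # a minimiser of l1_cost

print(u_l2)  # pulled toward the outlier (about 18)
print(u_l1)  # stays with the bulk of the data (10.1)
```

A single bad observation shifts the L2 estimate substantially, while the L1 estimate is unaffected, which is the behaviour one wants when some pixels are occluded or corrupted.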
Variational Methods
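• Comparison of variational methods in terms of their explicit time marching schemes.
  – L2-L2:
$$u^{t+1} = u^t + dt \left( \Delta u - \frac{1}{\lambda} (u - f) \right)$$
  – TV-L2:
$$u^{t+1} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{\lambda} (u - f) \right)$$
  – TV-L1:
$$u^{t+1} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{2\lambda} \frac{u - f}{|u - f|} \right)$$
The degeneracy comes from the denominators $|\nabla u|$ and $|u - f|$, which can vanish.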
Variational Methods
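• In the L2-L2 case,
$$u^{t+1} = u^t + dt \left( \Delta u - \frac{1}{\lambda} (u - f) \right)$$
where
$$\|\nabla u\|_2^2 = \left( \frac{\partial u}{\partial x} \right)^2 + \left( \frac{\partial u}{\partial y} \right)^2$$
$$\Delta u = \nabla \cdot \nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$$
The Laplacian arises as the negated variational derivative of $\frac{1}{2} \|\nabla u\|_2^2$ with respect to u.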
Duality-based Approach
• Why do we use the dual instead of the primal problem?
  – The function becomes continuously differentiable.
  – Not always, but it does in the case of total variation.
• For example, we use the property below to introduce a dual variable p,
$$\int_\Omega |\nabla u| \, d\Omega = \max_{p} \left\{ \int_\Omega p \cdot \nabla u \, d\Omega \;:\; \|p\| \le 1 \right\}$$
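At a single pixel the property says that the length of a gradient vector g equals the largest inner product p · g over all p with |p| ≤ 1, attained at p = g / |g|. A quick numeric check, using a hypothetical gradient vector and hypothetical helper names:

```python
import math


def dual_value(p, g):
    """Inner product p . g for 2-D vectors."""
    return p[0] * g[0] + p[1] * g[1]


def best_dual(g):
    """Maximiser of p . g subject to |p| <= 1 (any feasible p works when g = 0)."""
    norm = math.hypot(g[0], g[1])
    if norm == 0.0:
        return (0.0, 0.0)
    return (g[0] / norm, g[1] / norm)


g = (3.0, 4.0)               # hypothetical gradient at one pixel, |g| = 5
p_star = best_dual(g)

# Brute-force search over unit vectors for comparison.
angles = (2.0 * math.pi * k / 3600 for k in range(3600))
best_sampled = max(dual_value((math.cos(t), math.sin(t)), g) for t in angles)

print(dual_value(p_star, g), best_sampled)  # both close to |g| = 5
```

The inner product is linear, hence smooth, in both p and g. The non-differentiability of |g| at g = 0 has been traded for the constraint |p| ≤ 1.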
Duality-based Approach
• A deeper treatment of duality in variational methods will be given in the next seminar.
Applying to Other Problems
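• Optical flow (Horn and Schunck, L2-L2)
$$\min_u \int_\Omega \left( |\nabla u_1|^2 + |\nabla u_2|^2 \right) d\Omega + \int_\Omega \left( I_1(x + u(x)) - I_0(x) \right)^2 d\Omega$$
• Stereo matching (TV-L1)
$$\min_u \int_\Omega |\nabla u| \, d\Omega + \int_\Omega \left| I_1(x + u_0) + (u - u_0)\, I_1^x - I_0 \right| d\Omega$$
• Segmentation (TV-L2)
$$\min_{u \in [0,1]} \int_\Omega g(x)\, |\nabla u| \, d\Omega + \frac{1}{2} \int_\Omega \lambda(x)\, (u - f)^2 \, d\Omega$$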
Q&A / Discussion