
Various Regularization Methods in Computer Vision

Min-Gyu Park

Computer Vision Lab.

School of Information and Communications

GIST

Vision Problems (intro)

• Vision problems such as stereo matching, optical flow estimation, de-noising, and segmentation are typically ill-posed.
– Because they are inverse problems.

• Properties of well-posed problems.
– Existence: a solution exists.
– Uniqueness: the solution is unique.
– Stability: the solution depends continuously on the input data.


• It is difficult to compute the solution of a vision problem directly.
– Then, how do we find a meaningful solution to such a hard problem?

• Impose prior knowledge on the solution.
– This means we restrict the space of possible solutions to physically meaningful ones.


• This seminar is about imposing our prior knowledge on the solution or on the scene.

• There are various kinds of approaches:
– Quadratic regularization,
– Total variation,
– Piecewise smooth models,
– Stochastic approaches,
– With either L1 or L2 data fidelity terms.

• We will study the properties of different priors.

Bayesian Inference & Probabilistic Modeling

• We will look at a simple de-noising problem.

$$f(x, y) = u(x, y) + n(x, y)$$

– f is the noisy input image, u is the noise-free (de-noised) image, and n is Gaussian noise.

• Our objective is to find the u that maximizes the posterior distribution,

$$u^* = \arg\max_u \, p(u \mid f)$$

– where the posterior distribution can be estimated directly or via Bayes' rule as

$$p(u \mid f) = \frac{p(f \mid u)\, p(u)}{p(f)}$$

Bayesian Inference & Probabilistic Modeling

• Probabilistic modeling

• Depending on how we model p(u), the solution will be significantly different.

$$p(u \mid f) = \frac{p(f \mid u)\, p(u)}{p(f)} \propto p(f \mid u)\, p(u)$$

– p(f | u): likelihood term (data fidelity term)
– p(u): prior term
– p(f): evidence (does not depend on u)

De-noising Problem

• Critical issue.
– How to smooth the input image while preserving important features such as image edges.

[Figure: input (noisy) image; de-noised image via an L1 regularization term]

De-noising Problem

• Formulation.

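$$p(f \mid u) = \prod_{(x,y) \in D} \frac{1}{\sqrt{2\pi}\,\sigma_d} \exp\!\left(-\frac{\big(f(x,y) - u(x,y)\big)^2}{2\sigma_d^2}\right)$$

$$p(u) = \prod_{(x,y) \in D} \frac{1}{\sqrt{2\pi}\,\sigma_p} \exp\!\left(-\frac{|\nabla u(x,y)|^2}{2\sigma_p^2}\right)$$

$$\nabla u = \left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right)^T, \qquad |\nabla u| = \sqrt{\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2}$$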

Quadratic smoothness of the first-order derivatives.

First order: flat surface. Second order: quadratic surface.

De-noising Problem

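• By combining the likelihood and prior terms,

$$p(f \mid u)\, p(u) = \prod_{(x,y) \in D} \frac{1}{2\pi\,\sigma_d \sigma_p} \exp\!\left(-\frac{\big(f(x,y) - u(x,y)\big)^2}{2\sigma_d^2} - \frac{|\nabla u(x,y)|^2}{2\sigma_p^2}\right)$$

This is exactly a Gibbs function!

• Thus, maximizing p(f|u)p(u) is equivalent to minimizing the free energy of the Gibbs distribution,

$$E(u) = \sum_{(x,y) \in D} \left[\frac{\big(f(x,y) - u(x,y)\big)^2}{2\sigma_d^2} + \frac{|\nabla u(x,y)|^2}{2\sigma_p^2}\right] \propto \sum_{(x,y) \in D} \left[\frac{1}{2\lambda}\big(f(x,y) - u(x,y)\big)^2 + \frac{1}{2}|\nabla u(x,y)|^2\right], \qquad \lambda = \frac{\sigma_d^2}{\sigma_p^2}$$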
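The step from the posterior to the energy can be made explicit by taking the negative logarithm. The following is a sketch for the Gaussian likelihood and prior of this section, where σ_d and σ_p denote the standard deviations of the data and prior terms; the constant C, which collects the terms that do not depend on u, is introduced here only for illustration.

$$-\log\big(p(f \mid u)\, p(u)\big) = \sum_{(x,y) \in D} \left[\frac{\big(f(x,y) - u(x,y)\big)^2}{2\sigma_d^2} + \frac{|\nabla u(x,y)|^2}{2\sigma_p^2}\right] + C$$

Because the negative logarithm is monotonically decreasing, the u that maximizes p(f|u)p(u) is the same u that minimizes the sum on the right.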

How to minimize the energy function?

• Directly solve the Euler-Lagrange equations.
– Because the energy is convex! (it has a unique global minimizer)

$$u^{t+1} = u^t - dt\,\frac{\partial E}{\partial u}$$

Initially, $u^0 = f$.
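A minimal Python sketch of this descent for the quadratic energy is given below. It assumes a discretization that the slides do not specify: a 5-point Laplacian with replicated borders, unit grid spacing, and a data weight of 1/λ. The names `laplacian` and `denoise_quadratic` and the default parameters are hypothetical.

```python
import numpy as np

def laplacian(u):
    """5-point Laplacian with replicated (Neumann) borders, unit grid spacing."""
    p = np.pad(u, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u

def denoise_quadratic(f, lam=1.0, dt=0.2, n_iter=100):
    """Explicit gradient descent on
    E(u) = sum( (u - f)^2 / (2 * lam) + |grad u|^2 / 2 ).
    """
    f = np.asarray(f, dtype=float)
    u = f.copy()                                  # u^0 = f
    for _ in range(n_iter):
        grad_E = (u - f) / lam - laplacian(u)     # dE/du
        u = u - dt * grad_E                       # u^{t+1} = u^t - dt * dE/du
    return u
```

With these assumptions the explicit scheme is only stable for a small time step (roughly dt < 2 / (8 + 1/λ)), which is why the default dt is 0.2 for λ = 1.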

The Result of a Quadratic Regularizer

[Figure: input (noisy) image; result in which noise is removed (smoothed), but edges are also blurred]

The result is not satisfactory...

Why?

• Due to bias against discontinuities.

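[Figure: two intensity profiles, f(x) and g(x), plotted for x from 0 to 6 with intensity from 0 to 5]

$$\int_0^6 |\nabla f|\,dx = \int_0^6 |\nabla g|\,dx = 5$$

With the quadratic (L2) regularizer, discontinuities are penalized more, whereas the L1 norm (total variation) treats both the same.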
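The bias can be checked numerically. The sketch below compares the two penalties on a linear ramp and a step, both rising from 0 to 5 on [0, 6]; these two profiles, the grid resolution, and the function names are assumptions chosen for illustration.

```python
import numpy as np

x = np.linspace(0.0, 6.0, 601)
h = x[1] - x[0]                          # grid spacing

ramp = 5.0 * x / 6.0                     # gradual rise from 0 to 5 (assumed profile)
step = np.where(x < 3.0, 0.0, 5.0)       # abrupt jump from 0 to 5 (assumed profile)

def l1_penalty(v):
    """Discrete version of the integral of |v'| (total variation)."""
    return np.sum(np.abs(np.diff(v) / h)) * h

def l2_penalty(v):
    """Discrete version of the integral of |v'|^2 (quadratic regularizer)."""
    return np.sum((np.diff(v) / h) ** 2) * h

tv_ramp, tv_step = l1_penalty(ramp), l1_penalty(step)   # both equal 5 up to rounding
q_ramp, q_step = l2_penalty(ramp), l2_penalty(step)     # q_step is far larger than q_ramp

print(tv_ramp, tv_step, q_ramp, q_step)
```

The L1 penalty depends only on the total rise, while the quadratic penalty of the step grows as the grid is refined, so a quadratic regularizer always prefers to smear the jump into a ramp.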

Pros & Cons

• If the result (e.g., a depth map, surface, or noise-free image) has no discontinuities, the quadratic regularizer is a good choice.
– The L2 regularizer is biased against discontinuities.
– Easy to solve! Gradient descent will find the solution.
• Quadratic problems have a unique global solution.
– Meaning it is a well-posed problem.
– But we cannot guarantee that the solution is truly correct.

Introduction to Total Variation

• If we use the L1 norm for the smoothness prior,

$$p(u) = \frac{1}{Z} \prod_{(x,y) \in D} e^{-|\nabla u(x,y)| / \sigma_p} = \frac{1}{Z}\, e^{-\sum_{(x,y) \in D} |\nabla u(x,y)| / \sigma_p}$$

• Furthermore, if we assume the variance is 1, then

$$p(u) = \frac{1}{Z} \prod_{(x,y) \in D} e^{-|\nabla u(x,y)|} = \frac{1}{Z}\, e^{-\sum_{(x,y) \in D} |\nabla u(x,y)|}$$

Introduction to Total Variation

• Then, the free energy is defined as the total variation of the function u.

[Figure: a one-dimensional function u(x) plotted against x]

Definition of total variation:

$$TV(u) = \int_\Omega |\nabla u|\, d\Omega$$

such that the integral is finite (TV(u) < ∞). Such functions have bounded variation (BV).
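For a discrete image, the integral becomes a sum over pixels. The sketch below is one common discretization (isotropic TV with forward differences and a zero difference at the far border); the function name `total_variation` and the boundary handling are assumptions, not something the slides specify.

```python
import numpy as np

def total_variation(u):
    """Isotropic discrete TV: sum over pixels of |grad u| with forward differences."""
    u = np.asarray(u, dtype=float)
    # Repeating the last column/row makes the difference zero at the far border.
    ux = np.diff(u, axis=1, append=u[:, -1:])
    uy = np.diff(u, axis=0, append=u[-1:, :])
    return float(np.sum(np.sqrt(ux ** 2 + uy ** 2)))

# Usage: a 4 x 6 image with one vertical edge of height 2.
img = np.zeros((4, 6))
img[:, 3:] = 2.0
tv_edge = total_variation(img)   # each of the 4 rows contributes one jump of 2
```

A constant image has zero total variation, and a single sharp edge contributes only its height times its length, regardless of how abrupt it is.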

Characteristics of Total Variation

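• Advantages:
– No bias against discontinuities.
– Contrast invariant without explicitly modeling the lighting conditions.
– Robust under impulse noise.

• Disadvantages:
– Objective functions are non-smooth: |∇u| is not differentiable where ∇u = 0.
• The TV term is convex but not strictly convex, so these problems lie between smooth convex problems and non-convex problems in difficulty.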

How to solve it?

• With L1 or L2 data terms, we can use
– Variational methods
  • Explicit time marching
  • Linearization of the Euler-Lagrange equation
  • Nonlinear primal-dual method
  • Nonlinear multi-grid method
– Graph cuts
– Convex optimization (first-order scheme)
– Second-order cone programming

• To solve original non-convex problems.

Variational Methods

• Definition.
– Informally speaking, they are based on solving Euler-Lagrange equations.

• Problem definition (constrained problem).

$$\min_u \int_\Omega |\nabla u|\, d\Omega \quad \text{s.t.} \quad \int_\Omega (u - f)^2\, d\Omega = \sigma^2$$

This is the first total variation based approach in computer vision, known as the ROF model (1992) after Rudin, Osher, and Fatemi.

Variational Methods

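• Unconstrained (Lagrangian) model

$$\max_u \{ p(u \mid f) \} \;\Longleftrightarrow\; \min_u \int_\Omega |\nabla u|\, d\Omega + \frac{1}{2\lambda} \int_\Omega (u - f)^2\, d\Omega$$

• Can be solved by an explicit time marching scheme as,

$$u^{t+1} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{\lambda}(u - f) \right)$$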
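The update can be read as gradient descent on the unconstrained energy. As a sketch, assuming E(u) = ∫|∇u| dΩ + (1/(2λ)) ∫(u − f)² dΩ, |∇u| ≠ 0, and natural boundary conditions, the Euler-Lagrange equation is

$$\frac{\partial E}{\partial u} = -\nabla \cdot \left( \frac{\nabla u}{|\nabla u|} \right) + \frac{1}{\lambda}(u - f) = 0$$

and stepping against this gradient with step size dt gives

$$u^{t+1} = u^t - dt\, \frac{\partial E}{\partial u} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{\lambda}(u - f) \right)$$

The first term is undefined wherever ∇u = 0, so practical implementations usually replace |∇u| by a smoothed magnitude such as sqrt(|∇u|² + ε²), with a small ε chosen by the implementer.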

Variational Methods

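• What happens if we change the data fidelity term to the L1 norm, as in

$$\min_u \int_\Omega |\nabla u|\, d\Omega + \frac{1}{\lambda} \int_\Omega |u - f|\, d\Omega$$

• More difficult to solve (both terms are non-differentiable and the energy is not strictly convex), but robust against outliers such as occlusion.

This formulation is called the TV-L1 framework.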

Variational Methods

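• Comparison among variational methods in terms of the explicit time marching scheme.

L2-L2:

$$u^{t+1} = u^t + dt \left( \Delta u - \frac{1}{\lambda}(u - f) \right)$$

TV-L2:

$$u^{t+1} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{\lambda}(u - f) \right)$$

TV-L1:

$$u^{t+1} = u^t + dt \left( \nabla \cdot \frac{\nabla u}{|\nabla u|} - \frac{1}{\lambda} \frac{u - f}{|u - f|} \right)$$

The divisions by |∇u| and |u − f| are where the degeneracy comes from.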
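The three updates differ only in which terms are normalized, which a short sketch makes concrete. The code below is an illustration under assumptions the slides do not state: forward-difference gradients, a backward-difference divergence, and a small constant `eps` added under each square root to sidestep the degeneracy. The names `grad`, `div`, `step`, and `time_march` and all default values are hypothetical.

```python
import numpy as np

def grad(u):
    """Forward differences; the difference is zero at the far border."""
    ux = np.diff(u, axis=1, append=u[:, -1:])
    uy = np.diff(u, axis=0, append=u[-1:, :])
    return ux, uy

def div(px, py):
    """Backward-difference divergence (negative adjoint of grad above)."""
    return np.diff(px, axis=1, prepend=0.0) + np.diff(py, axis=0, prepend=0.0)

def step(u, f, model, lam=1.0, dt=0.02, eps=0.1):
    """One explicit time-marching step for 'L2-L2', 'TV-L2', or 'TV-L1'."""
    ux, uy = grad(u)
    if model == "L2-L2":
        smooth = div(ux, uy)                        # Laplacian of u
    else:
        mag = np.sqrt(ux ** 2 + uy ** 2 + eps ** 2) # smoothed |grad u|
        smooth = div(ux / mag, uy / mag)            # curvature term
    r = u - f
    if model == "TV-L1":
        data = r / np.sqrt(r ** 2 + eps ** 2)       # smoothed (u - f) / |u - f|
    else:
        data = r
    return u + dt * (smooth - data / lam)

def time_march(f, model, n_iter=50, **kwargs):
    """Run n_iter explicit steps starting from u^0 = f."""
    f = np.asarray(f, dtype=float)
    u = f.copy()
    for _ in range(n_iter):
        u = step(u, f, model, **kwargs)
    return u
```

The smaller `eps` is, the closer the scheme is to true total variation, but the smaller `dt` must be for the explicit update to remain stable; this trade-off is one reason the other solvers listed earlier are often preferred.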

Variational Methods

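• In the L2-L2 case,

$$u^{t+1} = u^t + dt \left( \Delta u - \frac{1}{\lambda}(u - f) \right)$$

where

$$\|\nabla u\|_2^2 = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2$$

$$\Delta u = \nabla \cdot \nabla u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$$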

Duality-based Approach

• Why do we use duality instead of the primal problem?
– The function becomes continuously differentiable.
– Not always, but it does in the case of total variation.

• For example, we use the property below to introduce a dual variable p,

$$\max_{p \,:\, \|p\| \le 1} \int_\Omega p \cdot \nabla u\, d\Omega = \int_\Omega |\nabla u|\, d\Omega$$
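At a single pixel the property reduces to max over |p| ≤ 1 of p · g = |g|, attained at p = g / |g|. The sketch below checks this numerically for one random vector g standing in for ∇u; the sampling scheme and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal(2)                  # stands in for grad u at one pixel
g_norm = np.linalg.norm(g)

# Sample many candidate dual vectors p inside the unit disc and evaluate p . g.
angles = rng.uniform(0.0, 2.0 * np.pi, 10000)
radii = rng.uniform(0.0, 1.0, 10000)
P = np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)
sampled_max = np.max(P @ g)                 # never exceeds |g|

p_star = g / g_norm                         # the maximizer lies on the unit circle
attained = p_star @ g                       # equals |g|
```

The dual expression p · ∇u is linear in u, which is what removes the non-differentiability of |∇u| at the cost of a constraint on p.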

Duality-based Approach

• A deeper treatment of duality in variational methods will be given in the next seminar.

Applying to Other Problems

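• Optical flow (Horn and Schunck – L2-L2)

$$\min_u \int_\Omega \left( |\nabla u_1|^2 + |\nabla u_2|^2 \right) d\Omega + \lambda \int_\Omega \big( I_1(x + u(x)) - I_0(x) \big)^2\, d\Omega$$

• Stereo matching (TV-L1)

$$\min_u \int_\Omega |\nabla u|\, d\Omega + \lambda \int_\Omega \left| I_1^x(x + u_0)(u - u_0) + I_1 - I_0 \right| d\Omega$$

• Segmentation (TV-L2)

$$\min_{u \in [0,1]} \int_\Omega g(x)\, |\nabla u|\, d\Omega + \frac{1}{2} \int_\Omega \lambda(x)\, (u - f)^2\, d\Omega$$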

Q&A / Discussion
