deconvolution with admm - stanford universitydeconvolution • given measurements band convolution...

45
Deconvolution with ADMM Gordon Wetzstein Stanford University EE367/ CS448I: Computational Imaging and Display stanford.edu /class/ ee367 Lecture 6

Upload: others

Post on 13-Jul-2020

43 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

Gordon WetzsteinStanford University

EE367/CS448I: Computational Imaging and Displaystanford.edu/class/ee367

Lecture 6

Page 2: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Lens as Optical Low-pass Filter

• point source on focal plane maps to point

focal plane

Page 3: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

• away from focal plane: out of focus blur

focal plane

blur

red

poin

t

Lens as Optical Low-pass Filter

Page 4: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

• shift-invariant convolution

focal plane

Lens as Optical Low-pass Filter

Page 5: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Lens as Optical Low-pass Filter

poin

t spr

ead

func

tion

(PSF

): c

x bsharp image measured, blurred image

b = c∗ x

convolution kernel is calledpoint spread function (PSF)

Page 6: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Lens as Optical Low-pass Filter

poin

t spr

ead

func

tion

(PSF

): c

x bsharp image measured, blurred image

b = c∗ x

diffraction-limited PSF of circular aperture (aka “Airy” pattern):

Page 7: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

PSF, OTF, MTF

• point spread function (PSF) is fundamental concept in optics• optical transfer function (OTF) is (complex) Fourier transform of PSF

• modulation transfer function (MTF) is magnitude of OTF

• example: PSFOTF=F{PSF}MTF=|OTF|

Page 8: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

PSF, OTF, MTF

PSFOTF=F{PSF}MTF=|OTF|• example:

Page 9: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution

• given measurements b and convolution kernel c, what is x?

*

=

bcx

?

Page 10: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Inverse Filtering• naive solution: apply inverse kernel

!x = c−1 ∗b = F−1 F b{ }

F c{ }⎧⎨⎩

⎫⎬⎭

x !x

Page 11: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Inverse Filtering & Noise• naive solution: apply inverse kernel

• Gaussian noise,

!x = c−1 ∗b = F−1 F b{ }

F c{ }⎧⎨⎩

⎫⎬⎭

!xσ = 0.05

Page 12: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Inverse Filtering & Noise

• results: terrible!

• why? this is an ill-posed problem (division by (close to) zero in frequency

domain) à noise is drastically amplified!

• need to include prior(s) on images to make up for lost data• for example: noise statistics (signal to noise ratio)

Page 13: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Wiener Filtering• apply inverse kernel and don’t divide by 0

!x = F−1 F c{ } 2

F c{ } 2 + 1SNR⋅F b{ }F c{ }

⎧⎨⎪

⎩⎪

⎫⎬⎪

⎭⎪

amplitude-dependent damping factor!

𝑆𝑁𝑅 =𝑚𝑒𝑎𝑛 𝑠𝑖𝑔𝑛𝑎𝑙 ≈ 0.5𝑛𝑜𝑖𝑠𝑒 𝑠𝑡𝑑 = 𝜎

Page 14: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Wiener Filtering

x !xNaïve inverse filter Wiener

Page 15: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Wiener Filtering

σ = 0.05 σ = 0.1σ = 0.01

Page 16: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with Wiener Filtering

• results: not too bad, but noisy

• this is a heuristic à dampen noise amplification

Page 17: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

• idea: promote sparse gradients (edges)

• is finite differences operator, i.e. matrix

Total Variation

minimizex

Cx − b 22 + λTV (x) = minimize

xCx − b 2

2 + λ ∇x 1

x 1 = xii∑

−1 1−1 1!

−1

⎢⎢⎢⎢

⎥⎥⎥⎥

Rudin et al. 1992

Page 18: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Total Variation

∗0 0 00 −1 10 0 0

⎢⎢⎢

⎥⎥⎥

∇yxx ∇xx

∗0 0 00 −1 00 1 0

⎢⎢⎢

⎥⎥⎥

express (forward finite difference) gradient as convolution!

Page 19: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

better: isotropic

Total Variation

∇xx( )2 + ∇yx( )2x ∇xx( )2 + ∇yx( )2easier: anisotropic

Page 20: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Total Variation

• for simplicity, this lecture only discusses anisotropic TV:

• problem: l1-norm is not differentiable, can’t use inverse filtering

• however: simple solution for data fitting along and simple solution

for TV alone à split problem!

TV (x) = ∇xx 1 + ∇yx 1=

∇x

∇y

⎣⎢⎢

⎦⎥⎥x1

Page 21: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

minimize f (x)+ g(z)subject to Ax + Bz = c

f (x) = Cx − b 22

g(z) = λ z 1

A = ∇, B = −I , c = 0

Deconvolution with ADMM

• split deconvolution with TV prior:

• general form of ADMM (alternating direction method of multiplies):

minimize Cx − b 22 + λ z 1

subject to ∇x = z

Page 22: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

minimize f (x)+ g(z)subject to Ax + Bz = c

f (x) = Cx − b 22

g(z) = λ z 1

A = ∇, B = −I , c = 0

Deconvolution with ADMM

• split deconvolution with TV prior:

• general form of ADMM (alternating direction method of multiplies):

minimize Cx − b 22 + λ z 1

subject to ∇x = z

Page 23: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

minimize f (x)+ g(z)subject to Ax + Bz = c

f (x) = Cx − b 22

g(z) = λ z 1

A = ∇, B = −I , c = 0

Deconvolution with ADMM

• split deconvolution with TV prior:

• general form of ADMM (alternating direction method of multiplies):

minimize Cx − b 22 + λ z 1

subject to ∇x = z

Page 24: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

minimize f (x)+ g(z)subject to Ax + Bz = c

f (x) = Cx − b 22

g(z) = λ z 1

A = ∇, B = −I , c = 0

Deconvolution with ADMM

• split deconvolution with TV prior:

• general form of ADMM (alternating direction method of multiplies):

minimize Cx − b 22 + λ z 1

subject to ∇x = z

Page 25: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

ADMM

• Lagrangian (bring constraints into objective = penalty method):

• augmented Lagrangian:

minimize f (x)+ g(z)subject to Ax + Bz = c

Lρ (x, y, z) = f (x)+ g(z)+ yT (Ax + Bz − c)+ (ρ / 2) Ax + Bz − c 22

L(x, y, z) = f (x)+ g(z)+ yT (Ax + Bz − c)

dual variable or Lagrange multiplier

additional penalty term

Page 26: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

ADMM

• augmented Lagrangian is differentiable under mild conditions (usually

better convergence etc.)

minimize f (x)+ g(z)subject to Ax + Bz = c

Lρ (x, y, z) = f (x)+ g(z)+ yT (Ax + Bz − c)+ (ρ / 2) Ax + Bz − c 22

Page 27: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

• ADMM consists of 3 steps per iteration k:

ADMMminimize f (x)+ g(z)subject to Ax + Bz = c

xk+1 := argminx

Lρ (x, zk , yk )

zk+1 := argminz

Lρ (xk+1, z, yk )

yk+1 := yk + ρ(Axk+1 + Bzk+1 − c)

Page 28: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

ADMM

• ADMM consists of 3 steps per iteration k:

minimize f (x)+ g(z)subject to Ax + Bz = c

xk+1 := argminx

f (x)+ (ρ / 2) Ax + Bzk − c + uk( )zk+1 := argmin

zg(z)+ (ρ / 2) Axk+1 + Bz − c + uk( )

uk+1 := uk + Axk+1 + Bzk+1 − c

constant

u = (1 / ρ)yscaled dual variable:

Page 29: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

ADMM

• ADMM consists of 3 steps per iteration k:

minimize f (x)+ g(z)subject to Ax + Bz = c

xk+1 := argminx

f (x)+ (ρ / 2) Ax + Bzk − c + uk2

2( )zk+1 := argmin

zg(z)+ (ρ / 2) Axk+1 + Bz − c + uk

2

2( )uk+1 := uk + Axk+1 + Bzk+1 − c

split f(x) and g(x) into independent problems! (u connects them)

u = (1 / ρ)yscaled dual variable:

Page 30: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

• ADMM consists of 3 steps per iteration k:

minimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0

xk+1 := argminx

12Cx − b 2

2 + (ρ / 2) ∇x − zk + uk2

2⎛⎝⎜

⎞⎠⎟

zk+1 := argminz

λ z 1 + (ρ / 2) ∇xk+1 − z + uk

2

2( )uk+1 := uk +∇xk+1 − zk+1

Page 31: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

1. x-update: xk+1 := argminx

12Cx − b 2

2 + (ρ / 2) ∇x − zk + uk2

2⎛⎝⎜

⎞⎠⎟

CTC + ρ∇T∇( )x = CTb + ρ∇T v( )

constant, say

minimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0 v = zk − uk

∇T v =∇x

∇y

⎣⎢⎢

⎦⎥⎥

T

v = ∇xT v1 +∇y

T v2

solve normal equations

Page 32: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

1. x-update:

• inverse filtering:

à may blow up, but that’s okay

xk+1 := argminx

12Cx − b 2

2 + (ρ / 2) ∇x − zk + uk2

2⎛⎝⎜

⎞⎠⎟

x = CTC + ρ∇T∇( )−1 CTb + ρ∇T v( )

constant, say

minimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0 v = zk − uk

xk+1 = F−1F c{ }* ⋅F b{ }+ ρ F ∇x{ }* ⋅F v1{ }+ F ∇y{ }* ⋅F v2{ }( )F c{ }* ⋅F c{ }+ ρ F ∇x{ }* ⋅F ∇x{ }+ F ∇y{ }* ⋅F ∇y{ }( )

⎨⎪

⎩⎪

⎬⎪

⎭⎪

precompute!

Page 33: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

2. z-update:

• l1-norm is not differentiable! yet, closed-form solution via element-wise

soft thresholding:

zk+1 := argminz

λ z 1 + (ρ / 2) ∇xk+1 − z + uk

2

2( ):= argmin

zλ z 1 + (ρ / 2) z − a 2

2

constant, say

minimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0

zk+1 := Sλ /ρ (a) Sκ (a) =a −κ a >κ0 a ≤κa +κ a < −κ

⎨⎪

⎩⎪

= (a −κ )+ − (−a −κ )+

a = ∇xk+1 + uk

κ = λ / ρ

Page 34: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

for k=1:max_iters

minimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0

xk+1 := argminx

12

Cρ∇

⎣⎢⎢

⎦⎥⎥x −

bρv

⎣⎢⎢

⎦⎥⎥2

2⎛

⎝⎜⎜

⎠⎟⎟

zk+1 := Sλ /ρ (∇xk+1 + uk )

uk+1 := uk +∇xk+1 − zk+1

inverse filtering

element-wise threshold

trivial

Page 35: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMM

for k=1:max_iters

minimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0

xk+1 := argminx

12

Cρ∇

⎣⎢⎢

⎦⎥⎥x −

bρv

⎣⎢⎢

⎦⎥⎥2

2⎛

⎝⎜⎜

⎠⎟⎟

zk+1 := Sλ /ρ (∇xk+1 + uk )

uk+1 := uk +∇xk+1 − zk+1

inverse filtering

element-wise threshold

trivial

à easy! J

Page 36: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMMminimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0Wiener filtering ADMM with anisotropic TV, λ = 0.01, ρ = 10

Page 37: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMMminimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0

λ = 0.1, ρ = 10λ = 0.05, ρ = 10λ = 0.01, ρ = 10

• too much TV: “patchy”, too little TV: noisy

Page 38: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMMminimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0Wiener filtering ADMM with anisotropic TV, λ = 0.1, ρ = 10

Page 39: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Deconvolution with ADMMminimize 12Cx − b 2

2 + λ z 1

subject to ∇x − z = 0

λ = 0.1, ρ = 10λ = 0.05, ρ = 10λ = 0.01, ρ = 10

• too much TV: okay because image actually has sparse gradients!

Page 40: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Outlook ADMM• powerful tool for many computational imaging problems• include generic prior in g(z), just need to derive proximal operator

• example priors: noise statistics, sparse gradient, smoothness, …

• weighted sum of different priors also possible

• anisotropic TV is one of the easiest priors

minimizex,z{ }

f (x)+ g(z)

subject to Ax = z

minimizex

12Ax − b 2

2

data fidelity! "# $#

+ Γ(x)regularization%

Page 41: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Remember!

• implement matrix-free operations for Ax and A’x if efficient (e.g.

multiplications and divisions in frequency space)

• split difficult problems (e.g., inverse problems with non-

differentiable priors) into easier subproblems - ADMM

Page 42: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

Homework 3

• implement:• filtering

• inverse filtering and Wiener filtering

• deconvolution with ADMM + (anisotropic) TV prior

Page 43: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

• notes for ADMM implementation:

• initialize U, Z, X with 0

• implement with matrix-free form: all FT multiplications / divisions

• in 2D, finite differences matrix becomes(anisotropic form), use matrix free-operations as well!

• see note notes in HW

• check ADMM example scripts: http://web.stanford.edu/~boyd/papers/admm/

∇ =∇x

∇y

⎣⎢⎢

⎦⎥⎥

Notes for Homework 3I ∈ℜM×N , X ∈ℜMN×1

U ∈ℜ2MN×1, Z ∈ℜ2MN×1

Page 44: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

• signal-to-noise ratio (SNR):

• peak signal-to-noise ratio (PSNR):

(always in dB)

• residual is value of objective function:

• convergence: residual for increasing iterations (should always decrease!)

Notes for Homework 3

12Cx − b 2

2 + λ∇x

∇y

⎣⎢⎢

⎦⎥⎥x1

12Cx − b 2

2not regularized: regularized:

MSE = 1mn

xtarget − xest( )n∑m∑ 2

PSNR = 10 ⋅ log10max(xtarget )

2

MSE⎛

⎝⎜⎞

⎠⎟= 10 ⋅ log10

1MSE

⎛⎝⎜

⎞⎠⎟

SNR =PsignalPnoise

SNRdB = 10 ⋅ log10PsignalPnoise

⎛⎝⎜

⎞⎠⎟

Page 45: Deconvolution with ADMM - Stanford UniversityDeconvolution • given measurements band convolution kernel c, what is x? * = b x c? Deconvolutionwith Inverse Filtering • naive solution:

References and Further Reading• Boyd, Parikh, Chu, Peleato, Eckstein, “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”,

Foundations and Trends in Machine Learning, 2011

• A. Chambolle, T. Pock “A first-order primal-dual algorithm for convex problems with applications in imaging”, Journal of Mathematical Imaging and Vision, 2011

• Boreman, “Modulation Transfer Function in Optical and ElectroOptical Systems”, SPIE Publications, 2001• Rudin, Osher, Fatemi, “Nonlinear total variation based noise removal algorithms”, Physica D: Nonlinear Phenomena 60, 1

• http://www.imagemagick.org/Usage/fourier/