Statistical Image Models
Eero Simoncelli
Howard Hughes Medical Institute, Center for Neural Science, and
Courant Institute of Mathematical Sciences, New York University
Photographic Images
Diverse specialized structures:
• edges/lines/contours
• shadows/highlights
• smooth regions
• textured regions
Occupy a small region of the full space
[Figure: the space of all images, with typical images as a small subset]
One could describe this set as a deterministic manifold....
• Step edges are rare (lighting, junctions, texture, noise)
• One scale’s texture is another scale’s edge
• Need seamless transitions from isolated features to dense textures
But it seems more natural to use probability: a density P(x) over the space of all images, concentrated on typical images.
“Applications”
• Engineering: compression, denoising, restoration, enhancement/modification, synthesis, manipulation
• Science: optimality principles for neurobiology (evolution, development, learning, adaptation)
[Hubel ‘95]
Density models
• nonparametric: build a histogram from lots of observations...
• parametric/constrained: use “natural constraints” (geometry/photometry of image formation, computation, maxEnt)
historical trend (technology driven)
[Figure: pixel histogram of an original image (range [0, 237], dims [256, 256]) and of its histogram-equalized version (range [1.99, 238])]
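The equalization shown above is the nonparametric idea in miniature: build a histogram, then map each pixel through the empirical CDF. A minimal numpy sketch (synthetic image and a hypothetical `equalize` helper, not code from the talk):

```python
import numpy as np

# Histogram equalization: map each pixel through the empirical CDF of the image,
# so the output intensity histogram is approximately uniform.
def equalize(img, levels=256):
    hist, _ = np.histogram(img.ravel(), bins=levels, range=(0, levels))
    cdf = hist.cumsum() / img.size                    # empirical CDF per gray level
    return np.floor((levels - 1) * cdf[img]).astype(img.dtype)

# synthetic stand-in for a photographic image: narrow intensity distribution
img = np.clip(np.random.default_rng(0).normal(128, 20, (256, 256)), 0, 255).astype(np.uint8)
flat = equalize(img)
# `flat` spreads the narrow histogram across the full [0, 255] range
```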
General methodology
Observe “interesting” joint statistics → transform to optimal representation
“Onion peeling”
Evolution of image models
I. (1950’s): Fourier + Gaussian
II. (mid 80’s - late 90’s): Wavelets + kurtotic marginals
III. (mid 90’s - present): Wavelets + local context
• local amplitude (contrast)
• local orientation
IV. (last 5 years): Hierarchical models
Pixel correlation
[Figure: a. scatter plots of pixel pairs I(x,y) vs. I(x+1,y), I(x+2,y), and I(x+4,y); b. correlation vs. spatial separation (pixels), falling from near 1 toward 0 by about 40 pixels]
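The measurement in panel b can be sketched directly; here a blurred noise image stands in for a photograph (illustrative, not the original data):

```python
import numpy as np

# Estimate the correlation of pixel pairs at increasing horizontal separations.
# For spatially correlated images, the correlation falls off with distance.
rng = np.random.default_rng(1)
img = rng.standard_normal((128, 128))
k = np.ones((9, 9)) / 81                       # box blur introduces spatial correlation
img = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k, s=img.shape)))

def pair_corr(im, d):
    # correlation between each pixel and its neighbor d columns away
    return np.corrcoef(im[:, :-d].ravel(), im[:, d:].ravel())[0, 1]

corrs = [pair_corr(img, d) for d in (1, 2, 4, 16)]
# corrs decreases monotonically with separation
```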
Translation invariance
Assuming translation invariance,
=> covariance matrix is Toeplitz (convolutional)
=> eigenvectors are sinusoids
=> can diagonalize (decorrelate) with F.T.
Power spectrum captures full covariance structure
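The chain of implications can be checked numerically in one dimension (a sketch with a periodic boundary, so the Toeplitz covariance is circulant):

```python
import numpy as np

# A translation-invariant (circulant) covariance is diagonalized by the DFT:
# its eigenvectors are sinusoids, and the diagonal is the power spectrum.
n = 64
lag = np.minimum(np.arange(n), n - np.arange(n))
row = np.exp(-lag / 4.0)                                   # stationary covariance function
C = np.array([[row[(j - i) % n] for j in range(n)] for i in range(n)])

F = np.fft.fft(np.eye(n)) / np.sqrt(n)                     # unitary DFT matrix
D = F @ C @ F.conj().T                                     # should be (nearly) diagonal
off = D - np.diag(np.diag(D))
# off-diagonal entries vanish; the diagonal equals the DFT of the covariance row
```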
Spectral power
Structural: assume scale-invariance, F(sω) = s⁻ᵖ F(ω); then F(ω) ∝ 1/ωᵖ
[Ritterman 52; DeRiugin 56; Field 87; Tolhurst 92; Ruderman/Bialek 94; ...]
Empirical: [Figure: log₁₀ power vs. log₁₀ spatial frequency (cycles/image) is close to a straight line]
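The empirical log-log fit can be sketched on synthetic data whose spectrum is 1/ω² by construction (illustrative, not image measurements):

```python
import numpy as np

# Build a 1-D signal with power spectrum exactly 1/ω², then recover the exponent
# p from the slope of log power vs. log frequency.
rng = np.random.default_rng(2)
n = 4096
freqs = np.fft.rfftfreq(n)[1:]                             # positive frequencies, skip DC
spectrum = (1.0 / freqs) * np.exp(2j * np.pi * rng.random(freqs.size))
x = np.fft.irfft(np.concatenate([[0.0], spectrum]), n)     # amplitude ∝ 1/ω, random phase

power = np.abs(np.fft.rfft(x)[1:-1]) ** 2                  # drop DC and Nyquist bins
slope, _ = np.polyfit(np.log(freqs[:-1]), np.log(power), 1)
# slope ≈ −p with p ≈ 2
```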
Principal Components Analysis (PCA) + whitening
[Figure: a. scatter of adjacent pixel values; b. after PCA rotation; c. after whitening]
PCA basis for image blocks
PCA is not unique
Maximum entropy (maxEnt)
The density with maximal entropy satisfying E(f(x)) = c is of the form
p_ME(x) ∝ exp(−λ f(x))
where λ depends on c.
Examples: f(x) = x², f(x) = |x|
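Solving for λ given c is a one-dimensional root-finding problem; a grid-based sketch for f(x) = |x|, where the analytic answer is a Laplacian with λ = 1/c:

```python
import numpy as np

# maxEnt recipe: p(x) ∝ exp(−λ f(x)), with λ chosen so that E(f(x)) = c.
# E(f) under p is decreasing in λ, so bisection (here geometric) finds λ.
def maxent_lambda(f, c, grid):
    def moment(lam):                       # E(f) under p ∝ exp(−λ f), on the grid
        w = np.exp(-lam * f(grid))
        return float((f(grid) * w).sum() / w.sum())
    lo, hi = 1e-4, 1e4
    for _ in range(200):
        mid = np.sqrt(lo * hi)
        lo, hi = (mid, hi) if moment(mid) > c else (lo, mid)
    return np.sqrt(lo * hi)

grid = np.linspace(-60.0, 60.0, 24001)
lam = maxent_lambda(np.abs, c=2.0, grid=grid)   # analytic answer: λ = 1/c = 0.5
```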
Model I (Fourier/Gaussian)
Basis set: Fourier. Coefficient density: Gaussian, P(c).
Image: x = F⁻¹[(1/f²) · c], i.e. Gaussian coefficients shaped by a 1/f² spectrum and inverse transformed.
Gaussian model is weak
[Figure: a. a photographic image; b. a sample from the Fourier/Gaussian model with matched power spectrum. The model cannot distinguish them, yet the sample contains no edges, contours, or objects]
Bandpass Filter Responses
[Figure: log histogram of bandpass filter responses vs. a Gaussian density of the same variance; the response histogram is much heavier-tailed]
[Burt&Adelson 82; Field 87; Mallat 89; Daugman 89, ...]
“Independent” Components Analysis (ICA)
For Linearly Transformed Factorial (LTF) sources: guaranteed independence (with some minor caveats)
[Figure: a.–d. scatter plots of a factorial source, its linear mixture, and the whitened and ICA-rotated data]
[Comon 94; Cardoso 96; Bell/Sejnowski 97; ...]
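The LTF story can be sketched end-to-end with plain numpy (synthetic sources, and a brute-force rotation search standing in for FastICA):

```python
import numpy as np

# Mix two independent Laplacian (kurtotic, factorial) sources, whiten, then find
# the remaining rotation by maximizing the kurtosis of the outputs.
rng = np.random.default_rng(3)
s = rng.laplace(size=(2, 20000))              # factorial sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])        # mixing matrix
x = A @ s

d, E = np.linalg.eigh(np.cov(x))              # PCA + variance rescaling = whitening
xw = np.diag(d ** -0.5) @ E.T @ x             # (PCA alone leaves a rotation free)

def kurt(u):
    return np.mean(u ** 4) / np.mean(u ** 2) ** 2 - 3.0

thetas = np.linspace(0.0, np.pi / 2, 1801)
best = max(thetas, key=lambda t: kurt(np.cos(t) * xw[0] + np.sin(t) * xw[1]))
R = np.array([[np.cos(best), np.sin(best)], [-np.sin(best), np.cos(best)]])
y = R @ xw                                    # recovered sources (up to sign/order)
```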
ICA on image blocks
[Olshausen/Field ’96; Bell/Sejnowski ’97; example obtained with FastICA, Hyvärinen]
Marginal densities
Well-fit by a generalized Gaussian: P(x) ∝ exp(−|x/s|ᵖ)
[Mallat 89; Simoncelli&Adelson 96; Moulin&Liu 99; ...]
[Figure: log probability vs. wavelet coefficient value for four images, with fitted exponents p = 0.46, 0.58, 0.48, 0.59 and relative entropies ΔH/H = 0.0031, 0.0011, 0.0014, 0.0012]
Fig. 4. Log histograms of a single wavelet subband of four example images (see Fig. 1 for image description). For each histogram, tails are truncated so as to show 99.8% of the distribution. Also shown (dashed lines) are fitted model densities corresponding to equation (3). Text indicates the maximum-likelihood value of p used for the fitted model density, and the relative entropy (Kullback-Leibler divergence) of the model and histogram, as a fraction of the total entropy of the histogram.
non-Gaussian than others. By the mid 1990s, a number of authors had developed methods of optimizing a basis of filters in order to maximize the non-Gaussianity of the responses [e.g., 36, 4]. Often these methods operate by optimizing a higher-order statistic such as kurtosis (the fourth moment divided by the squared variance). The resulting basis sets contain oriented filters of different sizes with frequency bandwidths of roughly one octave. Figure 5 shows an example basis set, obtained by optimizing kurtosis of the marginal responses to an ensemble of 12 × 12 pixel blocks drawn from a large ensemble of natural images. In parallel with these statistical developments, authors from a variety of communities were developing multi-scale orthonormal bases for signal and image analysis, now generically known as “wavelets” (see chapter 4.2 in this volume). These provide a good approximation to optimized bases such as that shown in Fig. 5.
Once we’ve transformed the image to a multi-scale wavelet representation, what statistical model can we use to characterize the coefficients? The statistical motivation for the choice of basis came from the shape of the marginals, and thus it would seem natural to assume that the coefficients within a subband are independent and identically distributed. With this assumption, the model is completely determined by the marginal statistics of the coefficients, which can be examined empirically as in the examples of Fig. 4. For natural images, these histograms are surprisingly well described by a two-parameter generalized Gaussian (also known as a stretched, or generalized exponential) distribution [e.g., 31, 47, 34]:
Pc(c; s, p) = exp(−|c/s|ᵖ) / Z(s, p),   (3)
where the normalization constant is Z(s, p) = 2 (s/p) Γ(1/p). An exponent of p = 2 corresponds to a Gaussian density, and p = 1 corresponds to the Laplacian density. In
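As a numerical sanity check of the normalization constant in Eq. (3) (a sketch, not from the paper):

```python
from math import gamma

import numpy as np

# With Z(s, p) = 2 (s/p) Γ(1/p), the generalized Gaussian exp(−|c/s|^p) / Z(s, p)
# should integrate to one; verify by a fine Riemann sum.
def Z(s, p):
    return 2.0 * (s / p) * gamma(1.0 / p)

s, p = 3.0, 0.7
c = np.linspace(-400.0, 400.0, 400001)
dc = c[1] - c[0]
total = np.sum(np.exp(-np.abs(c / s) ** p)) * dc / Z(s, p)
# total ≈ 1 (tails beyond ±400 are negligible for these parameter values)
```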
Fig. 5. Example basis functions derived by optimizing a marginal kurtosis criterion [see 35].
Kurtosis vs. bandwidth
[Figure: sample kurtosis vs. filter bandwidth (octaves); after Field 87]
Note: Bandwidth matters much more than orientation [see Bethge 06]
Octave-bandwidth representations
Filter:
Spatial frequency selectivity:
Model II (LTF)
Basis set: oriented multi-scale (wavelet) filters. Coefficient density: independent, kurtotic marginals.
Image: linear superposition of basis functions weighted by the coefficients.
LTF also a weak model...
[Figure: sample, Gaussianized; sample, ICA-transformed and Gaussianized]
Trouble in paradise
• Biology: Visual system uses a cascade
- Where’s the retina? The LGN?
- What happens after V1? Why don’t responses get sparser? [Baddeley et al. 97; Chechik et al. 06]
• Statistics: Images don’t obey the ICA source model
- Any bandpass filter gives sparse marginals [Baddeley 96] => shallow optimum [Bethge 06; Lyu & Simoncelli 08]
- The responses of ICA filters are highly dependent [Wegmann & Zetzsche 90; Simoncelli 97]
Conditional densities
[Figure: conditional histograms of one subband coefficient given a nearby coefficient, showing a “bowtie” shape: the variance of one grows with the magnitude of the other]
Linear responses are not independent, even for optimized filters!
CSH-02
[Simoncelli 97; Schwartz&Simoncelli 01]
[Schwartz&Simoncelli 01]
• Large-magnitude subband coefficients are found at neighboring positions, orientations, and scales.
Method 1: Conditional Gaussian
Modeling heteroscedasticity (i.e., variable variance):
P(xₙ | {xₖ}) ∝ N(0; Σₖ wₙₖ |xₖ|² + σ²)
[Simoncelli 97; Buccigrossi&Simoncelli 99; see also ARCH models in econometrics!]
Joint densities
[Figure: joint and conditional histograms of wavelet coefficient pairs: adjacent, near, far, other scale, other orientation]
Fig. 8. Empirical joint distributions of wavelet coefficients associated with different pairs of basis functions, for a single image of a New York City street scene (see Fig. 1 for image description). The top row shows joint distributions as contour plots, with lines drawn at equal intervals of log probability. The three leftmost examples correspond to pairs of basis functions at the same scale and orientation, but separated by different spatial offsets. The next corresponds to a pair at adjacent scales (but the same orientation, and nearly the same position), and the rightmost corresponds to a pair at orthogonal orientations (but the same scale and nearly the same position). The bottom row shows corresponding conditional distributions: brightness corresponds to frequency of occurrence, except that each column has been independently rescaled to fill the full range of intensities.
remain. First, although the normalized coefficients are certainly closer to a homogeneous field, the signs of the coefficients still exhibit important structure. Second, the variance field itself is far from homogeneous, with most of the significant values concentrated on one-dimensional contours.
4 Discussion
After nearly 50 years of Fourier/Gaussian modeling, the late 1980s and 1990s saw a sudden and remarkable shift in viewpoint, arising from the confluence of (a) multi-scale image decompositions, (b) non-Gaussian statistical observations and descriptions, and (c) variance-adaptive statistical models based on hidden variables. The improvements in image processing applications arising from these ideas have been steady and substantial. But the complete synthesis of these ideas, and development of further refinements, are still underway.
Variants of the GSM model described in the previous section seem to represent the current state-of-the-art, both in terms of characterizing the density of coefficients, and in terms of the quality of results in image processing applications. There are several issues that seem to be of primary importance in trying to extend such models. First, a number of authors have examined different methods of describing regularities in the local variance field. These include spatial random fields [23, 26, 24], and multiscale tree-structured models [40, 55]. Much of the structure in the variance field may be attributed to discontinuous features such as edges, lines, or corners. There is a substantial literature in computer vision describing such structures [e.g., 57, 32, 17, 27, 56], but it has proven difficult to establish models that are both explicit and flexible. Finally, there have been several recent studies investigating geometric regularities that arise from the continuity of contours and boundaries [45, 16, 19, 21, 60]. These and other image structures will undoubtedly be incorporated into future statistical models, leading to further improvements in image processing applications.
References
[1] F. Abramovich, T. Besbeas, and T. Sapatinas. Empirical Bayes approach to block wavelet function estimation. Computational Statistics and Data Analysis, 39:435–451, 2002.
[Simoncelli, ‘97; Wainwright&Simoncelli, ‘99]
• Nearby: densities are approximately circular/elliptical
• Distant: densities are approximately factorial
ICA-transformed joint densities
[Figure: sample kurtosis vs. projection orientation (0 to π) for coefficient pairs at separations d = 2, 16, 32, comparing the ICA’d data with factorialized and sphericalized versions]
• Local densities are elliptical (but non-Gaussian)
• Distant densities are factorial
[Wegmann&Zetzsche ‘90; Simoncelli ’97; + many recent models]
Spherical vs LTF
• Histograms of sample kurtosis of projections of image blocks (3×3, 7×7, and larger) onto random unit-norm basis functions, for the raw blocks, a sphericalized version, and a factorialized version
• These imply data are closer to spherical than factorial
[Lyu & Simoncelli 08]
Non-Gaussian elliptical observations and models of natural images:
- Zetzsche & Krieger, 1999
- Huang & Mumford, 1999
- Wainwright & Simoncelli, 2000
- Hyvärinen and Hoyer, 2000
- Parra et al., 2001
- Srivastava et al., 2002
- Sendur & Selesnick, 2002
- Teh et al., 2003
- Gehler and Welling, 2006
- Lyu & Simoncelli, 2008
- etc.
• u is Gaussian
• z and u are independent
• x = √z · u is elliptically symmetric, with covariance ∝ Cu
• marginals of x are leptokurtotic
[Wainwright&Simoncelli 99]
Modeling heteroscedasticity
Method 2: Hidden scaling variable for each patch, a Gaussian scale mixture (GSM) [Andrews & Mallows 74]:
x = √z · u, with u Gaussian and z > 0 independent of u, so that x has covariance ∝ Cu
• Empirically, z is approximately lognormal [Portilla et al., icip-01]
• Alternatively, can use Jeffrey’s noninformative prior [Figueiredo&Nowak, ‘01; Portilla et al., ‘03]
GSM - prior on z
lognormal: p_z(z) = exp(−(log z − µₗ)² / (2σₗ²)) / (z (2π σₗ²)^(1/2))
Jeffrey’s: p_z(z) ∝ 1/z
GSM simulation
[Figure: conditional histograms of neighboring coefficients for image data and for a GSM simulation; both show the bowtie variance dependency]
[Wainwright & Simoncelli, NIPS*99]
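A GSM draw is easy to simulate; a sketch with a lognormal z (synthetic, mirroring the slide's comparison):

```python
import numpy as np

# x = sqrt(z) * u with lognormal z and Gaussian u: the mixing variable makes the
# marginal leptokurtotic, and coefficients sharing a z are uncorrelated but not
# independent (their magnitudes correlate).
rng = np.random.default_rng(5)
N = 200000
z = np.exp(rng.standard_normal(N))            # lognormal scaling variable
u1, u2 = rng.standard_normal((2, N))          # independent Gaussian pair
x1, x2 = np.sqrt(z) * u1, np.sqrt(z) * u2

kurtosis = np.mean(x1 ** 4) / np.mean(x1 ** 2) ** 2      # > 3: heavy tails
corr_raw = np.corrcoef(x1, x2)[0, 1]                     # ≈ 0: uncorrelated
corr_mag = np.corrcoef(np.abs(x1), np.abs(x2))[0, 1]     # > 0: magnitude dependency
```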
Model III (GSM)
Basis set: wavelets. Coefficient density: Gaussian scale mixture, x = √z · u.
Image: linear superposition of basis functions weighted by the coefficients.
Original coefficients vs. normalized by √z
[Figure: marginal log-histograms and joint conditional histograms of a subband, before and after dividing by the local √z; normalization Gaussianizes the marginal and removes the variance dependency]
IBM, 10/03
[Schwartz&Simoncelli 01]
[Ruderman&Bialek 94]
[Figure: model encoding cost (bits/coeff) vs. empirical conditional entropy for the conditional model (near the ideal), and vs. empirical first-order entropy for the Gaussian and generalized Laplacian models]
[Buccigrossi & Simoncelli 99]
Bayesian denoising
• Additive Gaussian noise: y = x + w, P(y|x) ∝ exp(−(y − x)² / (2σ_w²))
• Bayes’ least squares solution is conditional mean:
x̂(y) = E(x|y) = ∫ dx P(y|x) P(x) x / P(y)
I. Classical
If signal is Gaussian, BLS estimator is linear:
x̂(y) = σ_x² / (σ_x² + σ_n²) · y
[Figure: denoised x̂ vs. noisy y: a straight line through the origin]
=> suppress fine scales, retain coarse scales
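A scalar sketch of this linear (Wiener) estimator on synthetic Gaussian data (illustrative variances):

```python
import numpy as np

# Gaussian signal in Gaussian noise: scaling each observation by
# sigma_x^2 / (sigma_x^2 + sigma_n^2) is the Bayes least-squares estimate,
# and it beats using the noisy observation directly.
rng = np.random.default_rng(6)
sigma_x, sigma_n = 2.0, 1.0
x = sigma_x * rng.standard_normal(50000)
y = x + sigma_n * rng.standard_normal(50000)

gain = sigma_x**2 / (sigma_x**2 + sigma_n**2)    # = 0.8 here
x_hat = gain * y

mse_linear = np.mean((x_hat - x) ** 2)           # ≈ σx²σn²/(σx²+σn²) = 0.8
mse_identity = np.mean((y - x) ** 2)             # ≈ σn² = 1.0
```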
Non-Gaussian coefficients
[Burt&Adelson ‘81; Field ‘87; Mallat ‘89; Daugman ‘89; etc]
[Figure: log response histogram of a bandpass filter vs. the best-fitting Gaussian density]
II. BLS for non-Gaussian prior
• Assume marginal distribution [Mallat ‘89]: P(x) ∝ exp(−|x/s|ᵖ)
• Then Bayes estimator is generally nonlinear:
[Figure: estimator curves for p = 2.0, 1.0, 0.5]
[Simoncelli & Adelson, ‘96]
MAP shrinkage
p=2.0 p=1.0 p=0.5
[Simoncelli 99]
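The p = 2 and p = 1 cases of MAP shrinkage have simple closed forms; a sketch (hypothetical helper name, unit scale and noise parameters assumed):

```python
import numpy as np

# MAP estimates for prior P(x) ∝ exp(−|x/s|^p) with Gaussian noise of variance σ²:
# p = 2 gives linear shrinkage; p = 1 gives the soft-threshold rule.
def map_shrink(y, p, s=1.0, sigma=1.0):
    if p == 2:                                   # quadratic prior: linear shrinkage
        return y * s**2 / (s**2 + 2 * sigma**2)
    if p == 1:                                   # Laplacian prior: soft threshold
        return np.sign(y) * np.maximum(np.abs(y) - sigma**2 / s, 0.0)
    raise ValueError("closed form shown only for p in {1, 2}")

y = np.linspace(-5.0, 5.0, 11)
lin = map_shrink(y, p=2)       # scaled copy of y
soft = map_shrink(y, p=1)      # small values are zeroed, large values shifted
```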
Denoising: Joint
E(x | y) = ∫ dz P(z|y) E(x | y, z)
         = ∫ dz P(z|y) [ z Cu (z Cu + Cw)⁻¹ y ]_ctr
where
P(z|y) = P(y|z) P(z) / P(y),   P(y|z) = exp(−yᵀ (z Cu + Cw)⁻¹ y / 2) / √((2π)ᴺ |z Cu + Cw|)
Numerical computation of the solution is reasonably efficient if one jointly diagonalizes Cu and Cw ...
[Portilla, Strela, Wainwright, Simoncelli, ’03]
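In the scalar case (Cu and Cw variances rather than matrices), the integral over z can be sketched by discretization; this is an illustrative stand-in with a truncated 1/z prior, not the authors' implementation:

```python
import numpy as np

# Bayes least-squares estimate for a scalar GSM coefficient in Gaussian noise:
# E(x|y) = sum over z of P(z|y) * [z*Cu / (z*Cu + Cw)] * y, on a grid of z.
def gsm_bls(y, Cu=1.0, Cw=0.25, zgrid=None):
    if zgrid is None:
        zgrid = np.geomspace(1e-3, 1e3, 400)
    prior = 1.0 / zgrid                            # truncated 1/z prior (unnormalized)
    var = zgrid * Cu + Cw                          # Var(y | z)
    loglik = -0.5 * y**2 / var - 0.5 * np.log(2 * np.pi * var)
    post = prior * np.exp(loglik - loglik.max())   # ∝ P(z | y) on the grid
    post /= post.sum()
    wiener = zgrid * Cu / (zgrid * Cu + Cw)        # per-z Wiener gain
    return float(np.sum(post * wiener) * y)

# Small observations are shrunk strongly; large ones are nearly preserved.
```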
IPAM, 9/04
Example estimators
[Figure: estimated coefficient vs. noisy coefficient for the scalar and single-neighbor cases; soft shrinkage nonlinearities whose threshold scales with σ_w]
[Portilla etal 03]
Comparison to other methods
[Figure: PSNR improvement vs. input noise level, averaged over 3 images]
[Portilla etal 03]
[Figure: Original; Noisy (22.1 dB); Matlab’s wiener2 (28 dB); BLS-GSM (30.5 dB)]
[Figure: Original; Noisy (8.1 dB); undecimated wavelet thresholding (19.0 dB); BLS-GSM (21.2 dB)]
Real sensor noise
400 ISO denoised
GSM summary
• GSM captures local variance
• Underlying Gaussian leads to simple computation
• Excellent denoising results
• What’s missing?
- Global model of z variables [Wainwright et al. 99; Romberg et al. ‘99; Hyvarinen/Hoyer ‘02; Karklin/Lewicki ‘02; Lyu/Simoncelli 08]
- Explicit geometry: phase and orientation
Global models for z
• Non-overlapping neighborhoods, tree-structured z [Wainwright et al. 99; Romberg et al. ’99]
• Field of GSMs: z is an exponentiated GMRF, u is a GMRF, and the subband is the product √z · u [Lyu&Simoncelli 08]
[Figure: tree of z variables linking coarse scale to fine scale]
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. X, NO. X, XX 200X 9
[Figure: PSNR differences vs. noise level for Barbara, Lena, and Boats]
Fig. 6. Performance comparison of denoising methods for three different images. Plotted are differences in PSNR for different input noise levels (σ) between FoGSM and four other methods (BM3D [37], BLS-GSM [17], kSVD [39], and FoE [27]). The PSNR values for these methods were taken from corresponding publications.
original image | noisy image (σ = 50) (PSNR = 14.15 dB)
local GSM [17] (PSNR = 25.45dB) FoGSM (PSNR = 26.40dB)
Fig. 7. Denoising results using local GSM [17] and FoGSM.
[Plot legend: FoE, kSVD, GSM, BM3D, FoGSM]
[Lyu&Simoncelli, PAMI 08]
State-of-the-art denoising
2-band steerable pyramid: Image decomposition in terms of multi-scale gradient measurements
Measuring Orientation
[Simoncelli et al., 1992; Simoncelli & Freeman 1995]
Multi-scale gradient basis
• Multi-scale bases: efficient representation
• Derivatives: good for analysis
• Local Taylor expansion of image structures
• Explicit geometry (orientation)
• Combination:
• Explicit incorporation of geometry in basis
• Bridge between PDE / harmonic analysis approaches
[Figure: local orientation and magnitude fields of an image]
[Hammond&Simoncelli 06; cf. Oppenheim and Lim 81]
Importance of local orientation
[Figure: reconstruction with randomized orientation vs. randomized magnitude]
[Hammond&Simoncelli 05]
Reconstruction from orientation
• Reconstruction by projections onto convex sets
• Resilient to quantization
Quantized to 2 bits
[Hammond&Simoncelli 06]
Original
Image patches related by rotation
two-band steerable pyramid coefficients [Hammond&Simoncelli 06]
raw patches
rotated patches
--- Raw Patches Rotated Patches
PCA of normalized gradient patches
[Hammond&Simoncelli 06]
Orientation-Adaptive GSM model
Model a vectorized patch of wavelet coefficients as x = √z · R(θ) u, where
• R(θ) is a patch rotation operator
• z, θ are hidden magnitude/orientation variables
Conditioned on (z, θ), the patch is zero mean Gaussian with covariance z C(θ) = z R(θ) C R(θ)ᵀ
[Hammond&Simoncelli 06]
Estimation of C(θ) from noisy data
The signal covariance is unknown; approximate it using covariances measured from the noisy patches, assuming signal and noise independent and the noise rotationally invariant (and, w.l.o.g., E[z] = 1).
Bayesian MMSE Estimator
Condition on the hidden variables and integrate them out:
• conditioned on (z, θ), the estimate is a Wiener estimate
• the noisy patch has covariance z C(θ) + Cw
• use a separable prior for the hidden variables z and θ
[Hammond&Simoncelli 06]
[Figure, σ = 40: noisy 2.81 dB; gsm2 12.4 dB; oagsm 13.1 dB]
Locally adaptive covariance
• Karklin & Lewicki 08: Each patch is Gaussian, with covariance constructed from a weighted outer-product of fixed vectors:
p(x) = G(x; C(y)),  log C(y) = Σₙ yₙ Bₙ,  Bₙ = Σₖ wₙₖ bₖ bₖᵀ,  p(y) = Πₙ exp(−|yₙ|)
• Guerrero-Colon, Simoncelli & Portilla 08: Each patch is a mixture of GSMs (MGSMs):
p(x) = Σₖ Pₖ ∫ p(zₖ) G(x; zₖ Cₖ) dzₖ
MGSMs generative model
Patch x drawn from {√z₁ u₁, √z₂ u₂, ..., √z_K u_K} with probabilities {P₁, P₂, ..., P_K}
Parameters:
• Covariances Cₖ
• Scale densities pₖ(zₖ)
• Component probabilities Pₖ
• Number of components K
Parameters can be fit to data of one or more images by maximizing likelihood (EM-like)
[Guerrero-Colon, Simoncelli, Portilla 08]
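The generative process can be sketched directly; here with toy 2-D "patches" and made-up component covariances (illustrative only):

```python
import numpy as np

# MGSM draw: pick component k with probability P_k, draw a scale z, and emit
# sqrt(z) * u with u ~ N(0, C_k). Two components with opposite "orientations".
rng = np.random.default_rng(7)
P = np.array([0.6, 0.4])                              # component probabilities
C = [np.array([[1.0, 0.8], [0.8, 1.0]]),              # positively correlated component
     np.array([[1.0, -0.8], [-0.8, 1.0]])]            # negatively correlated component

def sample_patch():
    k = rng.choice(2, p=P)
    z = np.exp(rng.standard_normal())                 # lognormal scale density
    u = rng.multivariate_normal(np.zeros(2), C[k])
    return np.sqrt(z) * u, k

draws = [sample_patch() for _ in range(5000)]
patches = np.array([d[0] for d in draws])
labels = np.array([d[1] for d in draws])
# scale mixing preserves each component's correlation sign
```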
MGSM “segmentation”
[Figure: image, component assignment maps, and the first six eigenvectors of the GSM covariance matrices]
[Guerrero-Colon, Simoncelli, Portilla 08]
MGSM “segmentation”
Eigenvectors of GSM components represent invariant subspaces: “generalized complex cells”
Potential of local homogeneous models?
Consider an implicit model: maxEnt subject to constraints on subband coefficients:
• marginal statistics [var, skew, kurtosis]
• local raw correlations
• local variance correlations
• local phase correlations
[Portilla & Simoncelli 00; cf. Zhu, Wu & Mumford 97]
Visual texture
Homogeneous, with repeated structures
“You know it when you see it”
All Images
Texture Images
Equivalence class (visually indistinguishable)
Iterative synthesis algorithm
Analysis: example texture → transform → measure statistics
Synthesis: random seed → transform → measure statistics → adjust → inverse transform (iterate)
[Portilla&Simoncelli 00; cf. Heeger&Bergen ‘95]
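A stripped-down version of the analyze/adjust loop, in the spirit of Heeger&Bergen (only a power spectrum and a marginal histogram, far short of the full Portilla-Simoncelli statistic set; the "texture" is synthetic):

```python
import numpy as np

# Alternately impose the target's amplitude spectrum and marginal histogram on a
# random seed; each pass projects the sample onto one constraint set.
rng = np.random.default_rng(0)
fx, fy = np.meshgrid(np.fft.fftfreq(64), np.fft.fftfreq(64))
target = rng.standard_normal((64, 64))
target = np.real(np.fft.ifft2(np.fft.fft2(target) / (1 + np.hypot(fx, fy))))  # toy 1/f-ish texture

t_amp = np.abs(np.fft.fft2(target))           # target amplitude spectrum
t_sorted = np.sort(target.ravel())             # target marginal (for histogram matching)

x = rng.standard_normal((64, 64))              # random seed
for _ in range(20):
    X = np.fft.fft2(x)
    x = np.real(np.fft.ifft2(t_amp * np.exp(1j * np.angle(X))))   # impose spectrum
    ranks = np.argsort(np.argsort(x.ravel()))
    x = t_sorted[ranks].reshape(x.shape)       # impose marginal histogram
```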
Examples: Artificial
Photographic, quasi-periodic
Photographic, aperiodic
Photographic, structured
Photographic, color
Non-textures?
Texture mixtures
Convex combinations in parameter space
=> Parameter space includes non-textures
Summary
• Fusion of empirical data with structural principles
• Statistical models have led to state-of-the-art image processing, and are relevant for biological vision
• Local adaptation to {variance, orientation, phase, ...} gives improvement, but makes learning harder
• Cascaded representations emerge naturally
• There’s still much room for improvement!
Cast
• Local GSM model: Martin Wainwright, Javier Portilla
• GSM Denoising: Javier Portilla, Martin Wainwright, Vasily Strela
• Variance-adaptive compression: Robert Buccigrossi
• Local orientation and OAGSM: David Hammond
• Field of GSMs: Siwei Lyu
• Mixture of GSMs: Jose-Antonio Guerrero-Colón, Javier Portilla
• Texture representation/synthesis: Javier Portilla