TRANSCRIPT
Texture Modeling with Convolutional Spike-and-Slab RBMs
and Deep Extensions
Aaron Courville
University of Montreal
1
Joint work with:
Heng Luo, Pierre Luc Carrier, Yoshua Bengio (circa 2001)
Modeling Texture
• A subset of natural images
• Our goal: a deep probabilistic texture model
• Based on the spike-and-slab restricted Boltzmann machine
Image from (Karklin and Lewicki, 2008)
Brodatz Textures
2
Previous work: Boltzmann Machine Texture Models
• Kivinen and Williams (AISTATS 2012) showed, with tiled convolution:
– mPoT (Ranzato et al., 2010) performs well on texture synthesis and inpainting tasks.
– The Gaussian RBM can achieve similar performance.
– The Gaussian RBM can model multiple textures using label information.
3
Spike-and-slab RBM
• Two-layer model
– allows the hidden units to capture changes in the covariance between pixels
[Diagram: visible units v connected to slab units s and spike units h]
4
Binary ssRBM
• Spike-and-slab visible layer
• Binary hidden layer
• Factorial conditionals
– amenable to block Gibbs sampling
[Diagram: spike-and-slab pair (s, h) connected to binary units g]
5
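Because the conditionals are factorial, a block Gibbs sweep can update all spikes, then all slabs, then all visibles at once. A minimal NumPy sketch of one h → s → v sweep, assuming a simplified ssRBM with diagonal visible precision and per-unit slab precisions (all names, shapes, and parameter values here are illustrative, not the trained model's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions and parameters (illustrative assumptions only)
D, N = 16, 8                           # visible units, spike/slab pairs
W = rng.normal(0.0, 0.1, size=(D, N))  # one weight vector per hidden unit
b = np.zeros(N)                        # spike biases
alpha = np.ones(N)                     # slab precisions
lam = np.ones(D)                       # diagonal visible precision

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sweep(v):
    """One block Gibbs sweep h -> s -> v for a simplified ssRBM."""
    proj = v @ W                                   # v·W_i for each unit i
    # Spike units: factorial Bernoulli given v
    h = (rng.random(N) < sigmoid(0.5 * proj**2 / alpha + b)).astype(float)
    # Slab units: factorial Gaussian given v and h
    s = rng.normal(h * proj / alpha, 1.0 / np.sqrt(alpha))
    # Visible units: factorial Gaussian given s and h
    v_new = rng.normal((W @ (s * h)) / lam, 1.0 / np.sqrt(lam))
    return v_new, s, h

v = rng.normal(size=D)
for _ in range(10):
    v, s, h = gibbs_sweep(v)
```

Each update samples a whole layer in parallel, which is what makes the chain cheap to run.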
Spike-and-slab DBN
• The first layer (ssRBM)
• The second layer (bssRBM)
• The resulting ssDBN
[Diagram: stack of visible units v, spike-and-slab pair (s, h), and binary units g]
6
pss(v) = Σ_{s,h} pss(v | s, h) pss(s, h)

pDBN(v) = Σ_{s,h} pss(v | s, h) pbss(s, h)

where pbss(s, h) models the aggregated posterior:

p̂ss(s, h) = Σ_v pss(s, h | v) p̂(v)
Setup of experiments
• Brodatz textures
• Tasks:
– Texture synthesis, evaluated by TSS (Texture Similarity Score) (Heess et al., 2009)
– Texture inpainting, evaluated by MSSIM (Mean Structural Similarity Index) (Wang et al., 2004)
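For concreteness, the TSS can be sketched as a maximum normalized cross-correlation between a synthesized sample and a set of reference texture patches; the exact normalization in Heess et al. (2009) may differ, so treat this form as an assumption:

```python
import numpy as np

def tss(sample, patches):
    """Texture Similarity Score (sketch): the maximum normalized
    cross-correlation between a synthesized sample and a set of
    reference texture patches. The exact normalization in Heess
    et al. (2009) may differ -- this form is an assumption."""
    s = np.ravel(sample).astype(float)
    s /= np.linalg.norm(s) + 1e-12
    best = -1.0
    for p in patches:
        q = np.ravel(p).astype(float)
        q /= np.linalg.norm(q) + 1e-12
        best = max(best, float(s @ q))
    return best
```

MSSIM is available off the shelf (e.g. `skimage.metrics.structural_similarity`), so only TSS is sketched here.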
7
Training on Textures
8
[Figure: training set and test set, with a random 98 x 98 training patch]
• All textures were scaled by a factor of 0.75 or 0.5
– following Kivinen and Williams (2012)
Architecture
[Figure: architecture with a tiled-convolution bottom layer (11 x 11 receptive fields) and convolutional upper layers, on 98 x 98 patches]
9
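The bottom layer's tiled convolution can be sketched as follows: unlike ordinary convolution, weights are shared only between positions a full tile apart, so neighbouring positions use different filters. A toy NumPy version (function name and shapes are my own, not the talk's implementation):

```python
import numpy as np

def tiled_conv(image, filters, tile=2):
    """Tiled convolution (sketch): weights are shared only between
    positions a full tile apart, so output (i, j) uses the filter
    indexed by (i % tile, j % tile)."""
    fh, fw = filters.shape[2:]
    H = image.shape[0] - fh + 1
    W = image.shape[1] - fw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            f = filters[i % tile, j % tile]        # tile-local filter
            out[i, j] = np.sum(image[i:i+fh, j:j+fw] * f)
    return out
```

With `tile=1` this reduces to ordinary convolution; larger tiles trade weight sharing for more diverse local filters.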
Task I: Texture Synthesis
10
1. Markov chain in top layer (starting from noise): g(0) → g(1) → g(2) → g(3)
2. Project to image space: g(t) → (s(t), h(t)) → v(t)
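The two synthesis steps can be sketched generically; `top_gibbs_step` and `project_down` below are hypothetical stand-ins for a trained model's top-layer transition and downward projection:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize(top_gibbs_step, project_down, n_steps=3, top_shape=(8,)):
    """Texture synthesis (sketch): run a Markov chain in the top
    layer starting from noise, projecting each state g(t) down to
    an image v(t). Both callables are hypothetical stand-ins."""
    g = rng.normal(size=top_shape)      # g(0): start from noise
    images = []
    for _ in range(n_steps):
        g = top_gibbs_step(g)           # g(t) -> g(t+1) in top layer
        images.append(project_down(g))  # g -> (s, h) -> v
    return images
```

Running the chain only in the top layer is what lets the deeper model mix between texture configurations quickly.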
Task II: Texture Inpainting
11
1. Given frame:
2. Sample top layer: g(0) → g(1) → …
3. Project to image space: g(t) → (s(t), h(t)) → v(t)
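The inpainting loop can be sketched as Gibbs sampling with the observed frame pixels re-clamped after every step; `gibbs_step` is a hypothetical stand-in for the model's visible-to-visible transition:

```python
import numpy as np

rng = np.random.default_rng(0)

def inpaint(v_init, known_mask, known_values, gibbs_step, n_steps=20):
    """Texture inpainting (sketch): run the sampler over the whole
    image but re-clamp the observed frame pixels after every step,
    so only the masked-out region is resampled. `gibbs_step` is a
    hypothetical stand-in for the model's v -> v' transition."""
    v = np.where(known_mask, known_values, v_init)
    for _ in range(n_steps):
        v = gibbs_step(v)
        v = np.where(known_mask, known_values, v)  # keep frame fixed
    return v
```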
Synthesis Results
12
Synthesis Results
13
Inpainting Results
14
Inpainting Results
15
High-Resolution Textures
• Training on 98 x 98 patches from high-resolution texture images:
• Same receptive field sizes:
– 11 x 11 in the bottom layer (tiled convolution)
– 2 x 2 in higher layers (convolution)
16
[Figure: 98 x 98 patches at high resolution and at low resolution]
High-Resolution Textures
17
Depth helps mixing
• As observed by Bengio et al. (2012)
• A generalization of autocorrelation, where:
– vt is the sample image at step t
– v̂t is centered: v̂t = vt − μv
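A minimal version of such a mixing diagnostic, assuming a simple normalized inner-product form (the talk's exact generalization may differ):

```python
import numpy as np

def chain_autocorrelation(samples, lag):
    """Mixing diagnostic (sketch): correlate the centered samples
    v̂t = vt − μv with those `lag` steps later, normalized by the
    zero-lag term. This normalized inner-product form is an
    assumption, not necessarily the talk's exact statistic."""
    V = np.asarray([np.ravel(v) for v in samples], dtype=float)
    Vc = V - V.mean(axis=0)                  # center each pixel over time
    num = np.mean(np.sum(Vc[:len(Vc) - lag] * Vc[lag:], axis=1))
    denom = np.mean(np.sum(Vc * Vc, axis=1))
    return float(num / (denom + 1e-12))
```

Faster decay of this value toward zero as the lag grows indicates better mixing, which is the sense in which depth helps.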
18
Depth helps mixing
19
1-layer model 2-layer model 3-layer model
Other natural image datasets
• Preliminary experiments on CIFAR-10
– 3-layer ssDBN
• Tiled convolution for the bottom layer
• Convolution for higher layers
– Inpainting results on the test set
20
CIFAR-10 Inpainting
21
CIFAR-10 Inpainting
22
Modeling natural images of faces
23
• Experiments on TFD (Toronto Face Database)
– 4-layer ssDBN
• Dense connections for each layer
• Experiments on TFD (Toronto Face Database)
– 4-layer ssDBN
• Dense connections for each layer
24
Modeling natural images of faces
Samples from the model:
25
Inpainting: Gaussian RBM vs. ssRBM
26
Gaussian RBM:
Spike-and-Slab RBM:
Inpainting Frame:
Original Patches:
Spike-and-slab all the way up?
• The spike-and-slab first hidden layer helped. So why not use spike-and-slab all the way up?
• Early results are promising,
• But ….
s4RBM (single layer) and s4DBN (two layer):
[Diagram: spike-and-slab pairs (s1, h1), (s2, h2) with second-level spike-and-slab pairs (t1, g1), (t2, g2) over visibles v1, v2]
27
28