Machine Learning and Compressive Sensing for Electron Microscopy
Andrew Stevens1,2, Xin Yuan2, Lawrence Carin2, Nigel Browning1
1Pacific Northwest National Laboratory
2Duke University ECE
Outline
1. Related Results: STEM Inpainting; STEM/TEM Super-resolution
2. Models: Mixture models; Factor analysis; Mixture of factor analyzers
3. Video CS: Data; Camera system; Demonstration
Goals
- Reduce dose (and data volume) through spatial compression.
- Increase speed and decrease data volume through temporal compression.
- Learn a representation for sample structures (bulk, defects, grain boundaries, etc.).
20% SrTiO3 STEM Inpainting [Stevens et al., 2013]
20% zeolite STEM inpainting [Stevens et al., 2013]
Super-resolution images are not distributable.
Compressive Sensing (CS) [Stevens et al., 2013, Zhou et al., 2012, Chen et al., 2010]

Given a sensing matrix Φ ∈ R^{Q×P}, Q ≪ P, usually Gaussian or Bernoulli, and compressed measurements y_i,

y_i = Φ_i(x_i + ε_i).

We want to recover x_i.

Inpainting is the case where Φ is a subset of rows of the identity matrix.
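As a rough illustration (the sizes, seed, and noise level below are my own, not from the talk), both a dense sensing matrix and the inpainting special case can be sketched in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q = 64, 13                       # signal length and measurement count, Q << P

x = rng.normal(size=P)              # true signal
eps = 0.01 * rng.normal(size=P)     # measurement noise

# Gaussian sensing matrix (Bernoulli +/-1 entries would work the same way)
Phi_gauss = rng.normal(size=(Q, P))
y_gauss = Phi_gauss @ (x + eps)

# Inpainting: Phi keeps a random subset of rows of the identity,
# so each measurement is just a (noisy) sampled pixel
keep = rng.choice(P, size=Q, replace=False)
Phi_inpaint = np.eye(P)[keep]
y_inpaint = Phi_inpaint @ (x + eps)
```

The inpainting measurements coincide with the sampled pixels of the noisy signal, which is why random-scan STEM acquisition fits this framework.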
Sparse CS

y = Φ(x + ε),  y ∈ R^Q, x ∈ R^P, Q ≪ P

The true signal x is assumed to be sparse in some (overcomplete) basis D ∈ R^{P×K}, P < K:

y = Φ(Dw + ε),  nnz(w) ≪ K
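A minimal sketch of this generative view (illustrative sizes of my choosing; no recovery algorithm is attempted here):

```python
import numpy as np

rng = np.random.default_rng(1)
P, K, Q, nnz = 32, 64, 10, 3        # illustrative: K > P (overcomplete), Q << P

D = rng.normal(size=(P, K))         # overcomplete dictionary
w = np.zeros(K)
support = rng.choice(K, size=nnz, replace=False)
w[support] = rng.normal(size=nnz)   # sparse coefficient vector, nnz(w) << K

Phi = rng.normal(size=(Q, P))       # sensing matrix
eps = 0.01 * rng.normal(size=P)     # signal-domain noise, as in the slide
y = Phi @ (D @ w + eps)             # compressed measurements
```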
Manifold CS [Chen et al., 2010]
Gaussian mixture model [Rasmussen, 1999]
p(x_i | ·) = Σ_{t=1}^T λ_t N(μ_t, τ_t^{−1})

x_i ~ N(μ_{t(i)}, τ_{t(i)}^{−1})
μ_t ~ N(a, b^{−1})
τ_t ~ Gamma(c, d)
λ_1, …, λ_T ~ Dirichlet(α/T, …, α/T)
t(i) ~ Multinomial(1; λ_1, …, λ_T)

p(t(i) = j | t_{(−i)}, α) = (n_j^{−i} + α/T) / (n − 1 + α)
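Sampling from this generative model is straightforward; here is a hedged NumPy sketch (the concrete hyperparameter values stand in for a, b, c, d, which the slide leaves abstract):

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, alpha = 3, 1000, 1.0                  # illustrative sizes and concentration

# placeholder values for the hyperparameters (a, b, c, d)
mu = rng.normal(0.0, 1.0, size=T)           # mu_t ~ N(a, b^-1)
tau = rng.gamma(2.0, 1.0, size=T)           # tau_t ~ Gamma(c, d)
lam = rng.dirichlet(np.full(T, alpha / T))  # (lam_1, ..., lam_T) ~ Dirichlet

t_of_i = rng.choice(T, size=n, p=lam)       # t(i) ~ Multinomial(1; lam)
x = rng.normal(mu[t_of_i], tau[t_of_i] ** -0.5)  # x_i ~ N(mu_t(i), tau_t(i)^-1)
```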
Chinese Restaurant Process
[Figure: two tables with parameters (μ_1, τ_1) and (μ_2, τ_2)]

p(t = 1) = 1/(1 + α),  p(t = 2) = α/(1 + α)
[Figure: three tables with parameters (μ_1, τ_1), (μ_2, τ_2), (μ_3, τ_3)]

p(t = 1) = 3/(4 + α),  p(t = 2) = 1/(4 + α),  p(t = 3) = α/(4 + α)
[Figure: tables with parameters (μ_1, τ_1) through (μ_4, τ_4)]

Occupied tables have probabilities 4/(9 + α), 3/(9 + α), 1/(9 + α); a new table has probability α/(9 + α).
[Figure: tables with parameters (μ_1, τ_1) through (μ_6, τ_6)]

Occupied tables have probabilities 8/(20 + α), 5/(20 + α), 4/(20 + α), 2/(20 + α), 1/(20 + α); a new table has probability α/(20 + α).
[Figure: tables with parameters (μ_1, τ_1) through (μ_6, τ_6)]

p(table t) = n_t / (n − 1 + α),  p(new table) = α / (n − 1 + α)
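The seating rule above can be simulated directly; this is a minimal sketch (the function name and the simulation parameters are mine, not from the talk):

```python
import numpy as np

def sample_crp(n, alpha, rng):
    """Seat n customers; return table assignments and per-table counts."""
    counts, assign = [], []
    for i in range(n):                       # i customers are already seated
        # p(existing table t) = n_t / (i + alpha), p(new table) = alpha / (i + alpha)
        probs = np.array(counts + [alpha]) / (i + alpha)
        t = rng.choice(len(probs), p=probs)
        if t == len(counts):
            counts.append(0)                 # open a new table
        counts[t] += 1
        assign.append(t)
    return assign, counts

rng = np.random.default_rng(3)
assign, counts = sample_crp(100, alpha=2.0, rng=rng)
```

The first customer always opens the first table, matching the p(t = 1) = 1/(1 + α) case shown earlier with α-weight going to a second, empty table.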
CRP Stick Breaking
λ_t = v_t ∏_{j=1}^{t−1} (1 − v_j),  v_t ~ Beta(1, α)

[Figure: stick-breaking construction of the weights λ_1, λ_2, …, λ_7]
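A short NumPy sketch of the stick-breaking construction (the truncation level T is an illustrative choice of mine; the full process is infinite):

```python
import numpy as np

rng = np.random.default_rng(4)
T, alpha = 50, 2.0                  # truncation level T is illustrative only

v = rng.beta(1.0, alpha, size=T)    # v_t ~ Beta(1, alpha)
# lambda_t = v_t * prod_{j<t} (1 - v_j): each v_t breaks off a fraction
# of whatever stick length remains
remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
lam = v * remaining
```

The truncated weights sum to 1 − ∏_t(1 − v_t), so the leftover mass shrinks as T grows.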
Factor analysis
Given n samples x_i ∈ R^N:

x_i = D w_i + μ + ε_i
ε_i ~ N(0, γ_ε^{−1} I_N)
w_i ~ N(0, I_K)

where D ∈ R^{N×K} and μ ∈ R^N. Equivalently,

x_i ~ N(D w_i + μ, γ_ε^{−1} I_N)
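Drawing samples from this factor model is a few lines of NumPy (dimensions and the noise precision below are placeholders I chose for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, n = 20, 5, 200                # illustrative dimensions

D = rng.normal(size=(N, K))         # loading matrix
mu = rng.normal(size=N)             # mean offset
gamma_eps = 100.0                   # noise precision (placeholder value)

W = rng.normal(size=(n, K))                           # w_i ~ N(0, I_K)
E = rng.normal(scale=gamma_eps ** -0.5, size=(n, N))  # eps_i ~ N(0, gamma_eps^-1 I_N)
X = W @ D.T + mu + E                                  # x_i = D w_i + mu + eps_i
```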
Image Patches

[Figure: an image is divided into overlapping 8×8 patches; with full overlap there are 64 patches per pixel]
Dictionaries

[Figures: example dictionaries — the Haar wavelet basis and the discrete cosine basis — and a patch built up as a weighted sum of dictionary elements with coefficients (−1.5, 1.5, −1.3, −1.1)]
Sparsity via Beta-Bernoulli Process
For each x_i ∈ R^P we have a latent binary vector z_i ∈ {0, 1}^K that encodes which dictionary elements are used by x_i:

z_ki ~ Bern(π_k),  π_k ~ Beta(a/K, b(K − 1)/K)
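Under this prior E[π_k] = a/(a + b(K − 1)), so each signal activates only a handful of dictionary elements. A quick sketch (sizes and a, b values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
K, n = 100, 500                     # illustrative sizes
a, b = 1.0, 1.0                     # placeholder hyperparameters

pi = rng.beta(a / K, b * (K - 1) / K, size=K)   # pi_k ~ Beta(a/K, b(K-1)/K)
Z = rng.random((n, K)) < pi                     # z_ki ~ Bern(pi_k)

# with a = b = 1, E[pi_k] = 1/K, so roughly one active element per signal
avg_active = Z.sum(axis=1).mean()
```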
Draw From Indian Buffet Process [Griffiths and Ghahramani, 2011]
The first customer samples Poisson(α) dishes. The i-th customer samples each old dish with probability #(previous samples)/i and samples Poisson(α/i) new dishes.
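That description translates directly into a simulation; here is a minimal sketch (function name and parameters are mine):

```python
import numpy as np

def sample_ibp(n, alpha, rng):
    """Return dishes-per-customer and the popularity count of each dish."""
    dish_counts, per_customer = [], []
    for i in range(1, n + 1):
        taken = 0
        for k in range(len(dish_counts)):
            if rng.random() < dish_counts[k] / i:   # old dish: prob count / i
                dish_counts[k] += 1
                taken += 1
        n_new = rng.poisson(alpha / i)              # Poisson(alpha / i) new dishes
        dish_counts.extend([1] * n_new)
        per_customer.append(taken + n_new)
    return per_customer, dish_counts

rng = np.random.default_rng(7)
per_customer, dish_counts = sample_ibp(50, alpha=3.0, rng=rng)
```

For i = 1 the loop over old dishes is empty and n_new ~ Poisson(α), matching the first-customer rule.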
Beta Process Factor Analysis (BPFA) [Zhou et al., 2012]
x_i = D w_i + ε_i
d_k ~ N(0, P^{−1} I_P)
ε_i ~ N(0, γ_ε^{−1} I_P),  γ_ε ~ Gamma(c, d)
w_i = s_i ⊙ z_i
s_i ~ N(0, γ_s^{−1} I_K),  γ_s ~ Gamma(e, f)
z_i ~ ∏_{k=1}^K Bern(π_k),  π ~ ∏_{k=1}^K Beta(a/K, b(K − 1)/K)
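Sampling from the BPFA generative model combines the factor-analysis and Beta-Bernoulli pieces above; a hedged sketch (sizes and the Gamma shapes/scales are stand-ins for the abstract hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(8)
P, K, n = 64, 128, 50               # illustrative sizes
a, b = 1.0, 1.0                     # placeholder hyperparameters

D = rng.normal(scale=P ** -0.5, size=(P, K))        # d_k ~ N(0, P^-1 I_P)
gamma_eps = rng.gamma(2.0, 1.0)                     # stand-in for Gamma(c, d)
gamma_s = rng.gamma(2.0, 1.0)                       # stand-in for Gamma(e, f)

pi = rng.beta(a / K, b * (K - 1) / K, size=K)
Z = (rng.random((n, K)) < pi).astype(float)         # z_i, one row per signal
S = rng.normal(scale=gamma_s ** -0.5, size=(n, K))  # s_i ~ N(0, gamma_s^-1 I_K)
W = S * Z                                           # w_i = s_i (elementwise) z_i
E = rng.normal(scale=gamma_eps ** -0.5, size=(n, P))
X = W @ D.T + E                                     # x_i = D w_i + eps_i
```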
Connection to Optimization
−log p(D, S, Z, π | X, a, b, c, d, e, f)
  = (γ_ε/2) Σ_{i=1}^N ‖x_i − D(s_i ⊙ z_i)‖_2^2
  + (P/2) Σ_{k=1}^K ‖d_k‖_2^2
  + (γ_s/2) Σ_{i=1}^N ‖s_i‖_2^2
  − log f_{Beta-Bern}(Z; a, b) − log Gamma(γ_ε | c, d) − log Gamma(γ_s | e, f) + const
Mixture of factor analyzers [Chen et al., 2010]
x_i ~ N(D_{t(i)} w_i + μ_{t(i)}, γ_{ε,t(i)}^{−1} I_P)
w_i = s_i ⊙ z_{t(i)},  s_i ~ N(0, γ_s^{−1} I_K)
t(i) ~ Mult(1; λ_1, …, λ_T),  λ_t = v_t ∏_{j=1}^{t−1} (1 − v_j)
z_t ~ ∏_{k=1}^K Bernoulli(π_k),  π ~ ∏_{k=1}^K Beta(a/K, b(K − 1)/K)
D_{t(i)} = D̃_{t(i)} Δ_{t(i)},  μ_t ~ N(μ, τ_0^{−1} I_P)
d_k^{(t)} ~ N(0, P^{−1} I_P),  Δ_{kk}^{(t)} ~ N(0, τ_{tk}^{−1})
Mixture of factor analyzers [Chen et al., 2010]
x_i ~ N(D_{t(i)} w_i + μ_{t(i)}, γ_{ε,t(i)}^{−1} I_P)
w_i = s_i ⊙ z_{t(i)},  s_i ~ N(0, γ_s^{−1} I_K)
t(i) ~ CRP(α),  z_t ~ IBP(a, b)
D_{t(i)} = D̃_{t(i)} Δ_{t(i)},  μ_t ~ N(μ, τ_0^{−1} I_P)
d_k^{(t)} ~ N(0, P^{−1} I_P),  Δ_{kk}^{(t)} ~ N(0, τ_{tk}^{−1})
Block/group sparsity
MFA is similar to block sparse models:

x = [μ_1, D_1 | … | μ_T, D_T] [w_1; …; w_T]

In the presented MFA only one of the w_t vectors is non-zero. Each dictionary is usually low-rank (undercomplete), K < P.
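A small NumPy sketch of this block-sparse view (sizes and the active-block index are illustrative; the leading 1 in the active block switches on that block's mean μ_t):

```python
import numpy as np

rng = np.random.default_rng(11)
P, K, T = 16, 4, 3                  # illustrative sizes; each D_t is P x K, K < P

Ds = [rng.normal(size=(P, K)) for _ in range(T)]
mus = [rng.normal(size=P) for _ in range(T)]

# concatenated dictionary [mu_1, D_1 | ... | mu_T, D_T]
big_D = np.hstack([np.column_stack([m, D]) for m, D in zip(mus, Ds)])

# block-sparse coefficient vector: only one block is non-zero,
# with a leading 1 that selects the block's mean
t_active = 1
w_blocks = [np.zeros(K + 1) for _ in range(T)]
w_active = rng.normal(size=K)
w_blocks[t_active] = np.concatenate(([1.0], w_active))
x = big_D @ np.concatenate(w_blocks)
```

The product then reduces to the single active factor model D_t w_t + μ_t.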
CS-MFA
y = Φ(x + ε)

p(x) ≈ Σ_{t=1}^T λ_t N(x; χ_t, Ω_t)
p(y | x) = N(y; Φx, R^{−1})

p(x | y) = p(x) p(y | x) / ∫ p(x) p(y | x) dx = Σ_{t=1}^T λ̃_t N(x; χ̃_t, Ω̃_t)
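For a single Gaussian component the posterior update is the standard Gaussian conditioning formula; a hedged sketch for one component (sizes, covariances, and variable names are mine — the mixture case applies this per component and reweights):

```python
import numpy as np

rng = np.random.default_rng(9)
P, Q = 8, 3                          # illustrative sizes

# one mixture component p(x) = N(chi, Omega), measurement y = Phi x + noise
chi = rng.normal(size=P)
A = rng.normal(size=(P, P))
Omega = A @ A.T + np.eye(P)          # prior covariance (symmetric positive definite)
Phi = rng.normal(size=(Q, P))
R_inv = 0.01 * np.eye(Q)             # noise covariance R^-1

x_true = rng.multivariate_normal(chi, Omega)
y = Phi @ x_true + rng.multivariate_normal(np.zeros(Q), R_inv)

# Gaussian conditioning: the posterior for this component is again Gaussian
S = Phi @ Omega @ Phi.T + R_inv      # marginal covariance of y
G = Omega @ Phi.T @ np.linalg.inv(S) # gain
chi_post = chi + G @ (y - Phi @ chi)
Omega_post = Omega - G @ Phi @ Omega
```

This closed form is what makes the MFA prior attractive for CS inversion: no iterative optimization is needed per component.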
Pixel-wise flutter-shutter [Llull et al., 2013]
y_ij = [A_ij1, A_ij2, …, A_ijℓ] [X_ij1; X_ij2; …; X_ijℓ] = Φ_ij x_ij

Φ = diag(Φ_{1,1}, Φ_{1,2}, …, Φ_{N_x,N_y})
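The per-pixel coded exposure can be simulated in a few lines (frame size, code statistics, and number of frames are illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(10)
Nx, Ny, L = 4, 4, 8                 # illustrative frame size and frames per snapshot

X = rng.random((Nx, Ny, L))                 # short video block
A = rng.integers(0, 2, size=(Nx, Ny, L))    # binary per-pixel shutter codes

# each pixel records one coded sum over time:
# y_ij = [A_ij1, ..., A_ijL] . [x_ij1, ..., x_ijL]
Y = (A * X).sum(axis=2)
```

One snapshot Y thus compresses L frames, and stacking the per-pixel codes gives the block-diagonal Φ above.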
Optical camera setup [Llull et al., 2013]
Video CS demonstration
Thanks!
Questions? [email protected]
References I
M. Chen, J. Silva, J. Paisley, C. Wang, D. Dunson, and L. Carin. Compressive sensing on manifolds using a nonparametric mixture of factor analyzers: Algorithm and performance bounds. IEEE Transactions on Signal Processing, 58(12):6140–6155, Dec 2010.

T. Griffiths and Z. Ghahramani. The Indian buffet process: An introduction and review. Journal of Machine Learning Research, 12:1185–1224, 2011.

P. Llull, X. Liao, X. Yuan, J. Yang, D. Kittle, L. Carin, G. Sapiro, and D. Brady. Coded aperture compressive temporal imaging. Optics Express, 21(9):10526–10545, 2013.

C. Rasmussen. The infinite Gaussian mixture model. In NIPS, volume 12, pages 554–560, 1999.
References II
A. Stevens, H. Yang, L. Carin, I. Arslan, and N. Browning. The potential for Bayesian compressive sensing to significantly reduce electron dose in high-resolution STEM images. Microscopy, 63(1):41–51, 2013.

M. Zhou, H. Chen, J. Paisley, L. Ren, L. Li, Z. Xing, D. Dunson, G. Sapiro, and L. Carin. Nonparametric Bayesian dictionary learning for analysis of noisy and incomplete images. IEEE Transactions on Image Processing, 21(1):130–144, 2012.
SrTiO3 Error Metrics
Sample        Estimated Noise Variance   Sample PSNR (dB)   Inpainted PSNR (dB)   Inpainted PSNR vs. Denoised (dB)
SrTiO3 5%     33.75                      9.04               15.91                 19.00
SrTiO3 10%    32.00                      9.28               17.73                 22.73
SrTiO3 20%    31.83                      9.79               18.78                 26.14
SrTiO3 100%   28.36                      -                  20.50                 -
SrTiO3 Structure Identification
Average of 9 images reconstructed from 5% samples with overlaid grain boundary structure.
SrTiO3 Quality Comparison
Zeolite Error Metrics
Sample        Estimated Noise Variance   Sample PSNR (dB)   Inpainted PSNR (dB)   Inpainted PSNR vs. Denoised (dB)
zeolite 5%    10.69                      7.72               23.61                 26.42
zeolite 10%   11.26                      7.96               26.34                 35.67
zeolite 20%   11.83                      8.47               26.58                 38.98
zeolite 100%  11.51                      -                  27.27                 -
Zeolite Quality Comparison
SrTiO3 Dictionaries
[Figure: dictionaries learned from 5%, 10%, 20%, and 100% samples]
Zeolite Dictionaries
[Figure: dictionaries learned from 5%, 10%, 20%, and 100% samples]