see.xidian.edu.cn/vipsl/mla2014/chenbo.pdf
National Lab of Radar Signal Processing
Bo Chen National Lab. of Radar Signal Processing
Xidian University, China
Deep Learning and Its Applications to Radar ATR
Joint work with Bo Feng, Jun Ding, Gungor Polatkan, Guillermo Sapiro, David Blei, David B. Dunson and Lawrence Carin
MLA 2014
Outline
Deep/Multilayered Models
Deep Models with Bayesian Nonparametric
Convolutional Factor Analysis with
(Hierarchical) Beta Process
Summary and Future Work
2014-11-20
Success Stories
• Computer Vision:
  − Image inpainting/denoising, segmentation
  − Object recognition/detection, scene understanding
  − Video analysis
• Information Retrieval/NLP:
  − Text, audio, and image retrieval
  − Parsing, machine translation, text analysis
• Speech processing
• Robotics
• Computational Biology
• Cognitive Science
• Radar Automatic Target Detection and Recognition?
Single Layer Models
• Autoencoder (most deep learning methods)
  − Denoising autoencoders
  − Restricted Boltzmann Machine
  − Predictive sparse decomposition
• Decoder-only
  − Sparse coding/factor analysis
  − Deconvolutional nets
• Encoder-only
  − (Convolutional) Neural nets (supervised)
Autoencoder
• (Binary or real-valued) input $x$; (binary) features $z$
• Encoder: $\sigma(Wx)$, with encoder filters $W$ and sigmoid function $\sigma(\cdot)$
• Decoder: $g(W^T z)$, with decoder filters $W^T$ and a linear or nonlinear function $g(\cdot)$
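The encoder/decoder pair on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the layer sizes, the random initialisation, and the function names `encode` and `decode` are hypothetical, bias terms are omitted, and the features are left as sigmoid activations rather than being sampled to binary values.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(W, x):
    """Features sigma(W x); W has shape (num_features, input_dim)."""
    return sigmoid(W @ x)

def decode(W, z, g=sigmoid):
    """Reconstruction g(W^T z), reusing the transposed encoder filters."""
    return g(W.T @ z)

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 8))          # 4 features, 8 inputs (illustrative sizes)
x = rng.integers(0, 2, size=8).astype(float)   # a binary input vector
z = encode(W, x)
x_hat = decode(W, z)
print(z.shape, x_hat.shape)
```

Passing a different `g` (for example the identity) gives the linear decoder mentioned on the slide.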
Predictive Sparse Decomposition [Kavukcuoglu et al., '09]
• Input patch $x$; sparse features $z$
• Encoder: $\sigma(Wx)$, with encoder filters $W$ and sigmoid function $\sigma(\cdot)$
• Decoder: $Dz$, with decoder filters $D$
• L1 sparsity
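One way to read this slide is as an objective with three terms: reconstruction by the decoder, an L1 penalty on the features, and a term that keeps the features close to the encoder prediction. The sketch below assumes that three-term form. The function name `psd_loss` and the two weights are hypothetical, and the sigmoid encoder follows the slide rather than any particular published implementation.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def psd_loss(x, z, W, D, sparsity_weight=0.5, prediction_weight=1.0):
    """Assumed PSD-style objective for one patch x and one feature vector z."""
    reconstruction = np.sum((x - D @ z) ** 2)                        # decoder term ||x - Dz||^2
    sparsity = sparsity_weight * np.sum(np.abs(z))                    # L1 penalty on z
    prediction = prediction_weight * np.sum((z - sigmoid(W @ x)) ** 2)  # encoder should predict z
    return reconstruction + sparsity + prediction
```

In training, this loss would be minimised alternately over `z` and over the filters `W` and `D`. The optimiser is left out here.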
Denoising Autoencoder [Vincent et al., 2008]
Figure credit: Vincent et al.
Deep Belief Networks (Greedy)
• Construct an RBM with an input layer v and a hidden layer h
• Stack another hidden layer on top of the RBM to form a new RBM
• And so on.
[Hinton et al., 2006]
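The greedy stacking procedure can be sketched as follows. This is a simplified illustration, not the cited implementation: it assumes binary RBMs without bias terms, trained with one step of contrastive divergence (CD-1) on the full batch, and the names `train_rbm` and `train_dbn`, the learning rate, and the epoch count are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_rbm(data, num_hidden, rng, epochs=5, lr=0.1):
    """CD-1 training of a binary RBM without bias terms (a simplification)."""
    num_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((num_visible, num_hidden))
    for _ in range(epochs):
        h_prob = sigmoid(data @ W)                                   # positive phase
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_sample @ W.T)                            # one Gibbs step back to v
        h_recon = sigmoid(v_recon @ W)                               # negative phase
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
    return W

def train_dbn(data, layer_sizes, rng):
    """Train one RBM per layer; each RBM sees the hidden activations of the one below."""
    weights, layer_input = [], data
    for num_hidden in layer_sizes:
        W = train_rbm(layer_input, num_hidden, rng)
        weights.append(W)
        layer_input = sigmoid(layer_input @ W)   # becomes the "visible" data of the next RBM
    return weights
```

Calling `train_dbn(data, [8, 4], rng)` on an array of shape `(num_samples, 12)` returns two weight matrices of shapes `(12, 8)` and `(8, 4)`.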
Deep Belief Networks (Greedy) [Hinton et al., 2006]
Generating samples
Deep Boltzmann Machine
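The energy of the state $\{v, h^1, h^2, h^3\}$ is defined as:
$$E(v, h^1, h^2, h^3; \theta) = -v^T W^1 h^1 - (h^1)^T W^2 h^2 - (h^2)^T W^3 h^3$$
Inference:
$$p(h^1_k = 1 \mid v, h^2) = g\Big(\sum_i W^1_{ik} v_i + \sum_m W^2_{km} h^2_m\Big)$$
$$p(h^2_k = 1 \mid h^1, h^3) = g\Big(\sum_i W^2_{ik} h^1_i + \sum_m W^3_{km} h^3_m\Big)$$
$$p(h^3_k = 1 \mid h^2) = g\Big(\sum_j W^3_{jk} h^2_j\Big)$$
$$p(v_k = 1 \mid h^1) = g\Big(\sum_j W^1_{kj} h^1_j\Big)$$
Figure credit: Ruslan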
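The energy and the layer-wise conditionals of a three-hidden-layer DBM can be written directly in NumPy. The sketch below assumes that `g` is the logistic sigmoid, that the energy has no bias terms, and that the weight matrices have shapes `W1: (D, K1)`, `W2: (K1, K2)`, `W3: (K2, K3)`. The function names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dbm_energy(v, h1, h2, h3, W1, W2, W3):
    """E = -v^T W1 h1 - h1^T W2 h2 - h2^T W3 h3 (bias terms omitted)."""
    return -(v @ W1 @ h1) - (h1 @ W2 @ h2) - (h2 @ W3 @ h3)

def p_h1(v, h2, W1, W2):
    """p(h1_k = 1 | v, h2): input from the layer below and the layer above."""
    return sigmoid(v @ W1 + W2 @ h2)

def p_h2(h1, h3, W2, W3):
    """p(h2_k = 1 | h1, h3)."""
    return sigmoid(h1 @ W2 + W3 @ h3)

def p_h3(h2, W3):
    """p(h3_k = 1 | h2): the top layer only receives input from below."""
    return sigmoid(h2 @ W3)

def p_v(h1, W1):
    """p(v_k = 1 | h1)."""
    return sigmoid(W1 @ h1)
```

A quick consistency check is that `p_h1(...)[k]` equals the sigmoid of the energy difference between the states with `h1[k] = 0` and `h1[k] = 1`, which is what the conditional should reduce to for binary units.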
Bayesian Nonparametric Modeling
• What is Bayesian nonparametric?
  − It does not mean “no parameters”; it is closer to a really large parametric model
  − “Not parametric”: not restricted to objects whose dimensionality stays fixed as more data is observed, so the model adapts to the structure of the data
  − A model over infinite-dimensional function or measure spaces
  − A family of distributions that is dense in some large space
• Why nonparametric models?
  − A broad class of priors that allows the data to “speak for itself”
  − Side-step model selection and averaging
Bayesian Nonparametric Models
Dirichlet distribution → Dirichlet process
Beta distribution → Beta process
Gaussian distribution → Gaussian process
Poisson distribution → Poisson process
Dirichlet Process / Chinese Restaurant Process
Beta Process / Indian Buffet Process
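To make the Indian Buffet Process concrete, here is a minimal sampler for the standard one-parameter IBP. Customer n takes each existing dish k with probability m_k / n, where m_k is the number of earlier customers who took it, and then tries Poisson(alpha / n) new dishes. The function name `sample_ibp` and the concentration value in the usage note are illustrative assumptions.

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng):
    """Return a binary matrix Z (customers x dishes) drawn from a one-parameter IBP."""
    dish_counts = []   # m_k: how many customers have taken dish k so far
    rows = []
    for n in range(1, num_customers + 1):
        # Existing dishes: take dish k with probability m_k / n.
        row = [int(rng.random() < m / n) for m in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += taken
        # New dishes: Poisson(alpha / n) of them, all taken by this customer.
        num_new = int(rng.poisson(alpha / n))
        row.extend([1] * num_new)
        dish_counts.extend([1] * num_new)
        rows.append(row)
    Z = np.zeros((num_customers, len(dish_counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z
```

For example, `sample_ibp(10, 3.0, np.random.default_rng(0))` returns a matrix with 10 rows. The number of columns is random, which is the sense in which the number of features is inferred rather than fixed in advance.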
Autoencoder with Nonparametric Priors
• Deep Sparse Graphical Models via CIBP (R. Adams et al., 2010)
• Autoencoder with Gaussian Process (J. Snoek et al., 2012)
Autoencoder with Nonparametric Priors
• Beta Process RBMs (R. Mittelman et al., 2013)
• Hierarchical-Deep Models (R. Salakhutdinov et al., 2013)
Existing Convolutional Deep Networks
• Convolutional Restricted Boltzmann Machines (Lee et al., 2009 and Norouzi et al., 2009)
• Convolutional Sparse Coding and Encoder Networks (Kavukcuoglu et al., 2010), with L2-norm sparsity
• Deconvolutional Networks (Zeiler et al., 2010)
Characteristics in common:
1. Require hidden nodes to be sparse
2. Use point estimates to update parameters
3. Have to set the number of filters
[Figure: visible layer v, hidden layer h, filters W]
Convolutional Factor Analysis with Beta Process
Beta process prior on the usage of filters
Normal-Gamma prior (sparsity) on hidden units
Normal-Gamma prior (sparsity) on filters
Gaussian noise
Generative Model
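The slide's equations did not survive extraction, so the following is only a sketch of how the four listed ingredients could fit together. It assumes an additive model in which each active filter is convolved (full 2-D convolution) with its hidden activation map, a finite truncation to K filters with usage probabilities drawn from Beta(a/K, b(K-1)/K), and Gamma(1, 1) precisions for the Normal-Gamma draws. All function names and hyperparameter values are hypothetical.

```python
import numpy as np

def conv2d_full(activation, kernel):
    """Full 2-D convolution of an activation map with a small filter."""
    ah, aw = activation.shape
    kh, kw = kernel.shape
    out = np.zeros((ah + kh - 1, aw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ah, j:j + aw] += kernel[i, j] * activation
    return out

def sample_normal_gamma(shape, rng, c=1.0, d=1.0):
    """Precision ~ Gamma(shape c, rate d), then a zero-mean Gaussian with that precision."""
    precision = rng.gamma(c, 1.0 / d, size=shape)
    return rng.standard_normal(shape) / np.sqrt(precision)

def sample_filters(K, filter_shape, rng):
    """K filters shared by all images, with a Normal-Gamma (sparsity) prior."""
    return sample_normal_gamma((K,) + tuple(filter_shape), rng)

def sample_usage_probs(K, rng, a=1.0, b=1.0):
    """Finite approximation to the beta process prior on filter usage (needs K > 1)."""
    return rng.beta(a / K, b * (K - 1) / K, size=K)

def sample_image(filters, usage_probs, image_shape, rng, noise_std=0.1):
    """Draw one image: sum of used filters convolved with sparse maps, plus Gaussian noise."""
    K, fh, fw = filters.shape
    act_shape = (image_shape[0] - fh + 1, image_shape[1] - fw + 1)
    usage = rng.random(K) < usage_probs                   # Bernoulli filter-usage indicators
    hidden = sample_normal_gamma((K,) + act_shape, rng)   # sparse hidden units
    image = noise_std * rng.standard_normal(image_shape)  # Gaussian noise
    for k in np.flatnonzero(usage):
        image += conv2d_full(hidden[k], filters[k])
    return image, usage
```

Because most usage probabilities drawn from Beta(a/K, ·) are close to zero when K is large, only a few filters are switched on per image, which is how this kind of prior lets the data determine the effective number of filters.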
Beta-Bernoulli Process (2/2)
Beta-Bernoulli Process (N. L. Hjort, 1990 and Thibaux & Jordan, 2007)
[Figure: axes labeled N and K]
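A useful property of the Beta-Bernoulli construction is conjugacy: given an N × K binary usage matrix, the posterior over each usage probability is again a beta distribution. The sketch below assumes the finite approximation pi_k ~ Beta(a/K, b(K-1)/K) with z_nk ~ Bernoulli(pi_k). The function name and the default values of `a` and `b` are illustrative.

```python
import numpy as np

def beta_bernoulli_posterior(Z, a=1.0, b=1.0):
    """Posterior Beta parameters of pi_k given a binary matrix Z of shape (N, K)."""
    N, K = Z.shape
    m = Z.sum(axis=0)                      # how many of the N samples use feature k
    post_a = a / K + m                     # prior pseudo-count plus observed "on" count
    post_b = b * (K - 1) / K + (N - m)     # prior pseudo-count plus observed "off" count
    return post_a, post_b
```

The posterior mean `post_a / (post_a + post_b)` shrinks towards zero for features that are rarely used, so unused features are effectively pruned.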
Online Variational Bayesian Learning
• Variational Bayesian
• Online Version
Maximize the lower bound:
The lower bound of marginal log likelihood:
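The slide's own expressions were not preserved in the transcript. For orientation, the generic form of the variational lower bound is given below, together with one common stochastic-approximation update for an online version. The symbols are assumptions introduced here: $X$ is the data, $\Theta$ collects all latent variables and parameters, $q$ is the variational distribution, $\lambda$ is a global variational parameter, $\hat{\lambda}_t$ is its estimate from minibatch $t$, and $\rho_t$ is a decaying step size. The update rule in particular may differ from the one used in the talk.

```latex
\log p(X) \;\ge\; \mathcal{L}(q)
  = \mathbb{E}_{q(\Theta)}\big[\log p(X, \Theta)\big]
  - \mathbb{E}_{q(\Theta)}\big[\log q(\Theta)\big]

% One common online update (an assumption, not taken from the slide):
\lambda_t = (1 - \rho_t)\,\lambda_{t-1} + \rho_t\,\hat{\lambda}_t,
\qquad \rho_t = (\tau_0 + t)^{-\kappa}, \quad \kappa \in (0.5, 1]
```

The gap between $\log p(X)$ and $\mathcal{L}(q)$ is the KL divergence from $q(\Theta)$ to the true posterior, so maximizing the bound over $q$ tightens the approximation.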
Multitask Learning via the Hierarchical BP
HBP Generative Process
Via this construction, all tasks share the same filters, but each task has its own probability of filter usage.
[Figure labels: Global atoms; Local order]
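The sharing structure described here can be sketched with a two-level Beta-Bernoulli draw. This assumes a finite truncation to K filters, global usage probabilities pi_k ~ Beta(a/K, b(K-1)/K), and task-level probabilities drawn from Beta(c·pi_k, c·(1-pi_k)). The function name, the concentration `c`, and the clipping constant are hypothetical choices for illustration.

```python
import numpy as np

def sample_hbp_usage(num_tasks, images_per_task, K, rng, a=1.0, b=1.0, c=10.0):
    """Draw global and task-specific filter-usage probabilities and binary indicators."""
    eps = 1e-6
    # Global usage probability of each shared filter (clipped so Beta parameters stay positive).
    global_pi = np.clip(rng.beta(a / K, b * (K - 1) / K, size=K), eps, 1 - eps)
    # Each task perturbs the global probabilities; larger c keeps tasks closer to the global values.
    task_pi = rng.beta(c * global_pi, c * (1 - global_pi), size=(num_tasks, K))
    # Per-image binary indicators of which filters are used.
    usage = rng.random((num_tasks, images_per_task, K)) < task_pi[:, None, :]
    return global_pi, task_pi, usage
```

All tasks index the same K filters. Only `task_pi` differs across tasks, which matches the statement that filters are shared while usage probabilities are task-specific.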
Multilayered/Deep Models
• Stack multiple convolutional factor analysis models and train layer by layer
• Max-pooling
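Between layers, the activation maps are reduced by max-pooling before being passed to the next factor analysis model. A minimal sketch of non-overlapping max-pooling follows. It assumes square pooling windows and crops any rows or columns that do not fill a whole window, and the function name is hypothetical.

```python
import numpy as np

def max_pool2d(feature_map, pool):
    """Non-overlapping pool x pool max-pooling of a 2-D activation map."""
    h, w = feature_map.shape
    h_crop, w_crop = h - h % pool, w - w % pool      # drop incomplete border windows
    blocks = feature_map[:h_crop, :w_crop].reshape(
        h_crop // pool, pool, w_crop // pool, pool)
    return blocks.max(axis=(1, 3))                   # max within each window

pooled = max_pool2d(np.arange(16).reshape(4, 4), 2)
print(pooled)   # [[ 5  7]
                #  [13 15]]
```

With the settings listed for the experiments (for example `Max-pooling=2`), each pooled map would be half the size of the layer-1 activation map in each dimension.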
Experiments: Synthesized Data
• We generate seven binary canonical shapes, with shifted versions of these basic shapes used to constitute five classes of example images.
Image: 32x32; Class: 5; Number of images: 30.
Layer-1: 4x4; K1=10;
Layer-2: 3x3; K2=100;
Max-pooling=2;
Burn-in: 30000;
Collections: 20000
[Figure panels: Class 1, Class 2, Class 3, Class 4, Class 5]
Experiments: MNIST (Layer-2 Filters)
For each digit, we randomly select 500 samples.
Layer-1: 7x7; K1=25;
Layer-2: 3x3; K2=1000;
Max-pooling=3;
Burn-in: 1000;
Collections: 500
[Figure panels: “0”, “1”, “4”, “6”, “7”]
Experiments: Caltech101
Layer-1: 11x11; K1=25;
Layer-2: 4x4; K2=200;
Layer-3: 6x6; K3=100;
Burn-in: 1000; Collections: 500
[Figure panels: Layer-1, Layer-2, Layer-3]
Sparseness Analysis
• The impact of the sparsity hyperparameters on sparseness and model performance
The sparseness of filters:
Sparseness Analysis
The sparseness of hidden nodes:
The sparseness of binary indicators:
Online Learning
Held-out RMSE with different sizes of minibatches on Caltech101 data, as in Fig. 6. (a) Layer 1, (b) Layer 2.
HBP on Caltech101
Task=102; Images: 1020; 1000 Layer-2 filters ranked by the global usage
It appears that as the range of image classes considered within an HBP analysis increases, the prominent filters tend toward simple forms.
Summary and Future Work
• Build new convolutional deep networks based on factor analysis with BP/IBP
• Infer the number of filters at each layer of the deep model from the data by an IBP/BP construction
• Multi-task feature learning for simultaneous analysis of different families of images via HBP
• Future work:
  − Combine topic modeling with the model
  − Develop a classifier specialized for the convolutional property, with better generalization
  − Build a deep model in the encoder style