deep learning presentation
DESCRIPTION
Short introduction to deep learning and to the DLL Library (C++, https://github.com/wichtounet/dll). Nothing fancy.
TRANSCRIPT
Deep Learning, Restricted Boltzmann Machine
Deep Belief Network, Convolutional RBM, Convolutional DBN
Conclusion
Deep Learning
Baptiste [email protected]
September 12, 2014
Table of Contents
1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
Contents
1 Deep Learning
  Definition
  History
  Usages
  Difficulties
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
Definition
Deep Learning (Wikipedia)
Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.
Deep Learning (deeplearning.net)
Deep Learning is a new area of Machine Learning research, which has been introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence.
Definition (cont'd)
Goal: imitate nature
Set of algorithms
Generally structures with multiple layers
Often unsupervised feature learning
Time-consuming training
Sometimes a large amount of data
Generally complex data
New name for an old thing
Hot topic
History
1960: Neural networks
1985: Multilayer Perceptrons
1986: Restricted Boltzmann Machine
1995: Support Vector Machine
2006: Hinton presents the Deep Belief Network (DBN)
  Renewed interest in deep learning and RBMs
  State-of-the-art results on MNIST
2009: Deep Recurrent Neural Network
2010: Convolutional DBN
2011: Max-Pooling CDBN
Many competitions won and state-of-the-art results
Names
Geoffrey Hinton
Andrew Y. Ng
Yoshua Bengio
Honglak Lee
Yann LeCun
...
Algorithms
Deep Neural Networks
Deep Belief Networks
Convolutional Deep Belief Networks
Deep SVM
Usages
Text recognition
Facial expression recognition
Object recognition
Audio classification
Difficulties
Large number of free variables
  Few insights on how to set them
Complex to implement
  Large variations between papers
  Many refinements have been proposed
Contents
1 Deep Learning
2 Restricted Boltzmann Machine
  Definition
  Training
  Units
  Variants
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
Definition
Restricted Boltzmann Machine
Function: Learn a probability distribution over the input
Generative stochastic neural network
Visible and hidden neurons
Neurons form a bipartite graph
V visible units and visible biases
H hidden units and hidden biases
V x H weights
Definition (cont'd)
Binary units (Bernoulli RBM)
$$p(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_{i=1}^{m} v_i w_{i,j}\Big)$$
$$p(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_{j=1}^{n} h_j w_{i,j}\Big)$$
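The two conditionals of the Bernoulli RBM can be sketched in a few lines of plain Python. This is only an illustration, not code from DLL: the function names and the toy weights are made up, w[i][j] is assumed to connect visible unit i to hidden unit j, and b and c are the visible and hidden biases.

```python
import math


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def p_hidden_given_visible(v, w, c):
    """p(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(c[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(c))]


def p_visible_given_hidden(h, w, b):
    """p(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * w[i][j] for j in range(len(h))))
            for i in range(len(b))]


# Toy model with m = 3 visible and n = 2 hidden units (values are made up).
w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
print(p_hidden_given_visible([1, 0, 1], w, c))
print(p_visible_given_hidden([1, 1], w, b))
```

Because the graph is bipartite, every hidden unit depends only on the visible layer (and vice versa), so each layer can be computed in one pass.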
Example (five figure slides; the images are not preserved in this transcript)
Usages
Unsupervised feature learning
  Classification with other techniques (linear classifier, SVM, ...)
  Limited to one layer of abstraction
Stacking for higher-level models and classification
  Deep Belief Network
  Deep Boltzmann Machines
Training
Objective: Maximizing the log-likelihood
  Intractable
Other methods have been developed:
  Markov Chain Monte Carlo (MCMC) (too slow)
  Contrastive Divergence (CD) (Hinton)
  Persistent CD
  Mean-Field CD (mf-CD)
  Parallel Tempering
  Annealed Importance Sampling
Contrastive Divergence
For each data point
1 Compute the gradients g from the difference between the statistics at t = 0 (data) and at t = k (after k Gibbs steps)
2 Add α ∗ g to the weights and the biases
Repeat for several epochs
Experiments have shown that CD1 (k = 1) works well
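The loop above can be sketched for k = 1 in plain Python. This is a minimal illustration under stated assumptions, not the DLL implementation: all names are hypothetical, the hidden states are sampled once, and probabilities (rather than sampled states) are used for the reconstruction and for the gradient statistics, which is one common choice among several.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    """Draw binary states from independent Bernoulli probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def hidden_probs(v, w, c):
    return [sigmoid(c[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, w, b):
    return [sigmoid(b[i] + sum(h[j] * w[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_step(v0, w, b, c, alpha, rng):
    """One CD-1 update for a single data point; modifies w, b, c in place."""
    ph0 = hidden_probs(v0, w, c)      # statistics at t = 0 (data)
    h0 = sample(ph0, rng)             # sampled hidden states
    v1 = visible_probs(h0, w, b)      # reconstruction at t = 1 (probabilities)
    ph1 = hidden_probs(v1, w, c)      # statistics at t = 1
    for i in range(len(b)):
        for j in range(len(c)):
            w[i][j] += alpha * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += alpha * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += alpha * (ph0[j] - ph1[j])
    return v1


# Toy run: 4 visible units, 2 hidden units, two made-up training patterns.
rng = random.Random(0)
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
w = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(4)]
b = [0.0] * 4
c = [0.0] * 2
for epoch in range(200):
    for v in data:
        cd1_step(v, w, b, c, alpha=0.1, rng=rng)
```

The learning rate, the number of epochs and the weight initialisation are arbitrary values chosen for the toy run.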
Contrastive Divergence
When to stop training?
1 Proxies to log-likelihood:
    Reconstruction error
    Pseudo-likelihood (PCD)
2 Visual inspection of the filters
Training is relatively fast
  Can be trained on GPU
Hard to compare two RBMs
Hard to test an implementation correctly
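As an illustration of the first proxy, the reconstruction error can be taken as the mean squared difference between an input and its reconstruction. The sketch below assumes that definition (other definitions exist); `blurry_identity` is a hypothetical stand-in for the reconstruction pass of a trained RBM.

```python
def reconstruction_error(v, v_reconstructed):
    """Mean squared difference between an input and its reconstruction."""
    return sum((a - r) ** 2 for a, r in zip(v, v_reconstructed)) / len(v)


def mean_reconstruction_error(batch, reconstruct):
    """Average error over a batch; `reconstruct` maps v to its reconstruction."""
    return sum(reconstruction_error(v, reconstruct(v)) for v in batch) / len(batch)


def blurry_identity(v):
    """Hypothetical stand-in for a trained RBM's visible -> hidden -> visible pass."""
    return [0.9 * x + 0.05 for x in v]


print(mean_reconstruction_error([[1, 0, 1], [0, 1, 1]], blurry_identity))
```

A value that stops decreasing over several epochs is a cheap signal that training may have converged, but it is only a proxy and says little about the true log-likelihood.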
Contrastive Divergence Options
Mini-batch training
Momentum
Weight decay
Sparsity target
...
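Momentum and weight decay only change how the CD gradient is applied to a parameter. A minimal sketch of one common formulation (L2 weight decay, classical momentum) follows; the function name and the default values are assumptions chosen for illustration, not DLL defaults.

```python
def update_weight(w, velocity, gradient, alpha=0.1, momentum=0.9, weight_decay=0.0002):
    """Return (new_w, new_velocity) for a single parameter.

    gradient is the CD estimate (to be added, as in plain CD);
    the L2 weight decay term pulls w toward zero.
    """
    velocity = momentum * velocity + alpha * (gradient - weight_decay * w)
    return w + velocity, velocity


# Three updates with a constant, made-up gradient: the step grows with momentum.
w, vel = 0.0, 0.0
for _ in range(3):
    w, vel = update_weight(w, vel, gradient=1.0)
    print(w, vel)
```

With mini-batch training, `gradient` would be the average of the per-sample CD gradients over the batch.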
Units
The RBM was initially developed with binary units
Different types of units can be used:
  Gaussian visible units for real-valued inputs
  Softmax hidden unit for classification (last layer)
  Rectified Linear Units (ReLU) for hidden/visible layers
    Can be capped
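A capped rectified linear activation can be sketched as follows. This shows only the deterministic activation; stochastic (noisy) variants used when sampling in an RBM are not covered, and the cap value in the usage line is an arbitrary assumption.

```python
def relu(x, cap=None):
    """Rectified linear activation; if cap is given, clip the output at cap."""
    y = max(0.0, x)
    return y if cap is None else min(y, cap)


print([relu(x, cap=6.0) for x in (-2.0, 0.5, 3.0, 10.0)])
```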
Variants
Convolutional RBM (see later)
mean-covariance RBM (mcRBM)
Sparse RBM (SRBM)
Third-Order RBM
Spike and Slab RBM
Nonnegative RBM
...
Contents
1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
  Definition
  Training
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
Definition
Deep Belief Network
Generative graphical model
Type of Deep Neural Network
Multiple layers of hidden units
Stack of RBMs
  Can be implemented with other autoencoders
Definition (cont'd)
Each RBM takes input from previous layer output
Each layer forms a higher-level representation of the data
Number of hidden units in each layer can be tuned
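The stacking idea can be sketched as a bottom-to-top pass in which the hidden activation probabilities of one RBM become the input of the next. The names below are hypothetical, each layer is represented only by its weights and hidden biases, and the zero-initialised 4 -> 3 -> 2 stack is a placeholder for trained layers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(v, w, c):
    """Hidden activation probabilities of one RBM (w[i][j]: visible i to hidden j)."""
    return [sigmoid(c[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(c))]


def dbn_forward(v, layers):
    """Propagate v through a list of (w, c) pairs; returns the output of every layer."""
    outputs = []
    for w, c in layers:
        v = layer_output(v, w, c)
        outputs.append(v)
    return outputs


# Untrained 4 -> 3 -> 2 stack; the layer sizes are arbitrary.
layers = [
    ([[0.0] * 3 for _ in range(4)], [0.0] * 3),
    ([[0.0] * 2 for _ in range(3)], [0.0] * 2),
]
print(dbn_forward([1, 0, 1, 0], layers))
```

The number of hidden units per layer is simply the width of each weight matrix, which is where that tuning happens.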
Training
1 Train each layer, from bottom to top, with Contrastive Divergence (Unsupervised)
2 Then treat the DBN as an MLP
3 If necessary, fine-tune the last layer for classification (Supervised)
    Back propagation
    Nonlinear Conjugate Gradient method
    Limited Memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)
    Hessian-Free CG (Martens)
Contents
1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
  Definition
  Training
  Probabilistic Max Pooling
5 Convolutional DBN
6 Conclusion
Definition
Convolutional RBM
Motivation: Translation-invariance
  Scaling to full-size images
Variant of RBM, concepts remain the same
N_V x N_V binary visible units
K groups of hidden units
N_H x N_H binary hidden units per group
Each group has an N_W x N_W filter (N_W ≜ N_V − N_H + 1)
A bias b_k for each hidden group
A single bias c for all visible units
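The size relations can be checked with a little arithmetic. The sketch below rearranges N_W = N_V − N_H + 1 into N_H = N_V − N_W + 1 and counts units and parameters; the function name and the example sizes (a 28 x 28 input with 40 filters of size 9 x 9) are made up for illustration.

```python
def crbm_shapes(n_v, n_w, k):
    """Sizes of a CRBM with an n_v x n_v input and k filters of size n_w x n_w."""
    n_h = n_v - n_w + 1  # rearranged from N_W = N_V - N_H + 1
    if n_h < 1:
        raise ValueError("filter is larger than the input")
    return {
        "n_h": n_h,
        "visible_units": n_v * n_v,
        "hidden_per_group": n_h * n_h,
        "hidden_total": k * n_h * n_h,
        "weights": k * n_w * n_w,   # each filter is shared across all positions
        "biases": k + 1,            # one b_k per group plus the single visible bias c
    }


print(crbm_shapes(28, 9, 40))
```

The hidden layer ends up with many more units than the visible layer while the number of weights stays small, because each filter is shared across all positions.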
Definition (cont'd)
Binary units:
$$p(h^k_j = 1 \mid v) = \sigma\big(b_k + (\tilde{W}^k *_v v)_j\big)$$
$$p(v_i = 1 \mid h) = \sigma\Big(c + \sum_{k=1}^{K} (W^k *_f h^k)_i\Big)$$
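The hidden-unit conditional can be sketched in plain Python. This assumes, as in Honglak Lee's CDBN formulation, that W̃ is the filter flipped horizontally and vertically and that ∗v is a "valid" convolution; if the slides intend a different convention, the flip has to be adapted. Function names are hypothetical, and inputs and filters are assumed square.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def flip(w):
    """Flip a 2D filter horizontally and vertically."""
    return [row[::-1] for row in w[::-1]]


def valid_convolution(image, kernel):
    """True 2D convolution (kernel flipped), keeping only fully overlapping positions."""
    n_v, n_w = len(image), len(kernel)
    n_h = n_v - n_w + 1
    flipped = flip(kernel)
    return [[sum(flipped[r][s] * image[y + r][x + s]
                 for r in range(n_w) for s in range(n_w))
             for x in range(n_h)]
            for y in range(n_h)]


def crbm_hidden_probs(v, filters, biases):
    """p(h^k = 1 | v) for each group k, as sigmoid(b_k + flip(W^k) convolved with v)."""
    return [[[sigmoid(b + a) for a in row]
             for row in valid_convolution(v, flip(w))]
            for w, b in zip(filters, biases)]


# 3 x 3 input and a single 2 x 2 filter (made-up values) give a 2 x 2 hidden group.
v = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
print(crbm_hidden_probs(v, [[[1.0, 0.0], [0.0, -1.0]]], [0.0]))
```

The visible-unit conditional follows the same pattern with a "full" convolution summed over the K groups.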
Training
Contrastive Divergence
  Gradient computations are done with convolutions
  Same refinements can be used (weight decay, momentum, ...)
CRBM is highly overcomplete
  Sparse learning is very important
Probabilistic Max Pooling
Shrink the representation by a constant factor C
  Allows higher-level layers to be invariant to small translations
  Reduces computational effort
Generative version of standard Max Pooling
Pooling layer with K groups of pooling units
Each group has N_P x N_P units
N_P ≜ N_H / C
Each hidden block α (C x C) is connected to exactly one pooling unit
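Definition (cont'd)
Binary units:
$$p(v_i = 1 \mid h) = \sigma\Big(c + \sum_{k=1}^{K} (W^k *_f h^k)_i\Big)$$
$$I(h^k_j) \triangleq b_k + (\tilde{W}^k *_v v)_j$$
$$p(h^k_j = 1 \mid v) = \frac{\exp(I(h^k_j))}{1 + \sum_{j' \in B_\alpha} \exp(I(h^k_{j'}))}$$
$$p(p^k_\alpha = 0 \mid v) = \frac{1}{1 + \sum_{j' \in B_\alpha} \exp(I(h^k_{j'}))}$$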
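For one C x C block, the formulas amount to a softmax over the block's hidden units plus one extra "all off" state. A minimal sketch follows, with a hypothetical function name; the shift by the largest input is only a numerical-stability measure and does not change the result.

```python
import math


def probabilistic_max_pooling(block_inputs):
    """Given I(h_j) for the C*C hidden units of one block, return
    (p(h_j = 1 | v) for each unit, p(pooling unit = 0 | v))."""
    # Shift every exponent by m to avoid overflow; the "1" in the
    # denominator is exp(0) and becomes exp(-m) after the shift.
    m = max(0.0, max(block_inputs))
    exps = [math.exp(x - m) for x in block_inputs]
    off = math.exp(-m)
    denom = off + sum(exps)
    return [e / denom for e in exps], off / denom


# A 2 x 2 block (C = 2) with made-up inputs.
probs, p_off = probabilistic_max_pooling([0.0, 1.0, -1.0, 2.0])
print(probs, p_off, sum(probs) + p_off)
```

At most one hidden unit per block is on, and the pooling unit is on exactly when one of them is, so the hidden probabilities and the "off" probability sum to 1.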
Contents
1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
Definition
Stack of Convolutional RBMs
  With or without Probabilistic Max Pooling
Each RBM takes input from previous layer output
Each layer forms a higher-level representation of the data
Number of hidden units in each layer can be tuned
Feature Learning
Source: Honglak Lee
Each layer learns a different abstraction of features
1 Stroke
2 Parts of faces
3 Faces
Contents
1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
  Implementation
  Conclusion
Implementation
Deep Learning Library (DLL)
https://github.com/wichtounet/dll
RBM
  Binary, Gaussian, Softmax, ReLU units
  CD and PCD
  Momentum, Weight Decay, Sparsity Target
Convolutional RBM
  Standard version
  Probabilistic Max Pooling
  Various units
  CD and PCD
  Momentum, Weight Decay, Sparsity Target
Implementation
DBN
  Pretraining with RBM
  Fine-tuning with Conjugate Gradient
  Fine-tuning with Stochastic Gradient Descent
Future Work
Use CDBN for text detection
Convolutional DBN
SVM classification layer for DBN
Refinements
  New training methods for RBM/DBN
  Reduce compute time
  Maxout, Dropout
Conclusion
Deep Learning solutions are very powerful
  State of the art in several problems
  Still room for improvement
  Still young solutions (hype)
However
  They are complex to implement
  Free variables need to be configured with care
  Results from papers are hard to reproduce
  Heavy to train