deep learning presentation

Deep Learning
Baptiste Wicht [email protected]
September 12, 2014
Outline: Deep Learning, Restricted Boltzmann Machine, Deep Belief Network, Convolutional RBM, Convolutional DBN, Conclusion

Upload: baptiste-wicht

Post on 05-Dec-2014


DESCRIPTION

Short introduction to deep learning and to the DLL Library (C++, https://github.com/wichtounet/dll). Nothing fancy.

TRANSCRIPT

Page 1: Deep learning presentation

Deep Learning

Baptiste Wicht [email protected]

September 12, 2014

Page 2: Deep learning presentation


Table of Contents

1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion

Page 3: Deep learning presentation

Contents

1 Deep Learning
  Definition
  History
  Usages
  Difficulties
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion

Page 4: Deep learning presentation

Definition

Deep Learning (Wikipedia): Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.

Deep Learning (deeplearning.net): Deep Learning is a new area of Machine Learning research, which has been introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence.

Page 5: Deep learning presentation

Definition (cont'd)

Goal: imitate nature
Set of algorithms
Generally structures with multiple layers
Often unsupervised feature learning
Time-consuming training
Sometimes large amounts of data
Generally complex data
New name for an old thing
Hot topic

Page 6: Deep learning presentation

History

1960: Neural networks
1985: Multilayer Perceptrons
1986: Restricted Boltzmann Machine
1995: Support Vector Machine
2006: Hinton presents the Deep Belief Network (DBN)
  Renewed interest in deep learning and RBMs
  State-of-the-art results on MNIST
2009: Deep Recurrent Neural Network
2010: Convolutional DBN
2011: Max-Pooling CDBN
  Many competitions won and state-of-the-art results

Page 7: Deep learning presentation

Names

Geoffrey Hinton
Andrew Y. Ng
Yoshua Bengio
Honglak Lee
Yann LeCun
...

Page 8: Deep learning presentation

Algorithms

Deep Neural Networks
Deep Belief Networks
Convolutional Deep Belief Networks
Deep SVM

Page 9: Deep learning presentation

Usages

Text recognition
Facial expression recognition
Object recognition
Audio classification

Page 10: Deep learning presentation

Difficulties

Large number of free variables
  Few insights on how to set them
Complex to implement
Large variations between papers
Many refinements have been proposed

Page 11: Deep learning presentation

Contents

1 Deep Learning
2 Restricted Boltzmann Machine
  Definition
  Training
  Units
  Variants
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion

Page 12: Deep learning presentation

Definition

Restricted Boltzmann Machine
  Function: learn a probability distribution over the input
  Generative stochastic neural network
  Visible and hidden neurons
  Neurons form a bipartite graph
  V visible units and visible biases
  H hidden units and hidden biases
  V × H weights

Page 13: Deep learning presentation

Definition (cont'd)

Binary units (Bernoulli RBM)

$$ p(h_j = 1 \mid v) = \sigma\left(c_j + \sum_{i}^{m} v_i w_{i,j}\right) $$

$$ p(v_i = 1 \mid h) = \sigma\left(b_i + \sum_{j}^{n} h_j w_{i,j}\right) $$
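The two conditionals above can be sketched in a few lines of plain Python. This is only an illustration, not code from DLL: the function names and the toy weights are assumptions, with `w[i][j]` standing for $w_{i,j}$, `b` for the visible biases and `c` for the hidden biases.

```python
import math

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def p_hidden_given_visible(v, w, c):
    """Return [p(h_j = 1 | v)] for every hidden unit j.

    v: list of m visible states, w[i][j]: weight between visible unit i
    and hidden unit j, c: list of n hidden biases.
    """
    return [sigmoid(c[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(c))]

def p_visible_given_hidden(h, w, b):
    """Return [p(v_i = 1 | h)] for every visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * w[i][j] for j in range(len(h))))
            for i in range(len(b))]

# Toy RBM with m = 3 visible and n = 2 hidden units (made-up values).
w = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.1]
probs_h = p_hidden_given_visible([1, 0, 1], w, c)
probs_v = p_visible_given_hidden([0, 0], w, b)
```

Because the graph is bipartite, every hidden unit can be computed independently given the visible layer, and vice versa, which is why each conditional is a simple list comprehension.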

Page 14: Deep learning presentation

Example (pages 14 to 18 show only this title; the slide figures are not preserved in the transcript)

Page 19: Deep learning presentation

Usages

Unsupervised feature learning
Classification with other techniques (linear classifier, SVM, ...)
Limited to one layer of abstraction
  Stacking for higher-level models and classification
  Deep Belief Network
  Deep Boltzmann Machines

Page 20: Deep learning presentation

Training

Objective: maximize the log-likelihood
  Intractable
Other methods have been developed:
  Markov Chain Monte Carlo (MCMC) (too slow)
  Contrastive Divergence (CD) (Hinton)
  Persistent CD
  Mean-Field CD (mf-CD)
  Parallel Tempering
  Annealed Importance Sampling

Page 21: Deep learning presentation

Contrastive Divergence

For each data point
  1 Compute gradients g between t = 0 (the data) and t = k (after k Gibbs sampling steps)
  2 Add α * g to the weights and the biases
Repeat for several epochs

Experiments have shown that CD1 (k = 1) works well
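A minimal sketch of one CD-1 update for a Bernoulli RBM follows, written in plain Python. It is an illustration under stated assumptions rather than the DLL implementation: the function names, the toy data, the learning rate and the choice to use hidden probabilities (instead of sampled states) in the gradient statistics are all assumptions made here.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(probs, rng):
    """Draw binary states from a list of Bernoulli probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]

def cd1_step(v0, w, b, c, alpha, rng):
    """One CD-1 update for a single data point v0 (updates w, b, c in place)."""
    m, n = len(b), len(c)
    # Positive phase (t = 0): hidden probabilities given the data.
    ph0 = [sigmoid(c[j] + sum(v0[i] * w[i][j] for i in range(m))) for j in range(n)]
    h0 = sample(ph0, rng)
    # Negative phase (t = 1): one Gibbs step, reconstruct v, then recompute h.
    pv1 = [sigmoid(b[i] + sum(h0[j] * w[i][j] for j in range(n))) for i in range(m)]
    v1 = sample(pv1, rng)
    ph1 = [sigmoid(c[j] + sum(v1[i] * w[i][j] for i in range(m))) for j in range(n)]
    # Gradient estimate g = <v h>_0 - <v h>_1, then add alpha * g.
    for i in range(m):
        for j in range(n):
            w[i][j] += alpha * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += alpha * (v0[i] - v1[i])
    for j in range(n):
        c[j] += alpha * (ph0[j] - ph1[j])
    # Squared reconstruction error, a rough proxy for training progress.
    return sum((v0[i] - pv1[i]) ** 2 for i in range(m))

rng = random.Random(0)
data = [[1, 1, 0, 0], [0, 0, 1, 1]]  # toy data set (hypothetical)
w = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(4)]
b, c = [0.0] * 4, [0.0] * 2
errors = []
for epoch in range(200):
    errors.append(sum(cd1_step(v, w, b, c, 0.1, rng) for v in data))
```

The returned reconstruction error is only a proxy: it is cheap to track per epoch, but it is not the quantity that CD actually optimizes.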

Page 22: Deep learning presentation

Contrastive Divergence

When to stop training?
  1 Proxies to log-likelihood:
      Reconstruction error
      Pseudo-likelihood (PCD)
  2 Visual inspection of the filters
Training is relatively fast
Can be trained on GPU
Hard to compare two RBMs
Hard to test an implementation correctly

Page 23: Deep learning presentation

Contrastive Divergence Options

Mini-batch training
Momentum
Weight decay
Sparsity target
...
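Three of these options (mini-batches, momentum, weight decay) combine into a single parameter update. The sketch below shows one common formulation in plain Python; the function names and the default hyperparameter values are assumptions chosen for illustration, not settings taken from the slides or from DLL.

```python
def minibatch_mean(gradients):
    """Average per-sample gradient vectors over a mini-batch."""
    size = len(gradients)
    return [sum(column) / size for column in zip(*gradients)]

def update_with_momentum(w, grad, velocity, alpha=0.1, momentum=0.9, decay=0.0002):
    """Return (new_w, new_velocity) for one parameter vector.

    grad is the CD gradient estimate (to be ascended); the L2 weight decay
    term pulls each weight towards zero; the velocity accumulates past
    updates so that consistent directions are amplified.
    """
    new_velocity = [momentum * vel + alpha * (g - decay * wi)
                    for wi, g, vel in zip(w, grad, velocity)]
    new_w = [wi + vel for wi, vel in zip(w, new_velocity)]
    return new_w, new_velocity

batch_grad = minibatch_mean([[1.0, 2.0], [3.0, 4.0]])
w, vel = [1.0, -2.0], [0.0, 0.0]
w, vel = update_with_momentum(w, [0.5, 0.5], vel)
```

Weight decay is usually applied to the weights only, not to the biases; the sparsity target is omitted here because it needs the hidden activations as well.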

Page 24: Deep learning presentation

Units

The RBM was initially developed with binary units
Different types of units can be used:
  Gaussian visible units for real-valued inputs
  Softmax hidden units for classification (last layer)
  Rectified Linear Units (ReLU) for hidden/visible units
    Can be capped
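The unit types listed above differ mainly in the function applied to a unit's total input. The following plain-Python sketch shows the deterministic (mean) activations only; the function names and the cap value used in the example are assumptions, and the sampling noise that stochastic units add on top is left out.

```python
import math

def sigmoid(x):
    """Binary unit: probability of being on."""
    return 1.0 / (1.0 + math.exp(-x))

def gaussian_mean(x):
    """Gaussian visible unit with unit variance: the mean is the input itself."""
    return x

def relu(x, cap=None):
    """Rectified linear unit, optionally capped at `cap`."""
    y = max(0.0, x)
    return y if cap is None else min(y, cap)

def softmax(xs):
    """Softmax group: exactly one unit is on, with these probabilities."""
    peak = max(xs)  # shift by the maximum for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

capped = relu(7.5, cap=6.0)
class_probs = softmax([2.0, 1.0, 0.0])
```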

Page 25: Deep learning presentation

Variants

Convolutional RBM (see later)
Mean-covariance RBM (mcRBM)
Sparse RBM (SRBM)
Third-Order RBM
Spike and Slab RBM
Nonnegative RBM
...

Page 26: Deep learning presentation

Contents

1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
  Definition
  Training
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion

Page 27: Deep learning presentation

Definition

Deep Belief Network
  Generative graphical model
  Type of Deep Neural Network
  Multiple layers of hidden units
  Stack of RBMs
    Can also be implemented with autoencoders

Page 28: Deep learning presentation

Definition (cont'd)

Each RBM takes its input from the previous layer's output
Each layer forms a higher-level representation of the data
The number of hidden units in each layer can be tuned

Page 29: Deep learning presentation

Training

1 Train each layer, from bottom to top, with Contrastive Divergence (unsupervised)
2 Then treat the DBN as an MLP
3 If necessary, fine-tune the last layer for classification (supervised)
    Backpropagation
    Nonlinear Conjugate Gradient method
    Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)
    Hessian-Free CG (Martens)
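Steps 1 and 2 can be sketched as a greedy loop over layers. The plain-Python code below is a minimal illustration, not the DLL implementation: all function names, the toy data and the hyperparameters are assumptions, the negative phase uses mean-field visible reconstructions for brevity, and the supervised fine-tuning of step 3 is left out.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, w, c):
    """Upward pass through one layer: p(h_j = 1 | v) for every j."""
    return [sigmoid(c[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(c))]

def train_rbm(data, n_hidden, epochs, alpha, rng):
    """Train one RBM with CD-1 and return its weights and hidden biases."""
    m = len(data[0])
    w = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(m)]
    b, c = [0.0] * m, [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in data:
            ph0 = hidden_probs(v0, w, c)
            h0 = [1 if rng.random() < p else 0 for p in ph0]
            # Mean-field reconstruction of the visible layer.
            v1 = [sigmoid(b[i] + sum(h0[j] * w[i][j] for j in range(n_hidden)))
                  for i in range(m)]
            ph1 = hidden_probs(v1, w, c)
            for i in range(m):
                for j in range(n_hidden):
                    w[i][j] += alpha * (v0[i] * ph0[j] - v1[i] * ph1[j])
                b[i] += alpha * (v0[i] - v1[i])
            for j in range(n_hidden):
                c[j] += alpha * (ph0[j] - ph1[j])
    return w, c

def pretrain_dbn(data, layer_sizes, epochs=10, alpha=0.1, seed=0):
    """Step 1: train each RBM, bottom to top, on the previous layer's output."""
    rng = random.Random(seed)
    layers, inputs = [], data
    for n_hidden in layer_sizes:
        w, c = train_rbm(inputs, n_hidden, epochs, alpha, rng)
        layers.append((w, c))
        inputs = [hidden_probs(v, w, c) for v in inputs]
    return layers

def forward(v, layers):
    """Step 2: treat the pretrained stack as an MLP (deterministic pass)."""
    for w, c in layers:
        v = hidden_probs(v, w, c)
    return v

layers = pretrain_dbn([[1, 1, 0, 0], [0, 0, 1, 1]], layer_sizes=[3, 2])
features = forward([1, 1, 0, 0], layers)
```

The output of `forward` is the feature vector that a supervised method (backpropagation, conjugate gradient, and so on) would then fine-tune against labels.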

Page 30: Deep learning presentation

Contents

1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
  Definition
  Training
  Probabilistic Max Pooling
5 Convolutional DBN
6 Conclusion

Page 31: Deep learning presentation

Definition

Convolutional RBM
Motivation: translation invariance
  Scaling to full-size images
Variant of RBM, concepts remain the same
$N_V \times N_V$ binary visible units
K groups of hidden units
$N_H \times N_H$ binary hidden units per group
Each group has an $N_W \times N_W$ filter ($N_W \triangleq N_V - N_H + 1$)
A bias $b_k$ for each hidden group
A single bias $c$ for all visible units

Page 32: Deep learning presentation

Definition (cont'd)

Binary units:

$$ p(h^k_j = 1 \mid v) = \sigma\left(b_k + (\tilde{W}^k *_v v)_j\right) $$

$$ p(v_i = 1 \mid h) = \sigma\left(c + \sum_{k}^{K} (W^k *_f h^k)_i\right) $$
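The hidden-unit formula can be illustrated for a single group $k$ in plain Python. This is a sketch under assumptions: it takes $\tilde{W}^k$ to be the filter flipped in both directions and $*_v$ to be a "valid" convolution (the usual convention for convolutional RBMs), and the function name and toy values are made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def crbm_hidden_probs(v, w, bias):
    """p(h^k = 1 | v) for one hidden group with filter w and bias b_k.

    A 'valid' convolution with the flipped filter equals sliding the
    unflipped filter over v, so no explicit flip is needed here.
    """
    n_v, n_w = len(v), len(w)
    n_h = n_v - n_w + 1  # hidden group size: N_H = N_V - N_W + 1
    return [[sigmoid(bias + sum(w[a][b] * v[r + a][c + b]
                                for a in range(n_w) for b in range(n_w)))
             for c in range(n_h)]
            for r in range(n_h)]

# 4x4 binary image and one 2x2 filter (toy values) give a 3x3 hidden group.
image = [[1, 0, 0, 1],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [1, 0, 0, 1]]
filt = [[1.0, -1.0],
        [-1.0, 1.0]]
probs = crbm_hidden_probs(image, filt, bias=0.0)
```

The same filter and the same bias are reused at every position, which is what makes the model scale to full-size images with few parameters.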

Page 33: Deep learning presentation

Training

Contrastive Divergence
Gradient computations are done with convolutions
The same refinements can be used (weight decay, momentum, ...)

The CRBM is highly overcomplete
Sparse learning is very important

Page 34: Deep learning presentation

Probabilistic Max Pooling

Shrinks the representation by a constant factor $C$
Allows higher-level layers to be invariant to small translations
Reduces computational effort

Generative version of standard Max Pooling
Pooling layer with K groups of pooling units
Each group has $N_P \times N_P$ units
$N_P \triangleq N_H / C$
Each hidden block $\alpha$ ($C \times C$) is connected to exactly one pooling unit

Page 35: Deep learning presentation

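Definition (cont'd)

Binary units:

$$ p(v_i = 1 \mid h) = \sigma\left(c + \sum_{k}^{K} (W^k *_f h^k)_i\right) $$

$$ I(h^k_j) \triangleq b_k + (\tilde{W}^k *_v v)_j $$

$$ p(h^k_j = 1 \mid v) = \frac{\exp(I(h^k_j))}{1 + \sum_{j' \in \beta_\alpha} \exp(I(h^k_{j'}))} $$

$$ p(p^k_\alpha = 0 \mid v) = \frac{1}{1 + \sum_{j' \in \beta_\alpha} \exp(I(h^k_{j'}))} $$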
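For a single block $\alpha$, the two probabilities above amount to a softmax over the hidden units of the block plus one extra "all off" state. The plain-Python sketch below illustrates this; the function name and the shift used for numerical stability are choices made here, not part of the slides.

```python
import math

def prob_max_pool_block(inputs):
    """Return (hidden_probs, p_pool_off) for one C x C block.

    inputs: the bottom-up signals I(h_j) of the hidden units in the block,
    flattened into a list. At most one hidden unit in the block is on, so
    the block behaves like a softmax with one extra 'all off' state.
    """
    peak = max(0.0, max(inputs))  # shift everything for numerical stability
    exps = [math.exp(x - peak) for x in inputs]
    off = math.exp(-peak)         # the 'all off' state contributes exp(0) = 1
    total = off + sum(exps)
    return [e / total for e in exps], off / total

# A 2x2 block (C = 2) with zero input: five equally likely states.
hidden, pool_off = prob_max_pool_block([0.0, 0.0, 0.0, 0.0])
```

The pooling unit is on exactly when one of the hidden units of its block is on, so its "on" probability is `1 - pool_off`.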

Page 37: Deep learning presentation

Definition

Stack of Convolutional RBMs
  With or without Probabilistic Max Pooling
Each RBM takes its input from the previous layer's output
Each layer forms a higher-level representation of the data
The number of hidden units in each layer can be tuned

Page 38: Deep learning presentation

Feature Learning

Source: Honglak Lee
Each layer learns a different abstraction of features
  1 Strokes
  2 Parts of faces
  3 Faces

Page 39: Deep learning presentation

Contents

1 Deep Learning
2 Restricted Boltzmann Machine
3 Deep Belief Network
4 Convolutional RBM
5 Convolutional DBN
6 Conclusion
  Implementation
  Conclusion

Page 40: Deep learning presentation

Implementation

Deep Learning Library (DLL)
https://github.com/wichtounet/dll
RBM
  Binary, Gaussian, Softmax, ReLU units
  CD and PCD
  Momentum, Weight Decay, Sparsity Target
Convolutional RBM
  Standard version
  Probabilistic Max Pooling
  Various units
  CD and PCD
  Momentum, Weight Decay, Sparsity Target

Page 41: Deep learning presentation

Implementation

DBN
  Pretraining with RBM
  Fine-tuning with Conjugate Gradient
  Fine-tuning with Stochastic Gradient Descent

Page 42: Deep learning presentation

Future Work

Use CDBN for text detection
Convolutional DBN
SVM classification layer for DBN
Refinements
  New training methods for RBM/DBN
  Reduce compute time
  Maxout, Dropout

Page 43: Deep learning presentation

Conclusion

Deep Learning solutions are very powerful
  State of the art in several problems
  Still room for improvement
  Still young solutions (hype)
However
  They are complex to implement
  Free variables need to be configured with care
  Results from papers are hard to reproduce
  Heavy to train
