representation learning and deep neural networks -...

33
Introduction Deep Learning Applications Representation Learning and Deep Neural Networks Davide Bacciu Dipartimento di Informatica Università di Pisa [email protected] Applied Brain Science - Computational Neuroscience (CNS)

Upload: trankien

Post on 28-Mar-2018

233 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation Learning and Deep NeuralNetworks

Davide Bacciu

Dipartimento di InformaticaUniversità di [email protected]

Applied Brain Science - Computational Neuroscience (CNS)

Page 2: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Learning Representations in the Brain

Sensory information is represented by neural activityResponse selectivity of individual neuronsDistribution of activation in neural populationSomething we have seen so far

Neural representation is hierarchicalBrain cortex processes information incrementallyIncreasing levels of abstraction

Representation Learning

How can we obtain articulated hierarchical representations ofinformation in computational models?

Page 3: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Representation Learning - A Computational View

Learning the unknown causes underlying dataCauses explain the data we observeAre also the basic building blocks which, when combined,generate the data we observe

Something we have already seen with RBM

Try explaining input data v usingthe unknown causes hLearning is

Generative as causes canreconstruct the dataUnsupervised as interest is inrepresenting information ratherthan, e.g., predicting a class

Page 4: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Representation Learning - A Classical View

Representation learning as density estimation: learn aprobability distribution for the data v that uses latent variables h

Learning of a Gaussian Mixture ModelData likelihood P(v|h)

Posterior P(h|v)

Several models in this classical view: PCA, ICA, factor analysis,clustering, ...

Page 5: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Representation Learning - A Modern View

Learn representations (a.k.a. causes or features) that areHierarchical: representations at increasing levels ofabstractionDistributed: information is encoded by a multiplicity ofcausesShared among tasksSparse: enforcing neural selectivityCharacterized by simple dependencies: a simple (linear)combination of the representations should be sufficient togenerate data

Deep models follow this modern view of representation learning

Page 6: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

What is Deep Learning?

Machine learning algorithms inspired by brain organization,based on learning multiple levels of representation andabstraction

Learning models with many layers trained layer-wiseBuild a hierarchical feature space through layeringReduce the need of supervised information

Unsupervised discovery of features in the internal layersFinal layer performs supervised step

Page 7: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Learning Features

The traditional shallow way

The deep way

Page 8: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Why Deep Learning?

Page 9: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

No, Seriously.. Why Deep Learning?

Deep learning is THE hot topic nowRevolutionized performance in

Speech recognitionMachine vision

Now expanding to other topicsNatural languageReinforcement learningRobotics

Its origins and inspiration can be traced back to brain/cortexorganization

Page 10: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Hierarchical RepresentationHMAX Model

Explaining the structure of the visual cortex in mammals

M. Riesenhuber, T. Poggio, Hierarchical Models of Object Recognition in Cortex. Nature Neuroscience 2:1019-1025, 1999

Page 11: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Sparse RepresentationNeuroscience

Sparse coding theory in brain: sensory information in the brainis represented by a relatively small number of simultaneouslyactive neurons out of a large population (B.A. Olshausen, D.J.Field, 1996)

Page 12: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Representation LearningDeep RepresentationsBio-Inspired Foundations

Sparse RepresentationMachine Learning

The machine learning perspective: a single layer networklearns better to generate a target output if the input has asparse representation (Willshaw and Dayan, 1990)

Page 13: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Deep Networks

Deep architecture with multiple layers focused on learningsparse encoding of input dataUnsupervised training between layers to decompose theproblem into distributed sub problems with increasinglevels of abstractionDeep networks type

Deep Generative ModelsStacked Neural AutoencodersDeep convolutional neural networksRecurrent neural networksHybrids of the above

Page 14: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Training Deep Networks

Problem with conventional backpropagation trainingStrongly relies on labeled training dataLearning does not scale well to multiple hidden layers

Greedy layer-wise trainingUnsupervised (internal) layer by layer representationlearning (pre-training)Supervised training of last layer (read-out) plus optionalfine-tuning

Key advantagesGive full learning focus to each layerExploit unlabeled data using supervised training only forfine tuning

Page 15: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Deep Generative Models

Intuition - Train a network where hidden units h organized into ahierarchy extract a good representation data from visible units x

Architecture - A networkof stacked RestrictedBoltzmann Machines(RBM) plus a supervisedread-out layer for makingpredictions

Page 16: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Deep Belief Networks (DBN)

Pretraining of the independent RBM byContrastive-Divergence, which are then unrolled to form thedeep architecture that is fine-tuned by error backpropagation

Page 17: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Deep (Restricted) Boltzmann Machines

DBM are directed generative models⇒ To train a Deep RBMwe need a pre-training trick

Sampling h1 by averagingbottom-up and top-downcontribution

h1j ∼ σ(∑

i

Mijxi +∑

m

Mjmh2m)

For multiple layers

Apply modified training to firstand last layer

Halve the weights of innerlayers

Page 18: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Neural Autoencoder Networks

Non-accretive associative memories for feature discovery anddata compression

E.g. linear hidden units with MSE perform PCA (guess why?)

Page 19: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Sparse Autoencoders

Autoencoders using more features with a significant numberbeing 0 when encoding an input

Using regularization approaches to enforce sparsity

Page 20: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Stacked Neural Autoencoders

Stack autoencoders with level-wise training as with DBM

Train each autoencoder in isolation and drop the decoding layerwhen training is completed

Page 21: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Convolutional Neural Networks (CNNs) - LeNet

Very much inspired by the visual processing pipeline in thebrain

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition,Proceedings of the IEEE 86(11): 2278-2324, 1998

Page 22: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Strategies to Control Model Complexity(Regularization)

Weight SharingExploit symmetry orstationarity assumption toreduce the number ofparametersConvolutional NN,recurrent and recursive NN

Dropout

Randomly disconnecthidden neurons duringtrainingNeed to scale hiddenweights at test time

Page 23: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

IntroductionDeep Learning ArchitecturesPractical Things to Know

Stochastic Gradient Descent

Mini-batchesCompute stochastic gradient onsmall batches of data rather thanon single sampleStabler and quicker learning andfacilitates GPU computing as abyproduct

Momentum Method

∆w(t+1) = α∆w(t)−ε∂E∂w

(t+1)Changes the velocity of theweight particle instead ofthe positionMay quicken convergencedue to avoiding oscillations

Page 24: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Document Encoding - Deep Belief Network

Finding sparse features for > 800K newswire stories

DBN document encoding Latent semantic analysisHinton, Salakhutdinov, Reducing the dimensionality of data with neural networks, Science, 2006

Page 25: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

DeepFace: a.k.a. beating humans at recognizingfaces

DeepFace recognition accuracy is ≈ 97.35%

23% better than previous resultsCNN with more than 120 million parameters

Taigman et al, Deepface: Closing the gap to human-level performance in faceverification, CVPR 2014

Page 26: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Learning to Play Atari (I)

Integrating CNN with Reinforcement Learning

Mnih et al, Human-level control through deep reinforcement learning, Nature 2015

Page 27: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Learning to Play Atari (II)

Page 28: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Deep Learning and Robotics

Learning where to grasp objects with robotic manipulatorsDeep auto-encoder networkPredicts which grasping box works better

Lenz et al, Deep Learning for Detecting Robotic Grasps, IJRR 2014

Page 29: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Cheap DL Robots

Page 30: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Deep Learning API

Caffe - DL framework for computer visionhttp://demo.caffe.berkeleyvision.org/

Theano - Python library with GPU supporthttp://deeplearning.net/software/theano/

Torch - Scientific computing environmenthttp://torch.ch/

MatConvNet - CNN in Matlab with GPU supporthttp://www.vlfeat.org/matconvnet/

CNTK - Microsoft NN and DL toolkithttp://www.cntk.ai/

Tensorflow - Google NN and DL libraryhttps://www.tensorflow.org/

Make sure you have a good GPU!

Page 31: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Want to try something easy out?

Google Tensorflow Playgroundhttp://playground.tensorflow.org

Language learning and generationhttp://karpathy.github.io/2015/05/21/rnn-effectiveness/

Caffe image classification demohttp://demo.caffe.berkeleyvision.org/

Nvidia Digits deep learning GUIhttps://developer.nvidia.com/digits

Page 32: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Where is DL heading?

Image and video understanding by integration with naturallanguage descriptions

Bringing in biological models of attentionNatural language understanding and generation

Emotion understandingMoving from sequential representation of text to parse trees

Will soon impact heavily in roboticsSensorimotor learningPolicy learning: deep reinforcement learningCloud robotics: deep learned knowledge available toconnected devicesOnboard GPU computingSelf-driving cars, drones, connected devices

Page 33: Representation Learning and Deep Neural Networks - …didawiki.di.unipi.it/.../computational-neuroscience/5-deep-hand.pdfRepresentation Learning and Deep Neural Networks Davide Bacciu

IntroductionDeep Learning

Applications

Deep Learning ApplicationsConclusions

Take Home Messages

Deep learning is about learning features of complex dataFeatures ≡ hidden stochastic neuronsStrongly rooted in hierarchical information processing in thebrain

Stacking and level-wise trainingLet the deep network discover the features and then placeyour preferred learning model to perform your task

Breakthrough performance in several learningtasks/application areas

Complex spatio/temporal structured data

Do we really know what is going on with the encoding?How many layers?