deep learning - what's the buzz all about
DESCRIPTION
Deep learning is a genre of machine learning algorithms that attempt to solve tasks by learning abstraction in data following a stratified description paradigm using non-linear transformation architectures. When put in simple terms, say you want to make the machine recognize some Mr. X with Mt. E in the background, then this task is a stratified or hierarchical recognition task. At the base of the recognition pyramid would be kernels which can discriminate flats, lines, curves, sharp angles, color; higher up will be kernels which use this information to discriminate body parts, trees, natural scenery, clouds, etc.; higher up will use this knowledge to recognize humans, animals, mountains, etc.; and higher up will learn to recognize Mr. X and Mt. E and finally the apex lexical synthesizer module would say that Mr. X is standing in front of Mt. E. Deep learning is all about how you make machines synthesize this hierarchical logic and also learn these representative kernels all by itself. Deep learning has been extensively used to efficiently solve these kinds of problems from handwritten character recognition (NYU, U Toronto), speech recognition (Microsoft, Google Voice), lexical ordered speech synthesis (Google Voice, iPhone Siri), object and poster recognition (Cortica), image retrieval (Baidu), content filtering (Youtube, Metacafe, Twitter), product visibility tracking (GazeMetrix), computational medical imaging (IITKgp). This talk will focus on the buzz around this topic and how firm does the buzz hold on to the claims it boasts of?TRANSCRIPT
Deep LearningWhat’s the buzz all about?
Dr. Debdoot SheetAssistant Professor
Department of Electrical EngineeringIndian Institute of Technology Kharagpur
www.facweb.iitkgp.ernet.in/~debdoot/
Deep Learning - What's the buzz all about? / Debdoot Sheet 25 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 3
Learning?
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E
-Tom Mitchell
5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 4
Demystifying Learning
5 Aug. 2014
Man 1 Man 2 Man 3Man 4
Great Wall logo
Great Wall tower
Kim JungWangDebdoot
Experience (E)
Perfo
rman
ce (P
)
Debdoot, Kim, Jung and Wang are standing near the Great Wall logo and the Great Wall tower is behind them.
Deep Learning - What's the buzz all about? / Debdoot Sheet 5
How was it Learning?
5 Aug. 2014
Man 1 Man 2 Man 3Man 4
Great Wall logo
Great Wall tower
Kim JungWangDebdoot
Salient Segments
Objectify
Detect humans
Recognize inanimate
Describe Scene
Debdoot, Kim, Jung and Wang are standing near the Great Wall logo and the Great Wall tower is behind them.
Recognize humans
Deep Learning - What's the buzz all about? / Debdoot Sheet 6
Challenges
5 Aug. 2014
Salient Segments
Objectify
Detect humans
Recognize inanimate
Describe Scene
Recognize humans
Salient Segments
Objectify
Detect humans
Recognize inanimate
Describe Scene
Recognize humans
Salient Segments
Objectify
Detect humans
Recognize inanimate
Describe Scene
Recognize humans
Deep Learning - What's the buzz all about? / Debdoot Sheet 7
Challenges
5 Aug. 2014
Salient Segments
Objectify
Recognize inanimate
Describe Scene
Recognize humans
LBP
Wavelets
HoG
Body part recognition
Human appearance
Chromaclustering
Posture realign Silhouette
matchingRecognize
humanDetect
humans
Deep Learning - What's the buzz all about? / Debdoot Sheet 8
Dilemma only in Machine Vision?• Speech Recognition and Signal Processing
– Dahl, et al,(2012). Context dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Trans. Audio, Speech, and Language Processing, 20(1), 33–42.
• Handwriting Recognition– Bengio, et al (2006). Greedy layer-wise training of deep networks. In
NIPS’2006.
• Natural Language Processing– Hinton, (1986). Learning distributed representations of concepts. In Proc.
8th Conf. Cog. Sc. Society, pp. 1–12.
• Hierarchical and Transfer Learning– Goodfellow, et al. (2011). Spike-and slab sparse coding for unsupervised
feature discovery. In NIPS’2011 Workshop on Challenges in Learning Hierarchical Models.
• Medical Imaging– Karri and Sheet, et al (2014). Deep learnt random forests for
segmentation of retinal layers in optical coherence tomography images. In ISBI’2014.5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 9
How to tackle this dilemma?
5 Aug. 2014
Great Wall behindGreat Wall logo beside
Debdoot, Kim, Jung, Wang
Deep Learning - What's the buzz all about? / Debdoot Sheet 10
Multilayer Perceptron (MLP)
5 Aug. 2014
Hidd
en la
yers
Hidd
en la
yers
Hidd
en la
yers
Deep Learning - What's the buzz all about? / Debdoot Sheet 11
MLP Learning, troubles thereof
5 Aug. 2014
P
T1
T2
Deep Learning - What's the buzz all about? / Debdoot Sheet 12
MLP Learning troubles, why so?
5 Aug. 2014
P
T1
T2
LBP
Wavelets
HoG
Body part recognition
Human appearance
Deep Learning - What's the buzz all about? / Debdoot Sheet 13
MLP Learning troubles, why so?
5 Aug. 2014
P
T1
T2
Chromaclustering
Posture realign Silhouette
matchingRecognize
human
Deep Learning - What's the buzz all about? / Debdoot Sheet 14
MLP Learning troubles, why so?
5 Aug. 2014
P
T1
T2
?
Deep Learning - What's the buzz all about? / Debdoot Sheet 15
Deep Learning, origin and growth• Around 1950 – NN age
– Neural Nets (McCulloch and Pitts, 1943)
– Unsupervised Learn. (Hebb, 1949)– Supervised Learn. (Rosenblatt, 1958)– Associative Memory (Palm, 1980;
Hopfield, 1982)
• 1960– Discovery of visual sensory cells that
respond to Edges (Hubel and Wiesel, 1962)
– Feed Forward Multi Layer Perceptron (FF-MLP) (Ivakhnenko, 1968)
• 1980 – Neocognition– Convolution + WeightReplication +
Subsampling (Fukushima, 1980)– Max Pooling– Back-propagation (Werbos, 1981;
LeCunn, 1985, 1988)
5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 16
Deep Learning, origin and growth• 1980-2000 – Search for simple,
low-complexity, problem-solvers– Recurrent Neural Network (RNN)
(Hochreiter and Schmidhuber, 1996)
– Local learning Feed forward NN (Dayan and Hinton, 1996)
– Advanced gradient descent (Schaback and Werner, 1992)
– Sequential Network Construction (Honavar and Uhr, 1988)
– Unsupervised Pre-training (Ritter and Kohonen, 1989)
– Auto-Encoder (Hinton et al., 1989)
– Back Propagating Convolutional Neural Networks (LeCun et al., 1989, 1990a, 1998)
5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 17
Deep Learning, origin and growth• 2000 – Era of Deep Learning
– NIPS 2003 Feature Selection Challenge (Neal and Zhang, 2006)
– MNIST digit recognition (LeCun et al., 1989)
– Deep Belief Network (DBN) / Restricted Boltzmann Machines (Hinton et al., 2006)
– Auto Encoders (Bengio, 2009)
• 2006– GPU based CNN (Chellapilla et al.,
2006)
• 2009– GPU DBN (Raina et al., 2009)
• 2011– Max-Pooling CNN on the GPU
(Ciresan et al., 2011)
• 2012– Image Net (Krizhevsky et al., 2012)
5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 18
Science fiction or realitySynthetic facial expression
Machine that can dream
5 Aug. 2014
Susskind, Joshua M., et al. "Generating facial expressions with deep belief nets." Affective Computing, Emotion Modelling, Synthesis and Recognition (2008): 421-440.
Deep Learning - What's the buzz all about? / Debdoot Sheet 19
Deep Learning, where• Facebook• Baidu – IDL• Cortica• Cortexica• Google
– DNNresearch– Deep Mind
• Microsoft– DNN– Adam project
5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 20
COMPUTATIONAL MEDICAL IMAGING
my experiences with Deep Learning (or not) for
5 Aug. 2014
xx |,,|11 nnff
x,,,| 1 ng
Deep Learning - What's the buzz all about? / Debdoot Sheet 21
(Not so Deep) Transfer Learning
5 Aug. 2014
D. Sheet, et al, Iterative self-organizing atherosclerotic tissue labeling in intravascular ultrasound images and comparison with virtual histology, IEEE TBME, 59(11), 2012
D. Sheet, et al, Hunting for necrosis in the shadows of intravascular ultrasound, CMIG, 38(2), 2014
D. Sheet, et al, Joint learning of ultrasonic backscattering statistical physics and signal confidence primal for characterizing atherosclerotic plaques using intravascular ultrasound, Med. Image Anal,18(1), 2014
Training set of IVUS signals
Ground truth
Random forest learning
Test IVUS signal
Labeled tissue
Multi-scale Speckle stats.
Multi-scale Speckle stats.
Deep Learning - What's the buzz all about? / Debdoot Sheet 22
Local Minima Juggling, experience
5 Aug. 2014
P
T1
T2
LBP
Wavelets
HoG
Body part recognition
Human appearance
Chromaclustering
Posture realign Silhouette
matchingRecognize
human
Deep Learning - What's the buzz all about? / Debdoot Sheet 23
Transfer vs. Full Deep Architecture
5 Aug. 2014
Multi-scale modeling of
OCT speckles
Trainingimage
set
Ground truth
Random forest learning
Multi-scale modeling of
OCT speckles
Test image
Labeled tissue
Stacked Auto-Encoders,
Logistic Regression
Random Forest
Trainingimage
set
Ground truth
Convolution masks
http://www.facweb.iitkgp.ernet.in/~debdoot/current.html
Deep Learning - What's the buzz all about? / Debdoot Sheet 24
Endnote (almost there)“It’s like in quantum physics at the beginning of
the 20th century” Trishul Chilimbi (MSR, DNN, Adam)
“The experimentalists and practitioners were ahead of the theoreticians. They couldn’t explain the results. We appear to be at a similar stage with DNNs. We’re realizing the power and the capabilities, but we still don’t understand the fundamentals of exactly how they work.”
5 Aug. 2014
Deep Learning - What's the buzz all about? / Debdoot Sheet 25
Take home message• Challenges
– Architectures• Neural Nets vs. Others
– Implementation• CPU vs. GPU vs. Cloud
– GPU (VLSI) architectures• Hierarchical Temporal
Memory• Potential Causal
Connection
• Toolboxes– Theano (Python/SciPy)– Pylearn2
• More information– www.deeplearning.net– Schmidhuber (2014). Deep
Learning in Neural Networks: An Overview (arXiv:1404.7828)
– Bengio (2009). Learning Deep Architectures for AI.
– Deng and Yu (2013). Deep Learning: Methods and Applications.
• Conferences– Int. Conf. Learning
Representations (ICLR)
5 Aug. 2014