an introduction to deep learning · deep learning an introduction to ... •deep learning •feed...

Deep Learning

An Introduction to

By Rahil Mahdian – December 12-13, 2017

1

http://powerpoint.sage-fox.com/

Outline • Machine Learning

• Learning Strategies

• Neural Network Learning

• Deep Learning

• Feed Forward Network

• Problems of Deep Learning

• AutoEncoders

• Restricted Boltzmann Machines

• Convolutional Neural Networks

• Recurrent Neural Networks (RNNs, LSTM)

• Deep Learning Applications

2

Scope of Machine Learning

3

Nando de Freitas, Oxford

When to Apply Machine Learning

4

Nando de Freitas, Oxford

Machine Learning Pioneering ( C. Shannon 1961)

5

Learning Types

6

Supervised Learning UnSupervised Learning

Semi-Supervised Learning

Machine Learning vs Deep Learning

7

Perceptron – Single Neuron element, Rosenblatt 1958

8

e.g. Sigmoidal function

Neural Networks (MLP)

9

Training schemes (SGD, Batch, MiniBatch)

10

SGD Batch Mini-Batch

MLP - Function Approximation

11

Deep Motivation- Different Layers of Abstraction

12

Deep Neural Networks

13

Feed Forward Neural Networks

14

Training DNNs - Backpropagation

15

Deep NN - Training Problem

16

The back-propagation encounters the three following difficulties in the training process of deep neural networks:

Vanishing Gradient- output error fails to reach the farther back nodes Overfitting Computational Load

Vanishing Gradient Solutions

17

Overfitting – Generalization Problem

18

Bishop

Model Complexity

19

Data Matters

20

Among competing hypotheses, the one with the fewest assumptions should be selected.

In the related concept of overfitting, excessively complex models are affected by statistical

noise (a problem also known as the bias-variance trade-off), whereas simpler models may

capture the underlying structure better and may thus have better predictive performance.

Hoeffding’s inequalities:

Failure rate

Empirical error rate

Occam's razor: William of Ockham (c. 1287–1347)

# of model parameters

True error rate

Number of sufficient samples

A 32-bits floating point computer

https://en.wikipedia.org/wiki/Overfitting

https://en.wikipedia.org/wiki/Statistical_noise



https://en.wikipedia.org/wiki/Predictive_inference

Overfitting Solutions – Dropout & Regularization

21

50% of hidden layers, and 25% for the input layer

Rule of thumb:

Regularization:

Drop out:

Add a norm of the weights to the cost function. (l1-norm, l2-norm)

Data Augmentation is also a way to avoid overfitting; i.e., adding noise, translating data, etc.

AutoEncoder- Nonlinear dimensionality reduction

22

Restricted Boltzmann Machines (RBM)

23

Hugo Larochelle

Unsupervised Pre-training – another solution

24

Convolution Neural Network (Lecun et al. 1993, LeNet)

25

Convolution NN - Architecture

26

Photo: Phil Kim

ConvNet – How it works?

27

Vincent Vanhoucke- Google

Convolutional NN – (LeCun, Fukushima)

28

AlexNet - 2012

ConvNet for Speech

29

CNN Structures

30

LSTM and RNNs – sequential data

31

LSTM Training

32

RNNs & Multi-Hypothesis Tracking- BeamSearch

33


Captioning & Translation - RNNs

34


Deep Learning Applications: Computer Vision

35

DNN Application - Caption Generation

36

Wrap Up

37

Machine Learning Influence Learning Neural Networks Deep Learning Motivation, Problems, Solutions Unsupervised Neural Networks: AutoEncoders, Restricted Boltzmann Machines Deep Learning Training Solutions Feed Forward Neural Networks as MLPs Convolutional Neural Networks Recurrent Neural Networks for sequential data LSTMs as a generalization of RNNs Applications of DNNs

38

Thanks for attending the Talk.

Questions?

an introduction to deep learning · deep learning an introduction to ... •deep learning •feed...

Documents