introduction to neural networks - cornell university · 2018-10-04 · 1950’s: first artificial...
TRANSCRIPT
![Page 1: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/1.jpg)
Introduction to Neural Networks
Ritchie Zhao, Zhiru ZhangSchool of Electrical and Computer Engineering
ECE 5775 (Fall’18)High-Level Digital Design Automation
![Page 2: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/2.jpg)
▸ Neural networks have revolutionized the world– Self-driving vehicles– Advanced image recognition– Game AI
1
Rise of the Machines
![Page 3: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/3.jpg)
2
Rise of the Machines
▸ Neural networks have revolutionized research
![Page 4: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/4.jpg)
▸ 1950’s: First artificial neural networks created based on biological structures in the human visual cortex
▸ 1980’s – 2000’s: NNs considered inferior to other, simpler algorithms (e.g. SVM, logistic regression)
▸ Mid 2000’s: NN research considered “dead”, machine learning conferences outright reject most NN papers
▸ 2010 – 2012: NNs begin winning large-scale image and document classification contests, beating other methods
▸ 2012 – Now: NNs prove themselves in many industrial applications (web search, translation, image analysis)
3
A Brief History
![Page 5: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/5.jpg)
▸ Surpass human ability in image recognition
4
Things Neural Networks Can Do…
![Page 6: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/6.jpg)
▸ Generate celebrities
5
Things Neural Networks Can Do…
![Page 7: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/7.jpg)
▸ Art style transfer
6
Things Neural Networks Can Do…
https://deepart.io/
![Page 8: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/8.jpg)
▸ Beat humans at DOTA 2 (>6.5k MMR)
7
Things Neural Networks Can Do…
https://openai.com/five/
![Page 9: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/9.jpg)
CLASSIFICATION WITHTHE PERCEPTRON
Part 1
![Page 10: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/10.jpg)
▸ We’ll discuss neural networks for solving supervised classification problems
9
Classification Problems
???
Inputs Predictions
Predict New Data
• Given a training set consisting of labeled inputs
• Learn a function which maps inputs to labels
• This function is used to predict the labels of new inputs
Cat
Bird
Human
Inputs Labels
Training Set
![Page 11: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/11.jpg)
▸ Inputs: Pairs of numbers (𝒙𝟏, 𝒙𝟐)▸ Labels: 0 or 1
– binary decision problem
▸ Real-life analogue:– Label = Raining or Not Raining– 𝒙𝟏 = relative humidity– 𝒙𝟐 = cloud coverage
10
A Simple Classification Problem
x1 x2 Label-0.7 -0.6 0-0.6 0.4 0-0.5 0.7 1-0.5 -0.2 0-0.1 0.7 10.1 -0.2 00.3 0.5 10.4 0.1 10.4 -0.7 00.7 0.2 1
Training Set
![Page 12: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/12.jpg)
11
Visualizing the Data
Plot of the data points
𝒙𝟏
𝒙𝟐x1 x2 Label-0.7 -0.6 0-0.6 0.4 0-0.5 0.7 1-0.5 -0.2 0-0.1 0.7 10.1 -0.2 00.3 0.5 10.4 0.1 10.4 -0.7 00.7 0.2 1
Training Set
![Page 13: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/13.jpg)
▸ In this case, the data points can be classified with a linear decision boundary
12
Decision Function
Goal: learn this function
![Page 14: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/14.jpg)
▸ The perceptron is the simplest possible neural network, containing only one neuron
▸ Described by the following equation:
𝑦 = 𝜎(*𝑤,𝑥,
.
,/0
+ 𝑏)
13
The Perceptron
𝑤, = weights𝑏 = bias𝜎= activation function
![Page 15: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/15.jpg)
▸ A 2-input, 1-output perceptron:
14
Breaking Down the Perceptron
𝑤0𝜎
𝑤4+
𝑥0
𝑥4𝑦
𝑏
Linear function of 𝑥 =𝑥0𝑥4
𝑊𝑥 + 𝑏
Non-linear function“squashes” output to
the interval (0,1)
OutputProbability that the label is 1
![Page 16: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/16.jpg)
▸ A 2-input, 1-output perceptron:
15
Breaking Down the Perceptron
𝑤0𝜎
𝑤4+
𝑥0
𝑥4𝑦
𝑏
Humidity
Raining?Cloud Cover
Yes
No
![Page 17: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/17.jpg)
▸ The activation function σ is non-linear:
16
Activation Function
Unit Step Sigmoid ReLU
Hard Yes/No decision
Used in the first perceptrons
Soft probability
Used in early neural nets
Makes deep networks easier to train
Used in modern deep nets
11 + 𝑒9: max(0, 𝑥)
![Page 18: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/18.jpg)
▸ Let’s plot the 2-input perceptron (sigmoid activation)𝑦 = 𝜎(𝑤0𝑥0 + 𝑤4𝑥4 + 𝑏)
17
The Perceptron Decision Boundary
𝑤0 = 10𝑤4 = 10
𝑤0 = 5𝑤4 = 20
𝑤0 = 20𝑤4 = 5
Ratio of weights change the direction of the decision boundary
![Page 19: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/19.jpg)
▸ Let’s plot the 2-input perceptron (sigmoid activation)𝑦 = 𝜎(𝑤0𝑥0 + 𝑤4𝑥4 + 𝑏)
18
The Perceptron Decision Boundary
𝑏 = 0 𝑏 = 5 𝑏 = −5
Bias moves boundary away the from origin
![Page 20: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/20.jpg)
▸ Let’s plot the 2-input perceptron (sigmoid activation)𝑦 = 𝜎(𝑤0𝑥0 + 𝑤4𝑥4 + 𝑏)
19
The Perceptron Decision Boundary
𝑤0 = 10𝑤4 = 10
𝑤0 = 5𝑤4 = 5
𝑤0 = 2𝑤4 = 2
Magnitude of weights change the steepness of the decision boundary
![Page 21: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/21.jpg)
▸ The right parameters (weights and bias) will create any linear decision boundary we want
▸ Training = process of finding the parameters to solve our classification problem– Basic idea: iteratively modify the parameters to
reduce the training loss– Training loss: measure of difference between
predictions and labels on the training set
20
Finding the Parameters
![Page 22: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/22.jpg)
▸ Loss function– Measure of difference between predictions and true labels
𝐿 =*(𝑦 , − 𝑡 , )4D
,/E
▸ Gradient Descent:
𝑤FG0 = 𝑤F − η𝜕𝐿𝜕𝑤F
𝑘 = training stepη = learning rate or step size
21
Gradient Descent
Sum over training samples
𝑦 , = Prediction
Gradient = direction of steepest descent
in 𝐿
𝑡 , = True label
![Page 23: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/23.jpg)
▸ At each step k:1. Classify each sample to get each 𝑦 ,
2. Compute the loss 𝐿
3. Compute the gradient KLKMN
4. Update the parameters using gradient descent
𝑤FG0 = 𝑤F − η𝜕𝐿𝜕𝑤F
22
Training a Neural Network
![Page 24: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/24.jpg)
▸ Perceptron training demo– No bias (bias = 0)– No test set (training samples only)
23
Demo
![Page 25: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/25.jpg)
DEEP NEURAL NETWORKSPart 2
![Page 26: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/26.jpg)
▸ A deep neural network (DNN) consists of many layers of neurons (perceptrons)▸ Each connection indicates a weight 𝑤
25
Deep Neural Network
Image credit: http://www.opennn.net/
Input LayerHidden Layers
Output Layer
![Page 27: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/27.jpg)
▸ A single neuron can only make a simple decision▸ Feeding neurons into each other allows a DNN
to learn complex decision boundaries
26
Combining Neurons
Image credit: http://www.opennn.net/
Simple Decisions
Complex Decisions
![Page 28: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/28.jpg)
27
Complex Decision Boundaries
Image credit: https://www.carl-olsson.com/fall-semester-2013/
2 units 3 units 4 units
5 units 10 units 20 units
![Page 29: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/29.jpg)
▸ Gradient Descent:
𝑤FG0 = 𝑤F − η𝜕𝐿𝜕𝑤F
28
Learning a Deep Neural Network
𝐿
How to get KLKMN
for this neuron?
![Page 30: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/30.jpg)
▸ Backpropagation: use the chain rule from calculus to propagate the gradients backwards through the network
29
Backpropagation
𝑓
𝑥
𝑦𝑓(𝑥, 𝑦)
𝜕𝑔𝜕𝑓
𝜕𝑔𝜕𝑥 =
𝜕𝑔𝜕𝑓
𝜕𝑓𝜕𝑥
𝜕𝑔𝜕𝑦 =
𝜕𝑔𝜕𝑓
𝜕𝑓𝜕𝑦
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University
𝑔 𝑔(𝑓, ℎ)
ℎ
![Page 31: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/31.jpg)
▸ Remember Gradient Descent?
𝑤FG0 = 𝑤F − η𝜕𝐿𝜕𝑤F
▸ 𝐿 must be computed over the entire training set, which can be millions of samples!
▸ Stochastic Gradient Descent:– At each set, only compute 𝐿 for a minibatch (a few samples
randomly taken from the training set)– SGD is faster and more accurate than GD for DNNs!
30
Stochastic Gradient Descent
![Page 32: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/32.jpg)
CONVOLUTIONAL NEURAL NETWORKS
Part 3
![Page 33: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/33.jpg)
▸ So far, we’ve see networks built from fully-connected layers
▸ These networks don’t work well for images. Why?
▸ Images are typically shift-invariant (i.e. a 6 is a 6 even when shifted)
▸ But a fully-connected neuron probably won’t work when the input is shifted
32
Neural Networks for Images
![Page 34: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/34.jpg)
33
The Convolutional Filter
Image credit: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
InputImage
OutputFeature map
▸ Each neuron learns a weight filter and convolves the filter over the image
▸ Each neuron outputs a 2D feature map (basically an image of features)
![Page 35: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/35.jpg)
34
The Convolutional Filter
Image credit: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
InputImage
OutputFeature map
▸ Each point in the feature map encodes both a decision and its spatial location
▸ Detects the pattern anywhere in the image!
![Page 36: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/36.jpg)
▸ 𝑀 input and 𝑁 output feature maps▸ Each output map uses 𝑀filters, 1 per input map▸ 𝑀×𝑁 total filters
35
The Convolutional Layer
Filters
Output Feature MapsInput Feature Maps
𝑀 𝑁
∗
![Page 37: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/37.jpg)
36
The Convolutional Layer
Filters
Output Feature MapsInput Feature Maps
𝑀 𝑁
∗
1for(row=0;row<R;row++){2for(col=0;col<C;col++){3for(to=0;to<N;to++){4for(ti=0;ti<M;ti++){5for(i=0; i<K;i++){6for(j=0;j<K;j++){
output_fm[to][row][col]+=weights[to][ti][i][j]*input_fm[ti][S*row+i][S*col+j];
}}}}}}
𝐶
𝑅
Huge amount of parallelism!
![Page 38: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/38.jpg)
▸ Front: convolutional layers learn visual features▸ Feature maps get downsampled through the network▸ Back: fully-connected layers perform classification using
the visual features
37
Convolutional Neural Network
Convolutional Layers Fully-connected Layers
Image credit: http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
![Page 39: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/39.jpg)
▸ Deep CNNs combine simple features into complex patterns– Early conv layers = edges, textures, ridges – Later conv layers = eyes, noses, mouths
38
Learning Complex Features
Image credit: https://devblogs.nvidia.com/parallelforall/accelerate-machine-learning-cudnn-deep-neural-network-library/; H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, “Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks”, CACM Oct 2011
![Page 40: Introduction to Neural Networks - Cornell University · 2018-10-04 · 1950’s: First artificial neural networks created based on biological structures in the human visual cortex](https://reader033.vdocuments.net/reader033/viewer/2022042219/5ec4d6ff87da536c5139021c/html5/thumbnails/40.jpg)
FIN
39