Machine Learning EE514 – CS535: Neural Networks
Zubair Khalid, School of Science and Engineering, Lahore University of Management Sciences
https://www.zubairkhalid.org/ee514_2021.html


Page 1: Machine Learning EE514 CS535 Neural Networks

Machine Learning

EE514 – CS535

Neural Networks

Zubair Khalid

School of Science and Engineering, Lahore University of Management Sciences

https://www.zubairkhalid.org/ee514_2021.html

Page 2: Machine Learning EE514 CS535 Neural Networks

Outline

- Neural networks connection with perceptron and logistic regression

- Neural networks notation

- Neural networks ‘Forward Pass’

- Activation functions

- Learning neural network parameters

- Back propagation

Page 3: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Connection with Logistic Regression and Perceptron:

Perceptron Model:

Activation Function
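For reference (the slide's own equation is in a figure), the standard perceptron model computes a weighted sum of the inputs plus a bias and applies a hard-threshold activation:

    \hat{y} = \operatorname{sign}(\mathbf{w}^\top \mathbf{x} + b)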

Page 4: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Connection with Logistic Regression and Perceptron:

The activation function is the logistic/sigmoid function.

Logistic Regression Model:

Logistic Regression Model, aka Sigmoid Neuron
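For reference, the sigmoid neuron keeps the same linear aggregation but replaces the hard threshold with the logistic function, so the output is a smooth value in (0, 1):

    \hat{y} = \sigma(\mathbf{w}^\top \mathbf{x} + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}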

Page 5: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Connection with Logistic Regression and Perceptron:

Activation Function: Perceptron vs Sigmoid Neuron

Weighted sum of inputs + bias

Page 6: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Neuron Model:

Compact Representation:

More Compact Representation:

Page 7: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Infamous XOR Problem:

Idea: Learn AND and OR boundaries.

Page 8: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Infamous XOR Problem:

AND

OR
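As a minimal sketch of this idea, the snippet below wires two step-activation neurons (one implementing the OR boundary, one the AND boundary) into an output neuron that computes XOR = OR AND NOT(AND). The specific weights are chosen by hand for illustration and are not taken from the slides:

    def step(z):
        # Hard-threshold activation: 1 if z >= 0, else 0
        return int(z >= 0)

    def xor_net(x1, x2):
        h_or  = step(1.0 * x1 + 1.0 * x2 - 0.5)   # fires when x1 OR x2
        h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)   # fires when x1 AND x2
        # Output fires when OR is on but AND is off, i.e. XOR
        return step(1.0 * h_or - 1.0 * h_and - 0.5)

    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(a, b, "->", xor_net(a, b))   # prints 0, 1, 1, 0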

Page 9: Machine Learning EE514 CS535 Neural Networks

Neural Networks

Each unit in the network is called a neuron, and the network of neurons is called a neural network.

- Output is a non-linear function
- of a linear combination
- of non-linear functions
- of a linear combination of inputs

Page 10: Machine Learning EE514 CS535 Neural Networks

Neural Networks:

Example: 3-layer network, 2 hidden layers

- Output is a non-linear function
- of a linear combination
- of non-linear functions
- of linear combinations
- of non-linear functions
- of linear combinations of inputs

Feedforward Neural Network: Output from one layer is an input to the next layer.

Page 11: Machine Learning EE514 CS535 Neural Networks

Neural Networks – What kind of functions can be modeled by a neural network?

Intuition: XOR example

Page 12: Machine Learning EE514 CS535 Neural Networks

Neural Networks – What kind of functions can be modeled by a neural network?

Intuition: Example (Sigmoid neuron)

Source: http://neuralnetworksanddeeplearning.com/chap4.html

Page 13: Machine Learning EE514 CS535 Neural Networks

Neural Networks – What kind of functions can be modeled by a neural network?

Intuition: Example (Multi-layer)

Source: http://neuralnetworksanddeeplearning.com/chap4.html

Page 14: Machine Learning EE514 CS535 Neural Networks

Neural Networks – What kind of functions can be modeled by a neural network?

Universal Approximation Theorem (Hornik 1991):

“A single hidden layer neural network with a linear output unit can approximate any continuous function arbitrarily well, given enough hidden units.”

- The theorem demonstrates the capability of neural networks, but it does not mean there is a learning algorithm that can find the necessary parameter values.

- Since each neuron contributes a non-linearity, we can keep increasing the number of neurons in the hidden layer to model the function, but this also increases the number of parameters defining the model.

- Instead of adding more neurons in the same layer, we prefer to add more hidden layers, because non-linear projections of non-linear projections can model complex functions relatively easily.
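Concretely, the theorem concerns single-hidden-layer networks of the form

    f(\mathbf{x}) \approx \sum_{i=1}^{N} v_i \, \sigma(\mathbf{w}_i^\top \mathbf{x} + b_i)

which can approximate any continuous function on a compact set arbitrarily well as the number of hidden units N grows (standard statement of the result, given here for reference).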

Page 15: Machine Learning EE514 CS535 Neural Networks

Outline

- Neural networks connection with perceptron and logistic regression

- Neural networks notation

- Neural networks ‘Forward Pass’

- Activation functions

- Learning neural network parameters

- Back propagation

Page 16: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Notation:

Single Neuron:

(Figure labels: the pre-activation, or aggregation, is the linear transformation of the inputs; the activation is the non-linear transformation of the pre-activation.)
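In the usual notation (given for reference; the slide's own equations are in a figure), a single neuron computes

    z = \mathbf{w}^\top \mathbf{x} + b \quad \text{(pre-activation: linear transformation)}, \qquad a = g(z) \quad \text{(activation: non-linear transformation)}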

Page 17: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Notation:

Example: 3-layer network, 2 hidden layers

Page 18: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Notation:

Example: 3-layer network, 2 hidden layers
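In a standard layered notation consistent with the single-neuron case above (for reference; the slide's equations are in a figure), with \mathbf{a}^{(0)} = \mathbf{x}:

    \mathbf{z}^{(\ell)} = \mathbf{W}^{(\ell)} \mathbf{a}^{(\ell-1)} + \mathbf{b}^{(\ell)}, \qquad \mathbf{a}^{(\ell)} = g\big(\mathbf{z}^{(\ell)}\big), \qquad \ell = 1, 2, 3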

Page 19: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Notation:

Page 20: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Forward Pass:

Page 21: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Forward Pass Summary:
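A minimal NumPy sketch of the forward pass, assuming the layer notation above (the names Ws, bs, and g are illustrative, not from the slides):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward_pass(x, Ws, bs, g=sigmoid):
        # Ws, bs: lists of per-layer weight matrices and bias vectors.
        # Returns the activations of all layers; activations[0] is the input.
        a = x
        activations = [a]
        for W, b in zip(Ws, bs):
            z = W @ a + b      # pre-activation (linear transformation)
            a = g(z)           # activation (non-linear transformation)
            activations.append(a)
        return activations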

Page 22: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Forward Pass – Incorporating all Inputs:

Recall:

Page 23: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Forward Pass Summary – All Inputs:
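To run the forward pass on all n training inputs at once, stack the inputs as the columns of a matrix and apply the same update per layer (standard vectorized form, consistent with the notation above):

    \mathbf{A}^{(0)} = \mathbf{X} = [\mathbf{x}_1 \;\cdots\; \mathbf{x}_n], \qquad \mathbf{Z}^{(\ell)} = \mathbf{W}^{(\ell)} \mathbf{A}^{(\ell-1)} + \mathbf{b}^{(\ell)} \mathbf{1}^\top, \qquad \mathbf{A}^{(\ell)} = g\big(\mathbf{Z}^{(\ell)}\big)

where \mathbf{b}^{(\ell)} \mathbf{1}^\top broadcasts the bias across all n columns.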

Page 24: Machine Learning EE514 CS535 Neural Networks

Outline

- Neural networks connection with perceptron and logistic regression

- Neural networks notation

- Neural networks ‘Forward Pass’

- Activation functions

- Learning neural network parameters

- Back propagation

Page 25: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Activation Function:

Page 26: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Step:

Linear:
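For reference, the two definitions:

    \text{Step: } g(z) = \begin{cases} 1 & z \geq 0 \\ 0 & z < 0 \end{cases}, \qquad \text{Linear: } g(z) = z

Note that if every hidden unit is linear, the whole network collapses to a single linear map, which is why hidden layers need non-linear activations.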

Page 27: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Sigmoid:

Issues:
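For reference (the slide's details are in a figure; these are the commonly cited points):

    \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \sigma'(z) = \sigma(z)\,(1 - \sigma(z)) \leq \tfrac{1}{4}

The function saturates for large |z|, so gradients vanish; the output is not zero-centered; and the exponential is relatively costly to compute.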

Page 28: Machine Learning EE514 CS535 Neural Networks

Neural Networks – tanh (Hyperbolic tangent):

Issues:
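For reference:

    \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}, \qquad \tanh'(z) = 1 - \tanh^2(z)

tanh is zero-centered, which helps optimization, but like the sigmoid it still saturates for large |z|.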

Page 29: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Rectifier – Rectified Linear Unit (ReLU):

Issues:

Why?
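For reference:

    \operatorname{ReLU}(z) = \max(0, z)

The gradient is exactly 0 for z < 0, so a unit whose pre-activation stays negative receives no updates and can get stuck "dead"; this is the standard explanation for the issue raised above.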

Different variants of ReLU have been proposed to overcome these issues.

Page 30: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Leaky Rectified Linear Unit (Leaky ReLU):

We use rectified linear, leaky rectified linear, sigmoid, and tanh activations for the hidden layers.
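For reference, with a small slope such as \alpha = 0.01:

    \operatorname{LeakyReLU}(z) = \max(\alpha z, z)

The non-zero slope for z < 0 keeps the gradient alive and avoids dead units.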

Page 31: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Activation Function for the Output Layer:
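The standard pairings, given here for reference (the slide's own table is in a figure): a linear output for regression, a sigmoid output for binary classification (the output is a probability), and a softmax output for multi-class classification:

    \operatorname{softmax}(\mathbf{z})_k = \frac{e^{z_k}}{\sum_j e^{z_j}}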

Page 32: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Choice of the Activation Function:

• For classification tasks, we prefer to use sigmoid and tanh functions and their combinations.

• Due to the saturation problem, sigmoid and tanh functions are sometimes avoided.

• As indicated earlier, the ReLU function is the most widely used (it is computationally fast).

• ReLU variants are used to resolve the dead-neuron issue (e.g., Leaky ReLU).

• It must be noted that the ReLU function is only used in the hidden layers.

• Start with ReLU or leaky/randomized ReLU, and if the results are not satisfactory, you may try other activation functions.

Page 33: Machine Learning EE514 CS535 Neural Networks

Outline

- Neural networks connection with perceptron and logistic regression

- Neural networks notation

- Neural networks ‘Forward Pass’

- Activation functions

- Learning neural network parameters

- Back propagation

Page 34: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Learning Weights:

Notation revisited:

Parameters we need to learn!

Example:

Page 35: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Learning Weights:

We use a method called 'back propagation' to implement the chain rule for the computation of the gradient.

We use the log loss here if we have a classification problem and the output represents a probability.
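For a binary classification output \hat{y} = P(y = 1 \mid \mathbf{x}), the log loss is

    L(y, \hat{y}) = -\big[\, y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \,\big]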

Page 36: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Key Idea:

Forward Pass

The weights are the only parameters that can be modified to make the loss function as low as possible.

The learning problem thus reduces to calculating the gradient (partial derivatives) of the loss function with respect to the weights.

Back Propagation
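Schematically, for a weight w that feeds a pre-activation z with activation a = g(z), the chain rule gives

    \frac{\partial L}{\partial w} = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial w}

and back propagation organizes this computation so that one backward sweep yields the gradients for all layers, reusing the values cached during the forward pass.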

Page 37: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Example:

Page 38: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Example:

Forward Pass

Nothing fancy so far: we have computed the output and the loss by traversing the neural network. Let's now compute the contribution of each node to the loss, i.e., back-propagate the loss.
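The worked numbers on these slides are in figures. As a self-contained substitute, the sketch below runs one forward and one backward pass on a single sigmoid neuron with squared-error loss; all values are illustrative and not taken from the slides:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x, y = np.array([1.0, 2.0]), 1.0    # input and target (illustrative)
    w, b = np.array([0.5, -0.3]), 0.1   # initial parameters (illustrative)

    # Forward pass: compute and cache intermediate values
    z = w @ x + b               # pre-activation
    a = sigmoid(z)              # activation (network output)
    L = 0.5 * (a - y) ** 2      # squared-error loss

    # Backward pass: apply the chain rule, reusing the cached a
    dL_da = a - y               # dL/da
    da_dz = a * (1.0 - a)       # sigmoid derivative g'(z)
    delta = dL_da * da_dz       # dL/dz
    dL_dw = delta * x           # dL/dw, since dz/dw = x
    dL_db = delta               # dL/db, since dz/db = 1

    print(L, dL_dw, dL_db)      # loss and gradients for one update step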

Page 39: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Example:

Page 40: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Example:

Page 41: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Page 42: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Forward Pass

Partial Derivatives:

Back Propagation

Page 43: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Partial Derivatives:

Page 44: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Page 45: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Page 46: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Page 47: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Page 48: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Key Equations:
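The equations themselves appear as figures in the original transcript; in the form given in the Nielsen reference below (adapted to the layer notation used here, with \odot the element-wise product, \mathcal{L} the loss, and L the last layer), they are:

    \delta^{(L)} = \nabla_{\mathbf{a}} \mathcal{L} \odot g'(\mathbf{z}^{(L)})
    \delta^{(\ell)} = \big( (\mathbf{W}^{(\ell+1)})^\top \delta^{(\ell+1)} \big) \odot g'(\mathbf{z}^{(\ell)})
    \partial \mathcal{L} / \partial \mathbf{b}^{(\ell)} = \delta^{(\ell)}
    \partial \mathcal{L} / \partial \mathbf{W}^{(\ell)} = \delta^{(\ell)} \big(\mathbf{a}^{(\ell-1)}\big)^\top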

Page 49: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Interpretation of Key Equations:

Back Propagation

Page 50: Machine Learning EE514 CS535 Neural Networks

Neural Networks – Back Propagation – Vectorization:

Page 51: Machine Learning EE514 CS535 Neural Networks

Neural Networks – References:

• TM – Chapter 4

• CB – Chapter 5 (see Section 5.3 for back propagation)

• ‘Neural Networks: A Comprehensive Foundation’ and ‘Neural Networks and Learning Machines’ by Simon Haykin

• http://neuralnetworksanddeeplearning.com/