Page 1:

Back-propagation

Chih-yun Lin

Page 2:

Agenda

Perceptron vs. back-propagation network
Network structure
Learning rule
Why a hidden layer?
An example: Jets or Sharks
Conclusions

Page 3:

Network Structure – Perceptron

[Diagram: input units Ij feed a single output unit O through weights Wj]

Page 4:

Network Structure – Back-propagation Network

[Diagram: input units Ik feed hidden units aj through weights Wk,j; the hidden units feed output units Oi through weights Wj,i]
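The diagram above corresponds to a simple forward pass. The following Python sketch is not from the slides: the sigmoid is assumed as the activation function g, and the weights are assumed to be NumPy matrices; variable names follow the slide notation.

import numpy as np

def sigmoid(x):
    # Assumed activation function g
    return 1.0 / (1.0 + np.exp(-x))

def forward(I, W_kj, W_ji):
    # Forward pass through the two-layer network:
    # I are the input activations I_k, W_kj the input-to-hidden weights W_{k,j},
    # W_ji the hidden-to-output weights W_{j,i}.
    in_j = I @ W_kj        # weighted sum into each hidden unit
    a_j = sigmoid(in_j)    # hidden activations a_j = g(in_j)
    in_i = a_j @ W_ji      # weighted sum into each output unit
    O_i = sigmoid(in_i)    # outputs O_i = g(in_i)
    return a_j, O_i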

Page 5:

Learning Rule

Measure the error
Reduce that error by appropriately adjusting each of the weights in the network

Page 6:

Learning Rule – Perceptron

Err = T – O
  O is the predicted output
  T is the correct output
Wj ← Wj + α * Ij * Err
  Ij is the activation of unit j in the input layer
  α is a constant called the learning rate
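As an illustration only (the function name, array shapes, and default learning rate below are assumptions, not from the slides), one application of this rule in Python:

import numpy as np

def perceptron_update(W, I, T, O, alpha=0.1):
    # Perceptron learning rule from this slide:
    # W are the weights W_j, I the input activations I_j,
    # T the correct output, O the predicted output, alpha the learning rate.
    err = T - O                 # Err = T - O
    return W + alpha * I * err  # W_j <- W_j + alpha * I_j * Err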

Page 7:

Learning Rule – Back-propagation Network

Erri = Ti – Oi
Wj,i ← Wj,i + α * aj * Δi
Δi = Erri * g′(ini)
  g′ is the derivative of the activation function g
  aj is the activation of hidden unit j
Wk,j ← Wk,j + α * Ik * Δj
Δj = g′(inj) * Σi Wj,i * Δi
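A minimal sketch of one back-propagation step following these update rules, assuming a sigmoid activation for g and a single hidden layer (the learning rate and array shapes are illustrative):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    # g'(x) for the sigmoid activation
    s = sigmoid(x)
    return s * (1.0 - s)

def backprop_step(I, T, W_kj, W_ji, alpha=0.1):
    # Forward pass
    in_j = I @ W_kj
    a_j = sigmoid(in_j)
    in_i = a_j @ W_ji
    O = sigmoid(in_i)

    # Output layer: Delta_i = Err_i * g'(in_i)
    delta_i = (T - O) * sigmoid_deriv(in_i)

    # Hidden layer: Delta_j = g'(in_j) * sum_i W_{j,i} * Delta_i
    delta_j = sigmoid_deriv(in_j) * (W_ji @ delta_i)

    # Weight updates from this slide
    W_ji = W_ji + alpha * np.outer(a_j, delta_i)  # W_{j,i} <- W_{j,i} + alpha * a_j * Delta_i
    W_kj = W_kj + alpha * np.outer(I, delta_j)    # W_{k,j} <- W_{k,j} + alpha * I_k * Delta_j
    return W_kj, W_ji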

Page 8:

Learning Rule – Back-propagation Network

E = 1/2 Σi (Ti – Oi)²

∂E/∂Wk,j = – Ik * Δj
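To see where this gradient comes from (a short sketch added here, not on the original slide), apply the chain rule and substitute the Δ definitions from the previous slide:

∂E/∂Wk,j = Σi ∂E/∂Oi · ∂Oi/∂ini · ∂ini/∂aj · ∂aj/∂inj · ∂inj/∂Wk,j
         = Σi –(Ti – Oi) · g′(ini) · Wj,i · g′(inj) · Ik
         = – Ik · g′(inj) · Σi Wj,i · Δi
         = – Ik · Δj

So the hidden-layer update Wk,j ← Wk,j + α * Ik * Δj moves each weight in the direction of steepest decrease of E; the learning rule is gradient descent on the squared error.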

Page 9:

Why a hidden layer?

(1 · w1) + (1 · w2) < θ  ==>  w1 + w2 < θ
(1 · w1) + (0 · w2) > θ  ==>  w1 > θ
(0 · w1) + (1 · w2) > θ  ==>  w2 > θ
(0 · w1) + (0 · w2) < θ  ==>  0 < θ

These are the constraints a single perceptron with threshold θ would have to satisfy to compute XOR, and they are contradictory: w1 > θ and w2 > θ together with 0 < θ force w1 + w2 > θ.

Page 10:

Why a hidden layer? (cont.)

(1 · w1) + (1 · w2) + (1 · w3) < θ  ==>  w1 + w2 + w3 < θ
(1 · w1) + (0 · w2) + (0 · w3) > θ  ==>  w1 > θ
(0 · w1) + (1 · w2) + (0 · w3) > θ  ==>  w2 > θ
(0 · w1) + (0 · w2) + (0 · w3) < θ  ==>  0 < θ

With a third input supplied by a hidden unit that fires only when both original inputs are on (an AND unit), these constraints become satisfiable, for example with a strongly negative w3; see the sketch below.
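A minimal check of one such solution (the specific weights w1 = w2 = 1, w3 = –2 and threshold θ = 0.5 are illustrative assumptions, not from the slides):

def xor_with_hidden_unit(x1, x2, w1=1.0, w2=1.0, w3=-2.0, theta=0.5):
    # Hidden unit fires only when both inputs are on (an AND unit)
    h = 1 if x1 + x2 > 1.5 else 0
    # Output unit: weighted sum of the two inputs plus the hidden unit,
    # compared against the threshold theta
    return 1 if x1 * w1 + x2 * w2 + h * w3 > theta else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_with_hidden_unit(x1, x2))  # prints the XOR truth table

These weights satisfy all four constraints above, which no choice of w1 and w2 alone could.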

Page 11:

An example: Jets or Sharks

Page 12:

Conclusions

Expressiveness: well-suited for continuous inputs, unlike most decision-tree systems
Computational efficiency: time to error convergence is highly variable
Generalization: has had reasonable success in a number of real-world problems

Page 13:

Conclusions (cont.)

Sensitivity to noise: very tolerant of noise in the input data
Transparency: neural networks are essentially black boxes
Prior knowledge: hard to use one's knowledge to "prime" a network to learn better