Back-propagation
Chih-yun Lin
Agenda
Perceptron vs. back-propagation network
Network structure
Learning rule
Why a hidden layer?
An example: Jets or Sharks
Conclusions
Network Structure – Perceptron
[Diagram: input units Ij connect directly to the output unit O through weights Wj; there is no hidden layer.]
Network Structure – Back-propagation Network
[Diagram: input units Ik feed hidden units aj through weights Wk,j, and the hidden units feed output units Oi through weights Wj,i.]
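To make the layered structure concrete, here is a minimal NumPy sketch of a forward pass through such a network. The sigmoid activation and the array names (I, W_kj, W_ji) are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(x):
    """Activation function g applied at the hidden and output units."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(I, W_kj, W_ji):
    """Forward pass: inputs I_k -> hidden activations a_j -> outputs O_i.

    I    : input vector, shape (n_inputs,)
    W_kj : input-to-hidden weights, shape (n_inputs, n_hidden)
    W_ji : hidden-to-output weights, shape (n_hidden, n_outputs)
    """
    a_j = sigmoid(I @ W_kj)    # hidden activations a_j
    O_i = sigmoid(a_j @ W_ji)  # output activations O_i
    return a_j, O_i
```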
Learning Rule
Measure the error
Reduce that error by appropriately adjusting each of the weights in the network
Learning Rule – Perceptron
Err = T – O
  O is the predicted output; T is the correct output
Wj ← Wj + α * Ij * Err
  Ij is the activation of unit j in the input layer
  α is a constant called the learning rate
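A minimal sketch of this update rule, assuming a step (threshold) activation and NumPy arrays; the function and variable names are illustrative.

```python
import numpy as np

def perceptron_update(W, I, T, alpha=0.1):
    """One perceptron learning step: W_j <- W_j + alpha * I_j * Err.

    W : weight vector, one weight per input unit
    I : input activations I_j
    T : correct (target) output
    """
    O = 1.0 if np.dot(W, I) > 0 else 0.0  # predicted output (step activation)
    Err = T - O                           # Err = T - O
    return W + alpha * I * Err            # adjust each weight in proportion to its input
```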
Learning Rule – Back-propagation Network
Erri = Ti – Oi
Wj,i ← Wj,i + α * aj * Δi
Δi = Erri * g′(ini)
  g′ is the derivative of the activation function g
  aj is the activation of hidden unit j
Wk,j ← Wk,j + α * Ik * Δj
Δj = g′(inj) * Σi Wj,i * Δi
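A minimal sketch of one back-propagation step for a single training example, assuming sigmoid units (so g′(in) = g(in)(1 − g(in))) and the forward pass shown earlier; the names and the NumPy layout are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(I, T, W_kj, W_ji, alpha=0.5):
    """One back-propagation update for one training example.

    I    : input vector I_k          T    : target vector T_i
    W_kj : input-to-hidden weights   W_ji : hidden-to-output weights
    """
    # Forward pass
    a_j = sigmoid(I @ W_kj)                  # hidden activations a_j
    O_i = sigmoid(a_j @ W_ji)                # output activations O_i

    # Output layer: Delta_i = Err_i * g'(in_i), with g'(in) = O(1 - O) for a sigmoid
    Err_i = T - O_i
    delta_i = Err_i * O_i * (1.0 - O_i)

    # Hidden layer: Delta_j = g'(in_j) * sum_i W_ji * Delta_i
    delta_j = a_j * (1.0 - a_j) * (W_ji @ delta_i)

    # Weight updates: W_ji += alpha * a_j * Delta_i ; W_kj += alpha * I_k * Delta_j
    W_ji = W_ji + alpha * np.outer(a_j, delta_i)
    W_kj = W_kj + alpha * np.outer(I, delta_j)
    return W_kj, W_ji
```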
Learning Rule – Back-propagation Network
E = 1/2 Σi (Ti – Oi)²
∂E/∂Wk,j = – Ik * Δj
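The gradient on this slide is the partial derivative of E with respect to each weight; a short derivation sketch using the slide's own definitions of Δi and Δj (the intermediate chain-rule steps are reconstructed, not from the slides):

$$
\frac{\partial E}{\partial W_{j,i}} = -(T_i - O_i)\,g'(in_i)\,a_j = -a_j\,\Delta_i,
\qquad
\frac{\partial E}{\partial W_{k,j}} = -\Big(\sum_i W_{j,i}\,\Delta_i\Big)\,g'(in_j)\,I_k = -I_k\,\Delta_j
$$

so each weight update above is simply a gradient-descent step of size α on E.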
Why a hidden layer?
A single perceptron with threshold θ cannot represent XOR, because the four training cases give contradictory constraints on the weights:
(1·w1) + (1·w2) < θ  ==>  w1 + w2 < θ
(1·w1) + (0·w2) > θ  ==>  w1 > θ
(0·w1) + (1·w2) > θ  ==>  w2 > θ
(0·w1) + (0·w2) < θ  ==>  0 < θ
The last three imply w1 + w2 > 2θ > θ, contradicting the first.
Why a hidden layer? (cont.)
With a third connection w3 (e.g. from a hidden unit that is active only when both inputs are 1), the constraints become satisfiable:
(1·w1) + (1·w2) + (1·w3) < θ  ==>  w1 + w2 + w3 < θ
(1·w1) + (0·w2) + (0·w3) > θ  ==>  w1 > θ
(0·w1) + (1·w2) + (0·w3) > θ  ==>  w2 > θ
(0·w1) + (0·w2) + (0·w3) < θ  ==>  0 < θ
For example, w1 = w2 = 1, w3 = –2, θ = 0.5 satisfies all four (a numerical check follows below).
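A small sketch verifying the example weights above on the XOR truth table; the hidden unit that detects the (1, 1) case and the specific values w1 = w2 = 1, w3 = −2, θ = 0.5 are illustrative assumptions, not taken from the slides.

```python
def step(x, theta):
    """Threshold unit: fires (returns 1) when its weighted input exceeds theta."""
    return 1 if x > theta else 0

def xor_with_hidden_unit(x1, x2, w1=1.0, w2=1.0, w3=-2.0, theta=0.5):
    """XOR via one hidden unit that is active only when both inputs are 1."""
    hidden = step(x1 * 1.0 + x2 * 1.0, 1.5)          # AND detector: fires only for (1, 1)
    return step(x1 * w1 + x2 * w2 + hidden * w3, theta)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_with_hidden_unit(x1, x2))
# Prints 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0 -- the XOR truth table.
```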
An example: Jets or Sharks
Conclusions
Expressiveness: well-suited for continuous inputs, unlike most decision-tree systems
Computational efficiency: time to error convergence is highly variable
Generalization: reasonable success in a number of real-world problems
Conclusions (cont.)
Sensitivity to noise: very tolerant of noise in the input data
Transparency: neural networks are essentially black boxes
Prior knowledge: hard to use one's knowledge to "prime" a network to learn better