Neural Net 3rd Class
TRANSCRIPT
-
A Brief Overview of Neural Networks
-
Relation to Biological Brain: Biological Neural Network
The Artificial Neuron
Types of Networks and Learning Techniques
Supervised Learning & Backpropagation Training Algorithm
Applications
-
-
[Figure: the artificial neuron: inputs, weights W, activation function f(n), outputs]
-
Transfer functions (plotted as output vs. input):
SIGMOID: f(n) = 1/(1 + e^(-n))
LINEAR: f(n) = n
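A minimal sketch of a single artificial neuron using these two transfer functions (the inputs and weights below are invented for illustration):

```python
import math

def sigmoid(n):
    # SIGMOID transfer: f(n) = 1/(1 + e^(-n)); squashes any net input into (0, 1)
    return 1.0 / (1.0 + math.exp(-n))

def linear(n):
    # LINEAR transfer: f(n) = n; passes the net input through unchanged
    return n

def neuron(inputs, weights, transfer):
    # Net input n is the weighted sum of the inputs; the transfer
    # function then maps n to the neuron's output.
    n = sum(x * w for x, w in zip(inputs, weights))
    return transfer(n)

print(neuron([1.0, 0.5], [0.3, 0.7], sigmoid))  # ~0.657
print(neuron([1.0, 0.5], [0.3, 0.7], linear))   # 0.65
```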
-
Multiple Inputs and Multiple Layers
-
Feedback: Recurrent Networks
-
Recurrent Networks
Feed-forward networks:
One input pattern produces one output
Recurrency:
Nodes connect back to other nodes or themselves
Information flow is multidirectional
Sense of time and memory of previous state(s)
Biological nervous systems show high levels of recurrency (but feed-forward structures exist too)
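A toy sketch of the contrast (the weights are invented): a feed-forward unit maps each input independently, while a recurrent unit mixes the current input with its own previous state, giving it a sense of time and memory.

```python
def feed_forward(x, w=0.5):
    # One input pattern produces one output; no memory of earlier inputs.
    return w * x

def recurrent(xs, w_in=0.5, w_back=0.8):
    # The node connects back to itself: the state at time t depends on
    # the input at t AND the state at t-1.
    state = 0.0
    outputs = []
    for x in xs:
        state = w_in * x + w_back * state  # feedback connection
        outputs.append(state)
    return outputs

print([feed_forward(x) for x in [1, 0, 0]])  # [0.5, 0.0, 0.0]
print(recurrent([1, 0, 0]))                  # [0.5, 0.4, 0.32]: the 1 echoes
```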
-
Components of biological neural nets:
1. Neurones (nodes)
2. Synapses (weights)
-
Feed-forward networks:
Information flow is unidirectional
Data is presented to Input layer
Passed on to Hidden Layer
Passed on to Output layer
Information is distributed
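A sketch of this unidirectional flow (the layer sizes and weights are invented): data enters at the input layer and is passed forward, layer by layer, to the output.

```python
import math

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

def layer(inputs, weight_rows):
    # Each row of weights defines one neuron in the layer.
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)))
            for row in weight_rows]

x = [1.0, 0.5]                                # presented to the input layer
hidden = layer(x, [[0.2, -0.3], [0.4, 0.1]])  # passed on to the hidden layer
output = layer(hidden, [[0.7, -0.2]])         # passed on to the output layer
print(output)
```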
-
Neural networks are good for prediction problems when:
The inputs are well understood. You have a good idea of which features of the data are important, but not necessarily how to combine them.
The output is well understood. You know what you are trying to predict.
Experience is available. You have plenty of examples where both the inputs and the output are known. This experience will be used to train the network.
-
Example: net input = (1 × 0.25) + (0.5 × (-1.5)) = 0.25 + (-0.75) = -0.5
Squashing: output = 1/(1 + e^(0.5)) = 0.3775
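A quick check of the slide's arithmetic:

```python
import math

n = (1 * 0.25) + (0.5 * (-1.5))    # net input = -0.5
out = 1.0 / (1.0 + math.exp(-n))   # squashing: sigmoid of -0.5
print(n, round(out, 4))            # -0.5 0.3775
```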
-
[Figure: supervised training: inputs from the actual system drive both the system and the neural network; the expected output is compared with the network's actual output, and the difference (error) is fed back for training]
-
[Figure: network topology: inputs, first hidden layer, second hidden layer, output layer]
-
Signal Flow and Backpropagation of Errors
[Figure: function signals flow forward through the network; error signals flow backward]
-
Neural networks for Directed Data Mining: building a model for classification and prediction
1. Identify the input and output features.
2. Normalize the features so the range is between 0 and 1 (see the scaling sketch after this list).
3. Set up a network on a representative set of training examples.
4. Train the network on a representative set of training examples.
5. Test the network on a test set strictly independent from the training examples. If necessary, repeat the training, changing the network parameters. Evaluate the network using the evaluation set to see how well it performs.
6. Apply the network to predict outcomes for unknown inputs.
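For step 2, one common way to bring a feature into the 0-to-1 range is min-max scaling (a sketch; the sample values are invented):

```python
def min_max_scale(values):
    # Maps the smallest value to 0 and the largest to 1.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_scale([12, 30, 48]))  # [0.0, 0.5, 1.0]
```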
-
Hidden layer transfer function: Sigmoid function = F(n) = 1/(1+exp(-n)), where n is the net input to the neuron.
Derivative = F'(n) = (output of the neuron)(1 - output of the neuron): slope of the transfer function.
Output layer transfer function: Linear function = F(n) = n; output = input to the neuron.
Derivative = F'(n) = 1
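The sigmoid's slope really can be computed from its own output, which is what makes backpropagation cheap. A small numerical check (the test point n = 0.3 is arbitrary):

```python
import math

def F(n):
    return 1.0 / (1.0 + math.exp(-n))

n = 0.3
o = F(n)
analytic = o * (1 - o)                          # slope via the identity
numeric = (F(n + 1e-6) - F(n - 1e-6)) / 2e-6    # finite-difference slope
print(round(analytic, 6), round(numeric, 6))    # the two agree
```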
-
Purpose of the Activation Function
We want the unit to be active (near +1) when the right inputs are given.
We want the unit to be inactive (near 0) when the wrong inputs are given.
It's preferable for the activation function to be nonlinear. Otherwise the entire neural network collapses into a simple linear function (demonstrated in the sketch below).
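The collapse is easy to demonstrate: two stacked linear layers compute exactly the same function as one linear layer (the weights below are arbitrary):

```python
import math

w1, w2 = 0.8, -1.5   # two stacked linear "layers", no nonlinearity between them

def two_linear_layers(x):
    return w2 * (w1 * x)

def one_linear_layer(x):
    return (w2 * w1) * x   # the same mapping with a single weight

for x in [0.0, 1.0, -2.5]:
    assert math.isclose(two_linear_layers(x), one_linear_layer(x))
print("stacked linear layers collapse to a single linear layer")
```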
-
Activation functions:
Step function: step(x) = 1 if x > threshold, 0 if x ≤ threshold (in the picture above, threshold = 0)
Sign function: sign(x) = +1 if x > 0, -1 if x ≤ 0
Sigmoid (logistic) function: sigmoid(x) = 1/(1 + e^(-x))
Adding an extra input with activation a0 = -1 and weight W0,j = t sets the threshold at t. This way we can always assume a 0 threshold.
-
Using a bias as a weight:
[Figure: a neuron with inputs x1 (weight W1), x2 (weight W2), and a bias input fixed at -1]
W1x1 + W2x2 < T is equivalent to W1x1 + W2x2 - T < 0
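A sketch of the equivalence (the AND-style weights and threshold are invented): folding the threshold T in as the weight on a constant -1 input lets the neuron always compare against 0.

```python
def fires_with_threshold(x1, x2, W1, W2, T):
    # Original form: compare the weighted sum against the threshold T.
    return W1 * x1 + W2 * x2 > T

def fires_with_bias(x1, x2, W1, W2, T):
    # Bias form: extra input fixed at -1 carrying weight T, threshold 0.
    return W1 * x1 + W2 * x2 + (-1) * T > 0

# The two formulations agree on every input.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert fires_with_threshold(x1, x2, 1.0, 1.0, 1.5) == \
           fires_with_bias(x1, x2, 1.0, 1.0, 1.5)
print("threshold folded into a bias weight; 0 threshold everywhere")
```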
-
Reducing errors using gradient descent training.
[Figure legend:]
Red: Current weights
Orange: Updated weights
Black boxes: Inputs and outputs to a neuron
Blue: Sensitivities at each layer
-
The perceptron learning rule performs gradient descent in weight space.
Error surface: the surface that describes the error on each example as a function of all the weights in the network. A point on the surface gives the error for a particular setting of the weights. (It could also be called a state in the state space of possible weights, i.e., weight space.)
The derivative of the error is computed with respect to each weight (i.e., the gradient: how much the error would change if we made a small change in each weight). Then the weights are altered by an amount proportional to the slope in each direction (corresponding to a weight). Thus the network as a whole is moving in the direction of steepest descent on the error surface.
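A one-weight illustration of steepest descent (the error surface here is a made-up quadratic, not a real network's):

```python
# Gradient descent on a toy error surface E(w) = (w - 3)^2, minimum at w = 3.
def dE_dw(w):
    return 2 * (w - 3)        # slope of the error surface at w

w, rate = 0.0, 0.1            # start far from the minimum
for _ in range(25):
    w -= rate * dE_dw(w)      # step opposite the gradient: steepest descent
print(round(w, 4))            # ~2.9887, approaching 3.0
```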
-
Definition of Error:
This is introduced to simplify the math on the next slide.

E = Σ_examples (t - o)^2 / 2, i.e., Err^2/2 per example, where Err = t - o

Here, t is the correct (desired) output and o is the actual output of the neural net.
-
Reduction of Squared Error
Gradient descent: compute the partial derivative of E with respect to each weight:

∂E/∂Wj = Err × ∂Err/∂Wj   (chain rule for derivatives)
       = Err × ∂[t - g(Σk Wk xk)]/∂Wj   (expanding the second Err above to t - g(in))
       = -Err × g'(in) × xj   (because ∂t/∂Wj = 0, and the chain rule)

Collected into a vector over all the weights, this is called the gradient.

Weight update rule: Wj = Wj + α × Err × g'(in) × xj

The weight is updated by α times this gradient of error E in weight space. The fact that the weight is updated in the correct direction (+/-) can be verified with examples.
The learning rate, α, is typically set to a small value such as 0.1.
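A minimal sketch of this update rule for a single sigmoid unit (the OR training data, α = 0.5, and the epoch count are invented; g'(in) uses the identity o(1 - o) from earlier):

```python
import math

def g(n):
    return 1.0 / (1.0 + math.exp(-n))

# Learn OR with one sigmoid unit; x[0] is a bias input fixed at 1.
data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
W = [0.0, 0.0, 0.0]
alpha = 0.5

for epoch in range(2000):
    for x, t in data:
        n = sum(wj * xj for wj, xj in zip(W, x))
        o = g(n)
        err = t - o                                   # Err = t - o
        for j in range(len(W)):
            W[j] += alpha * err * o * (1 - o) * x[j]  # Wj + a*Err*g'(in)*xj

print([round(g(sum(wj * xj for wj, xj in zip(W, x))), 2) for x, _ in data])
# the outputs approach the targets [0, 1, 1, 1]
```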
-
Pass 1 (forward):
[Figure: input 1; all weights 0.5; each first-hidden-layer neuron outputs 0.6225, each second-hidden-layer neuron outputs 0.6508; network output 0.6508]

Error = 1 - 0.6508 = 0.3492
Gradient of the output neuron = slope of the transfer function × error:
G3 = (1)(0.3492) = 0.3492
Gradient of a hidden neuron = slope of the transfer function × [Σ {(weight of the neuron to the next neuron) × (gradient of the next neuron)}]:
G2 = (0.6508)(1 - 0.6508)(0.5 × 0.3492) = 0.0397
G1 = (0.6225)(1 - 0.6225)(0.5 × 0.0397 + 0.5 × 0.0397) = 0.0093
-
New Weight = Old Weight + {(learning rate)(gradient)(prior output)}
New W3 = 0.5 + (0.5)(0.3492)(0.6508) = 0.6136
New W2 = 0.5 + (0.5)(0.0397)(0.6225) = 0.5124
New W1 = 0.5 + (0.5)(0.0093)(1) = 0.5047
-
Pass 2 (forward, with the updated weights):
[Figure: input 1; W1 = 0.5047, W2 = 0.5124, W3 = 0.6136; first-hidden-layer outputs 0.6236, second-hidden-layer net input 0.6391 and output 0.6545; network output 0.8033]

Error = 1 - 0.8033 = 0.1967
G3 = (1)(0.1967) = 0.1967
G2 = (0.6545)(1 - 0.6545)(0.6136 × 0.1967) = 0.0273
G1 = (0.6236)(1 - 0.6236)(0.5124 × 0.0273 + 0.5124 × 0.0273) = 0.0066
-
New Weight = Old Weight + {(learning rate)(gradient)(prior output)}
New W3 = 0.6136 + (0.5)(0.1967)(0.6545) = 0.6779
New W2 = 0.5124 + (0.5)(0.0273)(0.6236) = 0.5209
New W1 = 0.5047 + (0.5)(0.0066)(1) = 0.508
-
Pass 3 (forward, with the updated weights):
[Figure: input 1; W1 = 0.508, W2 = 0.5209, W3 = 0.6779; first-hidden-layer outputs 0.6243, second-hidden-layer net input 0.6504 and output 0.6571; network output 0.8909]
-
                    W1      W2      W3      Output  Expected  Error
Initial conditions  0.5     0.5     0.5     0.6508  1         0.3492
Pass 1 Update       0.5047  0.5124  0.6136  0.8033  1         0.1967
Pass 2 Update       0.508   0.5209  0.6779  0.8909  1         0.1091

W1: Weights from the input to the first hidden layer
W2: Weights from the first hidden layer to the second hidden layer
W3: Weights from the second hidden layer to the output layer
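The whole worked example can be reproduced in a few lines. This sketch assumes the architecture the slides imply: one input, two sigmoid neurons in each hidden layer, one linear output neuron, a single shared weight per layer starting at 0.5, learning rate 0.5, expected output 1.

```python
import math

def sig(n):
    return 1.0 / (1.0 + math.exp(-n))

W1, W2, W3 = 0.5, 0.5, 0.5     # one shared weight per layer
x, expected, rate = 1.0, 1.0, 0.5

for p in range(1, 4):
    # Forward pass: 1 input -> 2 sigmoid -> 2 sigmoid -> 1 linear output.
    o1 = sig(W1 * x)           # each first-hidden-layer neuron
    o2 = sig(2 * W2 * o1)      # each second-layer neuron sums both o1's
    o3 = 2 * W3 * o2           # the linear output neuron sums both o2's
    err = expected - o3
    print(f"pass {p}: W=({W1:.4f}, {W2:.4f}, {W3:.4f}) "
          f"output={o3:.4f} error={err:.4f}")

    # Gradients: slope of the transfer function times the fed-back error.
    G3 = 1.0 * err                       # linear output neuron: slope = 1
    G2 = o2 * (1 - o2) * (W3 * G3)       # sigmoid slope * (weight * next G)
    G1 = o1 * (1 - o1) * (2 * W2 * G2)   # both second-layer neurons feed back

    # Update: new weight = old weight + rate * gradient * prior output.
    W3 += rate * G3 * o2
    W2 += rate * G2 * o1
    W1 += rate * G1 * x
```

Run as-is, the three printed passes reproduce the table above up to rounding: outputs 0.6508, 0.8033, and 0.8910 (the slides show 0.8909).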
-
Backpropagation training continues until the error goal is reached.
Other, more complicated backpropagation training algorithms are also available.
-
[Figure: a neuron with inputs O1 (weight W1) and O2 (weight W2) producing Output O3]
N = (O1 × W1) + (O2 × W2)
O3 = 1/[1 + exp(-N)]
Error = Expected output - Actual output (O3)
O = Output of the neuron
W = Weight
N = Net input to the neuron

To reduce error, the change in weights depends on:
o Learning rate
o Gradient: rate of change of error w.r.t. rate of change of N
o Prior output (O1 and O2)
-
Gradient: rate of change of error w.r.t. rate of change in net input to the neuron
o For output neurons: slope of the transfer function × error
o For hidden neurons: a bit complicated! The error is fed back in terms of the gradients of successive neurons:
  slope of the transfer function × [Σ (gradient of next neuron × weight connecting the neuron to the next neuron)]
Why summation? Share the responsibility!!
-
[Figure: a small network whose two sigmoid output neurons both output 0.66; the expected outputs are 1 and 0]

Expected 1: Error = 1 - 0.66 = 0.34; G1 = 0.66(1 - 0.66)(0.34) = 0.0763: increase the weights less
Expected 0: Error = 0 - 0.66 = -0.66; G1 = 0.66(1 - 0.66)(-0.66) = -0.148: reduce the weights more
-
Ways to improve performance:
Changing the number of layers and the number of neurons in each layer.
Changing the learning rate.
Training for longer times.
Type of pre-processing and post-processing.