
NEURAL NETWORKS (ELEC 5240 and ELEC 6240)

Single Neuron Training

Bogdan M. Wilamowski

Area with 4 partitions: input-output mapping

Assuming a linear activation function for the output neuron, the result is a relatively complex nonlinear mapping.

Question: how to design a system for an arbitrary nonlinear mapping?

How to design? What is given?

(a) Mathematical function
    No need for design - just use a microcomputer for the calculations.

(b) Experimental data
    - Find an analytical function describing the process ???
    - Use the data to train neural networks.

Hamming code example

Let us consider binary signals and weights such as

    x = [+1 -1 -1 +1 -1 +1 +1 -1]

If the weights equal the pattern, w = x,

    w = [+1 -1 -1 +1 -1 +1 +1 -1]

then

    net = Σ_{i=1}^{n} w_i x_i = 8

This is the maximum value net can have; for any other weight combination net would be smaller.

Hamming code example

For the same pattern

    x = [+1 -1 -1 +1 -1 +1 +1 -1]

and slightly different weights

    w = [+1 +1 -1 +1 -1 -1 +1 -1]

    net = Σ_{i=1}^{n} w_i x_i = 4

In general

    net = Σ_{i=1}^{n} w_i x_i = n - 2·HD

where HD is the Hamming distance between the weight vector and the pattern.
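A minimal MATLAB sketch (not part of the original slides) that verifies this relation for the two weight vectors above; the variable names x, w1, w2 are illustrative.

    % Bipolar pattern and the two weight vectors from the example
    x  = [ 1 -1 -1  1 -1  1  1 -1];
    w1 = [ 1 -1 -1  1 -1  1  1 -1];   % identical to x
    w2 = [ 1  1 -1  1 -1 -1  1 -1];   % differs from x in 2 positions (HD = 2)
    n  = length(x);

    net1 = w1*x';                     % expected: n = 8
    net2 = w2*x';                     % expected: n - 2*HD = 8 - 4 = 4
    HD2  = sum(w2 ~= x);              % Hamming distance between w2 and x
    fprintf('net1 = %d, net2 = %d, n - 2*HD = %d\n', net1, net2, n - 2*HD2);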

Unsupervised learning rules for single neuron

General form:    Δw_i = c x_i ,  where c is the learning constant.

Hebb rule:       Δw_i = c o x_i

Pattern normalization is required.

Supervised learning rules for single neuron

    correlation rule (supervised):   Δw = c d x

    perceptron fixed rule:           Δw = c (d - o) x

    perceptron adjustable rule:      as above, but the learning constant is scaled by
                                     net / (x^T x),  where net = x^T w and x^T x = ||x||²

    LMS (Widrow-Hoff) rule:          Δw = c (d - net) x

    delta rule:                      Δw = c (d - o) f' x

    pseudoinverse rule (the same result as LMS):   w = (X^T X)^{-1} X^T d
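A minimal MATLAB sketch (not in the original slides) showing one update step under several of these rules for a single augmented pattern; the names x, d, w, c are illustrative, and the soft activation is assumed to be the tanh used later in the deck.

    % One training pattern (augmented with bias input) and its desired output
    x = [1 2 1];  d = -1;  w = [1 3 -3];  c = 0.3;

    net    = w*x';               % weighted sum
    o_hard = sign(net);          % hard bipolar output
    o_soft = tanh(0.5*net);      % soft bipolar output
    fprm   = 0.5*(1 - o_soft^2); % derivative of the soft activation

    dw_correlation = c*d*x;                 % correlation rule
    dw_perceptron  = c*(d - o_hard)*x;      % perceptron fixed rule
    dw_LMS         = c*(d - net)*x;         % LMS (Widrow-Hoff) rule
    dw_delta       = c*(d - o_soft)*fprm*x; % delta rule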

Training Neurons

Perceptron learning rule:   Δw_i = c δ x_i ,   where  δ = d - o = d - sign(net)

Assuming bipolar neurons with output = ±1, the error δ is 0 or ±2, so each nonzero correction is Δw_i = ±2 c x_i.

Simple example of training one neuron

Two training patterns:

    (1, 2)  =>  -1
    (2, 1)  =>  +1

Initial weights: [1  3  -3]

[Figure: neuron with inputs x, y and bias, weights 1, 3, -3, and the initial decision line x + 3y - 3 = 0]

The initial setting gives a wrong answer: both patterns fall on the same side of the line, so one of them is misclassified.

Simple example of training one neuron

                   x1   x2   bias   Desired output
    Pattern 1:      1    2    +1        -1
    Pattern 2:      2    1    +1        +1

    Weights: [1  3  -3]

Actual output:  net = Σ_{i=1}^{n} w_i x_i

    for pattern 1:  net = 1·1 + 3·2 - 3·1 = 4   =>  o = +1
    for pattern 2:  net = 1·2 + 3·1 - 3·1 = 2   =>  o = +1

Simple example of training one neuron

Applying the first pattern the first time, assuming learning constant α = 0.3:

    weights:   w = [1  3  -3]
    pattern 1: x = [1  2  1],  desired d = -1

    net = Σ_{i=1}^{n} w_i x_i = 1·1 + 3·2 - 3·1 = 4     =>  o = +1

    Δw = α (d - o) x = 0.3·(-1 - 1)·[1 2 1] = [-0.6  -1.2  -0.6]

    new weights:  w = [0.4  1.8  -3.6]
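A minimal MATLAB sketch (not in the original slides) that reproduces this first update step; it should print the same weight vector [0.4 1.8 -3.6].

    % Reproduce the first perceptron update from the worked example
    w = [1 3 -3];          % initial weights (including the bias weight)
    x = [1 2 1];  d = -1;  % pattern 1 with augmented bias input and desired output
    alpha = 0.3;           % learning constant

    net = w*x';                    % 1*1 + 3*2 - 3*1 = 4
    o   = sign(net);               % +1 (wrong, desired -1)
    w   = w + alpha*(d - o)*x;     % expected result: [0.4 1.8 -3.6]
    disp(w)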

Simple example of training one neuron

After applying the first pattern the first time:

    w = [0.4  1.8  -3.6]

[Figure: patterns (1,2) => -1 and (2,1) => +1 with the decision line 0.4x + 1.8y - 3.6 = 0]

Simple example of training one neuron

Applying the second pattern the first time:

    weights:   w = [0.4  1.8  -3.6]
    pattern 2: x = [2  1  1],  desired d = +1

    net = 0.4·2 + 1.8·1 - 3.6·1 = -1     =>  o = -1

    Δw = α (d - o) x = 0.3·(1 - (-1))·[2 1 1] = [1.2  0.6  0.6]

    new weights:  w = [1.6  2.4  -3.0]

Simple example of training one neuron

After applying the second pattern the first time:

    w = [1.6  2.4  -3.0]

[Figure: patterns with the decision line 1.6x + 2.4y - 3 = 0]

Simple example of training one neuron

Applying the first pattern the second time:

    weights:   w = [1.6  2.4  -3.0]
    pattern 1: x = [1  2  1],  desired d = -1

    net = 1.6·1 + 2.4·2 - 3·1 = 3.4     =>  o = +1

    Δw = 0.3·(-1 - 1)·[1 2 1] = [-0.6  -1.2  -0.6]

    new weights:  w = [1  1.2  -3.6]

Simple example of training one neuron

After applying the first pattern the second time:

    w = [1  1.2  -3.6]

[Figure: patterns with the decision line x + 1.2y - 3.6 = 0]

Simple example of training one neuron

Applying the second pattern the second time:

    weights:   w = [1  1.2  -3.6]
    pattern 2: x = [2  1  1],  desired d = +1

    net = 1·2 + 1.2·1 - 3.6·1 = -0.4     =>  o = -1

    Δw = 0.3·(1 - (-1))·[2 1 1] = [1.2  0.6  0.6]

    new weights:  w = [2.2  1.8  -3.0]

Simple example of training one neuron

After applying the second pattern the second time:

    w = [2.2  1.8  -3.0]

[Figure: patterns with the decision line 2.2x + 1.8y - 3 = 0]

Simple example of training one neuron

Applying the first pattern the third time:

    weights:   w = [2.2  1.8  -3.0]
    pattern 1: x = [1  2  1],  desired d = -1

    net = 2.2·1 + 1.8·2 - 3·1 = 2.8     =>  o = +1

    Δw = 0.3·(-1 - 1)·[1 2 1] = [-0.6  -1.2  -0.6]

    new weights:  w = [1.6  0.6  -3.6]

Simple example of training one neuron

After applying the first pattern the third time:

    w = [1.6  0.6  -3.6]

[Figure: patterns with the decision line 1.6x + 0.6y - 3.6 = 0]

Simple example of training one neuron

Applying the second pattern the third time:

    weights:   w = [1.6  0.6  -3.6]
    pattern 2: x = [2  1  1],  desired d = +1

    net = 1.6·2 + 0.6·1 - 3.6·1 = 0.2     =>  o = +1

    Δw = 0.3·(1 - 1)·[2 1 1] = [0  0  0]

    weights unchanged:  w = [1.6  0.6  -3.6]

Simple example of training one neuron

After applying the second pattern the third time the weights are unchanged:

    w = [1.6  0.6  -3.6]

[Figure: patterns with the decision line 1.6x + 0.6y - 3.6 = 0]

Simple example of training one neuron

Applying the first pattern the 4th time:

    weights:   w = [1.6  0.6  -3.6]
    pattern 1: x = [1  2  1],  desired d = -1

    net = 1.6·1 + 0.6·2 - 3.6·1 = -0.8     =>  o = -1

    Δw = 0.3·(-1 - (-1))·[1 2 1] = [0  0  0]

    weights unchanged:  w = [1.6  0.6  -3.6]

Both patterns are now classified correctly, so the weights no longer change.

Supervised learning rules for single neuron

    correlation rule (supervised):   Δw = c d x

    perceptron fixed rule:           Δw = c (d - o) x

    perceptron adjustable rule:      as above, but the learning constant is scaled by
                                     net / (x^T x),  where net = x^T w and x^T x = ||x||²

    LMS (Widrow-Hoff) rule:          Δw = c (d - net) x

    delta rule:                      Δw = c (d - o) f' x

    pseudoinverse rule (the same result as LMS):   w = (X^T X)^{-1} X^T d

Training one neuron using the perceptron rule

                   x1   x2   bias   Desired output
    Pattern 1:      1    2    +1        -1
    Pattern 2:      2    1    +1        +1

    Initial weights:   [1  3  -3]
    Learning constant: α = 0.3

    Δw_i = α (d - sign(net)) x_i ,   i.e.  Δw_i = ±2 α x_i  (or 0 when the pattern is already correct)

Training one neuron using the perceptron rule

Final weights:  w = [1.6  0.6  -3.6]

[Figure: patterns (1,2) => -1 and (2,1) => +1 with the final decision line 1.6x + 0.6y - 3.6 = 0]

Soft activation functions

Hard activation functions:

    bipolar:   o = f(net) = sign(net) = +1 if net > 0,  -1 if net < 0

    unipolar:  o = f(net) = 0.5·(sign(net) + 1) = 1 if net > 0,  0 if net < 0

Soft activation functions:

    bipolar:   o = f(net) = tanh(0.5·net) = 2 / (1 + exp(-net)) - 1

    unipolar:  o = f(net) = 1 / (1 + exp(-net))

Derivatives expressed through the output:

    unipolar:  f' = o (1 - o)

    bipolar:   f' = 0.5 (1 - o²)
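A minimal MATLAB sketch (not in the original slides) implementing these soft activations and their derivatives as anonymous functions; the gain k is an assumed extra parameter matching the programs that follow.

    % Soft bipolar (tanh) and unipolar (logistic) activations with gain k
    k = 1;
    f_bip  = @(net) tanh(0.5*k*net);            % bipolar: output in (-1, 1)
    fp_bip = @(o)   0.5*k*(1 - o.^2);           % its derivative, written via the output o
    f_uni  = @(net) 1 ./ (1 + exp(-k*net));     % unipolar: output in (0, 1)
    fp_uni = @(o)   k*o.*(1 - o);               % its derivative, written via the output o

    net = -5:0.1:5;
    plot(net, f_bip(net), net, f_uni(net));     % compare the two soft activations
    legend('bipolar tanh', 'unipolar logistic');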

Program in MATLAB (with graphics)

% single neuron perceptron training with soft activation function
format compact;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:5,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=sign(k*0.5*net(p));
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er')
  tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite,
figure(2); clf; semilogy(tter)

MATLAB training results

    c=0.3   iter=4    error=0
    c=0.1   iter=9    error=0
    c=0.01  iter=66   error=0
    c=1     iter=4    error=0

Program in MATLAB (perceptron - hard)

% single neuron perceptron training with hard activation function
format compact;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:5,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=sign(net(p));
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er')
  tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite,
figure(2); clf; semilogy(tter)

MATLAB training results (perceptron - hard)

    c=0.3   iter=4    error=0
    c=0.1   iter=9    error=0
    c=0.01  iter=66   error=0
    c=1     iter=4    error=0

Program in MATLAB (perceptron - soft), c=0.3, k=1

% single neuron perceptron training with soft activation function
format compact; clear all;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=0.3; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:20,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(k*0.5*net(p));   % hyperbolic tangent function
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.0001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: evolution of the decision line in the X-Y input plane, and error vs. iterations on a log scale]

Program in MATLAB (perceptron - soft), c=3, k=0.3

% single neuron perceptron training with soft activation function
format compact; clear all;
ip=[1 2; 2 1]; dp=[-1,1]'; ww=[1 3 -3]; c=3; k=0.3;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:20,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(k*0.5*net(p));   % hyperbolic tangent function
    er(p)=dp(p)-op(p); ww=ww+c*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.0001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: evolution of the decision line in the X-Y input plane, and error vs. iterations on a log scale]

LMS learning rule

The gradient of the total error with respect to w_i is

    dTE/dw_i = -2 Σ_{p=1}^{np} (d_p - o_p) f' x_ip

In the LMS rule (Widrow-Hoff, 1962) it was assumed that f' = 1
(they worked with hard-threshold neurons, so f' was not defined).

Therefore:

    dTE/dw_i = -2 Σ_{p=1}^{np} (d_p - o_p) x_ip

which corresponds to minimizing

    TE = Σ_{p=1}^{np} (d_p - net_p)²
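Since with f' = 1 the error TE = Σ (d_p - net_p)² is quadratic in the weights, its minimizer can also be found in one step with the pseudoinverse rule listed earlier. A minimal MATLAB sketch (not in the original slides) using the two-pattern example:

    % Pseudoinverse (batch least-squares) solution for the two-pattern example
    X = [1 2 1;              % pattern 1 with augmented bias input
         2 1 1];             % pattern 2 with augmented bias input
    d = [-1; 1];             % desired outputs

    w   = pinv(X)*d;         % minimum-norm least-squares solution
    net = X*w;               % here net equals d exactly, since the patterns are linearly independent
    disp(w'); disp(net');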

Program in MATLAB (LMS)

% single neuron LMS training with soft activation function
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=0.1; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:100,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(0.5*k*net(p));   % hyperbolic tangent function
    er(p)=dp(p)-op(p); ww=ww+c*(dp(p)-net(p))*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.0001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: evolution of the decision line in the X-Y input plane, and error vs. iterations on a log scale]

Delta learning rule

Errors:

    Err_1  = (d_1 - o_1)²
    Err_2  = (d_2 - o_2)²
    ...
    Err_np = (d_np - o_np)²

Total error:

    TE = Σ_{p=1}^{np} (d_p - o_p)²

DELTA learning rule 1

    TE = Σ_{p=1}^{np} (d_p - o_p)² ,    o_p = f(w_1 x_1p + w_2 x_2p + ... + w_n x_np)

The gradient of TE along w_i:

    dTE/dw_i = -2 Σ_{p=1}^{np} (d_p - o_p) · do_p/dw_i

    do_p/dw_i = (do_p/dnet_p) · (dnet_p/dw_i) = f' x_ip

Delta learning rule 2

    dTE/dw_i = -2 Σ_{p=1}^{np} (d_p - o_p) · do_p/dw_i

    do_p/dw_i = (do_p/dnet_p) · (dnet_p/dw_i) = f' x_ip

    dTE/dw_i = -2 Σ_{p=1}^{np} (d_p - o_p) f' x_ip

Weight update:

    Δw_i = 2 c Σ_{p=1}^{np} (d_p - o_p) f' x_ip
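A minimal MATLAB sketch (not in the original slides) that checks this gradient formula numerically on the two-pattern example; the gain k and the tanh activation are the same assumptions used in the programs below.

    % Numerical check of dTE/dw_i = -2*sum((d-o).*f') * x_i for the tanh neuron
    X = [1 2 1; 2 1 1];  d = [-1; 1];  w = [1 3 -3];  k = 1;

    o  = tanh(0.5*k*(X*w'));          % neuron outputs for all patterns
    fp = 0.5*k*(1 - o.^2);            % derivative of tanh(0.5*k*net)
    grad_formula = -2*((d - o).*fp)'*X;

    TE = @(w) sum((d - tanh(0.5*k*(X*w'))).^2);   % total error as a function of the weights
    eps = 1e-6; grad_numeric = zeros(1,3);
    for i = 1:3
      wp = w; wp(i) = wp(i) + eps;
      wm = w; wm(i) = wm(i) - eps;
      grad_numeric(i) = (TE(wp) - TE(wm)) / (2*eps);   % central difference
    end
    disp([grad_formula; grad_numeric])   % the two rows should match closely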

Program in MATLAB (Delta)

% single neuron delta training with soft activation function
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=2; k=0.5;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:250,
  for p=1:np,
    j=j+1; if j>1, plot(x,y,'g'); end;
    net(p)=ip(p,:)*ww';
    op(p)=tanh(0.5*k*net(p));   % hyperbolic tangent function
    fp(p)=k*(1-op(p)*op(p));
    er(p)=dp(p)-op(p); ww=ww+c*fp(p)*er(p)*ip(p,:);
    x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
    x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
    plot(x,y,'r');
    % pause;
  end
  ter=sqrt(er*er'), tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite
figure(2); clf;
semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: evolution of the decision line in the X-Y input plane, and error vs. iterations on a log scale]

Program in MATLAB (Delta) - Batch training

% single neuron delta training with soft activation function
% BATCH training
format compact; clear all;
ip=[1 2; 2 1], dp=[-1,1]', ww=[1 3 -3], c=2; k=0.5;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
plot(ip(1,1),ip(1,2),'ro'); plot(ip(2,1),ip(2,2),'bx');
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
a=axis; a=[0 4 0 4]; axis(a); j=0;
for ite=1:125,
  if ite>1, plot(x,y,'g'); end;
  net=ip*ww'; op=tanh(0.5.*k.*net);
  fp=k.*(1-op.*op); er=dp-op;
  dw=(c*er.*fp)'*ip; ww=ww+dw;
  x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
  x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
  plot(x,y,'r');
  % pause;
  ter=er'*er, tter(ite)=ter;
  if ter <0.001, break; end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: evolution of the decision line in the X-Y input plane, and error vs. iterations on a log scale]

Delta learning for multiple patterns
(c=3, k=1, derr=0.01, ite=576)

% single neuron delta training with soft activation function
% BATCH training with several patterns
format compact; clear all;
ip=[-1,-1; 2,2; 0,0; 1,1; -0.5,0; 2,1; 0,1; 3,1; 1,1.5; 2.5,1.5]
[np,ni]=size(ip); ip(:,ni+1)=ones(np,1)   % augmenting input
dp=[-1, 1,-1, 1, -1, 1, -1, 1, -1, 1]', ww=[-1 3 -3], c=1.8; k=1;
figure(1); clf; hold on; xlabel('X input'); ylabel('Y input');
a=axis; a=[-2 4 -2 4]; axis(a); j=0;
for ite=1:10000,
  if ite>1, plot(x,y,'g'); end;
  net=ip*ww'; op=tanh(0.5.*k.*net);
  fp=k.*(1-op.*op); er=dp-op;
  dw=(c*er.*fp)'*ip; ww=ww+dw;
  x(1)=-1; y(1)=-(ww(1)*x(1)+ww(3))./ww(2);
  x(2)=4;  y(2)=-(ww(1)*x(2)+ww(3))./ww(2);
  plot(x,y,'r');
  % pause;
  ter=er'*er, tter(ite)=ter;
  if ter <0.01, break; end;
end;
for p=1:np,
  if dp(p)>0, plot(ip(p,1),ip(p,2),'ro');
  else plot(ip(p,1),ip(p,2),'bx'); end;
end;
hold off; ite
figure(2); clf; semilogy(tter); xlabel('iterations'); ylabel('error');

[Figures: the ten training patterns with the final decision line, and error vs. iterations on a log scale]
