Post on 13-Jul-2022
3/19/2012
1
Lecture 2 - Neural Network
(Cont’d)
Neural Network Models
Types of NN
Supervised Training
• Backpropagation Network
• RBF Network
• LVQ Network
• Elman Network
Unsupervised Training
• Hamming Network
• KSOMS
• SOFM
• LVQ
• Hopfield Network
• ART
Radial Basis Function (RBF) Network
• RBF network → a 2-layer feedforward network, except that the 1st
layer does not use the weighted sum of inputs and the sigmoid
transfer function like the other multilayer networks.
• The output of the first layer neurons each represent a “radial
function” determined by the distance between the network input
and the ‘center’ of the basis function.
• As input moves away from center, the neuron outputs drops
rapidly to zero (distance between w and p decreases, the output
increases).• radial basis neuron acts as a detector which
produces 1 whenever the input p is
identical to its weight
• The 2nd layer: uses linear transfer function.
• RBF networks → require more neurons but
train faster than standard feedforward
backpropagation networks
• The output of the first layer for a radial basis network net can be obtained
with the following code:
a{1} = radbas(netprod(dist(net.IW{1,1},p),net.b{1}))
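What that line computes can be sketched in plain NumPy (a minimal sketch of the idea, not the toolbox implementation; the helper names are mine): each first-layer neuron outputs radbas(n) = exp(-n^2) applied to the distance between its weight row and the input, scaled by its bias.

```python
import numpy as np

def radbas(n):
    # MATLAB's radbas transfer function: exp(-n^2)
    return np.exp(-n ** 2)

def rbf_layer_output(W, b, p):
    # W: (S1, R) matrix whose rows are the basis-function centers
    # b: (S1,) vector of biases; p: (R,) input vector
    dist = np.linalg.norm(W - p, axis=1)   # Euclidean distance to each center
    return radbas(dist * b)                # element-wise net product, then radbas

W = np.array([[0.0, 0.0], [1.0, 1.0]])
b = np.array([1.0, 1.0])
print(rbf_layer_output(W, b, np.array([0.0, 0.0])))  # first neuron fires 1, second ~ exp(-2)
```

Note how the neuron whose center equals the input produces exactly 1, matching the "detector" behavior described above.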
• The function newrbe can produce a network with zero error on the training vectors.
It is called in the following way:
net = newrbe(P,T,SPREAD)
• It takes matrices of input vectors P and target vectors T, and a spread
constant SPREAD for the radial basis layer, and returns a network with weights and biases
such that the outputs are exactly T when the inputs are P.
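The exact design can be sketched in NumPy (my own simplification, not newrbe itself): place one radbas neuron on every training vector, then solve the linear second layer so the training outputs match T exactly. The bias 0.8326/SPREAD is the toolbox convention that makes a neuron's output fall to 0.5 at a distance of SPREAD from its center.

```python
import numpy as np

def design_exact_rbf(P, T, spread):
    # P: (Q, R) training inputs (one row per vector), T: (Q,) targets.
    # One radial basis neuron is centered on each training vector.
    b = 0.8326 / spread                           # output = 0.5 at distance SPREAD
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    Phi = np.exp(-(D * b) ** 2)                   # Q x Q first-layer outputs
    w2 = np.linalg.solve(Phi, T)                  # linear 2nd layer: Phi @ w2 = T
    return w2, b

def rbf_predict(P_train, w2, b, p):
    # Evaluate the designed network at a new input p
    d = np.linalg.norm(P_train - p, axis=1)
    return np.exp(-(d * b) ** 2) @ w2
```

Because the Gaussian kernel matrix Phi is nonsingular for distinct training vectors, the solve step gives zero error on the training set, mirroring newrbe's "exact" behavior.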
Radial Basis Function (RBF) Network
• function newrb iteratively creates a radial basis network one neuron
at a time. Neurons are added to the network until the sum-squared
error falls beneath an error goal or a maximum number of neurons
has been reached. The call for this function is:
net = newrb(P,T,GOAL,SPREAD)
• The error of the new network is checked, and if low enough newrb is
finished. Otherwise the next neuron is added. This procedure is
repeated until the error goal is met, or the maximum number of
neurons is reached.
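The iterative idea can be sketched as follows (a greedy simplification under my own assumptions, not newrb itself, which picks the neuron that most reduces the error; here the worst-fit training vector simply becomes the next center):

```python
import numpy as np

def phi(P, centers, b):
    # First-layer output matrix: one column per radial basis neuron
    D = np.linalg.norm(P[:, None, :] - np.asarray(centers)[None, :, :], axis=2)
    return np.exp(-(D * b) ** 2)

def design_incremental_rbf(P, T, goal, spread, max_neurons):
    b = 0.8326 / spread
    chosen, w2 = [], None
    err = T.copy()
    for _ in range(max_neurons):
        # add the not-yet-used training vector with the largest error as a center
        i = max((j for j in range(len(P)) if j not in chosen),
                key=lambda j: abs(err[j]))
        chosen.append(i)
        A = phi(P, P[chosen], b)
        w2, *_ = np.linalg.lstsq(A, T, rcond=None)   # re-solve output weights
        err = T - A @ w2
        if np.sum(err ** 2) < goal:                  # stop once SSE beats the goal
            break
    return P[chosen], w2, b
```

Each pass adds one neuron and re-fits the linear layer, so the sum-squared error falls monotonically until the goal or the neuron budget is hit.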
• Radial basis networks, even when designed efficiently with newrbe,
tend to have many times more neurons than a comparable feed-
forward network with tansig or logsig neurons in the hidden layer.
• choose a spread constant larger than the distance between adjacent
input vectors, but smaller than the distance across the whole input
space.
Radial Basis Function (RBF) Network
demorb1
• The first design method, newrbe, finds an exact solution. The
function newrbe creates radial basis networks with as many radial
basis neurons as there are input vectors in the training data.
• The second method, newrb, finds the smallest network that can
solve the problem within a given error goal. Typically, far fewer
neurons are required by newrb than are returned by newrbe.
• Generalized regression neural networks (GRNN) are often used for
function approximation.
• Probabilistic neural networks can be used for classification
problems.
Radial Basis Function (RBF) Network
Hamming Network
• Hamming network → an example of a competitive network
(unsupervised learning):
(i) compute the distance between the prototype stored
pattern and the input pattern
(ii) perform a competition to determine which neuron
represents the prototype pattern closest to the input
(In the unsupervised-learning Self-Organizing Map (SOM), the
prototype patterns are adjusted as new inputs are presented to the
network, so that the network can cluster the inputs into different
categories)
Hamming Network
• Hamming network → 1st layer: feedforward layer; 2nd
layer: recurrent layer (known as the competitive layer)
• Is called Hamming network because the neuron in the
feedforward layer with the largest output will correspond to
the prototype pattern that is closest in Hamming distance to
the input pattern
• In the competitive layer, the neurons compete with each
other to determine a winner. After the competition, only 1
neuron has a nonzero output.
• Transfer function in the competitive layer: poslin (linear for
positive inputs and zero for negative inputs)
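The two layers can be sketched in NumPy (my own minimal version; the bias of R and the inhibition constant are standard choices for ±1 patterns, with eps below 1/(S−1) for S competing neurons):

```python
import numpy as np

def hamming_feedforward(prototypes, p):
    # Feedforward layer: weight rows are the stored +/-1 prototype patterns
    # and the bias equals R (the input length). For +/-1 vectors the output
    # is twice the number of matching elements, so the largest output marks
    # the prototype closest to the input in Hamming distance.
    R = len(p)
    return prototypes @ p + R

def hamming_compete(a1, eps=0.5, steps=100):
    # Recurrent (competitive) layer: each neuron inhibits the others through
    # the poslin transfer function until only one output stays nonzero.
    a = a1.astype(float)
    for _ in range(steps):
        a_new = np.maximum(0.0, a - eps * (a.sum() - a))  # poslin transfer
        if np.count_nonzero(a_new) <= 1:
            return a_new
        a = a_new
    return a  # exact ties can survive all iterations
```

For example, with prototypes [1,1,1] and [-1,-1,-1], the input [1,1,-1] drives the first neuron hardest and the competition leaves only that neuron active.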
Kohonen Self-Organizing Maps (SOMs)
• Self-organizing in network: networks can learn to detect
regularities and correlations in their input and adapt their
future responses to that input accordingly
• Self-organizing maps learn to recognize groups of similar
input vectors in such a way that neurons physically close
together in the neuron layer respond to similar input vectors.
• The weights of the winning neuron (a row of the input weight
matrix) are adjusted with the Kohonen learning rule.
Supposing that the ith neuron wins, the ith row of the input
weight matrix is adjusted as shown below:
IW_i(q) = IW_i(q-1) + α( p(q) - IW_i(q-1) ), where α is the learning rate.
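The Kohonen rule simply pulls the winning neuron's weight row a fraction α of the way toward the input. A NumPy sketch (the function name is mine):

```python
import numpy as np

def kohonen_update(W, p, alpha):
    # Winner = the neuron whose weight row is closest to the input p.
    i = np.argmin(np.linalg.norm(W - p, axis=1))
    W = W.copy()
    # Kohonen rule: move only the winner's row toward p
    W[i] += alpha * (p - W[i])
    return W, i
```

Repeated over many presentations, each weight row drifts toward the mean of the inputs it wins, which is what clusters the data.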
p = [.1 .8 .1 .9; .2 .9 .1 .8]
net = newc([0 1; 0 1],2); % to create competitive network
net.trainParam.epochs = 500
net = train(net,p);
a = sim(net,p)
ac = vec2ind(a)
ac = 1 2 1 2
• We see that the network has been trained to classify
the input vectors into two groups, those near the origin,
class 1, and those near (1,1), class 2.
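The same experiment can be sketched without the toolbox (a NumPy sketch; I seed each neuron on a training vector to sidestep dead units, which newc avoids differently, via midpoint initialization and biases):

```python
import numpy as np

# the four training vectors of p above, written as rows
P = np.array([[0.1, 0.2], [0.8, 0.9], [0.1, 0.1], [0.9, 0.8]])
W = P[[0, 1]].copy()          # two competitive neurons, seeded on data points
alpha = 0.1
for epoch in range(500):
    for x in P:
        i = np.argmin(np.linalg.norm(W - x, axis=1))  # winner takes all
        W[i] += alpha * (x - W[i])                    # Kohonen rule, winner only
classes = [int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in P]
print(classes)  # vectors near the origin and vectors near (1,1) get different labels
```

As in the MATLAB demo, the two neurons settle near the two cluster centroids, so the four inputs split into the origin group and the (1,1) group.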
KSOM in MATLAB
democ1
• Self-organizing feature maps (SOFM) learn to classify input
vectors according to how they are grouped in the input space.
They differ from competitive layers in that neighboring neurons
in the self-organizing map learn to recognize neighboring
sections of the input space.
• Thus self-organizing maps learn both the distribution (as do
competitive layers) and topology of the input vectors they are
trained on.
• Functions gridtop, hextop or randtop can arrange the neurons in
a grid, hexagonal, or random topology.
• Instead of updating only the winning neuron (competitive
network), all neurons within a certain neighborhood of the
winning neuron are updated using the Kohonen rule.
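The neighborhood update can be sketched in NumPy (an illustrative sketch with my own names; distance to the winner is measured on the neuron grid, not in input space):

```python
import numpy as np

def som_update(W, grid_pos, p, alpha, radius):
    # Winner = neuron whose weight row is closest to the input p.
    winner = np.argmin(np.linalg.norm(W - p, axis=1))
    # Neighborhood = all neurons within `radius` of the winner ON THE GRID.
    grid_d = np.linalg.norm(grid_pos - grid_pos[winner], axis=1)
    hood = grid_d <= radius
    W = W.copy()
    # Kohonen rule applied to every neuron in the neighborhood
    W[hood] += alpha * (p - W[hood])
    return W, winner
```

Because grid neighbors are dragged along with the winner, neurons that are adjacent on the map end up responding to adjacent regions of the input space, which is exactly the topology-preserving property described above.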
SOFM
• Different topologies for the original neuron locations can be
specified with the functions gridtop, hextop, or randtop.
• An 8x10 set of neurons in a hextop topology can be created and
plotted with the code shown below:
pos = hextop(8,10);
plotsom(pos)
SOFM
net = newsom([0 2; 0 1] , [2 3]);
P = [.1 .3 1.2 1.1 1.8 1.7 .1 .3 1.2 1.1 1.8 1.7;...
0.2 0.1 0.3 0.1 0.3 0.2 1.8 1.8 1.9 1.9 1.7 1.8]
We can plot all of this with
plot(P(1,:),P(2,:),'.g','markersize',20)
hold on
plotsom(net.iw{1,1},net.layers{1}.distances)
hold off
net.trainParam.epochs = 1000;
net = train(net,P);
SOFM in MATLAB
• Competitive network learns to categorize the input
vectors presented to it
• Self-organizing map learns to categorize input vectors. It
also learns the distribution of input vectors.
• Self-organizing maps also learn the topology of their input
vectors. Neurons next to each other in the network learn
to respond to similar vectors. The layer of neurons can be
imagined to be a rubber net which is stretched over the
regions in the input space where input vectors occur.
• Self-organizing maps allow neurons that are neighbours to
the winning neuron to output values.
Summary for KSOM
Learning Vector Quantization (LVQ)
• LVQ networks classify input vectors into target classes by using a
competitive layer to find subclasses of input vectors, and then
combining them into the target classes.
• LVQ: hybrid network - uses both unsupervised and supervised
learning for classification
• Supervised learning � target is given
LVQ in MATLAB
net = newlvq(PR,S1,PC,LR,LF)
P = [-3 -2 -2 0 0 0 0 +2 +2 +3; 0 +1 -1 +2 +1 -1 -2 +1 -1 0]
and
Tc = [1 1 1 2 2 2 2 1 1 1];
T = ind2vec(Tc)
create a network with four neurons in the first layer and two neurons in the second
layer. The second layer weights will have 60% (6 of the 10 in Tc above) of its
columns with a 1 in the first row, corresponding to class 1, and 40% of its columns
will have a 1 in the second row, corresponding to class 2.
net = newlvq(minmax(P),4,[.6 .4]);
net.trainParam.epochs = 1000;
net.trainParam.lr = 0.05;
net = train(net,P,T);
Y = sim(net,P)
Yc = vec2ind(Y)
Yc =
1 1 1 2 2 2 2 1 1 1
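The supervised half of LVQ is the LVQ1 prototype update, which can be sketched in NumPy (my own minimal version, not the toolbox's learnlv1): the winning prototype moves toward the input when its class label matches the target, and away from it otherwise.

```python
import numpy as np

def lvq1_update(W, labels, p, target, alpha):
    # W: prototype rows; labels: the class assigned to each prototype.
    i = np.argmin(np.linalg.norm(W - p, axis=1))   # competitive winner
    W = W.copy()
    sign = 1.0 if labels[i] == target else -1.0    # attract if correct, repel if not
    W[i] += sign * alpha * (p - W[i])
    return W, i
```

Over many labeled presentations, prototypes of each class migrate into the regions of input space occupied by that class, giving the subclass-then-combine behavior described above.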
Recurrent Network
• Example: Elman (supervised learning) and Hopfield networks (unsupervised)
• Elman networks are two-layer backpropagation networks, with the addition of a
feedback connection from the output of the hidden layer to its input. This
feedback path allows Elman networks to learn to recognize and generate
temporal patterns, as well as spatial patterns.
• Elman networks, by having an internal feedback loop, are capable of learning to
detect and generate temporal patterns. This makes Elman networks useful in
such areas as signal processing and prediction where time plays a dominant role.
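The Elman structure can be written as a single recurrence (an illustrative NumPy sketch of the architecture only, untrained and not the toolbox code; the names are mine): the hidden layer sees the current input plus a delayed copy of its own previous output.

```python
import numpy as np

def elman_step(x, h_prev, Wx, Wh, b1, Wy, b2):
    # Hidden layer input = current x AND the fed-back previous hidden output
    h = np.tanh(Wx @ x + Wh @ h_prev + b1)        # 'tansig' hidden layer
    y = 1.0 / (1.0 + np.exp(-(Wy @ h + b2)))      # 'logsig' output layer
    return h, y

def elman_run(xs, Wx, Wh, b1, Wy, b2):
    # Process a whole sequence, carrying the hidden state forward in time.
    h = np.zeros(Wh.shape[0])
    ys = []
    for x in xs:
        h, y = elman_step(x, h, Wx, Wh, b1, Wy, b2)
        ys.append(y)
    return np.array(ys)
```

It is this carried-forward state h that lets the network respond differently to the same input depending on what came before, i.e. recognize temporal patterns.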
• The Hopfield network is used to store one or more stable target vectors. These
stable vectors can be viewed as memories which the network recalls when
provided with similar vectors which act as a cue to the network memory.
• Hopfield networks can act as error correction or vector categorization
networks. Input vectors are used as the initial conditions to the network, which
recurrently updates until it reaches a stable output vector.
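The store-and-recall behavior can be sketched with the classic discrete Hopfield model (Hebbian outer-product storage and sign-threshold updates; this is an illustrative sketch, not the satlins-based design that newhop actually uses):

```python
import numpy as np

def hopfield_weights(patterns):
    # Hebbian (outer-product) storage rule; zero the diagonal so no neuron
    # feeds back directly onto itself.
    P = np.asarray(patterns, dtype=float)
    W = P.T @ P / P.shape[1]
    np.fill_diagonal(W, 0.0)
    return W

def hopfield_recall(W, state, steps=10):
    # Recurrent sign-threshold updates from the cue vector toward a
    # stored (stable) pattern.
    s = np.asarray(state, dtype=float)
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0   # break exact ties toward +1
    return s
```

A stored pattern is a fixed point of the update, and a cue with one flipped bit is corrected back to it, which is the error-correction behavior described above.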
Elman Network
net = newelm([0 1],[5 1],{'tansig','logsig'});
P = round(rand(1,8))
P = 1 0 1 1 1 0 1 1
and
T = [0 (P(1:end-1)+P(2:end) == 2)]
T = 0 0 0 1 1 0 0 1
Here T is defined to be 0 except when two ones occur in P, in which case T
will be 1.
Pseq = con2seq(P); Tseq = con2seq(T);
net = train(net,Pseq,Tseq);
Y = sim(net,Pseq);
z = seq2con(Y);
Hopfield Network
T = [-1 -1 1; 1 -1 1]'
net = newhop(T);
Ai = T;
[Y,Pf,Af] = sim(net,2,[],Ai);
Y
This gives us
Y =
-1 1
-1 -1
1 1
The network has indeed been designed to be stable at its design points.
Ai = {[-0.9; -0.8; 0.7]}
[Y,Pf,Af] = sim(net,{1 5},{},Ai);
Y{1}
We get
Y =
-1
-1
1
Adaptive Resonance Theory (ART)
• ART � Unsupervised learning; designed to achieve
learning stability while maintaining sensitivity to novel
inputs
• As each input is presented to the network, it is compared with
the prototype vector that most closely matches it.
• If the match is not adequate, a new prototype is
selected.
• In this way, previously learned memories (prototypes) are
not eroded by new learning
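The match-or-create idea can be sketched as follows (an ART-flavored NumPy sketch under my own assumptions, using Euclidean distance as the vigilance test; real ART1 uses binary pattern matching with a normalized vigilance ratio):

```python
import numpy as np

def art_classify(prototypes, p, vigilance):
    # Compare the input with the closest existing prototype; accept the match
    # only if the distance beats the vigilance threshold, otherwise create a
    # brand-new prototype so old memories are never overwritten.
    p = np.asarray(p, dtype=float)
    if prototypes:
        dists = [np.linalg.norm(np.asarray(w) - p) for w in prototypes]
        i = int(np.argmin(dists))
        if dists[i] <= vigilance:
            w = np.asarray(prototypes[i])
            prototypes[i] = 0.5 * (w + p)   # refine the matched prototype
            return i
    prototypes.append(p.copy())             # novel input -> new category
    return len(prototypes) - 1
```

A familiar input refines its own category while a sufficiently novel one opens a new category, which is how stability and plasticity coexist.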
Applications of NN
Appcr1: Character Recognition
S1 = 10; S2 = 26;
net = newff(minmax(P),[S1 S2],{'logsig' 'logsig'},'traingdx');
Training Without Noise
The network is initially trained without noise for a maximum of 5000 epochs
or until the network sum-squared error falls beneath 0.1.
P = alphabet;
T = targets;
net.performFcn = 'sse';
net.trainParam.goal = 0.1;
net.trainParam.show = 20;
net.trainParam.epochs = 5000;
net.trainParam.mc = 0.95;
[net,tr] = train(net,P,T);
Applications of NN
Appcr1: Character Recognition
Training With Noise
netn = net;
netn.trainParam.goal = 0.6;
netn.trainParam.epochs = 300;
T = [targets targets targets targets];
[R,Q] = size(alphabet);
for pass = 1:10
P = [alphabet, alphabet, ...
(alphabet + randn(R,Q)*0.1), ...
(alphabet + randn(R,Q)*0.2)];
[netn,tr] = train(netn,P,T);
end
Applications of NN
Appcr1: Character Recognition
Training Without Noise Again
Once the network has been trained with noise it makes sense to train it without
noise once more to ensure that ideal input vectors are always classified
correctly.
To test the system a letter with noise can be created and presented to the
network.
noisyJ = alphabet(:,10)+randn(35,1) * 0.2;
plotchar(noisyJ);
A2 = sim(net,noisyJ);
A2 = compet(A2);
answer = find(A2 == 1);
plotchar(alphabet(:,answer));
Applications of NN
Applin2: Adaptive Prediction
time1 = 0:0.05:4;
time2 = 4.05:0.024:6;
time = [time1 time2];
T = [sin(time1*4*pi) sin(time2*8*pi)];
Since we will be training the network incrementally, we will change T to a
sequence.
T = con2seq(T);
P = T;
lr = 0.1;
delays = [1 2 3 4 5];
net = newlin(minmax(cat(2,P{:})),1,delays,lr);
[net,a,e] = adapt(net,P,T);