Lecture08
TRANSCRIPT
-
8/11/2019 Lecture08
1/37
Negnevitsky, Pearson Education, 2002
Lecture 8
Artificial neural networks: Unsupervised learning
Introduction
Hebbian learning
Generalised Hebbian learning algorithm
Competitive learning
Self-organising computational map: Kohonen network
Summary
-
Introduction
The main property of a neural network is an ability to learn from its environment, and to improve its performance through learning. So far we have considered supervised or active learning - learning with an external teacher, or a supervisor, who presents a training set to the network. But another type of learning also exists: unsupervised learning.
-
In contrast to supervised learning, unsupervised or self-organised learning does not require an external teacher. During the training session, the neural network receives a number of different input patterns, discovers significant features in these patterns and learns how to classify input data into appropriate categories. Unsupervised learning tends to follow the neuro-biological organisation of the brain.
Unsupervised learning algorithms aim to learn rapidly and can be used in real time.
-
Hebbian learning
In 1949, Donald Hebb proposed one of the key ideas in biological learning, commonly known as Hebb's Law. Hebb's Law states that if neuron i is near enough to excite neuron j and repeatedly participates in its activation, the synaptic connection between these two neurons is strengthened and neuron j becomes more sensitive to stimuli from neuron i.
-
Hebb's Law can be represented in the form of two rules:
1. If two neurons on either side of a connection are activated synchronously, then the weight of that connection is increased.
2. If two neurons on either side of a connection are activated asynchronously, then the weight of that connection is decreased.
Hebb's Law provides the basis for learning without a teacher. Learning here is a local phenomenon occurring without feedback from the environment.
-
Hebbian learning in a neural network
[Figure: a single-layer network in which input signals enter on the left and output signals leave on the right; input neuron i is connected to output neuron j.]
-
Using Hebb's Law we can express the adjustment applied to the weight w_ij at iteration p in the following form:

Δw_ij(p) = F[y_j(p), x_i(p)]

As a special case, we can represent Hebb's Law as follows:

Δw_ij(p) = α y_j(p) x_i(p)

where α is the learning rate parameter. This equation is referred to as the activity product rule.
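In code, the activity product rule is a one-line update per connection. A minimal sketch (the function name and the list-of-lists weight representation are illustrative, not from the lecture):

```python
def hebb_update(w, x, y, alpha=0.1):
    """Activity product rule: delta w_ij = alpha * y_j * x_i."""
    return [[w[i][j] + alpha * y[j] * x[i] for j in range(len(y))]
            for i in range(len(x))]

# One presentation: a weight grows only when both ends are active.
w = [[0.0, 0.0], [0.0, 0.0]]
w = hebb_update(w, x=[1, 0], y=[1, 1])
# w[0][*] grew (input 1 is active); w[1][*] did not (input 2 is inactive).
```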
-
Hebbian learning implies that weights can only increase. To resolve this problem, we might impose a limit on the growth of synaptic weights. It can be done by introducing a non-linear forgetting factor into Hebb's Law:

Δw_ij(p) = α y_j(p) x_i(p) − φ y_j(p) w_ij(p)

where φ is the forgetting factor. The forgetting factor usually falls in the interval between 0 and 1, typically between 0.01 and 0.1, to allow only a little forgetting while limiting the weight growth.
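The effect of the forgetting term is easy to see numerically: with a constantly active pair of neurons the weight no longer grows without bound but settles at α/φ. A sketch (function name and parameter values are ours):

```python
def hebb_with_forgetting(w_ij, x_i, y_j, alpha=0.1, phi=0.05):
    """delta w = alpha*y*x - phi*y*w: growth checked by forgetting."""
    return w_ij + alpha * y_j * x_i - phi * y_j * w_ij

# With x_i = y_j = 1 at every step, the weight converges towards
# alpha / phi = 2.0 instead of increasing forever.
w = 0.0
for _ in range(200):
    w = hebb_with_forgetting(w, 1.0, 1.0)
```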
-
Hebbian learning algorithm
Step 1: Initialisation.
Set initial synaptic weights and thresholds to small random values, say in an interval [0, 1].
Step 2: Activation.
Compute the neuron output at iteration p:

y_j(p) = Σ_{i=1}^{n} x_i(p) w_ij(p) − θ_j

where n is the number of neuron inputs, and θ_j is the threshold value of neuron j.
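Step 2 is just a thresholded weighted sum. A sketch for binary inputs, with the sign activation taken as a hard limiter returning 1 or 0 (the helper name is ours):

```python
def activate(x, w_col, theta):
    """y_j = sign(sum_i x_i * w_ij - theta_j): 1 if net >= 0 else 0."""
    net = sum(xi * wij for xi, wij in zip(x, w_col)) - theta
    return 1 if net >= 0 else 0

# A neuron fires only when an active input reaches it through a
# sufficiently large weight relative to its threshold.
fired = activate([1, 0], [1.0, 1.0], 0.5)      # net = 0.5 -> fires
silent = activate([0, 1], [1.0, 0.0], 0.5)     # net = -0.5 -> silent
```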
-
Step 3: Learning.
Update the weights in the network:

w_ij(p + 1) = w_ij(p) + Δw_ij(p)

where Δw_ij(p) is the weight correction at iteration p. The weight correction is determined by the generalised activity product rule:

Δw_ij(p) = φ y_j(p) [λ x_i(p) − w_ij(p)]

where λ = α/φ.
Step 4: Iteration.
Increase iteration p by one, go back to Step 2.
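Steps 1-4 can be sketched as a short training loop. This is a minimal illustration, assuming binary inputs, a hard-limited activation, and λ = α/φ in the generalised rule; the function name and default parameters are ours:

```python
import random

def hebbian_train(patterns, alpha=0.1, phi=0.02, epochs=50, seed=1):
    """Generalised Hebbian learning, Steps 1-4 of the algorithm above."""
    rng = random.Random(seed)
    n = len(patterns[0])
    # Step 1: initialise weights and thresholds to small random values.
    w = [[rng.random() for _ in range(n)] for _ in range(n)]
    theta = [rng.random() for _ in range(n)]
    lam = alpha / phi
    for _ in range(epochs):                      # Step 4: iterate
        for x in patterns:
            # Step 2: activation (hard-limited to 0/1 for binary inputs).
            y = [1 if sum(x[i] * w[i][j] for i in range(n)) - theta[j] >= 0
                 else 0 for j in range(n)]
            # Step 3: generalised activity product rule.
            for i in range(n):
                for j in range(n):
                    w[i][j] += phi * y[j] * (lam * x[i] - w[i][j])
    return w

# The lecture's five training vectors (as rows).
patterns = [[0, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 0, 0, 1, 0],
            [0, 0, 1, 0, 0], [0, 1, 0, 0, 1]]
W = hebbian_train(patterns)
```

Because of the forgetting term, every weight stays bounded between 0 and λ = α/φ rather than growing indefinitely.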
-
Hebbian learning example
To illustrate Hebbian learning, consider a fully connected feedforward network with a single layer of five computation neurons. Each neuron is represented by a McCulloch and Pitts model with the sign activation function. The network is trained on the following set of input vectors:

X1 = [0 0 0 0 0]^T
X2 = [0 1 0 0 1]^T
X3 = [0 0 0 1 0]^T
X4 = [0 0 1 0 0]^T
X5 = [0 1 0 0 1]^T
-
Initial and final states of the network
[Figure: two five-input, five-output networks. In the initial state the test input (1, 0, 0, 0, 1) is passed straight through, giving the output (1, 0, 0, 0, 1); after training the same input produces the output (0, 1, 0, 0, 1).]
-
Initial and final weight matrices

Initial weight matrix (rows: input layer, columns: output layer):

W(0) = | 1  0  0  0  0 |
       | 0  1  0  0  0 |
       | 0  0  1  0  0 |
       | 0  0  0  1  0 |
       | 0  0  0  0  1 |

Final weight matrix:

W = | 0  0       0       0       0      |
    | 0  2.0204  0       0       2.0204 |
    | 0  0       1.0200  0       0      |
    | 0  0       0       0.9996  0      |
    | 0  2.0204  0       0       2.0204 |
-
A test input vector, or probe, is defined as

X = [1 0 0 0 1]^T

When this probe is presented to the network, we obtain:

Y = sign(W X − θ) = [0 1 0 0 1]^T

where W is the final weight matrix and θ = [0.4940 0.2661 0.0907 0.9478 0.0737]^T is the vector of thresholds.
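The probe computation can be checked numerically. A sketch that reproduces Y = sign(WX − θ) with the final weight matrix and threshold values from the example, taking sign as a 0/1 hard limiter as for binary inputs:

```python
W = [[0, 0,      0,      0,      0],
     [0, 2.0204, 0,      0,      2.0204],
     [0, 0,      1.0200, 0,      0],
     [0, 0,      0,      0.9996, 0],
     [0, 2.0204, 0,      0,      2.0204]]
theta = [0.4940, 0.2661, 0.0907, 0.9478, 0.0737]
X = [1, 0, 0, 0, 1]

# y_j = sign(sum_i x_i * w_ij - theta_j), with sign(net) = 1 for
# net >= 0 and 0 otherwise.
Y = [1 if sum(X[i] * W[i][j] for i in range(5)) - theta[j] >= 0 else 0
     for j in range(5)]
# Y == [0, 1, 0, 0, 1], as on the slide.
```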
-
Competitive learning
In competitive learning, neurons compete among themselves to be activated.
While in Hebbian learning, several output neurons can be activated simultaneously, in competitive learning, only a single output neuron is active at any time.
The output neuron that wins the competition is called the winner-takes-all neuron.
-
The basic idea of competitive learning was introduced in the early 1970s.
In the late 1980s, Teuvo Kohonen introduced a special class of artificial neural networks called self-organising feature maps. These maps are based on competitive learning.
-
What is a self-organising feature map?
Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and hundreds of billions of synapses. The cortex includes areas that are responsible for different human activities (motor, visual, auditory, somatosensory, etc.), and associated with different sensory inputs. We can say that each sensory input is mapped into a corresponding area of the cerebral cortex. The cortex is a self-organising computational map in the human brain.
-
Feature-mapping Kohonen model
[Figure: (a) an input layer fully connected to a Kohonen layer; (b) the same network, in which the input pattern (1, 0) activates one region of the Kohonen layer and the pattern (0, 1) activates another.]
-
The Kohonen network
The Kohonen model provides a topological mapping. It places a fixed number of input patterns from the input layer into a higher-dimensional output or Kohonen layer.
Training in the Kohonen network begins with the winner's neighbourhood of a fairly large size. Then, as training proceeds, the neighbourhood size gradually decreases.
-
Architecture of the Kohonen Network
[Figure: two input neurons x1 and x2, fully connected to an output layer of three neurons y1, y2 and y3; input signals enter on the left, output signals leave on the right.]
-
The lateral connections are used to create a competition between neurons. The neuron with the largest activation level among all neurons in the output layer becomes the winner. This neuron is the only neuron that produces an output signal. The activity of all other neurons is suppressed in the competition.
The lateral feedback connections produce excitatory or inhibitory effects, depending on the distance from the winning neuron. This is achieved by the use of a Mexican hat function which describes synaptic weights between neurons in the Kohonen layer.
-
The Mexican hat function of lateral connection
[Figure: connection strength plotted against distance from the winning neuron - a strong excitatory effect near zero distance, flanked by inhibitory effects at larger distances.]
-
In the Kohonen network, a neuron learns by shifting its weights from inactive connections to active ones. Only the winning neuron and its neighbourhood are allowed to learn. If a neuron does not respond to a given input pattern, then learning cannot occur in that particular neuron.
The competitive learning rule defines the change Δw_ij applied to synaptic weight w_ij as

Δw_ij = α (x_i − w_ij),  if neuron j wins the competition
Δw_ij = 0,               if neuron j loses the competition

where x_i is the input signal and α is the learning rate parameter.
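The rule translates directly into code: only the winner's weight vector moves, every loser is left untouched. A minimal sketch (the function name and the toy numbers are ours):

```python
def competitive_update(weights, x, winner, alpha=0.1):
    """Move only the winner's weight vector a step towards the input x."""
    for i in range(len(x)):
        weights[winner][i] += alpha * (x[i] - weights[winner][i])
    return weights

weights = [[0.0, 0.0], [1.0, 1.0]]
competitive_update(weights, x=[1.0, 0.0], winner=0)
# Neuron 0 moved towards x; neuron 1 is unchanged.
```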
-
The overall effect of the competitive learning rule resides in moving the synaptic weight vector W_j of the winning neuron j towards the input pattern X. The matching criterion is equivalent to the minimum Euclidean distance between vectors.
The Euclidean distance between a pair of n-by-1 vectors X and W_j is defined by

d = ||X − W_j|| = [ Σ_{i=1}^{n} (x_i − w_ij)^2 ]^{1/2}

where x_i and w_ij are the i-th elements of the vectors X and W_j, respectively.
-
To identify the winning neuron j_X that best matches the input vector X, we may apply the following condition:

||X − W_{j_X}|| = min_j ||X − W_j||,   j = 1, 2, . . ., m

where m is the number of neurons in the Kohonen layer.
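The winner-selection condition is a straightforward argmin over Euclidean distances. A sketch using the weight vectors from the example that follows (the helper name is ours):

```python
from math import sqrt

def find_winner(x, weights):
    """Index of the neuron whose weight vector is closest to x
    (minimum Euclidean distance criterion)."""
    dists = [sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w_j)))
             for w_j in weights]
    return dists.index(min(dists))

weights = [[0.27, 0.81], [0.42, 0.70], [0.43, 0.21]]
winner = find_winner([0.52, 0.12], weights)  # index 2, i.e. neuron 3
```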
-
Suppose, for instance, that the 2-dimensional input vector X is presented to the three-neuron Kohonen network,

X = [0.52 0.12]^T

The initial weight vectors, W_j, are given by

W1 = [0.27 0.81]^T   W2 = [0.42 0.70]^T   W3 = [0.43 0.21]^T
-
We find the winning (best-matching) neuron j_X using the minimum-distance Euclidean criterion:

d1 = [(x1 − w11)^2 + (x2 − w21)^2]^{1/2} = [(0.52 − 0.27)^2 + (0.12 − 0.81)^2]^{1/2} = 0.73
d2 = [(x1 − w12)^2 + (x2 − w22)^2]^{1/2} = [(0.52 − 0.42)^2 + (0.12 − 0.70)^2]^{1/2} = 0.59
d3 = [(x1 − w13)^2 + (x2 − w23)^2]^{1/2} = [(0.52 − 0.43)^2 + (0.12 − 0.21)^2]^{1/2} = 0.13

Neuron 3 is the winner and its weight vector W3 is updated according to the competitive learning rule:

Δw13 = α (x1 − w13) = 0.1 (0.52 − 0.43) = 0.01
Δw23 = α (x2 − w23) = 0.1 (0.12 − 0.21) = −0.01
-
The updated weight vector W3 at iteration (p + 1) is determined as:

W3(p + 1) = W3(p) + ΔW3(p) = [0.43 0.21]^T + [0.01 −0.01]^T = [0.44 0.20]^T

The weight vector W3 of the winning neuron 3 becomes closer to the input vector X with each iteration.
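The worked example can be replayed end to end. A sketch that recomputes the three distances, picks the winner and applies one competitive update, rounding to two decimals as in the slides:

```python
from math import sqrt

x = [0.52, 0.12]
W = [[0.27, 0.81], [0.42, 0.70], [0.43, 0.21]]
alpha = 0.1

# Minimum-distance Euclidean criterion.
d = [sqrt((x[0] - w[0]) ** 2 + (x[1] - w[1]) ** 2) for w in W]
# d rounds to [0.73, 0.59, 0.13] -> neuron 3 (index 2) wins.
winner = d.index(min(d))

# Competitive learning rule: move the winner towards x.
W[winner] = [w + alpha * (xi - w) for xi, w in zip(x, W[winner])]
# W[winner] rounds to [0.44, 0.20].
```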
-
Competitive Learning Algorithm
Step 1: Initialisation.
Set initial synaptic weights to small random values, say in an interval [0, 1], and assign a small positive value to the learning rate parameter α.
-
Step 2: Activation and Similarity Matching.
Activate the Kohonen network by applying the input vector X, and find the winner-takes-all (best matching) neuron j_X at iteration p, using the minimum-distance Euclidean criterion

||X − W_{j_X}(p)|| = min_j ||X − W_j(p)|| = min_j [ Σ_{i=1}^{n} (x_i − w_ij(p))^2 ]^{1/2},   j = 1, 2, . . ., m

where n is the number of neurons in the input layer, and m is the number of neurons in the Kohonen layer.
-
Step 3: Learning.
Update the synaptic weights

w_ij(p + 1) = w_ij(p) + Δw_ij(p)

where Δw_ij(p) is the weight correction at iteration p. The weight correction is determined by the competitive learning rule:

Δw_ij(p) = α [x_i − w_ij(p)],  if j ∈ Λ_j(p)
Δw_ij(p) = 0,                  if j ∉ Λ_j(p)

where α is the learning rate parameter, and Λ_j(p) is the neighbourhood function centred around the winner-takes-all neuron j_X at iteration p.
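Steps 1-3 can be sketched as a training loop. This simplified version shrinks the neighbourhood to the winner itself (winner-takes-all); the lecture's Λ_j(p) would update nearby neurons as well. Function name and defaults are ours:

```python
import random
from math import sqrt

def competitive_train(data, m, alpha=0.1, seed=0):
    """Competitive learning, Steps 1-3, iterated over a stream of inputs."""
    rng = random.Random(seed)
    n = len(data[0])
    # Step 1: small random weights in [0, 1].
    W = [[rng.random() for _ in range(n)] for _ in range(m)]
    for x in data:
        # Step 2: similarity matching - minimum Euclidean distance.
        d = [sqrt(sum((x[i] - W[j][i]) ** 2 for i in range(n)))
             for j in range(m)]
        winner = d.index(min(d))
        # Step 3: move only the winner towards the input.
        for i in range(n):
            W[winner][i] += alpha * (x[i] - W[winner][i])
    return W
```

Presenting the same input repeatedly pulls one weight vector ever closer to it, which is exactly the clustering behaviour the rule is meant to produce.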
-
Competitive learning in the Kohonen network
To illustrate competitive learning, consider the Kohonen network with 100 neurons arranged in the form of a two-dimensional lattice with 10 rows and 10 columns. The network is required to classify two-dimensional input vectors - each neuron in the network should respond only to the input vectors occurring in its region.
The network is trained with 1000 two-dimensional input vectors generated randomly in a square region in the interval between −1 and +1. The learning rate parameter α is equal to 0.1.
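The experiment above can be sketched directly as a small self-organising map over a 10 × 10 lattice, with a neighbourhood radius that shrinks during training. The radius schedule and function name are our assumptions; the lecture only fixes α = 0.1:

```python
import random

def train_som(n_steps=1000, rows=10, cols=10, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # One 2-D weight vector per lattice neuron, random in [-1, 1].
    W = {(r, c): [rng.uniform(-1, 1), rng.uniform(-1, 1)]
         for r in range(rows) for c in range(cols)}
    for p in range(n_steps):
        # Random 2-D input from the square [-1, 1] x [-1, 1].
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        # Winner: minimum (squared) Euclidean distance to x.
        win = min(W, key=lambda j: sum((xi - wi) ** 2
                                       for xi, wi in zip(x, W[j])))
        # Neighbourhood radius decays from 5 lattice units towards 1.
        radius = max(1, round(5 * (1 - p / n_steps)))
        for (r, c), w in W.items():
            if abs(r - win[0]) <= radius and abs(c - win[1]) <= radius:
                for i in range(2):
                    w[i] += alpha * (x[i] - w[i])
    return W
```

As training proceeds, the weight vectors spread out from their random starting positions and unfold into an ordered lattice over the square, as the plots on the following slides show.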
-
Initial random weights
[Figure: the initial weight vectors plotted as W(1,j) against W(2,j), scattered randomly over the square from −1 to +1.]
-
Network after 100 iterations
[Figure: the weight vectors W(1,j) against W(2,j) after 100 iterations, beginning to spread across the square from −1 to +1.]
-
Network after 1000 iterations
[Figure: the weight vectors W(1,j) against W(2,j) after 1000 iterations, unfolding into a roughly ordered lattice over the square from −1 to +1.]
-
Network after 10,000 iterations
[Figure: the weight vectors W(1,j) against W(2,j) after 10,000 iterations, forming a regular lattice covering the square from −1 to +1.]