Artificial Intelligence - CSC 361
TRANSCRIPT
Department of Computer Engineering
University of Kurdistan
Neural Networks (Graduate level): Associative Networks
By: Dr. Alireza Abdollahpouri
Associative networks
• To an extent, learning is forming associations.
• Human memory associates
– similar items,
– contrary/opposite items,
– items close in proximity,
– items close in succession (e.g., in a song)
The patterns we associate together may be
– of the same type or sensory modality (e.g. a visual
image may be associated with another visual image)
– or of different types (e.g. a fragrance may be
associated with a visual image or a feeling).
Memorization of a pattern (or a group of patterns) may be
considered to be associating the pattern with itself.
Associative networks
An associative network is a single-layer network in which the weights are determined in such a way that the net can store a set of pattern associations.
Associative networks
Associative networks divide into two classes, and each class may be implemented as a feed-forward or a recurrent net:
• Auto-associative (feed-forward or recurrent)
• Hetero-associative (feed-forward or recurrent)
Types of Associative networks
• Hetero-associative: n inputs, m outputs
• Auto-associative: n inputs, n outputs
Types of Associative networks
[Figure: auto-association maps a pattern to itself (A -> A), while hetero-association maps one pattern to a different one (e.g., the name "Niagara" recalls an image of a waterfall).]
Types of Associative networks
Whether auto- or hetero-associative, the net can recall not only the exact pattern pairs used in training; it can also produce associations when the input is merely similar to one on which it was trained.
• Information recording: a large set of patterns (the a priori information) is stored (memorized).
• Information retrieval/recall: stored prototypes are excited according to the input key patterns.
Associative networks
AM representation
Before training an associative memory (AM) neural network, the original patterns must be converted to a representation suitable for computation.
In a simple example, the original pattern might consist of "on" and "off" signals, and the conversion could be:
"on" -> +1, "off" -> 0 (binary representation), or
"on" -> +1, "off" -> -1 (bipolar representation).
Examples of Hetero-association
Mapping from 4 inputs to 2 outputs: whenever the net is shown a 4-bit input pattern, it produces a 2-bit output pattern.
Hebbian Learning
Input: n-dimensional vector s
Output: m-dimensional vector t
Weight matrix, using the Hebb rule (outer product of a training pair):

             [ s_1 t_1  ...  s_1 t_j  ...  s_1 t_m ]
W = s^T t =  [ s_i t_1  ...  s_i t_j  ...  s_i t_m ]
             [ s_n t_1  ...  s_n t_j  ...  s_n t_m ]

For several training pairs, the contributions are summed: w_ij = sum_p s_i(p) t_j(p)
The four training pairs:

Pair   s = (s1, s2, s3, s4)   t = (t1, t2)
1      (1, 0, 0, 0)           (1, 0)
2      (1, 1, 0, 0)           (1, 0)
3      (0, 0, 0, 1)           (0, 1)
4      (0, 0, 1, 1)           (0, 1)
Examples of Hetero-association
Weight matrix to store the first pair, s : t = (1, 0, 0, 0) : (1, 0):

          [1]            [1 0]
s^T t  =  [0] . [1 0] =  [0 0]
          [0]            [0 0]
          [0]            [0 0]

Weight matrix to store the second pair, s : t = (1, 1, 0, 0) : (1, 0):

          [1]            [1 0]
s^T t  =  [1] . [1 0] =  [1 0]
          [0]            [0 0]
          [0]            [0 0]
Examples of Hetero-association
Weight matrix to store the third pair, s : t = (0, 0, 0, 1) : (0, 1):

          [0]            [0 0]
s^T t  =  [0] . [0 1] =  [0 0]
          [0]            [0 0]
          [1]            [0 1]

Weight matrix to store the fourth pair, s : t = (0, 0, 1, 1) : (0, 1):

          [0]            [0 0]
s^T t  =  [0] . [0 1] =  [0 0]
          [1]            [0 1]
          [1]            [0 1]

Summing the four matrices gives the total weight matrix:

    [1 0]   [1 0]   [0 0]   [0 0]   [2 0]
W = [0 0] + [1 0] + [0 0] + [0 0] = [1 0]
    [0 0]   [0 0]   [0 0]   [0 1]   [0 1]
    [0 0]   [0 0]   [0 1]   [0 1]   [0 2]
Examples of Hetero-association
Test the network
Using the activation function f(y_in) = 1 if y_in > 0, and 0 otherwise:
For the first input pattern: (1, 0, 0, 0) . W = (2, 0) -> f -> (1, 0)
This is the correct response for the first training pattern.
Similarly, applying the same algorithm with x equal to each of the other three training input vectors yields the correct response in every case.
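The training-and-test procedure above can be sketched in a few lines of Python. The pairs and the resulting weight matrix are the ones on the slides; the helper function names are my own, not part of the lecture.

```python
# Hebb-rule training of the hetero-associative example (binary pairs).

def outer(s, t):
    """Outer product s^T t as a list of rows."""
    return [[si * tj for tj in t] for si in s]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matvec(x, w):
    """Row vector x times matrix w."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]

def f(v):
    """Binary step activation: 1 if the net input is positive, else 0."""
    return [1 if vi > 0 else 0 for vi in v]

pairs = [((1, 0, 0, 0), (1, 0)),
         ((1, 1, 0, 0), (1, 0)),
         ((0, 0, 0, 1), (0, 1)),
         ((0, 0, 1, 1), (0, 1))]

W = [[0, 0], [0, 0], [0, 0], [0, 0]]
for s, t in pairs:
    W = add(W, outer(s, t))        # w_ij = sum_p s_i(p) t_j(p)

print(W)                           # [[2, 0], [1, 0], [0, 1], [0, 2]]
print(f(matvec((1, 0, 0, 0), W)))  # [1, 0] -- the correct response
```

Every one of the four training inputs recalls its target in the same way.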
Test the network
Testing a hetero-associative net with an input similar to a training input: the net still associates a known output pattern with this input.
Testing a hetero-associative net with an input that is not similar to any training input: the output is not one of the outputs with which the net was trained; in other words, the net does not recognize the pattern.
Test the network
Bipolar representation
The same four pairs in bipolar form:
s(1) = ( 1, -1, -1, -1),  t(1) = ( 1, -1)
s(2) = ( 1,  1, -1, -1),  t(2) = ( 1, -1)
s(3) = (-1, -1, -1,  1),  t(3) = (-1,  1)
s(4) = (-1, -1,  1,  1),  t(4) = (-1,  1)

Hebb rule: w_ij = sum_p s_i(p) t_j(p)

Weight matrix (sum over the first, second, third, and fourth pairs):

    [ 1 -1]   [ 1 -1]   [ 1 -1]   [ 1 -1]   [ 4 -4]
W = [-1  1] + [ 1 -1] + [ 1 -1] + [ 1 -1] = [ 2 -2]
    [-1  1]   [-1  1]   [ 1 -1]   [-1  1]   [-2  2]
    [-1  1]   [-1  1]   [-1  1]   [-1  1]   [-4  4]

Testing with each of the four training inputs yields the correct bipolar target in every case.
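The bipolar version can be checked the same way; a short sketch, with illustrative helper names of my own:

```python
# Bipolar Hebb rule: w_ij = sum_p s_i(p) t_j(p) over bipolar pairs.

def hebb(pairs, n, m):
    w = [[0] * m for _ in range(n)]
    for s, t in pairs:
        for i in range(n):
            for j in range(m):
                w[i][j] += s[i] * t[j]   # accumulate the outer products
    return w

def recall(x, w):
    net = [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]
    return [1 if v > 0 else -1 for v in net]   # bipolar step activation

pairs = [(( 1, -1, -1, -1), ( 1, -1)),
         (( 1,  1, -1, -1), ( 1, -1)),
         ((-1, -1, -1,  1), (-1,  1)),
         ((-1, -1,  1,  1), (-1,  1))]

W = hebb(pairs, 4, 2)
print(W)   # [[4, -4], [2, -2], [-2, 2], [-4, 4]]
for s, t in pairs:
    assert recall(s, W) == list(t)   # every training input maps to its target
```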
Hetero-associative network - example
A hetero-associative net for associating letters from different fonts.
Response of the hetero-associative net to several noisy versions of pattern A, and to patterns A, B, and C with mistakes in 1/3 of the components.
Auto-associative nets
For an auto-associative net, the weights are usually set from the formula w_ij = sum_p s_i(p) s_j(p) (the Hebb rule with t = s).

Example: store s = (1, 1, 1, 1).

             [1 1 1 1]
W = s^T s =  [1 1 1 1]
             [1 1 1 1]
             [1 1 1 1]

Test with x = (1, 1, 1, 1):
y_in = x . W = (1, 1, 1, 1) . W = (4, 4, 4, 4)
f(4, 4, 4, 4) = (1, 1, 1, 1)
Correct recognition of the input vector.
Auto-associative net - example
Testing an auto-associative net: one mistake in the input vector.
The vector s = (1, 1, 1, -1) is stored with the weight matrix:

    [ 1  1  1 -1]
W = [ 1  1  1 -1]
    [ 1  1  1 -1]
    [-1 -1 -1  1]

Testing with each vector that differs from s in one component:
(-1,  1,  1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
( 1, -1,  1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
( 1,  1, -1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
( 1,  1,  1,  1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
Correct recognition in every case.
Auto-associative net - example
Testing an auto-associative net: two "missing" entries (zeros) in the input vector.
The vector s = (1, 1, 1, -1) is stored with the weight matrix:

    [ 1  1  1 -1]
W = [ 1  1  1 -1]
    [ 1  1  1 -1]
    [-1 -1 -1  1]

Testing with each vector that has two entries of s replaced by 0:
(0, 0, 1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(0, 1, 0, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(0, 1, 1,  0) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(1, 0, 0, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(1, 0, 1,  0) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(1, 1, 0,  0) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
Correct recognition in every case.
Auto-associative net - example
Testing an auto-associative net: two mistakes in the input vector.
The vector s = (1, 1, 1, -1) is stored with the weight matrix:

    [ 1  1  1 -1]
W = [ 1  1  1 -1]
    [ 1  1  1 -1]
    [-1 -1 -1  1]

(-1, -1, 1, -1) . W = (0, 0, 0, 0): incorrect recognition.
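The three auto-associative tests above (one mistake, missing entries, two mistakes) can be reproduced with a short sketch; the helper names are my own.

```python
# Store s = (1, 1, 1, -1) with W = s^T s, then probe with damaged inputs.

s = (1, 1, 1, -1)
W = [[si * sj for sj in s] for si in s]   # outer product s^T s

def recall(x):
    net = [sum(x[i] * W[i][j] for i in range(4)) for j in range(4)]
    # sign function: 0 entries mean "undecided"
    return tuple(1 if v > 0 else (-1 if v < 0 else 0) for v in net)

print(recall((1, 1, 1, -1)))    # (1, 1, 1, -1): stored vector is a fixed point
print(recall((-1, 1, 1, -1)))   # one mistake  -> still (1, 1, 1, -1)
print(recall((0, 0, 1, -1)))    # two zeros    -> still (1, 1, 1, -1)
print(recall((-1, -1, 1, -1)))  # two mistakes -> (0, 0, 0, 0): not recognized
```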
Storage Capacity
An auto-associative net with four nodes can store three mutually orthogonal vectors.
It is fairly common for an auto-associative network to have its diagonal terms set to zero.
Storing two orthogonal vectors, s(1) = (1, 1, -1, -1) and s(2) = (-1, 1, 1, -1), with zero diagonals:

     [ 0  1 -1 -1]        [ 0 -1 -1  1]             [ 0  0 -2  0]
W1 = [ 1  0 -1 -1]   W2 = [-1  0  1 -1]   W1 + W2 = [ 0  0  0 -2]
     [-1 -1  0  1]        [-1  1  0 -1]             [-2  0  0  0]
     [-1 -1  1  0]        [ 1 -1 -1  0]             [ 0 -2  0  0]
Storage Capacity
Storing two orthogonal vectors in an auto-associative net:
( 1, 1, -1, -1) . [W1 + W2] = ( 2, 2, -2, -2) -> ( 1, 1, -1, -1)
(-1, 1,  1, -1) . [W1 + W2] = (-2, 2,  2, -2) -> (-1, 1,  1, -1)
Correct recognition of both stored vectors.
Storage Capacity
Attempting to store two non-orthogonal vectors, (1, -1, -1, 1) and (1, 1, -1, 1), in an auto-associative net (zero diagonals):

    [ 0  0 -2  2]
W = [ 0  0  0  0]
    [-2  0  0 -2]
    [ 2  0 -2  0]

(1, -1, -1, 1) . W = (4, 0, -4, 4), which is not a stored vector: incorrect recognition.
The second component of the response is 0, so neither stored vector is reproduced.
Storage Capacity
Storing three mutually orthogonal vectors, s(1) = (1, 1, -1, -1), s(2) = (-1, 1, 1, -1), s(3) = (-1, 1, -1, 1), in an auto-associative net (zero diagonals):

     [ 0  1 -1 -1]        [ 0 -1 -1  1]        [ 0 -1  1 -1]                  [ 0 -1 -1 -1]
W1 = [ 1  0 -1 -1]   W2 = [-1  0  1 -1]   W3 = [-1  0 -1  1]   W1 + W2 + W3 = [-1  0 -1 -1]
     [-1 -1  0  1]        [-1  1  0 -1]        [ 1 -1  0 -1]                  [-1 -1  0 -1]
     [-1 -1  1  0]        [ 1 -1 -1  0]        [-1  1 -1  0]                  [-1 -1 -1  0]

Correct recognition of all three vectors.
Storage Capacity
Storing four mutually orthogonal vectors, s(1) = (1, 1, -1, -1), s(2) = (-1, 1, 1, -1), s(3) = (-1, 1, -1, 1), s(4) = (1, 1, 1, 1), in an auto-associative net (zero diagonals):

                    [0 0 0 0]
W1 + W2 + W3 + W4 = [0 0 0 0]
                    [0 0 0 0]
                    [0 0 0 0]

The weight matrix is the zero matrix: all learning is erased.
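The progression on these slides can be checked numerically: add the orthogonal vectors into a zero-diagonal matrix one at a time and test recall. A minimal sketch (variable names are my own):

```python
# Storage capacity of a 4-node zero-diagonal auto-associative net.

vecs = [( 1, 1, -1, -1),
        (-1, 1,  1, -1),
        (-1, 1, -1,  1),
        ( 1, 1,  1,  1)]   # four mutually orthogonal bipolar vectors

def sign(v):
    return 1 if v > 0 else (-1 if v < 0 else 0)

W = [[0] * 4 for _ in range(4)]
for k, s in enumerate(vecs, start=1):
    for i in range(4):
        for j in range(4):
            if i != j:                 # diagonal terms kept at zero
                W[i][j] += s[i] * s[j]
    ok = all(tuple(sign(sum(v[i] * W[i][j] for i in range(4)))
                   for j in range(4)) == v for v in vecs[:k])
    print(k, "stored,", "all recalled" if ok else "recall fails")
```

With one, two, or three vectors stored, every stored vector is recalled; after the fourth, W is the zero matrix and recall fails.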
Hopfield Neural network
The Hopfield neural network (HNN) is a model of auto-associative memory.
It is a single-layer neural network with feedback connections.
[Figure: a Hopfield network of units 1, 2, 3, ..., n with outputs v1, v2, ..., vn, external inputs I1, I2, ..., In, and a weighted connection wij between every pair of distinct units.]
The weights are symmetric and there are no self-connections:
wij = wji
wii = 0
Hopfield Neural network
To store a set of binary patterns s(p), the weight matrix W = {w_ij} is given by:
w_ij = sum_p [2 s_i(p) - 1] [2 s_j(p) - 1],  for i != j;  w_ii = 0

To store a set of bipolar patterns s(p), the weight matrix W = {w_ij} is given by:
w_ij = sum_p s_i(p) s_j(p),  for i != j;  w_ii = 0
Hopfield Neural network
Step 0. Initialize weights to store patterns.
While activations of the net are not converged, do Steps 1-7.
Step 1. For each input vector x, do Steps 2-6.
Step 2. Set initial activations of the net equal to the external input vector x:
        y_i = x_i,  i = 1, ..., n
Step 3. Do Steps 4-6 for each unit Y_i. (Units should be updated in random order.)
Step 4. Compute net input:
        y_in_i = x_i + sum_j y_j w_ji
Step 5. Determine activation (output signal), with threshold theta_i (usually 0):
        y_i = 1          if y_in_i > theta_i
        y_i unchanged    if y_in_i = theta_i
        y_i = 0          if y_in_i < theta_i
Step 6. Broadcast the value of y_i to all other units. (This updates the activation vector.)
Step 7. Test for convergence.
Hopfield Neural network
Hopfield Neural network - example
The vector (1, 1, 1, 0) (bipolar equivalent (1, 1, 1, -1)) was stored in a net; the weight matrix is bipolar with zero diagonal:

    [ 0  1  1 -1]
W = [ 1  0  1 -1]
    [ 1  1  0 -1]
    [-1 -1 -1  0]

The input vector is x = (0, 0, 1, 0). For this example the update order is Y1, Y4, Y3, Y2.
Initial activations: y = (0, 0, 1, 0)

Choose unit Y1 to update its activation:
y_in_1 = x_1 + sum_j y_j w_j1 = 0 + 1 = 1;  y_in_1 > 0 -> y_1 = 1;  y = (1, 0, 1, 0)

Choose unit Y4 to update its activation:
y_in_4 = x_4 + sum_j y_j w_j4 = 0 + (-2) = -2;  y_in_4 < 0 -> y_4 = 0;  y = (1, 0, 1, 0)

Choose unit Y3 to update its activation:
y_in_3 = x_3 + sum_j y_j w_j3 = 1 + 1 = 2;  y_in_3 > 0 -> y_3 = 1;  y = (1, 0, 1, 0)

Choose unit Y2 to update its activation:
y_in_2 = x_2 + sum_j y_j w_j2 = 0 + 2 = 2;  y_in_2 > 0 -> y_2 = 1;  y = (1, 1, 1, 0)

Further iterations do not change the activation of any unit. The net has converged to the stored vector (1, 1, 1, 0).
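The asynchronous recall just walked through can be sketched in a few lines (0-based indices; variable names are my own):

```python
# Hopfield recall of (1, 1, 1, 0): binary activations, bipolar weights
# with zero diagonal, asynchronous updates in the order Y1, Y4, Y3, Y2.

W = [[ 0,  1,  1, -1],
     [ 1,  0,  1, -1],
     [ 1,  1,  0, -1],
     [-1, -1, -1,  0]]

x = [0, 0, 1, 0]      # external input: a corrupted copy of (1, 1, 1, 0)
y = list(x)           # initial activations equal the external input

for i in [0, 3, 2, 1]:                          # update order Y1, Y4, Y3, Y2
    y_in = x[i] + sum(y[j] * W[j][i] for j in range(4))
    if y_in > 0:
        y[i] = 1                                # fire
    elif y_in < 0:
        y[i] = 0                                # turn off
    # y_in == 0 would leave y[i] unchanged

print(y)   # [1, 1, 1, 0] -- the stored pattern
```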
Hopfield Neural network - example
Image reconstruction: a 20 x 20 discrete Hopfield network was trained with 20 input patterns, including the "face" image shown on the left and 19 random patterns like the one on the right.
Hopfield Neural network - example
After providing only one quarter of the "face" image as the initial input, the network perfectly reconstructs the image within only two iterations.
Hopfield Neural network - example
Adding noise by flipping each pixel with probability p = 0.3 does not impair the network's performance: after two steps the image is again perfectly reconstructed.
For noise created with p = 0.4, the network is unable to recover the original image. Instead, it converges to one of the 19 random patterns.
Hopfield Neural network - example
The stored data in memory: a set of digit patterns.
Network input: distorted 6 -> perfect recall.
Hopfield Neural network - example
Network input: distorted 9 -> erroneous recall.
Network input: distorted 2 -> perfect recall.
Hopfield network - Energy function
Hopfield nets have a scalar value associated with each state of the network, referred to as the "energy" E of the network, where:

E = -(1/2) sum_{i != j} w_ij v_i v_j - sum_i I_i v_i

Each asynchronous update satisfies delta E <= 0, so a Hopfield network constantly decreases its energy until it settles in a stable (minimum-energy) state.
Hopfield network - example
Establishing connection weights. The fundamental (stored) pattern is B = [1, 1, 1, -1], giving:

    [ 0  1  1 -1]
W = [ 1  0  1 -1]
    [ 1  1  0 -1]
    [-1 -1 -1  0]
Hopfield network - example
Network states and their codes: with four bipolar units there are 2^4 = 16 possible states.
Calculating the energy function for each state, e.g. for state A = [O1, O2, O3, O4] = [1, 1, 1, 1]:
E(A) = -(1/2) sum_{i != j} w_ij O_i O_j = 0
Hopfield network - example
Similarly, we can compute the energy level of the other states. The stored patterns sit at the minimum energy (stable states).
Hopfield network - example
State transition for state J = [-1, -1, 1, -1]:
Updating unit 1: net_1 = sum_j O_j w_j1 = (-1)(1) + (1)(1) + (-1)(-1) = 1 > 0.
As a result, the first component of state J changes from -1 to 1. In other words, state J transits to state G = [1, -1, 1, -1].
Hopfield network - example
Updating unit 2 of state G: net_2 = (1)(1) + (1)(1) + (-1)(-1) = 3 > 0.
As a result, the second component of state G changes from -1 to 1. In other words, state G transits to state B = [1, 1, 1, -1].
As state B is a fundamental (stored) pattern, no further transitions occur.
Storage Capacity of Hopfield Net
P: number of patterns that can be stored and recalled in a net with reasonable accuracy
n: number of neurons in the net
Binary patterns:   P = 0.15 n
Bipolar patterns:  P = n / (2 log2 n)
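A quick arithmetic check of the two estimates for a net of, say, n = 100 neurons (the value of n is only illustrative):

```python
# Hopfield capacity estimates for n neurons.
from math import log2

n = 100
print(0.15 * n)            # 15.0 patterns (binary estimate)
print(n / (2 * log2(n)))   # about 7.5 patterns (bipolar estimate)
```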
Bidirectional associative memory (BAM)
The Hopfield network represents an autoassociative type of memory: it can retrieve a corrupted or incomplete memory, but it cannot associate one memory with a different memory.
Human memory is essentially associative. One thing may remind us of another, and that of another, and so on. We use a chain of mental associations to recover a lost memory. If we forget where we left an umbrella, we try to recall where we last had it, what we were doing, and who we were talking to. We attempt to establish a chain of associations, and thereby to restore the lost memory.
To associate one memory with another, we need a recurrent neural network capable of accepting an input pattern on one set of neurons and producing a related, but different, output pattern on another set of neurons.
Bidirectional associative memory (BAM), first proposed by Bart Kosko, is a heteroassociative network. It associates patterns from one set, set A, to patterns from another set, set B, and vice versa. Like a Hopfield network, the BAM can generalize and also produce correct outputs despite corrupted or incomplete inputs.
Bidirectional associative memory (BAM)
BAM operation
[Figure: (a) Forward direction - input-layer units 1..n with activations x_i(p) feed output-layer units 1..m, producing y_j(p). (b) Backward direction - the output-layer activations y_j(p) feed back to the input layer, producing x_i(p+1).]
The basic idea behind the BAM is to store pattern pairs so that when n-dimensional vector X from set A is presented as input, the BAM recalls m-dimensional vector Y from set B, but when Y is presented as input, the BAM recalls X.
Bidirectional associative memory (BAM)
Bidirectional associative memory (BAM)
To develop the BAM, we need to create a correlation matrix for each pattern pair we want to store. The correlation matrix is the matrix product of the input vector X and the transpose of the output vector, Y^T. The BAM weight matrix is the sum of all correlation matrices:

W = sum_{m=1}^{M} X_m Y_m^T

where M is the number of pattern pairs to be stored in the BAM.
Stability and storage capacity of the BAM
The BAM is unconditionally stable. This means that any set of associations can be learned without risk of instability.
The maximum number of associations to be stored in the BAM should not exceed the number of neurons in the smaller layer.
The more serious problem with the BAM is incorrect convergence. The BAM may not always produce the closest association. In fact, a stable association may be only slightly related to the initial input vector.
BAM- Example
Imagine we wish to store two associations, A1:B1 and A2:B2.
• A1 = (1, 0, 1, 0, 1, 0), B1 = (1, 1, 0, 0)
• A2 = (1, 1, 1, 0, 0, 0), B2 = (1, 0, 1, 0)
These are then transformed into the bipolar forms:
• X1 = (1, -1, 1, -1, 1, -1), Y1 = (1, 1, -1, -1)
• X2 = (1, 1, 1, -1, -1, -1), Y2 = (1, -1, 1, -1)
BAM- Example
The weight matrix is the sum of the two correlation matrices, M = X1^T Y1 + X2^T Y2 (treating the X's and Y's as row vectors):

    [ 2  0  0 -2]
M = [ 0 -2  2  0]
    [ 2  0  0 -2]
    [-2  0  0  2]
    [ 0  2 -2  0]
    [-2  0  0  2]

To retrieve the association A1, we multiply it by M to get (4, 2, -2, -4), which, when run through a threshold, yields (1, 1, 0, 0), which is B1. To find the reverse association, multiply this by the transpose of M.
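The forward retrieval can be reproduced directly from the bipolar pairs (variable names are my own):

```python
# BAM sketch: build M = X1^T Y1 + X2^T Y2 and retrieve B1 from A1.

X = [(1, -1, 1, -1, 1, -1), (1, 1, 1, -1, -1, -1)]   # bipolar inputs
Y = [(1, 1, -1, -1), (1, -1, 1, -1)]                 # bipolar outputs

# Sum of the correlation (outer-product) matrices, one per stored pair.
M = [[sum(x[i] * y[j] for x, y in zip(X, Y)) for j in range(4)]
     for i in range(6)]

A1 = (1, 0, 1, 0, 1, 0)                              # binary input pattern
net = [sum(A1[i] * M[i][j] for i in range(6)) for j in range(4)]
print(net)                                           # [4, 2, -2, -4]
B1 = [1 if v > 0 else 0 for v in net]                # threshold to binary
print(B1)                                            # [1, 1, 0, 0]
```

The reverse association works the same way with the transpose of M.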
Questions