Artificial Intelligence - CSC 361
TRANSCRIPT
Department of Computer Engineering
University of Kurdistan
Neural Networks (Graduate level): Associative Networks
By: Dr. Alireza Abdollahpouri
Associative networks
• To an extent, learning is forming associations.
• Human memory associates
– similar items,
– contrary/opposite items,
– items close in proximity,
– items close in succession (e.g., in a song)
The patterns we associate together may be
– of the same type or sensory modality (e.g. a visual
image may be associated with another visual image)
– or of different types (e.g. a fragrance may be
associated with a visual image or a feeling).
Memorization of a pattern (or a group of patterns) may be
considered to be associating the pattern with itself.
Associative networks
An associative network is a single-layer network in which the weights are determined in such a way that the net can store a set of pattern associations.
Associative networks
Associative networks divide into two classes, and each class may be implemented as a feed-forward or a recurrent net:
• Auto-associative (feed-forward or recurrent)
• Hetero-associative (feed-forward or recurrent)
Types of Associative networks
• Hetero-associative: n inputs, m outputs
• Auto-associative: n inputs, n outputs
Types of Associative networks
[Figure: auto-association maps a pattern to itself (A -> A), while hetero-association maps one pattern to a different one (e.g., the name "Niagara" recalls an image of a waterfall).]
Types of Associative networks
Whether auto- or hetero-associative, the net can recall not only the exact pattern pairs used in training; it can also produce associations when the input is merely similar to one on which it was trained.
• Information recording: a large set of patterns (the a priori information) is stored (memorized).
• Information retrieval/recall: stored prototypes are excited according to the input key patterns.
Associative networks
AM representation
Before training an associative memory (AM) neural network, the original patterns must be converted to a representation suitable for computation.
In a simple example, the original pattern might consist of "on" and "off" signals, and the conversion could be:
"on" -> +1, "off" -> 0 (binary representation), or
"on" -> +1, "off" -> -1 (bipolar representation).
Examples of Hetero-association
Mapping from 4 inputs to 2 outputs: whenever the net is shown a 4-bit input pattern, it produces a 2-bit output pattern.
Hebbian Learning
Input: n-dimensional vector s
Output: m-dimensional vector t
Weight matrix, using the Hebb rule (outer product of a training pair):

             [ s_1 t_1  ...  s_1 t_j  ...  s_1 t_m ]
W = s^T t =  [ s_i t_1  ...  s_i t_j  ...  s_i t_m ]
             [ s_n t_1  ...  s_n t_j  ...  s_n t_m ]

For several training pairs, the contributions are summed: w_ij = sum_p s_i(p) t_j(p)
The four training pairs:

Pair   s = (s1, s2, s3, s4)   t = (t1, t2)
1      (1, 0, 0, 0)           (1, 0)
2      (1, 1, 0, 0)           (1, 0)
3      (0, 0, 0, 1)           (0, 1)
4      (0, 0, 1, 1)           (0, 1)
Examples of Hetero-association
Weight matrix to store the first pair, s : t = (1, 0, 0, 0) : (1, 0):

          [1]            [1 0]
s^T t  =  [0] . [1 0] =  [0 0]
          [0]            [0 0]
          [0]            [0 0]

Weight matrix to store the second pair, s : t = (1, 1, 0, 0) : (1, 0):

          [1]            [1 0]
s^T t  =  [1] . [1 0] =  [1 0]
          [0]            [0 0]
          [0]            [0 0]
Examples of Hetero-association
Weight matrix to store the third pair, s : t = (0, 0, 0, 1) : (0, 1):

          [0]            [0 0]
s^T t  =  [0] . [0 1] =  [0 0]
          [0]            [0 0]
          [1]            [0 1]

Weight matrix to store the fourth pair, s : t = (0, 0, 1, 1) : (0, 1):

          [0]            [0 0]
s^T t  =  [0] . [0 1] =  [0 0]
          [1]            [0 1]
          [1]            [0 1]

Summing the four matrices gives the total weight matrix:

    [1 0]   [1 0]   [0 0]   [0 0]   [2 0]
W = [0 0] + [1 0] + [0 0] + [0 0] = [1 0]
    [0 0]   [0 0]   [0 0]   [0 1]   [0 1]
    [0 0]   [0 0]   [0 1]   [0 1]   [0 2]
Examples of Hetero-association
Test the network
Using the activation function f(y_in) = 1 if y_in > 0, and 0 otherwise:
For the first input pattern: (1, 0, 0, 0) . W = (2, 0) -> f -> (1, 0)
This is the correct response for the first training pattern.
Similarly, applying the same algorithm with x equal to each of the other three training input vectors yields the correct response in every case.
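The training-and-test procedure above can be sketched in a few lines of Python. The pairs and the resulting weight matrix are the ones on the slides; the helper function names are my own, not part of the lecture.

```python
# Hebb-rule training of the hetero-associative example (binary pairs).

def outer(s, t):
    """Outer product s^T t as a list of rows."""
    return [[si * tj for tj in t] for si in s]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matvec(x, w):
    """Row vector x times matrix w."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]

def f(v):
    """Binary step activation: 1 if the net input is positive, else 0."""
    return [1 if vi > 0 else 0 for vi in v]

pairs = [((1, 0, 0, 0), (1, 0)),
         ((1, 1, 0, 0), (1, 0)),
         ((0, 0, 0, 1), (0, 1)),
         ((0, 0, 1, 1), (0, 1))]

W = [[0, 0], [0, 0], [0, 0], [0, 0]]
for s, t in pairs:
    W = add(W, outer(s, t))        # w_ij = sum_p s_i(p) t_j(p)

print(W)                           # [[2, 0], [1, 0], [0, 1], [0, 2]]
print(f(matvec((1, 0, 0, 0), W)))  # [1, 0] -- the correct response
```

Every one of the four training inputs recalls its target in the same way.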
Test the network
Testing a hetero-associative net with an input similar to a training input: the net still associates a known output pattern with this input.
Testing a hetero-associative net with an input that is not similar to any training input: the output is not one of the outputs with which the net was trained; in other words, the net does not recognize the pattern.
Test the network
Bipolar representation
The same four pairs in bipolar form:
s(1) = ( 1, -1, -1, -1),  t(1) = ( 1, -1)
s(2) = ( 1,  1, -1, -1),  t(2) = ( 1, -1)
s(3) = (-1, -1, -1,  1),  t(3) = (-1,  1)
s(4) = (-1, -1,  1,  1),  t(4) = (-1,  1)

Hebb rule: w_ij = sum_p s_i(p) t_j(p)

Weight matrix (sum over the first, second, third, and fourth pairs):

    [ 1 -1]   [ 1 -1]   [ 1 -1]   [ 1 -1]   [ 4 -4]
W = [-1  1] + [ 1 -1] + [ 1 -1] + [ 1 -1] = [ 2 -2]
    [-1  1]   [-1  1]   [ 1 -1]   [-1  1]   [-2  2]
    [-1  1]   [-1  1]   [-1  1]   [-1  1]   [-4  4]

Testing with each of the four training inputs yields the correct bipolar target in every case.
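The bipolar version can be checked the same way; a short sketch, with illustrative helper names of my own:

```python
# Bipolar Hebb rule: w_ij = sum_p s_i(p) t_j(p) over bipolar pairs.

def hebb(pairs, n, m):
    w = [[0] * m for _ in range(n)]
    for s, t in pairs:
        for i in range(n):
            for j in range(m):
                w[i][j] += s[i] * t[j]   # accumulate the outer products
    return w

def recall(x, w):
    net = [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]
    return [1 if v > 0 else -1 for v in net]   # bipolar step activation

pairs = [(( 1, -1, -1, -1), ( 1, -1)),
         (( 1,  1, -1, -1), ( 1, -1)),
         ((-1, -1, -1,  1), (-1,  1)),
         ((-1, -1,  1,  1), (-1,  1))]

W = hebb(pairs, 4, 2)
print(W)   # [[4, -4], [2, -2], [-2, 2], [-4, 4]]
for s, t in pairs:
    assert recall(s, W) == list(t)   # every training input maps to its target
```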
Hetero-associative network - example
A hetero-associative net for associating letters from different fonts.
Response of the hetero-associative net to several noisy versions of pattern A, and to patterns A, B, and C with mistakes in 1/3 of the components.
Auto-associative nets
For an auto-associative net, the weights are usually set from the formula w_ij = sum_p s_i(p) s_j(p) (the Hebb rule with t = s).

Example: store s = (1, 1, 1, 1).

             [1 1 1 1]
W = s^T s =  [1 1 1 1]
             [1 1 1 1]
             [1 1 1 1]

Test with x = (1, 1, 1, 1):
y_in = x . W = (1, 1, 1, 1) . W = (4, 4, 4, 4)
f(4, 4, 4, 4) = (1, 1, 1, 1)
Correct recognition of the input vector.
Auto-associative net - example
Testing an auto-associative net: one mistake in the input vector.
The vector s = (1, 1, 1, -1) is stored with the weight matrix:

    [ 1  1  1 -1]
W = [ 1  1  1 -1]
    [ 1  1  1 -1]
    [-1 -1 -1  1]

Testing with each vector that differs from s in one component:
(-1,  1,  1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
( 1, -1,  1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
( 1,  1, -1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
( 1,  1,  1,  1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
Correct recognition in every case.
Auto-associative net - example
Testing an auto-associative net: two "missing" entries (zeros) in the input vector.
The vector s = (1, 1, 1, -1) is stored with the weight matrix:

    [ 1  1  1 -1]
W = [ 1  1  1 -1]
    [ 1  1  1 -1]
    [-1 -1 -1  1]

Testing with each vector that has two entries of s replaced by 0:
(0, 0, 1, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(0, 1, 0, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(0, 1, 1,  0) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(1, 0, 0, -1) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(1, 0, 1,  0) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
(1, 1, 0,  0) . W = (2, 2, 2, -2) -> (1, 1, 1, -1)
Correct recognition in every case.
Auto-associative net - example
Testing an auto-associative net: two mistakes in the input vector.
The vector s = (1, 1, 1, -1) is stored with the weight matrix:

    [ 1  1  1 -1]
W = [ 1  1  1 -1]
    [ 1  1  1 -1]
    [-1 -1 -1  1]

(-1, -1, 1, -1) . W = (0, 0, 0, 0): incorrect recognition.
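The three auto-associative tests above (one mistake, missing entries, two mistakes) can be reproduced with a short sketch; the helper names are my own.

```python
# Store s = (1, 1, 1, -1) with W = s^T s, then probe with damaged inputs.

s = (1, 1, 1, -1)
W = [[si * sj for sj in s] for si in s]   # outer product s^T s

def recall(x):
    net = [sum(x[i] * W[i][j] for i in range(4)) for j in range(4)]
    # sign function: 0 entries mean "undecided"
    return tuple(1 if v > 0 else (-1 if v < 0 else 0) for v in net)

print(recall((1, 1, 1, -1)))    # (1, 1, 1, -1): stored vector is a fixed point
print(recall((-1, 1, 1, -1)))   # one mistake  -> still (1, 1, 1, -1)
print(recall((0, 0, 1, -1)))    # two zeros    -> still (1, 1, 1, -1)
print(recall((-1, -1, 1, -1)))  # two mistakes -> (0, 0, 0, 0): not recognized
```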
Storage Capacity
An auto-associative net with four nodes can store three mutually orthogonal vectors.
It is fairly common for an auto-associative network to have its diagonal terms set to zero.
Storing two orthogonal vectors, s(1) = (1, 1, -1, -1) and s(2) = (-1, 1, 1, -1), with zero diagonals:

     [ 0  1 -1 -1]        [ 0 -1 -1  1]             [ 0  0 -2  0]
W1 = [ 1  0 -1 -1]   W2 = [-1  0  1 -1]   W1 + W2 = [ 0  0  0 -2]
     [-1 -1  0  1]        [-1  1  0 -1]             [-2  0  0  0]
     [-1 -1  1  0]        [ 1 -1 -1  0]             [ 0 -2  0  0]
Storage Capacity
Storing two orthogonal vectors in an auto-associative net:
( 1, 1, -1, -1) . [W1 + W2] = ( 2, 2, -2, -2) -> ( 1, 1, -1, -1)
(-1, 1,  1, -1) . [W1 + W2] = (-2, 2,  2, -2) -> (-1, 1,  1, -1)
Correct recognition of both stored vectors.
Storage Capacity
Attempting to store two non-orthogonal vectors, (1, -1, -1, 1) and (1, 1, -1, 1), in an auto-associative net (zero diagonals):

    [ 0  0 -2  2]
W = [ 0  0  0  0]
    [-2  0  0 -2]
    [ 2  0 -2  0]

(1, -1, -1, 1) . W = (4, 0, -4, 4), which is not a stored vector: incorrect recognition.
The second component of the response is 0, so neither stored vector is reproduced.
Storage Capacity
Storing three mutually orthogonal vectors, s(1) = (1, 1, -1, -1), s(2) = (-1, 1, 1, -1), s(3) = (-1, 1, -1, 1), in an auto-associative net (zero diagonals):

     [ 0  1 -1 -1]        [ 0 -1 -1  1]        [ 0 -1  1 -1]                  [ 0 -1 -1 -1]
W1 = [ 1  0 -1 -1]   W2 = [-1  0  1 -1]   W3 = [-1  0 -1  1]   W1 + W2 + W3 = [-1  0 -1 -1]
     [-1 -1  0  1]        [-1  1  0 -1]        [ 1 -1  0 -1]                  [-1 -1  0 -1]
     [-1 -1  1  0]        [ 1 -1 -1  0]        [-1  1 -1  0]                  [-1 -1 -1  0]

Correct recognition of all three vectors.
Storage Capacity
Storing four mutually orthogonal vectors, s(1) = (1, 1, -1, -1), s(2) = (-1, 1, 1, -1), s(3) = (-1, 1, -1, 1), s(4) = (1, 1, 1, 1), in an auto-associative net (zero diagonals):

                    [0 0 0 0]
W1 + W2 + W3 + W4 = [0 0 0 0]
                    [0 0 0 0]
                    [0 0 0 0]

The weight matrix is the zero matrix: all learning is erased.
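The progression on these slides can be checked numerically: add the orthogonal vectors into a zero-diagonal matrix one at a time and test recall. A minimal sketch (variable names are my own):

```python
# Storage capacity of a 4-node zero-diagonal auto-associative net.

vecs = [( 1, 1, -1, -1),
        (-1, 1,  1, -1),
        (-1, 1, -1,  1),
        ( 1, 1,  1,  1)]   # four mutually orthogonal bipolar vectors

def sign(v):
    return 1 if v > 0 else (-1 if v < 0 else 0)

W = [[0] * 4 for _ in range(4)]
for k, s in enumerate(vecs, start=1):
    for i in range(4):
        for j in range(4):
            if i != j:                 # diagonal terms kept at zero
                W[i][j] += s[i] * s[j]
    ok = all(tuple(sign(sum(v[i] * W[i][j] for i in range(4)))
                   for j in range(4)) == v for v in vecs[:k])
    print(k, "stored,", "all recalled" if ok else "recall fails")
```

With one, two, or three vectors stored, every stored vector is recalled; after the fourth, W is the zero matrix and recall fails.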
Hopfield Neural network
The Hopfield neural network (HNN) is a model of auto-associative memory.
It is a single-layer neural network with feedback connections.
[Figure: a Hopfield network of units 1, 2, 3, ..., n with outputs v1, v2, ..., vn, external inputs I1, I2, ..., In, and a weighted connection wij between every pair of distinct units.]
The weights are symmetric and there are no self-connections:
wij = wji
wii = 0
Hopfield Neural network
To store a set of binary patterns s(p), the weight matrix W = {w_ij} is given by:
w_ij = sum_p [2 s_i(p) - 1] [2 s_j(p) - 1],  for i != j;  w_ii = 0

To store a set of bipolar patterns s(p), the weight matrix W = {w_ij} is given by:
w_ij = sum_p s_i(p) s_j(p),  for i != j;  w_ii = 0
Hopfield Neural network
Step 0. Initialize weights to store patterns.
While activations of the net are not converged, do Steps 1-7.
Step 1. For each input vector x, do Steps 2-6.
Step 2. Set initial activations of the net equal to the external input vector x:
        y_i = x_i,  i = 1, ..., n
Step 3. Do Steps 4-6 for each unit Y_i. (Units should be updated in random order.)
Step 4. Compute net input:
        y_in_i = x_i + sum_j y_j w_ji
Step 5. Determine activation (output signal), with threshold theta_i (usually 0):
        y_i = 1          if y_in_i > theta_i
        y_i unchanged    if y_in_i = theta_i
        y_i = 0          if y_in_i < theta_i
Step 6. Broadcast the value of y_i to all other units. (This updates the activation vector.)
Step 7. Test for convergence.
Hopfield Neural network
Hopfield Neural network - example
The vector (1, 1, 1, 0) (bipolar equivalent (1, 1, 1, -1)) was stored in a net; the weight matrix is bipolar with zero diagonal:

    [ 0  1  1 -1]
W = [ 1  0  1 -1]
    [ 1  1  0 -1]
    [-1 -1 -1  0]

The input vector is x = (0, 0, 1, 0). For this example the update order is Y1, Y4, Y3, Y2.
Initial activations: y = (0, 0, 1, 0)

Choose unit Y1 to update its activation:
y_in_1 = x_1 + sum_j y_j w_j1 = 0 + 1 = 1;  y_in_1 > 0 -> y_1 = 1;  y = (1, 0, 1, 0)

Choose unit Y4 to update its activation:
y_in_4 = x_4 + sum_j y_j w_j4 = 0 + (-2) = -2;  y_in_4 < 0 -> y_4 = 0;  y = (1, 0, 1, 0)

Choose unit Y3 to update its activation:
y_in_3 = x_3 + sum_j y_j w_j3 = 1 + 1 = 2;  y_in_3 > 0 -> y_3 = 1;  y = (1, 0, 1, 0)

Choose unit Y2 to update its activation:
y_in_2 = x_2 + sum_j y_j w_j2 = 0 + 2 = 2;  y_in_2 > 0 -> y_2 = 1;  y = (1, 1, 1, 0)

Further iterations do not change the activation of any unit. The net has converged to the stored vector (1, 1, 1, 0).
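The asynchronous recall just walked through can be sketched in a few lines (0-based indices; variable names are my own):

```python
# Hopfield recall of (1, 1, 1, 0): binary activations, bipolar weights
# with zero diagonal, asynchronous updates in the order Y1, Y4, Y3, Y2.

W = [[ 0,  1,  1, -1],
     [ 1,  0,  1, -1],
     [ 1,  1,  0, -1],
     [-1, -1, -1,  0]]

x = [0, 0, 1, 0]      # external input: a corrupted copy of (1, 1, 1, 0)
y = list(x)           # initial activations equal the external input

for i in [0, 3, 2, 1]:                          # update order Y1, Y4, Y3, Y2
    y_in = x[i] + sum(y[j] * W[j][i] for j in range(4))
    if y_in > 0:
        y[i] = 1                                # fire
    elif y_in < 0:
        y[i] = 0                                # turn off
    # y_in == 0 would leave y[i] unchanged

print(y)   # [1, 1, 1, 0] -- the stored pattern
```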
Hopfield Neural network - example
Image reconstruction: a 20 x 20 discrete Hopfield network was trained with 20 input patterns, including the "face" image shown on the left and 19 random patterns like the one on the right.
Hopfield Neural network - example
After providing only one quarter of the "face" image as the initial input, the network perfectly reconstructs the image within only two iterations.
Hopfield Neural network - example
Adding noise by flipping each pixel with probability p = 0.3 does not impair the network's performance: after two steps the image is again perfectly reconstructed.
For noise created with p = 0.4, the network is unable to recover the original image. Instead, it converges to one of the 19 random patterns.
Hopfield Neural network - example
The stored data in memory: a set of digit patterns.
Network input: distorted 6 -> perfect recall.
Hopfield Neural network - example
Network input: distorted 9 -> erroneous recall.
Network input: distorted 2 -> perfect recall.
Hopfield network - Energy function
Hopfield nets have a scalar value associated with each state of the network, referred to as the "energy" E of the network, where:

E = -(1/2) sum_{i != j} w_ij v_i v_j - sum_i I_i v_i

Each asynchronous update satisfies delta E <= 0, so a Hopfield network constantly decreases its energy until it settles in a stable (minimum-energy) state.
Hopfield network - example
Establishing connection weights. The fundamental (stored) pattern is B = [1, 1, 1, -1], giving:

    [ 0  1  1 -1]
W = [ 1  0  1 -1]
    [ 1  1  0 -1]
    [-1 -1 -1  0]
Hopfield network - example
Network states and their codes: with four bipolar units there are 2^4 = 16 possible states.
Calculating the energy function for each state, e.g. for state A = [O1, O2, O3, O4] = [1, 1, 1, 1]:
E(A) = -(1/2) sum_{i != j} w_ij O_i O_j = 0
Hopfield network - example
Similarly, we can compute the energy level of the other states. The stored patterns sit at the minimum energy (stable states).
Hopfield network - example
State transition for state J = [-1, -1, 1, -1]:
Updating unit 1: net_1 = sum_j O_j w_j1 = (-1)(1) + (1)(1) + (-1)(-1) = 1 > 0.
As a result, the first component of state J changes from -1 to 1. In other words, state J transits to state G = [1, -1, 1, -1].
Hopfield network - example
Updating unit 2 of state G: net_2 = (1)(1) + (1)(1) + (-1)(-1) = 3 > 0.
As a result, the second component of state G changes from -1 to 1. In other words, state G transits to state B = [1, 1, 1, -1].
As state B is a fundamental (stored) pattern, no further transitions occur.
Storage Capacity of Hopfield Net
P: number of patterns that can be stored and recalled in a net with reasonable accuracy
n: number of neurons in the net
Binary patterns:   P = 0.15 n
Bipolar patterns:  P = n / (2 log2 n)
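A quick arithmetic check of the two estimates for a net of, say, n = 100 neurons (the value of n is only illustrative):

```python
# Hopfield capacity estimates for n neurons.
from math import log2

n = 100
print(0.15 * n)            # 15.0 patterns (binary estimate)
print(n / (2 * log2(n)))   # about 7.5 patterns (bipolar estimate)
```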
Bidirectional associative memory (BAM)
The Hopfield network represents an autoassociative type of memory: it can retrieve a corrupted or incomplete memory, but it cannot associate one memory with a different memory.
Human memory is essentially associative. One thing may remind us of another, and that of another, and so on. We use a chain of mental associations to recover a lost memory. If we forget where we left an umbrella, we try to recall where we last had it, what we were doing, and who we were talking to. We attempt to establish a chain of associations, and thereby to restore the lost memory.
To associate one memory with another, we need a recurrent neural network capable of accepting an input pattern on one set of neurons and producing a related, but different, output pattern on another set of neurons.
Bidirectional associative memory (BAM), first proposed by Bart Kosko, is a heteroassociative network. It associates patterns from one set, set A, to patterns from another set, set B, and vice versa. Like a Hopfield network, the BAM can generalize and also produce correct outputs despite corrupted or incomplete inputs.
Bidirectional associative memory (BAM)
BAM operation
[Figure: (a) Forward direction - input-layer units 1..n with activations x_i(p) feed output-layer units 1..m, producing y_j(p). (b) Backward direction - the output-layer activations y_j(p) feed back to the input layer, producing x_i(p+1).]
The basic idea behind the BAM is to store pattern pairs so that when n-dimensional vector X from set A is presented as input, the BAM recalls m-dimensional vector Y from set B, but when Y is presented as input, the BAM recalls X.
Bidirectional associative memory (BAM)
Bidirectional associative memory (BAM)
To develop the BAM, we need to create a correlation matrix for each pattern pair we want to store. The correlation matrix is the matrix product of the input vector X and the transpose of the output vector, Y^T. The BAM weight matrix is the sum of all correlation matrices:

W = sum_{m=1}^{M} X_m Y_m^T

where M is the number of pattern pairs to be stored in the BAM.
Stability and storage capacity of the BAM
The BAM is unconditionally stable. This means that any set of associations can be learned without risk of instability.
The maximum number of associations to be stored in the BAM should not exceed the number of neurons in the smaller layer.
The more serious problem with the BAM is incorrect convergence. The BAM may not always produce the closest association. In fact, a stable association may be only slightly related to the initial input vector.
BAM- Example
Imagine we wish to store two associations, A1:B1 and A2:B2.
• A1 = (1, 0, 1, 0, 1, 0), B1 = (1, 1, 0, 0)
• A2 = (1, 1, 1, 0, 0, 0), B2 = (1, 0, 1, 0)
These are then transformed into the bipolar forms:
• X1 = (1, -1, 1, -1, 1, -1), Y1 = (1, 1, -1, -1)
• X2 = (1, 1, 1, -1, -1, -1), Y2 = (1, -1, 1, -1)
BAM- Example
The weight matrix is the sum of the two correlation matrices, M = X1^T Y1 + X2^T Y2 (treating the X's and Y's as row vectors):

    [ 2  0  0 -2]
M = [ 0 -2  2  0]
    [ 2  0  0 -2]
    [-2  0  0  2]
    [ 0  2 -2  0]
    [-2  0  0  2]

To retrieve the association A1, we multiply it by M to get (4, 2, -2, -4), which, when run through a threshold, yields (1, 1, 0, 0), which is B1. To find the reverse association, multiply this by the transpose of M.
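The forward retrieval can be reproduced directly from the bipolar pairs (variable names are my own):

```python
# BAM sketch: build M = X1^T Y1 + X2^T Y2 and retrieve B1 from A1.

X = [(1, -1, 1, -1, 1, -1), (1, 1, 1, -1, -1, -1)]   # bipolar inputs
Y = [(1, 1, -1, -1), (1, -1, 1, -1)]                 # bipolar outputs

# Sum of the correlation (outer-product) matrices, one per stored pair.
M = [[sum(x[i] * y[j] for x, y in zip(X, Y)) for j in range(4)]
     for i in range(6)]

A1 = (1, 0, 1, 0, 1, 0)                              # binary input pattern
net = [sum(A1[i] * M[i][j] for i in range(6)) for j in range(4)]
print(net)                                           # [4, 2, -2, -4]
B1 = [1 if v > 0 else 0 for v in net]                # threshold to binary
print(B1)                                            # [1, 1, 0, 0]
```

The reverse association works the same way with the transpose of M.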
Questions