TRANSCRIPT
05/03/2014
1
Introduction to Artificial Neural Networks
EE-588 & EE-589
Assistant Prof. Dr. Turgay IBRIKCI
Room # 305
Wednesdays 9:30-12:00
(322) 338 6868 / 139
eembdersler.wordpress.com
Course Outline
The course is divided into two parts: theory and practice.
1. Theory covers basic topics in neural network theory and its application to supervised, unsupervised and reinforcement learning.
2. Practice deals with the basics of MATLAB and the implementation of NN learning algorithms. We assume that you know MATLAB or will learn it yourself.
Course Grading
Grading the Class:
Project 40%
– Proposal 5% (due 02/04/2014)
– Report 35% (on CD; in paper-publication format)
– Presentation (Week 14; 20 mins)
Final Exam 20% (Week 15; date decided together)
Homework 40% (at least 4 homework assignments)
Full attendance 10% (bonus)
Project
Choose a topic that is related to your research interests and pertains to the course material. Discuss it with your supervisor.
The proposal (due 02/04) should include the following sections:
– the project goal,
– the problems to be studied,
– an overview of current methods,
– the proposed methods,
– the expected results, and
– references
(about 4 pages: single-spaced, font size 12; references are not counted). References should be cited in the proposal. A written final report in the style of a journal article is also required. The final project is due by 21/05. Each student will give a classroom presentation on the final project.
WARNING !!!!!!!!!
IMPORTANT NOTE: We do not accept late materials at all.
We will take your signatures at every class you attend during the course.
What you learn from the course
How to approach a machine learning classification or clustering problem
Basic knowledge of the common linear machine learning algorithms
Basic knowledge of Neural Networks learning algorithms
A good understanding of neural network algorithms
Academic Integrity
All programming is to be done alone!
Do not share code with anyone else in the course (looking at the code counts as sharing)!
Homework submissions will be compared to catch cheaters!
The minimum penalty is a two-letter-grade drop for the course for everyone involved.
Where to go for help
You can discuss your programs with anyone!
Feel free to send me your code and ask for help.
Please put your name and course number in the subject line.
Bring your code to the class.
Contents
Fundamental Concepts of Intelligence
Fundamental Concepts of ANNs.
Basic Models and Learning Rules – Neuron Models
– ANN structures
– Learning
Distributed Representations
Conclusions
Introduction to Artificial Neural Networks
Fundamental Concepts of Intelligence
Philosophers have been trying for over 2000 years to understand and resolve two Big Questions of the Universe: How does a human mind work, and Can non-humans have minds? These questions are still unanswered.
Intelligence is the ability to understand and learn things.
Intelligence is the ability to think and understand instead of doing things by instinct or automatically.
Intelligent machines, or what machines can do
(Essential English Dictionary, Collins, London, 1990)
What is Intelligence?
Intelligence: – “the capacity to learn and solve problems”
(Webster's Dictionary)
– in particular, the ability to solve novel problems
the ability to act rationally
the ability to act like humans
Artificial Intelligence – build and understand intelligent entities or agents
– 2 main approaches: “engineering” versus “cognitive modeling”
We can define intelligence as the ability to learn and understand, to solve problems and to make decisions.
The goal of artificial intelligence (AI) as a science is to make machines do things that would require intelligence if done by humans. Therefore, the answer to the question Can Machines Think? was vitally important to the discipline.
The answer is not a simple “Yes” or “No”.
Intelligent machines
As humans, we all have the ability to learn and understand, to solve problems and to make decisions; however, our abilities are not equal in different areas. Therefore, we should expect that if machines can think, some of them might be smarter than others in some ways.
Intelligent machines
One of the most significant papers on machine intelligence, “Computing Machinery and Intelligence,” was written by the British mathematician Alan Turing over sixty years ago. It still stands up well under the test of time, and Turing's approach remains universal.
Turing did not provide definitions of machines and thinking; he avoided semantic arguments by inventing a game, the Turing Imitation Game.
Turing Imitation Game
The imitation game originally included two phases. In the first phase, the interrogator, a man, and a woman are each placed in separate rooms. The interrogator's objective is to work out who is the man and who is the woman by questioning them. The man attempts to deceive the interrogator into believing that he is the woman, while the woman tries to convince the interrogator that she is the woman.
Turing Imitation Game: Phase 1
In the second phase of the game, the man is replaced by a computer programmed to deceive the interrogator as the man did. It would even be programmed to make mistakes and provide fuzzy answers in the way a human would. If the computer can fool the interrogator as often as the man did, we may say this computer has passed the intelligent behaviour test.
Turing Imitation Game: Phase 2
Turing believed that by the end of the 20th century it would be possible to program a digital computer to play the imitation game. Although modern computers still cannot pass the Turing test, it provides a basis for the verification and validation of knowledge-based systems.
A program's intelligence, in some narrow area of expertise, is evaluated by comparing its performance with the performance of a human expert.
To build an intelligent computer system, we have to capture, organize and use human expert knowledge in some narrow area of expertise.
The Result of the Turing Test
Artificial neural networks are composed of interconnecting artificial neurons (programming constructs that mimic the properties of biological neurons).
What is a Neural Network?
Human Intelligence VS Artificial Intelligence
Human Intelligence
– Intuition, common sense, judgement, creativity, beliefs, etc.
– The ability to demonstrate intelligence by communicating effectively
– Plausible reasoning and critical thinking
Artificial Intelligence
– The ability to simulate human behavior and cognitive processes
– The ability to capture and preserve human expertise
– Fast response; the ability to comprehend large amounts of data quickly
Human Intelligence VS Artificial Intelligence
Human Intelligence
• Humans are fallible
• They have limited knowledge bases
• Serial information processing proceeds very slowly in the brain compared to computers
• Humans are unable to retain large amounts of data in memory
Artificial Intelligence
• No “common sense”
• Cannot readily deal with “mixed” knowledge
• May have high development costs
• Raises legal and ethical concerns
Human Intelligence VS Artificial Intelligence
We achieve more than we know. We know more than we understand. We understand more than we can explain.
(Claude Bernard, 19th-century French scientific philosopher)
Artificial Intelligence VS Conventional Computing
Artificial Intelligence
– AI software uses the techniques of search and pattern matching.
– Programmers design AI software to give the computer only the problem, not the steps necessary to solve it.
Conventional Computing
– Conventional computer software follows a logical series of steps to reach a conclusion.
– Computer programmers originally designed software that accomplished tasks by completing algorithms.
Psychology And Artificial Intelligence
The functionalist approach of AI views the mind as a representational system and psychology as the study of the various computational processes whereby mental representations are constructed, organized, and interpreted.
(Margaret Boden's essays, written between 1982 and 1988)
Artificial Intelligence & Our Society: Why Do We Need AI?
To supplement natural intelligence: we build intelligence into an object so that it can do what we want it to do (for example, robots), thus reducing human labour and human mistakes.
Introduction to Artificial Neural Networks
Fundamental Concepts of ANNs
What is ANN? Why ANN?
ANN = Artificial Neural Network
– To simulate human brain behavior
– A new generation of information processing systems
Applications
– Pattern Matching
– Pattern Recognition
– Associative Memory (Content-Addressable Memory)
– Function Approximation
– Learning
– Optimization
– Vector Quantization
– Data Clustering
– …
Traditional computers are inefficient at these tasks, although their computation speed is faster.
The Configuration of ANNs
An ANN consists of a large number of interconnected processing elements called neurons.
– A human brain consists of ~10^11 neurons of many different types.
How does an ANN work? Through collective behavior.
The Biological Neuron
[Figure: a biological neuron, showing the axon (“tail”) and the synapse, the chemical connection point between neurons. Synaptic connections are either excitatory or inhibitory.]
The Artificial Neuron
[Figure: inputs x1, x2, …, xm are weighted by wi1, wi2, …, wim and combined by an integration function f(·); an activation function a(·) with threshold θi produces the output yi.]
The Artificial Neuron: Perceptron
$f_i = \sum_{j=1}^{m} w_{ij} x_j - \theta_i$
$y_i(t+1) = a(f_i)$, where
$a(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
The Artificial Neuron
The sign of the weight wij determines the type of connection:
– positive: excitatory
– negative: inhibitory
– zero: no connection
The Artificial Neuron
Proposed by McCulloch and Pitts [1943]: the M-P neuron.
What can be done by M-P neurons?
A hard limiter; a binary threshold unit; hyperspace separation.
$f = w_1 x_1 + w_2 x_2 - \theta$
$y = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
[Figure: the line $w_1 x_1 + w_2 x_2 - \theta = 0$ separates the $(x_1, x_2)$ plane into the regions $y = 1$ and $y = 0$.]
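The M-P threshold unit above can be sketched in a few lines. The course tools are MATLAB-based; this plain-Python version, with illustrative names and weights, is only for intuition:

```python
# Sketch of a McCulloch-Pitts threshold unit: y = 1 if w.x - theta >= 0, else 0.
def mp_neuron(x, w, theta):
    """Binary threshold unit: fires iff the weighted input sum reaches theta."""
    f = sum(wj * xj for wj, xj in zip(w, x)) - theta
    return 1 if f >= 0 else 0

# With w = (1, 1) and theta = 1.5 the unit separates the plane along
# x1 + x2 = 1.5, i.e. it computes logical AND on binary inputs:
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mp_neuron(x, (1, 1), 1.5))
```

Only the input (1, 1) lies on the y = 1 side of the separating line, which is exactly the hyperspace separation described above.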
What will ANNs be?
ANN: a neurally inspired mathematical model.
– Consists of a large number of highly interconnected PEs.
– Its connections (weights) hold the knowledge.
– The response of a PE depends only on local information.
– Its collective behavior demonstrates the computational power.
– Has learning, recall, and generalization capability.
Three Basic Entities of ANN Models
Models of Neurons or PEs.
Models of synaptic interconnections and structures.
Training or learning rules.
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
Processing Elements
[Figure: a PE with integration function f(·), activation function a(·), and threshold θi.]
What integration functions may we have? What activation functions may we have?
Extensions of M-P neurons.
Integration Functions
M-P neuron: $f_i = net_i = \sum_{j=1}^{m} w_{ij} x_j - \theta_i$
Quadratic function: $f_i = \sum_{j=1}^{m} w_{ij} x_j^2 - \theta_i$
Spherical function: $f_i = \rho^{-2} \sum_{j=1}^{m} (x_j - w_{ij})^2 - \theta_i$
Polynomial function: $f_i = \sum_{j=1}^{m} \sum_{k=1}^{m} w_{ijk}\, x_j x_k + x_j^{\alpha_j} + x_k^{\alpha_k} - \theta_i$
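The integration functions above translate directly into code. A plain-Python sketch of the first three (the polynomial form, with its three-index weight tensor $w_{ijk}$, is omitted for brevity; `rho` and `theta` are the slide's ρ and θ):

```python
def linear(w, x, theta):
    # M-P neuron: sum_j w_j * x_j - theta
    return sum(wj * xj for wj, xj in zip(w, x)) - theta

def quadratic(w, x, theta):
    # sum_j w_j * x_j**2 - theta
    return sum(wj * xj ** 2 for wj, xj in zip(w, x)) - theta

def spherical(w, x, theta, rho):
    # rho**-2 * sum_j (x_j - w_j)**2 - theta; here w acts as a centre point
    return sum((xj - wj) ** 2 for wj, xj in zip(w, x)) / rho ** 2 - theta
```

Note the change of role: in the spherical function the weights are a centre that the input is compared against, rather than multipliers.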
Activation Functions
M-P neuron (step function):
$a(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
[Plot: a step from 0 to 1 at f = 0.]
Activation Functions
Hard limiter (threshold function):
$a(f) = \mathrm{sgn}(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ -1 & \text{if } f < 0 \end{cases}$
[Plot: a step from −1 to 1 at f = 0.]
Activation Functions
Ramp function:
$a(f) = \begin{cases} 1 & \text{if } f \ge 1 \\ f & \text{if } 0 \le f < 1 \\ 0 & \text{if } f < 0 \end{cases}$
[Plot: a ramp rising linearly from 0 to 1 on 0 ≤ f ≤ 1.]
Activation Functions
Unipolar sigmoid function:
$a(f) = \frac{1}{1 + e^{-\lambda f}}$
[Plot: an S-shaped curve from 0 to 1 over −4 ≤ f ≤ 4.]
Activation Functions
Bipolar sigmoid function:
$a(f) = \frac{2}{1 + e^{-\lambda f}} - 1$
[Plot: an S-shaped curve from −1 to 1 over −4 ≤ f ≤ 4.]
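The five activation functions above, as a plain-Python sketch (λ is the sigmoid steepness; the default value 1.0 is illustrative):

```python
import math

def step(f):                      # M-P step function
    return 1 if f >= 0 else 0

def hard_limiter(f):              # signum / threshold function
    return 1 if f >= 0 else -1

def ramp(f):                      # clipped-linear ramp
    return 1 if f >= 1 else (f if f >= 0 else 0)

def unipolar_sigmoid(f, lam=1.0): # 1 / (1 + e^(-lam*f)), range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * f))

def bipolar_sigmoid(f, lam=1.0):  # 2 / (1 + e^(-lam*f)) - 1, range (-1, 1)
    return 2.0 / (1.0 + math.exp(-lam * f)) - 1.0
```

Both sigmoids pass through the midpoint of their range at f = 0, and approach the step and hard limiter, respectively, as λ grows.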
Example: Activation Surfaces
[Figure: three lines L1, L2, L3 partition the (x, y) plane; a network with inputs x and y feeds three units L1, L2, L3.]
Example: Activation Surfaces
[Figure: the three units implement the lines
L1: x − 1 = 0 (weights (1, 0), threshold θ1 = 1),
L2: y − 1 = 0 (weights (0, 1), threshold θ2 = 1),
L3: x + y − 4 = 0 (weights (1, 1), threshold θ3 = 4).]
Example: Activation Surfaces
[Figure: the binary outputs of L1, L2, L3 form a 3-bit region code; the three lines divide the (x, y) plane into seven regions coded 011, 001, 101, 100, 110, 010 and 111.]
Example: Activation Surfaces
[Figure: a second-layer unit L4 takes the outputs of L1, L2, L3 with weights (1, 1, 1) and threshold θ4 = 2.5; its output is z = 1 inside the region coded 111 and z = 0 elsewhere.]
Example: Activation Surfaces
With the M-P neuron (step function)
$a(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
the output z is a sharp-edged binary surface over the (x, y) plane.
Example: Activation Surfaces
With the unipolar sigmoid function
$a(f) = \frac{1}{1 + e^{-\lambda f}}$
the surface becomes smooth; for λ = 2, 3, 5, 10 it approaches the sharp step surface as λ grows.
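The whole activation-surface example fits in a short two-layer sketch with the step activation, using the weights and thresholds read off the slides (L1–L3 as above, L4 with weights (1, 1, 1) and θ4 = 2.5):

```python
def step(f):
    return 1 if f >= 0 else 0

def z(x, y):
    l1 = step(1 * x + 0 * y - 1)     # L1: fires when x >= 1
    l2 = step(0 * x + 1 * y - 1)     # L2: fires when y >= 1
    l3 = step(1 * x + 1 * y - 4)     # L3: fires when x + y >= 4
    return step(l1 + l2 + l3 - 2.5)  # L4: fires only for region code 111

print(z(3, 3))  # inside the 111 region -> 1
print(z(0, 0))  # outside -> 0
```

Since L4's threshold is 2.5, it fires only when all three first-layer units output 1, carving out the single region coded 111.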
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
ANN Structure (Connections)
Single-Layer Feedforward Networks
[Figure: inputs x1, x2, …, xm are fully connected through weights w11, …, wnm to output units y1, y2, …, yn.]
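The forward pass of such a single-layer network is one loop over output units (plain-Python sketch; rows of `W` are output units, `theta` holds the per-unit thresholds, and `a` is any activation function):

```python
def forward(W, x, theta, a):
    """y_i = a(sum_j w_ij * x_j - theta_i) for each output unit i."""
    return [a(sum(wij * xj for wij, xj in zip(row, x)) - th)
            for row, th in zip(W, theta)]

step = lambda f: 1 if f >= 0 else 0
y = forward([[1, 1], [1, -1]], [1, 0], [0.5, 0.5], step)
print(y)  # [1, 1]
```

Stacking calls to `forward`, feeding one layer's output into the next, gives the multilayer feedforward case.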
Multilayer Feedforward Networks
[Figure: inputs x1, …, xm enter the input layer, pass through one or more hidden layers, and reach the output layer, which produces y1, …, yn.]
Multilayer Feedforward Networks
[Figure: input → analysis → classification output.]
Learning: where does the knowledge come from?
Single Node with Feedback to Itself
[Figure: a single node whose output is fed back to its input through a feedback loop.]
Single-Layer Recurrent Networks
[Figure: a single layer of units with inputs x1, …, xm and outputs y1, …, yn, where the outputs are fed back as inputs.]
Multilayer Recurrent Networks
[Figure: a multilayer network with inputs x1, x2, x3 and outputs y1, y2, y3, with feedback connections between layers.]
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
Learning
Consider an ANN with n neurons, each with m adaptive weights.
Weight matrix:
$W = \begin{bmatrix} \mathbf{w}_1^T \\ \mathbf{w}_2^T \\ \vdots \\ \mathbf{w}_n^T \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nm} \end{bmatrix}$
Learning
The goal is to “learn” the weight matrix. How?
Learning Rules
Supervised learning
Reinforcement learning
Unsupervised learning
Supervised Learning
Learning with a teacher; learning by examples.
Training set:
$T = \{(\mathbf{x}^{(1)}, d^{(1)}), (\mathbf{x}^{(2)}, d^{(2)}), \ldots, (\mathbf{x}^{(k)}, d^{(k)}), \ldots\}$
Supervised Learning
[Figure: the input x feeds the ANN (weights W), which produces the output y; an error-signal generator compares y with the desired output d.]
Reinforcement Learning
Learning with a critic; learning by comments.
Reinforcement Learning
[Figure: the input x feeds the ANN (weights W); a critic-signal generator produces a reinforcement signal from the output y.]
Unsupervised Learning
Self-organizing.
Clustering: form proper clusters by discovering the similarities and dissimilarities among objects.
Unsupervised Learning
[Figure: the input x feeds the ANN (weights W), which produces y directly; there is no teacher or critic signal.]
Some Learning Laws
• Hebb's learning [1949]
• Hopfield learning [1982]
• Least-mean-square (LMS) learning: the delta rules
• … etc.
Hebb’s Learning Law
• Hebb [1949] hypothesized that when an axonal input from neuron A to neuron B repeatedly or persistently causes B to immediately emit a pulse (fire),
• then the efficacy of that axonal input, in terms of its ability to help neuron B fire in the future, is somehow increased.
• Hebb’s learning rule is an unsupervised learning rule.
Hebb’s Learning Law
$r = f_r(\mathbf{w}_i, \mathbf{x}, d_i) = a(\mathbf{w}_i^T \mathbf{x}) = y_i$
$\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t) = \eta\, y_i\, \mathbf{x}(t)$, i.e. $\Delta w_{ij} = \eta\, y_i\, x_j$
Hopfield Learning
– Recurrent: every unit is connected to every other unit.
– Weights connecting units are symmetrical: wij = wji.
– If the weighted sum of a unit's inputs exceeds a threshold, its output is 1; otherwise its output is −1.
– Units update themselves asynchronously as their inputs change.
Hopfield Learning Law
Take the signum function as the neuron’s activation function, i.e.
$v_i = +1$ if $h_i > 0$, and $v_i = -1$ if $h_i < 0$.
Update rule:
$v_i^{(k+1)} = \mathrm{sgn}\!\left(\sum_{j} w_{ij}\, v_j^{(k)} - T_i\right)$
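One asynchronous update of a Hopfield unit, straight from the rule above (a symmetric two-unit toy network; taking sgn(0) = +1 is a convention chosen for this sketch):

```python
def sgn(h):
    return 1 if h >= 0 else -1

def update_unit(i, v, W, T):
    """v_i <- sgn(sum_j w_ij * v_j - T_i), applied to one unit in place."""
    h = sum(W[i][j] * v[j] for j in range(len(v)) if j != i) - T[i]
    v[i] = sgn(h)
    return v

W = [[0, 1], [1, 0]]   # symmetric weights: w01 = w10
v = [1, -1]
update_unit(1, v, W, T=[0, 0])
print(v)  # unit 1 aligns with unit 0: [1, 1]
```

Repeating such single-unit updates in any order drives the state toward a stable pattern, which is what makes the network usable as an associative memory.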
LMS Learning
LMS = Least Mean Square learning. The idea is to minimize the total error, measured over all training examples P, where O is the raw output calculated by the network:
$\mathrm{Distance}_{LMS} = \frac{1}{2} \sum_{P} (T^P - O^P)^2$
We want to minimize the LMS error by gradient descent on the weights:
[Figure: the error E as a function of W, with a step from W(old) to W(new); C is the learning rate.]
$\Delta w_i = C\, (T - O)\, I_i$
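The delta rule above, iterated over a tiny training set (a linear output O = w·I is assumed, and C = 0.1 is an illustrative learning rate):

```python
def lms_epoch(w, samples, C=0.1):
    """One pass of w_i <- w_i + C * (T - O) * I_i over all (inputs I, target T)."""
    for I, T in samples:
        O = sum(wi * Ii for wi, Ii in zip(w, I))   # raw linear output
        w = [wi + C * (T - O) * Ii for wi, Ii in zip(w, I)]
    return w

# learn y = 2x from two examples
w = [0.0]
for _ in range(100):
    w = lms_epoch(w, [([1.0], 2.0), ([2.0], 4.0)])
print(w[0])  # converges close to 2.0
```

Each update moves the weight a step down the error surface, exactly the W(old) → W(new) step in the gradient-descent picture.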
The General Weight Learning Rule
Input: $net_i = \sum_{j=1}^{m-1} w_{ij} x_j - \theta_i$
Output: $y_i = a(net_i)$
[Figure: inputs x1, x2, …, x(m−1) with weights wi1, wi2, …, wi,m−1 and threshold θi produce the output yi.]
The General Weight Learning Rule
We want to learn the weights and the bias (threshold). Let $x_m = -1$ and $w_{im} = \theta_i$; then
$net_i = \sum_{j=1}^{m} w_{ij} x_j$
so the threshold is absorbed as just another weight to be learned.
The General Weight Learning Rule
We want to learn $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{im})^T$. How should $\mathbf{w}_i(t)$ be updated?
The General Weight Learning Rule
[Figure: a learning-signal generator takes the weight vector wi, the input x, and the desired output di, and emits the learning signal r.]
$r = f_r(\mathbf{w}_i, \mathbf{x}, d_i)$
$\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t)$, where $\eta$ is the learning rate.
The General Weight Learning Rule
We want to learn $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{im})^T$, with $r = f_r(\mathbf{w}_i, \mathbf{x}, d_i)$ and $\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t)$.
Discrete-time weight modification rule:
$\mathbf{w}_i^{(t+1)} = \mathbf{w}_i^{(t)} + \eta\, f_r(\mathbf{w}_i^{(t)}, \mathbf{x}^{(t)}, d_i^{(t)})\, \mathbf{x}^{(t)}$
Continuous-time weight modification rule:
$\dfrac{d\mathbf{w}_i(t)}{dt} = \eta\, r\, \mathbf{x}(t)$
Introduction to Artificial Neural Networks
Distributed Representations
Distributed Representations
• Distributed representation: an entity is represented by a pattern of activity distributed over many PEs, and each processing element is involved in representing many different entities.
• Local representation: each entity is represented by one PE.
Example
[Table: three entities (Dog, Cat, Bread), each represented by a distinct pattern of + and − activity over sixteen processing elements P0–P15.]
Advantages
Acts as a content-addressable memory.
[Figure: a partial activity pattern over P0–P15 (“What is this?”) is enough to recall the full stored pattern.]
Advantages
Makes induction easy.
[Figure: a new entity, Fido, receives a pattern similar to Dog’s; the shared pattern supports the inference “Dog has 4 legs? How many for Fido?”]
Advantages
Makes the creation of new entities or concepts easy (without allocation of new hardware).
[Figure: Doughnut is added by changing weights only.]
Advantages
Fault tolerance: the breakdown of a few PEs does not cause problems.
Disadvantages
• How to understand?
• How to modify?
Learning procedures are required.
REFERENCES
All materials were collected from different web pages.
Thanks to the authors of these slides.