TRANSCRIPT
05/03/2014
1
Introduction to Artificial Neural Networks
EE-588 & EE-589
Assistant Prof. Dr. Turgay IBRIKCI
Room # 305
Wednesdays 9:30-12:00
(322) 338 6868 / 139
eembdersler.wordpress.com
Course Outline
The course is divided into two parts: theory and practice.
1. Theory covers basic topics in neural network theory and its application to supervised, unsupervised and reinforcement learning.
2. Practice deals with the basics of MATLAB and the implementation of NN learning algorithms. We assume that you know MATLAB or will learn it yourself.
Course Grading
Grading the Class:
Project 40%
– Proposal 5% (due 02/04/2014)
– Report 35% (on CD; in paper-publication format)
– Presentation (Week 14; 20 mins)
Final Exam 20% (Week 15; date decided together)
Homework 40% (at least 4 homework assignments)
Full attendance 10% (bonus)
Project
Choose a topic that is related to your research interests and pertains to the course material. Discuss it with your supervisor.
The proposal (due 02/04) should include the following sections:
– the project goal,
– the problems to be studied,
– an overview of current methods,
– the proposed methods,
– the expected results, and
– references
(about 4 pages: single-spaced, font size 12; references are not counted). References should be cited in the proposal. A written final report in the style of a journal article is also required. The final project is due by 21/05. Each student will give a classroom presentation on the final project.
WARNING !!!!!!!!!
IMPORTANT NOTE: We do not accept late materials at all.
We will take your signatures at every class you attend during the course.
What you learn from the course
How to approach a machine learning classification or clustering problem
Basic knowledge of the common linear machine learning algorithms
Basic knowledge of Neural Networks learning algorithms
A good understanding of neural network algorithms
Academic Integrity
All programming is to be done alone!
Do not share code with anyone else in the course (looking at the code counts as sharing)!
Homework submissions will be compared to catch cheaters!
The minimum penalty is a two-letter-grade drop for the course for everyone involved.
Where to go for help
You can discuss your programs with anyone!
Feel free to send me your code and ask for help.
Please put your name and course number in the subject line.
Bring your code to the class.
Contents
Fundamental Concepts of Intelligence
Fundamental Concepts of ANNs.
Basic Models and Learning Rules – Neuron Models
– ANN structures
– Learning
Distributed Representations
Conclusions
Introduction to Artificial Neural Networks
Fundamental Concepts of Intelligence
Philosophers have been trying for over 2000 years to understand and resolve two Big Questions of the Universe: How does a human mind work, and Can non-humans have minds? These questions are still unanswered.
Intelligence is the ability to understand and learn things.
Intelligence is the ability to think and understand instead of doing things by instinct or automatically.
Intelligent machines, or what machines can do
(Essential English Dictionary, Collins, London, 1990)
What is Intelligence?
Intelligence: – “the capacity to learn and solve problems”
(Webster's Dictionary)
– in particular, the ability to solve novel problems
the ability to act rationally
the ability to act like humans
Artificial Intelligence – build and understand intelligent entities or agents
– 2 main approaches: “engineering” versus “cognitive modeling”
We can define intelligence as the ability to learn and understand, to solve problems and to make decisions.
The goal of artificial intelligence (AI) as a science is to make machines do things that would require intelligence if done by humans. Therefore, the answer to the question Can Machines Think? was vitally important to the discipline.
The answer is not a simple “Yes” or “No”.
Intelligent machines
As humans, we all have the ability to learn and understand, to solve problems and to make decisions; however, our abilities are not equal in different areas. Therefore, we should expect that if machines can think, some of them might be smarter than others in some ways.
Intelligent machines
One of the most significant papers on machine intelligence, “Computing Machinery and Intelligence,” was written by the British mathematician Alan Turing over sixty years ago. It still stands up well under the test of time, and Turing's approach remains universal.
Turing did not provide definitions of machines and thinking; he avoided semantic arguments by inventing a game, the Turing Imitation Game.
Turing Imitation Game
The imitation game originally included two phases. In the first phase, the interrogator, a man, and a woman are each placed in separate rooms. The interrogator's objective is to work out who is the man and who is the woman by questioning them. The man attempts to deceive the interrogator into believing that he is the woman, while the woman tries to convince the interrogator that she is the woman.
Turing Imitation Game: Phase 1
In the second phase of the game, the man is replaced by a computer programmed to deceive the interrogator as the man did. It would even be programmed to make mistakes and provide fuzzy answers in the way a human would. If the computer can fool the interrogator as often as the man did, we may say this computer has passed the intelligent behaviour test.
Turing Imitation Game: Phase 2
Turing believed that by the end of the 20th century it would be possible to program a digital computer to play the imitation game. Although modern computers still cannot pass the Turing test, it provides a basis for the verification and validation of knowledge-based systems.
A program's intelligence, in some narrow area of expertise, is evaluated by comparing its performance with the performance of a human expert.
To build an intelligent computer system, we have to capture, organize and use human expert knowledge in some narrow area of expertise.
The Result of the Turing Test
Artificial neural networks are composed of interconnecting artificial neurons (programming constructs that mimic the properties of biological neurons).
What is a Neural Network?
Human Intelligence VS Artificial Intelligence
Human Intelligence
– Intuition, common sense, judgement, creativity, beliefs, etc.
– The ability to demonstrate intelligence by communicating effectively
– Plausible reasoning and critical thinking
Artificial Intelligence
– The ability to simulate human behavior and cognitive processes
– The ability to capture and preserve human expertise
– Fast response; the ability to comprehend large amounts of data quickly
Human Intelligence VS Artificial Intelligence
Human Intelligence
• Humans are fallible
• They have limited knowledge bases
• Serial information processing proceeds very slowly in the brain compared to computers
• Humans are unable to retain large amounts of data in memory
Artificial Intelligence
• No “common sense”
• Cannot readily deal with “mixed” knowledge
• May have high development costs
• Raises legal and ethical concerns
Human Intelligence VS Artificial Intelligence
We achieve more than we know. We know more than we understand. We understand more than we can explain.
(Claude Bernard, 19th-century French scientific philosopher)
Artificial Intelligence VS Conventional Computing
Artificial Intelligence
– AI software uses the techniques of search and pattern matching.
– Programmers design AI software to give the computer only the problem, not the steps necessary to solve it.
Conventional Computing
– Conventional computer software follows a logical series of steps to reach a conclusion.
– Computer programmers originally designed software that accomplished tasks by completing algorithms.
Psychology And Artificial Intelligence
The functionalist approach of AI views the mind as a representational system and psychology as the study of the various computational processes whereby mental representations are constructed, organized, and interpreted.
(Margaret Boden's essays, written between 1982 and 1988)
Artificial Intelligence & Our Society: Why Do We Need AI?
To supplement natural intelligence: we build intelligence into an object so that it can do what we want it to do (for example, robots), thus reducing human labour and human mistakes.
Introduction to Artificial Neural Networks
Fundamental Concepts of ANNs
What is ANN? Why ANN?
ANN = Artificial Neural Network
– To simulate human brain behavior
– A new generation of information processing systems
Applications
– Pattern Matching
– Pattern Recognition
– Associative Memory (Content-Addressable Memory)
– Function Approximation
– Learning
– Optimization
– Vector Quantization
– Data Clustering
– …
Traditional computers are inefficient at these tasks, although their computation speed is faster.
The Configuration of ANNs
An ANN consists of a large number of interconnected processing elements called neurons.
– A human brain consists of ~10^11 neurons of many different types.
How does an ANN work? Through collective behavior.
The Biological Neuron
[Figure: a biological neuron, showing the axon (“tail”) and the synapse, the chemical connection point between neurons. Synaptic connections are either excitatory or inhibitory.]
The Artificial Neuron
[Figure: inputs x1, x2, …, xm are weighted by wi1, wi2, …, wim and combined by an integration function f(·); an activation function a(·) with threshold θi produces the output yi.]
The Artificial Neuron: Perceptron
$f_i = \sum_{j=1}^{m} w_{ij} x_j - \theta_i$
$y_i(t+1) = a(f_i)$, where
$a(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
The Artificial Neuron
The sign of the weight wij determines the type of connection:
– positive: excitatory
– negative: inhibitory
– zero: no connection
The Artificial Neuron
Proposed by McCulloch and Pitts [1943]: the M-P neuron.
What can be done by M-P neurons?
A hard limiter; a binary threshold unit; hyperspace separation.
$f = w_1 x_1 + w_2 x_2 - \theta$
$y = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
[Figure: the line $w_1 x_1 + w_2 x_2 - \theta = 0$ separates the $(x_1, x_2)$ plane into the regions $y = 1$ and $y = 0$.]
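The M-P threshold unit above can be sketched in a few lines. The course tools are MATLAB-based; this plain-Python version, with illustrative names and weights, is only for intuition:

```python
# Sketch of a McCulloch-Pitts threshold unit: y = 1 if w.x - theta >= 0, else 0.
def mp_neuron(x, w, theta):
    """Binary threshold unit: fires iff the weighted input sum reaches theta."""
    f = sum(wj * xj for wj, xj in zip(w, x)) - theta
    return 1 if f >= 0 else 0

# With w = (1, 1) and theta = 1.5 the unit separates the plane along
# x1 + x2 = 1.5, i.e. it computes logical AND on binary inputs:
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mp_neuron(x, (1, 1), 1.5))
```

Only the input (1, 1) lies on the y = 1 side of the separating line, which is exactly the hyperspace separation described above.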
What will ANNs be?
ANN: a neurally inspired mathematical model.
– Consists of a large number of highly interconnected PEs.
– Its connections (weights) hold the knowledge.
– The response of a PE depends only on local information.
– Its collective behavior demonstrates the computational power.
– Has learning, recall, and generalization capability.
Three Basic Entities of ANN Models
Models of Neurons or PEs.
Models of synaptic interconnections and structures.
Training or learning rules.
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
Processing Elements
[Figure: a PE with integration function f(·), activation function a(·), and threshold θi.]
What integration functions may we have? What activation functions may we have?
Extensions of M-P neurons.
Integration Functions
M-P neuron: $f_i = net_i = \sum_{j=1}^{m} w_{ij} x_j - \theta_i$
Quadratic function: $f_i = \sum_{j=1}^{m} w_{ij} x_j^2 - \theta_i$
Spherical function: $f_i = \rho^{-2} \sum_{j=1}^{m} (x_j - w_{ij})^2 - \theta_i$
Polynomial function: $f_i = \sum_{j=1}^{m} \sum_{k=1}^{m} w_{ijk}\, x_j x_k + x_j^{\alpha_j} + x_k^{\alpha_k} - \theta_i$
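The integration functions above translate directly into code. A plain-Python sketch of the first three (the polynomial form, with its three-index weight tensor $w_{ijk}$, is omitted for brevity; `rho` and `theta` are the slide's ρ and θ):

```python
def linear(w, x, theta):
    # M-P neuron: sum_j w_j * x_j - theta
    return sum(wj * xj for wj, xj in zip(w, x)) - theta

def quadratic(w, x, theta):
    # sum_j w_j * x_j**2 - theta
    return sum(wj * xj ** 2 for wj, xj in zip(w, x)) - theta

def spherical(w, x, theta, rho):
    # rho**-2 * sum_j (x_j - w_j)**2 - theta; here w acts as a centre point
    return sum((xj - wj) ** 2 for wj, xj in zip(w, x)) / rho ** 2 - theta
```

Note the change of role: in the spherical function the weights are a centre that the input is compared against, rather than multipliers.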
Activation Functions
M-P neuron (step function):
$a(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
[Plot: a step from 0 to 1 at f = 0.]
Activation Functions
Hard limiter (threshold function):
$a(f) = \mathrm{sgn}(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ -1 & \text{if } f < 0 \end{cases}$
[Plot: a step from −1 to 1 at f = 0.]
Activation Functions
Ramp function:
$a(f) = \begin{cases} 1 & \text{if } f \ge 1 \\ f & \text{if } 0 \le f < 1 \\ 0 & \text{if } f < 0 \end{cases}$
[Plot: a ramp rising linearly from 0 to 1 on 0 ≤ f ≤ 1.]
Activation Functions
Unipolar sigmoid function:
$a(f) = \frac{1}{1 + e^{-\lambda f}}$
[Plot: an S-shaped curve from 0 to 1 over −4 ≤ f ≤ 4.]
Activation Functions
Bipolar sigmoid function:
$a(f) = \frac{2}{1 + e^{-\lambda f}} - 1$
[Plot: an S-shaped curve from −1 to 1 over −4 ≤ f ≤ 4.]
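The five activation functions above, as a plain-Python sketch (λ is the sigmoid steepness; the default value 1.0 is illustrative):

```python
import math

def step(f):                      # M-P step function
    return 1 if f >= 0 else 0

def hard_limiter(f):              # signum / threshold function
    return 1 if f >= 0 else -1

def ramp(f):                      # clipped-linear ramp
    return 1 if f >= 1 else (f if f >= 0 else 0)

def unipolar_sigmoid(f, lam=1.0): # 1 / (1 + e^(-lam*f)), range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * f))

def bipolar_sigmoid(f, lam=1.0):  # 2 / (1 + e^(-lam*f)) - 1, range (-1, 1)
    return 2.0 / (1.0 + math.exp(-lam * f)) - 1.0
```

Both sigmoids pass through the midpoint of their range at f = 0, and approach the step and hard limiter, respectively, as λ grows.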
Example: Activation Surfaces
[Figure: three lines L1, L2, L3 partition the (x, y) plane; a network with inputs x and y feeds three units L1, L2, L3.]
Example: Activation Surfaces
[Figure: the three units implement the lines
L1: x − 1 = 0 (weights (1, 0), threshold θ1 = 1),
L2: y − 1 = 0 (weights (0, 1), threshold θ2 = 1),
L3: x + y − 4 = 0 (weights (1, 1), threshold θ3 = 4).]
Example: Activation Surfaces
[Figure: the binary outputs of L1, L2, L3 form a 3-bit region code; the three lines divide the (x, y) plane into seven regions coded 011, 001, 101, 100, 110, 010 and 111.]
Example: Activation Surfaces
[Figure: a second-layer unit L4 takes the outputs of L1, L2, L3 with weights (1, 1, 1) and threshold θ4 = 2.5; its output is z = 1 inside the region coded 111 and z = 0 elsewhere.]
Example: Activation Surfaces
With the M-P neuron (step function)
$a(f) = \begin{cases} 1 & \text{if } f \ge 0 \\ 0 & \text{otherwise} \end{cases}$
the output z is a sharp-edged binary surface over the (x, y) plane.
Example: Activation Surfaces
With the unipolar sigmoid function
$a(f) = \frac{1}{1 + e^{-\lambda f}}$
the surface becomes smooth; for λ = 2, 3, 5, 10 it approaches the sharp step surface as λ grows.
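The whole activation-surface example fits in a short two-layer sketch with the step activation, using the weights and thresholds read off the slides (L1–L3 as above, L4 with weights (1, 1, 1) and θ4 = 2.5):

```python
def step(f):
    return 1 if f >= 0 else 0

def z(x, y):
    l1 = step(1 * x + 0 * y - 1)     # L1: fires when x >= 1
    l2 = step(0 * x + 1 * y - 1)     # L2: fires when y >= 1
    l3 = step(1 * x + 1 * y - 4)     # L3: fires when x + y >= 4
    return step(l1 + l2 + l3 - 2.5)  # L4: fires only for region code 111

print(z(3, 3))  # inside the 111 region -> 1
print(z(0, 0))  # outside -> 0
```

Since L4's threshold is 2.5, it fires only when all three first-layer units output 1, carving out the single region coded 111.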
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
ANN Structure (Connections)
Single-Layer Feedforward Networks
[Figure: inputs x1, x2, …, xm are fully connected through weights w11, …, wnm to output units y1, y2, …, yn.]
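The forward pass of such a single-layer network is one loop over output units (plain-Python sketch; rows of `W` are output units, `theta` holds the per-unit thresholds, and `a` is any activation function):

```python
def forward(W, x, theta, a):
    """y_i = a(sum_j w_ij * x_j - theta_i) for each output unit i."""
    return [a(sum(wij * xj for wij, xj in zip(row, x)) - th)
            for row, th in zip(W, theta)]

step = lambda f: 1 if f >= 0 else 0
y = forward([[1, 1], [1, -1]], [1, 0], [0.5, 0.5], step)
print(y)  # [1, 1]
```

Stacking calls to `forward`, feeding one layer's output into the next, gives the multilayer feedforward case.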
Multilayer Feedforward Networks
[Figure: inputs x1, …, xm enter the input layer, pass through one or more hidden layers, and reach the output layer, which produces y1, …, yn.]
Multilayer Feedforward Networks
[Figure: input → analysis → classification output.]
Learning: where does the knowledge come from?
Single Node with Feedback to Itself
[Figure: a single node whose output is fed back to its input through a feedback loop.]
Single-Layer Recurrent Networks
[Figure: a single layer of units with inputs x1, …, xm and outputs y1, …, yn, where the outputs are fed back as inputs.]
Multilayer Recurrent Networks
[Figure: a multilayer network with inputs x1, x2, x3 and outputs y1, y2, y3, with feedback connections between layers.]
Introduction to Artificial Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
Learning
Consider an ANN with n neurons, each with m adaptive weights.
Weight matrix:
$W = \begin{bmatrix} \mathbf{w}_1^T \\ \mathbf{w}_2^T \\ \vdots \\ \mathbf{w}_n^T \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nm} \end{bmatrix}$
Learning
The goal is to “learn” the weight matrix. How?
Learning Rules
Supervised learning
Reinforcement learning
Unsupervised learning
Supervised Learning
Learning with a teacher; learning by examples.
Training set:
$T = \{(\mathbf{x}^{(1)}, d^{(1)}), (\mathbf{x}^{(2)}, d^{(2)}), \ldots, (\mathbf{x}^{(k)}, d^{(k)}), \ldots\}$
Supervised Learning
[Figure: the input x feeds the ANN (weights W), which produces the output y; an error-signal generator compares y with the desired output d.]
Reinforcement Learning
Learning with a critic; learning by comments.
Reinforcement Learning
[Figure: the input x feeds the ANN (weights W); a critic-signal generator produces a reinforcement signal from the output y.]
Unsupervised Learning
Self-organizing.
Clustering: form proper clusters by discovering the similarities and dissimilarities among objects.
Unsupervised Learning
[Figure: the input x feeds the ANN (weights W), which produces y directly; there is no teacher or critic signal.]
Some Learning Laws
• Hebb's learning [1949]
• Hopfield learning [1982]
• Least-mean-square (LMS) learning: the delta rules
• … etc.
Hebb’s Learning Law
• Hebb [1949] hypothesized that when an axonal input from neuron A to neuron B repeatedly or persistently causes B to immediately emit a pulse (fire),
• then the efficacy of that axonal input, in terms of its ability to help neuron B fire in the future, is somehow increased.
• Hebb’s learning rule is an unsupervised learning rule.
Hebb’s Learning Law
$r = f_r(\mathbf{w}_i, \mathbf{x}, d_i) = a(\mathbf{w}_i^T \mathbf{x}) = y_i$
$\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t) = \eta\, y_i\, \mathbf{x}(t)$, i.e. $\Delta w_{ij} = \eta\, y_i\, x_j$
Hopfield Learning
– Recurrent: every unit is connected to every other unit.
– Weights connecting units are symmetrical: wij = wji.
– If the weighted sum of a unit's inputs exceeds a threshold, its output is 1; otherwise its output is −1.
– Units update themselves asynchronously as their inputs change.
Hopfield Learning Law
Take the signum function as the neuron’s activation function, i.e.
$v_i = +1$ if $h_i > 0$, and $v_i = -1$ if $h_i < 0$.
Update rule:
$v_i^{(k+1)} = \mathrm{sgn}\!\left(\sum_{j} w_{ij}\, v_j^{(k)} - T_i\right)$
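One asynchronous update of a Hopfield unit, straight from the rule above (a symmetric two-unit toy network; taking sgn(0) = +1 is a convention chosen for this sketch):

```python
def sgn(h):
    return 1 if h >= 0 else -1

def update_unit(i, v, W, T):
    """v_i <- sgn(sum_j w_ij * v_j - T_i), applied to one unit in place."""
    h = sum(W[i][j] * v[j] for j in range(len(v)) if j != i) - T[i]
    v[i] = sgn(h)
    return v

W = [[0, 1], [1, 0]]   # symmetric weights: w01 = w10
v = [1, -1]
update_unit(1, v, W, T=[0, 0])
print(v)  # unit 1 aligns with unit 0: [1, 1]
```

Repeating such single-unit updates in any order drives the state toward a stable pattern, which is what makes the network usable as an associative memory.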
LMS Learning
LMS = Least Mean Square learning. The idea is to minimize the total error, measured over all training examples P, where O is the raw output calculated by the network:
$\mathrm{Distance}_{LMS} = \frac{1}{2} \sum_{P} (T^P - O^P)^2$
We want to minimize the LMS error by gradient descent on the weights:
[Figure: the error E as a function of W, with a step from W(old) to W(new); C is the learning rate.]
$\Delta w_i = C\, (T - O)\, I_i$
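The delta rule above, iterated over a tiny training set (a linear output O = w·I is assumed, and C = 0.1 is an illustrative learning rate):

```python
def lms_epoch(w, samples, C=0.1):
    """One pass of w_i <- w_i + C * (T - O) * I_i over all (inputs I, target T)."""
    for I, T in samples:
        O = sum(wi * Ii for wi, Ii in zip(w, I))   # raw linear output
        w = [wi + C * (T - O) * Ii for wi, Ii in zip(w, I)]
    return w

# learn y = 2x from two examples
w = [0.0]
for _ in range(100):
    w = lms_epoch(w, [([1.0], 2.0), ([2.0], 4.0)])
print(w[0])  # converges close to 2.0
```

Each update moves the weight a step down the error surface, exactly the W(old) → W(new) step in the gradient-descent picture.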
The General Weight Learning Rule
Input: $net_i = \sum_{j=1}^{m-1} w_{ij} x_j - \theta_i$
Output: $y_i = a(net_i)$
[Figure: inputs x1, x2, …, x(m−1) with weights wi1, wi2, …, wi,m−1 and threshold θi produce the output yi.]
The General Weight Learning Rule
We want to learn the weights and the bias (threshold). Let $x_m = -1$ and $w_{im} = \theta_i$; then
$net_i = \sum_{j=1}^{m} w_{ij} x_j$
so the threshold is absorbed as just another weight to be learned.
The General Weight Learning Rule
We want to learn $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{im})^T$. How should $\mathbf{w}_i(t)$ be updated?
The General Weight Learning Rule
[Figure: a learning-signal generator takes the weight vector wi, the input x, and the desired output di, and emits the learning signal r.]
$r = f_r(\mathbf{w}_i, \mathbf{x}, d_i)$
$\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t)$, where $\eta$ is the learning rate.
The General Weight Learning Rule
We want to learn $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{im})^T$, with $r = f_r(\mathbf{w}_i, \mathbf{x}, d_i)$ and $\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t)$.
Discrete-time weight modification rule:
$\mathbf{w}_i^{(t+1)} = \mathbf{w}_i^{(t)} + \eta\, f_r(\mathbf{w}_i^{(t)}, \mathbf{x}^{(t)}, d_i^{(t)})\, \mathbf{x}^{(t)}$
Continuous-time weight modification rule:
$\dfrac{d\mathbf{w}_i(t)}{dt} = \eta\, r\, \mathbf{x}(t)$
Introduction to Artificial Neural Networks
Distributed Representations
Distributed Representations
• Distributed representation: an entity is represented by a pattern of activity distributed over many PEs, and each processing element is involved in representing many different entities.
• Local representation: each entity is represented by one PE.
Example
[Table: three entities (Dog, Cat, Bread), each represented by a distinct pattern of + and − activity over sixteen processing elements P0–P15.]
Advantages
Acts as a content-addressable memory.
[Figure: a partial activity pattern over P0–P15 (“What is this?”) is enough to recall the full stored pattern.]
Advantages
Makes induction easy.
[Figure: a new entity, Fido, receives a pattern similar to Dog’s; the shared pattern supports the inference “Dog has 4 legs? How many for Fido?”]
Advantages
Makes the creation of new entities or concepts easy (without allocation of new hardware).
[Figure: Doughnut is added by changing weights only.]
Advantages
Fault tolerance: the breakdown of a few PEs does not cause problems.
Disadvantages
• How to understand?
• How to modify?
Learning procedures are required.
REFERENCES
All materials were collected from different web pages.
Thanks to the authors of these slides.