Artificial Intelligence CSC 361 Prof. Mohamed Batouche Computer Science Department CCIS – King Saud University Riyadh, Saudi Arabia mbatouche@ccis.ksu.edu.sa


Page 1:

Artificial Intelligence CSC 361

Prof. Mohamed Batouche

Computer Science Department, CCIS – King Saud University

Riyadh, Saudi Arabia

mbatouche@ccis.ksu.edu.sa

Page 2:

Intelligent Systems

Part II: Neural Nets

Page 3:

Developing Intelligent Program Systems

Machine Learning : Neural Nets

• Artificial Neural Networks are crude attempts to model the massively parallel and distributed processing we believe takes place in the brain.

• Two main areas of activity:

• Biological: try to model biological neural systems.

• Computational: develop powerful applications.

Page 4:

Developing Intelligent Program Systems

Machine Learning : Neural Nets

Neural nets can be used to answer the following:

• Pattern recognition: Does that image contain a face?

• Classification problems: Is this cell defective?

• Prediction: given these symptoms, does the patient have disease X?

• Forecasting: predicting the behavior of the stock market

• Handwriting recognition: which character is this?

• Optimization: Find the shortest path for the TSP.

Page 5:

Developing Intelligent Program Systems

Machine Learning : Neural Nets

Strengths and Weaknesses of ANNs

• Examples may be described by a large number of attributes (e.g., pixels in an image).

• Data may contain errors.

• The time for training may be extremely long.

• Evaluating the network for a new example is relatively fast.

• Interpretability of the final hypothesis is not relevant (the NN is treated as a black box).

Page 6:

Artificial Neural Networks

Biological Neuron

Page 7:

The Neuron

• The neuron receives nerve impulses through its dendrites. It then sends the nerve impulses through its axon to the terminal buttons, where neurotransmitters are released to stimulate other neurons.

Page 8:

The neuron

• The main components of a neuron are:

• Cell body or soma which contains the nucleus

• The dendrites

• The axon

• The synapses

Page 9:

The neuron - dendrites

• The dendrites are short fibers (surrounding the cell body) that receive messages

• The dendrites are very receptive to connections from other neurons.

• The dendrites carry signals from the synapses to the soma.

Page 10:

The neuron - axon

• The axon is a long extension from the soma that transmits messages

• Each neuron has only one axon.

• The axon carries action potentials from the soma to the synapses.

Page 11:

The neuron - synapses

• The synapses are the connections made by an axon to another neuron. They are tiny gaps between axons and dendrites (with chemical bridges) that transmit messages

• A synapse is called excitatory if it raises the local membrane potential of the postsynaptic cell.

• It is called inhibitory if it lowers the potential.

Page 12:

Artificial Neural Networks

History of ANNs

Page 13:

History of Artificial Neural Networks

• 1943: McCulloch and Pitts proposed a model of a neuron --> Perceptron

• 1960s: Widrow and Hoff explored Perceptron networks (which they called “Adalines”) and the delta rule.

• 1962: Rosenblatt proved the convergence of the perceptron training rule.

• 1969: Minsky and Papert showed that the Perceptron cannot deal with nonlinearly-separable data sets, even those that represent simple functions such as XOR.

• 1970-1985: Very little research on Neural Nets

• 1986: Invention of Backpropagation [Rumelhart and McClelland, but also Parker and, earlier, Werbos], which can learn from nonlinearly-separable data sets.

• Since 1985: A lot of research in Neural Nets

Page 14:

Artificial Neural Networks

Artificial Neurons

Page 15:

Artificial Neuron

• Incoming signals to a unit are combined by summing their weighted values

• Output function: Activation functions include Step function, Linear function, Sigmoid function, …

[Figure: an artificial neuron. Inputs x1, …, xp with weights w1, …, wp, plus a constant input 1 with bias weight w0, are summed and passed through an activation function f, so that Output = f(w0 + w1·x1 + … + wp·xp).]
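As a concrete reading of this diagram, here is a minimal Python sketch (added for illustration, not part of the original slides) of such a unit: the inputs are weighted, summed together with the bias, and passed through an activation function. The names neuron_output and step are illustrative choices.

def step(x):
    # threshold activation: 1 if x >= 0, else 0
    return 1 if x >= 0 else 0

def neuron_output(inputs, weights, bias_weight, activation=step):
    # weighted sum w0 + w1*x1 + ... + wp*xp, passed through the activation f
    s = bias_weight + sum(w * x for w, x in zip(weights, inputs))
    return activation(s)

# Example: two inputs with weights 1 and 1 and bias weight -1.5 (behaves like AND)
print(neuron_output([1, 1], [1, 1], -1.5))   # prints 1
print(neuron_output([1, 0], [1, 1], -1.5))   # prints 0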

Page 16: Artificial Intelligence CSC 361 Prof. Mohamed Batouche Computer Science Department CCIS – King Saud University Riyadh, Saudi Arabia mbatouche@ccis.ksu.edu.sa

Activation functions

Step function, Sign function, Sigmoid (logistic) function, Linear function:

step(x) = 1 if x >= threshold, 0 if x < threshold (in the picture, threshold = 0)

sign(x) = +1 if x >= 0, -1 if x < 0

sigmoid(x) = 1 / (1 + e^-x)

linear(x) = x

Adding an extra input with activation a0 = -1 and weight W0,j = t (called the bias weight) is equivalent to having a threshold at t. This way we can always assume a 0 threshold.
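The activation functions above, and the bias-weight trick, can be written out directly. The following Python sketch is an illustration added here, not part of the original slides.

import math

def step(x, threshold=0.0):
    # 1 if x >= threshold, else 0
    return 1 if x >= threshold else 0

def sign(x):
    # +1 if x >= 0, else -1
    return 1 if x >= 0 else -1

def sigmoid(x):
    # logistic function 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    return x

# Bias trick: a unit with threshold t behaves exactly like a unit with
# threshold 0 that has an extra input a0 = -1 whose weight is t.
t = 1.5
x, w = [1, 1], [1, 1]
with_threshold = step(sum(wi * xi for wi, xi in zip(w, x)), threshold=t)
with_bias_input = step(sum(wi * xi for wi, xi in zip(w + [t], x + [-1])))
assert with_threshold == with_bias_input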

Page 17:

Real vs. Artificial Neurons

[Figure: a biological neuron (dendrites, cell body, axon, synapse) shown next to an artificial threshold unit with inputs x0, …, xn, weights w0, …, wn, and output oi.]

Threshold units:

oi = 1 if Σ (i = 0 to n) wi·xi > 0, and 0 otherwise

Page 18:

Neurons as Universal computing machine

• In 1943, McCulloch and Pitts showed that a synchronous assembly of such neurons is a universal computing machine. That is, any Boolean function can be implemented with threshold (step function) units.

Page 19:

Implementing AND

x1 and x2 feed a threshold unit with weights 1 and 1; a constant input -1 has weight W = 1.5.

o(x1, x2) = 1 if -1.5 + x1 + x2 > 0, = 0 otherwise

Page 20:

Implementing OR

x1 and x2 feed a threshold unit with weights 1 and 1; a constant input -1 has weight W = 0.5.

o(x1, x2) = 1 if -0.5 + x1 + x2 > 0, = 0 otherwise

Page 21:

Implementing NOT

x1 feeds a threshold unit with weight -1; a constant input -1 has weight W = -0.5.

o(x1) = 1 if 0.5 - x1 > 0, = 0 otherwise

Page 22:

Implementing more complex Boolean functions

[Figure: a two-layer network. A first threshold unit with inputs x1 and x2 (weights 1 and 1) and a constant input -1 with weight 0.5 computes x1 or x2. Its output, together with x3 (both with weight 1) and a constant input -1 with weight 1.5, feeds a second unit that computes (x1 or x2) and x3.]
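To make these McCulloch-Pitts constructions concrete, here is a small Python sketch (added for illustration, not from the original slides) that implements the AND, OR and NOT units with the weights shown above and composes them into the two-layer network computing (x1 or x2) and x3.

def threshold_unit(inputs, weights):
    # McCulloch-Pitts style unit: 1 if the weighted sum is positive, else 0
    return 1 if sum(w * x for w, x in zip(weights, inputs)) > 0 else 0

# Each gate includes a constant input -1 whose weight plays the role of the threshold.
def AND(x1, x2):
    return threshold_unit([x1, x2, -1], [1, 1, 1.5])

def OR(x1, x2):
    return threshold_unit([x1, x2, -1], [1, 1, 0.5])

def NOT(x1):
    return threshold_unit([x1, -1], [-1, -0.5])

def or_then_and(x1, x2, x3):
    # Two-layer network from the slide above: (x1 or x2) and x3
    return AND(OR(x1, x2), x3)

for x1 in (0, 1):
    for x2 in (0, 1):
        for x3 in (0, 1):
            assert or_then_and(x1, x2, x3) == (1 if (x1 or x2) and x3 else 0)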

Page 23:

Artificial Neural Networks

• When using ANN, we have to define:

• Artificial Neuron Model

• ANN Architecture

• Learning mode

Page 24:

Artificial Neural Networks

ANN Architecture

Page 25:

ANN Architecture

• Feedforward: Links are unidirectional, and there are no cycles, i.e., the network is a directed acyclic graph (DAG). Units are arranged in layers, and each unit is linked only to units in the next layer. There is no internal state other than the weights.

• Recurrent: Links can form arbitrary topologies, which can implement memory. Behavior can become unstable, oscillatory, or chaotic.
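As an illustration of the feedforward case, the following Python sketch (not from the original slides) computes one forward pass through a small layered network; the layer sizes, sigmoid activation and random weights are assumptions made only for the example.

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # One fully connected layer: each unit applies the activation to its weighted sum
    return [sigmoid(b + sum(w * x for w, x in zip(ws, inputs)))
            for ws, b in zip(weights, biases)]

# Illustrative 3-input, 2-hidden-unit, 1-output network with random weights
random.seed(0)
W_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
b_hidden = [0.0, 0.0]
W_output = [[random.uniform(-1, 1) for _ in range(2)]]
b_output = [0.0]

x = [0.5, -1.0, 2.0]                  # input layer
h = layer(x, W_hidden, b_hidden)      # hidden layer
y = layer(h, W_output, b_output)      # output layer
print(y)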

Page 26:

Artificial Neural Network: Feedforward Network

[Figure: two feedforward networks, each with an input layer, hidden layers, and an output layer; one is fully connected, the other sparsely connected.]

Page 27:

Artificial Neural Network: Feedforward Architecture

• Information flow unidirectional

• Multi-Layer Perceptron (MLP)

• Radial Basis Function (RBF)

• Kohonen Self-Organising Map (SOM)

Page 28:

Artificial Neural Network: Recurrent Architecture

• Feedback connections

• Hopfield Neural Networks: Associative memory

• Adaptive Resonance Theory (ART)

Page 29:

Artificial Neural Network: Learning paradigms

• Supervised learning:

• Teacher presents the ANN with input-output pairs

• ANN weights are adjusted according to the error (a short training sketch follows this list)

• Classification

• Control

• Function approximation

• Associative memory

• Unsupervised learning:

• no teacher

• Clustering
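As a minimal illustration of supervised learning ("weights adjusted according to error"), here is a Python sketch of the perceptron training rule mentioned on the history slide. It is an added example, not part of the original slides; the learning rate and epoch count are arbitrary choices.

def step(x):
    return 1 if x >= 0 else 0

def train_perceptron(examples, n_inputs, rate=0.1, epochs=20):
    # Supervised learning: adjust each weight in proportion to the output error
    w = [0.0] * (n_inputs + 1)   # the last weight is the bias (its input is fixed at -1)
    for _ in range(epochs):
        for inputs, target in examples:
            x = list(inputs) + [-1]
            output = step(sum(wi * xi for wi, xi in zip(w, x)))
            error = target - output
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
    return w

# The teacher presents input-output pairs, here for the OR function
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(data, n_inputs=2))   # weights of a unit that reproduces OR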

Page 30:

ANN capabilities

• Learning

• Approximate reasoning

• Generalisation capability

• Noise filtering

• Parallel processing

• Distributed knowledge base

• Fault tolerance

Page 31:

Main Problems with ANN

• Contrary to expert systems, with ANNs the knowledge base is not transparent (black box)

• Learning sometimes difficult/slow

• Limited storage capability

Page 32:

Some applications of ANNs

• Pronunciation: the NETtalk program (Sejnowski & Rosenberg, 1987) is a neural network that learns to pronounce written text: it maps character strings to phonemes (basic sound elements) in order to learn speech from text

• Speech recognition

• Handwritten character recognition: a network designed to read zip codes on hand-addressed envelopes

• ALVINN (Pomerleau) is a neural network used to control a vehicle's steering direction so as to follow the road by staying in the middle of its lane

• Face recognition

• Backgammon learning program

• Forecasting, e.g., predicting the behavior of the stock market

Page 33:

When to use ANNs?

• Input is high-dimensional, discrete or real-valued (e.g., raw sensor input).

• Inputs can be highly correlated or independent.

• Output is discrete or real-valued

• Output is a vector of values

• Data may be noisy or contain errors

• Form of target function is unknown

• Long training times are acceptable

• Fast evaluation of target function is required

• Human readability of learned target function is unimportant

⇒ The ANN is treated much like a black box