Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNs


Page 1: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

CS407 Neural Computation

Lecture 2: Neurobiology and Architectures of ANNs

Lecturer: A/Prof. M. Bennamoun

Page 2: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

NERVOUS SYSTEM & HUMAN BRAIN

Page 3: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Organization of the nervous system

Page 4: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Central Nervous System

• Spinal cord
• Brain
  – Hindbrain & Midbrain: brain stem & cerebellum
  – Forebrain
    • Sub-cortical structures: thalamus, hypothalamus, limbic system
    • Cortex: frontal, parietal, occipital & temporal lobes in left & right hemispheres

Page 5: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS
Page 6: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

THE BIOLOGICAL NEURON

Page 7: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

The Structure of Neurons

[Figure: two connected neurons, each labelled with axon, cell body, synapse, nucleus and dendrites]

Page 8: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

The Structure of Neurons

A neuron has a cell body, a branching input structure (the dendrIte) and a branching output structure (the axOn).

• Axons connect to dendrites via synapses.
• Electro-chemical signals are propagated from the dendritic input, through the cell body, and down the axon to other neurons.

Page 9: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

The Structure of Neurons

• A neuron only fires if its input signal exceeds a certain amount (the threshold) within a short time period.

• Synapses vary in strength:
– Good connections allow a large signal.
– Slight connections allow only a weak signal.
– Synapses can be either excitatory or inhibitory.

Page 10: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neurotransmission

http://www.health.org/pubs/qdocs/mom/TG/intro.htm

Page 11: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neurons come in many shapes & sizes

Page 12: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

The brain’s plasticity: the ability of the brain to alter its neural pathways.

• Recovery from brain damage
– Dead neurons are not replaced, but branches of the axons of healthy neurons can grow into the pathways and take over the functions of damaged neurons.
– Equipotentiality: more than one area of the brain may be able to control a given function.
– The younger the person, the better the recovery (e.g. recovery from left hemispherectomy).

Page 13: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

THE ARTIFICIAL NEURON: Model

Page 14: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Models of a Neuron

A neuron is an information-processing unit comprising:
• A set of synapses or connecting links – each characterized by a weight or strength.
• An adder – sums the input signals, weighted by the synapses (a linear combiner).
• An activation function – also called a squashing function, since it squashes (limits) the output to some finite value.

Page 15: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Nonlinear model of a neuron (I)

[Figure: input signals x1 … xm pass through synaptic weights wk1 … wkm into a summing junction Σ with bias bk; the activation function φ(·) turns vk into the output yk]

v_k = \sum_{j=1}^{m} w_{kj} x_j + b_k

y_k = \varphi(v_k)
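To make the model concrete, here is a minimal Python sketch of these two equations (the function name and the choice of a logistic φ are illustrative, not from the slides):

```python
import math

def neuron(x, w, b):
    """Nonlinear model of a neuron: v_k = sum_j w_kj * x_j + b_k, y_k = phi(v_k)."""
    v = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b   # summing junction with bias
    return 1.0 / (1.0 + math.exp(-v))                  # phi: a logistic sigmoid

# Example: three inputs with arbitrary weights and bias
print(neuron(x=[0.5, -1.0, 2.0], w=[0.4, 0.1, 0.7], b=-0.5))
```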

Page 16: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Analogy

• Inputs represent synapses
• Weights represent the strengths of the synaptic links
• wi represents dendrite secretion
• The summation block represents the addition of the secretions
• The output represents the axon voltage

Page 17: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Nonlinear model of a neuron (II)

[Figure: as in model (I), but the bias is absorbed into the weights by a fixed input x0 = +1 with weight wk0 = bk]

v_k = \sum_{j=0}^{m} w_{kj} x_j, \quad \text{with } x_0 = +1 \text{ and } w_{k0} = b_k \text{ (bias)}

y_k = \varphi(v_k)

Page 18: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

THE ARTIFICIAL NEURON: Activation Function

Page 19: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Types of Activation Function

[Figure: three plots of Oj against ini — a step at the threshold t, a ramp clipped at +1, and an 'S'-shaped curve]

• The hard-limiting threshold function – corresponds to the biological paradigm: the neuron either fires or it does not.

O = \begin{cases} 1 & in > t \\ 0 & in \le t \end{cases}

• The piecewise-linear function.

• The sigmoid function (differentiable, 'S'-shaped):

\varphi(v) = \frac{1}{1 + \exp(-av)}, \quad \text{where } a \text{ is the slope parameter}
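As a hedged sketch, the three function types above can be written in Python as follows (the piecewise-linear form shown is one common choice, clipping a unit-slope ramp to [0, 1]):

```python
import math

def hard_limit(v, t=0.0):
    """Hard-limiting threshold: output 1 only when the input exceeds t."""
    return 1 if v > t else 0

def piecewise_linear(v):
    """Unit-slope ramp through (0, 0.5), clipped to the range [0, 1]."""
    return max(0.0, min(1.0, v + 0.5))

def sigmoid(v, a=1.0):
    """Differentiable 'S'-shaped curve; a is the slope parameter."""
    return 1.0 / (1.0 + math.exp(-a * v))
```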

Page 20: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Activation Functions...

• Threshold or step function (McCulloch & Pitts model).
• Linear: neurons using a linear activation function are called ADALINEs in the literature (Widrow 1960).
• Sigmoidal functions: functions which more closely describe the non-linear behaviour of biological neurons.

Page 21: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Activation Functions... sigmoid

\varphi_\beta(v) = \frac{1}{1 + \exp(-\beta v)}

[Figure: sigmoid curves of increasing slope β, all passing through 1/2 at v = 0]

i) if v \to \infty then \varphi_\beta(v) \to 1
ii) if v \to -\infty then \varphi_\beta(v) \to 0
iii) if \beta \to \infty (v fixed) then \varphi_\beta(v) \to 1(v)

where 1(v) is the modified Heaviside function.

Page 22: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Activation Functions... sigmoid

1_H(v) is the Heaviside function:

1_H(v) = \begin{cases} 1 & v \ge 0 \\ 0 & v < 0 \end{cases}

1(v) is the modified Heaviside function:

1(v) = \begin{cases} 1 & v > 0 \\ 1/2 & v = 0 \\ 0 & v < 0 \end{cases}

[Figure: step plots of 1_H(v) and 1(v); the latter takes the value 1/2 at v = 0]

Page 23: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Activation Function value range

[Figure: the signum function steps from −1 to +1 at v = 0; the hyperbolic tangent is its smooth counterpart]

• Signum function (sign): \varphi(v) = \begin{cases} +1 & v > 0 \\ -1 & v < 0 \end{cases}

• Hyperbolic tangent function: \varphi(v) = \tanh(v)

Page 24: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Stochastic Model of a Neuron

• So far we have introduced only deterministic models of ANNs.
• A stochastic (probabilistic) model can also be defined.
• If x denotes the state of a neuron, then P(v) denotes the probability of firing the neuron, where v is the induced activation potential (bias + linear combination):

P(v) = \frac{1}{1 + e^{-v/T}}

Page 25: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Stochastic Model of a Neuron…

• where T is a pseudo-temperature used to control the noise level (and therefore the uncertainty in firing).
• As T → 0, the stochastic model reduces to the deterministic model:

x = \begin{cases} +1 & v \ge 0 \\ -1 & v < 0 \end{cases}
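A small sketch of this stochastic firing rule, assuming the logistic P(v) above (function names are illustrative):

```python
import math
import random

def stochastic_state(v, T):
    """Return +1 with probability P(v) = 1/(1 + exp(-v/T)), else -1.
    As T -> 0 this reduces to the deterministic rule x = +1 iff v >= 0."""
    if T <= 0:
        return 1 if v >= 0 else -1          # deterministic limit
    p = 1.0 / (1.0 + math.exp(-v / T))
    return 1 if random.random() < p else -1

# Higher pseudo-temperature -> noisier firing for the same potential v
print([stochastic_state(0.5, T=2.0) for _ in range(10)])
```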

Page 26: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

DECISION BOUNDARIES

Page 27: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Decision boundaries

• In simple cases, divide the feature space by drawing a hyperplane across it.
• Known as a decision boundary.
• Discriminant function: returns different values on opposite sides (a straight line in two dimensions).
• Problems which can be classified in this way are linearly separable.

Page 28: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

E.g. Decision Surface of a Perceptron

[Figure: left — points in the (x1, x2) plane that a straight line separates into + and − regions (linearly separable); right — interleaved + and − points that no straight line can separate (non-linearly separable)]

• A perceptron is able to represent some useful functions.
• AND(x1, x2): choose weights w0 = −1.5, w1 = 1, w2 = 1.
• But functions that are not linearly separable (e.g. XOR) are not representable. (A quick check of the AND weights follows below.)
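A minimal Python check of the AND weights quoted above, assuming a fixed input x0 = +1 carrying the weight w0:

```python
def perceptron_and(x1, x2, w0=-1.5, w1=1.0, w2=1.0):
    """Fires (returns 1) iff w0*1 + w1*x1 + w2*x2 > 0."""
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', perceptron_and(x1, x2))   # 1 only for (1, 1)
```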

Page 29: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Linear Separability

[Figure: class-A points A(x1,y1) … A(x7,y7) and class-B points B(x8,y8) … B(x11,y11) in the (X1, X2) plane, separated by a decision boundary given by the line below]

x_2 = -\frac{w_1}{w_2} x_1 + \frac{t}{w_2}

Page 30: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Rugby players & Ballet dancers

[Figure: players plotted by weight (kg, axis from 50 to 120) and height (m, axis from 1 to 2); the "Rugby?" and "Ballet?" groups form separate clusters that a straight line can divide]

Page 31: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Training the neuron

[Figure: a threshold unit with inputs x1, x2 and a bias input x0 = −1 whose weight is w0 = t; the weights w1, w2 and the threshold t are still to be determined]

f(v) = \begin{cases} 1 & v > 0 \\ 0 & v = 0 \\ -1 & v < 0 \end{cases}

Setting w_0 x_0 + w_1 x_1 + w_2 x_2 = 0 with x_0 = -1 and w_0 = t gives the decision boundary w_1 x_1 + w_2 x_2 = t.

It is clear that:

(x, y) \in A \text{ iff } w_1 x_1 + w_2 x_2 > t
(x, y) \in B \text{ iff } w_1 x_1 + w_2 x_2 < t

Finding the w_i is called learning.

Page 32: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

THE ARTIFICIAL NEURON: Learning

Page 33: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Supervised Learning
– The desired response of the system is provided by a teacher, e.g., the distance ρ[d,o] as an error measure.
– Estimate the negative error-gradient direction and reduce the error accordingly.
– Modify the synaptic weights to reduce the error: a stochastic minimization of the error in multidimensional weight space.

Page 34: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Unsupervised Learning (Learning without a teacher)

– The desired response is unknown, so no explicit error information can be used to improve network behaviour, e.g. finding the cluster boundaries of input patterns.
– Suitable weight self-adaptation mechanisms have to be embedded in the trained network.

Page 35: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Training

A linear threshold unit is used. w – weight value; t – threshold value.

Output = \begin{cases} 1 & \text{if } \sum_{i=0} w_i x_i > t \\ 0 & \text{otherwise} \end{cases}

Page 36: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Simple network

[Figure: a threshold unit with t = 0.0 and three inputs — a bias input of −1 with weight W1 = 1.5, input X with weight W2 = 1, and input Y with weight W3 = 1]

Output = \begin{cases} 1 & \text{if } \sum_i w_i x_i > t \\ 0 & \text{otherwise} \end{cases}

AND with a biased input.

Page 37: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Learning algorithm

While epoch produces an error
    Present network with next inputs from epoch
    Error = T – O
    If Error <> 0 Then
        Wj = Wj + LR * Ij * Error
    End If
End While

Page 38: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Learning algorithm

Epoch : Presentation of the entire training set to the neural network. In the case of the AND function an epoch consists of four sets of inputs being presented to the network (i.e. [0,0], [0,1], [1,0], [1,1])

Error: The error value is the amount by which the value output by the network differs from the target value. For example, if we required the network to output 0 and it output a 1, then Error = -1

Page 39: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Learning algorithm

Target Value, T : When we are training a network we not only present it with the input but also with a value that we require the network to produce. For example, if we present the network with [1,1] for the AND function the target value will be 1

Output , O : The output value from the neuron

Ij : Inputs being presented to the neuron

Wj : Weight from input neuron (Ij) to the output neuron

LR : The learning rate. This dictates how quickly the network converges. It is set by experimentation, typically to 0.1.
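Putting the algorithm of page 37 and these definitions together, a compact Python sketch (the epoch is the AND training set; the initial weights are random, as the next slide suggests):

```python
import random

def train_and(LR=0.1, t=0.0, seed=0):
    """Perceptron rule from the slides: Wj = Wj + LR * Ij * Error."""
    random.seed(seed)
    W = [random.uniform(-0.5, 0.5) for _ in range(3)]    # random initial weights
    epoch = [([-1, 0, 0], 0), ([-1, 0, 1], 0),           # inputs [bias, A, B], target T
             ([-1, 1, 0], 0), ([-1, 1, 1], 1)]
    while True:                                          # while epoch produces an error
        had_error = False
        for I, T in epoch:
            O = 1 if sum(w * i for w, i in zip(W, I)) > t else 0
            error = T - O
            if error != 0:
                had_error = True
                W = [w + LR * i * error for w, i in zip(W, I)]
        if not had_error:
            return W

print(train_and())   # learned weights implementing AND
```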

Page 40: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Training the neuron

For AND:
A B Output
0 0 0
0 1 0
1 0 0
1 1 1

[Figure: threshold unit with t = 0.0, bias input −1 with weight W1, and inputs x, y with weights W2, W3 — all weights unknown]

• What are the weight values?
• Initialize with random weight values.

Page 41: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Training the neuron

For AND:
A B Output
0 0 0
0 1 0
1 0 0
1 1 1

[Figure: the same unit with randomly initialized weights W1 = 0.3, W2 = 0.5, W3 = −0.4]

I1  I2  I3  Summation                               Output
-1   0   0  (-1*0.3) + (0*0.5) + (0*-0.4) = -0.3    0
-1   0   1  (-1*0.3) + (0*0.5) + (1*-0.4) = -0.7    0
-1   1   0  (-1*0.3) + (1*0.5) + (0*-0.4) =  0.2    1
-1   1   1  (-1*0.3) + (1*0.5) + (1*-0.4) = -0.2    0

Page 42: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Learning in Neural Networks

• Learn values of weights from I/O pairs
• Start with random weights
• Load a training example's input
• Observe the computed output
• Modify the weights to reduce the difference
• Iterate over all training examples
• Terminate when the weights stop changing OR when the error is very small

Page 43: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

NETWORK ARCHITECTURE/TOPOLOGY

Page 44: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Network Architecture

Single-layer Feedforward Networks
– input layer and output layer
– a single (computation) layer
– feedforward, acyclic

Multilayer Feedforward Networks
– hidden layers, made of hidden neurons (hidden units)
– enable the network to extract higher-order statistics
– e.g. a 10-4-2 network, a 100-30-10-3 network
– fully connected layered network

Recurrent Networks
– at least one feedback loop
– with or without hidden neurons

Page 45: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Network Architecture

[Figure: example topologies — a single-layer network; a multiple-layer, fully connected network mapping inputs to outputs; a recurrent network without hidden units and one with hidden units, each using unit-delay operators in its feedback loops]

Page 46: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Feedforward Networks (static)

[Figure: input layer → hidden layers → output layer]

Page 47: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Feedforward Networks…
• One input and one output layer.
• One or more hidden layers.
• Each hidden layer is built from artificial neurons.
• Each element of the preceding layer is connected with each element of the next layer (see the sketch after this list).
• There is no interconnection between artificial neurons of the same layer.
• Finding the weights is a task that depends on which problem the specific network is meant to solve.
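A minimal sketch of a forward pass through such a fully connected feedforward network (layer sizes, weights, and the choice of sigmoid units are illustrative):

```python
import math
import random

def forward(x, layers):
    """Forward pass through a fully connected feedforward net.
    Each layer is (weights, biases); weights[i][j] links input j to neuron i."""
    for weights, biases in layers:
        x = [1.0 / (1.0 + math.exp(-(sum(w * xj for w, xj in zip(row, x)) + b)))
             for row, b in zip(weights, biases)]        # sigmoid units
    return x

def random_layer(n_in, n_out):
    """Illustrative random initialization."""
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [random.uniform(-1, 1) for _ in range(n_out)])

random.seed(0)
net = [random_layer(3, 2), random_layer(2, 1)]          # a 3-2-1 network
print(forward([0.5, -0.2, 0.1], net))
```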

Page 48: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Feedback Networks (Recurrent or dynamic systems)

[Figure: input layer → hidden layers → output layer, with feedback connections running from later layers back to earlier ones]

Page 49: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Feedback Networks … (Recurrent or dynamic systems)

• The interconnections carry signals in both directions between neurons, i.e. the network contains feedback.
• The Boltzmann machine is an example of a recursive net; it is a generalization of Hopfield nets. Other examples of recursive nets: Adaptive Resonance Theory (ART) nets.

Page 50: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural network as directed Graph

[Figure: signal-flow graph of a neuron — source nodes x0 = +1, x1, x2, …, xm; synaptic links with weights wk0 = bk, wk1, wk2, …, wkm converging on the node vk; an activation link φ(·) leading to the output yk]

Page 51: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural network as directed Graph…

The block diagram can be simplified by the idea of a signal-flow graph:
– a node is associated with a signal
– a directed link is associated with a transfer function
• synaptic links are governed by a linear input-output relation: the signal xj is multiplied by the synaptic weight wkj
• activation links are governed by a nonlinear input-output relation: the nonlinear activation function

Page 52: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Feedback

The output determines, in part, the neuron's own output via a feedback loop. Depending on w, the behaviour is stable, linearly divergent, or exponentially divergent. We are interested in the case |w| < 1, which gives infinite memory: the output depends on inputs from the infinitely remote past. A NN with feedback loops is a recurrent network.

[Figure: input xj(n) is added to a feedback signal that passes through weight w and the unit-delay operator z⁻¹, producing the output yk(n)]

y_k(n) = \sum_{l=0}^{\infty} w^{l+1} x_j(n-l)
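A numerical sketch of this relation, truncating the infinite sum for illustration — with |w| < 1 the contribution of past inputs decays geometrically but never reaches zero:

```python
def feedback_output(x, w, n, depth=50):
    """y_k(n) = sum_{l=0}^{inf} w**(l+1) * x_j(n-l), truncated after `depth` terms."""
    return sum(w ** (l + 1) * x[n - l] for l in range(depth) if n - l >= 0)

x = [1.0] + [0.0] * 10                      # a single impulse at n = 0
print([round(feedback_output(x, w=0.8, n=n), 3) for n in range(5)])
# -> [0.8, 0.64, 0.512, 0.41, 0.328]: the impulse is remembered indefinitely
```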

Page 53: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

NEURAL PROCESSING

Page 54: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural Processing
• Recall
– The process of computation of an output o for a given input x performed by the ANN.
– Its objective is to retrieve the information, i.e., to decode the stored content, which must have been encoded in the network previously.
• Autoassociation
– When the network is presented with a pattern similar to a member of the stored set, autoassociation associates the input pattern with the closest stored pattern.

Page 55: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural Processing…
• Autoassociation: reconstruction of an incomplete or noisy image.
• Heteroassociation
– Pairs of patterns are stored; the network associates the input pattern with the pattern paired with it.

Page 56: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural Processing… Classification

– A set of patterns is already divided into a number of classes, or categories.
– When an input pattern is presented, the classifier recalls the information regarding the class membership of the input pattern.
– The classes are expressed by discrete-valued output vectors; thus the output neurons of the classifier employ binary activation functions.
– A special case of heteroassociation.

Page 57: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural Processing…

• Recognition
– When the desired response is the class number, but the input pattern doesn't exactly correspond to any of the patterns in the stored set.

Page 58: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural Processing…

• Clustering
– Unsupervised classification of patterns/objects without providing information about the actual classes.
– The network must discover for itself any existing patterns, regularities, separating properties, etc.
– While discovering these, the network undergoes changes of its parameters; this is called self-organization.

Page 59: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Neural Processing…

[Figure: examples of stored patterns]

Page 60: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

Summary

• Parallel distributed processing (especially a hardware-based neural net) is a good approach for complex pattern recognition (e.g. image recognition, forecasting, text retrieval, optimization).
• Less need to determine relevant factors a priori when building a neural network.
• Lots of training data are needed.
• High tolerance to noisy data; in fact, noisy data enhance post-training performance.
• Difficult to verify or discern learned relationships, even with special knowledge-extraction utilities developed for neural nets.

Page 61: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

References:
1. ICS611 Foundations of Artificial Intelligence, lecture notes, Univ. of Nairobi, Kenya: Learning – http://www.uonbi.ac.ke/acad_depts/ics/course_material
2. Berlin Chen, lecture notes: Normal University, Taipei, Taiwan, ROC – http://140.122.185.120
3. Lecture notes on Biology of Behaviour, PYB012 Psychology, by James Freeman, QUT.
4. Jarl Giske, lecture notes: University of Bergen, Norway – http://www.ifm.uib.no/staff/giske/
5. Denis Riordan, lecture notes, Dalhousie Univ. – http://www.cs.dal.ca/~riordan/
6. Artificial Neural Networks (ANN) by David Christiansen – http://www.pa.ash.org.au/qsite/conferences/conf2000/moreinfo.asp?paperid=95

Page 62: Artificial Neural Networks Lect2: Neurobiology & Architectures of ANNS

References:
7. Jin Hyung Kim, KAIST Computer Science Dept., CS679 Neural Network lecture notes – http://ai.kaist.ac.kr/~jkim/cs679/detail.htm