
Neural Networks Lecture 4: Models of Neurons and Neural Networks (September 16, 2010)

Capabilities of Threshold Neurons

By choosing appropriate weights w_i and threshold θ, we can place the line dividing the input space into regions of output 0 and output 1 in any position and orientation.

Therefore, our threshold neuron can realize any linearly separable function R^n → {0, 1}.

Although we only looked at two-dimensional input, our findings apply to any dimensionality n.

For example, for n = 3, our neuron can realize any function that divides the three-dimensional input space along a two-dimensional plane.
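To make this concrete, here is a minimal sketch (not from the original slides) of a single threshold neuron in Python; the weights and threshold below are arbitrary example values chosen so that the dividing line is x1 + x2 = 1.

```python
import numpy as np

def threshold_neuron(x, w, theta):
    """Output 1 if the weighted sum w.x reaches the threshold theta, else 0."""
    return 1 if np.dot(w, x) >= theta else 0

# Example weights and threshold: the dividing line is x1 + x2 = 1.
w = np.array([1.0, 1.0])
theta = 1.0
print(threshold_neuron(np.array([0.8, 0.5]), w, theta))  # 1 (on/above the line)
print(threshold_neuron(np.array([0.2, 0.3]), w, theta))  # 0 (below the line)
```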


What do we do if we need a more complex function?

Just like Threshold Logic Units, we can also combine multiple artificial neurons to form networks with increased capabilities.

For example, we can build a two-layer network with any number of neurons in the first layer giving input to a single neuron in the second layer.

The neuron in the second layer could, for example, implement an AND function.
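As an illustration of this construction (a sketch under stated assumptions, not the slides' own code), the following snippet builds a two-layer network whose second-layer neuron ANDs the first-layer outputs: with k first-layer neurons, unit weights and a threshold of k realize AND. The dividing lines below are arbitrary example values.

```python
import numpy as np

def threshold(net, theta):
    return 1 if net >= theta else 0

def two_layer_and(x, first_layer):
    """first_layer: list of (w, theta) pairs, one per first-layer neuron.
    The second-layer neuron ANDs their outputs: with k first-layer neurons,
    unit weights and threshold k realize the AND function."""
    h = [threshold(np.dot(w, x), theta) for w, theta in first_layer]
    return threshold(sum(h), len(h))

# Three dividing lines whose intersection is a triangular region (example values).
lines = [(np.array([1.0, 0.0]), 0.0),     # x1 >= 0
         (np.array([0.0, 1.0]), 0.0),     # x2 >= 0
         (np.array([-1.0, -1.0]), -1.0)]  # x1 + x2 <= 1
print(two_layer_and(np.array([0.2, 0.3]), lines))  # 1: inside the triangle
print(two_layer_and(np.array([0.9, 0.9]), lines))  # 0: outside the triangle
```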


What kind of function can such a network realize?

[Figure: a two-layer network; the inputs x1 and x2 feed each of several first-layer threshold neurons (dots indicate arbitrarily many), whose outputs all feed a single second-layer neuron.]


Assume that the dotted lines in the diagram represent the input-dividing lines implemented by the neurons in the first layer:

[Figure: input space with axes "1st comp." and "2nd comp."; the dotted dividing lines enclose a polygonal region.]

Then, for example, the second-layer neuron could output 1 if the input is within a polygon, and 0 otherwise.


However, we still may want to implement functions that are more complex than that.

An obvious idea is to extend our network even further.

Let us build a network that has three layers, with arbitrary numbers of neurons in the first and second layers and one neuron in the third layer.

The first and second layers are completely connected, that is, each neuron in the first layer sends its output to every neuron in the second layer.


What type of function can a three-layer network realize?

[Figure: a three-layer network; the inputs x1 and x2 feed the first-layer neurons, whose outputs feed the second-layer neurons, and a single third-layer neuron produces the output o_i.]


Assume that the polygons in the diagram indicate the input regions for which each of the second-layer neurons yields output 1:

[Figure: input space with axes "1st comp." and "2nd comp." containing the polygonal regions detected by the second-layer neurons.]

Then, for example, the third-layer neuron could output 1 if the input is within any of the polygons, and 0 otherwise.
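Extending the earlier sketch to three layers (again illustrative code, not from the slides; the square regions below are made-up examples): each second-layer neuron ANDs a set of half-plane detectors to recognize one polygon, and the third-layer neuron ORs the polygon detectors by using unit weights and a threshold of 1.

```python
import numpy as np

def threshold(net, theta):
    return 1 if net >= theta else 0

def in_polygon(x, half_planes):
    """Second-layer unit: AND of the first-layer half-plane detectors."""
    h = [threshold(np.dot(w, x), t) for w, t in half_planes]
    return threshold(sum(h), len(h))

def three_layer_or(x, polygons):
    """Third-layer unit: fires if the input lies in ANY of the polygons (OR)."""
    votes = [in_polygon(x, poly) for poly in polygons]
    return threshold(sum(votes), 1)

# Two square regions (example values): [0,1]x[0,1] and [2,3]x[2,3].
square = lambda a, b: [(np.array([1.0, 0.0]), a), (np.array([-1.0, 0.0]), -b),
                       (np.array([0.0, 1.0]), a), (np.array([0.0, -1.0]), -b)]
polygons = [square(0.0, 1.0), square(2.0, 3.0)]
print(three_layer_or(np.array([0.5, 0.5]), polygons))  # 1: inside the first square
print(three_layer_or(np.array([1.5, 1.5]), polygons))  # 0: inside neither square
```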


The more neurons there are in the first layer, the more vertices the polygons can have.

With a sufficient number of first-layer neurons, the polygons can approximate any given shape.

The more neurons there are in the second layer, the more of these polygons can be combined to form the output function of the network.

With a sufficient number of neurons and appropriate weight vectors w_i, a three-layer network of threshold neurons can realize any (!) function R^n → {0, 1}.


Terminology

Usually, we draw neural networks in such a way that the input enters at the bottom and the output is generated at the top.

Arrows indicate the direction of data flow.

The first layer, termed the input layer, just contains the input vector and does not perform any computations.

The second layer, termed the hidden layer, receives input from the input layer and sends its output to the output layer.

After applying their activation function, the neurons in the output layer contain the output vector.


Example: Network function f: R^3 → {0, 1}^2

[Figure: the input vector enters the input layer at the bottom, passes through the hidden layer, and the output layer at the top produces the output vector.]
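As a hedged illustration of this terminology (the layer sizes and random weights below are made-up; the slides give no concrete numbers), a forward pass through a 3-unit input layer, a hidden layer of threshold units, and a 2-unit output layer realizes a function f: R^3 → {0, 1}^2:

```python
import numpy as np

def forward(x, W_hidden, theta_hidden, W_out, theta_out):
    """The input layer just holds x; hidden and output layers are threshold units."""
    h = (W_hidden @ x >= theta_hidden).astype(int)   # hidden layer activations
    y = (W_out @ h >= theta_out).astype(int)         # output layer activations
    return y                                         # output vector in {0, 1}^2

rng = np.random.default_rng(0)
W_hidden, theta_hidden = rng.normal(size=(4, 3)), rng.normal(size=4)  # 4 hidden units
W_out, theta_out = rng.normal(size=(2, 4)), rng.normal(size=2)        # 2 output units
print(forward(np.array([0.5, -1.0, 2.0]), W_hidden, theta_hidden, W_out, theta_out))
```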


Linear Neurons

Obviously, the fact that threshold units can only output the values 0 and 1 restricts their applicability to certain problems.

We can overcome this limitation by eliminating the threshold and simply turning f_i into the identity function, so that we get:

x_i(t) = net_i(t)

With this kind of neuron, we can build feedforward networks with m input neurons and n output neurons that compute a function f: R^m → R^n.
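Since each linear neuron simply passes on its weighted sum, a layer of n linear neurons over m inputs reduces to a matrix-vector product; a minimal sketch (the weight values are illustrative):

```python
import numpy as np

def linear_layer(x, W):
    """m input neurons, n output neurons: f(x) = W @ x, a map from R^m to R^n."""
    return W @ x

W = np.array([[0.5, -1.0, 2.0],
              [1.5,  0.0, 0.3]])                     # n = 2 outputs, m = 3 inputs
print(linear_layer(np.array([1.0, 2.0, 3.0]), W))    # [4.5, 2.4]
```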


Linear neurons are quite popular and useful for applications such as interpolation.

However, they have a serious limitation: each neuron computes a linear function, and therefore the overall network function f: R^m → R^n is also linear.

This means that if an input vector x results in an output vector y, then for any factor c, the input c·x will result in the output c·y.

Obviously, many interesting functions cannot be realized by networks of linear neurons.


Gaussian Neurons

Another type of neuron overcomes this problem by using a Gaussian activation function:

[Figure: plot of f_i(net_i(t)) against net_i(t) over the range -1 to 1, with output values between 0 and 1.]

f_i(net_i(t)) = e^{-(net_i(t) - 1)^2 / (2σ^2)}
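A small Python sketch of this activation function, assuming the formula as reconstructed above; the width σ = 0.3 is an arbitrary example value:

```python
import numpy as np

def gaussian_activation(net, sigma=0.3):
    """Gaussian activation as reconstructed above: peaks (value 1) at net = 1.
    sigma controls the width of the bell; 0.3 is an arbitrary example."""
    return np.exp(-(net - 1.0) ** 2 / (2.0 * sigma ** 2))

for net in (-1.0, 0.0, 0.5, 1.0):
    print(net, gaussian_activation(net))
```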


Gaussian neurons are able to realize non-linear functions.

Therefore, networks of Gaussian units are in principle unrestricted with regard to the functions that they can realize.

The drawback of Gaussian neurons is that we have to make sure that their net input does not exceed 1.

This adds some difficulty to the learning in Gaussian networks.


Sigmoidal Neurons

Sigmoidal neurons accept any vector of real numbers as input, and they output a real number between 0 and 1.

Sigmoidal neurons are the most common type of artificial neuron, especially in learning networks.

A network of sigmoidal units with m input neurons and n output neurons realizes a network function f: R^m → (0, 1)^n.


The parameter τ controls the slope of the sigmoid function, while the parameter θ controls the horizontal offset of the function, in a way similar to the threshold neurons.

f_i(net_i(t)) = 1 / (1 + e^{-(net_i(t) - θ) / τ})

[Figure: plot of f_i(net_i(t)) against net_i(t) over the range -1 to 1, showing one sigmoid curve for τ = 1 and a much steeper one for τ = 0.1.]
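A short sketch of the sigmoid above; the parameter names θ and τ follow the reconstructed formula, the two slope values mirror the curves in the figure, and θ = 0 is an assumed default:

```python
import numpy as np

def sigmoid_activation(net, theta=0.0, tau=1.0):
    """f(net) = 1 / (1 + exp(-(net - theta) / tau)):
    theta shifts the curve horizontally, tau controls its slope."""
    return 1.0 / (1.0 + np.exp(-(net - theta) / tau))

for tau in (1.0, 0.1):   # the two slopes shown in the figure
    print(tau, [round(sigmoid_activation(net, tau=tau), 3) for net in (-1.0, 0.0, 1.0)])
```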


Correlation Learning

Hebbian Learning (1949):

"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

Weight modification rule:

Δw_{i,j} = c · x_i · x_j

Eventually, the connection strength will reflect the correlation between the neurons' outputs.
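A hedged sketch of this rule in Python (the learning rate c, the single postsynaptic unit, and the artificial activity pattern are all made-up examples):

```python
import numpy as np

rng = np.random.default_rng(1)
c = 0.01                                  # learning rate (example value)
W = np.zeros((1, 2))                      # one postsynaptic unit, two presynaptic units

for _ in range(1000):
    x = rng.normal(size=2)                # presynaptic activities x_i
    y = np.array([x[0]])                  # postsynaptic activity tracks input 0 only
    W += c * np.outer(y, x)               # Hebbian update: delta_w_ij = c * x_i * x_j

# The weight from the correlated input (index 0) grows steadily;
# the weight from the uncorrelated input (index 1) stays comparatively small.
print(W.round(2))
```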


Competitive Learning

• Nodes compete for inputs

• Node with highest activation is the winner

• Winner neuron adapts its tuning (pattern of weights) even further towards the current input (a sketch of this update follows after this list)

• Individual nodes specialize to win the competition for a set of similar inputs

• Process leads to the most efficient neural representation of the input space

• Typical for unsupervised learning
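An illustrative winner-take-all sketch of these points (not an algorithm given in the slides; the unit count, learning rate, and random inputs are assumptions):

```python
import numpy as np

def competitive_step(W, x, lr=0.05):
    """W holds one weight (tuning) vector per row. The unit with the highest
    activation w.x wins, and only its weights move toward the input x."""
    winner = int(np.argmax(W @ x))
    W[winner] += lr * (x - W[winner])
    return winner

rng = np.random.default_rng(2)
W = rng.random((3, 2))                  # 3 competing units with 2-dimensional tuning
for _ in range(200):
    competitive_step(W, rng.random(2))  # present random inputs one at a time
print(W.round(2))                       # each unit has specialized to a region of the input space
```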


Feedback-Based Weight Adaptation

• Feedback from the environment (possibly a teacher) is used to improve the system's performance

• Synaptic weights are modified to reduce the system's error in computing a desired function

• For example, if increasing a specific weight increases error, then the weight is decreased (a sketch of this idea follows after this list)

• Small adaptation steps are needed to find an optimal set of weights

• Learning rate can vary during the learning process

• Typical for supervised learning
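A minimal sketch of this kind of feedback-based adaptation (not the slides' algorithm; the target value, input, and step size are made-up): each weight is increased slightly, and if that increases the error it is decreased instead.

```python
import numpy as np

def adapt(w, x, target, step=0.01):
    """Increase each weight slightly; if that raises the error, decrease it instead."""
    for i in range(len(w)):
        base_error = (np.dot(w, x) - target) ** 2
        w[i] += step
        if (np.dot(w, x) - target) ** 2 > base_error:
            w[i] -= 2 * step              # increasing raised the error, so decrease instead
    return w

w = np.zeros(2)                           # initial weights
x = np.array([1.0, 0.5])                  # fixed example input
for _ in range(500):
    adapt(w, x, target=2.0)               # desired output for this input is 2.0
print(w.round(2), np.dot(w, x).round(2))  # w.x has been driven close to the target
```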


Supervised vs. Unsupervised Learning

Examples:

• Supervised learning: An archaeologist determines the gender of a human skeleton based on many past examples of male and female skeletons.

• Unsupervised learning: The archaeologist determines whether a large number of dinosaur skeleton fragments belong to the same species or multiple species. There are no previous data to guide the archaeologist, and no absolute criterion of correctness.


Applications of Neural Networks

• Classification

• Clustering

• Vector quantization

• Pattern association

• Forecasting

• Control applications

• Optimization

• Search

• Function approximation