1. Introduction to ANN
8/6/2019
Department of Computer Science University of Karachi
An introduction to Neural Computing
1. What is an Artificial Neural Network?
There is no universally accepted definition of an NN. But perhaps most people in the
field would agree that an NN is a network of many simple processors ("units"), each
possibly having a small amount of local memory. The units are connected by communication channels ("connections") which usually carry numeric (as opposed to
symbolic) data, encoded by any of various means. The units operate only on their
local data and on the inputs they receive via the connections. The restriction to local
operations is often relaxed during training.
Some NNs are models of biological NNs and some are not, but historically, much of
the inspiration for the field of NNs came from the desire to produce artificial systems
capable of sophisticated, perhaps "intelligent", computations similar to those that the
human brain routinely performs, and thereby possibly to enhance our understanding
of the human brain.
Most NNs have some sort of "training" rule whereby the weights of connections are
adjusted on the basis of data. In other words, NNs "learn" from examples, as children
learn to distinguish dogs from cats based on examples of dogs and cats. If trained
carefully, NNs may exhibit some capability for generalization beyond the training
data, that is, to produce approximately correct results for new cases that were not
used for training.
NNs normally have great potential for parallelism, since the computations of the components are largely independent of each other. Some people regard massive
parallelism and high connectivity to be defining characteristics of NNs, but such
requirements rule out various simple models, such as simple linear regression (a
minimal feedforward net with only two units plus bias), which are usefully regarded
as special cases of NNs.
According to the DARPA Neural Network Study:
... A neural network is a system composed of many simple processing elements
operating in parallel whose function is determined by network structure, connection
strengths, and the processing performed at computing elements or nodes.
According to Haykin (1994), p. 2:
A neural network is a massively parallel distributed processor that has a natural
propensity for storing experiential knowledge and making it available for use. It
resembles the brain in two respects:
1. Knowledge is acquired by the network through a learning process.
2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.
By Dr. Tahseen Ahmed Jilani
According to Nigrin (1993), p. 11:
A neural network is a circuit composed of a very large number of simple processing
elements that are neurally based. Each element operates only on local information.
Furthermore each element operates asynchronously; thus there is no overall system
clock.
According to Zurada (1992), p. xv:
Artificial neural systems, or NNs, are physical cellular systems which can acquire, store, and utilize experiential knowledge.
2. Why use neural networks?
Artificial neural network learning is well suited to problems involving noisy, complex sensor data, such as input from cameras and microphones, and can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. It is also applicable to problems for which more symbolic representations are often used, such as decision tree learning tasks.
A trained neural network can be thought of as an expert in the category of information it has been given to analyze. This expert can then be used to provide projections in new situations of interest and to answer "what if" questions.
Other advantages include:
a. Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial experience.
b. Self-organization: an ANN can create its own organization or representation of the information it receives during learning.
c. Real-time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
d. Fault Tolerance via Redundant Information Coding: Partial
destruction of a network leads to the corresponding degradation of
performance. However, some network capabilities may be retained even
with network damage.
e. Hybrid systems: NNs can be incorporated into hybrid systems. For example, a hybrid system might use an expert system to identify the correct NN architecture, the correct transformation of the input variables, and so on.
3. Artificial Neural Networks versus conventional computers
NNs take a different approach to problem solving than conventional computers do. Conventional computers use an algorithmic approach, i.e., the computer follows a set of instructions in order to solve a problem. Unless the specific steps that the computer needs to follow are known, the computer cannot solve the problem. That restricts the problem-solving capability of conventional computers to problems that we understand and know how to
solve. But computers would be far more useful if they could do things that we do not exactly know how to do. NNs process information in a way similar to the human brain. The network is composed of a large number of highly interconnected processing elements (neurons) working in parallel to perform a specific task. The training examples must be selected carefully; otherwise useful time is wasted, or worse, the network might function incorrectly. The disadvantage is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable.
On the other hand, conventional computers use a cognitive approach to problem solving; the way the problem is to be solved must be known and stated as small, unambiguous instructions. These instructions are then written as a high-level language program and translated into machine code that the computer can understand. Such machines are completely predictable; if anything goes wrong, it is due to a software or hardware fault.
NNs and conventional algorithmic computers are not in competition but complement each other. There are tasks that are more suited to an algorithmic approach, such as arithmetic operations, and tasks that are more suited to NNs.
Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional computer is used to supervise the NN) in order to perform at maximum efficiency.
4. Human and Artificial Neurons: investigating the similarities
How does the human brain learn? Much is still unknown about how the brain processes information, so theories abound. In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. When a neuron receives excitatory inputs that are sufficiently large compared with its inhibitory inputs, it sends a spike of electrical activity down its axon. Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.
From Human Neurons to Artificial Neurons
We construct these NNs by first trying to deduce the essential features of neurons and their interconnections. We then typically program a computer to simulate these features. However, because our knowledge of neurons is incomplete and our computing
power is limited, our models are necessarily gross idealizations of real networks of
neurons.
5. Who is concerned with NNs?
NNs are of interest to many very different kinds of people:
Computer scientists want to find out about the properties of non-symbolic
information processing with neural nets and about learning systems in general.
Statisticians use neural nets as flexible, nonlinear regression and classification
models.
Engineers of many kinds exploit the capabilities of NNs in many areas, such as
signal processing and automatic control.
Cognitive scientists view NNs as a possible apparatus to describe models of
thinking and consciousness (High-level brain function).
Neuro-physiologists use NNs to describe and explore medium-level brain function (e.g. memory, sensory systems, robotics).
Physicists use NNs to model phenomena in statistical mechanics and for a lot of
other tasks.
Biologists use NNs to interpret nucleotide sequences.
Philosophers and some other people may also be interested in NNs for various
reasons.
[Figure: An Artificial Neuron, with dendrites, cell body, and threshold labeled.]
6. Application areas of Artificial NNs
1. Economic systems
o Identification of the process of economic inflation. Modeling of the economy for restoration of governance. Forecasting and evaluation of the main factors of the economy. Normative forecasting of processes in the macro economy. Prediction of share rates.
2. Ecological systems analysis and prediction
o Oil fields forecasting. Forecasting of river flow. Air and soil temperature modeling. Water quality forecasting. Winter wheat productivity modeling.
o Wheat harvest as a function of different parameters. Drainage flow optimization.
o Cl- and NO3- settlement modeling. Influence of natural position factors on harvest growth. Forecasting of pesticide destruction processes in fruits.
3. Environment systems
o Solar activity forecasting. Pesticide type recognition for decision support.
o Air and water pollution prediction.
4. Medical diagnostics
o Cancer patient diagnostics. Identification of the state of the human brain. Sleep stage classification. Crush injury modeling.
5. Demographic systems
o Birth-rate forecasting in the GDR
6. Econometric modeling and marketing
o Steel shipment forecasting. Cost-estimating relationships modeling.
Forecasting of bread and drinks delivery.
7. Manufacturing
o Optimization and forecasting of silk production process. Forecasting of
cement quality by MIA. Hot strip steel mill runout table cooling sprays.
Crystallization process forecasting. Fermentation process modeling
8. Physical experiments
o Optimization of nuclear weapon testing parameters. Parameters of sharpening angles of cutting tools.
9. Materials
o Material radiation modeling. Modeling of single-particle erosion of heat shields.
o Weld strength estimation
10. Multisensor signal processing
o Physical security systems
11.Microprocessor-based hardware
o "Smart" ultrasonic flaw discriminator
12. Eddy currents
o Automatic bolthole inspection. Recirculating and once-through steam generator tubing inspection.
13. X-ray
o X-ray image analysis for bomb detection in luggage.
14. Acoustic and seismic analysis
o Ocean platform detection and classification. Seismic discrimination. Oil
and water fields detection by OSA
15. Military systems
Radar: Reentry vehicle trajectory prediction. Radar imagery target
classification. Detection and identification of tactical targets. Radar pulse
classification
Infrared: Target acquisition and aim-point selection. LANDSAT scene
classification
Ultrasonics and acoustics emission: Ultrasonic imaging. Feedwater nozzle
detection. Turbine rotor inspection. Ultrasonic pipe inspection.
Monitoring of crack-growth activity
Missile guidance: Identification of gyroscope system dynamics equations using sorting of difference equations. Air-to-air guidance law synthesis
Image Processing and Computer Vision
Including image matching, preprocessing, segmentation and analysis, computer vision (e.g., circuit board inspection), image compression, stereo vision, and processing and understanding of time-varying images.
Signal Processing
Including seismic signal analysis and morphology.
Pattern Recognition
Including feature extraction, radar signal classification and analysis, speech
recognition and understanding, fingerprint identification, character (letter or
number) recognition, and handwriting analysis.
Medical
Including electrocardiograph signal analysis and understanding, diagnosis of many
diseases, and medical image processing.
Financial Systems
Including stock market analysis, real estate appraisal, credit card authorization, and
securities trading
Planning, Control and Search
Including parallel implementation of constraint satisfaction problems (CSPs), solutions to the traveling salesman problem, and control and robotics.
Power Systems
Including system state estimation, transient detection and classification, fault detection and recovery, load forecasting, and security assessment.
Human Factors
Interfacing
7. Learning/ Training of Artificial NNs
The two main kinds of learning algorithms are supervised and unsupervised.
Supervised Learning
In supervised learning, the correct results (target values, desired outputs) are
known and are given to the NN during training so that the NN can adjust its
weights to try matching its outputs to the target values. After training, the NN is
tested by giving it only input values, not target values, and seeing how close it
comes to outputting the correct target values.
Unsupervised Learning
In unsupervised learning, the NN is not provided with the correct results during
training. Unsupervised NNs usually perform some kind of data compression, such as dimensionality reduction (like PCA, etc.) or clustering (like FA, CA, etc.).
The distinction between supervised and unsupervised methods is not always clear-cut.
An unsupervised method can learn a summary of a probability distribution, then that
summarized distribution can be used to make predictions. Furthermore, supervised
methods come in two subvarieties: auto-associative and hetero-associative. In auto-
associative learning, the target values are the same as the inputs, whereas in hetero-
associative learning, the targets are generally different from the inputs. Many
unsupervised methods are equivalent to auto-associative supervised methods.
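The unsupervised case can be made concrete with a minimal sketch (an illustration with made-up data and parameters, not part of the original notes): Oja's rule, a normalized Hebbian update, lets a single linear unit discover the first principal component of its inputs without ever being shown target values.

```python
import numpy as np

# Toy 2-D data whose dominant direction is [3, 1] (hypothetical example data).
X = np.array([[3.0, 1.0], [-3.0, -1.0], [6.0, 2.0], [-6.0, -2.0],
              [3.1, 0.9], [-2.9, -1.1], [5.8, 2.1], [-6.1, -1.9]])

w = np.array([1.0, 0.0])           # initial weight vector
lr = 0.01                          # learning rate

for epoch in range(200):
    for x in X:
        y = w @ x                  # unit's output; no target value is ever used
        w += lr * y * (x - y * w)  # Oja's rule: Hebbian term minus weight decay

principal = np.array([3.0, 1.0]) / np.sqrt(10.0)
print(abs(w @ principal))          # alignment with the true first PC (near 1)
```

After training, w has (approximately) unit length and points along the first principal component, which is exactly the PCA-style dimensionality reduction mentioned above.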
8. Kinds of Network topology
Feed forward NNs:
In a feedforward NN, the connections between units do not form cycles
(feeding back to inputs). Feedforward NNs usually produce a response to an input quickly. Most feedforward NNs can be trained using a wide variety of
efficient conventional numerical methods.
Feedback NNs:
In a feedback or recurrent NN, there are cycles in the connections. In some feedback NNs, each time an input is presented, the NN must iterate for a
potentially long time before it produces a response. Feedback NNs are usually
more difficult to train than feedforward NNs.
9. Some well-known types of NNs
1. Supervised
a. Feedforward
Linear NNs, Multilayer perceptron, RBF networks (Bishop, 1995; Moody and Darken, 1989; Orr, 1996), CMAC: Cerebellar Model Articulation Controller, Classification only, Regression only
b. Feedback
BAM: Bidirectional Associative Memory, Boltzman Machine,
Recurrent time series
c. Competitive
ARTMAP Carpenter, Fuzzy ARTMAP, Gaussian ARTMAP,
Counterpropagation, Neocognitron
2. Unsupervised
a. Competitive
Vector Quantization, Self-Organizing Map, Adaptive resonance
theory, DCL: Differential Competitive Learning - Kosko (1992)
b. Dimension Reduction
Hebbian, Oja, Sanger, Differential Hebbian
c. Autoassociation
Linear autoassociator, BSB: Brain State in a Box, Hopfield
3. Nonlearning
Hopfield
Various networks for optimization
10. Network Layers:
The commonest type of artificial neural networks consists of three groups, or layers,
of units: a layer of input units is connected to a layer of hidden units, which is
connected to a layer of output units.
The activity of the input units represents the raw information that is fed into the network.
The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.
The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units.
This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and the hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.
We also distinguish single-layer and multi-layer architectures. The single-layer
organization, in which all units are connected to one another, constitutes the most
general case and is of more potential computational power than hierarchically
structured multi-layer organizations. In multi-layer networks, units are often
numbered by layer, instead of following a global numbering. Two or more neurons can be combined in a layer, and a particular network could contain one or more such layers.
A Layer of Neurons
In this network, each element of the input vector p is connected to each neuron input through the weight matrix W. The ith neuron has a summer that gathers its weighted inputs and bias to form its own scalar output n(i). The various n(i) taken together form an S-element net input vector n. Finally, the neuron layer outputs form a column vector a.
[Figure: Layer of neurons. Inputs p1 ... pR feed each neuron through weights w1,1 ... wS,R; each neuron adds its bias b(i), forms net input n(i), and applies f, giving a = f(Wp + b). R = number of elements in the input vector; S = number of neurons in the layer.]
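As a minimal sketch of that computation (the numbers here are made-up placeholders, not from the notes), the layer equation a = f(Wp + b) can be written directly in NumPy, with R = 3 inputs, S = 2 neurons, and the hyperbolic tangent as f:

```python
import numpy as np

R, S = 3, 2                        # R inputs, S neurons in the layer
W = np.array([[0.5, -1.0, 2.0],    # weight matrix, shape (S, R): row i holds
              [1.0,  0.0, -0.5]])  # the weights into neuron i
b = np.array([0.1, -0.2])          # bias vector, one bias per neuron
p = np.array([1.0, 2.0, 3.0])      # input vector

n = W @ p + b                      # S-element net input vector n
a = np.tanh(n)                     # layer output vector a = f(Wp + b)
print(n, a)
```

The shapes make the R/S bookkeeping explicit: W is S-by-R, so n and a each have one element per neuron.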
Multiple Layers of neurons:
A network can have several layers. Each layer has a weight matrix W, a bias vector b, and an output vector a. The layer number is shown as a superscript on the variable of interest. In a three-layer network, for example, the outputs of each intermediate layer are the inputs to the following layer. The layers of a multi-layer network play different roles. A layer that produces the network output is called an output layer. All other layers are called hidden layers.
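The chaining just described can be sketched as follows (a hypothetical illustration; the layer sizes and random weights are assumptions, not from the notes). Each layer computes a = f(Wa' + b) on the previous layer's output a', so the whole network is a composition of layer functions:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(W, b, p, f):
    """One layer of neurons: a = f(W p + b)."""
    return f(W @ p + b)

# Hypothetical sizes: 4 inputs -> 5 hidden -> 3 hidden -> 2 outputs.
sizes = [4, 5, 3, 2]
Ws = [rng.standard_normal((s_out, s_in)) for s_in, s_out in zip(sizes, sizes[1:])]
bs = [rng.standard_normal(s_out) for s_out in sizes[1:]]

p = rng.standard_normal(4)
a = p
for k, (W, b) in enumerate(zip(Ws, bs)):
    # Hidden layers use tanh; the output layer here is linear (purelin-style).
    f = np.tanh if k < len(Ws) - 1 else (lambda n: n)
    a = layer(W, b, a, f)          # output of layer k becomes input of layer k+1

print(a.shape)                     # the output layer produces 2 values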
11. Transfer Function:
In statistical regression modeling, it is common practice for the experimenter to model the population using sample data points. Modeling sample data with regression analysis therefore requires parameter estimation by some procedure.
But, ANNs are adaptive systems with the power of a universal computer,
i.e. they can realize an arbitrary mapping (association) of one vector space
(inputs) to the other vector space (outputs). They differ in many respects,
one of the most important characteristics being the transfer functions
performed by each neuron. The choice of transfer functions in neural
networks is of crucial importance to their performance. There is a growing
understanding that the choice of transfer functions is at least as important
as the network architecture and learning algorithm. Neural networks are
used either to approximate a posteriori probabilities for classification or to approximate the probability densities of the training data.
Viewing the problem of learning from a geometrical point of view, the purpose of the transfer functions performed by the neural network nodes is to enable the proper use of the parameter space in the most flexible way using the lowest number of adaptive parameters.
The activation and output functions of the input and output layers may be of a different type than those of the hidden layers; in particular, linear functions are frequently used for inputs and outputs and non-linear transfer functions for hidden layers.
The behavior of an Artificial Neural Network depends highly on the transfer function used. There are a number of transfer functions that typically fall into one of the following three categories.
1. Linear (or ramp)
2. Threshold
3. Sigmoid
Some of the transfer functions included in MATLAB are,
1. hardlim (Hard limit transfer function)
2. poslin (Positive Linear transfer function)
3. purelin (Linear transfer function)
4. tansig (Hyperbolic tangent transfer function)
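These four transfer functions can be sketched in Python/NumPy for illustration (this is an approximation of the MATLAB behavior, not MATLAB's own code; in particular, MATLAB's hardlim outputs 1 whenever the net input is greater than or equal to 0):

```python
import numpy as np

def hardlim(n):
    """Hard limit: 1 if n >= 0, otherwise 0."""
    return np.where(n >= 0, 1.0, 0.0)

def poslin(n):
    """Positive linear: n if n >= 0, otherwise 0 (a 'ramp')."""
    return np.maximum(n, 0.0)

def purelin(n):
    """Linear: output equals input."""
    return n

def tansig(n):
    """Hyperbolic-tangent sigmoid: 2 / (1 + exp(-2n)) - 1."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

n = np.array([-2.0, 0.0, 2.0])
print(hardlim(n), poslin(n), purelin(n), tansig(n))
```

Note that tansig(n) agrees with np.tanh(n) to floating-point precision on ordinary inputs.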
hardlim transfer function:
The hardlim transfer function forces a neuron to output a 1 if its net input
reaches a threshold otherwise it outputs 0. This allows a neuron to make a
decision or classification. It can say yes or no. This kind of neuron is often
trained with the perceptron learning rule.
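As a made-up illustration of that rule (the AND data set and training loop below are assumptions, not from the notes), a single hardlim neuron can be trained to say yes or no, here learning the logical AND function with the update w <- w + (t - a)*p, b <- b + (t - a):

```python
import numpy as np

def hardlim(n):
    return 1.0 if n >= 0 else 0.0

# Training set for logical AND: inputs and target outputs.
P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)
b = 0.0
for epoch in range(20):            # a few passes suffice for this separable task
    for p, t in zip(P, T):
        a = hardlim(w @ p + b)     # neuron's yes/no decision
        e = t - a                  # perceptron error
        w += e * p                 # perceptron rule: w <- w + e*p
        b += e                     #                  b <- b + e

print([hardlim(w @ p + b) for p in P])   # matches the AND targets [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop reaches a correct set of weights.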
poslin transfer function:
The transfer function poslin returns the output n if n is greater than or equal to zero, and 0 if n is less than zero.
purelin transfer function:
The purelin transfer function simply returns its input unchanged (a = n). Transfer functions of this type are used in linear filters.
tansig transfer function:
[Figures: plots of the four transfer functions, each showing output a versus net input n, with a ranging from -1 to +1.]
tansig is named after the hyperbolic tangent, which has the same shape. However, tanh may be more accurate and is recommended for applications that require the exact hyperbolic tangent. tansig(n) calculates its output according to:
a = 2 / (1 + exp(-2 * n)) - 1
This is mathematically equivalent to tanh(n).
By Dr. Tahseen Ahmed Jilani