1. introduction to ann

8/6/2019 1. Introduction to ANN


Department of Computer Science University of Karachi

An introduction to Neural Computing

1. What is an Artificial Neural Network?

There is no universally accepted definition of an NN. But perhaps most people in the

field would agree that an NN is a network of many simple processors ("units"), each

possibly having a small amount of local memory. The units are connected by communication channels ("connections") which usually carry numeric (as opposed to

symbolic) data, encoded by any of various means. The units operate only on their

local data and on the inputs they receive via the connections. The restriction to local

operations is often relaxed during training.

Some NNs are models of biological NNs and some are not, but historically, much of

the inspiration for the field of NNs came from the desire to produce artificial systems

capable of sophisticated, perhaps "intelligent", computations similar to those that the

human brain routinely performs, and thereby possibly to enhance our understanding

of the human brain.

Most NNs have some sort of "training" rule whereby the weights of connections are

adjusted on the basis of data. In other words, NNs "learn" from examples, as children

learn to distinguish dogs from cats based on examples of dogs and cats. If trained

carefully, NNs may exhibit some capability for generalization beyond the training

data, that is, to produce approximately correct results for new cases that were not

used for training.

NNs normally have great potential for parallelism, since the computations of the components are largely independent of each other. Some people regard massive

parallelism and high connectivity to be defining characteristics of NNs, but such

requirements rule out various simple models, such as simple linear regression (a

minimal feedforward net with only two units plus bias), which are usefully regarded

as special cases of NNs.
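To make the linear-regression remark concrete, here is a minimal sketch (not from the original notes) of simple linear regression written as a net with one input unit, one linear output unit, and a bias. The function names, learning rate, and toy data are assumptions chosen for illustration.

```python
# Simple linear regression viewed as a minimal feedforward net:
# one input unit, one linear output unit, one weight w and one bias b.

def predict(w, b, x):
    """Output of the single linear unit for input x."""
    return w * x + b

def train(xs, ys, lr=0.05, epochs=2000):
    """Fit w and b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = train(xs, ys)
print(w, b)  # should be close to the generating slope and intercept
```

The "training rule" here is ordinary gradient descent; a statistician would normally solve the same problem in closed form, which is why this model is a useful bridge between the two fields.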

According to the DARPA Neural Network Study:

... A neural network is a system composed of many simple processing elements

operating in parallel whose function is determined by network structure, connection

strengths, and the processing performed at computing elements or nodes.

According to Haykin (1994), p. 2:

A neural network is a massively parallel distributed processor that has a natural

propensity for storing experiential knowledge and making it available for use. It

resembles the brain in two respects:

1. Knowledge is acquired by the network through a learning process.

2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.

By Dr. Tahseen Ahmed Jilani




According to Nigrin (1993), p. 11:

A neural network is a circuit composed of a very large number of simple processing

elements that are neurally based. Each element operates only on local information.

Furthermore each element operates asynchronously; thus there is no overall system

clock.

According to Zurada (1992), p. xv:

Artificial neural systems, or NNs, are physical cellular systems which can acquire, store, and utilize experiential knowledge.

2. Why use neural networks?

Artificial neural network learning is well suited to problems in which the data is noisy, complex sensor data, such as input from cameras and microphones. NNs can extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. NN learning is also applicable to problems for which more symbolic representations are often used, such as decision tree tasks.

A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze. This expert can then be used to provide projections for new situations of interest and to answer "what if" questions.

Other advantages include:

a. Adaptive learning: An ability to learn how to do tasks based on the data given for training or initial experience.

b. Self-organization: An ANN can create its own organization or representation of the information it receives during learning time.

c. Real-time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.

d. Fault tolerance via redundant information coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with network damage.

e. Hybrid systems: NNs can be incorporated into hybrid systems. For example, a hybrid system might use an expert system to identify the correct NN architecture, the correct transformation of the input variables, and so on.

3. Artificial Neural Network versus conventional computers

NNs take a different approach to problem solving than conventional computers. Conventional computers use an algorithmic approach, i.e., the computer follows a set of instructions in order to solve a problem. Unless the specific steps that the computer needs to follow are known, the computer cannot solve the problem. That restricts the problem-solving capability of conventional computers to problems that we already understand and know how to solve. But computers would be far more useful if they could do things that we do not exactly know how to do.

NNs process information in a way similar to the human brain. The network is composed of a large number of highly interconnected processing elements (neurons) working in parallel to perform a specific task. NNs learn by example, so the examples must be selected carefully; otherwise useful time is wasted or, even worse, the network might function incorrectly. The disadvantage is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable.

Conventional computers, on the other hand, use a cognitive approach to problem solving: the way the problem is to be solved must be known and stated in small, unambiguous instructions. These instructions are then converted into a high-level language program and then into machine code that the computer can understand. These machines are completely predictable; if anything goes wrong, it is due to a software or hardware fault.

NNs and conventional algorithmic computers are not in competition but complement each other. There are tasks that are more suited to an algorithmic approach, such as arithmetic operations, and tasks that are more suited to NNs. Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional computer is used to supervise the NN) in order to perform at maximum efficiency.

4. Human and Artificial Neurons - investigating the similarities

How the Human Brain Learns

Much is still unknown about how the brain processes information, so theories abound. In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. When a neuron receives excitatory inputs that are sufficiently large compared with its inhibitory inputs, it sends a spike of electrical activity down its axon. Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.

From Human Neurons to Artificial Neurons

We construct these NNs by first trying to deduce the essential features of neurons and

their interconnections. We then typically program a computer to simulate these

features. However because our knowledge of neurons is incomplete and our computing

power is limited, our models are necessarily gross idealizations of real networks of

neurons.





5. Who is concerned with NNs?

NNs are interesting for quite a lot of very different people:

Computer scientists want to find out about the properties of non-symbolic

information processing with neural nets and about learning systems in general.

Statisticians use neural nets as flexible, nonlinear regression and classification

models.

Engineers of many kinds exploit the capabilities of NNs in many areas, such as

signal processing and automatic control.

Cognitive scientists view NNs as a possible apparatus to describe models of

thinking and consciousness (High-level brain function).

Neuro-physiologists use NNs to describe and explore medium-level brain function (e.g. memory, sensory system, robotics).

Physicists use NNs to model phenomena in statistical mechanics and for a lot of

other tasks.

Biologists use NNs to interpret nucleotide sequences.

Philosophers and some other people may also be interested in NNs for various

reasons.


[Figure: An Artificial Neuron, with parts labelled dendrites, cell body, and threshold.]




6. Application areas of Artificial NNs

1. Economic systems
   o Identification of the process of economy inflation. Modeling of economy for restoration of the governing. Forecasting and evaluation of the main factors for economy. Normative forecasting of processes in macroeconomy. Prediction of share rates.

2. Ecological systems analysis and prediction
   o Oil fields forecasting. Forecasting of river flow. Temperature of air and soil modeling. Water quality forecasting. Winter wheat productivity modeling.
   o Wheat harvest as a function of different parameters. Drainage flow optimization.
   o Cl- and NO3- settlement modeling. Influence of natural position factors on harvest growth. Forecasting of pesticides destruction process in fruits.

3. Environment systems
   o Solar activity forecasting. Pesticide type recognition for decision support.
   o Air and water pollution prediction.

4. Medical diagnostics
   o Cancer patients diagnostics. State of human brain identification. Sleep stage classification. Crush injury modeling.

5. Demographic systems
   o Birth-rate forecasting in the GDR.

6. Econometric modeling and marketing
   o Steel shipment forecasting. Cost-estimating relationships modeling. Forecasting of bread and drinks delivery.

7. Manufacturing
   o Optimization and forecasting of silk production process. Forecasting of cement quality by MIA. Hot strip steel mill runout table cooling sprays. Crystallization process forecasting. Fermentation process modeling.

8. Physical experiments
   o Parameters of nuclear weapon testing were optimized. Parameters of sharpening angles of cutting tools.

9. Materials
   o Martial radiations modeling. Modeling of single-particle erosion of heat shields.
   o Weld strength estimation.

10. Multisensor signal processing
   o Physical security systems.

11. Microprocessor-based hardware
   o "Smart" ultrasonic flaw discriminator.

12. Eddy currents
   o Automatic bolthole inspection. Recirculating and once-through steam generator tubing inspection.

13. X-ray
   o X-ray image analysis for bomb detection in luggage.

14. Acoustic and seismic analysis
   o Ocean platform detection and classification. Seismic discrimination. Oil and water fields detection by OSA.

15. Military systems
   o Radar: Reentry vehicle trajectory prediction. Radar imagery target classification. Detection and identification of tactical targets. Radar pulse classification.
   o Infrared: Target acquisition and aim-point selection. LANDSAT scene classification.
   o Ultrasonics and acoustic emission: Ultrasonic imaging. Feedwater nozzle detection. Turbine rotor inspection. Ultrasonic pipe inspection. Monitoring of crack-growth activity.
   o Missile guidance: Identification of the equations of gyroscope system dynamics using sorting of difference equations. Air-to-air guidance law synthesis.

Image Processing and Computer Vision

Including image matching, preprocessing, segmentation and analysis, computer vision (e.g., circuit board inspection), image compression, stereo vision, and processing and understanding of time-varying images.

Signal Processing

Including seismic signal analysis and morphology.

Pattern Recognition

Including feature extraction, radar signal classification and analysis, speech recognition and understanding, fingerprint identification, character (letter or number) recognition, and handwriting analysis.

Medical

Including electrocardiograph signal analysis and understanding, diagnosis of many diseases, and medical image processing.

Financial Systems

Including stock market analysis, real estate appraisal, credit card authorization, and securities trading.

Planning, Control and Search

Including parallel implementation of constraint satisfaction problems (CSPs), solutions to the traveling salesman problem, and control and robotics.

Power Systems

Including system state estimation, transient detection and classification, fault detection and recovery, load forecasting, and security assessment.

Human Factors

Interfacing





7. Learning/Training of Artificial NNs

The two main kinds of learning algorithms are supervised and unsupervised.

Supervised Learning

In supervised learning, the correct results (target values, desired outputs) are

known and are given to the NN during training so that the NN can adjust its

weights to try to match its outputs to the target values. After training, the NN is

tested by giving it only input values, not target values, and seeing how close it

comes to outputting the correct target values.
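The train-then-test procedure described above can be sketched as follows. This is an illustrative example, not the author's method: it uses a single linear unit trained with the online delta rule, and the `hidden_rule` function that generates the targets, the learning rate, and the data split are all assumptions.

```python
import random

def output(weights, bias, x):
    """Linear unit: weighted sum of the inputs plus bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_delta_rule(samples, lr=0.05, epochs=500):
    """Online delta rule: after each example, nudge the weights toward the target."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - output(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def hidden_rule(x):
    """Hypothetical 'true' mapping, used only to generate toy target values."""
    return 3.0 * x[0] - 2.0 * x[1] + 0.5

rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(40)]
train_set = [(p, hidden_rule(p)) for p in points[:30]]  # inputs with targets
test_inputs = points[30:]                               # inputs only

weights, bias = train_delta_rule(train_set)

# Testing: present inputs only, then compare the outputs with the known targets.
test_errors = [abs(output(weights, bias, x) - hidden_rule(x)) for x in test_inputs]
max_test_error = max(test_errors)
print(max_test_error)
```

Because the toy targets are noise-free and exactly linear, the test error should be very small; with real data the held-out error is the quantity of interest.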

Unsupervised Learning

In unsupervised learning, the NN is not provided with the correct results during

training. Unsupervised NNs usually perform some kind of data compression, such

as dimensionality reduction (e.g., PCA) or clustering (e.g., FA, CA).

The distinction between supervised and unsupervised methods is not always clear-cut.

An unsupervised method can learn a summary of a probability distribution, then that

summarized distribution can be used to make predictions. Furthermore, supervised

methods come in two subvarieties: auto-associative and hetero-associative. In auto-

associative learning, the target values are the same as the inputs, whereas in hetero-

associative learning, the targets are generally different from the inputs. Many

unsupervised methods are equivalent to auto-associative supervised methods.




[Figure: network diagram with INPUTS and OUTPUT labels.]


8. Kinds of Network topology

Feedforward NNs:

In a feedforward NN, the connections between units do not form cycles (feeding back to inputs). Feedforward NNs usually produce a response to an input quickly. Most feedforward NNs can be trained using a wide variety of efficient conventional numerical methods.

Feedback NNs:

In a feedback or recurrent NN, there are cycles in the connections. In some feedback NNs, each time an input is presented, the NN must iterate for a potentially long time before it produces a response. Feedback NNs are usually more difficult to train than feedforward NNs.
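The difference in how the two topologies produce a response can be illustrated with a small sketch. The weights, the `tanh` units, and the convergence tolerance below are assumptions made for the example; the feedback net is a generic recurrent update, not any specific named model.

```python
import math

def feedforward(x, w_hidden, w_out):
    """One pass: input -> hidden (tanh) -> output (linear). No cycles."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))

def feedback(x, w_rec, tol=1e-9, max_steps=1000):
    """Recurrent net: units feed each other, so iterate until the state settles."""
    state = [0.0] * len(x)
    for step in range(1, max_steps + 1):
        new_state = [
            math.tanh(xi + sum(w * s for w, s in zip(row, state)))
            for xi, row in zip(x, w_rec)
        ]
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state, step
        state = new_state
    return state, max_steps

x = [0.5, -0.3]
W_REC = [[0.0, 0.5], [0.5, 0.0]]  # small weights, so the iteration converges

y = feedforward(x, w_hidden=[[0.4, -0.6], [0.3, 0.8]], w_out=[1.0, -1.0])
settled, steps = feedback(x, W_REC)
print(y, settled, steps)
```

The feedforward response is available after a single sweep through the layers, whereas the feedback net needs `steps` iterations before its state stops changing; with larger recurrent weights it might not settle at all.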





9. Some well-known types of NNs

1. Supervised

a. Feedforward

Linear NNs, Multilayer perceptron, RBF networks - Bishop (1995), Moody and Darken (1989), Orr (1996), CMAC: Cerebellar Model Articulation Controller, Classification only, Regression only

b. Feedback

BAM: Bidirectional Associative Memory, Boltzmann Machine,

Recurrent time series

c. Competitive

ARTMAP Carpenter, Fuzzy ARTMAP, Gaussian ARTMAP,

Counterpropagation, Neocognitron

2. Unsupervised

a. Competitive

Vector Quantization, Self-Organizing Map, Adaptive resonance

theory, DCL: Differential Competitive Learning - Kosko (1992)

b. Dimension Reduction

Hebbian, Oja, Sanger, Differential Hebbian

c. Autoassociation

Linear autoassociator, BSB: Brain State in a Box, Hopfield

3. Nonlearning

Hopfield

Various networks for optimization





10. Network Layers:

The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of input units is connected to a layer of hidden units, which is connected to a layer of output units.

The activity of the input units represents the raw information that is fed into the network.

The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.

The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units.

This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.

We also distinguish single-layer and multi-layer architectures. The single-layer organization, in which all units are connected to one another, constitutes the most general case and has more potential computational power than hierarchically structured multi-layer organizations. In multi-layer networks, units are often numbered by layer, instead of following a global numbering. Two or more neurons can be combined in a layer, and a particular network could contain one or more such layers.

A Layer of Neurons

In this network, each element of the input vector is connected to each neuron input through the weight matrix W. The ith neuron has a summer that gathers its weighted inputs and bias to form its own scalar output n(i). The various n(i) taken together form an S-element net input vector n. Finally, the neuron layer outputs form a column vector a.
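As a rough sketch of the computation just described, the layer output can be written as a = f(Wp + b) in plain Python. The function name `layer`, the use of `math.tanh` as the transfer function f, and the numeric values are assumptions for illustration only.

```python
import math

def layer(W, b, p, f):
    """Return a = f(Wp + b) for one layer of neurons.

    W is an S x R weight matrix (a list of S rows), b a length-S bias vector,
    p a length-R input vector, and f the transfer function applied per neuron.
    """
    # Net input n(i): weighted sum of the inputs plus the bias of neuron i.
    n = [sum(w * x for w, x in zip(row, p)) + bias for row, bias in zip(W, b)]
    return [f(ni) for ni in n]

# Illustrative values only: R = 3 input elements, S = 2 neurons.
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.5]]
b = [0.0, -1.0]
p = [2.0, 1.0, 2.0]

a = layer(W, b, p, math.tanh)
print(a)
```

Here the net inputs are n = [0.0, 1.5], so the first neuron outputs tanh(0) = 0 and the second outputs tanh(1.5).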


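[Figure: Layer of neurons. Inputs P1, P2, P3, ..., PR feed the neurons through the weights W1,1 ... WS,R; each neuron adds its bias (b1, b2, b3, ...) to form its net input (n1, n2, n3, ...) and applies the transfer function f to produce its output (a1, a2, a3, ...).]

a = f(WP + b)

where R = number of elements in the input vector and S = number of neurons in the layer.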




Multiple Layers of neurons:

A network can have several layers. Each layer has a weight matrix W, a bias vector b, and an output vector a. The layer number is shown as a superscript on the variable of interest. A three-layer network is shown below. Note that the outputs of each intermediate layer are the inputs to the following layer. The layers of a multi-layer network play different roles. A layer that produces the network output is called an output layer. All other layers are called hidden layers.





11. Transfer Function:

It is common practice in statistical regression modeling for the experimenter to model a population using sample data points. Modeling sample data with regression analysis therefore requires parameter estimation by some procedure.

But, ANNs are adaptive systems with the power of a universal computer,

i.e. they can realize an arbitrary mapping (association) of one vector space

(inputs) to the other vector space (outputs). They differ in many respects,

one of the most important characteristics being the transfer functions

performed by each neuron. The choice of transfer functions in neural

networks is of crucial importance to their performance. There is a growing

understanding that the choice of transfer functions is at least as important

as the network architecture and learning algorithm. Neural networks are

used either to approximate a posteriori probabilities for classification or to

approximate the probability densities of the training data.

Viewing the problem of learning from a geometrical point of view, the purpose of the transfer functions performed by the neural network nodes is to enable the proper use of the parameter space in the most flexible way using the lowest number of adaptive parameters.

The activation and output functions of the input and output layers may be of a different type than those of the hidden layers; in particular, linear functions are frequently used for inputs and outputs and non-linear transfer functions for hidden layers.

The behavior of an Artificial Neural Network depends highly on the transfer function used. There are a number of transfer functions, which typically fall into one of the following three categories:

1. Linear (or ramp)

2. Threshold

3. Sigmoid

Some of the transfer functions included in MATLAB are:

1. hardlim (Hard limit transfer function)

2. poslin (Positive Linear transfer function)

3. purelin (Linear transfer function)

4. tansig (Hyperbolic tangent transfer function)
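For readers without MATLAB, the four functions listed above can be approximated in plain Python. These are re-implementations written for illustration under the usual definitions (threshold at zero for hardlim); they are not the MATLAB toolbox code itself.

```python
import math

def hardlim(n):
    """Hard limit: 1 if the net input is at or above zero, otherwise 0."""
    return 1.0 if n >= 0 else 0.0

def poslin(n):
    """Positive linear: n if n >= 0, otherwise 0."""
    return n if n >= 0 else 0.0

def purelin(n):
    """Linear: the output equals the net input."""
    return n

def tansig(n):
    """Hyperbolic tangent sigmoid, written in the 2 / (1 + exp(-2n)) - 1 form.

    Note: math.exp overflows for very large negative n; fine for a sketch.
    """
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

# Tabulate each function at a few net-input values.
for n in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(n, hardlim(n), poslin(n), purelin(n), round(tansig(n), 4))
```

In terms of the three categories above, hardlim is a threshold function, purelin and poslin are linear or ramp-like, and tansig is a sigmoid.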





hardlim transfer function:

The hardlim transfer function forces a neuron to output 1 if its net input reaches a threshold; otherwise it outputs 0. This allows a neuron to make a decision or classification: it can say yes or no. This kind of neuron is often trained with the perceptron learning rule.
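As an illustration of a hardlim neuron trained with the perceptron learning rule, the sketch below learns the logical AND function. The update used is the standard perceptron rule (add the error times the input to the weights, and the error to the bias); the function names, zero initial weights, and epoch limit are assumptions for the example.

```python
def hardlim(n):
    """Hard limit with threshold at zero."""
    return 1 if n >= 0 else 0

def neuron(w, b, p):
    """Output of a single hardlim neuron for input vector p."""
    return hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)

def train_perceptron(samples, epochs=20):
    """Perceptron rule: w += e * p and b += e, where e = target - output."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for p, t in samples:
            e = t - neuron(w, b, p)
            if e != 0:
                w = [wi + e * pi for wi, pi in zip(w, p)]
                b += e
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return w, b

# Truth table for logical AND: (inputs, target).
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w, b = train_perceptron(AND)
outputs = [neuron(w, b, p) for p, _ in AND]
print(w, b, outputs)
```

AND is linearly separable, so the rule is guaranteed to stop making mistakes after finitely many updates; a function such as XOR would never reach the `mistakes == 0` exit with a single neuron.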

poslin transfer function:

The transfer function poslin returns the output n if n is greater than or equal to zero and 0 if n is less than zero.

purelin transfer function:

Transfer functions of this type are used in linear filters.

tansig transfer function:


[Figures: plots of the transfer functions, each showing the output a against the net input n, with axis marks at -1, 0, and +1.]




tansig is named after the hyperbolic tangent, which has the same shape.

However, tanh may be more accurate and is recommended for applications that

require the hyperbolic tangent. tansig(N) calculates its output according to:

a = 2 / (1 + exp(-2 * n)) - 1

This is mathematically equivalent to tanh(N).
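The equivalence can be checked in a few lines, starting from the standard definition of the hyperbolic tangent and multiplying numerator and denominator by e^{-n}:

```latex
\begin{aligned}
\tanh(n) &= \frac{e^{n} - e^{-n}}{e^{n} + e^{-n}}
          = \frac{1 - e^{-2n}}{1 + e^{-2n}} \\
         &= \frac{2 - \left(1 + e^{-2n}\right)}{1 + e^{-2n}}
          = \frac{2}{1 + e^{-2n}} - 1 .
\end{aligned}
```

The two forms agree exactly in exact arithmetic; in floating point they can differ by small rounding errors, which is consistent with the remark that tanh may be more accurate.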

