1. Introduction to ANN

Upload: shahzad-karim-khawer

Post on 07-Apr-2018


  • 8/6/2019 1. Introduction to ANN


    Department of Computer Science University of Karachi

    An introduction to Neural Computing

    1. What is an Artificial Neural Network?

    There is no universally accepted definition of an NN. But perhaps most people in the

    field would agree that an NN is a network of many simple processors ("units"), each

    possibly having a small amount of local memory. The units are connected by

    communication channels ("connections") which usually carry numeric (as opposed to

    symbolic) data, encoded by any of various means. The units operate only on their

    local data and on the inputs they receive via the connections. The restriction to local

    operations is often relaxed during training.

    Some NNs are models of biological NNs and some are not, but historically, much of

    the inspiration for the field of NNs came from the desire to produce artificial systems

    capable of sophisticated, perhaps "intelligent", computations similar to those that the

    human brain routinely performs, and thereby possibly to enhance our understanding

    of the human brain.

    Most NNs have some sort of "training" rule whereby the weights of connections are

    adjusted on the basis of data. In other words, NNs "learn" from examples, as children

    learn to distinguish dogs from cats based on examples of dogs and cats. If trained

    carefully, NNs may exhibit some capability for generalization beyond the training

    data, that is, to produce approximately correct results for new cases that were not

    used for training.

    NNs normally have great potential for parallelism, since the computations of the

    components are largely independent of each other. Some people regard massive

    parallelism and high connectivity to be defining characteristics of NNs, but such

    requirements rule out various simple models, such as simple linear regression (a

    minimal feedforward net with only two units plus bias), which are usefully regarded

    as special cases of NNs.
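The remark about simple linear regression can be made concrete with a minimal sketch. It assumes one input unit, one linear output unit and a bias, and it uses the ordinary closed-form least-squares estimates rather than any NN training rule. The function names and the toy data are hypothetical and chosen only for illustration.

```python
# A minimal feedforward "network": one input unit, one output unit, plus a bias.
# Its output is y = w * x + b, which is exactly simple linear regression.

def linear_unit(x, w, b):
    """Output of a single linear unit with weight w and bias b."""
    return w * x + b

def fit_least_squares(xs, ys):
    """Closed-form least-squares estimates of the weight and the bias."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b

# Hypothetical toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_least_squares(xs, ys)
print(w, b)                    # estimated weight and bias
print(linear_unit(4.0, w, b))  # prediction for an input not in the data
```

The network has no hidden units and no massive parallelism, yet it fits the general description of units, weighted connections and numeric data given above.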

    According to the DARPA Neural Network Study:

    ... A neural network is a system composed of many simple processing elements

    operating in parallel whose function is determined by network structure, connection

    strengths, and the processing performed at computing elements or nodes.

    According to Haykin (1994), p. 2:

    A neural network is a massively parallel distributed processor that has a natural

    propensity for storing experiential knowledge and making it available for use. It

    resembles the brain in two respects:

    1. Knowledge is acquired by the network through a learning process.

    2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.

    By Dr. Tahseen Ahmed Jilani


    According to Nigrin (1993), p. 11:

    A neural network is a circuit composed of a very large number of simple processing

    elements that are neurally based. Each element operates only on local information.

    Furthermore each element operates asynchronously; thus there is no overall system

    clock.

    According to Zurada (1992), p. xv:

    Artificial neural systems, or NNs, are physical cellular systems which can acquire,
    store, and utilize experiential knowledge.

    2. Why use neural networks?

    Artificial neural network learning is well suited to problems in which the data
    are noisy, complex sensor data, such as inputs from cameras and microphones.
    NNs can also be used to extract patterns and detect trends that are too complex
    to be noticed by either humans or other computer techniques. They are also
    applicable to problems for which more symbolic representations are often used,
    such as decision tree tasks.

    A trained neural network can be thought of as an "expert" in the category of
    information it has been given to analyze. This expert can then be used to provide
    projections given new situations of interest and to answer "what if" questions.

    Other advantages include:

    a. Adaptive learning: An ability to learn how to do tasks based on the
    data given for training or initial experience.

    b. Self-organization: An ANN can create its own organization or
    representation of the information it receives during learning time.

    c. Real-time operation: ANN computations may be carried out in
    parallel, and special hardware devices are being designed and
    manufactured which take advantage of this capability.

    d. Fault Tolerance via Redundant Information Coding: Partial

    destruction of a network leads to the corresponding degradation of

    performance. However, some network capabilities may be retained even

    with network damage.

    e. Hybrid systems: NNs can be incorporated into hybrid systems. For

    example, a hybrid system might use an expert system to identify the

    correct NN architecture, the correct transformation of the input

    variables, and so on.

    3. Artificial Neural Network versus conventional computers

    NNs take a different approach to problem solving than that of conventional
    computers. Conventional computers use an algorithmic approach, i.e., the
    computer follows a set of instructions in order to solve a problem. Unless the
    specific steps that the computer needs to follow are known, the computer
    cannot solve the problem. That restricts the problem-solving capability of
    conventional computers to problems that we understand and know how to


    solve. But computers would be much more useful if they could do things that we
    do not exactly know how to do. NNs process information in a way similar to the
    human brain. The network is composed of a large number of highly
    interconnected processing elements (neurons) working in parallel to perform a
    specific task. NNs learn by example, so the examples must be selected carefully;
    otherwise useful time is wasted or, even worse, the network might function
    incorrectly. The disadvantage is that, because the network finds out how to
    solve the problem by itself, its operation can be unpredictable.

    On the other hand, conventional computers use a cognitive approach to
    problem solving: the way the problem is to be solved must be known and stated
    in small, unambiguous instructions. These instructions are then converted to
    a high-level language program and then into machine code that the computer can
    understand. These machines are completely predictable; if anything goes
    wrong, it is due to a software or hardware fault.

    NNs and conventional algorithmic computers are not in competition but
    complement each other. Some tasks are more suited to an algorithmic
    approach, such as arithmetic operations, and some tasks are more suited
    to NNs.

    Moreover, a large number of tasks require systems that use a combination of
    the two approaches (normally a conventional computer is used to supervise the
    NNs) in order to perform at maximum efficiency.

    4. Human and Artificial Neurons: investigating the similarities

    How does the human brain learn?

    Much is still unknown about how the brain processes
    information, so theories abound. In the human brain, a typical neuron collects
    signals from others through a host of fine structures called dendrites. The
    neuron sends out spikes of electrical activity through a long, thin strand known
    as an axon, which splits into thousands of branches. At the end of each branch,
    a structure called a synapse converts the activity from the axon into electrical
    effects that inhibit or excite activity in the connected neurons. When a neuron
    receives excitatory inputs that are sufficiently large compared with its
    inhibitory inputs, it sends a spike of electrical activity down its axon.
    Learning occurs by changing the effectiveness of the synapses so that the
    influence of one neuron on another changes.

    From Human Neurons to Artificial Neurons

    We construct these NNs by first trying to deduce the essential features of neurons and
    their interconnections. We then typically program a computer to simulate these
    features. However, because our knowledge of neurons is incomplete and our computing
    power is limited, our models are necessarily gross idealizations of real networks of
    neurons.
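One such gross idealization can be sketched as a threshold unit: the neuron fires when its excitatory inputs sufficiently outweigh its inhibitory ones. The function name, the weights and the threshold below are hypothetical values chosen for illustration and are not taken from the lecture.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= threshold else 0

# Hypothetical synapses: positive weight = excitatory, negative weight = inhibitory.
weights = [1.0, 1.0, -2.0]
threshold = 1.5

print(threshold_neuron([1, 1, 0], weights, threshold))  # both excitatory inputs active
print(threshold_neuron([1, 1, 1], weights, threshold))  # inhibitory input active as well
print(threshold_neuron([1, 0, 0], weights, threshold))  # too little excitation
```

In this picture, changing a weight plays the role of changing the effectiveness of a synapse.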


    5. Who is concerned with NNs?

    NNs are interesting for quite a lot of very different people:

    Computer scientists want to find out about the properties of non-symbolic

    information processing with neural nets and about learning systems in general.

    Statisticians use neural nets as flexible, nonlinear regression and classification

    models.

    Engineers of many kinds exploit the capabilities of NNs in many areas, such as

    signal processing and automatic control.

    Cognitive scientists view NNs as a possible apparatus to describe models of

    thinking and consciousness (high-level brain function).

    Neuro-physiologists use NNs to describe and explore medium-level brain

    function (e.g. memory, sensory system, robotics).

    Physicists use NNs to model phenomena in statistical mechanics and for a lot of

    other tasks.

    Biologists use NNs to interpret nucleotide sequences.

    Philosophers and some other people may also be interested in NNs for various

    reasons.


    [Figure: An Artificial Neuron, with labels for the dendrites, cell body, and threshold]


    6. Application areas of Artificial NNs

    1. Economic systems

    o Identification of the economic inflation process. Modeling of economy
    for restoration of the governing. Forecasting and evaluation of the main
    factors of the economy. Normative forecasting of processes in the
    macroeconomy. Prediction of share rates.

    2. Ecological systems analysis and prediction

    o Oil field forecasting. Forecasting of river flow. Temperature of air and
    soil modeling. Water quality forecasting. Winter wheat productivity
    modeling.

    o Wheat harvest as a function of different parameters. Drainage flow
    optimization.

    o Cl- and NO3- settlement modeling. Influence of natural position factors
    on harvest growth. Forecasting of pesticide destruction process in fruits.

    3. Environment systems

    o Solar activity forecasting. Pesticide type recognition for decision support.

    o Air and water pollution prediction.

    4. Medical diagnostics

    o Cancer patient diagnostics. State of human brain identification. Sleep
    stage classification. Crush injury modeling.

    5. Demographic systems

    o Birth-rate forecasting in GDR.

    6. Econometric modeling and marketing

    o Steel shipment forecasting. Cost-estimating relationships modeling.
    Forecasting of bread and drinks delivery.

    7. Manufacturing

    o Optimization and forecasting of silk production process. Forecasting of
    cement quality by MIA. Hot strip steel mill runout table cooling sprays.
    Crystallization process forecasting. Fermentation process modeling.

    8. Physical experiments

    o Optimization of parameters of nuclear weapon testing. Parameters of
    sharpening angles of cutting tools.

    9. Materials

    o Martial radiations modeling. Modeling of single-particle erosion of heat
    shields.

    o Weld strength estimation.

    10. Multisensor signal processing

    o Physical security systems.

    11. Microprocessor-based hardware

    o "Smart" ultrasonic flaw discriminator.

    12. Eddy currents

    o Automatic bolthole inspection. Recirculating and once-through steam
    generator tubing inspection.

    13. X-ray

    o X-ray image analysis for bomb detection in luggage.

    14. Acoustic and seismic analysis

    o Ocean platform detection and classification. Seismic discrimination. Oil
    and water fields detection by OSA.

    15. Military systems

    o Radar: Reentry vehicle trajectory prediction. Radar imagery target
    classification. Detection and identification of tactical targets. Radar pulse
    classification.

    o Infrared: Target acquisition and aim-point selection. LANDSAT scene
    classification.

    o Ultrasonics and acoustic emission: Ultrasonic imaging. Feedwater nozzle
    detection. Turbine rotor inspection. Ultrasonic pipe inspection.
    Monitoring of crack-growth activity.

    o Missile guidance: Equation of gyroscope systems dynamics identification
    using sorting of difference equations. Air-to-air guidance law synthesis.

    Image Processing and Computer Vision

    Including image matching, preprocessing, segmentation and analysis, computer vision
    (e.g., circuit board inspection), image compression, stereo vision, and processing and
    understanding of time-varying images.

    Signal Processing

    Including seismic signal analysis and morphology.

    Pattern Recognition

    Including feature extraction, radar signal classification and analysis, speech

    recognition and understanding, fingerprint identification, character (letter or

    number) recognition, and handwriting analysis.

    Medical

    Including electrocardiograph signal analysis and understanding, diagnosis of many

    diseases, and medical image processing.

    Financial Systems

    Including stock market analysis, real estate appraisal, credit card authorization, and

    securities trading

    Planning, Control and Search

    Including parallel implementation of constraint satisfaction problems (CSPs),
    solutions to the traveling salesman problem, and control and robotics.

    Power Systems

    Including system state estimation, transient detection and classification, fault detection

    and recovery, load forecasting, and security assessment.

    Human Factors

    Interfacing


    7. Learning/ Training of Artificial NNs

    The two main kinds of learning algorithms are supervised and unsupervised.

    Supervised Learning

    In supervised learning, the correct results (target values, desired outputs) are

    known and are given to the NN during training so that the NN can adjust its

    weights to try matching its outputs to the target values. After training, the NN is

    tested by giving it only input values, not target values, and seeing how close it

    comes to outputting the correct target values.
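The train-then-test procedure described above can be sketched with a single linear unit. This sketch assumes the delta (least-mean-squares) rule as the weight-adjustment method, which the text does not name; the learning rate, epoch count and data are hypothetical values picked for illustration.

```python
def train_linear_unit(inputs, targets, lr=0.05, epochs=2000):
    """Adjust weight w and bias b so that the outputs match the known targets."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            error = t - (w * x + b)   # target value minus current output
            w += lr * error * x       # delta rule update for the weight
            b += lr * error           # delta rule update for the bias
    return w, b

# Hypothetical training set drawn from t = 3x - 1: inputs with known targets.
train_x = [0.0, 1.0, 2.0, 3.0]
train_t = [-1.0, 2.0, 5.0, 8.0]
w, b = train_linear_unit(train_x, train_t)

# Testing: present inputs only, then compare the outputs with the held-back targets.
test_x = [4.0, 5.0]
test_t = [11.0, 14.0]
test_outputs = [w * x + b for x in test_x]
max_error = max(abs(o - t) for o, t in zip(test_outputs, test_t))
print(test_outputs, max_error)
```

The targets are used only inside the training loop; at test time the unit sees inputs alone, which is the distinction the paragraph draws.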

    Unsupervised Learning

    In unsupervised learning, the NN is not provided with the correct results during
    training. Unsupervised NNs usually perform some kind of data compression, such
    as dimensionality reduction (e.g., principal component analysis, PCA, or factor
    analysis, FA) or clustering (e.g., cluster analysis, CA).
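A small sketch of unsupervised dimensionality reduction, assuming Oja's rule for a single linear unit as the learning method. This is one possible choice made here for illustration, not a method prescribed by the passage. The data, learning rate and initial weights are hypothetical.

```python
import math

def oja_train(data, w, lr=0.05, epochs=200):
    """Oja's rule: w <- w + lr * y * (x - y * w), where y = w . x."""
    w = list(w)
    for _ in range(epochs):
        for x in data:
            y = sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Hypothetical zero-mean 2-D data lying along the direction (1, 2).
data = [(-1.0, -2.0), (-0.5, -1.0), (0.5, 1.0), (1.0, 2.0)]
w = oja_train(data, w=(1.0, 0.0))

# No target values were supplied; the unit found the dominant direction itself.
print(w)  # approaches the unit vector (1, 2) / sqrt(5)

# Each 2-D point is compressed to a single number, its projection onto w.
codes = [sum(wi * xi for wi, xi in zip(w, x)) for x in data]
print(codes)
```

For data like these, the learned weight vector plays the role of the first principal component.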

    The distinction between supervised and unsupervised methods is not always clear-cut.

    An unsupervised method can learn a summary of a probability distribution, then that

    summarized distribution can be used to make predictions. Furthermore, supervised

    methods come in two subvarieties: auto-associative and hetero-associative. In auto-

    associative learning, the target values are the same as the inputs, whereas in hetero-

    associative learning, the targets are generally different from the inputs. Many

    unsupervised methods are equivalent to auto-associative supervised methods.


    8. Kinds of Network topology

    Feedforward NNs:

    In a feedforward NN, the connections between units do not form cycles
    (feeding back to inputs). Feedforward NNs usually produce a response to an
    input quickly. Most feedforward NNs can be trained using a wide variety of
    efficient conventional numerical methods.

    Feedback NNs:

    In a feedback or recurrent NN, there are cycles in the connections. In some
    feedback NNs, each time an input is presented, the NN must iterate for a
    potentially long time before it produces a response. Feedback NNs are usually
    more difficult to train than feedforward NNs.
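The difference in how a response is produced can be sketched as follows: a feedforward unit answers after one pass, while a feedback network is iterated until its state stops changing. The feedback example assumes a small Hopfield-style network with Hebbian weights; the stored pattern, the iteration cap and the function names are hypothetical choices made for illustration.

```python
def feedforward(x, w, b):
    """No cycles: the response is available after a single pass."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def hopfield_recall(state, weights, max_iters=100):
    """Cycles: units are updated repeatedly until the state stops changing."""
    state = list(state)
    for iteration in range(1, max_iters + 1):
        previous = list(state)
        for i in range(len(state)):  # asynchronous, unit-by-unit update
            net = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if net >= 0 else -1
        if state == previous:        # a stable state has been reached
            return state, iteration
    return state, max_iters

# Hypothetical stored pattern; Hebbian weights w_ij = p_i * p_j, no self-connections.
pattern = [1, -1, 1, -1]
n = len(pattern)
weights = [[0 if i == j else pattern[i] * pattern[j] for j in range(n)]
           for i in range(n)]

noisy = [1, 1, 1, -1]  # the stored pattern with one unit flipped
recalled, iterations = hopfield_recall(noisy, weights)
print(recalled, iterations)
print(feedforward([1.0, 2.0], [0.5, -0.25], 0.1))
```

Here the recurrent network needs more than one sweep before it can report a settled answer, whereas the feedforward unit never iterates.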


    9. Some well-known types of NNs

    1. Supervised

    a. Feedforward

    Linear NNs, Multilayer perceptron, RBF networks - Bishop (1995),
    Moody and Darken (1989), Orr (1996), CMAC: Cerebellar Model
    Articulation Controller, Classification only, Regression only

    b. Feedback

    BAM: Bidirectional Associative Memory, Boltzmann Machine,
    Recurrent time series

    c. Competitive

    ARTMAP Carpenter, Fuzzy ARTMAP, Gaussian ARTMAP,

    Counterpropagation, Neocognitron

    2. Unsupervised

    a. Competitive

    Vector Quantization, Self-Organizing Map, Adaptive resonance

    theory, DCL: Differential Competitive Learning - Kosko (1992)

    b. Dimension Reduction

    Hebbian, Oja, Sanger, Differential Hebbian

    c. Autoassociation

    Linear autoassociator, BSB: Brain State in a Box, Hopfield

    3. Nonlearning

    Hopfield

    Various networks for optimization


    10. Network Layers:

    The commonest type of artificial neural networks consists of three groups, or layers,

    of units: a layer of input units is connected to a layer of hidden units, which is

    connected to a layer of output units.

    The activity of the input units represents the raw information that is fed into
    the network.

    The activity of each hidden unit is determined by the activities of the input
    units and the weights on the connections between the input and the hidden units.

    The behavior of the output units depends on the activity of the hidden units and
    the weights between the hidden and output units.

    This simple type of network is interesting because the hidden units are free to
    construct their own representations of the input. The weights between the input
    and the hidden units determine when each hidden unit is active, and so by
    modifying these weights, a hidden unit can choose what it represents.

    We also distinguish single-layer and multi-layer architectures. The single-layer

    organization, in which all units are connected to one another, constitutes the most

    general case and is of more potential computational power than hierarchically

    structured multi-layer organizations. In multi-layer networks, units are often

    numbered by layer, instead of following a global numbering. Two or more neurons

    can be combined in a layer, and a particular network could contain one or more such

    layers.

    A Layer of Neurons

    In this network, each element of the input vector is connected to each neuron input
    through the weight matrix W. The ith neuron has a summer that gathers its weighted
    inputs and bias to form its own scalar output n(i). The various n(i) taken together
    form an S-element net input vector n. Finally, the neuron layer outputs form a column
    vector a.


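    [Figure: Layer of neurons. The input vector p (elements p1, p2, p3, ..., pR) is connected
    through the weights W(1,1) ... W(S,R) to neurons with biases b1, b2, b3, net inputs
    n1, n2, n3, transfer function f, and outputs a1, a2, a3, so that a = f(Wp + b).
    R = number of elements in input vector. S = number of neurons in layer.]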
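The layer computation a = f(Wp + b) can be written out directly. The following is a minimal sketch in plain Python; the layer sizes, the numeric values and the choice of tanh as the transfer function f are assumptions made only for illustration.

```python
import math

def layer_output(W, p, b, f):
    """Compute a = f(W p + b) for one layer of S neurons with R inputs."""
    # Net input n(i): weighted inputs of neuron i plus its bias.
    n = [sum(w_ij * p_j for w_ij, p_j in zip(row, p)) + b_i
         for row, b_i in zip(W, b)]
    # Output vector a: the transfer function applied to each net input.
    return [f(n_i) for n_i in n]

# Hypothetical layer with R = 3 inputs and S = 2 neurons (values made up).
W = [[0.5, -1.0, 0.0],
     [1.0,  1.0, 1.0]]   # S x R weight matrix
b = [1.5, -3.0]          # one bias per neuron
p = [1.0, 1.0, 1.0]      # input vector with R elements

a = layer_output(W, p, b, math.tanh)
print(a)                 # S-element output vector
```

Each row of W holds the weights of one neuron, so the number of rows is S and the number of columns is R.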


    Multiple Layers of neurons:

    A network can have several layers. Each layer has a weight matrix W, a bias vector
    b, and an output vector a. The layer number is shown as a superscript on the variable of
    interest. In a three-layer network, for example, the outputs of each
    intermediate layer are the inputs to the following layer. The layers of a multi-layer
    network play different roles. A layer that produces the network output is called an
    output layer. All other layers are called hidden layers.
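Chaining layers can be sketched as a loop in which each layer's output vector becomes the next layer's input. The three-layer network below is hypothetical: its sizes, weights and transfer functions are invented for illustration.

```python
import math

def layer(W, b, f, inputs):
    """One layer: a = f(W * inputs + b)."""
    return [f(sum(w * x for w, x in zip(row, inputs)) + b_i)
            for row, b_i in zip(W, b)]

def forward(network, p):
    """Feed p through every layer; each output is the next layer's input."""
    a = p
    for W, b, f in network:
        a = layer(W, b, f, a)
    return a

def linear(n):
    """Linear transfer function: the output equals the net input."""
    return n

# Hypothetical three-layer network with sizes 2 -> 3 -> 2 -> 1 (weights made up).
network = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0], math.tanh),  # hidden layer
    ([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]], [0.0, 0.0], math.tanh),          # hidden layer
    ([[1.0, 1.0]], [0.5], linear),                                         # output layer
]

output = forward(network, [1.0, 1.0])
print(output)
```

Only the last entry of the list produces the network output; the first two are hidden layers in the sense defined above.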


    11. Transfer Function:

    It is common practice in statistical regression modeling for the
    experimenter to model a population using sample data points. Modeling
    sample data with regression analysis therefore requires parameter
    estimation by some procedure.

    But, ANNs are adaptive systems with the power of a universal computer,

    i.e. they can realize an arbitrary mapping (association) of one vector space

    (inputs) to the other vector space (outputs). They differ in many respects,

    one of the most important characteristics being the transfer functions

    performed by each neuron. The choice of transfer functions in neural

    networks is of crucial importance to their performance. There is a growing

    understanding that the choice of transfer functions is at least as important

    as the network architecture and learning algorithm. Neural networks are

    used either to approximate a posteriori probabilities for classification or to

    approximate the probability densities of the training data.

    Viewing the problem of learning from a geometrical point of view, the
    purpose of the transfer functions performed by the neural network nodes is
    to enable the proper use of the parameter space in the most flexible way
    using the lowest number of adaptive parameters.

    The activation and the output functions of the input and the output layers may
    be of a different type than those of the hidden layers; in particular, linear
    functions are frequently used for inputs and outputs and non-linear transfer
    functions for hidden layers.

    The behavior of an Artificial Neural Network depends highly on the transfer function
    used. There are a number of transfer functions, which typically fall into one of the
    following three categories.

    1. Linear (or ramp)

    2. Threshold

    3. Sigmoid

    Some of the transfer functions included in MATLAB are,

    1. hardlim (Hard limit transfer function)

    2. poslin (Positive Linear transfer function)

    3. purelin (Linear transfer function)

    4. tansig (Hyperbolic tangent transfer function)


    hardlim transfer function:

    The hardlim transfer function forces a neuron to output a 1 if its net input
    reaches a threshold; otherwise it outputs 0. This allows a neuron to make a

    decision or classification. It can say yes or no. This kind of neuron is often

    trained with the perceptron learning rule.
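A minimal sketch of the perceptron learning rule applied to a hard-limit neuron. It is written in Python, not MATLAB, and it assumes the common form of the rule (add the error times the input to the weights, add the error to the bias) and a threshold at zero. The logical AND data set and the epoch count are choices made here for illustration.

```python
def hardlim(n):
    """Hard limit: 1 if the net input is at least 0, otherwise 0."""
    return 1 if n >= 0 else 0

def train_perceptron(samples, epochs=20):
    """Perceptron rule: w <- w + e * p and b <- b + e, with e = target - output."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for p, target in samples:
            a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
            e = target - a
            w = [wi + e * pi for wi, pi in zip(w, p)]
            b += e
    return w, b

# Logical AND as a yes/no decision (a linearly separable toy problem).
samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(samples)
outputs = [hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b) for p, _ in samples]
print(w, b, outputs)
```

After training, the neuron answers "yes" only for the input (1, 1), which is the kind of decision the hard-limit output makes possible.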

    poslin transfer function:

    The transfer function poslin returns the output n if n is greater than or
    equal to zero and 0 if n is less than zero.

    purelin transfer function:

    Transfer functions of this type are used in linear filters.

    tansig transfer function:


    [Figure: plots of the transfer functions, each showing output a (from -1 to +1) against net input n]


    tansig is named after the hyperbolic tangent, which has the same shape.

    However, tanh may be more accurate and is recommended for applications that

    require the hyperbolic tangent. tansig(n) calculates its output a according to:

    a = 2 / (1 + exp(-2 * n)) - 1

    This is mathematically equivalent to tanh(n).
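The four transfer functions discussed in this section can be sketched in a few lines. These are plain Python re-implementations written for illustration, not the MATLAB functions themselves, and they assume scalar inputs and a hard-limit threshold at zero.

```python
import math

def hardlim(n):
    """Hard limit: 1 if n >= 0, otherwise 0."""
    return 1.0 if n >= 0 else 0.0

def poslin(n):
    """Positive linear: n if n >= 0, otherwise 0."""
    return n if n >= 0 else 0.0

def purelin(n):
    """Linear: the output equals the net input."""
    return n

def tansig(n):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2n)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

# Tabulate all four functions next to math.tanh for a few net inputs.
for n in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(n, hardlim(n), poslin(n), purelin(n), tansig(n), math.tanh(n))
```

The last two columns of the printed table agree up to floating-point rounding, which illustrates the stated equivalence between the tansig formula and the hyperbolic tangent.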
