
AI Applications to Power Systems

ARTIFICIAL NEURAL NETWORK

Concepts of Neural Network – Multi-layer feed forward networks – Back propagation algorithms – Radial basis function and recurrent networks

Introduction

The neural network of an animal is part of its nervous system, containing a large number of interconnected neurons (nerve cells). "Neural" is an adjective for neuron, and "network" denotes a graph-like structure. Artificial neural networks refer to computing systems whose central theme is borrowed from the analogy of biological neural networks. Bowing to common practice, we omit the prefix "artificial." There is potential for confusing the (artificial) poor imitation with the (biological) real thing; in this text, non-biological words and names are used as far as possible.

Artificial neural networks are also referred to as "neural nets," "artificial neural systems," "parallel distributed processing systems," and "connectionist systems." For a computing system to be called by these names, it must have a labeled directed graph structure in which nodes perform simple computations. From elementary graph theory we recall that a "directed graph" consists of a set of "nodes" (vertices) and a set of "connections" (edges/links/arcs) connecting pairs of nodes. A graph is a "labeled graph" if each connection is associated with a label identifying some property of the connection. In a neural network, each node performs some simple computations, and each connection conveys a signal from one node to another, labeled by a number called the "connection strength" or "weight," indicating the extent to which a signal is amplified or diminished by the connection.

Properties of neural networks

The use of neural networks offers the following useful properties and capabilities:

1. Nonlinearity. A neural network, made up of an interconnection of nonlinear neurons, is itself nonlinear. Moreover, the nonlinearity is of a special kind in the sense that it is distributed throughout the network. Most real systems, including power systems, are nonlinear, so this property is very desirable for applications in power systems.

2. Input-Output Mapping. A popular paradigm of learning called learning with a teacher, or supervised learning, involves modification of the synaptic weights of a neural network by applying a set of labeled training samples or task examples. Each example consists of a unique input signal and a corresponding desired response. The network learns from the examples by constructing an input-output mapping for the problem. In power system voltage security analysis, the traditional approaches which are widely used can be used to generate those training samples.

3. Adaptivity. Neural networks have a built-in capability to adapt their synaptic weights to changes in the surrounding environment. In particular, a neural network trained to operate in a specific environment can be easily retrained to deal with minor changes in the operating environmental conditions. Moreover, when it is operating in a nonstationary environment, a neural network can be designed to change its synaptic weights in real time.

4. Fault tolerance. A neural network has the potential to be inherently fault tolerant, in the sense that its performance degrades gracefully under missing or erroneous data. Because the information is distributed throughout the network, errors must be extensive before catastrophic failure occurs.

Biological neurons

A typical biological neuron is composed of a cell body, a tubular axon, and a multitude of hair-like dendrites, as shown in the figure. The dendrites form a very fine filamentary brush surrounding the body of the neuron. The axon is essentially a long, thin tube that splits into branches terminating in little end bulbs that almost touch the dendrites of other cells. The small gap between an end bulb and a dendrite is called a synapse, across which information is propagated. The axon of a single neuron forms synaptic connections with many other neurons; the presynaptic side of the synapse refers to the neuron that sends a signal, while the postsynaptic side refers to the neuron that receives the signal. However, the real picture of neurons is a little more complicated:
1. A neuron may have no obvious axon, but only "processes" that receive and transmit information.
2. Axons may form synapses on other axons.
3. Dendrites may form synapses onto other dendrites.
The number of synapses received by each neuron ranges from 100 to 100,000. Morphologically, most synaptic contacts are of two types.

Type I: Excitatory synapses with asymmetrical membrane specializations; membrane thickening is greater on the postsynaptic side. The presynaptic side contains round bags (synaptic vesicles) believed to contain packets of a neurotransmitter (a chemical such as glutamate or aspartate).

Type II: Inhibitory synapses with symmetrical membrane specializations and smaller ellipsoidal or flattened vesicles. Gamma-aminobutyric acid (GABA) is an example of an inhibitory neurotransmitter.


Fig. Biological neurons

Basic Concepts

Artificial neural networks (ANN) have been used for two main tasks: 1) function approximation and 2) classification problems. Neural networks offer a general framework for representing non-linear mappings.

Artificial Neuron Structure

The human nervous system, built of cells called neurons, is of staggering complexity: an estimated 10^11 interconnections over transmission paths that may extend for a meter or more. Each neuron shares many characteristics with the other cells in the body, but has unique capabilities to receive, process and transmit electrochemical signals over the neural pathways that comprise the brain's communication system. The figure shows the structure of typical biological neurons. A biological neuron basically consists of three main components: the cell body, dendrites and the axon. Dendrites extend from the cell body to other neurons, where they receive signals at a connection point called a synapse. On the receiving side of the synapse, these inputs are conducted to the cell body, where they are summed. Some inputs tend to excite the cell, causing a reduction in the potential across the cell membrane; others tend to inhibit its firing, causing an increase in the polarization of the receiving nerve cell. When the cumulative excitation in the cell body exceeds a threshold, the cell fires: an action potential is generated and propagates down the axon towards the synaptic junctions with other nerve cells.

Neural Network

A neural network (NN) is an abstract computer model of the human brain. The human brain has an estimated 10^11 tiny units called neurons, interconnected by an estimated 10^15 links. Although more research needs to be done, the neural network of the brain is considered to be the fundamental functional source of intelligence, which includes perception, cognition, and learning for humans as well as other living creatures. Similar to the brain, a neural network is composed of artificial neurons (or units) and interconnections. When we view such a network as a graph, neurons can be represented as nodes (or vertices) and interconnections as edges.


Although the term "neural networks" (NNs) is most commonly used, other names include artificial neural networks (ANNs), to distinguish them from the brain's natural neural networks; neural nets; PDP (Parallel Distributed Processing) models, since computation is typically performed in a parallel and distributed fashion; connectionist models; and adaptive systems.

Neuron

Artificial neural networks (ANNs) are software or hardware systems designed to simulate the operation of a simple biological nervous system.

The basic element of the brain is a natural neuron; similarly, the basic element of every neural network is an artificial neuron, or simply neuron. That is, a neuron is the basic building block for all types of neural networks.

A Typical ANN Structure

ANNs are collections of interconnected entities named "neurons." Similarly to the biological model, each neuron has many inputs (the dendrites) but a single output (the axon). Some inputs have an excitatory effect on the axon while others have an inhibitory effect. The activity of the ANN is the combined effect of the operation of its constituent neurons.


Fig. ANN Structure

From the figure above, we see that the neuron has a cell body, called the soma, with many dendrites and a long axon. The axon is connected to the dendrites of other neurons, and its own dendrites are connected to the axons of neighbouring neurons. The neuron receives information, in the form of electric currents, from other neurons on its dendrites. The information is "processed" and the neuron "fires," passing its result in the form of current to other neurons through its axon.

A Simple Neuron Model (The Perceptron)

A perceptron has analogue inputs but a binary output. Each input has an associated weight. Positive weights correspond to excitatory inputs and negative weights to inhibitory inputs.

Fig. Neuron Model

Description of a neuron:

\mathrm{out} = f(\mathrm{net}), \qquad \mathrm{net} = \sum_{i=1}^{n} w_i \, x_i, \qquad f(\mathrm{net}) = \begin{cases} 1, & \mathrm{net} > \text{threshold} \\ 0, & \mathrm{net} \le \text{threshold} \end{cases}


A neuron is an abstract model of a natural neuron, as illustrated in the figures. We have inputs x1, x2, ..., xm coming into the neuron. These inputs are the stimulation levels of a natural neuron. Each input xi is multiplied by its corresponding weight wi, and the product xi·wi is fed into the body of the neuron. The weights represent the biological synaptic strengths in a natural neuron. The neuron adds up all the products for i = 1, ..., m. The weighted sum of the products is usually denoted as net in the neural network literature, so we will use this notation. That is, the neuron evaluates net = x1w1 + x2w2 + ... + xmwm. In mathematical terms, given two vectors x = (x1, x2, ..., xm) and w = (w1, w2, ..., wm), net is the dot (or scalar) product of the two vectors, x·w = x1w1 + x2w2 + ... + xmwm. Finally, the neuron computes its output y as a certain function of net, i.e., y = f(net). This function is called the activation (or sometimes transfer) function. We can think of a neuron as a sort of black box, receiving an input vector x and producing a scalar output y. The same output value y can be sent out through multiple edges emerging from the neuron.
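As an illustration of this computation, the short Python sketch below implements a single neuron of the kind just described. The weight values, input vector, and step threshold are arbitrary choices for the example, not values taken from the text.

```python
# A minimal sketch of the neuron described above: net = w . x, y = f(net).
# The weights, inputs, and threshold below are illustrative values only.

def step(net, threshold=0.0):
    """Hard-limiting activation: 1 if net exceeds the threshold, else 0."""
    return 1 if net > threshold else 0

def neuron_output(x, w, activation=step):
    """Compute the weighted sum (dot product) and apply the activation."""
    net = sum(xi * wi for xi, wi in zip(x, w))
    return activation(net)

x = [0.5, -1.0, 2.0]   # input vector x1..xm
w = [0.4,  0.3, 0.6]   # corresponding weights w1..wm
print(neuron_output(x, w))   # prints 1, since net = 0.5*0.4 - 1.0*0.3 + 2.0*0.6 = 1.1 > 0
```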

Fig. (a) A neuron model that retains the image of a natural neuron. (b) A further abstraction of (a).

Back Propagation Network (BPN)

It is a multilayer feed-forward network that uses an extended, gradient-descent based delta learning rule, commonly known as the back-propagation (of errors) rule.

Fig. Structure of biological neuron

The artificial neuron was designed to mimic the first-order characteristics of the biological neuron. McCulloch and Pitts suggested the first synthetic neuron in the early 1940s. In essence, a set of inputs is applied, each representing the output of another neuron. Each input is multiplied by a corresponding weight, analogous to a synaptic strength, and all of the weighted inputs are then summed to determine the activation level of the neuron. If this activation exceeds a certain threshold, the unit produces an output response. This functionality is captured in the artificial neuron known as the threshold logic unit (TLU), originally proposed by McCulloch and Pitts.

Fig. Artificial neuron structure (perceptron model)

The figure shows a model that implements this idea. Despite the diversity of network paradigms, nearly all are based upon this neuron configuration. Here a set of inputs labeled X1, X2, ..., Xn is applied from the input space to the artificial neuron. These inputs, collectively referred to as the input vector X, correspond to the signals arriving at the synapses of a biological neuron. Each signal is multiplied by an associated weight W1, W2, ..., Wn before it is applied to the summation block. The activation a is given by

This may be represented more compactly as

the output y is then given by y = f(a), where f is an activation function. In the McCulloch-Pitts perceptron model, a hard limiter was used as the activation function, defined as:
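The expressions referenced in the preceding lines appear to have been lost in conversion. For the threshold logic unit described here they presumably take the standard form (s denotes the threshold mentioned just below):

a = \sum_{i=1}^{n} W_i X_i, \qquad \text{or, more compactly,} \qquad a = \mathbf{W} \cdot \mathbf{X}, \qquad f(a) = \begin{cases} 1, & a > s \\ 0, & a \le s \end{cases}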

The threshold s will often be zero. The activation function is sometimes called a step function. Researchers have also tried other non-linear activation functions, such as the sigmoid and the Gaussian; the neuron responses for different activation functions are shown in Fig. 3.3.

Network Architectures

Network architectures can be categorized into three main types: feedforward networks, recurrent networks (feedback networks) and self-organizing networks. This classification of networks was proposed by Kohonen [1990]. A network is feedforward if all of the hidden and output neurons receive inputs from the preceding layer only. The input is presented to the input layer and is propagated forwards through the network; the output never forms a part of its own input. A recurrent network has at least one feedback loop, i.e., a cyclic connection, which means that at least one of its neurons feeds its signal back to the inputs of all the other neurons. The behavior of such networks may be extremely complex.

Haykin divides networks into four classes [Haykin, 1994]: 1) single-layer feedforward networks, 2) multilayer feedforward networks, 3) recurrent networks, and 4) lattice structures. A lattice network is a feedforward network, which has output neurons arranged in rows and columns.

Layered networks are said to be fully connected if every node in each layer is connected to all the nodes of the following layer. If any of the connections is missing, the network is said to be partially connected. Partially connected networks can be formed if some prior information about the problem is available and this information supports the use of such a structure. The following treatment of networks applies mainly to feedforward networks (single-layer networks, MLP, RBF, etc.). The designation n-layer network refers to the number of layers of computational nodes, or equivalently the number of layers of weight connections; thus the input node layer is not taken into account.

Feed-forward networks

Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (no loops); i.e., the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organisation is also referred to as bottom-up or top-down.

Fig.3.7 An example of a simple feedforward network


Feedback networks

Feedback networks can have signals travelling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' is changing continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organisations.

Multilayered/non-multilayered - Topology of the network architecture

(i) Multilayered The back propagation model is multilayered since it has distinct layers such as input, hidden, and output. The neurons within each layer are connected with the neurons of the adjacent layers through directed edges. There are no connections among the neurons within the same layer.

(ii) Non-multilayered We can also build neural networks without such distinct layers as input, output, or hidden. Every neuron can be connected with every other neuron in the network through directed edges, and every neuron may serve as both input and output. A typical example is the Hopfield model.

Non-recurrent/recurrent - Directions of output


(i) Non-recurrent (feedforward only) In the backpropagation model, the outputs always propagate from left to right in the diagrams. This type of output propagation is called feedforward. In this type, outputs from the input layer neurons propagate to the right, becoming inputs to the hidden layer neurons, and then outputs from the hidden layer neurons propagate to the right becoming inputs to the output layer neurons. Neural network models with feedforward only are called non-recurrent. Incidentally, "backpropagation" in the backpropagation model should not be confused with feedbackward. The backpropagation is backward adjustments of the weights, not output movements from neurons.

(ii) Recurrent (both feedforward and feedbackward) In some other neural network models, outputs can also propagate backward, i.e., from right to left. This is called feedbackward. A neural network in which the outputs can propagate in both directions, forward and backward, is called a recurrent model. Biological systems have such recurrent structures. A feedback system can be represented by an equivalent feedforward system.

Single-layer Feed forward Networks

A single-layer feed forward network represents the simplest form of neural network. In such a network there are only two layers, an input layer and an output layer. The phrase "single layer" refers to the output layer of neurons (computation nodes); the input layer is not counted as a layer because no computation is done in it. Each input is multiplied by a weight denoted by W; for instance, the input X1 is multiplied by the weight W1, and the same is done for the rest of the inputs. A weight vector comprising all the weights is thus formed. The results of multiplying the inputs by the weights are fed to the summer, where addition is executed. The output of the summer is then fed to the linear threshold unit: if it is above the threshold level, an output of '1' is produced; otherwise, an output of '0' occurs. The data can be presented to the network in binary form (e.g., '1' and '0') or in bipolar form (e.g., '1' and '-1'). The figure illustrates the block diagram of a single-layer feed forward network.

Fig. Block Diagram of Single Layer Feed forward Network

The simplest choice of neural network is the following weighted sum
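The weighted-sum expression itself appears to have been dropped during extraction; judging from the explanation that follows, it is presumably

y(\mathbf{x}) = \sum_{i=1}^{d} w_i x_i + w_0 = \sum_{i=0}^{d} w_i x_i ,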


where d is the dimension of the input space, x0 = 1 and w0 is the bias parameter. The input vector x can be considered as a set of activations of the input layer. In classification problems y(x) is called a discriminant function, because y(x) = 0 can be interpreted as a decision boundary. The weight vector w determines the orientation of the decision plane and the bias parameter w0 determines its distance from the origin. In regression problems the use of this kind of network is limited; only (d-1)-dimensional hyperplanes can be modeled. An example of a single-layer network is the linear associative memory, which associates an output vector with an input vector.
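The associative-memory mapping referenced here was presumably of the form (one such sum for each output unit k)

y_k(\mathbf{x}) = \sum_{i=0}^{d} w_{ki} x_i ,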

where again x0 = 1 and wk0 is the bias parameter. The connection from input i to output k is weighted by the weight parameter wki.

Figure The simplest neural networks. Computation is done in the second layer of nodes.

Functions of this form can be generalized by using a (monotonic) linear or nonlinear activation function which acts on the weighted sum as
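The generalized expression referred to here is presumably

y_k(\mathbf{x}) = g\!\left( \sum_{i=0}^{d} w_{ki} x_i \right) ,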

where g(v) is usually chosen to be a threshold function, a piecewise linear function, the logistic sigmoid, or the hyperbolic tangent (tanh). The first neuron model was of this type and was proposed as early as the 1940s by McCulloch and Pitts.

Threshold function (step function):


Piecewise linear function (pseudolinear)

Logistic sigmoid
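The definitions referenced by the three headings above were evidently figures in the original; the usual textbook forms, which the source presumably intended, are:

g(v) = \begin{cases} 1, & v \ge 0 \\ 0, & v < 0 \end{cases} \qquad \text{(threshold / step function)}

g(v) = \begin{cases} 1, & v \ge \tfrac{1}{2} \\ v + \tfrac{1}{2}, & -\tfrac{1}{2} < v < \tfrac{1}{2} \\ 0, & v \le -\tfrac{1}{2} \end{cases} \qquad \text{(piecewise linear)}

g(v) = \frac{1}{1 + e^{-v}} \qquad \text{(logistic sigmoid)}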

Multilayer Feed forward Network

The clear distinction between a single-layer and a multilayer feed forward network is the introduction of hidden units. In a single-layer network there is an input layer of source nodes and an output layer of neurons. A multilayer network has, in addition, one or more hidden layers of hidden neurons. Standard three-layer feed-forward networks are widely used.

The objective of the hidden units is to intervene between the input and output layers, enabling the network to extract higher-order statistics. The figure illustrates the architecture of the multilayer feed forward network. The data processing between the input layer and the summer is similar to that of the single-layer feed forward network. Apart from forming a weight vector between the input and hidden layers, another weight vector between the hidden units and the output layer must be formed. A representative feed-forward neural network has a three-layer structure: input layer, output layer and hidden layer. Each layer is composed of a variable number of nodes. The number of nodes in the hidden layers is selected to make the network more efficient and to interpret the data more accurately. The relationship between the input and output can be non-linear or linear, and its characteristics are determined by the weights assigned to the connections between the nodes in two adjacent layers. Changing the weights will change the input-to-output behavior of the network.

Fig. A fully connected feed-forward network with one hidden layer and one output layer


Figure 3.10 The multilayer perceptron network

The summing junction of hidden unit j forms a weighted linear combination of the inputs, where wji is a weight in the first layer (from input unit i to hidden unit j) and wj0 is the bias for hidden unit j. The activation (output) of hidden unit j is then obtained by passing this sum through an activation function, and the output of the whole network is constructed in the same way from the hidden-unit activations. The two-layer multilayer perceptron shown in the figure can thus be represented as a single function by combining these expressions, as written out below.
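The combined expressions referenced just above were evidently images in the original. For a two-layer MLP of the kind described they presumably read (M denotes the number of hidden units and \tilde{g} the output-layer activation function; this notation is introduced here, not taken from the source):

a_j = \sum_{i=0}^{d} w_{ji} x_i \;\; (x_0 = 1), \qquad z_j = g(a_j), \qquad y_k = \tilde{g}\!\left( \sum_{j=0}^{M} w_{kj} z_j \right) \;\; (z_0 = 1),

so that, combined,

y_k(\mathbf{x}) = \tilde{g}\!\left( \sum_{j=0}^{M} w_{kj}\, g\!\left( \sum_{i=0}^{d} w_{ji} x_i \right) \right) .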

The activation function for the output unit can be linear; in that case the network becomes a special case of the single-layer model above, with the hidden-unit outputs acting as basis functions.

If the activation functions in the hidden layer are linear, then such a network can be converted into an equivalent network without hidden units, because two successive linear transformations collapse into a single linear transformation. For this reason, networks with non-linear hidden-unit activation functions are preferred.


Note: In the literature the equation for the output of a neuron can also be seen written as follows:
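The equation itself appears to have been dropped; it is presumably

y_k = g\!\left( \sum_{i=1}^{d} w_{ki} x_i - \theta_k \right) ,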

where θk is the bias (threshold) parameter. Mathematically this is equivalent to the former equations, where the bias was included in the summation: we can always set wk0 = θk and x0 = -1 (the sign '-' can, of course, be included in the weight parameter rather than in the input). MLPs are mainly used for functional approximation rather than classification problems. They are generally unsuitable for modeling functions with significant local variations. The universal approximation theorem states that an MLP can approximate any continuous function arbitrarily well, although it does not give an indication of the complexity of the MLP. The Vapnik-Chervonenkis dimension dVC gives a rough approximation of the complexity. According to this principle, the amount of training data should be approximately ten times dVC, or the number of weights in the MLP. MLPs are suitable for high-dimensional function approximation if the desired function can be approximated by a low number of ridge functions (MLPs employ ridge functions in the hidden layer). They may perform well even when the training data have redundant inputs.

Back-Propagation Algorithm

Multilayer perceptrons have been applied successfully to solve some difficult and diverse problems by training them in a supervised manner with a highly popular algorithm known as the error back-propagation algorithm. This algorithm is based on the error-correction learning rule and may be viewed as a generalization of an equally popular adaptive filtering algorithm, the least mean square (LMS) algorithm. Error back-propagation learning consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an input vector is applied to the nodes of the network, and its effect propagates through the network layer by layer; finally, a set of outputs is produced as the actual response of the network. During the forward pass the weights of the network are all fixed. During the backward pass, the weights are all adjusted in accordance with an error-correction rule: the actual response of the network is subtracted from a desired response to produce an error signal, this error signal is then propagated backward through the network, against the direction of the synaptic connections, and the weights are adjusted to make the actual response of the network move closer to the desired response.
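To make the two passes concrete, here is a minimal NumPy sketch of one training loop for a two-layer network with sigmoid units and a squared-error cost. The network sizes, learning rate, and data are arbitrary illustrative values; this is a bare-bones version of the error back-propagation idea described above, not the exact procedure worked through later in this unit.

```python
import numpy as np

# Minimal sketch: forward pass and backward pass (gradient descent)
# for a 2-input, 3-hidden, 1-output network with sigmoid units.
# All sizes, data, and the learning rate eta are illustrative only.

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(3, 2))   # input -> hidden weights
b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(1, 3))   # hidden -> output weights
b2 = np.zeros(1)
eta = 0.5                                 # learning rate

x = np.array([0.05, 0.10])                # input vector
d = np.array([0.9])                       # desired (target) response

for _ in range(1000):
    # Forward pass: weights fixed, signals propagate layer by layer
    z = sigmoid(W1 @ x + b1)              # hidden-layer outputs
    y = sigmoid(W2 @ z + b2)              # network output

    # Backward pass: propagate the error signal and adjust the weights
    delta_out = (y - d) * y * (1 - y)             # output-layer error term
    delta_hid = (W2.T @ delta_out) * z * (1 - z)  # hidden-layer error term

    W2 -= eta * np.outer(delta_out, z)
    b2 -= eta * delta_out
    W1 -= eta * np.outer(delta_hid, x)
    b1 -= eta * delta_hid

print(y)   # after training, y is close to the target 0.9
```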


Fig.3.11 Multiple layer perceptrons with back-propagation algorithm

A multilayer perceptron has three distinctive characteristics:

1. The model of each neuron in the network includes a nonlinear activation function. The sigmoid defined by the logistic function is commonly used; another commonly used function is the hyperbolic tangent (both are written out after this list). The presence of nonlinearities is important because otherwise the input-output relation of the network could be reduced to that of a single-layer perceptron.

2. The network contains one or more layers of hidden neurons that are not part of the input or output of the network. These hidden neurons enable the network to learn complex tasks.

3. The network exhibits a high degree of connectivity. A change in the connectivity of the network requires a change in the population of its weights.
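For reference, the two activation functions mentioned in item 1 are, in their usual forms (a is a slope parameter; this notation is not taken from the source),

\varphi(v) = \frac{1}{1 + e^{-a v}} \qquad \text{(logistic sigmoid)}, \qquad \varphi(v) = \tanh(a v) \qquad \text{(hyperbolic tangent)}.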

Learning Process

To illustrate the process, a three-layer neural network with two inputs and one output, shown in the picture below, is used.


Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realizes a nonlinear function, called the neuron activation function. Signal e is the adder output signal, and y = f(e) is the output signal of the nonlinear element. Signal y is also the output signal of the neuron.

Three layer neural network with two inputs and single output

The training data set consists of input signals (x1 and x2) paired with the corresponding target (desired output) y'. Network training is an iterative process: in each iteration the weight coefficients of the nodes are modified using new data from the training data set. The symbol wmn represents the weight of the connection between the output of neuron m and an input of neuron n in the next layer; yn represents the output signal of neuron n.


Propagation of signals through the output layer.

In the next algorithm step the output signal of the network, y, is compared with the desired output value (the target), which is found in the training data set. The difference is called the error signal δ of the output layer neuron.

It is impossible to compute the error signal for internal neurons directly, because the output values of these neurons are unknown. For many years an effective method for training multilayer networks was unknown; only in the mid-eighties was the backpropagation algorithm worked out. The idea is to propagate the error signal δ (computed in a single teaching step) back to all the neurons whose output signals were inputs to the neuron in question.
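The propagation formulas themselves were figures in the original; in the standard formulation (using the wmn notation introduced earlier) they amount to

\delta_m = \sum_{n} w_{mn}\, \delta_n ,

i.e., the error signal of an internal neuron m is the weighted sum of the error signals δn of the neurons n that receive its output.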


When the error signal for each neuron has been computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are being modified.
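The update formulas were likewise figures; in this notation they presumably take the form

w'_{mn} = w_{mn} + \eta\, \delta_n\, \frac{df_n(e)}{de}\, y_m ,

i.e., the weight of the connection from neuron m to neuron n is adjusted in proportion to the error signal δn of the receiving neuron, the derivative of its activation function, and the signal ym entering the connection; η is the learning coefficient discussed in the next paragraph.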


The coefficient η affects the network teaching speed. There are a few techniques for selecting this parameter. The first method is to start the teaching process with a large value of the parameter and decrease it gradually as the weight coefficients become established. The second, more complicated, method starts teaching with a small parameter value, increases it as the teaching advances, and then decreases it again in the final stage. Starting the teaching process with a low parameter value makes it possible to determine the signs of the weight coefficients.


Fig 3.2 Flowchart showing the working of the BPA

Recurrent Neural Network (RNN)

A feedforward architecture does not maintain a short-term memory: any memory effects are due to the way past inputs are re-presented to the network.


Fig. 3.11 A simple recurrent network

A simple recurrent network has activation feedback which embodies short-term memory. A state layer is updated not only with the external input of the network but also with activation from the previous forward propagation. The feedback is modified by a set of weights so as to enable automatic adaptation through learning (e.g., backpropagation).
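A minimal sketch of such a simple (Elman-style) recurrent network is shown below: the state (hidden) layer receives the external input together with its own activation from the previous time step. All sizes and weight values are arbitrary illustrative choices, and only the forward propagation is shown; training (e.g., by backpropagation) is omitted.

```python
import numpy as np

# Sketch of an Elman-style simple recurrent network: the hidden "state" layer
# is fed both the external input and a copy of its own previous activation.
# Sizes and weights are illustrative only; no training is performed here.

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

n_in, n_state, n_out = 2, 4, 1
rng = np.random.default_rng(1)
W_in  = rng.normal(scale=0.5, size=(n_state, n_in))     # input -> state
W_rec = rng.normal(scale=0.5, size=(n_state, n_state))  # previous state -> state (feedback)
W_out = rng.normal(scale=0.5, size=(n_out, n_state))    # state -> output

state = np.zeros(n_state)                               # context/state layer, initially empty
sequence = [np.array([0.0, 1.0]), np.array([1.0, 0.0]), np.array([1.0, 1.0])]

for t, x in enumerate(sequence):
    # Present the input together with the activation from the previous forward propagation
    state = sigmoid(W_in @ x + W_rec @ state)
    y = sigmoid(W_out @ state)
    print(f"t={t}, output={y}")
```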

Fig. 3.12 A simple recurrent network

Neural networks with closed paths in their topology are known as recurrent neural networks (RNNs). RNNs are an improvement on MLPs and are characterized by cyclic paths between neurons. RNNs can propagate data from later processing stages to earlier stages. In RNNs, the present activation state is a function of the previous activation state as well as the present inputs. In essence, the recurrent connections allow storing information from past inputs and past states of the network. Adding feedback from the prior activation step introduces a form of memory to the process. This enhances the network's ability to learn temporal sequences without fundamentally changing the training process. Therefore, RNNs are capable of dealing with spatio-temporal problems that have been found difficult for feedforward networks.

A recurrent neural network differs from a feedforward neural network in that there are no restrictions on the placement of synapses in a recurrent network. This makes all kinds of feedback and connections possible and achieves the full computational power of neural networks. With such a general architecture, recurrent neural networks have important capabilities not found in feedforward networks, such as attractor dynamics and the ability to identify a time-varying system. Various learning algorithms for recurrent neural networks have been proposed. Algorithms for associative memory networks, which are recurrent networks settling to stable states, have been proposed by Hopfield and Pineda, while Jordan, Gallant and King, and Pearlmutter developed algorithms to train recurrent networks to handle time-varying systems. The algorithm considered here is the real-time recurrent learning algorithm for completely recurrent networks running in continually sampled time, devised by R. J. Williams and D. Zipser. The real-time recurrent learning algorithm exhibits the generality of the backpropagation-through-time approach without the growing memory requirement for arbitrarily long training sequences. With feedback from the output layer, a small recurrent neural network can simulate a time-varying, nonlinear system well. A typical real-time recurrent neural network is shown in the figure. It consists of two layers: an output layer and an input layer. The output layer includes output and hidden neurons. Some or all of the output/hidden neurons are delayed and fed back to the input layer. Therefore, the input layer consists of delayed outputs and external inputs. The algorithm proceeds as follows:

Fig. A Recurrent Neural Network

1. Forward process: compute the output yj for all j ∈ C as


2. Backward process: with

Compute error gradient as

3. Weight updates:

where
A: external input neurons
B: feedback output/hidden neurons
O: desired output neurons
C: all output/hidden neurons
Ui: neurons of the input layer, where i ∈ A ∪ B
δkj: Kronecker delta function
η: learning rate
wji: weight between output/hidden neuron j and input neuron i
f: logistic function

Backpropagation through time

In the original experiments presented by Jeff Elman (Elman, 1990), so-called truncated backpropagation was used. This basically means that yj(t-1) was simply regarded as an additional input. Any error at the state layer, δj(t), was used to modify the weights from this additional input slot (see Figure 4). Errors can be backpropagated even further. This is called backpropagation through time (BPTT; Rumelhart et al., 1986) and is a simple extension of what we have seen so far. The basic principle of BPTT is that of "unfolding": all recurrent weights can be duplicated spatially for an arbitrary number of time steps, here referred to as τ. Consequently, each node which sends activation (either directly or indirectly) along a recurrent connection has (at least) τ copies as well. In accordance with the backpropagation equation, errors are thus backpropagated according to
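The BPTT equation referenced here appears to have been lost in extraction; in the standard formulation it reads

\delta_j(t-1) = f'\big(net_j(t-1)\big) \sum_{h} \delta_h(t)\, w_{hj} ,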


where h is the index for the activation receiving node and j for the sending node (one time step back). This allows us to calculate the error as assessed at time t, for node outputs (at the state or input layer) calculated on the basis of an arbitrary number of previous presentations.

Fig. The effect of unfolding a network for BPTT (τ = 3).

Radial Basis Function Network (RBFN)

A radial basis function network is a network of the radially symmetric basis functions described above. Functions whose response increases monotonically away from a central point are also radial basis functions, but because of their global nature they are not as commonly used as the local ones. The most used are the Gaussians, mainly because of their good analytical properties. An RBF network produces a mapping
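The mapping (3.23) itself was lost in conversion; it presumably has the standard RBF form

y_k(\mathbf{x}) = \sum_{i=1}^{M} w_{ki}\, b_i(\mathbf{x}) + w_{k0}, \qquad b_i(\mathbf{x}) = \exp\!\left( -\frac{\lVert \mathbf{x} - \boldsymbol{\mu}_i \rVert^2}{2\sigma_i^2} \right), \qquad (3.23)

where M is the number of basis functions and μi and σi are the center and width of basis function i (notation introduced here).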

where the bi are radial basis functions. Other common choices for radial basis functions are the Cauchy function, the inverse multiquadric and the non-local multiquadric. When the basis function is Gaussian, (3.23) can be seen as approximating a probability density by a mixture of known densities.

An RBF network produces a local mapping, so the extrapolation properties of the mapping can be very poor outside the optimization data set. However, the RBF network is an efficient method when the input vector dimension is low, and very accurate approximations can be obtained. For higher-dimensional tasks several difficulties are encountered. Distributing the basis function centers evenly over the input space results in a complex model: the number of hidden layer nodes depends exponentially on the size of the input space. Irrelevant inputs are especially problematic, since they do not add information but increase the number of basis functions. To overcome this problem, a Gaussian bar network has been proposed, in which the product of univariate Gaussians (the tensor product) is replaced by a sum [Hartman & Keeler, 1991].
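Equation (3.24), referenced in the next paragraph, was also lost. For a Gaussian bar unit the multivariate Gaussian is replaced by a sum of univariate Gaussians, presumably something of the form

b_i(\mathbf{x}) = \sum_{m=1}^{d} v_{im} \exp\!\left( -\frac{(x_m - \mu_{im})^2}{2\sigma_{im}^2} \right), \qquad (3.24)

where the per-dimension weights vim, centers μim and widths σim are notation introduced here.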

The same idea of replacing the product by a sum has also been proposed for use with fuzzy logic systems in control problems. This means that the AND operation is replaced by an OR operation; however, the linguistic interpretation is then lost.

An RBF network differs from RBF interpolation in the following respects:
1. The number of basis functions is not determined by the size of the data set.
2. The centers of the basis functions are not constrained to be input data vectors.
3. Each basis function may have its own width.
4. Biases and normalization may be included.

Training of artificial neural networks

A neural network has to be configured such that the application of a set of inputs produces (either 'direct' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule.

Learning Techniques

Two main types of learning are used in ANNs:


Supervised learning: learning with teacher signals or targets.
Unsupervised learning: learning without the use of teacher signals.

(i) Supervised Learning

In supervised learning the training patterns are provided to the ANN together with a teaching signal or target. The difference between the ANN output and the target is the error signal. Initially the output of the ANN gives a large error during the learning phase. The error is then minimized through continuous adaptation of the weights, via a learning algorithm, to solve the problem. In the end, when the error becomes very small, the ANN is assumed to have learned the task and training is stopped. It can then be used to solve the task in the recall phase.

Learning configuration

For each input, a teacher knows what the correct output should be, and this information is given to the neural network. This is supervised learning, since the neural network learns under the supervision of the teacher. The backpropagation model is such an example, assuming the existence of a teacher who knows the correct patterns. In the backpropagation model, the actual output from the neural network is compared with the correct one, and the weights are adjusted to reduce the difference.

Fig.3 Block diagram for explanation of basic learning modes: (a) supervised learning and (b) unsupervised learning.

(ii) Unsupervised learning

In some models, neural networks can learn by themselves after being given some form of general guidelines. There is no external comparison between actual and ideal output. Instead, the neural network adjusts itself internally using certain criteria or algorithms, e.g., to minimize a function (such as a "global energy") defined on the neural network. This form of learning is called unsupervised learning. (Unsupervised learning does not mean that no guidance is given to the neural network; if no direction were given, the neural network would do nothing.)

In unsupervised learning, the ANN is trained without teaching signals or targets; it is only supplied with examples of the input patterns that it will eventually solve. The ANN usually has an auxiliary cost function that needs to be minimized, such as an energy function or a distance measure. Usually a neuron is designated as a "winner," based on similarities in the input patterns, through competition. The weights of the ANN are modified so that the cost function is minimized. At the end of the learning phase, the weights have been adapted in such a manner that similar patterns are clustered onto a particular node.

(iii) Reinforcement Learning

Reinforcement learning may be considered an intermediate form of the above two types of learning. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response and adjusts its parameters accordingly. Generally, parameter adjustment is continued until an equilibrium state occurs, after which there are no more changes in its parameters. Self-organizing neural learning may be categorized under this type of learning.

Applications

Aerospace: High-performance aircraft autopilots, flight path simulations, aircraft control systems, autopilot enhancements, aircraft component simulations, aircraft component fault detectors
Automotive: Automobile automatic guidance systems, warranty activity analyzers
Banking: Check and other document readers, credit application evaluators
Defense: Weapon steering, target tracking, object discrimination, facial recognition, new kinds of sensors, sonar, radar and image signal processing including data compression, feature extraction and noise suppression, signal/image identification
Electronics: Code sequence prediction, integrated circuit chip layout, process control, chip failure analysis, machine vision, voice synthesis, nonlinear modeling
Financial: Real estate appraisal, loan advisor, mortgage screening, corporate bond rating, credit line use analysis, portfolio trading programs, corporate financial analysis, currency price prediction
Manufacturing: Manufacturing process control, product design and analysis, process and machine diagnosis, real-time particle identification, visual quality inspection systems, beer testing, welding quality analysis, paper quality prediction, computer chip quality analysis, analysis of grinding operations, chemical product design analysis, machine maintenance analysis, project bidding, planning and management, dynamic modeling of chemical process systems
Medical: Breast cancer cell analysis, EEG and ECG analysis, prosthesis design, optimization of transplant times, hospital expense reduction, hospital quality improvement, emergency room test advisement
Robotics: Trajectory control, forklift robots, manipulator controllers, vision systems
Speech: Speech recognition, speech compression, vowel classification, text-to-speech synthesis
Securities: Market analysis, automatic bond rating, stock trading advisory systems
Telecommunications: Image and data compression, automated information services, real-time translation of spoken language, customer payment processing systems
Transportation: Truck brake diagnosis systems, vehicle scheduling, routing systems
