Upload: george-ross
Post on 04-Jan-2016

Page 1

D. Dranidis - 14.06.00

The road to ... PhD

• Aims:

– specify neural networks in an abstract mathematical way

– transform the specifications into executable programs

– prove the correctness of transformation

Page 2

Software Development process

• Requirements specification is abstract:
  – many models satisfy the requirements specification

• Each design decision restricts the set of possible models

• ... until one reaches the final program, which is the representation of a unique model.

Informal description --(analysis)--> Requirements specification --(design decisions)--> Design specification --(design decisions)--> Program

Page 3

Formal methods

• Mathematical formalisms
  – Aim: rigorously describe (software) systems
  – consist of:
    • Syntax: how do I write a description?
      – grammatical rules for building an expression
    • Semantics: what is the meaning of a description?
      – a mapping from expressions to a mathematical domain
  – optionally:
    • Methodology: procedures and processes for describing a system and deriving an implementation
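The syntax/semantics split above can be made concrete with a toy formalism. The sketch below (my own illustration; the names `well_formed` and `evaluate` are not from the talk) treats syntax as grammar rules over nested tuples and semantics as a mapping into the Boolean domain:

```python
# Toy formalism: terms over the grammar  t ::= "true" | "false" | ("not", t)

# Syntax: which descriptions are well formed?
def well_formed(t):
    if t in ("true", "false"):
        return True
    return isinstance(t, tuple) and len(t) == 2 and t[0] == "not" and well_formed(t[1])

# Semantics: a mapping from well-formed terms to the domain {True, False}
def evaluate(t):
    if t == "true":
        return True
    if t == "false":
        return False
    return not evaluate(t[1])
```

The methodology component, in this tiny setting, would be a procedure for deriving `evaluate` from the grammar.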

Page 4

Formalisms

• Algebraic specifications
• Stream processing functions
• Traces semantics
• CCS (calculus of communicating systems)
• μCML (operational semantics of CML)
• SL (stream language)
• CML (concurrent ML)

Page 5

Algebraic specifications

• Formalism for the specification of
  – data types (structures) and algorithms

• An algebraic specification consists of:
  – a syntactic interface:
    • names of the types (sorts)
    • names of the operations
  – semantic properties:
    • axioms

Page 6

Specifications

• Depending on the form of the axioms one can create:
  – an abstract specification: a predicative description where the axioms are logic formulas
  – a concrete specification: a constructive description where the axioms are restricted to conditional equations

• A concrete specification serves as a prototype of the system (it is executable) or as the final product (implemented in a functional language)

Page 7

Example

BOOL = {
  -- sorts
  sort Bool;
  -- functions
  true  : Bool;
  false : Bool;
  not   : Bool -> Bool;
  -- axioms
  axioms
    not ( true ) = false ;
    not ( not ( x ) ) = x ;
  endaxioms
}

[Figure: the models of BOOL under different axiom sets]

Term algebra = initial model (all terms are unequal):
  true   false   not(true)   not(false)   not(not(true))   not(not(false))   ...   not(...not(true)...)   ...

Axioms: not ( true ) = false ;
  – identifies only false = not(true)

Axioms: not ( true ) = false ; not ( not ( x ) ) = x ;
  – the terms collapse into two classes:
    true  = not(false) = not(not(true))  = ...
    false = not(true)  = not(not(false)) = ...

Axioms: not ( true ) = false ; true = false ;  (with not ( not ( x ) ) = x commented out)
  – final model (all terms are equal):
    true = false = not(true) = not(false) = ...
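Since the axioms are conditional equations, the BOOL specification is concrete and can be executed by reading the equations as left-to-right rewrite rules. A minimal sketch of that idea (my own Python rendering, not from the talk; note that not(false) = true is not an axiom but is derivable from the two given ones):

```python
# Terms: "true", "false", or ("not", t), as in the BOOL specification.

def normalize(t):
    """Normalize a ground Bool term by applying the axioms as rewrite rules."""
    if t in ("true", "false"):
        return t
    inner = normalize(t[1])        # normalize under 'not' first
    if inner == "true":            # axiom: not(true) = false
        return "false"
    return "true"                  # derivable from the axioms: not(false) = true
```

Every ground term normalizes to true or false, which is exactly the two-class initial model described above.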

Page 8

Stream processing functions

• <y1 y2 y3 ... yt > = f < x1 x2 x3 ... xr >

• f operates on the whole input stream and produces an output stream.

• Streams can be considered as data histories (or traces)

• Streams can be infinite

• f can be considered as an agent receiving some inputs, processing and sending some outputs

[Figure: agent f consuming the input stream x1 x2 x3 ... xr ... and producing the output stream y1 y2 y3 ... yt ...]

Page 9

Examples

• Finite streams
  – < > is the empty stream
  – <0> is a one-element stream
  – 1 & <2,3> = <1,2,3>          (prefix operation)
  – <1,2,3> + <4,5,6> = <5,7,9>  (pairwise addition)
  – succ <0,1,2> = <1,2,3>       (successor function)

• Infinite streams (via recursion)
  – null = 0 & null              null = <0,0,0,...>
  – nat = 0 & (succ nat)         nat = <0,1,2,3,4,...>
  – squares = nat * nat          squares = <0,1,4,9,16,...>
  – (nat1, nat2) = split nat     nat1 = <0,1,2,3,4,...>   nat2 = <0,1,2,3,4,...>
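The stream examples above can be sketched with lazy Python generators (an assumption of mine: generators stand in for the talk's streams, and the helper names `prefix`, `plus`, `take` are illustrative):

```python
from itertools import islice, tee

def prefix(v, s):                    # v & s : the prefix operation
    yield v
    yield from s

def plus(s, t):                      # pairwise addition of two streams
    yield from (a + b for a, b in zip(s, t))

def null():                          # null = 0 & null = <0,0,0,...>
    yield from prefix(0, null())

def nat():                           # nat = 0 & (succ nat) = <0,1,2,3,...>
    yield from prefix(0, (x + 1 for x in nat()))

def squares():                       # squares = nat * nat = <0,1,4,9,...>
    yield from (a * b for a, b in zip(nat(), nat()))

def take(s, n):                      # observe the first n elements
    return list(islice(s, n))

# split duplicates a stream, as in (nat1, nat2) = split nat
nat1, nat2 = tee(nat())
```

For example, `take(squares(), 5)` yields `[0, 1, 4, 9, 16]`, matching the slide.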

Page 10

CCS

• Language for the description of concurrent systems. Systems are described as processes.

• Processes are built from actions and operators

• Actions are
  – external: communication of a process with other processes
  – internal: communication between the components of a process

• Communications occur on channels. There are two kinds of communications:
  – c!v  write value v on channel c
  – c?x  read a value from channel c and assign it to variable x

• Handshake (synchronous) communication:
  – a read or write action in a process must handshake with a complementary action in another process (data transfer) for the processes to proceed.

Page 11

Process operators

• Prefix:
  – P = a.Q       first a, then Q
  – e.g. P = c!0 . P

• Summation:
  – P = Q + R     Q or R
  – e.g. P = c?x.P + d?x.P

• Composition:
  – P = Q | R     Q in parallel with R
  – e.g. R = c!0.R , Q = c?x.Q , P = Q | R

• Restriction:
  – P = Q\c       c is private in P
  – e.g. R = c!0.R , Q = c?x.Q , P = (Q | R)\c

• Recursion:
  – P = fix (X = Q)
  – e.g. P = fix (X = c!0 . X)  equivalent to P = c!0 . P
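A minimal sketch (my own encoding, not part of the talk) of how these operators behave: R = c!0.R forever offers the write c!0, Q = c?x.Q forever offers the complementary read, and a tiny scheduler plays the role of the restricted composition (Q | R)\c by matching each write with its complementary read, i.e. a handshake:

```python
def R():
    # R = c!0.R : forever offer the write action c!0
    while True:
        yield ("c", "!", 0)

def Q(log):
    # Q = c?x.Q : forever offer the read action c?x; record each received x
    while True:
        x = yield ("c", "?")
        log.append(x)

def run_composition(steps):
    # scheduler for (Q | R)\c : each step is one internal handshake on c
    log = []
    writer, reader = R(), Q(log)
    next(reader)                         # advance Q to its first read offer
    for _ in range(steps):
        chan, kind, value = next(writer)   # R offers c!0
        reader.send(value)                 # handshake: Q's c?x receives 0
    return log
```

Because c is restricted, the only behaviour of the composition is this internal handshake, which the scheduler enacts directly.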

Page 12

Examples

• The following stream
  – nat = 0 & (succ nat) = μs. 0 & (succ s)            in the specification
  – val nat = recursion ( fn s => 0 & ( 1 +^ s) )      in SL/CML

• is modelled (realised) by the following CCS process terms
  – TRANSFER^succ_{i,k} = fix ( X = i?x . k!(succ x) . X )
  – PREFIX^0_{k,m} = m!0 . fix ( X = k?x . m!x . X )
  – SPLIT_{m,i,o} = fix ( X = m?x . (i!x . o!x . X + o!x . i!x . X) )
  – NAT_o = (TRANSFER^succ_{i,k} | PREFIX^0_{k,m} | SPLIT_{m,i,o}) \ {k,m,i}

[Figure: TRANSFER^succ writes on channel k into PREFIX^0, which writes on channel m into SPLIT; SPLIT feeds channel i back to TRANSFER^succ and emits the result on the external channel o]
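The NAT network can be executed by simulating its three processes over the channels k, m, i and o. In the sketch below (my own; deques replace the synchronous handshakes with buffers, and the loop fixes one particular interleaving) the values observed on o come out as 0, 1, 2, ...:

```python
from collections import deque

def nat_network(n):
    """Simulate NAT_o and return the first n values emitted on channel o."""
    k, m, i = deque(), deque(), deque()   # internal channels (restricted)
    out = []                              # values observed on external channel o
    m.append(0)                           # PREFIX^0 first emits 0 on m
    while len(out) < n:
        # SPLIT_{m,i,o}: copy each value from m to both i and o
        if m:
            x = m.popleft()
            i.append(x)
            out.append(x)
        # TRANSFER^succ_{i,k}: read x on i, write succ x on k
        if i:
            k.append(i.popleft() + 1)
        # PREFIX^0_{k,m}: afterwards forward values from k to m
        if k:
            m.append(k.popleft())
    return out
```

The feedback loop through i is what realises the recursion in nat = 0 & (succ nat).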

Page 13

Specification

• Networks are specified as stream processing functions in an algebraic specification.

• Example

    unit (η, w0) (x, τ) = y
    where (
      y  = w x
      e  = τ - y
      w  = w0 & w + Δw
      Δw = η . e . x
    );

• The initial abstract specification is transformed into a design specification:
  – recursion is modelled via network feedback and
  – branched connecting lines are split

• The semantics of the specification is preserved (trivially).

    unit (η, w0) (x, τ) = y2
    where (
      create yt
      create wt
      e = τ - yt
      x1, x2 = split x
      Δw = η . e . x1
      w = w0 & wt + Δw
      w1, w2 = split w
      y = w2 x2
      y1, y2 = split y
      w1 → wt
      y1 → yt
    );
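The design specification's feedback loop can be unfolded over finite streams. The sketch below is my own Python rendering (eta and w0 correspond to the slide's η and w0): at each step the unit outputs y_t = w_t · x_t, computes the error against the target τ_t, and feeds the weight update back for the next step:

```python
def unit(eta, w0, xs, taus):
    """Delta-rule unit over finite streams:
    y_t = w_t * x_t,  e_t = tau_t - y_t,  w_{t+1} = w_t + eta * e_t * x_t
    (the weight stream w = w0 & wt + Δw, realised as feedback)."""
    w, ys = w0, []
    for x, tau in zip(xs, taus):
        y = w * x               # y = w2 x2
        e = tau - y             # e = τ - yt
        w = w + eta * e * x     # Δw = η . e . x1, fed back through wt
        ys.append(y)
    return ys
```

With a constant input and target the output stream converges toward the target, which is the intended learning behaviour.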

Page 14

Program

• The design specification is implemented as a function in SL/CML/SML:

    fun unit (eta, w0_) (x_, tau) = let
      val (yt, tr2yt)  = createStream ()
      val (wt_, tr2w_) = createStreamList (length x_)
      val e            = tau ^-^ yt
      val (x1_, x2_)   = split x_
      val Dw_          = eta *^ e ^*^ x1_
      val w_           = w0_ ~&~ wt_ ~+~ Dw_
      val (w1_, w2_)   = splitList w_
      val y            = x2_ ~**~ w2_
      val (y1, y2)     = split y
      val _ = connectList (w1_, tr2w_)
      val _ = connect (y1, tr2yt)
    in
      y2
    end

Page 15

Correctness

[Figure: correctness diagram]

  stream processing function --(syntactic transformation)--> SL/CML function
  stream processing function --(traces semantics / realisation)--> CCS process term
  CCS process term --(translation T)--> μCML process term
  SL/CML function --(operational semantics)--> μCML process term
  correctness: the two μCML process terms are related by bisimulation

Page 16

Neural Networks

• Applications
  – Pattern recognition

• Supervised Learning

• Reinforcement Learning

Page 17

Some Biology

• The brain
  – The brain is composed of nerve cells called neurons.
  – The number of neurons in the human brain is about 10^11.
  – Each neuron is connected to approximately 10^4 neurons.
  – Neurons have (at least abstractly) a very simple operation.
  – Neurons are slow (an operation cycle lasts some ms).
  – Neurons operate in parallel.
  – Although neurons “die”, information retrieval is still possible.
  – Neural networks are fault-tolerant because information storage and information processing are distributed.

Page 18

Neurons

• A neuron consists of
  – the soma, the central part of the cell
  – the dendrites, junctions from other neurons
  – and the axon, its junction to other neurons
  – The axon branches and connects to other neurons’ dendrites via the synapses.

• Operation
  – Neurons transmit information in the form of pulses.
  – This information is received at the synapses.
  – Synapses produce chemical substances that increase or decrease (depending on the type of synapse) the electrical potential of the cell.
  – The cell generates a pulse (fires) if its potential reaches a threshold value.
  – The pulse is transmitted on its axon.

• Adaptive behavior
  – Neurons “store” information in their synapses.
  – Synapses are plastic. Their effect changes over time. They adapt.

Page 19

Artificial Neural Networks

• An artificial neural network is a network of simple computation elements, called units, which operate in parallel and communicate by transmitting their output to other units via connections.

• Units compute simple functions on their inputs.

• A weight is associated to each input line. The weight indicates the significance of the connection for the computation of the unit.

[Figure: graphical representation of a unit — inputs x1, x2, ..., xn arrive on connections weighted w1, w2, ..., wn, feed a summation Σ followed by an activation function f, producing the output y]

  y = f ( Σi wi * xi )
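The unit's computation y = f(Σi wi · xi) in a few lines of Python; the threshold activation f is my illustrative choice, since the slide leaves f abstract:

```python
def unit_output(weights, inputs, f=lambda s: 1 if s >= 0 else 0):
    """Compute a unit's output y = f(sum_i w_i * x_i)."""
    s = sum(w * x for w, x in zip(weights, inputs))  # weighted sum
    return f(s)                                      # activation function f
```

Passing a different `f` (identity, sigmoid, ...) gives the other common unit types.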

Page 20

Pattern Recognition

• Character recognition
  – [Figure: the character ‘A’ → NN → “it is an ‘A’ character”]

• Sound recognition
  – [Figure: sound signal → NN → “it is an ‘A’ phoneme”]

Page 21

Representation

• Information (data) is represented and processed numerically

it is an ‘A’ character

[Figure: the character ‘A’ as a 7×7 bitmap]

  0 0 0 1 0 0 0
  0 0 1 1 1 0 0
  0 0 1 0 1 0 0
  0 1 1 0 1 1 0
  0 1 1 1 1 1 0
  0 1 0 0 0 1 0
  1 1 0 0 0 1 1

The flattened bitmap gives the 49 inputs:
  0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 0 1 1 0 1 1 0 0 1 1 1 1 1 0 0 1 0 0 0 1 0 1 1 0 0 0 1 1

One output per letter (26 outputs):
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

49 + 14 + 26 = 89 units;  49*14 + 14*26 = 686 + 364 = 1050 connections

Page 22

Supervised Learning

• The network is initialised with random connection weights

• Repeat
  – present a character at the input
  – let the network classify the input
  – compare the network output to the correct class
  – update the connection weights to minimize the error

• until all characters are correctly classified

• The learning algorithm updates the weights.

• Information is stored in the weights

• Information is distributed (in 1050 numerical weights)
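The repeat/until loop above, sketched for a single threshold unit learning the OR function (the task, constants, and perceptron-style update rule are my illustration, not the talk's character-recognition network; random initialisation is replaced by zeros for reproducibility):

```python
def train(samples, eta=0.5, epochs=100):
    """Perceptron-style supervised learning loop for a 2-input threshold unit."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):                 # repeat ...
        errors = 0
        for x, target in samples:           # present an input pattern
            y = 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else 0
            e = target - y                  # compare output to correct class
            if e != 0:
                errors += 1                 # update weights to reduce the error
                w = [wi + eta * e * xi for wi, xi in zip(w, x)]
                b += eta * e
        if errors == 0:                     # ... until all are classified
            break
    return w, b

or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
```

After training, the learned information lives entirely in the numerical weights `w` and `b`, as the slide emphasises.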

Page 23

Reinforcement Learning

• Learning to act in an unknown environment in order to get the maximum reward.

• Reinforcement problem:
  – state space S, actions A, initial state s0, final state sF
  – transition : S x A -> S
  – reward : S -> N
  – policy : S -> A

• Goal
  – find the policy which maximizes the total reward: Σ ri

• Examples
  – Games: reward = 0 in all states except the winning state, where reward = 1
  – Route finding: reward = -1 in all states except the last, where reward = 0

[Figure: an episode s0 --a1/r1--> s1 --a2/r2--> s2 --a3/r3--> s3]

Page 24

Reinforcement learning methods

• Value : S -> N
  – The value of a state is the sum of rewards if one follows the optimal policy until the goal state.
  – If the value of each state is known, then the optimal policy is the greedy policy: each time take the action which leads to the maximum-value state.

• Value Iteration:
  – begin with a random value function (i.e. a random policy)
  – after each action, update the value function according to:
      V(st, t+1) = V(st, t) + α ( rt + γ V(st+1, t) - V(st, t) )
  – value iteration converges to the optimal policy

• If the state space is large, then the value function cannot be represented explicitly (via a table); instead it has to be approximated

• Neural networks are used to approximate the value function
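The tabular value update V(st) ← V(st) + α(rt + γ V(st+1) − V(st)) is a one-liner; the chain of states, the rewards, and the constants α and γ below are my own example, not from the talk:

```python
def td_update(V, s, s_next, r, alpha=0.5, gamma=0.9):
    """One value update: V(s) <- V(s) + alpha * (r + gamma * V(s_next) - V(s))."""
    V[s] = V[s] + alpha * (r + gamma * V[s_next] - V[s])
    return V

# One episode on the chain s0 -> s1 -> s2 -> s3, with reward 1 only on
# reaching the goal state s3 (the "Games" reward scheme above).
V = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0}
for s, s_next, r in [(0, 1, 0.0), (1, 2, 0.0), (2, 3, 1.0)]:
    td_update(V, s, s_next, r)
```

After this episode only V(s2) has moved toward the reward; repeating episodes propagates the value backward along the chain. When the table `V` becomes too large, a neural network replaces it as the approximator, which is exactly the point of the last bullet.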