cosc 460 – neural networks gregory caza 17 august 2007

COSC 460 – Neural Networks

Gregory Caza

17 August 2007

Elman (1993)

• Elman, J. L. (1993). Learning and development in neural networks: the importance of starting small. Cognition 48: 71-99.

• Modelling first language acquisition using a progressive training strategy.

Elman (1993)

• Simple Recurrent Network (SRN)

• context units remember the state of the hidden units at the last time step
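
As a rough illustration of the context-unit idea, here is a minimal sketch of one forward step of an SRN. The class name, layer sizes, weight range, sigmoid activation and zero-initialised context are all assumptions made for the example; none of them are taken from the paper.

```python
import math
import random


def make_matrix(rows, cols, rng):
    # Small random weights; the range is an arbitrary choice for this sketch.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SimpleRecurrentNetwork:
    """Hypothetical minimal SRN: forward pass only, no training."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_in, rng)
        self.w_context = make_matrix(n_hidden, n_hidden, rng)
        self.w_out = make_matrix(n_out, n_hidden, rng)
        self.n_hidden = n_hidden
        self.reset_context()

    def reset_context(self):
        # Assumption: an "empty" memory is represented by all zeros.
        self.context = [0.0] * self.n_hidden

    def step(self, x):
        # Hidden units see the current input plus the previous hidden state.
        from_input = matvec(self.w_in, x)
        from_context = matvec(self.w_context, self.context)
        hidden = [sigmoid(a + b) for a, b in zip(from_input, from_context)]
        output = [sigmoid(a) for a in matvec(self.w_out, hidden)]
        # Context units copy the hidden state for use at the next time step.
        self.context = list(hidden)
        return output
```

Presenting the same input twice gives different outputs, because the second step also sees the hidden state left behind by the first.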

Elman (1993)

• input was a binary-encoded word

• words are presented one at a time

• output was an encoded prediction of the next word in a sentence

• predictions are expected to depend on the network learning a grammatical structure
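
The prediction task can be sketched as building (current word, next word) training pairs from a sentence. The one-bit-per-word encoding, the function names and the toy vocabulary below are assumptions for illustration only; the slide says only that words were binary-encoded.

```python
def one_hot(index, size):
    # Assumed encoding: a binary vector with a single bit switched on.
    return [1 if i == index else 0 for i in range(size)]


def prediction_pairs(sentence, vocab):
    """Pair each encoded word with the encoded word that follows it."""
    codes = [one_hot(vocab.index(word), len(vocab)) for word in sentence]
    return list(zip(codes[:-1], codes[1:]))


# Toy usage with a hypothetical three-word vocabulary.
vocab = ["boy", "chases", "dog"]
pairs = prediction_pairs(["boy", "chases", "dog"], vocab)
```

Each pair is one training example: the first vector is the network input and the second is the target output.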

Elman (1993)

• developmental constraints may facilitate learning

• limited view provides a buffer from a complex, potentially overwhelming domain

• simple network = child

• complex domain = language

Elman (1993)

• Training was performed using three different schemata:

1. using all training data and a fully-developed network

2. with the training data organized and presented with increasing complexity

3. beginning with a limited memory that increased throughout training

Elman (1993)

• developmental simulation #1: incremental input

• training sentences were classified as simple or complex

• ratio of complex : simple increased over time

[Figure: Simple vs. Complex training sentences; vertical axis 0 to 10000, horizontal axis 1 to 25]
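
One way to picture the incremental-input regime is a corpus builder whose share of complex sentences grows from phase to phase. The linear schedule, the phase count and the function names here are assumptions for the sketch, not the values used in the paper.

```python
import random


def complex_fraction(phase, n_phases):
    # Assumed linear schedule: 0.0 in the first phase, 1.0 in the last.
    return phase / (n_phases - 1)


def build_phase_corpus(simple, complex_, phase, n_phases, size, rng):
    """Sample a training corpus with the complex:simple mix for this phase."""
    n_complex = round(size * complex_fraction(phase, n_phases))
    corpus = rng.choices(complex_, k=n_complex)
    corpus += rng.choices(simple, k=size - n_complex)
    rng.shuffle(corpus)
    return corpus


# Toy usage: 5 phases, 8 sentences per phase.
rng = random.Random(0)
simple = ["s1", "s2"]
complex_ = ["c1", "c2"]
phases = [build_phase_corpus(simple, complex_, p, 5, 8, rng) for p in range(5)]
```

Early phases contain only simple sentences, and the final phase contains only complex ones.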

Elman (1993)

• developmental simulation #2: incremental memory

• context would be reset when memory limit was reached

Epoch #    Memory (words)
1-12       3 or 4
13-17      4 or 5
18-22      5 or 6
23-27      6 or 7
28-32      no limit
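
The schedule above could be expressed roughly as follows. The table values come from the slide; the function names, and the reading of "3 or 4" as a random choice made at each reset, are assumptions.

```python
import random

# (epochs, allowed memory spans in words); None means no limit.
MEMORY_SCHEDULE = [
    (range(1, 13), (3, 4)),
    (range(13, 18), (4, 5)),
    (range(18, 23), (5, 6)),
    (range(23, 28), (6, 7)),
    (range(28, 33), None),
]


def memory_limit(epoch, rng):
    """Return the memory span for this epoch, or None if unlimited."""
    for epochs, spans in MEMORY_SCHEDULE:
        if epoch in epochs:
            return None if spans is None else rng.choice(spans)
    raise ValueError(f"epoch {epoch} is outside the schedule")


def reset_points(n_words, epoch, rng):
    """Word positions after which the context units would be reset."""
    points = []
    position = 0
    while True:
        limit = memory_limit(epoch, rng)
        if limit is None:
            break  # no limit: the context is never reset
        position += limit
        if position >= n_words:
            break
        points.append(position)
    return points
```

For example, `reset_points(20, 1, random.Random(0))` yields reset positions spaced 3 or 4 words apart, while any epoch from 28 onward yields an empty list.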

Elman (1993)

• full set: learning did not successfully complete

• incremental input: low final error; good generalization

• incremental memory: low final error; good generalization

Elman (1993)

• can training with a subset construct a “foundation for future success”?

• filter out “stimuli which may either be irrelevant or require prior learning to be interpreted”

• solution space is constrained

Elman (1993)

• Questions

– how many sentences/epochs were used in the failed case?

– what were the quantitative differences between the incremental memory/input results?

– were results reproducible with different training corpora?

Assad et al. (2002)

• Assad, C., Hartmann, M. J., Paulin, M. G. (2002). Control of a simulated arm using a novel combination of cerebellar learning mechanisms. Neurocomputing 44-46: 275-283.

• Control of a robot arm using dynamic state estimation.

Assad et al. (2002)

• explore the cerebellum’s role in dynamic state estimation during movement

• single-link robot arm, capable of single-plane movement and releasing a ball

• ANN used to control the release time of the throw, with the goal of hitting a target at a certain height

Assad et al. (2002)

• 6 Purkinje cells (PC)

• 6 climbing fibres (CF)

• 6 ascending branches (AB)

• 4280 parallel fibres (PF) - 600 inhibitory; 3680 excitatory

Assad et al. (2002)

• each excitatory PF received a radial basis function (RBF) of 2 state variables

• PF-PC connections were strengthened through ‘Hebbian-like’ learning

• after each trial, a binary error signal was generated based on throw accuracy

• if the ball hit the target window, PF-PC connections were strengthened through ‘Hebbian-like’ learning
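
A minimal sketch of the two ingredients named above: an RBF activation over two state variables, and a weight update gated by the binary hit signal. The Gaussian form, the learning rate, the update rule and all names are assumptions for illustration; the slides do not give the paper's actual equations.

```python
import math


def rbf_activation(state, centre, width):
    # Assumed Gaussian RBF over the state variables (two in the model).
    dist_sq = sum((s - c) ** 2 for s, c in zip(state, centre))
    return math.exp(-dist_sq / (2.0 * width ** 2))


def hebbian_like_update(weights, pf_activity, hit, rate=0.1):
    """Strengthen PF-PC weights in proportion to PF activity, on a hit only."""
    if not hit:
        return list(weights)  # no learning when the target is missed
    return [w + rate * a for w, a in zip(weights, pf_activity)]


# Toy usage: three hypothetical parallel fibres tuned to different states.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
activity = [rbf_activation((0.1, 0.0), c, width=0.5) for c in centres]
weights = hebbian_like_update([0.0, 0.0, 0.0], activity, hit=True)
```

The fibre whose centre lies closest to the current state is the most active, so its connection is strengthened the most.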

Assad et al. (2002)

• the target window was initialized to be “quite large”

• if a hit was recorded, the window was shrunk

• if there was an error, the window was expanded
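
The adaptive target window can be sketched as a multiplicative update. The shrink and growth factors and the clamping bounds are hypothetical; the slides say only that the window starts large, shrinks on a hit and expands on an error.

```python
def update_window(width, hit, shrink=0.9, grow=1.1, min_width=0.01, max_width=10.0):
    """Shrink the target window after a hit, expand it after a miss."""
    factor = shrink if hit else grow
    # Clamp so the window never collapses to zero or grows without bound.
    return min(max_width, max(min_width, width * factor))


# Toy usage: a made-up sequence of hits and misses.
width = 5.0
for hit in [True, True, False, True]:
    width = update_window(width, hit)
```

Under this scheme the window settles near the size at which hits and misses balance, so the task gets harder only as fast as accuracy improves.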

Assad et al. (2002)

• physiological experiments demonstrate long-term depression (LTD) at PF-PC synapses when PF and CF inputs are active together

• most cerebellar models ignore the AB input

• the network suggests a possible role for LTP in cerebellar learning through the AB

Assad et al. (2002)

• details, details!

• too complicated => laying groundwork for experiments

• Why does no learning take place when the target is missed? What about negative reinforcement?
