Dynamical analysis of LVQ type learning rules

WSOM 2005

Michael Biehl, Anarta Ghosh
Rijksuniversiteit Groningen, Mathematics and Computing Science
http://www.cs.rug.nl/~biehl
m.biehl@rug.nl

Barbara Hammer
Clausthal University of Technology, Institute of Computing Science


Learning Vector Quantization (LVQ)

- identification of prototype vectors from labelled example data
- parameterization of distance-based classification schemes

Classification: assignment of a vector ξ to the class of the closest prototype w.

Aim: generalization ability, i.e. classification of novel data after learning from examples.

Example: basic LVQ scheme "LVQ 1" [Kohonen]

• initialize prototype vectors for different classes
• present a single example
• identify the closest prototype, i.e. the so-called winner
• move the winner
  - closer towards the data (same class)
  - away from the data (different class)

Often: heuristically motivated variations of competitive learning.
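The four steps of the basic scheme can be sketched in a few lines of Python. This is a minimal illustration, not the speakers' code: the function names, the list-based vectors and the plain step size `eta` are assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lvq1_step(prototypes, labels, example, example_label, eta):
    """One step of the basic scheme; returns the index of the winner."""
    distances = [squared_distance(w, example) for w in prototypes]
    # identify the closest prototype, the so-called winner
    winner = min(range(len(prototypes)), key=distances.__getitem__)
    # same class: move towards the data, different class: move away
    sign = 1.0 if labels[winner] == example_label else -1.0
    prototypes[winner] = [w + sign * eta * (x - w)
                          for w, x in zip(prototypes[winner], example)]
    return winner


def classify(prototypes, labels, vector):
    """Assign a vector to the class of the closest prototype."""
    distances = [squared_distance(w, vector) for w in prototypes]
    return labels[min(range(len(prototypes)), key=distances.__getitem__)]
```

Only the winner changes in each step; all other prototypes keep their position.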


LVQ algorithms

- frequently applied in a variety of practical problems
- plausible, intuitive, flexible
- fast, easy to implement
- often based on heuristic arguments or on cost functions with unclear relation to generalization
- limited theoretical understanding of
  - dynamics and convergence properties
  - achievable generalization ability

Here: analysis of LVQ algorithms w.r.t.

- dynamics of the learning process
- performance, i.e. generalization ability
- typical properties in a model situation


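Model situation: two clusters of N-dimensional data

Random vectors ξ ∈ ℝ^N according to a mixture of two Gaussians:

$$P(\xi) = \sum_{\sigma=\pm1} p_\sigma\, P(\xi|\sigma), \qquad P(\xi|\sigma) = \frac{1}{(2\pi v_\sigma)^{N/2}} \exp\left[-\frac{(\xi - \ell B_\sigma)^2}{2 v_\sigma}\right]$$

- orthonormal center vectors: $B_+, B_- \in \mathbb{R}^N$, $(B_\sigma)^2 = 1$, $B_+ \cdot B_- = 0$
- prior weights of classes $p_+, p_-$ with $p_+ + p_- = 1$
- separation of the clusters ∝ ℓ
- independent components $\xi_j$ with mean $\langle \xi_j \rangle_\sigma = \ell\,(B_\sigma)_j$ and variance $\langle \xi_j^2 \rangle_\sigma - \langle \xi_j \rangle_\sigma^2 = v_\sigma$

[Figure: two clusters in ℝ^N with prior weights (p_+) and (p_-), centered at ℓB_+ and ℓB_-]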
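Drawing examples from this two-cluster model is straightforward; the sketch below illustrates it under stated assumptions. The center vectors are chosen as the first two unit vectors of ℝ^N (any orthonormal pair would do), and the names `make_centers` and `draw_example` are hypothetical.

```python
import math
import random


def make_centers(n):
    """Two orthonormal center vectors: here the first two unit vectors of R^n."""
    b_plus = [1.0 if j == 0 else 0.0 for j in range(n)]
    b_minus = [1.0 if j == 1 else 0.0 for j in range(n)]
    return {+1: b_plus, -1: b_minus}


def draw_example(rng, centers, p_plus, ell, v):
    """Draw (xi, sigma): sigma = +1 with probability p_plus, and each
    component of xi is Gaussian with mean ell * (B_sigma)_j and
    variance v[sigma]."""
    sigma = +1 if rng.random() < p_plus else -1
    std = math.sqrt(v[sigma])
    xi = [rng.gauss(ell * b, std) for b in centers[sigma]]
    return xi, sigma
```

For example, `draw_example(random.Random(0), make_centers(100), 0.8, 2.0, {+1: 4.0, -1: 9.0})` returns one labelled 100-dimensional vector.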


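Dynamics of on-line training

Sequence of new, independent random examples $\{\xi^\mu, \sigma^\mu\}$, $\mu = 1, 2, 3, \ldots$, drawn according to $P(\xi)$, i.e. class $\sigma^\mu$ with prior weight $p_{\sigma^\mu}$ and $\xi^\mu$ according to $P(\xi|\sigma^\mu)$.

Update of two prototype vectors $w_+, w_-$:

$$w_s^\mu = w_s^{\mu-1} + \frac{\eta}{N}\, f_s\!\left[d_+^\mu, d_-^\mu, \sigma^\mu, \ldots\right] \left(\xi^\mu - w_s^{\mu-1}\right), \qquad s = \pm 1$$

- $\eta$: learning rate, step size
- $f_s[\ldots]$: competition, direction of update, etc.
- $(\xi^\mu - w_s^{\mu-1})$: change of prototype towards or away from the current data

Example: LVQ1, original formulation [Kohonen], a Winner-Takes-All (WTA) algorithm

$$f_s = \Theta\!\left(d_{-s}^\mu - d_s^\mu\right) s\,\sigma^\mu, \qquad d_s^\mu = \left(\xi^\mu - w_s^{\mu-1}\right)^2$$

$$w_s^\mu = w_s^{\mu-1} + \frac{\eta}{N}\, \Theta\!\left(d_{-s}^\mu - d_s^\mu\right) s\,\sigma^\mu \left(\xi^\mu - w_s^{\mu-1}\right)$$

Here $\Theta$ is the Heaviside step function and $s, \sigma^\mu = \pm 1$.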
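A generic on-line step of the form w_s ← w_s + (η/N) f_s (ξ − w_s) can be written so that the algorithm enters only through the modulation function f_s. The sketch below assumes prototypes stored in a dict keyed by s = ±1; `online_step` and `f_lvq1` are illustrative names, and a tie between the two distances is treated as "no winner".

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def f_lvq1(s, d, sigma):
    """LVQ1 modulation function Theta(d_{-s} - d_s) * s * sigma:
    non-zero only for the winner, sign set by the class label."""
    is_winner = d[s] < d[-s]
    return s * sigma if is_winner else 0


def online_step(w, xi, sigma, eta, f):
    """Apply w_s <- w_s + (eta / N) * f_s * (xi - w_s) for s = +1, -1."""
    n = len(xi)
    # both distances are evaluated before any prototype is changed
    d = {s: squared_distance(xi, w[s]) for s in (+1, -1)}
    for s in (+1, -1):
        step = eta / n * f(s, d, sigma)
        w[s] = [ws + step * (x - ws) for ws, x in zip(w[s], xi)]
    return w
```

Other prescriptions fit the same interface by passing a different function `f`.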


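Mathematical analysis of the learning dynamics

1. Description in terms of a few characteristic quantities (here: ℝ^{2N} → ℝ^7)

$$R_{s\sigma}^\mu = w_s^\mu \cdot B_\sigma, \qquad Q_{st}^\mu = w_s^\mu \cdot w_t^\mu, \qquad s, t, \sigma \in \{+1, -1\}$$

- $R_{s\sigma}$: projections into the $(B_+, B_-)$-plane
- $Q_{st}$: length and relative position of prototypes
- seven quantities in total: four $R_{s\sigma}$ and three independent $Q_{st}$, since $Q_{+-} = Q_{-+}$

The random vector $\xi^\mu$ enters only through its length and the projections

$$x_s^\mu = w_s^{\mu-1} \cdot \xi^\mu, \qquad y_\sigma^\mu = B_\sigma \cdot \xi^\mu$$

The update $w_s^\mu = w_s^{\mu-1} + \frac{\eta}{N} f_s \left(\xi^\mu - w_s^{\mu-1}\right)$ yields the recursions

$$\frac{R_{s\sigma}^\mu - R_{s\sigma}^{\mu-1}}{1/N} = \eta\, f_s \left(y_\sigma^\mu - R_{s\sigma}^{\mu-1}\right)$$

$$\frac{Q_{st}^\mu - Q_{st}^{\mu-1}}{1/N} = \eta\, f_s \left(x_t^\mu - Q_{st}^{\mu-1}\right) + \eta\, f_t \left(x_s^\mu - Q_{st}^{\mu-1}\right) + \eta^2 f_s f_t\, \frac{(\xi^\mu)^2}{N} + \mathcal{O}(1/N)$$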
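The characteristic quantities are plain scalar products, R_{sσ} = w_s · B_σ and Q_{st} = w_s · w_t, so they are easy to compute from a given pair of prototypes. The sketch below is an illustration; the dict-based layout keyed by ±1 and the name `order_parameters` are assumptions made here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def order_parameters(w, b):
    """Return (R, Q) with R[(s, sigma)] = w_s . B_sigma and
    Q[(s, t)] = w_s . w_t for s, t, sigma in {+1, -1}."""
    labels = (+1, -1)
    r = {(s, sigma): dot(w[s], b[sigma]) for s in labels for sigma in labels}
    q = {(s, t): dot(w[s], w[t]) for s in labels for t in labels}
    return r, q
```

The dict `q` holds four entries, but `q[(+1, -1)]` and `q[(-1, +1)]` coincide, which leaves seven independent numbers regardless of N.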


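2. Average over the current example

- random vector $\xi^\mu$ according to $P(\xi|\sigma)$, with average length $\langle (\xi^\mu)^2 \rangle_\sigma \approx N v_\sigma$ for large N
- in the thermodynamic limit N → ∞, the projections $x_s^\mu = w_s^{\mu-1} \cdot \xi^\mu$ and $y_\sigma^\mu = B_\sigma \cdot \xi^\mu$ are correlated Gaussian random quantities, completely specified in terms of first and second moments
- with $\langle \cdot \rangle = \sum_{\sigma=\pm1} p_\sigma \langle \cdot \rangle_\sigma$, the averaged recursions are closed in $\{R_{s\sigma}^\mu, Q_{st}^\mu\}$

3. Self-averaging property

- the characteristic quantities $\{R_{s\sigma}^\mu, Q_{st}^\mu\}$ depend on the random sequence of example data
- their variance vanishes with N → ∞ (here ∝ N^{-1})
- the learning dynamics is completely described in terms of averages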


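4. Continuous learning time

$\alpha = \mu / N$: number of examples, i.e. number of learning steps, per degree of freedom

Stochastic recursions → deterministic ODE:

$$\frac{R_{s\sigma}^\mu - R_{s\sigma}^{\mu-1}}{1/N} = \eta\, f_s \left(y_\sigma^\mu - R_{s\sigma}^{\mu-1}\right) \quad\longrightarrow\quad \frac{dR_{s\sigma}}{d\alpha} = \eta \left( \langle f_s\, y_\sigma \rangle - \langle f_s \rangle R_{s\sigma} \right)$$

Integration yields the evolution of the projections $R_{s\sigma}(\alpha)$, $Q_{st}(\alpha)$.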
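The same limit can be taken for the prototype overlaps Q_{st} = w_s · w_t. The following is a sketch of that step under the model assumptions, not a formula read off the slide. It starts from the one-step change of Q_{st} under the update w_s ← w_s + (η/N) f_s (ξ − w_s), with x_s = w_s · ξ, and uses that (ξ)²/N approaches v_σ for examples from cluster σ as N → ∞.

```latex
\frac{dQ_{st}}{d\alpha}
  = \eta \left( \langle f_s\, x_t \rangle - \langle f_s \rangle Q_{st}
              + \langle f_t\, x_s \rangle - \langle f_t \rangle Q_{st} \right)
  + \eta^2 \sum_{\sigma = \pm 1} p_\sigma\, v_\sigma\, \langle f_s f_t \rangle_\sigma ,
\qquad
\langle \cdot \rangle = \sum_{\sigma = \pm 1} p_\sigma \langle \cdot \rangle_\sigma
```

The term quadratic in η has no counterpart in the equation for R_{sσ}; it comes from the squared length of the update step.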

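5. Learning curve

Generalization error $\varepsilon_g(\alpha)$ after training with $\alpha N$ examples: the probability for misclassification of a novel example,

$$\varepsilon_g = p_+ \left\langle \Theta(d_+ - d_-) \right\rangle_+ + p_- \left\langle \Theta(d_- - d_+) \right\rangle_-$$

$$\varepsilon_g = \sum_{\sigma=\pm1} p_\sigma\, \Phi\!\left( \frac{Q_{\sigma\sigma} - Q_{-\sigma-\sigma} - 2\ell \left(R_{\sigma\sigma} - R_{-\sigma\sigma}\right)}{2 \sqrt{v_\sigma}\, \sqrt{Q_{++} - 2 Q_{+-} + Q_{--}}} \right), \qquad \Phi(z) = \int_{-\infty}^{z} \frac{e^{-x^2/2}}{\sqrt{2\pi}}\, dx$$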
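Given the characteristic quantities, the generalization error of the nearest-prototype classifier follows in closed form. An example from cluster σ is misclassified when d_σ > d_{−σ}, and the difference of the two projections w_{−σ} · ξ − w_σ · ξ is Gaussian with mean ℓ(R_{−σσ} − R_{σσ}) and variance v_σ(Q_{++} − 2Q_{+−} + Q_{−−}). The sketch below evaluates the resulting expression; the dict layout for R and Q and the function names are assumptions of this example.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def generalization_error(r, q, p_plus, ell, v):
    """Misclassification probability for prototypes described by
    R[(s, sigma)] and Q[(s, t)]; requires w_+ != w_-."""
    p = {+1: p_plus, -1: 1.0 - p_plus}
    # length of the difference vector w_+ - w_-
    norm = math.sqrt(q[(+1, +1)] - 2.0 * q[(+1, -1)] + q[(-1, -1)])
    eps = 0.0
    for sigma in (+1, -1):
        arg = (q[(sigma, sigma)] - q[(-sigma, -sigma)]
               - 2.0 * ell * (r[(sigma, sigma)] - r[(-sigma, sigma)]))
        eps += p[sigma] * phi(arg / (2.0 * math.sqrt(v[sigma]) * norm))
    return eps
```

As a consistency check, prototypes placed exactly at the cluster centers with v_+ = v_- = 1 give ε_g = Φ(−ℓ/√2), independent of the prior weights.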


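LVQ1: The winner takes it all

Only the winner $w_s$ ($s = \pm 1$) is updated, according to the class label:

$$w_s^\mu = w_s^{\mu-1} + \frac{\eta}{N}\, \Theta\!\left(d_{-s}^\mu - d_s^\mu\right) s\,\sigma^\mu \left(\xi^\mu - w_s^{\mu-1}\right)$$

Initialization: $w_s(0) \approx 0$

[Figure: characteristic quantities $Q_{++}$, $Q_{--}$, $Q_{+-}$ and $R_{s\sigma}$ versus α; theory and simulation (N = 100), p_+ = 0.8, v_+ = 4, v_- = 9, ℓ = 2.0, η = 1.0, averaged over 100 independent runs]

[Figure: trajectories of $w_+$ and $w_-$ in the $(B_+, B_-)$-plane, axes $R_{s+}$ and $R_{s-}$, cluster centers at ℓB_+ and ℓB_-; (•) mark α = 20, 40, ..., 140; also shown: optimal decision boundary and asymptotic position]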


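Learning curve

[Figure: $\varepsilon_g$ versus α for η = 2.0, 1.0, 0.2; p_+ = 0.2, ℓ = 1.0, v_+ = v_- = 1.0]

- suboptimal, non-monotonic behavior for small η
- stationary state: $\varepsilon_g(\alpha \to \infty)$ grows linearly with η
- well-defined asymptotics for η → 0, α → ∞ with (η α) → ∞

Achievable generalization error

[Figure: $\varepsilon_g$ versus p_+ for v_+ = v_- = 1.0 and for v_+ = 0.25, v_- = 0.81; curves: best linear boundary and LVQ1 (solid line)]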


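"LVQ 2.1" [Kohonen]

Here: update the correct and the wrong winner

$$w_s^\mu = w_s^{\mu-1} + \frac{\eta}{N}\, s\,\sigma^\mu \left(\xi^\mu - w_s^{\mu-1}\right)$$

[Figure: characteristic quantities R and Q versus α (0 to 10, vertical range −6 to 6); theory and simulation (N = 100), p_+ = 0.8, ℓ = 1, v_+ = v_- = 1, η = 0.5, averages over 100 independent runs]

For α → ∞, some of the characteristic quantities R, Q diverge while others remain finite.

Problem: instability of the algorithm due to repulsion of wrong prototypes

Trivial classification for α → ∞: $\varepsilon_g = \min\{p_+, p_-\}$

[Figure: projected trajectory, axes $R_{s+}$ and $R_{s-}$]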
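The instability can be reproduced in a small simulation. With f_s = s σ each prototype is rescaled by (1 − (η/N) s σ) in every step, so for p_+ > p_- the prototype w_- is stretched more often than it is contracted. The sketch below is an illustration under assumptions that are not taken from the slides: B_+ and B_- are the first two unit vectors, N = 10, and the names and the seed are arbitrary.

```python
import math
import random


def lvq21_step(w, xi, sigma, eta):
    """f_s = s * sigma: the correct prototype is attracted,
    the wrong one is repelled."""
    n = len(xi)
    for s in (+1, -1):
        step = eta / n * s * sigma
        w[s] = [ws + step * (x - ws) for ws, x in zip(w[s], xi)]


def run_lvq21(n=10, steps=200, eta=0.5, p_plus=0.8, ell=1.0, v=1.0, seed=0):
    """Train on the two-cluster model, starting from w_s = 0;
    return the squared lengths (Q_++, Q_--)."""
    rng = random.Random(seed)
    w = {+1: [0.0] * n, -1: [0.0] * n}
    std = math.sqrt(v)
    for _ in range(steps):
        sigma = +1 if rng.random() < p_plus else -1
        center = 0 if sigma == +1 else 1      # B_+ = e_1, B_- = e_2
        xi = [rng.gauss(ell if j == center else 0.0, std) for j in range(n)]
        lvq21_step(w, xi, sigma, eta)
    return tuple(sum(x * x for x in w[s]) for s in (+1, -1))
```

After α = steps / N = 20 the squared length of w_- is typically orders of magnitude larger than that of w_+, which stays of order one.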


Suggested strategy: selection of data in a window close to the current decision boundary

- slows down the repulsion, but the system remains unstable

Early stopping: end the training process at minimal $\varepsilon_g$ (idealized)

[Figure: $\varepsilon_g$ versus α for η = 2.0, 1.0, 0.5]

- pronounced minimum in $\varepsilon_g(\alpha)$, depends on initialization and cluster geometry
- lowest minimum assumed for η → 0

[Figure: $\varepsilon_g$ versus p_+ for v_+ = 0.25, v_- = 0.81; curves: LVQ1 and LVQ 2.1 with early stopping]
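Idealized early stopping amounts to reading off the minimum of the learning curve. The helper below is a minimal sketch that assumes the curve is available as sampled values; `early_stopping_point` is a hypothetical name. In a practical setting ε_g would have to be estimated, for instance on held-out data, which is why the strategy is called idealized.

```python
def early_stopping_point(alphas, errors):
    """Return (alpha, eps_g) at the minimum of a sampled learning curve."""
    best = min(range(len(errors)), key=errors.__getitem__)
    return alphas[best], errors[best]
```

For a curve sampled at α = 0, 1, 2, 3, 4 with errors 0.5, 0.3, 0.22, 0.25, 0.4 the function returns (2, 0.22).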


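"Learning From Mistakes (LFM)"

LVQ 2.1 update only if the current classification is wrong:

$$w_s^\mu = w_s^{\mu-1} + \frac{\eta}{N}\, \Theta\!\left(d_{\sigma^\mu}^\mu - d_{-\sigma^\mu}^\mu\right) s\,\sigma^\mu \left(\xi^\mu - w_s^{\mu-1}\right)$$

Crisp limit of Soft Robust LVQ [Seo and Obermayer 2003]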
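In code, LFM differs from LVQ 2.1 only by a guard: both prototypes are updated with f_s = s σ, but only when the example is currently misclassified, i.e. when the prototype of the correct class is farther away than the other one. The sketch below uses illustrative names and treats a tie between the two distances as a correct classification, which is a convention chosen here.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lfm_step(w, xi, sigma, eta):
    """Learning From Mistakes: LVQ 2.1 update, applied only if xi is
    misclassified by the current prototypes. Returns True on update."""
    d = {s: squared_distance(xi, w[s]) for s in (+1, -1)}
    if d[sigma] <= d[-sigma]:
        return False                  # classified correctly: no change
    n = len(xi)
    for s in (+1, -1):
        step = eta / n * s * sigma
        w[s] = [ws + step * (x - ws) for ws, x in zip(w[s], xi)]
    return True
```

The return value makes it easy to count how many examples actually triggered an update during training.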

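[Figure: projected trajectory in the plane of the center vectors, axes $R_{s+}$ and $R_{s-}$, cluster centers at ℓB_+ and ℓB_-; p_+ = 0.8, ℓ = 3.0, v_+ = 4.0, v_- = 9.0]

Learning curves

[Figure: $\varepsilon_g$ versus α for η = 2.0, 1.0, 0.5; p_+ = 0.8, ℓ = 1.2, v_+ = v_- = 1.0]

- η-independent asymptotic $\varepsilon_g$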


Comparison: achievable generalization ability

[Figure: $\varepsilon_g$ versus p_+ for equal cluster variances (v_+ = v_- = 1.0) and for unequal variances (v_+ = 0.25, v_- = 0.81); curves: best linear boundary, LVQ1 (solid), LVQ 2.1 with early stopping (dashed), LFM (dash-dotted)]


Summary

• prototype-based learning: Vector Quantization and Learning Vector Quantization
• a model scenario: two clusters, two prototypes; dynamics of on-line training
• comparison of algorithms:
  - LVQ 1: close to optimal asymptotic generalization
  - LVQ 2.1: instability, trivial (stationary) classification; with stopping: potentially very good performance
  - LFM: far from optimal generalization behavior

Work in progress, outlook

• multi-class, multi-prototype problems
• optimized procedures: learning rate schedules, variational approach / Bayes optimal on-line


Perspectives

• Self-Organizing Maps (SOM)
  - (many) N-dim. prototypes form a (low) d-dimensional grid
  - representation of data in a topology-preserving map
  - neighborhood preserving: SOM, Neural Gas (distance based)
• Generalized Relevance LVQ [e.g. Hammer & Villmann]
  - adaptive metrics, e.g. the distance measure $d_\lambda(w_s, \xi) = \sum_{i=1}^{N} \lambda_i \left(w_{s,i} - \xi_i\right)^2$, with the $\lambda_i$ adapted in training
• applications
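The adaptive distance used in relevance-based LVQ variants is a weighted squared Euclidean distance with one factor λ_i per input dimension. The sketch below shows only this distance; how the factors are adapted during training is not covered, and `relevance_distance` is a hypothetical name.

```python
def relevance_distance(w, xi, lam):
    """Weighted squared distance sum_i lam_i * (w_i - xi_i)^2."""
    return sum(l * (wi - x) ** 2 for l, wi, x in zip(lam, w, xi))
```

With all λ_i equal to one this reduces to the ordinary squared Euclidean distance; setting a factor to zero removes that dimension from the comparison, which can change which prototype is closest.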


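Outlook

Investigation and comparison of given algorithms:

- repulsive/attractive fixed points of the dynamics
- asymptotic behavior for α → ∞
- dependence on learning rate, separation, initialization

Optimization and development of new prescriptions:

- time-dependent learning rate η(α)
- variational optimization w.r.t. $f_s[\ldots]$: maximize the decrease of the generalization error, $-\,d\varepsilon_g / d\alpha$

Averages over the current example (indices μ omitted)

For ξ drawn from $P(\xi|\sigma)$ and N → ∞, the projections $x_s = w_s \cdot \xi$ and $y_\tau = B_\tau \cdot \xi$ are completely specified in terms of first and second moments:

$$\langle x_s \rangle_\sigma = \sum_{j=1}^{N} (w_s)_j\, \langle \xi_j \rangle_\sigma = \ell\, R_{s\sigma}, \qquad \langle y_\tau \rangle_\sigma = \ell\, \delta_{\tau\sigma}$$

$$\langle x_s x_t \rangle_\sigma - \langle x_s \rangle_\sigma \langle x_t \rangle_\sigma = v_\sigma\, Q_{st}$$

$$\langle x_s y_\tau \rangle_\sigma - \langle x_s \rangle_\sigma \langle y_\tau \rangle_\sigma = v_\sigma\, R_{s\tau}$$

$$\langle y_\rho y_\tau \rangle_\sigma - \langle y_\rho \rangle_\sigma \langle y_\tau \rangle_\sigma = v_\sigma\, \delta_{\rho\tau}$$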

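Self-averaging property (mean and variances)

[Figure: LVQ1 with initialization $w_s(0) = 0$; characteristic quantities $Q_{++}$, $Q_{--}$, $Q_{+-}$ and $R_{s\sigma}$ versus α, theory and simulation (N = 100), p_+ = 0.8, v_+ = 4, v_- = 9, ℓ = 2.0, η = 1.0]

[Figure: mean and variance of $R_{++}(\alpha = 10)$ as a function of 1/N]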


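High-dimensional data (formally N → ∞)

[Figure: example data $\xi^\mu \in \mathbb{R}^N$ with N = 200, ℓ = 1, p_+ = 0.4, v_+ = 0.44, v_- = 0.44; 240 and 160 examples from the two clusters, shown in two projections]

- projections into the plane of center vectors $B_+, B_-$: $y_\pm = B_\pm \cdot \xi^\mu$
- projections on two independent random directions $w_1, w_2$: $x_1 = w_1 \cdot \xi^\mu$, $x_2 = w_2 \cdot \xi^\mu$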

