
Biomedical Applications and the Probabilistic Framework

Peter Sykacek

http://www.sykacek.net


Talk Overview

- Motivation of probabilistic concepts
- BCI: current practice & shortcomings
- Probabilistic Kalman filter
- Adaptive BCI
- Gene discovery
- DAG for Bayesian marker identification
- Gene selection
- Discussion of model selection


Probabilistic Motivations

Thomas Bayes (1701 - 1763): learning from data using a decision-theoretic framework.

First consequence: we must revise beliefs according to Bayes' theorem,

$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}$, where $p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta$.

Second consequence: decisions are made by maximising expected utilities.
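To make both consequences concrete, here is a minimal sketch (a hypothetical coin-flip example, not from the talk) that revises a discretized prior over a bias theta via Bayes' theorem and then picks the action with the larger expected utility:

```python
import numpy as np

# Revise beliefs about a coin's bias theta after observing data D,
# then choose the action that maximises expected utility.
theta = np.linspace(0.01, 0.99, 99)
prior = np.ones_like(theta) / len(theta)         # flat prior p(theta)
heads, tails = 7, 3                              # observed data D
lik = theta**heads * (1 - theta)**tails          # p(D | theta)
post = lik * prior
post /= post.sum()                               # p(theta | D), Bayes' theorem

# Decision step: expected utility of betting on the next toss
u_heads = post @ theta                           # P(next toss is heads | D)
u_tails = post @ (1 - theta)
print("bet heads" if u_heads > u_tails else "bet tails")
```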

Brain Computer Interface

Computer is controlled directly by cortical activity.


Classification of BCIs

BCIs differ in the cognitive events they exploit (voluntary control, mostly motor tasks, vs. involuntary events such as the P300) and in how neural activity is recorded (scalp EEG vs. intracranial EEG).

Intracranial EEG ⇒ high spatial and temporal resolution; allows 2-d control of an artificial limb; but highly invasive!

Surface EEG ⇒ low spatial and temporal resolution; slow (at most 20 bit per minute and task); but no permanent interference with the patient.

⇒ Focus on BCIs based on scalp recordings: low bit rates; a last resort if no other communication is possible.

BCI with almost no adaptation

- P300-based (L. A. Farwell and E. Donchin): the user's intention is embedded within a sequence of symbols; the correct symbol leads to "surprise" and triggers a P300.

- Filter & threshold: N. Birbaumer et al. threshold slow cortical potentials; J. R. Wolpaw et al. threshold a moving average in an appropriate pass band, e.g. the μ-rhythm.

These principles rely mostly on user training.

BCI & static pattern recognition

- Extract a representation of EEG "waveforms" (e.g. low-pass filtered time series; a spectral representation).

- Parameterize a supervised classifier, implicitly assuming stationarity.

What if the technical setup changes during operation (e.g. electrolyte changes impedance)? What if the user learns from feedback? What if the user shows fatigue?

Assuming stationarity must be wrong!

⇒ A probabilistic method for "adaptive" BCI.

Probabilistic Kalman Filter

[Graphical model: hyperpriors α and β, the previous posterior (w_{n-1}, Λ_{n-1}), a random-walk transition with precision λI to w_n, and observation n emitting y_n.]

Key: get λ right (it may be regarded as a learning rate).

Classification makes the model non-linear and non-Gaussian; see the variational Kalman filter equations below.

[Two panels: simulations using σ_λ = 1e+003 with window sizes 1, 5, 10, 15 and 20, showing the estimate of λ (top) and the "instantaneous" generalization error (bottom) over 1000 samples, for B. D. Ripley's synthetic data with artificial non-stationarity (labels swapped after sample 500).]
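The following is a minimal sketch of the state-space idea in the linear-Gaussian case, where the Kalman recursions are exact. The class name and the Gaussian observation model are illustrative assumptions; the talk's classifier version is non-linear and non-Gaussian and is handled by the variational treatment sketched at the end.

```python
import numpy as np

class DriftingLinearKalman:
    """Online Bayesian tracking of weights w_n under a random walk
    w_n = w_{n-1} + eps, eps ~ N(0, (1/lam) I), with a Gaussian
    observation y_n = w_n' x_n + noise of precision beta.
    (A linear-Gaussian sketch, not the talk's variational classifier.)"""

    def __init__(self, dim, lam=1e3, beta=1.0):
        self.mu = np.zeros(dim)     # posterior mean of w
        self.cov = np.eye(dim)      # posterior covariance of w
        self.lam, self.beta = lam, beta

    def step(self, x, y):
        # predict: the random walk inflates the covariance by (1/lam) I
        cov_pred = self.cov + np.eye(len(x)) / self.lam
        # update: standard Kalman correction for a scalar observation
        s = x @ cov_pred @ x + 1.0 / self.beta    # innovation variance
        k = cov_pred @ x / s                      # Kalman gain
        self.mu = self.mu + k * (y - self.mu @ x)
        self.cov = cov_pred - np.outer(k, x @ cov_pred)
        return self.mu
```

Because the random walk keeps inflating the covariance, old evidence is discounted at a rate set by λ, which is why λ behaves like a learning rate and why the tracker can recover after a label swap.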

Adaptive BCI
by variational Kalman filtering

BCI: data-driven prediction of cognitive state from EEG measurements.

Working hypothesis: EEG dynamics during a cognitive task are subject to temporal variation (learning effects, fatigue, ...).

Represent EEG segments by z-transformed reflection coefficients.

Mutual information of the adaptive method and an identical "stationary" model:

[Eight panels: empirical CDFs of D_{P(y|z)||P(y)} in bit for navigation/auditory, navigation/movement, auditory/movement, navig./audit./move (3 class), rest/move without and with feedback, and move/math without and with feedback.]
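As a sketch of the feature extraction step: the reflection (partial correlation) coefficients of an autoregressive fit can be computed with the Levinson-Durbin recursion, and mapping them through arctanh is one natural reading of "z-transformed", since it sends the interval (-1, 1) to the real line. This is an illustrative reconstruction, not necessarily the exact preprocessing used in the talk:

```python
import numpy as np

def reflection_features(x, order=6):
    """Levinson-Durbin recursion on the biased sample autocorrelation;
    returns arctanh-transformed reflection coefficients of an AR(order) fit."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    r = np.array([x[: n - k] @ x[k:] for k in range(order + 1)]) / n
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    ks = np.zeros(order)
    for m in range(1, order + 1):
        # partial correlation at lag m given the AR(m-1) fit
        k = -(r[m] + a[1:m] @ r[1:m][::-1]) / err
        a[1:m] = a[1:m] + k * a[1:m][::-1]
        a[m] = k
        err *= 1.0 - k * k
        ks[m - 1] = k
    return np.arctanh(ks)   # maps (-1, 1) onto the real line

# e.g. features = reflection_features(eeg_segment, order=6)
```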

Communication Bandwidth

Bit rates in [bit/s]:

task                | vkf | vsi
rest/move, no fb.   |  …  |  …
rest/move, fb.      |  …  |  …
move/math, no fb.   |  …  |  …
move/math, fb.      |  …  |  …
nav./aud./move      |  …  |  …
audit./move         |  …  |  …
navig./move         |  …  |  …
navig./audit.       |  …  |  …

Conclusion: adaptive methods increase BCI bandwidths even on short time scales.
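For orientation, BCI bandwidths are commonly reported with Wolpaw's bit-rate formula; it is an assumption that something of this form underlies the table, but it shows how accuracy and decision speed combine into bit/s:

```python
import math

def wolpaw_bitrate(n_classes, accuracy, decisions_per_second):
    """Bits per second under the Wolpaw formula, which assumes
    equiprobable classes and uniformly distributed errors. A standard
    BCI bandwidth measure, not necessarily the estimator used here."""
    p, n = accuracy, n_classes
    bits = math.log2(n)
    if 0.0 < p < 1.0:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    return bits * decisions_per_second

# e.g. a 2-class BCI at 85% accuracy, one decision every 4 seconds:
print(wolpaw_bitrate(2, 0.85, 0.25))  # ~0.098 bit/s
```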

Gene Discovery

Discovering "important" genes (or proteins) from microarray datasets can be classified as:

- identification of all differentially expressed genes;
- identification of reliable (sets of) marker genes.

Current practice for the first: classical methods (e.g. a t-test on differences of means) or probabilistic approaches with one indicator variable per gene.

The second is typically done by conventional feature subset selection. As a result we obtain a single set of genes found by heuristic search.

Bayesian Marker Identification

Missing in FSS (feature subset selection): how good are other explanations?

Interpret microarray data as a classification problem of "genetic" regressors w.r.t. a discrete response.

⇒ Bayesian variable selection provides this information. However: hopeless, unless we constrain the dimensionality.

Simplified attempt: ⇒ find a distribution over individual genes. Probabilities result from the marginal likelihood of each single-gene model,

$P(M_i \mid D) = \frac{p(D \mid M_i)\, P(M_i)}{\sum_j p(D \mid M_j)\, P(M_j)}.$
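A minimal sketch of this simplified attempt: score each single-gene model by an analytic marginal likelihood and normalize across genes. The conjugate Gaussian class-mean model below (fixed noise variance, flat model prior) is an illustrative stand-in for the probit GLM of the next slide:

```python
import numpy as np

def log_marglik_gaussian(x, sigma2=1.0, tau2=10.0):
    """Log marginal likelihood of x_i ~ N(mu, sigma2) with conjugate
    prior mu ~ N(0, tau2); mu is integrated out analytically."""
    x = np.asarray(x, float)
    n = len(x)
    A = n / sigma2 + 1.0 / tau2
    B = x.sum() / sigma2
    C = (x ** 2).sum() / sigma2
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            - 0.5 * np.log(tau2 * A)
            + B * B / (2 * A) - 0.5 * C)

def gene_posteriors(X, y):
    """P(M_i | D) over single-gene models: each gene's evidence is the
    product of per-class marginal likelihoods (class-specific means)."""
    logev = np.array([
        sum(log_marglik_gaussian(X[y == c, i]) for c in np.unique(y))
        for i in range(X.shape[1])])
    w = np.exp(logev - logev.max())   # stable softmax over models
    return w / w.sum()
```

The normalization step is what distinguishes this from per-gene testing: the result is a distribution over competing single-gene explanations, not a list of independent p-values.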

DAG for Marker Identification

[Graphical model: prior precision Λ_0 and class prior Π; the weights w and gene expression x_n determine the latent variable z_n, which is thresholded to produce the label y_n; plate over observations n.]

Latent variable probit GLM: z_n is a one-dimensional Gaussian random variable with mean wᵀx_n and fixed precision;

y_n = 1 if z_n > 0, and y_n = 0 otherwise.

Inference can be done by a variational method (systematic error) or by sampling (random error). The latter allows us to integrate over the weights w analytically, so we draw from the latent variables and hyperparameters only.
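For the sampling route, a minimal Gibbs sampler in the style of Albert & Chib alternates the latent z and the weights w; the talk's scheme additionally integrates w out analytically, which this sketch keeps explicit for readability:

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=2000, prior_prec=1.0, seed=0):
    """Gibbs sampler for the latent-variable probit GLM:
    z_n ~ N(w'x_n, 1), y_n = 1[z_n > 0], w ~ N(0, prior_prec^-1 I)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    V = np.linalg.inv(prior_prec * np.eye(d) + X.T @ X)  # posterior cov of w
    L = np.linalg.cholesky(V)
    w = np.zeros(d)
    samples = []
    for _ in range(n_iter):
        # z | w, y: truncated normals, sign constrained by the label
        m = X @ w
        lo = np.where(y == 1, -m, -np.inf)   # z > 0 when y = 1
        hi = np.where(y == 1, np.inf, -m)    # z <= 0 when y = 0
        z = m + truncnorm.rvs(lo, hi, random_state=rng)
        # w | z: Gaussian with mean V X'z and covariance V
        w = V @ (X.T @ z) + L @ rng.standard_normal(d)
        samples.append(w.copy())
    return np.array(samples)

# e.g. w_draws = probit_gibbs(X, y); summarize after discarding burn-in.
```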

Asymptotic Behaviour

[Four panels over gene index 0-200: cumulative Bayes posterior probability for Leukaemia (3, 18 and 38 slides) and Colon cancer (3, 30 and 62 slides), alongside test-based rankings by ascending p-value for the same subsample sizes.]

Comparison with ML

[Two panels: cumulative probability over gene index (0-200) for Leukaemia and Colon cancer, comparing Bayesian probabilities with maximum likelihood.]

Results differ since Bayesian model posteriors take "complexity" (cf. Hochreiter's "flat minima") into account.

Selection and Gen. Accuracy

Most probable regressors, selected at a fixed probability threshold:

Acc. no. | Description      | P(M_i | D)

Colon cancer (Alon et al.):
Z50753   | Uroguanylin      | 0.76
R87126   | Myosin           | 0.21
M63391   | Desmin gene      | 0.01
M36634   | Vasoact. pept.   | 0.01

Leukaemia (Golub et al.):
X95735   | Zyxin            | 0.93
M55150   | FAH Fumarylac.   | 0.05
M27891   | CST3 Cystatin C  | 0.01

Generalization accuracy:

Dataset   | B. probit | "Indifference"
Colon     | 84%       | 74% to 94%
Leukaemia | 88%       | 91% to 96%

No "better" results in the literature ⇒ confirms the model.

Biology confirms Uroguanylin (cell apoptosis) as important in colon cancer development.

But: what is the meaning of the probabilities?

Discussion

Quoting Bernardo & Smith: this is M-closed model selection with a zero-one utility.

Our approach should assume an M-open scenario. Under asymptotic normality, $P(M \mid D)$ degenerates onto the model $M^{*}$ that minimizes the Kullback-Leibler divergence $\mathrm{KL}\left(p_{\text{true}} \,\|\, p(\cdot \mid M)\right)$.

If the predictive distribution of a new observation is of interest, B&S suggest a logarithmic score function for M-open model comparison:

$\mathbb{E}\left[\log p(y_{\text{new}} \mid D, M)\right]$

(e.g. a cross-validation estimate, still to be done).
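The cross-validation estimate mentioned above can be sketched as follows; fit and predict_proba are placeholder callables standing in for whatever model M one wants to score:

```python
import numpy as np

def cv_log_score(fit, predict_proba, X, y, k=5, seed=0):
    """k-fold cross-validation estimate of the expected log predictive
    score E[log p(y_new | D, M)] for one binary-classification model M."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for f in folds:
        train = np.setdiff1d(idx, f)
        model = fit(X[train], y[train])
        p = predict_proba(model, X[f])            # p(y=1 | x, D, M)
        p_obs = np.where(y[f] == 1, p, 1.0 - p)   # prob. of observed label
        scores.append(np.log(np.clip(p_obs, 1e-12, None)))
    return np.concatenate(scores).mean()
```

Models are then compared by this score rather than by posterior model probabilities, which is the M-open recommendation.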

A simple idea: the world is one probabilistic model.

- Applications often require a hierarchical structure: a feature-extraction part and a probabilistic model.

- Classical approach: treat both parts separately and thus regard features as a sufficient statistic of the data. ⇒ Features are deterministic variables.

- Our suggestion: treat such hierarchical settings as one probabilistic model. ⇒ Feature extraction is a representation in a latent space.

Bayes' Consistent Models

[Graphical model: sensor data X_a and X_b, feature representations φ_a and φ_b with parameters a and b, fused into the class label t.]

Expected utility requires integrating over all unknown variables, including φ_a, φ_b, a and b, which represent a feature space.

[Panel "Probabilistic sensor fusion": posterior probabilities P(2) for latent features vs. conditioning, plotted against the precision of p(φ_a | X_a) on a logarithmic scale.]

Decisions depend on (un)certainty and may thus change.
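A toy illustration of why decisions depend on (un)certainty: with a hypothetical probit read-out of a single uncertain feature φ, conditioning on the posterior mean (plug-in) and integrating over p(φ | X) give different class probabilities, and hence possibly different utility-maximising decisions. All names and numbers here are illustrative:

```python
import numpy as np
from scipy.stats import norm

def plugin_vs_integrated(phi_mean, phi_prec, w=2.0, b=0.0, n_mc=10000, seed=0):
    """Class probability from a probit read-out of an uncertain feature
    phi ~ N(phi_mean, 1/phi_prec): plug-in uses the mean only, the
    Bayes-consistent answer integrates over the feature posterior."""
    rng = np.random.default_rng(seed)
    p_plugin = norm.cdf(w * phi_mean + b)
    phi = rng.normal(phi_mean, phi_prec ** -0.5, n_mc)
    p_bayes = norm.cdf(w * phi + b).mean()
    return p_plugin, p_bayes

# Low feature precision pulls the integrated probability toward 0.5,
# which can flip a utility-maximising decision:
print(plugin_vs_integrated(0.4, 0.25))  # plug-in ~0.79, integrated ~0.58
```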

Time Series Classification

ROC curves:

[Four panels: sensitivity vs. 1 - specificity for Navig. vs. Aud., Left vs. Right, Spindle and Synthetic data, comparing Bayes with conditioning.]

Kullback-Leibler divergence:

[Four panels: empirical CDFs of D_{P(t|x)||P(t)} in bit for the same four problems, Bayes vs. conditioning.]

More Results

Expected feature values:

[Six panels comparing Bayes with conditioning: Spindle (dims 1 vs. 2), Synthetic (dims 2 vs. 3) and Navig. vs. Aud. (dims 1 vs. 2).]

Kullback-Leibler divergence for "artefacts":

[Two panels: empirical CDFs of D_{P(t|x)||P(t)} in bit for correct and for wrong classifications, Bayes vs. conditioning.]

Variational Kalman Filter

Reading off the graphical model above, the logarithmic model evidence for a window of size $k$ is

$\log p(y_{n-k+1}, \ldots, y_n) = \log \int\!\!\int p(w_{n-1} \mid \mu_{n-1}, \Lambda_{n-1}) \, p(w_n \mid w_{n-1}, \lambda I) \prod_{i=n-k+1}^{n} p(y_i \mid w_n) \, dw_{n-1} \, dw_n.$

This is not a probabilistic structure! (We would need a Rauch-Tung-Striebel smoother.)

Plugging in the Gaussian distributions and integrating over $w_{n-1}$ gives the usual prediction step:

$\log p(y_{n-k+1}, \ldots, y_n) = \log \int \mathcal{N}\!\left(w_n \mid \mu_{n-1}, \, \Lambda_{n-1}^{-1} + \lambda^{-1} I\right) \prod_{i=n-k+1}^{n} p(y_i \mid w_n) \, dw_n.$

Lower Bounds

By Jensen's inequality, any distribution $q(w_n)$ yields a lower bound on the log evidence of the window $D_n$:

$\log p(D_n) \;\ge\; \int q(w_n) \log \frac{p(D_n \mid w_n)\, p(w_n)}{q(w_n)} \, dw_n \;=\; \mathbb{E}_{q}\!\left[\log p(D_n \mid w_n)\right] - \mathrm{KL}\!\left(q(w_n) \,\|\, p(w_n)\right),$

where $p(w_n)$ is the prediction-step prior from the previous slide. Since the classification likelihood is non-Gaussian, the expectation is itself bounded, and the variational posterior $q$ is found by maximising the overall bound.
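To connect the pieces, the bound above can be evaluated numerically for the probit likelihood: the KL term between two Gaussians is closed-form, and the expected log-likelihood can be estimated by Monte Carlo. A sketch under those assumptions (not the talk's analytic local bounds):

```python
import numpy as np
from scipy.stats import norm

def elbo_mc(X, y, mu_q, cov_q, mu_p, cov_p, n_mc=5000, seed=0):
    """Monte Carlo estimate of E_q[log p(D|w)] - KL(q || p) for the
    probit likelihood, Gaussian prior p and Gaussian posterior q."""
    rng = np.random.default_rng(seed)
    d = len(mu_q)
    Lq = np.linalg.cholesky(cov_q)
    w = mu_q + rng.standard_normal((n_mc, d)) @ Lq.T    # draws from q
    p1 = norm.cdf(X @ w.T)                              # p(y=1|x,w)
    lik = np.where(y[:, None] == 1, p1, 1 - p1)
    e_loglik = np.log(np.clip(lik, 1e-12, None)).sum(0).mean()
    # KL(q || p) between two Gaussians in closed form
    cov_p_inv = np.linalg.inv(cov_p)
    dm = mu_p - mu_q
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    kl = 0.5 * (np.trace(cov_p_inv @ cov_q) + dm @ cov_p_inv @ dm
                - d + logdet_p - logdet_q)
    return e_loglik - kl
```

Maximising this quantity over (mu_q, cov_q) tightens the lower bound, which is exactly what the variational Kalman filter's update step does at each window.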
