Biomedical Applications and the Probabilistic Framework
Peter Sykacek
http://www.sykacek.net
Talk Overview
- Motivation of Probabilistic Concepts
- BCI, current practice & shortcomings
- Probabilistic Kalman Filter
- Adaptive BCI
- Gene Discovery
- DAG for Bayesian Marker Identification
- Gene Selection
- Discussion of Model Selection
Probabilistic Motivations

Thomas Bayes (1701 - 1763): learning from data using a decision theoretic framework.

First consequence: we must revise beliefs according to Bayes' theorem,

$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}$, where $p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta$.

Second consequence: decisions by maximising expected utilities.
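The second consequence can be written out explicitly; a minimal sketch in standard decision-theoretic notation (the utility $U$ and action set $\mathcal{A}$ are generic placeholders, not defined on the slide):

```latex
% Decision by maximising expected utility under the posterior:
a^{*} \;=\; \arg\max_{a \in \mathcal{A}} \; \mathbb{E}_{p(\theta \mid D)}\!\left[ U(a, \theta) \right]
      \;=\; \arg\max_{a \in \mathcal{A}} \int U(a, \theta)\, p(\theta \mid D)\, d\theta .
```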
Brain Computer Interface
Computer is controlled directly by cortical activity.
Classification of BCIs

Neural activity: voluntary control (mostly motor tasks) vs. cognitive events: involuntary events (P300); recorded by scalp EEG or intracranial EEG.

- intracranial EEG ⇒ high spatial and temporal resolution; highly invasive! Allows 2-d control of an artificial limb.
- surface EEG ⇒ low spatial and temporal resolution; no permanent interference with the patient; slow! At most 20 bit per minute and task.

⇒ focus on BCIs based on scalp recordings.
⇒ low bit rates; last resort if no other communication is possible.
BCI with almost no adaptation

- P300 based (L. A. Farwell and E. Donchin): user intention is embedded within a sequence of symbols. The correct symbol leads to "surprise" and triggers a P300.
- Filter & threshold: N. Birbaumer et al. threshold slow cortical potentials; J. R. Wolpaw et al. threshold a moving average in an appropriate pass band, e.g. the µ-rhythm.

These principles rely mostly on user training.
BCI & static pattern recognition

- Extract a representation of EEG "waveforms" (e.g. low pass filtered time series; spectral representation).
- Parameterize supervised classification, implicitly assuming stationarity.

What if the technical setup changes during operation (e.g. electrolyte changes impedance)? The user learns from feedback? The user shows fatigue?

Assuming stationarity must be wrong!
⇒ Probabilistic method for "adaptive" BCI.
Probabilistic Kalman Filter

[Figure: DAG of the model: weights w_{n-1} with precision Λ_{n-1} drift to w_n via λI; hyperparameters α and β; observation n produces the target y_n.]

Key: get σ_λ right (it may be regarded as a learning rate).

Classification: non linear and non Gaussian, some eqns. (see the variational Kalman filter slide)

[Figure: two panels of simulations using σ_λ = 1e+003 for window sizes 1, 5, 10, 15 and 20, illustrating σ_λ and the "instantaneous" generalization error for B. D. Ripley's synthetic data with artificial non-stationarity (labels swapped after sample 500).]
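To make the picture concrete, here is a minimal sketch of such a random-walk Kalman filter for classifier weights, assuming a linear-Gaussian observation model; the function name, the noise precision beta and the drift parameterisation via sigma_lambda are illustrative assumptions, not the talk's exact equations:

```python
import numpy as np

def kalman_step(m, P, x, y, sigma_lambda=1.0, beta=1.0):
    """One predict/update step for weights w with posterior mean m, cov P.

    Assumed model (a sketch, not the talk's exact formulation):
      drift:       w_n = w_{n-1} + v_n,    v_n ~ N(0, sigma_lambda^2 I)
      observation: y_n = x_n^T w_n + e_n,  e_n ~ N(0, 1/beta)
    """
    d = len(m)
    # Predict: the drift inflates the covariance; this is why sigma_lambda
    # acts like a learning rate (larger drift -> faster forgetting).
    P_pred = P + sigma_lambda**2 * np.eye(d)
    # Update with the scalar observation y.
    S = x @ P_pred @ x + 1.0 / beta      # innovation variance
    K = P_pred @ x / S                   # Kalman gain
    m_new = m + K * (y - x @ m)
    P_new = P_pred - np.outer(K, x) @ P_pred
    return m_new, P_new
```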
Adaptive BCI by variational Kalman filtering

BCI: data driven prediction of the cognitive state from EEG measurements.

Working hypothesis: EEG dynamics during a cognitive task are subject to temporal variation (learning effects, fatigue, ...).

Represent EEG segments by z-transformed reflection coefficients.

Mutual information for the adaptive method and the identical "stationary" model:
[Figure: empirical cdfs of D_{P(y|z)||P(y)} in bit for navigation/auditory, navigation/movement, auditory/movement, navig./audit./move (3 class), rest/move without and with feedback, and move/math without and with feedback.]
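The divergence on the plot axes can be computed per trial; a minimal sketch for discrete class probabilities (the function name and the log2/bit convention are assumptions):

```python
import numpy as np

def kl_bits(p_post, p_prior):
    """D(P(y|z) || P(y)) in bit for discrete class distributions."""
    p_post = np.asarray(p_post, dtype=float)
    p_prior = np.asarray(p_prior, dtype=float)
    mask = p_post > 0                      # 0 * log 0 = 0 by convention
    return float(np.sum(p_post[mask] * np.log2(p_post[mask] / p_prior[mask])))

# e.g. a confident posterior against a uniform 2-class prior:
# kl_bits([0.9, 0.1], [0.5, 0.5]) ≈ 0.53 bit
```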
Communication Bandwidth

Bit rates in [bit/s]:

[Table: bit rates (vkf vs. vsi) for the tasks rest/move without and with feedback, move/math without and with feedback, nav./aud./move, audit./move, navig./move and navig./audit.; the numeric entries are not recoverable from the transcript.]
Conclusion: adaptive methods increase BCI bandwidths even on short time scales.
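The slide's bit-rate definition did not survive the transcript; as a hedged point of reference, the definition commonly used in the BCI literature (Wolpaw et al.) for N classes with accuracy P is sketched below (the function name is an assumption):

```python
import math

def wolpaw_bitrate(n_classes, accuracy, trial_seconds):
    """Bits per second for an N-class decision with accuracy P in (1/N, 1)."""
    N, P = n_classes, accuracy
    bits_per_trial = (math.log2(N)
                      + P * math.log2(P)
                      + (1 - P) * math.log2((1 - P) / (N - 1)))
    return bits_per_trial / trial_seconds

# e.g. wolpaw_bitrate(2, 0.8, 1.0) ≈ 0.28 bit/s
```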
Gene discovery

Discovering "important" genes (or proteins) from microarray datasets can be classified as

- identification of all differentially expressed genes;
- identification of reliable (sets of) marker genes.

Current practice for the first: classical methods (e.g. a t-test on differences of means) or probabilistic approaches with one indicator variable for each gene. The second is typically done by conventional feature subset selection. As a result we obtain a set of genes that was found by heuristic search.
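A minimal sketch of the classical baseline mentioned above: per-gene two-sample t-tests with ranking by p-value (all names are illustrative; this is the conventional approach, not the talk's Bayesian method):

```python
import numpy as np
from scipy import stats

def rank_genes_by_ttest(X_a, X_b):
    """X_a, X_b: expression matrices of shape (samples, genes), one per class."""
    t, p = stats.ttest_ind(X_a, X_b, axis=0)   # one test per gene column
    order = np.argsort(p)                      # most significant genes first
    return order, p[order]
```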
Bayesian Marker Identification

Missing in FSS: how good are other explanations?

Interpret microarray data as a classification problem of "genetic" regressors w.r.t. a discrete response.

⇒ Bayesian variable selection provides this information. However: hopeless, unless we constrain the dimensionality.

Simplified attempt: ⇒ find a distribution over individual genes. Probabilities result from the marginal likelihood of each model:

$P(g \mid D) \propto P(D \mid g)\, P(g)$, where $P(D \mid g) = \int p(D \mid \theta, g)\, p(\theta \mid g)\, d\theta$.
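A hedged sketch of that idea: score every single-gene classifier by an approximation to its marginal likelihood and normalize over genes. The BIC approximation and all names below are stand-ins for illustration; the talk's actual inference is variational or sampling based:

```python
import numpy as np
from scipy.special import logsumexp
from sklearn.linear_model import LogisticRegression

def gene_posterior(X, y):
    """X: (samples, genes) expression matrix, y: binary labels.
    Returns an approximate distribution P(gene | D) over single-gene models."""
    n, n_genes = X.shape
    log_ev = np.empty(n_genes)
    for g in range(n_genes):
        x = X[:, [g]]
        clf = LogisticRegression(C=1e6).fit(x, y)     # ~ maximum likelihood
        p = np.clip(clf.predict_proba(x)[:, 1], 1e-12, 1 - 1e-12)
        loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
        log_ev[g] = loglik - 0.5 * 2 * np.log(n)      # BIC: 2 parameters
    return np.exp(log_ev - logsumexp(log_ev))         # uniform model prior
```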
DAG for Marker Identification

[Figure: DAG of the model: prior precision Λ_0 on the weights w; the indicator Π selects the regressor x_n; latent z_n; observation n with label y_n.]

Latent variable probit GLM: z is a one dimensional Gaussian random variable with mean $w^T x_n$ and unit precision;

$y_n = 1$ if $z_n > 0$, and $y_n = 0$ otherwise.

Inference can be done by a variational method (systematic error) or by sampling (random error). The latter allows us to integrate over w analytically, so we draw from z and Π only.
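For a fixed choice of regressors, probit inference by sampling follows the data augmentation scheme of Albert & Chib (1993); a minimal sketch, and a simplified stand-in: it draws w explicitly rather than integrating it out as the talk's collapsed sampler does:

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter=1000, prior_prec=1.0, seed=0):
    """X: (n, d) regressors, y: 0/1 labels. Returns posterior draws of w."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    V = np.linalg.inv(prior_prec * np.eye(d) + X.T @ X)  # cond. cov of w
    draws = []
    for _ in range(n_iter):
        # 1) latent z_i ~ N(x_i.w, 1), truncated to agree with the label:
        mu = X @ w
        lo = np.where(y == 1, -mu, -np.inf)   # y=1 forces z > 0
        hi = np.where(y == 1, np.inf, -mu)    # y=0 forces z <= 0
        z = mu + truncnorm.rvs(lo, hi, size=n, random_state=rng)
        # 2) w | z is Gaussian; draw it directly:
        w = rng.multivariate_normal(V @ (X.T @ z), V)
        draws.append(w)
    return np.array(draws)
```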
Asymptotic Behaviour

[Figure: four panels: cumulative posterior probability over the gene index (Bayes posterior) and ascending p-values (test based ranking), for Leukaemia with 3, 18 and 38 slides and for Colon cancer with 3, 30 and 62 slides.]
Comparison with ML

[Figure: cumulative probability over the gene index for Leukaemia and Colon cancer, Bayesian probabilities vs. maximum likelihood.]

Results differ since Bayesian model posteriors take "complexity" (cf. Hochreiter's "flat minima") into account.
Selection and Gen. Accuracy

Most probable regressors, selected at a 1% threshold (columns: accession no., description, posterior probability):

Colon Cancer (Alon et al.)
  Z50753  Uroguanylin      0.76
  R87126  Myosin           0.21
  M63391  desmin gene      0.01
  M36634  vasoact. pept.   0.01

Leukaemia (Golub et al.)
  X95735  Zyxin            0.93
  M55150  FAH Fumarylac.   0.05
  M27891  CST3 Cystatin C  0.01

Generalization accuracy:

  Dataset    B. probit  "indifference"
  Colon      84%        74% to 94%
  Leukaemia  88%        91% to 96%

No "better" results in the literature ⇒ confirms the model.
Biology confirms Uroguanylin (cell apoptosis) as important in colon cancer development.
But: what is the meaning of the probabilities?
Discussion

Quoting Bernardo & Smith: $P(m \mid D)$ is M-closed model selection with zero-one utility. Our approach should assume an M-open scenario. Under asymptotic normality, $P(m \mid D)$ degenerates on the model $m^{*}$ that minimizes the Kullback Leibler divergence to the data generating distribution.

If the predictive distribution of a new observation is of interest, B&S suggest using a logarithmic score function for M-open model comparison:

$E\left[\log p(y_{new} \mid D, m)\right]$

(e.g. a cross validation estimate, still to be done)
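A hedged sketch of how that score is typically estimated in practice, by leave-one-out cross validation (the notation $D_{-i}$ for the data without case $i$ is an assumption, not from the slide):

```latex
% Cross-validation estimate of the logarithmic score of model m:
\hat{S}(m) \;=\; \frac{1}{n} \sum_{i=1}^{n} \log p\!\left(y_i \,\middle|\, D_{-i},\, m\right)
```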
A simple idea: the world is one probabilistic model.

- Applications often require a hierarchical structure: a feature extraction part and a probabilistic model.
- Classical approach: treat both parts separately and thus regard features as a sufficient statistic of the data. ⇒ Features are deterministic variables.
- Our suggestion: treat such hierarchical settings as one probabilistic model. ⇒ Feature extraction is a representation in a latent space.
Bayes’ Consistent Models
ϕϕI
baI
a
a b
b
X X
t
Expected utility requiresto integrate over all un-known variables, includ-ing , , and thatrepresent a feature space.
−6 −5 −4 −3 −2 −1 0 1 20.25
0.3
0.35
0.4
0.45
0.5
0.55
0.6
0.65
0.7
Precision of p(φa|X
a) on logarithmic scale
Pos
terio
r pr
obab
ilitie
s
Probabilistic sensor fusion
P(2) − latent feauturesP(2) − conditioning
Decisions depend on(un)certainty and maythus change.
jump 2 TOC Probabilistic Methods with Applications, Peter Sykacek, 2004 – p.19/23
Bayes’ Consistent Models
ϕϕI
baI
a
a b
b
X X
t
Expected utility requiresto integrate over all un-known variables, includ-ing ��� , ��� ,
�� and
�� thatrepresent a feature space.
−6 −5 −4 −3 −2 −1 0 1 20.25
0.3
0.35
0.4
0.45
0.5
0.55
0.6
0.65
0.7
Precision of p(φa|X
a) on logarithmic scale
Pos
terio
r pr
obab
ilitie
s
Probabilistic sensor fusion
P(2) − latent feauturesP(2) − conditioning
Decisions depend on(un)certainty and maythus change.
jump 2 TOC Probabilistic Methods with Applications, Peter Sykacek, 2004 – p.19/23
Bayes’ Consistent Models
ϕϕI
baI
a
a b
b
X X
t
Expected utility requiresto integrate over all un-known variables, includ-ing ��� , ��� ,
�� and
�� thatrepresent a feature space.
−6 −5 −4 −3 −2 −1 0 1 20.25
0.3
0.35
0.4
0.45
0.5
0.55
0.6
0.65
0.7
Precision of p(φa|X
a) on logarithmic scale
Pos
terio
r pr
obab
ilitie
s
Probabilistic sensor fusion
P(2) − latent feauturesP(2) − conditioning
Decisions depend on(un)certainty and maythus change.
jump 2 TOC Probabilistic Methods with Applications, Peter Sykacek, 2004 – p.19/23
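The integration over feature-space variables can be approximated by Monte Carlo; a minimal sketch (all names are illustrative) of averaging class probabilities over draws of a latent feature rather than conditioning on a point estimate:

```python
import numpy as np

def predictive_prob(classifier_prob, phi_samples):
    """Average class probabilities over posterior draws of latent features.

    classifier_prob: function mapping a feature vector to class probabilities;
    phi_samples:     array of draws from p(phi | X).
    """
    return np.mean([classifier_prob(phi) for phi in phi_samples], axis=0)
```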
Time Series Classification

ROC Curves
[Figure: ROC curves (sensitivity vs. 1 − specificity) for Navig. vs. Aud., Left vs. Right, Spindle and Synthetic; Bayes vs. cond.]

Kullback Leibler Divergence
[Figure: empirical cdfs of D_{P(t|x)||P(t)} in bit for the same four tasks; Bayes vs. cond.]
More Results

Expected feature values
[Figure: expected feature values for Spindle and Navig. vs. Aud. (dim. 1 vs. dim. 2) and Synthetic (dim. 2 vs. dim. 3), Bayes vs. cond.]

Kullback Leibler Divergence for "Artefacts"
[Figure: empirical cdfs of D_{P(t|x)||P(t)} in bit for correct and for wrong classifications, Bayes vs. cond.]
Variational Kalman Filter

The logarithmic model evidence for a window of size W is, schematically,

$\log p(y_{n-W+1}, \ldots, y_n) = \log \int p(w_{n-W}) \prod_{i=n-W+1}^{n} p(w_i \mid w_{i-1})\, p(y_i \mid w_i)\, dw_{n-W:n}$

This is not a probabilistic structure! (need the Rauch Tung Striebel smoother)

Plug in distributions and integrate over $w_{n-W:n}$:

[the expanded expression is not recoverable from the transcript]
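For the linear-Gaussian regression variant the windowed evidence factorizes over the filter's one-step predictive densities; a minimal sketch under the same assumed drift model as the earlier Kalman step (classification, being non linear and non Gaussian, instead needs the lower bounds on the next slide):

```python
import numpy as np

def window_log_evidence(xs, ys, m0, P0, sigma_lambda=1.0, beta=1.0):
    """Log evidence of a window of scalar targets ys with regressors xs,
    accumulated from the Kalman filter's one-step predictive densities."""
    m, P = np.array(m0, float), np.array(P0, float)
    log_ev = 0.0
    for x, y in zip(xs, ys):
        P_pred = P + sigma_lambda**2 * np.eye(len(m))   # drift step
        S = x @ P_pred @ x + 1.0 / beta                 # predictive variance
        r = y - x @ m                                   # innovation
        log_ev += -0.5 * (np.log(2 * np.pi * S) + r * r / S)
        K = P_pred @ x / S
        m, P = m + K * r, P_pred - np.outer(K, x) @ P_pred
    return log_ev
```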
Lower Bounds

For any distribution $q(w)$, Jensen's inequality gives the variational lower bound

$\log p(y) = \log \int q(w)\, \frac{p(y, w)}{q(w)}\, dw \;\geq\; \int q(w) \log \frac{p(y, w)}{q(w)}\, dw$

[the slide's expanded Gaussian expressions are not recoverable from the transcript]