ESE 524 Detection and Estimation Theory
TRANSCRIPT
![Page 1: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/1.jpg)
ESE 524 Detection and Estimation Theory
Joseph A. O'Sullivan
Samuel C. Sachs Professor
Electronic Systems and Signals Research Laboratory
Electrical and Systems Engineering
Washington University, 211 Urbauer Hall
314-935-4173 (Lynda answers)
[email protected]
J. A. O'S. ESE 524, Lecture 10, 02/20/09
![Page 2: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/2.jpg)
Announcements
- Problem Set 3 is due in class 2/20
- We have another make-up class Feb. 27
- Another Friday after spring break
- Midterm exam?
- Other announcements or questions?
![Page 3: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/3.jpg)
Statistical Inference (block diagram)
- Parameter Space: hypothesis, continuous, discrete, random process
- Transition probability or pdf: $p(R|\theta)$
- Data Space
- Inference Algorithm: log-likelihood ratio test, parameter estimate
- Inference Space: hypothesis, continuous
![Page 4: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/4.jpg)
Outline: Introduction to Estimation Theory
- Range of problems studied
- Minimum mean cost problems
  - Minimum mean square error estimation
  - Minimum absolute error estimation
  - Maximum a posteriori estimation
  - Other
- Maximum likelihood for nonrandom parameters
- Fisher information and the Cramér-Rao bound
![Page 5: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/5.jpg)
Range of Estimation Theory Problems Studied

| Parameters | Cost Function | Solution |
|---|---|---|
| Random | Mean square error | Posterior mean |
| Random | Mean absolute error | Median |
| Random | Likelihood function: likelihood equation | Maximum a posteriori (MAP) equation |
| Random | Other mean cost | Generalized mean |
| Nonrandom | Likelihood function: likelihood equation | Maximum likelihood |
| Nonrandom | Other cost | |
![Page 6: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/6.jpg)
Random Parameter Estimation
- Prior on the parameters: $p_s(S)$
- Conditional pdf of the data given the parameters: $p_{r|s}(R|S)$
- Bayes' rule gives the posterior pdf
- The cost function $C[s, \hat{s}(r)]$ is given
- Select the estimator that minimizes the mean cost:

$$\hat{s}^{*} = \arg\min_{\hat{s}} E\{C[s,\hat{s}(r)]\}$$

$$E\{C[s,\hat{s}(r)]\} = \iint C[S,\hat{s}(R)]\, p_{r|s}(R|S)\, p_s(S)\, dS\, dR = E\big\{E\{C[s,\hat{s}(r)]\mid r\}\big\}$$

- The estimator is a function; for each data point the estimator is single-valued, so it suffices to minimize the conditional mean cost:

$$E\{C[s,\hat{s}(r)]\mid r = R\} = \int C[S,\hat{s}(R)]\, p_{s|r}(S|R)\, dS$$

$$\hat{s}(R) = \arg\min_{\hat{S}} \int C[S,\hat{S}]\, p_{s|r}(S|R)\, dS$$

- This is a generalized notion of mean
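The recipe above can be checked numerically on a grid. This is a minimal sketch under an assumed scalar conjugate model (illustrative, not from the slides): prior $s \sim \mathcal{N}(0,1)$, one observation $r = s + w$ with $w \sim \mathcal{N}(0, 0.5^2)$. Under squared-error cost, the candidate that minimizes the conditional mean cost coincides with the posterior mean.

```python
import numpy as np

# Minimum mean cost on a grid (scalar case). Assumed illustrative model:
# s ~ N(0, 1), r = s + w with w ~ N(0, 0.5^2), one observed value R.
S = np.linspace(-5.0, 5.0, 2001)             # parameter grid
prior = np.exp(-S**2 / 2.0)                  # p_s(S), unnormalized
R = 0.8                                      # observed data point
like = np.exp(-(R - S)**2 / (2.0 * 0.25))    # p_{r|s}(R|S), unnormalized
post = prior * like
post /= post.sum()                           # Bayes' rule; grid pmf approximation

# Conditional mean cost of every candidate estimate under squared-error cost
mean_cost = ((S[:, None] - S[None, :])**2 * post[None, :]).sum(axis=1)
s_star = S[int(np.argmin(mean_cost))]        # arg min of the conditional mean cost
post_mean = float((S * post).sum())          # posterior mean E[s | r = R]
# For this conjugate model E[s|R] = R * 1/(1 + 0.25) = 0.64, and s_star matches it.
```

The brute-force minimization over the grid is exactly the last equation on the slide, specialized to $C[S,\hat{S}] = (S-\hat{S})^2$.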
![Page 7: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/7.jpg)
Range of Estimation Theory Problems Studied (same table as Page 5)
![Page 8: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/8.jpg)
Range of Estimation Theory Problems Studied: mean square error (table as on Page 5, with the MMSE entry worked out)

$$C[\mathbf{S},\hat{\mathbf{s}}] = \|\mathbf{S}-\hat{\mathbf{s}}\|^2$$

$$\hat{\mathbf{s}}_{MMSE}(\mathbf{R}) = \arg\min_{\hat{\mathbf{S}}} \int \|\mathbf{S}-\hat{\mathbf{S}}\|^2\, p_{s|r}(\mathbf{S}|\mathbf{R})\, d\mathbf{S}$$

Setting the gradient with respect to the estimate to zero:

$$\nabla_{\hat{\mathbf{s}}}\, E\big\{\|\mathbf{s}-\hat{\mathbf{s}}\|^2 \mid \mathbf{r}\big\} = -2\,E\{(\mathbf{s}-\hat{\mathbf{s}})\mid \mathbf{r}\} = -2\big(E[\mathbf{s}\mid\mathbf{r}]-\hat{\mathbf{s}}\big) = 0$$

$$\hat{\mathbf{s}}_{MMSE}(\mathbf{R}) = E[\mathbf{s}\mid\mathbf{r}=\mathbf{R}]$$
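A Monte Carlo sanity check of the boxed result, under an assumed conjugate Gaussian model (illustrative; the prior and noise variances are made up): the posterior-mean estimator has the smallest mean square error among the candidates tried, and its MSE matches the closed-form posterior variance.

```python
import numpy as np

# Monte Carlo check that the posterior mean minimizes mean square error.
# Assumed conjugate Gaussian model (illustrative, not from the slides):
# s ~ N(0, sig_s^2), r = s + w, w ~ N(0, sig_n^2).
rng = np.random.default_rng(0)
sig_s, sig_n, M = 1.0, 0.5, 200_000
s = rng.normal(0.0, sig_s, M)
r = s + rng.normal(0.0, sig_n, M)

# Closed-form posterior mean for this model: E[s|r] = r * sig_s^2/(sig_s^2 + sig_n^2)
shrink = sig_s**2 / (sig_s**2 + sig_n**2)
mse_mmse = np.mean((s - shrink * r)**2)   # posterior-mean estimator
mse_raw = np.mean((s - r)**2)             # use r itself as the estimate
mse_zero = np.mean(s**2)                  # constant zero estimator
# Theory: minimum MSE = sig_s^2 * sig_n^2 / (sig_s^2 + sig_n^2) = 0.2 here.
```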
![Page 9: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/9.jpg)
Range of Estimation Theory Problems Studied: mean absolute error (table as on Page 5, with the MAE entry worked out)

$$C[S,\hat{s}] = |S-\hat{s}|$$

Setting the derivative of the conditional mean cost to zero puts equal posterior probability on either side of the estimate:

$$\int_{-\infty}^{\hat{s}} p_{s|r}(S|R)\, dS = \int_{\hat{s}}^{\infty} p_{s|r}(S|R)\, dS$$

$$\hat{s}_{MAE}(R) = \text{median of the posterior}$$
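The median result can be verified on a grid. A short sketch with an assumed skewed posterior (a two-component Gaussian mixture, chosen for illustration so that mean and median differ): the candidate minimizing the mean absolute error lands on the posterior median.

```python
import numpy as np

# Grid check that the posterior median minimizes mean absolute error, using an
# assumed skewed posterior (two-component Gaussian mixture; illustrative).
S = np.linspace(-6.0, 10.0, 2001)
post = 0.7 * np.exp(-S**2 / 2.0) + 0.3 * np.exp(-(S - 4.0)**2 / 2.0)
post /= post.sum()                            # treat grid values as a pmf

# Mean absolute error of each candidate estimate
mae = np.abs(S[:, None] - S[None, :]) @ post  # entry i: E|s - S_i|
s_mae = S[int(np.argmin(mae))]

# Posterior median from the cumulative distribution
cdf = np.cumsum(post)
s_med = S[int(np.searchsorted(cdf, 0.5))]
```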
![Page 10: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/10.jpg)
Range of Estimation Theory Problems Studied: maximum a posteriori (table as on Page 5, with the MAP entry worked out)

$$\hat{s}_{MAP}(R) = \arg\max_{S} \ln p_{s|r}(S|R) = \arg\max_{S}\big[\ln p_{r|s}(R|S) + \ln p_s(S) - \ln p_r(R)\big]$$

The last term does not depend on $S$, so the MAP equation is

$$\left.\frac{\partial \ln p_{r|s}(R|S)}{\partial S} + \frac{\partial \ln p_s(S)}{\partial S}\right|_{S=\hat{s}_{MAP}} = 0$$
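A grid-search MAP sketch under an assumed Gaussian-Gaussian model (illustrative values): because the posterior is Gaussian and symmetric here, the MAP estimate should equal the closed-form posterior mean, which makes the grid result easy to check.

```python
import numpy as np

# MAP on a grid: maximize ln p(R|S) + ln p_s(S). Assumed illustrative model:
# prior s ~ N(0, 1); N i.i.d. observations r_i = s + w_i, w_i ~ N(0, sig_n^2).
rng = np.random.default_rng(1)
s_true, sig_n, N = 1.5, 1.0, 25
R = s_true + rng.normal(0.0, sig_n, N)

S = np.linspace(-4.0, 6.0, 5001)
log_like = -((R[None, :] - S[:, None])**2).sum(axis=1) / (2.0 * sig_n**2)
log_prior = -S**2 / 2.0                       # log of N(0,1) prior, up to a constant
s_map = S[int(np.argmax(log_like + log_prior))]

# For this Gaussian-Gaussian model the posterior is Gaussian, so the MAP
# estimate equals the posterior mean:
closed_form = (R.sum() / sig_n**2) / (N / sig_n**2 + 1.0)
```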
![Page 11: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/11.jpg)
Comments on Random Parameter Estimation
- If the posterior is symmetric around its mean, then the posterior mean (MMSE estimate) equals the posterior median (MAE estimate).
- If the posterior mean is also the maximum, then the MAP estimate equals the MMSE estimate.
- If the cost function is symmetric in the error and the posterior is symmetric around the mean, then the minimum cost estimate equals the MMSE estimate.
![Page 12: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/12.jpg)
Other Cost Functions
- Parameters may take many forms:
  - Amplitude, frequency, phase
  - Intensity of a Poisson process (concentration of a radioactive substance)
  - Variance of noise in an amplifier or circuit
  - Direction: SO(3); distance and direction: SE(3)
  - Subspace in signal space
  - Deformation or warping: image or volume warping
- A distance or other discrepancy must be defined on the parameter space: nonnegative, zero at the truth, monotonic in some sense
- Example: map the parameter into a matrix and use a matrix distance (or squared distance, such as the sum of squared errors) to induce a discrepancy on the parameter space
![Page 13: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/13.jpg)
Today: Nonrandom Parameters
- Maximum likelihood for nonrandom parameters
- Fisher information and the Cramér-Rao bound
![Page 14: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/14.jpg)
Nonrandom Parameter Estimation
- There is no prior on the parameters.
- Concentrate on the maximum likelihood rule: find the parameter that maximizes the likelihood function, or equivalently the log-likelihood function.
- This is the nonrandom-parameter version of MAP estimation.
- Performance?

$$\hat{s}_{ML}(R) = \arg\max_{S} p_{r|s}(R|S) = \arg\max_{S} \ln p_{r|s}(R|S)$$

$$\left.\frac{\partial \ln p_{r|s}(R|S)}{\partial S}\right|_{S=\hat{s}_{ML}} = 0$$
![Page 15: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/15.jpg)
Nonrandom Parameter Estimation (continued): single-variable and multiple-variable likelihood equations

$$\left.\frac{\partial \ln p_{r|s}(R|S)}{\partial S}\right|_{S=\hat{s}_{ML}} = 0, \qquad \left.\nabla_{\mathbf{S}}\, \ln p_{\mathbf{r}|\mathbf{s}}(\mathbf{R}|\mathbf{S})\right|_{\mathbf{S}=\hat{\mathbf{s}}_{ML}} = \mathbf{0}$$
![Page 16: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/16.jpg)
Maximum Likelihood Estimation
- Repeated measurements of a deterministic variable in Gaussian noise:

$$r_i = s + w_i,\quad i = 1,2,\ldots,N,\quad w_i \text{ i.i.d. } \mathcal{N}(0,\sigma_n^2),\quad s\in\mathbb{R}$$

$$p_{r|s}(\mathbf{R}|S) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{(R_i-S)^2}{2\sigma_n^2}\right)$$

$$\ln p_{r|s}(\mathbf{R}|S) = -\sum_{i=1}^{N} \frac{(R_i-S)^2}{2\sigma_n^2} + \text{constant}$$

- Solve the likelihood equation:

$$\left.\frac{\partial \ln p_{r|s}(\mathbf{R}|S)}{\partial S}\right|_{S=\hat{s}_{ML}} = \sum_{i=1}^{N} \frac{R_i-\hat{s}_{ML}}{\sigma_n^2} = 0 \quad\Rightarrow\quad \hat{s}_{ML}(\mathbf{R}) = \frac{1}{N}\sum_{i=1}^{N} R_i$$

- The ML estimate is the limit of the MMSE estimate as the SNR (the prior variance) goes to infinity:

$$\hat{s}_{MMSE}(\mathbf{R}) = \frac{1}{1+\frac{1}{SNR\cdot N}}\cdot\frac{1}{N}\sum_{i=1}^{N} R_i$$

- Performance? MSE = (mean error)² + (error variance)
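A small sketch of the limit claimed above, for the slide's model $r_i = s + w_i$ (the specific numbers are illustrative): the MMSE estimate shrinks the sample mean by $1/(1 + 1/(SNR \cdot N))$, and the shrinkage vanishes as the SNR (prior variance) grows, recovering the ML estimate.

```python
import numpy as np

# ML vs MMSE for r_i = s + w_i, w_i ~ N(0, sig_n^2), i = 1..N.
# The ML estimate is the sample mean; the MMSE estimate (Gaussian prior with
# variance SNR * sig_n^2) shrinks it and approaches ML as SNR -> infinity.
rng = np.random.default_rng(2)
s_true, sig_n, N = 2.0, 1.0, 10
R = s_true + rng.normal(0.0, sig_n, N)

s_ml = R.mean()                               # maximum likelihood estimate

def s_mmse(snr):
    """MMSE estimate: sample mean shrunk by 1 / (1 + 1/(snr*N))."""
    return R.mean() / (1.0 + 1.0 / (snr * N))

gap_low = abs(s_mmse(0.1) - s_ml)             # low SNR: strong shrinkage toward 0
gap_high = abs(s_mmse(1e6) - s_ml)            # high SNR: essentially the ML estimate
```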
![Page 17: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/17.jpg)
Performance: Deterministic Parameters
- The estimate is random. The bias equals the mean of the estimate minus the truth:

$$E[\hat{\mathbf{s}}(\mathbf{r})] = \mathbf{S} + \mathbf{B}(\mathbf{S})$$

- Variance, or covariance matrix, of the estimate:

$$\mathrm{cov}(\hat{\mathbf{s}}(\mathbf{r})) = E\Big[\big(\hat{\mathbf{s}}(\mathbf{r})-\mathbf{S}-\mathbf{B}(\mathbf{S})\big)\big(\hat{\mathbf{s}}(\mathbf{r})-\mathbf{S}-\mathbf{B}(\mathbf{S})\big)^T\Big]$$

- For the example ($r_i = s + w_i$, $w_i$ i.i.d. $\mathcal{N}(0,\sigma_n^2)$, $\hat{s}_{ML} = \frac{1}{N}\sum_i R_i$), the estimate is unbiased and the variance is easily computed:

$$E[\hat{s}_{ML}] = \frac{1}{N}\sum_{i=1}^{N} E[r_i] = s$$

$$E\big[(\hat{s}_{ML}-s)^2\big] = E\left[\left(\frac{1}{N}\sum_{i=1}^{N} w_i\right)^2\right] = \frac{\sigma_n^2}{N}$$

- In many cases, computing the bias and variance may be hard.
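The two claims for the example, zero bias and variance $\sigma_n^2/N$, are easy to confirm by simulation. A minimal Monte Carlo sketch (the trial counts and parameter values are illustrative):

```python
import numpy as np

# Monte Carlo check: the sample-mean ML estimate of the slide's example is
# unbiased and its variance is sig_n^2 / N.
rng = np.random.default_rng(3)
s_true, sig_n, N, trials = 1.0, 2.0, 16, 50_000
R = s_true + rng.normal(0.0, sig_n, (trials, N))  # each row: one experiment
est = R.mean(axis=1)                              # ML estimate per experiment

bias = est.mean() - s_true                        # should be near 0
var = est.var()                                   # should be near sig_n^2 / N
predicted_var = sig_n**2 / N                      # = 0.25 here
```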
![Page 18: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/18.jpg)
Fisher Information and the Cramér-Rao Bound
- Actual performance in terms of variance may be difficult to compute, so bounds on performance are sought.
- The Cramér-Rao bound is a lower bound on the variance of any unbiased estimator. It depends only on the probability distribution for the data, not on any particular estimator.
- Later, we consider algorithms to compute estimates. It is important to note that the Cramér-Rao bound (and related bounds) are independent of the algorithm.
- If the lower bound is achievable, then any estimator that achieves that lower bound is called efficient.
- There is a version of the Cramér-Rao bound for biased estimators as well, but it is not as useful because the bias is not known (otherwise it could be subtracted).
- Performance bounds such as the Cramér-Rao bound may be used for system design and analysis by evaluating how the bounds depend on system parameters.
![Page 19: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/19.jpg)
Fisher Information and the Cramér-Rao Bound

Theorem: Let $\hat{s}(\mathbf{r})$ be any unbiased estimate of $s$. Assume that $\frac{\partial}{\partial s} p_{r|s}(\mathbf{R}|s)$ and $\frac{\partial^2}{\partial s^2} p_{r|s}(\mathbf{R}|s)$ exist and are absolutely integrable. Then

$$\mathrm{var}(\hat{s}(\mathbf{r})) \ge \frac{1}{E\left[\left(\frac{\partial \ln p_{r|s}(\mathbf{r}|s)}{\partial s}\right)^{2}\right]}$$

or, equivalently,

$$\mathrm{var}(\hat{s}(\mathbf{r})) \ge \frac{-1}{E\left[\frac{\partial^{2} \ln p_{r|s}(\mathbf{r}|s)}{\partial s^{2}}\right]}$$
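The "or, equivalently" in the theorem says the two Fisher-information forms agree. A quick numerical sketch for a single sample $r \sim \mathcal{N}(s, \sigma^2)$ (an assumed illustrative case): here $\ln p = -(r-s)^2/(2\sigma^2) + \text{const}$, the score is $(r-s)/\sigma^2$, and the second derivative is the constant $-1/\sigma^2$, so both forms give $1/\sigma^2$ and the bound is $\sigma^2$.

```python
import numpy as np

# Numerical check of the two equivalent Fisher-information forms,
# E[(d ln p/ds)^2] = -E[d^2 ln p/ds^2], for one sample r ~ N(s, sig^2).
rng = np.random.default_rng(4)
s, sig, M = 0.7, 1.3, 400_000
r = rng.normal(s, sig, M)

score = (r - s) / sig**2              # d ln p / ds evaluated at the true s
fi_score = np.mean(score**2)          # expectation of the squared score
fi_curv = 1.0 / sig**2                # -E[d^2 ln p / ds^2], exact here
crb = 1.0 / fi_curv                   # variance bound from one sample = sig^2
```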
![Page 20: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/20.jpg)
Fisher Information and the Cramér-Rao Bound: CRB = 1/FI

Proof: Unbiasedness means

$$E[\hat{s}(\mathbf{r})-s] = \int \big(\hat{s}(\mathbf{R})-s\big)\, p_{r|s}(\mathbf{R}|s)\, d\mathbf{R} = 0$$

Differentiate with respect to $s$:

$$\int \big(\hat{s}(\mathbf{R})-s\big)\,\frac{\partial p_{r|s}(\mathbf{R}|s)}{\partial s}\, d\mathbf{R} - \int p_{r|s}(\mathbf{R}|s)\, d\mathbf{R} = 0$$

Since the second integral equals 1, and using $\frac{\partial p}{\partial s} = p\,\frac{\partial \ln p}{\partial s}$,

$$\int \big(\hat{s}(\mathbf{R})-s\big)\,\frac{\partial \ln p_{r|s}(\mathbf{R}|s)}{\partial s}\, p_{r|s}(\mathbf{R}|s)\, d\mathbf{R} = 1$$

Split $p = \sqrt{p}\,\sqrt{p}$ and apply the Schwarz inequality:

$$\left(\int \big(\hat{s}(\mathbf{R})-s\big)^{2} p_{r|s}(\mathbf{R}|s)\, d\mathbf{R}\right)\left(\int \left(\frac{\partial \ln p_{r|s}(\mathbf{R}|s)}{\partial s}\right)^{2} p_{r|s}(\mathbf{R}|s)\, d\mathbf{R}\right) \ge 1$$

The first factor is $\mathrm{var}(\hat{s}(\mathbf{r}))$ and the second is the Fisher information, which gives the bound.
![Page 21: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/21.jpg)
Comments
- Equality is achieved in the CRB if the Schwarz inequality holds with equality, i.e., if

$$\frac{\partial \ln p_{r|s}(\mathbf{R}|s)}{\partial s} = k(s)\big(\hat{s}(\mathbf{R})-s\big)$$

- If an estimator exists that achieves equality (is efficient), then the ML estimator is efficient: evaluating the condition at $s = \hat{s}_{ML}(\mathbf{R})$, where the derivative is zero,

$$0 = \left.\frac{\partial \ln p_{r|s}(\mathbf{R}|s)}{\partial s}\right|_{s=\hat{s}_{ML}} = k(\hat{s}_{ML})\big(\hat{s}(\mathbf{R})-\hat{s}_{ML}(\mathbf{R})\big)$$

so $\hat{s}(\mathbf{R}) = \hat{s}_{ML}(\mathbf{R})$ provided $k(\hat{s}_{ML}(\mathbf{R})) \ne 0$.

- If no efficient estimator exists, the variance may be arbitrarily larger than the CRB.
- The second-derivative form of the Fisher information is easily found.
- Biased estimator:

$$\mathrm{var}(\hat{s}(\mathbf{r})) \ge \frac{\left(1+\frac{dB(s)}{ds}\right)^{2}}{E\left[\left(\frac{\partial \ln p_{r|s}(\mathbf{r}|s)}{\partial s}\right)^{2}\right]}$$

- For biased estimators the bound changes; the variance of a biased estimator may be lower than that of an unbiased estimator. Consider the zero estimator.
![Page 22: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/22.jpg)
Maximum Likelihood Estimation
- i.i.d. measurements of a function of a deterministic variable in Gaussian noise:

$$r_i = g_i(s) + w_i,\quad i = 1,2,\ldots,N,\quad w_i \text{ i.i.d. } \mathcal{N}(0,\sigma_n^2),\quad s\in\mathbb{R}$$

$$p_{r|s}(\mathbf{R}|S) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{(R_i-g_i(S))^2}{2\sigma_n^2}\right)$$

- Solve the likelihood equation using some preferred solution technique:

$$\left.\frac{\partial \ln p_{r|s}(\mathbf{R}|S)}{\partial S}\right|_{S=\hat{s}_{ML}} = \left.\sum_{i=1}^{N} \frac{R_i-g_i(S)}{\sigma_n^2}\,\frac{dg_i(S)}{dS}\right|_{S=\hat{s}_{ML}} = 0$$

- The Fisher information is easily computed:

$$J = E\left[\left(\frac{\partial \ln p_{r|s}(\mathbf{r}|S)}{\partial S}\right)^{2}\right] = \frac{1}{\sigma_n^2}\sum_{i=1}^{N}\left(\frac{dg_i(S)}{dS}\right)^{2}$$

- Note the dependence of performance on the true value of the parameter.
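The Fisher information formula above can be tested against a Monte Carlo average of the squared score. This sketch uses an assumed nonlinear model $g_i(s) = \sin(s\,i)$ (chosen for illustration, not one of the slide's examples):

```python
import numpy as np

# Check J = (1/sig^2) * sum_i (dg_i/dS)^2 against the Monte Carlo mean of the
# squared score, for an assumed nonlinear model g_i(s) = sin(s * i).
rng = np.random.default_rng(5)
s, sig, N, M = 0.4, 0.8, 6, 300_000
i = np.arange(1, N + 1)
g = np.sin(s * i)                     # g_i(s) at the true parameter
dg = i * np.cos(s * i)                # dg_i/dS at the true parameter

J_formula = (dg**2).sum() / sig**2    # closed-form Fisher information

w = rng.normal(0.0, sig, (M, N))      # M independent noise realizations
R = g + w
score = ((R - g) * dg).sum(axis=1) / sig**2   # d ln p/dS at the true s, per trial
J_mc = np.mean(score**2)              # Monte Carlo Fisher information
```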
![Page 23: ESE 524 Detection and Estimation Theory](https://reader033.vdocuments.net/reader033/viewer/2022042107/6256d67b3496cd37cc224067/html5/thumbnails/23.jpg)
Maximum Likelihood Estimation: Example Computations

$$J = E\left[\left(\frac{\partial \ln p_{r|s}(\mathbf{r}|S)}{\partial S}\right)^{2}\right] = \frac{1}{\sigma_n^2}\sum_{i=1}^{N}\left(\frac{dg_i(S)}{dS}\right)^{2}$$

- Amplitude: the Fisher information is the signal energy divided by the noise power:

$$g_i(s) = s,\quad \frac{dg_i(S)}{dS} = 1,\quad J = \frac{N}{\sigma_n^2}$$

- Frequency: the signal energy is proportional to the square of the number of cycles:

$$g_i(s) = \cos\!\big(2\pi s (i-1)/M\big),\quad \frac{dg_i(S)}{dS} = -\frac{2\pi (i-1)}{M}\sin\!\big(2\pi S(i-1)/M\big)$$

$$J = \frac{4\pi^2}{\sigma_n^2 M^2}\sum_{i=1}^{N} (i-1)^{2} \sin^{2}\!\big(2\pi S(i-1)/M\big) \approx \frac{2\pi^2}{\sigma_n^2 M^2}\sum_{i=1}^{N}(i-1)^{2}$$

- Exponent (positive vs. negative exponent):

$$g_i(s) = e^{s(i-1)},\quad \frac{dg_i(S)}{dS} = (i-1)\,e^{S(i-1)},\quad J = \frac{1}{\sigma_n^2}\sum_{i=1}^{N}(i-1)^{2} e^{2S(i-1)}$$

The sums follow from the geometric series and its derivatives with respect to $\alpha$:

$$\sum_{i=0}^{N-1} e^{\alpha i} = \frac{1-e^{\alpha N}}{1-e^{\alpha}},\qquad \sum_{i=0}^{N-1} i\, e^{\alpha i} = \frac{\partial}{\partial\alpha}\,\frac{1-e^{\alpha N}}{1-e^{\alpha}},\qquad \sum_{i=0}^{N-1} i^{2} e^{\alpha i} = \frac{\partial^{2}}{\partial\alpha^{2}}\,\frac{1-e^{\alpha N}}{1-e^{\alpha}}$$