estimation theory basic concepts hamid r....

Post on 09-Mar-2021


Hamid R. Rabiee

Stochastic Processes

Estimation Theory

Basic concepts


Overview

Reading Assignment

Chapter 6 of C.B. book.

Further Resources

MIT OpenCourseWare


Outline

Basic Definitions

Sample, Parameter and Parametric distribution, Statistics

Sufficient Statistics

How to find an SS?

Minimal Sufficient Statistics

How to find an MSS?


Basic Definitions

Let $x_1, x_2, \dots, x_n$ be a random sample from $X$.

$x_i \sim f(x \mid \theta)$, and the $x_i$'s are independent.

$X = (x_1, x_2, \dots, x_n)$

$\theta$: a parameter that describes the distribution; for example, $\theta$ may be the mean of a particular distribution.

Statistic

[Figure: values $t_1, t_2, t_3$ of a statistic $T$.]

Any function of the random sample $X$ is a statistic:

$T: \chi \to \mathbb{R}$, where $\chi$ is the sample space, i.e. the set of all $X$.

$t = T(X)$

$\mathcal{T} = \{t : t = T(X) \text{ for some } X \in \chi\}$

• Data reduction

• Partitioning the sample space

$T$ partitions $\chi$ into sets $A_t$, $t \in \mathcal{T}$:

$A_t = \{X \in \chi \mid T(X) = t\}$

$T(X) = t \iff X \in A_t$

Example: $T(X) = x_1 + x_2 + \dots + x_n$

Other examples: mean, variance, min value, max value.
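To make the partitioning idea concrete, here is a minimal Python sketch. It assumes a toy sample space that is not in the slides: all binary samples of size 3, with the statistic T(X) = x1 + x2 + x3. The helper name `partition_by_statistic` is hypothetical.

```python
from collections import defaultdict
from itertools import product


def partition_by_statistic(sample_space, T):
    """Group every sample X by the value t = T(X); returns {t: A_t}."""
    cells = defaultdict(list)
    for X in sample_space:
        cells[T(X)].append(X)
    return dict(cells)


# Assumed toy sample space: all binary samples of size n = 3 (8 points).
sample_space = list(product([0, 1], repeat=3))

# T(X) = x1 + x2 + x3 reduces the 8 samples to 4 cells A_0, ..., A_3.
cells = partition_by_statistic(sample_space, sum)

for t, A_t in sorted(cells.items()):
    print(t, A_t)
```

Every sample lands in exactly one cell, so the sets A_t form a partition of the sample space, and reporting t instead of X is the data reduction.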

Sufficient Statistics

A sufficient statistic for a parameter $\theta$ is a statistic that captures all the information about $\theta$ contained in the sample.

Sufficiency Principle:

If $T(X)$ is a sufficient statistic for $\theta$, then any inference about $\theta$ should depend on the sample $X$ only through $T(X)$.

Sufficient Statistics (Cont’d)

Definition:

If $p(X \mid \theta)$ is the joint pdf or pmf of $X$ and $q(t \mid \theta)$ is the pdf or pmf of $T(X)$, then $T(X)$ is a sufficient statistic for $\theta$ if, for every $X \in \chi$, the ratio $\frac{p(X \mid \theta)}{q(T(X) \mid \theta)}$ is constant as a function of $\theta$.

Sufficient Statistics (Cont’d)

Example 1:

Let $x_1, \dots, x_n$ be i.i.d. Bernoulli($\theta$), $0 < \theta < 1$. Is $T(X) = x_1 + \dots + x_n$ a sufficient statistic?

Yes. But how?

$\frac{p(X \mid \theta)}{q(T(X) \mid \theta)} = \frac{1}{\binom{n}{\sum x_i}}$ is independent of $\theta$.
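The ratio in Example 1 can be checked numerically. The sketch below assumes n = 4 and three arbitrary values of θ, neither of which is in the slides, and uses exact fractions so that the comparison is not affected by rounding. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def joint_pmf(X, theta):
    """p(X | theta) for i.i.d. Bernoulli(theta) samples."""
    t = sum(X)
    return theta**t * (1 - theta) ** (len(X) - t)


def pmf_of_sum(t, n, theta):
    """q(t | theta): T(X) = sum(X) follows a Binomial(n, theta) distribution."""
    return comb(n, t) * theta**t * (1 - theta) ** (n - t)


n = 4  # assumed sample size
thetas = [Fraction(1, 5), Fraction(1, 2), Fraction(9, 10)]  # arbitrary test values

# For each sample X, collect the set of ratio values seen across all thetas.
ratios = {}
for X in product([0, 1], repeat=n):
    t = sum(X)
    ratios[X] = {joint_pmf(X, th) / pmf_of_sum(t, n, th) for th in thetas}

# Each set holds a single value, 1 / C(n, t), whatever theta is.
for X, values in ratios.items():
    print(X, values)
```

A finite grid of θ values does not prove sufficiency; it only illustrates that the θ-dependent factors cancel in the ratio.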

Sufficient Statistics (Cont’d)

Example 2:

Let $x_1, \dots, x_n$ be i.i.d. $N(\mu, \sigma^2)$, where $\sigma^2$ is known. Is $\bar{x} = (x_1 + \dots + x_n)/n$ a sufficient statistic for $\mu$?

Left as an exercise for you!

How to find an SS for 𝜃

Factorization Theorem:

Let $f(X \mid \theta)$ denote the joint pdf or pmf of a sample $X$. $T(X)$ is a sufficient statistic for $\theta$ iff there exist functions $g(t \mid \theta)$ and $h(X)$ such that:

$\forall X \in \chi: \quad f(X \mid \theta) = g(T(X) \mid \theta)\, h(X)$

So, to find $T(X)$, factorize $f(X \mid \theta)$ into two parts: $g(T(X) \mid \theta)$, which depends on $\theta$, and $h(X)$, which is independent of $\theta$.

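How to find an SS for 𝜃 (Cont’d)

Example 1 (continued):

Find an SS for a Bernoulli distribution.

$f(X \mid \theta) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{\sum x_i} (1 - \theta)^{n - \sum x_i} = g(\sum x_i \mid \theta)\, h(X)$

where $g(\sum x_i \mid \theta) = \theta^{\sum x_i} (1 - \theta)^{n - \sum x_i}$ and $h(X) = 1$.

So $T(X) = \sum x_i$ is an SS for $\theta$.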

How to find an SS for 𝜃 (Cont’d)

Example 2:

Find an SS for a discrete uniform distribution on $\{1, 2, \dots, \theta\}$. [Hint: use the indicator function.]

$T(X) = \max_i x_i$, $i = 1, 2, \dots, n$
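One way to reach this answer with the hint is sketched below. The indicator notation $\mathbf{1}\{\cdot\}$ and the particular choice of $g$ and $h$ are assumptions made for illustration; other equivalent factorizations exist.

```latex
\begin{align*}
f(X \mid \theta)
  &= \prod_{i=1}^{n} \frac{1}{\theta}\,
     \mathbf{1}\{x_i \in \{1, 2, \dots, \theta\}\} \\
  &= \underbrace{\theta^{-n}\,
     \mathbf{1}\{\max_i x_i \le \theta\}}_{g(T(X) \mid \theta)}
     \;\cdot\;
     \underbrace{\mathbf{1}\{x_i \in \{1, 2, \dots\}
     \text{ for all } i\}}_{h(X)}
\end{align*}
```

The first factor depends on the sample only through $\max_i x_i$, and the second factor does not involve $\theta$, so the Factorization Theorem gives $T(X) = \max_i x_i$.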

Sufficient Statistics (Cont’d)

Sometimes $\theta$ is a vector of parameters. In such cases, $T(X)$ is usually also vector valued.

Example: $x_1, \dots, x_n$ iid $N(\mu, \sigma^2)$, $\theta = (\mu, \sigma^2)$

Sufficient Statistics (Cont’d)

Exponential class of distributions:

Theorem: Let $x_1, \dots, x_n$ be iid from

$f(x \mid \theta) = h(x)\, c(\theta) \exp\left\{ \sum_{i=1}^{k} w_i(\theta)\, t_i(x) \right\}$

then

$T(X) = \left( \sum_{j=1}^{n} t_1(x_j), \sum_{j=1}^{n} t_2(x_j), \dots, \sum_{j=1}^{n} t_k(x_j) \right)$

is a sufficient statistic for $\theta$.
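As an illustration of the theorem, the normal distribution with $\theta = (\mu, \sigma^2)$ can be written in this form with $t_1(x) = x$ and $t_2(x) = x^2$, so that $T(X) = (\sum x_j, \sum x_j^2)$. The Python sketch below uses two made-up samples that share the same $T(X)$ and shows that their likelihoods agree for several arbitrary parameter values. The sample values and function names are assumptions for illustration only.

```python
import math


def normal_loglik(xs, mu, sigma2):
    """Log-likelihood of i.i.d. N(mu, sigma2) samples."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))


def T(xs):
    """T(X) = (sum_j t_1(x_j), sum_j t_2(x_j)) with t_1(x) = x, t_2(x) = x^2."""
    return (sum(xs), sum(x * x for x in xs))


# Two different (made-up) samples with the same sum and sum of squares.
a = [0.0, 3.0, 3.0]
b = [1.0, 1.0, 4.0]
print(T(a), T(b))

# Arbitrary (mu, sigma2) pairs: the two log-likelihoods coincide at each one.
params = [(0.0, 1.0), (2.0, 0.5), (-1.0, 4.0)]
for mu, sigma2 in params:
    print(mu, sigma2, normal_loglik(a, mu, sigma2), normal_loglik(b, mu, sigma2))
```

Since the likelihood depends on the data only through $T(X)$, any inference about $(\mu, \sigma^2)$ based on the likelihood is the same for both samples.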

Minimal Sufficient Statistics

There may be many sufficient statistics for a parameter $\theta$. For example, $T(X) = X$ is always an SS, since $f(X \mid \theta) = f(T(X) \mid \theta)\, h(X)$ with $h(X) = 1$.

Also, any one-to-one function of an SS is an SS.

Which SS is the best?

Minimal Sufficient Statistics (Cont’d)

Goal: data reduction while preserving information about $\theta$.

A sufficient statistic $T(X)$ is called a minimal sufficient statistic (MSS) if, for any other SS $T'(X)$, $T(X)$ is a function of $T'(X)$.

So MSS ≡ maximum data reduction: the MSS gives the coarsest partitioning of the sample space.

[Figure: two partitions of the sample space, labeled "MSS" and "SS but not MSS".]
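The "function of" relation in this definition can be checked by brute force on a small sample space. The sketch below assumes binary samples of size 3 and three sufficient statistics for the Bernoulli case: the sum, the pair (x1, sum), and the full sample. It only compares these three candidates and does not prove minimality over all sufficient statistics. All names are hypothetical.

```python
from itertools import product


def is_function_of(T, T_prime, sample_space):
    """True if T'(X) = T'(Y) always implies T(X) = T(Y) on the sample space."""
    seen = {}
    for X in sample_space:
        key = T_prime(X)
        if key in seen and seen[key] != T(X):
            return False
        seen[key] = T(X)
    return True


def n_cells(T, sample_space):
    """Number of sets in the partition induced by T."""
    return len({T(X) for X in sample_space})


def total(X):
    return sum(X)  # T(X) = x1 + x2 + x3


def first_and_total(X):
    return (X[0], sum(X))  # T'(X) = (x1, x1 + x2 + x3)


def identity(X):
    return X  # T''(X) = X


sample_space = list(product([0, 1], repeat=3))

for name, stat in [("total", total), ("first_and_total", first_and_total), ("identity", identity)]:
    print(name, n_cells(stat, sample_space), is_function_of(total, stat, sample_space))
```

The sum is a function of each of the other two statistics and induces the fewest cells, which matches the picture of the MSS as the coarsest partition.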

Minimal Sufficient Statistics (Cont’d)

Example 4:

$x_1, x_2, \dots, x_n \sim N(\mu, \sigma^2)$ are i.i.d. samples, and $\sigma^2$ is known.

By the Factorization Theorem, $\bar{X}$ is an SS. $(\bar{X}, s^2)$ is also an SS.

Clearly, $\bar{X}$ achieves higher data reduction and is thus better.

If $\sigma^2$ were unknown, then $\bar{X}$ would not be an SS, and $(\bar{X}, s^2)$ contains more information about $(\mu, \sigma^2)$.

How to find an MSS?

Theorem [Lehmann, Scheffé 1950]:

Let $f(X \mid \theta)$ be the pdf or pmf of a sample $X$. Suppose $T(X)$ exists such that, $\forall X, Y \in \chi$, the ratio $\frac{f(X \mid \theta)}{f(Y \mid \theta)}$ is constant as a function of $\theta$ iff $T(X) = T(Y)$. Then $T(X)$ is a minimal sufficient statistic.

If $T(X) = T(Y)$ implies that $\frac{f(X \mid \theta)}{f(Y \mid \theta)}$ is constant as a function of $\theta$, then $T(X)$ is an SS.
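As an illustration, the theorem can be applied to the i.i.d. Bernoulli($\theta$) sample from Example 1. This worked step is not in the slides; it is a sketch that reuses the joint pmf of that example for two samples $X = (x_1, \dots, x_n)$ and $Y = (y_1, \dots, y_n)$.

```latex
\[
\frac{f(X \mid \theta)}{f(Y \mid \theta)}
  = \frac{\theta^{\sum x_i} (1 - \theta)^{n - \sum x_i}}
         {\theta^{\sum y_i} (1 - \theta)^{n - \sum y_i}}
  = \left( \frac{\theta}{1 - \theta} \right)^{\sum x_i - \sum y_i}
\]
```

For $0 < \theta < 1$ this ratio is constant in $\theta$ iff $\sum x_i = \sum y_i$, so $T(X) = \sum x_i$ is a minimal sufficient statistic in that example.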

How to find an MSS? (Cont’d)

Example 5:

$x_1, x_2, \dots, x_n \sim U(\theta, \theta + 1)$

Find an MSS for $\theta$.

Does the dimension of the MSS equal the dimension of the parameter?

$T(X) = (\min_i x_i, \max_i x_i)$ is an MSS. It is two-dimensional, while $\theta$ is one-dimensional.

Is it unique? No: any one-to-one function of an MSS is also an MSS.
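A small numerical sketch of Example 5 follows. It assumes three made-up samples and a finite grid of θ values. It also uses a simplification: because the likelihood here only takes the values 0 and 1, the check compares likelihood functions directly instead of forming the ratio. All names and values are hypothetical.

```python
def uniform_likelihood(xs, theta):
    """f(X | theta) for i.i.d. U(theta, theta + 1): 1 if all x_i lie in (theta, theta + 1), else 0."""
    return 1.0 if theta < min(xs) and max(xs) < theta + 1 else 0.0


def T(xs):
    """Candidate MSS: T(X) = (min x_i, max x_i)."""
    return (min(xs), max(xs))


# Assumed grid of theta values from -1.00 to 2.00 in steps of 0.01.
grid = [k / 100 for k in range(-100, 201)]

a = [0.30, 0.55, 0.90]
b = [0.30, 0.80, 0.90]  # same (min, max) as a
c = [0.30, 0.55, 0.70]  # same min as a, different max

lik = {name: [uniform_likelihood(xs, th) for th in grid]
       for name, xs in [("a", a), ("b", b), ("c", c)]}

print(T(a) == T(b), lik["a"] == lik["b"])  # same statistic, same likelihood function
print(T(a) == T(c), lik["a"] == lik["c"])  # different statistic, different likelihood function
```

The likelihood is nonzero exactly when $\max_i x_i - 1 < \theta < \min_i x_i$, so both the minimum and the maximum are needed to describe it, even though θ is a single number.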
