
Statistics and Econometrics (ECO00037)

Lecture 5: Point and Interval Estimation

Lecturer: Takashi Yamagata (Room A/EC/018)
E-mail: [email protected]

Office Hour: Monday 9.30-11.30

Reading:

Topic 8: Newbold Ch. 8.1; Freund Ch. 10.2, 10.3 (except Cramér-Rao inequality), 10.4

Topic 9: Newbold Ch. 8.2, 8.3, 8.4, 8.6, 8.7; Freund Ch. 11.2, 11.3, 11.4, 11.5

Autumn 2014


8. Point Estimation

8.1 Parameters, Estimators and Estimates

- Suppose that $X_1, X_2, \ldots, X_n$ are independently and identically distributed, with common pdf $f(X; \theta)$, where the parameter $\theta$ is in the interior of the parameter space $\Theta$.

- The form of the pdf is known but the parameter(s), $\theta$, is unknown.

- An estimator for $\theta$, $\hat{\theta}_n$, is a statistic of the sample used to estimate $\theta$.
  e.g. $X \sim N(\mu, \sigma^2)$, $\theta = \mu$ and $\hat{\theta}_n = \bar{X}_n$

- An estimator is a random variable with a sampling distribution.

- An estimate is a realisation of an estimator (a fixed value).

- $\hat{\theta}_n$ is used for both estimator and estimate.

8.2 Properties of Estimators

Statistical properties of estimators can be used to decide which estimator is most appropriate in a given situation. We consider unbiasedness, minimum variance, consistency and relative efficiency.

8.2.1 Unbiasedness

- A statistic $\hat{\theta}_n$ is an unbiased estimator of the parameter $\theta$ if and only if $E(\hat{\theta}_n) = \theta$.

Example
$X_1, X_2, \ldots, X_n$ are iid random variables drawn from $X \sim N(\mu, \sigma^2)$. Then $\hat{\theta}_n = \bar{X}_n$ is an unbiased estimator for $\theta = \mu$.

Proof.
$E(\hat{\theta}_n) = E(\bar{X}_n) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} \cdot n\mu = \mu$
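The unbiasedness of the sample mean can be illustrated by simulation. The sketch below is not part of the original slides: the function name `mean_of_sample_means` and the settings ($\mu = 5$, $\sigma = 2$, $n = 10$, 20000 replications) are hypothetical choices. It averages the sample mean over many simulated samples; under the result above this average should be close to $\mu$.

```python
import random
from statistics import fmean


def mean_of_sample_means(mu, sigma, n, replications, seed=0):
    """Average the sample mean over many simulated normal samples of size n."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_means.append(fmean(sample))
    return fmean(sample_means)


# Hypothetical settings, chosen only for illustration.
average = mean_of_sample_means(mu=5.0, sigma=2.0, n=10, replications=20000)
print(round(average, 2))
```

The average differs from $\mu$ only by simulation noise, which shrinks as the number of replications grows.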

8.2.2 Relative Efficiency

- Let $\hat{\theta}_n$ and $\tilde{\theta}_n$ be unbiased estimators of a parameter $\theta$. If $\mathrm{var}(\hat{\theta}) < \mathrm{var}(\tilde{\theta})$, we say that $\hat{\theta}$ is relatively more efficient than $\tilde{\theta}$.

[Figure: Distributions of $\hat{b}$ and $\tilde{b}$; horizontal axis $b$ (from -5 to 5), vertical axis $f(b)$.]

- 8.2.3 Best Linear Unbiased Estimator (BLUE): If $\hat{\theta}_n$ is a linear unbiased estimator ($\hat{\theta}_n = \sum_{i=1}^{n} a_i X_i$) and no other linear unbiased estimator has a smaller variance, then $\hat{\theta}_n$ is BLUE.

- 8.2.4 Mean Square Error (MSE) of $\hat{\theta}_n$ can be shown as [Example]

  $E(\hat{\theta}_n - \theta)^2 = \mathrm{var}(\hat{\theta}_n) + [\mathrm{bias}(\hat{\theta}_n)]^2$

  where

  $\mathrm{var}(\hat{\theta}_n) = E[(\hat{\theta}_n - E\hat{\theta}_n)^2]$

  $\mathrm{bias}(\hat{\theta}_n) = E(\hat{\theta}_n - \theta)$

- The estimator which has the smallest value of MSE may be preferred to other estimators.
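The MSE decomposition can be checked numerically. The following sketch is an illustration only: the helper `mse_parts` and the simulation settings are hypothetical. It uses the variance estimator that divides by $n$ (a biased estimator of $\sigma^2$) and computes the MSE, variance and bias over simulated estimates. Over any fixed set of simulated values the identity MSE = variance + bias$^2$ holds exactly, up to floating-point rounding.

```python
import random
from statistics import fmean, pvariance


def mse_parts(estimates, theta):
    """Return (mse, variance, bias) of the estimates around the true value theta."""
    mean_estimate = fmean(estimates)
    mse = fmean([(e - theta) ** 2 for e in estimates])
    variance = pvariance(estimates, mu=mean_estimate)
    bias = mean_estimate - theta
    return mse, variance, bias


# Hypothetical setting: N(0, 4) data, samples of size 5, 5000 replications.
rng = random.Random(3)
true_variance = 4.0
estimates = []
for _ in range(5000):
    sample = [rng.gauss(0.0, 2.0) for _ in range(5)]
    estimates.append(pvariance(sample))  # divides by n, so it is biased downwards

mse, variance, bias = mse_parts(estimates, true_variance)
print(mse, variance + bias ** 2)  # the two numbers agree up to rounding
```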

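8.3 Large Sample Properties of Estimators

- Sometimes we cannot obtain finite sample distributions of estimators and instead look at asymptotic/large sample properties.

- 8.3.1 Chebyshev's Inequality: Let $Y$ be a random variable with $E(Y^2) < \infty$. Then

  $\Pr(|Y| \ge \varepsilon) \le \frac{E(Y^2)}{\varepsilon^2}$ for all $\varepsilon > 0$.

- Consider an unbiased estimator of $\theta$, $\hat{\theta}$, with $E(\hat{\theta}) = \theta$ and $\mathrm{var}(\hat{\theta}) = \sigma^2/n$. Then,

  $\Pr(|\hat{\theta} - \theta| \ge \varepsilon) \le \frac{\mathrm{var}(\hat{\theta})}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2}$ for all $\varepsilon > 0$.

- Now consider a biased estimator of $\theta$, $\tilde{\theta}$, with $E(\tilde{\theta}) = \theta + \sqrt{c/n}$ and $\mathrm{var}(\tilde{\theta}) = \sigma^2/n$. Then,

  $\Pr(|\tilde{\theta} - \theta| \ge \varepsilon) \le \frac{\mathrm{MSE}(\tilde{\theta})}{\varepsilon^2} = \frac{\sigma^2 + c}{n\varepsilon^2}$ for all $\varepsilon > 0$.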

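- 8.3.2 Probability Limits: In the above examples

  $\lim_{n \to \infty} \Pr(|\hat{\theta} - \theta| \ge \varepsilon) = 0$ and $\lim_{n \to \infty} \Pr(|\tilde{\theta} - \theta| \ge \varepsilon) = 0$

  for all $\varepsilon > 0$, since $\frac{\sigma^2}{n\varepsilon^2} \to 0$ and $\frac{\sigma^2 + c}{n\varepsilon^2} \to 0$ as $n \to \infty$. It is said that $\hat{\theta}$ has a probability limit equal to $\theta$, or

  $\mathrm{plim}_{n \to \infty} (\hat{\theta} - \theta) = 0$, or $\mathrm{plim}_{n \to \infty} \hat{\theta} = \theta$.

  (a similar discussion applies to $\tilde{\theta}$)

- 8.3.3 Consistency: $\hat{\theta}_n$ is a consistent estimator of the parameter $\theta$ if and only if

  $\mathrm{plim}_{n \to \infty} (\hat{\theta}_n - \theta) = 0$.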
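The Chebyshev bound and the probability limit can be illustrated with the sample mean of normal data, for which $\mathrm{var}(\bar{X}_n) = \sigma^2/n$. The sketch below is a hedged illustration: the function names and the settings ($\mu = 0$, $\sigma = 1$, $\varepsilon = 0.5$) are assumptions made here, not values from the slides. It compares the simulated frequency of $|\bar{X}_n - \mu| \ge \varepsilon$ with the bound $\sigma^2/(n\varepsilon^2)$ for increasing $n$.

```python
import random
from statistics import fmean


def tail_frequency(mu, sigma, n, eps, replications, seed=1):
    """Share of simulated samples with |sample mean - mu| >= eps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if abs(xbar - mu) >= eps:
            hits += 1
    return hits / replications


def chebyshev_bound(sigma, n, eps):
    """Upper bound sigma^2 / (n * eps^2) implied by Chebyshev's inequality."""
    return sigma ** 2 / (n * eps ** 2)


# Hypothetical settings: mu = 0, sigma = 1, eps = 0.5.
results = []
for n in (10, 40, 160):
    freq = tail_frequency(mu=0.0, sigma=1.0, n=n, eps=0.5, replications=5000)
    bound = chebyshev_bound(sigma=1.0, n=n, eps=0.5)
    results.append((n, freq, bound))
    print(n, freq, bound)
```

Both the bound and the simulated frequency fall towards zero as $n$ grows, which is the content of the probability limit statement. The bound is typically far from tight.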

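- 8.3.4 Slutsky's Theorem: If $\mathrm{plim}_{n \to \infty} \hat{\theta}_n = \theta$ and $g(\cdot)$ is a continuous function, then

  $\mathrm{plim}_{n \to \infty} g(\hat{\theta}_n) = g(\theta)$

  NB: $\mathrm{plim}_{n \to \infty} (1/\hat{\theta}_n) = 1/\theta$ ($\theta \ne 0$), but $E(1/\hat{\theta}_n) \ne 1/\theta$ in general!

- 8.3.5 Convergence in Distribution:

  $\sqrt{n}(\hat{\theta}_n - \theta) \xrightarrow{d} Y$ as $n \to \infty$, where $Y \sim N(0, V)$.

  $\sqrt{n}(\hat{\theta}_n - \theta)$ is asymptotically normally distributed.

- For the same example in 8.3.1,

  $\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} Z$ as $n \to \infty$, where $Z \sim N(0, 1)$.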
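The NB under Slutsky's Theorem says that expectations, unlike probability limits, do not pass through nonlinear functions. A minimal exact illustration, using a hypothetical two-point random variable that is not taken from the slides:

```python
from fractions import Fraction

# Hypothetical distribution: X equals 1 or 3, each with probability 1/2.
values = [Fraction(1), Fraction(3)]
probs = [Fraction(1, 2), Fraction(1, 2)]

expected_x = sum(p * x for p, x in zip(probs, values))        # E(X)
expected_inverse = sum(p / x for p, x in zip(probs, values))  # E(1/X)
inverse_of_expected = 1 / expected_x                          # 1/E(X)

print(expected_inverse, inverse_of_expected)  # prints "2/3 1/2"
```

Here $E(X) = 2$, so $1/E(X) = 1/2$, while $E(1/X) = 2/3$. The same gap can arise between $E(1/\hat{\theta}_n)$ and $1/\theta$ at any finite $n$.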

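9. Interval Estimation

9.1 Confidence Intervals: Small Sample, $\sigma$ known

A (point) estimator is a random variable, subject to sampling variability (i.e. different samples will yield different estimates). Interval estimators take account of this.

- 9.1.1 Mean:

  - Suppose $X_1, X_2, \ldots, X_n$ are iid normal random variables, $X_i \sim N(\mu, \sigma^2)$, and $\sigma$ is known.

  - Define $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $\mathrm{SE}(\bar{X}_n) = \sigma/\sqrt{n}$. We know $\bar{X}_n \sim N(\mu, \sigma^2/n)$, then

  - $\Pr\left[-z_{\alpha/2} \le \frac{\bar{X}_n - \mu}{\mathrm{SE}(\bar{X}_n)} \le z_{\alpha/2}\right] = 1 - \alpha$

  - $\Pr\left[-z_{\alpha/2}\mathrm{SE}(\bar{X}_n) \le \bar{X}_n - \mu \le z_{\alpha/2}\mathrm{SE}(\bar{X}_n)\right] = 1 - \alpha$

  - $\Pr\left[-\bar{X}_n - z_{\alpha/2}\mathrm{SE}(\bar{X}_n) \le -\mu \le -\bar{X}_n + z_{\alpha/2}\mathrm{SE}(\bar{X}_n)\right] = 1 - \alpha$

  - $\Pr\left[\bar{X}_n + z_{\alpha/2}\mathrm{SE}(\bar{X}_n) \ge \mu \ge \bar{X}_n - z_{\alpha/2}\mathrm{SE}(\bar{X}_n)\right] = 1 - \alpha$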

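  - Observe that LB and UB of $\Pr[\mathrm{LB} \le \mu \le \mathrm{UB}]$ are random variables, where $\mathrm{LB} = \bar{X}_n - z_{\alpha/2}\mathrm{SE}(\bar{X}_n)$ and $\mathrm{UB} = \bar{X}_n + z_{\alpha/2}\mathrm{SE}(\bar{X}_n)$.

  - Consider $\alpha = 0.10$, so that $\Pr[\mathrm{LB} \le \mu \le \mathrm{UB}] = 0.90$. This means the probability that the random confidence interval (LB, UB) contains $\mu$ is 0.90.

  - The realised interval (lb, ub) is called a 90% confidence interval for $\mu$.

  - Suppose we obtained ten sets of $n$ random samples:
    $x_{1,1}, x_{1,2}, \ldots, x_{1,n}$; $x_{2,1}, x_{2,2}, \ldots, x_{2,n}$; ...; $x_{10,1}, x_{10,2}, \ldots, x_{10,n}$

  - We can obtain ten different confidence intervals, corresponding to these ten sets of data, which may look like:

    [Figure: ten confidence intervals, one per set of data.]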
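The repeated-sampling interpretation can be checked by simulation. In the sketch below the function `coverage` and its settings ($\mu = 10$, $\sigma = 2$, $n = 16$, 10000 replications) are hypothetical. It draws many samples, builds the interval $\bar{X}_n \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ for each, and reports the share of intervals that contain $\mu$, which should be near $1 - \alpha$.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, alpha, replications, seed=2):
    """Share of simulated known-sigma intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    se = sigma / n ** 0.5
    covered = 0
    for _ in range(replications):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - z * se <= mu <= xbar + z * se:
            covered += 1
    return covered / replications


# Hypothetical settings, chosen only for illustration.
rate = coverage(mu=10.0, sigma=2.0, n=16, alpha=0.10, replications=10000)
print(rate)
```

Any single realised interval either contains $\mu$ or does not. The 0.90 refers to the long-run share of such intervals.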

Example
A random sample of 16 observations from a normal population with $\sigma^2 = 4$ had sample mean 10. Find a 90% confidence interval for the population mean, $\mu$.

- $\bar{x} = 10$, $\sigma = 2$, $n = 16$, $\alpha = 0.10$

- $\bar{X}_n \sim N(\mu, \sigma^2/n) = N(\mu, 0.25)$, so a 90% confidence interval is $\bar{X}_n \pm z_{0.05}\,\sigma/\sqrt{n}$

- $z_{0.05} = 1.645$ [See Standard Normal Table]

- Answer is $10 \pm 1.645 \times 0.5$, i.e. (9.1775, 10.8225)
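The arithmetic of this example can be reproduced as a short sketch. The function name `z_interval` is a hypothetical helper. It takes the critical value from Python's `statistics.NormalDist` rather than from a table, so the endpoints may differ from the table-based answer in the fourth decimal place.

```python
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width


# Values from the example: sample mean 10, sigma = 2, n = 16, 90% level.
lower, upper = z_interval(xbar=10.0, sigma=2.0, n=16, alpha=0.10)
print(round(lower, 3), round(upper, 3))
```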



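- 9.1.2 Difference between Two Means:

  - $n_1$ and $n_2$ samples are randomly drawn from $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ respectively, and form sample means, $\bar{X}_1$ and $\bar{X}_2$, accordingly. $\sigma_1^2$, $\sigma_2^2$ are known. We can show [Example]

    $\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$

    Therefore the 95% confidence interval for $\mu_1 - \mu_2$ is

    $\bar{X}_1 - \bar{X}_2 \pm 1.96\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$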
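A minimal sketch of this interval, assuming both variances are known. The function name and the input numbers are hypothetical and serve only to show the computation.

```python
from statistics import NormalDist


def diff_means_z_interval(xbar1, xbar2, var1, var2, n1, n2, alpha=0.05):
    """Interval for mu1 - mu2 with known variances var1 and var2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    se = (var1 / n1 + var2 / n2) ** 0.5
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


# Hypothetical inputs: means 12 and 10, variances 4 and 9, sizes 16 and 36.
lower, upper = diff_means_z_interval(12.0, 10.0, 4.0, 9.0, 16, 36)
print(round(lower, 3), round(upper, 3))
```

With these inputs the standard error is $\sqrt{4/16 + 9/36} = \sqrt{0.5}$, and the interval is centred on the observed difference of 2.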



- 9.1.3 Two Means, Matched Pairs:

  - $n$ matched pairs, $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, are randomly drawn from a bivariate normal distribution $(X, Y)$ with mean $(\mu_x, \mu_y)$ and $\mathrm{var}(X - Y)$. $\mathrm{var}(X - Y)$ is known. Form the sample mean $\bar{D} = n^{-1} \sum_{i=1}^{n} D_i$ where $D_i = X_i - Y_i$.

  - We can show

    $\bar{D} \sim N\left(\mu_x - \mu_y,\ \frac{\mathrm{var}(X - Y)}{n}\right)$

  - the $100(1 - \alpha)\%$ confidence interval for $\mu_x - \mu_y$ is

    $\bar{D} \pm z_{\alpha/2}\sqrt{\frac{\mathrm{var}(X - Y)}{n}}$


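9.2 Confidence Intervals: Small Sample, $\sigma$ unknown

- 9.2.1 Mean:

  - Suppose $X_1, X_2, \ldots, X_n$ are iid normal random variables, $X_i \sim N(\mu, \sigma^2)$, and $\sigma$ is unknown.

  - Define $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $\mathrm{SE}(\bar{X}_n) = S/\sqrt{n}$, where $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$ is the sample variance.

  - As we know

    $\frac{\bar{X}_n - \mu}{\mathrm{SE}(\bar{X}_n)} \sim t_{n-1}$

  - The $100(1 - \alpha)\%$ confidence interval for $\mu$ will be

    $\bar{X}_n \pm t_{\alpha/2,\,n-1}\,\mathrm{SE}(\bar{X}_n)$.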



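Example
A random sample of 16 observations from a normal population had sample mean 10 and standard deviation 2. Find a 90% confidence interval for the population mean, $\mu$.

- $\bar{x} = 10$, $s = 2$, $n = 16$, $\alpha = 0.10$

- $(\bar{X}_n - \mu)/(S/\sqrt{n}) \sim t_{n-1}$, so a 90% confidence interval is $\bar{X}_n \pm t_{0.05,\,n-1}\,S/\sqrt{n}$

- $t_{0.05,\,15} = 1.753$ [See t-distribution Table]

- Answer is $10 \pm 1.753 \times 0.5$, i.e. (9.1235, 10.8765) [compare to the previous example: the interval is wider because $\sigma$ is estimated]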
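The same computation as a sketch. Python's standard library has no t-distribution quantile function, so the critical value is passed in by hand from a t table; the function name `t_interval` and the constant name are hypothetical.

```python
def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is estimated by s.

    t_crit is the upper alpha/2 critical value of t with n - 1 degrees
    of freedom, supplied by the caller (for example from a t table).
    """
    half_width = t_crit * s / n ** 0.5
    return xbar - half_width, xbar + half_width


# Table value for t_{0.05, 15}, as used in the example above.
T_CRIT_90_DF15 = 1.753

lower, upper = t_interval(xbar=10.0, s=2.0, n=16, t_crit=T_CRIT_90_DF15)
print(round(lower, 4), round(upper, 4))
```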

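- 9.2.2 Difference between Two Means:

  - $n_1$ and $n_2$ samples are randomly drawn from $X_1 \sim N(\mu_1, \sigma^2)$ and $X_2 \sim N(\mu_2, \sigma^2)$ respectively, and form sample means, $\bar{X}_1$ and $\bar{X}_2$, accordingly. $\sigma^2$ is unknown.

  - NB: we assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$ (homoskedasticity); otherwise there is no straightforward solution.

  - As

    $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\mathrm{SE}(\bar{X}_1 - \bar{X}_2)} \sim t_{n_1 + n_2 - 2}$

  - where $\mathrm{SE}(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{S^2}{n_1} + \frac{S^2}{n_2}}$

  - with $S^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$ and $S_j^2 = \frac{\sum_{i=1}^{n_j} (X_{j,i} - \bar{X}_j)^2}{n_j - 1}$, $j = 1, 2$, where $X_{j,i}$ is the $i$-th observation in sample $j$

  - the $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

    $\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2,\,n_1 + n_2 - 2}\,\mathrm{SE}(\bar{X}_1 - \bar{X}_2)$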
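The pooled-variance formulas can be sketched as follows. The data and function names are hypothetical, and the t critical value is supplied by hand from a t table because the standard library does not provide t quantiles.

```python
from statistics import fmean, variance


def pooled_se(sample1, sample2):
    """Return (pooled variance S^2, SE of the difference in sample means)."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)  # divides by n1 - 1
    s2_sq = variance(sample2)  # divides by n2 - 1
    pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / ((n1 - 1) + (n2 - 1))
    se = (pooled / n1 + pooled / n2) ** 0.5
    return pooled, se


def pooled_t_interval(sample1, sample2, t_crit):
    """Interval for mu1 - mu2 under equal variances; t_crit has n1+n2-2 df."""
    _, se = pooled_se(sample1, sample2)
    diff = fmean(sample1) - fmean(sample2)
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical data, for illustration only.
group1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group2 = [2.0, 4.0, 6.0, 8.0]
pooled, se = pooled_se(group1, group2)
# 2.365 is the table value t_{0.025, 7} for a 95% interval.
lower, upper = pooled_t_interval(group1, group2, t_crit=2.365)
print(round(pooled, 4), round(se, 4), round(lower, 3), round(upper, 3))
```

With these data $S_1^2 = 2.5$ and $S_2^2 = 20/3$, so the pooled variance is $(4 \times 2.5 + 3 \times 20/3)/7 = 30/7$.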




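- 9.2.3 Two Means, Matched Pairs:

  - $n$ matched pairs, $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, are randomly drawn from a bivariate normal distribution, $(X, Y)$.

  - Suppose our interest is in the confidence interval for the mean of $D = X - Y$, $\mu_D$.

  - Form the sample mean $\bar{D} = n^{-1} \sum_{i=1}^{n} D_i$ where $D_i = X_i - Y_i$.

  - As

    $\frac{\bar{D} - \mu_D}{\mathrm{SE}(\bar{D})} \sim t_{n-1}$

  - where $\mathrm{SE}(\bar{D}) = \sqrt{S_D^2/n}$ with $S_D^2 = \frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}$

  - the $100(1 - \alpha)\%$ confidence interval for $\mu_D$ is

    $\bar{D} \pm t_{\alpha/2,\,n-1}\,\mathrm{SE}(\bar{D})$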
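A sketch of the matched-pairs interval, working directly with the differences $D_i = X_i - Y_i$. The data, the function name `paired_t_interval` and the table-based critical value are assumptions made for illustration.

```python
from statistics import fmean, variance


def paired_t_interval(xs, ys, t_crit):
    """Interval for mu_D from matched pairs; t_crit has n - 1 degrees of freedom."""
    if len(xs) != len(ys):
        raise ValueError("matched pairs need samples of equal length")
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    d_bar = fmean(diffs)
    se = (variance(diffs) / n) ** 0.5  # sqrt(S_D^2 / n)
    return d_bar - t_crit * se, d_bar + t_crit * se


# Hypothetical matched pairs, for illustration only.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [8.0, 11.0, 9.0, 8.0, 10.0]
# 2.776 is the table value t_{0.025, 4} for a 95% interval.
lower, upper = paired_t_interval(before, after, t_crit=2.776)
print(round(lower, 3), round(upper, 3))
```

The differences here are 2, 1, 0, 3, 3, so $\bar{D} = 1.8$ and $S_D^2 = 1.7$. Only the differences enter the computation, which is why the dependence between $X$ and $Y$ within a pair causes no difficulty.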


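9.3 Confidence Intervals: Large Sample, $\sigma$ unknown

- 9.3.1 Mean:

  - Suppose $X_1, X_2, \ldots, X_n$ are iid random variables with mean $\mu$ and finite variance $\sigma^2$. $n \ge 30$.

  - Define $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $\mathrm{SE}(\bar{X}_n) = S/\sqrt{n}$.

  - By Central Limit Theorem (CLT)

    $\frac{\bar{X}_n - \mu}{\mathrm{SE}(\bar{X}_n)} \xrightarrow{d} Z$ as $n \to \infty$, where $Z \sim N(0, 1)$.

  - The $100(1 - \alpha)\%$ confidence interval for $\mu$ will be

    $\bar{X}_n \pm z_{\alpha/2}\,\mathrm{SE}(\bar{X}_n)$.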



Example
A random sample of 36 observations had sample mean 10 and standard deviation 2. Find a 90% confidence interval for the population mean, $\mu$.

- $\bar{x} = 10$, $s = 2$, $n = 36$, $\alpha = 0.10$

- $(\bar{X}_n - \mu)/(S/\sqrt{n}) \sim N(0, 1)$, approximately, so a 90% confidence interval is $\bar{X}_n \pm z_{\alpha/2}\,S/\sqrt{n}$

- $z_{0.05} = 1.645$

- Answer is $10 \pm 1.645 \times 2/6$, i.e. (9.4517, 10.5483)



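- 9.3.2 Difference between Two Means:

  - $n_1$ and $n_2$ samples are randomly drawn from $X_1 \sim$ i.i.d.$(\mu_1, \sigma_1^2)$ and $X_2 \sim$ i.i.d.$(\mu_2, \sigma_2^2)$ respectively, and form sample means, $\bar{X}_1$ and $\bar{X}_2$, accordingly. $\min(n_1, n_2) \ge 30$.

  - By CLT

    $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\mathrm{SE}(\bar{X}_1 - \bar{X}_2)} \xrightarrow{d} Z$ as $\min(n_1, n_2) \to \infty$

  - where $\mathrm{SE}(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$

  - with $S_j^2 = \frac{\sum_{i=1}^{n_j} (X_{j,i} - \bar{X}_j)^2}{n_j - 1}$, $j = 1, 2$, where $X_{j,i}$ is the $i$-th observation in sample $j$

  - the $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

    $\bar{X}_1 - \bar{X}_2 \pm z_{\alpha/2}\,\mathrm{SE}(\bar{X}_1 - \bar{X}_2)$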




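- 9.3.3 Two Means, Matched Pairs:

  - $n \ge 30$ matched pairs, $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, are randomly drawn from a bivariate normal distribution, $(X, Y)$.

  - Suppose our interest is in the confidence interval for the mean of $D = X - Y$, $\mu_D$.

  - Form the sample mean $\bar{D} = n^{-1} \sum_{i=1}^{n} D_i$ where $D_i = X_i - Y_i$.

  - By CLT

    $\frac{\bar{D} - \mu_D}{\mathrm{SE}(\bar{D})} \xrightarrow{d} Z$ as $n \to \infty$

  - where $\mathrm{SE}(\bar{D}) = \sqrt{S_D^2/n}$ with $S_D^2 = \frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}$

  - the $100(1 - \alpha)\%$ confidence interval for $\mu_D = \mu_x - \mu_y$ is

    $\bar{D} \pm z_{\alpha/2}\,\mathrm{SE}(\bar{D})$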

- 9.3.4 Proportion of Successes:
  By Central Limit Theorem we found the binomial random variable $X$ is distributed as $N(n\pi, n\pi(1 - \pi))$ approximately.

  - Now define the sample proportion of successes as $P = X/n$. From the above result,

    $\frac{P - \pi}{\sqrt{\pi(1 - \pi)/n}} \sim N(0, 1)$ approximately

  - as $P$ is a consistent estimator of $\pi$, we argue that

    $\frac{P - \pi}{\sqrt{P(1 - P)/n}} \sim N(0, 1)$ approximately

  - It follows that the $100(1 - \alpha)\%$ confidence interval for $\pi$ is

    $P \pm z_{\alpha/2}\sqrt{\frac{P(1 - P)}{n}}$

- 9.3.5 Differences between Proportions of Successes:

  - Two independent populations, $X_1 \sim \mathrm{Binomial}(x_1; n_1, \pi_1)$ and $X_2 \sim \mathrm{Binomial}(x_2; n_2, \pi_2)$.

  - Define $P_1 = X_1/n_1$ and $P_2 = X_2/n_2$.

  - Analogous to 9.3.2, the $100(1 - \alpha)\%$ confidence interval for $\pi_1 - \pi_2$ is

    $P_1 - P_2 \pm z_{\alpha/2}\sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}$
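A sketch of the interval for a difference between two proportions. The counts and the function name are hypothetical, chosen so that the standard error is easy to verify by hand.

```python
from statistics import NormalDist


def diff_proportions_interval(x1, n1, x2, n2, alpha):
    """Large-sample interval for pi1 - pi2 from success counts x1 and x2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Hypothetical counts: 50 successes out of 100 and 40 successes out of 100.
lower, upper = diff_proportions_interval(50, 100, 40, 100, alpha=0.05)
print(round(lower, 4), round(upper, 4))
```

Here the standard error is $\sqrt{0.0025 + 0.0024} = 0.07$, so the 95% interval is roughly $0.1 \pm 1.96 \times 0.07$.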


Example (9.3.4 Proportion of Successes)
In a survey of a random sample of 100 purchasers of toilet rolls, 20 indicated cheapness as the major reason for brand selection. Find a 90% confidence interval for the population proportion.

- $p = 0.2$, $n = 100$, $\sqrt{p(1 - p)/n} = 0.04$, $\alpha = 0.10$

- $\frac{P - \pi}{\sqrt{P(1 - P)/n}} \sim N(0, 1)$, approximately, so a 90% confidence interval is $P \pm z_{\alpha/2}\sqrt{\frac{P(1 - P)}{n}}$

- $z_{0.05} = 1.645$

- Answer is $0.2 \pm 1.645 \times 0.04$, i.e. (0.1342, 0.2658)
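The survey example can be reproduced with a short sketch. The function name `proportion_interval` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a table, so the last decimal place may differ slightly from a hand calculation.

```python
from statistics import NormalDist


def proportion_interval(successes, n, alpha):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * (p * (1 - p) / n) ** 0.5
    return p - half_width, p + half_width


# Values from the example: 20 successes out of 100, 90% level.
lower, upper = proportion_interval(successes=20, n=100, alpha=0.10)
print(round(lower, 4), round(upper, 4))
```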

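Useful Sampling Distributions for Confidence Intervals and Hypothesis Testing

Estimating $E(X) = \mu$ by $\bar{X} = n^{-1} \sum_{i=1}^{n} X_i$
  - $X_i \sim$ i.i.d. $N(\mu, \sigma^2)$, $n$ small, $\sigma^2$ known: see 9.1.1
  - $X_i \sim$ i.i.d. $N(\mu, \sigma^2)$, $n$ small, $\sigma^2$ unknown: see 9.2.1
  - $X_i \sim$ i.i.d. $(\mu, \sigma^2)$, $n$ large: see 9.3.1

Estimating $E(D) = \mu_D = \mu_X - \mu_Y$ by $\bar{D} = n^{-1} \sum_{i=1}^{n} D_i$, where $D_i = X_i - Y_i$ with matched pair $X_i, Y_i$
  - $D_i \sim$ i.i.d. $N(\mu_D, \sigma_D^2)$, $n$ small, $\sigma_D^2$ known: see 9.1.3
  - $D_i \sim$ i.i.d. $N(\mu_D, \sigma_D^2)$, $n$ small, $\sigma_D^2$ unknown: see 9.2.3
  - $D_i \sim$ i.i.d. $(\mu_D, \sigma_D^2)$, $n$ large: see 9.3.3

Estimating $E(X_i - Y_i) = \mu_X - \mu_Y$ by $\bar{X} - \bar{Y}$, where $\bar{X} = n_X^{-1} \sum_{i=1}^{n_X} X_i$ and $\bar{Y} = n_Y^{-1} \sum_{i=1}^{n_Y} Y_i$, and $X_i$ and $Y_i$ are independently drawn
  - $X_i \sim$ i.i.d. $N(\mu_X, \sigma_X^2)$, $Y_i \sim$ i.i.d. $N(\mu_Y, \sigma_Y^2)$, $\min(n_X, n_Y)$ small, $\sigma_X^2$ and $\sigma_Y^2$ known: see 9.1.2
  - $X_i \sim$ i.i.d. $N(\mu_X, \sigma_X^2)$, $Y_i \sim$ i.i.d. $N(\mu_Y, \sigma_Y^2)$, $\min(n_X, n_Y)$ small, $\sigma_X^2$ and $\sigma_Y^2$ unknown, assume $\sigma_X^2 = \sigma_Y^2 = \sigma^2$: see 9.2.2
  - $X_i \sim$ i.i.d. $(\mu_X, \sigma_X^2)$, $Y_i \sim$ i.i.d. $(\mu_Y, \sigma_Y^2)$, $\min(n_X, n_Y)$ large: see 9.3.2

Estimating $\pi$ by $P = X/n$ where $X \sim \mathrm{Binomial}(x; n, \pi)$
  - $n$ small: N/A
  - $n$ small: N/A
  - $n$ large, or $n\pi$ and $n(1 - \pi) > 4$: see 9.3.4

Estimating $\pi_X - \pi_Y$ by $P_X = X/n_X$ and $P_Y = Y/n_Y$, where $X \sim \mathrm{Binomial}(x; n_X, \pi_X)$ and $Y \sim \mathrm{Binomial}(y; n_Y, \pi_Y)$, and $X$ and $Y$ are independent
  - $\min(n_X, n_Y)$ small: N/A
  - $\min(n_X, n_Y)$ small: N/A
  - $\min(n_X, n_Y)$ large: see 9.3.5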
