Chap 7: Survey Sampling (Introduction, Simple Random Sampling, Stratified Random Sampling)


Upload: christopher-neagle

Post on 31-Mar-2015


Page 1: Chap 7: Survey Sampling Introduction Simple Random Sampling Stratified Random Sampling

Chap 7: Survey Sampling

• Introduction

• Simple Random Sampling

• Stratified Random Sampling


7.1: Introduction

For a small pop’n, a census study is used because data can be gathered on every member.

For a large pop’n, sample surveys are used to obtain information from a small (but carefully chosen) sample of the pop’n. The sample should reflect the characteristics of the pop’n from which it is drawn.

Sampling methods are classified as either probabilistic or non-probabilistic in nature.


Sampling methods:

Probability Sampling:

• Random Sampling

• Systematic Sampling

• Stratified Sampling

Non-Probability Sampling:

• Convenience Sampling

• Judgment Sampling

• Quota Sampling

• Snowball Sampling


The winner is: Probability Sampling.

In non-probability sampling, members are selected from the pop’n in some non-random manner and the sampling error (=degree to which a sample might differ from the pop’n) is unknown.

In probability sampling, each member of the pop’n has a specified probability of being included in the sample. Its advantage is that sampling error can be calculated.


7.2: Pop’n parameters

Definition: Parameters are those numerical characteristics of the pop’n that we will estimate from a sample.

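Notations:

• Pop’n: $N$ elements, labelled $1, 2, \ldots, N$

• Sample: $s$ is a subset of $\{1, 2, \ldots, N\}$

• Variable of interest $X$: its values in the pop’n are $x_1, x_2, \ldots, x_N$

• Example: $x_i$ = age, or weight, or presence/absence coded as $1$/$0$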


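Pop’n mean, total, variance:

Pop’n mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

Pop’n total: $\tau = \sum_{i=1}^{N} x_i = N\mu$

Pop’n variance (its square root is the StdDev): $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 = \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \mu^2$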
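The three pop’n parameters can be computed directly once every $x_i$ is known. The sketch below is illustrative only: the function name `population_summary` and the five-value population are assumptions, not part of the slides. Note the divisor $N$ (not $N-1$) in the variance.

```python
def population_summary(x):
    """Return (mean, total, variance) of a finite population x_1..x_N."""
    N = len(x)
    tau = sum(x)                                   # population total
    mu = tau / N                                   # population mean
    sigma2 = sum((xi - mu) ** 2 for xi in x) / N   # divisor N, not N - 1
    return mu, tau, sigma2


# Hypothetical population of N = 5 values (illustrative only).
x = [2, 4, 4, 6, 9]
mu, tau, sigma2 = population_summary(x)

# Shortcut form of the variance: mean of squares minus squared mean.
sigma2_shortcut = sum(xi ** 2 for xi in x) / len(x) - mu ** 2
```

Both forms of the variance agree up to floating-point rounding.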


7.3: Simple Random Sampling

The most elementary form of sampling is s.r.s. (simple random sampling), where each member of the pop’n has an equal and known chance of being selected, and is selected at most once.

There are $\binom{N}{n}$ possible samples of size $n$ taken without replacement, and each of them is equally likely.

In this section, we will derive some statistical properties of the sample mean.
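A minimal sketch of drawing a s.r.s. with the Python standard library, assuming the pop’n members are labelled $1, \ldots, N$. The helper name `simple_random_sample` and the values $N = 10$, $n = 3$ are hypothetical.

```python
import math
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members without replacement, all subsets equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population labels 1..N (illustrative only).
N, n = 10, 3
labels = list(range(1, N + 1))

n_possible_samples = math.comb(N, n)   # number of size-n subsets of N labels
sample = simple_random_sample(labels, n, seed=0)
```

Passing a seed only makes the draw reproducible; it does not change the sampling scheme.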


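7.3.0: The Sample Mean:

The sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

estimates the pop’n mean $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

where

• Pop’n: $N$ elements $1, 2, \ldots, N$

• Sample: $n$ elements $1, 2, \ldots, n$ $(n \le N)$

• Values of the sample members: $X_1, X_2, \ldots, X_n$

• Huge difference: $X_i$ is a random value (sample), while $x_i$ is a fixed value (pop’n)

so that $T = N\bar{X}$ will estimate the pop’n total $\tau$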


7.3.1: Expectation & Variance of the Sample Mean:

Theorem A (UNBIASEDNESS): under s.r.s., $E(\bar{X}) = \mu$

Theorem B: under s.r.s., $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}\left(1 - \frac{n-1}{N-1}\right)$

Recall: The variance of $\bar{X}$ in sampling without replacement differs from that in sampling with replacement, $\sigma^2/n$, by the factor $1 - \frac{n-1}{N-1}$, which is called the finite population correction.

The ratio $n/N$ is called the sampling fraction.
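For a tiny pop’n, Theorems A and B can be checked exactly by listing all $\binom{N}{n}$ equally likely samples. This is a sketch on a hypothetical five-value population; the variable names are illustrative.

```python
from itertools import combinations
from statistics import fmean, pvariance

# Hypothetical population (illustrative only); N = 5, samples of size n = 2.
x = [2, 4, 4, 6, 9]
N, n = len(x), 2
mu = fmean(x)
sigma2 = pvariance(x)          # population variance, divisor N

# Every size-n sample is equally likely under s.r.s., so the sampling
# distribution of the sample mean is the list of all subset means.
sample_means = [fmean(s) for s in combinations(x, n)]

e_xbar = fmean(sample_means)            # Theorem A: should equal mu
var_xbar = pvariance(sample_means)      # exact variance of the sample mean
theorem_b = sigma2 / n * (1 - (n - 1) / (N - 1))
```

`combinations` treats the two members with value 4 as distinct units, which is what sampling pop’n members (rather than values) requires.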


7.3.2: Estimation of the Population Variance:

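Theorem A: under s.r.s., $E(\hat{\sigma}^2) = \sigma^2\left(\frac{n-1}{n}\right)\left(\frac{N}{N-1}\right)$

where $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})^2$

Corollary A: An unbiased estimator of $\mathrm{Var}(\bar{X})$ is

given by $s_{\bar{X}}^2 = \frac{\hat{\sigma}^2}{n}\left(\frac{n}{n-1}\right)\left(\frac{N-1}{N}\right)\left(\frac{N-n}{N-1}\right) = \frac{s^2}{n}\left(1-\frac{n}{N}\right)$

where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2$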
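The same enumeration idea gives a check of Theorem A and Corollary A: averaging $\hat{\sigma}^2$ and $s_{\bar{X}}^2$ over all possible samples should reproduce the stated expectations. The function name `estimated_var_of_mean` and the data are hypothetical.

```python
from itertools import combinations
from statistics import fmean, pvariance, variance


def estimated_var_of_mean(sample, N):
    """Corollary A: s^2 / n * (1 - n/N), where s^2 uses divisor n - 1."""
    n = len(sample)
    return variance(sample) / n * (1 - n / N)


# Hypothetical population (illustrative only); N = 5, samples of size n = 3.
x = [2, 4, 4, 6, 9]
N, n = len(x), 3
samples = list(combinations(x, n))
sigma2 = pvariance(x)

# Theorem A: expectation of sigma-hat^2 (divisor n) over all samples.
avg_sigma_hat2 = fmean([pvariance(s) for s in samples])
theorem_a = sigma2 * (n - 1) / n * N / (N - 1)

# Corollary A: the estimator averages to the true Var(X-bar).
true_var = sigma2 / n * (1 - (n - 1) / (N - 1))
avg_estimate = fmean([estimated_var_of_mean(s, N) for s in samples])
```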


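7.3.3: Normal approximation to the sampling dist’n of the sample mean

We will be using the CLT (Central Limit Theorem, see Section 5.3) in order to find the probabilistic bounds for the estimation error.

Application 1:

• the probability that the error made in estimating $\mu$ by $\bar{X}$ is at most $\delta$ is $P(|\bar{X}-\mu| \le \delta) \approx 2\Phi\left(\frac{\delta}{\sigma_{\bar{X}}}\right) - 1$ using the CLT, where $\Phi$ is the standard normal cdf and $\sigma_{\bar{X}} = \sqrt{\mathrm{Var}(\bar{X})}$.

Application 2:

• a $100(1-\alpha)\%$ CI (Confidence Interval) for the pop’n mean $\mu$ is given by $\bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}$ using the CLT.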
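Both applications are short computations with a standard normal cdf. The sketch below assumes that the unknown $\sigma_{\bar{X}}$ is replaced by its estimate $s_{\bar{X}}$ from Section 7.3.2 when building the CI, which is a common practice but an assumption here. Function names and data are hypothetical.

```python
import math
from statistics import NormalDist, fmean, variance


def prob_error_within(delta, sigma_xbar):
    """Application 1: P(|X-bar - mu| <= delta) ~ 2*Phi(delta/sigma_xbar) - 1."""
    return 2 * NormalDist().cdf(delta / sigma_xbar) - 1


def mean_confidence_interval(sample, N, alpha=0.05):
    """Application 2, with s_xbar plugged in for sigma_xbar (assumption)."""
    n = len(sample)
    xbar = fmean(sample)
    s_xbar = math.sqrt(variance(sample) / n * (1 - n / N))
    z = NormalDist().inv_cdf(1 - alpha / 2)    # upper alpha/2 point
    return xbar - z * s_xbar, xbar + z * s_xbar


# Hypothetical sample of n = 5 from a population of N = 100 (illustrative;
# in practice the normal approximation needs a larger n).
sample = [12, 15, 11, 14, 13]
lo, hi = mean_confidence_interval(sample, N=100)
```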


7.4: Estimating a ratio:

Ratios arise frequently in Survey Sampling. If a bivariate sample $(X_i, Y_i)$ is drawn, then the ratio

$r = \frac{\mu_y}{\mu_x} = \frac{\frac{1}{N}\sum_{i=1}^{N} y_i}{\frac{1}{N}\sum_{i=1}^{N} x_i}$

is estimated by $R = \bar{Y}/\bar{X}$.

We wish to derive $E(R)$ and $\mathrm{Var}(R)$ using approximation methods seen in Section 4.6 because $R$ is a nonlinear function of $\bar{X}$ and $\bar{Y}$.


7.4: Estimating a ratio (cont’d)

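Theorem A:

With s.r.s., the approximate variance of $R$ is

$\mathrm{Var}(R) \approx \frac{1}{n}\left(1-\frac{n-1}{N-1}\right)\frac{1}{\mu_x^2}\left(r^2\sigma_x^2 + \sigma_y^2 - 2r\sigma_{xy}\right)$

Since the pop’n correlation coefficient is $\rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$, then

$\mathrm{Var}(R) \approx \frac{1}{n}\left(1-\frac{n-1}{N-1}\right)\frac{1}{\mu_x^2}\left(r^2\sigma_x^2 + \sigma_y^2 - 2r\rho\sigma_x\sigma_y\right)$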


7.4: Estimating a ratio (cont’d)

Theorem B:

With s.r.s., the approximate expectation of $R$ is

$E(R) \approx r + \frac{1}{n}\left(1-\frac{n-1}{N-1}\right)\frac{1}{\mu_x^2}\left(r\sigma_x^2 - \rho\sigma_x\sigma_y\right)$


Standard Error estimate of R:

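The estimated variance of $R$ is

$s_R^2 = \frac{1}{n}\left(1-\frac{n-1}{N-1}\right)\frac{1}{\bar{X}^2}\left(R^2 s_x^2 + s_y^2 - 2R s_{xy}\right)$

where $s_x^2$ and $s_y^2$ are the sample variances, and the pop’n correlation $\rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$ and the pop’n covariance

$\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)$

are estimated by

$\hat{\rho} = \frac{s_{xy}}{s_x s_y}$ and $s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})$

The standard error of $R$ is $s_R$, the square root of $s_R^2$.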


Confidence Interval for r:

An approximate $100(1-\alpha)\%$ CI (Confidence Interval) for the ratio of interest $r$ is given by

$R \pm z_{\alpha/2}\,s_R$
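The ratio estimate, its standard error and the approximate CI can be put together in a few lines. This is a sketch under the formulas of Section 7.4; the function name `ratio_estimate` and the paired data are hypothetical.

```python
import math
from statistics import NormalDist, fmean, variance


def ratio_estimate(xs, ys, N, alpha=0.05):
    """Return (R, s_R, (lower, upper)) for the ratio r = mu_y / mu_x."""
    n = len(xs)
    xbar, ybar = fmean(xs), fmean(ys)
    R = ybar / xbar
    s2x, s2y = variance(xs), variance(ys)          # divisor n - 1
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
    fpc = 1 - (n - 1) / (N - 1)                    # finite population correction
    s2R = fpc / n / xbar ** 2 * (R ** 2 * s2x + s2y - 2 * R * sxy)
    sR = math.sqrt(max(s2R, 0.0))                  # guard against rounding below 0
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return R, sR, (R - z * sR, R + z * sR)


# Hypothetical paired sample of n = 5 from N = 200 units (illustrative only).
xs = [10, 12, 9, 14, 11]
ys = [21, 25, 17, 30, 22]
R, sR, (lo, hi) = ratio_estimate(xs, ys, N=200)
```

When every $Y_i$ is an exact multiple of $X_i$, the bracketed term vanishes and $s_R$ is zero, which is a useful sanity check.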


7.5: Stratified Random Sampling:

7.5.1: Introduction (S.R.S.)

Here S.R.S. (upper case) stands for Stratified Random Sampling, as opposed to s.r.s. (lower case) for simple random sampling.

The pop’n is partitioned into sub-pop’ns or strata (singular: stratum) that are then independently sampled. The results are combined to estimate pop’n parameters.

A stratum is a subset of the population that shares at least one common characteristic.

Example: males & females; age groups;…


Why is Stratified Sampling superior to Simple Random Sampling?

• S.R.S. reduces the sampling error

• S.R.S. guarantees a prescribed number of observations from each stratum while s.r.s. can’t

• The mean of a S.R.S. can be considerably more precise than the mean of a s.r.s., if the pop’n members within each stratum are relatively homogeneous and if there is enough variation between strata.


7.5.2: Properties of Stratified Estimates:

Notation: Let $N = N_1 + N_2 + \cdots + N_L$ be the total pop’n size, where $N_l$ for $l = 1, 2, \ldots, L$ denote the pop’n sizes in the $L$ strata.

The overall pop’n mean

$\mu = \frac{1}{N}\sum_{l=1}^{L}\sum_{i=1}^{N_l} x_{il} = \sum_{l=1}^{L} W_l\mu_l$

is a weighted average of the pop’n means $\mu_l$ of the $L$ strata because $\sum_{l=1}^{L} W_l = 1$, where $W_l = N_l/N$ denotes the fraction of the pop’n in the $l^{th}$ stratum.


7.5.2: Properties of Stratified Estimates: (cont’d)

Stratified sampling requires two steps:

• Identify the relevant strata in the pop’n

• Use s.r.s. to get subjects from each stratum

Within each stratum, a s.r.s. of size $n_l$ is taken. The sample mean in the $l^{th}$ stratum will be denoted by

$\bar{X}_l = \frac{1}{n_l}\sum_{i=1}^{n_l} X_{il}$

where $X_{il}$ denotes the $i^{th}$ observation in the $l^{th}$ stratum.


7.5.2: Properties of Stratified Estimates: (cont’d)

Theorem A: The stratified estimate, $\bar{X}_s = \sum_{l=1}^{L} W_l\bar{X}_l$, of the overall pop’n mean $\mu$ is UNBIASED.

• Since we assume that the samples from different strata are independent of one another and that within each stratum a s.r.s. is taken, then the variance of $\bar{X}_s$ can be easily calculated in:

Theorem B: The variance of the stratified sample mean is

$\mathrm{Var}(\bar{X}_s) = \sum_{l=1}^{L} W_l^2\,\frac{1}{n_l}\left(1-\frac{n_l-1}{N_l-1}\right)\sigma_l^2$

where $\sigma_l^2$ is the pop’n variance within the $l^{th}$ stratum.
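A minimal sketch of the stratified estimate and of Theorem B, assuming the stratum sizes $N_l$ and stratum variances $\sigma_l^2$ are known. All names and numbers below are hypothetical.

```python
from statistics import fmean


def stratified_mean(stratum_sizes, stratum_samples):
    """Stratified estimate: sum over strata of W_l * (sample mean in stratum l)."""
    N = sum(stratum_sizes)
    return sum(N_l / N * fmean(s)
               for N_l, s in zip(stratum_sizes, stratum_samples))


def stratified_variance(stratum_sizes, sample_sizes, stratum_variances):
    """Theorem B: sum of W_l^2 / n_l * (1 - (n_l - 1)/(N_l - 1)) * sigma_l^2."""
    N = sum(stratum_sizes)
    total = 0.0
    for N_l, n_l, sig2 in zip(stratum_sizes, sample_sizes, stratum_variances):
        W_l = N_l / N
        total += W_l ** 2 / n_l * (1 - (n_l - 1) / (N_l - 1)) * sig2
    return total


# Hypothetical two-stratum survey (illustrative only).
sizes = [60, 40]                       # N_1, N_2
samples = [[10, 12, 14], [20, 22]]     # observations drawn from each stratum
xbar_s = stratified_mean(sizes, samples)
var_s = stratified_variance(sizes, [3, 2], [4.0, 9.0])
```

In practice the $\sigma_l^2$ are unknown and would be replaced by estimates from each stratum's sample.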


Neglecting / Ignoring the finite population correction:

Approximation:

If the sampling fractions $n_l/N_l$ within all strata

were small, Theorem B will then reduce to:

$\mathrm{Var}(\bar{X}_s) \approx \sum_{l=1}^{L}\frac{W_l^2\sigma_l^2}{n_l}$, where $W_l = \frac{N_l}{N}$


Expectation and Variance of the stratified estimate of the pop’n total:

This is a corollary of Theorems A & B.

$E(T_s) = \tau$

$\mathrm{Var}(T_s) = N^2\,\mathrm{Var}(\bar{X}_s) = \sum_{l=1}^{L} N_l^2\,\frac{1}{n_l}\left(1-\frac{n_l-1}{N_l-1}\right)\sigma_l^2$, where $T_s = N\bar{X}_s$

Practice with examples A & B in the textbook on pages 276-277 to get comfortable with these calculations.


7.5.3: Methods of allocation:

For small sampling fractions within strata, i.e. when neglecting/ignoring the finite pop’n correction,

$\mathrm{Var}(\bar{X}_s) \approx \sum_{l=1}^{L}\frac{W_l^2\sigma_l^2}{n_l}$

Question: How to choose $n_1, n_2, \ldots, n_L$ to minimize $\mathrm{Var}(\bar{X}_s)$ subject to the constraint $n_1 + n_2 + \cdots + n_L = n$ when the resources of a survey allow only a total of $n$ units to be sampled?

Note: We could include finite pop’n corrections but the results will be more complicated. Try it!


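7.5.3 a: Neyman allocation

Theorem A: The sample sizes $n_1, n_2, \ldots, n_L$ that

minimize $\mathrm{Var}(\bar{X}_s)$ subject to the constraint $n_1 + n_2 + \cdots + n_L = n$

are given by

$n_l = n\,\frac{W_l\sigma_l}{\sum_{k=1}^{L} W_k\sigma_k}$ where $l = 1, 2, \ldots, L$

Corollary A: stratified estimate & optimal allocations

$\mathrm{Var}(\bar{X}_{so}) = \frac{\left(\sum_{l=1}^{L} W_l\sigma_l\right)^2}{n}$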
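Neyman allocation is a one-line computation once the weights $W_l$ and stratum standard deviations $\sigma_l$ are available. The sketch below uses hypothetical values and function names, and also evaluates the approximate variance (finite pop’n correction ignored) at the optimal allocation to compare with Corollary A.

```python
def neyman_allocation(n, weights, sigmas):
    """n_l = n * W_l * sigma_l / sum_k(W_k * sigma_k); may be fractional."""
    products = [W * s for W, s in zip(weights, sigmas)]
    total = sum(products)
    return [n * p / total for p in products]


def approx_variance(weights, sigmas, sample_sizes):
    """Var(X-bar_s) ignoring the fpc: sum of W_l^2 * sigma_l^2 / n_l."""
    return sum(W ** 2 * s ** 2 / n_l
               for W, s, n_l in zip(weights, sigmas, sample_sizes))


# Hypothetical three-stratum design (illustrative only).
weights = [0.5, 0.3, 0.2]      # W_l, summing to 1
sigmas = [10.0, 20.0, 5.0]     # sigma_l
n = 100

alloc = neyman_allocation(n, weights, sigmas)
var_at_optimum = approx_variance(weights, sigmas, alloc)
corollary_a = sum(W * s for W, s in zip(weights, sigmas)) ** 2 / n
```

The allocations are generally not integers, so a real design would round them while keeping the total at $n$.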


7.5.3 b: Proportional allocation

If a survey measures several attributes for each pop’n member, it will be difficult to find an allocation that is simultaneously optimal for each of those variables. Using the same sampling fraction

$\frac{n_1}{N_1} = \frac{n_2}{N_2} = \cdots = \frac{n_L}{N_L}$, so that $n_l = n\frac{N_l}{N} = nW_l$ for $l = 1, 2, \ldots, L$,

in each stratum will provide a simple and popular alternative method of allocation.


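7.5.3 b: Proportional allocation (cont’d)

Theorem B: With stratified sampling based on proportional allocation, ignoring the finite pop’n correction,

$\mathrm{Var}(\bar{X}_{sp}) = \frac{1}{n}\sum_{l=1}^{L} W_l\sigma_l^2$

Theorem C: With stratified sampling based on both allocation methods, ignoring the finite pop’n correction,

$\mathrm{Var}(\bar{X}_{sp}) - \mathrm{Var}(\bar{X}_{so}) = \frac{1}{n}\sum_{l=1}^{L} W_l\left(\sigma_l-\bar{\sigma}\right)^2$ where $\bar{\sigma} = \sum_{l=1}^{L} W_l\sigma_l$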
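Theorem C says the price of proportional allocation over optimal allocation depends on how much the $\sigma_l$ differ across strata. A numerical sketch with hypothetical weights and standard deviations (function names are illustrative):

```python
def var_proportional(n, weights, sigmas):
    """Theorem B: (1/n) * sum of W_l * sigma_l^2."""
    return sum(W * s ** 2 for W, s in zip(weights, sigmas)) / n


def var_optimal(n, weights, sigmas):
    """Neyman allocation: (sum of W_l * sigma_l)^2 / n."""
    return sum(W * s for W, s in zip(weights, sigmas)) ** 2 / n


def theorem_c_gap(n, weights, sigmas):
    """(1/n) * sum of W_l * (sigma_l - sigma_bar)^2."""
    sigma_bar = sum(W * s for W, s in zip(weights, sigmas))
    return sum(W * (s - sigma_bar) ** 2 for W, s in zip(weights, sigmas)) / n


# Hypothetical three-stratum design (illustrative only).
weights = [0.5, 0.3, 0.2]
sigmas = [10.0, 20.0, 5.0]
n = 100

v_sp = var_proportional(n, weights, sigmas)
v_so = var_optimal(n, weights, sigmas)
gap = theorem_c_gap(n, weights, sigmas)
```

If all $\sigma_l$ are equal the gap is zero, so the two allocations then coincide in precision.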


7.6: Conclusions

A mathematical model for Survey Sampling was built using s.r.s., and probabilistic error bounds for the estimates were derived.

The theory and techniques of survey sampling include other probability sampling methods such as Systematic Sampling, Cluster Sampling, etc., as well as non-probability sampling methods such as Quota Sampling, Snowball Sampling,…