EMR 6500: Survey Research Dr. Chris L. S. Coryn Spring 2012

Post on 19-Dec-2015

Page 1: EMR 6500: Survey Research Dr. Chris L. S. Coryn Spring 2012

EMR 6500: Survey Research

Dr. Chris L. S. Coryn
Spring 2012

Agenda

• Elements of the sampling problem

• Some basic concepts of statistics

• Case Study #1

Elements of the Sampling Problem

Technical Terms (Again)

• An element is an object on which a measurement is taken

• A population is a collection of elements to which an inference is made

• A sample is a collection of sampling units drawn from a frame or frames

• Sampling units are nonoverlapping collections of elements from the population that cover the entire population

• A frame is a list of sampling units

How to Select the Sample: The Design of the Survey Sample

How to Select the Sample

• The objective of sampling is to estimate population parameters such as the mean, proportion, or total

• The quantity of information is controlled by the number of units included in a sample and the method used to select a sample

How to Select the Sample

• The primary questions addressed by sampling theory are:
– What sampling procedure should be used?
– What number of sampling units should be included in a sample?

• The answer to both depends on how much information one is willing to buy

How to Select the Sample

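• If $\theta$ is the population parameter of interest and $\hat{\theta}$ is an estimator of $\theta$, then a bound on the error of estimation, $B$, should be specified such that the difference in absolute value between $\theta$ and $\hat{\theta}$ is less than $B$, that is, $|\theta - \hat{\theta}| < B$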

How to Select the Sample

• A probability, $1 - \alpha$, specifies the fraction of times in repeated samples that the error of estimation is less than $B$

How to Select a Sample

• Typically $B$ is set to $2\sigma_{\hat{\theta}}$ (two standard deviations of the estimator) and, therefore, $1 - \alpha$ will be approximately .95

• Once a bound, $B$, has been specified, along with its associated probability, $1 - \alpha$, different sampling designs can be compared to determine which is most efficient for a particular purpose
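The link between a two-standard-deviation bound and a probability of about .95 can be checked by simulation. The following is a minimal sketch, assuming a hypothetical, roughly bell-shaped population of 1,000 values and simple random sampling without replacement; the function name, the population, and the sample size are illustrative assumptions, not part of the course material.

```python
import random
import statistics


def coverage_of_two_sigma_bound(population, n, reps=2000, seed=1):
    """Fraction of repeated simple random samples whose sample mean
    lies within B = 2 * (standard deviation of the sample mean)
    of the population mean."""
    rng = random.Random(seed)
    N = len(population)
    mu = statistics.fmean(population)
    # Variance of the sample mean under sampling without replacement:
    # (1 - n/N) * S^2 / n, where S^2 uses the divisor N - 1.
    s2 = statistics.variance(population)
    B = 2 * ((1 - n / N) * s2 / n) ** 0.5
    hits = 0
    for _ in range(reps):
        sample = rng.sample(population, n)
        if abs(statistics.fmean(sample) - mu) < B:
            hits += 1
    return hits / reps


# Hypothetical population (illustrative values only).
_rng = random.Random(0)
population = [_rng.gauss(50, 10) for _ in range(1000)]
coverage = coverage_of_two_sigma_bound(population, n=50)
print(f"share of samples with error below B: {coverage:.3f}")
```

For a strongly skewed population or a very small n, the share may drift away from .95, which is why the slide says "approximately".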

Probability Sampling

• Statistical estimation requires randomness in sampling designs so that properties of statistical estimators can be assessed probabilistically

• Sampling designs based on planned randomness are probability samples

Simple Random Sampling

• The basic probability sampling design, simple random sampling, consists of selecting a group of n sampling units in such a way that all samples of size n have the same probability of selection
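As an illustration of this definition, the sketch below draws a simple random sample without replacement from a hypothetical frame of 20 units and counts how many distinct samples of that size exist. The frame contents and the helper name are assumptions made for the example.

```python
import math
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n sampling units without replacement so that every
    possible subset of size n is equally likely to be selected."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


frame = [f"unit_{i:02d}" for i in range(1, 21)]  # hypothetical frame, N = 20
sample = simple_random_sample(frame, n=5, seed=42)
n_possible = math.comb(len(frame), 5)  # number of distinct samples of size 5
print(sample)
print(f"each of the {n_possible} possible samples has probability 1/{n_possible}")
```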

Stratified Random Sampling

• A stratified random sample is one obtained by separating the population elements into discrete, nonoverlapping groups, called strata, and then selecting a simple random sample from each stratum

Stratified Random Sampling

• The principal reasons for using stratified random sampling rather than simple random sampling are:
1. Stratification may produce a smaller bound on the error of estimation than would be produced by a simple random sample of the same size (this is particularly true if measurements within strata are homogeneous)
2. The cost per observation may be reduced by stratification of the population elements into convenient groupings
3. Estimates of population parameters may be desired for subgroups of the population (these subgroups should then be identifiable strata)
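A stratified random sample can be sketched as follows: partition the elements into strata, then draw an independent simple random sample within each stratum. The student population, the stratum labels, and the proportional allocation are hypothetical choices for illustration only.

```python
import random
from collections import defaultdict


def stratified_random_sample(elements, stratum_of, n_per_stratum, seed=None):
    """Split elements into nonoverlapping strata, then draw a simple
    random sample (without replacement) independently from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)
    return {
        label: rng.sample(members, n_per_stratum[label])
        for label, members in strata.items()
    }


# Hypothetical population: 60 undergraduates and 40 graduate students.
students = [("undergrad", i) for i in range(60)] + [("grad", i) for i in range(40)]
strat_sample = stratified_random_sample(
    students,
    stratum_of=lambda s: s[0],
    n_per_stratum={"undergrad": 6, "grad": 4},  # proportional allocation (assumed)
    seed=7,
)
for label, members in strat_sample.items():
    print(label, len(members))
```

Because every stratum is sampled, subgroup estimates (reason 3 above) are available by construction.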

Cluster Sampling

• Cluster sampling is a less costly alternative to simple or stratified random sampling if the cost of obtaining a frame that lists all population elements is very high or if the cost of obtaining observations increases as the distance separating elements increases

Cluster Sampling

• Cluster sampling is an effective design for obtaining a specified amount of information under the following conditions:
1. A good frame listing all population elements is not available or is very costly to obtain, but a frame listing clusters is easily obtained
2. The cost of obtaining observations increases as the distance separating the elements increases

Cluster Sampling

• Clusters typically consist of herds, households, or other units of clustering (e.g., an orange tree forms a cluster of oranges for investigating insect infestations)

• A farm herd contains a cluster of livestock for estimating proportions of diseased animals

• Elements within a cluster are often physically close together and hence tend to have similar characteristics and the measurement on one element within a cluster may be correlated with the measurement on another
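The herd example can be sketched as a one-stage cluster sample: herds are selected by simple random sampling, and every animal in a selected herd is observed. The herd data are hypothetical, and the proportion is computed with a simple ratio of diseased animals to observed animals, which is one common choice rather than the only one.

```python
import random


def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters by simple random sampling and keep every
    element inside each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {cluster_id: clusters[cluster_id] for cluster_id in chosen}


# Hypothetical herds: 1 = diseased animal, 0 = healthy animal.
herds = {
    "herd_a": [0, 0, 1, 0],
    "herd_b": [1, 1, 0],
    "herd_c": [0, 0, 0, 0, 0],
    "herd_d": [0, 1, 0, 0],
    "herd_e": [1, 0],
    "herd_f": [0, 0, 0],
}
sampled_herds = one_stage_cluster_sample(herds, n_clusters=3, seed=3)
diseased = sum(sum(animals) for animals in sampled_herds.values())
observed = sum(len(animals) for animals in sampled_herds.values())
p_hat = diseased / observed  # ratio-type estimate of the proportion diseased
print(sorted(sampled_herds), f"estimated proportion diseased: {p_hat:.2f}")
```

Only a frame of herds is needed here, not a frame of individual animals.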

Stratified sampling:
• Take a simple random sample from every stratum
• Each element of the population is in exactly one stratum
• Variance of the estimate depends on the variability within strata
• For greatest precision, individual elements within each stratum should have similar values, but stratum means should differ from each other as much as possible

Cluster sampling:
• Take a simple random sample of clusters; observe all elements within clusters in the sample
• Each element of the population is in exactly one cluster
• Variance of the estimate depends primarily on the variability between clusters
• For greatest precision, individual elements within each cluster should be heterogeneous, and cluster means should be similar to one another

Systematic Sampling

• Systematic sampling involves random selection of one element from the first k elements and then selecting every kth element thereafter
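The selection rule can be sketched in a few lines. The frame of 100 numbered elements and the interval k = 10 are hypothetical values chosen for illustration.

```python
import random


def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k elements, then take
    every kth element thereafter."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # 0-based position within the first k elements
    return frame[start::k]


frame = list(range(1, 101))  # hypothetical ordered frame, N = 100
sys_sample = systematic_sample(frame, k=10, seed=11)
print(sys_sample)
```

When N is not a multiple of k, the sample size depends on the random start and can differ by one.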

Systematic Sampling

• Systematic sampling is a useful alternative to simple random sampling for the following reasons:
1. Systematic sampling is easier to perform in the field and hence is less subject to selection errors by field-workers than are either simple random samples or stratified random samples, especially if a good frame is not available
2. Systematic sampling can provide greater information per unit cost than simple random sampling can provide for certain populations with certain patterns in the arrangement of elements

Multi-Stage Sampling

• Sampling conducted in stages, often taking into account the hierarchical (nested) structure of a population
– Primary sampling units (PSUs) are sampled first (e.g., cities)
– Secondary sampling units (SSUs) are sampled next (e.g., city blocks)
– Ultimate sampling units (actual elements) are sampled last (e.g., households)

• Especially useful when no frame can be established for a single-stage sample
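The cities, blocks, and households example can be sketched as three nested rounds of simple random sampling. The nested frame and the stage sample sizes below are hypothetical; a real design would often use unequal selection probabilities, which this sketch does not attempt.

```python
import random


def multistage_sample(cities, n_cities, n_blocks, n_households, seed=None):
    """Sample PSUs (cities), then SSUs (blocks) within each sampled
    city, then households within each sampled block."""
    rng = random.Random(seed)
    selected = []
    for city in rng.sample(sorted(cities), n_cities):  # stage 1: PSUs
        blocks = cities[city]
        for block in rng.sample(sorted(blocks), n_blocks):  # stage 2: SSUs
            for household in rng.sample(blocks[block], n_households):  # stage 3
                selected.append((city, block, household))
    return selected


# Hypothetical nested frame: 4 cities x 5 blocks x 8 households.
cities = {
    f"city_{c}": {
        f"block_{b}": [f"hh_{c}{b}{h}" for h in range(8)] for b in range(5)
    }
    for c in range(4)
}
ms_sample = multistage_sample(cities, n_cities=2, n_blocks=2, n_households=3, seed=5)
print(len(ms_sample), "households selected")
```

Note that a list of households is needed only for the blocks that are actually selected, which is the practical appeal when no single-stage frame exists.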

Multi-Stage Sampling

• For a fixed sample size of elements, a multi-stage sampling design is almost always less efficient than a simple random sample (though often more feasible)

• Variance estimation methods for complex sample designs must be used to obtain correct standard errors

Multiple-Frame Sampling

Quota Sampling

• A nonprobability sampling method (although randomness is sometimes part of the design) in which a prespecified number of surveys is obtained from specific subgroups of a target population (e.g., Republicans, Democrats)

• Introduces unknown sampling biases into survey estimates

Chain-Referral Sampling

• Snowball sampling methods for sampling in rare/hard-to-reach populations

• One or more persons having the trait of interest serve as seeds and identify others

• Persons with many connections are likely to be included, whereas isolated persons may not be included at all

• Information about network connections in the sample can be used to weight sample units (respondent-driven sampling, which is premised on Markov-chain theory)
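The Markov-chain premise can be illustrated with a small calculation: if recruitment between two groups follows fixed probabilities, the expected composition of recruits settles to the same equilibrium regardless of which group the seeds come from. The groups "A" and "B" and the recruitment probabilities below are hypothetical, and this sketch covers only the convergence idea, not the weighting used in respondent-driven sampling estimators.

```python
def composition_by_wave(start, transition, waves):
    """Expected share of each group among recruits at every wave of a
    first-order Markov recruitment process."""
    groups = list(start)
    current = dict(start)
    history = [current]
    for _ in range(waves):
        current = {
            g: sum(current[h] * transition[h][g] for h in groups) for g in groups
        }
        history.append(current)
    return history


# Hypothetical probabilities that a recruiter in one group recruits
# someone from each group (rows sum to 1).
transition = {
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.4, "B": 0.6},
}
from_a_seeds = composition_by_wave({"A": 1.0, "B": 0.0}, transition, waves=10)
from_b_seeds = composition_by_wave({"A": 0.0, "B": 1.0}, transition, waves=10)
for wave in (0, 1, 2, 5, 10):
    print(wave, round(from_a_seeds[wave]["A"], 4), round(from_b_seeds[wave]["A"], 4))
```

With these assumed probabilities, the share of group A approaches 4/7 from either starting point within a few waves.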

Recruitment Network

[Figure: recruitment network diagram; nodes are labeled + or –]

Equilibrium

[Figure: percentage of population (vertical axis, 0% to 100%) by recruitment wave (horizontal axis, 0 to 10)]

Planning a Survey

Planning a Survey

1. Statement of objectives
2. Target population
3. The frame
4. Sample design
5. Method of measurement
6. Measurement instrument
7. Selection and training of fieldworkers
8. The pretest (pilot)
9. Organization of fieldwork
10. Organization of data management
11. Data analysis

Some Basic Concepts of Statistics

Finite Population Correction

• Most statistical theory is premised on an underlying infinite population

• Sampling theory and practice are founded on the assumption of sampling from a finite population

• In the general framework of finite population sampling, samples of size n are taken from a population of size N

• In the finite population case, the variance estimate of a statistical estimator must be adjusted due to the fact that not all data from a finite population are observed, using the finite population correction (fpc)

Finite Population Correction

• For simple random samples (without replacement) the fpc is expressed as $1 - \frac{n}{N}$ or $1 - f$

• Where $f = \frac{n}{N}$ is the sampling fraction or rate

Finite Population Correction

• The fpc is, therefore, the fraction of a finite population that is not sampled

• Because the fpc is literally a factor in the calculation of an estimate of variance for an estimated finite population parameter, the estimated variance is reduced to zero if n = N

Finite Population Correction

• When n is small relative to N, the fpc is close to unity

• In samples of very large populations f is very small and the fpc may be ignored
– Ignore if $1 - n/N > .95$

• Although the fpc is applicable for estimation, it often is not necessary for many inferential uses such as statistical significance testing (e.g., comparison between sampled subgroups).
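The behavior of the fpc described on these slides can be checked numerically. The helper names and the example values of n and N are assumptions for illustration; the .95 threshold is the rule of thumb stated above.

```python
def fpc(n, N):
    """Finite population correction, 1 - n/N, for a simple random
    sample of size n drawn without replacement from N elements."""
    if not 0 < n <= N:
        raise ValueError("require 0 < n <= N")
    return 1 - n / N


def fpc_can_be_ignored(n, N, threshold=0.95):
    """Rule of thumb from the slide: ignore the fpc if 1 - n/N > .95."""
    return fpc(n, N) > threshold


print(fpc(100, 100))                # census: variance factor is zero
print(fpc(400, 1000))               # large sampling fraction: correction matters
print(fpc_can_be_ignored(100, 10000))
print(fpc_can_be_ignored(400, 1000))
```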

Estimate of Population Mean
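A minimal sketch of the estimate of a population mean, assuming simple random sampling without replacement and the standard textbook estimator: the sample mean, an estimated variance of $(1 - n/N)\,s^2/n$, and a bound of two estimated standard errors. The sample values and N = 50 are hypothetical, and the slide's own notation may differ.

```python
import math
import statistics


def estimate_mean(sample, N):
    """Return the sample mean, its estimated variance with the fpc
    applied, and the bound B = 2 * sqrt(estimated variance)."""
    n = len(sample)
    y_bar = statistics.fmean(sample)
    s2 = statistics.variance(sample)  # sample variance, divisor n - 1
    var_hat = (1 - n / N) * s2 / n
    bound = 2 * math.sqrt(var_hat)
    return y_bar, var_hat, bound


y_bar, mean_var, mean_bound = estimate_mean([2, 4, 6, 8, 10], N=50)
print(f"mean = {y_bar:.2f}, variance = {mean_var:.2f}, bound = {mean_bound:.2f}")
```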

Estimate of Population Proportion

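A minimal sketch of the estimate of a population proportion, assuming simple random sampling without replacement and the common textbook form in which the estimated variance is $(1 - n/N)\,\hat{p}\hat{q}/(n - 1)$ with $\hat{q} = 1 - \hat{p}$. The counts and N = 2000 are hypothetical, and the slide's own formula may be written differently.

```python
import math


def estimate_proportion(successes, n, N):
    """Return the sample proportion, its estimated variance with the
    fpc applied, and the bound B = 2 * sqrt(estimated variance)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    var_hat = (1 - n / N) * p_hat * q_hat / (n - 1)
    bound = 2 * math.sqrt(var_hat)
    return p_hat, var_hat, bound


p_hat, p_var, p_bound = estimate_proportion(successes=30, n=100, N=2000)
print(f"proportion = {p_hat:.2f}, bound = {p_bound:.3f}")
```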

Estimate of Population Total
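A minimal sketch of the estimate of a population total, assuming simple random sampling without replacement and the standard estimator $\hat{\tau} = N\bar{y}$ with estimated variance $N^2 (1 - n/N)\,s^2/n$. The sample values and N = 50 are hypothetical, and the slide's own notation may differ.

```python
import math
import statistics


def estimate_total(sample, N):
    """Return the estimated total N * y_bar, its estimated variance
    with the fpc applied, and the bound B = 2 * sqrt(estimated variance)."""
    n = len(sample)
    tau_hat = N * statistics.fmean(sample)
    var_hat = N**2 * (1 - n / N) * statistics.variance(sample) / n
    bound = 2 * math.sqrt(var_hat)
    return tau_hat, var_hat, bound


tau_hat, tau_var, tau_bound = estimate_total([2, 4, 6, 8, 10], N=50)
print(f"total = {tau_hat:.1f}, variance = {tau_var:.1f}, bound = {tau_bound:.1f}")
```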

Case Study #1

Case Study Activity

• In small groups, address the following questions in relation to Case Study #1, relying only on the material discussed thus far in the semester:
1. Has the surveyor committed any serious error(s)?
2. If so, what type and why? If not, why not?