sampling (1)

Chapter 7

Salam Abdallah, PhD

MGT524

Sampling

Survey elements

Basic terms and concepts: 1

Population: the universe of units from which the sample is to be selected

Sample: the segment of population that is selected for investigation

Sampling frame: a list of all units in the population. For example, if all the workers in a factory make up a population, a single worker is a unit of that population; if all the factories in a country are being studied, a single factory is a unit of the population of factories. The sampling frame contains all the units of the population, so it must be clearly defined which units are to be included in the frame.

The frame provides a base for the selection of the sample.

The sampling frame operationally defines the target population from which the sample is drawn and to which the sample data will be generalized.

Representative sample: a sample that reflects the population accurately

Sample bias: distortion in the representativeness of the sample

Basic terms and concepts: 2

Probability sample: sample selected using random selection

Non-probability sample: sample selected without using a random selection method

Sampling error: difference between sample and population

Non-sampling error: unpredictable errors resulting from, for example, poor estimation or non-response

Non-response: when members of the sample are unable or refuse to take part

Systematic errors: errors that tend to accumulate over the entire sample. For example, an error in the questionnaire design could cause problems with the respondents' answers, which in turn can create processing errors. These types of errors often lead to bias in the final results.

Census: data collected from entire population

Sampling breakdown

[Figure: sampling breakdown]

Sampling error

Difference between sample and population

Biased sample does not represent population

some groups are over-represented; others are under-represented

sources of bias

non-probability sampling, inadequate sample frame, non-response

Probability sampling reduces sampling error and allows for inferential statistics
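
The contrast between a biased sample and a probability sample can be illustrated with a small simulation. This is only a sketch on synthetic data: the head-office split, the training hours and the seed are all assumptions made up for the illustration, not figures from the slides.

```python
import random
import statistics

rng = random.Random(11)  # arbitrary seed so the run is repeatable

# Synthetic population: 9000 employees, the first 1800 at head office.
# Head-office staff are assumed (for illustration only) to get more training.
population = []
for i in range(9000):
    head_office = i < 1800
    hours = rng.gauss(50 if head_office else 30, 5)
    population.append((head_office, hours))

true_mean = statistics.mean(h for _, h in population)

# Biased sample: drawn only from the easily reached head-office staff,
# so that group is over-represented and everyone else is left out.
biased = rng.sample(population[:1800], 450)

# Probability sample: every employee has the same chance of selection.
random_sample = rng.sample(population, 450)

biased_error = abs(statistics.mean(h for _, h in biased) - true_mean)
random_error = abs(statistics.mean(h for _, h in random_sample) - true_mean)

print(f"Error of biased sample: {biased_error:.2f} hours")
print(f"Error of random sample: {random_error:.2f} hours")
```

The random sample still differs from the population (sampling error), but the difference is small and can be quantified, whereas the biased sample misses by a wide margin however large it is.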

Probability sampling

The four-stage process

Identify sampling frame from research objectives

Decide on a suitable sample size

Select the appropriate technique and the sample

Check that the sample is representative

4 types of probability sample

Simple random sample

Systematic sample

Stratified random sample

Multi-stage cluster sample

Sample size

Choice of sample size is influenced by

Confidence needed in the data

Margin of error that can be tolerated

Types of analyses to be undertaken

Size of the population being sampled and its distribution

Simple random sampling

Each unit has an equal probability of selection

Sampling fraction: n/N

where n = sample size and N = population size

List all units and number them consecutively

Use a random number table to select units

Simple random sampling

Example: we want to conduct a survey to measure the levels of training, skill development and learning among employees. We have selected a company that has 9000 employees, and surveying the whole population may not be feasible.

Decide on your sample size, e.g. 450

Number all the employees from 1 to 9000

Sampling fraction: 450/9000, i.e. 1 in 20

Generate 450 random numbers between 1 and 9000 using a random number generator; the numbers generated identify the employees to be surveyed: http://www.psychicscience.org/random.aspx
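
The steps of this example can also be sketched in Python instead of an online generator. This is a minimal illustration, assuming employees are identified simply by the numbers 1 to 9000; the seed is an arbitrary choice that only makes the draw repeatable.

```python
import random

POPULATION_SIZE = 9000   # N: employees numbered 1..9000
SAMPLE_SIZE = 450        # n: chosen sample size

# Arbitrary seed (an assumption) so the same sample can be drawn again.
rng = random.Random(7)

# Number every employee consecutively, then draw without replacement,
# so each employee has the same chance of being selected.
employee_numbers = range(1, POPULATION_SIZE + 1)
selected = sorted(rng.sample(employee_numbers, SAMPLE_SIZE))

sampling_fraction = SAMPLE_SIZE / POPULATION_SIZE  # 450/9000, i.e. 1 in 20

print(f"Sampling fraction: {sampling_fraction:.2f}")
print("First ten selected employees:", selected[:10])
```

Drawing without replacement matters here: it guarantees 450 distinct employees, which a list of independently generated random numbers would not.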

Systematic sampling

Select units directly from sampling frame

From a random starting point, choose every nth unit (e.g. every 4th name)

A starting point is chosen at random, and choices thereafter are made at regular intervals. For example, suppose you want to sample 450 employees from 9000. 9000/450 = 20, so every 20th employee is chosen after a random starting point between 1 and 20. If the random starting point is 16, then the employees selected are 16, 36, 56, 76, 96, 116, etc.

Make sure the sampling frame has no inherent ordering; if it has, rearrange it to remove bias
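
The systematic selection described above can be sketched as a short function. The function name and its `start` parameter are hypothetical, introduced only so the worked example with starting point 16 can be reproduced.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Return the 1-based unit numbers picked by systematic sampling."""
    interval = population_size // sample_size        # 9000 // 450 = 20
    if start is None:
        start = random.randint(1, interval)          # random start in 1..interval
    # Take every `interval`-th unit from the starting point onwards.
    return list(range(start, population_size + 1, interval))[:sample_size]

# Reproduce the example: starting point 16, every 20th employee.
print(systematic_sample(9000, 450, start=16)[:6])    # [16, 36, 56, 76, 96, 116]

# In practice the starting point is left to chance.
print(systematic_sample(9000, 450)[:6])
```

The function does not check the frame for inherent ordering; if the list is sorted in a way that lines up with the interval, it should be shuffled before the unit numbers are assigned.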

Stratified random sampling

The starting point is to categorise the population into strata (relevant divisions, or departments of companies, for example), i.e. stratifying the population by a criterion

Five departments will result in five strata

The sample can then be proportionately representative of each stratum

Then, randomly select within each category, as for a simple random sample

This approach ensures that the resulting sample is distributed in the same way as the population

We can also stratify by several criteria, e.g. by department, by gender, and by whether employees are above or below a certain salary level or occupational grade

Note that we can only stratify the sample if the relevant information is accessible

Stratified sampling: an example

Using a sampling fraction of 1 in 20, we would expect to have 90 employees in our sample from this department of the company. However, because of sampling error, it is unlikely that exactly this number will occur; there may be, say, 85 or 93 from this department.
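
The point can be illustrated by comparing the two approaches in code, on the assumption that the passage describes what happens when the sample is drawn without stratification. The department names and sizes below are hypothetical; only the 1-in-20 fraction and the expected 90 (which implies a department of 1800) come from the example.

```python
import random

# Hypothetical department sizes summing to 9000 employees.
departments = {
    "Department A": 1800,   # 1800 / 20 = 90 expected in the sample
    "Department B": 1200,
    "Department C": 1000,
    "Department D": 1800,
    "Department E": 3200,
}
FRACTION = 20               # sample 1 in 20
rng = random.Random(1)      # arbitrary seed for repeatability

# Sampling frame: one (department, employee_id) pair per employee.
frame = [(dept, i) for dept, size in departments.items() for i in range(size)]

def count_by_department(sample):
    return {d: sum(1 for dept, _ in sample if dept == d) for d in departments}

# Simple random sample of 450: department counts vary by chance.
srs_counts = count_by_department(rng.sample(frame, len(frame) // FRACTION))

# Stratified sample: draw 1 in 20 separately within each department.
stratified = []
for dept, size in departments.items():
    members = [unit for unit in frame if unit[0] == dept]
    stratified.extend(rng.sample(members, size // FRACTION))
strat_counts = count_by_department(stratified)

print("Simple random:", srs_counts)
print("Stratified:   ", strat_counts)
```

With stratification, Department A always contributes exactly 90 employees; with the simple random draw, its count drifts around 90 from one run to the next.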

Sample size required: for a population of 5,000, a 95% confidence level and a 2.5% margin of error require a sample size of 1,176
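
The slides do not say which formula produced this figure. One commonly used calculation that reproduces it is the sample size formula for a proportion with a finite population correction, assuming maximum variability (p = 0.5). The sketch below rests on that assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(population, confidence=0.95, margin=0.025, p=0.5):
    """Sample size for estimating a proportion, with finite population correction."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Size needed for an effectively infinite population.
    n0 = z**2 * p * (1 - p) / margin**2
    # Shrink it to account for the finite population.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(required_sample_size(5000))  # 1176
```

Relaxing the margin of error reduces the requirement sharply, for example `required_sample_size(5000, margin=0.05)`.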

STATISTICAL SAMPLING

Multi-stage cluster sampling

Useful for widely dispersed populations

First, divide population into groups (clusters) of units, like geographic areas, or industries, for example

Sub-clusters (sub-groups) can then be sampled from these clusters, if appropriate

Now randomly select units from each (sub)cluster

Collect data from each cluster of units, consecutively

Multi-stage cluster sampling

Example: We want a nationally representative sample of 5,000 employees who are working for the 100 largest companies in the UK.

Problem: Using simple random or systematic sampling would yield a widely dispersed sample, which would result in a great deal of travel for interviewers.

One solution is to sample companies and then employees from each company. We randomly sample ten companies from the entire population of the 100 largest companies in the UK, resulting in ten clusters, and we would then interview 500 randomly selected employees at each of the ten companies, i.e. 5,000 employees.
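
The two stages of this solution can be sketched as follows. The company names and workforce sizes are invented placeholders; the only figures taken from the example are 100 companies, ten clusters and 500 employees per cluster.

```python
import random

rng = random.Random(42)  # arbitrary seed for repeatability

# Hypothetical frame: 100 companies with made-up workforce sizes
# (all above 500 so that 500 employees can be drawn from any of them).
companies = {f"Company {i:03d}": rng.randint(2000, 20000) for i in range(1, 101)}

# Stage 1: randomly select 10 of the 100 companies (the clusters).
selected_companies = rng.sample(sorted(companies), 10)

# Stage 2: randomly select 500 employees within each selected company.
sample = []
for name in selected_companies:
    employee_ids = range(1, companies[name] + 1)
    for employee_id in rng.sample(employee_ids, 500):
        sample.append((name, employee_id))

print(len(selected_companies), "companies,", len(sample), "employees")
```

Interviewers then only need to visit ten sites, which is the practical gain over a simple random sample spread across all 100 companies.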

Qualities of a probability sample

Good representative sample: allows for generalization from sample to population

Use inferential statistical tests to generalize

Sample means can be used to estimate population means
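
As a rough illustration of estimating a population mean from a sample mean, the sketch below draws a random sample from synthetic data and reports an approximate 95% confidence interval. The "training hours" values are made up, and the interval uses the normal critical value 1.96, which is a simplifying assumption.

```python
import math
import random
import statistics

rng = random.Random(3)  # arbitrary seed for repeatability

# Synthetic population of 9000 "training hours" values (illustration only).
population = [rng.gauss(40, 10) for _ in range(9000)]
sample = rng.sample(population, 450)

sample_mean = statistics.mean(sample)
# Standard error of the mean: sample standard deviation / sqrt(n).
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval around the sample mean.
low = sample_mean - 1.96 * std_error
high = sample_mean + 1.96 * std_error

print(f"Population mean: {statistics.mean(population):.2f}")
print(f"Sample mean:     {sample_mean:.2f} (95% CI {low:.2f} to {high:.2f})")
```

This kind of inference is only justified when the sample was selected by a probability method.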

Sample size

Absolute size matters more than relative size

The larger the sample, the more precise and representative it is likely to be

As sample size increases, sampling error decreases

Important to be honest about the limitations of your sample
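
One way to see why absolute size matters more than relative size is to compute the margin of error for the same sample size drawn from very different populations. The sketch uses the standard formula for a proportion with a finite population correction; p = 0.5, z = 1.96 and the population sizes are assumptions chosen for illustration.

```python
import math

def margin_of_error(n, population_size, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion."""
    # Finite population correction: close to 1 unless n is a large share of N.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc

# Same absolute sample size, very different sampling fractions.
for population_size in (10_000, 1_000_000, 100_000_000):
    moe = margin_of_error(1000, population_size)
    print(f"N = {population_size:>11,}  n = 1000  margin = {moe:.4f}")

# Same population, larger absolute sample size.
for n in (100, 1000, 4000):
    print(f"N = 1,000,000  n = {n:>4}  margin = {margin_of_error(n, 1_000_000):.4f}")
```

The first loop shows the margin barely moving as the population grows ten-thousandfold, while the second shows it shrinking as the sample itself grows.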

Factors affecting sample size: 1

Time and cost

after a certain point (around n = 1000), increasing sample size produces less noticeable gains in precision

very large samples are not cost-efficient

Non-response

response rate = % of sample who agree to participate (or % who provide usable data)

responders and non-responders may differ on a crucial variable
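
The diminishing gains in precision noted under time and cost can be made concrete with a short calculation. This sketch assumes a large population, a proportion near 0.5 and a 95% confidence level; the sample sizes are arbitrary examples.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (large population)."""
    return z * math.sqrt(p * (1 - p) / n)

sizes = (250, 500, 1000, 2000, 4000)
margins = {n: margin_of_error(n) for n in sizes}

previous = None
for n in sizes:
    line = f"n = {n:>4}: +/- {100 * margins[n]:.2f} percentage points"
    if previous is not None:
        # How much precision was gained by doubling the sample size.
        line += f" (gain {100 * (margins[previous] - margins[n]):.2f})"
    print(line)
    previous = n
```

Each doubling of the sample costs roughly twice as much fieldwork but buys a smaller improvement than the doubling before it.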

Factors affecting sample size: 2

Heterogeneity of the population

the more varied the population is, the larger the sample will have to be

Kind of analysis to be carried out

some techniques require a large sample (e.g. inferential statistics)

Types of non-probability sampling: 1

1. Convenience sampling

the most easily accessible individuals

useful when piloting a research instrument

may be a chance to collect data that is too good to miss

2. Snowball sampling

researcher makes initial contact with a small group

these respondents introduce others in their network

Types of non-probability sampling: 2

3. Quota sampling

often used in market research and opinion polls

relatively cheap, quick and easy to manage

proportionately representative of a population's social categories (strata)

but non-random sampling of each stratum's units

interviewers select people to fit their quota for each category, so the sample may be biased towards those who appear friendly and accessible (e.g. in the street), leading to under-representation of less accessible groups
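
The mechanics of filling quotas can be sketched as below. The age categories and quota sizes are hypothetical, and the random stream only simulates who happens to walk past; in real quota sampling the interviewer decides whom to approach, which is where the bias enters.

```python
import random

rng = random.Random(5)  # arbitrary seed for repeatability

# Hypothetical quotas for a sample of 100, set to mirror assumed population shares.
quotas = {"under 30": 30, "30 to 49": 45, "50 and over": 25}
filled = {category: 0 for category in quotas}
sample = []

categories = list(quotas)
person_id = 0
# Keep approaching people until every quota has been filled.
while any(filled[c] < quotas[c] for c in quotas):
    person_id += 1
    category = rng.choice(categories)           # simulated passer-by
    if filled[category] < quotas[category]:     # still room in this quota
        filled[category] += 1
        sample.append((person_id, category))
    # Otherwise the person is turned away: that quota is already full.

print(filled)
print("People approached:", person_id)
```

The sample matches the quotas exactly, yet nothing in the procedure gives each member of a category an equal chance of selection.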

Limits to generalization

findings can only be generalized to the population from which the sample was selected

be wary of over-generalizing in terms of locality

time, historical events and cohort effects

results may no longer be relevant and so require updating (replication)

Error in survey research

Sampling error

unavoidable difference between sample and population

Sampling-related error

inadequate sampling frame; non-response

makes it difficult to generalize findings

Data collection error

implementation of research instruments

e.g. poor question wording in surveys

Data processing error

faulty management of data, e.g. coding errors

[Figure labels: primary clusters; secondary clusters; simple random sampling within secondary clusters]
