blog on sampling

www.Anify.net

Upload: letsanify

Post on 14-Aug-2015

Category:

Education


Page 2: Blog on sampling

Would you eat all the grains of rice to see if it is cooked?

Would you have the entire soup to check if it tastes good?

Possibly not, unless you are very hungry.

Let’s talk about Sampling today…

Page 3: Blog on sampling

In real life, one usually doesn’t have the time, money or resources to reach out to the entire population.

Even if one could, would it be the best use of that time, money or resources?

Probably not, and this is where statistics comes to the rescue! It allows us to study a part of the population and still draw reliable inferences about the whole.

Page 4: Blog on sampling

However, as soon as one decides to study a ‘part’ of the population, i.e. a sample, one has to find answers to three questions…

1. Who to sample and how do you find them? Or: what to sample and how do you find it?

2. How do you pull out a sample from the population?

3. What is the right size of the sample, i.e. how many?

Page 5: Blog on sampling

1. Who to sample and how do you find them? Or: what to sample and how do you find it?

Who to sample?

If you can answer the question ‘Who to sample from?’, you’ll know who to sample.

‘Who to sample from’ refers to the population. Defining the population usually follows from your research objective or, in marketing research problems, from your target audience.

Can you give some examples to make it clearer?

Sure…

Example 1
Objective: To understand how adults in Mumbai feel about eating Maggi.
Population: The entire adult population of Mumbai.

Example 2
Objective: To establish whether post graduate students in a country like simplified animated videos as a means of learning.
Population: All post graduate students of that country.

Example 3
Objective: To detect whether shampoo bottles in a batch from a specific machine have an incorrect volume.
Population: All the bottles of shampoo in that batch from the machine.

Page 6: Blog on sampling

How to find the sample?

And how do we go about finding the sample?

Any information source that is comprehensive and free of bias can be used to locate the sample, e.g. census data, a class attendance roll or a telephone directory. This source of data is also called the sample frame.

Page 7: Blog on sampling

2. How do you pull out a sample from the population?

How to pull the sample?

While pulling samples out of the population, should each element of the population have the same chance/probability of being picked?

…If so, then welcome to Probability Sampling.

Are there situations where, for other reasons, you do not have access to the entire population data at the outset? How do you then pull out a sample while ensuring an equal probability for each element?

…Do you then move past Probability Sampling into the world of Non-Probability Sampling?

Can you tell me more about Probability Sampling first?

Page 8: Blog on sampling

How to pull the sample? Probability sampling

There are different types of Probability Sampling, but all have two things in common: every element has a non-zero probability of being sampled, and the process involves random selection at some point.

1. Simple Random Sampling

Each element has a known and equal probability of selection, and each combination of elements has an equal probability of selection.

Examples…

1. Picking names from a hat
2. Numbers being drawn in a lottery
3. Assigning numbers to participants and selecting a sample by generating random numbers

[Figure: participants numbered 1 to 100; random number generation (1-100) selects 75, 3, 11, 79 and 14]
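
The third example (numbering participants and drawing random numbers) can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `simple_random_sample` and the fixed seed are assumptions made here so the run is repeatable.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every element, and every
    combination of n elements, is equally likely to be chosen."""
    rng = random.Random(seed)          # seed is optional; used here for repeatability
    return rng.sample(population, n)   # sampling without replacement


# Participants numbered 1..100, as on the slide.
participants = list(range(1, 101))
picked = simple_random_sample(participants, 5, seed=42)
print(picked)
```

Because the draw is without replacement, no participant can be picked twice.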

Page 9: Blog on sampling

How to pull the sample? Probability sampling

Stratified Random Sampling

Often, the population inherently contains different sub-groups or strata, and it may be crucial to consider each of them. For this, Stratified Random Sampling can be used.

Divide the population into ‘strata’ or groups that differ in important ways, e.g. age group, region, ethnicity, religion etc.

Select random samples from within each group.

Examples of strata:
Ex. 1: Asian, American, European
Ex. 2: Children, Adults, Senior Citizens
Ex. 3: School Students, College Students, Post Graduate Students

[Figure: the population divided into Group 1, Group 2 and Group 3, with random samples taken within each group]
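
A minimal Python sketch of the idea, using the age-group strata from Ex. 2. The slides do not say how many elements to take from each stratum; taking the same fraction from every stratum (proportional allocation) is an assumption made here, as are the function name `stratified_sample` and the made-up group sizes.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, seed=None):
    """Group records into strata, then draw a simple random sample
    of the same fraction from within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)

    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: (age group, person id) pairs.
people = (
    [("Children", i) for i in range(50)]
    + [("Adults", i) for i in range(120)]
    + [("Senior Citizens", i) for i in range(30)]
)
sample = stratified_sample(people, stratum_of=lambda p: p[0], fraction=0.1, seed=1)
print(len(sample))
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the same size would not guarantee.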

Page 10: Blog on sampling

Business Example

A bottled juice manufacturer wants to conduct a survey to estimate the weekly spend on groceries in a certain region. The region has three towns: A, B and C. Town A mostly consists of farmers, Town B is mostly industrial, with factory workers and their families, and Town C is mostly elderly people who have retired. Since spending habits are likely to differ between the three towns, the towns are natural strata for Stratified Random Sampling.

Page 11: Blog on sampling

How to pull the sample? Probability sampling

Systematic Random Sampling

Each element has an equal probability of selection, but combinations of elements have different probabilities.

Example: Population (N) = 1000 and, say, we need a sample of size (n) 100. This gives us N/n, i.e. 1000/100 = 10, telling us that we need to pick every 10th element.

We then choose a random number between 1 and 10 and start from there, e.g. 3. We pick the 3rd element, then the 13th (3 + 10), the 23rd, and so on up to the 993rd.

[Figure: elements numbered 1 to 1000, with the 3rd element and every 10th element after it marked]
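
The N = 1000, n = 100 example can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes n is not larger than N.

```python
import random


def systematic_sample(population, n, seed=None):
    """Pick every k-th element (k = N // n) after a random start in 1..k."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval, N/n
    start = rng.randint(1, k)     # random start position, 1..k inclusive
    # Positions are 1-based on the slide, so subtract 1 when indexing.
    picks = [population[pos - 1] for pos in range(start, len(population) + 1, k)]
    return picks[:n]              # trims extras when N is not a multiple of n


elements = list(range(1, 1001))   # population numbered 1..1000
sample = systematic_sample(elements, 100, seed=7)
print(sample[:5])
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined, which is why combinations of elements are not all equally likely.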

Page 12: Blog on sampling

How to pull the sample? Probability sampling

Random Cluster Sampling

The population is divided into groups/clusters, usually geographic or organizational. Then some of the groups are randomly chosen.

[Figure: Group 1, Group 2, Group 3, Group 4, … Group n]

If all elements in the chosen clusters are selected, it is called Pure Cluster Sampling. If random sampling is done within the chosen clusters, it is called Simple Multi-stage Cluster Sampling.
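
Both variants can be sketched with one small Python function. The name `cluster_sample`, the `within` parameter and the made-up groups are assumptions for illustration only.

```python
import random


def cluster_sample(clusters, n_clusters, within=None, seed=None):
    """Randomly choose n_clusters whole clusters.

    within=None  -> keep every element of each chosen cluster (pure cluster sampling)
    within=m     -> draw a random sample of up to m elements inside each chosen
                    cluster (simple multi-stage cluster sampling)
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if within is None:
            sample[name] = list(members)
        else:                                           # stage 2: sample inside
            sample[name] = rng.sample(members, min(within, len(members)))
    return sample


# Hypothetical population: 5 groups of 20 elements each.
clusters = {f"Group {g}": [f"G{g}-{i}" for i in range(1, 21)] for g in range(1, 6)}
pure = cluster_sample(clusters, 2, seed=3)
two_stage = cluster_sample(clusters, 2, within=5, seed=3)
print(sorted(pure), sorted(two_stage))
```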

Page 13: Blog on sampling

How to pull the sample? Probability sampling

Stratified Cluster Sampling

This is a mix of Stratified and Cluster Sampling.

[Figure: a population divided into strata and clusters]

Page 14: Blog on sampling

How to pull the sample? Probability sampling

Let’s summarize Probability Sampling!

Simple Random Sampling

Stratified Sampling

Systematic Random Sampling

Cluster Random Sampling

Stratified Cluster Sampling

That’s great, the summary makes it clear! What about Non-Probability Sampling?

Page 15: Blog on sampling

How to pull the sample? Non-Probability sampling

Often there are restrictions that make Probability Sampling impossible. E.g. you might be doing market research on a very specific set of individuals, such as triplets. In such cases Non-Probability Sampling is used, and it does not involve random selection.

Essentially, the samples are gathered by a process that does not give all the individuals in the population an equal chance of being selected.

Since the sample elements don’t have an equal probability of being selected, one cannot use probability theory and its inferences to draw conclusions about the entire population.

Rather than being chosen at random, the sample is selected on the basis of accessibility or the personal judgement of the researcher.

Are there different types of Non-Probability Sampling?

Page 16: Blog on sampling

NON-PROBABILITY SAMPLING

Convenience Sampling

• The researcher selects the individuals who are easily accessible.
• They may or may not be tied to the purpose of the research.
• The selected sample is essentially at the right place at the right time.

Purposive Sampling

• The researcher uses judgement to select population members who he feels will give accurate information.
• Usually used when a limited number of individuals possess the trait of interest.
• E.g. the researcher picks certain cricketers who he feels will have insights into his question: ‘What makes left handers look more elegant?’

Quota Sampling

• The sample should represent the proportions of the population.
• You continue selecting samples until you get the proportional representation.
• E.g. the population has 60% women and 40% men, and the sample has the same ratio.

Snowball Sampling

• Mostly used when the sample is very rare or very limited.
• Existing elements select from amongst their acquaintances.
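
As a rough illustration of quota sampling with the 60% women / 40% men example, the sketch below accepts respondents in the order they happen to turn up until each quota is full. Nothing here is random, which is the point. The function name `quota_sample`, the respondent stream and the sample size of 10 are assumptions made for this sketch.

```python
def quota_sample(stream, group_of, quotas):
    """Take respondents in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:   # still room in this group's quota
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):   # all quotas met, stop collecting
            break
    return sample


# Hypothetical stream of respondents: women ("W") and men ("M") alternate.
respondents = [("W" if i % 2 == 0 else "M", i) for i in range(40)]

# A sample of 10 that mirrors the population: 60% women, 40% men.
sample = quota_sample(respondents, group_of=lambda r: r[0], quotas={"W": 6, "M": 4})
print(sample)
```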

Page 17: Blog on sampling

However, as soon as one decides to study a ‘part’ of the population, i.e. a sample, one has to find answers to three questions…

1. Who to sample and how do you find them? Or: what to sample and how do you find it?

2. How do you pull out a sample from the population?

3. What is the right size of the sample, i.e. how many?

Page 18: Blog on sampling

What is the right size of the sample?

3. Sample Size Estimation

You are a manager in a shampoo company and you need to confirm, with a given degree of confidence and to a given level of accuracy, the volume in the shampoo bottles. The question is: how many bottles would you need to randomly select to answer the question? Would you pick 40 bottles or 4000 bottles?

Sample size depends on:

Degree of confidence (represented by $Z$): it feels instinctively correct that if we need to be more confident, we will need more samples to be sure, and vice versa.

Accuracy, or error of estimation (represented by $\bar{X} - \mu$): again, it feels instinctive that if we need higher accuracy, we will need a bigger sample. The higher the accuracy, the narrower the margin of error.

Standard deviation (represented by $\sigma$): higher variability will mean that we need a larger sample size.

More mathematically, the Central Limit Theorem tells us that

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Rearranging it gives n (sample size):

$$n = \frac{\sigma^2 Z^2}{(\bar{X} - \mu)^2}$$

Page 19: Blog on sampling

In our example:

$\sigma$: If $\sigma$ of the population is not known, then range/4 can be used as an acceptable estimate of $\sigma$. In our case, suppose we know from historic data that the shampoo bottles could vary from 59 ml to 61 ml; then we can estimate $\sigma$ to be (61 - 59)/4 = 2/4 = 0.5.

$Z$: We need 95% confidence, hence from the z table we get $Z$ = 1.96.

$(\bar{X} - \mu)$: And we are OK with an error of ±0.1 ml in our results. Thus $(\bar{X} - \mu)$ = 0.1.

$$n = \frac{\sigma^2 Z^2}{(\bar{X} - \mu)^2} = \frac{(2/4)^2 \times 1.96^2}{(0.1)^2} = 96.04$$

As you cannot sample 96.04 bottles, the required sample size is rounded up to 97.
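
The same arithmetic can be checked with a short Python sketch. It uses the values from the example (σ = 0.5, Z = 1.96 and an error term of 0.1). The function names are hypothetical, and looking up Z with `statistics.NormalDist` instead of a printed z table is a convenience added here, not something the slides do.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-sided z value for a confidence level, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def required_sample_size(sigma, z, error):
    """n = sigma^2 * Z^2 / error^2, rounded up to a whole unit."""
    return math.ceil((sigma * z / error) ** 2)


sigma = (61 - 59) / 4          # range/4 estimate of sigma -> 0.5
z = 1.96                       # 95% confidence, from the z table
n = required_sample_size(sigma, z, error=0.1)
print(n)                       # 97
```

Rounding up rather than to the nearest integer keeps the sample at least as large as the formula requires.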

Page 20: Blog on sampling

Now that you have answers to these three questions, you know what you need to know about sampling. Happy Sampling!

1. Who to sample and how do you find them? Or: what to sample and how do you find it?

2. How do you pull out a sample from the population?

3. What is the right size of the sample, i.e. how many?