blog on sampling
TRANSCRIPT
www.Anify.net
Would you eat all the grains of rice to see if the rice is cooked?
Would you drink the entire soup to check if it tastes good?
Possibly not, unless you are very hungry.
Let’s talk about sampling today…
In real life, one usually doesn’t have the time, money or resources to reach out to the entire population.
Even if one could, would it be the best use of that time, money or those resources?
Probably not, and this is where statistics comes to the rescue! It allows us to study a part of the population and yet derive reliable inferences.
However, as soon as one decides to study a ‘part’ of the population, i.e. a sample, one has to find answers to three questions…
1. Who to sample and how do you find them? Or: what to sample and how do you find it?
2. How do you pull out a sample from the population?
3. What is the right size of the sample, i.e. how many?
1. Who to sample and how do you find them? Or: what to sample and how do you find it?
If you can answer the question ‘Who to sample from?’, you’ll know who to sample.
‘Who to sample from’ refers to the population. The definition of the population usually follows from your research objective or, in marketing research problems, from your target audience.
Can you give some examples to make it clearer?
Sure…
Example 1
Objective: To understand how adults in Mumbai feel about eating Maggi.
Population: The entire adult population of Mumbai.
Example 2
Objective: To establish whether postgraduate students in a country like simplified animated videos as a means of learning.
Population: All postgraduate students of that country.
Example 3
Objective: To detect whether shampoo bottles in a batch from a specific machine have an incorrect volume.
Population: All the bottles of shampoo in that batch from the machine.
Who to sample?
And how do we go about finding the sample?
Any information source that is comprehensive and free of bias can be used to locate the sample, e.g. census data, a class attendance roll or a telephone directory. This source of data is also called the sampling frame.
How to find the sample?
2. How do you pull out a sample from the population?
When pulling a sample out of the population, should each element of the population have the same chance (probability) of being picked?
…If so, then welcome to Probability Sampling.
Are there situations where, for other reasons, you do not have access to the entire population data at the outset? How do you then pull out a sample while ensuring an equal probability for each element?
…Do you then move past Probability Sampling into the world of Non-Probability Sampling?
How to pull the sample?
Can you tell me more about Probability Sampling first?
There are different types of probability sampling, but all have two things in common: every element has a known, non-zero probability of being sampled, and random selection is involved at some point.
Simple Random Sampling
Each element has a known and equal probability of selection, and each combination of elements of the same size has an equal probability of selection.
Examples:
1. Picking names out of a hat
2. Numbers being drawn in a lottery
3. Assigning numbers to participants and selecting a sample by generating random numbers
[Figure: participants numbered 1 to 100; random number generation (1-100) picks 75, 3, 11, 79, 14]
How to pull the sample? Probability sampling
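The third example (numbering participants and drawing random numbers) can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample` and the seed value are assumptions made for the sketch, not part of the original slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

# Participants numbered 1 to 100, as on the slide
participants = list(range(1, 101))
sample = simple_random_sample(participants, 5, seed=42)
print(sample)
```

Because the draw is without replacement, no participant can be picked twice, just like a name taken out of the hat.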
Stratified Random Sampling
Often, the population inherently contains different sub-groups, or strata, and it may be crucial to take these sub-groups into account. Stratified Random Sampling can be used for this.
• Divide the population into ‘strata’, or groups that differ in important ways, e.g. age group, region, ethnicity, religion etc.
• Select random samples from within each group.
Ex. 1: Asian, American, European
Ex. 2: Children, Adults, Senior Citizens
Ex. 3: School Students, College Students, Post Graduate Students
[Figure: the population is split into Group 1, Group 2 and Group 3, and a random sample is taken within each group]
Business Example
A bottled juice manufacturer wants to conduct a survey to estimate the weekly spend on groceries in a certain region. The region has three towns: A, B and C. Town A consists mostly of farmers, Town B is mostly industrial, with factory workers and their families, whereas Town C consists mostly of elderly people who have retired. Because spending is likely to differ from town to town, each town can be treated as a stratum and sampled at random separately.
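A minimal sketch of how stratified sampling might look for the three towns. The household IDs, the town sizes and the 10% sampling fraction are all hypothetical, and taking the same fraction from every stratum (proportional allocation) is an assumption of this sketch rather than something the slides prescribe.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Take a simple random sample of the given fraction within each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical household lists for Towns A, B and C
towns = {
    "A": [f"A-{i}" for i in range(1, 501)],   # 500 households
    "B": [f"B-{i}" for i in range(1, 301)],   # 300 households
    "C": [f"C-{i}" for i in range(1, 201)],   # 200 households
}
picked = stratified_sample(towns, fraction=0.10, seed=1)
for town, households in picked.items():
    print(town, len(households))
```

Every town is guaranteed to appear in the sample, which a single simple random draw over the whole region would not ensure.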
Systematic Random Sampling
Each element has an equal probability of selection, but combinations of elements have different probabilities.
Example: the population (N) is 1000 and, say, we need a sample of size (n) 100. This gives us N/n, i.e. 1000/100 = 10, telling us that we need to pick every 10th element.
We choose a random number between 1 and 10, e.g. 3, and start from there: the 3rd element, then the 13th (3 + 10), the 23rd, and so on up to the 993rd.
[Figure: elements numbered 1 to 1000, with the 3rd element and the (3 + 10)th element marked]
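The every-10th-element procedure can be sketched as follows. The function name `systematic_sample` is an assumption for illustration, and the sketch assumes N is a multiple of n, as in the 1000/100 example.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every k-th element after a random start, where k = N // n."""
    rng = random.Random(seed)
    k = len(population) // n        # sampling interval N/n
    start = rng.randint(1, k)       # random start between 1 and k, inclusive
    # Convert the 1-based start to a 0-based index, then step by k
    return population[start - 1::k][:n]

population = list(range(1, 1001))   # elements numbered 1 to 1000
sample = systematic_sample(population, 100, seed=7)
print(sample[:3], "...", sample[-1])
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined, which is why combinations of elements do not all have the same probability.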
Random Cluster Sampling
The population is divided into groups (clusters), usually geographic or organizational. Then some of the groups are randomly chosen.
[Figure: population divided into Group 1, Group 2, Group 3, Group 4, … Group n]
If all elements in the chosen clusters are selected, it is called Pure Cluster Sampling.
If random sampling is done within the chosen clusters, it is called Simple Multi-stage Cluster Sampling.
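Both variants of cluster sampling can be sketched with one small function. The group names, the group sizes and the `within` parameter are hypothetical choices made for this illustration.

```python
import random

def cluster_sample(clusters, n_clusters, within=None, seed=None):
    """Randomly choose whole clusters; optionally subsample inside each one.

    within=None keeps every element of a chosen cluster (pure cluster sampling);
    within=m draws m elements at random from each chosen cluster (multi-stage).
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if within is None:
            sample[name] = list(members)
        else:
            sample[name] = rng.sample(members, min(within, len(members)))
    return sample

# Hypothetical population: 5 groups of 20 elements each
clusters = {f"Group {g}": [f"G{g}-{i}" for i in range(1, 21)] for g in range(1, 6)}

pure = cluster_sample(clusters, n_clusters=2, seed=3)
multi_stage = cluster_sample(clusters, n_clusters=2, within=5, seed=3)
print({name: len(members) for name, members in pure.items()})
print({name: len(members) for name, members in multi_stage.items()})
```

Note the contrast with stratified sampling: here only some groups are visited at all, whereas stratified sampling draws from every group.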
Stratified Cluster Sampling
This is a mix of Stratified and Cluster Sampling.
[Figure: a grid of strata and clusters]
Let’s summarize Probability Sampling!
• Simple Random Sampling
• Stratified Sampling
• Systematic Random Sampling
• Cluster Random Sampling
• Stratified Cluster Sampling
That’s great, the summary makes it clear! What about Non-Probability Sampling?
Often there are restrictions that make Probability Sampling impossible, e.g. you might be doing market research on a very specific set of individuals, such as triplets. In such cases Non-Probability Sampling is used, and it does not involve random selection.
Essentially, the samples are gathered in a process that does not give all the individuals in the population a known, non-zero chance of being selected.
Since the selection probabilities are not known, one cannot use probability theory and its inferences to draw conclusions about the entire population.
Rather than being random, the sample is selected on the basis of accessibility or the personal judgement of the researcher.
Are there different types of Non-Probability Sampling?
How to pull the sample? Non-Probability sampling
NON-PROBABILITY SAMPLING
Purposive Sampling
• The researcher uses judgement to select, from the population, members who he feels will give accurate information.
• Usually used when a limited number of individuals possess the trait of interest.
• E.g. the researcher picks certain cricketers who he feels will have insights into his question, ‘What makes left-handers look more elegant?’
Convenience Sampling
• The researcher selects the individuals who are easily accessible.
• They may or may not be tied to the purpose of the research.
• The selected individuals are essentially in the right place at the right time.
Quota Sampling
• The sample should represent the proportions of the population.
• You continue selecting samples until you get the proportional representation.
• E.g. the population has 60% women and 40% men, and the sample has the same ratio.
Snowball Sampling
• Mostly used when the individuals of interest are very rare or very limited in number.
• Existing sample members select further members from among their acquaintances.
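Quota sampling is the most mechanical of these four methods, so it can be sketched in code. The respondent list below is invented for illustration, as are the function name `quota_sample` and the sample size of 10. Note that nothing here is random: people are accepted in the order the researcher happens to reach them until each quota is full.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in the order reached until every quota is filled.

    respondents: iterable of (person_id, group) pairs
    quotas: dict mapping group -> number of people wanted from that group
    """
    remaining = dict(quotas)
    sample = []
    for person, group in respondents:
        if remaining.get(group, 0) > 0:       # this group's quota is still open
            sample.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):       # all quotas filled, stop
            break
    return sample

# Hypothetical stream of people in the order they were reached
respondents = [(i, "man" if i % 3 == 0 else "woman") for i in range(1, 31)]

# A sample of 10 that mirrors a 60% women / 40% men population
sample = quota_sample(respondents, {"woman": 6, "man": 4})
print(sample)
```

The sample matches the population's proportions, but since selection depends on who was reachable first, it still cannot support probability-based inference.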
3. What is the right size of the sample, i.e. how many?
What is the right size of the sample? Sample Size Estimation
You are a manager in a shampoo company and you need to confirm, with a given degree of confidence and to a given level of accuracy, the volume in the shampoo bottle. The question is: how many bottles would you need to select at random to answer this? Would you pick 40 bottles or 4000 bottles?
Sample size depends on:
• Degree of confidence (represented by Z): it feels instinctively correct that if we need to be more confident, we will need more samples to be sure, and vice versa.
• Accuracy, or error of estimation (represented by X − μ): again, it feels instinctive that if we need higher accuracy, we will need a bigger sample. The higher the accuracy, the narrower the margin of error.
• Standard deviation (represented by σ): higher variability will mean that we need a larger sample size.
More mathematically, the Central Limit Theorem tells us that
$$Z = \frac{X - \mu}{\sigma / \sqrt{n}}$$
Rearranging it gives the sample size n:
$$n = \frac{\sigma^2 Z^2}{(X - \mu)^2}$$
In our example:
Z: We need 95% confidence; hence, from the Z table, we get Z = 1.96.
(X − μ): We are OK with an error of ±0.1 ml in our results. Thus (X − μ) = 0.1.
σ: If σ of the population is not known, then range/4 can be used as an acceptable estimate of σ. In our case, suppose we know from historical data that the shampoo bottles vary from 59 ml to 61 ml; then we can estimate σ to be (61 − 59)/4 = 2/4 = 0.5 ml.
$$n = \frac{\sigma^2 Z^2}{(X - \mu)^2} = \frac{(2/4)^2 \times 1.96^2}{(0.1)^2} = 96.04$$
As you cannot sample 96.04 bottles, round up: the required sample size is 97.
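The shampoo-bottle calculation can be reproduced with a short Python sketch. The function name `required_sample_size` is an assumption for illustration. Instead of reading Z from a table, the sketch computes it from the standard normal distribution, so it uses roughly 1.95996 rather than the rounded 1.96; the margin of 0.1 ml is the value used for (X − μ) in the calculation.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """n = (sigma * Z / margin)^2, rounded up to a whole unit."""
    # Two-sided critical value: 95% confidence leaves 2.5% in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((sigma * z / margin) ** 2)

sigma = (61 - 59) / 4     # range/4 estimate of sigma, in ml
n = required_sample_size(sigma, margin=0.1, confidence=0.95)
print(n)
```

Because the margin is squared in the denominator, the requirement grows quickly as accuracy tightens: with the same σ and confidence, a margin of 0.05 ml needs 385 bottles rather than 97.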
Now that you have answers to these three questions, you know what you need to know about sampling. Happy Sampling!
1. Who to sample and how do you find them? Or: what to sample and how do you find it?
2. How do you pull out a sample from the population?
3. What is the right size of the sample, i.e. how many?