sample size
A population is the totality of all available units in a defined area that fall within the scope or interest of the study investigator
Units may be individuals, households, families, schools, communities, villages, insects, hospitals and so forth
The population from which the data are actually collected is the survey or study population
If information is required about the health of pre-school children, then the population will be all children less than 5 years of age
It should be stated whether the result will be valid for the whole country, county or hospitals
Also, if a study is on primary health workers in Kenya, then all primary health workers, regardless of cadre, are the population of interest
A sample is “a smaller (but hopefully representative) collection of units from a population used to determine truths about that population” (Field, 2005)
Resources: time, money, personnel
Study design (prospective or retrospective)
Type of analysis required, particularly cross tabulations
Categories of the variables
Sample sizes used in previous studies by respected researchers
In most cases the population is too large to be managed within a reasonable time for the study
A representative sample of the population is therefore selected
The process of sampling involves defining the population
What is your population of interest? To whom do you want to generalize your results?
◦ All doctors
◦ School children
◦ Kenyans
◦ Women aged 15-45 years
◦ Other
Can you sample the entire population?
3 factors that influence sample representativeness:
◦ Sampling procedure
◦ Sample size
◦ Participation (response)
When might you sample the entire population?
◦ When your population is very small
◦ When you have extensive resources
◦ When you don’t expect a very high response
SAMPLING BREAKDOWN
TARGET POPULATION
STUDY POPULATION
SAMPLE
Probability (Random) Sampling
◦ Simple random sampling
◦ Systematic random sampling
◦ Stratified random sampling
◦ Multistage sampling
◦ Multiphase sampling
◦ Cluster sampling
Non-Probability Sampling
◦ Convenience sampling
◦ Purposive sampling
◦ Quota sampling
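To make three of the probability methods listed above concrete, here is a minimal Python sketch. The function names, the numbered sampling frame and the "urban"/"rural" strata are illustrative assumptions, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every unit has the same chance."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit from the frame."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction from each stratum (proportional allocation)."""
    sample = []
    for units in strata.values():
        n = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, n))
    return sample

rng = random.Random(42)                  # fixed seed so the sketch is repeatable
frame = list(range(1, 101))              # hypothetical frame of 100 numbered units
strata = {"urban": frame[:60], "rural": frame[60:]}  # hypothetical strata

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(strata, 0.10, rng)
print(srs, systematic, stratified, sep="\n")
```

With a 10% sampling fraction, the stratified draw takes 6 units from the larger stratum and 4 from the smaller one, so each stratum is represented in proportion to its size.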
The sampling process comprises several stages:
◦ Defining the population of concern
◦ Specifying a sampling frame, a set of items or events possible to measure
◦ Specifying a sampling method for selecting items or events from the frame
◦ Determining the sample size
◦ Implementing the sampling plan
◦ Sampling and data collecting
◦ Reviewing the sampling process
In the most straightforward case, such as the sentencing of a batch of material from production (acceptance sampling by lots), it is possible to identify and measure every single item in the population and to include any one of them in our sample. However, in the more general case this is not possible. There is no way to identify all rats in the set of all rats. Where voting is not compulsory, there is no way to identify in advance which people will actually vote at a forthcoming election.
As a remedy, we seek a sampling frame which has the property that we can identify every single element and include any of them in our sample.
The sampling frame must be representative of the population
A probability sampling scheme is one in which every unit in the population has a chance (greater than zero) of being selected in the sample, and this probability can be accurately determined.
When every element in the population has the same probability of selection, this is known as an 'equal probability of selection' (EPS) design. Such designs are also referred to as 'self-weighting' because all sampled units are given the same weight.
Non-probability sampling is any sampling method where some elements of the population have no chance of selection (these are sometimes referred to as 'out of coverage' or 'undercovered'), or where the probability of selection can't be accurately determined. It involves the selection of elements based on assumptions regarding the population of interest, which form the criteria for selection. Because the selection of elements is nonrandom, non-probability sampling does not allow the estimation of sampling errors.
Example: We visit every household in a given street, and interview the first person to answer the door. In any household with more than one occupant, this is a non-probability sample, because some people are more likely to answer the door (e.g. an unemployed person who spends most of their time at home is more likely to answer than an employed housemate who might be at work when the interviewer calls) and it's not practical to calculate these probabilities.
Too few subjects make estimates unreliable, imprecise, and of low power
Too many subjects are a needless waste of resources
We need to strike a balance between cost and precision
Precision: a measure of the consistency of estimates
PRIMARY OUTCOME MEASURE (qualitative or quantitative?)
Smallest effect of interest: how small a difference is to be detected, i.e. the magnitude of the effect that is clinically important and that we do not want to overlook.
Significance level: the cut-off level below which we will reject the null hypothesis, i.e. the maximum probability of incorrectly concluding that there is an effect. We usually fix this at 0.05, or occasionally 0.01, and reject the null hypothesis if the P value is less than this value.
STATISTICAL POWER TO DETECT AN ACTUAL DIFFERENCE
Variability in measurement
Study design
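The significance level and power enter the sample size formulas as standard normal quantiles (Z values). As a small illustration, the sketch below uses Python's standard-library `statistics.NormalDist` to recover the rounded values 1.96, 0.84 and 1.28 that appear in the worked examples. The helper names are hypothetical.

```python
from statistics import NormalDist

def z_for_significance(alpha):
    """Two-sided critical value: the (1 - alpha/2) standard normal quantile."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_for_power(power):
    """Standard normal quantile corresponding to the desired power."""
    return NormalDist().inv_cdf(power)

print(round(z_for_significance(0.05), 2))  # 1.96
print(round(z_for_power(0.80), 2))         # 0.84
print(round(z_for_power(0.90), 2))         # 1.28
```

A lower significance level or a higher power gives a larger Z value, which in turn increases the required sample size.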
◦ Sample size for a single estimate
◦ Sample size to compare two means
◦ Sample size for a single proportion
◦ Sample size for two proportions
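For a single estimate with tolerated error $d$ and standard deviation $\sigma$:
$$n = \frac{(Z_{1-\alpha/2})^2\,\sigma^2}{d^2}$$
When a specified power $1-\beta$ is also required:
$$n = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^2\,\sigma^2}{d^2}$$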
Prevalence of the outcome measure
Standard deviation of the variable in the population, if quantitative
The calculation can be done manually with formulas or with the Epi Info software package
A health officer wishes to estimate the mean haemoglobin in a defined community. Preliminary information is that this mean is about 150mg/l with a SD of 32mg/l. If a sampling error of up to 5mg/l in the estimate is to be tolerated, how many subjects should be included in the study?
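SD = 32 mg/l, d = 5 mg/l, Z = 1.96
$$n = \frac{Z^2\,\sigma^2}{d^2} = \frac{1.96^2 \times 32^2}{5^2} \approx 157.4$$
Rounding up, about 158 subjects should be included.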
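The arithmetic for the haemoglobin example can be checked with a short Python sketch of the single-estimate formula n = Z² × SD² / d². The function name is a hypothetical helper, and Z = 1.96 is the rounded value given in the example.

```python
import math

def sample_size_single_mean(sd, d, z=1.96):
    """Unrounded sample size for estimating one mean: (z * sd / d) squared."""
    return (z * sd / d) ** 2

n = sample_size_single_mean(sd=32, d=5)
print(round(n, 2))   # 157.35
print(math.ceil(n))  # 158, rounding up to whole subjects
```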
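For comparing two means, the sample size in each group is:
$$n = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^2\,\sigma^2}{d^2}$$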
Suppose the prevalence of brucella infection is 2% and the absolute difference to be detected is 0.25%, with 95% confidence. What is the sample size required?
p = 0.02, q = 1 - p = 0.98, Z = 1.96, d = 0.0025
$$n = \frac{Z^2\,p\,q}{d^2} = \frac{1.96^2 \times 0.02 \times 0.98}{0.0025^2} \approx 12047$$
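The brucella example uses the single-proportion formula n = Z² × p × (1 − p) / d². A minimal Python sketch, with a hypothetical function name and the rounded Z = 1.96 from the example:

```python
def sample_size_single_proportion(p, d, z=1.96):
    """Unrounded sample size for estimating one proportion p to within +/- d."""
    return z ** 2 * p * (1 - p) / d ** 2

n = sample_size_single_proportion(p=0.02, d=0.0025)
print(round(n, 2))  # 12047.26
```

Note that p and d are entered as proportions (0.02 and 0.0025), not as percentages.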
Suppose investigators want to compare heart rate in patients with essential hypertension and high catecholamine levels with heart rate in patients with essential hypertension and low catecholamine levels. They are willing to accept a type 1 error (incorrectly concluding that there is a difference in heart rate) of 0.05, and they want a probability of 0.80 of detecting a true difference. The investigators decide that a difference of 10 or more beats per minute is clinically significant, and that an estimate of the SD in heart rate is 15 beats per minute.
Calculate the sample size.
Solution:
$$n = 2\left[\frac{(1.96 + 0.84)(15)}{10}\right]^2 = 2\,(4.2)^2 = 35.28 \approx 36$$
Therefore 36 patients are needed in each group if the investigators want to have an 80% chance (or 80% power) of detecting a difference of 10 or more beats per minute.
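The heart-rate calculation can be reproduced with a short Python sketch of the two-means formula. The function and parameter names are hypothetical, and the defaults are the rounded Z values from the example (1.96 for a 0.05 type 1 error, 0.84 for 80% power).

```python
import math

def sample_size_two_means(sd, diff, z_alpha=1.96, z_power=0.84):
    """Unrounded sample size per group for detecting a difference in two means."""
    return 2 * ((z_alpha + z_power) * sd / diff) ** 2

n = sample_size_two_means(sd=15, diff=10)
print(round(n, 2))   # 35.28
print(math.ceil(n))  # 36 patients in each group
```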
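Sample size per group for comparing two proportions, where $P_c$ is the proportion in the control group and $P_t$ the proportion in the treatment group:
$$n = \left[\frac{z_\alpha\sqrt{2P_c(1-P_c)} - z_\beta\sqrt{P_t(1-P_t) + P_c(1-P_c)}}{P_t - P_c}\right]^2$$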
A study involved a trial of J5 antiserum in surgical patients to determine whether it is effective in preventing gram-negative infections.
The investigators want to estimate the sample size needed to detect a reduction in the proportion of patients who experience shock from the 10% level seen in the investigators' previous experience (Pc) to 5% or less (Pt) if patients are given transfusions from donors treated with J5.
They are willing to accept a type 1 error (falsely concluding that there is a difference when there really is none) of 0.05, and they want a 0.90 probability of detecting a true difference.
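$z_\alpha = 1.96$, $z_\beta = -1.28$, $P_c = 0.10$, $P_t = 0.05$. Therefore:
$$n = \left(\frac{1.306}{0.05}\right)^2 = 682.46$$
Rounding up, 683 patients are needed in each group.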
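The J5 calculation can be checked with the Python sketch below. It assumes the two-proportion formula in which the power term is entered as a negative lower-tail value (z_beta = −1.28 for 90% power), which is the convention that reproduces the 1.306 numerator in the example. The function name is hypothetical.

```python
import math

def sample_size_two_proportions(pc, pt, z_alpha=1.96, z_beta=-1.28):
    """Unrounded sample size per group for comparing two proportions.

    pc: proportion expected in the control group
    pt: proportion expected in the treatment group
    z_beta is the lower-tail (negative) value, so it is subtracted.
    """
    numerator = (z_alpha * math.sqrt(2 * pc * (1 - pc))
                 - z_beta * math.sqrt(pt * (1 - pt) + pc * (1 - pc)))
    return (numerator / (pt - pc)) ** 2

n = sample_size_two_proportions(pc=0.10, pt=0.05)
print(round(n, 2))   # 682.46
print(math.ceil(n))  # 683 patients in each group
```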
Sample size increases when:
◦ The difference to detect is small
◦ The power is high
◦ The significance level is low
◦ The variation is large
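These four effects can be seen numerically with a small Python sketch based on the two-means formula. The function name and the baseline values (SD 15, difference 10, significance 0.05, power 0.80) are assumptions chosen for illustration. Z values are computed with the standard-library `statistics.NormalDist` rather than rounded table values.

```python
from statistics import NormalDist

def n_per_group(sd, diff, alpha=0.05, power=0.80):
    """Unrounded sample size per group for comparing two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_power) * sd / diff) ** 2

baseline = n_per_group(sd=15, diff=10)
scenarios = {
    "baseline": baseline,
    "smaller difference (5)": n_per_group(sd=15, diff=5),
    "higher power (0.90)": n_per_group(sd=15, diff=10, power=0.90),
    "lower significance (0.01)": n_per_group(sd=15, diff=10, alpha=0.01),
    "larger variation (SD 30)": n_per_group(sd=30, diff=10),
}
for label, n in scenarios.items():
    print(f"{label}: {n:.1f}")
```

Each changed scenario gives a larger sample size than the baseline. Halving the difference to detect, or doubling the SD, multiplies the required sample size by four.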
THANK YOU