4. Sampling Design (09.02.10)
TRANSCRIPT
-
8/8/2019 4. Sampling Design (09.02.10)
1/43
-
Sampling
The process of using a small number of items or parts of a larger population to make conclusions about the whole population
Sample
A subset or some part of a larger population
-
Population
A complete group of entities sharing some common set of characteristics
Population Element
An individual member of a specific population
Census
An investigation of all the individual elements making up a population
-
* Pragmatic Reasons: cuts costs, reduces labor requirements, gathers vital information quickly.
* Accurate and Reliable Results: population elements are highly homogeneous.
* Destruction of Test Units.
-
* The list of elements from which a sample may be drawn; also called the working population.
* Mailing List: a list of the names, addresses and phone numbers of specific populations.
* Reverse Directory
* Sampling Frame Error: occurs when certain sample elements are not listed or available and are not represented in the sampling frame.
-
* A single element or group of elements subject to selection in the sample
* Primary Sampling Unit (PSU)
* Secondary Sampling Unit (SSU)
-
Define the target population
Select a Sampling Unit
Select a Sampling Frame
Determine: Probability Sampling or Non-probability Sampling
Determine The Sample Size
Parameters of Interest
Budgetary Constraint
-
Cost of an incorrect inference resulting from the data
Cost of collecting the data
-
* Non-sampling Error / Systematic Bias
* Random Sampling Error
-
* Inappropriate Sampling Frame
* Defective Measuring Device
* Non-respondents
* Indeterminacy Principle
* Natural Bias in Reporting of Data
-
* A statistical fluctuation that occurs because of chance variation in the elements selected for a sample.
* The measurement of this error is called the precision of the sampling plan.
* It is a function of sample size: it decreases as the sample size increases.
-
* It must result in a truly representative sample.
* It should give a small sampling error.
* It should be within the cost constraint.
* It should control the systematic bias in a better way.
* It should allow the results of the sample study to be applied with reasonable confidence.
-
Element Selection Technique | Representation Basis: Probability Sampling | Representation Basis: Non-probability Sampling
Unrestricted Sampling | Simple Random Sampling | Haphazard/Convenience Sampling
Restricted Sampling | Complex Random Sampling (Cluster/Systematic/Stratified Sampling) | Purposive Sampling (Quota/Judgement Sampling)
-
* Convenience Sampling
* Judgement Sampling
* Quota Sampling
* Snowball Sampling: initial respondents are selected by probability methods and additional respondents are obtained from information provided by the initial respondents.
-
Ensures the Law of Statistical Regularity, i.e. a random sample will have the same composition and characteristics as the universe, and we can measure the errors of estimation.
In short, it implies:
* It gives each element in the population an equal probability of getting into the sample, and all choices are independent of each other.
* It gives each possible sample combination an equal probability of being chosen.
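The two properties above can be illustrated with a minimal sketch. The numbered population of 100 units, the sample size and the seed are hypothetical values chosen only for illustration; `random.sample` draws without replacement so that every subset of the requested size is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)          # seed only makes the illustration repeatable
    return rng.sample(population, n)

# Hypothetical numbered list of 100 population elements.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```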
-
Tippett gave 10,400 four-figure numbers, formed from 41,600 digits taken from census reports and combined into fours to give random numbers.
First 30 sets of Tippett's numbers:
2952 6641 3992 9792 7979 5911 3170 5624
4167 9525 1545 1396 7203 5356 1300 2693
2370 7483 3408 2769 3563 6107 6913 7691
0560 5246 1112 9025 6008 8126
Used only when lists are available and items are readily numbered.
-
Caution: if there is a hidden periodicity in the population, a systematic sample can be biased.
The population list should be in random order; systematic selection is then equivalent to random sampling.
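The periodicity caution can be seen in a small sketch, assuming this slide refers to systematic sampling (every k-th element after a random start). The weekly sales figures, the helper name `systematic_sample` and the seeds are hypothetical.

```python
import random

def systematic_sample(items, n, seed=None):
    """Take every k-th item after a random start, with k = len(items) // n."""
    k = len(items) // n
    start = random.Random(seed).randrange(k)
    return items[start::k][:n]

# Hypothetical daily sales over 8 weeks: every 7th day is a weekend peak.
sales = [100, 100, 100, 100, 100, 100, 300] * 8

# k = 56 // 8 = 7 coincides with the weekly cycle, so all picks share one weekday.
picked = systematic_sample(sales, 8, seed=1)
print(picked)

# Putting the list in random order first breaks the link between position and value.
shuffled = sales[:]
random.Random(0).shuffle(shuffled)
print(systematic_sample(shuffled, 8, seed=1))
```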
-
a) How to form strata?
b) How should items be selected from each stratum?
c) How many items should be selected from each stratum, i.e. how to allocate the sample size to each stratum?
-
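Proportional Allocation
$n_i = n \cdot \dfrac{N_i}{P}$, where $n$ = total sample size, $i$ = stratum number, $N_i$ = size of stratum $i$, $P$ = population size
Optimum Allocation
Accounts for variability, cost, etc.
$n_i = \dfrac{n \cdot N_i \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + \cdots + N_k \sigma_k}$, where $\sigma_i$ = standard deviation of stratum $i$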
-
* Cluster Sampling
* Area Sampling
* Multi-stage Sampling
* Sequential Sampling
-
* Used when cluster sampling units do not have approximately the same number of elements.
* Indicates which clusters, and how many elements from each cluster, are to be selected by simple random sampling or systematic sampling.
* Example: 15 cities have the following numbers of departmental stores:
  35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28
  Select a sample of 10 stores using this technique.
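One way to work this example is sketched below, assuming the technique meant here is sampling with probability proportional to size: cumulate the store counts, divide the total by the sample size to get a sampling interval, and lay equally spaced points over the cumulative totals. The function name `pps_systematic` and the start value 10 are illustrative assumptions; in practice the start would be drawn at random between 1 and the interval.

```python
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, n, start):
    """Return 1-based cluster numbers hit by n equally spaced points
    laid over the cumulative sizes (probability proportional to size)."""
    cum = list(accumulate(sizes))              # cumulative totals
    interval = cum[-1] // n                    # sampling interval
    points = [start + j * interval for j in range(n)]
    # A point p belongs to the first cluster whose cumulative total is >= p.
    return [bisect_left(cum, p) + 1 for p in points]

stores = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28]

# start=10 is an arbitrary illustrative start between 1 and 50.
cities = pps_systematic(stores, 10, start=10)
print(cities)   # a city listed twice contributes two stores
```

Within each selected city, the indicated number of stores would then be chosen by simple random or systematic sampling.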
-
* Instantaneous Surveys: Advantages and Disadvantages
  Response is too rapid
* Lack of Computer Ownership
* Only Internet Users: young, better educated, more affluent
* Unrestricted Samples: convenience samples may not be representative
-
* SurveySite conducts pop-up surveys.
* Panel Samples: drawing probability samples from a pre-recruited membership panel is a popular, scientific and effective method.
* Harris Interactive Inc.: propensity weighting scheme; panel of 6.5 million; parallel studies.
* Recruited Ad Hoc Samples: create a sampling frame of e-mail addresses.
* Opt-in Lists: respondents give permission to receive selected e-mail, such as questionnaires, from a company with an internet presence.
Survey Sampling Incorporation: the company provides sampling frames and scientifically drawn samples.
-
A certain population is divided into 5 strata so that N1 = 2000, N2 = 2000, N3 = 1800, N4 = 1700 and N5 = 2500. The respective standard deviations are 1.6, 2.0, 4.4, 4.8 and 6.0. The expected sampling cost in the first two strata is Rs. 4 per interview and in the remaining three it is Rs. 6 per interview. How should a sample of size n = 226 be allocated to the five strata if we adopt a proportionate sampling design; if we adopt a disproportionate sampling design considering
i. only the differences in stratum variability
ii. differences in variability as well as differences in stratum sampling costs
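A sketch of the three allocations follows. The proportional and variability-based weights follow the allocation formulas in this deck. The cost-adjusted weight, stratum size times standard deviation divided by the square root of the unit cost, is the usual textbook form and is an assumption here, since the deck does not print it. The helper name `allocate` is hypothetical, and fractional results would still need rounding to whole interviews.

```python
from math import sqrt

N = [2000, 2000, 1800, 1700, 2500]     # stratum sizes
sd = [1.6, 2.0, 4.4, 4.8, 6.0]         # stratum standard deviations
cost = [4, 4, 6, 6, 6]                 # cost per interview (Rs.)
n = 226                                # total sample size

def allocate(weights, n):
    """Split n across strata in proportion to the given weights."""
    total = sum(weights)
    return [n * w / total for w in weights]

# Proportionate design: weight = stratum size.
proportional = allocate(N, n)

# (i) Variability only: weight = N_i * sigma_i.
by_variability = allocate([Ni * si for Ni, si in zip(N, sd)], n)

# (ii) Variability and cost (assumed textbook form): weight = N_i * sigma_i / sqrt(c_i).
by_variability_and_cost = allocate(
    [Ni * si / sqrt(ci) for Ni, si, ci in zip(N, sd, cost)], n
)

for name, alloc in [
    ("proportional", proportional),
    ("variability", by_variability),
    ("variability and cost", by_variability_and_cost),
]:
    print(name, [f"{x:.1f}" for x in alloc])
```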
-
* Frequency Distribution
* Central Tendency: Mean, Median, Mode
* Measures of Dispersion
* Range
* Standard Deviation: $S = \sqrt{\dfrac{\sum (X_i - \bar{X})^2}{n-1}}$
* The Normal Distribution: $Z = \dfrac{X - \mu}{\sigma}$
-
* Population Distribution
* Sample Distribution
* Sampling Distribution: take a certain number of samples and for each sample compute various statistical measures; each sample will have its own values of mean, SD, etc.
* Standard Error of the Mean: $S_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
-
Mean:
* Sampling Distribution of Mean
* Sampling Distribution of Proportion
* Student's t-distribution
Variance:
* F distribution
* Chi-square distribution
-
* Probability distribution of all possible means of random samples of a given size that we take from a population
* $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
Eg: The annual income of employees in an industry follows a normal distribution with mean and variance of Rs. 4 lakhs and Rs. 1 lakh respectively. A random sample of size 49 is taken from an infinite normal population. What is the probability that the sample mean is greater than Rs. 4.25 lakhs?
$n = 49$, $\mu = 4$ lakhs, $\sigma^2 = 1$ lakh
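A minimal sketch of this calculation, working in lakhs and assuming the stated variance of 1 corresponds to a standard deviation of 1 lakh. It uses `statistics.NormalDist` from the Python standard library for the standard normal tail area.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4.0, 1.0, 49      # lakhs; sigma = 1 is an assumption from variance = 1
se = sigma / sqrt(n)             # standard error of the mean = 1/7
z = (4.25 - mu) / se             # standardised sample mean
p = 1 - NormalDist().cdf(z)      # upper-tail probability P(sample mean > 4.25)
print(f"Z = {z:.2f}, P = {p:.4f}")
```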
-
* Finite Population Multiplier: $\sqrt{\dfrac{N-n}{N-1}}$
Eg: The age of employees in a company follows a normal distribution with mean and variance of 40 yrs and 121 yrs respectively. If a random sample of 36 employees is taken from a finite population of size 1000, what is the probability that the sample mean is (a) less than 45, (b) greater than 42, (c) between 40 and 42?
$n = 36$, $N = 1000$, $\mu = 40$ yrs, $\sigma^2 = 121$ yrs, $\sigma = 11$ yrs
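The example can be sketched as follows, assuming the finite population multiplier is applied to the standard error of the mean before standardising. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, N = 40, 11, 36, 1000

# Standard error with the finite population multiplier sqrt((N - n) / (N - 1)).
se = (sigma / sqrt(n)) * sqrt((N - n) / (N - 1))

xbar = NormalDist(mu, se)              # sampling distribution of the mean
p_a = xbar.cdf(45)                     # (a) P(mean < 45)
p_b = 1 - xbar.cdf(42)                 # (b) P(mean > 42)
p_c = xbar.cdf(42) - xbar.cdf(40)      # (c) P(40 < mean < 42)
print(f"SE = {se:.4f}, (a) {p_a:.4f}, (b) {p_b:.4f}, (c) {p_c:.4f}")
```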
-
* Statistics of attributes; binomial distribution
* $\bar{p}$ = proportion of successes = $X/n$, $q$ = proportion of failures
* $n$ = sample size, $1 - p = q$
* Mean = $np$, $\sigma^2 = npq$
* $\bar{p} \sim N\left[p, \dfrac{p(1-p)}{n}\right]$
Eg: The personnel manager of a company feels that 52% of the employees will have enhanced skill after attending the training. A sample of records of 49 employees who attended the training reveals that only 24 of them have enhanced skill after attending the program. Find the probability that the sample of employees who attended have enhanced their skill.
$p = 0.52$, $n = 49$, $q = 1 - p = 0.48$, $\bar{p} = 24/49$
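The question as transcribed is ambiguous, so the sketch below assumes one plausible reading: the probability that the sample proportion is at most 24/49 when the true proportion is 0.52, using the normal approximation stated on this slide.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.52, 49
p_bar = 24 / n                      # observed sample proportion
se = sqrt(p * (1 - p) / n)          # standard error of the proportion
z = (p_bar - p) / se
prob = NormalDist().cdf(z)          # assumed reading: P(sample proportion <= 24/49)
print(f"Z = {z:.3f}, P = {prob:.3f}")
```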
-
* $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ with $(n-1)$ degrees of freedom
Eg: The annual sales of dealers of a company follow a normal distribution with a mean of Rs. 94 lakhs. A random sample of 10 dealers of the company is taken from the normal population. The variance of the annual sales of these 10 dealers is Rs. 81 lakhs. Find the probability that the mean annual sales of the sample is (a) less than Rs. 98 lakhs, (b) more than Rs. 98 lakhs.
$n = 10$, $\mu = 94$ lakhs, $S = 9$ lakhs
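The statistic for this example works out as below. This is a worked sketch only; the tail probabilities themselves are read from a t-table with 9 degrees of freedom.

```latex
t = \frac{\bar{X} - \mu}{S/\sqrt{n}}
  = \frac{98 - 94}{9/\sqrt{10}}
  = \frac{4}{2.846}
  \approx 1.405, \qquad \text{d.f.} = n - 1 = 9
```

Since the tabulated value $t_{0.10}$ for 9 degrees of freedom is 1.383, the probability in (b) is slightly below 0.10 and the probability in (a) is slightly above 0.90.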
-
* $\chi^2 = \dfrac{(n-1)S^2}{\sigma^2}$ with $(n-1)$ degrees of freedom.
Eg: A random sample of 20 dealers of a company is taken from a normal population. The variance of the annual sales of dealers from the normal population and that of the random sample of 20 dealers are Rs. 81 lakhs and Rs. 125 lakhs respectively. Compute the chi-square statistic and find the probability that the chi-square variable is more than the calculated chi-square statistic.
$n = 20$, $S^2 = 125$, $\sigma^2 = 81$
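Substituting the given values yields the statistic below. This is a worked sketch; the probability is read from a chi-square table with 19 degrees of freedom.

```latex
\chi^2 = \frac{(n-1)S^2}{\sigma^2}
       = \frac{19 \times 125}{81}
       = \frac{2375}{81}
       \approx 29.32, \qquad \text{d.f.} = n - 1 = 19
```

The tabulated upper-tail values for 19 degrees of freedom are 27.204 at 0.10 and 30.144 at 0.05, so the required probability lies between 0.05 and 0.10.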
-
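* Ratio of two chi-square variables, each divided by its degrees of freedom:
$F = \dfrac{\left[(n_1-1)S_1^2/\sigma_1^2\right]/(n_1-1)}{\left[(n_2-1)S_2^2/\sigma_2^2\right]/(n_2-1)}$ with $(n_1-1)$ and $(n_2-1)$ degrees of freedom
If $\sigma_1 = \sigma_2$, $F = S_1^2/S_2^2$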
-
* Eg: 2 independent samples of students of a programme under distance education are taken from normal populations with the same variance. The size and variance of marks of the first sample are 8 and 100 respectively. The size and variance of marks of the second sample are 20 and 40 respectively.
(a) What is the calculated F-statistic?
(b) What is the probability that the F-ratio is more than the calculated F-statistic?
$n_1 = 8$, $n_2 = 20$, $S_1^2 = 100$, $S_2^2 = 40$
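Because the two populations are stated to have the same variance, part (a) reduces to the ratio of the sample variances, as this worked sketch shows:

```latex
F = \frac{S_1^2}{S_2^2} = \frac{100}{40} = 2.5,
\qquad \text{d.f.} = (n_1 - 1,\; n_2 - 1) = (7,\; 19)
```

For part (b), the probability that the F-ratio exceeds 2.5 is read from an F-table with 7 and 19 degrees of freedom.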
-
* Confidence Level: a percentage or decimal value that tells how confident a researcher can be about being correct. It states the long-run percentage of the time that a confidence interval will include the true population value.
* Confidence Interval Estimate: gives the estimated value of the population parameter, plus or minus an estimate of error.
$\mu = \bar{X} \pm Z_{c.l.} S_{\bar{X}}$
-
* A personnel manager believes age will be a useful criterion for placement. Successful women at the supervisory level are sampled. The mean age of 100 women is 37.5 yrs, with a standard deviation of 12 yrs. Knowing that it would be extremely coincidental if the point estimate from the sample were exactly the same as the population mean ($\mu$), you decide to construct a confidence interval around the sample mean. (Area 0.475: z-value 1.96)
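The interval for this example can be computed as in the sketch below, assuming the sample standard deviation of 12 is used to estimate the standard error. Variable names are illustrative.

```python
from math import sqrt

x_bar, s, n = 37.5, 12, 100     # sample mean, sample SD, sample size
z = 1.96                        # z-value for a 95% confidence level

se = s / sqrt(n)                # estimated standard error of the mean = 1.2
margin = z * se                 # half-width of the interval
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.2f} to {upper:.2f} years")
```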
-
* Increasing the sample size decreases the width of the confidence interval at a given confidence level (as n is in the denominator).
Confidence Interval: $\mu = \bar{X} \pm Z_{c.l.} \dfrac{\sigma}{\sqrt{n}}$
If the sample size $n < 30$, use the t-distribution; if $n$ is large, use the normal distribution.
-
Confidence Interval $= \bar{p} \pm Z_{c.l.} \sqrt{\dfrac{pq}{n}}$
Eg:
-
3 factors are required:
* Variance, or heterogeneity, of the population ($S$)
* Magnitude of acceptable error ($E$)
* Confidence level ($Z_{c.l.}$)
-
* Sample Size: $n = \left(\dfrac{ZS}{E}\right)^2$
Eg: Survey research wants a 95% confidence level and a range of error E of less than Rs. 2.00
-
* Either do a pilot study and find the variance
* Or, if the range of variation is known, use the normal-distribution rule of thumb: the range of the variable spans about ±3 standard deviations, so S ≈ range/6
-
Some amusement park visitors might spend nearly nothing on souvenirs; others might visit several amusement parks in a year and buy a lot of souvenirs every time. Suppose that 5 days a year were considered typical of the upper limit, and food and souvenir expenses were calculated at INR 90 per day; the total upper limit would be INR 450.
With a range of INR 450, the estimated standard deviation would be 450/6 = 75.
Desired precision of 25 and 95% confidence interval (Z ≈ 2):
Sample size $n = \dfrac{Z^2 S^2}{E^2} = \dfrac{(2)^2 (75)^2}{(25)^2} = 36$
Suppose these observations generate mean $\bar{x} = 35$ and SD $S = 60$.
Then confidence interval $= 35 \pm 2 \cdot \dfrac{60}{\sqrt{36}} = 35 \pm 20$, or $15 \le \mu \le 55$.
Desired precision was INR 25; we got INR 20.
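The arithmetic of this example can be checked with a short sketch. It follows the slide in using Z = 2 as a rounded value for 95% confidence and in estimating the standard deviation as one sixth of the range. Variable names are illustrative.

```python
from math import sqrt, ceil

z = 2                    # rounded z-value for 95% confidence, as on the slide
e = 25                   # desired precision (INR)
s_est = 450 / 6          # range / 6 rule of thumb gives 75

# Planning stage: required sample size.
n = ceil((z * s_est / e) ** 2)

# After the survey: interval from the observed mean and SD.
x_bar, s = 35, 60
margin = z * s / sqrt(n)
print(n, x_bar - margin, x_bar + margin)
```

The achieved precision is tighter than planned because the observed SD of 60 is smaller than the planning estimate of 75.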
-
$n = \dfrac{Z_{c.l.}^2 \, pq}{E^2}$
Eg: The manager of a bank feels that 35% of branches will have enhanced yearly collection of deposits after introducing a hike in the interest rate. Determine the sample size such that the mean proportion is within plus or minus 0.06 at a confidence level of 90%.
$p = 0.35$, $q = 0.65$, C.L. $= 0.9$, $Z_{0.45} = 1.645$, $E = 0.06$
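A sketch of the calculation with the values listed above. Rounding the result up to the next whole branch is a conservative choice assumed here; a source that rounds to the nearest integer would report one less.

```python
from math import ceil

z, p, e = 1.645, 0.35, 0.06
q = 1 - p

n_raw = z ** 2 * p * q / e ** 2     # formula value before rounding
n = ceil(n_raw)                     # assumption: round up so the error bound is met
print(f"{n_raw:.2f} -> {n}")
```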