4. Sampling Design (09.02.10)

Upload: sehaj01

Post on 10-Apr-2018


  • 8/8/2019 4. Sampling Design (09.02.10)



    Sampling
    The process of using a small number of items or parts of a larger population to draw conclusions about the whole population.

    Sample
    A subset, or some part, of a larger population.


    Population (Universe)
    A complete group of entities sharing some common set of characteristics.

    Population Element
    An individual member of a specific population.

    Census
    An investigation of all the individual elements making up a population.


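    Reasons for sampling:
    - Pragmatic reasons: sampling cuts costs, reduces labor requirements, and gathers vital information quickly.
    - Accurate and reliable results: when population elements are highly homogeneous, a sample represents the population well.
    - Destruction of test units: when testing destroys the item, a census is not feasible.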


    Sampling Frame
    - The list of elements from which a sample may be drawn; also called the working population.
    - Mailing list: a list of the names, addresses and phone numbers of specific populations.
    - Reverse directory
    - Sampling frame error: occurs when certain sample elements are not listed or available and so are not represented in the sampling frame.


    Sampling Unit
    - A single element or group of elements subject to selection in the sample
    - Primary Sampling Unit (PSU)
    - Secondary Sampling Unit (SSU)


    - Define the target population
    - Select a sampling unit
    - Select a sampling frame
    - Determine whether to use probability or non-probability sampling
    - Determine the sample size
    - Identify the parameters of interest
    - Account for the budgetary constraint


    - Cost of an incorrect inference resulting from the data
    - Cost of collecting the data


    - Non-sampling error (systematic bias)
    - Random sampling error


    Causes of systematic bias:
    - Inappropriate sampling frame
    - Defective measuring device
    - Non-respondents
    - Indeterminacy principle
    - Natural bias in the reporting of data


    Random sampling error:
    - A statistical fluctuation that occurs because of chance variation in the elements selected for a sample.
    - The measurement of this error is called the precision of the sampling plan.
    - It is a function of sample size: it decreases as the sample size increases.


    Characteristics of a good sample design:
    - It must result in a truly representative sample.
    - It should give a small sampling error.
    - It should be within the cost constraint.
    - It should control systematic bias as well as possible.
    - It should allow the results of the sample study to be applied to the population with reasonable confidence.


    Element selection        Representation basis
    technique                Probability sampling                  Non-probability sampling
    -----------------------  ------------------------------------  ------------------------------
    Unrestricted sampling    Simple random sampling                Haphazard/convenience sampling
    Restricted sampling      Complex random sampling               Purposive sampling
                             (cluster/systematic/stratified)       (quota/judgement)


    - Convenience sampling
    - Judgement sampling
    - Quota sampling
    - Snowball sampling: initial respondents are selected by probability methods, and additional respondents are obtained from information provided by the initial respondents.


    Random sampling ensures the Law of Statistical Regularity, i.e. a random sample will, on average, have the same composition and characteristics as the universe, and we can measure the errors of estimation.

    In short, it implies:
    - It gives each element in the population an equal probability of getting into the sample, and all choices are independent of each other.
    - It gives each possible sample combination an equal probability of being chosen.
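    The two properties above can be illustrated with a minimal sketch. The frame of 100 numbered units, the sample size of 10 and the seed are assumptions made only for illustration; `random.sample` draws without replacement so that every subset of the requested size is equally likely.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)          # a seed only makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement


# Hypothetical sampling frame: units numbered 1..100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sorted(sample))
```

    The seed is only there so that the draw can be reproduced; omit it for a fresh sample each time.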


    Tippett's random numbers
    Tippett gave 10,400 four-figure numbers, formed by taking 41,600 digits from census reports and combining them in fours.

    First 30 sets of Tippett's numbers:
    2952 6641 3992 9792 7979 5911 3170 5624
    4167 9525 1545 1396 7203 5356 1300 2693
    2370 7483 3408 2769 3563 6107 6913 7691
    0560 5246 1112 9025 6008 8126

    Used only when lists are available and the items are readily numbered.
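    One common way to use such a table is sketched below: read the numbers in order and keep those that fall within the numbered frame. The population of 5,000 numbered items and the sample size of 10 are assumptions chosen for illustration, not part of the slide.

```python
# First 30 four-figure numbers from the slide (0560 is written as 560).
TIPPETT_FIRST_30 = [
    2952, 6641, 3992, 9792, 7979, 5911, 3170, 5624,
    4167, 9525, 1545, 1396, 7203, 5356, 1300, 2693,
    2370, 7483, 3408, 2769, 3563, 6107, 6913, 7691,
    560, 5246, 1112, 9025, 6008, 8126,
]


def select_units(random_numbers, population_size, sample_size):
    """Keep the first `sample_size` distinct numbers that lie in 1..population_size."""
    chosen = []
    for number in random_numbers:
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Hypothetical frame: items numbered 1..5000, sample of 10
selected = select_units(TIPPETT_FIRST_30, 5000, 10)
print(selected)
# [2952, 3992, 3170, 4167, 1545, 1396, 1300, 2693, 2370, 3408]
```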


    Systematic sampling
    Caution: if there is a hidden periodicity in the population, a systematic sample can be biased.

    If the population list is in random order, systematic sampling is equivalent to random sampling.
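    The periodicity caution can be made concrete with a small sketch. The function name, the 100-unit frame and the 70-day weekday list are hypothetical; the interval is taken as the population size divided by the sample size, rounded down.

```python
def systematic_sample(items, n, start=0):
    """Take every k-th item, k = len(items) // n, beginning at index `start`."""
    k = len(items) // n
    if k < 1 or not 0 <= start < k:
        raise ValueError("need len(items) >= n and 0 <= start < interval")
    return [items[start + i * k] for i in range(n)]


# Hypothetical frame of 100 numbered units: interval k = 10
units = list(range(1, 101))
print(systematic_sample(units, 10, start=3))   # [4, 14, 24, ..., 94]

# Hidden periodicity: 70 days coded by weekday (0..6), interval k = 7.
# Every selected day falls on the same weekday, so the sample is biased.
weekdays = [day % 7 for day in range(70)]
same_day = systematic_sample(weekdays, 10, start=0)
print(set(same_day))                           # {0}
```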


    a) How should the strata be formed?
    b) How should items be selected from each stratum?
    c) How many items should be selected from each stratum, i.e. how should the sample size be allocated to each stratum?


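    Proportionate allocation:
    $n_i = n \cdot \dfrac{N_i}{P}$
    where $n$ = total sample size, $i$ = stratum number, $N_i$ = size of stratum $i$, $P$ = population size.

    Optimum allocation (accounts for variability, cost, etc.):
    $n_i = \dfrac{n \, N_i \sigma_i}{N_1\sigma_1 + N_2\sigma_2 + \dots + N_k\sigma_k}$
    where $\sigma_i$ is the standard deviation of stratum $i$.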


    - Cluster sampling
    - Area sampling
    - Multi-stage sampling
    - Sequential sampling


    Sampling with probability proportional to size
    - Used when cluster sampling units do not have approximately the same number of elements.
    - Indicates which clusters, and how many elements from each cluster, are to be selected by simple random sampling or systematic sampling.
    - Ex: 15 cities have the following numbers of departmental stores:
      35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28
      Select a sample of 10 stores using this technique.
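    One way to work this exercise is to lay the cities end to end on a cumulative scale and select systematically along it. The sketch below does that; the starting point of 10 is an assumption (the slide does not give one), and the function name is hypothetical.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

stores = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28]


def pps_systematic(sizes, n, start):
    """Count how many of n systematic picks fall in each cluster (1-based)."""
    cumulative = list(accumulate(sizes))        # 35, 52, 62, ..., 500
    interval = cumulative[-1] // n              # 500 // 10 = 50
    picks = [start + i * interval for i in range(n)]
    # A pick belongs to the first cluster whose cumulative total reaches it.
    return Counter(bisect_left(cumulative, p) + 1 for p in picks)


allocation = pps_systematic(stores, 10, start=10)   # assumed starting point
for city, count in sorted(allocation.items()):
    print(f"city {city}: {count} store(s)")
```

    With this starting point, city 5 (70 stores) contributes two stores and cities 1, 3, 7, 9, 10, 11, 12 and 14 contribute one each; a different start would shift the picks.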


    - Instantaneous surveys: speed is both an advantage and a disadvantage; responses can arrive too rapidly.
    - Lack of computer ownership
    - Only Internet users are reached: young, better educated, more affluent.
    - Unrestricted samples: convenience samples that may not be representative.


    - SurveySite conducts pop-up surveys.
    - Panel samples: drawing probability samples from a pre-recruited membership panel is a popular, scientific and effective method.
    - Harris Interactive Inc.: propensity weighting scheme; panel of 6.5 million; parallel studies.
    - Recruited ad-hoc samples: create a sampling frame of e-mail addresses.
    - Opt-in lists: respondents give permission to receive selected e-mail, such as questionnaires, from a company with an Internet presence.
    - Survey Sampling Inc.: a company that provides sampling frames and scientifically drawn samples.


    A certain population is divided into 5 strata so that N1 = 2000, N2 = 2000, N3 = 1800, N4 = 1700 and N5 = 2500. The respective standard deviations are 1.6, 2.0, 4.4, 4.8 and 6.0. The expected sampling cost in the first two strata is Rs. 4 per interview, and in the remaining three it is Rs. 6 per interview. How should a sample of size n = 226 be allocated to the five strata if we adopt a proportionate sampling design, and if we adopt a disproportionate sampling design considering

    i. only the differences in stratum variability

    ii. differences in variability as well as differences in stratum sampling costs?
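    A minimal sketch of the three allocations follows. Proportionate allocation weights each stratum by N_i and the variability-only design by N_i·σ_i, as in the allocation formulas; for case (ii) the sketch assumes the standard cost-adjusted weight N_i·σ_i/√c_i, which is not stated on the slides. Fractional results are left unrounded.

```python
import math

sizes = [2000, 2000, 1800, 1700, 2500]   # N_i
sds = [1.6, 2.0, 4.4, 4.8, 6.0]          # sigma_i
costs = [4, 4, 6, 6, 6]                  # Rs. per interview, c_i
n = 226


def allocate(n, weights):
    """Split n across strata in proportion to the given weights."""
    total = sum(weights)
    return [n * w / total for w in weights]


proportionate = allocate(n, sizes)
by_variability = allocate(n, [N * s for N, s in zip(sizes, sds)])
# Assumed cost-adjusted weight: N_i * sigma_i / sqrt(c_i)
by_variability_and_cost = allocate(
    n, [N * s / math.sqrt(c) for N, s, c in zip(sizes, sds, costs)]
)

for name, alloc in [
    ("proportionate", proportionate),
    ("variability only", by_variability),
    ("variability and cost", by_variability_and_cost),
]:
    print(name, [round(x, 1) for x in alloc])
```

    Rounding to whole interviews is left to the reader, since simple rounding need not sum exactly to 226.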


    - Frequency distribution
    - Central tendency: mean, median, mode
    - Measures of dispersion
      - Range
      - Standard deviation: $S = \sqrt{\dfrac{\sum (X_i - \bar{X})^2}{n-1}}$
    - The normal distribution: $Z = \dfrac{X - \mu}{\sigma}$


    - Population distribution
    - Sample distribution
    - Sampling distribution: take a certain number of samples and compute various statistical measures for each; each sample will have its own values of mean, SD, etc.
    - Standard error of the mean: $S_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$


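    - Sampling distribution of the mean
    - Sampling distribution of the proportion
    - Student's t-distribution
    - F-distribution
    - Chi-square distribution
    (The first three are used for means and proportions; the F and chi-square distributions are used for variances.)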


    Sampling distribution of the mean
    - The probability distribution of all possible means of random samples of a given size taken from a population.
    - $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$, $Z_{\bar{X}} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

    Eg: The annual income of employees in an industry follows a normal distribution with mean Rs. 4 lakhs and variance Rs. 1 lakh. A random sample of size 49 is taken from an infinite normal population. What is the probability that the sample mean is greater than Rs. 4.25 lakhs?

    n = 49, $\mu$ = 4 lakhs, $\sigma^2$ = 1 lakh
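    The example can be worked through as a short sketch. It assumes that the stated variance corresponds to a standard deviation of 1 lakh (working in lakhs throughout) and uses the standard library's `statistics.NormalDist` in place of a normal table.

```python
from math import sqrt
from statistics import NormalDist

mu = 4.0       # population mean, in lakhs
sigma = 1.0    # assumed standard deviation, in lakhs (variance of 1)
n = 49

standard_error = sigma / sqrt(n)           # 1/7
z = (4.25 - mu) / standard_error           # 1.75
p_greater = 1 - NormalDist().cdf(z)        # P(sample mean > 4.25)

print(f"z = {z:.2f}, P = {p_greater:.4f}")
```

    Under these assumptions the probability comes out at roughly 0.04.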


    - Finite population multiplier: $\sqrt{\dfrac{N-n}{N-1}}$

    Eg: The age of employees in a company follows a normal distribution with mean 40 years and variance 121 years. If a random sample of 36 employees is taken from a finite population of size 1000, what is the probability that the sample mean is (a) less than 45, (b) greater than 42, (c) between 40 and 42?

    n = 36, N = 1000, $\mu$ = 40 yrs, $\sigma^2$ = 121 yrs, $\sigma$ = 11 yrs
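    A sketch of this example, assuming the multiplier is applied to the standard error as $\sigma/\sqrt{n}$ times $\sqrt{(N-n)/(N-1)}$ and using `statistics.NormalDist` in place of a normal table:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 40.0, 11.0
n, N = 36, 1000

fpc = sqrt((N - n) / (N - 1))              # finite population multiplier
standard_error = sigma / sqrt(n) * fpc     # about 1.80

sample_mean = NormalDist(mu, standard_error)
p_less_45 = sample_mean.cdf(45)                          # (a)
p_greater_42 = 1 - sample_mean.cdf(42)                   # (b)
p_between = sample_mean.cdf(42) - sample_mean.cdf(40)    # (c)

print(f"SE = {standard_error:.4f}")
print(f"(a) {p_less_45:.4f}  (b) {p_greater_42:.4f}  (c) {p_between:.4f}")
```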


    Sampling distribution of the proportion
    - Statistics of attributes; binomial distribution
    - p = proportion of successes = X/n, q = proportion of failures, n = sample size, q = 1 - p
    - Mean = np, $\sigma^2 = npq$
    - $\bar{p} \sim N\!\left[p, \dfrac{p(1-p)}{n}\right]$

    Eg: The personnel manager of a company feels that 52% of the employees will have enhanced skill after attending the training. A sample of records of 49 employees who attended the training reveals that only 24 of them have enhanced skill after attending the program. Find the probability that the sample of employees who attended have enhanced their skill.

    p = 0.52, n = 49, q = 1 - p = 0.48, $\bar{p}$ = 24/49
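    The question as transcribed does not say which probability is wanted. The sketch below assumes it asks for the probability that the sample proportion is at most 24/49, and uses the normal approximation given above.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.52, 49
q = 1 - p
p_sample = 24 / 49

standard_error = sqrt(p * q / n)
z = (p_sample - p) / standard_error
# Assumed reading of the question: P(sample proportion <= 24/49)
p_at_most = NormalDist().cdf(z)

print(f"z = {z:.3f}, P = {p_at_most:.4f}")
```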


    - $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ with (n-1) degrees of freedom

    Eg: The annual sales of dealers of a company follow a normal distribution with mean Rs. 94 lakhs. A random sample of 10 dealers of the company is taken from the normal population. The variance of the annual sales of these 10 dealers is Rs. 81 lakhs. Find the probability that the mean annual sales of the sample is (a) less than Rs. 98 lakhs, (b) more than Rs. 98 lakhs.

    n = 10, $\mu$ = 94 lakhs, S = 9 lakhs
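    A sketch of this example using only the standard library. In place of a t table, the tail area is obtained by integrating the t density numerically with Simpson's rule; the helper names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


n, mu, s = 10, 94.0, 9.0
t_stat = (98 - mu) / (s / math.sqrt(n))    # about 1.41 with 9 d.f.
p_less = t_cdf(t_stat, n - 1)              # (a)
p_more = 1 - p_less                        # (b)

print(f"t = {t_stat:.3f}, (a) {p_less:.3f}, (b) {p_more:.3f}")
```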


    - $\chi^2 = \dfrac{(n-1)S^2}{\sigma^2}$ with (n-1) degrees of freedom

    Eg: A random sample of 20 dealers of a company is taken from a normal population. The variance of the annual sales of dealers in the normal population and that of the random sample of 20 dealers are Rs. 81 lakhs and Rs. 125 lakhs respectively. Compute the chi-square statistic and find the probability that the chi-square variable is more than the calculated chi-square statistic.

    n = 20, $S^2$ = 125, $\sigma^2$ = 81
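    A sketch of this example using only the standard library. The upper-tail probability is approximated by integrating the chi-square density with Simpson's rule rather than read from a table; the helper names are hypothetical.

```python
import math


def chi2_pdf(x, df):
    """Density of the chi-square distribution (zero at x <= 0 for df > 2)."""
    if x <= 0:
        return 0.0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


n, sample_var, pop_var = 20, 125, 81
chi2_stat = (n - 1) * sample_var / pop_var                 # 2375 / 81
p_more = 1 - simpson(lambda x: chi2_pdf(x, n - 1), 0, chi2_stat)

print(f"chi-square = {chi2_stat:.3f}, P(more) = {p_more:.3f}")
```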


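    - Ratio of two chi-square variables, each divided by its degrees of freedom:

      $F = \dfrac{\left[(n_1-1)S_1^2/\sigma_1^2\right]/(n_1-1)}{\left[(n_2-1)S_2^2/\sigma_2^2\right]/(n_2-1)}$

      with $(n_1-1)$ and $(n_2-1)$ degrees of freedom.

    If $\sigma_1 = \sigma_2$, $F = S_1^2/S_2^2$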


    Eg: Two independent samples of students of a programme under distance education are taken from normal populations with the same variance. The size and variance of marks of the first sample are 8 and 100 respectively. The size and variance of marks of the second sample are 20 and 40 respectively.

    (a) What is the calculated F-statistic?
    (b) What is the probability that the F-ratio is more than the calculated F-statistic?

    $n_1$ = 8, $n_2$ = 20, $S_1^2$ = 100, $S_2^2$ = 40
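    A sketch of this example using only the standard library. With equal population variances the statistic reduces to the ratio of sample variances; the tail probability is approximated by integrating the F density with Simpson's rule instead of using an F table. Helper names are hypothetical.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F-distribution with (d1, d2) degrees of freedom, d1 > 2."""
    if x <= 0:
        return 0.0
    beta = math.gamma(d1 / 2) * math.gamma(d2 / 2) / math.gamma((d1 + d2) / 2)
    return ((d1 / d2) ** (d1 / 2) * x ** (d1 / 2 - 1)
            * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)) / beta


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


n1, n2 = 8, 20
var1, var2 = 100, 40

f_stat = var1 / var2                        # (a) equal population variances
d1, d2 = n1 - 1, n2 - 1                     # 7 and 19 degrees of freedom
p_more = 1 - simpson(lambda x: f_pdf(x, d1, d2), 0, f_stat)   # (b)

print(f"F = {f_stat:.2f} with ({d1}, {d2}) d.f., P(more) = {p_more:.3f}")
```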


    - Confidence level: a percentage or decimal value that tells how confident a researcher can be about being correct. It states the long-run percentage of the time that a confidence interval will include the true population value.
    - Confidence interval: gives the estimated value of the population parameter, plus or minus an estimate of error.

    $\mu = \bar{X} \pm Z_{c.l.} S_{\bar{X}}$


    Eg: A personnel manager believes age will be a useful criterion for placement. Successful women at the supervisory level are sampled. The mean age of 100 women is 37.5 years, with a standard deviation of 12 years. Knowing that it would be extremely coincidental if the point estimate from the sample were exactly the same as the population mean ($\mu$), you decide to construct a confidence interval around the sample mean. (For an area of 0.475, the z-value is 1.96.)
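    The interval for this example can be computed as a short sketch; it uses the sample standard deviation as the estimate of the population standard deviation and the z-value of 1.96 quoted above.

```python
from math import sqrt

mean, sd, n = 37.5, 12.0, 100
z = 1.96                         # 95% confidence level

standard_error = sd / sqrt(n)    # 1.2
margin = z * standard_error      # 2.352
lower, upper = mean - margin, mean + margin

print(f"{lower:.2f} to {upper:.2f} years")
```

    This gives an interval of roughly 35.1 to 39.9 years.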


    - Increasing the sample size decreases the width of the confidence interval at a given confidence level (because n is in the denominator).

    Confidence interval: $\mu = \bar{X} \pm Z_{c.l.} \dfrac{\sigma}{\sqrt{n}}$

    If the sample is small (n of about 30 or fewer), use the t-distribution; if n is large, use the normal distribution.


    Confidence interval for a proportion: $p \pm Z_{c.l.} \sqrt{\dfrac{pq}{n}}$

    Eg:


    Three factors are required to determine the sample size:
    - Variance, or heterogeneity, of the population (S)
    - Magnitude of acceptable error (E)
    - Confidence level ($Z_{c.l.}$)


    - Sample size: $n = \left(\dfrac{ZS}{E}\right)^2$

    Eg: a survey researcher wants a 95% confidence level and a range of error E of less than Rs. 2.00


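    Estimating the standard deviation:
    - Either do a pilot study and find the variance,
    - Or, if the range of the variable is known, use a rule of thumb: for a normal distribution the range spans about ±3 standard deviations, so S is roughly the range divided by 6.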


    Some amusement park visitors might spend nearly nothing on souvenirs; others might visit several amusement parks in a year and buy a lot of souvenirs every time. Suppose that 5 days a year were considered typical of the upper limit, and food and souvenir expenses were calculated at INR 90 per day; the total upper limit would be INR 450.

    With a range of 450, the estimated standard deviation would be 450/6 = 75.
    Desired precision of 25 and a 95% confidence level (Z taken as 2):
    Sample size $n = \dfrac{Z^2 S^2}{E^2} = \dfrac{(2)^2 (75)^2}{(25)^2} = 36$

    Suppose these observations generate a mean $\bar{X} = 35$ and a standard deviation $S = 60$.
    Then the confidence interval is $35 \pm 2 \cdot \dfrac{60}{\sqrt{36}} = 35 \pm 20$, i.e. 15 to 55.

    The desired precision was INR 25; the sample achieved INR 20.
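    The arithmetic of this example can be checked with a short sketch. The function name is hypothetical; Z is taken as 2, as in the worked example, rather than 1.96.

```python
from math import sqrt


def sample_size_for_mean(z, s, e):
    """n = (Z * S / E)^2 for estimating a mean to within +/- E."""
    return (z * s / e) ** 2


# Planning stage: S estimated as range / 6 = 450 / 6 = 75
s_estimate = 450 / 6
n = sample_size_for_mean(z=2, s=s_estimate, e=25)

# After the survey: observed mean 35 and standard deviation 60
achieved_error = 2 * 60 / sqrt(n)
interval = (35 - achieved_error, 35 + achieved_error)

print(n, achieved_error, interval)   # 36.0 20.0 (15.0, 55.0)
```

    The achieved error is smaller than planned because the observed standard deviation (60) turned out lower than the planning estimate (75).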


    Sample size for a proportion: $n = \dfrac{Z_{c.l.}^2 \, pq}{E^2}$

    Eg: The manager of a bank feels that 35% of branches will have enhanced yearly collection of deposits after introducing a hike in the interest rate. Determine the sample size such that the sample proportion is within plus or minus 0.06 at a confidence level of 90%.

    p = 0.35, q = 0.65, C.L. = 0.9, $Z_{0.45}$ = 1.645, E = 0.06
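    A sketch of the calculation for this example. Rounding up to the next whole branch is an assumption about how the result would be used; the slide gives only the inputs.

```python
from math import ceil

p, q = 0.35, 0.65
z = 1.645        # 90% confidence level
e = 0.06         # acceptable error

n_exact = z ** 2 * p * q / e ** 2   # about 171.0
n_required = ceil(n_exact)          # round up so the error bound is met

print(f"n = {n_exact:.2f}, so sample {n_required} branches")
```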