Sampling of Data
This document contains the details in support of a commerce project.
Sampling
Sampling is a part of our day-to-day life, and we often use it without realizing it.
The purpose of sampling is to gather maximum information about the population under consideration at minimum cost, time and human effort.
This is best achieved when the sample possesses all the characteristics of the population.
Objectives
i. To make an inference about an unknown parameter of a population from a sample drawn from it.
ii. To test a hypothesis relating to a population parameter.
Importance of Sampling
A housewife takes one or two grains of rice from the cooking pan and decides whether the rice is cooked or not.
If she took all the rice from the cooking pan to test it, there would be no rice left to eat.
A quality controller tests a few items and decides whether the lot meets the desired specifications or not.
If every item produced in the lot were tested, no items would remain in the lot to reach the customer.
A pathologist takes a few drops of blood and tests for any change in the content.
If all the blood were drawn from the body, the patient would die, and there would be no scope for further treatment.
All these situations emphasize the importance of sampling and show that sampling is inevitable, and gives satisfactory results, when:
the population is infinite
the survey area is wide
the results are required in a short time
resources such as money and skilled personnel are scarce
Advantages of Sampling over Complete Enumeration (Census)
Less time
Reduced cost
Greater accuracy
Greater scope.
Sampling is inevitable when:
Population is too large
Testing is destructive
Population is hypothetical
Limitations of Sampling
Sampling gives best results only if:
The sampling units are drawn in a scientific manner
An appropriate sampling technique is used, and
The sample size is adequate
Sampling – Basic Concepts
Population (Universe): An aggregate of objects, animate or inanimate, under study in any statistical investigation.
Sample: A part (portion, segment, subset or subgroup) of the population, i.e. of the larger group.
Random Sample: A sample in which each and every unit of the population has the same probability or chance of being included in the sample.
Sampling: The process of learning about the population on the basis of a sample drawn from it.
Parameters: Population constants such as µ, σ, ρ, etc.
Statistics: Measures such as x̄, s, r, etc., based on sample observations.
Sampling Distribution: The distribution of a statistic such as x̄ or s across different samples.
Standard Error: The standard deviation of a sampling distribution.
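The ideas of a sampling distribution and its standard error can be illustrated with a short simulation. This is only a sketch: the population (10,000 normally distributed values), the sample size of 25 and the function name are assumptions made for illustration and do not come from the slides.

```python
import random
import statistics

def sampling_distribution_of_mean(population, n, num_samples, seed=0):
    """Draw num_samples simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]

# Hypothetical population (assumption): 10,000 values with mean near 50, SD near 10.
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]
sigma = statistics.pstdev(population)

# The list of sample means is an empirical sampling distribution of the mean.
means = sampling_distribution_of_mean(population, n=25, num_samples=2_000)

# Its standard deviation is the (empirical) standard error of the mean.
print("Empirical standard error:", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):         ", round(sigma / 25 ** 0.5, 3))
```

The two printed figures should be close to each other, since the standard error of the mean is approximately σ/√n when the sample is small relative to the population.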
Theoretical Basis of Sampling
(i) Law of Statistical Regularity
A moderately large number of items chosen at random from a population are almost sure, on average, to possess the characteristics of the larger group.
(ii) Law of Inertia of Large Numbers
The larger the size of the sample, the more accurate the results are likely to be.
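The second law can be checked numerically. The sketch below assumes a hypothetical population of 100,000 uniformly distributed values and measures how far the sample mean falls from the population mean, on average, for a few sample sizes; the function name and all numbers are illustrative.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical population (assumption): 100,000 values spread evenly over 0 to 100.
population = [rng.uniform(0, 100) for _ in range(100_000)]
true_mean = statistics.fmean(population)

def mean_abs_error(n, repeats=200):
    """Average absolute gap between the sample mean and the population mean."""
    errors = [
        abs(statistics.fmean(rng.sample(population, n)) - true_mean)
        for _ in range(repeats)
    ]
    return statistics.fmean(errors)

# Larger samples should give a smaller average error.
for n in (10, 100, 1000):
    print(n, round(mean_abs_error(n), 3))
```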
Essentials of Sampling
Representativeness - Random method of selection
Adequacy - Size of the sample should be adequate
Independence - Independent selection of units
Homogeneity - No basic difference in the nature of the units of the universe and the sample
Types of Sampling
Subjective or Non-probability Sampling
Probability or Random Sampling
Mixed Sampling
Subjective or Non-Probability Sampling
If the sample is selected with a definite purpose in view, and the choice of the sampling units depends entirely on the discretion of the investigator, the sampling is called subjective or non-probability sampling. Examples include purposive sampling, quota sampling, judgment sampling, convenience sampling and snowball sampling.
Probability or Random Sampling
Probability sampling is the scientific method of selecting samples according to some laws of chance, in which every unit of the population has a known, non-zero chance of being selected (an equal chance in the simplest case). Such methods are also called random sampling methods. Examples include Simple Random Sampling, Stratified Sampling, Systematic Sampling, Multi-Stage Sampling and Cluster Sampling.
Mixed Sampling
If the samples are selected partly according to some laws of chance and partly according to a fixed sampling rule, it is termed mixed sampling.
Simple Random Sampling
• This method is based purely on probability and is the most basic form of probability sampling.
• Simple Random Sampling (SRS) is the process of selecting a sample in such a manner that each and every unit of the population has an equal and independent chance of being included in the sample.
Methods
1. Lottery Method
2. Table of Random Numbers
i. Tippett’s (1927) random number tables (41,600 digits grouped into 10,400 sets of 4-digit numbers).
ii. Fisher and Yates (1938) table of random numbers (15,000 digits arranged into 1,500 sets of 10-digit numbers).
iii. Kendall and Babington Smith (1939) table of random numbers (100,000 digits grouped into 25,000 sets of 4-digit numbers).
iv. C. R. Rao, Mitra and Matthai (1966) table of random numbers (20,000 digits grouped into 5,000 sets of 4-digit random numbers).
3. Use of Computer
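As one possible illustration of the computer method, the sketch below uses Python's standard `random` module to draw a simple random sample without replacement. The population of unit numbers 1 to 100, the sample size and the function name are assumptions chosen for the example.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Select n distinct units so that every unit has an equal chance of inclusion."""
    rng = random.Random(seed)      # a fixed seed makes the draw reproducible
    return rng.sample(units, n)    # sampling without replacement

# Hypothetical sampling frame (assumption): units numbered 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=2024)
print(sample)
```

This plays the same role as the lottery method or a random number table: the computer's pseudo-random generator replaces the physical draw.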
Simple Random Sampling
Advantages
Sample selection is quite simple.
It is said to be more representative because each unit has an equal chance of being selected.
It is free from personal bias and prejudice.
Simple Random Sampling
Disadvantages
The investigator has no control over the selection of the units for investigation.
Selection on a strictly random basis is difficult.
It is unsuitable for heterogeneous groups.
Stratified Random Sampling
In Stratified Random Sampling Method, the universe or the entire population is divided into a number of groups or strata.
Stratification variables include age, income group, residential area etc.
Units are selected from each stratum, proportionately or disproportionately.
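Proportionate selection from strata can be sketched as follows. The income-group population, the stratum sizes and the function name are hypothetical; rounding the per-stratum allocation is a simplification, and in general it can shift the total sample size by a unit or two.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Proportionate stratified sample: each stratum contributes in proportion to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(units))   # proportional allocation
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample

# Hypothetical population (assumption): 1,000 people in three income groups.
population = (
    [(i, "low") for i in range(600)]
    + [(i, "middle") for i in range(600, 900)]
    + [(i, "high") for i in range(900, 1000)]
)
sample = stratified_sample(population, stratum_of=lambda u: u[1], n=50, seed=7)

counts = defaultdict(int)
for _, group in sample:
    counts[group] += 1
print(dict(counts))   # {'low': 30, 'middle': 15, 'high': 5}
```

A disproportionate design would replace the allocation line with fixed or weighted stratum sample sizes.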
Stratified Random Sampling Method
Importance of Strata
In stratified random sampling, the selection of the sample items depends upon the process of stratification. The following precautions are required.
Each stratum in the universe should be large enough in size.
Units within each stratum should be as homogeneous as possible.
Different variables involved in the study problem should not be considered.
There should be well defined and clear-cut stratification.
Stratified Random Sampling Method
Advantages
A representative sample is easy to achieve.
The investigator has greater control over the selection of the samples.
A unit can be replaced when it is inaccessible for the study.
Stratified Random Sampling
Disadvantages
If stratification is not done properly, bias may creep in.
It is very difficult to attain proportionate representation by deliberate means, because the strata are of unequal size.
If the strata are not clear-cut, it may be difficult to place cases in the correct stratum.
The sample becomes unrepresentative if disproportionate weighting is applied to the strata.
Systematic Sampling
In some instances, the most practical way of sampling is to select every ‘i’th unit on a list of sampling units.
An element of randomness is introduced by using random numbers to pick the unit with which to start.
The remaining units of the sample are selected at fixed intervals, which is known as the Sampling Interval in the Systematic Sampling.
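The procedure described above (random start, then every k-th unit) can be sketched as below. The list of 100 numbered units and the function name are assumptions for illustration; the sketch takes the sampling interval as the whole-number part of N/n.

```python
import random

def systematic_sample(units, n, seed=None):
    """Pick a random start in the first interval, then every k-th unit after it."""
    rng = random.Random(seed)
    k = len(units) // n            # sampling interval
    start = rng.randrange(k)       # random start between 0 and k - 1
    return [units[start + i * k] for i in range(n)]

# Hypothetical list of sampling units (assumption): numbered 1 to 100.
population = list(range(1, 101))
sample = systematic_sample(population, n=10, seed=5)
print(sample)
```

With N = 100 and n = 10 the sampling interval is 10, so consecutive selected units are always 10 positions apart.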
Systematic Sampling
Advantages
The observations in a systematic sample are spread more evenly over the entire population.
It is an easier and less costly method of sampling and can be used conveniently with large populations.
Disadvantages
If there is a hidden periodicity in the population, systematic sampling will prove to be an inefficient method of sampling.
Sampling may not be reliable if all the elements are not ordered in a manner representative of the total population.
Cluster Sampling
It is a type of sampling in which clusters (groups) of units, rather than individual elementary units, are selected.
A cluster usually refers to a particular geographical area, so a cluster sample is also known as an area sample.
The sample units are clustered using the concept of neighbourhood.
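A one-stage cluster sample can be sketched as follows: whole clusters are chosen at random and every unit inside a chosen cluster is kept. The villages, household labels and function name are hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Select whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters (assumption): 8 villages with 20 households each.
clusters = {
    f"village_{v}": [f"v{v}_household_{h}" for h in range(20)]
    for v in range(8)
}
selected = cluster_sample(clusters, num_clusters=3, seed=11)
for name, households in selected.items():
    print(name, len(households))
```

Because the random draw is made over clusters, only a list of clusters is needed as the sampling frame, not a list of every household.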
Cluster Sampling
Advantages
Cluster sampling is widely used where the area of inquiry is wide.
The measurement of data can be accurate in cluster sampling.
It brings flexibility in sampling.
In cluster sampling the fieldwork is localized or concentrated, so the field cost of collecting the data is lower and the fieldwork period is shorter.
Cluster Sampling
Disadvantages
It is generally less accurate than other probability sampling methods for the same sample size.
It is a complex method.
Estimates of parameters and their standard errors are somewhat difficult when the clusters are of unequal sizes.
Multi-Stage Sampling
This design involves several stages of selection. It is appropriate where the population is scattered over a wide geographical area and no sampling frame is available.
It is useful when a sample is to be made within a limited time and cost budget.
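A two-stage version of this design can be sketched as below: clusters are selected first, then units are selected within each chosen cluster. The districts, household labels and function name are hypothetical, and real designs may use more than two stages.

```python
import random

def two_stage_sample(clusters, num_clusters, units_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units within each chosen cluster."""
    rng = random.Random(seed)
    first_stage = rng.sample(sorted(clusters), num_clusters)
    return {
        name: rng.sample(clusters[name], min(units_per_cluster, len(clusters[name])))
        for name in first_stage
    }

# Hypothetical first-stage units (assumption): 10 districts of 50 households each.
clusters = {
    f"district_{d}": [f"d{d}_household_{h}" for h in range(50)]
    for d in range(10)
}
selected = two_stage_sample(clusters, num_clusters=4, units_per_cluster=5, seed=3)
for name, households in selected.items():
    print(name, households)
```

In practice a list of households is needed only for the districts chosen at the first stage, which is why the design suits populations without a complete sampling frame.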
Advantages
It requires less time, labour and money.
More convenient, effective and flexible.
Disadvantages
The procedure of estimating Standard Error is complicated.
It is difficult for a non-statistician to follow this method.
Selection of Appropriate Method of Sampling
Factors influencing Selection of the Method of Sampling
Nature of the Problem
Size of the Universe
Size of the Sample
Availability of Money and time
Sample Design
It is a plan for drawing a sample from a population. It involves making decisions on the following questions:
What is the relevant population?
What sampling method shall we use?
What sampling frame shall we use?
What should be the size of the sample?
How much will the sample cost?
Size of the Sample
The sample size should be neither too small nor too large. It should be optimum.
Optimum size is that which fulfils the requirements of efficiency, representativeness, reliability and flexibility.
The following factors should be considered while deciding the sample size.
1. The size of the Universe
2. The resources available
3. The degree of accuracy or precision desired
4. Homogeneity or Heterogeneity of the Universe
5. Nature of the Study
6. Methods of the Sampling adopted
7. Nature of the Respondents
Mathematical Formula for Determining the Sample Size
Sample size: $n = (Z\sigma/d)^2$
n = Sample size
Z = Standard normal value at the specified level of confidence
σ = Standard deviation of the population
d = Acceptable difference between the population mean and the sample mean (margin of error)
Example (Determining Sample Size)
Determine the sample size if σ = 6, population mean = 25, sample mean = 23, and the desired level of confidence is 99%.
$n = (Z\sigma/d)^2$
σ = 6, d = 25 − 23 = 2, Z = 2.576 (at the 1% level of significance)
Therefore, $n = [(2.576 \times 6)/2]^2 = [7.728]^2 = 59.72 \approx 60$
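The same calculation can be reproduced in a few lines of Python. This is a sketch: the function name is an assumption, and Z is taken from the standard normal distribution for a two-sided confidence level rather than read from a table.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, d, confidence):
    """n = (Z * sigma / d)^2, rounded up to the next whole unit."""
    # Two-sided Z value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / d) ** 2)

n = sample_size_for_mean(sigma=6, d=2, confidence=0.99)
print(n)   # 60
```

Rounding up rather than to the nearest integer ensures that the achieved precision is at least as good as the one requested.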
Sampling Error and Non-sampling Error
Sampling Error
The error that arises from drawing inferences about the population on the basis of a few observations (a sample) is termed sampling error.
1. Biased Errors: Errors that arise from any bias in selection, estimation, etc.
2. Unbiased Errors: Errors that arise from chance differences between the members of the population included in the sample and those not included.
How to reduce it? By increasing the sample size.
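The difference between the two kinds of error can be seen in a small simulation. In this sketch the population, the biased frame (only the top half of the values can be selected) and the function name are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(3)
# Hypothetical population (assumption): 50,000 values with mean near 100, SD near 15.
population = [rng.gauss(100, 15) for _ in range(50_000)]
true_mean = statistics.fmean(population)
# Biased frame (assumption): only the top half of the population can be selected.
top_half = sorted(population)[len(population) // 2:]

def average_error(frame, n, repeats=100):
    """Mean absolute gap between the sample mean and the true population mean."""
    gaps = [
        abs(statistics.fmean(rng.sample(frame, n)) - true_mean)
        for _ in range(repeats)
    ]
    return statistics.fmean(gaps)

# Columns: sample size, error with random selection, error with biased selection.
for n in (20, 200, 2000):
    print(n, round(average_error(population, n), 2), round(average_error(top_half, n), 2))
```

Under these assumptions the chance error of the random samples shrinks as the sample size grows, while the error of the biased samples stays large. A larger sample reduces unbiased errors; biased errors call for a better selection procedure.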
Non-Sampling Errors
Non-sampling errors arise from one or more of the following factors:
1. Data specification that is inadequate or inconsistent with the objectives of the study
2. Inappropriate statistical unit
3. Inaccurate or inappropriate method of data collection
4. Lack of trained and experienced investigators
5. Lack of inspection and supervision
6. Non-response
7. Errors in data processing operations such as coding and verification
8. Errors during presentation and printing of tabulated results
How to Judge the Reliability of Samples
More samples of the same size should be taken from the same universe and their results compared. If the results are similar, the sample can be considered reliable.
If the measurements of the universe are known, they should be compared with the measurements of the sample. If they are similar, the sample can be considered reliable.
A sub-sample should be taken from the sample and studied. If the results of the sample and the sub-sample are similar, the sample can be considered reliable.
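These three checks can be sketched in code by comparing means. Everything here is illustrative: the population, the sample sizes and the helper name are assumptions, and how small a gap counts as "similar" is a judgment the investigator must make.

```python
import random
import statistics

rng = random.Random(8)
# Hypothetical universe (assumption): 20,000 values with mean near 50, SD near 10.
population = [rng.gauss(50, 10) for _ in range(20_000)]

def mean_gap(a, b):
    """Absolute difference between the means of two sets of observations."""
    return abs(statistics.fmean(a) - statistics.fmean(b))

sample = rng.sample(population, 400)
second_sample = rng.sample(population, 400)   # another sample of the same size
sub_sample = rng.sample(sample, 100)          # sub-sample drawn from the sample

print("sample vs second sample:", round(mean_gap(sample, second_sample), 2))
print("sample vs universe:     ", round(mean_gap(sample, population), 2))
print("sample vs sub-sample:   ", round(mean_gap(sample, sub_sample), 2))
```

Small gaps in all three comparisons would support treating the sample as reliable; the same idea extends to other measures such as proportions or standard deviations.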