sampling

12
Ellen Taylor-Powell Program Development and Evaluation Specialist May 1998 Program Development and Evaluation Sampling G3658-3

Upload: hilma-khoury

Post on 22-Nov-2014

1.963 views

Category:

Technology


0 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Sampling

Ellen Taylor-PowellProgram Development and Evaluation Specialist

May 1998

Program Development and Evaluation

Sampling

G3658-3

Page 2: Sampling

Adapted from Sampling in ExtensionEllen Taylor-PowellTexas Agricultural Extension ServiceThe Texas A&M University SystemCollege Station, Texas, 1988

Page 3: Sampling

■ ■ ■ TABLE OF CONTENTS

The sample: who and what . . . . . . . . . . . . . . . 2

Sampling for generalizability: probability sampling . . . . . . . . . . . . . . . . . . . . . 3

Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Simple random sample . . . . . . . . . . . . . . . . . . . 3

Systematic sample . . . . . . . . . . . . . . . . . . . . . . 4

Stratified sample . . . . . . . . . . . . . . . . . . . . . . . 4

Sample size . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Sampling for other purposes:nonprobability sampling . . . . . . . . . . . . . . . . . . 6

Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

Quota sample . . . . . . . . . . . . . . . . . . . . . . . . . 6

Purposeful sampling . . . . . . . . . . . . . . . . . . . . . 6

Sample size . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Appendix 1. Table of random numbers. . . . . . . . . . . . . 9

Appendix 2. Recommended sample sizes for two different precision levels. . . . . . . . . . . . . . . . 10

Page 4: Sampling

A sample is a portion or a subgroup of a largergroup called a population. The two standardways to draw a sample are by probability andnonprobability sampling. If the evaluationÕspurpose is to generalize to the whole group, orto provide a statistical basis for saying that thesample is representative, a probability sampleis appropriate. If the aim of the evaluation is tolearn about individuals or cases for somepurpose other than generalizing to the popula-tion, or if random selection is not feasible, thennonprobability sampling is appropriate.

Sampling is not always necessary. When a pop-ulation is small, you may choose to survey allits members. Decisions about samplingdepend upon population size, what you wantto know, and the resources available.

The sample: who and what?

In Extension, we are usually interested inpeopleÑproducers, families, youth. We cantake samples of almost anything, however,such as acres, result demonstrations, livestock,lakes or 4-H records.

To identify who or what to include in a sample,start by clearly defining the population ofinterest. The term population does not necessar-ily refer to all the people in the town, county orstate. Rather, it refers to the group or units ofinterest (married couples, livestock owners,watersheds, participants in an extensionprogram, or recipients of a newsletter series)who are found in the geographic area of inter-est (town, county, region) during the time ofinterest (since 1995, during 1990-1995, and soon).

Clearly defining the population is the essentialfirst step in selecting a sample. This processincludes three parts:

■ identifying the group of interest;

■ naming the geographic area where thegroup is found; and

■ indicating the time period of interest, asnecessary.

Identifying the population from which to selectthe sample depends upon what you want toknow. If the purpose is to show the impact of afinancial management program on familyfinancial well-being, all participants in theprogram would be the population of interest. Ifthe purpose is to assess interest among thecountyÕs single parents in a prospective finan-cial management program, the population is allsingle parents in the county.

2 P R O G R A M D E V E L O P M E N T A N D E V A L U A T I O N ■ ■ ■

Page 5: Sampling

Sampling for generalizability:

probability samplingIf you want to generalize from the sample tothe population, random sampling is necessary.A probability sample provides for randomselection. This means that each unit has thesame or a known chance of being selected sothat it is possible to confidently make estimatesabout the total population based on the sampleresults. Random sampling increases the likeli-hood that the information collected is repre-sentative of the entire group.

The use of probability samples requires thateach individual in the population be identifi-able on a list or at a location. This so-calledsampling frame must meet several criteria. Itneeds to be:

1. accurateÑincluding only those individualsof interest;

2. complete and currentÑincluding all indi-viduals of interest;

3. free of duplicate names; and

4. absent any pattern in the way the namesare listed.

Information gained from a sample can be gen-eralized only to the larger population fromwhich the sample is taken. Conclusions drawnfrom a representative sample of teenagers inAdams County may be generalized to allteenagers in Adams County. They should notbe generalized to teenagers in other counties oracross the state.

Strategies

Simple random sampleIn a simple random sample, each unit of thepopulation has an equal chance of beingselected. This requires a complete list of the totalpopulationÑall participants in the program,all homeowners, all county residents, and soon. One of the following methods may beselected for drawing a random sample:

1. When the population is small, numbers ornames can simply be drawn from a hat.Record the number or name that is drawn.Put the slip back into the container andcontinue drawing until the requiredsample size is obtained. If the samenumber is drawn again, disregard it, put itback in the container and continue.

2. For larger populations and to ensure equalselection opportunities, a random numbertable may be used such as those found inmost statistics textbooks (see example,Appendix 1). After assigning consecutivenumbers to the names on the populationlist, a number is randomly selected on thetable (close eyes and point). Proceed eithervertically or horizontally. For example, ifthe total population is a three-digitnumber (100), use the last three digits ofthe random table number that correspondto a number on the population list, (anynumber between 1 and 100) until theneeded sample size is obtained. The corre-sponding names on the population listform the sample.

3S A M P L I N G ■ ■ ■

Page 6: Sampling

4 P R O G R A M D E V E L O P M E N T A N D E V A L U A T I O N ■ ■ ■

Systematic sampleThis process is easier and may be substitutedfor random sampling with large populations.Sampling starts from some randomly selectedpoint on the population list (again, close youreyes and point) and proceed by selecting indi-viduals or units at intervals thereafter. Theinterval is determined by dividing the totalnumber on the population list by the requiredsample size. For example, if a sample of 200 isdesired from a population of 800 people, theinterval would be 800/200 = 4. Randomlyselect a starting point and select every fourthname on the list thereafter. If you come to theend of the list and donÕt have the requiredsample size, go to the beginning of the list andkeep counting and choosing until the requirednumber is obtained.

A potential problem with this method is theorder in which names appear on the popula-tion list. Individuals may be listed in a certainorder such as husbandsÕ names precedingwivesÕ; groupings by age; groupings by timesand days of the week, or some other way.Even alphabetical listings may result in biasedselection when ethnic names are alphabeti-cally clustered, resulting in under- or overrep-resentation. Systematic sampling works best ifthe names on the list are randomized first.

Stratified sampleIn stratified sampling, the total population isdivided into separate groups (strata) whichdiffer along selected characteristics such asgender, age, size of operation, or geographicallocation. A random sample is drawn from eachsubgroup or stratum.

This method is especially appropriate whenparticular subgroups are known to vary orwhen some characteristic, such as age, isknown to be related to the outcome of interest.For example, we might want to draw a samplefrom each age grouping in the program toensure that each is fully represented. To do so,determine what percentage each age groupmakes up of the total number of participants.Use this percentage to determine the propor-tion of the sample that must come from eachage group. Then, randomly select an equalnumber of participants from those enrolled inthe corresponding age group. The examplethat follows shows the sample size for eachage stratum in a nutrition program where thetotal number of participants is 250 and thedesired sample size is 154.

Enrollment and stratified sample of a nutrition program

Age 20–30 31–40 41–50 50+ TOTAL

Number enrolled 50 112 68 20 250

% of total group 20% 45% 27% 8% 100%

Sample size 31 69 42 12 154

Page 7: Sampling

Sample sizeWhen your purpose is to generalize or showrepresentativeness, the size of the samplebecomes an issue. How many individuals orunits should you include so that the sampleprovides a fair representation? Often, a propor-tion of the population is used to determine thesample size. For example, you might sample10% of all producers in the county, or 20% ofall program participants. Whether you use pro-portions depends on the size of the total popu-lation. In a program with 200 participants, a20% sample would produce a sample of 40peopleÑunderrepresenting the populationsince there is a large chance for respondentvariability. On the other hand, a 20% sample of50,000 county inhabitants produces a sample of10,000Ñan unnecessary oversampling. You donot gain much more precision than you wouldusing a sample of 400. You can use the table inAppendix 2 as a guide to recommendedsample sizes for various population sizes.Notice that the smaller the population, thelarger the sample size. This is because smallerpopulations exhibit greater variability.

Variability, known as sampling error, existswithin any sample. The larger the sample, thesmaller the sampling error. But no sample willyield exactly the same information as if allpeople in the population had been included.Information collected from a sample is used tomake estimates about the population. Theintent is to produce as close an approximationof the population as possible within the con-straints of time and money.

To estimate how closely the sample approxi-mates the population, two parameters are setÑthe margin of error and the confidence level.The margin of error indicates the range ofvalues that can result when you use a sampleto estimate the population. For example, a 5%margin of error means that if 50% of thesample adopted a recommendation, you can befairly certain that 47.5% to 52.5% of the whole

population adopted it. If results show that 95%of the sample improved in parenting skills,you can feel comfortable saying that 93.5% to97.5% of all participants improved in thoseskills.

The risk of being wrong within the margin oferror is known as the confidence level. Thatis, a 95% confidence level (C = .95) means thatthere is a 5% chance that the results will notfall within the specified margin of error. Usingthe previous example, there is a 5% chance thatthis interpretation is not correctÑthat 93.5% to97.5% of all participants did not improve theirskills. We can say that we are Ò95% sureÓ thatour conclusions accurately reflect the totalpopulation.

Most studies in Extension allow for a 5Ð10%margin of error with a confidence level of 95%.Appendix 2 gives estimates of suggestedsample sizes at different margins of error.Extension practitioners recommend that whengreater accuracy is needed, all individuals beincluded when the population is 100 or less. Inthis manner, the problem in sampling smallpopulations is overcome: Because small popu-lations exhibit greater variability, a greater pro-portion of the population must be sampled.

Anticipating nonresponseGenerally, a certain number of people will notrespond for one reason or another. The samplesize needs to be large enough to compensatefor these nonresponses. A way to do this is toestimate the rate of nonresponse (30% nonre-sponse may be realistic) and increase thesample size accordingly. For example, if thetotal group of Extension participants is 150 andwe set a margin of error at 10%, we need asample size of 61 participants (Appendix 2). Ifwe think that 70% of the individuals willrespond, we actually need to begin with asample of 87 (61/.70) to ensure that we end upwith 61 people.

5S A M P L I N G ■ ■ ■

Page 8: Sampling

Sampling for other purposes:

nonprobabilitysamplesIn some instances, probability sampling maybe impossible, unnecessary or even undesir-able. You may be limited to only those partici-pants or those counties that agree to beincluded. Or you may want more in-depthinformation regarding a particular program,participants or delivery method. In nonproba-bility sampling, there is no expectation thateach unit has an equal chance of beingincluded in the sample. Since the sample doesnot intend to represent the population, find-ings should not be generalized to the whole.

Strategies

Quota sampleA quota sample divides the population beingstudied into subgroups such as male andfemale, younger and older. You estimate theproportion of people in each group based onwhat you know about the population ingeneral. For example, if you know that 20% ofthe population is headed by a single-parentfamily, then search for respondents until 20%of your sample is single parents. This is not thesame as the stratified random sample becausenot every single parent is identified and has anequal opportunity of being included.

Purposeful samplingPatton (1990) describes a number of samplingstrategies that serve purposes other than repre-sentativeness or randomness (see table 1). Basicto all these is the importance of selecting Òinfor-mation-richÓ cases from which you can learnmuch about issues that are important to thestudy. Focus on the specific rather than thegeneral; for example, if an evaluationÕs purposeis to increase a programÕs effectiveness in reach-ing low-income families, you may learn moreby conducting an in-depth query of the fewpoor families in the program than by gatheringstandardized information from a randomsample of all participants.

Sample sizeWhen your purpose is to examine selectedcases in greater depth, the previous recommen-dations for sample size do not apply.

Often, in purposeful sampling the sample size isvery smallÑpossibly even just one case study(n=1). For example, you may wish to conduct asingle case study of a low-income participantand the difference a program made in his or herlife. Or, your purpose may require an in-depthanalysis of successful community collabora-tions, highlighting the factors affecting success.Another option is to ask knowledgeable peopleto identify such collaborations and then select acertain number to include in your sample. If thepurpose of the evaluation is to document diver-sity or variation, a larger number of cases maybe necessary to capture the variety. What is therecommended sample size? There are no rules.It depends upon what you want to know, whatwill be useful, what will be credible, and whatcan be accomplished within the time andresources you have available.

When sampling for the purpose of generaliz-ability, the sample size is set in advance. Withpurposeful sampling, on the other hand, thesample size may change as the study pro-gresses. For instance, based on early investiga-tions, you may identify other Òinformation-richÓ cases to include. Or, you may terminatedata collection when no new information isforthcoming from the new sampled units.

More than likely, the factors determiningsample size will be time and resources. Selectthe individual cases carefully to choose thosethat are most likely to provide the informationyou are seeking. Finally, explain and justifyyour sampling procedures and decisions sothat information users and decisionmakersunderstand your logic. As a professional, youare obliged to describe the strengths and weak-nesses of the sampling procedures that are rel-evant for understanding the findings. Becareful not to generalize but to focus on theintention and strengths of a purposefulsample.

6 P R O G R A M D E V E L O P M E N T A N D E V A L U A T I O N ■ ■ ■

Page 9: Sampling

7S A M P L I N G ■ ■ ■

Table 1. Purposeful sampling strategies

Extreme or deviant case sampling Learning from highly unusual cases, such as outstanding successes/notable failures; top of the class/dropouts; crises.

Intensity sampling Information-rich cases that manifest the phenomenon intensely, but not extremely, such as good students/poor students; above average/below average.

Maximum variation sampling Purposefully pick cases to illustrate a wide range of variation on dimensions of interest. Identifies important common patterns that cut across variations.

Homogeneous sampling Pick cases that are alike. Focuses, reduces variation, simplifies analysis, faclitates group interviewing.

Typical case sampling Illustrates or highlights what is typical, normal, average.

Stratified purposeful sampling Illustrates characteristics of particular subgroups of interest; facilitates comparisons.

Critical case sampling Permits logical generalization and maximum application of information to other cases because if itÕs true of this one case, itÕs likely to be true of all other cases.

Snowball or chain sampling Relies on people identifying other people or cases to investigate. As they identify new names, the snowball gets bigger. A few key names may be mentioned repeatedly indicating their special importance. Particularly useful when there is no population list or you want to draw a sample based on recommendations.

Criterion sampling Picking all cases that meet some criterion, such as all children abused in a treatment facility.

Theory-based or operational Finding manifestations of a theoretical construct of construct sampling interest so as to elaborate and examine the construct.

Confirming and disconfirming cases Elaborating and deepening initial analysis, seeking exceptions, testing variation.

Opportunistic sampling Following new leads during fieldwork, taking advantage of the unexpected, requires flexibility.

Random purposeful sampling Uses random selection to select limited (small sample size) number of cases from a larger purposeful sample;

adds credibility (not for generalizations or representativeness).

Politically important or sensitive Attracts attention to the study (or avoids attracting case sampling undesired attention by eliminating politically sensitive

cases from the sample).

Convenience sampling Takes individuals who are available or cases as they occur. Saves time, money and effort. Poorest rationale; lowest credibility.

Combination or mixed Combination of above approaches, meets multiple purposeful sampling interests and needs.

Patton, 1990: 182Ð183 (Adapted with permission by author. Reprinted with permission of Sage Publications.)

Page 10: Sampling

Summary

Sampling is the selection of a smaller numberof units from among a larger group or popula-tion. To generalize from the sample to the pop-ulation, probability or random sampling isneeded to ensure that the sample is representa-tive. If generalizations are not desired or neces-sary, nonprobability sampling is appropriateand can yield very useful information if casesare selected thoughtfully.

Many people are concerned about how manycases to include in their samples. There is nosimple answer. Determining the sample sizedepends upon:

1. The purpose of the study

2. Expectations of information users, and

3. The resources available.

ReferencesAbbey-Livingston, Diane and Davis S.

Abbey. 1982. Enjoying Research? A ÔHow-toÕ Manual on Needs Assessment. Ontario:Ministry of Tourism and Recreation.

Cochran, W.G. 1963. Sampling Techniques, 2ndEd. N.Y.: John Wiley and Sons, Inc.

Fink, Arlene. 1995. How to Sample in Surveys.Thousand Oaks, Calif: Sage Publications.

Ilvento, Tom W., Paul D. Warner, Richard C.Maurer. 1986. Sampling Issues forEvaluations in the Cooperative ExtensionService. Kentucky Cooperative ExtensionService, Lexington, KY.

Isaac, Stephen and William B. Michael. 1981.Handbook in Research and Evaluation, 2ndEd. San Diego, EdITS Publishers.

Kerlinger, Fred N. 1973. Foundations ofBehavioral Research, 2nd Ed. N.Y.: Holt,Rinehart, and Winston, Inc.

Patton, Michael Q. 1990. QualitativeEvaluation and Research Methods. 2nd ed.Newbury Park, Calif: Sage Publications.

Rohs, F. Richard. 1985. Sampling Techniquesfor Extension Program Planners andEvaluators. The Cooperative ExtensionService, University of Georgia, Athens,Georgia.

Sabrosky, Laurel K. 1959. ÒSamplingÓ inEvaluation in Extension, pg. 37-45. ES-USDA, Washington, D.C.

Smith, M.F. 1983. Sampling Considerations inEvaluating Cooperative Extension Programs.Florida Cooperative Extension Service,Gainesville, Florida.

8 P R O G R A M D E V E L O P M E N T A N D E V A L U A T I O N ■ ■ ■

Page 11: Sampling

Appendix 1. Table of random numbers

59391 58030 52098 82718 87024 82848 04190 96574 90464 2906599567 76364 77204 04615 27062 96621 43918 01896 83991 5114110363 97518 51400 25670 98342 61891 27101 37855 06235 3331686859 19558 64432 16706 99612 59798 32803 67708 15297 2861211258 24591 36863 55368 31721 94335 34936 02566 80972 0818895068 88628 35911 14530 33020 80428 39936 31855 34334 6486554463 47237 73800 91017 36239 71824 83671 39892 60518 3709216874 62677 57412 13215 31389 62233 80827 73917 82802 8442092494 63157 76593 91316 03505 72389 96363 52887 01087 6609115669 56689 35682 40844 53256 81872 35213 09840 34471 7444199116 75486 84989 23476 52967 67104 39495 39100 17217 7407315696 10703 65178 90637 63110 17622 53988 71087 84148 1167097720 15369 51269 69620 03388 13699 33423 67453 43269 5672011666 13841 71681 98000 35979 39719 81899 07449 47985 4696771628 73130 78783 75691 41632 09847 61547 18707 85489 6994440501 51089 99943 91843 41995 88931 73631 69361 05375 1541722518 55576 98215 82068 10798 86211 36584 67466 69373 4005475112 30485 62173 02132 14878 92879 22281 16783 86352 0007780327 02671 98191 84342 90813 49268 95441 15496 20168 0927160251 45548 02146 05597 48228 81366 34598 72856 66762 1700257430 82270 10421 00540 43648 75888 66049 21511 47676 3344473528 39559 34434 88596 54086 71693 43132 14414 79949 8519325991 65959 70769 64721 86413 33475 42740 06175 82758 6624878388 16638 09134 59980 63806 48472 39318 35434 24057 7473912477 09965 96657 57994 59439 76330 24596 77515 09577 9187183266 32883 42451 15579 38155 29793 40914 65990 16255 1777776970 80876 10237 39515 79152 74798 39357 09054 73579 9235937074 65198 44785 68624 98336 84481 97610 78735 46703 9826583712 06514 30101 78295 54656 85417 43189 60048 72781 7260620287 56862 69727 94443 64936 08366 27227 05158 50326 5956674261 32592 86538 27041 65172 85532 07571 80609 39285 6534064081 49863 08478 96001 18888 14810 70545 89755 59064 0721005617 75818 47750 67814 29575 10526 66192 44464 27058 4046726793 74951 95466 74307 13330 42664 85515 20632 05497 3362565988 72850 48737 54719 52056 01596 03845 35067 03134 7032227366 42271 44300 73399 21105 03280 73457 43093 05192 4865756760 10909 98147 34736 33863 95256 12731 66598 50771 8366572880 43338 93643 58904 59543 23943 11231 83268 65938 8158177888 38100 03062 58103 47961 83841 25878 23746 55903 4411528440 07819 21580 51459 47971 29882 13990 29226 23608 1587363525 94441 77033 12147 51054 49955 58312 76923 96071 0581347606 93410 16359 89033 89696 47231 64498 31776 05383 3990252269 45030 96279 14709 52372 87832 02735 50803 72744 8820816738 60159 07425 62369 07515 82721 37875 71153 21315 0013259348 11695 45751 15865 74739 05572 32688 20271 65128 1455112900 71775 29845 60774 94924 21810 38636 33717 67598 8252175086 23537 49939 33595 13484 97588 28617 17979 70749 3523499495 51434 29181 09993 38190 42553 68922 52125 91077 4019726075 31671 45386 36583 93459 48599 52022 41330 60651 9132113636 93596 23377 51133 95126 61496 42474 45141 46660 42338

9S A M P L I N G ■ ■ ■

Page 12: Sampling

Appendix 2. Recommended sample sizes for two different precision levels

Sample size Sample sizePopulation Size +5% 10% Population size 5% 10%

10 10 275 163 74

15 14 300 172 76

20 19 325 180 77

25 24 350 187 78

30 28 375 194 80

35 32 400 201 81

40 36 425 207 82

45 40 450 212 82

50 44 475 218 83

55 48 500 222 83

60 52 1000 286 91

65 56 2000 333 95

70 59 3000 353 97

75 63 4000 364 98

80 66 5000 370 98

85 70 6000 375 98

90 73 7000 378 99

95 76 8000 381 99

100 81 51 9000 383 99

125 96 56 10,000 385 99

150 110 61 15,000 390 99

175 122 64 20,000 392 100

200 134 67 25,000 394 100

225 144 70 50,000 397 100

250 154 72 100,000 398 100

Source: Isaac and Michael, 1981; Smith, M. F., 1983

10 ■ ■ ■

Author: Ellen Taylor-Powell is a program development and evaluation specialist for Cooperative Extension,University of WisconsinÐExtension.

An EEO/Affirmative Action employer, University of WisconsinÐExtension provides equal opportunities inemployment and programming, including Title IX and ADA requirements. Requests for reasonable accommoda-tion for disabilities or limitations should be made prior to the date of the program or activity for which they areneeded. Publications are available in alternative formats upon request. Please make such requests as early aspossible by contacting your county Extension office so proper arrangements can be made. If you need this infor-mation in an alternative format, contact the Office of Equal Opportunity and Diversity Programs or callExtension Publications at (608)262-2655.

© 1996 by the Board of Regents of the University of Wisconsin System doing business as the division ofCooperative Extension of the University of WisconsinÐExtension. Send inquiries about copyright permission to:Director, Cooperative Extension Publications, 201 Hiram Smith Hall, 1545 Observatory Dr., Madison, WI 53706.

You can obtain copies of this publication from Cooperative Extension Publications, Room 170, 630 W. MifflinStreet, Madison, WI 53703, or by phoning (608)262-3346.

G3658-3 Program Development and Evaluation, Sampling RP-4-98-.5M-200