Chapter 7: Statistics in Practice


Post on 01-May-2022


Page 1: Chapter 7 STATISTICS in PRACTICE

Chapter 7

STATISTICS in PRACTICE

• MeadWestvaco Corporation’s products include textbook paper, magazine paper, and office products.

• MeadWestvaco’s internal consulting group uses sampling to provide information that enables the company to obtain significant productivity benefits and remain competitive.

• Managers need reliable and accurate information about the timberlands and forests to evaluate the company’s ability to meet its future raw material needs.

• Data collected from sample plots throughout the forests are the basis for learning about the population of trees owned by the company.

Page 2: Chapter 7 STATISTICS in PRACTICE

Chapter 7

Sampling and Sampling Distributions

• 7.1 The Electronics Associates Sampling Problem

• 7.2 Simple Random Sampling

• 7.3 Point Estimation

• 7.4 Introduction to Sampling Distributions

• 7.5 Sampling Distribution of $\bar{x}$

• 7.6 Sampling Distribution of $\bar{p}$

• 7.7 Properties of Point Estimators

• 7.8 Other Sampling Methods

Page 3: Chapter 7 STATISTICS in PRACTICE

Statistical Inference

• The purpose of statistical inference is to obtain information about a population from information contained in a sample.

• A population is the set of all the elements of interest.

• A sample is a subset of the population.

Page 4: Chapter 7 STATISTICS in PRACTICE

Statistical Inference

• The sample results provide only estimates of the values of the population characteristics.

• A parameter is a numerical characteristic of a population.

• With proper sampling methods, the sample results can provide “good” estimates of the population characteristics.

Page 5: Chapter 7 STATISTICS in PRACTICE

7.1 The Electronics Associates Sampling Problem

• Often the cost of collecting information from a sample is substantially less than from a population, especially when personal interviews must be conducted to collect the information.

Page 6: Chapter 7 STATISTICS in PRACTICE

7.2 Simple Random Sampling: Finite Population

• Finite populations are often defined by lists such as:

  • Organization membership roster

  • Credit card account numbers

  • Inventory product numbers

• A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

Page 7: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Finite Population

• Replacing each sampled element before selecting subsequent elements is called sampling with replacement.

• Sampling without replacement is the procedure used most often.

• In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.

• Excel provides a function for generating random numbers in its worksheets.
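The two selection procedures above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `simple_random_sample`, the 500-element roster, and the seed are all assumptions chosen for the example.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Select n elements from a finite population given as a list."""
    rng = random.Random(seed)
    if replace:
        # Sampling with replacement: each draw is made from the full list,
        # so the same element can appear more than once.
        return [rng.choice(population) for _ in range(n)]
    # Sampling without replacement: every possible subset of size n
    # has the same probability of being selected.
    return rng.sample(population, n)

# Hypothetical membership roster with IDs 1..500.
roster = list(range(1, 501))
without_repl = simple_random_sample(roster, 20, seed=1)
with_repl = simple_random_sample(roster, 20, replace=True, seed=1)
```

Sampling without replacement is the default here because the slides name it as the procedure used most often.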

Page 8: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling

• Random numbers: the numbers in the table are random, so all of the four-digit numbers are equally likely.

Page 9: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Infinite Population

• Infinite populations are often defined by an ongoing process whereby the elements of the population consist of items generated as though the process would operate indefinitely.

• A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied.

  • Each element selected comes from the same population.

  • Each element is selected independently.

Page 10: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Infinite Population

• In the case of infinite populations, it is impossible to obtain a list of all elements in the population.

• The random number selection procedure cannot be used for infinite populations.

Page 11: Chapter 7 STATISTICS in PRACTICE

7.3 Point Estimation

• In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.

• We refer to $\bar{x}$ as the point estimator of the population mean $\mu$.

• $s$ is the point estimator of the population standard deviation $\sigma$.

• $\bar{p}$ is the point estimator of the population proportion $p$.

Page 12: Chapter 7 STATISTICS in PRACTICE

Point Estimation

• Example: estimating the population mean, the population standard deviation, and the population proportion.

Page 13: Chapter 7 STATISTICS in PRACTICE

Sampling Error

• When the expected value of a point estimator is equal to the population parameter, the point estimator is said to be unbiased.

• The absolute value of the difference between an unbiased point estimate and the corresponding population parameter is called the sampling error.

• Sampling error is the result of using a subset of the population (the sample), and not the entire population.

• Statistical methods can be used to make probability statements about the size of the sampling error.

Page 14: Chapter 7 STATISTICS in PRACTICE

Sampling Error

• The sampling errors are:

  $|\bar{x} - \mu|$ for sample mean

  $|s - \sigma|$ for sample standard deviation

  $|\bar{p} - p|$ for sample proportion

Page 15: Chapter 7 STATISTICS in PRACTICE

Example: St. Andrew’s

St. Andrew’s College receives 900 applications annually from prospective students. The application form contains a variety of information including the individual’s scholastic aptitude test (SAT) score and whether or not the individual desires on-campus housing.

Page 16: Chapter 7 STATISTICS in PRACTICE

Example: St. Andrew’s

The director of admissions would like to know the following information:

• the average SAT score for the 900 applicants, and

• the proportion of applicants that want to live on campus.

Page 17: Chapter 7 STATISTICS in PRACTICE

Example: St. Andrew’s

We will now look at three alternatives for obtaining the desired information.

• Conducting a census of the entire 900 applicants

• Selecting a sample of 30 applicants, using a random number table

• Selecting a sample of 30 applicants, using Excel

Page 18: Chapter 7 STATISTICS in PRACTICE

Conducting a Census

• If the relevant data for the entire 900 applicants were in the college’s database, the population parameters of interest could be calculated using the formulas presented in Chapter 3.

• We will assume for the moment that conducting a census is practical in this example.

Page 19: Chapter 7 STATISTICS in PRACTICE

Conducting a Census

• Population Mean SAT Score

  $\mu = \frac{\sum x_i}{900} = 990$

• Population Standard Deviation for SAT Score

  $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 80$

• Population Proportion Wanting On-Campus Housing

  $p = \frac{648}{900} = .72$

Page 20: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling

• Now suppose that the necessary data on the current year’s applicants were not yet entered in the college’s database.

• Furthermore, the Director of Admissions must obtain estimates of the population parameters of interest for a meeting taking place in a few hours.

• She decides a sample of 30 applicants will be used.

• The applicants were numbered, from 1 to 900, as their applications arrived.

Page 21: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Using a Random Number Table

• Taking a Sample of 30 Applicants

  • Because the finite population has 900 elements, we will need 3-digit random numbers to randomly select applicants numbered from 1 to 900.

  • We will use the last three digits of the 5-digit random numbers in the third column of the textbook’s random number table, and continue into the fourth column as needed.

Page 22: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Using a Random Number Table

• Taking a Sample of 30 Applicants

  • The numbers we draw will be the numbers of the applicants we will sample unless

    • the random number is greater than 900 or

    • the random number has already been used.

  • We will continue to draw random numbers until we have selected 30 applicants for our sample.

  • (We will go through all of column 3 and part of column 4 of the random number table, encountering in the process five numbers greater than 900 and one duplicate, 835.)
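The draw-and-reject procedure described above can be sketched as follows. This is an illustration under stated assumptions: Python's pseudo-random generator stands in for the textbook's printed random number table, and the function name `select_with_random_numbers` and the seed are hypothetical.

```python
import random

def select_with_random_numbers(population_size, sample_size, seed=None):
    """Draw 3-digit random numbers until sample_size distinct applicants are chosen."""
    rng = random.Random(seed)   # stands in for the printed random number table
    selected = []
    while len(selected) < sample_size:
        number = rng.randint(0, 999)   # one 3-digit random number, 000 to 999
        if number == 0 or number > population_size:
            continue   # no applicant carries this number (e.g. 902 exceeds 900)
        if number in selected:
            continue   # the random number has already been used
        selected.append(number)
    return selected

applicant_numbers = select_with_random_numbers(900, 30, seed=7)
```

Rejected numbers are simply skipped, so the loop may consume more than 30 random numbers, just as the table walk-through runs past column 3 into column 4.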

Page 23: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Using a Random Number Table

• Use of Random Numbers for Sampling

3-Digit Random Number    Applicant Included in Sample

744                      No. 744

436                      No. 436

865                      No. 865

790                      No. 790

835                      No. 835

902                      Number exceeds 900

190                      No. 190

836                      No. 836

. . . and so on

Page 24: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Using a Random Number Table

• Sample Data

No.   Random Number   Applicant        SAT Score   Live On-Campus

1     744             Conrad Harris    1025        Yes

2     436             Enrique Romero   950         Yes

3     865             Fabian Avante    1090        No

4     790             Lucila Cruz      1120        Yes

5     835             Chan Chiang      930         No

.     .               .                .           .

30    498             Emily Morse      1010        No

Page 25: Chapter 7 STATISTICS in PRACTICE

Simple Random Sampling: Using a Computer

• Taking a Sample of 30 Applicants

  • Computers can be used to generate random numbers for selecting random samples.

  • For example, Excel’s function

    = RANDBETWEEN(1,900)

    can be used to generate random numbers between 1 and 900.

  • Then we choose the 30 applicants corresponding to the 30 smallest random numbers as our sample.
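One way to read the "30 smallest random numbers" step is to attach a random number to every applicant and keep the 30 applicants whose numbers are smallest. The sketch below assumes that reading and uses Python rather than Excel; `sample_by_random_keys` and the seed are hypothetical names and values introduced for illustration.

```python
import random

def sample_by_random_keys(ids, n, seed=None):
    """Assign each ID a random key and keep the n IDs with the smallest keys."""
    rng = random.Random(seed)
    keyed = [(rng.random(), applicant_id) for applicant_id in ids]
    keyed.sort()   # orders applicants by their random key
    return [applicant_id for _, applicant_id in keyed[:n]]

# Applicants numbered 1 to 900, as in the St. Andrew's example.
chosen = sample_by_random_keys(range(1, 901), 30, seed=3)
```

Because each applicant gets exactly one key, this approach cannot select the same applicant twice, so it yields a sample without replacement.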

Page 26: Chapter 7 STATISTICS in PRACTICE

Point Estimation

• $\bar{x}$ as Point Estimator of $\mu$

  $\bar{x} = \frac{\sum x_i}{30} = \frac{29,910}{30} = 997$

• $s$ as Point Estimator of $\sigma$

  $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{29}} = \sqrt{\frac{163,996}{29}} = 75.2$

• $\bar{p}$ as Point Estimator of $p$

  $\bar{p} = 20/30 = .68$

Note: Different random numbers would have identified a different sample which would have resulted in different point estimates.
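The three point estimators can be computed with Python's standard library. The sketch below is only an illustration: it uses the six sample rows that are visible in the sample data table (the other 24 rows are elided in the slides), so its results are not the slide's values of 997, 75.2, and .68.

```python
import statistics

# The six applicants shown in the sample data table; the full sample of 30
# is not reproduced in the slides, so this is an illustrative subset only.
sat_scores = [1025, 950, 1090, 1120, 930, 1010]
wants_housing = [True, True, False, True, False, False]

x_bar = statistics.mean(sat_scores)             # point estimator of mu
s = statistics.stdev(sat_scores)                # divides by n - 1, estimator of sigma
p_bar = sum(wants_housing) / len(wants_housing) # point estimator of p
```

`statistics.stdev` uses the n - 1 denominator, matching the 29 in the slide's formula for a sample of 30.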

Page 27: Chapter 7 STATISTICS in PRACTICE

Summary of Point Estimates Obtained from a Simple Random Sample

Population Parameter                                        Parameter Value   Point Estimator                                               Point Estimate

$\mu$ = Population mean SAT score                           990               $\bar{x}$ = Sample mean SAT score                             997

$\sigma$ = Population std. deviation for SAT score          80                $s$ = Sample std. deviation for SAT score                     75.2

$p$ = Population proportion wanting campus housing          .72               $\bar{p}$ = Sample proportion wanting campus housing          .68
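As a worked illustration, the sampling errors for this particular sample follow directly from the parameter values and point estimates reported in the summary:

```latex
\begin{aligned}
|\bar{x} - \mu|  &= |997 - 990|  = 7   \\
|s - \sigma|     &= |75.2 - 80|  = 4.8 \\
|\bar{p} - p|    &= |.68 - .72|  = .04
\end{aligned}
```

A different sample of 30 applicants would generally give different sampling errors.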

Page 28: Chapter 7 STATISTICS in PRACTICE

7.4 Sampling Distribution

• Example: Relative Frequency Histogram of Sample Mean Values from 500 Simple Random Samples of 30 each.
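A histogram like this can be approximated by simulation. The sketch below is not the data behind the slide: it assumes a synthetic population of 900 scores drawn from a normal distribution with mean 990 and standard deviation 80, and the seed is arbitrary.

```python
import random
import statistics

rng = random.Random(42)

# Synthetic stand-in for the 900 applicants' SAT scores (an assumption;
# the real population data are not given in the slides).
population = [rng.gauss(990, 80) for _ in range(900)]
mu = statistics.mean(population)

# 500 simple random samples of 30 each, keeping only each sample mean.
sample_means = [statistics.mean(rng.sample(population, 30)) for _ in range(500)]

mean_of_means = statistics.mean(sample_means)     # should land close to mu
spread_of_means = statistics.stdev(sample_means)  # empirical standard error
```

Plotting `sample_means` as a relative frequency histogram would give a bell-shaped picture centered near the population mean.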

Page 29: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution

• Example: Relative Frequency Histogram of Sample Proportion Values from 500 Simple Random Samples of 30 each.

Page 30: Chapter 7 STATISTICS in PRACTICE

7.5 Sampling Distribution of $\bar{x}$

• Process of Statistical Inference

  1. Population with mean $\mu$ = ?

  2. A simple random sample of n elements is selected from the population.

  3. The sample data provide a value for the sample mean $\bar{x}$.

  4. The value of $\bar{x}$ is used to make inferences about the value of $\mu$.

Page 31: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$

The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$.

Expected Value of $\bar{x}$

$E(\bar{x}) = \mu$

where:

$\mu$ = the population mean

Page 32: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$

Standard Deviation of $\bar{x}$

Finite Population: $\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \left( \frac{\sigma}{\sqrt{n}} \right)$

Infinite Population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

• A finite population is treated as being infinite if n/N < .05.

• $\sqrt{(N-n)/(N-1)}$ is the finite correction factor.

• $\sigma_{\bar{x}}$ is referred to as the standard error of the mean.
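The two formulas and the n/N < .05 rule can be combined in one small helper. This is a sketch under the slide's stated rule; the function name `standard_error_mean` and its signature are assumptions made for the example.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the mean; N=None means an infinite population."""
    se = sigma / math.sqrt(n)
    # Apply the finite correction factor only when the sample is not a
    # small fraction of the population (n/N < .05 is treated as infinite).
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# St. Andrew's example: sigma = 80, n = 30, N = 900, so n/N is about .033
# and the infinite-population formula applies.
se_30 = standard_error_mean(80, 30, N=900)
```

For the St. Andrew's sample of 30 this gives 80 divided by the square root of 30, which rounds to 14.6.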

Page 33: Chapter 7 STATISTICS in PRACTICE

Form of the Sampling Distribution of $\bar{x}$

• If we use a large (n > 30) simple random sample, the central limit theorem enables us to conclude that the sampling distribution of $\bar{x}$ can be approximated by a normal distribution.

• When the simple random sample is small (n < 30), the sampling distribution of $\bar{x}$ can be considered normal only if we assume the population has a normal distribution.

Page 34: Chapter 7 STATISTICS in PRACTICE

Central Limit Theorem

• Illustration of the Central Limit Theorem

Page 35: Chapter 7 STATISTICS in PRACTICE

Relationship Between the Sample Size and the Sampling Distribution of Sample Mean

• A Comparison of the Sampling Distributions of Sample Mean for Simple Random Samples of n = 30 and n = 100.

Page 36: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

$E(\bar{x}) = 990$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{30}} = 14.6$

[Figure: sampling distribution of $\bar{x}$, centered at $E(\bar{x}) = 990$.]

Page 37: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/-10 of the actual population mean $\mu$?

In other words, what is the probability that $\bar{x}$ will be between 980 and 1000?

Page 38: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

Step 1: Calculate the z-value at the upper endpoint of the interval.

z = (1000 - 990)/14.6 = .68

Step 2: Find the area under the curve to the left of the upper endpoint.

P(z < .68) = .7517

Page 39: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

Cumulative Probabilities for the Standard Normal Distribution

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

. . . . . . . . . . .

.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224

.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549

.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852

.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133

.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389

. . . . . . . . . . .


Page 40: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 14.6$, centered at 990; the area to the left of 1000 is .7517.]

Page 41: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

Step 3: Calculate the z-value at the lower endpoint of the interval.

z = (980 - 990)/14.6 = -.68

Step 4: Find the area under the curve to the left of the lower endpoint.

P(z < -.68) = P(z > .68)

= 1 - P(z < .68)

= 1 - .7517

= .2483

Page 42: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 14.6$, centered at 990; the area to the left of 980 is .2483.]

Page 43: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.

P(-.68 < z < .68) = P(z < .68) - P(z < -.68)

= .7517 - .2483 = .5034

The probability that the sample mean SAT score will be between 980 and 1000 is:

P(980 < $\bar{x}$ < 1000) = .5034
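The five steps can be checked numerically. The sketch below computes the standard normal cumulative probability from the error function instead of reading a printed table; `normal_cdf` and `prob_within` are hypothetical helper names introduced for this example.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_within(margin, sigma, n):
    """P(sample mean falls within +/- margin of the population mean)."""
    se = sigma / math.sqrt(n)   # standard error of the mean
    z = margin / se             # z-value at the upper endpoint
    return normal_cdf(z) - normal_cdf(-z)

# St. Andrew's example: sigma = 80, n = 30, margin = 10.
p_30 = prob_within(10, 80, 30)
```

Because this calculation does not round z to two decimals before looking up the area, the result can differ from the table-based .5034 in the third decimal place.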

Page 44: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{x}$ for SAT Scores

[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 14.6$, centered at 990; the area between 980 and 1000 is .5034.]

Page 45: Chapter 7 STATISTICS in PRACTICE

Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$

• Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.

• $E(\bar{x}) = \mu$ regardless of the sample size. In our example, $E(\bar{x})$ remains at 990.

• Whenever the sample size is increased, the standard error of the mean is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased to:

  $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{100}} = 8.0$

Page 46: Chapter 7 STATISTICS in PRACTICE

Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$

[Figure: two sampling distributions of $\bar{x}$, both centered at $E(\bar{x}) = 990$: with n = 30, $\sigma_{\bar{x}} = 14.6$; with n = 100, $\sigma_{\bar{x}} = 8$.]

Page 47: Chapter 7 STATISTICS in PRACTICE

Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$

• Recall that when n = 30, P(980 < $\bar{x}$ < 1000) = .5034.

• We follow the same steps to solve for P(980 < $\bar{x}$ < 1000) when n = 100 as we showed earlier when n = 30.

• Now, with n = 100, P(980 < $\bar{x}$ < 1000) = .7888.

• Because the sampling distribution with n = 100 has a smaller standard error, the values of $\bar{x}$ have less variability and tend to be closer to the population mean than the values of $\bar{x}$ with n = 30.
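For reference, the same five steps with n = 100 can be written out as follows, using the standard error of 8 and standard normal table values for z = 1.25:

```latex
\begin{aligned}
z_{\text{upper}} &= \frac{1000 - 990}{8} = 1.25, \qquad P(z < 1.25) = .8944 \\
z_{\text{lower}} &= \frac{980 - 990}{8} = -1.25, \qquad P(z < -1.25) = 1 - .8944 = .1056 \\
P(980 < \bar{x} < 1000) &= .8944 - .1056 = .7888
\end{aligned}
```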

Page 48: Chapter 7 STATISTICS in PRACTICE

Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$

[Figure: sampling distribution of $\bar{x}$ with $\sigma_{\bar{x}} = 8$, centered at 990; the area between 980 and 1000 is .7888.]

Page 49: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution

• Example: Relative Frequency Histogram of Sample Proportion Values from 500 Simple Random Samples of 30 each.

Page 50: Chapter 7 STATISTICS in PRACTICE

7.6 Sampling Distribution of $\bar{p}$

• Making Inferences about a Population Proportion

  1. Population with proportion $p$ = ?

  2. A simple random sample of n elements is selected from the population.

  3. The sample data provide a value for the sample proportion $\bar{p}$.

  4. The value of $\bar{p}$ is used to make inferences about the value of $p$.

Page 51: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

The sampling distribution of $\bar{p}$ is the probability distribution of all possible values of the sample proportion $\bar{p}$.

Expected Value of $\bar{p}$

$E(\bar{p}) = p$

where:

$p$ = the population proportion

Page 52: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

Standard Deviation of $\bar{p}$

Finite Population: $\sigma_{\bar{p}} = \sqrt{\frac{N-n}{N-1}} \sqrt{\frac{p(1-p)}{n}}$

Infinite Population: $\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$

• $\sigma_{\bar{p}}$ is referred to as the standard error of the proportion.

Page 53: Chapter 7 STATISTICS in PRACTICE

Form of the Sampling Distribution of $\bar{p}$

• The sampling distribution of $\bar{p}$ can be approximated by a normal distribution whenever the sample size is large.

• The sample size is considered large whenever these conditions are satisfied:

  np > 5 and n(1 - p) > 5

Page 54: Chapter 7 STATISTICS in PRACTICE

Form of the Sampling Distribution of $\bar{p}$

• For values of p near .50, sample sizes as small as 10 permit a normal approximation.

• With very small (approaching 0) or very large (approaching 1) values of p, much larger samples are needed.

Page 55: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

• Example: St. Andrew’s College

Recall that 72% of the prospective students applying to St. Andrew’s College desire on-campus housing.

What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicants desiring on-campus housing that is within plus or minus .05 of the actual population proportion?

Page 56: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

For our example, with n = 30 and p = .72, the normal distribution is an acceptable approximation because:

np = 30(.72) = 21.6 > 5

and

n(1 - p) = 30(.28) = 8.4 > 5

Page 57: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

$E(\bar{p}) = .72$

$\sigma_{\bar{p}} = \sqrt{\frac{.72(1 - .72)}{30}} = .082$

[Figure: sampling distribution of $\bar{p}$, centered at $E(\bar{p}) = .72$.]

Page 58: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

Step 1: Calculate the z-value at the upper endpoint of the interval.

z = (.77 - .72)/.082 = .61

Step 2: Find the area under the curve to the left of the upper endpoint.

P(z < .61) = .7291

Page 59: Chapter 7 STATISTICS in PRACTICE


Cumulative Probabilities for

the Standard Normal Distribution

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

. . . . . . . . . . .

.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224

.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549

.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852

.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133

.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389

. . . . . . . . . . .

Sampling Distribution of $\bar{p}$

Page 60: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

[Figure: sampling distribution of $\bar{p}$ with $\sigma_{\bar{p}} = .082$, centered at .72; the area to the left of .77 is .7291.]

Page 61: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

Step 3: Calculate the z-value at the lower endpoint of the interval.

z = (.67 - .72)/.082 = -.61

Step 4: Find the area under the curve to the left of the lower endpoint.

P(z < -.61) = P(z > .61)

= 1 - P(z < .61)

= 1 - .7291

= .2709

Page 62: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

[Figure: sampling distribution of $\bar{p}$ with $\sigma_{\bar{p}} = .082$, centered at .72; the area to the left of .67 is .2709.]

Page 63: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.

P(-.61 < z < .61) = P(z < .61) - P(z < -.61)

= .7291 - .2709

= .4582

The probability that the sample proportion of applicants wanting on-campus housing will be within +/-.05 of the actual population proportion is:

P(.67 < $\bar{p}$ < .77) = .4582
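The proportion calculation can be sketched the same way as the one for the mean. The helper names `normal_cdf` and `prob_proportion_within` are assumptions for this example, and the large-sample check follows the conditions np > 5 and n(1 - p) > 5 stated earlier in the slides.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_proportion_within(margin, p, n):
    """P(sample proportion falls within +/- margin of p), normal approximation."""
    if not (n * p > 5 and n * (1 - p) > 5):
        raise ValueError("sample too small for the normal approximation")
    se = math.sqrt(p * (1 - p) / n)   # standard error of the proportion
    z = margin / se
    return normal_cdf(z) - normal_cdf(-z)

# St. Andrew's example: p = .72, n = 30, margin = .05.
prob = prob_proportion_within(0.05, 0.72, 30)
```

The result agrees with the table-based .4582 up to the rounding of the standard error and z-value used in the hand calculation.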

Page 64: Chapter 7 STATISTICS in PRACTICE

Sampling Distribution of $\bar{p}$

[Figure: sampling distribution of $\bar{p}$ with $\sigma_{\bar{p}} = .082$, centered at .72; the area between .67 and .77 is .4582.]

Page 65: Chapter 7 STATISTICS in PRACTICE

Point Estimators

• Notations:

  • $\theta$ = the population parameter of interest. For example, population mean, population standard deviation, population proportion, and so on.

  • $\hat{\theta}$ = the sample statistic or point estimator of $\theta$. It represents the corresponding sample statistic such as the sample mean, sample standard deviation, and sample proportion.

  • The notation $\theta$ is the Greek letter theta.

  • The notation $\hat{\theta}$ is pronounced “theta-hat.”

Page 66: Chapter 7 STATISTICS in PRACTICE

7.7 Properties of Point Estimators

• Before using a sample statistic as a point estimator, statisticians check to see whether the sample statistic has the following properties associated with good point estimators.

  • Unbiased

  • Efficiency

  • Consistency

Page 67: Chapter 7 STATISTICS in PRACTICE

Properties of Point Estimators

Unbiased

If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to be an unbiased estimator of the population parameter.

Page 68: Chapter 7 STATISTICS in PRACTICE

Properties of Point Estimators

• Unbiased

The sample statistic $\hat{\theta}$ is an unbiased estimator of the population parameter $\theta$ if

$E(\hat{\theta}) = \theta$

where

$E(\hat{\theta})$ = the expected value of the sample statistic $\hat{\theta}$

Page 69: Chapter 7 STATISTICS in PRACTICE

Properties of Point Estimators

• Examples of Unbiased and Biased Point Estimators

Page 70: Chapter 7 STATISTICS in PRACTICE

Properties of Point Estimators

Efficiency

Given the choice of two unbiased estimators of the same population parameter, we would prefer to use the point estimator with the smaller standard deviation, since it tends to provide estimates closer to the population parameter.

The point estimator with the smaller standard deviation is said to have greater relative efficiency than the other.

Page 71: Chapter 7 STATISTICS in PRACTICE

Properties of Point Estimators

• Example: Sampling Distributions of Two Unbiased Point Estimators.
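Relative efficiency can be illustrated by simulation. The slide does not say which two estimators its figure compares; the sketch below assumes the sample mean and the sample median, both of which are unbiased for the center of a normal population, and it uses a hypothetical normal population with mean 990 and standard deviation 80.

```python
import random
import statistics

rng = random.Random(0)

means, medians = [], []
for _ in range(2000):
    # One sample of 30 from the assumed normal population.
    sample = [rng.gauss(990, 80) for _ in range(30)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

# Standard deviation of each estimator across the 2000 samples.
sd_mean = statistics.stdev(means)
sd_median = statistics.stdev(medians)
```

Under these assumptions the sample mean shows the smaller standard deviation, so it has greater relative efficiency than the sample median.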

Page 72: Chapter 7 STATISTICS in PRACTICE

Properties of Point Estimators

Consistency

A point estimator is consistent if the values of the point estimator tend to become closer to the population parameter as the sample size becomes larger.

Page 73: Chapter 7 STATISTICS in PRACTICE

7.8 Other Sampling Methods

• Stratified Random Sampling

• Cluster Sampling

• Systematic Sampling

• Convenience Sampling

• Judgment Sampling

Page 74: Chapter 7 STATISTICS in PRACTICE

Stratified Random Sampling

• The population is first divided into groups of elements called strata.

• Each element in the population belongs to one and only one stratum.

• Best results are obtained when the elements within each stratum are as much alike as possible (i.e. a homogeneous group).

Page 75: Chapter 7 STATISTICS in PRACTICE

Stratified Random Sampling

• Diagram for Stratified Random Sampling

Page 76: Chapter 7 STATISTICS in PRACTICE

Stratified Random Sampling

A simple random sample is taken from each stratum.

Formulas are available for combining the stratum sample results into one population parameter estimate.

Advantage: If strata are homogeneous, this method is as “precise” as simple random sampling but with a smaller total sample size.

Example: The basis for forming the strata might be department, location, age, industry type, and so on.
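The slides do not give the combining formulas. One common choice, used here purely as an illustration, is to weight each stratum's sample mean by that stratum's share of the population. The sketch below assumes three hypothetical location-based strata with made-up values and a 10% sample from each stratum (proportional allocation); all names are invented for the example.

```python
import random

def stratified_sample_mean(strata, fraction, seed=2):
    """Estimate the population mean from a simple random sample in each stratum."""
    rng = random.Random(seed)
    total = sum(len(values) for values in strata.values())
    estimate = 0.0
    sizes = {}
    for name, values in strata.items():
        # Proportional allocation: the same sampling fraction in every stratum.
        n_h = max(1, round(fraction * len(values)))
        sample = rng.sample(values, n_h)  # simple random sample without replacement
        sizes[name] = n_h
        # Weight the stratum sample mean by the stratum's share of the population.
        estimate += (len(values) / total) * (sum(sample) / n_h)
    return estimate, sizes

# Hypothetical strata: values are similar within a stratum, different between strata.
strata = {
    "north": [20 + i % 3 for i in range(60)],
    "south": [50 + i % 3 for i in range(30)],
    "west": [80 + i % 3 for i in range(10)],
}
all_values = [x for values in strata.values() for x in values]
population_mean = sum(all_values) / len(all_values)

estimate, sizes = stratified_sample_mean(strata, fraction=0.1)
print("stratum sample sizes:", sizes)
print("population mean:", population_mean, "stratified estimate:", estimate)
```

Because each stratum here is nearly homogeneous, a total sample of only 10 elements should already give an estimate close to the population mean.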

Page 77: Chapter 7 STATISTICS in PRACTICE

Cluster Sampling

The population is first divided into separate groups of elements called clusters.

Ideally, each cluster is a representative small-scale version of the population (i.e., a heterogeneous group).

A simple random sample of the clusters is then taken.

All elements within each sampled (chosen) cluster form the sample.
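The procedure above can be sketched in a few lines. The cluster names and values below are hypothetical (for instance, household sizes grouped by city block), and the function name is invented for the example.

```python
import random

def cluster_sample(clusters, num_clusters, seed=3):
    """Pick clusters by simple random sampling and keep every element in them."""
    rng = random.Random(seed)
    # Simple random sample of cluster names, not of individual elements.
    chosen = rng.sample(sorted(clusters), num_clusters)
    # All elements of each chosen cluster form the sample.
    sample = [x for name in chosen for x in clusters[name]]
    return chosen, sample

# Hypothetical clusters: household sizes grouped by city block.
clusters = {
    "block_A": [3, 1, 4, 2],
    "block_B": [2, 5, 3],
    "block_C": [4, 4, 1, 2, 3],
    "block_D": [1, 2, 2],
    "block_E": [5, 3, 4, 2],
}

chosen, sample = cluster_sample(clusters, num_clusters=2)
print("chosen clusters:", chosen)
print("sample:", sample)
```

Note that the randomness enters only when clusters are selected; within a chosen cluster nothing is left out.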

Page 78: Chapter 7 STATISTICS in PRACTICE

Cluster Sampling

• Diagram for Cluster Sampling

Page 79: Chapter 7 STATISTICS in PRACTICE

Cluster Sampling

Advantage: The close proximity of elements can be cost effective (i.e., many sample observations can be obtained in a short time).

Disadvantage: This method generally requires a larger total sample size than simple or stratified random sampling.

Example: A primary application is area sampling, where clusters are city blocks or other well-defined areas.

Page 80: Chapter 7 STATISTICS in PRACTICE

Systematic Sampling

If a sample size of n is desired from a population containing N elements, we might sample one element for every N/n elements in the population.

We randomly select one of the first N/n elements from the population list.

We then select every (N/n)th element that follows in the population list.
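A minimal sketch of these steps follows. It assumes, for simplicity, that N is an exact multiple of n so that the interval N/n is a whole number; the population of 1,000 numbered listings and the function name are hypothetical.

```python
import random

def systematic_sample(population, n, seed=4):
    """Take every k-th element after a random start, where k = N / n."""
    N = len(population)
    k = N // n  # sampling interval; assumes N is an exact multiple of n
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position among the first k elements
    return population[start::k][:n]

# Hypothetical population list: 1,000 numbered listings.
population = list(range(1, 1001))
sample = systematic_sample(population, n=10)
print(sample)
```

With N = 1000 and n = 10 the interval is 100, so the sample consists of one randomly chosen listing among the first 100 and every 100th listing after it.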

Page 81: Chapter 7 STATISTICS in PRACTICE

Systematic Sampling

This method has the properties of a simple random sample, especially if the list of the population elements is a random ordering.

Advantage: The sample usually will be easier to identify than it would be if simple random sampling were used.

Example: Selecting every 100th listing in a telephone book after the first randomly selected listing.

Page 82: Chapter 7 STATISTICS in PRACTICE

Convenience Sampling

It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected.

The sample is identified primarily by convenience.

Example: A professor conducting research might use student volunteers to constitute a sample.

Page 83: Chapter 7 STATISTICS in PRACTICE

Convenience Sampling

Advantage: Sample selection and data collection are relatively easy.

Disadvantage: It is impossible to determine how representative of the population the sample is.

Page 84: Chapter 7 STATISTICS in PRACTICE

Judgment Sampling

The person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population.

It is a nonprobability sampling technique.

Example: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate.

Page 85: Chapter 7 STATISTICS in PRACTICE

Judgment Sampling

Advantage: It is a relatively easy way of selecting a sample.

Disadvantage: The quality of the sample results depends on the judgment of the person selecting the sample.