Transcript
Page 1: [1] Confidence Intervals. [2] Statistical Estimation sample statistic = parameter estimate = s=s= Example:

[1]

Confidence Intervals

Page 2:

[2]

Statistical Estimation

sample statistic = parameter estimate

$\bar{X} = \hat{\mu}$

$s = \hat{\sigma}$

Example:

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$
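
The two estimation formulas can be checked by hand in R. This is a minimal sketch, assuming five illustrative values (the first clip gap sample that appears later in these slides); the variable names are hypothetical.

```r
# Five illustrative measurements (assumed: the first clip gap sample)
x <- c(65, 70, 65, 65, 85)
n <- length(x)

# Sample mean: sum of the values divided by n
xbar <- sum(x) / n

# Sample standard deviation: note the divisor n - 1
s <- sqrt(sum((x - xbar)^2) / (n - 1))

c(xbar = xbar, s = s)

# The built-in functions should agree with the hand calculation
c(mean(x), sd(x))
```

The built-in `mean()` and `sd()` use the same definitions, so the two printed vectors should match.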

Page 3:

[3]

Parameters and Statistics

• Process parameters, $\mu$ and $\sigma$

• (Model parameters, $\mu$ and $\sigma$)

• Sample statistics, $\bar{X}$ and s

• Statistical inference

– inferring knowledge of $\mu$ and $\sigma$, unknown, from values of $\bar{X}$ and s, calculated from data

Page 4:

[4]

Clip gap measurements in twenty-five samples of five measurements each

Sample      1   2   3   4   5   6   7   8   9  10  11  12
Clip gaps  65  75  75  60  70  60  75  60  65  60  80  85
           70  85  80  70  75  75  80  70  80  70  75  75
           65  75  80  70  65  75  65  80  85  60  90  85
           65  85  70  75  85  85  75  75  85  80  50  65
           85  65  75  65  80  70  70  75  75  65  80  70
Range      20  20  10  15  20  25  15  20  20  20  40  20

Sample     13  14  15  16  17  18  19  20  21  22  23  24  25
Clip gaps  70  65  90  75  75  75  65  60  50  60  80  65  65
           70  70  80  80  85  70  65  60  55  80  65  60  70
           75  85  80  75  70  60  85  65  65  65  75  65  70
           75  75  75  80  80  70  65  60  80  65  65  60  60
           70  60  85  65  70  60  70  65  80  75  65  70  65
Range       5  25  15  15  15  15  20   5  30  20  15  10  10

Page 5:

[5]

Plot subgroup means

[Figure: subgroup means (X bar, axis 55 to 90) plotted against Sample Number (1 to 25), annotated with $\bar{X}_{Before} = 73.8$ and $\bar{X}_{After} = 66.75$.]
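
The subgroup means plot can be reproduced from the clip gap table. The sketch below transcribes the 25 samples by hand and assumes that "Before" means samples 1 to 16 and "After" means samples 18 to 25 (the grouping that matches the quoted sample sizes of 80 and 40); the object names are hypothetical.

```r
# Clip gap data: each group of five values is one sample (one matrix column)
gaps <- matrix(c(
  65, 70, 65, 65, 85,   75, 85, 75, 85, 65,   75, 80, 80, 70, 75,
  60, 70, 70, 75, 65,   70, 75, 65, 85, 80,   60, 75, 75, 85, 70,
  75, 80, 65, 75, 70,   60, 70, 80, 75, 75,   65, 80, 85, 85, 75,
  60, 70, 60, 80, 65,   80, 75, 90, 50, 80,   85, 75, 85, 65, 70,
  70, 70, 75, 75, 70,   65, 70, 85, 75, 60,   90, 80, 80, 75, 85,
  75, 80, 75, 80, 65,   75, 85, 70, 80, 70,   75, 70, 60, 70, 60,
  65, 65, 85, 65, 70,   60, 60, 65, 60, 65,   50, 55, 65, 80, 80,
  60, 80, 65, 65, 75,   80, 65, 75, 65, 65,   65, 60, 65, 60, 70,
  65, 70, 70, 60, 65
), nrow = 5)

xbar   <- colMeans(gaps)                               # subgroup means
ranges <- apply(gaps, 2, function(x) diff(range(x)))   # subgroup ranges

# Assumed grouping: samples 1-16 before the change, samples 18-25 after
xbar_before <- mean(gaps[, 1:16])
xbar_after  <- mean(gaps[, 18:25])

# Subgroup means against sample number, with the two overall means
plot(1:25, xbar, type = "b", xlab = "Sample Number", ylab = "X bar")
abline(h = c(xbar_before, xbar_after), lty = 2)
```

Comparing `ranges` with the Range rows of the table is a quick check that the data were typed in correctly.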

Page 6:

[6]

Estimation: how do we quantify the implied uncertainty?

Based on the 16 × 5 = 80 values sampled from the stable process before the new batch of raw material, can we estimate the process mean?

$\bar{X}_{Before} = 73.8$

How do we represent the uncertainty associated with this estimate?

Evaluating the estimate, $\bar{X}_{Before} = 73.8$, in light of its implied uncertainty, would we conclude that the process is “on target”?

Page 11:

It is unlikely that two samples of the same size taken from the same population would return exactly the same value for the sample mean. The sample mean will vary from sample to sample.

The sample mean is itself a random variable with its own population mean, its own standard deviation (called the standard error) and its own distribution (the sampling distribution of the mean).

Properties of the sampling distribution of the mean

The sampling distribution of the mean turns out to be, at least approximately, a normal distribution (see diagrams below).

This is always true if the underlying distribution of the variable is itself normal; but even more importantly, it is approximately true as long as the distribution of the original variables is not very skewed, and the approximation improves as the sample size (n) increases.

Page 12:

The second result of concern relates to the mean of all the sample means in the sampling distribution of the mean.

Fairly reasonably, it turns out to be nothing more than the mean ($\mu$) of the population from which the samples were chosen.

Thus, sample means are distributed normally about the unknown population mean that is being estimated.

This justifies the intuitive notion that most of the possible sample means should be fairly close to this population value.

Page 13:

The sample mean should be fairly near to the population mean. The question arises of how near is fairly near, which, of course, relates to the dispersion of the sample means around the population mean.

It can be shown that the standard deviation of the sampling distribution of the mean (more usually called the standard error of the mean, or, when there is no ambiguity, the standard error) is given by

$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$

where $\sigma$ is the standard deviation of the original population, and n is the sample size.

Thus, estimates based on a large sample size are more precise than estimates associated with small samples.

- Why?
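
A quick numerical look at the standard error formula may help with the "Why?" question. The sketch below assumes a population standard deviation of 7.3 (borrowed from the clip gap example purely for illustration) and a hypothetical set of sample sizes.

```r
sigma <- 7.3                 # assumed population standard deviation
n     <- c(5, 20, 80, 320)   # hypothetical sample sizes, each 4 times the last

# Standard error of the mean for each sample size
se <- sigma / sqrt(n)

data.frame(n = n, se = round(se, 3))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.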

Page 14:

[14]

The Normal model for X and for X-bar

[Figure: Normal curves for X, spanning $\mu - 3\sigma$ to $\mu + 3\sigma$, and for X-bar, spanning $\mu - 3\sigma/\sqrt{n}$ to $\mu + 3\sigma/\sqrt{n}$.]

Page 15:

[15]

Implications of the standard error formula

$\bar{X}$ is very likely to be within 2 standard errors of $\mu$ and is even more likely to be within 3 standard errors of $\mu$.

This means that, having calculated a value of $\bar{X}$ from sampled data, we can be reasonably confident that $\mu$ is within $2\sigma/\sqrt{n}$ of the calculated value and even more confident that $\mu$ is within $3\sigma/\sqrt{n}$ of the calculated value.

Page 16:

[16]

Sampling distribution of X-bar

95% chance that X-bar is within $2\sigma/\sqrt{n}$ of $\mu$,

therefore,

95% confident that $\mu$ is within $2\sigma/\sqrt{n}$ of X-bar

[Figure: Normal curve with the central 95% marked; Z scale from -3 to 3, X scale from $\mu - 2\sigma/\sqrt{n}$ to $\mu + 2\sigma/\sqrt{n}$.]

Page 17:

[17]

Logic of confidence intervals

With repeated sampling from the process, n at a time, and calculating a new value of $\bar{X}$ each time, expect 95% of the calculated values of $\bar{X}$ to be within two standard errors of $\mu$.

Changing emphasis, expect that, in 95% of samples from a stable process, $\mu$ will be within two standard errors of the calculated value of $\bar{X}$.

Therefore, given a single sample from the process, we are 95% confident that the value of $\mu$ will be within two standard errors of the calculated value of $\bar{X}$.

Page 18:

[18]

95% confidence interval for $\mu$

$\left(\bar{X} - 2\sigma/\sqrt{n},\ \bar{X} + 2\sigma/\sqrt{n}\right)$

that is,

all values of $\mu$ within 2 standard errors of $\bar{X}$

Page 19:

[19]

Example

$\bar{X}_{Before} = 73.8$, s = 7.3, n = 80.

Confidence interval for $\mu_{Before}$ is:

$73.8 - 2 \times 7.3/\sqrt{80}$ to $73.8 + 2 \times 7.3/\sqrt{80}$,

that is, 72.2 to 75.4.
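
The arithmetic of this example can be checked in a few lines of R. This is a sketch using the summary values quoted on the slide (mean 73.8, s = 7.3, n = 80); the variable names are hypothetical.

```r
xbar <- 73.8   # sample mean before the new batch
s    <- 7.3    # sample standard deviation
n    <- 80     # number of measurements

se <- s / sqrt(n)                 # standard error of the mean
ci <- xbar + c(-1, 1) * 2 * se    # mean plus or minus 2 standard errors

round(ci, 1)   # 72.2 75.4
```

Changing `xbar` and `n` in the same sketch is one way to approach the exercise that follows.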

Page 20:

[20]

Exercise

$\bar{X}_{After} = 66.75$, s = 7.3, n = 40.

Calculate a confidence interval for $\mu_{After}$.

Page 21:

[21]

50 simulated confidence intervals
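
A picture of this kind can be generated by simulation. The sketch below is not the code behind the original slide; it assumes a Normal process with hypothetical "true" values $\mu = 73.8$ and $\sigma = 7.3$, draws 50 samples of size 80, and builds an interval of two standard errors around each sample mean.

```r
set.seed(1)      # arbitrary seed, only so that the run is repeatable

mu    <- 73.8    # assumed true process mean
sigma <- 7.3     # assumed true process standard deviation
n     <- 80      # sample size
n_sim <- 50      # number of simulated samples

# One sample mean per simulated sample
xbars <- replicate(n_sim, mean(rnorm(n, mean = mu, sd = sigma)))

# Interval: sample mean plus or minus 2 standard errors
half  <- 2 * sigma / sqrt(n)
lower <- xbars - half
upper <- xbars + half

# Does each interval contain the true mean?
covers <- lower <= mu & mu <= upper

# Draw the intervals; those that miss the true mean are shown in red
plot(NULL, xlim = range(lower, upper), ylim = c(1, n_sim),
     xlab = "Interval", ylab = "Simulated sample")
segments(lower, 1:n_sim, upper, 1:n_sim, col = ifelse(covers, "black", "red"))
abline(v = mu, lty = 2)

mean(covers)     # proportion of intervals that contain mu
```

In the long run about 95% of such intervals should contain the true mean, so in any one run of 50 a handful of misses is to be expected.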

Page 27:

The Normal model for $\bar{X}$

95% of sample means lie in the range given by

$\mu - 1.96\,\sigma/\sqrt{n} \le \bar{X} \le \mu + 1.96\,\sigma/\sqrt{n}$

where S.E. $= \sigma/\sqrt{n}$.

The value 2 is an approximation to the value 1.96 from the normal tables.

[Figure: sample means (marked X) scattered about $\mu$, most of them within $\mu \pm 1.96\,\sigma/\sqrt{n}$.]

Page 28:

[28]

Problem Name: Cadmium Ion Concentration in Sludge

Application: Interval Estimation of a Population Mean

Problem Description: 70 determinations of the Cd2+ ion concentration were made. The data showed a sample mean of 54.97 mg/ml and a standard deviation of 0.33 mg/ml.

Our best estimate of $\mu$ is 54.97 mg/ml, but what level of confidence do we place in this figure?

What we require is an INTERVAL ESTIMATE.

Page 29:

[29]

Example: 95% CI for Mean Cadmium Ion Concentration

A 95% confidence interval for the true mean Cadmium ion concentration is calculated as

$\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$

$= \left(54.97 - 1.96\,\frac{0.33}{\sqrt{70}},\ 54.97 + 1.96\,\frac{0.33}{\sqrt{70}}\right)$

$= (54.97 - 0.08,\ 54.97 + 0.08)$

$= (54.89,\ 55.05)$

Under repeated sampling we would expect the true mean Cadmium ion concentration to lie in an interval constructed in such a fashion, 95% of the time.

Page 30:

[30]

General Procedure: Interval estimate of a population mean

$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

where $1 - \alpha$ is the confidence level.

Confidence Interval    $\alpha$    $Z_{\alpha/2}$
90%                    0.10        1.645
95%                    0.05        1.960
99%                    0.01        2.576

[Figure: sampling distribution of $\bar{X}$; the interval covers $(1 - \alpha)100\%$ of all $\bar{X}$ values.]
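
The general procedure translates directly into a small R function. This is a sketch rather than a standard library routine: `z_interval` and its argument names are hypothetical, and the sample standard deviation is used in place of $\sigma$, as in the cadmium example.

```r
# Large-sample interval: xbar plus or minus z times sd / sqrt(n)
z_interval <- function(xbar, sd, n, level = 0.95) {
  alpha <- 1 - level
  z <- qnorm(1 - alpha / 2)        # upper alpha/2 point of the standard Normal
  xbar + c(-1, 1) * z * sd / sqrt(n)
}

# Critical values for the three confidence levels in the table
levels <- c(0.90, 0.95, 0.99)
round(qnorm(1 - (1 - levels) / 2), 3)          # 1.645 1.960 2.576

# Cadmium example: mean 54.97, standard deviation 0.33, n = 70
round(z_interval(54.97, 0.33, 70, 0.95), 2)    # 54.89 55.05
round(z_interval(54.97, 0.33, 70, 0.99), 2)    # 54.87 55.07
```

Raising the confidence level increases the critical value and so widens the interval.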

Page 31:

[31]

Example: 99% CI for Mean Cadmium Ion Concentration

A 99% confidence interval for the true mean Cadmium ion concentration is calculated as

$\left(\bar{X} - 2.58\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + 2.58\,\frac{\sigma}{\sqrt{n}}\right)$

$= \left(54.97 - 2.58\,\frac{0.33}{\sqrt{70}},\ 54.97 + 2.58\,\frac{0.33}{\sqrt{70}}\right)$

$= (54.97 - 0.10,\ 54.97 + 0.10)$

$= (54.87,\ 55.07)$

Under repeated sampling we would expect the true mean Cadmium ion concentration to lie in an interval constructed in such a fashion, 99% of the time.

Page 32:

[32]

Example: Tablets require an average weight of 100 mg. An inspector takes a sample of 200 tablets and finds that

$\bar{X} = 98.52$ mg, and s = 7.1 mg.

A 95% CI is

$\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$

$= \left(98.52 - 1.96\,\frac{7.1}{\sqrt{200}},\ 98.52 + 1.96\,\frac{7.1}{\sqrt{200}}\right)$

$= (98.52 - 0.98,\ 98.52 + 0.98)$

$= (97.54,\ 99.50)$

Quality engineer says that this interval is “too wide”!

Page 33:

[33]

Example: What sample size would be required to estimate the mean weight of tablets to within ± 0.85 mg, using a 95% C.I.?

$\bar{X} \pm 1.96\,\frac{\sigma}{\sqrt{n}} = \bar{X} \pm 0.85$

$1.96\,\frac{7.1}{\sqrt{n}} = 0.85$

$n = \left(\frac{1.96 \times 7.1}{0.85}\right)^2 \approx 268$

Thus, in order to achieve the desired precision in our estimate of the population mean we should use a sample of size 268.
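
The sample size calculation can be reproduced as follows. This is a sketch using the values from the tablet example (s = 7.1 mg as the planning value for $\sigma$, margin 0.85 mg); the variable names are hypothetical.

```r
z      <- 1.96   # critical value for 95% confidence
s      <- 7.1    # planning value for the standard deviation (mg)
margin <- 0.85   # required half-width of the interval (mg)

# Solve z * s / sqrt(n) = margin for n
n_exact <- (z * s / margin)^2

n_exact            # a little over 268
ceiling(n_exact)   # 269 if the requirement is met strictly
```

The exact solution is just above 268, so the slide's figure of 268 is the rounded value; rounding up to the next whole tablet gives 269, a difference of no practical importance here.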

Page 34:

[34]

Suppose a new sample gave $\bar{X} = 98.32$ mg, and s = 7.0 mg.

$\left(98.32 - 1.96\,\frac{7.0}{\sqrt{268}},\ 98.32 + 1.96\,\frac{7.0}{\sqrt{268}}\right)$

$= (98.32 - 0.84,\ 98.32 + 0.84)$

$= (97.48,\ 99.16)$

Page 35:

[35]

# The normal core body temperature of a healthy, resting adult human
# being is stated to be at 98.6 degrees Fahrenheit. We will consider
# data reported by Mackowiak et al., JAMA 268:1578-1580, 1992. TRY...

# Read the data (the code below uses columns temp and gender); adjust the path to your own copy.
temps = read.table("C:/Kev/MA4413/data/Mackowiak.txt", header=TRUE)
temps

# Side-by-side boxplots by gender, with the stated value 98.6 as a reference line.
boxplot(temp ~ gender, data = temps)
abline(h = 98.6, col = "green", lty = 2, lwd = 2)

# Mean, standard deviation and standard error of a sample.
stats = function(x) c(mean(x), sd(x), sd(x)/sqrt(length(x)))

# Confidence interval: mean plus or minus w standard errors (w = 1.96 gives 95%).
CI = function(x, w = 1.96) mean(x) + c(-1, 1) * w * sd(x) / sqrt(length(x))

with(temps, by(temp, gender, stats))
with(temps, by(temp, gender, CI))

means = with(temps, by(temp, gender, mean))
CIs = with(temps, by(temp, gender, CI))

# Overlay each group's interval (red) and mean (blue) on the boxplots.
lines(x = c(1,1), y = CIs$female, col = "red", lwd = 3)
lines(x = c(2,2), y = CIs$male, col = "red", lwd = 3)
points(x = 1:2, y = means, pch = 16, col = "blue", cex = 1.5)

Page 37:

[37]

Example: Rental Costs

• A reporter for a student newspaper is writing an article on the cost of off-campus housing.

• A sample of 10 one-bedroom units within a half-mile of campus resulted in a sample mean of €550 per month and a sample standard deviation of €30.

• Calculate a 95% confidence interval estimate of the mean rent per month for the population of one-bedroom units within a half-mile of campus.

• We’ll assume this population to be normally distributed.

Page 38:

Interval Estimation of a Population Mean: Small-Sample Case (n < 30)

If the data have a normal probability distribution and the sample standard deviation s is used to estimate the population standard deviation $\sigma$, the interval estimate is given by:

$\bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$

where $t_{\alpha/2}$ is the value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 1 degrees of freedom.

Page 39:

[39]

Example: Apartment Rents

• t Value

At 95% confidence, $1 - \alpha = .95$, $\alpha = .05$, and $\alpha/2 = .025$.

$t_{.025}$ is based on n - 1 = 10 - 1 = 9 degrees of freedom.

In the t distribution table we see that $t_{.025} = 2.262$.

Degrees              Area in Upper Tail
of Freedom    .10     .05     .025    .01     .005
    .          .       .       .       .       .
    6        1.440   1.943   2.447   3.143   3.707
    7        1.415   1.895   2.365   2.998   3.499
    8        1.397   1.860   2.306   2.896   3.355
    9        1.383   1.833   2.262   2.821   3.250
   10        1.372   1.812   2.228   2.764   3.169

Page 40:

[40]

Example: Apartment Rents

• Interval Estimation of a Population Mean: Small-Sample Case (n < 30) with $\sigma$ Unknown

$\bar{x} \pm t_{.025}\,\frac{s}{\sqrt{n}}$

$550 \pm 2.262\,\frac{30}{\sqrt{10}}$

550 ± 21.46

or €528.54 to €571.46

We are 95% confident that the mean rent per month for the population of one-bedroom units within a half-mile of campus is between €528.54 and €571.46.
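
The rent interval can be checked in R without the printed table, since `qt()` returns the t critical value directly. This is a sketch from the summary statistics in the example (mean 550, s = 30, n = 10); the variable names are hypothetical.

```r
xbar <- 550   # sample mean rent per month
s    <- 30    # sample standard deviation
n    <- 10    # number of units sampled

# Upper 2.5% point of the t distribution with n - 1 = 9 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

ci <- xbar + c(-1, 1) * t_crit * s / sqrt(n)

round(t_crit, 3)   # 2.262
round(ci, 2)       # 528.54 571.46
```

With the raw data to hand, `t.test()` would give the same interval in one step.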

Page 41:

[41]

Percentage points of the t Distribution

Degrees              Area in Upper Tail
of Freedom    .10     .05     .025    .01     .005
    .          .       .       .       .       .
   29        1.311   1.699   2.045   2.462   2.756
   30        1.310   1.697   2.042   2.457   2.750
    .          .       .       .       .       .
   40        1.303   1.684   2.021   2.423   2.704
    .          .       .       .       .       .
   60        1.296   1.671   2.000   2.390   2.660
    .          .       .       .       .       .
  120        1.289   1.658   1.980   2.358   2.617
    .          .       .       .       .       .
infinity     1.282   1.645   1.960   2.326   2.576

Page 42:

Problem Description: A quality control inspector weighs the contents of 7 packets of breakfast cereal, all from the same filling machine. The data recorded were

111g, 117g, 105g, 100g, 97g, 118g, 113g.

Use a 95% confidence interval estimate to determine if the machine is filling to the a priori target value of 115 grams per pack.

At 95% confidence, $1 - \alpha = 0.95$ and $\alpha = 0.05$.

$\bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$

$108.7 \pm 2.447\,\frac{8.22}{\sqrt{7}}$

108.7 ± 7.6

or

101.1 to 116.3

[Figure: t-dist on 6 df, with -2.447 and +2.447 marked.]

The target value of 115 g lies inside this interval, so these data do not show that the machine is off target.

Page 43:

[43]

TRY:

w = c(111, 117, 105, 100, 97, 118, 113)   # packet weights in grams
n = length(w)

# Two equivalent ways to get the upper 2.5% point of t on n - 1 df
qt(0.975, df = n - 1)
qt(0.025, df = n - 1, lower.tail = FALSE)

# 95% confidence interval from the formula
mean(w) + c(-1,1) * qt(0.975, df = n - 1) * sd(w) / sqrt(n)

t.test(w)$conf.int   # the R function t.test does all this

qqnorm(w)            # check the assumption of normal data!!

Page 44:

N-Score Plots: Testing the assumption of normality

NSCORES are idealised values we would expect if the data came from a normal distribution.

Use Z values {Z1…Z7} that divide the standard normal curve into 8 sections, with the area to the left of each Z equal to (i - 1/2)/n of the total area, where n = 7 and i runs from 1 to 7 in this example.

The assumption of normality of the Weight data is being tested.

If the points fall on a line then the assumption of normality is not called into question!
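
The normal scores described above can be computed directly with `qnorm()`. The sketch below uses the seven cereal weights and the (i - 1/2)/n rule from the slide; note that R's own `qqnorm()` uses slightly different plotting positions for small samples, so its plot will not match this one exactly. The variable names are hypothetical.

```r
w <- c(111, 117, 105, 100, 97, 118, 113)   # cereal packet weights (g)
n <- length(w)
i <- seq_len(n)

# Normal scores: Z values with area (i - 1/2)/n to their left
nscores <- qnorm((i - 0.5) / n)
round(nscores, 2)

# Plot the ordered data against the normal scores;
# a roughly straight line supports the assumption of normality
plot(nscores, sort(w), xlab = "Normal score", ylab = "Weight (g)")
```

The scores are symmetric about zero, with the middle score exactly zero because it has half the area to its left.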

Page 45:

[45]

Normal scores and the Normal diagnostic plot

Page 46:

[46]

Normal diagnostic plot

• If the sampled process follows the Normal model, the similarity of the spacing patterns will lead to a straight line scatter plot pattern, with some chance variation.

• If the scatter plot pattern is not a straight line with some chance variation, then the conclusion is that the sampled process does not conform to the Normal model.

Page 47:

[47]

Normal diagnostic plot, Presses 1-4

Page 48:

[48]

Reference plots

Page 49:

[49]

Normal plot, Presses 1-4, all data, with reference plots

Page 50:

[50]

A skew frequency curve

Page 51:

[51]

Return on Stocks

Page 52:

[52]

Assumption of Normality??

Page 53:

[53]

Sample Statistics and t-value from tables

• Sample mean: Xbar = -0.00983

• Standard Deviation: s = 0.055

• Sample size: n = 30

• t Value

At 95% confidence, $1 - \alpha = .95$, $\alpha = .05$, and $\alpha/2 = .025$.

$t_{.025}$ is based on n - 1 = 30 - 1 = 29 degrees of freedom.

In the t distribution table we see that $t_{.025} = 2.045$.

Verify that the 95% CI estimate is -0.0304 % to 0.0107 %

Page 54:

[54]

Prediction Intervals

Confidence interval:

$\bar{x} \pm t_{.025}\,\frac{s}{\sqrt{n}}$

Prediction interval:

$\bar{x} \pm t_{.025}\,s\sqrt{1 + \frac{1}{n}}$

Verify that the 95% PI estimate is -0.125 % to 0.105 %
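
Both verification exercises for the stock returns can be done with the same few lines. This is a sketch from the rounded summary statistics quoted on the slides (mean -0.00983, s = 0.055, n = 30); the variable names are hypothetical.

```r
xbar <- -0.00983   # sample mean return
s    <- 0.055      # sample standard deviation
n    <- 30         # sample size

t_crit <- qt(0.975, df = n - 1)   # upper 2.5% point of t on 29 df

# Confidence interval for the mean return
ci <- xbar + c(-1, 1) * t_crit * s / sqrt(n)

# Prediction interval for a single future return
pred <- xbar + c(-1, 1) * t_crit * s * sqrt(1 + 1 / n)

round(ci, 4)     # -0.0304 0.0107
round(pred, 3)
```

With these rounded inputs the prediction limits come out at about -0.124 and 0.105, which agrees with the slide up to rounding of the summary statistics. The prediction interval is much wider because it allows for the variation of an individual value as well as the uncertainty in the mean.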

Page 55:

Confidence Interval For A Proportion

Suppose that 46 respondents of a sample of 140 students claim to attend lectures. The sample proportion is $\hat{p} = 46/140 = 0.33$; a 95% confidence interval for the population proportion p is required.

$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{N}}$

Page 56:

In general a confidence interval is constructed as

Point Estimate ± Value × SE(Point Estimate)

$\hat{p} \pm 1.96\,SE(\hat{p})$

$0.33 \pm 1.96\sqrt{\frac{0.33(1 - 0.33)}{140}}$

(0.25, 0.41)
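
The proportion interval follows the same pattern in R. This is a sketch using the survey counts from the example (46 out of 140); the variable names are hypothetical.

```r
x <- 46    # respondents who claim to attend
N <- 140   # sample size

p_hat <- x / N                           # sample proportion
se    <- sqrt(p_hat * (1 - p_hat) / N)   # standard error of the proportion

ci <- p_hat + c(-1, 1) * 1.96 * se

round(p_hat, 2)   # 0.33
round(ci, 2)      # 0.25 0.41
```

The interval is about 8 percentage points either side of the estimate, which is what prompts the sample size question that follows.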

Page 57:

Sample Size

Suppose that the research team are unhappy about the width of the interval and say that in future they would like estimates in the form

X% ± 2%

To achieve this level of precision in the estimate how large must the sample be?

$\hat{p} \pm Z\sqrt{\frac{p(1 - p)}{N}} = \hat{p} \pm 0.02$

$N = \left(\frac{Z}{0.02}\right)^2 p(1 - p)$

Page 58:

Since p is unknown this expression cannot be evaluated immediately. Consider the following table:

p      1-p    p(1-p)
.1     .9     .09
.2     .8     .16
.3     .7     .21
.4     .6     .24
.5     .5     .25
.6     .4     .24

p(1-p) has a maximum when p = 0.5; if we use this value we do at least as well as required.

Page 59:

For a 95% confidence interval we have

$N = \left(\frac{Z}{0.02}\right)^2 \times 0.25$

$N = \left(\frac{1.960}{0.02}\right)^2 \times 0.25 = 2401$
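
The worst-case argument can be seen numerically by evaluating the sample size formula over a range of p. This is a sketch with a hypothetical grid of p values; the margin of 0.02 and Z = 1.96 come from the example.

```r
z      <- 1.96   # critical value for 95% confidence
margin <- 0.02   # required half-width (2 percentage points)

# Required sample size for a range of possible values of p
p <- seq(0.1, 0.9, by = 0.1)
N <- (z / margin)^2 * p * (1 - p)

data.frame(p = p, N = round(N))

# The requirement is largest at p = 0.5, which gives the conservative choice
N_worst <- (z / margin)^2 * 0.25
round(N_worst)   # 2401
```

Any other value of p needs a smaller sample, so planning for p = 0.5 guarantees the required precision whatever the true proportion turns out to be.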

