inferences based on two samples


Upload: yanka

Post on 03-Feb-2016


Page 1: Inferences based on TWO samples

Inferences based on TWO samples

•New concept: Independent versus dependent samples

•Comparing two population means: Independent sampling

•Comparing two population means: Dependent sampling

Page 2: Inferences based on TWO samples

Inferences About Two Means

• In the previous chapter we used one sample to make inferences about a single population. Very often we are interested in comparing two populations.

– 1) Is the average midterm grade in Stat 201.11 higher than the average midterm grade in Stat 201.12?

– 2) Is the average grade in Quiz #1 higher than Quiz #2 in this section of Introductory Statistics?

Page 3: Inferences based on TWO samples

Inferences about Two Means

• Each question is an example of testing a claim about two populations. However, there is a fundamental difference between 1) and 2).

• In 2) the samples are not independent, whereas in 1) they are.

• Why? In 1) there are different people in each class. In 2) the same people write both quizzes.

Page 4: Inferences based on TWO samples

Recognizing independent versus dependent samples

1. Is the average midterm grade in Stat 201.11 higher than the average midterm grade in Stat 201.12? Independent samples

2. Is the average grade in Quiz #1 higher than Quiz #2 in this section of Introductory Statistics? Dependent samples

Page 5: Inferences based on TWO samples

Definition. Independent and Dependent Samples

Two samples are independent if the sample selected from one population is not related to the sample selected from the other population.

If one sample is related to the other, the samples are dependent. With dependent samples we get two values for each person; such data are sometimes called paired samples.

Page 6: Inferences based on TWO samples

We consider first the case of two dependent (or paired) samples

•Calculations are very similar to those in the previous chapter for a CI or Test of Hypothesis involving one sample

Page 8: Inferences based on TWO samples

Organize work using a table

Pair   Sample 1   Sample 2   Difference
1      x1         y1         d1 = x1 - y1
2      x2         y2         d2 = x2 - y2
3      x3         y3         d3 = x3 - y3
...    ...        ...        ...
n      xn         yn         dn = xn - yn

Can now use the methods of the previous chapter to find a confidence interval for the population mean $\mu_d$ of the differences $d = x - y$.
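The difference column of the table can be sketched in a few lines of Python. The quiz scores below are hypothetical, invented only to illustrate the bookkeeping; the names `quiz1`, `quiz2`, `d_bar` and `s_d` are likewise assumptions of this sketch.

```python
from statistics import mean, stdev

# Hypothetical paired scores (Quiz 1, Quiz 2) for the same five students.
quiz1 = [72, 85, 64, 90, 78]
quiz2 = [70, 80, 66, 85, 74]

# One difference per pair: d_i = x_i - y_i.
d = [x - y for x, y in zip(quiz1, quiz2)]

n = len(d)       # number of pairs
d_bar = mean(d)  # sample mean of the differences
s_d = stdev(d)   # sample standard deviation (divides by n - 1)

print(d, n, d_bar, round(s_d, 3))
```

Once the differences are formed, the two samples are reduced to a single sample of n values, which is why the one-sample methods apply.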

Page 9: Inferences based on TWO samples

Notation for Two Dependent Samples

$\mu_d$ = mean value of the differences $d$ for the population of paired data.

$\bar{d}$ = mean value of the differences $d$ for the paired sample data.

$s_d$ = standard deviation of the differences $d$ for the paired sample data.

$n$ = number of pairs of data.

Page 10: Inferences based on TWO samples

Confidence Interval for the Mean Difference

(Dependent Samples: Paired Data)

The $(1-\alpha)\cdot 100\%$ confidence interval for the mean difference $\mu_d$ is

$$\bar{d} - t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}}$$

where $\bar{d}$ and $s_d$ are the mean and the standard deviation of the differences in the paired sample data.

-- If $n < 30$, then we assume that the population of difference scores is normal.

-- If $n \ge 30$, then $z$ and $\sigma_d$ are used instead of $t$ and $s_d$.
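As a rough illustration of the paired-data interval, the sketch below evaluates d-bar plus or minus t times s_d / sqrt(n). The function name `paired_ci` and the summary statistics are hypothetical. The critical value 2.776 is the tabulated t value for alpha/2 = 0.025 with 4 degrees of freedom, passed in by hand because the Python standard library has no t quantile function.

```python
from math import sqrt

def paired_ci(d_bar, s_d, n, t_crit):
    """Return (lower, upper) for mu_d: d_bar -/+ t_crit * s_d / sqrt(n)."""
    margin = t_crit * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin

# Hypothetical summary statistics for n = 5 pairs (not from the slides).
lower, upper = paired_ci(d_bar=2.8, s_d=2.95, n=5, t_crit=2.776)
print(round(lower, 2), round(upper, 2))
```

For these made-up numbers the interval contains 0, so the data would not rule out a mean difference of zero at the 95% level.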

Page 11: Inferences based on TWO samples

Test Statistic for the Mean Difference (Dependent Samples)

For $n < 30$ the appropriate test statistic for testing the mean difference between paired samples is

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$

with $n-1$ degrees of freedom.

For $n \ge 30$ we use $z$:

$$z = \frac{\bar{d} - \mu_d}{\sigma_d / \sqrt{n}}$$
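A minimal sketch of the small-sample paired statistic follows. The helper name `paired_t`, its parameter `mu_d0` (the value of the mean difference under the null hypothesis) and the summary statistics are all assumptions made for illustration.

```python
from math import sqrt

def paired_t(d_bar, s_d, n, mu_d0=0.0):
    """t statistic for paired data; compare against t with n - 1 d.f."""
    return (d_bar - mu_d0) / (s_d / sqrt(n))

# Hypothetical summary statistics for n = 5 pairs (not from the slides).
n = 5
t = paired_t(d_bar=2.8, s_d=2.95, n=n)
df = n - 1

print(round(t, 3), df)
```

The resulting value is then compared with the tabulated t critical value for n - 1 degrees of freedom at the chosen level of significance.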

Page 12: Inferences based on TWO samples

We now turn to the more challenging case of independent samples

Page 13: Inferences based on TWO samples

Testing Claims about the Mean Difference (Independent Samples)

• When making claims about the mean difference between independent samples a different procedure is used than that for dependent/paired samples.

• Again there are different procedures for large samples (n ≥ 30) and small samples (n < 30).

• In the small sample case, we must assume that both populations are normal and have equal variances.

Page 14: Inferences based on TWO samples

Example

Suppose we wish to compare two brands of 9-volt batteries, Brand 1 and Brand 2. Specifically, we would like to compare the mean life for the population of batteries of Brand 1, $\mu_1$, and the mean life for the population of batteries of Brand 2, $\mu_2$. To obtain a meaningful comparison we shall estimate the difference of the two population means by picking samples from the two populations.

Page 15: Inferences based on TWO samples

For Brand 1 a sample of size 64 was chosen: $\bar{x}_1 = 7.13$, $s_1 = 1.4$.

For Brand 2 a sample of size 49 was chosen: $\bar{x}_2 = 7.78$, $s_2 = 1.2$.

From the data a point estimate for $\mu_1$ would be 7.13. From the data a point estimate for $\mu_2$ would be 7.78.

It would therefore be natural for us to take $-0.65$ hours as a point estimate for $(\mu_1 - \mu_2)$.

Page 16: Inferences based on TWO samples

Point Estimator (Independent Samples)

The estimate $\bar{x}_1 - \bar{x}_2$ is the best point estimator of $(\mu_1 - \mu_2)$.

Having found a point estimate, our next goal is to determine a confidence interval for $(\mu_1 - \mu_2)$.

Page 17: Inferences based on TWO samples

Point Estimator (Independent Samples)

To construct a confidence interval for $(\mu_1 - \mu_2)$ we need to know the distribution of its point estimator.

The distribution of $\bar{x}_1 - \bar{x}_2$ is normal with mean $(\mu_1 - \mu_2)$ and standard deviation

$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

where $n_1$ is the size of sample 1 and $n_2$ is the size of sample 2.

Page 18: Inferences based on TWO samples

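Confidence Interval for Difference in Two Means (Large samples or known variance)

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$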

Page 19: Inferences based on TWO samples

Example: Life span of Batteries

Let $\alpha = .05$, so we are looking for the 95% confidence interval for the difference between the means.

$$-0.65 \pm 1.96\sqrt{\frac{1.4^2}{64} + \frac{1.2^2}{49}} = -0.65 \pm 0.48$$

$$-1.13 < \mu_1 - \mu_2 < -0.17$$

What conclusion can you draw from the above?

Page 20: Inferences based on TWO samples

Example: Life span of Batteries

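Let $\alpha = .05$, so we are looking for the 95% confidence interval for the difference between the means.

$$-0.65 \pm 1.96\sqrt{\frac{1.4^2}{64} + \frac{1.2^2}{49}} = -0.65 \pm 0.48$$

$$-1.13 < \mu_1 - \mu_2 < -0.17$$

The whole interval lies below zero, so we are 95% confident that the difference is negative. Thus, we are 95% confident that $\mu_1 - \mu_2 < 0$, that is, $\mu_1 < \mu_2$.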
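The battery interval can be checked with a short sketch. It assumes, as the large-sample procedure does, that the sample standard deviations stand in for the population ones. The function name `two_mean_ci` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than rounded to 1.96.

```python
from math import sqrt
from statistics import NormalDist

def two_mean_ci(x1, s1, n1, x2, s2, n2, alpha=0.05):
    """Large-sample CI for mu1 - mu2 using the normal critical value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    se = sqrt(s1**2 / n1 + s2**2 / n2)        # standard error of x1 - x2
    diff = x1 - x2
    return diff - z * se, diff + z * se

# Battery data from the example: Brand 1 (n = 64), Brand 2 (n = 49).
lower, upper = two_mean_ci(7.13, 1.4, 64, 7.78, 1.2, 49)
print(round(lower, 2), round(upper, 2))
```

Both endpoints come out negative, which is what supports the conclusion that Brand 1 has the shorter mean life.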

Page 21: Inferences based on TWO samples

Test Statistic for Two Means: Independent and large samples

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Page 22: Inferences based on TWO samples

Example: Life span of Batteries

• Hypothesis Testing. I claim that the two brands of batteries do not have the same life span. Using a 5% level of significance, test this claim.

Page 23: Inferences based on TWO samples

Example: Life span of Batteries

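• Hypothesis

$$H_0: \mu_1 = \mu_2 \qquad H_A: \mu_1 \ne \mu_2$$

• Sample Data

$$\bar{x}_1 = 7.13,\quad s_1 = 1.4,\quad n_1 = 64 \qquad \bar{x}_2 = 7.78,\quad s_2 = 1.2,\quad n_2 = 49$$

• Test Statistic

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(7.13 - 7.78) - 0}{\sqrt{\dfrac{1.4^2}{64} + \dfrac{1.2^2}{49}}} = -2.65$$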
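The test statistic for the battery data can be reproduced with the sketch below. The function name `two_sample_z` and the parameter `delta0` (the hypothesized value of mu1 - mu2 under H0) are assumptions of this sketch, and the sample standard deviations are used in place of the population ones because both samples are large.

```python
from math import sqrt

def two_sample_z(x1, s1, n1, x2, s2, n2, delta0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return ((x1 - x2) - delta0) / se

# Battery data from the example.
z = two_sample_z(7.13, 1.4, 64, 7.78, 1.2, 49)
print(round(z, 2))  # -2.65
```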

Page 24: Inferences based on TWO samples

Example: Life span of Batteries

• Critical Region: for a two-tailed test at the 5% level, reject H0 if z < -1.96 or z > 1.96.

• Decision: The test statistic z = -2.65 lies in the critical region, therefore we reject H0. The samples provide sufficient evidence to claim that the batteries do indeed have different life spans.

Page 25: Inferences based on TWO samples

Exercise

Show that we would have rejected the null hypothesis even if we had used level of significance .008 (instead of .05). Thus…

We conclude that the mean battery lives ARE different (p = .008)
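One way to approach the exercise is to compute the two-tailed p-value directly. The sketch below recomputes z from the battery data and uses `statistics.NormalDist` for the normal tail area; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Battery data from the example.
x1, s1, n1 = 7.13, 1.4, 64
x2, s2, n2 = 7.78, 1.2, 49

z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

# Critical-value check at the 5% level.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_at_05 = abs(z) > z_crit

print(round(p_value, 3), reject_at_05)
```

The null hypothesis is rejected at any level of significance that is at least as large as the p-value, which is the reasoning the exercise asks for.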

Page 26: Inferences based on TWO samples

Overview

• Comparing Two Populations:

• Mean (Dependent (paired) Samples)

– Assumptions: Samples are random, plus either n >= 30 or the population of differences is approximately normal.

• Mean (Large Independent Samples)

– Assumptions: Both samples are randomly chosen, plus both sample sizes >= 30.

Page 28: Inferences based on TWO samples

NOTE

In the case of SMALL independent samples, one must use the t-distribution, additional conditions must be satisfied, AND one must use what is called a pooled estimate of the variance.

You are not responsible for this material.