week 1_hypothesis testing (wk 1)

71
Hypothesis Testing Niniet Indah Arvitrida, ST, MT [email protected] SepuluhNopember Institute of Technology INDONESIA 2008

Upload: dewyekha

Post on 13-Dec-2015

217 views

Category:

Documents


1 download

DESCRIPTION

HYPOTESIS TESTING UPLOAD

TRANSCRIPT

Page 1: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Testing

Niniet Indah Arvitrida, ST, [email protected]

SepuluhNopember Institute of TechnologyINDONESIA

2008

Page 2: Week 1_Hypothesis Testing (Wk 1)

A bit concept review...Statistics is ...Population is ...Sample is ...Variable is ...Parameter is ...Relation between statistics & parameters ...Difference between statistics and probabilityStatistics assumptions :

Population and its parameters are (unknown / known)Sample is used to ..... parameters

Page 3: Week 1_Hypothesis Testing (Wk 1)

Course 1 ObjectivesStudents able toFormulate null and alternative hypotheses Formulate null and alternative hypotheses

for applications involving a single population mean or proportion

Formulate a decision rule for testing a testing a hypothesishypothesis

Know how to use the test statistic, critical value, and p-value approaches to test the null hypothesis

Know what Type I Type I and Type II errors Type II errors are

Compute the probability of a Type II error

Page 4: Week 1_Hypothesis Testing (Wk 1)

Review of Estimating Review of Estimating Population ValuePopulation Value

Page 5: Week 1_Hypothesis Testing (Wk 1)

Point and Interval Estimates

A point estimate is a single number, a confidence interval provides additional

information about variability

Point Estimate

Lower

Confidence

Limit

Upper

Confidence

Limit

Width of confidence interval

Page 6: Week 1_Hypothesis Testing (Wk 1)

Point Estimates

We can estimate a Population Parameter …

with a SampleStatistic

(a Point Estimate)

Mean

Proportion pp

Page 7: Week 1_Hypothesis Testing (Wk 1)

Confidence Intervals

How much uncertainty is associated with a point estimate of a population parameter?

An interval estimate provides more information about a population characteristic than does a point estimate

Such interval estimates are called confidence intervals

Page 8: Week 1_Hypothesis Testing (Wk 1)

Estimation Process

(mean, μ, is unknown)

Population

Random Sample

Mean x = 50

Sample

I am 95% confident that μ is between 40 & 60.

Page 9: Week 1_Hypothesis Testing (Wk 1)

Confidence Level

Confidence Level

Confidence in which the interval will contain the unknown population parameter

A percentage (less than 100%)

Page 10: Week 1_Hypothesis Testing (Wk 1)

Confidence Interval for μ(σ Known)

AssumptionsPopulation standard deviation σ is knownPopulation is normally distributedIf population is not normal, use large

sample

Confidence interval estimate

n

σzx α/2

Page 11: Week 1_Hypothesis Testing (Wk 1)

Finding the Critical ValueConsider a 95% confidence interval:

z.025= -1.96 z.025= 1.96

.951

.0252

α .025

2

α

Point EstimateLower Confidence Limit

UpperConfidence Limit

z units:

x units: Point Estimate

0

1.96zα/2

Page 12: Week 1_Hypothesis Testing (Wk 1)

Margin of ErrorMargin of Error (e): the amount added and

subtracted to the point estimate to form the confidence interval

n

σzx /2

n

σze /2

Example: Margin of error for estimating μ, σ known:

Page 13: Week 1_Hypothesis Testing (Wk 1)

Chap 7-15

Factors Affecting Margin of Error

Data variation, σ : e as σ

Sample size, n : e as n

Level of confidence, 1 - : e if 1 -

n

σze /2

Page 14: Week 1_Hypothesis Testing (Wk 1)

Chap 7-16

Example

A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is .35 ohms.

Determine a 95% confidence interval for the true mean resistance of the population.

Page 15: Week 1_Hypothesis Testing (Wk 1)

Chap 7-17

2.4068 ...............1.9932

.2068 2.20

)11(.35/ 1.96 2.20

n

σz x /2

ExampleA sample of 11 circuits from a large normal

population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is .35 ohms.

Solution:

(continued)

Page 16: Week 1_Hypothesis Testing (Wk 1)

Chap 7-18

Interpretation

We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms

Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean

An incorrect interpretation is that there is 95% probability

that this interval contains the true population mean.

(This interval either does or does not contain the true mean, there is no probability for a single interval)

Page 17: Week 1_Hypothesis Testing (Wk 1)

Chap 7-19

Student’s t Distribution

t0

t (df = 5)

t (df = 13)t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

Standard Normal

(t with df = )

Note: t z as n increases

Page 18: Week 1_Hypothesis Testing (Wk 1)

Chap 7-20

Approximation for Large Samples

Since t approaches z as the sample size increases, an approximation is sometimes used when n 30:

n

stx /2

n

szx /2

Technically correct

Approximation for large n

Page 19: Week 1_Hypothesis Testing (Wk 1)

Now we talk about Now we talk about HYPOTHESISHYPOTHESIS

Page 20: Week 1_Hypothesis Testing (Wk 1)

What is a Hypothesis?A hypothesis is a claim

(assumption) about a population population parameter:

population meanmean

population proportionproportion

Example: The mean monthly cell phone bill of this city is = $42

Example: The proportion of adults in this city with cell phones is p = .68

Page 21: Week 1_Hypothesis Testing (Wk 1)

The Null Hypothesis, H0

States the assumptionassumption (numerical) to be tested

Example: The average number of TV sets in U.S. Homes is at least three ( )

Is alwaysalways about a populationpopulation

parameter, not about a sample statistic

3μ:H0

3μ:H0 3x:H0

Page 22: Week 1_Hypothesis Testing (Wk 1)

The Null Hypothesis, H0

Begin Begin with the assumption that the null hypothesis is truenull hypothesis is trueSimilar to the notion of innocent until proven guilty

Always contains “=”“=” , “≤”“≤” or ““” ” sign

May oror may not be rejected

(continued)

Page 23: Week 1_Hypothesis Testing (Wk 1)

The Alternative Hypothesis, HA

Is the opposite of the null hypothesise.g.: The average number of TV sets in

U.S. homes is less than 3 ( HA: < 3 )

NeverNever contains the “=” , “≤” or “” sign

May or may not be acceptedIs generally the hypothesis that is

believed (or needs to be supported) by the researcher

Page 24: Week 1_Hypothesis Testing (Wk 1)

Population

Claim: thepopulationmean age is 50.(Null Hypothesis:

REJECT

Supposethe samplemean age is 20: x = 20

SampleNull Hypothesis

20 likely if = 50?Is

Hypothesis Testing Process

If not likely,

Now select a random sample

H0: = 50 )

x

Page 25: Week 1_Hypothesis Testing (Wk 1)

Reason for Rejecting H0

Sampling Distribution of x

= 50If H0 is true

If it is unlikely that we would get a sample mean of this value ...

... then we reject the null

hypothesis that = 50.

20

... if in fact this were the population mean…

x

Page 26: Week 1_Hypothesis Testing (Wk 1)

Level of Significance, Defines unlikely values of sample statistic

if null hypothesis is true

Defines rejection region rejection region of the sampling distribution

Is designated by , (level of significance)

Typical values are .01, .05, or .10Is selected by the researcher at the beginning

Provides the critical value(s) of the test

Page 27: Week 1_Hypothesis Testing (Wk 1)

Level of Significance and the Rejection Region

H0: μ ≥ 3

HA: μ < 3 0

H0: μ ≤ 3

HA: μ > 3

H0: μ = 3

HA: μ ≠ 3

/2

Represents critical value

Lower tail test

Level of significance =

0

0

/2

Upper tail test

Two tailed test

Rejection region is shaded

1-α

1-α

1-α

Page 28: Week 1_Hypothesis Testing (Wk 1)

Errors in Making DecisionsType I Error

Reject a true null hypothesisConsidered a serious type of error

The probability of Type I Error is Called level of significance level of significance of the

testSet by researcher in advance

Page 29: Week 1_Hypothesis Testing (Wk 1)

Errors in Making Decisions

Type II ErrorFail to reject a false null hypothesis

The probability of Type II Error is β

(continued)

Page 30: Week 1_Hypothesis Testing (Wk 1)

Outcomes and Probabilities

State of Nature

Decision

Do NotReject

H0

No error (1 - )

Type II Error ( β )

RejectH0

Type I Error( )

Possible Hypothesis Test Outcomes

H0 False H0 True

Key:Outcome

(Probability) No Error ( 1 - β )

Page 31: Week 1_Hypothesis Testing (Wk 1)

Type I & II Error Relationship

Type I and Type II errors can not happen at the same time

Type I error can only occur if H0 is true

Type II error can only occur if H0 is false

If Type I error probability ( ) , then

Type II error probability ( β )

Page 32: Week 1_Hypothesis Testing (Wk 1)

Factors Affecting Type II Error

All else equal, β when the difference between

hypothesized parameter and its true value

β when

β when σ

β when n

Page 33: Week 1_Hypothesis Testing (Wk 1)

Critical Value Approach to Testing

Convert sample statistic (e.g.: ) to test statistic ( Z or t statistic )

Determine the critical value(s) for a specifiedlevel of significance from a table or computer

If the test statistic falls in the rejection

region, reject H0 ; otherwise do not reject H0

x

Page 34: Week 1_Hypothesis Testing (Wk 1)

Lower Tail Tests

Reject H0 Do not reject H0

The cutoff value,

or , is called a

critical value

-zα

-zα xα

0

μ

H0: μ ≥ 3

HA: μ < 3

n

σzμx

Page 35: Week 1_Hypothesis Testing (Wk 1)

Upper Tail Tests

Reject H0Do not reject H0

The cutoff value,

or , is called a

critical value

zα xα

H0: μ ≤ 3

HA: μ > 3

n

σzμx

Page 36: Week 1_Hypothesis Testing (Wk 1)

Two Tailed Tests

Do not reject H0 Reject H0Reject H0

There are two cutoff

values (critical values):

or /2

-zα/2

xα/2

± zα/2

xα/2

0μ0

H0: μ = 3

HA: μ

3

zα/2

xα/2

n

σzμx /2/2

Lower

Upperxα/2

Lower Upper

/2

Page 37: Week 1_Hypothesis Testing (Wk 1)

Critical Value Approach to Testing

Convert sample statistic ( ) to a test statistic ( Z or t statistic )

x

Known

Large Samples

Unknown

Hypothesis Tests for

Small Samples

Page 38: Week 1_Hypothesis Testing (Wk 1)

Calculating the Test Statistic

Known

Large Samples

Unknown

Hypothesis Tests for μ

Small Samples

The test statistic is:

n

σμx

z

Page 39: Week 1_Hypothesis Testing (Wk 1)

Calculating the Test Statistic

Known

Large Samples

Unknown

Hypothesis Tests for

Small Samples

The test statistic is:

n

sμx

t 1n

But is sometimes approximated using a z:

n

σμx

z

(continued)

Page 40: Week 1_Hypothesis Testing (Wk 1)

Calculating the Test Statistic

Known

Large Samples

Unknown

Hypothesis Tests for

Small Samples

The test statistic is:

n

sμx

t 1n

(The population must be approximately normal)

(continued)

Page 41: Week 1_Hypothesis Testing (Wk 1)

Review: Steps in Hypothesis Testing

1. Specify the population value of interest

2. Formulate the appropriate null and alternative hypotheses

3. Specify the desired level of significance

4. Determine the rejection region

5. Obtain sample evidence and compute the test statistic

6. Reach a decision and interpret the result

Page 42: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Testing Example

Test the claim that the true mean # of TV sets in US homes is at least 3.

(Assume σ = 0.8)

1. Specify the population value of interest The mean number of TVs in US homes

2. Formulate the appropriate null and alternative hypotheses H0: μ 3 HA: μ < 3 (This is a lower tail test)

3. Specify the desired level of significance Suppose that = .05 is chosen for this test

Page 43: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Testing Example4. Determine the rejection region

Reject H0 Do not reject H0

= .05

-zα= -1.645 0

This is a one-tailed test with = .05. Since σ is known, the cutoff value is a z value:

Reject H0 if z < z = -1.645 ; otherwise do not reject H0

(continued)

Page 44: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Testing Example5. Obtain sample evidence and compute the

test statistic

Suppose a sample is taken with the following results: n = 100, x = 2.84 ( = 0.8 is assumed known)

Then the test statistic is:

2.0.08

.16

100

0.832.84

n

σμx

z

Page 45: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Testing Example 6. Reach a decision and interpret the result

Reject H0 Do not reject H0

= .05

-1.645 0

-2.0

Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3

(continued)

z

Page 46: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Testing Example An alternate way of constructing rejection region:

Reject H0

= .05

2.8684

Do not reject H0

3

2.84

Since x = 2.84 < 2.8684, we reject the null hypothesis

(continued)

x

Now expressed in x, not z units

2.8684100

0.81.6453

n

σzμx αα

Page 47: Week 1_Hypothesis Testing (Wk 1)

p-Value Approach to Testing

Convert Sample Statistic (e.g. ) to Test Statistic ( Z or t statistic )

Obtain the p-value from a table or computer

Compare the p-value with

If p-value < , reject H0

If p-value , do not reject H0

x

Page 48: Week 1_Hypothesis Testing (Wk 1)

p-Value Approach to Testing

p-value: Probability of obtaining a test statistic more extreme ( ≤ or ) than the observed sample value given H0 is true

Also called observed level of significance

Smallest value of for which H0 can be

rejected

(continued)

Page 49: Week 1_Hypothesis Testing (Wk 1)

p-value exampleExample: How likely is it to see a sample

mean of 2.84 (or something further below the mean) if the true mean is = 3.0?

p-value =.0228

= .05

2.8684 3

2.84

x

.02282.0)P(z

1000.8

3.02.84zP

3.0)μ|2.84xP(

Page 50: Week 1_Hypothesis Testing (Wk 1)

p-value exampleCompare the p-value with

If p-value < , reject H0

If p-value , do not reject H0

Here: p-value = .0228 = .05

Since .0228 < .05, we reject the null hypothesis

(continued)

p-value =.0228

= .05

2.8684 3

2.84

Page 51: Week 1_Hypothesis Testing (Wk 1)

Example: Upper Tail z Test for Mean ( Known)

A phone industry manager thinks that customer monthly cell phone bill have increased, and now average over $52 per month. The company wishes to test this claim. (Assume = 10 is known)

H0: μ ≤ 52 the average is not over $52 per month

HA: μ > 52 the average is greater than $52 per month(i.e., sufficient evidence exists to support the manager’s claim)

Form hypothesis test:

Page 52: Week 1_Hypothesis Testing (Wk 1)

Suppose that = .10 is chosen for this test

Find the rejection region:

Reject H0Do not reject H0

= .10

zα=1.280

Reject H0

Reject H0 if z > 1.28

Example: Find Rejection Region

(continued)

Page 53: Week 1_Hypothesis Testing (Wk 1)

Review:Finding Critical Value - One Tail

Z .07 .09

1.1 .3790 .3810 .3830

1.2 .3980 .4015

1.3 .4147 .4162 .4177z 0 1.28

.08

Standard Normal Distribution Table (Portion)What is z given =

0.10?

= .10

Critical Value = 1.28

.90

.3997

.10

.40.50

Page 54: Week 1_Hypothesis Testing (Wk 1)

Example: Test Statistic

0.88

64

105253.1

n

σμx

z

(continued)

Obtain sample evidence and compute the test statistic

Suppose a sample is taken with the following results:

n = 64, x = 53.1 (=10 was assumed known)

Then the test statistic is:

Page 55: Week 1_Hypothesis Testing (Wk 1)

Example: DecisionReach a decision and interpret the result:

Reject H0Do not reject H0

= .10

1.280

Reject H0

Do not reject H0 since z = 0.88 ≤ 1.28

i.e.: there is not sufficient evidence that the mean bill is over $52

z = .88

(continued)

Page 56: Week 1_Hypothesis Testing (Wk 1)

p -Value SolutionCalculate the p-value and compare to

Reject H0

= .10

Do not reject H0 1.28

0

Reject H0

z = .88

(continued)

.1894

.3106.50.88)P(z

6410

52.053.1zP

52.0)μ|53.1xP(

p-value = .1894

Do not reject H0 since p-value = .1894 > = .10

Page 57: Week 1_Hypothesis Testing (Wk 1)

Example: Two-Tail Test( Unknown)

The average cost of a hotel room in New York is said to be $168 per night. A random sample of 25 hotels resulted in x = $172.50 and

s = $15.40. Test at the = 0.05 level.

(Assume the population distribution is normal)

H0: μ= 168

HA: μ

168

Page 58: Week 1_Hypothesis Testing (Wk 1)

Example Solution: Two-Tail Test

= 0.05

n = 25

is unknown, so use a t statistic

Critical Value:

t24 = ± 2.0639 Do not reject H0: not sufficient evidence that true mean cost is different than $168

Reject H0Reject H0

/2=.025

-tα/2

Do not reject H0

0 tα/2

/2=.025

-2.0639 2.0639

1.46

25

15.40168172.50

n

sμx

t 1n

1.46

H0: μ= 168

HA: μ

168

Page 59: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Tests for Proportions

Involves categorical values

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possesses that characteristic)

Fraction or proportion of population in the “success” category is denoted by p

Page 60: Week 1_Hypothesis Testing (Wk 1)

ProportionsSample proportion in the success category is

denoted by p

When both np and n(1-p) are at least 5, p can be approximated by a normal distribution with mean and standard deviation

(continued)

sizesample

sampleinsuccessesofnumber

n

xp

pμP

n

p)p(1σ

p

Page 61: Week 1_Hypothesis Testing (Wk 1)

Hypothesis Tests for Proportions

The sampling distribution of p is normal, so the test statistic is a z value:

n)p(p

ppz

1

np 5and

n(1-p) 5

Hypothesis Tests for p

np < 5or

n(1-p) < 5

Not discussed in this chapter

Page 62: Week 1_Hypothesis Testing (Wk 1)

Example: z Test for Proportion

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the = .05 significance level.

Check:

n p = (500)(.08) = 40

n(1-p) = (500)(.92) = 460

Page 63: Week 1_Hypothesis Testing (Wk 1)

Z Test for Proportion: Solution

= .05

n = 500, p = .05

Reject H0 at = .05

H0: p = .08

HA: p

.08

Critical Values: ± 1.96

Test Statistic:

Decision:

Conclusion:

z0

Reject Reject

.025.025

1.96

-2.47

There is sufficient evidence to reject the company’s claim of 8% response rate.

2.47

500.08).08(1

.08.05

np)p(1

ppz

-1.96

Page 64: Week 1_Hypothesis Testing (Wk 1)

p -Value SolutionCalculate the p-value and compare to (For a two sided test the p-value is always two sided)

Do not reject H0

Reject H0Reject H0

/2 = .025

1.960

z = -2.47

(continued)

0.01362(.0068)

.4932)2(.5

2.47)P(x2.47)P(z

p-value = .0136:

Reject H0 since p-value = .0136 < = .05

z = 2.47

-1.96

/2 = .025

.0068.0068

Page 65: Week 1_Hypothesis Testing (Wk 1)

Type II ErrorType II error is the probability of failing to reject a false H0

Reject H0: μ 52

Do not reject H0 : μ 52

5250

Suppose we fail to reject H0: μ 52 when in fact the true mean is μ = 50

Page 66: Week 1_Hypothesis Testing (Wk 1)

Type II ErrorSuppose we do not reject H0: 52 when

in fact the true mean is = 50

Reject H0: 52

Do not reject H0 : 52

5250

This is the true distribution of x if = 50

This is the range of x where H0 is not rejected

(continued)

Page 67: Week 1_Hypothesis Testing (Wk 1)

Type II ErrorSuppose we do not reject H0: μ 52

when in fact the true mean is μ = 50

Reject H0: μ 52

Do not reject H0 : μ 52

5250

β

Here, β = P( x cutoff ) if μ = 50

(continued)

Page 68: Week 1_Hypothesis Testing (Wk 1)

Calculating βSuppose n = 64 , σ = 6 , and = .05

Reject H0: μ 52

Do not reject H0 : μ 52

5250

So β = P( x 50.766 ) if μ = 50

50.76664

61.64552

n

σzμxcutoff

(for H0 : μ 52)

50.766

Page 69: Week 1_Hypothesis Testing (Wk 1)

Calculating βSuppose n = 64 , σ = 6 , and = .05

Reject H0: μ 52

Do not reject H0 : μ 52

.1539.3461.51.02)P(z

646

5050.766zP50)μ|50.766xP(

5250

(continued)

Probability of type II error:

β = .1539

Page 70: Week 1_Hypothesis Testing (Wk 1)

Chapter SummaryAddressed hypothesis testing

methodology

Performed z Test for the mean (σ known)

Discussed p–value approach to hypothesis testing

Performed one-tail and two-tail tests . . .

Page 71: Week 1_Hypothesis Testing (Wk 1)

Chapter Summary

Performed t test for the mean (σ unknown)

Performed z test for the proportion

Discussed type II error and computed its probability

(continued)