inferensi statistika

Post on 03-Jan-2016

84 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

DESCRIPTION

Inferensi Statistika. Confidence Interval. Estimation Process. Population. Random Sample. I am 95% confident that  is between 40 & 60, which is a point estimate + or - margin of error. Mean X = 50. Mean,  , is unknown. Sample. Confidence Interval Estimation. - PowerPoint PPT Presentation

TRANSCRIPT

Inferensi Statistika

Confidence Interval

Mean, , is unknown

Population Random SampleI am 95%

confident that is between 40 &

60, which is

a point estimate + or - margin of

error.

Mean X = 50

Estimation Process

Sample

Provides Range of Values Based on Observations from 1 Sample

Gives Information about Closeness to Unknown Population Parameter

Stated in terms of Probability Never 100% Sure

Confidence Interval Estimation

Confidence Interval Sample Statistic

Confidence Limit (Lower)

Confidence Limit (Upper)

A Probability (confidence level, denoted by C) That the Population Parameter Falls

Somewhere Within the Interval.

Elements of Confidence Interval Estimation

90% Samples

95% Samples

estimate_

Confidence Intervals

xx .. 64516451

xx 96.196.1

xx .. 582582 99% Samples

nzXzX X

**

X_

Probability that the unknown population parameter falls

within the interval

Denoted by C = level of confidence

e.g. C=90%, 95%, 99%.

Level of Confidence

Confidence Intervals

Intervals Extend from

C proportion of Intervals Contains .

1-C proportion Does Not.

Area=C (1-C)/2(1-C)/2

X_

x_

Intervals & Level of Confidence

Sampling Distribution of

the Mean

toXzX *

XzX *

X

Data Variation

measured by Sample Size

Level of Confidence

C

Intervals Extend from

Factors Affecting Interval Width

X - z* to X + z*

xx

n/XX

Assumptions Population Standard Deviation Is Known Population Is Normally Distributed If Not Normal, use large samples

Confidence Interval Estimate

Confidence Intervals (Known)

nzX

* n

zX

*

Assumptions Population Standard Deviation Is

Unknown Population Must Be Normally Distributed

Use Student’s t Distribution

Confidence Interval Estimate

Confidence Intervals (Unknown)

n

StX n 1

* n

StX n 1

*

Zt

0

t (df = 5)

Standard Normal

t (df = 13)Bell-Shaped

Symmetric

‘Fatter’ Tails

Student’s t Distribution

Upper Tail Area

df .25 .10 .05

1 1.000 3.078 6.314

2 0.817 1.886 2.920

3 0.765 1.638 2.353

t0

Assume: n = 3 df = n - 1 = 2

= .10 /2 =.05

2.920t Values

/ 2=(1-C)/2

.05

Student’s t Table

A random sample of n = 25 has = 50 and s = 8. Set up a 95% confidence interval estimate for .

. .46 69 53 30

X

Example: Interval EstimationUnknown

n

StX n 1

* n

StX n 1

*

25

80639250 . 25

80639250 .

Sample Size

Too Big:•Requires toomuch resources

Too Small:•Won’t do the job

What sample size is needed to be 90% confident of being correct

within ± 5? A pilot study suggested that the standard

deviation is 45.

nZ

m

2 2

2

2 2

2

1645 45

5219 2 220

..

Example: Sample Size for Mean

Round Up

Test of Significance

The second method of making statistical inference.

It is to test a Hypothesis about the population parameters

A hypothesis is an assumption about the population parameter.

A parameter is a Population mean or proportion

The parameter must be identified before analysis.

I assume the mean GPA of this class is 3.5!

What is a Hypothesis?

States the Assumption (numerical) to be tested

e.g. The average # TV sets in US homes is at 3 (H0: 3)

Begin with the assumption that the null hypothesis is TRUE.

(Similar to the notion of innocent until proven guilty)

The Null Hypothesis, H0

•Refers to the Status Quo

•The Null Hypothesis may or may not be rejected.

Is the opposite of the null hypothesis e.g. The average # TV sets in US homes is less than 3 (Ha: < 3) or is NOT equal to 3 (Ha: 3).

Challenges the Status Quo Never contains the ‘=‘ sign The Alternative Hypothesis may

or may not be accepted

The Alternative Hypothesis, Ha

State the Null Hypothesis (H0: 3)

State the Alternative Hypothesis (Ha: < 3 or Ha: ) as the conclusion if H0 is not true.

Logically, if the alternative Ha: < 3 is not true, we can accept either Ho: > 3 or Ho: = 3.

Identify the Problem

Population

Assume thepopulationmean age is 50.(Null Hypothesis)

REJECT

The SampleMean Is 20

SampleNull Hypothesis

50?20 XIs

Hypothesis Testing Process

No, not likely!

Sample Mean = 50

Sampling DistributionIt is unlikely that we would get a sample mean of this value ...

... if in fact this were the population mean.

... Therefore, we reject the null

hypothesis that = 50.

20H0

Reason for Rejecting H0

Defines Unlikely Values of Sample Statistic if Null Hypothesis Is True Called Rejection Region of Sampling Distribution

Designated (alpha) Typical values are 0.01, 0.05, 0.10

Selected by the Researcher at the Start

Provides the Critical Value(s) of the Test

Level of Significance,

Level of Significance, and the Rejection Region

H0: 3

H1: < 30

0

0

H0: 3

H1: > 3

H0: 3

H1: 3

/2

Critical Value(s)

Rejection Regions

Convert Sample Statistic (e.g., ) to Standardized Z Variable

Compare to Critical Z Value(s) If Z test Statistic falls in Critical Region,

Reject H0; Otherwise Do Not Reject H0

Z-Test Statistics (Known)

Test Statistic

X

n

XXZ

X

X

1. State H0 H0 : 3

2. State H1 Ha :

3. Choose = .05

4. Choose n n = 100

5. Choose Test: Z Test (or p Value)

Hypothesis Testing: Steps

Test the Assumption that the true mean # of TV sets in US homes is at least 3.

6. Set Up Critical Value(s) Z = -1.645

7. Collect Data 100 households surveyed

8. Compute Test Statistic Computed Test Stat.= -2

9. Make Statistical Decision Reject Null Hypothesis

10. Express Decision The true mean # of TV set is less than 3 in the US

households.

Hypothesis Testing: Steps

Test the Assumption that the average # of TV sets in US homes is at least 3.

(continued)

Assumptions Population Is Normally Distributed If Not Normal, use large samples Null Hypothesis Has or Sign

Only

Z Test Statistic:

One-Tail Z Test for Mean (Known)

n

xxz

x

x

Z0

Reject H0

Z0

Reject H0

H0: H1: < 0

H0: 0 H1: > 0

Must Be Significantly Below = 0

Small values don’t contradict H0

Don’t Reject H0!

Rejection Region

Does an average box of cereal contain more than 368 grams of cereal? A random sample of 25 boxes showed X = 372.5. The company has specified to be 15 grams. Test at the 0.05 level.

368 gm.

Example: One Tail Test

H0: 368 H1: > 368

_

Z .04 .06

1.6 .5495 .5505 .5515

1.7 .5591 .5599 .5608

1.8 .5671 .5678 .5686

.5738 .5750

Z0

Z = 1

1.645

.50 -.05

.45

.05

1.9 .5744

Standardized Normal Probability Table (Portion)

What Is Z Given = 0.05?

= .05

Finding Critical Values: One Tail

Critical Value = 1.645

= 0.025n = 25Critical Value: 1.645

Test Statistic:

Decision:

Conclusion:

Do Not Reject at = .05

No Evidence True Mean Is More than 368Z0 1.645

.05

Reject

Example Solution: One Tail

H0: 368 H1: > 368 50.1

n

XZ

Probability of Obtaining a Test Statistic More Extreme or ) than Actual Sample Value Given H0 Is True

Called Observed Level of Significance Smallest Value of a H0 Can Be Rejected

Used to Make Rejection Decision If p value Do Not Reject H0

If p value <, Reject H0

p Value Approach -- used in the pdf Chapter

Z0 1.50

p Value.0668

Z Value of Sample Statistic

From Z Table: Lookup 1.50

.9332

Use the alternative hypothesis to find the direction of the test.

1.0000 - .9332 .0668

p Value is P(Z 1.50) = 0.0668

p Value Solution

0 1.50 Z

Reject

(p Value = 0.0668) ( = 0.05). Do Not Reject.

p Value = 0.0668

= 0.05

Test Statistic Is In the Do Not Reject Region

p Value Solution

Does an average box of cereal contains 368 grams of cereal? A random sample of 25 boxes showed X = 372.5. The company has specified to be 15 grams. Test at the 0.05 level.

368 gm.

Example: Two Tail Test

H0: 368

H1: 368

= 0.05n = 25Critical Value: ±1.96

Test Statistic:

Decision:

Conclusion:

Do Not Reject at = .05

No Evidence that True Mean Is Not 368Z0 1.96

.025

Reject

Example Solution: Two Tail

-1.96

.025

H0: 386

H1: 38650.1

2515

3685.372

n

XZ

Connection to Confidence Intervals

For X = 372.5oz, = 15 and n = 25,

The 95% Confidence Interval is:

372.5 - (1.96) 15/ 25 to 372.5 + (1.96) 15/ 25

or

366.62 378.38

If this interval contains the Hypothesized mean (368), we do not reject the null hypothesis.

It does. Do not reject.

_

Assumptions Population is normally distributed If not normal, only slightly skewed &

a large sample taken

Parametric test procedure t test statistic

t-Test: Unknown

nSX

t

Example: One Tail t-Test

Does an average box of cereal contain more than 368 grams of cereal? A random sample of 36 boxes showed X = 372.5, and 15. Test at the 0.01 level.

368 gm.

H0: 368 H1: 368

is not given,

= 0.01n = 36, df = 35Critical Value: 2.4377

Test Statistic:

Decision:

Conclusion:

Do Not Reject at = .01

No Evidence that True Mean Is More than 368Z0 2.4377

.01

Reject

Example Solution: One Tail

H0: 368 H1: 368 80.1

3615

3685.372

nSX

t

Type I Error Reject True Null Hypothesis Has Serious Consequences Probability of Type I Error Is

Called Level of Significance

Type II Error Do Not Reject False Null Hypothesis Probability of Type II Error Is

(Beta)

Errors in Making Decisions

H0: Innocent

Jury Trial Hypothesis Test

Actual Situation Actual Situation

Verdict Innocent Guilty Decision H0 True H0 False

Innocent Correct ErrorDo NotReject

H0

1 - Type IIError ( )

Guilty Error Correct RejectH0

Type IError( )

Power(1 - )

Result Possibilities

Reduce probability of one error and the other one goes up.

& Have an Inverse Relationship

True Value of Population Parameter Increases When Difference Between

Hypothesized Parameter & True Value Decreases

Significance Level Increases When Decreases

Population Standard Deviation Increases When Increases

Sample Size n Increases When n Decreases

Factors Affecting Type II Error,

n

rt test =n (r i test)2

---------------------------1 + (n-1) (r i test)2

top related