Statistics for Business and Economics: Chapter 14

Slides Prepared by JOHN S. LOUCKS, St. Edward’s University, © 2002 South-Western/Thomson Learning

Upload: balo

Post on 19-Feb-2016


DESCRIPTION

Statistics for Business and Economics: Chapter 14. Course material for Statistics for Business and Economics (Anderson, Sweeney, Williams), Chapter 14.

TRANSCRIPT

Page 1: Statistics for Business and Economics: bab  14

1 Slide

Slides Prepared by JOHN S. LOUCKS

St. Edward’s University

© 2002 South-Western/Thomson Learning

Page 2: Statistics for Business and Economics: bab  14

2 Slide

Chapter 14 Simple Linear Regression

• Simple Linear Regression Model
• Least Squares Method
• Coefficient of Determination
• Model Assumptions
• Testing for Significance
• Using the Estimated Regression Equation for Estimation and Prediction
• Computer Solution
• Residual Analysis: Validating Model Assumptions
• Residual Analysis: Outliers and Influential Observations

Page 3: Statistics for Business and Economics: bab  14

3 Slide

The Simple Linear Regression Model

Simple Linear Regression Model
$y = \beta_0 + \beta_1 x + \epsilon$

Simple Linear Regression Equation
$E(y) = \beta_0 + \beta_1 x$

Estimated Simple Linear Regression Equation
$\hat{y} = b_0 + b_1 x$

Page 4: Statistics for Business and Economics: bab  14

4 Slide

Least Squares Method

Least Squares Criterion
$\min \sum (y_i - \hat{y}_i)^2$

where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation

Page 5: Statistics for Business and Economics: bab  14

5 Slide

The Least Squares Method

Slope for the Estimated Regression Equation
$b_1 = \dfrac{\sum x_i y_i - (\sum x_i \sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$

y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x}$

where:
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
$n$ = total number of observations

Page 6: Statistics for Business and Economics: bab  14

6 Slide

Example: Reed Auto Sales

Simple Linear Regression
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

Number of TV Ads    Number of Cars Sold
1                   14
3                   24
2                   18
1                   17
3                   27

Page 7: Statistics for Business and Economics: bab  14

7 Slide

Example: Reed Auto Sales

Slope for the Estimated Regression Equation
$b_1 = \dfrac{220 - (10)(100)/5}{24 - (10)^2/5} = 5$

y-Intercept for the Estimated Regression Equation
$b_0 = 20 - 5(2) = 10$

Estimated Regression Equation
$\hat{y} = 10 + 5x$
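The slope and intercept arithmetic for the Reed Auto data can be checked with a short script. This is a minimal sketch, not part of the original slides; the names `tv_ads`, `cars_sold`, and `least_squares` are illustrative assumptions.

```python
# Reed Auto Sales sample from the slides: (TV ads, cars sold) for 5 sales.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]


def least_squares(x, y):
    """Return (b0, b1) using the computational formulas from the slides."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: (sum of xy - (sum x)(sum y)/n) / (sum of x^2 - (sum x)^2/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y-bar minus slope times x-bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(tv_ads, cars_sold)
print(f"y-hat = {b0:g} + {b1:g}x")  # prints "y-hat = 10 + 5x"
```

The intermediate sums are 220, 10, 100, and 24, the same values that appear in the slide's slope calculation.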

Page 8: Statistics for Business and Economics: bab  14

8 Slide

Example: Reed Auto Sales

Scatter Diagram
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4), with the fitted line y = 5x + 10]

Page 9: Statistics for Business and Economics: bab  14

9 Slide

The Coefficient of Determination

Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

Coefficient of Determination
$r^2 = \text{SSR}/\text{SST}$

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

Page 10: Statistics for Business and Economics: bab  14

10 Slide

Example: Reed Auto Sales

Coefficient of Determination
$r^2 = \text{SSR}/\text{SST} = 100/114 = .8772$
The regression relationship is very strong since 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
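The values SSR = 100 and SST = 114 are quoted without the intermediate steps. The sketch below recomputes the three sums of squares from the Reed Auto data, taking the estimated equation (intercept 10, slope 5) as given; the variable names are illustrative.

```python
x = [1, 3, 2, 1, 3]           # number of TV ads
y = [14, 24, 18, 17, 27]      # number of cars sold
b0, b1 = 10.0, 5.0            # estimated regression equation from the slides

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # due to regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # due to error
r_squared = ssr / sst

print(sst, ssr, sse, round(r_squared, 4))  # prints "114.0 100.0 14.0 0.8772"
```

The decomposition SST = SSR + SSE holds here as 114 = 100 + 14.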

Page 11: Statistics for Business and Economics: bab  14

11 Slide

The Correlation Coefficient

Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

Page 12: Statistics for Business and Economics: bab  14

12 Slide

Example: Reed Auto Sales

Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is “+”.

$r_{xy} = +\sqrt{.8772}$
$r_{xy} = +.9366$
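As a cross-check, the correlation obtained from the sign of the slope and the square root of r² can be compared with the product-moment definition computed directly from the data. The direct formula is a standard one and is an assumption here; it does not appear on the slides.

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b1 = 5.0                  # slope from the slides
r_squared = 100 / 114     # SSR/SST from the slides

# Route 1 (slides): sign of b1 times the square root of r^2.
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

# Route 2 (assumed standard definition): Sxy / sqrt(Sxx * Syy).
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))  # prints "0.9366 0.9366"
```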

Page 13: Statistics for Business and Economics: bab  14

13 Slide

Model Assumptions

Assumptions About the Error Term $\epsilon$
• The error $\epsilon$ is a random variable with mean of zero.
• The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
• The values of $\epsilon$ are independent.
• The error $\epsilon$ is a normally distributed random variable.

Page 14: Statistics for Business and Economics: bab  14

14 Slide

Testing for Significance

To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.

Two tests are commonly used:
• t Test
• F Test

Both tests require an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model.

Page 15: Statistics for Business and Economics: bab  14

15 Slide

Testing for Significance

An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
$s^2 = \text{MSE} = \text{SSE}/(n-2)$

where:
$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$

Page 16: Statistics for Business and Economics: bab  14

16 Slide

Testing for Significance

An Estimate of $\sigma$
• To estimate $\sigma$ we take the square root of $s^2$, the estimate of $\sigma^2$.
• The resulting s is called the standard error of the estimate.

$s = \sqrt{\text{MSE}} = \sqrt{\dfrac{\text{SSE}}{n-2}}$
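For the Reed Auto data, MSE and the standard error of the estimate follow directly from SSE. A minimal sketch, assuming the fitted equation with intercept 10 and slope 5 from the earlier example:

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0   # estimated regression equation from the slides

n = len(x)
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)       # s^2, the estimate of sigma^2
s = math.sqrt(mse)        # standard error of the estimate

print(sse, round(mse, 3), round(s, 4))  # prints "14.0 4.667 2.1602"
```

The divisor is n - 2 rather than n because two parameters, the intercept and the slope, are estimated from the data.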

Page 17: Statistics for Business and Economics: bab  14

17 Slide

Testing for Significance: t Test

Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$

Test Statistic
$t = \dfrac{b_1}{s_{b_1}}$

Rejection Rule
Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.

Page 18: Statistics for Business and Economics: bab  14

18 Slide

Example: Reed Auto Sales

t Test
• Hypotheses
  $H_0: \beta_1 = 0$
  $H_a: \beta_1 \neq 0$
• Rejection Rule
  For $\alpha = .05$ and d.f. = 3, $t_{.025} = 3.182$
  Reject $H_0$ if t > 3.182
• Test Statistic
  t = 5/1.08 = 4.63
• Conclusion
  Reject $H_0$
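The transcript does not show where the value 1.08 for the standard error of the slope comes from. The sketch below assumes the usual textbook formula, s divided by the square root of the sum of squared deviations of x, which is not shown on the slides, and takes the critical value 3.182 from the slide rather than computing it.

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0   # estimates from the slides
t_crit = 3.182       # t_.025 with 3 d.f., as quoted on the slide

n = len(x)
x_bar = sum(x) / n
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))               # standard error of the estimate
sxx = sum((xi - x_bar) ** 2 for xi in x)   # sum of squared x deviations
s_b1 = s / math.sqrt(sxx)                  # assumed formula, not on the slides
t_stat = b1 / s_b1
reject_h0 = abs(t_stat) > t_crit           # two-tailed rejection rule

print(round(s_b1, 2), round(t_stat, 2), reject_h0)  # prints "1.08 4.63 True"
```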

Page 19: Statistics for Business and Economics: bab  14

19 Slide

Confidence Interval for $\beta_1$

We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.

$H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.

Page 20: Statistics for Business and Economics: bab  14

20 Slide

Confidence Interval for $\beta_1$

The form of a confidence interval for $\beta_1$ is:
$b_1 \pm t_{\alpha/2} \, s_{b_1}$

where:
$b_1$ is the point estimate
$t_{\alpha/2} \, s_{b_1}$ is the margin of error
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom

Page 21: Statistics for Business and Economics: bab  14

21 Slide

Example: Reed Auto Sales

Rejection Rule
Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.

95% Confidence Interval for $\beta_1$
$b_1 \pm t_{\alpha/2} \, s_{b_1}$ = 5 +/- 3.182(1.08) = 5 +/- 3.44
or 1.56 to 8.44

Conclusion
Reject $H_0$
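The interval arithmetic can be reproduced as follows. This is a sketch under stated assumptions: the helper `slope_confidence_interval` is hypothetical, the standard error of the slope is computed from SSE = 14 and a sum of squared x deviations of 4 for the Reed Auto data, and the critical value is the one quoted on the slides.

```python
import math


def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) for b1 +/- t_crit * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin


b1 = 5.0
s_b1 = math.sqrt(14 / 3) / math.sqrt(4)   # s / sqrt(sum of squared x deviations)
t_crit = 3.182                            # t_.025 with 3 d.f., from the slides

lower, upper = slope_confidence_interval(b1, s_b1, t_crit)
reject_h0 = not (lower <= 0 <= upper)     # reject if 0 lies outside the interval

print(round(lower, 2), round(upper, 2), reject_h0)  # prints "1.56 8.44 True"
```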

Page 22: Statistics for Business and Economics: bab  14

22 Slide

Testing for Significance: F Test

Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$

Test Statistic
F = MSR/MSE

Rejection Rule
Reject $H_0$ if $F > F_\alpha$, where $F_\alpha$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator.

Page 23: Statistics for Business and Economics: bab  14

23 Slide

Example: Reed Auto Sales

F Test
• Hypotheses
  $H_0: \beta_1 = 0$
  $H_a: \beta_1 \neq 0$
• Rejection Rule
  For $\alpha = .05$ and d.f. = 1, 3: $F_{.05} = 10.13$
  Reject $H_0$ if F > 10.13.
• Test Statistic
  F = MSR/MSE = 100/4.667 = 21.43
• Conclusion
  We can reject $H_0$.
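A short sketch of the F computation, using the sums of squares from the earlier slides. It also checks a known property of simple linear regression that the slides do not state: with one independent variable, F equals the square of the t statistic. The critical value 10.13 is taken from the slide.

```python
import math

sst, sse, n = 114.0, 14.0, 5   # values for the Reed Auto data
f_crit = 10.13                 # F_.05 with 1 and 3 d.f., from the slides

ssr = sst - sse
msr = ssr / 1                  # one independent variable: 1 numerator d.f.
mse = sse / (n - 2)
f_stat = msr / mse
reject_h0 = f_stat > f_crit

# t statistic for the slope; the sum of squared x deviations is 4 for this data.
t_stat = 5.0 / (math.sqrt(mse) / math.sqrt(4))

print(round(f_stat, 2), reject_h0)            # prints "21.43 True"
print(math.isclose(f_stat, t_stat ** 2))      # prints "True"
```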

Page 24: Statistics for Business and Economics: bab  14

24 Slide

Some Cautions about the Interpretation of Significance Tests

Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.

Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.

Page 25: Statistics for Business and Economics: bab  14

25 Slide

Using the Estimated Regression Equation for Estimation and Prediction

Confidence Interval Estimate of $E(y_p)$
$\hat{y}_p \pm t_{\alpha/2} \, s_{\hat{y}_p}$

Prediction Interval Estimate of $y_p$
$\hat{y}_p \pm t_{\alpha/2} \, s_{\text{ind}}$

where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 d.f.

Page 26: Statistics for Business and Economics: bab  14

26 Slide

Example: Reed Auto Sales

Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
$\hat{y} = 10 + 5(3) = 25$ cars

Confidence Interval for $E(y_p)$
95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
$25 \pm 4.61$ = 20.39 to 29.61 cars

Prediction Interval for $y_p$
95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
$25 \pm 8.28$ = 16.72 to 33.28 cars
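The margins 4.61 and 8.28 are stated without the standard-error formulas, which did not survive in the transcript. The sketch below assumes the standard textbook expressions: the standard error for the mean response is s times the square root of 1/n + (x_p - x̄)²/Σ(x_i - x̄)², and the one for an individual value adds 1 under the square root. Variable names are illustrative.

```python
import math

x = [1, 3, 2, 1, 3]
b0, b1 = 10.0, 5.0        # estimated regression equation from the slides
s = math.sqrt(14 / 3)     # standard error of the estimate, sqrt(SSE/(n-2))
t_crit = 3.182            # t_.025 with 3 d.f., from the slides
x_p = 3                   # number of TV ads for the estimate

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

y_hat_p = b0 + b1 * x_p
core = 1 / n + (x_p - x_bar) ** 2 / sxx
s_mean = s * math.sqrt(core)       # assumed formula for the mean response
s_ind = s * math.sqrt(1 + core)    # assumed formula for an individual value

ci_margin = t_crit * s_mean
pi_margin = t_crit * s_ind
print(y_hat_p, round(ci_margin, 2), round(pi_margin, 2))  # prints "25.0 4.61 8.28"
```

The prediction interval is wider because it covers the variability of a single week's sales in addition to the uncertainty in the estimated mean.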

Page 27: Statistics for Business and Economics: bab  14

27 Slide

Residual Analysis

Residual for Observation i
$y_i - \hat{y}_i$

Standardized Residual for Observation i
$\dfrac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$

where:
$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$

Page 28: Statistics for Business and Economics: bab  14

28 Slide

Example: Reed Auto Sales

Residuals

Observation    Predicted Cars Sold    Residuals
1              15                     -1
2              25                     -1
3              20                     -2
4              15                     2
5              25                     2
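The residual table, and the standardized residuals defined on the previous slide, can be reproduced as below. The leverage formula h_i = 1/n + (x_i - x̄)²/Σ(x_i - x̄)² is the standard one for simple linear regression and is an assumption here, since the transcript does not define h_i.

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0   # estimated regression equation from the slides

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, predicted)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Leverage of each observation (assumed standard formula).
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
standardized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

# Observation numbers whose standardized residual falls outside -2..+2.
flagged = [i + 1 for i, r in enumerate(standardized) if abs(r) > 2]

print(residuals)   # prints "[-1.0, -1.0, -2.0, 2.0, 2.0]"
print(flagged)     # prints "[]"
```

None of the five standardized residuals exceeds 2 in absolute value, so the ±2 rule flags no observation in this sample.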

Page 29: Statistics for Business and Economics: bab  14

29 Slide

Example: Reed Auto Sales

Residual Plot
[Figure: TV Ads Residual Plot, showing Residuals (vertical axis, -3 to 3) against TV Ads (horizontal axis, 0 to 4)]

Page 30: Statistics for Business and Economics: bab  14

30 Slide

Residual Analysis Detecting Outliers

• An outlier is an observation that is unusual in comparison with the other data.

• Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.

• This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.

• This rule’s shortcoming can be circumvented by using studentized deleted residuals.

• For an observation with a large residual, |ith studentized deleted residual| will be larger than |ith standardized residual|.
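The slides name studentized deleted residuals without giving a formula. The sketch below uses a standard identity that avoids refitting the model for each deleted observation: the error sum of squares without observation i equals SSE - e_i²/(1 - h_i), with n - 3 degrees of freedom in simple linear regression. Both the identity and the variable names are assumptions, not taken from the slides.

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0   # estimated regression equation from the slides

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

standardized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

deleted = []
for e, h in zip(residuals, leverage):
    sse_without_i = sse - e ** 2 / (1 - h)      # SSE of the fit omitting obs. i
    s_without_i = math.sqrt(sse_without_i / (n - 3))
    deleted.append(e / (s_without_i * math.sqrt(1 - h)))

for i, (r, t) in enumerate(zip(standardized, deleted), start=1):
    print(i, round(r, 3), round(t, 3))
```

On the Reed Auto data the deleted residual is larger in magnitude for observations 3, 4, and 5, whose standardized residuals exceed 1 in absolute value, and smaller for observations 1 and 2, whose standardized residuals are below 1 in absolute value.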

Page 31: Statistics for Business and Economics: bab  14

31 Slide

End of Chapter 14