© 2000 prentice-hall, inc. chap. 9 - 1 forecasting using the simple linear regression model and...

45
© 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

Post on 22-Dec-2015

220 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 1

Forecasting Using the Simple Linear Regression

Model and Correlation

Page 2: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 2

What is a forecast?

Using a statistical method on past data to predict the future.

Using experience, judgment and surveys to predict the future.

Page 3: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 3

Why forecast?

to enhance planning.to force thinking about the future.to fit corporate strategy to future conditions. to coordinate departments to the same future.to reduce corporate costs.

Page 4: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 4

Kinds of Forecasts

Causal forecasts are when changes in a variable (Y) you wish to predict are caused by changes in other variables (X's).

Time series forecasts are when changes in a variable (Y) are predicted based on prior values of itself (Y).

Regression can provide both kinds of forecasts.

Page 5: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 5

Types of Relationships

Positive Linear Relationship Negative Linear Relationship

Page 6: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 6

Types of Relationships

Relationship NOT Linear No Relationship

(continued)

Page 7: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 7

Relationships

If the relationship is not linear, the forecaster often has to use math transformations to make the relationship linear.

Page 8: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 8

Correlation Analysis

•Correlation measures the strength of the linear relationship between variables.

•It can be used to find the best predictor variables.

• It does not assure that there is a causal relationship between the variables.

Page 9: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 9

The Correlation Coefficient

• Ranges between -1 and 1.

• The Closer to -1, The Stronger Is The Negative Linear Relationship.

• The Closer to 1, The Stronger Is The Positive Linear Relationship.

• The Closer to 0, The Weaker Is Any Linear Relationship.

Page 10: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 10

Y

X

Y

X

Y

X

Y

X

Y

X

Graphs of Various Correlation (r) Values

r = -1 r = -.6 r = 0

r = .6 r = 1

Page 11: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 11

The Scatter Diagram

0

20

40

60

80

0 20 40 60

X

Y

Plot of all (Xi , Yi) pairs

Page 12: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 12

The Scatter Diagram

Is used to visualize the relationship and to assess its linearity. The scatter diagram can also be used to identify outliers.

Page 13: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 13

Regression Analysis

Regression Analysis can be used to model causality and make predictions.

Terminology: The variable to be predicted is called the dependent or response variable. The variables used in the prediction model are called independent, explanatory or predictor variables.

Page 14: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 14

Simple Linear Regression Model

•The relationship between variables is described by a linear function.•A change of one variable causes the other variable to change.

Page 15: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 15

• Population Regression Line Is A Straight Line that Describes The Dependence of One Variable on The Other

Population Y intercept

Population SlopeCoefficient

Random Error

Dependent (Response) Variable

Independent (Explanatory) Variable

Population Linear Regression

ii iY X PopulationRegressionLine

Page 16: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 16

= Random Error

Y

X

How is the best line found?

Observed Value

Observed Value

YX iX

i

ii iY X

Page 17: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 17

Sample Linear Regression

Sample Y Intercept

SampleSlopeCoefficient

Sample Regression Line Provides an Estimate of The Population Regression Line

0 1i iib bY X e

provides an estimate of

provides an estimate of

0b

1b

Sample Regression Line

Residual

Y

Page 18: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 18

Simple Linear Regression: An Example

You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best

Annual Store Square Sales

Feet ($1000)

1 1,726 3,681

2 1,542 3,395

3 2,816 6,653

4 5,555 9,543

5 1,292 3,318

6 2,208 5,563

7 1,313 3,760

Page 19: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 19

The Scatter Diagram

0

2000

4000

6000

8000

10000

12000

0 1000 2000 3000 4000 5000 6000

Square Feet

An

nu

al

Sa

les

($00

0)

Excel Output

Page 20: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 20

The Equation for the Regression Line

i

ii

X..

XbbY

4871415163610

From Excel Printout:

CoefficientsIntercept 1636.414726X Variable 1 1.486633657

Page 21: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 21

Graph of the Regression Line

0

2000

4000

6000

8000

10000

12000

0 1000 2000 3000 4000 5000 6000

Square Feet

An

nu

al

Sa

les

($00

0)

Y i = 1636.415 +1.487X i

Page 22: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 22

Interpreting the Results

Yi = 1636.415 +1.487Xi

The slope of 1.487 means that each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units.

The model estimates that for each increase of 1 square foot in the size of the store, the expected annual sales are predicted to increase by $1487.

Page 23: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 23

The Coefficient of Determination

SSR regression sum of squares

SST total sum of squaresr2 = =

The Coefficient of Determination (r2 ) measures the proportion of variation in Y explained by the independent variable X.

Page 24: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 24

Coefficients of Determination (R2) and Correlation (R)

r2 = 1,

Y

Yi = b0 + b1Xi

X

^

r = +1

Page 25: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 25

Coefficients of Determination (R2) and Correlation (R)

r2 = .81, r = +0.9

Y

Yi = b0 + b1Xi

X

^

(continued)

Page 26: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 26

Coefficients of Determination (R2) and Correlation (R)

r2 = 0, r = 0

Y

Yi = b0 + b1Xi

X

^

(continued)

Page 27: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 27

Coefficients of Determination (R2) and Correlation (R)

r2 = 1, r = -1

Y

Yi = b0 + b1Xi

X

^

(continued)

Page 28: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 28

Correlation: The Symbols

• Population correlation coefficient (‘rho’) measures the strength between two variables.

• Sample correlation coefficient r estimates based on a set of sample observations.

Page 29: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 29

Regression StatisticsMultiple R 0.9705572R Square 0.94198129Adjusted R Square 0.93037754Standard Error 611.751517Observations 7

Example: Produce Stores

From Excel Printout

Page 30: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 30

Inferences About the Slope

• t Test for a Population SlopeIs There A Linear Relationship between X and Y ?

1

11

bS

bt

•Test Statistic:

n

ii

YXb

)XX(

SS

1

21

and df = n - 2

• Null and Alternative Hypotheses

H0: 1 = 0 (No Linear Relationship) H1: 1 0 (Linear Relationship)

Where

Page 31: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 31

Example: Produce Stores

Data for 7 Stores: Estimated Regression Equation:

The slope of this model is 1.487.

Is Square Footage of the store affecting its Annual Sales?

Annual Store Square Sales

Feet ($000)

1 1,726 3,681

2 1,542 3,395

3 2,816 6,653

4 5,555 9,543

5 1,292 3,318

6 2,208 5,563

7 1,313 3,760

Yi = 1636.415 +1.487Xi

Page 32: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 32

t Stat P-valueIntercept 3.6244333 0.0151488X Variable 1 9.009944 0.0002812

H0: 1 = 0

H1: 1 0

.05

df 7 - 2 = 5

Critical value(s):

Test Statistic:

Decision:

Conclusion:

There is evidence of a linear relationship.t0 2.5706-2.5706

.025

Reject Reject

.025

From Excel Printout

Reject H0

Inferences About the Slope: t Test Example

Page 33: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 33

Inferences About the Slope Using A Confidence Interval

Confidence Interval Estimate of the Slopeb1 tn-2 1bS

Excel Printout for Produce Stores

At 95% level of Confidence The confidence Interval for the slope is (1.062, 1.911). Does not include 0.

Conclusion: There is a significant linear relationship between annual sales and the size of the store.

Lower 95% Upper 95%Intercept 475.810926 2797.01853X Variable 11.06249037 1.91077694

Page 34: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 34

Residual Analysis

Is used to evaluate validity of assumptions. Residual analysis uses numerical measures and plots to assure the validity of the assumptions.

Page 35: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 35

Linear Regression Assumptions

1. X is linearly related to Y.

2. The variance is constant for each value of Y (Homoscedasticity).

3. The Residual Error is Normally Distributed.

4. If the data is over time, then the errors must be independent.

Page 36: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 36

Residual Analysis for Linearity

Not Linear Linear

X

e eX

Y

X

Y

X

Page 37: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 37

Residual Analysis for Homoscedasticity

Heteroscedasticity Homoscedasticity

e

X

e

X

Y

X X

Y

Page 38: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 38

Residual Analysis for Independence: The Durbin-Watson Statistic

It is used when data is collected over time.

It detects autocorrelation; that is, the residuals in one time period are related to residuals in another time period.

It measures violation of independence assumption.

n

ii

n

iii

e

)ee(D

1

2

2

21 Calculate D and

compare it to the value in Table E.8.

Page 39: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 39

Preparing Confidence Intervals for Forecasts

Page 40: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 40

Interval Estimates for Different Values of X

X

Y

X

Confidence Interval for a individual Yi

A Given X

Confidence Interval for the mean of Y

Y i = b0 + b1X i

_

Page 41: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 41

Estimation of Predicted Values

Confidence Interval Estimate for YX

The Mean of Y given a particular Xi

n

ii

iyxni

)XX(

)XX(

nStY

1

2

2

21

t value from table with df=n-2

Standard error of the estimate

Size of interval vary according to distance away from mean, X.

Page 42: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 42

Estimation of Predicted Values

Confidence Interval Estimate for Individual Response Yi at a Particular Xi

n

ii

iyxni

)XX(

)XX(

nStY

1

2

2

21

1

Addition of 1 increases width of interval from that for the mean of Y

Page 43: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 43

Example: Produce Stores

Yi = 1636.415 +1.487Xi

Data for 7 Stores:

Regression Model Obtained:

Predict the annual sales for a store with

2000 square feet.

Annual Store Square Sales

Feet ($000)

1 1,726 3,681

2 1,542 3,395

3 2,816 6,653

4 5,555 9,543

5 1,292 3,318

6 2,208 5,563

7 1,313 3,760

Page 44: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 44

Estimation of Predicted Values: Example

Find the 95% confidence interval for the average annual sales for stores of 2,000 square feet

n

ii

iyxni

)XX(

)XX(

nStY

1

2

2

21

Predicted Sales Yi = 1636.415 +1.487Xi = 4610.45 ($000)

X = 2350.29 SYX = 611.75 tn-2 = t5 = 2.5706

= 4610.45 612.66

Confidence interval for mean Y

Confidence Interval Estimate for YX

Page 45: © 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation

© 2000 Prentice-Hall, Inc. Chap. 9 - 45

Estimation of Predicted Values: Example

Find the 95% confidence interval for annual sales of one particular store of 2,000 square feet

Predicted Sales Yi = 1636.415 +1.487Xi = 4610.45 ($000)

X = 2350.29 SYX = 611.75 tn-2 = t5 = 2.5706

= 4610.45 1687.68

Confidence interval for individual Y

n

ii

iyxni

)XX(

)XX(

nStY

1

2

2

21

1

Confidence Interval Estimate for Individual Y