chapter 12 · fitting the model: the least squares approach advertising-sales data month...

Chapter 12

Simple Linear Regression

Probabilistic Models

General form of Probabilistic Models

y = Deterministic Component + Random Error

where

E(y) = Deterministic Component


First Order (Straight-Line) Probabilistic Model

$$y = \beta_0 + \beta_1 x + \varepsilon$$


5 steps of Simple Linear Regression

1. Hypothesize the deterministic component

2. Use sample data to estimate unknown model parameters

3. Specify the probability distribution of ε and estimate the standard deviation of the distribution

4. Statistically evaluate model usefulness

5. Use the model for prediction and estimation once it is judged useful

Fitting the Model: The Least Squares Approach

Advertising-Sales Data

Month   Advertising Expenditure, x ($100s)   Sales Revenue, y ($1,000s)

1       1                                    1

2       2                                    1

3       3                                    2

4       4                                    2

5       5                                    4



Least Squares Line has:

• Sum of errors (SE) = 0

• Sum of squared errors (SSE) is smallest of all straight-line models

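Formulas:

Slope: $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}}$

y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

where

$$SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}$$

$$SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$$

Least squares line: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$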



Preliminary Computations

          x_i          y_i          x_i^2          x_i y_i

          1            1            1              1

          2            1            4              2

          3            2            9              6

          4            2            16             8

          5            4            25             20

Totals    Σx_i = 15    Σy_i = 10    Σx_i^2 = 55    Σx_i y_i = 37
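The preliminary totals feed directly into the slope and intercept formulas. The following is a minimal sketch in plain Python, using only the five data points from the advertising-sales table; the variable names (`ss_xy`, `beta1_hat`, and so on) are illustrative choices, not notation from the slides.

```python
# Advertising-sales data from the table above
x = [1, 2, 3, 4, 5]   # advertising expenditure ($100s)
y = [1, 1, 2, 2, 4]   # sales revenue ($1,000s)
n = len(x)

# Preliminary computations (the "Totals" row)
sum_x = sum(x)                                   # 15
sum_y = sum(y)                                   # 10
sum_x2 = sum(xi ** 2 for xi in x)                # 55
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # 37

# Shortcut formulas for SSxy and SSxx
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

# Least squares estimates of slope and y-intercept
beta1_hat = ss_xy / ss_xx
beta0_hat = sum_y / n - beta1_hat * sum_x / n

print(f"SSxy = {ss_xy}, SSxx = {ss_xx}")
print(f"y-hat = {beta0_hat:.1f} + {beta1_hat:.1f} x")
```

With these data the sketch gives SSxy = 7 and SSxx = 10, so the fitted line has slope .7 and intercept -.1.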

Comparing Observed and Predicted Values for the Least Squares Prediction Equation

x    y    ŷ = -.1 + .7x    (y - ŷ)    (y - ŷ)^2

1    1    .6               .4         .16

2    1    1.3              -.3        .09

3    2    2.0              0.0        .00

4    2    2.7              -.7        .49

5    4    3.4              .6         .36

          Sum of Errors = 0           SSE = 1.10
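The two properties of the least squares line can be checked numerically. The sketch below recomputes the errors for ŷ = -.1 + .7x and compares its SSE with that of a second line, ŷ = -1 + x. That second line is a hypothetical alternative chosen only for illustration; it also has errors summing to 0, yet its SSE is larger. The helper name `errors_for` is an assumption of this sketch.

```python
x = [1, 2, 3, 4, 5]   # advertising expenditure ($100s)
y = [1, 1, 2, 2, 4]   # sales revenue ($1,000s)

def errors_for(b0, b1):
    """Return the list of errors y - y_hat for the line y_hat = b0 + b1 * x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Least squares line from the table above
ls_errors = errors_for(-0.1, 0.7)
ls_sum_errors = sum(ls_errors)
ls_sse = sum(e ** 2 for e in ls_errors)

# Hypothetical competing line (not from the slides): y_hat = -1 + x
alt_errors = errors_for(-1.0, 1.0)
alt_sum_errors = sum(alt_errors)
alt_sse = sum(e ** 2 for e in alt_errors)

print(f"least squares: sum of errors = {ls_sum_errors:.2f}, SSE = {ls_sse:.2f}")
print(f"alternative:   sum of errors = {alt_sum_errors:.2f}, SSE = {alt_sse:.2f}")
```

A sum of errors equal to 0 is therefore not enough to identify the least squares line; the minimum-SSE property is what singles it out.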

Model Assumptions

1. Mean of the probability distribution of ε is 0

2. Variance of the probability distribution of ε is constant for all values of x

3. Probability distribution of ε is normal

4. Values of ε are independent of each other
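One way to see what the four assumptions describe is to generate data from a straight-line model that satisfies them. The sketch below is a simulation under assumed parameter values (β0 = -.1, β1 = .7, σ = .6, all hypothetical and chosen only for illustration): each ε is drawn independently from a normal distribution with mean 0 and the same standard deviation at every x.

```python
import random

def simulate_straight_line(x_values, beta0, beta1, sigma, seed=0):
    """Generate (x, y, eps) triples from y = beta0 + beta1 * x + eps.

    Each eps is an independent draw from a normal distribution with
    mean 0 and standard deviation sigma, the same for every x.
    """
    rng = random.Random(seed)
    data = []
    for x in x_values:
        eps = rng.gauss(0.0, sigma)
        data.append((x, beta0 + beta1 * x + eps, eps))
    return data

# Hypothetical settings: 2,000 observations at each of x = 1, ..., 5
x_values = [1, 2, 3, 4, 5] * 2000
sample = simulate_straight_line(x_values, beta0=-0.1, beta1=0.7, sigma=0.6)

mean_error = sum(eps for _, _, eps in sample) / len(sample)
print(f"average simulated error: {mean_error:.4f}")
```

With this many draws the average simulated error should land close to 0, in line with assumption 1.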

An Estimator of $\sigma^2$

Estimator of $\sigma^2$ for a straight-line model

$$s^2 = \frac{SSE}{\text{Degrees of freedom for error}} = \frac{SSE}{n-2}$$

where

$$SSE = \sum (y_i - \hat{y}_i)^2 = SS_{yy} - \hat{\beta}_1 SS_{xy}$$

$$SS_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}$$

$s = \sqrt{s^2}$ = Estimated Standard Error of the Regression Model
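Applied to the advertising-sales data, the estimator can be computed as in the sketch below. It uses the shortcut SSE = SSyy - β̂1·SSxy and the values SSxy = 7 and β̂1 = .7 obtained earlier; variable names are illustrative.

```python
import math

y = [1, 1, 2, 2, 4]    # sales revenue ($1,000s)
n = len(y)
ss_xy = 7.0            # from the preliminary computations
beta1_hat = 0.7        # least squares slope

# SSyy from the shortcut formula
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

# SSE, then the estimators of sigma^2 and sigma
sse = ss_yy - beta1_hat * ss_xy
s_squared = sse / (n - 2)
s = math.sqrt(s_squared)

print(f"SSyy = {ss_yy:.2f}, SSE = {sse:.2f}")
print(f"s^2 = {s_squared:.3f}, s = {s:.3f}")
```

The SSE of 1.10 agrees with the value in the observed-versus-predicted table, giving s² ≈ .367 and s ≈ .61.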

Making Inferences about the Slope $\beta_1$

Sampling Distribution of $\hat{\beta}_1$

$$\sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{SS_{xx}}}$$


A Test of Model Utility: Simple Linear Regression

One-Tailed Test                                   Two-Tailed Test

H0: β1 = 0                                        H0: β1 = 0

Ha: β1 < 0 (or Ha: β1 > 0)                        Ha: β1 ≠ 0

Rejection region: t < -tα                         Rejection region: |t| > tα/2

(or t > tα when Ha: β1 > 0)

where tα and tα/2 are based on (n-2) degrees of freedom

Test statistic:

$$t = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}} = \frac{\hat{\beta}_1}{s / \sqrt{SS_{xx}}}$$
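As an illustration, the two-tailed test can be carried out on the advertising-sales data. The sketch below assumes α = .05 (a choice made here, not stated in the slides) and takes the critical value t.025 = 3.182 for n - 2 = 3 degrees of freedom from a standard t table rather than computing it.

```python
import math

n = 5
ss_xx = 10.0        # from the preliminary computations
beta1_hat = 0.7     # least squares slope
sse = 1.10          # from the observed-versus-predicted table

# Estimated standard error of the model and of the slope
s = math.sqrt(sse / (n - 2))
s_beta1 = s / math.sqrt(ss_xx)

# Test statistic for H0: beta1 = 0
t_stat = beta1_hat / s_beta1

# Assumed alpha = .05, two-tailed; t_{.025} with 3 df from a t table
t_crit = 3.182
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

Under these assumptions t ≈ 3.66 falls in the rejection region, so the data suggest that the slope differs from 0.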


A 100(1-α)% Confidence Interval for $\beta_1$

$$\hat{\beta}_1 \pm t_{\alpha/2}\, s_{\hat{\beta}_1}$$

where

$$s_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}}$$
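For the advertising-sales data, a 95% interval for the slope can be sketched as follows. The 95% level is an assumption made for illustration, and t.025 = 3.182 with 3 degrees of freedom is read from a standard t table.

```python
import math

n = 5
ss_xx = 10.0        # from the preliminary computations
beta1_hat = 0.7     # least squares slope
sse = 1.10          # sum of squared errors

s = math.sqrt(sse / (n - 2))        # estimated standard error of the model
s_beta1 = s / math.sqrt(ss_xx)      # estimated standard error of the slope

t_crit = 3.182                      # t_{.025}, 3 df (assumed 95% level)
half_width = t_crit * s_beta1
lower = beta1_hat - half_width
upper = beta1_hat + half_width

print(f"95% CI for beta1: ({lower:.2f}, {upper:.2f})")
```

The resulting interval, roughly (.09, 1.31), lies entirely above 0 but is wide, which is unsurprising with only five observations.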

The Coefficient of Correlation

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}$$

The Coefficient of Determination

$$r^2 = \frac{SS_{yy} - SSE}{SS_{yy}} = 1 - \frac{SSE}{SS_{yy}}$$
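As a worked example, both coefficients can be evaluated for the advertising-sales data using the quantities already computed (SSxy = 7, SSxx = 10, SSE = 1.10). SSyy is not shown in the tables, so it is derived here from the y values.

```latex
\begin{align*}
SS_{yy} &= \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}
         = (1 + 1 + 4 + 4 + 16) - \frac{10^2}{5} = 26 - 20 = 6 \\[4pt]
r       &= \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}
         = \frac{7}{\sqrt{(10)(6)}} = \frac{7}{\sqrt{60}} \approx .904 \\[4pt]
r^2     &= 1 - \frac{SSE}{SS_{yy}} = 1 - \frac{1.10}{6} \approx .817
\end{align*}
```

Read the usual way, about 82% of the sample variation in sales revenue is accounted for by the straight-line relationship with advertising expenditure, and squaring r ≈ .904 gives the same r² up to rounding.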

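Using the Model for Estimation and Prediction

Sampling errors and confidence intervals will be larger for predictions than for estimates

Standard error of $\hat{y}$

$$\sigma_{\hat{y}} = \sigma \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$

Standard error of the prediction

$$\sigma_{(y - \hat{y})} = \sigma \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$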


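100(1-α)% Confidence interval for Mean Value of y at x=xp

$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$

100(1-α)% Confidence interval for an Individual New Value of y at x=xp

$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$$

where tα/2 is based on (n-2) degrees of freedom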
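The difference between the two intervals is easiest to see side by side. The sketch below evaluates both at xp = 4 for the advertising-sales data; the choice of xp = 4 and the 95% level are assumptions made for illustration, the function name `interval_half_widths` is hypothetical, and t.025 = 3.182 (3 degrees of freedom) comes from a standard t table.

```python
import math

# Summary quantities for the advertising-sales data
n = 5
x_bar = 3.0
ss_xx = 10.0
beta0_hat, beta1_hat = -0.1, 0.7
s = math.sqrt(1.10 / (n - 2))   # estimated standard error of the model
t_crit = 3.182                  # t_{.025}, 3 df (assumed 95% level)

def interval_half_widths(x_p):
    """Return (y_hat, half-width for mean of y, half-width for a new y) at x_p."""
    y_hat = beta0_hat + beta1_hat * x_p
    core = 1 / n + (x_p - x_bar) ** 2 / ss_xx
    mean_hw = t_crit * s * math.sqrt(core)
    pred_hw = t_crit * s * math.sqrt(1 + core)
    return y_hat, mean_hw, pred_hw

y_hat, mean_hw, pred_hw = interval_half_widths(4)
print(f"y-hat at x = 4: {y_hat:.1f}")
print(f"mean value of y:   {y_hat:.1f} +/- {mean_hw:.2f}")
print(f"individual new y:  {y_hat:.1f} +/- {pred_hw:.2f}")
```

At xp = 4 the interval for an individual new value is about twice as wide as the interval for the mean, reflecting the extra 1 under the square root.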
