Chapter 12
Simple Linear Regression
Probabilistic Models
General form of Probabilistic Models
y = Deterministic Component + Random Error
where
E(y) = Deterministic Component
Probabilistic Models
First Order (Straight-Line) Probabilistic Model
$y = \beta_0 + \beta_1 x + \varepsilon$
Probabilistic Models
5 steps of Simple Linear Regression
1. Hypothesize the deterministic component
2. Use sample data to estimate unknown model
parameters
3. Specify probability distribution of ε, estimate
standard deviation of the distribution
4. Statistically evaluate model usefulness
5. Use the model for prediction and estimation once it
is found useful
Fitting the Model: The Least Squares Approach
Advertising-Sales Data
Month   Advertising Expenditure, x ($100s)   Sales Revenue, y ($1,000s)
1       1                                    1
2       2                                    1
3       3                                    2
4       4                                    2
5       5                                    4
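Fitting the Model: The Least Squares Approach
Least Squares Line has:
• Sum of errors (SE) = 0
• Sum of squared errors (SSE) is smallest of all straight-line models
Formulas:
Slope: $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}}$
y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
where
$SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}$
$SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n}$
Least squares line: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$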
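The slope and intercept formulas can be checked by hand on the five advertising-sales data points. The following is a minimal sketch in plain Python; the variable names (`ss_xy`, `beta1_hat`, and so on) are illustrative choices, not part of the original slides.

```python
# Advertising-sales data from the slides.
x = [1, 2, 3, 4, 5]  # advertising expenditure, x ($100s)
y = [1, 1, 2, 2, 4]  # sales revenue, y ($1,000s)
n = len(x)

# Raw sums used by the shortcut formulas.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# SS_xy and SS_xx via the shortcut (computational) forms.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

# Least squares estimates of slope and y-intercept.
beta1_hat = ss_xy / ss_xx
beta0_hat = sum_y / n - beta1_hat * (sum_x / n)

print(f"SS_xy = {ss_xy:.1f}, SS_xx = {ss_xx:.1f}")
print(f"y-hat = {beta0_hat:.1f} + {beta1_hat:.1f} x")
```

With these data the sketch should reproduce the prediction equation used in the slides, with a slope of .7 and an intercept of -.1.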
Fitting the Model: The Least Squares Approach
Preliminary Computations
          x_i         y_i         x_i^2         x_i y_i
          1           1           1             1
          2           1           4             2
          3           2           9             6
          4           2           16            8
          5           4           25            20
Totals    Σx_i = 15   Σy_i = 10   Σx_i^2 = 55   Σx_i y_i = 37
Comparing Observed and Predicted Values for the Least Squares Prediction Equation
x    y    ŷ = -.1 + .7x    (y - ŷ)    (y - ŷ)^2
1    1    .6               .4         .16
2    1    1.3              -.3        .09
3    2    2.0              0.0        .00
4    2    2.7              -.7        .49
5    4    3.4              .6         .36
Sum of Errors = 0    SSE = 1.10
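The two defining properties of the least squares line (errors sum to zero, SSE is minimal) can be illustrated numerically. This is a small sketch, assuming the fitted line ŷ = -.1 + .7x from the slides; the helper `predict` and the comparison line ŷ = 0 + .6x are hypothetical choices made only for illustration.

```python
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]


def predict(xi, intercept=-0.1, slope=0.7):
    """Predicted y for a straight-line model (defaults: least squares fit)."""
    return intercept + slope * xi


# Errors (residuals) of the least squares line.
errors = [yi - predict(xi) for xi, yi in zip(x, y)]
sum_errors = sum(errors)
sse = sum(e * e for e in errors)

# SSE of an arbitrary alternative line, for comparison only.
alt_errors = [yi - predict(xi, intercept=0.0, slope=0.6) for xi, yi in zip(x, y)]
alt_sse = sum(e * e for e in alt_errors)

print(f"sum of errors = {sum_errors:.2f}, SSE = {sse:.2f}")
print(f"SSE of alternative line = {alt_sse:.2f}")
```

Any other straight line, such as the alternative tried here, gives a larger SSE than the least squares line.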
Model Assumptions
1. Mean of the probability distribution of ε is 0
2. Variance of the probability distribution of ε is constant
for all values of x
3. Probability distribution of ε is normal
4. Values of ε are independent of each other
An Estimator of σ²
Estimator of σ² for a straight-line model
$s^2 = \dfrac{SSE}{\text{Degrees of freedom for error}} = \dfrac{SSE}{n - 2}$
where
$SSE = \sum (y_i - \hat{y}_i)^2 = SS_{yy} - \hat{\beta}_1 SS_{xy}$
$SS_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \dfrac{(\sum y_i)^2}{n}$
$s = \sqrt{s^2}$ = Estimated Standard Error of the Regression Model
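As a worked check of the estimator, the sketch below computes SSE through the shortcut SS_yy minus β̂1·SS_xy and then s² and s for the advertising-sales data. Variable names are illustrative assumptions.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares in deviation form.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = ss_xy / ss_xx

# SSE via the shortcut formula, then the variance estimate on n - 2 df.
sse = ss_yy - beta1_hat * ss_xy
s2 = sse / (n - 2)
s = math.sqrt(s2)

print(f"SSE = {sse:.2f}, s^2 = {s2:.4f}, s = {s:.4f}")
```

The SSE obtained this way agrees with the value 1.10 found by summing the squared errors directly.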
Making Inferences about the Slope β1
Sampling Distribution of $\hat{\beta}_1$
Standard deviation: $\sigma_{\hat{\beta}_1} = \dfrac{\sigma}{\sqrt{SS_{xx}}}$
Making Inferences about the Slope β1
A Test of Model Utility: Simple Linear Regression
One-Tailed Test
H0: β1=0
Ha: β1<0 (or Ha: β1>0)
Rejection region: t< -tα
(or t> tα when Ha: β1>0)
Two-Tailed Test
H0: β1=0
Ha: β1≠0
Rejection region: |t|> tα/2
Test statistic: $t = \dfrac{\hat{\beta}_1}{s_{\hat{\beta}_1}} = \dfrac{\hat{\beta}_1}{s / \sqrt{SS_{xx}}}$
Where tα and tα/2 are based on (n-2) degrees of freedom
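The two-tailed test of model utility can be sketched for the advertising-sales data. The significance level α = .05 is an assumption chosen for illustration, and the critical value 3.182 is the standard t-table entry for t.025 with 3 degrees of freedom, hard-coded here so the example needs only the standard library.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = ss_xy / ss_xx
s = math.sqrt((ss_yy - beta1_hat * ss_xy) / (n - 2))

# Estimated standard error of the slope and the test statistic.
s_beta1 = s / math.sqrt(ss_xx)
t_stat = beta1_hat / s_beta1

# Assumed alpha = .05, two-tailed; t-table value for n - 2 = 3 df.
T_CRIT = 3.182
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic falls in the rejection region, which supports a nonzero slope.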
Making Inferences about the Slope β1
A 100(1-α)% Confidence Interval for β1
$\hat{\beta}_1 \pm t_{\alpha/2} \, s_{\hat{\beta}_1}$
where
$s_{\hat{\beta}_1} = \dfrac{s}{\sqrt{SS_{xx}}}$
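A 95% confidence interval for the slope can be sketched the same way. The confidence level is an assumption for illustration, and 3.182 is the t-table value of t.025 with 3 degrees of freedom; the function name `slope_confidence_interval` is hypothetical.

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) for beta1-hat +/- t_crit * s / sqrt(SS_xx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1_hat = ss_xy / ss_xx
    s = math.sqrt((ss_yy - beta1_hat * ss_xy) / (n - 2))
    half_width = t_crit * s / math.sqrt(ss_xx)
    return beta1_hat - half_width, beta1_hat + half_width


# Advertising-sales data; t_crit is the assumed table value for 95%, 3 df.
lower, upper = slope_confidence_interval([1, 2, 3, 4, 5], [1, 1, 2, 2, 4], 3.182)
print(f"95% CI for beta1: ({lower:.2f}, {upper:.2f})")
```

The interval lies entirely above zero, consistent with rejecting H0: β1=0 in a two-tailed test at the same level.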
The Coefficient of Correlation
$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx} \, SS_{yy}}}$
The Coefficient of Determination
$r^2 = \dfrac{SS_{yy} - SSE}{SS_{yy}} = 1 - \dfrac{SSE}{SS_{yy}}$
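Both coefficients can be computed from the same sums of squares. The sketch below does so for the advertising-sales data and confirms that, in simple linear regression, the coefficient of determination equals the square of the coefficient of correlation. Variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Coefficient of correlation.
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# Coefficient of determination from SSE and SS_yy.
sse = ss_yy - (ss_xy / ss_xx) * ss_xy
r_squared = 1 - sse / ss_yy

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Here roughly 82% of the sample variation in sales revenue is explained by the straight-line relationship with advertising expenditure.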
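Using the Model for Estimation and Prediction
Sampling errors and confidence intervals will be larger for
Predictions than for Estimates
Standard error of $\hat{y}$:
$\sigma_{\hat{y}} = \sigma \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
Standard error of the prediction:
$\sigma_{(y - \hat{y})} = \sigma \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$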
Using the Model for Estimation and Prediction
100(1-α)% Confidence interval for Mean Value of y at x=xp
$\hat{y} \pm t_{\alpha/2} \, s \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
100(1-α)% Confidence interval for an Individual New Value of y at x=xp
$\hat{y} \pm t_{\alpha/2} \, s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
where tα/2 is based on (n-2) degrees of freedom
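To see how much wider the interval for an individual value is, the sketch below evaluates both intervals for the advertising-sales data. The point x_p = 4 and the 95% level are assumptions made for illustration only, and 3.182 is the t-table value of t.025 with 3 degrees of freedom.

```python
import math

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = ss_xy / ss_xx
beta0_hat = y_bar - beta1_hat * x_bar
s = math.sqrt((ss_yy - beta1_hat * ss_xy) / (n - 2))

x_p = 4          # hypothetical value of x at which to estimate/predict
T_CRIT = 3.182   # assumed: t_.025 with n - 2 = 3 df (95% level)

y_hat = beta0_hat + beta1_hat * x_p
leverage = 1 / n + (x_p - x_bar) ** 2 / ss_xx

# Half-widths: mean value of y versus an individual new value of y.
mean_half = T_CRIT * s * math.sqrt(leverage)
pred_half = T_CRIT * s * math.sqrt(1 + leverage)

print(f"mean of y at x_p:   {y_hat:.2f} +/- {mean_half:.2f}")
print(f"new value of y:     {y_hat:.2f} +/- {pred_half:.2f}")
```

The extra 1 under the square root accounts for the random error of the new observation, which is why the prediction interval is always the wider of the two.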