business statistics session 17 (1)
TRANSCRIPT
-
8/7/2019 Business statistics Session 17 (1)
Business Statistics Session 17
Simple correlation and regression
-
Learning Objectives
Understand the concept of correlation and regression.
Compute the correlation coefficient.
Test the significance of the correlation coefficient.
Compute the equation of a simple regression line from a sample of data, and interpret the slope and intercept of the equation.
Understand the usefulness of residual analysis in testing the assumptions underlying regression analysis and in examining the fit of the regression line to the data.
Compute a standard error of the estimate and interpret its meaning.
-
Regression and Correlation
Regression analysis is the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.
Correlation is a measure of the degree of relatedness of two variables.
-
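Pearson Product-Moment Correlation Coefficient
$$
r = \frac{SS_{XY}}{\sqrt{SS_X \, SS_Y}}
  = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
  = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right]}}
$$
$$
-1 \le r \le 1
$$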
-
Degrees of Correlation
Correlation is a measure of the degree of relatedness of variables
Coefficient of Correlation (r) - applicable only if both variables being analyzed have at least an interval level of data
-
Degrees of Correlation
The term (r) is a measure of the linear correlation of two variables
The number ranges from -1 through 0 to +1
The closer r is to +1 or -1, the stronger the linear correlation between the dependent and the independent variables; values near 0 indicate little or no linear correlation
See the formula for the Pearson product-moment correlation coefficient
-
Correlation Coefficient
Covariance
$$
\mathrm{COV}(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}
$$
Pearson product-moment correlation coefficient
$$
r_{xy} = \frac{\mathrm{COV}(X, Y)}{S_x \, S_y}
$$
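The covariance form of r can be checked with a short sketch. It assumes the interest-rate and futures-index values from the economics example in this session, and the helper name `sample_cov` is hypothetical, chosen only for illustration.

```python
import math

# Interest rate (X) and futures index (Y) from the session's economics example.
x = [7.43, 7.48, 8.00, 7.75, 7.60, 7.63, 7.68, 7.67, 7.59, 8.07, 8.03, 8.00]
y = [221, 222, 226, 225, 224, 223, 223, 226, 226, 235, 233, 241]


def sample_cov(a, b):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


cov_xy = sample_cov(x, y)
s_x = math.sqrt(sample_cov(x, x))  # sample standard deviation of X
s_y = math.sqrt(sample_cov(y, y))  # sample standard deviation of Y
r_xy = cov_xy / (s_x * s_y)

print(f"COV(X, Y) = {cov_xy:.4f}, r = {r_xy:.3f}")
```

Because the same n - 1 divisor appears in the covariance and in both standard deviations, it cancels, so this form should agree with the sum-of-squares form of r.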
-
Correlation (contd.)
Population correlation (ρ) - if the database includes an entire population
Sample correlation (r) - if the measure is based on a sample
-
Three Degrees of Correlation
[Figure: three scatter plots illustrating r < 0, r > 0, and r = 0]
-
Computation of r for the Economics Example
Day          Interest X   Futures Index Y   X^2       Y^2       XY
1            7.43         221               55.205    48,841    1,642.03
2            7.48         222               55.950    49,284    1,660.56
3            8.00         226               64.000    51,076    1,808.00
4            7.75         225               60.063    50,625    1,743.75
5            7.60         224               57.760    50,176    1,702.40
6            7.63         223               58.217    49,729    1,701.49
7            7.68         223               58.982    49,729    1,712.64
8            7.67         226               58.829    51,076    1,733.42
9            7.59         226               57.608    51,076    1,715.34
10           8.07         235               65.125    55,225    1,896.45
11           8.03         233               64.481    54,289    1,870.99
12           8.00         241               64.000    58,081    1,928.00
Summations   92.93        2,725             720.220   619,207   21,115.07
-
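Computation of r: Economics Example
$$
r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right]}}
  = \frac{21{,}115.07 - \dfrac{(92.93)(2{,}725)}{12}}{\sqrt{\left[720.22 - \dfrac{(92.93)^2}{12}\right]\left[619{,}207 - \dfrac{(2{,}725)^2}{12}\right]}}
  = .815
$$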
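The arithmetic in this computation can be reproduced directly from the column totals of the economics example. The sketch below uses only those summations; the variable names are illustrative assumptions.

```python
import math

# Column totals from the economics example (n = 12 days).
n = 12
sum_x = 92.93        # sum of interest rates
sum_y = 2725         # sum of futures index values
sum_x2 = 720.220     # sum of X squared
sum_y2 = 619207      # sum of Y squared
sum_xy = 21115.07    # sum of X times Y

# Sums of squares and cross-products about the means.
ss_xy = sum_xy - sum_x * sum_y / n
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n

r = ss_xy / math.sqrt(ss_x * ss_y)
print(f"r = {r:.3f}")
```

Working from totals is convenient by hand, but the subtraction of two large, nearly equal numbers (as in the Y sum of squares) is sensitive to rounding in the totals, so keeping full precision in the sums matters.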
-
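Testing the Significance of the Correlation Coefficient
Null hypothesis: H0: ρ = 0
Alternative hypothesis: Ha: ρ ≠ 0
Test statistic, with n - 2 degrees of freedom:
$$
t = r \sqrt{\frac{n - 2}{1 - r^2}}
$$
Example: n = 6 and r = .70
$$
t = .70 \sqrt{\frac{6 - 2}{1 - .70^2}} = 1.96
$$
At α = .05, n - 2 = 4 degrees of freedom
Critical value of t = 2.78
Since 1.96 < 2.78, do not reject the null hypothesis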
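The test in this example can be sketched in a few lines. The function name `correlation_t` is a hypothetical helper, and the critical value 2.78 is taken from the slide rather than computed from a t distribution.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.70, 6
t = correlation_t(r, n)

t_critical = 2.78  # two-tailed, alpha = .05, df = 4 (value quoted on the slide)
reject_null = abs(t) > t_critical

print(f"t = {t:.2f}, reject H0: {reject_null}")
```

With only six observations, even r = .70 is not enough evidence at the .05 level that the population correlation differs from zero.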
-
Simple Regression Analysis
Bivariate (two variables) linear regression: the most elementary regression model
Dependent variable: the variable to be predicted, usually called Y
Independent variable: the predictor or explanatory variable, usually called X
Nonlinear relationships and regression models with more than one independent variable can be explored by using multiple regression models
-
Regression Models
Deterministic Regression Model
Y = β0 + β1X
Probabilistic Regression Model
Y = β0 + β1X + ε
β0 and β1 are population parameters
β0 and β1 are estimated by sample statistics b0 and b1
-
Equation of the Simple Regression Line
$$
\hat{Y} = b_0 + b_1 X
$$
where:
b0 = the sample intercept
b1 = the sample slope
Ŷ = the predicted value of Y
-
Least Squares Analysis
Least squares analysis is a process whereby a regression model is developed by producing the minimum sum of the squared error values
The vertical distance from each point to the line is the error of the prediction.
The least squares regression line is the regression line that results in the smallest sum of errors squared.
-
Least Squares Analysis
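$$
b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}
    = \frac{\sum XY - n \bar{X} \bar{Y}}{\sum X^2 - n \bar{X}^2}
    = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}}
$$
$$
b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}
$$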
-
Solving for b1 and b0 of the Regression Line: Airline Cost Example
Number of Passengers X   Cost ($1,000) Y   X^2      XY
61                       4.28              3,721    261.08
63                       4.08              3,969    257.04
67                       4.42              4,489    296.14
69                       4.17              4,761    287.73
70                       4.48              4,900    313.60
74                       4.30              5,476    318.20
76                       4.82              5,776    366.32
81                       4.70              6,561    380.70
86                       5.11              7,396    439.46
91                       5.13              8,281    466.83
95                       5.64              9,025    535.80
97                       5.56              9,409    539.32
ΣX = 930                 ΣY = 56.69        ΣX^2 = 73,764   ΣXY = 4,462.22
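The slope and intercept for the airline cost data can be worked out from the column totals with the least squares formulas. This is a minimal sketch under that assumption; the variable names are illustrative.

```python
# Airline cost example: number of passengers (X) and cost in $1,000 (Y).
passengers = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
cost = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(passengers)
sum_x = sum(passengers)                                  # 930
sum_y = sum(cost)                                        # about 56.69
sum_x2 = sum(x ** 2 for x in passengers)                 # 73,764
sum_xy = sum(x * y for x, y in zip(passengers, cost))    # about 4,462.22

# Least squares slope and intercept.
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

print(f"Predicted cost = {b0:.2f} + {b1:.4f} * passengers")
```

Read this way, the slope says each additional passenger adds roughly $41 to the cost of a flight (0.0407 in units of $1,000), and the intercept is the cost the line predicts for zero passengers, which lies outside the range of the data.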
-
Residual Analysis: Airline Cost Example
Number of Passengers X   Cost ($1,000) Y   Predicted Value Ŷ   Residual Y - Ŷ
61                       4.28              4.053               .227
63                       4.08              4.134               -.054
67                       4.42              4.297               .123
69                       4.17              4.378               -.208
70                       4.48              4.419               .061
74                       4.30              4.582               -.282
76                       4.82              4.663               .157
81                       4.70              4.867               -.167
86                       5.11              5.070               .040
91                       5.13              5.274               -.144
95                       5.64              5.436               .204
97                       5.56              5.518               .042
Σ(Y - Ŷ) = -.001
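The predicted values and residuals in this table can be regenerated by fitting the line and subtracting. The sketch below refits the airline cost data at full precision, so its residual sum is essentially zero; the -.001 in the table comes from rounding each entry to three decimals. Variable names are illustrative assumptions.

```python
# Airline cost example: number of passengers (X) and cost in $1,000 (Y).
passengers = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
cost = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(passengers)
mean_x = sum(passengers) / n
mean_y = sum(cost) / n

# Least squares fit using deviations from the means.
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(passengers, cost))
      / sum((x - mean_x) ** 2 for x in passengers))
b0 = mean_y - b1 * mean_x

# Predicted value and residual (observed minus predicted) for each flight.
predicted = [b0 + b1 * x for x in passengers]
residuals = [y - y_hat for y, y_hat in zip(cost, predicted)]

for x, y, y_hat, e in zip(passengers, cost, predicted, residuals):
    print(f"{x:>3} {y:5.2f} {y_hat:6.3f} {e:7.3f}")
print(f"Sum of residuals = {sum(residuals):.6f}")
```

Residuals that sum to zero are a property of any least squares line with an intercept, so the sum alone says nothing about fit; the pattern of the individual residuals is what residual analysis examines.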
-
Standard Error of the Estimate
Residuals represent errors of estimation for individual points.
A more useful measurement of error is the standard error of the estimate.
The standard error of the estimate, denoted se, is a standard deviation of the error of the regression model.
-
Standard Error of the Estimate for the Airline Cost Example
$$
SSE = \sum (Y - \hat{Y})^2 = 0.31434
$$
$$
s_e = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{0.31434}{10}} = 0.1773
$$
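The standard error calculation can be followed step by step from the residuals. This sketch assumes the residuals as listed, rounded to three decimals, in the airline cost residual table, which is why it reproduces the slide's SSE; a full-precision fit would give a slightly smaller value.

```python
import math

# Residuals (Y minus predicted Y) as listed in the airline cost residual table.
residuals = [0.227, -0.054, 0.123, -0.208, 0.061, -0.282,
             0.157, -0.167, 0.040, -0.144, 0.204, 0.042]

n = len(residuals)

# Sum of squared errors, then the standard error of the estimate.
sse = sum(e ** 2 for e in residuals)
s_e = math.sqrt(sse / (n - 2))  # n - 2 because b0 and b1 were both estimated

print(f"SSE = {sse:.5f}, s_e = {s_e:.4f}")
```

Since cost is measured in $1,000, a standard error of about 0.177 means the observed costs typically deviate from the regression line by roughly $177.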
-
Thank you