business statistics session 17 (1)

Upload: mukul159

Post on 08-Apr-2018


  • 8/7/2019 Business statistics Session 17 (1)

    1/24

    Business statistics: Session 17

    Simple correlation and regression


    Learning Objectives

    Understand the concepts of correlation and regression.

    Compute the correlation coefficient.

    Test the significance of the correlation coefficient.

    Compute the equation of a simple regression line from a sample of data, and interpret the slope and intercept of the equation.

    Understand the usefulness of residual analysis in testing the assumptions underlying regression analysis and in examining the fit of the regression line to the data.

    Compute a standard error of the estimate and interpret its meaning.


    Regression and Correlation

    Regression analysis is the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable.

    Correlation is a measure of the degree of relatedness of two variables.


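    Pearson Product-Moment Correlation Coefficient

    $$
    r = \frac{SS_{XY}}{\sqrt{SS_X \, SS_Y}}
      = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
      = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}}
    $$

    $$
    -1 \le r \le 1
    $$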


    Degrees of Correlation

    Correlation is a measure of the degree of relatedness of variables

    Coefficient of correlation (r) - applicable only if both variables being analyzed have at least an interval level of data


    Degrees of Correlation

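    The term r is a measure of the linear correlation of two variables

    The number ranges from -1 through 0 to +1

    The closer r is to +1 or -1, the stronger the linear relationship between the dependent and the independent variables; values near +1 indicate a strong positive correlation, values near -1 a strong negative correlation

    See the formula for the Pearson product-moment correlation coefficient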


    Correlation Coefficient

    Covariance

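    $$
    \mathrm{Cov}(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}
    $$

    Pearson product-moment correlation coefficient

    $$
    r_{xy} = \frac{\mathrm{Cov}(X, Y)}{S_x \, S_y}
    $$

    where $S_x$ and $S_y$ are the sample standard deviations of X and Y.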
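
    The covariance route to r can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the five data points are hypothetical, and the n - 1 divisor assumes the data are a sample.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_std(values):
    """Sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def pearson_r(xs, ys):
    """r = Cov(X, Y) / (S_x * S_y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))

# Hypothetical illustration data (not from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = sample_covariance(xs, ys)
r = pearson_r(xs, ys)
print(f"Cov(X, Y) = {cov_xy:.4f}, r = {r:.4f}")
```

    Because the n - 1 divisors cancel, this gives the same r as the sum-of-squares form of the Pearson formula.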


    Correlation (contd.)

    Population correlation (ρ) - if the database includes an entire population

    Sample correlation (r) - if the measure is based on a sample


    Three Degrees of Correlation

    [Figure: three panels illustrating r < 0, r > 0, and r = 0]


    Computation of r for the Economics Example

    Day          Interest (X)   Futures Index (Y)   X^2       Y^2       XY
    1            7.43           221                 55.205    48,841    1,642.03
    2            7.48           222                 55.950    49,284    1,660.56
    3            8.00           226                 64.000    51,076    1,808.00
    4            7.75           225                 60.063    50,625    1,743.75
    5            7.60           224                 57.760    50,176    1,702.40
    6            7.63           223                 58.217    49,729    1,701.49
    7            7.68           223                 58.982    49,729    1,712.64
    8            7.67           226                 58.829    51,076    1,733.42
    9            7.59           226                 57.608    51,076    1,715.34
    10           8.07           235                 65.125    55,225    1,896.45
    11           8.03           233                 64.481    54,289    1,870.99
    12           8.00           241                 64.000    58,081    1,928.00
    Summations   92.93          2,725               720.220   619,207   21,115.07


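    Computation of r for the Economics Example

    $$
    r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}}
      = \frac{21{,}115.07 - \frac{(92.93)(2{,}725)}{12}}{\sqrt{\left[720.22 - \frac{(92.93)^2}{12}\right]\left[619{,}207 - \frac{(2{,}725)^2}{12}\right]}}
      = .815
    $$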
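
    The arithmetic for the economics example can be checked from the column summations. The sketch below is an illustration only: the function and variable names are hypothetical, while the numbers are the summations from the table (n = 12 days).

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Computational form of Pearson's r, using only column summations."""
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_x = sum_x2 - sum_x ** 2 / n
    ss_y = sum_y2 - sum_y ** 2 / n
    return ss_xy / math.sqrt(ss_x * ss_y)

# Summations from the economics example table.
r = pearson_r_from_sums(
    n=12,
    sum_x=92.93,       # sum of interest rates X
    sum_y=2725,        # sum of futures index values Y
    sum_x2=720.220,    # sum of X squared
    sum_y2=619207,     # sum of Y squared
    sum_xy=21115.07,   # sum of X times Y
)
print(f"r = {r:.3f}")  # r = 0.815
```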


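    Testing the Significance of the Correlation Coefficient

    Null hypothesis: $H_0: \rho = 0$

    Alternative hypothesis: $H_a: \rho \ne 0$

    Test statistic

    $$
    t = r \sqrt{\frac{n - 2}{1 - r^2}}, \qquad df = n - 2
    $$

    Example: n = 6 and r = .70

    $$
    t = .70 \sqrt{\frac{6 - 2}{1 - .70^2}} = 1.96
    $$

    At $\alpha = .05$ (two-tailed), with n - 2 = 4 degrees of freedom, the critical value of t is 2.78.

    Since 1.96 < 2.78, the null hypothesis is not rejected.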
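
    The t test for the example can be reproduced as follows. This is a sketch under stated assumptions: the function name is hypothetical, and the critical value 2.78 is taken from the slide rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r = 0.70
n = 6
t = correlation_t_statistic(r, n)

# Critical value quoted on the slide for alpha = .05, df = 4 (assumed two-tailed).
critical_t = 2.78
reject_null = abs(t) > critical_t

print(f"t = {t:.2f}, critical t = {critical_t}, reject H0: {reject_null}")
```

    With only six observations, even r = .70 does not reach significance at the .05 level.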


    Simple Regression Analysis

    Bivariate (two-variable) linear regression: the most elementary regression model

    Dependent variable: the variable to be predicted, usually called Y

    Independent variable: the predictor or explanatory variable, usually called X

    Nonlinear relationships and regression models with more than one independent variable can be explored by using multiple regression models


    Regression Models

    Deterministic regression model: $Y = \beta_0 + \beta_1 X$

    Probabilistic regression model: $Y = \beta_0 + \beta_1 X + \varepsilon$

    $\beta_0$ and $\beta_1$ are population parameters

    $\beta_0$ and $\beta_1$ are estimated by the sample statistics $b_0$ and $b_1$


    Equation of the Simple Regression Line

    $$
    \hat{Y} = b_0 + b_1 X
    $$

    where:

    $b_0$ = the sample intercept

    $b_1$ = the sample slope

    $\hat{Y}$ = the predicted value of Y


    Least Squares Analysis

    Least squares analysis is a process whereby a regression model is developed by producing the minimum sum of the squared error values.

    The vertical distance from each point to the line is the error of the prediction.

    The least squares regression line is the regression line that results in the smallest sum of errors squared.
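
    The minimizing property can be illustrated numerically. The sketch below is a hedged illustration, not the slides' own example: it fits a line to five hypothetical points and shows that nudging the intercept or slope away from the fitted values increases the sum of squared errors.

```python
def sum_squared_errors(xs, ys, b0, b1):
    """Sum of squared vertical distances from each point to the line b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Return (b0, b1) for the least squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical illustration data (not from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = least_squares_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, b0, b1)

# Small perturbations of the intercept and slope; each should give a larger SSE.
perturbations = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]
perturbed_sse = [
    sum_squared_errors(xs, ys, b0 + db0, b1 + db1) for db0, db1 in perturbations
]

print(f"fitted line: Y-hat = {b0:.2f} + {b1:.2f} X, SSE = {best_sse:.4f}")
for (db0, db1), sse in zip(perturbations, perturbed_sse):
    print(f"shift b0 by {db0:+.2f}, b1 by {db1:+.2f}: SSE = {sse:.4f}")
```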


    Least Squares Analysis

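    $$
    b_1 = \frac{SS_{XY}}{SS_X}
        = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}
        = \frac{\sum XY - n \bar{X} \bar{Y}}{\sum X^2 - n \bar{X}^2}
        = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}}
    $$

    $$
    b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}
    $$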


    Solving for b1 and b0 of the Regression Line: Airline Cost Example

    Number of Passengers (X)   Cost in $1,000 (Y)   X^2     XY
    61                         4.28                 3,721   261.08
    63                         4.08                 3,969   257.04
    67                         4.42                 4,489   296.14
    69                         4.17                 4,761   287.73
    70                         4.48                 4,900   313.60
    74                         4.30                 5,476   318.20
    76                         4.82                 5,776   366.32
    81                         4.70                 6,561   380.70
    86                         5.11                 7,396   439.46
    91                         5.13                 8,281   466.83
    95                         5.64                 9,025   535.80
    97                         5.56                 9,409   539.32

    $\sum X = 930$, $\sum Y = 56.69$, $\sum X^2 = 73{,}764$, $\sum XY = 4{,}462.22$
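
    The slope and intercept can be worked out from these summations with the standard least squares computational formulas. The sketch below is an illustration with hypothetical function and variable names; only the four summations and n = 12 come from the table.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Slope b1 and intercept b0 of the least squares line from column summations."""
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

# Summations from the airline cost table.
b0, b1 = least_squares_from_sums(
    n=12,
    sum_x=930,        # sum of passengers X
    sum_y=56.69,      # sum of cost Y, in $1,000
    sum_x2=73764,     # sum of X squared
    sum_xy=4462.22,   # sum of X times Y
)
print(f"Y-hat = {b0:.2f} + {b1:.4f} X")  # Y-hat = 1.57 + 0.0407 X
```

    Read this way, each additional passenger is associated with roughly $40.70 of additional cost (0.0407 thousand dollars), and the intercept of about 1.57 corresponds to $1,570.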


    Residual Analysis: Airline Cost Example

    Number of Passengers (X)   Cost in $1,000 (Y)   Predicted Value (Ŷ)   Residual (Y - Ŷ)
    61                         4.28                 4.053                 .227
    63                         4.08                 4.134                 -.054
    67                         4.42                 4.297                 .123
    69                         4.17                 4.378                 -.208
    70                         4.48                 4.419                 .061
    74                         4.30                 4.582                 -.282
    76                         4.82                 4.663                 .157
    81                         4.70                 4.867                 -.167
    86                         5.11                 5.070                 .040
    91                         5.13                 5.274                 -.144
    95                         5.64                 5.436                 .204
    97                         5.56                 5.518                 .042

    $\sum (Y - \hat{Y}) = -.001$
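
    The residuals can be recomputed from the raw airline data. This is a minimal sketch with hypothetical variable names; it fits the line with unrounded coefficients, so the residuals sum to zero up to floating-point error, whereas the slide's -.001 reflects rounding to three decimals.

```python
# Airline cost data from the table: passengers (X) and cost in $1,000 (Y).
passengers = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
costs = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(passengers)
x_bar = sum(passengers) / n
y_bar = sum(costs) / n

# Least squares slope and intercept (definitional form).
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(passengers, costs)) / sum(
    (x - x_bar) ** 2 for x in passengers
)
b0 = y_bar - b1 * x_bar

# Predicted values and residuals (observed minus predicted).
predicted = [b0 + b1 * x for x in passengers]
residuals = [y - y_hat for y, y_hat in zip(costs, predicted)]

for x, y, y_hat, e in zip(passengers, costs, predicted, residuals):
    print(f"{x:3d}  {y:5.2f}  {y_hat:6.3f}  {e:7.3f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```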


    Standard Error of the Estimate

    Residuals represent errors of estimation for individual points.

    A more useful measurement of error is the standard error of the estimate.

    The standard error of the estimate, denoted $s_e$, is a standard deviation of the error of the regression model.


    Standard Error of the Estimate for the Airline Cost Example

    $$
    SSE = \sum (Y - \hat{Y})^2 = 0.31434
    $$

    $$
    s_e = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{0.31434}{10}} = 0.1773
    $$
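
    The standard error for the airline example can be checked from the residuals. The sketch below is illustrative, with hypothetical variable names; it uses the rounded residuals listed in the residual analysis table and n = 12.

```python
import math

# Residuals (Y minus predicted Y) as listed in the airline cost residual table.
residuals = [0.227, -0.054, 0.123, -0.208, 0.061, -0.282,
             0.157, -0.167, 0.040, -0.144, 0.204, 0.042]

n = len(residuals)

# Sum of squared errors, then the standard error with n - 2 degrees of freedom.
sse = sum(e ** 2 for e in residuals)
s_e = math.sqrt(sse / (n - 2))

print(f"SSE = {sse:.5f}")   # SSE = 0.31434
print(f"s_e = {s_e:.4f}")   # s_e = 0.1773
```

    Since cost is measured in $1,000, a standard error of 0.1773 corresponds to about $177.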


    Thank you