Lecture 5 Correlation and Regression Dr Peter Wheale

Upload: emery-powers

Post on 15-Jan-2016

Page 1: Lecture 5 Correlation and Regression Dr Peter Wheale

Lecture 5: Correlation and Regression

Dr Peter Wheale

A Scatter Plot of Monthly Returns

Interpretation of Correlation Coefficient

Correlation coefficient (r)    Interpretation
r = +1                         perfect positive correlation
0 < r < +1                     positive linear relationship
r = 0                          no linear relationship
-1 < r < 0                     negative linear relationship
r = -1                         perfect negative correlation

Scatter Plots and Correlation

Covariance of Rates of Return

$$\mathrm{cov}_{1,2} = \frac{\sum_{t=1}^{n} (R_{t,1} - \bar{R}_1)(R_{t,2} - \bar{R}_2)}{n - 1}$$

Example: Calculate the covariance between the returns on the two stocks indicated below:

Covariance Using Historical Data

Mean returns: R̄1 = 0.05, R̄2 = 0.07

Sum of cross-products of deviations from the means: Σ = 0.0154

Cov = 0.0154 / 2 = 0.0077 (dividing by n - 1 = 2)
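The calculation can be sketched in Python as follows. The slide's data table did not survive, so the two return series below are hypothetical: they were chosen only because they reproduce the means (0.05 and 0.07) and the cross-product sum (0.0154) shown above. The function name `sample_covariance` is also an assumption made for illustration.

```python
def sample_covariance(r1, r2):
    """Sample covariance of two equal-length return series (n - 1 divisor)."""
    if len(r1) != len(r2) or len(r1) < 2:
        raise ValueError("need two series of equal length with at least 2 observations")
    n = len(r1)
    mean1 = sum(r1) / n
    mean2 = sum(r2) / n
    # Sum of cross-products of deviations from each series' mean
    cross = sum((a - mean1) * (b - mean2) for a, b in zip(r1, r2))
    return cross / (n - 1)


# Hypothetical returns (not from the lecture), picked to match the slide's totals
stock1 = [0.05, -0.02, 0.12]
stock2 = [0.07, -0.04, 0.18]

cov_12 = sample_covariance(stock1, stock2)
print(round(cov_12, 4))  # 0.0077
```

Dividing by n - 1 rather than n matches the sample covariance formula used in the lecture.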

Sample Correlation Coefficient

Correlation, ρ, is a standardized measure of covariance and is bounded by +1 and -1:

$$\rho_{1,2} = \frac{\mathrm{Cov}_{1,2}}{\sigma_1 \sigma_2}$$

Example: The covariance of returns on two assets is 0.0051, with σ1 = 7% and σ2 = 11%. Calculate ρ1,2.

$$\rho_{1,2} = \frac{0.0051}{0.07 \times 0.11} = 0.662$$
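The same arithmetic as a short Python sketch; the function name `correlation_from_covariance` is hypothetical, and the inputs are the figures from the example, with the standard deviations written as decimals.

```python
def correlation_from_covariance(cov_12, sigma_1, sigma_2):
    """Standardize a covariance by the product of the two standard deviations."""
    if sigma_1 <= 0 or sigma_2 <= 0:
        raise ValueError("standard deviations must be positive")
    return cov_12 / (sigma_1 * sigma_2)


# Inputs from the example: Cov = 0.0051, sigma1 = 7%, sigma2 = 11%
rho_12 = correlation_from_covariance(0.0051, 0.07, 0.11)
print(round(rho_12, 3))  # 0.662
```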

Testing H0: Correlation = 0

The test of whether the true correlation between two random variables is zero (i.e., there is no correlation) is a t-test based on the sample correlation coefficient, r. With n (pairs of) observations the test statistic is:

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

Degrees of freedom: n - 2

Example

Data: n = 10, r = 0.475

Determine if the sample correlation is significant at the 5% level of significance.

$$t = \frac{0.475\sqrt{8}}{\sqrt{1 - 0.475^2}} = \frac{1.3435}{0.88} = 1.5267$$

The two-tailed critical t-values at a 5% level of significance with df = 8 (n - 2) are found to be ±2.306.

Since -2.306 ≤ 1.5267 ≤ 2.306, the null hypothesis cannot be rejected, i.e. the correlation between variables X and Y is not significantly different from zero at a 5% significance level.
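A minimal Python sketch of this test, using only the standard library. The function name `correlation_t_statistic` is an assumption, and the critical value 2.306 is taken from the example rather than computed from the t-distribution.

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for H0: true correlation = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 pairs of observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


# Values from the example: n = 10 pairs, r = 0.475
T_CRITICAL = 2.306  # two-tailed, 5% level, df = 8 (as quoted in the example)

t_stat = correlation_t_statistic(0.475, 10)
reject_null = abs(t_stat) > T_CRITICAL

print(round(t_stat, 4), reject_null)  # 1.5267 False
```

For other sample sizes or significance levels the critical value would have to be looked up in a t-table or obtained from a statistics library.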

Linear Regression

• Dependent variable: you are trying to explain changes in this variable
• Independent variable: the variable being used to explain the changes in the dependent variable
• Example: You want to predict housing starts using mortgage interest rates:
  Independent variable = mortgage interest rates
  Dependent variable = housing starts

Regression Equation

$$Y_i = b_0 + b_1 X_i + \varepsilon_i$$

where:

Y_i = dependent variable
b_0 = y-intercept
b_1 = slope coefficient
X_i = independent variable
ε_i = error term

Assumptions of Linear Regression

• Linear relation between dependent and independent variables

• Independent variable uncorrelated with error term

• Expected value of error term is zero

• Variance of the error term is constant

• Error term is independently distributed

• Error term is normally distributed

Estimated Regression Coefficients

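Estimated regression line is:

$$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$$

where \hat{b}_0 is the estimated y-intercept and \hat{b}_1 is the estimated slope.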

Estimating the slope coefficient

b1 = cov(X, Y) / var(X)

Example: Compute the slope coefficient and intercept term for the least squares regression equation using the following information:

Σ(X - Xmean)(Y - Ymean) = 445 and Σ(X - Xmean)² = 374.50. The sample means of X and Y are 25 and 75, respectively.

The slope coefficient: b1 = 445 / 374.5 = 1.188. (The n - 1 divisors in cov(X, Y) and var(X) cancel, so the ratio of the two sums gives the slope directly.)

The intercept term: b0 = Ymean - b1 × Xmean = 75 - 1.188 (25) = 45.3.
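The example can be checked with a few lines of Python. This is a sketch only: the function name `ols_coefficients` and its parameter names are assumptions, while the inputs are the sums and means given in the example.

```python
def ols_coefficients(sum_cross, sum_sq_x, mean_x, mean_y):
    """Least squares slope and intercept from summary statistics.

    sum_cross: sum of (X - mean_x) * (Y - mean_y)
    sum_sq_x:  sum of (X - mean_x) squared
    """
    if sum_sq_x <= 0:
        raise ValueError("X must vary for the slope to be defined")
    b1 = sum_cross / sum_sq_x    # slope: cov(X, Y) / var(X)
    b0 = mean_y - b1 * mean_x    # the fitted line passes through the means
    return b0, b1


# Figures from the example
b0, b1 = ols_coefficients(445.0, 374.5, 25.0, 75.0)
print(round(b1, 3), round(b0, 1))  # 1.188 45.3
```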

Calculating the Standard Error of the Estimate (SEE)

• SEE measures the accuracy of the prediction from a regression equation
• It is the standard deviation of the error term
• The lower the SEE, the greater the accuracy

$$\mathrm{SEE} = \sqrt{\frac{\mathrm{SSE}}{n - 2}}$$

where:

SSE = sum of squared errors
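As an illustration, the sketch below fits a least squares line and computes the SEE from its residuals. The data points and the function names `fit_line` and `standard_error_of_estimate` are hypothetical and do not come from the lecture.

```python
import math


def fit_line(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return mean_y - b1 * mean_x, b1


def standard_error_of_estimate(x, y):
    """SEE = sqrt(SSE / (n - 2)) for a simple linear regression of y on x."""
    if len(x) != len(y) or len(x) <= 2:
        raise ValueError("need more than 2 paired observations")
    b0, b1 = fit_line(x, y)
    # SSE: sum of squared differences between actual and fitted values
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))


# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

see = standard_error_of_estimate(x, y)
print(round(see, 4))  # 0.8944
```

The divisor is n - 2 because two coefficients (intercept and slope) are estimated from the data.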

Interpreting the Coefficient of Determination (R²)

• R² measures the percentage of the variation in the dependent variable that can be explained by the independent variable

• An R² of 0.25 means the independent variable explains 25% of the variation in the dependent variable

Caution: You cannot conclude causation

Calculating the Coefficient of Determination (R²)

• For simple linear regression, R² is the correlation coefficient (r) squared

Example: Correlation coefficient between X and Y, (r) = 0.50

Coefficient of determination = 0.50² = 0.25

Coefficient of Determination (R²)

R² can also be calculated with SST and SSR:

SS Total = SS Regression + SS Error

Total variation = explained variation + unexplained variation

$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{\mathrm{SST} - \mathrm{SSE}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}} = \frac{\text{explained variation}}{\text{total variation}}$$
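The decomposition can be verified numerically. The sketch below uses a small hypothetical data set (not from the lecture) to show that SSR / SST, 1 - SSE / SST, and the squared correlation coefficient agree for a simple linear regression; all variable names are illustrative.

```python
import math

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

# Least squares fit
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * xi for xi in x]

sst = s_yy                                             # total variation
ssr = sum((f - mean_y) ** 2 for f in fitted)           # explained variation
sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))   # unexplained variation

r_squared = ssr / sst
r = s_xy / math.sqrt(s_xx * s_yy)  # sample correlation coefficient

print(round(r_squared, 4), round(1 - sse / sst, 4), round(r ** 2, 4))  # 0.6 0.6 0.6
```

The equality with r² holds for simple (one independent variable) regression, as stated on the previous slide.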