Correlation Analysis

Upload: silvia-king

Post on 02-Jan-2016



A measure of association between two or more numerical variables.

For example:

height & weight relationship

price and demand relationship


Independent and Dependent Variables

Independent variable: The variable that is the basis of estimation is called the independent variable.

Dependent variable: The variable whose value is to be estimated is called the dependent variable. Its value depends on the independent variable.


Example

Student   Hours studied   % Marks
1         6               82
2         2               63
3         1               57
4         5               88
5         3               68
6         2               75


Uses

1. Correlation analysis measures, in a single figure, the degree of relationship between the variables.

2. Correlation analysis contributes to the understanding of economic behavior, helps locate the critically important variables through which disturbances spread, and suggests the paths through which stabilizing forces become effective.

3. In business, correlation analysis enables the executive to estimate costs, sales, prices and other variables on the basis of some other series with which they may be functionally related.


Example

The independent variable in this example is the number of hours studied.

The mark the student obtains is the dependent variable.

The mark a student obtains depends on the number of hours he or she studies.

Are these two variables related?
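
One way to answer this numerically is Pearson's correlation coefficient, which these slides define further on. The following is a minimal Python sketch, assuming the six (hours, marks) pairs from the example table; the function name `pearson_r` is only illustrative.

```python
from math import sqrt

# Data from the example table: hours studied (X) and % marks (Y) for six students.
hours = [6, 2, 1, 5, 3, 2]
marks = [82, 63, 57, 88, 68, 75]

def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)

r = pearson_r(hours, marks)
print(f"r = {r:.3f}")  # about 0.86
```

A value this close to 1 suggests that, in this small sample, students who studied longer tended to score higher. It does not by itself show that studying caused the higher marks.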


Types of correlation

Correlation

Simple, partial and multiple

Positive and negative


Positive correlation

A positive relationship exists when both variables increase together or decrease together.

Examples: height & weight, income & expenditure, training & performance.


Negative correlation

A negative relationship exists when one variable increases as the other decreases, or vice versa.

Examples: strength & age, demand & price.


Simple, partial and multiple correlation

An association between only two variables is simple correlation.

(e.g. Height & Weight)

An association among more than two variables is multiple correlation.

(e.g. Capital, Production cost, Advertisement cost & Profit)

In the case of multiple correlation, the association between two of the variables when the effects of the other variables are held constant is called partial correlation.

(e.g. correlation between Capital & Profit when the effects of Production cost & Advertisement cost remain unchanged.)
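
The slides do not give a formula for partial correlation. As an illustration only, the standard first-order formula (one variable held constant) can be sketched in Python as below. The three pairwise correlations are hypothetical values chosen to show the arithmetic, not results from any data in these slides, and holding two variables constant, as in the Capital & Profit example, would need the formula applied a second time.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y with Z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations (illustrative only).
r_capital_profit = 0.80   # X and Y
r_capital_cost = 0.60     # X and Z
r_profit_cost = 0.70      # Y and Z

r_partial = partial_corr(r_capital_profit, r_capital_cost, r_profit_cost)
print(round(r_partial, 3))
```

With these made-up inputs the partial correlation comes out lower than the simple correlation of 0.80, because part of the Capital & Profit association is shared with the third variable.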


Scatter plots

A scatter plot is a chart that shows the relationship between two quantitative variables measured on the same observations.

In a scatter plot, one of the variables (usually the independent variable) is plotted along the horizontal or X axis and the other is plotted along the vertical or Y axis.


Specific Example

For seven random summer days, a person recorded the temperature and their water consumption during a three-hour period spent outside.

Temperature (F)   Water consumption (ounces)
75                16
83                20
85                25
85                27
93                32
97                48
99                48


How would you describe the graph?
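
The plot itself is not reproduced here, but the pattern can be checked numerically. A small Python sketch, assuming the seven temperature and water-consumption pairs listed above and using the computational form of Pearson's r:

```python
from math import sqrt

temperature = [75, 83, 85, 85, 93, 97, 99]   # degrees F
water = [16, 20, 25, 27, 32, 48, 48]          # ounces

n = len(temperature)
sum_x, sum_y = sum(temperature), sum(water)
sum_xy = sum(x * y for x, y in zip(temperature, water))
sum_x2 = sum(x * x for x in temperature)
sum_y2 = sum(y * y for y in water)

r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(f"r = {r:.3f}")  # about 0.955
```

A value near 0.95 matches what the data suggest by eye: water consumption rises with temperature in a roughly linear way.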


Types of correlations

[Figure: four scatter plots of Y against X, titled perfect positive, perfect negative, strong positive, and strong negative]


No linear correlation

[Figure: scatter plot with x = height (axis from 60 to 80) and y = IQ (axis from 80 to 160), showing no linear pattern]


Correlation Coefficient

A quantity that measures the direction and the strength of the linear association between two paired numerical variables is called the correlation coefficient.


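Pearson’s correlation coefficient or product moment correlation

$$
r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}
  = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\left(\sum X_i^2 - n\bar{X}^2\right)\left(\sum Y_i^2 - n\bar{Y}^2\right)}}
  = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}
$$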


Example 1

A company has brought out an annual report in which capital investment and profits are given for a few years.

Capital investment (crores)   10   16   18   24   36   48   57
Profits (lakh)                12   14   13   18   26   38   62


Calculation

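X = capital investment, Y = profits

$$
r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}
$$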


Calculation (continued)

X     Y     XY    X^2   Y^2
10    12
16    14
18    13
24    18
36    26
48    38
57    62

Totals: $\sum X_i =$    $\sum Y_i =$    $\sum X_i Y_i =$    $\sum X_i^2 =$    $\sum Y_i^2 =$
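
The blank columns and totals of this worksheet can be filled in mechanically. A minimal Python sketch, assuming the seven (capital investment, profit) pairs from Example 1 and the computational formula for r; the variable names are illustrative:

```python
from math import sqrt

capital = [10, 16, 18, 24, 36, 48, 57]  # X, capital investment
profit = [12, 14, 13, 18, 26, 38, 62]   # Y, profits

# One worksheet row per year: X, Y, XY, X^2, Y^2.
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(capital, profit)]
print(f"{'X':>5}{'Y':>5}{'XY':>7}{'X^2':>7}{'Y^2':>7}")
for row in rows:
    print("{:>5}{:>5}{:>7}{:>7}{:>7}".format(*row))

# Column totals: sum X, sum Y, sum XY, sum X^2, sum Y^2.
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(col) for col in zip(*rows))
print("{:>5}{:>5}{:>7}{:>7}{:>7}".format(sum_x, sum_y, sum_xy, sum_x2, sum_y2))

n = len(rows)
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(f"r = {r:.3f}")  # about 0.950
```

The totals row reads 209, 183, 7304, 8105 and 6797, and r of roughly 0.95 indicates a very strong positive linear association between capital investment and profits in these data.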


Example 2

A department store has the following sales statistics for the past year for 8 salesmen with varying years of experience.

Salesman   Years of experience   Annual sales (tk)
1          1                     80
2          3                     97
3          4                     92
4          4                     102
5          6                     103
6          8                     111
7          10                    119
8          11                    117


Calculation

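X = years of experience, Y = annual sales

$$
r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}
$$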


Calculation (continued)

Salesman   X     Y     XY    X^2   Y^2
1          1     80
2          3     97
3          4     92
4          4     102
5          6     103
6          8     111
7          10    119
8          11    117

Totals: $\sum X_i =$    $\sum Y_i =$    $\sum X_i Y_i =$    $\sum X_i^2 =$    $\sum Y_i^2 =$


Properties of r

r lies between -1 and +1, i.e., $-1 \le r \le 1$.

The correlation coefficient is a symmetric measure: interchanging X and Y does not change its value.

r is positive or negative according to whether the numerator of the formula is positive or negative.

The correlation coefficient is a dimensionless quantity, implying that it is not expressed in any unit of measurement.
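
These properties can be checked on the data of Example 2 (years of experience and annual sales). The sketch below is illustrative only: `pearson_r` is a hypothetical helper name, and converting years to months is simply one convenient change of unit.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from the computational formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    num = n * sum(a * b for a, b in zip(x, y)) - sum_x * sum_y
    den = sqrt((n * sum(a * a for a in x) - sum_x ** 2)
               * (n * sum(b * b for b in y) - sum_y ** 2))
    return num / den

years = [1, 3, 4, 4, 6, 8, 10, 11]              # Example 2, years of experience
sales = [80, 97, 92, 102, 103, 111, 119, 117]   # Example 2, annual sales

r = pearson_r(years, sales)
r_swapped = pearson_r(sales, years)             # X and Y interchanged
months = [12 * x for x in years]                # same experience in months
r_rescaled = pearson_r(months, sales)

print(f"r = {r:.3f}")                      # about 0.949
print(-1 <= r <= 1)                        # r stays within its bounds
print(abs(r - r_swapped) < 1e-12)          # symmetric in X and Y
print(abs(r - r_rescaled) < 1e-12)         # unchanged by a change of unit
```

All three checks print True for these data.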


Interpretation

r = 1 indicates a perfect positive correlation or relationship. In this case, all the points in a scatter diagram lie on a straight line that has an upward direction.

r = -1 indicates a perfect negative correlation or relationship. In this case, all the points in a scatter diagram lie on a straight line that has a downward direction.

r = 0 indicates that the variables are not linearly related, i.e., there is no linear correlation.


Interpretation

A value of r close to 1 indicates a strong positive correlation or strong positive linear relationship.

A value of r close to -1 indicates a strong negative correlation or strong negative linear relationship.

A positive value of r close to 0 indicates a weak positive correlation or weak positive linear relationship.

A negative value of r close to 0 indicates a weak negative correlation or weak negative linear relationship.


[Figure: number line for r marked at -1, -0.5, 0, 0.5 and 1, labelled from left to right: perfect negative, strong negative, moderate negative, weak negative, zero, weak positive, moderate positive, strong positive, and perfect positive correlation. Values below 0 are negative correlation; values above 0 are positive correlation.]

Correlation Coefficient Interpretation

Coefficient range   Strength of relationship
0.01 - 0.20         Very weak
0.21 - 0.40         Weak
0.41 - 0.60         Moderate
0.61 - 0.80         Strong
0.81 - 0.99         Very strong
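
The table can be turned into a small lookup. The Python sketch below rests on several assumptions that the slide does not state: the ranges are applied to the absolute value of r, each upper bound is treated as inclusive so that values such as 0.205 are not left unclassified, and the labels for exactly 0 and exactly 1 follow the interpretation slides. The function names are hypothetical.

```python
def strength(r):
    """Label the size of r using the ranges in the table (assumed to apply to |r|)."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size == 0:
        return "no linear correlation"
    if size == 1:
        return "perfect"
    for upper, label in [(0.20, "very weak"), (0.40, "weak"),
                         (0.60, "moderate"), (0.80, "strong")]:
        if size <= upper:
            return label
    return "very strong"

def describe(r):
    """Combine the strength label with the direction given by the sign of r."""
    label = strength(r)
    if r == 0:
        return label
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"

for r in (0.15, -0.35, 0.55, -0.75, 0.95):
    print(r, "->", describe(r))
```

The loop prints one label per value, for example "moderate positive" for 0.55 and "strong negative" for -0.75.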


Interpret the following

i. r = -0.098

ii. r = 1

iii. r = 0.5

iv. r = -1

v. r = 0

vi. r = 0.92


Types of correlations

[Figure: four scatter plots of Y against X, labelled r = 1, r = -1, r close to +1, and r close to -1]


Type of correlation

[Figure: scatter plot of Y against X with r close to zero]


Interpret the following

[Figures 1 and 2: scatter plots, not reproduced. For each: r = ? Why?]


Interpret the following

[Figure 3: scatter plot of Y against X, not reproduced. r = ? Why?]


Characteristics of Correlation

Correlation does not tell us anything about causation.

To calculate correlation, both variables must be quantitative (not categorical).

A positive value for r indicates a positive association between x and y. A negative value for r indicates a negative association between x and y.


Regression Analysis

Regression analysis is a technique for studying the relationship of one dependent variable with one or more independent variables, with a view to estimating or predicting the average value of the dependent variable in terms of the known or fixed values of the independent variables.


Objectives of regression

Estimate the relationship that exists between the dependent variable and the independent variable.

Determine the effect of each of the independent variables on the dependent variable.

Predict the value of the dependent variable for a given value of the independent variable.


Regression vs. Correlation

Correlation answers the question of the STRENGTH of the linear association between paired variables, say X and Y. Regression, on the other hand, tells us the FORM of the linear association that best predicts Y from the values of X.

Correlation never measures a cause-and-effect relationship. Regression treats X as the explanatory variable and Y as the response, although a fitted regression line does not by itself prove that X causes Y.


Regression vs. Correlation

Linear regression is not symmetric in X and Y: interchanging X and Y gives a different regression equation. On the other hand, if you interchange X and Y in the calculation of the correlation coefficient, you get the same value of the correlation coefficient.
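
The asymmetry can be seen on the Example 2 data by fitting the line both ways. This Python sketch assumes the least squares slope formula given later in these slides; `slope` is a hypothetical helper name.

```python
def slope(x, y):
    """Least squares slope for the regression of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

years = [1, 3, 4, 4, 6, 8, 10, 11]              # Example 2, years of experience
sales = [80, 97, 92, 102, 103, 111, 119, 117]   # Example 2, annual sales

b_sales_on_years = slope(years, sales)   # sales regressed on experience
b_years_on_sales = slope(sales, years)   # experience regressed on sales

print(round(b_sales_on_years, 3))   # about 3.53
print(round(b_years_on_sales, 3))   # about 0.255

# The product of the two slopes equals r squared, which is the same either way.
print(round(b_sales_on_years * b_years_on_sales, 3))   # about 0.9
```

The two slopes differ by more than a factor of ten, yet their product is the square of the single, symmetric correlation coefficient.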


Types of regression

Simple linear regression

shows the relationship between one dependent variable and one independent variable.

Multiple regression

shows the relationship between one dependent variable and two or more independent variables.


Regression model (equation)

A model is a mathematical equation that describes the relationship between a dependent variable and a set of independent variables.

$$
Y_i = a + bX_i
$$

Here $Y_i$ is the dependent variable, $X_i$ is the independent variable, a is the intercept term, and b is the slope term.


Interpretation

$$
Y_i = a + bX_i
$$

Y is the dependent variable.

X is the independent variable.

a is the intercept term, also the expected value of Y for X = 0.

b is the slope term, also known as the regression coefficient. It represents the amount of change in Y for each unit change in X.


Estimates of a and b

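The least squares principle is used to estimate a and b. The equations to determine a and b are

$$
\hat{b} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
        = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}
$$

$$
\hat{a} = \bar{Y} - \hat{b}\bar{X}
$$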


Properties of b

It lies between $-\infty$ and $+\infty$, i.e., $-\infty < b < \infty$.

A negative value of b indicates that the relationship between the two variables is negative.

A positive value of b indicates that the relationship between the two variables is positive.

b = 0 indicates that there is no linear relationship between the two variables.


Example 1

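Age of truck (years)                             5   4   3   1   7
Repair expense last year (hundreds of dollars)   7   7   6   4   10

$$
\hat{Y} = \hat{a} + \hat{b}X
$$

$$
\hat{b} = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}
$$

$$
\hat{a} = \bar{Y} - \hat{b}\bar{X}
$$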


Calculation

X     Y     XY    X^2
5     7     35    25
4     7     28    16
3     6     18    9
2     4     8     4
7     10    70    49

Totals: $\sum X_i =$    $\sum Y_i =$    $\sum X_i Y_i =$    $\sum X_i^2 =$
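
The totals and the fitted line can be worked out as in the following Python sketch. It uses the values exactly as they appear in the calculation table, which lists an age of 2 for the fourth truck where the data listing shows 1; which of the two is intended is not clear from the slides.

```python
ages = [5, 4, 3, 2, 7]        # X, truck age in years (calculation table values)
repairs = [7, 7, 6, 4, 10]    # Y, repair expense in hundreds of dollars

n = len(ages)
sum_x, sum_y = sum(ages), sum(repairs)
sum_xy = sum(x * y for x, y in zip(ages, repairs))
sum_x2 = sum(x * x for x in ages)
print(sum_x, sum_y, sum_xy, sum_x2)   # 21 34 159 103

# Least squares estimates of the slope and the intercept.
b_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a_hat = sum_y / n - b_hat * sum_x / n
print(f"Y_hat = {a_hat:.3f} + {b_hat:.3f} X")   # Y_hat = 2.203 + 1.095 X
```

Under these values, each additional year of age is associated with roughly 1.09 hundred dollars (about $109) more in annual repair expense.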


Example 2

Mr. A, president of a financial services firm, believes that there is a relationship between the number of client contacts and the dollar amount of sales. To document this assertion, Mr. A gathered the following sample information.

Number of contacts         14   12   20   16   46   23
Sales (thousand dollars)   24   14   28   30   80   30


Calculation

1. Find the regression coefficient and interpret it.

2. Find the regression equation that expresses the relationship between these two variables.

3. Determine the amount of sales if 40 contacts are made.
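
These three tasks can be checked with a short Python sketch, assuming the six (contacts, sales) pairs given for Mr. A and the least squares formulas from the earlier slide; `fit_line` is a hypothetical helper name.

```python
contacts = [14, 12, 20, 16, 46, 23]   # X, number of client contacts
sales = [24, 14, 28, 30, 80, 30]      # Y, sales in thousand dollars

def fit_line(x, y):
    """Return the least squares intercept and slope for the regression of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    b_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a_hat = sum_y / n - b_hat * sum_x / n
    return a_hat, b_hat

a_hat, b_hat = fit_line(contacts, sales)
predicted = a_hat + b_hat * 40

print(f"regression coefficient b = {b_hat:.3f}")   # about 1.814
print(f"Y_hat = {a_hat:.3f} + {b_hat:.3f} X")      # intercept about -5.270
print(f"predicted sales at 40 contacts = {predicted:.2f} thousand dollars")
```

The slope of about 1.81 means each additional contact is associated with roughly 1.81 thousand dollars more in sales, and 40 contacts give a predicted value of about 67.3 thousand dollars.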


Problem

Given data

X 8 12 5 8 15 10

Y 10 6 12 10 7 9

1. Draw a scatter diagram.

2. Calculate the correlation coefficient and interpret it.

3. Find the regression coefficient and interpret it.

4. Determine the value of Y when X = 11, 18.
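
For readers who want to check their hand calculations on parts 2 to 4, here is a minimal Python sketch that assumes the six (X, Y) pairs given above and the computational formulas from the earlier slides. It does not draw the scatter diagram.

```python
from math import sqrt

x = [8, 12, 5, 8, 15, 10]
y = [10, 6, 12, 10, 7, 9]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

s_xy = n * sum_xy - sum_x * sum_y   # shared numerator of r and b
s_xx = n * sum_x2 - sum_x ** 2
s_yy = n * sum_y2 - sum_y ** 2

r = s_xy / sqrt(s_xx * s_yy)
b_hat = s_xy / s_xx
a_hat = sum_y / n - b_hat * sum_x / n

print(f"r = {r:.3f}, b = {b_hat:.3f}, a = {a_hat:.3f}")
for x_new in (11, 18):
    print(f"X = {x_new}: predicted Y = {a_hat + b_hat * x_new:.2f}")
```

Both r and b come out negative, so Y tends to fall as X rises. Note that X = 18 lies outside the observed range of X (5 to 15), so that prediction is an extrapolation and should be read with more caution than the one at X = 11.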


Coefficient of Determination

The coefficient of determination ($r^2$) is the proportion of the total variation in the dependent variable (Y) that is explained or accounted for by the variation in the independent variable (X).

It is the square of the coefficient of correlation.

It ranges from 0 to 1.

It does not give any information on the direction of the relationship between the variables.
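
The "proportion of variation explained" reading can be verified numerically. The Python sketch below, which assumes the Example 2 data (years of experience and annual sales) and a least squares line fitted as in the earlier slides, computes $r^2$ in two ways: as the square of r, and as one minus the ratio of residual variation to total variation.

```python
years = [1, 3, 4, 4, 6, 8, 10, 11]              # X, years of experience
sales = [80, 97, 92, 102, 103, 111, 119, 117]   # Y, annual sales

n = len(years)
mean_x = sum(years) / n
mean_y = sum(sales) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, sales))
s_xx = sum((x - mean_x) ** 2 for x in years)
s_yy = sum((y - mean_y) ** 2 for y in sales)     # total variation in Y

# Least squares line and the variation it leaves unexplained.
b_hat = s_xy / s_xx
a_hat = mean_y - b_hat * mean_x
residual = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(years, sales))

r_squared = s_xy ** 2 / (s_xx * s_yy)    # square of the correlation coefficient
explained_share = 1 - residual / s_yy    # share of variation in Y explained by X

print(f"r^2 = {r_squared:.3f}")                  # about 0.900
print(f"explained share = {explained_share:.3f}")
```

Both numbers agree at about 0.90, i.e., roughly 90% of the variation in annual sales in this sample is accounted for by years of experience.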
