pearson product moment correlation

Post on 20-Jun-2015

DESCRIPTION

Statistics education

TRANSCRIPT

Pearson Product Moment Correlation

Welcome to the Pearson Product Moment Correlation Learning Module

• The Pearson Product Moment Correlation is the most widely used statistic for describing the relationship between two continuous variables.

[Diagram: Variable A and Variable B]

• By continuous we mean a variable that can take any value between two points.

• Here is an example:

Suppose the fire department mandates that all fire fighters must weigh between 150 and 250 pounds. The weight of a fire fighter would be an example of a continuous variable, since a fire fighter's weight could take on any value between 150 and 250 pounds.

• The Pearson Product Moment Correlation will indicate either a strong relationship

• Or a weak, even nonexistent, relationship

• Strong relationships can be either positive

• Or negative

• The Pearson Product Moment Correlation, or simply Pearson Correlation, takes values that range from -1.0 to +1.0

-1        0        +1

• A Pearson Correlation of 1.0 indicates a perfect positive relationship. Note two qualities here:

(1) direction
(2) strength

• A +1.0 Pearson Correlation’s direction is positive and its strength is perfect.

• A -1.0 Pearson Correlation’s direction is negative and its strength is perfect.

• A 0.0 Pearson Correlation has no direction and no strength.

• A +0.3 Pearson Correlation’s direction is positive and its strength is moderately weak.

• A -0.1 Pearson Correlation’s direction is negative and its strength is very weak.

• There is another quality as well. With a Pearson Correlation you are considering the relationship between only two variables.

• Three’s a crowd.

• Bottom line: The Pearson Correlation is used only when exploring the relationship between two variables.

• Let’s look at a fictitious problem to illustrate how the Pearson Correlation is calculated.

• Imagine you are conducting a study to determine the relationship between the average daily temperature and the average daily ice cream sales in a particular city.

• Imagine the data set looks like this:

Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560
80°               480
70°               350
60°               320
50°               230

• Notice how as one variable goes up (temperature) the other variable goes up as well (ice cream sales).

• One way to look at this relationship is to rank order both variable values like so:

Ave Daily Temp    Rank    Ave Daily Ice Cream Sales    Rank
90°               1st     560                          1st
80°               2nd     480                          2nd
70°               3rd     350                          3rd
60°               4th     320                          4th
50°               5th     230                          5th

• Notice how their rank orders are identical. Identical rank orders alone do not guarantee a Pearson Correlation of exactly +1.0, but because the two variables also rise and fall by nearly proportional amounts, their Pearson Correlation is very close to +1.0.

• What would a perfectly negative correlation (-1.0) look like?

Ave Daily Temp    Rank    Ave Daily Ice Cream Sales    Rank
90°               1st     230                          5th
80°               2nd     320                          4th
70°               3rd     350                          3rd
60°               4th     480                          2nd
50°               5th     560                          1st

The rank orders are exactly reversed. For these particular values the Pearson Correlation works out to about -0.99, very close to -1.0.

• What would a zero correlation (0.0) look like?

The rank orders of the two variables would show no consistent pattern: high temperatures would be paired with high sales about as often as with low sales.

• Note – Pearson Correlation is not just a comparison of rank-ordered data (that is what a Spearman rank correlation does), but rank order is one factor reflected in a Pearson Correlation. Another factor is how far each value lies from its variable’s mean, measured in standard deviations.
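To make the distinction between rank order and the Pearson Correlation concrete, here is a minimal Python sketch using the temperature and ice cream data (temperatures read as 90 down to 50 degrees). The helper names `pearson` and `ranks` are illustrative and not part of the module. The Pearson value is computed as the average cross product of z-scores, which is the formula the module introduces next, and the standard deviation is assumed to be the population form (dividing by n).

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Average cross product of z-scores (population SD assumed)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / len(xs)

def ranks(values):
    """Rank values from largest (1) to smallest; assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

temps = [90, 80, 70, 60, 50]
sales = [560, 480, 350, 320, 230]

temp_ranks = ranks(temps)    # [1, 2, 3, 4, 5]
sales_ranks = ranks(sales)   # [1, 2, 3, 4, 5]

r_raw = pearson(temps, sales)                # Pearson on the raw values
r_ranks = pearson(temp_ranks, sales_ranks)   # Pearson on the ranks (Spearman, no ties)

print(round(r_raw, 2))    # 0.99
print(round(r_ranks, 2))  # 1.0
```

The ranks agree perfectly, so the rank-based correlation is 1.0, while the Pearson Correlation on the raw values is slightly lower because the sales figures do not change by exactly proportional amounts.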

• The Pearson Product Moment Correlation (PPMC) is calculated as the average cross product of the z-scores of two variables for a single group of observations. Here is the equation for the PPMC:

$$ r = \frac{\sum (Z_X \cdot Z_Y)}{n} $$
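Spelled out, each z-score in this equation is a value's distance from its variable's mean, expressed in standard deviation units. The following is a sketch of that expansion. The symbols $\bar{X}$, $\sigma_X$ and the index $i$ are notation assumed here rather than taken from the slides, and $\sigma$ is taken to be the population standard deviation (dividing by $n$), which is consistent with the value 14.1 that the module uses for the temperatures.

```latex
Z_{X_i} = \frac{X_i - \bar{X}}{\sigma_X}, \qquad
Z_{Y_i} = \frac{Y_i - \bar{Y}}{\sigma_Y}, \qquad
\sigma_X = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}

r = \frac{1}{n}\sum_{i=1}^{n} Z_{X_i} \cdot Z_{Y_i}
```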

• Let’s calculate the Pearson Correlation for the following data set:

Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560
80°               480
70°               350
60°               320
50°               230

• It is very important to note that the Pearson Correlation can be computed in a matter of seconds using statistical software. The next set of slides is designed to help you see what is happening conceptually as well as computationally with the Pearson Correlation.

• When computing a Pearson Correlation you will normally have two variables that DO NOT USE THE SAME METRIC:

Ave Daily Temp                Ave Daily Ice Cream Sales
90°                           560
80°                           480
70°                           350
60°                           320
50°                           230
The metric here is degrees    The metric here is number of ice cream sales

• So we have to get these two variables on the same metric. This is done by calculating the z scores or standardized scores for the values from each variable.

• So these raw score values in separate metrics are transformed into standardized values, which puts them on the same metric:

Different Metric (raw scores)                    Same Metric (z-scores)
Ave Daily Temp    Ave Daily Ice Cream Sales      Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560                            +1.4              +1.5
80°               480                            +0.7              +0.8
70°               350                             0.0              -0.3
60°               320                            -0.7              -0.6
50°               230                            -1.4              -1.3

• Note – this is done by subtracting the variable’s mean from each value (e.g., 90 minus 70 = 20) and dividing the result by the variable’s standard deviation (e.g., 20 / 14.1 = 1.4)
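The standardization step can be sketched in a few lines of Python. This is a minimal illustration, not the module's own procedure: the function name `z_scores` and the variable names are made up here, and the population standard deviation (`statistics.pstdev`, dividing by n) is assumed because it reproduces the 14.1 quoted for the temperatures.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values: subtract the mean, then divide by the population SD."""
    m = mean(values)
    sd = pstdev(values)  # population standard deviation (divides by n)
    return [(v - m) / sd for v in values]

temps = [90, 80, 70, 60, 50]          # average daily temperature, in degrees
sales = [560, 480, 350, 320, 230]     # average daily ice cream sales

temp_z = z_scores(temps)
sales_z = z_scores(sales)

print(round(pstdev(temps), 1))          # 14.1
print([round(z, 1) for z in temp_z])    # [1.4, 0.7, 0.0, -0.7, -1.4]
print([round(z, 1) for z in sales_z])   # [1.5, 0.8, -0.3, -0.6, -1.3]
```

Both lists are now in the same unit (standard deviations from the mean), so they can be multiplied together value by value.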

• Once the values are standardized we multiply them

$$ r = \frac{\sum (Z_X \cdot Z_Y)}{n} $$

Ave Daily Temp (z)         Ave Daily Ice Cream Sales (z)         Cross Product
+1.4                  ×    +1.5                             =    2.1
+0.7                  ×    +0.8                             =    0.6
 0.0                  ×    -0.3                             =    0.0
-0.7                  ×    -0.6                             =    0.4
-1.4                  ×    -1.3                             =    1.9

These are called cross products because we are multiplying across two values. (The cross products shown are computed from the unrounded z-scores and then rounded.)

2.1 + 0.6 + 0.0 + 0.4 + 1.9 = 5.0

• Finally, divide that number (5.0) by the number of observations (in this case 5)

$$ r = \frac{\sum (Z_X \cdot Z_Y)}{n} = \frac{5.0}{5} = +1.0 $$

Sum of the cross products = 5.0

This is the Pearson Correlation, which in this case indicates a nearly perfect positive relationship. The value of exactly +1.0 comes from rounding; with unrounded z-scores r is about +0.99.
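The whole calculation (standardize, multiply, sum, divide by n) can be checked with a short Python sketch. The function names `z_scores` and `pearson_r` are illustrative assumptions, and the population standard deviation is assumed, as in the module's 14.1 example. Working with unrounded z-scores shows how much of the +1.0 result is due to rounding.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Subtract the mean and divide by the population SD."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def pearson_r(xs, ys):
    """Return r = sum(Zx * Zy) / n together with the individual cross products."""
    cross_products = [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]
    return sum(cross_products) / len(cross_products), cross_products

temps = [90, 80, 70, 60, 50]
sales = [560, 480, 350, 320, 230]

r, cross_products = pearson_r(temps, sales)

print([round(cp, 1) for cp in cross_products])   # the five cross products
print(round(sum(cross_products), 2))             # 4.93
print(round(r, 3))                               # 0.987
```

The unrounded sum of cross products is about 4.93 rather than 5.0, so the correlation is about 0.987: very strong and positive, but not quite perfect.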

• In summary:

• The Pearson Product Moment Correlation can range from -1 through 0 to +1.

-1        0        +1

• A correlation of 0 indicates no linear association between the variables of interest.

• The direction (positive or negative) simply indicates a positive or negative (inverse) relationship between the variables.

• If POSITIVE, when values increase on one variable, they tend to increase on the other variable.

Variable 1    Variable 2
10            5
9             4
8             3
7             2

Pearson Correlation = +1.0

• If NEGATIVE, when values increase on one variable, they tend to decrease on the other variable.

Variable 1    Variable 2
10            2
9             3
8             4
7             5

Pearson Correlation = -1.0
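A quick way to see the effect of direction is to compute the correlation for the same Variable 1 paired with a Variable 2 that moves with it and one that moves against it. The sketch below is illustrative only: `pearson_r` is a made-up helper using the average cross product of z-scores with a population standard deviation, and the reversed ordering of Variable 2 (2, 3, 4, 5) is an assumption chosen here to represent the negative case.

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """Average cross product of z-scores (population SD assumed)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / len(xs)

variable_1 = [10, 9, 8, 7]
moves_with = [5, 4, 3, 2]      # falls as Variable 1 falls
moves_against = [2, 3, 4, 5]   # rises as Variable 1 falls (assumed ordering)

r_positive = pearson_r(variable_1, moves_with)
r_negative = pearson_r(variable_1, moves_against)

print(round(r_positive, 2))   # 1.0
print(round(r_negative, 2))   # -1.0
```

Because both variables change in equal steps here, the relationships are exactly linear, which is why the results are exactly +1.0 and -1.0 apart from floating-point noise.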

• The strength of the relationship depends on the size of the decimal value, regardless of its sign: the closer the value is to -1 or +1, the stronger the relationship.

-1        0        +1

For example, 0.2 (or -0.2) is weak and 0.8 (or -0.8) is strong.

• There is a tendency to interpret the Pearson Product Moment Correlation with causal language, as though changes in one variable cause changes in the other.

• Whether to interpret the Pearson Product Moment Correlation as prediction or causation depends on the nature of the research design rather than the nature of the statistic.

• First, analyze the nature of the research design before interpreting the Pearson Product Moment Correlation with causal or prediction language.

• So, if your research question is focused on the relationship between two continuous variables, the Pearson Product Moment Correlation would be the appropriate statistical method to use.
