Correlation

Upload: hakhanh

Post on 07-Mar-2018


01:830:200 Spring 2015

Correlation

Reminder: Student Instructional Rating Surveys

You have until May 7th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs

The survey should be available on any device with a full-featured web browser. Please take the time to fill it out. Your answers:

• Will be anonymous

• Will help me to improve my teaching strategies and the structure of the course

• Will help the department in planning and designing future courses

• Will be used by the university in promotion, tenure, and reappointment decisions


Correlation: Relationships between Variables

• So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

• However, researchers are often interested in graded relationships between variables, such as how well one variable can predict another

• Examples:

– How well do SAT scores predict a student’s GPA?

– How is the amount of time a student takes to complete an exam related to her grade on that exam?

– How well do IQ scores correlate with income?

– How does a child’s height correlate with his running speed?

– How does class size affect student performance?


Correlation: Relationships between Variables

• Correlation is a statistical technique used to describe the relationship between two variables.

• Usually the two variables are simply observed as they exist in the environment, with no experimental manipulation (a correlational study).

• However, results from experimental studies (in which one of the variables is systematically manipulated) can also be analyzed using correlation.


Mean Comparison Approach

Height  Weight
70      150
67      140
72      180
75      190
68      145
69      150
71.5    164
71      140
72      142
69      136
67      123
68      155
66      140
72      145
73.5    160
73      190
69      155
73      165
72      150

Weights
Short  Tall
140    164
140    180
123    142
145    145
155    150
150    190
136    165
155    160
150    190
140
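The mean comparison approach above can be sketched in a few lines of Python. This is only an illustration: the slide does not state how the Short and Tall groups were formed, so the cutoff of 71 used here is an assumption that happens to reproduce the two weight columns. The variable names are likewise made up for this sketch.

```python
from statistics import mean

# (height, weight) pairs copied from the Height/Weight table
data = [
    (70, 150), (67, 140), (72, 180), (75, 190), (68, 145),
    (69, 150), (71.5, 164), (71, 140), (72, 142), (69, 136),
    (67, 123), (68, 155), (66, 140), (72, 145), (73.5, 160),
    (73, 190), (69, 155), (73, 165), (72, 150),
]

# ASSUMPTION: the slide gives no cutoff; heights up to and including 71
# count as "Short" because that split reproduces the slide's two columns.
CUTOFF = 71

short = [weight for height, weight in data if height <= CUTOFF]
tall = [weight for height, weight in data if height > CUTOFF]

print(len(short), len(tall))                         # 10 9
print(round(mean(short), 1), round(mean(tall), 1))   # 143.4 165.1
```

Splitting at a cutoff reduces each height to a group label, so the comparison of means cannot say how weight changes with each additional unit of height. That graded relationship is what correlation is meant to capture.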


Correlation: Scatter Plots

[Figure: scatter plot of the Height and Weight data from the preceding table]


Characteristics of the Correlation

A correlation coefficient is a single number describing the relationship between two variables. This number describes:

• The direction of the relationship

– Variables sharing a positive correlation tend to change in the same direction (e.g., height and weight). As the value of one of the variables (height) increases, the value of the other variable (weight) also increases.

– Variables sharing a negative correlation tend to change in opposite directions (e.g., snowfall and beach visitors). As the value of one of the variables (amount of snowfall) increases, the value of the other variable (number of beach visitors) decreases.

• The strength of the relationship

– Variables that share a strong correlation (close to +1 or -1) strongly predict one another, while variables that share a weak correlation (near 0) do not.


Positive versus Negative Correlations

[Figure: two scatter plots, one labeled Positive Correlation and one labeled Negative Correlation]


Strong versus Weak Correlations


Correlation is not Causation


Possible Sources of Correlation

• The relationship is causal.

– Manipulating the predictor variable causes an increase or decrease in the criterion variable.

    • E.g., leg strength and sprinting speed

• The causal relationship is backwards (reverse causality).

– Manipulating the criterion variable causes changes in the predictor variable.

• The two variables work together systematically to cause an effect.

• The relationship may be due to one or more confounding variables.

– Changes in both variables reflect the effect of a confounding variable.

    • E.g., intelligence as an explanation for correlated performance on different exams

    • E.g., increasing density in cities increases the number of physicians and the number of crimes


Measuring Correlation: Pearson’s r

• To compute a correlation you need a pair of scores, X and Y, for each individual in the sample.

• The most commonly used measure of correlation is Pearson’s product-moment correlation coefficient, or more simply, Pearson’s r.

• Conceptually, Pearson’s r is a ratio between the degree to which two variables (X and Y) vary together and the degree to which they vary separately.

$$r = \frac{\text{co-variability}(X, Y)}{\text{variability}(X)\,\text{variability}(Y)}$$


The Covariance

• The term in the numerator of Pearson’s r is the covariance, an unnormalized statistic representing the degree to which two variables (X and Y) vary together.

• Mathematically, it is the average of the product of the deviations of two paired variables

• The covariance depends both on how consistently X and Y tend to vary together and on the individual variability of the variables (X and Y).

$$\mathrm{cov}_{XY} = \frac{\sum (X - M_X)(Y - M_Y)}{n - 1}$$
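As a rough illustration of the covariance formula, the sketch below computes the sum of deviation products divided by n - 1. The function name `covariance` and the paired scores are hypothetical, chosen only to show that rescaling one variable rescales the covariance even though the two variables still vary together just as consistently.

```python
def covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    n = len(xs)
    m_x = sum(xs) / n   # M_X
    m_y = sum(ys) / n   # M_Y
    return sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical paired scores (not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]          # y = 2x, so X and Y vary together perfectly

print(covariance(x, y))                      # 5.0
print(covariance(x, [10 * v for v in y]))    # 50.0, same relationship, larger spread in Y
print(covariance(x, x))                      # 2.5, the covariance of X with itself is its variance
```

Because the value changes with the spread of the individual variables, the covariance on its own is hard to interpret as a measure of strength.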


The Covariance

Notice that the formula for covariance looks a lot like the formula for variance:

$$\mathrm{cov}_{XY} = \frac{\sum (X - M_X)(Y - M_Y)}{n - 1}$$

$$s_X^2 = \frac{\sum (X - M_X)^2}{n - 1} = \frac{\sum (X - M_X)(X - M_X)}{n - 1}$$


The Covariance

Moreover, they share a similar computational formula:

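$$s_X^2 = \frac{SS_X}{n - 1}; \quad \text{where } SS_X = \sum X^2 - \frac{\left(\sum X\right)^2}{n}$$

$$\mathrm{cov}_{XY} = \frac{SP_{XY}}{n - 1}; \quad \text{where } SP_{XY} = \sum XY - \frac{\sum X \sum Y}{n}$$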
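The slides state the computational formula without deriving it. As a short supplementary sketch, the sum of products SP_XY can be obtained from the definitional form by expanding the product of deviations and substituting M_X = ΣX / n and M_Y = ΣY / n:

```latex
\begin{align*}
SP_{XY} &= \sum (X - M_X)(Y - M_Y) \\
        &= \sum XY - M_Y \sum X - M_X \sum Y + n\, M_X M_Y \\
        &= \sum XY - \frac{\sum X \sum Y}{n} - \frac{\sum X \sum Y}{n} + \frac{\sum X \sum Y}{n} \\
        &= \sum XY - \frac{\sum X \sum Y}{n}
\end{align*}
```

Setting Y = X in the same steps gives the computational formula for SS_X.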


Computing Pearson’s r

• Pearson’s r is computed by dividing the covariance by the product of the standard deviations of each of the variables

– This removes the effect of the variability of the individual variables

$$r = \frac{\mathrm{cov}_{XY}}{s_X s_Y} = \frac{SP_{XY}}{\sqrt{SS_X\, SS_Y}}$$


Computing Pearson’s r: Example

X Y

0 2

10 6

4 2

8 4

8 6


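Computing Pearson’s r: Example

X    Y    XY
0    2    0
10   6    60
4    2    8
8    4    32
8    6    48

ΣX = 30, ΣY = 20, ΣXY = 148, ΣX² = 244, ΣY² = 96

Compute SSX, SSY, & SPXY:

$$SS_X = \sum X^2 - \frac{\left(\sum X\right)^2}{N} = 244 - \frac{30^2}{5} = 244 - 180 = 64$$

$$SS_Y = \sum Y^2 - \frac{\left(\sum Y\right)^2}{N} = 96 - \frac{20^2}{5} = 96 - 80 = 16$$

$$SP_{XY} = \sum XY - \frac{\sum X \sum Y}{N} = 148 - 120 = 28$$

Compute r:

$$r = \frac{SP_{XY}}{\sqrt{SS_X\, SS_Y}} = \frac{28}{\sqrt{64 \cdot 16}} = \frac{28}{32} = 0.875$$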
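The worked example can be checked with a short Python sketch that follows the computational formulas step by step. The variable names are illustrative and not part of the lecture.

```python
import math

# The five (X, Y) pairs from the example
x = [0, 10, 4, 8, 8]
y = [2, 6, 2, 4, 6]
n = len(x)

sum_x = sum(x)                                  # 30
sum_y = sum(y)                                  # 20
sum_xy = sum(a * b for a, b in zip(x, y))       # 148
sum_x2 = sum(a * a for a in x)                  # 244
sum_y2 = sum(b * b for b in y)                  # 96

ss_x = sum_x2 - sum_x ** 2 / n                  # sum of squares for X
ss_y = sum_y2 - sum_y ** 2 / n                  # sum of squares for Y
sp_xy = sum_xy - sum_x * sum_y / n              # sum of products

r = sp_xy / math.sqrt(ss_x * ss_y)

print(ss_x, ss_y, sp_xy)   # 64.0 16.0 28.0
print(r)                   # 0.875
```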


Computing Pearson’s r: Example

Hypothesis testing for r:

The null hypothesis is that the population correlation coefficient ρ = 0

The alternative hypothesis is that ρ ≠ 0


Critical values for Pearson’s r

           Level of Significance for One-Tailed Test
           0.05    0.025   0.01    0.005   0.0005
           Level of Significance for Two-Tailed Test
df = n-2   0.1     0.05    0.02    0.01    0.001
1          0.988   0.997   1.000   1.000   1.000
2          0.900   0.950   0.980   0.990   0.999
3          0.805   0.878   0.934   0.959   0.991
4          0.729   0.811   0.882   0.917   0.974
5          0.669   0.754   0.833   0.875   0.951
6          0.621   0.707   0.789   0.834   0.925
7          0.582   0.666   0.750   0.798   0.898
8          0.549   0.632   0.715   0.765   0.872
9          0.521   0.602   0.685   0.735   0.847
10         0.497   0.576   0.658   0.708   0.823
11         0.476   0.553   0.634   0.684   0.801
12         0.458   0.532   0.612   0.661   0.780
13         0.441   0.514   0.592   0.641   0.760
14         0.426   0.497   0.574   0.623   0.742
15         0.412   0.482   0.558   0.606   0.725
16         0.400   0.468   0.543   0.590   0.708
17         0.389   0.456   0.529   0.575   0.693
18         0.378   0.444   0.516   0.561   0.679
19         0.369   0.433   0.503   0.549   0.665
20         0.360   0.423   0.492   0.537   0.652
21         0.352   0.413   0.482   0.526   0.640
22         0.344   0.404   0.472   0.515   0.629
23         0.337   0.396   0.462   0.505   0.618
24         0.330   0.388   0.453   0.496   0.607
25         0.323   0.381   0.445   0.487   0.597
26         0.317   0.374   0.437   0.479   0.588
27         0.311   0.367   0.430   0.471   0.579
28         0.306   0.361   0.423   0.463   0.570
29         0.301   0.355   0.416   0.456   0.562
30         0.296   0.349   0.409   0.449   0.554
40         0.257   0.304   0.358   0.393   0.490
50         0.231   0.273   0.322   0.354   0.443
100        0.164   0.195   0.230   0.254   0.321
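One way to use the table of critical values is sketched below. The dictionary holds only a small excerpt (two-tailed, α = 0.05, df = 1 to 5), and the function name `is_significant` is hypothetical. The decision rule assumed here is to reject the null hypothesis when |r| is at least as large as the tabled value.

```python
# Two-tailed critical values of Pearson's r at alpha = 0.05,
# copied from the table for df = 1..5 only (a partial excerpt).
R_CRIT_05_TWO_TAILED = {1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 5: 0.754}


def is_significant(r, n, table=R_CRIT_05_TWO_TAILED):
    """Return True when |r| reaches the critical value for df = n - 2.

    ASSUMPTION: H0 (rho = 0) is rejected when |r| >= r_crit.
    """
    df = n - 2
    if df not in table:
        raise KeyError(f"no critical value stored for df = {df}")
    return abs(r) >= table[df]


# The worked example: r = 0.875 with n = 5 pairs, so df = 3
print(is_significant(0.875, 5))   # False, because 0.875 is just below 0.878
```

Under these assumptions the example correlation of 0.875 falls just short of significance at the 0.05 level, which reflects how little evidence five pairs of scores provide.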


Computing Pearson’s r: Example

Hypothesis testing for r:

The null hypothesis is that the population correlation coefficient ρ = 0

The alternative hypothesis is that ρ ≠ 0

Note that you can also compute r_crit using the t distribution table:

$$r_{crit}(df) = \frac{t_{crit}(df)}{\sqrt{df + t_{crit}(df)^2}}; \quad df = N - 2$$


t-Distribution Table

      Level of significance for one-tailed test
      0.25    0.2     0.15    0.1     0.05    0.025    0.01     0.005    0.0005
      Level of significance for two-tailed test
df    0.5     0.4     0.3     0.2     0.1     0.05     0.02     0.01     0.001
1     1.000   1.376   1.963   3.078   6.314   12.706   31.821   63.657   636.619
2     0.816   1.061   1.386   1.886   2.920   4.303    6.965    9.925    31.599
3     0.765   0.978   1.250   1.638   2.353   3.182    4.541    5.841    12.924
4     0.741   0.941   1.190   1.533   2.132   2.776    3.747    4.604    8.610
5     0.727   0.920   1.156   1.476   2.015   2.571    3.365    4.032    6.869
6     0.718   0.906   1.134   1.440   1.943   2.447    3.143    3.707    5.959
7     0.711   0.896   1.119   1.415   1.895   2.365    2.998    3.499    5.408
8     0.706   0.889   1.108   1.397   1.860   2.306    2.896    3.355    5.041
9     0.703   0.883   1.100   1.383   1.833   2.262    2.821    3.250    4.781
10    0.700   0.879   1.093   1.372   1.812   2.228    2.764    3.169    4.587
11    0.697   0.876   1.088   1.363   1.796   2.201    2.718    3.106    4.437
12    0.695   0.873   1.083   1.356   1.782   2.179    2.681    3.055    4.318
13    0.694   0.870   1.079   1.350   1.771   2.160    2.650    3.012    4.221
14    0.692   0.868   1.076   1.345   1.761   2.145    2.624    2.977    4.140
15    0.691   0.866   1.074   1.341   1.753   2.131    2.602    2.947    4.073
16    0.690   0.865   1.071   1.337   1.746   2.120    2.583    2.921    4.015
17    0.689   0.863   1.069   1.333   1.740   2.110    2.567    2.898    3.965
18    0.688   0.862   1.067   1.330   1.734   2.101    2.552    2.878    3.922
19    0.688   0.861   1.066   1.328   1.729   2.093    2.539    2.861    3.883
20    0.687   0.860   1.064   1.325   1.725   2.086    2.528    2.845    3.850
21    0.686   0.859   1.063   1.323   1.721   2.080    2.518    2.831    3.819
22    0.686   0.858   1.061   1.321   1.717   2.074    2.508    2.819    3.792
23    0.685   0.858   1.060   1.319   1.714   2.069    2.500    2.807    3.768
24    0.685   0.857   1.059   1.318   1.711   2.064    2.492    2.797    3.745
25    0.684   0.856   1.058   1.316   1.708   2.060    2.485    2.787    3.725
26    0.684   0.856   1.058   1.315   1.706   2.056    2.479    2.779    3.707
27    0.684   0.855   1.057   1.314   1.703   2.052    2.473    2.771    3.690
28    0.683   0.855   1.056   1.313   1.701   2.048    2.467    2.763    3.674
29    0.683   0.854   1.055   1.311   1.699   2.045    2.462    2.756    3.659
30    0.683   0.854   1.055   1.310   1.697   2.042    2.457    2.750    3.646
40    0.681   0.851   1.050   1.303   1.684   2.021    2.423    2.704    3.551
50    0.679   0.849   1.047   1.299   1.676   2.009    2.403    2.678    3.496
100   0.677   0.845   1.042   1.290   1.660   1.984    2.364    2.626    3.390

[Figure: rejection regions of the t distribution. One-tailed test: area α beyond t. Two-tailed test: area α/2 beyond -t and area α/2 beyond t.]


Computing Pearson’s r: Example

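$$r_{crit}(df) = \frac{t_{crit}(df)}{\sqrt{df + t_{crit}(df)^2}}; \quad df = N - 2 = 5 - 2 = 3$$

$$t_{crit}(3) = 3.182$$

$$r_{crit} = \frac{t_{crit}}{\sqrt{df + t_{crit}^2}} = \frac{3.182}{\sqrt{3 + 3.182^2}} = \frac{3.182}{3.623} = 0.878$$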
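The conversion from a critical t value to a critical r value is easy to script. The sketch below uses a hypothetical helper name, `r_crit_from_t`, and takes the t values by hand from the two-tailed α = 0.05 column of the t-distribution table rather than from any statistics library.

```python
import math


def r_crit_from_t(t_crit, df):
    """Convert a critical t value to the critical r for the same df."""
    return t_crit / math.sqrt(df + t_crit ** 2)


n = 5
df = n - 2                    # 3 degrees of freedom
t_crit = 3.182                # two-tailed, alpha = 0.05, df = 3 (from the t table)

print(round(r_crit_from_t(t_crit, df), 3))    # 0.878

# A second row as a cross-check: df = 10, t_crit = 2.228
print(round(r_crit_from_t(2.228, 10), 3))     # 0.576
```

Both results agree with the corresponding entries in the table of critical values for Pearson’s r.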


Linear Correlation: Assumptions

1. Linearity

– Assumes that the relationship between the paired scores is best described by a straight line

2. Normality

– Assumes that the marginal score distributions, their joint distribution, and any conditional distributions are normally distributed

3. Homoscedasticity

– Assumes that the variability around the regression line is homogeneous across different score values


Other Correlation Coefficients

• Spearman’s correlation coefficient (r_s) for ranked data

– As the name suggests, Spearman’s correlation is used when the scores for both X and Y consist of (or have been converted to) ordinal ranks

• The point biserial correlation coefficient (r_pb)

– This correlation is used when one of the scores is continuous and the other is dichotomous, taking on one of only two possible values

• The phi correlation coefficient (r_ϕ)

– The phi correlation is used when both scores are dichotomous

All of the above can be computed in the same manner as Pearson’s correlation.
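To make the last point concrete, the sketch below codes each dichotomous variable as 0 or 1 and then applies the ordinary Pearson formula. All of the data and variable names here are hypothetical, invented only for illustration, and the 0/1 coding is an assumption of this sketch rather than something the slides specify.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the computational formulas for SP and SS."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / math.sqrt(ss_x * ss_y)


# Point biserial: one dichotomous variable (coded 0/1), one continuous.
# Hypothetical data: group membership and a test score.
group = [0, 0, 0, 1, 1, 1]
score = [2, 3, 4, 6, 7, 8]
r_pb = pearson_r(group, score)

# Phi: both variables dichotomous (coded 0/1).
# Hypothetical data: whether each person studied and whether each passed.
studied = [1, 1, 0, 0, 0, 1, 1, 0]
passed = [1, 1, 1, 0, 0, 0, 1, 0]
r_phi = pearson_r(studied, passed)

print(round(r_pb, 3))    # 0.926
print(r_phi)             # 0.5
```

Swapping which category is coded 1 flips the sign of the coefficient but not its magnitude, so the direction is only meaningful relative to the coding chosen.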


Converting Data for Spearman’s Correlation

Original Data

Age   Height
10    31.4
11    41
12    47.8
13    52.8
14    55.7
15    58.3
16    60.7
17    62.1
18    62.7
19    63.3
20    64.1
21    64.3
22    64.6
23    64.7
24    64.5
25    64.3

Original data: r = 0.86


Converting Data for Spearman’s Correlation

Original Data    Converted Scores

Age   Height     Age Rank   Height Rank
10    31.4       1          1
11    41         2          2
12    47.8       3          3
13    52.8       4          4
14    55.7       5          5
15    58.3       6          6
16    60.7       7          7
17    62.1       8          8
18    62.7       9          9
19    63.3       10         10
20    64.1       11         11
21    64.3       12         12.5
22    64.6       13         15
23    64.7       14         16
24    64.5       15         14
25    64.3       16         12.5

Original data: r = 0.86

Converted scores (ranks): r = 0.97
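The rank conversion can be reproduced with a small Python sketch. It assumes tied scores share the average of the rank positions they occupy, which is consistent with the two heights of 64.3 both receiving rank 12.5. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank from 1 (smallest); tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1      # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r from the computational formulas for SP and SS."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / math.sqrt(ss_x * ss_y)


age = list(range(10, 26))
height = [31.4, 41, 47.8, 52.8, 55.7, 58.3, 60.7, 62.1,
          62.7, 63.3, 64.1, 64.3, 64.6, 64.7, 64.5, 64.3]

height_ranks = average_ranks(height)
r_s = pearson_r(average_ranks(age), height_ranks)

print(height_ranks[-5:])    # [12.5, 15.0, 16.0, 14.0, 12.5]
print(round(r_s, 2))        # 0.97
```

Spearman’s coefficient here is simply Pearson’s r applied to the ranks, so a relationship that is consistently increasing but not linear still yields a value near 1.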


Converting Data for the Point Biserial Correlation


Converting Data for Phi Correlation