correlation this chapter is on correlation we will look at patterns in data on a scatter graph we...

Upload: polly-watkins

Post on 12-Jan-2016

Page 1: Correlation This Chapter is on Correlation We will look at patterns in data on a scatter graph We will be looking at how to calculate the variance and

Correlation

• This Chapter is on Correlation

• We will look at patterns in data on a scatter graph

• We will be looking at how to calculate the variance and co-variance of variables

• We will see how to numerically measure the strength of correlation between two variables

Correlation

Scatter Graphs

Scatter graphs are a way of representing two sets of data. It is then possible to see whether they are related.

Positive Correlation: As one variable increases, so does the other.

Negative Correlation: As one variable increases, the other decreases.

No Correlation: There seems to be no pattern linking the two variables.

[Figure: three example scatter graphs, labelled Positive, Negative and None]

6A

Correlation

Scatter Graphs

In the study of a city, the population density, in people/hectare, and the distance from the city centre, in km, were investigated by choosing sample areas. The results are as follows:

Area                            A     B     C     D     E     F     G     H     I     J
Distance (km)                   0.6   3.8   2.4   3.0   2.0   1.5   1.8   3.4   4.0   0.9
Pop. Density (people/hectare)   50    22    14    20    33    47    25    8     16    38

Plot a scatter graph and describe the correlation. Interpret what the correlation means.

[Scatter graph: Distance from centre (km) on the horizontal axis, from 0 to 4; Pop. Density (people/hectare) on the vertical axis, from 0 to 50]

The correlation is negative, which means that as we get further from the city centre, the population density decreases.
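
The same trend can be checked roughly without drawing the graph. The following is a minimal Python sketch that lists the sample areas in order of distance and compares the mean density of the five nearest areas with that of the five furthest. Splitting the data into two halves is an illustrative choice made here, not a method from the chapter, and the variable names are hypothetical.

```python
# Distance from the city centre (km) and population density (people/hectare)
# for areas A to J, copied from the results given above.
areas = "ABCDEFGHIJ"
distance = [0.6, 3.8, 2.4, 3.0, 2.0, 1.5, 1.8, 3.4, 4.0, 0.9]
density = [50, 22, 14, 20, 33, 47, 25, 8, 16, 38]

# Order the areas from nearest to furthest.
by_distance = sorted(zip(distance, density, areas))
for d, dens, name in by_distance:
    print(f"{name}: {d} km, {dens} people/hectare")

# Compare the five nearest areas with the five furthest (illustrative split).
half = len(by_distance) // 2
near_mean = sum(dens for _, dens, _ in by_distance[:half]) / half
far_mean = sum(dens for _, dens, _ in by_distance[half:]) / half
print(near_mean, far_mean)
```

The nearer areas have the higher mean density, which agrees with the negative correlation read from the graph.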

Correlation

Variability of Bivariate Data

We learnt in chapter 3 that:

$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n}$$

(Although remember that this formula changed to make it easier to use)

In Correlation:

$$S_{xx} = \sum (x - \bar{x})^2$$ (‘How x varies’)

Similarly for y:

$$S_{yy} = \sum (y - \bar{y})^2$$ (‘How y varies’)

And you can also calculate the Co-variance of both variables:

$$S_{xy} = \sum (x - \bar{x})(y - \bar{y})$$ (‘How x and y vary together’)

$$\text{Co-variance} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n}$$

6B/C

Correlation

Variability of Bivariate Data

Like in chapter 3, we can use a formula which will make calculations easier.

$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n} \qquad S_{xx} = \sum (x - \bar{x})^2$$

BUT:

$$\text{Variance} = \frac{S_{xx}}{n}$$

6B/C

Correlation

Variability of Bivariate Data

$$\text{Variance} = \frac{S_{xx}}{n}$$

Multiply both sides by ‘n’:

$$S_{xx} = n \times \text{Variance}$$

The easier formula for variance from chapter 3 is $\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$:

$$S_{xx} = n\left(\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2\right)$$

For the second fraction, square the top and bottom separately:

$$S_{xx} = n\left(\frac{\sum x^2}{n} - \frac{(\sum x)^2}{n^2}\right)$$

Multiplying both fractions by ‘n’ will cancel a ‘divide by n’ from each of them:

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

6B/C
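
The rearrangement can be checked numerically. This is a minimal Python sketch that computes Sxx both from the definition and from the rearranged form, reusing the distances from the city example purely as convenient numbers. The variable names are hypothetical.

```python
# Distances (km) reused from the city example earlier in the chapter.
x = [0.6, 3.8, 2.4, 3.0, 2.0, 1.5, 1.8, 3.4, 4.0, 0.9]
n = len(x)
mean_x = sum(x) / n

# Definition: sum of squared deviations from the mean.
sxx_definition = sum((xi - mean_x) ** 2 for xi in x)

# Rearranged form: sum of squares minus (square of the sum) divided by n.
sxx_shortcut = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

print(round(sxx_definition, 3), round(sxx_shortcut, 3))
```

The two values agree up to floating-point rounding, so either form can be used. The rearranged one only needs the totals of x and x squared.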

Correlation

Variability of Bivariate Data

These are the formulae for Sxx, Syy and Sxy. You are given these in the formula booklet. You do not need to know how to derive them (like we just did!)

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

$$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$

$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$$

6B/C

Correlation

Variability of Bivariate Data

Calculate Sxx, Syy and Sxy, based on the following information.

$$n = 12 \quad \sum x = 155 \quad \sum y = 198 \quad \sum x^2 = 2031 \quad \sum y^2 = 3904 \quad \sum xy = 2732$$

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 2031 - \frac{(155)^2}{12} = 28.92$$

$$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 3904 - \frac{(198)^2}{12} = 637$$

$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 2732 - \frac{155 \times 198}{12} = 174.5$$
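
The same arithmetic can be sketched in Python, working directly from the summary totals. The helper names `s_xx` and `s_xy` are hypothetical, chosen here for illustration.

```python
def s_xx(sum_x2, sum_x, n):
    """S_xx from the sum of squares, the sum and the sample size.

    The same function gives S_yy when passed the y totals.
    """
    return sum_x2 - sum_x ** 2 / n


def s_xy(sum_xy, sum_x, sum_y, n):
    """S_xy from the sum of products, the two sums and the sample size."""
    return sum_xy - sum_x * sum_y / n


n = 12
sxx = s_xx(2031, 155, n)
syy = s_xx(3904, 198, n)
sxy = s_xy(2732, 155, 198, n)
print(round(sxx, 2), round(syy, 2), round(sxy, 2))  # 28.92 637.0 174.5
```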

6B/C

Correlation

Variability of Bivariate Data

The following table shows babies’ head circumferences (cm) and the gestation period (weeks) for 6 newborn babies. Calculate Sxx, Syy and Sxy.

We need:

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \qquad S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} \qquad S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$$

Baby                    A      B      C      D      E      F
Head size (x)           31     33     30     31     35     30
Gestation period (y)    36     37     38     38     40     40
x²                      961    1089   900    961    1225   900
y²                      1296   1369   1444   1444   1600   1600
xy                      1116   1221   1140   1178   1400   1200

$$n = 6 \quad \sum x = 190 \quad \sum y = 229 \quad \sum x^2 = 6036 \quad \sum y^2 = 8753 \quad \sum xy = 7255$$

6B/C

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 6036 - \frac{(190)^2}{6} = 19.33$$

$$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 8753 - \frac{(229)^2}{6} = 12.83$$

$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 7255 - \frac{190 \times 229}{6} = 3.33$$
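
Starting from the raw measurements instead of the totals, the whole calculation might look like the following Python sketch. The variable names are hypothetical. The data are the six babies' values from the table.

```python
# Head circumference x (cm) and gestation period y (weeks) for babies A to F.
head_size = [31, 33, 30, 31, 35, 30]
gestation = [36, 37, 38, 38, 40, 40]
n = len(head_size)

# The five summary totals used by the formulae.
sum_x = sum(head_size)
sum_y = sum(gestation)
sum_x2 = sum(x * x for x in head_size)
sum_y2 = sum(y * y for y in gestation)
sum_xy = sum(x * y for x, y in zip(head_size, gestation))

# Apply the formula-booklet versions of Sxx, Syy and Sxy.
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 190 229 6036 8753 7255
print(round(sxx, 2), round(syy, 2), round(sxy, 2))  # 19.33 12.83 3.33
```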

6B/C

Correlation

Product Moment Correlation Coefficient

We can test the correlation of data by calculating the Product Moment Correlation Coefficient. This uses Sxx, Syy and Sxy.

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$

The value of this number tells you what the correlation is and how strong it is.

The closer to 1, the stronger the positive correlation. The same applies for -1 and negative correlation. A value close to 0 implies no linear correlation.

[Figure: number line from -1 to 1, with Negative Correlation towards -1, No Linear Correlation at 0 and Positive Correlation towards 1]

6B/C

Correlation

Product Moment Correlation Coefficient

Given the following data, calculate the Product Moment Correlation Coefficient.

$$S_{xx} = 74 \quad S_{yy} = 150 \quad S_{xy} = 102$$

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{102}{\sqrt{74 \times 150}} = 0.97$$

There is positive correlation: as x increases, y does as well.
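
As a minimal Python sketch of the same calculation, the function below takes the three S values and returns r. The name `pmcc` is hypothetical, chosen here for illustration.

```python
import math


def pmcc(sxx, syy, sxy):
    """Product moment correlation coefficient from Sxx, Syy and Sxy."""
    return sxy / math.sqrt(sxx * syy)


r = pmcc(74, 150, 102)
print(round(r, 2))  # 0.97
```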

6B/C

Correlation

Limitations of the Product Moment Correlation Coefficient

Sometimes it may indicate correlation between unrelated variables.

Cars on a particular street have increased, as have the sales of DVDs in town. The PMCC would indicate positive correlation where the two are most likely not linked.

The speed of computers has increased, as has life expectancy amongst people. These are not directly linked, but are both due to scientific developments.

6B/C

Correlation

Using Coding with the PMCC

Calculating the PMCC from this table.

x     102     103     102     103     104     103
y     320     335     345     355     360     380
x²    10404   10609   10404   10609   10816   10609
y²    102400  112225  119025  126025  129600  144400
xy    32640   34505   35190   36565   37440   39140

$$n = 6 \quad \sum x = 617 \quad \sum y = 2095 \quad \sum x^2 = 63451 \quad \sum y^2 = 733675 \quad \sum xy = 215480$$

$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 63451 - \frac{(617)^2}{6} = 2.83$$

$$S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 733675 - \frac{(2095)^2}{6} = 2170.83$$

$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 215480 - \frac{617 \times 2095}{6} = 44.17$$

6D

$$S_{xx} = 2.83 \quad S_{yy} = 2170.83 \quad S_{xy} = 44.17$$

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{44.17}{\sqrt{2.83 \times 2170.83}} = 0.563$$

6D

Correlation

Using Coding with the PMCC

Calculating the PMCC from this table, using coding.

$$p = x - 100 \qquad q = \frac{y - 300}{5}$$

x     102   103   102   103   104   103
y     320   335   345   355   360   380
p     2     3     2     3     4     3
q     4     7     9     11    12    16
p²    4     9     4     9     16    9
q²    16    49    81    121   144   256
pq    8     21    18    33    48    48

$$n = 6 \quad \sum p = 17 \quad \sum q = 59 \quad \sum p^2 = 51 \quad \sum q^2 = 667 \quad \sum pq = 176$$

$$S_{pp} = \sum p^2 - \frac{(\sum p)^2}{n} = 51 - \frac{(17)^2}{6} = 2.83$$

$$S_{qq} = \sum q^2 - \frac{(\sum q)^2}{n} = 667 - \frac{(59)^2}{6} = 86.83$$

$$S_{pq} = \sum pq - \frac{\sum p \sum q}{n} = 176 - \frac{17 \times 59}{6} = 8.83$$

6D

$$S_{pp} = 2.83 \quad S_{qq} = 86.83 \quad S_{pq} = 8.83$$

$$r = \frac{S_{pq}}{\sqrt{S_{pp} S_{qq}}} = \frac{8.83}{\sqrt{2.83 \times 86.83}} = 0.563$$

So coding will not affect the PMCC!

6D
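
The comparison between the raw and coded data can be sketched in Python as follows. The helper `pmcc_from_data` is a hypothetical name introduced here. The coding p = x - 100 and q = (y - 300)/5 is the one used in the example.

```python
import math


def pmcc_from_data(xs, ys):
    """PMCC computed from paired raw values using the Sxx, Syy, Sxy formulae."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return sxy / math.sqrt(sxx * syy)


x = [102, 103, 102, 103, 104, 103]
y = [320, 335, 345, 355, 360, 380]

# Apply the coding from the example.
p = [xi - 100 for xi in x]
q = [(yi - 300) / 5 for yi in y]

r_raw = pmcc_from_data(x, y)
r_coded = pmcc_from_data(p, q)
print(round(r_raw, 3), round(r_coded, 3))  # 0.563 0.563
```

In this example, subtracting a constant leaves each S value unchanged, while dividing y by 5 divides Sxy by 5 and Syy by 25. The factor of 5 therefore cancels in the ratio, which is why r is the same.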

Summary

• We have looked at plotting scatter graphs

• We have looked at calculating measures of variance, Sxx, Syy and Sxy

• We have also seen types of correlation and how to recognise them on a graph

• We have calculated the Product Moment Correlation Coefficient, and interpreted it. It is a numerical measure of correlation.