lt4011 week 22 slides

TESTS OF CORRELATION

Week 22

IN THIS LECTURE

Last week we looked at what correlation was

This week, we learn:

Some different tests of correlation, and when we use them

How to produce tests of correlation in SPSS

The impact of significance on statistical results

DATA ANALYSIS

Univariate analysis (one variable at a time) = most graphs and tables, and descriptive statistics

Bivariate analysis (comparing two variables) = contingency tables (cross-tabs) and the tests of correlation we look at today

Multivariate analysis (comparing more than two variables at a time): not covered in this module!

CORRELATION – A REMINDER

The coefficient will usually be from -1 to +1

The number indicates the strength of the relationship

The closer the number is to -1 or +1, the stronger the relationship; the closer to 0, the weaker the relationship

The sign (positive or negative) indicates the direction of the relationship

Correlations do not prove causality

The type of test you need depends on the type of data contained in the variables you are analysing

WHICH TEST TO USE?

Variable  Nominal                                       Ordinal                                       Scale
Nominal   Contingency table + Chi-Square + Cramer’s V   Contingency table + Chi-Square + Cramer’s V   Contingency table + Chi-Square + Cramer’s V
Ordinal   Contingency table + Chi-Square + Cramer’s V   Spearman’s rho                                Spearman’s rho
Scale     Contingency table + Chi-Square + Cramer’s V   Spearman’s rho                                Pearson’s r
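The table follows one rule: the variable with the lower level of measurement decides the test. A minimal Python sketch of that lookup is given here as an illustration only; the function name `choose_test` and the `LEVELS` ordering are assumptions made for this example and are not part of SPSS.

```python
# Levels of measurement, ordered from least to most informative.
# (Names and ordering are illustrative assumptions, not SPSS terms.)
LEVELS = {"nominal": 0, "ordinal": 1, "scale": 2}


def choose_test(level_a, level_b):
    """Return the test the table suggests for a pair of variables."""
    # The lower of the two levels of measurement decides the test.
    lowest = min(LEVELS[level_a.lower()], LEVELS[level_b.lower()])
    if lowest == LEVELS["nominal"]:
        return "Contingency table + Chi-Square + Cramer's V"
    if lowest == LEVELS["ordinal"]:
        return "Spearman's rho"
    return "Pearson's r"


print(choose_test("scale", "ordinal"))  # Spearman's rho
print(choose_test("scale", "scale"))    # Pearson's r
```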

An example – same as last week

We are interested to know whether the amount of CO2 emissions is associated with gross domestic product (GDP, or growth in the economy) among some countries. In particular, we want to know if the increase in CO2 emissions is explained by growth in the economy (GDP), and vice versa.

Using SPSS to calculate the correlation coefficient

If we were dealing with ordinal data, we would use the Spearman correlation.

The correlation coefficient between CO2 emissions and GDP is 0.721. This is considered a relatively strong positive correlation.
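For anyone who wants to see what sits behind the coefficient, here is a rough Python sketch of Pearson's r and Spearman's rho (which is Pearson's r calculated on ranks). The numbers are made up for illustration and are NOT the CO2/GDP data from the example; the function names are also just assumptions for this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariation of x and y scaled by their spread."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of the data."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical values for illustration only.
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]  # rises with x, but not in a straight line
print("Pearson's r:   ", round(pearson_r(x, y), 3))
print("Spearman's rho:", round(spearman_rho(x, y), 3))
```

Because y always rises when x rises, the rank-based coefficient is a perfect 1, while Pearson's r is slightly lower because the relationship is not a straight line.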

ONE TAIL OR TWO?

One-tailed and two-tailed tests are alternative methods for testing hypotheses

A one-tailed test assumes the result can lie in only one direction

A two-tailed test allows for the possibility that the parameter could deviate in either direction

Unless you have a clear reason to expect one particular direction, you should use a two-tailed test

SO WHAT ABOUT SIGNIFICANCE?

Statistical significance, represented by p, indicates how confident we can be that the findings actually exist in the population (versus the findings occurring by chance)

Generally speaking, the maximum acceptable level at which we would consider something significant is p < 0.05: this means there are fewer than 5 chances in 100 that you have a sample that shows a relationship when there is not one in the population
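One way to make "occurring by chance" concrete is a permutation test: shuffle one variable in every possible order, recalculate the correlation each time, and count how often chance alone gives a result at least as strong as the one observed. The Python sketch below does this for a tiny made-up data set and reports both a one-tailed and a two-tailed p. It is an illustration of the idea only, it is not the calculation SPSS performs, and the data and function names are assumptions.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def permutation_p_values(x, y):
    """Return (observed r, one-tailed p, two-tailed p).

    The one-tailed p assumes a POSITIVE relationship was predicted.
    Every ordering of y is tried, so keep the sample very small.
    """
    observed = pearson_r(x, y)
    tolerance = 1e-12  # guards against floating-point rounding
    one_tail = 0  # orderings with r at least as positive as observed
    two_tail = 0  # orderings with r at least as far from 0, either sign
    total = 0
    for shuffled in permutations(y):
        r = pearson_r(x, shuffled)
        total += 1
        if r >= observed - tolerance:
            one_tail += 1
        if abs(r) >= abs(observed) - tolerance:
            two_tail += 1
    return observed, one_tail / total, two_tail / total


# Hypothetical values for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
r, p_one, p_two = permutation_p_values(x, y)
print("r =", round(r, 3))
print("one-tailed p =", round(p_one, 4))
print("two-tailed p =", round(p_two, 4))
print("significant at 0.05 (two-tailed)?", p_two < 0.05)
```

The two-tailed p is never smaller than the one-tailed p, because it also counts equally strong results in the opposite direction.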

SO WHAT IF THE VARIABLES ARE NOMINAL?

We use a Chi-Square test to compare the frequencies we have to the frequencies we would expect to see if the variables were unrelated (i.e. observed counts versus expected counts)

The higher the Chi-Square value, the greater the difference between these counts

The significance tells you the likelihood that these results have occurred by chance (should be <0.05)

Chi-Square ONLY indicates whether or not the variables are related. It DOES NOT tell you anything about the strength or direction of any correlation
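The observed-versus-expected comparison can be sketched in a few lines of Python. The cross-tab below is invented for illustration, and the function names are assumptions. The sketch also calculates Cramer's V, the measure of strength that is usually reported alongside Chi-Square; it does not calculate the significance, which SPSS reports for you.

```python
from math import sqrt


def expected_counts(observed):
    """Counts we would expect in each cell if the variables were unrelated."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count for a cell = row total * column total / grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Return (Chi-Square value, degrees of freedom) for a cross-tab."""
    expected = expected_counts(observed)
    value = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return value, df


def cramers_v(observed):
    """Cramer's V: strength of association, from 0 (none) to 1 (perfect)."""
    value, _ = chi_square(observed)
    n = sum(sum(row) for row in observed)
    smaller_dimension = min(len(observed), len(observed[0]))
    return sqrt(value / (n * (smaller_dimension - 1)))


# Hypothetical 2 x 2 cross-tab (rows = one variable, columns = the other).
table = [[30, 10],
         [10, 30]]
print("Expected counts:", expected_counts(table))
print("Chi-Square, df: ", chi_square(table))
print("Cramer's V:     ", cramers_v(table))
```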

CHI-SQUARE TEST

Chi-Square should only be produced alongside a cross-tabulation

It assumes that there are no expected counts of zero, and that fewer than 20% of cells have an expected count of less than 5. SPSS reports these checks as notes beneath the output!
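The two assumptions can be checked by hand from the expected counts. The Python sketch below shows one way to do it; the function names, the returned dictionary keys and the example tables are assumptions made for illustration, and SPSS performs the equivalent check automatically.

```python
def expected_counts(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_chi_square_assumptions(observed):
    """Check the two Chi-Square assumptions for a cross-tab of counts."""
    cells = [e for row in expected_counts(observed) for e in row]
    zero_cells = sum(1 for e in cells if e == 0)
    small_cells = sum(1 for e in cells if e < 5)
    share_small = small_cells / len(cells)
    return {
        "cells_with_expected_zero": zero_cells,
        "share_with_expected_below_5": share_small,
        # Both conditions must hold: no zeros, and fewer than 20% below 5.
        "assumptions_met": zero_cells == 0 and share_small < 0.20,
    }


# Hypothetical cross-tabs for illustration only.
print(check_chi_square_assumptions([[30, 10], [10, 30]]))  # large sample
print(check_chi_square_assumptions([[3, 2], [2, 3]]))      # sample too small
```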

CHI-SQUARE TEST

[Annotated SPSS Chi-Square output. The labels on the screenshot read:]

The Chi-Square test is comparing observed counts with expected counts

This is the Chi-Square value: higher means more difference, but the number alone is meaningless without the significance

Degrees of freedom

Significance: should be <0.05

Assumptions checked here
