Basic Statistics for the Utterly Confused

Basic Statistics for the University of Pretoria, Faculty of Economic & Management Science
10, 11, 14 & 15 September 2009
Presented by Sumari O'Neil
[email protected]




Table of Contents

1. Statistics and all that jazz
2. Descriptive Statistics
   2.1 Frequencies
   2.2 Central tendency
   2.3 Statistics for variability
   2.4 Working with percentages
3. Parametric and Non-parametric Statistics
   3.1 Testing the assumption of normality
   3.2 Equality of variances
4. From questionnaire to dataset
5. Screening and cleaning your data
6. Manipulating your data
   6.1 Calculating the total scores of scales or indexes
   6.2 Reversing negatively worded items
   6.3 Collapsing a continuous variable into groups
7. Correlation analysis
   7.1 Statistics to test relations between variables
   7.2 How to interpret the results of the correlations
   7.3 The coefficient of determination (r²)
   7.4 How to write up the results of a correlation analysis in a research report
   7.5 Graphically representing the relationship between variables
   7.6 Other analyses that are grounded in correlation analysis
8. Testing differences between groups (causal relationships)
   8.1 What does "testing for differences between groups" mean?
   8.2 Testing differences between two independent groups: t-test for independent groups
   8.3 The non-parametric alternative for the t-test for independent samples: Mann-Whitney U test
   8.4 Testing differences between two dependent/related samples
   8.4 The non-parametric alternative to the t-test for dependent/related samples: Wilcoxon Signed-Rank Test
   8.5 Testing differences between more than 2 groups on one variable: One-way Analysis of Variance (One-way ANOVA)
   8.6 The non-parametric alternatives for the One-way ANOVA
References


1. Statistics and all that jazz

Statistics is used in quantitative research to analyse and interpret the data collected during the data collection process. (Although very elementary statistics such as frequency counts are sometimes used in qualitative research, most hard-core qualitative researchers would RATHER DIE than use any form of statistics!) In short, it implies that you collected data from the "real world" by means of a questionnaire (most commonly used) and now you want to tell the story of the "real world" by using statistics. Field (2009) explains this by saying that we are actually building statistical "models" of reality. When you look at the model, you would like to be able to say "this is what reality looks like!" Of course, when you build a model you want to use the best material to depict reality as accurately as possible. In terms of statistics, this material refers to, firstly, your data and, secondly, the statistics used.

The data comes first. Here the garbage-in, garbage-out principle applies. Make sure before the study that (1) your data answers the research question, (2) the data comes from a representative sample, and (3) the data meets the parameters of the statistics you want to use. In terms of the latter, every statistic has a set of criteria that must be met for optimal usage. Should your data not meet the criteria for the statistic needed to answer your research question, you will end up with a statistic with very little power and little validity, or you may not be able to compute the statistic at all.

Then, in terms of the statistics, you have to make sure that if your data is good, you choose the best possible statistic to depict the "reality of your research question". There is probably more than one option to consider when selecting a statistic. Make sure you choose the best one to increase the accuracy of your results.

It should be clear by now that although statistics is used for the analysis of the data, it should actually be considered from the start of the research process. Generally, research topics for explorative research (topics not yet explored in great depth) are better answered through qualitative research. By its nature, quantitative research gives more answers in terms of the breadth of a problem, for instance the prevalence of HIV/Aids in South Africa. Qualitative research gives a better depiction of the depth of a problem, e.g. the experience of cancer survivors. After finding a topic, a research question should be stated at some point, and out of the question flows the purpose of the research. Some research questions are better answered by quantitative research. For instance, questions that revolve around determination (such as the prediction of one event by means of another), validity (i.e. the validity of a questionnaire) and causal relationships between variables (e.g. whether gender is the cause of a negative attitude) are all better answered through quantitative methods. When stating the research question, you should already have an idea of what type of analysis you can possibly use. Most statistical analyses have some data requirements, for instance requirements of sample size and level of measurement (i.e. most parametric statistics require data to be on at least an interval scale of measurement).

Fig. 1: The research process

[Flow diagram of the research process:
Find a topic — Has the topic been explored in depth and breadth?
Research Question — What methodology would answer the research question best?
Design: plan for measurement; sampling plan and procedures; data analysis — Does the design fit the criteria for the statistical analysis? Is the sample big enough for the statistical analysis?
Data analysis & interpretation: interpretation of results; conclusion & recommendations]


BOX 1: Different approaches to research

2. Descriptive Statistics

Descriptive statistics tell you what your data looks like. Say, for instance, you used a questionnaire to gather data. Let's say the questionnaire asked biographical questions about the managers who completed it (e.g. age, years' experience, gender), as well as questions with regard to their management style. By doing descriptive statistics you will be able to draw a profile of the managers that took part in your research. You would also be able to get an idea of the management styles they use. The first step of statistical analysis usually involves descriptive statistics. You can use it to describe the sample, to check whether the data is fit for a specific analysis, or to answer a specific descriptive or exploratory research question.


For different types of data, different descriptive statistics are used. In other words, different descriptive statistics are used for data from different levels of measurement. Nominal and ordinal data are henceforth referred to as categorical data/variables, since these two levels of measurement indicate different categorical answers in your data set. Interval and ratio level data, on the other hand, are referred to as scale data/continuous variables, because they indicate respondents' answers on a scale running from 0/1, 2, 3, through to x. A special type of categorical variable is the dichotomous variable. This is a variable that represents only 2 categories; for instance, the variable gender represents male and female. Descriptive statistics include frequencies/frequency counts, statistics of central tendency, and statistics that indicate variability/dispersion.

2.1 Frequencies

Frequencies indicate the number of cases (respondents) that fall into each of the available categories. Frequencies can be displayed in terms of counts or percentages. Frequencies are usually displayed by means of frequency tables, but can also be displayed graphically in graphs and charts. Suitable graphs to display frequencies for categorical data are bar charts or pie charts.

Example of a frequency table:

VOTE FOR CLINTON, BUSH, PEROT

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Bush         661        35.8         35.8              35.8
        Perot        278        15.1         15.1              50.8
        Clinton      908        49.2         49.2             100.0
        Total       1847       100.0        100.0

In this example, I wanted to see the frequency of people who voted for each of the three candidates in the 1992 US presidential election. From the table it is clear that most of the voters (908) voted for Clinton.
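If you work outside SPSS, the same frequency table can be rebuilt in a few lines. The sketch below is only an illustration in Python/pandas: it reconstructs the counts from the table above rather than reading a real data file.

    import pandas as pd

    # Reconstruct one record per respondent, matching the counts in the table above
    votes = pd.Series(["Bush"] * 661 + ["Perot"] * 278 + ["Clinton"] * 908, name="vote")

    counts = votes.value_counts()                        # frequency per category
    percent = votes.value_counts(normalize=True) * 100   # percentage per category

    table = pd.DataFrame({"Frequency": counts, "Percent": percent.round(1)})
    print(table)                                         # Clinton 908 (49.2%), Bush 661 (35.8%), Perot 278 (15.1%)
    print("Total:", counts.sum())                        # 1847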


Example of a bar chart:

[Bar chart of the frequency distribution "VOTE FOR CLINTON, BUSH, PEROT": Bush 661, Perot 278, Clinton 908; y-axis: Frequency, 0 to 1,000]

Here I drew up a bar chart of the frequency distribution in the above-mentioned example.

Example of a pie chart:

[Pie chart of the vote percentages: Clinton 49.16%, Perot 15.05%, Bush 35.79%]

Here is a pie chart displaying the percentages of the frequencies. For a scale variable, one would display a frequency distribution graphically by means of a histogram and not a bar chart or pie chart. In SPSS you also have the option of adding a normal curve to the histogram to get an idea of the normality of the distribution. Another option is to represent the distribution graphically by means of a frequency polygon. Although a frequency polygon is appropriate for ordinal data as well as other scale data, it is not appropriate for nominal data.


2.2 Central tendency

For variables measured on a nominal scale, the statistic for central tendency is the mode. The mode indicates the category with the greatest number of cases. Say, for instance, your question asked people to choose their occupation from a list. If most of the people indicated they were medical doctors, that would be the mode of the dataset for that question. As an example of the mode, look at the following (from Glosser (2004), http://www.mathgoodies.com/lessons/vol8/mode.html):

Example 1: The following is the number of problems that Ms. Matty assigned for homework on 10 different days. What is the mode?

8, 11, 9, 14, 9, 15, 18, 6, 9, 10

Solution: Ordering the data from least to greatest, we get:

6, 8, 9, 9, 9, 10, 11, 14, 15, 18

Answer: The mode is 9.
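The same answer can be checked in one line with Python's standard statistics module; this is just a sketch to verify the worked example above.

    import statistics

    homework = [8, 11, 9, 14, 9, 15, 18, 6, 9, 10]   # Ms. Matty's data from Example 1
    print(statistics.mode(homework))                  # 9: the value that occurs most often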

For ordinal level data the best indicator of central tendency is the median. The median is the exact middle point of the data set; it indicates the value above and below which half of the cases fall.

(From http://www.uwsp.edu/psych/stat/5/CT-Var.htm)

For interval and ratio data, one uses the mean (average score) as the indicator of central tendency. Thus, for categorical data the mode and median have the same function as the mean has for scale data. The mean is not used with interval data if the distribution is skewed (not normal); in that case you use the median.

2.3 Statistics for variability

As mentioned above, another type of measure that can be used to summarise a data set is a measure of dispersion or variability. These measures summarise the size of the differences between each score and every other score. There are three measures of variability:

• Range: The difference between the largest and smallest score. The range takes only the largest and smallest score into account.
• Variance: The extent of the differences among scores. The greater the differences, the more the mean fails to represent the data set. The variance takes every score into account.
• Standard deviation: The dispersion of the scores around the mean, expressed in the same measurement unit as the original scores.

Since categorical variables have a restricted range (it will always be bound to the number of categories), variability is often not used as a description; one can rather look at the minimum and maximum scores in the data set, or the range. For scale data, the standard deviation is used. Take note that if the standard deviation = 0, all the scores are the same. The higher the standard deviation, the higher the variability.
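As a small illustration of the three measures, the sketch below computes them with Python's statistics module on the homework data used earlier; note that variance() and stdev() here are the sample versions (n − 1 denominator), which is an assumption of this sketch rather than something specified in the text.

    import statistics

    scores = [6, 8, 9, 9, 9, 10, 11, 14, 15, 18]      # the same example data as above

    data_range = max(scores) - min(scores)            # range: largest minus smallest score
    sample_var = statistics.variance(scores)          # sample variance
    sample_sd = statistics.stdev(scores)              # standard deviation, in the original units

    print(data_range, round(sample_var, 2), round(sample_sd, 2))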

2.4 Working with percentages

In order to compare frequencies, most researchers work out the percentage of the frequency in each category. Percentages represent the proportion of responses within each category in your dataset, and serve two purposes: 1) they simplify the data by reducing the numbers to a range from 0 – 100, and 2) they translate the data into a standard form for relative comparison. To calculate the percentage, you need to know the number of observations in the category and the total number of observations in the data set. The formula for percentages is:

Percentage = f / N × 100, where f = the number of observations in the category and N = the total number of observations in the data set. N can also be described as the "base", "total" or "universe".
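The formula translates directly into one line of Python; the numbers in the sketch below are invented purely for illustration.

    def percentage(f, n):
        """Percentage = f / N * 100, where f is the category count and N the base."""
        return f / n * 100

    # Invented example: 40 of 160 respondents fall into a category
    print(percentage(40, 160))   # 25.0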

Page 12: Basic Statistics for the Utterly Confused

- 8 -

For example: the number of people living in poverty in Johannesburg is 400 000, and the total number of people living in Johannesburg is 132 000 000. What is the percentage of people living in poverty? Of the total number of poor people, 260 000 are women; what is the percentage of poor women living in Johannesburg? There are some rules when it comes to interpreting percentages:

1. Percentages cannot be averaged unless each is weighted by the size of the group from which it is computed. This is referred to as a weighted average (see the sketch after these rules).

2. When a very small base is used (say, a percentage out of 5 cases), it is easy to over-interpret the percentage. For instance, 60% would seem like a huge difference, while it may represent only 3 out of 5 cases.
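To see why rule 1 matters, the sketch below (with invented group sizes) contrasts a naive average of two percentages with the weighted average based on group size.

    # Invented example: 80% of a group of 10 people vs. 20% of a group of 90 people
    groups = [(80.0, 10), (20.0, 90)]                 # (percentage, group size)

    naive = sum(p for p, _ in groups) / len(groups)   # 50.0: ignores the group sizes
    weighted = sum(p * n for p, n in groups) / sum(n for _, n in groups)   # 26.0

    print(naive, weighted)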

3. Parametric and Non-parametric Statistics

When we need to use inferential statistics, the optimum is to use parametric statistics. To use parametric tests, the data that we use should meet a number of assumptions. If it does not meet the assumptions, the results will be inaccurate. As such, it is extremely important that you test the assumptions of a specific statistic before you continue with the analysis. Specific statistics have specific assumptions; however, they generally include:

• Normally distributed data: It is assumed that the data comes from a normally distributed population. If you remember that inferential statistics is done to show that some or other result is applicable to an entire population, you should also understand that the population's distribution should be normal. This assumption, however, differs depending on the context in which it is used.
• Equal variances / homogeneity of variances: If two or more groups are compared, or used in the research, they should have equal variances (spread of scores).
• Independence: There must be independence of observations, except when the data are paired (paired data refers to data that relate to the same respondents over more than one measurement, as in pre- and post-measurements, or respondents that are in some way related to each other). How do we know whether there is independence of observations? You have to look at the design of the research. Where did the data come from? Was it observations of two entirely different groups, or was it a pre-post measurement of the same group? How do we prove that this assumption was met? Easy! By describing and explaining the research design. There are statistical ways to prove independence of observations; however, they are not used for the type of statistics we will go through in this course.


• Interval data: The variables (specifically the dependent variable) should be on at least an interval level of measurement (or, if categorical, it should have a minimum of 7 categories). This assumption is tested by common sense and not through a statistical analysis.

When the assumptions of parametric tests are not met, we should look at the non-parametric alternative to the parametric test (we will also look at non-parametric alternatives with each type of statistic in the following sections). Although non-parametric statistics also have some assumptions, there are fewer restrictions on the data that can be used. The general assumptions of non-parametric statistics are:

• Independence of observations except when paired

• Few assumptions concerning the population’s distribution

• The scale of measurement of the dependent variable may be categorical or ordinal

• The primary focus is either the rank ordering or the frequencies of the data

• Sample size requirements are less stringent than for parametric tests.

If we look at the assumptions above, it is clear why non-parametric statistics are often referred to as statistics for small samples and as distribution-free tests.

3.1 Testing the assumption of normality

What is a normal distribution? The normal distribution has the following characteristics:

• It is unimodal – it has only one hump, with the mode in the middle of the distribution
• The mean, mode and median are equal
• It is symmetrical (not skewed)
• It is asymptotic (the tails never touch the x-axis)
• It is neither too peaked nor too flat; the kurtosis is equal to 0

[An illustration of the normal distribution]

The statistics to look at when you check for normality of the distribution include:

• Skewness
• Kurtosis
• Kolmogorov-Smirnov (or K-S from now on) (the vodka statistic)
• Shapiro-Wilk test
• Q-Q plots
• Box-and-whiskers plots
• Histogram

Skewness refers to a lack of symmetry. A distribution with a long tail to the right is positively skewed, and vice versa.

How to see the skewness of a distribution with SPSS:

From the menu, choose: Analyse > Descriptive statistics > Descriptives > From the options… box, select skewness.

The output will give you a number, e.g. -5.845. The sign (+/-) in front of the number indicates the direction of the skewness, and the number indicates how skew the distribution is. The higher the number, the more skewed the distribution.

Kurtosis, on the other hand, measures the flatness or peakedness of the distribution. Very peaked distributions have positive kurtosis and very flat distributions have negative kurtosis. A perfect normal distribution has kurtosis = 0. To check the kurtosis, follow the same procedure as for skewness, but select "kurtosis" instead of "skewness". (Both skewness and kurtosis can also be computed by SPSS under the "Frequencies" option of "Analyse".)

To use skewness and kurtosis to judge whether the distribution is normal, you have to convert the given skewness and kurtosis scores to z-scores, using the following formulae: z_skewness = (S − 0) / SE_skewness and z_kurtosis = (K − 0) / SE_kurtosis, where S = skewness, K = kurtosis and SE = the standard error (of the skewness or kurtosis). If the absolute value is smaller than 1.96, the distribution can be regarded as normal. In larger samples this cut-off should be increased to 2.58, and in very large samples to 3.29. When a sample is larger than 200, one should rather look at the shape of the histogram than at significance tests. Significance tests of skewness and kurtosis should not be used with large samples, because they are likely to be significant even when the skew and kurtosis are not very different from normal (Field, 2009, p. 139).

Skewness and kurtosis give us numerical values by which we can judge whether a distribution is normal or not. When you draw up a histogram, you can see graphically whether the distribution is skewed, flat or peaked.

Example of SPSS output:

    N (Valid)                  1013
    N (Missing)                 504
    Skewness                 -3.817
    Std. Error of Skewness     .077
    Kurtosis                 12.594
    Std. Error of Kurtosis     .154

[Histogram of "Counselling for Mental Problems": Mean = 1.94, Std. Dev. = 0.232, N = 1,013, with normal curve]

How to draw a histogram with SPSS:

From the menu, choose: Analyse > Descriptive statistics > Frequencies > From the Charts… options box, select histogram and tick the "With normal curve" box.

Another plot that can be used is the P-P Plot (probability-probability plot). A normal distribution on a P-P Plot should form a straight diagonal line.
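Using the skewness and kurtosis values from the example output above, the conversion to z-scores can be sketched in plain Python as follows (no SPSS involved):

    skewness, se_skew = -3.817, 0.077     # values from the example SPSS output above
    kurtosis, se_kurt = 12.594, 0.154

    z_skew = (skewness - 0) / se_skew     # about -49.6
    z_kurt = (kurtosis - 0) / se_kurt     # about 81.8

    # Both values are far beyond even the 3.29 cut-off for very large samples,
    # so the distribution would be judged as clearly non-normal.
    print(round(z_skew, 1), round(z_kurt, 1))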


Drawing a P-P Plot with SPSS:

From the menu, choose: Analyse > Descriptive statistics > P-P Plots

Box 3: Describing the different groups in your sample: using the split file command

Most of the time there are different subpopulations represented in the sample. In these cases you will most likely want to explore each of the subpopulations. One of the functions in SPSS that can help you do this is the split file function. The split file function allows you to identify a grouping variable (a variable that is used to specify categories of people). When you select the split file function, any subsequent procedure that you do in SPSS will be carried out, in turn, on each category specified by the grouping variable. For this reason it is important to turn off the split file function after you have completed the computations you wanted done in that way. (To switch it off, follow the same path given below and click on the Reset button.)

To select the split file command: from the menu, choose Data > Split File. The split file dialogue box will open. Select "Organise output by groups", then select the grouping variable (e.g. sex) and click OK.

Another way in which normality can be tested is by means of the Kolmogorov-Smirnov (K-S) and the Shapiro-Wilk tests. These tests compare the distribution of the sample with a comparable normal distribution. In both these tests we are actually testing a hypothesis, namely that the distribution of the sample is the same as the distribution of a population with the same mean and standard deviation. Remember that we always test statistically in order to reject or accept the null hypothesis. The null hypothesis in this case is:

There is no difference between the distributions of the sample and the population (thus they are equal).

If this is true (if we accept the null hypothesis), it means that the sample distribution is normally distributed.

The Shapiro-Wilk test is used for small sample sizes (less than 50); otherwise use the K-S test. The limitation of these tests is similar to that of the skewness and kurtosis significance tests: if the sample size is large, they will easily show significant differences (non-normality). For this reason one should always plot the data and use the graphs in combination with any other test used.

Kolmogorov-Smirnov & Shapiro-Wilk in SPSS:

How to do this? It is actually easy with SPSS. From the menu, choose Analyse > Explore… In the dependent list, put all the variables of interest to you (that you want to test). If any of the variables is a grouping variable, you can put it in the factor list; this will split the file so that your computations are done for the different subgroups (e.g. for males and females). If you click on Statistics, select Descriptives and click Continue. If you click on Plots, select "Factor levels together" under Boxplots and "Stem-and-leaf" under Descriptive. Also select the normality plots with tests, click Continue, and then OK.

There is a lot of output, but only some of it is of importance specifically for the K-S and Shapiro-Wilk tests. You may look at the descriptives per variable if you have not drawn them up already. The important part is the tests of normality.

Tests of Normality

                                             Kolmogorov-Smirnov(a)          Shapiro-Wilk
                        Respondent's Sex     Statistic   df    Sig.    Statistic   df    Sig.
To Be Well Liked        Male                   .398      408   .000      .643      408   .000
or Popular              Female                 .444      574   .000      .548      574   .000
To Obey                 Male                   .227      408   .000      .865      408   .000
                        Female                 .266      574   .000      .857      574   .000

a. Lilliefors Significance Correction

Example of normality tests output

How do you interpret these tests? The Statistic column is the actual K-S (or Shapiro-Wilk) statistic and df is the degrees of freedom (which should be the same as the sample size). The value we look at to judge whether to accept or reject the null hypothesis is the Sig. (significance) value. If the Sig. is less than 0.05, there is a significant difference between the population and sample distributions; therefore we reject the null hypothesis and say that the distribution is not normal.

In the case of the table shown above, I would report: The Kolmogorov-Smirnov statistic was significant (p < 0.05) and therefore the distribution is not normal.

You will see that the normality output also includes Q-Q plots, stem-and-leaf plots and even box plots (box-and-whiskers plots).
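If you work in Python rather than SPSS, roughly equivalent tests are available in scipy. The sketch below uses made-up, deliberately skewed data with scipy.stats.shapiro and a K-S test against a normal distribution fitted to the data; note that, unlike the SPSS table above, this K-S version does not apply the Lilliefors correction.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    scores = rng.exponential(scale=2.0, size=200)    # made-up, clearly skewed data

    w, p_shapiro = stats.shapiro(scores)             # Shapiro-Wilk test
    d, p_ks = stats.kstest(scores, "norm", args=(scores.mean(), scores.std(ddof=1)))

    # p < 0.05 means we reject the null hypothesis of normality
    print(f"Shapiro-Wilk: W = {w:.3f}, p = {p_shapiro:.3f}")
    print(f"K-S:          D = {d:.3f}, p = {p_ks:.3f}")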

3.2 Equality of variances

You can see the variances by using the Descriptives and Frequencies commands in SPSS. However, these only give you an indication of the variance of the different groups; you do not know whether the differences you see at face value are statistically significant. There are other statistics that tell us to what extent the variances of different samples differ significantly. The most common of these are Levene's test of homogeneity of variance and Bartlett's test for homogeneity of variance.

The Levene’s test in SPSS Explore:

Go to Analyse>Descriptive statistics>Explore…put the dependent variable in the “Dependent

List. The grouping variable should be in the “Factor list”. Under the “Plots” options select

Histograms with normality plots and “untransformed” under “Spread vs Level with Levene’s

test”.

Test of Homogeneity of Variance (Age)

                                          Levene Statistic   df1   df2       Sig.
Based on Mean                                  .070            1    53       .792
Based on Median                                .033            1    53       .856
Based on Median and with adjusted df           .033            1    52.457   .856
Based on trimmed mean                          .052            1    53       .820

Read the statistic based on the mean. If the significance is smaller than 0.05, it indicates that the variances are not equal. A significance larger than 0.05 indicates that the variances are equal.

To report the results of the Levene's test:

Levene's test is denoted by the letter F. The F value as well as the degrees of freedom (df) should be mentioned in the report. The general form of reporting is: F(df1; df2) = value, sig. E.g. F(1; 53) = 0.070, p = 0.792.

4. From questionnaire to dataset

The data collected during the research needs to be coded and entered into SPSS to create a dataset with which you can work. For the purposes of using statistical programmes, you have to define and label the variables you measured during data collection. For instance, if I measured level of statistical knowledge, the variable name may be STATKNOW, and the levels of that variable were measured on a 5-point scale where 1 was no knowledge, 2 was some knowledge, 3 was average knowledge, 4 was above expected and 5 was exceeding knowledge. The levels are thus the codes that I will use to indicate the levels of statistical knowledge. When you are measuring a lot of variables it is very easy to become confused with codes and labels. For this reason, researchers create codebooks. The codebook lists all the variables included for the statistics, as well as their labels and the codes ascribed to each answer category. For instance, if I measured gender in a questionnaire (in other words, a question asking each respondent's gender), "Gender" will be the variable. In the SPSS data file I will refer to gender as "SEX", and the codes that identify each respondent's gender are 1 or 2, where 1 indicates "Female" and 2 indicates "Male". In my codebook I will illustrate this as:

Variable     SPSS Variable Name     Coding Instruction
Gender       SEX                    1 = Female; 2 = Male

The codebook can be created as soon as your data collection tool is finalised, provided it contains only closed answer categories. If you use a qualitative data collection tool, such as an open-ended questionnaire, you will have to wait until after you have collected your data. Variable names should:

• Be unique
• Begin with a letter (not a number)
• Not include full stops, blanks or other special characters
• Not include words used as commands by SPSS (all, ne, eq, to, lt, by, or, gt, and, not, ge, with)
• Not exceed 64 characters

The responses must all be coded with numbers; otherwise you will not be able to do any statistics with them. Even open-ended questions should be transformed into numerical codes to use them in SPSS. Before you can analyse data with a statistics programme like SPSS, you will need to create some form of dataset for it to work on: the data you collected must be read into the chosen programme. For this course we are using SPSS (Statistical Package for the Social Sciences), but you may decide to use MS Excel or SAS (Statistical Analysis System) for the data analysis, in which case you would have to read the data into that programme.

Since you will be working in SPSS, you will need to open or create an SPSS data set. When you are working with raw data (the answers of the respondents are on the questionnaires only), you need to create a template and enter the data into the SPSS spreadsheet. If your data is in electronic form, it can be opened in SPSS. (Note that data should be in an Excel spreadsheet or a text file to be opened with SPSS.)

5. Screening and cleaning your data

Sally did research on managers' stress levels and blood pressure. She collected the data on stress using the General Stress Inventory, and a registered nurse took the blood pressure readings. As soon as Sally had all the data she read it into SPSS and started the analysis. To her amazement she found inconsistent results. Luckily for her, she went back and checked her data before she started writing the report. It turned out that Sally had made a lot of mistakes while capturing the data, and that caused the inconsistent results!

As with Sally, it often happens that mistakes are made when capturing data. When the dataset is faulty, it can lead to wrong conclusions and therefore invalid and unreliable research! For this reason, the first step after capturing data is to screen and clean the dataset. To screen data means that you explore the dataset for any errors, find the errors and correct them. To identify errors, you have to know what the correct data should look like, right?


This is easy: you know what the data should look like, since a codebook is available that shows you what the range of the data should be for each variable. For instance, if you measured the variable home language in South Africa with a closed-ended question with 11 answer options (one for each language), you know that for the variable language the values should range from 1 to 11. Anything outside this range is a mistake. See, easy! Now, how do you screen for errors in SPSS? Basically, you want SPSS to describe the data. And what if you find that a variable is not in the range you expected: how will you know which of the cases is wrong? You can either search the variable or do more detailed descriptive statistics. Your choice! As soon as you have identified the error, you can replace it with the correct value by going back to the raw data (the questionnaires). If you do not know what the correct value is, you need to delete the value and replace it with a missing value (or just keep the cell empty).

6. Manipulating your data

With SPSS one can add up scores, for instance adding the scores on the individual items of a questionnaire to get a scale score. Continuous scores may need to be collapsed into categories to create a categorical variable, or, if too few responses in a specific category are present, the number of categories on a questionnaire item can be reduced. Skewed distributions can also be transformed if needed.

6.1 Calculating the total scores of scales or indexes

In some questionnaires, a number of questions (items) measure a specific construct. In other words, you will not look at the single items alone. If this is the case, we add the responses on these items to obtain a total for each person. We may also use a scale, in which case we want to add the responses on all the items together to obtain a scale score. To do this in SPSS, go to Transform > Compute Variable.

6.2 Reversing negatively worded items

In some scales, the wording of particular items has been reversed to help prevent response bias. Using the "Transform" function in SPSS, such an item can be recoded so that it is scored in the positive direction.


6.3 Collapsing a continuous variable into groups

Sometimes you will need to divide your sample according to scores to create groups. For instance, in terms of income you may want to create categories of low income, middle income and high income when the question on the questionnaire asked respondents to write in their income. In writing in the answer, you have a continuous variable of income, but say you want to compare the three different income groups on, for instance, the variable hope. In such cases you transform the continuous variable into a categorical variable.

You may ask: why not use categories from the beginning? Using an interval or ratio level of measurement gives you much more detail to work with. If you ask age in categories, every person in your sample just falls into a category, but if you ask the specific age, you have much more detail on your sample's age. It also gives you a wider variety of analyses to work with, since, if needed, you can always collapse the continuous variable into a categorical one. To do this in SPSS, go to Transform > Recode into Different Variables.
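The pandas equivalent of recoding a continuous variable into groups is pd.cut; the income values and cut-off points below are invented for illustration.

    import pandas as pd

    income = pd.Series([4500, 12000, 25000, 60000, 8000, 32000])   # hypothetical monthly income

    income_group = pd.cut(
        income,
        bins=[0, 10000, 30000, float("inf")],          # invented cut-off points
        labels=["low", "middle", "high"],
    )
    print(income_group.value_counts())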

7. Correlation analysis

When we talk about relationships between variables, we imply that the variables influence each other. Take note: influence does not imply a causal relationship! If ice cream sales in Bloemfontein are very high this month, and the number of drownings is also very high, there will be a correlation or relationship between ice cream sales and drownings. Does this mean that ice cream sales cause drowning? Or does it maybe mean that drownings cause ice cream sales? Of course not! There is no logical or theoretical link between these two events. So a relationship implies that, at a given time, in a given context, the rate or frequency of occurrence of the two variables (in this case ice cream sales and drownings) increases together.

Relationships between variables are also referred to as associations between variables. The nature of a relationship/association implies its strength and its direction. The strength of a relationship is indicated by a correlation coefficient (the symbol r is used to indicate the correlation coefficient in statistics output). The correlation coefficient is a number between 0 and 1 (ignoring its sign) that indicates how strong the relationship between the variables is. A coefficient of 0 indicates no relationship and 1 indicates a perfect relationship.


The direction indicates whether the relationship is positive or negative. A positive relationship implies that if the values of the one variable increase, the values of the other also increase; or, if the values of the one decrease, the values of the other also decrease. THUS, a positive relationship means that the variables co-vary in the same direction. A positive relationship is also referred to as a direct relationship.

A negative relationship means that if the scores on one variable increase, the scores on the other variable decrease. THUS, a negative relationship means that the variables co-vary in different directions. A negative correlation is also referred to as an indirect relationship.

Positive and negative correlations refer to linear relationships – in other words, both of them are fitted on a straight diagonal line (see the scatter plot examples below: you will see that a positive and a negative correlation both fit on a straight diagonal line).

In statistics, a correlation analysis is used to test the nature of the relationship between variables. Therefore, relationships are also referred to as correlations – positive and negative correlations.

As an example, if I want to know whether students with higher-order thinking skills understand statistics better, I will do a correlation analysis. That is, I will ask: Is there a positive relationship between higher-order thinking skills and students' understanding of statistics?

For relationship questions I conduct a correlation analysis. If the analysis is significant, it tells me that the better the higher-order thinking skills, the better students understand statistics. It does, however, not tell me that higher-order thinking causes statistics understanding! There is a difference.

Questions about relationships between variables usually belong to descriptive research. In other words, the aim of the research when you are using correlations is to describe the relationships that exist between a and b.

7.1 Statistics to test relations between variables

Different statistics are used to test the relationship between variables. They are all referred to as types of correlation analysis, but are used for different types of data. They include:

• Pearson / product-moment correlation;
• Spearman's rank-order correlation;
• Point-biserial correlation;
• Phi coefficient, and so forth.

7.1.1 The Pearson / product-moment correlation

A Pearson correlation coefficient is used when you are working with continuous data, in other words data on the interval or ratio level of measurement. The Pearson correlation is also a parametric test or parametric statistic. In short, in statistics we have two legs or two kinds of statistics: those that are parametric and those that are non-parametric. Parametric indicates that there are certain assumptions or parameters (borders) that the data should adhere to in order for it to qualify for parametric statistics. Should the data not adhere to the parameters or assumptions, the equivalent but NON-parametric alternative should be used.

The Pearson correlation coefficient is a parametric statistic. To use the Pearson product-moment correlation your data should adhere to the following assumptions or parameters:

• Data must be on at least interval level
• A linear relationship must exist (this can be checked by means of a scatter plot)
• The distributions must be similar (thus, if they are skewed, they must be skewed in the same direction), but preferably normal
• Outliers must be identified and omitted from the computation (please note: if you delete an outlier, delete only the cell with the outlier value)


How do I know if there are outliers? To see whether there are any outliers, we draw up a box-and-whiskers plot and a stem-and-leaf plot. Both of these can be drawn up under Analyse > Descriptive Statistics > Explore: under Statistics select Outliers, and under Plots select Stem-and-leaf. To read stem-and-leaf plots, use the following link: http://www.cmh.edu/stats/definitions/stem.htm. The box plot gives you a good idea of the outliers and the identity of the outliers: it does not only show you the outlier, but also which case in the data set has that particular value.

Outliers cannot be included in the analysis. There are different ways to deal with outliers:

1. Outliers can be removed.
2. Data can be transformed: outliers skew distributions, and the skewness can be reduced somewhat by transformations of the dataset. (See Field (2009), p. 155, for a short and understandable description of different transformation options.)
3. Change the score: should the transformation fail, the value can be replaced by:
   a. the next highest score in the dataset plus 1;
   b. the mean plus two standard deviations.

[Annotated box plot: the line inside the box is the median; the box is the 1st – 3rd quartile of the distribution (the "bell" part of a normal distribution); the whiskers mark the maximum and minimum values that are not outliers; points beyond the whiskers are outliers or extreme values that do not fit with the rest of the distribution.]


To conduct a Pearson correlation in SPSS, the following steps should be used:

From the menu bar select Analyze > Correlate. The options you can choose from at this stage are Bivariate, Partial and Distance; a bivariate correlation is a correlation between 2 variables. In the dialogue box that follows, select the variables that you want to correlate and select "Pearson" under Correlation Coefficients. Under Test of Significance, "two-tailed" means that no direction of the correlation is specified in the hypothesis; we will mostly work with this one. "One-tailed" is chosen only when you have specified the direction of the effect (relationship), in other words a directional hypothesis. In the bottom left-hand corner you can select "Flag significant correlations"; this tells SPSS to mark the significant correlations on the output.

7.1.2 The Spearman rank-order correlation / Spearman's rho

Spearman's rho is the non-parametric alternative to the Pearson correlation coefficient. It is used when one or both of the variables are measured on an ordinal scale (if only one, the other should be at least on interval scale). Spearman's rho is indicated as r_s. To do this in SPSS, use the same procedure as for the Pearson correlation, but select the Spearman option instead.

7.1.3 Kendall's tau

Kendall's tau is another non-parametric correlation, and it should be used rather than Spearman's coefficient when you have a small data set (50 cases or fewer). It is stricter, and if you compute both tau and rho, you will probably find that tau is a bit lower than rho. To do this in SPSS, use the same procedure as for the Pearson correlation, but select the Kendall's tau option instead.
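In scipy, both coefficients are one call each; the paired scores below are made up, so read this as a sketch of the idea rather than the SPSS procedure.

    from scipy import stats

    x = [1, 2, 3, 4, 5, 6, 7, 8]          # made-up ranked scores on one variable
    y = [2, 1, 4, 3, 6, 5, 8, 7]          # made-up ranked scores on another variable

    rho, p_rho = stats.spearmanr(x, y)
    tau, p_tau = stats.kendalltau(x, y)

    # Kendall's tau is usually a bit lower than Spearman's rho on the same data
    print(f"rho = {rho:.2f} (p = {p_rho:.3f}), tau = {tau:.2f} (p = {p_tau:.3f})")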

7.1.4 The point-biserial correlation

This statistic is computed when you want to see the relationship between a continuous variable and a dichotomous variable. E.g. females and males report the total number of years of education they have had, and we want to know whether there is any correlation between gender and years of education. It is indicated by r_pb. The assumptions that your data must meet to compute a point-biserial correlation are:

• The dichotomous variable has mutually exclusive groups whose values have been coded 1 and 0
• The two groups created by the dichotomous variable are normally distributed
• The two groups created by the dichotomous variable have equal variances
• The continuous variable has equal variances across each level of the dichotomous variable

To compute r_pb you use the normal Pearson correlation procedure.
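scipy also offers a direct point-biserial function, which is equivalent to running a Pearson correlation with a 0/1 coded group variable; the gender coding and years of education below are invented.

    from scipy import stats

    gender = [0, 0, 0, 1, 1, 1, 0, 1, 1, 0]             # hypothetical coding: 0 = male, 1 = female
    years_edu = [12, 14, 11, 15, 16, 13, 12, 17, 14, 13]

    r_pb, p_value = stats.pointbiserialr(gender, years_edu)
    print(f"r_pb = {r_pb:.2f}, p = {p_value:.3f}")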

7.1.5 The phi coefficient

When both variables are dichotomous, the phi coefficient is used (indicated as r_phi). The assumptions that the data must meet to use the phi coefficient are:

• Both variables must be dichotomous
• Observations are independent
• The observations are in the form of frequencies and not scores
• There must be at least 5 counts in each category of each variable

How to test for equality of variances in SPSS?

To test for equality of variances, an easy way is to select from the menu bar Analyze > Compare Means > Independent-Samples T Test. The grouping variable will obviously be the dichotomous variable, and the continuous variable the one for which you want to test differences. Then click OK. This procedure will give you a table in the output that looks like this:

Independent Samples Test

                                                          Levene's Test for          t-test for Equality of Means
                                                          Equality of Variances
                                                          F        Sig.       t       df         Sig.        Mean    Std. Error   95% CI of the Difference
                                                                                                 (2-tailed)   Diff.   Diff.        Lower     Upper
HIGHEST YEAR OF SCHOOL   Equal variances assumed          3.090    .079       1.943   1843       .052        .259    .133         -.002      .521
COMPLETED                Equal variances not assumed                          1.929   1677.298   .054        .259    .134         -.004      .523
RS HIGHEST DEGREE        Equal variances assumed          5.685    .017       1.154   1845       .248        .065    .057         -.046      .177
                         Equal variances not assumed                          1.147   1680.978   .252        .065    .057         -.047      .177

Look under "Levene's Test for Equality of Variances". If the significance value is more than 0.05, it means that the two groups have equal variances.


To compute the phi coefficient with SPSS:

From the menu bar select Analyze > Descriptive Statistics > Crosstabs. Go through the same process as you would for a cross tabulation, but go to the Statistics option, select "Phi and Cramer's V", then Continue and OK.

The output box should give you a table like this:

Symmetric Measures

                              Value    Approx. Sig.
Nominal by      Phi            .208        .136
Nominal         Cramer's V     .208        .136
N of Valid Cases              1847

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

If the significance value is less than 0.05, there is a significant relationship between the two variables. You look at the Phi statistic only.

7.1.6 Cramer's V coefficient

When you want to test the association between two categorical variables that are not dichotomous, you use Cramer's V statistic. Obtain it by following the same steps as above.
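Outside SPSS, both coefficients can be derived from a chi-square on the cross-tabulation. The 2 x 2 counts below are invented; for a 2 x 2 table, Cramer's V equals phi.

    import numpy as np
    from scipy import stats

    # Hypothetical 2 x 2 cross-tabulation (e.g. gender by a yes/no answer)
    observed = np.array([[30, 20],
                         [15, 35]])

    chi2, p, dof, expected = stats.chi2_contingency(observed, correction=False)
    n = observed.sum()
    phi = np.sqrt(chi2 / n)                                      # phi coefficient
    cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))  # Cramer's V (equals phi for 2 x 2)

    print(f"chi2 = {chi2:.2f}, p = {p:.3f}, phi = {phi:.2f}, V = {cramers_v:.2f}")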

7.2 How to interpret the results of the correlations

A correlation coefficient tells you two things: 1) the strength of the relationship between the variables and 2) the direction of that relationship. It does not tell you whether that relationship is statistically significant or not. Here is a rough guide to interpreting correlation coefficients in terms of strength of relationship:

Correlation coefficient (r)     Strength of relationship
0.0 – 0.2                       Very weak, negligible
0.2 – 0.4                       Weak, low
0.4 – 0.7                       Moderate
0.7 – 0.9                       Strong, high, marked
0.9 – 1.0                       Very strong, very high

You have to remember to look at the direction of the correlation as well. You can only interpret the correlation in terms of strength if the correlation is statistically significant.


7.3 The coefficient of determination (r²)

When the correlation coefficient is squared, it gives an indication of the amount of variability in the one variable that is explained by the other. For example, if the correlation coefficient between age and social intelligence is 0.78 (p < 0.05), then r² = 0.6084. This can be interpreted as: the amount of variability in social intelligence that can be explained by means of age is about 61%. This r² is also called the coefficient of determination (see http://www2.chass.ncsu.edu/garson/pa765/correl.htm).

7.4 How to write up the results of a correlation analysis in a research report

Mostly you will write something like: "The results of the chi-square analysis indicated a significant but weak association between group membership and post-intervention fear (chi-square = 0.40, p = 0.03)", depending on which analysis you used. Remember to interpret it in terms of the practical value of the research.

7.5 Graphically representing the relationship between variables

It is probably easiest to see whether a relationship exists by drawing up scatter plots of the different variables that you would like to test. A scatter plot shows how the scores on the variables co-vary (go together). Since a scatter plot gives you such a good picture of what to expect from a correlation coefficient, drawing one up is the first step of a correlation analysis.

What is statistical significance?

A statistical concept indicating that a result is very unlikely to be due to chance and therefore likely represents a true relationship between the variables. Statistical significance is usually indicated by the alpha value (or probability value), which should be smaller than a chosen significance level. For most research studies a significance level of 0.05 or 0.01 is used, indicating that the results have only a 5% or 1% chance of occurring by chance alone. In SPSS, we look at the p-value to tell us whether results are statistically significant or not. If the p-value is smaller than 0.05, we know the results are statistically significant at the 0.05 level.


Examples of scatter plots:

[Three example scatter plots: a positive relationship, a negative relationship, and no relationship]

To draw up a scatter plot in SPSS:

From the menu bar select Graphs > Scatter. A box with the different scatter plot options should appear; we will use the simple scatter plot for now. This type of scatter plot looks at the relationship between two variables. Click on Define, select the variables for the analysis and place them on the x- and y-axes. If there is a grouping variable that defines different categories, you may place it in the "Set markers by" block. Select the Titles option below to give headings to the plot.

The different types of scatter plots that can be drawn up are:

• the simple scatter plot (as indicated above),
• the overlay scatter plot,
• the matrix scatter plot, and
• the 3-D scatter plot.

With the overlay scatter plot option, you can display the covariation between several pairs of variables on the same axes/diagram. The matrix scatter plot does the same, but rather than drawing everything on the same diagram, the plots are drawn up in a matrix. The 3-D scatter plot is used to draw a diagram of the relationship between 3 variables.
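If you want the same kind of simple scatter plot outside SPSS, a minimal matplotlib sketch (with hypothetical x and y values) looks like this:

    import matplotlib.pyplot as plt

    x = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical scores on variable X
    y = [2, 3, 3, 5, 4, 6, 7, 8]          # hypothetical scores on variable Y

    plt.scatter(x, y)                      # one point per case
    plt.xlabel("Variable X")
    plt.ylabel("Variable Y")
    plt.title("Simple scatter plot")
    plt.show()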


7.6 Other analyses that are grounded in correlation analysis

A lot of multivariate statistics is grounded in the logic of correlation analysis. These include factor analysis, cluster analysis, regression analysis and reliability analysis, to name a few commonly used ones. While correlation analysis tests whether a relationship exists and how strong that relationship is, regression analysis assesses the predictive ability of an independent variable on a continuous dependent variable. For instance, if we take high school achievement and university achievement, one can use a regression analysis to determine the extent to which high school achievement can be used to predict achievement at university. While simple regression assesses the functional relationship between one dependent (criterion/outcome) variable and one independent (predictor) variable, multiple regression is used when you want to test the predictive value of a number of predictors for a single criterion (outcome measure), where the criterion should be a scale variable (a continuous variable on at least interval level of measurement). When the criterion is not on interval level of measurement, logistic regression should be used.

For more information on correlations and regression, see:

o http://bmj.bmjjournals.com/collections/statsbk/11.shtml
o Correlation and regression analysis for curve fitting at http://helios.bto.ed.ac.uk/bto/statistics/tress11.html
o Sykes, A.O. (n.d.). An Introduction to Regression Analysis. Retrieved from http://www.law.uchicago.edu/Lawecon/WkngPprs_01-25/20.Sykes.Regression.pdf
o http://www.valuebasedmanagement.net/methods_regression_analysis.html
o http://www.investorwords.com/4136/regression_analysis.html
o http://www.blackwellpublishing.com/specialarticles/jcn_10_462.pdf
o http://www.telecom.csuhayward.edu/~esuess/Links/Software/RegressionExplained/regression_explained.doc
o DAU Stats Refresher at http://www.cne.gmu.edu/modules/dau/stat/dau2_frm.html
o Dallal, G.E. (2004). The Little Handbook of Statistical Analysis at http://www.tufts.edu/~gdallal/LHSP.HTM (select the Regression pages on the menu page)
o http://www2.sjsu.edu/faculty/gerstman/StatPrimer/regression.pdf


Another procedure based on the logic of correlations is factor analysis. With a factor analysis you can determine the underlying structure of a large data set: when you have a large number of variables and want to see how they fall together, the factor analysis will group the variables that belong together, thus indicating the underlying structure (or reduced number of latent variables) present in the data set.

One analysis which is very important when questionnaires are used in research is the reliability analysis. One of the main principles of selecting a data collection instrument is that it should measure what you need it to measure and that it should be a reliable indicator of whatever it is you are measuring. In other words, the validity and reliability of your data collection instrument are important. While a factor analysis can assess the construct validity of an instrument, Cronbach's alpha is one way to assess the reliability of a questionnaire: it tests the internal consistency of the items that are supposed to measure the same thing. All of the reliability analysis options are under Analyse > Scale.
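To make the internal-consistency idea concrete, the sketch below computes Cronbach's alpha directly from its formula in Python with numpy; the item-response matrix is invented for illustration:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses of 6 people to a 4-item Likert scale (1-5)
responses = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
    [3, 2, 3, 3],
])

print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```

By convention, alpha values above roughly 0.7 are usually taken to indicate acceptable internal consistency.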

8. Testing differences between groups (causal relationships)

Sometimes we hypothesise that one variable (the independent variable) may cause a change in another variable (the dependent variable). For instance, we may think that gender influences vocational interest; in other words, males will have certain interests that differ from the interests of females. Thus, to support your hypothesis you have to show that the career interests of males and females differ from each other. In other words, you have to compare groups (in this case males and females).

8.1 What does "testing for differences between groups" mean?

Researchers often want to test the similarities or differences of the properties or characteristics between groups. Take the following example:

Example 1: A researcher wants to know whether there is a difference in the personalities of sales consultants and sales managers. This would give important information for the recruitment of both groups. The research question for this study would be: Is there a difference between the personality profiles of sales consultants and sales managers?


Of course the researcher has to define each of the variables included in the study. They are:

The type of post is the independent variable or grouping variable, with the categories sales consultant and sales manager; the personality profile is the dependent variable. The researcher defines a sales consultant as a person who is responsible for the sales of a specific product of a company and who is directly involved with the prospective buyer. The sales manager is a person who is responsible for the sales of sales consultants within a specific division of an organisation. He is not directly involved with the prospective buyer, but rather with the management of sales consultants.

A personality profile is a profile that defines the personality dimensions important for a specific group.

The above are the conceptualisations, or conceptual definitions, of the variables. However, the researcher needs to measure these concepts and will therefore specify operational definitions (operationalise the variables). For instance, he may define the groupings as follows: for a person to fall into the category of sales consultant, he or she has to have been in a sales consultant post for at least 1 year, and a sales manager has to have been in a post specified as sales manager for at least 1 year. Personality profiles are measured by means of the 16 Personality Factor questionnaire; the researcher will of course specify here what this instrument measures and how. The hypotheses will be set out as follows:

H0 (null hypothesis): There is no difference in the personality profiles of sales consultants and sales managers.

H1 (alternative hypothesis): There is a difference between the personality profiles of sales consultants and sales managers.

The researcher can go so far as to set specific sub-hypotheses for H1. These sub-hypotheses will specify how the personality profiles will differ. For instance, the researcher can say that:


H1(a): Sales consultants will score high on dimensions A, F and Q4, and low on Q2.

H1(b): Sales managers will score high on dimensions C, D and E, and lower on A and F.

If sub-hypotheses are specified, the researcher will have to substantiate from previous research why and how he arrived at them.

8.2 Testing differences between two independent groups: t-test for independent groups

When a researcher wants to see whether statistically significant differences exist between two different groups with regard to a dependent variable, he will use the t-test for independent groups. For instance, if you want to test whether there is a difference in the level of language skills between a group of matriculants from Gauteng and one from Limpopo, you will use the t-test for independent groups. The t-test is a parametric statistic, so the following assumptions must be met:

1. The t-test uses the means to compare for differences. This implies that the data for the dependent variable must be on at least an interval scale.

2. It is not essential for this procedure that the sample sizes of the two groups are the same. However, for the t-test, the sample size should be at least 30 per group.

3. Equal variances are assumed. For this, Levene's test for homogeneity of variances is used; it is given with the t-test output. You will remember from earlier that the significance value of Levene's test should be more than 0.05, which indicates that the variances are equal.

4. The data for each of the two groups must be normally distributed. This can be tested by means of the descriptives for skewness and kurtosis or the Q-Q plots (or any other test for normality).

To do an independent samples t-test in SPSS, select ANALYSE > COMPARE MEANS > INDEPENDENT SAMPLES T-TEST. Select the dependent variable for the dependent list and the grouping variable under grouping variable. You have to define the groups using the codes of the data set, e.g. group 1 = 0; group 2 = 1. Run the analysis.
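As an aside, the same analysis can be run outside SPSS. Below is a minimal sketch using scipy, with invented language-skill scores for the two groups; Levene's test is run first, and its outcome decides whether equal variances are assumed:

```python
from scipy import stats

# Hypothetical language-skill scores for two independent groups
gauteng = [62, 58, 70, 65, 61, 59, 72, 68, 64, 60]
limpopo = [55, 60, 52, 58, 57, 63, 54, 59, 56, 61]

# Levene's test: p > 0.05 supports the assumption of equal variances
lev_stat, lev_p = stats.levene(gauteng, limpopo)
print(f"Levene: W = {lev_stat:.3f}, p = {lev_p:.3f}")

# Independent samples t-test; use equal_var=False if Levene's test is significant
t, p = stats.ttest_ind(gauteng, limpopo, equal_var=(lev_p > 0.05))
print(f"t = {t:.3f}, p = {p:.3f}")
```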


The output will typically look like this:

Group Statistics: Income before the program

Gender   N     Mean     Std. Deviation   Std. Error Mean
Male     493   8.9939   1.68866          .07605
Female   507   8.9152   1.58510          .07040

The first table shows the number of cases (N) for each group, the mean score for each group, the standard deviation, and the standard error of the mean.

Independent Samples Test: Income before the program

                              Levene's Test           t-test for Equality of Means
                              F       Sig.    t      df        Sig. (2-tailed)   Mean Diff.   Std. Error Diff.   95% CI Lower   95% CI Upper
Equal variances assumed       .421    .517    .760   998       .447              .0787        .10354             -.12446        .28191
Equal variances not assumed                   .760   989.773   .448              .0787        .10363             -.12464        .28209

When you write up the results of a t-test you must report the t-value as well as the significance value (p). In this example, Levene's test showed that homogeneity of variances could be assumed. Thus, from the results, it is evident that there are no differences between the groups (t(998) = 0.760; p = 0.447). This reports the statistical significance.

Notes on reading the output:

• Use the row of output corresponding to the outcome of Levene's test. If Levene's test indicates homogeneity of variances, as in this case, use the upper row of the t-test output.
• Levene's test output: here the test shows that the variances are equal (F = 0.421; p = 0.517).
• The t-value is in this case 0.760.
• The degrees of freedom for two independent groups are N − 2 (here 1000 − 2 = 998).
• The significance value should be less than 0.05 to indicate a significant difference; for this example no difference exists.
• The mean difference is the difference between the two group means (the first group's mean minus the second group's; here male minus female).


It is now also expected that the effect sizes of statistical results are reported. There are different methods for calculating effect sizes; the most common are r and Cohen's d. Pearson's r can vary in magnitude from −1 to 1, with −1 indicating a perfect negative linear relation, 1 indicating a perfect positive linear relation, and 0 indicating no linear relation between two variables. Cohen gives the following guidelines for the social sciences: small effect size, r = 0.10–0.23; medium, r = 0.24–0.36; large, r = 0.37 or larger.
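As a minimal sketch, these effect sizes can be calculated by hand from the t-test output, using the common formulas r = √(t² / (t² + df)) and Cohen's d based on the pooled standard deviation; the numbers below are taken from the example output above:

```python
import math

# Values taken from the example t-test output above
t, df = 0.760, 998
n1, n2 = 493, 507
mean1, mean2 = 8.9939, 8.9152
sd1, sd2 = 1.68866, 1.58510

# Effect size r from the t statistic and the degrees of freedom
r = math.sqrt(t**2 / (t**2 + df))

# Cohen's d using the pooled standard deviation of the two groups
pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = (mean1 - mean2) / pooled_sd

print(f"r = {r:.3f}")  # about 0.024: a negligible effect
print(f"d = {d:.3f}")  # about 0.048: a negligible effect
```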

Box 3: Practical and statistical significance

In SPSS, we look at the p-value to tell us whether results are statistically significant or not. If the p-value is smaller than 0.05, the results are statistically significant at the 0.05 level.

What is statistical significance? It is a statistical concept indicating that the result is very unlikely to be due to chance and, therefore, likely represents a true relationship between the variables. Statistical significance is usually indicated by the alpha value (or probability value), which should be smaller than a chosen significance level. Most research studies use a significance level of 0.05 or 0.01, indicating that there is only a 5% or 1% probability that the result is due to chance alone.

We test the significance of our statistics by looking at the probability that our results may be due to other factors. If this probability is larger than 5% we generally do not accept the result as "significant"; when it is smaller than 5%, we do accept it as "significant". However, statistical significance does not necessarily mean that the result is important: statistical significance can sometimes simply be due to large samples. For this reason we also calculate the effect sizes of significant statistics.

8.3 The nonparametric alternative for the t-test for independent samples: Mann-Whitney U test

Used if the assumptions of the t-test for independent samples are not met, i.e.:

• the data is not normally distributed,
• the dependent variable is measured on an ordinal scale, or
• the sample sizes are small (smaller than 30 but larger than 5 per group).

The hypotheses for a Mann-Whitney test will look like this:

H0: there is no difference between the two samples (for the non-parametric test: median1 = median2)

H1: there is a difference between the two samples (median1 ≠ median2)

The output will typically look like this:

Mann-Whitney Test


Ranks: Level of education

Marital status   N      Mean Rank   Sum of Ranks
Unmarried        504    512.13      258114.00
Married          496    488.68      242386.00
Total            1000

Test Statistics (grouping variable: Marital status)

Mann-Whitney U           119130.00
Wilcoxon W               242386.00
Z                        -1.389
Asymp. Sig. (2-tailed)   .165

When you report the results you must mention the Z score and the significance level. In the example above, the difference between married and unmarried employees with regard to level of education is not significant (z = −1.389; p = 0.165).
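A minimal sketch of the same test outside SPSS, using scipy with invented ordinal education scores for the two groups:

```python
from scipy import stats

# Hypothetical ordinal education levels (1 = lowest, 5 = highest) for two groups
unmarried = [3, 2, 4, 3, 5, 2, 3, 4, 1, 3]
married = [2, 3, 2, 4, 3, 2, 1, 3, 2, 4]

# Mann-Whitney U test for two independent samples
u, p = stats.mannwhitneyu(unmarried, married, alternative="two-sided")
print(f"U = {u:.1f}, p = {p:.3f}")
```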

8.4 Testing differences between two dependent / related samples

In some research designs, a researcher has two measurements of the same group taken at two different points in time, for instance a pre- and post-measurement. In such cases the researcher would like to see whether there is a difference between the two measurements. A good example of such a design is when a researcher wants to test the effectiveness of a communication skills training programme. If the training programme is effective, the logical deduction would be that the scores on the second measurement (after the training programme) will be higher than on the first (before the training programme). In such a case, the researcher will use the t-test for related/dependent samples. The assumptions are the same as for the t-test for independent samples, except for the independence of observations.

Take the following example: we compared the mean test scores before (pre-test) and after (post-test) the subjects completed a test preparation course. We want to see if our test preparation course improved people's scores on the test.


First, we see the descriptive statistics for both variables.

The post-test mean scores are higher. However, this is just at face value – we still do not know whether this difference is statistically significant.

Next, we see the correlation between the two variables. Remember, the groups are paired / the same and therefore, we assume that there is a correlation between the first and second measurement.

There is a strong positive correlation: people who did well on the pre-test also did well on the post-test. Finally, we see the results of the Paired Samples T Test. Remember, this test is based on the difference between the two variables. Under "Paired Differences" we see the descriptive statistics for the difference between the two variables.

To the right of the Paired Differences, we see the T, degrees of freedom, and significance.


The t-value is −2.171, with 11 degrees of freedom, and the significance value is .053. If the significance value is less than .05, there is a significant difference; if it is greater than .05, there is no significant difference. Here, the significance value is approaching significance, but the difference is not significant: there is no difference between the pre- and post-test scores. The test preparation course did not help!

To conduct a t-test for related samples in SPSS you follow the same route as for the t-test for unrelated samples, but select the paired samples (dependent samples) t-test option.
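A minimal sketch of a paired-samples t-test outside SPSS, using scipy with invented pre- and post-test scores for the same 12 people:

```python
from scipy import stats

# Hypothetical pre- and post-test scores for the same 12 subjects
pre = [54, 61, 58, 65, 70, 62, 59, 66, 64, 57, 68, 63]
post = [58, 63, 57, 69, 74, 65, 60, 70, 66, 59, 71, 64]

# Paired (related) samples t-test: based on the differences post - pre
t, p = stats.ttest_rel(post, pre)
print(f"t = {t:.3f}, df = {len(pre) - 1}, p = {p:.3f}")
```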

8.4 The non-parametric alternative to the t-test for dependent/related samples: Wilcoxon Signed-Rank Test

When the level of measurement for a one-group pre-post-test design is ordinal, the data is not normally distributed, or the sample sizes are small, the Wilcoxon Signed-Rank Test is used to test for differences. Where the t-test uses the mean to test for differences, the Wilcoxon Signed-Rank test uses the median. For more information on the Wilcoxon Signed-Rank procedure see: http://learn.lboro.ac.uk/sci/ma/mlsc/documents/wsrt.pdf
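A minimal scipy sketch of the Wilcoxon signed-rank test on the same kind of paired data, again with invented scores:

```python
from scipy import stats

# Hypothetical paired ordinal ratings (before and after) for 10 people
before = [3, 2, 4, 3, 2, 5, 3, 4, 2, 3]
after = [4, 3, 5, 5, 3, 4, 4, 5, 3, 4]

# Wilcoxon signed-rank test for two related samples
w, p = stats.wilcoxon(before, after)
print(f"W = {w:.1f}, p = {p:.3f}")
```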

8.5 Testing differences between more than 2 groups on one variable: One-way Analysis of Variance (One way ANOVA)

Sometimes a researcher wants to compare the differences and similarities between more than 2 groups.


Example: A researcher thinks that students' research skills are influenced by their time management skills. The research question here is: Do time management skills influence students' research skills? For this study, time management skills is the independent variable and will therefore be the grouping variable; research skills are the dependent variable. She measures time management skills by means of the Kubic Time-management questionnaire, which categorises a person's time management skills as low, low-to-moderate, moderate-to-high, or high. Research skills are measured by means of the outcome/score of a student's performance on a master's-level dissertation. The hypotheses are as follows:

H0: There is no difference between students with low, low-to-moderate, moderate-to-high and high time management abilities in their performance on their master's dissertations.

H1: Students with high time management ability will perform significantly better in their master's dissertations than students with moderate-to-high, low-to-moderate and low time management skills.

H2: Students with moderate-to-high time management skills will perform better on their master's dissertations than students with low-to-moderate and low time management skills, but worse than students with high time management skills.

(H3 and H4 will follow the same pattern.)

To test these hypotheses, a one-way ANOVA can be performed. Note the use of "one-way" in this type of statistic: it indicates that there is only one independent variable (grouping variable) or factor involved. This is important because there are also two- or three-way ANOVAs, or factorial ANOVAs, which are computed when more than one factor or grouping variable is used in the comparison; these are, however, beyond the scope of this module. As the name indicates, the ANOVA looks at the variances (or differences in variances) between the different groups. If differences exist, we assume that there are differences somewhere


between the means of the different groups; from all the groups, at least two group means differ significantly from each other. At this point the H0 can be rejected. The ANOVA output itself only tells you whether there is a difference somewhere or not; it does not tell you between which groups these differences lie. To see between which groups differences exist, post-hoc tests are used. There are different post-hoc tests. The most commonly used is Tukey's Honestly Significant Difference (HSD) test. The Bonferroni test is also used since it controls for the Type I error (finding significant differences when there are none); the chance of a Type I error is increased because repeated comparisons are made between groups. Both these tests are conducted when equal variances of the groups are assumed (a parametric assumption). The one-way ANOVA as explained above is a parametric test. The assumptions or requirements for the data are the same as for the t-test for independent groups:

1. All observations must be independent from each other.
2. The dependent variable must be measured on an interval or ratio scale.
3. The dependent variable must be normally distributed in the population – for each group being compared.
4. The variances of all the groups must be the same (homogeneity of variances).
5. Sample sizes need not be equal, but should preferably be larger than 30 for each group.

When equal variances are not assumed, but all other assumptions are met, SPSS gives you a choice of post-hoc tests that adjust for the differences between group variances. For this, you may select the Tamhane's, Dunnett's or Games-Howell post-hoc tests. SPSS output will typically give you the following:

a. Descriptive statistics:

Descriptives: Income before the program

Level of education             N      Mean      Std. Deviation   Std. Error   95% CI Lower   95% CI Upper   Minimum   Maximum
Did not complete high school   459    7.6776    .82043           .03829       7.6023         7.7528         6.00      10.00
High school degree             348    9.2500    .72273           .03874       9.1738         9.3262         8.00      11.00
Some college                   193    11.4560   1.02030          .07344       11.3111        11.6008        10.00     14.00
Total                          1000   8.9540    1.63663          .05175       8.8524         9.0556         6.00      14.00


The descriptive statistics include the number of respondents per group (N), the mean or average score per group, the standard deviation, the standard error of the mean, and the minimum and maximum scores.

b. Test for homogeneity of variance:

Test of Homogeneity of Variances: Income before the program

Levene Statistic   df1   df2   Sig.
18.420             2     997   .000

As with the t-test, the ANOVA output (if you request it) gives you the results of Levene's test for homogeneity of variances. If Levene's test is significant (sig./p < 0.05) the null hypothesis (which states that the variances are equal) is rejected; thus, the variances between the groups are not equal. This tells you which post-hoc tests should be interpreted.

c. The ANOVA / F-test

ANOVA: Income before the program

                 Sum of Squares   df    Mean Square   F          Sig.
Between Groups   1986.479         2     993.240       1436.399   .000
Within Groups    689.405          997   .691
Total            2675.884         999

This is what you look at to decide whether there are any differences between the groups. In the first column, the output shows you the location of the variation – either between groups (the differences between the group means) or within groups (the amount of variation that exists within each of the groups). The amount of variation for each of these is computed by means of the SUM OF SQUARES and the DEGREES OF FREEDOM (df): for between groups, df is the number of groups minus 1; for within groups, df is N minus the number of groups; for the total, df is N − 1. MS is the MEAN SQUARE (the variance), computed as SS/df. The F is the F-ratio: the ANOVA uses the F-test/F-distribution to test for differences between groups, and the F-ratio is computed as the between-groups MS divided by the within-groups MS. For differences to be significant, the between-groups MS should be much larger than the within-groups MS, so if the grouping variable has an effect (in other words, when there is a difference between groups) the F-ratio should be larger than 1. To see whether the differences are statistically significant, you need to look at the sig. (significance value). If the sig. < 0.05, there are significant differences between the groups. The interpretation of this table can be written as: a significant difference exists between the groups (F(2, 997) = 1436.399; p < 0.001).
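As an aside, a minimal sketch of the same kind of analysis outside SPSS is shown below, using scipy for Levene's test and the one-way ANOVA; the post-hoc Tukey HSD step assumes the statsmodels package is available, and the three income lists are invented for illustration:

```python
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical incomes (in thousands) for three education groups
no_hs = [7.2, 7.8, 7.5, 8.0, 7.1, 7.6, 7.9, 7.4]
hs = [9.1, 9.4, 8.9, 9.6, 9.2, 9.0, 9.5, 9.3]
college = [11.2, 11.8, 11.5, 11.0, 11.9, 11.4, 11.6, 11.3]

# Levene's test for homogeneity of variances across the three groups
lev_stat, lev_p = stats.levene(no_hs, hs, college)
print(f"Levene: W = {lev_stat:.3f}, p = {lev_p:.3f}")

# One-way ANOVA: is there a difference somewhere between the group means?
f, p = stats.f_oneway(no_hs, hs, college)
print(f"F = {f:.3f}, p = {p:.5f}")

# Post-hoc Tukey HSD: between which groups do the differences lie?
scores = no_hs + hs + college
groups = ["no_hs"] * len(no_hs) + ["hs"] * len(hs) + ["college"] * len(college)
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```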

d. Results of the post-hoc tests:

Multiple Comparisons
Dependent Variable: Income before the program

The output repeats the pairwise group comparisons for each of the post-hoc tests requested (Tukey HSD, Bonferroni, Tamhane and Games-Howell), giving for each pair the mean difference (I − J), its standard error, the significance value and a 95% confidence interval. Condensed, the Tukey HSD comparisons are:

(I) Level of education         (J) Level of education   Mean Difference (I−J)   Std. Error   Sig.
Did not complete high school   High school degree       -1.5724*                .05911       .000
Did not complete high school   Some college             -3.7784*                .07134       .000
High school degree             Some college             -2.2060*                .07463       .000

(The Bonferroni test gives the same mean differences and standard errors with slightly wider confidence intervals; the Tamhane and Games-Howell tests give the same mean differences with slightly different standard errors and confidence intervals.)

*. The mean difference is significant at the .05 level.

The post-hoc tests compare the specific groups with each other. Those groups that differ significantly will usually be flagged by means of an asterisk (*); the significance value for the group comparison should also be smaller than 0.05.


For this example, I selected both the post-hoc tests that assume homogeneity of variances and those that do not. From the Levene statistic, I can say that the assumption of homogeneity of variances has not been met. Therefore, I need to look at the results of either the Tamhane or the Games-Howell post-hoc tests. Both of these tests indicate that there are statistically significant differences between all the groups. The hypothesis that I tested in this example was that significant differences exist in the income level of people who did not complete school, those who did complete school, and those with post-matric training. I can now say that the ANOVA showed that significant differences exist. From the means plot I can see which of the groups has the highest income:

[Means plot: mean income before the program (y-axis, approximately 7 to 12) by level of education (Did not complete high school, High school degree, Some college), showing the mean income increasing with level of education.]

8.6 The non-parametric alternatives for the One-way ANOVA

When the data does not meet the requirements/assumptions of the parametric one-way ANOVA, the Kruskal-Wallis H test, the Median test or the Jonckheere-Terpstra test can be used. For the purpose of this module, we will only look at the Kruskal-Wallis H test. The Kruskal-Wallis H test is an extension of the Mann-Whitney U test and is the more powerful and preferable non-parametric alternative to use. Where the ANOVA uses the F-ratio, the Kruskal-Wallis uses the H statistic to assess whether differences exist; basically, it compares the medians of the samples/groups.

Data used for the Kruskal-Wallis test should meet the following requirements (a code sketch follows the list):

• the groups must be independent,
• there should be more than 5 respondents per group (preferably 10), and
• sample sizes should be equal, or as equal as possible.

The distribution need not be normal and the variances need not be equal.
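A minimal scipy sketch of the Kruskal-Wallis H test on three invented groups of ordinal scores:

```python
from scipy import stats

# Hypothetical ordinal scores for three independent groups
group_a = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
group_b = [2, 3, 2, 3, 1, 2, 3, 2, 2, 3]
group_c = [4, 5, 5, 4, 3, 5, 4, 4, 5, 4]

# Kruskal-Wallis H test: non-parametric alternative to the one-way ANOVA
h, p = stats.kruskal(group_a, group_b, group_c)
print(f"H = {h:.3f}, p = {p:.4f}")
```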


In some situations, you would not want to compare more than two groups on one independent variable alone; in other words, you would like to see whether there are differences based on more than one variable. Do Black, Asian and White South Africans differ in terms of geographical location, number of children, and number of people living within one household? When two independent variables are included we make use of the two-way ANOVA; when three independent variables are included we make use of the three-way ANOVA. ANOVAs with more than one "factor" tested for an effect can also be called factorial ANOVAs. See more at:

o http://davidmlane.com/hyperstat/A134930.html
o http://pluto.fss.buffalo.edu/classes/psy/segal/2072001/anova2/ANOVA2.html
o http://arts.uwaterloo.ca/~djbrown/psych391/Test2/Factorial-Variance1.pdf

When more than one dependent variable is included, the Multivariate analysis of variance or MANOVA is used. See:

o http://userwww.sfsu.edu/~efc/classes/biol710/manova/manovanew.htm
o http://www.utexas.edu/cc/docs/stat38.html

Remember when to use a partial correlation? You want to keep the effect of a variable constant, to see what the relationship between two other variables is without its interference. Sometimes when we want to test differences with an ANOVA, we may also want to control for the effect of another variable. In such cases we use the Analysis of Covariance (ANCOVA). See also: http://www-users.cs.umn.edu/~ludford/Stat_Guide/ANCOVA.htm

References:

Field, A. (2009). Discovering statistics using SPSS. SAGE Publications.
Huysamen, G.K. (1998). Descriptive statistics for the social and behavioral sciences. Pretoria: JL van Schaik Academic.
Pallant, J. (2003). SPSS survival manual. Open University Press.