chapter 7 comparing two means difference between two groups

CHAPTER 7

COMPARING TWO MEANS

Difference between two groups

7.2. Revision of Experimental Research

• Often in the social sciences we are not just interested in which variables covary or predict an outcome. Instead we want to look at the effect of one variable on another by systematically changing some aspect of that variable.

• Rather than collecting naturally occurring data, as in correlation and regression, we manipulate one variable to observe its effect on the other.

Example: Effect of encouragement on learning

• Group 1: Positive Reinforcement

• Group 2: Negative Reinforcement

Teaching method (positive or negative) is the independent variable. It has two levels.

The outcome, statistical ability, is the dependent variable.

7.2.1 Two methods of data collection

• Between Group, Between Subject or Independent Design

• Within subject or repeated measure design (Dependent Design)

7.2.2. Two types of variation

Repeated Measures Design (Dependent)

Example: Train monkeys to run the economy

Training Phase 1: The monkeys are given user-friendly computers with press buttons, which change various parameters of the economy. Once they change the parameters, a figure appears on the screen indicating the economic growth. No feedback is given.

Training Phase 2: The same monkeys are given computers.

If the economic growth is good, they get a banana.

If there was no experimental manipulation (i.e. no bananas), then we would expect the behaviour of the monkeys to be the same in both phases.

We expect this because other factors, such as age, gender, IQ and motivation, are the same for both conditions.

Therefore monkeys who score high in condition 1 should also score high in condition 2 and vice versa.

But the performance will not be identical. There will be small differences in performance created by unknown factors. This variation is called unsystematic variation.

Independent Design (Different Participants)

Example: Train monkeys to run the economy

Training Phase 1: 5 monkeys receive training without feedback (no bananas). 5 different monkeys receive training with feedback.

If there was no experimental manipulation (i.e. no bananas), then we would still find some variation in behaviour between the groups because they contain different monkeys, who vary in their ability, motivation, IQ, etc.

The factors that were held constant in the repeated measures design are free to vary in the independent design.

Therefore the unsystematic variation will be bigger than for the repeated measures design.
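The contrast between the two designs can be illustrated with a small simulation. This is only a sketch under assumed values: the ability and noise standard deviations, the population mean of 50 and the function name `simulate` are all hypothetical and do not come from the monkey example. No manipulation is applied, so any difference between conditions is unsystematic variation.

```python
import random
import statistics

def simulate(n_monkeys=1000, ability_sd=10.0, noise_sd=2.0, seed=1):
    """Return the SD of condition differences (no manipulation) for both designs.

    All numeric settings are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Repeated measures: the same monkey is measured twice, so its stable
    # ability cancels out of the difference and only random noise remains.
    repeated = []
    for _ in range(n_monkeys):
        ability = rng.gauss(50, ability_sd)
        score_1 = ability + rng.gauss(0, noise_sd)
        score_2 = ability + rng.gauss(0, noise_sd)
        repeated.append(score_2 - score_1)

    # Independent design: each condition uses a different monkey, so
    # differences in ability also enter the comparison.
    independent = []
    for _ in range(n_monkeys):
        score_1 = rng.gauss(50, ability_sd) + rng.gauss(0, noise_sd)
        score_2 = rng.gauss(50, ability_sd) + rng.gauss(0, noise_sd)
        independent.append(score_2 - score_1)

    return statistics.stdev(repeated), statistics.stdev(independent)

sd_repeated, sd_independent = simulate()
print(f"repeated measures: {sd_repeated:.2f}  independent: {sd_independent:.2f}")
```

Under these assumptions the spread of differences is several times larger in the independent design, because individual differences are no longer held constant.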

7.2.2 Two Types of Variation (Continued)

Systematic Variation

The variation due to the experimenter doing something to all of the participants in one condition but not in the other condition.

Unsystematic Variation

This variation results from random factors that exist between the experimental conditions (such as natural differences in ability).

The role of statistics is to discover how much variation there is in performance and then to work out how much of this is systematic and how much is unsystematic.

Repeated Measures Design: Differences between two conditions can be caused by only two things:

• The manipulation that was carried out on the participants

• Any other factor that might affect the way in which a person performs from one time to the next

Independent Design: Differences between two conditions can again be caused by two things:

• The manipulation that was carried out on the participants

• Differences between the characteristics of the people allocated to each of the groups. The second factor in this case is likely to create considerable random variation both within each condition and between them.

7.3 Error Bar charts

Figure 7.3

7.3.1. Error Bar graphs for between group designs

SPSS Example: spiderBG.sav

Figure 7.9

7.3.2. Error Bar graphs for repeated measures design

7.4 Testing Difference between Means: The T Test

Examples:

• Does viewing an advertisement lead to more purchases?

Independent means t test : Different participants

Dependent means t test: Same participants

7.4.1 Rationale for the t test

1. Two samples of data are collected and the sample means calculated. These means might differ by either a little or a lot.

2. If the samples come from the same population, then we expect their means to be roughly equal. Although it is possible for their means to differ by chance alone, we would expect large differences between sample means to occur very infrequently. Under the null hypothesis we assume that the experimental manipulation (the advertisement) has no effect on the participants, and therefore that the sample means are very similar.

H0 : Mean of Sample 1 = Mean of Sample 2

3. We compare the difference between the sample means that we collected to the difference between the sample means that we would expect to obtain by chance. We use the standard error as a gauge of the variability between sample means.

• If the std error is small, we expect most samples to have similar means.

• If the std error is large, we expect to obtain large differences in sample means by chance alone.
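The role of the standard error as a gauge can be sketched with a simulation in which both samples are always drawn from the same population, so every difference between means arises by chance alone. The population mean (100), standard deviation (15), sample sizes and the function name `mean_differences` are assumptions chosen for illustration only.

```python
import random
import statistics

def mean_differences(sample_size, n_pairs=2000, seed=7):
    """Differences between the means of two samples drawn from ONE population."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_pairs):
        a = [rng.gauss(100, 15) for _ in range(sample_size)]
        b = [rng.gauss(100, 15) for _ in range(sample_size)]
        diffs.append(statistics.mean(a) - statistics.mean(b))
    return diffs

# Small samples give a large standard error; large samples give a small one.
small_samples = mean_differences(sample_size=5)
large_samples = mean_differences(sample_size=100)

spread_small_n = statistics.stdev(small_samples)
spread_large_n = statistics.stdev(large_samples)
print(f"spread of chance differences, n=5:   {spread_small_n:.2f}")
print(f"spread of chance differences, n=100: {spread_large_n:.2f}")
```

With the larger standard error (small samples), sizeable differences between sample means turn up by chance far more often, which is why an observed difference has to be judged against the standard error rather than on its own.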

7.4.1 Rationale for the T Test (Continued)

If the difference between the sample means we have collected is larger than what we would expect based on the standard error, then we can assume one of two things:

1. That sample means in our population fluctuate a lot by chance alone and we have, by chance, collected two samples that are atypical (not representative) of the population from which they came.

2. The two samples come from different populations but are typical of their respective parent populations. In this scenario, the difference between samples represents a genuine difference between the samples (the null hypothesis is incorrect).

As the observed difference between the sample means gets larger, we become more confident that the second explanation is correct.

7.4.2. Assumptions of the t test

Both the independent and the dependent t test are parametric tests:

• Data are from normally distributed populations

• Data are measured at least at the interval level

The independent t test, because it is used to test different groups of people, also assumes:

• Variances in these populations are roughly equal (homogeneity of variance)

• Scores are independent (because they come from different people)

7.5 The Dependent T Test: To analyse whether differences between group means are statistically meaningful.

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{N}}$$

The equation compares the mean difference between our samples ($\bar{D}$) with the difference we would expect to find between population means ($\mu_D$), and then takes into account the standard error of the differences ($s_D / \sqrt{N}$).

7.5.1 Sampling distributions and the standard error

Sampling distributions have several properties that are important:

• If the population is normally distributed then so is the sampling distribution. If the sample size is more than about 30, the sampling distribution is approximately normal even when the population is not.

• The mean of the sampling distribution is equal to the mean of the population.

• The standard deviation of a sampling distribution (the standard error) is equal to the std deviation of the population divided by the square root of the sample size. For difference scores it is estimated from the sample as

$$\sigma_{\bar{D}} = \frac{s_D}{\sqrt{N}}$$
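The relationship between the standard error and the sample size can be checked numerically. The sketch below draws many samples from a normal population and compares the standard deviation of their means with the population standard deviation divided by the square root of N. The population values (mean 100, standard deviation 15) and the sample size of 25 are assumptions for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)
population_sd = 15.0   # assumed population standard deviation
sample_size = 25       # N, the size of each sample

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(5000):
    sample = [rng.gauss(100, population_sd) for _ in range(sample_size)]
    sample_means.append(statistics.mean(sample))

# The SD of the sample means is the empirical standard error.
observed_se = statistics.stdev(sample_means)
expected_se = population_sd / math.sqrt(sample_size)

print(f"observed standard error:  {observed_se:.2f}")
print(f"sigma / sqrt(N):          {expected_se:.2f}")
```

The two values should agree closely, and the mean of the sample means should sit near the population mean.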

We can extend this idea to differences between sample means

• If you take several pairs of samples from a population and calculate their means, then you can also calculate the difference between their means.

See the explanation on page 289.

7.5.2 The dependent t test equation explained

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{N}}$$

Numerator: variation explained by the model

Denominator: variation not explained by the model

If we calculate the difference between each person's score in the two conditions and add these differences, we get the total amount of difference.

If we divide this by the number of participants, we get the average difference. This average difference is D bar ($\bar{D}$) and is an indicator of the systematic variation due to the experimental effect.

We need to be sure that the observed difference is due to our experimental manipulation (and not a chance result).

Knowing the mean difference alone is not useful because it depends on the scale of measurement, so we standardise the value.

We can standardise it by dividing it by the sample std deviation of the differences.

The std deviation is a measure of how much variation there is between participants' difference scores.

Thus the std deviation of the differences represents the unsystematic variation.

Sr. No    Picture    Real    Difference
1         30         40      -10
2         35         35        0
3         45          5       40
4         40         55      -15
5         50         65      -15
6         35         55      -20
7         55         50        5
8         25         35      -10
9         30         30        0
10        45         50       -5
11        40         60      -20
12        50         39       11

Sum                          -39
Avg difference (mean)        -3.25
Std dev of differences       16.77
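The summary rows of the table can be reproduced with a few lines of Python. This is a minimal sketch that copies the Picture and Real columns as they appear above, takes each difference as Picture minus Real, and assumes the expected population difference is zero under the null hypothesis. The variable names are chosen for illustration.

```python
import math
import statistics

# Scores copied from the Picture and Real columns of the table.
picture = [30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50]
real = [40, 35, 5, 55, 65, 55, 50, 35, 30, 50, 60, 39]

# Difference score for each participant (Picture minus Real).
differences = [p - r for p, r in zip(picture, real)]
n = len(differences)

mean_diff = statistics.mean(differences)   # D-bar, the average difference
sd_diff = statistics.stdev(differences)    # s_D, sample SD of the differences
se_diff = sd_diff / math.sqrt(n)           # standard error of the differences

# Dependent t with the expected difference set to 0 (null hypothesis).
t = mean_diff / se_diff

print(f"sum = {sum(differences)}, mean = {mean_diff:.2f}, sd = {sd_diff:.2f}")
print(f"standard error = {se_diff:.2f}, t = {t:.2f}")
```

The sum, mean and standard deviation match the table, and dividing the mean difference by the standard error rather than the standard deviation gives the t statistic.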

If we had 12 clones (identical participants):

Sr. No    Picture    Real    Difference
1         30         40      -10
2         30         40      -10
3         30         40      -10
4         30         40      -10
5         30         40      -10
6         30         40      -10
7         30         40      -10
8         30         40      -10
9         30         40      -10
10        30         40      -10
11        30         40      -10
12        30         40      -10

Sum                          -120
Avg difference (mean)        -10
Std dev of differences       0


7.5.2 The dependent t test equation explained (continued)

• Dividing by the std deviation is a useful means of standardising the average difference between conditions.

• We are interested in knowing how the difference between sample means compares to what we would expect to find had we not imposed an experimental manipulation.

• We can use the properties of the sampling distribution: instead of dividing the average difference between conditions by the std deviation of the differences, we divide it by the std error of the differences.

• Dividing by the std error standardises the average difference between conditions, but also tells us how the difference between the sample means compares in magnitude to what we would expect by chance alone.

• If the std error is large, then large differences between samples are more common (because the distribution of differences is more spread out). Conversely, if the std error is small, then large differences between sample means are uncommon (because the distribution of differences is very narrow and centred around zero).

• Therefore, if the average difference between our samples is large and the std error of the differences is small, then we can be confident that the difference we observed in our sample is not a chance result. If the difference is not due to chance, it must have been caused by the experimental manipulation.

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{N}}$$

• The t statistic is simply the ratio of the systematic variation in the experiment to the unsystematic variation.

• If the experimental manipulation creates an effect, then we expect the systematic variation to be much greater than the unsystematic variation, i.e. t will be greater than 1.

• If the experimental manipulation is unsuccessful, then we might expect the variation caused by individual differences to be much greater than that caused by the experiment, so t will be less than 1.

• We can compare the obtained t value against the maximum value we would expect to get by chance alone in a t distribution with the same degrees of freedom.

• If the value we obtain exceeds this critical value, we can be confident that this reflects an effect of our independent variable.
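The comparison against a critical value can be sketched as a simple lookup. The critical values below are two-tailed values at the .05 level taken from a standard t table; the choice of significance level, the function name `exceeds_critical` and the example t values are assumptions for illustration. For the dependent t test the degrees of freedom are N - 1.

```python
# Two-tailed critical values of t at alpha = .05 for a few degrees of
# freedom, copied from a standard t table (only a small excerpt).
CRITICAL_T_05 = {5: 2.571, 10: 2.228, 11: 2.201, 20: 2.086, 30: 2.042}

def exceeds_critical(t, df):
    """Return True if |t| is larger than the tabled critical value for df."""
    critical = CRITICAL_T_05[df]
    return abs(t) > critical

# Illustrative t values with 12 participants, so df = 12 - 1 = 11.
print(exceeds_critical(-0.67, 11))  # |t| is below 2.201
print(exceeds_critical(2.50, 11))   # |t| is above 2.201
```

The sign of t only reflects which condition had the larger mean, so the absolute value is compared with the critical value in a two-tailed test.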


7.5.3 The dependent t test using SPSS

• Open the file: spiderRM.sav

• Analyze > Compare Means > Paired-Samples T Test

7.6 The Independent T Test

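$$t = \frac{\bar{X}_1 - \bar{X}_2}{\text{estimate of the std error}}$$

For comparison, the dependent t test:

$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{N}}$$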

Instead of looking at differences between pairs of scores, we look at the differences between the overall means of the two samples and compare them to the differences we would expect to get between the means of the two populations from which the samples come.

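Under the null hypothesis the population means are equal ($\mu_1 = \mu_2$), so the equation

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\text{estimate of the std error}}$$

becomes

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\text{estimate of the std error}}$$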

$$\text{Standard error} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}$$
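The independent t statistic can be sketched directly from the difference between the two sample means and the standard error built from each sample's variance and size. The example reuses the Picture and Real scores from the earlier table and treats them, purely for illustration, as if they came from two different groups of participants. The function name `independent_t` is hypothetical.

```python
import math
import statistics

def independent_t(group_1, group_2):
    """t = (mean1 - mean2) / sqrt(s1^2 / N1 + s2^2 / N2), under the null hypothesis."""
    n1, n2 = len(group_1), len(group_2)
    var_1 = statistics.variance(group_1)   # sample variance s1^2
    var_2 = statistics.variance(group_2)   # sample variance s2^2
    standard_error = math.sqrt(var_1 / n1 + var_2 / n2)
    return (statistics.mean(group_1) - statistics.mean(group_2)) / standard_error

# Scores from the earlier table, treated here as two independent groups
# (an assumption made only for this illustration).
picture = [30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50]
real = [40, 35, 5, 55, 65, 55, 50, 35, 30, 50, 60, 39]

t = independent_t(picture, real)
print(f"t = {t:.2f}")
```

The difference between the means is the same as in the dependent analysis, but the standard error now has to absorb the variation between different participants, so the t value changes.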

If we took several pairs of samples, the differences between the sample means would be similar across pairs.

7.6.2 The independent t test using SPSS

• Open the file: spiderBG.sav

• Analyze > Compare Means > Independent-Samples T Test

7.8 The T Test as a General Linear Model

Open spiderBG.sav and run a simple linear regression, using group as the predictor and anxiety as the outcome.
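The link between the t test and regression can be sketched outside SPSS with a hand-computed least-squares fit. The sketch assumes that the Picture and Real scores from the earlier table are anxiety scores from two separate groups, and codes group as 0 (picture) and 1 (real). These codings and variable names are illustrative assumptions, not the contents of spiderBG.sav.

```python
import math
import statistics

# Assumed anxiety scores for the two groups (taken from the earlier table).
picture = [30, 35, 45, 40, 50, 35, 55, 25, 30, 45, 40, 50]
real = [40, 35, 5, 55, 65, 55, 50, 35, 30, 50, 60, 39]

# Dummy-code the predictor: 0 = picture group, 1 = real group.
group = [0] * len(picture) + [1] * len(real)
anxiety = picture + real
n = len(anxiety)

mean_x = statistics.mean(group)
mean_y = statistics.mean(anxiety)
sxx = sum((x - mean_x) ** 2 for x in group)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(group, anxiety))

slope = sxy / sxx                      # difference between the group means
intercept = mean_y - slope * mean_x    # mean of the group coded 0

# Standard error of the slope from the residual variance.
residuals = [y - (intercept + slope * x) for x, y in zip(group, anxiety)]
mse = sum(r ** 2 for r in residuals) / (n - 2)
t_slope = slope / math.sqrt(mse / sxx)

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}, t = {t_slope:.2f}")
```

The intercept equals the mean of the group coded 0, the slope equals the difference between the two group means, and the t statistic for the slope has the same magnitude as the independent t computed from the same scores.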
