Handout Seven: Independent-Samples t Test
EPSE 482 Introduction to Statistics for Research in Education
Instructor: Dr. Amery Wu

Upload: lindsay-warner

Post on 08-Jan-2018


Page 1:

Handout Seven: Independent-Samples t Test

EPSE 482 Introduction to Statistics for Research in Education

Instructor: Dr. Amery Wu

Page 2:

Independent-Samples t Test - An Example to Contextualize Learning

Let’s say that I am interested in finding out whether there is an age difference between UBC and SFU students, including all undergraduate, master’s, and doctoral students.

My hypothesis is that there is a difference in students’ age between the two universities. I recruited 43 students from UBC and 30 students from SFU. My sample mean age is 26 for UBC and 23.5 for SFU.

Question: How likely is it that my guess (there is an age difference between UBC and SFU students) is true, given that my sample mean difference is 2.5? Or, simply put, does the observed mean difference of 2.5 hold in the population?

Page 3:

Quantitative Methodology Network

Research Question

Design: Experimental / Observational
Data: Continuous / Categorical
Model: Descriptive/Summative / Explanatory/Predictive
Inference: Descriptive vs. Inferential; Relational vs. Causal

Page 4:

Keep notes of the flaws in the method I will use to answer my question.

Hints:
How did I recruit my sample?
How many students did I sample for each group?
What is the distribution of my sample for each group?
...

Page 5:

Lab Activity

In SPSS, use the drop-down menu “Data” > “Split File” > “Compare Groups”, and then use “Descriptive Statistics” to output the following statistics for age, separately for the UBC and SFU students: minimum, maximum, mean, sample size, variance, and SD.
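For readers without SPSS, the same per-group descriptive statistics can be sketched in Python. This is only an illustration: the ages below are hypothetical values invented for the example (the handout's data set is not reproduced in this transcript), and the `describe` helper is a name introduced here.

```python
import statistics

# HYPOTHETICAL ages, invented for illustration only; not the handout's data.
ages = {
    "UBC": [19, 22, 24, 27, 31, 35],
    "SFU": [18, 20, 22, 23, 26, 29],
}

def describe(values):
    """Return the statistics requested in the lab activity for one group."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "n": len(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "sd": statistics.stdev(values),           # sample SD (divides by n - 1)
    }

# One summary per group, mirroring "Split File" > "Compare Groups"
summary = {group: describe(values) for group, values in ages.items()}
for group, stats in summary.items():
    print(group, stats)
```

The sample (n - 1) versions of the variance and SD are used here because they feed the pooled variance later in the handout.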

Page 6:

Where We Are Today

                         Measurement of Data
Type of the Inference    Continuous    Categorical
Descriptive              A             B
Inferential              C             D

Today, we will introduce the concept and application of the independent-samples t test. The purpose of the independent-samples t test is to test whether one’s hypothesis about the population mean difference is supported by the observed data.

The design could be observational or experimental.
The model is explanatory/predictive (the IV is the group, with 2 levels).
The data (DV) are quantitative.
The relationship between the IV and DV could be causal or relational, depending on the design.
The intended conclusion is inferential.

Page 7:

Inferential Statistics

Inferential statistics takes sampling error into account when inferring the population parameter from the statistic observed in one single sample.

Two essential but esoteric pieces of machinery for making inferences about the population based on one single sample:

1. Sampling Distribution
2. Hypothesis Testing

Page 8:

From the Sample to the Population

[Figure: from the sample to the population by way of sampling distributions, probability, and hypothesis testing. Panels: Population (Mean 24 ???); Sampling Distributions of the Mean - the t Distributions (Expected Sample Distribution of Size 20 from the Population; M=24, SD=1.503); Observed Sample Distribution of Sample Size 20 (M=26, SD=6.72).]

Page 9:

Sampling Distributions of the Mean Difference

The statistic of interest is the mean difference between groups 1 and 2.

The sampling distributions of the mean difference follow the t distributions. When the null is true, the distribution has
df = N-2 (N is the total sample size over the two groups),
a mean of zero, and
a SD of

$$\sqrt{\frac{Var_1}{n_1} + \frac{Var_2}{n_2}}$$ when n1 = n2, or

$$\sqrt{Var_{pooled}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$ when n1 ≠ n2,

where pooled variance =

$$Var_{pooled} = \frac{(n_1 - 1)\,Var_1 + (n_2 - 1)\,Var_2}{n_1 + n_2 - 2}$$

Question: What is the other name for the SD of the sampling distributions of the mean difference?

Lab Activity: Identify the sampling distribution of the UBC/SFU mean difference in age: df = ??, mean difference = ??, standard deviation = ??
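As a rough check on the lab activity, the degrees of freedom and the SD of the sampling distribution can be computed from summary statistics. The sketch below assumes the standard pooled-variance formulas for the equal-variances t test. The group variances are not given in this transcript, so the example starts from the pooled SD of 5.571 that the handout reports on its Cohen's d slide. The function names are introduced here for illustration.

```python
import math

def pooled_variance(var1, n1, var2, n2):
    """Average of the two group variances, weighted by their degrees of freedom."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

def se_mean_difference(var_pooled, n1, n2):
    """SD of the sampling distribution of M1 - M2, assuming equal variances."""
    return math.sqrt(var_pooled * (1 / n1 + 1 / n2))

n1, n2 = 43, 30      # UBC and SFU sample sizes from the handout
sd_pooled = 5.571    # pooled SD reported on the handout's Cohen's d slide

# With the group variances from the SPSS output, one would instead call
# pooled_variance(var1, n1, var2, n2) to obtain the pooled variance.
df = n1 + n2 - 2
se = se_mean_difference(sd_pooled ** 2, n1, n2)
print(df, round(se, 3))  # 71 1.325
```

These values agree with the df = 71 and SD = 1.325 used in the seven-step procedure.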

Page 10:

Seven Steps for Hypothesis Testing
Two Tailed Independent-Samples t Test - UBC vs. SFU Age

1. Specify the hypotheses.
H0: µ1=µ2 (the same population)
H1: µ1≠µ2

2. Specify the significance level (α).
α = 0.05

3. Calculate the statistic of interest.
M1-M2 = 2.5

4. Identify the sampling distribution of the statistic of interest.
t-distribution with df = 71, mean = 0, and SD = 1.325

5. Calculate the test statistic.
t = (M1-M2)/SEmean difference; t = (26-23.5)/1.325 = 1.886

6. Obtain the p-value.
p = 0.063 (by SPSS)

7. Conclude: reject or retain.
Retain the H0: µ1=µ2
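Steps 5 to 7 can be reproduced without SPSS. The following is a minimal sketch using only the Python standard library. Because the standard library has no t distribution, the cumulative probability is obtained by numerically integrating the t density with Simpson's rule. `t_pdf` and `t_cdf` are helper names introduced here, not part of any library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

m1, m2 = 26.0, 23.5   # sample means from the handout
se = 1.325            # standard error of the mean difference from the handout
df = 71
alpha = 0.05

t_stat = (m1 - m2) / se
# Two-tailed: probability of a t at least this extreme in either direction
p_two_tailed = 2 * (1 - t_cdf(abs(t_stat), df))
decision = "reject H0" if p_two_tailed < alpha else "retain H0"
print(round(t_stat, 3), round(p_two_tailed, 3), decision)
```

The p-value should land close to the 0.063 reported by SPSS. Small differences in the last digit of t arise because the standard error is rounded to 1.325 here.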

Page 11:

Independent-Samples t Test Using SPSS

Page 12:

SPSS Outputs for Independent-Samples t Test-Two Tailed

Page 13:

Lab Activity
One Tailed Independent-Samples t Test

Let’s say that I had started with the hypothesis that UBC students, on average, are older than SFU students (instead of “different from” SFU students). I then collected the data and ran a one-tailed independent-samples t test.

Question: How likely is it that my guess (UBC mean age > SFU mean age) is true, given that my sample mean difference is 2.5? Or, simply put, is the population mean difference greater than 0?

Page 14:

Seven Steps for Hypothesis Testing
One Tailed Independent-Samples t Test

1. Specify the hypotheses.
H0: µ1≤µ2; H1: µ1>µ2

2. Specify the significance level (α).
α = 0.05

3. Calculate the statistic of interest.
M1-M2 = 2.5

4. Identify the sampling distribution of the statistic of interest.
t-distribution with df = 71, mean = 0, and SD = 1.325

5. Calculate the test statistic.
t = (M1-M2)/SEmean difference; t = (26-23.5)/1.325 = 1.886

6. Obtain the p-value.
p = 0.0317 (the SPSS drop-down menu does not provide the p-value for one-tailed tests; go to http://www.stat.tamu.edu/~west/applets/tdemo.html)

7. Conclude: reject or retain.
Reject the H0: µ1≤µ2
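As an alternative to the online applet, the one-tailed p-value can be sketched with the Python standard library alone. For H1: µ1>µ2, only the upper tail beyond the observed t counts. The helpers `t_pdf` and `t_cdf` are names introduced for this sketch. They integrate the t density numerically because the standard library has no t distribution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

t_stat = (26.0 - 23.5) / 1.325   # values from the handout
df, alpha = 71, 0.05

# One-tailed (upper tail): probability of a t at least this large
p_one_tailed = 1 - t_cdf(t_stat, df)
decision = "reject H0" if p_one_tailed < alpha else "retain H0"
print(round(p_one_tailed, 4), decision)
```

Because the t distribution is symmetric, this upper-tail probability is half of the two-tailed p-value for the same t. That is why 0.0317 falls below α = 0.05 while 0.063 does not.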

Page 15:

Why did the one-tailed and two-tailed hypothesis tests reach different conclusions?

Page 16:

Statistical Power

The power of a statistical test is the probability that the test will reject the null hypothesis when the null hypothesis is false. If we define a “false null” as the “CASE” (what we would like to detect), power is how sensitive a statistical test is in detecting a true CASE, i.e., the true “positive” rate. It is also called sensitivity.

                     Retain (0)       Reject (1)
Null is True (0)     Specificity      Type I Error
Null is False (1)    Type II Error    Power (Sensitivity)

Page 17:

Factors Influencing Power

The power is, in general, a function of the size of the population parameter (population effect size), the sample size, and the alpha level.

Power (π) = F (Δ, N, α). Note. Δ, delta, denotes the population effect size.

Let’s take the independent-samples t test as an example. If the null is false (CASE) and the alpha is fixed (say, set to 0.05), the greater the test statistic t is, the more power we would have to reject the null.

$$t = \frac{M_1 - M_2}{\sqrt{\frac{Var_1}{n_1} + \frac{Var_2}{n_2}}}$$

Looking at the right side of the equation, we can see that the greater the numerator (effect size Δ, i.e., mean difference), the greater the t (hence, the power) will be.

Also, the smaller the denominator (sampling error, i.e., standard error) is, the greater the t (hence, the power) will be. That is, the greater the sample size N is, the greater the t (hence, the power) will be.
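To make the dependence of power on Δ, N, and α concrete, here is a rough sketch that approximates the power of the two-tailed test. It swaps in a normal approximation for the t distribution, so the numbers are indicative only. It also assumes, purely for illustration, that the population effect size equals the sample Cohen's d of 0.449 from the UBC/SFU example. `approx_power` is a name introduced here.

```python
import math
from statistics import NormalDist

def approx_power(d, n1, n2, alpha=0.05):
    """Approximate two-tailed power of the independent-samples t test.

    Uses a normal approximation to the t distribution, so it is only a
    rough guide, especially for small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the population effect size is d
    delta = d / math.sqrt(1 / n1 + 1 / n2)
    # Probability of landing in either rejection region
    return z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)

# ASSUMPTION: the population effect size equals the sample Cohen's d (0.449).
for n1, n2 in [(43, 30), (86, 60), (172, 120)]:
    print(n1 + n2, round(approx_power(0.449, n1, n2), 2))
```

Holding the effect size and α fixed, the approximate power rises as the total sample size doubles. Raising the effect size or loosening α has the same direction of effect.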

Page 18:

Issues with the Method of Hypothesis Testing

Sample Size
When the sample size is small, a true mean difference (M1-M2) could go undetected, and the t test would fail to reject the H0.

Conversely, when the sample size is large, a trivially small mean difference could be detected as statistically significant.

To address this issue, researchers are advised to report the magnitude of the difference using effect size measures.

Effect size is a standardized measure because it transforms the magnitude of the difference from the raw score scale to a common scale. Thus, differences found in studies of the same DV but measured on different raw score scales can all be compared because they are all on the SD scale.

Page 19:

Calculating Effect Size
Cohen’s d

A variety of effect size measures have been suggested. Cohen’s d is the most commonly reported.

Cohen’s d for mean difference = (M1-M2)/SDpooled. The SDpooled is the standard deviation pooled over the two groups.

The SPSS drop-down menu does not provide such a measure.

Lab Activity: Hand-calculate the Cohen’s d for the age difference between UBC and SFU.
Answer: d = (26-23.5)/5.571 = 0.449 (see slide #5 for the necessary statistics and slide #9 for calculating the pooled variance, hence the pooled SD).
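The hand calculation can be checked with a few lines of Python. The sketch also shows how d relates to the t statistic under the equal-variances formula: t is d divided by the square root of (1/n1 + 1/n2), so the same d yields a larger t as the samples grow. `cohens_d` is a name introduced here.

```python
import math

def cohens_d(m1, m2, sd_pooled):
    """Mean difference expressed in pooled-SD units."""
    return (m1 - m2) / sd_pooled

n1, n2 = 43, 30                      # sample sizes from the handout
d = cohens_d(26.0, 23.5, 5.571)      # means and pooled SD from the handout
t_from_d = d / math.sqrt(1 / n1 + 1 / n2)

print(round(d, 3))         # 0.449
print(round(t_from_d, 3))  # 1.886
```

The second value matches the t reported in the seven-step procedure. This is a useful consistency check on the pooled SD.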

Page 20:

Issues with the Method of Hypothesis Testing

Testing a Single Value
The hypothesis is made, tested, and concluded on a single value (a point estimate, i.e., the mean difference of 2.5).

It would be more realistic to make a conclusion about the possible range of the population parameter (mean difference), taking the sampling error into account.

This issue is addressed by constructing and reporting a confidence interval.

When α = 0.05, the 95% confidence interval (CI) is constructed to estimate the possible range within which the population parameter may reside. If the sampling were repeated 100 times and a 95% CI were constructed each time, about 95 of those intervals would be expected to contain the population parameter.

Page 21:

Constructing the 95% CI

The confidence interval for the mean difference = Mean difference ± (critical t value x standard error of the mean difference).

Lab Activity:
1. Using the SPSS output in the previous slide, hand-calculate the 95% CI for the mean age difference between UBC and SFU students. Find the critical t value at http://easycalculation.com/statistics/critical-t-test.php
2. Compare your answers to the results of SPSS.
3. Question: How can one determine whether to reject or retain the null hypothesis by only examining the confidence interval?
Answer: Retain the null if the confidence interval includes the hypothesized value.

Page 22:

Constructing the 99% CI

Lab Activity:
1. Using the SPSS output, hand-calculate the 99% CI for the mean age difference between UBC and SFU students. Find the critical t value at http://easycalculation.com/statistics/critical-t-test.php
2. Compare your answers to the results of SPSS.
3. Question: Which confidence interval is wider, the 95% CI (α = 0.05) or the 99% CI (α = 0.01)? Why?
Answer: The 99% confidence interval is wider because the critical value is greater. Conceptually, one would have more confidence (99%) in a more conservative estimate (wider interval) and less confidence (95%) in a bolder estimate (narrower interval).
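The 95% and 99% intervals can also be sketched in Python as a cross-check on the hand calculation. The code below uses only the standard library. It finds the critical t value by bisection on a numerically integrated t CDF, which stands in for the online calculator. The helper names (`t_pdf`, `t_cdf`, `t_critical`, `confidence_interval`) are introduced for this sketch. The mean difference, standard error, and df are the handout's values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-tailed critical t value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_interval(mean_diff, se, df, confidence):
    """Mean difference plus or minus critical t times the standard error."""
    margin = t_critical(confidence, df) * se
    return mean_diff - margin, mean_diff + margin

ci95 = confidence_interval(2.5, 1.325, 71, 0.95)
ci99 = confidence_interval(2.5, 1.325, 71, 0.99)
print([round(v, 2) for v in ci95])
print([round(v, 2) for v in ci99])
```

Both intervals contain 0, in line with retaining the null in the two-tailed test. The 99% interval is the wider of the two because its critical t value is larger.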

Page 23:

Assumptions of Independent-Samples t Test

The independent-samples t test makes three assumptions about the data. That is, for this method to work well, the data should meet the assumptions “reasonably” well.

1. Independent Observations
2. Normal Distribution
3. Equal Variances (homogeneity)

Page 24:

Assumptions of Independent-Samples t Test
Independent Observations

The observations (scores across the subjects) within each group are independent of one another. That is, the score of one individual (e.g., age) within a group is not influenced by that of another. This assumption should be checked by examining how the scores were collected. For example, if some of the scores within a group are more similar to one another because of the time or location at which they were collected, then the assumption is violated.

A typical example of a violation of independent observations arises when individual students’ academic achievement data are collected by sampling schools. Students’ scores are influenced by the fact that they share the same teachers and principal, and the same school climate. Students’ achievement may tend to be relatively higher or lower in one school than in another.

If this assumption is violated, a random effects model should be used.

Page 25:

Assumptions of Independent-Samples t Test
Normal Distribution

The independent-samples t test assumes that the data for each group are sampled from a normally distributed population and hence should be fairly normal.

If this assumption is violated, use non-parametric tests.

This assumption could be checked, separately for each group, by the skewness, histogram, boxplot, QQ plots, etc., using the “Analyze” > “Descriptive Statistics” > “Explore” commands of SPSS.

Page 26:

Checking Assumption-Equal Variances (homogeneity)

The group variances are equal. If this assumption is violated, the power is reduced, but the Type I error rate is still robust.

One can check this assumption by eyeballing the SDs or variances of the groups. Alternatively, one can use Levene’s homogeneity test; a non-significant result is consistent with equal variances. This test could be too sensitive if the sample size is large. Levene’s test is automatically output by SPSS for independent-samples t tests.

Page 27:

Checking Assumption-Equal Variances (homogeneity)

When the equal variances assumption is violated, the power is reduced, but the Type-I error rate is still robust. Thus, one can increase sample size or use t-test results that do not assume equal variances.
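A t test that does not assume equal variances is commonly computed as Welch's t test, which is what SPSS reports in its “Equal variances not assumed” row. The sketch below shows the calculation from summary statistics. The means and sample sizes are the handout's, but the two group SDs are hypothetical placeholders, because the group variances are not reproduced in this transcript. `welch_t` is a name introduced here.

```python
import math

def welch_t(m1, var1, n1, m2, var2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a, b = var1 / n1, var2 / n2           # each group's contribution to the SE
    t = (m1 - m2) / math.sqrt(a + b)      # no pooling of the variances
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Means and sample sizes are from the handout.
# The two SDs are HYPOTHETICAL, chosen only to make the example runnable.
sd_ubc, sd_sfu = 6.0, 4.9
t_stat, df = welch_t(26.0, sd_ubc ** 2, 43, 23.5, sd_sfu ** 2, 30)
print(round(t_stat, 3), round(df, 1))
```

The adjusted df is never larger than N-2 and is usually not a whole number. The more the group variances differ, the further it shrinks, which makes the test more conservative.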