sp17 lecture notes 7b - inference for a difference in meansvollmer/stat307pdfs/ln7b_2017.pdf ·...

Page 1: SP17 Lecture Notes 7b - Inference for a Difference in Meansvollmer/stat307pdfs/LN7b_2017.pdf · compare blood pressure before being put on a drug to after being on the drug), or studies

Lecture notes 7b: Inference for a difference in means

Outline:

• Hypothesis test for two means using independent samples (example 1)

• CI for two means using independent samples

• “Paired” differences

• Hypothesis test for a paired difference (example 2)

• CI for a paired difference

Inference for a difference between population means

• We have looked at how hypothesis tests and confidence intervals can be used to draw inference on a population mean.

• In practice, it is more common to want to investigate how two means differ from one another – or if they differ at all.

• We will look at how we compare the means from two separate groups (“independent samples”), as well as how we compare the means of two sets of observations taken on the same group (“paired data”).

Hypothesis tests for two means (independent samples)

• We will first look at inferential procedures for comparing means when we have “independent samples”, i.e. two different groups of subjects.

• We’ll start with a hypothesis test, and then look at constructing a corresponding confidence interval.

• The following hypothesis test is usually referred to as a “two sample t-test”.

Hypothesis tests for two means

• This is the formula for the test statistic which compares two sample means to one another:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

• Here, the subscripts refer to populations 1 and 2.

• In the numerator of this formula, $\bar{x}_1 - \bar{x}_2$ is the point estimate for the difference between population means 1 and 2.

Hypothesis tests for two means

• $\mu_1 - \mu_2$ is the hypothesized difference between the means. This is almost always zero, because the null hypothesis is almost always that the means are equal. If we reject this null, it will be in favor of the alternative hypothesis that the means are different.

• When $\mu_1 - \mu_2$ is zero, we can simplify the test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Hypothesis tests for two means

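• The denominator of this equation is the standard error of $\bar{x}_1 - \bar{x}_2$. It combines the standard errors of both $\bar{x}_1$ and $\bar{x}_2$.

• And so this statistic follows the same general formula as the statistic for a one sample t-test:

$$t = \frac{\text{point estimate} - \text{hypothesized value}}{\text{standard error}}$$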
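As a minimal sketch (not part of the original notes), the two sample test statistic can be computed in Python using only the standard library. The function name `two_sample_t`, its parameter `hypothesized_diff`, and the sample values in the usage lines are illustrative assumptions, not anything from the course materials.

```python
import math
import statistics


def two_sample_t(sample1, sample2, hypothesized_diff=0.0):
    """Unpooled two-sample t statistic:
    (point estimate - hypothesized value) / standard error."""
    n1, n2 = len(sample1), len(sample2)
    # Point estimate of the difference between the two population means.
    point_estimate = statistics.mean(sample1) - statistics.mean(sample2)
    # The standard error combines the standard errors of the two sample means.
    standard_error = math.sqrt(
        statistics.variance(sample1) / n1 + statistics.variance(sample2) / n2
    )
    return (point_estimate - hypothesized_diff) / standard_error


# Hypothetical data, for illustration only.
group_a = [5.0, 7.0, 9.0]
group_b = [1.0, 3.0, 5.0]
t_stat = two_sample_t(group_a, group_b)
print(round(t_stat, 3))
```

Leaving `hypothesized_diff` at its default of zero corresponds to the simplified form of the statistic, where the null hypothesis is that the two means are equal.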

Example 1

• A researcher who is studying the relationship between concentration and balance conducts an experiment in which nine elderly subjects and eight young subjects each stand barefoot on a “force platform”, which measures how much a person sways (in millimeters) in the forward/backward and side-to-side directions.

• Subjects are asked to maintain a stable upright position and to react as quickly as possible to a randomly timed noise by pressing a hand-held button.

• The researcher would like to know if there is a difference between the elderly and young with regard to how well they maintain balance in this scenario.

Example 1

Here is the data (sway in millimeters):

Elderly subjects: 14 41 18 11 16 24 18 21 37

Young subjects: 17 10 16 22 12 14 12 18
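The summary statistics needed for the test can be obtained from these data with a short script. This is a sketch using Python's standard `statistics` module; the variable names (`elderly`, `young`, `summary`) are illustrative choices.

```python
import statistics

# Sway measurements (in millimeters) from Example 1.
elderly = [14, 41, 18, 11, 16, 24, 18, 21, 37]
young = [17, 10, 16, 22, 12, 14, 12, 18]

summary = {}
for label, sample in (("elderly", elderly), ("young", young)):
    summary[label] = {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "sd": statistics.stdev(sample),  # sample standard deviation (n - 1)
    }
    print(label, summary[label])

# Point estimate for the difference in population means (elderly minus young).
diff_in_means = summary["elderly"]["mean"] - summary["young"]["mean"]
print("difference in sample means:", round(diff_in_means, 2))
```

The elderly group has both the larger sample mean (about 22.2 mm versus about 15.1 mm) and the larger sample standard deviation (about 10.3 mm versus about 3.9 mm).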

Step 1: the hypotheses

State the null and alternative hypotheses:

Step 2: α and the critical value

• This time we didn’t specify a level of significance. By convention, when no level of significance is specified, we default to α = 0.05.

• As with the one sample test for a mean, the test statistic used for a two sample test follows a t-distribution. So, we will use a t-critical value.

• As it turns out, the formula for finding degrees of freedom for the sampling distribution of this test statistic is pretty involved:

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$

Step 2: α and the critical value

In practice, software can be used to find the degrees of freedom. In our class, you will be given the degrees of freedom for two sample procedures.

In this case, df = 11.

Find the critical value and sketch the sampling distribution of the test statistic under H0:
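As an illustration of how software finds the degrees of freedom, the sketch below applies the Welch–Satterthwaite approximation to the Example 1 data. That this is the formula the notes have in mind is an assumption; it is the standard choice for the unpooled two sample t-test. The function name `welch_df` is hypothetical.

```python
import statistics

# Sway measurements (in millimeters) from Example 1.
elderly = [14, 41, 18, 11, 16, 24, 18, 21, 37]
young = [17, 10, 16, 22, 12, 14, 12, 18]


def welch_df(sample1, sample2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # s1^2 / n1
    v2 = statistics.variance(sample2) / n2  # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


df = welch_df(elderly, young)
print(round(df, 1))
```

The result is about 10.5, which matches the df = 11 given above if the value is rounded to the nearest whole number. For a two-sided test at α = 0.05 with df = 11, a standard t table gives critical values of about ±2.201.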

Step 3: The test statistic and p-value

Compute the test statistic and corresponding p-value:
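One way to check a hand calculation for this step is sketched below. It assumes a two-sided alternative (the researcher asks only whether there is a difference) and uses df = 11 as given in the notes. The helper names `t_pdf` and `two_sided_p_value` are hypothetical, and the p-value is found by numerically integrating the t density with Simpson's rule rather than by calling a statistics package.

```python
import math
import statistics

# Sway measurements (in millimeters) from Example 1.
elderly = [14, 41, 18, 11, 16, 24, 18, 21, 37]
young = [17, 10, 16, 22, 12, 14, 12, 18]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), via Simpson's rule on [0, |t_stat|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return 1 - central_area


standard_error = math.sqrt(
    statistics.variance(elderly) / len(elderly)
    + statistics.variance(young) / len(young)
)
t_stat = (statistics.mean(elderly) - statistics.mean(young)) / standard_error
p_value = two_sided_p_value(t_stat, df=11)
print("t =", round(t_stat, 2), " p-value =", round(p_value, 3))
```

The test statistic comes out at roughly 1.92, and the two-sided p-value falls between 0.05 and 0.10.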

Step 4: The statistical decision

State and interpret the statistical decision:

Example 1 confidence interval

• We can also construct a confidence interval for the true difference in mean millimeters of sway between the elderly and the young.

• Recall the general form of a confidence interval:

CI = point estimate ± margin of error

or CI = point estimate ± (critical value) * (standard error)

Example 1 confidence interval

• Using this general form as a guide, we see that

$$CI = (\bar{x}_1 - \bar{x}_2) \pm t^{*} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Construct a 95% CI for the true difference in mean sway between the young and elderly groups:

Example 1 confidence interval

Do the results of the hypothesis test and confidence interval agree? Why or why not?

A brief aside: pooling variance

• The 2 sample t-test we just conducted involved finding estimates for the standard deviations of our two groups (young and elderly) separately.

• There is a slight variation on this method which involves “pooling” the standard deviations. Usually this is referred to as “pooling the variances” – mathematically this is the same thing, since the standard deviation is the square root of the variance.

• This method assumes that the population standard deviations of the two groups are equal, and so we can estimate one standard deviation that applies to both groups. We will not be using this technique in our class, but in practice it is often used.
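For comparison only, since pooling is not used in this class, here is a sketch of the pooled-variance version of the test statistic applied to the Example 1 data. It uses the standard textbook pooled estimate, in which each sample variance is weighted by its degrees of freedom; the variable names are illustrative.

```python
import math
import statistics

# Sway measurements (in millimeters) from Example 1.
elderly = [14, 41, 18, 11, 16, 24, 18, 21, 37]
young = [17, 10, 16, 22, 12, 14, 12, 18]

n1, n2 = len(elderly), len(young)

# One variance estimate shared by both groups (assumes equal population variances).
pooled_variance = (
    (n1 - 1) * statistics.variance(elderly) + (n2 - 1) * statistics.variance(young)
) / (n1 + n2 - 2)

pooled_se = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
t_pooled = (statistics.mean(elderly) - statistics.mean(young)) / pooled_se
df_pooled = n1 + n2 - 2  # degrees of freedom are simple in the pooled case

print("pooled variance:", round(pooled_variance, 2))
print("t =", round(t_pooled, 3), " df =", df_pooled)
```

Here the pooled statistic (about 1.83 with df = 15) differs somewhat from the unpooled one. The two sample standard deviations in this example are quite different, so the equal-variance assumption is questionable for these data.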

Statistical Tests Involving Paired Data

• Sometimes, when testing for a difference in means, we are able to measure the same subjects twice and test for differences in the two measurements.

• Examples include “before” and “after” type studies (e.g. compare blood pressure before being put on a drug to after being on the drug), or studies where each subject can be measured under two different treatments, or a treatment and a control (e.g. conduct a vision test with your right eye, then with your left eye, and compare the results).

• These “paired” studies have an advantage over studies using independent samples in that there is less natural variability to account for.

• Example: suppose we want to see if beer and wine consumption have different effects on short term memory. We could take a sample of participants, randomly assign them to either the beer or wine group, and then administer a memory test after they’ve consumed some specified quantity of alcohol.

• If we do this, any differences we observe between the groups may be attributable to the beer or wine, but they may also be attributable to many other variables that affect memory (age, genetics, physical health, etc.) and that differ from person to person.

• Using paired observations, i.e. multiple observations on the same subjects, in effect “controls” for these other variables.

• If each participant undergoes a memory test once after consuming beer and once after consuming wine, then we know that any difference between these paired observations will not be attributable to these other variables (age, genetics, physical health, etc.).

• Because of this, the amount of random variability that we normally expect to see in our data will be reduced, which might make it easier to test our research hypotheses.

• Using paired observations can introduce other possible problems, and so this type of test is not always the most appropriate.

• For instance, human subjects may perform differently on a test the 2nd time due to having gone through it once.

• Also, you may not be able to test against placebo in a paired study.

• If conducting a paired study is feasible and appropriate, then it will generally be preferable to a two-sample study because it results in a smaller standard error.

• The formula for a paired t-test is essentially identical to the formula for a t-test for a single mean. The only difference is that the data we use is not raw observations of a variable; rather it is the differences between the paired observations.

• We note that our data takes the form of differences by using the subscript “d” in our notation. Apart from this, the formula is identical to that of a one sample test for a mean:

$$t = \frac{\bar{x}_d - \mu_d}{s_d / \sqrt{n}}$$

Example 2

A physical fitness program is designed to increase a person’s upper body strength. To determine the effectiveness of this program, an SRS of 31 members of a health club was selected and each member was asked to do as many push-ups as possible in 1 minute. After 1 month on the program, the participants were once again asked to do as many push-ups as possible in 1 minute. These values were recorded and the difference (After - Before) was computed.

Example 2

Test to see if the program is effective in increasing upper body strength as defined by the “number of push ups in one minute” metric, using α = 0.05.

Here are summary statistics for the variable “difference”:

Variable     N    Mean   Median   StDev   SE Mean
difference   31   9.17   8.00     8.06    1.45

Note here that “SE Mean” (the standard error of the mean) is found by dividing the standard deviation by the square root of the sample size.
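The "SE Mean" entry can be verified directly from the other summary statistics. This is a quick check using only values reported in the table above; the variable names are illustrative.

```python
import math

n = 31          # number of paired differences
st_dev = 8.06   # standard deviation of the differences

# Standard error of the mean: standard deviation / sqrt(sample size).
se_mean = st_dev / math.sqrt(n)
print(round(se_mean, 2))
```

Rounded to two decimal places, this reproduces the 1.45 shown in the summary table.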

Sample data (Partial Listing)

Subject   Before   After   Difference
1         28       32       4
2         34       32      -2
3         28       42      14
4         60       64       4
5         20       41      21
6         25       33       8
7         32       49      17
8         19       32      13
…         …        …        …

This shows how the paired differences are calculated. For the purposes of this test, our data will be the “difference” column. Note that this isn’t all of the data.
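The sketch below recomputes the differences for the eight subjects listed and, as an illustration of why pairing helps, compares the standard error of the paired differences with the standard error that would result from treating "before" and "after" as two independent samples. Because it uses only the partial listing (8 of the 31 subjects), its numbers will not match the summary statistics for the full sample.

```python
import math
import statistics

# The eight subjects shown in the partial listing.
before = [28, 34, 28, 60, 20, 25, 32, 19]
after = [32, 32, 42, 64, 41, 33, 49, 32]

# Paired differences, computed as After - Before for each subject.
differences = [a - b for a, b in zip(after, before)]
print(differences)

n = len(differences)

# Standard error used by the paired procedure: s_d / sqrt(n).
se_paired = statistics.stdev(differences) / math.sqrt(n)

# Standard error if the two columns were (incorrectly) treated as independent samples.
se_independent = math.sqrt(
    statistics.variance(before) / n + statistics.variance(after) / n
)

print("paired SE:", round(se_paired, 2))
print("independent-samples SE:", round(se_independent, 2))
```

For these eight subjects the paired standard error is less than half the independent-samples one. Much of the variability in the raw counts comes from differences between people (subject 4, for example, does far more push-ups than the others both times), and pairing removes it.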

Step 1: The Hypotheses

Step 2: α and the critical value

Step 3: test statistic and p-value
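A sketch of the calculation for this step, working from the summary statistics for the "difference" variable. It assumes the hypotheses are H0: μ_d = 0 against the one-sided alternative μ_d > 0, since the question is whether the program increases the number of push-ups. The critical value 1.697 is taken from a standard t table (upper-tail area 0.05, df = 30).

```python
import math

# Summary statistics for the "difference" variable (After - Before).
n = 31
mean_diff = 9.17
sd_diff = 8.06
mu_d_null = 0.0  # hypothesized mean difference under H0 (assumed)

standard_error = sd_diff / math.sqrt(n)
t_stat = (mean_diff - mu_d_null) / standard_error
df = n - 1

T_CRIT_ONE_SIDED = 1.697  # t table: upper-tail area 0.05, df = 30
reject_null = t_stat > T_CRIT_ONE_SIDED

print("t =", round(t_stat, 2), " df =", df, " reject H0:", reject_null)
```

The test statistic is roughly 6.33, far beyond the critical value, so the corresponding one-sided p-value is well below 0.05.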

Step 4: the statistical decision

Example 2 confidence interval

We can also construct a CI for the true mean difference in upper body strength, before and after the program:

Example 2 confidence interval

Finally, we can interpret this confidence interval, and note how it relates to the hypothesis test we conducted:

Conclusion

• Hopefully the basic process of performing a hypothesis test and constructing a confidence interval has become familiar.

• The statistical techniques that we study throughout the remainder of the class will always involve a hypothesis test, confidence interval, or both. These are the “bread and butter” of statistical inference.

• In the next set of notes, we will consider the assumptions that underlie these procedures, as well as some areas of controversy in statistical inference.