Laboratory for Interdisciplinary Statistical Analysis
Anne Ryan, agryan@vt.edu
Virginia Tech
www.lisa.stat.vt.edu
1948: The Statistical Laboratory was founded as a division of the Virginia Agricultural Experiment Station to
help agronomists design experiments and calculate sums of squares.
1949: Based on the success of the Statistical Laboratory, the Department of Statistics at Virginia Polytechnic
Institute (VPI) was founded—the 3rd oldest statistics department in the United States.
1973: The Statistical Laboratory was re-formed as the Statistical Consulting Center to assist with statistical
analyses in every college of Virginia Polytechnic Institute & State University (VPI&SU).
2007: The Graduate Student Assembly led a movement to save statistical consulting and collaboration from
death by budget cuts, ensuring that graduate students could receive help with their research.
The College of Science, Provost, Vice President of Research, Graduate School, and six additional colleges agreed that researchers should be able to receive free
statistical consulting and collaboration.
2008: The Statistical Consulting Center was re-organized as the Laboratory for Interdisciplinary
Statistical Analysis (LISA) to collaborate with researchers across the Virginia Tech (VT) campuses.
Established in 2008
Year  Clients  Hours
2000  299      1368
2001  293      1938
2002  321      2220
2003  304      2192
2004  274      1775
2005  211      495
2006  171      541
2007  190      965
2008  895      2184
2009  719      3093
2010  1124     4420
[Figure: Clients per year, 2000-2010]
[Figure: Hours per year, 2000-2010]
LISA helps VT researchers benefit
from the use of Statistics
Experimental Design • Data Analysis • Interpreting Results • Grant Proposals • Software (R, SAS, JMP, SPSS...)
Our goal is to improve the quality of research and the use of statistics at
Virginia Tech.
Collaboration
LISA statisticians meet with faculty, staff, and graduate students to understand their research and think of ways to help them using statistics.
Walk-In Consulting
Every day from 1-3 PM, clients get answers to their (quick) questions about using statistics in their research.
Short Courses
Short Courses are designed to teach graduate students how to apply statistics in their research.
All services are FREE for VT researchers. We assist with research—not class projects or homework.
How can LISA help?
• Formulate research question.
• Screen data for integrity and unusual observations.
• Implement graphical techniques to showcase the data – what is the story?
• Develop and implement an analysis plan to address research question.
• Help interpret results.
• Communicate! Help with writing the report or giving the talk.
• Identify future research directions.
To request a collaboration meeting go to
www.lisa.stat.vt.edu
1. Sign in to the website using your VT PID and password.
2. Enter your information (email address, college, etc.)
3. Describe your project (project title, research goals, specific research questions, if you have already collected data, special requests, etc.)
4. Wait 0-3 days, then contact the LISA collaborators assigned to your project to schedule an initial meeting.
Introduction to R
• R is a free software environment for statistical computing and graphics. Download: http://www.r-project.org/
• Topics Covered:
• Data objects in R, loops, import/export datasets, data manipulation
• Graphing
• Basic Analyses: T-tests, Regression, ANOVA
Linear Regression & Structural Equation Modeling
• Linear regression is used to model the relationship between a continuous response and a continuous predictor.
• SEM is a modeling technique that investigates causal relationships among variables.
• Time-related latent variables, modification indices and critical ratio in exploratory analyses, and computation of implied moments, factor score weights, total effects, and indirect effects.
Generalized Linear Models
• Modeling technique for situations where the errors are not necessarily normal.
• Can handle situations where you have binary responses, counts, etc.
• Uses a link function to relate the response to the linear model.
• Cover: Basic statistical concepts of GLM and how it relates to regression using normal errors.
Mixed Models and Random Effects
• Mixed Model: A statistical model that has both random effects and fixed effects.
• Fixed Effect: Levels of the factor are predetermined.
• Random Effect: Levels of the factor were chosen at random.
• The primary focus of the course will be to identify scenarios where a mixed model approach will be appropriate. The concepts will be explained almost wholly through examples in SAS or in R.
T-Tests and Analysis of Variance
Anne Ryan
Criminal Trial
Defense: Represents the accused (defendant)
Prosecution: Holds the “Burden of Proof”, the obligation to shift the assumed conclusion from an oppositional opinion to one’s own position through evidence
What’s the Assumed Conclusion?
ANSWER: The accused is innocent until proven guilty.
• Prosecution must convince the judge/jury that the defendant is guilty beyond a reasonable doubt
Similarities between Criminal Trials and Hypothesis Testing
Burden of Proof: Obligation to shift the conclusion using evidence
Trial: Innocent until proven guilty
Hypothesis Test: Accept the status quo (what is believed before) until the data suggest otherwise
Similarities between Criminal Trials and Hypothesis Testing
Decision Criteria
Trial: Evidence has to be convincing beyond a reasonable doubt
Hypothesis Test: Occurs by chance less than 100α% of the time (e.g., 5%)
Introduction to Hypothesis Testing
Hypothesis Test: Procedure for examining a claim about the value of a parameter
◦ i.e., a parameter such as the population mean μ
Hypothesis tests are very methodical with several key pieces.
Steps in a Hypothesis Test
1. Test
2. Assumptions
3. Hypotheses
4. Mechanics
5. Conclusion
1. Test
State the name of the testing method to be used
It is important to not be off track in the very beginning
Hypothesis Tests we will Perform:
◦ One Sample t test for μ
◦ Two sample t test for μ
◦ Paired t test
◦ ANOVA
2. Assumptions
List all the assumptions required for your test to be valid.
All tests have assumptions
Even if assumptions are not met you should still comment on how this affects your results.
3. Hypotheses
State the hypothesis of interest
There are two hypotheses
◦ Null Hypothesis: Denoted H0
◦ Alternative Hypothesis: Denoted Ha or H1
Examples of possible hypotheses:
H0: μ = 13 vs. Ha: μ ≠ 13
3. Hypotheses Continued
For hypothesis testing there are three popular versions of testing
◦ Left Tailed Hypothesis Test
◦ Right Tailed Hypothesis Test
◦ Two Tailed or Two Sided Hypothesis Test
3. Hypotheses Continued
1. Left Tailed Hypothesis Test: Researchers are only interested in whether the true value is below the hypothesized value.
e.g., H0: μ = μ0 vs. Ha: μ < μ0
2. Right Tailed Hypothesis Test: Researchers are only interested in whether the true value is above the hypothesized value.
e.g., H0: μ = μ0 vs. Ha: μ > μ0
3. Hypotheses Continued
3. Two Tailed or Two Sided Hypothesis Test: The researcher is interested in looking above and below the hypothesized value.
e.g., H0: μ = μ0 vs. Ha: μ ≠ μ0
3. Hypotheses Continued
Three Requirements for Stating Hypotheses:
1. Two complementary hypotheses.
2. A parameter about which the test is to be based, e.g., μ
3. Hypothesized value for the parameter, denoted μ0, which generally takes on numeric values in practice
4. Mechanics
Computational Part of the Test
What is part of the Mechanics step?
◦ Stating the Significance Level
◦ Finding the Rejection Rule
◦ Computing the Test Statistic
◦ Computing the p-value
4. Mechanics Continued
Significance Level: Here we choose a value to use as the significance level, which is the level at which we are willing to start rejecting the null hypothesis.
Denoted by α
Default value is α = .05; use α = .05 unless otherwise noted!
4. Mechanics Continued
Rejection Rule: State our criteria for rejecting the null hypothesis.
◦ “Reject the null hypothesis if p-value < .05”.
p-value: The probability, assuming the null hypothesis is true, of obtaining a point estimate at least as “extreme” as the observed value, where the definition of “extreme” is taken from the alternative hypothesis.
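A minimal sketch of how the alternative hypothesis defines “extreme” and how the rejection rule is applied. It assumes a standardized test statistic compared against a standard normal reference distribution, which is only a large-sample stand-in for the t distribution used by the tests in this course. The function names `p_value_from_z` and `decide` are illustrative, not from the slides.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two-sided"):
    """p-value for a standardized statistic z under a standard normal reference.

    Assumption: a normal reference is used for illustration only; the t tests
    in these slides would use a t distribution instead.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":         # Ha: parameter < hypothesized value
        return std_normal.cdf(z)
    if tail == "right":        # Ha: parameter > hypothesized value
        return 1.0 - std_normal.cdf(z)
    if tail == "two-sided":    # Ha: parameter != hypothesized value
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two-sided'")


def decide(p_value, alpha=0.05):
    """Apply the rejection rule: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"
```

The same statistic can lead to different p-values depending on the tail, which is why the hypotheses must be stated before the mechanics step.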
4. Mechanics Continued
Test Statistic: Compute the test statistic, which is usually a standardization of your point estimate.
Translates your point estimate, a statistic, to follow a known distribution so that it can be used for a test.
4. Mechanics Continued
p-value: After computing the test statistic, you can compute the p-value.
Use software to compute p-values.
5. Conclusion
Conclusion: Last step of the hypothesis test, just as it is the last step when computing confidence intervals.
Conclusions should always include:
◦ Decision: reject or fail to reject
◦ Linkage: why you made the decision (interpret p-value)
◦ Context: what your decision means in context of the problem.
5. Conclusion
Note: Your decision can only be one of two choices:
1. Reject H0 (data give strong indication that Ha is more likely)
2. Fail to Reject H0 (data give no strong indication that Ha is more likely)
When conducting hypothesis tests, we assume that H0 is true; therefore the decision CANNOT be to accept the null hypothesis.
One Sample T-Test
Used to test whether the population mean is different from a specified value.
Example: Is the mean height of 12-year-old girls greater than 60 inches?
Step 1: Formulate the Hypotheses
The population mean is not equal to a specified value.
Null Hypothesis, H0: μ = μ0
Alternative Hypothesis, Ha: μ ≠ μ0
The population mean is greater than a specified value.
H0: μ = μ0
Ha: μ > μ0
The population mean is less than a specified value.
H0: μ = μ0
Ha: μ < μ0
Step 2: Check the Assumptions
The sample is random.
The population from which the sample is drawn is either normal or the sample size is large.
Steps 3-5
Step 3: Calculate the test statistic:
$$ t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}} $$
where
$$ s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}} $$
Step 4: Calculate the p-value based on the appropriate alternative hypothesis.
Step 5: Write a conclusion.
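The Step 3 calculation can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions: the function name `one_sample_t` and the sample values are made up for the example, and the p-value in Step 4 would still come from statistical software such as JMP or R.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t test of H0: mu = mu0."""
    n = len(sample)
    y_bar = mean(sample)   # sample mean
    s = stdev(sample)      # sample standard deviation (n - 1 denominator)
    t = (y_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical sepal widths in cm (made-up values, not the iris data)
widths = [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4]
t_stat, df = one_sample_t(widths, mu0=3.5)
```

The statistic is then compared with a t distribution with n - 1 degrees of freedom.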
Iris Example
A researcher would like to know whether the mean sepal width of a variety of irises is different from 3.5 cm. Use α = 0.05.
The researcher randomly selects 50 irises and measures the sepal width.
Step 1: Hypotheses
H0: μ = 3.5 cm
Ha: μ ≠ 3.5 cm
http://en.wikipedia.org/wiki/Iris_flower_data_set
JMP Steps 2-4:
JMP Demonstration
Analyze → Distribution
Y, Columns: Sepal Width
Normal Quantile Plot
Test Mean
Specify Hypothesized Mean: 3.5
JMP Output
Step 5 Conclusion: Fail to reject H0 since the p-value = 0.1854 is greater than 0.05. There is not sufficient sample evidence to indicate that the mean sepal width is different from 3.5 cm.
Two Sample T-Test
Two sample t-tests are used to determine whether the population mean of one group is equal to, larger than, or smaller than the population mean of another group.
Example: Is the mean cholesterol of people taking drug A lower than the mean cholesterol of people taking drug B?
Step 1: Formulate the Hypotheses
The population means of the two groups are not equal.
H0: μ1 = μ2
Ha: μ1 ≠ μ2
The population mean of group 1 is greater than the population mean of group 2.
H0: μ1 = μ2
Ha: μ1 > μ2
The population mean of group 1 is less than the population mean of group 2.
H0: μ1 = μ2
Ha: μ1 < μ2
Step 2: Check the Assumptions
The two samples are random and independent.
The populations from which the samples are drawn are either normal or the sample sizes are large.
The populations have the same standard deviation.
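Steps 3-5
Step 3: Calculate the test statistic
$$ t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$
where
$$ s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} $$
Step 4: Calculate the appropriate p-value.
Step 5: Write a Conclusion.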
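A short sketch of the pooled two-sample calculation, assuming equal population standard deviations as required in Step 2. The function name `pooled_two_sample_t` and the data are illustrative only; the p-value would come from a t distribution with n1 + n2 - 2 degrees of freedom via software.

```python
import math
from statistics import mean, variance


def pooled_two_sample_t(group1, group2):
    """Return (t, degrees of freedom) for the pooled two-sample t test."""
    n1, n2 = len(group1), len(group2)
    s1_sq = variance(group1)  # sample variance of group 1 (n1 - 1 denominator)
    s2_sq = variance(group2)  # sample variance of group 2 (n2 - 1 denominator)
    # Pooled standard deviation: weighted average of the two sample variances
    s_p = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    t = (mean(group1) - mean(group2)) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Made-up measurements for two groups (not the iris data)
t_stat, df = pooled_two_sample_t([3.5, 3.0, 3.2, 3.6], [2.8, 2.7, 3.0, 2.5])
```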
Two Sample Example
A researcher would like to know whether the mean sepal width of setosa irises is different from the mean sepal width of versicolor irises.
The researcher randomly selects 50 setosa irises and 50 versicolor irises and measures their sepal widths.
Step 1 Hypotheses:
H0: μsetosa = μversicolor
Ha: μsetosa ≠ μversicolor
http://en.wikipedia.org/wiki/Iris_flower_data_set
http://en.wikipedia.org/wiki/Iris_versicolor
JMP Steps 2-4:
JMP Demonstration:
Analyze → Fit Y By X
Y, Response: Sepal Width
X, Factor: Species
Means/ANOVA/Pooled t
Normal Quantile Plot → Plot Actual by Quantile
JMP Output
Step 5 Conclusion: There is strong evidence (p-value < 0.0001) that the mean sepal widths for the two varieties are different.
[Figure: Normal quantile plots of sepal width for setosa and versicolor]
Paired T-Test
The paired t-test is used to compare the population means of two groups when the samples are dependent.
Example: A researcher would like to determine if background noise causes people to take longer to complete math problems. The researcher gives 20 subjects two math tests, one with complete silence and one with background noise, and records the time each subject takes to complete each test.
Step 1: Formulate the Hypotheses
The population mean difference is not equal to zero.
H0: μdifference = 0
Ha: μdifference ≠ 0
The population mean difference is greater than zero.
H0: μdifference = 0
Ha: μdifference > 0
The population mean difference is less than zero.
H0: μdifference = 0
Ha: μdifference < 0
Step 2: Check the assumptions
The sample is random.
The data are matched pairs.
The differences have a normal distribution or the sample size is large.
Steps 3-5
Step 3: Calculate the test statistic:
$$ t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} $$
where $\bar{d}$ is the mean of the differences and $s_d$ is the standard deviation of the differences.
Step 4: Calculate the p-value.
Step 5: Write a conclusion.
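The paired statistic is a one-sample t statistic computed on the within-pair differences. A minimal sketch follows; the function name `paired_t` and the measurements are made up for illustration, and differences are taken as after minus before.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired t test of H0: mean difference = 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]  # after minus before
    n = len(diffs)
    d_bar = mean(diffs)   # mean of the differences
    s_d = stdev(diffs)    # standard deviation of the differences
    t = (d_bar - 0) / (s_d / math.sqrt(n))
    return t, n - 1


# Hypothetical flexibility scores in inches (made-up values)
t_stat, df = paired_t(before=[18.5, 21.5, 16.5, 21.0], after=[19.0, 21.5, 17.5, 21.5])
```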
Paired T-Test Example
A researcher would like to determine whether a fitness program increases flexibility. The researcher measures the flexibility (in inches) of 12 randomly selected participants before and after the fitness program.
Step 1: Formulate a Hypothesis
H0: μAfter - Before = 0
Ha: μAfter - Before > 0
Paired T-Test Example Steps 2-4:
JMP Analysis:
Create a new column of After – Before
Analyze → Distribution
Y, Columns: After – Before
Normal Quantile Plot
Test Mean
Specify Hypothesized Mean: 0
JMP Output
Step 5 Conclusion: There is not sufficient evidence that the fitness program increases flexibility.
One-Way Analysis of Variance
One-Way ANOVA
ANOVA is used to determine whether three or more populations have different means.
[Figure: Responses for medical treatments A, B, and C]
ANOVA Strategy
The first step is to use the ANOVA F test to determine if there are any significant differences among the population means.
If the ANOVA F test shows that the population means are not all the same, then follow-up tests can be performed to see which pairs of population means differ.
One-Way ANOVA Model
$$y_{ij} = \mu_i + \varepsilon_{ij}$$
Where
$y_{ij}$ is the response of the jth trial on the ith factor level
$\mu_i$ is the mean of the ith group
$\varepsilon_{ij} \sim N(0, \sigma^2)$
$i = 1, \ldots, r$
$j = 1, \ldots, n_i$
In other words, for each group the observed value is the group mean plus some random variation.
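To make the model concrete, here is a small simulation sketch in Python (standard library only). The group means, group sizes, and sigma are hypothetical settings chosen for illustration, and `simulate_one_way` is a made-up helper name. Each observation is drawn as its group mean plus normal noise with a common standard deviation.

```python
import random
import statistics

def simulate_one_way(group_means, n_per_group, sigma, seed=0):
    """Draw y_ij = mu_i + eps_ij with eps_ij ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [
        [mu + rng.gauss(0.0, sigma) for _ in range(n)]
        for mu, n in zip(group_means, n_per_group)
    ]

# Hypothetical settings: r = 3 groups with 20 observations each.
samples = simulate_one_way(group_means=[60.0, 62.0, 66.0],
                           n_per_group=[20, 20, 20], sigma=5.0)

for i, ys in enumerate(samples, start=1):
    print(f"group {i}: n = {len(ys)}, sample mean = {statistics.mean(ys):.2f}")
```

The sample means will be close to, but not exactly, the chosen group means; that gap is the random variation the model describes.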
One-Way ANOVA Hypothesis
Step 1: We test whether there is a difference in the population means.
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_r$$
$$H_a: \text{The } \mu_i \text{ are not all equal.}$$
Step 2: Check ANOVA Assumptions
The samples are random and independent of each other.
The populations are normally distributed.
The populations all have the same standard deviation.
The ANOVA F test is fairly robust to moderate departures from normality and from equal standard deviations.
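One quick way to look at the equal-standard-deviation assumption is to compare the sample standard deviations of the groups. The sketch below is a minimal Python illustration with made-up pain scores (not the slide data). The cutoff mentioned in the comment, a largest-to-smallest ratio below about 2, is a common textbook rule of thumb and an assumption here; it does not come from the slides.

```python
import statistics

def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [statistics.stdev(g) for g in groups]
    return max(sds) / min(sds)

# Hypothetical pain scores for three treatment groups (not the slide data).
groups = {
    "Drug A": [52, 55, 58, 60, 61],
    "Drug B": [57, 59, 63, 64, 66],
    "Drug C": [65, 68, 70, 71, 74],
}

for name, values in groups.items():
    print(f"{name}: mean = {statistics.mean(values):.1f}, "
          f"SD = {statistics.stdev(values):.2f}")

ratio = sd_ratio(list(groups.values()))
# Rule of thumb (assumed, not from the slides): a ratio below about 2 is usually acceptable.
print(f"largest SD / smallest SD = {ratio:.2f}")
```

Normality is still best judged graphically, for example with the normal quantile plot used in the JMP demonstrations.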
Step 3: ANOVA F Test
Compare the variation within the samples to the variation between the samples.
[Figure: two side-by-side comparisons of treatments A, B, and C; axis label: Medical Treatment]
ANOVA Test Statistic
$$F = \frac{\text{Variation between groups}}{\text{Variation within groups}} = \frac{\text{MSG}}{\text{MSE}}$$
Variation within groups small compared with variation between groups → Large F
Variation within groups large compared with variation between groups → Small F
MSG
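$$\text{MSG} = \frac{\text{SSG}}{r - 1} = \frac{n_1(\bar{y}_1 - \bar{y})^2 + n_2(\bar{y}_2 - \bar{y})^2 + \cdots + n_r(\bar{y}_r - \bar{y})^2}{r - 1}$$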
The mean square for groups, MSG, measures the variability of the sample averages.
SSG stands for the sum of squares for groups.
MSE
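$$\text{MSE} = \frac{\text{SSE}}{n - r} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_r - 1)s_r^2}{n - r}$$
Where
$$s_i^2 = \frac{\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}{n_i - 1}$$
and $n = n_1 + n_2 + \cdots + n_r$ is the total sample size.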
Mean square error, MSE, measures the variability within the groups.
SSE stands for the sum of squares for error.
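The MSG, MSE, and F formulas above can be sketched in a few lines of Python using only the standard library. The function name `one_way_anova` and the tiny data set are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import statistics

def one_way_anova(groups):
    """Return (MSG, MSE, F) for a list of samples, one list per group."""
    r = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    # SSG: group sizes times squared deviations of group means from the grand mean.
    ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: pooled within-group variation, (n_i - 1) * s_i^2 summed over groups.
    sse = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    msg = ssg / (r - 1)
    mse = sse / (n - r)
    return msg, mse, msg / mse

# Hypothetical data: three groups of three observations.
msg, mse, f_stat = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"MSG = {msg:.3f}, MSE = {mse:.3f}, F = {f_stat:.3f}")
```

For this toy data the group means are 2, 3, and 6, so SSG = 26 and SSE = 6, giving MSG = 13, MSE = 1, and F = 13.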
Steps 4-5
Step 4: Calculate the p-value.
Step 5: Write a conclusion.
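Statistical software normally obtains the p-value in Step 4 from the F distribution with r − 1 and n − r degrees of freedom. Python's standard library has no F distribution, so the sketch below swaps in a different technique, a permutation approximation: shuffle the group labels many times and count how often the shuffled F statistic is at least as large as the observed one. The data, the function names, and the number of permutations are all hypothetical choices for illustration.

```python
import random
import statistics

def f_statistic(groups):
    """ANOVA F statistic, MSG / MSE, for a list of samples."""
    r = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    msg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
              for g in groups) / (r - 1)
    mse = sum((len(g) - 1) * statistics.variance(g) for g in groups) / (n - r)
    return msg / mse

def permutation_p_value(groups, n_perm=2000, seed=0):
    """Approximate P(F >= observed F) under H0 by shuffling group labels."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [y for g in groups for y in g]
    sizes = [len(g) for g in groups]
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:                  # re-split into groups of the original sizes
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)       # add-one correction avoids p = 0

# Hypothetical pain scores (all values distinct, so MSE is never zero).
data = [[52, 55, 58, 60, 61], [57, 59, 63, 64, 66], [65, 68, 70, 71, 74]]
p_value = permutation_p_value(data)
print(f"observed F = {f_statistic(data):.2f}, permutation p-value = {p_value:.4f}")
```

A small p-value leads to the Step 5 conclusion that the population means are not all equal; a large one means the data do not provide evidence of a difference.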
ANOVA Example
A researcher would like to determine if three drugs provide the same relief from pain.
60 patients are randomly assigned to a treatment (20 people in each treatment).
Step 1: Formulate the Hypotheses
$H_0: \mu_{\text{Drug A}} = \mu_{\text{Drug B}} = \mu_{\text{Drug C}}$
$H_a$: The $\mu_i$ are not all equal.
Steps 2-4
JMP demonstration
Analyze → Fit Y By X
Y, Response: Pain
X, Factor: Drug
Normal Quantile Plot → Plot Actual by Quantile
Means/ANOVA
JMP Output and Conclusion
Step 5 Conclusion: There is strong evidence that the mean pain levels for the three drugs are not all the same.
[Figure: Pain by Drug (Drug A, Drug B, Drug C) with normal quantile plot]
Follow-Up Test
The p-value of the overall F test indicates that the mean level of pain is not the same for patients taking drugs A, B and C.
We would like to know which pairs of treatments are different.
One method is to use Tukey’s HSD (honestly significant difference).
Tukey Tests
Tukey’s test simultaneously tests
$$H_0: \mu_i = \mu_{i'}$$
$$H_a: \mu_i \neq \mu_{i'}$$
for all pairs of factor levels. Tukey’s HSD controls the overall type I error.
JMP demonstration
Oneway Analysis of Pain By Drug → Compare Means → All Pairs, Tukey HSD
JMP Output
The JMP output shows that drugs A and C are significantly different.
| Level | - Level | Difference | Std Err Dif | Lower CL | Upper CL | p-Value |
|---|---|---|---|---|---|---|
| Drug C | Drug A | 5.850000 | 1.677665 | 1.81283 | 9.887173 | 0.0027* |
| Drug C | Drug B | 3.600000 | 1.677665 | -0.43717 | 7.637173 | 0.0897 |
| Drug B | Drug A | 2.250000 | 1.677665 | -1.78717 | 6.287173 | 0.3786 |
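The confidence limits in the JMP output follow the pattern difference ± multiplier × standard error. The Python sketch below checks that arithmetic against the reported values. The multiplier is backed out from the first row rather than taken from a studentized range table, so it is an inferred value and not something reported on the slide. The reading that an interval excluding zero marks a significant pair assumes the limits are at JMP's default 0.05 level.

```python
# Rows of the JMP Tukey HSD output:
# (level, minus_level, difference, std_err_dif, lower_cl, upper_cl)
rows = [
    ("Drug C", "Drug A", 5.85, 1.677665, 1.81283, 9.887173),
    ("Drug C", "Drug B", 3.60, 1.677665, -0.43717, 7.637173),
    ("Drug B", "Drug A", 2.25, 1.677665, -1.78717, 6.287173),
]

# Critical multiplier inferred from the first row: (upper - difference) / std_err.
# This is an assumption backed out of the table, not a value shown in the output.
_, _, d0, se0, _, upper0 = rows[0]
multiplier = (upper0 - d0) / se0

def interval(difference, std_err, multiplier):
    """Confidence limits: difference +/- multiplier * standard error."""
    half_width = multiplier * std_err
    return difference - half_width, difference + half_width

results = []
for level, minus_level, diff, se, lower, upper in rows:
    lo, hi = interval(diff, se, multiplier)
    excludes_zero = lo > 0 or hi < 0   # interval excluding 0 suggests a real difference
    results.append((level, minus_level, lo, hi, excludes_zero))
    print(f"{level} - {minus_level}: ({lo:.5f}, {hi:.5f}) excludes zero: {excludes_zero}")
```

Only the Drug C versus Drug A interval excludes zero, which agrees with the p-value column, where that pair alone is flagged.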
Two-Way Analysis of Variance
Two-Way ANOVA
We are interested in the effect of two categorical factors on the response.
We are interested in whether either of the two factors has an effect on the response and whether there is an interaction effect.
◦ An interaction effect means that the effect on the response of one factor depends on the level of the other factor.
Interaction
[Figure: two panels, "No Interaction" and "Interaction", each plotting Improvement against Dosage (Low, High) for Drug A and Drug B]
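For a 2 × 2 layout like the drug-by-dosage picture, interaction can be checked numerically as a difference of differences: the dosage effect for one drug minus the dosage effect for the other. The Python sketch below uses hypothetical mean improvement scores, and `interaction_contrast` is a made-up helper name.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2x2 table of cell means.

    cell_means[drug][dosage] is the mean response. A result of zero means
    the dosage effect is the same for both drugs (no interaction).
    """
    effect_a = cell_means["Drug A"]["High"] - cell_means["Drug A"]["Low"]
    effect_b = cell_means["Drug B"]["High"] - cell_means["Drug B"]["Low"]
    return effect_b - effect_a

# Hypothetical mean improvement scores (illustrative only).
no_interaction = {
    "Drug A": {"Low": 10.0, "High": 14.0},
    "Drug B": {"Low": 12.0, "High": 16.0},
}
with_interaction = {
    "Drug A": {"Low": 10.0, "High": 14.0},
    "Drug B": {"Low": 12.0, "High": 22.0},
}

print("no interaction:  ", interaction_contrast(no_interaction))
print("with interaction:", interaction_contrast(with_interaction))
```

In the first table both drugs improve by 4 from low to high dosage, so the lines in an interaction plot would be parallel. In the second, Drug B improves by 10 and Drug A by 4, so the lines would not be parallel.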
Two-Way ANOVA Model
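$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
Where
$y_{ijk}$ is the response of the kth trial on the ith factor A level and the jth factor B level
$\mu$ is the overall mean
$\alpha_i$ is the main effect of the ith level of factor A
$\beta_j$ is the main effect of the jth level of factor B
$(\alpha\beta)_{ij}$ is the interaction effect of the ith level of factor A and the jth level of factor B
$\varepsilon_{ijk} \sim N(0, \sigma^2)$
$i = 1, \ldots, a$
$j = 1, \ldots, b$
$k = 1, \ldots, n_{ij}$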
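As an illustration of how the pieces of the two-way model fit together, the Python sketch below splits a table of cell means into an overall mean, main effects, and interaction effects. It assumes a balanced design and the usual sum-to-zero constraints on the effects, neither of which is stated on the slide. The cell means and the function name `decompose` are hypothetical.

```python
def decompose(cell_means):
    """Split an a x b table of cell means into overall mean, main effects, interaction.

    Assumes a balanced design with sum-to-zero constraints on the effects.
    """
    a, b = len(cell_means), len(cell_means[0])
    mu = sum(sum(row) for row in cell_means) / (a * b)
    # Main effect of factor A: row mean minus overall mean.
    alpha = [sum(row) / b - mu for row in cell_means]
    # Main effect of factor B: column mean minus overall mean.
    beta = [sum(cell_means[i][j] for i in range(a)) / a - mu for j in range(b)]
    # Interaction: what is left after removing the overall mean and main effects.
    interaction = [
        [cell_means[i][j] - mu - alpha[i] - beta[j] for j in range(b)]
        for i in range(a)
    ]
    return mu, alpha, beta, interaction

# Hypothetical cell means: rows are 2 levels of factor A, columns are 3 levels of factor B.
cell_means = [[10.0, 12.0, 14.0],
              [11.0, 15.0, 22.0]]

mu, alpha, beta, interaction = decompose(cell_means)
print("overall mean:", mu)
print("factor A effects:", alpha)
print("factor B effects:", beta)
print("interaction effects:", interaction)
```

If every interaction term were zero, each cell mean would equal the overall mean plus its two main effects, which is the no-interaction case.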
Two-Way ANOVA Example
We would like to determine the effect of two alloys (low, high) and three cooling temperatures (low, medium, high) on the strength of a wire.
JMP demonstration
Analyze → Fit Model
Y: Strength
Highlight Alloy and Temp and click Macros → Factorial to Degree
Run Model
JMP Output
Conclusion: There is strong evidence of an interaction between alloy and temperature.
Conclusion
The one-sample t-test allows us to test whether the population mean of a group is equal to a specified value.
The two-sample t-test and paired t-test allow us to determine if the population means of two groups are different.
ANOVA allows us to determine whether the population means of several groups are different.
SAS, SPSS and R
For information about using SAS, SPSS and R to do ANOVA:
http://www.ats.ucla.edu/stat/sas/topics/anova.htm
http://www.ats.ucla.edu/stat/spss/topics/anova.htm
http://www.ats.ucla.edu/stat/r/sk/books_pra.htm
References
Fisher’s Irises Data (used in one-sample and two-sample t-test examples).
Flexibility data (paired t-test example): Michael Sullivan III. Statistics: Informed Decisions Using Data. Upper Saddle River, New Jersey: Pearson Education, 2004: 602.
Special thanks to Jennifer Kensler for course materials and help with JMP!