Spotting pseudoreplication
1. Inspect spatial (temporal) layout of the experiment
2. Examine degrees of freedom in analysis
Degrees of freedom (df)
Number of independent terms used to estimate the parameter
= Total number of datapoints – number of parameters estimated from data
Example: Variance
If we have 3 data points with a mean value of 10, what’s the df for the variance estimate?
Independent term method:
Can the first data point be any number? Yes, say 8
Can the second data point be any number? Yes, say 12
Can the third data point be any number? No, as the mean is fixed! (It must be 10.)
Variance is $\sum (y - \text{mean})^2 / (n - 1)$
Therefore 2 independent terms (df = 2)
Subtraction method:
Total number of data points? 3
Number of estimates from the data? 1 (the mean)
df = 3 - 1 = 2
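The two methods can be checked numerically. The sketch below is only an illustration in Python; the variable names (`free_points`, `last_point`) are made up here and are not part of the slides. It shows that once the mean is fixed at 10 and two points are chosen freely, the third is forced.

```python
import statistics

n = 3
fixed_mean = 10
free_points = [8, 12]  # the two values we are free to choose

# The last point is forced by the fixed mean: sum of all points = n * mean
last_point = n * fixed_mean - sum(free_points)
data = free_points + [last_point]

# Subtraction method: data points minus parameters estimated (the mean)
df = n - 1

# statistics.variance divides the sum of squares by n - 1
sample_variance = statistics.variance(data)

print(last_point, df, sample_variance)
```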
Example: Linear regression
Y = mx + b
Therefore 2 parameters (slope m and intercept b) estimated simultaneously
(df = n-2)
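As a rough illustration of where n - 2 is used, the sketch below fits Y = mx + b by least squares and divides the residual sum of squares by the residual df. The data values are invented for the example, and the variable names are assumptions, not anything from the slides.

```python
# Hypothetical data, invented for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares estimates of the two parameters in Y = mx + b
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
m = sxy / sxx
b = y_mean - m * x_mean

# Two parameters (m and b) are estimated from the data
residual_df = n - 2

# Residual variance: residual sum of squares divided by residual df
residual_ss = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
residual_variance = residual_ss / residual_df

print(m, b, residual_df, residual_variance)
```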
Example: Analysis of variance (ANOVA)
A    B    C
a1   b1   c1
a2   b2   c2
a3   b3   c3
a4   b4   c4
What is n for each level? n = 4
How many df for each variance estimate? df = 3 for each of A, B and C
What’s the within-treatment df for an ANOVA? Within-treatment df = 3 + 3 + 3 = 9
If an ANOVA has k levels and n data points per level, what’s a simple formula for within-treatment df? df = k(n-1)
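A minimal sketch of the same bookkeeping in Python. The helper name `within_treatment_df` and the numbers standing in for a1..c4 are placeholders chosen for this example; only the group sizes matter.

```python
def within_treatment_df(groups):
    """Add up (n - 1) over every treatment level."""
    return sum(len(values) - 1 for values in groups.values())

# Placeholder observations standing in for a1..a4, b1..b4, c1..c4
groups = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 3.0, 4.0, 5.0],
    "C": [3.0, 4.0, 5.0, 6.0],
}

k = len(groups)  # number of levels
n = 4            # data points per level

df_by_summing = within_treatment_df(groups)  # 3 + 3 + 3
df_by_formula = k * (n - 1)                  # k(n - 1)

print(df_by_summing, df_by_formula)
```

Summing level by level also works when the levels have unequal n, where the k(n-1) shortcut does not apply.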
Spotting pseudoreplication
An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot.
The researcher reports df=98 for the ANOVA (within-treatment MS).
Is there pseudoreplication?
Yes! The plot is the replicate, so k=2 and n=10, and df = 2(10-1) = 18
What mistake did the researcher make?
Assumed n=50 (every plant treated as a replicate): 2(50-1)=98
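The df check can be written as a few lines of Python. This is just a sketch of the arithmetic on the slide; the function and variable names are invented for the example.

```python
def within_df(k, n):
    """Within-treatment df for k levels with n replicates per level."""
    return k * (n - 1)

treatments = 2            # fertilized, unfertilized
plots_per_treatment = 10  # the true replicates
plants_per_plot = 5       # subsamples within each plot
reported_df = 98

# Correct: the plot is the experimental unit
correct_df = within_df(treatments, plots_per_treatment)

# Pseudoreplicated: every plant counted as a replicate
plants_per_treatment = plots_per_treatment * plants_per_plot
pseudoreplicated_df = within_df(treatments, plants_per_treatment)

looks_pseudoreplicated = reported_df > correct_df

print(correct_df, pseudoreplicated_df, looks_pseudoreplicated)
```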
Why is pseudoreplication a problem?
Hint: think about what we use df for!
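One way to explore the hint is a small simulation, sketched below under assumptions that are not in the slides: there is no treatment effect at all, plots differ from one another (plot effect SD = 1), and plants vary within plots (SD = 1). With two treatments the ANOVA is equivalent to a two-sample t-test, so the sketch uses a pooled t statistic and the standard two-sided 5% critical values from t tables (1.984 for df = 98, 2.101 for df = 18). All function names are made up for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sample_var(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def pooled_t(a, b):
    """Two-sample t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_var(a) + (nb - 1) * sample_var(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

def simulate_treatment(rng, plots=10, plants=5):
    """Plots of plants with NO treatment effect; plants share their plot's effect."""
    result = []
    for _ in range(plots):
        plot_effect = rng.gauss(0, 1)  # assumed plot-to-plot SD of 1
        result.append([plot_effect + rng.gauss(0, 1) for _ in range(plants)])
    return result

T_CRIT_DF98 = 1.984  # two-sided 5% critical t, df = 98
T_CRIT_DF18 = 2.101  # two-sided 5% critical t, df = 18

rng = random.Random(1)
runs = 2000
false_pos_plants = 0
false_pos_plots = 0

for _ in range(runs):
    fert = simulate_treatment(rng)
    unfert = simulate_treatment(rng)

    # Pseudoreplicated analysis: every plant is treated as a replicate
    plants_a = [v for plot in fert for v in plot]
    plants_b = [v for plot in unfert for v in plot]
    if abs(pooled_t(plants_a, plants_b)) > T_CRIT_DF98:
        false_pos_plants += 1

    # Analysis on plot means: the plot is the replicate
    means_a = [mean(plot) for plot in fert]
    means_b = [mean(plot) for plot in unfert]
    if abs(pooled_t(means_a, means_b)) > T_CRIT_DF18:
        false_pos_plots += 1

rate_plants = false_pos_plants / runs
rate_plots = false_pos_plots / runs
print(rate_plants, rate_plots)
```

Under these assumptions the plant-level analysis should reject a true Ho far more often than the nominal 5%, while the plot-level analysis should stay near 5%. How large the gap is depends on how strongly plants within a plot resemble each other.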
How prevalent?
Hurlbert (1984): 48% of papers
Heffner et al. (1996): 12 to 14% of papers
Statistics review
Basic concepts:
• Variability measures
• Distributions
• Hypotheses
• Types of error
Common analyses
• T-tests
• One-way ANOVA
• Two-way ANOVA
• Randomized block
Variance
Ecological rule # 1: Everything varies
…but how much does it vary?
Variance
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the mean
Sum-of-square cake
What is the variance of 4, 3, 3, 2?
What are the units?
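The exercise can be worked step by step in Python. The sketch below follows the formula term by term and then cross-checks against the standard library's `statistics.variance`, which also divides by n - 1. Whatever units the data carry, the variance carries those units squared.

```python
import statistics

data = [4, 3, 3, 2]
n = len(data)

x_bar = sum(data) / n                              # the mean
sum_of_squares = sum((x - x_bar) ** 2 for x in data)
variance = sum_of_squares / (n - 1)                # divide by df, not n

# Cross-check with the standard library (sample variance, n - 1 denominator)
library_variance = statistics.variance(data)

print(x_bar, sum_of_squares, variance, library_variance)
```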
Variance variants
1. Standard deviation (s, or SD)
= Square root (variance)
Advantage: units (same units as the data)
2. Standard error (S.E.)
= $s / \sqrt{n}$
Advantage: indicates precision
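A short sketch of both variants in Python, assuming the usual definition of the standard error of the mean, SE = s / √n. The data set is reused from the variance exercise purely for illustration.

```python
import math
import statistics

data = [4, 3, 3, 2]
n = len(data)

# Standard deviation: square root of the sample variance (same units as the data)
sd = statistics.stdev(data)

# Standard error of the mean: SD divided by the square root of n
se = sd / math.sqrt(n)

print(sd, se)
```

The SE shrinks as n grows, which is why it describes the precision of the mean rather than the spread of the data.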
How to report
We observed 29.7 (± 5.3) grizzly bears per month (mean ± S.E.).
A mean (± SD) of 29.7 (± 7.4) grizzly bears were seen per month.
[Figure: error bars extending 1 SE or SD above (+) and below (-) the mean]
Distributions
Normal
• Quantitative data
Poisson
• Count (frequency) data
Normal distribution
[Figure: normal distribution curve with the mean marked]
About 68% of data within 1 SD of mean
About 95% of data within 2 SD of mean
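The coverage figures above are rounded. For a normal distribution the fraction within k SD of the mean is erf(k / √2), which the sketch below evaluates with Python's standard `math.erf`. The helper name `fraction_within` is made up for the example.

```python
import math

def fraction_within(k_sd):
    """Fraction of a normal distribution lying within k_sd SDs of the mean."""
    return math.erf(k_sd / math.sqrt(2))

within_1_sd = fraction_within(1)
within_2_sd = fraction_within(2)

print(round(within_1_sd, 4), round(within_2_sd, 4))
```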
Poisson distribution
[Figure: Poisson distribution with the mean marked]
Mostly, nothing happens (lots of zeros)
Poisson distribution
• Frequency data
• Lots of zero (or minimum value) data
• Variance increases with the mean
What do you do with Poisson data?
1. Correct for correlation between mean and variance by log-transforming y (but log(0) is undefined!)
2. Use non-parametric statistics (but low power)
3. Use a “generalized linear model” specifying a Poisson distribution
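Two of the points about Poisson data can be illustrated with the standard library alone. The sketch below computes the Poisson probabilities directly from the formula, using an arbitrary mean of 0.5 chosen only for the example, and shows that the variance equals the mean and that zeros dominate. It also shows why the log transform fails on zero counts. It does not fit a generalized linear model.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 0.5          # hypothetical mean count, chosen for illustration
ks = range(40)     # probabilities beyond this are negligible for lam = 0.5

mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
p_zero = poisson_pmf(0, lam)   # share of observations expected to be zero

# Log-transforming a zero count fails: log(0) is undefined
try:
    math.log(0)
    log_zero_ok = True
except ValueError:
    log_zero_ok = False

print(mean, variance, p_zero, log_zero_ok)
```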
Hypotheses
• Null (Ho): no effect of our experimental treatment, “status quo”
• Alternative (Ha): there is an effect
Whose null hypothesis?
Conditions very strict for rejecting Ho, whereas accepting Ho is easy (just a matter of not finding grounds to reject it).
A criminal trial? Exotic plant species? WTO?
Hypotheses
Null (Ho) and alternative (Ha):
always mutually exclusive
So if Ha is treatment>control…
Types of error
            Reject Ho           Accept Ho
Ho true     Type 1 error        Correct decision
Ho false    Correct decision    Type 2 error
• Usually ensure only 5% chance of type 1 error (i.e. alpha = 0.05)
• Ability to minimize type 2 error: called power
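Both error rates can be estimated by simulation. The sketch below is a simplified illustration, not an analysis from the slides: it uses a two-group z-test with a known SD of 1, 20 observations per group, and a true difference of 1 when Ho is false. All of these settings, and the function name `rejects`, are assumptions made for the example.

```python
import random

Z_CRIT = 1.96  # two-sided 5% critical value of the standard normal

def rejects(rng, true_diff, n=20, sigma=1.0):
    """Run one simulated experiment; return True if Ho is rejected."""
    control = [rng.gauss(0.0, sigma) for _ in range(n)]
    treated = [rng.gauss(true_diff, sigma) for _ in range(n)]
    diff = sum(treated) / n - sum(control) / n
    se = sigma * (2 / n) ** 0.5  # SE of a difference of two means, known sigma
    return abs(diff / se) > Z_CRIT

rng = random.Random(42)
runs = 5000

# Ho true (no difference): every rejection is a type 1 error
type1_rate = sum(rejects(rng, 0.0) for _ in range(runs)) / runs

# Ho false (true difference of 1): every rejection is a correct decision
power = sum(rejects(rng, 1.0) for _ in range(runs)) / runs
type2_rate = 1 - power

print(type1_rate, power, type2_rate)
```

The type 1 rate should sit near alpha = 0.05 regardless of sample size, whereas power depends on the effect size, the variability and n.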