http://mathworld.wolfram.com/Chi-SquaredDistribution.html
More stats... Outliers, R², and sample size
•Stats practice in next lab
•Also need to start putting together your group for inquiry 2... 3-5 people/group
•Inquiry 1 written and oral reports are due in lab Th 9/23 or M 9/27
•Homework #2 and #3 coming soon
•Online evaluation
•TA office hours calendar online
•In your lab notebook: Write everything about your experiments. Each entry should have a date. Include notes (intro and conclusions), so when you, or someone else, go back to look at your notebook, the entries make sense.
Notebooks will be turned in as a homework assignment later in the semester.
Outliers…
2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7, 121, 130
Median = 4
Mean = 18
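A quick way to check these two numbers is sketched below using Python's standard `statistics` module. The `trimmed` list (the data without the two largest values) is an illustrative assumption, added only to show how strongly the mean reacts to 121 and 130 while the median barely moves.

```python
from statistics import mean, median

# Data set from the slide
values = [2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7, 121, 130]

data_median = median(values)   # middle value of the sorted list
data_mean = mean(values)       # sum of values divided by their count

print("median:", data_median)  # 4
print("mean:", data_mean)      # 18

# Illustration only (an assumption, not from the slide):
# leave out the two largest values and compare again.
trimmed = values[:-2]
print("median without 121 and 130:", median(trimmed))
print("mean without 121 and 130:", round(mean(trimmed), 2))
```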
Outliers: When is data invalid?
Not simply when you want it to be.
Dixon’s Q test can determine if a value is statistically an outlier.
Q = |(suspect value – nearest value)| ÷ |(largest value – smallest value)|
Example: results from a blood test: 789, 700, 772, 766, 777
Q = |(700 – 766)| ÷ |(789 – 700)| = 0.742. So?
You need the table of critical values for Q:

| Sample size | Q critical value |
|---|---|
| 3 | 0.970 |
| 4 | 0.831 |
| 5 | 0.717 |
| 6 | 0.621 |
| 7 | 0.568 |
| 10 | 0.466 |
| 12 | 0.426 |
| 15 | 0.384 |
| 20 | 0.342 |
| 25 | 0.317 |
| 30 | 0.298 |

If Q calc > Q crit, then the outlier can be rejected.
Q calc = 0.742
Q crit = 0.717 (sample size of 5)
= rejection
From: E.P. King, J. Am. Statist. Assoc. 48: 531 (1958)
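The Q test on the blood test example can be sketched in a few lines of Python. The function name `dixon_q` and the dictionary `Q_CRIT` are names chosen here for illustration; the critical values are copied from the table on the slide, and the sketch only handles sample sizes that appear in that table.

```python
def dixon_q(values, suspect):
    """Q = |suspect - nearest value| / |largest - smallest|."""
    others = list(values)
    others.remove(suspect)  # compare the suspect against the remaining values
    nearest = min(others, key=lambda v: abs(v - suspect))
    return abs(suspect - nearest) / abs(max(values) - min(values))

# Critical values from the slide's table (sample size -> Q crit)
Q_CRIT = {3: 0.970, 4: 0.831, 5: 0.717, 6: 0.621, 7: 0.568, 10: 0.466,
          12: 0.426, 15: 0.384, 20: 0.342, 25: 0.317, 30: 0.298}

blood = [789, 700, 772, 766, 777]
q_calc = dixon_q(blood, suspect=700)
q_crit = Q_CRIT[len(blood)]
rejected = q_calc > q_crit  # reject the suspect value if Q calc > Q crit

print(f"Q calc = {q_calc:.3f}, Q crit = {q_crit}, reject: {rejected}")
# Q calc = 0.742, Q crit = 0.717, reject: True
```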
What can outliers tell us?
If you made a mistake, you should have already accounted for that.
Outliers can lead to important and fascinating discoveries.
Transposons (“jumping genes”) were discovered because they did not fit known modes of inheritance.
What about relating 2 variables?
XKCD.com
R² gives a measure of fit to a line.
If R² = 1, the data fit a straight line perfectly.
If R² = 0, there is no linear correlation between the variables.
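As a rough sketch of how this number is computed for a straight-line fit, the snippet below uses the fact that, for a simple linear fit, R² equals the square of the Pearson correlation coefficient. The function name `r_squared` and both data sets are hypothetical examples, not data from the lecture. The second data set follows a curve exactly, which shows that R² = 0 rules out a straight-line relationship, not every relationship.

```python
def r_squared(xs, ys):
    """R² of a straight-line fit, computed as the squared Pearson correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

# Hypothetical data lying exactly on the line y = 2x + 1
r2_line = r_squared([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])

# Hypothetical data lying exactly on the curve y = x²
r2_curve = r_squared([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print("straight line:", r2_line)   # 1.0
print("symmetric curve:", r2_curve)  # 0.0
```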
(month, day): (4, 17), (11, 14), (6, 7), (12, 17), (2, 13), (6, 21), (3, 21)
birth month vs birth day
[Scatter plot: Birth Month (x-axis, 1 to 11) vs Birth Day (y-axis, 0 to 30); R² ≈ 0.0055]
phosphate quantity vs absorbance
[Plot: Apyrase Assay Standard Curve 3-7-05; x-axis: nMol Pi (0.0 to 140.0); y-axis: OD 660 (0.000 to 0.500); R² ≈ 0.9999]
•To use R², the data must be continuously variable...
Samples vs populations
Population- everything or everyone about which information is sought
Sample- a subset of a population (that is hopefully representative of the population)
[Diagram labels: population, sample]
Population-
• U.S. census
• Dogs
• 1 – infinity
Sample-
• Travis county
• Poodles
• Prime numbers
Why use a sample instead of a population?
•Logistics
•Cost
•Time
Samples:
Random- each member of population has an equal chance of being part of the sample.
or
Representative- ensuring that certain parameters of your sample match the population.
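The difference between the two kinds of sample can be sketched with Python's standard `random` module. The population of 100 dogs (25 poodles, 75 others), the sample size of 20, and the fixed seed are all hypothetical choices made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 dogs, 25 of them poodles
population = ["poodle"] * 25 + ["other"] * 75

# Random sample: every member has an equal chance of being picked,
# so the poodle fraction can differ from the population by chance.
random_sample = rng.sample(population, 20)

# Representative sample: the poodle fraction (25%) is forced to match
# the population, with random picks inside each group.
poodles = [dog for dog in population if dog == "poodle"]
others = [dog for dog in population if dog == "other"]
representative_sample = rng.sample(poodles, 5) + rng.sample(others, 15)

print("poodles in random sample:", random_sample.count("poodle"))
print("poodles in representative sample:", representative_sample.count("poodle"))
```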
Replicates:
Technical vs Experimental
Technical replicate- one treatment is divided into multiple samples.
Experimental replicate- separate, replicate treatments are applied to different samples.
Testing blood sugar levels after eating a Snickers:
Divide a participant's blood into 3 samples and test blood sugar in each sample.
Technical or Experimental replicate?
Test 3 different people.
Technical or Experimental replicate?
Test the same person on 3 different days.
Technical or Experimental replicate?
What sample size do you need?
It depends on the error you expect.
To determine an appropriate sample size, you need to estimate a few parameters.
•Means
•Standard Deviation
•Power: the probability that an experiment will have a significant (positive) result when a real effect exists, that is, a p-value less than the specified significance level (usually 5%).
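To show how these parameters combine, here is a minimal sketch that uses the common normal-approximation formula for comparing two means with a two-sided test: n per group = 2 × ((z for significance + z for power) × SD ÷ difference in means)². This is an assumed textbook formula, not necessarily the method behind any particular online calculator, and the function name `sample_size_per_group` and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(mean1, mean2, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance level
    z_power = NormalDist().inv_cdf(power)          # desired power
    difference = abs(mean1 - mean2)                # expected difference in means
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)

# Hypothetical estimates: means of 100 and 110, standard deviation of 15
n_per_group = sample_size_per_group(mean1=100, mean2=110, sd=15)
print(n_per_group)  # 36
```

A larger expected standard deviation or a smaller expected difference between the means pushes the required sample size up.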
This calculator will help you determine the appropriate sample size:
http://www.stat.ubc.ca/~rollin/stats/ssize/n2.html
(Sample size depends on the error you expect, so it is impossible to predict with 100% accuracy before the experiment is carried out.)
3rd Thursday at Blanton Art Museum (http://blantonmuseum.org/calendar_events/details/third_thursday7)