Statistical Inference: L3, Bootstrap, L2, L1 progression
Formal and informal
Continuing implementation of new NCEA standards
Team Solutions
Steel • Hogan • Cathcart • McNaughton • Barks • Johnson
What may happen in this session
• Data
• Questions and PPDAC
• Bootstrap idea
• Bootstrap confidence interval
• (CLT, normal and CI)
• Contextual information
• Census at School resources
• Level 2, Level 1, Year 9 and 10
What are we learning?
• The bootstrap idea and how it is used to make an inference (now called formal inference).
• The importance of contextual information for this standard, 3.10.
• A possible lesson sequence.
• How to justify √n.
• Where there are a lot of resources.
• The big idea of this standard is to make a useful inference from a sample about a population.
Data
The Population
• The only definite thing we know about the population is that we never know anything definite about the population!
• If we did, there would be no need for statistics.
• But if we did, we could check to see whether our statistical methods work. Let’s do that!
The Vineyard
Problem
Before the harvest, I want to estimate the total harvest weight reasonably accurately so that I can plan processing.
The population is 10,000 bunches of grapes. There are 500 vines and each vine is managed to have 10 shoots and each shoot grows two bunches of grapes.
The Plan
The plan is to pick a random sample of bunches and weigh each bunch. There are 13 rows, each 50 m long, giving 13 × 50 = 650 metre positions, so 6 random numbers between 1 and 650 were chosen.
The bunches of grapes in a metre-long section were picked from the vines at each of these places and weighed.
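The selection step in the plan could be sketched as below. This is only an illustration: the slides do not say whether the six numbers had to be distinct or how positions were numbered along the rows, so the no-repeats sampling, the row-by-row numbering, and the helper name `choose_sections` are all assumptions.

```python
import random

ROWS = 13           # rows in the vineyard (from the plan)
ROW_LENGTH_M = 50   # length of each row in metres (from the plan)
N_SECTIONS = 6      # number of metre-long sections to pick (from the plan)


def choose_sections(seed=None):
    """Pick distinct metre positions in 1..650 and map each to (row, metre)."""
    rng = random.Random(seed)
    total_metres = ROWS * ROW_LENGTH_M  # 13 x 50 = 650
    # Assumption: sample without replacement so no section is picked twice.
    positions = sorted(rng.sample(range(1, total_metres + 1), N_SECTIONS))
    # Assumption: metres 1-50 are row 1, metres 51-100 are row 2, and so on.
    return [((p - 1) // ROW_LENGTH_M + 1, (p - 1) % ROW_LENGTH_M + 1)
            for p in positions]


sections = choose_sections(seed=1)
for row, metre in sections:
    print(f"row {row}, metre {metre}")
```

Fixing the seed makes the choice repeatable for a class demonstration; leaving it as `None` gives a fresh random selection each run.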
The Sample Data
I have the weights of 212 bunches of grapes.
Analysis
The median weight is 87.5 grams. The IQR is 109 – 70.5 = 38.5 grams.
So the Y12 median estimate for the vineyard is 87.5 ± 1.5 × 38.5/√212 = [83.5, 91.5] grams. I am pretty confident that this interval will contain the population median.
Conclusion
The population median is very likely to be contained in the interval [83.5, 91.5] grams.
I am pretty sure the vineyard harvest will be between 835kg and 915kg. I have stainless vats for 800kg so there will be a small surplus which I can ferment in a 200L plastic drum.
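The arithmetic behind the informal interval and the harvest estimate can be checked with a few lines. The figures are the ones quoted on the slides; the variable names are illustrative, and the final scaling step (multiplying a per-bunch median by 10,000 bunches) simply follows the slide's reasoning rather than being a general recommendation.

```python
import math

median_g = 87.5                      # sample median in grams (from the slide)
q1_g, q3_g = 70.5, 109.0             # lower and upper quartiles in grams
n = 212                              # number of bunches weighed
population_bunches = 500 * 10 * 2    # vines x shoots per vine x bunches per shoot

iqr_g = q3_g - q1_g                        # interquartile range
margin_g = 1.5 * iqr_g / math.sqrt(n)      # informal margin: 1.5 x IQR / sqrt(n)
low_g, high_g = median_g - margin_g, median_g + margin_g

# Scale the per-bunch interval up to the whole vineyard, in kilograms.
low_kg = low_g * population_bunches / 1000
high_kg = high_g * population_bunches / 1000

print(f"IQR = {iqr_g} g, margin = {margin_g:.2f} g")
print(f"median interval: [{low_g:.1f}, {high_g:.1f}] g")
print(f"harvest estimate: {low_kg:.0f} kg to {high_kg:.0f} kg")
```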
Bootstrap for a better estimate
I have the weights of 212 bunches of grapes.
Bootstrap
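• The idea behind a bootstrap is to mimic the sample many times by resampling from it with replacement.
• This is best done by simulation.
• A computer is needed.
• The distribution of the means/medians of all the mimic samples reflects how that statistic would vary across samples from the population.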
The distribution of 200 resamples of size 212.
Bootstrap result: the harvest is likely to be between 860 kg and 945 kg, with 95% confidence.
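A minimal sketch of the bootstrap idea, assuming a percentile-style interval (the slides do not say exactly how the 95% interval was read off the resampling distribution, nor whether means or medians were used). The 212 weights below are made-up stand-ins, since the real data are not in the slides, so the printed interval will not match the 860 kg to 945 kg result. The function name `bootstrap_interval` is hypothetical.

```python
import random
import statistics


def bootstrap_interval(sample, stat=statistics.median, n_resamples=200,
                       level=0.95, seed=None):
    """Percentile bootstrap interval for `stat`, resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample is the same size as the original sample.
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    # Drop an equal share of resample statistics from each tail.
    cut = round((1 - level) / 2 * n_resamples)
    return stats[cut], stats[n_resamples - cut - 1]


# Hypothetical data: 212 invented bunch weights in grams, NOT the vineyard sample.
data_rng = random.Random(0)
weights = [max(10.0, data_rng.gauss(90, 28)) for _ in range(212)]

lo, hi = bootstrap_interval(weights, n_resamples=200, seed=1)
print(f"bootstrap interval for the median: [{lo:.1f}, {hi:.1f}] g")
# Scaling to the 10,000-bunch vineyard, as on the earlier slide.
print(f"harvest estimate: {lo * 10_000 / 1000:.0f} kg to {hi * 10_000 / 1000:.0f} kg")
```

Passing `stat=statistics.fmean` instead would bootstrap the mean, which may be closer to what is wanted when the target is a total weight.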
What controls the spread of the means?
• From the left, sample sizes 3, 5, 8, 12 and 40.
• Here I am varying the sample size.
√n
[Figure: spread (width) of the sample means plotted against sample size, with sample size on the horizontal axis (0 to 45) and width on the vertical axis (0 to 80). Fitted power trend line: f(x) = 125.41 x^(−0.512), R² = 0.965.]
Spread ∝ 1/√(sample size)
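One way to let students see the 1/√n relationship for themselves is to simulate it. The sketch below uses an invented population and measures spread as the standard deviation of the sample means; both are assumptions, as the slides measure "width" from a plot and do not give the underlying data. The sample sizes are the ones on the slide. A fitted exponent near −0.5 on a log-log scale corresponds to spread ∝ 1/√n.

```python
import math
import random
import statistics


def spread_of_means(population, n, n_samples=2000, rng=None):
    """Standard deviation of the means of many random samples of size n."""
    rng = rng or random.Random()
    means = [statistics.fmean(rng.sample(population, n))
             for _ in range(n_samples)]
    return statistics.stdev(means)


rng = random.Random(42)
# Hypothetical population of 10,000 bunch weights in grams (invented values).
population = [rng.gauss(90, 28) for _ in range(10_000)]

sizes = [3, 5, 8, 12, 40]   # sample sizes used on the slide
spreads = [spread_of_means(population, n, rng=rng) for n in sizes]

# Least-squares slope of log(spread) against log(sample size).
xs = [math.log(n) for n in sizes]
ys = [math.log(s) for s in spreads]
x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))

for n, s in zip(sizes, spreads):
    print(f"n = {n:2d}: spread = {s:.2f}")
print(f"fitted exponent = {slope:.3f}")
```

The slide's own fitted exponent of about −0.512 is the same kind of evidence: close to −0.5, as the square-root rule predicts.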
Key Ideas
• List all the ideas of statistics, Years 9 to 13.
• Order the development of these.
Resources!
• http://3rs.ccac.ca/en/research/reduction/experimental-design.html
• http://www.corwin.com/upm-data/29173_Millsap___Chapter_2.pdf
• http://curiouscat.com/bill/101doe.cfm
• http://stattrek.com/experiments/what-is-an-experiment.aspx
• http://en.wikipedia.org/wiki/Design_of_experiments
• http://webspace.ship.edu/cgboer/experiments.html
• http://www.stat.auckland.ac.nz/~iase/publications/icots8/ICOTS8_4B2_ENGEL..pdf
• http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm
• http://www.stat.rutgers.edu/home/mxie/rcpapers/bootstrap.pdf
Resources!
And …
http://mediasite.temanuaute.org.nz/mediasite/Catalog/pages/catalog.aspx?catalogId=22dcecd5-8520-4ab7-b6c3-d96115a5dacf
This is not the end, it is not even the beginning of the end, it is only the end of the beginning!
Puzzle time
Who on Earth said that and when?