resampling, empirical likelihood estimation & bootstrapping — arbeitxcelab.net/rm/wp-content/uploads/2010/03/week9.pdf
TRANSCRIPT
[Figure: histogram of dice totals (2–11) from 10 throws; y-axis: frequency, 0.0–2.0]

> length( p[p==7] ) / throws
[1] 0.1
[Figures: histograms of dice totals from 10 throws and from 100 throws; y-axis: frequency]

> length( p[p==7] ) / throws
[1] 0.14
[Figures: histograms of dice totals from 10, 100, and 1000 throws; y-axis: frequency]

> length( p[p==7] ) / throws
[1] 0.195
[Figures: histograms of dice totals from 10, 100, 1000, and 1e+05 throws; y-axis: frequency]

> length( p[p==7] ) / throws
[1] 0.17865
resampling & empirical likelihood estimation
• most statistical estimation premised on repeated “experiments”:
• if data generated many times, what’s the expected outcome?
• instead of actually repeating, write formula that computes the expectation (and likelihood of observed data)
resampling & empirical likelihood estimation
• what if we can’t write the formula?
• sometimes impossible
• often very hard
• can simulate repeat sampling and compute the expectation EMPIRICALLY (sometimes called “monte carlo” likelihood estimation)
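As a concrete sketch of “monte carlo” likelihood estimation, the dice experiment from the opening pages can be reproduced with a short simulation (Python here for illustration; the slides use R, and the function name is ours):

```python
import random

def monte_carlo_p7(throws, seed=1):
    """Estimate P(two dice total 7) by simulating `throws` rolls."""
    rng = random.Random(seed)
    sevens = sum(rng.randint(1, 6) + rng.randint(1, 6) == 7
                 for _ in range(throws))
    return sevens / throws
```

At small sample sizes the estimate wobbles (0.1, 0.14, 0.195 on the earlier pages); as the number of throws grows it settles near the analytical value 6/36 ≈ 0.1667.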
binomial example
• frequencies of dead tadpoles (again) in pools of 5
• what is the chance of death?
• easy problem, but suppose we can’t write the formula...

[Figure: histogram of dead tadpoles per pool (0–5); y-axis: frequency, 0–30]
binomial example
• even when we can’t write the likelihood expression, we can usually simulate data, conditional on parameters
• strategy:
• (1) generate a datum, conditional on parameters
• (2) do (1) a bunch of times
• (3) observe the frequency of the real data in the distribution from (2). this is the likelihood estimate.
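The three-step recipe can be sketched directly for the tadpole example (Python for illustration; `sim_binom_lik` is a hypothetical name, not from the slides):

```python
import random

def sim_binom_lik(x, prob, size, R=9999, seed=1):
    """Empirical likelihood of x deaths out of `size` tadpoles:
    (1)-(2) simulate R pools conditional on prob, then
    (3) take the frequency of the observed count x among them."""
    rng = random.Random(seed)
    sims = [sum(rng.random() < prob for _ in range(size)) for _ in range(R)]
    return sims.count(x) / R
```

With prob = 0.4 and size = 5, the estimate for x = 1 should sit near the analytical dbinom value, 5 × 0.4 × 0.6⁴ ≈ 0.259.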
binomial example
prob = 0.4, 10 000 replicates

[Figures: histograms of simulated counts (0–5); y-axis: frequency, 0–2500; likelihood of 1 highlighted]

> length( k[k==1] ) / 10000
[1] 0.2626
empirical likelihood estimation
• at each set of parameter values, need to simulate the distribution
• may need many replicates to get a smooth picture of the likelihood surface
• careful of returning zero (0) likelihoods. NO EVENT should ever have zero chance of happening.
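One common workaround for the zero-likelihood problem (our suggestion, not from the slides) is to floor the empirical frequency at a small positive value before taking logs:

```python
from math import log

def safe_log_lik(count, R):
    """Log empirical likelihood with a floor of 1/(2*R), so an event
    never seen among the R simulations still has nonzero likelihood.
    The 1/(2*R) floor is a conventional choice, not from the slides."""
    return log(max(count / R, 1.0 / (2 * R)))
```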
empirical likelihood estimation
• This function does the same thing as dbinom(), but it does it via simulation.

dsimbinom <- function( x , prob , size , log=TRUE , R=99 ) {
    # simulate R draws, conditional on the parameters
    e <- rbinom( R , prob=prob , size=size )
    # log-frequency of each x in the simulated distribution
    p <- log( sapply( x , function(y) length(e[e==y])/R ) )
    p
}
empirical likelihood estimation

[Figures: -logLik vs prob (0.2–0.6) with 99, 999, and 9999 replicates; the red curve is the real analytical likelihood function]
empirical likelihood estimation
• “jaggies” bad. helps to use SIMULATED ANNEALING (SA) (method="SANN")
• SA hill-climbs, like most algorithms, but also climbs DOWN, with slowly decreasing probability (as it “cools”)
m.prob <- mle2( k ~ dbinom( prob=1/(1+exp(z)) , size=5 ) , start=list(z=0) )
m.sim <- mle2( k ~ dsimbinom( prob=1/(1+exp(z)) , size=5 , R=999 ) , start=list(z=0) , method="SANN" )
[Figure: -logLik (160–210) vs prob (0.2–0.6)]
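A bare-bones sketch of the idea behind simulated annealing (Python for illustration; optim's "SANN" method, which mle2 calls, is more sophisticated): always accept an improvement, and accept a worse point with probability exp(-Δ/temperature), where the temperature slowly cools.

```python
import math, random

def anneal(f, x0, steps=5000, temp0=1.0, seed=1):
    """Minimize f by simulated annealing."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for i in range(1, steps + 1):
        temp = temp0 / i                # cooling schedule
        cand = x + rng.gauss(0, 0.5)    # propose a nearby point
        fc = f(cand)
        # downhill moves always accepted; uphill with shrinking probability
        if fc < fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
        if fx < fbest:
            best, fbest = x, fx
    return best
```

Because it sometimes moves uphill, SA can escape the local wiggles (“jaggies”) of a noisy likelihood surface that would trap a pure hill-climber.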
empirical likelihood estimation
k <- rbinom( 100 , size=5 , prob=0.4 )
> sum(k)/500
[1] 0.388

> logit(coef(m.prob))
        z
    0.388
> logit(coef(m.sim))
        z
 0.390865

m.prob <- mle2( k ~ dbinom( prob=1/(1+exp(z)) , size=5 ) , start=list(z=0) )
m.sim <- mle2( k ~ dsimbinom( prob=1/(1+exp(z)) , size=5 , R=999 ) , start=list(z=0) , method="SANN" )
more complex example
• beta-binomial distribution:
• binomial probabilities sampled from a beta distribution
• has an analytical solution, but we’ll do it empirically now
[Figure: beta-distributed probability of death (0–1), with sampled values p1 = 0.4, p2 = 0.65, p3 = 0.12 (40%/60%, 65%/35%, 12%/88%)]

beta distributed chances of mortality
binomial trials determine actual deaths in each pool
[Figures: count of dead tadpoles per pool (0–10) vs number of pools, binomial (left) and beta-binomial (right), with the underlying beta density of the probability of death]

rbinom( 1000 , prob=0.5 , size=10 )
rbetabinom( 1000 , shape1=0.9 , shape2=0.9 , size=10 )

p = 0.5, binomial vs beta-binomial
[Figures: count of dead tadpoles per pool (0–10) vs number of pools, binomial (left) and beta-binomial (right), with the underlying beta density of the probability of death]

rbinom( 1000 , prob=0.5 , size=10 )
rbetabinom( 1000 , shape1=2 , shape2=2 , size=10 )

p = 0.5, binomial vs beta-binomial
heterogeneous tadpoles

[Figures: beta densities of the probability of death (0–1) for a = 2, b = 2; a = 0.7, b = 0.7; a = 1, b = 2; a = 1, b = 0.7]
empirical beta-binomial
• out of 5 tadpoles, how many dead?
• assume that mortality is correlated WITHIN pools

[Figure: histogram of dead tadpoles (0–5); y-axis: frequency, 0–25]
empirical beta-binomial
dsimbetabinom <- function( x , shape1 , shape2 , size , log=TRUE , R=99 ) {
# sample R probabilities from betabinom p <- rbeta( R , shape1=shape1 , shape2=shape2 )
# sample each event from p e <- rbinom( R , size=size , prob=p )
# observe log-freq of each x in distribution of e log( sapply( x , function(y) length(e[e==y])/R ) ) }
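The same two-stage simulation can be written in Python for comparison (function name ours; the flooring of zero frequencies discussed earlier is omitted for brevity):

```python
import random
from math import log

def sim_betabinom_loglik(x, shape1, shape2, size, R=9999, seed=1):
    """Empirical log-likelihood of x deaths under a beta-binomial:
    draw a per-pool death probability from a beta distribution,
    then count binomial deaths among `size` tadpoles, R times."""
    rng = random.Random(seed)
    e = []
    for _ in range(R):
        p = rng.betavariate(shape1, shape2)
        e.append(sum(rng.random() < p for _ in range(size)))
    freq = e.count(x) / R
    return log(freq) if freq > 0 else float("-inf")
```

For shape1 = shape2 = 2 and size = 5, the analytical beta-binomial probability of x = 1 is C(5,1)·B(3,6)/B(2,2) ≈ 0.179, which the simulation should approximate.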
empirical beta-binomial
• The analytical way:

> library(emdbook)
> m.prob <- mle2( k ~ dbetabinom( size=5 , shape1=exp(s1) , shape2=exp(s2) ) ,
    start=list( s1=1 , s2=1 ) )
> exp(coef(m.prob))
      s1       s2
1.967890 2.010719
empirical beta-binomial
• The empirical way:

> m.sim <- mle2( k ~ dsimbetabinom( shape1=exp(s1) , shape2=exp(s2) , size=5 , R=999 ) ,
    start=list( s1=1 , s2=1 ) , method="SANN" )
> exp(coef(m.sim))
      s1       s2
2.013872 1.964869
empirical likelihood estimation
• Problems that require empirical likelihood methods:
• complex phylogenetic models
• complex population structure models
• almost all Bayesian analyses
• almost all network models
• many “mixed effects” models
• many time series models
bootstrapping
• a special kind of resampling aimed at estimating the variance of an estimate (confidence intervals)
• suppose we can’t estimate confidence from the likelihood surface (can’t write a formula, perhaps)
• can treat the sample like a population, and take many samples of the same size from it
• theory tells us that as sample size increases, the variance in resampled estimates converges to the true variance
bootstrapping
• (1) sample n data from the original size-n sample, WITH REPLACEMENT
• (2) do (1) many times
• (3) as n increases, the histogram from (2) approaches the true likelihood surface
• (4) find the values of the parameter in the histogram that mark different confidence limits
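Steps (1)–(4) can be sketched as a percentile bootstrap (Python for illustration; the slides do this in R with mle2 and the boot library, and the statistic here is just the sample mean):

```python
import random

def percentile_ci(data, stat, n_boot=999, alpha=0.05, seed=1):
    """(1)-(2) resample n data with replacement, many times;
    (3) collect the statistic from each resample;
    (4) read the confidence limits off the sorted values."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(stat([rng.choice(data) for _ in range(n)])
                   for _ in range(n_boot))
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot)]
    return lo, hi

def mean(xs):
    return sum(xs) / len(xs)
```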
bootstrap estimates
• simplest confidence intervals are just read from the histogram
• e.g. 95% intervals:
  low: value just above 2.5% of the values
  high: value just above 97.5% of the values

[Figure: histogram of 1/(1 + exp(b$t)); x: 0.25–0.35, y: frequency, 0–200]
bootstrapping
• Original data:

k <- rbinom( 100 , size=5 , prob=0.3 )

• Estimate parameter:

m <- mle2( k ~ dbinom( prob=logit(z) , size=5 ) , start=list(z=0) )
bootstrapping
• Resample 999 sets of data from the original data, and re-estimate the mle for each:

plist <- replicate( 999 ,
    coef( mle2( sample(k,100,TRUE) ~ dbinom( prob=logit(z) , size=5 ) ,
        start=list(z=coef(m)[1]) , method="Nelder-Mead" ) ) )

logit( quantile( plist , probs=c(0.025,0.975) ) )
bootstrapping
• Can find the 95% confidence interval just by cutting off the lower and upper 2.5%
• Here: 0.251, 0.335
• confint() gives: 0.259, 0.339

[Figure: histogram of logit(plist); x: 0.25–0.35, y: frequency, 0–200]

logit( quantile( plist , probs=c(0.025,0.975) ) )
bootstrapping
• More complicated models are easier to do with the boot library.
• Consider modeling log brain mass against log body mass, for various species (at right).

[Figure: log brain mass (0–8) vs log body mass (0–10); dinosaurs labeled]
bootstrapping

[Figure: log brain mass (0–8) vs log body mass (0–10), with fitted regression line]

plot( log(d$brain) ~ log(d$body) , xlab="log body mass" , ylab="log brain mass" )
abline( lm( log(d$brain) ~ log(d$body) ) , col="red" )
bootstrapping
• Now write a function that accepts the original data and a collection of row numbers as parameters:

f.coef <- function( d , i ) {
    # make a new data frame that contains the resampled rows in i
    nd <- d[i,]
    # fit our model to the resampled data
    m <- lm( log(brain) ~ log(body) , data=nd )
    # return coefficients
    coef(m)
}
bootstrapping
• Then tell the boot library to resample and collect coefficients from that function:

library(boot)
boot.animals <- boot( d , f.coef , R=9999 )

boot.object <- boot( ORIGINAL.DATA , YOUR.FUNCTION , R=NUM.RESAMPLES )
bootstrapping

plot( boot.animals , index=2 )

[Figures: histogram of the resampled beta coefficients (t*, density) and a quantile–quantile comparison of the resampled distribution to the standard normal]
bootstrapping
• Convenient function to extract confidence intervals:

> boot.ci( boot.animals , type="perc" , index=2 )
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 9999 bootstrap replicates

CALL :
boot.ci(boot.out = boot.animals, type = "perc", index = 2)

Intervals :
Level     Percentile
95%   ( 0.2905,  0.7491 )
Calculations and Intervals on Original Scale

> confint( lm( log(d$brain) ~ log(d$body) ) )
                2.5 %    97.5 %
(Intercept) 1.7056829 3.4041133
log(d$body) 0.3353152 0.6566742