04/10/23
(c) 2000, Ron S. Kenett, Ph.D. 1
Computer Intensive Techniques (Bootstrapping)
Instructor: Ron S. Kenett
Email: [email protected]
Course Website: www.kpa.co.il/biostat
Course textbook: MODERN INDUSTRIAL STATISTICS, Kenett and Zacks, Duxbury Press, 1998
Course Syllabus
• Understanding Variability
• Variability in Several Dimensions
• Basic Models of Probability
• Sampling for Estimation of Population Quantities
• Parametric Statistical Inference
• Computer Intensive Techniques
• Multiple Linear Regression
• Statistical Process Control
• Design of Experiments
Bootstrapping
A computer-intensive method, introduced in 1979 by Brad Efron of Stanford University, in order to “pull yourself out of the mess”:
• Take a Random Sample With Replacement (RSWR) and compute the statistic T
• Resample M times and recompute the statistic T
• Derive the Empirical Bootstrap Distribution (EBD): E{EBD}, STD{EBD} and the EBD percentiles
• Estimate E{T} and STD{T} and a Bootstrap Confidence Interval for the population parameter
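The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course software: the function names `bootstrap` and `percentile_interval`, the default M = 1000, and the use of a simple percentile interval are all assumptions made for this sketch. The demo data are the first nine Hybrid1 values from the deck.

```python
import random
import statistics


def bootstrap(sample, statistic, m=1000, seed=None):
    """Return M bootstrap replicates of `statistic` (the EBD)."""
    rng = random.Random(seed)
    n = len(sample)
    ebd = []
    for _ in range(m):
        resample = rng.choices(sample, k=n)  # RSWR of size n
        ebd.append(statistic(resample))      # recompute T on the resample
    return ebd


def percentile_interval(ebd, level=0.95):
    """Percentile interval read off the sorted EBD (assumed method)."""
    ordered = sorted(ebd)
    k = round((1.0 - level) / 2 * len(ordered))  # replicates cut from each tail
    return ordered[k], ordered[len(ordered) - 1 - k]


# Demo on the first nine Hybrid1 values; the seed is arbitrary.
data = [2060, 2127, 1947, 2140, 1960, 1960, 2134, 2054, 2094]
ebd = bootstrap(data, statistics.mean, m=1000, seed=1)
print("E{EBD}   =", statistics.mean(ebd))
print("STD{EBD} =", statistics.stdev(ebd))
print("0.95 interval =", percentile_interval(ebd))
```

Any function of a sample can be passed as `statistic`, for example `statistics.stdev` to bootstrap the standard deviation.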
Bootstrap testing of the mean
[Figure: histogram of the Hybrid1 sample (Frequency vs. Hybrid1, horizontal axis from 1950 to 2450)]
Hybrid1:
2060 2127 1947 2140 1960 1960 2134 2054
2094 2087 2267 2427 2174 2107 2267 2154
2167 2147 2214 2160 2180 2220 2167 2174
2280 2187 2180 2060 2060 2054 2240 2140
$\bar{X}_{32} = 2143.4$
Is this significantly different from 2150?
Boot1smp.exe
    ' Integer variables start with I-N, double-precision otherwise
    DEFLNG I-N
    DEFDBL A-H, O-Z
    RANDOMIZE TIMER
    PRINT " W e l c o m e to BOOT1SMP"
    PRINT " ==========================="
    PRINT " "
    PRINT " "
    PRINT " This program bootstraps the sample mean and"
    PRINT " the sample standard deviation from a given sample."
    PRINT " "
    PRINT " The output is stored in c:\istat\data\boot1smp.dat."
    PRINT " "
    PRINT "-----------------------------------------------"
    PRINT " "
    PRINT " What is the path and file name of your sample?"
    INPUT source$
    ' Ask for the sample size until a value greater than 1 is entered
    cont10: PRINT " What is the size of Sample 1 ?"
    INPUT n1
    IF n1 <= 1 THEN
        PRINT " SAMPLE SIZE SHOULD BE GREATER THAN 1 , REENTER"
        GOTO cont10
    END IF
[Figure: the Hybrid1 sample resampled five times, giving the bootstrap statistics $T^*_1, T^*_2, T^*_3, T^*_4, T^*_5$]
X-bar      Std
2135.03     87.850
2149.84    121.631
2141.19    109.258
2149.09     78.084
2134.00    103.856
2122.13     73.843
2119.66     86.625
2113.59    107.136
2138.97    101.693
2163.00     67.725
Derive the reference distribution by computing $\bar{X}^*_n - \mu_0$
Empirical Bootstrap Distribution
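[Figure: histogram of the bootstrap X-bar values (Frequency vs. X-bar), with 2150 marked]
Empirical Bootstrap Distribution of mean
0.95 conf. BI = (2109.5, 2179.9)
2150 lies inside this interval, so the Hybrid1 mean is not significantly different from 2150 at the 0.05 level.
[Figure: histogram of the bootstrap Std values (Frequency vs. Std)]
EBD of STD
Empirical Bootstrap Distribution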
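As a rough illustration of the bootstrap test of the mean, the sketch below resamples the 32 Hybrid1 values, collects the resample means and standard deviations, and checks whether 2150 falls inside a 0.95 percentile interval. It is not Boot1smp.exe: the number of resamples M = 1000, the seed, and the percentile rule are assumptions, so the interval will be close to, but not identical with, the one reported on the slide.

```python
import random
import statistics

# Hybrid1 sample (n = 32) as listed in the deck.
HYBRID1 = [
    2060, 2127, 1947, 2140, 1960, 1960, 2134, 2054,
    2094, 2087, 2267, 2427, 2174, 2107, 2267, 2154,
    2167, 2147, 2214, 2160, 2180, 2220, 2167, 2174,
    2280, 2187, 2180, 2060, 2060, 2054, 2240, 2140,
]
MU0 = 2150   # hypothesized mean from the slide
M = 1000     # number of resamples (assumed)

rng = random.Random(2023)  # arbitrary seed, for repeatability only
means, stds = [], []
for _ in range(M):
    resample = rng.choices(HYBRID1, k=len(HYBRID1))  # RSWR of size 32
    means.append(statistics.mean(resample))
    stds.append(statistics.stdev(resample))

# 0.95 percentile interval from the sorted EBD of the mean.
means.sort()
k = round(0.025 * M)
interval = (means[k], means[M - 1 - k])
reject = not (interval[0] <= MU0 <= interval[1])

print("sample mean:", statistics.mean(HYBRID1))
print("0.95 bootstrap interval:", interval)
print("reject H0 (mean = 2150):", reject)
```

The lists `means` and `stds` play the role of the X-bar and Std columns of the EBD table.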
Bootstrapping the ANOVA
Hybrid1  Hybrid2  Hybrid3
2060     1907     1887
2127     1940     1834
1947     1700     1587
2140     1934     1814
1960     1707     1614
1960     1680     1680
2134     1940     1747
2054     1794     1660
2094     1707     1600
$\bar{X}_1 = 2143.406$, $\bar{X}_2 = 1902.813$, $\bar{X}_3 = 1850.344$
$S_1^2 = 9929.54$, $S_2^2 = 16648.35$, $S_3^2 = 21001.01$
F = MSBetween/MSWithin = 49.274
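The F ratio can be checked by hand from the three means and variances. The worked calculation below assumes n = 32 observations per hybrid (the deck lists 32 values for Hybrid1; the table above shows only the first nine rows) and k = 3 equal-sized groups, so that MSWithin is the plain average of the three variances.

```latex
\begin{align*}
\bar{X} &= \tfrac{1}{3}\,(2143.406 + 1902.813 + 1850.344) = 1965.521 \\
MS_{Between} &= \frac{n \sum_{i=1}^{3} (\bar{X}_i - \bar{X})^2}{k-1}
  = \frac{32\,(177.885^2 + 62.708^2 + 115.177^2)}{2} \approx 781{,}458 \\
MS_{Within} &= \tfrac{1}{3}\,(S_1^2 + S_2^2 + S_3^2)
  = \tfrac{1}{3}\,(9929.54 + 16648.35 + 21001.01) \approx 15{,}859.6 \\
F &= \frac{MS_{Between}}{MS_{Within}} \approx 49.27
\end{align*}
```

This agrees with the reported F = 49.274 up to rounding of the inputs.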
[Figure: histogram of the bootstrap F_RATIO values (Frequency vs. F_RATIO, horizontal axis from 0 to 7)]
F= 49.274
ANOVTEST.EXE
EBD of F values
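One way to build an EBD of F values is sketched below. It is not a reconstruction of ANOVTEST.EXE: the resampling scheme (centering each group at its own mean so that the null hypothesis of equal means holds, then resampling within groups) is an assumption, as are the names `f_ratio` and `bootstrap_f` and the choice of M = 500. Only the nine rows per hybrid shown in the table are used, so the observed F here differs from the 49.274 obtained from the full data.

```python
import random
import statistics

# First nine rows per hybrid, as shown in the table (not the full data set).
GROUPS = [
    [2060, 2127, 1947, 2140, 1960, 1960, 2134, 2054, 2094],  # Hybrid1
    [1907, 1940, 1700, 1934, 1707, 1680, 1940, 1794, 1707],  # Hybrid2
    [1887, 1834, 1587, 1814, 1614, 1680, 1747, 1660, 1600],  # Hybrid3
]


def f_ratio(groups):
    """One-way ANOVA statistic F = MSBetween / MSWithin."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    group_means = [statistics.mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def bootstrap_f(groups, m=500, seed=None):
    """EBD of F under H0, resampling each group from its centered values."""
    rng = random.Random(seed)
    centered = [[x - statistics.mean(g) for x in g] for g in groups]
    return [f_ratio([rng.choices(c, k=len(c)) for c in centered])
            for _ in range(m)]


f_obs = f_ratio(GROUPS)
ebd = bootstrap_f(GROUPS, m=500, seed=11)  # arbitrary seed
p_value = sum(f >= f_obs for f in ebd) / len(ebd)
print("observed F:", f_obs)
print("bootstrap p-value:", p_value)
```

An observed F far to the right of the whole EBD, as on the slide, indicates that the hybrid means differ.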
Draw samples from X and Y
X: Stress or Load distribution
Y: Strength distribution
Estimate P(X > Y)
Bootstrapping Stress Strength relationships
X= .0352, .0397, .0677, .0233, .0873, .1156, .0286, .0200, .0797, .9972, .0245, .0251, .0469, .0838, .0796
Y= 1.7700, .9457, 1.8985, 2.6121, 1.0929, .0362, 1.0615, 2.3895, .0982, .7971, .8316, 3.2304, .4373, 2.5648, .6377
$P(X>Y) = 0.04$ with $P_{.95} = 0.08$
EBD of P(X>Y)
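A possible way to bootstrap P(X > Y) from the two samples is sketched below. The estimator used here (the proportion of all (x, y) pairs with x > y, recomputed on independent resamples of X and Y) is an assumption, as are the names `p_x_gt_y` and `bootstrap_p`, M = 1000 and the seed. The course software may use a different scheme, so the results need not match the figures quoted on the slide.

```python
import random

# Stress (X) and strength (Y) samples from the slide.
X = [.0352, .0397, .0677, .0233, .0873, .1156, .0286, .0200,
     .0797, .9972, .0245, .0251, .0469, .0838, .0796]
Y = [1.7700, .9457, 1.8985, 2.6121, 1.0929, .0362, 1.0615, 2.3895,
     .0982, .7971, .8316, 3.2304, .4373, 2.5648, .6377]


def p_x_gt_y(xs, ys):
    """Proportion of all (x, y) pairs in which stress exceeds strength."""
    return sum(x > y for x in xs for y in ys) / (len(xs) * len(ys))


def bootstrap_p(xs, ys, m=1000, seed=None):
    """EBD of P(X > Y): resample X and Y independently, with replacement."""
    rng = random.Random(seed)
    return [p_x_gt_y(rng.choices(xs, k=len(xs)), rng.choices(ys, k=len(ys)))
            for _ in range(m)]


ebd = sorted(bootstrap_p(X, Y, m=1000, seed=7))  # arbitrary seed
p95 = ebd[round(0.95 * len(ebd)) - 1]            # 95th percentile of the EBD
print("plug-in estimate of P(X > Y):", p_x_gt_y(X, Y))
print("95th percentile of the EBD:", p95)
```

The 95th percentile of the EBD serves as an upper bootstrap confidence limit for the failure probability P(X > Y).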