
August 2004 Copyright Tim Hesterberg 1

Introduction to the Bootstrap (and Permutation Tests)

Tim Hesterberg, Ph.D.

Association of General Clinical Research Center Statisticians

August 2004, Toronto


Outline of Talk

• Why Resample?

• Introduction to Bootstrapping

• More examples, sampling methods

• Two-sample Bootstrap

• Two-sample Permutation Test

• Other statistics

• Other permutation tests


Why Resample?

• Fewer assumptions: normality, equal variances

• Greater accuracy (in practice)

• Generality: Same basic procedure for wide range of statistics, sampling methods

• Promote understanding: Concrete analogies to theoretical concepts


Good Books

• Hesterberg et al. Bootstrap Methods and Permutation Tests (2003, W. H. Freeman)

• B. Efron and R. Tibshirani An Introduction to the Bootstrap (1993, Chapman & Hall).

• A.C. Davison and D.V. Hinkley, Bootstrap Methods and Their Application (Cambridge University Press, 1997).


Example - Verizon

                        Number of Observations   Average Repair Time
ILEC (Verizon)          1664                     8.4
CLEC (other carrier)    23                       16.5

Is the difference statistically significant?


Example Data

[Figure: two plots of Repair Time (0 to 200) and a normal quantile plot of Repair Time against Quantiles of Standard Normal; legend: ILEC, CLEC]


Start Simple

• We’ll start simple – single sample mean

• Later
  – other statistics
  – two samples
  – permutation tests


Bootstrap Procedure

• Repeat 1000 times
  – Draw a sample of size n with replacement from the original data (“bootstrap sample”, or “resample”)
  – Calculate the sample mean for the resample

• The 1000 bootstrap sample means comprise the bootstrap distribution.
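The procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the S+Resample `bootstrap` function used in the talk; the function name `bootstrap_means` and the data values are hypothetical.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=1000, seed=0):
    """Return the bootstrap distribution of the sample mean."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original data.
        resample = rng.choices(data, k=n)
        means.append(statistics.fmean(resample))
    return means

# Hypothetical repair times, invented for illustration only.
times = [1.2, 0.5, 3.8, 22.0, 7.1, 0.9, 15.4, 2.6, 4.4, 9.7]
boot = bootstrap_means(times)
print(len(boot), min(boot), max(boot))
```

A histogram or normal quantile plot of `boot` would play the role of the bootstrap distribution plots in the talk.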


Bootstrap Distribution for ILEC mean

[Figure: bootstrap : ILEC$Time : mean. Two panels: density of the bootstrap means (7.5 to 9.5) with the observed value and the bootstrap mean marked, and a normal quantile plot of the bootstrap means against Quantiles of Standard Normal]


Bootstrap Standard Error

• Bootstrap standard error (SE) = standard deviation of bootstrap distribution

    > ILEC.boot.mean
    Call:
    bootstrap(data = ILEC, statistic = mean, seed = 36)

    Number of Replications: 1000

    Summary Statistics:
         Observed  Mean     Bias     SE
    mean    8.412 8.395 -0.01698 0.3672


Bootstrap Distribution for CLEC mean

[Figure: bootstrap : CLEC$Time : mean. Two panels: density of the bootstrap means (10 to 30) with the observed value and the bootstrap mean marked, and a normal quantile plot of the bootstrap means against Quantiles of Standard Normal]


Take another look

• Take another look at the previous two figures.

• Is the amount of non-normality/asymmetry there a cause for concern?

• Note – we’re looking at a sampling distribution, not the underlying distribution. This is after the CLT effect!


Idea behind bootstrapping

• Plug-in principle
  – Underlying distribution is unknown
  – Substitute your best guess


Ideal world


Bootstrap world


Fundamental Bootstrap Principle

• Plug-in principle
  – Underlying distribution is unknown
  – Substitute your best guess

• Fundamental Bootstrap Principle
  – This substitution works
  – Not always
  – Bootstrap distribution centered at statistic, not parameter


Secondary Principle

• Implement the Fundamental Principle by Monte Carlo sampling

• This is just an implementation detail!
  – Exact: n^n samples
  – Monte Carlo, typically 1000 samples

• 1000 realizations from theoretical bootstrap dist

• More for higher accuracy (e.g. 500,000)


Not Creating Data from Nothing

• Some are uncomfortable with the bootstrap, because they think it is creating data out of nothing. (The name doesn’t help!)

• Not creating data. No better parameter estimates. (Exception – bagging, boosting.)

• Use the original data to estimate SE or other aspects of the sampling distribution.
  – Using sampling, rather than a formula


Formulaic and Bootstrap SE


What to substitute?

• Plug-in principle
  – Underlying distribution is unknown
  – Substitute your best guess

• What to substitute?
  – Empirical distribution – ordinary bootstrap
  – Smoothed distribution – smoothed bootstrap
  – Parametric distribution – parametric bootstrap
  – Satisfy assumptions, e.g. null hypothesis


Another example: Kyphosis

• Variables: Kyphosis (present or absent), Age of child, Number of vertebrae in operation, Start of range of vertebrae

• Logistic regression


Kyphosis - Logistic Regression

                       Value  Std. Error    t value
    (Intercept)  -2.03693225  1.44918287  -1.405573
    Age           0.01093048  0.00644419   1.696175
    Start        -0.20651000  0.06768504  -3.051043
    Number        0.41060098  0.22478659   1.826626

    Null Deviance: 83.23447 on 80 df
    Residual Deviance: 61.37993 on 77 df


Kyphosis vs. Start

[Figure: Kyphosis (0.0 to 1.0) against Start (5 to 15)]


Kyphosis Example

• Pseudo-code:

    Repeat 1000 times {
        Draw sample with replacement from original rows
        Fit logistic regression
        Save coefficients
    }
    Use the bootstrap distribution

• Live demo (kyphosis.ssc)
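The pseudo-code on this slide can be sketched in Python. To stay within the standard library, this sketch substitutes a least-squares slope for the logistic regression fit, and the rows are invented for illustration; neither the helper names (`fit_slope`, `bootstrap_rows`) nor the data come from the talk or from kyphosis.ssc.

```python
import random
import statistics

def fit_slope(rows):
    """Least-squares slope for (x, y) rows, or None if every x is equal."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    if len(set(xs)) < 2:
        return None  # the slope is undefined for this resample
    xbar, ybar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in rows)
    return sxy / sxx

def bootstrap_rows(rows, fit, n_resamples=1000, seed=0):
    """Resample whole rows with replacement, refit, and save each result."""
    rng = random.Random(seed)
    saved = []
    while len(saved) < n_resamples:
        resample = rng.choices(rows, k=len(rows))
        coef = fit(resample)
        if coef is not None:  # redraw degenerate resamples
            saved.append(coef)
    return saved

# Hypothetical (x, y) rows, invented for illustration only.
rows = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1), (6, 12.2)]
slopes = bootstrap_rows(rows, fit_slope)
print(statistics.fmean(slopes), statistics.stdev(slopes))
```

Passing the fitting function as an argument keeps the resampling loop unchanged if a different model is swapped in.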


Bootstrap SE and bias

• Bootstrap SE (standard error) = standard deviation of bootstrap distribution

• Bootstrap bias = mean of bootstrap distribution – original statistic
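Both definitions translate directly into code. The following is a minimal standard-library sketch; the function name `bootstrap_se_and_bias` and the data are hypothetical, and this is not the S+Resample implementation.

```python
import random
import statistics

def bootstrap_se_and_bias(data, statistic, n_resamples=1000, seed=0):
    """Return (observed statistic, bootstrap SE, bootstrap bias)."""
    rng = random.Random(seed)
    observed = statistic(data)
    replicates = [
        statistic(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    ]
    # SE: standard deviation of the bootstrap distribution.
    se = statistics.stdev(replicates)
    # Bias: mean of the bootstrap distribution minus the original statistic.
    bias = statistics.fmean(replicates) - observed
    return observed, se, bias

# Hypothetical data, invented for illustration only.
times = [1.2, 0.5, 3.8, 22.0, 7.1, 0.9, 15.4, 2.6, 4.4, 9.7]
observed, se, bias = bootstrap_se_and_bias(times, statistics.fmean)
print(observed, se, bias)
```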


t confidence interval

• Statistic ± t* × SE(bootstrap)

• Reasonable interval if bootstrap distribution is approximately normal with little bias. Compare to bootstrap percentiles. Return to Kyphosis example.

• In the literature, “bootstrap t” means something else.


Percentiles to check Bootstrap t

• If bootstrap distribution is approximately normal and unbiased, then bootstrap t intervals and corresponding percentiles should be similar.

• Compare these

• If similar, use either; else use a more accurate interval
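The comparison can be sketched as follows. This is a hedged illustration only: it uses a normal critical value in place of t* (an assumption made because the Python standard library has no t quantile function), a simple order-statistic percentile interval, and invented data.

```python
import random
import statistics

def bootstrap_intervals(data, statistic, level=0.95, n_resamples=1000, seed=0):
    """Return (t-style interval, percentile interval) for a statistic."""
    rng = random.Random(seed)
    observed = statistic(data)
    reps = sorted(
        statistic(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    se = statistics.stdev(reps)
    alpha = (1 - level) / 2
    # Assumption: a normal critical value stands in for t*.
    crit = statistics.NormalDist().inv_cdf(1 - alpha)
    t_interval = (observed - crit * se, observed + crit * se)
    # Percentile interval: order statistics of the bootstrap distribution.
    lower = reps[round(alpha * n_resamples)]
    upper = reps[round((1 - alpha) * n_resamples) - 1]
    return t_interval, (lower, upper)

# Hypothetical data, invented for illustration only.
times = [1.2, 0.5, 3.8, 22.0, 7.1, 0.9, 15.4, 2.6, 4.4, 9.7]
t_int, pct_int = bootstrap_intervals(times, statistics.fmean)
print(t_int, pct_int)
```

If the two intervals differ noticeably, for example because the percentile interval is clearly asymmetric around the statistic, that points toward one of the more accurate intervals.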


More Accurate Intervals

• BCa, Tilting, others (real bootstrap-t)

• Percentile and “bootstrap-t”:
  – first-order correct
  – Consistent, coverage error O(1/sqrt(n))

• BCa and Tilting:
  – second-order correct
  – coverage error O(1/n)


Different Sampling Procedures

• Two-sample applications

• Other sampling situations


Two-sample Bootstrap Procedure

Given independent SRSs from two populations:

• Repeat 1000 times
  – Draw sample size m from sample 1
  – Draw sample size n from sample 2, independently
  – Compute statistic that compares two groups, e.g. difference in means

• The 1000 bootstrap statistics comprise the bootstrap distribution.
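A minimal Python sketch of this two-sample procedure, using the difference in means as the comparison statistic. The function name and the two small data sets are hypothetical and are not the Verizon data.

```python
import random
import statistics

def two_sample_bootstrap(sample1, sample2, n_resamples=1000, seed=0):
    """Bootstrap distribution of mean(sample1) - mean(sample2)."""
    rng = random.Random(seed)
    m, n = len(sample1), len(sample2)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group separately and independently.
        r1 = rng.choices(sample1, k=m)
        r2 = rng.choices(sample2, k=n)
        diffs.append(statistics.fmean(r1) - statistics.fmean(r2))
    return diffs

# Hypothetical data, invented for illustration only.
group1 = [2.0, 3.5, 1.1, 8.4, 4.2, 6.3, 0.7, 5.5]
group2 = [9.1, 14.2, 3.3, 21.0, 11.8]
diffs = two_sample_bootstrap(group1, group2)
print(statistics.fmean(diffs), statistics.stdev(diffs))
```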


Example – Relative Risk

    Blood Pressure   Cardiovascular Disease
    High             55/3338 = 0.0165
    Low              21/2676 = 0.0078

Estimated Relative Risk = 2.12


…bootstrap Relative Risk

[Figure: bootstrap Relative Risk. Density of the bootstrap distribution (1 to 5) with the observed value and mean marked; intervals shown: t, percentile, BCa, tilt]


Example: Verizon

[Figure: two plots of Repair Time (0 to 200) and a normal quantile plot of Repair Time against Quantiles of Standard Normal; legend: ILEC, CLEC]


…difference in means

[Figure: bootstrap : Verizon$Time : mean : ILEC - CLEC. Two panels: density of the bootstrap difference in means (-25 to 0) with the observed value and mean marked, and a normal quantile plot against Quantiles of Standard Normal]


…difference in trimmed means

[Figure: bootstrap : Verizon : mean(Time, trim =... : ILEC - CLEC. Two panels: density of the bootstrap difference in trimmed means (-15 to 0) with the observed value and mean marked, and a normal quantile plot against Quantiles of Standard Normal]


…comparison

• Diff means

          Observed    Mean    Bias     SE
    mean    -8.098  -7.931  0.1663  3.893

• Diff 25% trimmed means

          Observed    Mean    Bias     SE
    Param   -10.34  -10.19  0.1452  2.737


Other Sampling Situations

• Stratified Sampling
  – Resample within strata

• Small samples or strata
  – Correct for narrowness bias

• Finite Population
  – Create finite population, resample without replacement

• Regression


Bootstrap SE too small

• Usual SE for mean is $s/\sqrt{n}$, where $s^2 = \frac{1}{n-1}\sum_i (x_i - \bar{x})^2$

• Bootstrap corresponds to using divisor of n instead of n-1.

• Bias factor for each sample, each stratum
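The size of this effect can be written out for the sample mean. The following is a short worked derivation under the assumption of the ideal (exact) bootstrap, ignoring Monte Carlo noise; the symbol $\hat\sigma$ for the divisor-n standard deviation is introduced here for illustration.

```latex
\[
\mathrm{SE}_{\text{boot}}(\bar{x}) = \frac{\hat\sigma}{\sqrt{n}},
\qquad
\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{n-1}{n}\, s^2
\]
\[
\mathrm{SE}_{\text{boot}}(\bar{x}) = \sqrt{\frac{n-1}{n}} \cdot \frac{s}{\sqrt{n}}
\]
```

For a sample of size n = 23, the factor $\sqrt{22/23}$ is about 0.978, so the bootstrap SE is roughly 2% too small; for n = 1664 the factor is essentially 1.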


Remedies for small SE

• Multiply SE by sqrt(n/(n-1))
  – Equal strata sizes only. No effect on CIs.

• Sample with reduced size, (n-1)

• Bootknife sampling
  – Omit random observation
  – Sample size n from remaining n-1

• Smoothed bootstrap
  – Choose smoothing parameter to match variance
  – Continuous data only
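Of these remedies, bootknife sampling is the easiest to show in code. This is a minimal sketch of the two steps listed on the slide, with hypothetical function names and invented data; it is not the S+Resample sampler.

```python
import random
import statistics

def bootknife_resample(data, rng):
    """Omit one random observation, then draw n with replacement from the rest."""
    n = len(data)
    omit = rng.randrange(n)
    remaining = data[:omit] + data[omit + 1:]
    return rng.choices(remaining, k=n)

def bootknife_se_of_mean(data, n_resamples=1000, seed=0):
    """Standard deviation of the bootknife distribution of the mean."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(bootknife_resample(data, rng)) for _ in range(n_resamples)
    ]
    return statistics.stdev(means)

# Hypothetical data (a list with distinct values), invented for illustration only.
times = [1.2, 0.5, 3.8, 22.0, 7.1, 0.9, 15.4, 2.6, 4.4, 9.7]
print(bootknife_se_of_mean(times))
```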


Smoothed bootstrap

• Kernel Density Estimate = Nonparametric bootstrap + random noise

[Figure: TV Advertising, Basic Cable. Density against minutes/half-hour (6 to 12)]


Finite Population

• Sample size n from population size N

• If N is multiple of n,
  – repeat each observation (N/n) times,
  – bootstrap sample without replacement

• If N is not a multiple of n,
  – Repeat each observation same # of times
    • round N/n up, down
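The case where N is a multiple of n can be sketched as below. This hypothetical helper handles only that case and raises an error otherwise; the rounding scheme for other values of N is left out because the slide does not spell it out.

```python
import random

def finite_population_resample(sample, population_size, rng):
    """Build a pseudo-population of size N, then draw n without replacement."""
    n = len(sample)
    copies, remainder = divmod(population_size, n)
    if remainder != 0:
        raise ValueError("this sketch only handles N as a multiple of n")
    # Repeat each observation N/n times.
    pseudo_population = list(sample) * copies
    # Draw a resample of size n without replacement.
    return rng.sample(pseudo_population, k=n)

# Hypothetical sample of n = 5 from a population of N = 20.
rng = random.Random(0)
sample = [3.1, 4.7, 5.0, 6.2, 8.9]
resample = finite_population_resample(sample, 20, rng)
print(resample)
```

Because sampling is without replacement from only N/n copies, no observation can appear more than N/n times in a resample, which is what narrows the bootstrap distribution relative to the ordinary bootstrap.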


Resampling for Regression

• Resample observations (random effects)
  – Problem with factors, random amount of info

• Resample residuals (fixed effects)
  – Fit model
  – Resample residuals, with replacement
  – Add to fitted values
  – Problems with heteroskedasticity, lack of fit
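The residual-resampling steps can be sketched for a simple least-squares line. The model choice, the function names, and the data are assumptions made for illustration; the slide does not tie the method to any particular model.

```python
import random
import statistics

def fit_line(x, y):
    """Least-squares (intercept, slope) for a simple linear regression."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def residual_bootstrap_slopes(x, y, n_resamples=1000, seed=0):
    """Bootstrap distribution of the slope, keeping the x values fixed."""
    rng = random.Random(seed)
    intercept, slope = fit_line(x, y)                    # fit model
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_resamples):
        resampled = rng.choices(residuals, k=len(x))     # resample residuals
        y_star = [fi + e for fi, e in zip(fitted, resampled)]  # add to fitted values
        slopes.append(fit_line(x, y_star)[1])
    return slopes

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 3.8, 6.1, 8.2, 9.7, 12.4, 13.9, 16.2]
slopes = residual_bootstrap_slopes(x, y)
print(statistics.fmean(slopes), statistics.stdev(slopes))
```

Every resample here uses the same x values, which is why this scheme avoids the "random amount of info" problem but depends on the residuals being exchangeable.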


Basic Rule for Sampling

• Sample in a way consistent with how the data were produced

• Including any additional information
  – Continuous distribution (if it matters, e.g. for medians)
  – Null hypothesis


Resampling for Hypothesis Tests

• Sample in a manner consistent with H0

• P-value = P0(random value exceeds observed value)

[Figure labels: observed statistic, P-value, sampling distribution when H0 is true]


Permutation Test for 2-samples

• H0: no real difference between groups; observations could come from one group as well as the other

• Resample: randomly choose n1 observations for group 1, rest for group 2.

• Equivalent to permuting all n, first n1 into group 1.
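A minimal Python sketch of this two-sample permutation test, for the one-sided alternative that group 1 has the smaller mean. The function name and data are hypothetical, and the (count + 1) / (B + 1) form of the p-value is a common convention assumed here rather than something stated on the slide.

```python
import random
import statistics

def permutation_test_less(group1, group2, n_resamples=999, seed=0):
    """One-sided permutation test on mean(group1) - mean(group2)."""
    rng = random.Random(seed)
    n1 = len(group1)
    pooled = list(group1) + list(group2)
    observed = statistics.fmean(group1) - statistics.fmean(group2)
    count = 0
    for _ in range(n_resamples):
        # Permute all n observations; the first n1 go into group 1.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[:n1]) - statistics.fmean(pooled[n1:])
        if diff <= observed:
            count += 1
    # Assumed convention: count the observed arrangement as one permutation.
    return observed, (count + 1) / (n_resamples + 1)

# Hypothetical data, invented for illustration only.
group1 = [2.0, 3.5, 1.1, 8.4, 4.2, 6.3, 0.7, 5.5]
group2 = [9.1, 14.2, 3.3, 21.0, 11.8]
observed, p_value = permutation_test_less(group1, group2)
print(observed, p_value)
```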


Verizon permutation test

[Figure: permutation : Verizon$Time : mean : ILEC - CLEC. Permutation distribution with the observed value and mean marked]


Test results

    Pooled-variance t-test
    t = -2.6125, df = 1685, p-value = 0.0045

    Non-pooled-variance t-test
    t = -1.9834, df = 22.3463548265907, p-value = 0.0299

    > permVerizon3
    Call:
    permutationTestMeans(data = Verizon$Time, treatment = Verizon$Group,
        B = 499999, alternative = "less", seed = 99)

    Number of Replications: 499999

    Summary Statistics:
        Observed      Mean    SE alternative p.value
    Var   -8.098 -0.001288 3.105        less 0.01825


Permutation vs Pooled Bootstrap

• Pooled bootstrap test
  – Pool all n observations
  – Choose n1 with replacement for group 1
  – Choose n2 with replacement for group 2

• Permutation test is preferred
  – Condition on the observed data
  – Same number of outliers as the observed data


Assumptions

• Permutation Test:
  – Same distribution for two populations
    • When H0 is true
    • Population variances must be the same; sample variances may differ
  – Does not require normality
  – Does not require that data be a random sample from a larger population


Other Statistics

• Procedure works for variety of statistics
  – Difference in means
  – t-statistic
  – difference in trimmed means

• Work directly with statistic of interest
  – Same p-value for $\bar{x}_1 - \bar{x}_2$ and pooled-variance t-statistic


Difference in Trimmed Means

[Figure: permutation 25% trimmed mean: Verizon ILEC-CLEC. Density of the permutation distribution (-10 to 0) with the observed value and mean marked]

P-value = 0.0002


General Permutation Tests

• Compute Statistic for data

• Resample in a way consistent with H0 and study design

• Construct permutation distribution

• P-value = percentage of resampled statistics that exceed original statistic


Perm Test for Matched Pairs or Stratified Sampling

• Permute within each pair

• Permute within each stratum
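For matched pairs, permuting within a pair means randomly swapping its two members, which flips the sign of that pair's difference. A minimal sketch under that reading, with a hypothetical function name, invented data, and an assumed one-sided "greater" alternative:

```python
import random
import statistics

def paired_permutation_test_greater(before, after, n_resamples=999, seed=0):
    """One-sided permutation test on the mean of (after - before)."""
    rng = random.Random(seed)
    diffs = [a - b for b, a in zip(before, after)]
    observed = statistics.fmean(diffs)
    count = 0
    for _ in range(n_resamples):
        # Swapping the two members of a pair flips the sign of its difference.
        permuted = [d if rng.random() < 0.5 else -d for d in diffs]
        if statistics.fmean(permuted) >= observed:
            count += 1
    # Assumed convention: count the observed arrangement as one permutation.
    return observed, (count + 1) / (n_resamples + 1)

# Hypothetical paired measurements, invented for illustration only.
before = [10.1, 9.4, 11.2, 8.7, 10.5, 9.9, 10.8, 9.1]
after = [10.9, 9.8, 11.1, 9.6, 11.4, 10.2, 11.5, 9.7]
observed, p_value = paired_permutation_test_greater(before, after)
print(observed, p_value)
```

The stratified case follows the same idea: shuffle group labels separately inside each stratum instead of across the whole data set.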


Example: Puromycin

• The data are from a biochemical experiment where the initial velocity of a reaction was measured for different concentrations of the substrate. Data are from two runs, one on cells treated with the drug Puromycin, the other on cells without.

• Variables: concentration, velocity, treatment


Puromycin data

[Figure: Puromycin. Velocity (50 to 200) against Concentration (0.0 to 1.0); legend: untreated, treated]


Permutation Test for Puromycin

• Statistic: ratio of smooths, at each original concentration

• Stratify by original concentration

• Permute only the treatment variable

    permutationTest(data = Puromycin, statistic = f,
        alternative = "less", combine = T, seed = 42,
        group = Puromycin$conc,
        resampleColumns = "state")


Puromycin – Permutation Graphs

[Figure: six panels titled "permutation : Puromycin : f", one per concentration (0.02, 0.06, 0.11, 0.22, 0.56, 1.1), each showing the density of the permutation distribution with the observed value and mean marked]


Puromycin – P-values

    Summary Statistics:
         Observed   Mean       SE  alternative  p-value
    0.02   0.9085  1.016  0.14932         less    0.259
    0.06   0.8509  1.005  0.08191         less    0.024
    0.11   0.8254  1.002  0.07011         less    0.003
    0.22   0.8034  1.001  0.07657         less    0.002
    0.56   0.7850  1.007  0.09675         less    0.002
    1.1    0.7937  1.025  0.13384         less    0.053

    Combined p-value:
    0.02, 0.06, 0.11, 0.22, 0.56, 1.1    0.002


Permutation test curves

[Figure: Velocity (50 to 200) against Concentration (0.0 to 1.0); legend: untreated, treated, perm/untreated]


Permutation Test of Relationship

• To test H0: X and Y are independent

• Permute either X or Y (both is just extra work)

• Test statistic may be correlation, regression slope, chi-square statistic (Fisher’s exact test), …
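A sketch of this test with correlation as the test statistic, permuting only Y. The function names, the data, and the two-sided comparison on |r| are assumptions made for illustration.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_test_correlation(x, y, n_resamples=999, seed=0):
    """Two-sided permutation test of H0: X and Y are independent."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    y_perm = list(y)
    count = 0
    for _ in range(n_resamples):
        # Permute Y only; X stays in its original order.
        rng.shuffle(y_perm)
        if abs(pearson(x, y_perm)) >= abs(observed):
            count += 1
    # Assumed convention: count the observed arrangement as one permutation.
    return observed, (count + 1) / (n_resamples + 1)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.2, 2.9, 4.1, 3.8, 5.6, 6.1, 6.0, 7.7]
r, p_value = permutation_test_correlation(x, y)
print(r, p_value)
```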


Perm Test in Regression

• Simple regression: permute X or Y

• Multiple regression:
  – Permute Y to test H0: no X contributes
  – To test incremental contribution of X1
    • Cannot permute X1
    • That loses joint relationship of Xs


Example: Kyphosis

• Variables: Kyphosis (present or absent), Age of child, Number of vertebrae in operation, Start of range of vertebrae

• Logistic regression


Kyphosis - Logistic Regression

                       Value  Std. Error    t value
    (Intercept)  -2.03693225  1.44918287  -1.405573
    Age           0.01093048  0.00644419   1.696175
    Start        -0.20651000  0.06768504  -3.051043
    Number        0.41060098  0.22478659   1.826626

    Null Deviance: 83.23447 on 80 df
    Residual Deviance: 61.37993 on 77 df


Kyphosis vs. Start

[Figure: Kyphosis (0.0 to 1.0) against Start (5 to 15)]


Kyphosis Permutation Test

• Permute Kyphosis (the response variable), leaving other variables fixed.

• Test statistic is residual deviance.

    Summary Statistics:
          Observed   Mean     SE  alternative  p-value
    Param    61.38  79.95  2.828         less    0.001


Kyphosis Permutation Distribution

[Figure: Permutation Distribution for Kyphosis. Density of Residual Deviance (65 to 80), with the observed value and mean marked]


When Perm Testing Fails

• Permutation Testing is not Universal
  – Cannot test H0: = 0
  – Cannot test H0: = 1

• Use Confidence Intervals

• Bootstrap tilting
  – Find maximum-likelihood weighted distribution that satisfies H0, use weighted bootstrap


If time permits

• Bias – Portfolio optimization example, in section3.ppt

• More about confidence intervals, from section5.ppt


Summary

• Basic bootstrap idea
  – Substitute best estimate for population(s)
    • For testing, match null hypothesis
  – Sample consistently with how data produced
  – Inspect bootstrap distribution – Normal?
  – Compare t and percentile intervals, BCa & tilting


Summary

• Testing
  – Sample consistent with H0
  – Permutation test to compare groups, test relationships
  – No permutation tests in some situations; use bootstrap confidence interval or test


Resources

• www.insightful.com/Hesterberg/bootstrap

• S+Resample: www.insightful.com/downloads/libraries

[email protected]


Supplement for pages 24-27

• This document is a supplement to the presentation at the AGS. It includes some material that was shown in a live demo using S-PLUS, corresponding to pages 24-27 of the original presentation (the slides from “Kyphosis Example” through “Percentiles to check Bootstrap t”).


Another example: Kyphosis

• Variables: Kyphosis (present or absent), Age of child, Number of vertebrae in operation, Start of range of vertebrae

• Logistic regression


Kyphosis - Logistic Regression

                       Value  Std. Error    t value
    (Intercept)  -2.03693225  1.44918287  -1.405573
    Age           0.01093048  0.00644419   1.696175
    Start        -0.20651000  0.06768504  -3.051043
    Number        0.41060098  0.22478659   1.826626

    Null Deviance: 83.23447 on 80 df
    Residual Deviance: 61.37993 on 77 df


Kyphosis Example

• Pseudo-code:

    Repeat 1000 times {
        Draw sample with replacement from original rows
        Fit logistic regression
        Save coefficients
    }
    Use the bootstrap distribution

• Live demo (kyphosis.ssc)


Kyphosis vs. Start

[Figure: Kyphosis (vertical axis, 0.0 to 1.0) plotted against Start (horizontal axis, 5 to 15)]


Graphical bootstrap of predictions

[Figure: Kyphosis (vertical axis, 0.0 to 1.0) against Start (horizontal axis, 5 to 15), showing bootstrap predictions]


Bootstrap Coefficients

[Figure: four density panels, each titled "bootstrap : glm(formula = Kyp... : coef(glm(data))", showing the bootstrap distribution of one coefficient with the Observed value and the bootstrap Mean marked. Panels: (Intercept), horizontal axis -15 to 0; Age, 0.0 to 0.06; Start, -0.8 to 0.0; Number, 0 to 3. Vertical axis: Density.]


Bootstrap Scatterplots

[Figure: scatterplot matrix of the bootstrap coefficients (Intercept), Age, Start, and Number]


t confidence interval

• Statistic ± t* × SE(bootstrap)

• Reasonable interval if the bootstrap distribution is approximately normal with little bias. Compare to bootstrap percentiles. Return to the Kyphosis example.

• In the literature, “bootstrap t” means something else: an interval built from the bootstrap distribution of a t statistic.
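A minimal sketch of the two intervals being compared, assuming a made-up right-skewed sample and the sample mean as the statistic. The helper `percentile` and the choice of n - 1 degrees of freedom for t* are assumptions of this example, not details taken from the S+Resample functions.

```python
import math
import random

def percentile(sorted_vals, q):
    """Quantile q in [0, 1) by linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

# Made-up, right-skewed sample of n = 20 values.
data = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0,
        1.1, 1.3, 1.5, 1.8, 2.2, 2.7, 3.5, 4.8, 7.0, 12.0]
n = len(data)
observed = sum(data) / n

# Bootstrap distribution of the mean: resample with replacement 2000 times.
rng = random.Random(2004)
reps = sorted(sum(rng.choices(data, k=n)) / n for _ in range(2000))

# Bootstrap standard error: standard deviation of the replicates.
boot_mean = sum(reps) / len(reps)
se_boot = math.sqrt(sum((r - boot_mean) ** 2 for r in reps) / (len(reps) - 1))

T_STAR = 2.093   # 97.5% point of t with n - 1 = 19 df (assumed choice of df)
t_limits = (observed - T_STAR * se_boot, observed + T_STAR * se_boot)
pct_limits = (percentile(reps, 0.025), percentile(reps, 0.975))

print("t interval:         ", t_limits)
print("percentile interval:", pct_limits)
```

The t interval is symmetric about the observed statistic by construction, while the percentile interval follows the shape of the bootstrap distribution. A clear gap between the two suggests the bootstrap distribution is skewed or biased.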


Are t-limits reasonable here?

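[Figure: density of the bootstrap distribution of the Start coefficient, titled "bootstrap : glm(formula = Kyp... : coef(glm(data))"; horizontal axis -0.8 to 0.0, vertical axis Density; Observed value and bootstrap Mean marked]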


Are t-limits reasonable here?

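[Figure: normal quantile plot of the bootstrap Start coefficients, titled "bootstrap : glm(formula = Kyp... : coef(glm(data))"; horizontal axis Quantiles of Standard Normal (-2 to 2), vertical axis Start (-0.8 to -0.2)]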


Are t-limits reasonable here?

• Remember, the previous two plots show the bootstrap distribution, an estimate of the sampling distribution, after the Central Limit Theorem has had its chance to work.


Percentiles to check Bootstrap t

• If the bootstrap distribution is approximately normal and unbiased, then t intervals based on the bootstrap SE and the corresponding percentiles should be similar.

• Compare these

• If similar, use either; otherwise use a more accurate interval


Compare t and percentile CIs

> signif(limits.t(boot.kyphosis), 2)
               2.5%      5%    95%  97.5%
(Intercept) -6.1000 -5.4000  1.400  2.000
Age         -0.0054 -0.0027  0.025  0.027
Start       -0.3800 -0.3500 -0.063 -0.034
Number      -0.2900 -0.1800  1.000  1.100

> signif(limits.percentile(boot.kyphosis), 2)
                2.5%      5%    95%  97.5%
(Intercept) -6.80000 -5.8000  0.560  1.400
Age          0.00077  0.0021  0.028  0.033
Start       -0.44000 -0.3900 -0.120 -0.095
Number      -0.09400  0.0078  1.100  1.300


Compare asymmetry of CIs

> signif(limits.t(boot.kyphosis) - boot.kyphosis$observed, 2)
              2.5%     5%   95% 97.5%
(Intercept) -4.100 -3.400 3.400 4.100
Age         -0.016 -0.014 0.014 0.016
Start       -0.170 -0.140 0.140 0.170
Number      -0.710 -0.590 0.590 0.710

> signif(limits.percentile(boot.kyphosis) - boot.kyphosis$observed, 2)
             2.5%      5%   95% 97.5%
(Intercept) -4.80 -3.8000 2.600 3.500
Age         -0.01 -0.0088 0.018 0.022
Start       -0.23 -0.1800 0.088 0.110
Number      -0.51 -0.4000 0.710 0.850
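The asymmetry comparison is plain subtraction, and can be re-derived from the numbers already printed in this talk. The sketch below assumes that `boot.kyphosis$observed` equals the coefficients in the logistic regression output. It starts from limits already rounded to two significant digits, so the last digit can differ slightly from the S-PLUS output, which rounds after subtracting. The names `reach` and `asymmetry` are introduced here for illustration only.

```python
# Values copied from the slides: fitted coefficients and the 2.5% / 97.5% limits.
observed = {"(Intercept)": -2.03693225, "Age": 0.01093048,
            "Start": -0.20651000, "Number": 0.41060098}
t_limits = {"(Intercept)": (-6.1, 2.0), "Age": (-0.0054, 0.027),
            "Start": (-0.38, -0.034), "Number": (-0.29, 1.1)}
pct_limits = {"(Intercept)": (-6.8, 1.4), "Age": (0.00077, 0.033),
              "Start": (-0.44, -0.095), "Number": (-0.094, 1.3)}

def reach(limits, center):
    """Distances from the center down to the lower limit and up to the upper limit."""
    lower, upper = limits
    return center - lower, upper - center

# For each coefficient, store (upper reach / lower reach) for both intervals.
asymmetry = {}
for name, obs in observed.items():
    t_down, t_up = reach(t_limits[name], obs)
    p_down, p_up = reach(pct_limits[name], obs)
    asymmetry[name] = (t_up / t_down, p_up / p_down)
    print(f"{name:12s} t: -{t_down:.2g} / +{t_up:.2g}   "
          f"percentile: -{p_down:.2g} / +{p_up:.2g}")
```

A ratio near 1 means the interval reaches about equally far on both sides of the observed value, as the t limits do. The percentile limits for Start reach further down than up, and those for Age and Number reach further up than down.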