ensembled multivariate adaptive regression ......mars. ohd dibldi blour method ppp proposed is...

25
CO S 2010 COMPSTAT2010 in Paris Ensembled Multivariate Adaptive Regression Splines Ensembled Multivariate Adaptive Regression Splines ith N ti G t E ti t with Nonnegative Garrote Estimator Hiroki Motogaito Osaka University Osaka University M hiG t Masashi Goto Biostatistical Research Association NPO Biostatistical Research Association, NPO. JAPAN JAPAN

Upload: others

Post on 10-Dec-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

CO S 2010COMPSTAT2010 in Paris

Ensembled Multivariate Adaptive Regression SplinesEnsembled Multivariate Adaptive Regression Splines ith N ti G t E ti twith Nonnegative Garrote Estimator

Hiroki MotogaitogOsaka UniversityOsaka University

M hi G tMasashi GotoBiostatistical Research Association NPOBiostatistical Research Association, NPO.

JAPANJAPAN

Page 2: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

AgendaAgendag

• Introduction and motivation• Introduction and motivationTree methods• Tree methods M lti i t Ad ti R iMultivariate Adaptive RegressionMultivariate Adaptive Regression

Splines(MARS)Splines(MARS)p ( )Bagging MARSBagging MARSgg g

O th d d• Our method proposedOur method proposedEnsembled MARS with nonnegative garroteEnsembled MARS with nonnegative garrote

• Example and simulation• Example and simulation• Concluding remarks• Concluding remarksg

22

Page 3: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

AgendaAgendag

• Introduction and motivation• Introduction and motivationTree methods• Tree methods M lti i t Ad ti R iMultivariate Adaptive RegressionMultivariate Adaptive Regression

Splines(MARS)Splines(MARS)p ( )Bagging MARSBagging MARSgg g

O th d d• Our method proposedOur method proposedEnsembled MARS with nonnegative garroteEnsembled MARS with nonnegative garrote

• Example and simulation• Example and simulation• Concluding remarks• Concluding remarksg

33

Page 4: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Introduction and motivationIntroduction and motivation

Unstable Less interpretableUnstable Less interpretable

)(f )(xf

ˆ)(ˆ xf )(ˆ xfSt bili i)(ˆ xf)(xf )(xfStabilizing

BaggingMARS gg g(Breiman,1996)(Friedman,1991) (Breiman,1996)(Friedman,1991)

M ti tiMotivation

i MARS th t h b th t bilit d i t t bilita new version MARS that has both stability and interpretability

44

Page 5: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

AgendaAgendag

• Introduction and motivation• Introduction and motivationTree methods• Tree methodsM lti i t Ad ti R iMultivariate Adaptive RegressionMultivariate Adaptive Regression

Splines(MARS)Splines(MARS)p ( )Bagging MARSBagging MARSgg g

O th d d• Our method proposedOur method proposedEnsembled MARS with nonnegative garroteEnsembled MARS with nonnegative garrote

• Example and simulation• Example and simulation• Concluding remarks• Concluding remarksg

55

Page 6: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

M lti i t Ad ti R i S li (F i d 1991)Multivariate Adaptive Regression Splines(Friedman,1991)

• Model form• Model formRegression model Basis function

M KRegression model Basis function

M

Bf )(ˆˆˆ mK

qiB )]([)( mmBf 0MARS )(x qmkmkpmkm txiB ),(),(),( )]([)(x

m 1

k

mkmkpmkm1

),(),(),(

• Algorithms• AlgorithmsForward stepwise

0.5

Forward stepwise 0.45

Increase basis functions0.4

Increase basis functions 0 3

0.35

Backward stepwise 0 25

0.3

数の値Backward stepwise

P ff 0 2

0.25

基底関

Prune off 0.15

0.2基

Select the best tree 0.1

0.15

)]5.0([ px )]5.0([ px Select the best tree0.05

)]([ p)]([ p

0 0 2 0 4 0 6 0 8 10

0 2 0 4 0 5 0 6 0 8 1 x0 0.2 0.4 0.6 0.8 1

px0.2 0.4 0.5 0.6 0.8 1 px

1 d k t t 0 5q=1 and knot t=0.5

66

Page 7: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Bagging (Breiman 1996)Bagging (Breiman,1996)gg g ( , )

• Model form(Bagging MARS)• Model form(Bagging MARS)Regression model Each treeRegression model

1Each tree

)(ˆ1ˆ E ff MARS d l)(f)(1MARS Bagging x

e efE

f : MARS model)(xef1gg g eE

• Algorithms• AlgorithmsSamplep

Bootstrap sample Bootstrap sample Bootstrap sampleBootstrap sample Bootstrap sample

+ +・・・+ +・・・+

)(ˆ xf )(f )(f )(f)(1 xf )(2 xf )( xef )( xEf

ˆ7

averaging )(xf7

Page 8: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

AgendaAgendag

• Introduction and motivation• Introduction and motivationPre io s research• Previous research M lti i t Ad ti R iMultivariate Adaptive RegressionMultivariate Adaptive Regression

Splines(MARS)Splines(MARS)p ( )Bagging MARSBagging MARSgg g

O th d d• Our method proposedOur method proposedEnsembled MARS with nonnegative garroteEnsembled MARS with nonnegative garrote

• Example and simulation• Example and simulation• Concluding remarks• Concluding remarksg

88

Page 9: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Proposed methodProposed methodp

Motivation

a new version MARS that has both stability and interpretabilitya new version MARS that has both stability and interpretability

Stable, but less interpretable Stable and interpretable

233

1Selection 14

1

&RankingRanking

45Typical tree

45Typical tree

nonnegativeBagging Proposed methodnonnegativetBagging Proposed methodgarrote

(B i 1995)(Breiman,1995)

99

Page 10: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Ensembled MARS ith non negati e garrote(1/2)Ensembled MARS with non-negative garrote(1/2)g g

• Model form• Model formRegression model Each treeg

E )(ˆˆˆ x E fcf : MARS model :non-negative garrote estimator)(ˆ xf c)(1

x

e ee fcf : MARS model , :non negative garrote estimator )(xef ec

• Algorithms• Algorithms

Generate Bagging trees Generate Bagging trees. Att h h t d ti t i tiˆ Attach on each tree and estimate using nonnegative ec ec

garrote(Breiman,1995). e e

g ( , )Select candidate trees(If the tree is removed)0ˆ c― Select candidate trees(If , the tree is removed).0ec

)(ˆˆˆ E ff Get .)(1

x

e ee fcf• Interpretable structure through typical tree(max )

1e

c• Interpretable structure through typical tree(max )ec

1010

Page 11: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Ensembled MARS ith non negati e garrote(2/2)Ensembled MARS with non-negative garrote(2/2)g g

ti t (B i 1995)non-negative garrote (Breiman,1995)

N P

P 2)(ˆP

pnppn

Pp xcYc

P

2)(1 )ˆ(min arg}ˆ{ scc pp ,0  , subject to ,

n pnppn

cp P

p 1 1}{1 )(g}{

1

p

pp 1

,, j ,

h i th l t ti t d P1where is the least square estimator and .p Ps 1

Ensembled MARS with non-negative garrote g g

N E E

N E

E fcYc 2))(ˆ(minarg}ˆ{ x 10 E

ccbj t t neenc

e fcYcE

1 1}{1 ))((min arg}{

1

x 1,01

ee cc  , subject to , n ece 1 1}{ 1 1e

ˆwhere is MARS model.)(ˆnef x )( nef

h t i ticharacteristics

• All indicates BaggingEc /1All indicates Bagging.Ece /1

• Selection of optimal is unnecessary( ). s 1sp y( )s s

1111

Page 12: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

AgendaAgendag

• Introduction and motivation• Introduction and motivationPre io s research• Previous research M lti i t Ad ti R iMultivariate Adaptive RegressionMultivariate Adaptive Regression

Splines(MARS)Splines(MARS)p ( )Bagging MARSBagging MARSgg g

O th d d• Our method proposedOur method proposedEnsembled MARS with non-negative garroteEnsembled MARS with non negative garrote

• Example and simulation• Example and simulation• Concluding remarks• Concluding remarksg

1212

Page 13: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Literature exampleLiterature examplep

Prostate cancer data (Stamney et al 1989: Tibshirani 1996)Prostate cancer data (Stamney et al.,1989: Tibshirani,1996)

L l f t t ifi tiy• : Level of prostate-specific antigenyT• : Clinical measuresT

81 ),...,( xxx: Log of tumor size

81 ), ,(x : Log of tumor size1x

: Weight of prostate2x : Weight of prostate P ti t’

2

: Patient’s age 3x: Log of benign prostatic hyperplasia amount4x : Log of benign prostatic hyperplasia amount4x: Dummy variables of whether it is metastasizing to seminal vesicle5x y g: Log of capsular penetration

5

x : Log of capsular penetration 6x: Gleason score7x : Gleason scoreGl ’ ti f 4 5

7

x : Gleason score’s ratio of 4 or 58x

• Sample size : 97Np

1313

Page 14: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Literature exampleLiterature examplep

0 50.5

0.45

0 40.4

V

0 35CV

0.35

GC

G

0.3

0 250.25

0 20.2

Ensembled BaggingMARS NNG

gg gMARSMARS-NNG MARS

1414

Page 15: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Literature exampleLiterature examplep

• Number of trees• Number of treesBagging Ensembled MARS-NNG

97 997 9

• Structure• StructureBagging Ensembled MARS-NNG

1x 2x 2x2x

x 1x4x

Typical treecandidates

Typical treecandidates

1515

Page 16: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Small simulationSmall simulation

• Design• Design Model(Friedman 1991) Model(Friedman,1991)

510)50(20)sin(10 2 xxxxxy where is )10(N,510)5.0(20)sin(10 54321 xxxxxy where is . )1,0(N

T i i l i 100 Training sample size: 100 Testing sample size: 1 000 Testing sample size: 1,000 Number of simulation: 100

M th d• MethodMethod MARS B i MARS E bl d MARS NNG MARS, Bagging MARS, Ensembled MARS-NNG

E l ti• EvaluationMSE (St d di d )MSESTD(Standardized mean square error)( q )

1616

Page 17: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Small simulationSmall simulation

0.07

0.065

DE

ST

0 06MS

E

0.06M

0 0550.055

0.05

Ensembledse b edMARS-NNG Bagging MARS MARSMARS NNG

Number 11 6Numberof trees

11.6( d)

100 1of trees (averaged)

1717

Page 18: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

AgendaAgendag

• Introduction and motivation• Introduction and motivationPre io s research• Previous research M lti i t Ad ti R iMultivariate Adaptive RegressionMultivariate Adaptive Regression

Splines(MARS)Splines(MARS)p ( )Bagging MARSBagging MARSgg g

O th d d• Our method proposedOur method proposedEnsembled MARS with non-negative garroteEnsembled MARS with non negative garrote

• Example and simulation• Example and simulation• Concluding remarks• Concluding remarksg

1818

Page 19: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Concluding remarksConcluding remarksg

• We proposed a new ensembled method of• We proposed a new ensembled method of p pMARSMARS.O h d d i bl d i blOur method proposed is stable and interpretable.p p p

• Ensembled MARS NNG provided superior• Ensembled MARS-NNG provided superior p por comparable results to MARS andor comparable results to MARS and pBagging MARSBagging MARS.gg g

1919

Page 20: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

ReferencesReferences

• Breiman, L. (1995) Better subset regression using the nonnegative , ( ) g g ggarrote. Technometrics, 37,373-384.g , ,

• Breiman L (1996) Bagging predictors Machine Learning 24 123-• Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140140. B i L F i d J H Ol h R A & St C J (1984)• Breiman, L., Friedman, J. H., Olshen, R. A. & Stone, C. J. (1984). Cl ifi ti A d R i T W d thClassification And Regression Trees. Wadsworth.

• Friedman, J. H. (1991) Multivariate Adaptive Regression SplinesFriedman, J. H. (1991) Multivariate Adaptive Regression Splines (with discussion) Annals of Statistics 19 1-141(with discussion). Annals of Statistics,19, 1 141.

• Friedman J H (2001) Greedy function approximation: a gradient• Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine Ann Statist 29(5) 1189 1232boosting machine. Ann. Statist., 29(5), 1189-1232.

• Meinshausen N (2009): Node Harvest: simple and interpretableMeinshausen, N. (2009): Node Harvest: simple and interpretable regression and classification Arxiv preprint arXiv:0910 2145regression and classification. Arxiv preprint arXiv:0910.2145.

• Motogaito, H., Sugimoto, T. & Goto, M. (2007): Multivariate AdaptiveMotogaito, H., Sugimoto, T. & Goto, M. (2007): Multivariate Adaptive Regression Splines with Non negative Garrote Estimator JapaneseRegression Splines with Non-negative Garrote Estimator. Japanese J A l St ti t 36 99 118 (i J )J. Appl. Statist., 36, 99-118 (in Japanese).

• Yuan M & Lin Y (2007) On the non negative garrote estimator J• Yuan, M. & Lin, Y. (2007) On the non-negative garrote estimator. J. R St ti t S B 69(2) 143 161R. Statist. Soc., B 69(2), 143-161.

2020

Page 21: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Thank you very much for your attentionThank you very much for your attentiony y y

h-motogt@sigmath es osaka-u ac [email protected]

2121

Page 22: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Back pBack upBack upp

2222

Page 23: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Small simulationSmall simulation

0 06620 07

0.0597 0.06620.07

0 05950 065

0.05950.065

0 05450,0545

0 060.0567

0.06

0 0 440.0544

0 0550.0550 06090.0609

0.05 0.05730.0570

0.05730.0570

EnsembledEnsembledMARS-NNG Bagging MARS MARSMARS-NNG gg g

2323

Page 24: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Literature exampleLiterature examplep

1x x x1x 1x 8x 8x 1x1

x x x2x 3x 6x

MARSMARSMARSMARSMARSMARS24242424

Page 25: Ensembled Multivariate Adaptive Regression ......MARS. Ohd dibldi blOur method ppp proposed is stable and interpretable. •EnsembledMARSEnsembled MARS-NNGprovidedsuperiorNNG providedpp

Literature exampleLiterature examplep

x1x 2x 2x2x1

xx 1x4x

EnsembledEnsembledEnsembledEnsembledMARS NNGMARS NNGMARS-NNGMARS-NNG

25252525