Ensemble Methods: Bagging and Boosting

Tufts COMP 135: Introduction to Machine Learning
https://www.cs.tufts.edu/comp/135/2019s/

Many slides attributable to: Liping Liu and Roni Khardon (Tufts), T. Q. Chen (UW), and James, Witten, Hastie, Tibshirani (ISL/ESL books)

Prof. Mike Hughes
Unit Objectives

Big idea: We can improve performance by aggregating decisions from MANY predictors

• V1: Predictors are Independently Trained
  • Using bootstrap subsamples of examples: “Bagging”
  • Using random subsets of features
  • Exemplary methods: Random Forest / ExtraTrees
• V2: Predictors are Sequentially Trained
  • Each successive predictor “boosts” performance
  • Exemplary method: XGBoost

Mike Hughes - Tufts COMP 135 - Spring 2019
Motivating Example
3 binary classifiers
Model predictions as independent random variables
Each one is correct 70% of the time
What is chance that majority vote is correct?
Motivating Example
3 binary classifiers
Model predictions as independent random variables
Each one is correct 70% of the time
What is chance that majority vote is correct?
0.784
Motivating Example
5 binary classifiers
Model predictions as independent random variables
Each one is correct 70% of the time
What is chance that majority vote is correct?
Motivating Example
5 binary classifiers
Model predictions as independent random variables
Each one is correct 70% of the time
What is chance that majority vote is correct?
0.8369…
Motivating Example
101 binary classifiers
Model predictions as independent random variables
Each one is correct 70% of the time
What is chance that majority vote is correct?
Motivating Example
101 binary classifiers
Model predictions as independent random variables
Each one is correct 70% of the time
What is chance that majority vote is correct?
>0.99…
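All three answers follow from the binomial distribution: the majority of n independent classifiers is correct when more than n/2 of them are. A short standard-library Python sketch reproduces the numbers above:

```python
from math import comb

def p_majority_correct(n, p=0.7):
    """Probability that a majority of n independent classifiers,
    each correct with probability p, votes for the right answer."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

print(p_majority_correct(3))    # 0.784
print(p_majority_correct(5))    # ~0.8369
print(p_majority_correct(101))  # > 0.99
```

Note how the independence assumption does all the work here: correlated classifiers would gain far less from voting.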
Key Idea: Diversity
• Vary the training data
Bootstrap Sampling
Bootstrap Sampling in Python
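The slide's code did not survive extraction; a minimal NumPy sketch of drawing one bootstrap replica (the array `x` here is a made-up example) looks like:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.asarray([3.1, 4.1, 5.9, 2.6, 5.3])

# Draw N indices uniformly WITH replacement: in the replica some
# examples appear more than once and others are left out entirely.
idx = rng.choice(len(x), size=len(x), replace=True)
replica = x[idx]
print(replica)
```

Every replica has the same size as the original dataset but a different (random) multiset of examples.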
Bootstrap Aggregation: BAgg-ing
• Draw B “replicas” of the training set
  • Use bootstrap sampling with replacement
• Make prediction by averaging
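A sketch of bagging with scikit-learn's `BaggingRegressor` (the dataset is synthetic; `BaggingRegressor`'s default base estimator is a decision tree, which matches the tree examples on the following slides):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                       random_state=0)

# B = 25 bootstrap replicas, one regression tree fit per replica;
# the final prediction is the average of the 25 trees.
bagger = BaggingRegressor(n_estimators=25, random_state=0).fit(X, y)

yhat = bagger.predict(X[:3])
```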
Regression Example: 1 tree
Images are taken from Adele Cutler’s slides
![Page 14: Ensemble Methods - Department of Computer Science · 2019. 5. 3. · Extremely Randomized Trees aka “ExtraTrees” in sklearn Mike Hughes - Tufts COMP 135 - Spring 2019 25 Speed](https://reader036.vdocuments.net/reader036/viewer/2022081620/610110d0a94925402633e3a5/html5/thumbnails/14.jpg)
Regression Example: 10 trees
The solid black line is the ground truth; the red lines are the predictions of single regression trees
Regression Example: Average of 10 trees
The solid black line is the ground truth; the blue line is the average of the 10 regression trees’ predictions
Binary Classification
Decision Boundary: 1 tree
Decision boundary: 25 trees
Average over 25 trees
Variance of averages
• Given B independent observations
• Each one has variance v
• Compute the mean of the B observations
• What is variance of this estimator?
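For independent observations the variance of the mean works out as follows (this is the answer the slide is building toward):

\[
\mathrm{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} Z_b\right)
= \frac{1}{B^2}\sum_{b=1}^{B}\mathrm{Var}(Z_b)
= \frac{Bv}{B^2}
= \frac{v}{B}
\]

So averaging B independent observations shrinks the variance by a factor of B. In practice bagged trees are correlated (they share training data), so the reduction is real but less than v/B.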
Why Bagging Works: Reduce Variance!
• Flexible learners applied to small datasets have high variance w.r.t. the data distribution: a small change in the training set leads to a big change in predictions on the heldout set
• Bagging decreases heldout error by decreasing the variance of predictions
• Bagging can be applied to any base classifiers/regressors
Another Idea for Diversity
• Vary the features
Random Forest
Combine example diversity AND feature diversity
For t = 1 to T (# trees):
    Create a bootstrap sample from the training set.
    Greedily train a tree on that sample:
        For each node, within a maximum depth:
            Randomly select m of the F features
            Find the best split among those m features
Average the T trees to get predictions for new data.
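In scikit-learn this recipe corresponds roughly to `RandomForestClassifier`; a sketch on a synthetic dataset, with illustrative hyperparameter values:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# T = 100 trees, each fit on a bootstrap replica of the training set;
# at every node only m = sqrt(F) randomly chosen features are
# considered as split candidates.
forest = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",
    bootstrap=True,
    random_state=0,
).fit(X, y)

print(forest.score(X, y))
```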
Extremely Randomized Trees
aka “ExtraTrees” in sklearn
Speed and feature diversity
For t = 1 to T (# trees):
    Create a bootstrap sample from the training set.
    Greedily train a tree on that sample:
        For each node, within a maximum depth:
            Randomly select m of the F features
            Try 1 random split at each of the m features,
            then select the best of these random splits
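A sketch with sklearn's `ExtraTreesClassifier` on a synthetic dataset. Note one difference from the pseudocode above: sklearn's implementation trains each tree on the full training set by default (`bootstrap=False`), relying on the random split thresholds alone for diversity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# At each node, one random threshold is tried per candidate feature and
# the best of those random splits is kept -- much cheaper than an
# exhaustive search over all possible thresholds.
extra = ExtraTreesClassifier(
    n_estimators=100,
    max_features="sqrt",
    random_state=0,
).fit(X, y)

print(extra.score(X, y))
```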
Single tree
Credit: ISL textbook
Random Forest in Industry
Microsoft Kinect RGB-D camera
Summary of Ensemble v1: Independent Predictors

• Average over independent base predictors
• Why it works: reduces variance
• PRO
  • Often better heldout performance than the base model
• CON
  • Training B separate models is expensive
Vocabulary: Residual
Ensemble Method v2: Sequentially Predict Residual

• Model f1: trained on (x, y) pairs
  • Capture residual: r1 = y − f1(x)
• Model f2: trained on (x, r1) pairs
  • Capture residual: r2 = r1 − f2(x)
• Repeat!

Combine weak learners into a powerful committee
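A minimal sketch of this residual-fitting loop, using depth-1 scikit-learn trees as the weak learners on made-up 1-D data:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

models, residual = [], y.copy()
for _ in range(50):
    f_t = DecisionTreeRegressor(max_depth=1).fit(X, residual)  # weak learner
    models.append(f_t)
    residual = residual - f_t.predict(X)   # r_t = r_{t-1} - f_t(x)

# The committee's prediction is the SUM of all the weak learners
# (not an average, unlike bagging).
yhat = sum(m.predict(X) for m in models)
```

Each round fits only what the previous rounds failed to explain, so the training residual shrinks as more stumps are added.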
AdaBoost (Adaptive Boosting): reweight misclassified examples
ESL textbook
Boosting with a depth-1 tree (“stump”)
Boosting for Regression Trees
ISL textbook
Boosted Tree: Optimization
Gradient Boosting Algorithm
1. Compute the gradient of the loss at each training example
2. Decide tree structure by fitting to the gradients
3. Decide leaf values by progressively minimizing the loss
4. Add up the trees to get the final model
What about regularization?
Minimization objective when adding tree t:
loss function + regularization (limits the complexity of tree t)
https://xgboost.readthedocs.io/
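Written out, following the XGBoost documentation the slide cites, the objective at round t is:

\[
\mathrm{obj}^{(t)} \;=\; \sum_{i=1}^{n} \ell\!\left(y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i)\right) \;+\; \Omega(f_t)
\]

The first term is the loss over training examples, evaluated with the new tree \(f_t\) added to the predictions of the previous \(t-1\) trees; the second term \(\Omega(f_t)\) penalizes the complexity of tree t.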
Regularization
We can penalize:
• Number of nodes in the tree
• Depth of the tree
• Scalar value predicted in each region (L2 penalty)
Credit: T. Chen, https://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf
Example Regularization Term
Credit: T. Chen, https://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf
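The example term from T. Chen's slides (the figure did not survive extraction) is:

\[
\Omega(f) \;=\; \gamma T \;+\; \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^2
\]

where, in this formula's notation, \(T\) is the number of leaves in the tree and \(w_j\) is the score predicted at leaf j: \(\gamma\) charges for each extra leaf, and \(\lambda\) applies an L2 penalty to the leaf values.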
To Improve Gradient Boosting
Can extend gradient boosting with:
• 2nd-order gradient information (a Newton step)
• Penalties on tree complexity
• A very smart practical implementation

Result: Extreme Gradient Boosting, aka XGBoost (T. Chen & C. Guestrin)
XGBoost: Extreme Gradient Boosting

• Kaggle competitions in 2015
  • 29 total winning solutions to challenges published
  • 17 / 29 (58%) used XGBoost
  • 11 / 29 (37%) used deep neural networks
More details (beyond this class)
ESL textbook, Section 10.10
Good slide deck by T. Q. Chen (first author of XGBoost):
• https://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf
Summary of Boosting

PRO
• Like all tree methods, invariant to scaling of inputs (no need for careful feature normalization)
• Can be scalable

CON
• Greedy sequential fitting may not be globally optimal

IN PRACTICE
• XGBoost is very popular in many benchmark competitions and industrial applications
Unit Objectives

Big idea: We can improve performance by aggregating decisions from MANY predictors

• V1: Predictors are Independently Trained
  • Using bootstrap subsamples of examples: “Bagging”
  • Using random subsets of features
  • Exemplary methods: Random Forest / ExtraTrees
• V2: Predictors are Sequentially Trained
  • Each successive predictor “boosts” performance
  • Exemplary method: XGBoost