Machine Learning: Ensembles of Classifiers
Madhavan Mukund
Chennai Mathematical Institute
http://www.cmi.ac.in/~madhavan
AlgoLabs Certification Course on Machine Learning
24 February, 2015
Bottlenecks in building a classifier
Noise: Uncertainty in the classification function
Bias: Systematic inability to predict a particular value
Variance: Variation in the model based on the sample of training data
Models with high variance are unstable
Decision trees: the choice of attributes is influenced by the entropy of the training data
Overfitting: model is tied too closely to training set
Is there an alternative to pruning?
Multiple models
Build many models (ensemble) and “average” them
How do we build different models from the same data?
Strategy to build the model is fixed
Same data will produce same model
Choose different samples of training data
Bootstrap Aggregating = Bagging
Training data has N items
TD = {d1, d2, . . . , dN}
Pick a random sample with replacement
Pick an item at random (probability 1/N)
Put it back into the set
Repeat K times
Some items in the sample will be repeated
If the sample size is the same as the data size (K = N), the expected number of distinct items is (1 − 1/e) · N, since a given item is missed by all N draws with probability (1 − 1/N)^N ≈ 1/e
Approx 63.2%
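The 63.2% figure is easy to check empirically. A minimal sketch (function and variable names are illustrative, not from the lecture):

```python
import random

def bootstrap_sample(data):
    """Draw len(data) items uniformly at random, with replacement (K = N)."""
    return [random.choice(data) for _ in range(len(data))]

random.seed(0)
N = 10_000
data = list(range(N))
distinct = len(set(bootstrap_sample(data)))
# fraction of distinct items should be close to 1 - 1/e ≈ 0.632
print(distinct / N)
```

For large N the printed fraction settles near 0.632, matching the (1 − 1/e) · N expectation.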
Bootstrap Aggregating = Bagging
Sample with replacement of size N : bootstrap sample
Contains approx 63% of the distinct items of the full training data
Take K such samples
Build a model for each sample
Models will vary because each uses different training data
Final classifier: report the majority answer
Assumptions: binary classifier, K odd
Provably reduces variance
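The procedure on this slide can be sketched end to end in Python. The base learner here (a one-dimensional threshold "stump") and the toy data are illustrative assumptions, not code from the lecture; any model-building strategy could be plugged in:

```python
import random

def train_stump(sample):
    """Base learner: choose the threshold t that best separates the labels,
    classifying x >= t as +1 and x < t as -1."""
    best_t, best_errs = None, None
    for t in sorted(x for x, _ in sample):
        errs = sum(1 for x, y in sample if (1 if x >= t else -1) != y)
        if best_errs is None or errs < best_errs:
            best_t, best_errs = t, errs
    t = best_t
    return lambda x: 1 if x >= t else -1

def bag(data, K=11):
    """Train K models, one per bootstrap sample; the final classifier
    reports the majority answer (binary labels in {-1, +1}, K odd)."""
    models = []
    for _ in range(K):
        sample = [random.choice(data) for _ in range(len(data))]
        models.append(train_stump(sample))
    return lambda x: 1 if sum(m(x) for m in models) > 0 else -1

random.seed(1)
data = [(x, 1 if x > 5 else -1) for x in range(11)]  # toy 1-D training set
clf = bag(data)
print(clf(2), clf(9))  # a point from each side of the boundary
```

The models differ only because each sees a different bootstrap sample; the majority vote smooths out their individual variance.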
Bagging with decision trees

(Slides 10-15 are illustrative figures, not reproduced in this transcript.)
Random Forest
Applying bagging to decision trees with a further twist
Each data item has M attributes
Normally, decision tree building chooses one among M attributes, then one among the remaining M − 1, . . .
Instead, fix a small limit m < M
At each level, choose m of the available attributes at random, and only examine these for the next split
No pruning
Seems to improve on bagging in practice
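The twist amounts to one extra step in the tree-building loop: restrict each split to a random subset of attributes. A sketch (names are illustrative; a common heuristic, not stated on the slide, is m ≈ √M):

```python
import random

def candidate_attributes(available, m):
    """At each level of the tree, examine only m randomly chosen
    attributes out of those still available, instead of all of them."""
    return random.sample(available, min(m, len(available)))

random.seed(0)
M = 16                      # attributes per data item
available = list(range(M))  # attribute indices not yet used on this path
print(candidate_attributes(available, 4))  # 4 indices to score for the split
```

The tree builder would then compute its usual entropy-based score over only these m candidates.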
Boosting
Looking at a few attributes gives “rule of thumb” heuristic
If Amla does well, South Africa usually wins
If the opening bowlers take at least 2 wickets within 5 overs, India usually wins
. . .
Each heuristic is a weak classifier
Can we combine such weak classifiers to boost performance and build a strong classifier?
Adaptively boosting a weak classifier (AdaBoost)
Weak binary classifier: output is {−1,+1}
Initially, all training inputs have equal weight, D1
Build a weak classifier C1 for D1
Compute its error rate, e1 (details suppressed)
Increase the weightage of all incorrectly classified inputs, giving D2
Build a weak classifier C2 for D2
Compute its error rate, e2
Increase the weightage of all incorrectly classified inputs, giving D3
. . .
Combine the outputs o1, o2, . . . , ok of C1, C2, . . . , Ck as w1o1 + w2o2 + · · · + wkok
Each weight wj depends on the error rate ej
Report the sign (negative ↦ −1, positive ↦ +1)
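The loop above can be sketched concretely using the standard AdaBoost choices for the suppressed details: wj = ½ ln((1 − ej)/ej), and each input's weight multiplied by exp(−wj · y · oj) and renormalized. The threshold-stump weak learner and the toy data are my own illustrative assumptions:

```python
import math

def weighted_stump(data, w):
    """Weak learner: threshold t and direction s minimizing weighted error,
    predicting s when x >= t and -s otherwise."""
    best = (None, None, float("inf"))        # (t, s, weighted error)
    for t, _ in data:
        for s in (1, -1):
            err = sum(wi for (x, y), wi in zip(data, w)
                      if (s if x >= t else -s) != y)
            if err < best[2]:
                best = (t, s, err)
    t, s, err = best
    return (lambda x: s if x >= t else -s), err

def adaboost(data, rounds=5):
    n = len(data)
    w = [1.0 / n] * n                        # D1: equal weights
    ensemble = []                            # pairs (wj, Cj)
    for _ in range(rounds):
        h, e = weighted_stump(data, w)
        e = max(e, 1e-10)                    # guard against a perfect stump
        alpha = 0.5 * math.log((1 - e) / e)  # weight wj from error rate ej
        ensemble.append((alpha, h))
        # increase the weightage of incorrectly classified inputs
        w = [wi * math.exp(-alpha * y * h(x)) for (x, y), wi in zip(data, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    # report the sign of w1*o1 + ... + wk*ok
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# toy set that no single stump can fit: +1 only in the middle
data = [(0, -1), (1, 1), (2, 1), (3, -1)]
clf = adaboost(data)
print([clf(x) for x, _ in data])             # → [-1, 1, 1, -1]
```

After a few rounds the weighted vote of stumps classifies all four points correctly, even though each individual stump must get at least one wrong.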
Boosting

(Slides 24-31 are illustrative figures, not reproduced in this transcript.)
Summary
Variance in unstable models (e.g., decision trees) can be reduced using an ensemble (bagging)
Further refinement for bagging with decision trees:
Choose a random small subset of attributes to explore at each level
Random Forest
Combining weak classifiers (“rules of thumb”) — boosting
References

Bagging Predictors, Leo Breiman,
http://statistics.berkeley.edu/sites/default/files/tech-reports/421.pdf

Random Forests, Leo Breiman and Adele Cutler,
https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

A Short Introduction to Boosting, Yoav Freund and Robert E. Schapire,
http://www.site.uottawa.ca/~stan/csi5387/boost-tut-ppr.pdf

AdaBoost and the Super Bowl of Classifiers: A Tutorial Introduction to Adaptive Boosting, Raul Rojas,
http://www.inf.fu-berlin.de/inst/ag-ki/adaboost4.pdf