CLASSIFICATION: Ensemble Methods
Combines multiple models
Construct multiple classifiers from the training set
Aggregate their predictions on the testing set
Meta-algorithm
CLASSIFICATION: Ensemble Methods
Improves stability and accuracy
Reduces variance
Helps avoid overfitting
Compensates for poor learning algorithms
Uses more computation
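To make the variance-reduction claim concrete, a standard result: for n identically distributed models with variance σ² and pairwise correlation ρ, the variance of their averaged prediction is

```latex
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} f_i(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{n}\,\sigma^2
```

Averaging drives the second term toward zero, so the less correlated the models, the larger the gain.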
ENSEMBLE METHODS: Examples
Bagging (bootstrap aggregation)
Bagging with MetaCost
Random forests
Boosting
Stacked generalization (usually used with different learning algorithms)
Bayesian model combination
ENSEMBLE METHODS: Bagging
Randomly create samples (with replacement) from a data set
Create classifiers (same type) for each sample
Run classifiers on testing sample
Use majority voting to determine classification of testing sample
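As a concrete illustration (not from the slides), here is a minimal Python sketch of these four steps, using scikit-learn decision trees as the base classifier; the dataset and hyperparameters are purely illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_models = 25
models = []
for _ in range(n_models):
    # 1-2. Randomly sample (with replacement) and fit one classifier per sample
    idx = rng.integers(0, len(X_train), size=len(X_train))
    models.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# 3. Run every classifier on the testing sample
votes = np.stack([m.predict(X_test) for m in models])  # (n_models, n_test)

# 4. Majority vote determines the final classification (binary labels here)
majority = (votes.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (majority == y_test).mean())
```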
ENSEMBLE METHODS: Bagging with MetaCost
Used when each model can output probability estimates
Probability estimates used to obtain expected cost of each prediction
Relabels training instances with the class that minimizes the expected cost
Learns a new classifier on the relabeled data
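A rough sketch of the MetaCost relabeling idea, reusing the bagging setup from the previous sketch; the 2-class cost matrix is an assumption for illustration (Domingos's MetaCost additionally refines how the probability estimates are obtained):

```python
from sklearn.ensemble import BaggingClassifier

# C[i, j] = cost of predicting class i when the true class is j (illustrative)
C = np.array([[0.0, 1.0],
              [5.0, 0.0]])

bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                        random_state=0).fit(X_train, y_train)

proba = bag.predict_proba(X_train)        # P(j | x) from the ensemble
expected_cost = proba @ C.T               # expected cost of each possible prediction
relabeled = expected_cost.argmin(axis=1)  # relabel to minimize expected cost

final = DecisionTreeClassifier().fit(X_train, relabeled)  # learn the new classifier
```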
ENSEMBLE METHODS: Random Forests
A modification of bagging applied to tree learners
Considers only a random subset of features at each split
Promotes tree diversity
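In scikit-learn this corresponds to RandomForestClassifier, where max_features controls the size of the random feature subset considered at each split; the values below are illustrative and reuse the train/test split from the bagging sketch:

```python
from sklearn.ensemble import RandomForestClassifier

# max_features="sqrt" considers only sqrt(n_features) candidate features
# per split, which is what promotes diversity among the trees
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=0).fit(X_train, y_train)
print("forest accuracy:", rf.score(X_test, y_test))
```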
ENSEMBLE METHODS: Boosting
Seeks models that complement one another
Combines models of same type
New models are constructed to better handle instances misclassified by previous models, focusing on hard-to-classify examples
Uses weighted averaging, often adaptively
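AdaBoost is the classic instance; a minimal sketch (hyperparameters illustrative, data from the bagging sketch). Each new weak learner is fit to a reweighted training set that emphasizes previously misclassified instances, and the final prediction is a weighted vote:

```python
from sklearn.ensemble import AdaBoostClassifier

boost = AdaBoostClassifier(n_estimators=50, random_state=0)  # stumps by default
boost.fit(X_train, y_train)
print("boosted accuracy:", boost.score(X_test, y_test))
```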
ENSEMBLE METHODS: Stacked Generalization
Introduced by David Wolpert, 1992
Base ("level-0") algorithms are trained on the training set
A stacking ("level-1") algorithm uses the base algorithms' predictions as inputs
ENSEMBLE METHODS: Stacked Generalization
Employs j-fold cross-validation of the training set
Train and test each of the level-0 algorithms using the split training data to create the level-0 models
Test each model on each split to create level-1 data
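A minimal scikit-learn sketch of this procedure (the base learners, level-1 model, and cv value are illustrative choices): StackingClassifier fits the level-1 model on out-of-fold predictions of the level-0 models, which is exactly the j-fold scheme above:

```python
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=0)),  # level-0
                ("svm", SVC(probability=True, random_state=0))],   # level-0
    final_estimator=LogisticRegression(),                          # level-1
    cv=5)                                                          # j-fold, j=5
stack.fit(X_train, y_train)
print("stacked accuracy:", stack.score(X_test, y_test))
```

With probability=True the level-1 inputs are class probabilities rather than hard labels, in line with the Ting and Witten (1999) observation noted below.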
ENSEMBLE METHODS: Stacked Generalization
Can be used for both supervised and unsupervised learning
Best performers in Netflix competition were forms of stacked generalization
Can even create multiple levels of stacking ("level-2", etc.), sometimes called "stacked stacking"
Works best with class probabilities (Ting and Witten, 1999)
ENSEMBLE METHODS: Bayesian Model Combination
Built upon Bayes Model Averaging and Bayes Optimal Classifier
Bayes Optimal Classifier: an ensemble (using Bayes' rule) of all hypotheses in the hypothesis space
On average, it is the ideal ensemble
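For reference, the Bayes optimal classifier is commonly written as (with C the set of classes, H the hypothesis space, and T the training data):

```latex
y = \arg\max_{c_j \in C} \sum_{h_i \in H} P(c_j \mid h_i)\, P(T \mid h_i)\, P(h_i)
```

It is intractable in practice because it sums over every hypothesis in H.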
ENSEMBLE METHODS: Bayesian Model Combination
Bayes Model Averaging
Approximates the Bayes optimal classifier
Samples from the hypothesis space, e.g., via Monte Carlo sampling
Tends to promote overfitting
Performs worse in practice than simpler techniques (e.g., bagging)
ENSEMBLE METHODS: Bayesian Model Combination
Bayes Model Combination
A correction to Bayes Model Averaging
Samples over model weightings (i.e., whole ensembles) rather than individual models
Overcomes BMA's drawback of concentrating nearly all weight on a single model
Better performance than BMA or bagging
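A toy sketch of the BMC idea, assuming pre-trained models that expose predict_proba, a held-out validation set, and Dirichlet sampling of the weightings; all of these choices are illustrative assumptions, not part of the slides:

```python
import numpy as np

def bmc_predict_proba(models, X_val, y_val, X_test, n_samples=100, seed=0):
    """Average test predictions over sampled model *weightings*, each
    weighting scored by its likelihood on the validation set."""
    rng = np.random.default_rng(seed)
    P_val = np.stack([m.predict_proba(X_val) for m in models])    # (M, n_val, K)
    P_test = np.stack([m.predict_proba(X_test) for m in models])  # (M, n_test, K)
    logliks, mixes = [], []
    for _ in range(n_samples):
        w = rng.dirichlet(np.ones(len(models)))     # one sampled ensemble weighting
        mix_val = np.tensordot(w, P_val, axes=1)    # (n_val, K)
        # log-likelihood of the validation labels under this weighting
        logliks.append(np.log(mix_val[np.arange(len(y_val)), y_val] + 1e-12).sum())
        mixes.append(np.tensordot(w, P_test, axes=1))
    logliks = np.array(logliks)
    post = np.exp(logliks - logliks.max())          # unnormalized posterior weights
    post /= post.sum()
    return np.tensordot(post, np.stack(mixes), axes=1)  # (n_test, K)
```

The returned matrix holds class probabilities per test instance; argmax over the last axis gives the predicted class.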