Variance Reduction and Ensemble Methods
Nicholas Ruozzi, University of Texas at Dallas
Based on the slides of Vibhav Gogate and David Sontag
Last Time
• PAC learning
• Bias/variance tradeoff
  • small hypothesis spaces (not enough flexibility) can have high bias
  • rich hypothesis spaces (too much flexibility) can have high variance
• Today: more on this phenomenon and how to get around it
2
Intuition
• Bias
  • Measures the accuracy or quality of the algorithm
  • High bias means a poor match
• Variance
  • Measures the precision or specificity of the match
  • High variance means the learned hypothesis changes substantially from one training set to another
• We would like to minimize each of these
• Unfortunately, we can't do this independently; there is a trade-off
3
Bias-Variance Analysis in Regression
• True function is $y = f(x) + \epsilon$
  • where the noise $\epsilon$ is normally distributed with zero mean and standard deviation $\sigma$
• Given a set of training examples $(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})$, we fit a hypothesis $h(x) = w^T x + b$ to the data to minimize the squared error
$$\sum_i \left( y^{(i)} - h(x^{(i)}) \right)^2$$
4
2-D Example
Sample 20 points from $y = x + 2\sin(1.5x) + \epsilon$, with $\epsilon \sim N(0, 0.2)$
5
2-D Example
50 fits (20 examples each)
6
Bias-Variance Analysis
• Given a new data point $x'$ with observed value $y' = f(x') + \epsilon$, we want to understand the expected prediction error
• Suppose that training samples are drawn independently from a distribution $p(x)$; we want to compute the expected error of the estimator
$$E\left[ \left( y' - h_D(x') \right)^2 \right]$$
7
Probability Reminder
• Variance of a random variable, $X$:
$$\mathrm{Var}(X) = E\left[(X - E[X])^2\right] = E\left[X^2 - 2XE[X] + E[X]^2\right] = E[X^2] - E[X]^2$$
• Properties of $\mathrm{Var}(X)$:
$$\mathrm{Var}(aX) = E[a^2X^2] - E[aX]^2 = a^2\,\mathrm{Var}(X)$$
8
Bias-Variance-Noise Decomposition
9
$$\begin{aligned}
E\left[(y' - h_D(x'))^2\right] &= E\left[h_D(x')^2 - 2\,h_D(x')\,y' + y'^2\right] \\
&= E[h_D(x')^2] - 2\,E[h_D(x')]\,E[y'] + E[y'^2] \\
&= \mathrm{Var}(h_D(x')) + E[h_D(x')]^2 - 2\,E[h_D(x')]\,f(x') + \mathrm{Var}(y') + f(x')^2 \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \mathrm{Var}(\epsilon) \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \sigma^2
\end{aligned}$$
Bias-Variance-Noise Decomposition
10
$$\begin{aligned}
E\left[(y' - h_D(x'))^2\right] &= E\left[h_D(x')^2 - 2\,h_D(x')\,y' + y'^2\right] \\
&= E[h_D(x')^2] - 2\,E[h_D(x')]\,E[y'] + E[y'^2] \\
&= \mathrm{Var}(h_D(x')) + E[h_D(x')]^2 - 2\,E[h_D(x')]\,f(x') + \mathrm{Var}(y') + f(x')^2 \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \mathrm{Var}(\epsilon) \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \sigma^2
\end{aligned}$$
The samples $D$ and the noise $\epsilon$ are independent (this lets $E[h_D(x')\,y']$ factor in the second line)
Bias-Variance-Noise Decomposition
11
$$\begin{aligned}
E\left[(y' - h_D(x'))^2\right] &= E\left[h_D(x')^2 - 2\,h_D(x')\,y' + y'^2\right] \\
&= E[h_D(x')^2] - 2\,E[h_D(x')]\,E[y'] + E[y'^2] \\
&= \mathrm{Var}(h_D(x')) + E[h_D(x')]^2 - 2\,E[h_D(x')]\,f(x') + \mathrm{Var}(y') + f(x')^2 \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \mathrm{Var}(\epsilon) \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \sigma^2
\end{aligned}$$
Follows from the definition of variance: $E[X^2] = \mathrm{Var}(X) + E[X]^2$, applied in the third line
Bias-Variance-Noise Decomposition
12
$$\begin{aligned}
E\left[(y' - h_D(x'))^2\right] &= E\left[h_D(x')^2 - 2\,h_D(x')\,y' + y'^2\right] \\
&= E[h_D(x')^2] - 2\,E[h_D(x')]\,E[y'] + E[y'^2] \\
&= \mathrm{Var}(h_D(x')) + E[h_D(x')]^2 - 2\,E[h_D(x')]\,f(x') + \mathrm{Var}(y') + f(x')^2 \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \mathrm{Var}(\epsilon) \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \sigma^2
\end{aligned}$$
$E[y'] = f(x')$, since the noise has zero mean
Bias-Variance-Noise Decomposition
13
$$\begin{aligned}
E\left[(y' - h_D(x'))^2\right] &= E\left[h_D(x')^2 - 2\,h_D(x')\,y' + y'^2\right] \\
&= E[h_D(x')^2] - 2\,E[h_D(x')]\,E[y'] + E[y'^2] \\
&= \mathrm{Var}(h_D(x')) + E[h_D(x')]^2 - 2\,E[h_D(x')]\,f(x') + \mathrm{Var}(y') + f(x')^2 \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \mathrm{Var}(\epsilon) \\
&= \mathrm{Var}(h_D(x')) + \left(E[h_D(x')] - f(x')\right)^2 + \sigma^2
\end{aligned}$$
The three terms in the last line are the variance, the bias (squared), and the noise
Bias, Variance, and Noise
• Variance: $E\left[(h_D(x') - E[h_D(x')])^2\right]$
  • Describes how much $h_D(x')$ varies from one training set $D$ to another
• Bias: $E[h_D(x')] - f(x')$
  • Describes the average error of $h_D(x')$
• Noise: $E\left[(y' - f(x'))^2\right] = E[\epsilon^2] = \sigma^2$
  • Describes how much $y'$ varies from $f(x')$
14
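The decomposition can also be checked numerically. Below is a small simulation (not from the slides; NumPy, with an arbitrary test point and training distribution chosen for illustration) that refits a least-squares line to many independent 20-point training sets drawn from the running 2-D example and estimates the variance and squared-bias terms at a fixed $x'$:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.2                        # noise standard deviation

def f(x):
    # true function from the 2-D example
    return x + 2 * np.sin(1.5 * x)

def fit_line(x, y):
    # least-squares fit of h(x) = w*x + b
    A = np.column_stack([x, np.ones_like(x)])
    w, b = np.linalg.lstsq(A, y, rcond=None)[0]
    return w, b

x_test = 2.0
preds = []
for _ in range(500):               # 500 independent training sets D
    x = rng.uniform(0, 5, 20)      # 20 examples each
    y = f(x) + rng.normal(0, sigma, 20)
    w, b = fit_line(x, y)
    preds.append(w * x_test + b)   # h_D(x')

preds = np.array(preds)
variance = preds.var()                       # Var(h_D(x'))
bias_sq = (preds.mean() - f(x_test)) ** 2    # (E[h_D(x')] - f(x'))^2
noise = sigma ** 2                           # sigma^2
print(variance, bias_sq, noise)
```

The expected squared prediction error at $x'$ is then approximately `variance + bias_sq + noise`.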
2-D Example
50 fits (20 examples each)
15
Bias
16
Variance
17
Noise
18
Bias
• Low bias
  • ?
• High bias
  • ?
19
Bias
• Low bias
  • Linear regression applied to linear data
  • 2nd-degree polynomial applied to quadratic data
• High bias
  • Constant function applied to non-constant data
  • Linear regression applied to highly non-linear data
20
Variance
• Low variance
  • ?
• High variance
  • ?
21
Variance
• Low variance
  • Constant function
  • Model independent of training data
• High variance
  • High-degree polynomial
22
Bias/Variance Tradeoff
• (bias² + variance) is what counts for prediction
• As we saw in PAC learning, we often have
  • Low bias ⇒ high variance
  • Low variance ⇒ high bias
• How can we deal with this in practice?
23
Reduce Variance Without Increasing Bias
• Averaging reduces variance: let $Z_1, \ldots, Z_k$ be i.i.d. random variables; then
$$\mathrm{Var}\left( \frac{1}{k} \sum_i Z_i \right) = \frac{1}{k}\,\mathrm{Var}(Z_i)$$
• Idea: average models to reduce model variance
• The problem:
  • Only one training set
  • Where do multiple models come from?
24
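The averaging identity admits a quick numerical check (not from the slides; NumPy, with arbitrary trial counts):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 10
# 100000 trials, each with k i.i.d. draws of a variable with variance 1
samples = rng.normal(loc=0.0, scale=1.0, size=(100_000, k))
avg = samples.mean(axis=1)    # (1/k) * sum_i Z_i, per trial
print(np.var(avg))            # close to Var(Z_i)/k = 1/10
```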
Bagging: Bootstrap Aggregation
• Take repeated bootstrap samples from training set $D$ (Breiman, 1994)
• Bootstrap sampling: given a set $D$ containing $m$ training examples, create $D'$ by drawing $m$ examples at random with replacement from $D$
• Bagging:
  • Create $k$ bootstrap samples $D_1, \ldots, D_k$
  • Train a distinct classifier on each $D_i$
  • Classify new instances by majority vote / average
25
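The three bagging steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the slides' code: it assumes regression with a least-squares line as the base learner, and all function names are mine.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_sample(X, y, rng):
    # draw m examples at random *with replacement* from D
    m = len(X)
    idx = rng.integers(0, m, size=m)
    return X[idx], y[idx]

def fit_line(X, y):
    # base learner: least-squares line h(x) = w*x + b
    A = np.column_stack([X, np.ones_like(X)])
    w, b = np.linalg.lstsq(A, y, rcond=None)[0]
    return lambda x, w=w, b=b: w * x + b

def bag_predict(models, x):
    # regression: average the individual models' predictions
    return np.mean([h(x) for h in models])

# toy training set from the 2-D example
X = rng.uniform(0, 5, 20)
y = X + 2 * np.sin(1.5 * X) + rng.normal(0, 0.2, 20)

# k = 50 bootstrap samples, one model per sample
models = [fit_line(*bootstrap_sample(X, y, rng)) for _ in range(50)]
print(bag_predict(models, 2.0))
```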
Bagging: Bootstrap Aggregation
[image from the slides of David Sontag] 26
Bagging
Data   1   2   3   4   5   6   7   8   9  10
BS 1   7   1   9  10   7   8   8   4   7   2
BS 2   8   1   3   1   1   9   7   4  10   1
BS 3   5   4   8   8   2   5   5   7   8   8
27
• Build a classifier from each bootstrap sample
• In each bootstrap sample, each data point has probability $(1 - 1/m)^m$ of not being selected
• The expected number of distinct data points in each sample is then
$$m \cdot \left(1 - \left(1 - \frac{1}{m}\right)^m\right) \approx m \cdot (1 - \exp(-1)) \approx 0.632 \cdot m$$
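The 0.632 figure can be verified directly (plain Python; the value of $m$ here is an arbitrary large sample size):

```python
import math

m = 1_000_000                          # number of training examples
p_missing = (1 - 1 / m) ** m           # P(a given point is never drawn)
distinct_fraction = 1 - p_missing      # expected fraction of distinct points
print(p_missing, distinct_fraction)    # p_missing is close to exp(-1) ~ 0.368
```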
Bagging
28
• Build a classifier from each bootstrap sample
• In each bootstrap sample, each data point has probability $(1 - 1/m)^m$ of not being selected
• If we have 1 TB of data, each bootstrap sample will contain ~632 GB of distinct data (this can present computational challenges, e.g., you shouldn't replicate the data)
Decision Tree Bagging
[image from the slides of David Sontag] 29
Decision Tree Bagging (100 Bagged Trees)
[image from the slides of David Sontag] 30
Bagging Results
Breiman, "Bagging Predictors," Berkeley Statistics Department TR #421, 1994
32
[figure: results without bagging vs. with bagging]
Random Forests
33
Random Forests
• Ensemble method specifically designed for decision tree classifiers
• Introduces two sources of randomness: "bagging" and "random input vectors"
  • Bagging method: each tree is grown using a bootstrap sample of the training data
  • Random vector method: the best split at each node is chosen from a random sample of $k$ attributes instead of all attributes
34
Random Forest Algorithm
• For $b = 1$ to $B$:
  • Draw a bootstrap sample of size $m$ from the data
  • Grow a tree $T_b$ using the bootstrap sample as follows:
    • Choose $k$ attributes uniformly at random from the data
    • Choose the best attribute among the $k$ to split on
    • Split on the best attribute and recurse (until partitions have fewer than $s_{min}$ number of nodes)
• Prediction for a new data point $x$:
  • Regression: $\frac{1}{B} \sum_b T_b(x)$
  • Classification: choose the majority class label among $T_1(x), \ldots, T_B(x)$
35
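The algorithm above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: it grows depth-1 trees (stumps) instead of recursing down to $s_{min}$, uses regression averaging, and all names and parameter values are mine.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(X, y, attrs):
    # pick the best (attribute, threshold) split among the k sampled
    # attributes, minimizing the squared error of a two-leaf split
    best = None
    for j in attrs:
        for t in np.unique(X[:, j])[:-1]:   # skip max value (empty right side)
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, left.mean(), right.mean())
    _, j, t, lo, hi = best
    return lambda x, j=j, t=t, lo=lo, hi=hi: lo if x[j] <= t else hi

def random_forest(X, y, B=25, k=1):
    m, d = X.shape
    trees = []
    for _ in range(B):
        idx = rng.integers(0, m, size=m)              # bootstrap sample of size m
        attrs = rng.choice(d, size=k, replace=False)  # k random attributes
        trees.append(fit_stump(X[idx], y[idx], attrs))
    return trees

def predict(trees, x):
    return np.mean([t(x) for t in trees])  # regression: average the B trees

X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] + rng.normal(0, 0.1, 100)  # only attribute 0 is informative
forest = random_forest(X, y)
print(predict(forest, np.array([1.0, 0.0, 0.0])))
```

With $k = 1$ of $d = 3$ attributes sampled per tree, only about a third of the stumps split on the informative attribute, which is exactly the per-node randomness the random vector method introduces.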
Random Forest Demo
A demo of random forests implemented in JavaScript
36
When Will Bagging Improve Accuracy?
• Depends on the stability of the base-level classifiers
• A learner is unstable if a small change to the training set causes a large change in the output hypothesis
• If small changes in $D$ cause large changes in the output, then there will likely be an improvement in performance with bagging
• Bagging can help unstable procedures, but could hurt the performance of stable procedures
  • Decision trees are unstable
  • $k$-nearest neighbor is stable
37