trilha – machine learning€¦ · deep ensembles (1) source: simple and scalable predictive...

77
pen4education Trilha – Machine Learning Roberto Silveira Engenheiro

Upload: others

Post on 31-Dec-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

pen4education

Trilha – Machine LearningRoberto Silveira

Engenheiro

Page 2: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Incerteza em Machine Learning

Roberto Silveira

Page 3: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Hello!I am Roberto Silveira EE engineer, ML enthusiast

[email protected]@rsilveira79

Page 4: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Why do we care about uncertainty?

Page 5: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Why do we care about uncertainty?

Cat?Dog????

Page 6: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Why do we care about uncertainty?

Page 7: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Why do we care about uncertainty?

Page 8: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Why do we care about uncertainty?

Page 9: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

A note on SOFTMAX

Page 10: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian
Page 11: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian
Page 12: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: Uncertainty in Deep Learning - PhD Thesis (Gal, 2016)

Page 13: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Demo timeCode: https://github.com/rsilveira79/intents_uncertainty/blob/master/intent_classifier_uncertainty_mc_dropout.ipynb

Page 14: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Language of uncertainty

Page 15: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Types of Uncertainty

Source: Uncertainty in Deep Learning (Yarin Gal, 2016)

Model uncertainty (epistemic, reducible) = uncertainty in model (either model parameters or model structure) → more data helps

"epistemic" → Greeg "episteme" = knowledge

Page 16: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Types of Uncertainty

Source: Uncertainty in Deep Learning (Yarin Gal, 2016)

Aleatoric uncertainty (stochastic, irreducible) = uncertainty in data (noise) → more data doesn't help

"Aleatoric" → Latin "aleator" = “dice player’s"

Can be further divided:

● Homoscedastic → uncertainty is same for all inputs● Heteroscedastic → observation (uncertainty) can vary with input X

Page 17: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Frequentist vs Bayesian view of the world

Page 18: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Frequentist vs Bayesian view of the world (2)

Frequentist

● Data is considered random● Model parameters are fixed● Probabilities are

fundamentally related to frequencies of events

Source: https://github.com/fonnesbeck/intro_stat_modeling_2017

Bayesian

● Data is considered fixed● Model parameters are

"random" (conditioned to observations - sampled from distribution)

● Probabilities are fundamentally related to their own knowledge about an event

Page 19: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian
Page 20: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes Inference 101

Source: https://github.com/fonnesbeck/intro_stat_modeling_2017

Page 21: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes Inference 101

Source: https://github.com/fonnesbeck/intro_stat_modeling_2017

Posterior Probability → probability of model parameters given the data

Page 22: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes Inference 101

Source: https://github.com/fonnesbeck/intro_stat_modeling_2017

Likelihood of observations → information on the observed data ~ proportional to likelihood in frequentist approach

Page 23: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes Inference 101

Source: https://github.com/fonnesbeck/intro_stat_modeling_2017

Prior Probability → what is known about the model before observing the data

Page 24: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes Inference 101

Source: https://github.com/fonnesbeck/intro_stat_modeling_2017

Normalizing Constant → model evidence, usually not considered in bayesian inference

Page 25: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Further reading

Source: Frequentism and Bayesianism: A Python-driven Primer (Jake VanderPlas, 2014)

Page 26: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Processes

Page 27: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Johann Carl Friedrich Gauss(1777 - 1855)

Page 28: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Basics (1)

Source: YouTube - Machine learning - Introduction to Gaussian processes (Nando de Freitas, 2013)

X2

X1

X2

X1

Page 29: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Basics (2)

Source: YouTube - Machine learning - Introduction to Gaussian processes (Nando de Freitas, 2013)

X2

X1

X2

X1

X1* X1

*

Page 30: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Basics (3) - from joint to conditional dist.

Source: YouTube - Machine learning - Introduction to Gaussian processes (Nando de Freitas, 2013)

X2

X1X1*

X2

Page 31: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Basics (4) - Multivariate Gaussian Dist.

Source: YouTube - Machine learning - Introduction to Gaussian processes (Nando de Freitas, 2013)

f(x)

xx1 x2 x3

f1

f2

f3

Page 32: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Basics (4) - Multivariate Gaussian Dist.

Source: YouTube - Machine learning - Introduction to Gaussian processes (Nando de Freitas, 2013)

f(x)

xx1 x2 x3

f1

f2

f3

Page 33: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Basics (4) - Multivariate Gaussian Dist.

Source: YouTube - Machine learning - Introduction to Gaussian processes (Nando de Freitas, 2013)

f(x)

xx1 x2 x3

f1

f2

f3

Where have data+ confidence (lower uncertainty)

Where don't have data- confidence (higher uncertainty)

Page 34: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Processes (1)

● Generalization of multivariate gaussian distribution to infinitely many variables

● BAYESIAN NON-PARAMETRIC = # parameters grows w/ size of dataset = infinitely parametric

● GP are distribution over functions (not point estimates)● Can be seen as bayesian version of SVM● Uses kernels as SVM methods

Page 35: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Kernel Trick (recap)

Projects non-linear separable input into high-order linear separable space

Source: https://goo.gl/kxLepp

Page 36: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Processes (2)

● GP = distribution over functions (or high dimensional vectors)● GP is fully specified by:

○ Mean vector 𝞵○ Covariance matrix ∑

● Learning in GP = finding suitable properties for the covariance function

Page 37: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Processes (3) - Regression

Page 38: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Gaussian Processes (4) - Regression

Page 39: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Covariance Functions - Squared Exponential

Source: https://docs.pymc.io/

Page 40: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Covariance Functions - Exponential

Source: https://docs.pymc.io/

Page 41: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Covariance Functions - Matern 5/2

Source: https://docs.pymc.io/

Page 42: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Covariance Functions - Matern 3/2

Source: https://docs.pymc.io/

Page 43: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Covariance Functions - Cosine

Source: https://docs.pymc.io/

Page 44: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Demo timeCode: https://github.com/rsilveira79/bayesian_notebooks/blob/master/uncertainty_notebooks/gaussian_process_PyMC3.ipynb

Page 45: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes meets Deep

Learning Or more logically, deep learning meets Bayesian inference

Page 46: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Intuition (1)

Source: https://ericmjl.github.io/bayesian-deep-learning-demystified/

Page 47: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Intuition (2) - From this ...

Source: https://ericmjl.github.io/bayesian-deep-learning-demystified/

Page 48: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Intuition (3) - … to this

Source: https://ericmjl.github.io/bayesian-deep-learning-demystified/

Page 49: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes by Backprop (1)

Source: Weight Uncertainty in Neural Networks (Blundell et al, 2015)

● Each weight w is a distribution (instead of single point estimate)● Backpropagation-compatible algorithm to compute distribution over w

Page 50: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes by Backprop (2)

mean gradients (and weights)

Source: Weight Uncertainty in Neural Networks (Blundell et al, 2015)

standard deviation gradients (and weights)

Bayesian approach: data is fixed, update model beliefs (w)

Page 51: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Bayes by Backprop (3) - MNIST

Source: Weight Uncertainty in Neural Networks (Blundell et al, 2015)

Page 52: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: Dropout: A Simple Way to Prevent Neural Networks from Overfitting (Srivastava et al, 2014)

A quick note on dropout

w/o dropout w/ dropout

KEY IDEA1 NN w/ dropout = ensemble of 2n thinned networks

w/o dropout w/ dropout

y1 y1~

y2 y2~

r1

r2

rj(l)=

Bernoulli(p)

Page 53: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Monte Carlo Dropout (MC Dropout) (1)

Source: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (Gal et al, 2016), Uncertainty in Deep Learning - PhD Thesis (Gal, 2016)

● MC dropout is equivalent to performing T stochastic forward passes through the network and averaging the results (model averaging)

p → probability of units not being dropped

Page 54: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Monte Carlo Dropout (MC Dropout) (2)

Source: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (Gal et al, 2016), Uncertainty in Deep Learning - PhD Thesis (Gal, 2016)

● In practice, given a point x:○ Drop units at test time;○ Repeat n times (e.g. 10, 100, 200);○ Look at the mean and variance;

Page 55: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Monte Carlo Dropout (MC Dropout) (3)

Source: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (Gal et al, 2016), Uncertainty in Deep Learning - PhD Thesis (Gal, 2016)

Page 56: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Demo timeCode: https://github.com/rsilveira79/bayesian_notebooks/blob/master/uncertainty_notebooks/uncertainty_mc_dropout.ipynb

Page 57: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Deep Ensembles (1)

Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017)

● Non-bayesian method● High quality uncertainty predictions

a. Not over-confident on out-of-distribution examples● Model don't need to have dropout● Basic receipt:

a. Use proper scoring rule (loss function) - NLL for regressionb. Use adversarial inputs (smooth predictions)c. Train ensemble of classifiers

Page 58: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Deep Ensembles (2)

Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017)

● Scoring Rules (Regression) mean → no variance in MSE loss

mean

variance

Page 59: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Deep Ensembles (3)

Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017)

MSE NLL Adversarial training

Ensemble (M=5)

Page 60: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Deep Ensembles (4)

Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017)

Page 61: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Deep Ensembles (5)

Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017)

Page 62: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Demo timeCode: https://github.com/rsilveira79/bayesian_notebooks/blob/master/uncertainty_notebooks/deep_ensemble_uncertainty_Pytorch.ipynb

Page 63: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Uncertainty in Discrete-Continuous Data

Source: Quantifying Uncertainty in Discrete-Continuous and Skewed Data with Bayesian Deep Learning (Blundell et al, 2015)

Page 64: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Uncertainty in Recommender Systems

Source: Deep density networks and uncertainty in recommender systems (Zeldes et al, 2017)

Page 65: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Take home

Page 66: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Take #1

Have free time?Study Bayesian Stats

(also if doesn't have free time)

Page 67: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Take #2

Probabilistic Programming(e.g. GP) can be a powerful tool to

master - PyMC3, Pyro.ai, Stan(specially for small datasets)

Page 68: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Take #3For Bayesian Deep Learning

● stay tuned w/ latest developments (Cambridge, Deep Mind, Uber)

● always check uncertainty quality● try different approaches

Page 69: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Probe further

Page 70: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: https://sites.google.com/view/udl2018/accepted-papers

Page 71: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: http://auai.org/uai2018/index.php

Page 72: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: http://www.gaussianprocess.org/gpml/

Page 73: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf

Page 74: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Source: https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

Page 75: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Nice Repos

● https://github.com/fonnesbeck/bayesian_mixer_london_2017● https://ericmjl.github.io/bayesian-deep-learning-demystified/● https://github.com/yaringal

Page 76: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian

Nice blog posts

● https://blog.dominodatalab.com/fitting-gaussian-process-models-python/● http://katbailey.github.io/post/gaussian-processes-for-dummies/● http://mlg.eng.cam.ac.uk/yarin/blog_2248.html● https://www.nitarshan.com/bayes-by-backprop/

Page 77: Trilha – Machine Learning€¦ · Deep Ensembles (1) Source: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Lakshminarayanan et al, 2017) Non-bayesian