Deep Learning. Philipp Grohs. March 4, 2019.



Deep Learning

Philipp Grohs

March 4, 2019


Organisational Stuff...

3h lecture (MSc Level)

Monday 11:30 - 12:15 (seminar room 11) & Wednesday 11:30 - 13:00 (seminar room 7)

Oral exam at the end of the semester

Exercise classes during the semester

Prerequisites: Linear algebra, Analysis, Probability

Desirable: Functional Analysis, Experience with Python (!!!!!) or MATLAB

I will create a website at http://mat.univie.ac.at/~grohs/DeepLearningCourse2019.html.



Face recognition

D. Trump

B. Sanders

B. Johnson

A. Merkel


Describing the content of an image

AI generates sentences describing the content of an image [Vinyals et al., 2015]


Go!

AI beats Go champion Lee Sedol [Silver et al., 2016]


Atari games

AI beats professional human Atari players [Mnih et al., 2015]


Rating Attractiveness

Swiss Dating App Blinq (developed in cooperation with ETHZ)


Goal of This Course

How does this work???


Syllabus

1. Statistical Learning Theory

2. Classical Models

3. Neural Networks

4. Expressive Power of Neural Networks

5. Breaking the Curse of Dimensionality with Neural Networks

Caution

This is not a pure Mathematics course!


Some Literature

I. Goodfellow, Y. Bengio and A. Courville. Deep Learning (2016). Available from http://www.deeplearningbook.org.

L. Devroye, L. Györfi, G. Lugosi. A Probabilistic Theory of Pattern Recognition (2013). Springer.

F. Cucker and S. Smale. On the Mathematical Foundations of Learning. Bulletin of the AMS 39(1), pp. 1–49 (2001).

F. Chollet. Deep Learning with Python (2017). Manning.

P. Grohs, D. Perekrestenko, D. Elbrächter, H. Bölcskei. Deep Neural Network Approximation Theory. Available from https://arxiv.org/abs/1901.02220.

J. Berner, P. Grohs, A. Jentzen. Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations. Available from https://arxiv.org/abs/1809.03062.


0. Prelude


1.1 Basic Concepts


Definition of Learning

Definition [Mitchell (1997)]

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

...recall that this is not pure mathematics...


The Task T

Classification

Compute f : R^n → {1, . . . , k} which maps data x ∈ R^n to a category in {1, . . . , k}. Alternative: Compute f : R^n → R^k which maps data x ∈ R^n to a histogram with respect to k categories.

x = [image of a handwritten digit] ↦ f(x) = 5.
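The histogram variant of classification can be sketched in a few lines. The softmax map below is one common way (an assumption of this sketch, not something the slides prescribe) to turn k real-valued scores into a probability histogram; the scores themselves are made up for illustration.

```python
import numpy as np

def softmax(scores):
    """Turn k real-valued scores into a probability histogram over k categories."""
    e = np.exp(scores - np.max(scores))  # shift by the max for numerical stability
    return e / e.sum()

def classify(scores):
    """Hard classification: return the category in {1, ..., k} with the largest score."""
    return int(np.argmax(scores)) + 1

scores = np.array([0.5, 2.0, -1.0])  # hypothetical scores for k = 3 categories
hist = softmax(scores)               # nonnegative, sums to 1
label = classify(scores)             # category 2 has the largest score
```

The hard classifier f : R^n → {1, . . . , k} and the histogram f : R^n → R^k are linked: taking the argmax of the histogram recovers the hard decision.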


The Task T

Regression

Predict a numerical value f : Rn → R.

Expected claim of an insured person

Algorithmic trading


The Task T

Density Estimation

Estimate a probability density p : R^n → R_+ which can be interpreted as a probability distribution on the space that the examples were drawn from.

Useful for many tasks in data processing; for example, if we observe corrupted data x̃, we may estimate the original x as the argmax of p(x | x̃).
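A minimal parametric instance of density estimation is fitting a one-dimensional Gaussian by maximum likelihood; the synthetic data and the Gaussian model below are assumptions of this sketch, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Examples drawn from the (here known, in practice unknown) distribution
samples = rng.normal(loc=3.0, scale=1.5, size=10_000)

# Maximum-likelihood estimates under a Gaussian model
mu_hat = samples.mean()
sigma_hat = samples.std()

def p_hat(x):
    """Estimated density at x under the fitted Gaussian."""
    return np.exp(-0.5 * ((x - mu_hat) / sigma_hat) ** 2) / (sigma_hat * np.sqrt(2.0 * np.pi))
```

With 10,000 samples the estimates mu_hat and sigma_hat land close to the true parameters 3.0 and 1.5, and the estimated density peaks near the true mean.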


The Experience E

The experience typically consists of a dataset comprising many examples (a.k.a. data points).

If these data points are labeled (for example, in the classification problem, if we know the category of each given data point), we speak of supervised learning.

If these data points are not labeled (for example, in the classification problem, the algorithm would have to find the clusters itself from the given dataset), we speak of unsupervised learning.


The Performance Measure P

In classification problems this is typically the accuracy, i.e., the proportion of examples for which the model produces the correct output.

Often the given dataset is split into a training set, on which the algorithm operates, and a test set, on which its performance is measured.
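The split-and-measure procedure can be sketched as follows. The toy data, the 80/20 ratio, and the deliberately simple "model" (predicting from the sign of one feature) are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] > 0).astype(int)  # toy labels determined by the first feature
y[:10] = 1 - y[:10]            # flip a few labels so the task is not trivial

# Shuffle once, then split 80/20 into a training set and a test set
idx = rng.permutation(len(X))
train_idx, test_idx = idx[:80], idx[80:]

# A deliberately simple "model" for illustration: predict from the sign of the first feature
y_pred = (X[test_idx, 0] > 0).astype(int)

# Accuracy = proportion of test examples with the correct output
accuracy = (y_pred == y[test_idx]).mean()
```

The key point is that the accuracy is computed only on `test_idx`, data the model never operated on, so it estimates performance on unseen examples.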


An Example: Linear Regression

The Task

Regression: Predict f : Rn → R.

The Experience

Training data ((x_i^train, y_i^train))_{i=1}^m.

The Performance Measure

Given test data ((x_i^test, y_i^test))_{i=1}^n we evaluate the performance of an estimator f : R^n → R as the mean squared error

(1/n) ∑_{i=1}^n |f(x_i^test) − y_i^test|^2.
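The mean squared error above translates directly into code; the tiny test set below (with y = 2x) is a made-up example for illustration.

```python
import numpy as np

def mse(f, x_test, y_test):
    """Mean squared error of an estimator f on test data."""
    return np.mean((f(x_test) - y_test) ** 2)

x_test = np.array([0.0, 1.0, 2.0])
y_test = np.array([0.0, 2.0, 4.0])  # hypothetical test data with y = 2x

perfect = mse(lambda x: 2.0 * x, x_test, y_test)       # exact estimator: MSE 0
offset = mse(lambda x: 2.0 * x + 1.0, x_test, y_test)  # constant error of 1: MSE 1
```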


An Example: Linear Regression

The Computer Program

Define a hypothesis space

H = span{ϕ_1, . . . , ϕ_l} ⊂ C(R^n)

and, given training data

z = ((x_i, y_i))_{i=1}^m,

define the empirical risk

E_z(f) := (1/m) ∑_{i=1}^m (f(x_i) − y_i)^2.

We let our algorithm find the minimizer (a.k.a. empirical target function)

f_{H,z} := argmin_{f ∈ H} E_z(f).


An Example: Linear Regression

Computing the Empirical Target Function

Let A = (ϕ_j(x_i))_{i,j} ∈ R^{m×l}.

Every f ∈ H can be written as ∑_{i=1}^l w_i ϕ_i, and we denote w := (w_i)_{i=1}^l.

We let y := (y_i)_{i=1}^m.

We get that E_z(f) = (1/m) ‖Aw − y‖^2.

A minimizer is given by w* := A†y, and we get our estimate

f* := ∑_{i=1}^l (w*)_i ϕ_i.
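The recipe above can be sketched directly, assuming (purely for illustration) monomial basis functions ϕ_i(x) = x^{i−1} and noiseless synthetic data; `np.linalg.pinv` computes the pseudoinverse A†.

```python
import numpy as np

# Assumed basis functions phi_1, ..., phi_l spanning H (here: 1, x, x^2)
phis = [lambda x: np.ones_like(x), lambda x: x, lambda x: x ** 2]

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
y = 1.0 + 2.0 * x + 3.0 * x ** 2   # synthetic targets lying inside H

A = np.column_stack([phi(x) for phi in phis])  # A_ij = phi_j(x_i), shape (m, l)
w_star = np.linalg.pinv(A) @ y                 # w* = A† y minimizes ||Aw - y||^2

def f_star(t):
    """Empirical target function f* = sum_i (w*)_i phi_i."""
    return sum(w * phi(t) for w, phi in zip(w_star, phis))
```

Since the targets lie exactly in H, the recovered coefficients match (1, 2, 3) up to floating-point error; with noisy data, w* would instead be the least-squares compromise.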


Degree too low: underfitting. Degree too high: overfitting!


Figure: Error vs. Polynomial Degree

Bias-Variance Problem

“Capacity” of the hypothesis space has to be adapted to the complexity of the target function and the sample size!
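The bias-variance trade-off can be reproduced in a few lines. The sine target, the 15 noisy samples, and the particular degrees compared are all assumptions of this sketch; `np.polyfit` performs the least-squares polynomial fit.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2.0 * np.pi * x)  # assumed smooth target function

x_train = rng.uniform(0.0, 1.0, size=15)
y_train = target(x_train) + 0.1 * rng.normal(size=15)  # noisy training samples
x_test = np.linspace(0.0, 1.0, 200)
y_test = target(x_test)                                 # noise-free test grid

train_err, test_err = {}, {}
for degree in (1, 3, 14):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_err[degree] = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err[degree] = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

# Degree 1 underfits (large error everywhere); degree 14 interpolates the 15 noisy
# points (tiny training error) but typically oscillates, inflating the test error.
```

Because the hypothesis spaces are nested, the training error can only decrease with the degree, while the test error is smallest at an intermediate capacity, exactly the bias-variance picture of the figure above.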
