Lecture 2.1: Vector Calculus
CSC 84020 - Machine Learning
Andrew Rosenberg
February 5, 2009


TRANSCRIPT

Page 1

Lecture 2.1: Vector Calculus

CSC 84020 - Machine Learning

Andrew Rosenberg

February 5, 2009

Page 2

Today

Last Time
Probability Review

Today
Vector Calculus

Page 3

Background

Let’s talk.

Linear Algebra
Vectors
Matrices
Basis Spaces
Eigenvectors/values
Inversion and transposition

Calculus
Derivation
Integration

Vector Calculus
Gradients
Derivation w.r.t. a vector

Page 4

Linear Algebra Basics

What is a vector?

What is a matrix?

Transposition

Adding matrices and vectors

Multiplying matrices.

Page 5

Definitions

A vector is a one-dimensional array. We denote vectors as x or \mathbf{x}. If we don’t specify otherwise, assume x is a column vector.

\mathbf{x} = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}

Page 6

Definitions

A matrix is a higher-dimensional array. We typically denote matrices as capital letters, e.g., A. If A is an n-by-m matrix, it has the following structure:

A = \begin{pmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,m-1} \\ a_{1,0} & a_{1,1} & \cdots & a_{1,m-1} \\ \vdots & & \ddots & \vdots \\ a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,m-1} \end{pmatrix}

Page 7

Matrix transposition

Transposing a matrix or vector swaps rows and columns.

A column-vector becomes a row-vector

\mathbf{x} = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}, \qquad \mathbf{x}^T = \begin{pmatrix} x_0 & x_1 & \cdots & x_{n-1} \end{pmatrix}

Page 8

Matrix transposition

Transposing a matrix or vector swaps rows and columns.

A column-vector becomes a row-vector

A = \begin{pmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,m-1} \\ a_{1,0} & a_{1,1} & \cdots & a_{1,m-1} \\ \vdots & & \ddots & \vdots \\ a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,m-1} \end{pmatrix}

A^T = \begin{pmatrix} a_{0,0} & a_{1,0} & \cdots & a_{n-1,0} \\ a_{0,1} & a_{1,1} & \cdots & a_{n-1,1} \\ \vdots & & \ddots & \vdots \\ a_{0,m-1} & a_{1,m-1} & \cdots & a_{n-1,m-1} \end{pmatrix}

If A is n-by-m, then A^T is m-by-n.
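The deck gives no code; as a minimal NumPy sketch (an editorial addition, with the array values chosen arbitrarily), transposition swaps rows and columns, and transposing twice returns the original matrix:

```python
import numpy as np

# A 2-by-3 matrix; its transpose is 3-by-2.
A = np.array([[1, 2, 3],
              [4, 5, 6]])
At = A.T

print(At.shape)            # (3, 2)
assert (At.T == A).all()   # (A^T)^T = A
```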

Page 9

Adding Matrices

Matrices can only be added if they have the same dimension.

A + B = \begin{pmatrix} a_{0,0}+b_{0,0} & a_{0,1}+b_{0,1} & \cdots & a_{0,m-1}+b_{0,m-1} \\ a_{1,0}+b_{1,0} & a_{1,1}+b_{1,1} & \cdots & a_{1,m-1}+b_{1,m-1} \\ \vdots & & \ddots & \vdots \\ a_{n-1,0}+b_{n-1,0} & a_{n-1,1}+b_{n-1,1} & \cdots & a_{n-1,m-1}+b_{n-1,m-1} \end{pmatrix}

Page 10

Multiplying matrices

To multiply two matrices, the inner dimensions  must match.

An n-by-m matrix can be multiplied by an n′-by-m′ matrix iff m = n′.

AB = C

c_{ij} = \sum_{k=0}^{m-1} a_{ik} b_{kj}

That is, multiply the i -th row by the j -th column.

Image from wikipedia.
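The summation c_{ij} = Σ_k a_{ik} b_{kj} can be checked directly against NumPy’s built-in matrix product (an editorial sketch with arbitrary example matrices, not from the slides):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])    # 2-by-3
B = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])        # 3-by-2: inner dimensions match (m = 3)

n, m = A.shape
p = B.shape[1]

# c_ij = sum_k a_ik * b_kj: the i-th row of A times the j-th column of B.
C = np.zeros((n, p))
for i in range(n):
    for j in range(p):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(m))

assert np.allclose(C, A @ B)    # matches NumPy's matrix multiplication
```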

Page 11

Useful matrix operations

Inversion

Norm
Eigenvector decomposition

Page 12

Matrix Inversion

The inverse of a matrix A is denoted A^{-1}, and has the following property:

AA^{-1} = I

where I is the identity matrix: the n-by-n matrix with I_{ij} = 1 if i = j and 0 otherwise. If A is a square matrix (n = m), then also

A^{-1}A = I

Page 13

Matrix Inversion

The inverse of a matrix A is denoted A^{-1}, and has the following property:

AA^{-1} = I

where I is the identity matrix: the n-by-n matrix with I_{ij} = 1 if i = j and 0 otherwise. If A is a square matrix (n = m), then also

A^{-1}A = I

What is the inverse of a vector? x^{-1} = ?

Page 14

Some useful Matrix Inversion Properties

(A^{-1})^{-1} = A

(kA)^{-1} = k^{-1}A^{-1}

(A^T)^{-1} = (A^{-1})^T

(AB)^{-1} = B^{-1}A^{-1}
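Each of the inversion identities above can be verified numerically (an editorial sketch using random matrices, which are invertible with probability one; the seed and scalar are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
k = 2.5
inv = np.linalg.inv

assert np.allclose(inv(inv(A)), A)               # (A^-1)^-1 = A
assert np.allclose(inv(k * A), inv(A) / k)       # (kA)^-1 = k^-1 A^-1
assert np.allclose(inv(A.T), inv(A).T)           # (A^T)^-1 = (A^-1)^T
assert np.allclose(inv(A @ B), inv(B) @ inv(A))  # (AB)^-1 = B^-1 A^-1
```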

Page 15

The norm of a vector

The norm of a vector x is written ||x||. The norm represents the Euclidean length of a vector.

||x|| = \sqrt{\sum_{i=0}^{n-1} x_i^2} = \sqrt{x_0^2 + x_1^2 + \ldots + x_{n-1}^2}
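A one-line check of the formula (an editorial sketch; the classic 3-4-5 vector makes the result easy to see):

```python
import numpy as np

x = np.array([3.0, 4.0])
norm = np.sqrt(np.sum(x ** 2))   # sqrt(x_0^2 + ... + x_{n-1}^2)

print(norm)                      # 5.0
assert np.isclose(norm, np.linalg.norm(x))
```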

Page 16

Positive Definite/Semi-Definite

A positive definite matrix M has the property that

x^T M x > 0 \quad \text{for all } x \neq 0

A positive semi-definite matrix M has the property that

x^T M x \geq 0 \quad \text{for all } x

Why might we care about these matrices?
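One practical answer, sketched in NumPy (an editorial illustration with an arbitrary example matrix): for a symmetric matrix, positive definiteness can be checked through its eigenvalues, which is equivalent to the defining property above.

```python
import numpy as np

M = np.array([[2.0, -1.0],
              [-1.0, 2.0]])      # symmetric

# For symmetric M: positive definite <=> all eigenvalues are positive.
eigvals = np.linalg.eigvalsh(M)
assert (eigvals > 0).all()

# Spot-check the defining property x^T M x > 0 on random nonzero x.
rng = np.random.default_rng(1)
for _ in range(100):
    x = rng.standard_normal(2)
    assert x @ M @ x > 0
```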

Page 17

Eigenvectors

For a square matrix A, an eigenvector is defined by

A u_i = \lambda_i u_i

where u_i is an eigenvector and \lambda_i is its corresponding eigenvalue.

In general, eigenvalues are complex numbers, but if A is symmetric, they are real.

Eigenvectors describe how a matrix transforms a vector, and can be used to define a basis space, namely the eigenspace.

Who cares? The eigenvectors of a covariance matrix have some very interesting properties.
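The defining equation A u_i = λ_i u_i can be confirmed numerically (an editorial sketch; the symmetric matrix is an arbitrary example, so its eigenvalues are real):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric => real eigenvalues

lam, U = np.linalg.eigh(A)   # columns of U are the eigenvectors

# Check A u_i = lambda_i u_i for every eigenpair.
for i in range(len(lam)):
    assert np.allclose(A @ U[:, i], lam[i] * U[:, i])
```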

Page 18

Basis Spaces

Basis spaces allow vectors to be represented in different spaces. Our normal 2-dimensional basis space is generated by the vectors [1, 0] and [0, 1].

Any 2-d vector can be expressed as a linear combination of these two basis vectors.

However, any two non-colinear vectors can generate a 2-d basis space. In the coordinates of that basis space, the generating vectors play the role of the perpendicular axes.
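Finding a vector’s coordinates in a new basis is just solving a linear system (an editorial sketch; the basis vectors here are arbitrary non-colinear examples):

```python
import numpy as np

# Two non-colinear basis vectors as the columns of B.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
v = np.array([3.0, 2.0])

# Coefficients c such that B @ c = v: v expressed in the new basis.
c = np.linalg.solve(B, v)

assert np.allclose(B @ c, v)   # v is recovered as a linear combination
```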

Page 19

Basis Spaces

Page 20

Basis Spaces

Page 21

Basis Spaces

Why do we care?

Dimensionality reduction.

Page 22

Calculus Basics

What is a derivative?

What is an integral?

Page 23

Derivatives

A derivative, \frac{d}{dx} f(x), can be thought of as defining the slope of a function f(x). This is sometimes also written f'(x).

Page 24

Derivative Example

Page 25

Integrals

Integrals are an inverse operation of the derivative (plus a constant).

\int f(x) \, dx = F(x) + c, \qquad F'(x) = f(x)

An integral can be thought of as a calculation of the area under the curve defined by f(x).

A definite integral evaluates the area over a specified interval. An indefinite integral leaves the bounds unspecified, yielding the antiderivative F(x) + c.

Page 26

Integration Example

Page 27

Useful calculus operations

Product, quotient, summation rules for derivatives.

Useful integration and derivative identities.

Chain rule
Integration by parts

Variable substitution (don’t forget the Jacobian!)

Page 28

Calculus Identities

Summation rule
g(x) = f_0(x) + f_1(x)
g'(x) = f_0'(x) + f_1'(x)

Product rule
g(x) = f_0(x) f_1(x)
g'(x) = f_0(x) f_1'(x) + f_0'(x) f_1(x)

Quotient rule
g(x) = \frac{f_0(x)}{f_1(x)}
g'(x) = \frac{f_0'(x) f_1(x) - f_0(x) f_1'(x)}{f_1^2(x)}

Page 29

Calculus Identities

Constant multipliers
g(x) = c f(x)
g'(x) = c f'(x)

Exponent rule
g(x) = f(x)^k
g'(x) = k f(x)^{k-1} f'(x)

Chain rule
g(x) = f_0(f_1(x))
g'(x) = f_0'(f_1(x)) f_1'(x)
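These identities are easy to sanity-check with a central finite difference (an editorial sketch; f_0 = sin and f_1(x) = x^2 are arbitrary choices):

```python
import math

# g(x) = f0(f1(x)) with f0 = sin and f1(x) = x^2.
# Chain rule: g'(x) = cos(x^2) * 2x.
def g(x):
    return math.sin(x ** 2)

def g_prime(x):
    return math.cos(x ** 2) * 2 * x

x, h = 0.7, 1e-6
numeric = (g(x + h) - g(x - h)) / (2 * h)   # central difference
assert abs(numeric - g_prime(x)) < 1e-6
```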

Page 30

Calculus Identities

Exponential rule
g(x) = e^x
g'(x) = e^x

g(x) = k^x
g'(x) = \ln(k) k^x

Logarithm rule
g(x) = \ln(x)
g'(x) = \frac{1}{x}

g(x) = \log_b(x)
g'(x) = \frac{1}{x \ln b}

Page 31

Calculus Operations

Integration by parts

\int f(x) \frac{dg(x)}{dx} \, dx = f(x) g(x) - \int g(x) \frac{df(x)}{dx} \, dx

Variable substitution

\int_a^b f(g(x)) g'(x) \, dx = \int_{g(a)}^{g(b)} f(x) \, dx
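Both sides of the substitution rule can be compared numerically (an editorial sketch using the trapezoid rule; f = cos and g(x) = x^2 are arbitrary choices):

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoid rule for samples y over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

a, b = 0.0, 1.0

# Left side: integral of f(g(x)) g'(x) over [a, b], with g(x) = x^2.
x = np.linspace(a, b, 100_001)
lhs = trapezoid(np.cos(x ** 2) * 2 * x, x)

# Right side: integral of f(u) over [g(a), g(b)].
u = np.linspace(a ** 2, b ** 2, 100_001)
rhs = trapezoid(np.cos(u), u)

assert abs(lhs - rhs) < 1e-8
assert abs(rhs - np.sin(1.0)) < 1e-8   # both sides equal sin(1)
```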

Page 32

Vector Calculus

Derivation with respect to a vector or matrix.

Gradient of a vector.
Change of variables with a vector.

Page 33

Derivation with respect to a vector

Given a vector \mathbf{x} = (x_0, x_1, \ldots, x_{n-1})^T and a function f(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}, how can we find \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}?

Page 34

Derivation with respect to a vector

Given a vector \mathbf{x} = (x_0, x_1, \ldots, x_{n-1})^T and a function f(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}, how can we find \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}?

\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \begin{pmatrix} \frac{\partial f(\mathbf{x})}{\partial x_0} \\ \frac{\partial f(\mathbf{x})}{\partial x_1} \\ \vdots \\ \frac{\partial f(\mathbf{x})}{\partial x_{n-1}} \end{pmatrix}

This is also called the gradient of the function, and is often written \nabla f(\mathbf{x}) or \nabla f.

Page 35

Derivation with respect to a vector

Given a vector \mathbf{x} = (x_0, x_1, \ldots, x_{n-1})^T and a function f(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}, how can we find \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}?

\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \begin{pmatrix} \frac{\partial f(\mathbf{x})}{\partial x_0} \\ \frac{\partial f(\mathbf{x})}{\partial x_1} \\ \vdots \\ \frac{\partial f(\mathbf{x})}{\partial x_{n-1}} \end{pmatrix}

This is also called the gradient of the function, and is often written \nabla f(\mathbf{x}) or \nabla f.

Why might this be useful?
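One use: gradients can be approximated numerically, which is handy for checking analytic derivations (an editorial sketch; f(x) = x^T x, whose gradient is 2x, is an arbitrary test function):

```python
import numpy as np

def f(x):
    return float(x @ x)   # f(x) = x^T x, so grad f(x) = 2x

def numerical_gradient(f, x, h=1e-6):
    """Central differences, one coordinate at a time."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2 * h)
    return grad

x = np.array([1.0, -2.0, 3.0])
assert np.allclose(numerical_gradient(f, x), 2 * x)
```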

Page 36

Useful Vector Calculus identities

Given a vector \mathbf{x} with |\mathbf{x}| = n and a scalar variable y:

\frac{\partial \mathbf{x}}{\partial y} = \begin{pmatrix} \frac{\partial x_0}{\partial y} \\ \frac{\partial x_1}{\partial y} \\ \vdots \\ \frac{\partial x_{n-1}}{\partial y} \end{pmatrix}

Page 37

Useful Vector Calculus identities

Given a vector \mathbf{x} with |\mathbf{x}| = n and a vector \mathbf{y} with |\mathbf{y}| = m:

\frac{\partial \mathbf{x}}{\partial \mathbf{y}} = \begin{pmatrix} \frac{\partial x_0}{\partial y_0} & \frac{\partial x_0}{\partial y_1} & \cdots & \frac{\partial x_0}{\partial y_{m-1}} \\ \frac{\partial x_1}{\partial y_0} & \frac{\partial x_1}{\partial y_1} & \cdots & \frac{\partial x_1}{\partial y_{m-1}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial x_{n-1}}{\partial y_0} & \frac{\partial x_{n-1}}{\partial y_1} & \cdots & \frac{\partial x_{n-1}}{\partial y_{m-1}} \end{pmatrix}

Page 38

Vector Calculus Identities

Similar to the scalar multiplication rule:

\frac{\partial}{\partial \mathbf{x}} (\mathbf{x}^T \mathbf{a}) = \frac{\partial}{\partial \mathbf{x}} (\mathbf{a}^T \mathbf{x}) = \mathbf{a}

Similar to the product rule:

\frac{\partial}{\partial x} (AB) = \frac{\partial A}{\partial x} B + A \frac{\partial B}{\partial x}

Derivative of a matrix inverse:

\frac{\partial}{\partial x} (A^{-1}) = -A^{-1} \frac{\partial A}{\partial x} A^{-1}

Change of variable in an integral, where \left|\frac{\partial \mathbf{x}}{\partial \mathbf{u}}\right| is the absolute value of the Jacobian determinant:

\int f(\mathbf{x}) \, d\mathbf{x} = \int f(\mathbf{u}) \left|\frac{\partial \mathbf{x}}{\partial \mathbf{u}}\right| d\mathbf{u}
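The matrix-inverse identity can be verified with a finite difference on a matrix-valued function A(t) (an editorial sketch; A0 and D are arbitrary example matrices):

```python
import numpy as np

A0 = np.array([[2.0, 1.0],
               [0.0, 3.0]])
D = np.array([[1.0, 0.0],
              [1.0, 1.0]])

def A(t):
    return A0 + t * D   # so dA/dt = D

t, h = 0.5, 1e-6
inv = np.linalg.inv

# Finite-difference derivative of A^{-1} at t ...
numeric = (inv(A(t + h)) - inv(A(t - h))) / (2 * h)

# ... versus the identity d/dt A^{-1} = -A^{-1} (dA/dt) A^{-1}.
analytic = -inv(A(t)) @ D @ inv(A(t))
assert np.allclose(numeric, analytic, atol=1e-4)
```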

Page 39

Calculating the Expectation of a Gaussian

Now we have enough tools to calculate the expectation of a variable given a Gaussian distribution.

Recall:

E[x|\mu, \sigma^2] = \int p(x|\mu, \sigma^2) \, x \, dx

= \int \mathcal{N}(x|\mu, \sigma^2) \, x \, dx

= \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\} x \, dx

Page 40

Calculating the Expectation of a Gaussian

E[x|\mu, \sigma^2] = \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\} x \, dx

Substitute u = x - \mu, \; du = dx:

E[x|\mu, \sigma^2] = \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} (u + \mu) \, du

= \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} u \, du + \mu \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} du

Page 41

Calculating the Expectation of a Gaussian

E[x|\mu, \sigma^2] = \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} u \, du + \mu \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} du

The second integral is over a normalized density:

\int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} du = 1

E[x|\mu, \sigma^2] = \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} u \, du + \mu

Aside: A function is odd iff f(x) = -f(-x). Odd functions have the property \int_{-\infty}^{\infty} f(x) \, dx = 0.

A function is even iff f(x) = f(-x).

The product of an odd function and an even function is an odd function.

Page 42

Calculating the Expectation of a Gaussian

E[x|\mu, \sigma^2] = \int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} u \, du + \mu

\exp\left\{-\frac{1}{2\sigma^2}u^2\right\} is even.

u is odd.

\exp\left\{-\frac{1}{2\sigma^2}u^2\right\} u is odd.

\int \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}u^2\right\} u \, du = 0

E[x|\mu, \sigma^2] = \mu
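The result E[x|µ, σ²] = µ can be confirmed by numerical quadrature over a wide finite range where the Gaussian tails are negligible (an editorial sketch; the values of µ and σ are arbitrary):

```python
import numpy as np

mu, sigma = 1.5, 2.0

def trapezoid(y, x):
    """Composite trapezoid rule for samples y over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

# Integrate over mu +/- 10 sigma; the probability mass outside is negligible.
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200_001)
density = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

assert abs(trapezoid(density, x) - 1.0) < 1e-6      # density integrates to 1
assert abs(trapezoid(density * x, x) - mu) < 1e-6   # E[x] = mu
```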

Page 43

Why does Machine Learning need these tools?

Calculus

We need to find maximum likelihoods or minimum risks. This optimization is accomplished with derivatives.

Integration allows us to marginalize continuous probability density functions.

Linear Algebra

We will be working in high-dimensional spaces.

Vectors and matrices allow us to refer to high-dimensional points (groups of features) as vectors.

Matrices allow us to describe the feature space.

Page 44

Why does machine learning need these tools?

Vector Calculus

We need to do all of the calculus operations in high-dimensional feature spaces.

We will want to optimize multiple values simultaneously: gradient descent.

We will need to take a marginal over high-dimensional distributions: Gaussians.

Page 45

Broader Context

What we have so far:

Entities in the world are represented as feature vectors and maybe a label.

We want to construct statistical models of the feature vectors. Finding the most likely model is an optimization problem.

Since the feature vectors may have more than one dimension, linear algebra can help us work with them.

Page 46

Bye

Next

Linear Regression