
CSE 559A: Computer Vision
Fall 2020: T-R: 11:30-12:50pm @ Zoom
Instructor: Ayan Chakrabarti ([email protected]). Course Staff: Adith Boloor, Patrick Williams
Nov 24, 2020
http://www.cse.wustl.edu/~ayan/courses/cse559a/


Page 1: Title slide (course information as above).

Page 2:
GENERAL
Look at your proposal feedback!
Problem Set 4 due today. Problem Set 5 will be out by tonight.
No class Thursday. No office hours on Friday.

Pages 3-14:
CLASSIFICATION (figure-only slides; no recoverable text).

Page 15:
CLASSIFICATION
Learn x̄ = g(x; θ) and do binary classification on its output.
Again, use (stochastic) gradient descent. But this time, the cost is no longer convex.

Before (logistic regression directly on x̄):

w = \arg\min_w \frac{1}{T} \sum_t y_t \log\left[1 + \exp(-w^T \bar{x}_t)\right] + (1 - y_t) \log\left[1 + \exp(w^T \bar{x}_t)\right]

Now (learning the encoder jointly):

\theta, w = \arg\min_{\theta, w} \frac{1}{T} \sum_t y_t \log\left[1 + \exp(-w^T g(x_t; \theta))\right] + (1 - y_t) \log\left[1 + \exp(w^T g(x_t; \theta))\right]
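The non-convex objective above is just the logistic loss applied to the encoded features. A minimal NumPy sketch (the names `logistic_loss`, `xb` are ours, not part of the course's edf framework):

```python
import numpy as np

# Binary logistic loss from the slide, averaged over T samples.
# y is in {0, 1}; xb holds the encoded features x-bar, one row per sample.
def logistic_loss(w, xb, y):
    z = xb @ w  # w^T x-bar for every sample, shape (T,)
    return np.mean(y * np.log1p(np.exp(-z)) + (1 - y) * np.log1p(np.exp(z)))

rng = np.random.default_rng(0)
xb = rng.normal(size=(8, 3))
y = rng.integers(0, 2, size=8).astype(float)

# Sanity check: with w = 0, every per-sample term is log 2 regardless of y.
w = np.zeros(3)
assert np.isclose(logistic_loss(w, xb, y), np.log(2))
```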

Page 16:
CLASSIFICATION
Learn x̄ = g(x; θ).
Again, use (stochastic) gradient descent. But this time, the cost is no longer convex. Turns out... it doesn't matter (sort of).
Recall in the previous case (where C_t is the cost of one sample):

\nabla_w C_t = \bar{x}_t \left[ \frac{\exp(w^T \bar{x}_t)}{1 + \exp(w^T \bar{x}_t)} - y_t \right]

What about now? Exactly the same, with x̄ = g(x; θ) for the current value of θ.

Page 17:
CLASSIFICATION
What about θ? First, what is ∇_{x̄_t} C_t?
Take 5 mins!

Page 18:
CLASSIFICATION
What about θ? First, what is

\nabla_{\bar{x}_t} C_t = \; ?

Page 19:
CLASSIFICATION

\nabla_{\bar{x}_t} C_t = w \left[ \frac{\exp(w^T \bar{x}_t)}{1 + \exp(w^T \bar{x}_t)} - y_t \right]
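The gradient on this slide can be verified by finite differences. A standalone sketch (the function names `C` and `grad_xb` are ours):

```python
import numpy as np

# One-sample cost C_t and the slide's gradient w.r.t. the encoded vector x-bar:
# grad = w * (sigma(w^T xb) - y), with sigma(z) = exp(z) / (1 + exp(z)).
def C(w, xb, y):
    z = w @ xb
    return y * np.log1p(np.exp(-z)) + (1 - y) * np.log1p(np.exp(z))

def grad_xb(w, xb, y):
    z = w @ xb
    return w * (np.exp(z) / (1 + np.exp(z)) - y)

rng = np.random.default_rng(1)
w, xb, y = rng.normal(size=4), rng.normal(size=4), 1.0

# Central finite differences along each coordinate of x-bar.
eps = 1e-6
num = np.array([(C(w, xb + eps * e, y) - C(w, xb - eps * e, y)) / (2 * eps)
                for e in np.eye(4)])
assert np.allclose(num, grad_xb(w, xb, y), atol=1e-5)
```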

Page 20:
CLASSIFICATION
Now, let's say θ was an M × N matrix, and g(x; θ) = θx.
N is the length of the vector x; M is the length of the encoded vector x̄.
What is ∇_θ C_t?
Take 5 mins!

Page 21:
CLASSIFICATION
(Same setup: θ is M × N, g(x; θ) = θx.) What is

\nabla_\theta C_t = \; ?

Page 22:
CLASSIFICATION
With g(x; θ) = θx:

\nabla_\theta C_t = \left( \nabla_{\bar{x}_t} C_t \right) x_t^T

This is actually a linear classifier on x:

w^T \theta x = (\theta^T w)^T x

But because of our factorization, the cost is no longer convex.
If we want to increase the expressive power of our classifier, g has to be non-linear!
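The collapse into a single linear classifier is a one-line identity to check numerically:

```python
import numpy as np

# The slide's point: with g(x; theta) = theta @ x, the composed score
# w^T (theta @ x) equals a single linear classifier with weights theta^T w.
rng = np.random.default_rng(2)
theta = rng.normal(size=(3, 5))  # M x N
w = rng.normal(size=3)           # M
x = rng.normal(size=5)           # N
assert np.isclose(w @ (theta @ x), (theta.T @ w) @ x)
```

So stacking linear maps buys no expressive power, which is why g must be non-linear.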

Page 23:
CLASSIFICATION: The Multi-Layer Perceptron
x

Page 24:
CLASSIFICATION: The Multi-Layer Perceptron
x ⟶ h̄, with h̄ = θx

Page 25:
CLASSIFICATION: The Multi-Layer Perceptron
x ⟶ h̄ ⟶ h, with h̄ = θx and h_j = κ(h̄_j).
κ is an element-wise non-linearity. For example κ(x) = σ(x). More on this later.
It has no learnable parameters.

Page 26:
CLASSIFICATION: The Multi-Layer Perceptron
x ⟶ h̄ ⟶ h ⟶ y, with h̄ = θx, h_j = κ(h̄_j), and y = w^T h.
κ is an element-wise non-linearity. For example κ(x) = σ(x). More on this later.
It has no learnable parameters.

Page 27:
CLASSIFICATION: The Multi-Layer Perceptron
x ⟶ h̄ ⟶ h ⟶ y ⟶ p, with h̄ = θx, h_j = κ(h̄_j), y = w^T h, p = σ(y).
κ is an element-wise non-linearity. For example κ(x) = σ(x). More on this later. It has no learnable parameters.
σ is our sigmoid to convert log-odds to a probability:

\sigma(y) = \frac{\exp(y)}{1 + \exp(y)}

Multiplication by θ and the action of κ together form a "layer".
It's called a hidden layer, because you're learning a latent representation:
you don't have direct access to the true value of its outputs;
you're learning a representation that, jointly with a learned classifier, is optimal.
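The pipeline x ⟶ h̄ ⟶ h ⟶ y ⟶ p above can be sketched in a few lines of NumPy (a standalone sketch with κ = σ, as the slide suggests; `mlp_forward` is our name, not the course framework's):

```python
import numpy as np

def sigma(z):
    # Sigmoid: converts log-odds to a probability in (0, 1).
    return np.exp(z) / (1 + np.exp(z))

def mlp_forward(theta, w, x):
    hbar = theta @ x  # linear layer: h-bar = theta x
    h = sigma(hbar)   # element-wise non-linearity kappa (= sigma here)
    y = w @ h         # log-odds: y = w^T h
    return sigma(y)   # probability: p = sigma(y)

rng = np.random.default_rng(3)
p = mlp_forward(rng.normal(size=(4, 6)), rng.normal(size=4), rng.normal(size=6))
assert 0.0 < p < 1.0  # output is always a valid probability
```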

Page 28:
CLASSIFICATION: The Multi-Layer Perceptron
This is a neural network: a complex function formed by composition of simple linear and non-linear functions.
This network has learnable parameters θ, w.
Train by gradient descent with respect to a classification loss. What are the gradients?
Doing this manually is going to get old really fast.

Autograd
Express a complex function as a composition of simpler functions.
Store this as nodes in a computation graph.
Use the chain rule to automatically back-propagate.
Popular autograd systems: Tensorflow, Torch, Caffe, MXNet, Theano, ...
We'll write our own!

x ⟶ h̄ ⟶ h ⟶ y ⟶ p, with h̄ = θx, h_j = κ(h̄_j), y = w^T h, p = σ(y).


Page 30:
AUTOGRAD / BACK-PROPAGATION
Say we want to minimize a loss L that is a function of parameters and training data.
Let's say for a parameter θ we can write:

L = f(x); \quad x = g(\theta, y)

where y is independent of θ, and f does not use θ except through x.
Now, let's say I gave you the value of y and the gradient of L with respect to x.
x is an N-dimensional vector; θ is an M-dimensional vector (if it's a matrix, just think of each element as a separate parameter).
Express ∂L/∂θ_j in terms of ∂L/∂x_i and ∂g(θ, y)_i/∂θ_j, which is the partial derivative of one of the dimensions of the output of g with respect to one of the dimensions of its inputs. For every j:

\frac{\partial L}{\partial \theta_j} = \sum_i \frac{\partial L}{\partial x_i} \frac{\partial g(\theta, y)_i}{\partial \theta_j}

We can similarly compute gradients for the other input to g, i.e. y.
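The summation rule above can be checked on a tiny concrete instance. Here the choices of g and f are ours for illustration (element-wise product and sum of squares), not anything from the slides:

```python
import numpy as np

# Tiny instance of dL/dtheta_j = sum_i (dL/dx_i)(dg_i/dtheta_j).
# We pick g(theta, y) = theta * y element-wise and L = f(x) = sum(x**2),
# so dL/dx = 2x and dg_i/dtheta_j = y_i when i == j, else 0.
rng = np.random.default_rng(4)
theta, y = rng.normal(size=5), rng.normal(size=5)
x = theta * y
dL_dx = 2 * x
dL_dtheta = dL_dx * y  # the sum over i collapses: dg_i/dtheta_j is diagonal

# Central finite differences agree with the chain-rule result.
eps = 1e-6
num = np.array([(np.sum(((theta + eps * e) * y) ** 2) -
                 np.sum(((theta - eps * e) * y) ** 2)) / (2 * eps)
                for e in np.eye(5)])
assert np.allclose(num, dL_dtheta, atol=1e-5)
```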

Page 31:
AUTOGRAD / BACK-PROPAGATION
Let's say a specific variable had two paths to the loss:

L = f(x, x'); \quad x = g(\theta, y), \; x' = g'(\theta, y')

\frac{\partial L}{\partial \theta_j} = \sum_i \frac{\partial L}{\partial x_i} \frac{\partial g(\theta, y)_i}{\partial \theta_j} + \sum_i \frac{\partial L}{\partial x'_i} \frac{\partial g'(\theta, y')_i}{\partial \theta_j}

Page 32:
AUTOGRAD / BACK-PROPAGATION
Our very own autograd system:
Build a directed computation graph with a (python) list of nodes G = [n1, n2, n3, ...].
Each node is an object that is one of three kinds: Input, Parameter, Operation.
We will define the graph by calling functions that define functional relationships.

import edf

x = edf.Input()
theta = edf.Parameter()
y = edf.matmul(theta,x)
y = edf.tanh(y)
w = edf.Parameter()
y = edf.matmul(w,y)

Page 33:
AUTOGRAD / BACK-PROPAGATION
Each of these statements adds a node to the list of nodes.
Operation nodes are added by matmul, tanh, etc., and are linked to previous nodes that appear before them in the list as inputs.
Every node object is going to have a member element n.top which will be the value of its output:
this can be an arbitrarily shaped array;
for input and parameter nodes, these top values will just be set (or updated by SGD);
for operation nodes, the top values will be computed from the top values of their inputs.
Every operation node will be an object of a class that has a function called forward.
A forward pass will begin with the values of all inputs and parameters set.

import edf

x = edf.Input()
theta = edf.Parameter()
y = edf.matmul(theta,x)
y = edf.tanh(y)
w = edf.Parameter()
y = edf.matmul(w,y)

Page 34:
AUTOGRAD / BACK-PROPAGATION
A forward pass will begin with the values of all inputs and parameters set.
Then we will go through the list of nodes in order, and compute the values of all operation nodes.
Because nodes were added in order, if we go through them in order, the tops of our inputs will be available.

import edf

x = edf.Input()
theta = edf.Parameter()
y = edf.matmul(theta,x)
y = edf.tanh(y)
w = edf.Parameter()
y = edf.matmul(w,y)

Page 35:
AUTOGRAD / BACK-PROPAGATION
Somewhere in the training loop, where the values of the parameters have been set before:

import edf

x = edf.Input()
theta = edf.Parameter()
y = edf.matmul(theta,x)
y = edf.tanh(y)
w = edf.Parameter()
y = edf.matmul(w,y)

x.set(...)
edf.Forward()
print(y.top)

And this will give us the value of the output. But now, we want to compute gradients:
for each operation class, we will also define a function backward;
all operation and parameter nodes will also have an element called grad;
we will have to then back-propagate gradients in order.

Page 36:
AUTOGRAD
Building Our Own Deep Learning Framework
A computation graph that encodes the symbolic relationship between input and output (and loss). Nodes in this graph are:
Values: set from training data.
Params: initialized and updated using SGD.
Operations: computed as functions of their inputs.
Forward computation to compute the loss.
Backward computation to compute gradients for every node.

Page 37:
AUTOGRAD
Code from pset5/mnist.py:

# Inputs and parameters
inp = edf.Value()
lab = edf.Value()
W1 = edf.Param()
B1 = edf.Param()
W2 = edf.Param()
B2 = edf.Param()

# Model
y = edf.matmul(inp,W1)
y = edf.add(y,B1)
y = edf.RELU(y)
y = edf.matmul(y,W2)
y = edf.add(y,B2)  # This is our final prediction

Page 38:
BRIEF DETOUR
What are RELUs?

RELU(x) = max(0, x)

Element-wise non-linear activations.
Previous activations would be sigmoid-like: σ(x) = exp(x)/(1 + exp(x)).
Great when you want a probability, bad when you want to learn by gradient descent:
for both high and low values of x, ∂σ(x)/∂x ≈ 0.
So if you weren't careful, you would end up with high-magnitude activations, and the network stops learning.
Gradient descent is very fragile!

Page 39:
BRIEF DETOUR
What are RELUs?

RELU(x) = max(0, x)

What is ∂RELU(x)/∂x? 0 if x < 0, 1 otherwise.
So your gradients are passed unchanged if the input is positive, but completely attenuated if the input is negative.
So there's still the possibility of your optimization getting stuck, if you reach a point where all inputs to the RELU are negative.
We'll talk about initialization, etc. later.
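The forward/backward behavior described above fits in a few lines (a standalone NumPy sketch, not the actual pset5 edf.RELU implementation):

```python
import numpy as np

def relu(x):
    # RELU(x) = max(0, x), element-wise.
    return np.maximum(0.0, x)

def relu_backward(grad_out, x):
    # Pass upstream gradients unchanged where the input was positive,
    # zero them out where it was negative.
    return grad_out * (x > 0)

x = np.array([-2.0, -0.5, 0.5, 3.0])
assert np.array_equal(relu(x), np.array([0.0, 0.0, 0.5, 3.0]))
assert np.array_equal(relu_backward(np.ones(4), x),
                      np.array([0.0, 0.0, 1.0, 1.0]))
```

If every entry of x is negative, relu_backward returns all zeros, which is exactly the "stuck" failure mode the slide warns about.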

Page 40:
AUTOGRAD
Code from pset5/mnist.py:

# Inputs and parameters
inp = edf.Value()
lab = edf.Value()
W1 = edf.Param()
B1 = edf.Param()
W2 = edf.Param()
B2 = edf.Param()

# Model
y = edf.matmul(inp,W1)
y = edf.add(y,B1)
y = edf.RELU(y)
y = edf.matmul(y,W2)
y = edf.add(y,B2)  # This is our final prediction

Page 41:
AUTOGRAD
Code from pset5/edf.py.
When you construct an object, it just does book-keeping!

ops = []; params = []; values = []
...
class Param:
    def __init__(self):
        params.append(self)
    ...
class Value:
    def __init__(self):
        values.append(self)
    ...
class matmul:
    def __init__(self,x,y):
        ops.append(self)
        self.x = x
        self.y = y

Pages 42-52:
AUTOGRAD (figure-only slides; no recoverable text).

Page 53:
AUTOGRAD
mean will be across a batch.
accuracy is the actual accuracy of hard predictions (not differentiable).

# Inputs and parameters
inp = edf.Value()
lab = edf.Value()
W1 = edf.Param()
B1 = edf.Param()
W2 = edf.Param()
B2 = edf.Param()

y = edf.matmul(inp,W1)
y = edf.add(y,B1)
y = edf.RELU(y)
y = edf.matmul(y,W2)
y = edf.add(y,B2)

loss = edf.smaxloss(y,lab)
loss = edf.mean(loss)

acc = edf.accuracy(y,lab)
acc = edf.mean(acc)

Page 54:
AUTOGRAD
Now let's train this thing!
At the beginning of training, initialize weights randomly:

nHidden = 1024; K = 10
W1.set(xavier((28*28,nHidden)))
B1.set(np.zeros((nHidden)))
W2.set(xavier((nHidden,K)))
B2.set(np.zeros((K)))

In each iteration of training, load data into the values or inputs:

for iters in range(...):
    . . .
    inp.set(train_im[idx[b:b+BSZ],:])
    lab.set(train_lb[idx[b:b+BSZ]])
    . . .

What is this set function anyway?

Page 55:
AUTOGRAD
set is the only function that the classes Param and Value have.
It sets a member called top to be an array that holds these values.

class Value:
    def __init__(self):
        values.append(self)

    def set(self,value):
        self.top = np.float32(value).copy()

class Param:
    def __init__(self):
        params.append(self)

    def set(self,value):
        self.top = np.float32(value).copy()

Page 56:
AUTOGRAD
Note that we are loading our input data in batches, as matrices:
inp is BSZ x N;
then we're doing a matmul with W1, which is N x nHidden;
the output will be BSZ x nHidden.
Essentially, we're replacing a vector-matrix multiply for a single sample with a matrix-matrix multiply for a batch of samples.

W1.set(xavier((28*28,nHidden)))
...
B2.set(np.zeros((K)))

for iters in range(...):
    . . .
    inp.set(train_im[idx[b:b+BSZ],:])
    lab.set(train_lb[idx[b:b+BSZ]])

Page 57:
AUTOGRAD
And this will work. It will print the loss and accuracy values for the set inputs, given the current values of the parameters.
What is this magical function Forward?

W1.set(xavier((28*28,nHidden)))
...
B2.set(np.zeros((K)))

for iters in range(...):
    . . .
    inp.set(train_im[idx[b:b+BSZ],:])
    lab.set(train_lb[idx[b:b+BSZ]])

    edf.Forward()
    print(loss.top,acc.top)

Page 58:
AUTOGRAD
From edf.py:

# Global forward
def Forward():
    for c in ops:
        c.forward()

But the operation classes have their own forward function:

class matmul:
    def __init__(self,x,y):
        ops.append(self)
        self.x = x
        self.y = y

    def forward(self):
        self.top = np.matmul(self.x.top,self.y.top)

. . .
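Only matmul's forward is shown above; its backward isn't reproduced here. The standard rule for top = x @ y would look like the following sketch (our guess at the shape of such a function, not necessarily pset5/edf.py's actual code, checked by finite differences):

```python
import numpy as np

# For top = x @ y with upstream gradient g = dL/dtop:
#   dL/dx = g @ y^T   and   dL/dy = x^T @ g.
rng = np.random.default_rng(5)
x, y = rng.normal(size=(2, 3)), rng.normal(size=(3, 4))
g = np.ones((2, 4))  # dL/dtop when L = sum of all entries of x @ y

gx = g @ y.T   # gradient flowing back to x
gy = x.T @ g   # gradient flowing back to y

# Finite-difference check of one entry of gx.
eps = 1e-6
e = np.zeros_like(x)
e[0, 1] = eps
num = (np.sum((x + e) @ y) - np.sum((x - e) @ y)) / (2 * eps)
assert np.isclose(num, gx[0, 1], atol=1e-5)
```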

Pages 59-67:
AUTOGRAD (figure-only slides; no recoverable text).

Page 68:
AUTOGRAD
So the forward pass computes the loss. But we want to learn the parameters.
The SGD function is pretty simple:

def SGD(lr):
    for p in params:
        p.top = p.top - lr*p.grad

It requires p.grad (gradients with respect to the loss) to be present.
That's what Backward does!

for iters in range(...):
    . . .
    inp.set(train_im[idx[b:b+BSZ],:])
    lab.set(train_lb[idx[b:b+BSZ]])

    edf.Forward()
    print(loss.top,acc.top)
    edf.Backward(loss)
    edf.SGD(lr)
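The SGD update above can be seen to work on a standalone toy problem (our example; p here is a plain array standing in for a parameter node's top, with its gradient computed by hand):

```python
import numpy as np

# Repeated SGD steps p <- p - lr * grad on L(p) = 0.5 * ||p||^2,
# whose gradient is simply p. The iterates should shrink toward 0.
p = np.array([4.0, -2.0])
lr = 0.1
for _ in range(100):
    grad = p          # d/dp of 0.5 * p @ p
    p = p - lr * grad

# Each step scales p by (1 - lr) = 0.9, so after 100 steps p is tiny.
assert np.all(np.abs(p) < 1e-3)
```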