


Deep Learning & Neural Networks

Machine Learning – CSE 546
Sham Kakade
University of Washington
November 29, 2016

©Sham Kakade


Announcements:

- HW4 posted
- Poster Session Thurs, Dec 8

Today:
- Review: EM
- Neural nets and deep learning



Poster Session

- Thursday Dec 8, 9–11:30am
  - Please arrive 20 mins early to set up
- Everyone is expected to attend
- Prepare a poster
  - We provide poster board and pins
  - Both one large poster (recommended) and several pinned pages are OK
- Capture:
  - Problem you are solving
  - Data you used
  - ML methodology
  - Results
- Prepare a 1-minute speech about your project
- Two instructors will visit your poster separately
- Project grading: scope, depth, data


Logistic regression

- P(Y|X) represented by: P(Y = 1 | x, w) = 1 / (1 + exp(−(w0 + Σi wi xi)))
- Learning rule – MLE: maximize the conditional log-likelihood of the training data (e.g., by gradient ascent)

[Figure: the logistic (sigmoid) curve; x from −6 to 6, y from 0 to 1]
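The curve in the figure is the logistic (sigmoid) function. A minimal sketch of the representation in Python (the function names here are mine, not from the slides):

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_p(x, w, w0=0.0):
    """P(Y = 1 | x) for logistic regression with weights w and bias w0."""
    return sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))
```

Note that sigmoid(0) = 0.5, so the decision boundary w0 + w·x = 0 is exactly where the two classes are equally likely.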


Perceptron as a graph

[Figure: the perceptron drawn as a graph; its output follows the sigmoid curve, x from −6 to 6, y from 0 to 1]


Linear perceptron classification region

[Figure: the linear decision boundary of the perceptron]


Perceptron, linear classification, Boolean functions

- Can learn x1 AND x2
- Can learn x1 OR x2
- Can learn any conjunction or disjunction
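A conjunction like x1 AND x2 is linearly separable, so the classic perceptron learning rule finds it. A small sketch (helper names are illustrative, not from the lecture):

```python
def perceptron_train(data, epochs=20, lr=1.0):
    """Classic perceptron rule; data is a list of (x, y) with y in {0, 1}."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            err = y - pred  # 0 when correct, +/-1 when wrong
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
```

Training on the four labeled rows of the AND (or OR) truth table converges to a separating hyperplane within a few epochs.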


Perceptron, linear classification, Boolean functions

- Can learn majority
- Can perceptrons do everything?


Going beyond linear classification

- Solving the XOR problem
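XOR is not linearly separable, but one hidden layer fixes that. A hand-wired sketch with threshold units (the weights are chosen by hand for illustration; the lecture's construction may differ):

```python
def step(z):
    """Threshold unit: fires iff its input is positive."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # hidden unit 1 fires for x1 OR x2
    h1 = step(x1 + x2 - 0.5)
    # hidden unit 2 fires unless both inputs are on (NAND)
    h2 = step(-x1 - x2 + 1.5)
    # output fires only when both hidden units fire (AND)
    return step(h1 + h2 - 1.5)
```

So XOR = (x1 OR x2) AND NOT (x1 AND x2): each unit is still linear, but composing them gives a non-linear decision region.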


Hidden layer

- Perceptron: out(x) = g(w0 + Σi wi xi)
- 1-hidden layer: out(x) = g(w0 + Σk wk · g(w0k + Σi wik xi))


Example data for NN with hidden layer


Learned weights for hidden layer


NN for images


Weights in NN for images


Forward propagation for 1-hidden layer – Prediction

- 1-hidden layer: out(x) = g(w0 + Σk wk · g(w0k + Σi wik xi))
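The prediction step can be sketched as a plain-Python forward pass (the names W1, b1, w2, b2 and the choice of sigmoid for g are mine):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, w2, b2):
    """Hidden activations, then the output.
    W1[k] holds the weights into hidden unit k; w2 the output weights."""
    h = [sigmoid(bk + sum(w * xi for w, xi in zip(row, x)))
         for row, bk in zip(W1, b1)]
    out = sigmoid(b2 + sum(w * hk for w, hk in zip(w2, h)))
    return h, out
```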


Gradient descent for 1-hidden layer – Back-propagation: Computing the gradient

(Dropped w0 to make the derivation simpler.)


Gradient descent for 1-hidden layer – Back-propagation: Computing the gradient (continued)

(Dropped w0 to make the derivation simpler.)


Multilayer neural networks


Forward propagation – prediction

- Recursive algorithm
- Start from input layer
- Output of node Vk with parents U1, U2, …: Vk = g(Σi wik Ui)


Back-propagation – learning

- Just stochastic gradient descent!!!
- Recursive algorithm for computing gradient
- For each example:
  - Perform forward propagation
  - Start from output layer
  - Compute gradient of node Vk with parents U1, U2, …
  - Update weight wik
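The steps above, sketched for a 1-hidden-layer sigmoid network with squared-error loss (a common textbook choice; the loss used in lecture may differ):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(x, y, W1, b1, w2, b2, lr=0.5):
    """One backprop/SGD step, loss L = (out - y)^2 / 2."""
    # forward propagation
    h = [sigmoid(bj + sum(w * xi for w, xi in zip(row, x)))
         for row, bj in zip(W1, b1)]
    out = sigmoid(b2 + sum(w * hj for w, hj in zip(w2, h)))
    # start from the output layer: error signal dL/dz_out
    delta_out = (out - y) * out * (1.0 - out)
    # push the error back through w2 to the hidden layer
    delta_h = [delta_out * w * hj * (1.0 - hj) for w, hj in zip(w2, h)]
    # gradient updates for every weight
    w2 = [w - lr * delta_out * hj for w, hj in zip(w2, h)]
    b2 -= lr * delta_out
    for j, dj in enumerate(delta_h):
        W1[j] = [w - lr * dj * xi for w, xi in zip(W1[j], x)]
        b1[j] -= lr * dj
    return W1, b1, w2, b2
```

The recursion is visible in delta_h: each hidden unit's error signal is the output's error signal weighted by the edge connecting them.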


Many possible response/link functions

- Sigmoid
- Linear
- Exponential
- Gaussian
- Hinge
- Max
- …
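A few of these link functions in code (the Gaussian and hinge forms below are one common parameterization; the lecture's exact definitions may vary):

```python
import math

links = {
    "sigmoid":     lambda z: 1.0 / (1.0 + math.exp(-z)),
    "linear":      lambda z: z,
    "exponential": lambda z: math.exp(z),
    "gaussian":    lambda z: math.exp(-z * z),
    "hinge":       lambda z: max(0.0, z),  # the rectifier / ReLU
}

def max_link(zs):
    """'Max' combines several inputs rather than transforming one."""
    return max(zs)
```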


Convolutional Neural Networks & Application to Computer Vision

Machine Learning – CSE 546
Sham Kakade
University of Washington
November 29, 2016


Contains slides from…

- LeCun & Ranzato
- Russ Salakhutdinov
- Honglak Lee


Neural Networks in Computer Vision

- Neural nets have made an amazing comeback
  - Used to engineer high-level features of images
- Image features:


Some hand-created image features

Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH
(Slide credit: Honglak Lee)

Scanning an image with a detector

- Detector = classifier from image patches:
- Typically scan image with detector:
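Scanning means sliding a fixed-size window over the image and running the classifier at each position. A sketch (the detector here is any patch classifier you supply):

```python
def scan(image, detector, patch=3, stride=1):
    """Run `detector` (a classifier on patch x patch windows) at every
    location; return the (row, col) offsets where it fires."""
    H, W = len(image), len(image[0])
    hits = []
    for i in range(0, H - patch + 1, stride):
        for j in range(0, W - patch + 1, stride):
            window = [row[j:j + patch] for row in image[i:i + patch]]
            if detector(window):
                hits.append((i, j))
    return hits
```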



Using neural nets to learn non-linear features


(Slide: Y. LeCun & M.A. Ranzato)

Deep Learning = Learning Hierarchical Representations

- It's deep if it has more than one stage of non-linear feature transformation:
  Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier
- Feature visualization of convolutional net trained on ImageNet, from [Zeiler & Fergus 2013]

But, many tricks needed to work well…


Convolutional Deep Models for Image Recognition

- Learning multiple layers of representation.

(LeCun, 1992)


Convolution Layer


Fully-connected neural net in high dimension

- Example: 200x200 image
  - Fully connected, 400,000 hidden units = 16 billion parameters
  - Locally connected, 400,000 hidden units with 10x10 fields = 40 million parameters
- Local connections capture local dependencies
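The parameter counts on this slide check out with simple arithmetic:

```python
pixels = 200 * 200                    # 40,000 input pixels
hidden = 400_000                      # hidden units

# fully connected: every hidden unit sees every pixel
fully_connected = hidden * pixels     # 16 billion weights

# locally connected: every hidden unit sees only a 10x10 field
locally_connected = hidden * 10 * 10  # 40 million weights
```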

Parameter sharing

- Fundamental technique used throughout ML
- Neural net without parameter sharing:
- Sharing parameters:
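A convolution layer is the shared-parameter version of the locally connected net: one small kernel of weights is reused at every image location, shrinking 40 million local weights to just the kernel's entries per feature map. A plain-Python sketch (like most deep-learning libraries, this computes cross-correlation):

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution: the same kernel weights are applied at
    every location of the image -- this is parameter sharing."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(H - kh + 1):
        row = []
        for j in range(W - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out
```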



Pooling/Subsampling

- Convolutions act like detectors:
- But we don't expect true detections in every patch
- Pooling/subsampling nodes:
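A pooling node just keeps the strongest detector response in each small region. A minimal max-pooling sketch:

```python
def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the strongest response
    in each size x size patch of the feature map."""
    H, W = len(fmap), len(fmap[0])
    return [[max(fmap[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, W - size + 1, size)]
            for i in range(0, H - size + 1, size)]
```

Besides shrinking the feature map, this makes the representation tolerant to small shifts of the detected pattern within each patch.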

Example neural net architecture


Sample results

Simple ConvNet Applications with State-of-the-Art Performance

- Traffic Sign Recognition (GTSRB – German Traffic Sign Recognition Benchmark): 99.2% accuracy (#1: IDSIA; #2: NYU)
- House Number Recognition (Google Street View House Numbers): 94.3% accuracy

Example from Krizhevsky, Sutskever, Hinton 2012

Object Recognition [Krizhevsky, Sutskever, Hinton 2012]

Won the 2012 ImageNet LSVRC. 60 million parameters, 832M MAC ops.

Layer (top = output)      Parameters   Ops
FULL CONNECT              4M           4M
FULL 4096/ReLU            16M          16M
FULL 4096/ReLU            37M          37M
MAX POOLING
CONV 3x3/ReLU 256fm       442K         74M
CONV 3x3/ReLU 384fm       1.3M         224M
CONV 3x3/ReLU 384fm       884K         149M
MAX POOLING 2x2sub
LOCAL CONTRAST NORM
CONV 11x11/ReLU 256fm     307K         223M
MAX POOL 2x2sub
LOCAL CONTRAST NORM
CONV 11x11/ReLU 96fm      35K          105M


Results by Krizhevsky, Sutskever, Hinton 2012


Object Recognition: ILSVRC 2012 results

- ImageNet Large Scale Visual Recognition Challenge
- 1000 categories, 1.5 million labeled training samples


Object Recognition [Krizhevsky, Sutskever, Hinton 2012]


Object Recognition [Krizhevsky, Sutskever, Hinton 2012]

[Figure: a test image alongside its retrieved nearest-neighbor images]

Application to scene parsing


Semantic Labeling: labeling every pixel with the object it belongs to
[Farabet et al. ICML 2012, PAMI 2013]

- Would help identify obstacles, targets, landing sites, dangerous areas
- Would help line up depth map with edge maps


Learning challenges for neural nets

- Choosing architecture
- Slow per iteration and convergence
- Gradient "diffusion" across layers
- Many local optima


Random dropouts

- Standard backprop:
- Random dropouts: randomly choose edges not to update:
- Functions as a type of "regularization"… helps avoid "diffusion" of gradient
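The idea can be sketched by randomly zeroing activations. (Note: the slide describes dropping edges during the update; zeroing hidden units, as below, is the closely related variant popularized as "dropout" by Srivastava, Hinton et al., so treat this as an analogous illustration rather than the lecture's exact rule.)

```python
import random

def dropout_mask(n, p_drop=0.5, rng=None):
    """Bernoulli mask: each unit is dropped with probability p_drop."""
    rng = rng or random.Random(0)
    return [0.0 if rng.random() < p_drop else 1.0 for _ in range(n)]

def apply_dropout(h, mask, p_drop=0.5):
    # scale kept units so the expected activation is unchanged
    return [hi * mi / (1.0 - p_drop) for hi, mi in zip(h, mask)]
```

A fresh mask is drawn for every training example; at test time all units are kept and no scaling is needed, since training already divided by the keep probability.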


Revival of neural networks

- Neural networks fell into disfavor in the mid 90s – early 2000s
  - Many methods have now been rediscovered :)
- Exciting new results using modifications to optimization techniques and GPUs
- Challenges still remain:
  - Architecture selection feels like a black art
  - Optimization can be very sensitive to parameters
  - Requires a significant amount of expertise to get good results
