CS330 Paper Presentation: October 16th, 2019


Page 1: CS330 Paper Presentation: October 16th, 2019

CS330 Paper Presentation: October 16th, 2019

Page 2:

Supervised Classification

Page 3:

Semi-Supervised Classification: More realistic dataset

Labelled

Unlabelled

Page 4:

Semi-Supervised Classification
Most “biologically plausible” learning regime

Page 5:

A familiar problem:

?

Few-shot, multi-task learning: Generalize to unseen classes

Page 6:

A new twist on a familiar problem:

?

Page 7:

How can we leverage unlabelled data for few-shot classification?

Page 8:

Page 9:

Unlabelled data may come from the support set or not (distractors)

Page 10:

Strategy: As we can now appreciate, there are a number of possible ways to approach the original problem. To name a few:

● Siamese Networks (Koch et al., 2015)
● Matching Networks (Vinyals et al., 2016)
● Prototypical Networks (Snell et al., 2017)
● Weight initialization / update-step learning (Ravi et al., 2017; Finn et al., 2017)
● MANN (Santoro et al., 2016)
● Temporal convolutions (Mishra et al., 2017)

All are reasonable starting points for the semi-supervised few-shot classification problem!

Page 11:

Prototypical Networks (Snell et al., 2017)
Very simple inductive bias!

Page 12:

Prototypical Networks (Snell et al., 2017)

For each class, compute prototype

The embedding is generated via a simple convnet:
Pixels → 64 [3x3] filters → BatchNorm → ReLU → [2x2] MaxPool → 64-D vector

https://jasonyzhang.com/convnet/
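The encoder block can be sanity-checked with a quick shape trace. This is a sketch under two assumptions not stated on the slide: inputs are 28x28 (Omniglot-style) and the encoder stacks four such blocks, as in the Prototypical Networks paper.

```python
def conv_block_shape(h, w, c_out=64):
    # 3x3 'same' conv keeps H x W; the 2x2 max-pool floors a halving
    return h // 2, w // 2, c_out

h, w, c = 28, 28, 1  # Omniglot-style inputs (assumption: 28x28 images)
for _ in range(4):   # four stacked blocks (assumption: standard Protonet encoder)
    h, w, c = conv_block_shape(h, w)
print((h, w, c))     # (1, 1, 64): flattened to the 64-D embedding on the slide
```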

Page 13:

Prototypical Networks (Snell et al., 2017)

For each class, compute prototype

Softmax distribution over distances to prototypes for a new image

Compute loss
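The three steps above (prototype, softmax over distances, loss) fit in a few lines of NumPy. A minimal sketch with toy 2-D embeddings standing in for the 64-D convnet output; all names and numbers are illustrative:

```python
import numpy as np

def prototypes(emb, labels, n_classes):
    """Class prototype = mean embedding of that class's support examples."""
    return np.stack([emb[labels == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(query, protos):
    """Softmax over negative squared Euclidean distances to each prototype."""
    logits = -((query[None, :] - protos) ** 2).sum(axis=1)
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy episode: two classes, 2-D embeddings standing in for the convnet output
emb = np.array([[0.0, 0.0], [0.0, 2.0],   # class 0 support
                [4.0, 0.0], [4.0, 2.0]])  # class 1 support
lab = np.array([0, 0, 1, 1])
protos = prototypes(emb, lab, 2)          # prototypes at (0, 1) and (4, 1)
p = predict_proba(np.array([1.0, 1.0]), protos)
loss = -np.log(p[0])  # cross-entropy loss if the true class is 0
```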

Page 14:

Prototypical Networks (Snell et al., 2017)

For each class, compute prototype

Softmax distribution over distances to prototypes for a new image

Compute loss

Very simple inductive bias:
Reduces to a linear model with Euclidean distance
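The reduction is one line of algebra. Writing $z$ for the query embedding and $c_k$ for the class-$k$ prototype (notation ours), the logit expands as:

```latex
-\lVert z - c_k \rVert^2 = -\lVert z \rVert^2 + 2\,c_k^\top z - \lVert c_k \rVert^2
```

The $-\lVert z \rVert^2$ term is shared by every class, so it cancels in the softmax, leaving logits $w_k^\top z + b_k$ with $w_k = 2c_k$ and $b_k = -\lVert c_k \rVert^2$: a linear classifier in the embedding.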

Page 15:

Strategy for semi-supervised:
Refine prototype centers with unlabelled data.

Support

Unlabelled

Test

Page 16:

Strategy for semi-supervised:

1. Start with labelled prototypes
2. Give each unlabelled input a partial assignment to each cluster
3. Incorporate unlabelled examples into the original prototypes

Page 17:

Prototypical networks with Soft k-means

Unlabelled support set

Partial Assignment
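One refinement step follows directly from the slide: partial assignments are a softmax over negative squared distances, and each refined prototype is the weighted mean of its labelled support points plus the partially assigned unlabelled points. A minimal NumPy sketch (function name and toy data are ours):

```python
import numpy as np

def soft_kmeans_refine(protos, sup_emb, sup_lab, unl_emb):
    """One soft k-means step: partially assign every unlabelled embedding to
    each prototype, then recompute each prototype as the weighted mean of its
    labelled support points and the partially assigned unlabelled points."""
    d2 = ((unl_emb[:, None, :] - protos[None, :, :]) ** 2).sum(-1)  # (M, C)
    z = np.exp(-d2)
    z /= z.sum(axis=1, keepdims=True)  # partial assignments, rows sum to 1
    refined = []
    for c in range(len(protos)):
        sup_c = sup_emb[sup_lab == c]
        num = sup_c.sum(axis=0) + (z[:, c:c + 1] * unl_emb).sum(axis=0)
        den = len(sup_c) + z[:, c].sum()
        refined.append(num / den)
    return np.stack(refined), z

# Toy 2-D episode: one unlabelled point near class 0 pulls its prototype over
sup = np.array([[0.0, 0.0], [4.0, 0.0]])
lab = np.array([0, 1])
unl = np.array([[1.0, 0.0]])
refined, z = soft_kmeans_refine(sup.copy(), sup, lab, unl)
```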

Page 18:

What about distractor classes?

Prototypical networks with Soft k-means

Page 19:

Add a buffering prototype at the origin to “capture the distractors”

Prototypical networks with Soft k-means w/ Distractor Cluster

Page 20:

Add a buffering prototype at the origin to “capture the distractors”

Prototypical networks with Soft k-means w/ Distractor Cluster

Assumption: Distractors all come from one class!
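A toy check of the idea: stack a catch-all prototype at the origin before computing partial assignments. The assumption baked in here (as the slide notes) is that distractor embeddings land near zero; the prototype values are illustrative:

```python
import numpy as np

def partial_assign(points, protos):
    """Softmax-over-negative-squared-distance assignments, one row per point."""
    d2 = ((points[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    z = np.exp(-d2)
    return z / z.sum(axis=1, keepdims=True)

class_protos = np.array([[3.0, 3.0], [-3.0, 3.0]])         # two class prototypes
protos_plus = np.vstack([class_protos, np.zeros((1, 2))])  # + origin catch-all
distractor = np.array([[0.2, -0.1]])                       # embeds near the origin
z = partial_assign(distractor, protos_plus)
# the distractor lands almost entirely in the catch-all cluster, so it barely
# perturbs the two class prototypes during refinement
```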

Page 21:

Soft k-means + Masking Network

1. Distance

2. Compute mask with small network

Page 22:

Soft k-means + Masking Network

differentiable

Page 23:

Soft k-means + Masking

In practice, MLP is a dense layer with 20 hidden units (tanh nonlinearity)
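The masking computation can be sketched as follows. This is a rough sketch, not the paper's exact parameterization: the particular distance statistics fed to the MLP are an illustrative subset, and the weights are random stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_assignments(unl_emb, protos, W1, b1, W2, b2):
    """Sketch of the masking network: per-cluster statistics of the normalized
    distances go through a tiny tanh MLP, which outputs a soft threshold (beta)
    and slope (gamma) per cluster; a sigmoid then gates each unlabelled point's
    contribution. Every step is differentiable."""
    d2 = ((unl_emb[:, None, :] - protos[None, :, :]) ** 2).sum(-1)  # (M, C)
    d_norm = d2 / d2.mean(axis=0, keepdims=True)                    # normalize per cluster
    # Illustrative subset of per-cluster distance statistics, shape (C, 3)
    stats = np.stack([d_norm.min(0), d_norm.max(0), d_norm.var(0)], axis=1)
    hidden = np.tanh(stats @ W1 + b1)    # 20 hidden units, as the slide says
    beta, gamma = (hidden @ W2 + b2).T   # per-cluster threshold and slope
    return 1.0 / (1.0 + np.exp(gamma * (d_norm - beta)))  # sigmoid mask, (M, C)

M, C, D = 6, 3, 4
unl, protos = rng.normal(size=(M, D)), rng.normal(size=(C, D))
W1, b1 = rng.normal(size=(3, 20)), np.zeros(20)  # random stand-ins for
W2, b2 = rng.normal(size=(20, 2)), np.zeros(2)   # learned parameters
mask = masked_assignments(unl, protos, W1, b1, W2, b2)
```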

Page 24:

Datasets
● Omniglot
● miniImageNet (600 images each from 100 classes)

Page 25:

Hierarchical Datasets

Omniglot

tieredImageNet

Page 26:

miniImageNet:
Test - electric guitar
Train - acoustic guitar

tieredImageNet:
Test - musical instruments
Train - farming equipment

Page 27:

Datasets
● Omniglot
● miniImageNet (600 images each from 100 classes)
● tieredImageNet (34 broad categories, each containing 10 to 30 classes)

Within each class, 10% of examples go to the labelled split; 90% go to the unlabelled split and distractors*

*40/60 for miniImageNet

Page 28:

Datasets
● Omniglot
● miniImageNet (600 images each from 100 classes)
● tieredImageNet (34 broad categories, each containing 10 to 30 classes)

Within each class, 10% of examples go to the labelled split; 90% go to the unlabelled split and distractors*

*40/60 for miniImageNet

Much less labelled data than standard few-shot approaches!!!

Page 29:

Datasets
N: classes
K: labelled samples from each class
M: unlabelled samples from the N classes
H: distractor classes (unlabelled samples from classes other than the N)

H = N = 5; M = 5 for training & M = 20 for testing
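Episode construction under this notation can be sketched as follows (the `class_pool` structure, function name, and pool contents are illustrative, not from the paper's code):

```python
import random

def sample_episode(class_pool, n=5, k=1, m=5, h=5):
    """Sketch of episode construction under the slide's notation: N classes
    with K labelled shots each, M unlabelled examples per episode class, and
    M more from each of H distractor classes (which contribute no labels)."""
    names = random.sample(sorted(class_pool), n + h)
    episode_classes, distractor_classes = names[:n], names[n:]
    support, unlabelled = [], []
    for c in episode_classes:
        items = random.sample(class_pool[c], k + m)
        support += [(x, c) for x in items[:k]]  # K labelled shots
        unlabelled += items[k:]                 # M unlabelled, same classes
    for c in distractor_classes:
        unlabelled += random.sample(class_pool[c], m)  # distractors
    return support, unlabelled

# Illustrative pool: 20 classes with 40 example ids each
pool = {f"class{i}": list(range(i * 100, i * 100 + 40)) for i in range(20)}
support, unl = sample_episode(pool)
```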

Page 30:

Baseline Models


1. Vanilla Protonet

Page 31:

Baseline Models


1. Vanilla Protonet
2. Vanilla Protonet + one step of soft k-means refinement at test time only (supervised embedding)

Page 32:

Results: Omniglot

Page 33:

Results: miniImageNet

Page 34:

Results: tieredImageNet

Page 35:

Results: Other Baselines

Page 36:

Results
Models trained with M=5
During meta-test: vary the number of unlabelled examples

Page 37:

Results

Page 38:

Conclusions:
1. Achieve state-of-the-art performance over logical baselines on 3 datasets

Page 39:

Conclusions:
1. Achieve state-of-the-art performance over logical baselines on 3 datasets
2. K-means masked models perform best with distractors

Page 40:

Conclusions:
1. Achieve state-of-the-art performance over logical baselines on 3 datasets
2. K-means masked models perform best with distractors
3. Novel: models extrapolate to increases in the amount of unlabelled data

Page 41:

Conclusions:
1. Achieve state-of-the-art performance over logical baselines on 3 datasets
2. K-means masked models perform best with distractors
3. Novel: models extrapolate to increases in the amount of unlabelled data
4. New dataset: tieredImageNet

Page 42:

Critiques:
1. Results are convincing, but the work is actually a relatively straightforward application of (a) Protonets and (b) k-means clustering
2. Model choice: Protonets are very simple. It's not clear what they gained by the simple inductive bias
3. The presented approach does not generalize well beyond classification problems

Page 43:

Future directions: extension to unsupervised learning

I would be really interested in withholding labels altogether
Can the model learn how many classes there are?

… and correctly classify them?

Page 44:

Future directions: extension to unsupervised learning

Page 45:

Thank you!

Page 46:

Supplemental:
Accounting for Intra-Cluster Distance