Transcript
Page 1: Multimodal Deep Learning - Max-Planck-Institut für ...

Multimodal Deep Learning

Zeynep Akata

Zero-Shot Learning

Latent Embeddings for Zero-Shot Image Classification, Xian et al., CVPR’16 & CVPR’17

[Figure: a single map W with large error (left); two maps W₁ and W₂ (right).]

Linear compatibility function: large errors (left).

Piecewise-linear: significantly improves results (right).
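
The contrast between the two compatibility functions can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the linear function is a bilinear score x^T W y between an image embedding x and a class embedding y, and that the piecewise-linear variant takes the maximum over several such maps (W₁, W₂, ...). The function names and the toy numbers are hypothetical.

```python
def bilinear(x, W, y):
    """Linear compatibility x^T W y between image embedding x and class embedding y."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def latent_compatibility(x, Ws, y):
    """Piecewise-linear compatibility: the best score over several maps (assumed form)."""
    return max(bilinear(x, W, y) for W in Ws)


def predict(x, Ws, class_embeddings):
    """Zero-shot prediction: return the class whose embedding scores highest."""
    return max(class_embeddings,
               key=lambda c: latent_compatibility(x, Ws, class_embeddings[c]))


# Toy example with made-up numbers.
x = [1.0, 0.0]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[0.0, 2.0], [0.0, 0.0]]
classes = {"a": [1.0, 0.0], "b": [0.0, 1.0]}

single = predict(x, [W1], classes)       # one linear map
latent = predict(x, [W1, W2], classes)   # two maps, max over both
```

With a single map the toy image is assigned to class "a"; adding the second map lets a different linear piece win for class "b", which is the extra flexibility the piecewise-linear model provides.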

Multi-Cue Zero-Shot Learning with Strong Supervision, Akata et al., CVPR’16

[Figure: compatibility function F between an image embedding and a class embedding. Example classes: Blue Jay, Albatross. Example attributes: black, long tail, blue, cone beak.]

Attributes: costly but good, W2V: cheap but weak.

Strong visual supervision: compensates for weak W2V.

Learning Deep Representations of Fine-Grained Visual Descriptions, Reed et al., CVPR’16

[Plot: Zero-Shot in CUB. x-axis: # of train sentences per image (2 to 10). y-axis: Top-1 Acc. (in %), 30 to 60. Curves: Ours (word), Ours (char), LSTM, TCNN, Attributes.]

CNN-RNN: fast + models sequence of words or characters

With >4 sentences: outperforms SoA with attributes

Gaze Embeddings for Zero-Shot Image Classification, Karessli et al., CVPR’17

[Figure: gaze embedding pipeline. Panels: Original image; Gaze points; Gaze data collection; Outlier removal; Raw gaze data; Gaze heatmap; Gaze histogram (GH); Gaze Features with Grid (GFG); Gaze Features without Grid (GFS); GH, GFG and GFS Embedding per class. Per-point features shown: x, y, d, ... (repetition labels: x 9 and x 3).]
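
As a rough illustration of the gaze histogram (GH), the sketch below counts gaze points per cell of a grid laid over the image and averages several histograms into one per-class vector. It is an assumption-laden sketch, not the authors' code: the 3 x 3 default grid, the averaging step, the handling of out-of-image points, and all function names are hypothetical.

```python
def gaze_histogram(points, width, height, rows=3, cols=3):
    """Count gaze points per grid cell; returns a flat list of rows * cols counts."""
    counts = [0] * (rows * cols)
    for x, y in points:
        if not (0 <= x < width and 0 <= y < height):
            continue  # skip points outside the image (crude stand-in for outlier removal)
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        counts[row * cols + col] += 1
    return counts


def class_embedding(histograms):
    """Average several histograms into one per-class vector (assumed aggregation)."""
    n = len(histograms)
    return [sum(h[i] for h in histograms) / n for i in range(len(histograms[0]))]


# Toy example: four gaze points on a 100 x 100 image, one of them outside it.
points = [(10, 10), (95, 95), (50, 50), (150, 20)]
hist = gaze_histogram(points, width=100, height=100)
```

The resulting vector can then play the role of the class embedding in a compatibility function, in place of attributes or word vectors.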

Generating: Vision + Language

Generative Adversarial Text to Image Synthesis, Reed et al., ICML’16

GAN conditioned on sentences: real/fake, matching/not
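
The "real/fake, matching/not" idea can be written as a small loss sketch. It assumes the discriminator outputs a probability in (0, 1) for an (image, sentence) pair and is shown three kinds of pairs: a real image with its matching sentence, a real image with a mismatched sentence, and a generated image with the sentence it was conditioned on. The function names and the equal 0.5 weighting of the two negative terms are assumptions made for illustration.

```python
import math


def discriminator_loss(s_real_match, s_real_mismatch, s_fake_match):
    """Matching-aware discriminator loss (to be minimized).

    s_real_match:    score for (real image, matching sentence)    -> should be high
    s_real_mismatch: score for (real image, mismatched sentence)  -> should be low
    s_fake_match:    score for (generated image, its sentence)    -> should be low
    """
    return -(math.log(s_real_match)
             + 0.5 * (math.log(1.0 - s_real_mismatch)
                      + math.log(1.0 - s_fake_match)))


def generator_loss(s_fake_match):
    """The generator wants its image to be scored as real and matching."""
    return -math.log(s_fake_match)


# An undecided discriminator versus a confident, correct one.
undecided = discriminator_loss(0.5, 0.5, 0.5)
confident = discriminator_loss(0.9, 0.1, 0.1)
```

The mismatched-sentence term is what forces the discriminator to judge whether the image agrees with the text, not only whether it looks realistic.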

[Figure: images generated from the text descriptions below. Rows: GT, GAN, GAN - CLS, GAN - INT, GAN - INT - CLS.]

a tiny bird, with a tiny beak, tarsus and feet, a blue crown, blue coverts, and black cheek patch

this small bird has a yellow breast, brown crown, and black superciliary

an all black bird with a distinct thick, rounded bill.

this bird is different shades of brown all over with white and black spots on its head and back

the gray bird has a light grey head and grey webbed feet

Generates pixels from characters: intuitive

Language compensates for the lack of a large number of training images

Learning What and Where to Draw, Reed et al., NIPS’16

This bird has a yellow head, black eyes, a gray pointy beak and orange lines on its breast.

This water bird has a long white neck, black body, yellow beak and black head.

This bird is large, completely black, with a long pointy beak and black eyes.

This bird is completely red with a red and cone-shaped beak, black face and a red nape.

This white bird has gray wings, red webbed feet and a long, curved and yellow beak.

This small bird has a blue and gray head, pointy beak and a white belly.

[Figure: each description above is shown with its ground-truth (GT) image.]

Generating Visual Explanations, Hendricks et al., ECCV’16

This is a Downy Woodpecker because...

D: this bird has a white breast black wings and a red spot on its head.

E: this is a white bird with a black wing and a black and white striped head.

This is a Downy Woodpecker because...

D: this bird has a white breast black wings and a red spot on its head.

E: this is a black and white bird with a red spot on its crown.

Class + image conditional LSTM & Reinforcement Loss

Learns to mention class-specific and visible properties
