Learning Deep Embeddings with Histogram Loss (NIPS 2016)

TRANSCRIPT

Slides 1-5: bullet-point slides; the bullet text is not recoverable from this transcript.

Slides 6-9: [Figure] Overview of the histogram loss computation for a batch of examples (panel labels: deep net, embedded batch, similarity histograms (+/-), aggregation). Examples are color-coded; the same color indicates matching samples. The batch (left) is embedded into a high-dimensional space by a deep network (middle), and histograms of similarities of positive (top right) and negative (bottom right) pairs are computed. The loss then evaluates the integral of the product between the negative distribution and the cumulative density function of the positive distribution (dashed line), which corresponds to the probability that a randomly sampled positive pair has smaller similarity than a randomly sampled negative pair. Such a histogram loss can be minimized by backpropagation, and its only associated parameter is the number of histogram bins. (A code sketch of this computation follows below.)

Slide 10: bullet-point slide; text not recoverable.

Slide 11: [Figure, from "Deep Metric Learning via Lifted Structured Feature Embedding" (Hyun Oh Song and Yu Xiang, Stanford University; Stefanie Jegelka, MIT; Silvio Savarese, Stanford University; IEEE Conference on Computer Vision and Pattern Recognition)] NMI score (%) and Recall@K score (%) on the test split of Stanford Online Products with GoogLeNet [33], for embedding sizes 64-1024, comparing the Contrastive, Triplet, and LiftedStruct losses against a GoogLeNet pool5 baseline; also example failure queries (query/retrieval) on the Stanford Online Products dataset, where most failures are fine-grained differences among similar products.

Slides 12-13: bullet-point slides; text not recoverable.
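As an illustration of the computation described in the caption above, here is a minimal PyTorch-style sketch. It assumes L2-normalised embeddings (so pairwise dot products lie in [-1, 1]), batches that contain both positive and negative pairs, and the linear soft-binning that makes the histograms differentiable; the function name `histogram_loss` and the default bin count are my choices for illustration, not the authors' reference implementation.

```python
import torch

def histogram_loss(embeddings, labels, num_bins=100):
    """Sketch of the histogram loss: estimates the probability that a
    randomly sampled positive pair is less similar than a randomly
    sampled negative pair. Assumes L2-normalised embeddings."""
    n = embeddings.size(0)
    sim = embeddings @ embeddings.t()                    # pairwise cosine similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # True where labels match
    upper = torch.triu(torch.ones(n, n, device=sim.device), diagonal=1).bool()
    pos_sim = sim[same & upper]                          # similarities of positive pairs
    neg_sim = sim[(~same) & upper]                       # similarities of negative pairs

    # Soft histogram over [-1, 1]: each similarity contributes linearly to its
    # two neighbouring nodes, which keeps the estimate differentiable.
    nodes = torch.linspace(-1.0, 1.0, num_bins, device=sim.device)
    delta = 2.0 / (num_bins - 1)

    def soft_hist(s):
        weights = torch.clamp(1.0 - (s.unsqueeze(1) - nodes).abs() / delta, min=0.0)
        hist = weights.sum(dim=0)
        return hist / hist.sum()

    h_pos = soft_hist(pos_sim)
    h_neg = soft_hist(neg_sim)

    # Integral of (negative density) x (positive CDF) = P(S_pos < S_neg).
    cdf_pos = torch.cumsum(h_pos, dim=0)
    return (h_neg * cdf_pos).sum()
```

The number of bins is the loss's only hyper-parameter; the default of 100 nodes here is purely illustrative.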
Slide 14: [Figure] The GoogLeNet architecture schematic (input, a deep stack of 1x1/3x3/5x5 convolutions, max-pooling, DepthConcat, and average-pooling layers, fully connected layers, and three softmax heads: softmax0, softmax1, softmax2), together with surrounding text from the GoogLeNet paper on training (the DistBelief distributed system, asynchronous stochastic gradient descent with 0.9 momentum, Polyak averaging [13]) and on the ILSVRC classification challenge (1000 leaf-node categories; about 1.2 million training images plus validation and 100,000 test images; performance measured by the top-1 accuracy rate, which compares the ground truth against the first predicted class).

Slide 15: [Figure] "Fig. 2. The structure of the 5-layer CNN used in our method": Input 3@128x48 -> C1 convolution 64@48x48 -> S2 normalization & max pooling 64@24x24 -> C3 convolution 64@24x24 -> S4 normalization & max pooling 64@12x12 -> F5 full connection with 500 units; parameters are shared. The accompanying text distinguishes a View-Specific SCNN from a General SCNN and notes that, because the cross-dataset problem is the main concern of that paper, the focus is on the General SCNN.

Slide 16: Evaluation metrics (the accompanying per-class count table is not recoverable):
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2*TP / (2*TP + FN + FP)
F2 = 5*TP / (5*TP + 4*FN + FP)
(A sketch computing these metrics, together with Recall@K, follows below.)

Slide 17: Figure 3: Recall@K for (left) the CUB-200-2011 and (right) the Online Products datasets for different methods.

Slide 18: Figure 4: Recall@K for (left) the CUHK03 and (right) the Market-1501 datasets. The Histogram loss (4) outperforms the other methods.

Slide 19: bullet-point slide; text not recoverable.

Slide 20: [Figure, from a metric-learning paper co-authored by Nicolas Thome and Matthieu Cord (LIP6, Sorbonne University, Paris, France)] Relative-attribute constraints on the presence of a smile, ordering face classes (e)-(h) from "least smiling" to "most smiling"; the aim is to learn a dissimilarity D such that D(., .) < D(., .) for each constrained pair of pairs (a hedged sketch of such a constraint loss closes this transcript). The surrounding abstract describes a metric-learning framework handling semantic label constraints such as class taxonomies and comparing favourably against state-of-the-art methods.
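For reference, the formulas on slide 16 and the Recall@K metric plotted in Figures 3-4 can be computed as below. This is only an illustrative NumPy sketch: the function names are mine, and ranking by cosine similarity of L2-normalised embeddings is an assumption (the cited papers may rank by Euclidean distance instead).

```python
import numpy as np

def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Precision, recall and F-beta from raw counts (beta=1 gives F1, beta=2 gives F2)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    fbeta = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
    return precision, recall, fbeta

def recall_at_k(embeddings, labels, k):
    """Recall@K: fraction of queries whose k most similar items
    (the query itself excluded) contain at least one item of the same class."""
    sim = embeddings @ embeddings.T            # cosine similarities for L2-normalised embeddings
    np.fill_diagonal(sim, -np.inf)             # never retrieve the query itself
    topk = np.argsort(-sim, axis=1)[:, :k]     # indices of the k most similar items per query
    hits = (labels[topk] == labels[:, None]).any(axis=1)
    return float(hits.mean())
```

For example, `precision_recall_fbeta(tp=10, fp=5, fn=2)` gives precision 10/15, recall 10/12, and F1 = 20/27 (about 0.74).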

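Slide 20's goal of learning a dissimilarity D with D(., .) < D(., .) is typically enforced with a hinge over each ordered pair of pairs. The following is only a hedged sketch of that idea, assuming precomputed dissimilarities and a margin hyper-parameter; it is not taken from the cited paper.

```python
import torch

def relative_dissimilarity_loss(d_should_be_smaller, d_should_be_larger, margin=0.1):
    """Hinge loss over relative constraints D(a, b) < D(c, d): penalise whenever
    the dissimilarity that should be smaller is not below the other by `margin`.
    Both arguments are 1-D tensors, one entry per constraint."""
    return torch.clamp(margin + d_should_be_smaller - d_should_be_larger, min=0.0).mean()
```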