Large-Scale Object Recognition with Weak Supervision
Weiqiang Ren, Chong Wang, Yanhua Cheng, Kaiqi Huang, Tieniu Tan
{wqren,cwang,yhcheng,kqhuang,tnt}@nlpr.ia.ac.cn

Post on 18-Dec-2015

TRANSCRIPT

  • Slide 1
  • Large-Scale Object Recognition with Weak Supervision Weiqiang Ren, Chong Wang, Yanhua Cheng, Kaiqi Huang, Tieniu Tan {wqren,cwang,yhcheng,kqhuang,tnt}@nlpr.ia.ac.cn
  • Slide 2
  • Task 2: Classification + Localization
    Task 2b: Classification + localization with additional training data
    Ordered by classification error
    1. Only classification labels are used
    2. Full image as object location
  • Slide 3
  • Outline: Motivation, Method, Results
  • Slide 4
  • Motivation
  • Slide 5
  • Why Weakly Supervised Localization (WSL)?
    Knowing where to look makes recognizing objects easier. However, in the classification-only task, no annotations of object location are available.
  • Slide 6
  • Current WSL Results on VOC07
  • Slide 7
  • 13.9: Weakly supervised object detector learning with model drift detection, ICCV 2011
    15.0: Object-centric spatial pooling for image classification, ECCV 2012
    22.4: Multi-fold MIL training for weakly supervised object localization, CVPR 2014
    22.7: On learning to localize objects with minimal supervision, ICML 2014
    26.4: Weakly supervised object detection with posterior regularization, BMVC 2014
    31.6: Weakly supervised object localization with latent category learning, ECCV 2014 (Sep 11, Poster Session 4A, #34)
    26.2: Discovering Visual Objects in Large-scale Image Datasets with Weak Supervision, submitted to TPAMI
  • Slide 8
  • Our Work
    Weakly Supervised Object Localization with Latent Category Learning (ECCV 2014). VOC 2007 results: Ours 31.6, DPM 5.0 33.7.
    Discovering Visual Objects in Large-scale Image Datasets with Weak Supervision (submitted to TPAMI). VOC 2007 results: Ours 26.2, DPM 5.0 33.7.
    For efficiency in large-scale tasks, we use the second one.
  • Slide 9
  • Method
  • Slide 10
  • Framework (diagram). Labeled components: Input Images, Conv Layers, FC Layers, Cls Prediction, Det Prediction, Rescoring. The stages are numbered 1 to 4.
  • Slide 11
  • 1st: CNN Architecture
    Chatfield et al., Return of the Devil in the Details: Delving Deep into Convolutional Nets
  • Slide 12
  • 2nd: MILinear SVM
  • Slide 13
  • MILinear: Region Proposal
    Good region proposal algorithms: high recall, high overlap, small number, low computation cost
    MCG pretrained on VOC 2012 (additional data)
    Training: 128 windows/image
    Testing: 256 windows/image
    Compared to Selective Search (~2000)
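The "high recall" and "high overlap" criteria on the slide above are usually measured with intersection over union (IoU). The following is a minimal sketch under that assumption; the function names, the (x1, y1, x2, y2) box format, and the 0.5 threshold are illustrative choices, not taken from the slides.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def proposal_recall(gt_boxes, proposals, thresh=0.5):
    """Fraction of ground-truth boxes covered by at least one proposal
    with IoU >= thresh (threshold value is an assumption)."""
    if not gt_boxes:
        return 0.0
    hits = sum(
        1 for g in gt_boxes if any(iou(g, p) >= thresh for p in proposals)
    )
    return hits / len(gt_boxes)
```

With a fixed budget such as 128 or 256 windows per image, a recall measure of this kind is one way to compare proposal methods at equal cost.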
  • Slide 14
  • MILinear: Feature Representations
    Low-level features: SIFT, LBP, HOG, shape context, Gabor
    Mid-level features: Bag of Visual Words (BoVW)
    Deep hierarchical features: convolutional networks, deep auto-encoders, deep belief nets
  • Slide 15
  • MILinear: Positive Window Mining
    Clustering: KMeans
    Topic model: pLSA, LDA, gLDA
    CRF
    Multiple instance learning: DD, EMDD, APR; MI-NN, MI-SVM, mi-SVM; MILBoost
  • Slide 16
  • MILinear: Objective Function and Optimization
    Multiple instance Linear SVM
    Optimization: trust region Newton (a kind of Quasi Newton method), working in the primal, faster convergence
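The transcript does not preserve the objective function from this slide, so the following is only a sketch of how a multiple-instance linear SVM might be trained, assuming an MI-SVM-style alternation: pick the top-scoring window in each positive image, then refit a linear SVM in the primal. Plain gradient descent on an L2-regularized squared hinge loss stands in for the trust region Newton solver named on the slide, and all names and hyperparameters (`train_mil_linear`, `bags`, `C`, `lr`) are hypothetical.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_mil_linear(bags, labels, dim, C=1.0,
                     outer_iters=5, inner_iters=200, lr=0.05):
    """bags: one list of window feature vectors per image.
    labels: +1 / -1 per image (image-level labels only)."""
    w = [0.0] * dim
    for _ in range(outer_iters):
        # Step 1: one representative window per positive image (the
        # current top scorer); every window of a negative image is negative.
        examples = []
        for bag, y in zip(bags, labels):
            if y > 0:
                examples.append((max(bag, key=lambda x: dot(w, x)), 1.0))
            else:
                examples.extend((x, -1.0) for x in bag)
        # Step 2: minimize 0.5*||w||^2 + C * sum(max(0, 1 - y*w.x)^2)
        # in the primal by gradient descent (stand-in for trust region Newton).
        for _ in range(inner_iters):
            grad = list(w)
            for x, y in examples:
                margin = 1.0 - y * dot(w, x)
                if margin > 0:
                    for k in range(dim):
                        grad[k] -= 2.0 * C * margin * y * x[k]
            # Step size is scaled by the number of examples for stability.
            w = [wk - lr * gk / len(examples) for wk, gk in zip(w, grad)]
    return w
```

The squared hinge loss is used here because it is differentiable, which is what makes primal Newton-type solvers applicable; whether the authors used exactly this loss is not stated in the transcript.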
  • Slide 17
  • MILinear: Optimization Efficiency
  • Slide 18
  • 3rd: Detection Rescoring
    Rescoring with softmax. Diagram labels: 1000 classes, 128 boxes, max, 1000 dim, train softmax.
    Softmax considers all the categories simultaneously at each minibatch of the optimization.
    It suppresses the responses of other object categories with similar appearance.
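One reading of the diagram labels above is that the per-box detector scores (128 boxes by 1000 classes) are max-pooled over boxes into a single 1000-dimensional vector, which is then fed to a softmax classifier. The sketch below shows only the inference side under that assumption; `weights` and `biases` stand for a hypothetical already-trained softmax layer, and training is omitted.

```python
import math


def max_pool_over_boxes(box_scores):
    """box_scores: one list of per-class scores per box.
    Returns the per-class maximum over all boxes."""
    return [max(column) for column in zip(*box_scores)]


def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rescore(box_scores, weights, biases):
    """Apply a (hypothetical, pre-trained) softmax layer to the pooled scores."""
    pooled = max_pool_over_boxes(box_scores)
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)
```

Because the softmax normalizes across all classes at once, a high score for one class lowers the probabilities of the others, which matches the suppression effect the slide describes.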
  • Slide 19
  • 4th: Classification Rescoring
    Linear combination (1000 dim)
    One funny thing: we tried some other strategies of score combination, but they did not seem to work!
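The linear combination mentioned above can be sketched as a weighted sum of the two 1000-dimensional score vectors. The slides do not give the combination weights, so `alpha` below is a hypothetical parameter, as is the function name.

```python
def combine_scores(cls_scores, wsl_scores, alpha=0.5):
    """Weighted sum of classification scores and rescored WSL scores.
    alpha is an assumed mixing weight, not a value from the slides."""
    if len(cls_scores) != len(wsl_scores):
        raise ValueError("score vectors must have the same length")
    return [
        alpha * c + (1.0 - alpha) * d
        for c, d in zip(cls_scores, wsl_scores)
    ]
```

In practice a weight like this would typically be chosen on held-out validation data.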
  • Slide 20
  • Results
  • Slide 21
  • 1st: Classification without WSL
    Top 5 Error by method:
    Baseline with one CNN: 13.7
    Average with four CNNs: 12.5
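For readers unfamiliar with the metric, the sketch below shows how a top-5 error and a simple average over several models could be computed. It assumes each model outputs one score per class; the function names are illustrative and the slides do not say how the four CNNs were averaged.

```python
def average_models(per_model_scores):
    """Element-wise mean of the score vectors from several models."""
    n = len(per_model_scores)
    return [sum(col) / n for col in zip(*per_model_scores)]


def top_k(scores, k=5):
    """Indices of the k highest-scoring classes, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def top_k_error(score_rows, labels, k=5):
    """Fraction of images whose true label is not among the top k."""
    misses = sum(
        1 for scores, label in zip(score_rows, labels)
        if label not in top_k(scores, k)
    )
    return misses / len(labels)
```

Multiplying the returned fraction by 100 gives a percentage comparable to the figures on the slide.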
  • Slide 22
  • 2nd: MILinear on ImageNet 2014
    Detection Error by method:
    Baseline (Full Image): 61.96
    MILinear: 40.96
    Winner: 25.3
  • Slide 23
  • 2nd: MILinear on VOC 2007
  • Slide 24
  • 2nd: MILinear on ILSVRC 2013 detection
    mAP: 9.63% vs 8.99% (DPM 5.0)
  • Slide 25
  • 2nd: MILinear for Classification
    Top 5 Error by method:
    MILinear: 17.1
  • Slide 26
  • 3rd: WSL Rescoring (Softmax)
    Top 5 Error by method:
    Baseline with one CNN: 13.7
    Average with four CNNs: 12.5
    MILinear: 17.1
    MILinear + Rescore: 13.5
    The softmax-based rescoring successfully suppresses the predictions of other object categories with similar appearance!
  • Slide 27
  • 4th: Cls and WSL Combination
    Top 5 Error by method:
    Baseline with one CNN model: 13.7
    Average with four CNN models: 12.5
    MILinear: 17.1
    MILinear + Rescore: 13.5
    Cls (12.5) + MILinear (13.5): 11.5
    WSL and Cls can be complementary to each other!
  • Slide 28
  • Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge.
  • Slide 29
  • Conclusion
    WSL always helps classification
    WSL has large potential: WSL data is cheap
  • Slide 30
  • Thank You!