Count, Crop and Recognise: Fine-Grained Recognition in the Wild
Max Bain, Arsha Nagrani, Daniel Schofield, Andrew Zisserman
Visual Geometry Group, University of Oxford
Recognising Animal Individuals in a Video
• The aim: label animal individuals in every frame of a video
King Kong (King Kong, 2005)
George (Rampage, 2018)
Jambo (Durrell Wildlife Park, France)
Current Methods
Chimpanzee face recognition from videos in the wild using deep learning
D. Schofield, A. Nagrani, A. Zisserman, M. Hayashi, T. Matsuzawa, D. Biro, S. Carvalho
Science Advances, 2019
www.robots.ox.ac.uk/~vgg/research/ChimpanzeeFaces
Current Methods: Training Pipeline
Current Methods: Detection
• Pre-trained detector (SSD, YOLO, etc.)
• Fine-tune detector
• Evaluate
Current Methods: Tracking
Fully-Convolutional Siamese Networks for Object Tracking
L. Bertinetto, J. Valmadre, J. F. Henriques, A. Vedaldi, P. H. S. Torr
CVPR 2017
Current Methods: Recognition
1. Acquire identity labels from expert
2. Train identity recognition CNN
   • ResNet
   • Weighted CE loss (class imbalance)
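To illustrate the weighted cross-entropy (CE) idea on this slide, here is a minimal sketch in plain Python. The slide only says the loss is weighted to handle class imbalance. The inverse-frequency weighting, the function names and the toy class counts below are assumptions made for illustration, not the authors' implementation.

```python
import math

def inverse_frequency_weights(class_counts):
    """Assumed scheme: weight = total / (num_classes * count), so rare classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * c) for c in class_counts]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def weighted_cross_entropy(logits, target, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -weights[target] * math.log(probs[target])

# Hypothetical number of training crops per individual (imbalanced).
counts = [900, 90, 10]
weights = inverse_frequency_weights(counts)

# With identical (uninformative) logits, a mistake on the rare class costs more.
loss_common = weighted_cross_entropy([0.0, 0.0, 0.0], 0, weights)
loss_rare = weighted_cross_entropy([0.0, 0.0, 0.0], 2, weights)
```

With these toy counts the rare individual's samples contribute roughly 90 times more loss than the common individual's, which is what stops the classifier from ignoring rarely seen animals.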
Current Methods: Challenges
• Face often turned away
• Bodies prone to heavy occlusion and overlap
Current Methods: Challenges
Lack of contextual information
What if we recognised without explicit detection?
Dataset (Publicly Available)
Frame-Level Recognition
• Multi-label classifier on raw frames
• ResNet18, Sigmoid + Weighted BCE loss
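The frame-level loss can be sketched as follows: one sigmoid output per individual, with a binary cross-entropy (BCE) term per output. The slide does not say how the weighting is applied, so scaling only the positive term by a per-identity weight is an assumption, and the logits and weights are made-up values.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def weighted_bce(logits, targets, pos_weights):
    """Mean weighted BCE over identities; pos_weights scale the positive term only (assumption)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y, w in zip(logits, targets, pos_weights):
        p = sigmoid(z)
        total += -(w * y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)

# One frame containing JEJE and PELEY but not JIRE (illustrative values).
identities = ["JEJE", "PELEY", "JIRE"]
targets = [1, 1, 0]
logits = [2.0, -1.0, -3.0]
pos_weights = [1.0, 4.0, 2.0]  # hypothetical: PELEY is rare, so missing it costs more

loss = weighted_bce(logits, targets, pos_weights)
unweighted = weighted_bce(logits, targets, [1.0, 1.0, 1.0])
```

Because each identity gets its own sigmoid, several individuals can be predicted in the same frame, unlike a softmax classifier that picks exactly one.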
Count, Crop and Recognise (CCR)
[Diagram: Image → CNN (coarse-grained counting) → Region Proposal → CNN (fine-grained recognition) → Prediction: JEJE, PELEY]
Count, Crop and Recognise (CCR)
Coarse-grained counting
CNN → Count = 2
• Count labels come for free
• Trained as a classification task (ResNet18, CE loss)
• Counts of N or more are binned into a single "N+" class
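A small sketch of how counts might be turned into classification targets with an "N+" bin. The helper names and the choice N = 3 are hypothetical. Deriving the count as the number of identity labels on a frame is an assumption about what "for free" means here.

```python
def count_to_class(count, n_plus):
    """Map a raw count to a class index; every count >= n_plus shares the last class."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return min(count, n_plus)

def class_names(n_plus):
    """Readable names for the count classes, e.g. ['0', '1', '2', '3+'] for n_plus=3."""
    return [str(i) for i in range(n_plus)] + [f"{n_plus}+"]

# Assumed source of the label: the number of identity labels attached to the frame.
frame_identities = ["JEJE", "PELEY"]
target = count_to_class(len(frame_identities), n_plus=3)
names = class_names(3)
```

Binning keeps the number of classes small and avoids having very few training frames for each large count.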
Count, Crop and Recognise (CCR)
Region Proposal
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning Deep Features for Discriminative Localization. CVPR'16 (arXiv:1512.04150, 2015).
Class Activation Mapping
[Diagram: CONV layers → GAP → class weights w1, w2, …, wn]
Class Activation Map (Australian terrier) = w1 × (feature map 1) + w2 × (feature map 2) + … + wn × (feature map n)
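The weighted sum in the Class Activation Mapping diagram can be written out directly. The sketch below uses plain nested lists and toy 3×3 feature maps. The thresholding step that turns the map into a crop box is an assumption about how a region could be proposed from it, not a description of the CCR code.

```python
def class_activation_map(feature_maps, class_weights):
    """Sum K feature maps (each H x W), each scaled by that class's weight for the map."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam

def bounding_box(cam, threshold):
    """Tightest (top, left, bottom, right) box around cells >= threshold, or None."""
    cells = [(i, j) for i, row in enumerate(cam) for j, v in enumerate(row) if v >= threshold]
    if not cells:
        return None
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    return (min(rows), min(cols), max(rows), max(cols))

# Two toy feature maps from the last conv layer and the class's weights for them.
feature_maps = [
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
]
weights = [2.0, -1.0]

cam = class_activation_map(feature_maps, weights)
box = bounding_box(cam, threshold=1.0)
```

In practice the map is upsampled to the input resolution before cropping; that step is omitted here to keep the example short.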
Results
Count, Crop and Recognise (CCR)
• Recognise
Fine-grained recognition
CNN → JEJE, PELEY
• Multi-label classification (ResNet18)
• Sigmoid + BCE loss
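At inference time, the sigmoid outputs have to be turned into a set of names. One simple way, sketched here, is to keep every identity whose probability clears a threshold. The 0.5 threshold, the function name and the logits are assumptions made for illustration; the slides do not state the decision rule.

```python
import math

def predict_identities(logits, names, threshold=0.5):
    """Return the names whose sigmoid probability is at least the threshold."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [name for name, p in zip(names, probs) if p >= threshold]

# Illustrative logits for one cropped region.
names = ["FANA", "JEJE", "PAMA", "FOAF", "PELEY"]
logits = [-2.0, 3.1, -0.4, -1.5, 0.8]

predicted = predict_identities(logits, names)
```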
Results
Individual: JIRE

| Method   | AP   |
|----------|------|
| Face     | 31.2 |
| Body     | 42.3 |
| Baseline | 82.3 |
| CCR      | 86.4 |
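For readers unfamiliar with the AP column, the sketch below computes average precision for one individual from per-frame scores. It uses the non-interpolated definition (mean of the precision at each true positive in the ranked list). The slides do not say which AP variant was used, so this choice and the toy scores are assumptions.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Four frames: model scores for one individual, and whether they are truly present.
scores = [0.9, 0.8, 0.6, 0.3]
labels = [1, 0, 1, 0]

ap = average_precision(scores, labels)  # (1/1 + 2/3) / 2
```

The table reports AP as a percentage, so this toy value would be written as 83.3.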
Conclusions
• Detect, Track and Recognise pipelines are limited by detector performance
• Body > Face
• Frame-level recognition offers an alternative; more research is needed
A quirky post-hoc application
FANA JEJE PAMA FOAF
Top-Down Neural Attention by Excitation Backprop
J. Zhang, Z. Lin, J. Brandt, X. Shen and S. Sclaroff
ECCV 2016
Annotation Tools
Abhishek Dutta and Andrew Zisserman. 2019.
The VIA Annotation Software for Images, Audio and Video.
In Proceedings of the 27th ACM International Conference on Multimedia (MM ’19)
• Object Annotator (bounding boxes, keypoints, pose)
• Temporal segmentation (presence, actions, speech)
Thank you to the organisers for arranging this workshop!
Paper, Dataset and Code at: www.robots.ox.ac.uk/~vgg/research/ccr