Fast and Compact Retrieval Methods in Computer Vision Part II
![Page 1: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/1.jpg)
Fast and Compact Retrieval Methods in Computer Vision Part II
• A. Torralba, R. Fergus and Y. Weiss. Small Codes and Large Image Databases for Recognition. CVPR 2008
• A. Torralba, R. Fergus, W. Freeman. 80 million tiny images: a large dataset for non-parametric object and scene recognition. TR
Presented by Ken and Ryan
![Page 2: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/2.jpg)
Outline
• Large Datasets of Images
• Searching Large Datasets
– Nearest Neighbor
– ANN: Locality Sensitive Hashing
• Dimensionality Reduction
– Boosting
– Restricted Boltzmann Machines (RBM)
• Results
![Page 3: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/3.jpg)
Goal
• Develop efficient image search and scene matching techniques that are fast and require very little memory
• Particularly on VERY large image sets
Query
![Page 4: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/4.jpg)
Motivation
• Image sets
– Vogel & Schiele: 702 natural scenes in 6 categories
– Oliva & Torralba: 2688 images
– Caltech 101: ~50 images/category, ~5000 images
– Caltech 256: 80-800 images/category, 30608 images
• Why do we want larger datasets?
![Page 5: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/5.jpg)
Motivation
• Classify any image
• Complex classification methods don’t extend well
• Can we use a simple classification method?
![Page 6: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/6.jpg)
Thumbnail Collection Project
• Collect images for ALL objects
– List obtained from WordNet
– 75,378 non-abstract nouns in English
![Page 7: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/7.jpg)
Thumbnail Collection Project
• Collected 80M images
• http://people.csail.mit.edu/torralba/tinyimages
![Page 8: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/8.jpg)
How Much is 80M Images?
• One feature-length movie:
– 105 min = 151K frames @ 24 FPS
• For 80M images, watch 530 movies
• How do we store this?
– 1k * 80M = 80 GB
– Actual storage: 760 GB
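The arithmetic on this slide can be checked with a short sketch. It assumes "1k" means 1,000 bytes per image, which the slide does not state explicitly; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the slide's numbers.
FPS = 24
MOVIE_MINUTES = 105
NUM_IMAGES = 80_000_000

# Frames in one feature-length movie: 105 min * 60 s/min * 24 frames/s.
frames_per_movie = MOVIE_MINUTES * 60 * FPS

# Number of movies whose frames add up to 80M images.
movies = NUM_IMAGES / frames_per_movie

# Assumption: "1k" per image is read as 1,000 bytes.
BYTES_PER_IMAGE = 1_000
storage_gb = NUM_IMAGES * BYTES_PER_IMAGE / 1e9

print(frames_per_movie, round(movies, 1), storage_gb)
```

The frame count comes out at 151,200 and the movie count at roughly 529, which the slide rounds to 151K and 530.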
![Page 9: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/9.jpg)
First Attempt
• Store each image as a 32x32 color thumbnail
• Based on human visual perception
• Information: 32 * 32 * 3 channels = 3072 entries
![Page 10: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/10.jpg)
First Attempt
• Used SSD++ to find nearest neighbors of query image
– Used first 19 principal components
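A minimal sketch of brute-force nearest-neighbor search under a sum-of-squared-differences (SSD) score. It shows plain SSD only, not the SSD++ variant named on the slide, and it assumes each image has already been reduced to a short vector (for example its first 19 principal-component coefficients). The function names are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbors(query, database, k):
    """Return the indices of the k database vectors with the smallest SSD to the query."""
    order = sorted(range(len(database)), key=lambda i: ssd(query, database[i]))
    return order[:k]


# Toy usage with 2-D vectors standing in for PCA coefficients.
db = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
print(nearest_neighbors([0.9, 1.2], db, k=2))
```

Every query scans the whole database, which is what motivates the approximate and binary-code methods that follow.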
![Page 11: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/11.jpg)
Motivation Part 2
• Is this good enough?
• SSD is naïve
• Still too much storage required
• How can we fix this?
– Traditional methods of searching large datasets
– Binary reduction
![Page 12: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/12.jpg)
![Page 13: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/13.jpg)
![Page 14: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/14.jpg)
![Page 15: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/15.jpg)
Locality-Sensitive Hash Families
![Page 16: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/16.jpg)
![Page 17: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/17.jpg)
LSH Example
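The slide's own example survives only as an image, so the following is a generic sketch of one common locality-sensitive hash family (random hyperplanes, where each bit is the sign of a random projection). Which family the presenters illustrated is not known; the function names and the seed parameter are assumptions.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian directions in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(x, planes):
    """One bit per hyperplane: 1 if x lies on its non-negative side, else 0."""
    return tuple(
        int(sum(p * v for p, v in zip(plane, x)) >= 0.0) for plane in planes
    )


planes = make_hyperplanes(num_bits=8, dim=3)
a = [1.0, 2.0, 3.0]
print(lsh_code(a, planes))
```

Vectors separated by a small angle agree on most bits, so items that share a code (or nearly share one) are candidate neighbors.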
![Page 18: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/18.jpg)
![Page 19: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/19.jpg)
Binary Reduction
Lots of pixels → Gist vector (512 values) → Binary reduction → 32 bits
80 million images? 164 GB → 320 MB
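The two storage figures can be reproduced with a small calculation. The 4 bytes per gist value (single-precision floats) is an assumption that happens to match the 164 GB figure; the slide does not state the storage type.

```python
NUM_IMAGES = 80_000_000

# Gist descriptor: 512 values per image.
# Assumption: each value is stored as a 4-byte float.
GIST_VALUES = 512
BYTES_PER_VALUE = 4
gist_gb = NUM_IMAGES * GIST_VALUES * BYTES_PER_VALUE / 1e9

# Binary code: 32 bits = 4 bytes per image.
CODE_BITS = 32
code_mb = NUM_IMAGES * CODE_BITS / 8 / 1e6

print(gist_gb, code_mb)
```

Under that assumption the gist vectors take about 163.84 GB and the 32-bit codes take 320 MB, a reduction by a factor of 512.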
![Page 20: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/20.jpg)
Gist
“The ‘gist’ is an abstract representation of the scene that spontaneously activates memory representations of scene categories (a city, a mountain, etc.)”
A. Oliva and A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. Journal of Computer Vision, 42(3):145–175, 2001.
![Page 21: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/21.jpg)
Gist
![Page 22: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/22.jpg)
http://ilab.usc.edu/siagian/Research/Gist/Gist.html
Gist vector
![Page 23: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/23.jpg)
Querying
[Figure labels: Query Image, Dataset]
![Page 24: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/24.jpg)
Querying
[Figure labels: 1, ?]
![Page 25: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/25.jpg)
Querying
[Figure labels: 6, ?]
![Page 26: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/26.jpg)
Querying
![Page 27: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/27.jpg)
Boosting
• Positive and negative image pairs are used to learn the binary reduction.
– Positive pair (image & image): label = 1
– Negative pair (image & image): label = -1
– 150K pairs, 80% negatives
![Page 28: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/28.jpg)
BoostSSC
• Similarity Sensitive Coding
• Weights start uniformly
[Figure labels: xi (N values), Weight]
![Page 29: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/29.jpg)
BoostSSC
• For each bit m:
– Choose the index n that minimizes a weighted error across the entire training set
[Figure: feature vector x from image i (N values, index n) → binary reduction h(x) (M bits, bit m)]
![Page 30: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/30.jpg)
BoostSSC
• Weak classifications are evaluated via regression stumps:
$$f(x_i, x_j) = \alpha\,[(x_i(n) > T) = (x_j(n) > T)] + \beta$$
• We need to figure out $\alpha$, $\beta$, and $T$ for each $n$.
If $x_i$ and $x_j$ are similar, we should get 1 for most n’s.
[Figure labels: xi, xj (N values), index n]
![Page 31: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/31.jpg)
BoostSSC
• Try a range of thresholds $T$:
– Regress $f$ across the entire training set to find each $\alpha$ and $\beta$.
– Keep the $T$ that fits best.
• Then, keep the $n$ that causes the least weighted error.
$$f(x_i, x_j) = \alpha\,[(x_i(n) > T) = (x_j(n) > T)] + \beta$$
[Figure labels: xi, xj (N values), index n]
![Page 32: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/32.jpg)
BoostSSC
[Figure labels: xi, xj (N values, index n), M bits (bit m)]
![Page 33: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/33.jpg)
BoostSSC
• Update weights.
– Affects future error calculations
[Figure labels: xi, xj (N values, index n), Weight]
![Page 34: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/34.jpg)
BoostSSC
• In the end, each bit has an n index and a threshold.
[Figure labels: xi (N values), M bits]
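The BoostSSC steps on the preceding slides can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the least-squares fit of the stump and the exponential weight update follow the usual GentleBoost recipe, which is an assumption here, and all function names are hypothetical.

```python
import math


def fit_stump(pairs, labels, weights, n, T):
    """Weighted least-squares fit of f = alpha * agree + beta for index n, threshold T."""
    sw = [0.0, 0.0]   # total weight of pairs that disagree (0) / agree (1)
    swy = [0.0, 0.0]  # weighted label sums for the same two groups
    agree = []
    for (xi, xj), y, w in zip(pairs, labels, weights):
        a = int((xi[n] > T) == (xj[n] > T))
        agree.append(a)
        sw[a] += w
        swy[a] += w * y
    beta = swy[0] / sw[0] if sw[0] > 0 else 0.0
    alpha = (swy[1] / sw[1] if sw[1] > 0 else 0.0) - beta
    err = sum(w * (y - (alpha * a + beta)) ** 2
              for a, y, w in zip(agree, labels, weights))
    return err, alpha, beta


def boost_ssc(pairs, labels, num_bits, thresholds):
    """Pick one (index n, threshold T) per bit by minimizing weighted error."""
    weights = [1.0 / len(pairs)] * len(pairs)  # weights start uniform
    dim = len(pairs[0][0])
    bits = []
    for _ in range(num_bits):
        err, alpha, beta, n, T = min(
            (fit_stump(pairs, labels, weights, n, T) + (n, T)
             for n in range(dim) for T in thresholds),
            key=lambda r: r[0])
        bits.append((n, T))
        # Assumed GentleBoost-style update: up-weight pairs the stump gets wrong.
        new_w = []
        for (xi, xj), y, w in zip(pairs, labels, weights):
            a = int((xi[n] > T) == (xj[n] > T))
            new_w.append(w * math.exp(-y * (alpha * a + beta)))
        total = sum(new_w)
        weights = [w / total for w in new_w]
    return bits


def encode(x, bits):
    """Binary code for x: one thresholded feature per learned bit."""
    return [int(x[n] > T) for n, T in bits]


# Toy data: feature 0 separates similar from dissimilar pairs, feature 1 does not.
pairs = [([0.1, 0.1], [0.2, 0.2]), ([0.8, 0.2], [0.9, 0.8]),
         ([0.1, 0.3], [0.9, 0.2]), ([0.7, 0.6], [0.2, 0.1])]
labels = [1, 1, -1, -1]
bits = boost_ssc(pairs, labels, num_bits=1, thresholds=[0.5])
print(bits, encode([0.8, 0.2], bits))
```

At query time only the learned (n, T) pairs are needed, so encoding an image costs one comparison per bit.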
![Page 35: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/35.jpg)
BoostSSC
![Page 36: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/36.jpg)
Restricted Boltzmann Machine (RBM) Architecture
• Network of binary stochastic units
• Hinton & Salakhutdinov, Science 2006
Parameters:
– w: Symmetric Weights
– b: Biases
– h: Hidden Units
– v: Visible Units
![Page 37: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/37.jpg)
Multi-Layer RBM Architecture
![Page 38: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/38.jpg)
Training RBM Models
• Two phases
1. Pre-training
• Unsupervised
• Use Contrastive Divergence to learn weights and biases
• Gets parameters in the right ballpark
2. Fine-tuning
• Supervised
• No longer stochastic
• Backpropagate error to update parameters
• Moves parameters to a local minimum
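As a rough illustration of the pre-training phase, the sketch below performs one step of contrastive divergence (CD-1) for a single RBM with binary units. It is a textbook-style sketch under simplifying assumptions (one training vector, one Gibbs step, plain Python lists); the function names and learning rate are hypothetical and not taken from the slides.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probs(v, W, b_h):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_h))]


def visible_probs(h, W, b_v):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b_v))]


def sample(probs, rng):
    """Draw binary states from independent Bernoulli probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_step(v0, W, b_v, b_h, lr, rng):
    """Update W, b_v, b_h in place from one data vector v0 using CD-1."""
    ph0 = hidden_probs(v0, W, b_h)      # positive phase
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b_v), rng)  # one-step reconstruction
    ph1 = hidden_probs(v1, W, b_h)      # negative phase
    for i in range(len(b_v)):
        for j in range(len(b_h)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b_v[i] += lr * (v0[i] - v1[i])
    for j in range(len(b_h)):
        b_h[j] += lr * (ph0[j] - ph1[j])


rng = random.Random(0)
W = [[0.0, 0.0] for _ in range(3)]  # 3 visible units, 2 hidden units
b_v, b_h = [0.0] * 3, [0.0] * 2
cd1_step([1, 0, 1], W, b_v, b_h, lr=0.1, rng=rng)
```

In a multi-layer model, each layer would be trained this way in turn, with the hidden activations of one layer serving as the visible data for the next.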
![Page 39: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/39.jpg)
Greedy Pre-training (Unsupervised)
![Page 40: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/40.jpg)
Greedy Pre-training (Unsupervised)
![Page 41: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/41.jpg)
Greedy Pre-training (Unsupervised)
![Page 42: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/42.jpg)
![Page 43: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/43.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Output of RBM
W are RBM weights
![Page 44: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/44.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Assume K=2 classes
![Page 45: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/45.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Pulls nearby points of same class closer
![Page 46: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/46.jpg)
Neighborhood Components Analysis
• Goldberger, Roweis, Salakhutdinov & Hinton, NIPS 2004
Pulls nearby points of same class closer
Goal is to preserve neighborhood structure of original, high-dimensional space
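The slides' NCA equations survive only as images, so the following sketch uses the standard NCA objective: each point picks a neighbor with probability given by a softmax over negative squared distances, and the objective sums the probability that the chosen neighbor shares the point's class. The inputs here are low-dimensional codes (for example network outputs); the function name and toy data are assumptions.

```python
import math


def nca_objective(codes, labels):
    """Expected number of points whose stochastically chosen neighbor shares their class."""
    total = 0.0
    for i, zi in enumerate(codes):
        # Unnormalized neighbor affinities; a point never selects itself.
        sims = [0.0 if k == i else
                math.exp(-sum((a - b) ** 2 for a, b in zip(zi, zk)))
                for k, zk in enumerate(codes)]
        norm = sum(sims)
        if norm > 0.0:  # guard against underflow when all other points are far away
            same = sum(s for s, c in zip(sims, labels) if c == labels[i])
            total += same / norm
    return total


codes = [[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]]
good = nca_objective(codes, [0, 0, 1, 1])  # classes match the two clusters
bad = nca_objective(codes, [0, 1, 0, 1])   # classes cut across the clusters
print(good, bad)
```

Maximizing this quantity with respect to the mapping that produces the codes is what pulls nearby same-class points together.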
![Page 47: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/47.jpg)
Experiments and Results
![Page 48: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/48.jpg)
Searching
• Bit limitations:
– Hashing scheme:
• Max. capacity for 13M images: 30 bits
– Exhaustive search:
• 256 bits possible
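The two search modes can be sketched as follows, with binary codes stored as Python integers. The exhaustive version ranks every code by Hamming distance; the hashing version uses the code as a table key and probes all keys within a small Hamming radius of the query. Function names and the toy codes are hypothetical, and this is only a sketch of the general idea.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def exhaustive_search(query, codes, k):
    """Indices of the k codes closest to the query in Hamming distance."""
    return sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))[:k]


def build_table(codes):
    """Hash table mapping each code to the indices of images that have it."""
    table = {}
    for idx, c in enumerate(codes):
        table.setdefault(c, []).append(idx)
    return table


def hash_lookup(query, table, num_bits, radius):
    """Collect every image whose code is within `radius` bit flips of the query."""
    hits = []
    for r in range(radius + 1):
        for flips in combinations(range(num_bits), r):
            probe = query
            for bit in flips:
                probe ^= 1 << bit
            hits.extend(table.get(probe, []))
    return hits


codes = [0b0000, 0b0001, 0b1111, 0b0011]
print(exhaustive_search(0b0000, codes, k=2))
print(hash_lookup(0b0000, build_table(codes), num_bits=4, radius=1))
```

The number of probes in the hashing version grows quickly with the radius and the code length, while the exhaustive scan grows linearly with the number of images.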
![Page 49: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/49.jpg)
Searching Results
![Page 50: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/50.jpg)
LabelMe Retrieval
![Page 51: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/51.jpg)
Examples of Web Retrieval
• 12 neighbors using different distance metrics
![Page 52: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/52.jpg)
Web Images Retrieval
![Page 53: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/53.jpg)
Conclusion
• Efficient searching for large image datasets
• Compact image representation
• Methods for binary reductions
– Locality-Sensitive Hashing
– Boosting
– Restricted Boltzmann Machines
• Searching techniques
![Page 54: Fast and Compact Retrieval Methods in Computer Vision Part II](https://reader036.vdocuments.net/reader036/viewer/2022062410/56815a9f550346895dc82645/html5/thumbnails/54.jpg)