Lecture 2: Image Classification pipeline · cs231n.stanford.edu/slides/2015/lecture2.pdf · Fei-Fei Li...
TRANSCRIPT
Fei-Fei Li & Andrej Karpathy, Lecture 2, 7 Jan 2015
Lecture 2: Image Classification pipeline
Image Classification: a core task in Computer Vision
cat
(assume given set of discrete labels) {dog, cat, truck, plane, ...}
The problem: semantic gap
Images are represented as $\mathbb{R}^d$ arrays of numbers, e.g. $\mathbb{R}^3$ with integers in [0, 255], where d = 3 represents the 3 color channels (RGB)
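As a minimal sketch of this representation, assuming NumPy and a made-up 4 x 6 image (the sizes and pixel values are illustrative, not from the lecture), an RGB image can be held as a height x width x 3 array of 8-bit integers:

```python
import numpy as np

# Hypothetical tiny image: 4 rows, 6 columns, 3 color channels (RGB).
height, width, channels = 4, 6, 3
image = np.zeros((height, width, channels), dtype=np.uint8)

# Set the top-left pixel to pure red: R=255, G=0, B=0.
image[0, 0] = [255, 0, 0]

# A classifier only ever sees these numbers, not the concept "cat".
flat = image.reshape(-1)  # one long vector of height * width * channels values
print(image.shape, image.dtype, flat.shape)
```

The gap between this grid of integers and the label a person would give the picture is the semantic gap named on the slide.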
An image classifier
Unlike e.g. sorting a list of numbers, no obvious way to hard-code the algorithm for recognizing a cat, or other classes.
Data-driven approach:
1. Collect a dataset of images and label them
2. Use Machine Learning to train an image classifier
3. Evaluate the classifier on a withheld set of test images
Example training set
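The three steps of the data-driven approach can be sketched as a pair of functions plus an accuracy check. The names `train`, `predict`, and `accuracy`, the toy data, and the placeholder "always predict the most common label" model are all assumptions for illustration; any real classifier would slot into the same interface.

```python
from collections import Counter

def train(train_images, train_labels):
    """Build a model from labeled data. This placeholder only memorizes the most common label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return {"default_label": most_common_label}

def predict(model, test_images):
    """Return one predicted label per test image."""
    return [model["default_label"] for _ in test_images]

def accuracy(predicted, actual):
    """Fraction of correctly predicted labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Step 1: a (toy) labeled dataset; each "image" is a short list of pixel values.
train_images = [[0, 0], [1, 1], [9, 9]]
train_labels = ["cat", "cat", "dog"]
test_images = [[0, 1], [8, 9]]
test_labels = ["cat", "dog"]

model = train(train_images, train_labels)   # step 2: learn from the training set
predictions = predict(model, test_images)   # step 3: predict on withheld images
print(accuracy(predictions, test_labels))   # fraction of test labels predicted correctly
```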
First classifier: Nearest Neighbor Classifier
Remember all training images and their labels
Predict the label of the most similar training image
Example dataset: CIFAR-10. 10 labels, 50,000 training images, 10,000 test images.
For every test image (first column), examples of nearest neighbors in rows
How do we compare the images? What is the distance metric?
L1 distance: $d_1(I_1, I_2) = \sum_p |I_1^p - I_2^p|$, where $I_1$ denotes image 1, $I_2$ denotes image 2, and $p$ ranges over the pixels
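A small sketch of the L1 distance, assuming NumPy and two made-up 2 x 2 single-channel images (the function name `l1_distance` and the pixel values are illustrative):

```python
import numpy as np

def l1_distance(image_a, image_b):
    """Sum of absolute pixel-wise differences between two same-shaped images."""
    # Cast to a signed type first: subtracting uint8 arrays wraps around
    # (e.g. 10 - 20 becomes 246 instead of -10).
    a = image_a.astype(np.int64)
    b = image_b.astype(np.int64)
    return int(np.abs(a - b).sum())

# Two hypothetical 2x2 single-channel images.
image_a = np.array([[56, 32], [90, 23]], dtype=np.uint8)
image_b = np.array([[10, 20], [100, 30]], dtype=np.uint8)

print(l1_distance(image_a, image_b))  # |56-10| + |32-20| + |90-100| + |23-30| = 75
```

The cast matters in practice: image data is usually stored as unsigned 8-bit integers, and differences computed directly in that type are silently wrong.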
Nearest Neighbor classifier
Nearest Neighbor classifier
remember the training data
Nearest Neighbor classifier
for every test image:
- find nearest train image with L1 distance
- predict the label of nearest training image
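The two phases described on these slides (remember the training data, then find the nearest training image for each test image) might look roughly like the following. This is a sketch assuming NumPy and flattened images stored one per row; the class name, method names, and toy data are illustrative rather than the code shown on the slide.

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        """X is N x D (one flattened image per row); y holds the N labels."""
        # "Training" just remembers the data.
        self.X_train = X.astype(np.int64)
        self.y_train = np.asarray(y)

    def predict(self, X):
        """X is M x D; returns the label of the closest training row for each test row."""
        X = X.astype(np.int64)
        predictions = []
        for test_row in X:
            # L1 distance from this test image to every training image.
            distances = np.abs(self.X_train - test_row).sum(axis=1)
            nearest_index = int(np.argmin(distances))
            predictions.append(self.y_train[nearest_index])
        return np.array(predictions)

# Toy data: four "images" of three pixels each (illustrative values).
X_train = np.array([[0, 0, 0], [10, 10, 10], [200, 200, 200], [255, 250, 245]], dtype=np.uint8)
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[5, 5, 5], [240, 240, 240]], dtype=np.uint8)

nn = NearestNeighbor()
nn.train(X_train, y_train)
print(nn.predict(X_test))  # dark test image -> label 0, bright test image -> label 1
```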
Nearest Neighbor classifier
Q: what is the complexity of the NN classifier w.r.t. a training set of N images and a test set of M images?
1. at training time? O(1)
2. at test time? O(NM)
Nearest Neighbor classifier
1. at training time? O(1)
2. at test time? O(NM)
This is backwards:
- test time performance is usually much more important.
- CNNs flip this: expensive training, cheap test evaluation
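To make the O(NM) test-time cost concrete, here is a sketch that computes every test-to-train L1 distance at once with NumPy broadcasting. The sizes `N`, `M`, `D` and the random data are assumptions chosen only to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, D = 50, 8, 12   # hypothetical sizes: training images, test images, pixels per image
X_train = rng.integers(0, 256, size=(N, D))
X_test = rng.integers(0, 256, size=(M, D))

# Broadcasting builds an M x N x D array of differences; summing over the pixel
# axis leaves one L1 distance per (test image, training image) pair.
distances = np.abs(X_test[:, None, :] - X_train[None, :, :]).sum(axis=2)

print(distances.shape)               # (M, N): M * N distances, each over D pixels
nearest = distances.argmin(axis=1)   # index of the nearest training image per test image
```

Nothing is computed at training time, so all of this work is paid for every batch of test images, and it grows with the size of the training set.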
Aside: Approximate Nearest Neighbor: find approximate nearest neighbors quickly
The choice of distance is a hyperparameter
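L1 (Manhattan) distance: $d_1(I_1, I_2) = \sum_p |I_1^p - I_2^p|$
L2 (Euclidean) distance: $d_2(I_1, I_2) = \sqrt{\sum_p (I_1^p - I_2^p)^2}$
- The two most commonly used special cases of the p-norm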
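A sketch of the general p-norm distance, assuming NumPy; the function name `p_norm_distance` and the two-element vectors are illustrative. Note that `p` here is the exponent of the norm, not the pixel index used in the distance formulas.

```python
import numpy as np

def p_norm_distance(a, b, p):
    """Distance induced by the p-norm: (sum_i |a_i - b_i|^p)^(1/p)."""
    diff = np.abs(np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64))
    return float((diff ** p).sum() ** (1.0 / p))

a = [0, 0]
b = [3, 4]
print(p_norm_distance(a, b, 1))  # L1 (Manhattan): 3 + 4 = 7.0
print(p_norm_distance(a, b, 2))  # L2 (Euclidean): sqrt(9 + 16) = 5.0
```

Treating the exponent as a parameter is one way to expose the choice of distance as a hyperparameter.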
k-Nearest Neighbor: find the k nearest images, have them vote on the label
[Figure panels: the data, NN classifier, 5-NN classifier]
http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
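The voting step can be sketched as follows, assuming NumPy, L1 distance, and a toy one-pixel dataset with a single mislabeled-looking outlier. The function name `knn_predict` and the data are illustrative.

```python
from collections import Counter
import numpy as np

def knn_predict(X_train, y_train, x, k):
    """Label x by majority vote among its k nearest training rows (L1 distance)."""
    distances = np.abs(X_train.astype(np.int64) - x.astype(np.int64)).sum(axis=1)
    # Indices of the k smallest distances; a stable sort keeps ties in training order.
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data: one pixel per "image". The "dog" at value 5 sits among cats.
X_train = np.array([[0], [3], [5], [8], [20], [22]], dtype=np.uint8)
y_train = ["cat", "cat", "dog", "cat", "dog", "dog"]
x = np.array([6], dtype=np.uint8)

print(knn_predict(X_train, y_train, x, k=1))  # nearest single neighbor is the outlier
print(knn_predict(X_train, y_train, x, k=3))  # two of the three nearest are cats
```

With k = 1 the outlier decides the label; with k = 3 it is outvoted, which is the smoothing effect that the NN versus 5-NN comparison illustrates.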
What is the best distance to use? What is the best value of k to use? i.e. how do we set the hyperparameters? Very problem-dependent. Must try them all out and see what works best.
Trying out what hyperparameters work best on the test set: very bad idea. The test set is a proxy for generalization performance; tuning on it overfits to the test set and makes that estimate unreliable.
Validation data: use to tune hyperparameters; evaluate on the test set ONCE at the end
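One way to carve out validation data, sketched with NumPy. The dataset size, the 80/20 split, and the random data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical labeled training data: 100 examples, 4 features each.
X = rng.integers(0, 256, size=(100, 4))
y = rng.integers(0, 10, size=100)

# Shuffle once, then hold out part of the training data as a validation set.
order = rng.permutation(len(X))
val_idx, train_idx = order[:20], order[20:]
X_val, y_val = X[val_idx], y[val_idx]
X_train, y_train = X[train_idx], y[train_idx]

# Hyperparameters are compared on (X_val, y_val) only; the test set stays untouched.
print(X_train.shape, X_val.shape)
```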
Cross-validation: cycle through the choice of which fold is the validation fold, and average the results.
Example of 5-fold cross-validation for the value of k. Each point is a single outcome. The line goes through the mean, and the bars indicate the standard deviation. (It seems that k = 7 works best for this data.)
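A sketch of 5-fold cross-validation over k, assuming NumPy, L1 distance, and synthetic two-class data. The data, the function names, and the candidate values of k are assumptions, so the numbers it prints will not match the plot described above.

```python
from collections import Counter
import numpy as np

def knn_accuracy(X_train, y_train, X_val, y_val, k):
    """Accuracy of a k-NN classifier (L1 distance, majority vote) on a validation fold."""
    correct = 0
    for x, label in zip(X_val, y_val):
        distances = np.abs(X_train - x).sum(axis=1)
        nearest = np.argsort(distances, kind="stable")[:k]
        vote = Counter(y_train[nearest].tolist()).most_common(1)[0][0]
        correct += int(vote == label)
    return correct / len(y_val)

def cross_validate(X, y, k, num_folds=5):
    """Mean and standard deviation of validation accuracy over num_folds folds."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    scores = []
    for i in range(num_folds):
        # Fold i is the validation fold; the remaining folds form the training set.
        X_train = np.concatenate(X_folds[:i] + X_folds[i + 1:])
        y_train = np.concatenate(y_folds[:i] + y_folds[i + 1:])
        scores.append(knn_accuracy(X_train, y_train, X_folds[i], y_folds[i], k))
    return float(np.mean(scores)), float(np.std(scores))

# Synthetic two-class data (an assumption for illustration, not CIFAR-10).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
order = rng.permutation(len(X))
X, y = X[order], y[order]

for k in (1, 3, 5, 7):
    mean, std = cross_validate(X, y, k)
    print(k, round(mean, 3), round(std, 3))
```

The k with the highest mean validation accuracy would be chosen, and only then would the classifier be run on the test set.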
Summary
- Image Classification: We are given a Training Set of labeled images, asked to predict labels on Test Set. Common to report the Accuracy of predictions (fraction of correctly predicted images)
- We introduced the k-Nearest Neighbor Classifier, which predicts the labels based on nearest images in the training set
- We saw that the choice of distance and the value of k are hyperparameters that are tuned using a validation set, or through cross-validation if the size of the data is small.
- Once the best set of hyperparameters is chosen, the classifier is evaluated once on the test set.