brief review of recognition + contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/lecture...

18
Brief Review of Recognition + Context Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem 04/01/10

Upload: others

Post on 30-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Brief Review of Recognition+ Context

Computer Vision

CS 543 / ECE 549

University of Illinois

Derek Hoiem

04/01/10

Page 2: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Object Instance Recognition

• Want to recognize the same or equivalent object instance, which may vary– Slight deformations

– Change in lighting

– Occlusion

– Rotation, rescaling, translation, perspective =

=

=

Page 3: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Object Instance Recognition

• Template matching: faces– Recognize by directly computing

pixel distance of aligned faces

– Principal component analysis gives a subspace that preserves variance

– Linear Discriminant Analysis (LDA) or Fisher Linear Discriminants(FLD) gives a subspace that maximizes discrimination

• This could work for other kinds of aligned objects

Page 4: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Object Instance Recognition• If object is not aligned, we need to

perform geometric matching1. Find distinctive and repeatable

keypoints• E.g., Difference of Gaussian, Harris

corners, or MSER regions2. Represent the appearance at these

points (e.g., SIFT)3. Match pairs of keypoints4. Estimate transformation (e.g.,

rotation, scale, translation) from matched keypoints

• Hough voting• Geometric refinement

• Clustering (visual words) and inverse document frequency enable fast search in large datasets

A1

A2

A3

Page 5: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Category recognition

• Instances across categories tend to vary in more challenging ways than a single instance across images

Page 6: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Image Categorization• In training, a classifier is trained for a particular feature

representation using labeled examples

• The features should generally capture local patterns but with loose spatial encoding

• For scene categorization, a reasonable choice is often1. Compute visual words (detect interest points, represent them

with SIFT, and cluster)2. Compute a spatial pyramid of these visual words, composed

of histograms at different spatial resolutions3. Train a linear SVM classifier or one with a Chi-squared kernel

Training LabelsTraining

Images

Classifier Training

Image Features

Trained Classifier

Page 7: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Object Category Detection• One difficulty of object category detection is that

objects could appear at many scales or translations, and keypoint matching will be unreliable

• A simple way around this is to treat category detection as a series of image categorization tasks, breaking up the image into thousands of windows and applying a binary classifier to each

• Often, the object is classified using edge-based features whose positions are defined at fixed position in the sliding window

Object or Background?

Page 8: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Object Category Detection

• Sliding windows might work well for rigid objects

• But some objects may be better thought of as spatial arrangements of parts

Page 9: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Object Category Detection• Part-based models have three key components

– Part definition and appearance model– Model of geometry or layout of parts– Algorithm for efficient search

• ISM Model– Parts are clustered detected keypoints– Position of each part wrt object center/size is recorded– Search is done through Hough voting / Mean-shift

clustering combination• Pictorial structures model

– Parts are rectangles detected in silhouette– Layout is articulated model with tree-shaped graph– Search through dynamic programming or probabilistic

sampling

Page 10: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Region-based recognition

• Sometimes, we want to label image pixels or regions

• Basic approach:– Segment the image into blocks, superpixels, or

regions

– Represent each region with histograms of keypoints, color, texture, and position

– Classify each region (variety of classifiers used)

Page 11: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Context in Recognition

• Objects usually are surrounded by a scene that can provide context in the form of naerbyobjects, surfaces, scene category, geometry, etc.

Page 12: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Context provides clues for function

• What is this?

These examples from Antonio Torralba

Page 13: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Context provides clues for function

• What is this?

• Now can you tell?

Page 14: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Sometimes context is the major component of recognition

• What is this?

Page 15: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

Sometimes context is the major component of recognition

• What is this?

• Now can you tell?

Page 16: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

More Low-Res

• What are these blobs?

Page 17: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

More Low-Res

• The same pixels! (a car)

Page 18: Brief Review of Recognition + Contextdhoiem.cs.illinois.edu/courses/vision_spring10/lectures/Lecture 20 - Review of...representation using labeled examples • The features should

We will see more on context later…