![Page 1: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/1.jpg)
Learning Spatial Context:
Can stuff help us find things?
Geremy Heitz, Daphne Koller
April 14, 2008
DAGS
Stuff (n): Material defined by a homogeneous or repetitive pattern of fine-scale properties, with no specific or distinctive spatial extent or shape.
Thing (n): An object with a specific size and shape.
![Page 2: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/2.jpg)
Outline
• Sliding window object detection
• What is context?
• The Things and Stuff (TAS) model
• Results
![Page 3: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/3.jpg)
Object Detection
Task: Find all the cars in this image.
• Return a “bounding box” for each
Evaluation:
• Maximize true positives
• Minimize false positives
• Precision-Recall tradeoff
![Page 4: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/4.jpg)
Sliding Window Detection
Consider every bounding box:
• All shifts
• All scales
• Possibly all rotations
Each box gets a score: D(x,y,s,Θ)
Detections: Local peaks in D()
Pros:
• Covers the entire image
• Flexible to allow variety of D()’s
Cons:
• Brute force: can be slow
• Only considers features in box
[Figure: two example windows with scores D = 1.5 and D = -0.3]
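The brute-force scan described on this slide can be sketched in a few lines. This is a minimal illustration, not the detector used in the talk: the names `sliding_window_scores`, `detections`, and `toy_score` are hypothetical, rotations are omitted, windows are square, and a simple threshold stands in for proper local-peak finding in D().

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int]  # (x, y, side length s) of a square window


def sliding_window_scores(
    width: int,
    height: int,
    sizes: List[int],
    stride: int,
    score_fn: Callable[[int, int, int], float],
) -> List[Tuple[float, Box]]:
    """Score every window over all shifts and scales (rotations omitted)."""
    results = []
    for s in sizes:                                   # all scales
        for y in range(0, height - s + 1, stride):    # all vertical shifts
            for x in range(0, width - s + 1, stride): # all horizontal shifts
                results.append((score_fn(x, y, s), (x, y, s)))
    return results


def detections(scored, threshold):
    """Keep windows scoring above threshold, best first.

    A real detector would keep local peaks of D(); thresholding is a
    simplification made for this sketch.
    """
    return sorted((d for d in scored if d[0] > threshold), reverse=True)


def toy_score(x, y, s):
    """Hypothetical D(x, y, s): peaks for a side-8 window centered at (10, 10)."""
    cx, cy = x + s / 2, y + s / 2
    return 1.5 - 0.1 * (abs(cx - 10) + abs(cy - 10)) - 0.2 * abs(s - 8)


scored = sliding_window_scores(32, 32, sizes=[8, 16], stride=2, score_fn=toy_score)
best = detections(scored, threshold=0.0)[0]
print(len(scored), best)
```

Even on a 32×32 image with two scales and a stride of 2, the scan evaluates 250 windows, which is why the "brute force" cost matters at realistic image sizes.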
![Page 5: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/5.jpg)
Features: Haar wavelets
Haar filters and integral image (Viola and Jones, ICCV 2001)
The average intensity in a block is computed from four integral-image lookups, independently of the block size.
BOOSTING!
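To make the constant-time block computation concrete, here is a small pure-Python sketch of an integral image and a two-rectangle Haar-like feature. The function names and the tiny test image are illustrative assumptions; a practical system would use an optimized library routine instead.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def block_sum(ii, x0, y0, x1, y1):
    """Sum of the block [x0, x1) x [y0, y1): four lookups, any block size."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


def block_mean(ii, x0, y0, x1, y1):
    """Average intensity of the block."""
    return block_sum(ii, x0, y0, x1, y1) / ((x1 - x0) * (y1 - y0))


def haar_two_rect(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    left = block_sum(ii, x, y, x + half, y + h)
    right = block_sum(ii, x + half, y, x + 2 * half, y + h)
    return left - right


img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
print(block_mean(ii, 1, 1, 3, 3), haar_two_rect(ii, 0, 0, 4, 3))
```

The integral image is built once per image; after that, every Haar-like feature at every window position costs only a handful of lookups.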
![Page 6: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/6.jpg)
Features: Edge fragments
Weak detector = Match of edge chain(s) from training image to edgemap of test image
Opelt, Pinz, Zisserman, ECCV 2006
BOOSTING!
![Page 7: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/7.jpg)
Histograms of oriented gradients
• Dalal & Triggs, 2006
• SIFT, D. Lowe, ICCV 1999
SVM!
![Page 8: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/8.jpg)
Sliding Window Results
PASCAL Visual Object Classes Challenge
Cows 2006
score(A,B) = |A∩B| / |A∪B|
• True Pos: B s.t. score(A,B) > 0.5 for some A
• False Pos: B s.t. score(A,B) < 0.5 for all A
• False Neg: A s.t. score(A,B) < 0.5 for all B
Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
[Figure: overlapping boxes A and B; precision vs. recall rate for My Detector and INRIA-Douze]
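The overlap score and the precision/recall definitions on this slide can be sketched as follows. This follows the slide's definitions literally, so several detections of the same object all count as true positives; the box format `(x0, y0, x1, y1)`, the function names, and the treatment of a score of exactly 0.5 as a non-match are assumptions of this sketch.

```python
def iou(a, b):
    """score(A,B) = |A ∩ B| / |A ∪ B| for boxes given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(truth, detected, thresh=0.5):
    """Return (precision, recall) for ground-truth boxes A and detections B."""
    # True positive: a detection B that overlaps some A by more than thresh.
    tp = sum(1 for b in detected if any(iou(a, b) > thresh for a in truth))
    # False positive: every other detection.
    fp = len(detected) - tp
    # False negative: a ground-truth box A that no detection overlaps enough.
    fn = sum(1 for a in truth if all(iou(a, b) <= thresh for b in detected))
    # Empty denominators are reported as 0.0 in this sketch.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [(1, 1, 11, 11), (50, 50, 60, 60)]
print(iou(truth[0], detected[0]), precision_recall(truth, detected))
```

Sweeping the detector's score threshold and recomputing these two numbers at each setting traces out a precision-recall curve like the ones shown in the results slides.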
![Page 9: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/9.jpg)
Satellite Detection Example
![Page 10: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/10.jpg)
Quantitative Evaluation
[Figure: recall rate vs. false positives per image]
![Page 11: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/11.jpg)
Why does this suck?
True Positives in Context
False Positives in Context
False Positives out of Context
Context!
![Page 12: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/12.jpg)
What is Context?
![Page 13: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/13.jpg)
What is Context?
Scene-Thing: gist; car “likely”, keyboard “unlikely” (Torralba et al., 2005)
Stuff-Stuff: (Gould et al., 2008)
Thing-Thing: (Rabinovich et al., 2007)
![Page 14: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/14.jpg)
What is Context?
Stuff-Thing: Based on intuitive “relationships”
• Trees = no cars
• Houses = cars nearby
• Road = cars here
![Page 15: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/15.jpg)
Things
Candidate detections:
• Bounding Box + Score
• Boolean R.V. Ti
• Ti = 1: Candidate is a positive detection
Thing-only model
[Figure: graphical model over Image Window Wi and Ti]
$$P(T_i \mid W_i) = \frac{1}{1 + \exp(-D(W_i))}$$
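As a quick numeric illustration of the thing-only model, the sketch below maps a raw detector score D(W) to a probability. It assumes the standard logistic form with a negated score, so that higher scores give higher probabilities; the function name is hypothetical, and the two nonzero scores are the example values from the sliding window slide.

```python
import math


def detection_probability(score: float) -> float:
    """Logistic link: P(T_i = 1 | W_i) = 1 / (1 + exp(-D(W_i)))."""
    return 1.0 / (1.0 + math.exp(-score))


# Example window scores, plus 0.0 as the point of maximal uncertainty.
for d in (1.5, -0.3, 0.0):
    print(f"D = {d:+.1f} -> P(T = 1 | W) = {detection_probability(d):.3f}")
```

A score of zero maps to a probability of one half, so the sign of D(W) decides which side of an even bet the candidate falls on.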
![Page 16: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/16.jpg)
Stuff
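Coherent image regions:
• Coarse “superpixels”
• Feature vector Fj in R^n
• Cluster label Sj
Stuff-only model: Naïve Bayes
[Figure: graphical model over Sj and Fj]
$$P(S_j, F_j) = P(S_j)\, P(F_j \mid S_j)$$
$$F_j \mid S_j = s \sim \mathcal{N}(\mu_s, \Sigma_s)$$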
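Under the stuff-only model, the posterior over a region's cluster label follows from Bayes' rule. The sketch below assumes Gaussian class-conditionals with diagonal covariance, which is a simplification chosen to keep the code dependency-free; the priors, means, and variances are made-up numbers for illustration.

```python
import math


def log_gaussian_diag(f, mean, var):
    """Log density of a Gaussian with diagonal covariance (sketch assumption)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(f, mean, var)
    )


def cluster_posterior(f, priors, means, variances):
    """P(S_j = s | F_j) proportional to P(S_j = s) P(F_j | S_j = s)."""
    logs = [
        math.log(p) + log_gaussian_diag(f, m, v)
        for p, m, v in zip(priors, means, variances)
    ]
    top = max(logs)  # subtract the max before exponentiating, for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical clusters in a 2-D feature space.
priors = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

midpoint = cluster_posterior([2.0, 2.0], priors, means, variances)
near_first = cluster_posterior([0.0, 0.0], priors, means, variances)
print(midpoint, near_first)
```

A feature vector halfway between the two cluster means is assigned to each cluster with equal probability, while one sitting on a cluster mean is assigned to that cluster almost with certainty.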
![Page 17: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/17.jpg)
Relationships
Descriptive Relations:
• “Near”, “Above”, “In front of”, etc.
• Choose a set R
• Rij: Relation between detection i and region j
Relationship model
[Figure: regions labeled S72 = Trees, S4 = Houses, S10 = Road; graphical model over Rij, Ti, and Sj]
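One way to picture Rij is as a discrete label computed from the geometry of a candidate window and a region. The rule below is purely hypothetical, since the slides do not say how relations are computed: it compares the region's centroid to the window, uses image coordinates with y growing downward, and picks from the relation set {"in", "above", "below", "left", "right"}.

```python
def relation(box, centroid):
    """Hypothetical rule: where a region centroid lies relative to a window.

    box is (x0, y0, x1, y1); centroid is (cx, cy); y grows downward.
    """
    x0, y0, x1, y1 = box
    cx, cy = centroid
    if x0 <= cx <= x1 and y0 <= cy <= y1:
        return "in"
    # Offset of the region centroid from the window center.
    dx = cx - (x0 + x1) / 2
    dy = cy - (y0 + y1) / 2
    if abs(dy) >= abs(dx):
        return "above" if dy < 0 else "below"
    return "left" if dx < 0 else "right"


def relation_table(boxes, centroids):
    """R[(i, j)] for every detection i and region j."""
    return {
        (i, j): relation(b, c)
        for i, b in enumerate(boxes)
        for j, c in enumerate(centroids)
    }


boxes = [(10, 10, 20, 20)]
centroids = [(15, 15), (15, 2), (0, 15), (30, 16)]
table = relation_table(boxes, centroids)
print(table)
```

Because every detection is paired with every region, the table has one entry per (i, j) pair, which is what makes the relationship variables the densest part of the model.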
![Page 18: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/18.jpg)
The TAS Model
• Wi: Window
• Ti: Object Presence
• Sj: Region Label
• Fj: Region Features
• Rij: Relationship
[Figure: plate diagram over Image Window Wi, Ti, Rij, Sj, and Fj, with plates labeled N and J]
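The variables on this slide suggest a joint distribution that combines the thing-only and stuff-only models with a relationship term. The factorization below is a sketch under stated assumptions: the arrows of the plate diagram do not survive in this transcript, so it assumes the edges $W_i \to T_i$, $S_j \to F_j$, and $T_i, S_j \to R_{ij}$, and it reads the plate labels N and J as the number of candidate windows and the number of regions.

```latex
P(T, S, F, R \mid W)
  = \prod_{i=1}^{N} P(T_i \mid W_i)
    \prod_{j=1}^{J} P(S_j)\, P(F_j \mid S_j)
    \prod_{i=1}^{N} \prod_{j=1}^{J} P(R_{ij} \mid T_i, S_j)
```

Under these assumptions, the first product is the thing-only model, the second is the Naïve Bayes stuff-only model, and the third couples each detection to each region through their observed relationship.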
![Page 19: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/19.jpg)
Unrolled Model
[Figure: unrolled model with candidate windows T1, T2, T3 and image regions S1 to S5; example relationships: R11 = “Left”, R13 = “In”, R21 = “Above”, R31 = “Left”, R33 = “In”]
![Page 20: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/20.jpg)
Learning
• Everything observed except Sj’s
• Expectation-Maximization
• Mostly discrete variables
• Like Mixture-of-Gaussians
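Since the slide compares learning to a Mixture-of-Gaussians, a plain EM loop for a 1-D Gaussian mixture illustrates the pattern: the hidden cluster label plays the role of Sj. This is only the analogy, not TAS learning itself, which would also involve the Ti and Rij terms; the data and initial values are made up.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def em_mixture(data, means, variances, weights, iterations=50):
    """EM for a 1-D Gaussian mixture with hidden cluster labels."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: posterior over the hidden label for each data point.
        resp = []
        for x in data:
            joint = [weights[s] * normal_pdf(x, means[s], variances[s]) for s in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the soft assignments.
        for s in range(k):
            n_s = sum(r[s] for r in resp)
            means[s] = sum(r[s] * x for r, x in zip(resp, data)) / n_s
            var = sum(r[s] * (x - means[s]) ** 2 for r, x in zip(resp, data)) / n_s
            variances[s] = max(var, 1e-6)  # floor to avoid a collapsed component
            weights[s] = n_s / len(data)
    return means, variances, weights


data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = em_mixture(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print(means, variances, weights)
```

The E-step fills in the unobserved labels with their posterior probabilities, and the M-step then fits parameters as if those soft labels were observed, which is why having everything else observed keeps learning simple.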
An ode to directed models:
Oh directed probabilistic models
You are so beautiful and palatable
Because unlike your undirected friends
Your parameters are so very interpretable
- Unknown Russian Mathematician
(Translated by Geremy Heitz)
[Figure: TAS plate diagram]
![Page 21: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/21.jpg)
Learned Satellite Clusters
![Page 22: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/22.jpg)
Inference
Goal:
Gibbs Sampling
• Easy to sample Ti’s given Sj’s and vice versa
• Could do distributional particles
[Figure: TAS plate diagram]
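The alternating scheme on this slide, sampling the Ti's given the Sj's and then the Sj's given the Ti's, can be sketched on a toy problem. Everything numeric here is hypothetical: two region labels (0 = road, 1 = trees), two relations (0 = "in", 1 = "not in"), and a made-up table for P(Rij | Ti, Sj). The sketch also assumes that each Ti depends on its detector prior and its relationships, and each Sj on its region evidence and its relationships.

```python
import random


def sample_index(weights, rng):
    """Draw an index with probability proportional to weights."""
    r = rng.random() * sum(weights)
    acc = 0.0
    for k, w in enumerate(weights):
        acc += w
        if r < acc:
            return k
    return len(weights) - 1


def gibbs_tas(prior_t, region_lik, rel, rel_cpt, sweeps=2000, burn_in=200, seed=0):
    """Estimate P(T_i = 1 | observations) by Gibbs sampling.

    prior_t[i]       : P(T_i = 1 | W_i) from the detector
    region_lik[j][s] : P(S_j = s) * P(F_j | S_j = s), up to a constant
    rel[i][j]        : observed relation R_ij, as an index
    rel_cpt[t][s][r] : P(R_ij = r | T_i = t, S_j = s)
    """
    rng = random.Random(seed)
    n, m, k = len(prior_t), len(region_lik), len(region_lik[0])
    t, s = [0] * n, [0] * m
    counts = [0] * n
    for sweep in range(sweeps):
        # Sample each T_i given all region labels.
        for i in range(n):
            w = []
            for val in (0, 1):
                p = prior_t[i] if val == 1 else 1.0 - prior_t[i]
                for j in range(m):
                    p *= rel_cpt[val][s[j]][rel[i][j]]
                w.append(p)
            t[i] = sample_index(w, rng)
        # Sample each S_j given all detection variables.
        for j in range(m):
            w = []
            for lab in range(k):
                p = region_lik[j][lab]
                for i in range(n):
                    p *= rel_cpt[t[i]][lab][rel[i][j]]
                w.append(p)
            s[j] = sample_index(w, rng)
        if sweep >= burn_in:
            for i in range(n):
                counts[i] += t[i]
    return [c / (sweeps - burn_in) for c in counts]


# Hypothetical tables: true cars tend to be "in" road and "not in" trees.
rel_cpt = [
    [[0.5, 0.5], [0.5, 0.5]],  # T = 0: relation is uninformative
    [[0.9, 0.1], [0.1, 0.9]],  # T = 1: [road], [trees]
]
region_lik = [[0.9, 0.1], [0.1, 0.9]]  # region 0 looks like road, region 1 like trees
rel = [[0, 1], [1, 0]]                 # candidate 0 sits in region 0, candidate 1 in region 1
posterior = gibbs_tas([0.5, 0.5], region_lik, rel, rel_cpt)
print(posterior)
```

Both candidates start with the same detector prior of 0.5, yet the candidate sitting in the road-like region ends up with a much higher posterior than the one sitting in the tree-like region, which is the effect the context model is meant to capture.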
![Page 23: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/23.jpg)
Results - Satellite
Prior: Detector Only
Posterior: TAS Model
Region Labels
![Page 24: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/24.jpg)
Results - Satellite
[Figure: recall rate vs. false positives per image, Base Detector vs. TAS Model]
![Page 25: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/25.jpg)
PASCAL VOC Challenge
2005 Challenge
• 2232 images split into {train, val, test}
• Cars, Bikes, People, and Motorbikes
2006
• 5304 images split into {train, test}
• 12 classes, we use Cows and Sheep
Results reported for challenge with state-of-the-art approaches
• Caveat: They didn’t get to see the test set before the challenge, but I did!
![Page 26: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/26.jpg)
Results – PASCAL
Cows
![Page 27: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/27.jpg)
Results – PASCAL
Bicycles
Cluster #3
![Page 28: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/28.jpg)
Results – PASCAL
Good examples
Discover “true positives”
Remove “false positives”
![Page 29: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/29.jpg)
Results – VOC 2005
[Figure: precision vs. recall rate for Car, Motorbike, Bicycle, and People; curves: TAS Model, Base Detectors, INRIA-Dalal]
![Page 30: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/30.jpg)
Results – VOC 2006
[Figure: precision vs. recall rate for Cow and Sheep; curves: TAS Model, Base Detectors, INRIA-Douze]
![Page 31: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/31.jpg)
Conclusions
• Detectors can benefit from context
• The TAS model captures an important type of context
• We can improve any sliding window detector using TAS
• The TAS model can be interpreted and matches our intuitions
• Geremy is smart
![Page 32: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/32.jpg)
![Page 33: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/33.jpg)
![Page 34: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/34.jpg)
Detections in Context
Task: Identify all cars in the satellite image
Idea: The surrounding context adds info to the local window detector
[Figure: Prior: Detector Only + Region Labels (Houses, Road) = Posterior: TAS Model; recall rate vs. false positives per image, Base Detector vs. TAS Model]
![Page 35: Learning Spatial Context: Can stuff help us find things? Geremy Heitz Daphne Koller April 14, 2008 DAGS Stuff (n): Material defined by a homogeneous or](https://reader035.vdocuments.net/reader035/viewer/2022062804/56649d785503460f94a5b7e1/html5/thumbnails/35.jpg)
Equations
$$P(T_i \mid W_i) = \frac{1}{1 + \exp(-D(W_i))}$$
$$P(S_j, F_j) = P(S_j)\, P(F_j \mid S_j)$$
$$F_j \mid S_j = s \sim \mathcal{N}(\mu_s, \Sigma_s)$$