a bayesian model of visual attention - mit9.520/spring09/classes/06may2009... · 2010-01-22 ·...
A Bayesian Model of Visual Attention
Sharat Chikkerur, Thomas Serre, Cheston Tan & Tomaso Poggio
Center for Biological and Computational Learning, MIT
Outline
• Introduction
    Limitations of feed‐forward processing
    Role of attention
• A computational model of attention
• Applications
    Modeling human eye‐movements
    Object recognition under clutter
Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
*Modified from (Gross, 1998)
Feed‐forward processing
see also Broadbent 1952, 1954; Treisman 1960; Treisman & Gelade 1980; Duncan & Desimone 1995; Wolfe, 1997; Tsotsos and many others
Zoccolan Kouh Poggio DiCarlo 2007; Reynolds Chelazzi & Desimone 1999; Serre Oliva Poggio 2007
Role of attention
Parallel processing (no attention); serial processing (with attention)
Biology of attention
Feed-forward; Feature-attention
Rao 2005; Lee & Mumford 2003
Attention as Bayesian inference
[Figure: graphical model; node labels O, Fi, Fil, I, L, N; brain-area labels LIP/FEF, V2, V4, IT, PFC]
Spatial attention
• We use a Bayesian framework to model the interaction between the ventral stream and LIP/FEF.
• Feed‐forward connections within the ventral stream are modeled as bottom‐up evidence. Feedback connections from higher areas are modeled as top‐down priors.
• The posterior probability of location generates a task‐based saliency map.
Model description
[Figure: model diagram; labels: Fi, Fil, L, N, “What”, “Where”, Feature-maps, Image]
Model properties: invariance
[Figure: model diagram; labels: Fi, Fil, L, N, “What”, “Where”]
Model properties: crowding
[Figure: model diagram; labels: Fi, Fil, L, N, “What”, “Where”]
Model: spatial attention
[Figure: model diagram; labels: Fi, Fil, L, N]
What is at location X?
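The question "What is at location X?" can be sketched as a posterior over features once the location is fixed. This is a toy calculation under assumed binary features and an assumed constant `background` likelihood for an absent feature; the function name and all numbers are hypothetical and are not taken from the slides.

```python
import numpy as np

def feature_posterior_at(evidence, feature_prior, location, background=0.1):
    """Toy posterior P(F_i = 1 | I, L = location) for each binary feature i."""
    evidence = np.asarray(evidence, dtype=float)
    prior = np.asarray(feature_prior, dtype=float)
    # Bayes' rule per feature, with the location treated as known.
    present = prior * evidence[:, location]
    absent = (1.0 - prior) * background
    return present / (present + absent)

# Illustrative numbers only: two features, four locations.
evidence = np.array([[0.9, 0.1, 0.1, 0.1],
                     [0.1, 0.1, 0.8, 0.1]])
at_loc0 = feature_posterior_at(evidence, [0.5, 0.5], location=0)
at_loc2 = feature_posterior_at(evidence, [0.5, 0.5], location=2)
```

Attending to location 0 raises the posterior of feature 0, and attending to location 2 raises that of feature 1, even though the feature priors are identical in both calls.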
Model: feature‐based attention
[Figure: model diagram; labels: Fi, Fil, L, N]
Where is object X?
Matching human eye‐movements
Car search Pedestrian search
Dataset: 100 CBCL street-scenes images containing cars and pedestrians, plus 20 images with neither object.
Experiment: 8 subjects were shown these 120 images in random order. Each image in the stimulus set was presented twice. The subjects were asked to count the number of cars/pedestrians. For each of these block trials, the subject’s eye movements were recorded using an infra-red eye tracker.
(Psychophysics done by Cheston Tan)
Model instantiation
[Figure: graphical model; node labels O, Fi, Fil, I, L, N; brain-area labels LIP/FEF, V2, V4, IT, PFC]
Examples
Quantitative evaluation: ROC
[Figure: ROC area (axis 0 to 1) for car and pedestrian search; legend: Humans, Bottom-up, Top-down (feature-based), Feature-based + contextual cues]
Integrating (local) feature-based + (global) context-based cues accounts for 92% of inter-subject agreement!
Chikkerur, Tan, Serre & Poggio (SFN ’09, VSS ’09)
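The slides do not state how the ROC area was computed. One common approach, sketched here as an assumption, treats the saliency map as a classifier of fixated versus non-fixated locations and takes the area under the ROC curve as the probability that a fixated location scores higher than a non-fixated one. The function name and the data are hypothetical.

```python
import numpy as np

def roc_area(saliency, fixated):
    """Area under the ROC curve for a saliency map scored against fixations.

    saliency : array of saliency values, one per location
    fixated  : boolean array of the same shape, True where a fixation landed
    """
    saliency = np.asarray(saliency, dtype=float).ravel()
    fixated = np.asarray(fixated, dtype=bool).ravel()
    pos = saliency[fixated]    # saliency at fixated locations
    neg = saliency[~fixated]   # saliency everywhere else
    # Compare every fixated location with every non-fixated one; ties count half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Illustrative 2x2 map: both fixated cells are more salient than the others.
saliency = np.array([[0.9, 0.2],
                     [0.4, 0.1]])
fixated = np.array([[True, False],
                    [True, False]])
area = roc_area(saliency, fixated)
chance = roc_area([0.5, 0.5], [True, False])
```

An area of 1 means the map ranks every fixated location above every non-fixated one, and 0.5 corresponds to chance. The pairwise comparison is quadratic in the number of locations, so it suits small examples only.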
Effect of clutter on detection
recognition without attention
recognition under attention
Scale and location prediction
Performance improves under attention
[Figure: performance (d′), axis 0 to 3, for Model and Humans; conditions: no attention, one shift of attention]
Tan, Chikkerur, Serre & Poggio (VSS ’09)
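Performance above is reported as d′. The slides do not show the formula, so the following is a minimal sketch of the standard signal-detection definition, d′ = z(hit rate) − z(false-alarm rate), where z is the inverse of the standard normal CDF. The rates in the example are made up for illustration and are not the study's data.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    need a correction before calling this function.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical rates, for illustration only.
example = d_prime(0.84, 0.16)
at_chance = d_prime(0.5, 0.5)
```

A d′ of 0 means detection is at chance, and larger values mean targets are separated more reliably from non-targets.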
Thank you!
Questions?