human vision: what we know that ain’t so (and what to do about it) ronald a. rensink departments...
Post on 20-Dec-2015
212 views
TRANSCRIPT
Human Vision:
What We Know That Ain’t So
(and what to do about it)
Ronald A. Rensink
Departments of Computer Science and Psychology
University of British Columbia
Vancouver, Canada
2CPSC 444, 15 Apr 10
It isn't what we don't know that gives us trouble,it's what we know that ain't so…
—Will Rogers
3CPSC 444, 15 Apr 10
To design effective interfaces, we need to know how human perception operates
Belief: If we pay enough attention, we see everything around us
What do we know about how we see?
4CPSC 444, 15 Apr 10
Count the number of times the ball is passed between the people in the white shirts…
QuickTime™ and aCinepak decompressor
are needed to see this picture.
Test (Simons & Chabris, 2000):
5CPSC 444, 15 Apr 10
46% of observers did not see the person in the gorilla suit
Similar effects have been found in many situations in everyday life (“looked but failed to see”)
6CPSC 444, 15 Apr 10
Overview
Visual attention lets us see everything around us
1. Preattentive Systems (= before attention operates)• act quickly, but don’t do much
2. Attentional Systems (= conscious perception)• build up a complete description of the scene
3. Nonattentional Systems (= unconscious perception)• really don’t do much
7CPSC 444, 15 Apr 10
1. Preattentive Systems (= before attention operates)• act quickly, and are quite intelligent
2. Attentional Systems (= conscious perception)• dynamic, “just in time” representation of scenes
3. Nonattentional Systems (= unconscious perception)• might play very important roles in how we see
We can take advantage of these to create new, more effective kinds of visual displays…
Overview
8CPSC 444, 15 Apr 10
1. Preattentive Systems
Initial stage of visual processing
Light rays from environment into pixels
12CPSC 444, 15 Apr 10
Orientation receptors
OrientationChannels
0° (vertical)
Similar channels for 30° 60° 90° (horizontal)
14CPSC 444, 15 Apr 10
Preattentive Visual Properties
Automatic — no attention needed- no conscious effort needed
Computed rapidly - about 1/5 of a second
Computed in parallel over entire visual field- at each point, a limited extent in space (c. 2°)
Measurements (templates) such as• color• spatial frequency / width• orientation
15CPSC 444, 15 Apr 10
Why should we care?
Provides the scientific basis for the design of rapidly-perceived displays.
“An understanding of what is processed preattentively is probably the most important contribution that vision science can make to data visualization”
(Ware, 2000)
16CPSC 444, 15 Apr 10
Find a blue dot in a display
Unique color is easy to spot when few other items around
18CPSC 444, 15 Apr 10
For color, a unique value is always easy to notice
Reaction time does not depend on number of items “pop-out”
Reactiontime (ms)
Number of items (set size)
300
400
500
600
19CPSC 444, 15 Apr 10
Similarly with orientation…
Unique orientation is easy to spot when few items around
21CPSC 444, 15 Apr 10
For orientation, a unique value is always easy to notice
A unique visual primitive will “pop out” of a display
-draws attention to itself
Reactiontime (ms)
Number of items (set size)
300
400
500
600
22CPSC 444, 15 Apr 10
Unique L-shaped item is not always easy to spot
Find the L-shaped figure
But an item does not necessarily pop out if it is unique
24CPSC 444, 15 Apr 10
For L-shapes, reaction time depends on number of items
Reaction time depends linearly on number of items
Reactiontime (ms)
Number of items (set size)
300
400
500
600
search slope(30-100 ms/item)
25CPSC 444, 15 Apr 10
Interpretation (Treisman & Gormican, 1988):
Rapid search:- Visual primitives calculated in absence of attention- Unique primitive draws attention
Slow search:- Combination of pre-attentive features requires a spotlight of attention, which “welds” visual primitives at a rate of c. 30 ms/item
26CPSC 444, 15 Apr 10
Texture Segmentation
Segmentation via differences in color:
Distinct segments each have different primitives
28CPSC 444, 15 Apr 10
No segmentation if primitives don’t differ
No differences in features.
(Only in their arrangement)
29CPSC 444, 15 Apr 10
Belief:
To achieve its great speed, preattentive visionsacrifices the complexity of the propertiesit determines.
What is Processed Preattentively?
Conventional View
Simple properties (e.g., color, 2D orientation) are preattentive
Complex properties (e.g., 3D orientation) aren’t- these require attention (need to arrange)
31CPSC 444, 15 Apr 10
Scene-based properties
Image-based (2D) properties are the samein both target and distractor- only differ in how the pixels are arranged
1. Convexity (Kleffner & Ramachandran, 1992)
If these are the only properties computedpreattentively, search should be slow
Target Distractor
33CPSC 444, 15 Apr 10
Search is fast -> unique preattentive feature
Feature: surface convexity (scene-based property) recovered rapidly at early levels
34CPSC 444, 15 Apr 10
Target Distractor
Image-based (2D) properties are the samein both target and distractor
2. 3D Orientation (Enns & Rensink 1990, 1991)
If these are the only properties computed at early levels, search should be slow
36CPSC 444, 15 Apr 10
Reactiontime (ms)
Set size
Targ Distr
slope = 6 ms/ item
Rapid search for cubes
Reactiontime (ms)
Set size
Targ Distr
slope = 35 ms/ item
Slow search for hexagons
Three-dimensional orientationcan be recovered rapidly
37CPSC 444, 15 Apr 10
Target and distractor have shadows withdifferent orientations…
If 2D (or 3D) orientation is accessible,search should always be rapid,no matter what.
Target Distractor
3. Identification of Shadows (Rensink & Cavanagh, 1993)
40CPSC 444, 15 Apr 10
For right-side up, search is slow
For upside-down, search is rapid
Reactiontime (ms)
Set size
Targ Distr
slope = 27 ms/ item
Reactiontime (ms)
Set size
Targ Distr
slope = 9 ms/ item
Shadows affect visual search- 2D orientation info available only for U/D
42CPSC 444, 15 Apr 10
Target and distractor have different pieces.
4. Completion of Occlusions (Rensink & Enns, 1998)
If these are the relevant properties at early levels,search should always be rapid,no matter how they are arranged.
Target Distractor
44CPSC 444, 15 Apr 10
Looking for TargTarg
Slower to detect—Bite has been “compensated for”- occluded structure has been rapidly completed
45CPSC 444, 15 Apr 10
Search is rapid - based on distinctive parts
Search is based on completed fragments only”pre-empts” the unique primitives
Search is slow - cannot access visible pieces
Reactiontime (ms)
Set size
Targ Distr
slope = 8 ms/ item
Reactiontime (ms)
Set size
Targ Distr
slope = 66 ms/ item
46CPSC 444, 15 Apr 10
Summary
Complex properties (e.g., 3D orientation)can be determined rapidly without attention
- these are calculated via simple rules that work most of the time (e.g., lighting from above)
Attention acts on proto-objects (Rensink & Enns, 1998): - “quick and dirty” estimates of scene properties (e.g., surface slant, true surface color)
47CPSC 444, 15 Apr 10
1. Displays need not be limited to simple 2D properties (e.g., color and orientation) to convey information.
More complex properties based on scene structure (e.g., 3D orientation, shadows) can also be used.
- will only work under the appropriate conditions - likely to be as effective as “old” dimensions
Implications for Display Design
2. “Primitive” properties won’t always work as expected.- will be unavailable if pre-empted by proto-objects
48CPSC 444, 15 Apr 10
What is the role of visual attention?
2. Attentional Systems
Conventional view:
Attention “welds” visual primitives into more complex structures (objects).
The accumulation of these structures isthe basis for visual perception (scenes)
Is this really true?
?
49CPSC 444, 15 Apr 10
Change Blindness
QuickTime™ and aQuickDraw decompressor
are needed to see this picture.
52CPSC 444, 15 Apr 10
Change blindness: the failure to see changes in an image, even when these are large, and repeatedly made
• image flicker
• eye movements (saccades)
• eyeblinks
• "splats" not on change
• movie cuts
• real-world interruptions
• gradual change
Proposal: All of these have the same cause
Can be induced by:
53CPSC 444, 15 Apr 10
Proposal: Attention is needed to perceive change in an object.
When change is made same time as another event, transients interfere with drawing of attention,causing change to become “invisible”.
Under normal circumstances, a change creates a motion transient, which draws attention.
54CPSC 444, 15 Apr 10
Coherence theory:
1. Attention latches onto proto-objects- forms an object complex coherent in space and time- supports perception of change
2. Object complex exists as long as item is attended
3. Once attention is released, object “dissolves” back into proto-objects
- no buildup of coherent information after attention is withdrawn
- some memory of scene remains, but not of object complex
55CPSC 444, 15 Apr 10
Focused attention
Proto- objects
Focused attention
Proto- objects
Focused attention
Proto- objects
56CPSC 444, 15 Apr 10
How Much Can We Attend?
Use images that change back and forth in time,like the scene examples
Approach: Visual Search for Change
But images that are much simpler in content - can control the number of items, the type of change, etc.
57CPSC 444, 15 Apr 10
- on half the trials, one of the items changes (target) - observer must report if change present or absent
display 2
Focused attentionholds onto item,allowing changeto be seen
display 1
blank
blank
Displaysalternateuntil observerresponds
Rensink (2000)
58CPSC 444, 15 Apr 10
Results
Attention has a capacity of 3-4 items
- similar to other estimates of attentional capacity
- demonstrates that visual detail is not built up- otherwise, capacity estimate would be unlimited
59CPSC 444, 15 Apr 10
But...
If we can’t attend to much (3-4 items), we have a coherent representation of only a few objects at any given time (= objects that are attended)
If only a few objects are represented at a time, why do we feel we see all objects at once?
60CPSC 444, 15 Apr 10
• Although objects appear to be present simultaneously, do not all need to be represented simultaneously
• All that is needed is that the properties of the objects can be accessed when requested.
Observation:
Proposal: Virtual Representation (Rensink, 2000a)
61CPSC 444, 15 Apr 10
Real station:
1-2 sites at a time
⇒ Ifdataalreadypresent,use it.⇒ Else locate appropriate machine,andloadinthedata.
web.mit .edu/ bcs
cvs. rochester.edu
yorku.ca
vision.arc .nasa.gov
mpik-tueb.mpg.de
Millions of web sites,eachwith lotsofdata
hyperion.com
World Wide Web
psy. jhu.edu
Websurfer" sees" : 1 ) cvs.rochester.edu 2 ) vision.arc.nasa.gov 3 ) yorku.ca 4 )…
Virtual station: Millions of sites
62CPSC 444, 15 Apr 10
To build a coherent representation of an object:
Focus eyes and attention on appropriate location whenever that object is needed
Representation “dissolves” once it is no longer needed
Can this work for the visual system? Yes!!
Can always obtain information from the world
Use the world itself as an external memory(Stroud, 1955; Brooks, 1991)
63CPSC 444, 15 Apr 10
Real station:
1-2 sites at a time
⇒ Ifdataalreadypresent,use it.⇒ Else locate appropriate machine,andloadinthedata.
web.mit .edu/ bcs
cvs. rochester.edu
yorku.ca
vision.arc .nasa.gov
mpik-tueb.mpg.de
Millions of web sites,eachwith lotsofdata
hyperion.com
World Wide Web
psy. jhu.edu
Websurfer" sees" : 1 ) cvs.rochester.edu 2 ) vision.arc.nasa.gov 3 ) yorku.ca 4 )…
Virtual station: Millions of sites
Real representation:
1-2 objects at a time
⇒ If object alreadyattended, use it.⇒ Else locate appropriate proto- object, and make it coherent.
left screen
podium
stage
right screen
speaker (R.R.)
Millions of objects,eachwith lotsofdata
World
noisy person
ceiling
Visual system " sees" : 1 ) speaker (R.R.) 2 ) left screen 3 ) right screen 4 )…
Virtual representation: Millions of objects
64CPSC 444, 15 Apr 10
Note:
Using the world as an external memory means that perception is not carried out in isolation
- perceiver and environment form a partnership
- perception is inherently dynamic and interactive
65CPSC 444, 15 Apr 10
Need something compatible with what is known about human vision…
How might a virtual representation be implemented?
66CPSC 444, 15 Apr 10
Nonattentional extraction of aspects of scene (I):
1. Gist: abstract meaning of scene (farm, harbor, etc.)- obtained within 150 ms (Biederman, 1981)- obtained without attention (Oliva & Schyns, 1997)
Proposal : Triadic Architecture (Rensink, 2000a)
Possibly derived via statistics of low-level structures(e.g. Swain & Ballard, 1991)
67CPSC 444, 15 Apr 10
Nonattentional extraction of aspects of scene (II):
2. Layout: arrangement of items in the scene.
- nonvolatile - held without attention (Simons, 1996)- can be learned without attention (Chun & Jiang, 1998)
68CPSC 444, 15 Apr 10
Complex
Gist
Elements
Layout
Early (nonattentional)
Setting (nonattentional)
HIGHER LEVELS
Object (attentional)
Complex
Gist
Elements
Layout
Early (nonattentional)
Setting (nonattentional)
HIGHER LEVELS
Object (attentional)
High level control
Control of attention—and thus, what is seen—can be guided by high-level knowledge (long-term memory) and mental set (task)
69CPSC 444, 15 Apr 10
Complex
Gist
Elements
Layout
Early (nonattentional)
Setting (nonattentional)
HIGHER LEVELS
Object (attentional)
Complex
Gist
Elements
Layout
Early (nonattentional)
Setting (nonattentional)
HIGHER LEVELS
Low level control
Object (attentional)
Control of attention —and thus, what is seen—also guided by the contents of low-level visual systems. Key factor is salience.
70CPSC 444, 15 Apr 10
Our impression that many coherent objects are representedsimultaneously is an illusion.
– scenes are represented via a virtual representation– dynamic, “just in time” system
Attention is not necessarily a perceptual “gateway”– it’s only the stream specialized for coherent objects– other (nonattentional) streams help guide it
Summary
71CPSC 444, 15 Apr 10
1. Pickup of Information
Implications for Display Design
Only a limited amount of information can be attended (= available to conscious perception) at any time
- 3-4 items- only a small amount of info from each item
No attentional distraction from other parts of display- could induce blindness
Only one dynamic information source at a time- can’t separate two simultaneous changes
72CPSC 444, 15 Apr 10
What is attended (= what is seen) depends on the viewer & the task
- different people can literally see the same world differently
Can use flicker paradigm to measure which parts and properties of objects are attended first. (= most easily seen to change)
2. Perception of Visual Displays
73CPSC 444, 15 Apr 10
Activities such as navigating through information space can be highly natural, if designed correctly
Mechanisms already exist to couple ourperceptual systems to environment. - “natural-born cyborgs” (Clark, 2003)
> Can use these mechanisms as the basis of effective human-machine interactions
3. Interaction with Visual Displays
74CPSC 444, 15 Apr 10
4. Coercive Graphics
Display could coerce user to attend to a given location- effectively “takes over” virtual representation can make viewer see (or not see) given items
This is how magicians control what an audience “sees”
Creation of “magical” displays?- transitions at other locations made “invisible”? - perception of events that could not occur in the real world.
- “unreal” (dis)appearances- “unreal” change to objects, regions
75CPSC 444, 15 Apr 10
3. Nonattentional Systems
Gist
Incoming Light
1. Early Vision(Proto-objects)
2. Attentional Stream
3. Nonattentional Streams
Higher Levels
Layout?
Triadic architecture implies an important
role for nonattentional streams in vision
These streams are not concerned with
explicit (= conscious) perception
Rather, they involve implicit (= unconscious) perception
76CPSC 444, 15 Apr 10
3.1 Some early results…
Bridgeman et al. (1975) — eye movements• target moves while observer moves eyes to it
• eye makes corrective movement, even though observers have no conscious perception of change
Goodale et al. (1986) — manual pointing• target moves while observer saccades to it
• hand corrects its trajectory while reaching to target, even though observers have no conscious perception of
change
77CPSC 444, 15 Apr 10
Two Visual Systems Milner & Goodale (1995)
Eye
"What" system
- requires attention- relatively slow (c. 300 ms)- conscious "picture" of world- basis for rational decisions
"How" system- may not require attention- quite fast- visuomotor control- emotions...
Distinct visual systems- supported by two separate neural pathways
The “how” system is effectively an “inner zombie”
79CPSC 444, 15 Apr 10
Fernandez-Duque & Thornton (2000)
- observers view 2-display sequence; each display is a simple array of rectangles
- observers tested on two items: - the item changed, and the item diagonally across from it
If observer did not notice change, asked to guess which item changed.
time
?
3.2 Nonconscious Detection of Change
80CPSC 444, 15 Apr 10
Results
Observers could guess better than chance (55-65%)even though change was not consciously noticed(!)
– attention not involved
81CPSC 444, 15 Apr 10
Reports by some observers that they “knew” a change was occurring before they saw (= visually experienced) it.
- not: a “picture” that could be seen
- but: a “gut feeling” of a change happening
3.3 Mindsight
82CPSC 444, 15 Apr 10
.
t1: respond when change is felt
t2: respond when change is seen
Experiment (Rensink,2004)
Observers view flicker sequence (natural images)• asked to hit button (t1) when change was felt• then hit button (t2) when change was seen
83CPSC 444, 15 Apr 10
Results
1/2 of observers had no feeling of change without a visual experience of it
1/3 of observers could feel a change before seeing it, at least on some trials
• average duration = 3.7 seconds• not a result of guessing.
• good performance on absent trials• accuracy = 82%; d’ ≥ 2.1
84CPSC 444, 15 Apr 10
• positive — not due to a problem with attention• “seeing” responses equally fast for both groups
• informative — not a result of guessing• can distinguish change-present from change-absent; d’ ≥ 2.1
For these observers, this effect was…
• sophisticated — not due to simple motion signals• inserting a bright yellow flash in every other blank had no effect on
most measures of sensing• did disrupt seeing
85CPSC 444, 15 Apr 10
Displays might influence other aspects of user’s experience besides conscious “image” of its contents
could guide user actions (e.g. control of mouse)
Implications for Display Design
1. Visuomotor Actions
might induce user to automatically “do the right thing” (no conscious noticing of this)
- displays designed for “zombie system”: easy to use for a given motor activity
86CPSC 444, 15 Apr 10
- feeling that something is occurring, without an accompanying visual experience
2. Displays for “Sixth Sense” Experience
use as a form of “soft warning” (?)
87CPSC 444, 15 Apr 10
Summary
• dynamic, “just in time” representation of scenes• just one of several concurrent systems
• may play several important roles in how we see, and in how we interact with the world
1. Preattentive Systems (= before attention operates)• act quickly, but don’t do much
2. Attentional Systems (= conscious perception)• build up a complete description of the scene• central - everything perceived must engage it
3. Nonattentional Systems (= unconscious perception)• don’t really do much
• act quickly, and can be quite intelligent