Real-Time Graphics Architecture
TRANSCRIPT
4/4/2007
Real-Time Graphics Architecture
Kurt Akeley
Pat Hanrahan
http://graphics.stanford.edu/cs448-07-spring/
About Kurt
Personal history
BEE University of Delaware, 1976-1980
MSEE Stanford, 1980-1982
SGI co-founder, chief engineer, CTO, 1982–2000
PhD Stanford, 2001-2004
Asst. Director, Microsoft Research Asia, 2005-2007
Lots of SIGGRAPH involvement
CS448 Lecture 1 Kurt Akeley, Pat Hanrahan, Spring 2007
Currently
Principal Researcher, MSR Silicon Valley
Computer architecture research with Chuck Thacker …
Other notes
OpenGL
Lots of history with this
Good framework for understanding
Dissertation
Achieving Near-correct Focus Cues Using Multiple Image Planes
Display + Viewer
Outline
Introductions (done)
Evolution of Graphics Systems (Kurt)
Future Evolution (Pat)
Lecture Schedule (Pat)
Brief Introduction to Perception (Kurt)
Course Logistics (Pat)
Evolution of real-time graphics
Don’t have a genealogy chart
This would be a great project
Some important phases:
Early research
Flight simulation
GL-like: Terminal SGI PC
Game consoles
We’ll focus on GL-like systems
Attempt to credit research and simulation results
Axes of improvement
Performance
Triangles / second
Pixel fragments / second
Features
Hidden-surface elimination
Image mapping
Antialiasing
Quality
Numeric representation (e.g., floating point)
Image filters (e.g., linear, cubic, …)
Relationships of axes
Ideal relationship:
Performance is inversely proportional to the “work required” to implement the features and quality
Mode changes lead to proportional performance changes
(Naïve) software implementations approach this ideal
Hardware behavior differs substantially from this ideal. Often
Performance is invariant to complexity (“free”), OR
Performance falls off catastrophically
Free features
If use of feature X is free, then rendering without using feature X is too slow!
Pipelining leads to free features
Traditional parallelism typically doesn’t
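A toy throughput model can illustrate why pipelining tends to make features free. This is only a sketch: the stage names and per-stage costs below are hypothetical, not measurements from any system. A pipeline runs at the rate of its slowest stage, so added work in a non-bottleneck stage leaves throughput unchanged, while a serial (naive software) implementation pays for every added unit of work.

```python
def pipelined_rate(stage_costs):
    """Items per unit time when stages overlap: limited by the slowest stage."""
    return 1.0 / max(stage_costs)


def serial_rate(stage_costs):
    """Items per unit time when each item passes through all stages in turn."""
    return 1.0 / sum(stage_costs)


# Hypothetical per-item costs (arbitrary time units) for three stages.
base = {"vertex": 2.0, "raster": 4.0, "fragment": 1.0}
# Enabling a hypothetical feature triples the fragment cost; raster stays the bottleneck.
with_feature = dict(base, fragment=3.0)

pipe_before = pipelined_rate(base.values())
pipe_after = pipelined_rate(with_feature.values())
serial_before = serial_rate(base.values())
serial_after = serial_rate(with_feature.values())
```

In this model the feature only stops being free once the fragment cost exceeds the raster cost, at which point the fragment stage becomes the new bottleneck.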
Generations of GL-like systems
Generations are defined in terms of feature sets
The features that are included in the performance plateau determine the system’s generation:
[Figure: performance versus features/quality, with Gen1, Gen2, Gen3, and Gen4 marked along the features/quality axis. The curve shown is labeled "Second Generation System".]
First generation - wireframe
Vertex: transform, clip, and project
Rasterization: color interpolation (points, lines)
Fragment: overwrite
Dates: prior to 1987
Second generation – shaded solids
Vertex: lighting calculation
Rasterization: depth interpolation (triangles)
Fragment: depth buffer, color blending
Dates: 1987 - 1992
Third generation – texture mapping
Vertex: texture coordinate transformation
Rasterization: texture coordinate interpolation
Fragment: texture evaluation, antialiasing
Dates: 1992 - 2000
SGI historicals
First-generation metrics
Year  Product          Tri rate  CAGR  Fill rate  CAGR  Gen
1984  Iris 2000        10k       -     46m        -     1st
1988  GTX              135k      1.9   80m        1.2   2nd
1992  RealityEngine    2m        2.0   380m       1.5   3rd
1996  InfiniteReality  12m       1.6   1000m      1.3   3rd
      Overall                    1.8              1.3
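The CAGR columns read as compound annual growth factors, which is an assumption here but one that matches the table entries. A minimal sketch of that arithmetic, using the triangle rates above:

```python
def cagr(start, end, years):
    """Compound annual growth factor: the per-year multiplier from start to end."""
    return (end / start) ** (1.0 / years)


# Triangle rates from the table: Iris 2000 (1984) to GTX (1988).
tri_1984_1988 = cagr(10e3, 135e3, 4)
# Iris 2000 (1984) to InfiniteReality (1996).
tri_overall = cagr(10e3, 12e6, 12)
```

A factor of 1.8 per year sustained over twelve years corresponds to the 1200x increase from 10k to 12m triangles per second.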
SGI historicals (depth buffered)
Second-generation metrics
Year  Product          ZTri rate  CAGR  Zbuf rate  CAGR  Gen
1984  Iris 2000        1k         -     100k       -     1st
1988  GTX              135k       3.6   40m        4.5   2nd
1992  RealityEngine    2m         2.0   380m       1.8   3rd
1996  InfiniteReality  12m        1.6   1000m      1.3   3rd
      Overall                     2.2              2.2
Yearly Growth well above 1.5 (Moore’s Law)
NVIDIA graphics growth (225%/yr)
Season  Product   Process  # Trans  Gflops  32-bit AA Fill  Mpolys   Notes
2H97    Riva 128  .35      3M       5       20M             3M       Integrated 2D/3D
1H98    Riva ZX   .25      5M       7       31M             3M       AGP2x
2H98    Riva TNT  .25      7M       10      50M             6M       32-bit
1H99    TNT2      .22      9M       15      75M             9M       AGP4x
2H99    GeForce   .22      23M      25      120M            15M      HW T&L
1H00    GeForce2  .18      25M      35      200M (1)        25M      Per-Pixel Shading
2H00    NV16      .18      25M      45      250M (1)        31M      230 MHz DDR
1H01    NV20      .15      55M      80      500M (1)        30M (2)  Programmable
(1) Dual textured
(2) Programmable
Essentially Moore’s Law Cubed.
NVIDIA historicals
Third-generation metrics
Year  Product             Tri rate  CAGR  Tex rate  CAGR
1998  Riva ZX             3m        -     100m      -
1999  Riva TNT2           9m        3.0   350m      3.5
2000  GeForce2 GTS        25m       2.8   664m      1.9
2001  GeForce3            30m       1.2   800m      1.2
2002  GeForce Ti 4600     60m       2.0   1200m     1.5
2003  GeForce FX          167m      2.8   2000m     1.7
2004  GeForce 6800 Ultra  170m      1.0   6800m     2.7
2005  GeForce 7800 GTX    215m      1.2   6800m     1.0
2006  GeForce 7900 GTX    260m      1.3
      Overall                       1.7             1.8
Yearly Growth well above 1.5 (Moore’s Law)
Fourth generation - programmability
Programmable shading
Image-based rendering
Convergence of graphics and media processing
Curved surfaces
Fifth generation – global evaluation
Ray tracing: visibility and integration
True shadows, path tracing, photon mapping
From batch to interactive
Courtesy Frank Crow, Interval
[Figure: log time (0.01 s, 1.0 s, 100 s, 10^4 s, 10^6 s, with 1 min., 1 hr., 1 day, 1 week, and 1 mo. marked) versus log performance (10 mips, 100 mips, 1 gips, 10 gips, 100 gips). Regions: Immersive, Interactive, Practical, Possible, Fanatical. Example workloads: Stemware 100 MI’s, Kitchen Table 10 GI’s, Teddy Bear 250 GI’s.]
Perception
Interactive graphics is (typically) for human viewers
Guided-missile design is a counterexample
Good designers know their customers’ needs and problems
Have basic understanding of visual perception
NTSC is a great engineering design example
References
Foundations of Vision, Brian Wandell
A Technical Introduction to Digital Video, Charles Poynton
Perception topics
Intensity
Motion
Latency
Color
Resolution
….
Human perception is non-linear
Perceived loudness:
$l \propto i^{0.3}$, $2.00 \approx 10^{0.3}$
Perceived brightness:
$b \propto i^{0.4}$, $2.51 \approx 10^{0.4}$
Monitor response is non-linear too
Perceived brightness:
$b \propto i^{0.4}$
Monitor (CRT) response to voltage:
$i \propto v^{\gamma}$, $\gamma = 2.4$
Aggregate response is nearly linear:
$b \propto (v^{2.4})^{0.4} = v^{0.96} \approx v$
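The composition of the two power laws can be checked numerically. This is a sketch that assumes voltage, intensity, and brightness are all normalized to the range 0 to 1; the function names are illustrative only.

```python
GAMMA = 2.4                 # CRT exponent from the slide
PERCEPTION_EXPONENT = 0.4   # perceived-brightness exponent from the slide


def crt_intensity(v):
    """Normalized light intensity produced by a normalized drive voltage."""
    return v ** GAMMA


def perceived_brightness(i):
    """Normalized perceived brightness for a normalized intensity."""
    return i ** PERCEPTION_EXPONENT


def aggregate(v):
    """Voltage to perceived brightness: equivalent to v ** 0.96."""
    return perceived_brightness(crt_intensity(v))


half_voltage_brightness = aggregate(0.5)
```

At half voltage the perceived brightness comes out a little above one half, which is what "nearly linear" means here.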
Graphs
Foundations of Vision, Wandell, p. 416
Contrast ratio
Human (static) contrast ratio is roughly 1%
DAC bits  Delta i per step  Delta i total
8         1.0 %             12.6
8         1.8 %             100
10        0.7 %             1000
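The totals in the table appear to follow from compounding the per-step ratio over every DAC step. The sketch below assumes each code step multiplies intensity by (1 + step) and that an n-bit DAC has 2^n - 1 steps. It reproduces the first row closely, while the slide's other two totals look like rounded figures.

```python
def total_range(bits, step_fraction):
    """Ratio of brightest to dimmest level when each step scales by (1 + step_fraction)."""
    steps = 2 ** bits - 1
    return (1.0 + step_fraction) ** steps


range_8bit_1pct = total_range(8, 0.010)
range_8bit_1p8pct = total_range(8, 0.018)
range_10bit_0p7pct = total_range(10, 0.007)
```

Under this model, 8 bits at the roughly 1% step the eye can distinguish span only about a 12.6:1 range, which is why the later rows trade step size or bit count for range.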
Gamma correction
Intensities can be added, brightnesses cannot
Store image linear in brightness (unusual in 3-D systems)
Best use of available storage precision
256 representable levels are enough
Requires conversion for each pixel operation
[Diagram: gamma converter → 8-bit framebuffer → DAC → display; link widths n, 8, 8]
An alternative approach
Store image linear in intensity (typical in 3-D systems)
Native arithmetic format
Requires conversion during display
Large brightness steps at low intensities
256 DAC levels is OK, but frame buffer needs more
[Diagram: 12-bit framebuffer → gamma converter → DAC → display; link widths n, n, 8]
What is n ?
Assume
8-bit DAC
Gamma of 2.4
$\text{output} = 255 \left( \frac{\text{input}}{2^{n} - 1} \right)^{0.4166}$, where $0.4166 \approx 1/2.4$

Table input  Output n=8  Output n=10  Output n=12  Output n=14  Output n=16
2**n-1       255         255          255          255          255
2**n-2       254         255          255          255          255
...
3            40          22           13           7            4
2            34          19           11           6            3
1            25          14           8            4            2
0            0           0            0            0            0
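The table entries can be approximated by evaluating the formula directly. This is a sketch that assumes round-to-nearest conversion to the 8-bit DAC code; the slide does not state its rounding rule, so individual entries may differ from the table by one count.

```python
def dac_code(value, n, gamma=2.4, dac_bits=8):
    """Map an n-bit linear-intensity value to a gamma-corrected DAC code."""
    dac_max = 2 ** dac_bits - 1
    normalized = value / (2 ** n - 1)
    return round(dac_max * normalized ** (1.0 / gamma))


def low_end_codes(n, count=4):
    """DAC codes for the lowest `count` framebuffer values (0, 1, 2, ...)."""
    return [dac_code(v, n) for v in range(count)]


codes_n8 = low_end_codes(8)
codes_n10 = low_end_codes(10)
```

With n=8 the first nonzero framebuffer value already lands on DAC code 25, so the DAC codes in between are unreachable. Raising n shrinks that first step, which is the sense in which the frame buffer needs more than 8 bits.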
Demo
Intensity and motion
Motion
Eye is sensitive to motion and change (only)
Rule 1: Avoid substantial frame-to-frame changes
Animation
No flicker detection above 80Hz or so
Sequence of frames is interpreted as continuous
Corollary to rule 1: Evaluate sequences of images
Eye/brain combination tracks motion
Image doubling if render and display rates differ
Interlace artifacts if render and field rates differ
Separation if colors are displayed sequentially
Latency
Latency is a critical system issue
GPU is just a link in the latency chain
Latency budget is sum of all delays
Human latency thresholds
Hand-eye (fixed-position display) is ~100ms
Head-eye (head-mounted display) is ~10ms
Matthew Regan and Ronald Pose, An Interactive Graphics Display Architecture, Proceedings of IEEE Virtual Reality Annual International Symposium, 18-22 September 1993, Seattle USA
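A latency budget can be tallied by summing the delay of every link in the chain and comparing the total with the thresholds above. In this sketch the link names and millisecond values are hypothetical, chosen only to illustrate the bookkeeping.

```python
HAND_EYE_THRESHOLD_MS = 100.0   # fixed-position display, from the slide
HEAD_EYE_THRESHOLD_MS = 10.0    # head-mounted display, from the slide


def latency_budget_ms(delays_ms):
    """Total latency is the sum of the delays of every link in the chain."""
    return sum(delays_ms.values())


# Hypothetical chain: the GPU is only one of several contributors.
example_chain = {
    "input sampling": 8.0,
    "application": 16.7,
    "GPU render": 16.7,
    "display scanout": 16.7,
}

total_ms = latency_budget_ms(example_chain)
meets_hand_eye = total_ms <= HAND_EYE_THRESHOLD_MS
meets_head_eye = total_ms <= HEAD_EYE_THRESHOLD_MS
```

With these made-up numbers the chain fits the hand-eye threshold but misses the head-eye threshold by a wide margin, even though no single link looks slow on its own.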
Color
Three cone types (S, M, L)
Color can be represented as a 3-tuple
RGB is convenient for display (L,M,S)
Other tuples for other purposes
Cone densities differ
S (blue) cones low density
L,M (red, green) cones higher density
Color arithmetic
Independent R, G, and B calculations are wrong
3x3 matrix arithmetic is a better approximation
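One way to see the difference between the two kinds of arithmetic: independent per-channel scaling is the special case of a 3x3 matrix whose off-diagonal entries are zero, while a full matrix lets each output channel depend on all three inputs. The matrix values in this sketch are hypothetical and are not taken from any color standard.

```python
def apply_matrix(m, rgb):
    """Multiply a 3x3 matrix (list of rows) by an RGB 3-tuple."""
    return tuple(sum(m[r][c] * rgb[c] for c in range(3)) for r in range(3))


def per_channel(scale, rgb):
    """Independent R, G, and B scaling."""
    return tuple(s * x for s, x in zip(scale, rgb))


color = (1.0, 0.5, 0.25)

# A diagonal matrix reproduces independent per-channel arithmetic.
diagonal = [[0.5, 0.0, 0.0],
            [0.0, 0.8, 0.0],
            [0.0, 0.0, 1.0]]

# Hypothetical off-diagonal terms couple the channels.
coupled = [[0.50, 0.10, 0.00],
           [0.05, 0.80, 0.05],
           [0.00, 0.10, 1.00]]

independent_result = per_channel((0.5, 0.8, 1.0), color)
diagonal_result = apply_matrix(diagonal, color)
coupled_result = apply_matrix(coupled, color)
```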
Resolution
Eye’s resolution is not evenly distributed
Foveal resolution is ~20x peripheral
Can track direction of view
Flicker sensitivity higher in periphery
One eye can compensate for the other
Research at NASA suggests high-resolution dominant display
Human visual system is well engineered …
Imperfect optics - linespread
Ideal Actual
Linespread function
Modulation transfer function (MTF)
Optical “imperfections” improve acuity
Image is pre-filtered
Cutoff is approximately 60 cpd
Cutoff is gradual – no ringing
Aliasing is avoided
Foveal cone density is 120 / degree
Cutoff matches retinal Nyquist limit
Vernier acuity is improved
Foveal resolution is 30 arcsec
Vernier acuity is 5-10 arcsec
Linespread blur includes more sensors
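The numbers on this slide are tied together by the sampling theorem: a sensor array with N samples per degree can represent at most N/2 cycles per degree, and its sample spacing is 3600/N arcseconds. A small sketch of that arithmetic, with illustrative function names:

```python
ARCSEC_PER_DEGREE = 3600


def nyquist_cpd(samples_per_degree):
    """Highest representable spatial frequency, in cycles per degree."""
    return samples_per_degree / 2.0


def sample_spacing_arcsec(samples_per_degree):
    """Angular spacing between adjacent samples, in arcseconds."""
    return ARCSEC_PER_DEGREE / samples_per_degree


foveal_nyquist = nyquist_cpd(120)            # cone density from the slide
foveal_spacing = sample_spacing_arcsec(120)
```

The results, 60 cpd and 30 arcsec, match the optical cutoff and the foveal resolution quoted above.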
Reading assignment
Before Thursday’s class, read
Mark Segal and Kurt Akeley, The Design of the OpenGL Graphics Interface, unpublished
David Blythe, The Direct3D 10 System, SIGGRAPH 2006
Also become familiar with www.opengl.org:
GL, GLX, and GLU Specifications
Extension specifications
…
Optional:
Regan and Pose paper
Real-Time Graphics Architecture
Kurt Akeley
Pat Hanrahan
http://graphics.stanford.edu/cs448-07-spring/