May 8, 2007, Farid Harhad and Alaa Shams, CS7080
Overview of the GPU Architecture
CS7080 Final Class Project
Supervised by: Dr. Elias Khalaf
By: Farid Harhad & Alaa Shams
Outline
• Introduction
• GPU Architecture
• GPU programming
– GPU programming model
– Toolkits and languages
• Sample Code
• Conclusion
Introduction
• The GPU on commodity video cards has evolved into an extremely flexible and powerful processor.
• GPUs are fast:
– 3.0 GHz Pentium 4: 6 GFLOPs, 6 GB/s peak
– 3.0 GHz dual-core Pentium 4: 24.6 GFLOPs
– GeForce 6800: 53 GFLOPs, 34 GB/s peak
– GeForce 7800: 165 GFLOPs
– 1066 MHz FSB Pentium Extreme Edition: 8.5 GB/s
– ATI Radeon X850 XT Platinum Edition: 37.8 GB/s
• GPUs are getting faster and faster:
– CPUs: ~1.5x annual growth, ~60x per decade
– GPUs: ~2.3x annual growth, ~1000x per decade
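The per-decade figures come from compounding an annual rate over ten years. A quick check, as a sketch only (the helper name `decade_growth` is ours, not from the slides):

```python
def decade_growth(annual_rate: float, years: int = 10) -> float:
    """Total speedup after compounding `annual_rate` for `years` years."""
    return annual_rate ** years

cpu = decade_growth(1.5)       # about 58x, in line with the ~60x quoted for CPUs
gpu = decade_growth(2.0)       # 1024x: ~1000x per decade corresponds to ~2x per year
gpu_high = decade_growth(2.3)  # about 4100x if 2.3x were sustained for a full decade
```

Read this way, the ~1000x-per-decade figure matches an annual rate near 2x; a rate of 2.3x sustained for ten years would compound to roughly four times more.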
Computational power
Why are GPUs getting faster so fast?
• Arithmetic intensity: the specialized nature of GPUs makes it easier to spend additional transistors on computation rather than cache
• Economics: the multi-billion-dollar video game market is a pressure cooker that drives innovation
Flexible and Precise
• Modern GPUs are deeply programmable
– Programmable pixel, vertex, and video engines
– Solidifying high-level language support
• Modern GPUs support high precision
– 32-bit floating point throughout the pipeline
– High enough for many (not all) applications
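To illustrate why 32-bit floating point is enough for many but not all applications, the sketch below rounds values to single precision with Python's `struct` module (the helper `to_float32` is a hypothetical name). A 32-bit float carries a 24-bit significand, so not every integer above 2^24 is representable.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

exact = to_float32(16777216.0)    # 2**24 is representable
rounded = to_float32(16777217.0)  # 2**24 + 1 is not; it rounds back to 2**24

# A running sum kept in float32 stops growing once the increment
# falls below the spacing between adjacent representable values.
total = to_float32(16777216.0)
for _ in range(100):
    total = to_float32(total + 1.0)
```

Long accumulations and large counters are typical cases where this limit matters.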
The Potential of GPUs
• In short:
– The power and flexibility of GPUs make them an attractive platform for general-purpose computation
– Example applications range from in-game physics simulation to conventional computational science
– Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor
The Problem: Difficult To Use
• GPUs are designed for and driven by video games
– Programming model is unusual
– Programming idioms are tied to computer graphics
– Programming environment is tightly constrained
• Underlying architectures are:
– Inherently parallel
– Rapidly evolving (even in basic feature set!)
– Largely secret
• Can’t simply “port” CPU code!
GPU Architecture: Graphics Pipeline
Modern Graphics Pipeline
Transform
• Vertex processors (multiple in parallel)
– Transform from “world space” to “image space”
– Compute per-vertex lighting
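As a rough illustration of the transform stage, the sketch below carries one vertex from world space to pixel coordinates: multiply by a 4x4 matrix, divide by w, then map to the viewport. The function names and the simple projection matrix are assumptions chosen for illustration; they are not taken from any particular graphics API.

```python
def mat_vec4(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def to_image_space(world_pos, mvp, width, height):
    """Transform a world-space point to pixel coordinates."""
    x, y, z, w = mat_vec4(mvp, [*world_pos, 1.0])  # clip space
    ndc_x, ndc_y = x / w, y / w                     # perspective divide -> [-1, 1]
    px = (ndc_x + 1.0) * 0.5 * width                # viewport mapping
    py = (1.0 - ndc_y) * 0.5 * height               # flip y: row 0 is the top
    return px, py

# Hypothetical projection: camera at the origin looking down -z, so w = -z.
simple_projection = [
    [1.0, 0.0,  0.0, 0.0],
    [0.0, 1.0,  0.0, 0.0],
    [0.0, 0.0,  1.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
]

center = to_image_space((0.0, 0.0, -2.0), simple_projection, 640, 480)
corner = to_image_space((2.0, 2.0, -2.0), simple_projection, 640, 480)
```

On the GPU, the vertex processors run this same computation for many vertices in parallel.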
Rasterizer
• Convert geometric representation (vertices) to image representation (fragments)
– Fragment = image fragment: a pixel plus associated data (color, depth, stencil, etc.)
• Interpolate per-vertex quantities across pixels
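A minimal sketch of what the rasterizer does, assuming a triangle already expressed in pixel coordinates: test each pixel center against the triangle and, for covered pixels, interpolate a per-vertex quantity using barycentric weights. Real hardware uses more elaborate coverage rules, and the names here are illustrative.

```python
def edge(a, b, p):
    """Signed area term: which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, attrs, width, height):
    """Yield (x, y, value) fragments; attrs holds one value per vertex."""
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0 = edge(v1, v2, p) / area  # barycentric weight of v0
            w1 = edge(v2, v0, p) / area  # barycentric weight of v1
            w2 = edge(v0, v1, p) / area  # barycentric weight of v2
            if w0 >= 0 and w1 >= 0 and w2 >= 0:  # inside or on an edge
                yield x, y, w0 * attrs[0] + w1 * attrs[1] + w2 * attrs[2]

# Right triangle on a 4x4 grid; the attribute is 1.0 at v1 and 0.0 elsewhere.
fragments = list(rasterize((0, 0), (4, 0), (0, 4), (0.0, 1.0, 0.0), 4, 4))
```

Each yielded tuple plays the role of a fragment: a pixel position plus interpolated data that the shading stage then consumes.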
Shade
• Fragment processors (multiple in parallel)
– Compute a color for each pixel
– Optionally read colors from textures (images)
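A fragment program can be pictured as a small function that runs once per fragment and returns a color. The sketch below, with hypothetical names, tints a color read from a texture using nearest-neighbor sampling. It is a CPU-side illustration, not shader code.

```python
def sample_nearest(texture, u, v):
    """Read a texel at normalized coordinates (u, v) in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)    # clamp so u == 1.0 stays in range
    y = min(int(v * height), height - 1)
    return texture[y][x]

def shade(u, v, texture, tint):
    """Fragment program stand-in: texture color modulated by a tint, per channel."""
    texel = sample_nearest(texture, u, v)
    return tuple(t * k for t, k in zip(texel, tint))

# 2x2 RGB texture: red, green / blue, white.
checker = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]

# One shade() call per pixel of a 2x2 image, sampling at pixel centers.
image = [[shade((x + 0.5) / 2, (y + 0.5) / 2, checker, (0.5, 0.5, 0.5))
          for x in range(2)] for y in range(2)]
```

Because each call depends only on its own inputs, the fragment processors can evaluate many pixels at once.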
GPU programming
GPU Programming Model
• Useful analogies:
– Rasterization = kernel invocation
– Texture coordinates = computational domain
– Vertex coordinates = computational range
• Invoking computation amounts to drawing pixels:
– GPGPU invocation is commonly a full-screen quad
CPU                               GPU
Stream / data array               Texture
Memory read                       Texture sampling
Loop body / kernel / algorithm    Fragment program
Feedback: array write             Feedback: render to texture
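The CPU-to-GPU analogy can be emulated on the CPU. In the sketch below (all names are assumed), a 2D list stands in for a texture, `run_kernel` plays the role of drawing a full-screen quad (one kernel invocation per output pixel), and feeding each pass's output into the next mimics render-to-texture feedback. The kernel is a simple four-neighbor average chosen only for illustration.

```python
def blur_kernel(src, x, y):
    """Fragment-program stand-in: average of the four neighbors."""
    h, w = len(src), len(src[0])

    def fetch(i, j):  # texture read with clamp-to-edge addressing
        return src[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]

    return (fetch(x - 1, y) + fetch(x + 1, y) +
            fetch(x, y - 1) + fetch(x, y + 1)) / 4.0

def run_kernel(kernel, src):
    """'Draw a full-screen quad': invoke the kernel once per output element."""
    h, w = len(src), len(src[0])
    return [[kernel(src, x, y) for x in range(w)] for y in range(h)]

def iterate(kernel, data, passes):
    """Ping-pong: the output of each pass becomes the input of the next."""
    for _ in range(passes):
        data = run_kernel(kernel, data)
    return data

grid = [[0.0, 0.0, 0.0],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 0.0]]
one_pass = iterate(blur_kernel, grid, 1)
```

Each output element is computed only from reads of the input array, which is the property that lets a GPU run the invocations in parallel.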
Toolkits and Language
• High-level shading languages
– Cg: C for Graphics
– HLSL: the D3D shading language
– The OpenGL Shading Language
• GPGPU languages
– Sh: University of Waterloo
– Brook: Stanford University
• CUDA SDK
– Includes a C compiler and many libraries
Sample Code
Conclusion
• GPUs provide programmers with unparalleled flexibility and performance in a product line that spans the entire PC market.
• Using the capabilities of the GPU allows programmers to develop new applications, whether graphical or general-purpose, more efficiently.
References
• GPU Gems 2 (Chapters 29 & 30)
• SIGGRAPH 2005 GPGPU Course
• http://www.gpgpu.org/
Questions?
Thanks