Post on 15-Jan-2016
Real-World GPGPU
Mark Harris, NVIDIA Developer Technology
Copyright © NVIDIA Corporation 2004
GPGPU Research Promises Big Speedups
physically-based simulation
image processing
scientific computing
computer vision
computational finance
medical imaging
bioinformatics
databases and data mining
sorting
ray tracing
Researchers have tried many applications on GPUs
Research results promise big speedups
LU-GPU dense linear system solver: 10x CPU (UNC)
GPUTeraSort: 2006 Indy PennySort Champion (UNC)
ClawHMMr streaming sequence search: 5-20x CPU (Stanford)
Raw Data Promises High Perf, Too
[Chart: NVIDIA GPU Pixel Shader GFLOPS, 2005 to 2006. Series: GPU observed GFLOPS; CPU theoretical peak GFLOPS]
Real-World Performance Gains
Do research results and high peak performance translate to real application speedups?
Real-World Applications
Medical Imaging (Mercury Computer Systems)
Electromagnetic Simulations (NVIDIA Partner)
Game Physics (Havok)
© 2006 Mercury Computer Systems, Inc.
Digital Breast Tomosynthesis (DBT)
100X reconstruction speed-up with NVIDIA Quadro FX 4500 GPU
From hours to minutes
Facilitates clinical use
Improved diagnostic value
Clearer images
Fewer obstructions
Earlier detection
[Figure labels: axis of rotation, compressed breast, digital detector, X-ray tube, compression paddle]
11 Low-dose X-ray Projections
Extremely Computationally Intense Reconstruction
Advanced Imaging Solution of the Year
“Mercury reduced reconstruction time from 5 hours to 5 minutes, making DBT clinically viable. … among 70 women diagnosed with breast cancer, DBT pinpointed 7 cases not seen with mammography”
Pioneering DBT work at Massachusetts General Hospital
Electromagnetic Simulation
3D Finite-Difference and Finite-Element Modeling of:
Cell phone irradiation
MRI Design / Modeling
Printed Circuit Boards
Radar Cross Section (Military)
Computationally Intensive!
Large speedups with Quadro GPUs
Pacemaker with Transmit Antenna
Commercial, Optimized, Mature Software
[Chart: Performance (Mcells/s), axis 0 to 800, versus # Quadro FX 4500 GPUs (0, 1, 2, 4). Baseline: Single CPU, 3.x GHz. Bar labels: 1X, 5X, 10X, 18X]
Havok FX Physics on NVIDIA GPUs
Physics-based effects on a massive scale
10,000s of objects at high frame rates
Rigid bodies
Particles
Fluids
Cloth
and more
Dedicated Performance For Physics
Performance Measurement: 15,000 Boulder Scene
Frame Rate
CPU Physics: 6.2 fps
Dual Core P4EE 955 - 3.46GHz
GeForce 7900GTX SLI
CPU Multi-threading enabled
GPU Physics: 64.5 fps
Dual Core P4EE 955 - 3.46GHz
GeForce 7900GTX SLI
CPU Multi-threading enabled
GPGPU Performance Strategies
Choose applications with high Arithmetic Intensity
Arithmetic Intensity = Arithmetic / Bandwidth
Game physics top kernels = very high A.I.
> 1500 cycles per collision, ~100 texture fetches
Leverage strengths of all processors in the system
GPUs: data-parallel computation
CPUs: sequential computation
Multi-core CPUs: task-parallel computation
Find the parallelism in the application
Data dependencies can make problem appear sequential
Divide into batches of independent parallelism
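The arithmetic intensity ratio above can be worked through with the game physics figures from this slide. The sketch below is illustrative only: it assumes cycles stand in for "arithmetic" and texture fetches for "bandwidth", and the function name `arithmetic_intensity` is hypothetical. The vector-add comparison is not from the slides; it is included only to show what a low-intensity kernel looks like.

```python
def arithmetic_intensity(arithmetic_ops, memory_accesses):
    """Ratio of arithmetic work to memory traffic (both in caller-chosen units)."""
    if memory_accesses <= 0:
        raise ValueError("memory_accesses must be positive")
    return arithmetic_ops / memory_accesses


# Figures from the slide: more than 1500 cycles and roughly 100 texture
# fetches per collision. Units are an assumption (cycles per fetch).
collision_ai = arithmetic_intensity(1500, 100)

# Illustrative contrast, not from the slides: c[i] = a[i] + b[i] does
# one add for two loads and one store.
vector_add_ai = arithmetic_intensity(1, 3)

print(f"collision kernel: {collision_ai:.1f} cycles per fetch")
print(f"vector add:       {vector_add_ai:.2f} ops per access")
```

Under these assumptions the collision kernel does about 15 cycles of work per fetch, which is the kind of ratio the slide calls "very high A.I.".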
Rigid Body Dynamics Overview
3 phases to every simulation time step
Integrate positions and velocities
Detect collisions
Resolve collisions
Integration is very parallel
No dependencies between objects: use the GPU
Detecting collisions is basically scene traversal
CPU is good at this: use it
Resolving collisions is a tricky one
Is it parallel enough for the GPU?
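To see why integration has no dependencies between objects, here is a minimal sketch of that phase. It assumes an explicit Euler integrator and a dictionary layout for each body; neither is stated in the slides, and the name `integrate` is hypothetical. Each body's update reads only its own state, so every loop iteration could run as a separate GPU thread.

```python
def integrate(bodies, dt):
    """One explicit Euler step (an assumed integrator) over all bodies.

    Each output body depends only on the matching input body, so the
    iterations are independent of one another.
    """
    return [
        {
            "pos": tuple(p + v * dt for p, v in zip(b["pos"], b["vel"])),
            "vel": tuple(v + a * dt for v, a in zip(b["vel"], b["acc"])),
            "acc": b["acc"],
        }
        for b in bodies
    ]


# Two hypothetical bodies under a constant downward acceleration.
bodies = [
    {"pos": (0.0, 0.0, 0.0), "vel": (1.0, 0.0, 0.0), "acc": (0.0, -10.0, 0.0)},
    {"pos": (2.0, 4.0, 0.0), "vel": (0.0, 0.0, 0.0), "acc": (0.0, -10.0, 0.0)},
]
stepped = integrate(bodies, dt=0.5)
```

Collision detection and resolution are left out on purpose: the slide assigns detection to the CPU and treats resolution as the open question.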
Is Game Physics A Data Parallel Task?
[Diagram: Contacts & Velocities -> Solve Collisions -> New Velocities, drawn over a group of connected bodies]
Slide courtesy of Andrew Bond, Havok
Is Game Physics A Data Parallel Task?
[Diagram: Contacts -> Solve Collisions -> New Velocities. Solve Collisions is split into Batch 1, Batch 2, ..., Batch M; each batch contains Solve link 1, Solve link 2, ..., Solve link N]
Slide courtesy of Andrew Bond, Havok
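One way to form such batches is sketched below. It assumes a link is a pair of body ids and that two links are independent when they share no body. The greedy first-fit assignment and the name `batch_links` are illustrative assumptions, not Havok's method. Links within a batch can be solved in parallel; the batches themselves run one after another.

```python
def batch_links(links):
    """Greedily group links (body_a, body_b) so no batch touches a body twice."""
    batches = []  # batches[i] is a list of links
    touched = []  # touched[i] is the set of body ids used by batches[i]
    for a, b in links:
        for batch, bodies in zip(batches, touched):
            if a not in bodies and b not in bodies:
                batch.append((a, b))
                bodies.update((a, b))
                break
        else:
            # No existing batch is free of both bodies: open a new one.
            batches.append([(a, b)])
            touched.append({a, b})
    return batches


# Hypothetical contact graph: a ring of four bodies plus one separate pair.
links = [(0, 1), (1, 2), (2, 3), (0, 3), (4, 5)]
batches = batch_links(links)
for i, batch in enumerate(batches, start=1):
    print(f"Batch {i}: {batch}")
```

For this input the sketch yields two batches, so five link solves take two sequential steps instead of five.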
Conclusion
Real-World GPGPU is just beginning!
http://developer.nvidia.com