© 2013 NVIDIA
Mark Harris, Chief Technologist, GPU Computing Software, NVIDIA
Future Directions for CUDA
Platform for Parallel Computing
The CUDA Platform is a foundation that supports a diverse parallel computing ecosystem.
Platform
Platform for Parallel Computing
[Figure: timeline of CUDA releases 1.0, 2.0, 3.0, 4.0, 5.0, with features grouped under Compiler Tool Chain, Programming Languages, Libraries, Developer Tools]
Features shown: C, C++, Fortran (PGI), OpenACC, Dynamic Parallelism, Recursion, Function pointers, Inheritance, Virtual functions, Templates, C++ new/delete, NVCC, LLVM, Device Code Linking, UVA, GPUDirect, GPU-Aware MPI, cuBLAS, cuBLAS Device API, cuFFT, cuRand, cuSparse, Thrust, NVPP, 1000+ new NVPP functions, cuda-gdb, cuda-memcheck, Detect Shared Memory Hazards, Visual Profiler, New Visual Profiler, Command-Line Profiler, Nsight IDE, Nsight Eclipse Ed., nvidia-smi
Investing in the Future
Enabling More Programmers
Programming Model
Future Computing Platforms
Platform
Unified Programming Language
Unified Run-Time Interface
CUDA Dynamic Parallelism
[Figure: diagram of a CPU running main and a GPU running A, B, C, X, Y, Z]
CUDA UVM Demo
Simpler, More Integrated Programming
[Figure: DP GFLOPS per Watt (y-axis, 2 to 16) by year (x-axis, 2008 to 2014) for the Tesla, Fermi, Kepler, and Maxwell architectures, annotated with Unified Language, Unified Run-Time, Unified Virtual Memory]
Diversity of Programming Languages
http://www.ohloh.net
Enabling More Programming Languages
Developers want to build front-ends for Python, Java, R, DSLs …
Target other processors like ARM, FPGAs, GPUs, x86 …
[Figure: LLVM Compiler For CUDA. Front-ends: CUDA C, C++, Fortran, New Language Support. Targets: NVIDIA GPUs, x86 CPUs, New Processor Support]
Enabling More Programming Languages
[Figure: LLVM Compiler For CUDA. Front-ends: CUDA C, C++, Fortran, New Language Support. Targets: NVIDIA GPUs, x86 CPUs, New Processor Support. Also shown: Halide (http://halide-lang.org/), Mozilla Rust]
Rapid Development
Powerful Libraries
Commercial Support
Large Community
Is Python Fast Enough for HPC?
Python apps often implement performance-critical functions in C/C++.
Compile Python for Parallel Architectures
Anaconda Accelerate from Continuum Analytics
NumbaPro array-oriented compiler for Python & NumPy
Compile for CPUs or GPUs (uses LLVM + NVIDIA Compiler SDK)
Fast Development + Fast Execution: Ideal Combination
http://continuum.io
Free Academic License
| 1024² Mandelbrot | Time | Speedup v. Pure Python |
|---|---|---|
| Pure Python | 4.85s | -- |
| NumbaPro (CPU) | 0.11s | 44x |
| CUDA Python (K20) | 0.004s | 1221x |

CUDA Python
CUDA Programming, Python Syntax
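
For orientation, the "Pure Python" baseline in a Mandelbrot benchmark is typically an escape-time loop like the one below. This is a minimal sketch, not the code behind the slide's numbers: the function names `mandel` and `render`, the iteration limit, the coordinate window, and the small grid size are all assumptions chosen for illustration. Loop-heavy numeric functions of this shape are the kind of code a Python compiler is meant to speed up.

```python
import time


def mandel(x, y, max_iters):
    """Return the iteration at which c = x + yj escapes, or max_iters if it never does."""
    c = complex(x, y)
    z = 0j
    for i in range(max_iters):
        z = z * z + c
        # Escape test on the squared magnitude avoids a square root.
        if z.real * z.real + z.imag * z.imag >= 4.0:
            return i
    return max_iters


def render(width, height, min_x=-2.0, max_x=1.0, min_y=-1.0, max_y=1.0, max_iters=20):
    """Compute escape counts for a width x height grid (hypothetical helper, list of rows)."""
    step_x = (max_x - min_x) / width
    step_y = (max_y - min_y) / height
    return [
        [mandel(min_x + col * step_x, min_y + row * step_y, max_iters) for col in range(width)]
        for row in range(height)
    ]


if __name__ == "__main__":
    # A 64 x 64 grid keeps the pure-Python run short; the slide's benchmark used 1024 x 1024.
    start = time.perf_counter()
    image = render(64, 64)
    elapsed = time.perf_counter() - start
    print(f"rendered {len(image)} x {len(image[0])} grid in {elapsed:.4f}s")
```

Every pixel is computed independently of its neighbours, which is why the same loop body maps naturally onto one GPU thread per pixel.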
KAYLA
CUDA 5 | OpenGL 4.3
Kick starts ARM + CUDA Ecosystem
NAMD Ported in 2 Days
Kayla Development Platform
Quad ARM + Kepler GPU
Quad ARM + Any CUDA GPU
DEMO: KAYLA
Platform for Parallel Computing
[Figure: the CUDA 1.0 to 5.0 feature timeline shown earlier, repeated, with the same four categories: Compiler Tool Chain, Programming Languages, Libraries, Developer Tools]
Platform for Parallel Computing
[Figure: the release timeline continued from CUDA 5.0, with the categories Compiler Tool Chain, Programming Languages, Libraries, Developer Tools]
Features shown: JIT Linking, JIT Compilation, C++11, Sparse Solvers, Profiler, Step-by-Step Guidance, Single-GPU Debugging, Multi-GPU Support, ARM Support
Future Challenges
Ubiquitous parallel programming
Power Aware Programming
Hybrid operating system Enablement
Parallel Compiler Foundation Enablement
Optimizing locality and computation
Task, Thread & Data Parallelism
Today
Easier Parallel Programming
GPUs Everywhere
2012 2015 2018
MPI