pycon2014 gpu computing

Post on 25-May-2015


GPU ACCELERATED HIGH PERFORMANCE COMPUTING PRIMER

G RAJA SUMANT ASHOK ASHWIN

//early draft copy

CONTENTS
1. INTRODUCTION
2. WHY USE GPU?
3. CPU ARCHITECTURE
4. GPU ARCHITECTURE
5. WHY PYTHON FOR GPU
6. HOW GPU ACCELERATION WORKS
7. TECHNOLOGIES AVAILABLE TODAY FOR GPU COMPUTING
8. CUDA+PYTHON
9. PyCuda SAMPLE CODE

INTRODUCTION

GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, engineering, and enterprise applications. Pioneered in 2007 by NVIDIA, GPUs now power energy-efficient datacenters in government labs, universities, enterprises, and small-and-medium businesses around the world.

WHY USE GPU?

GPU-accelerated computing offers unprecedented application performance by offloading compute-intensive portions of the application to the GPU, while the remainder of the code still runs on the CPU. From a user's perspective, applications simply run significantly faster.

CPU ARCHITECTURE

Design target for CPUs:

1) Focus on task parallelism
2) Make a single thread very fast
3) Hide latency through large caches
4) Predict, speculate

GPU ARCHITECTURE

The GPU architecture of AMD

GPU ARCHITECTURE

NVIDIA ARCHITECTURE


WHY PYTHON FOR GPU

Go to a terminal, type python, then at the >>> prompt type: import this

Read the output (the Zen of Python).
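The exercise above can also be scripted; a minimal sketch (the standard-library `this` module stores the Zen text ROT13-encoded in the attribute `this.s`):

```python
import codecs
import this  # importing the module prints the Zen of Python

# The module keeps the text ROT13-encoded in the attribute `s`.
zen = codecs.decode(this.s, "rot13")
print(zen.splitlines()[0])  # "The Zen of Python, by Tim Peters"
```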

WHY PYTHON FOR GPU

GPUs are everything that scripting languages are not:
> Highly parallel
> Very architecture-sensitive
> Built for maximum FP/memory throughput
The two complement each other:
> CPU: largely restricted to control tasks (~1000/sec); scripting is fast enough
> Python + CUDA = PyCUDA
> Python + OpenCL = PyOpenCL

http://www.nvidia.com/object/what-is-gpu-computing.html

TECHNOLOGIES AVAILABLE TODAY FOR GPU COMPUTING

Open Computing Language (OpenCL)
> Many vendors: AMD, Nvidia, Apple, Intel, IBM...
> Standard CPUs may report themselves as OpenCL capable
> Works on most devices, but implemented feature set and extensions may vary
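As a sketch of the vendor-neutrality point above, the snippet below enumerates whatever OpenCL platforms and devices the local runtime reports, via PyOpenCL's get_platforms/get_devices calls. `list_opencl_devices` is a hypothetical helper name; the function returns an empty list when PyOpenCL or an OpenCL runtime is unavailable:

```python
def list_opencl_devices():
    """Return [(platform_name, device_name), ...]; [] when OpenCL is unavailable."""
    try:
        import pyopencl as cl
    except ImportError:
        return []  # PyOpenCL not installed
    try:
        return [(p.name, d.name)
                for p in cl.get_platforms()
                for d in p.get_devices()]
    except Exception:
        return []  # installed, but no usable OpenCL driver

for platform, device in list_opencl_devices():
    print(platform, "->", device)
```

On a machine with, say, both an Intel CPU runtime and an Nvidia driver installed, the same code lists devices from both vendors, which is the point of the slide.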

Compute Unified Device Architecture (CUDA)
> One vendor: Nvidia (more mature tools)
> Better coherence across a limited set of devices

CUDA + PYTHON

PyCUDA
> You still have to write your kernel in CUDA C
> ...but it integrates easily with numpy
> Higher level than CUDA C, but not much higher
> Full CUDA support and performance

gnumpy/CUDAMat/cuBLAS
> gnumpy: numpy-like wrapper for CUDAMat
> CUDAMat: pre-written kernels and partial cuBLAS wrapper
> cuBLAS: (incomplete) CUDA implementation of BLAS

PyCUDA sample code

Open helloCUDA.py in an editor; look at and analyse the code.
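For readers without the helloCUDA.py file at hand, here is a minimal sketch in the spirit of the standard PyCUDA tutorial: a CUDA C kernel that doubles every element of a 4x4 array, compiled and launched through SourceModule. `doublify` is our illustrative wrapper, and the snippet falls back to plain numpy when PyCUDA or a CUDA device is unavailable:

```python
import numpy as np

KERNEL = """
__global__ void doublify(float *a)
{
    int idx = threadIdx.x + threadIdx.y * 4;
    a[idx] *= 2;
}
"""

def doublify(a):
    """Double a 4x4 float32 array, on the GPU when PyCUDA is usable."""
    try:
        import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
        import pycuda.driver as cuda
        from pycuda.compiler import SourceModule
    except Exception:
        return a * 2  # CPU fallback: no PyCUDA or no CUDA device

    mod = SourceModule(KERNEL)           # compile the CUDA C kernel
    func = mod.get_function("doublify")
    a_gpu = cuda.mem_alloc(a.nbytes)     # allocate device memory
    cuda.memcpy_htod(a_gpu, a)           # host -> device copy
    func(a_gpu, block=(4, 4, 1))         # launch one 4x4 block of threads
    result = np.empty_like(a)
    cuda.memcpy_dtoh(result, a_gpu)      # device -> host copy
    return result

a = np.random.randn(4, 4).astype(np.float32)
print(np.allclose(doublify(a), a * 2))  # True on either path
```

The kernel, memory copies, and launch parameters mirror the example in the official PyCUDA tutorial; only the numpy fallback is our addition.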

CUDAMAT

The aim of the cudamat project is to make it easy to perform basic matrix calculations on CUDA-enabled GPUs from Python. cudamat provides a Python matrix class that performs calculations on a GPU. At present, some of the operations the GPU matrix class supports include: easy conversion to and from instances of numpy.ndarray; limited slicing support; matrix multiplication and transpose; elementwise addition, subtraction, multiplication, and division.

Open the cudamat examples.
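A short sketch of the cudamat workflow described above. `gpu_matmul` is a hypothetical helper name; the body assumes cudamat's documented entry points (cublas_init, CUDAMatrix, dot, asarray) and falls back to numpy when cudamat or a GPU is unavailable:

```python
import numpy as np

def gpu_matmul(x, y):
    """Multiply two matrices with cudamat when available, else with numpy."""
    try:
        import cudamat as cm
        cm.cublas_init()            # initialise cuBLAS / the GPU context
    except Exception:
        return x @ y                # CPU fallback: no cudamat or no GPU

    a = cm.CUDAMatrix(x)            # copy numpy.ndarray to the GPU
    b = cm.CUDAMatrix(y)
    c = cm.dot(a, b)                # matrix multiplication on the GPU
    out = c.asarray()               # copy the result back to a numpy.ndarray
    cm.cublas_shutdown()
    return out

x = np.random.rand(3, 4)
y = np.random.rand(4, 2)
print(np.allclose(gpu_matmul(x, y), x @ y))
```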

GNumpy

Module gnumpy contains the class garray, which behaves much like numpy.ndarray.

Module gnumpy also contains methods like tile() and rand(), which behave like their numpy counterparts except that they deal with gnumpy.garray instances instead of numpy.ndarray instances.

gnumpy builds on cudamat.
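A minimal sketch of the garray idea: the same elementwise expression runs unchanged on a gnumpy.garray or a numpy.ndarray. `scaled_sum` is a hypothetical helper name; the gnumpy calls (garray, as_numpy_array) follow the module's documentation, with a numpy fallback when gnumpy is not importable:

```python
import numpy as np

def scaled_sum(a, b):
    """Compute a * 2 + b, on gnumpy.garray instances when gnumpy is available."""
    try:
        import gnumpy as gpu
        ga, gb = gpu.garray(a), gpu.garray(b)   # move data to the GPU
        return (ga * 2 + gb).as_numpy_array()   # same expression, then copy back
    except Exception:
        return a * 2 + b                        # plain numpy fallback

a = np.arange(6.0).reshape(2, 3)
b = np.ones((2, 3))
print(scaled_sum(a, b))
```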

References

● http://documen.tician.de/pycuda/tutorial.html

● http://on-demand.gputechconf.com/gtc/2010/presentations/S12041-PyCUDA-Simpler-GPU-Programming-Python.pdf

● http://www.tsc.uc3m.es/~miguel/MLG/adjuntos/slidesCUDA.pdf

● http://femhub.com/docs/cuda_en.pdf

● http://www.ieap.uni-kiel.de/et/people/kruse/tutorials/cuda/tutorial01p/web01p/tutorial01p.pdf

● http://conference.scipy.org/static/wiki/scipy09-pycuda-tut.pdf
