
Post on 22-Jan-2020


Tesla: Fastest Processor Adoption in HPC History

Teratec 2009

1995

2008

GPU: 240 cores

CPU: 4 cores

Advent of GPU Computing

CPU + GPU Co-Processing

GPU Computing Applications

C, C++, Java, Fortran, OpenCL™, DirectX Compute

NVIDIA GPU

CUDA Parallel Computing Architecture

OpenCL is a trademark of Apple Inc., used under license to the Khronos Group Inc.

CUDA GPU Computing Architecture

Tesla GPU Computing Products

| | Tesla S1070 System | Tesla C1060 Processor | Tesla M1060 Processor |
|---|---|---|---|
| GPUs | 4 Tesla GPUs | 1 Tesla GPU | 1 Tesla GPU |
| Single Precision Perf | 4.14 TFlops | 933 GFlops | 933 GFlops |
| Double Precision Perf | 346 GFlops | 78 GFlops | 78 GFlops |
| Memory | 4 GB / GPU | 4 GB | 4 GB |
| Form Factor | 1U Chassis, cables to a Host | Standard PCIe board | Custom Module |
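
The peak figures in the product table above can be reproduced with simple arithmetic. The sketch below is a sanity check, not an NVIDIA formula: the 240-core count comes from the slides, but the clock rates (1.296 GHz for the C1060/M1060, 1.44 GHz for the S1070), the 3 single-precision flops per core per cycle, and the 30 double-precision units per GPU are assumptions that do not appear in the presentation.

```python
# Peak-throughput arithmetic for the Tesla 10-series product table.
# ASSUMPTIONS (not stated in the slides): the clock rates passed in by the
# caller, 3 single-precision flops per core per cycle, and 30 double-precision
# units per GPU, each doing one fused multiply-add (2 flops) per cycle.

CORES_PER_GPU = 240          # from the slides (T10: 240 cores)
SP_FLOPS_PER_CORE_CYCLE = 3  # assumed: multiply-add plus multiply
DP_UNITS_PER_GPU = 30        # assumed: one unit per multiprocessor
DP_FLOPS_PER_UNIT_CYCLE = 2  # assumed: fused multiply-add


def peak_gflops(clock_ghz, gpus=1):
    """Return (single_precision, double_precision) peak GFlops."""
    sp = gpus * CORES_PER_GPU * SP_FLOPS_PER_CORE_CYCLE * clock_ghz
    dp = gpus * DP_UNITS_PER_GPU * DP_FLOPS_PER_UNIT_CYCLE * clock_ghz
    return sp, dp


if __name__ == "__main__":
    # Assumed clocks: 1.296 GHz (C1060/M1060), 1.44 GHz (S1070, 4 GPUs).
    for name, clock, gpus in [("C1060 / M1060", 1.296, 1), ("S1070", 1.44, 4)]:
        sp, dp = peak_gflops(clock, gpus)
        print(f"{name}: {sp:.0f} GFlops single, {dp:.0f} GFlops double")
```

Under these assumptions the single-GPU case comes out at roughly 933 and 78 GFlops, and the four-GPU case at roughly 4.15 TFlops and 346 GFlops, in line with the table (which lists 4.14 TFlops). The S1070's per-GPU figure is higher than the C1060's only because of the higher assumed clock.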

Tesla GPU Computing Solutions

Datacenter

Departmental Cluster

Scientific Desktop

Integrated 1U Servers

Personal Supercomputer

Pre-Configured Clusters

Tesla S1070 & MS Server 2008

Current HIC: PCIe x16 Gen2

New GHIC x16: PCIe x16 Gen2 with GPU & Fan

Connector extends out from I/O plane

High-density connector for VGA/DVI dongle

NVIDIA GPU Computing Ecosystem

[Diagram labels: NVIDIA Hardware Solutions; CUDA SDK & Tools; GPU Architecture; Customer Application Deployment; Customer Requirements; Hardware Architecture; OEMs; CUDA Development Specialist; ISV; CUDA Training Company; Hardware Architect; VAR]

CUDA Ecosystem

Applications: Oil & Gas, Finance, Medical, Biophysics, Numerics, Imaging, CFD, DSP, EDA

Libraries: FFT, BLAS, LAPACK, Image processing, Video processing, Signal processing, Vision

Languages: C, C++, DirectX, Fortran, Java, OpenCL, Python

Compilers: PGI Fortran, CAPS HMPP, MCUDA, MPI, NOAA Fortran2C, OpenMP

Debuggers: Allinea, TotalView

Consultants

OEMs

Over 200 Universities Teaching CUDA: UIUC, MIT, Harvard, Berkeley, Cambridge, Oxford, IIT Delhi, Dortmund, ETH Zurich, Moscow, Ecole Centrale, Paris 6 Jussieu

Timeline: 2006, 2008, 2010, 2012, 2015

GPU: T8 (128 cores), T10 (240 cores)

A 2015 GPU*

~20x the performance of today’s GPU

~5,000 cores at ~3 GHz (50 mW each)

~20 TFLOPS

~1.2 TB/s of memory bandwidth

*This is a sketch of what a GPU in 2015 might look like; it does not reflect any actual product plans.
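
The 2015 sketch can be cross-checked with back-of-envelope arithmetic. The inputs below are the approximate figures quoted on the slide, plus the 933 GFlops single-GPU figure from the product table as a stand-in for "today's GPU" (an assumption, since the slide does not say which figure it compares against). The derived quantities are illustrative only and are not claims made in the presentation.

```python
# Back-of-envelope arithmetic on the 2015 GPU sketch.
# Inputs are the approximate figures quoted on the slide; everything derived
# from them is illustrative only.

CORES = 5_000
CLOCK_GHZ = 3.0
POWER_PER_CORE_MW = 50
PEAK_TFLOPS = 20.0
MEM_BANDWIDTH_TBS = 1.2
TODAY_SP_GFLOPS = 933  # assumed baseline: single-GPU figure from the product table


def total_core_power_watts():
    """Aggregate power of all cores (cores only, no memory or I/O)."""
    return CORES * POWER_PER_CORE_MW / 1000


def flops_per_core_per_cycle():
    """Peak GFlops divided by aggregate giga-cycles per second."""
    return PEAK_TFLOPS * 1000 / (CORES * CLOCK_GHZ)


def bytes_per_flop():
    """Memory bandwidth available per floating-point operation at peak."""
    return MEM_BANDWIDTH_TBS / PEAK_TFLOPS


def speedup_over_today():
    """Ratio of the sketched peak to the assumed present-day baseline."""
    return PEAK_TFLOPS * 1000 / TODAY_SP_GFLOPS


if __name__ == "__main__":
    print(f"core power:       {total_core_power_watts():.0f} W")
    print(f"flops/core/cycle: {flops_per_core_per_cycle():.2f}")
    print(f"bytes per flop:   {bytes_per_flop():.2f}")
    print(f"speedup:          {speedup_over_today():.1f}x")
```

With these inputs the cores alone draw about 250 W, each core sustains a little over one flop per cycle, and the speedup over the assumed baseline is a bit above 20x, so the slide's rounded figures are mutually consistent.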

GPU Revolutionizing Computing

GFlops
