Opening Keynote at GTC 2015: Leaps in Visual Computing

Post on 14-Jul-2015

LEAPS IN VISUAL COMPUTING

JEN-HSUN HUANG, CO-FOUNDER & CEO | GTC 2015

FOUR ANNOUNCEMENTS

A New GPU and Deep Learning

A Very Fast Box and Deep Learning

Roadmap Reveal and Deep Learning

Self-Driving Cars and Deep Learning

AMAZING YEAR IN VISUAL COMPUTING

© 2015 Industrial Light & Magic. All Rights Reserved.

10X GROWTH IN GPU COMPUTING

                              2008        2015
CUDA Downloads                150,000     3 Million
Academic Papers               4,000       60,000
Universities Teaching         60          800
Supercomputing Teraflops      77          54,000
Tesla GPUs                    6,000       450,000
CUDA Apps                     27          319
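The 2008 and 2015 figures on this slide can be turned into per-metric growth factors with a few lines of Python. This is only an illustrative sketch: the dictionary keys and the helper name `growth_factors` are made up for this example, and "3 Million" is taken to mean exactly 3,000,000.

```python
# Figures copied from the slide; "3 Million" is assumed to be 3_000_000.
stats_2008 = {
    "CUDA Downloads": 150_000,
    "Academic Papers": 4_000,
    "Universities Teaching": 60,
    "Supercomputing Teraflops": 77,
    "Tesla GPUs": 6_000,
    "CUDA Apps": 27,
}
stats_2015 = {
    "CUDA Downloads": 3_000_000,
    "Academic Papers": 60_000,
    "Universities Teaching": 800,
    "Supercomputing Teraflops": 54_000,
    "Tesla GPUs": 450_000,
    "CUDA Apps": 319,
}


def growth_factors(before, after):
    """Return the ratio after/before for every metric (hypothetical helper)."""
    return {name: after[name] / before[name] for name in before}


factors = growth_factors(stats_2008, stats_2015)

# Print the metrics from smallest to largest growth factor.
for name, factor in sorted(factors.items(), key=lambda item: item[1]):
    print(f"{name:26s} {factor:8.1f}x")
```

Under these assumptions every metric grows by more than 10x, with CUDA Apps the smallest at roughly 11.8x, which is consistent with the slide's headline.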

TITAN X
THE WORLD’S FASTEST GPU

8 Billion Transistors
3,072 CUDA Cores
7 TFLOPS SP / 0.2 TFLOPS DP
12GB Memory

TITAN X FOR DEEP LEARNING
Training AlexNet

[Bar chart: days to train AlexNet, axis 0 to 7, for a 16-core Xeon CPU, TITAN, TITAN Black with cuDNN, and TITAN X with cuDNN; an off-scale bar is labeled 43]

TITAN X
THE WORLD’S FASTEST GPU

$999

FOUR ANNOUNCEMENTS

A New GPU and Deep Learning

A Very Fast Box and Deep Learning

Roadmap Reveal and Deep Learning

Self-Driving Cars and Deep Learning

A SHORT HISTORY OF DEEP LEARNING

Convolutional Neural Networks for Handwritten Digit Recognition
LECUN, BOTTOU, BENGIO, HAFFNER, 1998

ImageNet Classification with NVIDIA GPUs
KRIZHEVSKY, HINTON, ET AL., 2012

[Timeline: 1995 to 2015]

IMAGENET CHALLENGE

[Chart: accuracy (%) by year, 2010 to 2014, CV versus DNN; labeled values 72%, 74%, 84%]

“Deep Image: Scaling up Image Recognition”
Baidu: 5.98%, Jan. 13, 2015

“Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”
Microsoft: 4.94%, Feb. 6, 2015

“Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”
Google: 4.82%, Feb. 11, 2015

THE BIG BANG

DEEP LEARNING VISUALIZED

GPU-ACCELERATED DEEP LEARNING START-UPS

DEEP LEARNING REVOLUTIONIZING MEDICAL RESEARCH

Detecting Mitosis in Breast Cancer Cells (IDSIA)

Predicting the Toxicity of New Drugs (Johannes Kepler University)

Understanding Gene Mutation to Prevent Disease (University of Toronto)

“Automated Image Captioning with ConvNets and Recurrent Nets”

—Andrej Karpathy, Fei-Fei Li

DIGITS
DEEP GPU TRAINING SYSTEM FOR DATA SCIENTISTS

Design DNNs

Visualize activations

Manage multiple trainings

DIGITS

[Diagram: DIGITS software stack, top to bottom]
USER INTERFACE: Process Data, Configure DNN, Monitor Progress, Visualize Layers, Test Image
Caffe, Torch, Theano
cuDNN, cuBLAS
CUDA
GPU HW: GPU, Multi-GPU, GPU Cluster, Cloud

DIGITS DEVBOX

World’s fastest GPU

Max GPU out of a plug

Multi-GPU training & inference

DIGITS DEVBOX: EARLY RESULTS

“I’ve never seen AlexNet run this fast… TitanX is a monster, Crazy Fast”

“DIGITS makes it way easier to design the best network for the job”

Simon Osindero, A.I. Architect

Soumith Chintala, Research Engineer

[Chart: Multi-GPU scaling on Torch, axis 0x to 4x, with 1, 2, and 4 GPUs, for AlexNet and VGG]

DIGITS DEVBOX

Available May 2015

$15,000

FOUR ANNOUNCEMENTS

A New GPU and Deep Learning

A Very Fast Box and Deep Learning

Roadmap Reveal and Deep Learning

Self-Driving Cars and Deep Learning

GPU ROADMAP
Pascal 2x SGEMM/W

[Chart: SGEMM / W, axis 0 to 72, by year, 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal (Mixed Precision, 3D Memory, NVLink), and Volta]

GPU ROADMAP
Pascal 2.7x Memory Capacity

[Chart: Frame Buffer Capacity (GB), axis 0 to 60, by year, 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal (Mixed Precision, 3D Memory, NVLink), and Volta]

GPU ROADMAP
Pascal 4x Mixed Precision

[Chart: HGEMM / W, axis 0 to 144, by year, 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal (Mixed Precision, 3D Memory, NVLink), and Volta]

GPU ROADMAP
Pascal 3x Bandwidth

[Chart: STREAM GB/s, axis 0 to 900, by year, 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal (Mixed Precision, 3D Memory, NVLink), and Volta]

PASCAL 10X MAXWELL

Pass       Layer (bound by)               Speedup      Technology
forward    CONVOLUTION (compute)          4x (FP16)    Mixed Precision
forward    FULLY CONNECTED (bandwidth)    6x           3D Memory
backward   FULLY CONNECTED (bandwidth)    6x           3D Memory
backward   CONVOLUTION (compute)          4x           Mixed Precision
           WEIGHT UPDATE (interconnect)   10x          NVLINK

Mixed Precision, 3D Memory
5x 2x

* Very rough estimates

FOUR ANNOUNCEMENTS

A New GPU and Deep Learning

A Very Fast Box and Deep Learning

Roadmap Reveal and Deep Learning

Self-Driving Cars and Deep Learning

TODAY’S ADAS

SENSE: FPGA, CV ASIC
PLAN: CPU
ACT: WARN, BRAKE

NEXT-GENERATION ADAS

SENSE: FPGA, CV ASIC
PLAN: CPU
ACT: WARN, BRAKE, STEER, ACCELERATE

NVIDIA DRIVE PX SELF-DRIVING CAR COMPUTER

SENSE: FPGA, CV ASIC, DNN
PLAN: CPU
ACT: WARN, BRAKE, STEER, ACCELERATE

[Inset chart: IMAGENET CHALLENGE accuracy (%), 2010 to 2014, CV versus DNN; labeled values 72%, 74%, 84%]

PROJECT DAVE: DARPA AUTONOMOUS VEHICLE

DNN-based self-driving robot

Training data by human driver

No hand-coded CV algorithms

PROJECT LEADS
Urs Muller: Chief Architect, Autonomous Driving, NVIDIA
Yann LeCun: Director, AI Research, Facebook

DAVE IN ACTION

TRAINING DATA: 225K Images

TEST DRIVE: No Training

TEST DRIVE: Partially Trained (52K images)

TEST DRIVE: Fully Trained (225K images)

3,000x Faster

                          DAVE           AlexNet on DRIVE PX
Number of Connections     3.1 Million    630 Million
Frames / Second           12             184
Connections / Second      38 Million     116 Billion
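The connections-per-second figures follow from multiplying the number of connections by the frame rate, and the "3,000x Faster" headline is the ratio of the two results. The sketch below reproduces that arithmetic from the numbers on the slide; the function and variable names are illustrative only, and the inputs are the rounded values as displayed, so the products differ slightly from the slide's own rounded totals.

```python
def connections_per_second(connections, frames_per_second):
    """Throughput as connections evaluated per second (hypothetical helper)."""
    return connections * frames_per_second


# Values as displayed on the slide (rounded there, so results are approximate).
dave = connections_per_second(3_100_000, 12)            # slide shows 38 Million
alexnet_drive_px = connections_per_second(630_000_000, 184)  # slide shows 116 Billion

speedup = alexnet_drive_px / dave

print(f"DAVE:                {dave / 1e6:.1f} million connections/s")
print(f"AlexNet on DRIVE PX: {alexnet_drive_px / 1e9:.1f} billion connections/s")
print(f"Ratio:               about {speedup:,.0f}x")
```

With the displayed inputs the ratio comes out a little above 3,100x; using the slide's own rounded totals (116 billion over 38 million) gives a little over 3,050x. Either way the result is in line with the rounded 3,000x claim.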

NVIDIA DRIVE™ PX
SELF-DRIVING CAR COMPUTER

Available May 2015

$10,000

ELON MUSK

LEAPS IN VISUAL COMPUTING

TITAN X: The World’s Fastest GPU

DIGITS DevBox: GPU Deep Learning Platform

Pascal: 10x Maxwell For Deep Learning

NVIDIA DRIVE PX: Deep Learning Platform for Self-Driving Cars
