dgxupdate - nvidiaon-demand.gputechconf.com/gtc/2017/presentation/s... · dgx station dgx-1 nvidia...

DGX UPDATE Customer Presentation Deck May 8, 2017


DGX UPDATE

Customer Presentation Deck

May 8, 2017


NVIDIA DGX-1: The World’s Fastest AI Supercomputer

EFFORTLESS PRODUCTIVITY

REVOLUTIONARY AI PERFORMANCE

Fully-integrated and pre-optimized

Insights in hours instead of weeks

Optimized frameworks and cloud managed for faster insights

DGX software stack for fastest GPU performance in the industry

FASTEST PATH TO DEEP LEARNING


DGX-1 launch

OpenAI

ONE YEAR LATER – NVIDIA DGX-1

Barriers Toppled, the Unsolvable Solved – a Sampling of DGX-1 Impact

April 2016

April 2017

UC Berkeley CSIRO MIT CMU Fidelity Skymind RIKEN

NYU Mass. General Hosp. DFK

IDSIA

Microsoft Nimbix

Noodle.ai

SAP NVIDIA SATURNV launch


INTRODUCING THE DGX FAMILY

AI WORKSTATION

The Personal AI Supercomputer

AI DATA CENTER

CLOUD-SCALE AI

The World’s First AI Supercomputer in a Box

The Essential Instrument for AI Research

Cloud service with the highest deep learning efficiency

DGX Station DGX-1 NVIDIA GPU Cloud

with Tesla P100

with Tesla V100


NVIDIA DGX unlocks the full potential of NVIDIA GPUs, powered by software innovation

REVOLUTIONARY AI PERFORMANCE

3X system performance over prior generation

Software stack delivers additional 30% faster training performance vs other GPU systems

10X I/O performance with 2nd generation NVLink vs PCIe-connected GPUs

New Tensor Core architecture inspired by the demands of deep learning


EFFORTLESS PRODUCTIVITY

Save $x00,000’s on software engineering of DL frameworks

Depend on NVIDIA-optimized frameworks instead of evolving open source software

Save $100k+/yr in admin OpEx with cloud management, streamlined collaboration

Monthly framework releases ensure maximized performance for DL ROI

NVIDIA DGX software stack delivers immediate productivity that saves time and money


Single, unified stack for deep learning frameworks

Predictable execution across platforms

Pervasive reach

COMMON SOFTWARE STACK ACROSS DGX FAMILY

DEEP LEARNING FRAMEWORKS

DGX Station DGX-1 NVIDIA Cloud Service

NVIDIA GPU Cloud

DEEP LEARNING USER SOFTWARE

NVIDIA DIGITS™

THIRD PARTY ACCELERATED SOLUTIONS

CONTAINERIZATION TOOL

NVIDIA Docker

GPU DRIVER

NVIDIA Driver

SYSTEM

Host OS


ENTERPRISE BENEFITS OF DGX SOFTWARE

NVIDIA Investments in Deep Learning Performance and Manageability

Practitioner productivity with minimal setup

Clean, minimal O/S base image

Non-disruptive updates for software and security

Optimized drivers and libraries for maximized multi-GPU performance

Driver and library independence for each framework

Popular deep-learning frameworks, GPU-tuned by NVIDIA Engineering


DGX CLOUD SERVICES

Start Faster, Stay Productive

Benefits for Deep Learning Workflow

Single software stack

Scale across teams of practitioners

Develop once, deploy anywhere

Features

Container Registry

Web UI and CLI

Job Scheduling and Management

Host Telemetry

User Management

Python SDK and REST API


NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

10 STEPS TO SET UP A DIY SYSTEM

380 PAGES OF DOCS TO READ

Step 1. Install Ubuntu Linux (10 pg)

Step 2. Install CUDA (41 pg)

Step 3. Install cuDNN (154 pg)

Step 4. Install and Upgrade pip (20 pg)

Step 5. Install Bazel (build TF source) (50 pg)

Step 6. Install TensorFlow (15 pg)

Step 7. Upgrade Protobuf (15 pg)

Step 8. Install Docker (75 pg)

Step 9. Test the installation

Step 10. Debug and fix install
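
The per-step page counts can be checked against the 380-page headline with a short tally. This is only a sketch: the dictionary copies the counts from the slide, and the last two steps (test, then debug and fix) are assumed to carry no page count because the slide gives none.

```python
# Page counts per DIY setup step, copied from the slide.
# The test and debug steps list no page count, so they are left out (assumption).
doc_pages = {
    "Install Ubuntu Linux": 10,
    "Install CUDA": 41,
    "Install cuDNN": 154,
    "Install and upgrade pip": 20,
    "Install Bazel": 50,
    "Install TensorFlow": 15,
    "Upgrade Protobuf": 15,
    "Install Docker": 75,
}

total_pages = sum(doc_pages.values())
print(f"Total pages of documentation: {total_pages}")
```

The eight counted steps account for the full headline figure, so the unnumbered test and debug steps come on top of the reading time.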


DL FROM DEVELOPMENT TO PRODUCTION

Accelerated Deep Learning Value with DGX Solutions

Experiment Tune/Optimize Deploy Train Insights

Procure

DGX Station

Install / Compile

Fast Bring-up, Productive Experimentation, Training at Scale

DGX Station, DGX-1/SATURNV/Cloud

From Desk To Data Center or To Cloud

installed optimized scaled


OUR STRATEGY IN THE DATACENTER: NVIDIA DGX-1

Highest Performance, Fully Integrated HW System

960 TFLOPS | 8x Tesla V100 16GB | 300 GB/s NVLink Hybrid Cube Mesh

2x Xeon | 8 TB RAID 0 | Quad IB 100Gbps, Dual 10GbE | 3U — 3200W

8 TB SSD 8 x Tesla V100 16GB
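
As a rough sanity check on the headline number, dividing the system figure by the GPU count gives the implied per-GPU throughput. The sketch below uses only the 960 TFLOPS and 8-GPU figures from the slide; the per-GPU value is derived rather than quoted, and the function name is illustrative.

```python
def per_gpu_tflops(system_tflops: float, gpu_count: int) -> float:
    """Implied throughput per GPU if the system total is split evenly."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return system_tflops / gpu_count


# Figures from the slide: 960 TFLOPS across 8 Tesla V100 GPUs.
dgx1_per_gpu = per_gpu_tflops(960, 8)
print(f"DGX-1 implied per-GPU throughput: {dgx1_per_gpu:.0f} TFLOPS")
```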


NVIDIA DGX-1 SOFTWARE STACK

DGX SOFTWARE STACK

DEEP LEARNING FRAMEWORKS

DEEP LEARNING USER SOFTWARE

NVIDIA DIGITS™

THIRD PARTY ACCELERATED SOLUTIONS

CONTAINERIZATION TOOL

NVIDIA Docker

GPU DRIVER

NVIDIA Driver

SYSTEM

Host OS

Advantages:

Instant productivity with NVIDIA optimized deep learning frameworks

Caffe, CNTK, MXNet, PyTorch, TensorFlow, Theano, and Torch

Performance optimized across the entire stack

Faster Time-to-Insight with pre-built, tested, and ready to run framework containers

Flexibility to use different versions of libraries like libc, cuDNN in each framework container

Fully Integrated Software for Instant Productivity


SIMPLIFY PORTABILITY WITH NVIDIA DOCKER CONTAINERS

Benefits of Containers:

Simplify deployment of GPU-accelerated applications

Isolate individual frameworks or applications

Share, collaborate, and test applications across different environments
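
A minimal sketch of what this isolation can look like in practice: each framework runs from its own image through the `nvidia-docker` wrapper, so the same command shape works for every framework. The image names below are hypothetical placeholders, not real registry paths, and the helper only builds the command line rather than running it.

```python
import shlex
from typing import List

# Hypothetical image names for illustration only; substitute real ones.
FRAMEWORK_IMAGES = {
    "tensorflow": "example-registry/tensorflow:latest",
    "caffe": "example-registry/caffe:latest",
}


def build_run_command(framework: str, command: str) -> List[str]:
    """Build (but do not execute) an nvidia-docker command for one framework."""
    image = FRAMEWORK_IMAGES[framework]
    # "--rm" is a standard docker flag that removes the container on exit.
    return ["nvidia-docker", "run", "--rm", image, *shlex.split(command)]


for name in FRAMEWORK_IMAGES:
    print(" ".join(build_run_command(name, "nvidia-smi")))
```

Because each framework lives in its own image, swapping frameworks changes only the image argument, not the host setup.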


THE POWER TO RUN MULTIPLE FRAMEWORKS AT ONCE

NVIDIA® DGX-1™

Containerized Applications (each container stack: framework, Tuned SW, CUDA RT, NVIDIA Docker):

TF

CNTK (Microsoft Cognitive Toolkit)

Caffe2

PyTorch

Other Frameworks and Apps...

Linux Kernel + CUDA Driver

Container Images portable across new driver versions


DGX-1: 96X FASTER THAN CPU

DGX-1: 7.4 hours (96X)

8-way GPU Server: 18 hours (40X)

Dual Socket CPU: 711 hours (1X)

Workload: ResNet50, 90 epochs to solution | CPU Server: Dual Xeon E5-2699 v4, 2.6GHz
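
The speedup factors follow directly from the time-to-solution figures. The short sketch below recomputes them from the hours on the chart; it uses only numbers from the slide, and the chart's 40X appears to be the 8-way GPU server ratio rounded up.

```python
# Time-to-solution figures from the chart (hours, ResNet50, 90 epochs).
hours = {
    "Dual Socket CPU": 711,
    "8-way GPU Server": 18,
    "DGX-1": 7.4,
}

# Speedup is the CPU baseline time divided by each system's time.
baseline = hours["Dual Socket CPU"]
speedups = {system: baseline / h for system, h in hours.items()}

for system, factor in speedups.items():
    print(f"{system}: {factor:.1f}x faster than the CPU baseline")
```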


RIKEN SUCCESS STORY

CHALLENGE

Enterprises and research organizations embracing AI/DL

Needed to accelerate research in areas including medicine, manufacturing and healthcare

Conventional HPC architectures too costly and inefficient

Fujitsu and NVIDIA Build AI Supercomputer With 24 DGX-1s

SOLUTION

Partnered with Fujitsu for scale-out AI architecture built on DGX-1

24 DGX-1s deliver 4 petaflops powering the RIKEN supercomputer

NVIDIA COSMOS streamlines AI researcher workflow, helping accelerate RIKEN productivity

IMPACT

Accelerated real-world implementation of scale-out AI

Enables RIKEN team to take advantage of next-gen DL algorithms

Helping create a future in which AI finds solutions to societal issues


BENEVOLENTAI: TRAINING REDUCED TO DAYS

Technology Review Article on DGX-1:

The Pint-Sized Supercomputer That Companies Are Scrambling to Get

https://www.technologyreview.com/s/603075/the-pint-sized-supercomputer-that-companies-are-scrambling-to-get/

“The cost of renting enough servers on Amazon Web Services would surpass the system’s $129,000 price tag within a year.”

-Jackie Hunter, CEO, BenevolentAI
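
The quote implies a lower bound on the yearly cloud bill. As a back-of-the-envelope sketch, assuming round-the-clock use for 365 days (an assumption not stated in the source), the hourly rental rate at which cloud spending matches the $129,000 price within one year works out as follows.

```python
SYSTEM_PRICE_USD = 129_000   # price quoted in the article
HOURS_PER_YEAR = 365 * 24    # assumes 24/7 utilization (not from the source)

# Hourly rate at which one year of renting costs as much as the system.
break_even_rate = SYSTEM_PRICE_USD / HOURS_PER_YEAR
print(f"Break-even rental rate: ${break_even_rate:.2f} per hour")
```

At lower utilization the break-even hourly rate is proportionally higher, so the comparison depends heavily on how busy the system is kept.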

NVIDIA DGX-1 vs. Other GPU System

3x-4x FASTER TRAINING

SYSTEM INSTALLATION, TRAINING MODELS

DGX-1: Days

Other GPU System: Weeks of Training


INTRODUCING NVIDIA DGX STATION

Groundbreaking AI – at your desk

The Personal AI Supercomputer for Researchers and Data Scientists

Revolutionary form factor, designed for the desk, whisper-quiet

Start experimenting in hours, not weeks, powered by DGX Stack

Productivity that goes from desk to data center to cloud

Breakthrough performance and precision – powered by Volta


3X FASTER THAN THE FASTEST WORKSTATIONS

Supercomputing performance at your desk

Water-cooled performance – the only workstation built on 4 Tesla V100s

3X the performance of today’s fastest GPU workstations

with 30% faster training over non-DGX stack solutions

5X increase in I/O performance with 4-way next generation NVLink vs. PCIe-connected GPUs

480 TFLOPS

30%

5X

3X


ANNOUNCING NVIDIA GPU CLOUD

GPU-ACCELERATED CLOUD PLATFORM OPTIMIZED FOR DEEP LEARNING

Containerized in NVDocker

Optimization across the full stack

Always up-to-date

Fully tested and maintained by NVIDIA

Registry of Containers, Datasets, and Pre-trained models

NVIDIA GPU CLOUD

CSPs


DGX CLOUD SERVICES

Detailed Feature Walkthrough


DGX CLOUD SERVICES

Management Workflows

Plug-in, Power up

Authenticate appliances

View appliances

On-premises setup

Cloud portal connectivity

Cluster / Node Management

Node events (cloud connect, change of master, IP, etc.)

Node state (connect, disconnect, ready, faulty)

Software Management

Recovery ISO Image

Factory PXE Boot Image

Container updates


DGX CLOUD SERVICES

Management Workflows

Container Management

Application / Job Scheduling: Job Execution, Job Status

Metrics / Notifications: Hardware Health; GPU, CPU, RAM Util.; Alerts

System Updates: System Image, System Software, Docker Image


DL FRAMEWORK WORKFLOW - NGC

DGX Management in 7 Easy Steps

User Mgmt: Create Accounts

User Mapping: Assign Users to Projects

Container Repo: Pull/Push Frameworks to Node(s)

Config Job Resources: Assign GPU/CPU/RAM

Review / Submit Job: Ready / Warnings

Job Mgmt: Status / Detail / Clone

Schedule: FIFO Scheduler

1 2 3 4 5 6 7

BUILD MANAGE SCALE
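
To make the FIFO scheduling step concrete, here is a small sketch of a first-in, first-out GPU job queue. It is not NVIDIA's scheduler: the `Job` and `FifoScheduler` names, the GPU-only resource model, and the strict head-of-line blocking are all assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class Job:
    name: str
    gpus: int  # GPUs requested for the job


class FifoScheduler:
    """Hypothetical FIFO scheduler: jobs start strictly in submission order."""

    def __init__(self, total_gpus: int) -> None:
        self.free_gpus = total_gpus
        self.queue: Deque[Job] = deque()
        self.running: List[Job] = []

    def submit(self, job: Job) -> None:
        """Add a job to the back of the queue."""
        self.queue.append(job)

    def schedule(self) -> List[Job]:
        """Start queued jobs in order until the head job does not fit."""
        started = []
        while self.queue and self.queue[0].gpus <= self.free_gpus:
            job = self.queue.popleft()
            self.free_gpus -= job.gpus
            self.running.append(job)
            started.append(job)
        return started

    def finish(self, job: Job) -> None:
        """Release the GPUs held by a finished job."""
        self.running.remove(job)
        self.free_gpus += job.gpus


scheduler = FifoScheduler(total_gpus=8)
for job in (Job("train-a", 4), Job("train-b", 8), Job("train-c", 2)):
    scheduler.submit(job)

first_batch = [j.name for j in scheduler.schedule()]
print(first_batch)
```

In this sketch a small job never jumps ahead of a larger one waiting at the head of the queue, which keeps ordering predictable at the cost of some idle GPUs.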


NVIDIA DGX SYSTEMS

Faster AI Innovation and Insight

The World’s First Portfolio of Purpose-Built AI Supercomputers

Powered by NVIDIA GPU Cloud

Get Started in AI – Faster

Effortless Productivity

Performance Without Compromise

For More Information: nvidia.com/dgx-systems
