alison b lowndes · the blueprint for ai power and scale using dgx a100 infused with the expertise...
![Page 1: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/1.jpg)
ALISON B LOWNDES
AI DevRel | EMEA
@alisonblowndes
June 2020
![Page 2: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/2.jpg)
2
AGENDA
INTRO TO NVIDIA: Training & deployment
RAPIDS: Accelerating the data science
ROBOTICS & SIMULATION: The hardware, the software & the environments
WRAPUP + Q&A
![Page 3: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/3.jpg)
3
NVIDIA AI BREAKTHROUGHS
IN GRAPHICS
PROJECT SOL: A Showcase for the Power of NVIDIA RTX
MINECRAFT RTX: Real-time Ray Tracing in the World’s Most Popular Game
OMNIVERSE: A Powerful Collaboration Platform for 3D Design
NASA MARS LANDER: Visualizing NASA’s Supercomputer Simulations
![Page 4: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/4.jpg)
4
“AMPERE” NVIDIA A100
20X Volta
54B XTOR | 826 mm² | TSMC 7N | 40 GB Samsung HBM2 | 600 GB/s NVLink
Metric | Peak | vs. Volta
FP32 TRAINING | 312 TFLOPS | 20X
INT8 INFERENCE | 1,248 TOPS | 20X
FP64 HPC | 19.5 TFLOPS | 2.5X
MULTI INSTANCE GPU | 7X GPUs
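The "vs. Volta" multipliers can be inverted to see the Volta baseline that each peak figure implies. This is a minimal sketch using only the numbers on this slide; the dictionary layout and variable names are illustrative.

```python
# Peak A100 figures and their "vs. Volta" multipliers, as listed on the slide.
a100_peaks = {
    "FP32 training (TFLOPS)": (312.0, 20.0),
    "INT8 inference (TOPS)": (1248.0, 20.0),
    "FP64 HPC (TFLOPS)": (19.5, 2.5),
}

# Dividing each peak by its multiplier gives the implied Volta baseline.
implied_volta = {
    name: peak / factor for name, (peak, factor) in a100_peaks.items()
}

for name, baseline in implied_volta.items():
    print(f"{name}: implied Volta baseline = {baseline:.1f}")
```

The implied baselines come out at roughly 15.6 TFLOPS, 62.4 TOPS, and 7.8 TFLOPS respectively.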
![Page 5: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/5.jpg)
5
25 YEARS OF ACCELERATED COMPUTING
X-FACTOR SPEED UP | FULL STACK | ONE ARCHITECTURE | SYSTEMS
GPU
CPU
![Page 6: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/6.jpg)
6
25 YEARS OF ACCELERATED COMPUTING
X-FACTOR SPEED UP | FULL STACK | DATA-CENTER SCALE
GPU
CPU
DPU
ONE ARCHITECTURE
![Page 7: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/7.jpg)
7
Original NVIDIA Campus
NVIDIA Endeavor (2017)
New SaturnV Datacenter (2020)
NVIDIA Voyager (2020)
![Page 8: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/8.jpg)
8
10.5 MW
45,000 Sq. Ft.
![Page 9: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/9.jpg)
9
NVIDIA DGX SUPERPOD WITH DGX A100
Unmatched Data Center Scalability: Deployed in Under 3 Weeks
Leadership-class AI infrastructure
The blueprint for AI power and scale using DGX A100
Infused with the expertise of NVIDIA’s AI practitioners
Designed to solve the previously unsolvable
Configurations start at 20 systems
NVIDIA DGX SuperPOD deployed in SATURNV
1,120 A100 GPUs
140 DGX A100 systems
170 Mellanox 200G HDR switches
4 PB of high-performance storage
700 PFLOPS of power to train the previously impossible
nvidia.com/en-us/data-center/dgx-a100/
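The SATURNV figures can be cross-checked with a little arithmetic. The sketch below assumes 8 GPUs per DGX A100 and a per-GPU peak of 624 TFLOPS (the published FP16 Tensor Core figure with sparsity); neither number appears on the slide, so treat both as assumptions.

```python
# Figures taken from the slide.
dgx_systems = 140
gpus_on_slide = 1120
pflops_on_slide = 700

# Assumption (not on the slide): each DGX A100 system holds 8 A100 GPUs.
GPUS_PER_DGX_A100 = 8
# Assumption (not on the slide): 624 TFLOPS peak per A100,
# the FP16 Tensor Core figure with structured sparsity.
PEAK_TFLOPS_PER_GPU = 624

total_gpus = dgx_systems * GPUS_PER_DGX_A100
total_pflops = total_gpus * PEAK_TFLOPS_PER_GPU / 1000

print(total_gpus)              # 1120
print(round(total_pflops, 2))  # 698.88
```

Under these assumptions the GPU count matches the slide exactly, and the aggregate peak rounds to the quoted 700 PFLOPS.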
![Page 10: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/10.jpg)
10
AI Infra on DGX POD: What are customers asking for?
Provisioning: OS provisioning, BMaaS, network assignment
Cluster CM: SW deployment, updates & upgrades
Sys Monitoring & Reporting: system usage, health checks & alerting
Dataset Management: storage, tagging, & versioning of datasets
Interactive Notebooks: notebooks w/ schedulable GPU resources
Experiment Management: job & results tracking
GUI/CLI: Portal/CLI/API for requesting resources
Model Deployment: deployment to prod, inference services, etc.
User Management: auth, users, teams, & resource restrictions
System Administrator
Data Scientist/Researcher
![Page 11: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/11.jpg)
11
![Page 12: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/12.jpg)
12
5 MIRACLES OF A100
Ampere: World’s Largest 7nm Chip. 54B XTORS, HBM2
3rd Gen NVLINK and NVSWITCH: Efficient Scaling to Enable Super GPU. 2X More Bandwidth
3rd Gen Tensor Cores: Faster, Flexible, Easier to Use. 20x AI Perf with TF32
New Sparsity Acceleration: Harness Sparsity in AI Models. 2x AI Performance
New Multi-Instance GPU: Optimal Utilization with Right-Sized GPU. 7x Simultaneous Instances per GPU
![Page 13: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/13.jpg)
13
NEW MULTI-INSTANCE GPU (MIG)
Optimize GPU Utilization, Expand Access to More Users with Guaranteed Quality of Service
nvidia.com/en-us/technologies/multi-instance-gpu/
Up To 7 GPU Instances In a Single A100: Dedicated SM, Memory, L2 cache, Bandwidth for hardware QoS & isolation
Simultaneous Workload Execution With Guaranteed Quality Of Service: All MIG instances run in parallel with predictable throughput & latency
Right Sized GPU Allocation: Different sized MIG instances based on target workloads
Flexibility to run any type of workload on a MIG instance
Diverse Deployment Environments: Supported with Bare metal, Docker, Kubernetes, Virtualized Env.
[Figure: a single A100 partitioned into seven MIG instances, each drawn as a GPU block with its own GPU Mem block; workload label: Amber]
https://blogs.nvidia.com/blog/2020/05/14/multi-instance-gpus/
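"Right sized GPU allocation" can be pictured as fitting requested instance sizes into one GPU's slice budget. The sketch below is a simplified illustration: the profile names and slice counts are assumptions taken from NVIDIA's public MIG documentation for the 40 GB A100, not from this slide, and the check compares totals only, ignoring the real placement rules.

```python
# Assumption (not on the slide): MIG profiles for a 40 GB A100,
# given as (compute slices, memory slices).
MIG_PROFILES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}
MAX_COMPUTE_SLICES = 7
MAX_MEMORY_SLICES = 8


def fits_on_one_gpu(requested):
    """Return True if the requested profiles fit one GPU's slice budget.

    Simplified: compares slice totals only and ignores placement constraints.
    """
    compute = sum(MIG_PROFILES[name][0] for name in requested)
    memory = sum(MIG_PROFILES[name][1] for name in requested)
    return compute <= MAX_COMPUTE_SLICES and memory <= MAX_MEMORY_SLICES


seven_small = ["1g.5gb"] * 7
mixed = ["3g.20gb", "2g.10gb", "1g.5gb", "1g.5gb"]
too_big = ["4g.20gb", "4g.20gb"]

print(fits_on_one_gpu(seven_small))  # True
print(fits_on_one_gpu(mixed))        # True
print(fits_on_one_gpu(too_big))      # False
```

A real scheduler would also have to respect which slice positions each profile may occupy, which this totals-only check does not model.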
![Page 14: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/14.jpg)
14
BECAUSE MODEL DEVELOPMENT IS JUST THE FIRST STEP
Develop and Test Locally
Package: Dependencies, Parameters, Run scripts, Build
Scale-out: Load-balance, Data partitions, Model distribution, AutoML
Tune: Parallelism, GPU support, Query tuning, Caching
Instrument: Monitoring, Logging, Versioning, Security
Automate: CI/CD, Workflows, Rolling upgrades, A/B testing
Weeks: with one data scientist or developer
Months: with a large team of developers, scientists, data engineers and DevOps
Production
![Page 15: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/15.jpg)
15
![Page 16: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/16.jpg)
16
![Page 17: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/17.jpg)
17
![Page 18: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/18.jpg)
![Page 19: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/19.jpg)
19
![Page 20: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/20.jpg)
20
AI IS NOT MAGIC
![Page 21: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/21.jpg)
Definitions
![Page 22: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/22.jpg)
22
BUILDING AN AI MODEL
[Diagram labels: DATA, FEATURES, AI MODEL, DEPLOYMENT, DATA ANALYTICS, MACHINE LEARNING, MODEL VALIDATION, NEW DATA]
![Page 23: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/23.jpg)
23
BUILDING AN AI PRODUCT
[Diagram labels: SENSORS, PERCEIVE, REASON, PLAN, ACTUATORS, DATA, DATA ANALYTICS, MACHINE LEARNING, AI MODEL, VALIDATION]
![Page 24: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/24.jpg)
24
DAY IN THE LIFE OF A DATA SCIENTIST
Pipeline stages: Dataset Collection, Analysis, Data Prep, Train, Inference
GPU-POWERED WORKFLOW (clock diagram)
Dataset Downloads Overnight
Start
GET A COFFEE
Train Model
Validate
Test Model
Experiment with Optimizations and Repeat
Go Home on Time
CPU-POWERED WORKFLOW (clock diagram)
Dataset Downloads Overnight
Configure Data Prep Workflow
GET A COFFEE
Start Data Prep Workflow
GET A COFFEE
Find Unexpected Null Values Stored as String…
Restart Data Prep Workflow
@*#! Forgot to Add a Feature
ANOTHER… GET A COFFEE
Restart Data Prep Workflow Again
Switch to Decaf
Stay Late
![Page 25: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/25.jpg)
25
NVIDIA Nsight Systems
System-Wide Profiling Tool
• Balance your workload across multiple CPUs and GPUs
• Locate idle CPU and GPU time
• Locate redundant synchronizations
• Locate optimization opportunities
• Improve your application’s performance
![Page 26: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/26.jpg)
26
Processes and threads
CUDA and OpenGL API trace
Multi-GPU
Kernel and memory transfer activities
cuDNN and cuBLAS trace
Thread/core migration
Thread state
![Page 28: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/28.jpg)
28
IMAGE BASED DL IS EASY
Object detection Semantic Segmentation
Figures copyright Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun, 2015. [Faster R-CNN]
Figures copyright Preferred Networks Inc., 2016.
![Page 29: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/29.jpg)
29
Numerous applications
3D DL IS EXCITING
Simulation Medical imaging Autonomous driving
Manipulation Robotics Augmented reality
![Page 30: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/30.jpg)
30
KAOLIN
- A PyTorch library for 3D DL
- Supports a wide range of 3D data representations
- Convenient dataloading/preprocessing/conversions
- Large collection of 3D neural nets to choose from
- Optimized implementations
- Omniverse-Kit integration for easy rendering, interactive visualization, and much more.
https://gitlab-master.nvidia.com/Toronto_DL_Lab/kaolin
![Page 32: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/32.jpg)
32
![Page 33: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/33.jpg)
33
World Sense See, Understand Automation
AI Program
Computer
ARTIFICIAL INTELLIGENCE IS DOMAIN SPECIFIC
Self-Driving
![Page 34: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/34.jpg)
34
World Sense See, Understand Automation
AI Program
Computer
AI Program
Computer
ARTIFICIAL INTELLIGENCE IS DOMAIN SPECIFIC
Self-Driving
Manufacturing
![Page 35: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/35.jpg)
35
World Sense See, Understand Automation
AI Program
Computer
AI Program
Computer
AI Program
Computer
ARTIFICIAL INTELLIGENCE IS DOMAIN SPECIFIC
Self-Driving
Manufacturing
Radiology
![Page 36: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/36.jpg)
36
![Page 37: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/37.jpg)
37
RAPIDS
GPU POWERED MACHINE LEARNING
Miguel Martínez – Sr. Data Scientist @ NVIDIA
![Page 38: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/38.jpg)
38
WHAT IS RAPIDS
![Page 39: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/39.jpg)
39
GPU Accelerated Data Science
RAPIDS is a set of open source software libraries that
give you the freedom to execute end-to-end data science
and analytics pipelines entirely on GPUs.
www.rapids.ai
![Page 40: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/40.jpg)
40
Open Source Data Science Ecosystem
Familiar Python APIs
Stages: Data Preparation, Model Training, Visualization
Dask
Pandas: Analytics
Scikit-Learn: Machine Learning
NetworkX: Graph Analytics
PyTorch, MXNet…: Deep Learning
Matplotlib/Plotly: Visualization
CPU Memory
![Page 41: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/41.jpg)
41
End-to-End Accelerated GPU Data Science
Stages: Data Preparation, Model Training, Visualization
Dask
cuDF: Analytics
cuML: Machine Learning
cuGraph: Graph Analytics
PyTorch, MXNet…: Deep Learning
cuXFilter <> pyViz: Visualization
GPU Memory
![Page 42: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/42.jpg)
42
cuDF
• GPU-accelerated data preparation and feature engineering
• Python drop-in Pandas replacement
cuML
• GPU-accelerated traditional machine learning libraries
• XGBoost, PCA, Kalman, K-means, k-NN, DBScan, tSVD…
cuGraph
• GPU-accelerated graph analytics libraries
cuXfilter
• Web Data Visualization library
• DataFrame kept in GPU-memory throughout the session
![Page 43: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/43.jpg)
43
LEARNING FROM
Without a shared format (Pandas, Spark, Drill, Impala, Parquet, Cassandra, Kudu, HBase; every exchange between systems is a Copy & Convert):
• Each system has its own internal memory format
• Similar functionality implemented in multiple projects
• 70-80% computation wasted on serialization & deserialization
With Arrow Memory (the same systems sharing one in-memory format):
• All systems utilize the same memory format
• Projects can share functionality
• No overhead for cross-system communication
![Page 44: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/44.jpg)
44
APACHE ARROW
Columnar layout leverages GPU strengths
Emphasis on zero-copy and shallow-copy operations minimizes a core bottleneck
Consistency with CPU version simplifies development and conversion
gdf['session_id']
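Why a columnar layout makes a column lookup like the one above cheap can be illustrated with the standard library alone. This is a conceptual sketch, not Arrow itself: the records, the `duration_s` column, and the use of `array` buffers are all hypothetical stand-ins for Arrow's typed column buffers.

```python
from array import array

# Hypothetical sample records in a row-oriented layout.
rows = [
    {"session_id": 101, "duration_s": 12.5},
    {"session_id": 102, "duration_s": 3.0},
    {"session_id": 103, "duration_s": 7.25},
]

# The same data in a column-oriented layout:
# one contiguous, typed buffer per column.
columns = {
    "session_id": array("q", (r["session_id"] for r in rows)),
    "duration_s": array("d", (r["duration_s"] for r in rows)),
}

# Row layout: selecting one column has to visit every record.
session_ids_from_rows = [r["session_id"] for r in rows]

# Column layout: selecting a column is a single lookup of an existing buffer,
# and a memoryview exposes that buffer without copying it.
session_ids = columns["session_id"]
view = memoryview(session_ids)

print(session_ids_from_rows == list(session_ids))  # True
print(view.nbytes)  # 24 (three 8-byte integers)
```

Because the view shares memory with the column, a write through one is visible through the other, which is the zero-copy property the slide emphasizes.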
![Page 45: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/45.jpg)
45
Why OpenUCX?
Bringing Hardware Accelerated Communications to Dask
• TCP sockets are slow!
• UCX provides uniform access to transports:
– TCP, InfiniBand, Shared memory, NVLink
• Alpha Python bindings for UCX (ucx-py)
• Provides best communication performance to Dask, based on available hardware on nodes/cluster
![Page 46: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/46.jpg)
46
BENCHMARKS
Distributed cuDF Random Merge
Environment
• cuDF v0.11
• UCX-PY 0.11
• Running on NVIDIA DGX-2:
• GPU NVIDIA Tesla V100 32GB
• CPU Intel(R) Xeon(R) CPU 8168 @ 2.70GHz
Benchmark Setup
• DataFrames: Left/Right 1x int64 key column, 1x int64 value column
• Inner Merge
• 30% of matching data balanced across each partition
![Page 47: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/47.jpg)
47
cuDF
![Page 48: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/48.jpg)
48
GPU-Accelerated ETL
The average data scientist spends 90+% of their time in ETL, as opposed to training models
![Page 49: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/49.jpg)
49
EXTRACTION IS THE CORNERSTONE
cuDF for Faster Data Loading
• Follow Pandas APIs and provide >10x speedup
– CSV Reader/Writer
– Parquet Reader/Writer
– ORC Reader/Writer
– JSON Reader
– Avro Reader
• GPU Direct Storage integration in progress for bypassing PCIe bottlenecks!
• Key is GPU-accelerating both parsing and decompression wherever possible
![Page 50: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/50.jpg)
50
ETL Technology Stack
Python: Dask cuDF, cuDF, Pandas
Cython
cuDF C++
CUDA Libraries: Thrust, Cub, Jitify
CUDA
![Page 51: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/51.jpg)
51
ETL – THE BACKBONE OF DATA SCIENCE
libcuDF is… (CUDA C++ Library)
• Low level library containing function implementations and C/C++ API
• Importing/exporting Apache Arrow in GPU memory using CUDA IPC
• CUDA kernels to perform element-wise math operations on GPU DataFrame columns
• CUDA sort, join, groupby, reduction, etc. operations on GPU DataFrames
cuDF is… (Python Library)
• A Python library for manipulating GPU DataFrames following the Pandas API
• Python interface to CUDA C++ library with additional functionality
• Create GPU DataFrames from Numpy arrays, Pandas DataFrames, and PyArrow Tables
• JIT compilation of User-Defined Functions (UDFs) using Numba
![Page 52: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/52.jpg)
52
BENCHMARKS
Single-GPU Speedup vs Pandas
Environment
• cuDF v0.13
• Pandas v0.25.3
• GPU NVIDIA Tesla V100 32GB
• CPU Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
Benchmark Setup
• DataFrames: 2x int32 key columns, 3x int32 value columns
• Inner Merge
• GroupBy: count, sum, min, max calculated for each value column
[Bar chart: GPU Speedup over CPU for Merge, Sort, GroupBy; series: 10M and 100M rows; data labels as extracted: 500, 240, 220, 970, 360, 290]
![Page 53: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/53.jpg)
53
BENCHMARKS
Continuous Improvement
[Two bar charts: GPU Speedup over CPU for Merge, Sort, GroupBy; series: 10M and 100M rows; panels: cuDF v0.10 and cuDF v0.13]
Data labels as extracted, first chart: 500, 240, 220, 970, 360, 290
Data labels as extracted, second chart: 430, 140, 28, 870, 300, 150
![Page 54: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/54.jpg)
54
LOADING DATA INTO A GPU DATAFRAME
cuDF code examples
Create an empty DataFrame, and add a column
Create a DataFrame with two columns
Load a CSV file into a GPU DataFrame
Use Pandas to load a CSV file, and copy its content into a GPU DataFrame
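The slide's code did not survive in the transcript, so the four captions above are sketched here under stated assumptions. The sample data and variable names are hypothetical, an in-memory buffer stands in for a CSV file path, and the code falls back to Pandas when cuDF is not installed so that it still runs on a machine without a GPU.

```python
import io

import pandas as pd

try:
    import cudf as xdf  # GPU DataFrames; needs an NVIDIA GPU and a RAPIDS install
    ON_GPU = True
except ImportError:
    xdf = pd  # CPU fallback so the sketch still runs without a GPU
    ON_GPU = False

# Hypothetical sample data standing in for a CSV file on disk.
CSV_BYTES = b"key,value\n0,10.0\n1,11.0\n2,12.0\n"

# Create an empty DataFrame, and add a column.
df = xdf.DataFrame()
df["key"] = [0, 1, 2, 3, 4]

# Create a DataFrame with two columns.
df2 = xdf.DataFrame({"key": [0, 1, 2], "value": [10.0, 11.0, 12.0]})

# Load CSV data into a DataFrame (on the GPU when cuDF is available).
df3 = xdf.read_csv(io.BytesIO(CSV_BYTES))

# Use Pandas to load the CSV, then copy its content into a GPU DataFrame.
pdf = pd.read_csv(io.BytesIO(CSV_BYTES))
df4 = xdf.from_pandas(pdf) if ON_GPU else pdf

print(len(df), len(df2), len(df3), len(df4))
```

Aliasing the module as `xdf` is only a convenience for this sketch; it works because cuDF follows the Pandas API for these calls.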
![Page 55: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/55.jpg)
55
WORKING WITH GPU DATAFRAMES
cuDF code examples
Return the first three rows as a new DataFrame
Row slicing with column selection
Find the mean and standard deviation of a column
Count number of occurrences per value, and number of unique values
Transform column values with a custom function
Change the data type of a column
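As the original code for these captions is missing from the transcript, here is a hedged sketch of each operation. The data is hypothetical and the code falls back to Pandas when cuDF is unavailable. Note that `Series.apply` is assumed for the custom-function step: it exists in Pandas and in recent cuDF releases, while older cuDF releases used `applymap` instead.

```python
try:
    import cudf as xdf  # GPU DataFrames; needs an NVIDIA GPU and a RAPIDS install
except ImportError:
    import pandas as xdf  # CPU fallback so the sketch still runs without a GPU

# Hypothetical sample data.
df = xdf.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [1.0, 1.5, 1.5, 3.0, 3.0],
})

# Return the first three rows as a new DataFrame.
first_three = df.head(3)

# Row slicing with column selection (label-based, so rows 2 through 4 inclusive).
subset = df.loc[2:4, ["a"]]

# Find the mean and standard deviation of a column.
mean_a = df["a"].mean()
std_a = df["a"].std()

# Count number of occurrences per value, and number of unique values.
counts = df["b"].value_counts()
n_unique = df["b"].nunique()

# Transform column values with a custom function.
shifted = df["a"].apply(lambda x: x + 10)

# Change the data type of a column.
df["a"] = df["a"].astype("float32")

print(mean_a, n_unique, df["a"].dtype)
```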
![Page 56: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/56.jpg)
56
QUERY, SORT, GROUP, JOIN, …
cuDF code examples
Query a DataFrame with a boolean expression
Return the first ‘n’ rows ordered by ‘columns’
Sort a column by its values
One-hot encoding
Group by column with aggregate function
Join and merge DataFrames
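The six operations listed above can be sketched as follows. The slide's own code is not in the transcript, so the data and names here are hypothetical, and Pandas is used as a fallback when cuDF is not installed.

```python
try:
    import cudf as xdf  # GPU DataFrames; needs an NVIDIA GPU and a RAPIDS install
except ImportError:
    import pandas as xdf  # CPU fallback so the sketch still runs without a GPU

# Hypothetical sample data.
df = xdf.DataFrame({
    "key": [0, 1, 0, 1, 2],
    "value": [5.0, 1.0, 3.0, 4.0, 2.0],
})

# Query a DataFrame with a boolean expression.
big = df.query("value > 2.5")

# Return the first 'n' rows ordered by 'columns' (largest values first).
top2 = df.nlargest(2, "value")

# Sort a column by its values.
ordered = df.sort_values("value")

# One-hot encoding of the key column.
encoded = xdf.get_dummies(df, columns=["key"])

# Group by column with aggregate function.
totals = df.groupby("key").agg({"value": "sum"})

# Join and merge DataFrames on a shared key.
labels = xdf.DataFrame({"key": [0, 1, 2], "label": ["x", "y", "z"]})
merged = df.merge(labels, on="key", how="inner")

print(len(big), len(totals), len(merged))
```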
![Page 57: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/57.jpg)
57
cuML
![Page 58: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/58.jpg)
58
ML Technology Stack
Python: cuDF, Dask cuDF, Dask cuML, Numpy
Cython
cuML Algorithms
cuML Prims
CUDA Libraries: Thrust, Cub, cuSolver, nvGraph, CUTLASS, cuSparse, cuRand, cuBlas
CUDA
![Page 59: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/59.jpg)
59
PRINCIPAL COMPONENT ANALYSIS (PCA)
CPU vs GPU
Training results
CPU: 57.1 seconds
GPU: 4.28 seconds
System: AWS p3.8xlarge
CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 32 vCPU cores, 244 GB RAM
GPU: Tesla V100 SXM2 16GB
CPU code: import CPU algorithm (specific); data loading and algo params (common); model training (common)
GPU code: import GPU algorithm (specific); data loading and algo params (common); DataFrame from Pandas to GPU (specific); model training (common)
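The two PCA training times on this slide translate into a speedup factor with one division. A minimal sketch using only the slide's numbers; the variable names are illustrative.

```python
# PCA training times from the slide, in seconds.
cpu_seconds = 57.1
gpu_seconds = 4.28

# Speedup is the ratio of CPU time to GPU time.
speedup = cpu_seconds / gpu_seconds

print(f"GPU speedup: {speedup:.1f}x")  # GPU speedup: 13.3x
```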
![Page 60: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/60.jpg)
60
cuML roadmap
March 2020 – RAPIDS 0.13
cuML | Single-GPU | Multi-GPU | Multi-Node Multi-GPU
Gradient Boosted Decision Trees
Linear Regression
Logistic Regression
Random Forest
K-Means
K-NN
DBSCAN
UMAP
ARIMA & Holt-Winters
Kalman Filter
t-SNE
Principal Components
Singular Value Decomposition
SVM
![Page 61: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/61.jpg)
61
cuML roadmap
2020 – RAPIDS 1.0
cuML | Single-GPU | Multi-GPU | Multi-Node Multi-GPU
Gradient Boosted Decision Trees
Linear Regression
Logistic Regression
Random Forest
K-Means
K-NN
DBSCAN
UMAP
ARIMA & Holt-Winters
Kalman Filter
t-SNE
Principal Components
Singular Value Decomposition
SVM
![Page 62: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/62.jpg)
62
BENCHMARKS
Benchmark: 200GB CSV dataset; data preparation includes joins and variable transformations.
CPU Cluster Configuration: CPU nodes (61 GiB of memory, 8 vCPUs, 64-bit platform), Apache Spark
DGX Cluster Configuration: 5x DGX-1 on InfiniBand network
Time in seconds (shorter is better)
Configuration | cuDF – Load and Data Prep | cuML – XGBoost | End-to-End
20 CPU Nodes | 2,741 | 2,290 | 8,762
30 CPU Nodes | 1,675 | 1,956 | 6,148
50 CPU Nodes | 715 | 1,999 | 3,925
100 CPU Nodes | 379 | 1,948 | 3,221
DGX-2 | 37 | 147 | 209
5x DGX-1 | 17 | 137 | 164
End-to-End chart legend: cuDF (Load and Data Preparation), Data Conversion, XGBoost
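As a rough reading of the benchmark figures above, the short Python sketch below computes how many times faster each DGX configuration is than the 100-node CPU cluster. The assignment of each number series to a panel (data prep, XGBoost, end-to-end) is an assumption inferred from the panel titles and axis ranges, and the `TIMES` table and `speedup` function are illustrative names, not part of RAPIDS.

```python
# Times in seconds, as read from the benchmark slide (shorter is better).
# Assumption: which series belongs to which panel is inferred from the
# slide's titles and axis ranges, not stated explicitly in the transcript.
TIMES = {
    "cuDF load and data prep": {"100 CPU Nodes": 379, "DGX-2": 37, "5x DGX-1": 17},
    "XGBoost": {"100 CPU Nodes": 1948, "DGX-2": 147, "5x DGX-1": 137},
    "End-to-end": {"100 CPU Nodes": 3221, "DGX-2": 209, "5x DGX-1": 164},
}


def speedup(stage, baseline, system):
    """Return how many times faster `system` is than `baseline` for a stage."""
    times = TIMES[stage]
    return times[baseline] / times[system]


for stage in TIMES:
    for system in ("DGX-2", "5x DGX-1"):
        ratio = speedup(stage, "100 CPU Nodes", system)
        print(f"{stage}: {system} is {ratio:.1f}x faster than 100 CPU nodes")
```

Under that reading, the end-to-end run on 5x DGX-1 is roughly 20 times faster than the largest CPU cluster in the comparison.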
![Page 63: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/63.jpg)
63
cuGraph
![Page 64: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/64.jpg)
64
Graph Technology Stack
Python
Cython
cuGraph Algorithms
Prims | cuGraphBLAS | cuHornet | Gunrock*
CUDA Libraries: Thrust, Cub, cuSolver, cuSparse, cuRand
CUDA
cuDF | Dask cuDF | Dask cuML | NumPy
* Gunrock is from UC Davis
![Page 65: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/65.jpg)
65
GOALS AND BENEFITS OF CUGRAPH
Focus on Features and User Experience
Breakthrough Performance
• Up to 500 million edges on a single 32GB GPU
• Multi-GPU support for scaling into the billions of edges
Multiple APIs
• Python: Familiar NetworkX-like API
• C/C++: lower-level granular control for application developers
Seamless Integration with cuDF & cuML
• Property Graph support via DataFrames
Growing Functionality
• Extensive collection of algorithm, primitive, and utility functions
![Page 66: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/66.jpg)
66
Louvain Single Run
Returns:
cudf.DataFrame with two named columns:
- louvain["vertex"]: The vertex id.
- louvain["partition"]: The assigned partition.
G = cugraph.Graph()
G.add_edge_list(gdf["src_0"], gdf["dst_0"], gdf["data"])
df, mod = cugraph.nvLouvain(G)
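The second value returned by the call above, `mod`, is presumably the modularity score that Louvain tries to maximize (an assumption; the slide does not say so). To make that quantity concrete, here is a minimal CPU-only sketch in plain Python that computes modularity for a given partition of a small undirected, unweighted graph without self-loops. It does not use cuGraph; the `modularity` function and the toy graph are illustrative only.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Modularity Q of an undirected, unweighted graph for a given partition.

    edges: list of (u, v) pairs, each undirected edge listed once.
    partition: dict mapping every vertex to its community id.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if partition[u] == partition[v]:
            internal[partition[u]] += 1

    total_degree = defaultdict(int)  # sum of vertex degrees per community
    for vertex, d in degree.items():
        total_degree[partition[vertex]] += d

    # Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(f"modularity = {modularity(edges, partition):.3f}")
```

Placing each triangle in its own community gives a modularity of 5/14, about 0.357; putting every vertex in one community gives 0.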
![Page 67: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/67.jpg)
67
BENCHMARK
Speedup vs SciPy PageRank and cyLouvain
![Page 68: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/68.jpg)
68
cuGraph roadmap
March 2020 – RAPIDS 0.13
cuGraph | Single-GPU | Multi-GPU | Multi-Node Multi-GPU
PageRank
Personalized PageRank
Katz
Betweenness Centrality
Spectral Clustering
Louvain
Ensemble Clustering for Graphs
K-Core
K-Truss
Triangle Counting
Connected Components (Weak and Strong)
Jaccard
Overlap Coefficient
Single Source Shortest Path (SSSP)
Breadth First Search (BFS)
![Page 69: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/69.jpg)
69
cuGraph roadmap
2020 – RAPIDS 1.0
cuGraph | Single-GPU | Multi-GPU | Multi-Node Multi-GPU
PageRank
Personalized PageRank
Katz
Betweenness Centrality
Spectral Clustering
Louvain
Ensemble Clustering for Graphs
K-Core
K-Truss
Triangle Counting
Connected Components (Weak and Strong)
Jaccard
Overlap Coefficient
Single Source Shortest Path (SSSP)
Breadth First Search (BFS)
![Page 70: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/70.jpg)
70
HOW TO START
![Page 71: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/71.jpg)
71
On-premises | In the cloud
Source code on GitHub: https://github.com/rapidsai
Containers on NGC & Docker Hub: https://ngc.nvidia.com
Conda packages: https://anaconda.org/rapidsai
Pascal architecture or better
CUDA 9.2, 10.0 or 10.1.2
Ubuntu 16.04/18.04, CentOS 7 & RHEL 7
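The prerequisites listed above can be expressed as a small pre-flight check. The sketch below is a hypothetical helper, not part of RAPIDS: the function names and the way the system is described are assumptions made for illustration. The one fact added beyond the slide is that Pascal GPUs correspond to CUDA compute capability 6.x.

```python
# Hypothetical pre-flight check against the requirements on the slide.
# Not a RAPIDS API; names and input format are illustrative assumptions.
SUPPORTED_CUDA = {"9.2", "10.0", "10.1.2"}
MIN_COMPUTE_CAPABILITY = (6, 0)  # Pascal GPUs are compute capability 6.x


def os_supported(name, version):
    """Ubuntu 16.04/18.04, CentOS 7 and RHEL 7 are the listed platforms."""
    name = name.lower()
    if name == "ubuntu":
        return version in {"16.04", "18.04"}
    if name in {"centos", "rhel"}:
        return version.split(".")[0] == "7"
    return False


def unmet_requirements(compute_capability, cuda_version, os_name, os_version):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(compute_capability) < MIN_COMPUTE_CAPABILITY:
        problems.append("GPU is older than the Pascal architecture")
    if cuda_version not in SUPPORTED_CUDA:
        problems.append(f"CUDA {cuda_version} is not one of the listed versions")
    if not os_supported(os_name, os_version):
        problems.append(f"{os_name} {os_version} is not a listed platform")
    return problems


# A Volta-class GPU (compute capability 7.0) on Ubuntu 18.04 with CUDA 10.1.2.
print(unmet_requirements((7, 0), "10.1.2", "Ubuntu", "18.04"))
# A Maxwell-class GPU (5.2) with an unlisted CUDA version and OS.
print(unmet_requirements((5, 2), "11.0", "Debian", "10"))
```

The list reflects the deck's June 2020 snapshot; current RAPIDS releases have different requirements.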
![Page 74: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/74.jpg)
74
LEARN MORE
![Page 76: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/76.jpg)
76
![Page 77: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/77.jpg)
ALISON B LOWNDES
AI DevRel | EMEA
![Page 78: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/78.jpg)
78
https://fortune.com/longform/ai-artificial-intelligence-big-tech-microsoft-alphabet-openai/
![Page 80: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/80.jpg)
80
![Page 82: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/82.jpg)
82
![Page 83: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/83.jpg)
83
Brain Computer Interfaces: focused on treatment for disease and dysfunction (e.g. epilepsy, depression, Parkinson's) but ultimately aiming to advance human intelligence by restoring and extending cognitive vibrancy.
“We’re either going to have to merge with AI or be left behind” (Elon Musk)
![Page 84: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/84.jpg)
84
![Page 85: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/85.jpg)
85
![Page 86: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/86.jpg)
86
![Page 87: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/87.jpg)
87
![Page 88: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/88.jpg)
88
ISAAC PLATFORM FOR ROBOTICS
NVIDIA's multi-tool for robotics
DESIGN
JETSON XAVIER
SIMULATE TRAIN DEPLOY
![Page 89: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/89.jpg)
89
CARTER — THE NAVIGATION ROBOT
High-end platform for logistics applications
Navigation Stack
3D Obstacle Detection & Avoidance
NVIDIA Jetson Xavier
![Page 90: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/90.jpg)
90
![Page 91: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/91.jpg)
91
![Page 92: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/92.jpg)
92
GTC Digital talk: S21182
![Page 93: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/93.jpg)
93
ISAAC 2020.1
“High-fidelity simulation lets us train and test algorithms more effectively, leading to more robust and adaptive networks” (A. Anandkumar, Prof CS, Caltech & NVIDIA)
![Page 94: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/94.jpg)
94
![Page 95: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/95.jpg)
95
![Page 96: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/96.jpg)
![Page 97: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/97.jpg)
97
THE JETSON FAMILY
for AI at the Edge and Autonomous System designs
Module | Power | Size | Performance
JETSON TX2 series | 7.5 - 15W* | 50mm x 87mm | 1.3 TFLOPS (FP16)
JETSON NANO | 5 - 10W | 45mm x 70mm | 0.5 TFLOPS (FP16)
JETSON AGX XAVIER series | 10 - 30W | 100mm x 87mm | 11 TFLOPS (FP16), 32 TOPS (INT8)
JETSON Xavier NX | 10 - 15W | 45mm x 70mm | 6 TFLOPS (FP16), 21 TOPS (INT8)
* TX2i: 10-20W
Tiers: Entry, Mainstream, Autonomous machines
Same software. Full specs at developer.nvidia.com/jetson
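One way to compare the modules above is peak throughput per watt. The sketch below uses the figures from the slide and assumes, for simplicity, that each peak figure is reached at the top of the module's listed power range; the slide confirms this only for Xavier NX (21 TOPS at 15W). The `JETSON` table and `per_watt` helper are illustrative names.

```python
# (module, max listed power in W, FP16 TFLOPS, INT8 TOPS or None if not listed)
# Assumption: peak throughput is paired with the top of each power range.
JETSON = [
    ("Jetson Nano", 10, 0.5, None),
    ("Jetson TX2 series", 15, 1.3, None),
    ("Jetson Xavier NX", 15, 6.0, 21),
    ("Jetson AGX Xavier series", 30, 11.0, 32),
]


def per_watt(throughput, watts):
    """Throughput per watt, or None when the slide lists no figure."""
    return None if throughput is None else throughput / watts


for name, watts, fp16, int8 in JETSON:
    fp16_eff = per_watt(fp16, watts)
    int8_eff = per_watt(int8, watts)
    int8_text = "n/a" if int8_eff is None else f"{int8_eff:.2f} TOPS/W"
    print(f"{name}: {fp16_eff:.3f} TFLOPS/W (FP16), {int8_text} (INT8)")
```

On these assumptions Xavier NX comes out at about 1.4 INT8 TOPS per watt, slightly ahead of AGX Xavier at about 1.07.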
![Page 98: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/98.jpg)
NVIDIA Jetson Xavier “NX”
21 TOPS (INT8) at 15W
8GB LPDDR4x
16GB eMMC
Supports up to 32x 1080p IP cameras
70x45mm
Module
Developer kit
![Page 99: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/99.jpg)
99
JETPACK SDK FOR AI @ THE EDGE
Sample Code
Nsight Developer Tools
Deep Learning: TensorRT, cuDNN
Computer Vision: VisionWorks, OpenCV
Graphics: Vulkan, OpenGL
Media: libargus, Video API
Multimedia API
CUDA, Linux4Tegra, ROS
Jetson Embedded Supercomputer: Advanced GPU, 64-bit CPU, Video CODEC, VIC, ISP
DEVELOPER.NVIDIA.COM/EMBEDDED-COMPUTING
![Page 100: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/100.jpg)
100
JETSON - START NOW
DEEP LEARNING INSTITUTE
Training Labs, Nanodegrees
nvidia.com/DLI
TWO DAYS TO A DEMO
Create your first demo today
developer.nvidia.com/embedded/twodaystoademo
JETSON DEVELOPER KIT
AGX Xavier Developer Kit $699
Xavier NX software patch
developer.nvidia.com/buy-jetson
GTC
Largest event for GPU developers
gputechconf.com
![Page 101: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/101.jpg)
![Page 102: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/102.jpg)
![Page 103: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/103.jpg)
103
JARVIS
Framework for Multimodal Conversational AI services
End-to-End Multimodal Conversational AI Services
• Pre-trained SOTA models: 100,000 hours of DGX
• Retrain with NeMo
• Interactive response: 150ms on A100 versus 25sec on CPU
• Deploy services with one line of code
PRE-TRAINED MODELS
NVIDIA GPU CLOUD
RETRAIN
NVIDIA AI TOOLKIT
Transfer Learning
NeMo
Service Maker
JARVIS
Speech
Vision
NLU
Dialog Manager
TRITON INFERENCE SERVER
video
audio
Chatbot
Multi-Speaker Transcription
Look to Talk
Gesture Recognition
Sign-up for EA: developer.nvidia.com/nvidia-jarvis
![Page 104: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/104.jpg)
104
Cloud Native Deployment Approach
NVIDIA EGX Stack
Enabling Resilience & Monitoring of Advanced Deployments
Package manager for Kubernetes: Easily configure, deploy and update applications on Kubernetes
Container Orchestration: Automated container deployment including self-healing
GPU Operator
![Page 105: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/105.jpg)
105
PURPOSE-BUILT AI SUPERCOMPUTERS
AI WORKSTATION | AI DATA CENTER
DGX Station: The Personal AI Supercomputer
DGX-1: The Essential Instrument for AI Research
DGX-2: The World’s Most Powerful AI System for the Most Complex AI Challenges
NGC DL SOFTWARE STACK
Universal SW for Deep Learning
Predictable execution across platforms
Pervasive reach
![Page 106: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/106.jpg)
106
NEW NGC FEATURES
SDKs & CONTAINERS FOR A100 (Q2)
NGC PRIVATE REGISTRY (Now)
NGC-READY SYSTEMS FOR A100 (Q2)
DL - TF, PyT, MxNet, Triton…
HPC – NAMD, Chroma, LAMMPS…
Industry SDKs – Jarvis, Aerial…
Easily grant and manage content access
Securely share and collaborate
Container scanning and signing
Model versioning and encryption
Multi-arch support - x86, Arm, POWER
![Page 107: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/107.jpg)
107
Secure and Accelerate End to End AI Workflows
NGC AI Model and Security Enhancements
PRE-TRAINED MODELS
AI Toolkits & SDKs
Transfer Learning
Federated Learning
NeMo
Conversational AI
TensorRT Optimizer
Service Maker
TRAINING & REFINING
NGC Catalog Private Registry
Container Signing
Model Encryption
Model Versioning
Security Scanning
Access Control
Deploy
Secure
Manage
Remote EGX Systems
![Page 108: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/108.jpg)
108
NVIDIA CLARA FIGHTING COVID
Testing | Treating | Tracking
Clara Guardian: Video, Vision, Voice; 18+ Global Partners
First DGX A100 for COVID: Argonne National Lab; Blue Print for Pharmas
Accelerated Genomics: Minutes & Hours vs. Weeks & Months; Epidemiology to Infected Population
AI Models for COVID in CT: 2 Pre-trained Models
NVIDIA Clara Imaging in NGC
![Page 109: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/109.jpg)
109
FIVE ROADS TO GPU COMPUTING
GPU Libraries
Drop-in replacement for existing libraries
cuBLAS, CUDA Math, cuSPARSE, cuRAND, cuSOLVER, nvGRAPH, cuDNN, cuFFT, Thrust
OpenACC
Comment-based directives in C / C++ / Fortran
Single source code parallelization for multiple architectures
CUDA
Parallel Programming Model for GPUs in C, C++, Fortran, Python, MATLAB
Specialized Kernels for general purpose GPU
RAPIDS
GPU Acceleration of Traditional Machine Learning
Accelerate Scikit-Learn style ML algorithms
DEEP LEARNING
GPU accelerated deep learning frameworks: TensorFlow, PyTorch
Build GPU-accelerated functions directly from data
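The "Scikit-Learn style" mentioned under RAPIDS refers to the estimator convention of a `fit` method that learns parameters and a `predict` method that applies them, which cuML is designed to mirror. The toy below illustrates only that interface shape in plain Python on the CPU; `SimpleLinearRegression` is a made-up class for this sketch and is neither scikit-learn nor cuML code.

```python
class SimpleLinearRegression:
    """Toy one-feature least-squares model with a fit/predict interface."""

    def fit(self, X, y):
        # X is a list of single-feature rows, e.g. [[0], [1], [2]].
        xs = [row[0] for row in X]
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(y) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y))
        # Learned parameters use the trailing-underscore naming convention.
        self.coef_ = sxy / sxx
        self.intercept_ = mean_y - self.coef_ * mean_x
        return self  # returning self allows chaining: Model().fit(X, y)

    def predict(self, X):
        return [self.intercept_ + self.coef_ * row[0] for row in X]


# Points on the line y = 2x + 1.
model = SimpleLinearRegression().fit([[0], [1], [2], [3]], [1, 3, 5, 7])
print(model.coef_, model.intercept_, model.predict([[4]]))
```

Because the calling convention is shared, code written against this kind of interface needs few changes when the estimator is swapped for a GPU-accelerated one.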
![Page 110: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/110.jpg)
110
developer.nvidia.com
![Page 111: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/111.jpg)
111
RICH CONTENT PORTFOLIO
Fundamentals and advanced hands-on training in key technologies and application domains
Deep Learning Fundamentals
Accelerated Computing Fundamentals
Accelerated Data Science Fundamentals
Intro to AI in the Data Center
AI for Digital Content Creation
AI for Healthcare
AI for Autonomous Vehicles
AI for Intelligent Video Analytics
AI for Robotics
AI for Predictive Maintenance
AI for Anomaly Detection
AI for Industrial Inspection
![Page 112: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/112.jpg)
112
DLI UNIVERSITY TRAINING
UNIVERSITY AMBASSADOR PROGRAM
• Qualified faculty and researchers can get certified to teach DLI workshops to their students at no cost.
• Hundreds of universities certified around the world, including:
TEACHING KITS
• Qualified university educators can download courseware across deep learning, accelerated computing, and robotics.
• Kits include lecture materials, GPU cloud resources, access to self-paced DLI courses, and more.
Learn more at www.nvidia.com/dli
![Page 113: ALISON B LOWNDES · The blueprint for AI power and scale using DGX A100 Infused with the expertise of NVIDIA’s AI practitioners Designed to solve the previously unsolvable Configurations](https://reader034.vdocuments.net/reader034/viewer/2022051918/600a0e8acb416a587341f354/html5/thumbnails/113.jpg)
113
https://blogs.nvidia.com/blog/2019/11/20/nvidia-microsoft-aid-ai-startups/