![Page 1: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/1.jpg)
NVIDIA® cuDNN
GPU-Accelerated Machine Learning
![Page 2: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/2.jpg)
How GPU Acceleration Works
Application Code
GPU: Compute-Intensive Functions (5% of code, ~80% of run-time)
+
CPU: Rest of Sequential CPU Code
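The split on this slide (roughly 80% of run-time spent in the compute-intensive functions) limits the whole-application gain through Amdahl's law. A minimal sketch follows; the function name and the 10x kernel speedup are illustrative assumptions, not figures from the slides.

```python
def overall_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only part of the run-time is accelerated."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    serial_fraction = 1.0 - accelerated_fraction
    return 1.0 / (serial_fraction + accelerated_fraction / kernel_speedup)


# ~80% of run-time sits in the compute-intensive functions (from the slide).
# The 10x speedup of that portion is an assumed, illustrative number.
print(round(overall_speedup(0.80, 10.0), 2))

# Upper bound as the kernel speedup grows without limit: 1 / (1 - 0.80).
print(round(1.0 / (1.0 - 0.80), 2))
```

Under these assumptions the sequential CPU code caps the gain at about 5x, however fast the GPU portion becomes.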
![Page 3: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/3.jpg)
3 Ways to Program GPUs
Applications
Libraries: “Drop-in” Acceleration
OpenACC Directives: Easily Accelerate Applications
Programming Languages: Maximum Flexibility
![Page 4: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/4.jpg)
HPC Today
cuDNN is a library of primitives for deep learning
![Page 5: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/5.jpg)
Deep Learning with cuDNN
cuDNN is a library of primitives for deep learning
Applications
Frameworks
cuDNN
GPUs: Tesla, TX-1, Titan
![Page 6: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/6.jpg)
LARGE SCALE VISUAL RECOGNITION CHALLENGE (ILSVRC)
Example labels: person, car, helmet, motorcycle, bird, frog, dog, chair, hammer, flower pot, power drill
1.2M training images • 1000 object categories
![Page 7: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/7.jpg)
CHALLENGE SUMMARY
Entries using GPUs (bar chart, 2010 to 2014; bar labels: 4, 60, 110)
Image Classification Error Rates (2010 to 2014): 28%, 26%, 16%, 12%, 7%
![Page 8: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/8.jpg)
DEEP LEARNING VISUALIZED
![Page 9: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/9.jpg)
Example Use Cases
Image Classification, Object Detection, Localization
Face Recognition
Speech & Natural Language Processing
Medical Imaging & Interpretation
Seismic Imaging & Interpretation
Recommendation
![Page 10: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/10.jpg)
Deep learning revolutionizing medical research
Detecting Mitosis in Breast Cancer Cells (IDSIA)
Predicting the Toxicity of New Drugs (Johannes Kepler University)
Understanding Gene Mutation to Prevent Disease (University of Toronto)
![Page 11: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/11.jpg)
cuDNN Version 2
![Page 12: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/12.jpg)
cuDNN Design Goal
Basic Deep Learning Subroutines
Allow user to write a DNN application without any CUDA code
Flexible Layout
Handle any data layout
Memory vs. Performance Trade-off
Great performance with more memory use
Good performance with minimal memory usage
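One way to read "Handle any data layout": describe a tensor by its dimensions plus a stride per dimension, so the same logical element (n, c, h, w) can be located in either NCHW or NHWC memory. The NumPy sketch below illustrates that idea only. The helper names are hypothetical and this is not the cuDNN API.

```python
import numpy as np


def strides_nchw(n, c, h, w):
    """Element strides for logical dims (n, c, h, w) in a densely packed NCHW buffer."""
    return (c * h * w, h * w, w, 1)


def strides_nhwc(n, c, h, w):
    """Element strides for logical dims (n, c, h, w) in a densely packed NHWC buffer."""
    return (h * w * c, 1, w * c, c)


def offset(index, strides):
    """Flat offset of logical index (n, c, h, w) given per-dimension strides."""
    return sum(i * s for i, s in zip(index, strides))


N, C, H, W = 2, 3, 4, 5
x_nchw = np.arange(N * C * H * W, dtype=np.float32).reshape(N, C, H, W)
# Same values, physically reordered into NHWC memory order.
x_nhwc = np.ascontiguousarray(x_nchw.transpose(0, 2, 3, 1))

idx = (1, 2, 3, 4)  # logical (n, c, h, w)
a = x_nchw.ravel()[offset(idx, strides_nchw(N, C, H, W))]
b = x_nhwc.ravel()[offset(idx, strides_nhwc(N, C, H, W))]
print(a == b)  # both layouts resolve to the same logical element
```

A routine written against (dims, strides) rather than a fixed layout can consume either buffer without a copy, which is the flexibility this slide appears to be aiming at.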
![Page 13: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/13.jpg)
DNN ROUTINES
Convolutions – 80-90% of the execution time
Pooling – Spatial smoothing
Activation – Pointwise non-linear function
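The two cheaper routines can be sketched in a few lines of NumPy. This is an illustrative reference, not cuDNN code: 2x2 max pooling and ReLU are chosen here as common examples of pooling and pointwise activation, and the function names are hypothetical.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an (H, W) array; H and W must be even."""
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Index layout after reshape: (row block, row in block, column block, column in block).
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def relu(x):
    """Pointwise non-linearity: max(x, 0) applied independently to each element."""
    return np.maximum(x, 0.0)


x = np.array([[ 1., -2.,  3.,  0.],
              [-1.,  5., -6.,  2.],
              [ 0., -3., -4., -1.],
              [ 7.,  1., -2., -8.]])

pooled = max_pool_2x2(x)   # each output is the maximum of one 2x2 block
activated = relu(pooled)   # negative values are clamped to zero
print(pooled)
print(activated)
```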
![Page 14: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/14.jpg)
CONVOLUTIONS – The MAIN Workload
![Page 15: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/15.jpg)
2D conv as a GEMV
Image (6 x 6):
I1  I2  I3  I4  I5  I6
I7  I8  I9  I10 I11 I12
I13 I14 I15 I16 I17 I18
I19 I20 I21 I22 I23 I24
I25 I26 I27 I28 I29 I30
I31 I32 I33 I34 I35 I36
Filter (3 x 3):
F1 F2 F3
F4 F5 F6
F7 F8 F9
Unrolled image patches (one row per output pixel; first three rows shown) times the filter as a column vector:
I1 I2 I3 I7 I8 I9  I13 I14 I15
I2 I3 I4 I8 I9 I10 I14 I15 I16
I3 I4 I5 I9 I10 I11 I15 I16 I17
x (F1 F2 F3 F4 F5 F6 F7 F8 F9)^T
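The unrolling on this slide can be reproduced with a small NumPy sketch: every 3x3 patch of the 6x6 image becomes one row of a matrix, and the whole convolution is then a single matrix-vector product with the flattened filter. The `im2col` helper and the numeric values standing in for I1..I36 and F1..F9 are illustrative assumptions. As in the slide's layout, the filter is not flipped, so this computes a cross-correlation.

```python
import numpy as np


def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2D image into one row (stride 1, no padding)."""
    h, w = image.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    rows = np.empty((out_h * out_w, kh * kw), dtype=image.dtype)
    for r in range(out_h):
        for c in range(out_w):
            rows[r * out_w + c] = image[r:r + kh, c:c + kw].ravel()
    return rows, (out_h, out_w)


image = np.arange(1, 37, dtype=np.float64).reshape(6, 6)  # I1..I36 as the values 1..36
filt = np.arange(1, 10, dtype=np.float64).reshape(3, 3)   # F1..F9 as the values 1..9

patches, out_shape = im2col(image, 3, 3)
output = (patches @ filt.ravel()).reshape(out_shape)       # one GEMV for the whole image

print(patches[0])  # corresponds to the row I1 I2 I3 I7 I8 I9 I13 I14 I15
print(output.shape)
```

The unrolled matrix duplicates overlapping pixels, so this formulation trades extra memory for the ability to reuse a highly tuned matrix routine.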
![Page 16: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/16.jpg)
Multi-convolve
![Page 17: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/17.jpg)
cuDNN V2 Flexibility
![Page 18: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/18.jpg)
cuDNN V2 new features
![Page 19: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/19.jpg)
cuDNN Version 2
Accelerates key routines to improve performance of neural net training
Up to 1.8x faster on AlexNet than a baseline GPU implementation
New support for 3D convolutions
Integrated into all major Deep Learning frameworks: Caffe, Theano, Torch
Speedup with cuDNN relative to Baseline (GPU) = 1.0x: Caffe (GoogLeNet) 1.6x, Torch (OverFeat) 1.2x
Images Trained Per Day (Caffe AlexNet), in millions of images: 16 Core CPU 2.5M, GTX Titan 18M, Titan Black with cuDNN v1 23M, Titan X with cuDNN v2 43M
CPU: E5-2698 v3 @ 2.3GHz / 3.6GHz Turbo
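The per-day figures from the "Images Trained Per Day (Caffe AlexNet)" chart are easier to compare as images per second and as ratios to the CPU figure. The sketch below only does that arithmetic on the chart values; the dictionary keys are shorthand labels chosen here.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Images trained per day (Caffe AlexNet), as read from the chart on the slide.
images_per_day = {
    "16 Core CPU": 2.5e6,
    "GTX Titan": 18e6,
    "Titan Black, cuDNN v1": 23e6,
    "Titan X, cuDNN v2": 43e6,
}

cpu_rate = images_per_day["16 Core CPU"]
for name, per_day in images_per_day.items():
    per_second = per_day / SECONDS_PER_DAY
    ratio = per_day / cpu_rate
    print(f"{name}: {per_second:.0f} images/s, {ratio:.1f}x the CPU figure")
```

The configurations differ in both GPU model and cuDNN version, so these ratios mix hardware and library gains rather than isolating the effect of cuDNN alone.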
![Page 21: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/21.jpg)
NVIDIA® cuDNN Roadmap
Timeline: Q3’14, Q4’14, Q1’15, Q2’15
Release 1 (September 2014)
Layers (forward & backprop)
- Convolutional
- Pooling
- Softmax
- ReLU/Sigmoid/Tanh
Performance Features
- High performance convolution
Release 2 and Release 3
Layers
- Local receptive field
- Contrast normalization
- Fully-connected
- Recurrent
Performance Features
- Support for multiple GPUs per node
- Faster convolution routines
- Tuning for future chips
![Page 22: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/22.jpg)
GPU-Accelerated Deep Learning Frameworks

| | CAFFE | TORCH | THEANO | Mernava neo | CUDA-CONVNET2 | KALDI |
|---|---|---|---|---|---|---|
| Description | Deep Learning Framework | Scientific Computing Framework | Math Expression Compiler | Deep Learning Framework | Deep Learning Application | Speech Recognition Toolkit |
| cuDNN | R2 | R2 | R2 | -- | -- | -- |
| License | BSD-2 | BSD | BSD | Apache 2.0 | Apache 2.0 | Apache 2.0 |
| Interface(s) | Text-based definition files, C++, Python, MATLAB | Python, Lua, MATLAB | Python | Python | C++ | C++, Shell scripts |

Multi-GPU: In Progress (three frameworks); KALDI (nnet2)
Multi-CPU: KALDI (nnet2)
Embedded (TK1)
http://developer.nvidia.com/deeplearning
![Page 23: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/23.jpg)
Using cuDNN
![Page 24: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/24.jpg)
cuDNN Easy to Enable
![Page 25: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/25.jpg)
DIGITS
Deep Learning GPU Training System
Visualization tool for DNN training
Use default network, import one, or design your own
Import your training data from disk or web
Monitor multiple trainings in parallel
![Page 26: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/26.jpg)
DIGITS
Process Data, Configure DNN, Monitor Progress, Visualize Layers, Test Image
![Page 27: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/27.jpg)
DIGITS
Deep Learning GPU Training System
Who it is for
Deep learning researchers
Automotive
Medical Researchers
Defense
Intelligent Video Analytics
Web Companies
Startups
![Page 28: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/28.jpg)
![Page 29: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep](https://reader036.vdocuments.net/reader036/viewer/2022062506/5f01d1207e708231d4012e5d/html5/thumbnails/29.jpg)
Thank you!
Developer Zone: https://developer.nvidia.com/deeplearning
GPU Technology Conference: http://www.gputechconf.com/
cuDNN Download: https://developer.nvidia.com/cuDNN
DIGITS Download: https://developer.nvidia.com/digits
DIGITS Source: https://www.github.com/nvidia/digits