![Page 1: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/1.jpg)
Proprietary and confidential. Do not distribute.
Deep Learning at Scale
May 2016 Urs Köster, PhD
Nervana
MAKING MACHINES SMARTER.
![Page 2: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/2.jpg)
About nervana
• A platform for machine intelligence
• enable deep learning at scale
• optimized from algorithms to silicon
![Page 3: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/3.jpg)
The Nervana Platform - a full-stack solution
neon deep learning framework
nervana cloud Solutions
Images
Text
Tabular
Speech
Time series
Video
![Page 4: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/4.jpg)
neon: nervana python deep learning library
• User-friendly, extensible, fast
• Support for many deep learning models
• Interface to nervana cloud
• Multiple backends
  • nervana engine
  • GPU (optimized assembler kernels)
  • CPU cluster
Open source (Apache 2.0) on github.com/nervanaSystems/neon
![Page 5: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/5.jpg)
Nervana Cloud
web interface
command line
![Page 6: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/6.jpg)
Deep learning as a core technology
[Diagram: 'Google Brain' model, with DL at the core of Photos, Maps, Voice Search, Self-driving car, Ad Targeting, Machine Translation]
[Diagram: Nervana Platform, with DL at the core of Image Classification, Object Localization, Video Indexing, Speech Recognition, Natural Language]
![Page 7: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/7.jpg)
Video recognition with 3D convolution
[Bar chart: Training Speed in epochs / hour (axis 0 to 1), neon vs. caffe]
![Page 8: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/8.jpg)
Object Localization / Segmentation
CamVid dataset, SegNet model
KITTI dataset, Fast R-CNN model
| Model | neon (ms) | caffe (ms) | Speedup |
|---|---|---|---|
| Fast-RCNN (batch size=4) | 360 | 670 | 1.8x |
| SegNet (batch size=4) | 267 | 1455 | 5.4x |
| SegNet (4 GPUs, batch size=16) | 348 | -- | *5.9x |
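The single-GPU speedup figures can be reproduced from the two timing columns. The sketch below assumes that speedup means the caffe time divided by the neon time, which the slide does not state explicitly; the function names are illustrative only. The 4-GPU row is left out because it has no caffe timing and its asterisk is not explained in the text.

```python
# Timings in milliseconds, copied from the slide's table.
timings_ms = {
    "Fast-RCNN (batch size=4)": {"neon": 360, "caffe": 670},
    "SegNet (batch size=4)": {"neon": 267, "caffe": 1455},
}


def speedup(neon_ms, caffe_ms):
    """Assumed definition: how many times faster neon is than caffe."""
    return caffe_ms / neon_ms


def truncate_1dp(x):
    """Truncate (not round) a positive number to one decimal place."""
    return int(x * 10) / 10


for name, t in timings_ms.items():
    s = speedup(t["neon"], t["caffe"])
    print(f"{name}: {s:.2f}x (truncated to one decimal: {truncate_1dp(s)}x)")
```

Under this assumption the ratios are about 1.86 and 5.45, so the reported 1.8x and 5.4x match truncation rather than rounding to one decimal.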
![Page 9: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/9.jpg)
Image Classification (Residual Network)
![Page 10: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/10.jpg)
Speech to text
![Page 11: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/11.jpg)
ImageNet ILSVRC Challenge
[Chart: Top-5 error rate (axis 0% to 30%) by year, 2010 to 2015, with deep learning and human performance marked; labeled entries: AlexNet, Clarifai, GoogleNet, ResNet]
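For readers unfamiliar with the metric on this slide: top-5 error counts a prediction as wrong only when the true label is missing from the model's five highest-scoring classes. The following is a minimal sketch of that definition; the function name and the toy scores are hypothetical and are not taken from the talk.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, keeping the first k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical toy data: 2 samples, 6 classes.
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],  # class 0 has the lowest score
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],  # class 5 has the lowest score
]
labels = [2, 5]  # the second sample's true class falls outside its top 5

print(top_k_error(scores, labels, k=5))
```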
![Page 12: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/12.jpg)
Speeding up Deep Learning
• Same model, better performance:
  • Hardware improvements
  • Algorithmic improvements
[Bar chart: Alexnet ms / iteration (axis 0 to 600) for CPU, GTX580, TitanX, neon; one value is labeled "15,000 ..."]
[Chart: Soumith's AlexNet Benchmark, ms (axis 0 to 500), at 4/2015, 8/2015, 3/2016, neon vs. CuDNN]
[Chart: Soumith's GoogleNet Benchmark, ms (axis 0 to 500), at 4/2015, 8/2015, 3/2016, neon vs. CuDNN]
![Page 13: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/13.jpg)
Dennard scaling has ended
[Chart: processor trends for Transistors, Clock speed, Power, Perf / clock]
[Chart: learning speed vs. # of processors]
Industry standard: communication overhead = performance ceiling
Nervana: better communication fabric, near-linear scaling
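One way to see why communication overhead acts as a performance ceiling is a toy model in which the compute time per step divides evenly across processors while each step also pays a fixed communication cost. This model and all numbers below are assumptions made for illustration; the slides do not describe how the scaling curves were obtained.

```python
def scaling_speedup(n, compute_ms, comm_ms):
    """Speedup over one processor in a toy model.

    Assumes compute time splits evenly across n processors and every
    multi-processor step adds a fixed communication cost of comm_ms.
    """
    if n == 1:
        return 1.0
    return compute_ms / (compute_ms / n + comm_ms)


def speedup_ceiling(compute_ms, comm_ms):
    """Limit of scaling_speedup as n grows without bound."""
    return compute_ms / comm_ms


# Hypothetical numbers: 100 ms of compute per step,
# with a slow (10 ms) and a fast (1 ms) interconnect.
for comm_ms in (10.0, 1.0):
    row = [round(scaling_speedup(n, 100.0, comm_ms), 1) for n in (1, 2, 4, 8, 16)]
    print(f"comm={comm_ms} ms: speedups {row}, ceiling {speedup_ceiling(100.0, comm_ms)}x")
```

In this model, a lower communication cost raises the ceiling and keeps the curve closer to linear for longer, which is the qualitative contrast the slide draws.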
![Page 14: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/14.jpg)
Nervana Engine (coming in 2017)
• Unprecedented computing power
• 10x speedup over current GPUs
• More memory on-chip
• High-Bandwidth Memory off-chip
• Six bi-directional high-bandwidth links for 3D torus interconnect
• 8 chips in a box, seamlessly scale to multiple chassis
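The six links per chip match the shape of a 3D torus, where every node connects to one neighbour in each direction along three axes, with wraparound at the edges. The sketch below illustrates that general topology only; the function name and the 4 x 4 x 4 grid are hypothetical, since the slides do not say how chips are arranged.

```python
def torus_neighbors(node, dims):
    """Return the six link endpoints of `node` in a 3D torus of size `dims`.

    Coordinates wrap around, so a node on one face links to the opposite face.
    """
    x, y, z = node
    dx, dy, dz = dims
    return [
        ((x + 1) % dx, y, z), ((x - 1) % dx, y, z),  # +x and -x links
        (x, (y + 1) % dy, z), (x, (y - 1) % dy, z),  # +y and -y links
        (x, y, (z + 1) % dz), (x, y, (z - 1) % dz),  # +z and -z links
    ]


# Hypothetical 4 x 4 x 4 arrangement; the corner node wraps to the far faces.
for neighbor in torus_neighbors((0, 0, 0), (4, 4, 4)):
    print(neighbor)
```

Note that when an axis has only two nodes, the plus and minus links along that axis reach the same neighbour, so six distinct neighbours require at least three nodes per axis.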
![Page 15: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/15.jpg)
Summary
• Deep learning is a new computational paradigm
• Learning and Inference on data
• neon with state-of-the-art GPU kernels
• Nervana Cloud with multi-GPU training
• Watch for Nervana Engine deep learning processor
![Page 16: Urs Köster Presenting at RE-Work DL Summit in Boston](https://reader031.vdocuments.net/reader031/viewer/2022030314/589b2c611a28ab2d4c8b6043/html5/thumbnails/16.jpg)