cs 758: advanced topics in computer architecture

36
CS 758: Advanced Topics in Computer Architecture Lecture #11: Intro to DNNs Professor Matthew D. Sinclair Some of these slides were developed by Tushar Krishna at Georgia Tech and Joel Emer and Vivienne Sze at MIT Slides enhanced by Matt Sinclair

Upload: others

Post on 29-Oct-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS 758: Advanced Topics in Computer Architecture

CS 758: Advanced Topics in Computer Architecture

Lecture #11: Intro to DNNs

Professor Matthew D. Sinclair

Some of these slides were developed by Tushar Krishna at Georgia Tech and Joel Emer and Vivienne Sze at MIT

Slides enhanced by Matt Sinclair

Page 2: CS 758: Advanced Topics in Computer Architecture

AI/ML Applications“AI is the new electricity” – Andrew Ng

Object Detection

Speech Recognition

Image Segmentation Medical Imaging

GamesText to Speech Recommendations

Autonomous Agents, Intelligent Personal Assistants, …

Page 3: CS 758: Advanced Topics in Computer Architecture

ML is HOT

Source: Jeff Dean, Google

Over 1.5B invested in AI chip startups in 2017 aloneNot clear what the endgame will be

Page 4: CS 758: Advanced Topics in Computer Architecture

Publications at Architecture Conferences

Slide Courtesy: Joel Emer and Vivienne Sze

Page 5: CS 758: Advanced Topics in Computer Architecture

Key Questions to Answer• Separating the Hype from the (Remaining) Opportunity

• Computer Architecture perspective• The role of HW Acceleration in the AI/ML Boom

• Understand Compute and Memory behavior of Deep Learning Workloads

• Understand Limitations of Current Platforms (CPU and GPU)

• What about Moore’s Law?

• Opportunities and Challenges with custom HW Accelerators• Techniques to reduce computation

• Techniques to to reduce data movement

• Techniques to to reduce memory overhead

• Performance vs Energy vs $$

Page 6: CS 758: Advanced Topics in Computer Architecture

Artificial Intelligence

Machine Learning with Neural Networks

Machine Learning

Brain-Inspired

Spiking NeuralNetworks

Slide Courtesy: Joel Emer and Vivienne Sze

Page 7: CS 758: Advanced Topics in Computer Architecture

NeurON: Weighted Sum

Image Source: Stanford

Slide Courtesy: Joel Emer and Vivienne Sze

Page 8: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORKNeurons

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 9: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORKSynapses

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 10: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK: Multiple weighted sumsEach synapse has a weight for neuron activation

X1

X2

X3

Y1

Y2

Y3

Y4

W11

W34

Yj = activation Wij ´ Xii=1

3

åæ

èç

ö

ø÷

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 11: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK: Multiple weighted sumsWeight Sharing: multiple synapses use the same weight value

X1

X2

X3

Y1

Y2

Y3

Y4

W11

W34

Yj = activation Wij ´ Xii=1

3

åæ

èç

ö

ø÷

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 12: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK TerminologyL1 Neuron outputsa.k.a. ActivationsL1 Neuron inputs

e.g. image pixels

Layer 1

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 13: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK Terminology

L2 Output Activations

L2 Input Activations

Layer 2

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 14: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK TerminologyA layer can refer to a set activations or a set of weights.In this class, we use layer to refer to a set of weights.

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 15: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK TerminologyFully-Connected: all input neurons connected to all output neurons

Sparsely-Connected

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 16: CS 758: Advanced Topics in Computer Architecture

NEURAL NETWORK Terminology

Feed Forward Feedback

[Image Source: Stanford]

Slide Courtesy: Joel Emer and Vivienne Sze

Page 17: CS 758: Advanced Topics in Computer Architecture

Artificial Intelligence

Deep Learning

Machine Learning

Brain-Inspired

Spiking NeuralNetworks

DeepLearning

Slide Courtesy: Joel Emer and Vivienne Sze

Page 18: CS 758: Advanced Topics in Computer Architecture

Popular Types of DEEP NEURAL NETWORKS

• Fully-Connected NN

• feed forward, a.k.a. multilayer perceptron (MLP)

• Convolutional NN (CNN)

• feed forward, sparsely-connected w/ weight sharing

Fully-ConnectedSparsely-Connected

Slide Courtesy: Joel Emer and Vivienne Sze

Page 19: CS 758: Advanced Topics in Computer Architecture

Popular Types of DEEP NEURAL NETWORKS

• Recurrent NN (RNN)

• feedback

• Long Short-Term Memory (LSTM)

• feedback + storage Feed Forward Feedback

Slide Courtesy: Joel Emer and Vivienne Sze

Page 20: CS 758: Advanced Topics in Computer Architecture

Deep Learning LandscapeInference

Intermediate features

Convolutional Layers

(Feature Extraction)Summarize features

Page 21: CS 758: Advanced Topics in Computer Architecture

Deep Learning Landscape

Labelled Datasets

Error

ML PractitionerDNN Model

(Topology + Weights)

Training Inference

Page 22: CS 758: Advanced Topics in Computer Architecture

Why is deep learning hot now?

350M images uploaded per day

2.5 Petabytes of customer data hourly

300 hours of video uploaded every minute

Big DataAvailability

GPUAcceleration

New MLTechniques

Convolutional Neural Network

Slide Courtesy: Joel Emer and Vivienne Sze

Page 23: CS 758: Advanced Topics in Computer Architecture

How deep learning started: ImageNet Challenge

Image Classification Task:1.2M training images • 1000 object categories

Object Detection Task:456k training images • 200 object categories

Slide Courtesy: Joel Emer and Vivienne Sze

Page 24: CS 758: Advanced Topics in Computer Architecture

ImageNet: Image Classification Task

0

5

10

15

20

25

30

2010 2011 2012 2013 2014 2015 Human

Top 5 Classification Error (%)

large error rate reduction due to Deep CNN

[Russakovsky et al., IJCV 2015]

Deep CNN-based designsHand-crafted feature-based designs

Slide Courtesy: Joel Emer and Vivienne Sze

Erro

r (%

)

Page 25: CS 758: Advanced Topics in Computer Architecture

GPU Usage for ImageNet Challenge

Slide Courtesy: Joel Emer and Vivienne Sze

Page 26: CS 758: Advanced Topics in Computer Architecture

Mature Applications

• Image

o Classification: image to object class

o Recognition: same as classification (except for faces)

o Detection: assigning bounding boxes to objects

o Segmentation: assigning object class to every pixel

• Speech & Language

o Speech Recognition: audio to text

o Translation

o Natural Language Processing: text to meaning

o Audio Generation: text to audio

• Games

Slide Courtesy: Joel Emer and Vivienne Sze

Page 27: CS 758: Advanced Topics in Computer Architecture

Emerging Applications

• Medical (Cancer Detection, Pre-Natal)

• Finance (Trading, Energy Forecasting, Risk)

• Infrastructure (Structure Safety and Traffic)

• Weather Forecasting and Event Detection

http://www.nextplatform.com/2016/09/14/next-wave-deep-learning-applications/

Slide Courtesy: Joel Emer and Vivienne Sze

Page 28: CS 758: Advanced Topics in Computer Architecture

Deep Learning for Self-driving Cars

Slide Courtesy: Joel Emer and Vivienne Sze

Page 29: CS 758: Advanced Topics in Computer Architecture

Opportunities

Image Source: Tractica

$500B Market over 10 Years!

Slide Courtesy: Joel Emer and Vivienne Sze

Page 30: CS 758: Advanced Topics in Computer Architecture

Opportunities

From EE Times – September 27, 2016

”Today the job of training machine learning models is limited by compute, if wehad faster processors we’d run bigger models…in practice we train on a reasonablesubset of data that can finish in a matter of months. We could use improvements ofseveral orders of magnitude – 100x or greater.”

• Greg Diamos, Senior Researcher, (formerly) SVAIL, Baidu

Slide Courtesy: Joel Emer and Vivienne Sze

Page 31: CS 758: Advanced Topics in Computer Architecture

COMPUTATION Platforms

• CPU• Intel, ARM, AMD…

• GPU• NVIDIA, AMD…

• Fine Grained Reconfigurable (FPGA)• Microsoft BrainWave

• Coarse Grained Programmable/Reconfigurable• Wave Computing, Plasticine, Graphcore…

• Application Specific• Neuflow, *DianNao, Eyeriss, Cnvlutin, SCNN, TPU, …

Slide Courtesy: Joel Emer and Vivienne Sze

Page 32: CS 758: Advanced Topics in Computer Architecture

Computation Platforms

On-c

hip

Buff

er 168 PE Array

ShiDianNao

Eyeriss

NVDLA

ARM Trillum

Apple Neural Engine

CambriconX

HPC cluster(CPU+GPU+FPGA) Accelerators

(ASIC)

Training Inference

Page 33: CS 758: Advanced Topics in Computer Architecture

CPU Compute Model

ProcessorCompiler

Architecture

µArchitecture

InputData

Binary Program

ProcessedData

Program

Behavioral Statistics

Slide Courtesy: Joel Emer and Vivienne Sze

Page 34: CS 758: Advanced Topics in Computer Architecture

DNN Compute Model

ProcessorCompiler

Architecture

µArchitecture

InputData

Binary Program

ProcessedData

Program

Behavioral Statistics

Model (Shape)Configuration

Dataflow

Implementation

DNN Accelerator

Mapper

InputActivations

OutputActivations

Slide Courtesy: Joel Emer and Vivienne Sze

Page 35: CS 758: Advanced Topics in Computer Architecture

Challenges in Design and Deployment

C

X

YS

R

C

K

DNN Model/Shape Mapping(Dataflow)

AcceleratorMicroarchitecture

RuntimeEnergy

Training Inference

Page 36: CS 758: Advanced Topics in Computer Architecture

DNN Timeline

• 1940s: Neural networks were proposed

• 1960s: Deep neural networks were proposed

• 1989: Neural network for recognizing digits (LeNet)

• 1990s: Hardware for shallow neural nets• Example: Intel ETANN (1992)

• 2011: Breakthrough DNN-based speech recognition• Microsoft real-time speech translation

• 2012: DNNs for vision supplanting traditional ML• AlexNet for image classification

• 2014+: Rise of DNN accelerator research• Examples: Neuflow, DianNao, etc.

Slide Courtesy: Joel Emer and Vivienne Sze