deep learning frameworks - hpc.pnl.gov

DEEP LEARNING FRAMEWORKS: Where we are and where we should be going | JACK LEE, University of Toronto | AMY WANG, Huawei Canada


Post on 30-Nov-2021


Page 1: DEEP LEARNING FRAMEWORKS - hpc.pnl.gov

Where we are and where we should be going

DEEP LEARNING FRAMEWORKS

JACK LEE | University of Toronto

AMY WANG | Huawei Canada

Page 2

BACKGROUND

Architecture

Frontend API

Graph IR

Graph Executor

Kernel Library

INPUTS OUTPUTS

TensorFlow Frontend

Graph IR

Kernel Implementation
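The four layers on this slide (frontend API, graph IR, graph executor, kernel library) can be sketched in a few lines of plain Python. This is only an illustrative toy under stated assumptions: the names `Graph`, `KERNELS`, `execute`, and the frontend helpers are hypothetical and are not taken from TensorFlow or any other framework.

```python
# Toy four-layer framework: frontend API -> graph IR -> graph executor -> kernel library.
# Every name below is hypothetical and exists only for illustration.

# Kernel library: maps an op name to the function that implements it.
KERNELS = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "relu": lambda a: [max(0.0, x) for x in a],
}


class Graph:
    """Graph IR: a list of (name, op, input names) nodes in topological order."""

    def __init__(self):
        self.nodes = []

    def add_node(self, op, inputs):
        name = f"n{len(self.nodes)}"
        self.nodes.append((name, op, inputs))
        return name


# Frontend API: each call records a node instead of computing a value.
def add(graph, a, b):
    return graph.add_node("add", [a, b])


def mul(graph, a, b):
    return graph.add_node("mul", [a, b])


def relu(graph, a):
    return graph.add_node("relu", [a])


def execute(graph, feeds, fetch):
    """Graph executor: walk the nodes in order and dispatch each one to its kernel."""
    values = dict(feeds)
    for name, op, inputs in graph.nodes:
        values[name] = KERNELS[op](*[values[i] for i in inputs])
    return values[fetch]


g = Graph()
y = relu(g, add(g, mul(g, "x", "w"), "x"))   # "x" and "w" are graph inputs
result = execute(g, {"x": [1.0, -2.0], "w": [3.0, 0.5]}, y)
print(result)
```

Building the graph and running it are separate steps here, which is what allows a graph optimizer or compiler to sit between the frontend and the executor.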

Page 3

MOTIVATIONS

Frontend Interface

TensorFlow Frontend

Autograph

PyTorch Frontend

Page 4

MOTIVATIONS

Graph Optimizations

PyTorch Frontend

Trace-based JIT

AST-based JIT

TensorFlow Graph IR

XLA Lower Level ops

Automatic differentiation every iteration.
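One way to see why the slide distinguishes trace-based from AST-based JIT is a toy tracer built on operator overloading. The sketch below is an assumption-laden illustration, not the mechanism of any real framework: `Traced`, `trace`, and `model` are hypothetical names. It shows that a trace only records the ops executed for one example input, so a data-dependent branch disappears from the recorded graph.

```python
class Traced:
    """A value that records every arithmetic op applied to it (hypothetical toy tracer)."""

    def __init__(self, name, value, tape):
        self.name, self.value, self.tape = name, value, tape

    def _record(self, op, other):
        other_name = other.name if isinstance(other, Traced) else repr(other)
        other_value = other.value if isinstance(other, Traced) else other
        value = self.value + other_value if op == "add" else self.value * other_value
        out = Traced(f"t{len(self.tape)}", value, self.tape)
        self.tape.append((out.name, op, self.name, other_name))
        return out

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)

    def __gt__(self, other):
        # Returns a plain bool, so the comparison and the branch are NOT recorded.
        return self.value > other


def trace(fn, example):
    """Run fn once on an example input and return the recorded ops."""
    tape = []
    fn(Traced("x", example, tape))
    return tape


def model(x):
    if x > 0:            # data-dependent control flow
        return x * 2
    return x + 1


trace_pos = trace(model, 3.0)    # records only the `x * 2` path
trace_neg = trace(model, -3.0)   # records only the `x + 1` path
print(trace_pos, trace_neg)
```

The two traces differ, and neither contains the `if`. An AST-based approach reads the source instead, so the branch stays visible to the compiler.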

Page 5

MOTIVATIONS

Kernel Specialization

XLA Lower Level ops

Benchmarks

Deep Fusion Tiling

Graph Lowering

Frontend API

INPUTS OUTPUTS

Graph IR

Compiler IR

COMPILED NETWORK
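As a rough illustration of the fusion idea named on this slide, the sketch below fuses a chain of elementwise ops into one per-element function. It is a minimal pure-Python analogy, assuming each op is elementwise; `run_unfused`, `fuse`, and `run_fused` are hypothetical names, and real compilers such as XLA perform fusion on their own IR rather than on Python callables.

```python
def run_unfused(ops, xs):
    """One pass and one intermediate list per op, like a kernel-per-op executor."""
    buffers = 0
    for op in ops:
        xs = [op(x) for x in xs]   # materializes a full intermediate result
        buffers += 1
    return xs, buffers


def fuse(ops):
    """Compose elementwise ops into a single per-element function."""
    def fused(x):
        for op in ops:
            x = op(x)
        return x
    return fused


def run_fused(ops, xs):
    """One pass over the data: intermediates stay per-element and are never stored."""
    kernel = fuse(ops)
    return [kernel(x) for x in xs], 1


ops = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: max(0.0, x)]
data = [-2.0, 0.5, 3.0]
unfused_out, unfused_buffers = run_unfused(ops, data)
fused_out, fused_buffers = run_fused(ops, data)
print(fused_out, unfused_buffers, fused_buffers)
```

Both versions compute the same result; the fused one writes a single output buffer instead of one per op, which is the memory-traffic saving that fusion targets.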

Page 6

MOTIVATIONS

Kernel Specialization

NNVM API

TVM API

OUTPUTS

Compiler IR

Generated GPU Code

INPUTS OUTPUTS

NNVM Graph IR

TVM Halide IR

COMPILED KERNELS

CUSTOM RUNTIME

Page 7

STATE OF THE ART SUMMARY

Frameworks compared (columns, left to right): TENSORFLOW | TENSORFLOW XLA | PYTORCH | PYTORCH - GLOW | NNVM + TVM

Staged Frontend: ✘ ✘

Native Frontend: ✘ ✘ ✘

Graph Optimization: (no ✘)

Kernel Specialization: ✘ ✘

Runtime Specialization: ✘ ✘ ✘ ✘ ✘

Execution Level: C++ | C++ | Python | Machine Code | C++

Page 8

THE DVM FRONTEND

Deep Learning Compilation Framework

TENSORFLOW PYTORCH NATIVE SYNTAX

IR Transformation IR Transformation Parser (Clang/Python AST)

IR Builder
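The "Parser (Clang/Python AST)" to "IR Builder" path for native syntax can be sketched with Python's standard `ast` module. This is a hedged illustration only: the slides do not show DVM's actual IR, so the `(dest, op, args)` tuple format, the `IRBuilder` class, and `build_ir` are all assumptions made for the example.

```python
import ast

# Supported binary operators and their (hypothetical) IR op names.
BINOPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


class IRBuilder(ast.NodeVisitor):
    """Walks a Python AST and emits (dest, op, args) tuples, one per operation."""

    def __init__(self):
        self.instrs = []
        self.counter = 0

    def emit(self, op, *args):
        self.counter += 1
        dest = f"%{self.counter}"      # every result gets a fresh name
        self.instrs.append((dest, op, args))
        return dest

    def visit_Name(self, node):
        return node.id                 # function arguments are referenced by name

    def visit_Constant(self, node):
        return self.emit("const", node.value)

    def visit_BinOp(self, node):
        lhs = self.visit(node.left)
        rhs = self.visit(node.right)
        return self.emit(BINOPS[type(node.op)], lhs, rhs)

    def visit_Return(self, node):
        self.instrs.append((None, "ret", (self.visit(node.value),)))


def build_ir(source):
    """Parse source holding a single function and return its instruction list."""
    func = ast.parse(source).body[0]
    builder = IRBuilder()
    for stmt in func.body:
        builder.visit(stmt)
    return builder.instrs


SOURCE = """
def f(x, w):
    return x * w + 1
"""
ir = build_ir(SOURCE)
for instr in ir:
    print(instr)
```

Because the builder works from the syntax tree rather than from an execution, statements such as `if` and `for` would remain visible to it; handling them is left out of this sketch.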

Page 9

THE DVM MIDEND

Deep Learning Compilation Framework

SSA-based IR

Low level ops

Control Flow

Data Flow

Automatic Differentiation

Graph Optimizations

Profile Guided Optimizations

Compiler Optimizations
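To make the pairing of an SSA-based IR with automatic differentiation concrete, here is a minimal sketch of reverse-mode differentiation over a straight-line SSA program. The `(dest, op, a, b)` instruction format and the `forward` and `backward` helpers are assumptions for illustration; the slides do not specify how DVM represents or differentiates its IR.

```python
def forward(program, inputs):
    """Evaluate an SSA program: each instruction defines exactly one new name."""
    env = dict(inputs)
    for dest, op, a, b in program:
        if op == "add":
            env[dest] = env[a] + env[b]
        elif op == "mul":
            env[dest] = env[a] * env[b]
        else:
            raise ValueError(f"unsupported op: {op}")
    return env


def backward(program, env, output):
    """Reverse-mode AD: walk the instructions in reverse, accumulating adjoints."""
    grad = {name: 0.0 for name in env}
    grad[output] = 1.0
    for dest, op, a, b in reversed(program):
        g = grad[dest]
        if op == "add":
            grad[a] += g
            grad[b] += g
        else:  # "mul": d(a*b)/da = b and d(a*b)/db = a
            grad[a] += g * env[b]
            grad[b] += g * env[a]
    return grad


# y = (x * w) + (x * x)
PROGRAM = [
    ("t1", "mul", "x", "w"),
    ("t2", "mul", "x", "x"),
    ("y", "add", "t1", "t2"),
]
env = forward(PROGRAM, {"x": 3.0, "w": 2.0})
grads = backward(PROGRAM, env, "y")
print(env["y"], grads["x"], grads["w"])
```

Single assignment is what keeps the backward pass simple: every name has one definition, so its adjoint is just the sum of contributions from its uses.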

Page 10

THE DVM BACKENDS

Deep Learning Compilation Framework

Default Runtime Codegen

Compatible Compiler

Specialized Runtime Source Code

Handwritten Kernel Source Code

Compiled Network

Runtime + Kernel Codegen

Clustered Specialized Runtime Source Code

Fused Specialized Kernel Source Code

Compatible Compiler

Compiled Network
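The backend flow on this slide (codegen emits source code, a compatible compiler turns it into a compiled network) can be mimicked in miniature. In the sketch below, which is an analogy under stated assumptions rather than DVM's backend, the generated "source code" is Python text and Python's built-in `compile()` stands in for the compatible compiler; `codegen`, `compile_network`, and the instruction format are hypothetical.

```python
def codegen(program, params, result):
    """Emit source for a function that runs `program` directly, with no interpreter loop."""
    symbols = {"add": "+", "mul": "*"}
    lines = [f"def network({', '.join(params)}):"]
    for dest, op, a, b in program:
        lines.append(f"    {dest} = {a} {symbols[op]} {b}")
    lines.append(f"    return {result}")
    return "\n".join(lines) + "\n"


def compile_network(source):
    """Stand-in for the 'compatible compiler': compile the source and return the function."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["network"]


# y = x * w + b, in the same (dest, op, a, b) form a midend might hand over.
PROGRAM = [
    ("t1", "mul", "x", "w"),
    ("y", "add", "t1", "b"),
]
source = codegen(PROGRAM, ["x", "w", "b"], "y")
network = compile_network(source)
output = network(3.0, 2.0, 1.0)
print(source)
print(output)
```

The generated function is specialized to one network: the op sequence is baked into the source, so nothing is dispatched at run time.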

Page 11

Q&A

Overview diagram: the complete DVM pipeline, repeating the backend, frontend, and midend diagrams from pages 8 to 10.