Ariane + NVDLA: Seamless Third-Party IP Integration with ESP

Davide Giri, Kuan-Lin Chiu, Guy Eichler, Paolo Mantovani, Nandhini Chandramoorthy (IBM Research), Luca P. Carloni

CARRV 2020

Motivation

• SoCs are increasingly heterogeneous [1]

• Heterogeneity increases the engineering effort [2]

→ IP reuse enables the design of complex SoCs

• Thanks to open-source hardware (OSH) movement [3]

→ Proliferation of open-source IPs

Seamless third-party IP integration is key!

[1] Shao, SLCA’15 [2] Khailany, DAC’18 [3] Gupta, IEEE Computer’17

In this work

Enhance ESP with support for third-party accelerators

• ESP is our open-source platform for SoC design [4]

[4] ESP: esp.cs.columbia.edu [5] Ariane: github.com/pulp-platform/ariane [6] NVDLA: nvdla.org

Demonstrate integration capabilities of ESP

• Integration of Ariane [5] and NVDLA [6]

• Rapid FPGA prototyping

Open-source release as part of ESP

• Hands-on tutorial: esp.cs.columbia.edu/docs/thirdparty_acc

ESP overview

ESP architecture

ESP methodology

Accelerator Flow

• Simplified design

• Automated integration

SoC Flow

• Mix & match floorplanning GUI

• Rapid FPGA prototyping

[Figure: ESP methodology. Labels: HLS design flows (Vivado HLS, Catapult HLS, Stratus HLS); RTL design flows; IP library (Ariane, …, accelerator, …, third-party accelerator); SoC integration; rapid prototyping. Image credits: * By Nvidia Corporation. ** By lewing@isc.tamu.edu Larry Ewing and The GIMP.]

ESP methodology: SoC flow

[Figure: ESP SoC flow. Labels: IP library (Ariane, …, accelerator, …, third-party accelerator); SoC integration; rapid prototyping. Image credit: ** By lewing@isc.tamu.edu Larry Ewing and The GIMP.]

Third-party IP integration with ESP

ESP accelerator tile

Third-party accelerator

• Third-party RTL and SW

• Files list

• Accelerator definition (xml)

• RTL wrapper wiring

• Makefile targets definition

ESP accelerator flow (figure legend: automated, manual)

• Accelerator skeleton

• Accelerator-specific functions

• Test behavior

• Generate RTL

• Test RTL

• Instantiate into SoC

Ariane + NVDLA with ESP

Integration of Ariane

ESP processor tile

• RISC-V Ariane (new!) or SPARC V8 Leon3

• Boot unmodified Linux

• AXI4 (new!) or AHB bus to access memory

• APB bus to access peripherals

• Optional L2 private cache

• Processor-specific interrupt controller placed in the I/O tile

NVDLA

NVIDIA Deep Learning Accelerator

• Open source

• Fixed function

• Highly configurable

NVDLA small

• 8-bit integer precision

• 64 MAC units

• 128 KB local memory
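
As a rough sense of scale for the NVDLA small configuration, peak MAC throughput can be estimated from the unit count and the clock. This is a back-of-the-envelope sketch, not a figure from the slides: it assumes each MAC unit completes one multiply-accumulate per cycle, and `peak_macs_per_second` is a hypothetical helper name. The 50 MHz clock is the FPGA frequency quoted in the evaluation slides.

```python
def peak_macs_per_second(mac_units: int, clock_hz: float) -> float:
    """Upper bound on MAC throughput.

    Assumption (not stated in the slides): one MAC per unit per clock cycle.
    """
    if mac_units <= 0 or clock_hz <= 0:
        raise ValueError("mac_units and clock_hz must be positive")
    return mac_units * clock_hz


NVDLA_SMALL_MACS = 64   # MAC units, from the NVDLA small slide
FPGA_CLOCK_HZ = 50e6    # 50 MHz, from the evaluation slides

peak = peak_macs_per_second(NVDLA_SMALL_MACS, FPGA_CLOCK_HZ)
print(f"{peak / 1e9:.1f} GMAC/s")  # prints "3.2 GMAC/s"
```

Measured frame rates will sit below this bound, since it ignores memory traffic and layer shapes that do not fill all MAC units.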

Evaluation: setup

SoCs evaluated on FPGA (Xilinx XCVU440)

• Ariane core

• 1-4 NVDLA tiles

• 1-4 memory channels

Evaluation networks

Evaluation: results

Performance of NVDLA small in ESP @ 50 MHz (1 NVDLA, frames / second)

• LeNet: 3.8

• Convnet: 4.5

• SimpleNet: 1.3

• ResNet50: 0.4

Scaling NVDLA instances and DDR channels @ 50 MHz (LeNet, frames / second, normalized)

• 1 NVDLA, 1 mem ctrl: 1

• 2 NVDLA, 2 mem ctrl: 2.1

• 3 NVDLA, 3 mem ctrl: 3.1

• 4 NVDLA, 4 mem ctrl: 3.9

18x lower than NVIDIA’s results @ 1 GHz

performance preserved
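
The two claims on this slide can be checked with simple arithmetic. The sketch below uses the normalized LeNet throughput from the scaling chart (1, 2.1, 3.1, 3.9) to compute scaling efficiency, and reads "performance preserved" as a per-cycle comparison: an 18x throughput gap against a 20x clock gap (1 GHz vs 50 MHz). That reading is an interpretation, not something the slides spell out, and the helper names `scaling_efficiency` and `per_cycle_ratio` are hypothetical.

```python
# Normalized LeNet throughput from the scaling chart:
# n NVDLA tiles paired with n memory controllers, all at 50 MHz.
normalized_fps = {1: 1.0, 2: 2.1, 3: 3.1, 4: 3.9}


def scaling_efficiency(speedups: dict[int, float]) -> dict[int, float]:
    """Speedup divided by instance count; 1.0 means perfectly linear scaling."""
    return {n: s / n for n, s in speedups.items()}


def per_cycle_ratio(throughput_gap: float, fast_hz: float, slow_hz: float) -> float:
    """Work per clock cycle on the slow platform relative to the fast one.

    A value near 1.0 means the throughput gap is explained by clock frequency alone.
    """
    return (fast_hz / slow_hz) / throughput_gap


for n, eff in scaling_efficiency(normalized_fps).items():
    print(f"{n} NVDLA, {n} mem ctrl: efficiency {eff:.2f}")

# 18x lower throughput at 50 MHz than NVIDIA's results at 1 GHz.
ratio = per_cycle_ratio(throughput_gap=18.0, fast_hz=1e9, slow_hz=50e6)
print(f"per-cycle throughput relative to NVIDIA's results: {ratio:.2f}")
```

Under these assumptions the efficiency stays close to 1.0 for all four configurations, and the per-cycle ratio comes out slightly above 1.0, which is consistent with the slide's "performance preserved" annotation.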

Thank you from the ESP team!

• sld.cs.columbia.edu: https://sld.cs.columbia.edu/

• esp.cs.columbia.edu: https://www.esp.cs.columbia.edu/

• GitHub, sld-columbia/esp: https://github.com/sld-columbia/esp

• Twitter, ColumbiaSld: https://twitter.com/ColumbiaSld

• YouTube, ESP channel: https://www.youtube.com/channel/UCYpZYCJJNGm5m-kV8aaTZSQ
