Ariane + NVDLA Seamless Third-Party IP Integration with ESP Davide Giri Kuan-Lin Chiu Guy Eichler Paolo Mantovani Nandhini Chandramoorthy (IBM Research) Luca P. Carloni CARRV 2020

TRANSCRIPT

  • Ariane + NVDLA: Seamless Third-Party IP Integration with ESP

    Davide Giri, Kuan-Lin Chiu, Guy Eichler, Paolo Mantovani, Nandhini Chandramoorthy (IBM Research), Luca P. Carloni. CARRV 2020

  • Motivation

    • SoCs are increasingly heterogeneous [1]

    • Heterogeneity increases the engineering effort [2]

    → IP reuse enables the design of complex SoCs

    • Thanks to open-source hardware (OSH) movement [3]

    → Proliferation of open-source IPs

    Seamless third-party IP integration is key!

    [1] Shao, SLCA’15 [2] Khailani, DAC’18 [3] Gupta, IEEE Computer’17

  • In this work

    Enhance ESP with support for third-party accelerators

    • ESP is our open-source platform for SoC design [4]

    [4] ESP: esp.cs.columbia.edu [5] Ariane: github.com/pulp-platform/ariane [6] NVDLA: nvdla.org

    Demonstrate integration capabilities of ESP

    • Integration of Ariane [5] and NVDLA [6]

    • Rapid FPGA prototyping

    Open-source release as part of ESP

    • Hands-on tutorial: esp.cs.columbia.edu/docs/thirdparty_acc

  • ESP overview


  • ESP architecture


  • ESP methodology


    Accelerator Flow

    • Simplified design

    • Automated integration

    SoC Flow

    • Mix & match floorplanning GUI

    • Rapid FPGA prototyping

    [Figure: ESP methodology. Labels: HLS Design Flows (Vivado HLS, Catapult HLS, Stratus HLS); RTL Design Flows; IP Library; Ariane, …; accelerator; third-party accelerator*; SoC Integration; Rapid Prototyping**]

    * By Nvidia Corporation

    ** By lewing@isc.tamu.edu, Larry Ewing and The GIMP

  • ESP methodology: SoC flow

    [Figure: SoC flow. Labels: IP Library; Ariane, …; accelerator; third-party accelerator; SoC Integration; Rapid Prototyping**]

    ** By lewing@isc.tamu.edu, Larry Ewing and The GIMP

  • Third-party IP integration with ESP


  • ESP accelerator tile


  • third-party accelerator

    Third-party accelerator flow

    • Third-party RTL and SW

    • Files list

    • Accelerator definition (XML)

    • RTL wrapper wiring

    • Makefile targets definition

    ESP accelerator flow

    • Accelerator skeleton

    • Accelerator-specific functions

    • Test behavior

    • Generate RTL

    • Test RTL

    • Instantiate into SoC

    [Figure legend: automated / manual]
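    The inputs that the slide lists for a third-party accelerator can be treated as a checklist before handing the IP to the SoC flow. The following is a minimal sketch of such a check, not part of ESP: the key names (`rtl_and_sw`, `files_list`, and so on) and the `ThirdPartyAccelerator` class are hypothetical labels chosen for illustration, and the actual file names and formats ESP expects are described in its tutorial.

```python
from dataclasses import dataclass, field

# Inputs named on the slide for a third-party accelerator.
# The identifiers are hypothetical; they are not ESP file names.
REQUIRED_INPUTS = (
    "rtl_and_sw",        # third-party RTL and software
    "files_list",        # list of source files
    "accelerator_xml",   # accelerator definition (XML)
    "rtl_wrapper",       # RTL wrapper wiring
    "makefile_targets",  # Makefile targets definition
)


@dataclass
class ThirdPartyAccelerator:
    """Tracks which integration inputs have been prepared so far."""

    name: str
    provided: set = field(default_factory=set)

    def missing_inputs(self):
        # Preserve the slide's order so the report reads like the checklist.
        return [item for item in REQUIRED_INPUTS if item not in self.provided]

    def ready_for_integration(self):
        return not self.missing_inputs()


acc = ThirdPartyAccelerator("nvdla", {"rtl_and_sw", "files_list", "accelerator_xml"})
print(acc.missing_inputs())
print(acc.ready_for_integration())
```

    With the three inputs given above, the sketch reports the RTL wrapper and the Makefile targets as still missing.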

  • Ariane + NVDLA with ESP


  • Integration of Ariane

    ESP processor tile

    • RISC-V Ariane (new!) or Sparc-v8 Leon3

    • Boot unmodified Linux

    • AXI4 (new!) or AHB bus to access memory

    • APB bus to access peripherals

    • Optional L2 private cache

    • Processor-specific interrupt controller placed in the I/O tile


  • NVDLA

    NVIDIA Deep Learning Accelerator

    • Open source

    • Fixed function

    • Highly configurable

    NVDLA small

    • 8-bit integer precision

    • 64 MAC units

    • 128 KB local memory
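    As a rough sanity check on the NVDLA small figures, the sketch below turns them into an upper bound on compute throughput and an operand count for the local memory. It assumes each MAC unit completes one multiply-accumulate per cycle, that the 50 MHz FPGA clock from the evaluation slides applies, and that 1 KB means 1024 bytes; none of these assumptions is stated on this slide.

```python
MAC_UNITS = 64             # from the slide: NVDLA small
LOCAL_MEMORY_KB = 128      # from the slide
BYTES_PER_OPERAND = 1      # 8-bit integer precision
CLOCK_HZ = 50e6            # FPGA clock quoted in the evaluation slides


def peak_macs_per_second(mac_units, clock_hz, macs_per_unit_per_cycle=1):
    """Upper bound on throughput; assumes full utilization of every MAC unit."""
    return mac_units * clock_hz * macs_per_unit_per_cycle


def operands_in_local_memory(size_kb, bytes_per_operand):
    """Number of operands that fit, assuming 1 KB = 1024 bytes."""
    return (size_kb * 1024) // bytes_per_operand


peak = peak_macs_per_second(MAC_UNITS, CLOCK_HZ)
print(f"peak: {peak / 1e9:.1f} GMAC/s")
print(f"int8 operands in local memory: {operands_in_local_memory(LOCAL_MEMORY_KB, BYTES_PER_OPERAND)}")
```

    Measured frame rates fall below this bound, since real networks do not keep every MAC unit busy on every cycle.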


  • Evaluation: setup

    SoCs evaluated on FPGA (Xilinx XCVU440)

    • Ariane core

    • 1-4 NVDLA tiles

    • 1-4 memory channels

    [Table: evaluation networks]

  • Evaluation: results

    Performance of NVDLA small in ESP @ 50 MHz (frames/second, 1 NVDLA)

    • LeNet: 3.8

    • Convnet: 4.5

    • SimpleNet: 1.3

    • ResNet50: 0.4

    Scaling NVDLA instances and DDR channels @ 50 MHz (LeNet, frames/second, normalized)

    • 1 NVDLA, 1 mem ctrl: 1

    • 2 NVDLA, 2 mem ctrl: 2.1

    • 3 NVDLA, 3 mem ctrl: 3.1

    • 4 NVDLA, 4 mem ctrl: 3.9

    18x lower than NVIDIA’s results @ 1 GHz → performance preserved
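    The "performance preserved" claim rests on simple arithmetic: the FPGA prototype runs at 50 MHz while NVIDIA's reference numbers are for 1 GHz, so a slowdown no larger than the clock ratio means per-cycle throughput did not degrade. The sketch below works through that comparison and the scaling efficiency of the multi-NVDLA runs, using only the numbers on the results slide. The function names are illustrative, and the per-cycle interpretation is a reading of the slide, not a statement taken from it.

```python
# Normalized LeNet throughput with N NVDLA tiles and N memory controllers.
LENET_SCALING = {1: 1.0, 2: 2.1, 3: 3.1, 4: 3.9}

FPGA_CLOCK_HZ = 50e6        # ESP prototype clock
REFERENCE_CLOCK_HZ = 1e9    # clock of NVIDIA's reported results
REPORTED_SLOWDOWN = 18.0    # "18x lower than NVIDIA's results @ 1 GHz"


def clock_ratio(reference_hz, fpga_hz):
    """How many times faster the reference clock is than the FPGA clock."""
    return reference_hz / fpga_hz


def scaling_efficiency(speedups):
    """Measured speedup divided by the ideal linear speedup for each tile count."""
    return {n: s / n for n, s in speedups.items()}


ratio = clock_ratio(REFERENCE_CLOCK_HZ, FPGA_CLOCK_HZ)
print(f"clock ratio: {ratio:.0f}x, reported slowdown: {REPORTED_SLOWDOWN:.0f}x")
# A slowdown at or below the clock ratio means per-cycle throughput is not worse.
print(f"slowdown within clock ratio: {REPORTED_SLOWDOWN <= ratio}")

for n, eff in scaling_efficiency(LENET_SCALING).items():
    print(f"{n} NVDLA / {n} mem ctrl: {eff:.1%} of linear")
```

    The 20x clock gap accounts for the whole 18x slowdown, and the four-tile configuration stays within a few percent of linear scaling.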

  • Thank you from the ESP team!

    • https://sld.cs.columbia.edu/

    • https://www.esp.cs.columbia.edu/

    • GitHub (sld-columbia/esp): https://github.com/sld-columbia/esp

    • Twitter (ColumbiaSld): https://twitter.com/ColumbiaSld

    • YouTube (ESP channel): https://www.youtube.com/channel/UCYpZYCJJNGm5m-kV8aaTZSQ