CMU/ECE/CALCM/Hoe ARM Research Summit, September 2019, slide-1

From “Field-Programmable” to “Programmable”

James C. Hoe

Department of ECE

Carnegie Mellon University

Classic FPGA in a Nutshell

[Figure: classic FPGA block diagram. Labels: I/O pins; programmable lookup tables (LUT) and flip-flops (FF), aka “soft logic” or “fabric”; interconnect; programmable routing.]
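
To make the “soft logic” idea concrete, here is a minimal Python sketch of one logic element: a k-input LUT modeled as a 2^k-entry truth table, feeding a flip-flop. The class names (`LUT`, `LogicElement`) and the way the configuration is passed in are illustrative assumptions, not any vendor's architecture.

```python
class LUT:
    """k-input lookup table: 'programming' it means filling in 2**k truth-table bits."""

    def __init__(self, k, truth_table):
        if len(truth_table) != 2 ** k:
            raise ValueError("truth table must have 2**k entries")
        self.k = k
        self.table = list(truth_table)

    def eval(self, inputs):
        # Pack the input bits into a table index; inputs[0] is the least significant bit.
        index = sum(bit << i for i, bit in enumerate(inputs))
        return self.table[index]


class LogicElement:
    """A LUT followed by a flip-flop that captures the LUT output on each clock."""

    def __init__(self, lut):
        self.lut = lut
        self.q = 0  # flip-flop state

    def clock(self, inputs):
        self.q = self.lut.eval(inputs)
        return self.q


# Configure a 2-input LUT as XOR, then reprogram the same element as AND.
xor_lut = LUT(2, [0, 1, 1, 0])
and_lut = LUT(2, [0, 0, 0, 1])
element = LogicElement(xor_lut)
xor_outputs = [element.clock([a, b]) for a in (0, 1) for b in (0, 1)]
element.lut = and_lut
and_outputs = [element.clock([a, b]) for a in (0, 1) for b in (0, 1)]
```

The same element computes a different function after only its table contents change, which is the sense in which the fabric is programmable.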

FPGAs as we knew them

Traditionally, FPGAs have been the bastard step-brother of ASICs. They have been forced to act like ASICs and fit themselves into the ASIC development model. . . . . . .

. . . . . . This has meant ignoring their unique strengths: reprogrammability, late binding and run-time reconfiguration.

Andre DeHon, ISFPGA 2004

Perspective on FPGAs changed when

• Microsoft (and others) got desperate enough to do this

[www.microsoft.com/en-us/research/project/project-catapult]

New FPGAs are not RTL targets

“*”

• spatial data/compute

• highly concurrent

• finely controllable

• reprogrammable

Immediate Challenges

• killer apps

• ease of development

[Xilinx Versal] [Intel Agilex]

Greater break from ASIC mentality

• Dynamism: actually use the programmability

– support more functionality on same parts cost

– achieve better performance by specializing

• Shareability: multitenancy to consume “slack”

– too much logic: partition fabric spatially

– too much throughput: repurpose fabric temporally

• Manageability: bring FPGA under OS purview

– part of compute resource pool (CPU cycles, DRAM)

– seamless interface, virtualization and isolation (security and QoS)

Dynamic Partial Reconfiguration is a key capability

DPR: what is feasible today?

Dynamic Execution Framework for Interactive Vision

[Figure: plug-and-play architecture. On the FPGA, DPR regions, DMA engines, and I/O connect through a programmable crossbar. The runtime on the embedded ARM core comprises a scheduler, a mapper (module to RP), and an interconnect and DMA configurer. Example modules MA through MI form three pipelines: MA MB MC; MD ME MF; MG MH MI.]

vision stage IP library + pipeline specifications
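
As a rough illustration of what the mapper (module to RP) step might involve, the Python sketch below assigns each stage of a pipeline specification to a reconfigurable partition, reusing partitions that already hold the needed module and listing the reconfigurations still required. The function name `map_pipeline`, the RP names, and the reuse-first policy are assumptions made for illustration; the slides do not describe the actual runtime's algorithm.

```python
def map_pipeline(pipeline, loaded):
    """Assign each stage of `pipeline` to a reconfigurable partition (RP).

    pipeline: list of distinct module names, e.g. ["MA", "MB", "MC"]
    loaded:   dict of RP name -> module currently configured there (or None)
    Returns (assignment, reconfigs): module -> RP, plus the list of
    (RP, module) pairs that still need a partial reconfiguration.
    """
    if len(pipeline) > len(loaded):
        raise ValueError("pipeline has more stages than there are RPs")

    assignment = {}
    free = dict(loaded)  # RPs not yet claimed by a stage

    # First pass: reuse RPs that already hold the needed module.
    for module in pipeline:
        for rp, current in list(free.items()):
            if current == module:
                assignment[module] = rp
                del free[rp]
                break

    # Second pass: remaining stages take a free RP and require reconfiguration.
    reconfigs = []
    for module in pipeline:
        if module not in assignment:
            rp = sorted(free)[0]
            del free[rp]
            assignment[module] = rp
            reconfigs.append((rp, module))
    return assignment, reconfigs


# Hypothetical state: two of the three modules are already resident.
assignment, reconfigs = map_pipeline(
    ["MA", "MB", "MC"], {"RP0": "MA", "RP1": "MD", "RP2": "MC"}
)
```

A real runtime would also have to reprogram the crossbar and DMA engines so that data flows through the stages in pipeline order; that part is omitted here.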

Spatial and Temporal Multitenancy

[Figure: system diagram. Labels: Flash, DRAM, PS; AXI-PCAP Bridge (AXI Master Interface, FIFO, PCAP Interface); ARM core (user code + SW management); FPGA with an Overlay Crossbar connecting RPs, DMA engines, Camera, and Display; three pipelines (MA MB MC; MD ME MF; MG MH MI), each running from Camera to Display.]

We Actually Use This

https://www.cs.cmu.edu/smartheadlight

[Figure labels: FPGA; Zynq SoC-FPGA; camera; SLM; beamsplitter.]

Time-Multiplexing Feasibility [FPL2018]

• Interleaving (a) 2 pipelines and (b) 3 pipelines

• Pipelines differ between 1 and 6 PR partitions

[Plots: results at 720p and 1080p, with batch size as an axis.]

It takes more than doing PR in quick succession!!!
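
One way to see why batch size matters is a back-of-the-envelope model, sketched below in Python. It assumes each interleaved pipeline pays one reconfiguration per round and then processes a batch of frames, and that all pipelines must sustain the same frame rate. This model, the function name `min_batch_size`, and the numbers in the example are assumptions for illustration; they are not taken from the [FPL2018] measurements.

```python
import math


def min_batch_size(frame_rate_hz, reconfig_s, per_frame_s):
    """Smallest batch size at which interleaved pipelines keep up with the frame rate.

    reconfig_s:  reconfiguration time paid when switching to each pipeline (seconds)
    per_frame_s: processing time per frame for each pipeline (seconds)

    One round costs sum(reconfig_s) + B * sum(per_frame_s) and must finish
    within B frame periods, so B >= sum(reconfig_s) / slack, where
    slack = frame period - sum(per_frame_s).
    Returns None when no batch size can work.
    """
    frame_period = 1.0 / frame_rate_hz
    slack = frame_period - sum(per_frame_s)
    if slack <= 0:
        return None  # compute alone exceeds the frame period
    return max(1, math.ceil(sum(reconfig_s) / slack))


# Hypothetical numbers: 30 frames/s, 25 ms per reconfiguration, 10 ms per frame.
two_pipelines = min_batch_size(30, [0.025, 0.025], [0.010, 0.010])
three_pipelines = min_batch_size(30, [0.025] * 3, [0.010] * 3)
```

Under this model a larger batch amortizes reconfiguration time, but it also adds buffering and latency, which is one reason back-to-back PR alone is not enough.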

Cost and Energy/Power Benefits [FPL2019]

• Application case study with 6 modules

– color-based detector triggers follow-up processing

– meets throughput and latency requirements

– ~3x logic saving (7x $ saving in parts cost)

– ~30% power/energy saving in worst case

[Figure: Static Mapping places detect, stereo, SIFT, etc. side by side on a “large” FPGA; the per-frame timeline shows busy and idle intervals for each module. Timeshare uses DPR on a “small” FPGA; the per-frame timeline runs det., stereo, det., SIFT, det. in sequence.]
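
The intuition behind the logic saving can be sketched in a few lines of Python. The sketch assumes static mapping needs room for every module at once, while timesharing needs room only for the largest module resident at any time; it ignores the static region, crossbar, and PR overheads. The module sizes are hypothetical and are not the numbers from the [FPL2019] case study.

```python
def logic_required(module_sizes, timeshare):
    """Fabric needed under the two mappings (same units as the module sizes).

    Static mapping: all modules are resident at once, so sizes add up.
    Timesharing (assumed one module resident at a time): the largest module
    sets the requirement.
    """
    sizes = list(module_sizes.values())
    return max(sizes) if timeshare else sum(sizes)


# Hypothetical module sizes in arbitrary logic units.
sizes = {"detect": 20, "stereo": 60, "SIFT": 40}
static_logic = logic_required(sizes, timeshare=False)
shared_logic = logic_required(sizes, timeshare=True)
saving = static_logic / shared_logic
```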

Today’s Practical Constraints

• Number and size of PR partitions fixed a priori

– too few/too large: internal fragmentation

– too many/too small: external fragmentation

• Not all PR partitions are equal, even if same interface and shape

– a module needs a different bitstream for each partition it goes into

– build and store up to M×N bitstreams for N partitions and M modules

• PR is not all that fast . . .
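
The first two constraints can be put into numbers with a short Python sketch. The function names and all sizes below are hypothetical, chosen only to illustrate the M×N bound and internal fragmentation in a fixed-size partition.

```python
def bitstream_count(num_modules, num_partitions):
    """Upper bound on partial bitstreams: one per (module, partition) pair."""
    return num_modules * num_partitions


def internal_fragmentation(partition_size, module_sizes):
    """Fraction of a fixed-size partition left unused by each module placed in it."""
    waste = {}
    for name, size in module_sizes.items():
        if size > partition_size:
            raise ValueError(f"{name} does not fit in the partition")
        waste[name] = 1 - size / partition_size
    return waste


# Hypothetical example: 9 modules, 6 equally sized partitions of 10,000 LUTs.
total_bitstreams = bitstream_count(9, 6)
waste = internal_fragmentation(
    10_000, {"detect": 2_500, "stereo": 10_000, "SIFT": 7_500}
)
```

Sizing every partition for the largest module keeps all modules placeable but leaves small modules with most of their partition idle; shrinking the partitions reverses the trade-off.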

PR could be a lot better if need be

• Reconfiguration time could be a lot faster

– increase raw speeds

– increase concurrency

• Tools could be more powerful and friendly

– more push-button GUI, less manual scripting

– higher-level interfaces (incorporating management and scheduling) in line with use models (e.g., AOC)

• More flexibility, e.g., soft partition boundaries, relocatable bitstreams

• Time-multiplex contextful modules?

Hopes and Dreams: Spatial/Temporal Multitasking Fabric?

• Manage and schedule FPGA fabric like one would with CPU cycles and memory

• First-class support of PR compute modules

– hard transport with standard, virtually private interfaces to memory, I/O & resources

– loadable modules designed independently of placement and surroundings

– security and QoS provisions for multitenant sharing

Sponsored by the CONIX Research Center, one of six centers administered by the JUMP phase of the Focused Center Research Program (FCRP), a Semiconductor Research Corporation program sponsored by MARCO and DARPA.