
The Molen Compiler Backend for Reconfigurable Architectures

Computer Engineering
TU DELFT
The Netherlands

Elena Moscu Panainte
Carlo Galuzzi
Yana Yankova
Koen Bertels
Stamatis Vassiliadis


OUTLINE

Background
– Molen machine organization
– Molen programming paradigm
Molen Compiler
Optimizations for Dynamic Reconfiguration
– Intra/Interprocedural instruction scheduling
– Compiler-driven FPGA area allocation
Results
Conclusions


The Molen Machine Organization

Main components:
• GPP
• Reconfigurable Processor
• Arbiter
• Exchange Registers


The Molen Prototype

[Figure: Molen machine organization]

Molen prototype implemented on Virtex II Pro


The Molen Programming Paradigm (I)

A one-time architectural extension of a few instructions:
– Two* instructions for controlling the FPGA
  • SET <address>: for hardware configuration
  • EXECUTE <address>: for controlling the execution on the FPGA
– Two move instructions for passing values to and from the GPP register file and the FPGA

The FPGA has an associated special set of registers: Exchange Registers (XRs)


The Molen Programming Paradigm (II)

Example: C code: res = alpha(param1, param2);

    movtx XR1 ← param1            Send param.
    movtx XR2 ← param2
    set <address_alpha_set>       HW reconfiguration
    exec <address_alpha_exec>     HW execution
    movfx res ← XR3               Return result
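
The call sequence above can be sketched as a small code generator. This is only an illustration in Python, not the Molen compiler's implementation: the function name `lower_call` is hypothetical, and the register assignment (parameters in XR1, XR2, ..., result in the next XR) is an assumption read off the single example on the slide.

```python
def lower_call(result, func, args):
    """Emit a Molen-style instruction sequence for `result = func(args...)`."""
    code = []
    # Assumption: parameters go into consecutive exchange registers XR1, XR2, ...
    for k, arg in enumerate(args, start=1):
        code.append(f"movtx XR{k} <- {arg}")       # send parameter
    code.append(f"set <address_{func}_set>")       # HW reconfiguration
    code.append(f"exec <address_{func}_exec>")     # HW execution
    # Assumption: the result is returned in the XR after the last parameter.
    code.append(f"movfx {result} <- XR{len(args) + 1}")
    return code


for line in lower_call("res", "alpha", ["param1", "param2"]):
    print(line)
```

For the slide's example this prints the same five instructions, in the same order, as the listing above.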



The Molen Compiler

[Figure: structure of the Molen compiler. Labels: C application (MAIN.c, File_n.c); SUIF frontend; Machine SUIF backend framework; PowerPC backend; Molen Extensions: ISA extension (SET/EXEC), Register extension; Molen Optimizations; Compiler FCCM]


PowerPC Backend

• PowerPC instruction generation
• PowerPC register allocation
• PowerPC EABI stack frame allocation

+

• SET/EXECUTE - ISA extension
• XRs - Register extension



Molen Compiler: Optimizations

Challenge: huge reconfiguration latency (for SET instruction)

Repetitive reconfiguration:
– Performance decrease of one order of magnitude
Hardware kernel executions:
– Speedup of one order of magnitude


Solutions

Hardware solutions:
– Partial configurations
– Configuration Prefetching

Compiler solution:
– Scheduling of SET instructions
  • Intraprocedural level
  • Interprocedural level
– Compiler-driven FPGA area allocation

Software solution:
– Application rewriting (code transformation)


Instruction Scheduling

Compiler Optimizations

a) Repetitive reconfigurations

    SET op1
    EXEC op1
    …………

b) Single reconfiguration

    [Figure: the code from a) with SET op1 shown once]


Instruction Scheduling

Compiler Optimizations

a) Repetitive reconfigurations

    SET op1
    EXEC op1
    …………

b) Multiple hardware operations

    SET op1
    EXEC op1
    SET op2
    EXEC op2


Speculation-Based Instruction Scheduling

Algorithm based on:
– Edge profiling
– Speculation
  • SET instruction does not cause any exception
– Information about FPGA area conflicts

[Figure: control-flow graph with nodes A, B, C and D, edge profile counts 25 and 1000, and candidate positions for SET op1]

[Figure: OP1 and OP2 placed on the FPGA]


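Instruction Scheduling
STEP 1: Anticipation

STEP 1: Iterative backward data-flow analysis for partial anticipability

Local information:
– Gen(n)
– Kill(n)

Global information:

$$\mathrm{PANTin}(i) = \mathrm{Gen}(i) \cup \big(\mathrm{PANTout}(i) - \mathrm{Kill}(i)\big)$$

$$\mathrm{PANTout}(i) = \bigcup_{j \in \mathrm{Succ}^{+}(i)} \mathrm{PANTin}(j)$$

[Figure: node i with Gen(i) and Kill(i); OUT(i) is combined from IN(s1), IN(s2), IN(s3), IN(s4) of its successors, and IN(i) is derived from OUT(i)]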
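
The two equations above can be iterated to a fixed point. The sketch below is a minimal illustration, not the Machine SUIF pass itself. It assumes that Gen(n) holds the hardware operations executed in block n, that Kill(n) holds the operations whose configuration block n would overwrite, and that the successor set in the PANTout equation is the plain set of CFG successors. The function name `partial_anticipability` and the four-block CFG are hypothetical.

```python
def partial_anticipability(succ, gen, kill):
    """Backward 'may' analysis: iterate PANTin/PANTout until nothing changes."""
    pant_in = {n: set() for n in succ}
    pant_out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            # PANTout(n): union of PANTin over the successors of n
            out = set()
            for s in succ[n]:
                out |= pant_in[s]
            # PANTin(n) = Gen(n) U (PANTout(n) - Kill(n))
            new_in = gen[n] | (out - kill[n])
            if out != pant_out[n] or new_in != pant_in[n]:
                pant_out[n], pant_in[n] = out, new_in
                changed = True
    return pant_in, pant_out


# Hypothetical diamond CFG: A -> B, A -> C, B -> D, C -> D.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
gen = {"A": set(), "B": set(), "C": set(), "D": {"op1"}}   # op1 is executed in D
kill = {"A": set(), "B": set(), "C": {"op1"}, "D": set()}  # C would overwrite op1

pant_in, pant_out = partial_anticipability(succ, gen, kill)
print(pant_in)
```

In this toy graph op1 is partially anticipated at the entry of A and B, but not at the entry of C, because C kills it.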


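Instruction Scheduling
STEP 2: Availability

STEP 2: Iterative forward data-flow analysis for availability

Local information:
– Gen(n)
– Kill(n)

Global information:

$$\mathrm{AVALout}(i) = \mathrm{Gen}(i) \cup \big(\mathrm{AVALin}(i) - \mathrm{Kill}(i)\big)$$

$$\mathrm{AVALin}(i) = \bigcap_{j \in \mathrm{Pred}(i)} \mathrm{AVALout}(j)$$

[Figure: node i with Gen(i) and Kill(i); IN(i) is combined from OUT(p1), OUT(p2), OUT(p3), OUT(p4) of its predecessors, and OUT(i) is derived from IN(i)]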
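
The availability equations can be solved the same way, this time forward and with intersection at joins. The sketch below is an illustration under stated assumptions: Gen(n) is taken to be the operations configured in block n, Kill(n) the operations whose configuration block n overwrites, nothing is available at the entry block, and all other sets start from the full set of operations, as is usual for an intersection-based analysis. The function name `availability` and the example CFG are hypothetical.

```python
def availability(pred, gen, kill, universe):
    """Forward 'must' analysis: iterate AVALin/AVALout until nothing changes."""
    avail_in = {n: set(universe) for n in pred}
    avail_out = {n: set(universe) for n in pred}
    changed = True
    while changed:
        changed = False
        for n in pred:
            if pred[n]:
                # AVALin(n): intersection of AVALout over the predecessors of n
                new_in = set(universe)
                for p in pred[n]:
                    new_in &= avail_out[p]
            else:
                new_in = set()  # assumption: nothing is configured at entry
            # AVALout(n) = Gen(n) U (AVALin(n) - Kill(n))
            new_out = gen[n] | (new_in - kill[n])
            if new_in != avail_in[n] or new_out != avail_out[n]:
                avail_in[n], avail_out[n] = new_in, new_out
                changed = True
    return avail_in, avail_out


# Hypothetical diamond CFG: A -> B, A -> C, B -> D, C -> D.
pred = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
gen = {"A": {"op1"}, "B": set(), "C": set(), "D": set()}   # op1 configured in A
kill = {"A": set(), "B": set(), "C": {"op1"}, "D": set()}  # C overwrites op1

avail_in, avail_out = availability(pred, gen, kill, {"op1"})
print(avail_in)
```

Here op1 is available on entry to B and C, but not on entry to D, because it survives only along the path through B.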


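Instruction Scheduling
STEP 3: Minimum s-t Cut

Anticipation Graph for each HW op:

$$ESS = \{(u, v) \mid op \notin \mathrm{AVALout}(u) \wedge op \in \mathrm{PANTin}(v)\}$$

Minimum s-t cut for finding the best insertion edges

[Figure: anticipation graph for op2 with source s, sink t and basic blocks B7, B8, B9, B10, B13, B14; edge weights 10, 200 and INF; the minimum s-t cut is marked]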
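
A minimum s-t cut can be obtained from a maximum flow. The sketch below uses the Edmonds-Karp method, which is a standard choice and not necessarily the algorithm used in the Molen compiler. It assumes edge weights are profiled execution counts and that edges leaving s or entering t carry infinite weight so they are never cut. The function name `min_st_cut` and the small graph are hypothetical and do not reproduce the figure.

```python
from collections import deque

INF = float("inf")


def min_st_cut(capacity, s, t):
    """Return (cut_value, cut_edges) for a directed graph {(u, v): capacity}.

    Assumes every s-t path contains at least one finite edge.
    """
    residual = dict(capacity)
    adj = {}
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)       # reverse edge for the residual graph
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break
        # Walk back from t to collect the augmenting path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck

    # Cut edges go from the side still reachable from s to the other side.
    side = set(parent)
    cut = sorted(e for e in capacity if e[0] in side and e[1] not in side)
    return flow, cut


# Hypothetical graph: a loop body executed 200 times, entered 10 times.
edges = {
    ("s", "entry"): INF,
    ("entry", "header"): 10,
    ("header", "body"): 200,
    ("body", "t"): INF,
}
print(min_st_cut(edges, "s", "t"))
```

In this example the cut selects the edge that is executed 10 times, so a SET placed there would run far less often than one placed on the 200-count edge inside the loop.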


Interprocedural Instruction Scheduling

Goal: anticipation of SET instructions at interprocedural level

    HW op    Initial    Final
    SAD      117084     1
    DCT      1152       1
    IDCT     1152       1

[Figure: FPGA area allocation for SAD, IDCT and DCT]


Step 1: Construction of the Call Graph

• We use the suifbrowser package
• No indirect procedure calls
• The call graph is a DAG

[Figure: MPEG2 Encoder source files motion.c, transform.c and putseq.c, with the procedures int sad(..), void dct(..) and void idct(..)]


Step 2: Propagation of Hardware Reconfigurations

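• Interprocedural data-flow analysis
• Backward propagation
• For each procedure compute LRMOD and RMOD

$$\mathrm{LRMOD}(p) = \begin{cases} Rop, & \text{if } p \text{ is executed on the FPGA} \\ \emptyset, & \text{otherwise} \end{cases}$$

$$\mathrm{RMOD}(p) = \mathrm{LRMOD}(p) \cup \bigcup_{s \in \mathrm{Succ}(p)} \mathrm{RMOD}(s)$$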
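
Because the call graph is a DAG, RMOD can be computed with a memoized depth-first traversal. The sketch below is an illustration only. The function name `compute_rmod` is hypothetical, and the call graph is an assumption loosely modeled on the MPEG2 encoder procedures named in the slides: which procedure calls `sad`, `dct` and `idct` is not stated in the source.

```python
def compute_rmod(calls, lrmod):
    """RMOD(p) = LRMOD(p) U union of RMOD(s) over the callees s of p."""
    rmod = {}

    def visit(p):
        if p not in rmod:
            result = set(lrmod.get(p, ()))
            for callee in calls.get(p, ()):
                result |= visit(callee)   # terminates because the graph is a DAG
            rmod[p] = result
        return rmod[p]

    for p in list(calls) + list(lrmod):
        visit(p)
    return rmod


# Assumed call graph (callee lists per procedure).
calls = {
    "putseq": ["motion_estimation", "transform", "itransform"],
    "motion_estimation": ["sad"],
    "transform": ["dct"],
    "itransform": ["idct"],
}
# LRMOD: procedures executed on the FPGA contribute their own operation.
lrmod = {"sad": {"sad"}, "dct": {"dct"}, "idct": {"idct"}}

rmod = compute_rmod(calls, lrmod)
print(rmod["putseq"])
```

With this input every hardware operation propagates backward to `putseq`, while `transform` only sees `dct`.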


Step 3: Conflict Propagation and Instruction Scheduling

Compute CF for each procedure

$$\mathrm{CF}(p) = \{\, op_i \in \mathrm{RMOD}(p) \mid \exists\, op_j \in \mathrm{RMOD}(p),\ op_i \neq op_j \,\}$$

for each edge <pi,pj> in the call graph
    for each op in CF(pi) and [RMOD(pj)-CF(pj)]
        insert SET op in pi where pj is called
for each op in RMOD(root) – CF(root)
    insert SET op at the application entry point
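
The insertion rule above can be sketched as follows. This is an illustration under assumptions, not the authors' code: CF(p) is read here as the operations in RMOD(p) that are in FPGA area conflict with some other operation in RMOD(p), the conflict relation is supplied as a set of unordered pairs, and `putseq` is treated as the root of a toy call graph. The names `conflict_sets` and `schedule_sets` are hypothetical.

```python
def conflict_sets(rmod, conflicts):
    """CF(p): ops in RMOD(p) that conflict with another op in RMOD(p)."""
    return {
        p: {a for a in ops for b in ops
            if a != b and frozenset((a, b)) in conflicts}
        for p, ops in rmod.items()
    }


def schedule_sets(calls, rmod, cf, root):
    """Return (procedure, placement) pairs for the SET instructions to insert."""
    insertions = []
    for pi, callees in calls.items():
        for pj in callees:
            # ops conflicting in the caller but conflict-free in the callee
            for op in sorted(cf[pi] & (rmod[pj] - cf[pj])):
                insertions.append((pi, f"SET {op} before call {pj}"))
    # ops without conflicts anywhere are configured once at program entry
    for op in sorted(rmod[root] - cf[root]):
        insertions.append((root, f"SET {op} at entry"))
    return insertions


calls = {
    "putseq": ["motion_estimation", "transform", "itransform"],
    "motion_estimation": ["sad"], "transform": ["dct"], "itransform": ["idct"],
}
rmod = {
    "putseq": {"sad", "dct", "idct"},
    "motion_estimation": {"sad"}, "transform": {"dct"}, "itransform": {"idct"},
    "sad": {"sad"}, "dct": {"dct"}, "idct": {"idct"},
}
all_conflict = {frozenset(p) for p in [("sad", "dct"), ("sad", "idct"), ("dct", "idct")]}

for proc, where in schedule_sets(calls, rmod, conflict_sets(rmod, all_conflict), "putseq"):
    print(proc, ":", where)
```

When all three operations conflict, the sketch places one SET in `putseq` before each of the three calls. With an empty conflict set it instead places all three SET instructions once at the entry point.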


putseq:
    …..
    SET sad
    call motion_estimation
    ……
    SET dct
    call transform
    …….
    SET idct
    call itransform


Compiler-driven FPGA area allocation

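Example

[Figure: an FPGA with area units 1 … 12 and regions labeled FIX and RW; ROP 1 shown with units 1 2 3, ROP 2 with units 1 2 3 4, ROP 3 with units 1 … 8]

Trace: n(Rop1) = 4; n(Rop2) = 2; n(Rop3) = 1

[Figure: placement of Rop1, Rop2 and Rop3 on the FPGA area units 1 … 12]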


FIX/RW Algorithm

Rops selection: FIX / RW

0-1 integer linear programming problem
– min:

$$\sum_{Rop_i \in ROP} x_i \cdot A_i \cdot n(T)$$

– constraints:

$$\begin{cases}
A_1 \cdot x_1 + \sum_{Rop_j \in ROP} A_j \cdot x_j \le S \\
A_2 \cdot x_2 + \sum_{Rop_j \in ROP} A_j \cdot x_j \le S \\
\dots \\
A_i \cdot x_i + \sum_{Rop_j \in ROP} A_j \cdot x_j \le S \\
\dots \\
A_n \cdot x_n + \sum_{Rop_j \in ROP} A_j \cdot x_j \le S
\end{cases}$$
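
For a handful of operations the FIX/RW selection can be explored by brute force instead of an ILP solver. The sketch below is an illustration under several assumptions that are not confirmed by the slides: each operation is either FIX (configured once) or RW (reconfigured on every use), every RW operation must fit on the FPGA next to all FIX operations, and the cost of an RW operation is its area times its number of occurrences in the trace. The areas 3, 4 and 8 and the total area 12 are a reading of the example figure. The name `best_fix_rw` is hypothetical.

```python
from itertools import product


def best_fix_rw(areas, counts, total_area):
    """Try every FIX/RW assignment and keep the cheapest feasible one."""
    names = sorted(areas)
    best = None
    for choice in product(("FIX", "RW"), repeat=len(names)):
        assign = dict(zip(names, choice))
        fixed_area = sum(areas[r] for r in names if assign[r] == "FIX")
        rw_areas = [areas[r] for r in names if assign[r] == "RW"]
        # Feasible if the FIX ops fit, and the largest RW op fits beside them.
        if fixed_area + max(rw_areas, default=0) > total_area:
            continue
        # Assumed cost model: area x trace count for every reconfigured op.
        cost = sum(areas[r] * counts[r] for r in names if assign[r] == "RW")
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


areas = {"Rop1": 3, "Rop2": 4, "Rop3": 8}    # assumed from the example figure
counts = {"Rop1": 4, "Rop2": 2, "Rop3": 1}   # trace counts n(Rop) from the slide
print(best_fix_rw(areas, counts, 12))
```

Under these assumptions the cheapest feasible choice keeps Rop1 fixed and lets Rop2 and Rop3 share the remaining area. Exhaustive search grows as 2^n, which is why a 0-1 ILP formulation is the natural fit for larger operation sets.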


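FIX/RW/SW Algorithm

min:

$$\sum_{i=1}^{n} \mathit{cost\_fix}_i \cdot \mathit{xfix}_i + \sum_{i=1}^{n} \mathit{cost\_rw}_i \cdot \mathit{xrw}_i + \sum_{i=1}^{n} \mathit{cost\_sw}_i \cdot \mathit{xsw}_i$$

constraints:

$$\begin{cases}
A_1 \cdot \mathit{xrw}_1 + \sum_{j=1}^{n} A_j \cdot \mathit{xfix}_j \le S \\
A_2 \cdot \mathit{xrw}_2 + \sum_{j=1}^{n} A_j \cdot \mathit{xfix}_j \le S \\
\dots \\
A_i \cdot \mathit{xrw}_i + \sum_{j=1}^{n} A_j \cdot \mathit{xfix}_j \le S \\
\dots \\
A_n \cdot \mathit{xrw}_n + \sum_{j=1}^{n} A_j \cdot \mathit{xfix}_j \le S
\end{cases}$$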



Intraprocedural Instruction Scheduling Algorithm

• Optimization implemented as a MachineSUIF pass
• Target application: M-JPEG encoder multimedia benchmark
• GPP included in the Molen prototype: IBM PowerPC 405 at 250 MHz
• Functions executed on the FPGA:
  – DCT (2D Discrete Cosine Transform)
  – Quantization
  – VLC (Variable Length Coding)


Xilinx IP cores for DCT, Quant and VLC

Simple scheduling: 10x slowdown for DCT

              HW Execution                            SW Execution
    Op Name   EXEC [cycle]  Area [slice]  SET [cycle]  One call [cycle]  %Total M-JPEG
    DCT       416           848           431771       44396             80 %
    Quant     73            397           202073       1494              3 %
    VLC       272           193           98237        6921              12.5 %
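
The 10x figure for DCT can be checked against the table. The sketch below assumes that "simple scheduling" means one SET followed by one EXEC for every call, and it ignores parameter transfer and any other overhead. The helper names are hypothetical.

```python
# (EXEC cycles, SET cycles, SW cycles for one call), copied from the table
ops = {
    "DCT": (416, 431771, 44396),
    "Quant": (73, 202073, 1494),
    "VLC": (272, 98237, 6921),
}


def slowdown_with_set(exec_cycles, set_cycles, sw_cycles):
    """Hardware cost relative to software when every call reconfigures."""
    return (set_cycles + exec_cycles) / sw_cycles


def speedup_without_set(exec_cycles, sw_cycles):
    """Software cost relative to hardware once the configuration is in place."""
    return sw_cycles / exec_cycles


for name, (exec_c, set_c, sw_c) in ops.items():
    print(f"{name}: slowdown with SET per call = "
          f"{slowdown_with_set(exec_c, set_c, sw_c):.1f}x, "
          f"speedup with SET hoisted = {speedup_without_set(exec_c, sw_c):.1f}x")
```

For DCT the first ratio is (431771 + 416) / 44396, which is about 9.7 and in line with the 10x slowdown quoted on the slide.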



Interprocedural Instruction Scheduling Algorithm

M-JPEG encoder:
– input: 30 frames from “tennis”, 256x256
– Hardware operations: DCT, Quantization, VLC
MPEG2 encoder:
– input: 3 standard test frames
– Hardware operations: SAD, DCT, IDCT


M-JPEG encoder:

                             With interprocedural optimization
    HW op [#SET]   Initial   No cf   DCT-Quant cf   DCT-VLC cf   Quant-VLC cf   All cf
    DCT            61440     1       15360          15360        1              15360
    Quant          15360     1       15360          1            15360          15360
    VLC            15360     1       1              15360        15360          15360


MPEG2 encoder:

                             With interprocedural optimization
    HW op [#SET]   Initial   No cf   SAD-DCT cf   SAD-IDCT cf   DCT-IDCT cf   All cf
    SAD            117084    1       3            3             1             3
    DCT            1152      1       3            1             3             3
    IDCT           1152      1       1            3             3             3


Conclusions

• The proposed compiler optimization can significantly reduce the number of performed reconfigurations and improve the overall performance
• The anticipation of the SET instructions will allow the hardware reconfigurations to be performed in parallel with the GPP execution


Thank you!