

Research Projects

Computing Systems

REFLECT: Rendering FPGAs to Multi-Core Embedded Computing (1)

Zlatko Petrov and Kamil Krátký (Honeywell), Pedro C. Diniz (INESC-ID), João M.P. Cardoso, José Carlos Alves, and João Canas Ferreira (FEUP), Koen Bertels, Georgi Kuzmanov, Razvan Nane, and Vlad-Mihai Sima (TUD), Jürgen Becker, Michael Hübner, Florian Thoma, Lars Braun, and Matthias Kühnle (KIT), George Constantinides, Wayne Luk, and José Gabriel de F. Coutinho (IC), Bryan Olivier and Hans van Someren (ACE), Fernando Gonçalves (CW)

Introduction

The relentless increase in the capacity of Field-Programmable Gate Arrays (FPGAs) has made them vehicles of choice for both prototypes and final products of on-chip multi-core, heterogeneous and reconfigurable systems. Multiple cores can be embedded as hard or soft cores and can have customizable instruction sets, multiple distributed RAMs and/or configurable interconnections. Their flexibility allows them to achieve, via customization, orders of magnitude better performance than conventional computing systems. Programming these systems, however, is extremely cumbersome and error-prone, hampering their widespread adoption and limiting their true computational potential.

Technology Description

The REFLECT project is developing, implementing and evaluating a novel compilation and synthesis system approach for FPGA-based platforms. The approach relies on Aspect-Oriented Specifications to convey critical domain knowledge to all development steps.

Project Objectives

Make reconfigurable technology accessible:
o Lower the barrier to adoption of the technology
o Facilitate program portability to new architectures

Improve productivity:
o Accelerate design cycles by more than two orders of magnitude
o Give the user full control of the development flow stages in a consistent, systematic way

Bring to the development flow:
o User’s knowledge about the algorithm
o Non-functional requirements
o Flexibility to define properties of the target FPGA and memory organization
o Best design practices represented by design patterns and HW/SW templates

REFLECT’s Repository of Applications

Avionics: 3D Path Planning and Stereo Navigation
Audio: MPEG audio encoder and G729 voice encoder

Technical plan

1 http://www.reflect-project.eu

Analyze applications to identify suitable aspects, design patterns and HW/SW templates, as well as reconfiguration schemes

Develop techniques for configuration and reconfiguration based on REFLECT’s aspect-oriented concept

Develop aspects, design patterns, and HW/SW templates

Specify a new intermediate representation that supports orthogonal views, e.g., CDFG-view and Aspect-view

Develop techniques for data type and word-length optimizations

Develop techniques for automatic HW generation

Develop techniques for cost-effective mapping of computations to reconfigurable hardware (e.g., FPGAs) by means of a domain-specific language (LARA)

Develop unqualified development tools

Evaluate and validate

Advances over State-of-the-art

The REFLECT approach intends to solve some of the problems of efficiently mapping computations to FPGA-based systems. In particular, the use of aspects and strategies will allow developers to try different design patterns and to arrive at designs guided by non-functional requirements. To the best of our knowledge, the REFLECT design flow is the first approach to offer systematic control of all the compilation stages and the first to relate non-functional requirements to different design patterns and optimizations, both specified in a domain-specific language named LARA.

Expected results

Enable one application, multiple designs according to different customer requirements with reduced V&V cost and overall development cost:
o In a traceable way through the notation of Requirements-Aspects-Design Patterns-HW templates
o With pre-verified Design Patterns and HW Templates

Systematic approach to Guide Design Flow Stages:
o Limiting design space given requirements and derived aspects
o Bringing an opportunity to achieve cost-effective designs

Allow Specification of Reusable Design Patterns and Best Practices:
o Capture and codify application and platform specific knowledge and expertise.

Acknowledgment

This work was partially supported by the European Community under the Framework Programme 7 (FP7) under contract No. 248976. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the European Community.

This document and all information and expression contained herein are the property of Honeywell International Inc., are loaned in confidence, and may not, in whole or in part, be used, duplicated, or disclosed for any purpose without prior written permission of Honeywell International Inc. All rights reserved.

REFLECT: Rendering Field Programmable Gate Arrays (FPGAs) to Multi-Core Embedded Computing

Objectives: This project is developing, implementing and evaluating a novel compilation and synthesis system approach for FPGA-based platforms. The REFLECT approach emphasizes:
o The use of a High-level Imperative Programming Abstraction
o The automation of Low-level and error-prone Mapping Steps
o The possibility to allow programmers to assist Tools in Guiding the Optimizations and Mapping
o The possibility to use previous design and mapping experiences
o The use of Hardware Templates

Approach: We rely on Aspect-Oriented (AO) Specifications to convey critical domain knowledge to a mapping engine while preserving the advantages of a high-level imperative programming paradigm in early software development and portability. We leverage AO specifications and a set of transformations to generate an intermediate representation using an extensible mapping language (LARA). LARA specifications will allow the exploration of alternative architectures and run-time adaptive strategies enabling the generation of flexible hardware cores that can be easily incorporated into larger multi-core designs. We will evaluate the effectiveness of the proposed approach using partner-provided codes from the domain of audio/video processing and real-time avionics.

PROJECT PARTNERS

Associated Compiler Experts b.v.: Bryan Olivier
Technische Universiteit Delft: Koen Bertels
Imperial College of Science Technology and Medicine: George A. Constantinides
Karlsruhe Institute of Technology: Jürgen Becker
Honeywell International, s.r.o: Zlatko Petrov (Project Coordinator)
Faculdade de Engenharia da Universidade do Porto: João M. P. Cardoso (Scientific Coordinator)
Projectos de Circuitos e Sistemas Electrónicos s.a.: Fernando Gonçalves
INESC-ID: Pedro Diniz (Scientific Coordinator)

PARTNERS COMPETENCES

Experience on:
o Application customization and architecture exploration for FPGAs
o Retargetable compilers for Embedded Systems and FPGAs from high-level languages
o Digital IP cores
o Dynamically reconfigurable System-On-a-Chip systems

Strong Track Record of:
o Building advanced compilation prototypes
o Developing, maintaining and deploying industrial compilers

Market Leaders in:
o Embedded Heterogeneous HPC
o Avionics
o Critical real-time systems
o Consumer and broadcasting electronics

Project Coordinator: Zlatko Petrov (Honeywell International s.r.o., Czech Republic)
Scientific Coordinators: João M. P. Cardoso (FEUP, Portugal) and Pedro C. Diniz (INESC-ID, Portugal)
Start Date: 1st January 2010
End Date: 31st December 2012
Project Cost: 3.7 M€
Project Funding: 2.7 M€
Project Web site: http://www.reflect-project.eu

REFLECT Design Flow - Highlights

“One application, multiple designs according to different customer requirements” with reduced V&V cost and overall development cost:
o In a traceable way through the notation of Requirements-Aspects-Design Patterns-Hardware Templates
o With pre-verified Design Patterns and Hardware Templates

Systematic approach to Guide Design Flow Stages:
o Limiting design space given requirements and derived aspects
o Bringing an opportunity to achieve cost-effective designs

Allows Specification of Reusable Design Patterns and Best Practices:
o Capture application and platform specific knowledge

[Figure: Hardware/Software Flow. Blocks shown: Application (C); C Front-End; LARA Front-End; Aspects and Design Patterns (LARA); Harmonic; CoSy; CDFG-IR; Aspect-IR; weaving; Optimizer (Software/Hardware); Design-Space Exploration (DSE); Best Practices; Hardware/Software Templates; Kernels for Sw and Hw Components (C) + Annotations; Back-End (code generators); VHDL-RTL; Assembly.]

High-Level Aspects (Specification of Non-Functional Requirements): High Performance

Design Patterns (applied by strategies using multiple low-level aspects): Task-Pipelining + Streaming + Loop Tiling + Loop Unrolling + Data Reuse

Hardware Templates (used for implementation): FIFOs between producer/consumer tasks + Specific Hardware Core to Implement FFT + BRAMs + Distributed RAMs + DRAMs


REFLECT Design Flow – Conceptual Levels

Four types of aspect modules:

Specializing: Specialization of an input code to be more suitable for the particular target system (e.g., specializing data types, numeric precision, and input/output data rates).

Optimizing, Mapping and Guiding: Specification of optimizations and mapping actions to guide the tools in some decisions (e.g., mapping array variables to memories, specifying FIFOs to communicate data between cores).

Monitoring: Specification of which implementation features, such as current value of a variable or the number of items written to a specific data structure, provide insight for the refinement of other implementation-related aspects.

Retargeting: Specification of certain characteristics of the target system in order to make the tools adaptable and aware of those characteristics (i.e., retargetable).
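As an illustration of the Monitoring and Mapping aspect types, the following is a minimal C sketch of a software FIFO between a producer and a consumer task, with a counter of items written added as monitoring code. The names (fifo_t, fifo_push, fifo_pop, FIFO_CAPACITY, items_written) and the depth of 8 are hypothetical and not taken from the REFLECT tools; in the REFLECT flow such code would presumably be woven in from an aspect rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_CAPACITY 8            /* hypothetical FIFO depth */

typedef struct {
    int data[FIFO_CAPACITY];
    size_t head;                   /* index of the next item to read */
    size_t tail;                   /* index of the next free slot */
    size_t count;                  /* items currently stored */
    unsigned long items_written;   /* monitoring: total items ever written */
} fifo_t;

void fifo_init(fifo_t *f) {
    assert(f != NULL);
    f->head = 0;
    f->tail = 0;
    f->count = 0;
    f->items_written = 0;
}

/* Returns 1 on success, 0 if the FIFO is full. */
int fifo_push(fifo_t *f, int value) {
    assert(f != NULL);
    if (f->count == FIFO_CAPACITY)
        return 0;
    f->data[f->tail] = value;
    f->tail = (f->tail + 1) % FIFO_CAPACITY;
    f->count++;
    f->items_written++;            /* the monitoring code itself */
    return 1;
}

/* Returns 1 on success, 0 if the FIFO is empty. */
int fifo_pop(fifo_t *f, int *out) {
    assert(f != NULL && out != NULL);
    if (f->count == 0)
        return 0;
    *out = f->data[f->head];
    f->head = (f->head + 1) % FIFO_CAPACITY;
    f->count--;
    return 1;
}
```

The monitored value (items_written) could then feed the refinement of other aspects, for example the choice of FIFO depth.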


Techniques used:Aspects: express non-functional requirements and user’s knowledge about the algorithmSuccessive Refinement & Code Transformations: input application is transformed in a stepwise fashion in the target directionBest Practices: capture information about previous algorithms/implementationsAspect/Strategic-oriented Domain-Specific Language (LARA): Encapsulate Aspects and Search and Exploration Strategies

void filter_subband(double z[512], double s[32]) {
  double y[64];
  int i, j;
  static const double m[32][64] = {...};

  for (i = 0; i < 64; i++) {      // loop1
    y[i] = 0;
    for (j = 0; j < 8; j++)       // loop2
      y[i] += z[i + 64*j];
  }
  for (i = 0; i < 32; i++) {      // loop3
    s[i] = 0;
    for (j = 0; j < 64; j++)      // loop4
      s[i] += m[i][j] * y[j];
  }
}

filter_subband from MPEG encoder

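[Figure: bar chart for filter_subband showing Hw Execution time (us) and Slices for the implementations fsubb-base-d, fsubb-base-f, fsubb-sr, fsubb-unroll+jam, fsubb-fixed, and fsubb-hw1. Bar labels in extraction order, with their assignment to bars not recoverable: 837, 655, 560, 384, 445, 651, 2860, 3702, 1328, 3628, 773, 4072.]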

Possible design pattern for filter_subband function:
o partial scalar replacement
o unroll loop1 by two, then jam the two loops in the new body
o unroll loop4 by two
o map arrays y and m to local dual-port memories
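To make the code transformations in this pattern concrete, here is a hand-written C sketch of what filter_subband might look like after scalar replacement, unrolling loop1 by two with jamming of the two copies of loop2, and unrolling loop4 by two. It is not output of the REFLECT tools. Because the coefficient initializer is elided in the listing, the sketch assumes m is passed in as a flat 32*64 array; the function names and the test data in count_mismatches are also hypothetical. The memory mapping step has no direct C equivalent and is omitted.

```c
#include <assert.h>

/* Baseline: same loops as the original listing, with m passed in
   (assumption: the original static initializer is elided). */
void filter_subband_base(const double z[512], double s[32],
                         const double m[32 * 64]) {
    double y[64];
    int i, j;
    for (i = 0; i < 64; i++) {            /* loop1 */
        y[i] = 0;
        for (j = 0; j < 8; j++)           /* loop2 */
            y[i] += z[i + 64 * j];
    }
    for (i = 0; i < 32; i++) {            /* loop3 */
        s[i] = 0;
        for (j = 0; j < 64; j++)          /* loop4 */
            s[i] += m[i * 64 + j] * y[j];
    }
}

/* Sketch of the code after the pattern's loop transformations. */
void filter_subband_pattern(const double z[512], double s[32],
                            const double m[32 * 64]) {
    double y[64];
    int i, j;
    for (i = 0; i < 64; i += 2) {         /* loop1 unrolled by two */
        double y0 = 0, y1 = 0;            /* scalars replace y[i], y[i+1] */
        for (j = 0; j < 8; j++) {         /* two copies of loop2, jammed */
            y0 += z[i + 64 * j];
            y1 += z[i + 1 + 64 * j];
        }
        y[i] = y0;
        y[i + 1] = y1;
    }
    for (i = 0; i < 32; i++) {            /* loop3 */
        double acc = 0;                   /* scalar replaces s[i] */
        for (j = 0; j < 64; j += 2) {     /* loop4 unrolled by two */
            acc += m[i * 64 + j] * y[j];
            acc += m[i * 64 + j + 1] * y[j + 1];
        }
        s[i] = acc;
    }
}

/* Runs both versions on small integer-valued test data (hypothetical)
   and returns the number of output elements that differ. */
int count_mismatches(void) {
    static double z[512], m[32 * 64];
    double s_base[32], s_pat[32];
    int k, mismatches = 0;
    for (k = 0; k < 512; k++)
        z[k] = (double)((k % 11) - 5);
    for (k = 0; k < 32 * 64; k++)
        m[k] = (double)((k % 7) - 3);
    filter_subband_base(z, s_base, m);
    filter_subband_pattern(z, s_pat, m);
    for (k = 0; k < 32; k++)
        if (s_base[k] != s_pat[k])
            mismatches++;
    return mismatches;
}

/* Convenience self-check: aborts if the two versions disagree. */
void self_check(void) {
    assert(count_mismatches() == 0);
}
```

Both versions accumulate in the same order, so on the integer-valued test data the outputs should agree exactly; the transformed version simply exposes two independent accumulations per iteration of the jammed loop.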

REFLECT Design Flow

Audio: MPEG Audio Encoder, G729 Voice Encoder

Avionics: 3D Path Planning, Stereo Navigation

strategydef pattern1
  apply: scalar1; unroll1; jam1; map1;
end

aspectdef unroll1
  select A: function{*}.for{*}
  apply to A: optimize unroll_loop(k=2)
  condition: $for.bound && ($for.iterations % 2) && $for.array{1,}(loop_variant && is_local)
end

aspectdef jam1
  select A: function{*}.for{*}.body
  apply to A: optimize jam_loop_body
end

aspectdef scalar1
  select A: function{*}
  apply to A: optimize scalar_replacement
end

aspectdef map1
  select A: function{*}.array{*}
  apply to A: map to DistributedRAM(ports=2)
  condition: $array.is_local && ($array.size <= 512)
end

The Design Pattern specified as a reusable LARA strategy