Project co-funded by the European Commission within the Seventh Framework Programme

Project number: 609666

REPARA

Reengineering and Enabling

Performance and poweR

of Applications

D2.3: Enforcing techniques for the REPARA-C++ specification

Document ID: ICT-609666-D2.3
Version: 1
Work Package: WP2 - Source code analysis
Task: T2.3 - Reengineering of legacy code
Dissemination Level: CO, complemented by this public report

Task          Person                      Institution  Role                 Date
Edited by     Rudolf Ferenc               USZ          T2.3 Task Leader     15/02/2016
Reviewed by   Marco Danelutto             UPI          Reviewer             19/02/2016
Reviewed by   Luis Miguel Sanchez Garcia  UC3M         Reviewer             22/02/2016
Approved by   J. Daniel Garcia            UC3M         Project Coordinator  29/02/2016

ICT-609666-D2.3 REPARA Reengineering of legacy code

March 2, 2016. Version 1 PUBLIC Page ii of 53


Change Log

Version  Author         Contractor  Change                                              Date
2        Rudolf Ferenc  USZ         Corrected version based on reviews of UC3M and UPI  25/02/2016
1        Rudolf Ferenc  USZ         First draft                                         15/02/2016



List of Contributors

The following individuals (in alphabetical order) have made contributions to this deliverable:

• Denes Ban (USZ)

• Silvano Brugnoni (HSR)

• Thomas Corbat (HSR)

• Rudolf Ferenc (USZ)

• Istvan Siket (USZ)

• Peter Sommerlad (HSR)

• Toni Suter (HSR)



Executive Summary

This document presents deliverable D2.3 Reengineering of legacy code. The aim of this task was to process legacy C++ code and transform it into REPARA-C++ conformant source code.

• For this we developed a static code analysis tool that finds violations of the REPARA-C++ specification.

• We integrated this analyser into the Eclipse environment so that the user can easily navigate through the code that needs to be modified.

• We developed an Eclipse CDT plug-in to support those modifications and to interactively refactor source code into REPARA-C++ compliant source code.

The tool and plug-ins help the user find the places in the source code that are not REPARA-C++ compliant and provide support for source code refactoring.

Deliverable context

One of the aims of the REPARA project is to find the best target platform on which to execute arbitrary C++ code. Since different platforms have different restrictions, REPARA-C++ was defined in D2.1 [6] to fulfil the requirements of all platforms, and a technique was proposed in D3.3 [7] to annotate those parts of the source code, called REPARA kernels, that must be compliant with REPARA-C++. The aim of task T2.3 (Reengineering of legacy code) was to verify whether the kernels conform to the REPARA-C++ specification and, if a violation is found, to help the user transform the code into REPARA-C++ source code. The REPARA-C++ conformance checker uses the results of the REPARA-C++ analyser tool developed in D2.2 [12], while the transformation part uses the results of the REPARA-C++ conformance checker.

With the results of D3.3 [7], REPARA tools will provide complete technical support for transferring a given C++ project to the desired target platform.

Deliverable structure

This document is organised as follows. Chapter 1 summarizes the motivation and the main results of this task. Chapter 2 gives an overview of the architecture of the REPARA-C++ checker. Chapter 3 describes the REPARA-C++ Conformance Checker in detail. Finally, Chapter 4 presents a detailed description of the source code transformations.



Contents

1 Introduction

2 Architecture

3 REPARA-C++ conformance checker
    3.1 AIRConformanceChecker command line tool

4 Enabling Transformations
    4.1 Introduction
    4.2 Implemented Transformations
        4.2.1 Memory Localization
        4.2.2 Pointer Elimination
        4.2.3 Pointer to Array Conversion
        4.2.4 Tail Recursion Elimination
        4.2.5 if-Statement Replacement
        4.2.6 Inlining Function Pointers
        4.2.7 Reinterpreting Data as Another Type
        4.2.8 Quick Fixes
        4.2.9 Compound to Function Conversion
    4.3 Conclusion



List of Tables

3.1 Summary of REPARA-C++ restrictions

4.1 Wall-clock runtime of multi-threaded benchmark in ms (for different kinds of FPGA boards; α is the achieved speedup)
4.2 Limitations to C subset of C++ by OpenCL
4.3 Limitations to C subset of C++ by FPGA
4.4 Limitations to C subset of C++ by DSP
4.5 Limitations to C++ by OpenCL
4.6 Limitations to C++ by FPGA
4.7 Limitations to C++ by DSP



List of Figures

2.1 REPARA-C++ conformance checker architecture
2.2 REPARA-C++ conformance checker Eclipse plug-in
2.3 List of invoked command line tools
2.4 REPARA-C++ conformance checking results listed in the Problems window
2.5 Highlighted REPARA-C++ conformance problems in the source code
2.6 Lines not conforming with REPARA-C++ are underlined with a red line

4.1 Select pointer parameter to be copied locally
4.2 Select Copy Parameter Locally in the REPARA menu
4.3 The transformed code after memory localization
4.4 Select pointer parameter to be eliminated
4.5 Select Eliminate Pointer Parameters in the REPARA menu
4.6 The transformed code after pointer elimination
4.7 Select pointer parameter to be converted
4.8 Select Replace Pointer with Array in the REPARA menu
4.9 The transformed code after pointer to array conversion
4.10 Select tail-recursive function to be transformed
4.11 Select Remove Recursion in the REPARA menu
4.12 The transformed code after replacement of the recursion
4.13 Put the cursor inside the if statement
4.14 Select Transform If-Statements in the REPARA menu
4.15 The transformed code after if-statement replacement
4.16 Select the function reference argument to be inlined
4.17 Select Inline Function Pointer in the REPARA menu
4.18 The transformed code with the inlined function pointer
4.19 Select a kernel-annotated compound statement
4.20 Select Compound to Function in the REPARA menu
4.21 The resulting kernel function after the transformation



1. Introduction

The aim of this task was to reengineer legacy code to conform to the REPARA-C++ specification. This task can be divided into two major parts: first, all violations have to be found, and then the violating code has to be refactored into REPARA-C++ compliant source code.

For the first part we developed the AIRConformanceChecker command line tool, which verifies whether the given source code fulfils all REPARA-C++ rules. Its input consists of one or more .air files (which contain the Abstract Intermediate Representation of the source code, generated by RCPP2AIR [12]) and information about the standard include directories. The AIRConformanceChecker tool loads the .air file(s) and checks whether the REPARA kernels (tagged with C++11 attributes) contain any code that does not conform to the REPARA-C++ restrictions [6]. If such source code is found, a warning message is issued which tells the user where the problem is (file and position) and what it is (e.g. Bitfield is not supported).

The REPARA C++ Open Specification [6] contains 42 C and C++ restrictions based on OpenCL for GPU, FPGA, and DSP. We implemented 36 of them; 4 restrictions were out-of-date because the constructs they forbid cannot appear in C++11 code, and 2 constraints cannot be verified by static analysis.

To aid the developers who will use the REPARA command line tools, we developed an Eclipse/Cevelop plug-in [1] that helps users apply these tools efficiently. The plug-in collects all information necessary for the analysis, invokes the RCPP2AIR and AIRConformanceChecker tools with the appropriate parameters and presents the results as Eclipse-style warnings. If the user double-clicks on a warning, Eclipse opens the source file and moves the focus to the appropriate line; in addition, all problematic code parts are underlined in red.

In the second part, transformations that turn code violating the REPARA-C++ restrictions into REPARA-C++ compliant code have been developed. These transformations are also available as a plug-in for Eclipse/Cevelop and can be applied directly to the source code in the IDE.

For 19 of the 42 restrictions specified in the REPARA C++ Open Specification [6] we provide a transformation to remedy the violation at least partially. Some transformations, like the replacement of unsupported library calls, can be applied to multiple restrictions. The remaining non-compliant cases either do not have a suggested resolution, are unnecessary restrictions in C++ or should be handled while compiling the code for the target platform (e.g. alignment of class objects). Furthermore, we do not provide transformations that merely remove a single keyword (e.g. the register storage class specifier).



2. Architecture

To check whether a given project conforms to the REPARA-C++ specification, we developed a command line tool called AIRConformanceChecker. Its input is an .air file, which is the binary representation of the source code and is generated by the RCPP2AIR command line tool [12]. In addition, the tool needs the standard include library paths, because there are restrictions that exclude the use of functions or headers from them. The AIRConformanceChecker tool writes all warnings into a text file where each line contains a triplet:

• the path of the file the warning refers to,

• the start-line, start-column, end-line and end-column of the warning,

• and a short description of the problem.

Example:

    bfs.cpp(54,4,54,35): Writing a pointer with type of "char" is not supported.
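A tool that consumes this output has to split each line back into the triplet. The following is a minimal sketch of such a parser, assuming that every line has exactly the shape of the example above; the Warning struct, its field names and the function parseWarning are hypothetical and not part of the REPARA tools.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical record type; the field names are illustrative only.
struct Warning {
    std::string path;
    int startLine;
    int startColumn;
    int endLine;
    int endColumn;
    std::string message;
};

// Parses one line of the assumed form
//   path(startLine,startColumn,endLine,endColumn): message
// Returns false and leaves `out` untouched if the line has another shape.
bool parseWarning(const std::string& line, Warning& out) {
    static const std::regex pattern(
        R"(^(.*)\((\d+),(\d+),(\d+),(\d+)\): (.*)$)");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) {
        return false;
    }
    assert(m.size() == 7);  // whole match plus six capture groups
    out.path = m[1].str();
    out.startLine = std::stoi(m[2].str());
    out.startColumn = std::stoi(m[3].str());
    out.endLine = std::stoi(m[4].str());
    out.endColumn = std::stoi(m[5].str());
    out.message = m[6].str();
    return true;
}
```

Applied to the example line, the sketch yields the path bfs.cpp, the range 54:4 to 54:35 and the description as the message.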

Figure 2.1: REPARA-C++ conformance checker architecture

To check source code from the command line, the user has to collect information about the source code (macros, include paths, etc.) and execute two command line tools (RCPP2AIR and AIRConformanceChecker) for each compilation unit; the output is eventually delivered in a simple text file. Using these tools from the command line is inconvenient and time-consuming in practice. Therefore, we developed an Eclipse plug-in so that the user can simply click on a button; the plug-in manages all the tasks mentioned above and presents the results in a user-friendly way. Figure 2.1 shows an overview of the steps that are carried out in the background when the REPARA-C++ conformance checker is used from Eclipse. These steps are the following:



Figure 2.2: REPARA-C++ conformance checker Eclipse plug-in

1. The user has to provide the location of the standard headers and the macros used in the compilation environment in the corresponding settings files before installing the Eclipse plug-in.

2. The user has to install the Eclipse plug-in by copying it into the Eclipse plugins directory. After starting Eclipse, a new Conformance Checker icon (a running person) will appear on the toolbar (see Figure 2.2).

3. The user can open an arbitrary C++ project and work on it as usual.

4. To check whether the project conforms to the REPARA-C++ specification, the user first has to select the project in the Project Explorer and then click on the Conformance Checker icon.

5. The plug-in collects all compilation units from the selected project, and for each compilation unit it collects the relevant compilation information (e.g. macros, additional include directories).

6. The plug-in invokes the RCPP2AIR tool for each compilation unit with the appropriate parameters, including where to put the output .air files.

7. RCPP2AIR analyses each compilation unit separately and generates the corresponding binary representation as .air files.

8. The plug-in invokes the AIRConformanceChecker tool with the appropriate parameters (location of the .air file, list of standard include directories, where to put the output files).

9. The AIRConformanceChecker tool checks whether each REPARA kernel (annotated with C++11 attributes [7]) conforms to REPARA-C++. All source code locations where the code does not fulfil the restrictions are written into the output file.



Figure 2.3: List of invoked command line tools

10. During the checking process the plug-in reports on this background process in the console window. More precisely, it writes out all command line invocations with their full parameter lists (see Figure 2.3).

11. After all command line executions have finished, the plug-in loads the warning messages from the output files of AIRConformanceChecker, filters out the repeated warnings (see footnote 1) and shows the list in the Problems window of Eclipse (see Figure 2.4).

12. If the user double-clicks on any of these warnings, the source code is loaded and the problematic part of the code is highlighted (see Figure 2.5).

13. Warnings in a given file can also be explored by simply opening the file: all problematic parts are underlined with a red line (see Figure 2.6).
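The filtering of repeated warnings in step 11 can be sketched as follows. This is only an illustration under the assumption that a repeated warning is a textually identical output line; how the plug-in actually identifies repeats is not described here, and the function name mergeWarnings is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Merges the warning lines of several per-compilation-unit output files and
// drops exact repeats while keeping the order of first appearance.
// Assumption: two warnings are "the same" if their lines are identical.
std::vector<std::string> mergeWarnings(
    const std::vector<std::vector<std::string>>& perUnit) {
    std::set<std::string> seen;
    std::vector<std::string> merged;
    for (const auto& unit : perUnit) {
        for (const auto& line : unit) {
            // insert() reports whether the line was new to the set
            if (seen.insert(line).second) {
                merged.push_back(line);
            }
        }
    }
    assert(merged.size() == seen.size());  // every kept line is unique
    return merged;
}
```

With this scheme, a warning located in a header that is included by two compilation units is listed only once.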

1 Since the project is checked per compilation unit, it is possible that two compilation units include the same header file which contains non-REPARA-C++ conforming code, so the warning will appear in two different outputs.



Figure 2.4: REPARA-C++ conformance checking results listed in the Problems window

Figure 2.5: Highlighted REPARA-C++ conformance problems in the source code



Figure 2.6: Lines not conforming with REPARA-C++ are underlined with a red line



3. REPARA-C++ conformance checker

REPARA-C++ is a special subset of ISO C++11 from which several language constructs are excluded based on the target platforms used in the project. Deliverable D2.1 [6] contains detailed descriptions of all C and C++ restrictions that are prescribed for OpenCL, FPGA, and DSP. Table 3.1 shows the six groups of restrictions, the number of restrictions in each group (Nr.; see footnote 1), the number of restrictions implemented in this deliverable, the number of restrictions that were unnecessary to implement and the number of unverifiable ones. The last line of the table summarizes the numbers: 36 of the 42 restrictions are implemented, 4 are unnecessary and only 2 cannot be verified properly by static analysis.

Restriction                          Nr.  Implemented  Unnecessary  Unverifiable
C restr. based on OpenCL for GPU      16           14            2             0
C restr. based on FPGA                 5            5            0             0
C restr. based on DSP                  5            4            0             1
C++ restr. based on OpenCL for GPU     7            7            0             0
C++ restr. based on FPGA               2            2            0             0
C++ restr. based on DSP                7            4            2             1
REPARA-C++ restrictions               42           36            4             2

Table 3.1: Summary of REPARA-C++ restrictions

The restrictions that were unnecessary to verify:

• Restriction to arithmetic conversions (C restriction based on OpenCL for GPU): ISO C99 supports a vector type and this restriction relates to its conversions. ISO C++, on the other hand, does not support a vector type, and since REPARA-C++ is a subset of ISO C++, it is unnecessary to verify this restriction.

• Restriction to elements of struct, union (C restriction based on OpenCL for GPU): The restriction says that the elements of a struct or union must belong to the same address space, which means that constructs like struct s { __constant int a; __private int b; __local int c; } are not supported. Since ISO C++ does not support such constructs, we do not have to check for them.

• Restriction to export keyword (C++ restriction based on DSP): The restriction says that REPARA-C++ should not support the export keyword. The export keyword was used to export templates, which made it unnecessary to include the corresponding header files. Since only a few compilers supported this feature, it never became widespread, and the C++11 standard no longer supports it. Since C++11 attributes are used to tag REPARA kernels, the source code must conform to the C++11 standard, which makes this restriction unnecessary.

• Restriction to function template linkage (C++ restriction based on DSP): This rule forbids a template, a template explicit specialization, or a class template partial specialization from having C linkage. C++11 already contains this restriction, so it is unnecessary to verify it.

There are two restrictions that are not possible to verify by static analysis:

1 Several restrictions can be found in more than one group, but they were implemented only once.



• Restriction to floating-point exceptions (C restriction based on DSP): Arithmetic operations may cause different kinds of floating-point exceptions. Unfortunately, it is very difficult, if not impossible, to identify statically which operations will cause a floating-point exception. Control-flow and data-flow analysis can achieve some results, but these will be noisy and will not find all places where a floating-point exception can be raised. Therefore, this restriction is not verified.

• Restriction to recursion depth (C++ restriction based on DSP): In some DSP devices the capacity of the stack is limited; therefore the total depth of recursive instantiations should be 32, or at most 64. As with floating-point exceptions, this restriction cannot be verified by static analysis; only a heuristic approach could be implemented, which would be noisy and would not find all problematic function calls. Therefore, this restriction is not verified.

The list of implemented restrictions (the numbers in parentheses are the section numbers of the restrictions in [6]):

• REPARA C++ restrictions based on C OpenCL code for GPU
    - ISO C++ scalar data types (4.3.3)
    - Alignment of types (4.3.4)
    - Reinterpreting data as another type (4.3.5)
    - Arithmetic conversions (4.3.6)
    - Bit-field struct members (4.3.7)
    - Function pointer (4.3.8)
    - Variable length arrays and structures with flexible (or unsized) arrays (4.3.9)
    - Variadic macros and functions (4.3.10)
    - The use of ISO C99 library functions (4.3.11)
    - Storage-class qualifiers (4.3.12)
    - Recursive functions (4.3.13)
    - Dynamic memory allocation (4.3.14)
    - Writes to a pointer (or arrays) (4.3.15)
    - Elements of structs and unions (4.3.16)
    - Random number generator (4.3.17)
    - asm declaration is not supported (4.3.18)

• REPARA C++ restrictions based on C code for FPGA
    - System calls (4.4.3)
    - Pointer casting (4.4.4)
    - Pointer Arrays (4.4.5)
    - Recursive function (4.4.6)
    - Dynamic memory management (4.4.7)

• REPARA C++ restrictions based on C code for DSP
    - System call (4.5.4)
    - Register storage class specifier (4.5.5)
    - Data padding and alignment of structures (4.5.6)
    - Floating-point exceptions (4.5.7)
    - Run-time library exceptions (4.5.8)

• REPARA C++ restrictions based on C++ OpenCL code for GPU
    - Dynamic binding (4.7.4)
    - Dynamic_cast (4.7.5)
    - Dynamic storage allocation and deallocation (4.7.6)
    - Dynamic reinterpret_cast (4.7.7)
    - Exception handling (4.7.8)
    - Alignment of class object (4.7.9)
    - C++ Standard libraries (4.7.10)

• REPARA C++ restrictions based on C++ code for FPGA
    - Dynamic memory allocation (4.8.3)
    - Dynamic binding (4.8.4)

• REPARA C++ restrictions based on C++ code for DSP
    - Dynamic reinterpret_cast (4.9.3)
    - Recursion depth (4.9.4)
    - Function template linkage (4.9.5)
    - Variable sized class objects (4.9.6)
    - Function inlining (4.9.7)
    - Typeinfo header (4.9.8)
    - Export keyword (4.9.9)

3.1 AIRConformanceChecker command line tool

AIRConformanceChecker is the rule checker command line tool that identifies the source code parts which do not comply with the restrictions. The input of the tool is one or more .air files, and it produces a list of the source code positions that are not REPARA-C++ compliant.

Usage:

    AIRConformanceChecker [options] [input(s)]

The command line options are the following:

• -out:filename: The name of the output file for the rule violations. If it is not set, the standard output is used.

• -stdPath:path: Path of the C/C++ standard headers. It is required for checking some restrictions.

• -inputlist:filename: A list file, which contains the list of the input files.
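A caller such as an IDE plug-in has to assemble these options into an argument list. The sketch below shows one way this could look, using only the -out and -stdPath options listed above and passing the .air inputs directly instead of via -inputlist. The function name buildCheckerArgs and all file names are hypothetical; how several standard include directories are passed is not covered by this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the argument vector for one AIRConformanceChecker run.
// Only the documented -out: and -stdPath: options are used; an empty
// string means "do not pass this option".
std::vector<std::string> buildCheckerArgs(
    const std::string& outFile,
    const std::string& stdPath,
    const std::vector<std::string>& airFiles) {
    // This sketch passes inputs directly, so it expects at least one.
    assert(!airFiles.empty());
    std::vector<std::string> args{"AIRConformanceChecker"};
    if (!outFile.empty()) {
        args.push_back("-out:" + outFile);  // omitted: standard output is used
    }
    if (!stdPath.empty()) {
        args.push_back("-stdPath:" + stdPath);
    }
    args.insert(args.end(), airFiles.begin(), airFiles.end());
    return args;
}
```

For an output file out.txt, a header path /usr/include and one input bitfield.air, the resulting arguments are AIRConformanceChecker, -out:out.txt, -stdPath:/usr/include and bitfield.air.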



For example, consider the following bitfield.cpp file (Listing 3.1), in which bit fields are used. Since REPARA-C++ does not support them, AIRConformanceChecker gives the following warning messages:

    /home/repara/bitfield/bitfield.cpp(4,3,4,9): Bitfield is not supported.
    /home/repara/bitfield/bitfield.cpp(5,3,5,10): Bitfield is not supported.

Listing 3.1: The small source code example (bitfield.cpp)

     1
     2 struct s {
     3   int a;
     4   int b : 2;
     5   char c : 4;
     6 };
     7
     8 void f() {
     9
    10   s var;
    11
    12   [[rpr::kernel]] {
    13     var.a;
    14   }
    15 }



4. Enabling Transformations

This chapter describes the enabling transformations implemented in Task T2.3 by HSR. Section 4.1 gives a short introduction to the goal and an overview of the developed transformations. Section 4.2 describes the developed enabling transformations in detail. Section 4.3 concludes.

4.1 Introduction

The second part of Task 2.3 has been the development of a set of transformation techniques to refactor legacy C++ code into REPARA C++ compliant source code. The restrictions imposed by the various target platforms together yield the REPARA C++ Open Specification, described in deliverable D2.1 [6]. HSR developed the transformations described in this chapter. TUD provided domain-specific expertise regarding transformations enabling the extraction of kernels for FPGA targets.

In summary, there are three categories of transformations, all described in the following sections.

Transformations Elaborated with TUD. As mentioned above, several transformations have been elaborated in cooperation with TUD. Their goal was to enable and improve kernel extraction as described in D4.3 [11]. This encompasses the following transformations:

• Memory Localization

• Pointer Elimination

• Pointer to Array Conversion

• Tail Recursion Elimination

• if-Statement Replacement

Transformations Derived from D2.1. In addition to the transformations above, further transformations have been derived from the restrictions to RCPP (REPARA C++) and the suggested solutions in deliverable D2.1 [6]:

• Inlining Function Pointers

• Replacement of Basic Types

• Reinterpreting Data as Another Type

• Various Quick Fixes

Convenience Transformation. Kernel extraction as described in deliverable D4.3 always expects a kernel to consist of a single function call. In practice it would be convenient to also allow the extraction of compound statements. To enable the extraction of kernels in the form of such compound statements, we also provide a transformation for converting a compound statement into a kernel function:

• Compound to Function Conversion



4.2 Implemented Transformations

All transformation implementations described in this section are based on the extensible transformation infrastructure described in D4.1 [10]. The description of each transformation is split into a short description of the motivation, an outline of the approach used to perform the transformation and a section about the implementation of the transformation for Cevelop [1]. For some transformations, limitations to their applicability are known; these are described at the end of the corresponding part.

4.2.1 Memory Localization

Copying a parameter locally is an optimization specifically targeting Vivado High-Level Synthesis (HLS) [15]. Most HLS tools are capable of dealing with a limited set of pointer and array types; e.g., by convention, Vivado HLS implements stack-allocated data in on-chip memory, whereas pointers indicate off-chip memory [14]. Access to the latter is usually implemented using a bus master interface, i.e., via address and data channels, or streaming. This approach is suitable for large amounts of data, but can incur prohibitive overhead for small, isolated pieces of data and limits parallelism. Unfortunately, even fundamental types are often passed via pointers in code intended to run on CPUs, since the dereferencing overhead is negligible (for modern compilers).

To remedy the adverse effect of such code on the resulting hardware circuits, we provide a transformation that inserts a local copy of pointer and array parameters of functions. This reduces the number of off-chip memory operations significantly.

Approach

To overcome the negative effect of pointer parameters on a kernel function interface, we create a local copy of the data referred to by the parameter. A new local variable of non-pointer type is inserted at the beginning of the function to hold the copied data. If that data is modified locally, the values have to be copied back at every exit point of the function. Since the type of the local variable is no longer a pointer, accesses to the local variable have to be adapted in the body of the function.

As an example, refer to the code in Listing 4.1. It shows a struct S, which contains a member variable i. The function kernel takes such an S object by pointer as its parameter. The member variable i is modified locally.

Listing 4.1: Example for kernel with pointer parameter.

    struct S {
      int i;
    };

    void kernel(S *s) {
      s->i = 1;
    }

After the transformation the parameter s becomes a local variable. The object pointed to by the parameter is copied to the local variable s. To retain the semantics of the code, all accesses to s need to be changed from pointer to non-pointer access. For example, s->i becomes s.i. At the end of the function, to retain the side effect on s, its value has to be copied back to the pointer parameter. The result of the transformation can be seen in Listing 4.2.

Listing 4.2: Result of the memory localization.

    struct S {
      int i;
    };

    void kernel(S *parameter_s) {
      S s = *parameter_s;
      s.i = 1;
      *parameter_s = s;
    }

Implementation

Creating a local copy of a parameter always consists of the following basic transformation steps:

• Checking initial conditions

• Insertion of the local copy

• Replacement of the parameter references

• Copying back the modified local variable

• Adding includes

We distinguish two cases depending on the declaration of the original parameter. (1) If it is a plain pointer, we expect it to refer to a single value (see Listing 4.3). This case is handled by the assignment strategy. (2) If it is an array parameter, we expect it to be an array (see Listing 4.4), which is handled by the array memcopy strategy. The latter case has a further constraint: the dimensions of the array must be specified. If the requirements are not satisfied, the problem is reported.
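The case distinction above can be summarized as a small decision function. The following sketch is purely illustrative: ParamInfo is a hypothetical, simplified stand-in for the information a refactoring would read from the parameter declaration, and it is not the data model used by the Cevelop plug-in.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified description of a function parameter.
struct ParamInfo {
    bool isPointer;               // declared as T *p
    bool isArray;                 // declared as T p[..]..
    std::vector<int> dimensions;  // array extents; 0 marks an unspecified one
};

enum class Strategy { Assignment, ArrayMemcopy, Unsupported };

// Chooses the localization strategy as described in the text: plain
// pointers use assignment, fully dimensioned arrays use memcpy, and
// everything else is reported as a problem.
Strategy chooseStrategy(const ParamInfo& p) {
    assert(!(p.isPointer && p.isArray));  // the two cases are exclusive here
    if (p.isPointer) {
        return Strategy::Assignment;
    }
    if (p.isArray && !p.dimensions.empty()) {
        for (int d : p.dimensions) {
            if (d <= 0) {
                return Strategy::Unsupported;  // a dimension is missing
            }
        }
        return Strategy::ArrayMemcopy;
    }
    return Strategy::Unsupported;
}
```

A parameter such as int data[5][5] maps to the array memcopy strategy, whereas int data[] would be reported because its extent is unknown.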

Listing 4.3: Pointer parameter example

    void kernel(int *ptr_param) {
      ...
    }

Listing 4.4: Array parameter example

    void kernel(int array_param[9]) {
      ...
    }

Assignment Strategy. The assignment strategy works for pointer parameters that refer to a single value or instance. As a local copy of the parameter, it inserts a new local declaration at the beginning of the function body. To avoid extensive renaming in the body of the function, the new local variable is given the name of the parameter. Consequently, the parameter has to be renamed, otherwise a name clash would occur. The new local variable is initialized by copy initialization (line 2 in Listing 4.6). We assume that this change does not introduce any significant side effect to the behavior of the program; otherwise the transformation would not retain the semantics of the code.

As the local variable is of non-pointer type, all references to it have to be adapted locally. Dereference operations (*) on the name are removed, and direct uses of the pointer need the address-of operator (&). For an example, refer to lines 3 and 4 in Listing 4.6.

At every exit point of the function, i.e. at return statements and at the end of the function if it is reachable, the value of the local variable has to be copied back to the parameter. See lines 7 and 11 in Listing 4.6 for an example.



There are no additional include directives required in this case.

The assignment strategy applied to parameter s of the source code in Listing 4.5 results in the transformed code in Listing 4.6.

Listing 4.5: Before memory localization of pointer parameter

    1 void kernel(S *s) {
    2   foo(*s);
    3   S *sp = s;
    4   ...
    5   if (some_condition) {
    6     return;
    7   }
    8   ...
    9 }

Listing 4.6: After memory localization of pointer parameter

     1 void kernel(S *parameter_s) {
     2   S s = *parameter_s;
     3   foo(s);
     4   S *sp = &s;
     5   ...
     6   if (some_condition) {
     7     *parameter_s = s;
     8     return;
     9   }
    10   ...
    11   *parameter_s = s;
    12 }
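Because the listings elide the function bodies, the following self-contained sketch fills them with a hypothetical computation so that the original and the localized form can be compared. The flag leaveEarly stands in for some_condition and the arithmetic is invented for illustration; only the copy-in and copy-back pattern follows the assignment strategy described above.

```cpp
#include <cassert>

struct S {
    int i;
};

// Original form: every access goes through the pointer parameter.
void kernelOriginal(S* s, bool leaveEarly) {
    assert(s != nullptr);
    s->i += 1;
    if (leaveEarly) {
        return;
    }
    s->i *= 2;
}

// Localized form: the same body works on a local copy, and the copy is
// written back at every exit point of the function.
void kernelLocalized(S* parameter_s, bool leaveEarly) {
    assert(parameter_s != nullptr);
    S s = *parameter_s;    // copy in
    s.i += 1;
    if (leaveEarly) {
        *parameter_s = s;  // copy back before the early return
        return;
    }
    s.i *= 2;
    *parameter_s = s;      // copy back at the end of the function
}
```

Both functions leave the caller's object in the same state on either path, which is the property the copy-back at each exit point is meant to preserve.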

Array Memcopy Strategy. The array memcopy strategy works for array parameters that have all array dimensions specified. An array of the same size is allocated locally and the values of the array argument are copied to the local array with the memcpy function (lines 4 and 5 in Listing 4.8). As in the implementation for single values, the local data has to be copied back to the argument of the kernel at every exit point of the function (lines 7 and 11 in Listing 4.8). Accesses to the local array do not need to be adapted, as array access operations are identical for the array parameter and the newly created local array. As the memcpy function used for copying the array data is declared in the cstring header, a corresponding include of this header is required if it is not already present (line 1 in Listing 4.8).

Listing 4.7: Before memory localization of array parameter

    1 void kernel(int data[5][5]) {
    2   if (data[0][0] == 1) return;
    3   data[0][0] = 1;
    4 }

Listing 4.8: After memory localization of array parameter

     1 #include <cstring>
     2
     3 void kernel(int memory_parameter_data[5][5]) {
     4   int data[5][5];
     5   memcpy(data, memory_parameter_data, sizeof(data));
     6   if (data[0][0] == 1) {
     7     memcpy(memory_parameter_data, data, sizeof(data));
     8     return;
     9   }
    10   data[0][0] = 1;
    11   memcpy(memory_parameter_data, data, sizeof(data));
    12 }
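One detail of Listing 4.8 is worth spelling out: the byte count is taken from the local array, not from the parameter. In C++ an array parameter is adjusted to a pointer, so only the local array still has the full array type. The sketch below restates the listing as a compilable unit with a compile-time check of that size; the static_assert and the null-pointer precondition are additions for illustration and are not produced by the transformation.

```cpp
#include <cassert>
#include <cstring>

void kernel(int memory_parameter_data[5][5]) {
    assert(memory_parameter_data != nullptr);
    int data[5][5];
    // The local array keeps its full type, so sizeof yields all 25 elements.
    // The parameter itself is only a pointer to rows of five ints.
    static_assert(sizeof(data) == 5 * 5 * sizeof(int),
                  "local copy covers the whole array");
    std::memcpy(data, memory_parameter_data, sizeof(data));
    if (data[0][0] == 1) {
        std::memcpy(memory_parameter_data, data, sizeof(data));
        return;
    }
    data[0][0] = 1;
    std::memcpy(memory_parameter_data, data, sizeof(data));
}
```

Called with a zero-initialized 5x5 array, the function sets element [0][0] to 1 and leaves all other elements unchanged.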

User Interface. The Local Copy of Parameter transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.localmemcopy) is installed. It is applied in two simple steps:

• Select the pointer parameter that is to be copied locally in the Cevelop code editor.



Figure 4.1: Select pointer parameter to be copied locally

• In the REPARA menu select Copy Parameter Locally.

Figure 4.2: Select Copy Parameter Locally in the REPARA menu

• The refactoring is then applied automatically, resulting in the following code:

Figure 4.3: The transformed code after memory localization

Possible Extensions

In its current state the transformation does not consider the REPARA attributes. It could be extended to analyze them, if any can be found, for deciding which copy operations to perform. For example, it is not required to perform a copy-back operation if an argument is purely function input.



Known Issues

Assigning a new pointer value to the local variable carries the side-effect back to the calling function. This change in semantics happens because the content of the local variable is copied to the location the parameter points to. Before the transformation, the effects on an overwritten pointer parameter were not visible outside the function.
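The following sketch illustrates this known issue. The struct S, the second parameter other, and both function names are hypothetical; in particular, how the tool rewrites the pointer assignment is an assumption made for illustration, not confirmed by the implementation.

```cpp
struct S {
    int value;
};

// Original kernel: reassigning the pointer parameter only rebinds the local
// pointer, so the caller's first argument is never modified.
void kernel_before(S *s, S *other) {
    s = other;      // s now points to *other
    s->value += 1;  // modifies *other only
}

// Hand-written localized variant in the style of Listing 4.6.
void kernel_after(S *parameter_s, S *other) {
    S s = *parameter_s;  // local copy of the pointee
    s = *other;          // ASSUMPTION: the pointer assignment becomes a value assignment
    s.value += 1;        // modifies the local copy
    *parameter_s = s;    // copy-back now overwrites the caller's object
}
```

With a = {1} and b = {10}, kernel_before leaves a untouched and increments b, whereas kernel_after overwrites a with the incremented copy and leaves b untouched.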

Achieved Performance

Experimental results provided by TU Darmstadt have shown a significant performance increase achieved by the memory localization transformation. Note that these results are part of a research paper that has not been published yet. Table 4.1 shows the wall-clock runtimes of a multi-threaded benchmark program working on random data. For the kernels with memory localization applied (sobel-ml and sobel-p-ml), speedups range from 2.3× up to 377.6× (with the PCIe-based VC709 benefiting most from the aggregated data transfers) compared to the IP core resulting from the original source code, which confirms that Memory Localization is a highly useful, portable optimization for C/C++ HLS code.

Table 4.1: Wall-clock runtime of the multi-threaded benchmark in ms for different kinds of FPGA boards; α is the achieved speedup.

    Kernel      zedboard      α    ZC706      α    VC709      α
    sobel          24421    1.0    15606    1.0    12839    1.0
    sobel-ml       10815    2.3      575   27.1      192   66.9
    sobel-p         6830    3.6     3920    4.0    11489    1.1
    sobel-p-ml       721   33.9      190   82.1       34  377.6
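The α columns of Table 4.1 are consistent with dividing the runtime of the unmodified sobel kernel by the runtime of the respective variant on the same board. The sketch below recomputes a speedup this way; the helper names are illustrative and not part of the REPARA tooling.

```cpp
#include <cmath>

// Speedup of a kernel variant relative to the baseline kernel on the same
// board (illustrative helper).
double speedup(double baseline_ms, double variant_ms) {
    return baseline_ms / variant_ms;
}

// Rounds to one decimal place, the precision used in Table 4.1.
double round_to_tenth(double value) {
    return std::round(value * 10.0) / 10.0;
}
```

For example, round_to_tenth(speedup(12839, 34)) reproduces the VC709 entry of sobel-p-ml.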

4.2.2 Pointer Elimination

Most HLS tools are capable of dealing with a limited set of pointer and array types; e.g., by convention, Vivado HLS implements stack-allocated data in on-chip memory, whereas pointers indicate off-chip memory [14]. Access to the latter is usually implemented using a bus master interface, i.e., via address and data channels, or streaming. This approach is suitable for large amounts of data, but can incur prohibitive overhead for small, isolated pieces of data and limits parallelism. Unfortunately, even fundamental types are often passed via pointers in code intended to run on CPUs, since the dereferencing overhead is negligible for modern compilers.

To improve such sub-optimal situations we provide a transformation for converting pointer parameters into value parameters.

Approach

Several modifications at different locations in the source code have to be applied in order to perform the pointer parameter elimination transformation. First, the parameter type has to be adapted by removing the pointer operator. Second, as the parameter has a new type, the references in the body of the function need to be adapted to the new type as well. Third, the call site needs to have the arguments changed to match the new type.

Below we see an example of a function definition with a pointer parameter p, before the transformation.



Listing 4.9: Kernel function before the transformation

    int square(int *p) {
        return *p * *p;
    }

    int main() {
        int x = 5;
        int result = square(&x);
    }

Listing 4.10: Kernel function after the transformation

    int square(int p) {
        return p * p;
    }

    int main() {
        int x = 5;
        int result = square(x);
    }

After the transformation, the function implementation of Listing 4.9 looks as shown in Listing 4.10. The type of the parameter p has been changed from int * to int, and the references to the pointer parameter in the function body are adapted to direct value accesses instead of indirect accesses through the pointer. Note that the call-site in main() changed accordingly, to reflect the changes in the function signature of square().

Implementation

This transformation performs the following steps in order to achieve the pointer elimination:

• Check the selection of a pointer parameter

• Replace pointer parameter declaration(s)

• Replace pointer access in function body

• Replace pointer arguments at call-sites

As a function can be declared at several locations, all such declarations and the definition have to be adapted accordingly. This separation commonly exists when properly separating header and source files. An example is the function square with a separate declaration, as seen in Listing 4.11.

The adapted code has the pointer operator stripped from the parameter p in all declarations and definitions, as can be seen in Listing 4.12.

Listing 4.11: Declarations of square before the transformation

    int square(int *p);

    int square(int *p) {
        ...

Listing 4.12: Declarations of square after the transformation

    int square(int p);

    int square(int p) {
        ...

The references to the pointer parameter in the function body are adapted to comply with the changed parameter type. Two cases are distinguished: First, direct references to the pointer parameter are replaced by an address-of access to the parameter by inserting an ampersand (&). Second, if the pointer has been dereferenced previously, this indirection is no longer necessary and thus the indirection operator (*) can be removed. Both cases are shown in Listing 4.13. In Listing 4.14 we see the adapted parameter access code after the transformation.



Listing 4.13: Parameter p before the transformation.

    check_p(p);
    return *p * *p;

Listing 4.14: Parameter p after the transformation.

    check_p(&p);
    return p * p;

Call-sites relying on the parameter being of pointer type need to be adapted as well. Here we also have two distinct cases: first, direct access to a pointer (ap in Listing 4.15); second, access to the address of a variable (&a in Listing 4.15). As the parameter is no longer a pointer after the transformation, the pointer ap has to be dereferenced first, and the address-of operator, which accesses the address of a, is no longer required. The modified code can be seen in Listing 4.16.

Listing 4.15: Call-sites before the transformation.

    square(ap);
    square(&a);

Listing 4.16: Call-sites after the transformation.

    square(*ap);
    square(a);

User Interface. The pointer elimination transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.pointerelimination) is installed. It is applied in two simple steps:

• Select the function parameter of pointer type to be eliminated.

Figure 4.4: Select pointer parameter to be eliminated

• In the REPARA menu select Eliminate Pointer Parameters.



Figure 4.5: Select Eliminate Pointer Parameters in the REPARA menu


Figure 4.6: The transformed code after pointer elimination

Limitations

Side-effects on the pointer parameter are not visible beyond the scope of the modified function, i.e., if a side-effect is expected to be visible through the pointer parameter at the call-site, this transformation must not be applied. Currently, we do not perform any analysis regarding such data-flow. In an extended version the REPARA rpr::in and rpr::out attributes might be considered to support decisions on the applicability of the transformation.
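A minimal sketch of a case in which the transformation must not be applied: the function writes through its pointer parameter, and the caller relies on that write. The function names are hypothetical, and the second function is a hand-written result of applying the steps described above.

```cpp
// Writes its result through the pointer parameter (an output parameter).
int increment_before(int *p) {
    *p = *p + 1;
    return *p;
}

// Mechanical result of pointer elimination: the write now targets a copy
// that is local to the function.
int increment_after(int p) {
    p = p + 1;
    return p;
}
```

Both variants return the same value, but only increment_before(&x) updates the caller's x; after increment_after(x) the caller's variable is unchanged.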

4.2.3 Pointer to Array Conversion

In C++, semantically there is no difference between a pointer parameter and an array parameter, as the array parameter is internally represented as a pointer parameter [5]. However, the Local Copy of Parameter transformation, as described in Section 4.2.1, distinguishes between pointer and array parameters when selecting the transformation strategy. Due to the equivalence it is possible to encounter an array parameter that is declared as a pointer parameter. This prevents the memory localization from working correctly, as it expects the parameter to refer to a single value. Thus we provide a transformation that transforms a pointer parameter into an array parameter.

Approach

The conversion of a pointer parameter to an array parameter is straightforward. The pointer declarator has to be replaced with a parameter declarator consisting of array syntax, as can be seen in the example below: int *arr is a pointer parameter that can be replaced by the semantically equivalent int arr[]. Just removing the pointer operator from a declarator and converting it into an array declarator does not need an automated transformation by itself. Beyond this simple editing task, the Pointer to Array Conversion also replaces legacy accesses in the function body with the corresponding array access operations, which might be tedious if it has to be done manually in a large function body.

A before/after example of this transformation can be seen in Listings 4.17 and 4.18.

Listing 4.17: Pointer parameter before the transformation.

    void kernel(int *arr) {
        int start = *arr;
        ...
    }

Listing 4.18: Pointer parameter after the transformation.

    void kernel(int arr[]) {
        int start = arr[0];
        ...
    }

Implementation

Converting a pointer parameter into an array parameter consists of the following basic steps:

• Check initial conditions

• Replace the pointer parameter with an array parameter

• Replace pointer access with array access in the function body

In order to invoke this refactoring a pointer parameter declaration must be selected. Apart from this no further checks are required. The replacement of the parameter declaration is straightforward, as described above. Some analysis is required for the replacement of the parameter accesses in the function body though.

The following cases have to be considered:

• Direct dereferencing of the pointer is replaced by an array index access to index 0, as seen in Listings 4.19 and 4.20.

• Dereferencing the pointer at a specific index is replaced by an array index access to the corresponding index, as seen in Listings 4.21 and 4.22.

• Modification of the pointer parameter. In such a case we introduce an offset variable to capture such side-effects. It is named <parameter-name>_offset. An example can be seen in Listings 4.23 and 4.24.



Listing 4.19: Access to the first element before the transformation.

    *arr

Listing 4.20: Access to the first element after the transformation.

    arr[0]

Listing 4.21: Access to an element at a specific index before the transformation.

    *(arr + 1)
    *(arr + index)

Listing 4.22: Access to an element at a specific index after the transformation.

    arr[1]
    arr[index]

Listing 4.23: Access to the modified pointer before the transformation.

    *(++arr)

Listing 4.24: Access to the modified offset after the transformation.

    int arr_offset = 0;
    arr[(++arr_offset)]
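As an illustration of the third case, the sketch below applies the offset variable to a complete function. Both functions are hypothetical and hand-written; they assume n >= 1, and the access arr[0] is only valid here because it precedes any modification of the offset.

```cpp
// Sums n elements by advancing the pointer parameter (pattern of Listing 4.23).
int sum_before(int *arr, int n) {
    int total = *arr;
    for (int i = 1; i < n; ++i) {
        total += *(++arr);
    }
    return total;
}

// Same function with an array parameter and an offset variable that captures
// the pointer modification (pattern of Listing 4.24).
int sum_after(int arr[], int n) {
    int arr_offset = 0;
    int total = arr[0];
    for (int i = 1; i < n; ++i) {
        total += arr[(++arr_offset)];
    }
    return total;
}
```

For the array {1, 2, 3, 4} both functions return 10.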

User Interface. The Pointer to Array Conversion transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.pointertoarray) is installed. It is applied in two simple steps:

• Select the pointer parameter that shall be converted into an array parameter.

Figure 4.7: Select pointer parameter to be converted

• In the REPARA menu select Replace Pointer with Array.



Figure 4.8: Select Replace Pointer with Array in the REPARA menu


Figure 4.9: The transformed code after pointer to array conversion

Limitations

Pointer to Array Conversion does not determine the size of the array parameter. The size could be derived by analyzing the function body, although that might not be possible in the most general case. Alternatively, the implementation could analyze the REPARA attributes, which must have the accessed indices specified in the corresponding attribute annotations for array arguments.

In the current version of the implementation, multi-dimensional arrays cannot be generated yet.

The function body is not checked for name clashes in case an offset variable has to be introduced.

4.2.4 Tail Recursion Elimination

High-level synthesis for FPGAs and OpenCL code do not support recursive function calls, and restrictions on the DSP target allow only a limited recursion depth [6]. Consequently, REPARA C++ compliant code cannot contain recursive function calls. While it would theoretically be possible to transform arbitrary recursive functions into a loop-based equivalent version, it would require major changes to the original code [16]. Furthermore, the general case relies on maintaining a value stack that replaces the call stack of the recursive function. Dynamic allocation of such a stack is not possible in REPARA C++ due to restrictions on dynamic memory allocation [6]. Thus we refrain from providing a resolution for general recursion.

In contrast to the transformation required to resolve arbitrary recursive functions, we implemented the subset of transformations dealing with tail-recursive functions [13]. Tail recursion can be resolved without explicit maintenance of a value stack, as the current stack frame can be discarded upon entering the next recursion step.

Approach

For replacing a tail-recursive function with a loop-based equivalent, a while-loop needs to be inserted and the tail-recursive calls need to be adapted. The while-loop runs endlessly until a return statement is reached, at which point the call stack of a recursive function would be unwound.

A tail-recursive call needs to have a specific structure. It must be a return statement with the recursive call as direct return expression, without further processing of the return value of the recursive call.

Listing 4.25: Pattern for a tail recursive call.

    ...
    return recursive_call(<arguments>);
    ...

All tail-recursive calls can be replaced with a continue-statement for starting the next iteration of the while-loop. Before that, all parameter values have to be updated to the new argument values (<arguments> in the example above).

Below we see an example of a tail-recursive implementation of the Fibonacci function.

Listing 4.26: Tail recursive Fibonacci implementation.

    int fib(int term, int val = 1, int prev = 0) {
        if (term == 0) return prev;
        if (term == 1) return val;
        else return fib(term - 1, val + prev, val);
    }

Inserting the while-loop and replacing the tail-recursive calls are straightforward. Adapting the parameter values requires some analysis though. At a glance, each parameter could simply be assigned its new value. A problem arises if the new value of a parameter depends on a preceding parameter. If that preceding parameter is updated before the new value of the succeeding parameter is calculated, the resulting value is affected. In such a case the new value of the preceding parameter needs to be stored temporarily and assigned to the actual parameter after the calculation of the succeeding parameters.

In the Fibonacci example we encounter this case with the parameter prev, which is assigned the value of val from the preceding iteration. If val were updated before the value of prev is calculated, prev would receive an incorrect value. Thus the new value of val is temporarily stored in new_argument_val and assigned to val after the calculation of the new value of prev. The result of the transformation can be seen in Listing 4.27.



Listing 4.27: Iterative Fibonacci implementation.

    int fib(int term, int val = 1, int prev = 0) {
        while (true) {
            if (term == 0) return prev;
            if (term == 1) return val;
            else {
                term = term - 1;
                int new_argument_val = val + prev;
                prev = val;
                val = new_argument_val;
                continue;
            }
        }
    }
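To make the ordering problem concrete, the following sketch places the recursive version, the iterative version, and a deliberately wrong variant with a naive update order side by side. The function names are chosen for this illustration only; the bodies follow Listings 4.26 and 4.27.

```cpp
// Tail-recursive original, as in Listing 4.26.
int fib_recursive(int term, int val = 1, int prev = 0) {
    if (term == 0) return prev;
    if (term == 1) return val;
    return fib_recursive(term - 1, val + prev, val);
}

// Iterative equivalent with a temporary for val, as in Listing 4.27.
int fib_iterative(int term, int val = 1, int prev = 0) {
    while (true) {
        if (term == 0) return prev;
        if (term == 1) return val;
        term = term - 1;
        int new_argument_val = val + prev;
        prev = val;
        val = new_argument_val;
    }
}

// Naive update order without a temporary (intentionally incorrect).
int fib_naive_update(int term, int val = 1, int prev = 0) {
    while (true) {
        if (term == 0) return prev;
        if (term == 1) return val;
        term = term - 1;
        val = val + prev;
        prev = val;  // BUG: reads the already updated val
    }
}
```

For term = 10 the first two functions return 55, while the naive variant doubles val in every iteration and returns 256.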

Implementation

For replacing a tail-recursive function with a loop-based equivalent, the following transformation steps have to be performed:

• Check whether the selected function is tail-recursive.

• Insert an endless while-loop surrounding the whole implementation of the function body.

• Add code for adapting the parameter values.

• Replace the recursive calls with code continuing the loop.

Initial Conditions. When this transformation is invoked on a function, it is analyzed whether the function is actually tail-recursive. If the function is not directly tail-recursive or not recursive at all, a corresponding error message is displayed.

For a non-tail-recursive function see the example in Listing 4.28:

Listing 4.28: Non-tail-recursive function.

    int count(int rem, int initial) {
        if (rem == 0) {
            return initial;
        }
        return count(rem - 1, initial) + 1;
    }

Transformation. The iterative version of a tail-recursive function after applying our transformation consists of a while(true)-loop. This loop is inserted first. Then all statements of the function body are moved into this loop, except for the tail-recursive calls, see Listings 4.29 and 4.30.

Listing 4.29: Statements before the transformation.

    int function(<params>) {
        <statement>
    }

Listing 4.30: Inserted while-loop.

    int function(<params>) {
        while (true) {
            <statement>
        }
    }



Tail-recursive calls are replaced by a continue-statement, which ensures that statements after the tail-recursive call are skipped and the next iteration begins. In addition, assignments updating all parameters are inserted before the continue-statement. See Listings 4.31 and 4.32.

Listing 4.31: Tail recursive call statement before the transformation.

    {
        ...
        return recursive_call(arg_0, arg_1, ..., arg_n);
        ...
    }

Listing 4.32: Parameter updates and continue-statement after the transformation.

    {
        ...
        param_0 = arg_0;
        param_1 = arg_1;
        /* params 2 to n - 1 */
        param_n = arg_n;
        continue;
        ...
    }

Call-sites of the tail-recursive function are not affected by the transformation, as the signature of the function is left untouched.

User Interface. The tail-recursion elimination transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.recursion) is installed. It is applied in two simple steps:

• Select the tail-recursive function that shall be transformed.

Figure 4.10: Select tail-recursive function to be transformed

• In the REPARA menu select Remove Recursion.



Figure 4.11: Select Remove Recursion in the REPARA menu


Figure 4.12: The transformed code after replacement of the recursion

Limitations

Currently, we just provide elimination of tail-recursive kernels. The transformation replaces the body by an iterative equivalent. As no call stack is required for storing call arguments, the transformation is straightforward. General recursion could be eliminated as well with more effort, but has not been implemented yet.

It is possible that the newly created temporary variables for storing the parameter values have a name that is already taken in the insertion context, which might result in an uncompilable program. A check could be added that verifies the uniqueness of that name and adapts it if necessary or queries the developer for a unique name suggestion.

Copying the values into temporary variables might have a side-effect, which is duplicated through the temporary variable. Thus the legacy code is expected not to contain such side-effects.

The continue-statement might be nested in another loop-statement. This would break the semantics of the transformed code, as instead of entering the next level of the recursion just this surrounding loop would enter the next iteration. As C++ offers no labeled continue-statement, this issue could be solved by a label at the end of the outer loop body and a corresponding goto-statement. Neither this solution nor a check for the issue is implemented yet.
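The nested-loop problem can be reproduced with a small hypothetical kernel whose tail call sits inside a for-loop. As C++ has no labeled continue, the corrected variant in this sketch jumps to a label at the end of the while-loop body with goto; whether such a goto is acceptable in REPARA C++ is not confirmed here. All functions assume n >= 1.

```cpp
// Counts how often n can be divided by its smallest divisor in [2, 9].
int steps_recursive(int n, int steps = 0) {
    for (int d = 2; d <= 9; ++d) {
        if (n % d == 0) {
            return steps_recursive(n / d, steps + 1);  // tail call in a loop
        }
    }
    return steps;
}

// Naive transformation: continue targets the for-loop, not the while-loop.
int steps_naive(int n, int steps = 0) {
    while (true) {
        for (int d = 2; d <= 9; ++d) {
            if (n % d == 0) {
                n = n / d;
                steps = steps + 1;
                continue;  // BUG: resumes with the next d instead of d = 2
            }
        }
        return steps;
    }
}

// Variant that leaves the inner loop and starts the next outer iteration.
int steps_goto(int n, int steps = 0) {
    while (true) {
        for (int d = 2; d <= 9; ++d) {
            if (n % d == 0) {
                n = n / d;
                steps = steps + 1;
                goto next_call;
            }
        }
        return steps;
    next_call:;
    }
}
```

For n = 12 the recursive and the goto-based versions return 3, whereas the naive version returns 2.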

4.2.5 if-Statement Replacement

The if-statement replacement is a transformation that has a rather narrow range of application. Some high-level synthesis tools benefit from the removal of control flow, because it can enable further optimizations such as loop pipelining. This can be achieved by replacing if statements with conditional expressions [4].

Approach

Transforming an if statement into a conditional expression is not always as straightforward as it may seem. The main challenge lies in the different nature of statements and expressions. The else clause of an if statement is optional but the else clause of a conditional expression is not, because expressions must always evaluate to some value. Additionally, both clauses of a conditional expression must have the same type, because the type of the parent conditional expression must be known at compile time. However, the clauses of an if statement are themselves statements and therefore have neither a value nor a type. Listings 4.33 and 4.34 show how this problem was solved:

Listing 4.33: Before transformation

    void doSomething() { ... }
    void doSomethingElse() { ... }
    bool condition() { ... }

    int main() {
        if (condition()) {
            doSomething();
            doSomething();
        } else {
            doSomethingElse();
        }
    }

Listing 4.34: After transformation

    void doSomething() { ... }
    void doSomethingElse() { ... }
    bool condition() { ... }

    int main() {
        bool c11 = condition();
        c11 ? (doSomething(), 0) : 0;
        c11 ? (doSomething(), 0) : 0;
        !c11 ? (doSomethingElse(), 0) : 0;
    }

The transformation consists of the following steps:

1. Extract condition into local variable
The condition is stored in a local variable to ensure that potential side effects in the expression are only executed once.

2. Create a conditional expression for each statement in the if clauses
The comma operator is used to ensure that the clauses of the conditional expression always have the int value 0. This guarantees that both clauses have a value of the same type, and since the result of the conditional expression is ignored anyway, its exact value does not matter.
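The two steps can be checked with a small instrumented sketch. The counters, the conditionResult flag, and the function names branchBefore and branchAfter are assumptions added for observation; the static_cast<void> wrappers are likewise an addition of this sketch to silence unused-value warnings and are not part of Listing 4.34.

```cpp
// Hypothetical instrumentation, used only to observe the behaviour.
int somethingCalls = 0;
int somethingElseCalls = 0;
int conditionEvaluations = 0;
bool conditionResult = true;

void doSomething() { ++somethingCalls; }
void doSomethingElse() { ++somethingElseCalls; }
bool condition() { ++conditionEvaluations; return conditionResult; }

void resetCounters() {
    somethingCalls = 0;
    somethingElseCalls = 0;
    conditionEvaluations = 0;
}

// Original control flow, as in Listing 4.33.
void branchBefore() {
    if (condition()) {
        doSomething();
        doSomething();
    } else {
        doSomethingElse();
    }
}

// Conditional expressions, as in Listing 4.34: the condition is evaluated
// once, and every clause has the int value 0.
void branchAfter() {
    bool c11 = condition();
    static_cast<void>(c11 ? (doSomething(), 0) : 0);
    static_cast<void>(c11 ? (doSomething(), 0) : 0);
    static_cast<void>(!c11 ? (doSomethingElse(), 0) : 0);
}
```

For either value of conditionResult, both functions evaluate the condition exactly once and perform the same number of calls.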



Sometimes an if statement is used to conditionally assign or return a different value depending on some condition. In these cases the transformation is more straightforward, because both clauses already yield a value that can serve as the result of the conditional expression. An example is shown in Listings 4.35 and 4.36:

Listing 4.35: Before transformation

    #include <iostream>
    int main() {
        int x;
        if (someCondition()) {
            x = 5;
        } else {
            x = 10;
        }
        std::cout << x << '\n';
    }

Listing 4.36: After transformation

    #include <iostream>

    int main() {
        int x = someCondition() ? 5 : 10;
        std::cout << x << '\n';
    }

The plug-in can also transform nested if statements as shown in the following example:

Listing 4.37: Before transformation

    void doSomething() {}
    bool condition1() { ... }
    bool condition2() { ... }
    bool condition3() { ... }

    int main() {
        if (condition1()) {
            doSomething();
            if (condition2()) {
                doSomething();
            }
        } else {
            doSomething();
            if (condition3()) {
                doSomething();
            }
        }
    }

Listing 4.38: After transformation

    void doSomething() {}
    bool condition1() { ... }
    bool condition2() { ... }
    bool condition3() { ... }

    int main() {
        bool c11 = condition1();
        c11 ? (doSomething(), 0) : 0;
        bool c21 = condition2();
        c11 && c21 ? (doSomething(), 0) : 0;
        !c11 ? (doSomething(), 0) : 0;
        bool c22 = condition3();
        !c11 && c22 ? (doSomething(), 0) : 0;
    }

User Interface

The if statement transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.terraformer) is installed. It is applied in two simple steps:

• In the Cevelop code editor, put the cursor inside the if statement that shall be transformed.

Figure 4.13: Put the cursor inside the if statement

• In the REPARA menu select Transform If-Statements.

Figure 4.14: Select Transform If-Statements in the REPARA menu


Figure 4.15: The transformed code after if-statement replacement

Limitations

The transformation has a few limitations. The if statements that should be transformed cannot contain any loops, declarations or switch statements. Only expression statements, return statements and other if statements are allowed.

4.2.6 Inlining Function Pointers

According to the REPARA C++ Open Specification [6] function pointers are not allowed due to limitations implied by the OpenCL target. Dynamic jumps in control flow are not supported on this target platform. Consequently, legacy source code using function pointers is not valid REPARA C++. In Listing 4.39 we have an example that violates this restriction. The call apply(square, 2) is not allowed according to the RCPP specification.

Listing 4.39: Example code with a function pointer parameter

    // square is the function which is passed by pointer
    unsigned square(unsigned i) {
        return i * i;
    }

    // apply takes a function by pointer and applies it to the parameter i
    unsigned apply(unsigned (*f)(unsigned), unsigned i) {
        return f(i);
    }

    // in the main function we have the call of apply that gets square passed as pointer
    int main() {
        apply(square, 2);
    }

The REPARA C++ Open Specification considers another case, which is not treated explicitly here: it also mentions inlining of lambda functions. In this case we suggest adapting the original source code so that it becomes a valid target for this transformation. This can be achieved manually or by corresponding tooling support, which might be available in the future [2].

Approach

The REPARA C++ Open Specification proposes a possible solution for legacy source code that does not conform to this restriction. For a specific call the illegal function pointer could be inlined and therefore eliminated from the interface of the calling function. In the example in Listing 4.39 the square argument would be removed at the call-site. We propose a transformation resulting in the semantically equivalent code, as seen in Listing 4.40.

Listing 4.40: Inlined function pointer parameter.

    unsigned square(unsigned i) {
        return i * i;
    }

    unsigned apply(unsigned (*f)(unsigned), unsigned i) {
        return f(i);
    }

    unsigned apply_square(unsigned i) {
        return square(i);
    }

    int main() {
        apply_square(2);
    }

Just removing the parameter would probably break compilation of the legacy source code, as there is no overload of the apply function with just one parameter. Yet there might already exist a corresponding overload of the apply function. In order to avoid conflicts with already existing overloads we create a new function with a combined name of the calling function apply and the called function square: apply_square.

The transformed version does not conflict with the REPARA C++ Open Specification anymore and, if it was part of a REPARA kernel, could be extracted.

Implementation

The implementation of this transformation is available in the REPARA plug-ins for Cevelop. It is invoked independently of the Checkair plug-in, but an integration into the problem resolution plug-in for the Checkair plug-in is possible. The transformation is applied in the following steps:

• Checking of conditions

• Replacement of the function call

• Insertion of new declaration for called function

• Insertion of new declaration for calling function

• Insertion of calling function definition

Check of Conditions. Inlining function pointers does not depend on any REPARA attributes and thus can be applied without additional information beyond the selection of the function reference in a function call. In the current implementation the function supplied as pointer needs to be a direct reference to a specific function. More specifically, variables referring to a function cannot be inlined yet (see Issues of the Implementation for details). Furthermore, it is mandatory that the function pointer is an argument of a function call. If the conditions above are not satisfied, a corresponding error message is displayed to the user.

Replacement of the Function Call. Replacement of the function call performs two modifications to the original call taking a function pointer as argument:

• The function call is renamed to the new function with the inlined function pointer. The new name is a combination of the names of the called function and the inlined function pointer.

• The argument representing the inlined function pointer is removed.

Listings 4.41 and 4.42 show an incomplete example where the function pointer square is inlined in the call of the apply function.

Listing 4.41: Call-site with a function pointer argument.

    int main() {
        apply(square, 2);
    }

Listing 4.42: Call-site after the transformation.

    int main() {
        apply_square(2);
    }

Insertion of New Declaration for Called Function. In the context of the implementation of the calling function, the function referred to by the function pointer parameter is most likely not known. In order to inline this parameter and to directly call the function, a declaration is needed before the new function definition. For example, in the case of Listing 4.43, when we inline the call apply(square, 2) we create a new definition of apply, which directly references the function square.

Listing 4.43: Definition of the apply function.

    // The function square is not known at this point
    unsigned apply(unsigned (*f)(unsigned), unsigned i) {
        return f(i);
    }

To overcome this issue we add a declaration of the called function. The signature of this declaration is derived from the signature of the function pointer parameter. The declaration can be seen in Listing 4.44.

Listing 4.44: Declaration of square as needed in apply_square.

    unsigned square(unsigned);

Insertion of New Declaration for Calling Function. As we create a new function implementation of the calling function, which gets a new name, we also need a declaration for this function. That declaration is added next to the existing declaration of the previously called function. Otherwise, the new function implementation might not be found at the call-site. In the case of our example we add the declaration of apply_square before the declaration of apply, as seen in Listing 4.45.

Listing 4.45: Declaration of apply_square as needed at the call-site.

    unsigned apply_square(unsigned);
    unsigned apply(unsigned (*)(unsigned), unsigned);

Insertion of Calling Function Definition. Eventually, the plug-in inserts the new definition of the apply_square function with all calls of the function pointer parameter replaced. The implementation is added to the translation unit containing the implementation of the previously called function. Due to the declaration we have added to the header, it will be accessible from the call-site.

This new function definition has the combined name of the replaced function and the inlined function argument, in our case apply_square. The function pointer parameter is removed and all references to the function pointer parameter are replaced by the inlined function, square in our example, as shown in Listing 4.46.

Listing 4.46: Definition of the calling function apply_square.

    unsigned apply_square(unsigned i) {
        return square(i);
    }

    unsigned apply(unsigned (*f)(unsigned), unsigned i) {
        return f(i);
    }

User Interface. The inline function pointer transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.inlinefunctionptr) is installed. It is applied in two simple steps:

• Select the function argument that is the function reference to be inlined.

Figure 4.16: Select the function reference argument to be inlined

• In the REPARA menu select Inline Function Pointer.

Figure 4.17: Select Inline Function Pointer in the REPARA menu




Figure 4.18: The transformed code with the inlined function pointer

Issues of the Implementation If the argument of the calling function is not directly thefunction to be inlined, for example a local variable referring to the inlined function, the trans-formation does not work. In the example in Listing 4.47, f in apply(f, 2) is a local variable.While the example below is trivial and f could be resolved to square the general case is muchmore di�cult as the function f is referring to might change dynamically.

Listing 4.47: Local variable as function argument.

    unsigned square(unsigned i) {
        return i * i;
    }

    unsigned apply(unsigned (*f)(unsigned), unsigned i) {
        return f(i);
    }

    int main() {
        unsigned (*f)(unsigned);
        f = square;
        apply(f, 2);
    }

There is a simple workaround for this case. The local variable f can be inlined first with the well-known Inline Temp refactoring [8]. Currently, the issue is recognized and reported to the user if the transformation is applied in such a case.
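As a sketch of this workaround, the two functions below show the call before and after Inline Temp has been applied to the local variable. The function names call_via_local and call_direct are hypothetical and stand in for the body of main in Listing 4.47.

```cpp
unsigned square(unsigned i) { return i * i; }
unsigned apply(unsigned (*f)(unsigned), unsigned i) { return f(i); }

// Before: the argument is a local variable, so the inline function
// pointer transformation reports the issue instead of transforming.
unsigned call_via_local() {
    unsigned (*f)(unsigned) = square;
    return apply(f, 2);
}

// After Inline Temp: the argument names square directly, which is the
// form the inline function pointer transformation expects.
unsigned call_direct() {
    return apply(square, 2);
}
```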

Possible Extensions

The following improvements to the usability of the automated transformation could be realized in the future:

Check for Name Duplication It is possible that the newly created function gets a name that is already taken in the insertion context, which might result in an uncompilable program. A check could be added that verifies the uniqueness of that name and adapts it if necessary, or queries the developer for a unique name.
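One way such a check could adapt the name is sketched below. This is not part of the plug-in: the function make_unique_name, the numeric suffix scheme and the representation of the insertion context as a set of strings are all assumptions made for illustration.

```cpp
#include <set>
#include <string>

// Hypothetical helper: returns `base` if it is not taken in the insertion
// context, otherwise the first free name of the form base_2, base_3, ...
std::string make_unique_name(const std::string& base,
                             const std::set<std::string>& taken) {
    if (taken.count(base) == 0) {
        return base;
    }
    for (unsigned suffix = 2;; ++suffix) {
        const std::string candidate = base + "_" + std::to_string(suffix);
        if (taken.count(candidate) == 0) {
            return candidate;
        }
    }
}
```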

Cascading Inlining If the inlined function pointer parameter is used for something other than a function call, for example as a function pointer argument itself, the resulting source code would still not be REPARA C++ compliant. The inline function pointer transformation could be applied recursively. An extension of this transformation could apply such cascading replacements automatically.

Wizard for Problem Resolution In the current proof-of-concept implementation, the transformation just fails if it is not applied to a valid context with a correct selection. These constraints could be relaxed and a wizard could guide the developer through the transformation and allow the selection of the specific program elements that represent the calling and called function as well as the specific call-site to be targeted by the inline transformation.

4.2.7 Reinterpreting Data as Another Type

Some people may find it useful to reinterpret the underlying bit pattern of a value as a value of another type. Pointer aliasing is one way to do this in C/C++. However, it is dangerous, because it often leads to violations of the strict aliasing rule, which results in undefined behaviour [9]. For this reason, developers often resort to using memcpy() instead. An example is shown in Listing 4.48:

Listing 4.48: Reinterpreting data using memcpy()

    #include <iostream>
    #include <cstddef>
    #include <climits>
    #include <cstring>  // std::memcpy

    int main() {
        static_assert(sizeof(long long) == sizeof(double),
                      "Expects same size for long long and double");
        constexpr std::size_t byte_count = sizeof(long long);
        constexpr std::size_t bit_count = byte_count * CHAR_BIT;

        double d = 5.5;
        long long i;
        std::memcpy(&i, &d, byte_count);
        i |= 1LL << (bit_count - 1);
        std::memcpy(&d, &i, byte_count);
        std::cout << d << '\n'; // prints '-5.5'
    }

In the example above, data is reinterpreted by copying the value of a double variable into a long long variable using memcpy(). This makes it possible to use the bitwise OR operator | to manipulate individual bits of the value, here the sign bit. After the value has been changed, it is copied back to the original double variable.

The static_assert() at the beginning of main() prevents the code from compiling if the types double and long long don't have the same size. Note that the code makes assumptions about the internal representation of double and long long, which are not necessarily correct for all platforms.

Unfortunately, OpenCL C does not support memcpy(). Instead it provides special built-in functions that can be used for this task. Listing 4.49 performs an operation similar to the example in Listing 4.48. The functions as_uint() and as_float() are used to reinterpret the data:



Listing 4.49: Reinterpreting data in OpenCL

    __kernel void square(__global float* in, __global float* out) {
        size_t tid = get_global_id(0);
        float f = in[tid];
        uint u = as_uint(f);
        u |= (1u << 31);  // unsigned literal: the shift sets the top bit
        f = as_float(u);
        out[tid] = f;
    }

The purpose of this transformation is to find places where memcpy() is used to reinterpret data and to provide the option to automatically transform it to a version that uses OpenCL's as_type() functions. Note that this transformation has not been implemented yet.
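Since the transformation is not implemented, the following is only a sketch of what a host-side counterpart of the as_type() functions might look like in C++. The namespace sketch, the function template as and the helper set_sign_bit are hypothetical names; the example further assumes a 32-bit float in IEEE 754 format.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

namespace sketch {

// Hypothetical counterpart of OpenCL's as_type(): copies the object
// representation of `from` into a value of type To via memcpy, which
// avoids the strict aliasing problems of pointer casts.
template <typename To, typename From>
To as(const From& from) {
    static_assert(sizeof(To) == sizeof(From),
                  "source and target type must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Mirrors the kernel body of Listing 4.49 on the host: sets the top bit
// of a float (the sign bit, assuming IEEE 754 single precision).
inline float set_sign_bit(float f) {
    std::uint32_t u = as<std::uint32_t>(f);
    u |= (std::uint32_t{1} << 31);
    return as<float>(u);
}

}  // namespace sketch
```

With such a helper, the pair of memcpy() calls in Listing 4.48 would collapse into two calls of as, which is close to the shape of the OpenCL code the transformation is meant to produce.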

4.2.8 Quick Fixes

As described in chapter 2, the RCPP compliance violations are marked and reported to the developer in Cevelop. A couple of those violations can be fixed directly with one click. In Eclipse such resolutions for reported problems are called quick fixes [3].

We provide quick fixes for the following cases:

• Replace new/delete

• Replace library call

• Replace bit-field

• Variadic macros/functions

Replace new/delete

Dynamic memory management is restricted in RCPP compliant code. A possible workaround that allows a platform-specific version of memory management is to implement surrogate calls. We need to handle the following cases of dynamic memory management:

• Allocation of a single value or object. It is replaced with a call to repara::alloc, which is a template function taking the type of the allocated object as template argument. The arguments of the new expression are copied to the arguments of the function call. An example is shown in Listings 4.50 and 4.51.

• Allocation of an array. It is replaced with a call to repara::alloc_array, a template function taking the type of the allocated array as template argument. An example is shown in Listings 4.52 and 4.53.

• Allocation of a multi-dimensional array. It is replaced with a call to repara::alloc_array, a template function taking the type of the allocated array as template argument. Furthermore, the constant dimensions are added as additional template arguments. An example is shown in Listings 4.54 and 4.55.

• Deallocation of a single value or object. It is replaced by a call to repara::free with the pointer to be deallocated as argument. An example is shown in Listings 4.56 and 4.57.

• Deallocation of an array. It is replaced by a call to repara::free_vectored with the pointer to the array as argument. An example is shown in Listings 4.58 and 4.59.



Listing 4.50: new expression before quick fix application.

    int *i = new int{23};

Listing 4.51: repara::alloc expression after quick fix application.

    int *i = repara::alloc<int>(23);

Listing 4.52: new[] expression before quick fix application.

    new int[64];

Listing 4.53: repara::alloc_array expression after quick fix application.

    repara::alloc_array<int>(64);

Listing 4.54: new[] expression for multi-dimensional arrays before quick fix application.

    unsigned dynamic = ...;
    new int[dynamic][64];

Listing 4.55: repara::alloc_array expression for multi-dimensional arrays after quick fix application.

    unsigned dynamic = ...;
    repara::alloc_array<int, 64>(dynamic);

Listing 4.56: delete expression before quick fix application.

    delete i;

Listing 4.57: repara::free expression after quick fix application.

    repara::free(i);

Listing 4.58: delete[] expression before quick fix application.

    delete[] arr;

Listing 4.59: repara::free_vectored expression after quick fix application.

    repara::free_vectored(arr);

If a REPARA dynamic memory operation is used, a platform-specific implementation of the corresponding functions can be provided. Currently, we expect them to be provided in a header called repara_memory.h. This header is included, if not already present, when this quick fix is applied. Listing 4.60 shows the include.

Listing 4.60: repara_memory.h include.

    #include "repara_memory.h"

Limitations Since it is unclear how exactly the repara::alloc_array calls will be implemented and how arguments for array initialization should be passed, initializers for dynamically allocated arrays are currently not supported.
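To make the shape of the surrogate calls concrete, the following sketch shows one possible host-side fallback for the contents of repara_memory.h. It simply forwards to new and delete; the signatures are assumptions derived from the call forms in Listings 4.51 to 4.59, not the interface defined by the project, and a platform-specific version would replace the function bodies.

```cpp
#include <cstddef>
#include <utility>

namespace repara {

// Single value or object: constructor arguments are forwarded.
template <typename T, typename... Args>
T* alloc(Args&&... args) {
    return new T(std::forward<Args>(args)...);
}

// One-dimensional array with a run-time length.
template <typename T>
T* alloc_array(std::size_t n) {
    return new T[n];
}

// Two-dimensional array: run-time outer dimension n, constant inner
// dimension N passed as an additional template argument.
template <typename T, std::size_t N>
auto alloc_array(std::size_t n) -> T (*)[N] {
    return new T[n][N];
}

// Counterparts of delete and delete[].
template <typename T>
void free(T* p) {
    delete p;
}

template <typename T>
void free_vectored(T* p) {
    delete[] p;
}

}  // namespace repara
```

In this sketch the two alloc_array overloads are distinguished by the number of explicit template arguments, so alloc_array<int>(64) and alloc_array<int, 64>(dynamic) resolve to different functions. Array initializers are left out, in line with the limitation described above.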



Replace Library Call

Calls to standard libraries are restricted in RCPP compliant code [6]. A possible workaround that supports a platform-specific version of the corresponding standard library is to implement surrogate calls. The quick fix replaces the library call with a call to a REPARA-specific implementation from the repara namespace, which shall be declared/defined in the corresponding REPARA header. The illegal library calls are recognized by the conformance checker, as described in chapters 2 and 3.

Listings 4.61 and 4.62 show an example of such a library call replacement:

Listing 4.61: Example for invalid library call.

    int r = rand();

Listing 4.62: Example for replacement of library call.

    int r = repara::rand();

If necessary, an include is added to import the surrogate function. Listing 4.63 shows the include.

Listing 4.63: repara_stdlib.h include.

    #include "repara_stdlib.h"

Limitations Currently, all function calls at the marker of the corresponding quick fix are replaced, as we lack detailed information about which call exactly is illegal. In most cases this should not be a major issue. In the worst case a call is replaced with a REPARA version that does not exist, but this can easily be fixed by the developer.
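As an illustration of what a surrogate in repara_stdlib.h could look like, the sketch below provides repara::rand as a small linear congruential generator, using the constants of the portable example implementation given in the C standard. Everything here is an assumption made for illustration: the project does not prescribe this algorithm, the names rand_max, rand_state and srand are hypothetical, and the function-local static state would not be suitable for every target platform.

```cpp
#include <cstdint>

namespace repara {

// Largest value returned by the sketched generator.
constexpr int rand_max = 32767;

// Generator state, kept in a function-local static (host-side sketch only).
inline std::uint32_t& rand_state() {
    static std::uint32_t state = 1;
    return state;
}

// Hypothetical counterpart of std::srand.
inline void srand(std::uint32_t seed) {
    rand_state() = seed;
}

// Hypothetical surrogate for std::rand: one step of a linear congruential
// generator, returning a value in the range [0, rand_max].
inline int rand() {
    std::uint32_t& s = rand_state();
    s = s * 1103515245u + 12345u;
    return static_cast<int>((s / 65536u) % 32768u);
}

}  // namespace repara
```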

Replace Bit-field

According to the REPARA C++ Open Specification, bit-fields are not supported [6]. This restriction is implied by constraints of the OpenCL target platform. The suggested solution is either not to allow bit-fields at all or to convert the bit-fields into non-bit-field member variables. This conversion is provided by this transformation.

The suggested resolution in [6] is to perform a conversion from unsigned to int if the unsigned bit-field can be stored in the int considering the width. However, this conversion does not seem to be required, as a bit-field cannot consist of more significant bits than the underlying type has on the specific platform. As unsigned is supported by REPARA C++, it should be sufficient to remove the size specifier of the bit-field, converting it into a member variable of the specified type, even in the case of unsigned bit-fields.

Furthermore, we have to consider additional cases. Bit-fields can be unnamed, representing padding in the memory layout of the structure. Such unnamed bit-fields can have a width of 0 bits, which aligns the subsequent bit-field to the next allocation unit boundary. Since unnamed bit-fields do not represent a member of the enclosing structure, they can be removed without replacement.

Note: While the memory layout changes after this transformation, the semantics of the resulting code should be unaffected, unless the source code relies on overflow effects on the bits used, which are only defined for unsigned member variables.

An example for this quick fix containing illegal bit-fields in a struct is shown in Listing 4.64. The same code would be transformed to the struct shown in Listing 4.65.



Listing 4.64: Example struct with bit-fields.

    struct S {
        unsigned a : 4;
        unsigned : 0;
        unsigned b : 4;
    };

Listing 4.65: Example struct after removal of bit-fields.

    struct S {
        unsigned a;
        unsigned b;
    };

Limitations Unsigned bit-fields could rely on the specified behavior in case of overflow, starting over at the value 0 if that happens. Due to the potential expansion of the value range when removing the bit-field specifier, the semantics of the program might change. A possible workaround for this issue could be to insert modulo operations at write operations to former unsigned bit-fields, with a modulus of 2^n, where n is the number of bits of the bit-field.
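The modulo workaround could look roughly as follows. The sketch is not generated by the quick fix; the names WithBitField, WithoutBitField, wrap_to_width and set_a are hypothetical, and the width is assumed to be smaller than the number of bits of unsigned.

```cpp
// Struct with a 4-bit unsigned bit-field, as before the quick fix.
struct WithBitField {
    unsigned a : 4;
};

// Struct after the quick fix: the width specifier has been removed.
struct WithoutBitField {
    unsigned a;
};

// Hypothetical helper: reduces `value` modulo 2^width, which is what a
// store to an unsigned bit-field of that width does implicitly.
constexpr unsigned wrap_to_width(unsigned value, unsigned width) {
    return value % (1u << width);
}

// Write operation with the inserted modulo operation.
inline void set_a(WithoutBitField& s, unsigned value) {
    s.a = wrap_to_width(value, 4);
}
```

With this helper, incrementing a member that holds 15 yields 0 for both structs, so code that relies on the wrap-around keeps its behaviour.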

Variadic Macros/Functions

According to the REPARA C++ Open Specification (RC++ Spec), the use of variadic macros is restricted. This restriction is imposed by the required compatibility with the OpenCL language, which itself does not support variadic macros.

Listing 4.66 shows an example of a variadic macro, followed by a variadic function.

Listing 4.66: Examples of Variadic Macros and Variadic Functions

    #include <iostream>
    #include <cstdarg>
    // variadic macro
    #define eprintf(...) simple_printf(std::cerr, __VA_ARGS__)

    // variadic function
    void simple_printf(std::ostream& out, const char* fmt, ...) {
        ...
    }

    int main() {
        eprintf("ddfd", 1, 2, 3.1, 5);
    }

Proposed resolution The RC++ Spec proposes a workaround for variadic macros and functions using the following mechanism (cited directly from the paper):

1. First, at compile, if a function is declared with a variadic parameter list, and the call does not use the variadic keyword, then the function is treated as if the parameter list were replaced by one or more occurrences of its element type, as needed to match the call.

2. After such expansion the function might have effective argument types identical to some non-variadic function. In that case the function appearing earlier in the search path is used, or if the two functions are in the same schema, the non-variadic one is preferred.

3. Second, if any variadic function or macros cannot statically transformable into nonvariadic ones. REPARA C++ generate a compiler error such as "variadic function/-macros is not supported.".

This mechanism is lacking in the following aspects:

• Equal treatment of variadic functions and macros is inappropriate. Even though variadic functions and macros look syntactically similar, they have very different semantics. Technically, they are not even part of the same language. Accordingly, the strategies for coping with the restrictions of variadic functions and macros should reflect these dissimilarities.

• Also, the proposed transformation mechanism is naive and has a high probability of breaking the program on the syntactic or semantic level. Simply expanding a variadic function in the hope of finding a matching overload is both risky and potentially dangerous.

In light of the proposed resolution's shortcomings, HSR designed new transformations to cope with both variadic macros and functions that have a much higher success rate and are less dangerous to apply.

The following sections describe the new transformations and illustrate them using the program in Listing 4.66.

Variadic Macros The transformation for variadic macros is straightforward. They are "pre-processed" at the call-site and replaced by their expansion. Cevelop provides the option to execute the transformation either for a specific call-site or for all calls at once.

The transformation verifies that the resulting code is syntactically correct. Cevelop only displays the corresponding quick fix to the user if the macro can be expanded safely. If an automated transformation without syntax errors is not possible, Cevelop displays only a warning, and the user has to transform the macro manually. This workflow ensures that the transformation will not break the source code.

Listing 4.67: Example for variadic macro.

    #define eprintf(...) simple_printf(std::cerr, __VA_ARGS__)
    // ..
    int main() {
        [[rpr::kernel]]
        eprintf("ddfd", 1, 2, 3.1, 5);
    }

Listing 4.68: Example for replacement of variadic macro.

    #define eprintf(...) simple_printf(std::cerr, __VA_ARGS__)
    // ..
    int main() {
        [[rpr::kernel]]
        simple_printf(std::cerr, "ddfd", 1, 2, 3.1, 5);
    }

Variadic Functions Variadic functions are transformed using an auxiliary class template that mimics the semantics of the C-style varargs primitives va_list, va_start, va_copy, va_arg, and va_end. With the help of this class it is possible to transform any variadic function into a fixed-arity function, as long as the aforementioned primitives are used in the correct manner.

Listing 4.69: Auxiliary class template.

    namespace rpr {

    template<typename... E>
    struct Container;

    template<>
    struct Container<> {
        void fill(unsigned, void*) {}
    };

    template<typename T, typename... E>
    struct Container<T, E...> : public Container<E...> {
        Container(T const& t, E const&... e) : Container<E...>{e...}, t{t} {}

        void fill(unsigned index, void** p) {
            p[index] = &t;
            Container<E...>::fill(++index, p);
        }
    private:
        T t;
    };

    template<typename... E>
    struct va_args {
        va_args(E const&... e) : c{e...} {
            c.fill(0, p);
        }
        Container<E...> c;
        void* p[sizeof...(E)];
        unsigned next_index = 0;

        template<typename T>
        T next() {
            if (next_index >= sizeof...(E)) {
                return T{};
            }
            T* t = static_cast<T*>(p[next_index++]);
            return *t;
        }

        va_args start() {
            va_args copy = *this;
            copy.next_index = 0;
            return copy;
        }
    };
    }

Using variadic templates and recursion, the class template wraps the variadic arguments, reducing them to a single argument. This allows an arbitrary number of arguments to be passed to the function as a single argument, thus removing the variadic nature of the function.

Since a variadic function may be called with different numbers of arguments, the exact type of the auxiliary class template differs for each call. To account for this, the function is transformed into a function template, with the instantiated va_args class template as the template argument. Listings 4.70 and 4.71 show the signature of a variadic function before and after the transformation, respectively.

Listing 4.70: Signature of the variadic function before the transformation.

    void simple_printf(const char*, ...)

Listing 4.71: Signature of the function after the transformation.

    template <typename T>
    void simple_printf(const char*, T)

The change to the function signature implies that the function call-site must be changed as well. More precisely, an object of type va_args<...> must be constructed and passed to the transformed variadic function. To avoid having to write the exact type of the va_args<...> object at each function call-site, this task is delegated to the function template make_va_args(E const&... e) in Listing 4.72, which takes the former variadic arguments and returns an object of the proper type.

Listing 4.72: Factory for va_args objects.

    namespace rpr {
    template<typename... E>
    va_args<E...> make_va_args(E const&... e) {
        return va_args<E...>{e...};
    }
    }

Listings 4.73 and 4.74 show the call-site of a variadic function before and after the transformation, respectively.

Listing 4.73: Call-site of the variadic function before the transformation.

    simple_printf("dcff", 3, 'a', 1.999, 42.5);

Listing 4.74: Call-site after the transformation.

    simple_printf("dcff", rpr::make_va_args(3, 'a', 1.999, 42.5));

As mentioned before, this class template can be used to imitate C-style varargs semantics.There is a mapping from each va_* command to the class template.

1. va_list declaration → va_args declaration

2. va_start → start() member-function

3. va_arg → next() member-function

4. va_copy → copy constructor. Works as long as all variadic arguments are copyable.

5. va_end → can be removed.
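The mapping above can be tried out with a compact, self-contained sketch. Note that this is not the auxiliary class from Listing 4.69 but a simplified alternative that stores the arguments in a std::tuple (C++14); the namespace sketch and the function sum_ints are hypothetical. As with va_arg, the type passed to next must match the type of the stored argument exactly, and the sketch is only meant for arithmetic arguments.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch {

template <typename... E>
class va_args {
public:
    explicit va_args(E const&... e) : values_(e...) {}

    // Counterpart of va_arg: returns the next argument, or T{} once all
    // arguments have been consumed.
    template <typename T>
    T next() {
        if (next_index_ >= sizeof...(E)) {
            return T{};
        }
        const void* p =
            address(next_index_++, std::index_sequence_for<E...>{});
        return *static_cast<const T*>(p);
    }

    // Counterpart of va_start: a copy that starts at the first argument.
    va_args start() const {
        va_args copy = *this;
        copy.next_index_ = 0;
        return copy;
    }

private:
    // Looks up the address of the i-th stored argument at run time.
    template <std::size_t... I>
    const void* address(std::size_t i, std::index_sequence<I...>) const {
        const void* table[] = {&std::get<I>(values_)..., nullptr};
        return table[i];
    }

    std::tuple<E...> values_;
    std::size_t next_index_ = 0;
};

template <typename... E>
va_args<E...> make_va_args(E const&... e) {
    return va_args<E...>(e...);
}

// Transformed form of a variadic `int sum_ints(unsigned count, ...)`:
// va_list/va_start disappear, va_arg becomes next<int>().
template <typename Args>
int sum_ints(unsigned count, Args args) {
    int total = 0;
    for (unsigned n = 0; n < count; ++n) {
        total += args.template next<int>();
    }
    return total;
}

}  // namespace sketch
```

A call such as sum_ints(3, 1, 2, 3) would become sketch::sum_ints(3, sketch::make_va_args(1, 2, 3)). Because the addresses are looked up from the tuple on each call to next, copies of the object made for va_copy or start do not refer back to the object they were copied from.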

During the transformation, these mappings are applied to the function body. Listings 4.75 and 4.76 show the definition of a variadic function before and after the transformation, respectively.

Listing 4.75: Variadic function before the transformation.

    void simple_printf(const char* fmt...) {
        va_list args;
        va_start(args, fmt);
        while (*fmt != '\0') {
            if (*fmt == 'd') {
                int i = va_arg(args, int);
                std::cout << i << '\n';
            } else if (*fmt == 'c') {
                int c = va_arg(args, int);
                std::cout << static_cast<char>(c);
                std::cout << '\n';
            } else if (*fmt == 'f') {
                double d = va_arg(args, double);
                std::cout << d << '\n';
            }
            ++fmt;
        }
        va_end(args);
    }

Listing 4.76: Function after the transformation.

    template <typename T>
    void simple_printf(const char* fmt, T args) {
        while (*fmt != '\0') {
            if (*fmt == 'd') {
                int i = args.template next<int>();
                std::cout << i << std::endl;
            } else if (*fmt == 'c') {
                int c = args.template next<int>();
                std::cout << static_cast<char>(c);
                std::cout << '\n';
            } else if (*fmt == 'f') {
                double d = args.template next<double>();
                std::cout << d << '\n';
            }
            ++fmt;
        }
    }

The transformation is provided as a quick fix on the 'variadic function' marker. The auxiliary C++ code is part of a header that must be present in the include path.

Replacement of Basic Types The data sent to FPGAs must conform exactly to the specified interface on the bit level. For instance, if a certain kernel expects a 32-bit signed integer followed by a 64-bit floating point number, the transmitted data must be exactly 96 bits long, or the communication will fail.

FPGA kernels are generated from RC++ code and communicate at runtime with other RC++ code that was potentially compiled on a different machine. It is therefore important to have data types with fixed bit-width, which are independent of the machine where RC++ programs are compiled and executed.

The Basic Type Replacement transformation locates variables with variable-width types such as int or long within rpr::kernels, and provides a quick fix that transforms the type into its fixed-width equivalent from the cstdint header. Currently, the transformation targets parameter declarations, simple declarations, and function return values.

Listings 4.77 to 4.80 exemplify how the transformation may be applied to a piece of RC++ code.

Listing 4.77: Variable declaration before quick fix application.

    unsigned int x = 5;

Listing 4.78: Variable declaration after quick fix application.

    uint32_t x = 5;

Listing 4.79: Function declaration before quick fix application.

    int foo(char x, long y);

Listing 4.80: Function declaration after quick fix application.

    int32_t foo(int8_t x, int64_t y);
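The 96-bit example mentioned above can be made concrete with a short sketch. It is not taken from the REPARA tool chain: the type Message and the functions pack and unpack are hypothetical, and the sketch assumes 8-bit bytes and a 64-bit double. The byte order is that of the host, so this only illustrates the size requirement, not a complete wire format.

```cpp
#include <array>
#include <climits>
#include <cstdint>
#include <cstring>

static_assert(CHAR_BIT == 8, "sketch assumes 8-bit bytes");
static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

// Hypothetical message layout: a 32-bit signed integer followed by a
// 64-bit floating point number, 96 bits in total.
using Message = std::array<unsigned char, 12>;

// Copies both values into the buffer without any padding in between.
inline Message pack(std::int32_t id, double value) {
    Message m{};
    std::memcpy(m.data(), &id, sizeof id);
    std::memcpy(m.data() + sizeof id, &value, sizeof value);
    return m;
}

// Reads both values back from the buffer.
inline void unpack(const Message& m, std::int32_t& id, double& value) {
    std::memcpy(&id, m.data(), sizeof id);
    std::memcpy(&value, m.data() + sizeof id, sizeof value);
}
```

With a plain int instead of std::int32_t, the size of the first field would depend on the compiling machine. Note also that a struct with the same two members is typically padded by the compiler and is then larger than 96 bits, which is why the sketch copies the fields into a byte buffer.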

4.2.9 Compound to Function Conversion

The current implementation of the kernel extraction of work package 4 [11] only allows the extraction of function calls. During the course of the project we often encountered examples that contained compound statements which were annotated as REPARA kernels. Since the kernel extraction cannot cope with such a compound statement kernel, we have implemented an extract-function-like transformation that transforms a compound statement kernel into a function-call kernel.

Approach

Conversion of a compound statement to a function works on an annotated compound statement and replaces it with a call to a corresponding function. The definition of this function is derived from the statements in the replaced compound statement. This transformation is more lightweight than a complete extract function refactoring, as it does not need to perform a complete data-flow analysis. Input and output of the created function are derived from the REPARA kernel attributes [7]. The data-flow information in the kernel attributes is expected to be correct. Otherwise, the result of the transformation might not be valid.

Listing 4.81 shows an example of a compound statement that is a possible target for the transformation. The compound statement is annotated with rpr::kernel and rpr::in/rpr::out for the local variables x and y.

Listing 4.81: Attributed compound statement.

    void callsite() {
        int x{}, y{};
        [[rpr::kernel, rpr::in(x), rpr::out(y)]]
        {
            y = x;
        }
    }

The transformed result is shown in Listing 4.82. A new function definition is created, which is called extracted_kernel. For each local variable referred to in the REPARA attributes a parameter is created. This parameter is a reference parameter that has the same name as the variable in the attribute. The type of the parameter is derived from the variable declaration.

At the call-site the compound statement is replaced by a call to the newly added function. An argument is added for each parameter of the extracted_kernel function. The REPARA attributes stay the same and now allow the extraction of the kernel.

Listing 4.82: Result of the compound to function transformation.

    void extracted_kernel(int& x, int& y) {
        y = x;
    }

    void callsite() {
        int x{}, y{};
        [[rpr::kernel, rpr::in(x), rpr::out(y)]]
        extracted_kernel(x, y);
    }

Implementation

The implementation of this transformation is available in the REPARA plug-ins for Cevelop. The transformation is applied in the following steps:

• Checking of conditions

• Insertion of extracted kernel function definition

• Replacement of the compound statement

Checking of Conditions It is mandatory that the compound statement to be converted into a function has an rpr::kernel annotation. While the kernel annotation itself is not that important, the indications about which variables are in and out variables of the kernel are mandatory. A kernel without in and out attributes cannot have a significant side-effect and therefore cannot be converted. Furthermore, the compound statement must not contain control-flow statements that jump out of the compound statement. For example, the compound statement in Listing 4.83 is not a valid target for the transformation for two reasons:

• It does not contain the required attributes [[rpr::kernel, rpr::in(x), rpr::out(x)]].

• It contains a return statement.

Listing 4.83: Invalid target example.

    int compute(int x) {
        while (x < 1000) {
            { // compound statement lacks the required attributes
                if (x < 0) {
                    return -1; // invalid statement for extraction
                }
                x *= x;
            }
        }
        return x;
    }

Insertion of Extracted Kernel Function Definition From the selected compound statement a new function definition is created. The body of that function definition consists of the statements of the compound statement. In order to have the local variables used in the compound statement available in the created function, corresponding parameters are added. These parameters are derived from the REPARA in and out attributes. To avoid changes in the semantics and to make all side-effects visible beyond the function, all parameters are passed by reference. The type of each parameter is determined from the type of the corresponding variable. If an attribute references an unknown variable, the attribute is ignored. Similarly, if a variable is referenced multiple times, it is only added once as a parameter.

Listing 4.84 shows an example of this behavior. The result of the conversion is the function shown in Listing 4.85.

Listing 4.84: Example compound statement for conversion

    void callsite() {
        int x{};
        [[rpr::kernel, rpr::in(x), rpr::out(x)]]
        {
            x *= x;
        }
    }

From the compound statement above the following function definition is inserted:

Listing 4.85: Function inserted by the conversion

    void extracted_kernel(int& x) {
        x *= x;
    }

Note: The created function always has the return type void.
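The parameter derivation rules described above can be summarised in a short sketch. It models the attributes and the local declarations as plain strings, which is a simplification; the function derive_signature and its data representation are hypothetical and do not reflect how the Cevelop plug-in works on the AST.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the parameter derivation:
//  - attribute_vars: variable names in the order they appear in the
//    rpr::in / rpr::out attributes
//  - local_types: declared type of each known local variable
std::string derive_signature(
    const std::vector<std::string>& attribute_vars,
    const std::map<std::string, std::string>& local_types) {
    std::string params;
    std::vector<std::string> seen;
    for (const std::string& name : attribute_vars) {
        const auto type = local_types.find(name);
        if (type == local_types.end()) {
            continue;  // unknown variable: the attribute is ignored
        }
        if (std::find(seen.begin(), seen.end(), name) != seen.end()) {
            continue;  // variable referenced again: added only once
        }
        seen.push_back(name);
        if (!params.empty()) {
            params += ", ";
        }
        params += type->second + "& " + name;  // always by reference
    }
    return "void extracted_kernel(" + params + ")";  // always void
}
```

For the attributes of Listing 4.84, where x appears in both rpr::in and rpr::out, this sketch yields the signature of Listing 4.85.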

Replacement of the Compound Statement The selected compound statement is replaced by a call of the created function. That call has an argument for every parameter of the function. As the parameter names correspond to the variable names in the context of the compound statement, the arguments refer to these variables. The function call retains the REPARA attributes of the compound statement in order to enable the kernel extraction.

Listing 4.86 shows the call to the new function including the REPARA attributes.

Listing 4.86: Call-site inserted by the conversion

    void callsite() {
        int x{};
        [[rpr::kernel, rpr::in(x), rpr::out(x)]]
        extracted_kernel(x);
    }

User Interface The compound to function transformation is easily accessible in Cevelop if the corresponding plug-in (ch.hsr.ifs.repara.compoundtofunction) is installed. It is applied in two simple steps:

• Select the annotated compound statement that shall be converted. The cursor or the selection must be within the compound statement.

Figure 4.19: Select a kernel-annotated compound statement

• In the REPARA menu select Convert Compound to Function.

Figure 4.20: Select Compound to Function in the REPARA menu




Figure 4.21: The resulting kernel function after the transformation

4.3 Conclusion

In this section we have summarized which transformations should be applied to manage each of the restrictions in REPARA-C++. For some restrictions there is no suggested solution, and those specific features are therefore simply not allowed. In such a case it is up to the developer either to delete the corresponding keyword or to provide an implementation of the kernel that abides by the restrictions of REPARA-C++.

The suggested enabling transformations are categorized in the tables as follows:

Limitations to C

• Limitations by OpenCL in Table 4.2.

• Limitations by FPGA in Table 4.3.

• Limitations by DSP in Table 4.4.

Limitations to C++

• Limitations by OpenCL in Table 4.5.

• Limitations by FPGA in Table 4.6.

• Limitations by DSP in Table 4.7.



Table 4.2: Limitations to C subset of C++ by OpenCL

    Number  Description                            Possible Resolution
    4.3.3   ISO C++ scalar data types              The complex type is not available in C++
    4.3.4   alignment of types                     No transformation solution
    4.3.5   reinterpreting data as another type    REPARA::as<type>() transformation
    4.3.6   arithmetic conversions                 Replace with fixed-width types
    4.3.7   bit-field struct members               Remove bit-field quick fix
    4.3.8   'pointer to function'                  Inline Function Ptr
    4.3.9   'variable length arrays'               Not supported in C++
    4.3.10  'variadic macros and functions'        Macronator / variadic quick fix
    4.3.11  the use of ISO C99 library functions   Replace library call
    4.3.12  storage-class qualifiers               Not supported
    4.3.13  recursive functions                    Replace tail recursion
    4.3.14  dynamic memory allocation              Replace library call
    4.3.15  writes to a pointer (or arrays)        Eliminate Pointer
    4.3.16  elements of struct, union              Handled by the kernel extraction
    4.3.17  random number generator                Replace library call
    4.3.18  asm declaration                        Not supported

Table 4.3: Limitations to C subset of C++ by FPGA

    Number  Description                 Possible Resolution
    4.4.3   system calls                Replace library call
    4.4.4   pointer casting             Not supported
    4.4.5   pointer arrays              Not supported
    4.4.6   recursive function          Replace tail recursion
    4.4.7   dynamic memory management   Replace library call

Table 4.4: Limitations to C subset of C++ by DSP

    Number  Description                            Possible Resolution
    4.5.4   system call                            Replace library call
    4.5.5   register storage class specifier       Not supported
    4.5.6   data padding and alignment of struct   Automatic alignment
    4.5.7   floating-point exceptions              No transformation
    4.5.8   run-time library exceptions            Not supported

Table 4.5: Limitations to C++ by OpenCL

    Number  Description                         Possible Resolution
    4.7.4   dynamic binding                     Not supported
    4.7.5   dynamic_cast                        Not supported
    4.7.6   dynamic storage alloc and dealloc   Replace new/delete
    4.7.7   dynamic reinterpret_cast            Not supported
    4.7.8   exception handling                  Not supported
    4.7.9   alignment of class object           Automatic alignment
    4.7.10  C++ Standard libraries              Replace library call

Table 4.6: Limitations to C++ by FPGA

    Number  Description                 Possible Resolution
    4.8.3   dynamic memory allocation   Replace new/delete
    4.8.4   dynamic binding             Not supported

Table 4.7: Limitations to C++ by DSP

    Number  Description                  Possible Resolution
    4.9.3   dynamic reinterpret_cast     Not supported
    4.9.4   recursion depth              Replace Tail Recursion
    4.9.5   function template linkage    Not supported
    4.9.6   variable sized class object  Not supported
    4.9.7   function inlining            No transformation
    4.9.8   typeinfo header              Not supported
    4.9.9   export keyword               Not C++11

For 19 of the restrictions an enabling transformation exists, which at least partially remedies code containing violations.

Refinement of the transformations, discovery of edge cases and improvement of the mentioned limitations will be part of the integration work package at the end of the REPARA project.



Bibliography

[1] Cevelop IDE. https://www.cevelop.com/. Accessed: 2016-02-26.

[2] LambdaFicator. http://wiki.hsr.ch/PeterSommerlad/LambdaFicator. Accessed: 2016-02-11.

[3] What is a Quick Fix? https://wiki.eclipse.org/FAQ_What_is_a_Quick_Fix%3F. Accessed: 2016-02-11.

[4] Canis, Andrew C. LegUp: Open-Source High-Level Synthesis Research Framework, chapter 7.4. University of Toronto, 2015. https://tspace.library.utoronto.ca/bitstream/1807/70811/1/Canis_Andrew_C_201511_PhD_thesis.pdf.

[5] ISO. ISO/IEC 14882:2011 Information technology - Programming languages - C++. 2012.

[6] J. Daniel Garcia et al. D2.1: REPARA C++ Open Specification. 2014. http://repara-project.eu/wp-content/uploads/2014/02/ICT-609666-D2.1.pdf.

[7] Luis M. Sánchez et al. D3.3: Static partitioning tool. 2014. http://repara-project.eu/wp-content/uploads/2015/02/ICT-609666-D3.3.pdf.

[8] Martin Fowler. Refactoring: Improving the Design of Existing Code. 1999.

[9] Patrick Horgan. Understanding C/C++ Strict Aliasing. http://dbp-consulting.com/tutorials/StrictAliasing.html. Accessed: 2016-02-11.

[10] Peter Sommerlad et al. D4.1: Extensible infrastructure for source code transformation. 2014. http://repara-project.eu/wp-content/uploads/2014/04/ICT-609666-D4.1.pdf.

[11] Peter Sommerlad et al. D4.3: Source Code Transformations for Coarse Grained Parallelism. 2015. http://repara-project.eu/wp-content/uploads/2015/02/ICT-609666-D4.3.pdf.

[12] Rudolf Ferenc et al. D2.2: Static analysis techniques for AIR generation. 2015. http://repara-project.eu/wp-content/uploads/2015/02/ICT-609666-D2.2.pdf.

[13] William D. Clinger. Proper Tail Recursion and Space Efficiency. ACM PLDI, 1998. http://dl.acm.org/citation.cfm?id=277719.

[14] Xilinx. Introduction to FPGA Design with Vivado High-Level Synthesis, pages 40-41. http://www.xilinx.com/support/documentation/sw_manuals/ug998-vivado-intro-fpga-design-hls.pdf.

[15] Xilinx. Vivado High-Level Synthesis. http://www.xilinx.com/products/design-tools/vivado/integration/esl-design.html. Accessed: 2016-02-11.

[16] Yanhong A. Liu and Scott D. Stoller. From recursion to iteration: what are the optimizations? ACM PEPM, 1999. http://dl.acm.org/citation.cfm?id=328700.

