appendix - springer978-1-4020-7838-5/1.pdf · c description (such as int, char, float, unsigned...

Post on 20-Oct-2019


Part V

Appendix


A SPARK: USAGE, SYNTHESIS SCRIPTS, AND HARDWARE LIBRARY FILES

A.1 Command Line Interface

Spark can be invoked on the command line by the command:

spark [command-line flags] filename.c

Some of the important command-line flags are:

Flag    Task Performed by Command-Line Flag
-h      Prints help
-hs     Schedule Design
-hvf    Generate RTL VHDL (output file is ./output/filename_spark_rtl.vhdl)
-hb     Do Resource Binding (operation to functional unit and variable to registers)
-hec    Statistics about states, path lengths are printed into RTL VHDL file
-hcc    Generate output C file of scheduled design in ./output/filename_sparkout.c
-hch    Chain operations across Conditional Boundaries
-hcp    Do Copy and Constant Propagation
-hdc    Do Dead Code Elimination
-hcs    Do Common Sub-Expression Elimination
-hli    Do Loop Invariant Code Motion
-hg     Generate graphs (output files are in directory ./output). Graphs are
        generated by default if scheduling is done, as: filename_sched.dotty
-hcg    Generate function call graph (./output/CG_filename.dotty)
-hq     Run quietly; no debug messages

Spark writes out several files such as the output graphs (see next section) and the backend VHDL and C files. All these files are written out to the subdirectory “output” of the directory from which Spark is invoked. This directory has to be created before executing Spark.

A.2 Viewing Output Graphs

The format that Spark uses for the output graphs is that of AT&T's Graphviz tool [Lab]. Output graphs are created for each function in the “C” input file. The output graphs generated are listed in the table below.

Graphs representing the original input file

CFG_filename_c_funcName.dotty
HTG_filename_c_funcName.dotty
DFG_filename_c_funcName.dotty
CDFG_filename_c_funcName.dotty

Graphs representing the scheduled design

CFG_filename_c_funcName_sched.dotty
HTG_filename_c_funcName_sched.dotty
DFG_filename_c_funcName_sched.dotty
CDFG_filename_c_funcName_sched.dotty

The key to the nomenclature used in the tables above is:

Abbreviation  Description
CFG           Control Flow Graph: Basic blocks with control flow between them
HTG           Hierarchical Task Graph: Hierarchical structure of basic blocks
DFG           Data Flow Graph: Data flow between operations
CDFG          Control-Data Flow Graph: View of resource utilization in the CFG
filename      The name of the “C” input file
funcName      The name of the function in the input file

To view these output graphs, we use the Graphviz command line tool dotty, as follows:

dotty output/<graph filename>.dotty
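The naming scheme listed above can be captured in a small helper. The following is a sketch in Python, not part of Spark: the function name graph_path is hypothetical, and it assumes that the "filename_c" part of each graph name is the input file name with its dot replaced by an underscore.

```python
def graph_path(kind, c_file, func_name, scheduled=False):
    """Build the expected path of a Spark output graph.

    kind is one of CFG, HTG, DFG, CDFG. Replacing "." by "_" in the
    input file name is an assumption inferred from the naming pattern.
    """
    stem = c_file.replace(".", "_")
    suffix = "_sched" if scheduled else ""
    return f"output/{kind}_{stem}_{func_name}{suffix}.dotty"


print(graph_path("CFG", "synthetic.c", "synthetic"))
print(graph_path("HTG", "synthetic.c", "synthetic", scheduled=True))
```

The returned path can then be passed to dotty as in the command above.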

A.3 Hardware Description File Format

Spark requires as input a hardware description file that has information on timing, range of the various data types, and the list of resources allocated to schedule the design (resource library). This file has to be named “filename.spark”, where filename.c is the name of the “C” input file. If a filename.spark does not exist, then the Spark executable looks for “default.spark”. One of these files has to exist for Spark to execute.

The various sections in the .spark files are described in the next few sections. Note that comments can be included in the .spark files by preceding them with “//”.
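To make the file layout concrete, the sketch below splits .spark-style text into its bracketed sections. It is written in Python purely for illustration and is not Spark's own parser: the function name parse_spark_sections is hypothetical, and the layout it assumes (a "[Name]" header line followed by whitespace-separated rows, with "//" lines treated as comments) is inferred from the sample file in Section A.5.

```python
def parse_spark_sections(text):
    """Split .spark-style text into {section name: list of rows}.

    Each row is the list of whitespace-separated fields on one line.
    Lines starting with "//" and blank lines are skipped.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line.split())
    return sections


sample = """\
// ClockPeriod NumOfCycles TimeConstrained Pipelined
[GeneralInfo]
10 1 1 0

[SchedulerRules]
Priority.rules
"""

parsed = parse_spark_sections(sample)
print(parsed["GeneralInfo"])
print(parsed["SchedulerRules"])
```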

A.3.1 Timing Information

The timing section of the .spark file has the following format:

// ClockPeriod  NumOfCycles  TimeConstrained  Pipelined
[GeneralInfo]
10              0            0                0

Of these parameters, only the clock period is used by the scheduler to schedule the design. The rest of the parameters have been included for future development and are not used currently. They correspond to the number of cycles to schedule the design in (timing constraint), whether the design should be scheduled by a time constrained scheduling heuristic, and whether the design should be pipelined.

A.3.2 Data Type Information

Each data type used in the “C” input file has to have an entry in the “[TypeInfo]” section of the .spark file, as shown below:

// typeName          lowerRange  upperRange
// or variableName   lowerRange  upperRange
[TypeInfo]
int                  -32767      32768
myVar                0           16

This section specifies the range of the various data types that can be specified in a C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range, and upper bound range. Also, the data value range of specific variables from the input C description can be specified in this section, as variable name, lower range, upper range. This is shown in the table above by the example of a variable “myVar” whose range is from 0 to 16.

A.3.3 Hardware Resource Information

The [Resources] section parameterizes each resource allocated to schedule the design, as shown by the example below:

// name  type     inpsType  inputs  number  cost  cycles  ns
[Resources]
CMP      ==,<,!=  i         2       1       10    1       10

The example given above is:

A resource called CMP (comparator).

The operations ==, <, != can be mapped to this resource.

It handles inputs of type integer (i).

It has 2 inputs.

There is one CMP allocated to schedule the design.

Its cost is 10. The cost, although not used currently, can be integrated into module selection heuristics while selecting a resource for scheduling from among multiple resources.

The CMP resource executes in 1 cycle.

It takes 10 nanoseconds to execute.

This resource description section allows for multi-cycle resources but does not currently support structurally pipelined resources.
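The sample default.spark file in Section A.5 notes that the cycles entry of every resource has to equal ns/ClockPeriod. The sketch below checks that relation for the resources of that sample file (clock period of 10). It is written in Python for illustration only; the function name inconsistent_resources and the dictionary layout are hypothetical, and treating the relation as an exact equality is an assumption.

```python
def inconsistent_resources(resources, clock_period_ns):
    """Return the names of resources whose cycles != ns / clock period.

    resources maps a resource name to a (cycles, ns) pair.
    """
    return [name for name, (cycles, ns) in resources.items()
            if cycles * clock_period_ns != ns]


# (cycles, ns) pairs taken from the [Resources] section in Section A.5.
resources = {
    "ALU": (1, 10),
    "MUL": (2, 20),
    "CMP": (1, 10),
    "SHFT": (1, 10),
    "ARR": (1, 10),
}

print(inconsistent_resources(resources, 10))  # prints []
```

A multi-cycle resource such as MUL passes the check because its 20 ns delay spans exactly two 10 ns clock cycles.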

A.3.4 Loop Unrolling and Pipelining Parameters

// variable  maxNumUnrolls  maxNumShifts  percentThreshold  ThruputCycles
[RDLPParams]
*            0              0             70                0
i            0              2             70                0

The [RDLPParams] section presents the parameters for loop unrolling and loop pipelining.

Variable is the loop index variable to operate on (“*” means all loops).

The second parameter specifies the number of times to unroll the loop.

The third parameter specifies the number of times the loop should be shifted by the loop pipelining heuristic.

Percentage threshold and throughput cycles are parameters used by the resource-directed loop pipelining (RDLP) heuristic implemented in our system.

The example in the second line of the “[RDLPParams]” section shown above says that the loop with loop index variable “i” should be shifted twice.


A.3.5 Other Sections in .spark files

The other sections in the .spark files are:

[RDLPMetrics]: Controls the various parameters of the resource-directed loop pipelining (RDLP) heuristic.

[SchedulerRules]: The file that Spark should read to get the scheduling scripts (rules and parameters). Default is: Priority.rules.

[SchedulerScript]: The scheduling script to use: different scheduling heuristics can be employed by changing this entry. Default is “genericSchedulerScript”.

[Verification]: Specifies the number of test vectors that should be generated for functional verification of output C with input C.

A.4 Scripting Options for Controlling Transformations

Spark allows the designer to control the transformations applied to the design description by way of synthesis scripts. In this section, we discuss the scripting options available to the designer.

The scripting options can be specified in the file given by the “[SchedulerRules]” section of the .spark file (see previous section). The default script file Spark looks for is “Priority.rules”. This file has three main sections: the scheduler functions, the list of allowed code motions (code motion rules) and the cost of code motions. We discuss each of these in the next three sections. Note that in this file, “//” denotes that the rest of the line is a comment.

A.4.1 Scheduler Functions

An example of the scheduler functions section from a sample Priority.rules file is given in Figure A.1. An entry in this section is of type “FunctionType=FunctionName”. A brief explanation for each function is given in the second part of the figure.

Of these we use the following flags for the experiments presented in this book: DynamicCSE, PriorityType, BranchBalancingDuringCMs and BranchBalancingDuringTraversal.

A.4.2 List of Allowed Code Motion

The [CodeMotionRules] section of the “Priority.rules” file has the list of code motions that can be employed by the scheduler. Each code motion can be enabled or disabled by setting the flag corresponding to it to “true” or “false”. An example of the list of allowed code motions section is given below.

// all the following can take values true or false
[CodeMotionRules]
RenamingAllowed=true                // Variable Renaming allowed or not
AcrossHTGCodeMotionAllowed=true     // Across HTG code motion allowed or not
SpeculationAllowed=true             // Speculation allowed or not
ReverseSpeculationAllowed=true      // Reverse Speculation allowed or not
EarlyCondExecAllowed=true           // Early Condition Execution allowed or not
ConditionalSpeculationAllowed=true  // Conditional Speculation allowed or not

A.4.3 Cost of Code Motions

This section was developed to experiment with incorporating costs of code motions into the cost function based on which the operation to schedule is chosen. An example of this section is given below. Here all the code motions are assigned a cost of 1.

[CodeMotionCosts]
WithinBB             1
AcrossHTGCodeMotion  1
Speculation          1
DuplicationUp        1

The total cost of scheduling an operation is determined as:

Total Cost = - Basic Cost of operation * Cost Of Each Code Motion required to schedule the operation

where basic cost of the operation is the priority of the operation (see Chapter 7) and cost of each code motion is as given in the “[CodeMotionCosts]” section of the Priority.rules file. Since this function generates a negative total cost, the operation with the lowest cost is chosen as the operation to be scheduled.
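The cost function above can be illustrated with a short sketch. It is written in Python and is not taken from Spark's sources: the names total_cost and pick_operation are hypothetical, the priorities are made-up illustration values, and reading “Cost Of Each Code Motion” as the product of the costs of all code motions an operation needs is an assumption.

```python
from math import prod


def total_cost(basic_cost, motion_costs):
    """Total Cost = -(basic cost) * (product of the code motion costs).

    With no code motions required, the product is 1.
    """
    return -basic_cost * prod(motion_costs)


def pick_operation(candidates):
    """Pick the candidate with the lowest (most negative) total cost.

    candidates maps an operation name to (basic_cost, motion_costs).
    """
    return min(candidates, key=lambda name: total_cost(*candidates[name]))


# Hypothetical candidates: priority 5 needing two code motions of cost 1,
# and priority 3 needing one code motion of cost 1.
candidates = {"op_a": (5, [1, 1]), "op_b": (3, [1])}

print(total_cost(*candidates["op_a"]))  # -5
print(pick_operation(candidates))       # op_a
```

Because the total cost is negated, the operation with the higher priority ends up with the lower cost and is selected.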

A.5 Sample default.spark Hardware Description file

//NOTE: do not put any comments within a section
// ClockPeriod NumOfCycles TimeConstrained Pipelined
[GeneralInfo]
10 1 1 0

//typeName            lowerRange  upperRange
//or variableName     lowerRange  upperRange
[TypeInfo]
char                  0           8
signed_char           0           8
unsigned_char         0           8
short                 0           16
int                   0           32
unsigned_short        0           16
unsigned_int          0           32
long                  0           64
unsigned_long         0           64
long_long             0           128
unsigned_long_long    0           128
float                 0           32
double                0           64
long_double           0           128
myVariableFromInput   0           4

// all cycles in resources have to be ns/ClockPeriod = cycles
//name  type   inpsType  inputs  number  cost  cycles  ns
[Resources]
ALU     +,-    i         2       1       10    1       10
MUL     *      i         2       1       20    2       20
CMP     ==,<   i         2       1       10    1       10
SHFT    <<     i         2       2       10    1       10
ARR     []     i         1       5       10    1       10

// variable numUnrolls numShifts percentThreshold cycleThruput
[RDLPParams]
* 0 0 70 0

//unroll, shift, resetUnroll, and resetShift metrics
[RDLPMetrics]
UnrollMetric=RDLPGenericUnrollMetric
ShiftMetric=RDLPGenericShiftMetric
ResetUnrollMetric=RDLPGenericResetUnrollMetric
ResetShiftMetric=RDLPGenericResetShiftMetric

//lists file that has scheduler rules/functions
[SchedulerRules]
Priority.rules

//function that drives scheduler
[SchedulerScript]
genericSchedulerScript

// numOfTestVectors
[Verification]
20

[OutputVHDLRules]
PrintSynopsysVHDL=true

A.6 Recommended Priority.rules Synthesis Script file

We found the following choice of options in the synthesis script produces the best synthesis results for data-intensive designs with complex control flow.

//line format: <functiontype>=<functionvalue>
[SchedulerFunctions]
ScheduleRegionWalkerFunction=topDownBasicBlockNoEmpty
CandidateValidatorFunction=candidateValidatorPriority
CandidateMoverFunction=TbzMover
LoopSchedulingFunction=RDLP
CandidateRegionWalkerFunction=topDownGlobal
PreSchedulingStepFunction=preSchedulingPriority
PostSchedulingStepFunction=postSchedulingPriority
PreLoopSchedulingFunction=prepareForRDLP
PostLoopSchedulingFunction=constantPropagation
PreSchedulingFunction=initPriorities
ReDoHTGsForDupUp=false              // true or false - false is better
ReassignPriorityForCS=true          // true or false - true is better
PriorityType=max                    // max, sum, maxNoCond - max is best
RestrictDupUpType=targetBBUnsched   // or none or afterSchedOnce
BranchBalancingDuringCMs=true       // true or false - true is better
BranchBalancingDuringTraversal=true // true is better
DynamicCSE=true

[CodeMotionRules]
RenamingAllowed=true
AcrossHTGCodeMotionAllowed=true
SpeculationAllowed=true
ReverseSpeculationAllowed=true
EarlyCondExecAllowed=true
ConditionalSpeculationAllowed=true

// the higher the cost, the more profitable a code motion is
// total cost = basicCost * CostOfEachCodeMotion
[CodeMotionCosts]
WithinBB             1
AcrossHTGCodeMotion  1
Speculation          1
DuplicationUp        1


A.7 Recommended Command-line Options for Invoking Spark

We recommend the following command-line options for invoking Spark:

spark -hli -hcs -hcp -hdc -hs -hvf -hb -hec filename.c

The command-line options that are enabled are: loop-invariant code motion (-hli), common sub-expression elimination (-hcs), copy and constant propagation (-hcp), dead code elimination (-hdc), scheduling (-hs), generation of synthesizable RTL VHDL (-hvf), interconnect-minimizing resource binding (-hb) and generation of statistics about cycle count (-hec).
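The recommended invocation can be wrapped in a small script that also creates the “output” subdirectory, which Section A.1 says must exist before Spark runs. This is a sketch in Python under stated assumptions: the helper names build_spark_command and run_spark are hypothetical, and it assumes that an executable named spark is on the PATH.

```python
import os
import subprocess

# Flags recommended in this section, in the order given in the text.
RECOMMENDED_FLAGS = ("-hli", "-hcs", "-hcp", "-hdc", "-hs", "-hvf", "-hb", "-hec")


def build_spark_command(c_file, flags=RECOMMENDED_FLAGS):
    """Return the argument list for invoking Spark on c_file."""
    return ["spark", *flags, c_file]


def run_spark(c_file, workdir="."):
    """Create <workdir>/output, then run Spark from workdir.

    Assumes a "spark" executable is available on the PATH.
    """
    os.makedirs(os.path.join(workdir, "output"), exist_ok=True)
    return subprocess.run(build_spark_command(c_file), cwd=workdir, check=True)


# Only print the command here; call run_spark("synthetic.c") to execute it.
print(" ".join(build_spark_command("synthetic.c")))
```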

A.8 Options for Synthesizing Microprocessor Blocks

To synthesize microprocessor blocks, we have to enable operation chaining across conditional boundaries by using the command-line option -hch and we have to increase the clock period to a large number so that all the operations can be packed into one clock cycle. We arbitrarily increase clock period to 10000 ns: this enables up to 1000 additions to be chained together (if each addition takes 10 ns). The clock period can be set in the “[GeneralInfo]” section of the default.spark file (see Section A.3.1) and the timing of each operation in the design can be set in the “[Resources]” section (see Section A.3.3).

Also, we have to enable full loop unrolling. This can be done by setting the number of unrolls in the “[RDLPParams]” section of the default.spark file to the number of iterations of the loop to be unrolled (see Section A.3.4).
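The two settings discussed in this section reduce to simple arithmetic, sketched below in Python. The helper names max_chained_ops and iteration_count are hypothetical; the second helper assumes a loop of the form for (i = start; i < bound; i += step), such as the loop of the sample program in Section B.1.

```python
def max_chained_ops(clock_period_ns, op_delay_ns):
    """Number of back-to-back operations that fit into one clock period."""
    return clock_period_ns // op_delay_ns


def iteration_count(start, bound, step=1):
    """Iterations of a loop `for (i = start; i < bound; i += step)`."""
    return len(range(start, bound, step))


print(max_chained_ops(10000, 10))  # 1000
print(iteration_count(0, 10))      # 10
```

For the sample loop in Section B.1, which runs col from 0 to 9, full unrolling would therefore correspond to 10 in the unroll column of the “[RDLPParams]” section.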


B SAMPLE RUNS

B.1 A Sample Input C Program

int dest_cur[128];
void synthetic (int bytes, int x, int y)
{
  int val, data, a, col;

  for (col = 0; col < 10; col++)
  {
    data = dest_cur[col+bytes];
    val = data-y+2;
    if (col<x+2)
      a = x+2-col;
    else
    {
      a = col-x+2;
    }
    dest_cur[col+bytes+a] += val;
  } /* for (col = 0; col < 10; col++) */
}

B.2 Unbound VHDL output for the sample program

-- Automatically generated by the SPARK High-Level Synthesis System
-- Sat Jan 24 10:52:48 2004, source file : synthetic.c

-- 'SPARK' should be defined as the user package
PACKAGE spark_pkg is
  TYPE integer_vector IS ARRAY ( NATURAL RANGE <>) OF integer;
  TYPE boolean_vector IS ARRAY ( NATURAL RANGE <>) OF boolean;
  FUNCTION integer_wired_or ( arr_int : integer_vector ) RETURN integer;
  FUNCTION boolean_wired_or ( arr_bool : boolean_vector ) RETURN boolean;
  SUBTYPE wiredOrInt IS integer_wired_or integer;
  SUBTYPE wiredOrBoolean IS boolean_wired_or boolean;
  TYPE ARRAY_127_0_32768_32767 is ARRAY(127 DOWNTO 0) of wiredOrInt range -32767 to 32768;
END spark_pkg;

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;

PACKAGE BODY spark_pkg IS
  FUNCTION integer_wired_or ( arr_int : integer_vector ) RETURN integer is


    -- pragma resolution_method wired_or
    variable i : integer;
    variable returnVal : std_logic_vector(15 downto 0);
    variable arr_int_std_logic_vec : std_logic_vector(15 downto 0);
  BEGIN
    returnVal := (others => '0');
    for i in arr_int'range loop
      arr_int_std_logic_vec := conv_std_logic_vector(arr_int(i), 16);
      returnVal := returnVal or arr_int_std_logic_vec;
    end loop;
    RETURN conv_integer(returnVal);
  END integer_wired_or;

  FUNCTION boolean_wired_or ( arr_bool : boolean_vector ) RETURN boolean is
    -- pragma resolution_method wired_or
    variable i : integer;
    variable returnVal : boolean;
  BEGIN
    returnVal := FALSE;
    for i in arr_bool'range loop
      returnVal := returnVal or arr_bool(i);
    end loop;
    RETURN returnVal;
  END boolean_wired_or;
end spark_pkg;

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;
library work;
use work.spark_pkg.all;

ENTITY synthetic IS
  port(
    bytes : IN wiredOrInt range -32767 to 32768 ;
    x : IN wiredOrInt range -32767 to 32768 ;
    y : IN wiredOrInt range -32767 to 32768 ;
    -- global variables are
    dest_cur : INOUT ARRAY_127_0_32768_32767;
    CLOCK : IN std_logic ;
    RESET : IN std_logic ;
    done : OUT std_logic ) ;
END synthetic;

ARCHITECTURE rtl OF synthetic IS
  signal val : wiredOrInt range -32767 to 32768 ;
  signal data : wiredOrInt range -32767 to 32768 ;
  signal a : wiredOrInt range -32767 to 32768 ;
  signal col : wiredOrInt range -32767 to 32768 ;
  signal sT0_6 : wiredOrBoolean ;
  signal sT1_8 : wiredOrInt range -32767 to 32768 ;
  signal sT2_8 : wiredOrInt range -32767 to 32768 ;
  signal sT3_10 : wiredOrInt range -32767 to 32768 ;
  signal sT4_10 : wiredOrBoolean ;
  signal sT5_11 : wiredOrInt range -32767 to 32768 ;
  signal sT6_14 : wiredOrInt range -32767 to 32768 ;
  signal sT7_16 : wiredOrInt range -32767 to 32768 ;
  signal sT8_16 : wiredOrInt range -32767 to 32768 ;
  signal sT9_16 : wiredOrInt range -32767 to 32768 ;
  signal sT10_11 : wiredOrInt range -32767 to 32768 ;
  signal sT11_8 : wiredOrInt range -32767 to 32768 ;
  signal sT12_14 : wiredOrInt range -32767 to 32768 ;
  signal sT13_11 : wiredOrInt range -32767 to 32768 ;
  signal sT14_14 : wiredOrInt range -32767 to 32768 ;


  -- Statistics collected about this Schedule
  -- minCycles = 51, maxCycles = 51, avgCycles = 51.0 in synthetic
  -- Num of Ifs = 1, Num of Loops = 1
  -- Num of Non-Empty BBs = 9, Num of Ops = 20
  -- Scheduled with the following resources
  -- 2 ALU, 1 CMP, 1 ARR
  -- Declarations of the 6 states in routine synthetic
  subtype StateType is std_logic_vector(5 downto 0);
  CONSTANT S_0 : std_logic_vector (5 downto 0) := "000001";
  CONSTANT S_1 : std_logic_vector (5 downto 0) := "000010";
  CONSTANT S_2 : std_logic_vector (5 downto 0) := "000100";
  CONSTANT S_3 : std_logic_vector (5 downto 0) := "001000";
  CONSTANT S_4 : std_logic_vector (5 downto 0) := "010000";
  CONSTANT S_5 : std_logic_vector (5 downto 0) := "100000";
  signal CURRENT_STATE : StateType;
  signal NEXT_STATE : StateType;
BEGIN
  SYNC: PROCESS
  BEGIN
    wait until CLOCK'event and CLOCK = '1';
    if reset = '1' then
      CURRENT_STATE <= S_0;
      done <= '0';
    else
      CURRENT_STATE <= NEXT_STATE;
      if CURRENT_STATE /= S_0 and NEXT_STATE = S_0 then
        done <= '1';
      end if;
    end if; -- if reset check
  END PROCESS; -- SYNC Process

  FSM: PROCESS (CURRENT_STATE, sT4_10, sT0_6)
  BEGIN
    NEXT_STATE <= CURRENT_STATE;
    if CURRENT_STATE(0) = '1' then
      NEXT_STATE <= S_1;
    elsif CURRENT_STATE(1) = '1' then
      NEXT_STATE <= S_2;
    elsif CURRENT_STATE(2) = '1' then
      if sT0_6 then
        NEXT_STATE <= S_3;
      else -- sT0_6
        NEXT_STATE <= S_0;
      end if; -- conditions
    elsif CURRENT_STATE(3) = '1' then
      if sT0_6 then
        if sT4_10 then
          NEXT_STATE <= S_4;
        else -- sT4_10
          NEXT_STATE <= S_4;
        end if; -- conditions
      end if; -- conditions
    elsif CURRENT_STATE(4) = '1' then
      if sT0_6 then
        if sT4_10 then
          NEXT_STATE <= S_5;
        else -- sT4_10
          NEXT_STATE <= S_5;
        end if; -- conditions
      end if; -- conditions
    elsif CURRENT_STATE(5) = '1' then
      NEXT_STATE <= S_1;
    END if; -- if (CURRENT_STATE)
  END PROCESS; -- FSM Process

  DP: PROCESS
  BEGIN


    wait until CLOCK'event and CLOCK = '1';
    if reset = '1' then
      sT0_6 <= FALSE;
      sT4_10 <= FALSE;
    else -- else of if reset
      if CURRENT_STATE(0) = '1' then
        sT3_10 <= (x + 2);
        col <= 0;
      elsif CURRENT_STATE(1) = '1' then
        sT11_8 <= (col + bytes);
        sT12_14 <= (col - x);
        sT0_6 <= (col < 10);
      elsif CURRENT_STATE(2) = '1' then
        if sT0_6 then
          sT13_11 <= (sT3_10 - col);
          sT14_14 <= (sT12_14 + 2);
          sT4_10 <= (col < sT3_10);
          data <= dest_cur(sT11_8);
        end if; -- conditions
      elsif CURRENT_STATE(3) = '1' then
        if sT0_6 then
          if sT4_10 then
            sT2_8 <= (data - y);
            sT8_16 <= (sT11_8 + sT13_11);
          else -- sT4_10
            sT2_8 <= (data - y);
            sT8_16 <= (sT11_8 + sT14_14);
          end if; -- conditions
        end if; -- conditions
      elsif CURRENT_STATE(4) = '1' then
        if sT0_6 then
          if sT4_10 then
            val <= (sT2_8 + 2);
            col <= (col + 1);
            sT9_16 <= dest_cur(sT8_16);
          else -- sT4_10
            val <= (sT2_8 + 2);
            col <= (col + 1);
            sT9_16 <= dest_cur(sT8_16);
          end if; -- conditions
        end if; -- conditions
      elsif CURRENT_STATE(5) = '1' then
        if sT0_6 then
          dest_cur(sT8_16) <= (sT9_16 + val);
        end if; -- conditions
      END if; -- if (CURRENT_STATE)
    end if; -- end of if reset
  END PROCESS; -- DP Process

END rtl;

B.3 Bound VHDL output for the sample program

-- Automatically generated by the SPARK High-Level Synthesis System
-- Sat Jan 24 10:51:54 2004, source file : synthetic.c

-- 'SPARK' should be defined as the user package
PACKAGE spark_pkg is
  TYPE integer_vector IS ARRAY ( NATURAL RANGE <>) OF integer;
  TYPE boolean_vector IS ARRAY ( NATURAL RANGE <>) OF boolean;
  FUNCTION integer_wired_or ( arr_int : integer_vector ) RETURN integer;
  FUNCTION boolean_wired_or ( arr_bool : boolean_vector ) RETURN boolean;
  SUBTYPE wiredOrInt IS integer_wired_or integer;
  SUBTYPE wiredOrBoolean IS boolean_wired_or boolean;
  TYPE ARRAY_127_0_32768_32767 is ARRAY(127 DOWNTO 0) of wiredOrInt range -32767 to 32768;
END spark_pkg;


library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;

PACKAGE BODY spark_pkg IS
  FUNCTION integer_wired_or ( arr_int : integer_vector ) RETURN integer is
    -- pragma resolution_method wired_or
    variable i : integer;
    variable returnVal : std_logic_vector(15 downto 0);
    variable arr_int_std_logic_vec : std_logic_vector(15 downto 0);
  BEGIN
    returnVal := (others => '0');
    for i in arr_int'range loop
      arr_int_std_logic_vec := conv_std_logic_vector(arr_int(i), 16);
      returnVal := returnVal or arr_int_std_logic_vec;
    end loop;
    RETURN conv_integer(returnVal);
  END integer_wired_or;

  FUNCTION boolean_wired_or ( arr_bool : boolean_vector ) RETURN boolean is
    -- pragma resolution_method wired_or
    variable i : integer;
    variable returnVal : boolean;
  BEGIN
    returnVal := FALSE;
    for i in arr_bool'range loop
      returnVal := returnVal or arr_bool(i);
    end loop;
    RETURN returnVal;
  END boolean_wired_or;
end spark_pkg;

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;

ENTITY res_ALU IS
  port(
    res_ALU_in0 : IN integer range -32767 to 32768;
    res_ALU_in1 : IN integer range -32767 to 32768;
    res_ALU_execOp : IN integer range 0 to 1;
    res_ALU_out : OUT integer range -32767 to 32768
  );
END res_ALU;

ARCHITECTURE rtl of res_ALU IS
BEGIN
  res_ALU_out <= res_ALU_in0 + res_ALU_in1 when (res_ALU_execOp = 0)
                 else res_ALU_in0 - res_ALU_in1;
END rtl; -- architecture of res_ALU

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;

ENTITY res_CMP IS
  port(
    res_CMP_in0 : IN integer range -32767 to 32768;
    res_CMP_in1 : IN integer range -32767 to 32768;
    res_CMP_out : OUT boolean
  );
END res_CMP;

ARCHITECTURE rtl of res_CMP IS
BEGIN


  res_CMP_out <= res_CMP_in0 < res_CMP_in1;
END rtl; -- architecture of res_CMP

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;

ENTITY res_ARR IS
  port(
    res_ARR_in0 : IN integer range -32767 to 32768;
    res_ARR_out : OUT integer range -32767 to 32768
  );
END res_ARR;

ARCHITECTURE rtl of res_ARR IS
BEGIN
  res_ARR_out <= res_ARR_in0;
END rtl; -- architecture of res_ARR

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;
library work;
use work.spark_pkg.all;

ENTITY synthetic IS
  port(
    bytes : IN wiredOrInt range -32767 to 32768;
    x : IN wiredOrInt range -32767 to 32768;
    y : IN wiredOrInt range -32767 to 32768;
    -- global variables are
    dest_cur : INOUT ARRAY_127_0_32768_32767;
    CLOCK : IN std_logic ;
    RESET : IN std_logic ;
    done : OUT std_logic ) ;
END synthetic;

ARCHITECTURE rtl OF synthetic IS

  COMPONENT res_ALU
    port(
      res_ALU_in0 : IN integer range -32767 to 32768;
      res_ALU_in1 : IN integer range -32767 to 32768;
      res_ALU_execOp : IN integer range 0 to 1;
      res_ALU_out : OUT integer range -32767 to 32768
    );
  END COMPONENT; -- end of component res_ALU

  COMPONENT res_CMP
    port(
      res_CMP_in0 : IN integer range -32767 to 32768;
      res_CMP_in1 : IN integer range -32767 to 32768;
      res_CMP_out : OUT boolean
    );
  END COMPONENT; -- end of component res_CMP

  COMPONENT res_ARR
    port(
      res_ARR_in0 : IN integer range -32767 to 32768;
      res_ARR_out : OUT integer range -32767 to 32768
    );
  END COMPONENT; -- end of component res_ARR

  signal sT0_6 : wiredOrBoolean ;
  signal sT4_10 : wiredOrBoolean ;


  signal res_ALU_0_in0 : integer range -32767 to 32768 := 0;
  signal res_ALU_0_in1 : integer range -32767 to 32768 := 0;
  signal res_ALU_0_execOp : integer range 0 to 1 := 0;
  signal res_ALU_0_out : integer range -32767 to 32768 := 0;
  signal res_ALU_1_in0 : integer range -32767 to 32768 := 0;
  signal res_ALU_1_in1 : integer range -32767 to 32768 := 0;
  signal res_ALU_1_execOp : integer range 0 to 1 := 0;
  signal res_ALU_1_out : integer range -32767 to 32768 := 0;
  signal res_CMP_2_in0 : integer range -32767 to 32768 := 0;
  signal res_CMP_2_in1 : integer range -32767 to 32768 := 0;
  signal res_CMP_2_out : boolean;
  signal res_ARR_3_in0 : integer range -32767 to 32768 := 0;
  signal res_ARR_3_out : integer range -32767 to 32768 := 0;
  -- Statistics collected about this Schedule
  -- minCycles = 51, maxCycles = 51, avgCycles = 51.0 in synthetic
  -- Num of Ifs = 1, Num of Loops = 1
  -- Num of Non-Empty BBs = 9, Num of Ops = 20
  -- Scheduled with the following resources
  -- 2 ALU, 1 CMP, 1 ARR, 1 dest_cur,
  -- Declarations of the 6 states in routine synthetic
  subtype StateType is std_logic_vector(5 downto 0);
  CONSTANT S_0 : std_logic_vector (5 downto 0) := "000001";
  CONSTANT S_1 : std_logic_vector (5 downto 0) := "000010";
  CONSTANT S_2 : std_logic_vector (5 downto 0) := "000100";
  CONSTANT S_3 : std_logic_vector (5 downto 0) := "001000";
  CONSTANT S_4 : std_logic_vector (5 downto 0) := "010000";
  CONSTANT S_5 : std_logic_vector (5 downto 0) := "100000";
  signal CURRENT_STATE : StateType;
  signal NEXT_STATE : StateType;
  -- Declarations of the 11 registers in synthetic
  signal regNum0 : integer range -32767 to 32768 := 0;
  signal regNum1 : integer range -32767 to 32768 := 0;
  signal regNum2 : integer range -32767 to 32768 := 0;
  signal regNum3 : integer range -32767 to 32768 := 0;
  signal regNum4 : integer range -32767 to 32768 := 0;
  signal regNum5 : integer range -32767 to 32768 := 0;
  signal regNum6 : integer range -32767 to 32768 := 0;
  signal regNum7 : integer range -32767 to 32768 := 0;
  signal regNum8 : integer range -32767 to 32768 := 0;
  signal regNum9 : integer range -32767 to 32768 := 0;
  signal regNum10 : integer range -32767 to 32768 := 0;
BEGIN
  res_ALU_instance_0 : res_ALU
    port map (
      res_ALU_in0 => res_ALU_0_in0,
      res_ALU_in1 => res_ALU_0_in1,
      res_ALU_execOp => res_ALU_0_execOp,
      res_ALU_out => res_ALU_0_out
    ); -- end of port map of component res_ALU_instance_0

  res_ALU_instance_1 : res_ALU
    port map (
      res_ALU_in0 => res_ALU_1_in0,
      res_ALU_in1 => res_ALU_1_in1,
      res_ALU_execOp => res_ALU_1_execOp,
      res_ALU_out => res_ALU_1_out
    ); -- end of port map of component res_ALU_instance_1

  res_CMP_instance_2 : res_CMP
    port map (
      res_CMP_in0 => res_CMP_2_in0,
      res_CMP_in1 => res_CMP_2_in1,
      res_CMP_out => res_CMP_2_out
    ); -- end of port map of component res_CMP_instance_2


res_ARR_instance_3 : res_ARRport map (res_ARR_in0 => res_ARR_3_in0,res_ARR_out => res_ARR_3_out

);-- end of port map of component res_ARR_instance_3

SYNC: PROCESSBEGINwait until CLOCK’event and CLOCK = ’1’;if reset = ’1’ thenCURRENT_STATE <= S_0;done <= ’0’;

elseCURRENT_STATE <= NEXT_STATE;if CURRENT_STATE /= S_0 and NEXT_STATE = S_0 thendone <= ’1’ ;

end if;end if; -- if reset check

END PROCESS; -- SYNC Process

  FSM: PROCESS(CURRENT_STATE, sT4_10, sT0_6)
  BEGIN
    NEXT_STATE <= CURRENT_STATE;
    if CURRENT_STATE(0) = '1' then
      NEXT_STATE <= S_1;
    elsif CURRENT_STATE(1) = '1' then
      NEXT_STATE <= S_2;
    elsif CURRENT_STATE(2) = '1' then
      if sT0_6 then
        NEXT_STATE <= S_3;
      else -- sT0_6
        NEXT_STATE <= S_0;
      end if; -- conditions
    elsif CURRENT_STATE(3) = '1' then
      if sT0_6 then
        if sT4_10 then
          NEXT_STATE <= S_4;
        else -- sT4_10
          NEXT_STATE <= S_4;
        end if; -- conditions
      end if; -- conditions
    elsif CURRENT_STATE(4) = '1' then
      if sT0_6 then
        if sT4_10 then
          NEXT_STATE <= S_5;
        else -- sT4_10
          NEXT_STATE <= S_5;
        end if; -- conditions
      end if; -- conditions
    elsif CURRENT_STATE(5) = '1' then
      NEXT_STATE <= S_1;
    END if; -- if (CURRENT_STATE)
  END PROCESS; -- FSM Process

  DP: PROCESS
  BEGIN
    wait until CLOCK'event and CLOCK = '1';
    if reset = '1' then
      sT0_6 <= FALSE;
      sT4_10 <= FALSE;
    else -- else of if reset
      if CURRENT_STATE(0) = '1' then
        regNum0 <= res_ALU_1_out;
        regNum1 <= 0;
      elsif CURRENT_STATE(1) = '1' then
        regNum2 <= res_ALU_0_out;
        regNum3 <= res_ALU_1_out;
        sT0_6 <= res_CMP_2_out;
      elsif CURRENT_STATE(2) = '1' then
        if sT0_6 then
          regNum4 <= res_ALU_0_out;
          regNum5 <= res_ALU_1_out;
          sT4_10 <= res_CMP_2_out;
          regNum6 <= res_ARR_3_out;
        end if; -- conditions
      elsif CURRENT_STATE(3) = '1' then
        if sT0_6 then
          if sT4_10 then
            regNum7 <= res_ALU_1_out;
            regNum8 <= res_ALU_0_out;
          else -- sT4_10
            regNum7 <= res_ALU_1_out;
            regNum8 <= res_ALU_0_out;
          end if; -- conditions
        end if; -- conditions
      elsif CURRENT_STATE(4) = '1' then
        if sT0_6 then
          if sT4_10 then
            regNum9 <= res_ALU_1_out;
            regNum1 <= res_ALU_0_out;
            regNum10 <= res_ARR_3_out;
          else -- sT4_10
            regNum9 <= res_ALU_1_out;
            regNum1 <= res_ALU_0_out;
            regNum10 <= res_ARR_3_out;
          end if; -- conditions
        end if; -- conditions
      elsif CURRENT_STATE(5) = '1' then
      END if; -- if (CURRENT_STATE)
    end if; -- end of if reset
  END PROCESS; -- DP Process

  res_ALU_0_in0_MUXES: PROCESS(CURRENT_STATE, regNum1, sT0_6,
                               regNum0, sT4_10, regNum2, regNum10)
    variable mux_select : std_logic_vector(13 downto 0);
  BEGIN
    mux_select := "00000000000000";
    if (CURRENT_STATE(1) = '1') then
      mux_select := "00000000000001";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "00000000000010";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000000100";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000001000";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000010000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000100000";
    end if;
    if (CURRENT_STATE(5) = '1' and sT0_6) then
      mux_select := "00000001000000";
    end if;
    case mux_select is
      when "00000000000001" =>
        res_ALU_0_in0 <= regNum1;
      when "00000000000010" =>
        res_ALU_0_in0 <= regNum0;
      when "00000000000100" =>
        res_ALU_0_in0 <= regNum2;
      when "00000000001000" =>
        res_ALU_0_in0 <= regNum1;
      when "00000000010000" =>
        res_ALU_0_in0 <= regNum2;
      when "00000000100000" =>
        res_ALU_0_in0 <= regNum1;
      when "00000001000000" =>
        res_ALU_0_in0 <= regNum10;
      when others =>
        res_ALU_0_in0 <= 0;
    END CASE;
  END PROCESS; -- res_ALU_0_in0_MUXES END PROCESS;

  res_ALU_0_in1_MUXES: PROCESS(CURRENT_STATE, bytes, sT0_6, regNum1,
                               sT0_6, sT4_10, regNum4, regNum5, regNum9)
    variable mux_select : std_logic_vector(13 downto 0);
  BEGIN
    mux_select := "00000000000000";
    if (CURRENT_STATE(1) = '1') then
      mux_select := "00000000000001";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "00000000000010";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000000100";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000001000";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000010000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000100000";
    end if;
    if (CURRENT_STATE(5) = '1' and sT0_6) then
      mux_select := "00000001000000";
    end if;
    case mux_select is
      when "00000000000001" =>
        res_ALU_0_in1 <= bytes;
      when "00000000000010" =>
        res_ALU_0_in1 <= regNum1;
      when "00000000000100" =>
        res_ALU_0_in1 <= regNum4;
      when "00000000001000" =>
        res_ALU_0_in1 <= 1;
      when "00000000010000" =>
        res_ALU_0_in1 <= regNum5;
      when "00000000100000" =>
        res_ALU_0_in1 <= 1;
      when "00000001000000" =>
        res_ALU_0_in1 <= regNum9;
      when others =>
        res_ALU_0_in1 <= 0;
    END CASE;
  END PROCESS; -- res_ALU_0_in1_MUXES END PROCESS;

  res_ALU_0_execOpMUXES: PROCESS(CURRENT_STATE, sT0_6, sT4_10)
    variable mux_select : std_logic_vector(13 downto 0);
  BEGIN
    mux_select := "00000000000000";
    if (CURRENT_STATE(1) = '1') then
      mux_select := "00000000000001";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "00000000000010";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000000100";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000001000";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000010000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000100000";
    end if;
    if (CURRENT_STATE(5) = '1' and sT0_6) then
      mux_select := "00000001000000";
    end if;
    case mux_select is
      when "00000000000001" =>
        res_ALU_0_execOp <= 0;
      when "00000000000010" =>
        res_ALU_0_execOp <= 1;
      when "00000000000100" =>
        res_ALU_0_execOp <= 0;
      when "00000000001000" =>
        res_ALU_0_execOp <= 0;
      when "00000000010000" =>
        res_ALU_0_execOp <= 0;
      when "00000000100000" =>
        res_ALU_0_execOp <= 0;
      when "00000001000000" =>
        res_ALU_0_execOp <= 0;
      when others =>
        res_ALU_0_execOp <= 0;
    END CASE;
  END PROCESS; -- res_ALU_0_execOp_MUXES END PROCESS;

  res_ALU_1_in0_MUXES: PROCESS(CURRENT_STATE, x, regNum1, sT0_6,
                               regNum3, sT4_10, regNum6, regNum7)
    variable mux_select : std_logic_vector(13 downto 0);
  BEGIN
    mux_select := "00000000000000";
    if (CURRENT_STATE(0) = '1') then
      mux_select := "00000000000001";
    end if;
    if (CURRENT_STATE(1) = '1') then
      mux_select := "00000000000010";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "00000000000100";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000001000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000010000";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000100000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000001000000";
    end if;
    case mux_select is
      when "00000000000001" =>
        res_ALU_1_in0 <= x;
      when "00000000000010" =>
        res_ALU_1_in0 <= regNum1;
      when "00000000000100" =>
        res_ALU_1_in0 <= regNum3;
      when "00000000001000" =>
        res_ALU_1_in0 <= regNum6;
      when "00000000010000" =>
        res_ALU_1_in0 <= regNum7;
      when "00000000100000" =>
        res_ALU_1_in0 <= regNum6;
      when "00000001000000" =>
        res_ALU_1_in0 <= regNum7;
      when others =>
        res_ALU_1_in0 <= 0;
    END CASE;
  END PROCESS; -- res_ALU_1_in0_MUXES END PROCESS;

  res_ALU_1_in1_MUXES: PROCESS(CURRENT_STATE, x, sT0_6, sT4_10, y)
    variable mux_select : std_logic_vector(13 downto 0);
  BEGIN
    mux_select := "00000000000000";
    if (CURRENT_STATE(0) = '1') then
      mux_select := "00000000000001";
    end if;
    if (CURRENT_STATE(1) = '1') then
      mux_select := "00000000000010";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "00000000000100";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000001000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000010000";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000100000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000001000000";
    end if;
    case mux_select is
      when "00000000000001" =>
        res_ALU_1_in1 <= 2;
      when "00000000000010" =>
        res_ALU_1_in1 <= x;
      when "00000000000100" =>
        res_ALU_1_in1 <= 2;
      when "00000000001000" =>
        res_ALU_1_in1 <= y;
      when "00000000010000" =>
        res_ALU_1_in1 <= 2;
      when "00000000100000" =>
        res_ALU_1_in1 <= y;
      when "00000001000000" =>
        res_ALU_1_in1 <= 2;
      when others =>
        res_ALU_1_in1 <= 0;
    END CASE;
  END PROCESS; -- res_ALU_1_in1_MUXES END PROCESS;

  res_ALU_1_execOpMUXES: PROCESS(CURRENT_STATE, sT0_6, sT4_10)
    variable mux_select : std_logic_vector(13 downto 0);
  BEGIN
    mux_select := "00000000000000";
    if (CURRENT_STATE(0) = '1') then
      mux_select := "00000000000001";
    end if;
    if (CURRENT_STATE(1) = '1') then
      mux_select := "00000000000010";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "00000000000100";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000001000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "00000000010000";
    end if;
    if (CURRENT_STATE(3) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000000100000";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "00000001000000";
    end if;
    case mux_select is
      when "00000000000001" =>
        res_ALU_1_execOp <= 0;
      when "00000000000010" =>
        res_ALU_1_execOp <= 1;
      when "00000000000100" =>
        res_ALU_1_execOp <= 0;
      when "00000000001000" =>
        res_ALU_1_execOp <= 1;
      when "00000000010000" =>
        res_ALU_1_execOp <= 0;
      when "00000000100000" =>
        res_ALU_1_execOp <= 1;
      when "00000001000000" =>
        res_ALU_1_execOp <= 0;
      when others =>
        res_ALU_1_execOp <= 0;
    END CASE;
  END PROCESS; -- res_ALU_1_execOp_MUXES END PROCESS;

  res_CMP_2_in0_MUXES: PROCESS(CURRENT_STATE, regNum1, sT0_6)
    variable mux_select : std_logic_vector(1 downto 0);
  BEGIN
    mux_select := "00";
    if (CURRENT_STATE(1) = '1') then
      mux_select := "01";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "10";
    end if;
    case mux_select is
      when "01" =>
        res_CMP_2_in0 <= regNum1;
      when "10" =>
        res_CMP_2_in0 <= regNum1;
      when others =>
        res_CMP_2_in0 <= 0;
    END CASE;
  END PROCESS; -- res_CMP_2_in0_MUXES END PROCESS;

  res_CMP_2_in1_MUXES: PROCESS(CURRENT_STATE, sT0_6, regNum0)
    variable mux_select : std_logic_vector(1 downto 0);
  BEGIN
    mux_select := "00";
    if (CURRENT_STATE(1) = '1') then
      mux_select := "01";
    end if;
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "10";
    end if;
    case mux_select is
      when "01" =>
        res_CMP_2_in1 <= 10;
      when "10" =>
        res_CMP_2_in1 <= regNum0;
      when others =>
        res_CMP_2_in1 <= 0;
    END CASE;
  END PROCESS; -- res_CMP_2_in1_MUXES END PROCESS;

  res_ARR_3_in0_MUXES: PROCESS(CURRENT_STATE, sT0_6, dest_cur,
                               regNum2, sT4_10, regNum8)
    variable mux_select : std_logic_vector(2 downto 0);
  BEGIN
    mux_select := "000";
    if (CURRENT_STATE(2) = '1' and sT0_6) then
      mux_select := "001";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and sT4_10) then
      mux_select := "010";
    end if;
    if (CURRENT_STATE(4) = '1' and sT0_6 and NOT(sT4_10)) then
      mux_select := "100";
    end if;
    case mux_select is
      when "001" =>
        res_ARR_3_in0 <= dest_cur(regNum2);
      when "010" =>
        res_ARR_3_in0 <= dest_cur(regNum8);
      when "100" =>
        res_ARR_3_in0 <= dest_cur(regNum8);
      when others =>
        res_ARR_3_in0 <= 0;
    END CASE;
  END PROCESS; -- res_ARR_3_in0_MUXES END PROCESS;

END rtl;
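
The next-state logic of the FSM process in this listing can be mirrored in a few lines of C, which may help when checking the generated controller against the scheduled C output. This is only an illustrative sketch and not something Spark produces: the names fsm_next_state and fsm_self_check are hypothetical, and the one-hot state codes are copied from the S_0 to S_5 constants declared in the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One-hot state codes, copied from the S_0..S_5 constants in the listing. */
enum {
    S_0 = 0x01,
    S_1 = 0x02,
    S_2 = 0x04,
    S_3 = 0x08,
    S_4 = 0x10,
    S_5 = 0x20
};

/*
 * Hypothetical software model of the FSM process.
 * Like the VHDL if/elsif chain, it tests the state bits in order and
 * defaults to holding the current state when no branch assigns a successor.
 */
uint8_t fsm_next_state(uint8_t current, bool sT0_6, bool sT4_10)
{
    /* Both branches of every sT4_10 test in the listing pick the same
     * successor, so the flag does not influence the result. */
    (void)sT4_10;

    if (current & S_0) return S_1;
    if (current & S_1) return S_2;
    if (current & S_2) return sT0_6 ? S_3 : S_0;
    if (current & S_3) return sT0_6 ? S_4 : current;
    if (current & S_4) return sT0_6 ? S_5 : current;
    if (current & S_5) return S_1;
    return current;
}

/* Walks the path taken when the condition sT0_6 stays false:
 * S_0 -> S_1 -> S_2 -> S_0. */
void fsm_self_check(void)
{
    assert(fsm_next_state(S_0, false, false) == S_1);
    assert(fsm_next_state(S_1, false, false) == S_2);
    assert(fsm_next_state(S_2, false, false) == S_0);
}
```

Calling fsm_next_state once per simulated clock edge, starting from S_0, reproduces the state sequence that the SYNC process would register; the datapath (register and multiplexer) behavior is deliberately left out of this model.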

Page 26: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

Bibliography

R. Allen, K. Kennedy, C. Portfield, and J. Warren. Conversion of controldependence to data dependence. In ACM Symposium on Principles ofProgramming Languages, 1983.

A. Aiken, , and A. Nicolau. A development environment for horizontalmicrocode. IEEE Transactions on Software Engineering, May 1988.

A. Aiken and A. Nicolau. Perfect Pipelining: A new loop parallelizationtechnique. In European Symposium on Programming, 1988.

A. Aiken, A. Nicolau, and S. Novack. Resource-constrained softwarepipelining. IEEE Transactions on Parallel and Distributed Systems,6(12), December 1995.

A. Aho, R. Sethi, and J. Ullman. Compilers: Principles and Techniquesand Tools. Addison-Wesley, 1986.

Behavioral compiler. Synopsys.

R.K. Brayton, R. Camposano, G. De Micheli, R.H.J.M. Otten, and J. vanEijndhoven. The Yorktown Silicon Compiler System, chapter in SiliconCompilation. Addison-Wesley, 1988.

R.A. Bergamaschi. Behavioral network graph unifying the domains ofhigh-level and logic synthesis. In Design Automation Conference, 1999.

David F. Bacon, Susan L. Graham, and Oliver J. Sharp. Compiler trans-formations for high-performance computing. ACM Computing Surveys,26(4):345–420, 1994.

M. Benmohammed and A. Rahmoune. Automatic generation of repro-grammable microcoded controllers within a high-level synthesis environ-ment. In IEE Proceedings-Computers and Digital Techniques, 1998.

R. A. Bergamaschi, S. Raje, and L. Trevillyan. Control-flow versusdata-flow-based scheduling: combining both approaches in an adaptivescheduling system. IEEE Transactions on Very Large Scale Integration(VLSI) Systems, March 1997.

[AKPW83]

[AN88a]

[AN88b]

[ANN95]

[ASU86]

[BC]

[Ber99]

[BGS94]

[BR98]

[BRT97]

Page 27: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

218 BIBLIOGRAPHY

R. Camposano. Path–based scheduling for synthesis. IEEE Transactionson Computer–Aided Design, Jan. 1991.

Celoxica. DK design suite. http://www.celoxica.com.

R. Cytron and J. Ferrante. What’s in a name? or the value of renaming forparallelism detection and storage allocation. In International Conferenceon Parallel Processing, 1987.

V. Chaiyakul, D.D. Gajski, and L. Ramachandran. Minimizing syntacticvariance with assignment decision diagrams. Technical Report ICS-TR-92-34, UC Irvine, 1992.

V. Chaiyakul, D. D. Gajski, and L. Ramachandran. High-level trans-formations for minimizing syntactic variances. In Design AutomationConference, 1993.

V. Chaiyakul, D.D. Gajski, and L. Ramachandran. High level transfor-mations for minimizing syntactic variances. In Design Automation Con-ference, 1993.

A. Chowdhary, S. Kale, P. Saripella, N. K. Sehgal, and R. K. Gupta. Ageneral approach for regularity extraction in datapath circuits. In Inter-national Conference on Computer-Aided Design, 1998.

L.-F. Chao, A. S. LaPaugh, and E. H.-M. Sha. Rotation scheduling: Aloop pipelining algorithm. In Design Automation Conference, 1993.

J.-M. Chang and M. Pedram. Register allocation and binding low power.In Design Automation Conf., 1995.

J.-M. Chang and M. Pedram. Module assignment for low power. InEuropean Design Automation Conference, 1996.

R. Camposano and W. Wolf. High Level VLSI Synthesis. Kluwer Aca-demic, 1991.

F. Catthoor, S. Wuytack, E. De Greef, F. Balasa, L. Nachtergaele, andA. Vandecappelle. Custom Memory Management Methodology: Explo-ration of Memory Organisation for Embedded Multimedia System De-sign. Kluwer Academic Publishers, 1998.

Design compiler. Synopsys Incorporated.

J. C. Dehnert, P. Y.-T Hsu, and J.P. Bratt. Overlapped loop support inthe cydra 5. In International Conference on Architectural Support forProgramming Languages and Operating Systems, 1989.

G. De Micheli. Synthesis and Optimization of Digital Circuits. McGraw-Hill, 1994.

[Cam91]

[Cel]

[CF87]

[CGR92]

[CGR93a]

[CGR93b]

[CLS93]

[CP95]

[CP96]

[CW91]

[DC]

[DHB89]

[DM94]

Page 28: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

BIBLIOGRAPHY 219

[DS]

[dS97]

[dS98]

[dSJ99]

[DT93]

[Ebc87]

[EDG]

[ELL00]

[EN89]

[Fis81]

[GDGN03a]

[GDGN03b]

[GDGN03c]

[GDGN03d]

Forte Design Systems. Behavioral design suite. http://www.forteds.com.

L.C.V. dos Santos. A method to control compensation code during globalscheduling. In Workshop on Circuits, Systems and Signal Processing,1997.

L.C.V. dos Santos. Exploiting instruction-level parallelism: a construc-tive approach. PhD thesis, Eindhoven University of Technology, 1998.

L.C.V. dos Santos and J.A.G. Jess. A reordering technique for efficientcode motion. In Design Automation Conference, 1999.

J. C. Dehnert and R. A. Towle. Compiling for the cydra 5. IEEE Com-puter, 7(1/2), 1993.

K. Ebcioglu. A compilation technique for software pipelining of loopswith conditional jumps. In MICRO, 1987.

Edison Design Group (edg) compiler frontends. http://www.edg.com.

E. Eim, J.-G. Lee, and D.-I. Lee. Automatic process-oriented control cir-cuit generation for asynchronous high-level synthesis. In InternationalSymposium on Advanced Research in Asynchronous Circuits and Sys-tems, 2000.

K. Ebcioglu and A. Nicolau. A global resource-constrained paralleliza-tion technique. In 3rd International Conference on Supercomputing,1989.

J. Fisher. Trace scheduling: A technique for global microcode com-paction. IEEE Transactions on Computers, July 1981.

S. Gupta, N.D. Dutt, R.K. Gupta, and A. Nicolau. Dynamic conditionalbranch balancing during the high-level synthesis of control-intensive de-signs. In Design, Automation and Test Conference, 2003.

S. Gupta, N.D. Dutt, R.K. Gupta, and A. Nicolau. Dynamically increas-ing the scope of code motions during the high-level synthesis of digitalcircuits. Invited Paper in Special Issue of IEE Proceedings: Computersand Digital Technique: Best of DATE 2003, 150(5), September 2003.

S. Gupta, N.D. Dutt, R.K. Gupta, and A. Nicolau. Loop shifting andcompaction for the high-level synthesis of designs with complex controlflow. Technical Report CECS-TR-03-14, UC Irvine, April 2003.

S. Gupta, N.D. Dutt, R.K. Gupta, and A. Nicolau. SPARK: A high-levelsynthesis framework for applying parallelizing compiler transformations.In International Conference on VLSI Design, 2003.

Page 29: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

220 BIBLIOGRAPHY

S. Gupta, N.D. Dutt, R.K. Gupta, and A. Nicolau. Loop shifting andcompaction for the high-level synthesis of designs with complex controlflow. In Design, Automation and Test Conference, 2004.

D. D. Gajski, N. D. Dutt, A. C-H. Wu, and S. Y-L. Lin. High-LevelSynthesis: Introduction to Chip and System Design. Kluwer Academic,1992.

C.H. Gebotys and M.I. Elmasry. Optimal synthesis of high-performancearchitectures. IEEE Journal of Solid-State Circuits, March 1992.

S. Gupta and R. K. Gupta. The VLSI Handbook, chapter ASIC Design.CRC Press and IEEE Press, 2000. Chapter 64.

GNU Image Manipulation Program. http://www.gimp.org.

E. Girczyc. Automatic Generation of Micro-sequenced Data Paths to Re-alize ADA Circuit Descriptions. PhD thesis, Carleton University, 1984.

E. Girczyc. Loop winding - a data flow approach to functional pipelining.In International Symposium of Circuits and Systems, 1987.

M. R. Garey and D. S. Johnson. Computers and Intractability: A Guideto the Theory of NP-Completeness. W. H. Freeman and Company, 1979.

S. Gupta, T. Kam, M. Kishinevsky, S. Rotem, N. Savoiu, N.D. Dutt, R.K.Gupta, and A. Nicolau. Coordinated transformations for high-level syn-thesis of high performance microprocessor blocks. In Design AutomationConference, 2002.

R. L. Gupta, A. Kumar, A. Van Der Werf, and G. N. Busa. Synthesizing along latency unit within VLIW processor. In Intl. Conf. on VLSI Design,2000.

R. K. Gupta and S. Y. Liao. Using a programming language for digitalsystem design. IEEE Design and Test of Computers, April 1997.

S. Gupta, M. Luthra, N.D. Dutt, R.K. Gupta, and A. Nicolau. Hardwareand interface synthesis of fpga blocks using parallelizing code transfor-mations. In To appear at the International Conference on Parallel andDistributed Computing and Systems, November 2003.

S. Gupta, M. Miranda, F. Catthoor, and R. Gupta. Analysis of high-leveladdress code transformations for programmable processors. In Design,Automation and Test in Europe, 2000.

M. Girkar and C.D. Polychronopoulos. Automatic extraction of func-tional parallelism from ordinary programs. IEEE Trans. on Parallel &Distributed Systems, Mar. 1992.

[GDGN04]

[GDWL92]

[GE92]

[GG00]

[Gim]

[Gir84]

[Gir87]

[GJ79]

[GKWB00]

[GL97]

[GMCG00]

[GP92]

Page 30: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

BIBLIOGRAPHY 221

[GPN90]

[GS90]

[GVM89]

[Hay00]

[HC89]

[HD86]

[HE95]

[Hea93a]

[Hea93b]

[HLH91]

[HP81]

D.H. Gelernter, D. A. Padua, and A. Nicolau. Languages and Compilersfor Parallel Computing. Morgan Kaufmann, 1990.

S. Gupta, M. Reshadi, N. Savoiu, N.D. Dutt, R.K. Gupta, and A. Nicolau.Dynamic common sub-expression elimination during scheduling in high-level synthesis. In International Symposium on System Synthesis, 2002.

R. Gupta and M. L. Soffa. Region scheduling: An approach for detectingand redistributing parallelism. IEEE Transactions on Software Engineer-ing, April 1990.

S. Gupta, N. Savoiu, N.D. Dutt, R.K. Gupta, and A. Nicolau. Condi-tional speculation and its effects on performance and area for high-levelsynthesis. In International Symposium on System Synthesis, 2001.

G. Goossens, J. Vandewlle, and H. De Man. Loop optimization inregister-transfer scheduling for DSP-systems. In Design automation con-ference, 1989.

D. D. Gajski, J. Zhu, R. Domer, A. Gerstlauer, and S. Zhao. SpecC:Specification Language and Methodology. Kluwer Academic Publishers,January 2000.

S. Haynal. Automata-Based Symbolic Scheduling. PhD thesis, Universityof California, Santa Barbara, 2000.

R. Hartley and A. E. Casavant. Tree-height minimization in pipelinedarchitectures. In International Conference on Computer-Aided Design,1989.

P. Y. T. Hsu and E. S. Davidson. Highly concurrent scalar processing. InInternational Symposium on Computer Architecture, 1986.

U. Holtmann and R. Ernst. Combining MBP-speculative computationand loop pipelining in high-level synthesis. In European Design and TestConference, 1995.

S. Huang and et al. A tree-based scheduling algorithm for control domi-nated circuits. In Design Automation Conference, 1993.

W.W. Hwu and et al. The superblock: An effective technique for vliwand superscalar compilation. Journal of Supercomputing, March 1993.

C.T. Hwang, T.H. Lee, and Y. C. Hsu. A formal approach to the schedul-ing problem in high level synthesis. IEEE Transactions on CAD, April1991.

F.J. Hill and G.R. Peterson. Switching Theory and Logical Design. Wiley,New York, 1981.

Page 31: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

222 BIBLIOGRAPHY

C.Y. Hitchcock and D.E. Thomas. A method of automatic data pathsynthesis. In Design Automation Conference, 1983.

S.C.-Y. Huang and W.H. Wolf. How datapath allocation affects controllerdelay. In International Symposium on High-Level Synthesis, 1994.

M. Ishikawa and G. D. Micheli. A module selection algorithm for high-level synthesis. In International Symposium on Circuits and Systems,1991.

Get2Chip Incorporated (now a Cadence subsidiary). G2C architecturalcompiler. http://www.get2chip.com.

Z. Iqbal, M. Potkonjak, S. Dey, and A. Parker. Critical path optimizationusing retiming and algebraic speed-up. In Design Automation Confer-ence, 1993.

R. Jones and V. Allan. Software pipelining: A comparison and improve-ment. In Proceedings of the Micro-23,1990.

M. Janssen, F. Catthoor, and H. De Man. A specification invariant tech-nique for operation cost minimisation in flow-graphs. In InternationalSymposium on High-level Synthesis, 1994.

A. A. Jerraya, H. Ding, P. Kission, and M. Rahmouni. Behavioral Syn-thesis and Component Reuse with VHDL. Kluwer Academic Publishers,1997.

K. Kennedy and R. Allen. Optimizing Compilers for Modern Architec-tures. Morgan Kaufmann, 2001.

P. Kollig and B. M. Al-Hashimi. Simultaneous scheduling, allocationand binding in high level synthesis. Electronics Letters, August 1997.

R. Kennedy, S. Chan, S.-M. Liu, R. Io, P. Tu, and F. Chow. Partial redun-dancy elimination in SSA form. ACM Trans. Progrm. Languages andSystems, May 1999.

P. Kission, H. Ding, and A.A. Jerraya. Structured design methodologyfor high-level design. In Design Automation Conference, 1994.

A. Kifli, G. Goossens, and H. De Man. A unified scheduling model forhigh-level synthesis and code generation. In Proceedings of the EuropeanDesign and Test Conference, 1995.

D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe. De-pendence graphs and compiler optimizations. In ACM Symposium onPrinciples of Programming Languages, 1981.

[HT83]

[HW94]

[IM91]

[InaCs]

[IPDP93]

[JA90]

[JCM94]

[JDKR97]

[KA01]

[KAH97]

[KDJ94]

[KGM95]

Page 32: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

BIBLIOGRAPHY 223

J.J. Kim, F.J. Kurdahi, and N. Park. Automatic synthesis of time-stationary controllers for pipelined data paths. In Proceedings of theInternational Conference on Computer-Aided Design, 1991.

D. Ku and G. De Micheli. HardwareC - A language for hardware design.Technical Report CSL-TR-90-419, Stanford University, 1988.

D. Ku and G. De Micheli. Relative scheduling under timing constraints.In Design Automation Conference, 1990.

D. C. Ku and G. De Micheli. High Level Synthesis of ASICs UnderTiming and Synchronization Constraints. Kluwer Academic, 1992.

D. W. Knapp. Behavioral Synthesis: Digital System Design using theSynopsys Behavioral Compiler. Prentice-Hall, 1996.

T. J. Kowalski and D. E. Thomas. The VLSI design automation assistant:What’s in a knowledge base. In Design Automation Conference, 1985.

D.J. Kuck. The Structure of Computers and Computations, John Wileyand Sons, 1978.

A. Kountouris and C. Wolinski. High level pre-synthesis optimizationsteps using hierarchical conditional dependency graphs. In EuromicroConfernce, 1999.

A.A. Kountouris and C. Wolinski. Hierarchical conditional dependencygraphs as a unifying design representation in the CODESIS high-levelsynthesis system. In Intl. Symposium on System Synthesis, 2000.

T. Kim, N. Yonezawa, J.W.S. Liu, and C.L. Liu. A scheduling al-gorithm for conditional resource sharing - a hierarchical reduction ap-proach. IEEE Transactions on CAD, April 1994.

AT&T Research Labs. Graphviz - Open source graph drawing software.http://www.research.att.com/sw/tools/graphviz/.

M. Lam. Software pipelining: An effective scheduling technique forVLIW machines. In ACM SIGPLAN Conference Programming Lan-guages Design Implementation, 1988.

E.A. Lee. Programmable DSP architectures, Parts I, II. IEEE ASSPMagazine, October 1988.

J. Li and R.K. Gupta. HDL optimizations using Timed Decision Tables.In Design Automation Conference, 1996.

J. Li and R.K. Gupta. Decomposition of Timed Decision Tables andits use in presynthesis optimizations. In International Conference onComputer Aided Design, 1997.

[KKP91]

[KM88]

[KM90]

[KM92]

[Kna96]

[KT85]

[Kuc78]

[KW99]

[KW00]

[KYLL94]

[Lab]

[Lam88]

[Lee88]

[LG96]

[LG97]

Page 33: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

224 BIBLIOGRAPHY

M. Luthra, S. Gupta, N.D. Dutt, R.K. Gupta, and A. Nicolau. Interfacesynthesis using memory mapping for an FPGA platform. In InternationalConference on Computer Design, October 2003.

B. Landwehr, P. Marwedel, and R. Doemer. Oscar: Optimum simulta-neous scheduling, allocation and resource binding based on integer pro-gramming. In European Design Automation Conference, 1994.

D.A. Lobo and B.M. Pangrle. Redundant operator creation: A schedulingoptimization technique. In Design Automation Conference, 1991.

C. Lee, M. Potkonjak, and W. H. Mangione-Smith. Mediabench: A toolfor evaluating and synthesizing multimedia and communicatons systems.In International Symposium on Microarchitecture, 1997.

G. Lakshminarayana, A. Raghunathan, and N.K. Jha. Incorporating spec-ulative execution into scheduling of control-flow intensive behavioral de-scriptions. In Design Automation Conference, 1998.

G.W. Leive and D.E. Thomas. A technology relative logic synthesis andmodule selection system. In Design Automation Conference, 1981.

M. S. Lam and R. P. Wilson. Limits of control flow on parallelism. InInternational Symposium on Computer Architecture, 1992.

S. Mantripragada. Branch Optimizations and Instruction Level Paral-lelism Exploitation for Dynamic Superscalar and VLIW Processors. PhDthesis, University of California, Irvine, 2000.

P. Marwedel. A new synthesis for the mimola software system. In DesignAutomation Conference, 1986.

A. Moshovos, S. E. Breach, T. N. Vijaykumar, and G. S. Sohi. Dynamicspeculation and synchronization of data dependences. In InternationalSymposium on Computer Architecture, 1997.

M. C. McFarland. The Value Trace: A database for automated digi-tal design. Technical Report DRC-01-4-80, Carnegie-Mellon University,Design Research Center, 1978.

M. Miranda, F. Catthoor, M. Janssen, and H. De Man. High-level addressoptimisation and synthesis techniques for data-transfer intensive applica-tions. IEEE Transactions on VLSI Systems, December 1998.

S.-M. Moon and K. Ebcioglu. An efficient resource-constrained globalscheduling technique for superscalar and VLIW processors. In Interna-tional Symposium on Microarchitecture, 1992.

UCLA Mediabench benchmark suite. http://cares.icsl.ucla.edu/MediaBench/.

[LMD94]

[LP91]

[LPMS97]

[LRJ98]

[LT81]

[LW92]

[Man00]

[Mar86]

[MBVS97]

[McF78]

[MCJM98]

[ME92]

[Med]

Page 34: Appendix - Springer978-1-4020-7838-5/1.pdf · C description (such as int, char, float, unsigned int, signed int, long, double et cetera). The format is data type, lower bound range,

BIBLIOGRAPHY 225

[MJS96] A. Mujumdar, R. Jain, and K. Saluja. Incorporating performance and testability constraints during binding in high-level synthesis. IEEE Trans. on CAD, 1996.

[MKMT90] G. De Micheli, D. C. Ku, F. Mailhot, and T. Truong. The Olympus synthesis system for digital design. IEEE Design and Test of Computers, pages 37–53, October 1990.

S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank, and R. A. Bringmann. Effective compiler support for predicated execution using the hyperblock. In International Symposium on Microarchitecture, 1992.

[MPC90] M.C. McFarland, A.C. Parker, and R. Camposano. The high-level synthesis of digital systems. Proceedings of the IEEE, February 1990.

[MRSC86] H. De Man, J. Rabaey, P. Six, and L. Claesen. Cathedral-II: A silicon compiler for digital signal processing. IEEE Design & Test Magazine, December 1986.

[Muc97] S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.

[Nic85a] A. Nicolau. A development environment for scientific parallel programs. Technical Report TR 86-722, Department of Computer Science, Cornell University, 1985.

[Nic85b] A. Nicolau. Uniform parallelism exploitation in ordinary programs. In International Conf. on Parallel Processing, 1985.

[NN93] A. Nicolau and S. Novack. Trailblazing: A hierarchical approach to Percolation Scheduling. In International Conference on Parallel Processing, 1993.

[NN94] S. Novack and A. Nicolau. Mutation scheduling: A unified approach to compiling for fine-grain parallelism. In Languages and Compilers for Parallel Computing, 1994.

[NN96] S. Novack and A. Nicolau. An efficient, global resource-directed approach to exploiting instruction-level parallelism. In Conference on Parallel Architectures and Compilation Techniques, 1996.

[NP91] A. Nicolau and R. Potasman. Incremental tree height reduction for high level synthesis. In Design Automation Conference, 1991.

[NR00] M. Narasimhan and J. Ramanujam. On lower bounds for scheduling problems in high-level synthesis. In Design Automation Conference, 2000.

[OG86] A. Orailoglu and D.D. Gajski. Flow graph representation. In Design Automation Conference, 1986.


[Pan98] P.R. Panda. Memory Optimizations and Exploration for Embedded Systems. PhD thesis, University of California, Irvine, 1998.

[Pau89] P.G. Paulin. Horizontal partitioning of PLA-based finite state machines. In Design Automation Conference, 1989.

[Pen] Intel Inc., http://developer.intel.com/design/pro/manuals/242691.htm. PentiumPro® Programmer’s Reference Manual. Chapter 11.

P. Paulin, J. Frehel, M. Harrand, E. Berrebi, C. Liem, F. Nacabal, and J.-C. Herluison. High-level synthesis and codesign methods: an application to a videophone codec. In European Design Automation Conference with EURO-VHDL, 1995.

[PG86] B.M. Pangrle and D.D. Gajski. Slicer: A state synthesizer for intelligent silicon compilation. In Proceedings of the International Conference on Computer-Aided Design, 1986.

[PK89a] P. G. Paulin and J. P. Knight. Force-Directed Scheduling for the Behavioral Synthesis of ASIC’s. IEEE Transactions on CAD, 8(6):661–678, June 1989.

[PK89b] P. G. Paulin and J. P. Knight. Scheduling and Binding Algorithms for High-Level Synthesis. In Design Automation Conference, 1989.

[PK91] I.-C. Park and C.-M. Kyung. Fast and near optimal scheduling in automatic data path synthesis. In Design Automation Conference, 1991.

[PLNG90] R. Potasman, J. Lis, A. Nicolau, and D. Gajski. Percolation based synthesis. In Design Automation Conference, 1990.

[PMH02] O. Penalba, J.M. Mendias, and R. Hermida. Maximizing conditional reuse by pre-synthesis transformations. In Design, Automation and Test in Europe, 2002.

[Pol88] C. D. Polychronopoulos. Parallel Programming and Compilers. Kluwer Academic Publishers, 1988.

[PP88] N. Park and A. Parker. Sehwa: A software package for synthesis of pipelines from behavioral specifications. IEEE Transactions on Computer-Aided Design, March 1988.

[PPM86] A.C. Parker, J. Pizarro, and M. Mlinar. MAHA: A program for datapath synthesis. In Design Automation Conference, 1986.

[PR92] M. Potkonjak and J. Rabaey. Maximally fast and arbitrarily fast implementation of linear computations. In International Conference on CAD, 1992.


[PR94] M. Potkonjak and J. Rabaey. Optimizing resource utilization using transformations. IEEE Transactions on CAD, March 1994.

[PS91] J. C. H. Park and M. Schlansker. On predicated execution. Technical Report HPL-91-58, Hewlett-Packard Software and Systems Laboratory, 1991.

[PSC96] M. Potkonjak, M.B. Srivastava, and A. Chandrakasan. Multiple constant multiplications: Efficient and versatile framework and algorithms for exploring common subexpression elimination. IEEE Trans. on CAD, Mar 1996.

R. Pasko, P. Schaumont, V. Derudder, S. Vernalde, and D. Durackova. A new algorithm for elimination of common subexpressions. IEEE Transactions on CAD, Jan 1999.

[Raj95] S. Rajan. Practical state machine design using VHDL. Integrated System Design Magazine, February 1995. http://www.isdmag.com/editorial/1995/fpgafeature9502.html.

[RB95] I. Radivojevic and F. Brewer. Analysis of conditional resource sharing using a guard-based control representation. In International Conference on Computer Design, 1995.

[RB96] I. Radivojevic and F. Brewer. A new symbolic technique for control-dependent scheduling. IEEE Transactions on CAD, January 1996.

[RCHP91] J. M. Rabaey, C. Chu, P. Hoang, and M. Potkonjak. Fast prototyping of datapath-intensive architectures. IEEE Design & Test of Computers, June 1991.

[RFJ95] M. Rim, Y. Fann, and R. Jain. Global scheduling with code-motions for high-level synthesis applications. IEEE Transactions on VLSI Systems, September 1995.

[RG81] B. R. Rau and C. D. Glaeser. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing. In Annual Workshop on Microprogramming, 1981.

[RK94] D.S. Rao and F.J. Kurdahi. Controller and datapath trade-offs in hierarchical RT-level synthesis. In International Symposium on High-Level Synthesis, 1994.

[RV96] N.N.J. Roy and R. Vemuri. Synchronous controller models for synthesis from communicating VHDL processes. In International Conference on VLSI Design, 1996.

[RYYT89] B. Rau, D. Yen, W. Yen, and R. Towle. The Cydra 5 departmental supercomputer: Design philosophies, decisions, and trade-offs. IEEE Computer, 22(1), 1989.


[Sem01] L. Semeria. Applying Pointer Analysis to the Synthesis of Hardware from C. PhD thesis, Stanford University, 2001.

[SGL96] V.C. Sreedhar, G. R. Gao, and Y.-F. Lee. A new framework for exhaustive and incremental data flow analysis using DJ graphs. ACM SIGPLAN Conf. on PLDI, 1996.

[SGL97] V.C. Sreedhar, G. R. Gao, and Y.-F. Lee. Incremental computation of dominator trees. ACM Trans. Program. Languages and Systems, March 1997.

[SLH90] M. D. Smith, M. S. Lam, and M. A. Horowitz. Boosting beyond static scheduling in a superscalar processor. In International Symposium on Computer Architecture, 1990.

[SP91] L. Stok and W.J.M. Philipsen. Module allocation and comparability graphs. In IEEE International Symposium on Circuits and Systems, 1991.

[SPA] SPARK parallelizing high-level synthesis framework website, http://www.cecs.uci.edu/~spark.

[Sto92] L. Stok. Transfer free register allocation in cyclic data flow graphs. In European Conf. on Design Automation, 1992.

[SyC] Synopsys Inc., http://www.systemc.org. SystemC Reference Manual.

[SZ90] A. Safir and B. Zavidovique. Towards a global solution to high level synthesis problems. In European Design Automation Conference, 1990.

D.E. Thomas, E.M. Dirkes, R.A. Walker, J.V. Rajan, J.A. Nestor, and R.L. Blackburn. The system architect’s workbench. In Design Automation Conference, 1988.

[TF70] G. S. Tjaden and M. J. Flynn. Detection and parallel execution of independent instructions. IEEE Transactions on Computers, 19(10), October 1970.

[Tri87] H. Trickey. Flamel: A high-level hardware compiler. IEEE Transactions on Computer-Aided Design, March 1987.

[TS86] C.J. Tseng and D.P. Siewiorek. Automated synthesis of data paths in digital systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, July 1986.

[TTC90] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA, 1990.

C.-J. Tseng, R.-S. Wei, S. G. Rothweiler, M. M. Tong, and A. K. Bose. Bridge: A versatile behavioral synthesis system. In Design Automation Conference, 1988.


[VSB99] S. Vernalde, P. Schaumont, and I. Bolsens. An object oriented programming approach for hardware design. In IEEE Computer Society Workshop on VLSI, April 1999.

[Wak99] K. Wakabayashi. C-based synthesis experiences with a behavior synthesizer, “Cyber”. In Design, Automation and Test in Europe, 1999.

[Wal91] D.W. Wall. Limits of instruction-level parallelism. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 1991.

[WMGB95] T.C. Wilson, N. Mukherjee, M.K. Garg, and D. K. Banerji. An ILP solution for optimum scheduling, module and register allocation, and operation binding in datapath synthesis. VLSI Design, 1995.

[Wol96] M.J. Wolfe. High Performance Compilers for Parallel Computing. Addison-Wesley, 1996.

[WP92] J.-P. Weng and A. C. Parker. CSG: Control path synthesis in the ADAM system. In International Workshop on High Level Synthesis, 1992.

[WT89] R. Walker and D. Thomas. Behavioral transformation for algorithmic level IC design. IEEE Transactions on CAD, October 1989.

[WT92] K. Wakabayashi and H. Tanaka. Global scheduling independent of control dependencies based on condition vectors. In Design Automation Conference, 1992.

W. Wolf, A. Takach, C.-Y. Huang, R. Manno, and E. Wu. The Princeton University behavioral synthesis system. In Design Automation Conference, 1992.

[WY89] K. Wakabayashi and T. Yoshimura. A resource sharing and control synthesis method for conditional branches. In Proceedings of the International Conference on Computer-Aided Design, 1989.

[Xil] Xilinx. ISE logic synthesis tools.

[Xvi] XviD MPEG-4 video de-/encoding solution. http://www.xvid.org.

[YSPJ97] T. Z. Yu, E. H.-M. Sha, N. Passos, and R. D.-C. Ju. Algorithms and hardware support for branch anticipation. In Great Lakes Symposium on VLSI, 1997.

[ZSRM90] J. Zegers, P. Six, J. Rabaey, and H. De Man. CGE: Automatic generation of controllers in the CATHEDRAL-II silicon compiler. In Design Automation Conference, 1990.


Index

3-layered, 135

abstract syntax tree, 24
allocation, 6
    previous work, 15
as late as possible, 15
as soon as possible, 15
available operations, 90, 91
    heuristic, 94
    heuristic with chaining, 109

back-end code generation, 53
basic block, 27
benchmarks, 145, 146
binding, see resource binding
branch balancing, see dynamic branch balancing

candidate fetcher, 89
    available operations, 94
    collecting unscheduled operations, 95
    data dependencies, 96
    Trailblazing, 97
    TrailSynth, 97
candidate mover, 90
    move operations, 99, 100
    Trailblazing, 97
    TrailSynth, 97
candidate validater, 90
candidate walker, 90
case studies, 145
coarse-grain, 7, 51, 136
combining, 55
command-line options, 135
common sub-expression elimination, see CSE
compatibility edges, 119
conditional constructs, 7, 26
conditional speculation, 74, 77
    heuristics, 141
    results, 141
control constructs, 7
control flow, 26
control flow graph, 28
control paths, 27
control speculation, 68
control synthesis, 7, 10, 53, 122
control-data flow graphs, 29
cost function, 90, 91
CSE, 59–61
    results, 150–152

data dependency, 24, 54, 96
    anti, 24, 54
    flow, 24, 54
    input, 24, 54
    output, 24, 54
data dependency analysis, 26
data dependency graphs, 54
data flow graph, 25
data speculation, 68
data types, 34
design graph, 33
design HTG, 33
dominate, 42, 59
dominator trees, 42, 59
duplicating down, see reverse speculation
duplicating up, see conditional speculation
dynamic branch balancing, 53, 75–76
    during code motions, 76, 141
    during design traversal, 76, 141


    heuristic, 105
    results, 141
dynamic copy propagation, 53, 79
dynamic CSE, 53, 76–79
    algorithm, 103
    results, 155–158
dynamic renaming, 52, 54–55
dynamic transformations, 9, 53, 90, 92
    dynamic branch balancing, 75, 105
    dynamic copy propagation, 79
    dynamic CSE, 76
dynamic variable renaming, 26, see dynamic renaming

early condition execution, 73
embedded systems, 3
experiments, 145

fine-grain, 7, 51
finite state machine, 53, 124
    algorithm, 125, 126
    construction, 125
force-directed scheduling, 15
fork node, 26
function call, 35
function inlining, 179
    results, 148
functional blocks, 173
functional units, 115

global slicing, 122
group dominance, 43, 78

hardware resource library, 35, 51, 135
heuristic
    move operations, 100
hierarchical task graph, 29–33
    compound node, 30
    construction, 31
    for-loop, 31
    if-then-else, 31
    loop node, 30
    single node, 30
    start node, 30
    stop node, 30
high-level synthesis, 3
HTG, see hierarchical task graph

idle resource
    heuristic, 101
index variable elimination, 182
inlining, 179
    results, 148
instruction length decoder, 175, 176
instruction-level parallelism, 7
interconnect, 10
interconnect minimization, see resource binding
interconnect minimizing resource binding, 10
interdependencies between code motions, 137
inverted triangle, 31
IR walker, 89, 103
    get next basic block, 104
    get next scheduling step, 103

join node, 26

language level modeling, 7
layered graph, 24
    3-layered graph, 24
    k-layered graph, 24
lazy code motion, 71
list scheduling heuristic, 90
local slicing, 122
loop index variable elimination, 63–64
loop pipelining, 53, 84
    incremental, 53
loop shifting, 53, 84
    algorithm, 112
    results, 164–170
loop unrolling, 62–63, 175
    results, 164–170
loop-invariant code motion, 61–62
    results, 150–152
low latency, 173

max-cost flow, 121
metrics
    logic synthesis results, 147
    scheduling results, 147


microprocessor blocks, 174
microprocessor functional blocks, 64
min-cost max-flow, 122
mobility, 15
modeling
    code motions, 39–41
    control flow, 26
    data dependency, 24
    data type information, 34
    finite state machine, 124
    hardware resources, 35
    hierarchical code motions, 40–41
    intermediate representation, 23–33
    operation chaining, 37
    operation chaining across conditionals, 46
    resource binding, 116
    speculative code motions, 39–40
    timing, 36
modular, 136
module selection, 6
multi-commodity network, 120, 121
multi-cycle operations, 52
mutually exclusive, 69, 70

non-flow data dependencies, 26

operation binding, 119
    clique, 119
    example, 119
operation chaining, 37
    across conditionals, 46, 80–84, 176
    heuristic, 108
    results, 158–160
    wire variables, 82
    wire variables heuristic, 111
operation compatibility graph, 119

parallelizing high-level synthesis, 8
parallelizing transformations, 7
Percolation, 52
PHLS framework, 51–57
pipelining
    previous work, 16
pre-synthesis, 8, 51, 59–65
    results, 147
predicated execution, 68
priority
    calculation, 93
    results, 143
    scheduling heuristic, 90

register-transfer level, 11
resource
    busy, 68
    idle, 68
resource allocation, 6, 37
resource binding, 6, 53, 115
    example, 117
    interconnect minimization, 116
    modeling, 116
    operation binding, 119
    previous work, 15
    problem definition, 115, 117
    variable binding, 121
resource list, 37
resource sharing, 52, 69, 70
resource types, 37
resource utilization, 44–46
resource-constrained scheduling, 43
results
    dynamic CSE, 155
    experimental setup, 145
    loop shifting, 164
    loop unrolling, 164
    metrics, 147
    operation chaining, 158
    speculative code motions, 152
reverse speculation, 71

scheduling, 6, 9, 41–47
    available operations, 94, 109
    example, 107
    heuristic with chaining, 108
    heuristics, 89–114
    loops, 92
    software architecture, 89
scheduling heuristics
    previous work, 15
scheduling loops, 92
scheduling phase, 53, 136
scheduling problem, 37–39


scheduling step, 41
scheduling transformations, 67–88
script-based synthesis, 137
SPARK framework, 51–57
speculation, 68, 70–71
speculative code motions, 9, 52, 68–75, 175, 181
    modeling, 39
    ordering, 137–141
    results, 152–155
speculative execution, see speculation
split lifetime intervals, 121
state assignment, 122
    slicing, 122
steering logic, 10
synthesis scripts, 135, 137
    recommended, 143
synthesizable VHDL generation, 128
system level design methodology, 3
system-on-a-chip, 3

three-address expression, 24
timing constraints, 51
Trailblazing, 52, 55–56
    heuristic, 97
    heuristic with chaining, 111
TrailSynth, see Trailblazing, 100
transformations toolbox, 136

urgency, 15

variable binding, 121
variable compatibility graph, 121
variable lifetimes, 116
variable names, 25
VHDL generation, 128
VHDL processes, 128
visualization, 25

wire variables, see operation chaining
wire-variables, 81