Post on 20-Dec-2015
courseware
High-Level Synthesis: an introduction
Prof. Jan Madsen
Informatics and Mathematical Modelling, Technical University of Denmark
Richard Petersens Plads, Building 321, DK-2800 Lyngby, Denmark
M-1 High-Level Synthesis, SoC-MOBINET courseware
Hardware synthesis
[Figure: processes P1, P2 and P3 mapped onto a CPU and an ASIC]
Starts from an abstract behavioral description
Generates an RTL description
Need to restrict the target hardware – otherwise the search space is too large
Hardware synthesis
How is the behavior specified? Natural languages, C/C++, VHDL/Verilog
What is the target architecture of the ASIC?
Hardware model - components
Most synthesis systems are targeted towards synchronous hardware
Functional units: can perform one or more computations, e.g. addition, multiplication, comparison, ALU
Registers: store inputs, intermediate results and outputs; may be organized as a register file
Hardware model - interconnection
Multiplexers: select one output from several inputs
Buses: connection shared between several components; only one component can write data at a specific time; exclusive writing may be controlled by tri-state drivers
Hardware model – parameters
Clocking strategy: single or multiple phase clocks
Interconnect: allowing or disallowing buses
Clocking of functional units: multicycle operations, chaining, pipelined units
Hardware model – example
Hardware concepts
Data path: network of functional units, registers, multiplexers and buses
Control: takes care of having the data present at the right place at a specific time; takes care of presenting the right instructions to a programmable unit
Often high-level synthesis concentrates on data path synthesis
Methodology
[Figure: design specification and implementation in the physical domain, connected through the mathematical domain]
Specification: create a model of the physical problem
Synthesis: create an algorithm to solve the problem
Implementation: transform the optimized model back to the physical domain
Input format
Input: behavior described in textual form, using a conventional programming language or a hardware description language (HDL)
Has to be parsed and transformed into an internal representation
Conventional compiler techniques are used
Internal representation
Data-flow graph (DFG): used by most systems; may or may not contain information on control flow
Vertex (node): represents a computation
Edge: represents a precedence relation
Data flow
x := a * b;
y := c + d;
z := x + y;
[Figure: DFG of the three assignments. Inputs a, b, c, d; a * b produces x, c + d produces y, x + y produces z]
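The three assignments above can be captured as a small data-flow graph. The sketch below is one possible encoding in Python, not the internal representation of any particular synthesis system; the dictionary layout and the helper name `edges` are assumptions made for illustration.

```python
# Each operation: result name -> (operator, list of operands).
# Operands that are not themselves results are primary inputs.
dfg = {
    "x": ("*", ["a", "b"]),
    "y": ("+", ["c", "d"]),
    "z": ("+", ["x", "y"]),
}

def edges(graph):
    """Return the precedence edges (producer, consumer) between operations."""
    return [(src, dst)
            for dst, (_, operands) in graph.items()
            for src in operands
            if src in graph]

print(edges(dfg))  # [('x', 'z'), ('y', 'z')]
```

The two edges say that z can only be computed after both x and y; x and y have no edge between them and may run in parallel.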
DFG semantics
[Figure: the DFG with inputs a, b, c, d, operations *, +, + and values x, y, z]
Exercise 1: data flow graph of DiffEq
Solve the second order differential equation y'' + 3zy' + 3y = 0
Iterative solution
while (z < a) {
    z1 := z + dz;
    u1 := u - (3*z*u*dz) - (3*y*dz);
    y1 := y + (u*dz);
    z := z1; u := u1; y := y1;
}
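To make the loop concrete, here is a minimal runnable sketch in Python. The function name `diffeq` and the starting values in the usage line are assumptions chosen for illustration; the slide does not give them. As in the pseudocode, all three updates read the old values of z, u and y.

```python
def diffeq(z, u, y, dz, a):
    """Iterate the DiffEq loop until z reaches a; u plays the role of y'."""
    while z < a:
        z1 = z + dz
        u1 = u - (3 * z * u * dz) - (3 * y * dz)
        y1 = y + (u * dz)
        z, u, y = z1, u1, y1
    return z, u, y

# Hypothetical starting point: z = 0, u = 1, y = 0, step 0.5, bound a = 1.
print(diffeq(0.0, 1.0, 0.0, 0.5, 1.0))  # (1.0, -0.5, 1.0)
```

The loop body contains six multiplications, two subtractions, two additions and one comparison, which are the operations a data-flow graph of DiffEq has to represent.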
Exercise 1 - result
[Figure: DFG of DiffEq. Inputs: u, dz, 3, z, 3, y, u, dz, z, dz. Operations: six *, two -, two +, one <. Outputs: u1, y1, ctrl]
High-level synthesis
[Figure: the DFG with inputs a, b, c, d, operations *, +, + and values x, y, z]
High-level synthesis
Scheduling: determine for each operation the time at which it should be performed such that no precedence constraint is violated
Allocation: specify the hardware resources that will be necessary
Assignment: provide a mapping from each operation to a specific functional unit and from each variable to a register
High-level synthesis
Scheduling, allocation and assignment are strongly interrelated
But are often solved separately!
Scheduling is NP-complete – heuristics have to be used!
Scheduling
Input: DFG $G(V, E)$ and a library of resource types $R$
Mapping $\pi : V \to R$, $\pi(v_i) = r$
A given operation may be mapped to different resource types, e.g. + may be performed by an adder or an ALU
Execution delay: $\delta(v_i) = d_i$
Resource type cost: $c(r)$
Scheduling
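Start time of operations: $T = \{ t_i : i = 0, 1, \dots, n \}$
Scheduling is the task of determining the start times subject to the precedence constraints of the DFG: $\sigma : V \to \mathbb{Z}^+$
$\sigma(v_i) = t_i$ such that $t_i \ge t_j + d_j$, $\forall i, j : (v_j, v_i) \in E$
Latency: $\lambda = t_n - t_0$
Cost of schedule: $\sum_{r \in R} [\, c(r) \cdot N_r(\sigma) \,]$, with $c(r)$ the cost of resource type $r$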
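As an illustration of these definitions, the sketch below checks the precedence constraint for every edge and evaluates latency and cost for a given schedule. It rests on several assumptions: the function names, the example graph (x := a * b; y := c + d; z := x + y), the unit delays and the cost figures are hypothetical; latency is measured from the earliest start to the latest finish instead of using explicit source and sink operations; and the number of units of a type is taken as the largest number of operations of that type active in the same step.

```python
from collections import Counter

def is_valid(edges, start, delay):
    """True if t_i >= t_j + d_j holds for every edge (v_j, v_i)."""
    return all(start[i] >= start[j] + delay[j] for j, i in edges)

def latency(start, delay):
    """Time from the earliest start to the latest finish."""
    return max(start[v] + delay[v] for v in start) - min(start.values())

def cost(start, delay, rtype, unit_cost):
    """Sum over resource types of unit cost times units needed at once."""
    first = min(start.values())
    last = max(start[v] + delay[v] for v in start)
    needed = Counter()
    for step in range(first, last):
        busy = Counter(rtype[v] for v in start
                       if start[v] <= step < start[v] + delay[v])
        for r, n in busy.items():
            needed[r] = max(needed[r], n)
    return sum(unit_cost[r] * n for r, n in needed.items())

edges = [("x", "z"), ("y", "z")]
delay = {"x": 1, "y": 1, "z": 1}
rtype = {"x": "mul", "y": "add", "z": "add"}
unit_cost = {"mul": 4, "add": 1}   # hypothetical cost figures
start = {"x": 1, "y": 1, "z": 2}

print(is_valid(edges, start, delay))         # True
print(latency(start, delay))                 # 2
print(cost(start, delay, rtype, unit_cost))  # 5
```

Moving z to step 1 would violate the constraint, since z must wait for x and y to finish.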
Scheduling
[Figure: the methodology applied to scheduling: C program, DFG, scheduling algorithm, scheduled DFG]
Scheduling – ASAP
Map operations to their earliest possible start time not violating the precedence constraints
Easy and fast to compute: find the longest path in a directed acyclic graph
No attempt to optimize resource cost
Gives the fastest possible schedule if an unlimited amount of resources is available
Gives an upper bound on execution speed
ASAP algorithm
for each node vi ∈ V do
    if pred(vi) = Ø then
        Ei = 1;
        V = V - { vi };
    else
        Ei = 0;
    endif
endfor
while V ≠ Ø do
    for each node vi ∈ V do
        if all_sched(pred(vi), E) then
            Ei = max(pred(vi), E) + 1;
            V = V - { vi };
        endif
    endfor
endwhile
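A minimal Python sketch of the ASAP procedure above, assuming unit-delay operations as the pseudocode does. The predecessor-dictionary encoding and the function name `asap` are illustrative assumptions, and the cycle check is an addition that the pseudocode does not have.

```python
def asap(pred):
    """ASAP control steps (starting at 1) for a DAG of unit-delay operations.

    pred maps every operation to the list of its predecessor operations.
    """
    E = {}
    remaining = set(pred)
    while remaining:
        # Operations whose predecessors all have a control step already.
        ready = [v for v in remaining if all(p in E for p in pred[v])]
        if not ready:
            raise ValueError("graph contains a cycle")
        for v in ready:
            E[v] = max((E[p] for p in pred[v]), default=0) + 1
        remaining -= set(ready)
    return E

# x := a * b; y := c + d; z := x + y
pred = {"x": [], "y": [], "z": ["x", "y"]}
schedule = asap(pred)
print(schedule["x"], schedule["y"], schedule["z"])  # 1 1 2
```

Each pass of the loop schedules one level of the graph, so the highest control step equals the number of operations on the longest path.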
DiffEq
[Figure: DFG of DiffEq. Inputs: u, dz, 3, z, 3, y, z, dz, u, dz. Outputs: u1, y1, ctrl]
Exercise 2 – latency and resources
Assume: cycle time = 25 ns; d*, d+, d-, d< = 25 ns
What is the latency of the schedule?
How many resources are needed?
How many resources are needed, if we introduce an ALU (+,-,<)?
What is the latency if we have only 1 multiplier?
What is the latency if d* = 25 ns and dALU = 12 ns?
Exercise 2 – result
What is the latency of the schedule? 4 * 25 ns = 100 ns
How many resources are needed? 4 multipliers (*), 1 adder (+), 1 subtractor (-), 1 comparator (<)
How many resources are needed, if we introduce an ALU (+,-,<)? 4 multipliers (*), 2 ALUs
What is the latency if we have only 1 multiplier? 7 * 25 ns = 175 ns
What is the latency if d* = 25 ns and dALU = 12 ns? 3 * 25 ns = 75 ns (operator chaining)
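The first three answers can be checked mechanically. The sketch below is an illustration under assumptions: the operation names (m1 to c1) and the edge structure of the DiffEq graph are a reading of the loop body from Exercise 1, with the two u*dz products kept as separate multiplications and the comparison fed by z + dz. The schedule is a unit-delay ASAP schedule, and the units needed per type are taken as the peak number of operations of that type in one control step.

```python
from collections import Counter

# Assumed DiffEq DFG: operation -> (operator, predecessor operations).
OPS = {
    "m1": ("*", []),            # 3 * z
    "m2": ("*", []),            # u * dz
    "m3": ("*", ["m1", "m2"]),  # (3*z) * (u*dz)
    "m4": ("*", []),            # 3 * y
    "m5": ("*", ["m4"]),        # (3*y) * dz
    "m6": ("*", []),            # u * dz, used for y1
    "s1": ("-", ["m3"]),        # u - m3
    "s2": ("-", ["s1", "m5"]),  # u1 = s1 - m5
    "a1": ("+", ["m6"]),        # y1 = y + m6
    "a2": ("+", []),            # z1 = z + dz
    "c1": ("<", ["a2"]),        # ctrl
}

def asap(ops):
    """Unit-delay ASAP control steps, starting at step 1."""
    E = {}
    def step(v):
        if v not in E:
            E[v] = 1 + max((step(p) for p in ops[v][1]), default=0)
        return E[v]
    for v in ops:
        step(v)
    return E

def units_needed(ops, E, kind=lambda op: op):
    """Maximum number of operations of each kind in any single step."""
    need = Counter()
    for s in set(E.values()):
        busy = Counter(kind(ops[v][0]) for v in E if E[v] == s)
        for k, n in busy.items():
            need[k] = max(need[k], n)
    return dict(need)

E = asap(OPS)
CYCLE_NS = 25
print(max(E.values()) * CYCLE_NS)   # 100

separate = units_needed(OPS, E)
print(separate["*"], separate["+"], separate["-"], separate["<"])  # 4 1 1 1

with_alu = units_needed(OPS, E, lambda op: op if op == "*" else "ALU")
print(with_alu["*"], with_alu["ALU"])  # 4 2
```

Two ALUs are needed because the addition for y1 and the comparison fall in the same control step.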
Scheduling - ALAP
[Figure: ALAP schedule of the DiffEq DFG. Inputs: u, dz, 3, z, 3, y, z, dz, u, dz. Outputs: u1, y1, ctrl]
Scheduling – ALAP
Map operations to their latest possible start time not violating the precedence constraints
Needs a latency constraint
Easy and fast to compute: find the longest path in a directed acyclic graph
No attempt to optimize resource cost
Scheduling – ASAP / ALAP
Are ASAP and ALAP useful?
ASAP start time: $\sigma_{ASAP}(v_i) = E_i$
ALAP start time: $\sigma_{ALAP}(v_i) = L_i$
Operator flexibility $= L_i - E_i$, also known as mobility
Mobility = 0: operator has to be scheduled at $E_i$, otherwise the latency constraint is violated
Mobility > 0: gives scheduling freedom
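Mobility can be computed directly from the two schedules. The sketch below is a minimal illustration assuming unit-delay operations; the operation names and the predecessor structure are an assumed reading of the DiffEq loop body (m = multiplication, s = subtraction, a = addition, c = comparison), and the latency constraint is set to the ASAP latency.

```python
# Assumed DiffEq DFG: operation -> predecessor operations.
PRED = {
    "m1": [], "m2": [], "m3": ["m1", "m2"], "m4": [], "m5": ["m4"],
    "m6": [], "s1": ["m3"], "s2": ["s1", "m5"], "a1": ["m6"],
    "a2": [], "c1": ["a2"],
}

def asap(pred):
    """Earliest control step E_i of each unit-delay operation (from 1)."""
    E = {}
    def visit(v):
        if v not in E:
            E[v] = 1 + max((visit(p) for p in pred[v]), default=0)
        return E[v]
    for v in pred:
        visit(v)
    return E

def alap(pred, latency):
    """Latest control step L_i such that all operations end by `latency`."""
    succ = {v: [] for v in pred}
    for v, ps in pred.items():
        for p in ps:
            succ[p].append(v)
    L = {}
    def visit(v):
        if v not in L:
            L[v] = min((visit(s) for s in succ[v]), default=latency + 1) - 1
        return L[v]
    for v in pred:
        visit(v)
    return L

E = asap(PRED)
L = alap(PRED, latency=max(E.values()))
mobility = {v: L[v] - E[v] for v in PRED}
print(mobility)
```

Under these assumptions the operations on the critical path (m1, m2, m3, s1, s2) get mobility 0, m4 and m5 get mobility 1, and the remaining four operations get mobility 2.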
Scheduling – list based
Generalization of ASAP
Priority-list of ready nodes
A ready node is an operator that has all predecessors already scheduled
The priority-list is always sorted with respect to a priority function
List scheduling algorithm
ins_ready_ops(V, PList_r1, PList_r2, …, PList_rm);
Cstep = 0;
while ((PList_r1 ≠ Ø) or … or (PList_rm ≠ Ø)) do
    Cstep = Cstep + 1;
    for k = 1 to m do
        for funit = 1 to N_k do
            if PList_rk ≠ Ø then
                schedule_op(first(PList_rk), Cstep);
                PList_rk = delete(PList_rk, first(PList_rk));
            endif
        endfor
    endfor
    ins_ready_ops(V, PList_r1, PList_r2, …, PList_rm);
endwhile
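The following Python sketch mirrors the structure of the algorithm above for unit-delay operations. It is an illustration under assumptions, not the slide's exact procedure: the per-type priority lists are replaced by one sorted ready list filtered by free units, the DiffEq graph is an assumed reading of the loop body, and the priorities are hard-coded mobility values for that assumed graph (lower value is scheduled first). Every resource type used by an operation must have at least one unit.

```python
# Assumed DiffEq DFG: operation -> (resource type, predecessors).
OPS = {
    "m1": ("*", []), "m2": ("*", []), "m3": ("*", ["m1", "m2"]),
    "m4": ("*", []), "m5": ("*", ["m4"]), "m6": ("*", []),
    "s1": ("-", ["m3"]), "s2": ("-", ["s1", "m5"]),
    "a1": ("+", ["m6"]), "a2": ("+", []), "c1": ("<", ["a2"]),
}
# Assumed priorities: mobility of each operation for a latency of 4 steps.
MOBILITY = {"m1": 0, "m2": 0, "m3": 0, "m4": 1, "m5": 1, "m6": 2,
            "s1": 0, "s2": 0, "a1": 2, "a2": 2, "c1": 2}

def list_schedule(ops, units, priority):
    """Return operation -> control step, given units per resource type."""
    step_of = {}
    cstep = 0
    while len(step_of) < len(ops):
        cstep += 1
        # Ready: not yet scheduled, all predecessors done in earlier steps.
        ready = sorted(
            (v for v in ops if v not in step_of
             and all(step_of.get(p, cstep) < cstep for p in ops[v][1])),
            key=lambda v: (priority[v], v))
        free = dict(units)
        for v in ready:
            rtype = ops[v][0]
            if free[rtype] > 0:      # a functional unit is still free
                free[rtype] -= 1
                step_of[v] = cstep
    return step_of

one_mul = list_schedule(OPS, {"*": 1, "+": 1, "-": 1, "<": 1}, MOBILITY)
two_mul = list_schedule(OPS, {"*": 2, "+": 1, "-": 1, "<": 1}, MOBILITY)
print(max(one_mul.values()))  # 7
print(max(two_mul.values()))  # 4
```

With one multiplier the schedule needs 7 control steps; with two multipliers this graph is scheduled in 4 steps, the same latency as the ASAP schedule.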
DiffEq
[Figure: list scheduling of the DiffEq DFG. Inputs: u, dz, 3, z, 3, y, z, dz, u, dz. Outputs: u1, y1, ctrl. Operations are labelled [name, mobility]: [a,0], [b,0], [c,1], [d,2], [e,0], [f,1], [g,0], [h,0], [i,2], [j,2], [k,2]. Initial priority lists: Plist*: a,b,c,d; Plist+: j; Plist-: Ø; Plist<: Ø]
List scheduling
Priority may be based on other measures than mobility:
Length of longest path to a node with no immediate successor
Number of immediate successor nodes
High number means high priority
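Both measures are easy to compute from a successor list. The sketch below is an illustration under assumptions: the function names are hypothetical, the example graph is x := a * b; y := c + d; z := x + y, and path length is counted in operations, including the node itself.

```python
def longest_path_to_sink(succ):
    """Operations on the longest path from each node to a node without successors."""
    memo = {}
    def visit(v):
        if v not in memo:
            memo[v] = 1 + max((visit(s) for s in succ[v]), default=0)
        return memo[v]
    return {v: visit(v) for v in succ}

def successor_count(succ):
    """Number of immediate successors of each node."""
    return {v: len(succ[v]) for v in succ}

# Successor lists for x := a * b; y := c + d; z := x + y
succ = {"x": ["z"], "y": ["z"], "z": []}
print(longest_path_to_sink(succ))  # {'x': 2, 'y': 2, 'z': 1}
print(successor_count(succ))       # {'x': 1, 'y': 1, 'z': 0}
```

With either measure, x and y rank above z, so a list scheduler that takes the highest value first would pick them before z.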