hardware/ software partitioning - tu dortmund · hardware/software codesign: approach specification...
TRANSCRIPT
![Page 1: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/1.jpg)
Hardware/Software
Partitioning
Gra
phic
s: ©
Ale
xand
ra N
olte
, Ges
ine
Mar
wed
el,
2003
2011年年年年 12 月月月月 09 日日日日
Peter MarwedelTU Dortmund, Informatik 12
Germany
Gra
phic
s: ©
Ale
xand
ra N
olte
, Ges
ine
Mar
wed
el,
2003
These slides use Microsoft clip arts. Microsoft copyright restrictions apply.
![Page 2: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/2.jpg)
Structure of this course
2:Specification
3: ES-hardware
8:Test
App
licat
ion
Kno
wle
dge Design
repository Design
- 2 - p. marwedel, informatik 12, 2011
ES-hardware
4: system software (RTOS, middleware, …)
Test
5: Evaluation & validation (energy, cost, performance, …)
7: Optimization
6: Application mapping
App
licat
ion
Kno
wle
dge
Numbers denote sequence of chapters
![Page 3: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/3.jpg)
Hardware/software partitioning
No need to consider special hardware in the future?
%100
HardwareSoftware
- 3 - p. marwedel, informatik 12, 2011
�Functionality to be implemented in software or in hardware?
Correct for fixed functionality, but wrong in general: “By the time MPEG-ncan be implemented in software, MPEG-n+1 has been invented” [de Man]
t20202010 2012 2014 2016 2018
![Page 4: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/4.jpg)
Functionality to be implementedin software or in hardware?
Decision based on hardware/ software partitioning,
- 4 - p. marwedel, informatik 12, 2011
partitioning, a special case of hardware/ software codesign.
![Page 5: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/5.jpg)
Codesign Tool (COOL)as an example of HW/SW partitioning
Inputs to COOL:
1.Target technology
- 5 - p. marwedel, informatik 12, 2011
2.Design constraints
3.Required behavior
![Page 6: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/6.jpg)
Hardware/software codesign: approach
Specification
p
- 6 - p. marwedel, informatik 12, 2011
[Niemann, Hardware/Software Co-Design for Data Flow Dominated Embedded Systems, Kluwer Academic Publishers, 1998 (Comprehensive mathematical model)]
Processor P1
Processor P2 Hardware
Mapping
![Page 7: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/7.jpg)
Steps of the COOL partitioning algorithm (1)
1. Translation of the behavior into aninternal graph model
2. Translation of the behavior of each node from VHDL into C
3. Compilation• All C programs compiled for the target processor,
- 7 - p. marwedel, informatik 12, 2011
• All C programs compiled for the target processor,• Computation of the resulting program size, • estimation of the resulting execution time
(simulation input data might be required) 4. Synthesis of hardware components:
∀ leaf nodes, application-specific hardware is synthesized. High-level synthesis sufficiently fast.
![Page 8: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/8.jpg)
Steps of the COOL partitioning algorithm (2)
5. Flattening of the hierarchy:
• Granularity used by the designer is maintained.
• Cost and performance information added to the nodes.
• Precise information required for partitioning is pre-computed
- 8 - p. marwedel, informatik 12, 2011
computed
6. Generating and solving a mathematical model of the optimization problem:
• Integer linear programming ILP model for optimization.Optimal with respect to the cost function(approximates communication time)
![Page 9: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/9.jpg)
Steps of the COOL partitioning algorithm (3)
7. Iterative improvements:Adjacent nodes mapped to the same hardware component are now merged.
- 9 - p. marwedel, informatik 12, 2011
![Page 10: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/10.jpg)
Steps of the COOL partitioning algorithm (4)
8. Interface synthesis:After partitioning, the glue logic required for interfacing processors, application-specific hardware and memories is created.
- 10 - p. marwedel, informatik 12, 2011
![Page 11: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/11.jpg)
An integer linear programming modelfor HW/SW partitioning
Notation:
� Index set V denotes task graph nodes.
� Index set L denotes task graph node typese.g. square root, DCT or FFT
- 11 - p. marwedel, informatik 12, 2011
e.g. square root, DCT or FFT
� Index set M denotes hardware component types.e.g. hardware components for the DCT or the FFT.
� Index set J of hardware component instances
� Index set KP denotes processors.All processors are assumed to be of the same type
![Page 12: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/12.jpg)
An ILP model for HW/SW partitioning
� Xv,m: =1 if node v is mapped to hardwarecomponent type m ∈ M and 0 otherwise.
� Yv,k: =1 if node v is mapped to processor k ∈ KP and 0 otherwise.
� NYl,k =1 if at least one node of type l is mapped to
- 12 - p. marwedel, informatik 12, 2011
� NYl,k =1 if at least one node of type l is mapped to processor k ∈ KP and 0 otherwise.
� Type is a mapping from task graph nodes to their types:Type : V → L
� The cost function accumulates the cost of hardware units:
C = cost(processors) + cost(memories) + cost(application specific hardware)
![Page 13: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/13.jpg)
Constraints
Operation assignment constraints
∑ ∑∈ ∈
=+∈∀Mm KPk
kvmv YXVv 1: ,,
All task graph nodes have to be mapped either in
- 13 - p. marwedel, informatik 12, 2011
All task graph nodes have to be mapped either in software or in hardware.
Variables are assumed to be integers.
Additional constraints to guarantee they are either 0 or 1:
1:: , ≤∈∀∈∀ mvXMmVv
1:: , ≤∈∀∈∀ kvYKPkVv
![Page 14: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/14.jpg)
Operation assignment constraints (2)
∀ l ∈ L, ∀ v:Type(v)=cl , ∀ k ∈ KP : NY l,k ≥ Yv,k
For all types l of operations and for all nodes v of this type:if v is mapped to some processor k, then that processor
- 14 - p. marwedel, informatik 12, 2011
if v is mapped to some processor k, then that processor must implement the functionality of l.
Decision variables must also be 0/1 variables:
∀ l ∈ L, ∀ k ∈ KP : NY l,k ≤ 1.
![Page 15: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/15.jpg)
Resource & design constraints
�∀ m ∈ M, the cost (area) for components of typem is = sum of the costs of the components of that type.This cost should not exceed its maximum.
�∀ k ∈ KP, the cost for associated data storage area should not exceed its maximum.
�∀ k ∈ KP the cost for storing instructions should not exceed
- 15 - p. marwedel, informatik 12, 2011
�∀ k ∈ KP the cost for storing instructions should not exceed its maximum.
�The total cost (Σm ∈ M) of HW components should not exceed its maximum
�The total cost of data memories (Σk ∈ KP) should not exceed its maximum
�The total cost instruction memories (Σk ∈ KP) should not exceed its maximum
![Page 16: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/16.jpg)
Scheduling
Processorp1 ASIC h1
FIR1 FIR2
v1 v2 v3 v4
v5 v6 v7 v8
e3 e4
- 16 - p. marwedel, informatik 12, 2011
v9 v10
v11
t
p1
v8 v7
v7 v8
or
...
... ...
...
t
c1
or
...
... ...
...e3
e3
e4
e4t
FIR2 on h1
v4 v3
v3 v4
or
...
... ...
...
Communication channel c1
![Page 17: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/17.jpg)
Scheduling / precedence constraints
� For all nodes vi1 and vi2 that are potentiallymapped to the same processor or hardware component instance, introduce a binary decision variable bi1,i2 withbi1,i2=1 if vi1 is executed before vi2 and
= 0 otherwise.Define constraints of the type
- 17 - p. marwedel, informatik 12, 2011
Define constraints of the type(end-time of vi1) ≤ (start time of vi2) if bi1,i2=1 and(end-time of vi2) ≤ (start time of vi1) if bi1,i2=0
� Ensure that the schedule for executing operations is consistent with the precedence constraints in the task graph.
� Approach fixes the order of execution
![Page 18: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/18.jpg)
Other constraints
� Timing constraintsThese constraints can be used to guarantee that certain time constraints are met.
� Some less important constraints omitted ..
- 18 - p. marwedel, informatik 12, 2011
![Page 19: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/19.jpg)
Example
HW types H1, H2 and H3 with costs of 20, 25, and 30.
Processors of type P.Tasks T1 to T5.Execution times:
- 19 - p. marwedel, informatik 12, 2011
T H1 H2 H3 P
1 20 100
2 20 100
3 12 10
4 12 10
5 20 100
![Page 20: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/20.jpg)
Operation assignment constraints (1)
T H1 H2 H3 P
1 20 100
2 20 100
3 12 10
4 12 10
5 20 100
∑ ∑∈ ∈
=+∈∀KMm KPk
kvmv YXVv 1: ,,
- 20 - p. marwedel, informatik 12, 2011
X1,1+Y1,1=1 (task 1 mapped to H1 or to P)X2,2+Y2,1=1X3,3+Y3,1=1X4,3+Y4,1=1X5,1+Y5,1=1
![Page 21: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/21.jpg)
Operation assignment constraints (2)
Assume types of tasks are l =1, 2, 3, 3, and 1.
∀ l ∈ L, ∀ v:Type(v)=c l, ∀ k ∈ KP : NY l,k ≥ Yv,k
- 21 - p. marwedel, informatik 12, 2011
Functionality 3 to be implemented on
processor if node 4 is mapped to it.
![Page 22: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/22.jpg)
Other equations
Time constraints leading to: Application specific hardware required for time constraints ≤ 100 time units.
T H1 H2 H3 P
- 22 - p. marwedel, informatik 12, 2011
1 20 100
2 20 100
3 12 10
4 12 10
5 20 100
Cost function:C=20 #(H1) + 25 #(H2) + 30 # (H3) + cost(processor) + cost(memory)
![Page 23: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/23.jpg)
Result
For a time constraint of 100 time unitsand cost(P)<cost(H3):
T H1 H2 H3 P
1 20 100
2 20 100
3 12 10
- 23 - p. marwedel, informatik 12, 2011
3 12 10
4 12 10
5 20 100
Solution (educated guessing) :T1 → H1T2 → H2T3 → PT4 → PT5 → H1
![Page 24: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/24.jpg)
Separation of scheduling and partitioning
Combined scheduling/partitioning very complex;� Heuristic: Compute estimated schedule
� Perform partitioning for estimated schedule� Perform final scheduling� If final schedule does not meet time
constraint, go to 1 using a reduced overall timing constraint.
- 24 - p. marwedel, informatik 12, 2011
timing constraint.
2nd Iteration
t
specificationspecification
Actual execution time
1st Iteration
approx. execution time
t
Actual execution time
approx. execution time
New specificationNew specification
![Page 25: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/25.jpg)
Application example
Audio lab (mixer, fader, echo, equalizer,balance units); slow SPARC processor1µ ASIC libraryAllowable delay of 22.675 µs (~ 44.1 kHz)
- 25 - p. marwedel, informatik 12, 2011
SPARCprocessor
ASIC(Compass,1 µ)
External memory
Outdated technology; just a proof of concept.
![Page 26: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/26.jpg)
Running time for COOL optimization
- 26 - p. marwedel, informatik 12, 2011
� Only simple models can be solved optimally.
![Page 27: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/27.jpg)
Deviation from optimal design
- 27 - p. marwedel, informatik 12, 2011
� Hardly any loss in design quality.
![Page 28: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/28.jpg)
Running time for heuristic
- 28 - p. marwedel, informatik 12, 2011
![Page 29: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/29.jpg)
Design space for audio lab
- 29 - p. marwedel, informatik 12, 2011
Everything in software: 72.9 µs, 0 λ2
Everything in hardware: 3.06 µs, 457.9x106 λ2
Lowest cost for given sample rate: 18.6 µs, 78.4x106 λ2,
![Page 30: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/30.jpg)
Positioning of COOL
COOL approach:
� shows that a formal model of hardware/SW codesign is beneficial; IP modeling can lead to useful implementation even if optimal result is available only for small designs.
Other approaches for HW/SW partitioning:
- 30 - p. marwedel, informatik 12, 2011
Other approaches for HW/SW partitioning:
� starting with everything mapped to hardware; gradually moving to software as long as timing constraint is met.
� starting with everything mapped to software; gradually moving to hardware until timing constraint is met.
� Binary search.
![Page 31: Hardware/ Software Partitioning - TU Dortmund · Hardware/software codesign: approach Specification p p. marwedel, - 6 -informatik 12, 2011 [Niemann, Hardware/Software Co-Design for](https://reader033.vdocuments.net/reader033/viewer/2022060907/60a194d3e5cc8409d7552d94/html5/thumbnails/31.jpg)
HW/SW partitioning in the context of mapping applications to processors
� Handling of heterogeneous systems
� Handling of task dependencies
� Considers of communication (at least in COOL)
- 31 - p. marwedel, informatik 12, 2011
� Considers of communication (at least in COOL)
� Considers memory sizes etc (at least in COOL)
� For COOL: just homogeneous processors
� No link to scheduling theory