1
Hardwired networks on chip for FPGAs
Kees Goossens (TUD, NXP), Muhammad Aqeel Wahlah (TUD)
Kees Goossens 2009-06-02 Tubs.CITY
2
overview
applications
network on chip
FPGA
key ideas
– hardwired NOC
– unified interconnect
– data coercion / type casting
dynamic partial reconfiguration
– multiple applications
– multiplex sub-applications (“hardware tasks”)
example
conclusions
3
applications
[figure: example applications and their tasks; labels: T1 T2 T3, C1 C2 C3, A1 A2, BAC, BA]
task / function mapped on IP
– includes storage / buffering
application: set of communicating IPs / tasks / ...
– data, control, code
– communication via connections
use case: set of concurrent applications
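The terms on this slide (task, application, connection, use case) can be sketched as plain data structures. This is a minimal illustration only: the class names and the pipeline shape T1 → T2 → T3 are assumptions made for the example, not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Connection:
    src: str   # producing task
    dst: str   # consuming task
    kind: str  # "data", "control", or "code"


@dataclass
class Application:
    name: str
    tasks: set
    connections: list = field(default_factory=list)

    def connect(self, src, dst, kind="data"):
        """Add a connection between two tasks of this application."""
        if src not in self.tasks or dst not in self.tasks:
            raise ValueError(f"{src} -> {dst}: task not in application {self.name}")
        self.connections.append(Connection(src, dst, kind))


# Hypothetical topologies: the slides name the tasks but not how they connect.
app_t = Application("T", {"T1", "T2", "T3"})
app_t.connect("T1", "T2")
app_t.connect("T2", "T3")

app_a = Application("A", {"A1", "A2"})
app_a.connect("A1", "A2")

# A use case is a set of applications that run concurrently.
use_case = [app_t, app_a]
all_tasks = sorted(t for app in use_case for t in app.tasks)
```

Connections never cross application boundaries here, which matches the later slides' requirement that independent applications do not communicate.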
4
network on chip (NOC)
connects ports on hardware blocks (IP)
– data, control
connections: virtual wires programmable at run-time
– set up & destroy connections by programming control registers in the NOC
styles of communication
– address-based / memory-mapped
– streaming
real-time / quality of service
[figure: NOC of routers (R) and network interfaces (NI) connecting IPs; labels: T1, T2, T3, A1, A2, BAC, BA]
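A toy model of "virtual wires programmable at run-time": a connection exists only while a control register holds its destination. The register layout (one register per source port) and all names are assumptions for illustration; the slides do not describe the real register map.

```python
class Noc:
    """Toy NOC: one control register per source port selects its destination."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.regs = {}  # control registers: source port -> destination port

    def write_reg(self, src, dst):
        """Program one control register; dst=None destroys the connection."""
        if src not in self.ports or (dst is not None and dst not in self.ports):
            raise ValueError("unknown port")
        if dst is None:
            self.regs.pop(src, None)
        else:
            self.regs[src] = dst

    def send(self, src, word):
        """Deliver a word over the connection from src, if one is set up."""
        if src not in self.regs:
            raise RuntimeError(f"no connection from {src}")
        return (self.regs[src], word)


noc = Noc(["T1.out", "T2.in", "T3.in"])
noc.write_reg("T1.out", "T2.in")        # set up a connection
delivered = noc.send("T1.out", 0xCAFE)
noc.write_reg("T1.out", "T3.in")        # reprogram it at run-time
redirected = noc.send("T1.out", 0xBEEF)
noc.write_reg("T1.out", None)           # destroy it
```

Nothing is re-synthesised or reconfigured when the destination changes; only a register value is written.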
5
FPGA fabric
[figure: FPGA with regions of LUTs, IO processor, CPU, on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP]
soft IP are configured in
– configurable elements (LUT)
– and switch boxes (not shown)
with a given configuration granularity (frame) using the configuration interconnect (ICAP)
hard IP
– CPU
– on-chip memories (BRAM, ...)
– off-chip memory interfaces
– decryption IP
– etc.
6
application on FPGA
[figure: FPGA with regions of LUTs, IO processor, CPU, on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP; labels: A1, A2, BAC, BA, soft data interconnect, soft control interconnect]
map application (IPs + interconnect + storage) on soft + hard IP
traditionally data and control interconnects are separate
could also use NOC for both
7
multiple applications on FPGA
[figure: FPGA with regions of LUTs, IO processor, CPU, on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP; labels: A1, A2, BAC, BA, T1, T2, T3, soft data interconnect, soft control interconnect]
interconnects and IPs of different applications share reconfiguration regions (frames)
– dynamic reconfiguration is global, not partial
– applications interfere
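The interference argument can be made concrete: if two applications occupy any common configuration frame, reconfiguring one disturbs the other. The sketch below checks for such overlaps; the frame indices and layouts are invented for illustration.

```python
def interfering_pairs(frames_by_app):
    """Return the pairs of applications whose configuration frames overlap."""
    apps = sorted(frames_by_app)
    return [
        (a, b)
        for i, a in enumerate(apps)
        for b in apps[i + 1:]
        if frames_by_app[a] & frames_by_app[b]
    ]


# Hypothetical layouts: frame indices occupied by each application,
# including its share of the soft interconnect.
shared_layout = {"A": {0, 1, 2}, "T": {2, 3}, "C": {3, 4}}
disjoint_layout = {"A": {0, 1}, "T": {2, 3}, "C": {4}}

shared_conflicts = interfering_pairs(shared_layout)
disjoint_conflicts = interfering_pairs(disjoint_layout)
```

With `shared_layout`, A and T share frame 2 and C and T share frame 3, so neither pair can be reconfigured independently; with `disjoint_layout` no pair conflicts.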
8
overview
applications
network on chip
FPGA
key ideas
– hardwired NOC
– unified interconnect
– data coercion / type casting
dynamic partial reconfiguration
– multiple applications
– multiplex sub-applications (“hardware tasks”)
example
conclusions
9
1. hardwired interconnect
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP, joined by hard interconnect(s); labels: A1, A2, BAC, BA, T1, T2, T3]
replace soft interconnect(s) by hard interconnect(s)
– interconnect regions of LUTs (CFR)
– ~35× smaller area
– ~5× higher speed
– program, don’t configure
bit-level (CFR) vs. transaction-level (NOC) reconfigurability
– memory mapped
– streaming
10
1. hardwired interconnect
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP, joined by hard interconnect(s); labels: BAC, T1, T2, T3, C1, C2, C3]
dynamic partial reconfiguration
– no constraints on soft IP placement
loss of flexibility
– fewer LUTs
11
2. unified interconnect
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, de/encrypt accelerator, and ICAP, joined by a single hard interconnect; labels: A1, A2, BAC, BA, T1, T2, T3]
one interconnect (e.g. NOC) for
– data for functional mode
– control for programming
– bitstreams for configuration
dynamic partitioning of different interconnects
12
3. data coercion
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; traffic labels: bitstream, data]
data = control = bitstream = …
connect a data port to a configuration port
– decrypt bitstreams
13
3. data coercion
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: PH, IP; traffic label: bitstream]
data = control = bitstream
connect a data port to a configuration port
– decrypt bitstreams
– run-time compute / optimise bitstreams
• JIT, peephole
14
3. data coercion
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: TR, TV, DUT; traffic labels: data, test data]
data = control = bitstream = test
connect a data port to a configuration port
– decrypt bitstreams
– run-time compute / optimise bitstreams
connect a data port to a test port
– run-time structural test
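A small sketch of data coercion: the interconnect carries untyped words, and it is the receiving port that decides whether they are data or a bitstream. The XOR "cipher", the `Cfr` class, and its port methods are illustrative assumptions, not the accelerator or port interface of the actual design.

```python
def decrypt(words, key):
    """Toy stand-in for the de/encrypt accelerator: XOR every word with key."""
    return [w ^ key for w in words]


class Cfr:
    """A configurable region with a configuration port and a data port."""

    def __init__(self):
        self.frames = []  # configuration memory, written via the configuration port
        self.inbox = []   # functional input, written via the data port

    def config_port(self, words):
        self.frames.extend(words)

    def data_port(self, words):
        self.inbox.extend(words)


KEY = 0x5A                                      # hypothetical key
plain_bitstream = [0x01, 0x02, 0x03]            # hypothetical bitstream words
encrypted = [w ^ KEY for w in plain_bitstream]  # as it would sit in memory

cfr = Cfr()
# Connection: memory -> accelerator data output -> CFR configuration port.
# The accelerator emits ordinary data; the destination port treats it as a bitstream.
cfr.config_port(decrypt(encrypted, KEY))
```

The same accelerator output could equally be connected to `data_port`; only the choice of destination port changes how the words are interpreted.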
15
overview
applications
network on chip
FPGA
key ideas
– hardwired NOC
– unified interconnect
– data coercion / type casting
dynamic partial reconfiguration
– multiple applications
– multiplex sub-applications (“hardware tasks”)
example
conclusions
16
dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
– independent applications on own virtual platform
• no communication, no interference
– activation given by user, environment, etc.
[figure: applications and their tasks (T1 T2 T3; C1 C2 C3; A1 A2; BAC; BA) along a time axis; timeline labels: app T, app D, A, app AC]
17
dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
2. parts of single applications (soft IP, “hardware tasks”)
– multiplex resources of a single application
[figure: tasks (C1 C2 C3; A1 A2; BAC; BA) along a time axis; timeline labels: app T, app D, A, C]
18
dynamic partial reconfiguration
“hardware operating system” implements run-time scheduling of
1. multiple concurrent applications
2. parts of single applications (soft IP, “hardware tasks”)
– multiplex resources of a single application
– internal state
[figure: tasks (C1 C2 C3; A1 A2; BAC; BA) along a time axis; timeline labels: app T, app D, A, C, state]
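Multiplexing hardware tasks on one region means the outgoing task's internal state has to be kept somewhere and restored when it returns. The following is a minimal sketch of that bookkeeping; the `Region` class, the integer state, and the reset value 0 are assumptions for illustration.

```python
class Region:
    """One reconfigurable region hosting a single hardware task at a time."""

    def __init__(self):
        self.task = None
        self.state = None


def switch_task(region, new_task, saved_state):
    """Swap the task in `region`, saving the outgoing task's state first."""
    if region.task is not None:
        saved_state[region.task] = region.state
    region.task = new_task
    # A task that never ran starts from an assumed reset state of 0.
    region.state = saved_state.get(new_task, 0)


saved = {}
cfr = Region()
switch_task(cfr, "A1", saved)
cfr.state += 5                 # A1 runs and updates its internal state
switch_task(cfr, "A2", saved)  # A1 is swapped out, A2 is configured
cfr.state += 1
switch_task(cfr, "A1", saved)  # A1 resumes where it left off
```

In a real system the saved state would travel over the interconnect to a memory; here a dictionary stands in for that storage.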
19
dynamic partial reconfiguration
1. system manager
– resource management (CFR, NOC, …)
• inter-application virtual platforms
[figure: time axis with the system manager and application managers; labels: A, C, BAC, T]
20
dynamic partial reconfiguration
1. system manager
– resource management (CFR, NOC, …)
• inter-application virtual platforms
• intra-application phases
– NOC programming
– soft IP / (sub)-application configuration
[figure: time axis with the system manager and an application manager; labels: A, C, BAC]
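The system manager's resource-management role can be sketched as an allocator that hands each application its own disjoint set of CFRs (its virtual platform). The first-fit policy, the class, and the CFR names below are assumptions; the slides do not specify an allocation policy.

```python
class SystemManager:
    """Toy resource manager: gives each application its own set of CFRs."""

    def __init__(self, cfrs):
        self.free = list(cfrs)
        self.platforms = {}  # application -> CFRs forming its virtual platform

    def start(self, app, n_cfrs):
        """Allocate n_cfrs free regions to app (first-fit, an assumed policy)."""
        if n_cfrs > len(self.free):
            raise RuntimeError("not enough free CFRs")
        self.platforms[app] = [self.free.pop(0) for _ in range(n_cfrs)]
        return self.platforms[app]

    def stop(self, app):
        """Release the regions of app so other applications can use them."""
        self.free.extend(self.platforms.pop(app))


sm = SystemManager(["CFR0", "CFR1", "CFR2", "CFR3"])
platform_a = sm.start("A", 2)
platform_t = sm.start("T", 2)
sm.stop("A")                   # A ends; its regions become available again
platform_c = sm.start("C", 1)  # C reuses a region that A released
```

Because no CFR is ever in two platforms at once, starting or stopping one application leaves the others untouched.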
21
dynamic partial reconfiguration
1. system manager
2. application manager
– application programming
[figure: time axis with the system manager and application managers; labels: A, C, BAC, T]
22
dynamic partial reconfiguration
1. system manager
2. application manager
– application programming
– intra-application persistent data management
[figure: time axis with the system manager and an application manager; labels: A, C, BAC, C1 C2 C3, A1 A2, BA, state]
23
overview
applications
network on chip
FPGA
key ideas
– hardwired NOC
– unified interconnect
– data coercion / type casting
dynamic partial reconfiguration
– multiple applications
– multiplex sub-applications (“hardware tasks”)
example
conclusions
24
modelling
SystemC
– bit & cycle accurate NOC model
– behavioural CFR models
– accurate bitstream structure
– behavioural hard IP models
model
– starting / stopping of applications
• dynamic, based on user input
– starting / stopping of sub-applications
• dynamic, based on flow of data
– configuration: loading of bitstreams for soft IP; clock & reset
– programming: of NOC, system & sub-application managers
– management of persistent state
25
example
system manager
– program NOC for configuration
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: A1, A2, BAC, BA, system manager, application manager]
26
example
system manager
– program NOC for configuration
– configure: load bitstreams
• including bitstream syntax, etc.
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: A1, A2, BAC, BA, system manager, application manager; legend: bitstream, programming, data]
27
example
system manager
– program NOC for configuration
– configure: load bitstreams
– program NOC for (sub)-application A
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: A1, A2, BAC, BA, system manager, application manager; legend: bitstream, programming, data]
28
example
system manager
– program NOC for configuration
– configure: load bitstreams
– program NOC for (sub)-application A
– program & start application manager
• including clocking & reset
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: A1, A2, BAC, BA, system manager, application manager; legend: bitstream, programming, data]
29
example
system manager
– program NOC for configuration
– configure: load bitstreams
– program NOC for (sub)-application A
– program & start application manager
application manager
– programs & starts sub-app A
• soft IP function is modelled by CFR
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: A1, A2, BAC, BA, system manager, application manager; legend: bitstream, programming, data]
30
example
system manager
– program NOC for configuration
– configure: load bitstreams
– program NOC for (sub)-application A
– program & start application manager
application manager
– programs & starts sub-app A
sub-application A runs
[figure: FPGA with CFRs, IO processor, CPU, on-chip memories, off-chip memory, and de/encrypt accelerator on a single hard interconnect; labels: A1, A2, BAC, BA, system manager, application manager; legend: bitstream, programming, data]
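The example builds up a fixed order of steps: because bitstreams travel over the same interconnect as data, the NOC must be programmed before anything can be configured. A minimal sketch that enforces this order is given below; the `BringUp` class is hypothetical and only the step names come from the slides.

```python
STEPS = [
    "program NOC for configuration",
    "configure: load bitstreams",
    "program NOC for (sub)-application A",
    "program & start application manager",
    "application manager programs & starts sub-app A",
    "sub-application A runs",
]


class BringUp:
    """Accept the steps only in the order in which the slides introduce them."""

    def __init__(self):
        self.done = []

    def step(self, name):
        if len(self.done) == len(STEPS):
            raise RuntimeError("sequence already complete")
        expected = STEPS[len(self.done)]
        if name != expected:
            raise RuntimeError(f"expected {expected!r}, got {name!r}")
        self.done.append(name)


seq = BringUp()
for name in STEPS:
    seq.step(name)

# Loading bitstreams before the NOC is programmed is rejected.
bad = BringUp()
try:
    bad.step("configure: load bitstreams")
    rejected = False
except RuntimeError:
    rejected = True
```

The first four steps belong to the system manager and the fifth to the application manager, mirroring the split of responsibilities on the earlier slides.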
31
conclusions
ideas:
– hardwire NOC
– unified interconnects
– data coercion / type casting
very detailed model
many simplifications & restrictions
many open issues
– design flow: soft IP placement, binding, relocation, etc.
– application model:
• extend use-case model with intra-application dynamism
• more general notions of persistent state
– implementation: separation of system & application managers