Post on 05-Oct-2019
Multicore DSPs – Karol Desnos (kdesnos@insa-rennes.fr)
MULTICORE DIGITAL
SIGNAL PROCESSING
Karol Desnos – kdesnos@insa-rennes.fr
Slides from M. Pelcat, K. Desnos,
J.-F. Nezan, D. Ménard,
M. Raulet, J. Gorin
Previously on MDSPs
Source: ATEME
HEVC Encoder Block description
Previously on MDSPs
HEVC Decoder Block description
[Figure: decoder block diagram: VLC Decoding, Dequantization, Inverse Transform, Filtering]
Source: Hervé Yviquel
Application is naturally described with blocks
Applications for MDSPs
[Figure: LTE physical-layer processing chains.
Downlink Data Encoding: data (bits), Channel Coding (CRC/Turbo Coding, Rate Matching, Interleaving/Scrambling), Symbol Processing (Modulation, Multi-Antenna Precoding, OFDMA Encoding).
Uplink Data Decoding: Symbol Processing (SC-FDMA Decoding/Multi-Antenna Equalization, Channel Estimation, Demodulation), Channel Decoding (Descrambling/Deinterleaving, Rate Dematching/HARQ, Turbo Decoding/CRC), data (bits).]
Grail of Heterogeneous MPSoCs Programming
Previously on MDSPs
[Figure: an Algorithm and an Architecture description feed a Multicore Compiler, supported by a Simulator + Debugger + Profiler; the compiler generates a Portable Multicore Program that a Multicore Runtime executes on an MPSoC made of main processors, processing elements (PE), peripherals, and main memory]
Course Outline
• Lecture 1 – Maxime Pelcat
  • Introduction to the course
  • Applications for MDSPs
• Lecture 2 – Karol Desnos
  • Languages and MoCs
  • Programming MPSoCs
  • Dataflow MoCs
• Lecture 3 – Maxime Pelcat
  • Hardware Architectures
• Lecture 4 – Karol Desnos
  • Theoretical Bounds
  • Mapping/Scheduling Strategies
• Lecture 5 – Karol Desnos
  • Lab Session
What is a Model of Computation (MoC)?
• A set of operational elements that can be composed to describe the behavior of an application.
Semantics of the MoC
• Interface between:
  • Computer science
  • Mathematical domain
• A MoC is not a language!
Languages and MoCs
What is a language?
• A set of textual/graphical symbols that can be assembled according to a well-defined grammar to specify the behavior of a program
Syntax of the language
• Interface between:
  • Programmer
  • Machine (through a compiler)
• A language implements one or several MoCs
Languages and MoCs
Semantics and Syntax
• Among the 10 most used languages,
• all 10 are imperative,
• 7 are object-oriented.
• Other semantics exist. A MoC specifies semantics
independently from a language syntax
Languages and MoCs
Semantics and Syntax
• UML implements object-oriented semantics
• C++ implements object-oriented semantics
• They share semantics but not syntax
Languages and MoCs
A few MoCs
• Finite State Machine MoCs
  • Semantics
    • States
    • Transitions based on conditions
  • MoC of imperative languages (C/C++/Java…)
  • Computation is executed on transitions
  • Representing the behavior of a Turing machine
Languages and MoCs
[Figure: flowchart from Start to End: pts = @TAB, then loop on (pts) = 0 and pts = pts + 1 until the test "End TAB?" answers yes]
A few MoCs
• Discrete Event MoCs
  • Modules react to events by producing events
  • Events are tagged in time, i.e. the time at which events are consumed and produced is essential and is used to model system behavior
  • Modules share a clock
  • Used to model HDL behavior
• Functional MoCs and lambda calculus
  • No state, a program returns the result of composed mathematical functions: result = (f ∘ g ∘ h)(inputs)
  • Based on lambda calculus
  • Haskell, Caml, Scheme, XSLT
  • Functional languages are examples of declarative languages
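As a small illustration of the functional view, the composition result = (f ∘ g ∘ h)(inputs) can be sketched in Python. The helper `compose` and the three functions `f`, `g`, `h` are hypothetical names chosen for this sketch; they do not come from the slides.

```python
from functools import reduce


def compose(*functions):
    """Return the composition of the given functions, applied right to left."""
    return reduce(lambda outer, inner: lambda x: outer(inner(x)), functions)


# Three stateless functions (illustrative only).
def h(x):
    return x + 1


def g(x):
    return x * 2


def f(x):
    return x - 3


# result = (f o g o h)(4) = f(g(h(4))) = f(g(5)) = f(10) = 7
result = compose(f, g, h)(4)
print(result)
```

No variable is updated anywhere: the program is only the composition of functions, which is what distinguishes this MoC from the state-machine view.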
Languages and MoCs
A few MoCs
• Petri Nets
  • Close to state machines, but able to represent parallelism
  • Operations are done on transitions
• Synchronous MoCs
  • Like in Discrete Events, modules react to events by producing new events
  • Contrary to Discrete Events, time is not explicit: only the simultaneity of events and causality are important
  • Language examples: Signal, Esterel, Lustre…
Languages and MoCs
[Figure: Processing thread 1 and Processing thread 2, each looping over one operation (OP1, OP2)]
A few MoCs
• Process Network MoCs
  • Concurrent and independent modules (processes) communicate ordered tokens (data quanta) through First-In, First-Out (FIFO) channels
  • Include dataflow process networks
  • Untimed models: time is abstracted
This is the model we naturally used to describe MPEG HEVC and 3GPP LTE processing
Languages and MoCs
• Parallelisms
• Programming Coarse-Grain Parallelism
Programming MPSoCs
Parallelisms
• Granularity of parallelism
  • Amount of computation in an atomic operation relative to the overhead of managing/executing several operations concurrently
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world]
Parallelisms
• Single-Instruction Multiple-Data (SIMD)
• Primitive operations (+ - * / >> min max …)
• Low/No cost when code is suitable (data dependencies)
• Explicitly specified with intrinsics / implicitly supported by HW
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, SIMD]
Parallelisms
• Software Pipelining
• Loop optimization: start a new iteration before the previous one ends
• Low cost when code is suitable (data dependencies)
• Automatically performed by compilers (-O3)
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, SIMD, SW Pipe.]
Parallelisms
• Instruction Level Parallelism
  • Out-of-order execution
  • SIMD
  • Register Renaming
  • Branch prediction
  • SW pipelining
  • …
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, ILP, SIMD, SW Pipe.]
Parallelisms
• Threads
• “Independent” sequence of instructions with private context
• Moderate to high cost => quasi-total hardware abstraction
• Complex to program: deadlocks, preemption, …
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, ILP, SIMD, SW Pipe., Threads]
Parallelisms
• Cores + on-chip interconnect
• Each core runs a bare-metal program (no OS).
• HW-supported means of communication (Interrupt, DMA, Packet, ...)
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, ILP, SIMD, SW Pipe., Threads, Cores + interconnect]
Parallelisms
• Chips + LAN or WAN
• Independent servers/computers communicating through a network
• Very high cost (communication, distributed memory)
• High Performance Computing
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, ILP, SIMD, SW Pipe., Threads, Cores + interconnect, Chips + L/WAN]
Parallelisms
• For MPSoCs: Fine and Coarse-grain parallelisms
• ILP and SW Pipelining: Compiler (& crazy developers)
• Threads and Interconnect: Developers and/or optimization tools
Programming MPSoCs
[Figure: overhead versus operation size chart (both axes from small to large), showing: Perfect world, ILP, SIMD, SW Pipe., Threads, Cores + interconnect, Chips + L/WAN]
• Parallelisms
• Programming Coarse-Grain Parallelism
Programming MPSoCs
Programming Coarse-Grain Parallelism
• Manually, using threads
  • “Independent” sequence of instructions with private context
  • Usually preemptible
  • Number of threads ≈ number of cores
    • Otherwise, context switching rapidly degrades application performance => tested in the lab sessions
  • Solutions:
    • POSIX threads (Linux/Windows)
    • Win32 API
    • Boost Threads
    • “Idle” threads on SYS/BIOS (TI DSP)
    • …
Programming MPSoCs
Programming Coarse-Grain Parallelism
• Manually, using tasks
  • Short independent sequence of instructions with private context
  • Not preemptible
  • Number of tasks >> number of cores
    • Finer granularity than threads
    • Using many tasks helps load balancing
  • Solutions:
    • Multicore Task Management API (MTAPI)
    • Open Event Machine
    • StarSs
    • …
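The load-balancing argument can be checked with a small simulation. This is only a sketch under stated assumptions: tasks are non-preemptible, every idle core pulls the next task from one shared queue, and the function name `makespan` and the task costs are hypothetical. It does not model MTAPI, Open Event Machine, or StarSs.

```python
import heapq


def makespan(task_costs, n_cores):
    """Finish time when each idle core pulls the next task from a shared queue."""
    finish_times = [0] * n_cores  # time at which each core becomes idle
    heapq.heapify(finish_times)
    for cost in task_costs:
        start = heapq.heappop(finish_times)  # earliest idle core takes the task
        heapq.heappush(finish_times, start + cost)
    return max(finish_times)


# Same total work (120 time units) on 4 cores, cut at two granularities.
coarse = makespan([24] * 5, 4)  # 5 large tasks: one core runs two of them
fine = makespan([3] * 40, 4)    # 40 small tasks: every core runs ten of them
print(coarse, fine)
```

With five large tasks one core ends up with twice the work of the others, whereas forty small tasks reach the ideal 120 / 4 = 30 time units. The sketch ignores the per-task management overhead, which grows as tasks get smaller.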
Programming MPSoCs
Programming Coarse-Grain Parallelism
• Semi-automated, using OpenMP
  • Add #pragma omp directives to parallelize sections of C code
  • The compiler automatically creates threads/tasks
  • Shared-memory targets!
  • Supported on some MPSoCs (including c6x DSPs!!)
  • Examples:
    • Split loop iterations among threads
      #pragma omp parallel for
      for (i = 0; i < L; i++) { … }
    • Execution barrier for several threads (a barrier cannot be placed inside a "parallel for" loop body, so the loop is split in two)
      #pragma omp parallel
      {
        #pragma omp for nowait
        for (i = 0; i < L; i++)
          tab[i] = 0; // init
        #pragma omp barrier // wait until the whole tab is initialized
        #pragma omp for
        for (i = 0; i < L; i++)
          tab[i] = …; // work on the tab
      }
Programming MPSoCs
Programming Coarse-Grain Parallelism
And so many other solutions:
• Message-Passing Interface (MPI), Multicore
Communication API (MCAPI)
• Communication protocol for parallel computing abstracting architecture
concerns.
• OpenCL, CUDA
• GPGPU oriented
• Cilk, Go Language, Threading Building Blocks, …
• Threads without headaches
• Dataflow programming
Programming MPSoCs
• Dataflow Models of Computations
• Properties of Dataflow MoCs
• Dataflow Frameworks
• Laboratories Example
Dataflow MoCs
Dataflow MoCs
• Compilation for block diagrams
• Accessible to persons with no knowledge of programming languages
• 1961: BLODI compiler
  • 30 basic DSP blocks (Adder, Filter, Quantizer, …)
  • 1 punched card for each block
Dataflow MoCs
Source: J.L. Kelly, C. Lochbaum, and V.A. Vyssotsky. A block diagram compiler. Bell System Technical Journal, 1961
Dataflow MoCs
• Kahn Process Network (KPN)
• Actors and Data Ports
• First-In, First-Out (FIFO) Queues
Dataflow MoCs
[Figure: example graph with actors A, B, C, D, and E connected by FIFO queues]
Dataflow MoCs
• Kahn Process Network (KPN)
• Production/consumption rates of a process can change at each firing, but:
  • Reading from a FIFO is a blocking operation
  • This ensures the determinism of the application, i.e. same input stream => same output stream
Dataflow MoCs
[Figure: example graph with actors A, B, C, D, and E connected by FIFO queues]
Dataflow MoCs
• Kahn Process Network (KPN)
• Process example
• Process Sort separates odd and even values.
Process sort(in int i, out int even, out int odd) {
int value = i.read(); // Blocking
if( value % 2 == 0)
even.write(value);
else
odd.write(value);
}
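A runnable sketch of the Sort process, using Python threads and `queue.Queue` as FIFOs. The `END` sentinel and the helpers `drain` and `run_sort` are assumptions added so that the example terminates; a real Kahn process runs on unbounded streams and has no such marker.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker, not part of the KPN model


def sort_process(i, even, odd):
    """Kahn process: route each input token to the 'even' or 'odd' FIFO."""
    while True:
        value = i.get()  # blocking read, as required by the KPN semantics
        if value is END:
            even.put(END)
            odd.put(END)
            return
        (even if value % 2 == 0 else odd).put(value)


def drain(fifo):
    """Collect tokens from a FIFO until the end-of-stream marker."""
    tokens = []
    while (token := fifo.get()) is not END:
        tokens.append(token)
    return tokens


def run_sort(values):
    i, even, odd = queue.Queue(), queue.Queue(), queue.Queue()
    worker = threading.Thread(target=sort_process, args=(i, even, odd))
    worker.start()
    for value in values:
        i.put(value)
    i.put(END)
    worker.join()
    return drain(even), drain(odd)


print(run_sort([1, 5, 6, 2, 3]))
```

Token order is preserved inside each output FIFO, whatever the speed of the producer thread.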
Dataflow MoCs
[Figure: actor Sort with input i receiving the stream 3 2 6 5 1, and outputs even and odd]
Dataflow MoCs
• Kahn Process Network (KPN)
• Determinism example
• Process Interleave outputs values taken alternately from its inputs.
Process interleave(in int a, in int b, out int o) {
  static bool readA = true; // selects the input read by this firing
  int value = readA ? a.read() : b.read(); // blocking read
  readA = !readA;
  o.write(value);
}
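The effect of the blocking read can be simulated without threads. In this sketch the class `Interleave`, its `push` method, and the token values are hypothetical: the process fires while the FIFO selected by its state holds a token and stays blocked otherwise, and the same tokens are delivered in two different arrival orders.

```python
from collections import deque


class Interleave:
    """Simulated Kahn process that alternates blocking reads on inputs a and b."""

    def __init__(self):
        self.fifos = {"a": deque(), "b": deque()}
        self.next_port = "a"  # state: which input the next read targets
        self.out = []

    def push(self, port, value):
        """A token arrives on one input; run the process until it blocks again."""
        self.fifos[port].append(value)
        while self.fifos[self.next_port]:  # an empty FIFO means the read blocks
            self.out.append(self.fifos[self.next_port].popleft())
            self.next_port = "b" if self.next_port == "a" else "a"


def run(arrivals):
    process = Interleave()
    for port, value in arrivals:
        process.push(port, value)
    return process.out


# Same tokens on each input, two different arrival orders.
first = run([("a", 2), ("a", 6), ("b", 3), ("b", 5), ("b", 1)])
second = run([("b", 3), ("b", 5), ("b", 1), ("a", 2), ("a", 6)])
print(first, second)
```

Both runs produce the same output stream, and the last token of b stays queued because the process is blocked waiting on a.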
Dataflow MoCs
[Figure: Sort (input i, outputs even and odd) feeds Interleave (inputs a and b, output o); token values 3, 2, 6, 5, 1; Interleave is blocked on a read]
Dataflow MoCs
• Dataflow Process Network (DPN)
• Firing rules of an actor can change at each firing
• Firing rules depend on:
  • State of actors
  • Time / Randomness…
  • Number/value of available tokens
• The model is non-deterministic
Dataflow MoCs
[Figure: example graph with actors A, B, C, D, and E]
Source: E.A. Lee and T.M. Parks. Dataflow process networks. Proceedings of the IEEE, 1995.
Dataflow MoCs
• Dataflow Process Network (DPN)
• Non-Determinism example
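• Process Interleave outputs values taken alternately from its inputs.
• Time-out on FIFO read!
Process interleave(in int a, in int b, out int o) {
  static bool readA = true; // selects the input read by this firing
  int value = readA ? a.read() : b.read(); // the read may time out
  readA = !readA;
  if (no_timeout) o.write(value);
}
Dataflow MoCs
[Figure: Interleave with inputs a and b and output o; token values 3, 2, 6, 5, 1; one read is marked "Timeout"]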
• Same inputs with different arrival times produce different outputs
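The time dependence can be reproduced with a deterministic simulation of the clock. Everything here is an assumption made for the sketch: tokens are `(arrival_time, value)` pairs, `interleave_with_timeout` is a hypothetical name, and a read that times out simply produces no output before the process switches to the other input.

```python
def interleave_with_timeout(a_tokens, b_tokens, timeout):
    """Alternate reads on a and b; a read that waits longer than `timeout` is dropped.

    a_tokens and b_tokens are lists of (arrival_time, value) sorted by arrival time.
    The timeout must be strictly positive.
    """
    fifos = {"a": list(a_tokens), "b": list(b_tokens)}
    port, now, out = "a", 0, []
    while fifos["a"] or fifos["b"]:
        if fifos[port] and fifos[port][0][0] <= now + timeout:
            arrival, value = fifos[port].pop(0)
            now = max(now, arrival)  # wait for the token if it is not there yet
            out.append(value)
        else:
            now += timeout  # the read timed out: nothing is written
        port = "b" if port == "a" else "a"
    return out


b_stream = [(0, 3), (1, 5)]
# Same values on input a; only the arrival time of the second token differs.
on_time = interleave_with_timeout([(0, 2), (1, 6)], b_stream, timeout=2)
late = interleave_with_timeout([(0, 2), (5, 6)], b_stream, timeout=2)
print(on_time, late)
```

The two runs carry the same values on both inputs but yield different output orders, which is exactly what the blocking read of a KPN rules out.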
• Dataflow Models of Computations
• Properties of Dataflow MoCs
• Dataflow Frameworks
• Laboratories Example
Dataflow MoCs
Properties
• DPN and KPN are Dynamic Dataflow MoCs
• It is not possible to find a fixed sequence of actor firing that can be
repeatedly executed for any input sequence.
• It is not possible to guarantee that an application will not deadlock
for any input sequence.
• It is not possible to guarantee that an application will always require
a finite amount of memory for its execution (tokens can accumulate
indefinitely in unbounded FIFOs)
Dataflow MoCs
Properties
• Synchronous Dataflow (SDF)
• Stateless actors and data ports with fixed exchange rates
• First-In, First-Out (FIFO) Queues
• Delays
Dataflow MoCs
[Figure: SDF graph with actors A, B, C, D, and E; each data port is annotated with a fixed rate (1 or 2)]
Source: E. Lee and D. Messerschmitt, “Synchronous data flow”, Proceedings of the IEEE, 1987.
Properties
• Synchronous Dataflow (SDF)
• SDF is a static model: It cannot be used to model all applications
• Production/consumption rates are fixed
• It is possible to find a fixed sequence of actor firing that
can be repeatedly executed for any input sequence, with a
finite memory footprint and no deadlock.
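The fixed firing sequence exists because the balance equations of a consistent SDF graph have a smallest positive integer solution, the repetition vector. The sketch below computes it for a hypothetical five-actor graph. The edge list and its rates are assumptions and are not read from the slide's figure; `repetition_vector` is an illustrative name, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Solve q[src] * prod == q[dst] * cons for every edge (src, prod, dst, cons)."""
    rate = {actors[0]: Fraction(1)}  # relative firing rate of each actor
    pending = list(edges)
    progress = True
    while pending and progress:
        progress = False
        for edge in list(pending):
            src, prod, dst, cons = edge
            if src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent rates: no bounded schedule exists")
            elif src in rate:
                rate[dst] = rate[src] * prod / cons
            elif dst in rate:
                rate[src] = rate[dst] * cons / prod
            else:
                continue  # neither end is known yet, retry on a later pass
            pending.remove(edge)
            progress = True
    if pending:
        raise ValueError("graph is not connected")
    scale = lcm(*(r.denominator for r in rate.values()))
    return {actor: int(r * scale) for actor, r in rate.items()}


# Hypothetical graph: (producer, production rate, consumer, consumption rate).
EDGES = [("A", 1, "B", 1), ("A", 2, "C", 1), ("B", 1, "D", 1),
         ("C", 1, "D", 2), ("D", 1, "E", 1)]
q = repetition_vector(["A", "B", "C", "D", "E"], EDGES)
print(q)
```

Firing every actor as many times as its entry in `q` returns all FIFOs to their initial state, so the same sequence can be repeated indefinitely with bounded memory.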
Dataflow MoCs
Properties
• Synchronous Dataflow (SDF)
• Data-driven execution: An actor is fired when its input FIFOs contain
enough data-tokens.
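Data-driven execution can be simulated by counting tokens in each FIFO and firing the first actor whose inputs hold enough of them. The graph, its rates, and the repetition counts below are assumptions chosen for illustration, and `data_driven_schedule` is a hypothetical name; `initial_tokens` stands for the delays of the model.

```python
def data_driven_schedule(repetitions, edges, initial_tokens=None):
    """Fire actors on one core as soon as their input FIFOs hold enough tokens.

    edges are (src, prod, dst, cons) tuples; repetitions maps each actor to the
    number of firings in one graph iteration.
    """
    tokens = list(initial_tokens) if initial_tokens else [0] * len(edges)
    remaining = dict(repetitions)
    schedule = []
    while any(remaining.values()):
        for actor, left in remaining.items():
            inputs = [k for k, edge in enumerate(edges) if edge[2] == actor]
            if left > 0 and all(tokens[k] >= edges[k][3] for k in inputs):
                break  # this actor is fireable
        else:
            raise RuntimeError("deadlock: no actor can fire")
        for k, (src, prod, dst, cons) in enumerate(edges):
            if dst == actor:
                tokens[k] -= cons  # consume input tokens
            if src == actor:
                tokens[k] += prod  # produce output tokens
        remaining[actor] -= 1
        schedule.append(actor)
    return schedule


# Hypothetical graph and repetition counts (not read from the slide's figure).
EDGES = [("A", 1, "B", 1), ("A", 2, "C", 1), ("B", 1, "D", 1),
         ("C", 1, "D", 2), ("D", 1, "E", 1)]
REPETITIONS = {"A": 1, "B": 1, "C": 2, "D": 1, "E": 1}
schedule = data_driven_schedule(REPETITIONS, EDGES)
print(" ".join(schedule))
```

A cycle without initial tokens deadlocks in this simulation, while one delay on the feedback edge is enough to let it start.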
Dataflow MoCs
[Figure: SDF graph with actors A, B, C, D, and E (port rates 1 or 2) and its single-core schedule on Core1: A B C C D E]
Source: E. Lee and D. Messerschmitt, “Synchronous data flow”, Proceedings of the IEEE, 1987.
Properties
• Synchronous Dataflow (SDF)
• Parallelisms: Task / Data / Pipeline / Internal
Dataflow MoCs
[Figure: the SDF graph with actors A, B, C, D, and E scheduled on three cores (Core1, Core2, Core3), with firings of the next iterations (+1) overlapping the current one; the schedule illustrates task parallelism, data parallelism, pipeline parallelism, and internal parallelism]
Source: E. Lee and D. Messerschmitt, “Synchronous data flow”, Proceedings of the IEEE, 1987.
Properties
• There exist many other dataflow MoCs.
• What is the difference between them?
  • Decidability
    • Application schedulability can be guaranteed at compile-time
  • Determinism
    • Same inputs => Same outputs (regardless of time, randomness, or implementation)
  • Compositionality
    • Properties of a graph (topology, firing rules of actors, repetition vector, …) are independent from the internal implementation of the actors that compose it.
Dataflow MoCs
Properties
• Reconfigurability
  • Firing rules of actors can change dynamically during the execution of the application, depending on inputs. DPN and KPN are reconfigurable, SDF is not.
• Predictability (related to reconfigurability)
  • Amount of time between the reconfiguration of firing rules of an actor and its actual firing. (Relative)
• Conciseness (or succinctness)
  • Ability of a MoC to model an application with a small number of actors. (Relative)
Dataflow MoCs
Properties
• Analyzability
  • Availability of analysis techniques to characterize an application graph (e.g. worst-case latency, memory requirements). (Relative)
• Expressivity
  • Complexity of application behavior that can be described.
  • Ex.: DPN and KPN are Turing-complete, SDF is not.
Dataflow MoCs
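Properties
• About compositionality: a non-compositional hierarchical model
Dataflow MoCs
[Figure: top-level graph A, h, B (rates 1, 3, 3, 1), where the hierarchical actor h contains actors C and D and a rate N. Flattened graph with N=1: A, B, C1 to C3, D1 to D3. Flattened graph with N=2: A1, A2, B, h1 to h3, C1 to C6, D1 to D3. Changing N inside h changes the number of firings of the top-level actor A.]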
Properties
• About compositionality: the IBSDF compositional model
Dataflow MoCs
[Figure: the same top-level graph A, h, B (rates 1, 3, 3, 1) with the hierarchical actor h containing C and D and a rate N. Flattened graph with N=1: A, B, C1 to C3, D1 to D3. Flattened graph with N=2: A, B, h1 to h3, C1 to C6, D1 to D3. The top-level actors A and B appear once in both cases.]
Properties
• About predictability and reconfigurability: the more dynamism, the less predictability
Dataflow MoCs
[Figure: a three-actor graph (A, B, C) with repetition counts x1, x2, x3, modeled in SDF, in PSDF (init, subinit, and body graphs, with a Set_p actor setting the parameter p:=3), and in DPN. Three moments are marked at which the repetition counts become known: Compile Time, After Subinit, After execution.]
Properties
• There exist many other dataflow MoCs.
Dataflow MoCs
Feature SDF ADF IBSDF DSSF PSDF PiSDF SADF SPDF DPN KPN
Expressivity Low Med. Turing
Hierarchical X X X X
Compositional X X X
Reconfigurable X X X X X X
Statically schedulable X X X X
Decidable X X X X (X) (X) X (X)
Variable rates X X X X X X X
Non-determinism X X X
SDF: Synchronous Dataflow
ADF: Affine Dataflow
IBSDF: Interface-Based Dataflow
DSSF: Deterministic SDF with Shared Fifos
PSDF: Parameterized SDF
PiSDF: Parameterized and Interfaced SDF
SADF: Scenario-Aware Dataflow
SPDF: Schedulable Parametric Dataflow
DPN: Dataflow Process Network
KPN: Kahn Process Network
Properties
• It is important to choose the appropriate MoC before
developing each part of an application.
• LTE example:
Dataflow MoCs
[Figure: LTE preamble detection chain, labelled Static: DFT, Filter, Select preamble subcarriers, Correlate with root sequence, IDFT, Power, Noise floor estimation, Peak Search, producing Detected Users and Timing Advance. Stages are repeated per root sequence, per reception antenna, and per preamble.]
Properties
• It is important to choose the appropriate MoC before
developing each part of an application.
• LTE example:
[Figure: LTE uplink decoding chain: FFT, Prepare, Demap Control/Data, Decode Control (per symbol, 14 symbols, and per reception antenna); then, on the data of each user: Estimate Channel, Equalize Multiple Antennas, Symbol and Bit Processing (per user), Code Block Processing (per user and code block), Transport Block Processing (per user and transport block). Two parts are labelled Static and three are labelled Param.]
Dataflow MoCs
• Dataflow Models of Computations
• Properties of Dataflow MoCs
• Dataflow Frameworks
• Laboratories Example
Dataflow MoCs
Dataflow MoCs
Dataflow Framework
• CAL and DDF
  • CAL = CAL Actor Language, J. Eker and J. Janneck
  • Implements the DPN MoC
  • RVC-CAL: standardized by MPEG
actor decimate () I ==> O :
  action I:[ a , b ] ==> O:[ a ] end
end
[Figure: actor decimate with input port I (consumption) and output port O (production)]
Dataflow MoCs
Dataflow Framework
• CAL and DDF
• Via classification, a CAL actor can be transformed into SDF or CSDF in order to gain predictability
• The Orcc compiler, developed at IETR, is a compiler for CAL
• From CAL, both HW and SW solutions can be generated
• Supports real-life applications such as HEVC!
• Examples
actor decimate () I ==> O:
  a: action [ x ] ==> [ x ] end
  b: action [ x ] ==> end
  schedule fsm s1:
    s1 ( a ) --> s2;
    s2 ( b ) --> s1;
  end
end
actor posValue() I ==> O:
  pos: action I: [ x ] ==> O: [ x ] guard x >= 0
  end
  neg: action I: [ x ] ==> O:
  end
  priority
    pos > neg;
  end
end
actor decimate () I ==> O:
  int s := 0;
  action [ x ] ==> [ x ] guard s = 0
  do s := 1; end
  action [ x ] ==> guard s = 1
  do s := 0; end
end
Dataflow MoCs
Dataflow Framework
• Ptolemy (by UC Berkeley)
• First tool
• Support for composition of DPN, PSDF and SDF MoCs
• Lightweight Dataflow Environment (LIDE) (by U. of Maryland)
• First tool
• Support for composition of DPN, PSDF and SDF MoCs
• MAPS Compiler (by RWTH Aachen U.)
• KPN and C language
• Support C6678
• SynDEx (by INRIA)
• Custom Dataflow MoC (SDF-like) and C code
• Similar to Preesm (but closed source)
Dataflow MoCs
Dataflow Framework
• OpenDF (by EPFL)
• CAL Code compiler
• Targeting reconfigurable hardware and multicore systems.
• SDF For Free (SDF3) (by Technische Universiteit Eindhoven)
• SDF, CSDF and SADF
• Focusing on analysis of SDF application and generation of random graphs
• Canals (Åbo Akademi)
• A language and its compiler where schedulers are specified explicitly at each level of hierarchy
• AccessCore Studio (by Kalray)
• Programming for Many-Core MPSoC with 256 cores.
Programming Coarse-Grain Parallelism
And so many other solutions
Programming MPSoCs
[Figure: tools placed along two axes, from "Deduce parallelism from Models of Computation" to "Extract parallelism from code" and from "High Performance Computing" to "Embedded Systems": Ptolemy II, PREESM, OpenMP, Matlab + Simulink, OpenCL, CAL, SystemC. Tools are also marked as simulation oriented or execution oriented.]
• Dataflow Models of Computations
• Properties of Dataflow MoCs
• Dataflow Frameworks
• Laboratories Example
Dataflow MoCs
Dataflow MoCs
Laboratories Example
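• PiMM Dataflow MoC
  • Use the IBSDF subset + static parameters
[Figure: PiMM semantics applied to SDF, in nested layers: SDF (Actors, Data Ports, FIFO Queues), IBSDF (adds Hierarchy and Interfaces), πSDF (adds Parameters, Param. Ports, Dependencies, Dynamic Config.). Example PUSCH graph with a Config NbUE actor, parameters NbUE and MaxCB, and actors mac, antenna, Converge, Channel Decoding, and sink, with rates such as NbUE, 1, NbUE*MaxCB, and MaxCB.]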
Source: J. Piat, S.S. Bhattacharyya, and M. Raulet. Interface-based hierarchy for synchronous dataflow graphs. SiPS 2009
Dataflow MoCs
Laboratories Example
• Sobel Application
• Sequential version
[Figure: a single Process actor handles the whole image]
Dataflow MoCs
Laboratories Example
• Sobel Application
• Parallel version
[Figure: a Split actor, three parallel Process actors, and a Merge actor]
Convolution of a 3x3 Matrix with pixels of the input image
Each slice has a 1-line overlap with the next and previous slices
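The Split / Process / Merge structure can be sketched in plain Python. This is not the lab code: the function names, the |Gx| + |Gy| gradient approximation, the choice to compute interior pixels only, and the use of a thread pool are all assumptions made for the sketch. Each slice carries one extra line above and below the band it is responsible for, so the 3x3 window never needs data from another slice.

```python
from concurrent.futures import ThreadPoolExecutor


def sobel(image):
    """Approximate gradient magnitude |Gx| + |Gy| for every interior pixel."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            row.append(abs(gx) + abs(gy))
        result.append(row)
    return result


def split(image, n_slices):
    """Cut the image into bands, each with one extra line above and below."""
    out_rows = len(image) - 2
    bounds = [out_rows * k // n_slices for k in range(n_slices + 1)]
    return [image[bounds[k]:bounds[k + 1] + 2] for k in range(n_slices)]


def merge(parts):
    """Concatenate the processed bands in order."""
    return [row for part in parts for row in part]


def parallel_sobel(image, n_slices):
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        return merge(pool.map(sobel, split(image, n_slices)))


# Small synthetic image: 10 lines of 8 pixels.
image = [[(x * x + 3 * y) % 17 for x in range(8)] for y in range(10)]
print(parallel_sobel(image, 3) == sobel(image))
```

In CPython the thread pool only illustrates the structure and gives no real speedup on this pure-Python loop; the point is that the merged result is identical to the sequential one.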