Advanced System Technology
Think Join
An extension to Think for programming parallel streaming applications
Fréderic Mayot, Matthieu Leclercq, Erdem Özcan
2
Advanced System Technology Confidential
Streaming applications on MPSoC
Functional considerations
– Enabling more parallelism
– Mastering the distribution
– Supporting different execution models
Software engineering considerations
– Easier programming models
– Separation of concerns
– Improved reuse and controlled evolution
– Management of software assets
3
Streaming strategies
[Figure: decoding pipeline with the components Dispatcher, Decoder I, Decoder P and Display. Labeled flows: encoded video stream, Image I, Image P, decoded I image, decoded P image, synchronization, decoded video stream.]
Programming flows: approaches
– The "pull" way
  Strong coupling
  Too many synchronizations for good parallelism
– The "push" way: filters pipelined through buffers
  Weak coupling
  Difficult to deal with many asynchronous events
Proposition
– Mastering the asynchronous push model
– Synchronization language based on the join calculus
Pull:
ImageI getImg() { return DecI.getImgDecI(); }
ImageI getImgDecI() { return Dispatcher.getImgI(); }
Push (through a Buffer):
void dispatch() { while (Buffer.NotFull) Buffer.push(imgI); }
void decodeP() { while (Buffer.NotEmpty) { imgI = Buffer.pull(); Buffer2.push(Dec(imgI)); } }
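The contrast between the two styles can be sketched in Python, chosen here only for brevity. The names dec, pull_pipeline, dispatch, decode_stage, run_push and the STOP marker are illustrative assumptions, not part of Think, and the blocking queue stands in for the NotFull / NotEmpty checks of the slide.

```python
import queue
import threading

def dec(img):
    """Stand-in for a real decoder (assumption): tags the image as decoded."""
    return f"decoded({img})"

# Pull: the consumer drives the chain; each call waits on the stage before it.
def pull_pipeline(source):
    get_img_i = lambda: next(source)          # plays Dispatcher.getImgI()
    get_img_dec_i = lambda: dec(get_img_i())  # plays DecI.getImgDecI()
    return get_img_dec_i                      # the display would call this

# Push: each stage runs on its own and only shares a bounded buffer.
STOP = object()  # end-of-stream marker (assumption, not in the slides)

def dispatch(images, out_buf):
    for img in images:
        out_buf.put(img)       # blocks while the buffer is full
    out_buf.put(STOP)

def decode_stage(in_buf, out_buf):
    while True:
        img = in_buf.get()     # blocks while the buffer is empty
        if img is STOP:
            out_buf.put(STOP)
            return
        out_buf.put(dec(img))

def run_push(images):
    buf1, buf2 = queue.Queue(maxsize=2), queue.Queue(maxsize=2)
    threads = [
        threading.Thread(target=dispatch, args=(images, buf1)),
        threading.Thread(target=decode_stage, args=(buf1, buf2)),
    ]
    for t in threads:
        t.start()
    out = []
    while (item := buf2.get()) is not STOP:
        out.append(item)
    for t in threads:
        t.join()
    return out
```

In the pull version the caller cannot proceed until the whole chain has answered, which is the strong coupling noted above. In the push version the stages overlap, but every stage now has to cope with buffers filling and draining at unpredictable times.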
4
Computation model
– Synchronization via join patterns
– Set of messages received → reaction invocation
ThinkJoin language
itf1.pushImgI (img1) & itf2.pushImgP (img2) => itf3.decode (img1, img2);
itf1.pushImgI (img1) & itf4.synch () => itf5.display (img1);
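[Annotations on the rules: each line is a rule; the part before => is the pattern, made of messages joined by &; the part after => is the reaction.]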
[Figure: Dispatcher, Decoder I, Decoder P, Display.]
void pushImgI(ImageI img) & void pushImgP(ImageP img) { /* decode */ }
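A minimal Python sketch of how join patterns behave, assuming one FIFO queue per message channel. The classes JoinRule and Controller and the string channel names are hypothetical and only mimic the two rules shown on this slide; they are not the ThinkJoin implementation.

```python
from collections import deque

class JoinRule:
    """Pairs a pattern (tuple of channel names) with a reaction callable."""

    def __init__(self, pattern, reaction):
        self.pattern = pattern
        self.reaction = reaction   # takes the joined message arguments

class Controller:
    def __init__(self, rules):
        self.rules = rules
        channels = {ch for r in rules for ch in r.pattern}
        self.pending = {ch: deque() for ch in channels}  # FIFO per channel

    def send(self, channel, *args):
        """Store a message, then fire every rule whose pattern is complete."""
        self.pending[channel].append(args)
        fired = True
        while fired:
            fired = False
            for rule in self.rules:
                if all(self.pending[ch] for ch in rule.pattern):
                    # Consume the oldest message on each channel of the pattern.
                    payload = [a for ch in rule.pattern
                               for a in self.pending[ch].popleft()]
                    rule.reaction(*payload)
                    fired = True

log = []
ctl = Controller([
    JoinRule(("itf1.pushImgI", "itf2.pushImgP"),
             lambda img1, img2: log.append(("decode", img1, img2))),
    JoinRule(("itf1.pushImgI", "itf4.synch"),
             lambda img1: log.append(("display", img1))),
])
ctl.send("itf1.pushImgI", "I0")   # nothing fires: no partner message yet
ctl.send("itf2.pushImgP", "P0")   # completes the first pattern
ctl.send("itf4.synch")            # waits: no pending I image
ctl.send("itf1.pushImgI", "I1")   # completes the second pattern
```

When two rules share a channel, as both rules do with pushImgI, this sketch simply gives priority to the rule listed first; the slides do not say how ThinkJoin resolves that choice.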
5
Integration into Fractal ADL
Implementation
Controller / Functional
void pushImgI (img);
void pushImgP (img);
void decode (img1, img2);
Automatically generated by the compiler from the rule:
itf1.pushImgI (img1) & itf2.pushImgP (img2) => itf3.decode (img1, img2);
ADL, functional component (functional.c):
<definition name="functional">
  <interface name="itf3" role="server" />
  <content class="reaction" language="thinkMC" />
</definition>
ADL, controller component (controller.rules):
<definition name="controller">
  <interface name="itf1" role="server" />
  <interface name="itf2" role="server" />
  <interface name="itf3" role="client" />
  <content class="rules" language="join" />
</definition>
ADL, composite component:
<definition name="cmp">
  <interface name="itf1" role="server" />
  <interface name="itf2" role="server" />
  <component name="functional" … > </component>
  <component name="controller" … > </component>
  <binding client="this.itf1" server="ctl.itf1" />
  <binding client="this.itf2" server="ctl.itf2" />
  <binding client="ctrl.react" server="fn.react" />
</definition>
6
How does it work? FSM execution example
Think-v3 ADL compiler extension
Computation model inspired by the join calculus (cf. Fournet et al.)
– Slightly modified to have FIFO order within bindings
Implementation
– Formalization by finite state machines (cf. Maranget et al.)
– Encoding and pattern matching with bit masks to have linear complexity
– Using circular buffers and synchronization primitives
i1.A() & i3.E() => ir.r1();
i1.B() & i2.C() => ir.r2();
i2.D() & i4.F() => ir.r3();
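The bit-mask encoding can be illustrated with the three rules above. This is a hedged Python sketch, not the compiler's generated code: it assumes one bit per message kind, a counter per kind instead of the real circular buffers, and it ignores message arguments and per-binding FIFO order. BIT, RULES and BitmaskMatcher are hypothetical names.

```python
MESSAGES = ["A", "B", "C", "D", "E", "F"]
BIT = {name: 1 << i for i, name in enumerate(MESSAGES)}

# Each rule: (mask of required messages, reaction name).
RULES = [
    (BIT["A"] | BIT["E"], "r1"),   # i1.A() & i3.E() => ir.r1();
    (BIT["B"] | BIT["C"], "r2"),   # i1.B() & i2.C() => ir.r2();
    (BIT["D"] | BIT["F"], "r3"),   # i2.D() & i4.F() => ir.r3();
]

class BitmaskMatcher:
    def __init__(self):
        self.count = {name: 0 for name in MESSAGES}  # pending messages per kind
        self.state = 0                               # bit set <=> count > 0

    def receive(self, name):
        """Record one message and return the reactions it triggers."""
        self.count[name] += 1
        self.state |= BIT[name]
        fired = []
        for mask, reaction in RULES:                 # one AND + compare per rule
            if (self.state & mask) == mask:
                self._consume(mask)
                fired.append(reaction)
        return fired

    def _consume(self, mask):
        """Remove one pending message for every kind present in the mask."""
        for name in MESSAGES:
            if mask & BIT[name]:
                self.count[name] -= 1
                if self.count[name] == 0:
                    self.state &= ~BIT[name]

m = BitmaskMatcher()
trace = [m.receive(x) for x in ["B", "B", "C", "D", "C", "F", "E", "A"]]
```

The current state is a single integer, so testing a rule costs one AND and one comparison, and a message arrival costs a pass over the rule list.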
[Figure: FSM with input interfaces i1, i2, i3 and i4.]
public interface I1 { void A(); void B(); }
public interface I2 { void C(); void D(); }
public interface I3 { void E(); }
public interface I4 { void F(); }
[Figure: pending messages shown per interface: B, B on i1; D, C, D, C on i2; E on i3; F, F on i4.]
7
Execution models
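Reactions
i1.pushP(img1) & i2.pushI(img2) => i3.decode(img1, img2);
[Figure: the messages pushP(img1) and pushI(img2) reach the FSM through i1 and i2. On a match, the FSM issues register(FSM, {decode, 10}) to the Scheduler; the Scheduler issues execute({decode, 10}), and decode(img1, img2) is invoked through i3. Table entries shown in the FSM: 7 → {decode, 10}, 10 → {img1, img2}.]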
Various execution types
– Synchronous
– Asynchronous
– Thread pool
Only one parameter to change in order to switch
The choice is made for each component
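The idea of selecting the execution type with a single parameter can be sketched as follows. This is an illustrative Python model only: the Scheduler class, its mode values and the simplified register / execute signatures are assumptions and do not reproduce the Think scheduler interface.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

class Scheduler:
    """Runs matched reactions according to a single `mode` parameter."""

    def __init__(self, mode="sync", workers=2):
        self.mode = mode
        self.deferred = deque()   # reactions waiting for execute() ("async")
        self.futures = []         # reactions submitted to the pool ("pool")
        self.pool = ThreadPoolExecutor(workers) if mode == "pool" else None

    def register(self, reaction, args):
        """Called when a pattern has matched."""
        if self.mode == "sync":
            reaction(*args)                          # run in the sender's context
        elif self.mode == "async":
            self.deferred.append((reaction, args))   # run later, in execute()
        elif self.mode == "pool":
            self.futures.append(self.pool.submit(reaction, *args))
        else:
            raise ValueError(f"unknown mode: {self.mode}")

    def execute(self):
        """Run deferred reactions and wait for pooled ones to finish."""
        while self.deferred:
            reaction, args = self.deferred.popleft()
            reaction(*args)
        for future in self.futures:
            future.result()
        self.futures.clear()

results = {}
for mode in ("sync", "async", "pool"):
    out = []
    scheduler = Scheduler(mode)
    scheduler.register(lambda a, b: out.append(("decode", a, b)), ("img1", "img2"))
    scheduler.execute()
    results[mode] = out
```

The reaction code is identical in the three cases; only the constructor argument changes, which mirrors the per-component switch described above.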
8
Evaluation on an MPEG-2 decoder
Overhead (FSM, scheduling, serialization)
Objectives reached
– Architecture
– Collaboration
– Distribution
Next: more formal language, better efficiency
[Figure: MPEG-2 decoder architecture with the components Header Dec, Picture Dec, Framestore, Motion Comp., MC Y, MC U, MC V, IDCT, Adder and Exporter.]
idct_mb_type.intra_mb() & idct_mb.push_mb(mb_data) => r.add(mb_data);
idct_mb_type.non_intra_mb() & idct_mb.push_mb(mb_data) & mc_y.prediction_finished() & mc_u.prediction_finished() & mc_v.prediction_finished() => r.add(mb_data);
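The effect of these two rules on the adder can be sketched in Python. The function make_adder_controller and the string channel names are hypothetical; the sketch keeps one FIFO per message and assumes that only one macroblock is in flight at a time.

```python
from collections import deque

def make_adder_controller(add):
    """Return a send(channel, *args) function implementing the two rules."""
    intra = ("idct_mb_type.intra_mb", "idct_mb.push_mb")
    non_intra = ("idct_mb_type.non_intra_mb", "idct_mb.push_mb",
                 "mc_y.prediction_finished", "mc_u.prediction_finished",
                 "mc_v.prediction_finished")
    pending = {ch: deque() for ch in set(intra) | set(non_intra)}

    def try_fire(pattern):
        if not all(pending[ch] for ch in pattern):
            return False
        # Only push_mb carries a payload (mb_data); the others are signals.
        args = [a for ch in pattern for a in pending[ch].popleft()]
        add(*args)
        return True

    def send(channel, *args):
        pending[channel].append(args)
        while try_fire(intra) or try_fire(non_intra):
            pass

    return send

added = []
send = make_adder_controller(added.append)

# Intra macroblock: added as soon as its type and its data are both present.
send("idct_mb_type.intra_mb")
send("idct_mb.push_mb", "mb0")
after_intra = list(added)

# Non-intra macroblock: also waits for the Y, U and V predictions.
send("idct_mb_type.non_intra_mb")
send("idct_mb.push_mb", "mb1")
send("mc_y.prediction_finished")
send("mc_u.prediction_finished")
before_last_prediction = list(added)
send("mc_v.prediction_finished")
```

If a second macroblock arrived before the predictions of a non-intra one had finished, this simplified version could pair its type with the wrong data, which is why the single-macroblock assumption is stated.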
[Figure: performance (Perf.) plotted against Threads.]