
Synthesis of Embedded Software


Sandeep K. Shukla · Jean-Pierre Talpin
Editors

Synthesis of Embedded Software

Frameworks and Methodologies for Correctness by Construction


Editors

Dr. Sandeep K. Shukla
Virginia Tech
Bradley Dept. Electrical & Computer Engineering
Whittemore Hall 302
24061 Blacksburg, Virginia
[email protected]

Dr. Jean-Pierre Talpin
INRIA Rennes-Bretagne Atlantique
Campus de Beaulieu
35042 Rennes CX
[email protected]

ISBN 978-1-4419-6399-4        e-ISBN 978-1-4419-6400-7
DOI 10.1007/978-1-4419-6400-7
Springer New York Dordrecht Heidelberg London

Library of Congress Control Number: 2010930045

© Springer Science+Business Media, LLC 2010
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Introduction

Sandeep Shukla and Jean-Pierre Talpin

Embedded systems are ubiquitous. In applications ranging from avionics, automotive, and industrial process control all the way to handheld PDAs, cell phones, and bio-medical prosthetic devices, we find embedded computing devices running embedded software systems. Growth in embedded systems is exponential, and there are orders of magnitude more embedded processors and embedded applications in deployment today than other forms of computing. Some of the applications running on such embedded computing platforms are also safety-critical and real-time, and require absolute guarantees of correctness, timeliness, and dependability.

The design of such safety-critical applications requires utmost care. These applications must be verified for functional correctness; satisfaction of real-time constraints must be ensured; and they must be properly endowed with reliability and dependability properties. These requirements pose hard challenges to system designers. A large body of research carried out over the last few decades exists in the field of designing safety-critical hardware and software systems. A number of standards have evolved, specifications have been published, architecture analysis and design languages have been proposed, techniques for optimally mapping software components to architectural elements have been studied, modular platform architectures have been standardized, and so on. Certification by various authorities, and strict requirements for satisfying certification goals, have been developed. Today's avionics, automotive, process control, and many other safety-critical systems are usually developed based on this large body of knowledge and technology.

However, there is still no guarantee that a system will not crash, that an automotive throttle-by-wire system will not malfunction, or that a process control system will not fail to work properly. All the model-based development processes, specifications, and strict certification are meant to eliminate the possibility of such eventualities, but cannot eliminate them completely. Formal verification can provide more certainty than test-based certification, but even formal verification is only as good as the abstract models and the properties that one verifies on those models.

S. Shukla
Fermat Laboratory, Virginia Tech, Blacksburg, VA, USA
e-mail: [email protected]

J.-P. Talpin
INRIA, Centre de Recherche Rennes – Bretagne Atlantique, Campus de Beaulieu, Rennes, France
e-mail: [email protected]

Even though a 100% guarantee is not possible, we believe that a rigorous, formal-methods-based approach that transforms abstract specifications via refinement steps will be the methodology of choice for such systems, if it is not already so. While the refinement steps add further complexity in order to satisfy various constraints, the only refinements to be used are those that can be mathematically proven to preserve the correctness already established at higher abstraction levels.

Towards the goal of designing safety-critical systems, and especially embedded software, in this way, a number of disciplines have been developed and are being actively researched. These research areas include programming models, abstractions, semantics-preserving refinements, model elaboration under real-time and other non-functional constraints, automated code synthesis, and formal verification.

In Europe, a number of programming models, and languages based on such programming models, have evolved for this purpose. Synchronous and polychronous programming models constitute important classes among them. Stream-based programming models, Petri nets, process algebras, and various semantically enriched UML models also belong to this family. In the USA, actor-based modeling paradigms, tuple-space-based modeling, and guarded command languages are the main ones, besides automata-based composition of processes.

Defining a programming model, its semantics, and translation schemes, based on operational semantics, into execution languages such as C is only one part of the task. The other part is to create an entire methodology around the language and programming model, together with tools for creating executable software from it.

In 2009, we organized a one-day tutorial at the ACM/IEEE Conference on Design, Automation and Test in Europe (DATE 2009), where a number of speakers presented various languages, models, and tools for correct-by-construction embedded software design. Following up on this successful tutorial, the present book introduces approaches to the high-level specification of embedded software that are supported by correct-by-construction, automated synthesis and code generation techniques. It focuses on approaches based on synchronous models of computation and the many programmatic paradigms they can be associated with to provide integrated, system-level design methodologies.

Chapter 1, on “Compilation of Polychronous Data-Flow Equations”, gives a thorough presentation of the program analysis and code generation techniques present in Polychrony, a compiler for the data-flow synchronous language Signal.

Introduced in the late 1980s, Signal and its polychronous model of computation stand among the most developed concepts of synchronous programming. It allows one to model concurrent embedded software architectures using high-level, multi-clocked synchronous data-flow equations.


The chapter defines the formal methodology, consisting of all the program analysis and transformation techniques required to automatically generate the sequential or concurrent code suiting the target architecture (embedded or distributed) or the compilation goals (modularity or performance).

Chapter 2, on “Formal Modeling of Embedded Systems with Explicit Schedules and Routes”, investigates data-flow process networks as a formal framework to model data-flow signal processing algorithms (e.g., multimedia) for networks of processors (e.g., multi-cores).

While not readily a programming model, a process network provides a suitable model of computation to reason about the issues posed by the efficient implementation of data-intensive processing algorithms on novel multi-processor architectures. Process networks provide a bridge between the appealing scheduling and routing issues posed by multi-core architectures and the still very conventional programming language concepts that are commonly used to devise algorithms.

To this end, process networks act as a foundational principle from which to define domain-specific yet high-level programming and design concepts that match the main concern of data-processing algorithms, which is to minimize communication latencies (with external memory or between computation units) as early as possible during system design.

Chapter 3, on “Synoptic: A Domain-Specific Modeling Language for Space On-board Application Software”, is a perfect illustration of a related effort to design and structure high-level programming concepts from the specific requirements of an application domain. Synoptic is a modeling language developed in the framework of the French ANR project SPaCIFY. Its aim is to provide a programming environment for real-time embedded software on board space applications, and especially for control and command software.

Synoptic consists of an Eclipse-based modeling environment that covers the many aspects of aerospace software design: control-dominated modules are described using imperative synchronous programs, commands using mode automata, data-processing modules using data-flow diagrams, and also partitioning, timing and mapping of these modules onto satellite architectures using AADL-like diagrams.

As such, it is a domain-specific framework relying on standard modeling notations such as Simulink/Stateflow and AADL to provide the engineer with a unified modeling environment that handles all heterogeneous analysis, design, implementation and verification tasks, as defined in collaboration with the industrial end users of the project.

Chapter 4, on “Compiling SHIM”, provides a complete survey of another domain-specific language, whose aim is to help engineer software interfaced with asynchronous hardware: “the shim under the leg of a table”. In the design of Shim, emphasis is put on so-called scheduling independence as a principle guaranteeing correctness by construction while assuming a minimum of programmatic concepts.


Shim is a concurrent imperative programming language in which races and non-determinism are avoided at compile time by enforcing simple analysis rules that ensure a deterministic input–output behavior regardless of the way program threads are scheduled: scheduling independence. The chapter gives a complete survey of Shim, the programming principles it is built upon, the program analysis techniques it uses, and its code generation strategies for both shared-memory and message-passing architectures.

Chapter 5, on “A Module Language for Typing SIGNAL Programs by Contracts”, returns to the polychronous model of computation to present a means to support assume-guarantee reasoning modularly and compositionally in that framework. Contract-based design has become a popular reasoning concept in which contracts are used to negotiate the correctness of the assumptions made on the definition of a component at the point where it is used, and to provide guarantees to its environment.

The chapter first elaborates the formal foundations by defining a Boolean algebra of contracts in a trace-theoretical framework. Based on that contract algebra, a general-purpose module language is then specified. The algebra and module system are instantiated to the framework of the synchronous data-flow language Signal. The presentation is illustrated with the specification of a protocol for Loosely Time-Triggered Architectures (LTTA).

The aim of Chap. 6, on “MRICDF: A Polychronous Model for Embedded Software Synthesis”, is to define a simple modeling language and framework to reason about synchronous multi-clocked actors, as an alternative to the formal methods in the Polychrony toolbox, for use by a larger community of software engineers.

As the leitmotiv of MRICDF can be summarized as “polychrony for the masses”, systematic emphasis is put on presenting a modeling language that uses programming concepts as simple as possible to implement a complete synchronous design workbench.

The chapter gives a thorough presentation of the design modeling language and of the issues encountered in implementing MRICDF. The use of this formalism, together with its visual editor EmCodeSyn, is illustrated through case studies in the design of safety-critical applications.

Chapter 7, on “The Time Model of Logical Clocks Available in the OMG MARTE Profile”, gives a detailed presentation of CCSL, the Clock Constraint Specification Language defined in the OMG profile MARTE for specifying timing relations in UML. The aim of CCSL is to capture some of the essence of logical time and encapsulate it into a dedicated syntax that can be used to annotate UML diagrams in order to refine them by formalizing this critical design aspect.

Multi-form or polychronous logical time, introduced and made popular through its central role in the theory of synchronous languages, is already present in many formalisms pertaining to embedded system design, sometimes in a hidden fashion. Logical time considers time bases that can be generated from any sort of sequences of events, not necessarily equally spaced in physical time. CCSL provides ways to express loose or strict constraints between distinct logical clocks. Solving such clock constraints amounts to relating clocks to a common reference, which can then be thought of as closer to the physical clock.

Finally, Chap. 8, entitled “From Synchronous Specifications to Statically Scheduled Hard Real-Time Implementations”, addresses the issue of transforming high-level data-flow specifications expressed in a synchronous model of computation down to sequential or distributed implementations expressed in a real-time model of computation.

Hard real-time embedded systems are often designed as automatic control systems that can include both continuous and discrete parts. The functional specification of such systems is usually done using data-flow formalisms such as Simulink or Scade. These formalisms are either quasi-synchronous or synchronous, but go beyond the classical data-flow model by introducing means of conditional execution, hence allowing the description of hierarchical execution modes.

Specific real-time implementation approaches have been proposed for such formalisms, which exploit the hierarchical conditions to improve the generated code. This last chapter presents one such approach. It takes data-flow synchronous specifications as input and uses static scheduling heuristics to automatically produce efficient distributed real-time code. The chapter emphasizes improving the analysis of the hierarchical conditions in order to obtain more efficient code.

This book is intended to provide readers with a sample of advanced topics and the state of the art in various fields related to safety-critical software synthesis. Hence, it is obviously not an exhaustive handbook. Our hope is that readers will find the areas of research covered in this book important and interesting, and will seek out other interesting research on related topics from various other sources.

We wish to thank all the contributors to this book for participating in preparing this volume, and for demonstrating their patience with the entire publication process. We also thank Charles Glaser from Springer for his support with this project. Finally, we thank the National Science Foundation, the US Air Force Office of Scientific Research, INRIA, and the ARTIST network of excellence for their support and funding, which enabled us to work on this project.


Contents

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Sandeep Shukla and Jean-Pierre Talpin

1 Compilation of Polychronous Data Flow Equations . . . . . . . . . . . . . . . 1
Loïc Besnard, Thierry Gautier, Paul Le Guernic, and Jean-Pierre Talpin

2 Formal Modeling of Embedded Systems with Explicit Schedules and Routes . . . . . . . . . . . . . . . 41
Julien Boucaron, Anthony Coadou, and Robert de Simone

3 Synoptic: A Domain-Specific Modeling Language for Space On-board Application Software . . . . . . . . . . . . . . . 79
A. Cortier, L. Besnard, J.P. Bodeveix, J. Buisson, F. Dagnat, M. Filali, G. Garcia, J. Ouy, M. Pantel, A. Rugina, M. Strecker, and J.P. Talpin

4 Compiling SHIM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Stephen A. Edwards and Nalini Vasudevan

5 A Module Language for Typing SIGNAL Programs by Contracts . . . . . . . . . . . . . . . 147
Yann Glouche, Thierry Gautier, Paul Le Guernic, and Jean-Pierre Talpin

6 MRICDF: A Polychronous Model for Embedded Software Synthesis . . . . . . . . . . . . . . . 173
Bijoy A. Jose and Sandeep K. Shukla

7 The Time Model of Logical Clocks Available in the OMG MARTE Profile . . . . . . . . . . . . . . . 201
Charles André, Julien DeAntoni, Frédéric Mallet, and Robert de Simone


8 From Synchronous Specifications to Statically Scheduled Hard Real-Time Implementations . . . . . . . . . . . . . . . 229
Dumitru Potop-Butucaru, Robert de Simone, and Yves Sorel

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263


Contributors

Charles André  Laboratoire I3S, UMR 6070 CNRS, Université de Nice-Sophia Antipolis, 06903 Sophia Antipolis Cedex, France, and INRIA, Centre de Recherche de Sophia-Antipolis Méditerranée, 06902 Sophia Antipolis, France, [email protected]

Loïc Besnard  CNRS/IRISA, Campus de Beaulieu, Rennes, France, [email protected]

Jean-Paul Bodeveix  IRIT-ACADIE, Université de Toulouse, site Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France, [email protected]

Julien Boucaron  INRIA, Centre de Recherche de Sophia-Antipolis Méditerranée, 06902 Sophia Antipolis, France, [email protected]

Jérémy Buisson  VALORIA, Écoles de St-Cyr Coëtquidan, Université Européenne de Bretagne, 56381 Guer Cedex, France, [email protected]

Anthony Coadou  INRIA, Centre de Recherche de Sophia-Antipolis Méditerranée, 06902 Sophia Antipolis, France, [email protected]

Alexandre Cortier  IRIT-ACADIE, Université de Toulouse, site Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France, [email protected]

Fabien Dagnat  Institut Télécom – Télécom Bretagne, Université Européenne de Bretagne, Technopôle Brest Iroise, CS83818, 29238 Brest Cedex 3, France, [email protected]

Julien DeAntoni  INRIA, Centre de Recherche de Sophia-Antipolis Méditerranée, 06902 Sophia Antipolis, France, and Laboratoire I3S, UMR 6070 CNRS, Université de Nice-Sophia Antipolis, 06903 Sophia Antipolis Cedex, France, [email protected]

Robert de Simone  INRIA, Centre de Recherche de Sophia-Antipolis, Sophia Antipolis Cedex, France, [email protected]

Stephen A. Edwards  Columbia University, New York, NY, USA, [email protected]


Mamoun Filali  IRIT-ACADIE, Université de Toulouse, site Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France, [email protected]

Gérald Garcia  Thales Alenia Space, 100 Boulevard Midi, 06150 Cannes, France, [email protected]

Thierry Gautier  INRIA, Centre de Recherche de Rennes, Campus de Beaulieu, Rennes, France, [email protected]

Yann Glouche  INRIA, Centre de Recherche de Rennes, Campus de Beaulieu, Rennes, France, [email protected]

Bijoy A. Jose  FERMAT Lab, Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA, [email protected]

Paul Le Guernic  INRIA, Centre de Recherche de Rennes, Campus de Beaulieu, Rennes, France, [email protected]

Frédéric Mallet  INRIA, Centre de Recherche de Sophia-Antipolis Méditerranée, 06902 Sophia Antipolis, France, and Laboratoire I3S, UMR 6070 CNRS, Université de Nice-Sophia Antipolis, 06903 Sophia Antipolis Cedex, France, [email protected]

Julien Ouy  IRISA-ESPRESSO, Campus de Beaulieu, 35042 Rennes Cedex, France, [email protected]

Marc Pantel  IRIT-ACADIE, Université de Toulouse, site Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France, [email protected]

Dumitru Potop-Butucaru  INRIA, Centre de Recherche de Paris-Rocquencourt, Le Chesnay Cedex, France, [email protected]

Ana Rugina  EADS Astrium, 31 rue des Cosmonautes, Z.I. du Palays, 31402 Toulouse Cedex 4, France, [email protected]

Sandeep K. Shukla  FERMAT Lab, Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA, [email protected]

Yves Sorel  INRIA, Centre de Recherche de Paris-Rocquencourt, Le Chesnay Cedex, France, [email protected]

Martin Strecker  IRIT-ACADIE, Université de Toulouse, site Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France, [email protected]

Jean-Pierre Talpin  INRIA, Centre de Recherche de Rennes, Campus de Beaulieu, Rennes, France, [email protected]

Nalini Vasudevan  Columbia University, New York, USA, [email protected]


Acronyms

IFFT   Inverse Discrete Fourier Transform
DMA    Direct Memory Access
FSM    Finite State Machine
RTL    Register Transfer Level
HLS    High Level Synthesis
EDA    Electronic Design Automation
HDL    Hardware Description Language
CAOS   Concurrent Action Oriented Specifications
CDFG   Control Data-Flow Graph
HTG    Hierarchical Task Graph
GCD    Greatest Common Divisor
BSC    Bluespec Compiler
BSV    Bluespec System Verilog
LPM    Longest Prefix Match
VM     Vending Machine
LTL    Linear-time Temporal Logic
TLA    Temporal Logic of Actions
MCS    Maximal Concurrent Schedule
ACS    Alternative Concurrent Schedule
MNS    Maximum Non-conflicting Subset
MIS    Maximum Independent Set
MLS    Minimum Length Schedule
FFD    First Fit Decreasing
PTAS   Polynomial Time Approximation Scheme
AES    Advanced Encryption Standard
UC     Upsize Converter


Index

A
Algebraic transformations, 18–19
Alias, 132, 187
Automaton, 20, 89–93, 98, 100, 103–107, 110, 129–132, 142, 143, 238

B
Balance equations, 56–58, 62, 63, 67, 73
Basic blocks, 127, 128
Bit vector, 137, 138
Blocking, 22, 25, 26, 28, 34, 48, 49, 80, 81, 87–101, 103–105, 115, 123, 124, 126–130, 132, 137–139, 187, 230, 231, 234, 236, 238, 244, 245
Buffer sharing, 143, 144

C
C, vi, 3, 14, 21, 25–28, 30, 31, 33–34, 48, 96, 99, 121–124, 126, 127, 130–132, 136, 138–141, 193, 239
C++, 3, 21, 25, 31, 123, 147
Call tree, 137, 139
Causality, 47, 109, 159, 177, 203, 210, 231, 233–235, 238
Cell processor, 140, 145
Channels, 36, 42–45, 48, 50, 122–130, 132, 133, 136–144, 176, 178, 179, 197, 253
Clock analysis, 230, 231, 236, 244
Clock calculus, 8, 18, 25, 28, 212, 214, 223, 259, 260
Clocks, viii, ix, 2–9, 12, 14–21, 23, 25–35, 46–49, 56, 95, 100, 101, 103, 108, 110, 114, 122, 123, 149–152, 159, 163–166, 173–176, 201–226, 230–232, 234, 236–238, 241, 244–251, 259, 260
Code generation, vi, viii, 3, 8, 10, 13–39, 81, 85, 86, 93, 94, 117, 123–132, 137, 138, 143, 160, 181, 183, 188, 193, 194, 203, 229, 230, 236
Communication, vii, 2, 5, 8, 17, 18, 22–24, 26, 27, 30, 35–37, 39, 41–43, 45–50, 54, 58, 59, 63, 69, 73, 74, 80, 83, 84, 91, 110–114, 117, 122, 123, 126, 129, 132, 133, 135–139, 141, 143, 145, 166, 179, 182, 203, 205, 215, 217–219, 231, 232, 236, 238, 239, 241, 244, 246, 248–250, 253, 255–260
Communication patterns, 129
Compilation, vi, vii, 1–39, 41, 43, 51, 58, 59, 64, 73, 74, 94, 100, 114, 115, 148, 173, 202, 226, 230, 232, 235, 237
Components, v, viii, 1, 8, 25, 26, 30, 35, 36, 38, 42, 43, 50, 51, 56, 60, 80, 83, 88, 89, 94, 96–97, 114–116, 147, 148, 151, 152, 159, 160, 165, 167, 168, 170, 182, 201, 204, 205, 215–217, 233, 234, 239
Compositional approach, 142
Concurrency, 30, 41, 42, 47, 56, 58, 60, 69, 73, 74, 81, 124, 141, 197, 236
Condition variables, 121
Contract
  assume/guarantee, 151–161, 167
  boolean algebra, viii, 148, 153, 155, 156, 158, 160, 169
  design, viii, 147, 148, 158, 168–170
  process algebra, 148, 152–163
  real-time, 147–149
  refinement, 147, 156–159, 161–162, 167, 168
  semantic, 152, 154, 161, 162


  specification, 112, 114, 117, 147–149, 151, 152, 159, 160, 163, 168–170
  synchronous language, 148, 160
Contract-directed code generation, 160
Control-flow graph, 127, 202
Control software, 80, 110, 113, 117
Correct by construction static routing transformations, 60
CSP, 44, 122
Cyclo-Static DataFlow (CSDF), 44, 45, 59–60, 62, 73

D
Data Control Graph (DCG), 3, 14, 20, 21, 34
Dataflow, 41–44, 46–60, 74, 81, 86, 88–93, 96, 98–101, 105, 107, 109, 122, 230, 231, 236–238, 241–246, 248, 249, 251–253, 258–260
Dataflow graphs, 59–60, 242, 251
Deadlock, 4, 9, 14, 15, 18, 19, 25, 43, 50, 51, 57–60, 63, 68, 71, 73, 126, 130, 141–143, 182–184, 197, 225
Determinism, viii, 74, 121–123, 136, 248
DFS, 127
DMA, 145

E
Embedded software, vi–viii, 83, 173–198
Embedded software design, vi
Embedded systems, v, vii, viii, ix, 1–4, 14, 37, 41–75, 111, 121, 147, 148, 160, 175, 176, 201–205, 226, 229, 230, 235, 241
Endochrony, 14, 183, 190, 197
Endochrony (endochronous system), 14, 24, 183, 190–192, 197, 260
Esterel, 21, 47, 122, 135, 173–176, 236, 237
Event-driven, 175
Event function, 17, 138, 139
Exception handler, 137
Exceptions, 132–139, 144, 256
Exceptions and communication, 135–138, 144
Exception scope, 136
Explicit model-checking, 142
Exponential state space, 143
External variable, 86, 88, 94, 99, 111, 113–144

F
FIFO, 4, 24, 49, 50, 66, 134, 135, 145, 166, 217
Fixed point, 129, 173
Flight software, 79–85, 87, 89, 92, 93, 98, 99
Formal methods, vi–viii, 116
Formal verification, vi, 2, 37, 48, 49, 74, 75, 81, 122, 148, 230, 235, 237
Function calls, 23, 31, 32, 81, 126, 132, 133, 140
Function pointers, 124, 126, 132

G
Globally asynchronous and locally synchronous (GALS), 2, 20, 80, 84, 88, 113, 116

H
Hardware/software boundary, 122
Haskell, 145

I
IBM, 140, 145
Inferred parameters, 123
Interconnect optimization, 42–43, 75
Interleavings, 122
I/O library, 144

J
Java, 3, 25, 31, 38, 96, 121, 147

K
Kahn networks, 122, 208, 237
K-periodic routing, 45, 59–74
K-periodic throughput equalization, 51, 59, 63–64

L
Latency insensitive design (LID), 43, 46, 48–49, 56, 58, 74
Locks, 68, 94, 121, 136, 138, 211
Logical clock, viii, ix, 2, 47, 49, 149, 201–226
Lustre, 48, 82, 122, 159, 173, 176, 214, 236, 237, 241–243, 259

M
Macros, 127
Marked Graphs, 43, 44, 46, 49–56, 58, 64, 68, 74
Mask operation, 137


Model-checking, 48, 142, 143, 162, 235
Model-Driven Engineering (MDE), 80, 82, 85, 87, 116, 117, 201
Models of computation, vi, 42, 43, 74, 160, 173
Modulo scheduling, 202
Multi-clock, 37, 174, 236, 260
Multiway rendezvous, 133, 136, 138, 144

N
Nested loops, 127, 202
Network block, 123
Next, 125–128, 132, 133, 136, 142
Non-determinism, viii, 121, 122
NuSMV, 141, 142

O
Offline scheduling, 231
Operating system, 24–27, 80, 121, 124, 139, 232, 239
Optimality condition, 255
Optimization, 2, 37, 38, 43, 45, 48, 50, 55, 56, 58, 122, 176, 230–232, 234, 241, 244, 247–249, 254–260

P
Parallel exceptions, 135–136
Parallelism, 41, 74, 121, 239
Par construct, 132
Pass-by-reference, 132, 140
Pipeline, 55, 143, 201, 220
Point-to-point channels, 42, 124
Poisoned, 136–141, 144
Polychrony, vi, viii, 3, 9, 14, 17, 18, 21, 25–27, 30, 31, 34, 37–39, 86, 117, 176, 178, 190, 236, 259, 260
Polychrony, multi-rate, 173
Porting C programs, 132
POSIX threads, 136, 139
PPU, 140
Prime implicate, 176, 189–192, 197, 198
Processes, 122–127, 129–135
Process Networks, vii, 41–45, 74, 201, 202, 222
Product machine, 129
Program analysis, vi–viii, 3, 143
Program counter, 124, 129, 130
Program transformation, 3, 37, 38
Pthreads, 121, 140

R
Races, viii, 121
Reader pointer, 124
Real-time, v–vii, ix, 3, 8, 43, 75, 80–82, 85, 95, 99, 111, 112, 147–149, 198, 201–206, 226, 229–260
Real-time scheduling, 229–232, 237, 239–241, 252, 258, 259
Reconfiguration, 86, 111, 112, 114–117
Recursion, 123, 124, 132–137
Recv, 132–134, 136, 137, 141, 142, 144
Refinement, vi, 2–4, 12, 19–21, 29, 86, 88, 96–99, 118, 147, 156–159, 161, 162, 167–169
Rendezvous, 122, 123, 126, 133, 135, 136, 138–139, 143–145
Resuming, 102, 103, 105, 124, 126, 132

S
Scheduler, 9, 25, 26, 30, 31, 38, 88, 112, 117, 124, 126, 139, 205, 240
Scheduling, vii, viii, ix, 20, 23–24, 26, 29–31, 33, 38, 39, 41, 43, 45, 48, 49, 57, 64, 74, 82, 127, 129, 132, 141, 143–145, 190, 193, 197, 202, 209, 222, 229–233, 235, 237, 239–241, 248, 252–260
Scheduling-independent, 122, 135, 144, 145
Scheduling policy, 121, 130, 240
Semantics, vi, 1–4, 7, 9–10, 21, 22, 25, 42, 44, 45, 69, 81, 86, 88, 99–111, 117, 122, 123, 135, 136, 144, 152, 161, 162, 176, 177, 182, 203, 204, 206, 208–210, 216, 217, 221–226, 230, 232, 234, 235, 243–244
Send, 125, 126, 130, 132–138, 141, 142, 144
Sequential implementability, 183–185, 190–193, 197
Service oriented middleware, 110–111
Shared memory, viii, 121, 143
Shared variables, 8, 12, 22, 86, 94, 122, 233
SHIM, vii, 121–145
SIGNAL, vi, vii, viii, 2–32, 34–38, 41, 42, 47, 48, 56, 75, 80–82, 86, 88, 94, 95, 98–109, 114, 122, 123, 147–170, 173–194, 197, 205, 214, 230, 233–239, 248, 250–260
Software architecture, vi, 80, 82, 87–96, 116
Software synthesis, viii, ix, 173–198
Spanning tree, 127
SPIN, 122, 141
State abstraction, 129, 143
State names, 130


State signature, 129
Static routing, 60
Static scheduling, ix, 26, 29–30, 38, 41, 127, 143, 222, 231, 248, 253, 257
Subset construction algorithm, 129
Switch statement, 132
Synchronous Data Flow (SDF), vi, viii, 43–46, 56–60, 62, 63, 67, 73, 74, 86, 197, 203, 204, 221–225, 231, 241
Synchronous Elastic Systems, 43, 48
Synchronous formalism, 230, 231, 234, 235, 237, 242
Synchronous islands, 88, 93–94, 99, 112, 113, 116, 117
Synchronous languages, vi, viii, 2, 88, 148, 160, 176, 202, 230, 232, 235–237
Synchronous model, ix, vi, 56, 114, 122, 141, 231
Synchronous programming, vi, 173, 197, 237
SynDEx, 3, 48, 237, 241–249, 251–253, 259, 260
Synergistic processing unit (SPU), 140
Synoptic, vii, 79–118
Synthesized code, 124, 125

T
Tail-recursion, 124
Termination, 125, 126, 130
Testing, 84, 121
Transitive poisoning, 136
Turing complete, 122

V
Verification, vi, vii, 2, 3, 37, 48, 74, 75, 81, 82, 85, 86, 88, 89, 117, 121, 122, 148, 182, 189, 194, 232, 235, 237, 248, 251
Verilog, 48, 123

W
Writer pointer, 124

X
X10, 145


Chapter 1
Compilation of Polychronous Data Flow Equations

Loïc Besnard, Thierry Gautier, Paul Le Guernic, and Jean-Pierre Talpin

1.1 Introduction

High-level embedded system design has gained prominence in the face of rising technological complexity, increasing performance requirements and shortening time-to-market demands for electronic equipment. Today, the installed base of intellectual property (IP) further stresses the requirements for adapting existing components with new services within complex integrated architectures, calling for appropriate mathematical models and methodological approaches to that purpose.

Over the past decade, numerous programming models, languages, tools and frameworks have been proposed to design, simulate and validate heterogeneous systems within abstract and rigorously defined mathematical models. Formal design frameworks provide well-defined mathematical models that yield rigorous methodological support for the trusted design, automatic validation, and systematic test-case generation of systems. However, they are usually not amenable to direct engineering use, nor do they seem to satisfy the present industrial demand.

Despite overwhelming advances in embedded systems design, existing techniques and tools merely provide ad hoc solutions to the challenging issue of the so-called productivity gap [1]. The pressing demand for design tools has sometimes hidden the need to lay mathematical foundations below design languages. Many illustrating examples can be found, e.g., in the variety of very different formal semantics found in state-diagram formalisms [2]. Even though these design languages benefit from decades of programming practice, they still give rise to some diverging interpretations of their semantics.

L. Besnard
CNRS/IRISA, Campus de Beaulieu, Rennes, France
e-mail: [email protected]

T. Gautier (✉), P. Le Guernic, and J.-P. Talpin
INRIA, Centre de Recherche de Rennes, Campus de Beaulieu, Rennes, France
e-mail: [email protected]; [email protected]; [email protected]

S.K. Shukla and J.-P. Talpin (eds.), Synthesis of Embedded Software: Frameworks and Methodologies for Correctness by Construction, DOI 10.1007/978-1-4419-6400-7_1, © Springer Science+Business Media, LLC 2010


The need for higher abstraction levels and the rise of stronger market constraints now make the need for unambiguous design models more obvious. This challenge requires models and methods to translate a high-level system specification into (distributed) cooperating (sequential) processes and to implement high-level, semantics-preserving transformations such as hierarchical code structuring, sequentialization or desynchronization (protocol synthesis).

The synchronous hypothesis, to this end, has focused the attention of many academic and industrial actors. This synchronous paradigm consists of abstracting away the non-functional implementation details of a system. In particular, latencies due to effective computing and communications depend on the actual implementation architecture; thus they are handled when low-level implementation constraints are considered; at a higher level, time is abstracted as sequences of (multiple) events in a logical time model. The designer can thus forget those details and focus his or her attention on the functionalities of the system, including logical synchronizations.

With this point of view, synchronous design models and languages provide intuitive models for embedded systems [3]. This affinity explains the ease of generating systems and architectures, and of verifying their functionalities, using compilers and related tools that implement this approach.

Synchronous languages rely on the synchronous hypothesis: computations and behaviors of a synchronous process are divided into a discrete sequence of atomic computation steps which are equivalently called reactions or execution instants. In itself this assumption is rather common in practical embedded system design.

But the synchronous hypothesis adds to this the fact that, inside each instant, the behavioral propagation is well-behaved (causal), so that the status of every signal or variable is established and defined prior to being tested or used. This criterion ensures strong semantic soundness by allowing universally recognized mathematical models to be used as supporting foundations. In turn, these models give access to a large corpus of efficient optimization, compilation, and formal verification techniques.

The polychronous model [4] extends the synchronous hypothesis to the context of multiple logical clocks: several synchronous processes can run asynchronously until some communication occurs; all communications satisfy the synchronous hypothesis. The resulting behaviors are then partial orders of reactions, which is obviously more general than simple sequences. This model goes beyond the domain of purely sequential systems and synchronous circuits; it embraces the context of complex architectures consisting of synchronous circuits and desynchronization protocols: globally asynchronous and locally synchronous (GALS) architectures.

The SIGNAL language [5] supports the polychronous model. Based on data flow and equations, it goes beyond the usual scope of a programming language, allowing for specifications and properties to be described. It provides a mathematical foundation to a notion of refinement: the ability to model a system from the early stages of its requirement specifications (relations, properties) to the late stages of its synthesis and deployment (functions, automata). The inherent flexibility of the abstract notion of signal handled in the SIGNAL language invites and favors the design of correct-by-construction systems by means of well-defined model transformations that preserve both the intended semantics and stated properties of the architecture under design.

The integrated development environment POLYCHRONY [6] provides SIGNAL program transformations that draw a continuum from synchrony to asynchrony, from specification to implementation, from abstraction to refinement, from interface to implementation. SIGNAL gives the opportunity to seamlessly model embedded systems at multiple levels of abstraction while reasoning within a simple and formally defined mathematical model. It is being extended by plugins to capture SystemC modules or real-time Java classes within the workbench. It allows one to perform validation and verification tasks, e.g., with the integrated SIGALI model checker [7], or with the Coq theorem prover [8]. C, C++, multi-threaded and real-time Java, and SYNDEX [9] code generators are provided.

This chapter focuses on formal transformations, based on the polychronous semantic model [4], that can be provided by a safe methodology to generate “correct-by-construction” executable code from SIGNAL processes. It gives a thorough presentation of the program analysis and code generation techniques that can be implemented to transform synchronous multi-clocked equation systems into various execution schemes such as sequential and concurrent programs (C) or object-oriented programs (C++). Most of these techniques are available in the POLYCHRONY toolset to design embedded real-time applications.

This chapter is structured as follows: Sect. 1.2 presents the main features of the SIGNAL language and introduces some of the mathematical properties on which program transformations are based. In Sect. 1.3, some of the modularity features of SIGNAL are first introduced; then an example, used in the rest of this chapter, is presented. Section 1.4 is dedicated to Data Control Graph models that support program transformations; their use to guide various code generation schemes that are correct by construction is then considered. Section 1.5 introduces those various modes of code generation, illustrated on the considered example.

1.2 SIGNAL Language

SIGNAL [10–13, 15, 19] is a declarative language expressed within the polychronous model of computation. SIGNAL relies on a handful of primitive constructs, which can be combined using a composition operator. These core constructs are of sufficient expressive power to derive other constructs for comfort and structuring. In the following, we present the main features of the SIGNAL language and its associated concepts. We give a sketch of the primitive constructs and a few derived constructs often used. For each of them, the corresponding syntax and definition are mentioned. Since the semantics of SIGNAL is not the main topic of this chapter, we give simplified definitions of the operators. For further details, we refer the interested reader to [4, 5].


1.2.1 Synchronized Data Flow

Consider as an example the following expression in some conventional data flow formalism:

if a > 0 then x = a endif;  y = x + a

Considering the data flow semantics given by Kahn [16] as functions over flows, y is the greatest sequence of values a′_t + a_t, where a′ is the subsequence of strictly positive values in a. Thus, in an execution where the edges are considered as FIFO queues [17], if a is a sequence with infinitely many non-positive values, the queue associated with a grows forever, or (if a is a finite sequence) the queue associated with x eventually remains empty although a is non-empty. Now, suppose that each FIFO queue consists of a single cell [18]. Then, as soon as a negative value appears on the input, the execution (of the + operator) can no longer go on because its first operand (cell) does not hold a value (is absent): there is a deadlock. These results are not acceptable in the context of embedded systems, where not only deadlocks but also uncontrolled response times can be dramatic. Synchronized data flow introduces synchronizations between occurrences of flows to prevent such effects. In this context, the lack of a value is usually represented by nil or null; we represent it by the symbol # (standing for “no event”).
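To make this concrete, here is a small hand-built trace for the expression above (our own illustration, assuming single-cell FIFO queues as in [18]):

    a : 1   -2   3   ...
    x : 1
    y : 2

At the first step, a = 1 > 0, so x = 1 and y = x + a = 2. At the second step, a = -2 produces no value on x; the + operator then waits forever for its first operand while -2 occupies the single cell on a, so the subsequent value 3 can never be consumed: this is the deadlock described above.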

To prevent deadlocks from occurring during execution, it is necessary to be able to verify timing properties before runtime. In the framework of synchronized data flow, the # corresponds to the absence of a value at a given logical instant for a given variable (or signal). In particular, to reach a high level of modularity allowing, for instance, internal clock rate increase (time refinement), it must be possible to insert #'s between two defined values of a signal. Such an insertion corresponds to some resynchronization of the signal. However, the main purpose of synchronized data flow is to completely handle the whole synchronization at compile time, in such a way that the execution phase has nothing to do with #. This is ensured by a static representation of the timing relations expressed by each operator. Syntactically, the overall detailed timing is implicit in the language. SIGNAL describes processes which communicate through (possibly infinite) sequences of (typed) values with implicit timing: the signals.

1.2.2 Signal, Execution, Process in SIGNAL

A pure signal s is a (total) function T → D, where T, its time domain, is a chain in a partial order (for instance an increasing sequence of integers) and D is some data type; we name pure flow of such a pure signal s the sequence of its values in D.

For all chains TT such that T ⊆ TT, a pure signal s : T → D can be extended to a synchronized signal ss : TT → D# (where D# = D ∪ {#}) such that for all t in T, ss(t) = s(t), and ss(t) = # when t is not in T; we name synchronized flow of a synchronized signal ss the sequence of its values in D#; conversely, we name pure signal of the synchronized flow ss the unique pure signal s from which it is extended, and pure flow of ss the pure flow of s. The pure time domain of a synchronized signal is the time domain of its pure signal. When it is clear from the context, one may omit the pure or synchronized qualifications. Given a pure signal s (respectively, a synchronized signal ss), s_t (respectively, ss_t) denotes the t-th value of its pure flow (respectively, its synchronized flow).

An execution is the assignment of a tuple of synchronized signals, defined on the same time domain (as synchronized signals), to a tuple of variables. Let TT be the time domain of an execution; a clock in this execution is a “characteristic function” clk : TT → {#, true}; notice that it is a synchronized signal. The clock of a signal x in an execution is the (unique) clock that has the same pure time domain as x; it is denoted by x̂.

A process is a set of executions defined by a system of equations over signals that specifies relations between signal values and clocks. A program is a process.

Two signals are said to be synchronous in an execution iff they have the same clock (or equivalently the same pure time domain in this execution). They are said to be synchronous in a process (or simply synchronous) iff they are synchronous in all executions of this process.

Consider a given operator which has, for example, two input signals and one output signal, all being synchronous. They are logically related in the following sense: for any t, the t-th token on the first input is evaluated with the t-th token on the second input, to produce the t-th token on the output. This is precisely the notion of simultaneity. However, for two occurrences of a given signal, we can say that one is before the other (chronology). Then, for the synchronous approach, an event is associated with a set of instantaneous calculations and communications.
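As an illustration of these definitions (ours, not from the original text), consider the following synchronized flows over a common time domain of five instants, where # marks absence:

    x  : 1     #     2     3     #
    y  : 4     #     5     6     #
    z  : #     7     #     8     9
    x̂  : true  #     true  true  #

Here x and y are synchronous (they share the same pure time domain, hence the same clock x̂), whereas x and z are not; the pure flow of x is the sequence 1, 2, 3.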

1.2.3 SIGNAL Data Types

A flow is a sequence of values that belong to the same data type. Standard data types such as Boolean, integer, . . . (or more specific ones such as event – see below) are provided in the SIGNAL language. One can also find more sophisticated data types such as sliding window on a signal, bundles (a structure the fields of which are signals that are not necessarily synchronous), used to represent union types or signal multiplexing:

• The event type: to be able to compute on (or to check properties of) clocks, SIGNAL provides a particular type of signals called event. An event signal is true if and only if it is present (otherwise, it is #).

• Signal declaration: tox x declares a signal x whose common element type is tox. Such a declaration is a process that contains all executions that assign to x a signal the image of which is in the domain denoted by tox.

In the remainder of this chapter, when the type of a signal does not matter or when it is clear from the context, one may omit mentioning it.
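For instance, a few declarations in this style (a minimal sketch in the concrete syntax, with signal names of our own choosing) would read:

    integer a, s;
    boolean r;
    event c;

Each declaration constrains the corresponding signal to carry values of the declared type; the synchronization between these signals is left to the equations that use them.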


1.2.4 SIGNAL Elementary Processes

An elementary process is defined by an equation that associates with a signal variable an expression built on operators over signals; the arguments of operators can be expressions and variables. (A small combined example is sketched at the end of this section, after the list of operators.)

• Stepwise extensions. Let f be a symbol denoting an n-ary function [[f]] on values (e.g., a Boolean, arithmetic or array operation). Then, the SIGNAL expression

y := f(x1,..., xn)

defines the process equal to the set of executions that satisfy:

– The signals y, x1, ..., xn are synchronous.
– Their pure flows have the same length l and satisfy ∀ t ≤ l, y_t = [[f]](x1_t, ..., xn_t).

If f is a function, its stepwise extension is a pure flow function. Infix notation is used for usual operators.

Derived operator

◦ Clock of a signal: ˆx returns the clock of x; it is defined by (ˆx) =def (x = x), where = denotes the stepwise extension of the usual equality operator.

• Delay. This operator defines the signal whose t-th element is the (t − 1)-th element of its (pure flow) input, at any instant but the first one, where it takes an initialization value. Then, the SIGNAL expression

y := x $ 1 init c

defines the process equal to the set of executions that satisfy:

– y, x are synchronous.
– Their pure flows have the same length l and satisfy, for all t ≤ l: (t > 1) ⇒ y_t = x_{t−1}, and (t = 1) ⇒ y_t = c.

The delay operator is thus a pure flow function.

Derived operator

◦ Constant: x := v; when x is present, its value is the constant value v; x := v is a derived equation equivalent to x := x $ 1 init v. Note that this equation does not have an input: it is a pure flow function with arity 0.

• Sampling. This operator has one data input and one Boolean “control” input. When one of the inputs is absent, the output is also absent; at any logical instant where both input signals are defined, the output is present (and equal to the current data input value) if and only if the control input holds the value true. Then, the SIGNAL expression


y := x when b

defines the process equal to the set of executions that satisfy:

– y, x, b are extended to the same infinite domain T, respectively as yy, xx, bb.
– The synchronized flows are infinite and satisfy, for all t ∈ T: (bb_t = true) ⇒ yy_t = xx_t, and (bb_t ≠ true) ⇒ yy_t = #.

The when operator is thus a synchronized flow function.

Derived operators

◦ Clock selection: when b returns the clock that represents the (implicit) set of instants at which the signal b is true; in the semantics, this clock is denoted by [b]. (when b) =def (b when b) is a pure flow function.

◦ Null clock: the signal when (b when (not b)) is never present: it is called the null clock and is denoted by ˆ0 in the SIGNAL syntax, and by 0̂ as a semantic constant.

◦ Clock product: x1 ˆ* x2 (denoted by ∗̂ as a semantic operator) returns the clock that represents the intersection of the pure time domains of the signals x1 and x2. When their clock product is 0̂, x1 and x2 are said to be exclusive.
(x1 ˆ* x2) =def ((ˆx1) when (ˆx2))

• Deterministic merging. The unique output provided by this operator is defined (i.e., with a value different from #) at any logical instant where at least one of its two inputs is defined (and non-defined otherwise); a priority makes it deterministic. Then, the SIGNAL expression

z := x default y

defines the process equal to the set of executions that satisfy:

– The time domain T of z is the union of the time domains of x and y.
– z, x, y are extended to the same infinite domain TT ⊇ T, respectively as zz, xx, yy.
– The synchronized flows satisfy, for all t ∈ TT: (xx_t ≠ #) ⇒ zz_t = xx_t, and (xx_t = #) ⇒ zz_t = yy_t.

The default operator is thus a synchronized flow function.

Derived operators

◦ Clock union (or clock max): x1 ˆ+ x2 (denoted by +̂ as a semantic operator) returns an event signal that is present iff x1 or x2 is present.
(x1 ˆ+ x2) =def ((ˆx1) default (ˆx2))

◦ Clock difference: x1 ˆ- x2 returns an event signal that is present iff x1 is present and x2 is absent.
(x1 ˆ- x2) =def (when ((not ˆx2) default ˆx1))


Derived equations

◦ Partial signal definition: y ::= x is a partial definition for the signal y, which is equal to x when x is defined; when x is not defined, its value is free.
(| y ::= x |) =def (| y := x default y |)
This process is generally non-deterministic. Nevertheless, it is very useful to define components such as transitions in automata, or modes in real-time processes, that contribute to the definition of the same signal. Moreover, it is heavily used in the process of code generation (communication of values via shared variables, see Sect. 1.3.1). The clock calculus can compute sufficient conditions to guarantee that the overall definition is consistent (different components cannot give different values at the same instant) and total (a value is given at all instants of the time domain of y).
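As announced at the beginning of this section, here is a small combined illustration (our own example; the trace is hand-computed from the definitions above) in which a running sum s over an input a is reset to the current value of a whenever a Boolean input r is true:

    (| zs := s $ 1 init 0
     | s  := (a + zs) when (not r) default a
     | a ˆ= r
     |)

    r  : false  false  true   false  false
    a  : 1      2      5      1      2
    zs : 0      1      3      5      6
    s  : 1      3      5      6      8

The delay equation makes zs carry the previous value of s (0 initially); the sampling keeps a + zs only when r is false; and the deterministic merge falls back to a when the sampled branch is absent, i.e., when r is true.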

1.2.5 SIGNAL Process Operators

A process is defined by composing elementary processes.

• Restriction. This operator allows one to consider as local signals a subset of the signals defined in a given process. If x is a signal with type tox defined in a process P,

P where tox x or P where x

defines a new process Q whose communication ways (for composition) are those of P, except x. Let A be the variables of P and B the variables of Q: we say that P is restricted to B, and the executions of P are restricted in Q to the variables of B. More precisely, the executions in Q are the executions in P from which the signal x is removed (the projection of these executions on the remaining variables). This has several consequences:

– The clock of each execution may be reduced.
– The ability to (directly) add new synchronization constraints to x is lost.
– If P has a single output signal named x, then P where x is a pure synchronization process. The generated code (if any) is mostly synchronization code used to ensure signal occurrence consumptions.
– If P has a single signal named x, P where x denotes the neutral process: it cannot influence any other process. Hence no code is generated for it.

Derived equations

◦ (| P where x, y |) =def (| (| P where x |) where y |)
◦ Synchronization: x1 ˆ= x2 specifies that x1 and x2 are synchronous.
(| x1 ˆ= x2 |) =def (| h := (ˆx1 = ˆx2) |) where h


◦ Clock inclusion: x1 ˆ< x2 specifies that the time domain of x1 is included in the time domain of x2.
(| x1 ˆ< x2 |) =def (| x1 ˆ= (x1 ˆ* x2) |)

• Parallel composition. Resynchronizations (by freely inserting #'s) have to take place when composing processes with common signals. However, this is only a formal manipulation. If P and Q denote two processes, the composition of P and Q, written (| P | Q |), defines a new process in which common names refer to common synchronized signals. Then, P and Q communicate (synchronously) through their common signals. More precisely, let X_P (resp., X_Q) be the variables of P (resp., Q); the executions in (| P | Q |) are the executions whose projections on X_P are executions of P, and whose projections on X_Q are executions of Q. In other words, (| P | Q |) defines the set of behaviors that satisfy both P and Q constraints (equations).

Derived operators

◦ y := x cell c init x0 behaves as a synchronized memory cell: y is present with the most recent value of x when x is present or c is present and true. It is defined by the following program:
  (| y := x default (y $1 init x0) | y ˆ= x ˆ+ when c |)
◦ y := var x init x0 behaves as a standard memory cell: when y is present, its value is the most recent value of x (including the current instant); the clock of x and the clock of y are mostly independent (the single constraint is that their time domains belong to a common chain). It is defined by the following program:
  y := (x cell ˆy init x0) when ˆy

Polychrony example: (| x := a | y := b |) defines a process that has two independent clocks. This process is a Kahn process (i.e., is a flow function); it can be executed as two independent threads, on the same processor (provided that the scheduler is fair) or on distinct processors; it can also be executed as a single reactive process, scanning its inputs and then executing none, one or two assignments depending on the input configuration.

1.2.6 Parallel Semantic Properties of SIGNAL

A SIGNAL specification close to the example given in Sect. 1.2.1,

  if a > 0 then x = a endif; y = x + a

is the following "DeadLocking Specification":

  DLS =def (| x := a when a > 0 | y := x + a |)

DLS denotes the set of executions in which a_t > 0 for all t. Then safe execution requires this program to be rejected if one cannot prove (or if it is not asserted) that the signal a remains strictly positive when it is present.


Embedding DLS, without changing it, with the following front-end process results in a safe program that accepts negative values for some occurrences of a; these negative values are not transmitted to DLS:

ap := a when a > 0 | (| (| DLS | a := ap |) where a |)

Process expression in normal form. The following properties of parallel composition are intensively used to compile processes:

• Associativity: (| P | Q |) | R ≡ P | (| Q | R |).
• Commutativity: P | Q ≡ Q | P.
• Idempotence: P | P ≡ P is satisfied by processes that do not produce side effects (for instance due to calls to system functions). This property allows one to replicate processes.
• Externalization of restrictions: if x is not a signal of P,
  P | (| Q where x |) ≡ (| P | Q |) where x

Hence, a process expression can be normalized, modulo required variable substitution, as the composition of elementary processes, included in terminal restrictions.

Signal expression in normal form. Normalizations can be applied to expressions on signals thanks to properties of the operators, for instance:

• when is associative and right-commutative:
  (a when b) when c ≡ a when (b when c) ≡ a when (c when b)
• default can be written in exclusive normal form:
  a default b ≡ a default (b when (b ˆ- a))
• default is associative and default commutes in exclusive normal form:
  if a and b are exclusive then a default b ≡ b default a
• when is right-distributive over default:
  (a default b) when c ≡ (a when c) default (b when c)
• Boolean normalization: logical operators can be written as expressions involving only not, false and operators on event type. For example:
  AB := a or b ≡ (| AB := (when a) default b | a ˆ= b |)

Process abstraction. A process Pa is, by definition, a process abstraction of a process P if P | Pa = P. This means that every execution of P restricted to the variables of Pa is an execution of Pa, and thus all safety properties satisfied by executions of Pa are also satisfied by executions of P.

1.3 Example

As a full example to illustrate the SIGNAL features and the compilation techniques (including code generation), we propose the description of a process that iteratively solves equations aX² + bX + c = 0. This process has three synchronous signals a, b, c as inputs. The output signals x1, x2 are the computed solutions.


When there is no solution at all or infinitely many solutions, their synchronized flows hold # and an output Boolean x_st is set. Before presenting this example, let us introduce modularity features that can be found in SIGNAL.

1.3.1 SIGNAL Modularity Features

Process model: Given a process (a set of equations) P_body, a process model MP associates an interface with P_body, such that P_body can be expanded using this interface. A process model is a SIGNAL term

process MP ( ? t_I1 I1; ...; t_Im Im;
             ! t_O1 O1; ...; t_On On )
   P_body

that specifies its typed input signals after "?", and its typed output signals after "!". Assuming that there is no name conflict (such a conflict is solved by trivial renaming), the instantiation of MP is defined by:

(| (Y1, ..., Yn) := MP(E1, ..., Em) |) =def
(| I1:=E1 |...| Im:=Em | P_body | Y1:=O1 |...| Yn:=On |)
   where t_I1 I1; ...; t_Im Im; t_O1 O1; ...; t_On On

Example: When a is equal to 0, the second degree equation becomes bX + c = 0. This first degree equation is solved using the FirstDegree process model.

process FirstDegree = ( ? real b, c; ! boolean x_st; real x; )
   (| b ˆ= c
    | b1 := b when (b/=0.0)
    | c1 := c when (b/=0.0)
    | x := -(c1/b1)
    | x_st := (c/=0.0) when (b=0.0)
    |) where real b1, c1; end

Example: FirstDegree SIGNAL process

When the factor b is not 0, the output signal x holds the value of the solution (-c/b), and the x_st Boolean signal is absent. Conversely, when b is 0, x is absent and x_st is either true when the equation has no solution (c is not 0) or false when the equation has infinitely many solutions (c is 0). This process is activated when the input parameter a equals 0. This is achieved by:

  (x_st_1, x11) := FirstDegree (b when (a=0.0), c when (a=0.0))

The interface of a process model can begin with a list of static parameters given between "{" and "}"; see for instance real epsilon in the interface of rac (Example p. 12).

Local process model: A process model can be declared local to another process model, as process rac is local to process SecondDegree in Example p. 13.


Shared variables: A variable x that has partial definitions (Sect. 1.2.4) in a process model MP, or in process models local to MP, is declared as shared real x, local to MP. A shared signal x cannot appear in the interface of any process.

1.3.2 Full Example with Time Refinement

In our example, the resolution of the equation aX² + bX + c = 0 uses the iterative Newton method: starting from Δ ≥ 0, the computation of R = √Δ is defined by the limit of the series (R_n)_{n≥0}:

  Δ = b² - 4ac,    R_0 = Δ/2,    R_{n+1} = (R_n · R_n + Δ) / (2 · R_n)    (n ≥ 0).

The iterative method for computing the square root of the discriminant is implemented in SIGNAL using a time refinement of the clock of the discriminant.
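To make the numerical scheme concrete before its polychronous encoding, here is a minimal, purely sequential C sketch of the same Newton iteration (not part of the chapter's generated code; the function name and the termination test on epsilon are only illustrative):

  #include <math.h>
  #include <stdio.h>

  /* Newton iteration for R = sqrt(delta), started from R0 = delta/2;
     the loop stops when two successive approximations differ by less
     than epsilon, which is the role of the static parameter of rac. */
  double newton_sqrt(double delta, double epsilon) {
    double r = delta / 2.0;
    double next = (r + delta / r) / 2.0;   /* R_{n+1} = (R_n + delta/R_n)/2 */
    while (fabs(next - r) >= epsilon) {
      r = next;
      next = (r + delta / r) / 2.0;
    }
    return next;
  }

  int main(void) {
    printf("%f\n", newton_sqrt(2.0, 1e-9));  /* prints ~1.414214 */
    return 0;
  }

In the rac process below, each turn of this loop becomes one instant of a faster clock, so the same computation is carried out step by step over logical time rather than by an explicit loop.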

The process model rac computes in R the sequence of roots R_t of the values of a signal S_t assigned to S (corresponding to the discriminant when it is not negative); epsilon is a static threshold parameter used to stop the Newton iteration.

process rac = { real epsilon; }
   ( ? real S; ! boolean stable; real R; )
   (| (| S_n := var S
       | R_n := (S/2.0) default (next_R_n $1 init 1.0)
       | next_R_n := (((R_n+(S_n/R_n))/2.0) when loop) default R_n
       |) where real S_n; end
    | (| loop := (ˆS) default (not stable)
       | next_stable := abs(next_R_n-R_n)<epsilon
       | stable := next_stable $1 init true
       | R_n ˆ= stable
       | R := R_n when (next_stable and loop)
       |) where boolean next_stable; end
    |) where real R_n, next_R_n; boolean loop; end

Example: rac SIGNAL process

The signal S_n holds the current value of S (S_{t,n} = S_{t,0} = S_t). The signal R_n is the current approximation of the current root to be computed (R_{t,n}, with R_{t,0} = S_t/2). The signal next_R_n is R_{t,n+1}. The signal stable is first true, then false until the instant following the emission of the result in R.

The clock that triggers steps is that of the signal stable: it represents a refinement of time with respect to that of the input signal S.

The process model SecondDegree uses rac to compute the solutions of the second degree equation when the discriminant is positive.


process SecondDegree = { real epsilon; }
   ( ? real a, b, c; ! event x_st; real x21, x2; boolean stable; )
   (| delta := (b*b)-(4.0*a*c)
    | x_st := when (delta<0.0)
    | x1_1 := (-b/(2.0*a)) when (delta=0.0)
    | (| (stable, delta_root) := rac{epsilon}(delta when (delta>0.0))
       | aa := var a | bb := var b
       | x1_2 := -((bb+delta_root)/(2.0*aa))
       | x2 := -((bb-delta_root)/(2.0*aa))
       |) where aa, bb, delta_root; end
    | x21 := x1_1 default x1_2
    |) where delta, x1_1, x1_2;

process rac ... end

Example: SecondDegree SIGNAL process

When the discriminant delta is negative, the current equation does not havesolution: the event x_st output signal is present, x21 and x2 are absent. Whendelta is 0 there is a single solution held by x1_1 and then x_st and x2 areabsent. When delta is strictly positive the two solutions are the current values ofx21 and x2, x_st is absent. The signal stable is false until the instant followingthe emission of the result in (R).

The process model eqSolve is the global solver: a, b and c are declared to be synchronous.

process eqSolve = { real epsilon; }
   ( ? real a, b, c; ! boolean x_st; real x1, x2; )
   (| a ˆ= b ˆ= c ˆ= when stable
    | (x_st_1, x11) := FirstDegree ( b when (a=0.0), c when (a=0.0))
    | (x_st_2, x21, x2, stable) :=
         SecondDegree{epsilon}(a when (a/=0.0), b when (a/=0.0), c when (a/=0.0))
    | x1 := x11 default x21
    | x_st := x_st_2 default x_st_1 |)
   where ... end

Example: eqSolve SIGNAL process

When the value of a is 0, FirstDegree input signals are present (and then FirstDegree is "activated"); a, b and c are not "transmitted" to SecondDegree, which remains inactive. Conversely, when a is not 0, SecondDegree is activated and FirstDegree is not. The results of the activated modes are merged to generate x1, x2 and x_st.
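As a cross-check of this mode structure, the following small sequential C restatement sketches what one resolution step computes (a hypothetical helper written for this explanation, not the code generated by POLYCHRONY):

  #include <math.h>
  #include <stdio.h>

  /* One resolution step: returns 1 and sets *x1 (and possibly *x2) when real
     solutions exist; otherwise sets *x_st (1: no solution, 0: infinitely many). */
  int solve_step(double a, double b, double c,
                 double *x1, double *x2, int *x_st) {
    if (a == 0.0) {                       /* FirstDegree mode: bX + c = 0 */
      if (b != 0.0) { *x1 = -(c / b); return 1; }
      *x_st = (c != 0.0);
      return 0;
    }
    double delta = b * b - 4.0 * a * c;   /* SecondDegree mode */
    if (delta < 0.0) { *x_st = 1; return 0; }
    if (delta == 0.0) { *x1 = -b / (2.0 * a); return 1; }
    double r = sqrt(delta);               /* computed iteratively by rac in SIGNAL */
    *x1 = -(b + r) / (2.0 * a);
    *x2 = -(b - r) / (2.0 * a);
    return 1;
  }

  int main(void) {
    double x1 = 0.0, x2 = 0.0; int x_st = 0;
    if (solve_step(1.0, -3.0, 2.0, &x1, &x2, &x_st))
      printf("x1=%f x2=%f\n", x1, x2);    /* roots 2 and 1 */
    return 0;
  }

The SIGNAL version differs in that the square root is itself computed over refined logical instants, and the absence of x1, x2 or x_st is expressed by clocks rather than by return codes.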

1.4 Formal Context for Code Generation

The equational nature of the SIGNAL language is a fundamental characteristic that makes it possible to consider the compilation of a process as a composition of endomorphisms over SIGNAL processes. We have given in Sect. 1.2.6 a few properties allowing one to rewrite processes with rules such as commutativity and associativity of parallel composition.


More generally, until the very final steps, the compilation process may be seen as a sequence of morphisms rewriting SIGNAL processes to SIGNAL processes. The final steps (C code generation for instance) are simple morphisms over the transformed SIGNAL processes.

In some way, because SIGNAL processes are systems of equations, compiling SIGNAL processes amounts to "solving" these equations. Among the relevant questions arising when producing executable code, for instance, there are the following ones:

– Is the program deadlock free?
– Does it have a deterministic execution?
– If so, can it be statically scheduled?

To answer these questions, two basic tools are used in the compilation process. The first one is the modeling of the synchronization relation in F_3 by polynomials with coefficients in the finite field Z/3Z of integers modulo 3 [19]; the POLYCHRONY SIGNAL compiler manipulates a Boolean hierarchy instead of this field. The second one is the directed graph associated with data dependencies and explicit precedences. The synchronization and precedence relations are represented in a directed labeled graph structure called the Data Control Graph (DCG); it is composed of a Clock Hierarchy (CH, Sect. 1.4.3.1) and a Conditioned Precedence Graph (CPG, Sect. 1.4.4). A node of this CPG is an elementary process or, in a hierarchical organization, a composite process containing its own DCG.

This section introduces the SIGNAL features used to state properties related to the Data Control Graph. Principles and algorithms applied to the information carried by this DCG are presented.

1.4.1 Endochronous Acyclic Process

When considering embedded systems specified in SIGNAL, the purpose of code generation is to synthesize an executable program that is able to deterministically compute the value of all signals defined in a process. Because these signals are timed by symbolic synchronization relations, one first needs to define a function from these relations to compute the clock of each signal. We say that a process is endochronous when there is a unique (deterministic) way to compute the clocks of its signals. Note that, for simulation purposes, one may wish to generate code for non-deterministic processes (for example, partially defined ones) or even for processes that may contain deadlocks. Endochrony is a crucial property for processes to be executable: an endochronous process is a function over pure flows. It means that the pure flows resulting from its execution on an asynchronous architecture do not depend on propagation delays or operator latencies. It results from this property that a network of endochronous processes is a Kahn Process Network (KPN) and thus is a function over pure flows. But it is not necessarily a function over synchronized flows: synchronizations related to the number of #'s are lost because #'s are ignored.

Whereas the synchronization relation determines which signals need to be computed at a given time, the precedence relation tells us in which order these signals have to be computed.


[Fig. 1.1 Paradigms of deadlocks: (a) a computation cycle, x := a + x; (b) a clock/value cycle, zx := x$1 | x ˆ= when zx=0, with ˆx = ˆzx; (c) a cycle between x1 := a default x2 and x2 := b default x1, with x1 ˆ= x2, whose dependencies are conditioned by x2 ˆ- a and x1 ˆ- b]

The relation "x precedes y at c", represented as c: x → y, means that, for all instants in the pure time domain of the clock signal c, the computation of y cannot be performed before the value of x is known. Therefore c: x → y is equivalent to c ˆ* ˆx ˆ* ˆy : x → y. Hence, we say that c: x → y is in normalized form if and only if c ˆ* ˆx ˆ* ˆy = c. The reduced form x → y denotes the normalized form ˆx ˆ* ˆy : x → y.

An immediate cycle in the conditioned precedence graph denotes a deadlock. Figure 1.1 presents the main sources of deadlocks. Figure 1.1a is a classical computation cycle. Figure 1.1b is a "schizophrenic" cycle between clock and signal: to know if zx is present one must know if its value is 0, but to pick up its value it is required to know if it is present. Figure 1.1c is a cyclic dependence due to the free definition of both x1 and x2 when neither a nor b is present; nevertheless, if the clock of x1 is a ˆ+ b then there is no deadlock during execution (either a is present and x1 ˆ- a is null, or b is present and x2 ˆ- b is null).

1.4.2 SIGNAL Graph Expressions

The SIGNAL term a --> b when c is an elementary process in the SIGNAL syntax. It denotes the precedence relation [c]: a → b. This constraint is usually implicit and related to data dependencies (for instance in x := a+b the constraints a → x and b → x hold), and clock/signal dependencies (the value of a signal x cannot be computed before ˆx, the clock of x, hence the implicit precedence relation ˆx → x). Precedences can be freely added by the programmer. They can be computed by the compiler and made available in specifications associated with processes (Sect. 1.4.6).

Derived expressions: a --> b is a simplified term that means a --> b when (aˆ*b). Local precedence constraints can be combined, as in {b,c} --> {x_st,x}, meaning that for all pairs (u in {b,c}, v in {x_st,x}), u --> v holds.

The SIGNAL term ll::P is a derived process expression formally defined with SIGNAL primitive features.


For the purpose of this chapter, it is enough to know that ll::P associates the label ll with the process P. A label ll is a special event signal that "activates" P. It may occur in synchronization and precedence constraints as any other signal, with a specific meaning for precedence: if a process P contains ll1::P1 and ll2::P2, then ll1 --> ll2 in P means that every node in P1 precedes all nodes of P2, while for a signal x, x --> ll2 (resp. ll1 --> x) means that x precedes (resp. is preceded by) all nodes of P2 (resp. P1).

1.4.3 Synchronization Relation

Table 1.1 gives the synchronization relation associated with a SIGNAL expression P, as a SIGNAL expression C(P) (column 2), and as a time domain constraint T(P) (column 3). The time domain constraints have to be satisfied by all executions of P. The synchronization relations associated with derived expressions (and normalizable ones) are deduced from this table. In this table, "[E]" stands for "when E", "ˆ0" is the "never present" clock, and the pure time domain of a signal x is noted "dom(x)". Other notations are standard ones and SIGNAL notations.

The transformation C defined in Table 1.1 satisfies the following property, making C(P) a process abstraction of P:

  (| C(P) | P |) = P.

The clock of a Boolean signal b is partitioned into its exclusive sub-clocks [b] and [not b], which denote the instants at which the signal b is present and carries the values true and false, respectively, as shown in Fig. 1.2.

Table 1.1 Synchronization relation associated with a process

Construct P              Clocks: C(P)                          Pure time domains T(P)
boolean b                [b]ˆ+[not b] ˆ= b |                   T(C(P))
                         [b]ˆ*[not b] ˆ= ˆ0
event b                  [b] ˆ= b | [not b] ˆ= ˆ0              T(C(P))
y := f(x1,...,xn)        y ˆ= x1 ˆ= ... ˆ= xn                  T(C(P))
y := x $1 init c         y ˆ= x                                dom(y) = dom(x)
y := x when b            xˆ*[b] ˆ= y                           dom(y) = dom(x) ∩ dom([b])
y := x when not b        xˆ*[not b] ˆ= y                       dom(y) = dom(x) ∩ dom([not b])
z := x default y         xˆ+y ˆ= z                             dom(z) = dom(x) ∪ dom(y)
P1 | P2                  C(P1) | C(P2)                         T(C(P1)) ∧ T(C(P2))
P where x                C(P) where x                          ∃x T(C(P))

[Fig. 1.2 Time subdomains and hierarchy for the FirstDegree process: when b, c are present, either [b=0] holds (x_st present) or [b/=0] holds (x present); b, c may also be absent]


1.4.3.1 Clock Hierarchy

The synchronization relation provides the necessary information to determine how clocks can be computed. From the synchronization relation induced by a process, we build a so-called clock hierarchy [20]. A clock hierarchy is a relation ≽ (dominates) on the quotient set of signals by ˆ= (x and y are in the same class iff they are synchronous). Informally, a class C dominates a class D (C is higher than D), or equivalently a class D is dominated by C (D is lower than C), written D ≼ C, if the clock of D is computed as a function of Boolean signals belonging to C and/or to classes recursively dominated by C.

To compute this relation, we consider a set V of free value Boolean signals (the free variables of the process); this set contains the variables whose definition cannot be rewritten using some implemented rewriting rule system: in the current POLYCHRONY version, input signals, delayed signals, and the results of most of the non-Boolean predicates are elements of this set V.

Depending on V, the construction of the relation ≽ is based on the following rules (for x a signal, C_x denotes its class; ≽* is the transitive closure of ≽):

1. If x1 is a free variable such that the clock of x is defined by sampling that of x1 (ˆx = [x1] or ˆx = [not x1]), then C_x1 ≽ C_x: the value of x1 is required to compute the clock of x.
2. If ˆx = f(ˆx1, ..., ˆxn), where f is a Boolean/event function, and there exists C such that C ≽* C_xi for all xi in x1, ..., xn, then ˆx is written in canonical form ˆx = cf(ˆy1, ..., ˆym); cf(ˆy1, ..., ˆym) is either the null clock, or the clock of C, or is transitively dominated by C; in POLYCHRONY, BDDs are used to compute canonical forms.
3. If ˆx = cf(ˆy1, ..., ˆym) is a canonical form such that m ≥ 2 and there exists C_z such that C_z ≽* C_yi for all yi in y1, ..., ym, then there exists a lowest class C that dominates those C_yi, and C ≽ C_x.

When the clock hierarchy has a unique highest class, like the class of Bx in Fig. 1.3b, the process has a fastest rated clock and the status (presence/absence) of all signals is a pure flow function: this status depends neither on communication delays, nor on computing latencies. The process is endochronous.

[Fig. 1.3 Endochronization of y := x when b: (a) not a tree: independent clocks of x and b, with sub-clocks [b], [not b] and x ˆ* [b]: y; (b) endochronized: Bx ˆ= Bb | x ˆ= when Bx | b ˆ= when Bb, with [Bx] ˆ* [b]: y under the highest class Bx, Bb; (c) clock container wrapping y := x when b with the added inputs Bx, Bb]


The "clock calculus" provided by POLYCHRONY determines the clock hierarchy of a process. It has a triple purpose:

– It verifies that the process is well-clocked: the synchronization relation of the process can be written as a set of acyclic definitions.
– It assigns to each clock a unique normalized definition when possible.
– It structures the control of the process according to its clock hierarchy.

1.4.3.2 “Endochronization”

A process that is not endochronous can be embedded in a container (Fig. 1.3c) such that the resulting process is endochronous. For instance, consider the clock hierarchy associated with the process y := x when b; it has two independent classes: the class of ˆx and the class of ˆb (Fig. 1.3a); the third one is defined by a clock product.

To get an endochronous process, one can introduce a new highest clock and two new input Boolean signals Bx and Bb, synchronous with that clock; the sampling of Bx and Bb defines respectively the clocks of x and b (Fig. 1.3b). This embedding can be made by the designer, or heuristics can be applied to instrument the clock hierarchy with a default parameterization.

Building such a container is useful not only to make a process endochronous but also to ensure that the pure flow function associated with a KPN remains a synchronized flow function (i.e., synchronizations are preserved despite various communication delays).

1.4.4 Precedence Relation

Table 1.2 gives the precedence relation associated with a SIGNAL expression P, as a SIGNAL expression S(P) (column 2), and as a path algebra relation →(P) (column 3). Notice that the delay equation (line 3) does not order the involved signals. The table is completed by relations coming from the clock hierarchy.

The transformation S defined in Table 1.2 satisfies the following property, making S(P) a process abstraction of P:

  (| S(P) | P |) = P.

1.4.4.1 Path Algebra

From the basic precedence relation, a simple path algebra can be used to verify deadlock freeness, to refine precedence and to produce information for modularity.


Table 1.2 Precedence relation associated with a process; the transitive closure →(P)* is more precisely a path algebra presented in Sect. 1.4.4.1

Construct P              Precedence: S(P)                         Path algebra: →(P)
{a --> b when c}         {a --> b when c}                         [c]: a → b
y := f(x1,...,xn)        x1-->y | ... | xn-->y                    →(S(P))
y := x $1 init c
y := x when b            x-->y                                    x → y
z := x default y         x-->z | y-->z when (ˆyˆ-ˆx)              →(S(P))
P1 | P2                  S(P1) | S(P2)                            (→(P1) ∪ →(P2))*
P where x                S(P) where x                             →(P)

Clock construct P        Precedence: S(P)                         Path algebra: →(P)
x, a signal              ˆx --> x                                  ˆx → x
x ˆ= when b              b --> ˆx                                  b → ˆx
x ˆ= when not b
y ˆ= cf(ˆx1,...,ˆxn)     ˆx1-->ˆy | ... | ˆxn-->ˆy                →(S(P))

The path algebra is given by the following rules defining →* for expressions c: x → y in normalized form:

◦ Rule of series: c: x → y and d: y → z  ⟹  c ˆ* d: x → z;
◦ Rule of parallel: c: x → y and d: x → y  ⟹  c ˆ+ d: x → y.

A pseudo cycle in the precedence relation → associated with a process P is a sequence c1: x1 → x2, c2: x2 → x3, ..., c_{n-1}: x_{n-1} → x1 in →*. P is deadlock free iff, for every pseudo cycle c1: x1 → x2, c2: x2 → x3, ..., c_{n-1}: x_{n-1} → x1 in its precedence relation, the product c1 ˆ* c2 ˆ* ... ˆ* c_{n-1} is null (= ˆ0).
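For instance, this is the computation behind the remark on Fig. 1.1c in Sect. 1.4.1: assuming ˆx1 = ˆx2 = a ˆ+ b, the two default equations induce the normalized dependencies (x2 ˆ- a): x2 → x1 and (x1 ˆ- b): x1 → x2. By the rule of series, the pseudo cycle carries the product (x2 ˆ- a) ˆ* (x1 ˆ- b); since at every instant of a ˆ+ b at least one of a and b is present, this product is ˆ0 and the process is deadlock free.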

1.4.4.2 Precedence Refinement

To generate sequential code for a (sub)process P it is usually necessary to add new dependencies c1: x1 → x2 to P, thus obtaining a new process PS which refines the precedence of P. But it is desirable that, if the composition of P with some process Q is deadlock free, then the composition of PS with the same process Q remains deadlock free; a refinement that satisfies this property is said to be cycle consistent by composition. In general, the maximal cycle consistent refinement is not unique. The union of all maximal cycle consistent refinements is actually a preorder that can be computed using the path algebra [21]. The preorder associated with process FirstDegree is shown in Fig. 1.4b (clocks are omitted): one can for instance compute b1 before c1 or conversely.


[Fig. 1.4 FirstDegree process DCG and its precedence refinement preorder: (a) clock hierarchy and graph for FirstDegree: under ˆb = ˆc, the node [b=0] carries x_st := (c=0) when h, and the node [b/=0] carries b1 := b when k, c1 := c when k and x := c1/b1, where h is "when b=0" and k is "when b/=0"; (b) scheduling refinement preorder over these computations]

1.4.5 Data Control Graph

The Data Control Graph, composed of a clock hierarchy and a Conditioned Precedence Graph as shown in Fig. 1.4a, not only associates with every clock node the set of signals that have this node as clock, but also produces a hierarchy of the CPG in which every clock node is associated with the subgraph of the computations that must be processed when this clock is present (for instance, the computations of b1, c1, x are associated with the node [b/=0] in Fig. 1.4a).

1.4.6 SIGNAL Process Abstraction

The interface of a process model MP can contain, in a process P_abst, a specific description of relations, typically clock and precedence relations, applying on its inputs and outputs:

process MP(?I1, ...,Im; !O1,..., On) spec (|P_abst|) P_body

When the process body P_body of MP is a set of equations, the actual process associated with MP is not P_body but (| P_abst | P_body |). Hence, P_abst is by construction an abstraction of (| P_abst | P_body |).

When MP is an external process model, P_abst is assumed to be an abstraction of the process associated with MP. Supplementary properties are required to correctly use those external processes. Thus, a process is (or can be) declared:

• Safe: a safe process P is a single-state deterministic, endochronous automaton; such a process does not have occurrences of the delay operator and cannot "call" external processes that are not safe; in practice the code generated to interact with P (to call P) can be freely replicated.
• Deterministic: a deterministic process P is a deterministic, endochronous automaton (it is a pure flow function); such a process cannot "call" external processes that are not safe; in practice the code generated to interact with P (to step P) can be replicated only if the internal states of P are also replicated.
• Unsafe: a process is unsafe by default.

For instance, the FirstDegree function (Example p. 11) is a safe process. Its synchronization and precedence relations are made explicit in its specification: the signal x_st is synchronized with the clock at which b is zero, the signal x is present iff b is non-zero, and all output signals depend on all input signals ({b, c} --> {x, x_st}). The abstraction of its generated C code is then given by the FirstDegree SIGNAL external process presented below.

For a SIGNAL process P such an abstraction can be automatically computed by the POLYCHRONY SIGNAL compiler: it is the restriction to the input/output signals (of MP) of the process (| C(P) | S'(P) |), where S'(P) is the precedence refinement of S(P) taking into account the precedences introduced by code generation.

When the specifications are related to imported processes, their source code may not be available (written in another language such as C++ or Esterel [22]). Legacy code can be used in applications developed in SIGNAL, in which it is considered as external processes via specifications.

Besides, POLYCHRONY provides a translator from C code in SSA form to SIGNAL processes [23]. The translated process can be used to abstract the original C behavior. Each sequence of instructions in the SSA form is associated with a label. Each individual instruction is translated to an equation and each label is translated to a Boolean signal that guards its activation. As a result, the SIGNAL interpretation has a behavior identical to that of the original C program.

void FirstDegree(float b, float c,
                 float *x, int *x_st) {
  if (b != 0.0)
    *x = -(c/b);
  else
    *x_st = (c != 0.0);
}

process FirstDegree =
   ( ? real b, c; ! boolean x_st; real x; )
   safe
   spec (| b --> c    %precedence refinement%
         | {b, c} --> {x , x_st}
         | b ˆ= c
         | x ˆ= when (b/=0.0)
         | x_st ˆ= when (b=0.0) |)
   external "C" ;

Example: Legacy code and SIGNAL abstraction for FirstDegree

Syntactic sugar is made available in SIGNAL to easily handle usual functions. For instance, the arithmetic C function abs used in rac can be declared as function abs = (? real x; ! real s;); function means that the process is safe, its input and output signals are synchronous, and every input precedes every output.

1.4.7 Clustering

The DCG (Sect. 1.4.5), with its CH (Sect. 1.4.3.1) and its CPG (Sect. 1.4.4), is the basis of various techniques applied to structuring the generated code whilst preserving the semantics of the SIGNAL process, thanks to formal properties of the SIGNAL language (Sect. 1.2.6). We give in this section a brief description of several important techniques that are illustrated in Sect. 1.5.

1.4.7.1 Data Flow Execution

The pure data flow code generation provides the maximal parallel execution. If we consider the FirstDegree example, we can see that some elementary equations are not pure flow functions, and then the result of their pure data flow execution is not deterministic. This is the case for c1 := c when (b/=0.0): if an occurrence of c has arrived and none of b, b may be (wrongly) assumed absent. A simple solution to this problem is to consider only endochronous nodes as candidates for data flow execution. One can get this structure by duplicating the synchronization b ˆ= c. The resulting code for FirstDegree, in which each line is a pure flow function, is then:

(| b1 := b when (b/=0.0)
 | (| b ˆ= c | c1 := c when (b/=0.0) |)
 | x := -(c1/b1)
 | (| b ˆ= c | x_st := (c/=0.0) when (b=0.0) |)
 |)

Example: SIGNAL code with endochronous nodes

This example illustrates a general approach, using properties of the SIGNAL language (Sect. 1.2.6) to get and manage endochronous subprocesses as sources for future transformations. Building primitive endochronous nodes does not change the semantics of the initial process.

1.4.7.2 Data Clustering

Data clustering is an operation that leaves the semantics of a process unchanged. It consists in splitting processes on various criteria thanks to commutativity, associativity and other properties of process operators (Sect. 1.2.6). It is thus possible to isolate:

– The management of state variables (defined, directly or not, as delayed signals) in a "State variables" process (required synchronizations are associated with those variables)
– The management of local shared variables used instead of signal communications between subprocesses in "Local variables" blocks
– The computation of specific data types towards multicore heterogeneous architectures

1.4.7.3 Phylum

Atomic nodes can be defined w.r.t. the following criterion: when two nodes both depend on (are transitively preceded by) the same set of input variables, they can both be executed only after these inputs are available. This criterion defines an equivalence relation whose classes we name phylums: for a set of input variables A, we name phylum of A the set of nodes that are preceded by A and only A. It results from this definition that if a phylum P1 is the phylum of A1, and P2 that of A2, then P1 precedes P2 iff A1 ⊆ A2.

As an example, the FirstDegree function (Example p. 22) might have four phylums: the phylum of the empty set of inputs, which is empty; the phylum of b, which contains b1 := b when (b/=0.0); the phylum of c, which is empty; and finally the phylum of {b,c}, which contains the three remaining nodes. Due to its properties, an endochronous process can be split into a network of endochronous phylums that can be executed atomically (as function calls for instance).

1.4.7.4 From Cluster to Scheduling of Actions

Using data clustering and phylums, a SIGNAL process can be transformed into an equivalent process whose local communications are made invisible for a user context. This transformation is illustrated with the FirstDegree process; two phylums are created: Phylum_B is the phylum of b and Phylum_BC is the phylum of {b,c}; two labels L_PH_B and L_PH_BC are created to define the "activation clock" and the precedence related to these phylums. One can notice that this SIGNAL code is close to a low level imperative code.

process FirstDegree = ( ? real b, c; ! boolean x_st; real x; )
   (| b ˆ= c ˆ= b1 ˆ= bb ˆ= L_PH_B ˆ= L_PH_BC
    | L_PH_B :: Phylum_B(b)
    | L_PH_BC :: (x_st, x) := Phylum_BC(c)
    | L_PH_B --> L_PH_BC
    |) where shared real b1, bb;
      process Phylum_B = ( ? real b; )
         (| bb ::= b
          | b1 ::= b when (b/=0.0) |)
      process Phylum_BC = ( ? real c; ! boolean x_st; real x;)
         (| c1 := c when (b/=0.0)
          | x := -(c1/b1)
          | x_st := (c/=0.0) when (bb=0.0) |) where real c1; end
   end

Example: Labels and actions in FirstDegree SIGNAL process

A grey box is an abstraction of such a clustered process: the grey box of a process contains a specification that describes synchronization and precedence over signals and labels. The phylums synchronized by these labels are declared as abstract processes. The grey box of a process P can be imported in another process PP. The global scheduling generated for PP includes the local scheduling of P. In the case of object-oriented code generation, this inclusion can be achieved by inheritance and redefinition of the scheduling, the methods associated with abstract processes remaining unchanged. The grey box associated with FirstDegree can be found below.


process FirstDegree = ( ? real b, c; ! boolean x_st; real x; )   %grey box%
   safe
   spec (| b ˆ= c ˆ= L_PH_B ˆ= L_PH_BC
         | L_PH_B :: Phylum_B(b)
         | L_PH_BC :: (x_st, x) := Phylum_BC(c)
         | b --> L_PH_B --> c --> L_PH_BC --> {x , x_st}
         | x ˆ= when (b/=0.0)
         | x_st ˆ= when (b=0.0) |)
   where process Phylum_B = ( ? real b; ) external;
         process Phylum_BC = ( ? real c; ! boolean x_st; real x;) external;
   end
   external ;

Example: Grey box abstraction for FirstDegree SIGNAL process

1.4.8 Building Containers

The construction of containers previously proposed to build endochronous processes (Sect. 1.4.3.2) is used to embed a given process in various execution contexts. The designer can for instance define input/output functions in some low level language and import their description in a SIGNAL process, providing to the interfaced process an abstraction of the operating system; this is illustrated below for the FirstDegree process:

process EmbeddedFirstDegree = ( )
   (| (| L_SCAN :: (b,c) := scan() | b ˆ= c ˆ= L_SCAN |)
    | (x_st, x) := FirstDegree(b,c)
    | (| L_EA :: emitAlarm() | L_EA ˆ= x_st |)
    | (| L_PRINT :: print(x) | L_PRINT ˆ= x |)
    |) where real b, c, x; boolean x_st;
      process scan = ( ! real b, c; )
         spec (| b ˆ= c |)
      process emitAlarm()
      process print = ( ? real x )
      process FirstDegree = ( ? real b, c; ! boolean x_st; real x; )
         %description of FirstDegree%
   end

Example: Container for FirstDegree SIGNAL process

The statement (b,c) := scan(), synchronized to L_SCAN, delivers the new values of b and c each time it is activated. The statement print(x) prints the result when it is present. The statement emitAlarm() is synchronized with x_st, thus an alarm is emitted each time the equation does not have a unique solution.

The construction of containers can be used to execute processes in various contexts, including resynchronization of asynchronous communications, thanks to the var operator (Sect. 1.2.5) or, more generally, to bounded FIFOs, which can be built in SIGNAL.


1.5 Code Generation in POLYCHRONY Toolset

Section 1.4 introduces the compilation of a SIGNAL process as a set of transformations. In this section, we describe how code can be generated, as the final step of such a sequence of transformations, following different schemes. When a process P is composed of interconnected endochronous subprocesses, free of clock constraints, it is a pure flow function (a KPN). One could then generate code for P following the Kahn semantics. Nevertheless, the execution of the generated code may deadlock. If this KPN process is also acyclic then deadlock cannot occur: the code generation functionalities of POLYCHRONY can be applied. The code is generated for different target languages (C, C++, Java) on different architectures, preserving various semantic properties (at least the defined pure flow function). However, it is possible to produce reactive code or defensive code when the graph is acyclic but there are remaining clock constraints. In these modes, all input configurations are accepted. For reactive code, inputs that satisfy the constraints are selected; for defensive code, alarms are emitted when a constraint is violated during the simulation.

1.5.1 Code Generation Principle

The code generation is based on the formal transformations presented in the previous sections. It is strongly guided by the clock hierarchy resulting from the clock calculus to structure the target language program, and by the conditioned precedence graph not only to locally order elementary operations in sequences, but also to schedule component activations in a hierarchical target code. The code generation follows more or less the structure presented in Fig. 1.5. The "step block" contains a step scheduler that drives the execution of the step component and updates the state variables (corresponding to delays). The step component may be hierarchically decomposed as a set of sub-components (clusters), scheduled by the step scheduler.

[Fig. 1.5 Code generation general scheme: a MAIN program controls a (hierarchical) step component made of a step scheduler, state variables (e.g., zx := x$1), local variables (e.g., x ::= E) and components C1, ..., Ci, ..., Cn; each component Ci has, in turn, its own local step scheduler, local state variables, local variables and sub-components; the step component exchanges data with the environment through the IO container]


Each sub-component has, in the same way, its own local step scheduler. The step block communicates with its environment through the IO container and it is controlled by a main program.

Target language code is generated in different files. For example, for C code generation we have, for a process P, a main program P_main.c, a program body P_body.c (that contains the step block) and an input-output module P_io.c (the IO container). The main program calls the initialization function defined in the program body, then keeps calling the step function. The IO container defines the input and output communications of the program with the operating system.

Each component of the target code (except the main program, which is an explicit loop) may be seen as a SIGNAL process. Every such component, generated in C for instance, may be abstracted in SIGNAL for reuse in an embedding SIGNAL process. When the target language is an object-oriented language, a class is generated for each component. This class can be specialized to fit new synchronizations resulting from embedding the original process in a new context process.

1.5.1.1 Step Function

Once the program and its interface are initialized, the step function is responsible for performing the execution steps that read data from input streams, calculate, and write results along output streams. There are many ways to implement this function starting from the clock hierarchy and conditioned precedence graph produced by the front-end of the compiler. Various code generation schemes [21, 24, 25] are implemented in the POLYCHRONY toolset. They are detailed in the subsequent sections on the solver example:

• Global code generation
  – Sequential (Sect. 1.5.2)
  – Clustered with static scheduling (Sect. 1.5.3)
  – Clustered with dynamic scheduling (Sect. 1.5.4)
• Modular code generation
  – Monolithic (Sect. 1.5.5.1)
  – Clustered (Sect. 1.5.5.2)
• Distributed code generation (Sect. 1.5.6)

1.5.1.2 IO Container

If the process contains input and/or output signals (the designer did not build his or her own IO container), the communication of the generated program with the execution environment is implemented in the IO container. In the simulation code generator, each input or output signal is interfaced with the operating system by a stream connected to a file containing input data and collecting output data.


The IO container (Example p. 27) declares global functions for opening (eqSolve_OpenIO) and closing (eqSolve_CloseIO) all files, and for reading (r_eqSolve_a) and writing (w_eqSolve_x1) data along all input and output signals.

void eqSolve_OpenIO()
{ fra = fopen("Ra.dat","rt");
  if (!fra) {
    fprintf(stderr, "Can't open %s\n", "Ra.dat");
    exit(1); }
  fwx1 = fopen("Wx1.dat","wt");
  if (!fwx1) {
    fprintf(stderr, "Can't open %s\n", "Wx1.dat");
    exit(1); }
  /* ... idem for b, c, x2, x_st */ }

void eqSolve_CloseIO()
{ fclose(fra);
  ...
  fclose(fwx1); }

int r_eqSolve_a(float *a)
{ return (fscanf(fra,"%f",a) != EOF); }

void w_eqSolve_x1(float x1)
{ fprintf(fwx1,"%f ",x1);
  fprintf(fwx1,"\n"); fflush(fwx1); }
/* ... idem for b, c, x2, x_st */

Example: An extract of the C code generated for the solver: the IO container

The IO container is the place where the interface of the generated code with an external visualization and simulation tool can be implemented. The POLYCHRONY toolset supports default communication functions for various operating systems and middlewares, which can be modified or replaced by the user. The r_xx_yy functions return an error status that should be removed in embedded programs.

1.5.1.3 Main Program

The main program (Example p. 27) initializes the input/output files in eqSolve_OpenIO and the state variables in eqSolve_initialize, and iterates calls to the step function eqSolve_step. In the case of simulation code, the infinite loop can be stopped if the step function returns error code 0 (meaning that input streams are empty) and the main program will close the communication (eqSolve_CloseIO).

extern int main()
{ int code;
  eqSolve_OpenIO();                   /* input/output initializing */
  code = eqSolve_initialize();        /* initializing the state variables */
  while(code) code = eqSolve_step();  /* the steps */
  eqSolve_CloseIO(); }                /* input/output finalizing */

Example: Generated C code of the solver: the main program

1.5.2 Sequential Code Generation

This section describes the basic, sequential, inlining code generation scheme that directly interprets the SIGNAL process obtained after clock hierarchization. This description is illustrated by the eqSolve process. Figure 1.6 contains an extract of the clock hierarchy resulting from the clock calculus applied to the solver.


[Fig. 1.6 Clock hierarchy of the solver process: the root carries stable and {R_n, ...}; under [stable]: {a, b, c, C_delta, C_, C_x_st, ...}; under [C_delta]: {delta, ...}; under [C_]: {...}; under [C_x_st]: {x_st}; with C_delta = not (a=0) and C_ = (a=0) %not C_delta%]

Example p. 28 shows the C code generated for this solver. The code of the step block is structured according to the clock hierarchy. The precedence relation is the source of local deviations.

Let us give some details about the clock hierarchy, the interface and the state variables of this example. In the eqSolve_step function, an original SIGNAL identifier xxx has a Boolean clock named C_xxx. The master clock, which is the clock of the stable variable, ticks every time the step function is called. Inputs are read as soon as their clock is evaluated and is true (see for example r_eqSolve_a(&a), called when the signal stable is true). Outputs are sent as soon as they are evaluated (see for example w_eqSolve_x_st(x_st), called when the signal C_x_st is true).

static float a, b, c;
static int x_st;
...

int eqSolve_initialize()
{ stable = 1;
  S_n = 0.0e0;
  next_R_n = 1.0;
  XZX_162 = 0.0e0;
  U = 0.0e0;
  eqSolve_step_initialize();
  return 1; }

void eqSolve_step_initialize()
{ C_ = 0;
  C_delta = 0;
  C_b1 = 0;
  C_x1_1 = 0;
  C__250 = 0; }

int eqSolve_step_finalize()
{ stable = next_stable;
  C_x_st_1 = 0;
  C_231 = 0;
  eqSolve_step_initialize();
  return 1; }

int eqSolve_step()
{ if (stable) {
    if (!r_eqSolve_a(&a)) return 0;
    if (!r_eqSolve_b(&b)) return 0;
    if (!r_eqSolve_c(&c)) return 0;
    C_ = a == 0.0;
    C_delta = !(a == 0.0);
    if (C_delta) {
      delta = b * b - (4.0*a)*c;
      C_231 = delta < 0.0;
      C_x1_1 = delta == 0.0;
      C__250 = delta > 0.0;
      if (C_x1_1) x1_1 = -b/(2.0*a); }
    C_234 = (C_delta ? C_231 : 0);
    if (C_) {
      C_x_st_1 = b == 0.0;
      C_b1 = !(b == 0.0);
      if (C_x_st_1) x_st_1 = c != 0.0; }
    C_x_st_1_220 = (C_ ? C_x_st_1 : 0);
    C_x_st = C_x_st_1_220 || C_234;
    if (C_x_st) {
      if (C_234) x_st=1; else x_st=x_st_1;
      w_eqSolve_x_st(x_st); }
  }
  /* ... */
  eqSolve_step_finalize();
  return 1; }

Example: Generated C code of the solver: the step block


The state variables are updated at the end of the step (eqSolve_step_finalize). One can notice the tree structure of conditional if-then-else statements which directly translates the clock hierarchy. For instance, the computation of x1_1 is executed only if stable is true and a/=0 (C_delta is true) and delta=0 (C_x1_1 is true). One can also notice the precedence refinement as presented in Sect. 1.4.4.2: to generate sequential code, it is usually necessary to add serializations. To illustrate this, consider the abstraction, reduced to the precedence relations, of the solver (Example p. 29-a). To generate the sequential code, reading actions are ordered: a precedes b and b precedes c. This refinement is expressed in SIGNAL in the abstraction given in Example p. 29-b. Internal actions are also serialized when required. Thus the generated code is a specialization of the original process.

process eqSolve_ABSTRACT =
   ( ? real a, b, c;
     ! boolean x_st; real x1, x2; )
   spec (| {a,b,c} --> {x_st,x1,x2}
         | %clocks ... %
         |)

a - From the original process

   spec (| (| a --> b | b --> c
           | x_st --> x2 | x2 --> x1 |)
         | {a,b,c} --> {x_st,x1,x2}
         | %clocks ... %
         |)

b - Precedence refinement

Example: Abstractions (reduced to precedence relations) of the solver

Note that in this code generation scheme, the scheduling and the computations are merged.

1.5.3 Clustered Code Generation with Static Scheduling

The scheme presented here uses the result of clustering in phylums to generate code. This method is particularly relevant in code generation scenarios such as modular compilation and distribution.

Figure 1.7 displays the clusters obtained for the eqSolve process. Since there are three inputs a, b, c, the clustering in phylums can lead to, at most, 2³ = 8 clusters, plus one for state variables. Fortunately, clustering usually remains far from this worst combinatorial case. In the case of the solver, five clusters are non-empty. The clusters are subject to inter-cluster precedences that appear in the figure as solid arrows.

[Fig. 1.7 The phylums of the solver: inputs a, b, c feed the clusters Cluster_1, Cluster_2, Cluster_3 and Cluster_4, the phylums of the input sets {a}, {a,b}, {a,c} and {a,b,c}, plus Cluster_delays; outputs are x_st, x1, x2]


Serializations, represented as dotted arrows, are added for static scheduling. Cluster_delays is the "State variables" component (Fig. 1.5), in charge of updating state variables. It is preceded by all other clusters.

Example p. 30 presents the code generated for this structuring of the solver into five phylums. The function eqSolve_step encodes a static scheduler for these clusters.

int eqSolve_step()
{ if (stable) {
    if (!r_eqSolve_a(&a)) return 0;
    if (!r_eqSolve_b(&b)) return 0;
    if (!r_eqSolve_c(&c)) return 0;
  }
  eqSolve_Cluster_4();
  eqSolve_Cluster_3();
  if (stable) eqSolve_Cluster_2();
  eqSolve_Cluster_1();
  if (stable) if (C_x_st) w_eqSolve_x_st(x_st);
  if (C_x1_2) w_eqSolve_x2(x2);
  if (C_x1) w_eqSolve_x1(x1);
  eqSolve_Cluster_delays();
  return 1; }

Example: Generated C code of the solver: statically scheduled clusters

In contrast to the previous code generation method, which globally relies on the clock hierarchy and locally relies (for each equivalence class of the clock hierarchy) on the conditioned precedence graph, clustering globally relies on the conditioned precedence graph and locally relies (for each cluster of the graph) on the clock hierarchy.

Note. In general, any clustering is suitable with respect to some arbitrary criterion, provided that each cluster remains insensitive to communication delays or operator latencies. Monolithic clustering (one cluster for the whole process) minimizes scheduling overhead but has poor concurrency. The finest clustering (one cluster per signal) maximizes concurrency but unfortunately also the scheduling overhead. Heuristics are used in POLYCHRONY to limit the exploration cost.

1.5.4 Clustered Code Generation with Dynamic Scheduling

Clustered code generation can be used for multi-threaded simulation by equipping it with dynamic scheduling artifacts. The code of a cluster is encapsulated in a component implemented as a task. A component is structured as outlined in Fig. 1.8a: each component Ti has a semaphore Si used to manage the precedence relation between the components. Each task Ti starts by waiting on its Mi predecessors with wait(Si) statements. Each task Ti ends by signaling all its successors j = 1, ..., Ni with signal(Sij). Moreover, one component is generated for each input-output function. The step scheduler (Fig. 1.8b) is implemented by a particular task T0.


void * P_Ti() {            /* task Ti */
  while (true) {
    pK_Isem_wait(Si);
    ...
    pK_Isem_wait(Si);
    P_Cluster_i();
    pK_Isem_signal(Si1);
    ...
    pK_Isem_signal(SiNi);
  }
}

a - A step component

void * P_step_Task() {     /* task T0 */
  while (true) {
    /* signal to the clusters without predecessors */
    pK_Isem_signal(S01);
    ...
    pK_Isem_signal(S0N0);
    /* wait for the signal of the clusters without successors */
    pK_Isem_wait(S0);
    ...
    pK_Isem_wait(S0);
  }
}

b - The step scheduler

Fig. 1.8 Code template of a cluster task and of the step function

Task T0 starts execution by signaling all source tasks j = 1, ..., N0 with signal(S0j) and ends by waiting on its sink tasks with wait(S0) statements. The semaphores and tasks are created in the P_initialize function of the generated code. When simulation code is generated, a P_terminate task is also added to kill all tasks of the application.
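To illustrate this wait/signal pattern outside the POLYCHRONY runtime, here is a self-contained C sketch using POSIX threads and semaphores for two clusters where cluster 1 precedes cluster 2 (all names are invented for the example; the pK_Isem_* primitives are simply replaced by sem_wait and sem_post):

  /* compile with: cc -pthread sketch.c */
  #include <pthread.h>
  #include <semaphore.h>
  #include <stdio.h>

  static sem_t s1, s2, s_done;          /* one semaphore per task + end-of-step */

  static void cluster_1(void) { puts("cluster 1"); }
  static void cluster_2(void) { puts("cluster 2"); }

  static void *task_1(void *arg) {      /* no predecessor: waits on the scheduler */
    (void)arg;
    for (;;) { sem_wait(&s1); cluster_1(); sem_post(&s2); }
    return NULL;
  }
  static void *task_2(void *arg) {      /* waits for cluster 1, signals end of step */
    (void)arg;
    for (;;) { sem_wait(&s2); cluster_2(); sem_post(&s_done); }
    return NULL;
  }

  int main(void) {
    pthread_t t1, t2;
    sem_init(&s1, 0, 0); sem_init(&s2, 0, 0); sem_init(&s_done, 0, 0);
    pthread_create(&t1, NULL, task_1, NULL);
    pthread_create(&t2, NULL, task_2, NULL);
    for (int step = 0; step < 3; step++) {   /* the step scheduler (task T0) */
      sem_post(&s1);                         /* signal the source clusters   */
      sem_wait(&s_done);                     /* wait for the sink clusters   */
    }
    return 0;                                /* worker tasks are simply abandoned */
  }

Each step of the driver releases the source task, the precedence between the two cluster tasks is enforced by the intermediate semaphore, and the driver resumes only once the sink task has signaled the end of the step.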

1.5.5 Modular Code Generation

Modular compilation consists of compiling a process, then exporting its model, and using it in another process. For the purpose of exporting and importing the model of a process whose code has been compiled separately, the POLYCHRONY SIGNAL compiler provides an annotation mechanism to associate a compiled process with a profile (Sect. 1.4.6).

This profile consists of an abstraction of the original process model consisting of the synchronization and precedence relations of the input and output signals of the process. These properties may be provided by the designer (in the case of legacy C code, for instance) or calculated by the compiler (in the case of a compiled SIGNAL process). The annotations also give the possibility to specify the language in which the process is compiled (C, C++, Java), since the function call conventions and variable binding may vary slightly from one language to another.

Starting from a SIGNAL process, modular compilation supports the code generation strategies, with or without clusters, outlined previously. Considering the specification of the solver, we detail possible compilation scenarios in which the sub-processes FirstDegree and SecondDegree are compiled separately. The process SecondDegree has been adapted to the context of modular compilation: it does not assume that the values of a are different from 0. The clock of the signal stable is the master clock of the SecondDegree process.


1.5.5.1 Sequential Code Generation for Modular Compilation

A naive compilation method consists in associating each of the sub-processes of the solver with a monolithic step function. Example p. 32 gives the SIGNAL abstraction (black box abstraction) that is inferred – or could be user-provided – in order to use the SecondDegree process as an external function.

process SecondDegree_ABSTRACT =
   ( ? real a, b, c;
     ! boolean x_st; real x21, x2;
       boolean stable, C_x1_2, C_delta, C_x_st, C_x21; )
   spec (| (| stable --> {a,b,c}
           | ... |)
         | (| stable ˆ= C_x1_2 ˆ= C_x21
           | ... |) |)
   pragmas BlackBox "SecondDegree" end pragmas
   external "C";

Example: Black box abstractions of the FirstDegree and SecondDegree process models

The interface of the process SecondDegree has been modified: it is necessary to export the Boolean signals that are used to define the clocks of the output signals. Then the interface of SecondDegree is endochronous. The process eqSolve_bb, Example p. 32, is the SIGNAL process of the solver in which the separately compiled processes FirstDegree (the abstraction and generated code of which can be found in Example p. 21) and SecondDegree are called.

process eqSolve_bb = {real epsilon}
  ( ? real a, b, c; ! boolean x_st; real x1, x2; )
  (| a ^= b ^= c ^= when stable
   | (x_st_1, x11) := FirstDegree_ABSTRACT(b when (a=0.0), c when (a=0.0))
   | (x_st_2, x21, x2, stable, C_x1_2, C_delta, C_x_st2, C_x21)
       := SecondDegree_ABSTRACT(a, b, c)
   | x1 := x11 default x21
   | x_st := when x_st_2 default x_st_1
   |) where ...

Example: Importing separately compiled processes in the specification of the solver

Unfortunately, the code from which SecondDegree_ABSTRACT is an abstraction is sequential code. And it turns out that the added precedence relations, put in the context of process eqSolve, generate a cycle that is reported by the compiler (Example p. 32): the abstraction of SecondDegree exhibits the precedence stable --> a, and the function call to SecondDegree_ABSTRACT implies that the input signal a precedes the output signal stable, hence the cycle. Code generation cannot proceed further.

process eqSolve_bb_CYC = ( )
  (| (x_st_2, x21, x2, stable, C_x1_2, C_delta, C_x_st2, C_x21)
       := SecondDegree_ABSTRACT(a, b, c)
   | stable --> a |)

Example: Cycle in the solver displayed by the compiler


1.5.5.2 Clustered Code Generation for Modular Compilation

To avoid introducing spurious cycles, it is better to apply clustered code generation techniques. The profile of a clustered and separately compiled process (grey box abstraction), Example p. 33, provides more information than a black box abstraction: it makes the clusters apparent and details the clock and precedence relations between them. In turn, this profile contains sufficient information to schedule the clusters and, hence, call them in the appropriate order dictated by the calling context.

process SecondDegree_ABSTRACT =
  ( ? real aa, bb, cc;
    ! boolean x_st; real x21, x2;
      boolean stable, C_x1_2, C_delta, C_x_st, C_x21; )
  pragmas
    GreyBox "SecondDegree"
  end pragmas
  (| (| Tick := true
      | when Tick ^= stable ^= C_x1_2 ^= C_x21
      | when stable ^= aa ^= bb ^= cc ^= C_delta
      | when C_x1_2 ^= x2 | when C_delta ^= C_x_st
      | when C_x_st ^= x_st | when C_x21 ^= x21 |)
   | (| lab :: (x_st, x21, x2, C_x1_2, C_x_st, C_x21) := SecondDegree_Cluster_1()
      | lab ^= when Tick |)
   | (| lab_1 :: C_delta := SecondDegree_Cluster_2(aa) | lab_1 ^= when Tick |)
   | (| lab_2 :: SecondDegree_Cluster_3(bb) | lab_2 ^= when stable |)
   | (| lab_3 :: SecondDegree_Cluster_4(cc) | lab_3 ^= when stable |)
   | (| lab_4 :: stable := SecondDegree_Cluster_delays() | lab_4 ^= when Tick |)
   | (| cc --> lab | bb --> lab | aa --> lab | lab_1 --> lab_3 | lab_1 --> lab_2
      | lab_1 --> lab | aa --> lab_1 | lab_2 --> lab | bb --> lab_2
      | aa --> lab_2 | lab_3 --> lab | cc --> lab_3 | aa --> lab_3
      | lab_3 --> lab_4 | lab_2 --> lab_4 | lab_1 --> lab_4 | lab --> lab_4 |)
   |) where %Declarations of the clusters% end;

Example: The grey box abstraction of the SecondDegree process model

The calling process schedules the generated clusters in the order best appropriate to its local context, as shown in Example p. 34. The step associated with FirstDegree and the clusters of the separately compiled process SecondDegree are called in the very order dictated by the scheduling constraints of the solver process, avoiding the introduction of any spurious cycle.

extern int eqSolve_gb_step() {
  if (stable) {
    if (!r_eqSolve_gb_a(&a)) return 0;
    if (!r_eqSolve_gb_b(&b)) return 0;
    if (!r_eqSolve_gb_c(&c)) return 0;
    C_ = a == 0.0;
    if (C_) {
      FirstDegree_step(&FirstDegree1, b, c, &x_st_1, &x11);
      C_x11 = !(b == 0.0);
    }
    C_x_st_1_227 = (C_ ? (b == 0.0) : 0);
  }
  SecondDegree_Cluster_2(&SecondDegree2, a, &C_delta);
  C_x11_236 = (C_ ? C_x11 : 0);
  if (stable) {
    SecondDegree_Cluster_3(&SecondDegree2, b);
    SecondDegree_Cluster_4(&SecondDegree2, c);
  }
  SecondDegree_Cluster_1(&SecondDegree2, &x_st_2, &x21, &x2, &C_x1_2, &C_x_st2, &C_x21);
  ...
  if (C_x1_2) w_eqSolve_gb_x2(x2);
  if (C_x1) {
    if (C_x11_236) x1 = x11; else x1 = x21;
    w_eqSolve_gb_x1(x1);
  }
  eqSolve_gb_step_finalize();
  return 1;
}

Example: Generated C code for the first and second degree clusters steps

1.5.6 Distributed Code Generation

Distributed code generation in POLYCHRONY follows the same principles as dynamically scheduled clustered code generation (Sect. 1.5.4). The final embedded application implemented on a distributed architecture can be represented as in Fig. 1.9.

While clustered code generation is automatic, distribution requires additional information provided by the user, namely:

• A block-diagram and topological description of the target architecture
• A mapping of software diagrams onto the target architecture blocks

Automated distribution consists of a global compilation script that proceeds with the following steps:

1. DCG computation for the main process structure (Sect. 1.4.5)
2. Booleanization: extension of event signals to Boolean signals according to the clock hierarchy
3. Clustering of the main process according to the given target architecture and mapping (Sect. 1.5.6.1)
4. Endochronization of each cluster (Sect. 1.4.3.2); this is done by importing/exporting Boolean signals between clusters, according to the global DCG

Fig. 1.9 Overview of a distributed code: each location runs a main program containing a (hierarchical) step function, a step scheduler, state and local variables, and an IO container; the locations are connected through a communication support


5. Addition of communication information (Sect. 1.5.6.2)
6. Individual compilation of all clusters
7. Generation of the IO container (Sect. 1.5.6.3) and the global main program

1.5.6.1 Topological Annotations

The distribution methodology is applied in Example p. 35. The user defines the mapping of the elementary components over the target architecture with a few pragmas. The pragma RunOn specifies the processor on which a set of components has to be located. For example, the expression RunOn {e1} "1" specifies that the component labelled with e1 is mapped onto location 1 (e1 is declared in the statement e1 :: PROC1{} as being the label of the component consisting of the subprocess PROC1).

process eqSolve = {real epsilon}
  ( ? real a, b, c;
    ! boolean x_st;
      real x1, x2; )
  pragmas
    Topology {b,a} "1"
    Topology {c} "2"
    Topology {x1,x2} "2"
    Topology {x_st} "1"
    Target "MPI"
    RunOn {e1} "1"
    RunOn {e2} "2"
  end pragmas
  (| e1 :: PROC1{}
   | e2 :: PROC2{} |)

process PROC1 =
  ( ? real a, b, c, x11;
      boolean x_st_1;
    ! boolean stable;
      real x2, x1;
      boolean x_st; )
  (| (x_st_2, x21, x2, stable) := SecondDegree{.}(...)
   | x1 := x11 default x21
   | x_st := x_st_2 default x_st_1
   |);

process PROC2 =
  ( ? boolean stable;
      real b, c;
    ! real x11;
      boolean x_st_1; )
  (| a ^= b ^= c ^= when stable
   | (x_st_1, x11) := FirstDegree(...)
   |)

Example: Functional clustering of the solver

The pragma Topology associates the input and output signals of the process with a location. For example, the pragma Topology {b,a} "1" tells that the input signals a and b must be read on location 1. The pragma Target specifies the API used to generate code that implements communication: for instance, Target "MPI" tells that the MPI library is used for that purpose.

1.5.6.2 Communication Annotations

Example p. 36 displays the process transformation resulting from the clustering requested by the user. A few signals have been added to the interface of each component: signals and clocks produced on one part of the system and used on the other one have to be communicated.


process eqSolve_EXTRACT_1 =
  ( ? boolean x_st_1;
      real x11, c, b, a;
      boolean C_b1, C_x_st_1_490, C_, C_x_st_1;
    ! boolean x_st;
      real x1, x2;
      boolean stable; )
  pragmas
    RunOn "1"
    Environment {c} "1"
    Environment {b} "3"
    Environment {a} "5"
    Environment {x_st} "7"
    Environment {x1} "8"
    Environment {x2} "9"
    Sending {stable} "10" "eqSolve_EXTRACT_2"
    Receiving {x_st_1} "11" "eqSolve_EXTRACT_2"
    Receiving {x11} "12" "eqSolve_EXTRACT_2"
    Receiving {C_} "13" "eqSolve_EXTRACT_2"
    Receiving {C_x_st_1} "14" "eqSolve_EXTRACT_2"
    Receiving {C_x_st_1_490} "15" "eqSolve_EXTRACT_2"
    Receiving {C_b1} "16" "eqSolve_EXTRACT_2"
  end pragmas;
  ...

process eqSolve_EXTRACT_2 =
  ( ? real c, b, a;
      boolean stable;
    ! boolean x_st_1;
      real x11;
      boolean C_, C_x_st_1, C_x_st_1_490, C_b1; )
  pragmas
    RunOn "2"
    Environment {c} "2"
    Environment {b} "4"
    Environment {a} "6"
    Receiving {stable} "10" "eqSolve_EXTRACT_1"
    Sending {x_st_1} "11" "eqSolve_EXTRACT_1"
    Sending {x11} "12" "eqSolve_EXTRACT_1"
    Sending {C_} "13" "eqSolve_EXTRACT_1"
    Sending {C_x_st_1} "14" "eqSolve_EXTRACT_1"
    Sending {C_x_st_1_490} "15" "eqSolve_EXTRACT_1"
    Sending {C_b1} "16" "eqSolve_EXTRACT_1"
  end pragmas

Example: The components annotated by the compiler

Required communications are automatically added using additional pragmas. The pragma Environment associates an input or output signal with the location of a communication channel. For instance, Environment {c} "1" means that signal c is communicated along channel 1. The pragma Receiving associates an input signal with a channel location and its sender process. For the signal x11 to be received by process eqSolve_EXTRACT_1 along channel 12 from process eqSolve_EXTRACT_2, the following pragma is written: Receiving {x11} "12" "eqSolve_EXTRACT_2". Similarly, the pragma Sending associates an output signal with a channel location and its receiving processes. For example, the output signal stable of process eqSolve_EXTRACT_1 is sent along channel 10 to process eqSolve_EXTRACT_2 by: Sending {stable} "10" "eqSolve_EXTRACT_2".

1.5.6.3 IO Code Generation

Multi-threaded, dynamically scheduled code generation, as described in Sect. 1.5.4, is applied on the process resulting from the transformations performed for automated distribution. The information carried by the pragmas Environment, Receiving and Sending is used to generate communications. The communications of the signal C_ (Example p. 36), between the sender eqSolve_EXTRACT_2 and the receiver eqSolve_EXTRACT_1, are implemented using the MPI library (Example p. 37).

(From the file eqSolve_EXTRACT_1_io.c)
int r_eqSolve_EXTRACT_1_C_(int *C_) {
  MPI_Recv(C_,                  /* name */
           1, MPI_INT,          /* type */
           eqSolve_EXTRACT_2,   /* received from */
           13,                  /* the logical tag of the channel */
           MPI_COMM_WORLD,      /* MPI specific parameter */
           MPI_STATUS_IGNORE);  /* MPI specific parameter */
  return 1;
}

(From the file eqSolve_EXTRACT_2_io.c)
void w_eqSolve_EXTRACT_2_C_(int C_) {
  MPI_Send(&C_,                 /* name */
           1, MPI_INT,          /* type */
           eqSolve_EXTRACT_1,   /* sent to */
           13,                  /* the logical tag of the channel */
           MPI_COMM_WORLD);     /* MPI specific parameter */
}

Example: Example of communications

1.6 Conclusion

The POLYCHRONY workbench is an integrated development environment and technology demonstrator consisting of a compiler (a set of services for, e.g., program transformations, optimizations, formal verification, abstraction, separate compilation, mapping, code generation, simulation, temporal profiling, etc.), a visual editor and a model checker. It provides a unified model-driven environment to perform embedded system design exploration by using top-down and bottom-up design methodologies formally supported by design model transformations from specification to implementation and from synchrony to asynchrony.

In order to bring the synchronous multi-clock technology into the context of model-driven environments, a metamodel of SIGNAL has been defined and an Eclipse plugin for POLYCHRONY is being integrated in the open-source platforms TopCased from Airbus [26] and OpenEmbeDD [27]. The POLYCHRONY workbench is now freely distributed [6].

In parallel with the POLYCHRONY academic set of tools, an industrial implementation of the SIGNAL language, called SILDEX, was developed by the TNI company, now part of Dassault Systèmes. This commercial toolset, now called RT-BUILDER, is supplied by Geensoft [28].

POLYCHRONY supports the polychronous data flow specification language SIGNAL. It is being extended by plugins to capture within the workbench specific modules written in usual software programming languages such as SystemC or Java. It provides a formal framework:

1. To validate a design at different levels
2. To refine descriptions in a top-down approach
3. To abstract properties needed for black-box composition
4. To assemble predefined components (bottom-up with COTS)

To reach these objectives, POLYCHRONY offers services for modeling application programs and architectures starting from high-level and heterogeneous input notations and formalisms. These models are imported in POLYCHRONY using the data flow notation SIGNAL. POLYCHRONY operates on these models by performing global transformations and optimizations (hierarchization of control, desynchronization protocol synthesis, separate compilation, clustering, abstraction) in order to deploy them on mission specific target architectures.

In this chapter, we meant to illustrate the application of a general principle: that of correct-by-construction design of systems, from the early stages of the design to the code generation phases on a given architecture. This is obtained by means of formally defined transformations, based on the mathematical polychrony model of computation, that may be expressed as source-to-source program transformations. This has several beneficial consequences for practical usability. In particular, scenarios of transformations can be fully controlled by an application designer. Among possible scenarios, a designer will have at his or her disposal predefined ones allowing, for instance, simulation of the application following different options, or safe code generation. This transformation-based mechanism makes it possible to formally validate the final result of a compilation. Moreover, it provides a suitable level of consideration for traceability purposes.

Source-to-source transformation of programs is also used for temporal analysis of SIGNAL processes on their implementation platform [29]. Basically, it consists of a formal transformation of a process into another SIGNAL process that corresponds to a so-called temporal interpretation of the initial process. The temporal interpretation is characterized by quantitative properties of the implementation architecture. The new process can serve as an observer of the initial one.

Here, we focused more particularly on the formal context for code generation (Sect. 1.4) and on the code generation strategies available in POLYCHRONY (Sect. 1.5) by considering a SIGNAL process solving second degree equations (Sect. 1.3). While mathematically simple, this process exhibits non-trivial modes, synchronization relations and precedence relations, which we analyzed and transformed to illustrate several usage scenarios and for which we applied the following code generation strategies:

• Sequential code generation, Sect. 1.5.2, consists of producing a single step function for a complete SIGNAL process.

• Clustered code generation with static scheduling, Sect. 1.5.3, consists of partitioning the generated code into one cluster per set of input signals. The step function is a static scheduler of these clusters.


• Clustered code generation with dynamic scheduling, Sect. 1.5.4, consists of a dynamic scheduling of a set of clusters.

• Distributed code generation, Sect. 1.5.6, consists of physically partitioning a process across several locations and of installing point-to-point communications between them.

• Sequential code generation for separate compilation, Sect. 1.5.5.1, consists of associating the sequential generated code of a process with a profile describing its synthesized synchronization and precedence relations. The calling context of the process is passed to it as a parameter.

• Clustered code generation for separate compilation, Sect. 1.5.5.2, consists of associating the clustered generated code of a process with a profile describing the synchronization and precedence relations of and between its clusters. The scheduler of the process is generated in each call context.

These code generation strategies are based on formal operations, such as abstractions, that make possible separate compilation, code substitutability and reuse of legacy code. The multiplicity of these strategies demonstrates the flexibility of the compilation and code generation tools provided in POLYCHRONY. It also indicates that software developers can create new generators using the open-source version of POLYCHRONY. Such new generators will be needed for developing new execution schemes or for adapting the current ones to enlarged contexts providing, for example, a design-by-contract methodology [30].

References

1. S.K. Shukla, J.-P. Talpin, S.A. Edwards, and R.K. Gupta. High level modeling and validation methodologies for embedded systems: bridging the productivity gap. In VLSI Design 2003, pp. 9–14.
2. M. Crane and J. Dingel. UML vs. classical vs. Rhapsody statecharts: not all models are created equal. In Software and Systems Modeling, 6(4):415–435, December 2007.
3. A. Benveniste, P. Caspi, S. Edwards, N. Halbwachs, P. Le Guernic, and R. de Simone. The synchronous languages twelve years later. In Proceedings of the IEEE, 91(1):64–83, January 2003.
4. P. Le Guernic, J.-P. Talpin, and J.-C. Le Lann. Polychrony for system design. In Journal for Circuits, Systems and Computers, 12(3):261–304, April 2003.
5. L. Besnard, T. Gautier, and P. Le Guernic. SIGNAL V4-Inria Version: Reference manual. http://www.irisa.fr/espresso/Polychrony.
6. The Polychrony platform. http://www.irisa.fr/espresso/Polychrony.
7. M. Le Borgne, H. Marchand, E. Rutten, and M. Samaan. Formal verification of programs specified with Signal: application to a power transformer station controller. In Science of Computer Programming, 41:85–104, 2001.
8. M. Kerbœuf, D. Nowak, and J.-P. Talpin. Specification and verification of a steam-boiler with Signal-Coq. In Theorem Proving in Higher Order Logics (TPHOLs'2000). Lecture Notes in Computer Science. Springer, Berlin, 2000.
9. T. Grandpierre and Y. Sorel. From algorithm and architecture specifications to automatic generation of distributed real-time executives: a seamless flow of graphs transformations. In Formal Methods and Models for Codesign Conference, Mont-Saint-Michel, France, June 2003.
10. P. Le Guernic. SIGNAL: Description algébrique des flots de signaux. In Architecture des machines et systèmes informatiques, pp. 243–252. Hommes et Techniques, Paris, 1982.
11. P. Le Guernic and A. Benveniste. Real-time, synchronous, data-flow programming: the language SIGNAL and its mathematical semantics. Technical Report 533 (revised version: 620), INRIA, June 1986.
12. P. Le Guernic and T. Gautier. Data-flow to von Neumann: the SIGNAL approach. In J.L. Gaudiot and L. Bic, editors, Advanced Topics in Data-Flow Computing, pp. 413–438. Prentice Hall, Englewood Cliffs, NJ, 1991.
13. A. Benveniste, P. Le Guernic, and C. Jacquemot. Synchronous programming with events and relations: the SIGNAL language and its semantics. In Science of Computer Programming, 16:103–149, 1991.
14. P. Le Guernic, T. Gautier, M. Le Borgne, and C. Le Maire. Programming real-time applications with SIGNAL. In Proceedings of the IEEE, 79(9):1321–1336, September 1991.
15. A. Gamatié. Designing Embedded Systems with the SIGNAL Programming Language. Springer, Berlin, 2009.
16. G. Kahn. The semantics of a simple language for parallel programming. In J.L. Rosenfeld, editor, Information Processing 74, pp. 471–475. North-Holland, Amsterdam, 1974.
17. Arvind and K.P. Gostelow. Some Relationships Between Asynchronous Interpreters of a Dataflow Language. North-Holland, Amsterdam, 1978.
18. J.B. Dennis, J.B. Fossen, and J.P. Linderman. Data flow schemas. In A. Ershov and V.A. Nepomniaschy, editors, International Symposium on Theoretical Programming, pp. 187–216. Lecture Notes in Computer Science, vol. 5. Springer, Berlin, 1974.
19. M. Le Borgne. Dynamical systems over Galois fields: applications to DES and to the Signal language. In Lecture Notes of the Belgian-French-Netherlands Summer School on Discrete Event Systems, Spa, Belgium, June 1993.
20. T. Amagbegnon, L. Besnard, and P. Le Guernic. Arborescent Canonical Form of Boolean Expressions. INRIA report no. 2290, 1994.
21. O. Maffeïs and P. Le Guernic. Distributed implementation of SIGNAL: scheduling & graph clustering. In 3rd International School and Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems, pp. 547–566. Lecture Notes in Computer Science, vol. 863. Springer, Berlin, 1994.
22. D. Potop-Butucaru, S.A. Edwards, and G. Berry. Compiling Esterel. Springer, Berlin, 2007.
23. L. Besnard, T. Gautier, M. Moy, J.-P. Talpin, K. Johnson, and F. Maraninchi. Automatic translation of C/C++ parallel code into synchronous formalism using an SSA intermediate form. In L. O'Reilly and M. Roggenbach, editors, Ninth International Workshop on Automated Verification of Critical Systems (AVOCS'09), 2009.
24. T. Gautier and P. Le Guernic. Code generation in the SACRES project. In Towards System Safety, Proceedings of the Safety-critical Systems Symposium, SSS'99, Huntingdon, UK, pp. 127–149. Springer, 1999.
25. P. Aubry, P. Le Guernic, and S. Machard. Synchronous distribution of SIGNAL programs. In Proc. of the 29th Hawaii International Conference on System Sciences, vol. 1, pp. 656–665. IEEE Computer Society Press, Los Alamitos, CA, 1996.
26. The Topcased platform. http://www.topcased.org.
27. The OpenEmbeDD platform. http://www.openembedd.org.
28. Geensoft RT-Builder. http://www.geensoft.com/fr/article/rtbuilder.
29. A. Kountouris and P. Le Guernic. Profiling of SIGNAL programs and its application in the timing evaluation of design implementations. In Proceedings of the IEE Colloq. on HW-SW Cosynthesis for Reconfigurable Systems, pp. 6/1–6/9, Bristol, UK, February 1996. HP Labs.
30. Y. Glouche, T. Gautier, P. Le Guernic, and J.-P. Talpin. A module language for typing SIGNAL programs by contracts. In this book.


Chapter 2
Formal Modeling of Embedded Systems with Explicit Schedules and Routes

Julien Boucaron, Anthony Coadou, and Robert de Simone

A main goal of compilation is to efficiently map application programs onto execution platforms, while hiding the details of the latter to the programmer through high-level programming languages. Of course this is only feasible inside a certain range of constructs, and the judicious design of sequential programming languages and computer architectures that match one another has been a decades-long process. Now the advent of multicore processors brings radical changes to this topic, bringing forth concurrency as a key element in efficiency, both for application design and architecture computing power. The shift is mostly prompted by technological factors, namely the ability to cram several processors on a single chip, and the diminishing gains of Instruction Level Parallelism techniques used in former architectures. Still, the definition of high-level programming (and, more generally, application design) formalisms matching the new era is a largely unsolved issue.

In the face of this new situation, formal models of concurrency will have to play a role. While not readily programming models, they can, first, provide intermediate representation formats that form a bridge towards novel architectures. Second, they can act as foundational principles to devise new programming and design formalisms, especially when the matching domains have a natural concurrent representation, such as dataflow signal processing algorithms. Third, their emphasis on communication and data transport as much as on actual computations makes them fit to deal with communication latencies (to/from external memory or between computation units) at an early design stage. As a result such Models of Communication and Computation (MoCCs) may hold a privileged position, at the crossing point of executable programs and analyzable models. In the current chapter, we investigate dataflow Process Networks. They seem especially relevant as a formal framework modeling situations when execution platforms are themselves networks of processors, while applications are prominently dataflow signal processing algorithms. In addition they form a privileged setting in which to express and study issues of static scheduling and routing patterns.

J. Boucaron, A. Coadou, and R. de Simone (✉)
INRIA Sophia Antipolis Méditerranée, 06902 Sophia Antipolis, France
e-mail: [email protected]


Outline

In the first section, we recall and place in relative positions some of the many Process Network variants introduced in the literature. We partially illustrate them on an example. Sections 2.2 and 2.3 provide more detailed formal definitions, and list a number of formal properties, some classical, some recently obtained, some as our original contribution. Section 2.2 focuses on pure dataflow models, while Section 2.3 introduces conditional control switches, provided they do not introduce conflicts between individual computations. We conclude on open perspectives.

2.1 General Presentation

2.1.1 Process Networks

As their name indicates, Process Networks (PN) consist of local processing (or computation) nodes, connected together in a network by point-to-point communication links. Generally the links are supposed to carry data flow streams. Most Process Network models are considered to be dataflow, meaning here that computations are triggered upon arrival of enough data on incoming channels, according to specific semantics. Historically the most famous PN models are Petri Nets [65], Karp-Miller's Parallel Program Schemata [50], and Kahn Process Networks [47]. The Ptolemy [39] environment is famous as a development framework based on a rich variety of PN models. More new models are emerging due to a renewed interest in such streaming models of computation, prompted by manycore concurrent architectures and dataflow concurrent algorithms in signal processing and embedded systems.

By nature PNs allow design methodologies based on the assembly of components with ports into composite subsystems; hierarchical descriptions come naturally. When interconnect fabrics more complex than simple point-to-point channels are meant, they need to be realized as another component. Local components (computation nodes or composite subsystems) may contain states. Even though PNs usually accept a native self-timed semantics (which only states that computations are triggered whenever the channels' data occupancy allows), a great deal of research in PN semantics consists in establishing when "optimal" schedules may be designed, and how they can be computed (and if possible statically). Such a schedule turns the dataflow semantics into a more traditional control-flow one, as in Von Neumann architectures, while it guarantees that compute operations will be performed exactly when the data arguments are located in the proper places (here channel queues instead of registers). More generally, we believe that Process Networks can be used as key formal models from Theoretical Computer Science, ready to explain concurrency phenomena in these days of embedded applications and manycore architectures. Also they may support efficient analyses and transformations, as part of a general parallelizing compilation process, involving distributed scheduling, optimized mapping, and real-time performance optimization altogether.

2.1.2 Pure Dataflow

A good starting point for dataflow PN modeling is that of Marked Graphs [32] (also called Event Graphs in the literature). They form the simplest model, the one for which the most results have been established, and under the simplest form. For instance the questions of liveness (or absence of deadlocks and livelocks), of boundedness (or executability with finite queue buffers), and of optimal static schedules have positive answers in this model (and especially in the case of strongly connected graphs underlying the directed-communication network). Then other models of computation and communication can be considered as natural extensions of Marked Graphs with additional features. Timed Marked Graphs [68] add minimal prescribed latencies on channel transport and computation duration. Synchronous Data Flow [56] (SDF) graphs, also called Weighted Event Graphs, require that data are produced and consumed on input and output channels not individually, but in a prescribed fixed number on each channel. The case of single token consumption and production is called "homogeneous" in SDF terminology, so that Marked Graphs are also called homogeneous SDF processes. Bounded Channel Marked Graphs limit the buffering capacity of channels, possibly to a value below the bound observed by safety criteria, so that further constraints on data traffic are imposed by congestion control. The recent theory of Latency-Insensitive Design [21] and Synchronous Elastic processes is attempting to introduce Bounded Channels and Timed Marked Graphs for the modeling of Systems-on-Chip and the analysis of timing closure issues. It focuses on the modeling of the various elements involved by typical hardware components. In all cases the models can be expanded into Marked Graphs, but the exact relations between properties on original models and those of their reflection into Marked Graphs often need to be carefully stated. We shall consider this as we go through detailed definitions of the various such Process Networks in the next section. The syntactic inclusions are depicted in Fig. 2.1.

Any circuit in a Marked Graph supports a notion of structural throughput, which is the ratio of the number of tokens involved in the circuit over its length (this is an invariant). Then, a Marked Graph contains critical circuits, and a critical structural throughput, which provides an upper bound on the allowable frequency of computation firings. A master result of Marked Graphs is that they can indeed be scheduled in an ultimately repetitive way which reaches the speed of this critical throughput. This result can be adapted to other extended Process Network models, and the schedules can in addition be optimized against other criteria. We shall describe our recent contributions in that direction in further sections.

Fig. 2.1 Syntactic expressivity of Pure DataFlow

2.1.3 Static Control Preserving Conflict-Freeness

The previous kinds of models could be tagged as "purely dataflow" based, as data (or rather their token abstractions) always follow the same paths. In this chapter, we shall consider extended models, where some of the nodes may select on which channels they will respectively consume or produce their input or output tokens (instead of producing/consuming on each channel uniformly). Nevertheless some restrictions will remain: choices should be made according to the node's current internal state, and not upon availability of tokens on various channels. Typically, the "choice between input guards" construct of CSP [44] will be forbidden, just as any synchronous preemption scheme. As a result, the extended class of Process Networks shall retain the "conflict freeness" property as defined in Petri Nets terminology, and the various self-timed executions of the systems all result in the same partially ordered trace of computations. In other words, a computation once enabled must eventually be triggered or executed and cannot be otherwise disabled. Conflict freeness can also be closely associated with the notion of "confluence" (in the terms of R. Milner for process algebra), or with the "monotonicity/continuity" of Kahn Process Networks. Conflict freeness implies the important property of latency-insensitivity: the input/output semantics of the system as a whole will never depend upon the speed of travelling through channels; only the timings shall be incidentally affected.

While pure dataflow models (without conditional control) are guaranteed to be conflict-free, this property is also preserved when conditional control which bears only on internal local conditions (and not on the state of connected buffer channels) is introduced. This is the case for general Kahn Process Networks, but also for so-called Boolean DataFlow [18] (BDF) graphs, which introduce two specific routing nodes for channel multiplexing and demultiplexing. We shall here name these nodes Merge and Select respectively (other authors use a variety of different names, and we only save the name Switch for the general feature of switching between dataflow streams according to a conditional switch pattern). This is also the case in Cyclo-Static Data Flow (CSDF) [9] graphs, where the weights allowed in SDF are now allowed to change values according to a statically pre-defined cyclic pattern; since some weights are allowed to take a null value, Merge and Select may be encoded directly in CSDF.

While the switching patterns in BDF are classically supposed to be either dynamically computed or abstracted away in some deterministic condition (as for Kahn Process Networks), one can consider the case where the syntactic simplicity of BDF (only two additional Merge/Select node types) is combined with the predictive routing patterns of CSDF. This led to our main contribution in this chapter, which we call K-periodically Routed Graphs (KRGs). Because the switching patterns are now explicit, in the form of ultimately periodic binary words (with 0 and 1 referring to the two different switching positions), the global behaviors can now be estimated, analyzed and optimized. We shall focus in some depth on the algebraic and analytic properties that may be obtained from such a combination of scheduling and routing information. The syntactic inclusions are depicted in Fig. 2.2.

Fig. 2.2 Syntactic expressivity of MG, SDF, CSDF, BDF and KRG

To conclude, one global message we would like to pass to the reader is that Process Networks (at least in the conflict-free case) can indeed be seen as supporting two types of semantics: self-timed before scheduling and routing, synchronous (possibly multirate) after scheduling and routing have been performed. The syntactic objects representing actual schedules and routing patterns (mostly, in our case, infinite ultimately periodic binary words) should be seen as first-class design objects, to be used in the development process for design, analysis, and optimization altogether.

2.1.4 Example

We use the Kalman filter as a supporting example throughout this chapter. This filter estimates the state of a linear dynamic system from a set of noisy measurements. Kalman filters are used in many fields, especially for guidance, positioning and radar tracking. In our case, we consider its application to a navigation system coupling a Global Positioning System (GPS) and an Inertial Navigation System (INS) [20].

Actually, we do not need to fully understand how the Kalman filter works; what really matters is to understand the corresponding block diagram shown in Fig. 2.3.

This block diagram shows the data dependencies between the operators used to build the Kalman filter. Boxes denote data flow operators: for instance, the top left box is an array of multiply-accumulates (MACs), with two input flows of 2D arrays of size [n : n], and an output flow of 2D arrays of size [n : n]. The number of states (position, velocity, etc.) is denoted by n. The number of measurements (number of satellites in view) is denoted by m. Arcs denote communication channels and flow dependencies. Initial data are shown as gray filled circles.


Fig. 2.3 Block diagram of the Kalman filter (arrays of MAC, multiply and add operators over flows of sizes [n : n], [n : m], [m : m], [n : 1] and [m : 1], with inputs Hk, δzk, Φ, Q, R, In and output x̂k|k)

We refer the interested reader to Brown et al. [16] and Campbell [20] for details on the Kalman filter. Just notice that Hk is the GPS geometry matrix, δzk is a velocity corrected with INS data, and x̂k|k is the estimated current state.

2.2 Pure Dataflow MoCCs

Pure dataflow MoCCs are models where the communication topology of the system is static during execution, and data follow the same paths during each execution. There exists a partial order over the events of such a system, which leads to a deterministic concurrent behavior.

First, we present the simplest pure data flow MoCC: the synchronous paradigm. Next, we detail Latency-Insensitive Design. Then, we detail Marked Graphs and Synchronous Data Flow.

2.2.1 Synchrony

The synchronous paradigm is the de facto standard for digital hardware design:

• Time is driven by a clock.
• At each clock cycle, each synchronous module samples all its inputs and produces results on all its outputs.
• Computations and communications are instantaneous.
• Communications between concurrent modules are achieved through signals broadcast throughout the system.
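As a minimal illustration of this execution model (our own C sketch, not code taken from the chapter), a synchronous module can be simulated by a step function called once per clock cycle, with its memory elements kept in an explicit state record:

#include <stdio.h>

typedef struct { int acc; } mac_state;          /* memory element (register) */

/* One clock cycle of a multiply-accumulate module: sample the inputs,
   compute "instantaneously", produce the output. */
static int mac_cycle(mac_state *s, int a, int b) {
  s->acc += a * b;
  return s->acc;
}

int main(void) {
  mac_state s = {0};
  for (int cycle = 0; cycle < 4; cycle++)       /* time is driven by the clock */
    printf("cycle %d: out = %d\n", cycle, mac_cycle(&s, cycle, 2));
  return 0;
}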

Synchrony requires that designs are free of instantaneous loops (also called causality cycles): each loop shall contain at least one memory element (e.g., latch or flip-flop), such that when all memory elements are removed the resulting graph is acyclic. This implies a partial order of execution among signals at each instant. The behavior is deterministic with respect to concurrency.

Synchrony is control-free: control can be encoded through if-conversion, where a control dependency is turned into a data dependency using predicates. For instance, "if (a) b = c else b = d" is transformed into b = (a ∧ c) ∨ (¬a ∧ d). From a digital hardware view, the previous code is equivalent to a 2-to-1 multiplexer with conditional input a, with data inputs c and d, and with data output b.
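For instance, the following C fragment (an illustrative sketch of ours with 0/1-encoded Booleans, not taken from the chapter) performs the same if-conversion in software, i.e. it behaves as the 2-to-1 multiplexer:

/* if (a) b = c; else b = d;   rewritten as a pure data dependency */
int mux(int a, int c, int d) {
  return (a & c) | ((a ^ 1) & d);   /* b = (a AND c) OR (NOT a AND d), for a, c, d in {0, 1} */
}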

Example 1. Implementing the Kalman filter as a synchronous system is straightforward: for each box in the initial block diagram in Fig. 2.3, we assign a corresponding set of combinatorial gates; for each initial amount of data, we assign a set of memory elements (latch, flip-flop). Then, we check the absence of causality cycles: all circuits (in the graph-theoretical sense) of the design must contain at least one memory element.

In real life, computation and communication take time. When we implement a synchronous design, performance bottlenecks are caused by the highest-delay combinatorial paths. Such paths are called critical paths; they slow down the maximum achievable clock rate.

Using multiple clocks with different rates can help implement a design more efficiently, given a set of constraints on performance and resources: such systems are sometimes called multiclock. A multiclock implementation introduces synchronizers for crossing clock domains; a synchronizer resamples a signal crossing from one clock domain into another.

There also exist polychronous implementations, which introduce logical clock domains. A logical clock in this context represents an activation condition for a module, given by a signal; such a signal abstracts a measure of time that can be multiform: a distance, a fixed number of elements, etc. For instance, our block diagram has different data path widths (e.g., n : n, n : m, m : m); we can assign to each one a logical clock corresponding to its data width. Polychronous modules are connected together using logical synchronizers that enable or disable a module with respect to the availability of input data, output storage, etc. We refer the reader to Chaps. 1, 5 and 6 for further details.

Further Reading

Synchronous languages [7] have been built on top of synchronous and/or polychronous hypotheses. These include Esterel [66], Syncharts [3], Quartz [70], Lustre [25], Lucid Synchrone [26, 27], and Signal [53]. They are mainly used for critical systems design (avionics, VLSI), prototyping, implementation and formal verification [10]. Tools such as SynDEx [43] have been designed to obtain efficient mapping and scheduling of synchronous applications while taking into account both computation and communication times on heterogeneous architectures.

Synchrony implicitly corresponds to the synthesizable subset of hardware description languages such as (System-)Verilog, VHDL or SystemC. Logic synthesis tools [37] that are daily used by digital hardware designers rely on synchrony as a formal MoCC. This model allows checking the correctness of non-trivial optimizations [57, 72]. It also enables checking design correctness using, for instance, formal verification tools with model-checking techniques [17, 62, 67].

Synchronous systems can contain false combinatorial circuits, as detailed in [69]. This is a very interesting topic, since such circuits are needed to obtain digital designs with smaller delay and area. The correctness of such designs is checked using formal verification tools based on binary decision diagrams [17, 54] or SAT solvers [38].

2.2.2 Latency-Insensitive Design

In VLSI designs, when geometry shrinks, gate delays decrease and wire delays increase, due mainly to increases in resistivity, as detailed in Anceau [2]. In today's synchronous chips, signals can take several clock cycles to propagate from one corner of the die to the other. Such long global wires cause timing closure issues, and are not compatible with the synchronous hypothesis of null signal propagation times.

Latency-Insensitive Design (LID) [21] (also known as Synchronous Elastic in the literature) is a methodology created by Carloni to cope with such timing closure issues caused by long global wires. LID introduces a communication protocol that is able to handle any latency on communication channels. LID ensures the same behavior as the equivalent synchronous design, modulo timing shifts. LID enables modules to be designed and built separately, and then linked together with the Latency-Insensitive protocol.

The protocol requires patient synchronous modules: the behaviour of a patient module only depends on signal values, not on reception times; i.e., given an input sequence, a patient module always produces the same output sequence, whatever the arrival instants are. The composition of patient modules is itself a patient module, as shown in Carloni et al. [22]. This requirement is a strong assumption, not fulfilled by all synchronous modules, which in that case need a Shell wrapper.

LID is built around two building blocks:

Shells. A Shell wraps each synchronous module (called pearl in LID jargon) to obtain a patient pearl. The Shell function is two-fold: (1) it executes the pearl as soon as all input data are present and there is enough storage to receive the results in all downstream receivers (called Relay Stations); (2) it implements part of the latency-insensitive protocol: it stores and forwards downstream backpressure due to traffic congestion. Backpressure consists of stalling an upstream emitter until a downstream receiver is ready to store more data.

Relay Stations. A Relay Station is a patient module: it implements the storage of results – a part of the Latency-Insensitive protocol – through a backpressure mechanism. Actually, a chain of Relay Stations implements a distributed FIFO. The minimal buffering capacity of a Relay Station is two pieces of data, to avoid throughput slow-down, as detailed in Carloni et al. [22]. There is at most one initial datum in each Relay Station. Wires having a delay greater than one clock cycle are split into shorter ones using Relay Stations until timing closure is reached (a behavioral sketch of a Relay Station is given below).
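The following C fragment is a minimal cycle-level sketch of our own (assuming a valid/stop handshake and a capacity of two, as above); it only illustrates the storage and backpressure behavior, not the actual RTL of a Relay Station:

/* A Relay Station with two storage slots, simulated one clock cycle at a time. */
typedef struct {
  int data[2];                  /* main and auxiliary storage (capacity two) */
  int count;                    /* number of currently stored items */
} relay_station;

/* One cycle: possibly emit one datum downstream, accept one from upstream,
   and compute the backpressure (stop) sent back to the upstream emitter. */
void rs_cycle(relay_station *rs,
              int in_valid, int in_data, int *stop_out,
              int out_ready, int *out_valid, int *out_data) {
  *out_valid = 0;
  if (rs->count > 0 && out_ready) {           /* downstream can accept a datum */
    *out_valid = 1;
    *out_data  = rs->data[0];
    rs->data[0] = rs->data[1];
    rs->count--;
  }
  if (in_valid && rs->count < 2)              /* room left: accept the upstream datum */
    rs->data[rs->count++] = in_data;
  *stop_out = (rs->count == 2);               /* full: stall the upstream emitter */
}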

The core assumption in LID is that pearls can be stalled within one clock cycle. Block placements and communication latencies are not known beforehand; they have to be estimated during floor-planning, and refined further during placement and routing.

Different works [11, 13, 15, 21, 28, 33, 75, Harris, unpublished] have addressed how to implement LID Relay Stations and Shells using the backpressure flow-control mechanism. This solution is simple and compositional, but it suffers from a wiring overhead that may be difficult to place and route in a design.

Example 2. Implementing the Kalman filter using LID is similar to the synchronous implementation. Operators, or sets of operators – arrays of multiply-accumulates (MACs) for instance – are wrapped by Shells. Long wires are split by Relay Stations. We assume there is at least one Relay Station in each circuit of the design, for correctness purposes.

The reader should notice that LID is a simple polychronous system, where a single physical global clock coexists with the logical clocks generated by Shells and Relay Stations to execute each pearl.

Further Readings

The theory is described in detail in [22]. Performance analyses are discussed in [14, 15, 19, 23, 24, 29]. Formal verification of Latency-Insensitive systems is described in [14, 73, 74]. Some optimizations are described in [14, 15, 19, 23, 24, 29]. LID is also used in network fabrics (Networks on Chips), as described in [34, 41, 45]. Handling of variable latencies is described in [5, 6, 30].

2.2.3 Marked Graphs

Marked Graphs (MG) [32], also known as Event Graphs in the literature, are a conflict-free subset of Petri nets [65]. MGs are widely used, for instance in discrete event simulators, in the modeling of asynchronous designs, or in job shop scheduling.


We briefly introduce their definition and operational behavior. We introduce results on deadlock freeness, and results on static schedulability and maximum achievable throughput. Then, we briefly recall Timed Marked Graphs, and we recall the case of Marked Graphs with bounded capacities on places, showing the impact of such capacities on the throughput. After that, we present two optimizations on MGs. First, an algorithm on MGs with capacities that computes the capacity of each place such that the graph reaches the maximum achievable throughput of its equivalent MG without capacities. Next, we provide a throughput-aware algorithm called equalization that slows down the fastest parts of the graph while ensuring the same global throughput. This algorithm provides the best return-on-investment locations to alter the schedule of the system. This can allow, for instance, postponing the execution of a task to minimize power hotspots.

Definition 1 (Marked Graph). A Marked Graph is a quadruplet ⟨N, P, T, M0⟩, such that:

• N is a finite set of nodes.
• P is a finite set of places.
• T ⊆ (N × P) ∪ (P × N) is a finite set of edges between nodes and places.
• M0 : P → ℕ is the function that assigns an initial marking (quantity of data abstracted as tokens) to each place.¹

In a MG, nodes model processing components: they can represent simple gates or complex systems such as processors. Each place has exactly one input and one output and acts as a point-to-point communication channel such as a FIFO.

The behavior of a MG is as follows:

• When a computation node has at least one token in each input place, then it can be executed. Such a node is said to be enabled or activated.
• If we execute the node, then it consumes one token in each input place and produces one token in each output place.

A MG is deterministic in case of concurrent behaviour. A MG is also confluent: for any set of activated nodes, the firing of one node does not remove any node other than itself from this set. This leads to a partial order of events.
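To make the firing rule concrete, here is a minimal, self-contained C sketch of ours (the three-node ring and its initial marking are illustrative assumptions, not an example from the chapter) that simulates a Marked Graph under the rule above, firing every enabled node at each step:

#include <stdio.h>

#define N_NODES  3
#define N_PLACES 3

/* Place p goes from node src[p] to node dst[p] and currently holds tokens[p]. */
static const int src[N_PLACES] = {0, 1, 2};
static const int dst[N_PLACES] = {1, 2, 0};
static int tokens[N_PLACES]    = {1, 0, 0};   /* initial marking M0 */

/* A node is enabled when every one of its input places holds at least one token. */
static int enabled(int node) {
  for (int p = 0; p < N_PLACES; p++)
    if (dst[p] == node && tokens[p] == 0) return 0;
  return 1;
}

/* Firing consumes one token per input place and produces one per output place. */
static void fire(int node) {
  for (int p = 0; p < N_PLACES; p++) {
    if (dst[p] == node) tokens[p]--;
    if (src[p] == node) tokens[p]++;
  }
}

int main(void) {
  for (int step = 0; step < 6; step++) {
    int fireable[N_NODES];                    /* snapshot, so that all nodes enabled */
    for (int n = 0; n < N_NODES; n++)         /* at this step fire "simultaneously"  */
      fireable[n] = enabled(n);
    for (int n = 0; n < N_NODES; n++)
      if (fireable[n]) { fire(n); printf("step %d: node %d fired\n", step, n); }
  }
  return 0;
}

On this single-token ring, exactly one node fires per step, matching the one-token-for-three-nodes ratio of the circuit (see the notion of rate below).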

We recall some key results on Marked Graphs from the seminal paper of Commoner et al. [32], where the associated proofs can be found.

Lemma 1 (Token count). The token count of a circuit does not change under node execution.

Theorem 1 (Deadlock-freeness). A marking is deadlock-free if and only if the token count of every circuit is positive.

¹ We recall that ℕ = {0, 1, 2, ...} and ℕ* = ℕ \ {0}.


Fig. 2.4 Coarse-grain abstraction of the Kalman filter as a MG

Example 3. Figure 2.4 shows a basic model of the Kalman filter as a MG. The graph has five circuits: three in the left strongly connected component, and two in the right one. They all contain one token: the system is deadlock-free.

2.2.3.1 Static Schedulability and Throughput

Now, we recall useful results on the static schedulability of MGs and on how to compute their throughput. Static schedulability is an interesting property, since it enables computing a performance metric at compilation time.

The As Soon As Possible (ASAP) firing rule is such that when a node is enabled, it is executed immediately. The ASAP firing rule is also known as the earliest firing rule in the literature. Since a MG is confluent, any firing rule will generate a partial order of events compatible with the ASAP firing rule. The ASAP firing rule generates the fastest achievable throughput for a system.

Definition 2 (Rate). Given a circuit C in a MG, we denote the node count of the circuit C as L(C), and the token count of the circuit C as M0(C). The rate of the circuit C is the ratio M0(C)/L(C).

Notice that the circuits of a strongly connected component have side effects on each other: circuits with low rates slow down those with higher rates.

Theorem 2 (Throughput). The throughput of a strongly connected graph equals the minimum rate among its circuits. The throughput of an acyclic graph is 1.

Proof. Given in [4].
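As an illustration (a fragment of ours, which assumes that the circuits have already been enumerated together with their token and node counts, and which caps the result at 1), Theorem 2 amounts to a simple minimum over circuit rates:

/* Throughput of a strongly connected MG, given its circuits:
   tokens[i] and length[i] are the token count and node count of circuit i. */
double mg_throughput(const int *tokens, const int *length, int n_circuits) {
  double min_rate = 1.0;
  for (int i = 0; i < n_circuits; i++) {
    double rate = (double)tokens[i] / (double)length[i];
    if (rate < min_rate) min_rate = rate;
  }
  return min_rate;
}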

Strongly connected graphs reach a periodic steady regime, after a more-or-less chaotic initialization phase. These MGs are said to be k-periodic and are statically schedulable, as shown in [4].

Definition 3 (Critical circuit). A circuit is said to be critical if its rate equals the graph throughput.


2.2.3.2 Timed Marked Graphs

Timed Marked Graphs (TMG) are an extension of MGs introduced by Ramchandani [68], where time is introduced through weights on places and nodes. These weights represent abstract amounts of time needed to move the set of tokens from inputs to outputs.

Definition 4 (Timed Marked Graph). A Timed Marked Graph is a quintuplet ⟨N, P, T, M0, L⟩, such that:

• ⟨N, P, T, M0⟩ is a Marked Graph.
• L : N ∪ P → ℕ is a function that assigns a latency to each place and computation node.

Ramchandani showed that TMGs have the same expressivity as MGs [68]. He provides a simple transformation from a TMG to an equivalent MG.

The transformation is shown in Fig. 2.5 for both a place with a latency b and a node with a latency c. The place of latency b is expanded as a succession of the pattern of a node followed by a place with a unitary latency; the succession of this pattern has the same latency b. The transformation for the node of latency c is similar, with a succession of the pattern of a unitary latency place followed by a node.

All theoretical results on MG hold for TMGs.

2.2.3.3 Bounded Capacities on Places

Real-life systems have finite memory resources. We introduce place capacities in MGs to represent the maximal number of data that can be stored in a place: we add to Definition 4 the function K : P → ℕ that assigns to each place the maximum number of tokens it can hold.

Fig. 2.5 Example of latency expansion: a node c of latency L(c) = n is expanded into n+1 nodes separated by n unitary places, and a place b of latency L(b) = m is expanded into m places separated by m−1 nodes


A MG with capacities can be transformed into an equivalent one without capacities through the introduction of complementary places [4]: a place p1 with a capacity K(p1) from node n1 to node n2 is equivalent to a pair of places p2 and p̄2, where p2 is from n1 to n2 and p̄2 is from n2 to n1. We write P̄ for the set of complementary places of P and T̄ for their corresponding arcs.

Definition 5 (Complemented graph). Given G = ⟨N, P, T, M0, K⟩ a connected MG with finite capacities, we build the complemented graph G′ = ⟨N′, P′, T′, M0′⟩ as follows:

N′ = N;
P′ = P ∪ P̄;
T′ = T ∪ T̄;
∀p ∈ P, M0′(p) = M0(p), and M0′(p̄) = K(p) − M0(p).

The hint for the proof of correctness of this transformation is as follows: when we introduce a complementary place, we introduce a new circuit containing the place and its complementary one. This new circuit has a token count equal to the capacity of the original place. Since the token count is invariant under node execution, the two places together can never hold more tokens than the capacity of the original place.
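As a minimal sketch of ours (the flat place array and capacity array below are an assumed data layout, not the chapter's), the complementation of Definition 5 can be written directly:

/* Each capacity-bounded place p, from src to dst with M0(p) tokens and capacity
   K(p), yields a complementary place from dst to src holding K(p) - M0(p) tokens. */
struct place { int src, dst, tokens; };

void complement(const struct place *in, const int *cap, int n, struct place *out) {
  for (int i = 0; i < n; i++) {
    out[2*i] = in[i];                            /* the original place      */
    out[2*i + 1].src    = in[i].dst;             /* the complementary place */
    out[2*i + 1].dst    = in[i].src;
    out[2*i + 1].tokens = cap[i] - in[i].tokens; /* K(p) - M0(p)            */
  }
}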

In a regular MG, a node can produce data when it has one token on all its inputs; it stops only when an input is missing. In a MG with finite capacities, nodes wait on inputs and also on outputs. This means that a finite-capacity graph can slow down the maximum achievable throughput of the topologically equivalent regular graph.

Example 4. Initially, in Fig. 2.6, the graph has two circuits: the left circuit has a rate of $5/5 = 1$, and the right one a rate of $4/5$. The capacity of each place equals two. After applying the transformation, the transformed graph without capacities has more circuits, due to the complementary places. In particular, there is a new circuit whose throughput equals $3/4$. This circuit slows down the system throughput, due to the lack of capacity on places.

Fig. 2.6 A strongly connected TMG with unitary latencies whose throughput is limited to 3/4 due to 2-bounded places. Plain places belong to the graph with finite capacities; we only show the most significant complementary places with dotted lines

Notice that acyclicity is no longer a sufficient condition for maximal throughput when considering capacity-bounded graphs: introducing complementary places creates circuits, so even an acyclic graph can have a throughput lower than 1. The throughput of a connected graph with finite buffering resources equals the minimum rate among the circuits of its complemented equivalent.

Corollary 1 (Throughput w.r.t. capacities). Let $G = \langle N, P, T, M_0, K \rangle$ be a connected MG with finite capacities, and $G' = \langle N', P', T', M'_0 \rangle$ its complemented graph. The maximum reachable throughput $\Theta(G)$ of $G$ is

$$\Theta(G) = \min_{C \in G'} \left( \frac{M'_0(C)}{N'(C)},\ 1 \right). \qquad (2.1)$$

Proof. Given in [4].
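For illustration, the construction of Definition 5 and the throughput formula (2.1) can be prototyped in a few lines of Python. This is a minimal sketch under simplifying assumptions (unitary latencies, a small graph given as a dictionary of places); the encoding, the names, and the use of the networkx library for circuit enumeration are choices made here for the example, not part of the chapter.

```python
# Minimal sketch of Definition 5 and Equation (2.1); illustrative encoding only.
# Each place (producer, consumer) carries its initial marking and its capacity.
import networkx as nx  # assumed available; nx.simple_cycles implements Johnson's algorithm

def complemented_graph(places):
    """Build the complemented graph: each place and its complementary place
    become explicit nodes carrying their token count."""
    g = nx.DiGraph()
    for i, ((n1, n2), attr) in enumerate(places.items()):
        p, pbar = "p%d" % i, "p%d_bar" % i
        g.add_node(p, tokens=attr["tokens"])                         # original place
        g.add_node(pbar, tokens=attr["capacity"] - attr["tokens"])   # complementary place
        g.add_edge(n1, p); g.add_edge(p, n2)                         # n1 -> p -> n2
        g.add_edge(n2, pbar); g.add_edge(pbar, n1)                   # n2 -> pbar -> n1
    return g

def throughput(places):
    """Equation (2.1): minimum over circuits of token count / computation-node
    count, capped at 1 (unitary latencies assumed)."""
    g = complemented_graph(places)
    theta = 1.0
    for cycle in nx.simple_cycles(g):
        tokens = sum(g.nodes[v].get("tokens", 0) for v in cycle)
        comp_nodes = sum(1 for v in cycle if "tokens" not in g.nodes[v])
        theta = min(theta, tokens / comp_nodes)
    return theta

# Two nodes connected back and forth by 2-bounded places:
print(throughput({("n1", "n2"): {"tokens": 1, "capacity": 2},
                  ("n2", "n1"): {"tokens": 0, "capacity": 2}}))   # -> 0.5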

2.2.3.4 Place Sizing

Previously, we have seen how to compute the throughput of a MG and the throughput of a MG with bounded capacities. The first depends only on communication and computation latencies, while the second can be lower due to the capacity bounds. Now, we address the issue of place sizing to reach the maximal throughput of the graph. Our goal is to minimize the overall sum of place capacities in the MG. We state it as the following Integer Linear Programming (ILP) problem (a similar ILP formulation for the same problem is given in Bufistov et al. [19]):

$$\text{minimize} \sum_{p \in P} k(p), \qquad (2.2)$$

where $P$ is the set of places. It is subject to the following set of constraints in the complementary graph: for each circuit $C$ that contains at least one complementary place,

$$\frac{\sum \text{tokens} + \sum k}{\sum \text{latencies}} \;-\; \frac{\sum \text{critic tokens}}{\sum \text{critic latencies}} \;\geq\; 0, \qquad (2.3)$$

where the left part of the constraint is the throughput of the circuit $C$ under consideration, and the right part is the maximum achievable throughput of the graph without capacities. This program states that we want to minimize the global capacity count: we add additional capacities to places, through the term $\sum k$ in the marking of the complementary places found in each circuit $C$, until we reach the maximal throughput of the graph. We assume $k \geq 0$.


Now, we provide the algorithm to compute the minimum capacities for each place in order to reach the maximum throughput of the system:

1. Compute the maximum throughput of the non-complemented graph using, for instance, the Bellman–Ford algorithm, or any minimum cycle mean algorithm [36, 49].
2. Build the complemented graph using the transformation of Definition 5.
3. Enumerate all directed circuits having at least one complementary place in the complemented graph. We can use Johnson's algorithm [46] with a modification: when a circuit is found, we check that it contains at least one complementary place.
4. Build and solve the previous formulation of the ILP.

As the reader will understand, this optimization does not come for free: additional buffering resources are needed to reach the maximum throughput of the system.
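The resulting ILP can be handed to any integer programming front-end. Below is a hedged sketch using the PuLP library (an assumption made for the example; the chapter does not prescribe a solver), where `circuits` is assumed to contain, for each circuit found in step 3, its token count, its total latency, and the original places whose complementary places it traverses.

```python
# Hedged sketch of the place-sizing ILP (2.2)-(2.3); names are illustrative.
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

def size_places(place_ids, circuits, critical_rate):
    """critical_rate: maximum throughput of the non-complemented graph (step 1)."""
    prob = LpProblem("place_sizing", LpMinimize)
    k = {p: LpVariable("k_%s" % p, lowBound=0, cat="Integer") for p in place_ids}
    prob += lpSum(k.values())                     # objective (2.2): total extra capacity
    for c in circuits:
        extra = lpSum(k[p] for p in c["complementary_places"])
        # constraint (2.3), written linearly: tokens + extra >= rate * latency
        prob += c["tokens"] + extra >= critical_rate * c["latency"]
    prob.solve()
    return {p: int(k[p].value()) for p in place_ids}
```

The extra capacity of a place enters the constraints through the marking of its complementary place, which is why the k variables are summed over the complementary places crossed by each circuit.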

2.2.3.5 Equalization

Now, we recall an algorithm called equalization [14]. Equalization slows down the fastest circuits in the graph as much as possible, while maintaining the same global system throughput. This transformation gives hints on the potential slack. Such slack can be used to add more pipeline stages while ensuring the same performance for the whole system. It can also be used to postpone an execution, for instance to smooth dynamic power and flatten temperature hot-spots.

We provide a revised equalization that minimizes the amount of additional latency, and thus the resource overhead. Additional latency here means a dummy node followed by a single place, attached to an existing place. The problem is stated as the following Integer Linear Program:

$$\text{maximize} \sum_{p \in P} \mathrm{weight}(p) \cdot a(p), \qquad (2.4)$$

where $a(p)$ is the additional latency assigned to each place $p$, and $\mathrm{weight}(p)$ is the number of occurrences of place $p$ in all circuits of the graph. We have the following constraint for each non-critical circuit $C$:

$$\frac{\sum \text{tokens}}{\sum \text{latencies} + \sum a} \;-\; \frac{\sum \text{critic tokens}}{\sum \text{critic latencies}} \;\geq\; 0, \qquad (2.5)$$

where the left part of the constraint is the rate of the fast circuit that we slow down with an amount $\sum a$ of additional latencies, and the right part is the critical throughput of the MG. We also ensure that $a(p) \geq 0$. The weight in the objective favors places shared by several circuits, so as to minimize the total additional latency.


The algorithm is as follows:

- Compute the maximum throughput of the MG using the Bellman–Ford algorithm, or any minimum cycle mean algorithm.
- Enumerate all directed circuits using, for instance, Johnson's algorithm.
- Build and solve the previous formulation of the ILP.

Links with Other MoCCs

MG is a self-timed system: there is no global clock as in Synchrony; each component has its own clock. MG is a polychronous system where each component's clock is driven by the presence of tokens on its inputs and by its firing rule.

Synchrony is a special case of MG running under an ASAP firing rule. The transformation from Synchrony to MG is as follows: we assign to each node in the MG a directed acyclic graph of combinatorial gates; we assign to each place a corresponding memory element (flip-flop, latch) and we put an initial token. The obtained MG behaves as a synchronous module: at each instant, all nodes are executed as in the synchronous model.

LID is, in a sense, a special case of MG with bounded capacities. A Shell behaves as a node does: both require data on all their inputs and enough storage on their outputs to be executed. Relay-stations carry and store data in order, as places do. But not all implementations of LID behave like a MG, especially when optimizations on the Shell use its buffering capacity to further extend the capacity of the input Relay-Stations: this can avoid further slow-down of the system.

2.2.4 Synchronous Data Flow

Synchronous Data Flow (SDF), also known as Weighted Event Graphs in the literature, was introduced by Lee and Messerschmitt [55, 56]. SDF is a generalization of the previous Marked Graphs, where a strictly positive weight is attached to each input and each output of a node. Such a weight represents how many tokens are consumed (on an input) or produced (on an output) when the node is executed.

SDF is a widely used MoCC, especially in signal processing. It describes performance-critical parts of a system using processes (actors) with different constant data rates. SDF is confluent and deterministic with respect to concurrency.

The weights in SDF raise the problem of deciding whether places remain bounded during execution: if not, there is an infinite accumulation of tokens. Lee and Messerschmitt [56] provide an algorithm, based on balance equations, to decide whether an SDF graph is bounded. The idea is to check whether there exists a static schedule where token production equals token consumption.


Balance equations are of the form:

$$\mathrm{weight}(n) \cdot \mathrm{executions}(n) \;-\; \mathrm{weight}(n') \cdot \mathrm{executions}(n') \;=\; 0, \qquad (2.6)$$

where $n$ and $n'$ are respectively the producer and the consumer of the considered place, and the executions are the unknown variables. To solve these equations, we can use an Integer Linear Programming solver or, better yet, a Diophantine equation solver [64].

If there is no solution, then there is an infinite accumulation of tokens in the SDF graph. Otherwise, there exists an infinite number of solutions and the solver gives us one of them. This solution holds the number of executions of each node during one period of the schedule.
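As an illustration, the balance equations can be solved by propagating rational firing rates along the arcs and scaling by the least common multiple of the denominators. The following minimal Python sketch assumes a connected SDF graph given as a list of arcs (producer, production rate, consumer, consumption rate); this encoding is chosen here for the example.

```python
# Minimal sketch of the balance equations (2.6); illustrative encoding only.
from fractions import Fraction
from math import lcm

def repetition_vector(arcs):
    """Return one solution of the balance equations, or None if the graph is
    inconsistent (unbounded token accumulation). Assumes a connected graph."""
    nodes = {n for a in arcs for n in (a[0], a[2])}
    reps = {next(iter(nodes)): Fraction(1)}
    changed = True
    while changed:                        # propagate relative rates along the arcs
        changed = False
        for prod, w_out, cons, w_in in arcs:
            if prod in reps and cons not in reps:
                reps[cons] = reps[prod] * w_out / w_in; changed = True
            elif cons in reps and prod not in reps:
                reps[prod] = reps[cons] * w_in / w_out; changed = True
    for prod, w_out, cons, w_in in arcs:  # check production = consumption per period
        if reps[prod] * w_out != reps[cons] * w_in:
            return None
    scale = lcm(*(r.denominator for r in reps.values()))
    return {n: int(r * scale) for n, r in reps.items()}

print(repetition_vector([("A", 2, "B", 3)]))   # -> {'A': 3, 'B': 2}
```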

After the boundedness check, we can perform a symbolic simulation to compute a static schedule using the previous solution. We can then check for deadlock-freedom and compute place capacities.

Example 5. Figure 2.7 represents an SDF graph corresponding to the Kalman filter (Fig. 2.3), where we consider $n = 8$ states and $m = 5$ measurements. Solving the balance equations gives the trivial solution of one execution for each node during the period.

Further Readings

Several scheduling techniques have been developed to optimize different metrics such as throughput, schedule size, or buffer requirement [8, 42]. Some results on deadlock-freeness are found in Marchetti et al. [60, 61].

Fig. 2.7 SDF graph of the Kalman filter ($m = 5$, $n = 8$)


Fig. 2.8 Expressiveness of pure dataflow MoCCs (Synchrony, Latency-Insensitive Design, Marked Graphs, Synchronous DataFlow)

Retiming is a commonly used optimization in logic synthesis [57]. In the case of SDF, retiming is used to minimize the cycle length or maximize the throughput of the system [59, 63]. The correctness of the retiming algorithm relies on a useful transformation from a bounded SDF graph to an equivalent MG, described in Bhattacharyya et al. [8].

StreamIt [1, 48, 52, 71, 76] is a language and compiler for stream computing where SDF is the underlying MoCC. StreamIt allows exposing the concurrency found in stream programs and introducing some hierarchy to improve programmability. The StreamIt compiler can exploit this concurrency for efficient mapping onto multicore, tiled and heterogeneous architectures.

Summary

This section provides an overview of pure dataflow MoCCs. The expressiveness of pure dataflow MoCCs is depicted in Fig. 2.8. Such MoCCs have the following properties:

- Deterministic behavior with respect to concurrency.
- Decidability of buffer boundedness during execution: structural in the case of Latency-Insensitive Design and Marked Graphs; using the balance equations algorithm in the case of Synchronous Data Flow.
- Static communication topology, which leads to a partial order of events.
- Static schedulability: the schedule of a bounded system can be computed at compilation time, together with the needed buffer sizes.
- Deadlock-freedom checks using a structural criterion in the case of Latency-Insensitive Design and Marked Graphs, or using bounded-length symbolic simulation in the case of Synchronous Data Flow.

2.3 Statically Controlled MoCCs

Previously, we introduced pure dataflow MoCCs, where the communication topology is static during execution. Such a topology does not allow the reuse of resources to implement different functionalities.


Now, we introduce statically controlled MoCCs, where the communication topology is dynamic and allows resources to be reused. Static in this context means that the control (routing of data in our case) is known at compilation time.

The main interest of such statically controlled MoCCs is that they allow greater expressivity while conserving all the key properties of pure dataflow MoCCs. Such MoCCs are deterministic and confluent. We can decide at compilation time whether buffers are bounded during execution. We can also check whether the system is deadlock-free. We can statically schedule a bounded system to know its throughput and the size of its buffers. Such MoCCs also generate a partial order on events: for instance, this allows us to check formally whether a transformation applied on an instance of such a MoCC is correct.

We briefly recall the MoCC Cyclo-Static Dataflow (CSDF). Then, we detail K-periodically Routed Graphs (KRG).

2.3.1 Cyclo-Static Dataflow Graphs

Cyclo-Static DataFlow (CSDF) [9, 40] extends SDF such that the weights associated with each node can change during its execution according to a repetitive cyclic pattern. In effect, CSDF introduces control modes into SDF, also called phases in the literature.

The behavior of a CSDF graph is as follows:

- A node is enabled when each input holds at least the number of tokens given by the current index in the repetitive cyclic pattern associated with that input.
- When a node is executed, it consumes on each input and produces on each output the number of tokens given by the current index in the corresponding cyclic pattern. Finally, the index of each pattern is incremented modulo its length to prepare the next execution.

Example 6. Figure 2.9a shows how we can model a branch node in CSDF. When we start the system, every index is 0. When this node is executed, it always consumes one input token, since the input pattern is (1). But the outputs differ: at the first execution, it produces only one token, on the upper output, since the index of the upper pattern points to 1 while the lower pattern points to 0. At the end of this first execution, we increment all indices of the node.
Now, we execute this node a second time. Nothing changes for the input, whose pattern still points to 1, but the outputs change: the upper output now produces 0 tokens, and the lower one produces 1 token. We increment all indices of the node, and we get back to the same behaviour as in the first execution.

Figure 2.9b shows how to model a simple for loop: when an input token arrives on the upper left input, we start the loop, where the token loops three times in the circuit before exiting.

Fig. 2.9 Routing examples in CSDF: (a) branching; (b) loop

CSDF is confluent and deterministic with respect to concurrency. It can be decided whether memories are bounded and, if so, the CSDF graph is statically schedulable. Static routing can also be modeled using CSDF, although only implicitly: routing components are denoted as computation nodes.
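To make the phase mechanism concrete, the following minimal Python sketch mimics CSDF firing with cyclic rate patterns; the class and its encoding are illustrative assumptions rather than an established API.

```python
# Minimal sketch of CSDF phased firing; illustrative encoding only.
class CsdfNode:
    def __init__(self, in_patterns, out_patterns):
        self.in_patterns = in_patterns     # one tuple of per-phase rates per input
        self.out_patterns = out_patterns   # one tuple of per-phase rates per output
        self.phase = 0

    def enabled(self, input_tokens):
        return all(avail >= pat[self.phase % len(pat)]
                   for avail, pat in zip(input_tokens, self.in_patterns))

    def fire(self, input_tokens, output_tokens):
        for i, pat in enumerate(self.in_patterns):
            input_tokens[i] -= pat[self.phase % len(pat)]
        for o, pat in enumerate(self.out_patterns):
            output_tokens[o] += pat[self.phase % len(pat)]
        self.phase += 1                    # each pattern index advances modulo its length

# The branch node of Fig. 2.9a: input pattern (1), output patterns (1,0) and (0,1).
branch = CsdfNode(in_patterns=[(1,)], out_patterns=[(1, 0), (0, 1)])
```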

2.3.2 K-Periodically Routed Graphs

In CSDF, it is difficult to check the correctness of a transformation applied to the topology of a graph. The issue is that we do not know clearly whether a node performs a computation, routes data, or does a combination of both.

KRG has been introduced to tackle this issue. KRG is inspired by CSDF; the key difference is that nodes are split into two classes, as done in BDF: one for stateless routing nodes and another one for computation nodes. In KRG, only routing nodes have cyclic repetitive patterns as in CSDF, while computation nodes behave as in MG. In KRG, there are two kinds of routing nodes, called Select and Merge, akin to demux and mux. Such routing nodes have respectively two outputs (Select) or two inputs (Merge) at most, which allows the use of binary words instead of positive integers to describe the cyclic repetitive pattern.

This section is longer and more technical than the previous ones. After the definition of KRG, we show the decidability of buffer boundedness through an abstraction to SDF. Then, we show how to check deadlock-freedom through symbolic simulation, and also through the use of a dependency graph. Such a dependency graph leads to a transformation similar to the one from an SDF graph to a MG; such a transformation is needed, for instance, to apply a legal retiming in the case of SDF. Finally, we present KRG transformations on routing nodes, and show they are legal through the use of the on and when operators applied to routing patterns.

Before providing the definition of a KRG, we recall some definitions borrowed from n-synchrony theory [31]:

- $\mathbb{B} = \{0, 1\}$ is the set of binary values.
- $\mathbb{B}^*$ is the set of finite binary words.
- $\mathbb{B}^+$ is the set of such words except the empty word $\varepsilon$.
- $\mathbb{B}^\omega$ is the set of infinite binary sequences.
- We write $|w|$ for the length of the binary word $w$.
- We write $|w|_0$ and $|w|_1$ respectively for the number of occurrences of 0 and 1 in $w$.
- We write $w_i$ for the $i$th letter of $w$; we note $w_{\mathrm{head}} = w_1$ and $w_{\mathrm{tail}}$ such that $w = w_{\mathrm{head}}.w_{\mathrm{tail}}$.
- We write $[w]_i$ for the position of the $i$th "1" in $w$. For instance, $[0101101110]_4 = 7$. As a rule, $[0]_1 = +\infty$.
- A sequence $s$ in $\mathbb{B}^\omega$ is said to be ultimately periodic if and only if it is of the form $u.v^\omega$, where $u \in \mathbb{B}^*$ and $v \in \mathbb{B}^+$. We call $u$ the initial part and $v$ the steady (or periodic) part.
- We write $\mathcal{P}^k_p$ for the set of ultimately periodic binary sequences (or $k$-periodic sequences) with a steady part of $p$ letters, including $k$ occurrences of 1. We call $k$ the periodicity and $p$ the period.
- We write $\mathcal{P}$ for the set of all ultimately periodic sequences (some of these notations are transcribed in the short sketch below).
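A few of these notations transcribe directly into a short Python sketch, with binary words represented as strings of '0' and '1' (an illustrative encoding):

```python
# Minimal helpers over binary words as '0'/'1' strings; illustrative encoding.
import math

def ones(w):                 # |w|_1: number of occurrences of 1 in w
    return w.count("1")

def pos_of_one(w, i):        # [w]_i: 1-based position of the i-th '1', +inf if absent
    seen = 0
    for k, letter in enumerate(w, start=1):
        if letter == "1":
            seen += 1
            if seen == i:
                return k
    return math.inf

assert pos_of_one("0101101110", 4) == 7   # the example given above
```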

Definition 6 (K-periodically Routed Graph). A K-periodically Routed Graph (KRG) is a quintuple $\langle N, P, T, M_0, R \rangle$, where:

- $N$ is a finite set of nodes, divided into four distinct subsets:
  – $N_x$ is the set of computation nodes.
  – $N_c$ is the set of stateless copy nodes. Each copy node has exactly one input and at least two outputs.
  – $N_s$ is the set of select nodes. Each select node has exactly one input and two outputs.
  – $N_m$ is the set of merge nodes. Each merge node has exactly two inputs and one output.
- $P$ is a finite set of places. Each place has exactly one producer and one consumer.
- $T \subseteq (N \times P) \cup (P \times N)$ is a finite set of edges between nodes and places.
- $M_0 : P \to \mathbb{N}$ is a function assigning an initial marking to each place.
- $R : N_s \cup N_m \to \mathcal{P}$ is a function assigning a routing sequence to each select and merge node.

Places and edges model point-to-point links between nodes. A KRG is conflict-free.

Given a place $p$, we denote by $\bullet p$ and $p\bullet$ its input and output node respectively. Similarly, for a node $n$, we denote by $\bullet n$ and $n\bullet$ its sets of input places and output places.

The behavior of a KRG is as follows:

- A computation node has the same behavior as in MG: when a token is present on each input place, the node is enabled. An enabled node can be executed; when executed, it consumes one token on each input place and produces one token on each output place.
- A copy node is a special case of a stateless computation node where the tokens on the input place are duplicated on each output place.

Fig. 2.10 Correcting INS velocity using GPS position: (a) KRG of the system; (b) SDF abstraction of the KRG

- A select node splits a token flow into two parts. It is enabled when there is a token on its unique input place; when executed, it consumes the input token and produces a token on one of its output places. The output is given by the current index of the routing pattern: if the index points to a 1 (resp. 0), then the token is produced on the output labeled 1 (resp. 0). At the end of the execution, the index of the routing pattern is incremented modulo the length of the steady part of the routing pattern. See Fig. 2.9 and Example 6 for a similar select node in CSDF.
- A merge node interleaves two token flows into one. It is enabled when a token is available on the input place pointed to by the current index of the routing pattern. When executed, the input token is consumed and routed to the output place. At the end of the execution, the index is incremented modulo the length of the steady part of the routing pattern. (A small sketch of these routing semantics follows.)
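The select and merge semantics above can be summarized by a small Python sketch operating on finite token lists, with a routing sequence given by its steady part as a string of '0' and '1' (an illustrative encoding, ignoring enabling and place capacities):

```python
# Minimal sketch of select/merge routing over finite flows; illustrative only.
from itertools import cycle

def select(flow, steady):
    """Scatter an input flow onto outputs 0 and 1 according to the routing pattern."""
    out0, out1 = [], []
    for token, bit in zip(flow, cycle(steady)):
        (out1 if bit == "1" else out0).append(token)
    return out0, out1

def merge(flow0, flow1, steady):
    """Gather two flows into one according to the routing pattern."""
    it0, it1, out = iter(flow0), iter(flow1), []
    for bit in cycle(steady):
        try:
            out.append(next(it1 if bit == "1" else it0))
        except StopIteration:
            break                      # stop when the selected input runs dry
    return out

assert select(["a", "b", "c", "d"], "01") == (["a", "c"], ["b", "d"])
assert merge(["a", "c"], ["b", "d"], "01") == ["a", "b", "c", "d"]
```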

Example 7. We come back to our Kalman filter example. Now we integrate it into a navigation system that computes the velocity of a vehicle, as shown in Fig. 2.10a. Inertial Navigation Systems suffer from integration drift: accumulating small errors in the measurement of accelerations induces large errors over time. The INS velocity is corrected at a low rate (1 Hz) with integrated error states based on GPS data at a higher rate (10 Hz).

Integrating ten error states is performed with a for loop: at the beginning of each period, the merge node consumes a token on its right input, where the result is reset to 0. Then, data loop through the left input of the merge node for the next nine iterations. Conversely, the select node acts as the break condition: the first nine tokens are routed to its left output, while the tenth exits the loop. The SDF abstraction of this KRG is given in Fig. 2.10b.

2.3.2.1 Buffer Boundedness Check

As in CSDF, buffer boundedness is decided through an SDF abstraction. We solve the balance equations of the SDF graph in order to compute node firing rates and buffer bounds.


Given a KRG, the corresponding SDF graph is constructed as follows:

- The communication topology of the graph is identical for places and edges.
- Computation and copy nodes are abstracted as SDF nodes with a weight of 1 on all inputs and all outputs.
- Select and merge nodes are also abstracted as SDF nodes, where the weight of each labeled input or output corresponds to the number of occurrences of that label in the steady part of the routing pattern. For an unlabeled input or output, the associated weight is the length of the steady part of the routing pattern.

Figure 2.10b shows the result of applying this abstraction. Solving the balance equations to show that the KRG is bounded is left as an exercise.
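As a small illustration, the SDF weights of a routing node follow directly from the steady part of its routing sequence (given here as a '0'/'1' string; an illustrative helper, whose output could be fed to a balance-equation solver such as the sketch given earlier for SDF):

```python
# Weights of the SDF abstraction of routing nodes; illustrative encoding only.
def select_weights(steady):
    return {"input": len(steady),              # unlabeled port: pattern length
            "output_0": steady.count("0"),     # labeled ports: occurrences of the label
            "output_1": steady.count("1")}

def merge_weights(steady):
    return {"input_0": steady.count("0"),
            "input_1": steady.count("1"),
            "output": len(steady)}

assert select_weights("0001") == {"input": 4, "output_0": 3, "output_1": 1}
```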

2.3.2.2 Deadlock-Freedom Check

Given a bounded KRG, deadlock-freedom only depends on its initial markings. We can use a bounded-length symbolic simulation with the following halting conditions (a simulation sketch for the simpler marked-graph case follows the list):

- Deadlock: simulation ends when no node is enabled.
- Periodic and live behavior: simulation ends when we find a periodic behavior. When we get back to an already reached marking at index $x$ in the list of reached markings, we check that the marking segments $[x : x+j-1]$ and $[x+j : x+2j-1]$ are equal, where $j$ is the distance from $x$ to the new marking in the list of reached markings. If they are equal, then we have found a periodic behavior $[x : x+j-1]$ with an initialization $[0 : x-1]$.
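The following hedged Python sketch illustrates this kind of bounded-length simulation in the simpler setting of a plain marked graph, where the marking alone is the whole state (for a KRG, the current indices of the routing patterns are part of the state as well); the graph encoding and the names are illustrative.

```python
# Bounded-length ASAP simulation of a marked graph; illustrative encoding only.
# nodes: node -> (input places, output places); marking: place -> token count.
def simulate(nodes, marking, max_steps=10_000):
    seen = {}                                  # marking -> step at which it was reached
    current = dict(marking)
    for step in range(max_steps):
        key = tuple(sorted(current.items()))
        if key in seen:
            return "periodic", seen[key], step - seen[key]   # transient length, period
        seen[key] = step
        enabled = [n for n, (ins, _) in nodes.items()
                   if all(current[p] > 0 for p in ins)]
        if not enabled:
            return "deadlock", step, None
        for n in enabled:                      # ASAP: fire every enabled node once
            ins, outs = nodes[n]
            for p in ins:
                current[p] -= 1
            for p in outs:
                current[p] += 1
        # firing all enabled nodes together is safe: each place has one consumer
    return "inconclusive", max_steps, None
```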

2.3.2.3 Static Schedulability and Throughput

KRG is confluent due to the conflict-freedom property of places: when a node is enabled, it remains enabled until it is executed.

We can introduce latency constraints in our model; we call the result Timed KRG. We define a new function associated with a KRG: $L : N \cup P \to \mathbb{N}$ is a latency function, assigning an integer latency to each place and node. The latencies of copy, merge and select nodes are supposed to be null. N.B.: we assume that there is no circuit where the sum of latencies is null; otherwise we would have a combinational cycle, as in synchrony.

Computation and communication latencies are integers and possibly non-unit. However, the synchronous paradigm assumes evaluations at each instant. This problem is solved by graph expansion of the KRG, in a similar way as for TMG [68]: each node with non-null latency (resp. place with latency higher than 1) is split using intermediate nodes with null latency and places with unit latency. A non-reentrant node can be modeled with a data dependency (or loop) [4], as shown in Fig. 2.5.


Fig. 2.11 Example of KRG scheduling, shown in bold font. We have arbitrarily set place latencies to 1 and node latencies to 0

Timed KRG have the same expressivity as KRG. Similarly to what is done for MG, we introduce an ASAP firing rule such that any enabled node is executed immediately. Since a KRG is confluent, any firing rule generates a partial order of events compatible with the ASAP firing rule. The ASAP firing rule yields the fastest achievable throughput for a given KRG.

Definition 7. The rate of a binary word $w$ is defined as $\mathrm{rate}(w) = 1$ if $w = \varepsilon$, and $\mathrm{rate}(w) = |w|_1 / |w|$ otherwise. The rate of a $k$-periodic sequence $s = u.v^\omega$ equals the rate of its steady part: $\mathrm{rate}(s) = \mathrm{rate}(v)$.

Definition 8. The throughput of a node is the rate of the steady part of its schedule. The throughput of a KRG is the list of all input and output node throughputs.

Unlike for Marked Graphs, the throughput of a KRG may differ from the throughput of its slowest circuit (or path).

Example 8. We introduce latencies on the KRG in Fig. 2.11. We assume here that place latency equals 1 and node latency equals 0. The bold sequences next to each node are their respective schedules, under an ASAP firing rule and with respect to boundedness constraints.

2.3.2.4 Dependency Analysis

Dependency analysis [51] is one of the most useful tools for compilation and automatic parallelization [35]. Some of its techniques can be transposed to KRG to deal with token flow dependencies.

Dependency analysis in our case can provide an expansion from a bounded KRG to a MG with the same behavior. Such an expansion is useful to check whether a transformation applied to a KRG is legal or not. It can also be used to show whether two instances of a KRG have the same behavior modulo timing shifts.

First, we introduce some definitions on relations and ordering properties on token flows. Next, we explain data dependencies in token flows and their representation as dependency graphs. Then, we show how to build a behavior-equivalent MG.


Definition 9 (Token flow). Let $E$ be the set of tokens passing through a node $n$ or a place $p$ of a KRG within a period. Sequential productions, then consumptions, define a total order $<_{\mathrm{seq}}$ on $E$. The token flow $a$ passing through $n$ or $p$ is a sequence $a_1.a_2\ldots$ such that, for all $a_i, a_j \in E$, $i < j$ if and only if $a_i <_{\mathrm{seq}} a_j$.

Definition 10 (Prefix). The prefix of length $n$ of a binary word or sequence $w$ is defined as $\mathrm{pre}(w, n) = w_1 \ldots w_n$.

Definition 11 (Relations between flows). Nodes and places of a KRG apply transformations on their input flows to produce output flows. The relations between such flows are as follows:

- Let $p \in P$, with input flow $a$ and output flow $b$. Then $b = c.a$, where $c$ is the prefix caused by the initial tokens in $p$, and $|c| = M_0(p)$.
- Let $n \in N_x$, with input flow $a$ and output flow $b$. Then $b = f(a_1).f(a_2)\ldots$, where $f$ is the function applied by $n$ to token data.
- Let $n \in N_c$, with input flow $a$ and output flow $b$. Then $b = a$.
- Let $n \in N_s$, with input flow $a$, and output flows $b$ and $c$ on the zeroth and first output respectively. There is a sequence of sub-words of $a$ such that $a = a_1.a_2.a_3.a_4\ldots$, $b = a_1.a_3\ldots$ and $c = a_2.a_4\ldots$, chosen in such a way that the first $|a_1|$ letters of $R(n)$ are only 0s, the next $|a_2|$ are only 1s, and so on.
- Let $n \in N_m$ be a merge node, with input flows $a$ and $b$ on the zeroth and first input respectively, and output flow $c$. Then $c = a \,\mathrm{x}_{R(n)}\, b$.

We recall that $\mathrm{x}$ is the shuffle product of two words or sequences, recursively defined by the equations:

$$w \,\mathrm{x}\, \varepsilon = \varepsilon \,\mathrm{x}\, w = \{w\}, \qquad (2.7)$$
$$a.v \,\mathrm{x}\, b.w = \{a.(v \,\mathrm{x}\, b.w)\} \cup \{b.(a.v \,\mathrm{x}\, w)\}. \qquad (2.8)$$

We write $\mathrm{x}_u$ for the restriction of the shuffle product according to a sequence $u$; if $c = a \,\mathrm{x}_u\, b$, then:

$$\forall i \in \mathbb{N}^*,\quad c_i = \begin{cases} a_{|\mathrm{pre}(u, i)|_0} & \text{if } u_i = 0,\\ b_{|\mathrm{pre}(u, i)|_1} & \text{if } u_i = 1. \end{cases} \qquad (2.9)$$

Definition 12 (Elementary order relations). We associate with each token going through a node (resp. place) a pair $(i, j) \in \mathbb{N}^* \times \mathbb{N}^*$, where $i$ is the token position in the node (resp. place) input flow, and $j$ is its position in the output flow. There exists a set of firing relations of the form $\curvearrowright \;\subseteq \mathbb{N}^* \times \mathbb{N}^*$ such that, for each node (resp. place), $i \curvearrowright j$.

We deduce the following relations from Definitions 11 and 12:

$$\forall p \in P,\ \forall i \in \mathbb{N}^*,\quad i \curvearrowright_p (i + M_0(p)) \qquad (2.10)$$


$$\forall n \in N_x \cup N_c,\ \forall i \in \mathbb{N}^*,\quad i \curvearrowright_n i \qquad (2.11)$$
$$\forall s \in N_s,\ \forall i \in \mathbb{N}^*,\quad \left[\overline{R(s)}\right]_i \curvearrowright_{s,0} i \qquad (2.12)$$
$$\Leftrightarrow\quad i \curvearrowright_{s,0} \left|\mathrm{pre}\!\left(\overline{R(s)}, i\right)\right|_1 \qquad (2.13)$$
$$\forall s \in N_s,\ \forall i \in \mathbb{N}^*,\quad [R(s)]_i \curvearrowright_{s,1} i \qquad (2.14)$$
$$\Leftrightarrow\quad i \curvearrowright_{s,1} |\mathrm{pre}(R(s), i)|_1 \qquad (2.15)$$
$$\forall f \in N_m,\ \forall i \in \mathbb{N}^*,\quad i \curvearrowright_{f,0} \left[\overline{R(f)}\right]_i \qquad (2.16)$$
$$\forall f \in N_m,\ \forall i \in \mathbb{N}^*,\quad i \curvearrowright_{f,1} [R(f)]_i \qquad (2.17)$$

These relations are monotone: $\forall (i_1, j_1), (i_2, j_2) \in \curvearrowright$, $i_1 \leq i_2 \Leftrightarrow j_1 \leq j_2$. Places, copy and computation nodes do not alter token order; there is a bijection between input and output token indices. The relations of merge and select nodes depend on their routing patterns: a select relation is surjective, while a merge relation is injective.

For a given KRG, the set of constraints can be modeled as a dependency graph:

Definition 13 (Dependency graph). A dependency graph (or hypergraph) is a triple $\langle J, D, I \rangle$ where:

- $J$ is a finite set of tokens, the union of the tokens of the KRG flows.
- $D$ is a finite multiset of dependencies, each one in $J \times J$ and corresponding to a data dependency.
- $I \subseteq J$ is the subset of initial tokens.

Dependencies are of three kinds:

Flow dependency. Basically, a flow dependency between two instructions A and B means that A writes a datum into a place, which is then read by B. In a KRG, this dependency from a token $a_1$ to a token $b_1$ means that producing $b_1$ requires consuming $a_1$. Figure 2.12a gives an example of a cyclic flow dependency: token $b_1$ is an initial datum; it is consumed by node a to produce $a_1$, which is then consumed to get back to the initial state. The associated dependency graph is given in Fig. 2.12d.

Output dependency. An output dependency corresponds to consecutive writes into a shared resource. In particular, places behave as FIFOs, and FIFO heads can be seen as such shared resources, as depicted in Fig. 2.12b: a token can be consumed if and only if all its predecessors have already been consumed. The corresponding dependency graph is given in Fig. 2.12e.

Antidependency. An antidependency between two instructions A and B means that A reads a datum in a shared resource before B overwrites it. In our case, it means that an output dependency $a_i \to a_{i+1}$ and a flow dependency $a_i \to b_j$ introduce an antidependency $b_j \to a_{i+1}$: producing $b_j$ (hence consuming $a_i$) allows $a_{i+1}$ to be produced. An example is given in Figs. 2.12c and 2.12f.


Fig. 2.12 Examples of the different dependency types: (a) KRG with flow dependency; (b) KRG with output dependency; (c) KRG with antidependency; (d) dependency graph of (a), dependencies between flows a and b; (e) dependency graph of (b), output dependency in flow a; (f) dependency graph of (c), antidependency between a2 and b1

Rule 1 (Constructing the dependency graph). Let $g = \langle N, P, T, M_0, R \rangle$ be a bounded KRG. We construct its dependency graph $\delta = \langle J, D, I \rangle$ as follows:

- We abstract the KRG as an SDF graph and solve the balance equations with refined constraints: for each merge node $n$, we assert that it shall be fired at least $\sum_{p \in \bullet n} M_0(p)$ times over a period; any other node $n'$ shall be fired at least $\max_{p \in \bullet n'} M_0(p)$ times over a period. This way, we compute the number of tokens that have to be considered over a period for each flow.
- We associate a unique token $j \in J$ with each token in each flow of $g$. A token belongs to $I$ if it is initially present in its flow (or place).
- Tokens of the dependency graph are linked together as mentioned above, according to the KRG topology and routing, and with respect to the relations of Definition 11. Flow dependencies link pairs of tokens if the consumption of the first one allows the production of the other one through a node firing. For each flow $a$, an output dependency links each token to its successor: $a_i \to a_{i+1}$. If, for three tokens $a_i$, $a_{i+1}$ and $b_j$, there is a flow dependency $a_i \to b_j$ and an output dependency $a_i \to a_{i+1}$, then we create an antidependency $b_j \to a_{i+1}$.

The dependency graph is then transposed into an equivalent MG, as stated in the next rule.


Rule 2 (Constructing the equivalent Marked Graph). A MG $\langle N, P, T, M_0 \rangle$ models a dependency graph $\langle J, D, I \rangle$ as follows:

- Each token $j \in J$ corresponds to a unique quintuplet $(n, n', p, (n, p), (p, n')) \in N \times N \times P \times T \times T$.
- Let $j_1$ and $j_2$ be two tokens in $J$, and $(n_1, n'_1, p_1, (n_1, p_1), (p_1, n'_1))$ and $(n_2, n'_2, p_2, (n_2, p_2), (p_2, n'_2))$ their corresponding quintuplets, respectively. If $(j_1, j_2) \in D$ is a flow or output dependency, then $n'_1 = n_2$; otherwise $n'_1 \neq n_2$.
- For all $j \in J$ and its place $p \in P$, $M_0(p) = 1$ if $j \in I$, and 0 otherwise.

Using this equivalent MG, we can show, as with bounded-length simulation, that a KRG is deadlock-free: using any minimum cycle mean algorithm, if the throughput of the MG is greater than 0 then the KRG is deadlock-free, otherwise it suffers from deadlocks.

Example 9. We slightly modify the KRG in Fig. 2.10a, considering only three iterations. The routing sequences of the select and merge nodes are $(0^2.1)^\omega$ and $(1.0^2)^\omega$ respectively.

Using Rule 1, we compute that the select, merge and add nodes are executed three times, and the other nodes are fired once over a period. We obtain the dependency graph in Fig. 2.13a. Then, applying Rule 2, we build the equivalent MG, shown in Fig. 2.13b. Multi-dependencies of a given predecessor have been simplified for clarity; the graph could be further simplified by removing useless select and merge nodes. This graph has no circuit with an empty token count, so its throughput is not null and the original KRG is deadlock-free.

Fig. 2.13 Dependency analysis of the KRG in Fig. 2.10a, considering three iterations only: (a) flow and output dependencies; (b) corresponding MG (multi-dependencies have been simplified for clarity)


2.3.2.5 Topological Transformations

As stated previously, one of the goals of KRG is to enable building and checking the correctness of topological transformations of a KRG. Here, we provide simple transformations modifying the graph topology of a given KRG while preserving its original semantics: tokens can be routed onto different flows and then merged together, ensuring an order of events compatible with the original topology. In order to check the correctness of the system, we need to introduce operators acting on token flows, in this case on the routing patterns attached to select and merge nodes. We recall the On operator borrowed from n-synchronous theory [31], and we define the When operator.

Definition 14 (On operator). The On operator, written on, is recursively defined on binary words as follows: $\forall n \in \mathbb{N}$, $\forall u \in \mathbb{B}^n$, $\forall v \in \mathbb{B}^{|u|_1}$,

$$\varepsilon \,\mathrm{on}\, \varepsilon = \varepsilon \qquad (2.18)$$

$$u \,\mathrm{on}\, v = \begin{cases} 0.(u_{\mathrm{tail}} \,\mathrm{on}\, v) & \text{if } u_{\mathrm{head}} = 0,\\ v_{\mathrm{head}}.(u_{\mathrm{tail}} \,\mathrm{on}\, v_{\mathrm{tail}}) & \text{if } u_{\mathrm{head}} = 1. \end{cases} \qquad (2.19)$$

Definition 15 (When operator). The When operator, written when, is recursively defined on binary words such that: $\forall n \in \mathbb{N}$, $\forall u, v \in \mathbb{B}^n$, if $n = 0$,

$$u \,\mathrm{when}\, v = \varepsilon \,\mathrm{when}\, \varepsilon = \varepsilon \qquad (2.20)$$

otherwise,

$$u \,\mathrm{when}\, v = \begin{cases} u_{\mathrm{tail}} \,\mathrm{when}\, v_{\mathrm{tail}} & \text{if } v_{\mathrm{head}} = 0,\\ u_{\mathrm{head}}.(u_{\mathrm{tail}} \,\mathrm{when}\, v_{\mathrm{tail}}) & \text{if } v_{\mathrm{head}} = 1. \end{cases} \qquad (2.21)$$

Equivalently: $\forall n \in \mathbb{N}$, $\forall u, v \in \mathbb{B}^n$, $\exists w \in \mathbb{B}^{|v|_1}$,

$$u \,\mathrm{when}\, v = w \;\Leftrightarrow\; \forall i \in [\![1, |w|]\!],\ w_i = u_{[v]_i}. \qquad (2.22)$$
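Both operators transcribe directly into a short recursive Python sketch on finite words represented as strings of '0' and '1' (an illustrative encoding; the precondition of Definition 14, $|v| = |u|_1$, is assumed):

```python
# Direct transcription of Definitions 14 and 15 on finite binary words.
def on(u, v):                       # u on v, assuming |v| equals the number of 1s in u
    if u == "":
        return ""
    if u[0] == "0":
        return "0" + on(u[1:], v)
    return v[0] + on(u[1:], v[1:])

def when(u, v):                     # u when v: sample u at the '1' positions of v
    if u == "":
        return ""
    if v[0] == "0":
        return when(u[1:], v[1:])
    return u[0] + when(u[1:], v[1:])

assert on("1011", "101") == "1001"
assert when("1011", "0110") == "01"
```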

Places can be shared by different token flows, for instance if data are serialized through a shared communication medium. Expanding such places, as shown in Fig. 2.14a, is equivalent to replacing the shared medium by as many point-to-point links as required. This transformation may introduce more concurrency in the KRG but still preserves token order over each flow.

Now, we introduce a useful lemma that simplifies the upcoming proof about place expansion. This lemma states an equivalence between flows.

Lemma 2. $\forall u, v \in \mathcal{P}$, $\forall i \in \mathbb{N}^*$,

$$v_{[u]_i} = 1 \;\Rightarrow\; |\mathrm{pre}(v, [u]_i)|_1 = [u \,\mathrm{when}\, v]_{|\mathrm{pre}(v \,\mathrm{when}\, u,\, i)|_1}. \qquad (2.23)$$

Fig. 2.14 Expanding and factorizing places: (a) place expansion; (b) example of a graph where reverse factorization is impossible

Proof. We illustrate the proof with the example of Fig. 2.14a (left). The left side of the implication asserts that the $[u]_i$th token going through the merge node is routed onto the first output of the select node, i.e., the $i$th token in flow $b$ is routed towards flow $d$. We show that both operands of the equality are two different ways to write the index, in flow $d$, of this $i$th token of flow $b$.

On the left-hand side: $[u]_i$ is the position, in the merge output flow, of the $i$th token of $b$ (the position of the $i$th occurrence of "1" in the routing sequence). $\mathrm{pre}(v, [u]_i)$ corresponds to the routing by the select node of the tokens up to the $i$th one from $b$. We have assumed that the $[u]_i$th letter of the select sequence is "1", so the position of this token in $d$ equals the number of "1" in this prefix.

On the right-hand side: $v \,\mathrm{when}\, u$ is the sampling of the routing sequence of the select node by the one of the merge node; it corresponds to the routing, by the select node, of all tokens issued from $b$. Then, $|\mathrm{pre}(v \,\mathrm{when}\, u, i)|_1$ is the number of those tokens, up to the $i$th one, routed to $d$. Conversely, $u \,\mathrm{when}\, v$ corresponds to the origins of the tokens in $d$. Finally, $[u \,\mathrm{when}\, v]_{|\mathrm{pre}(v \,\mathrm{when}\, u, i)|_1}$ is the position, among the tokens in $d$, of the $i$th token coming from $b$.

Definition 16 (Order relation on a path). Let $\Sigma$ be the set of elementary paths in a KRG. A path $\sigma \in \Sigma$ is of the form $\sigma = n_1.p_1.n_2.\ldots.n_{l+1}$, with $n_1, \ldots, n_{l+1} \in N$ and $p_1, \ldots, p_l \in P$. We associate a relation $\curvearrowright_\sigma \subseteq (\mathbb{N}^* \times \mathbb{N}^*)$ with each path $\sigma$ such that $\curvearrowright_\sigma = \curvearrowright_{n_{l+1}} \circ \curvearrowright_{p_l} \circ \ldots \circ \curvearrowright_{n_1}$, where $\curvearrowright_b \circ \curvearrowright_a = \{(i, k) \mid \exists (i, j) \in \curvearrowright_a,\ (j, k) \in \curvearrowright_b\}$.

$\curvearrowright_\sigma$ is a monotone relation since it is a composition of monotone relations.

Definition 17 (Order preservation). A KRG is said to be order-preserving if and only if $\forall n_1, n_2 \in N$, $\forall \sigma_1, \sigma_2 \in \Sigma_{n_1 n_2}$ such that $\sigma_1 \neq \sigma_2$, $\curvearrowright_{\sigma_1} \cup \curvearrowright_{\sigma_2}$ is monotone.

Proposition 1 (Expanding a place). Expanding a place, as described in Fig. 2.14a, preserves order relations over token flows.

Proof. Using and composing flow relations, we infer the following relations between the input and output flows:

Page 89: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

2 Formal Modeling of Embedded Systems 71

Figure 2.14a (left):
$$a \curvearrowright_{s,0} \circ \curvearrowright_p \circ \curvearrowright_{f,0} |\mathrm{pre}(v, [u]_a)|_1$$
$$a \curvearrowright_{s,1} \circ \curvearrowright_p \circ \curvearrowright_{f,0} |\mathrm{pre}(v, [u]_a)|_1$$
$$b \curvearrowright_{s,0} \circ \curvearrowright_p \circ \curvearrowright_{f,1} |\mathrm{pre}(v, [u]_b)|_1$$
$$b \curvearrowright_{s,1} \circ \curvearrowright_p \circ \curvearrowright_{f,1} |\mathrm{pre}(v, [u]_b)|_1$$

Figure 2.14a (right):
$$a \curvearrowright_{f,0} \circ \curvearrowright_p \circ \curvearrowright_{s,0} [u \,\mathrm{when}\, v]_{|\mathrm{pre}(v \,\mathrm{when}\, u,\, a)|_1}$$
$$a \curvearrowright_{f,0} \circ \curvearrowright_p \circ \curvearrowright_{s,1} [u \,\mathrm{when}\, v]_{|\mathrm{pre}(v \,\mathrm{when}\, u,\, a)|_1}$$
$$b \curvearrowright_{f,1} \circ \curvearrowright_p \circ \curvearrowright_{s,0} [u \,\mathrm{when}\, v]_{|\mathrm{pre}(v \,\mathrm{when}\, u,\, b)|_1}$$
$$b \curvearrowright_{f,1} \circ \curvearrowright_p \circ \curvearrowright_{s,1} [u \,\mathrm{when}\, v]_{|\mathrm{pre}(v \,\mathrm{when}\, u,\, b)|_1}$$

Then, the equalities between left and right relations are shown by direct application of Lemma 2.

Notice that factorizing places is not always possible, as illustrated in Fig. 2.14b. This is because some token orders allowed by point-to-point connections are incompatible with a token sequentialization that generates a total order.

Figure 2.15a presents the token dependencies of the example with point-to-point connections in Fig. 2.14b. $\alpha$, $\beta$, $\gamma$ and $\delta$ are names given to the tokens in the middle places, between the select and merge nodes. This graph is acyclic: we can easily find routing patterns for the select and merge nodes by going up the dependency paths.

In Fig. 2.15b, tokens have been arbitrarily sequentialized, as in Fig. 2.14b (right). The middle flow equals $\alpha\delta\gamma\beta = a_1 b_1 a_2 b_2$. In that case, the associated dependency graph is cyclic:

- Circuit 1: $d_2 \to d_1 \to \gamma \to d_2$
- Circuit 2: $c_2 \to c_1 \to \beta \to \gamma \to \delta \to c_2$
- Circuit 3: $c_2 \to c_1 \to \beta \to b_2 \to b_1 \to \delta \to c_2$

a2 b2

a1 b1

˛ ˇ � ı

c2 d2

c1 d1

(a) Dependency graph of the expandedform, in Fig. 2.14b (left)

a2 b2

a1 b1

˛ ˇ � ı

c2 d2

c1 d1

(b) Dependency graph of an arbitrarilyfactorized form, in Fig. 2.14b (right)

Fig. 2.15 Comparison of dependencies between expanded and factorized forms: sequentializingintroduces dependencies

Page 90: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

72 J. Boucaron et al.

and b2a2b1a1. None of this word is in ax b: such solutions falsify Definition 11.It is impossible to find a valid sequentialization.

Proposition 2 (Merge permutation). Permuting merge nodes, as shown in Fig. 2.16a, preserves token orders.

Proof. From Definitions 12 and 16, we deduce the following relations:

Figure 2.16a (left):
$$a \curvearrowright_{f,0} [v]_a$$
$$b \curvearrowright_{f,1} \circ \curvearrowright_p \circ \curvearrowright_{f,0} [v]_{[u]_b}$$
$$c \curvearrowright_{f,1} \circ \curvearrowright_p \circ \curvearrowright_{f,1} [v]_{[u]_c}$$

Figure 2.16a (right):
$$a \curvearrowright_{f,0} \circ \curvearrowright_p \circ \curvearrowright_{f,0} [v \,\mathrm{on}\, u]_{[v \,\mathrm{when}\, (v \,\mathrm{on}\, u)]_a}$$
$$b \curvearrowright_{f,0} \circ \curvearrowright_p \circ \curvearrowright_{f,1} [v \,\mathrm{on}\, u]_{[v \,\mathrm{when}\, (v \,\mathrm{on}\, u)]_b}$$
$$c \curvearrowright_{f,1} [v \,\mathrm{on}\, u]_c$$

Then, we have:

(a) $v \,\mathrm{on}\, u \,\mathrm{on}\, (v \,\mathrm{when}\, (v \,\mathrm{on}\, u)) = (v \,\mathrm{on}\, u) \oplus (v \,\mathrm{on}\, u \,\mathrm{on}\, (v \,\mathrm{when}\, (v \,\mathrm{on}\, u))) = (v \,\mathrm{on}\, u) \oplus (v \wedge (v \,\mathrm{on}\, u)) = (v \,\mathrm{on}\, u) \oplus (v \vee (v \,\mathrm{on}\, u)) = v$.

(b) $v \,\mathrm{on}\, u \,\mathrm{on}\, (v \,\mathrm{when}\, (v \,\mathrm{on}\, u)) = v \wedge (v \,\mathrm{on}\, u) = v \wedge (v \oplus (v \,\mathrm{on}\, u)) = v \,\mathrm{on}\, u$.

(c) $v \,\mathrm{on}\, u = v \,\mathrm{on}\, u$.

The missing algebraic properties and the detailed proofs can be found in Boucaron et al. [12].

Fig. 2.16 Permuting routing nodes: (a) permuting merge nodes; (b) permuting select nodes


Proposition 3 (Select permutation). Permuting select nodes, as shown in Fig. 2.16b, preserves token orders.

Proof. From Definitions 12 and 16, we deduce the following relations:

Figure 2.16b (left):
$$[u]_b \curvearrowright_{s,0} b$$
$$[u]_{[v]_c} \curvearrowright_{s,0} \circ \curvearrowright_p \circ \curvearrowright_{s,1} c$$
$$[u]_{[v]_d} \curvearrowright_{s,1} \circ \curvearrowright_p \circ \curvearrowright_{s,1} d$$

Figure 2.16b (right):
$$[u \,\mathrm{on}\, v]_{[u \,\mathrm{when}\, (u \,\mathrm{on}\, v)]_b} \curvearrowright_{s,0} \circ \curvearrowright_p \circ \curvearrowright_{s,0} b$$
$$[u \,\mathrm{on}\, v]_{[u \,\mathrm{when}\, (u \,\mathrm{on}\, v)]_c} \curvearrowright_{s,1} \circ \curvearrowright_p \circ \curvearrowright_{s,0} c$$
$$[u \,\mathrm{on}\, v]_d \curvearrowright_{s,1} d$$

The proof is similar to the one of Proposition 2, and can be found in Boucaron et al. [12].

Proposition 4 (Transformation consistency). All of the following transformations: place expansion (Fig. 2.14a), merge node permutation (Fig. 2.16a), and select node permutation (Fig. 2.16b), preserve the token flow relations between inputs and outputs. They alter neither graph boundedness nor liveness and deadlock-freedom properties.

Proof. By Propositions 1, 2 and 3, these transformations preserve the relations between input and output token flows.

Summary

This section provides an overview of statically controlled MoCCs. CSDF and KRG have the following properties:

- Deterministic behavior with respect to concurrency.
- Decidability of buffer boundedness during execution, through an abstraction to SDF and the solving of balance equations.
- Dynamic communication topology, which enables the reuse of resources for communication, computation and buffers.
- Static schedulability: the schedule of a bounded system can be computed at compilation time, together with the needed buffer sizes.
- Deadlock-freedom checks using bounded-length symbolic simulation.
- Correct-by-construction transformations of the communication topology and of the associated routing patterns, in the case of KRG.


2.4 Conclusion

Modern hardware designs are multicore and even becoming manycore, with possibly heterogeneous processors and direct hardware accelerators. This increased parallelism must match the needs of applications, themselves becoming more and more concurrent in the embedded market, which encompasses almost everything in this era of convergence. A profusion of formalisms has been proposed to cope with this problem of efficient compilation and mapping. We feel that process networks, as part of the more general Concurrency Theory, constitute an ideal mathematical field in which to assess the corresponding issues and analyze many of these proposals; one may recall here the impact of formal language theory (automata, Turing machines) on earlier sequential computer architectures. In order to fulfill that role, process networks have themselves to be thoroughly characterized and their mathematical properties explicitly described. There is already a large body of results; we try to advance the topics of scheduling and routing in this context of process networks, and the explicit representation of the actual schedules and routes obtained in favorable cases, namely when static, ultimately periodic solutions exist, which are the analogue of finite automata for sequential computations.

This chapter describes some process networks, also called Models of Computation and Communication (MoCCs). The presented MoCCs are simple and natural abstractions for concurrent applications. The simplicity of such MoCCs allows a great deal of automation using proved algorithms for compilation/synthesis, performance analysis and formal verification tools. This can enable better productivity and design reuse, and it can enable the production of highly reliable or formally proved products.

We briefly present pure dataflow MoCCs such as Synchrony, Latency-Insensitive Design, Marked Graphs and Synchronous Data Flow. All these pure dataflow MoCCs have a deterministic behavior with respect to concurrency and a static communication topology. They generate a partial order of events due to the conflict-freeness property: this opens many opportunities for scheduling, and it makes it easier to show the correctness of an applied transformation. All of them are statically schedulable: performance metrics can be derived at compilation time for both the size of memory resources and the throughput of the system.

Then, we present some statically controlled MoCCs such as Cyclo-Static Data Flow and K-periodically Routed Graphs. Statically controlled MoCCs allow data to be routed and the communication topology to be changed dynamically during system execution, but all routing decisions have to be known at compilation time. Statically controlled MoCCs still ensure desirable properties such as conflict-freeness and determinism with respect to concurrency. We can check whether buffers are bounded during run-time and, if so, we can also build a static schedule.

Of course, all the previous features come at a cost: the presented MoCCs do not have the expressiveness of Turing machines, and they have a statically known call graph at compilation time.

Nevertheless, the goal of such MoCCs is not to describe all applications; their expressivity is nonetheless powerful enough to describe a wide range of useful applications in signal processing (digital filters for audio, image, video), video games (physics, path finding, character animation), genetic research (DNA sequence alignment) and biochemistry (molecular dynamics).

Much remains to be done. Many features useful in practice defeat the conflict-freeness property (priority preemption, synchronous real-time) and should still be handled formally. Conversely, the unpredictability due to the many middleware layers, to the interconnect fabric and to uncertain memory access latencies should also be considered. But while formal techniques may not yet be able to tackle these issues, at least they make it possible to name them rather than falsely dismissing the topic by ignoring it. Also, there is a tremendous need for MoCC-friendly architectures [58] to simplify the development of tools: for the Worst Case Execution Time analysis used in real-time systems, for highly optimizing synthesizers/compilers, and for formal verification tools.

References

1. S. Amarasinghe. A stream compiler for communication-exposed architectures. In Proceedings of ASPLOS, 2002.
2. F. Anceau. A synchronous approach for clocking VLSI systems. IEEE Journal of Solid-State Circuits, 17:51–56, 1982.
3. C. Andre. Representation and analysis of reactive behaviors: A synchronous approach. In CESA, 1996.
4. F. Baccelli, G. Cohen, G. J. Olsder, and J.-P. Quadrat. Synchronization and Linearity. Wiley, Chichester, West Sussex, UK, 1992.
5. D. Baneres, J. Cortadella, and M. Kishinevsky. Variable-latency design using function speculation. In Proc. Design, Automation and Test in Europe, April 2009.
6. L. Benini, E. Macii, and M. Poncino. Telescopic units: Increasing the average throughput of pipelined designs by adaptive latency control. In DAC, pages 22–27, 1997.
7. A. Benveniste, P. Caspi, S. A. Edwards, N. Halbwachs, P. Le Guernic, and R. de Simone. The synchronous languages 12 years later. Proceedings of the IEEE, 91(1):64–83, 2003.
8. S. S. Bhattacharyya, P. K. Murthy, and E. A. Lee. Software Synthesis from Dataflow Graphs. Kluwer, Dordrecht, 1996.
9. G. Bilsen, M. Engels, R. Lauwereins, and J. Peperstraete. Cyclo-static dataflow. IEEE Transactions on Signal Processing, 44:397–408, February 1996.
10. A. Bouali. XEVE, an Esterel verification environment. In CAV '98: Proceedings of the 10th International Conference on Computer Aided Verification, pages 500–504, London, UK, 1998. Springer.
11. J. Boucaron, A. Coadou, and R. de Simone. Latency-insensitive design: retry relay-station and fusion shell. In Formal Methods for Globally Asynchronous Locally Synchronous Design 2009 Proceedings, 2009.
12. J. Boucaron, A. Coadou, B. Ferrero, J.-V. Millo, and R. de Simone. Kahn-extended event graphs. Research Report RR-6541, INRIA, 2008.
13. J. Boucaron, J.-V. Millo, and R. de Simone. Another glance at relay-stations in latency-insensitive design. In Formal Methods for Globally Asynchronous Locally Synchronous Design 2005 Proceedings, 2005.
14. J. Boucaron, J.-V. Millo, and R. de Simone. Latency-insensitive design and central repetitive scheduling. In Proceedings of the 4th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE'06), pages 175–183, Napa Valley, CA, USA, July 2006. IEEE Press.
15. J. Boucaron, J.-V. Millo, and R. de Simone. Formal methods for scheduling of latency-insensitive designs. EURASIP Journal on Embedded Systems, 2007(1), 8–8, 2007.
16. R. G. Brown and P. Y. C. Hwang. Introduction to Random Signals and Applied Kalman Filtering, 3rd edition. Wiley, New York, 1996.
17. R. E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, 1986.
18. J. T. Buck. Scheduling Dynamic Dataflow Graphs with Bounded Memory Using the Token Flow Model. PhD thesis, University of California, Berkeley, CA, USA, 1993.
19. D. Bufistov, J. Julvez, and J. Cortadella. Performance optimization of elastic systems using buffer resizing and buffer insertion. In ICCAD '08: Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design, pages 442–448, Piscataway, NJ, USA, 2008. IEEE Press.
20. J. L. Campbell. Application of Airborne Laser Scanner – Aerial Navigation. PhD thesis, Russ College of Engineering and Technology, Athens, OH, USA, February 2006.
21. L. P. Carloni, K. L. McMillan, A. Saldanha, and A. L. Sangiovanni-Vincentelli. A methodology for correct-by-construction latency-insensitive design. In Proceedings of the International Conference on Computer-Aided Design (ICCAD'99), pages 309–315, Piscataway, NJ, USA, November 1999. IEEE.
22. L. P. Carloni, K. L. McMillan, and A. L. Sangiovanni-Vincentelli. Theory of latency-insensitive design. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 20(9):1059–1076, 2001.
23. L. P. Carloni and A. L. Sangiovanni-Vincentelli. Performance analysis and optimization of latency insensitive systems. In DAC '00: Proceedings of the 37th Annual Design Automation Conference, pages 361–367, New York, NY, USA, 2000. ACM.
24. L. P. Carloni and A. L. Sangiovanni-Vincentelli. Combining retiming and recycling to optimize the performance of synchronous circuits. In SBCCI '03: Proceedings of the 16th Symposium on Integrated Circuits and Systems Design, page 47, Washington, DC, USA, 2003. IEEE Computer Society.
25. P. Caspi, D. Pilaud, N. Halbwachs, and J. Plaice. Lustre: a declarative language for programming synchronous systems. In POPL, pages 178–188, 1987.
26. P. Caspi and M. Pouzet. A functional extension to Lustre. In M. A. Orgun and E. A. Ashcroft, editors, International Symposium on Languages for Intentional Programming, Sydney, Australia, May 1995. World Scientific.
27. P. Caspi and M. Pouzet. Lucid Synchrone, version 1.01. Tutorial and reference manual. Laboratoire d'Informatique de Paris 6, January 1999.
28. M. R. Casu and L. Macchiarulo. A detailed implementation of latency insensitive protocols. In Proceedings of the 1st Workshop on Globally Asynchronous, Locally Synchronous Design (FMGALS'03), pages 94–103, September 2003.
29. M. R. Casu and L. Macchiarulo. A new approach to latency insensitive design. In Proceedings of the 41st Annual Conference on Design Automation (DAC'04), pages 576–581, 2004.
30. M. R. Casu and L. Macchiarulo. Adaptive latency-insensitive protocols. IEEE Design and Test of Computers, 24(5):442–452, 2007.
31. A. Cohen, M. Duranton, C. Eisenbeis, C. Pagetti, F. Plateau, and M. Pouzet. N-synchronous Kahn networks: a relaxed model of synchrony for real-time systems. In Conference Record of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'06), pages 180–193, New York, NY, USA, January 2006. ACM.
32. F. Commoner, A. W. Holt, S. Even, and A. Pnueli. Marked directed graphs. Journal of Computer and System Sciences, 5:511–523, October 1971.
33. J. Cortadella, M. Kishinevsky, and B. Grundmann. Synthesis of synchronous elastic architectures. In Proceedings of the 43rd Annual Conference on Design Automation (DAC'06), pages 657–662, New York, NY, USA, 2006. ACM.
34. M. Dall'Osso, G. Biccari, L. Giovannini, D. Bertozzi, and L. Benini. Xpipes: a latency insensitive parameterized network-on-chip architecture for multi-processor SoCs. In ICCD, pages 536–, 2003.
35. A. Darte, Y. Robert, and F. Vivien. Scheduling and Automatic Parallelization. Birkhauser, Boston, 2000.
36. A. Dasdan and R. Gupta. Faster maximum and minimum mean cycle algorithms for system-performance analysis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 17(10):889–899, 1998.
37. G. de Micheli. Synthesis and Optimization of Digital Circuits. McGraw-Hill, New York, 1994.
38. N. Een and N. Sorensson. An extensible SAT solver. In SAT Proceedings, volume 2919 of Lecture Notes in Computer Science, pages 502–518. Springer, Berlin, 2003.
39. J. Eker, J. Janneck, E. Lee, J. Liu, X. Liu, J. Ludvig, S. Neuendorffer, S. Sachs, and Y. Xiong. Taming heterogeneity – the Ptolemy approach. Proceedings of the IEEE, 91(1):127–144, 2003.
40. M. Engels, G. Bilsen, R. Lauwereins, and J. A. Peperstraete. Cyclo-static dataflow: model and implementation. In Conference Record of the Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, volume 1, pages 503–507, Pacific Grove, CA, USA, 1994.
41. D. Gebhardt and K. Stevens. Elastic flow in an application specific network-on-chip. In Formal Methods for Globally Asynchronous Locally Synchronous Design Proceedings, 2007.
42. R. Govindarajan and G. R. Gao. Rate-optimal schedule for multi-rate DSP computations. Journal of VLSI Signal Processing, 9(3):211–232, 1995.
43. T. Grandpierre and Y. Sorel. From algorithm and architecture specification to automatic generation of distributed real-time executives: a seamless flow of graphs transformations. In Proceedings of the First ACM and IEEE International Conference on Formal Methods and Models for Codesign, MEMOCODE'03, Mont Saint-Michel, France, June 2003.
44. C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8):666–677, 1978.
45. G. Hoover and F. Brewer. Synthesizing synchronous elastic flow networks. In DATE, pages 306–311, 2008.
46. D. B. Johnson. Finding all the elementary circuits of a directed graph. SIAM Journal on Computing, 4(1):77–84, 1975.
47. G. Kahn. The semantics of a simple language for parallel programming. In J. L. Rosenfeld, editor, Information Processing '74: Proceedings of the IFIP Congress, pages 471–475, New York, NY, USA, 1974. North-Holland.
48. M. Karczmarek. Phased scheduling of stream programs. In Proceedings of LCTES, 2003.
49. R. Karp. A characterization of the minimum cycle mean in a digraph. Discrete Mathematics, 23(3):309–311, 1978.
50. R. M. Karp and R. E. Miller. Parallel program schemata. Journal of Computer and System Sciences, 3(2):147–195, May 1969.
51. K. Kennedy and J. R. Allen. Optimizing Compilers for Modern Architectures: A Dependence-Based Approach. Morgan Kaufmann, San Francisco, CA, USA, 2002.
52. M. Kudlur and S. A. Mahlke. Orchestrating the execution of stream programs on multicore platforms. In PLDI, pages 114–124, 2008.
53. P. Le Guernic, J.-P. Talpin, and J.-C. Le Lann. Polychrony for system design. Journal for Circuits, Systems and Computers, 12:261–304, 2002.
54. C. Y. Lee. Representation of switching circuits by binary-decision programs. Bell Systems Technical Journal, 38:985–999, 1959.
55. E. A. Lee and D. G. Messerschmitt. Static scheduling of synchronous data flow programs for digital signal processing. IEEE Transactions on Computers, C-36(1):24–35, 1987.
56. E. A. Lee and D. G. Messerschmitt. Synchronous data flow. Proceedings of the IEEE, 75(9):1235–1245, 1987.
57. C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica, 6(1):5–35, 1991.
58. B. Lickly, I. Liu, S. Kim, H. D. Patel, S. A. Edwards, and E. A. Lee. Predictable programming

on a precision timed architecture. In Proceedings of Compilers, Architectures, and Synthesis ofEmbedded Systems (CASES), Atlanta, Georgia, USA, October 19–24, 2008.

Page 96: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

78 J. Boucaron et al.

59. N. Liveris, C. Lin, J. Wang, H. Zhou, and P. Banerjee. Retiming for synchronous data flowgraphs. In ASP-DAC ’07: Proceedings of the 2007 Asia and South Pacific Design AutomationConference, pages 480–485, Washington, DC, USA, 2007. IEEE Computer Society.

60. O. Marchetti and A. Munier-Kordon. Minimizing place capacities of weighted event graphs forenforcing liveness. Discrete Event Dynamic Systems, 18(1):91–109, 2008.

61. O. Marchetti and A. Munier-Kordon. A sufficient condition for the liveness of weighted eventgraphs. European Journal of Operational Research, 197(2):532–540, 2009.

62. K. L. McMillan. Symbolic Model Checking. Kluwer, Dordrecht, 1993.63. T. O’Neil and E.-M. Sha. Retiming synchronous data-flow graphs to reduce execution time.

IEEE Transactions on Signal Processing, 49(10):2397–2407, 2001.64. H. D. Patel and S. K. Shukla. SystemC Kernel Extensions For Heterogenous System Modeling:

A Framework for Multi-MoC Modeling and Simulation. Kluwer, Norwell, MA, USA, 2004.65. C. A. Petri. Kommunikation mit Automaten. PhD thesis, Institut fur instrumentelle Mathematik,

Bonn, Germany, 1962.66. D. Potop-Butucaru, S. A. Edwards, and G. Berry. Compiling Esterel. Springer, Berlin, 2007.67. J. P. Queille and J. Sifakis. Specification and verification of concurrent systems in CESAR. In

International Symposium on Programming, 1982.68. C. Ramchandani. Analysis of Asynchronous Concurrent Systems by Timed Petri Nets. PhD the-

sis, Dept. of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA,USA, 1973.

69. M. D. Riedel. Cyclic Combinational Circuits. PhD thesis, Dept. of Electrical Engineering,California Institute of Technology, Pasadena, CA, USA, November 2003.

70. K. Schneider. The synchronous programming language Quartz. Internal Report 375,Department of Computer Science, University of Kaiserslautern, Kaiserslautern, Germany,2009.

71. J. Sermulins. Cache aware optimization of stream programs. In Proceedings of LCTES, 2005.72. C. Soviani, O. Tardieu, and S. A. Edwards. High-level optimization by combining retiming

and shannon decomposition. In In Proceedings of the International Workshop on Logic andSynthesis (IWLS), Lake Arrowhead, CA, June, 2005.

73. S. Suhaib, D. Mathaikutty, D. Berner, and S. K. Shukla. Validating families of latencyinsensitive protocols. IEEE Transactions on Computers, 55(11):1391–1401, 2006.

74. S. Suhaib, D. Mathaikutty, S. K. Shukla, D. Berner, and J.-P. Talpin. A functional program-ming framework for latency insensitive protocol validation. Electronic Notes in TheoreticalComputer Science, 146(2):169–188, 2006.

75. C. Svensson. Synchronous latency insensitive design. In Proc. of 10th IEEE International Sym-posium on Asynchronous Circuits and Systems, 2004.

76. W. Thies, M. Karczmarek, M. Gordon, D. Maze, J. Lin, A. Meli, A. Lamb, C. Leger, andS. Amarasinghe. Streamit: a language for streaming applications. In Proceedings of NewEngland Programming Languages and Systems Symposium (NEPLS), 2002.

Page 97: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

Chapter 3
Synoptic: A Domain-Specific Modeling Language for Space On-board Application Software

A. Cortier, L. Besnard, J.P. Bodeveix, J. Buisson, F. Dagnat, M. Filali, G. Garcia, J. Ouy, M. Pantel, A. Rugina, M. Strecker, and J.P. Talpin

3.1 Introduction

3.1.1 Context: Embedded Flight Software

This section describes the context of the SPaCIFY project and the various constraints that have been handled in the design and implementation of the Synoptic language toolset.

A. Cortier (✉), J.P. Bodeveix, M. Filali, M. Pantel, and M. Strecker
IRIT-ACADIE, Universite de Toulouse, site Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse Cedex 9, France
e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]

J.P. Talpin, J. Ouy, and L. Besnard
IRISA-ESPRESSO, Campus de Beaulieu, 35042 Rennes Cedex, France
e-mail: [email protected]; [email protected]

J. Buisson
VALORIA, Ecoles de St-Cyr Coetquidan, Universite Europeenne de Bretagne, 56381 Guer Cedex, France
e-mail: [email protected]

F. Dagnat
Institut Telecom – Telecom Bretagne, Universite Europeenne de Bretagne, Technopole Brest Iroise, CS83818, 29238 Brest Cedex 3, France
e-mail: [email protected]

G. Garcia
Thales Alenia Space, 100 Boulevard Midi, 06150 Cannes, France
e-mail: [email protected]

A. Rugina
EADS Astrium, 31 rue des Cosmonautes, Z.I. du Palays, 31402 Toulouse Cedex 4, France
e-mail: [email protected]



Satellite and Flight Software

A satellite is an unmanned spacecraft. The system architecture is usually specialized according to the satellite mission. There are two main subsystems: the payload is the application equipment, such as specific scientific instrumentation; the platform consists of the mechanical structure, the sensors and actuators used by the payload, and the devices for communication with ground stations. The SPaCIFY¹ project focuses on the flight software embedded in the satellite to manage its platform, also called on-board software. The flight software is real-time software that provides services common to whatever mission-specific payload the spacecraft is assigned. Typical services include reaching and following the desired attitude and orbit, managing thermal regulation systems and power sources, monitoring system status, managing the on-board network (MIL-STD-1553, OBDH, SpaceWire), and communicating with ground stations.

Even after the satellite has been launched, the flight software must be adapted. Satellites are subject to high-energy particles that may damage hardware components. Such damage cannot be repaired, except by installing software workarounds. Bug fixes and software updates must be propagated to flying satellites. In addition, mission extensions may require functional enhancements.

As flight software is critical to the success of the mission, space industries and agencies have worked on engineering processes that help increase reliability. For instance, the European Space Agency has published standards on software engineering (ECSS-E-40) and on product assurance (ECSS-Q-ST-80). These standards do not prescribe a specific process. Rather, they formalize documents, list the requirements of the process, and assign responsibilities to the partners involved. Regarding updates, the standards stipulate, for instance, that the type, scope and criticality must be documented, that updates must be validated, and so on. Industries are free to come up with their own conforming processes. The SPaCIFY project defines such a process and supporting tools based on Model-Driven Engineering (MDE), synchronous languages and the Globally Asynchronous Locally Synchronous (GALS) paradigm.

Kinds of Models Used

Two main kinds of models are commonly used to design the platform software: on the one hand, the description of the platform itself, its hardware and software architecture, CPU, memory, storage, communication buses, sensors, actuators, hardware communication facilities, operating system, tasks, and software communication facilities; on the other hand, the command and control algorithms involved in the management of the platform. All these models have common features. First, the notion of mode is central to represent the various states or configurations of both hardware and software (for example: init, reset, on, low power, failure, safe). The modes and their management are usually expressed using finite automata. Second, functional blocks, data exchange buses and signals are used to represent both the hardware and software architectures and the command and control software.

1 Project SPaCIFY is funded by the French National Research Agency (ANR-06-TLOG 27). Visit http://spacify.gforge.enseeiht.fr for more information.

However, the current models only provide a partial account of the constraints that the final system must satisfy. The designers usually encode these hidden constraints using the constructs available in the modeling languages, even if this was not the initial purpose of the construct used. This is usually the case for hard real-time constraints. The designers rely on explicit management of the control-flow in dataflow models in order to manage the concurrency between activities. But the real timing constraints are not given explicitly in the model: they are handled by the designers who sequence the activities, and there is no real specification of the intended result. This requires a very difficult verification phase that can only occur on the final target. This is the main current difficulty: model constructs are used for purposes other than their intended ones, without any formal, model-level traceability to the initial constraints.

Models in the Development Process

Currently, the main industrial actors rely on the Matlab toolboxes Simulink and Stateflow from The MathWorks [4] to express the command and control algorithms for the various sub-systems of the platform. These models are built by command and control engineers taking into account several kinds of constraints:

• The hardware, which is usually known very early (in particular if it handles timing and synchronisation constraints related to sensors and actuators, which must be activated and will produce results in very precise timing patterns)

• The system mode management, which impacts the execution of the various sub-systems

• The specification of the intended sub-system

The Simulink and Stateflow toolboxes allow a very wide range of modeling methods, from continuous partial differential equations to graphical functional specifications. Each industrial actor has a well-defined process which defines the restricted subset of the modeling language to be used at each phase of the development cycle. In the ITEA GeneAuto project [33], a subset of these toolboxes was defined that fits the needs of space application modeling, from early design to automated target code generation. The industrial partners of SPaCIFY also took part in GeneAuto. This subset aimed at providing a solid semantic background to ease the understanding and formal verification of models, and it was chosen as an entry point for the design of the Synoptic language. One key point is the use of control-flow signals and events (called function call events in Simulink) to manage explicitly the sequencing of blocks in dataflow models. This point is significantly different, on the semantics side, from classical dataflow modeling languages such as SCADE [16] or RT-Builder [3], which do not provide a specific modeling construct for this purpose but allow it to be encoded through empty data signals added between each pair of blocks along the intended sequencing path. The key point is that the intended hard real-time constraints are not explicit. It is thus mandatory to handle the control-flow construct exactly with its usual semantics, not with an approximate encoding which only works most of the time but is not proven to work in all cases.

Several studies have been conducted by the main industrial actors regarding the modeling of hardware and software architectures. HOOD [29] and CCM [20] have been used for many real projects; AADL [6, 30], SysML [18] and UML/MARTE [28] have been evaluated on already existing projects that could be modeled, with the results compared to the real systems. Once again, these languages provide a very wide range of modeling constructs that must be restricted or organized in order to be manageable. In the European ASSERT project [5], two tracks were experimented with in relation to these languages, one mainly synchronous, based on the LUSTRE [22] and SIGNAL [21] languages, the other mainly asynchronous, based on the Ravenscar Ada profile [13]. The industrial partners of SPaCIFY were also part of ASSERT. The results of these experiments were thus used as entry points for the design of the Synoptic language.

In the current software development process, command and control models are formal specifications for the software development. These specifications have been validated by command and control engineers using model simulators. The hardware architecture is currently defined in a semi-formal manner through structured documents. In the near future, models in AADL, or in a subset of SysML/UML/MARTE similar to AADL, will be used to give a formal specification of the architecture.

Then, software engineers either develop the software and verify its conformance to the specification, or use automated code generators; the software is then split into parts that are mapped to threads of the RTOS. These are then scheduled according to the real-time constraints. The know-how of engineers lies in finding the best splitting, mapping and scheduling in order to minimize the resources used.

One of the purposes of introducing Model Driven Engineering is to partly automate these manual transformations and the verification that their result satisfies the specification. The Synoptic language should thus allow command and control models expressed in Simulink/Stateflow and hardware architecture models expressed in AADL to be imported. The associated toolset should assist in the re-organisation of the model, and allow the mapping and scheduling to be expressed.

3.1.2 Domain Specific Requirements

Real-Time Constraints

Flight software is mainly responsible for implementing command and control laws. It is therefore a set of tasks performed at fixed time periods by a real-time kernel. Tasks perform the various management activities of the system. They have severe time constraints inherited from the command and control design by automation. The number of tasks varies depending on the approach adopted by the industry: Thales Alenia Space favors a large number of tasks (about 20–30 active tasks out of a total of 40–50), while EADS Astrium tends to aggregate computations, including their activation frequencies, in order to reduce the number of tasks.

Although the constraints are strong, the time scale is relatively slow. The activation periods of tasks in the flight software typically vary between 100 ms and 100 s.

Limited Hardware Resource Capacity

The computing resources embedded in satellites are limited. Thus, at the time of writing, the memory allocated to the flight software is generally under 10 MB. The binary image is also stored in a 1–2 MB EEPROM.

The computing power ranges from 20 MIPS for the ERC32 processor up to nearly 100 MIPS for the LEON 2 and 3. Moreover, satellites operate in a hostile physical environment. Shocks, temperature variations, radiation and high-energy particle beams damage and destroy electronic components, leading to rapid aging of the embedded equipment.

Remote Observability and Commandability

Once a satellite has been launched, the only interventions currently possible are remote. Several scenarios have been identified that require maintenance interventions, such as:

• The satellite, and therefore its embedded software, must survive equipment aging and damage as long as possible. Installing software workarounds is an economical means of continuing the mission as long as the remaining equipment permits. Identifying problems is critical for the ground engineering teams that design such workarounds.

• Satellites are often still usable at the end of their initial mission. Updating the software is an economical way of achieving new mission objectives while recycling satellites. Therefore, ground operators must adapt the software to changing operational conditions.

• When a bug is detected in the software, its correction should be installed remotely without compromising the satellite.

These scenarios emphasize the need for remote monitoring and management of the flight software. Monitoring probes the characteristics and state of satellite components (both hardware and software) and of the physical environment. Management provides actions both at the application level, e.g., manoeuvres, and at the middleware level, e.g., software updates.

The communication link between satellites and ground passes through ground stations. Since the SPaCIFY project has excluded from its study the possibility of routing communications through other spacecraft, being in range of a ground station is a prerequisite for connectivity. The characteristics (speed, visibility window) of the communication link depend on the orbit and on the mission. In general, we consider that it is rarely possible to download or upload a full image of the flight software.

A High Level of Quality and Safety

Given that the only possible actions once the satellite has been launched are performed remotely, it is essential to ensure that the satellite is always able to respond to maintenance requests. In addition to verifying the software, defensive mechanisms are usually added, for example to restart computers in case of trouble. Indeed, given the radiation to which satellites are exposed during flight, high-energy particles can flip bits in memory. Several images of the software are also stored on board to be sure to have at least one version which is sufficiently functional to enable communications with the ground.

A Particular Software Industry

In addition to software business constraints, the design of flight software has its own peculiarities.

For certain classes of satellites, including scientific satellites, each satellite is a single unit, in fact a prototype. Indeed, every scientific mission has a specific objective, needing specific measuring equipment.

Other categories, including communications satellites, are built in series from a single design. However, due to the unpredictability of last-minute client requests, and of manufacture, launch and space environment hazards, each satellite soon becomes unique again. This inevitably leads to a specific flight software for each satellite. Even when multiple identical copies are initially installed on several satellites, the copies diverge to take into account the specific situation of each satellite, especially its failures and specific maintenance.

Furthermore, it is difficult to perform realistic testing of flight software. At best, simulations can give an indication of the software correctness.

GALS Systems

Satellite management software is usually divided into parts that are largely autonomous from one another, even though they share the same platform and resources. Inside each of these parts, the subparts are most of the time strongly linked and must cooperate in a synchronous manner. The parts usually exchange information, but with less time-critical constraints, and thus rely on asynchronous exchanges. Such systems are usually called Globally Asynchronous, Locally Synchronous. Synoptic must provide specific modelling elements to handle this structure.


3.1.3 Synoptic: A Domain Specific Design Environment

Motivations

In collaboration with major European manufacturers, the SPaCIFY project (The SPaCIFY Consortium 2008) aims at bringing advances in MDE to the satellite flight software industry. It focuses on the software development and maintenance phases of the satellite lifecycle. The project advocates a top-down approach built on a domain-specific modeling language named Synoptic. In line with previous approaches to real-time modeling such as Statecharts and Simulink, Synoptic features hierarchical decomposition in synchronous block diagrams and state machines. SPaCIFY also emphasizes verification, validation and code generation.

One key point to ease the long-term availability and adaptability of the tools is to rely on open-source technologies.

Overview of the SPaCIFY Development Process

To take into account the domain-specific requirements and constraints presented in the previous section, the SPaCIFY project proposes to introduce models inside the design and maintenance processes of on-board applications. The space domain may thus benefit from technologies developed in the context of Model Driven Engineering for processing, handling and analyzing models. It therefore takes a shift from current document-based practices to a tool-supported process built around formal models.

Figure 3.1 summarizes the major steps of the proposed development and design process. The central models are described in the Synoptic domain-specific and formal language defined by the project. The process promotes vertical and horizontal transformations. The vertical transformations sketch the refinements and enrichments of Synoptic models. The horizontal transformations consist of translations to formal models (Altarica [1], SME [9] models) that are equipped with specific verification, code generation or simulation tools.

Fig. 3.1 Sketch of the SPaCIFY development cycle for on-board software

Fig. 3.2 Code generation and execution platform (middleware)

The initial Synoptic model, resulting from the automated transformation of Simulink/Stateflow models previously designed by control engineers, is enriched by non-functional properties derived from textual requirements.

Using the hardware specifications, the Synoptic model is further enriched by the hardware model of the system. Based on hardware and software constraints, the dynamic architecture can be derived.

Finally, the resulting model includes the different views of the system (software, hardware, dynamic architectures) and the mappings between them. This last model is used to generate the source code and to configure the middleware of the embedded application. At each step, analyses and verifications can be performed. The model transformations used in the refinement process formalize the expertise acquired by the industry.

Figure 3.2 focuses on the code generation phase. In this phase, the model of the application is first translated into an SME model. Code generation itself is performed by the Polychrony compiler [21]. The generated code targets a middleware specifically designed for the SPaCIFY project.

Contents of the Chapter

This chapter is organized as follows. Section 3.2 introduces the main features of the Synoptic DSML, with a particular focus on the functional sub-language of Synoptic, called Synoptic core. Synoptic core makes it possible to model synchronous islands, which communicate through asynchronous shared variables (called external variables) managed by the middleware. A simplified case study is used to present the various concepts introduced. Section 3.3 describes the formal semantics and the polychronous model of computation of the Synoptic core, which is based on the synchronous dataflow language SIGNAL [7]. Section 3.4 focuses on middleware aspects: the architecture of the execution platform, the middleware kernel, external variables, and the reconfiguration service of the middleware are discussed. Section 3.5 summarizes the main contributions of the project and gives an outlook on future investigations.


3.2 Synoptic: A Domain Specific Modeling Language for Aerospace Systems

3.2.1 Synoptic Overview

Synoptic is a Domain Specific Modeling Language (DSML) which aims to support all aspects of embedded flight-software design. As such, Synoptic consists of heterogeneous modeling and programming principles defined in collaboration with the industrial partners and end users of the SPaCIFY project.

Used as the central modeling language of the SPaCIFY model-driven engineering process, Synoptic can describe different layers of abstraction: at the highest level, the software architecture models the functional decomposition of the flight software. This is mapped to a dynamic architecture which defines the thread structure of the software. It consists of a set of threads, where each thread is characterized by properties such as its frequency, priority and activation pattern (periodic, sporadic).

At the lowest level, the hardware architecture defines devices (processors, sensors, actuators, busses) and their properties.

Synoptic can describe three types of mappings between these layers:

• Mappings which define a correspondence between the software and the dynamic architecture, by specifying which blocks are executed by which threads

• Mappings which describe the correspondences between the dynamic and hardware architectures, by specifying which threads are executed by which processor

• Mappings which describe a correspondence between the software and hardware architectures, by specifying, for instance, which data is transported by which bus

Figure 3.3 depicts the principles discussed above.

Fig. 3.3 Global view: layers and architecture mappings


Our aim is to synthesize as many of these mappings as possible, for example by appealing to internal or external schedulers. However, to allow for human intervention, it is possible to give a fine-grained mapping, thus overriding or bypassing machine-generated schedules. In any case, the consistency of the resulting dynamic architecture is verified by the SPaCIFY tool suite, based on the properties of the software and dynamic models.

At each step of the development process, it is also useful to model different abstraction levels of the system under design within the same layer (functional, dynamic or hardware architecture). Synoptic offers this capability by providing an incremental design framework and refinement features. To summarize, Synoptic deals with dataflow diagrams, mode automata, blocks, components, dynamic and hardware architecture, mapping and timing.

In this section, we focus on the functional part of the Synoptic language, which models the software architecture. This sub-language is well adapted to modeling synchronous islands and to specifying the interaction points between these islands and the middleware platform using the concept of external variables. Synchronous islands and middleware form a Globally Asynchronous and Locally Synchronous (GALS) system.

Software Architecture

The development of the Synoptic software architecture language has been tightly coordinated with the definition of the GeneAuto language [33]. Synoptic essentially uses two types of modules, called blocks in Synoptic, which can be mutually nested: dataflow diagrams and mode automata. Nesting favors a hierarchical design and allows the description to be viewed at different levels of detail.

By embedding blocks in the states of state machines, one can elegantly model operational modes: each state represents a mode, and transitions correspond to mode changes. In each mode, the system may be composed of other sub-blocks or have different connection patterns among components. Apart from structural and behavioral aspects, the Synoptic software architecture language can define temporal properties of blocks. For instance, a block can be parameterized with a frequency and a worst-case execution time, which are taken into account in the mapping onto the dynamic architecture.

Synoptic has a formal semantics, defined in terms of the synchronous language SIGNAL [9]. On the one hand, this allows for neat integration of verification environments for ascertaining properties of the system under development. On the other hand, a formal semantics makes it possible to encode the meta-model in a proof assistant. In this sense, Synoptic will profit from the formal correctness proof and subsequent certification of a code generator that is under way in the GeneAuto project. Synoptic is equipped with an assertion language to state desired properties of the model under development. We are mainly interested in properties that express, for example, coherence of the modes ("if component X is in mode m1, then component Y is in mode m2" or ". . . can eventually move into mode m2"). Specific transformations extract these properties and pass them to the verification tools.

3.2.2 Software Architecture Models

One typical case study under investigation in the project is a generic satellite positioning software, Fig. 3.4. It is responsible for automatically moving the satellite into a correct position before starting interaction with the ground.

The specification of the central flight software (OBSW) consists of the composition of heterogeneous diagrams. Each diagram represents one specific aspect of the software's role. The OBSW module is controlled by a remote control telecommand TC 001 and receives attitude and position data (POS Data and ATT Data) from sensors through the middleware.

The Attitude and Orbit Control System (AOCS) is the main sub-module of the central flight software, Fig. 3.5b. The AOCS automaton controls its operational modes for specific conditions. It is either in nominal mode or in SAFE mode, Fig. 3.5c. The safe mode is characterized by a coarse pointing of the satellite equipment (solar panels, antennas). It computes the command to be sent to the thrusters to maintain a correct position and attitude.

3.2.2.1 Block Diagram: Dataflow, Automaton and External Blocks

A Synoptic model is a graphical block diagram. A Synoptic block diagram is a hierarchy of nested blocks. As such, one can navigate through different levels of abstraction and see increasing levels of model detail. A block is a functional unit that communicates with other blocks through its interface.

Fig. 3.4 Satellite positioning software


Fig. 3.5 A typical case study: satellite positioning software. (a) OBSW: On-Board Software; (b) AOCS; (c) SAFE: AOCS safe mode; (d) Legend


Block Interface

A block interface consists of communication ports. A port can be either an event port or a data port and is characterized by its direction (in or out). Data ports can be typed using simple data types. However, typing data ports is optional in the early stages of design, to give the designer the flexibility to describe abstract models.

As shown in Fig. 3.6, which represents a part of the Synoptic meta-model, it is possible to encapsulate a set of ports within a single entity called a group of ports (PortGroup in Fig. 3.6). A group of ports is used to group a set of ports which are conceptually linked. As specified by the synchronized property of the PortGroupDecl meta-model class, it is possible to specify a synchronization constraint on the ports constituting the group.

A block interface can be implemented by different types of blocks: dataflows, automata or externals, Fig. 3.7.

Dataflow

A dataflow block models a dataflow diagram. It embodies sub-blocks and specifies data or event flows between them. A flow is a directed connection between ports of sub-blocks. A sub-block is either an instance of a dataflow, automaton or external block, or an instance of a block interface (see the ModelInstance class in Fig. 3.7). In the latter case, the model is abstract: the designer must implement all the block interfaces used to type sub-blocks in order to obtain a concrete model. As such, the Synoptic environment design promotes a top-down approach.

Fig. 3.6 Synoptic metamodel: interface

Fig. 3.7 Synoptic metamodel: model diagram

Automaton

An automaton block models state machines. A Synoptic automaton consists of states and transitions. As for dataflows, a state can be represented by an instance of a dataflow, automaton or block interface. Dataflow diagrams and automata can be hierarchically nested; this allows for a compact modelling of operational modes.

Apart from the ongoing actions of automata (the block embedded in a state), it is possible to specify entry and exit actions in an imperative style. Transitions between states are guarded and equipped with actions. There are two types of transitions in Synoptic: strong and weak transitions. Strong transitions are used to compute the current state of the automaton (before entering the state), whereas weak transitions are used to compute the state for the next instant.

More precisely, the guards of weak transitions are evaluated to estimate the state for the next instant, and the guards of strong transitions whose source is the state estimated at the previous instant are evaluated to determine the current state.
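To make this two-phase evaluation concrete, here is a minimal Python sketch of one reaction of such an automaton. The representation of transitions as (source, guard, target) triples and the function name automaton_step are illustrative assumptions, not Synoptic constructs.

def automaton_step(estimated_state, strong, weak, inputs):
    # Phase 1: strong transitions leaving the state estimated at the
    # previous instant determine the current state.
    current = estimated_state
    for source, guard, target in strong:
        if source == estimated_state and guard(inputs):
            current = target
            break
    # Phase 2: weak transitions leaving the current state estimate the
    # state for the next instant.
    next_estimate = current
    for source, guard, target in weak:
        if source == current and guard(inputs):
            next_estimate = target
            break
    return current, next_estimate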

External Block

A common practice in the software industry is the re-use of external source code: designers must be able to introduce blocks representing existing source code into the model. Moreover, for modeling flight software functional units, it is necessary to use primitive operations such as addition, cosine and division.


Synoptic provides the concept of external block to model primitive blocks and external source code, Fig. 3.7. Primitive blocks are encapsulated in a Synoptic library. The Synoptic tool suite recognizes these primitive operations during code generation.

For embedding external source code, the procedure is quite different. The designer must build his own external block by defining its interface, giving the source code path which will be used at code generation time, and specifying the pre/post conditions and the Worst Case Execution Time of the functional unit.

Example: A Dataflow Block Diagram

Figure 3.8 shows the graphical decomposition of the nominal mode of the Attitude and Orbit Control System (AOCS/NM) depicted in Fig. 3.5b. The header of the block tells us that NM is an instance of the NM dtf dataflow block. In nominal mode, the AOCS can either use its sun pointing sensors (SUP) to define its position or, during an eclipse, use its geocentric pointing sensors (GAP). To model this activity, the nominal mode of the AOCS flight software encapsulates two sub-blocks: one dataflow block (TEST ECLIPSE) which detects an eclipse occurrence and one automaton block (SUP OR GAP) which ensures the change of sub-mode (SUP or GAP) depending on the outcome of the eclipse test block.

3.2.2.2 Synchronous Islands

A functional model consists of synchronous islands. A synchronous island is a synchronous functional unit in interaction with the middleware. The middleware supports the execution of code generated from the Synoptic model: it provides a synchronous abstraction of the asynchronous real world.

Fig. 3.8 AOCS nominal (NM) sub-mode – satellite positioning software


To ensure proper integration of the generated code with the middleware platform, it is essential to locate all interactions between the middleware and the application.

Our approach is to concentrate all these interactions in a single concept in Synoptic: the concept of external variables. External variables make it possible to represent in Synoptic models the following domain-specific and technical concepts: telecommands, telemetry, constants database systems, shared variables and persistent variables (surviving reboot). The specification of external variables is used for code generation and middleware configuration.

Therefore a Synoptic synchronous island is a block in which all input and output signals are connected to external variables. The OBSW block of our case study is one such synchronous island, Fig. 3.5a.

Contractual Approach

An external variable is a variable managed by the middleware. It can be read or written by multiple clients (components of the application). Contracts are used to specify how to control this access.

The configuration of an external variable is done using four levels of contracts:

1. A classical syntactic contract that includes its name and type
2. A remote access contract (through telemetry)
3. A persistence contract that specifies if the variable is persistent on reboot or not
4. A synchronization contract that describes the possible concurrent interactions when the variable is shared

Each component using such a variable must also provide a usage contract that defines the protocol it will use to access the variable.

These contracts allow the middleware to manage the variable. The synchronization contract indicates whether locks are needed and when they should be used. The persistence contract indicates how the variable should be stored (RAM or permanent memory). The monitoring functions of the variable that the middleware must implement are defined by the remote access contract and the usage contracts.
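As an illustration only (the concrete Synoptic syntax for contracts is not shown here), the four contract levels and a client usage contract could be summarized as plain records; all field names and values below are hypothetical.

from dataclasses import dataclass

@dataclass
class ExternalVariableContract:
    # 1. syntactic contract
    name: str
    type_: str
    # 2. remote access contract (telemetry)
    remote_access: bool
    # 3. persistence contract (survives reboot or not)
    persistent: bool
    # 4. synchronization contract (concurrent access policy)
    synchronization: str  # e.g. "exclusive" or "read-shared"

@dataclass
class UsageContract:
    client: str
    protocol: str  # how the client will access the variable

# Example: a shared attitude estimate, persistent across reboots,
# visible through telemetry, with two readers.
att = ExternalVariableContract("ATT_Data", "vector3", True, True, "read-shared")
readers = [UsageContract("OBSW.AOCS", "periodic read"),
           UsageContract("middleware.TM", "sampled read")]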

3.2.2.3 Control and Temporal Properties

As described before, blocks are functional units of compilation and execution. Block execution is controlled using two specific control ports: the trigger and reset ports. These ports are inherited from the Simulink/Stateflow approach [4]. They appear as black triangles in the upper left of the block, Figs. 3.5 and 3.8.

Trigger Port

The trigger port is an event port. The occurrence of a trigger event starts the execution of the block, whose specification may then operate at its own pace until the next trigger is signaled.


Sub-blocks have their own trigger control ports and can thus operate at a different rate. If no event signal is connected to its control port, a sub-block inherits the control signals of its parent block.

Explicit Clock and Adaptors

The trigger port stands for the clock of the block. In our case study, the trigger control port of blocks is never connected: the clock of a block is explicitly defined using a period or frequency property. Adding a 20-Hz frequency property, for instance, is semantically equivalent to connecting the trigger control port to a 20-Hz clock.

By convention, all input signals of a block are synchronized with the trigger of the parent block. This constraint may be too strict: it is thus possible to redefine the frequency of ports by adding an explicit frequency property. Note however that the clock of a port must be a down-sampling of the parent trigger.

Explicitly specifying a real-time constraint on ports and blocks can lead to difficulties when specifying a flow between two ports with different frequencies. The Synoptic tools are able to detect such clock errors. The designer should use predefined clock adaptors to resample the signal, Fig. 3.8.
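A clock adaptor can be pictured as a simple resampler between two periodic clocks. The sketch below (plain Python, hypothetical helper names) down-samples a 20-Hz flow so that it can feed a 10-Hz port, which is the kind of adaptation the predefined Synoptic adaptors perform; the real adaptors are library blocks, not user code.

def downsample(flow, factor):
    # Keep one occurrence out of `factor`, e.g. factor=2 turns a 20-Hz
    # flow into a 10-Hz flow aligned on the slower trigger.
    return flow[::factor]

def oversample(flow, factor):
    # Repeat each occurrence `factor` times, e.g. to present a 10-Hz
    # value to a 20-Hz consumer (a zero-order hold).
    return [v for v in flow for _ in range(factor)]

samples_20hz = [1.0, 1.1, 1.2, 1.3]
print(downsample(samples_20hz, 2))   # [1.0, 1.2]
print(oversample([1.0, 1.2], 2))     # [1.0, 1.0, 1.2, 1.2]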

Besides frequency properties, it is possible to specify the phase and the Worst Case Execution Time (WCET) of a block.

Reset Port

The reset port is a boolean data port whose clock is a down-sampling of the trigger signal. The reset signal forces the block to reset its state and variables to their initial values.

Contrary to Statecharts [23], when a transition is fired, the destination state of the transition is not reset. This means that, by default, one enters the history state. To reset the destination state and all the variables it contains, the solution is to specify this reset in the action of the transition.

3.2.2.4 Properties Specification

The Synoptic language is equipped with an assertion language to state desired properties of the model under development. This language makes it possible to express invariants, pre- and post-conditions on blocks.

Invariants are used to express coherence of modes. For instance, a typical assertion that we want to prove on the case study model (Fig. 3.5) is that if the MCS block is in SAFE mode, then the AOCS block is also in SAFE mode. Such an assertion is described in Synoptic as follows:

(OBSW.MCS.state = SAFE) ⇒ (OBSW.AOCS.state = SAFE)

Page 114: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

96 A. Cortier et al.

The Synoptic environment provides a tool to extract Synoptic models and their associated properties and pass them to the Altarica model-checker [1].

Pre- and post-conditions can be either statically or dynamically tested. In the latter case, monitoring functions are implemented in the final software and raise events when properties are not satisfied.

Monitoring functions are distinguished from assertion properties by raising an event to their environment when the property is not satisfied:

Assertion:            pre : a < 10 ;
Monitoring function:  pre : a < 10 raise e! ;

Here, a stands for an input data port, e for an event port, and e! for an event emission on port e.
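The difference can be illustrated by a small sketch of what a monitoring function could expand to (hypothetical Python, not generated code): instead of being checked statically, the pre-condition is evaluated at run time and an event is emitted on port e when it is violated.

def monitored_pre(a, emit_event):
    # Runtime counterpart of "pre : a < 10 raise e!": the property is
    # checked on every activation and an event is raised on port e
    # (modelled here as a callback) when it does not hold.
    if not (a < 10):
        emit_event("e")
    return a < 10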

3.2.3 Synoptic Components: Genericity and Modularity

Synoptic supports modular system development and parametric components. The latter are particularly useful for incremental design and gradual refinement, using predefined development patterns (see Sect. 3.2.4).

As mentioned in Sect. 3.2.2.1, there are two main categories of components: dataflow blocks and automata blocks. They exist on the type level (interfaces) and on the element level (instances). Components can be parametric, in the sense that a block can take other elements as arguments. Parametric components are similar to functors in ML-style programming languages, but parameters are not limited to blocks. They can, among others:

• Be integers, thus allowing for variable-sized arrays whose length is only fixed at compile time
• Be types, thus allowing for type genericity as in Java 5 [19]
• Be entire components, thus allowing for component assembly from more elementary blocks

Syntactically, the parameters of a block are specified after a requires clause, the publicly visible elements made available by the block follow in a provides clause, and there may be a private part, as in Fig. 3.9.

This example reveals parts of the textual syntax of Synoptic, which in this case is more perspicuous than a graphical syntax. The component C requires (an instance of) a block, called ADD, that is characterized by a component type having a certain number of in and out ports. C provides a dataflow that is composed of two blocks, one of which is defined in the private part of C.

Parameterized blocks can be instantiated with elements that meet their typing constraints, in a broad sense. In the case of simple parameter types (like integers), the typing constraints are evident. When instantiating a parameterized component having a parameter type P with a component C, the component C has to provide all the elements stipulated by the requires clause of P (but may provide more).


component C
  requires
    block type ADD
      features
        idp1 : in data port integer;
        idp2 : in data port integer;
        odp1 : out data port integer;
    end ADD;
  provides
    dataflow ADD_MULT.op
      blocks
        add  : block type ADD;
        mult : block type MULT;
      flows
        s1 : data idp1 -> add.idp1;
        -- other flows ...
    end ADD_MULT.op;
  private
    block type MULT
      features
        -- ...
    end MULT;
end C;

Fig. 3.9 A parameterized component

Conversely, C may require some (but not necessarily all) of the elements provided by P. Parameter instantiation is thus essentially contravariant. Some clauses of a component are not checked during instantiation, such as private.

Parameter types can be equipped with properties (see Sect. 3.2.2.4), such as temporal properties. Instantiating these types gives rise to proof obligations, depending on the underlying logic. In some cases, an exact match between the formal parameter component and the actual argument component is not required. In this case, a mechanism comparable to a type cast comes into play: take, for example, a component C20 triggered at 20 Hz (as in Fig. 3.8) that is to be used by a parametric component C10 operating at 10 Hz. Component C20 can be used as an argument of C10, and a default frequency splitter will be synthesized that adapts the frequency of C20 to C10.

3.2.4 Incremental Design and Refinement Features

The Synoptic environment promotes a top-down approach including incremental design and refinement features.

In the first steps of design, Synoptic makes it possible to describe an abstract model. For instance, the designer can describe abstract interfaces where data ports are not typed, and connect instances of these interfaces in a dataflow diagram.


Fig. 3.10 Synoptic metamodel: refinement

In a second step, the designer can refine the model by typing the block interfaces. The block interfaces can then be implemented with dataflow or automaton blocks. These features of the Synoptic environment are mainly "edition refinement", which allows the designer to model a system in an incremental way. In doing so, the designer loses the link between the different levels of refinement: when the model is concretized, the initial abstract model is not preserved and therefore cannot be accessed.

Synoptic offers a way to specify refinement and to keep a formal link between an abstract model and its concretization. As depicted in the metamodel, Synoptic provides two types of refinements: flow refinement and block refinement, Fig. 3.10.

A flow refinement consists of refining a flow with a dataflow block. To be properly defined, the flow must be declared as abstract, and the dataflow block must have a single input port and a single output port, correctly typed.

A block refinement consists of refining a block instance with a dataflow block. The main use of this type of refinement is to concretize an interface block with a dataflow.

Example

Figure 3.11 illustrates a flow refinement. The first model is an abstract view of the global architecture of the case study.

The abstract model consists of three dataflow blocks. The sensors send their data (position and attitude) to the central flight software. The OBSW processes them and sends a new command to the actuators. In this abstract block diagram, the flows between the sensors and the OBSW are obviously not synchronous signals: data are exchanged through a 1553 bus.

To account for this, the flows are specified as abstract. The same applies to the flows between the OBSW and the actuators. Moreover, a real-time requirement is added to the flows: delay ≤ 20 µs.


Fig. 3.11 Flow refinement: introduce external variables

This requirement represents the maximum age of the data: data consumed by the OBSW must be less than 20 µs old. The second model in Fig. 3.11 shows how this abstract model is refined. Each flow is refined with a dataflow block composed of external variables.

The concrete model makes it explicit that data sent by the sensors are not synchronous flows but pass through the middleware. According to the real-time requirement of the abstract model, the middleware is in charge of distributing data to the flight software while respecting the data aging limit. In addition, this refinement step exposes the OBSW synchronous island.
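For illustration, the aging constraint that the middleware must enforce can be expressed as a simple check. This is only a sketch, not SPaCIFY middleware code, and the microsecond timestamps are an assumption.

def within_age_limit(sample_time_us, delivery_time_us, max_age_us=20):
    # The flow requirement "delay <= 20 µs" of the abstract model: a
    # sample may only be delivered to the synchronous island while it
    # is at most 20 µs old.
    return (delivery_time_us - sample_time_us) <= max_age_us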

3.3 Semantics and Model of Computation of Synoptic Models

The model of computation on which Synoptic relies is that of the Eclipse-based synchronous modeling environment SME [9]. SME is used to transform Synoptic diagrams and automatically generate executable C code. The core of SME is based on the synchronous dataflow language SIGNAL [21]. This section describes how Synoptic programs are interpreted and compiled into this core language.

3.3.1 An Introduction to SIGNAL

In SIGNAL, a process P consists of the composition of simultaneous equations x = f(y, z) over signals x, y, z. A delay equation x = y pre v defines x every time y is present. Initially, x is defined by the value v, and then it is defined by the previous value of y. A sampling equation x = y when z defines x by y when z is true. Finally, a merge equation x = y default z defines x by y when y is present and by z otherwise. An equation x = y f z can use a boolean or arithmetic operator f to define all of the nth values of the signal x by the result of the application of f to the nth values of the signals y and z. The synchronous composition of processes P || Q consists of the simultaneous solution of the equations in P and in Q. It is commutative and associative. The process P/x restricts the signal x to the lexical scope of P.

P, Q ::= x = y f z | P/x | P || Q   (process)

In SIGNAL, the presence of a value along a signal x is an expression noted ^x. It is true when x is present; otherwise, it is false. Specific processes and operators are defined in SIGNAL to manipulate clocks explicitly. We only use the simplest one, x sync y, which synchronizes all occurrences of the signals x and y.
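The behaviour of these equations can be illustrated on finite traces. The following Python sketch is only an illustration of the semantics, not SIGNAL code: a signal is encoded as a list of occurrences where None marks absence, and pre, when and default mirror the delay, sampling and merge equations above.

ABSENT = None

def pre(y, v):
    # x = y pre v: at each instant where y is present, x carries the
    # previous value of y (v for the first occurrence); x is absent
    # exactly when y is absent (x and y are synchronous).
    out, prev = [], v
    for yv in y:
        if yv is ABSENT:
            out.append(ABSENT)
        else:
            out.append(prev)
            prev = yv
    return out

def when(y, z):
    # x = y when z: x is present (with the value of y) only at instants
    # where both y and z are present and z is true.
    return [yv if (yv is not ABSENT and zv is True) else ABSENT
            for yv, zv in zip(y, z)]

def default(y, z):
    # x = y default z: x takes the value of y when y is present,
    # otherwise the value of z (absent if both are absent).
    return [yv if yv is not ABSENT else zv for yv, zv in zip(y, z)]

# Example traces (None marks an absent occurrence):
y = [1, 2, None, 3]
z = [True, None, False, True]
print(pre(y, 0))      # [0, 1, None, 2]
print(when(y, z))     # [1, None, None, 3]
print(default(y, z))  # [1, 2, False, 3]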

3.3.2 Interpretation of Blocks

Blocks are the main structuring elements of Synoptic. A block, written block x A, defines a functional unit of compilation and of execution that can be called from many contexts and with different modes in the system under design. A block x encapsulates a functionality A that may consist of sub-blocks, automata and dataflows. A block x is implicitly associated with two signals x.trigger and x.reset. The signal x.trigger starts the execution of A. The specification A may then operate at its own pace until the next x.trigger is signaled. The signal x.reset is delivered to x at some x.trigger and forces A to reset its state and variables to initial values.

(blocks)   A, B ::= block x A  |  dataflow x A  |  automaton x A  |  A || B

The execution of a block is driven by the trigger t of its parent block. The block resynchronizes with that trigger when itself or one of its sub-blocks makes an explicit reference to time (e.g., a skip for an action or a delayed transition S ⇝ T for an automaton). Otherwise, the elapse of time is sensed from outside the block, whose operations (e.g., on c_i) are perceived as belonging to the same period as within [t_i, t_{i+1}[. The interpretation implements this feature by encoding actions and automata using static single assignment. As a result, and from within a block, every non-time-consuming sequence of actions A; B or transitions A → B defines the value of all its variables once and defines intermediate ones in the flow of its execution.
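As a hedged sketch of how these constructs nest (the names gnc, acq and mode are hypothetical, and A_acq and A_mode stand for their respective bodies): a block grouping a dataflow and an automaton, both driven by gnc.trigger and reinitialized on gnc.reset following the convention described above.

  block gnc ( dataflow acq A_acq || automaton mode A_mode )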

3.3.3 Interpretation of Dataflow

Dataflows inter-connect blocks with data and events (e.g., trigger and reset signals). A flow can simply define a connection from an event x to an event y, written event x → y; combine data y and z by a simple operation f to form the flow x, written data y f z → x; or feed a signal y back to x, written data y pre v → x. In a feedback loop, the signal x is initially defined by x_0 = v. Then, at each


occurrence n > 0 of the signal y, it takes its previous value: x_n = y_{n-1}. The execution of a dataflow is controlled by its parent clock. A dataflow simultaneously executes each connection it is composed of every time it is triggered by its parent block.

(dataflow)   A, B ::= data y pre v → x  |  data y f z → x  |  event x → y  |  A || B
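For instance, a small hypothetical dataflow in this syntax (the signal names gyro, prev, delta, tick and go are ours): the measure gyro is delayed by one occurrence to form prev, the two are combined by subtraction to form delta, and the event tick is forwarded to go.

  data gyro pre 0 → prev  ||  data gyro - prev → delta  ||  event tick → go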

Dataflows are structurally similar to SIGNAL programs and are equally combined using synchronous composition. The interpretation ⟦A⟧_{r,t} = ⟨⟨P⟩⟩ of a dataflow (Fig. 3.12) is parameterized by the reset and trigger signals of the parent block and returns a process P (the input term A and the output term P are marked by ⟦A⟧ and ⟨⟨P⟩⟩ for convenience). A delayed flow data y pre v → x initially defines x by the value v. It is reset to that value every time the reset signal r occurs. Otherwise, it takes the previous value of y in time.

In Fig. 3.12, we write ∏_{i≤n} P_i for a finite product of processes P_1 || … || P_n. Similarly, ⋁_{i≤n} e_i is a finite merge of signals e_1 default … e_n, and ⋀_{i≤n} e_i a finite sampling e_1 when … e_n. A functional flow data y f z → x defines x by applying f to the values of y and z. An event flow event y → x connects y to define x. Particular cases are the operator ?(y), which converts an event y to a boolean data, and the operator ^(y), which converts the boolean data y to an event. We write in(A) and out(A) for the input and output signals of a dataflow A.

By default, the convention of Synoptic is to synchronize the input signals of a dataflow to the parent trigger. It is, however, possible to define alternative policies. One is to down-sample the input signals at the pace of the trigger. Another is to adapt or resample them at that trigger. Alternatively, adaptors could better be installed around the input and output signals of a block in order to resample them with respect to the specified frequency of the block.

⟦dataflow f A⟧_{r,t} = ⟨⟨ ⟦A⟧_{r,t} || (∏_{x ∈ in(A)} x sync t) ⟩⟩
⟦data y pre v → x⟧_{r,t} = ⟨⟨ x = (v when r) default (y pre v) || (x sync y) ⟩⟩
⟦data y f z → x⟧_{r,t} = ⟨⟨ x = y f z ⟩⟩
⟦event y → x⟧_{r,t} = ⟨⟨ x = when y ⟩⟩
⟦A || B⟧_{r,t} = ⟨⟨ ⟦A⟧_{r,t} || ⟦B⟧_{r,t} ⟩⟩

Fig. 3.12 Interpretation of dataflow connections
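To make these rules concrete, instantiating the delayed-flow case on the hypothetical connection data gyro pre 0 → prev used earlier (names ours) yields the SIGNAL equations below; the enclosing dataflow rule additionally synchronizes the input gyro with the parent trigger t, and the reset r restores the initial value 0.

  prev = (0 when r) default (gyro pre 0)  ||  prev sync gyro  ||  gyro sync t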


3.3.4 Interpretation of Actions

Actions are sequences of operations on variables that are performed during the execution of automata.

The assignment x = y f z defines the value of the variable x at the next instant as the result of the application of the function f to y and z. The skip action lets time elapse until the next trigger and assigns to unchanged variables the value they had at the previous instant. The conditional if x then A else B executes A if the current value of x is true and executes B otherwise. A sequence A; B executes A and then B.

(action)   A, B ::= skip  |  x = y f z  |  if x then A else B  |  A; B

The execution of an action A starts at an occurrence of its parent trigger and shall end before the next occurrence of that event. During the execution of an action, one may also wait and synchronize with this event by issuing a skip: a skip has no behavior but to signal the end of an instant: all the newly computed values of signals are flushed to memory and execution is resumed upon the next parent trigger. The action x! sends the signal x to its environment. Execution may continue within the same symbolic instant unless a second emission is performed: one shall issue a skip before that. An operation x = y f z takes the current values of y and z and defines the new value of x by applying f. A conditional if x then A else B executes A or B depending on the current value of x.
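A small hypothetical action in this syntax (the variable n and the signal done are ours): it increments n, emits done, then issues a skip so that the final doubling of n happens during the next parent trigger rather than within the same instant.

  n = n + 1; done!; skip; n = n * 2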

As a result, at most one new value of a variable x should be defined within an instant delimited by a start and an end or a skip. Therefore, the interpretation of an action consists of its decomposition into static single assignment form. To this end, we use an environment E to associate each variable with its definition, an expression, and a guard that locates it (in time).

An action holds an internal state s that stores an integer n denoting the current portion of the action being executed. State 0 represents the start of the program and each n > 0 labels a skip that materializes a synchronized sequence of actions.

The interpretation ⟦A⟧_{s,m,g,E} = ⟨⟨P⟩⟩_{n,h,F} of an action A (Fig. 3.13) takes as parameters the state variable s, the state m of the current section, the guard g that leads to it, and the environment E. It returns a process P, the state n and guard h of its continuation, and an updated environment F. The set of variables defined in E is written V(E). We write use_E^g(x) for the expression that returns the definition of the variable x at the guard g, and def^g(E) for storing the final values of all variables x defined in E at the guard g.

use_E^g(x) = if x ∈ V(E) then ⟨⟨E(x)⟩⟩ else ⟨⟨(x pre 0) when g⟩⟩,
def^g(E) = ∏_{x ∈ V(E)} ( x = use_E^g(x) ).


⟦do A⟧_{r,t} = ⟨⟨ (P || s sync t || r = (s = 0)) / s ⟩⟩   where ⟨⟨P⟩⟩_{n,h,F} = ⟦A; end⟧_{s, 0, ((s pre 0) = 0), ∅}

⟦end⟧_{s,n,g,E} = ⟨⟨ s = 0 when g || def^g(E) ⟩⟩_{0, 0, ∅}
⟦skip⟧_{s,n,g,E} = ⟨⟨ s = n+1 when g || def^g(E) ⟩⟩_{n+1, ((s pre 0) = n+1), ∅}
⟦x!⟧_{s,n,g,E} = ⟨⟨ x = 1 when g ⟩⟩_{n, g, E}
⟦x = y f z⟧_{s,n,g,E} = ⟨⟨ x = e ⟩⟩_{n, g, E ⊎ {x ↦ e}}   where e = ⟨⟨ f(use_E^g(y), use_E^g(z)) when g ⟩⟩
⟦A; B⟧_{s,n,g,E} = ⟨⟨ P || Q ⟩⟩_{n_B, g_B, E_B}   where ⟨⟨P⟩⟩_{n_A, g_A, E_A} = ⟦A⟧_{s,n,g,E} and ⟨⟨Q⟩⟩_{n_B, g_B, E_B} = ⟦B⟧_{s, n_A, g_A, E_A}
⟦if x then A else B⟧_{s,n,g,E} = ⟨⟨ P || Q ⟩⟩_{n_B, (g_A default g_B), (E_A ⊎ E_B)}
   where ⟨⟨P⟩⟩_{n_A, g_A, E_A} = ⟦A⟧_{s, n, (g when use_E^g(x)), E} and ⟨⟨Q⟩⟩_{n_B, g_B, E_B} = ⟦B⟧_{s, n_A, (g when not use_E^g(x)), E}

Fig. 3.13 Interpretation of timed sequential actions

Execution is started with s = 0 upon receipt of a trigger t. It is also resumed from a skip at s = n with a trigger t. Hence the signal t is synchronized to the state s of the action. The signal r is used to inform the parent block (an automaton) that the execution of the action has finished (it is back to its initial state 0). An end resets s to 0, stores all variables x defined in E with an equation x = use_E^g(x) and finally stops (its returned guard is 0). A skip advances s to the next label n + 1 when it receives control upon the guard g and flushes the variables defined so far. It returns a new guard (s pre 0) = n + 1 to resume the actions past it. An action x! emits x when its guard g is true. A sequence A; B evaluates A to the process P and passes its state n_A, guard g_A and environment E_A to B. It returns P || Q with the state, guard and environment of B. Similarly, a conditional evaluates A with the guard g when x to P and B with g when not x to Q. It returns P || Q but with the guard g_A default g_B. All variables x, defined in both E_A and E_B, are merged in the resulting environment.

In Fig. 3.13, we write E ⊎ F to merge the definitions in the environments E and F. For all variables x ∈ V(E) ∪ V(F) in the domains of E and F,

(E ⊎ F)(x) =  E(x),                if x ∈ V(E) \ V(F);
              F(x),                if x ∈ V(F) \ V(E);
              E(x) default F(x),   if x ∈ V(E) ∩ V(F).

Note that an action cannot be reset from the parent clock because it is not synchronized to it. A sequence of emissions x!; x! yields only one event along the signal x because they occur at the same (logical) time, as opposed to x!; skip; x!, which sends the second one during the next trigger.


0   x1 = 0;
1   if y
2   then x2 = x1 + 1
3   else {
4     x3 = x1 - 1;
5     x = x3;
6     skip;
7     x4 = x - 1
8   };
9   x = φ(x2, x4)
10  end

x1 = 0 when (s = 0)
x2 = x1 + 1 when (s = 0) when y
x3 = x1 - 1 when (s = 0) when not y
x4 = (x pre 0) - 1 when (s = 1)
x  =         x2 when (s = 0) when y
     default x3 when (s = 0) when not y
     default x4 when (s = 1)
s' =         0 when (s = 1)
     default 0 when (s = 0) when y
     default 1 when (s = 0) when not y
s  = s' pre 0

Fig. 3.14 Tracing the interpretation of a timed sequential program

Example

Consider the simple sequential program of the introduction. Its static single assignment form is depicted in Fig. 3.14.

x = 0; if y then {x = x + 1} else {x = x - 1; skip; x = x - 1}; end

As in GCC, it uses a φ-node, line 9, to merge the possible values x2 and x4 of x flowing from each branch of the if.

Our interpretation implements this φ by a default equation that merges these two values with the third, x3, that is stored into x just before the skip at line 6. The interpretation of all assignment instructions in the program follows the same pattern.²

At line 2, for instance, the value of x is x1, which flows from line 1. Its assignment to the new definition of x, namely x2, is conditioned by the guard y on the path from line 1 to 2. It is also conditioned by the current state of the program, which needs to be 0 from line 1 to 6 and at 9 (state 1 flows from line 7 to 9, overlapping on the φ-node). Hence the equation x2 = x1 + 1 when (s = 0) when y.

3.3.5 Interpretation of Automata

Automata schedule the execution of operations and blocks by performing timely guarded transitions. An automaton receives control from its trigger and reset signals

² In the actual translation, the temporary names x1, …, x4 are substituted by the expressions that define them. We kept them in the figure for clarity.


x.trigger and x.reset as specified by its parent block. When an automaton is first triggered, or when it is reset, it starts execution from its initial state, specified as initial state S. In any state S: do A, it performs the action A. From this state, it may perform an immediate transition to a new state T, written S →^{on x} T, if the value of the current variable x is true. It may also perform a delayed transition to T, written S ⇝^{on x} T, which waits for the next trigger before resuming execution (in state T). If no transition condition applies, it waits for the next trigger and resumes execution in state S. States and transitions are composed as A || B. The timed execution of an automaton combines the behaviors of an action and of a dataflow. The execution of a delayed transition or of a stutter is controlled by an occurrence of the parent trigger signal (as for a dataflow). The execution of an immediate transition is performed without waiting for a trigger or a reset (as for an action).

(automaton)   A, B ::= state S: do A  |  S →^{on x} T  |  S ⇝^{on x} T  |  A || B

An automaton describes a hierarchic structure consisting of actions that are executed upon entry into a state by immediate and delayed transitions. An immediate transition occurs during the period of time allocated to a trigger. Hence, it does not synchronize to it. Conversely, a delayed transition occurs upon synchronization with the next occurrence of the parent trigger event. As a result, an automaton is partitioned into regions. Each region corresponds to the amount of calculation that can be performed within the period of a trigger, starting from a given initial state.
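As a hypothetical illustration (the state names Nominal and Safe and the guards fault and recovered are ours): an automaton that immediately switches to a safe mode when fault holds, and returns to the nominal mode only at a later trigger, once recovered holds.

  state Nominal: do A_nom  ||  Nominal →^{on fault} Safe
  || state Safe: do A_safe  ||  Safe ⇝^{on recovered} Nominal

With the region notion introduced below, the immediate transition places Safe in the same region as Nominal, and the delayed transition back is precisely the kind of transition that remains allowed between two states of the same region.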

Notations

We write →_A and ⇝_A for the immediate and delayed transition relations of an automaton A. We write pred_{→A}(S) = {T | (T, x, S) ∈ →_A} and succ_{→A}(S) = {T | (S, x, T) ∈ →_A} (resp. pred_{⇝A}(S) and succ_{⇝A}(S)) for the predecessor and successor states of the immediate (resp. delayed) transitions →_A (resp. ⇝_A) from a state S in an automaton A. Finally, we write S̄ for the region of a state S. It is defined by an equivalence relation:

∀S, T ∈ S(A),  ((S, x, T) ∈ →_A)  ⇒  S̄ = T̄.

For any state S of A, written S ∈ S(A), it is required that the restriction of →_A to the region S̄ is acyclic. Notice that, still, a delayed transition may take place between two states of the same region.

Interpretation

An automaton A is interpreted by a process ⟦automaton x A⟧_{r,t} parameterized by its parent trigger and reset signals. The interpretation of A defines a local state s. It is synchronized to the parent trigger t. It is set to 0, the initial state, upon receipt of


a reset signal r and, otherwise, takes the previous value of s', which denotes the next state. The interpretation of all states is performed concurrently.

We give all states S_i of an automaton A a unique integer label i = ⌈S_i⌉ and designate by ⌈A⌉ its number of states. S_0 is the initial state and, for each state of index i, we call A_i its action and x_ij the guard of an immediate or delayed transition from S_i to S_j.

⟦automaton x A⟧_{r,t} = ⟨⟨ ( t sync s || s = (0 when r) default (s' pre 0) || (∏_{S_i ∈ S(A)} ⟦S_i⟧^s) ) / s s' ⟩⟩.

The interpretation ⟦S_i⟧^s of all states 0 ≤ i < ⌈A⌉ of an automaton (Fig. 3.15) is implemented by a series of mutually recursive equations that define the meaning of each state S_i depending on the result obtained for its predecessors S_j in the same region. Since a region is by definition acyclic, this system of equations therefore has a unique solution.

The interpretation of state S_i starts with that of its action A_i. An action A_i defines a local state s_i synchronized to the parent state s = i of the automaton. The automaton stutters with s' = s if the evaluation of the action is not finished: it is in a local state s_i ≠ 0.

Interpreting the action A_i requires the definition of a guard g_i and of an environment E_i. The guard g_i defines when A_i starts. It requires the local state to be 0 or the state S_i to receive control from a predecessor S_j in the same region (with the guard x_ji).

The environment E_i is constructed by merging the environments F_j returned by its immediate predecessors S_j. Once these parameters are defined, the interpretation of A_i returns a process P_i together with an exit guard h_i and an environment F_i holding the value of all variables it defines.

∀i < ⌈A⌉,  ⟦S_i⟧^s = ( P_i || Q_i || s_i sync when (s = i) || s' = s'_i ) / s_i   where

⟨⟨P_i⟩⟩_{n, h_i, F_i} = ⟦A_i⟧_{s_i, 0, g_i, E_i}
Q_i = ∏_{(S_i, x_ij, S_j) ∈ ⇝_A} ( def^{h_i when (use_{F_i}(x_ij))}(F_i) )
E_i = ⊎_{S_j ∈ pred_{→A}(S_i)} F_j
g_i = 1 when (s_i pre 0 = 0) default ( ⋁_{(S_j, x_ji, S_i) ∈ →_A} (use_{E_i}(x_ji)) )
g_ij = h_i when (use_{F_i}(x_ij)),   ∀(S_i, x_ij, S_j) ∈ ⇝_A
s'_i = (s when s_i ≠ 0) default ( ⋁_{(S_i, x_ij, S_j) ∈ ⇝_A} (j when g_ij) )

Fig. 3.15 Recursive interpretation of a mode automaton


Upon evaluation of A_i, delayed transitions from S_i are checked. This is done by the definition of a process Q_i which first checks whether the guard x_ij of a delayed transition from S_i evaluates to true with F_i. If so, the variables defined in F_i are stored with def^{h_i}(F_i).

All delayed transitions from S_i to S_j are guarded by h_i (one must have finished evaluating i before moving to j) and a condition g_ij, defined by the value of the guard x_ij. The default condition is to stay in the current state s while s_i ≠ 0 (i.e., until mode i is terminated).

Hence, the next state from i is defined by the equation s' = s'_i. The next-state equations of all states are composed to form the product ∏_{i < ⌈A⌉} (s' = s'_i), which is merged as s' = ⋁_{i < ⌈A⌉} s'_i.

Example

Reconsider our previous sequential program, which we now represent by a mode automaton (Fig. 3.16, left). Our interpretation of automata merges, loads and stores the variables defined in states S_1, …, S_5. Thanks to the decomposition into regions, this provides an interpretation with synchronous equations whose meaning is equivalent to that of the sequential program.

One can actually check that the translation of the automaton (Fig. 3.16, right) is identical to that of the original program modulo substitution of the local signals x_1, …, x_4 by their definitions.

3.3.6 The Polychronous Model of Computation

The polychronous model of computation [21] defines the algebra in which the denotational semantics of SIGNAL is expressed. In this algebra, symbolic tags t or u denote periods in time during which execution takes place. Time is defined by a

S0: do x = 0
  || S0 →^{on y} S1
  || S0 →^{on not y} S2
S1: do x = x + 1  ||  S1 → S5
S2: do x = x - 1  ||  S2 ⇝ S4
S4: do x = x - 1  ||  S4 → S5
S5: end

x =          0 - 1 when (s = 0) when not y
     default 0 + 1 when (s = 0) when y
     default (x pre 0) - 1 when (s = 1)
|| s' =      1 when (s = 0) when not y
     default 0 when (s = 0) when y
     default 0 when (s = 1)
|| s = s' pre 0

Fig. 3.16 SSA interpretation of a mode automaton into dataflow equations


partial order relation ≤ on tags: t ≤ u stipulates that t occurs before u or at the same time. A chain is a totally ordered set of tags. It corresponds to the clock of a signal: it samples its values over a series of totally related tags. The domains for events, signals, behaviors and processes are defined as follows:

– An event is a pair consisting of a tag t ∈ T and a value v ∈ V.
– A signal s ∈ S is a function from a chain of tags C ⊂ T to a set of values v ∈ V.
– A behavior b ∈ B is a function from a set of names X ⊂ V to signals.
– A process p ∈ P is a set of behaviors that have the same domain.

Notations

We write T(s) for the chain of tags of a signal s, and min s and max s for its minimal and maximal tags. We write V(b) for the domain of a behavior b (a set of signal names). The restriction of a behavior b to X is noted b|_X (i.e., V(b|_X) = X). Its complementary b/X satisfies b = b|_X ⊎ b/X (i.e., V(b/X) = V(b) \ X). We overload T and V to designate the tags of a behavior b and the set of signal names of a process p. Since tags along a signal s form a chain C = T(s), we write C_i for the ith instant in chain C and have that C_i ≤ C_j iff i ≤ j for all i, j ≥ 0.
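To fix intuitions, here is a small hypothetical behavior (the names x and y, the tags t1 < t2 < t3 and the values are ours): x is present at all three tags while y is present only at t1 and t3; the notations above then read as follows.

  b(x) = {t1 ↦ 1, t2 ↦ 2, t3 ↦ 3}      T(b(x)) = {t1, t2, t3}
  b(y) = {t1 ↦ 0, t3 ↦ 1}              T(b(y)) = {t1, t3}
  V(b) = {x, y},   b|_{x} has domain {x},   b/{x} = b|_{y}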

Reaction

A reaction r is a behavior with (at most) one time tag t. We write T(r) for the tag of a non-empty reaction r. A reaction r is concatenable to a behavior b, written b ·? r, iff r and b have the same domain V(b) = V(r) and if max(T(b)) < min(T(r)). The concatenation of r to b is written b · r and defined by (b · r)(x) = b(x) ⊎ r(x) for all x ∈ V(b).
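Continuing the hypothetical behavior b sketched above, a reaction r with the single tag t4 > t3 and the same domain {x, y} is concatenable to b, and b · r simply appends the new events to each signal.

  r(x) = {t4 ↦ 4},   r(y) = {t4 ↦ 2},   T(r) = {t4}
  max(T(b)) = t3 < t4 = min(T(r))   so   b ·? r   and   (b · r)(x) = b(x) ⊎ r(x)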

Synchronous Structure

A behavior c is a stretching of a behavior b, written b ≤ c, iff V(b) = V(c) and there exists a bijection f on tags s.t.

∀t, u,  t ≤ f(t) ∧ (t < u ⇔ f(t) < f(u));
∀x ∈ V(b),  T(c(x)) = f(T(b(x))) ∧ ∀t ∈ T(b(x)), b(x)(t) = c(x)(f(t)).

b and c are clock-equivalent, written b ∼ c, iff there exists a behavior d s.t. d ≤ b and d ≤ c. The synchronous composition p || q of two processes p and q is defined by combining behaviors b ∈ p and c ∈ q that are identical on the interface between p and q: I = V(p) ∩ V(q).

p || q = { b ∪ c | (b, c) ∈ p × q ∧ b|_I = c|_I ∧ I = V(p) ∩ V(q) }.
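A hypothetical instance of this definition (the names are ours): if V(p) = {x, y} and V(q) = {y, z}, the interface is I = {y}, and a behavior b of p combines with a behavior c of q only when both carry exactly the same signal on y.

  V(p) = {x, y},   V(q) = {y, z},   I = {y}
  b ∈ p,  c ∈ q,  b|_I = c|_I   ⇒   b ∪ c ∈ p || q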


Alternatively, the synchronous structure of a process can be interpreted as an equivalence relation that refines the causal tag structure of individual signals (for all b ∈ B, for all x ∈ V(b), T(b(x)) is a chain of T). If we make the assumption that the only phenomenological structure is this causal structure, then the synchronous structure of a behavior b can be seen as a way to slice time across individual signals in a way that preserves causality: for all t, u ∈ T(b), (1) t < u iff t ≁_b u, and (2) t ≤ u iff t ∼ u, or t < v and v ∼_b u.

Denotation of Data-Flow Diagrams

A data-flow dataflow f A is defined by the meaning of A controlled by the parent timing C. A delayed flow data y pre v → x assigns v to the signal x upon a reset t ∈ C^r and, otherwise, the previous value of y, taken at time pred_{C_x}(t). An operation data y f z → x applies the operation f to the values u and v of y and z and assigns the result to the signal x. An event flow event y → x triggers x every time y occurs. Composition A || B merges all timely compatible traces of A and B under the same context.

⟦dataflow f A⟧_C = { b ∈ ⟦A⟧_C | ∀x ∈ in(A), C^t = T(b(x)) }

⟦data y pre v → x⟧_C = { b ∈ B|_{x,y} |  C_x = T(b(x)) = T(b(y)) ⊇ C^v,
                                          C^v = C^r ∪ {min(T(b(x)))},
                                          ∀t ∈ C^v, b(x)(t) = v,
                                          ∀t ∈ C_x \ C^v, b(x)(t) = b(y)(pred_{C_x}(t)) }

⟦data y f z → x⟧_C = { b ∈ B|_{x,y,z} | ∀t ∈ T(b(x)) = T(b(y)) = T(b(z)),  b(x)(t) = f(b(y)(t), b(z)(t)) }

⟦event y → x⟧_C = { b ∈ B|_{x,y} | T(b(x)) = T(b(y)) }

⟦A || B⟧_C = ⟦A⟧_C || ⟦B⟧_C

Denotation of Actions

Given its state b, the execution c^k = ⟦A⟧_b of an action A returns a new state c and a status k whose value is 1 if a skip occurred and 0 otherwise. We write b↓_x = b(x)(min T(b)) and b↑_x = b(x)(max T(b)) for the first and last values of x in b.

A sequence first evaluates A to c^k and then evaluates B with store b · c to d^l. If k is true, then a skip has occurred in A, meaning that c and d belong to different instants. In this case, the concatenation of c and d is returned.

If k is false, then the execution that ended in A continues in B at the same instant. Hence, variables defined in the last reaction of c must be merged with variables defined in the first reaction of d. To this end, we write (b ⋈ c)(x) = b(x) ⊎ c(x) for all x ∈ V(b) if t = max(T(b)) = min(T(c)).


⟦do A⟧_b = b · c,   where c^k = ⟦A⟧_b
⟦skip⟧_b = ∅^1
⟦x!⟧_b = {(x, t, 1)}^0
⟦x = y f z⟧_b = {(x, t, f(b↑_y, b↑_z))}^0
⟦if x then A else B⟧_b = if b↑_x then ⟦A⟧_b else ⟦B⟧_b
⟦A; B⟧_b = if k then (c · d)^l else (c ⋈ d)^l,   where c^k = ⟦A⟧_b and d^l = ⟦B⟧_{b·c}

Denotation of Automata

An automaton x receives control from its trigger at the clock C^t and is reset at the clock C^r. Its meaning is hence parameterized by C = (C^t, C^r). The meaning of its specification ⟦A⟧^s_b is parameterized by the finite trace b, which represents the store, and by the variable s, which represents the current state of the automaton. At the ith step of execution, given that it is in state j ((b_i)↑_s = j), the automaton produces a finite behavior b_{i+1} ∈ ⟦S_j⟧^s_{b_i}. This behavior must match the timeline of the trigger: T(b_{i+1}) ⊂ C^t. It must return to the initial state 0 if a reset occurs: ∀t ∈ C^r, b_i(s)(t) = 0.

⟦automaton x A⟧_C = { ( ·_{i≥0} b_i ) / s  |  b_0 = {(x, t_0, 0) | x ∈ V(A) ∪ {s}},
                       ∀i ≥ 0, b_{i+1} ∈ ⟦S_j⟧^s_{b_i},  T(b_{i+1}) ⊂ C^t,
                       j = if min(T(b_{i+1})) ∈ C^r then 0 else (b_i)↑_s }.

When an automaton is in the state S_i, its action A_i is evaluated to c ∈ ⟦A_i⟧_b given the store b. Then, immediate or delayed transitions departing from S_i are evaluated to return the final state d of the automaton.

⟦S_i⟧^s_b =  ⟦S_j⟧^s_c,          if (S_i, x_ij, S_j) ∈ →_A ∧ c↑_{x_ij};
             c ⊎ {(s, t, j)},    if (S_i, x_ij, S_j) ∈ ⇝_A ∧ c↑_{x_ij};      where c/s = ⟦do A_i⟧_b and t = max T(c);
             c ⊎ {(s, t, i)},    otherwise.

3.4 Middleware Aspect

In order to support the execution of code generated from Synoptic models at runtime, a middleware is embedded in the satellite. This SPaCIFY middleware not only implements common middleware services such as communication or naming, but also offers domain-specific services for satellite flight control software. There are then


two difficulties that must be addressed. First, one has to take into account the scarcity of resources in a satellite. Second, satellite platforms vary widely³ from one project to another.

To take into account the scarcity of resources in a satellite, this middleware is tailored to the domain and adapted to each specific project. This notion of generative middleware is inherited from the ASSERT project, which studied proof-based engineering of real-time applications, but is here specialised to the area of satellite software. ASSERT defines a so-called virtual machine, which denotes an RTOS kernel along with a middleware and provides a language-neutral semantics of the Ravenscar profile [14]. It relies on PolyORB-HI [24], a high-integrity version of PolyORB refining the broker design pattern [15], which fosters the reuse of large chunks of code when implementing multiple middleware personalities. Satisfying the restrictions of the Ravenscar profile, PolyORB-HI can suitably be used as a runtime support for applications built with AADL code generators. In our context, the precise adaptation of the middleware to the application's needs is obtained through its generation based on the requirements expressed in the Synoptic models (mainly through the interaction contracts attached to external variables).

Some of the services of the middleware cannot be reused and must be reimplemented for each satellite. For example, the AOCS⁴ implements control laws that are specific to the expected flight and to the mission. By providing a framework, the middleware helps capitalize on the experience of domain experts, giving a general architecture for the AOCS. The models of services can belong to either the middleware or the application. In the latter case, we use the same process as for the rest of the application, including the Synoptic language. Furthermore, as services have a well-defined interface (i.e., an API), other services or applications using the AOCS are not coupled to any particular implementation.

This section starts by presenting the architecture of the SPaCIFY middleware. Then, the middleware kernel and the communication between applications generated from Synoptic models and their hosting middleware are explained. Lastly, the reconfiguration service, which was one of the main focuses of the project, is described.

3.4.1 Architecture of the Execution Platform

Figure 3.17 depicts the overall architecture of the SPaCIFY middleware. Following previous work on middleware for real-time embedded systems [8, 31], the SPaCIFY middleware has a microkernel-like, service-based architecture. This flexibility makes it possible to embed only the services that are required, depending on the satellite.

³ This diversity is due to the high specialisation of platforms for a specific mission and the broad range of missions.
⁴ Attitude and Orbit Control System.


[Figure 3.17 shows the layered architecture: an RTOS kernel at the bottom; an abstraction of the RTOS; the middleware kernel managing the external variables that connect the synchronous islands to the asynchronous environment (bus & devices, on-board/ground link, persistent memory); and, on top, the services (naming, time, persistency, reconfiguration, redundancy, PUS TM/TC, PUS event, AOCS, ...).]

Fig. 3.17 Architecture of the middleware

While previous work has focused on techniques to reduce the time during which the reconfigured system is suspended, along with an analysis to bound that time, our work aims at:

• Using application models to specify reconfigurations
• Proposing a contract approach to the specification of the relations between applications and the middleware
• Providing on-demand generation of the needed subset of the middleware for a given satellite

The RTOS kernel offers the usual basic services such as task management and synchronisation primitives. The core middleware is built on top of this RTOS and provides core services of composition and communication. They are not intended to be used by application-level software. Rather, they provide means to structure the middleware itself into smaller composable entities, the services. Upon these abstractions are built general-purpose services and domain-specific services. These services are to be used (indirectly) by the application software (here the various synchronous islands). They can be organized in two categories as follows:

• The first layer is composed of general-purpose services that may be found in usual middleware. Among them, the naming service implements a namespace that would be suitable for distributed systems. The persistency service provides persistent storage for keeping data across system reboots. The redundancy service helps increase system reliability thanks to transparent replication management. The reconfiguration service, further described in a dedicated subsection below (Sect. 3.4.3), adds flexibility to the system as it allows the software to be modified at runtime. The task and event service provides real-time dispatching of processors according to the underlying RTOS scheduler. It provides skeletons for


various kinds of tasks, including periodic, sporadic and aperiodic event-triggered tasks, possibly implementing sporadic servers or similar techniques [25, 32].

• The second layer contains domain-specific services that capture the expertise in the area of satellite flight control software. These services are often built following industry standards such as PUS [17]. The TM/TC service implements the well-established telemetry/telecommand link with ground stations or other satellites. The AOCS (attitude and orbit control system) controls actuators in order to ensure the proper positioning of the satellite. As discussed earlier, services may be implemented by synchronous islands.

To support execution, the platform uses hardware resources provided by the satellite. As the hardware platform changes from one satellite to another, it must be abstracted, even for the middleware services. Only the implementation of specific drivers must be done for each hardware architecture. During the software development lifecycle, the appropriate corresponding service implementations are configured and adapted to the provided hardware.

As already exposed, one particularity of the SPaCIFY approach is the use of the synchronous paradigm. To support the use of the middleware services within this approach, we propose to use an abstraction, the external variable, as shown in Sect. 3.2.2.2. Such a variable abstracts the interaction between synchronous islands, or between a synchronous island and the middleware, relaxing the requirements of synchrony. In the model, when the software architect wants to use a middleware service, he provides a contract describing the requirements on the corresponding set of external variables, and the middleware is in charge of meeting these requirements. Clearly, this mediation layer in charge of external variable management is specific to each satellite. Hence the contractual approach drives the generation of the proper management code. This layer is made of backends that capture all the asynchronous concerns, such as accessing a device or any aperiodic task, hence implementing the asynchronous communications of the GALS approach. The middleware is in charge of orchestrating the exchanges between external variables, their managing backends and the services, while ensuring the respect of the quality-of-service constraints (such as temporal ones) specified in their contracts.

3.4.2 The Middleware Kernel and External Variables

As stated above, the middleware is built around an RTOS providing tasks and synchronisation. As the RTOS cannot be fixed due to industrial constraints, the middleware kernel must provide a common abstraction. It therefore embeds its own notion of task, and for each specific RTOS an adaptor must be provided. The implementation of such adaptors has been left for future work. Notice that the use of this common abstraction forbids the use of specific, sophisticated services offered by some RTOSes. The approach here is to adapt the services offered by the middleware to the business needs rather than to rely on low-level, high-performance services.


The task is the unit of execution. Each task can contain Synoptic components according to the specification of the dynamic architecture of the application. Tasks have temporal features (processor provisioning, deadline, activation period) inherited from the Synoptic model. The middleware kernel is in charge of their execution and their monitoring. It is also in charge of provisioning resources for the aperiodic and sporadic tasks.

Communication inside a task results from the compilation of the synchronous specification of the various components it must support. All communication outside a task must go through external variables, limiting interaction to a single abstraction. External variables are decomposed into two sub-abstractions:

• The frontend, identified in the Synoptic model, constitutes the interaction point. It appears as a usual signal in the synchronous model and may be used as input or output. The way it is provided or consumed is abstracted in a contract that specifies the requirements on the signal (such as its timing constraints). An external variable is said to be asynchronous because no clock constraint is introduced between the producer of the signal and its consumer. In the generated code, accesses to such variables are compiled into getter and setter functions implemented by the middleware. The contract must include usage requirements specifying the way the signal is used by the task (either as an input or as an output). It may also embed requirements on the freshness of the value or on events notifying value changes.

• The backend, specified using stereotypes in the contract, configures the middleware behavior for non-synchronous concerns. For example, a persistency contract specifies that the external variable must be saved in persistent memory. Acquisition contracts can be used by a task to specify which data it must get before its execution. Such backends are collected, and global acquisition plans are built and executed by the middleware.

As the middleware supports the reconfiguration of applications at runtime, tasks can be created dynamically. The dynamic modification of task features is also provided by the middleware, which has to ensure that the associated constraints are respected. Note that any modification must be made on the model and validated by the SPaCIFY tools so that only viable constraints are introduced. Each task has a miss handler defined. This defensive feature makes the middleware execute corrective actions and report an error whenever the task does not respect its deadline.

3.4.3 Reconfiguration Service

Among the domain-specific services offered by the middleware, the reconfiguration service is the one that had the most impact on the design of the middleware.

The design of the reconfiguration service embedded in the SPaCIFY middleware is driven by the specific requirements of satellite onboard software. Reconfiguration is typically controlled from ground stations via telemetry and telecommands.


[Figure 3.18 shows the reconfiguration workflow: offline, the developer works on the on-board software model and compiles the OBSW image; at the ground station, the operator combines the satellite state with an abstract reconfiguration plan, compiles it according to the OBSW image, and sends the resulting concrete reconfiguration plan to the satellite over the ground-satellite link.]

Fig. 3.18 Process and responsibilities for reconfigurations

Human operators not only design reconfigurations, they also decide the right time at which reconfigurations should occur, for instance while the mission is idle. Due to the resource shortage in satellites, the reconfiguration service must be economical in its choice of embedded metadata. Last, in families of satellites, software versions tend to diverge as workarounds for hardware damage are installed. Nevertheless, some reconfigurations should still apply to a whole family despite the possible differences between the deployed software versions.

In order to tackle these challenges, the reconfiguration service relies on the structural model (Synoptic models) of the OBSW, enriched with the current state of the satellite, as described in Fig. 3.18. Using the structure of the software makes it possible to abstract away low-level implementation details when performing reconfigurations. While designing reconfigurations, operators can work on models of the software, close to those used at development time, and specify a so-called abstract reconfiguration plan such as: replace the flow between blocks A and B by a flow between A and C. An abstract reconfiguration plan uses high-level elements and operations, which may increase the CPU consumption of reconfigurations compared to low-level patches. Typical operations such as pattern matching of components make the design of reconfigurations easier, but they are compute-intensive. Instead of embedding the implementation of those operations in the satellite middleware, patterns may be matched offline, at the ground station, thanks to the knowledge of the flying software. We therefore define a hierarchy of reconfiguration languages, ranging from high-level constructs presented to the reconfiguration designer to low-level instructions implemented in the satellite runtime. Reconfigurations are compiled into a so-called concrete reconfiguration plan before being sent to the satellite. For instance, the abstract plan upgrade A may be compiled to: stop A and B; unbind A and B; patch the implementation of A; rebind A and B; restart A and B. This compilation process uses previously proposed techniques to reduce the size of the patch sent to the satellite and the time to apply it [34]. Another interest of this compilation scheme is to enable the application of the same abstract reconfiguration plan to a whole family of satellites. While traditional patch-based approaches make it hard to apply a single reconfiguration to a family, each concrete plan may be adapted to the precise state of each family member.


Applying the model-driven approach described previously to reconfigurations of satellite software raises a number of other issues, which are described in [10]. Work remains to be done to make the complete reconfiguration tool chain available, the biggest challenge being the specification of reconfiguration plans. Indeed, considering reconfiguration at the level of the structural model implies including the reconfigurability concern in the metamodel of Synoptic. If reconfigurability is instead designed as an abstract framework (independent of the modeling language), higher modularity is achieved when deciding which elements are reconfigurable. First experiments conducted on the Fractal component model [11] allow us to claim that, by focusing on the points of interest in application models, reconfiguration metadata can be made more efficient. Indeed, separating reconfigurability from Synoptic makes it possible to download metadata on demand or to drop them at runtime, depending on the requested reconfigurations.

Lastly, applying a reconfiguration usually requires that the software reach a quiescent state [27]. Basically, the idea consists in ensuring that the pieces of code to update are neither active nor will get activated during their update. For reactive and periodic components as found in the OBSW, such states where reconfiguration can be applied may never be reached. We have proposed another direction [12]: we consider that active code can be updated consistently. Doing so runs into low-level technical issues such as adjusting instruction pointers, and reshaping and relocating stack frames. Building on previous work on control operators and continuations, we have proposed to deal with these low-level difficulties using the notion of continuation and operators that manipulate continuations. This approach does not make updating easier, but it gives the opportunity to relax the constraints on update timing and allows updates that were not anticipated.

3.5 Conclusion

Outlook

The SPaCIFY ANR exploratory project proposes a development process and associated tools for hard real-time embedded space applications. The main originality of the project is to combine model-driven engineering, formal methods and synchronous paradigms in a single homogeneous design process.

The domain-specific language Synoptic has been defined in collaboration with the industrial end-users of the project, combining functional models derived from Simulink and architectural models derived from AADL. Synoptic provides several views of the system under design: software architecture, hardware architecture, dynamic architecture, and mappings between them. Synoptic is especially well adapted to control and command algorithm design.

The GALS paradigm adopted by the project is also a key point of the promoted approach. The Synoptic language makes it possible to model synchronous islands and to specify how these islands exchange asynchronous information by using the services of a dedicated middleware.


In particular, the SPaCIFY project proposed a contract approach to the specification of the relations between applications and the middleware.

This middleware not only implements common services such as communication or naming, but also offers domain-specific services for satellite flight control software. Reconfiguration is one of these domain-specific services. The SPaCIFY project studied this service with a particular focus on the use of application models to specify reconfigurations.

The Synoptic Environment

The SPaCIFY design process has been equipped with an Eclipse-based modeling workbench. To ensure the long-term availability of the tools, the Synoptic environment relies on open-source technologies: this guarantees the durability and the adaptability of the tools for space projects, which can last more than 15 years. We hope this openness will also facilitate adaptation to the requirements of other industries.

The development of the Eclipse-based modeling workbench started with the definition of the Ecore meta-model [2] of the Synoptic language. The definition of this meta-model relied on the experience gained during the GeneAuto project and is the result of a collaborative and iterative process. As a first step, a concrete syntax relying on the meta-model was defined using academic tools such as TCS (Textual Concrete Syntax) [26]. This textual syntax was used to validate the usability of the language through a pilot case study. These models helped to improve the Synoptic language and to adapt it to industrial know-how. Once the language was stabilized, a graphical editor was designed. A set of structural and typing constraints has been formalized, encoded in OCL (Object Constraint Language), and integrated into the environment.

In parallel with these activities, the semantics of the language was defined. The formal semantics of the Synoptic language relies on the polychronous paradigm. This semantics was used to define the transformation of Synoptic models into SME models following an MDE approach. This transformation makes it possible to use the Polychrony platform for verification, for providing model-transformation hints to the end user (such as the splitting of the software into several synchronous islands), and for simulation code generation.

The Synoptic environment is being used to develop case studies of industrial size.

Future Investigations

The Synoptic environment is based on model transformations; thus, verifying these transformations is a key point. This has been addressed in the GeneAuto project to certify sequential code generation from a Stateflow/Simulink-based language. This work must be extended to take into account features of the execution platform such as timers, preemption-based schedulers, multi-threading, multi-processors, etc. Work is in progress on a subset of the Synoptic language.


The Synoptic environment provides a toolset supporting a development process. Experience acquired during the SPaCIFY project with industrial partners has highlighted two different processes, and thus the need to parameterize the platform by the process. A SPEM-based specification of the development process could be used as input to a generic platform so that it could be configured to match end users' current and future development methods.

The Synoptic environment offers limited support for a refinement-based development process. This support could be extended and coupled with versioning to allow refinement checking between any pair of models or submodels. This means support for defining, and partially automatically generating, gluing invariants between models. Proof obligations could then be generated.

References

1. Altarica Project. http://altarica.labri.u-bordeaux.fr/wiki/.
2. Eclipse Modeling Framework project (EMF). http://www.eclipse.org/modeling/emf/.
3. RT-Builder. Solutions for real-time design, modeling and analysis of complex, multi-processor and multi-bus systems and software. http://www.geensys.com/?Outils/RTBuilder.
4. Simulink. Simulation and model-based design. http://www.mathworks.com/.
5. ASSERT Project. Automated proof-based system and software engineering for real-time systems. http://www.assert-project.net/, 2007.
6. As-2 Embedded Computing Systems Committee SAE. Architecture Analysis & Design Language (AADL). SAE Standards no. AS5506, November 2004.
7. Albert Benveniste, Patricia Bournai, Thierry Gautier, Michel Le Borgne, Paul Le Guernic, and Hervé Marchand. The Signal declarative synchronous language: controller synthesis and systems/architecture design. In 40th IEEE Conference on Decision and Control, December 2001.
8. U. Brinkschulte, A. Bechina, F. Picioroaga, and E. Schneider. Open System Architecture for embedded control applications. In International Conference on Industrial Technology, volume 2, pages 1247–1251, Slovenia, December 2003.
9. Christian Brunette, Jean-Pierre Talpin, Abdoulaye Gamatié, and Thierry Gautier. A metamodel for the design of polychronous systems. Journal of Logic and Algebraic Programming, 78(4):233–259, 2009.
10. Jérémy Buisson, Cecilia Carro, and Fabien Dagnat. Issues in applying a model driven approach to reconfigurations of satellite software. In HotSWUp '08: Proceedings of the 1st International Workshop on Hot Topics in Software Upgrades, pages 1–5, New York, NY, USA, 2008. ACM.
11. Jérémy Buisson and Fabien Dagnat. Experiments with Fractal on modular reflection. In SERA '08: Proceedings of the 2008 Sixth International Conference on Software Engineering Research, Management and Applications, pages 179–186, Washington, DC, USA, 2008. IEEE Computer Society.
12. J. Buisson and F. Dagnat. ReCaml: execution state as the cornerstone of reconfigurations. In The 15th ACM SIGPLAN International Conference on Functional Programming, pages 27–29, Baltimore, Maryland, USA, September 2010.
13. Alan Burns. The Ravenscar profile. ACM Ada Letters, 4:49–52, 1999.
14. Alan Burns, Brian Dobbing, and Tullio Vardanega. Guide for the use of the Ada Ravenscar Profile in high integrity systems. Ada Letters, XXIV(2):1–74, 2004.
15. F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and M. Stal. Pattern-oriented software architecture: a system of patterns. Wiley, New York, 1996.


16. F.-X. Dormoy. Scade 6: a model based solution for safety critical software development. In Proceedings of the 4th European Congress on Embedded Real Time Software (ERTS '08), pages 1–9, Toulouse, France, January–February 2008.
17. ESA. European Space Agency. Ground systems and operations – telemetry and telecommand packet utilization (ECSS-E-70), January 2003.
18. Sanford Friedenthal, Alan Moore, and Rick Steiner. A practical guide to SysML: the systems modeling language. Morgan Kaufmann, San Francisco, CA, 2008.
19. James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. Java(TM) language specification, 3rd edition. Addison-Wesley, New York, 2005.
20. Object Management Group. CORBA Component Model 4.0 Specification. Specification Version 4.0, Object Management Group, April 2006.
21. Paul Le Guernic, Jean-Pierre Talpin, Jean-Christophe Le Lann, and Projet Espresso. Polychrony for system design. Journal for Circuits, Systems and Computers, 12:261–304, 2002.
22. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous dataflow programming language LUSTRE. In Proceedings of the IEEE, pages 1305–1320, 1991.
23. David Harel. Statecharts: a visual formalism for complex systems. Science of Computer Programming, 8(3):231–274, 1987.
24. Jérôme Hugues, Béchir Zalila, and Laurent Pautet. Combining model processing and middleware configuration for building distributed high-integrity systems. In ISORC '07: 10th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing, pages 307–312, Washington, DC, USA, 2007. IEEE Computer Society.
25. Damir Isovic and Gerhard Fohler. Efficient scheduling of sporadic, aperiodic, and periodic tasks with complex constraints. In 21st IEEE Real-Time Systems Symposium (RTSS 2000), pages 207–216, Orlando, USA, November 2000.
26. Frédéric Jouault, Jean Bézivin, and Ivan Kurtev. TCS: a DSL for the specification of textual concrete syntaxes in model engineering. In GPCE '06: Proceedings of the 5th International Conference on Generative Programming and Component Engineering, pages 249–254, New York, NY, USA, 2006. ACM.
27. J. Kramer and J. Magee. The evolving philosophers problem: dynamic change management. IEEE Transactions on Software Engineering, 16(11):1293–1306, November 1990.
28. Object Management Group. A UML profile for MARTE, beta 2. Technical report, June 2008.
29. Peter J. Robinson. Hierarchical object-oriented design. Prentice-Hall, Upper Saddle River, NJ, 1992.
30. Jean-François Rolland, Jean-Paul Bodeveix, Mamoun Filali, David Chemouil, and Thomas Dave. AADL modes for space software. In Data Systems In Aerospace (DASIA), Palma de Majorca, Spain, May 2008. European Space Agency (ESA Publications), http://www.esa.int/publications.
31. E. Schneider. A middleware approach for dynamic real-time software reconfiguration on distributed embedded systems. PhD thesis, INSA Strasbourg, 2004.
32. Brinkley Sprunt, John P. Lehoczky, and Lui Sha. Exploiting unused periodic time for aperiodic service using the extended priority exchange algorithm. In IEEE Real-Time Systems Symposium, pages 251–258, Huntsville, AL, USA, December 1988.
33. Andres Toom, Tõnu Naks, Marc Pantel, Marcel Gandriau, and Indra Wati. GeneAuto: an automatic code generator for a safe subset of Simulink/Stateflow. In European Congress on Embedded Real-Time Software (ERTS), Toulouse, France, 2008. Société des Ingénieurs de l'Automobile, http://www.sia.fr.
34. Carl von Platen and Johan Eker. Feedback linking: optimizing object code layout for updates. SIGPLAN Notices, 41(7):2–11, 2006.


Chapter 4
Compiling SHIM

Stephen A. Edwards and Nalini Vasudevan

4.1 Introduction

Embedded systems differ from traditional computing systems in their need for concurrent descriptions to handle simultaneous activities in their environment or to exploit parallel hardware. While it would be nice to program such systems in purely sequential languages, this greatly hinders both expressing and exploiting parallelism. Instead, we propose a fundamentally concurrent language that, by construction, avoids many of the usual pitfalls of parallel programming, specifically data races and non-determinism.

Most sequential programming languages (e.g., C) are deterministic: they produce the same output for the same input. Inputs include usual things such as files and command-line arguments, but for reproducibility and portability, things such as the processor architecture, the compiler, and even the operating system are not considered inputs. This helps programmers by making it simpler to reason about a program, and it also simplifies verification because if a program produces the desired result for an input during testing, it will do so reliably.

By contrast, concurrent software languages based on the traditional shared memory, locks, and condition variables model (e.g., pthreads or Java) are not deterministic by this definition because the output of a program may depend on such things as the operating system's scheduling policy, the relative execution rates of parallel processors, and other things outside the application programmer's control. Not only does this demand that a programmer consider the effects of these things when designing the program, it also means testing can only say a program may behave correctly on certain inputs, not that it will.

That deterministic concurrent languages are desirable and practical is the central hypothesis of the SHIM project. That they relieve the programmer from considering different execution orders is clear; whether they impose too many constraints is not something we attempt to answer here.

S.A. Edwards and N. Vasudevan
Computer Science Department, Columbia University, New York, NY, USA
e-mail: [email protected]; [email protected]


In this chapter, we demonstrate that determinism also benefits code synthesis, optimization, and verification by making it easier for an automated tool to understand a program's behavior. The advantage is particularly helpful for formal verification algorithms, which can ignore different execution interleavings of SHIM programs. Although statements in concurrently running SHIM processes may execute in different orders, SHIM's determinism guarantees this will not affect any result and hence most properties. This is in great contrast to the motivation for the SPIN model checker [1], one of whose main purposes is to check different execution interleavings for consistency. SHIM has no need for SPIN.

SHIM is an asynchronous concurrent language whose programs consist of independently running threads coded in an imperative C-like style that communicate exclusively through rendezvous channels. It is a restriction of Kahn networks [2] that replaces Kahn's unbounded buffers with the rendezvous of Hoare's CSP [3]. Kahn's unbounded buffers would make the language Turing-complete [4] and are difficult to schedule [5], so the restriction to rendezvous makes the language easy to analyze. Furthermore, since SHIM is a strict subset of Kahn networks, it inherits Kahn's scheduling independence: the sequence of data values passed across each communication channel is guaranteed to be the same for all correct executions of the program (though it may be input-dependent).

We started the SHIM (Software/Hardware Integration Medium) project after observing students having difficulty making systems that communicated across the hardware/software boundary [6]. Our first attempt [7] focused exclusively on this by providing variables that could be accessed by either hardware processes or software functions (both written in a C dialect).

We found the inherent nondeterminism of this approach a key drawback. The speed at which software runs on processors is rarely known, let alone controlled, and since software and hardware run in parallel and communicate using shared variables, the resulting system was nondeterministic, making it difficult to test.

After our first attempt, we started again from the wish list of Table 4.1. Our goal was to design a concurrent, deterministic (i.e., scheduling-independent) model of computation, and we started looking around. The synchronous model [8] embodied in languages like Lustre or Esterel assumes a single clock or harmonically related clocks and thus would not work well for software.

Table 4.1 The SHIM wish list

Trait                                    Motivation
Concurrent                               Hardware/software systems fundamentally parallel
Mixes synchronous and asynchronous       Software slower and less predictable than hardware;
  styles                                   need something like multirate dataflow
Only requires bounded resources          Fundamental restriction on hardware
Formal semantics                         No arguments about meaning or behavior
Scheduling-independent                   I/O should not depend on program implementation


The Signal language is based on a richer model whose clocks' rates can be controlled by data, but many find its syntax and semantics confusing. Furthermore, Signal does not guarantee determinism; establishing it requires sometimes-costly program-specific analysis.

In the rest of this chapter, we describe a series of code-generation techniques suitable for both sequential and parallel processors. Each actually works on a slightly different dialect of the SHIM language, although all use the Kahn-with-rendezvous communication scheme. The reason for this diversity is historical; we added features to the SHIM model as we discovered the need for them.

4.2 SHIM with Processes and Networks

Programs in our first Kahn-with-rendezvous dialect of SHIM consisted of sequential processes and hierarchical network blocks (Fig. 4.1). This dichotomy (which we later removed – see Sect. 4.5) came from mimicking a similar division in hardware description languages like Verilog.

The body of a network block consisted of instantiations (written in a function-call style) of processes or other networks (recursion was not permitted), which all ran in parallel. For succinctness, the compiler inferred the names of communication channels from process and network arguments, although this could be overridden.

Processes consisted of C-like code without pointers. Process arguments were input or output channels. Following C++ syntax, outputs were marked with ampersands (&), suggesting they were passed by reference.

process sink(uint32 D) {
  int v;
  for (;;) v = D;        /* Read from D */
}

process receiver(uint32 C, uint32 &D) {
  int a, b, r, v;
  a = b = 0;
  for (;;) {
    r = 1;
    while (r) {
      r = C;             /* Read from C */
      if (r != 0) {
        v = C;           /* Read from C */
        a = a + v;
      }
    }
    b = b + 1;
    D = b;               /* Write to D */
  }
}

process sender(uint32 &C) {
  int d, e;
  d = 0;
  while (d < 4) {
    e = d;
    while (e > 0) {
      C = 1;             /* Write to C */
      C = e;             /* Write to C */
      e = e - 1;
    }
    C = 0;               /* Write to C */
    d = d + 1;
  }
}

network main() {
  sender();
  receiver();
  sink();
}

Fig. 4.1 The dialect of SHIM on which the tail-recursive (Sect. 4.3) and static (Sect. 4.4) code generators worked. A process contains imperative code; its parameters are channels. A network runs processes or other networks in parallel


References to arguments would be treated as blocking write operations if they appeared on the left of an assignment statement and as blocking reads otherwise.

The overall structure, then, of a SHIM program in this dialect was a collection of sequential processes running in parallel and communicating through point-to-point channels. The compiler rejected programs in which a channel was an output of more than one process.

4.3 Tail-Recursive Code Generation

Our first code-generation technique produces single-threaded C code for a uniprocessor [9]. The central challenge is efficiently simulating concurrency without (costly, non-portable) operating system support (we present a parallel code generator in Sect. 4.6). Our technique uses an extremely simple scheduler – a stack of function pointers – that invokes fragments of concurrently-running processes using tail recursion.

In tail-recursive code generation, we translate the code for each process into a collection of C functions. In this dialect of SHIM, every program amounted to a group of processes running in parallel (Sect. 4.2). The boundaries of these functions are places where the process may communicate and have to block, so each such process function begins with code just after a read or a write and terminates at a read, a write, or when the process itself terminates.

At any time, a process may be running, runnable, blocked on a channel, or terminated. These states are distinguished by the contents of the stack, the channel metadata structs, and the program counter of the process. When a process is runnable, a pointer to one of its functions is on the stack and its blocked field (defined in its local variable struct) is 0. A running process has control of the processor and there is no pointer to any of its functions on the stack. When a process is blocked, its blocked field is 1 and the reader or writer function pointer of at least one channel points to one of the process's functions. When a process has terminated, no pointers to it appear on the stack and its blocked field is 0.

Normal SHIM processes may only block on a single channel at once, so it would seem wasteful to keep a function pointer per channel to remember where a process is to resume. In Sect. 4.4, we relax the block-on-single-channel restriction to accommodate code that mimics groups of concurrently-running processes.

Processes communicate through channels that consist of two things: a struct channel that contains function pointers to the reading or writing process that is blocked on the channel, and a buffer that can hold a single object being passed through the channel. A non-null function pointer points to the process function that should be invoked when the process becomes runnable again.

Figure 4.2 shows the implementation of a system consisting of a source process that writes 42 to channel C and a sink process that reads it.


void (*stack[3])(void);   /* runnable process stack */
void (**sp)(void);        /* stack pointer */

struct channel {
  void (*reader)(void);   /* process blocked reading, if any */
  void (*writer)(void);   /* process blocked writing, if any */
};

struct channel C = { 0, 0 };
int C_value;

struct {                  /* local state of source process */
  char blocked;           /* 1 = blocked on a channel */
  int tmp1;
} source = { 0 };

struct {                  /* local state of sink process */
  char blocked;           /* 1 = blocked on a channel */
  int v;
  int tmp2;
} sink = { 0 };

process source(int32 &C) {
  C = 42;                 /* send on C */
}

void source_0() {                            /* 1  2 */
  source.tmp1 = 42;
  C_value = source.tmp1;  /* write to channel buffer */
  if (sink.blocked && C.reader) {            /* if reader blocked */
    sink.blocked = 0;     /* mark reader unblocked */
    *(sp++) = C.reader;   /* schedule the reader */
    C.reader = 0;         /* clear the channel */
  }
  source.blocked = 1;     /* block us, the writer */
  C.writer = source_1;    /* to continue at source_1 */
  (*(*--sp))(); return;   /* run next process */
}

void source_1() {                            /* 3  4 */
  (*(*--sp))(); return;
}

process sink(int32 C) {
  int v = C;              /* receive */
}

void sink_0() {                              /* 2  1 */
  if (source.blocked && C.writer) {          /* if writer blocked */
    sink_1(); return;     /* go directly to sink_1 */
  }
  sink.blocked = 1;       /* block us, the reader */
  C.reader = sink_1;      /* to continue at sink_1 */
  (*(*--sp))(); return;   /* run next process */
}

void sink_1() {                              /* 3 */
  sink.tmp2 = C_value;    /* read from channel buffer */
  source.blocked = 0;     /* unblock the writer */
  *(sp++) = C.writer;     /* schedule the writer */
  C.writer = 0;           /* clear the channel */
  sink.v = sink.tmp2;
  (*(*--sp))(); return;   /* run next process */
}

void termination_process() {}

int main() {
  sp = &stack[0];
  *(sp++) = termination_process;
  *(sp++) = source_0;
  *(sp++) = sink_0;
  (*(*--sp))();
  return 0;
}

Fig. 4.2 Synthesized code for two processes (in the boxes) that communicate and the main() function that schedules them


The synthesized C code consists of data structures that maintain a set of functions whose execution is pending, a buffer and state for each communication channel, structs that hold the local variables of each process, a collection of functions that hold the code of the processes broken into pieces, a placeholder function called termination_process that is called when the system terminates or deadlocks, and finally a main function that initializes the stack of pending function pointers and starts the system.

Processes are scheduled by pushing the address of a function on the stack and performing a tail-recursive call to a function popped off the top of the stack. The C code for this is as follows.

void func1() {
  ...
  *(sp++) = func2;        /* schedule func2() */
  ...
  (*(*--sp))(); return;   /* run a pending function */
}
void func2() { ... }

Under this scheme, each process is responsible for running the next; there is no central scheduler code.

Because this code generator compiles a SHIM dialect that uses only point-to-point channels for blocking rendezvous-style communication (see Sect. 4.2), the first process that attempts to read or write on a channel blocks until the process at the other end of the channel attempts the complementary operation. Communication is the only cause of blocking behavior in SHIM systems (i.e., the scheduler is non-preemptive), so processes control their peers' execution at communication events.

The sequence of actions at read and write events is fairly complicated but still fairly efficient. Broadly, when a process attempts to read or write, it unblocks its peer if its peer is waiting; otherwise it blocks on the channel. Annotations in Fig. 4.2 illustrate the behavior of the code. There are two possibilities: when the source runs first (1), it immediately writes the value to be communicated into the buffer for the channel (C_value, because the code maintains the invariant that a reader only unblocks a writer after it has read data from the channel buffer) and checks to see if the reader (the sink process) is already blocked on the channel.

Since we assumed the source runs first, the sink is not blocked, so the source blocks on the channel. Next, the source records that control should continue at the source_1 function (the purpose of setting C.writer) when the sink resumes it. Finally, it pops and calls the next waiting process function from the stack.

Later (2), the sink checks if the source is blocked on C. In this source-before-sink scenario, the source is blocked, so sink_0 immediately jumps to sink_1, which fetches the data from the channel buffer, unblocks and schedules the writer, and clears the channel before calling the next process function, source_1 (3).

When the sink runs first (1), it finds the source is not blocked and then blocks. Later, the source runs (2), writes into the buffer, discovers the waiting sink process, and unblocks and schedules sink before blocking itself. Later, sink_1 runs (3), which reads data from the channel buffer and unblocks and schedules the writer, which eventually sends control back to source_1 (4).


The main challenge in generating the code described above is identifying the process function boundaries. We use a variant of extended basic blocks (see Fig. 4.3c): a new function starts at the beginning of the process, at a read or write operation, and at any statement with more than one predecessor. This divides the process into single-entry, multiple-exit subtrees, which is finer than it needs to be, but is sufficient and fast. The algorithm is simple: after building the control-flow graph of a process, a depth-first search is performed starting from each read, write, or multiple-fan-in node and stops when it reaches another such node. The spanning tree built by each DFS becomes the control-flow graph for the process function, and code is generated mechanically from there.
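To make the boundary rule concrete, here is a small C sketch of how a compiler might split a control-flow graph this way. It is only an illustration of the rule described above, not the SHIM compiler's actual code; the node representation, field names, and fixed array sizes are assumptions made for the example.

#include <stdio.h>

#define MAX_SUCCS 2

/* Hypothetical CFG node; not the SHIM compiler's real IR. */
struct node {
    int is_comm;          /* node is a read or write (await)           */
    int n_preds;          /* number of predecessors                    */
    int n_succs;          /* number of successors                      */
    int succ[MAX_SUCCS];  /* successor node indices                    */
    int func;             /* process function this node is assigned to */
};

/* A node starts a new process function if it is the entry node,
   a communication point, or has more than one predecessor. */
static int is_boundary(const struct node *g, int i)
{
    return i == 0 || g[i].is_comm || g[i].n_preds > 1;
}

/* Claim nodes for function f by DFS, stopping at the next boundary. */
static void claim(struct node *g, int i, int f)
{
    g[i].func = f;
    for (int s = 0; s < g[i].n_succs; s++) {
        int j = g[i].succ[s];
        if (!is_boundary(g, j) && g[j].func < 0)
            claim(g, j, f);
    }
}

/* Assign every node to a process function; returns how many were made. */
static int split_into_process_functions(struct node *g, int n)
{
    int nfuncs = 0;
    for (int i = 0; i < n; i++) g[i].func = -1;
    for (int i = 0; i < n; i++)
        if (is_boundary(g, i))
            claim(g, i, nfuncs++);
    return nfuncs;
}

int main(void)
{
    /* Toy CFG: 0 -> 1 -> 2 -> 1; node 1 is a communication point
       and has two predecessors, so it starts a second function. */
    struct node g[3] = {
        { 0, 0, 1, {1}, -1 },
        { 1, 2, 1, {2}, -1 },
        { 0, 1, 1, {1}, -1 },
    };
    printf("%d process functions\n", split_into_process_functions(g, 3));
    return 0;
}

On this toy graph the sketch reports two process functions, one rooted at the entry node and one rooted at the communication node, mirroring the single-entry subtrees described above.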

Figure 4.3 illustrates the code generation process for a simple process with some interesting control flow. The process (Fig. 4.3a) consists of two nested loops. We translate the SHIM code into a fairly standard linear IR (Fig. 4.3b). Its main novelty is await, a statement that represents blocking on one or more channels. E.g., await write C goto 6 indicates the process wants to communicate with its environment on channel C and will branch to statement 6 once this has occurred. Note that the instruction itself only controls synchronization; the actual data transfer takes place in an earlier assignment statement. Although this example (and in fact all SHIM processes) only ever blocks on a single channel at a time, our static scheduling procedure (Sect. 4.4) uses the ability to block on multiple channels simultaneously.

Our generated C code uses the following macros:

#define BLOCKED_READING(r, ch) r.blocked && ch.reader
#define RUN_READER(r, ch) \
  r.blocked = 0, *(sp++) = ch.reader, ch.reader = 0
#define BLOCK_WRITING(w, ch, succ) w.blocked = 1, ch.writer = succ
#define BLOCKED_WRITING(w, ch) w.blocked && ch.writer
#define RUN_WRITER(w, ch) \
  w.blocked = 0, *(sp++) = ch.writer, ch.writer = 0
#define BLOCK_READING(r, ch, succ) r.blocked = 1, ch.reader = succ
#define RUN_NEXT (*(*--sp))(); return

BLOCKED_READING is true if the given process is blocked on the given channel. RUN_READER marks the given process, which is blocked on the given channel, as runnable. BLOCK_WRITING marks the given process – the currently running one – as blocked writing on the given channel. The succ parameter specifies the process function to be executed when the process next becomes runnable. Finally, RUN_NEXT runs the next runnable process.

4.4 Code Generation from Static Schedules

The tail-recursive code generator we presented above uses a clever technique to reduce run-time scheduling to little more than popping an address off a stack and jumping to it, but even this amount of overhead can be high for an extremely simple process such as an adder.

In this section, we describe how to eliminate even this low scheduling overhead by compiling together groups of concurrently-running processes into a single imperative process that can be substituted for the group of processes [9].


process source(int32 &C) {
  bool b = 0;
  for (int32 a = 0 ; a < 100 ; ) {
    if (b) {
      C = a;
    } else {
      for (int32 d = 0 ; d < 10 ; ++d)
        a = a + 1;
    }
    b = ~b;
  }
}
(a)

 0  b = 0
 1  a = 0
 2  ifnot a < 100 goto 14
 3  ifnot b goto 7
 4  C = a
 5  await write C goto 6
 6  goto 12
 7  d = 0
 8  ifnot d < 10 goto 12
 9  a = a + 1
10  d = d + 1
11  goto 8
12  b = 1 - b
13  goto 2
14  Exit
(b)

[(c) Control-flow graph of the process, shown before and after being split into extended basic blocks; its nodes are the statements of (b)]

struct channel C = {0, 0};
int C_val;
struct {
  bool blocked;
  bool b;
  int32 a;
  int32 d;
} source = { 0 };

static void source_0() {
  source.b = 0;
  source.a = 0;
  source_1(); return;
}
static void source_1() {
  if (!(source.a < 100)) goto L9;
  if (!(source.b)) goto L7;
  C_val = source.a;
  if (BLOCKED_READING(sink, C))
    RUN_READER(sink, C);
  BLOCK_WRITING(source, C, source_2);
  RUN_NEXT;
L7:
  source.d = 0;
  source_3(); return;
L9:
  RUN_NEXT;
}
static void source_2() {
  source_4(); return;
}
static void source_3() {
L1:
  if (!(source.d < 10)) goto L6;
  source.a = source.a + 1;
  source.d = source.d + 1;
  goto L1;
L6:
  source_4(); return;
}
static void source_4() {
  source.b = 1 - source.b;
  source_1(); return;
}
(d)

Fig. 4.3 Generating tail-recursive code for a single process. Our compiler translates a process (a) into an intermediate representation (b). This is translated into a CFG and split into extended basic blocks (c); finally, each block becomes a function (d)


Efficiency is the advantage of this approach: by analyzing the behavior of a group at compile time, we are able to eliminate most scheduling overhead. Our procedure is therefore similar to many known techniques for sequential code generation, but makes different trade-offs. It generates an automaton for a group of SHIM processes using exhaustive simulation that resembles the subset construction algorithm for generating deterministic finite automata from nondeterministic ones.

The disadvantage of this approach is a potential explosion in code size. Since it builds a product machine from concurrently-running processes, there is a danger of an exponential state explosion. We do not consider this a serious problem for two reasons: our abstraction of processes often leads to small machines for large systems, and it is always possible to synthesize smaller subsets of a system and run them dynamically. Our technique therefore provides a controllable time/space trade-off.

The complete state of a SHIM system comprises the program counter of each process and the value of each process-local variable. While we could build an automaton whose states exactly represent this, it would be impractically large for all but the simplest programs. Instead we track an abstract version of the system state in the automaton. While this does defer many computations to when the generated code is running, it greatly reduces the size of the automata and hence the generated code. Experimentally, we find this a good trade-off.

Because SHIM systems tend to have periodic communication patterns, it turns out we can compile away most of the scheduling overhead and still have small automata. Unfortunately, while compiling away context-switching overhead would also be nice, it would demand tracking combinations of reachable program counter states, something that easily grows exponentially. We find our current solution a good trade-off that can produce impressive speed-ups.

Each state in our generated automaton represents the execution of one process between context-switch points or a point where the subnetwork is waiting for its environment. Each transition corresponds to as many as two separate communication events, so the automaton represents the system's communication pattern. For each state, we copy code from the state's process and replace context-switching points with gotos to code for the state's successors.

Each state's signature – the system state we insist be unique for each automaton state – is a flag for each process indicating whether it is runnable plus a flag for each channel that indicates whether the channel is clear, blocked on a reader, or blocked on a writer. We deliberately ignore program counters and local variables in the signature – our abstraction to produce compact automata.

Although we do not consider it part of a state's signature for matching purposes, we do track what program counter values are possible, to streamline the generated code and reduce the size of the automaton by limiting both the amount of code generated for each state (unreachable code is omitted) and the number of successor states. Practically, when we reach a state with the same signature as an existing one but with new program counter values, we consider the two states identical and form the union of the program counter sets. Our simulation procedure thus combines a depth-first search and a relaxation procedure that finds a fixed point.
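As an illustration of this signature-and-merge bookkeeping, the sketch below shows one plausible shape for the state table in C. It is not the compiler's actual implementation; the sizes, the bit-set encoding of PC sets, and the function names are assumptions made for the example.

#include <stdbool.h>
#include <string.h>

#define NPROCS 3
#define NCHANS 2
#define MAX_STATES 256

enum chan_state { CHAN_CLEAR, CHAN_READER_BLOCKED, CHAN_WRITER_BLOCKED };

/* The signature deliberately omits program counters and variables. */
struct signature {
    bool runnable[NPROCS];
    enum chan_state chan[NCHANS];
};

/* A state also tracks, per process, the set of possible PCs (as a bit set). */
struct state {
    struct signature sig;
    unsigned pcs[NPROCS];
};

static struct state states[MAX_STATES];
static int nstates = 0;

static bool sig_equal(const struct signature *a, const struct signature *b)
{
    for (int p = 0; p < NPROCS; p++)
        if (a->runnable[p] != b->runnable[p]) return false;
    for (int c = 0; c < NCHANS; c++)
        if (a->chan[c] != b->chan[c]) return false;
    return true;
}

/* Look up a state by signature.  If it exists, union in the new PC sets;
   *grew tells the caller whether anything changed, i.e., whether the
   relaxation must re-simulate from this state.  Otherwise create it. */
int intern_state(const struct signature *sig,
                 const unsigned pcs[NPROCS], bool *grew)
{
    for (int i = 0; i < nstates; i++)
        if (sig_equal(&states[i].sig, sig)) {
            *grew = false;
            for (int p = 0; p < NPROCS; p++) {
                unsigned merged = states[i].pcs[p] | pcs[p];
                if (merged != states[i].pcs[p]) {
                    states[i].pcs[p] = merged;
                    *grew = true;
                }
            }
            return i;
        }
    states[nstates].sig = *sig;
    memcpy(states[nstates].pcs, pcs, sizeof states[nstates].pcs);
    *grew = true;
    return nstates++;
}

Simulation would call intern_state each time it reaches a context-switch point; a true *grew result means the search must continue from that state with the enlarged PC sets, which is the relaxation to a fixed point mentioned above.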


Figure 4.4 shows a simple program being transformed into an automaton. The program's three processes (Fig. 4.4a) are a sink that always reads, a buffer that reads and then writes, and a source that sends four numbers and terminates. Our compiler dismantles processes into statement lists (Fig. 4.4b) that are simulated to produce an automaton (Fig. 4.4c). Our compiler then generates code for each state in the automaton and connects them with gotos, producing the IR in Fig. 4.4d. This IR is then passed to the normal code generation procedure described in Sect. 4.3 to produce executable C.

The structure of Fig. 4.4c is typical of systems with periodic behavior that terminate: the first state initializes the system to bring it to where periodic behavior begins. The loop represents the periodic behavior, and the state just outside the loop represents a deadlock because the source has terminated.

Each state in Fig. 4.4c is labeled with its name; the state of each process (either runnable or blocked on a channel) when the state begins; the state of each channel ("-" for clear, "R" when its reader is blocked, and "W" when its writer is blocked); and a set of program counter values that each process may be in at the beginning of the state. Thus, in State 1, process 0 is blocked, processes 1 and 2 are runnable, no process is blocked on the first channel (A), and the reader (the sink process) is blocked on the second channel (B). Moreover, the first process (sink) must be at instruction 1, the second process (buffer) may be at instruction 0 or 4, and the third process may be at 0, 2, 4, 6, or 8.

A SHIM system runs consistently under any reasonable scheduling policy [10]. We adopt a scheduling policy that selects the lowest-numbered runnable process. The automaton we generate, therefore, depends on process labeling (currently from positions in the source file), but it is guaranteed to produce the same overall behavior. A better scheduling policy could improve the generated code.

The automaton generation procedure starts with all processes runnable and all program counters at 0 – State 0 in Fig. 4.4c. Our scheduling policy then runs the first process – the sink – which executes instruction 0 and blocks on channel 1 (B), so State 1 has channel 1 blocked on sink. The first runnable process, process 1 (the buffer), starts at instruction 0 in State 1, tries to read from channel 0 (A), and blocks. This gives State 2, in which the first two processes (sink and buffer) are blocked and channels 0 and 1 are blocked on them.

The loop in Fig. 4.4c (States 1, 2, 3, and 4) is the periodic behavior: the buffer blocks trying to read, the source emits a token, the buffer reads it, the sink reads it, and the loop repeats.

The simulation traces the loop four times because the source can be at four control points waiting to write on A, but this does not create new states because each has the same signature; here our choice of signature shrinks the automaton.

State 2 in Fig. 4.4c has two successors: the loop (State 3) and State 5. This is a choice between the three PC values (2, 4, and 6) that lead to a write on the A channel and a fourth (8) that brings it to termination. State 5 corresponds to the state in which no process is runnable; the buffer is waiting to read from the source and the sink is waiting to read from the buffer.


process sink(int32 B) {
  for (;;) B;
}
process buffer(int32 &B, int32 A) {
  for (;;) B = A;
}
process source(int32 &A) {
  A = 17;
  A = 42;
  A = 157;
  A = 8;
}
network main() {
  sink();
  buffer();
  source();
}
(a) SHIM code

sink
  0 PreRead 1
  1 PostRead 1 tmp3
  2 goto 0

buffer
  0 PreRead 0
  1 PostRead 0 tmp2
  2 tmp1 := tmp2
  3 Write 1 tmp1
  4 goto 0

source
  0 tmp4 := 17
  1 Write 0 tmp4
  2 tmp5 := 42
  3 Write 0 tmp5
  4 tmp6 := 157
  5 Write 0 tmp7
  6 tmp8 := 8
  7 Write 0 tmp8
  8 Exit
(b) Dismantled

State 0  --  {0} {0} {0}
State 1  -R  {1} {0,4} {0,2,4,6,8}
State 2  RR  {1} {1} {0,2,4,6,8}
State 3  WR  {1} {1} {2,4,6,8}
State 4  -W  {1} {4} {2,4,6,8}
State 5  RR  {1} {1} {8}
(c) The automaton (channel states and possible PC sets per process; each state is also marked with which processes are runnable or blocked)

 0 /* State 0 (sink) */
 1 sink_state = 1
 2 goto 3

 3 /* State 1 (buffer) */
 4 switch buffer_state
     case 0: goto 8
     case 4: goto 7
 5 buffer_state = 1
 6 goto 9
 7 goto 5
 8 goto 5

 9 /* State 2 (source) */
10 switch source_state
     case 0: goto 29
     case 2: goto 25
     case 4: goto 21
     case 6: goto 17
     case 8: goto 15
11 value__V0 = 17
12 A__V0 = value__V0
13 source_state = 2
14 goto 30
15 source_state = 8
16 goto 42
17 value__V3 = 8
18 A__V0 = value__V3
19 source_state = 8
20 goto 30
21 value__V2 = 157
22 A__V0 = value__V2
23 source_state = 6
24 goto 30
25 value__V1 = 42
26 A__V0 = value__V1
27 source_state = 4
28 goto 30
29 goto 11

30 /* State 3 (buffer) */
31 value__V5 = A__V0
32 received 0 in value__V5
33 value__V4 = value__V5
34 B__V1 = value__V4
35 buffer_state = 4
36 goto 37

37 /* State 4 (sink) */
38 value__V6 = B__V1
39 received 1 in value__V6
40 sink_state = 1
41 goto 3

42 /* State 5 (blocked) */
43 exit
(d) The generated IR

Fig. 4.4 Synthesizing the automaton for three concurrently-running processes. The SHIM code (a) is first translated into a linear IR (b) that splits read operations into two halves. Simulating these processes produces an automaton (c), from which a different type of IR is generated (d). This is passed to the code generation algorithm in Sect. 4.3 to be translated into C


Figure 4.4d is the IR generated from the automaton in Fig. 4.4c. Each state produces a code fragment, some of which begin with a switch that sends control to where the process suspended. The code for each state ends by assigning a constant to the process's state variable that indicates where it should resume. We describe the generation of such switch statement code elsewhere [11]. The mechanism is analogous to the tail-recursive calls to function pointers described in Sect. 4.3, but keeps the code together.

4.5 SHIM with Functions, Recursion, and Exceptions

The processes-and-networks dialect of SHIM (Sect. 4.2) worked, but we quickly discovered we missed function calls. We also found that we wanted the ability to run lightweight blocks of code in parallel rather than requiring new processes to be declared. Finally, we decided to add exceptions, which turned out to be interesting but technically challenging. The result was, by design, a much more C-like language, which simplified the task of porting existing C programs into SHIM.

We began [12] by removing the process/network dichotomy by introducing the par construct, which starts two or more code blocks in parallel and waits for them to terminate. To uphold the SHIM model and its goal of scheduling independence, our compiler actually split each code block into a separate function and carefully determined which variables to pass into and out of it.

A key trick was to infer a pass-by-reference parameter for any variable modified by the code block and a pass-by-value parameter for any variable only read by the code block. Furthermore, we prohibited any variable from being passed by reference more than once at a call site, thus prohibiting any more than one alias for each variable at any given time.

For example,

x = y;
par
y = x;

swaps the values of x and y. Internally, the compiler expands this into functions:

void block_1(int &x, int y) { x = y; }
void block_2(int &y, int x) { y = x; }

block_1(x, y);
par
block_2(y, x);

Here, the two block functions run in parallel. The first is passed a reference to x and a copy of y, the second a reference to y and a copy of x. Thus, the assignments can happen in either order and produce the same result.

We also made the syntax for communication on channels more explicit by adding send, recv, and next keywords, rather than making communication simply a side-effect of referencing a channel.


void sink(chan uint32 D) {
  for (;;) recv D;
}

void receiver(chan uint32 C, chan uint32 &D) {
  int a;
  a = D = 0;
  for (;;) {
    while (next C)
      a = a + next C;
    D = D + 1;
    send D;
    next D = a;
  }
}

void sender(chan uint32 &C) throws Done {
  int d, e;
  for (d = 0 ; d < 4 ; d = d + 1) {
    for (e = d ; e > 0 ; e = e - 1) {
      next C = 1;
      next C = e;
    }
    next C = 0;
  }
  throw Done;
}

void main() {
  chan uint32 C, D;
  try
    sender(C); par receiver(C, D); par sink(D);
  catch (Done) {}
}

Fig. 4.5 The program of Fig. 4.1 coded in the latest SHIM dialect. We added par, chan, send, recv, next, try, catch, and throw and removed the distinction between processes and networks: both are now functions

Although clarity was the main motivation for adding send and recv (the previous policy confused many users), we also found we were often reading a value from a channel and storing it locally so we could refer to it multiple times. We attempted to retain the communication-in-an-expression syntax by introducing the next keyword, which sends on a channel if it appears on the left side of an assignment and receives when it appears elsewhere. While it can make for very succinct code (next b = next a is a succinct way of writing a buffer), users continue to find it confusing and prefer the send and recv syntax. Figure 4.5 shows the program of Fig. 4.1 coded in this new dialect.

We also added the facility for multiway rendezvous. Although a channel may be passed by reference only once, it may be passed by value an unlimited number of times. In this case, each function that receives a pass-by-value copy of the channel is required to participate in any rendezvous on the channel. A primary motivation of this was to facilitate debugging – it is now easy to add processes that monitor channels without affecting a system's behavior. In retrospect, this facility is sparsely used, difficult to implement, and slows down the more typical point-to-point communication unless carefully optimized away.

4.5.1 Recursion

When we introduced function calls, we had to consider how to handle recursion. Our main goal was to make basic function calls work, allowing the usual re-use of code, but we also found that recursion, especially bounded recursion, was an interesting mechanism for specifying more complex structures.


void buffer(int i, int &o) {
  for (;;) {
    recv i;
    o = i;
    send o;
  }
}

void fifo(int i, int &o, int n) {
  int c;
  int m = n - 1;
  if (m)
    buffer(i, c) par fifo(c, o, m);
  else
    buffer(i, o);
}

Fig. 4.6 An n-place FIFO specified using recursion, from Tardieu and Edwards [12]

void fifo3(chan int i, chan int &o) {
  fifo(i, o, 3);
}
void fifo(chan int i, chan int &o, int n) {
  if (n > 1) {
    chan int c;
    buf(i, c); par fifo(c, o, n-1);
  } else buf(i, o);
}
void buf(chan int i, chan int &o) {
  int tmp;
  for (;;) {
    tmp = recv i;
    send o = tmp;
  }
}
(a)

void fifo3(chan int i, chan int &o) {
  chan int c1, c2, c3;
  buf(i, c1);
  par
  buf(c1, c2);
  par
  buf(c2, o);
}
void buf(chan int i, chan int &o) {
  int tmp;
  for (;;) {
    tmp = recv i;
    send o = tmp;
  }
}
(b)

Fig. 4.7 Removing bounded recursion, controlled by the n variable, from (a) gives (b). After Edwards and Zeng [13]

Figure 4.6 illustrates this style. The recursive fifo procedure calls itself repeatedly in parallel, effectively instantiating buffer processes as it goes. This recursion runs only once, when the program starts, to set up a chain of single-place buffers.

We developed a technique for removing bounded recursion from SHIM programs [13]. One goal was to simplify SHIM's translation into hardware, where general recursion would require memory for a stack and choosing a size for it, but it has found many other uses. In particular, if all the recursion in a program is bounded, the program is finite-state, simplifying other analysis steps.

The basic idea of our work was to unroll recursive calls by exactly tracking the behavior of variables that control the recursion. Our insight was to observe that for a recursive function to terminate, the recursive call must be within the scope of a conditional. Therefore, we need to track the predicate of this conditional, see what can affect it, and so forth.

Figure 4.7 shows the transformation of a simple FIFO. Our procedure produces the static version in Fig. 4.7b by observing that the n variable controls the predicate around fifo's recursive call.


Then it notices n is set first to 3 by fifo3 and generates three specialized versions of fifo – one each with n = 3, n = 2, and n = 1 – simplifies each, and then inlines each function, since each is only called once.

Of course, in the worst case our procedure could end up trying to track every variable in the program, which would be impractical, but in programs written with this idiom in mind, recursion control only involved a few variables, making it easy to resolve.

4.5.2 Exceptions

At this stage, we also added exceptions [14], without question the most technically difficult addition we have made. Inspired by the Esterel language [15], where exceptions are used not just for occasional error handling but as widely as, say, if-then-else, we wanted our exceptions to be widely applicable and to be concurrent and scheduling-independent.

For sequential code, the desired exception semantics were clear: throwing an exception immediately sends control to the most-recently-entered handler for the given exception, terminating any functions that were called in between (Fig. 4.8).

For concurrently running functions, the right behavior was less obvious. We wanted to terminate everything leading up to the handler, including any concurrently running relatives, but we insisted on maintaining SHIM's scheduling independence, meaning we had to carefully time when the effect of an exception was felt. Simply terminating siblings when one threw an exception would be nondeterministic: the behavior would then depend on the relative execution rates of the processes and thus not be scheduling independent.

Our solution was to piggyback the exception mechanism on the communication system. The idea was that a process would only learn of an exception when it attempted to communicate, since it is only at rendezvous points that two processes agree on what time it is.

void main() {
  int i;
  i = 0;
  try {
    i = 1;
    throw T;
    i = i * 2;    // not executed
  } catch(T) {
    i = i * 3;    // i = 3
  }
}
(a)

void main() {
  int i;
  i = 0;
  try {           // thread 1
    throw T;
  } par {         // thread 2
    for (;;)      // never terminated
      i = i + 1;
  } catch(T) {}
}
(b)

Fig. 4.8 (a) Sequential exception semantics are classical. (b) Thread 2 never feels the effect of the exception because it never communicates. From Tardieu and Edwards [14]


void main() {
  chan int i = 0, j = 0;
  try {           // task 1
    while (i < 5) send i = i + 1;
    throw T;      // poisons itself
  } par {         // task 2
    for (;;) send j = recv i + 1;   // poisoned by task 1
  } par {         // task 3
    for (;;) recv j;                // poisoned by task 2
  } catch (T) {}
}

Fig. 4.9 Transitive poisoning: throw T poisons the first task, which poisons the second when the second attempts recv i. Finally the third is poisoned when it attempts recv j and the whole group terminates

Thus, SHIM's exception mechanism is layered on the inter-process communication mechanism to preserve determinism while providing powerful sequential control.

To accommodate exceptions, we introduced a new "poisoned" state for a process that represents when it has been terminated by an exception and is waiting for its relatives to terminate. Any process that attempts to communicate with a poisoned process will itself become poisoned. In Fig. 4.9, the first thread throws an exception; the second thread is poisoned when it attempts to rendezvous on i, and the third is poisoned by the second when it attempts to rendezvous on j.

The idea was simple enough, and the interface it presented to the programmer could certainly be used and explained without much difficulty, but implementing it turned out to be a huge challenge, despite there being a fairly simple set of structural operational semantics rules for it.

The real complexity came from a combination of having to implement exception scope, which limits how far the poison propagates (it does not propagate outside the scope of the exception), and how that interacts with the scopes of multiple, concurrently thrown exceptions.

4.6 Generating Threaded Code

To handle multiway rendezvous and exceptions on multiprocessors, we needed a new technique. Our next backend [16] generates C code that calls the POSIX thread library. Here, the challenge is minimizing overhead. Each communication action acquires the lock on a channel, checks whether every connected process has also blocked (whether the rendezvous can occur), and then checks if the channel is connected to a poisoned process (an exception has been thrown).


void h(chan int &A) {
  A = 4; send A;
  A = 2; send A;
}
void j(chan int A) throws Done {
  recv A;
  throw Done;
}
void f(chan int &A) throws Done {
  h(A); par j(A);
}
void g(chan int A) {
  recv A;
  recv A;
}
void main() {
  try {
    chan int A;
    f(A); par g(A);
  } catch (Done) {}
}

Fig. 4.10 A SHIM program with exceptions

4.6.1 An Example

We will use the example in Fig. 4.10 to illustrate threaded code generation. There, the main function declares the integer channel A and passes it to tasks f and g; then f passes it to h and j. Tasks f and h send data with send A. Tasks g and j receive it with recv A.

Task h sends the value four to tasks g and j. Task h blocks on the second send A because task j does not run a matching recv A.

As we described earlier, SHIM's exceptions enable a task to gracefully interrupt its concurrently running siblings. A sibling is "poisoned" by an exception only when it attempts to communicate with a task that raised an exception or with a poisoned task. For example, when j in Fig. 4.10 throws Done, it interrupts h's second send A and g's second recv A, resulting in the death of h and g. An exception handler runs after all the tasks in its scope have terminated or been poisoned.

4.6.2 The Static Call-Graph Assumption

For efficiency, our compiler assumes the communication and call graph of the program is known at compile time. We reject programs with unbounded recursion and can expand programs with bounded recursion [13], allowing us to transform the call graph into a call tree. This duplicates code to improve performance: fewer channel aspects are managed at run time.

We encode in a bit vector the subtree of functions connected to a channel. Since we know at compile time which functions can connect to each channel, we assign a unique bit to each function on a channel. We check these bits at run time with logical mask operations. In the code, something like A_f is a constant that holds the bit our compiler assigns to function f connected to channel A, such as 0x4.
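As a minimal sketch of this bit-vector bookkeeping (using the channel-A constants of Fig. 4.10; the struct layout and helper names here are illustrative, not the code our compiler actually emits):

#include <pthread.h>

/* One bit per function that can connect to channel A, fixed at compile time. */
#define A_main 0x1
#define A_f    0x2
#define A_g    0x4
#define A_h    0x8
#define A_j    0x10

struct shim_channel {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    unsigned connected;   /* functions currently connected to the channel */
    unsigned blocked;     /* functions currently blocked on the channel   */
    unsigned poisoned;    /* functions killed by an exception             */
};

/* Rendezvous can occur only when every connected function has blocked. */
int channel_ready(const struct shim_channel *ch)
{
    return ch->connected != 0 && ch->connected == ch->blocked;
}

/* Typical mask operations: h connects to A, and later blocks on a send,
   which also marks its ancestors f and main as blocked (cf. Fig. 4.11). */
void h_connect(struct shim_channel *A) { A->connected |= A_h; }
void h_block(struct shim_channel *A)   { A->blocked   |= A_h | A_f | A_main; }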


4.6.3 Implementing Rendezvous Communication

Implementing SHIM's multiway rendezvous communication with exceptions is the main code generation challenge.

The code at a send or receive is straightforward: it locks the channel, marks the function and its ancestors as blocked, calls the event function for the channel to attempt the communication, and blocks until communication has occurred. If it was poisoned, it branches to a handler. Figure 4.11 is the code for send A in h in Fig. 4.10.

For each channel, our compiler generates an event function that manages communication. Our code calls an event function when the state of a channel changes, such as when a task blocks or connects to a channel.

Figure 4.12 shows the event function our compiler generates for channel A in Fig. 4.10. While complex, the common case is quick: when the channel is not ready (one connected task is not blocked on the channel) and no task is poisoned, A.connected != A.blocked and A.poisoned == 0, so the bodies of the two if statements are skipped.

If the channel is ready to communicate, A.blocked == A.connected, so the body of the first if runs. This clears the channel (A.blocked = 0), and main's value for A (passed by reference to f and h) is copied to g or j if connected.

If at least one task connected to the channel has been poisoned, A.poisoned != 0, so the body of the second if runs. This code comes from unrolling a recursive procedure at compile time, which is possible because we know the structure of the channel (i.e., which tasks connect to it). The speed of such code is a key advantage over a library.

This exception-propagation code attempts to determine which tasks, if any, connected to the channel should be poisoned. It does this by manipulating two bit vectors. A task can die if and only if it is blocked on the channel and all its children connected to the channel (if any) can also die. A poisoned task may kill its sibling tasks and their descendants. Finally, the code kills each task in the kill set that can die and was not poisoned before by setting its state to POISON and updating the channel accordingly (A.poisoned |= kill).

lock(A.mutex);                     /* acquire lock for channel A */
A.blocked |= (A_h|A_f|A_main);     /* block h and ancestors on A */
event_A();                         /* alert channel of the change */
while (A.blocked & A_h) {          /* while h remains blocked */
  if (A.poisoned & A_h) {          /* were we poisoned? */
    unlock(A.mutex);
    goto poisoned;
  }
  wait(A.cond, A.mutex);           /* wait on channel A */
}
unlock(A.mutex);                   /* release lock for channel A */

Fig. 4.11 C code for send A in function h() of Fig. 4.10


void event_A() {
  unsigned int can_die = 0, kill = 0;
  if (A.connected == A.blocked) {            /* communicate */
    A.blocked = 0;
    if (A.connected & A_g) *A.g = *A.main;
    if (A.connected & A_j) *A.j = *A.main;
    broadcast(A.cond);
  } else if (A.poisoned) {                   /* propagate exceptions */
    can_die = A.blocked & (A_g|A_h|A_j);     /* compute can-die set */
    if (can_die & (A_h|A_j) == A.connected & (A_h|A_j))
      can_die |= A.blocked & A_f;
    if (A.poisoned & (A_f|A_g)) {            /* compute kill set */
      kill |= A_g; if (can_die & A_f) kill |= (A_f|A_h|A_j);
    }
    if (A.poisoned & (A_h|A_j)) { kill |= A_h; kill |= A_j; }
    if (kill &= can_die & ~A.poisoned) {     /* poison some tasks? */
      unlock(A.mutex);
      if (kill & A_g) {                      /* poison g if in kill set */
        lock(g.mutex);
        g.state = POISON;
        unlock(g.mutex);
      }
      /* also poison f, h, and j if in kill set ... */
      lock(A.mutex);
      A.poisoned |= kill; broadcast(A.cond);
    }
  }
}

Fig. 4.12 C code for the event function for channel A of Fig. 4.10

lock(main.mutex); main.state = POISON; unlock(main.mutex);
lock(f.mutex);    f.state    = POISON; unlock(f.mutex);
lock(j.mutex);    j.state    = POISON; unlock(j.mutex);
goto poisoned;

Fig. 4.13 C code for throw Done in function j() of Fig. 4.10

Code for throwing an exception (Fig. 4.13) marks as POISON all its ancestors up to where it will be handled. Because the compiler knows the call tree, it knows how far to "unroll the stack," i.e., how many ancestors to poison.

4.6.4 Starting and Terminating Tasks

It is costly to create and destroy a POSIX thread because each has a separate stack, it requires interaction with the operating system's scheduler, and it usually requires a system call. To minimize this overhead, and because we know the call graph of the program at compile time, our compiler generates code that creates at the beginning as many threads as the SHIM program will ever need. These threads are only destroyed when the SHIM program terminates; if a SHIM task terminates, its POSIX thread blocks until it is re-awakened.


lock(A.mutex);                       /* connect */
A.connected |= (A_f|A_g);
event_A();
unlock(A.mutex);

lock(main.mutex);
main.attached_children = 2;
unlock(main.mutex);

lock(f.mutex);                       /* pass args */
f.A = &A;
unlock(f.mutex);

/* A is dead on entry for g, so do not pass A to g */

lock(f.mutex);                       /* run f() */
f.state = RUN; broadcast(f.cond);
unlock(f.mutex);

lock(g.mutex);                       /* run g() */
g.state = RUN; broadcast(g.cond);
unlock(g.mutex);

lock(main.mutex);                    /* wait for children */
while (main.attached_children)
  wait(main.cond, main.mutex);
if (main.state == POISON) {
  unlock(main.mutex);
  goto poisoned;
}
unlock(main.mutex);

Fig. 4.14 C code for calling f() and g() in main() of Fig. 4.10

Figure 4.14 shows the code in main that runs f and g in parallel. It connects f and g to channel A, sets its number of live children to 2, passes function parameters, then starts f and g. The address of the pass-by-reference argument A is passed to f. Normally, a value for A would be passed to g, but our compiler found this value is not used, so the copy is avoided (discussed below). After starting f and g, main waits for both children to return. Then main checks whether it was poisoned, and if so, branches to a handler.

Reciprocally, Fig. 4.15 shows the code in f that controls its execution: an infinite loop that waits for main, its parent, to set its state field to running, at which point it copies its formal arguments into local variables and runs its body.

If a task terminates normally, it cleans up after itself. In Fig. 4.15, task f disconnects from channel A, sets its state to STOP, and informs main it has one less running child.

By contrast, if a task is poisoned, it may still have children running and it may also have to poison sibling tasks, so it cannot entirely disappear yet. In Fig. 4.15, task f, if poisoned, does not disconnect from A but updates its poisoned field. Then, task f waits for its children to return. At this time, f can disconnect its (potentially poisoned) children from channels, since they can no longer poison siblings. Finally, f informs main it has one less running child.

For IBM's CELL processor, we developed a backend [17] that is a direct offshoot of the pthreads backend but allowed the user to assign certain (computationally intensive) tasks directly to the CELL's eight synergistic processing units (SPUs); the rest of the tasks ran on the CELL's standard PowerPC core (PPU). We did this by replacing these offloaded functions with wrapper functions that communicated across the PPU–SPU boundary. Function calls across the boundary turned out to be fairly technical because of data alignment restrictions on function arguments, which we would have preferred to be stack-resident. This, and many more fussy aspects of coding for the CELL, convinced us that a language at a higher level than C is appropriate for such heterogeneous multicore processors.


int *A;                              /* value of channel A */

restart:
  lock(f.mutex);
  while (f.state != RUN)
    wait(f.cond, f.mutex);
  A = f.A;                           /* copy arg. */
  unlock(f.mutex);

  /* body of the f task */

terminated:
  lock(A.mutex);                     /* disconnect f */
  A.connected &= ~A_f;
  event_A();
  unlock(A.mutex);

  lock(f.mutex);                     /* stop */
  f.state = STOP;
  unlock(f.mutex);
  goto detach;

poisoned:
  lock(A.mutex);                     /* poison A */
  A.poisoned |= A_f;
  A.blocked &= ~A_f; event_A();
  unlock(A.mutex);

  lock(f.mutex);                     /* wait for children */
  while (f.attached_children)
    wait(f.cond, f.mutex);
  unlock(f.mutex);

  lock(A.mutex);                     /* disconnect j, h */
  A.connected &= ~(A_h|A_j);
  A.poisoned &= ~(A_h|A_j);
  event_A();
  unlock(A.mutex);

detach:                              /* detach from parent */
  lock(main.mutex);
  --main.attached_children;
  broadcast(main.cond);
  unlock(main.mutex);
  goto restart;

Fig. 4.15 C code in function f() controlling its execution


4.7 Detecting Deadlocks

SHIM, although race free, is not immune to deadlocks. A simple example is { recv a; recv b; } par { send b; send a; }, which deadlocks because the first task attempts to communicate on a first, yet the second task, which is also connected to a, expects b to come first. Fortunately, SHIM's scheduling independence means that for a given input sequence, a SHIM program behaves the same for all executions of the program, and thus either always deadlocks or never deadlocks. In particular, SHIM does not need to be analyzed under an interleaved model of concurrency, rendering all the partial order reduction tricks of model checkers such as Holzmann's SPIN [1] unnecessary for SHIM.

Our first attempt at detecting deadlocks in SHIM [18] employs the symbolic synchronous model checker NuSMV [19] – an interesting choice since SHIM's concurrency model is fundamentally asynchronous. Our approach abstracts away data operations and chooses a specific schedule in which each communication event takes a single cycle. This reduces the SHIM program to a set of communicating state machines suitable for the NuSMV model checker.


We have since continued our work on deadlock detection in SHIM [20]. Here we take a compositional approach in which we build an automaton for a complete system piece by piece. Our insight is that we can usually abstract away internal channels and simplify the automaton without introducing or avoiding deadlocks. The result is that even though we are doing explicit model-checking, we can often do it much faster than a brute-force symbolic model checker such as NuSMV.

Figure 4.16 shows our technique in action. Starting from the (contrived) program, we first abstract the behavior of the first two tasks into simple automata. The first task communicates on channel a, then on channel b, then repeats; the second task does the same on channels b and c.

[Fig. 4.16, panels (b)-(m): automata for the individual tasks and for their successive compositions and simplifications; each transition is labeled with one of the channels a, b, c, or d]

void main()
{
  chan int a, b, c, d;
  for (;;) {
    recv a; b = a + 1; send b;
  } par for (;;) {
    recv b; c = b + 1; send c;
  } par for (;;) {
    recv c; d = c + 1; send d;
  } par for (;;) {
    recv d; a = d + 1; send a;
  }
}
(a)

Fig. 4.16 Analyzing a four-task SHIM program (a) for deadlock. Composing the automata for the first (b) and second (c) tasks gives a product automaton (d). Channel b only appears in the first two tasks, so we abstract away its effect by identifying (e) and merging (f) equivalent states. Next, we compose this simplified automaton with that for the third task (g) to produce another (h). Now, channel c will not appear again, so again we identify (i) and merge (j) states. Finally, we compose this with the automaton for the fourth task (k) to produce a single, deadlocked state (l) because the fourth task insists on communicating first on d but the other three communicate first on a. The direct composition of the first three tasks without removing channels (m) is larger – eight states


We compose these automata by allowing either to take a step on unshared channels but insisting on a rendezvous when a channel is shared. Then, since channel b is local to these two tasks, we abstract away its behavior by merging two states. This produces a simplified automaton that we then compose with the automaton for the third task. This time, channel c is local, so again we simplify the automaton and compose it with the automaton for the fourth task. The automaton we obtained for the first three tasks insists on communicating first on a and then on d; the fourth task communicates on d and then on a. This is a deadlock, which manifests itself as a state with no outgoing arcs.
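The pairwise composition step can be pictured with the following C sketch: a transition on a channel both automata use must synchronize (rendezvous), while a transition on a channel only one of them uses interleaves freely. This is a simplified illustration under assumed data structures (fixed-capacity arrays, single-character channel names); hiding a channel that has become internal and merging the resulting equivalent states, as described above, would be a separate pass that is not shown.

#include <stdio.h>

#define MAX_TRANS 64

struct trans { int from; char chan; int to; };

struct automaton {
    int nstates;
    int ntrans;
    struct trans t[MAX_TRANS];   /* capacity is illustrative only */
};

static int uses(const struct automaton *a, char chan)
{
    for (int i = 0; i < a->ntrans; i++)
        if (a->t[i].chan == chan) return 1;
    return 0;
}

/* Product of a and b; state (i,j) is encoded as i * b->nstates + j.
   A reachable product state with no outgoing transition is a deadlock. */
static void compose(const struct automaton *a, const struct automaton *b,
                    struct automaton *out)
{
    out->nstates = a->nstates * b->nstates;
    out->ntrans = 0;
    for (int i = 0; i < a->ntrans; i++) {
        const struct trans *ta = &a->t[i];
        if (uses(b, ta->chan)) {             /* shared channel: rendezvous */
            for (int j = 0; j < b->ntrans; j++)
                if (b->t[j].chan == ta->chan)
                    out->t[out->ntrans++] = (struct trans){
                        ta->from * b->nstates + b->t[j].from, ta->chan,
                        ta->to   * b->nstates + b->t[j].to };
        } else {                             /* a moves alone */
            for (int s = 0; s < b->nstates; s++)
                out->t[out->ntrans++] = (struct trans){
                    ta->from * b->nstates + s, ta->chan,
                    ta->to   * b->nstates + s };
        }
    }
    for (int j = 0; j < b->ntrans; j++) {    /* b moves alone */
        const struct trans *tb = &b->t[j];
        if (!uses(a, tb->chan))
            for (int s = 0; s < a->nstates; s++)
                out->t[out->ntrans++] = (struct trans){
                    s * b->nstates + tb->from, tb->chan,
                    s * b->nstates + tb->to };
    }
}

int main(void)
{
    /* The first two tasks of Fig. 4.16: repeat (a; b) and repeat (b; c). */
    struct automaton t1 = { 2, 2, { {0, 'a', 1}, {1, 'b', 0} } };
    struct automaton t2 = { 2, 2, { {0, 'b', 1}, {1, 'c', 0} } };
    struct automaton p;
    compose(&t1, &t2, &p);
    printf("product: %d states, %d transitions\n", p.nstates, p.ntrans);
    return 0;
}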

For programs that follow such a pipeline pattern, the number of states growsexponentially with the number of pipeline stages (precisely, n stages produce 2n

states), yet our analysis only builds machines with 2n states before simplifyingthem to n C 1 states at each step. Although we still have to step through and an-alyze each of the n stages (leading to quadratic complexity), this is still a substantialimprovement.

Of course, our technique cannot always reduce an exponential state space to a polynomial one, but we found that it often did on the example programs we tried.

4.8 Sharing Buffers

We also applied the model-checking approach from the previous section to search for situations where buffer memory can be shared [21]. In general, each communication channel needs its own space to store any data being communicated over it, but in certain cases it is possible to prove that two channels can never be active simultaneously.

In the program in Fig. 4.17, the main task starts four tasks in parallel. Tasks 1 and 2 communicate on a. Then, tasks 2 and 3 communicate on b, and finally tasks 3 and 4 on c. The value of c received by task 4 is 8. Communication on a cannot occur simultaneously with that on b because task 2 forces them to occur sequentially. Similarly, communications on b and c are forced to be sequential by task 3. Communications on a and c cannot occur together because they are forced to be sequential by the communication on b. Our tool recognizes this pattern and reports that a, b, and c can share buffers because their communications never overlap, thereby reducing the total buffer requirements by 66% for this program.
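The following Python sketch is our own toy rendering of that argument, not the tool from [21]: each channel is modeled as a one-place buffer that is occupied from its send until the matching receive, we explore all interleavings of the four tasks of Fig. 4.17, and two channels may share a buffer if no reachable state has both occupied. The event lists and helper names are illustrative assumptions.

from itertools import combinations

tasks = [
    [('send', 'a')],                        # Task 1
    [('recv', 'a'), ('send', 'b')],         # Task 2
    [('recv', 'b'), ('send', 'c')],         # Task 3
    [('recv', 'c')],                        # Task 4
]

def reachable_occupancies(tasks):
    """Explore all interleavings; collect every set of simultaneously full buffers."""
    start = (tuple(0 for _ in tasks), frozenset())   # (program counters, full buffers)
    seen, todo, occupancies = {start}, [start], set()
    while todo:
        pcs, full = todo.pop()
        occupancies.add(full)
        for i, pc in enumerate(pcs):
            if pc == len(tasks[i]):
                continue
            op, ch = tasks[i][pc]
            if op == 'send' and ch not in full:
                nxt = (pcs[:i] + (pc + 1,) + pcs[i+1:], full | {ch})
            elif op == 'recv' and ch in full:
                nxt = (pcs[:i] + (pc + 1,) + pcs[i+1:], full - {ch})
            else:
                continue
            if nxt not in seen:
                seen.add(nxt); todo.append(nxt)
    return occupancies

occ = reachable_occupancies(tasks)
for x, y in combinations('abc', 2):
    sharable = not any(x in s and y in s for s in occ)
    print(f"channels {x} and {y} can share a buffer: {sharable}")

On this example the script reports that every pair of channels can share a buffer, matching the 66% reduction quoted above.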

void main() {
  chan int a, b, c;
  {
    // Task 1
    send a = 6;          // Send a (synchronize with task 2)
  } par {
    // Task 2
    recv a;              // Receive a (synchronize with task 1)
    send b = a + 1;      // Send 7 on b (synchronize with task 3)
  } par {
    // Task 3
    recv b;              // Receive b (synchronize with task 2)
    send c = b + 1;      // Send 8 on c (synchronize with task 4)
  } par {
    // Task 4
    recv c;              // Receive c (synchronize with task 3)
  }
  // c = 8 here
}

Fig. 4.17 A SHIM program that illustrates the possibility of buffer sharing. Channels a, b, and c are never active simultaneously and can therefore share buffer space

4.9 Conclusions

The central hypothesis of the SHIM project is that its simple, deterministic semantics helps both programming and automated program analysis. That we have been able to devise truly effective mechanisms for clever code generation (e.g., static scheduling) and analysis (e.g., deadlock detection) that can gain deep insight into the behavior of programs vindicates this view. The bottom line: if a programming language does not have simple semantics, it is really hard to analyze its programs quickly or precisely.

Algorithms where there is a large number of little, variable-sized, but independent pieces of work to be done do not mesh well with SHIM's scheduling-independent philosophy as it currently stands. The obvious way to handle this is to maintain a bucket of tasks and assign each task to a processor once it has finished its last task. The order in which the tasks are performed therefore depends on their relative execution rates, but this does not matter if the tasks are independent. It would be possible to add scheduling-independent task distribution and scheduling to SHIM (i.e., provided the tasks are truly independent or, equivalently, confluent); exactly how is an open research question.

Exceptions have been even more painful than multiway rendezvous. They are extremely convenient from a programming standpoint (e.g., SHIM's rudimentary I/O library wraps each program in an exception to allow it to terminate gracefully; virtually every compiler test case includes at least one exception), but extremely difficult both to implement and to reason about.

An alternative is to turn exceptions into syntactic sugar layered on the exception-free SHIM model. We always had this in the back of our minds: an exception would just put a process into an unusual state in which it would communicate its poisoned state to any process that attempts to communicate with it. The problem is that the complexity tends to grow quickly when multiple, concurrent exceptions and scopes are considered. Again, exactly how to translate exceptions into a simpler SHIM model remains an open question.



That buffering is mandatory for high-performance parallel applications is hardly a revelation; we confirmed it anyway. The SHIM model has always been able to implement FIFO buffers (e.g., Fig. 4.6), but we have realized that they are sufficiently fundamental to be a first-class type in the language. We are currently working on a variant of the language that replaces pure rendezvous communication with bounded, buffered communication. Because it will be part of the language, it will be easier to map to unusual environments, such as the DMA mechanism for inter-core communication on the CELL processor.

SHIM has already been an inspiration for aspects of some other languages. We ported its communication model into the Haskell functional language [22] and proposed a compiler that would impose its scheduling-independent view of the work on arbitrary programs [23]. Certain SHIM ideas, such as scheduling analysis [24], have also been used in IBM's X10 language.

Acknowledgements Many have contributed to SHIM. Olivier Tardieu created the formal semantics, devised the exception mechanism, and instigated endless (constructive) arguments. Jia Zeng developed the static recursion removal algorithm. Baolin Shao designed the compositional deadlock detection algorithm. The NSF has supported the SHIM project under grant 0614799.

References

1. Holzmann, G.J.: The model checker SPIN. IEEE Transactions on Software Engineering 23(5) (May 1997) 279–294
2. Kahn, G.: The semantics of a simple language for parallel programming. In: Information Processing 74: Proceedings of IFIP Congress 74, Stockholm, Sweden, North-Holland (August 1974) 471–475
3. Hoare, C.A.R.: Communicating sequential processes. Communications of the ACM 21(8) (August 1978) 666–677
4. Buck, J.T.: Scheduling dynamic dataflow graphs with bounded memory using the token flow model. PhD thesis, University of California, Berkeley (1993). Available as UCB/ERL M93/69
5. Parks, T.M.: Bounded scheduling of process networks. PhD thesis, University of California, Berkeley (1995). Available as UCB/ERL M95/105
6. Edwards, S.A.: Experiences teaching an FPGA-based embedded systems class. In: Proceedings of the Workshop on Embedded Systems Education (WESE), Jersey City, New Jersey (September 2005) 52–58
7. Edwards, S.A.: SHIM: A language for hardware/software integration. In: Proceedings of Synchronous Languages, Applications, and Programming (SLAP). Electronic Notes in Theoretical Computer Science, Edinburgh, Scotland (April 2005)
8. Benveniste, A., Caspi, P., Edwards, S.A., Halbwachs, N., Le Guernic, P., de Simone, R.: The synchronous languages 12 years later. Proceedings of the IEEE 91(1) (January 2003) 64–83
9. Edwards, S.A., Tardieu, O.: Efficient code generation from SHIM models. In: Proceedings of Languages, Compilers, and Tools for Embedded Systems (LCTES), Ottawa, Canada (June 2006) 125–134
10. Edwards, S.A., Tardieu, O.: SHIM: A deterministic model for heterogeneous embedded systems. In: Proceedings of the International Conference on Embedded Software (Emsoft), Jersey City, New Jersey (September 2005) 37–44
11. Edwards, S.A., Tardieu, O.: SHIM: A deterministic model for heterogeneous embedded systems. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 14(8) (August 2006) 854–867
12. Tardieu, O., Edwards, S.A.: R-SHIM: Deterministic concurrency with recursion and shared variables. In: Proceedings of the International Conference on Formal Methods and Models for Codesign (MEMOCODE), Napa, California (July 2006) 202
13. Edwards, S.A., Zeng, J.: Static elaboration of recursion for concurrent software. In: Proceedings of the Workshop on Partial Evaluation and Program Manipulation (PEPM), San Francisco, California (January 2008) 71–80
14. Tardieu, O., Edwards, S.A.: Scheduling-independent threads and exceptions in SHIM. In: Proceedings of the International Conference on Embedded Software (Emsoft), Seoul, Korea (October 2006) 142–151
15. Berry, G., Gonthier, G.: The Esterel synchronous programming language: Design, semantics, implementation. Science of Computer Programming 19(2) (November 1992) 87–152
16. Edwards, S.A., Vasudevan, N., Tardieu, O.: Programming shared memory multiprocessors with deterministic message-passing concurrency: Compiling SHIM to Pthreads. In: Proceedings of Design, Automation, and Test in Europe (DATE), Munich, Germany (March 2008) 1498–1503
17. Vasudevan, N., Edwards, S.A.: Celling SHIM: Compiling deterministic concurrency to a heterogeneous multicore. In: Proceedings of the Symposium on Applied Computing (SAC), Volume III, Honolulu, Hawaii (March 2009) 1626–1631
18. Vasudevan, N., Edwards, S.A.: Static deadlock detection for the SHIM concurrent language. In: Proceedings of the International Conference on Formal Methods and Models for Codesign (MEMOCODE), Anaheim, California (June 2008) 49–58
19. Cimatti, A., Clarke, E.M., Giunchiglia, E., Giunchiglia, F., Pistore, M., Roveri, M., Sebastiani, R., Tacchella, A.: NuSMV version 2: An opensource tool for symbolic model checking. In: Proceedings of the International Conference on Computer-Aided Verification (CAV), Copenhagen, Denmark (July 2002). Volume 2404 of Lecture Notes in Computer Science, Springer, Berlin, pp. 359–364
20. Shao, B., Vasudevan, N., Edwards, S.A.: Compositional deadlock detection for rendezvous communication. In: Proceedings of the International Conference on Embedded Software (Emsoft), Grenoble, France (October 2009)
21. Vasudevan, N., Edwards, S.A.: Buffer sharing in CSP-like programs. In: Proceedings of the International Conference on Formal Methods and Models for Codesign (MEMOCODE), Cambridge, Massachusetts (July 2009)
22. Vasudevan, N., Singh, S., Edwards, S.A.: A deterministic multi-way rendezvous library for Haskell. In: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), Miami, Florida (April 2008) 1–12
23. Vasudevan, N., Edwards, S.A.: A determinizing compiler. In: Proceedings of Program Language Design and Implementation (PLDI), Dublin, Ireland (June 2009)
24. Vasudevan, N., Tardieu, O., Dolby, J., Edwards, S.A.: Compile-time analysis and specialization of clocks in concurrent programs. In: Proceedings of Compiler Construction (CC), York, United Kingdom (March 2009). Volume 5501 of Lecture Notes in Computer Science, Springer, Berlin, pp. 48–62


Chapter 5
A Module Language for Typing SIGNAL Programs by Contracts

Yann Glouche, Thierry Gautier, Paul Le Guernic, and Jean-Pierre Talpin

5.1 Introduction

Methodological guidelines for the design of real-time embedded systems advise the validation of specifications as early as possible. Moreover, in a refinement-based development methodology for large embedded systems, an iterative validation of each refinement or modification made to the initial specification, until the implementation of the system is finalized, is highly desirable. Additionally, cooperative component-based development requires using and assembling components, which have been developed by different suppliers, in a safe and consistent way [11, 17]. These components have to be provided with their conditions of use and guarantees that they have been validated when these conditions are satisfied. These conditions of use and guarantees represent a notion of contract.

Contracts are now often required as a useful mechanism for validation in robust software design. Design by Contract, as advocated in [26], is being made available for mainstream languages like C++ or Java. Assertion-based contracts express program invariants, pre- and post-conditions, as Boolean expressions that have to be true for the contract to be validated. We adopt here a different paradigm of contract to define a component-based validation technique in the context of a synchronous modeling framework. In our theoretical model, a component is represented by an abstract view of its behaviors. It has a finite set of input/output variables to cooperate with its environment. Behaviors are viewed as multi-set traces on the variables of the component. The abstract model of a component is thus a process, defined as a set of such behaviors.

A contract is a pair (assumptions, guarantees). Assumptions describe properties expected by a component, to be satisfied by the context (the environment) in which this component is used; conversely, guarantees describe properties that are satisfied by the component itself when the context satisfies the assumptions.


Such a contract may be documentary; however, when a suitable formal model exists, contracts can be supplied to some formal verification tool. We want to provide designers with such a formal model, allowing "simple" but powerful and efficient computation on contracts. Thus, we define a novel algebraic framework to enable formal reasoning on contracts.

The assumptions and guarantees of a component are defined as process-filters: assumptions filter the processes (sets of behaviors) a component may accept and guarantees filter the processes a component provides. A process-filter is the set of processes, whatever their input and output variables are, that are compatible with some property (or constraint) expressed on the variables of the component. Foremost, we define a Boolean algebra to manipulate process-filters. This yields an algebraically rich structure that allows us to reason about contracts (to abstract, refine, combine and normalize them). This algebraic model is based on a minimalist model of execution traces, allowing one to adapt it easily to a particular design framework.

A main characteristic of this model is that it allows one to precisely handle the variables of components and their possible behaviors. This is a key point. Indeed, assumptions and guarantees are expressed, as usual, by properties constraining or relating the behaviors of some variables. What has to be considered very carefully is thus the "compatibility" of such constraints with the possible behaviors of other variables. This is the reason why we introduce partial order relations on processes and on process-filters. Moreover, having a Boolean algebra on process-filters allows one to formally, unambiguously and finitely express complementation within the algebra. This is, in turn, a real advantage compared to related formalisms and models.

We put this algebra to work for the definition of a general-purpose module system whose typing paradigm is based on the notion of contract. The type of a module is a contract holding assumptions made on and guarantees offered by its behaviors. It allows one to associate a module with an interface which can be used in a variety of scenarios, such as checking the composability of modules or efficiently supporting modular compilation. The corresponding module language is generic in that processes and contracts may be expressed in some external target language. In the context of real-time, safety-critical applications, we consider here the synchronous language SIGNAL to specify processes.

Organization

We start with a highlight on some key features of our module system by considering the specification of a protocol for Loosely Time-Triggered Architectures, Sect. 5.2. This example is used in this article to illustrate our approach. We give an outline of our contract algebra, Sect. 5.3, and demonstrate its capabilities for logical and compositional reasoning on the assumptions and guarantees of component-based embedded systems. This algebra is used, Sect. 5.4, as a foundation for the definition of a strongly-typed module system: contracts are used to type components with behavioral properties. Section 5.5 demonstrates the use of our module system by considering the introductory example and by illustrating its contract-based specification.

5.2 A Case Study

We illustrate our approach by considering a protocol that ensures a coherent system of logical clocks on top of Loosely Time-Triggered Architectures (LTTA). This protocol has been presented in [7]. We define contracts to characterize properties of this protocol.

5.2.1 Description of the Protocol

In general, a distributed real-time control system has a time-triggered nature simply because the physical system under control is bound to physics. An LTTA features quasi-periodic, non-blocking bus access and independent read-write operations. The LTTA is composed of three devices, a writer, a bus, and a reader (Fig. 5.1). Each device is activated by its own, approximately periodic, clock.

At the $n$th clock tick (time $t_w(n)$), the writer generates the value $x_w(n)$ and an alternating flag $b_w(n)$ such that:

$$b_w(n) = \begin{cases} \text{false} & \text{if } n = 0,\\ \lnot\, b_w(n-1) & \text{otherwise.} \end{cases}$$

Both values are stored in its output buffer, denoted by $y_w$. At any time $t$, the writer's output buffer $y_w$ contains the last value that was written into it:

$$y_w(t) = (x_w(n), b_w(n)), \quad \text{where } n = \sup\{n' \mid t_w(n') < t\}. \tag{5.1}$$


Fig. 5.1 The three devices of the LTTA


At $t_b(n)$, the bus fetches $y_w$ to store in the input buffer of the reader, denoted by $y_b$. Thus, at any time $t$, the reader input buffer is defined by:

$$y_b(t) = y_w(t_b(n)), \quad \text{where } n = \sup\{n' \mid t_b(n') < t\}. \tag{5.2}$$

At $t_r(n)$, the reader loads the input buffer $y_b$ into the variables $x(n)$ and $b(n)$:

$$(x(n), b(n)) = y_b(t_r(n)).$$

Then, the reader extracts $x(n)$ iff $b(n)$ has changed. This defines the sequence $m$ of ticks:

$$m(0) = 0, \quad m(n) = \inf\{k > m(n-1) \mid b(k) \neq b(k-1)\}, \quad x_r(k) = x(m(k)). \tag{5.3}$$
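To see equation (5.3) in action, the following small Python snippet (ours, purely illustrative) computes the extraction instants m and the reader output x_r from the sequences x and b sampled at the reader's ticks; the function and variable names are assumptions made for the example.

def extract(x, b):
    """Extraction instants m and reader output x_r as in eq. (5.3)."""
    m = [0]
    for k in range(1, len(b)):
        if b[k] != b[k - 1]:      # flag alternated: a fresh value has arrived
            m.append(k)
    xr = [x[k] for k in m]        # x_r(k) = x(m(k))
    return m, xr

# The reader samples the same bus value several times; only changes of b count.
x = [6, 6, 7, 7, 7, 8, 9]
b = [0, 0, 1, 1, 1, 0, 1]
print(extract(x, b))              # ([0, 2, 5, 6], [6, 7, 8, 9])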

In any execution of the protocol, the sequences $x_w$ and $x_r$ must coincide, i.e., $\forall n \cdot x_r(n) = x_w(n)$. In [7] it is proved that this is the case iff the following conditions hold:

$$w \geq b \quad\text{and}\quad \left\lfloor \frac{w}{b} \right\rfloor \geq \frac{r}{b}, \tag{5.4}$$

where $w$, $b$ and $r$ are the respective periods of the clocks of the writer, the bus and the reader (for $x \in \mathbb{R}$, $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$). Conditions (5.4) are abstracted by conditions on ordering between events. The first condition, $w \geq b$, is abstracted by the predicate:

$$w \geq b \;\leftrightarrow\; \text{never two } t_w \text{ between two } t_b. \tag{5.5}$$

The abstraction of the second condition, $\lfloor w/b \rfloor \geq r/b$, requires the following definition of the first instant (of the bus) $\tau_b(n) = \min\{t_b(p) \mid t_b(p) > t_w(n)\}$ where the bus can fetch the $n$th writing. The second condition is then restated as the requirement (5.6) that no two successive $\tau_b$ can occur between two successive $t_r$:

$$\left\lfloor \frac{w}{b} \right\rfloor \geq \frac{r}{b} \;\leftrightarrow\; \text{never two } \tau_b \text{ between two successive } t_r. \tag{5.6}$$

Under the specific conditions (5.5) and (5.6), the correctness of the protocol reduces to the assumption:

$$\forall n \in \mathbb{N},\ \exists k \in \mathbb{N},\ \text{s.t.}\ \tau_b(n) < t_r(k) \leq \tau_b(n+1).$$

It guarantees that all written values are actually fetched by the bus ($\tau_b(n)$ always exists, and $\tau_b(n+1) \neq \tau_b(n)$ since there is at least one instant $t_r(k)$ which occurs in between them), and all fetched values are actually read by the reader ($\tau_b(n) < t_r(k) \leq \tau_b(n+1)$): see Fig. 5.2.


Fig. 5.2 Correctness of the protocol
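Before turning to the module language, here is a small Python sketch (entirely ours, separate from the SIGNAL implementation given in Sect. 5.5.1) that simulates the writer/bus/reader round trip with periodic clocks satisfying conditions (5.4) and checks that the reader's stream equals the writer's. The clock periods, phases and helper names are illustrative assumptions.

from fractions import Fraction as F

def simulate(w, b, r, n_values=25, phases=(F(1, 100), F(3, 100), F(7, 100))):
    """Toy discrete-event simulation of the LTTA: strictly periodic clocks, exact rational time."""
    horizon = (n_values + 5) * w
    ticks = lambda phase, period: [phase + k * period for k in range(int(horizon / period))]
    events = sorted([(t, 'writer') for t in ticks(phases[0], w)] +
                    [(t, 'bus')    for t in ticks(phases[1], b)] +
                    [(t, 'reader') for t in ticks(phases[2], r)])
    yw = yb = (None, True)          # one-place buffers: (value, alternating flag)
    flag, last_flag, n = True, True, 0
    written, read = [], []
    for t, who in events:
        if who == 'writer' and n < n_values:
            flag = not flag
            yw = (n, flag)          # publish the n-th value with an alternated flag
            written.append(n)
            n += 1
        elif who == 'bus':
            yb = yw                 # the bus copies the writer buffer to the reader buffer
        elif who == 'reader':
            v, f = yb
            if f != last_flag and v is not None:
                read.append(v)      # extract only when the flag has alternated
            last_flag = f
    return written, read

w, b, r = F(1), F(9, 20), F(2, 5)   # periods satisfying w >= b and floor(w/b) >= r/b
written, read = simulate(w, b, r)
print(read == written)              # expected: True (no loss, no duplication)

Under periods violating (5.4) (e.g., a reader much slower than the bus), the same sketch drops values, which is the informal content of the correctness argument above.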

5.2.2 Introduction to the Module Language

Considering first the writer and the bus, the protocol will be correct only if $w \geq b$, so that the data flow emitted by the bus is equal to the data flow emitted by the writer ($\forall n \cdot x_b(n) = x_w(n)$).

In the module language, a specification is designated by the keyword contract. It defines a set of input and output variables (interface) subject to a contract. The interface defines the way the component interacts with its environment through its variables. In addition, it embeds properties that are modeled by a composition of contracts. For instance, the specification of a bus controller could be defined by the assumption $w \geq b$ and the guarantee $\forall n \cdot x_b(n) = x_w(n)$. An implementation of the specification, designated by the keyword process, contains a compatible implementation of the above contract.

module type WriterBusType =
  contract input real w, b;
           boolean xw
    output boolean xb;
    assume w >= b
    guarantee xb = xw
  end;

module WriterBus : WriterBusType =
  process input real w, b;
          boolean xw
   output boolean xb;
    (| ... |)
  end;

The specification of the properties we consider for the whole LTTA consists of two contracts. Each contract applies to a given component (bus or reader) of the LTTA. It is made of a clock relation as assumption and an equality of flows as guarantee. Instead of specifying two separate contracts, we define them as two instances of a generic one. To this end, we define an encapsulation mechanism to generically represent a class of specifications or implementations sharing a common pattern of behavior up to that of the parameters. In the example of the LTTA, for instance, we have used a functor to parameterize it with respect to its clock relations.


module type LTTAProperty =
  functor(real c1, c2)
  contract input boolean xwb
    output boolean xbr;
    assume c1 >= c2
    guarantee xbr = xwb
  end;

module type LTTAClockConstraints =
  contract input real w, b, r;
           boolean cw
    output boolean xr;
    LTTAProperty(w, b)(xw, xb) and
    LTTAProperty(floor(w/b), r/b)(xb, xr)
  end;

The generic contract, called LTTAProperty, is parameterized with two clocks. When the clock constraint associated with the context of the considered component (bus or reader) of the LTTA is respected, the preservation of the flows is ensured by this component. The contract of the LTTA is defined by the composition "and" of two applications of the generic LTTAProperty contract. Each application defines a property of the LTTA with its appropriate clock constraint. The composition defines the greatest lower bound of both contracts. Each application of the LTTAProperty functor produces a contract which is composed with the other ones in order to produce the expected contract.

A module is hence viewed as a pair $M : I$ consisting of an implementation $M$ that is typed by (or viewed as) a specification $I$ of satisfiable contract. The semantics of the specification $I$, written $[[I]]$, is a set of processes (in the sense of Sect. 5.3) whose traces satisfy the contract associated with $I$. The semantics $[[M]]$ of the implementation $M$ is a process contained in $[[I]]$.

5.3 An Algebra of Contracts for Assume-Guarantee Reasoning

Section 5.3.1 introduces a suitably general algebra of processes. A contract (A, G) is viewed as a pair of logical devices filtering processes: the assumption A filters processes to select (accept or conversely reject) those that are asserted (accepted or conversely rejected) by the guarantee G. Process-filters are defined in Sect. 5.3.2 and contracts in Sect. 5.3.3. The proofs of properties presented in this section are provided in [12]. Section 5.3.4 discusses some related approaches for contracts.

5.3.1 An Algebra of Processes

We start with the definition of a suitable algebra for behaviors and processes. We deliberately choose an abstract definition of behavior as a function from a set of variable names to a domain of typed values. These typed values may themselves be functions of time to some domain of data values: it is the case, for instance, when we consider the SIGNAL language, where a behavior describes the trace of a discrete process.


Definition 1 (Behavior). Let $V$ be an infinite, countable set of variables, and $D$ a set of values; for $Y$ a nonempty finite set of variables included in $V$, a $Y$-behavior is a function $b : Y \to D$.

The set of $Y$-behaviors is denoted by $B_Y = Y \to D$; $B_\emptyset = \emptyset$ denotes the set of behaviors (which is empty) on the empty set of variables. The notation $c_{|X}$ is used for the restriction of a $Y$-behavior $c$ to $X$, a (possibly empty) subset of $Y$.

A process is defined as a set of behaviors on a given set of variables.

Definition 2 (Process). For $X$ a finite set of variables, an $X$-process $p$ is a nonempty set of $X$-behaviors.

The unique $\emptyset$-process, on the empty set of variables, is denoted by $\Omega = \{\emptyset\}$. It can be seen as the universal process; it has no effect when composed with other processes. The empty process, which is defined by the empty set of behaviors, is denoted by $\bot = \emptyset$. It can be seen as the null process; when composed (intersected) with other processes, it always results in the empty process.

The set of $X$-processes is denoted by $P_X = \mathcal{P}(B_X) \setminus \{\bot\}$ and $P^\bot_X = P_X \cup \{\bot\}$. The set of all processes is denoted by $P = \bigcup_{X \subseteq V} P_X$ and $P^\bot = P \cup \{\bot\}$. For an $X$-process $p$, the domain $X$ of its behaviors is denoted $\mathrm{var}(p)$, and $\mathrm{var}(\bot) = V$.

Complement, restriction and extension of a process have expected definitions:

Definition 3 (Complement, restriction and extension). For $X$ a finite set of variables, the complement $\widetilde{p}$ of a process $p \in P_X$ is defined by $\widetilde{p} = B_X \setminus p$. Also, $\widetilde{\bot} = B_X$. For $X$, $Y$ finite sets of variables such that $X \subseteq Y \subseteq V$, $q_{|X} = \{c_{|X} \mid c \in q\}$ is the restriction $q_{|X} \in P_X$ of $q \in P_Y$, and $p^{|Y} = \{c \in B_Y \mid c_{|X} \in p\}$ is the extension $p^{|Y} \in P_Y$ of $p \in P_X$. Also, $\bot_{|X} = \bot$ and $\bot^{|V} = \bot$.

Note that the extension of $p \in P_X$ to $Y \subseteq V$ is the process on $Y$ that has the same constraints as $p$.
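As a concrete (and deliberately finite) reading of Definitions 1–3, the sketch below represents a behavior as a set of (variable, value) pairs and a process as a set of such behaviors; restriction and extension are then a few lines of set manipulation in Python. The finite value domain and all names are our own assumptions, introduced only to make the definitions tangible.

from itertools import product

DOMAIN = (0, 1)                         # finite value domain, for illustration only

def behaviors(variables):
    """All behaviors on `variables`: every function from variables to DOMAIN,
    encoded as a hashable frozenset of (variable, value) pairs."""
    variables = sorted(variables)
    return [frozenset(zip(variables, vals))
            for vals in product(DOMAIN, repeat=len(variables))]

def restrict(process, X):
    """q|X : restriction of every behavior to the variables in X."""
    return {frozenset((v, val) for v, val in b if v in X) for b in process}

def extend(process, X, Y):
    """p^|Y : all Y-behaviors whose restriction to X belongs to p (X subset of Y)."""
    pX = restrict(process, X)
    return {b for b in behaviors(Y)
            if frozenset((v, val) for v, val in b if v in X) in pX}

# The process "x = y" on {x, y}, extended to {x, y, z}: z is left unconstrained.
p = {b for b in behaviors({'x', 'y'}) if dict(b)['x'] == dict(b)['y']}
q = extend(p, {'x', 'y'}, {'x', 'y', 'z'})
print(len(p), len(q))                   # 2 behaviors become 4: z is free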

The set $P^\bot_X$, equipped with union, intersection and complement, is a Boolean algebra with supremum $P^\bot_X$ and infimum $\bot$.

The extension operator induces a partial order $\leq$ on processes, such that $p \leq q$ if $q$ is an extension of $p$ to the variables of $q$; the relation $\leq$, used to define filters, is studied below.

Definition 4 (Process extension relation). The process extension relation $\leq$ is defined by: $(\forall p \in P)(\forall q \in P)\ \ (p \leq q) = ((\mathrm{var}(p) \subseteq \mathrm{var}(q)) \wedge (p^{|\mathrm{var}(q)} = q))$.

Thus, if $p \leq q$, $q$ is defined on more variables than $p$; on the variables of $p$, $q$ has the same constraints as $p$; its other variables are free. This relation extends to $P^\bot$ with $\bot \leq \bot$.

Property 1. $(P^\bot, \leq)$ is a poset.

In this poset, the upper set of a process $p$, called extension upper set, is the set of all its extensions; it is denoted by $p{\uparrow} = \{q \in P \mid p \leq q\}$.


Fig. 5.3 Controlled (left) and non-controlled (right) variable y in a process q

To study properties of extension upper sets, we characterize semantically the set of variables that are constrained by a given process: a process $q \in P$ controls a given variable $y$ if $y$ belongs to $\mathrm{var}(q)$ and $q$ is not equal to the extension on $\mathrm{var}(q)$ of its projection on $\mathrm{var}(q)\setminus\{y\}$. Formally, a process $q \in P$ controls a variable $y$, written $q \rhd y$, iff $y \in \mathrm{var}(q)$ and $q \neq (q_{|\mathrm{var}(q)\setminus\{y\}})^{|\mathrm{var}(q)}$. A process $q \in P$ controls a variable set $X$, written $q \rhd X$, iff $(\forall x \in X)(q \rhd x)$. Also, $\rhd$ is extended to $P^\bot$ with $\bot \rhd V$.

This is illustrated in Fig. 5.3 (left): there is some behavior $b$ in $q$ that has the same restriction on $\mathrm{var}(q)\setminus\{y\}$ as some behavior $c$ in $B_{\mathrm{var}(q)}$ such that $c$ does not belong to $q$; thus $q$ is strictly included in $(q_{|\mathrm{var}(q)\setminus\{y\}})^{|\mathrm{var}(q)}$.

We define a reduced process (the key concept to define filters) as being a process that controls all of its variables.

Definition 5 (Reduced process). A process $p \in P^\bot$ is reduced iff $p \rhd \mathrm{var}(p)$.

Reduced processes are minimal in $(P, \leq)$. We denote by $\check{q}$, called the reduction of $q$, the (minimal) process such that $\check{q} \leq q$ ($p$ is reduced iff $\check{p} = p$).

Property 2. The complement $\widetilde{p}$ of a nonempty process $p$ strictly included in $B_{\mathrm{var}(p)}$ is reduced iff $p$ is reduced; then $\widetilde{p}$ and $p$ control the same set of variables $\mathrm{var}(p)$.

The extension upper set $\check{p}{\uparrow}$ of the reduction of $p$ is composed of all the sets of behaviors, defined on variable sets that include the variables controlled by $p$, as maximal processes (for union of sets of behaviors) that have exactly the same constraints as $p$ (variables that are not controlled by $p$ are also not controlled in the processes of $\check{p}{\uparrow}$). We also observe that $\mathrm{var}(\check{q})$ is the greatest subset of variables such that $q \rhd \mathrm{var}(\check{q})$.

Then we define the inclusion lower set of a set of processes to capture all the subsets of behaviors of these processes. Let $R \subseteq P^\bot$; $R{\downarrow}$ is the inclusion lower set of $R$ for $\subseteq$, defined by $R{\downarrow} = \{p \in P^\bot \mid (\exists q \in R)(p \subseteq q)\}$.


5.3.2 An Algebra of Filters

In this section, we define a process-filter by the set of processes that satisfy a given property. We define an order relation $\sqsubseteq$ on the set of process-filters $\Phi$. With this relation, $(\Phi, \sqsubseteq)$ is a Boolean algebra. A process-filter $R$ is a subset of $P^\bot$ that filters processes: it contains all the processes that are "equivalent" with respect to some constraint or property, so that all processes in $R$ are accepted or all of them but $\bot$ are rejected. A process-filter is built from a unique process generator by extending it to larger sets of variables and then by including subprocesses of these "maximal allowed behavior sets".

Definition 6 (Process-filter). A set of processes $R$ is a process-filter iff $(\exists r \in P^\bot)\ ((r = \check{r}) \wedge (R = r{\uparrow}{\downarrow}))$. The process $r$ is a generator of $R$ ($R$ is generated by $r$).

The process-filter generated by the reduction of a process $p$ is denoted by $\widehat{p} = \check{p}{\uparrow}{\downarrow}$. The generator of a process-filter $R$ is unique; we refer to it as $\check{R}$. Note that $\Omega$ generates the set of all processes (including $\bot$) and $\bot$ belongs to all filters. Formally, $(\forall p, r, s \in P^\bot)$, we have:

$$(p \in \widehat{r}) \implies (\mathrm{var}(\check{r}) \subseteq \mathrm{var}(p)) \qquad \widehat{r} = \widehat{s} \iff \check{r} = \check{s} \qquad \Omega \in \widehat{r} \iff \widehat{r} = P^\bot$$

Figure 5.4 illustrates how a process-filter is generated from a process $p$ (depicted by the bold line) in two successive operations. The first operation consists of building the extension upper set of the process: it takes all the processes that are compatible with $p$ and that are defined on a larger set of variables. The second operation proceeds using the inclusion lower set of this set of processes: it takes all the processes that are defined by subsets of behaviors from processes in the extension upper set (in other words, those processes that remain compatible when adding constraints, since adding constraints removes behaviors).

We denote by $\Phi$ the set of process-filters. We call strict process-filters the process-filters that are neither $P^\bot$ nor $\{\bot\}$. The filtered variable set of a process-filter $R$ is $\mathrm{var}(R)$, defined by $\mathrm{var}(R) = \mathrm{var}(\check{R})$.

Fig. 5.4 Example of process-filter


We define an order relation on process-filters, that we call relaxation, and write $R \sqsubseteq S$ to mean that $S$ is a relaxation of $R$.

Definition 7 (Process-filter relaxation). For $R$ and $S$ two process-filters, let $Z = \mathrm{var}(R) \cup \mathrm{var}(S)$. The relation "$S$ is a relaxation of $R$", written $R \sqsubseteq S$, is defined by:

$$(R \sqsubseteq S \iff \check{R}^{|Z} \subseteq \check{S}^{|Z}) \qquad \{\bot\} \sqsubseteq S \qquad (R \sqsubseteq \{\bot\}) \iff \{\bot\} = R$$

The relaxation relation defines the structure of process-filters, which is shown to be a lattice.

Lemma 1. $(\Phi, \sqsubseteq)$ is a lattice of supremum $P^\bot$ and infimum $\{\bot\}$. Let $R$ and $S$ be two process-filters, $V = \mathrm{var}(R) \cup \mathrm{var}(S)$, $R_V = \check{R}^{|V}$ and $S_V = \check{S}^{|V}$. Conjunction $R \sqcap S$, disjunction $R \sqcup S$ and complement $\widetilde{R}$ are respectively defined by:

$$R \sqcap S = \widehat{R_V \cap S_V}, \qquad \{\bot\} \sqcap R = \{\bot\},$$
$$R \sqcup S = \widehat{R_V \cup S_V}, \qquad \{\bot\} \sqcup R = R,$$
$$\widetilde{R} = \widetilde{\check{R}}{\uparrow}{\downarrow}, \qquad \widetilde{\{\bot\}} = P^\bot, \qquad \widetilde{P^\bot} = \{\bot\}.$$

Let us comment on the definitions of these operators. Conjunction of two strict process-filters $R$ and $S$, for instance, is obtained by first building the extension of the generators $\check{R}$ and $\check{S}$ on the union of the sets of their controlled variables; then the intersection of these processes, which is also a process (set of behaviors), is considered; since this operation may result in some variables becoming free (not controlled), the reduction of this process is taken; and finally, the result is the process-filter generated by this reduction. The process-filter conjunction $R \sqcap S$ of two strict process-filters $R$ and $S$ is the greatest process-filter $T = R \sqcap S$ that accepts all processes that are accepted by $R$ and by $S$. The same mechanism, with union, is used to define disjunction. The process-filter disjunction $R \sqcup S$ of two strict process-filters $R$ and $S$ is the smallest process-filter $T = R \sqcup S$ that accepts all processes that are accepted by $R$ or by $S$. And the complement of a strict process-filter $R$ is the process-filter generated by the complement of its generator $\check{R}$.

Finally, we state a main result for process-filters, which is that process-filters form a Boolean algebra:

Theorem 1. $(\Phi, \sqsubseteq)$ is a Boolean algebra with $P^\bot$ as 1, $\{\bot\}$ as 0, and the complement $\widetilde{R}$.

5.3.3 An Algebra of Contracts

From process-filters, we define the notion of assume/guarantee contract and propose a refinement relation on contracts.


Fig. 5.5 A process p satisfying a contract (A, G)

Definition 8 (Contract). A contract $C = (A, G)$ is a pair of process-filters. The variable set of $C = (A, G)$ is defined by $\mathrm{var}(C) = \mathrm{var}(A) \cup \mathrm{var}(G)$. $\mathcal{C} = \Phi \times \Phi$ is the set of contracts.

Usually, an assumption $A$ is an assertion on the behavior of the environment (it is typically expressed on the inputs of a process) and thus defines the set of behaviors that the process has to take into account. The guarantee $G$ defines properties that should be guaranteed by a process running in an environment where behaviors satisfy $A$.

A process $p$ satisfies a contract $C = (A, G)$ if all its behaviors that are accepted by $A$ (i.e., that are behaviors of some process in $A$) are also accepted by $G$. Figure 5.5 depicts a process $p$ satisfying the contract $(A, G)$ ($\widehat{p}$ is the process-filter generated by the reduction of $p$). This is made more precise and formal by the following definition.

Definition 9 (Satisfaction). Let $C = (A, G)$ be a contract; a process $p$ satisfies $C$, written $p \models C$, iff $(\widehat{p} \sqcap A) \sqsubseteq G$.

Property 3. $p \models C \iff \widehat{p} \sqsubseteq (\widetilde{A} \sqcup G)$.
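Definition 9 and Property 3 have a very direct set-theoretic reading once everything is flattened onto one finite variable set. The Python sketch below is a deliberately simplified illustration of that reading (ours, not the chapter's algebra): a process, an assumption and a guarantee are all plain sets of behaviors over the same variables, so "p satisfies (A, G)" becomes "p intersected with A is included in G", which is equivalent to "p is included in the complement of A union G". The names and the finite domain are assumptions for the example; the real definitions work with process-filters and varying variable sets.

from itertools import product

VARS, DOMAIN = ('x', 'y'), (0, 1)

# Universe of behaviors over VARS, each encoded as a tuple of values.
UNIVERSE = set(product(DOMAIN, repeat=len(VARS)))

def proc(pred):
    """The set of behaviors satisfying a predicate on (x, y)."""
    return {b for b in UNIVERSE if pred(*b)}

def satisfies(p, A, G):
    """Flattened reading of Definition 9: every behavior of p allowed by A is allowed by G."""
    return p & A <= G

def satisfies_via_complement(p, A, G):
    """Flattened reading of Property 3: p is contained in (not A) union G."""
    return p <= (UNIVERSE - A) | G

A = proc(lambda x, y: x == 1)            # assumption: the environment keeps x at 1
G = proc(lambda x, y: y == x)            # guarantee: the component echoes x on y
p = proc(lambda x, y: x == 0 or y == 1)  # a component: y is 1 whenever x is 1

print(satisfies(p, A, G), satisfies_via_complement(p, A, G))   # True True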

We define a preorder relation that allows one to compare contracts. A contract $(A_1, G_1)$ is finer than a contract $(A_2, G_2)$ iff all processes that satisfy the contract $(A_1, G_1)$ also satisfy the contract $(A_2, G_2)$:

Definition 10 (Satisfaction preorder). $(A_1, G_1)$ is finer than $(A_2, G_2)$ iff $(\forall p \in P)\ ((p \models (A_1, G_1)) \implies (p \models (A_2, G_2)))$.

The preorder on contracts satisfies the following property:

Property 4. $(A_1, G_1)$ is finer than $(A_2, G_2)$ iff $(\widetilde{A_1} \sqcup G_1) \sqsubseteq (\widetilde{A_2} \sqcup G_2)$.

Refinement of contracts is further defined from the satisfaction preorder:

Definition 11 (Refinement of contracts). A contract $C_1 = (A_1, G_1)$ refines a contract $C_2 = (A_2, G_2)$, written $C_1 \preccurlyeq C_2$, iff $(A_1, G_1)$ is finer than $(A_2, G_2)$, $A_2 \sqsubseteq A_1$, and $G_1 \sqsubseteq (A_1 \sqcup G_2)$.

Refinement of contracts amounts to relaxing assumptions and reinforcing promises under the initial assumptions. The intuitive meaning is that for any $p$ that satisfies a contract $C$, if $C$ refines $D$ then $p$ satisfies $D$. Our relation of refinement formalizes substitutability. Among contracts that could be used to refine an existing contract $(A_2, G_2)$, we choose those contracts $(A_1, G_1)$ that "scan" more processes than $(A_2, G_2)$ ($A_2 \sqsubseteq A_1$) and that guarantee fewer processes than those of $A_1 \sqcup G_2$.

The refinement relation can be expressed as follows in the algebra of process-filters:

Property 5. $(A_1, G_1) \preccurlyeq (A_2, G_2)$ iff $A_2 \sqsubseteq A_1$, $(A_2 \sqcap G_1) \sqsubseteq G_2$ and $G_1 \sqsubseteq (A_1 \sqcup G_2)$.

The refinement relation ($\preccurlyeq$) defines the poset of contracts, which is shown to be a lattice. In this lattice, the union (or disjunction) of contracts is defined by their least upper bound and the intersection (or conjunction) of contracts is defined by their greatest lower bound. These operations provide two compositions of contracts.

Lemma 2 (Composition of contracts). Two contracts $C_1 = (A_1, G_1)$ and $C_2 = (A_2, G_2)$ have a greatest lower bound $C = (A, G)$, written $C_1 + C_2$, defined by:

$$A = A_1 \sqcup A_2 \quad\text{and}\quad G = ((A_1 \sqcap \widetilde{A_2} \sqcap G_1) \sqcup (\widetilde{A_1} \sqcap A_2 \sqcap G_2) \sqcup (G_1 \sqcap G_2))$$

and a least upper bound $D = (B, H)$, written $C_1 * C_2$, defined by:

$$B = A_1 \sqcap A_2 \quad\text{and}\quad H = (\widetilde{A_1} \sqcap G_1) \sqcup (\widetilde{A_2} \sqcap G_2) \sqcup (A_1 \sqcap G_2) \sqcup (A_2 \sqcap G_1).$$

A Heyting algebra $H$ is a bounded lattice such that for all $a$ and $b$ in $H$ there is a greatest element $x$ of $H$ such that the greatest lower bound of $a$ and $x$ is smaller than $b$ [5]. For all contracts $C_1 = (A_1, G_1)$, $C_2 = (A_2, G_2)$, there is a greatest element $X$ of $\mathcal{C}$ such that the greatest lower bound of $C_1$ and $X$ refines $C_2$. Then our contract algebra is a Heyting algebra (in particular, it is distributive):

Theorem 2. $(\mathcal{C}, \preccurlyeq)$ is a Heyting algebra with supremum $(\{\bot\}, P^\bot)$ and infimum $(P^\bot, \{\bot\})$.

Note that it is not a Boolean algebra since it is not possible to define in general a complement for each contract. The complement exists only for contracts of the form $(A, \widetilde{A})$, and it is then equal to $(\widetilde{A}, A)$.

5.3.4 Contracts: Some Related Approaches

The use of contracts has been advocated for a long time in computer science [1, 16] and, more recently, has been successfully applied in object-oriented software engineering [25]. In object-oriented programming, the basic idea of design-by-contract is to consider the services provided by a class as a contract between the class and its caller. The contract is composed of two parts: requirements made by the class upon its caller and promises made by the class to its caller.

In the theory of interface automata [2], the notion of interface offers benefits similar to our notion of contract and for the purpose of checking interface compatibility between reactive modules. In that context, it is irrelevant to separate the assumptions from guarantees and only one contract needs to be, and is, associated with a module.

Separation and multiple views become of importance in a more general-purpose software engineering context. Separation allows more flexibility in finding (contravariant) compatibility relations between components. Multiple views allow better isolation between modules and hence favor compositionality. This is discussed in Sect. 5.5.3.

In our contract algebra as in interface automata, a contract can be expressed with only one filter. To this end, a filtering equivalence relation [12] (that defines the equivalence class of contracts that accept the same set of processes) may be used to express a contract with only one guarantee filter and with its hypothesis filter accepting all the processes (or, conversely, with only one hypothesis filter and a guarantee filter that accepts no process).

In [6], a system of assume-guarantee contracts with similar aims of genericity is proposed. By contrast to our domain-theoretical approach, the EC Speeds project considers an automata-based approach, which is indeed dual but makes notions such as the complement of a contract more difficult to express from within the model. The proposed approach also leaves the role of variables in contracts unspecified, at the cost of some algebraic relations such as inclusion.

In [18], the authors show that the framework of interface automata may be embedded into that of modal I/O automata. This approach is further developed in [27], where modal specifications are considered. This consists of labelling transitions that may be fired and others that must. Modal specifications are equipped with a parallel composition operator and a refinement order which induces a greatest lower bound. The glb allows addressing multiple viewpoints and conjunctive requirements. With the experience of [6], the authors notice the difficulty in handling interfaces having different alphabets. Thanks to modalities, they propose different alphabet equalizations depending on whether parallel composition or conjunction is considered. Then they consider contracts as residuations G/A (the residuation is the adjoint of parallel composition), where assumptions A and guarantees G are both specified as modal specifications. The objectives of this approach are quite close to ours. Our model deals carefully with alphabet equalization. Moreover, using synchronous composition for processes and greatest lower bound for process-filters and for contracts, our model captures both substitutability and multiple viewpoints (see Sect. 5.5.3).

In [22], a notion of synchronous contracts is proposed for the programming language LUSTRE. In this approach, contracts are executable specifications (synchronous observers) timely paced by a clock (the clock of the environment). This yields an approach which is satisfactory to verify safety properties of individual modules (which have a clock) but can hardly scale to the modeling of globally asynchronous architectures (which have multiple clocks).

In [9], a compositional notion of refinement is proposed for a simpler stream-processing data-flow language. By contrast to our algebra, which encompasses the expression of temporal properties, it is limited to reasoning on input–output types and input–output causality graphs.


The system Jass [4] is somewhat closer to our motivations and solution. It proposes a notion of trace, and a language to talk about traces. However, it seems that it evolves mainly towards debugging and defensive code generation. For embedded systems, we prefer to use contracts for validating composition and hope to use formal tools once we have a dedicated language for contracts. As in JML [21], the notion of an agent with inputs/outputs does not exist in Jass; the language is based on class invariants and pre/post-conditions associated with methods.

Our main contribution is to define a complete domain-theoretical framework for assume-guarantee reasoning. Starting from a domain-theoretical characterization of behaviors and processes, we build a Boolean algebra of process-filters and a Heyting algebra of contracts. This yields a rich structure which is (1) generic, in that it can be implemented or instantiated to specific models of computation; (2) flexible, in the way it can help structuring and normalizing expressions; and (3) complete, in the sense that all combinations of propositions can be expressed within the model.

Finally, a temporal logic that is consistent with our model, such as for instance ATL (Alternating-time Temporal Logic [3]), can directly be used to express assumptions about the context of a process and guarantees provided by that process.

5.4 A Module Language for Typing by Contracts

In this section, we define a module language to implement our contract algebra and apply it to the validation of component-based systems. For the peculiarity of our applications, it will be instantiated to the context of the synchronous language SIGNAL, yet it could equally be used in the context of related programming languages manipulating processes or agents. Its basic principle is to separate the interface, which declares properties of a program using contracts, and the implementation, which defines an executable specification satisfying it.

5.4.1 Syntax

We define the formal syntax of our module language. Its grammar is parameterized by the syntax of programs, noted $p$ or $q$, which belong to the target specification or programming language. Names are noted $x$ or $y$. Types $t$ are used to declare parameters and variables in the interface of contracts. Assumptions and guarantees are described by expressions $p$ and $q$ of the target language. An expression exp manipulates contracts and modules to parameterize, apply, reference and compose them.


x, y                                                (name)
p, q                                                (process)
b, c ::= event | boolean | short | integer | ...    (datatype)
t    ::= b | input b | output b | x | t × t         (type)
dec  ::= t x [; dec]                                (declaration)
def  ::= module [type] x = exp                      (definition)
       | module x [: t] = exp
       | def ; def
ag   ::= [assume p] guarantee q;                    (contract)
       | ag and ag | x(y*)                          (process)
exp  ::= contract dec; ag end                       (contract)
       | process dec; p end                         (process)
       | functor (dec) exp                          (functor)
       | exp and exp                                (composition)
       | x(exp*)                                    (application)
       | let def in exp                             (scoping)

5.4.2 A Type System for Contracts and Processes

We define a type system for contracts and processes in the module language. In the syntax of the module language, contracts and processes are associated with names $x$. These names can be used to type formal parameters in a functor and become type declarations. Hence, in the type system, type names that stand for a contract or a process are associated with a module type $T$. A base module type is a tagged pair $\kappa(I, C)$. The tag $\kappa$ is noted $\pi$ for the type of a process and $\gamma$ for the type of a contract. The set $I$ consists of pairs $x : t$ that declare the types $t$ of its input and output variables $x$. The contract $C$ is a pair of predicates $(p, q)$ that represent its assumptions $p$ and guarantees $q$. The type of a functor $\Pi(x : S).T$ consists of the name $x$ and of the types $S$ and $T$ of its formal parameter and result.

$$S, T ::= t \mid \kappa(I, C) \mid S \times T \mid \Pi(x : S).T \quad\text{(type)} \qquad\qquad \kappa ::= \pi \mid \gamma \quad\text{(kind)}$$

5.4.3 Subtyping as Refinement

We define a subtyping relation on types $t$ to extend the refinement relation of the contract algebra to the type algebra. To that aim, we apply the subtyping principle $S \subseteq T$ ($S$ is a subtype of $T$) to mean that the semantic objects denoted by $S$ are contained in the semantic objects denoted by $T$ ($S$ refines $T$). Hence, a module of type $T$ can safely be replaced or substituted by a module of type $S$. Figure 5.6 depicts a process $P$ with one long input $x$ and two short outputs $a$, $b$, and a process $Q$ with two integer inputs $x$, $y$ and one integer output $a$, such that $P$ refines $Q$.


P:  x : long  ->  a : short, b : short          Q:  x : integer, y : integer  ->  a : integer

Fig. 5.6 Example of module refinement

Then the type of a module $M$ encapsulating $P$ is a subtype of that of a module $N$ encapsulating $Q$, thus $M$ can replace $N$.

The subtyping relation $\subseteq$ is defined inductively with axioms for datatypes, rules for declarations and rules for each kind of module type. The complete rules are described in [14]. In particular, a module type $S = \kappa(I, C)$ is a subtype of $T = \kappa(J, D)$, written $S \subseteq T$, iff the inputs in $J$ subtype those in $I$, the outputs in $I$ subtype those in $J$, and the contract $C$ refines $D$ (written $C \preccurlyeq D$).

We can interpret the relation $C \preccurlyeq D$ as a means to register the refinement constraint between $C$ and $D$ in the typing constraints. It corresponds to a proof obligation in the target language, whose meaning is defined by the semantic relation $[[C]] \preccurlyeq [[D]]$ in the contract algebra, and whose validity may for instance be proved by model checking (the decidability of the subtyping relation essentially reduces to that of the refinement of contracts).
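The structural part of this rule is easy to phrase as a check over interfaces. The Python sketch below is our own illustration of it on the P/Q example of Fig. 5.6: datatype widening and the contract-refinement test are supplied as oracles (here stubs), because in the module system the latter is exactly the proof obligation discussed above. All names and the widening table are assumptions made for the example.

# Datatype widenings assumed for the example (reflexive closure added in the check).
WIDENS = {('short', 'integer'), ('integer', 'long'), ('short', 'long')}

def datatype_subtype(s, t):
    """A value of type s is usable where a value of type t is expected."""
    return s == t or (s, t) in WIDENS

def is_subtype(S, T, contract_refines):
    """S <: T iff the inputs declared by T subtype those required by S (contravariant),
    the outputs delivered by S subtype those promised by T (covariant),
    and S's contract refines T's."""
    inputs_ok = all(x in T['inputs'] and datatype_subtype(T['inputs'][x], tS)
                    for x, tS in S['inputs'].items())
    outputs_ok = all(x in S['outputs'] and datatype_subtype(S['outputs'][x], tT)
                     for x, tT in T['outputs'].items())
    return inputs_ok and outputs_ok and contract_refines(S['contract'], T['contract'])

P = {'inputs': {'x': 'long'}, 'outputs': {'a': 'short', 'b': 'short'}, 'contract': 'CP'}
Q = {'inputs': {'x': 'integer', 'y': 'integer'}, 'outputs': {'a': 'integer'}, 'contract': 'CQ'}
print(is_subtype(P, Q, contract_refines=lambda c, d: True))   # True under the stub oracle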

5.4.4 Composition of Modules

Just as the subtyping relation, which implements and extends the refinement relation of the contract algebra in the typing algebra, the operations that define the greatest lower bound (glb) and least upper bound (lub) of two contracts are extended to module types by induction on the structure of types.

Greatest lower bound:
– Modules: $\kappa(I, C) \sqcap \kappa(J, D) = \kappa(I \sqcap J,\ C + D)$
– Products: $(S \times T) \sqcap (U \times V) = (S \sqcap U) \times (T \sqcap V)$
– Functors: $\Pi(x : S).T \sqcap \Pi(y : U).V = \Pi(x : (S \sqcup U)).(T \sqcap V[y/x])$

Least upper bound:
– Modules: $\kappa(I, C) \sqcup \kappa(J, D) = \kappa(I \sqcup J,\ C * D)$
– Products: $(S \times T) \sqcup (U \times V) = (S \sqcup U) \times (T \sqcup V)$
– Functors: $\Pi(x : S).T \sqcup \Pi(y : U).V = \Pi(x : (S \sqcap U)).(T \sqcup V[y/x])$

Note that the intersection and union operators have to be extended to combine the sets of input and output declarations of a module. For the greatest lower bound, for instance, the resulting set of input variables is obtained by intersection of the input sets, and the type of an input variable is defined as the lub of the considered corresponding types. For the lower bound again, the resulting set of output variables is obtained by the union of the output sets, and for common output variables, their type is defined as the glb of the considered corresponding types. Similar rules are defined for the least upper bound (union of sets for inputs with glb of types, intersection of sets for outputs with lub of types).

The composition of modules is made available in the module language through the "and" operator. The operands of this operator can be contracts, in which case the resulting type is the greatest lower bound $+$ of both contracts; or they can be expressions of modules, in which case the resulting type is the greatest lower bound $\sqcap$ of both module types.

5.5 Application to SIGNAL

We illustrate the distinctive features of our contract algebra by reconsidering the specification of the LTTA and its translation into observers in the target language of our choice: the multi-clocked synchronous (or polychronous) data-flow language SIGNAL [19, 20].

5.5.1 Implementation of the LTTA

We model the LTTA protocol in SIGNAL by specifying the abstraction of all functionalities that write and read values on the bus. Refer to [8] for a description of the operators of the SIGNAL language. In particular, the cell operator y := x cell b init y0 allows one to memorize in y the latest value carried by x when x is present or when b is true. It is defined as follows:

(| y := x default (x$1 init y0) | y ^= x ^+ (when b) |)

We consider first the following basic functionalities:

process current = { boolean v0; }
  ( ? boolean wx; event c;
    ! boolean rx; )
  (| rx := (wx cell c init v0) when c |);

process interleave =
  ( ? boolean x, sx; )
  (| b := not (b$1 init false)
   | x ^= when b
   | sx ^= when (not b)
   |)
  where boolean b; end;

The process current defines a cell in which values are stored at the input clock ^wx and loaded on rx at the output clock c (the parameter v0 is used as initial value). The other functionality is the process interleave, that desynchronizes the signals x and sx by synchronizing them to the true and false values of an alternating Boolean signal b.

A simple buffer can be defined from these functionalities:

process shift_1 =
  ( ? boolean x; ! boolean sx; )
  (| interleave(x, sx)
   | sx := current{false}(x, ^sx)
   |);

It represents a one-place FIFO, the contents of which is the last value written into it. Thanks to the interleave, the output (signal sx) may only be read or retrieved strictly after it was entered. Also, there is no possible loss or duplication of data.

For the purpose of the LTTA, a couple of signals have to be memorized together: the value to be transmitted, and its associated Boolean flag. So we define the process shift_2, in which both values are memorized at some common clock:

process shift_2 =
  ( ? boolean x, b; ! boolean sx, sb; )
  (| interleave(x, sx)
   | sx := current{false}(x, ^sb)
   | sb := current{true}(b, ^sb)
   |);

The shift processes ensure there is necessarily some delay between the input of a data item and its output. But for a more general buffer, some data may be lost if a new item is entered, and memorized values have to be sustained. Using shift_2 (for the LTTA), we may write:

process buffer =
  ( ? boolean x, b; event c;
    ! boolean bx, bb, sb; )
  (| (sx, sb) := shift_2(x, b)
   | bx := current{false}(sx, c)
   | bb := current{true}(sb, c)
   |)
  where boolean sx; end;

The signal c provides the clock at which data are retrieved from the buffer. The clock of the output signal sb (which is the clock resulting from the internal shift) represents the clock of the first instants at which the buffer can fetch a new value. It will be used to express some assumptions on the protocol.

Then the process ltta is decomposed into a reader, a bus and a writer:

process ltta =
  ( ? boolean xw; event cw, cb, cr;
    ! boolean xr; )
  (| (xb, bb, sbw) := bus(xw, writer(xw, cw), cb)
   | (xr, br, sbb) := reader(xb, bb, cr)
   |)
  where boolean bw, xb, bb, sbw, br, sbb; end;


Using the buffer process, the components have the following definition:

process writer =
  ( ? boolean xw; event cw; ! boolean bw; )
  (| bw ^= xw ^= cw
   | bw := not(bw$1 init true)
   |);

process bus =
  ( ? boolean xw, bw; event cb;
    ! boolean xb, bb, sbw; )
  (| (xb, bb, sbw) := buffer(xw, bw, cb) |);

process reader =
  ( ? boolean xb, bb; event cr;
    ! boolean xr, br, sbb; )
  (| (yr, br, sbb) := buffer(xb, bb, cr)
   | xr := yr when switched(br)
   |)
  where boolean yr; end;

The switched basic functionality allows one to consider the values for which the Boolean flag has alternated:

process switched =
  ( ? boolean b; ! boolean c; )
  (| zb := b$1 init true
   | c := (b and not zb) or (not b and zb)
   |)
  where boolean zb; end;

5.5.2 Contracts in SIGNAL

In this section, we will specify assumptions and guarantees as SIGNAL processes representing generators of corresponding process-filters.

The behavior of the LTTA is correct if the data flow extracted by the reader is equal to the data flow emitted by the writer ($\forall n \cdot x_r(n) = x_w(n)$). It is the case if the following conditions hold:

$$w \geq b \quad\text{and}\quad \left\lfloor \frac{w}{b} \right\rfloor \geq \frac{r}{b}.$$

We will consider this property as a contract to be satisfied by a given implementation of the protocol. Here, we again use the SIGNAL language to specify this contract with the help of clock constraints or of signals used as observers [15]. In general, the generic structure of observers specified in contracts will find a direct instance and compositional translation into the synchronous multi-clocked model of computation of SIGNAL [20]. Indeed, a subtlety of the SIGNAL language is that an observer not only talks about the value, true or false, of a signal, but also about its status, present or absent. Considered as observers, the assumption and guarantee of the contract for LTTA could be described as follows:

$$A_{LTTA} = (w \geq b) \wedge \left(\left\lfloor \frac{w}{b} \right\rfloor \geq \frac{r}{b}\right), \qquad G_{LTTA} = \big(x_r(n) = x_w(n)\big).$$

For instance, $G_{LTTA}$ is true when $x_r(n) = x_w(n)$ and it is false when $x_r(n) \neq x_w(n)$. By default, it is absent when the equality cannot be tested. Notice that the complement of an event (a given signal, e.g. xr, being present and true) is that it is absent or false. The signal $G_{LTTA}$ is present and true iff xr is present and the condition $x_r(n) = x_w(n)$ is satisfied. For a trace of the guarantee $G_{LTTA}$, the set of possible traces corresponding to its complement $\neg G_{LTTA}$ is infinite (and dense), since it is not defined on the same clock as $G_{LTTA}$. For example, $G_{LTTA} = 1\,0\,1\,0\,1\,0\,1\,0\,1$ and $\neg G_{LTTA} = 0\,0\,0\,0\,0$ or $0\,1\,1\,1\,0\,1\,1\,0\,1\,1\,0\,1\,0\,1\ldots$

Let cw, cb and cr be the signals representing respectively the clocks of the writer, the bus and the reader. If the signal sbw has the clock of the first instants at which the bus can fetch a new value (from the writer) – it has the same period as cw but is shifted from it – then the constraint "never two $t_w$ between two successive $t_b$" ($w \geq b$) can be expressed in SIGNAL by: cb ^= sbw ^+ cb.

If the signal sbb has the clock of the first instants at which the reader can fetch a new value (from the bus), and its values represent the values of the Boolean flag transmitted along the communication path, then the constraint "never two $s_b$ between two successive $t_r$" ($\lfloor w/b \rfloor \geq r/b$) – recall that $s_b(n)$ is the first instant where the bus can fetch the nth writing – can be expressed in SIGNAL by: cr ^= (when switched(sbb)) ^+ cr.

Then the assumptions of the contract for the LTTA may be expressed as the synchronous composition of the above two clock constraints.

Consider now the property that has to be verified: $\forall n \cdot x_r(n) = x_w(n)$. Let xr and xw represent the corresponding signals. The property that these two signals represent the same data flows (recall that they do not have the same clock) can be expressed in SIGNAL by comparing xr (the output of the reader) with a signal – call it xok – which is the output of a FIFO queue on the input signal xw, such that xok can be freely resynchronized with xr. The signal xok can be defined as xok := fifo_3(xw), with fifo_3 a FIFO with enough memory so that the clock of xr is not indirectly constrained when xr and xok are synchronized:

process fifo_3 =
  ( ? boolean x; ! boolean xok; )
  (| xok := shift_1(shift_1(shift_1(x))) |);

The observer of the guarantee is expressed as: obs := (xr = xok).



This contract can be used as a type for specifying an LTTA protocol. A possible implementation of this protocol is the one described in Sect. 5.5.1 using the SIGNAL language:

module type spec_LTTA =
  contract input  boolean xw;
                  event cw, cb, cr
           output boolean xr, sbw, sbb;
  assume
    (| cb ^= sbw ^+ cb
     | cr ^= cr ^+ (when switched(sbb))
     |)
    where
      process switched = ...
    end;
  guarantee
    (| xok := fifo_3(xw)
     | obs := xr = xok
     |)
    where
      boolean xok;
      process fifo_3 = ...
    end;
end;

module ltta : spec_LTTA =
  process input  boolean xw;
                 event cw, cb, cr
          output boolean xr, sbw, sbb;
  (| (xb, bb, sbw) := bus(xw, writer(xw, cw), cb)
   | (xr, br, sbb) := reader(xb, bb, cr)
   |)
  where
    boolean bw, xb, bb, br;
    process writer =
      ( ? boolean xw; event cw;
        ! boolean bw; )
      (| ... |);
    process bus =
      ( ? boolean xw, bw; event cb;
        ! boolean xb, bb, sbw; )
      (| ... |);
    process reader =
      ( ? boolean xb, bb; event cr;
        ! boolean xr, br, sbb; )
      (| ... |);
  end;
end;

Needless to say, a sophisticated solver, based for instance on Presburger arithmetic, would help us verify the consistency of the LTTA property. Nonetheless, an implementation of the above LTTA specification, for the purpose of simulation, can straightforwardly be derived. As a by-product, it defines an observer which may be used as a proof obligation against an effective implementation of the LTTA controller, to verify that it implements the expected LTTA property. Alternatively, it may be used as a medium to synthesize a controller enforcing the satisfaction of the specified property on the implementation of the model.

5.5.3 Salient Properties of Contracts in the Synchronous Context

In the context of component-based or contract-based engineering, refinement and substitutability are recognized as fundamental requirements [10]. Refinement allows one to replace a component by a finer version of it. Substitutability allows one to implement every contract independently of its context of use. These properties are essential for viewing an implementation as a succession of refinement steps, down to the final implementation. As noticed in [27], other aspects might be considered in a design methodology: in particular, shared implementations for different specifications, multiple viewpoints, and conjunctive requirements for a given component.

Considering the synchronous composition of SIGNAL processes, the satisfaction relation of contracts, and the greatest lower bound of contracts (written $\sqcap$ below) as a composition operator, we have the following properties:

Property 6. Let $p, q \in P$ be two processes, and $C_1, C_2, C_1', C_2' \in C$ be contracts.

(1) $C_1 \preceq C_2 \Longrightarrow ((p \models C_1) \Longrightarrow (p \models C_2))$
(2) $C_1 \sqsubseteq C_2 \Longleftrightarrow ((p \models C_1) \Longrightarrow (p \models C_2))$
(3) $((C_1' \preceq C_1) \wedge (C_2' \preceq C_2)) \Longrightarrow ((C_1' \sqcap C_2') \preceq (C_1 \sqcap C_2))$
(4) $((p \models C_1) \wedge (q \models C_2)) \Longrightarrow ((p \mid q) \models (C_1 \sqcap C_2))$
(5) $((p \models C_1) \wedge (p \models C_2)) \Longleftrightarrow (p \models (C_1 \sqcap C_2))$

(1) and (2) relate to refinement and implementation; (3) and (4) allow for substitutability in composition; (5) addresses multiple viewpoints:

- (1) and (2) illustrate the substitutability induced by the refinement relation. For relation (1), if a contract C1 refines a contract C2, then a process p which satisfies C1 also satisfies C2. Consequently, the set of processes satisfying C1 being included in the set of processes satisfying C2, a component which satisfies C2 can be replaced by a component which satisfies C1. For relation (2), a contract C1 is finer than a contract C2 if and only if the processes which satisfy C1 also satisfy C2.

- (3) and (4) illustrate the substitutability in composition. For relation (3), if a contract C1' refines a contract C1 and a contract C2' refines a contract C2, then the greatest lower bound of C1' and C2' refines the greatest lower bound of C1 and C2. Relation (4) expresses that a subsystem can be developed in isolation. Then, when developed independently, subsystems can be substituted for their specifications and composed as expected. If a SIGNAL process p satisfies a contract C1 and a SIGNAL process q satisfies a contract C2, then the synchronous composition of p and q satisfies the greatest lower bound of C1 and C2. Thus, each subsystem of a component can be analyzed and designed with its specific frameworks and tools. Finally, the composition of the subsystems satisfies the specification of the component. Property (4) can be illustrated as follows on the LTTA example: define the implementation of the LTTA as a functor parameterized by two components, bus and reader, respectively associated with the types (i.e., contracts) busType and readerType, such that the greatest lower bound of busType and readerType is equal to the type spec_LTTA associated with the LTTA implementation.

- (5) illustrates the notion of multiple viewpoints: a process p satisfies a contract C1 and a contract C2 if and only if p satisfies the greatest lower bound of contracts C1 and C2. This property answers the need for modularity coming from the concurrent development of systems by different teams using different frameworks and tools. An example is the concurrent handling of the safety or reliability aspects and the functional aspect of a system. Other aspects may have to be considered too. Each of these aspects requires specific frameworks and tools for its analysis and design. Yet, they are not totally independent but rather interact. The issue of dealing with multiple aspects or multiple viewpoints is thus essential.

5.5.4 Implementation

The module system described in this chapter, embedding data-flow equations defined in SIGNAL, has been implemented in OCaml. It produces a proof tree that consists of (1) an elaborated SIGNAL program, which hierarchically renders the structure of the system described in the original module expressions, (2) a static type assignment, which is sound and complete with respect to the module type inference system, and (3) a proof obligation consisting of refinement constraints, which are compiled as an observer or a temporal property in SIGNAL.

The property is then passed on to SIGNAL's model-checker, Sigali [24], which can prove or disprove that it is satisfied by the generated program. Satisfaction implies that the type assignment and the produced SIGNAL program are correct with respect to the initially intended specification. The generated property may, however, be used for other purposes. One is to use the controller synthesis services of Sigali [23] to automatically generate a SIGNAL program that enforces the property on the generated program. Another, in the case of infinite-state systems (e.g., over numbers), would be to generate defensive simulation code in order to produce a trace if the property is violated.

5.6 Conclusion

Starting from an abstract characterization of behaviors as functions from variables to a domain of values (Booleans, integers, series, sets of tagged values, continuous functions), we introduced the notion of process-filter to formally characterize the logical device that filters behaviors from processes, much like the assumption and guarantee of a contract do. In our model, a process p fulfils its requirements (or satisfies a contract (A, G)) if either it is rejected by A (i.e., if A represents assumptions on the environment, they are not satisfied for p) or it is accepted by G.

The structure of process-filters is a Boolean algebra and the structure of contracts is a Heyting algebra. These rich structures allow for reasoning on contracts with great flexibility to abstract, refine and combine them. In addition, the negation of a contract can formally be expressed from within the model. Moreover, contracts are not limited to expressing safety properties, as is the case in most related frameworks, but encompass the expression of liveness properties [13]. This is all again due to the central notion of process-filter. Our model deals with constraints or properties possibly expressed on different sets of variables, and takes into account variable equalization when combining them. In this model, assumption and guarantee properties are not necessarily restricted to be expressed as formulas in some logic, but are rather considered as sets of behaviors (generators of process-filters). Note that such a process can represent a constraint expressed in some temporal logic.

We introduced a module system based on the paradigm of contracts for a synchronous multi-clocked formalism, SIGNAL, and applied it to the specification of a component-based design process. The paradigm we are putting forward is to regard a contract as the behavioral type of a component and to use it for the elaboration of the functional architecture of a system, together with a proof obligation that validates the correctness of assumptions and guarantees made while constructing that architecture.

Acknowledgement Partially funded by the EADS Foundation.

References

1. Abadi, M., Lamport, L.: Composing specifications. ACM Transactions on Programming Languages and Systems 15(1), 73–132 (1993)
2. de Alfaro, L., Henzinger, T.A.: Interface automata. ACM SIGSOFT Software Engineering Notes 26(5), 109–120 (2001)
3. Alur, R., Henzinger, T.A., Kupferman, O.: Alternating-time temporal logic. Journal of the ACM 49(5), 672–713 (2002)
4. Bartetzko, D., Fischer, C., Moller, M., Wehrheim, H.: Jass – Java with assertions. Electronic Notes in Theoretical Computer Science 55(2), 1–15 (2001)
5. Bell, J.L.: Boolean algebras and distributive lattices treated constructively. Mathematical Logic Quarterly 45, 135–143 (1999)
6. Benveniste, A., Caillaud, B., Passerone, R.: A generic model of contracts for embedded systems. Tech. Rep. 6214, INRIA Rennes (2007)
7. Benveniste, A., Caspi, P., Le Guernic, P., Marchand, H., Talpin, J.P., Tripakis, S.: A protocol for loosely time-triggered architectures. In: J. Sifakis, S.A. Vincentelli (eds.) EMSOFT '02: Proceedings of the Second International Conference on Embedded Software, Lecture Notes in Computer Science, vol. 2491, pp. 252–265. Springer, Berlin (2002)
8. Besnard, L., Gautier, T., Le Guernic, P., Talpin, J.P.: Compilation of polychronous dataflow equations. In this book
9. Broy, M.: Compositional refinement of interactive systems. Journal of the ACM 44(6), 850–891 (1997)
10. Doyen, L., Henzinger, T.A., Jobstmann, B., Petrov, T.: Interface theories with component reuse. In: EMSOFT '08: Proceedings of the 8th ACM International Conference on Embedded Software, pp. 79–88. ACM (2008)
11. Edwards, S., Lavagno, L., Lee, E.A., Sangiovanni-Vincentelli, A.: Design of embedded systems: formal models, validation, and synthesis. Proceedings of the IEEE 85(3), 366–390 (1997)
12. Glouche, Y., Le Guernic, P., Talpin, J.P., Gautier, T.: A boolean algebra of contracts for logical assume-guarantee reasoning. Tech. Rep. 6570, INRIA Rennes (2008)
13. Glouche, Y., Talpin, J.P., Le Guernic, P., Gautier, T.: A boolean algebra of contracts for logical assume-guarantee reasoning. In: 6th International Workshop on Formal Aspects of Component Software (FACS 2009) (2009)
14. Glouche, Y., Talpin, J.P., Le Guernic, P., Gautier, T.: A module language for typing by contracts. In: E. Denney, D. Giannakopoulou, C.S. Pasareanu (eds.) Proceedings of the First NASA Formal Methods Symposium, pp. 86–95. NASA Ames Research Center, Moffett Field, CA, USA (2009)
15. Halbwachs, N., Lagnier, F., Raymond, P.: Synchronous observers and the verification of reactive systems. In: AMAST '93: Proceedings of the Third International Conference on Methodology and Software Technology, pp. 83–96. Springer, Berlin (1994)
16. Hoare, C.A.R.: An axiomatic basis for computer programming. Communications of the ACM 12(10), 576–580 (1969)
17. Kopetz, H.: Component-based design of large distributed real-time systems. Control Engineering Practice 6(1), 53–60 (1997)
18. Larsen, K.G., Nyman, U., Wasowski, A.: Modal I/O automata for interface and product line theories. In: R. De Nicola (ed.) ESOP, Lecture Notes in Computer Science, vol. 4421, pp. 64–79. Springer, Berlin (2007)
19. Le Guernic, P., Gautier, T., Le Borgne, M., Le Maire, C.: Programming real-time applications with SIGNAL. Proceedings of the IEEE 79(9), 1321–1336 (1991)
20. Le Guernic, P., Talpin, J.P., Le Lann, J.C.: Polychrony for system design. Journal for Circuits, Systems and Computers 12(3), 261–304 (2003)
21. Leavens, G.T., Baker, A.L., Ruby, C.: JML: A notation for detailed design. In: H. Kilov, B. Rumpe, W. Harvey (eds.) Behavioral Specifications of Businesses and Systems, pp. 175–188. Kluwer, Dordrecht (1999)
22. Maraninchi, F., Morel, L.: Logical-time contracts for reactive embedded components. In: EUROMICRO, pp. 48–55. IEEE Computer Society (2004)
23. Marchand, H., Bournai, P., Le Borgne, M., Le Guernic, P.: Synthesis of discrete-event controllers based on the Signal environment. Discrete Event Dynamic Systems: Theory and Applications 10(4), 325–346 (2000)
24. Marchand, H., Rutten, E., Le Borgne, M., Samaan, M.: Formal verification of programs specified with Signal: application to a power transformer station controller. Science of Computer Programming 41(1), 85–104 (2001)
25. Meyer, B.: Object-Oriented Software Construction (2nd ed.). Prentice-Hall, New York (1997)
26. Mitchell, R., McKim, J., Meyer, B.: Design by Contract, by Example. Addison Wesley Longman, Redwood City, CA (2002)
27. Raclet, J.B., Badouel, E., Benveniste, A., Caillaud, B., Passerone, R.: Why are modalities good for interface theories? In: Proc. of the 9th International Conference on Application of Concurrency to System Design (ACSD'09), pp. 119–127. IEEE Computer Society Press (2009)


Chapter 6
MRICDF: A Polychronous Model for Embedded Software Synthesis

Bijoy A. Jose and Sandeep K. Shukla

6.1 Introduction

Safety-critical applications require embedded software that can guarantee deterministic output and on-time results during execution. Synchronous programming languages are by nature focused on providing the programmer with deterministic output at fixed points in time [1, 2]. Synchronous programming languages such as Esterel [3], LUSTRE [4], SIGNAL [5], etc., have successfully generated sequential software for embedded applications targeting avionics, rail transport, etc. The underlying formalism of these languages is robust, and the associated software tools provide correctness-preserving refinement from the specification to the implementation. But even within the class of synchronous programming languages, there are very distinctive Models of Computation (MoC) [6]. Esterel and LUSTRE use an external global tick (or global clock) as the reference for events that occur in the system. SIGNAL has an entirely different Model of Computation, where no assumptions are made on the global tick while designing the system. Each event has its own tick associated with it, and for compilation the analysis of the specification has to yield a totally ordered tick from the specification. In its absence, a variable (also known as a signal) is constructed which has an event happening in synchrony with every event occurring in the system. This signal can be viewed as a root clock of the system, with a hierarchical tree structure relating it with every other signal. Since events can occur at each signal at different rates, this MoC is also known as polychronous or multi-rate.

B.A. Jose and S.K. Shukla
FERMAT Lab, Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
e-mail: [email protected]; [email protected]





6.1.1 Motivation for Polychronous Formalism

Embedded software systems interact with their environments by sampling inputs (continuous inputs such as temperature, pressure, etc.) or by getting discrete inputs from other digital systems, indefinitely. Hence all the inputs, internal variables and outputs of such systems can be thought of as infinite streams of values. If one has a global discrete clock, one can also associate time stamps with each occurrence of values on each stream. But in the polychronous MoC, there is no assumption of an a priori global clock, and hence the occurrence of a new value on a variable is termed an event. A few software design requirements are now discussed, along with their possible implementations through polychronous and non-polychronous MoCs.

Example 1 (Priority-Merge). Let a and b be two signals; we need software that merges these two streams of events into one stream such that, if a and b have events at the same instant, the value from a's event is sent to the output. If b has an event at an instant when a does not have an event, then the value from b's event is sent to the output.

In a single-clocked specification language such as Esterel, time is globally divided into discrete instants at which inputs are sampled and outputs are computed with no time delay. So the term 'instant' in Example 1 refers to a discrete time point in a linear global timeline, at which sampling a and b determines whether any events are present and, if so, what their values are. Thus if a is absent at the sampling instant and b is present, the output at that instant takes the value from the stream b. However, in a multi-clocked specification, no externally defined global linear timeline is present, and hence there is no unique implementation for Example 1. One possibility is that the software waits for events on both inputs, and if an event occurs on a, it outputs that event; if an event occurs on input b, it outputs the event on b. But an 'instant' in the specification corresponds to a period of real execution time; thus, if a has an event during the period after b has had one, the event on a should be output. Since the length of this period is not predetermined, the implementation will be nondeterministic. Does that mean multi-clock specifications are problematic? The discussion of the next example gives more insight into this problem.

Example 2 (Mutually exclusive signals). Suppose that every time an event on a signal c arrives with value true, an event on a arrives. When an event on c arrives with a value of false, only an event on b arrives. Events on a or b are always accompanied by an event on c.

Here the implementation should remain suspended until an event occurs on c, and determine which of the two streams (a or b) must be sampled based on the value carried by this event. If no event on c occurs, the system does not react at all. Now we see the advantage of multi-rate specification. In a global-synchronization-based model of computation, every time the global tick occurs, at least c must be sensed, and hence the program needs to wake up repeatedly even when no event on c occurred, leading to a less efficient implementation (e.g., the system cannot go into sleep mode).



Example 3 (Memory). Suppose you want a system that accepts a stream of values from signal a, and outputs the same value. However, you want the system to have memory, so that if its output is sampled by another system and no event on a occurs at that sampling time, the output must provide the last output value (memorized).

In this case, the system reads a only if an event on a occurs or if another system requests an output from this program. If a does have an event, the new value is forwarded as output. If the check was triggered by a request and a does not have an event during that request, the system sends the old memorized value. In a single-rate driven system, at every global tick the system must check for an event on a, even when not requested, as the notion of global tick is external to the specification of the program. In a multi-rate MoC, the only times the program wakes up are when a has a new event or when a request for an output arrives.

These examples show that having an a priori notion of a global tick which is external to the specification results in the generation of inefficient code. If the inter-arrival time between environment events or requests is unpredictable, a global-tick-driven program will be less efficient. The multi-rate formalism of SIGNAL recognizes this requirement and generates code with fewer sampling instances than single-clock frameworks. Recently, various multi-rate extensions to Esterel have been proposed as well [7], affirming the importance of multi-rate systems in the embedded systems context. Let us conclude this discussion with another example, where an implementation based on multi-rate specifications can be more efficient due to the relative performance difference between different parts of a program.

Example 4 (Razor [8]). Consider a system that computes a function f on two synchronized input signals a and b. In order to check that the software implementation of f is correct, one can have multiple implementations of f, say Pf and Qf. The inputs taken from a and b are simultaneously passed to both Pf and Qf, and the outputs are compared as shown in Fig. 6.1. If the outputs are always equal, then Pf and Qf both implement the same function f; otherwise they do not (cf. program checking [9]).

If we have global synchronization, or ticks at which inputs are fed to Pf and Qf, then by the synchrony hypothesis the outputs from both Pf and Qf come out instantaneously and they are compared instantaneously. Now consider the case where the implementation of Pf runs much faster than that of Qf. If the specification was in a global-tick-based formalism, either the inter-tick interval must be max(WCET(Pf), WCET(Qf)), or Pf must be suspended until the tick at which Qf completes. This requires suspension code around Pf, watching at every tick whether Qf has completed, among other synchronizations. In the polychronous formalism, one can directly state that Pf and Qf must synchronize. A 'completion' signal from Qf is enough for this synchronization, and there is no need to check at predetermined intervals. This might remind the reader of the advantages of asynchronous over synchronous design in hardware.

Fig. 6.1 Comparison of two different implementations – Razor example [8] (inputs a and b are fed to Pf and Qf; their outputs c and d are compared: c = d?)

The above examples illustrate cases where global-synchronization-based specifications unnecessarily restrict obvious optimization opportunities that can be exploited using a polychronous formalism. Software synthesis from a polychronous formalism involves a clock calculus of greater complexity, which is discussed in the next section.

6.1.2 Software Synthesis from Polychronous Formalism

Synchronous languages such as Esterel and LUSTRE are relatively more popular than the polychronous-formalism-based language SIGNAL. Programming using single-clocked languages is easier, since having an external reference of time is more natural for users and it follows conventional software synthesis methodology. Hence code synthesis tools like Esterel Studio [10] and SCADE [11] are used in industry more often than Polychrony [12]. SIGNAL's MoC is reactive; in other words, computation takes place only when an event occurs on one of the signals in the system. Signals related to it compute next, and once all data-dependent signals have completed execution, one round of computation is over. The reactive response happens over an interval of time, and the start of such an interval is considered to be the global software tick of the system, if one exists. This is very useful in low-power applications where an embedded system has to be put to sleep once computation has completed. An embedded system with a non-polychronous MoC has to check for new inputs at every tick of the external global clock, as it does not have any knowledge about events occurring in the system.

SIGNAL [13] and its toolset Polychrony [12] provide a framework for multi-rate data flow specification and are the inspiration for our Multi-Rate Instantaneous Channel Connected Data Flow (MRICDF) framework. However, MRICDF envisions a visual framework to represent the data flow, with control specification only through epoch relations (defined in later sections). The execution semantics and implementability analysis also differ from Polychrony. Polychrony analyzes the system by solving a set of clock equations [14] which only apply to Boolean signals. It uses ROBDDs to create a canonical form for all clock variables, and then constructs a clock hierarchy in the form of tree-like data structures. This makes the analysis unnecessarily complex. The MRICDF synthesis methodology seeks to remain in the Boolean equational domain for analysis, but uses various techniques such as prime implicate generation for obtaining a total order on the events occurring in the network.

Page 193: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

6 MRICDF 177

This chapter is organized as follows. The terminology required for the semantics of MRICDF is introduced first in Sect. 6.2. The MRICDF formalism is described in Sect. 6.3. Issues in implementing MRICDF as embedded software are discussed in Sect. 6.4. A visual framework called EmCodeSyn, which implements MRICDF models as embedded software, is discussed in Sect. 6.5. Sections 6.4 and 6.5 include case studies which demonstrate how MRICDF and EmCodeSyn are used to design safety-critical applications. The chapter is concluded in Sect. 6.6 with a discussion of possible extensions and future work.

6.2 Synchronous Structure Preliminaries

The semantics of MRICDF is explained in terms of synchronous structures. The preliminaries of synchronous structures are given in [15]. Here we review some of the definitions that are required in explaining the semantics of MRICDF.

Definition 1 (Event, signals). An occurrence of a value is called an event. We denote by $\Xi$ the set of all events that can occur in a synchronous system. A signal is a totally ordered set of events. If 'a' is a signal in the MRICDF formalism, the events on 'a' are denoted by $E(a)$.

The occurrence of events on signals can be defined in terms of causality relations. Here we define three relations on events, namely precedence, equivalence and preorder.

Definition 2 (Precedence, equivalence, preorder). We define a precedence relation $\prec$ on events such that, for all $a, b \in \Xi$, $a \prec b$ if and only if the event a occurs before b. An equivalence relation $\sim$ exists between events $a, b \in \Xi$ if and only if $a \not\prec b$ and $b \not\prec a$. This means that neither event happens before the other, or in other words both events occur simultaneously; they are also called synchronous events. A relation $\preceq$ on events $a, b \in \Xi$ defines a preorder, such that $a \preceq b$ if and only if a's occurrence is a prerequisite for the occurrence of b, or if they are synchronous. Here event b occurs after a or together with a.

With these relations on events in place, we can see that for two events $a, b \in \Xi$, $a \prec b$ means that $a \preceq b$ and $a \not\sim b$. From a code synthesis perspective, a data dependence is a binary relation on events of precedence type. We denote data dependence by '$\leadsto$'. So for any two events $a, b \in \Xi$, $a \leadsto b$ implies $a \preceq b$. In the absence of a global time reference, an instant or a point in time does not make sense by itself. That is why an instant is defined as a maximal set of events such that all events in that set are related by '$\sim$' to each other. In other words, all events that are synchronized with each other define an instant.

Definition 3 (Instant). Taking the quotient of the set of all events $\Xi$ with respect to $\sim$, we obtain equivalence classes, each of which is an instant. The set of instants is denoted by $\Theta = \Xi/\!\sim$.



Each set $S \in \Theta$ contains events with the property that for all $a, b \in S$, $a \sim b$. Since instants are equivalence classes, we can define a precedence order between them: for two sets $S, T \in \Theta$, $S \preceq T$ if and only if for all $(s, t) \in S \times T$ the relation $s \preceq t$ holds.

Definition 4 (Signal behavior). The behavior of a signal 'x' is an infinite sequence of totally ordered events. In other words, the behavior of signal 'x', written $\beta(x)$, is $E(x) \subseteq \Xi$ such that for any two events $a, b \in E(x)$, either $a \preceq b$ or $b \preceq a$ (total order).

One can also associate with a signal a a function $\sigma(a): \mathbb{N} \to E(a)$, where $\mathbb{N}$ is the set of natural numbers. Even though $\mathbb{N}$ is a countably infinite set, it does not represent a discrete global time line, but rather the sequence number of a value on signal a. Thus one can talk about the first event on a and the nth event of a, but the nth event of another signal b is not necessarily synchronized with that of a. Note that our definition of an instant eliminates the need for any 'absent' event, as opposed to the standard polychrony literature [13].

Definition 5 (Epoch). The epoch of a signal defines a possibly infinite set of instants at which the signal has events. Given a signal 'a', $I(a)$ (or $\hat{a}$) represents its epoch (or rate). Let $I(a) \subseteq \Theta$ denote the set of instants of 'a', or its epoch. Formally, $I(a) = \{\tau \mid \tau \in \Theta \wedge E(a) \cap \tau \neq \emptyset\}$.

Given two signals a and b, the intersection of $I(a)$ and $I(b)$ denotes those instants where both signals have events together, and the union of $I(a)$ and $I(b)$ denotes those instants where either of the two signals has an event.

Definition 6 (Synchronous signals). Two signals 'a' and 'b' are said to be synchronous if and only if $I(a) = I(b)$. In other words, the signals have unique events that are part of the same equivalence classes $S \in \Theta$.

For code synthesis, a partial order on events is required, obtained by relating the events occurring in a synchronous structure. For two events e, f, the relation $e \preceq f$ can hold for two reasons. One possibility is the existence of a data dependence between the events, i.e., $e \leadsto f$. The other possibility is the case where the two events are part of distinct instants, $e \in S$, $f \in T$ with $S, T \in \Theta$ and $S \preceq T$.

6.3 MRICDF Actor Network Model

Multi-Rate Instantaneous Channel-connected Data Flow (MRICDF) is a data flow network model consisting of several actors in a network communicating with each other via instantaneous channels [16]. Each actor in the network contains input and output ports and a specified computation that is performed on the input signals. Events on different signals may occur at different instants, and there is no external global time reference. As a result, signals may or may not be synchronized with each other. Synchronization information between events may either be implicit in the model or explicitly added as part of the model as exogenous constraints.

Fig. 6.2 An MRICDF network model (four actors connected by the signals x, y, u, v, z and w)

Actors can be of two types: primitive actors or hierarchically composed composite actors. All actors compute their reaction to a trigger condition in an instant. The communication between actors using an instantaneous channel is assumed to be completed within an instant. Figure 6.2 is a simple MRICDF model consisting of four actors which communicate through their input and output signals. The MRICDF model here has signals x, y as inputs and w as output. The intermediate signals are u, v and z. This entire network can again be used as an actor in another MRICDF model; thus hierarchical specification is enabled. What triggers an actor to react (or compute) is not fixed a priori, and is computed using epoch analysis. These trigger conditions are derived from the model along with some epoch constraints imposed by the modeler. If they can be derived for all actors in the model, then software synthesis is possible. If not, the user has to add extra constraints to help determine the triggering conditions. A triggering event at any actor can spark a series of computations, with event generation at intermediate signals and output interfaces throughout the network.

6.3.1 Primitive and Composite Actors for MRICDF

An actor is structurally represented by $A = \langle T, I, O, N, G \rangle$, where I and O are, respectively, the sets of input ports and output ports. T is the type of the actor, which is either one of the primitive actor types ($B$, $S$, $M$, $F(n,m)$), discussed below, or a composite actor type $T_c$. N is the set of internal actors of the actor A, where each $n \in N$ is of some type $t \in T$. G denotes an instantaneous-channel connection graph between the internal actors $n \in N$ and the input and output ports (I, O) of the actor A. Primitive actors do not have any internal actors, hence for primitive actors N is empty. So a primitive actor can be represented as $A = \langle T_p, I, O, \emptyset, G \rangle$, where $T_p \in \{B, S, M, F(n,m)\}$. The four types of primitive actors are described below:



Fig. 6.3 MRICDF primitive operators: Buffer, Sampler, Prioritized Merge and Function F(n,m) [16]

1. A Buffer node B is a node which has a single input port and a single output port. An event arriving into a buffer node is consumed and stored in the buffer node, and the event that was consumed before this particular event is emitted on the output port. However, for the very first input event, a pre-specified default value is emitted. An input event and the corresponding output event belong to the same instant. A pictorial visualization of such a primitive node is in Fig. 6.3. If x is an input signal to a B type actor, and y is its output, then $\sigma(y)(i) \sim \sigma(x)(i)$ and $\sigma(y)(i) = \sigma(x)(i-1)$ (as values) for all $i \in \mathbb{N}$, $i \neq 1$, and $\sigma(y)(1)$ is a default value.

2. A Sampler node S has two input ports and one output port. The second input port must always have Boolean valued events. If the first input has an event on it, and if the second input has a 'true' valued Boolean event belonging to the same instant, then the first input event is emitted on the output port. If events appear on its first input port, and no event or a 'false' Boolean input appears on its second input port at that instant, then no output is emitted. However, even the 'false' Boolean valued event is consumed and discarded. Whenever the S node produces an output event, it belongs to the same instant as the two input events it consumes. A visualization of a sampler node is in Fig. 6.3. If x is the first input signal, c the second input and y the output signal, then the following must be true: for all $i \in \mathbb{N}$, if there exist $j \in \mathbb{N}$ and $T \in \Theta$ such that $\sigma(x)(i), \sigma(c)(j) \in T$ and $\sigma(c)(j)$ carries the Boolean value 'true', then there exists $l \in \mathbb{N}$ such that $\sigma(y)(l) \in T$ and the values of $\sigma(y)(l)$ and $\sigma(x)(i)$ are the same. If $\sigma(c)(j)$ carries the value 'false', or if $\sigma(x)(i) \in T$ but $\sigma(c)(j) \notin T$ for any j, then no $l \in \mathbb{N}$ exists such that $\sigma(y)(l) \sim \sigma(x)(i)$. Also, if $\sigma(x)(i) \notin T$ for any i and $\sigma(c)(j) \in T$, there is no $l \in \mathbb{N}$ such that $\sigma(c)(j) \sim \sigma(y)(l)$.

3. A Prioritized merge node M also has two input ports and one output port. If an input event appears on the first input port, then it is emitted on the output port irrespective of whether the second input port has any event appearing on it. However, in the absence of an event at the first input port, if the second input port has an event on it, then that event is emitted on the output port. The output event belongs to the same instant as the input event that is responsible for the output event. A visualization of a merge node is in Fig. 6.3. Let x and y be the input signals and z the output signal; then the following must be true: if there exist $i, j \in \mathbb{N}$ such that $\sigma(x)(i), \sigma(y)(j) \in S$ for some $S \in \Theta$, then there exists $l \in \mathbb{N}$ such that $\sigma(z)(l) \sim \sigma(x)(i)$ and they carry the same value. If for some $j \in \mathbb{N}$, $\sigma(y)(j) \in S$ for some $S \in \Theta$, and there is no $i \in \mathbb{N}$ such that $\sigma(x)(i) \in S$, then there exists $l \in \mathbb{N}$ such that $\sigma(z)(l) \sim \sigma(y)(j)$, where $\sigma(z)(l)$ and $\sigma(y)(j)$ carry the same value.

4. A Function node $F(n,m)$ of input arity n and output arity m computes the function F on n events, one arriving on each of its input ports, and produces m output events, one on each of the m output ports. Graphically, one can depict a function node F as a circle with n arrows coming into it and m arrows going out of it (see Fig. 6.3). Let $x_1, x_2, \ldots, x_n$ be the input signals and $y_1, y_2, \ldots, y_m$ the output signals; then the following must hold: for each $i \in \mathbb{N}$ and instant $S \in \Theta$, if $\sigma(x_j)(i) \in S$ for all $j \in \{1, \ldots, n\}$, then $\sigma(y_l)(i) \in S$ for all $l \in \{1, \ldots, m\}$, and the vector of values carried by $\langle \sigma(y_1)(i), \sigma(y_2)(i), \ldots, \sigma(y_m)(i) \rangle$ is obtained from $F(\sigma(x_1)(i), \sigma(x_2)(i), \ldots, \sigma(x_n)(i))$.

Table 6.1 MRICDF operators and their epoch relations

    MRICDF actor       MRICDF expression                           Epoch relation
    Function F(n,m)    $(o_1, \ldots, o_m) = F(i_1, \ldots, i_n)$  $\hat{o}_1 = \cdots = \hat{o}_m = \hat{i}_1 = \cdots = \hat{i}_n$
    Buffer B           $o = B(i)$                                  $\hat{o} = \hat{i}$
    Sampler S          $o_1 = S(i_1, i_2)$                         $\hat{o}_1 = \hat{i}_1 \cap \widehat{[i_2]}$
    Merge M            $o_1 = M(i_1, i_2)$                         $\hat{o}_1 = \hat{i}_1 \cup \hat{i}_2$

Table 6.1 summarizes the four MRICDF primitive actors and their epoch relations. If the input port names of a Function F(n,m) node are $i_j$ for $j = 1..n$, and the output port names are $o_k$ for $k = 1..m$, then usage of the F(n,m) node implies that the events on these signals occur at the same instants. The epoch relation is $\hat{i}_1 = \hat{i}_2 = \cdots = \hat{i}_n = \hat{o}_1 = \hat{o}_2 = \cdots = \hat{o}_m$. If i is the input port name of a Buffer primitive node B and o is its output port, then it is implied that $\hat{i} = \hat{o}$. If $i_1$ and $i_2$ are the input port names of a Sampler node and $o_1$ is its output port name, then it is implied that $\hat{o}_1 = \hat{i}_1 \cap \widehat{[i_2]}$. Here $\widehat{[x]}$ denotes the set of instants at which the port x has an event occurrence carrying the Boolean value 'true'. So for a Boolean variable/signal one can write $\hat{x} = \widehat{[x]} \cup \widehat{[\lnot x]}$ and $\widehat{[x]} \cap \widehat{[\lnot x]} = \emptyset$. If $i_1$ and $i_2$ are the input port names of a Prioritized Merge node and $o_1$ is its output port name, then it is implied that $\hat{o}_1 = \hat{i}_1 \cup \hat{i}_2$.
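These value and epoch semantics can be prototyped directly. The Python sketch below is illustrative only: signals are encoded as maps from an instant tag to a value (the global tags exist solely to make the sketch executable, since MRICDF itself assumes no global time line), and the Function actor is restricted to a single output for brevity.

    # A signal is a dict {instant tag -> value}; a missing tag means no event.

    def function(f, *inputs):
        """Function actor F(n,1): all inputs and the output share one epoch."""
        tags = set(inputs[0])
        assert all(set(s) == tags for s in inputs), "inputs must be synchronous"
        return {t: f(*(s[t] for s in inputs)) for t in tags}

    def buffer(x, default):
        """Buffer actor B: emits the previously consumed value, default first."""
        out, prev = {}, default
        for t in sorted(x):
            out[t], prev = prev, x[t]
        return out

    def sampler(x, c):
        """Sampler actor S: emits x where c has a 'true' event at the same instant."""
        return {t: v for t, v in x.items() if c.get(t) is True}

    def merge(x, y):
        """Prioritized merge M: x wins whenever both inputs have an event."""
        out = dict(y)
        out.update(x)
        return out

    def epoch(s):
        """Epoch of a signal: the set of instants at which it has events."""
        return set(s)

    # Epoch relations of Table 6.1 on a small example:
    x = {1: 10, 3: 30, 4: 40}
    c = {1: True, 2: True, 3: False, 4: True}
    assert epoch(function(lambda a, b: a + b, x, x)) == epoch(x)
    assert epoch(buffer(x, 0)) == epoch(x)
    assert epoch(sampler(x, c)) == epoch(x) & {t for t, v in c.items() if v}
    assert epoch(merge(x, c)) == epoch(x) | epoch(c)

The four assertions restate the Function, Buffer, Sampler and Merge rows of Table 6.1 on a small example.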

6.4 Embedded Software Synthesis from MRICDF

The implementation of MRICDF models as embedded software involves a detailed analysis of the model and several transformation steps to reach the code generation phase. A correct-by-construction software synthesis methodology is put in place to form a total order on all the events occurring in an MRICDF model. Obtaining a unique order of execution is important for sequential software synthesis. Apart from these transformation steps, issues like deadlock detection have to be handled as well. We discuss the implementation issues of MRICDF models first, and then explain how the Epoch Analysis and master trigger identification steps are used to resolve them.

6.4.1 Issues in the Transformation from MRICDF Models to Embedded Software

Instantaneous reaction and communication are part of the abstract semantics of MRICDF. When software is synthesized, an instant has to be mapped to an interval of time. There may be structural cycles in the MRICDF network, which means one has to determine possible deadlocks during the real software execution. Since different instants will be mapped to different periods of real time, the only deadlock possibilities are cycles within an instant. So, given the data dependence and precedence relations, verification has to be done to check for any cyclic dependencies within an instant:

- Recall that $e \sim f \Leftrightarrow ((e \preceq f) \wedge (f \preceq e))$. Also, $e \leadsto f$ could imply $e \preceq f$ and $f \leadsto e$ could imply $f \preceq e$. This means that two events in the same instant may have a cyclic dependency, which may result in a deadlock while executing the software that mimics that instant. So one has to take the transitive closure of the $\leadsto$ relation and check for such cyclic dependencies. However, any other cyclic dependency that spans two different instants is not a deadlock.

- Mapping the abstract instant onto real-world time periods requires a special signal which has events in each equivalence class of $\Theta$. This special signal, called the master trigger, needs to be found for a possible sequential implementation of an MRICDF model. The existence or non-existence of a master trigger, proofs for identifying a signal as a master trigger, and how to use it in the conversion into embedded software will be explained in later sections.

- The event set $\Xi$ is an infinite set. Even its quotient, the set of instants $\Theta$ with respect to $\sim$, is an infinite set. To generate software that is supposed to mimic the instants in their right precedence order, the synthesis tool must find a way to analyze the MRICDF specification in finite time. Even if the two previously described issues are resolved, there could be a case where no deterministic sequential implementation is possible from the given MRICDF model. This might indicate that more constraints on the components or signals must be provided (exogenous constraints) for code synthesis to be possible.

For synthesizing sequential software from MRICDF models, a total order has to be achieved on $\Theta$ through the precedence relation $\preceq$. The software executes in rounds, each round corresponding to one $S_i \in \Theta$. Within each $S_i$, there will be data dependences between events implied by $\leadsto$. If $\preceq$ is only a partial order, the specification does not have enough information for a unique sequential implementation, and hence the synthesized software will exhibit one of the possible behaviors of the specification. To obtain a unique behavior, there needs to be a master trigger signal having events in every instant of the MRICDF model.

Definition 7 (Master trigger [16]). Let $M$ be an MRICDF model with $\Theta$ denoting the instants of the model. Let 't' be a signal with $E(t)$ as its set of events, with the property that for each $S \in \Theta$ there exists an event $e_S \in E(t)$ such that $e_S \in S$, and there is no $S \in \Theta$ which has a causal cycle with respect to the relation $\leadsto$. Then 't' is termed the master trigger for $M$, and $M$ is said to be sequentially implementable.

Informally, the signal t can be used by the software implemented from MRICDF as a master trigger to indicate each round. If a master trigger 't' exists, then $E(t)$ is totally ordered, and it is implied that $\Theta$ is totally ordered with respect to $\preceq$. Sequential implementability is analogous to the 'endochrony' concept in the SIGNAL literature [17]. An endochronous SIGNAL program is one which has a deterministic sequential schedule, or whose schedule can be statically computed. Multi-threaded implementability could be defined from the events occurring simultaneously, but here we limit the discussion to sequential code generation. Note that the existence of a master trigger is a necessary, but not a sufficient, condition for sequential implementability. If no master trigger can be identified, exogenous constraints will be applied to the MRICDF model to form a master trigger.

6.4.2 Epoch Analysis: Determining Sequential Implementability of MRICDF Models

In this section, we explain how the Epoch Analysis technique solves the three major implementation issues described in the previous section. Epoch Analysis involves a deadlock detection technique, a master trigger identification test on the MRICDF signals and, finally, an optional 'endochronization' process. The endochronization step is taken when the master trigger identification test fails and an external master trigger needs to be inserted.

6.4.2.1 Deadlock Detection

In an actor network model, the generation of tokens (or events) happens at input ports and at Buffer output ports, where a stored event occurs. A deadlock situation can arise only when a computation loop lies entirely within an instant, that is, when there is no Buffer actor within the loop. In Fig. 6.4, an MRICDF network with a deadlock condition is shown: the computation of the output signal t depends on the current value of the same signal. Once the MRICDF network is translated into epoch equations, a value chain can be formed to identify deadlock situations. For the network in Fig. 6.4, the epoch equations are $\hat{x} = \hat{b}$, $\hat{y} = \hat{t} \cap \widehat{[a]}$, $\hat{z} = \hat{x} \cap \widehat{[y]}$ and $\hat{t} = \hat{z} \cap \widehat{[c]}$. Had any of the actors within the loop been a Buffer actor, the loop would have been broken, since that data dependence is between instants.

Page 200: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that


Fig. 6.4 MRICDF model with deadlock
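The check described above amounts to cycle detection over the graph of instantaneous dependencies read off the epoch equations. The following Python sketch is one simplified rendering, with the dependency map written by hand for the network of Fig. 6.4.

    from collections import defaultdict

    def has_instantaneous_cycle(deps):
        """DFS cycle detection over the instantaneous-dependency graph.
        deps maps a signal to the signals it depends on within the same instant;
        a Buffer output contributes no edge, since its dependence refers to a
        previous instant."""
        WHITE, GREY, BLACK = 0, 1, 2
        colour = defaultdict(int)

        def visit(n):
            colour[n] = GREY
            for m in deps.get(n, ()):
                if colour[m] == GREY or (colour[m] == WHITE and visit(m)):
                    return True          # back edge: cycle within one instant
            colour[n] = BLACK
            return False

        return any(colour[n] == WHITE and visit(n) for n in list(deps))

    # Dependencies read off the epoch equations of Fig. 6.4 (Samplers and F):
    deps = {"x": ["b"], "y": ["t", "a"], "z": ["x", "y"], "t": ["z", "c"]}
    assert has_instantaneous_cycle(deps)        # y -> t -> z -> y deadlocks

    # If the actor producing y is replaced by a Buffer on t, the instantaneous
    # edge from y to t disappears and the deadlock is gone.
    deps_broken = dict(deps, y=[])
    assert not has_instantaneous_cycle(deps_broken)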

6.4.2.2 Master Trigger Identification

From the definition of master trigger / sequential implementability, for deterministic sequential software synthesis from MRICDF a mapping from the sequence of rounds to software code has to be performed. It is required that all events of $S_i$ occur during the execution of the software before any event of $S_j$, where $S_i, S_j \in \Theta$ and $S_i$ precedes $S_j$. The synthesized sequential software must execute in rounds, each round corresponding to one $S_i \in \Theta$. Note that events in $S_i$ may have data dependences between them; hence events within a single $S_i$ must also execute in a way that preserves the order implied by $\leadsto$.

To identify a master trigger, which has events in all instants of an MRICDF model, the epoch equations of the MRICDF model are analyzed in terms of Boolean equations. For a Boolean signal x, $[\lnot x]$ and $[x]$ represent signals which have an event at the same instant as x with the Boolean value 'false' and 'true', respectively. The epoch relations between these signals are as follows: $\hat{x} = \widehat{[\lnot x]} \cup \widehat{[x]}$ and $\widehat{[\lnot x]} \cap \widehat{[x]} = \emptyset$. If a signal a has an event that belongs to an instant $T \in \Theta$, then in the Boolean domain $b_a = \mathit{true}$, else $b_a = \mathit{false}$. When Boolean equations are constructed using Boolean variables for each type of epoch relation, three types of Boolean equations can be obtained. For $\hat{x} = \hat{y}$, the Boolean equation becomes $b_x = b_y$. If $\hat{x} = \hat{y} \cup \hat{z}$, then $b_x = b_y \vee b_z$, and finally for $\hat{x} = \hat{y} \cap \hat{z}$ the Boolean equation is $b_x = b_y \wedge b_z$. For a Boolean signal c, there exist two additional Boolean equations, $b_c = b_{[c]} \vee b_{[\lnot c]}$ and $b_{[c]} \wedge b_{[\lnot c]} = \mathit{false}$. For signals of other types, such as integer, float, etc., their values are ignored, and in the Boolean domain they are represented by separate variables which denote their presence and absence.

From such a system of Boolean equations, if there exists a variable $b_x$ such that setting $b_x = \mathit{false}$ forces all variables in the equation system to be false for the equations to be satisfied, then its corresponding signal x is the master trigger. If no such $b_x$ exists, then there is no master trigger signal, and hence the given MRICDF model is not sequentially implementable.

Theorem 1 (Master Trigger Identification [16]). Given an MRICDF model $M$, let $B_M$ denote the system of Boolean equations obtained by converting each signal in terms of its Boolean variables. Then a signal x in $M$ is a master trigger if and only if there exists an input signal x in the model $M$ such that its corresponding Boolean variable $b_x$ in $B_M$ has the property that if $b_x$ is false, every other variable is false.

Proof sketch. Recall that a master trigger is one which has one event in every instant of the set of abstract instants $\Theta$. The instants are constructed by partitioning the union of events from all signals. Thus if x has to have an event in every instant, its instant set must be a superset of all instants in the model. In other words, every signal's instant set must be a subset of its instant set. By definition, $b_x$ should then be implied by every variable $b_y$, for all signals y in the system. $b_x$ encodes the presence of an event of x in an arbitrary instant T, and if $b_y \Rightarrow b_x$, then if y has an event in T, so does x. So if the solution set of this equation system implies that for any signal y in $M$, $b_y \Rightarrow b_x$, then when we set $b_x$ to false, all left-hand sides of those implications have to be set to false. $\square$

Note that at this point we are considering MRICDF models which are sequentially implementable with a master trigger signal available from the environment. There are cases where an internal signal has the highest epoch in the system, which is called oversampling in the SIGNAL language [18]. We do not consider models with oversampling as sequentially implementable at this point. Instead we add external information about input signals to create a master trigger.

6.4.2.3 Exogenous Constraints for Code Synthesis

For a master trigger signal to exist, there should be a signal which has events in all instants of the MRICDF network. For a simple Sampler actor $C = \mathrm{Sampler}(A, X)$, where $B = [X]$, the epoch equation is $\hat{C} = \hat{A} \cap \hat{B}$. If the signals A and B are independent, it is possible that no master trigger exists for an MRICDF network with this single actor. Let $a \in A$ be an event belonging to an instant where there is no $b \in B$. Similarly, there could be an event $b \in B$ in an instant where there is no $a \in A$. Since C can have events only when both A and B have events, there are several instants of A and B that are not instants of C. In short, there is no signal which has events in all instants of the network; hence there is no master trigger.

In MRICDF models with no master trigger, exogenous rate constraints have to be applied to construct an external one. For the Sampler actor, the exogenous constraints are $\hat{x} = \hat{y}$, $\hat{a} = \widehat{[x]}$ and $\hat{b} = \widehat{[y]}$. Now the set of Boolean equations for the actor is as follows: $b_x = b_y$, $b_a = b_{[x]}$, $b_b = b_{[y]}$, $b_x = b_{[x]} \vee b_{[\lnot x]}$, $b_y = b_{[y]} \vee b_{[\lnot y]}$ and $b_c = b_a \wedge b_b$. In this system of Boolean equations, we can observe that the external signal x (or y) can become the master trigger, since when the Boolean variable $b_x$ (or $b_y$) is set to false, all the variables become false. Now that a master trigger has been identified, a schedule for computing the signals in the network has to be constructed, which is called its follower set. To obtain the follower set from the set of Boolean equations, set the master trigger signal to true, simplify the Boolean equations and perform the master trigger identification process again. Repeated execution of these steps forms the follower set for the whole network, which is the order of execution.
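The test of Theorem 1 and the first step of the follower-set construction can be prototyped by brute force on this small system. The Python sketch below is illustrative only: the names bxt, bxf, byt and byf are shorthands introduced here for $b_{[x]}$, $b_{[\lnot x]}$, $b_{[y]}$ and $b_{[\lnot y]}$, and the exhaustive enumeration merely stands in for the prime-implicate-based analysis used by the actual synthesis tool.

    from itertools import product

    VARS = ["bx", "by", "bxt", "bxf", "byt", "byf", "ba", "bb", "bc"]

    def holds(e):
        """The Boolean equations listed above for the constrained Sampler."""
        return (e["bx"] == e["by"]
                and e["ba"] == e["bxt"]
                and e["bb"] == e["byt"]
                and e["bx"] == (e["bxt"] or e["bxf"])
                and e["by"] == (e["byt"] or e["byf"])
                and e["bc"] == (e["ba"] and e["bb"]))

    def models():
        """All satisfying assignments of the equation system (brute force)."""
        for bits in product([False, True], repeat=len(VARS)):
            env = dict(zip(VARS, bits))
            if holds(env):
                yield env

    def is_master_trigger(v):
        """Theorem 1: v is a master trigger iff every model with v false
        has all variables false."""
        return all(not any(env.values()) for env in models() if not env[v])

    assert is_master_trigger("bx") and is_master_trigger("by")
    assert not is_master_trigger("bc")   # c misses the instants where x is false
    # First simplification step of the follower-set construction: once the
    # trigger bx is asserted true, by is forced true in every remaining model.
    assert all(env["by"] for env in models() if env["bx"])

The last assertion corresponds to setting the master trigger to true and simplifying: $b_y$ is the first variable forced by the trigger, and repeating the procedure on the simplified system yields the rest of the follower set.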



6.4.3 Case Study of Producer–Consumer Problem

The Producer–Consumer problem, or the bounded-buffer problem, is a computer science example used to demonstrate synchronization problems between multiple processes. We select this example due to its inherent multi-rate nature and to demonstrate the hierarchical specification feature of MRICDF. Producer and consumer have separate rates of operation, and they must be synchronized for reading from and writing to a single-stage buffer. Pseudo code representing the model is shown in Listing 6.1. The SIGNAL representation of this model was given in [17].

The Producer and Consumermodels are represented as individual compositeactors connected to another actor PCmain which synchronizes them and also takescare of the storage of data being sent. In the MRICDF representation of the exampleshown in Fig. 6.5, the PCmain is separated into multiple composite actors, Celland Activate. Figure 6.6 shows the MRICDF composite actor representing theProducer. Producer actor performs a counter function and its value is being

Listing 6.1 Pseudo code for Producer–Consumer model

Actor Producer(input boolean ptick; output integer dvalue) =
 (| counter := sampler((prev_count + 1) mod 7, ptick)
  | prev_count := Buffer(counter, 0) |)

Actor Consumer(input boolean ctick, u; integer bvalue;
               output integer dread; boolean v) =
 (| v := PriorityMerge(Sample(true, u), false)
  | dread := PriorityMerge(Sample(bvalue, v), -1) |)

Actor PCmain(input boolean p, c; output integer pcdata, boolean pcvalid) =
 (| d := Producer(p)
  | b := PriorityMerge(c, Buffer(d, 0))
  | pc := PriorityMerge(Buffer(p, false), false)
  | (pcdata, pcvalid) = Consumer(c, pc, b) |)

Fig. 6.5 Producer–Consumer example (block diagram: Producer, Cell, Activate, and Consumer actors connected through the signals p, c, d, b, pc, pcdata, and pcvalid)


Fig. 6.6 Producer Actor model

Fig. 6.7 Cell Actor model

Fig. 6.8 Activate Actor model

sent to the output on every input trigger represented by ptick. So for every event in the signal ptick, the Producer emits a value at the port dvalue.

The PCmain actor in the pseudocode combines both synchronization tasks along with the storage of the generated value. These tasks are separated and shown in Figs. 6.7 and 6.8. The Consumer actor is triggered by events arriving at the signal ctick. The aliases for the trigger signals of the Producer and Consumer actors are p and c respectively. The Activate MRICDF actor shown in Fig. 6.8 accepts these signals and produces a combined trigger for synchronizing all actors. This trigger signal pc denotes the instants where there is either a production or a consumption of data from the buffer. In the Consumer actor shown in Fig. 6.9, pc is used to indicate whether valid data is present or absent in the buffer. The consumed data is obtained through the port b, which is connected to the storage actor Cell. Figure 6.7 shows the structure of this single data storage block. For every Consumer tick c, either the newly generated value d from the Producer or the value stored in the Buffer is passed onto the output port b. There is a loop with a Buffer primitive actor which refreshes the previously stored value.
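As a small illustration of the Cell behavior just described, the following is a minimal C sketch of one consumer round (the function name and present-flag convention are ours, not EmCodeSyn output):

#include <stdbool.h>

/* Single-stage storage cell: on a consumer tick, emit the freshly produced
 * value d if one is present in this instant, otherwise the stored value;
 * the Buffer in the feedback loop then refreshes the stored value.        */
static int stored = 0;               /* initial Buffer value               */

int cell_on_consumer_tick(bool d_present, int d)
{
    int b = d_present ? d : stored;  /* Merge: the new value has priority  */
    stored = b;                      /* Buffer: refresh the stored value   */
    return b;                        /* value handed to the Consumer (b)   */
}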

Prior to the code synthesis stage, the master trigger has to be identified and a follower set has to be built. In this model, the master trigger has to be enforced


Fig. 6.9 Consumer Actor model (c = ctick, pc = u, b = bvalue, v = pcvalid, dread = pcdata)

externally, since p and c are independent signals. The follower set will be decided on the basis of the values present at these ports. Once a unique sequential schedule is decided in the form of a follower set, code generation can be performed. More information about the producer–consumer model and other examples implemented using MRICDF is available in [19].

6.5 Visual Framework for Code Synthesis from MRICDF Specification

One of the goals of MRICDF is to popularize the use of the polychronous formalism among engineers in the embedded software field. Embedded Code Synthesis, or EmCodeSyn, is a software synthesis tool which provides an environment to specify MRICDF models, together with utilities for debugging and for code generation from the specification [20]. In this section, we explain the design methodology of the EmCodeSyn tool and use a case study to show each step in achieving code synthesis from an MRICDF specification.

The design methodology of EmCodeSyn is shown in Fig. 6.10. The MRICDF model specified using the GUI is stored in a Network Information File (NIF). This file contains information about the types of actors, their interconnections, the computation to be performed, and other GUI information such as the placement of actors, lines drawn, etc. Epoch Analysis is then performed on the model; it breaks down the composite actors in terms of primitive actors and translates the model into epoch equations and the corresponding Boolean equations, as explained in Sect. 6.4.2. Next, master trigger identification tests based on Theorem 1 are performed to compute a sequential schedule for computation called the follower set. If a master trigger cannot be found within the system, exogenous rate constraints have to be provided to force a master trigger on the MRICDF model. Once a follower set is established for the MRICDF model, code generation can begin. Section 6.4.2 described how


Fig. 6.10 EmCodeSyn design methodology (MRICDF specification → NIF → Epoch Analysis → epoch and Boolean equations → master trigger identification → follower set generation → code generation into Main, Header, and Function Definition files, with exogenous rate constraints supplied when needed)

to perform a test for identifying a master trigger in an MRICDF model and how to provide exogenous information to force one in the absence of a master trigger. In this section, we explain how a prime implicate based strategy is used to identify a master trigger from a system of Boolean equations.

6.5.1 Prime Implicates for Master Trigger Computation

An implicate is defined as a sum term that covers maxterms of a function; by covering a maxterm we mean that the sum term can be simplified into the individual maxterms. A prime implicate is a sum term that is not covered by another implicate of the function. Jackson and Pais [21] survey some of the prime implicate computation algorithms in the literature. Research in artificial intelligence and verification tools has contributed to the development of better algorithms and tools to compute Prime Implicates (PI). De Kleer [22] used a specialized data structure called tries to improve the performance of prime implicate algorithms. These PI generation algorithms provide us with solutions to a set of Boolean clauses. In EmCodeSyn, Boolean equations for the signals in the network are generated. From this system of Boolean equations, SAT equations in Conjunctive Normal Form (CNF) can be formed. Once a PI generator has been used to solve these equations, the result is an array of solutions in their reduced form. If any of the prime implicates is a single positive literal, that particular signal is the master trigger. The selection of the master trigger helps in identifying the order of execution of actors, called the follower set of the model. This is computed by setting the current master trigger value to true and repeating the PI generation process. The next set of PIs are the candidates for the follower set following the previous one, and so on. The prime implicate generator we use in our software tool was developed by Matusiwicz et al. [23].
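The special case that matters for master trigger identification is whether a single positive literal is an implicate. The following brute-force sketch (our own illustration, not the trie-based generator of [23]) checks this for the Buffer actor's Boolean system from Listing 6.2: a single literal is an implicate exactly when the formula has no satisfying assignment in which that literal is false, and for a satisfiable system such a single-literal implicate is automatically prime.

#include <stdio.h>

/* Boolean system of the Buffer actor (Listing 6.2): a = b, plus the
 * constraint a + b = true that rules out the all-false solution.     */
static int formula(int a, int b) { return (a == b) && (a || b); }

int main(void) {
    const char *name[2] = {"a", "b"};
    for (int lit = 0; lit < 2; lit++) {
        int implicate = 1;
        for (int a = 0; a <= 1 && implicate; a++)
            for (int b = 0; b <= 1 && implicate; b++)
                /* a satisfying assignment with the literal false is a
                 * counter-example to "formula implies literal"        */
                if (formula(a, b) && ((lit == 0) ? !a : !b))
                    implicate = 0;
        if (implicate)
            printf("{%s} is a single positive literal prime implicate\n",
                   name[lit]);
    }
    return 0;
}

Both {a} and {b} are reported, as in Listing 6.2; since a is an output port, b is the one retained as the master trigger.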


6.5.2 Sequential Implementability of Multi-rate Specification Using Prime Implicates

Similar to the endochrony property in the SIGNAL/Polychrony framework, an equivalent property of sequential implementability exists in the MRICDF formalism, which guarantees that the scheduling of computation based on input events can be inferred correctly. In other words, the order in which inputs are read is known from the program itself. For deterministic code synthesis, either the MRICDF design has to be endochronous or exogenous constraints have to be applied to force sequential implementability of the design. Here we show how a PI based strategy is used to identify the master trigger for MRICDF primitive actors. Also, in the absence of a master trigger, we demonstrate how exogenous constraints are applied using the prime implicate method to force sequential implementability.

6.5.2.1 Endochronous Primitive Actors: Buffer and Function

The Buffer actor has the simplest of epoch equations, where the input and output rates are equated. The first column of Listing 6.2 shows the epoch equations based on Table 6.1 and the resulting SAT equations for PI generation. The union and intersection operations are represented by + and * respectively. A trivial solution exists in which a 'false' value for all variables would result in the whole system giving a false output, as required by Theorem 1. To eliminate this solution, we add the constraint a + b = true. Each variable (a, b) is assigned a number in CNF format (1, 2) and a two variable system of SAT equations is formed. The SAT equations in each line

Listing 6.2 Epoch and Boolean equations for endochronous actors Buffer and Function

MRICDF textual representation        MRICDF textual representation
-----------------------------        -----------------------------
a = Buffer b                         a = b Function c

Epoch Equations                      Epoch Equations
---------------                      ---------------
a = b, a+b = true                    c = a, b = a, a+b+c = true

Boolean Equations                    Boolean Equations
-----------------                    -----------------
1: a, 2: b                           1: b, 2: a, 3: c

psat 2                               psat 3
(* (                                 (* (
  +( *(1 2) *(-1 -2) )                 +( *(1 2) *(-1 -2) )
  +(1 2)                               +( *(3 2) *(-3 -2) )
) )                                    +(1 2 3)
                                     ) )

Prime Implicates: {a}, {b}           Prime Implicates: {a}, {b}, {c}
Master Trigger: {b}                  Master Trigger: {b}, {c}


correspond to the epoch equations shown just above them in Listing 6.2. For the external PI generator we used, the SAT equations are expressed in sum of products form. So +( *(1 2) *(-1 -2) ) represents {(a · b) + (¬a · ¬b)}. The SAT equations are used to find the prime implicates {a} and {b}. Two single positive literals are obtained from the PI generation, which satisfies the sequential implementability requirement. Since a is an output port, it is not a candidate for master trigger. Hence b is chosen as the master trigger for the Buffer actor.

The Function actor equates the rates of all of its input and output ports through its epoch equations. The second column of Listing 6.2 shows the epoch relations for a double input, single output Function actor. From the epoch equations, a 3 variable system of Boolean equations is formed. The PI generator result contains all possible single literals {a}, {b} and {c}. From this endochronous result, we remove the output port {a} and narrow down the choices to {b} or {c}. Since they are of equal rate, the order of execution can be chosen arbitrarily.

6.5.2.2 Non-endochronous Primitive Actors: Sampler and Merge

The Sampler actor performs an intersection operation on the rates of its two input signals to compute the rate of the output signal, as shown in Table 6.1. Listing 6.3 shows the epoch and Boolean equations of a Sampler actor with inputs b, c and output a. The

Listing 6.3 Epoch and Boolean equations for non-endochronous actor Sampler

MRICDF textual representation           Endochronized Epoch Equations
-----------------------------           -----------------------------
a = b Sampler c                         a = b inter [c]
                                        c = [c] union [-c]
Epoch Equations                         [c] inter [-c] = false
---------------                         x = y
a = b inter [c]                         b = [x]
c = [c] union [-c]                      [c] = [y]
[c] inter [-c] = false                  x = [x] union [-x]
a+b+c = true                            y = [y] union [-y]
                                        [x] inter [-x] = false
Boolean Equations                       [y] inter [-y] = false
-----------------                       a+b+c+x+y = true
1: a, 2: b, 3: c, 4: [c], 5: [-c]
                                        Boolean Equations
psat 5                                  -----------------
(* (                                    1: a, 2: b, 3: c, .., 10: [y], 11: [-y]
  +( *(1 *(2 4)) *(-1 +(-2 -4)) )
  +( *(3 +(4 5)) *(-3 *(-4 -5)) )       psat 11
  +(-4 -5)                              (* ( ...
  +(1 2 3)                              ) )
) )
                                        PI: {x}, {y}, {b, c}, ..
PI: {b, c}, {b, [c], [-c]}, ..          Master Trigger: {x}, {y}
Master Trigger: {}


true-valued Boolean input requires additional epoch equations to define the relation between the variables c, [c] and [-c]. A 5 variable system of Boolean equations is solved to produce the PIs, but from Listing 6.3 we can see that none of the PIs is a single positive literal. Hence there are no master trigger candidates. To find a master trigger, exogenous constraints are forced on the system using the new variables x and y, as shown in the second column of Listing 6.3. The variables x and y are equated, but [x] and [y] are independent of each other. x and y act as the exogenous information about the input signals. The inputs of the Sampler actor are equated to these variables as b = [x] and [c] = [y]. In this way, we can simulate all combinations of presence/absence of b and [c] occurring at the input. Once the PIs are generated, {x} or {y} can be chosen as the master trigger. In the subsequent follower set, either [x] is true or [-x] is true, and we can be sure whether to read b or not. Thus a master trigger and a follower set are established to avoid any ambiguity in the order of reading and writing signals for the Sampler actor.

The Merge actor performs the union operation on the input signals to produce the output signal, as shown in Table 6.1. Listing 6.4 shows the epoch and Boolean equations of a Merge actor with inputs b, c and output a. A 3 variable system of Boolean equations is solved using the prime implicate generator, identifying only the candidate {b, c}. Since there is no single positive literal as a candidate, the model is not endochronous. The endochronized version of the Merge actor is shown in the second column of Listing 6.4. In a similar manner as in the Sampler case, new variables x, y are introduced with accompanying epoch equations. A 9 variable system of Boolean equations is solved to identify the PIs, and the master trigger is chosen

Listing 6.4 Epoch and SAT equations for non-endochronous actor Merge

MRICDF textual representation           Endochronized Epoch Equations
-----------------------------           -----------------------------
a = b Merge c                           a = b union c
                                        x = y
Epoch Equations                         b = [x]
---------------                         c = [y]
a = b union c                           x = [x] union [-x]
a+b+c = true                            y = [y] union [-y]
                                        [x] inter [-x] = false
Boolean Equations                       [y] inter [-y] = false
-----------------                       a+b+c+x+y = true
1: a, 2: b, 3: c
                                        Boolean Equations
psat 3                                  -----------------
(* (                                    1: a, 2: b, .., 8: [y], 9: [-y]
  +( *(1 +(2 3)) *(-1 *(-2 -3)) )
  +(1 2 3)                              psat 9
) )                                     (* ( ...
                                        ) )
PI: {b, c}
Master Trigger: {}                      PI: {x}, {[x], [-x]}, {y}, {b, c}, ..
                                        Master Trigger: {x}, {y}


from {x}, {y}. The value of [x] or [y] will tell us the presence/absence of b and c. If both are present, the order of reading them can be arbitrary, since we are assured that the inputs will arrive. If only one is present, we can avoid waiting for the other signal. Effectively, we obtain a deterministic order for reading the signals.

6.5.3 Sequential Code Synthesis

Epoch Analysis analyzed the epochs of the individual signals in the network. A master trigger having events in all instants of the MRICDF model was identified from, or enforced on, the MRICDF model. The software is executed as rounds mapping the instants of the MRICDF model, with the computation within each round following the order given in the follower set. The computation within each instant is made up of primitive actors interacting with each other. At the code generation phase of EmCodeSyn, each primitive actor is represented in terms of its equivalent C code. Figure 6.11 shows the equivalent C code for the 4 endochronized primitive actors in isolation. Endochronization allows us to know the order of reading the input signals a and x for the Merge actor. The MRICDF network is built by connecting the signals using these pieces of C code according to the scheduling order specified in the follower set.

EmCodeSyn generates three C files for an MRICDF model: the Main file, which lists the computation according to the scheduling order obtained from the follower set; the Header file, which contains all the functions and primitives to be used; and the Function Definition file, which contains the code for each function that was declared. Note that the code generation stage is reached only after computation of the master trigger and the follower set. The template for the generated Main file is shown in Listing 6.5. Each iteration of the "while (1)" loop represents one round of computation, or an instant at the MRICDF abstraction level. The scheduling of actors within the loop will be

Fig. 6.11 MRICDF primitive actors represented in C code (b = Function(a > 0), w = Buffer(a), x = Sampler(a, b), y = Merge(a, x))
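Figure 6.11 is only available as a diagram; the following is our own minimal C rendering of what such per-actor fragments could look like, assuming a simple present-flag convention for signals. This is an illustration under those assumptions, not EmCodeSyn's generated code.

#include <stdbool.h>

/* A signal in one round: a value plus a flag telling whether it has an
 * event (is present) in the current instant.                           */
typedef struct { bool present; int value; } signal_t;

/* Function: b = Function(a > 0); b is present exactly when a is.       */
static signal_t function_gt0(signal_t a) {
    signal_t b = { a.present, a.present ? (a.value > 0) : 0 };
    return b;
}

/* Buffer: w = Buffer(a); emits the previously stored value and stores
 * the current one for the next instant.                                */
static signal_t buffer(signal_t a, int *state) {
    signal_t w = { a.present, *state };
    if (a.present) *state = a.value;
    return w;
}

/* Sampler: x = Sampler(a, b); present when a is present and the Boolean
 * signal b is present with value true.                                  */
static signal_t sampler(signal_t a, signal_t b) {
    signal_t x = { a.present && b.present && b.value, a.value };
    return x;
}

/* Merge: y = Merge(a, x); takes a when present, otherwise x.            */
static signal_t merge(signal_t a, signal_t x) {
    signal_t y = { a.present || x.present,
                   a.present ? a.value : x.value };
    return y;
}

int main(void) {                      /* one round chaining the actors   */
    int state = 0;
    signal_t a = { true, 5 };
    signal_t b = function_gt0(a);
    signal_t w = buffer(a, &state);
    signal_t x = sampler(a, b);
    signal_t y = merge(a, x);
    return (int)(y.present && w.present);
}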


Listing 6.5 Code generation format for Main C file

#include "synheader.h"
int main(void)
{
  // Select manual, random, or inputs from file
  while (1)
  {
    Read master trigger signal
    Compute follower set
    // Code for buffer, sampler, function, merge actors
    // Sends buffer's inputs to temporary values
    // Sends values of all the inputs and outputs
    finish computation of follower set
  }
  return 0;
}

according to the order in the follower set of the MRICDF network. Each iteration of the loop begins with the master trigger signal and schedules actors according to the nature of the input signals. Finally, the buffer values and the output values are updated.
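As a hand-written illustration of how the template of Listing 6.5 might be instantiated, the sketch below fills it in for the single Sampler actor of Sect. 6.4.2.3, with the exogenous pair x, y acting as the master trigger. Variable names and the input format are ours; the actual EmCodeSyn output differs in detail.

#include <stdio.h>

int main(void)
{
    int x, y;      /* master trigger pair: x and y are synchronous (x = y)   */
    int x_t, y_t;  /* values of [x] and [y]: is a present? is b present?     */
    int a, b;      /* the Sampler inputs; c is its output                    */

    while (1) {
        /* Read the master trigger signal. */
        if (scanf("%d", &x) != 1) break;
        y = x;
        if (!x) continue;                       /* no event in this instant  */

        /* Follower set: the values of [x] and [y] decide which inputs
         * have to be read in this round.                                 */
        if (scanf("%d %d", &x_t, &y_t) != 2) break;
        if (x_t && scanf("%d", &a) != 1) break;
        if (y_t && scanf("%d", &b) != 1) break;

        /* Sampler: c has an event only in instants where both a and b
         * are present; its value is then taken from a.                   */
        if (x_t && y_t)
            printf("c = %d\n", a);
    }
    return 0;
}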

To demonstrate the capabilities of EmCodeSyn, we take you through the process of designing an MRICDF model, creating its NIF, forming equations for Epoch Analysis, and finally performing code generation. The model used is a simplified version of a height supervisory unit (STARMAC) in a miniature aircraft. This will serve as a guide in following the EmCodeSyn design methodology.

6.5.4 Case Study on Implementation of an MRICDF Model Using EmCodeSyn

STARMAC is a multi-agent control project of a small Unmanned Aerial Vehicle (UAV) at Stanford University [24]. The model presented in this section is a simplified version of the height supervisory unit of the STARMAC UAV. The block diagram of the intended design is shown in Fig. 6.12. There is a supervisory phase which takes care of all computation and a monitoring phase which verifies whether the height coordinate of the UAV is above a minimum. The design requirement would need a parallel implementation as shown in Fig. 6.12a. A sequential implementation of the requirement can also perform the verification of the elevation coordinate if the monitoring phase is placed after the supervisory part. This particular case is shown in Fig. 6.12b.

The MRICDF model for the requirement is shown in Fig. 6.13. The height supervisory unit is represented in terms of two composite actors combined with a Merge actor. The composite actors are designed using the GUI of EmCodeSyn and saved in the actor library. They are reused in the top level model of the STARMAC design


Fig. 6.12 Concurrent design and sequential implementation of STARMAC unit (Supervisor and Monitor blocks with inputs Xin, Yin, Zin, Zref, Zm and outputs Xout, Yout, Zout; (a) design requirement, (b) actual implementation)

Fig. 6.13 MRICDF model in EmCodeSyn for the STARMAC example


along with the Merge primitive actor. The composite actors are shown separately in Fig. 6.14. The Function actors in Monitor and Supervisor implement the comparison operations F1: ZinM > Zm and F2: ZinS < ZrefS respectively.

During Epoch Analysis, the epoch equations of these actors are generated. The epoch equations of the STARMAC model are shown in Listing 6.6. The union and intersection operations are represented as '+' and '&' respectively. The epoch equations in lines 3

Fig. 6.14 MRICDF models of monitor and supervisor for STARMAC unit

Listing 6.6 Epoch equations for STARMAC model in EmCodeSyn

1  F2.o2 = F2.ZinS
2  F2.o2 = F2.ZrefS
3  S2.o3 = S2.i3 & [S2.i4]
4  S2.i4 = [S2.i4] + [-S2.i4]
5  true = -[S2.i4] + -[-S2.i4]
6  F1.o1 = F1.ZinM
7  F1.o1 = F1.ZrefM
8  S1.o4 = S1.i1 & [S1.i2]
9  S1.i2 = [S1.i2] + [-S1.i2]
10 true = -[S1.i2] + -[-S1.i2]
11 M3.ZoutM = M3.i6 + M3.Zm
12 M2.Zout = M2.ZoutS + M2.ZoutM
13 M1.ZoutS = M1.i5 + M1.ZrefS
14 F2.o2 = S2.i4
15 S2.o3 = M1.i5
16 F1.o1 = S1.i2
17 S1.o4 = M3.i6
18 M3.ZoutM = M2.ZoutM
19 M1.ZoutS = M2.ZoutS
20 S2.i3 = F2.ZinS
21 F1.ZinM = F2.ZinS
22 F1.ZinM = S1.i1
23 M1.i10 = F2.ZrefS


and 8 show the Sampler epoch equations of actors S1 and S2, while the Merge actors are represented in lines 11, 12 and 13. The Boolean signal S2.i4 is separated into a true-valued signal [S2.i4] and [-S2.i4] in lines 4 and 5. From an examination of the MRICDF epoch equations and from Fig. 6.14, it can be observed that the epochs of Zin, ZinS, ZinM, ZrefM and ZrefS are equated due to the Function epoch relations. The only other independent input to the system is the minimum elevation to be maintained, Zm. Due to these two independent inputs to the system, the MRICDF model is non-endochronous. An external master trigger has to be computed.

EmCodeSyn adds the exogenous constraints x and y to the system. Here x = y, [x] = F2.ZinS, [y] = M3.Zm. So, based on the values of [x] and [y], the independent inputs can be read and the follower set can be computed. The follower set of the system is as follows: (1) {x}, {y}; (2) {[x], [y]} or {F2.ZinS, M3.Zm}; (3) {[S1.i2], [S2.i4]}; and (4) M2.Zout. Each set in the follower set shown contains signals which can be replaced with any other signal of the same rate in the network. In other words, signals with the same rate are computed in the same set, but are not all shown here to maintain readability. A few examples in the second set are the signals (F2.ZrefS, F2.o2, S2.i3, S2.i4), which have the same rate as F2.ZinS or [x]. With a sequential scheduling order in place for all signals in the network, code synthesis can be performed. The generated output files and intermediate files of this model can be found in [19].

6.6 Conclusions and Future Direction

An alternate polychronous model of computation, MRICDF, was discussed in this chapter. MRICDF provides a visual representation of the primitives used in SIGNAL in the form of actors in a synchronous data flow network. These actors interact within the network through instantaneous communication channels. The transformation of MRICDF models into embedded software goes through an Epoch Analysis step which looks for deadlocks and for sequential implementability. Sequential implementability of a model relies on finding a master trigger, a signal within the network which can act as a master tick. A unique sequential implementation is achieved using a prime implicate based master trigger identification technique. The existence of the master trigger is proved and a sequential schedule is created in the form of a follower set. EmCodeSyn, a visual framework for designing MRICDF models, was also discussed. EmCodeSyn provides the utilities for each step in the transformation of the specification into embedded software. The capabilities of MRICDF and EmCodeSyn have been demonstrated in the form of two case studies in this chapter. Together, MRICDF and EmCodeSyn aim to make the design and synthesis of code targeting safety-critical applications easier.

Synchronous programming languages have concurrency at the specification level, which can be used to generate multi-threaded code [25]. The sequential implementability or endochrony property discussed in this chapter has an extended counterpart called 'weak endochrony', which allows multiple behaviors

Page 214: Synthesis of Embedded Softwaredl.booktolearn.com/.../embeddedsystems/...of_embedded_software_… · introduces approaches to the high-level specification of embedded software that

198 B.A. Jose and S.K. Shukla

for a program, provided the output remains the same. Another area of research is real-time simulation of computation, which has been partially implemented in EmCodeSyn. Computation of prime implicates is a time consuming process. Since PI computation time depends on the number of Boolean equations, an actor elimination technique to reduce the number of Boolean equations is being worked on [26].

References

1. N. Halbwachs. Synchronous programming of reactive systems. Kluwer, Dordrecht, 1993.
2. E. A. Lee and A. Sangiovanni-Vincentelli. A framework for comparing models of computation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 17(12):1217–1229, 1998.
3. G. Berry and G. Gonthier. The ESTEREL synchronous programming language: design, semantics, implementation. Science of Computer Programming, 19(2):87–152, 1992.
4. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous data flow programming language LUSTRE. Proceedings of the IEEE, 79(9):1305–1320, 1991.
5. P. L. Guernic, T. Gautier, M. L. Borgne, and C. L. Maire. Programming real-time applications with Signal. Proceedings of the IEEE, 79(9):1321–1336, 1991.
6. A. Benveniste, P. Caspi, S. Edwards, N. Halbwachs, P. Le Guernic, and R. de Simone. The synchronous languages twelve years later. Proceedings of the IEEE: Special Issue on Modeling and Design of Embedded Systems, 91(1):64–83, 2003.
7. G. Berry and E. Sentovich. Multiclock Esterel: correct hardware design and verification methods. Lecture Notes in Computer Science, volume 2144, pages 110–125. Springer, Berlin, 2001.
8. T. Austin, D. Blaauw, T. Mudge, and K. Flautner. Making typical silicon matter with Razor. Computer, 37(3):57–65, 2004.
9. M. Blum and S. Kannan. Designing programs that check their work. In Proc. of the 21st ACM Symposium on Theory of Computing, pages 86–97, New York, NY, USA, 1989. ACM.
10. Synfora Inc. Esterel Studio EDA Tool. http://www.synfora.com/products/esterelstudio.html.
11. ESTEREL Technologies. The SCADE suite. http://www.esterel-technologies.com/products/scade-suite.
12. ESPRESSO Project, INRIA. The Polychrony toolset. http://www.irisa.fr/espresso/Polychrony.
13. T. Gautier, P. Le Guernic, and L. Besnard. SIGNAL: A declarative language for synchronous programming of real-time systems. In Proc. of a Conf. on Functional Programming Languages and Computer Architecture, pages 257–277, 1987.
14. T. P. Amagbegnon, L. Besnard, and P. Le Guernic. Implementation of the data-flow synchronous language Signal. In ACM Symp. on Prog. Languages Design and Implementation (PLDI'95), volume 1, pages 163–173, 1995.
15. D. Nowak. Synchronous structures. Information and Computation, 204(8):1295–1324, 2006.
16. B. A. Jose and S. K. Shukla. An alternative polychronous model and synthesis methodology for model-driven embedded software. In Asia and South Pacific Design Automation Conference (ASP-DAC 2010), pages 13–18, Jan. 2010.
17. B. A. Jose, S. K. Shukla, H. D. Patel, and J.-P. Talpin. On the multi-threaded software synthesis from polychronous specifications. In Formal Models and Methods in Co-Design (MEMOCODE), Anaheim, CA, USA, pages 129–138, Jun. 2008.
18. B. Houssais. The synchronous programming language Signal: a tutorial. Technical report, IRISA ESPRESSO project, 2004.
19. B. A. Jose, L. Stewart, J. Pribble, and S. K. Shukla. Technical report on EmCodeSyn models: STARMAC and producer–consumer examples. Technical Report 2009-02, FERMAT Lab, Virginia Tech, 2009.
20. B. A. Jose, J. Pribble, L. Stewart, and S. K. Shukla. EmCodeSyn: a visual framework for multi-rate data flow specifications and code synthesis for embedded applications. In 12th Forum on Specification and Design Languages (FDL'09), pages 1–6, Sept. 2009.
21. P. Jackson and J. Pais. Computing prime implicants. In CADE-10: Proc. of the Tenth Intl. Conf. on Automated Deduction, pages 543–557, New York, NY, USA, 1990. Springer, New York.
22. J. de Kleer. An improved incremental algorithm for computing prime implicants. In Proc. AAAI-92, pages 780–785, San Jose, CA, USA, 1992.
23. A. Matusiwicz, N. Murray, and E. Rosenthal. Prime implicate tries. In Proc. of 18th Intl. Conf. on Automated Reasoning with Analytic Tableaux and Related Methods, Oslo, Norway, 2009. Lecture Notes in Computer Science, volume 5607, pages 250–264. Springer, Berlin, 2009.
24. STARMAC Project Group. The Stanford testbed of autonomous rotorcraft for multi-agent control overview. http://www.hybrid.stanford.edu/starmac/overview.
25. B. A. Jose, H. D. Patel, S. K. Shukla, and J.-P. Talpin. Generating multi-threaded code from polychronous specifications. Electronic Notes on Computer Science, 238(1):57–69, 2009.
26. B. A. Jose, J. Pribble, and S. K. Shukla. Faster software synthesis using actor elimination techniques for polychronous formalism. In 10th International Conference on Applications of Concurrency to System Design (ACSD 2010), June 2010.


Chapter 7
The Time Model of Logical Clocks Available in the OMG MARTE Profile

Charles Andre, Julien DeAntoni, Frederic Mallet, and Robert de Simone

7.1 Introduction

Embedded System Design is progressively becoming a field of choice for Model-Driven Engineering techniques. There are fundamental reasons for this trend:

1. Target execution platforms in the embedded world are often heterogeneous, flexible or reconfigurable by nature (as opposed to the conventional Von Neumann computer architecture). Such architectures can even sometimes be decided upon and customized to the specific kind of applications meant to run on them. Early architecture modeling allows exploring possible variations, possibly optimizing the form of future execution platforms before they are actually built. Architecture exploration is becoming a key ingredient of embedded system design, where applications and execution platforms are essentially designed jointly and concurrently at model level.

2. Applications are often reactive by nature, that is, meant to react repeatedly with an external environment. The main design concern goes with handling data or control flow propagation, which frequently includes streaming and pipelined processing. In contrast, the actual data values and computation contents are of lesser importance (for design, not for correctness). Application models such as process networks, reactive components and state/activity diagrams are used to represent the structure and the behavior of such applications (more than usual software engineering notions such as classes, objects, and methods).

3. Designs are usually subject to stringent real-time requirements, imposed "from above", while they are constrained by the limitations of their components and the availability of execution platform resources, "from below". Allocation models can here serve to check at early stages whether there exist feasible mappings and



scheduling of application functions to architecture resources and services that may match the requirements under the given constraints.

The model-driven approach to embedded system engineering is often tagged as Y-Chart methodology, or also Platform-Based design. In this approach, application and architecture (execution platform) models are developed and refined concurrently, and then associated by allocation relationships, again at a virtual modeling level.

The representation of requirements and constraints in this context becomes itself an important issue, as they guide the search for optimal solutions inside the range of possible allocations. They may be of various natures, functional or extra-functional. In this article, we focus on requirements and constraints which we call logical functional timing, and which we feel to be an important (although too often neglected or misconceived) aspect of embedded system modeling.

Timing information in modeling is often used as extra-functional, real-time annotations analyzed mostly by simulation. But the relevance of these timing figures in later implementations then has to be further assessed. On the other hand, time information also carries functional intent, as it selects some behaviors and discards others. Very often this can be done not only by using single-form physical time, but also logical multiform time models. For instance, it may be sufficient to state that a process runs twice as fast as another, or at least as fast as the second, without providing concrete physical time figures. The selected solutions will work for any physical assignment that matches the logical time constraints. Also, durations may be counted with events that are not regular in physical time: the number of clock cycles of a processor in a low-power design context may vary according to frequency scaling or clock gating; processing functions may be indexed by the engine crankshaft revolution angle in the case of automotive applications. Other examples abound in the embedded design world. Modeling with logical time partial ordering was advocated in [16]. The notion of multiform (or polychronous) logical time has been exploited extensively in the theory of Synchronous languages [2], in HDLs (Hardware Description Languages), but also importantly in the many approaches of model-based scheduling theories around process networks [6, 26] and formal data/control-flow graphs synthesized from nested loop programs [8]. Software pipelining and modulo scheduling techniques are based on such logical time counted in parallel processor execution cycles. The important common feature of all these approaches is that (logical) time is an integral part of the functional design, to be used and maintained along compilation/synthesis, and not only in simulation. In many cases the designer is confronted with time decisions in order to specify correct behaviors. This is of course largely true also of classical physical real-time requirements, but their operational demand is usually downplayed as they are only considered for analysis, not synthesis. Many of the techniques are still shared between both worlds (for instance, consider zero-time abstraction of atomic behaviors, and progress all behaviors in correct causal order in a run-to-completion mode before time may pass in a discrete step). Some techniques are still different, and certainly the goals are, but the underlying models are similar. Large and heterogeneous


systems require a single common environment to integrate all these models while still preserving the semantics and the analysis techniques of each of them.

We consider here the use of existing modeling tools to integrate all these models. The UML appears as a good candidate since it has unified in a common syntactic notation most of the underlying formal models generally used for embedded systems: state machines, data-flow graphs (UML activities), and static models to describe the execution platforms (block diagrams). It is even more relevant because the UML profile for MARTE, recently adopted by the OMG, proposes extensions to UML specifically targeting real-time and embedded systems. Our goal is to provide an explicit formal semantics to the UML elements so that the model can be referred to as a golden model by external tools, and to make it amenable to formal analysis, code generation or synthesis. An explicit and formal semantics within the model is fundamental to prevent different analysis tools from giving a different semantics to the same model. Each analysis technique relies on a specific formal model that has its own model of computation and communication (MoCC). We propose a language, called the Clock Constraint Specification Language (CCSL), specifically devised to equip a given model (conformant to the UML or to any domain-specific language) with a timed causality model and thus define a specific MoCC. When considering UML models, CCSL relies on the MARTE time subprofile to identify the model elements on which CCSL constraints apply.

In this chapter, we select different models from different domains to illustrate possible uses of CCSL. First, we show how it can be used to express classical constraints of the real-time and embedded domain by expressing East-ADL [27] extra-functional properties in CCSL. East-ADL proposes a set of time requirements classical in automotive applications (duration, deadline, jitter) on which the time requirements for AUTOSAR¹ are being built. The second illustration falls in the avionics domain and focuses on AADL (Architecture and Analysis Description Language) [9]. AADL is an Architecture Description Language (ADL) adopted by the SAE (Society of Automotive Engineering) that offers specific support for schedulability analysis. It also considers classical computation (periodic, sporadic, aperiodic) and communication (immediate/delayed, event-based or time-triggered) patterns. However, it departs from East-ADL because it explicitly considers the execution platform to which the application is allocated. Our illustration uses MARTE (and notably its allocation subprofile) to build a model amenable to architecture exploration and schedulability analysis. These first two examples consider models that combine logical and physical time. The last example considers a purely logical case. A CCSL library that specifies the operational semantics of Synchronous Data Flow (SDF [15]) is built. This library is applied to several diagrammatic views, equipping purely syntactic models with an explicit behavioral semantics.

Section 7.2 gives a general overview of MARTE, a detailed view of the time subprofile, and of its facilities to annotate UML models with (logical) time. Then, Sect. 7.3 describes the CCSL syntax and semantics together with the mechanisms for building libraries. It also introduces TimeSquare, the dedicated environment

1 http://www.autosar.org.


we have built to analyze and verify UML/MARTE/CCSL models. The following sections address several possible usages of CCSL and describe CCSL libraries for the different subdomains: automotive and East-ADL in Sect. 7.4, avionics and AADL in Sect. 7.5, static analysis and SDF in Sect. 7.6.

7.2 The UML Profile for MARTE

7.2.1 Overview

The Unified Modeling Language (UML) [23] is a general-purpose modeling language specified by the Object Management Group (OMG). It proposes graphical notations to represent all aspects of a system from the early requirements to the deployment of software components, including design and analysis phases, structural and behavioral aspects. As a general-purpose language, it does not focus on a specific domain and maintains a weak, informal semantics to widen its application field. However, when targeting a specific application domain, and especially when building trustworthy software components or critical systems where life may be at stake, it is absolutely required to extend the UML and attach a formal semantics to its model elements. The simplest and most efficient extension mechanism provided by the UML is the definition of profiles. A UML profile adapts the UML to a specific domain by adding new concepts, modifying existing ones and defining a new visual representation for others. Each modification is done through the definition of annotations (called stereotypes) that introduce domain-specific terminology and provide additional semantics. However, the semantics of stereotypes must be compatible with the original semantics (if any) of the modified or extended concepts.

The UML profile for Modeling and Analysis of Real-Time and Embedded systems (MARTE [22]) extends the UML with concepts related to the domain of real-time and embedded systems. It supersedes the UML profile for Schedulability, Performance and Time (SPT [20]), which extended UML 1.x and had limited capabilities.

MARTE has three parts: Foundations, Design and Analysis. The foundation part is itself divided into five chapters: CoreElements, NFP, Time, Generic Resource Modeling and Allocation. CoreElements defines configurations and modes, which are key parameters for analysis. In real-time systems, preserving the non-functional (or extra-functional) properties (power consumption, area, financial cost, time budget, . . . ) is often as important as preserving the functional ones. The UML proposes no mechanism at all to deal with non-functional properties and relies on mere strings for that purpose. NFP (Non-Functional Properties) offers mechanisms to describe the quantitative as well as the qualitative aspects of properties and to attach a unit and a dimension to quantities. It defines a set of predefined quantities, units and dimensions and supports customization. NFP comes with a companion language called VSL (Value Specification Language) that defines the concrete syntax to be used in expressions of non-functional properties. VSL also recommends


syntax for user-defined properties. Time is often considered as an extra-functional property that comes as a mere annotation after the design. These annotations are fed into analysis tools that check conformity without any actual impact on the functional model: e.g., whether a deadline is met, whether the end-to-end latency is within the expected range. Sometimes though, time can also be of a functional nature and have a direct impact on what is done, not only on when it is done. All these aspects are addressed in the time chapter of MARTE. The next section elaborates on the time profile.

The design part has four chapters: High Level Application Modeling, Generic Component Modeling, Software Resource Modeling, and Hardware Resource Modeling. The first chapter describes real-time units and active objects. Active objects depart from passive ones by their ability to send spontaneous messages or signals, and react to event occurrences. Normal objects, the passive ones, can only answer the messages they receive. The three other parts provide support to describe the resources used and in particular the execution platforms on which applications may run. A generic description of resources is provided, including stereotypes to describe communication media, storage and computing resources. This generic model is then refined to describe software and hardware resources along with their non-functional properties.

The analysis part also has a chapter that defines generic elements to perform model-driven analysis on real-time and embedded systems. This generic chapter is specialized to address schedulability analysis and performance analysis. The chapter on schedulability analysis is not specific to a given technique and addresses various formalisms like the classic and generalized Rate Monotonic Analysis (RMA), holistic techniques, or extended timed automata. This chapter provides all the keywords usually required for such analyses. In Sect. 7.5, we follow a rather different approach: instead of focusing on the syntactic elements usually required to perform schedulability analysis (periodicity, task, scheduler, deadline, latency), we show how we can use the MARTE time model and its companion language CCSL to build libraries of constraints that reflect the exact same concepts. Finally, the chapter on performance analysis, even if somewhat independent of a specific analysis technique, emphasizes concepts supported by the queueing theory and its extensions.

MARTE extends the UML for real-time and embedded systems but should be refined by more specific profiles to address specific domains (avionics, automotive, silicon) or specific analysis techniques (simulation, schedulability, static analysis). The three examples addressed here consider different domains and/or different analysis techniques to motivate the demand for a fairly general time model, which has justified the creation of the MARTE time subprofile.

7.2.2 The Time Profile

Time in SPT is a metric time with implicit reference to physical time. As a successor of SPT, MARTE supports this model of time. UML 2, issued after SPT,


has introduced a model of time called SimpleTime [23, Chap. 13]. This model also makes implicit reference to physical time, but is too simple for use in real-time applications, and was initially devised to be extended in dedicated profiles.

MARTE goes beyond SPT and UML 2. It adopts a more general time model suitable for system design. In MARTE, time can be physical, and considered as continuous or discretized, but it can also be logical, and related to user-defined clocks. Time may even be multiform, allowing different times to progress in a non-uniform fashion, and possibly independently of any (direct) reference to physical time.

In MARTE, time is represented by a collection of Clocks. Each clock specifies a totally ordered set of instants. There may be dependence relationships between instants of different clocks. Thus this model, called the MARTE time structure, is akin to the Tagged Systems [16]. To cover continuous and discrete times, the set of instants associated with a clock can be either dense or discrete. In this paper, most clocks are discrete (i.e., they represent discrete time). In this case the set of instants is indexed by natural numbers. For a clock c, c[k] denotes its kth instant.

The MARTE Time profile defines two stereotypes, ClockType and Clock, to represent the concept of clock. ClockType gathers common features shared by a family of clocks. The ClockType fixes the nature of time (dense or discrete), says whether the represented time is linked to physical time or not (respectively identified as chronometric clocks and logical clocks), and chooses the type of the time units. A Clock, whose type must be a ClockType, carries more specific information such as its actual unit, and values of quantitative (resolution, offset, etc.) or qualitative (time standard) properties, if relevant.

TimedElement is another stereotype introduced in MARTE. A timed element is explicitly bound to at least one clock, and thus closely related to the time model. For instance, a TimedEvent, which is a specialization of TimedElement extending UML Event, has a special semantics compared to usual events: it can occur only at instants of the associated clock. In a similar way, a TimedValueSpecification, which extends UML ValueSpecification, is the specification of a set of time values with explicit references to a clock, taking the clock unit as time unit. Thus, in a MARTE model of a system, the stereotype TimedElement or one of its specializations is applied to model elements which have an influence on the specification of the temporal behavior of this system.

The MARTE Time subprofile also provides a model library named TimeLibrary. This model library defines the enumeration TimeUnitKind, which is the standard type of time units for chronometric clocks. This enumeration contains units like s (second), its submultiples, and other related units (minute, hour, . . . ). The library also predefines a clock type (IdealClock) and a clock (idealClk) whose type is IdealClock. idealClk is a dense chronometric clock with the second as time unit. This clock is assumed to be an ideal clock, perfectly reflecting the evolutions of physical time. idealClk should be imported in user models that refer to physical time concepts (i.e., frequency, physical duration, etc.). This is illustrated in Sects. 7.4 and 7.5.


7.3 CCSL Time Model

7.3.1 The Clock Constraint Specification Language

CCSL is a language to impose dependence relationships between instants of different clocks. This dependency is specified by clock constraints. A ClockConstraint is a TimedElement that extends UML Constraint. The constrained elements are clocks. A clock constraint imposes relationships between instants of its constrained clocks. So, to understand clock constraints, we first have to define relations on instants.

7.3.1.1 Instant Relations

The precedence relation ≼ is a reflexive and transitive binary relation on a set of instants. From ≼ we derive four new relations: Coincidence (≡ ≜ ≼ ∩ ≽), Strict precedence (≺ ≜ ≼ \ ≡), Independence (∥ ≜ ¬(≼ ∪ ≽)), and Exclusion (# ≜ ≺ ∪ ≻). The precedence relation represents causal dependency. The coincidence relation is a strong relation that forces simultaneous occurrences of instants from different clocks.

7.3.1.2 Clock Relations

Specifying a full time structure using only instant relations is not realistic. Moreover, a set of instants is usually infinite, thus forbidding an enumerative specification of instant relations. Hence the idea to extend relations to clocks. CCSL defines five basic clock relations. In the following definitions, a and b stand for Clocks. For simplicity, mathematical expressions are given only in the case of discrete clocks:

• Subclocking: a ⊂ b means that each instant of a is coincident with an instant of b, and that the coincidence mapping is order-preserving. a is said to be a subclock of b, and b a superclock of a.
• Equality: a = b is a special case of subclocking where the coincidence mapping is a bijection: ∀k ∈ ℕ*, a[k] ≡ b[k]. a and b are "synchronous".
• Precedence: a ≼ b means ∀k ∈ ℕ*, a[k] ≼ b[k]. a is said to be faster than b.
• Strict precedence: a ≺ b is similar to the previous one but considers strict precedence instead: ∀k ∈ ℕ*, a[k] ≺ b[k].
• Exclusion: a # b means that a and b have no coincident instants.

The Alternation a ∼ b, used in the application sections, is a derived clock relation that imposes ∀k ∈ ℕ*, a[k] ≺ b[k] ≺ a[k+1]. a alternates with b.
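As a small illustration of these relations (our own sketch, not part of the MARTE profile or of any CCSL tool), the alternation constraint can be checked on two finite runs given as strictly increasing global step indices of their instants, using strict ordering of step indices as a stand-in for strict precedence:

#include <stdio.h>

/* Check a ~ b, i.e. for all k: a[k] < b[k] < a[k+1], on finite runs given
 * as strictly increasing step indices of the instants of a and b.         */
static int alternates(const int a[], int na, const int b[], int nb)
{
    if (nb > na) return 0;                    /* b cannot tick more often than a */
    for (int k = 0; k < nb; k++) {
        if (!(a[k] < b[k])) return 0;         /* a[k] must strictly precede b[k] */
        if (k + 1 < na && !(b[k] < a[k + 1])) /* b[k] must precede a[k+1]        */
            return 0;
    }
    return 1;
}

int main(void)
{
    int a[] = {1, 4, 7, 10};   /* instants of a (step numbers) */
    int b[] = {2, 5, 9};       /* instants of b                */
    printf("a alternates with b: %s\n",
           alternates(a, 4, b, 3) ? "yes" : "no");
    return 0;
}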


7.3.1.3 Clock Expressions

They allow definitions of new clocks from existing ones. Filtering is an example of an often used clock expression. Let a be a clock, and w a binary word (i.e., a finite or infinite word on bits: w ∈ {0,1}* ∪ {0,1}^ω). w is used as a filtering pattern. a ▼ w defines a new clock, say b, such that ∀k ∈ ℕ*, b[k] ≡ a[w ↑ k], where w ↑ k is the index of the kth 1 in w. Binary words are convenient representations of Boolean flows and schedules. A schedule is an activation sequence, generally periodic, in which case periodic binary words are used, denoted as w = u(v)^ω, where u (prefix) and v (period) are finite binary words; w is then the infinite binary word u·v·v···v···. Periodic binary words have already been successfully applied to N-Synchronous Kahn networks [3].
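The following short sketch (ours, for illustration only) computes the ticks of a filtered clock b = a ▼ w for a periodic binary word w = u(v)^ω: b ticks at the kth instant of a exactly when the kth letter of w is 1.

#include <stdio.h>
#include <string.h>

/* Print the instants of a at which b = a filteredBy u(v)^w ticks. */
static void filtered_ticks(const char *u, const char *v, int n_instants)
{
    int lu = (int)strlen(u), lv = (int)strlen(v);
    for (int k = 0; k < n_instants; k++) {
        char bit = (k < lu) ? u[k] : v[(k - lu) % lv];
        if (bit == '1')
            printf("b ticks at a[%d]\n", k + 1);   /* instants indexed from 1 */
    }
}

int main(void)
{
    /* Example: w = 1(10)^w selects instants 1, 2, 4, 6, 8, ... of a. */
    filtered_ticks("1", "10", 8);
    return 0;
}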

7.3.1.4 Clock Constraints

A CCSL specification consists of a set of Clocks and a conjunction of clock constraints. A clock constraint is a clock relation between two clocks or clock expressions. The stereotype ClockConstraint has Boolean meta-attributes (isCoincidenceBased and isPrecedenceBased) that indicate the kind of constraint. The coincidence-based constraints are also known as "synchronous" constraints, whereas the precedence-based constraints are called "asynchronous". There also exist mixed constraints, in which case the two meta-attributes are set to true. A third meta-attribute (isChronometricBased) is used only for chronometric clocks and quantitative time constraints such as stability and skew.

7.3.1.5 Temporal Evolutions

A CCSL specification imposes a complex ordering on instants. We do not explicitly represent this time structure. We compute possible runs instead. A run is a sequence of steps. Each step is a set of clocks that are simultaneously fired without violating any clock constraint. When a discrete clock ticks (or fires), the index of its current instant is incremented by 1. The computation of a step is detailed in a technical report [1] that provides a syntax and an operational semantics for a kernel of CCSL. Here, we just sketch this process.

Using the semantics of CCSL, from a clock constraint specification S we derive a logical representation of the constraints ⟦S⟧. This representation is a Boolean expression on a set of Boolean variables V, in bijection b with the set of clocks C. Any valuation v : V → {0, 1} such that ⟦S⟧(v) = 1 indirectly represents a set of clocks F that respects all the clock constraints: F = {c ∈ C | v(b(c)) = 1}. F is a possible set of simultaneously fireable clocks. Most of the time, this solution is not unique. Our solver supports several policies for choosing one solution.


7.3.1.6 CCSL Libraries

CCSL specifications are executable specifications. However, the expressiveness of the kernel CCSL is limited, for instance by the lack of support for parameterized constructs. The full CCSL overcomes these limitations through libraries. A library is a collection of parameterized constraints, using constraints from one or many other libraries. The primitive constraints, which constitute the kernel CCSL, are grouped together in the kernel library. The operational semantics is explicitly defined only for the constraints of the kernel library. Each user-defined constraint is structurally expanded into kernel constraints, thus defining its operational semantics.

As a very simple example, we define a ternary coincidence relation, written here coincidence3. The user-defined relation has three clock parameters (v1, v2 and v3). This definition contains two instances of the equality relation whose definition is given in the kernel library. The textual specification of the ternary coincidence relation is as follows:

def coincidence3(clock v1, clock v2, clock v3) ≜ (v1 = v2) | (v1 = v3).

CCSL libraries are used in the remainder of the chapter either to group domain-specific constraints (Sect. 7.4) or to encapsulate a specific MoCC (Sect. 7.6).

7.3.2 TimeSquare

TimeSquare is a software environment dedicated to the resolution of CCSL constraints and the computation of partial solutions. It has four main features: (1) definition/modeling of CCSL user-defined libraries that encapsulate the MoCCs, (2) specification/modeling of a CCSL model and its application to a specific UML-based or DSL model, (3) simulation of MoCCs and generation of a corresponding trace model, (4) based on a trace model, displaying and exploring the augmented timing diagram, animating UML-based models, and storing the scheduling result in the model and sequence diagrams.

TimeSquare is released as a set of Eclipse plug-ins. A detailed description of TimeSquare features, examples, and video demonstrations are available from its website (http://www-sop.inria.fr/aoste/dev/time square).

7.3.2.1 Summary

In MARTE, clock constraints impose physical-time or logical-time (causal) relationships between timed elements. CCSL is a formal specification language, so that clock constraints expressed in CCSL are executable in the TimeSquare environment. They are also amenable to analysis. Note that while fully integrated with concepts from the MARTE profile, CCSL can be used outside the UML, for instance within


the framework of a DSL. The purpose remains the same: to provide a time causality model fitting a model-driven approach.

7.4 MARTE and East-ADL2

We consider here an example from the automotive domain. We build a CCSL library for expressing the semantics of East-ADL time requirements. Their semantics is left informal in the East-ADL specification and we had to disambiguate some of their definitions to build our CCSL model. By building this library we make East-ADL requirement specifications executable and allow the use of TimeSquare to execute and animate UML models annotated with East-ADL stereotypes.

7.4.1 East-ADL2

East-ADL (Electronic Architecture and Software Tools, Architecture Description Language) was initially developed in the context of the East-EEA European project [28]. To integrate proposals from the emerging standard AUTOSAR and from other requirement formalisms like SysML [21, 29], a new release called East-ADL2 [4, 27] has been proposed by the ATESST project. In this section, we abusively refer to both versions under the name East-ADL.

Structural modeling in East-ADL covers both analysis and design levels. In this chapter the focus is on the analysis level and especially on timing requirements.

7.4.1.1 Timing Requirements

East-ADL requirements extend SysML requirements and express conditions that must be met by the system. They usually enrich the functional architecture with extra-functional characteristics such as variability and temporal behavior. We focus on the three kinds of timing requirements available in East-ADL:

1. DelayRequirement constrains the delay "from" a set of entities "until" another set of entities. It specifies the temporal distance between the execution of the earliest "from" entity and the latest "until" entity.

2. RepetitionRate defines the inter-arrival time of data on a port or the triggering period of an elementary ADLFunction.

3. Input/OutputSynchronization expresses a timing requirement on the input/output synchronization among the set of ports of an ADLFunction. It should be used to express the maximum temporal skew allowed between input or output events or data of an ADLFunction.

Timing requirements specialize the meta-class TimingRestriction, which defines bounds on system timing attributes. The timing restriction can be specified as


Fig. 7.1 Timing model of the ABS (legend: R: trigger; H: sampling interval; Ls: sampling latency; Lio: input-output latency; Jii: input synchronization; Joo: output synchronization)

a nominal value, with or without a jitter, and can have lower and upper bounds. The jitter is the maximal positive or negative variation from the nominal value. A bound is a real value associated with an implicit time unit (ms, s, ...).

7.4.1.2 Example

As an illustration, we consider an Anti-lock Braking System (ABS). This example and the associated timing requirements are taken from the ATESST report on the East-ADL timing model [12]. The ABS architecture consists of four sensors, four actuators and an indicator of the vehicle speed. The sensors (ifl, ifr, irl, irr) measure the rotation speed of the vehicle wheels. The actuators (ofl, ofr, orl, orr) indicate the brake pressure to be applied on the wheels. The FunctionalArchitecture is composed of FunctionalDevices for sensors and actuators and an ADLFunctionType for the functional part of the ABS. An ADLOutFlowPort provides the vehicle speed (speed).

The execution of the ABS is triggered by the occurrences of event R (Fig. 7.1). Parameter Ls represents the latency of sensor sampling. The values of the four sensors involved in the ABS must arrive on the input ADLFlowPorts within delay Jii (InputSynchronization). A similar OutputSynchronization delay Joo is represented on the output interface side. Lio represents the delay from the first event occurrence on the input set of the ABS until the last event occurrence on the output set. The sampling interval of the sensor is given by parameter H. All these parameters are modeled by timing requirements characterized by timing values or time intervals with jitters.

7.4.2 A CCSL Library for East-ADL

East-ADL introduces a vocabulary specific to the subdomain considered (delay requirement, input/output synchronization, repetition rate). These time requirements can be modeled simply by using CCSL relations. To ease the use of such relations, user-defined relations are proposed and grouped together in a library.


7.4.2.1 Applying the UML Profile for MARTE

The ABS function is modeled in UML (Fig. 7.2) and some model elements (TimedElements) are selected to apply the CCSL clock constraints. The reaction of a timed element is dictated by the clock associated with it. For instance, the sensor ifl is a timed element associated with the clock ifl. When the clock ifl ticks, because of the CCSL specification and the clock calculus, the sensor acquires data. Similarly, when the clock ofl ticks, this means that the actuator ofl emits data.

In the following, we explain how the three different kinds of time requirements defined in East-ADL can be modeled with CCSL constraints.

7.4.2.2 Repetition Rate

A RepetitionRate concerns successive occurrences of the same event (data arriving to or departing from a port, triggering of a function). In all cases, it consists in giving a nominal duration between two successive occurrences/instants of the same event/clock. We build a CCSL relation definition called repetitionRate that has three parameters: element, rate and jitter. element is the clock that must be given a repetition rate.

Fig. 7.2 Example of the ABS (UML model of the ADLFunctionType FunctionABS: functional devices ifl, ifr, irl, irr (sensors), the ABS function prototype, functional devices ofl, ofr, orl, orr (actuators), and the output port speed)


rate is an integer, the actual repetition rate. jitter is a real number, the jitter with which the repetition rate is expressed.

def repetitionRate(clock element, int rate, real jitter) ≜
    clock c1 = idealClk discretizedBy 0.001    (7.1)
  | element isPeriodicOn c1 period rate    (7.2)
  | element hasStability jitter/rate    (7.3)

This relation definition involves three CCSL constraints. For the duration to be specified in seconds (time unit s), we use the clock idealClk defined in the MARTE time library (Sect. 7.3). The CCSL expression discretizedBy discretizes idealClk and defines a chronometric discrete clock c1 so that the distance between two successive instants of c1 is 0.001 s (7.1). The unit (here s) is the default unit defined for idealClk and therefore c1 is a 1-kHz chronometric clock. Equation (7.2) uses the CCSL expression isPeriodicOn to undersample c1 and build another clock element, rate times slower than c1. The clock expression isPeriodicOn has not been described before, but (7.2) is equivalent to (7.4).

element = c1 filteredBy (1.0^(rate-1))^ω    (7.4)

Finally, (7.3) expresses the jitter of the repetition rate. The CCSL constraint hasStability states that the clock element is not strictly periodic: a maximal relative variation of jitter/rate is allowed on its period.

Back to the example of the ABS: one timing requirement of the ATESST example specifies that the ABS function must be executed every 5 ms with a maximum jitter of 1 ms. If abs.start is the clock that triggers the execution of the function ABS, then repetitionRate(abs.start, 5, 1) enforces this requirement. A jitter of 1 ms for a nominal period of 5 ms corresponds to a stability of 20%.
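The intent of repetitionRate can be mimicked with the following Python sketch, which generates tick dates on a 1-kHz grid, nominally every rate ticks, with a bounded jitter; all names and the generation scheme are assumptions made for illustration, and this is not how TimeSquare resolves the constraint.

import random

def repetition_rate_ticks(n_ticks, rate, jitter_ms, base_period=0.001, seed=0):
    """Tick dates (s) of a clock periodic on a 1-kHz chronometric base, with jitter."""
    rng = random.Random(seed)
    ticks = []
    for k in range(n_ticks):
        nominal = k * rate * base_period                              # isPeriodicOn ... period rate
        jitter = rng.uniform(-jitter_ms, jitter_ms) / 1000.0          # hasStability jitter/rate
        date = round((nominal + jitter) / base_period) * base_period  # stay on the 1-kHz grid
        ticks.append(max(date, 0.0))
    return ticks

# ABS requirement: execute every 5 ms with a jitter of at most 1 ms.
print(repetition_rate_ticks(5, rate=5, jitter_ms=1))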

7.4.2.3 Delay Requirements

A DelayRequirement constrains the delay between a set of inputs and a set of outputs. At each iteration, all inputs and outputs must occur. So, defining a delay requirement between two model elements means constraining the temporal distance between the i-th occurrences of their events. In the ATESST example, a delay requirement is used, for instance, to constrain the end-to-end latency of the function ABS: at each iteration, the distance between the reception of the first input and the emission of the last output must be less than 3 ms. Consequently, we define a CCSL clock relation named distance that has three parameters: begin, end and duration. The specification is that the distance between the i-th occurrence of begin and the i-th occurrence of end must be less than duration ticks of a reference chronometric clock. If we need a better precision than the millisecond, we may define a 10-kHz chronometric clock rather than a 1-kHz one (7.5).


def distance(clock begin, clock end, int duration) ≜
    clock c10 = idealClk discretizedBy 0.0001    (7.5)
  | end ≺ (begin delayedFor duration on c10)    (7.6)

The CCSL clock expression delayedFor is a ternary operator: a delayedFor 3 on b builds a clock c that is a subclock of b. Operator delayedFor expresses a pure delay where the delay duration is counted in ticks of b. Note that this operator is polychronous, contrary to usual synchronous delay operators (pre in Lustre, $ in Signal). In Sect. 7.6, we use the synchronous form of this operator, where the third parameter is implicit (i.e., a delayedFor 3 on a).

To specify the end-to-end latency, we need clocks to model the arrival of the earliest input and of the latest output. Kernel CCSL expressions inf and sup are used for that purpose.

clock iinf = inf(ifl, ifr, irl, irr);  clock isup = sup(ifl, ifr, irl, irr);
clock oinf = inf(ofl, ofr, orl, orr);  clock osup = sup(ofl, ofr, orl, orr);

inf(a, b) defines the greatest lower bound of a and b for the precedence relation ≼, and sup(a, b) is the least upper bound. With these four new clocks, stating

that the end-to-end latency of the function ABS is less than 3 ms is simply written distance(iinf, osup, 30) in CCSL.

Similarly, input (resp. output) synchronizations are specializations of a delay requirement. An input synchronization delay requirement for the function ABS bounds the temporal distance between the earliest input and the latest input (specified by Jii in Fig. 7.1). distance(iinf, isup, 5) enforces an input synchronization of 0.5 ms. Likewise, an output synchronization bounds the temporal distance between the earliest output and the latest output (specified by Joo in Fig. 7.1). distance(oinf, osup, 5) enforces an output synchronization of 0.5 ms.
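For intuition, the following Python sketch interprets clocks as lists of tick dates and checks a distance requirement over inf/sup clocks. The sensor dates and helper names are invented for the example, and real CCSL clocks are ordered by precedence rather than by physical dates.

def inf(*clocks):
    # i-th tick of inf(...) is the earliest i-th tick among the argument clocks
    return [min(ticks) for ticks in zip(*clocks)]

def sup(*clocks):
    # i-th tick of sup(...) is the latest i-th tick among the argument clocks
    return [max(ticks) for ticks in zip(*clocks)]

def distance_ok(begin, end, duration_ticks, tick=0.0001):
    # end[i] must occur no later than begin[i] delayed by duration_ticks of a 10-kHz clock
    return all(e <= b + duration_ticks * tick for b, e in zip(begin, end))

# Illustrative arrival dates (in seconds) for the four wheel-speed sensors.
ifl, ifr = [0.0010, 0.0060], [0.0012, 0.0063]
irl, irr = [0.0013, 0.0061], [0.0011, 0.0064]
i_inf, i_sup = inf(ifl, ifr, irl, irr), sup(ifl, ifr, irl, irr)
print(distance_ok(i_inf, i_sup, 5))   # input synchronization Jii within 0.5 ms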

7.4.3 Analysis of East-ADL Specification

In TimeSquare, we have implemented a specific set of menus to build East-ADL specifications. The menus give access to the different East-ADL keywords and to the CCSL library for East-ADL. Hence, an East-ADL specification can be built interactively. Actually, the menu builds an internal model of the specification as well as a UML MARTE model. The internal model can be transformed into either a pure East-ADL model or a pure CCSL specification. The East-ADL model can be used by East-ADL-compliant tools. The CCSL specification can be analyzed by the TimeSquare clock calculus engine to detect inconsistent specifications or to execute the UML model. The execution trace can be dumped as a VCD file or can drive the animation of the UML model. Figure 7.3 shows an example of trace resulting from a complete specification of the ABS. This execution exhibits a violation


Fig. 7.3 The East-ADL specification of the ABS executed with TimeSquare (timing diagram showing clocks R, i_inf, i_sup, abs, o_inf, o_sup, speed and o_speed, the requirements Jii, Joo and Ls, and a flagged violation)

of the specification because all the computations of the ABS itself, its sensors and actuators cannot be executed within the specified repetition rate of 5 ms. A complete analysis of this particular example is available [19].

7.5 MARTE and AADL

In this second example, we consider AADL and use a combination of MARTE and CCSL to model its software components, its execution platform components, the binding relationships between them, and its rich model of computations and communications. Our intent is to allow a UML MARTE representation of AADL models so that UML models can benefit from the analysis tools (mainly for schedulability analysis) that accept AADL models as inputs.

7.5.1 AADL

7.5.1.1 Modeling Elements

AADL supports the modeling of application software components (thread, subprogram, and process), execution platform components (bus, memory, processor, and device) and the binding of software onto the execution platform. Each model element (software or execution platform) must be defined by a type and comes with at least one implementation.


The latest AADL specification acknowledges that MARTE should be used to provide a UML-based front-end to AADL models and the MARTE specification provides a full annex on the matter [7]. However, even though the annex gives full details on syntactic equivalences between MARTE stereotypes and AADL concepts, it does not say much about the semantic equivalence.

7.5.1.2 AADL Application Software Components

Threads are executed within the context of a process, therefore the process implementations must specify the number of executed threads and their interconnections. Type and implementation declarations also provide a set of properties that characterizes model elements. For threads, AADL standard properties include the dispatch protocol (periodic, aperiodic, sporadic, background), the period (if the dispatch protocol is periodic or sporadic), the deadline, and the minimum and maximum execution times, along with many others.

We have created a UML library to model AADL application software components [17]. AADL threads are modeled using the stereotype SwSchedulableResource from the MARTE Software Resource Modeling sub-profile. Its meta-attributes deadlineElements and periodElements explicitly identify the actual properties used to represent the deadline and the period. Using a meta-attribute of type Property avoids a premature choice of the type of such properties. This makes it easier for the transformation tools to be language and domain independent. In our library, the MARTE type NFP_Duration is used as an equivalent for the AADL type Time.

7.5.1.3 AADL Flows

AADL end-to-end flows explicitly identify a data stream from sensors to the external environment (actuators). Fig. 7.4 shows an example previously used [10] to discuss flow latency analysis with AADL models.

This flow starts from a sensor (Ds, an aperiodic device instance) and sinks in an actuator (Da, also aperiodic) through two process instances. The first process executes the first two threads while the last thread is executed by the second process. The two devices are part of the execution platform and communicate via a bus (db1) with two processors (cpu1 and cpu2), which host the three threads with several possible bindings: all threads may be executed by the same processor, or distributed in any other combination. One possible binding is illustrated by the dashed arrows. The component declarations and implementations are not shown. Several configurations deriving from this example are modeled with MARTE and discussed in Sect. 7.5.2.


Fig. 7.4 The example in AADL (devices Ds and Da, threads T1, T2, T3 running step1, step2, step3 between acquire and release, processors CPU1 and CPU2 connected by a bus, with «binding» relationships)

7.5.1.4 AADL Ports

There are three kinds of ports: data, event and event-data. Data ports are for data transmissions without queueing. Connections between data ports are either immediate or delayed. Event ports are for queued communications. The queue size may induce transfer delays that must be taken into account when performing latency analysis. Event-data ports are for message transmission with queueing. Here again the queue size may induce transfer delays. In our example, all components have data ports, represented as filled triangles. We have omitted the ports of the processes since they are required to be of the same type as the connected port declared within the thread declaration and are therefore redundant.

UML components are linked together through ports and connectors. No queues are specifically associated with connectors. The queueing policy is better represented on a UML activity diagram that models the algorithm. A UML activity is the specification of parameterized behavior as the coordinated sequencing of actions. The sequencing is determined by token flows. A token contains an object, datum, or locus of control. A token is stored in an activity node and can move to another node through an edge. Nodes and edges have flow rules that define their semantics. In UML, an object node (a special activity node) can contain 0 or many tokens. The number of tokens in an object node can be bounded by setting its property upperBound. The order in which the tokens present in the object node are offered to its outgoing edges can be imposed (property ordering). FIFO (First-In First-Out) is a predefined ordering value. So, object nodes can be used to represent both event and event-data AADL communication links. The token flow represents the communication itself. The standard rule is that only a single token can be chosen at a time. This is fully compatible with the AADL dequeue protocol OneItem. The UML representation of the AADL dequeue protocol AllItems is also possible. This needs the advanced activity concept of edge weight, which allows any number of


tokens to pass along the edge, in groups at one time. The weight attribute specifies the minimum number of tokens that must traverse the edge at the same time. Setting this attribute to the unlimited weight (denoted '*') means that all the tokens at the source are offered to the target.

To model data ports, UML provides «datastore» object nodes. In these nodes, tokens are never consumed, thus allowing multiple readings of the same token. Using a data store node with an upper bound equal to one is a good way to represent AADL data port communications.
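The contrast between the two communication styles can be sketched in a few lines of Python (class and method names are invented, and the overflow behavior of the bounded FIFO is just one possible policy):

from collections import deque

class FifoNode:
    """Object node with FIFO ordering and an upperBound (event / event-data ports)."""
    def __init__(self, upper_bound):
        self.queue = deque(maxlen=upper_bound)   # when full, the oldest token is discarded

    def write(self, token):
        self.queue.append(token)

    def read(self):
        return self.queue.popleft()              # OneItem: a read consumes one token

class DataStoreNode:
    """«datastore» node with upperBound = 1 (data ports): non-depleting, overwritten."""
    def __init__(self, initial=None):
        self.token = initial

    def write(self, token):
        self.token = token                       # a new write overwrites the previous value

    def read(self):
        return self.token                        # the same token can be read many times

fifo, store = FifoNode(upper_bound=4), DataStoreNode(initial=0)
fifo.write(1); store.write(1)
print(fifo.read(), store.read(), store.read())   # 1 1 1: the data store is never emptied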

7.5.2 Describing AADL Models with MARTE

7.5.2.1 AADL Flows with MARTE

We choose to represent the AADL flows using a UML activity diagram. Figure 7.5 gives the activity diagram equivalent to the AADL example described in Fig. 7.4. The diagram was built with Papyrus (http://www.papyrusuml.org), an open-source UML graphical editor.

As discussed previously, object nodes are used to represent the queues between two tasks. This UML diagram is untimed and we use the MARTE Time Profile to add time information. This diagram is a priori polychronous since each AADL task is independent of the other tasks. The first action to describe the time behavior of this model is to build five logical clocks (ds, t1, t2, t3, da). This is done in two steps. Firstly, a logical, discrete, clock type called AADLTask is defined. Then, five instances of this clock type are built. Figure 7.6 shows the final result. Secondly, the five clocks must be associated with the activity, which is done by applying the stereotype TimedProcessing. As shown in Fig. 7.5, this stereotype is applied to the

Fig. 7.5 End-to-end flow with UML and MARTE


Fig. 7.6 One logical clock for each AADL task («clockType» AADLTask with nature = discrete, unitType = LogicalTimeUnit, isLogical = true, and five «clock» instances t1, t2, t3, ds, da of type AADLTask)

Fig. 7.7 Five aperiodic tasks

whole activity but also to the actions. In our case, each action is associated with a different clock. In AADL, the same association is done when binding a subprogram to a task.

7.5.2.2 Five Aperiodic Tasks

The five clocks are a priori independent. The required time behavior is defined by applying clock constraints to these five clocks. The clock constraints to use differ depending on the dispatch protocols of the tasks. Aperiodic tasks start their execution when the data is available on their input port in. This is the case for devices, which are aperiodic. The alternation relation (∼) can be used to model asynchronous communications. For instance, action Release starts when the data from Step3 is available in d3. t3 is the clock associated with Step3 and da is the clock associated with Release. The asynchronous communication is represented as follows: t3 ∼ da. Fig. 7.7 represents the execution proposed by TimeSquare with only aperiodic tasks and the following constraints: ds ∼ t1, t1 ∼ t2, t2 ∼ t3, t3 ∼ da. The optional dashed arrows represent instant precedence relations induced by the applied clock constraints.


Note that this is only an abstraction of the behavior, where the task durations are neglected. Additionally, we did not enforce a run-to-completion execution of the whole activity. Therefore, the behavior is pipelined and ds occurs a second time before the first occurrence of da. This is because the operator ∼ is not transitive. An additional constraint (ds ∼ da) would be required to ensure the atomic execution of the whole activity. Finally, this run is one possible behavior and certainly not the only one. Most of the time, and as in this case, clock constraints only impose a partial ordering on the instants of the clocks. Applying a simulation policy reduces the set of possible solutions. The one applied here is the random policy, which relies on a pseudo-random number generator. Consequently, the result is not deterministic, but the same simulation can be replayed by restoring the generator seed.
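To see why the pipelining arises, consider this Python sketch of one common reading of the alternation relation (the k-th tick of a precedes the k-th tick of b, which in turn precedes the (k+1)-th tick of a); the tick dates below are invented to reproduce a pipelined run.

def alternates(a, b):
    """One reading of a ~ b over clocks given as lists of tick dates."""
    return all(
        a[k] < b[k] and (k + 1 >= len(a) or b[k] < a[k + 1])
        for k in range(min(len(a), len(b)))
    )

# ds ~ t1, t1 ~ t2, t2 ~ t3 and t3 ~ da all hold pairwise on this run, yet ds is
# not constrained against da: ds ticks a second time (date 4) before da ever ticks.
ds, t1, t2, t3, da = [1, 4], [2, 6], [3, 8], [5, 10], [7, 12]
print(alternates(ds, t1), alternates(t1, t2), alternates(t2, t3), alternates(t3, da))
print(alternates(ds, da))   # False: alternation is not transitive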

7.5.2.3 Mixing Periodic and Aperiodic Tasks

Logical clocks are infinite sets of instants but we do not assume any periodicity, i.e., the distance between successive instants is not relevant. The clock constraint isPeriodicOn allows the creation of a periodic clock from another one. This is a more general notion of periodicity than the usual acceptation: a clock c1 is said to be periodic on another clock c2 with period P if c1 ticks every P-th tick of c2. In CCSL, this is expressed as follows: c1 isPeriodicOn c2 period P offset δ.

To build a periodic clock with the usual meaning, the base clock must refer to physical time, i.e., it must be a chronometric clock. As in Sect. 7.4, we can discretize idealClk for that purpose and build c100, a 100-Hz clock (7.7).

c100 = idealClk discretizedBy 0.01    (7.7)
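A minimal Python sketch of this construction, assuming clocks are represented as lists of tick dates; the 2:1 and 4:1 periods are illustrative choices matching the harmonic t1/t3 discussed next.

def discretize(step, horizon):
    """Chronometric clock: one tick every `step` seconds up to `horizon` (idealClk discretizedBy)."""
    return [round(k * step, 6) for k in range(int(horizon / step) + 1)]

def is_periodic_on(base, period, offset=0):
    """One reading of: c isPeriodicOn base period P offset d -> every P-th tick of base."""
    return base[offset::period]

c100 = discretize(0.01, 0.2)      # (7.7): a 100-Hz chronometric clock
t1 = is_periodic_on(c100, 2)      # illustrative: t1 every 2nd tick of c100
t3 = is_periodic_on(c100, 4)      # t3 harmonic with t1, twice as slow
print(t1[:4], t3[:3])             # [0.0, 0.02, 0.04, 0.06] [0.0, 0.04, 0.08]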

Figure 7.8 illustrates an execution of the same application when the threads t1 and t3 are periodic. t1 and t3 are harmonic and t3 is twice as slow as t1. Coincidence instant relations imposed by the specification are shown with vertical edges with a diamond on one end. Depending on the simulation policy, there may also be some opportunistic coincidences. Clock ds is not shown at all in this figure since it is completely independent from the other clocks.

Fig. 7.8 Mixing periodic and aperiodic tasks


Note that the first execution of t3 is synchronous with the first execution of t1, even before the first execution of t2. Hence, the task Step3 has no data to consume. This is compatible with the UML semantics only when using data stores. The data stores are non-depleting, so if we assume an initialization step that puts one token in each data store, the data store can be read several times without any other writing. The execution is allowed, but the result may be difficult to anticipate and the same data will be read several times. When the task t1 is slower than t3, i.e., when oversampling, some data may be lost.

The complete CCSL specification for this configuration is available in another work [18].

To interpret the result of the simulation, the TimeSquare VCD viewer annotates the VCD with additional information derived from the CCSL specification. We have already discussed the instant relations (dashed arrows and vertical edges). Figure 7.8 also exhibits the ghost-tick feature. Ghosts may be hidden or shown at will and represent instants when the clock was enabled but not fired. Having a close look at clock da, we can see that its first occurrence is opportunistically coincident with the first occurrence of t2. However, the second occurrences of the two clocks are not coincident. A ghost is displayed (at time 60) to show that both were enabled, but t2 was fired alone; da was actually fired at the next step. In this particular example, though this is not the rule, the converse could equally have occurred. Additionally, this specification is conflict-free, but it may happen that the firing of one clock disables others. These are classical problems, well known from Petri net modeling, that appear with CCSL because we have defined precedence instant relations in addition to coincidence relations.

7.6 MARTE and SDF

This third and last example considers a purely logical case and builds a CCSL library for defining Synchronous Data Flow (SDF) [15] graphs. Section 7.6.1 recalls the basics of SDF (syntax and execution rules). Section 7.6.2 proposes a modular CCSL specification to describe the behavioral semantics of this MoCC. Finally, an example SDF graph is built using our library. The semantics is given with CCSL and the syntax is given by a UML model built with Papyrus. We apply the very same semantic model to two different UML diagrams. The first target is a UML activity, a popular notation to represent data flows. The second target is a UML state machine, whose concrete syntax is close to the usual representation of SDF graphs.


7.6.1 Synchronous Data Flow

Data flow graphs are directed graphs where each node represents a function or a computation and each arc represents a data path. SDF is a special case of data flow in which the number of data samples produced and consumed by each node is specified a priori. This simple formalism is well suited for expressing multi-rate DSP algorithms that deal with continuous streams of data. It is a restriction of Kahn process networks [13] that allows static scheduling and eases parallelization. SDF graphs are essentially equivalent to computation graphs [14], which have been proven to be a special case of conflict-free Petri nets [24].

In SDF graphs, nodes (called actors) represent operations. Arcs carry tokens, which represent data values (of any data type) stored in a first-in first-out queue. Actors have inputs and outputs. Each input (resp. output) has a weight that represents the number of tokens consumed (resp. produced) when the actor executes. SDF graphs obey the following rules (a small simulator sketch illustrating them follows the list):

• An actor is enabled for execution when all inputs are enabled. An input is enabled when enough tokens are available on the incoming arc; enough means equal to or greater than the input weight. Actors with no inputs (source actors) are always enabled. Enabling and execution never depend on the token values, i.e., the control is data-independent.

• When an actor executes, it always produces and consumes the same fixed amount of tokens, in an atomic way. It produces on each output exactly the number of tokens specified by the output weight; these tokens are written into the queue of the outgoing arc. It consumes on each input exactly the number of tokens specified by the input weight; these tokens are read (and removed) from the queue of the incoming arc.

• Delay is a property of an arc. A delay of n samples means that n tokens are initially in the queue of the arc.
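A minimal simulator sketch of these rules in Python (the two-actor graph, the firing policy and all names are invented for illustration; they are not part of the CCSL library described below):

# An arc is (source, out_weight, target, in_weight, delay).
arcs = [("A", 1, "B", 2, 0),    # A produces 1 token per firing, B consumes 2
        ("B", 1, "A", 1, 2)]    # feedback arc carrying a delay of 2 initial tokens
tokens = {i: arc[4] for i, arc in enumerate(arcs)}   # delay = tokens initially queued

def enabled(actor):
    # enabled iff every incoming arc holds at least in_weight tokens
    return all(tokens[i] >= arc[3] for i, arc in enumerate(arcs) if arc[2] == actor)

def fire(actor):
    # atomic firing: consume on incoming arcs, produce on outgoing arcs
    for i, arc in enumerate(arcs):
        if arc[2] == actor:
            tokens[i] -= arc[3]
        if arc[0] == actor:
            tokens[i] += arc[1]

for step in range(6):
    choice = next((a for a in ("A", "B") if enabled(a)), None)   # one arbitrary policy
    print(step, choice, dict(tokens))
    if choice:
        fire(choice)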

7.6.2 A CCSL Library for SDF

With SDF graphs, a local observation is enough to know the dependencies of a given element. So it is possible to construct locally a set of CCSL constraints for each model element. This section describes the library of CCSL relations built for this purpose.

As illustrated in the previous examples, the first stage is to identify which CCSL clocks must be defined to create a time system conforming to the SDF semantics. For each actor A, one CCSL clock A is created. The instants of this clock represent the atomic execution instants of the operation related to the actor. For each arc T, two CCSL clocks write and read are created. Clock write ticks whenever a token is written into the arc queue. Clock read ticks whenever a token is read (and removed) from the queue. Note that the actual number of available tokens is not directly


represented and must be computed, if required, from the difference in the number of read and write operations. No specific clocks are created for inputs and outputs.

The second stage is to apply the right CCSL clock constraints so that the result of the clock calculus can be interpreted to apprehend the behavioral semantics of the SDF graph.

7.6.2.1 Actors

Actors do not require any specific constraint.

7.6.2.2 Tokens

Tokens are written into and read from the arc queues. A token cannot be read before having been written; hence, for a given arc, the i-th tick of write must strictly precede the i-th tick of read. The kernel CCSL relation strict precedence models such a constraint: write ≺ read. When delay > 0, delay tokens are initially available in the queue, which means that the i-th read operation gets the data written at the (i − delay)-th write operation, for i > delay. The first delay read operations actually get tokens that are initially available and do not match any actual write operation. The CCSL operator delayedFor can represent such a delay (7.8). To represent SDF arcs, we propose to create in a library a new relation definition, called token. Such a relation has three parameters: two clocks (write and read) and an integer (delay). The token relation applies the adequate constraint (7.8). Note that, when delay = 0, (7.8) reduces to write ≺ read.

def token(clock write, clock read, int delay) ≜
    write ≺ (read delayedFor delay)    (7.8)

7.6.2.3 Inputs

Inputs require a packet-based precedence (keyword by in (7.9)). A relation definition, called input, has three parameters. The clock actor represents the actor with which the input is associated. The clock read represents the reading of tokens on the incoming arc. The strictly positive integer weight represents the input weight.

def input(clock actor, clock read, int weight) ≜
    (read by weight) ≼ actor    (7.9)

Here again, the packet-based precedence can be built with the filtering operator (7.10). When weight = 1, it reduces to read ≼ actor.

(read filteredBy (0^(weight-1).1)^ω) ≼ actor    (7.10)


7.6.2.4 Outputs

Outputs are represented by the relation definition output, which has three parameters. The clock actor represents the actor with which the output is associated. The clock write represents the writing of tokens to the outgoing arc. The strictly positive integer weight represents the output weight. The CCSL operator filteredBy is used (7.11). When weight = 1, (7.11) simplifies into actor = write.

def output(clock actor, clock write, int weight) ≜
    actor = (write filteredBy (1.0^(weight-1))^ω)    (7.11)

7.6.2.5 Arcs

Arcs can globally be seen as a conjunction of one output, one token and one input. The library relation definition arc (7.12) specifies a complete set of constraints for an arc with a delay delay, from an actor source with an output weight out, to another actor target with an input weight in. In that case, the CCSL clocks write and read are local clocks.

def arc(int delay, clock source, int out, clock target, int in) ≜
    clock read, write
      output(source, write, out)    (7.12)
    | token(write, read, delay)
    | input(target, read, in)

7.6.3 Applying the SDF Semantics

The previous subsection defines, for each SDF model element, the CCSL clocks that must be created and the clock constraints to apply. This section illustrates the use of our CCSL library for SDF. Our purpose is to explicitly add the SDF semantics to an existing model. In this example, we use UML activities (Fig. 7.9a) and state machines (Fig. 7.9c) to build with the Papyrus editor a simple SDF graph (Fig. 7.9b). Our intent is NOT to extend the semantics of UML activities and state machines but rather to use Papyrus as a mere graphical editor to build graphs, i.e., nodes connected by arcs. Our proposal is to attach CCSL clocks to some UML elements represented by the modeling editor. Papyrus gives the concrete graphical syntax and the clock constraints give explicitly the expected execution rules.

By instantiating elements of our CCSL library for SDF, these two models can be given the execution semantics as classically understood for SDF graphs. The idea is to add the semantics within the model explicitly, without being implicitly bound to some external description. In our case, the semantics is given by the CCSL specification and by explicit associations between CCSL clocks and model elements.


Fig. 7.9 Paradigms of deadlocks: (a) activity in Papyrus, (b) SDF graph with actors A, B, C connected by arcs tab (weights 1/2), tbc (weights 2/1) and tcb (weights 1/2, carrying two initial tokens), (c) state machine in Papyrus

Fig. 7.10 A simulation output with TimeSquare

When using the syntax of activities, CCSL clocks that represent actors are associated with actions, and CCSL clocks that represent readings from (resp. writings to) the arc queues are associated with input (resp. output) pins. All other CCSL clocks are left unbound and are just used as local clocks with no direct interpretation on the model. When using the syntax of state machines, only the CCSL clocks that represent actors are bound to the states; the other clocks are left unbound.

Equation (7.13) uses our library to give to the models in parts (a) and (c) the same semantics as understood when considering the SDF graph in the middle part (b). Clocks are named after the model elements with which they are associated, i.e., the clock for actor A is named A. However, this rule is for clarity only and the actual association is done explicitly in the CCSL model.

S ≜ arc(0, A, 1, B, 2) | arc(0, B, 2, C, 1) | arc(2, C, 1, B, 2)    (7.13)

Fig. 7.10 shows one possible execution of this specification produced by TimeSquare. Intermediate clocks are hidden and only actors are displayed. The relative execution of the actors is what matters when considering SDF graphs.


7.7 Conclusion

The UML profile for MARTE extends the UML to model real-time and embedded systems at the appropriate level of description. However, the real-time and embedded domain is vast and much effort has already been put into providing dedicated proprietary UML extensions [5, 11, 25]. MARTE clearly does not cover the whole domain and should be refined to tackle specific aspects and provide larger support for analysis and synthesis tools. Our contribution is to illustrate the use of logical time for system specification. This is done by using the MARTE time subprofile conjointly with CCSL. Logical time is thus an integral part of the functional design that can be exploited by compilation/synthesis tools, rather than restricting time to annotations for performance evaluation or simulation.

At the same time, there is a large demand for standards because the community, tool vendors and manufacturers, wants to rely on industry-proven, perennial languages. Many specifications in various subdomains are issued by various organizations to answer this demand (AADL, AUTOSAR, East-ADL, IP-Xact, ...), even though these subdomains have long covered separate markets and were addressed by different communities. With the emergence of large systems, and systems of systems, we need to combine several aspects of these subdomains into a common modeling environment. The automotive and avionic industries, which were using dedicated processors, are now integrating generic processors and need interoperability between their own models and the models used by electronic circuit manufacturers. Therefore, we need formalisms able to compose these models both syntactically and semantically. Having a common semantic framework is required to ensure interoperability of tools. By selecting several examples from some of these subdomains, we have shown that the MARTE time profile and CCSL can be used to tackle different aspects of some of these emerging specifications. We advocate that the composed model must provide structural composition rules but also a description of the intended semantics, so that some analysis can be done at the model level. Then, this model can serve as a golden specification for equivalence comparison with other formal models suitable for applying specific analysis or synthesis techniques.

References

1. C. Andre. Syntax and semantics of the clock constraint specification language (CCSL). Research Report 6925, INRIA and University of Nice, May 2009.

2. A. Benveniste, P. Caspi, S. Edwards, N. Halbwachs, P. Le Guernic, and R. de Simone. The synchronous languages twelve years later. Proceedings of the IEEE, 91(1):64–83, 2003.

3. A. Cohen, M. Duranton, C. Eisenbeis, C. Pagetti, F. Plateau, and M. Pouzet. N-synchronous Kahn networks: a relaxed model of synchrony for real-time systems. In J. G. Morrisett and S. L. P. Jones, editors, POPL, pages 180–193. ACM, January 2006.

4. P. Cuenot, D. Chen, S. Gerard, H. Lonn, M.-O. Reiser, D. Servat, C.-J. Sjostedt, R. T. Kolagari, M. Torngren, and M. Weber. Managing complexity of automotive electronics using the East-ADL. In Proc. of the 12th IEEE Int. Conf. on Engineering Complex Computer Systems (ICECCS'07), pages 353–358. IEEE Computer Society, 2007.

5. B. P. Douglass. Real-time UML. In W. Damm and E.-R. Olderog, editors, FTRTFT, volume 2469 of Lecture Notes in Computer Science, pages 53–70. Springer, Berlin, 2002.

6. J. Eker, J. Janneck, E. Lee, J. Liu, X. L., J. Ludvig, S. Neuendorffer, S. Sachs, and Y. Xiong. Taming heterogeneity – the Ptolemy approach. Proceedings of the IEEE, 91(1):127–144, 2003.

7. M. Faugere, T. Bourbeau, R. de Simone, and S. Gerard. Marte: also an UML profile for modeling AADL applications. In ICECCS – UML&AADL, pages 359–364. IEEE Computer Society, 2007.

8. P. Feautrier. Compiling for massively parallel architectures: a perspective. Microprogramming and Microprocessors, (41):425–439, 1995.

9. P. Feiler, D. Gluch, and J. Hudak. The architecture analysis and design language (AADL): an introduction. Technical report CMU/SEI-2006-TN-011, CMU, 2006.

10. P. H. Feiler and J. Hansson. Flow latency analysis with the architecture analysis and design language. Technical report CMU/SEI-2007-TN-010, CMU, June 2007.

11. S. Graf. OMEGA: correct development of real time and embedded systems. Software and System Modeling, 7(2):127–130, 2008.

12. R. Johansson, H. Lonn, and P. Frey. ATESST timing model. Technical report, ITEA, 2008. Deliverable D2.1.3.

13. G. Kahn. The semantics of a simple language for parallel programming. In Information Processing, pages 471–475, 1974.

14. R. M. Karp and R. E. Miller. Properties of a model for parallel computations: determinacy, termination, queueing. SIAM Journal on Applied Mathematics, 14(6):1390–1411, 1966.

15. E. A. Lee and D. G. Messerschmitt. Static scheduling of synchronous data flow programs for digital signal processing. IEEE Transactions on Computers, 36(1):24–35, 1987.

16. E. A. Lee and A. L. Sangiovanni-Vincentelli. A framework for comparing models of computation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 17(12):1217–1229, 1998.

17. S.-Y. Lee, F. Mallet, and R. de Simone. Dealing with AADL end-to-end flow latency with UML Marte. In ICECCS – UML&AADL, pages 228–233. IEEE CS, April 2008.

18. F. Mallet and R. de Simone. MARTE vs. AADL for Discrete-Event and Discrete-Time Domains, volume 36 of LNEE, chapter 2, pages 27–41. Springer, Berlin, 2009.

19. F. Mallet, M.-A. Peraldi-Frati, and C. Andre. Marte CCSL to execute East-ADL timing requirements. In ISORC, pages 249–253. IEEE Computer Society, March 2009.

20. OMG. UML Profile for Schedulability, Performance, and Time Specification, v1.1. Object Management Group, January 2005. formal/05-01-02.

21. OMG. Systems Modeling Language (SysML) Specification, v1.1. Object Management Group, November 2008. formal/08-11-02.

22. OMG. UML Profile for MARTE, v1.0. Object Management Group, November 2009. formal/2009-11-02.

23. OMG. Unified Modeling Language, Superstructure, v2.2. Object Management Group, February 2009. formal/2009-02-02.

24. C. Petri. Concurrency theory. In W. Brauer, W. Reisig, and G. Rozenberg, editors, Petri Nets: Central Models and Their Properties, volume 254 of Lecture Notes in Computer Science, pages 4–24. Springer, Berlin, 1987.

25. B. Selic. The emerging real-time UML standard. Computer Systems Science and Engineering, 17(2):67–76, 2002.

26. S. Sriram and S. S. Bhattacharyya. Embedded Multiprocessors: Scheduling and Synchronization, second edition. CRC, Boca Raton, FL, 2009.

27. The ATESST Consortium. East-ADL2 specification. Technical report, ITEA, March 2008. http://www.atesst.org, 2008-03-20.

28. The East-EEA Project. Definition of language for automotive embedded electronic architecture approach. Technical report, ITEA, 2004. Deliverable D.3.6.

29. T. Weilkiens. Systems Engineering with SysML/UML: Modeling, Analysis, Design. The MK/OMG, Burlington, MA, 2008.


Chapter 8
From Synchronous Specifications to Statically Scheduled Hard Real-Time Implementations

Dumitru Potop-Butucaru, Robert de Simone, and Yves Sorel

8.1 Introduction

The focus of this chapter is on the development of hard real-time embedded control systems. The evolution of this field has lately been driven by two major trends:

1. The increasing synergy between traditional application areas for hard real-time (avionics, automotive, robotics) and the ever-growing consumer electronics industry

2. The need for new development techniques to meet the challenges of productivity in a competitive environment

These trends are the result of industrial needs: in avionics and automotive applications, specification size and complexity rapidly increase, while the development cycle needs to be shortened. Meanwhile, the pervasive introduction of real-time embedded control systems in ordinary-life objects, such as phones or home appliances, implies a need for higher reliability, including strict timing requirements, if only due to the large product recall costs. Thus, in both consumer electronics and more traditional safety-critical areas there is a need to build larger, more reliable hard real-time embedded control applications in shorter time frames.

New development techniques have been proposed in response to this need. In this chapter, we present one such technique, allowing the synthesis of efficient hard real-time embedded systems [31]. Our technique has two specific points:

• It allows the synthesis of correct-by-construction distributed embedded systems in a fully automatic way, including both partitioning and real-time scheduling.

• The strength of our approach is derived from the use of Synchronous Reactive (S/R) languages and formalisms and the associated analysis and code generation techniques, adapted of course to our real-time framework.

D. Potop-Butucaru (✉) and Y. Sorel
INRIA, Centre de Recherche de Paris-Rocquencourt, Le Chesnay Cedex, France
e-mail: [email protected]; [email protected]

R. de Simone
INRIA, Centre de Recherche de Sophia-Antipolis, Sophia Antipolis Cedex, France
e-mail: [email protected]



S/R languages [2–4, 20] are high-level modeling formalisms relying on the synchronous hypothesis, which lets computations and behaviors be divided into a discrete sequence of computation steps which are equivalently called reactions or execution instants. In itself, this assumption is rather common in practical embedded system design. But the synchronous hypothesis adds to this the fact that, inside each instant, the behavioral propagation is well-behaved (causal), so that the status of every signal or variable is established and defined prior to being tested or used. This criterion, which may be seen at first as an isolated technical requirement, is in fact the key point of the approach. It ensures strong semantic soundness by allowing universally recognized mathematical models such as Mealy machines and digital circuits to be used as supporting foundations. In turn, these models give access to a large corpus of efficient optimization, compilation, and formal verification techniques. The synchronous hypothesis also guarantees full equivalence between various levels of representation, thereby avoiding altogether the pitfalls of non-synthesizability of other similar formalisms.

Synchronous reactive formalisms have long been used [3] in the design of digital circuits and in the design of safety-critical embedded systems in fields such as avionics, nuclear I&C and railway control. Novel synchronous formalisms and associated development techniques are currently gaining recognition for their modeling adequacy to other classes of embedded applications.

The S/R formalisms used in our approach belong to the class of so-called data-flow formalisms, which shape applications based on intensive data computation and data-flow organization. There are two reasons for this. The first one is that hard real-time (embedded) systems are often designed as automatic control systems, and the de facto standard for the design of automatic control systems is the data-flow formalism Simulink [38]. When a real-time (embedded) implementation is desired, the (continuous-time) Simulink specification is discretized. To represent the resulting discrete-time specification, it is then convenient to use a data-flow synchronous language such as Scade, which is very close in syntax to Simulink, and at the same time gives access to the various analysis and code generation techniques mentioned above.

The second reason for relying on data-flow formalisms is that they are similar to classical dependent task models used in hard real-time scheduling. However, synchronous dataflow formalisms go beyond the dependent task models and the classical dataflow model of Dennis [14] by introducing a form of hierarchical conditional execution allowing the description of execution modes. Execution conditions are represented with so-called logical clocks, which are (internally generated) Boolean expressions associated with data-flow computation blocks. The clocks prescribe which data computation blocks are to be performed as part of the current reaction.

Efficient real-time implementation of such specifications relies on two essential ingredients:

Clock analysis determines the relations between the various hierarchical execution conditions (clocks). For instance, we can determine that two computation blocks can never be executed in parallel, because their execution conditions are never true at the same execution cycle. We say in this case that their clocks are exclusive.


Real-time scheduling consists in the efficient (spatial) allocation and (temporal) scheduling of the various data-flow blocks, taking into account the clock relations determined above. For instance, two computation blocks having exclusive clocks can be allocated on the same processor at the same date (but the conditional execution will ensure that only one is active at a time).

In this chapter we present an offline (static) scheduling technique based on these principles. The schedule is computed offline as a schedule table where all the computations and communications of an execution cycle are allocated in time and space. To allow correct-by-construction implementation, the schedule table must respect consistency rules such as causality. Causality means that all the information needed to compute the clock of a dataflow block, as well as its data inputs, is available before the start date of the dataflow block on the processor where it will run. Once a consistent schedule table is computed, a real-time implementation will have to periodically run the dataflow blocks, as prescribed by the schedule table, on the target architecture. The real-time period on which the schedule table is run must be large enough to ensure that successive runs of the table (which correspond to successive clock cycles of the specification) do not interfere.
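To fix ideas, here is a Python sketch of a schedule table together with a causality check and a lower bound on the real-time period; the table contents, field names and the omission of communication costs are illustrative assumptions, not the formalism developed later in the chapter.

from dataclasses import dataclass

@dataclass
class Entry:
    block: str
    processor: str
    start: float       # start date within the cycle (ms)
    duration: float    # worst-case duration (ms)
    inputs: tuple      # blocks whose results (and clocks) must be known beforehand

table = [
    Entry("acquire", "CPU1", 0.0, 1.0, ()),
    Entry("step1",   "CPU1", 1.0, 2.0, ("acquire",)),
    Entry("step2",   "CPU2", 3.5, 2.0, ("step1",)),
    Entry("release", "CPU2", 5.5, 1.0, ("step2",)),
]

def causal(table):
    """Every block starts only after all of its inputs have completed."""
    end = {e.block: e.start + e.duration for e in table}
    return all(all(end[p] <= e.start for p in e.inputs) for e in table)

def min_period(table):
    """Successive runs of the table must not interfere."""
    return max(e.start + e.duration for e in table)

print(causal(table), min_period(table))   # True 6.5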

The remainder of the chapter is structured as follows: we first give a general view of the synchronous model and of the synchronous formalisms. We explain the rationale behind their introduction, present the basic principles and underlying mathematical models, compare them with neighboring models, and briefly present their implementation issues. Then, we give a brief overview of the problems related to the construction of real-time systems. We then present our solution for the development of distributed real-time implementations from synchronous specifications. This presentation starts with an intuitive example which explains how a synchronous dataflow specification can be transformed into a statically scheduled real-time implementation, and how clock analysis can be used to improve the generated implementations. The remainder of the chapter introduces the formalism allowing the representation of real-time schedules of synchronous specifications, defines the correctness properties such a schedule must satisfy, and provides correctness-preserving optimization techniques.

8.2 The Synchronous Hypothesis

8.2.1 What For?

Program correctness (the process performs as intended) and program efficiency (it performs as fast as possible) are major concerns in all of computer science, but they are even more stringent in the embedded area, as on-line debugging is difficult or impossible, and time budgets are often imperative (for instance in safety-critical systems and multimedia applications).


Program correctness is sought by introducing appropriate syntactic constructs and dedicated languages, making programs more easily understandable by humans, as well as allowing high-level modeling and associated verification techniques. Provided semantic preservation is ensured down to actual implementation code, this provides reasonable guarantees on functional correctness. However, while this might sound obvious for traditional software compilation schemes, the development of an embedded control system is often not "seamless", involving manual rewriting and/or the use of hardware, operating systems and middleware for which the semantics are either undefined, or too complex to be fully integrated in the semantic analysis without some level of abstraction.

Program efficiency is traditionally handled in the software world by algorithmic complexity analysis, and expressed in terms of individual operations. But in modern systems, due to a number of phenomena, this "high-level" complexity reflects rather imperfectly the "low-level" complexity in numbers of clock cycles spent. In the real-time scheduling community, the main solution has been the introduction of various abstraction levels (known as task models and architecture models) allowing the various timing analyses that provide real-time guarantees. Such task models are often very abstract. In the seminal model defined by Liu and Layland [27], control, synchronization and communication are not explicitly represented, meaning that their costs cannot be directly accounted for. What the model defines are repetitive pieces of code (the tasks), bounds on task durations, and real-time constraints (deadlines). The strength of the model (its simplicity) is also its major problem, because timing (schedulability) analyses tend to be overly pessimistic.

The repetitive, cycle-based execution specified by task models is natural in real-time, because it corresponds to the natural implementation of automatic control systems. But cycle-based execution models are not restricted to the real-time community. They are also the norm in the design of digital synchronous circuits, and simulation environments in many domains (from scientific engineering to Hardware Description Language (HDL) simulators) often use lockstep computation paradigms. The difference is that, in these settings, cycles represent logical steps, not physical time. Of course, timing analysis is still possible afterwards, and in fact often simplified by the previous division into cycles. In this chapter we advocate for a similar approach for the RT/E domain.

The focus of synchronous languages is to allow modeling and programming of such systems where cycle (computation step) precision is needed. The objective is to provide domain-specific structured languages for their description, and to study matching techniques for efficient design, including compilation/synthesis/real-time scheduling, optimization, and analysis/verification. The strong condition ensuring the feasibility of these design activities is the synchronous hypothesis, described next.


8.2.2 Basic Notions

What has come to be known as the Synchronous Hypothesis, laying foundations for S/R systems, is really a collection of assumptions of a common nature, sometimes adapted to the framework considered. We shall avoid heavy mathematical formalization in this presentation, and refer the interested reader to the existing literature, such as [2, 3]. The basics are:

Instants and reactions. Behavioral activities are divided according to (logical, abstract) discrete time. In other words, computations are divided according to a succession of non-overlapping execution instants. In each instant, input signals possibly occur (for instance by being sampled), internal computations take place, and control and data are propagated until output values are computed and a new global system state is reached. This execution cycle is called the reaction of the system to the input signals. Although we used the word "time" just before, there is no real physical time involved, and instant durations need not be uniform (or even considered!). All that is required is that reactions converge and computations are entirely performed before the current execution instant ends and a new one begins. This empowers the obvious conceptual abstraction that computations are infinitely fast ("instantaneous", "zero-time"), and take place only at discrete points in (physical) time, with no duration. When presented without sufficient explanations, this strong formulation of the Synchronous Hypothesis is often discarded by newcomers as unrealistic (while, again, it is only an abstraction, amply used in other domains where "all-or-nothing" transaction operations take place).

Signals. Broadcast signals are used to propagate information. At each execution instant, a signal can either be present or absent. If present, it also carries some value of a prescribed type ("pure" signals exist as well, that carry only their presence status). The key rule is that a signal must be consistent (same present/absent status, same data) for all read operations during any given instant. In particular, reads from parallel components must be consistent, meaning that signals act as controlled shared variables.

Causality. The crucial task of deciding when a signal can be declared absent is of utter importance in the theory of S/R systems, and an important part of the theoretical body behind the Synchronous Hypothesis. This is of course especially true of local signals, which are both generated and tested inside the system. The fundamental rule is that the presence status and value of a signal should be defined before they are read (and tested). This requirement takes various practical forms depending on the actual language or formalism considered, and we shall come back to this later. Note that "before" refers here to causal dependency in the computation of the instant, and not to physical or even logical time between successive instants [5]. The Synchronous Hypothesis ensures that all possible schedules of operations amount to the same result (convergence); it also leads to the definition of "correct" programs, as opposed to ill-behaved ones where no causal scheduling can be found.


Activation conditions and clocks. Each signal can be seen as defining (or generating) a new clock, ticking when it occurs; in hardware design, these are called gated clocks. Clocks and sub-clocks, either external or internally generated, can be used as control entities to activate (or not) component blocks of the system. We shall also call them activation conditions.
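
As a toy illustration of these notions (our own Python sketch, not part of any synchronous language), one execution instant can be modeled as a function from present input signals and the current state to output signals and a new state; the signal names and the behavior below are hypothetical.

def reaction(inputs, state):
    """One execution instant: (present inputs, state) -> (outputs, new state)."""
    outputs = {}
    # A signal is present if it appears in the dictionary; it is read only
    # after its presence status has been decided (causality).
    if inputs.get("button") is True:          # input signal present and true
        outputs["led"] = not state["led_on"]  # emit a valued output signal
        new_state = {"led_on": not state["led_on"]}
    else:                                     # 'button' absent or false
        new_state = dict(state)
    return outputs, new_state

# Successive non-overlapping instants; instant durations are irrelevant here.
state = {"led_on": False}
for inp in [{"button": True}, {}, {"button": True}]:
    out, state = reaction(inp, state)
    print(out, state)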

8.2.3 Mathematical Models

If one forgets temporarily about data values, and one accepts the duality of present/absent signals mapped to true/false values, then there is a natural interpretation of synchronous formalisms as synchronous digital circuits at schematic gate level, or "netlists" (roughly RTL level with only Boolean variables and registers). In turn, such circuits have a straightforward behavioral expansion into Mealy Finite State Machines (FSM).

The two slight restrictions above are not essential: the adjunction of types and values into digital circuit models has been successfully attempted in a number of contexts, and S/R systems can also be seen as contributing to this goal. Meanwhile, the introduction of clocks and presence/absence signal status in S/R languages departs drastically from the prominent notion of sensitivity list generally used to define the simulation semantics of Hardware Description Languages (HDLs).

We now comment on the opportunities made available through the interpretation of S/R systems into Mealy machines or netlists.

Netlists. We consider here a simple form, as Boolean equation systems defining the values of wires and Boolean registers as a Boolean function of other wires and previous register values. Some wires represent input and output signals (with value true indicating signal presence), others are internal variables. This type of representation is of special interest because it can provide exact dependency relations between variables, and is thus the right representation level to study causality issues with accurate analysis. Notions of "constructive" causality have been the subject of much attention here. They attempt to refine the usual crude criterion for synthesizability, which forbids cyclic dependencies between non-register variables (so that a variable seems to depend upon itself in the same instant), but does not take into account the Boolean interpretation, nor the potentially reachable configurations. Consider the equation x = y ∨ z, where it has been established that y is the constant true. Then x does not really depend on z, since its (constant) value is forced by y's. Constructive causality seeks the best possible faithful notion of true combinatorial dependency taking the Boolean interpretation of functions into account. For details, see [36]. Another equally important aspect of the mathematical model is that a number of combinatorial and sequential optimization techniques have been developed over the years, in the context of hardware synthesis approaches. The main ones are now embedded in the SIS and MVSIS optimization suites, from UC Berkeley [16, 34]. They come as a great help in allowing programs written in high-level S/R formalisms to compile into efficient code, either software or hardware-targeted [35].

Mealy machines. They are finite-state automata corresponding strictly to the synchronous assumption. In a given state, provided a certain input valuation (a subset of present signals), the machine reacts by immediately producing a set of output signals before entering a new state. Mealy machines can be generated from netlists (and by extension from any S/R system). The Mealy machine construction can then be seen as a symbolic expansion of all possible behaviors, computing the space of reachable states (RSS) on the way. But while the precise RSS is won, the precise causal dependency relations are lost, which is why Mealy FSM and netlist models are both useful in the course of S/R design [41]. When the RSS is extracted, often in symbolic BDD form, it can be used in a number of ways: We already mentioned that constructive causality only considers dependencies inside the RSS; similarly, all activities of model-checking formal verification and test coverage analysis are strongly linked to the RSS construction [1, 7, 8, 37].
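
To illustrate the two interpretations (a hypothetical example of ours, not taken from the chapter), the sketch below writes a two-wire, one-register netlist as ordered Boolean equations and then expands it into a Mealy-style transition table; the particular equations have no special meaning.

def netlist_step(i, r):
    # Equations listed in an order compatible with their dependencies (causal order).
    x = i or r          # x = i ∨ r
    o = x and not i     # o = x ∧ ¬i
    r_next = x          # value stored in the register for the next instant
    return o, r_next

# The same behavior seen as a Mealy machine: states are register valuations,
# transitions map (state, input) to (output, next state).
mealy = {(r, i): netlist_step(i, r) for r in (False, True) for i in (False, True)}

r = False                      # initial register value
for i in (True, False, False):
    o, r = mealy[(r, i)]       # one reaction
    print(i, o, r)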

8.2.3.1 Synchronous Hypothesis Vs. Neighboring Models

Many quasi-synchronous formalisms exist in the fields of embedded system (co-)simulation: the simulation semantics of SystemC and other HDLs at RTL level, the discrete-step Simulink/Stateflow simulation, or the official StateCharts semantics, for instance. Such formalisms generally employ a notion of physical time in order to establish when to start the next execution instant. Inside the current execution instant, however, delta-cycles allow zero-delay activity propagation, and potentially complex behaviors occur inside a given single reaction. The main difference here is that no causality analysis (based on the Synchronous Hypothesis) is performed at compilation time, so that an efficient ordering/scheduling cannot be pre-computed before simulation. Instead, each variable change recursively triggers further re-computations of all depending variables in the same reaction.

8.2.4 Synchronous Languages

Synchronous netlists and Mealy machines provide not only the main semantic models, but also the two fundamental modeling styles followed by the various synchronous languages and formalisms: By adding more types and arithmetic operators, as well as activation conditions to introduce some amount of control-flow, the modeling style of netlists can be extrapolated to the previously mentioned data-flow (block-diagram) networks that are common in automatic control and multimedia digital signal processing. The data-flow synchronous languages can be seen as attempts to provide structured programming to compose large systems modularly in this class of applications. Similarly, imperative synchronous languages provide ways to program in a structured way hierarchical systems of interacting Mealy FSMs.

The imperative style is best illustrated by the Esterel language [6]. Its syntax includes classical imperative control flow statements (sequence, if-then-else, loop), explicit concurrency, and the statements that operate on the signals and state of the program by:

- Emitting a signal, i.e., making it present for the current execution cycle, and giving it a value, if it carries a value
- Defining a halting point where the control flow stops between execution cycles
- Conditionally starting, stopping, or suspending for a cycle the execution of a (sub-)statement based on the status of a signal

The clock (activation condition) of a statement in an Esterel program is determined by the position of the statement in the syntax tree, and by the signals controlling its execution through enclosing start/stop/suspend statements. Signals are therefore the language construct allowing explicit clock manipulation.

The data-flow, or declarative, style is best illustrated by the Signal/Polychrony [19] and Lustre [21] languages. Both languages structure programs as hierarchical networks of dataflow blocks that communicate through signals that are also called flows. Special blocks allow the encoding of state and control. State is represented through delay blocks that transmit values from one execution instant to the next. Control is encoded by (conditional) sampling and merge blocks. The clocks controlling the execution and communication of all blocks and flows are defined through clock constraints. Clock constraints are either implicitly associated to the dataflow blocks (e.g., an adder block and all its input and output flows have the same clock), or explicitly defined by the programmer (e.g., data never arrives on the data input without an accompanying opcode on the instruction input, but an opcode can arrive without data). The main difference between the various data-flow formalisms comes from the restrictions they place on sampling and merge, and from the clock analyses that:

- Determine that clock constraints are not contradictory
- Transform clock constraints into actual clocks that can be efficiently computed at run-time

The Signal/Polychrony language is a so-called multi-clock synchronous language where clocks can be defined and used without specifying their relation to the base clock of the specification (the activation condition of the entire program, which is true at every execution cycle). Complex clock constraints can therefore be defined that often have more than one solution. Efficient heuristics are then used to choose one such solution allowing code generation. The Lustre language can be seen as the single-clock subset of Signal/Polychrony, because it only allows the definition of clocks that are constructed by sampling from the base clock.


The Esterel, Signal, and Lustre languages mentioned above have a particular position in synchronous programming, in both research and industrial areas. Their development started in the early 1980s, and they have been the subject of sustained research ever since. All three languages have been incorporated into commercial products: EsterelStudio for Esterel, Scade for Lustre, and Sildex and RTBuilder for Signal.

However, synchronous language research has also produced a large array of formalisms adapted to specific tasks, such as formal verification (e.g., the Esterel variant Quartz [33]), the programming of interactive systems such as virtual realities, GUIs or simulations of large reactive systems [40], or the development of signal and video processing systems, where the n-synchronous Kahn networks [11] allow the natural representation of clocks involving regular activation patterns (e.g., every eight samples, the filter will compute one output as the average of samples 1, 3, and 7) and define a notion of consistent buffering allowing the direct connection of dataflows with different clocks but the same throughput.

We mentioned here only a few synchronous languages, and we invite the interested reader to reference papers dedicated exclusively to the presentation of the synchronous approach [3]. In the remainder of the chapter, we shall focus on a particular, very simple synchronous formalism, which is used as input by the real-time scheduling tool SynDEx [17]. Like Lustre, it is a single-clock synchronous language, but with a simpler clock definition language. After defining the formalism in Sect. 8.4.2, we fully present in the remainder of the chapter a technique for optimized real-time implementation of such specifications.

8.2.5 Implementation Issues

The problem of implementing a synchronous specification mainly consists in defining the step reaction function that will implement the behavior of an instant, as shown in Fig. 8.1. Then, the global behavior is computed by iterating this function for successive instants and successive input signal valuations. Following the basic mathematical interpretations, the compilation of a S/R program may either consist in the expansion into a flat Mealy FSM, or in the translation into a flat netlist (with more types and arithmetic operators, but without activation conditions).

Fig. 8.1 The reaction function is called at each instant to perform the computation of the current step

reaction () {
  decode state; read input;
  compute;
  write output; encode state;
}


The run-time implementation consists here in the execution of the resulting Mealy machine or netlist. In the first case, the automaton structure is implemented as a big top-level switch between states. In the second case, the netlist is totally ordered in a way compatible with causality, and all the equations in the ordered list are evaluated at each execution instant. These basic techniques are at the heart of the first compilers, and still of some industrial ones.

In the last decade fancier implementation schemes have been sought, relying on the use of activation conditions: During each reaction, execution starts by identifying the "truly useful" program blocks, which are marked as "active". Then only the actual execution of the active blocks is scheduled (a bit more dynamically) and performed in an order that respects the causality of the program. In the case of data-flow languages, the activation conditions come in the form of a hierarchy of clock under-samplings – the clock tree, which is computed at compile time. In the case of imperative formalisms, activation conditions are based on the halting points (where the control flow can stop between execution instants) and on the signal-generated (sub-)clocks.
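
The following sketch (ours; block names, clocks, and values are made up) illustrates this activation-condition-driven execution: in each reaction, a block is run only if its clock evaluates to true on the signals already computed, in a statically fixed causal order.

blocks = [
    # (name, activation condition, computation)
    ("read_hs", lambda s: True,              lambda s: {"HS": s["inputs"]["HS"]}),
    ("fast",    lambda s: s["HS"] is True,   lambda s: {"V": 1}),
    ("slow",    lambda s: s["HS"] is False,  lambda s: {"V": 2}),
]

def reaction(inputs):
    s = {"inputs": inputs}
    for name, clock, compute in blocks:      # static causal order
        if clock(s):                         # is the block active in this instant?
            s.update(compute(s))
    return {k: v for k, v in s.items() if k != "inputs"}

print(reaction({"HS": True}))   # {'HS': True, 'V': 1}
print(reaction({"HS": False}))  # {'HS': False, 'V': 2}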

When the implementation is distributed over a network of processors, the specified functionality is implemented not by one, but by several iterating functions that run in lockstep. Input reading is distributed among the iterating functions running on the various processors. Similarly, state storage, encoding, and decoding are also distributed. The information transmitted through the communication media must allow not only the computation of the execution condition of each dataflow block, but also the reliable communication between blocks on different processors (e.g., after being sent, no message remains in the communication stack due to unmatched execution conditions, because this could block or confuse the system). To ensure temporal predictability, we require that communications on the bus(es) are statically scheduled (modulo conditional communication). This requires extra synchronization, but communication time must remain small, so that the real-time implementation can meet its deadlines. Balancing between these two objectives – functional correctness and communication efficiency – is necessarily based on the fine manipulation of both time and execution conditions, which is the objective of the technique presented in Sect. 8.5.

8.3 Real-Time Embedded (RT/E) Systems Implementation

The digital electronic (software and hardware) part in complex applications found in domains such as automobile, aeronautics, telecommunications, etc., is growing rapidly. On the one hand it increasingly replaces mechanical and analog devices which cost a lot and are too sensitive to failures, and on the other hand it offers to the end-users new functionalities which may evolve more easily. These applications share the following main features:


- Automatic-control and discrete-event: they include control laws, possibly using signal and image processing, as well as discrete events, in order to schedule these control laws through finite state machines.

- Critical real-time constraints: they must satisfy well-known deadlines, often equal to a sampling rate (period), and/or end-to-end delays (latencies [12]) that are more difficult to master; otherwise the application may fail, leading to a human, ecological or financial disaster.

- Embedding constraints: they are mobile and rely on limited resources because of weight, size, energy consumption, and price limitations.

- Distributed and heterogeneous hardware architecture: they are distributed in order to provide enough computing power through parallelism, but also for the purpose of modularity, and to keep the sensors and actuators close to the computing resources. Furthermore, fault tolerance imposes redundant architectures to cope with failures of hardware components. They are also heterogeneous because different types of resources (processors, ASICs, FPGAs, various communication lines) are necessary to implement the intended functionalities while satisfying the constraints. The reader must be aware that distributed real-time systems are considerably more difficult to tackle than centralized ones (only one type of resource), which is why the most significant results in the literature are given for the latter case.

Taking all these features into account is a great challenge that only a formal (mathematically based) methodology can properly address. The typical approach consists in decomposing a large C program (produced, for instance, from a higher-level model) into real-time tasks, and then using an RTOS (Real-Time Operating System) for the implementation. This approach is no longer efficient enough to cope with the complexity of today's applications, mainly because there is a gap between the specification step and the implementation step. This does not mean that the constraints cannot eventually be satisfied, but that the development cycle will take too long, essentially because of the real-time tests, which must cover as many cases as possible. This is the reason why we propose to automatically generate an implementation satisfying the real-time and resource constraints by transforming the synchronous specification into a statically scheduled and distributed program running on the embedded architecture.

There are two main issues when distributed real-time implementation is considered:

- Real-time scheduling: Liu & Layland's pioneering work provided in 1973 [27] the main principles for the schedulability of real-time independent preemptive periodic task systems in the case of a uniprocessor architecture. This schedulability analysis allows the designer to decide whether a real-time task system will be schedulable by performing simple tests involving the timing characteristics of every task, i.e., its period, duration, and deadline (a minimal sketch of such a test is given after this list). We recall that a task is simply an executable program which is temporally characterized. The schedulability analyses are all based on simple fixed priority scheduling algorithms such as rate monotonic (RM) and deadline monotonic (DM), or on variable priority scheduling algorithms such as earliest deadline first (EDF) and least laxity first (LLF). These works have been extensively used and improved over the years, to take into account dependent tasks sharing resources, aperiodic and sporadic tasks, etc. However, there are several levels of dynamicity in these approaches which induce indeterminacy. The first one is related to the difficulty the designer has in assessing the cost of the scheduler, which is obviously a program, like a task. This cost can be decomposed into two parts, a constant part corresponding to the scheduling policy and a variable part corresponding to the preemption mechanism. The constant cost is more difficult to assess in the case of variable priorities than in the case of fixed ones. The cost of preemption is very difficult to assess since, at best, only an upper bound of the number of preemptions [9] is known. As stated by Liu & Layland, the usual solution consists in including its worst-case cost inside the task durations. Another solution is the use of non-preemptive scheduling policies, such as the one presented in this chapter. The problem of non-preemptive scheduling algorithms is that some task systems that are schedulable using a preemptive algorithm become non-schedulable. The second level of dynamicity is due to the fact that some tasks may have parts conditioned by the result of a test on variables whose values are known only at execution time. Thus, depending on the path taken for each condition, the execution cost of the task may be different. The solution is again to include the worst-case run-time of each task in the task duration, but this is often very pessimistic and leads to over-dimensioning of the systems.

- Resource management: multiprocessor (distributed) embedded architectures feature more than one computing resource. Consequently, the real-time scheduling problem increases in difficulty. There are two possible approaches for multiprocessor real-time scheduling. Partitioned multiprocessor scheduling is based on a specific scheduling algorithm for every processor and on an assignment (distribution) algorithm, used to assign tasks to processors (a bin-packing problem), which implies the use of heuristics since these problems are NP-hard. Global multiprocessor scheduling is based on a unique scheduling algorithm for all the processors. A task is put in a queue that is shared by all the processors and, since migration is allowed, a preempted task can return to the queue to be allocated to another processor. Tasks are chosen in the queue according to the unique scheduling algorithm, and executed on the processor which is idle. The two approaches cannot be opposed since one may solve a problem while the other one may fail, and vice-versa [26]. On the other hand, some scheduling algorithms such as EDF have their multiprocessor variants [13, 28]. Nevertheless, the global multiprocessor approach induces a cost for migrating tasks that is too high for the processors available nowadays (this may change in the following years due to improvements in integrated circuit technology). Currently, the partitioned multiprocessor scheduling approach is the one used for realistic applications in the industrial world. Here again indeterminacy can be minimized using static approaches rather than dynamic ones. In addition, since the problem must be solved with heuristics, criteria other than time, e.g., memory, power consumption, etc., can be added to the optimization problem, possibly under constraints such as deadlines, task dependences, etc. [22–24].
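
As a concrete instance of the uniprocessor schedulability tests mentioned in the first item above (the sketch announced there), here is the classical Liu & Layland sufficient utilization-bound test for rate-monotonic scheduling of independent periodic tasks; the task set itself is invented for illustration.

def rm_utilization_test(tasks):
    """Sufficient RM test: U = sum(C/T) <= n * (2**(1/n) - 1)."""
    n = len(tasks)
    u = sum(c / t for c, t in tasks)
    bound = n * (2 ** (1 / n) - 1)
    return u, bound, u <= bound

tasks = [(1, 4), (2, 8), (1, 10)]   # hypothetical (WCET C, period T) pairs
u, bound, ok = rm_utilization_test(tasks)
print(f"U = {u:.3f}, bound = {bound:.3f}, schedulable (sufficient): {ok}")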

8.4 The SynDEx Specification Formalism

Our technique for optimized real-time implementation has evolved from needs in the development of the SynDEx optimized distributed real-time implementation tool [17]. Hence, our presentation is based on the specification formalism of the SynDEx real-time scheduling tool, which we slightly adapt to suit the needs of our presentation. The results of the chapter can be easily generalized to cover hierarchical conditioned dataflow models used in embedded systems design, such as Lustre/Scade [21, 32] or Simulink [38]. The definitions of this section shall be exemplified in Sect. 8.5.

The SynDEx tool takes as input three types of information: The topology of the target hardware architecture, the specification of the behavior to be implemented (a synchronous data-flow program), and the timing information giving the cost of the basic data-flow constructs on the various computing and communication resources of the architecture.

8.4.1 The Architecture

In order to focus on the relation between time and execution conditions, we consider in this chapter simple architectures formed of a unique broadcast bus denoted B that connects a set of processors P = {P_i | i = 1..n}. The architecture example of Fig. 8.3 is formed of three processors connected to the central bus.

We assume all communication and synchronization is done through asynchronous message passing. We also make the simplifying assumption that the bus is reliable (no message losses, no duplications, no data corruption). The reliability assumption amounts to an abstraction of real-life asynchronous buses, such as CAN [29], which is realistic in certain settings [17] (pending the use of fault-tolerant communication libraries and a reliability analysis that are not covered in this chapter).

We further assume for the scope of this chapter that no form of global or local time is used to control execution (e.g., through timeouts), either synthesized by clock synchronization protocols or integrated to the hardware platform through the use of time-triggered buses such as TTA [29] and FlexRay. However, the optimization technique we develop can be extended (with weaker optimality results) to such cases.


The formal model of our execution mechanism will be given under the form of constraints on the possible schedules in Sect. 8.8. We give in Sect. 8.5 the intuition behind these constraints.

8.4.2 Functional Specification

8.4.2.1 Syntax

In SynDEx, functional specification is realized using a synchronous formalism very similar to Scade/Lustre. A specification, also called algorithm in SynDEx jargon, is a synchronous hierarchic dataflow diagram with conditional execution. It is formed of a hierarchy of dataflow nodes, also called operations in SynDEx jargon. Each node has a finite, possibly void set of named and typed input and output ports. We respectively denote with I(n) and O(n) the sets of input and output ports of node n. To each input or output port p we associate its type (or domain) D_p.

An algorithm is a hierarchic description, where hierarchic decomposition is defined by means of dataflow graphs. A dataflow graph is a pair G = (N_G, A_G) where N_G is a finite set of nodes and A_G is a subset of (⋃_{n∈N_G} O(n)) × (⋃_{n∈N_G} I(n)) satisfying two consistency properties:

Domain consistency: For all (o, i) ∈ A_G, we have D_o = D_i.

Static single assignment: For all i ∈ ⋃_{n∈N_G} I(n), there exists a unique o such that (o, i) ∈ A_G.

The nodes are divided into basic nodes, which are the leaves of the hierarchy tree (the elementary dataflow executable operations), and composed nodes, which are formed by composition of other nodes (basic and composed). There are two types of basic nodes: dataflow functions, which represent elementary dataflow computations, and delays, which are the state elements. Each delay d has exactly one input and one output port of the same type, denoted D_d, and also has an initializing value d_0 ∈ D_d.

Each composed node has one or more expansions, which are dataflow graphs. We denote with E(n) the set of expansions of node n. The expansion(s) of a composed node define its possible behaviors. Composed nodes with more than one expansion are called conditioned nodes and they need to choose between the possible dataflow expansions at execution time. This choice is done based on the values of certain input ports called condition input ports. We denote with C(n) ⊆ I(n) the set of condition ports of a conditioned node n. The association of expansions to condition port valuations is formally described using a partial function:

cond_n : ∏_{i∈C(n)} D_i → E(n)


The function need not be complete, because not all input valuations are always feasible. However, the function must be defined for all feasible combinations of conditioning inputs. We assume this has already been checked.

An expansion G = (N_G, A_G) of a node n must satisfy the following consistency properties:

- There exists an injective function from I(n) to N_G associating to each port i the basic node i_G having no input ports and one output port of domain D_i. We call i_G an input node.

- There exists an injective function from O(n) to N_G associating to each port o the basic node o_G having no output ports and one input port of domain D_o. We call o_G an output node.

We assume the hierarchy of nodes is complete, the algorithm being the topmost node. We also assume that the algorithm has no input or output ports. This implies that interaction between the system and its environment is represented by dataflow functions.

To simplify notations, we assume that the algorithm is used in no expansion, and that each other node is used in exactly one expansion. We denote with N_n the set of nodes used in the hierarchic description of n.
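
A possible (simplified and hypothetical) encoding of this formalism is sketched below: nodes carry typed ports, conditioned nodes carry several expansions, and cond_n is a finite map from condition-port valuations to expansion labels. The concrete node shown is only loosely inspired by the C1 node of the example in Sect. 8.5.

from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    inputs: dict = field(default_factory=dict)     # port name -> domain D_p
    outputs: dict = field(default_factory=dict)
    condition_ports: tuple = ()                    # C(n), subset of the inputs
    expansions: dict = field(default_factory=dict) # label -> list of sub-nodes
    cond: dict = field(default_factory=dict)       # valuation of C(n) -> label

# A conditioned node in the spirit of C1: one Boolean condition port HS and
# one expansion per value of HS (the expansion contents are simplified).
c1 = Node(name="C1",
          inputs={"HS": "bool"}, outputs={"ID": "int"},
          condition_ports=("HS",),
          expansions={"HS=true": ["G"], "HS=false": ["F1"]},
          cond={(True,): "HS=true", (False,): "HS=false"})

def chosen_expansion(node, valuation):
    """cond_n applied to a valuation of the condition ports C(n)."""
    return node.expansions[node.cond[valuation]]

print(chosen_expansion(c1, (False,)))   # ['F1']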

8.4.2.2 Operational Semantics

The operational semantics is the classical cycle-based synchronous execution model which is also used in Scade/Lustre. The execution of a node (the algorithm included) is an infinite sequence of cycles where the node reads all its inputs and computes all its outputs. The behavior of a node depends on its type:

- The computation of a dataflow function is atomic: All inputs are waited for and read, then the non-interruptible computation is performed, and then the outputs are all produced.

- The delays are state elements. When executed in a cycle, a delay delivers the value stored in the previous cycle. In the first cycle, no previous value is available, and the delay d delivers the initializing value d_0. Then, it waits and reads its input, and concludes the execution of the cycle by storing this new value for the next cycle.

- If a composed node has condition input ports, then its execution starts by waiting and reading the values of the conditioning inputs. These values are used to choose one expansion, and execution proceeds as though the node were replaced by its expansion. In particular, no atomicity is required and two nodes can be executed in parallel if no dependency exists between them.

As mentioned above, it is assumed that any possible combination of conditioning inputs of a node n is assigned an expansion through cond_n. This amounts to a partition of the possible configurations of the conditioning inputs among expansions. This requirement ensures an important correctness property: every value that is read in a cycle has been produced in that cycle (conditional execution never leads to the use of uninitialized data).

Whenever a data-flow block or communication arc x is executed at an instant, if m is an expansion of a conditioned node n that encloses x, then the current values of the input ports in C(n) are in cond_n⁻¹(m). Conversely, the clock of x is the conjunction, for all m ∈ E(n) enclosing x, of the predicates requiring that the current values of the input ports in C(n) are in cond_n⁻¹(m).

8.5 Motivating Example

We give in this section an example of a full SynDEx [17] specification, including the definition of the architecture, dataflow, and timing information. We then explain what type of implementation SynDEx generates, and show that the generated real-time schedule is sub-optimal. Finally, the type of optimizations we can realize is illustrated with a version of the SynDEx schedule where a finer clock analysis and clock-driven schedule optimizations result in a significantly shorter cycle duration. We thus motivate the following sections, which introduce the formal apparatus needed to perform these clock analyses and optimizations.

8.5.1 Example of SynDEx Specification

Figures 8.2 and 8.3 give our SynDEx specification example in an intuitive graphical form that is more suited for presentation than the original one.

Fig. 8.2 The data-flow of a simple SynDEx specification (the figure itself is not reproduced here; it shows the conditioned nodes C1 and C2, the dataflow functions F1, F2, F3, G, M, N, the input nodes HS IN and FS IN, and the signals HS, FS, V, and ID)


Fig. 8.3 The architecture and timing information of a simple SynDEx specification (the figure itself is not reproduced here; it shows three processors P1, P2, P3 connected to an asynchronous broadcast bus, annotated with per-node WCETs and per-data-type bus transmission costs)

8.5.1.1 Dataflow

Dataflow nodes are represented by boxes on which the ports (black boxes) are placed. Circle boxes are the input and output nodes. The remaining boxes are the other basic and composed nodes. The labels "HS=true" and "HS=false" identify the two expansions of the conditioned node C1 and their clocks. Similarly, the labels "FS=true" and "FS=false" identify the two expansions of conditioned node C2 and their clocks. For clarity reasons, we only name four output ports: HS, FS, V, and ID. The first two correspond to the values acquired from the environment by the nodes "FS IN" and "HS IN". The output of F2 is V, and the output of C1 is ID.

Our specification represents a system with two switches (Boolean inputs) controlling its execution: high-speed (HS=true) vs. low-speed (HS=false), and fail-safe (FS=true) vs. normal operation (FS=false). In the low-speed mode, more operations can be executed, whereas in the fail-safe mode the operation that gets executed (N) does not use any of the inputs (which corresponds to the use of default values).

The specified behavior is the following: At each execution cycle, read FS and HS. Depending on the value of HS, either execute F1, followed by F2 and F3, or execute G. Both F1 and G compute the output value ID of the conditioned node C1. Depending on the value of FS, we either wait for ID and then execute M, or execute N.

Dataflow blocks having no dependency between them can be executed in parallel. For instance, if FS=true then N can be executed and output as soon as FS is read. On the contrary, the computation of M must wait until both FS and ID have arrived.

8.5.1.2 Architecture and Timing Information

The architecture model we use is formed of three processors connected to an asynchronous broadcast bus, as explained in Sect. 8.4.1.

The timing information specifies:

- The atomic dataflow nodes that can be executed by each processor, and their WCET (Worst-Case Execution Time) on that processor. Each processor has a list of timing information of the form "node=wcet".


- The communication operations that can be executed by the bus (the data types that can be sent atomically), and the worst-case duration of the communication, assuming the bus is free. The bus has a list of timing information of the form "data type=wcet".

Timing information for one atomic node can be present on several processors, to denote the fact that the node can be executed by several processors. For instance, node F3 can be executed on P2 with WCET=3, and on P3 with WCET=2.

We assume that all processors are sequential (no parallel execution is possible), and that the bus can be used for no more than one Send operation at a time.
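
Under these assumptions, the timing information can be kept in simple per-resource tables, as in the hypothetical sketch below; the F3 figures are the ones quoted above, while the bus entry and the infinite duration used to encode "cannot run here" (a convention made explicit later, in Sect. 8.7.1) are illustrative.

wcet = {
    "P2": {"F3": 3},          # F3 can run on P2 with WCET 3 ...
    "P3": {"F3": 2},          # ... and on P3 with WCET 2
}
bus_cost = {"boolean": 2}     # worst-case bus duration per data type (illustrative)

def duration(proc, node):
    # Infinite duration encodes "node cannot be executed on this processor".
    return wcet.get(proc, {}).get(node, float("inf"))

print(duration("P3", "F3"), duration("P1", "F3"))   # 2 inf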

8.5.2 Schedule Generated by SynDEx

The SynDEx tool takes the previously defined specification and produces a static schedule of all computations and communications for one execution cycle. The schedule of a full execution is obtained by repeating the execution cycle schedule for as many cycles as needed. The tool tries to minimize the worst-case duration of one execution cycle (the latency), and not the input and output rates (the throughput).

The left-hand part of Fig. 8.4 gives the schedule generated by SynDEx for our example. The schedule is formed of one lane for each processor, and one lane for the bus. Each atomic node is scheduled onto one processor, while the bus carries inter-processor communications. Given that the bus is broadcast, communication operations are of the form "Send(P,OP)", where P is the sender processor, and OP is the output port whose value is sent.

The clock of each scheduled node and communication is figured after the name of the operation. For instance, HS IN is executed in all cycles to read the HS value from the environment, because its clock ("@true") is always true. Similarly, F1 is only executed when the value read for HS is false. Execution conditions (clocks) are predicates over the port names of the dataflow.

The width of an operation (its support) inside its lane intuitively represents its clock (the larger it is, the simpler its clock). Much like in Venn diagrams, the logical relations between the clocks of various operations are given by the horizontal overlapping of supports. Given operations O1@P1 and O2@P2 inside a lane, their supports are disjoint to show that P1 ∧ P2 = false.

In this case, the two operations are never both executed in a single cycle. They can be scheduled independently and can be placed side by side in a lane without contradicting the sequentiality of processors and buses.

The last lane in Fig. 8.4 contains an instance of such independent operations. Whenever we cannot prove that P1 ∧ P2 = false, we must assume that the operations O1@P1 and O2@P2 can both be executed in the same cycle. Such operations have overlapping supports in Fig. 8.4. Non-conditioned operations, like "HS IN@true", use the full width of the lane.

Fig. 8.4 The static schedule generated by SynDEx and what we want to obtain through optimizations (we only figured the optimized lanes that differ from the SynDEx-generated ones). Time flows from top to bottom. We give here the schedule for one execution cycle. An execution of the system is an infinite repetition of this pattern. The width of an operation inside its lane intuitively represents its execution condition (clock), as explained in Sect. 8.5.2 (the figure itself is not reproduced here)

Each scheduled operation (dataflow node or communication) is assigned a start date and an end date. While this presentation of the schedules may suggest a time-triggered implementation, this is not the case. Currently, SynDEx produces schedules that are implementable as latency-insensitive systems [39] where the communication operations allow the preservation of the scheduled sequence of node computations on each processor and the sequence of communication operations on the bus (static scheduling modulo conditional execution). For instance, M can only be executed after both FS and ID have been received by P3 and FS=false. The start and end dates of a scheduled operation should therefore be interpreted as worst-case start and end dates for operations in a latency-insensitive system (thus, timing errors will not affect the functional correctness and determinism of the system).

Intuitive constraints define the correctness of a schedule (they will be formalized later in Sect. 8.8):

- For each scheduled node, the values needed to compute its clock, as well as its inputs, are available on the processor where the operation has been scheduled.

- For each Send operation on the bus, the data to send is already available on the sending processor, and the values needed to compute its clock are available on all processors (having been sent on the bus previously). This way, when a Send occurs, all processors know what to do with the incoming value: keep it, or discard it.

8.5.3 Optimizing Communications

In Fig. 8.4, note that the schedule generated by SynDEx contains redundant bus communications:

- Two send operations of signal ID exist, with execution conditions "HS=false" and "FS=false". Therefore, ID is sent twice on the bus in cycles where both HS and FS are false.

- In cycles where HS=false and FS=true, ID is sent over the bus, even though the node M which uses ID is not executed.

The first problem here is to determine such redundancies. This amounts to verifying the satisfiability of predicates over the (possibly non-Boolean) ports of the dataflow. This can be done using approximate satisfiability checks that determine a subset of these redundancies (but never report false redundancies). This chapter determines, in Sect. 8.9, the predicates corresponding to the redundancies we seek to remove. We do not discuss the arbitrarily complex satisfiability verification techniques that can be used on these predicates. A variety of techniques exist, ranging from the low-complexity one implicitly used in SynDEx to BDD-based techniques like those of Wolinski [25]. We simply explain here how this computation can be done on our example.
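
For purely Boolean condition ports, one simple (though exponential in the number of ports) way to check such exclusiveness is to enumerate valuations, as in the hypothetical sketch below; BDD-based techniques would replace the enumeration for larger cases.

from itertools import product

def exclusive(clock1, clock2, ports):
    """True if no valuation of the Boolean ports satisfies both clocks."""
    return not any(clock1(v) and clock2(v)
                   for v in (dict(zip(ports, bits))
                             for bits in product([False, True], repeat=len(ports))))

ports = ["HS", "FS"]
send_p1_id = lambda v: v["HS"] is False           # Send(P1,ID)@(HS=false)
send_p2_id = lambda v: v["FS"] is False           # Send(P2,ID)@(FS=false)
print(exclusive(send_p1_id, send_p2_id, ports))   # False: both hold when HS=FS=false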

Once a set of redundant communications has been determined, we optimize the schedule accordingly by (1) recomputing the clocks of the communications to remove the redundancies, and (2) compacting the schedule based on the new exclusiveness properties. Section 8.9 defines such an optimization technique consisting in a sequence of low-complexity transformations of the initial schedule. The output of the transformations is optimal, in the sense that no further minimizations of the clocks can be done while guaranteeing the functional correctness of the implementation, on the basis of the redundancies provided by the previous step.

We exemplify these optimizations in Fig. 8.4, which gives in its right-hand part the optimized version of the SynDEx-generated schedule (we only figured the lanes that change after optimization). Here, (1) the clock of Send(P1,ID) is changed, so that ID is never sent on the bus when M is not executed, (2) the clock of Send(P2,ID) is changed, so that ID is not sent a second time by P2 when P1 has already sent it, and (3) Send(P2,ID) is moved earlier in the schedule, which is now possible. Like in a Tetris game (falling tiles), the operations Send(P1,V), M, and F3 can be moved earlier in the schedule. The optimized schedule has a worst-case cycle duration that is 20% (four time units) better than the SynDEx-generated one. Note that our transformations do not alter the allocation or the order of operations of the initial schedule (just the clocks). For instance, we do not exchange the slots of HS IN and FS IN, even though the existing choice is not optimal.

The key point of our formal approach is the use of a notion of execution condition, defined in the next section. An execution condition represents both a clock and the executable code used in the implementation to compute this clock (a hierarchy of conditionals, and the associated data dependencies). Manipulating execution conditions amounts to both defining new clocks and transforming executable code.

8.6 Execution Conditions

An execution condition is a pair c = <clk(c), supp(c)> formed of:

- A clock clk(c), which is a predicate over the values produced by the ports of the dataflow.¹ This predicate is defined using a set of operators given in this section, allowing us to also see it as the sequence of operations for computing the clock.

- The support supp(c) of the execution condition, consisting of the set of ports whose value is needed to compute clk(c), and, for each such port, the execution condition defining the cycles where the value is needed.

To give an intuition of this definition, assume in Fig. 8.4, last lane, that the guarded execution of Send(P2,ID)@(FS=false ∧ HS=true) is implemented as:

if (FS=false) then
  if (HS=true) then
    Send(P2,ID)

1 Clocks can be easily extended to include cycle indices, which is of help when dealing with multi-periodic systems.


Then, it is natural to use the execution condition c with clk(c) = [FS = false][HS = true], which specifies that FS is evaluated first, and that if its value is false we evaluate HS, and if HS=true then the predicate is true, otherwise it is false. The support of c is supp(c) = {FS@true, HS@<[FS = false], {FS@true}>}, which specifies that we can compute the predicate provided FS is available at each cycle (FS@true), and provided HS is available in cycles where FS=false. The support information is necessary to check the correctness of the communications, and possibly to optimize them.

Other execution conditions with different supports can compute the same Boolean function (FS=false ∧ HS=true), for instance c′ with clk(c′) = [FS = false ∧ HS = true] and supp(c′) = {FS@true, HS@true}, which intuitively corresponds to the implementation:

if (FS=false and HS=true) then Send(P2,ID)
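
A minimal, hypothetical encoding of these two execution conditions is sketched below: the clock is kept as a sequence of nested guards (mirroring the 'if' structure used to evaluate it) and the support maps each port to the condition under which its value must be available; the string encodings are only illustrative.

c = {"clk": ["FS == False", "HS == True"],            # [FS=false][HS=true]
     "supp": {"FS": "true", "HS": "[FS=false]"}}      # HS needed only if FS=false

c_prime = {"clk": ["FS == False and HS == True"],     # single flat predicate
           "supp": {"FS": "true", "HS": "true"}}      # both needed every cycle

def clk_holds(clk, values):
    """Evaluate the nested guards left to right; all must hold."""
    return all(eval(guard, {}, dict(values)) for guard in clk)

print(clk_holds(c["clk"], {"FS": False, "HS": True}))        # True
print(clk_holds(c_prime["clk"], {"FS": True, "HS": True}))   # False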

8.6.1 Basic Operations

We naturally extend to execution conditions the basic Boolean operations on predicates (∧, ∨, ¬):

c1 ∧ c2 = <clk(c1) ∧ clk(c2), supp(c1) ∪ supp(c2)>
c1 ∨ c2 = <clk(c1) ∨ clk(c2), supp(c1) ∪ supp(c2)>
¬c = <¬clk(c), supp(c)>

We define the difference operator on predicates and execution conditions by p1 \ p2 = p1 ∧ ¬p2. If C is a finite set of predicates or execution conditions, we also define ⋀C as the conjunction of all the elements in C, and ⋁C as their disjunction.

Clocks are partially ordered with p1 ≤ p2 whenever p1 implies p2. We extend ≤ to a preorder on execution conditions, by setting c1 ≤ c2 whenever clk(c1) ≤ clk(c2) and for all s@c ∈ supp(c1) there exists s@c′ ∈ supp(c2) with c ≤ c′. We shall say that c1 is true (respectively false) when clk(c1) is true (resp. false).
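
A minimal sketch of these operators (ours; clocks are represented as Python predicates and supports are simplified to a port-to-condition dictionary, so the support union below is cruder than the sets of s@c pairs used in the text):

def ec(clk, supp):
    return {"clk": clk, "supp": supp}

def ec_and(c1, c2):
    return ec(lambda v: c1["clk"](v) and c2["clk"](v), {**c1["supp"], **c2["supp"]})

def ec_or(c1, c2):
    return ec(lambda v: c1["clk"](v) or c2["clk"](v), {**c1["supp"], **c2["supp"]})

def ec_not(c):
    return ec(lambda v: not c["clk"](v), dict(c["supp"]))

def ec_diff(c1, c2):                      # c1 \ c2 = c1 ∧ ¬c2
    return ec_and(c1, ec_not(c2))

hs_false = ec(lambda v: v["HS"] is False, {"HS": "true"})
fs_false = ec(lambda v: v["FS"] is False, {"FS": "true"})
both = ec_and(hs_false, fs_false)
print(both["clk"]({"HS": False, "FS": False}), sorted(both["supp"]))  # True ['FS', 'HS']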

8.6.2 Signals

Port values mentioned above are formally represented by objects named signals. A signal s is a value that is computed and can be used in computations at precise execution cycles given by an execution condition cond(s). At every such cycle, s is assigned a unique value. Inside each cycle, the value of s can be read only after being assigned a value. A signal s has a type D_s. Given a signal s and an execution condition c ≤ cond(s), we define the sub-sampling of s on c, denoted s@c, as the signal of execution condition c and domain D_s that has the same values as s at cycles where c is true.

In a dataflow graph, each value is produced by exactly one output port in an execution cycle (due to the single assignment rule). However, hierarchical decomposition and conditional execution mean that several ports may in fact correspond to the same signal. In Fig. 8.2, the signal named ID represents the values produced by the output ports of F1, G, C1,² and the input node labelled ID of C2, and is consumed by the input ports of M, C2, and the output node labelled ID of C1. More precisely, we need to create one signal in Sig(n) for each:

- Input and output port of the algorithm node. The input ports of the other nodes use the signal associated with the output port feeding them.

- Output port of a dataflow node, whenever the port is not connected to the input port of an output node. The remaining output ports produce values of the signal associated with the output port they feed.

For a signal s, we denote with node(s) the node whose input or output port defines s. By construction cond(s) = cond(node(s)). We denote with sig(p) the unique signal associated to some port p of a node in the algorithm.

8.6.3 Conditioning

Given a set of signals S and X ⊆ ∏_{s∈S} D_s, we shall write S ∈ X to denote the condition that the valuation of the signals of S belongs to X. Consider now an execution condition c and a finite set of signals S such that cond(s) ≥ c for all s ∈ S. Consider X ⊆ ∏_{s∈S} D_s, a set of valuations of the signals in S. Then, we define the conditioning of c by S ∈ X, denoted c[S ∈ X], by the clock clk(c[S ∈ X]) = clk(c)[S ∈ X] and the support supp(c) ∪ {s@c | s ∈ S}. Using on clocks the new operator p[S ∈ X] gives us a syntax to represent hierarchical conditions, whereas for verification purposes p[S ∈ X] = p ∧ (S ∈ X).

Using the conditioning operator, it is easy to define the hierarchical computations of all the nodes in a SynDEx algorithm. The condition of the algorithm node is <true, ∅>, also denoted true. We also abbreviate true[S ∈ X] as [S ∈ X]. Given the execution condition cond(n) of a composed node n:

- If n is not conditioned, then the execution condition of all the nodes in its unique expansion is cond(n).

- If n is conditioned, then the execution condition of all the nodes in expansion G ∈ E(n) is cond(n)[C(n) ∈ cond_n⁻¹(G)], where cond_n⁻¹(G) = {x ∈ ∏_{c∈C(n)} D_c | cond_n(x) = G}.

2 This does not contradict the single assignment rule because the dataflow functions F1 and G that produce ID cannot both be executed in a cycle.


The resulting execution conditions are all of the form [S1 ∈ X1]...[Sk ∈ Xk], which are naturally organized in a tree with true as root.
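
The following hypothetical sketch computes these hierarchical execution conditions by walking the node hierarchy from the algorithm node (condition true, encoded as the empty list) downwards, appending one conditioning step per conditioned ancestor; node and label names are illustrative.

def assign_conditions(node, cond, out):
    """cond is the list of conditioning steps [S1 in X1]...[Sk in Xk]."""
    out[node["name"]] = cond
    expansions = node.get("expansions", {})
    for label, sub_nodes in expansions.items():
        # Conditioned nodes (more than one expansion) add one conditioning step.
        step = [] if len(expansions) == 1 else [f"{node['cports']} in {label}"]
        for sub in sub_nodes:
            assign_conditions(sub, cond + step, out)

algorithm = {"name": "algo", "expansions": {"only": [
    {"name": "C1", "cports": ("HS",), "expansions": {
        "HS=true":  [{"name": "G"}],
        "HS=false": [{"name": "F1"}],
    }},
]}}

conds = {}
assign_conditions(algorithm, [], conds)
print(conds["F1"])   # one conditioning step: ["('HS',) in HS=false"]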

8.7 Flattened Dataflow

The hierarchy of a SynDEx specification facilitates the specification of complex control patterns, ensuring through simple construction rules that every value that is consumed has been produced. However, actual dataflow computations are only performed by the leaves of the hierarchy tree (dataflow functions and delays), while composed nodes only represent control (conditional execution). We want to be able to represent schedules that specify the spatial and temporal allocation at the level of the hierarchy leaves.³

To facilitate the description of such flat schedules, we identify in this section the set Op(a) of operations that need to appear in a correct schedule of the algorithm a. Each o ∈ Op(a) is associated an execution condition cond(o) defining its execution, a set IS(o) of signals it reads, and a set OS(o) of signals it produces. The set Op(a) is formed of:

- All the atomic dataflow functions that are not input or output nodes. For such a node n we set IS(n) = {sig(i) | i ∈ I(n)} and OS(n) = {sig(o) | o ∈ O(n)}.

• For each delay node $n$:

  – One read operation $read(n)$ with $cond(read(n)) = cond(n)$, $IS(read(n)) = \emptyset$, and $OS(read(n)) = \{sig(o)\}$, where $o$ is the output port of $n$.

  – One store operation $store(n)$ with $cond(store(n)) = cond(n)$, $OS(store(n)) = \emptyset$, and $IS(store(n)) = \{sig(i)\}$, where $i$ is the input port of $n$.

• For each input port $i$ of the algorithm node $a$, an input operation $read(i)$ with $cond(read(i)) = true$, $IS(read(i)) = \emptyset$, and $OS(read(i)) = \{sig(i)\}$.

• For each output port $o$ of the algorithm node $a$, an output operation $send(o)$ with $cond(send(o)) = true$, $OS(send(o)) = \emptyset$, and $IS(send(o)) = \{sig(o)\}$.

We associate no operation with nodes that are not leaves, as (1) all control information is represented by execution conditions, and (2) we assume control to take no time, so that we do not need to attach its cost to any object.
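
The following Python sketch, with hypothetical names, illustrates the shape of the resulting operation set: each operation records its execution condition and its $IS$/$OS$ signal sets, and only hierarchy leaves contribute operations, as described above.

from dataclasses import dataclass

@dataclass
class Operation:
    name: str
    cond: object                   # execution condition (kept opaque here)
    IS: frozenset = frozenset()    # signals read by the operation
    OS: frozenset = frozenset()    # signals produced by the operation

def delay_operations(n, cond_n, in_sig, out_sig):
    # A delay node contributes one read and one store operation, both on cond(n)
    return [Operation(f"read({n})", cond_n, OS=frozenset({out_sig})),
            Operation(f"store({n})", cond_n, IS=frozenset({in_sig}))]

def algorithm_io_operations(inputs, outputs):
    # Algorithm-level ports become unconditioned (cond = true) I/O operations
    ops = [Operation(f"read({i})", "true", OS=frozenset({f"sig({i})"})) for i in inputs]
    ops += [Operation(f"send({o})", "true", IS=frozenset({f"sig({o})"})) for o in outputs]
    return ops

print(delay_operations("D1", "true", "sig(i)", "sig(o)"))
print(algorithm_io_operations(["in1"], ["out1"]))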

8.7.1 Timing Information

To allow real-time scheduling, we assume each operation $op \in Op(a)$ is assigned a worst-case execution time $d_P(op)$ on each processor $P$ it can be allocated on.

³ Such a formalism can also represent coarser-grain allocation policies where allocation is done at some other hierarchy level.


To represent the fact that a processor $P$ cannot execute an operation $op$, we set $d_P(op) = \infty$. This last convention is particularly important for the operations corresponding to algorithm inputs and outputs. These operations must be allocated on the processor having the corresponding input or output channel.

In addition to operation timings, we need communication timings. We associate a positive value $d_B(D)$ with each domain $D$. This value represents the worst-case duration of transmitting one message of type $D$ over the bus (in the absence of any interference).
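
A minimal sketch of how this timing data could be tabulated; the processor names, operation names, and numbers are purely illustrative, and infinity encodes a forbidden allocation, following the convention above.

import math

# Hypothetical WCET table: (processor, operation) -> worst-case execution time,
# with math.inf marking operations that cannot be allocated on that processor.
d_P = {
    ("P1", "F1"): 4.0,
    ("P2", "F1"): 6.0,
    ("P1", "read(in1)"): 1.0,
    ("P2", "read(in1)"): math.inf,   # the in1 channel is wired to P1 only
}

# Hypothetical per-domain worst-case bus transmission times
d_B = {"int16": 2.0, "float32": 3.5}

def wcet(P, op):
    # Unknown pairs default to infinity: the operation cannot run on P
    return d_P.get((P, op), math.inf)

print(wcet("P2", "read(in1)"), d_B["float32"])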

8.8 Static Schedules

In this section, we define our model of static real-time schedule of an algorithm over the bus-based architecture defined in Sect. 8.4.1. To simplify the developments, we assume that scheduling is done for one cycle, the global scheduling being an infinite repetition of the scheduling of a cycle. This corresponds to the case where the computations of the different cycles do not overlap in time in the real-time implementation.⁴

The remainder of the section defines our formal model of static schedule. The definition includes some very general choices, such as the use of data messages for synchronization. These supplementary hypotheses will be clearly identified in the text. The resulting formalism is general enough to allow the representation of schedules generated by several common static scheduling techniques, including the ones of SynDEx.

8.8.1 Basic Definitions

Consider an algorithm $a$. A static schedule $S$ of $a$ on the bus-based architecture is a set of scheduled operations. Each $so \in S$ has (1) one execution resource, denoted $Res(so)$, which is either the bus or one of the processors, (2) the execution condition $cond(so)$ of the operation, (3) a real-time date $t_{so}$ giving the latest (worst-case) real-time date at which the execution of the operation will start, and (4) the scheduled operation itself, which is resource-specific:

• If $Res(so) = B$, the scheduled operation is the emission of a signal, denoted $sig(so)$, by processor $emitter(so)$.

• If $Res(so) = P_i$, the scheduled operation is the execution of an operation $op(so) \in Op(a)$ with $cond(so) = cond(op(so))$.

⁴ The results of this chapter can be extended to the case where cycles can overlap by considering modulo scheduling techniques.


For convenience, we define $S_B = \{so \in S \mid Res(so) = B\}$ and $S_{P_i} = \{so \in S \mid Res(so) = P_i\}$. We also define the duration of a scheduled operation as $d(so) = d_B(D_{sig(so)})$ if $Res(so) = B$, and $d(so) = d_{Res(so)}(op(so))$ otherwise.

Note that the definition of $S_B$ implicitly assumes that we make no use of specialized synchronization messages or data encoding, allowing only the transmission of messages containing signal values.

We assume that each operation is scheduled exactly once in $S$, meaning that no optimizing replication such as inlining is done. This hypothesis is often used in real-time scheduling. Formally, we require that for all $o \in Op(a)$ there exists a unique $so \in S$ with $op(so) = o$.
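
A minimal Python sketch (hypothetical names, not the chapter's implementation) of a scheduled operation record and of the duration function $d(so)$ just defined.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ScheduledOp:
    res: str                       # "B" for the bus, or a processor name
    cond: object                   # execution condition of the scheduled operation
    t: float                       # latest (worst-case) start date within the cycle
    op: Optional[str] = None       # computation, when res is a processor
    sig: Optional[str] = None      # emitted signal, when res is the bus
    emitter: Optional[str] = None  # processor sending sig, when res is the bus

def duration(so, d_P, d_B, domain_of):
    # d(so): a bus slot lasts d_B(D_sig(so)); a computation lasts d_Res(so)(op(so))
    if so.res == "B":
        return d_B[domain_of[so.sig]]
    return d_P[(so.res, so.op)]

S = [ScheduledOp("P1", "true", 0.0, op="F1"),
     ScheduledOp("B", "true", 4.0, sig="x", emitter="P1")]
print(duration(S[1], {("P1", "F1"): 4.0}, {"int16": 2.0}, {"x": "int16"}))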

8.8.2 Data Availability

The execution condition defining the execution cycles where signal $s$ is sent on the bus before date $t$ is

\[
cond(s,t,B) = \bigvee \{\, cond(so) \mid (so \in S_B) \wedge (sig(so) = s) \wedge (t_{so} + d_B(D_s) \leq t) \,\}.
\]

Recall that each signal is computed on a single processor at each cycle. Then, $cond(s,t,B)$ is the execution condition giving the cycles where $s$ is available system-wide at all dates $t' \geq t$.

The execution condition defining the cycles where $s$ is available on $P$ at date $t$ is

\[
cond(s,t,P) = cond(s,t,B) \vee \bigvee \{\, cond(so) \mid (so \in S_P) \wedge (s \in OS(op(so))) \wedge (t_{so} + d_P(op(so)) \leq t) \,\}.
\]

The second term of the disjunction corresponds to the local production of $s$. Given a schedule $S$, a signal $s$, and an execution condition $c \leq cond(s)$, we denote with $ready\_date(P,s,c)$ the minimum date $t$ such that $cond(s,t,P) \geq c$, and with $ready\_date(B,s,c)$ the minimum date $t$ such that $cond(s,t,B) \geq c$.

8.8.3 Correctness Properties

The following properties define the correctness of a static schedule $S$ with respect to the initial algorithm $a$ (viewed as a flattened graph) and the timing information that was provided. This concludes the definition of our model of real-time schedule, and allows us, in the next section, to state the aforementioned optimality properties and define the corresponding optimization transformations.


8.8.3.1 Delay Consistency

The read and store operations of a delay node must be scheduled on the same processor, to allow the use of local memory for storing the value. Formally, for every delay node $n$ and all $so_1, so_2 \in S$, if $op(so_1), op(so_2) \in \{read(n), store(n)\}$, then $Res(so_1) = Res(so_2)$.

8.8.3.2 Exclusive Resource Use

A processor or bus cannot be used by more than one operation at a time. Formally:

• On processors: If $so_1, so_2 \in S_P$ such that $so_1 \neq so_2$ and $cond(so_1) \wedge cond(so_2) \neq false$, then either $t_{so_1} \geq t_{so_2} + d_P(op(so_2))$ or $t_{so_2} \geq t_{so_1} + d_P(op(so_1))$.

• On the bus: If $so_1, so_2 \in S_B$ such that $so_1 \neq so_2$ and $cond(so_1) \wedge cond(so_2) \neq false$, then either $t_{so_1} \geq t_{so_2} + d_B(D_{sig(so_2)})$ or $t_{so_2} \geq t_{so_1} + d_B(D_{sig(so_1)})$.

8.8.3.3 Causal Correctness

Intuitively, to ensure causal correctness our schedule must ensure, in a static fashion, that when a computation or communication uses a signal $s$ at time $t$ on execution condition $c$, the signal has been computed or transmitted on the bus at an earlier time and on a greater execution condition:

• On a processor: For all $so \in S_P$ we have:

  – If $s@c \in supp(cond(so))$, then $cond(s, t_{so}, P) \geq c$.

  – If $s \in IS(op(so))$, then $cond(s, t_{so}, P) \geq cond(so)$.

• On the bus: For all $so \in S_B$ we have:

  – If $s@c \in supp(cond(so))$, then $cond(s, t_{so}, B) \geq c$.

  – $cond(sig(so), t_{so}, emitter(so)) \geq cond(so)$.
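A minimal sketch of the processor-side part of this check, under the same extensional (mode-set) representation of conditions used in the earlier sketches; avail_on_P stands for the $cond(s, t, P)$ availability function.

def causally_correct_on_P(P_ops, avail_on_P):
    # avail_on_P(s, t): modes in which signal s is available on processor P by date t
    for so in P_ops:   # so: dict with keys t, cond (mode set), IS, support = [(s, c), ...]
        for s, c in so["support"]:
            if not (avail_on_P(s, so["t"]) >= c):
                return False                    # a conditioning signal is used too early
        for s in so["IS"]:
            if not (avail_on_P(s, so["t"]) >= so["cond"]):
                return False                    # an input signal is used too early
    return True

print(causally_correct_on_P(
    [{"t": 6.0, "cond": {0}, "IS": {"x"}, "support": []}],
    lambda s, t: {0, 1} if (s == "x" and t >= 6.0) else set()))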

8.9 Scheduling Optimizations

In this section, we use the previously defined model of static schedule to state some optimality properties that we expect of all schedules, yet which are not easy to state or ensure in the presence of complex execution conditions. We provide algorithms ensuring these properties.

The schedule optimizations we are interested in are based on the removal of redundant idle time and redundant communications. This is done only through manipulations of the execution conditions of the communications on the bus, and through adjustments of the dates of the scheduled operations. In particular, our optimizations preserve the ordering of the computations on processors, as well as the ordering of communications on the bus (with the exception that only the first send of a given signal is performed in a given cycle, the others being optimized out).

8.9.1 Static Dependencies

To simplify notations, we define the sets of bus operations and processor operations that produce a given signal $s$. They are $S_B(s) = \{so \in S_B \mid sig(so) = s\}$ and $S_P(s) = \{so \in S_P \mid s \in OS(op(so))\}$.

To allow the optimization of a static schedule, we need to understand which data dependencies must not be broken through removal of communications. Our first remark is that a signal $s$ can be used on all processors as soon as it has been emitted once on the bus. Subsequent emissions bring no new information, so that readers of $s$ should not depend on them. We shall denote with $dep_B(s@c)$ the subset of operations of $S_B$ that can (depending on execution conditions) be the first communication of signal $s$ on the bus in cycles where $c$ is true.

\[
dep_B(s@c) = \left\{ so \in S_B(s) \;\middle|\; clk\!\left(cond(so) \wedge \left(c \setminus \bigvee\{cond(so') \mid (so' \in S_B(s)) \wedge (t_{so'} < t_{so})\}\right)\right) \neq false \right\}.
\]

On a processor, things are more complicated, as information can be locally produced. However, local productions are always the source of the data, and cannot be redundant. We denote the set of local dependencies with $dep\_loc_P(s@c) = \{so \in S_P(s) \mid clk(cond(so) \wedge c) \neq false\}$, and we set $c_{loc} = \bigvee\{cond(so) \mid so \in dep\_loc_P(s@c)\}$. With these notations,

\[
dep_P(s@c) = dep\_loc_P(s@c) \cup dep_B(s@(c \setminus c_{loc})).
\]

Now, we can define the static dependencies between the various scheduled operations. The set of scheduled operations upon which $so \in S_B$ depends is

\[
dep(so) = dep_{emitter(so)}(sig(so)@cond(so)) \cup \bigcup_{s@c \,\in\, supp(cond(so))} dep_B(s@c).
\]

Similarly, for $so \in S_P$:

\[
dep(so) = \left( \bigcup_{s \,\in\, IS(op(so))} dep_P(s@cond(so)) \right) \cup \bigcup_{s@c \,\in\, supp(cond(so))} dep_P(s@c).
\]


To these data dependencies, we must add the dependencies due to sequencing on the processors and on the bus, and we obtain the set of static dependencies that must be preserved by any optimization. We denote with $pre(so)$ the set of predecessors of a scheduled operation, which includes $dep(so)$ and the dependencies due to sequencing:

\[
pre(so) = dep(so) \cup \left\{ so' \in S \mid (Res(so) = Res(so')) \wedge (clk(cond(so) \wedge cond(so')) \neq false) \wedge (t_{so} > t_{so'}) \right\}.
\]

We call the scheduling DAG (directed acyclic graph) the set $S$ endowed with the order relation given by the $pre(\cdot)$ relation.

The computation of $pre(so)$ relies on computing the satisfiability of predicates. As explained in Sect. 8.5.3, we assume these satisfiability computations are not exact, so that the computed $pre(so)$ may contain more dependencies than the exact one. The closer the approximation of $pre(so)$ is to the exact one, the better the results given by the following optimizations.
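
A minimal sketch of $dep_B(s@c)$ under the extensional (mode-set) representation of conditions used in the earlier sketches: an emission belongs to the set if its condition intersects the part of $c$ not already covered by strictly earlier emissions of the same signal.

def dep_bus(s, c, bus_ops):
    # bus_ops: dicts with keys sig, t, cond (mode set)
    deps = []
    emissions = [so for so in bus_ops if so["sig"] == s]
    for so in emissions:
        earlier = set()
        for other in emissions:
            if other["t"] < so["t"]:
                earlier |= other["cond"]
        if so["cond"] & (c - earlier):      # so can be the first emission in some cycle of c
            deps.append(so)
    return deps

bus = [{"sig": "x", "t": 4.0, "cond": {0}},
       {"sig": "x", "t": 7.0, "cond": {0, 1}}]
print([so["t"] for so in dep_bus("x", {0, 1}, bus)])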

8.9.2 ASAP Scheduling

The first optimality property we want to obtain is the absence of idle time. In our static scheduling framework, this amounts to recomputing the dates of all the scheduled operations by performing a MAX+ computation on the scheduling DAG defined in the previous section. More precisely, starting from the scheduled operations without predecessors, we set

\[
t_{so} = \max_{so' \in pre(so)} \left( t_{so'} + d(so') \right).
\]

This operation has polynomial complexity, and it does not change other aspects of the schedule, meaning that the same implementation will still function. The gain is a tighter worst-case bound for the execution of a cycle.
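
A minimal sketch of this MAX+ pass: a longest-path computation over the scheduling DAG, where pre(o) lists the predecessors of operation o and dur(o) its duration (names are ours, for illustration only).

def asap_dates(ops, pre, dur):
    # ops: iterable of operation ids; pre(o): predecessor ids; dur(o): duration of o
    t = {}
    def date(o):
        if o not in t:
            t[o] = max((date(p) + dur(p) for p in pre(o)), default=0.0)
        return t[o]
    for o in ops:
        date(o)
    return t

# Tiny example: b and c both depend on a (duration 4); d depends on b and c
pre = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
dur = {"a": 4.0, "b": 2.0, "c": 3.0, "d": 1.0}
print(asap_dates("abcd", pre.get, dur.get))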

8.9.3 Removal of Redundant Communication

A stronger result is the absence of redundant communications. We explore two complementary approaches to this problem here.

8.9.3.1 Removal of Subsequent Emissions of the Same Signal

The simplest of the two approaches is the one that seeks to reduce the execution condition of a signal communication based on previously scheduled communications of the same signal. The minimization works by replacing, for each $so \in S_B$, the execution condition $cond(so)$ with

\[
cond(so) \setminus \bigvee \{\, cond(so') \mid (so' \in S_B(sig(so))) \wedge (t_{so} > t_{so'}) \,\}.
\]

The transformation of the schedule is a traversal of the bus scheduling DAG (of polynomial complexity). Due to the minimization of execution conditions (meaning that some communications are no longer realized at certain cycles), the output of this transformation should be put in ASAP form. To do this, the scheduling DAG must be recomputed through a re-computation of the dependencies due to sequencing on the bus. The data dependencies are not changed, as this form of redundancy has already been taken into account during the computation of the data dependencies.
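
A minimal sketch of this minimization under the extensional (mode-set) representation of conditions: the condition of each bus emission is reduced by the disjunction of the conditions of strictly earlier emissions of the same signal, and emissions whose condition becomes false disappear.

def remove_subsequent_emissions(bus_ops):
    # bus_ops: dicts with keys sig, t, cond (mode set)
    original = [set(so["cond"]) for so in bus_ops]
    for so in bus_ops:
        earlier = set()
        for other, other_cond in zip(bus_ops, original):
            if other["sig"] == so["sig"] and other["t"] < so["t"]:
                earlier |= other_cond
        so["cond"] = set(so["cond"]) - earlier
    return [so for so in bus_ops if so["cond"]]    # drop emissions that became false

bus = [{"sig": "x", "t": 4.0, "cond": {0}},
       {"sig": "x", "t": 7.0, "cond": {0, 1}}]
print(remove_subsequent_emissions(bus))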

8.9.3.2 Removal of Useless Communication Operations

The removal of subsequent emissions reduces to one the maximal number of emissions of a signal in a cycle. However, it does not try to restrict emissions to cycles where the signal is needed. Doing this amounts to a reduction of the execution condition on which a signal $s$ is sent over the bus in a cycle to $\bigvee_{so \in S_B(s)} cond(so)$.

We propose here the simplest such transformation, which consists in removing the communication operations that appear in none of the $pre(so)$ for any $so \in S$. It is important to note that the removal of one operation may cause the removal of another, and so on. The removal and propagation process can be done in polynomial time. The modified schedule is easily implemented by removing pieces of code from the initial implementation. A sketch of this pruning is given below.
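
A minimal sketch of the pruning, with hypothetical operation ids: bus operations on which no remaining operation depends are removed, and the removal is iterated to a fixed point because dropping one emission can make another useless.

def remove_useless_communications(schedule, pre):
    # schedule: set of operation ids; bus operations are the removal candidates;
    # pre(o): the ids that operation o statically depends on
    changed = True
    while changed:
        needed = set()
        for o in schedule:
            needed |= {p for p in pre(o) if p in schedule}
        useless = {o for o in schedule if o.startswith("bus:") and o not in needed}
        changed = bool(useless)
        schedule = schedule - useless
    return schedule

sched = {"bus:x@4", "bus:x@7", "proc:F1", "proc:G"}
deps = {"proc:G": ["bus:x@4"], "proc:F1": [], "bus:x@4": ["proc:F1"], "bus:x@7": ["proc:F1"]}
print(sorted(remove_useless_communications(sched, lambda o: deps[o])))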

For a given schedule, the removal of subsequent emissions, followed by the removal of useless communication operations and by a transformation into ASAP form, leads to a schedule where none of the three techniques can result in further improvement.

8.10 Related Work

To our knowledge, no work exists on the optimization of given real-time schedules for conditioned dataflow programs. Meanwhile, an important corpus of heuristic algorithms for the real-time scheduling of conditioned dataflow programs (and similar formalisms) exists. We mention here only a few approaches.

Our approach is most closely related to the work by Eles et al. [15] on the scheduling of conditioned process graphs onto multiprocessor systems. As in our approach, the difficulty is the handling of execution conditions. However, the chosen approach is different. Eles et al. start from schedules corresponding to each execution mode of the system, and unify them into a global schedule covering all modes. The enumeration of execution modes allows a finer real-time analysis than is possible in our model of static schedule (because a given operation has not one worst-case starting date, but one date per mode). The main drawback of the approach is the enumeration of all execution modes of the system, which can be intractable for complex systems. A better approach here could be an intermediate one between ours and that of Eles et al., where mode-dependent execution dates can be manipulated, but the manipulation is done not on individual modes, but on sets of modes identified by execution conditions. A second drawback of the approach, from a communication optimization point of view, is that for most communications the value that is transmitted is not specified.

Our work is based on existing work on SynDEx [17]. The SynDEx tool allows the real-time scheduling of conditioned dataflow specifications onto hardware architectures involving multiple processors and communication lines (point-to-point and buses). When applied to architectures with a single bus, SynDEx generates schedules that can be represented using our formalism, but where each communication has either the execution condition of the sender or the execution condition of the intended receiver. This simplifies our calculus, but potentially pessimizes communication, meaning that our techniques can be directly applied.

Also related is the work by Kountouris and Wolinski [25] on the scheduling of hierarchical conditional dependency graphs, a formalism allowing the representation of data dependencies and execution conditions, whose development is related to the implementation of the Signal/Polychrony language [19]. Kountouris and Wolinski focus on the development of scheduling heuristics that use clock calculus to drive classical optimization techniques. We focus on the determination of optimality properties in a more constrained solution space, and on a schedule analysis based on a calculus involving both logical clocks and time.

We also mention here the large corpus of work on optimized distributed scheduling of dataflow specifications onto time-triggered architectures [18, 42]. Given the proximity of the specification formalism, we emphasize the work of Caspi et al. on the distributed scheduling of Scade/Lustre specifications onto TTA-based architectures [10]. The main difference is that in TTA-based systems communications must be realized in time slots that are statically assigned to the various processors. Therefore, communications from different processors cannot be assigned overlapping time frames even if their execution conditions are exclusive. This implies that the computation of an ASAP scheduling is no longer a simple MAX+ computation, as it is in our case. Closer to our approach is the scheduling of the FlexRay dynamic segment, where time overlapping is possible (but ASAP scheduling still does not correspond to MAX+). Finally, in time-triggered systems execution conditions do not all need to be computed starting from "true", because a notion of time and communication absence is defined.

Our model of static schedule can also represent schedules for systems based on time-triggered buses such as TTA or FlexRay [29]. Our technique for the removal of redundant communications still works, but the ASAP minimization does not.


8.11 Conclusion and Future Work

In this chapter, we presented a technique for the real-time implementation of dataflow synchronous specifications. The technique is based on the introduction of a new format for the representation of static real-time schedules of synchronous specifications. At the level of this formalism, we define the correctness of the schedule with respect to the initial functional specification, which enables us to define correctness-preserving optimization algorithms that improve the real-time latency of the computation of one synchronous cycle.

We envision several directions for extending the work presented here. Most important is the extension of our results to allow optimization over time-triggered buses. One specific problem here is how to use execution conditions in order to better determine which communications of the FlexRay dynamic segment can be guaranteed. Another is to take into account the latency of the protocol layers on the sender and receiver side (in our model, this latency is assumed to be 0).

A second direction is the extension of the model to cover architectures with multiple buses. A natural question in this case is which of the optimality results still stand.

A third direction consists in extending the clock calculus to cover specifications where the clocks are less hierarchized, or defined by external events, such as periodic input arrivals. A good starting point in this direction is the use of endochronous synchronous systems [30] specified in the Signal/Polychrony language [19].

We intend to improve our schedule representation formalism to allow a finer characterization of worst-case execution dates, à la Eles et al. [15].

More work is also needed to deal with multi-rate synchronous specifications and their schedules. A classical (single-clock) synchronous specification fully defines the temporal allocation of all operations with cycle precision. The only degrees of freedom left to a scheduling algorithm concern (1) the temporal allocation inside a cycle, and (2) the distribution (which is what SynDEx does). However, in multi-clock specifications the various operations can have different periods, which amounts to having not one global cycle-based execution, but several interlocked cycle-based executions running in parallel. One approach here (already used with SynDEx, and thus amenable to our optimizations) is to transform multi-rate specifications into single-rate ones by determining a so-called hyperperiod (the least common multiple of the periods of all operations).

References

1. Arditi, L., Boufaïed, H., Cavanié, A., Stehlé, V.: Coverage-directed generation of system-level test cases for the validation of a DSP system. In: FME 2001: Formal Methods for Increasing Software Productivity. Lecture Notes in Computer Science, vol. 2021. Springer, Berlin (2001)

2. Benveniste, A., Berry, G.: The synchronous approach to reactive and real-time systems. Proceedings of the IEEE 79(9), 1270–1282 (1991)

3. Benveniste, A., Caspi, P., Edwards, S.A., Halbwachs, N., Guernic, P.L., de Simone, R.: The synchronous languages 12 years later. Proceedings of the IEEE 91(1), 64–83 (2003)

4. Berry, G.: Real-time programming: general-purpose or special-purpose languages. In: Ritter, G. (ed.) Information Processing 89, pp. 11–17. Elsevier, Amsterdam (1989)

5. Berry, G.: The constructive semantics of Pure Esterel. Esterel Technologies. Electronic version available at http://www.esterel-technologies.com (1999)

6. Berry, G., Gonthier, G.: The Esterel synchronous programming language: design, semantics, implementation. Science of Computer Programming 19(2), 87–152 (1992)

7. Bouali, A.: XEVE, an Esterel verification environment. In: Proceedings of the Tenth International Conference on Computer Aided Verification (CAV'98), UBC, Vancouver, Canada. Lecture Notes in Computer Science, vol. 1427. Springer, Berlin (1998)

8. Bouali, A., Marmorat, J.P., de Simone, R., Toma, H.: Verifying synchronous reactive systems programmed in ESTEREL. In: Proceedings FTRTFT'96. Lecture Notes in Computer Science, vol. 1135, pp. 463–466. Springer, Berlin (1996)

9. Burns, A., Tindell, K., Wellings, A.: Effective analysis for engineering real-time fixed priority schedulers. IEEE Transactions on Software Engineering 21(5), 475–480 (1995)

10. Caspi, P., Curic, A., Maignan, A., Sofronis, C., Tripakis, S., Niebert, P.: From Simulink to SCADE/Lustre to TTA: a layered approach for distributed embedded applications. In: Proceedings LCTES (2003)

11. Cohen, A., Duranton, M., Eisenbeis, C., Pagetti, C., Plateau, F., Pouzet, M.: N-synchronous Kahn networks: a relaxed model of synchrony for real-time systems. In: ACM International Conference on Principles of Programming Languages (POPL'06), Charleston, SC, USA (2006)

12. Cucu, L., Pernet, N., Sorel, Y.: Periodic real-time scheduling: from deadline-based model to latency-based model. Annals of Operations Research 159(1), 41–51 (2008). http://www-rocq.inria.fr/syndex/publications/pubs/aor07/aor07.pdf

13. Danne, K., Platzner, M.: An EDF schedulability test for periodic tasks on reconfigurable hardware devices. In: Proceedings of the Conference on Language, Compilers, and Tool Support for Embedded Systems, ACM SIGPLAN/SIGBED, Ottawa, Canada (2006)

14. Dennis, J.: First version of a dataflow procedure language. In: Lecture Notes in Computer Science, vol. 19, pp. 362–376. Springer, Berlin (1974)

15. Eles, P., Kuchcinski, K., Peng, Z., Pop, P., Doboli, A.: Scheduling of conditional process graphs for the synthesis of embedded systems. In: Proceedings of DATE, Paris, France (1998)

16. Gao, M., Jiang, J.H., Jiang, Y., Li, Y., Sinha, S., Brayton, R.: MVSIS. In: Proceedings of the International Workshop on Logic Synthesis (IWLS'01), Tahoe City (2001)

17. Grandpierre, T., Sorel, Y.: From algorithm and architecture specification to automatic generation of distributed real-time executives. In: Proceedings MEMOCODE (2003)

18. Gu, Z., He, X., Yuan, M.: Optimization of static task and bus access schedules for time-triggered distributed embedded systems with model-checking. In: Proceedings DAC (2007)

19. Guernic, P.L., Talpin, J.P., Lann, J.C.L.: Polychrony for system design. Journal of Circuits, Systems and Computers 12(3), 261–303 (2003). Special Issue on Application Specific Hardware Design

20. Halbwachs, N.: Synchronous programming of reactive systems. In: Computer Aided Verification (CAV'98), pp. 1–16 (1998). citeseer.ist.psu.edu/article/halbwachs98synchronous.html

21. Halbwachs, N., Caspi, P., Raymond, P., Pilaud, D.: The synchronous dataflow programming language Lustre. Proceedings of the IEEE 79(9), 1305–1320 (1991)

22. Kermia, O., Cucu, L., Sorel, Y.: Non-preemptive multiprocessor static scheduling for systems with precedence and strict periodicity constraints. In: Proceedings of the 10th International Workshop on Project Management and Scheduling, PMS'06, Poznan, Poland (2006). http://www-rocq.inria.fr/syndex/publications/pubs/pms06/pms06.pdf

23. Kermia, O., Sorel, Y.: Load balancing and efficient memory usage for homogeneous distributed real-time embedded systems. In: Proceedings of the 4th International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems, SRMPDS'08, Portland, Oregon, USA (2008). http://www-rocq.inria.fr/syndex/publications/pubs/srmpds08/srmpds08.pdf

24. Kermia, O., Sorel, Y.: Schedulability analysis for non-preemptive tasks under strict periodicity constraints. In: Proceedings of the 14th International Conference on Real-Time Computing Systems and Applications, RTCSA'08, Kaohsiung, Taiwan (2008). http://www-rocq.inria.fr/syndex/publications/pubs/rtcsa08/rtcsa08.pdf

25. Kountouris, A., Wolinski, C.: Efficient scheduling of conditional behaviors for high-level synthesis. ACM Transactions on Design Automation of Electronic Systems 7(3), 380–412 (2002)

26. Leung, J., Whitehead, J.: On the complexity of fixed-priority scheduling of periodic real-time tasks. Performance Evaluation 2(4), 237–250 (1982)

27. Liu, C., Layland, J.: Scheduling algorithms for multiprogramming in a hard real-time environment. Journal of the ACM 20(1), 46–61 (1973)

28. Lopez, J.M., Garcia, M., Diaz, J.L., Garcia, D.F.: Worst-case utilization bound for EDF scheduling on real-time multiprocessor systems. In: Proceedings of the 19th Euromicro Conference on Real-Time Systems, ECRTS'00, Stockholm, Sweden (2000)

29. Obermaisser, R.: Event-Triggered and Time-Triggered Control Paradigms. Springer, Berlin (2005)

30. Potop-Butucaru, D., Caillaud, B., Benveniste, A.: Concurrency in synchronous systems. Formal Methods in System Design 28(2), 111–130 (2006)

31. Potop-Butucaru, D., de Simone, R., Sorel, Y., Talpin, J.P.: Clock-driven distributed real-time implementation of endochronous synchronous programs. In: Proceedings EMSOFT'09, Grenoble, France (2009)

32. The Scade tool page: http://www.esterel-technologies.com/products/scade-suite/

33. Schneider, K.: Proving the equivalence of microstep and macrostep semantics. In: 15th International Conference on Theorem Proving in Higher Order Logics (2002)

34. Sentovich, E., Singh, K.J., Lavagno, L., Moon, C., Murgai, R., Saldanha, A., Savoj, H., Stephan, P., Brayton, R., Sangiovanni-Vincentelli, A.: SIS: a system for sequential circuit synthesis. Memorandum UCB/ERL M92/41, UCB, ERL (1992)

35. Sentovich, E., Toma, H., Berry, G.: Latch optimization in circuits generated from high-level descriptions. In: Proceedings of the International Conference on Computer-Aided Design (ICCAD'96) (1996)

36. Shiple, T., Berry, G., Touati, H.: Constructive analysis of cyclic circuits. In: Proceedings of the International Design and Testing Conference (ITDC), Paris (1996)

37. de Simone, R., Ressouche, A.: Compositional semantics of Esterel and verification by compositional reductions. In: Proceedings CAV'94. Lecture Notes in Computer Science, vol. 818. Springer, Berlin (1994)

38. The Simulink tool page: http://www.mathworks.com/products/simulink/

39. Singh, M., Theobald, M.: Generalized latency-insensitive systems for single-clock and multi-clock architectures. In: Proceedings DATE (2004)

40. Susini, J.F., Hazard, L., Boussinot, F.: Distributed reactive machines. In: Proceedings RTCSA (1998)

41. Touati, H., Berry, G.: Optimized controller synthesis using Esterel. In: Proceedings of the International Workshop on Logic Synthesis (IWLS'93), Lake Tahoe (1993)

42. Zheng, W., Chong, J., Pinello, C., Kanajan, S., Sangiovanni-Vincentelli, A.: Extensible and scalable time triggered scheduling. In: Proceedings ACSD (2005)