[Figure labels: CORBA (17 prod units); UML (50 prod units); SCE (40 pr; PGM (20 prod; SAGE (12 prod units)]
Data Flow in UML
Dr. Jeffrey E. Smith
Mercury Computer Systems, Inc.
Agenda
Model based parallel programming alternatives
Focus on framework/UML Conceptualization
– Data Parallel CORBA
– Data Flow in UML Superstructure
Motivation: From “Portable HP SW for SIP - What’s Next”, Lincoln Labs
Moore’s law addresses computations, not complexity
In their roadmap for advancing RT Embedded Software Development, they identified model-based development and automated mapping support as the key long-term technologies
“Blue Jean” datapoint
Development Steps
Develop Model (GUI + Interpreted Code)
Make Parallel (Simulator)
Auto-Generate Application Code (Cross Compilation)
Methods to Conceptualize/Apply High Performance Data Flow Applications
Observations
UML doesn’t include consistent model of data flow … yet … not really
Translate UML diagrams to any source - might be an avenue of tool support worth exploring
Goals: Component Reuse, Software Productivity, Leverage Existing Investments & Wider Programming Base
[Figure labels: POSIX-Compliant kernel; POSIX-Compliant API; Profile-Guided Optimization; Graph(ical) CORBA SCE V/P Compilers Parallel/DSP Prototypers; Executable Prototype; Executable Deliverable; Translate; UML Model; Behavior Constructor (Programmer 1); Optimizer (Programmer 2); Source; . . .]
Requirements and Design
Dynamic Compilation Can Provide a Solution
Collect runtime execution behavior
• Memory usage: instruction and data caches, translation look-aside buffers
• Control flow: branch probabilities, program “traces”
• Call graphs: gprof statistics
• Data dependencies: data-dependent control flow
• Variable values: value locality, interprocedural dataflow
• Hardware counters: pipeline stalls
[Figure labels: UML; UML with Data Flow; Work with OMG; Common CASE & Data-Flow Machine Development; Non-Optimized Low-Level Algorithms; Optimized Low-Level Algorithms; Feedback; (Par.) CORBA IDE; 1-7 Transforms; Profile-Guided Optimizations; High-Level Algorithms]
Next Steps
Application to IR formation, fusion, template matching
Collect software productivity metrics on above and MITRE benchmarks
Experiment with optimization of UML transformed (through data parallel CORBA or specialized data parallel compiler IDEs) software to efficient embedded platforms
Work with OMG in introducing data flow, in a way that supports streaming high-performance, data-flow distributed computers
Examine possibility of embedding dynamic profile optimization into runtime system
Work with CASE and IDE vendor to integrate model-based development of efficient streaming high-performance, data-flow distributed computer targets
Trick is to:
1) Discover common patterns (SCE, PAS, Par. CORBA, …)
2) Feed this forward into standard OMG specs
3) Simplify our own software architectures/APIs
[Figure: Action Semantics DataFlow. Classes: Pin, InputPin, OutputPin, Action, DataFlow. Association ends: +inputPin {ordered}, +outputPin {ordered}, +action, +/availableInput, +/availableOutput, +flow, +source, +destination. Multiplicities shown: 1, 0..1, 0..n.]
[Figure: PAS Channel. Classes: GlobalDistribution (partitioningInfo, dimensions), Src, Channel, Dst, Partition, ChannelSide, BufferSet (type, shape, processSet), Distribution (groupDims, size, memoryLayout), Layout. Other label: piece. Multiplicities shown: 1.]
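The class names in the Action Semantics DataFlow figure can be read as a small object model. The following is a minimal sketch of that reading, not the OMG metamodel itself: the class and role names (Pin, InputPin, OutputPin, Action, DataFlow, source, destination) come from the figure, while the `connect` helper, the example action names, and the rule that an input pin accepts at most one incoming flow are assumptions made for illustration.

```python
class Pin:
    """A named connection point owned by an Action."""
    def __init__(self, name, action):
        self.name = name
        self.action = action


class InputPin(Pin):
    def __init__(self, name, action):
        super().__init__(name, action)
        self.flow = None      # assumption: at most one incoming DataFlow


class OutputPin(Pin):
    def __init__(self, name, action):
        super().__init__(name, action)
        self.flows = []       # an output may feed several flows


class DataFlow:
    """Connects exactly one source OutputPin to one destination InputPin."""
    def __init__(self, source, destination):
        self.source = source
        self.destination = destination


class Action:
    def __init__(self, name):
        self.name = name
        self.input_pins = []   # {ordered}
        self.output_pins = []  # {ordered}

    def add_input(self, name):
        pin = InputPin(name, self)
        self.input_pins.append(pin)
        return pin

    def add_output(self, name):
        pin = OutputPin(name, self)
        self.output_pins.append(pin)
        return pin


def connect(source, destination):
    """Hypothetical helper: wire an output pin to an input pin."""
    if destination.flow is not None:
        raise ValueError("input pin already has an incoming flow")
    flow = DataFlow(source, destination)
    source.flows.append(flow)
    destination.flow = flow
    return flow


# Example action names are made up for illustration.
fft = Action("fft")
detect = Action("detect")
flow = connect(fft.add_output("spectrum"), detect.add_input("samples"))
```

Keeping the pins as first-class objects, rather than wiring actions directly, is what lets a flow carry its own properties later (for example a distribution or a channel).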
CORBA Sequence
Participants: anApp : Application; aClient : SingularClient; anOrb : NonParallelOrb; anAdaptor : PortableObjectAdaptor; aServant : SingularServant
Messages:
create_instance( )
create_opaque_Reference
associate_servant_with_Reference
convey_Reference( )
connect(Reference)
check_location_and_interface( )
invoke(handle, ops, args)
GIOPSend(ops, key, args)
dispatch(ops, args)
create_params( )
Note: location, key and interface are part of this reference
Data-Parallel CORBA Sequence
Participants: PC : ParallelClient; PO : ParallelOrb; POC : ParallelObjectCreator; POF : PartObjectFactory; PPF : PartFactory; PPA : PartAdaptor; PS : PartServant; PSA : PartServerApp
Messages:
create_part_instance( )
create_PartObjectReference( )
register_part(whole)
register_part(whole)
get_PartObjectFactory
create_ParallelObjectReference(whole)
convey_ParallelObjectReference
connect(ParallelObjectReference)
create_params(whole)
invoke(handle, ops, args)
GIOPSend(ops, args, ObjectKey)
dispatch(ops, args)
Note: ParallelBehavior includes PartObjectReference, to collect parts of whole.
Note: ParallelObjectReference includes Location, ObjectKey and IRInterfaceID
Note: Red font signifies messages that differ from the typical CORBA sequence
Meta-Classes
[Figure: class diagram. Recoverable classes and operations:]
Object; LocalObject; ParallelObject; SingularObject; PartObject
Invocation; SingularInvocation; CollectiveInvocation; ParallelInvocation (note: Only by parallel client)
Application; ClientApp; ServerApp; PartServerApp
PortableObjectAdaptor: create_opaque_Reference(), associate_servant_with_Reference(), GIOPSend(ops, ObjectKey, args)
Orb: connect(Reference) : handle, check_location_and_interface(), invoke(handle, ops, args)
NonParallelOrb; ParallelOrb
PartObjectFactory; PartFactory
ParallelObjectCreator: get_PartObjectFactory()
PartAdaptor: create_PartObjectReference()
Interface: identifier
PartInterface
Client: convey_Reference()
SingularClient: create_params() : ops, args
ParallelClient: create_params(whole) : ops, args, convey_ParallelObjectReference(), connect(ParallelObjectReference) : handle
Servant: create_instance() : instance, dispatch(ops, args)
SingularServant
PartServant: create_part_instance() : instance
Corba::Current; ParallelCorba::Current
Stereotyped relationships shown: <<executes in the context of>>, <<initiates in the context of>>, <<embodies contents of>>
Data Structures
[Figure: class diagram. Recoverable classes:]
Reference; ParallelObjectReference; PartObjectReference
ParallelProxyProfile; ParallelAgentProfile; ParallelRealizationProfile; ParallelBehaviorProfile; PartReferenceProfile; StandardIIOPProfile
DataPartitioning; RequestDistribution
OperationDescription {for every arg}
ParallelRealization; ParallelBehavior
Location: adaptorAddress : integer
ObjectKey; IRInterfaceID
Constraint shown: {is exactly}
Multiplicities shown: 1, 0..n, 1..n
Runtime Associations
[Figure: class diagram. Recoverable classes and operations:]
Application
Servant: create_instance() : instance, dispatch(ops, args)
Client: convey_Reference()
PortableObjectAdaptor: create_opaque_Reference(), associate_servant_with_Reference(), GIOPSend(ops, ObjectKey, args)
Orb: connect(Reference) : handle, check_location_and_interface(), invoke(handle, ops, args)
PartObjectFactory: register_part(whole), create_ParallelObjectReference(whole) : POR
ParallelObjectCreator: get_PartObjectFactory()
PartServant: create_part_instance() : instance
PartFactory
PartServerApp
ParallelClient: create_params(whole) : ops, args, convey_ParallelObjectReference(), connect(ParallelObjectReference) : handle
PartAdaptor: create_PartObjectReference()
ParallelOrb
Association labels (several stereotyped <<colocate>>): instance create; convey reference; dispatch; reference create/associate; invoke connection; GIOP send; get parallel reference; part instance create; register part; parallel reference create; part reference create
Control Flow
Each step is taken when the previous one finishes …
…regardless of whether inputs are available, accurate or complete (“pull”)
Emphasis is on order in which steps are taken
[Figure (not UML notation): Start; Weather Info; Analyze Weather Info; Chart Course; Cancel Trip]
Object/Data Flow
Each step is taken when all the required input objects/data are available …
… and only when all the inputs are available (“push”)
Emphasis is on objects flowing between steps
[Figure (not UML notation): Design Product; Procure Materials; Acquire Capital; Build Subassembly 1; Build Subassembly 2; Final Assembly]
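The firing rule on this slide (a step runs when, and only when, all of its required inputs are available) can be shown with a few lines of Python. This is a minimal sketch under stated assumptions: the step names are the figure labels, but the data items (`design`, `capital`, and so on) and the dependencies between steps are invented here, because the transcript does not preserve the arrows.

```python
def run_dataflow(steps, initial=()):
    """Fire each step once, as soon as all of its inputs are available.

    steps maps a step name to (inputs, outputs). Returns the firing
    order and the sorted names of steps that never became enabled.
    """
    available = set(initial)
    pending = dict(steps)
    fired = []
    progress = True
    while pending and progress:
        progress = False
        for name in list(pending):
            inputs, outputs = pending[name]
            if set(inputs) <= available:      # all required inputs present
                fired.append(name)
                available |= set(outputs)     # outputs become new data
                del pending[name]
                progress = True
    return fired, sorted(pending)


# Assumed dependencies; the listing order is deliberately "wrong" to show
# that data availability, not textual order, decides when a step fires.
steps = {
    "Final Assembly": (["sub1", "sub2"], ["product"]),
    "Build Subassembly 2": (["materials"], ["sub2"]),
    "Build Subassembly 1": (["materials"], ["sub1"]),
    "Procure Materials": (["design", "capital"], ["materials"]),
    "Acquire Capital": ([], ["capital"]),
    "Design Product": ([], ["design"]),
}
order, blocked = run_dataflow(steps)

# Without capital, everything downstream of procurement stays blocked.
starved = {k: v for k, v in steps.items() if k != "Acquire Capital"}
order2, blocked2 = run_dataflow(starved)
```

A control-flow version of the same process would instead walk a fixed list of steps in order, whether or not the data each step needs had been produced.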
UML 2.0 Superstructure RFP Excerpt
Further, the way that objects and other data flow between parts of a system is crucial to understanding its architecture. The UML currently supports object/data flow only at the lowest level of granularity {not even}, between the steps in an activity graph {as well as other locations, in a contradictory way}. It is important for architects to be able to model object and data flows between entities at a higher level of granularity, such as classifiers and packages {as well as many other requirements coming up}.
{ } signifies my comments
Why bring back data flow explicitly into UML?
With parallel computation increasingly used to increase computation speeds, there is interest in linking streaming data flow machines with a matching modeling paradigm
To bring back a data flow standard: developers have been building unique custom DFDs out of standard UML structures (patterns), and some CASE vendors have added data flow at the model & meta-model level
To link/integrate existing DFD toolsets with UML toolsets & existing simulators, e.g. Ptolemy [Parks]
Functional modeling (the only third of OMT left out) fits both OO and non-OO modeling paradigms and can be united with other UML models [SD, DSH]
Currently addressed piecemeal in UML (shown later), in ways that do not conform to the pre-existing modelers' view (the OMT view) of data flow
Why bring back data flow explicitly into UML (cont)?
The object model defines system components and the dynamic model (state machines) defines system control, but the functional model (data flow) defines what computations occur in a system & the functional dependencies between processes
Need expressed in software process/workflow, defense, medical, wireless and digital video domains
Example: When the response to the Action Semantics RFP was presented in the OMG Plenary, the diagrams were not done in UML (they were in data flow); the reason given was that it would take too much space in UML
Why bring back data flow explicitly into UML (continued)?
Different (yet related) underlying semantics than State Machines
» "is-used-to-produce" relation
» Can have a consistent parent/child (state/substate) diagram from the state machine point of view that violates the data consistency model [TK]
» Unique inheritance (decomposition) requirement
– Example definition: Let P be a process and D a decomposition of P. D is consistent with P iff the I/O relationships that are 1) specified for P must also hold for D and 2) not specified to hold for P must not hold for D [TK]
» Relation to state machines: A trigger (t) of a control process "is-used-to-produce" a response (r) of the same process iff there is a transition in the STD that is triggered by t and responds with r [TK].
» It is conceptually simpler for some applications: simply a digraph together with a binary precedence relation.
» It is impossible to represent continuous flow, especially with feedback, in a State Machine because of the theoretically infinite number of states to represent. This is a natural modeling view with data transforms.
» STDs are sequential within one machine, DFDs are not
[Figure labels: process P1; subprocesses P1,1, P1,2, P1,3; data flows d0:1, d0:2, d0:3, d0:4, d0:5]
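The decomposition consistency definition from [TK] can be checked mechanically once a diagram is stored as a digraph. The sketch below is one simplified reading, not the procedure from [TK]: it assumes that an input "is-used-to-produce" an output whenever a directed path connects them in the decomposition D, and it compares that derived relation with the I/O relation specified for the parent P. The node names (`a`, `b`, `x`, `y`, `P1.1`, …) are hypothetical and are not taken from the figure.

```python
def derived_io(edges, inputs, outputs):
    """Return {(i, o)}: external input i reaches external output o in D."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
    relation = set()
    for i in inputs:
        seen, stack = set(), [i]
        while stack:                      # depth-first reachability;
            node = stack.pop()            # 'seen' makes feedback loops safe
            for nxt in successors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        relation.update((i, o) for o in outputs if o in seen)
    return relation


def is_consistent(specified, edges, inputs, outputs):
    """D is consistent with P iff D yields exactly P's specified I/O pairs."""
    return derived_io(edges, inputs, outputs) == set(specified)


# Parent process P: a and b are used to produce x; only b produces y.
specified = {("a", "x"), ("b", "x"), ("b", "y")}
inputs, outputs = {"a", "b"}, {"x", "y"}

# A decomposition D that respects P ...
good = [("a", "P1.1"), ("b", "P1.2"), ("P1.1", "P1.3"),
        ("P1.2", "P1.3"), ("P1.3", "x"), ("P1.2", "y")]
# ... and one that adds a dependency of y on a, which P does not specify.
bad = good + [("P1.1", "y")]

good_ok = is_consistent(specified, good, inputs, outputs)
bad_ok = is_consistent(specified, bad, inputs, outputs)
```

Both halves of the definition fall out of the set equality: a missing pair violates clause 1, and an extra pair violates clause 2.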
Interaction Diagrams (Sequence, Collaboration)
Different (yet related) underlying semantics than Interaction Diagrams
» Interaction diagrams are for interaction among objects
» Cannot represent interaction at a lower level (among methods of different classes)
» Cannot represent interaction among systems
Why are aspects of data flow not yet supported?
Ambiguous order of input to processes
Considered difficult to unite the semantics of data flow models with other OO models (other research has proved this false)
Some DFDs allowed for control flow, and control flow is duplicated in many of the dynamic models
Could be non-deterministic, since not all processes or data flows are necessarily used to produce the high level process outputs
No way to represent sequencing, iteration and conditionals
Considered to be included already, but inconsistently and in multiple places
Where do UML experts think data flow either exists or would fit?
UML Profile for Enterprise Distributed Object Computing
Activity Diagrams
Collaboration Diagrams
Action Semantics (Data flow is model element that acts as temporary data store between in and out pins)
Data-parallel CORBA
Using new (data flow) patterns of existing UML structures
A UML ActivityGraph Profile for EDOC Task Model (ad/99-10-07)
Object interaction diagram (I assume this option is merely seconding the Collaboration Diagrams suggestion)
Initial Requirements Collection
Establish criteria for (automatable) checking of (internal/external) completeness (well-formed, well-connected, well-introduced, well-rooted) [TK], consistency [Kung], decomposition (inheritance), boundedness [Parks], determinacy [Parks, KM] and termination [Parks, KM].
Elementary processes modeled, like Petri nets [Petri], with pre- and post-conditions describing the behavior of processes.
Provide ability to express data dimensions and other data properties (in Class Diagram) and explicit linkage to these from Data Flow Diagram.
Map to rest of UML - Consistent with State Diagrams (events trigger "is-used-to-produce" relation), Action Semantics, EDOC [EDOC], Collaborations, Activity, Class Diagrams (generalization and process functional dependencies to associations) [Kung], Use Case Diagrams (actors) [Parks], RT UML (ports map to I/O specs) and Deployment Diagrams (see next 2 bullets).
Provide ability to express parallelization along data dimensions and mapping to hardware resources in Deployment Diagram. Must express data distribution types (sequential or parallel) and sub types (round robin, random even, random statistical, first available, etc.)
Allow ability to specify that arrows in DFD are associated with (virtual) channels (see Virtual Interface Specification) in Deployment Diagram.
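Two of the distribution subtypes named above, round robin and first available, can be contrasted in a short sketch. The function names, the per-item cost model, and the tie-breaking rule (lowest process index first) are assumptions for illustration; the slides do not define these subtypes precisely.

```python
import heapq


def round_robin(n_items, n_procs):
    """Item k goes to process k mod n_procs, regardless of load."""
    return [k % n_procs for k in range(n_items)]


def first_available(costs, n_procs):
    """Each item goes to the process that becomes free earliest.

    costs[k] is the assumed processing time of item k. Ties are broken
    by the lowest process index.
    """
    free_at = [(0.0, p) for p in range(n_procs)]   # (time free, process)
    heapq.heapify(free_at)
    assignment = []
    for cost in costs:
        t, p = heapq.heappop(free_at)
        assignment.append(p)
        heapq.heappush(free_at, (t + cost, p))
    return assignment


costs = [5, 1, 1, 1]                    # one slow item, three fast ones
rr = round_robin(len(costs), 2)         # [0, 1, 0, 1]
fa = first_available(costs, 2)          # [0, 1, 1, 1]
```

With one slow item, round robin still sends the third item to the busy process, while first available routes the remaining items to the idle one. That difference is why the subtype needs to be an explicit, modelable property of a flow.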
Initial Requirements Collection (cont)
Non-side-effecting operations, or previously defined actions, are decomposed using functional models and these are generally used at the aggregate level [SD].
Aggregate objects are passed as an input parameter and returned as an output parameter, allowing a process to access any object (data stores, object classes, or associations) with the parameter [SD].
Place all control flow info in a state machine (to solve 4.4 and 4.5) [SD].
Provide for data store I/O not included in action semantics.
Must be able to model partial objects (multiple partial partitions of data) described in Data Parallel CORBA Spec.
Provide method to express process synchronization as something external to processes (as opposed to state machines where this would be defined in a state) without knowledge of composition context.
Associate I/O specs with port attributes
Need semantics to model data
[Figure label: Constraints to unify behavior, class & functional models]
Initial Requirements Collection (cont) for Modeling Streaming Data
Provide a uni-directional data streaming interface with data flow.
Model structured (number of dimensions, extent in each dimension, packing order & element type) and unstructured global data (number of data sets, size of data) [DRI].
Model object I/O requirements e.g. support for structured/non-structured data, dimensions, element types and data partitioning specification (e.g. indivisible or block type and for each dimension, maximum size, minimum number of required elements, modulo size, block length, left and right overlap specs, etc.) [DRI].
Model data stream control e.g. push and pull of data, QoS based on data control (e.g. rate & latency constraints), control data stream, control tagged data, etc. [DRI].
[Figure labels: Input Specifications; Output Specifications; Name; Properties; Global Data; Name 1, Properties; Name 2, Properties; Name 3, Properties; Data Distribution (sub)Type]
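As an illustration of the partitioning parameters listed above (block type, left and right overlap), the sketch below splits one data dimension into contiguous blocks and widens each block by its overlap. It is an assumption-laden example rather than the [DRI] interface: the function name, the half-open `(start, stop)` convention, and the choice to give leftover elements to the first blocks are all invented here.

```python
def block_partition(extent, parts, left_overlap=0, right_overlap=0):
    """Split range(extent) into `parts` contiguous blocks.

    Returns one half-open (start, stop) range per block, widened by the
    requested overlaps and clipped to the bounds of the dimension.
    """
    if parts <= 0 or extent < parts:
        raise ValueError("need 0 < parts <= extent")
    base, extra = divmod(extent, parts)
    ranges = []
    start = 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)   # first `extra` blocks get one more
        stop = start + size
        ranges.append((max(0, start - left_overlap),
                       min(extent, stop + right_overlap)))
        start = stop
    return ranges


# A 10-element dimension over 3 processes, with one element of overlap
# on each side (as a sliding-window filter might require).
plain = block_partition(10, 3)             # [(0, 4), (4, 7), (7, 10)]
halo = block_partition(10, 3, 1, 1)        # [(0, 5), (3, 8), (6, 10)]
```

For multi-dimensional structured data the same function could be applied per dimension, which matches the slide's "for each dimension" wording.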
Existing Data Flow Semantic Models
Petri Nets [Petri]
Kung, et al [TK, Kung]
Karp and Miller Computation Graphs [KM]
Kahn Process Networks [Kahn]
Parks Bounded Execution [Parks]
Completely different connection in action semantics, EDOC, Activity Diagrams, Different CASE vendors
References
[BS] D. Bhatt and J. Shackleton, A Design Notation and Toolset for High-Performance Embedded Systems Development, Lectures on Embedded Systems, LNCS 1494, Springer-Verlag, VIII, October 1998.
[DRI] Document for the DARPA Data Reorganization Effort, www.data-re.org, Feb 2000.
[EDOC] Cooperative Research Centre for Enterprise Distributed Systems Technology, UML Profile for Enterprise Distributed Object Computing, ad/99-10-07.
[Kahn] G. Kahn, The Semantics of a Simple Language for Parallel Programming, Info. Proc., pages 471-475, Stockholm, Aug. 1974.
[KM] R. M. Karp and R. E. Miller, Properties of a Model for Parallel Computations: Determinacy, Termination, Queueing, SIAM Journal of Applied Mathematics, Vol. 14, No. 6, November 1966.
[Kung] C. H. Kung, Conceptual Modeling in the Context of Software Development, IEEE Transactions on Software Engineering, 15(10):1176-1187, Oct. 1989.
[Parks] T. M. Parks, Bounded Scheduling of Process Networks, Technical Report UCB/ERL-95-105, PhD Dissertation, EECS Department, University of California, Berkeley, CA, December 1995.
[Petri] C. A. Petri, Kommunikation mit Automaten, PhD dissertation, translation by C. F. Greene, Supplement 1 to Technical Report RADC-TR-65-337, Vol. 1, Rome Labs, Griffiss Air Force Base, NY, 1965.
[TK] Y. Tao and C. Kung, Formal Definition and Verification of Data Flow Diagrams, J. Systems Software, 16:29-36, 1991.
[SD] S. DeLoach, Formal Transformations from Graphically-Based Object-Oriented Representations to Theory-Based Specification, PhD thesis, Air Force Institute of Technology, Wright-Patterson AFB, OH, June 1996, AFIT/DS/ENG/96-05, AD-A310 608.