MIDDLEWARE SYSTEMS RESEARCH GROUP
MSRG.ORG
Enabling BPM for Clouds
Hans-Arno Jacobsen
Bell University Laboratory Chair
University of Toronto
http://www.padres.msrg.utoronto.ca
Summer School on Service Research, July 19th, 2010
“BPM in Cloud Architectures: Business Process Management with SLAs and Events”, joint work with Vinod Muthusamy (extended abstract)
http://eqosystem.msrg.org/
An eQoSystem for declarative distributed applications with SLAs via Events & SLAs
PADRES: A Content-based Pub/Sub System
*
Large-scale Business Processes
Department-level processes with 26 to 47 activities
Global processes that compose departmental ones
Thousands of concurrent instances
Hundreds of collaborating partners
*
What Support is Required ?
De-coupling and loose coupling
*
Agenda
Enabler
Databases
But what about future data (e.g., events)
Data streams
But what about un-structured, multi-typed, sporadic, un-ordered events from many sources
Rule-based expert systems
Great for inference and reasoning
But what about managing large numbers of fine-grained filters in distributed environments
Cum grani salis
*
What Abstractions Enable BPM?
It is our opinion that the afore-mentioned requirements can best be addressed by
The content-based publish/subscribe paradigm
Events represent state transitions in the environment.
Conveyed as publications to the pub/sub system
Event filtering and correlation is based on
Subscriptions managed by the pub/sub system
Service Summer, July 19th, 2010, KIT, Germany
*
Content-based Routing
3. Publish
Input queue
Publications & events (data)
advertisements (data sources), publications
Benefits of Publish/Subscribe
Supports sophisticated interactions among components using expressive subscription languages
Supports fine-grained subscriptions for event management
Achieves scalability with in-network filtering and processing
*
Content-based publish/subscribe is a powerful coordination and interaction model. It goes far beyond what can be achieved by the earlier topic-based publish/subscribe model.
Content-based publish/subscribe is fully compatible with topic-based p/s, i.e., any topic-based system can be easily emulated by a content-based one.
Content-based p/s offers far more powerful message filtering than topic-based p/s, because content-based filters operate over the message content rather than only the topic field.
Content-based p/s is very efficient. We have shown that 10 million content-based filters (i.e., 10 predicates over integers and strings) can be processed on current server technology (Dell PowerEdge Small Business Model) in about 100 milliseconds for one event containing up to 50 attribute-value pairs.
Content-based routing can be used to coordinate activities (e.g., business processes) as we demonstrate later. It can be used for event correlation, suppressing unnecessary events from polluting the network infrastructure. This is only possible due to the fine-grained expressiveness over message content, not available in the topic-based model.
Of course, in the end it is a design decision whether to use topic-based or content-based p/s. However, choosing the more flexible content-based p/s can never be wrong, as it is fully compatible with the less expressive and less flexible topic-based model.
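The difference between the two models can be illustrated with a minimal Python sketch. This is not the PADRES matching engine; the predicate format, the operator set, and the names `matches` and `topic_subscription` are assumptions made for illustration only. A subscription is a conjunction of predicates over attribute-value pairs, and a topic-based subscription is emulated by a single equality predicate on a `topic` attribute.

```python
import operator

# Comparison operators a predicate may use (an illustrative subset, assumed here).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}


def matches(subscription, publication):
    """Return True if every predicate of the subscription holds for the publication.

    subscription: list of (attribute, op, value) predicates, implicitly ANDed.
    publication:  dict of attribute-value pairs.
    """
    for attribute, op, value in subscription:
        if attribute not in publication:
            return False
        if not OPS[op](publication[attribute], value):
            return False
    return True


def topic_subscription(topic):
    """Emulate a topic-based subscription with one equality predicate."""
    return [("topic", "=", topic)]


# Hypothetical event and subscriptions, for illustration only.
event = {"topic": "orders", "class": "order", "amount": 1200, "region": "EU"}
large_eu_orders = [("class", "=", "order"), ("amount", ">", 1000), ("region", "=", "EU")]

content_match = matches(large_eu_orders, event)            # filter on message content
topic_match = matches(topic_subscription("orders"), event)  # topic emulation
```

The content-based filter selects only large EU orders, which a topic-based system could express only by creating a dedicated topic for that combination.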
Many Applications are Event-based &
Supply chain and logistics
*
Enterprise applications are inherently event-based.
Supply chain applications manage the life cycle from product purchase order, manufacture, shipping and distribution through resellers and suppliers, to stocking at the retailer and monitoring sales at the retailer, which drives more product orders, etc. Each stage is triggered by the occurrence of one or more events. For example, the event that the retailer’s stock is empty triggers more product orders, or the event that the product has been manufactured triggers the delivery process.
Job scheduling applications can manage an IT infrastructure (e.g., initiate backups, notify administrators of system failures), maintain customer relations (e.g., send coupons to VIP customers every month, call new customers), etc. Each job (such as emailing an administrator) is triggered by some event (such as a database crash).
Applications to monitor sensors (such as the temperature in various components of an assembly line), or process RFID readings (e.g., shipments entering and leaving a warehouse, or products scanned at a point-of-sale terminal) react to external events (such as a high temperature measurement, or the arrival of a shipment).
In SOA architectures, business processes are built by composing services provided by various departments, vendors, partners or customers. The communication among these distributed components can be modeled as a set of events that trigger the invocation, callback, failure handling, and control flow of the business process.
Our PADRES ESB for
Peng Alex David aRno Eli Serge
PADRES is Publish/subscribe Applied to Distributed Resource Scheduling
http://www.padres.msrg.utoronto.ca
*
Our PADRES ESB Stack & Vision
*
Content-based Routing (Publish/Subscribe)
Try out and download at:
http://www.padres.msrg.org
*
PADRES is a distributed publish/subscribe system implemented by MSRG, U of T.
Each broker is a content-based router.
Innovative PADRES Features
*
Need more features from the infrastructure for effectively supporting enterprise apps.
Old features
Historic queries– for auditing; also temporal joins. For extracting information published in the past.
Management – monitoring, control, configuration.
This refers to allowing content-based routing over arbitrary overlay network topologies.
Existing approaches are limited to acyclic topologies and to tree topologies, which are severely limiting, especially in light of load issues (e.g., a link or broker may be overloaded, and in an acyclic topology there is no way to route around the overloaded link, as it is the only link connecting two entities). Furthermore, node failures are detrimental in an acyclic topology, whereas with cycles in the topology a new route can be found easily.
LB – gives load balance, ease of administration.
Security – hard due to content-based routing; issues of privacy, authentication, authorization, immutability, non-repudiation, trusted domains, key management, etc.
Content-based Routing (Publish/Subscribe)
have completed (D depends on
B and C).
Modeling Business Processes
Dependency in processes and more complex process patterns require event correlation
Event correlation is enabled by the detection of composite events
Composite events are expressed via composite subscriptions
Composite subscription consists of atomic subscriptions
Subscription language features for BPM modeling
E.g., AND, OR, and variables ($x)
*
[class = Activity Status], [cmd. = Archived],[Process ID = $X ] AND
[class = Activity Status], [cmd. = Signed Off],[Process ID = $X ]
Expresses a performance property of a process
[cmd. = Credit check request], [Process ID = $X ] AND
[status = Approved], [Process ID = $X ]
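The composite subscriptions above can be sketched in Python as follows. This is only an illustration of AND with a shared variable, not the PADRES subscription language or its matching algorithm; the dictionary representation, the attribute name `cmd` (written `cmd.` on the slide), and the function names are assumptions. A value starting with `$` is treated as a variable that must bind to the same value in both atomic subscriptions.

```python
def match_atomic(atomic, event, bindings):
    """Match one atomic subscription against an event.

    Values starting with "$" are variables: they bind on first use and must
    agree with the existing binding afterwards. Returns the extended bindings,
    or None if the event does not match.
    """
    new_bindings = dict(bindings)  # copy so the caller's bindings stay untouched
    for attribute, expected in atomic.items():
        if attribute not in event:
            return None
        actual = event[attribute]
        if isinstance(expected, str) and expected.startswith("$"):
            if new_bindings.setdefault(expected, actual) != actual:
                return None
        elif actual != expected:
            return None
    return new_bindings


def detect_and(left, right, events):
    """Return the variable bindings of every event pair satisfying `left AND right`."""
    results = []
    for e1 in events:
        b1 = match_atomic(left, e1, {})
        if b1 is None:
            continue
        for e2 in events:
            b2 = match_atomic(right, e2, b1)
            if b2 is not None:
                results.append(b2)
    return results


# The credit check example; the event values are made up for illustration.
request = {"cmd": "Credit check request", "Process ID": "$X"}
approved = {"status": "Approved", "Process ID": "$X"}
events = [
    {"cmd": "Credit check request", "Process ID": 7},
    {"status": "Approved", "Process ID": 8},
    {"status": "Approved", "Process ID": 7},
]
composite_matches = detect_and(request, approved, events)
```

Only the request and approval that share process ID 7 form a composite event; the approval for process 8 is ignored because `$X` is already bound to 7.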
Business Process Management
Deployment of transformed process
Monitor process & instance execution
Manage, i.e., control, version,…
*
BPEL
Receive
Assign
Flow
Invoke
Wait
Reply
*
In the execution phase, the deployed business process can be invoked through a Web service agent, which translates the invocation into a NIÑOS service request. The service request is a publication message that specifies the process and instance identifiers, and other required information. The first activity agent in the process, say the receive activity, receives this publication, instantiates a process instance, processes the activity, and triggers the successor assign activity. Agents execute and trigger one another using pub/sub messages in this event-driven manner until the process terminates.
An agent might need to access external Web services, wait for the response, and then continue.
The agents are both publishers and subscribers. The dual roles enable them to exchange messages via pub/sub and execute the BPEL process in a coordinated manner.
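The event-driven chaining of activity agents can be sketched with a toy, single-process broker. This is a hedged illustration only: the `Broker` and `ActivityAgent` classes, the `trigger` attribute, and the message layout are assumptions and do not reflect the actual PADRES or BPEL mapping. Each agent subscribes to the publication that triggers it and publishes the trigger for its successor.

```python
from collections import deque


class Broker:
    """A toy, in-memory stand-in for the pub/sub layer (assumed, not PADRES)."""

    def __init__(self):
        self.subscriptions = []  # list of (predicate dict, callback)
        self.queue = deque()

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, publication):
        self.queue.append(publication)

    def run(self):
        # Deliver queued publications until no agent publishes anything new.
        while self.queue:
            publication = self.queue.popleft()
            for predicate, callback in self.subscriptions:
                if all(publication.get(k) == v for k, v in predicate.items()):
                    callback(publication)


class ActivityAgent:
    """An agent that is both subscriber (its trigger) and publisher (successor trigger)."""

    def __init__(self, broker, name, trigger, successor_trigger, log):
        self.broker = broker
        self.name = name
        self.successor_trigger = successor_trigger
        self.log = log
        broker.subscribe({"trigger": trigger}, self.on_trigger)

    def on_trigger(self, publication):
        # "Process the activity" (here: just record it), then trigger the successor.
        self.log.append((self.name, publication["instance"]))
        if self.successor_trigger is not None:
            self.broker.publish({"trigger": self.successor_trigger,
                                 "process": publication["process"],
                                 "instance": publication["instance"]})


log = []
broker = Broker()
ActivityAgent(broker, "receive", "start", "receive_done", log)
ActivityAgent(broker, "assign", "receive_done", "assign_done", log)
ActivityAgent(broker, "reply", "assign_done", None, log)

# The service request: a publication carrying process and instance identifiers.
broker.publish({"trigger": "start", "process": "loan", "instance": 1})
broker.run()
```

No agent knows where its successor runs; the coupling exists only through the content of the publications, which is what allows the agents to be placed on different engines.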
Cf. our ACM Trans Web’2010
for full BPEL mapping
Content-based Routing (Publish/Subscribe)
© 2010 Hans-Arno Jacobsen
Currently, business goals must be manually considered at every stage of the business process development cycle
*
Service Level Agreements (SLAs)
SLAs are contracts between service consumers and providers that specify the expected behavior of each party and the penalties of violating the contract.
SLAs specify business goals declaratively.
Layer
Fidelity, quality, utility
Map resolution > 300x300
Deployment & Operations
Service time
*
*
Monitoring
Feed back measurements to support runtime adaptations.
Distributed execution
ESB adaptation
Problem: How to deploy activities in a distributed manner to satisfy SLAs?
*
*
Will talk about
Evaluation
Redeployment
Activity manager
Stores activity state
Executes activities by collaborating with other activity managers at (remote) locations
Activity profiler
Records statistics on activity ai relevant to metric type Mk
Candidate engine discovery
Discovers other candidates
Engine profiler
Maintains information about remote engine ej relevant to metric type Mk (i.e., info necessary to compute metric cost of an activity)
Atomic redeployer
Move an activity to another engine without interrupting the process
Redeployment manager
Decide when and where to move activities
Maintains profiles for various metric types
Message hops, disk I/O, energy usage, etc.
What is analysis (monitoring) in terms of activity and process.
A profiler per metric and per activity
Message hops metric (metric type) profiler monitors the message rate to / from other activities
Right-hand corner box
Activity profiler is triggered by lifecycle events, such as message sends, receipts, disk i/o, etc.
Input ak is activity parameter, i.e., the activity profiled
Index k refers to metric type
*
APk
activity lifecycle
Compute paths
Computes and caches information about candidate engines
Cf. DEBS’2009 for our resource discovery algorithms to identify candidates
What is analysis (monitoring) in terms of entire engine
Discover information about candidate engines (here, the one neighbour)
Distance (message hops)
Available memory
Paths are embedded in messages; discover paths as by-product of communication
Send probe request and reply with path traversed
Example: Determine and compute distance to/from candidate engines and predecessor and successor
Monitor distances between process activities and their predecessors/successors (piggyback path information onto messages to determine the distance, etc.)
Probed information expires after some time
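The engine profiler notes above (paths learned from messages, probed information that expires) can be sketched as a small cache. This is a minimal sketch under assumptions: the class name `EngineProfile`, the time-to-live parameter, and the stored fields are hypothetical and not taken from PADRES.

```python
import time


class EngineProfile:
    """Caches probed facts about candidate engines and expires stale entries."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock        # injectable clock, which keeps the sketch testable
        self.entries = {}         # engine id -> (timestamp, info dict)

    def record(self, engine, info):
        self.entries[engine] = (self.clock(), info)

    def lookup(self, engine):
        entry = self.entries.get(engine)
        if entry is None:
            return None
        timestamp, info = entry
        if self.clock() - timestamp > self.ttl:
            del self.entries[engine]  # probed information has expired
            return None
        return info


def hops_from_path(path):
    """Distance in message hops, given the nodes a message traversed."""
    return max(len(path) - 1, 0)


# Illustrative use with a fake clock; the engine and broker names are made up.
now = [0.0]
profile = EngineProfile(ttl_seconds=10, clock=lambda: now[0])
profile.record("e2", {"hops": hops_from_path(["e1", "b3", "e2"]),
                      "free_memory_mb": 512})
fresh = profile.lookup("e2")
now[0] = 11.0
stale = profile.lookup("e2")
```

Expiring entries forces the profiler to re-probe, so redeployment decisions are not based on outdated distance or memory figures.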
*
EPk
set of candidate engines
Redeployment Manager
Estimator: Computes an estimate of the metric cost ck(ai,ej) of hosting an activity ai at engine ej
Cost model: Computes an estimate of the cost c(ai,ej) of hosting activity ai on engine ej
Check deployment: Determines what to do with an activity ai
Determine best engine e
Compute benefit: c(ai) – c(ai,e)
If resident long enough
Otherwise, apply pressure to other activities
What-if analysis
Compute the cost of deploying local activity ai at candidate engines ej
Uses
Cost model could be a function of many metric types
c(ai) is the cost at the current engine, where the benefit is computed
Resident time is important to prevent oscillation (moving too often)
Example: Minimize end-to-end delay: Cost of CPU execution time + message transmission time.
Complexity
Given
E(P(ai)): Location of predecessors
E(S(ai)): Location of successors
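The check-deployment step can be sketched as a single decision function. This is a hedged sketch: the slide does not spell out the move condition, so the rule used here (move only if the benefit c(ai) – c(ai,e) is positive and the activity has been resident long enough) is an assumption, as are the function and parameter names. The "apply pressure to other activities" branch is not modeled.

```python
def check_deployment(current_cost, candidate_costs, resident_time,
                     min_resident_time, min_benefit=0.0):
    """Decide whether to move an activity ai.

    current_cost:    c(ai), the cost of hosting ai on its current engine.
    candidate_costs: dict mapping candidate engine ej to the estimate c(ai, ej).
    Returns the engine to move to, or None to stay on the current engine.
    """
    if not candidate_costs:
        return None
    # Determine the best engine e: the candidate with the lowest estimated cost.
    best_engine = min(candidate_costs, key=candidate_costs.get)
    # Compute the benefit: c(ai) - c(ai, e).
    benefit = current_cost - candidate_costs[best_engine]
    # The resident-time check prevents oscillation (assumed move condition).
    if benefit > min_benefit and resident_time >= min_resident_time:
        return best_engine
    return None


# Made-up costs for illustration.
move_target = check_deployment(10.0, {"e2": 6.0, "e3": 8.0},
                               resident_time=30, min_resident_time=20)
too_recent = check_deployment(10.0, {"e2": 6.0, "e3": 8.0},
                              resident_time=5, min_resident_time=20)
```

Raising `min_benefit` above zero would additionally suppress moves whose gain does not justify the redeployment itself.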
*
Redeployment Manager Summary
Compute the cost of deploying local activities ai at candidate engines ej
Given
E(P(ai)): Location of predecessors
E(S(ai)): Location of successors
Estimated cost of deploying activity ai at candidate engine ej
Complexity
Cost Model
Ek
ai
ej
ck(ai,ej)
APk
EPk
CM
Atomic Redeployment
Traditional pub/sub client movement protocols are expensive and do not offer transactional properties
Transactional movement
Efficient and guaranteed routing reconfiguration
For example, guarantee that no messages are lost, if an activity is re-deployed
Cd1 Message rate
Cd2 Message size
Ce1 Load (number of instances)
Ce2 Resources (CPU, memory, etc.)
Ce3 Activity complexity
Cs1 Latency of external service
Cs2 Execution time of external service
Cs3 Marshalling/unmarshalling
Cost(activity) = f(w_i C_i)
Cost(process) = ∑ Cost(activity)
Optimize time
Optimize network overhead
Threshold criteria: ∑ w_i C_i > x
Minimized criteria: min(∑ w_i C_i)
E.g., Minimize distribution overhead
*
Minimize message hops
f() = 0.3 * cpu_energy + 0.7 * link_energy < X
f() = 0.3 * (invocation rate * engine_unit_energy) + 0.7 * (msg_rate * link_unit_energy) < X
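The energy example can be worked through in a short Python sketch. It assumes that f is a plain weighted sum of cost components, which matches the threshold and minimization criteria but is not stated for f in general; the function names and the numeric inputs are made up for illustration.

```python
def weighted_cost(weights, components):
    """Cost(activity) as a weighted sum of cost components (assumed form of f)."""
    return sum(weights[name] * components[name] for name in weights)


def energy_cost(invocation_rate, engine_unit_energy, msg_rate, link_unit_energy):
    """The energy example: 0.3 * cpu_energy + 0.7 * link_energy."""
    cpu_energy = invocation_rate * engine_unit_energy
    link_energy = msg_rate * link_unit_energy
    return weighted_cost({"cpu": 0.3, "link": 0.7},
                         {"cpu": cpu_energy, "link": link_energy})


def process_cost(activity_costs):
    """Cost(process) is the sum of the costs of its activities."""
    return sum(activity_costs)


def satisfies_bound(cost, bound):
    """Check the SLA-style condition f() < X from the energy example."""
    return cost < bound


# Hypothetical measurements: 10 invocations/s at 2.0 energy units each,
# 5 messages/s at 4.0 energy units per message.
activity_energy = energy_cost(10, 2.0, 5, 4.0)  # 0.3 * 20 + 0.7 * 20
within_sla = satisfies_bound(activity_energy, 25.0)
```

Because the link term carries the larger weight, moving an activity closer to its predecessors and successors (lowering the message-related term) reduces this cost more than an equal reduction in CPU energy.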
Post-redeployment traffic is 10%
*
Traffic with redeployment is 42% of the static case
Challenge 1: Local optima
Widen candidate radius
Potential problem has not manifested in evaluations so far
Challenge 3: SLA granularity (more an engineering issue)
Can’t specify SLA on portions of a process
Can’t specify SLA on particular instances of a process (e.g., VIP user)
Benefits of Content-based Publish/Subscribe for BPM
Naturally enables centralized and distributed business process coordination
Coordination can span administrative domains and physically distributed resources
Supports process orchestration and choreography
Monitoring & control are an integral part of the paradigm
Agile, on-the-fly process adaptation and versioning
Correlation of application events with low-level infrastructure events
*
Summary: The PADRES Stack
*
Content-based Routing (Publish/Subscribe)
Ad hoc business processes
Conclusions
Effective BPM requires capable event processing abstractions.
Content-based publish/subscribe is a powerful event processing abstraction and paradigm.
PADRES is based on the pub/sub paradigm.
PADRES is an ESB targeted at event-based BPM.
PADRES enables real-time business analytics and business activity monitoring.
*
Countless alumni (see our web site)
Questions & Discussion?
*
References
Quantifying events in software to increase modularity & customization in C-based systems and software-based product lines
http://www.AspeCtC.net (ACC - the AspeCt-oriented C compiler)
The Middleware Systems Research Group
*