Querying and Monitoring Distributed Business Processes
Daniel Deutch, Tova Milo
Tel-Aviv University
[Figure: enterprise systems and partners: ERP, HR System, eComm, CRM, Logistics, Customer, Bank, Supplier]
Querying and Monitoring Distributed BPs D. M. VLDB '082
Students Takes
sid=sid
sname
name=“Mary”
cid=cid
Courses
SSN          Name     Category
123-45-6789  Charles  undergrad
234-56-7890  Dan      grad
…            …
Select… From… Where…
Students
Optimization
Indexing
Transactions
Files organization
Distribution
...
Data model
Design
Query language
Streams
...
XML
SOAP
WSDL
...
Outline
Introduction to Business Processes
Querying
Monitoring
Summary & Research Directions
Logically related activities that, when combined in a flow, achieve a business goal.
Activities may either be local or remote
Operates in a cross-organization, distributed environment
Abstract representation, independent of implementation
Standards facilitate design, deployment, and execution
What is a Business Process?
Travel Service
Airlines Websites
Consolidate Results
Travel request
Confirmation
Hotels Websites
queries
responses
queries
responses
Web-Based Travel Agency BP
BPs are designed by Non-programmers
Specify combination of functionalities and flow thereof, to solve a complex problem
Example: process an order
The operations/functions are implemented by programmers (programming in the small)
Example: fetch order document
Programming in the Large
Orchestration: an executable process; message exchange sequences are controlled by the orchestration designer.
Choreography: a non-executable protocol for interactions, e.g., the legal sequences of exchanged messages, guaranteeing interoperability.
[orchestra with a conductor vs. ballet dancers]
Orchestration vs. Choreography
Web Service 1
Web Service 2
Web Service 3
Web Service 4
Web Service 5
Web Service n
Company A business process
Local to company A
At company B
On the Web
Web Services Meet BPs
Planning
Modeling & Design
Development & Deployment
Execution, Interacting & Monitoring
Analysis and Optimization
Source: Microsoft BPM Description
BP Management
Complex Systems
Distributed Settings
Interoperability issues
Robustness
Scale
Web interface
Legacy systems
Modeling Challenges
BP market (BPTrends survey)
Over 20 million hits in Google for “business process”
Over 5 million hits for "business process management"
Vast interest by analysts (e.g. Gartner)
Rapidly growing interest in industry
New Standards
BP Management buzz
2000/05
BPML (Intalio et al.)
WSFL(IBM)
BPSS(ebXML)
BPEL4WS 1.0 (IBM, Microsoft)
BPEL4WS 1.1
(OASIS)
WS-Choreography(W3C)
WSCI(Sun et al)
WSCL(HP)
BPEL
XLANG (Microsoft)
2001/03 2001/05 2001/06 2002/03 2002/06 2002/08 2003/01 2003/04 2007/04
WSBPEL 2.0
(OASIS)
2007/06
BPEL4People (Oracle et al.)
Standards History
Language for specifying BP behavior based on Web Services (WS)
Define BPs as coordinated sets of Web service interactions
Define both abstract and executable processes
Specifies Web services Composition
BPEL in a nutshell
<receive> <reply> <invoke> <assign> <throw> <wait> <sequence> <flow> (parallel) <if> <while> <repeatUntil>
…
Communication-related constructs
Flow-related constructs
BPEL constructs
BPEL is XML-based
In general we could use XML editors for specification design
Infeasible in practice
BPEL as XML
<process>
  <!-- Definition and roles of process participants -->
  <partnerLinks> ... </partnerLinks>
  <!-- Data/state used within the process -->
  <variables> ... </variables>
  <!-- Properties that enable conversations -->
  <correlationSets> ... </correlationSets>
  <!-- Exception handling -->
  <faultHandlers> ... </faultHandlers>
  <!-- Error recovery: undoing actions -->
  <compensationHandlers> ... </compensationHandlers>
  <!-- Concurrent events with process itself -->
  <eventHandlers> ... </eventHandlers>
  <!-- Business process flow -->
  (activities)*
</process>
BPEL as XML (cont.)
<if>
  <condition>
    bpel:getVariableProperty('shipRequest', 'props:shipComplete')
  </condition>
  <sequence>
    <assign>
      <copy>
        <from variable="shipRequest" property="props:shipOrderID" />
        <to variable="shipNotice" property="props:shipOrderID" />
      </copy>
      <copy>
        <from variable="shipRequest" property="props:itemsCount" />
        <to variable="shipNotice" property="props:itemsCount" />
      </copy>
    </assign>
    …
Roughly equivalent pseudocode:
if (shipRequest.shipComplete)
{
    shipNotice.shipOrderID = shipRequest.shipOrderID;
    shipNotice.itemsCount = shipRequest.itemsCount;
}
BPEL as XML (cont.)
Example Editor (Eclipse)
Example editor (Microsoft VS)
Travel Agency Process Flow
So far: challenges & solutions for modeling
BPs are hard to analyze, debug, and optimize
Again, due to scale, distributed settings, legacy systems, …
Good modeling simplifies process specification
But further analysis tools are required
Challenges
Static Analysis
• “What kind of credit services are used (in)directly?”
• “How can I buy a plane ticket?”
• “Can one get a price quote without first giving credit card info?”
Monitoring
• “Notify me when a user hacks the system and gets a price quote without giving credit card info”
Log Analysis
• “Find all logs where a user bought a plane ticket”
Analysis Types
Introduction to Business Processes
Querying BP Specifications
Monitoring
Summary & Research Directions
Outline
Statically analyze a Business Process
“Find all ways in which a user can buy a plane ticket without first relaying her credit card details”
Analysis needs
• Control flow analysis (Reachability, Cycle Detection, Temporal properties, …)
• Structural analysis
• Data analysis
  – Message validation, pointer analysis, array bounds analysis, …
• Combined (data and flow) analysis
Database approach
• Treat BPs as data
• Design a query language
Querying Business Processes
Recall that BPEL is XML-based. So why not use XQuery?
The arguments are similar to those against designing BPs directly in XML:
• Need to handle complex technical constructs
• Complex, unintuitive queries (many joins & recursion)
• No abstraction
• No robustness to changes in BPEL standard
Why not XQuery?
Uniform
Graphical
Scalable
Flexible
Similar to the specification design
Can handle partial information and uncertainty
An Ideal querying/analysis tool
[Figure: BP specification models (Finite State Machine, Recursive State Machine, Context Free Graph Grammars) against query languages (Temporal Logic: LTL, CTL, CTL*, Mu-calculus; First Order Logic; Monadic Second Order Logic)]
Models & Query Languages
Finite State Machines (FSM) for software specification
Temporal Logic (TL) for querying all possible behaviors
Very common in software (and hardware) verification
Typically Linear time evaluation (data complexity)
Exponential time (query complexity)
First Try
Finite State Machines (FSM)
States and transition function (typically no “accepting” state)
The system configuration is encoded within the states
Interested in properties of possible traversal over the states
Temporal Logics express such properties
Finite State Machines (FSM)
“Flat”: no functions, no recursion
Cycles allowed
Execution path by traversal
[Figure: FSM with the states Login, Search, Reserve, Payment, Credit, Cash, Confirm, Cancel]
FSM (example)
Predicates
• x=0?
• Was a reservation made?
Logical operators (and, or, not)
Queries
• Can a reservation be made without relaying a credit card number?
• Must one eventually log in if one makes a trip search?
Temporal Logic - Ingredients
Quantifiers over execution paths
• A φ - All: φ holds on all paths starting from the current state.
• E φ - Exists: φ holds on at least one path starting from the current state.
Path-specific quantifiers
• X φ - Next: φ holds at the next state.
• G φ - Globally: φ has to hold on the entire subsequent path.
• F φ - Finally: φ eventually has to hold (somewhere).
• φ U ψ - Until: φ has to hold until at some position ψ holds, and ψ must hold eventually.
• φ W ψ - Weak until: φ has to hold until ψ holds; ψ need not ever hold, in which case φ holds forever.
Temporal Operators
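The path operators can be made concrete on a single finite trace. The following is a minimal sketch, assuming a trace is a list of sets of atomic propositions and using finite-trace semantics (an assumption of this example; the slides speak of paths through an FSM, which may be infinite). All function names and the sample trace are hypothetical.

```python
from typing import Callable, List, Set

State = Set[str]
Pred = Callable[[List[State], int], bool]  # (trace, position) -> truth value

def atom(name: str) -> Pred:
    # The proposition `name` holds at position i.
    return lambda tr, i: name in tr[i]

def neg(p: Pred) -> Pred:
    return lambda tr, i: not p(tr, i)

def nxt(p: Pred) -> Pred:
    # X p: p holds at the next position (false at the last position).
    return lambda tr, i: i + 1 < len(tr) and p(tr, i + 1)

def finally_(p: Pred) -> Pred:
    # F p: p holds at some position from i onwards.
    return lambda tr, i: any(p(tr, j) for j in range(i, len(tr)))

def globally(p: Pred) -> Pred:
    # G p: p holds at every position from i onwards.
    return lambda tr, i: all(p(tr, j) for j in range(i, len(tr)))

def until(p: Pred, q: Pred) -> Pred:
    # p U q: q holds at some position, and p holds at every position before it.
    def check(tr: List[State], i: int) -> bool:
        for j in range(i, len(tr)):
            if q(tr, j):
                return True
            if not p(tr, j):
                return False
        return False
    return check

def weak_until(p: Pred, q: Pred) -> Pred:
    # p W q: like Until, but q is allowed never to hold if p holds throughout.
    return lambda tr, i: until(p, q)(tr, i) or globally(p)(tr, i)

# Hypothetical execution of the travel example.
TRACE = [{"login"}, {"search"}, {"reserve"}, {"cash"}, {"confirm"}]
```

For instance, `finally_(atom("reserve"))(TRACE, 0)` asks whether a reservation is eventually made, and `globally(neg(atom("credit")))(TRACE, 0)` asks whether credit is never used on this trace.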
Queries
• Can a reservation be made without relaying a credit card number?
  E(F(Reserve) and not F(Credit))
• Must one eventually log in if one makes a trip search?
  A(F(login) or not F(search))
Logics
• Linear time Logic (LTL): no path quantifiers
• CTL*: allows path quantifiers
• Mu-calculus: introduces fix-point operators
First Order
(Monadic) Second Order
Temporal Logic
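The two example queries can be checked on a small FSM with plain graph search. This is a minimal sketch, not a general model checker: it assumes executions are finite runs that end in a state with no outgoing transitions, and the transition table is hypothetical (the slides name the states but the edges here are assumed). `reachable` and `exists_run` are illustrative names.

```python
from collections import deque

# Hypothetical transitions between the states of the FSM example.
FSM = {
    "Login": ["Search"],
    "Search": ["Search", "Reserve"],
    "Reserve": ["Payment", "Cancel"],
    "Payment": ["Credit", "Cash"],
    "Credit": ["Confirm"],
    "Cash": ["Confirm"],
    "Confirm": [],
    "Cancel": [],
}

def reachable(fsm, source, avoid):
    """States reachable from `source` by runs that never enter `avoid`."""
    if source == avoid:
        return set()
    seen = {source}
    queue = deque([source])
    while queue:
        state = queue.popleft()
        for successor in fsm[state]:
            if successor != avoid and successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen

def exists_run(fsm, start, visit, avoid):
    """E(F(visit) and not F(avoid)) over finite runs ending in a terminal state."""
    if visit not in reachable(fsm, start, avoid):
        return False
    # The run must also be completable after `visit` without entering `avoid`.
    return any(not fsm[state] for state in reachable(fsm, visit, avoid))

# E(F(Reserve) and not F(Credit))
can_reserve_without_credit = exists_run(FSM, "Login", "Reserve", "Credit")

# A(F(login) or not F(search)) is the negation of E(F(search) and not F(login)).
must_login_if_search = not exists_run(FSM, "Login", "Search", "Login")
```

Both checks are linear in the size of the FSM, which matches the data complexity quoted earlier for this fixed pair of queries; arbitrary temporal formulas need a real model checker.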
1. Scale
2. Expressive Power (Specification)
3. Expressive Power (Query language)
4. (Un)Intuitive Formulation
Features & Limitations
Evaluation is linear in FSM size, but…
FSM size is huge for real-life specifications
Encoding the call stack and data in the states makes evaluation infeasible
[Figure: the FSM example (Login, Search, Reserve, Payment, Credit, Cash, Confirm, Cancel)]
1. Scale
Bounded Model Checking [Biere et al. ’99, ’03], [Clarke et al. ’04], …
Summarization [Reps et al. ’98], [Sagiv et al. ’05], …
Compact representation
• StateCharts [Harel ’87]
• BDD [Bryant ’86, Lam et al. ’05]
Solutions
Graphical tool for designing state machine based specifications
Extend basic FSM concepts (super-states)
Simplifies design
UML standard
StateCharts
Data structure that allows compact representation of data with high similarities
(search ∧ cash) ∨ (¬search ∧ confirm)
[Figure: BDD over the variables search, confirm, cash, with terminals 1 and 0]
BDD
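The formula on the BDD slide can be encoded as a small decision diagram and checked against its truth table. This is a minimal sketch with a hand-built diagram (real BDD packages construct and reduce diagrams automatically); the tuple encoding and the names `ROOT`, `evaluate` and `equivalent` are assumptions of this example.

```python
from itertools import product

# A node is a terminal (True/False) or a tuple (variable, low, high):
# follow `low` when the variable is false and `high` when it is true.
CASH = ("cash", False, True)
CONFIRM = ("confirm", False, True)
ROOT = ("search", CONFIRM, CASH)

def evaluate(node, assignment):
    """Walk from the root to a terminal, testing one variable per node."""
    while not isinstance(node, bool):
        variable, low, high = node
        node = high if assignment[variable] else low
    return node

def formula(a):
    # (search AND cash) OR (NOT search AND confirm)
    return (a["search"] and a["cash"]) or (not a["search"] and a["confirm"])

def equivalent():
    """Compare the diagram with the formula on all 8 assignments."""
    names = ["search", "confirm", "cash"]
    for bits in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, bits))
        if evaluate(ROOT, assignment) != formula(assignment):
            return False
    return True
```

Three decision nodes stand in for an eight-row truth table, and each evaluation tests at most two variables; this sharing is what the compact representation exploits.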
Store commands as DB relations, e.g., “command c is written at location l”
“Open” all possible contexts
Exploit similarities and represent compactly by Binary Decision Diagrams
Datalog queries
Relational DB approach [Lam ’05]
FSMs have limited expressive power
May yield inaccurate approximation of real-life specification
2. Expressive Power (Specification)
[Figure: BP specification models (Finite State Machine, Recursive State Machine, Context Free Graph Grammars) against query languages (Temporal Logic, First Order Logic, Monadic Second Order Logic)]
Models & Query Languages
A collection of FSMs
Each with multiple entries & exits
Some states represent calls to other state machines (or to self)
An expansion is replacing a call state by a possible implementation
An execution is a sequence of expansions
Recursive State Machines (RSMs)
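The notions of call state, expansion and execution can be illustrated in a few lines. This is a minimal sketch under strong assumptions: every machine has a single entry, each component graph is acyclic, and recursion is cut off at a fixed depth. The two machines and their edges are hypothetical, loosely modelled on the travel example, and `executions` is an illustrative name.

```python
# Each machine: an entry state, edges, and a map from call states to the
# machine they invoke. "payment" in HomePage is a call to the Payment machine.
RSM = {
    "HomePage": {
        "entry": "search",
        "edges": {
            "search": ["reserve"],
            "reserve": ["payment"],
            "payment": ["confirm"],
            "confirm": [],
        },
        "calls": {"payment": "Payment"},
    },
    "Payment": {
        "entry": "pay",
        "edges": {"pay": ["credit", "cash"], "credit": [], "cash": []},
        "calls": {},
    },
}

def executions(rsm, name, max_depth=3):
    """Yield flat traces of machine `name`, expanding call states up to max_depth."""
    machine = rsm[name]

    def walk(state):
        if state in machine["calls"]:
            if max_depth == 0:
                return  # give up on deeper expansions
            # Expansion: replace the call state by each trace of the callee.
            heads = list(executions(rsm, machine["calls"][state], max_depth - 1))
        else:
            heads = [[state]]
        successors = machine["edges"][state]
        for head in heads:
            if not successors:
                yield head
            for successor in successors:
                for tail in walk(successor):
                    yield head + tail

    yield from walk(machine["entry"])
```

Calling `list(executions(RSM, "HomePage"))` enumerates one trace per way of expanding the `payment` call state.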
[Figure: RSM example with the machines Home Page and Payment; node labels: Start, HomePage, search, reserve, payment, confirm, pay, credit, cash]
RSM (example)
Single Exit RSMs
• CTL*: linear data complexity [Benedikt et al. ’05]
• Mu-calculus: also [Alur et al. ’07]
Multiple Exit RSMs
• LTL: PTIME
• CTL, CTL*: EXPTIME
• Mu-calculus: EXPTIME
Evaluation (Temporal Logic)
[Figure: BP specification models (Finite State Machine, Recursive State Machine, Context Free Graph Grammars) against query languages (Temporal Logic, First Order Logic, Monadic Second Order Logic)]
Models & Query Languages
Extensions of string grammars to graphs
Labels over graph nodes (VR) or edges (HR)
Terminal and non-terminal labels
Derivation rules for non-terminal labels
No start and end nodes!
Connection (“gluing”) rules, by labels, for the derived sub-graph
Context Free Graph Grammars
Connection relation :
StartPayment connects only to Search
Confirm is connected to cash, but not to credit
[Figure: graph grammar example with the components Home Page and Payment; node labels: Start, HomePage, search, reserve, payment, confirm, pay, credit, cash]
Context Free Graph Grammars Example
Depends greatly on the allowed connection relation
A restricted model defines entries and exits for graphs and is equivalent to RSMs
Typically Monadic Second Order Queries
We’ll revisit this later.
Context Free Graph Grammars: Evaluation
AXML [Abiteboul, M et al. ’04-’08]
Extension of XML to include embedded Web-Services calls
Very useful for modeling web-sites
Formally – a restriction of context free graph grammars to trees
XML query languages (XPath,XQuery,..)
Practically efficient evaluation (for restricted cases)
Active XML
Temporal logics have limited expressive power
Basically – consider only executions
Good for behavioral properties
Can’t capture structural properties
Bisimulation-invariant
Example
3. Expressive Power (Queries)
Behavioral vs. Structural Analysis
Solution?
[Figure: BP specification models (Finite State Machine, Recursive State Machine, Context Free Graph Grammars) against query languages (Temporal Logic, First Order Logic, Monadic Second Order Logic)]
MSO evaluation has been studied extensively for Context Free Graph Grammars
Linear in grammar size
But unfortunately…
Non-elementary in the query (formula) size
2^2^…^2 (the height of the tower depends on the query size)
Infeasible for even the smallest queries
Infeasible!
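For reference, the tower of exponents can be written with the standard tower function below. The symbols tower, h and n are notation introduced for this note only; how the height depends on the query is left as stated on the slide.

```latex
\[
\mathrm{tower}(0, n) = n, \qquad
\mathrm{tower}(h+1, n) = 2^{\mathrm{tower}(h, n)}
\]
\[
\mathrm{tower}(1, 2) = 4, \quad
\mathrm{tower}(2, 2) = 16, \quad
\mathrm{tower}(3, 2) = 65536, \quad
\mathrm{tower}(4, 2) = 2^{65536}
\]
```

A bound is non-elementary when no fixed height h suffices, that is, when it is not bounded by tower(h, n) for any constant h. The worked values show why even a tower of height four is already out of reach.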
Very difficult to express properties of interest in FO/MSO
Long and error-prone formulas
Temporal Logic is more intuitive
Still, textual and complex, especially for large-scale analysis
Existing tools provide inadequate interfaces
4. (Un)intuitive Formulation
r[n,z](v) = (c[n](v) & r[n,x1](v)? z(v) | E(v_1) z(v_1) & TC (v_1, v) (v_3,
v_4) (n(v_3, v_4) & !x1(v_3)) : r[n,z](v) & ! (E(v_1) r[n,z](v_1) & x1(v_1) & r[n,x1](v) & !x1(v)))
FO+TC Formulas (TVLA syntax)
Query: Is there a point of code reachable from v in which n points at z?
PReqFullfilledDef1: assert (P_BUTTON_PRESSED -> (~P_REQ_FULFILLED U P_state));
PReqFullfilledDef2: assert (P_state -> (P_REQ_FULFILLED U P_BUTTON_PRESSED));
EnterTStateDef: assert ((~T_state & X(T_state)) -> X(ENTER_T_STATE));
EnterPStateDef: assert ((~P_state & X(P_state)) -> X(ENTER_P_STATE));
MoveToPPrevDef: assert ((~move_to_p & X(move_to_p)) -> MOVE_TO_P_WAS_SET_TRUE_IN_PREV_SEC);
MoveToItoTPrevDef: assert( (~move_to_i_to_t & X(move_to_i_to_t)) -> X(X(MOVE_TO_I_TO_T_WAS_SET_TRUE_IN_PREV_SEC)));
Temporal Logic (SMV syntax)
Traffic Light Requirement Specification
Model problems
• expressive power
• scale
Pretty much solved by the models we’ve seen
Query language problems
• expressive power
• unintuitive formulation
Not solved yet
BPQL to the rescue [VLDB’06]
Mid-section Summary
Possible Solution
[Figure: BP specification models (Finite State Machine, Recursive State Machine, Context Free Graph Grammars) against query languages (Temporal Logic, First Order Logic, Monadic Second Order Logic, BPQL)]
BP patterns (like tree patterns for XML)
Single/double-headed edges (XPath’s / and //)
• edges
• paths of arbitrary length
Single/double-bounded activities:
• simple zoom-in
• unbounded zoom-in
BPQL Queries
local
Q1: used credit card services?
Query: a BP pattern with some transitive nodes & edges
An embedding: a mapping
• from: query graphs
• to: [possible flows defined by the] BP graphs
A result: the image of the query graph under an embedding
Answer: all results
Queries and their semantics
• Sub-graph homomorphism vs. bisimulation
• Both are supported in BPQL
Structural vs. Behavioral Semantics
Systems and queries are essentially Context Free Graph Grammars (Recursive state machines)
We basically compute their intersection
Bad news: These are not closed in general under intersection
Good news: Our systems and queries are sufficiently simple:
PSIZE representation (as a BP) can be computed in PTIME (data complexity!)
Distributed query processing (based on AXML)
Query Evaluation Algorithm
So far we have mainly considered flow
Data is important as well
Data: variable values, message exchange,…
Especially interesting in context of Web
Some representative works
• Web-services analysis
• Pointer analysis
And what about data?
[Deutsch et al. ’06]
• Query language combines LTL and FO
• LTL for temporal relationships
• FO for data snapshots
• Efficient evaluation (restricted versions)
[Fu et al. ’04]
• Guarded automaton for flow query
• XPath “guards” relate to data
• Reduces the problem to “conventional” model checking
• Undecidability for the general case
• Polynomial data complexity for a bounded number of messages
Web-services analysis
[Lam et al. ’05]
• Pointer analysis for SQL injections, buffer overflow, …
• Analyzes all possible call stack contexts
• BDD-based optimization techniques capture similarities
• Efficient run-time, practical system
[Sagiv et al. ’06]
• Shape analysis
• Summarization techniques
• Efficient run-time, practical system
Pointer Analysis
SMV, NuSMV [Clarke et al. ’92]
bddbddb [Lam et al. ’04]
TVLA [Sagiv et al. ’99]
SPIN [Bell Labs ’91]
SLAM [Ball et al. ’00]
Moped [Schwoon ’02]
Mops [Chen et al. ’02]
BPQL [Beeri, D, M, et al. ’05]
[Figure labels: Finite State Machine, Context Free Processes]
Some Systems
Introduction to Business Processes
Querying
Monitoring
Summary & Research Directions
Outline
The aggregation, analysis, and presentation of real time information about activities inside organizations and involving customers and partners [Gartner]
Provide real-time information on executions
What is Monitoring?
Imagine you run an auction service…
• Guarantee fair play: notify on too many cancels
• Maintain SLA: monitor response time
• Promotions: prizes for every 10,000th transaction
• Illegal access: notify on buyers’ attempts to confirm bids without registering first
Monitoring is crucial for enforcing business policies and meeting efficiency & reliability goals
Why Monitoring?
<actionData>
  <header>
    <processName> auctionHouse </processName>
    <instanceId> 517 </instanceId>
    <sensorTarget> notify_winner </sensorTarget>
    <timestamp> 2006-05-31T11:32:46.510+00:00 </timestamp>
  </header>
  …
  <activityData>
    <activityType> invoke </activityType>
    <evalPoint> completion </evalPoint>
    …
BPEL XML events
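A monitoring layer first has to turn each such XML event into a record it can filter on. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the sample document keeps only the fields visible in the excerpt and closes the elided elements so that it parses, which is an assumption of this example. The function name `parse_event` and the record keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the excerpt; elided parts are simply omitted.
SAMPLE_EVENT = """
<actionData>
  <header>
    <processName> auctionHouse </processName>
    <instanceId> 517 </instanceId>
    <sensorTarget> notify_winner </sensorTarget>
    <timestamp> 2006-05-31T11:32:46.510+00:00 </timestamp>
  </header>
  <activityData>
    <activityType> invoke </activityType>
    <evalPoint> completion </evalPoint>
  </activityData>
</actionData>
"""

def parse_event(xml_text):
    """Extract the header and activity fields of one event into a flat dict."""
    root = ET.fromstring(xml_text.strip())

    def text(path):
        node = root.find(path)
        return node.text.strip() if node is not None and node.text else None

    return {
        "process": text("header/processName"),
        "instance": int(text("header/instanceId")),
        "target": text("header/sensorTarget"),
        "timestamp": text("header/timestamp"),  # kept as an ISO 8601 string
        "activity_type": text("activityData/activityType"),
        "eval_point": text("activityData/evalPoint"),
    }

event = parse_event(SAMPLE_EVENT)
```

Each event describes a single activity, so correlating events of the same process instance (here via `instance`) is left to the layers above.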
1. Absorb the stream of events coming from the BP execution engine
2. Process and filter events, select relevant event data, and automatically trigger actions
3. A dashboard that allows users to follow process progress, view custom reports, perform analysis, …
Monitoring Systems Layers
XML streams management
Complex Event Processing (CEP)
Commercial tools (BAM)
BP-Mon
Existing Approaches
Many works on XML streaming
• Query Optimization [Koch et al. ’04, Viglas et al. ’02, …]
• Security [Altinel, Franklin ’00, Benedikt et al. ’08, …]
• Updates & Concurrency [Grabs et al. ’02, Nicola et al. ’07, …]
Automata-based techniques are widely used
BPEL processes emit XML messages
Use XML streaming engines for monitoring?
XML streams management
Each XML element describes an individual event
Fairly complex XQuery queries (lots of joins)
Difficult to handle by existing streaming engines…
XML stream engines manage tree-shaped data (vs. DAGs)
XML stream engines expect to receive elements in document order (but we have here parallel flow).
Why not?
Processing of multiple events to identify semantically meaningful combinations.
Studied extensively for Active Databases [Widom ’96], [Payton ’99], [Wolski ’98],…
Recent works in context of BPs [Wu ’06], [Jobst ‘07]
Identify meaningful activities combinations that form a business logic
Still, somewhat low-level
Complex Event Processing
An enterprise solution, intended to provide a summary of business activities to operations managers and upper management.
Example products:
• WebSphere Business Monitor (IBM)
• Oracle Business Activity Monitoring
• SAP NetWeaver
• ProActivity PA and P-BAM (BEA)
• BusinessBridge (Systar)
• …
Business Activity Monitoring (BAM)
Event-driven decision making
• Analytics (CEP) run on events as they are generated
• Actions are taken immediately
Rule-based monitoring and reporting
• Set thresholds according to key performance indicators (KPIs) and other business-specific triggers
Real-time integration of event and context data
• Make decisions based on a combination of real-time, historical, and plan and forecast data (multi-dimensional queries)
Built for business users
• Customizable dashboards, reports and alerts
Main features
BAM tools provide solid solutions
But do not incorporate static analysis
BP-Mon is an integrated framework [VLDB’07]
• Uses a query language similar to BPQL
• Graphical & intuitive
• Semantics: evaluated over run-time executions (vs. potential executions in BP-QL)
BP-Mon
Unfair play (too many cancellations)
[Figure annotations: the query pattern is as before; the new constructs are Report / Report* and a sliding window, either time based or instance based]
Query Example (1)
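The time-based sliding window behind the "too many cancellations" report can be illustrated outside BP-Mon. This is a minimal sketch in plain Python and not BP-Mon's graphical query language; the class name, the event fields, the window length and the threshold are all assumptions of this example.

```python
from collections import defaultdict, deque

class CancelMonitor:
    """Report users who cancel more than `max_cancels` times within a time window."""

    def __init__(self, window_seconds, max_cancels):
        self.window = window_seconds
        self.max_cancels = max_cancels
        self.cancels = defaultdict(deque)  # user -> timestamps of recent cancels

    def observe(self, user, activity, timestamp):
        """Feed one event; return True when the user should be reported."""
        if activity != "cancel":
            return False
        recent = self.cancels[user]
        recent.append(timestamp)
        # Drop cancels that have slid out of the window.
        while recent and recent[0] <= timestamp - self.window:
            recent.popleft()
        return len(recent) > self.max_cancels

monitor = CancelMonitor(window_seconds=60, max_cancels=2)
alerts = [
    monitor.observe("alice", "cancel", 0),
    monitor.observe("alice", "cancel", 10),
    monitor.observe("alice", "cancel", 20),   # third cancel within 60 seconds
    monitor.observe("alice", "cancel", 100),  # earlier cancels have expired
]
```

An instance-based window would keep the last N events per user instead of the events of the last N seconds; only the eviction rule in `observe` changes.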
Illegal bidding (mix of static and run-time analysis)
Query Example (2)
Intuitive monitoring (looks like BPEL)
Easy deployment (implemented as BPEL)
Greedy embedding
Automata-based evaluation algorithm
Type-based optimization
Some Nice BP-Mon Features
• Incrementally extends a greedy matching to one of a larger prefix
• Automaton with pattern nodes as states
  – Tries to concurrently match all the concrete patterns of a given pattern
  – Attempts to match events as early as possible
  – On failure: backtracks & retries
Complexity: polynomial in the size of the execution log (with the exponent determined by the size of the pattern)
Evaluation algorithm
The automaton is compiled into BPEL
May be used on any application server
Guarantees portability
Easy Deployment
Exploit knowledge of specification
Infer irrelevancy & inconsistency using BP-QL
Type-Based Optimization
Type Inference and Type Checking for Queries on Execution Traces
VLDB’08
Tuesday 10:45, Theory Session
Study of type systems for real-life logs
Type systems serve as a basis for optimizations
Shameless Advertisement
Introduction to Business Processes
Querying
Monitoring
Summary & Research Directions
Outline
Important aspects of Business Processes
• Design
• Analysis
• Monitoring
Plenty of work on each subject, in many different fields
• Programming languages, model checking, DB technology
• XML streaming, CEP/BAM
Still missing a real synergy
• All together…
BPQL: integrated, high-level, intuitive framework for all.
Summary & Research Directions
Topics for research:
• Missing information (partial specifications, logs, …)
• Probabilistic Processes
• Data values
• Interactions
• Log mining
• Enhanced query language features
• Optimizations
• Further applications
• and more…
Summary & Research Directions
Monitor models can be transformed into executable code for WebSphere.
Steps for creating a monitor model:
1. Generate CEI events for BPEL elements.
2. Generate Monitor events.
3. Generate the Monitor model.
4. Create the respective business measures (metric, KPI, dimensions, alerts).
5. Deploy into WebSphere.
Example: WebSphere
Step 1: Choose activity events and variables
Step 2: Generate monitoring events
Create monitor model
Select events to be monitored
Step 3: Generate the monitor model
Step 4: Create the respective business measures
In real-life BPs, executions depend on various external events
• User choices, variable values, server states, …
The set of all traces conforming to a query may be large (possibly infinite)
• Some results are more interesting than others
Also, some information on the execution may be missing or unknown
Probabilistic BPs
What is the typical/likely behavior (flow) for users that do not finalize their reservation?
Which hotels are preferred by British Airways fliers?
TOP-K answers reflect common usage patterns
Example Queries
Probabilistic Relational DBs
Probabilistic XML
(Probabilistic) Recursive State Machines with temporal logic as query language
BP and Web applications mining
Lots of related work
We define likelihood of traces
Then find the top-k most likely among those conforming to a user query
Compact representation of output, using a type
TOP-K likely traces
We distinguish three classes of distributions, according to their level of dependency
Memory-less (Markovian): no dependencies between formulas.
Bounded-memory: dependency on (at most) the last B values of each formula.
General
Distribution Classes
For memory-less distribution, we find the TOP-K matches in PTIME (data complexity)
For bounded-memory distributions, NP-completeness in the data size, but we give powerful heuristics
In all settings, NP-completeness in the query size
For general distributions, we show undecidability
Results
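The idea of ranking traces by likelihood can be illustrated for the memory-less case. This is a simplified sketch and not the algorithm behind the results above: it works on a flat, hypothetical Markov chain rather than a recursive BP, it filters traces with an arbitrary Python predicate instead of a query pattern, and it assumes every run reaches a terminal state. Because extending a trace never increases its probability, best-first search emits complete traces in non-increasing order of likelihood.

```python
import heapq

# Hypothetical choice probabilities; the outgoing probabilities of each state sum to 1.
CHAIN = {
    "Search": [("Reserve", 0.6), ("Exit", 0.4)],
    "Reserve": [("Credit", 0.7), ("Cash", 0.2), ("Cancel", 0.1)],
    "Credit": [("Confirm", 1.0)],
    "Cash": [("Confirm", 1.0)],
    "Confirm": [],
    "Cancel": [],
    "Exit": [],
}

def top_k_traces(chain, start, k, accept=lambda trace: True):
    """Return up to k most likely complete traces satisfying `accept`."""
    heap = [(-1.0, [start])]  # (negated probability, partial trace)
    results = []
    while heap and len(results) < k:
        neg_probability, trace = heapq.heappop(heap)
        successors = chain[trace[-1]]
        if not successors:  # terminal state: the trace is complete
            if accept(trace):
                results.append((trace, -neg_probability))
            continue
        for state, probability in successors:
            heapq.heappush(heap, (neg_probability * probability, trace + [state]))
    return results

most_likely = top_k_traces(CHAIN, "Search", 2)

# "Typical behavior of users that do not finalize their reservation"
unfinished = top_k_traces(
    CHAIN, "Search", 1,
    accept=lambda trace: "Reserve" in trace and trace[-1] != "Confirm",
)
```

With dependencies on earlier choices (the bounded-memory and general classes) the probability of a step is no longer a fixed label on an edge, which is where the harder complexity results come in.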
• [Biere et al. ’99] A. Biere, A. Cimatti, E. Clarke, M. Fujita, Y. Zhu. Symbolic Model Checking using SAT procedures instead of BDDs. Design Automation Conf. (DAC) 1999.
• [Biere et al. ’03] A. Biere, A. Cimatti, E. Clarke, O. Strichman, Y. Zhu. Bounded Model Checking. In Advances in Computers, vol. 58, Academic Press 2003.
• [Clarke et al. ’04] Edmund M. Clarke, Daniel Kroening, Joel Ouaknine, Ofer Strichman. Completeness and complexity of bounded model checking. VMCAI 2004.
• [Reps ’98] T. Reps. Program analysis via graph reachability. Information and Software Technology 40(11-12), 1998.
• [Harel ’87] D. Harel. Statecharts: A Visual Formalism for Complex Systems. Sci. Comput. Program. (SCP) 8(3): 231-274 (1987)
• [Lam et al. ’05] Monica S. Lam, John Whaley, V. Benjamin Livshits, Michael C. Martin, Dzintars Avots, Michael Carbin, Christopher Unkel. Context-sensitive program analysis as database queries. PODS 2005
• [Benedikt et al. ’05] Rajeev Alur, Michael Benedikt, Kousha Etessami, Patrice Godefroid, Thomas W. Reps, Mihalis Yannakakis. Analysis of recursive state machines. ACM Trans. Program. Lang. Syst. 27(4): 786-818 (2005)
References
• [Alur et al. ’05] Rajeev Alur, Swarat Chaudhuri, Kousha Etessami, P. Madhusudan. On-the-fly Reachability and Cycle Detection for Recursive State Machines
• [Abiteboul, M et al. ’04-’08] Serge Abiteboul, Omar Benjelloun, Tova Milo: Positive Active XML. PODS 2004
• [Abiteboul, M et al. ’04-’08] Serge Abiteboul, Omar Benjelloun, Bogdan Cautis, Ioana Manolescu, Tova Milo, Nicoleta Preda: Lazy Query Evaluation for Active XML. SIGMOD 2004
• [Abiteboul, M et al. ’04-’08] Serge Abiteboul, Omar Benjelloun, Tova Milo: The Active XML project: an overview
• SMV, NuSMV [Clarke et al. ’92] Jerry R. Burch, Edmund M. Clarke, Kenneth L. McMillan, David L. Dill, L. J. Hwang: Symbolic Model Checking: 10^20 States and Beyond. Inf. Comput. 98(2): 142-170 (1992)
• bddbddb [Lam et al. ’05] Monica S. Lam, John Whaley, V. Benjamin Livshits, Michael C. Martin, Dzintars Avots, Michael Carbin, Christopher Unkel. Context-sensitive program analysis as database queries. PODS 2005
• TVLA [Sagiv et al. ’99] Shmuel Sagiv, Thomas W. Reps, Reinhard Wilhelm: Parametric Shape Analysis via 3-Valued Logic. POPL 1999
• SPIN [Bell Labs ’91] The Spin Model Checker: Primer and Reference Manual. Addison-Wesley, ’98.
• SLAM [Ball et al. ’00] Thomas Ball, Sriram K. Rajamani: Bebop: A Symbolic Model Checker for Boolean Programs. SPIN 2000
• Mops [Chen et al. ’02] Hao Chen, David Wagner: MOPS: an Infrastructure for Examining Security Properties of Software. CCS ’02
• BPQL [Beeri, M, D et al. ’05]
• [Koch et al. ’04] Christoph Koch, Stefanie Scherzinger, Nicole Schweikardt, Bernhard Stegmaier: FluXQuery: An Optimizing XQuery Processor for Streaming XML Data. VLDB 2004
• [Viglas et al. ’02] Stratis Viglas, Jeffrey F. Naughton: Rate-based query optimization for streaming information sources. SIGMOD ’02
• [Altinel, Franklin ’00] Mehmet Altinel, Michael J. Franklin: Efficient Filtering of XML Documents for Selective Dissemination of Information. VLDB ’00
• [Benedikt et al. ’08] Michael Benedikt, Alan Jeffrey, Ruy Ley-Wild: Stream firewalling of XML constraints. SIGMOD ’08
• [Grabs et al. ’02] Torsten Grabs, Klemens Böhm, Hans-Jörg Schek: XMLTM: efficient transaction management for XML documents. CIKM 2002
• [Widom ’96] Jennifer Widom, Stefano Ceri: Active Database Systems: Triggers and Rules For Advanced Database Processing. Morgan Kaufmann 1996
• [Wolski ’98] Antoni Wolski, Tarik Bouaziz: Fuzzy Triggers: Incorporating Imprecise Reasoning into Active Databases. ICDE ’98
• [Wu ’06] Eugene Wu, Yanlei Diao, Shariq Rizvi: High-Performance Complex Event Processing Over Streams. SIGMOD ’06
• [Jobst ’07] Daniel Jobst, Gerald Preissler: Mapping clouds of SOA- and business-related events for an enterprise cockpit in a Java-based environment. PPPJ 2007