data-driven web services: specification and verification victor vianu uc san diego
Post on 21-Dec-2015
Data-driven web services: specification and verification
Victor Vianu
UC San Diego
Web service: service hosted on the Web
• Interactive, often data-driven: accesses an underlying database and interacts with
users/programs according to explicit or implicit workflow
• Complex services: Web service compositions, i.e. peers communicating asynchronously
• Complexity of workflow leads to bugs: see the public database of Web site bugs (Orbitz bug)
• Static analysis required
-- behavior of individual peers
-- protocols of communication between peers
-- global properties
Focus of this course
Automatic verification: understand when verification is possible, describe techniques and algorithms
Our target: data-aware Web services
[Figure: page-flow diagram of a computer shopping Web site backed by a database. Pages: Home page (HP) with name and password fields and login/cancel buttons; Customer page (CP) with desktop, laptop, and my-order choices; Desktop Search and Laptop Search pages with RAM, HDD/CPU, and display fields; Product index page (PIP) listing matching products; Product detail page (PP); Confirmation page (CoP) with order detail; Past Order (POP); Order status (OSP); Cancel confirmation page (CCP); Error message page (MP). Each user input triggers a database query, an output, a state update, and a transition to a new page.]
Compositions of Web services
[Figure: composition of Web services. The shopping site (Buyer Login page (BP), Category choice page (CP), search pages, Product index page (PIP), Product detail page (PP), Confirmation page (CoP), Past Order (POP), Order status (OSP), Cancel confirmation page (CCP)) interacts with a User payment page (UPP, fields: CC No, Expire date) and a Credit Verification service.]
Examples of Desirable Properties
• Semantic properties
-- “no product is delivered before payment in the right amount is received”
-- “no user can cancel an order that has already been shipped”
• Basic soundness of specification
-- “conditions guarding transition to next Web page are mutually exclusive”
• Navigational properties
-- “the shopping cart page is reachable from any page”
Expressed using temporal logics
$\forall x\; G\,[\, X\, \text{reject-order}(x) \rightarrow (\text{past-order}(x) \wedge \neg \exists y\, (\text{pay}(x,y) \wedge \text{price}(x,y))) \,]$
“An order is rejected in the next step only if it has already been ordered but not paid correctly in the current input”
(G: always; X: at next step)
Outline
• Review of model checking
• Finite-state abstractions
• Data-aware abstractions
• The XML angle
Disclaimer: a lot not covered here!
• Process algebras, pi-calculus
• Situation calculus
• Petri nets
• Transaction logic
• Use of ontologies, description logics
Finite-state abstractions of Web services*
• Black box: input/output signature
[Figure: the Supplier peer as a black box with messages order, bill, payment, delivery]
* courtesy of Rick Hull
Finite-state abstractions of Web services
• White box: internal logic
Simplest: finite state Mealy machines
[Figure: two Mealy machines for the Supplier over the messages order, bill, payment, delivery. The first performs ?o !b ?p !d in sequence; the second has several branches with transitions labeled ?o, !b, ?p, !d.]
• Interacting Web services: compositions
-- P : finite set of Web services (peers): store, bank, supplier1, supplier2
-- C : finite set of peer-to-peer channels
-- M : (finite) set of message classes: authorize, ok, order1, receipt1, bill1, payment1, order2, receipt2, bill2, payment2
Combining Peer and Composition Models
• Peer fsa’s begin in their start states
[Figure: peers store, bank, supplier1, supplier2, each a Mealy machine whose transitions either send (!m) or receive (?m) a message; transition labels include !o1, !a, ?k, !o2, ?b1, ?a, !k, ?o1, !b1, ?o2, ?r2, !b2]
Executing a Mealy Composition (cont.)
• STORE produces letter a and sends to BANK
• BANK consumes letter a
• Important parameters:
-- bounded or unbounded queues
-- open or closed system
• Execution successful if all queues are empty and fsa’s in final state
[Figure: the same composition with messages o1, o2, b1, b2, r2 waiting in the peers’ queues]
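The execution steps above (a peer sends a letter, the recipient later consumes it, and success requires empty queues and final states) can be sketched as a small simulation. This is only an illustrative sketch, not the model's formal definition: the two automata, the state names, the `run_composition` function, and the choice of one FIFO queue per receiving peer are assumptions made for the example.

```python
from collections import deque

# Hypothetical two-peer composition: STORE sends "a" (authorize) to BANK,
# BANK answers "k" (ok). delta maps a state to (action, message, next_state)
# triples, where "!" sends a message and "?" receives one.
PEERS = {
    "store": {"start": "s0", "final": {"s2"},
              "delta": {"s0": [("!", "a", "s1")], "s1": [("?", "k", "s2")]}},
    "bank": {"start": "b0", "final": {"b2"},
             "delta": {"b0": [("?", "a", "b1")], "b1": [("!", "k", "b2")]}},
}
# Each message class travels on one channel: message -> (sender, receiver).
CHANNELS = {"a": ("store", "bank"), "k": ("bank", "store")}


def run_composition(peers, channels, max_steps=100):
    """Fire one enabled transition at a time; return (conversation, success)."""
    state = {name: spec["start"] for name, spec in peers.items()}
    queue = {name: deque() for name in peers}  # assumed: one FIFO queue per recipient
    conversation = []                          # sequence of sent messages
    for _ in range(max_steps):
        progressed = False
        for name, spec in peers.items():
            for action, msg, nxt in spec["delta"].get(state[name], []):
                if action == "!":
                    queue[channels[msg][1]].append(msg)
                    conversation.append(msg)
                elif queue[name] and queue[name][0] == msg:
                    queue[name].popleft()
                else:
                    continue  # receive not enabled: head of queue does not match
                state[name] = nxt
                progressed = True
                break
            if progressed:
                break
        if not progressed:
            break
    success = (all(not q for q in queue.values())
               and all(state[n] in peers[n]["final"] for n in peers))
    return conversation, success
```

Calling `run_composition(PEERS, CHANNELS)` on this toy composition yields the conversation `a k` and a successful execution. The bounded `max_steps` loop sidesteps the unbounded-queue case, where verification is in general undecidable.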
Verifying Temporal Properties of Mealy Compositions
• Label states with propositions
• Express temporal formulas in LTL, e.g.,
-- “shipment just made” only after “line-of-credit avail”
[Figure: composition of store, bank, warehouse1, warehouse2, with automaton states labeled “line-of-credit available” and “shipment just made”]
Results on Temporal Verification
• Long history, see [Clarke et al. ’00]
Example: one fsa and propositional LTL
-- PSPACE in size of formula + fsa
-- linear time in size of fsa
• Mealy compositions
-- Bounded queues: composition can be simulated as Mealy machine; verification is decidable; standard techniques to reduce cost
-- Unbounded queues: in general, undecidable [Brand & Zafiropulo 83]
Alternative: temporal property of sequence of exchanged messages
[Figure: store, bank, warehouse1, warehouse2 exchanging the messages authorize, ok, order1, receipt1, bill1, payment1, order2, receipt2, bill2, payment2. The “conversation” is the sequence of exchanged messages, e.g. a k o1 b1 o2 p1 r1 r2 b2 p2.]
LTL properties: Every authorize followed by some bill?
Examples:
$G\,(([?o] \wedge \neg [!b]) \rightarrow X\,[!b])$
“if an order has been received but a bill not yet sent, then in the next state a bill has been sent”
$G\,([?o] \rightarrow F\,[!b])$
“if an order has been received then eventually a bill will be sent”
• Conversation: sequence of exchanged messages
• Conversation language: set of all conversations between Mealy peers
• Bounded queues: regular language
Conversation languages
• Unbounded queues
[Figure: peers p1 and p2 exchanging messages a and b, with transitions !a, ?b, ?a, !b]
-- Conversation language L is not regular: $L \cap a^*b^* = \{ a^n b^n \mid n \ge 0 \}$
-- Conversation languages are context sensitive; in fact, they are the quasi-realtime languages [Bultan+Fu+Hull+Su 03]
Quasi-realtime languages
• Accepted by non-deterministic multi-tape TM in linear time [Book+Greibach]
• Also, the smallest AFL containing the context-free languages and closed under intersection
AFL: families of languages containing the finite languages and closed under union, concatenation, +, non-erasing homomorphism, inverse homomorphism, intersection with regular languages
Synthesis of compositions
• Given a set of peers and communication channels with message names, and a constraint Φ on the conversation language, find Mealy automata for peers so that the constraint is satisfied.
Bounded queues
• Closed systems: PSPACE for LTL; PTIME for ω-regular sets given by an automaton
• Open systems: “game” against environment
-- synthesis = finding winning strategy
-- undecidable for arbitrary topology
-- hierarchical topology: decidable [Kupferman + Vardi 01], but non-elementary even in linear case [Pnueli+Rosner]
Unbounded queues
• Open systems: undecidable
• Closed systems: open
Practical alternative: synthesizing hierarchical composition from “library” of services
[Figure: a Customized Travel Service assembled from Travel Service Templates: Air Travel Templates, Airport Transfer, Hotel Reservation]
More work on compositions
• The Roman model: sequence of papers by Berardi, Calvanese, De Giacomo, Hull, Lenzerini, Mecella; also Bultan, Dang, Fu, Hull, Ibarra, Su; see references.
• Use of description logics and logic programming
Beyond fsa: workflow+data
[Figure: the shopping site page-flow diagram again (pages HP, CP, SP, PIP, PP, CoP, POP, OSP, CCP, MP over a database), abstracted as a control component connected to db, input, output, and state]
• Abstract model: relational transducer
Control: (input, state, db) → (output, state)
History:
• Relational transducers [Abiteboul+Vianu+Fordham+Yesha 98]
• ASM relational transducer [Spielmann 00]
• Extended ASM transducer [Deutsch+Liying+Vianu 04]
• Communicating transducers [Deutsch+Liying+Vianu+Zhou 06]
Target: “business models” in e-commerce
High-level descriptions of protocols of interaction
Transducer SHORT
% schema: relational signatures
database: price: 2
input: order: 1, pay: 2
state: past-order: 1, past-pay: 2
output: send-bill: 2, deliver: 1
% state rules
past-order(X) +:- order(X)
past-pay(X,Y) +:- pay(X,Y)
% output rules
send-bill(X,Y) :- order(X), price(X,Y), NOT past-pay(X,Y)
deliver(X) :- past-order(X), price(X,Y), pay(X,Y), NOT past-pay(X,Y)
Run of a transducer
input/output sequence: input1, output1, input2, output2, input3, output3, ...
Database (Price):
Newsweek   45
Time       55
Le Monde   350
Input Sequence: order(Time), order(Le Monde), pay(Newsweek,45), order(Newsweek), pay(Time,55), order(People), pay(Newsweek,48)
Output Sequence: send-bill(Time,55), deliver(Time), deliver(Newsweek), send-bill(Newsweek,45), send-bill(Le Monde,350)
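A run of the SHORT transducer can be traced with a few lines of Python. This is a sketch under stated assumptions: the last atom of each output rule is read as a negated `past-pay` literal (consistent with the "safe negation" allowed in output rules), outputs at a step are computed from the state accumulated before that step, and the three-step input sequence used in the usage note is hypothetical rather than the one on the slide.

```python
# Database relation price(product, amount), as in the slide's Price table.
PRICE = {("Newsweek", 45), ("Time", 55), ("Le Monde", 350)}


def step(state, inputs):
    """One step: outputs use the old state and current input, then state accumulates."""
    orders = inputs.get("order", set())   # set of product names
    pays = inputs.get("pay", set())       # set of (product, amount) pairs
    past_order, past_pay = state["past-order"], state["past-pay"]
    # send-bill(X,Y) :- order(X), price(X,Y), NOT past-pay(X,Y)
    send_bill = {(x, y) for x in orders for (p, y) in PRICE
                 if p == x and (x, y) not in past_pay}
    # deliver(X) :- past-order(X), price(X,Y), pay(X,Y), NOT past-pay(X,Y)
    deliver = {x for x in past_order for (p, y) in PRICE
               if p == x and (x, y) in pays and (x, y) not in past_pay}
    # Cumulative state rules: past-order +:- order, past-pay +:- pay
    new_state = {"past-order": past_order | orders, "past-pay": past_pay | pays}
    return new_state, {"send-bill": send_bill, "deliver": deliver}


def run(input_sequence):
    """Feed a list of input instances to the transducer; return the output list."""
    state = {"past-order": set(), "past-pay": set()}
    outputs = []
    for inputs in input_sequence:
        state, out = step(state, inputs)
        outputs.append(out)
    return outputs
```

For the hypothetical sequence `[{"order": {"Time"}}, {"pay": {("Time", 55)}}, {"order": {"Newsweek"}, "pay": {("Newsweek", 48)}}]`, the run bills Time at 55, then delivers Time, then bills Newsweek at 45 without delivering it, since 48 is not the listed price.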
What to verify:
• Goal reachability: is it possible to reach a goal (output)?
Example: is it possible to receive a tax refund?
• Temporal properties of all runs
Example: no product can be delivered unless it has been paid
• Comparing transducers
-- Equivalence: same runs
-- Containment: every run of T1 is a run of T2
• Interacting transducers: overall consistency
Example: T1 requires payment before delivery, T2 requires delivery before payment: deadlock situations!
• Generally, undecidable
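Tractable restriction: Spocus Transducer
Semi-Positive Output, CUmulative State
• State rules: $\text{past-}R(\bar{x}) \leftarrow^{+} R(\bar{x})$, for $R \in \text{input}$
• Output rules: $A_0 \leftarrow A_1 \wedge \dots \wedge A_i \wedge \dots \wedge A_n$
-- $A_0$: output relation $R(\bar{x})$
-- $A_i$: input, database, or state literals, safe negation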
Transducer SHORT
% schema
database: price
input: order, pay
state: past-order, past-pay
output: send-bill, deliver
% state rules
past-order(X) +:- order(X)
past-pay(X,Y) +:- pay(X,Y)
% output rules
send-bill(X,Y) :- order(X), price(X,Y), NOT past-pay(X,Y)
deliver(X) :- past-order(X), price(X,Y), pay(X,Y), NOT past-pay(X,Y)
What can be verified
• Goal reachability
goal: $\exists \bar{x}\, (A_1 \wedge \dots \wedge A_n)$
$A_i$: $(\neg)\, R(\bar{y})$ where R is an output relation, safe negation
Example: is it possible to reach deliver(x)?
• Temporal properties of runs
class T of sentences: $\forall \bar{x}\, \varphi(\bar{x})$ where φ is a Boolean combination of output, state, and database literals
Example: $\forall x\, \forall y\, [(\text{deliver}(x) \wedge \text{price}(x,y)) \rightarrow \text{past-pay}(x,y)]$
• Complexity: NEXPTIME
Proof: Reduction to finite satisfiability of FO sentences in the Bernays-Schönfinkel prefix class $\exists^*\forall^*$FO
Comparing Spocus transducers
• Motivation: customization, negotiation, etc.
• Need to distinguish significant events from “syntactic sugaring”
• Notion of log: semantically significant input and output relations
Transducer SHORT
% schema
database: price
input: order, pay
state: past-order, past-pay
output: send-bill, deliver
% state rules
past-order(X) +:- order(X)
past-pay(X,Y) +:- pay(X,Y)
% output rules
send-bill(X,Y) :- order(X), price(X,Y), NOT past-pay(X,Y)
deliver(X) :- past-order(X), price(X,Y), pay(X,Y), NOT past-pay(X,Y)
Transducer Friendly
% schema
database: price
input: order, pay, pending-bills
state: past-order, past-pay, past-pending-bills
output: send-bill, deliver, reject-pay, already-paid, re-bill
% some additional output rules
re-bill(X,Y) :- pending-bills(X), past-order(X), price(X,Y), NOT past-pay(X,Y)
already-paid(X) :- pay(X,Y), past-pay(X,Y)
reject-pay(X) :- pay(X,Y), NOT past-order(X)
log: order, pay, send-bill, deliver
• Log of a run: restriction to logged events
• Valid log of a transducer: log of a run
• Transducer containment relative to a log: every valid log of T1 is also a valid log of T2
Containment of Spocus transducers: undecidable.
Proof: reduction of implication of FDs and IDs
• Useful special case: customization: “T2 is an extension of T1”
-- $\text{input}(T_1) \subseteq \text{input}(T_2)$
-- $\text{input}(T_1) \subseteq \text{log}(T_1) = \text{log}(T_2)$
• Faithful extension T2 of T1: T2 contained in T1
Example: Short vs. Friendly
• Decidable for Spocus transducers in NEXPTIME
Proof: Bernays-Schönfinkel
Controlling Input Sequences
• Example: order(x) must be input before pay(x)
• Use a distinguished error output:
error :- pay(x), NOT past-order(x)
Valid runs: error free
Power of error-free runs
• Propositional output transducer:
-- propositional outputs
-- one proposition output at each step
• $\text{Gen}_{\text{error-free}}(T)$: words output by T on finite runs
Theorem: A language L equals $\text{Gen}_{\text{error-free}}(T)$ for some propositional-output Spocus transducer T iff L is a prefix-closed r.e. language
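Temporal properties of error-free runs
• A useful class can be enforced
class $T_{\text{error-free}}$ of sentences: $\forall \bar{x}\, [\varphi(\text{state},\text{db},\text{in})(\bar{x}) \rightarrow \psi(\text{state},\text{db},\text{in})(\bar{x})]$
-- $\varphi(\text{state},\text{db},\text{in})(\bar{x})$: conjunction of literals with safe negation
-- $\psi(\text{state},\text{db},\text{in})(\bar{x})$: positive formula
Examples
$\forall x\, \forall y\, [(\text{past-order}(x) \wedge \text{price}(x,y) \wedge \neg\text{past-pay}(x,y)) \rightarrow (\text{pay}(x,y) \vee \text{cancel}(x))]$
$\forall x\, \forall y\, [\text{pay}(x,y) \rightarrow (\text{price}(x,y) \wedge \text{past-order}(x))]$
$\forall x\, [\text{cancel}(x) \rightarrow \text{past-order}(x)]$
Theorem: for every formula φ in $T_{\text{error-free}}$ there exists a Spocus transducer T whose error-free runs are precisely those satisfying φ.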
Verification:
• Undecidable if all error-free runs of a Spocus transducer T satisfy given φ in Terror-free
• Decidable for transducers T without negative state literals in error-generating rules
Similar results for containment:
• Undecidable if every error-free run of T1 is also an error-free run of T2
• Decidable if T1 and T2 have same schema
and full log, and no negative state literals
in error-generating rules
Conclusion on Spocus transducers
• Reasonable for simple business models
• Too restricted for more intricate Web service workflows
• Limited class of temporal properties verified
• No compositions
Target: more sophisticated Web services
• Specified using high-level tools
Example: WebML
• Single peers and compositions
[Figure: page-flow diagram of the computer shopping site (Home Page (HP), Customer Page (CP), Laptop Search (LSP), Desktop Search (DSP), Product Index (PIP), Product Detail (PDP), Confirmation (CoP), Message Page (MP)) over a DB. Each input triggers a state update, an output, and a transition to a new page. The site is specified by a high-level WebML-style specification and produced by tools.]
Model for data-driven Web services (WebML-style)
A Web service W is a set of Web page schemas as well as
-- global database D (fixed during a session)
-- global state relations S (updatable)
-- input relations I and output relations O
-- home page W0
-- error page We
Webpage Schema
• Input options (user may pick at most one) Query(DB, State, Prev-input)
• Output and Transitions (triggered by user input) Output: Query(DB, State, Input, Prev-input) Next Webpage: Query(DB, State, Input, Prev-input)
• State updates (triggered by user input) Insertions/deletions: Query(DB, State, Input, Prev-Input)
“Query”: First Order Logic formula (core SQL)
Details on Modeling the Input
• input options: as provided by menus, pull-down lists, buttons, HTTP links; user must choose one
• input constants: model text input boxes (name, password, etc.)
• input I at previous step: prev-I; can be viewed as special state
Home page (HP)
Input: name, password, clickbutton(x)
Input options: clickbutton(x) ← (x = "login" or x = "cancel")
State update: error("bad user") ← not users(name,password) and clickbutton("login")
Page transition rules:
CP ← users(name,password) and clickbutton("login")
MP ← not users(name,password) and clickbutton("login")
(name, password: input constants; users: DB table; error: state table; CP, MP: next page)
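The home page rules can be read operationally as follows. This is an illustrative sketch only: the `USERS` contents and the function name `home_page_step` are hypothetical, the state is modeled as a plain set of tuples, and staying on HP after "cancel" is an assumption borrowed from the later variant of this page that includes the rule HP ← clickbutton("cancel").

```python
USERS = {("alice", "secret")}  # hypothetical contents of the users DB table


def home_page_step(name, password, button, state):
    """Apply the HP rules to one user input; return (next_page, new_state)."""
    if button not in ("login", "cancel"):       # input options for clickbutton(x)
        raise ValueError("clickbutton must be 'login' or 'cancel'")
    known = (name, password) in USERS
    new_state = set(state)
    if button == "login" and not known:         # state update rule
        new_state.add(("error", "bad user"))
    if button == "login" and known:             # CP <- users(...) and login
        return "CP", new_state
    if button == "login" and not known:         # MP <- not users(...) and login
        return "MP", new_state
    return "HP", new_state                      # assumed behavior on "cancel"
```

Because the two login conditions are mutually exclusive, at most one transition rule fires for any input, which is the kind of basic soundness property mentioned earlier.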
Page: Past Order (POP)
Input: Clickbutton(x), Opick(oid, pid, price, status)
Input Rules:
Opick(pid,price,status) ← order-db(name, pid, price, status);
Clickbutton(x) ← x="view cart" or x="logout" or x="continue shopping" or x="back"
State Rules:
Userorderpick(name,pid,price,status) ← Opick(pid,price,status)
Target Webpages: OSP, CP, HP, CC
Target Rules:
OSP ← Opick(pid,price,status);
HP ← Clickbutton("logout");
...........
Product Info Page (PIP)
Input: pick(pid,price)
Input options:
$\text{pick}(pid, price) \leftarrow \exists ram\, \exists cpu\; \text{prev-search}(ram, cpu) \wedge \text{catalog}(pid, ram, cpu, price)$
(prev-search: previous input; catalog: db table)
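The input-options rule for pick is an ordinary conjunctive query, so it can be evaluated directly over the previous input and the catalog table. The sketch below assumes relations are represented as Python sets of tuples; the function name `pick_options` and the sample rows are hypothetical.

```python
def pick_options(prev_search, catalog):
    """pick(pid, price) holds iff some (ram, cpu) in prev-search matches a catalog row."""
    return {(pid, price)
            for (pid, ram, cpu, price) in catalog
            if (ram, cpu) in prev_search}


# Hypothetical data: the user previously searched for 8GB / i5 machines.
CATALOG = {("d1", "8GB", "i5", 700), ("d2", "16GB", "i7", 1200)}
PREV_SEARCH = {("8GB", "i5")}
OPTIONS = pick_options(PREV_SEARCH, CATALOG)
```

Here `OPTIONS` contains only the product matching the previous search, which is the menu the user is allowed to pick from on this page.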
Web service run
• Fixed database
• Sequence of inputs, states and outputs
• Input constants are given values throughout the run
Inconsistencies → error page
[Figure: the shopping site page-flow diagram, shown as a high-level WebML-style specification alongside Web application code]
Approach to verification
• Define an extended relational transducer
• Show that every Web service spec and temporal property to be verified can be translated efficiently into an extended relational transducer spec and a corresponding property
Main result: restrictions leading to PSPACE decidability of LTL properties
Several versions:
• Relational transducers [Abiteboul+Vianu+Fordham+Yesha 98]
• ASM relational transducer [Spielmann 00]
• Extended ASM transducer [Deutsch+Liying+Vianu 04]
• Communicating transducers [Deutsch+Liying+Vianu+Zhou 06]
Extended relational transducer: reactive system with FO control
Control: (input, state, db) → (output, state)
[Figure: a single peer with db, state, and FO control. FO queries over db and state produce the input options, the user makes a choice, and further FO queries produce the output.]
Technical point: queries can also refer to k previous inputs
Configurations and runs
Configuration: db, state, input, output
Run: infinite sequence of consecutive configurations
Language for properties of runs: LTL-FO
• FO components: start with FO formulas referring to the states, db, inputs, top and last message of queues in current configuration
• Apply Boolean and LTL operators: X, U, F, G, B
• All remaining free variables are universally quantified: $\forall \bar{x}\, \varphi(\bar{x})$
LTL-FO = FO + LTL operators + Boolean operators
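Example of LTL-FO formula
$\forall x\; G\,[\, X\, \text{reject-order}(x) \rightarrow (\text{past-order}(x) \wedge \neg \exists y\, (\text{pay}(x,y) \wedge \text{price}(x,y))) \,]$
FO components
p: $\text{reject-order}(x)$
q: $\text{past-order}(x) \wedge \neg \exists y\, (\text{pay}(x,y) \wedge \text{price}(x,y))$
LTL formula over p, q: $G\,[\, X\, p \rightarrow q \,]$
“An order is rejected in the next step only if it has already been ordered but not paid correctly in the current input”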
Example Property
“any shipped product must be previously paid for”
$\forall pid, uname, price\; [\, \xi(pid, uname, price)\; B\; \text{Ship}(uname, pid) \,]$
where $\xi(pid, uname, price)$ is the formula
$\text{pay}(price) \wedge \text{picked}(uname, pid, price) \wedge \text{prod-price}(pid, price)$
(pay: input; picked: state; prod-price: database; Ship: output)
The Verification Problem
Given transducer M and LTL-FO property φ, decide if every run of M satisfies φ. If not, exhibit a counterexample run.
Challenge: infinite-state system!
Typical approaches in Software Verification are unsatisfactory:
-- Model checking: developed for finite-state systems described by propositional states. More expressive specifications first abstracted to propositional ones. Unsatisfactory: can check that some payment occurred before some shipment, but not that it involved the correct amount and product.
-- Theorem proving: no completeness guarantees, not autonomous. Prover requires expert guidance.
Alternative approach: identify a restricted but reasonably expressive class of transducers that can be verified
Main restrictions for decidability
guarded quantification: quantified variables must appear in input (or prev-I) atoms
“input boundedness”; earlier variant: Spielmann
Example of guarded quantification:
$\text{pick}(pid, price) \leftarrow \exists ram\, \exists cpu\; \text{prev-search}(ram, cpu) \wedge \text{catalog}(pid, ram, cpu, price)$
Input-bounded transducer
• State, action, and transition rules use FO conditions with “input-bounded” quantification:
$\exists \bar{x}\, (\text{input}(-\,\bar{x}\,-) \wedge \varphi(\bar{x}))$
$\forall \bar{x}\, (\text{input}(-\,\bar{x}\,-) \rightarrow \varphi(\bar{x}))$
state atoms have no quantified variables
prev-I can also serve as guard
• Input options definitions: $\exists^*$FO (db, prev-input, ground state atoms)
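The input-boundedness restriction is syntactic, so it can be checked by a recursive walk over a formula. The following is a simplified sketch of that reading, not the definition from the cited papers: formulas are encoded as nested tuples (an encoding invented for this example), each quantifier carries its guard atom explicitly, and the check enforces only the two conditions stated above (guards are input or prev-input atoms containing the quantified variables; state atoms mention no quantified variable).

```python
# Assumed encoding:
#   ("atom", kind, relation, args)   with kind in {"input", "prev", "state", "db"}
#   ("not", f), ("and", f, g), ("or", f, g)
#   ("exists", vars, guard, body)    meaning  exists vars (guard AND body)
#   ("forall", vars, guard, body)    meaning  forall vars (guard -> body)

def is_input_bounded(formula, quantified=frozenset()):
    """Check guarded quantification and the ban on quantified variables in state atoms."""
    tag = formula[0]
    if tag == "atom":
        _, kind, _relation, args = formula
        return kind != "state" or not (set(args) & quantified)
    if tag == "not":
        return is_input_bounded(formula[1], quantified)
    if tag in ("and", "or"):
        return all(is_input_bounded(f, quantified) for f in formula[1:])
    if tag in ("exists", "forall"):
        _, variables, guard, body = formula
        guard_ok = (guard[0] == "atom" and guard[1] in ("input", "prev")
                    and set(variables) <= set(guard[3]))
        return guard_ok and is_input_bounded(body, quantified | set(variables))
    raise ValueError(f"unknown formula node: {tag}")


# The pick rule: quantification over ram, cpu is guarded by prev-search.
PICK_RULE = ("exists", ("ram", "cpu"),
             ("atom", "prev", "prev-search", ("ram", "cpu")),
             ("atom", "db", "catalog", ("pid", "ram", "cpu", "price")))
# Hypothetical violation: a state atom mentions the quantified variable x.
BAD_RULE = ("exists", ("x",),
            ("atom", "input", "pick", ("x",)),
            ("atom", "state", "cart", ("x",)))
```

`is_input_bounded(PICK_RULE)` holds, while `BAD_RULE` is rejected because the hypothetical state relation `cart` is applied to a quantified variable.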
Home page (HP)
Input: name, password, clickbutton(x)
Input options: clickbutton(x) ← (x = "login" or x = "cancel")
State update: error("bad user") ← not users(name,password) and clickbutton("login")
Transition rules:
HP ← clickbutton("cancel")
CP ← users(name,password) and clickbutton("login")
MP ← not users(name,password) and clickbutton("login")
Input-bounded LTL-FO property: FO components are input bounded
$\forall x\; G\,[\, X\, \text{reject-order}(x) \rightarrow (\text{past-order}(x) \wedge \neg \exists y\, (\text{pay}(x,y) \wedge \text{price}(x,y))) \,]$
“An order is rejected in the next step only if it has already been ordered but not paid correctly in the current input”
Main verification result
Theorem: It is decidable whether an input-bounded extended relational transducer satisfies an input-bounded LTL-FO property.
Complexity: PSPACE-complete for bounded arity schemas, EXPSPACE otherwise
Tightness: even small extensions lead to undecidability
• Relaxing the requirement that state atoms must be ground in formula defining the input options. Reduction: Does TM halt on input epsilon?
• Lifting the input-bounded requirement by allowing state projection. Reduction: Implication for FDs and IDs
• Allowing Prev-I to record all previous input to I rather than the most recent one. Reduction: Trakhtenbrot’s Theorem
• Extend the LTL-FO formulas with path quantification. Reduction: validity of $\exists^*\forall^*$FO formulas
Expressivity of input-bounded specs
Significant parts of the following Web applications could be modeled:
• Dell-like computer shopping website
• Expedia
• Barnes&Noble
• GrandPrix motor sports Web site
See demo site http://www.db.ucsd.edu
PSPACE verification: outline for single peer
To check that M satisfies $\varphi$, verify that there is no run satisfying $\neg\varphi$
Recall model checking approach (finite-state):
• Build Büchi automaton $B(\neg\varphi)$ for $\neg\varphi$
• Build Kripke model K for all runs
• Check that there is no counterexample run: emptiness of $K \times B(\neg\varphi)$
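In the finite-state case, the emptiness check amounts to asking whether the product graph contains a reachable accepting state that lies on a cycle (an accepting "lasso"). The sketch below assumes the product has already been built and is given as an adjacency dictionary; the function name and the graph encoding are hypothetical, and the quadratic reachability loop is chosen for brevity rather than efficiency.

```python
def has_accepting_lasso(initial, successors, accepting):
    """True iff some accepting state is reachable from `initial` and lies on a cycle."""
    def reachable(sources):
        # States reachable from `sources` in one or more steps.
        seen, stack = set(), list(sources)
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    candidates = (reachable(initial) | set(initial)) & set(accepting)
    # s is in reachable([s]) exactly when s lies on a cycle.
    return any(s in reachable([s]) for s in candidates)


# Hypothetical product graph: 0 -> 1 -> 2 -> 1, with accepting state 2.
PRODUCT = {0: [1], 1: [2], 2: [1]}
```

If `has_accepting_lasso` returns True, the product is non-empty and the lasso is a counterexample run; if it returns False, the property holds on all runs of the finite-state system.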
Our case: infinite-state system
Same idea: build $B(\neg\varphi)$, then search for counterexample runs accepted by $B(\neg\varphi)$
But: no finite Kripke model K for the runs!
Problem in searching for counterexample runs:
-- infinite runs
-- infinitely many underlying databases
How to limit the search space?
Infinite search space for runs
[Figure: grid of candidate runs; one axis is the length of the run, the other the number of underlying DBs; both are unbounded]
Bounding the search for counterexample runs
• Periodic runs suffice: there is a counterexample iff there is a periodic one, of doubly-exponential length in size of spec + prop
• Sufficient to consider only DBs over a fixed domain of cardinality exponential in size of spec + prop (doubly-exponentially many DBs)
Finite search space yields decidability of verification
Key insight for PSPACE complexity
• No need to explicitly materialize entire configuration.
• Instead, at each step construct only those portions of DB, states and outputs which can affect property.
• Call them pseudoconfigurations.
Pseudoconfigurations
C = a set of relevant constants extracted from the spec. and prop. + a fixed number of variables
-- input picked from C
-- restriction of states to constants in C
-- restriction of DB to C
-- restriction of outputs to constants in C
Size polynomial in spec + prop
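The restriction step can be illustrated on concrete relations. This sketch only shows what "restriction to C" means for a fully materialized configuration; the actual algorithm never builds the full configuration in the first place. The function names and the dictionary-of-relations encoding are assumptions, and the variables in C and the input component are left out.

```python
def restrict(relation, constants):
    """Keep only the tuples whose values all belong to `constants`."""
    return {t for t in relation if all(v in constants for v in t)}


def pseudoconfiguration(db, state, outputs, constants):
    """Restrict every relation of a configuration to the values in C."""
    return {
        kind: {name: restrict(rel, constants) for name, rel in rels.items()}
        for kind, rels in (("db", db), ("state", state), ("output", outputs))
    }


# Hypothetical configuration and set C of relevant values.
C = {"Time", 55}
PSEUDO = pseudoconfiguration(
    db={"price": {("Time", 55), ("Newsweek", 45)}},
    state={"past-order": {("Time",), ("People",)}},
    outputs={"deliver": {("Time",)}},
    constants=C,
)
```

Only tuples built entirely from values in C survive, so the size of a pseudoconfiguration depends on C and the schema rather than on the size of the database.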
Pseudoruns
[Figure: a pseudorun as a sequence of pseudoconfigurations, each a small window over the DB]
• counterexample run iff counterexample pseudorun
• Can compute next possible pseudoconfigurations from current one
• Never construct entire DB, just “slide” poly window over it
PSPACE verification algorithm
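“Relevant” constants in C
• constants explicitly used in spec
• witnesses for existentially quantified variables in $\neg\varphi$
Example:
$\varphi = \forall x\; G\,[\, X\, \text{reject-order}(x) \rightarrow (\text{past-order}(x) \wedge \neg \exists y\, (\text{pay}(x,y) \wedge \text{price}(x,y))) \,]$
$\neg\varphi = \exists x\; \neg G\,[\, X\, \text{reject-order}(x) \rightarrow (\text{past-order}(x) \wedge \neg \exists y\, (\text{pay}(x,y) \wedge \text{price}(x,y))) \,]$
Replace x by non-deterministically chosen constant c, continue with new formula
$\neg G\,[\, X\, \text{reject-order}(c) \rightarrow (\text{past-order}(c) \wedge \neg \exists y\, (\text{pay}(c,y) \wedge \text{price}(c,y))) \,]$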
Variables in C
• Input values that are not “relevant constants”
• Why don’t we need infinitely many? Can use just 2k variables, because only k values can be passed via prev-input between configurations
Implementation so far
WAVE: verifier for single Web service peer [SIGMOD’05]
• Essentially implements search for a counterexample pseudorun
• Many tricks and heuristics to achieve good verification times
Some techniques
• Dataflow analysis to identify all constants to which a DB attribute may be compared (directly or indirectly). Limits the relevant combinations of constants when constructing partial DBs. Spectacular reduction: for the computer shopping website, from 2^(17,270,412,688) partial DBs down to 8
• Internal representation of pseudoconfigs to
-- efficiently detect loop in periodic run
-- efficiently evaluate queries
• Early pruning of pseudoruns
Experimental Evaluation of WAVE Tool
• Online Demo at http://www.db.ucsd.edu/
• Evaluated experimentally on 4 Web applications:
-- Dell-like computer shopping
-- Part of Expedia, Barnes&Noble, GrandPrix
• Verification times for a battery of properties: all within seconds, below one minute.
• Here, report only Dell experiment. All others are similar.
Some of the Verified Properties
Property type                    | Property name | Time (seconds)
Sequence: p B q                  | P5 (true)     | 4
                                 | P7 (true)     | 2
Session: Gp → Gq                 | P9 (true)     | 1
Correlation: Fp → Fq             | P10 (true)    | 0.23
                                 | P11 (false)   | 0.29
                                 | P12 (true)    | 0.6
                                 | P13 (false)   | 0.44
Response: p → Fq                 | P14 (false)   | 0.19
Reachability: Gp or Fq           | P2 (true)     | 0.9
                                 | P3 (false)    | 0.37
Recurrence: G(Fp)                | P17 (false)   | 0.15
Strong non-progress: F(Gp)       | P15 (false)   | 0.26
Weak non-progress: G(p → Xp)     | P6 (false)    | 0.49
Guarantee: Fp                    | P1 (true)     | 0.02
                                 | P8 (false)    | 0.11
Slide annotation: “Shipment only after proper payment”
Failure of Classical Tools
• SPIN model checker
Abstraction is unsatisfactory.
Alternative trick: try to use SPIN to verify pseudoruns. The resulting SPIN input is too large to handle.
• PVS theorem prover
Not guaranteed to find a counterexample. Gets stuck during search, asks for guidance from expert user.
Branching-time temporal properties
Need path quantifiers
“At any point in a run, there is a way to return to the home page”
[Figure: from the current state of a run, some branch leads back to the home page]
Verification results for CTL(*)
• Propositional Web services:
--states and actions are propositional
--prev-I atoms are disallowed
• CTL* formulas using state, action, input, and Web page symbols interpreted as propositions
Examples: navigational properties
• “One can reach the home page from any page”
$AG\, EF(\text{Home-Page})$
• “After login, the user can reach a page where he can authorize payment”
$AG((\text{Home-Page} \wedge \text{button}(\text{“login”})) \rightarrow EF(\text{button}(\text{“authorize payment”})))$
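For a fixed, finite page graph, the first navigational property can be checked with the standard fixpoint computation for EF followed by a reachability pass for AG. The sketch below is illustrative only: it treats pages as the states of a hypothetical Kripke structure and ignores the database, which the actual verification results have to account for.

```python
def ef(target, successors, states):
    """Least fixpoint: the states from which some path reaches `target`."""
    result = set(target)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s not in result and any(t in result for t in successors.get(s, ())):
                result.add(s)
                changed = True
    return result


def ag_ef(initial, target, successors, states):
    """Check AG EF target: every state reachable from `initial` satisfies EF target."""
    reachable, stack = {initial}, [initial]
    while stack:
        for t in successors.get(stack.pop(), ()):
            if t not in reachable:
                reachable.add(t)
                stack.append(t)
    return reachable <= ef(target, successors, states)


# Hypothetical page graphs: in the second, MP is a dead end that loops on itself.
GOOD = {"HP": ["CP"], "CP": ["PIP"], "PIP": ["HP"]}
BAD = {"HP": ["CP"], "CP": ["PIP"], "PIP": ["HP", "MP"], "MP": ["MP"]}
```

On `GOOD` the property AG EF(Home-Page) holds; on `BAD` it fails, because once the user reaches MP there is no path back to HP.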
• Verification of CTL(*) formulas for propositional Web services:
-- CO-NEXPTIME for CTL
-- EXPSPACE for CTL*
Proof idea:
(i) show that there is a bound on the databases that need to be considered in order to detect a violation;
(ii) for a fixed database, reduce checking violation to model checking for a Kripke structure generated from the database.
Getting down to PSPACE:
• Navigational properties: CTL* formulas using only Web page symbols
• Fully propositional Web services: inputs are also propositional
Proof technique: highly efficient model-checking technique of Kupferman, Vardi, Wolper using hesitant alternating tree automata (HAA). Reduce to checking non-emptiness of a one-letter word HAA.
Alternative restriction: Web services with “input-driven search”
• Propositional states and actions
• Inputs are monadic, propagated to next Web page as a parameter using prev-I atoms
Example: allows conducting a user-driven search going through consecutive stages of refinement (each choice triggers a new set of options)
For Web services with input-driven search:
EXPTIME for fixed out-degree of input choice
Proof: reduce to satisfiability of CTL(*) formulas by a Kripke structure
CTL formulas can be verified in EXPTIME
CTL* formulas can be verified in 2-EXPTIME
Compositions of Web services
[Figure: the computer-shopping Web service (Buyer Login page BP, Category choice page CP, Desktop and laptop Search pages SP, Product index page PIP, Product detail page PP, Confirmation page CoP, Past Order page POP, Order status page OSP, Cancel confirmation page CCP) extended with a User payment page UPP that exchanges messages M with a Credit Verification peer]
Examples of Composition Properties
• “every payment request by a user results eventually in an approval or denial output to the user”
• “the answer to every credit check request message for a user is a credit rating message poor, fair, or good, for the same user”
• “for every two consecutive credit rating messages for the same user there exists an intermediate credit request message for that user.”
• Communicating peers: composition
• channels between peers
• message: finite relation (set or singleton)
• one FIFO queue at recipient of each channel
More on messages
• Flat message: single tuple
• Nested message: finite set of tuples
• Messages queued at recipient
• Message contents:
!M(x) :- query(db, state, input, in-messages)
Flat messages: query may generate several tuples; one is chosen non-deterministically to be sent
Peers with messages
Control: (input, in-messages, state, db) → (output, out-messages, state)
[Figure: single peer with db, state, input, output, incoming messages, and outgoing messages]
Configurations and runs
Configuration of a single peer
incoming message queues
Configuration of a composition: member peer configurations
Transitions: one peer at a time
Run: infinite sequence of consecutive configurations
Verification of compositions: main restrictions for decidability
input boundedness of rules + bounded queues, lossy channels
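To make the channel restriction concrete, here is a minimal sketch of a bounded, lossy FIFO queue. It rests on assumptions not stated in the slides: the class name `LossyBoundedQueue` is hypothetical, a message sent to a full queue is simply dropped, and random loss stands in for the non-deterministic loss a verifier would have to explore.

```python
import random
from collections import deque

class LossyBoundedQueue:
    """FIFO queue holding at most `bound` messages; sends may be lost."""

    def __init__(self, bound, loss_prob=0.0, rng=None):
        self.bound = bound
        self.loss_prob = loss_prob          # assumed model of lossiness
        self.rng = rng or random.Random()
        self.items = deque()

    def send(self, msg):
        """Try to enqueue `msg`; return True only if it was queued."""
        if self.rng.random() < self.loss_prob:
            return False                    # lost in transit
        if len(self.items) >= self.bound:
            return False                    # queue full: message dropped
        self.items.append(msg)
        return True

    def receive(self):
        """Dequeue the oldest message, or None if the queue is empty."""
        return self.items.popleft() if self.items else None

q = LossyBoundedQueue(bound=2)
sent = [q.send(m) for m in ("rating-request", "payment", "cancel")]
print(sent, q.receive())
```

With `loss_prob=0.0` the only losses come from the bound, so the third send fails and the first message is received first.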
Input-bounded compositions
• State, output, and nested message rules use FO formulas with guarded quantification:
∃x̄ ( guard(x̄) ∧ φ(x̄) )
∀x̄ ( guard(x̄) → φ(x̄) )
where guard is an input or flat message atom and state and nested message atoms in φ have no quantified variables
• Input options and flat message definitions:
∃*FO formulas with ground state and nested message atoms
Verification of compositions
Reduce to single peer verification
• Reduction applies to input-bounded compositions with bounded, lossy channels
• Flat message queues simulated by inputs
• Nested message queues simulated by states
• Non-deterministic choice of peer at each transition simulated with additional input
• Some tricky timing issues in translation of property
PTIME reduction preserving input boundedness
Main verification result
Theorem: It is decidable whether an input-bounded composition with bounded queues
and lossy channels satisfies an input-bounded LTL-FO property.
Complexity: PSPACE-complete for bounded arity schemas, EXPSPACE otherwise
Examples of extensions leading to undecidability
• Allowing perfect flat queues
(perfect nested queues are ok)
• Disallowing non-deterministic choice for flat messages
Reduction: Post Correspondence Problem
Additional verification problems
• Conversation protocols
sequences of messages observed in runs
data-agnostic: message parameters ignored
data-aware: parameters taken into account
• Modular verification
specs of some peers not available
information limited to input/output behavior
Verification of conversation protocols
• data-agnostic protocol: Büchi automaton over alphabet of message names
• Possible semantics with lossy channels:
observer-at-recipient
observer-at-source
Theorem: It is PSPACE-complete to decide whether an input-bounded composition with bounded, lossy channels satisfies a data-agnostic conversation protocol with observer-at-recipient semantics
Theorem: It is undecidable whether an input-bounded composition with bounded, lossy channels satisfies a data-agnostic conversation protocol with observer-at-source semantics
Verification of conversation protocols
• Similar results for data-aware protocols: formalized as Büchi automaton whose alphabet is a finite set of FO formulas on message relations
G( get-rating(x) B rating(x,y) )
Theorem: It is PSPACE-complete to decide whether an input-bounded composition with bounded, lossy channels satisfies a data-aware conversation protocol with observer-at-recipient semantics
Modular verification
Black box peers: input-output behavior
Environment specification:LTL-FO description of input and output messages
Environment
Properties under given environment
Composition C satisfies LTL-FO property φ under environment specification ψ:
every run of C in which messages to/from the environment satisfy ψ and use values from some finite domain, satisfies φ
Verification under given environment
Additional restriction needed for decidability
LTL-FO property is strictly input-bounded if its FO components have no free variables
Example:
G ∀ssn [ ?getRating(ssn) → ( !rating(ssn, “poor”) ∨ !rating(ssn, “fair”) ∨ !rating(ssn, “good”) ) ]
Verification under given environment
Theorem: It is PSPACE-complete to decide whether an input-bounded composition C with bounded queues and lossy channels satisfies an input-bounded LTL-FO property under a strictly-input-bounded environment specification
Theorem: It is undecidable whether an input-bounded composition C with bounded queues and lossy channels satisfies an input-bounded LTL-FO property under an input-bounded but not strictly-input-bounded environment specification
Putting the pieces together
WebML-style spec of Web service composition
peer composition spec
single peer spec
PTIME
PTIME
The XML angle
• Black box: input/output signature
order
delivery
bill
payment
Supplier
Each message in the signature (order, delivery, bill, payment) has an XML type
• Interacting Web services
[Figure: store, bank, supplier1, and supplier2 exchange the messages order1, receipt1, order2, receipt2, bill 1, payment 1, bill 2, payment 2, authorize, and ok; each message has an XML type]
Active XML
• Full integration of XML with Web service calls
• Service calls embedded in XML documents
• Static analysis: typechecking, type casting, optimization, materialization policy
• More later!
• XML document (ignoring PC data):
labeled, unranked, ordered tree
Quick XML Review
root
section section
intro section conc intro conc
intro section section conc
intro conc intro conc
• XML type: XML Schema
• XML query language: XQuery
Tools in static analysis: connections to
tree automata and tree transducers
Simple XSchema: Document Type Definition (DTD)
Σ: alphabet of element names, root ∈ Σ
set of rules:
e → r
where e is an element name and r is a regular expression over Σ
Documents satisfying a DTD
[Figure: for a rule e → r, a node labeled e has children e1 … ek whose sequence of labels matches r]
Set of trees satisfying DTD d: Sat(d)
A DTD and a tree satisfying it:
root → section*; section → intro, section*, conclusions;
Example
root
section section
intro section conc intro conc
intro section section conc
intro conc intro conc
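Membership in Sat(d) can be sketched as a recursive check that the child labels of every node match the regular expression of its rule. The code below is an illustration under assumptions: trees are `(label, children)` pairs, child sequences are encoded as space-terminated names so Python's `re` can match them, and the empty rules for `intro` and `conclusions` are added here because the slide does not give them.

```python
import re

# DTD from the slide: root -> section*; section -> intro, section*, conclusions
# Each name in a content model is followed by one space.
DTD = {
    "root": r"(section )*",
    "section": r"intro (section )*conclusions ",
    "intro": r"",          # assumed: leaf element
    "conclusions": r"",    # assumed: leaf element
}

def satisfies(tree, dtd):
    """True if every node's child sequence matches the rule for its label."""
    label, children = tree
    if label not in dtd:
        return False
    word = "".join(child[0] + " " for child in children)
    if re.fullmatch(dtd[label], word) is None:
        return False
    return all(satisfies(child, dtd) for child in children)

def section(*subsections):
    """Build a section node with intro, nested sections, conclusions."""
    return ("section", [("intro", []), *subsections, ("conclusions", [])])

# Same shape as the example tree: two top-level sections, the first nested.
doc = ("root", [section(section(section(), section())), section()])
bad = ("root", [("section", [("intro", [])])])  # missing conclusions

print(satisfies(doc, DTD), satisfies(bad, DTD))
```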
dealer
used new
car car
model year model
car has different structure in different contexts
Essential enhancement of XML Schema: specialization
Specialization
dealer
used new
carnew
model year model
car has different structure in different contexts
carused
dealer → used new
used → (carused)*
new → (carnew)*
carused → model year
carnew → model (year | ε)
dealer
used new
car car
model year model
dealer
used new
carused carnew
model year model
Specialized DTD
Tool in static analysis:
Powerful connection to tree automata!
Tree automata
States: p, q, r, …
Start state: q0
Final state: qf
Transitions:
[Figure: a transition takes a node in state p to states q and r at its two children; the input is a binary tree with root at the top and nodes labeled a, b, c]
Tree automaton computation
[Figure sequence: the root is labeled q0; the next level is labeled p q, the level below r p q p, and finally all eight leaves are labeled qf]
Acceptance: there is a computation such that all leaves are labeled qf
Set of accepted trees: regular tree language
Tree automata variants
• Top-down nondeterministic
• Bottom-up (non)-deterministic
• Top-down deterministic (weaker)
Examples
• regular: set of trees representing Boolean circuits evaluating to true
0 1 1 0 0 1 0 1
• not regular: equal numbers of a and b nodes
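The Boolean-circuit example can be sketched as a bottom-up deterministic tree automaton with two states, 0 and 1: leaves go to the state of their value, and each gate combines the states of its two children. The gate labels `and` and `or`, the `(label, children)` encoding, and the function names are assumptions made for this illustration.

```python
# Leaf rules: the state of a leaf is its Boolean value.
LEAF = {"0": 0, "1": 1}

# Transition table: (gate label, left state, right state) -> state.
DELTA = {}
for a in (0, 1):
    for b in (0, 1):
        DELTA[("and", a, b)] = a & b
        DELTA[("or", a, b)] = a | b

def run(tree):
    """Compute the state reached at the root, bottom-up."""
    label, children = tree
    if not children:
        return LEAF[label]
    left, right = (run(child) for child in children)
    return DELTA[(label, left, right)]

def accepts(tree):
    """Accept exactly the circuits that evaluate to true (state 1)."""
    return run(tree) == 1

def leaf(value):
    return (value, [])

circuit = ("and", [("or", [leaf("0"), leaf("1")]), leaf("1")])
print(accepts(circuit), accepts(("and", [leaf("0"), leaf("1")])))
```

A fixed, finite set of states suffices here, which is what makes the language regular; counting a and b nodes would need unboundedly many states.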
Regular: definable in Monadic Second-Order Logic
Theorem: XML Schemas define regular languages of unranked trees
Benefits:
• Static analysis
• Algorithms for validation w.r.t. XML Schemas, testing inclusion of XML Schemas, etc.
Query languages for XML
• One paradigm (also XML-QL, Lorel)
– extract bindings for variables using patterns
– construct answer from bindings
• Another paradigm (also XSLT, UnQL)
– structural recursion
• Standard: XQuery
integrates previous paradigms
pattern:
[Figure: pattern tree below root with variable nodes X, Y, Z, T and edges labeled p, q, r, s]
p,q,r,s: regular path expressions
Basic form of XML-QL query
WHERE pattern(X,Y,…)
CONSTRUCT answer(X,Y,…)
[Figure: the same pattern with a selection condition Z = 5 on one node and an equality (=) between nodes]
answer:
[Figure: answer tree below root with nodes f(X) and g(X,Y)]
f(X), g(X,Y): Skolem functions
[Figure sequence: the answer tree contains nodes f(X1) … f(Xi) … f(Xn) and g(X,Y1) … g(X,Yk), created by the Skolem functions from the variable bindings]
Example: art exhibits
DTD:
root → exhibit*
exhibit → title . museum* . review*
title → artist*
museum → address . dates . _*
WHERE pattern: root -exhibit→ X1; X1 -title→ X2; X1 -museum→ X3; X2 -artist→ X4 = Vermeer; X3 -*→ X5
CONSTRUCT answer: Vermeer’s exhibits, with nodes title(X2), museum(X2,X3), X5(X2,X3,X5), and the nested query Q(X2)
Nested query Q(X2):
WHERE pattern: root -exhibit→ Y1; Y1 -title→ X2; Y1 -review→ Y2
CONSTRUCT answer: Vermeer’s-reviews(X2), with nodes review(X2,Y2)
A Different Paradigm: Structural Recursion
• Example: change all name tags occurring under person nodes to pname
• Notation: a(t1,t2) denotes the tree with root a and subtrees t1, t2
• Transformation defined by f:
f(x(t1,t2)) = if x = person then x (g(t1), g(t2))
else x (f(t1), f(t2))
g(x(t1,t2)) = if x = name then pname (g(t1), g(t2))
else x (g(t1), g(t2))
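The two mutually recursive functions translate almost directly into code. The sketch below assumes trees are `(label, children)` pairs and, unlike the binary notation above, allows any number of children; the sample document is made up for illustration.

```python
def f(tree):
    """Outside any person node: copy the tree, switching to g under person."""
    label, children = tree
    if label == "person":
        return (label, [g(child) for child in children])
    return (label, [f(child) for child in children])

def g(tree):
    """Below a person node: rename every name tag to pname."""
    label, children = tree
    new_label = "pname" if label == "name" else label
    return (new_label, [g(child) for child in children])

def node(label, *children):
    return (label, list(children))

# Hypothetical input: one name outside person, two names below it.
doc = node("db",
           node("name"),
           node("person", node("name"), node("addr", node("name"))))

print(f(doc))
```

Only the `name` tags below `person` are renamed; the top-level `name` is left alone because `f` never renames.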
Connection to tree transducers
k-pebble transducers
• subsume core of XQuery
• useful for static analysis
[Milo+Suciu+V.-]
k pebbles: 1, 2, …, k
– States, pebbles, transition rules
– Control: alternating
– Stack discipline for pebbles
[Figure sequence: a transducer with 3 pebbles moves over the input tree; pebbles 1, 2, 3 are placed one after another and later removed in reverse order, following the stack discipline]
Output-producing transitions
[Figure sequence: with pebbles 1, 2, 3 on the input tree, the transducer builds the output tree below root step by step, emitting nodes labeled a, b, c]
XML Query languages vs. k-pebble transducers
k-pebble transducers subsume the tree manipulation core of XQuery
Application: typechecking
[Figure: Web service A (output type α) feeds an XML query, viewed as a tree transducer T, whose result (output type β) goes to Web services B and C]
Need to check: T(α) ⊆ β
Equivalently: α ⊆ T⁻¹(β)
Theorem: T⁻¹(β) is a regular tree language [Milo+Suciu+V.-]
Typechecking:
1. Compute from T and β the tree automaton for T⁻¹(β)
2. Check that α ⊆ T⁻¹(β)
Caveat: no data joins
Example: application to matching for service composition
Web service A
Web service B
output type α
input type β
Can output of A be restructured to fit the input type β of B?
• Given output I of service A, check whether T(I) ∩ β ≠ ∅ by constructing a tree automaton for T(I)
• If yes, produce as side effect a minimal restructuring of I that satisfies β, witnessing the nonempty intersection
Key: describe allowed restructurings by a nondeterministic transducer T
Static version:
Can every output of A be restructured
so as to satisfy the input type of B?
• Key: {I | T(I) ∩ β ≠ ∅} is regular if T is a k-pebble transducer
• Enough to check that α ⊆ {I | T(I) ∩ β ≠ ∅}
Going all the way: Active XML
XML + embedded service calls
[Figure: newspaper document with title “Le Monde”, date “06/10/2003”, and the embedded service calls GetTemp(city “Paris”) and GetEvents(“Exhibits”)]
Materialization: replacing a service call by its result. It’s a recursive process.
[Milo,Abiteboul,Amann,Benjelloun,Ngoc – SIGMOD03]
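The recursive nature of materialization can be sketched on a toy tree encoding. Everything in the code is an assumption made for illustration: trees are `(label, children)` pairs, a node whose label appears in the `services` table is treated as a call with its children as parameters, and the service results (in particular the `exhibit` returned by `GetExhibits`) are invented stand-ins.

```python
def materialize(tree, services):
    """Return the forest obtained by replacing every service call by its result."""
    label, children = tree
    # Materialize below this node first (parameters or ordinary children).
    below = [n for child in children for n in materialize(child, services)]
    if label in services:
        result = services[label](below)      # invoke the (toy) service
        # The result may itself contain calls, so recurse on it.
        return [n for r in result for n in materialize(r, services)]
    return [(label, below)]

# Hypothetical services: each maps its parameters to a list of trees.
services = {
    "GetTemp": lambda params: [("temp", [("16°C", [])])],
    "GetEvents": lambda params: [
        ("exhibits", [("GetExhibits", [("City", [("Paris", [])])])])
    ],
    "GetExhibits": lambda params: [("exhibit", [("Vermeer", [])])],
}

doc = ("newspaper", [
    ("title", [("Le Monde", [])]),
    ("date", [("06/10/2003", [])]),
    ("GetTemp", [("city", [("Paris", [])])]),
    ("GetEvents", [("Exhibits", [])]),
])

print(materialize(doc, services))
```

The result of `GetEvents` contains a further call, which is why one pass is not enough. The sketch does not guard against services that keep returning new calls forever.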
[Figure sequence: GetTemp is replaced by its result temp “16°C”; GetEvents(“Exhibits”) is replaced by exhibits, which contains a new embedded call GetExhibits(City “Paris”)]
• Context: peer-to-peer Web services
• Each peer
– Repository of intensional (AXML) documents
– Server: provides Web services (XQuery)
– Client: when invoking the embedded service calls
Extended type
• Restriction on where data and service calls occur in tree
• Service call input/output signatures
• Service call definitions
input parameters: XQueries on client data
output: XQuery on parameters and server data
Basic typechecking problem
Given AXML types and service signatures and definitions for all peers, check if:
• all AXML documents resulting from calls among peers are valid
• all service inputs and outputs satisfy the signatures
Can be checked using transducers (if no data joins)
Controlling expansion policy by typing
[Figure sequence: the newspaper document (title “Le Monde”, date “06/10/2003”) with embedded calls GetEvents(“Exhibits”) and GetTemp(city “Paris”); GetTemp is replaced by temp “16°C” either before the document is sent or after it is received]
Materialization can be performed by the sender, before sending a document… or by the receiver, after receiving it.
Why control the materialization of calls?
• For added functionality, e.g.
– Intensional data makes it possible to get up-to-date information.
• For security reasons or capabilities, e.g.
– I don’t trust this Web service/domain,
– I don’t have the right credentials to invoke it,
– It costs money,
– Maybe the receiver doesn’t know Active XML!
• For performance reasons, e.g.
– A proxy can invoke services on behalf of a PDA.
Example scenario
• Client allows only certain service calls, specified by its type
• Can server always force its answer to satisfy the client’s schema by appropriate expansions?
• Game between server and invoked services
Example: word case
• Client type: regular language R
• Input: word a1 ... an
each ai represents a Web service call
• Output type of service a: regular language Ra
• Game: Bob chooses a in the current word,
Alice responds with word in Ra to replace a
• Bob wins if resulting word is in R
Does Bob have a winning strategy on a1 ... an ?
Undecidable [Segoufin, Schwentick, Muscholl – STACS’04], even if limited to “context-free games”: Ra consists of just finitely many words
Decidable under restrictions [Segoufin, Schwentick, Muscholl – STACS’04] [Milo, Abiteboul, Amann, Benjelloun, Ngoc – SIGMOD’03]; complexity: PSPACE to EXPSPACE
Mille Grazie!