Post on 20-Dec-2015
2005 rel-xml-ii 1
The SilkRoute system
• The system goals
• Scenario, examples
• View forests
• View forest and query composition
• View forest efficient execution
2005 rel-xml-ii 2
The system goals
• Publishing of relational data in XML form
• Efficient execution of XQuery queries on published views
Assumptions:
• The only real data is the relational data
• The public XML view is virtual
• XQuery queries on published views are composed with the view definition
• Execution is always by SQL queries, since relational engines are efficient
2005 rel-xml-ii 3
Scenario, examples
The scenario: e-commerce
• Each supplier has a private relational store
• This store can be represented in a canonical XML view
• The format and contents of both are private
• There is a public XML schema definition that all participants in the scenario know about and use: the public view
• Resellers, merchants pose queries on the public view, as per their current interests
2005 rel-xml-ii 4
Supplier private data: relational and XML (fig. 2, p. 5; fig. 4, p. 8)
2005 rel-xml-ii 5
The generation of the canonical view from the relational data is standard and can be completely automated
But the db is not converted to the XML view; doing so would throw away the availability of SQL for queries
The canonical view is virtual
And, it is just a basis for further work; this data is considered private
For the public, there is an XML schema agreed upon by the community – the public view
2005 rel-xml-ii 6
The public view schema: (fig. 1, p. 5)
XQuery type syntax – an internal format (more human readable) of XML, used by the XQuery type checker
Differences: one root, mainly products, change of names of some fields, reports inside product
2005 rel-xml-ii 7
The db admin can now write queries on the canonical view (p. 8, p. 9)
$CanonicalView is a pre-defined variable, bound to the canonical view
2005 rel-xml-ii 8
A more complex query, with fusion (fig. 5, p. 10)
2005 rel-xml-ii 9
Note: XQuery is compositional:
Given Q1 on a source S, and Q2 on the results of Q1, there exists a query Q3 on S such that
Q3(S) = Q2(Q1(S))
Simple approach: define Q1 in a let, then use its result as the input of Q2
(see prev. page for let)
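As a rough illustration of the compositionality property (not SilkRoute code, and in Python rather than XQuery), the sketch below treats Q1 and Q2 as plain functions and builds Q3 by chaining them. The item fields `name`, `category` and `discount` are made up for the example. Like the "let" approach, this naive Q3 materializes the full result of Q1 before Q2 runs.

```python
def q1(source):
    """Q1: a stand-in view definition; keeps only outerwear items."""
    return [item for item in source if item["category"] == "outerwear"]


def q2(view):
    """Q2: a user query posed on the result of Q1."""
    return [item["name"] for item in view if item["discount"] >= 0.5]


def compose(first, second):
    """Return Q3 such that Q3(S) == second(first(S))."""
    return lambda source: second(first(source))


# Hypothetical source data S (field names are assumptions).
S = [
    {"name": "parka", "category": "outerwear", "discount": 0.6},
    {"name": "raincoat", "category": "outerwear", "discount": 0.1},
    {"name": "t-shirt", "category": "tops", "discount": 0.7},
]

q3 = compose(q1, q2)
```

Here `q3(S)` equals `q2(q1(S))`, but the intermediate list from `q1` is still built in full.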
Here (next page) is a query defining our supplier’s public view (ignore boxes) (fig. 6, p. 11)
2005 rel-xml-ii 10
public view
2005 rel-xml-ii 11
Users (e.g., resellers) may pose queries on the public view
e.g., somebody wants only the deeply discounted items (fig. 7)
This can be composed with the definition of the public view
An efficient composition, that does not generate intermediate results, and retrieves only what is needed, is more difficult
2005 rel-xml-ii 12
Here is an efficient composition (as might be produced by an optimizer) (fig. 8, p. 13)
2005 rel-xml-ii 13
View Forests
But, the canonical view is also virtual – all data is in a relational db
How do we represent XML virtual views over relational, so that composition can be performed?
The representation in SilkRoute: view forest (typically a tree, but XQuery allows XML sequences, hence forest)
• A tree structure: the structure of the XML view
• Nodes are labeled with XML labels and with SQL fragments; the fragments are used to generate the data
2005 rel-xml-ii 14
public view
2005 rel-xml-ii 15
Explanations:
• Each node is assigned a (Dewey style) id, no explicit tree is maintained
• The XML label of an internal node is an element/attribute
• The XML label of a leaf is an atomic type (string, float, …)
• Each node is labeled with an SQL fragment, with FROM, WHERE, SELECT components (may be empty, not always shown)
– A leaf must have a SELECT
– Every variable used in a node must be defined (in FROM) in the node or in an ancestor
2005 rel-xml-ii 16
The SQL query for a node: C(n)
C(n) is the union of the fragments of the nodes on the path from n to the root
Examples:
N1.1.1:
N1.2:
SELECT * (select an empty result tuple)
FROM clothing c
WHERE c.category = 'outerwear'
N1.2.1.1:
N1.2.5.1:
Note: if n is parent of n’, then C(n’) is an extension of C(n) (possibly the same, possibly a new var, or a new condition…)
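The construction of C(n) can be sketched as follows. This is a minimal illustration, not SilkRoute's implementation: the `Fragment` class, the tiny `FOREST` dictionary and the column `c.item` are assumptions made for the example. Only the `clothing c` / `outerwear` fragment comes from the N1.2 example above. The path to the root is read off the Dewey-style id, so no explicit tree is needed.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """SQL fragment of one node; any component may be empty."""
    select: list = field(default_factory=list)
    from_: list = field(default_factory=list)
    where: list = field(default_factory=list)


# Hypothetical forest keyed by Dewey-style id (XML labels omitted).
FOREST = {
    "1": Fragment(),
    "1.2": Fragment(from_=["clothing c"], where=["c.category = 'outerwear'"]),
    "1.2.1": Fragment(),
    "1.2.1.1": Fragment(select=["c.item"]),  # a leaf must have a SELECT
}


def path_from_root(node_id):
    """Ids on the path root -> node, derived from the Dewey id alone."""
    parts = node_id.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def merge(acc, items):
    """Order-preserving union of fragment items into acc."""
    for item in items:
        if item not in acc:
            acc.append(item)


def c(node_id, forest=FOREST):
    """C(n): union of the fragments on the path from the root to n."""
    out = Fragment()
    for nid in path_from_root(node_id):
        frag = forest[nid]
        merge(out.select, frag.select)
        merge(out.from_, frag.from_)
        merge(out.where, frag.where)
    return out


def render(frag):
    """Render a fragment as SQL text; an empty SELECT is shown as '*'."""
    lines = ["SELECT " + (", ".join(frag.select) or "*")]
    if frag.from_:
        lines.append("FROM " + ", ".join(frag.from_))
    if frag.where:
        lines.append("WHERE " + " AND ".join(frag.where))
    return "\n".join(lines)
```

With this toy forest, `render(c("1.2"))` reproduces the N1.2 query shown above, and `c("1.2.1.1")` contains every item of `c("1.2")`, which matches the note that C(n’) extends C(n).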
2005 rel-xml-ii 17
The semantics of a view forest:
Given a (relational) instance I, view forest V, V(I) is:
• Collection of nodes (n, r) where
– n is a node of V
– r is a row in the answer of C(n) on I
• (n, r) is parent of (n’, r’) if
– n is parent of n’ in V
– r, r’ are obtained from variable assignments that agree on all the variables of C(n)
Look at the view (p. 14) and nodes
N1.1.1, N1.2.1.1, N1.2.5.1
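The definition of V(I) can be tried out on a toy database. The sketch below is an illustration under several assumptions: the forest, the tables `clothing` and `problems`, and their columns are invented for the example, and each row r is identified with its variable assignment (one SQLite `rowid` per FROM variable). It evaluates C(n) separately for every node, then links (n, r) to (n’, r’) when the two assignments agree on the variables of C(n).

```python
import sqlite3

# Hypothetical forest: Dewey id -> (XML label, FROM items as (table, alias), WHERE)
FOREST = {
    "1": ("supplier", [], []),
    "1.1": ("product", [("clothing", "c")], ["c.category = 'outerwear'"]),
    "1.1.1": ("report", [("problems", "p")], ["p.pid = c.pid"]),
}


def demo_db():
    """Build a small in-memory instance I (made-up tables and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clothing (pid INTEGER, item TEXT, category TEXT)")
    conn.execute("CREATE TABLE problems (pid INTEGER, comment TEXT)")
    conn.executemany("INSERT INTO clothing VALUES (?, ?, ?)", [
        (1, "parka", "outerwear"), (2, "t-shirt", "tops"), (3, "raincoat", "outerwear")])
    conn.executemany("INSERT INTO problems VALUES (?, ?)", [
        (1, "zipper sticks"), (1, "runs small"), (2, "fades")])
    return conn


def path_from_root(node_id):
    parts = node_id.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def assignments(conn, node_id):
    """Evaluate C(node_id); return each result as a dict alias -> rowid."""
    tables, conds = [], []
    for nid in path_from_root(node_id):
        _, frm, where = FOREST[nid]
        tables += frm
        conds += where
    if not tables:
        return [{}]  # no variables in scope: one empty assignment
    aliases = [alias for _, alias in tables]
    sql = "SELECT " + ", ".join(f"{a}.rowid" for a in aliases)
    sql += " FROM " + ", ".join(f"{t} {a}" for t, a in tables)
    if conds:
        sql += " WHERE " + " AND ".join(conds)
    return [dict(zip(aliases, row)) for row in conn.execute(sql)]


def instance(conn):
    """Return (nodes, edges) of V(I) following the definition above."""
    nodes = {nid: assignments(conn, nid) for nid in FOREST}
    edges = []
    for nid, rows in nodes.items():
        parent = nid.rpartition(".")[0]
        if not parent:
            continue  # a root has no parent
        for r_child in rows:
            for r_par in nodes[parent]:
                # The assignments must agree on all variables of C(parent).
                if all(r_child[v] == val for v, val in r_par.items()):
                    edges.append(((parent, r_par), (nid, r_child)))
    return nodes, edges
```

On this data the root has two product children, and only the product with pid 1 has report children. Running one query per node and matching rows in application code is the naive method.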
2005 rel-xml-ii 18
The semantics gives a naïve, inefficient method for generating the XML represented by a view V from an instance I
More efficient method: combine many SQL queries into one
Will be discussed later (2nd half of paper)
2005 rel-xml-ii 19
Here is a view forest for a fragment of the canonical view
Can be generated automatically from db schema