Post on 16-Dec-2015


Updating XML Views over Relational Data

Ling Wang, Mukesh Mulchandani

Advisor: Elke A. Rundensteiner
Co-Advisor: Kathi Fisler

Outline

• Motivation (Why?)
• Background: XML View, Update Extension for XQuery
• Problem Definition:
  - Correct Translatability
  - General Classification of XML View Update (XVUP)
  - Typical Case Study: RUP, PUP
• Update Strategy for Round-Trip Update Problem (RUP)
• Update Strategy for Publish-based Update Problem (PUP)
• XVUP System Architecture
• Contribution
• Related Work

Motivation

• XML is a standard for information exchange over the Internet, but RDBMS technology is mature:
  - Mature query optimization techniques
  - High query performance
• Research topics on handling XML with relational technology:
  - Publishing XML over relational databases: SilkRoute (AT&T), XPERANTO (IBM), RAINBOW
  - Storing XML in RDBs: LegoDB (Bell Labs), RAINBOW
• Supporting update features
• Our work focuses on content updates using the XQuery language

What should we do?

• Step 1: Expressing updates in XQuery
  - Extension to XQuery
  - Extension to the XML query parser to support update features
• Step 2: Updating relational data (RD) through the XML view
  - Keep consistency
  - Translate XML view updates (XQuery) into relational table updates (SQL)

[Figure: RDBMS, View Query, XML View, XML Update Query, SQL Update, RDBMS]


XML Schema (Bib.xsd)

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="bib">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="bookid" type="xs:string" nillable="false"/>
              <xs:element name="title" type="xs:string" nillable="false"/>
              <xs:element name="author">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="aname" type="xs:string" maxOccurs="unbounded"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="prices" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="source" type="xs:string"/>
                    <xs:element name="currency" type="xs:string"/>
                    <xs:element name="value" type="xs:double"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="publisher">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="pname" type="xs:string"/>
                    <xs:element name="location" type="xs:string"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element name="review" type="xs:string" nillable="true"/>
            </xs:sequence>
            <xs:attribute name="year" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

XML document (Bib.xml)

<bib>
  <book year="1994">
    <bookid>98001</bookid>
    <title>TCP/IP Illustrated</title>
    <author>
      <aname>W. Stevens</aname>
    </author>
    <prices>
      <source>www.amazon.com</source>
      <currency>USD</currency>
      <value>65.95</value>
    </prices>
    <publisher>
      <pname>Addison-Wesley</pname>
      <location>San Francisco</location>
    </publisher>
    <review>One of the best books on TCP/IP.</review>
  </book>
  <book year="1992">
    <bookid>98002</bookid>
    <title>Advanced Programming in the Unix environment</title>
    <author>
      <aname>Bram Stoker</aname>
    </author>
    <prices>
      <source>www.amazon.com</source>
      <currency>USD</currency>
      <value>65.95</value>
    </prices>
    <prices>
      <source>www.bn.com</source>
      <currency>USD</currency>
      <value>64.75</value>
    </prices>
    <publisher>
      <pname>Addison-Wesley</pname>
      <location>Boston</location>
    </publisher>
    <review>A clear and detailed discussion of UNIX programming.</review>
  </book>
  <book year="2000">
    <bookid>98003</bookid>
    <title>Data on the Web</title>
    <author>
      <aname>Serge Abiteboul</aname>
      <aname>Peter Buneman</aname>
      <aname>Dan Suciu</aname>
    </author>
    <prices>
      <source>www.amazon.com</source>
      <currency>DEM</currency>
      <value>34.95</value>
    </prices>
    <publisher>
      <pname>Morgan Kaufmann Publishers</pname>
      <location>New York</location>
    </publisher>
    <review>A very good discussion of semi-structured database systems and XML.</review>
  </book>
</bib>

XQuery Example

XQuery:

<books>
  FOR $book IN document("bib.xml")/book
  LET $titles := $book/title
  WHERE $book/@year < 2000
  RETURN
    $book/title,
    <total>count($titles)</total>
</books>

Query Result:

<books>
  <title>TCP/IP Illustrated</title>
  <title>Advanced Programming in the Unix environment</title>
  <total>2</total>
</books>

XQuery Update Grammar

  FOR $binding1 IN Xpath-expr, ...
  LET $binding := Xpath-expr, ...
  WHERE predicate1, ...
  updateOp, ...

where updateOp is defined as:

  UPDATE $binding { subOp {, subOp}* }

and subOp is:

  DELETE $child |
  RENAME $child TO new_name |
  INSERT ( $bind [BEFORE | AFTER $child]
         | new_attribute(name, value)
         | new_ref(name, value)
         | content [BEFORE | AFTER $child] ) |
  REPLACE $child WITH ( new_attribute(name, value)
                      | new_ref(name, value)
                      | content ) |
  FOR $sub_binding IN Xpath-subexpr, ...
  WHERE predicate1, ...
  updateOp

Update query example

Insert Update:

FOR $book IN document("bib.xml")/book
LET $author := $book/author
WHERE $book/title = "TCP/IP Illustrated"
UPDATE $author {
  INSERT <aname>"Peter Naughton "</aname>
}

Result:

<bib>
  <book year="1994">
    <bookid>98001</bookid>
    <title>TCP/IP Illustrated</title>
    <author>
      <aname>W. Stevens</aname>
      <aname>"Peter Naughton "</aname>
    </author>
    <prices>
      <source>www.amazon.com</source>
      <currency>USD</currency>
      <value>65.95</value>
    </prices>
    <publisher>
      <pname>Addison-Wesley</pname>
      <location>San Francisco</location>
    </publisher>
    <review>One of the best books on TCP/IP.</review>
  </book>
  ……
</bib>
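The effect of this insert update on the XML document can be imitated directly on a DOM tree. The following is a minimal Python sketch using the standard `xml.etree.ElementTree` module; the helper name `insert_author` and the trimmed copy of bib.xml are assumptions made for illustration, and the sketch works on the document itself, not on the underlying relational tables.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of bib.xml: only the elements this update touches.
BIB_XML = """
<bib>
  <book year="1994">
    <bookid>98001</bookid>
    <title>TCP/IP Illustrated</title>
    <author><aname>W. Stevens</aname></author>
  </book>
  <book year="2000">
    <bookid>98003</bookid>
    <title>Data on the Web</title>
    <author><aname>Serge Abiteboul</aname></author>
  </book>
</bib>
"""

def insert_author(root, title, new_name):
    """FOR each book WHERE title matches: UPDATE its author { INSERT <aname> }."""
    touched = 0
    for book in root.findall("book"):
        if book.findtext("title") == title:
            author = book.find("author")
            aname = ET.SubElement(author, "aname")  # appended as the last child
            aname.text = new_name
            touched += 1
    return touched

root = ET.fromstring(BIB_XML.strip())
touched = insert_author(root, "TCP/IP Illustrated", "Peter Naughton")
first_book_authors = [a.text for a in root.findall("book")[0].find("author")]
print(touched, first_book_authors)
```

Only the book whose title matches the WHERE predicate receives the new aname element; all other books are left unchanged.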


Correct Update Translatability

• No side effects
• One-step changes
  - Each database tuple is affected by at most one step of the update operation
  - Implications: there is no order between update operations; the same table could be affected several times
• Minimal changes
  - No valid translation is a proper subset of the current translation
  - No extraneous updates
• Replacements cannot be simplified
  - If two replacements give the same result, pick the simpler one
  - Replace the minimum attribute set
• No insert-delete pairs
  - A replace is cheaper than an insert/delete pair

General Classification of XML View Update (XVUP)

Four dimensions of XVUP to be studied:

• Information Dimension: the amount of information available for XVUP
  e.g. Constraints, Keys, Virtual View Definition, Underlying Relation
• Modification Dimension: what modifications can the XVUP handle?
  e.g. Content (Insert/Delete/Replace/Move/Rename…), Schema
• Language Dimension: the form of the view
  e.g. Algebra, XML Query language, SPJ, Duplicate, Recursion, Aggregation
• Instance Dimension: requirements on the DB
  e.g. BCNF, 3NF, others?

General Classification of XVUP

[Figure: the four dimensions of XVUP and the values on each axis]
• Information: Virtual View Definition, Underlying Relation, RDB Schema, Integrity Constraints (Key, FK), Local Constraints (Not Null, domain)
• Modification: Deletion, Insertion, Replacement, Rename, Move, Set of each, Group Update, Schema Change
• Language: Duplicates, Aggregation, Recursion, Hierarchy Consistency, Key Exposition, Non-correlation predicate attribute exposition, Correlation predicate attribute exposition
• Instance: BCNF, 3NF, 2NF, 1NF

Why?

• Information dimension
  - Why Local Constraints? ---- Valid Update
    An update to an XML view is a valid update iff it never violates any XML semantic constraint.
  - Why Integrity Constraints? ---- Update Propagation
    Knowledge of the dependencies in the RDB keeps global integrity.
• Instance dimension
  - BCNF ---- preserves data dependencies
• Modification dimension
  - Content update (insertion/deletion/replacement)
• Only think about the language dimension!

XML Schema Graph (XSG)

- Records the hierarchical information of an XML view or XML document

[Figure: XSG of the bib example. Vertices: bib, book, bookid, title, author, aname, prices, source, currency, value, publisher, pname, location, review, and the attribute year. Edges carry cardinality labels such as (0,1), (1,1), (0,n), and (1,n).]

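Duplicate

• Duplicate: two vertices in the XSG are exposed from the same relational attribute.
• Partial updates touching duplicate elements are not translatable.
  Why? They cause ambiguity or inconsistency in the underlying relation.

[Figure: a view rooted at book with publisher and author sub-trees and the leaves title, aname, pname, and title. Both title vertices are exposed from Book/title; an update touches one of them.]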
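The duplicate rule can be checked mechanically once each view vertex is annotated with the relational attribute it is exposed from. The following Python sketch assumes such an annotation is available as a dictionary; the vertex paths and attribute names are hypothetical and only mirror the figure.

```python
from collections import defaultdict

# Hypothetical XSG annotation: view vertex (path) -> relational attribute it exposes.
EXPOSED_FROM = {
    "book/title": "Book.title",
    "book/publisher/pname": "Book.pname",
    "book/publisher/title": "Book.title",  # second exposure of Book.title
    "book/author/aname": "Author.aname",
}

def duplicate_groups(exposed_from):
    """Group vertices by source attribute; keep attributes exposed more than once."""
    by_attr = defaultdict(list)
    for vertex, attr in exposed_from.items():
        by_attr[attr].append(vertex)
    return {attr: sorted(vs) for attr, vs in by_attr.items() if len(vs) > 1}

def touches_duplicate(updated_vertices, exposed_from):
    """True if a partial update touches any vertex that is part of a duplicate group."""
    dup_vertices = {v for vs in duplicate_groups(exposed_from).values() for v in vs}
    return any(v in dup_vertices for v in updated_vertices)

groups = duplicate_groups(EXPOSED_FROM)
print(groups)
print(touches_duplicate(["book/title"], EXPOSED_FROM))
print(touches_duplicate(["book/author/aname"], EXPOSED_FROM))
```

A partial update for which `touches_duplicate` returns True would be rejected as not translatable under the rule stated on this slide.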

Exposition Features

• Key Exposition
  - The primary key of the underlying relation has to be exposed
  - Exception: automatically generated keys (implication: the user has the right to update it)
• Non-correlation predicate attribute exposition (select condition)
  - Variables involved in predicates have to be exposed
    e.g. $book/bookid = "98004": bookid has to be exposed in the view result
• Correlation predicate attribute exposition (join condition)
  - Variables involved in predicates have to be exposed
    e.g. $book/authorid = $author/id: authorid in the book table and id in the author table have to be exposed
• Complete Exposition
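As a rough illustration, the three exposition requirements can be combined into one check that reports which attributes a view still has to expose. This Python sketch assumes that complete exposition means all three requirements hold at once; the `ViewDef` structure and the attribute names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ViewDef:
    primary_keys: dict     # relation name -> set of primary key attributes
    auto_generated: set    # automatically generated key attributes (may stay hidden)
    predicate_attrs: set   # attributes used in select or join predicates
    exposed: set           # attributes returned in the view result

def missing_expositions(view):
    """Attributes that must be exposed but are not part of the view result."""
    required = set(view.predicate_attrs)
    for keys in view.primary_keys.values():
        required |= keys - view.auto_generated
    return required - view.exposed

def is_exposition_complete(view):
    return not missing_expositions(view)

# Join view over book and author with predicate $book/authorid = $author/id.
view = ViewDef(
    primary_keys={"book": {"book.bookid"}, "author": {"author.id"}},
    auto_generated=set(),
    predicate_attrs={"book.authorid", "author.id"},
    exposed={"book.bookid", "book.title", "author.id", "author.name"},
)
missing = missing_expositions(view)
print(missing, is_exposition_complete(view))
```

Here the join attribute book.authorid is used in the predicate but not returned by the view, so the exposition is incomplete.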

Hierarchy Consistency

• Why? The flexibility in constructing a view could go against the RDB.
• Hierarchy in relational semantics:
  - Table vs. Attribute
  - Key vs. Foreign-Key

    BOOK(bookid, title, year, pname, location, review)
    AUTHOR(bookid, authorid, name)

    Then book is the parent of all its attributes, and book is an ancestor of author.
  - ID pairs (recursive table like an edge table) ???

    source  position  name    target
    1.0     1.0       book    6.0
    6.0     1.0       bookid  98003

Note: Same implication with default XML view generation

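Hierarchy Consistency

• Transitivity holds: if Ai is an ancestor of Aj and Aj is an ancestor of Ak, then Ai is an ancestor of Ak.
• Consistent edge in the XSG: the edge has the same ancestor-descendant relationship as in the underlying relation.

[Figure: a relation R with its PK, UK, and NK attributes]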

Hierarchy Consistency

View query:

<BIB>
  FOR $book IN document("default.xml")/books/Row,
      $author IN document("default.xml")/author/Row
  WHERE $book/Author_IID = $author/PID
  RETURN
    <author>
      $author/aname,
      <book> $book/bookid, $book/title </book>
    </author>
</BIB>

View result:

<BIB>
  <author>
    <name>David Sklansky</name>
    <book>
      <bookid>98001</bookid>
      <title>TCP/IP Illustrated</title>
    </book>
  </author>
  ……
</BIB>

[Figure: XSG of the view. BIB has the child AUTHOR; AUTHOR has the children ANAME (from Author/aname) and BOOK; BOOK has the children BOOKID (from Book/bookid) and TITLE (from Book/title). The figure marks an inconsistent edge and labels the part below it as un-updatable.]

Hierarchy Consistency

• Consistent construction: a view is a consistent construction iff all edges in its XSG are consistent edges.
• Inconsistent construction: a view is an inconsistent construction if it contains an inconsistent edge.
• Translatability: no update on the sub-tree rooted at an inconsistent edge is translatable.
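One way to test edges for consistency is to compute the ancestor relation implied by the relational schema (table vs. attribute, key vs. foreign key), close it under transitivity, and compare every XSG edge against it. The sketch below is a hedged Python illustration; naming each view vertex by its relational source and the concrete pairs in `REL_HIERARCHY` are assumptions based on the BOOK/AUTHOR example.

```python
def transitive_closure(pairs):
    """Close a set of (ancestor, descendant) pairs under transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Assumed relational hierarchy: a table is the parent of its attributes,
# and BOOK is an ancestor of AUTHOR through the key / foreign-key link.
REL_HIERARCHY = transitive_closure({
    ("BOOK", "BOOK.bookid"),
    ("BOOK", "BOOK.title"),
    ("AUTHOR", "AUTHOR.aname"),
    ("BOOK", "AUTHOR"),
})

# XSG edges of the author-over-book view, each vertex named by its relational source.
VIEW_EDGES = [
    ("AUTHOR", "AUTHOR.aname"),
    ("AUTHOR", "BOOK"),
    ("BOOK", "BOOK.bookid"),
    ("BOOK", "BOOK.title"),
]

def inconsistent_edges(view_edges, rel_hierarchy):
    """Edges whose parent is not an ancestor of the child in the relational hierarchy."""
    return [edge for edge in view_edges if edge not in rel_hierarchy]

bad = inconsistent_edges(VIEW_EDGES, REL_HIERARCHY)
print(bad)
```

With these assumptions the only inconsistent edge is the one that nests BOOK under AUTHOR, so updates on the sub-tree below it would not be translatable.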

Assumption & Case Study

• General assumptions
  - The RDB has no cyclic dependencies
  - No order issue
• Typical view update problems
  - Round-Trip Update Problem (RUP)
  - Semi-structured Update Problem (SUP)

[Figure: XML Doc + Schema, RDBMS, View Query, and XML View, annotated with (1) RUP: View = Schema and (2) SUP: View ≠ Schema]


Semi-structured Update Problem (SUP)

• Update Translatability

Exposition complete   Consistency   Duplication   Complete Update   Partial Update
Y                     Y             Y             Y                 Case 1
Y                     Y             N             Y                 Y
Y                     N             Y             Y                 Case 2
Y                     N             N             Y                 Case 3
N                     Y             Y             N                 N
N                     Y             N             N                 N
N                     N             Y             N                 N
N                     N             N             N                 N

Semi-structured Update Problem (SUP)

• Case 1: Complete Exposition + Consistent + Duplication
  - A partial update that touches a duplicate is not translatable
• Case 2: Complete Exposition + Inconsistent + No Duplication
  - The sub-tree rooted at an inconsistent edge is not updatable
• Case 3: Complete Exposition + Inconsistent + Duplication
  - Same as Case 2 for the inconsistent part
  - A partial update that touches a duplicate is not translatable


Round-Trip Update Problem

Loading Features

• Structure Preserving (hierarchy information of XML)
  - Complete Structure Loading ---- each edge e(v1,v2) in the XSG is mapped to a hierarchical relationship defined in relational semantics.
  - Lossless Structure Loading ---- the XML view can be reconstructed with the same structure information as the original XML document.

[Figure: Complete Structure Loading and Lossless Structure Loading]

Complete Structure Loading for the example XML schema (Basic Inline)

BIB(IID, PID, BOOK)
BOOK(IID, PID, BOOKID, TITLE, AUTHOR_IID, PUBLISHER_IID, YEAR)
AUTHOR(IID, PID, ANAME)
PRICE(IID, PID, SOURCE, CURRENCY, VALUE)
PUBLISHER(IID, PID, PNAME, LOCATION)

Lossless Structure Loading for the example XML schema (Shared Inline)

BIB(IID, PID, BOOK)
BOOK(IID, PID, BOOKID, TITLE, AUTHOR_IID, PNAME, LOCATION, YEAR)
AUTHOR(IID, PID, ANAME)
PRICE(IID, PID, SOURCE, CURRENCY, VALUE)

Round-Trip Update Problem

• Semantic Preserving (constraint information of XML)
  - Five kinds of constraints:
    · Domain constraints
    · Not-null constraints
    · Key constraints
    · Cardinality constraints
        (0,1) at most one ---- NULL + UNIQUE
        (0,n) any ---- e.g. separate table / overflow table
        (1,1) exactly one ---- NOT NULL + UNIQUE
        (1,n) at least one ---- NOT NULL
    · Inclusion dependency (IDREF)
        Keep as duplicate
        Separate table with K-FK connection
  - Complete Semantic Loading ---- keep all semantic constraints in the RDB schema
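The cardinality mapping listed on this slide is small enough to write down as a lookup. The following Python sketch is only an illustration of that table; the function name and the use of the string "n" for an unbounded maximum are assumptions.

```python
def column_constraints(min_occurs, max_occurs):
    """Map an XSG cardinality (min, max) to the relational constraints from the slide.

    max_occurs is 1 or the string "n" (unbounded).
    """
    mapping = {
        (0, 1): ["NULL", "UNIQUE"],        # at most one
        (0, "n"): ["SEPARATE TABLE"],      # any number, e.g. separate or overflow table
        (1, 1): ["NOT NULL", "UNIQUE"],    # exactly one
        (1, "n"): ["NOT NULL"],            # at least one
    }
    try:
        return mapping[(min_occurs, max_occurs)]
    except KeyError:
        raise ValueError(f"unsupported cardinality: ({min_occurs},{max_occurs})")

for card in [(0, 1), (0, "n"), (1, 1), (1, "n")]:
    print(card, column_constraints(*card))
```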

Round-Trip Update Problem

• Loading strategy features for RUP: Lossless Structure Loading + Complete Semantic Loading
• Update translatability: any valid update is translatable in RUP


System Architecture

[Figure: XVUP system architecture. Components: Parser, View Analyser, Valid Update Checker, Translatability Checker, Update Decomposer, Translator, Update Propagation, Execution Engine. Inputs and outputs: View, XQuery, SQL Update, DB Trigger.]


Fundamentals ---- Connections

We divide foreign keys into three types:

                  Ownership connection   Subset connection   Referencing connection
X1 & X2           X1 = NK/PK(R1),        X1 = NK/PK(R1),     X1 = NK/PK(R1),
                  X2 ⊂ PK(R2)            X2 = PK(R2)         X2 ⊆ NK/UK(R2)
Cardinality       1:n                    1:[0,1]             1:n
Representation    R1, R2                 R1, R2              R1, R2

[Figure: a relation R with its inner-going and outer-going connections]

Fundamentals ---- Ownership Connection

• R1 is the owner of R2 if:

(a) Every tuple in R2 must be connected to an owning tuple in R1.

(b) Deletion of an owning tuple in R1 requires deletion of all tuples connected to that tuple in R2.

(c) Modification of X1 in an owning tuple of R1 requires either
  - propagation of the modification to the matching attributes X2 of all owned tuples in R2, or
  - deletion of those tuples.

Fundamentals ---- Reference Connection

• R1 is referencing R2 if:

(a) Every tuple in R2 must either be connected to a referenced tuple in R1 or have a null value for X1 (the latter is allowed only when X1 belongs to NK(R1)).

(b) Deletion of a tuple in R1 requires either
  - deletion of its referencing tuples in R2, or
  - assignment of null values to attributes X2 of all the referencing tuples in R2.

(c) Modification of X1 in a referenced tuple of R1 requires one of
  - propagation of the modification to attributes X2 of all referencing tuples in R2
  - assignment of null values to attributes X2 of all referencing tuples in R2 (unique + NULL)
  - deletion of those tuples (unique + NOT NULL)

Fundamentals ---- Subset Connection

• R1 and R2 are in a subset connection if:

(a) Every tuple in R2 must be connected to one tuple in R1.

(b) Deletion of a tuple in R1 requires deletion of the connected tuple in R2 (if the latter exists).

(c) Modification of X1 in a tuple of R1 requires either
  - propagation of the modification to attributes X2 of its connected tuple in R2, or
  - deletion of the R1 tuple (reject update).

XML View Mapping Graph (VMG)

The graph G(V,E) is defined as follows:

• Nodes:
  - Core Relation: relations underlying the view
  - Extended Relation: relations connected with a Core Relation by FK
  - Involved Relation: relations connected with an Extended Relation or another Involved Relation by FK
• Edges: connections between two relation nodes

From DAG to Set-Tree

Observation:
  - DAG: no recursion in the view
  - Set of trees: obtained by replicating the subtrees rooted at vertices that have multiple incoming edges

XML View Mapping Graph (VMG)

VMG for Basic Inline Loading Strategy:

BIB(IID, PID, BOOK)
BOOK(IID, PID, BOOKID, TITLE, AUTHOR_IID, PUBLISHER_IID, YEAR)
AUTHOR(IID, PID, ANAME)
PRICE(IID, PID, SOURCE, CURRENCY, VALUE)
PUBLISHER(IID, PID, PNAME, LOCATION)

Fundamentals ---- Pivot Relation

• Pivot Relation (PR):
  - Is a Core Relation
  - Its key is exposed in the view
  - Is not included in another tree rooted at a pivot relation
  - Has no out-going ownership/subset connections to other Core Relations
• Implication:
  - It is the start point of a sub-tree of the VMG

Example for PR

[Figure: three copies of the VMG with the nodes Bib, Book, Price, Publisher, and Author, marking which relations are pivot relations]

Dependency Island / Referencing Peninsula

• Dependency Island (DI) of root relation R
  - Rooted at R
  - Maximal sub-tree
  - Includes all inner-going ownership and subset connections of R
• Referencing Peninsula (RP) of root relation R
  - A relation Rj
  - Directly connected to a relation Rk of the dependency island via a reference connection between Rk and Rj

Referenced Continent

• Referenced Continent (RC) of root relation R
  - Rooted at R
  - Maximal sub-tree
  - Includes all outer-going ownership / subset / reference connections of R

[Figure: a relation R with its DI, RP, and RC]

Step 1: View Analyser

analysisVMG() {
    VMG = new VMG(V, E)
    for each underlying relation Ri {
        put Ri into V as a node of the VMG
        for each relation Rj in DB {
            if (foreign key from Rj -> Ri) {
                identifyConnectionType(Ri, Rj)
                put edge e(Rj -> Ri) with connection type (o/s/r) into E
            } else if (foreign key from Ri -> Rj) {
                identifyConnectionType(Ri, Rj)
                put edge e(Ri -> Rj) with connection type (o/s/r) into E
            }
        }
    }
    return VMG
}
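A runnable approximation of the view analyser is sketched here in Python. It assumes the connection type of every foreign key is already known (the slides derive it with identifyConnectionType), it adds the relation at the other end of each foreign key as a node, and it only goes one foreign-key step away from the underlying relations. The `ForeignKey` class and the connection types in the example are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ForeignKey:
    child: str   # relation that holds the foreign key
    parent: str  # relation the foreign key points to
    kind: str    # 'o' ownership, 's' subset, 'r' reference (assumed to be given)

def analysis_vmg(underlying, foreign_keys):
    """Build (nodes, edges) of a view mapping graph for the underlying relations."""
    nodes = set(underlying)
    edges = set()
    for ri in underlying:
        for fk in foreign_keys:
            # A foreign key into or out of Ri yields one edge child -> parent.
            if ri in (fk.child, fk.parent):
                nodes.update((fk.child, fk.parent))
                edges.add((fk.child, fk.parent, fk.kind))
    return nodes, edges

# Foreign keys of the Basic Inline schema; the connection types are illustrative guesses.
FKS = [
    ForeignKey("BOOK", "BIB", "o"),
    ForeignKey("PRICE", "BOOK", "o"),
    ForeignKey("BOOK", "AUTHOR", "r"),
    ForeignKey("BOOK", "PUBLISHER", "r"),
]

nodes, edges = analysis_vmg(["BOOK"], FKS)
print(sorted(nodes))
print(sorted(edges))
```

Collecting Involved Relations as well would require repeating the inner loop from the newly added nodes until no new relation is found.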

Step 2: Update Translatability Checking

updateTranslatabilityChecking(XAT_tree, update, VMG) {
    if (expositionCompletenessChecking(VMG)) {
        if (completeUpdate(update))
            return true;
        // partial update: both the duplication check and the consistency check must pass
        if (duplicationChecking(VMG, update) && expositionConsistencyChecking(VMG))
            return true;
    }
    return false;
}
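The decision logic of this step can be exercised on its own if the three view properties are supplied as plain booleans. The Python sketch below makes that assumption; the `UpdateFacts` structure is hypothetical, and in the real system these values would come from the exposition, duplication, and consistency checks.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class UpdateFacts:
    exposition_complete: bool      # are key and predicate attributes all exposed?
    touches_duplicate: bool        # does the update touch a duplicated vertex?
    under_inconsistent_edge: bool  # is it inside a sub-tree rooted at an inconsistent edge?

def is_translatable(facts, complete_update):
    """Mirror of the checking order: exposition, then complete vs. partial update."""
    if not facts.exposition_complete:
        return False
    if complete_update:
        return True
    return not facts.touches_duplicate and not facts.under_inconsistent_edge

# Enumerate all combinations for a partial update.
for expo, dup, incons in product([True, False], repeat=3):
    facts = UpdateFacts(expo, dup, incons)
    print(expo, dup, incons, is_translatable(facts, complete_update=False))
```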

Step 3: Update Decomposition

updateDecomposition(XAT_tree) {
    XAT_leaves = all leaf nodes of XAT_tree
    resultUpdate = array of RelationalUpdate, one entry per leaf
    for each index i of XAT_leaves {
        node = i-th leaf of XAT_leaves
        update = i-th entry of resultUpdate
        set the update type of update by looking at the root of XAT_tree
        while node != null {
            update = opUpdateDecomp(node, update)
            node = parent of node
        }
    }
    distinctResultUpdate();
}

[Figure: example XAT tree with Tagger, Join, Nav, and Source nodes]

opUpdateDecomp(XAT_node, update) {
    if XAT_node is a Navigate node
        update.tableName = table name from the node, if it has one
    else if XAT_node is a Select node or a Join node
        add the condition to the whereClause of update
        break any complex binary conditions into simple binary conditions and store them,
        to be referred to while extending tuples for insert and replace updates
    else if XAT_node is a Tagger node
        if the type of update is Delete
            extract names of attributes from the tagger pattern
            fill in the updateColumn vector of update
        if the type of update is Insert
            if the DOM pattern of the element represented by the tagger matches
               the DOM pattern of the element to be inserted
                extract names of attributes from the tagger pattern
                extract values of attributes from the pattern of the element to be inserted
                fill in the updateColumn vector of update
        if the type of update is Replace
            if the DOM pattern of the element represented by the tagger matches
               the DOM pattern of the replacing element
                extract names of attributes from the tagger pattern
                extract old values by querying the relational database
                extract new values from the pattern of the replacing element
                fill in the updateColumn vector of update
    else
        do nothing
    return update
}

Step 4: Delete Propagation

Delete tuple t from relation R

• Algorithm (Steps 1-3):
  - Isolate the Dependency Island (DI) of R
  - Delete the matching t from R
  - Identify the Referencing Peninsulas (RP)
  - Replace the foreign key of the matching tuple in each peninsula; otherwise delete the corresponding tuple
• Global Integrity Maintenance (Step 4), for each relation involved in deletions:
  - Delete propagation in its DI, repeatedly if necessary
  - Foreign key replacement in its RP

[Figure: VMG for Basic Inline Loading Strategy (relations BIB, BOOK, AUTHOR, PRICE, PUBLISHER), annotated with "delete" and "Delete/Replace" labels that show where a deletion propagates]
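The delete propagation rules can be tried out on a small in-memory database. The Python sketch below is an illustration under several assumptions: relations are lists of dictionaries, every connection is described by a hypothetical record with the fields `r1`, `x1`, `r2`, `x2`, `kind`, and `nullable`, and the schema has no cyclic dependencies (as the general assumption of the slides states), so the recursion terminates.

```python
def delete_tuple(db, connections, relation, predicate):
    """Delete matching tuples from `relation` and propagate along its connections."""
    doomed = [row for row in db[relation] if predicate(row)]
    db[relation] = [row for row in db[relation] if not predicate(row)]
    for row in doomed:
        for c in connections:
            if c["r1"] != relation:
                continue
            key = row[c["x1"]]
            if key is None:
                continue
            match = lambda r, c=c, key=key: r[c["x2"]] == key
            if c["kind"] in ("ownership", "subset") or not c["nullable"]:
                # Dependency island (or non-nullable reference): delete connected tuples.
                delete_tuple(db, connections, c["r2"], match)
            else:
                # Referencing peninsula: replace the foreign key by null.
                for r in db[c["r2"]]:
                    if match(r):
                        r[c["x2"]] = None
    return len(doomed)

db = {
    "AUTHOR": [{"IID": 10, "ANAME": "W. Stevens"}],
    "BOOK": [{"IID": 1, "TITLE": "TCP/IP Illustrated", "AUTHOR_IID": 10}],
    "PRICE": [{"PID": 1, "VALUE": 65.95}, {"PID": 1, "VALUE": 64.75}],
}
# Connection types are illustrative guesses for the Basic Inline schema.
connections = [
    {"r1": "BOOK", "x1": "IID", "r2": "PRICE", "x2": "PID",
     "kind": "ownership", "nullable": False},
    {"r1": "AUTHOR", "x1": "IID", "r2": "BOOK", "x2": "AUTHOR_IID",
     "kind": "reference", "nullable": True},
]

deleted_authors = delete_tuple(db, connections, "AUTHOR", lambda r: r["IID"] == 10)
author_fk_after = db["BOOK"][0]["AUTHOR_IID"]
deleted_books = delete_tuple(db, connections, "BOOK", lambda r: r["IID"] == 1)
print(deleted_authors, author_fk_after, deleted_books, db)
```

Deleting the author only nulls the foreign key in BOOK, while deleting the book removes its owned PRICE tuples as well.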

Step 4: Insertion Propagation

Insert tuple t into the relation R

• Algorithm
  - Extend the view tuple with values for the attributes that have been exposed in the view definition
  - If the new tuple is already present in the instance, reject the update
  - Otherwise, perform an insertion in the underlying database relation
• Global Integrity Maintenance:
  - Insertion check for the RC: if the referenced tuple is there, do nothing; else roll back and reject the insertion
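A compact way to picture insertion propagation is a function that extends the view tuple, rejects duplicates, and checks the referenced relations before inserting. This Python sketch rests on assumptions: presence of a tuple is decided by its key value, the extension values are supplied by the caller, and the integrity check runs before the insert instead of rolling back afterwards. All names are hypothetical.

```python
class InsertRejected(Exception):
    """Raised when an insertion through the view cannot be applied."""

def insert_tuple(db, relation, key, view_tuple, extension, references):
    """Insert `view_tuple` into `relation` after extending and checking it.

    references maps a foreign key attribute to (target relation, target key attribute).
    """
    row = {**extension, **view_tuple}  # extend the view tuple with extra attribute values
    if any(existing[key] == row[key] for existing in db[relation]):
        raise InsertRejected("tuple already present")
    for fk_attr, (target, target_key) in references.items():
        value = row.get(fk_attr)
        if value is not None and not any(t[target_key] == value for t in db[target]):
            raise InsertRejected(f"no tuple in {target} with {target_key}={value}")
    db[relation].append(row)
    return row

db = {
    "AUTHOR": [{"IID": 10, "ANAME": "W. Stevens"}],
    "BOOK": [{"IID": 1, "TITLE": "TCP/IP Illustrated", "AUTHOR_IID": 10}],
}
refs = {"AUTHOR_IID": ("AUTHOR", "IID")}

inserted = insert_tuple(db, "BOOK", "IID",
                        {"IID": 2, "TITLE": "Data on the Web"},
                        {"AUTHOR_IID": 10}, refs)
print(inserted, len(db["BOOK"]))
```

Because both checks run before the append, a rejected insertion leaves the database untouched, which has the same visible effect as the rollback described on the slide.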

Step 5: Update Translation

updateTranslation(resultUpdate) {
    update = first of resultUpdate
    while update != null {
        updateType(update)
        formatWhereConditions
        formatOtherConditions
        update = next of resultUpdate
    }
}

updateType(update) {
    if update-type is replace
        do nothing
    else if update-type is delete or insertion
        if update-columns include all attributes of the table
            do nothing
        else
            update-type = replace
}
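The updateType rule reduces to a few lines of code: a delete or insert that covers only some attributes of a table is turned into a replace. The Python sketch below illustrates just that rule; the function name and the column lists are assumptions.

```python
def normalize_update_type(update_type, update_columns, table_columns):
    """Turn a partial delete or insert into a replace, as in updateType."""
    if update_type == "replace":
        return "replace"
    if update_type in ("delete", "insert"):
        covers_all = set(table_columns) <= set(update_columns)
        return update_type if covers_all else "replace"
    raise ValueError(f"unknown update type: {update_type}")

AUTHOR_COLUMNS = ["IID", "PID", "ANAME"]

full_delete = normalize_update_type("delete", ["IID", "PID", "ANAME"], AUTHOR_COLUMNS)
partial_delete = normalize_update_type("delete", ["ANAME"], AUTHOR_COLUMNS)
print(full_delete, partial_delete)
```

A partial delete therefore ends up as an update of the affected columns of an existing tuple, which is in line with the "no insert-delete pairs" criterion stated earlier.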

General Classification of XVUP

[Figure: the four-dimension classification diagram (Information, Modification, Language, Instance), repeated from the Problem Definition part]
