xml in relational dbcstyng/webdb.07/lectures/lesson5.pdfxml data storage - relational db 5...

44
XML Data Storage - Relational DB 1 XML in Relational DB

Upload: others

Post on 12-Mar-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 1

XML in Relational DB

Page 2: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 2

A Sample File<SigmodRecord>

<issue><volume>11</volume><number>1</number><articles>

<article><title>Annotated Bibliography on Data Design.</title><initPage>45</initPage><endPage>77</endPage><authors>

<author position="00">Anthony I. Wasserman</author><author position="01">Karen Botnich</author>

</authors></article>

</articles></issue></SigmodRecord>

Page 3: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 3

Corresponding Document TreeRoot

1

2

7 8 9

42

208207 209 210

10481047

5232 5237Anthony I

Wasserman

11 1

Annotated Bibliographyon Data Design. 45 77

KarenBotnich

SigmodRecord

issue

number articlesvolume

article

endPage authorsinitPagetitle

author author

@position@position

00 01

element node

text node

attribute node

Page 4: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 4

Two Mapping Approaches

• Structure-mapping approach– static DTD– well-defined semantic constraints

• Model-mapping approach– dynamic DTD– ambiguous semantic constraints

Page 5: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 5

Structure-mapping approach

• Design database schema based on DTD

• Different schemas for different DTDs

• Slides partially from a VLDB 2002 Tutorial

Relational DB

DTD

XML Doc

XMLQuery

XML

Page 6: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 6

A Sample DTD<!ELEMENT book (booktitle, author)<!ELEMENT booktitle (#PCDATA)><!ELEMENT author (name, address)>

<!ATTLIST author id ID #REQUIRED>

<!ELEMENT name (firstname?, lastname)><!ELEMENT firstname (#PCDATA)><!ELEMENT lastname (#PCDATA)><!ELEMENT address ANY><!ELEMENT article (title, author*, contactauthor?)><!ELEMENT title (#PCDATA)><!ELEMENT contactauthor EMPTY>

<!ATTLIST contactauthor authorID IDREF IMPLIED><!ELEMENT monograph (title, author, editor)><!ELEMENT editor (monograph*)>

<!ATTLIST editor name CDATA #REQUIRED>

Page 7: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 7

A DTD Graphbook

booktitle

author

monograph

title

contactauthor

authorID

editor

*

nameaddress

?

firstname lastname

?

authorid

article

*

name

12 elements

Page 8: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 8

Basic Inlining• An element graph is created for each element in the DTD

graph• Relations created for:

– Root of an element graph– Set value children (under a *)– Nodes with a back pointer (recursion)

• All other descendants inlined• Every element is made into as comprehensive a relation as

possible• Set-valued attributes are handled with separate relations

and foreign keys

Page 9: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 9

Basic Inlining• An element graph is constructed by

– Do a DFS traversal of DTD graph, starting at element node for which we are constructing relations

– Each node is marked as “visited” the first time it is reached and is unmarked once all its children have been traversed

– If an unmarked node in DTD graph is reach during DFS, a new node bearing the same name is created in the element graph

– A regular edge is created from the most recently created node in the element graph with the same names as the DFS parent of the current DTD node to newly created node

– If an attempt is made to traverse an already marked DTD, then a backpointer edge is added from the most recently created node in the element graph to the most recently created node in the element graph of the same name as the marked DTD node

editor

name

Page 10: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 10

Basic Inlining

• A relation is created for root of element of graph

• All element’s descendents are inlined into that relation except– Children below a “*” node

are made into separate relations – this corresponds to creating a new relation for a set-valued child

– Each node having a backpointer edge pointing to it is made into a separate relation

Page 11: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 11

Basic: Relational Schemabook (bookID: integer, book.booktitle : string, book.author.name.firstname: string, book.author.name.lastname: string,

book.author.address: string, author.authorid: string)booktitle (booktitleID: integer, booktitle: string)article (articleID: integer, article.contactauthor.authorid: string, article.title: string)article.author (article.authorID: integer, article.author.parentID: integer, article.author.name.firstname: string,

article.author.name.lastname: string, article.author.address: string, article.author.authorid: string)contactauthor (contactauthorID: integer, contactauthor.authorid: string)title (titleID: integer, title: string)monograph (monographID: integer, monograph.parentID: integer, monograph.title: string,

monograph.editor.name: string, monograph.author.name.firstname: string, monograph.author.name.lastname: string, monograph.author.address: string, monograph.author.authorid: string)

editor (editorID: integer, editor.parentID: integer, editor.name: string)editor.monograph (editor.monographID: integer, editor.monograph.parentID: integer, editor.monograph.title: string,

editor.monograph.author.name.firstname: string, editor.monograph.author.name.lastname: string, editor.monograph.author.address: string, editor.monograph.author.authorid: string)

author (authorID: integer, author.name.firstname: string, author.name.lastname: string, author.address: string,aauthor.authorid: string)

name (nameID: integer, name.firstname: string, name.lastname: string)firstname (firstnameID: integer, firstname: string)lastname (lastnameID: integer, lastname: string)address (addressID: integer, address: string)

Page 12: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 12

Fragmentation Problem<!ELEMENT author (name, address)><!ATTLIST author id ID #REQUIRED><!ELEMENT name (firstname?, lastname)>

<!ELEMENT firstname (#PCDATA)>

<!ELEMENT lastname (#PCDATA)>

<!ELEMENT address ANY>

author (authorID: integer, id: string)

name (nameID: integer, authorID: integer)

firstname (firstnameID: integer, nameID: integer, value: string)

lastname (lastnameID: integer, nameID: integer, value: string)

address (addressID: integer, authorID: integer, value: string)

Each element is a table

Page 13: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 13

Problems in Basic In-lining• Creating a large amount of relations

– Separate relational schema for each element as root (minor)

– Unrolling recursive strongly connected components (major)

• Consuming great storage space when where are many null able(optional) attributes

• Too many joins– E.g. author becomes scattered (i.e., it appears

once for each possible instantiation from a root). Query: list all the books authored by Louis

Page 14: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 14

Complexity of DTDs• DTD element specifications can be of

arbitrary complexity• <!ELEMENT a ((b|c|e)?,(e?|(f?,(b,b)*))*)> is valid!• Fortunately, can simplify DTD for

translation purposes:– Key observations: not necessary to regenerate

DTD from relational schema– XML query languages do not query over

complex structures of elements

Page 15: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 15

Simplification of DTDs

(e1, e2)* e1*, e2*(e1, e2)? e1?, e2?(e1|e2) e1?, e2?

e1** e1*e1*? e1*e1?* e1*e1?? e1?

..., a*, ..., a*, ... a*, ...

..., a*, ..., a?, ... a*, ...

..., a?, ..., a*, ... a*, ...

..., a?, ..., a?, ... a*, ……, ...a, …, a, … a*, …

Flattening Transformations Simplification Transformations

Grouping Transformations

<!ELEMENT a ((b|c|e)?,(e?|(f?,(b,b)*))*)>

<!ELEMENT a (b*, c?, e*, f*)>

Page 16: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 16

Shared Inlining• Idea

– Inline as many sub-elements as possible– However, to avoid scattering, do not inline an

element if it is shared– I.e. do not inline only if it is a shared, recursive or set

sub-element.

• Approach– Necessary and Sufficient Condition for shared/

recursive element: e.g. authorIn-degree >= 2 in DTD graph

Page 17: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 17

Shared Inlining• Technique:

– Node with in-degree 1 in the DTD graph: inline

• Special cases: * and recursion– Node with in-degree 0: create a separate

relation, because it cannot be inlined– Node with in-degree greater than 1: create

a separate relation, because it is shared

Page 18: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 18

Issues with Sharing Elements

• Parent of elements not fixed at schema level– Isroot: boolean variable

• Need to store type and ids of parents (or if there are no parents)– parentCODE field (type of parent)– parentID field (id of parent)– Not foreign key relationship

Page 19: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 19

Shared: Relational Schemabook (bookID: integer, book.booktitle.isroot: boolean, book.booktitle : string)article (articleID: integer, article.contactauthor.isroot: boolean,

article.contactauthor.authorid: string)monograph (monographID: integer, monograph.parentID: integer,

monograph.parentCODE: integer, monograph.editor.isroot: boolean,monograph.editor.name: string)

title (titleID: integer, title.parentID: integer, title.parentCODE: integer, title: string)author (authorID: integer, author.parentID: integer, author.parentCODE: integer,

author.name.isroot: boolean, author.name.firstname.isroot: :boolean,author.name.firstname: string, author.name.lastname.isroot: boolean,author.name.lastname: string, author.address.isroot: boolean,author.address: string, author.authorid: string)

Page 20: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 20

Shared Inlining

• Advantages– Reduces number of joins for queries like “get

the first and last names of an author”– Efficient for queries such as “list all authors

with name Jack”• Problems

– Sharing whenever possible implies extra joins for path expressions. E.g. “Article with a given title name”

Page 21: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 21

Converting XML Queries to SQL

• Converting queries with simple path expressions– Relation(s) corresponding

to start of root path expression(s) are identified and added to from clause of SQL query

– If necessary, path expressions are translated to joins among relations

Page 22: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 22

Converting Relational Results to XML

• Simple Structuring– Constructing such

results from a RDBMS is natural and efficient

– Requires attaching appropriate tags for each tuple

Page 23: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 23

Model-mapping approach

• Fixed database schema for different XML documents without the support of DTD– Edge (Florescu and Kossmann, 1999)– XRel (YoshiKawa, M. and Amagasa, T., 2001)– XParent (H. Jiang et al., 2002)– iNode (Lau & Ng, 2002)

• Support well-formed but non-DTD XML documents

Page 24: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 24

Edge

• One-table schema– Edge(Source, Ordinal, Target, Label, Flag, Value)

• Each row corresponds to one edge in data graph

• Require table joins for edge connections• Cannot determine the parent-child

relationship directly

Page 25: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 25

The Document TreeRoot

1

2

7 8 9

42

208207 209 210

10481047

5232 5237Anthony I

Wasserman

11 1

Annotated Bibliographyon Data Design. 45 77

KarenBotnich

SigmodRecord

issue

number articlesvolume

article

endPage authorsinitPagetitle

author author

@position@position

00 01

element node

text node

attribute node

Page 26: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 26

EdgeSrc Ord Tgt Label Flag Value0 1 1 SigmodRecord ref1 1 2 issue ref2 1 3 volume val 112 2 4 number val 12 3 5 articles ref5 1 6 article ref6 1 7 title val Annotated ….6 2 8 initPage val 456 3 9 endPage val 776 4 10 authors ref10 1 11 author val Anthony I …11 1 12 @position val 0010 2 13 author val Karen Botnich13 1 14 @position val 01

Page 27: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 27

XParent

• XParent– uses LabelPath and DataPath tables to maintain

the structural information of the XML documents.

– The DataPath table is also used to keep the parent-child relationship instead of using region as in XRel.

Page 28: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 28

XParent

• Four-table schema– LabelPath(PathID, Len, Path)– DataPath(Pid, Cid)– Element(PathID, Ordinal, Did)– Data(PathID, Did, Ordinal, Value)

• Determine the parent-child relationship by using equijoin

Page 29: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 29

The Document TreeRoot

1

2

7 8 9

42

208207 209 210

10481047

5232 5237Anthony I

Wasserman

11 1

Annotated Bibliographyon Data Design. 45 77

KarenBotnich

SigmodRecord

issue

number articlesvolume

article

endPage authorsinitPagetitle

author author

@position@position

00 01

element node

text node

attribute node

Page 30: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 30

XParentPID PathLen PathExp1 1 ./SigmodRecord2 2 ./SigmodRecord./issue3 3 ./SigmodRecord./issue./volume4 3 ./SigmodRecord./issue./number5 3 ./SigmodRecord./issue./articles6 4 ./SigmodRecord./issue./articles./article7 5 ./SigmodRecord./issue./articles./article./title8 5 ./SigmodRecord./issue./articles./article./initPage9 5 ./SigmodRecord./issue./articles./article./endPage10 5 ./SigmodRecord./issue./articles./article./authors11 6 ./SigmodRecord./issue./articles./article./authors./author12 7 ./SigmodRecord./issue./articles./article./authors./author./@position PID CID

1 22 32 42 55 65 156 76 86 96 1010 1110 1210 1310 14

LabelPath Table

DataPath Table

Page 31: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 31

XParentPathID Ordinal Did1 1 12 1 23 1 34 1 45 1 56 1 67 1 78 1 89 1 910 1 1011 1 1112 1 1211 2 1312 1 14 PathID Did Ordinal Value

3 3 1 114 4 1 17 7 1 Annotated Bibliography on Data Design.8 8 1 459 9 1 7712 12 1 0011 11 1 Anthony I. Wasserman12 14 1 0111 13 2 Karen Botnich

Element Table

Data Table

Page 32: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 32

INode

• Model-mapping approach• Node-oriented approach• Based on a node numbering scheme• Using the virtual node concept to calculate

the id for each node (Lee et al., 1996)

Page 33: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 33

Key Features

1. Does not need to concatenate the edges to form a simple path for query processing

2. Reduces the database size by using a table to store all simple path expressions

3. The parent-child relationship is embedded in the NodeID

4. Retrieve the parent-child relationship by calculation instead of using equijoins orθ–joins

Page 34: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 34

INode Numbering Scheme (1/4)

• Given a node n, denote the path of n asa1, a2, a3, …, akwhere k is the depth of node n

and 1 ≤ ai ≤ nc

1

32

1198 10 12 13

4

765

1,1 1,2 1,3

1,1,1

1( )⎥⎦⎥

⎢⎣⎢ +

−= 12)(

kiiParent

real node virtual node

Page 35: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 35

INode Numbering Scheme (2/4)

• Based on the Parent(i) function, we have a path function as follow:

⎪⎪⎩

⎪⎪⎨

>++−

≤=

∑ ∑

=

=−

−−

=

2,1

2,)( 2

1

3

0

22

2

1

kannan

kapathf

i

k

iik

ic

kci

kc

ii

Where nc = maximum number of child nodes for a node

k = path length of the node

ai = value in the path at the position i

Page 36: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 36

INode Numbering Scheme (3/4)

• With the path function, we embed the document-id in the node-id with the following function, where i is the number of decimal places for document-id

i

docidpathfnodeid10

)( +=

Page 37: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 37

INode Numbering Scheme (4/4)

• Given the node-ids of i and j, where i is the ancestor of j, and the depth difference between two nodes as d, we can confirm the ancestor-descendent relationship with the following function

⎣ ⎦jjdc

d

i

icj

i nodeidnodeidn

nnodeidnodeid −+

⎥⎥⎥⎥

⎢⎢⎢⎢

+−−

=∑−

= 12

1

1

Page 38: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 38

Table Mapping

• Three-table schema– Path(PathID, PathExp)– Element(NodeID, PathID, Ordinal, Value)– Attribute(NodeID, PathID, Value)

• Determine the parent-child relationship by calculation

Page 39: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 39

Querying - EdgeQ1: /SigmodRecord/issue/articles/article[endPage=77]/authors/author

select authors.valuefrom edge sigmodrecord, edge issue, edge articles, edge article,

edge authors, edge author, edge endpagewhere sigmodreacord.label = ‘SigmodRecord’

and issue.label = ‘issue’and articles.label = ‘articles’and article.label = ‘article’and authors.label = ‘authors’and author.label = ‘author’and endpage.label = ‘endPage’and sigmodrecord.source = ‘0’and sigmodrecord.target = issue.sourceand issue.target = articles.sourceand articles.target = article.sourceand article.target = authors.sourceand endpages.source = authors.sourceand author.source = authors.targetand endpage.value = ‘77’

Page 40: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 40

Querying - XParentQ1: /SigmodRecord/issue/articles/article[endPage=77]/authors/author

select d1.valuefrom labelpath lp1, labelpath lp2, datapath dp1, datapath dp2, data d1, data d2

labelpath lp3, datapath dp3where lp1.path = ‘./SigmodRecord./issue./articles./article./authors/author'

and lp2.path = ‘./SigmodRecord./issue./articles./article./endPage’lp3.path = ‘./SigmodRecord./issue./articles./article./authors'and d1.pathid = lp1.pathidand d2.pathid = lp2.pathidand d1.did = dp1.cidand d2.did = dp2.cidand dp3.pid = dp2.pidand dp1.pid = dp3.cidand d2.value = ‘77’

Page 41: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 41

Query Processing - iNode

Q1: /SigmodRecord/issue/articles/article[endPage=77]/authors

select e1.valuefrom path p1, path p2, element e1, element e2where p1.pathexp = '#/SigmodRecord#/issue#/articles#/article#/authors'

and p2.pathexp = '#/SigmodRecord#/issue#/articles#/article#/endPage'and e1.pathid = p1.pathidand e2.pathid = p2.pathidand mod(e1.nodeid, 1) = mod(e2.nodeid, 1)and floor(((e1.nodeid – 2) / 5) + 1) = floor(((e2.nodeid - 2) / 5) + 1)and e2.value = '77'

Page 42: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 42

Discussion• Edge

– easy to maintain because it uses a singe table schema– Since it only maintains edges individually, it needs a

large number of equijoins to check the edge-connections.

• XRel– uses a table to store all simple path expressions. This

improves the query performance and performs regular path expressions easily.

– it can identify the containment relationship by using θ–joins. However, θ–join is more costly than an equijoin.

Page 43: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 43

Discussion

• XParent– uses a table to maintain the parent-child relationship

instead of using the concept of region. Therefore, it can use equijoins to test the relationship and the performance can be increased.

– For some complex queries that require checking the ancestor-descendent relationship, XParent requires a large number of equijoins to check for the relationship.

– Although it can use another table, called Ancestor, to keep the ancestor-descendent relationship, it requires more storage space.

Page 44: XML in Relational DBcstyng/webdb.07/lectures/lesson5.pdfXML Data Storage - Relational DB 5 Structure-mapping approach • Design database schema based on DTD • Different schemas

XML Data Storage - Relational DB 44

End of Lecture