Fundamentals, Design, and Implementation, 9/e
Text and XML databases
Instructor: Dragomir R. Radev
Winter 2005
Chapter 9/2 Copyright © 2004
Database Processing: Fundamentals, Design, and Implementation, 9/e by David M. Kroenke
Types of databases
Textual databases
Semi-structured databases
Indexing textual data
Inverted files
Boolean queries
Signature files
Signature S1 matches signature S2 if S2 & S1 = S2
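The three techniques listed above can be illustrated with a minimal Python sketch. It is not taken from the slides: the sample documents, the 16-bit signature width, and the toy hash (sum of character codes) are all assumptions chosen to keep the example small and deterministic. The `matches` function implements the slide's condition S2 & S1 = S2, with S2 playing the role of the query signature.

```python
from collections import defaultdict

# Hypothetical three-document collection (not from the slides).
DOCS = {
    1: "xml query languages",
    2: "relational query processing",
    3: "xml databases",
}

def build_inverted_index(docs):
    """Inverted file: map each term to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Documents containing every term (intersection of posting sets)."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(index, *terms):
    """Documents containing at least one term (union of posting sets)."""
    return set().union(*(index.get(term, set()) for term in terms))

SIGNATURE_BITS = 16  # assumed signature width

def word_signature(word):
    """Toy hash: set one bit chosen from the sum of the character codes."""
    return 1 << (sum(map(ord, word)) % SIGNATURE_BITS)

def signature(text):
    """Superimpose (bitwise OR) the signatures of all words in the text."""
    sig = 0
    for word in text.lower().split():
        sig |= word_signature(word)
    return sig

def matches(s1, s2):
    """S1 matches S2 if every bit set in S2 is also set in S1."""
    return (s2 & s1) == s2

INDEX = build_inverted_index(DOCS)
```

A signature match only says the document may contain the query words, because different words can set the same bit. Candidates still have to be checked against the text.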
XML-QL
WHERE <BOOK> <NAME><LAST>$1</LAST></NAME> </BOOK> IN "www.booklist.com/books.xml"
CONSTRUCT <RESULT> $1 </RESULT>
Two slides from Johannes Gehrke, Cornell University
<IMG SRC="xysq.gif" ALT="(x+y)^2">
<apply>
  <power/>
  <apply> <plus/> <ci>x</ci> <ci>y</ci> </apply>
  <cn>2</cn>
</apply>
XML-QL (continued)
WHERE <BOOK> $b </BOOK> IN "www.booklist.com/books.xml",
      <AUTHOR> $n </AUTHOR> <PUBLISHED> $p </PUBLISHED> IN $b
CONSTRUCT <RESULT>
            <PUBLISHED> $p </PUBLISHED>
            WHERE <LAST> $l </LAST> IN $n
            CONSTRUCT <LAST> $l </LAST>
          </RESULT>
<!ELEMENT book (author+, title, publisher)>
<!ATTLIST book year CDATA>
<!ELEMENT article (author+, title, year?, (shortversion|longversion))>
<!ATTLIST article type CDATA>
<!ELEMENT publisher (name, address)>
<!ELEMENT author (firstname?, lastname)>
XML-QL (continued)
WHERE <book>
        <publisher><name>Addison-Wesley</name></publisher>
        <title> $t</title>
        <author> $a</author>
      </book> IN "www.a.b.c/bib.xml"
CONSTRUCT $a
XML-QL (continued)
WHERE <book>
        <publisher><name>Addison-Wesley</></>
        <title> $t</>
        <author> $a</>
      </> IN "www.a.b.c/bib.xml"
CONSTRUCT $a
XML-QL (continued)
WHERE <book>
        <publisher><name>Addison-Wesley</></>
        <title> $t</>
        <author> $a</>
      </> IN "www.a.b.c/bib.xml"
CONSTRUCT <result>
            <author> $a</>
            <title> $t</>
          </>
XML-QL (continued)
<bib>
<book year="1995">
<!-- A good introductory text -->
<title> An Introduction to Database Systems </title>
<author> <lastname> Date </lastname> </author>
<publisher> <name> Addison-Wesley </name > </publisher>
</book>
<book year="1998">
<title> Foundation for Object/Relational Databases: The Third Manifesto </title>
<author> <lastname> Date </lastname> </author>
<author> <lastname> Darwen </lastname> </author>
<publisher> <name> Addison-Wesley </name > </publisher>
</book>
</bib>
XML-QL (continued)
<result> <author> <lastname> Date </lastname> </author> <title> An Introduction to Database Systems </title> </result>
<result> <author> <lastname> Date </lastname> </author> <title> Foundation for Object/Relational Databases: The Third Manifesto </title> </result>
<result> <author> <lastname> Darwen </lastname> </author> <title> Foundation for Object/Relational Databases: The Third Manifesto </title> </result>
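To see why the query yields three results for two books, the pattern match can be approximated in Python with the standard `xml.etree.ElementTree` module. This is only a sketch of the binding semantics, not an XML-QL engine: the function name `flat_query` is hypothetical, and the sample document is the two-book bibliography from the slides, embedded as a string instead of being fetched from a URL.

```python
import xml.etree.ElementTree as ET

# The sample bibliography from the slides, inlined as a string.
BIB = """<bib>
  <book year="1995">
    <title> An Introduction to Database Systems </title>
    <author> <lastname> Date </lastname> </author>
    <publisher> <name> Addison-Wesley </name> </publisher>
  </book>
  <book year="1998">
    <title> Foundation for Object/Relational Databases: The Third Manifesto </title>
    <author> <lastname> Date </lastname> </author>
    <author> <lastname> Darwen </lastname> </author>
    <publisher> <name> Addison-Wesley </name> </publisher>
  </book>
</bib>"""

def flat_query(xml_text, publisher="Addison-Wesley"):
    """Return one (author lastname, title) pair per binding of ($a, $t)."""
    root = ET.fromstring(xml_text)
    results = []
    for book in root.iter("book"):
        # Pattern condition: <publisher><name>Addison-Wesley</></>
        if (book.findtext("publisher/name") or "").strip() != publisher:
            continue
        # Every combination of a title and an author is a separate binding.
        for title in book.findall("title"):
            for author in book.findall("author"):
                lastname = (author.findtext("lastname") or "").strip()
                results.append((lastname, (title.text or "").strip()))
    return results
```

Each (title, author) combination produces its own result, which is why the second book appears twice, once per author.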
XML-QL (continued)
WHERE <book > $p</> IN "www.a.b.c/bib.xml",
<title > $t</>,
<publisher><name>Addison-Wesley</></> IN $p
CONSTRUCT <result>
<title> $t </>
WHERE <author> $a </> IN $p
CONSTRUCT <author> $a</>
</>
XML-QL (continued)
<result>
<title> An Introduction to Database Systems </title>
<author> <lastname> Date </lastname> </author>
</result>
<result>
<title> Foundation for Object/Relational Databases: The Third Manifesto </title>
<author> <lastname> Date </lastname> </author>
<author> <lastname> Darwen </lastname> </author>
</result>
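The nested query groups all authors of a book under a single result, one per title. A rough Python equivalent, again using `xml.etree.ElementTree`, is sketched below. The function name `grouped_query` and the dictionary output shape are assumptions made for illustration, and the sample data is the slides' two-book bibliography inlined as a string.

```python
import xml.etree.ElementTree as ET

# The sample bibliography from the slides, inlined as a string.
BIB = """<bib>
  <book year="1995">
    <title> An Introduction to Database Systems </title>
    <author> <lastname> Date </lastname> </author>
    <publisher> <name> Addison-Wesley </name> </publisher>
  </book>
  <book year="1998">
    <title> Foundation for Object/Relational Databases: The Third Manifesto </title>
    <author> <lastname> Date </lastname> </author>
    <author> <lastname> Darwen </lastname> </author>
    <publisher> <name> Addison-Wesley </name> </publisher>
  </book>
</bib>"""

def grouped_query(xml_text, publisher="Addison-Wesley"):
    """Return one result per title, with that book's authors nested inside."""
    root = ET.fromstring(xml_text)
    results = []
    for book in root.iter("book"):          # outer WHERE: bind $p to the book content
        if (book.findtext("publisher/name") or "").strip() != publisher:
            continue
        for title in book.findall("title"):  # outer WHERE: bind $t
            # Inner WHERE ... CONSTRUCT: collect every <author> found in $p.
            authors = [
                (author.findtext("lastname") or "").strip()
                for author in book.findall("author")
            ]
            results.append({"title": (title.text or "").strip(), "authors": authors})
    return results
```

The inner loop runs inside the construction of each result, so a book with two authors yields one result with two entries instead of two results.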
XML-QL (continued)
WHERE <article>
        <author>
          <firstname> $f </> // firstname $f
          <lastname> $l </> // lastname $l
        </>
      </> CONTENT_AS $a IN "www.a.b.c/bib.xml",
      <book year=$y>
        <author>
          <firstname> $f </> // join on same firstname $f
          <lastname> $l </> // join on same lastname $l
        </>
      </> IN "www.a.b.c/bib.xml",
      $y > 1995
CONSTRUCT <article> $a </>
XML-QL (continued)
XML-QL (continued)
<!ATTLIST person ID ID #REQUIRED>
<!ATTLIST article author IDREFS #IMPLIED>
XML-QL (continued)
<person ID="o123">
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
<person ID="o234">
. . .
</person>
<article author="o123 o234">
<title> ... </title>
<year> 1995 </year>
</article>
XML-QL (continued)
XML-QL (continued)
WHERE <article><author><lastname> $n</></></> IN "abc.xml"
XML-QL (continued)
WHERE <article author=$i> <title> </> ELEMENT_AS $t </>,
      <person ID=$i> <lastname> </> ELEMENT_AS $l </>
CONSTRUCT <result> $t $l</>
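The join on $i between the article's IDREFS attribute and a person's ID can be sketched in Python as a dictionary lookup. The slides elide the second person and the article title, so the values `Jane`, `Doe`, and `Placeholder Title` below are invented placeholders, and the `<db>` wrapper element is likewise an assumption added so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET

# Person o234 and the article title are elided on the slide;
# the values used for them here are hypothetical placeholders.
DATA = """<db>
  <person ID="o123">
    <firstname>John</firstname>
    <lastname>Smith</lastname>
  </person>
  <person ID="o234">
    <firstname>Jane</firstname>
    <lastname>Doe</lastname>
  </person>
  <article author="o123 o234">
    <title>Placeholder Title</title>
    <year>1995</year>
  </article>
</db>"""

def join_articles_to_authors(xml_text):
    """Pair each article title with the lastname of every referenced person."""
    root = ET.fromstring(xml_text)
    # Index persons by their ID attribute.
    people = {person.get("ID"): person for person in root.iter("person")}
    results = []
    for article in root.iter("article"):
        title = article.findtext("title")
        # An IDREFS value is a whitespace-separated list of IDs.
        for ref in article.get("author", "").split():
            person = people.get(ref)
            if person is not None:
                results.append((title, person.findtext("lastname")))
    return results
```

Because the attribute holds two references, the single article yields two (title, lastname) results, one per author.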
Scalar values
<title>A Trip to <titlepart> the Moon </titlepart></title> NOT!
<title><CDATA> A Trip to </CDATA><titlepart><CDATA> the Moon</CDATA></titlepart></title> YES
Tag variables
WHERE <$p>
        <title> $t </title>
        <year>1995</>
        <$e> Smith </>
      </> IN "www.a.b.c/bib.xml",
      $e IN {author, editor}
CONSTRUCT <$p>
            <title> $t </title>
            <$e> Smith </>
          </>
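A tag variable binds to the element name itself, so one query covers books and articles, authors and editors alike. The Python sketch below mimics this by treating tags as data. The four-entry sample document is hypothetical, and the sketch assumes that $p ranges over the direct children of the document root.

```python
import xml.etree.ElementTree as ET

# Hypothetical data (not from the slides) mixing publication and role tags.
BIB = """<bib>
  <book><title>T1</title><year>1995</year><author>Smith</author></book>
  <article><title>T2</title><year>1995</year><editor>Smith</editor></article>
  <article><title>T3</title><year>1996</year><author>Smith</author></article>
  <book><title>T4</title><year>1995</year><author>Jones</author></book>
</bib>"""

def tag_variable_query(xml_text, person="Smith", year="1995",
                       roles=("author", "editor")):
    """Return (publication tag, title, role tag) for each match of the pattern."""
    root = ET.fromstring(xml_text)
    results = []
    for pub in root:                      # $p binds to any top-level tag
        if (pub.findtext("year") or "").strip() != year:
            continue
        title = pub.findtext("title")
        if title is None:
            continue
        for child in pub:                 # $e binds to a child tag in the allowed set
            if child.tag in roles and (child.text or "").strip() == person:
                results.append((pub.tag, title.strip(), child.tag))
    return results
```

The bound tag names are returned with each match, which corresponds to reusing $p and $e in the CONSTRUCT clause.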
Transforming data
<!ELEMENT book (author+, title, publisher)>
<!ATTLIST book year CDATA>
<!ELEMENT article (author+, title, year?, (shortversion|longversion))>
<!ATTLIST article type CDATA>
<!ELEMENT publisher (name, address)>
<!ELEMENT author (firstname?, lastname)>
<!ELEMENT person (lastname, firstname, address?, phone?, publicationtitle*)>
Transforming data (cont’d)
WHERE <$>
        <author>
          <firstname> $fn </>
          <lastname> $ln </>
        </>
        <title> $t </>
      </> IN "www.a.b.c/bib.xml"
CONSTRUCT <person ID=PersonID($fn, $ln)>
            <firstname> $fn </>
            <lastname> $ln </>
            <publicationtitle> $t </>
          </>
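PersonID($fn, $ln) acts as a Skolem function: the same arguments always yield the same ID, so every publication by one author lands in a single person element. The Python sketch below shows the idea with a memoizing callable. The class name `Skolem`, the `p1`, `p2`, ... ID format, and the input list are assumptions for illustration; the author names and titles are borrowed from the XQuery sample database later in the deck.

```python
class Skolem:
    """Return a stable, freshly generated ID for each distinct argument tuple."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._ids = {}

    def __call__(self, *args):
        if args not in self._ids:
            # First time these arguments are seen: mint the next ID.
            self._ids[args] = f"{self.prefix}{len(self._ids) + 1}"
        return self._ids[args]

# Illustrative input shape (an assumption): title plus (firstname, lastname) pairs.
PUBLICATIONS = [
    {"title": "TCP/IP Illustrated", "authors": [("W.", "Stevens")]},
    {"title": "Advanced Programming in the Unix environment",
     "authors": [("W.", "Stevens")]},
    {"title": "Data on the Web",
     "authors": [("Serge", "Abiteboul"), ("Peter", "Buneman"), ("Dan", "Suciu")]},
]

def transform(publications):
    """Regroup publication-centric data into person-centric records."""
    person_id = Skolem("p")
    people = {}
    for pub in publications:
        for firstname, lastname in pub["authors"]:
            pid = person_id(firstname, lastname)
            # Results with the same ID are merged into one record.
            person = people.setdefault(
                pid,
                {"firstname": firstname, "lastname": lastname,
                 "publicationtitle": []},
            )
            person["publicationtitle"].append(pub["title"])
    return people
```

Without the shared ID, each (author, title) binding would create a separate person element. With it, W. Stevens ends up with one record holding both titles.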
Integrating data from different sources
WHERE <person>
        <name></> ELEMENT_AS $n
        <ssn> $ssn</>
      </> IN "www.a.b.c/data.xml",
      <taxpayer>
        <ssn> $ssn</>
        <income></> ELEMENT_AS $i
      </> IN "www.irs.gov/taxpayers.xml"
CONSTRUCT <result> $n $i </>
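Using $ssn in both patterns makes this an equality join across the two sources. A minimal Python sketch of the same join, with invented in-memory records standing in for the two XML documents, might look as follows; all names, SSNs, and incomes are hypothetical.

```python
# Hypothetical records standing in for data.xml and taxpayers.xml.
PEOPLE = [
    {"name": "Ann", "ssn": "111-11-1111"},
    {"name": "Bob", "ssn": "222-22-2222"},
]
TAXPAYERS = [
    {"ssn": "222-22-2222", "income": 50000},
    {"ssn": "333-33-3333", "income": 70000},
]

def integrate(people, taxpayers):
    """Join the two sources on ssn and return name/income pairs."""
    # Build a lookup table on the join key from one source.
    income_by_ssn = {t["ssn"]: t["income"] for t in taxpayers}
    # Probe it with the other source, keeping only matching ssn values.
    return [
        {"name": p["name"], "income": income_by_ssn[p["ssn"]]}
        for p in people
        if p["ssn"] in income_by_ssn
    ]
```

Only persons whose ssn appears in both sources contribute a result, matching the inner-join behaviour of the shared variable.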
Query blocks
WHERE <$e> <title> $t </> <year> 1995 </> </> CONTENT_AS $p IN "www.a.b.c/bib.xml"
CONSTRUCT <result ID=ResultID($p)> <title> $t </> </>
{ WHERE $e = "journal-paper", <month> $m </> IN $p
  CONSTRUCT <result ID=ResultID($p)> <month> $m </> </> }
{ WHERE $e = "book", <publisher>$q </> IN $p
  CONSTRUCT <result ID=ResultID($p)> <publisher>$q </> </> }
XQuery
Successor to XML-QL, YATL, Lorel, Quilt
Supported by the W3C
Draft only
DTD
<!ELEMENT bib (book* )>
<!ELEMENT book (title, (author+ | editor+ ), publisher, price )>
<!ATTLIST book year CDATA #REQUIRED >
<!ELEMENT author (last, first )>
<!ELEMENT editor (last, first, affiliation )>
<!ELEMENT title (#PCDATA )>
<!ELEMENT last (#PCDATA )>
<!ELEMENT first (#PCDATA )>
<!ELEMENT affiliation (#PCDATA )>
<!ELEMENT publisher (#PCDATA )>
<!ELEMENT price (#PCDATA )>
Sample database
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
</bib>
Sample query
<bib>
{
  for $b in document("http://www.bn.com/bib.xml")/bib/book
  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
  return
    <book year="{ $b/@year }">
      { $b/title }
    </book>
}
</bib>
Expected result
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
  </book>
</bib>
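The for / where / return structure of the sample query maps closely onto a loop with a filter. The sketch below reproduces the expected result in Python with `xml.etree.ElementTree`. It is an approximation of the query's logic, not an XQuery implementation: the function name `sample_query` is hypothetical, and the sample database is inlined as a string where the query uses `document(...)`.

```python
import xml.etree.ElementTree as ET

# The sample database from the slides, inlined as a string.
BIB = """<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
</bib>"""

def sample_query(xml_text):
    """Build <bib> with year and title of Addison-Wesley books published after 1991."""
    source = ET.fromstring(xml_text)
    result = ET.Element("bib")
    for b in source.findall("book"):                      # for $b in .../bib/book
        if (b.findtext("publisher") == "Addison-Wesley"   # where ... and ...
                and int(b.get("year")) > 1991):
            book = ET.SubElement(result, "book", year=b.get("year"))
            for title in b.findall("title"):              # return { $b/title }
                ET.SubElement(book, "title").text = title.text
    return result
```

Calling `ET.tostring(sample_query(BIB), encoding="unicode")` serializes the result tree for comparison with the expected output shown above.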
Pointers and Demos
http://www.w3.org/TR/xquery/
http://www.w3.org/TR/xmlquery-use-cases/
http://xml.org/
http://131.107.228.20/Default.aspx?case=XMP&example=Q1#Query
http://www.db.ucsd.edu/people/yannis/XQueryTutorial.htm
http://www.ex.ac.uk/~pellison/xml/multiple.htm
http://seacow.eecs.umich.edu:8080/timberweb/