schema-free xquery based on the work of: yanyao li, cong yu and h.v.jagadish from the university of...
Post on 18-Jan-2018
Schema-Free XQuery
Based on the work of: Yanyao Li, Cong Yu and H.V. Jagadish from the University of Michigan
Presented by Gil Barash in the course SDBI '05
Content
What is XQuery
The problem of Schema-Based queries
MLCAS
Integrating MLCAS with XQuery
Conclusion
XQuery
XQuery is an XML query language.
Sometimes referred to as the SQL of XML files.
It is built on XPath expressions.
It is supported by all major database engines.
It will soon become a W3C standard.
XPath
XPath is used to navigate through XML documents.
In order for us to write an XQuery query, we should first get familiar with XPath…
Bibliography XML (version 1)
    <bibliography>
      <bib>
        <year> 1999 </year>
        <book>
          <title> SQL </title>
          <author> Bob </author>
        </book>
        <article>
          <title> XML </title>
          <author> Mary </author>
        </article>
      </bib>
      … …
    </bibliography>
The same document as a tree:
    bibliography
      bib
        year: 1999
        book
          title: SQL
          author: Bob
        article
          title: XML
          author: Mary
      bib
        year: 2000
        book
          title: D.B.
          author: David
        article
          title: .NET
          author: Bill
XPath - example
The expression: /bibliography/bib/*
Will return the nodes: <year>, <book> and <article>
– / : look from the root of the document
– bibliography/bib : under the path "bibliography/bib"
– /* : for all child nodes
(Evaluated against the Bibliography XML, version 1.)
XPath - example
The expression: /bibliography//title
Will return both the titles "SQL" and "XML"
– /bibliography : for all child nodes of the root which are named "bibliography"
– // : look for any descendant (not only direct children)
– title : for the nodes named "title"
(Evaluated against the Bibliography XML, version 1.)
XPath - example
The expression: //bib[1]
Will return the subtree rooted at the first 'bib'
– // : look somewhere in the document
– bib[1] : for the 1st bib node
(Evaluated against the Bibliography XML, version 1.)
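The three XPath expressions above can be tried out with a minimal Python sketch. It uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath and evaluates paths relative to the root element, so each expression is rewritten in that relative form. The second bib is filled in from the tree diagram; treat the whole thing as an illustration, not as how an XQuery engine evaluates paths.

```python
import xml.etree.ElementTree as ET

# Bibliography XML (version 1); the second <bib> is taken from the tree diagram.
DOC = """
<bibliography>
  <bib>
    <year>1999</year>
    <book><title>SQL</title><author>Bob</author></book>
    <article><title>XML</title><author>Mary</author></article>
  </bib>
  <bib>
    <year>2000</year>
    <book><title>D.B.</title><author>David</author></book>
    <article><title>.NET</title><author>Bill</author></article>
  </bib>
</bibliography>
"""

root = ET.fromstring(DOC)  # root is the <bibliography> element

# /bibliography/bib/* : every child of every bib
children = [el.tag for el in root.findall("./bib/*")]

# /bibliography//title : every title at any depth below bibliography
titles = [el.text for el in root.findall(".//title")]

# //bib[1] : the first bib (positions are 1-based, as in XPath)
first_bib = root.findall("./bib[1]")

print(children)
print(titles)
print([bib.find("year").text for bib in first_bib])
```

Note that on the full two-bib document, the second expression returns all four titles, not only "SQL" and "XML".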
XQuery queries
Suppose we want to find the title of the book of which Mary is an author.
Our query will be:
    for $x in doc("doc.xml")/bibliography/bib/book
    where $x/author/text() = "Mary"
    return $x/title
XQuery - example
    for $x in doc("doc.xml")/bibliography/bib/book
For all subtrees (marked as $x) in the document "doc.xml" under the XPath /bibliography/bib/book.
    where $x/author/text() = "Mary"
If in the subtree $x there is a path author, and the text of the node at the end of the path is "Mary".
XQuery - example
    return $x/title
Return the node which is under the path title from the $x subtree.
Bibliography XML (version 1)
    bibliography
      bib
        year: 1999
        book
          title: SQL
          author: Mary
        article
          title: XML
          author: Mary
      bib
        year: 2000
        book
          title: D.B.
          author: David
        article
          title: .NET
          author: Bill
    for $x in doc("doc.xml")/bibliography/bib/book
    where $x/author/text() = "Mary"
    return $x/title
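As a rough illustration of how the for / where / return clauses fit together, here is a Python sketch that mimics the query on the tree above. It is not an XQuery engine; the function name titles_of_books_by is made up for this example, and the document is embedded as a string instead of being loaded from "doc.xml".

```python
import xml.etree.ElementTree as ET

# The tree from this slide: both 1999 publications are authored by Mary.
DOC = """<bibliography>
  <bib>
    <year>1999</year>
    <book><title>SQL</title><author>Mary</author></book>
    <article><title>XML</title><author>Mary</author></article>
  </bib>
  <bib>
    <year>2000</year>
    <book><title>D.B.</title><author>David</author></book>
    <article><title>.NET</title><author>Bill</author></article>
  </bib>
</bibliography>"""
root = ET.fromstring(DOC)


def titles_of_books_by(author_name):
    """Hypothetical helper mirroring the three clauses of the query."""
    result = []
    for x in root.findall("./bib/book"):          # for $x in .../bibliography/bib/book
        authors = [a.text for a in x.findall("author")]
        if author_name in authors:                # where $x/author/text() = "Mary"
            result.extend(t.text for t in x.findall("title"))  # return $x/title
    return result


print(titles_of_books_by("Mary"))
```

Only the book is returned. The article "XML" is also by Mary, but it does not lie on the path /bibliography/bib/book.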
XQuery - example
Suppose we want to find the authors that wrote a book with Mary.
    bibliography
      bib
        year: 1999
        book
          title: SQL
          author: Mary
        article
          title: XML
          author: Mary
      bib
        year: 2000
        book
          title: D.B.
          author: David
          author: Bill
    for $b in doc("doc.xml")/bibliography/bib/book,
        $a in $b/author
    where $b/author/text() = "Mary" and $a/text() != "Mary"
    return $a
XQuery - example
    for $b in doc("doc.xml")/bibliography/bib/book, $a in $b/author
For all subtrees (marked as $b) in the document "doc.xml" under the XPath /bibliography/bib/book, and all subtrees (marked as $a) in the tree $b under the XPath author.
In other words: $b is a book and $a is an author of that book.
XQuery - example
    where $b/author/text() = "Mary" and $a/text() != "Mary"
If $b contains a path author ending with "Mary", and $a isn't "Mary".
    return $a
Return the subtree $a.
The Schema-Based problem
Remember the first query?
We wanted to find a title of a book of which Mary is an author.
We never said that it will be under the path /bibliography/bib/book
    for $x in doc("doc.xml")/bibliography/bib/book
    where $x/author/text() = "Mary"
    return $x/title
The Schema-Based problem
Furthermore
Suppose we want to get the year of the book that Mary wrote…
    <bibliography>
      <bib>
        <year> 1999 </year>
        <book>
          <title> SQL </title>
          <author> Mary </author>
        </book>
        <article> …
Notice that the year of the book IS NOT a descendant node of the book node, but of the bib node.
The Schema-Based problem
Before (getting the title):
    for $x in doc("doc.xml")/bibliography/bib/book
    where $x/author/text() = "Mary"
    return $x/title
After (getting the year):
    for $x in doc("doc.xml")/bibliography/bib
    where $x/book/author/text() = "Mary"
    return $x/year
$x is now the bib node. If there exists a book written by Mary under that bib, then the year of that bib is returned.
The Schema-Based problem
We could never have written that query without knowledge about the structure of the XML file.
The query we wrote will not work on other files, even if they represent the same data, under a different structure.
Bibliography XML (version 2)
    <bibliography>
      <bib>
        <book>
          <year> 1999 </year>
          <title> SQL </title>
          <author> Bob </author>
        </book>
        <book>
          <year> 2000 </year>
          <title> D.B. </title>
          <author> David </author>
        </book>
      </bib>
      … …
    </bibliography>
[Tree diagram, before and after: in version 1 the year node is a child of bib; in version 2 it is a child of book.]
The Schema-Based problem
Our query (getting the year) from before:
    for $x in doc("doc.xml")/bibliography/bib
    where $x/book/author/text() = "Mary"
    return $x/year
[Tree diagram: Bibliography XML (version 2), where year is a child of book.]
$x is a 'bib' node, and it has no child named year, so the query returns nothing on this version of the file.
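A small Python sketch can make the schema dependence concrete. It runs one fixed "get the year" path against both layouts. The two documents are cut-down, hypothetical versions of the slides' files (a single book by Mary, so that the where clause has something to match), and year_query is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Version 1 layout: year is a child of bib.
V1 = """<bibliography><bib>
  <year>1999</year>
  <book><title>SQL</title><author>Mary</author></book>
</bib></bibliography>"""

# Version 2 layout: year is a child of book. Same data, different structure.
V2 = """<bibliography><bib>
  <book><year>1999</year><title>SQL</title><author>Mary</author></book>
</bib></bibliography>"""


def year_query(xml_text, author_name):
    """Mimics: for $x in /bibliography/bib where $x/book/author = name return $x/year."""
    root = ET.fromstring(xml_text)
    years = []
    for x in root.findall("./bib"):
        if any(a.text == author_name for a in x.findall("./book/author")):
            years.extend(y.text for y in x.findall("./year"))
    return years


print(year_query(V1, "Mary"))  # the year is found under bib
print(year_query(V2, "Mary"))  # nothing: bib has no year child here
```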
3 kinds of people…
If the user has FULL knowledge of the structure, she can simply use XQuery.
If the user has NO knowledge of the structure, she can use keyword-based queries (like XKeyword).
If the user has PARTIAL knowledge of the structure, she can use schema-free queries, and make good use of her knowledge.
Partial knowledge
Suppose you want to search for all the books about Albert Einstein…
If you use a keyword-based search, you will enter the keyword "Albert Einstein".
Now, what if you want all the books written by Albert Einstein?
Your query will not change, even though you know what you are really looking for.
XQuery with partial knowledge
Suppose we want to find the title and year of the publications of which Mary is an author:
    for $a in doc("doc.xml")//author,
        $b in doc("doc.xml")//title,
        $c in doc("doc.xml")//year
    where $a/text() = "Mary"
    return { $b , $c }
All we know are the names of the nodes which we are looking for.
XQuery with partial knowledge
[Tree diagram: the bibliography XML with two bib nodes. 1999: book "SQL" by Mary, article "XML" by Mary. 2000: book "D.B." by David, article ".NET" by Bill. The query from the previous slide is evaluated against it.]
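To see why this query is not enough on its own, the Python sketch below mimics it on the tree from the slide. The three for clauses are independent, so they behave like a cross product, and the where clause only constrains the author. This is an illustration under that reading of the query, with the document embedded as a string.

```python
import itertools
import xml.etree.ElementTree as ET

DOC = """<bibliography>
  <bib>
    <year>1999</year>
    <book><title>SQL</title><author>Mary</author></book>
    <article><title>XML</title><author>Mary</author></article>
  </bib>
  <bib>
    <year>2000</year>
    <book><title>D.B.</title><author>David</author></book>
    <article><title>.NET</title><author>Bill</author></article>
  </bib>
</bibliography>"""
root = ET.fromstring(DOC)

authors = root.findall(".//author")   # $a in //author
titles = root.findall(".//title")     # $b in //title
years = root.findall(".//year")       # $c in //year

# Every (author, title, year) combination is considered; nothing ties the
# title or the year to the author that matched "Mary".
rows = [
    (b.text, c.text)
    for a, b, c in itertools.product(authors, titles, years)
    if a.text == "Mary"
]

print(len(rows))
print(sorted(set(rows)))
```

The result pairs every title with every year, including ("D.B.", "2000"), a publication Mary has nothing to do with.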
Content
What is XQuery
The problem of Schema-Based queries
MLCAS
– LCA
– MLCA
– MLCAS
Integrating MLCAS with XQuery
Conclusion
LCA
We would like to guess which part of the XML document is relevant for our search.
By reducing the XML tree, we would get more precise answers and avoid wrong ones.
[Tree diagram: the bibliography XML with two bib nodes (1999 and 2000).]
LCA
Lowest Common Ancestor
    bib
      year: 1999
      book
        title: SQL
        author: Bob
      article
        title: XML
        author: Mary
What is the LCA of "title" and "author"?
LCA
Lowest Common Ancestor
The LCA of "author" and "title", when both belong to the same publication: "book" is the root of the tree we should look within.
LCA
Lowest Common Ancestor
The LCA of "author" and "title", when they belong to different publications: "bib" doesn't help us refine our search.
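The two LCA cases can be reproduced with a short Python sketch. ElementTree elements have no parent pointer, so the sketch first builds a child-to-parent map; the helper names (ancestors_or_self, lca, node) are invented for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<bib>
  <year>1999</year>
  <book><title>SQL</title><author>Bob</author></book>
  <article><title>XML</title><author>Mary</author></article>
</bib>"""
root = ET.fromstring(DOC)

# Child -> parent map, built once by walking the whole tree.
parent = {child: elem for elem in root.iter() for child in elem}


def ancestors_or_self(elem):
    """The element itself followed by its ancestors, up to the root."""
    chain = []
    while elem is not None:
        chain.append(elem)
        elem = parent.get(elem)
    return chain


def lca(a, b):
    """Lowest common ancestor: first ancestor of b that is also above a."""
    above_a = {id(n) for n in ancestors_or_self(a)}
    for n in ancestors_or_self(b):
        if id(n) in above_a:
            return n
    return None


def node(tag, text):
    """Find the element with the given tag and text (illustration helper)."""
    return next(el for el in root.iter(tag) if el.text == text)


print(lca(node("title", "SQL"), node("author", "Bob")).tag)   # same publication
print(lca(node("title", "XML"), node("author", "Bob")).tag)   # different publications
```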
MLCA
Blindly computing the LCA might bring undesired results.
What we are looking for is:
Meaningful Lowest Common Ancestor
Entity Type
The type of a node is its tag name.
[Tree diagram: bib with year 1999, book (title SQL, author Bob) and article (title XML, author Mary); the two title nodes are highlighted as the nodes of the "title" type.]
Meaningfully Related
Consider two nodes "A" and "B", of type "T1" and "T2" respectively.
If [diagram: A and B], we say that A and B are meaningfully related.
If [diagram: A and B under a common node C], we say that A and B are related, being descendants of node C.
So far, this is much like LCA…
Meaningfully Related
There is an exception to the second case:
Suppose that node B* is of the same type as B.
[Diagram: D has the descendants C and B; C has the children A and B*. Example: D = bib, C = book, A = Author, B* = Title (inside the book), B = Title (elsewhere under bib).]
In this case, nodes "A" and "B" are NOT meaningfully related.
MLCA
So we say that a node "D" is the MLCA of nodes "A" and "B" if:
– "D" is a common ancestor of nodes "A" and "B".
– There is no node "C" that is the LCA of types "T1" and "T2" which is a descendant of node "D".
[Diagram: D has the descendants C and B; C has the children A and B*. D is crossed out as the MLCA of A and B.]
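The two conditions can be sketched in Python as follows. This follows the slide's simplified wording literally (it checks every pair of nodes of the two types), which is an assumption of this sketch and not necessarily the exact definition or algorithm of the original paper. All helper names are made up.

```python
import xml.etree.ElementTree as ET

DOC = """<bib>
  <year>1999</year>
  <book><title>SQL</title><author>Bob</author></book>
  <article><title>XML</title><author>Mary</author></article>
</bib>"""
root = ET.fromstring(DOC)
parent = {child: elem for elem in root.iter() for child in elem}


def ancestors_or_self(elem):
    chain = []
    while elem is not None:
        chain.append(elem)
        elem = parent.get(elem)
    return chain


def lca(a, b):
    above_a = {id(n) for n in ancestors_or_self(a)}
    for n in ancestors_or_self(b):
        if id(n) in above_a:
            return n
    return None


def is_proper_descendant(elem, ancestor):
    """True if ancestor lies strictly above elem."""
    return any(n is ancestor for n in ancestors_or_self(elem)[1:])


def mlca(a, b):
    """Return the MLCA of a and b under the slide's rule, or None."""
    d = lca(a, b)  # condition 1: D is a common ancestor
    # Condition 2: no LCA of a node of type T1 and a node of type T2
    # may sit strictly below D.
    for a2 in root.iter(a.tag):
        for b2 in root.iter(b.tag):
            if a2 is not b2 and is_proper_descendant(lca(a2, b2), d):
                return None
    return d


def node(tag, text):
    return next(el for el in root.iter(tag) if el.text == text)


print(mlca(node("author", "Bob"), node("title", "SQL")).tag)
print(mlca(node("author", "Mary"), node("title", "SQL")))
```

Bob and "SQL" meet at book. Mary and "SQL" only meet at bib, and since book (an LCA of an author and a title) lies below bib, that pair has no MLCA.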
MLCA
For multiple nodes, we require that all the subsets have an MLCA, and that the MLCA of the whole set is an ancestor of the MLCAs of the subsets.
For example, if we are looking at the types year, title and author:
– book is the MLCA of the types: title and author
– bib is the MLCA of the types: year, title and author
[Tree diagram: bib with year 2000 and two books, (title D.B., author David) and (title .NET, author Bill).]
MLCA
Let's try the query again…
    for $a in doc("doc.xml")//author,
        $b in doc("doc.xml")//title,
        $c in doc("doc.xml")//year
    where $a/text() = "Mary"
    return { $b , $c }
Bibliography XML
[Tree diagram: the bibliography XML with two bib nodes. 1999: book "SQL" by Bob, article "XML" by Mary. 2000: book "D.B." by David, article ".NET" by Bill.]
"bib" is the MLCA of "author", "title" and "year"
"author" = Mary
Result:
– year 1999, title SQL
– year 1999, title XML
MLCAS
The result of the query was almost right.
The problem was that "bib" is the MLCA of several groups of nodes which satisfy the query.
To solve this, we use: Meaningful Lowest Common Ancestor Structure
Nodes requested: Title, Author, Year
[Tree diagram: bib with year 1999, book (title SQL, author Bob) and article (title XML, author Mary), next to the two results of the previous query: (year 1999, title SQL) and (year 1999, title XML).]
MLCAS
Given a set of types {t_1, …, t_m} from the query,
an MLCAS is a set of nodes {r, a_1, …, a_m},
where {a_1, …, a_m} are nodes matching the types {t_1, …, t_m},
and r is the MLCA of {a_1, …, a_m}.
MLCAS example
We are looking for the types: Author, Title and Year.
[Tree diagram: bibliography with two bib nodes. 1999: (title SQL, author Bob) and (title XML, author Mary). 2000: (title D.B., author David) and (title .NET, author Bill).]
Set of nodes matching those types, and the MLCA of the set:
– {David, SQL, 1999}: there is none. bib nodes are the MLCA of the types Author, Title, Year, but bibliography is the LCA of the nodes David, SQL, 1999. So this set isn't good for us.
– {Mary, SQL, 1999}: there is none. book is the MLCA of the types Title, Author, but bib is the LCA of the nodes Mary, SQL. So this set isn't good for us.
– {Bob, SQL, 1999}: the bib node that contains all three. So this set is good for us.
MLCAS query example
    for $a in doc("doc.xml")//year,
        $b in doc("doc.xml")//title,
        $c in doc("doc.xml")//author
    where $c/text() = "Mary"
    return { $a , $b }
[Diagram: the 1999 bib (book "SQL" by Bob, article "XML" by Mary) yields two MLCAS structures: {bib, year 1999, title SQL, author Bob} and {bib, year 1999, title XML, author Mary}. Only the second one contains the author Mary.]
Other work on creating meaningful results
"Integrating Keyword Search into XML Query Processing (XML-QL)" - Daniela Florescu and Ioana Manolescu from INRIA Rocquencourt, France and Donald Kossmann from Univ. of Passau, Germany.
– Use of hierarchical location in the XML (at what level the keyword should be).
– Use of semantic location in the XML (tag name, CDATA, attribute …).
– Use of the user's knowledge of the structure of the XML file (e.g. if she knows that books are under the bib tag, she can ask for those elements only).
Other work on creating meaningful results
"XSEarch: A Semantic Search Engine for XML" - Sara Cohen, Jonathan Mamou, Yaron Kanza and Yehoshua Sagiv from the Hebrew University.
– Enables the user to specify a tag name under which the keyword should be found.
– Use of the fact that if the shortest path between two elements goes through the same tag name more than once, they are probably not meaningfully related.
– Gives ranking to the results.
[Tree diagram: bib with two book children, (title D.B., author David) and (title .NET, author Bill).]
Content
What is XQuery
The problem of Schema-Based queries
MLCAS
Integrating MLCAS with XQuery
– mlcas
– Expand
Conclusion
Integrating MLCAS with XQuery
In order for us to integrate MLCAS into XQuery, we will introduce a new function into XQuery: mlcas (surprising, isn't it?)
Whenever we want to make sure that the nodes exist in an MLCAS, we will add the condition: exists mlcas($a, $b, $c)
(exists is a built-in function in XQuery)
Query example number 1
Find the title and year of the publications of which Mary is an author.
    for $a in doc("doc.xml")//author,
        $b in doc("doc.xml")//title,
        $c in doc("doc.xml")//year
    where $a/text() = "Mary"
      and exists mlcas($a, $b, $c)
    return { $b, $c }
This will make sure that the "author", "title" and "year" that we get are really of the same publication.
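The effect of the extra condition can be imitated in Python. In this sketch, exists_mlcas is a stand-in for "exists mlcas(...)": it accepts a set of nodes only if every pair in it has an MLCA under the slides' simplified two-condition rule. That pairwise reading, and all helper names, are assumptions of this example, not the paper's implementation.

```python
import itertools
import xml.etree.ElementTree as ET

DOC = """<bibliography>
  <bib>
    <year>1999</year>
    <book><title>SQL</title><author>Bob</author></book>
    <article><title>XML</title><author>Mary</author></article>
  </bib>
  <bib>
    <year>2000</year>
    <book><title>D.B.</title><author>David</author></book>
    <article><title>.NET</title><author>Bill</author></article>
  </bib>
</bibliography>"""
root = ET.fromstring(DOC)
parent = {child: elem for elem in root.iter() for child in elem}


def ancestors_or_self(elem):
    chain = []
    while elem is not None:
        chain.append(elem)
        elem = parent.get(elem)
    return chain


def lca(a, b):
    above_a = {id(n) for n in ancestors_or_self(a)}
    return next(n for n in ancestors_or_self(b) if id(n) in above_a)


def is_proper_descendant(elem, ancestor):
    return any(n is ancestor for n in ancestors_or_self(elem)[1:])


def mlca(a, b):
    """MLCA of a and b under the slides' two conditions, or None."""
    d = lca(a, b)
    for a2 in root.iter(a.tag):
        for b2 in root.iter(b.tag):
            if a2 is not b2 and is_proper_descendant(lca(a2, b2), d):
                return None
    return d


def exists_mlcas(*nodes):
    """Stand-in for 'exists mlcas(...)': every pair must have an MLCA."""
    return all(mlca(a, b) is not None
               for a, b in itertools.combinations(nodes, 2))


rows = [
    (b.text, c.text)
    for a, b, c in itertools.product(
        root.iter("author"), root.iter("title"), root.iter("year"))
    if a.text == "Mary" and exists_mlcas(a, b, c)
]
print(rows)
```

With the check in place, only the title and year of Mary's own publication remain.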
Query example number 2
Find additional authors of the publications of which Mary is an author.
    for $a in doc("doc.xml")//author,
        $b in doc("doc.xml")//author
    where $a/text() = "Mary"
      and $a != $b
      and exists mlcas($a, $b)
    return $b
This will make sure that both the authors are really of the same publication.
Query example number 3
Find year and author of the publications with similar titles to a publication of which Mary is an author.
    for $a in doc("doc.xml")//author,
        $t in doc("doc.xml")//title,
        $y in doc("doc.xml")//year,
        $t2 in {
            for $aM in doc("doc.xml")//author,
                $tM in doc("doc.xml")//title
            where $aM/text() = "Mary"
              and exists mlcas($aM, $tM)
            return $tM }
    where $t ≈ $t2
      and exists mlcas($y, $a, $t)
    return { $y, $a }
Not integrated enough?
The user who wants to use the MLCAS feature will have to add the line:
    and exists mlcas($a, $b, …)
to the where statement.
This might not be simple enough, especially when changing an already existing query.
The mlcas keyword
The keyword mlcas will be used to ask the system to use MLCAS when choosing nodes:
    for $a in mlcas doc("doc.xml")//author,
        $b in mlcas doc("doc.xml")//title
    where $a/text() = "Mary"
    return $b
This takes the place of the condition: and exists mlcas($a, $b)
Some we know
Suppose you do know that you are interested only in the first 'bib' node.
You can make use of your knowledge…
    for $b in doc("doc.xml")//bib[1],
        $a in mlcas $b//author,
        $t in mlcas $b//title
    return { $a , $t }
[Tree diagram: the bibliography XML with two bib nodes; the query looks only inside the first bib (year 1999, book "SQL" by Bob, article "XML" by Mary).]
Many ways to say…
There are different tag names that represent the same thing.
Author: Author / Writer / Au
Title: Title / Name / Headline
Less than 20% choose the same term for a single well-known object.
Our partial knowledge of the XML file will still have to be accurate about how it tags the information we want.
The expand keyword
To solve this issue, we will include yet another keyword: expand
Whenever we are not sure of the exact tag name, we could use the expand keyword to find it for us.
    for $a in mlcas doc("doc.xml")//expand(author),
        $b in mlcas doc("doc.xml")//title
    where $a/text() = "Mary"
    return $b
The expand keyword
The synonyms of a word can be found using a thesaurus: a domain-specific one developed by domain experts, or WordNet.
Another application is an ontology-driven hierarchical thesaurus. For example, use the word "publication" to get both "book" and "article" tags.
Think of other applications where this can be useful. (Google?)
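A rough sketch of the idea behind expand, in Python: the tag name in the query is replaced by a set of equivalent tag names before matching. The THESAURUS table below is a tiny hand-written stand-in (it is not WordNet and not a real domain thesaurus), and the sample document and helper names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical hand-written thesaurus, built from the synonyms on the slide.
THESAURUS = {
    "author": {"author", "writer", "au"},
    "title": {"title", "name", "headline"},
    # Hierarchical use: a broader term covers several tags.
    "publication": {"book", "article"},
}


def expand(term):
    """Return the set of tag names to match for a query term."""
    return THESAURUS.get(term, {term})


# Hypothetical file whose authors of the two books are tagged differently.
DOC = """<bibliography>
  <book><headline>SQL</headline><writer>Mary</writer></book>
  <article><title>XML</title><author>Bob</author></article>
</bibliography>"""
root = ET.fromstring(DOC)


def descendants_expanded(elem, term):
    """Rough equivalent of //expand(term): descendants with any matching tag."""
    tags = expand(term)
    return [el for el in elem.iter() if el.tag in tags]


print([el.text for el in descendants_expanded(root, "author")])
print([el.text for el in descendants_expanded(root, "title")])
print([el.tag for el in descendants_expanded(root, "publication")])
```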
Ontology-based Query Processing
"An Ontology for Domain-oriented Semantic Similarity Search On XML Data" - Anja Theobald from the University of the Saarland, Germany.
– Use of tag name and keyword similarity.
– Use of WordNet and Google to give a ranking to how similar objects are.
  – WordNet is used to get synonyms or broader terms.
  – Google is used to get a rank of how close two terms are.
– Gives ranking to the results.
Ontology-based Query Processing
Taken from "The Index Based XXL Search Engine for Querying XML Data with Relevance Ranking" by Anja Theobald and Gerhard Weikum, University of the Saarland, Germany (from a presentation of Anja Theobald, 26.02.03).
XXL Query:
    ... WHERE #.~universe AS U AND U.#.~appearance AS A AND U.#.S ~ "star"
[Figure: the XXL query representation (~universe, ~appearance, ~ "star", connected by % wildcards) matched against an XML data graph with the nodes galaxy, object, description, sun, appearance, location, history and the text "…light and heat…". The matches are labelled with similarity scores: sim(universe, galaxy), sim(app, app) and sim(star, sun) * tfidf(sun); the values 0.94, 1.0 and 0.43 appear in the figure.]
Conclusion
We wanted to find a way to get accurate results from an XML file whose structure we don't know.
We used the MLCAS concept to get meaningful results.
We integrated the ability into an already existing query language.
Thank you
Questions?
Computing MLCAS
One could implement MLCAS computation using the definition of MLCA:
– "D" is an MLCA for nodes "A" and "B" of types "T1" and "T2" respectively, if:
  – "D" is a common ancestor of nodes "A" and "B".
  – There is no node "C" that is the LCA of types "T1" and "T2" which is a descendant of node "D".
Take each pair {n1, n2} where "n1" and "n2" are of types "T1" and "T2" respectively.
Find their LCA by going up from both nodes until you find a common ancestor, and produce a tree, rooted at the LCA, with n1 and n2 as its leaves.
For each pair of trees that you found (TA and TB), if the root of TA is a descendant of the root of TB, remove TB.
– Because TB contradicts the second rule: there is no node "C" that is the LCA of types "T1" and "T2" which is a descendant of node "D".
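The procedure above can be written down almost step by step. The Python sketch below is a direct, quadratic transcription for two types; it makes no claim to match the optimized algorithm of the original paper, and the function and helper names are invented for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<bibliography>
  <bib>
    <year>1999</year>
    <book><title>SQL</title><author>Bob</author></book>
    <article><title>XML</title><author>Mary</author></article>
  </bib>
  <bib>
    <year>2000</year>
    <book><title>D.B.</title><author>David</author></book>
    <article><title>.NET</title><author>Bill</author></article>
  </bib>
</bibliography>"""
root = ET.fromstring(DOC)
parent = {child: elem for elem in root.iter() for child in elem}


def ancestors_or_self(elem):
    chain = []
    while elem is not None:
        chain.append(elem)
        elem = parent.get(elem)
    return chain


def lca(a, b):
    # Go up from both nodes until a common ancestor is found.
    above_a = {id(n) for n in ancestors_or_self(a)}
    return next(n for n in ancestors_or_self(b) if id(n) in above_a)


def is_proper_descendant(elem, ancestor):
    return any(n is ancestor for n in ancestors_or_self(elem)[1:])


def mlca_trees(t1, t2):
    """Return (root tag, n1 text, n2 text) for every surviving pair."""
    # Step 1: one candidate tree per pair {n1, n2}, rooted at their LCA.
    candidates = [(lca(n1, n2), n1, n2)
                  for n1 in root.iter(t1)
                  for n2 in root.iter(t2)
                  if n1 is not n2]
    # Step 2: remove TB whenever some TA is rooted strictly below TB's root.
    kept = []
    for tb_root, n1, n2 in candidates:
        dominated = any(is_proper_descendant(ta_root, tb_root)
                        for ta_root, _, _ in candidates)
        if not dominated:
            kept.append((tb_root.tag, n1.text, n2.text))
    return kept


for tree in mlca_trees("title", "author"):
    print(tree)
```

Of the sixteen title/author pairs, only the four that share a book or an article survive; every pair that meets at bib or bibliography is removed.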