Querying on the Web: XQuery, RDQL, SPARQL

Semantic Web - Spring 2006

Computer Engineering Department

Sharif University of Technology

2

Outline

• XQuery: querying XML data

• RDQL: querying RDF data

• SPARQL: another RDF query language (under development)

3

Requirements for an XML Query Language

David Maier, W3C XML Query Requirements:

• Closedness: output must be XML

• Composability: wherever a set of XML elements is required, a subquery is allowed as well

• Can benefit from a schema, but should also be applicable without one

• Retains the order of nodes

• Formal semantics

4

How Does One Design a Query Language?

• In most query languages, there are two aspects to a query:

– Retrieving data (e.g., from … where … in SQL)

– Creating output (e.g., select … in SQL)

• Retrieval consists of

– Pattern matching (e.g., from … )

– Filtering (e.g., where … )

… although these cannot always be clearly distinguished

5

XQuery Principles

• A language for querying XML documents

• Data model identical with the XPath data model

– documents are ordered, labeled trees

– nodes have identity

– nodes can have simple or complex types (defined in XML Schema)

• XQuery can be used without schemas, but can be checked against DTDs and XML schemas

• XQuery is a functional language

– no statements

– evaluation of expressions

6

Sample data

7

A Query over the Recipes Document

<titles>
{for $r in doc("recipes.xml")//recipe
return $r/title}
</titles>

returns

<titles>
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
<title>Ricotta Pie</title>
</titles>

8

Query Features

<titles>
{for $r in doc("recipes.xml")//recipe
return
$r/title}
</titles>

• <titles> ... </titles>: part to be returned as it is given

• { ... }: part to be evaluated

• for ... in: iteration

• $r: a variable ($var)

• doc(String): returns the input document

• //recipe and $r/title: XPath

• return: sequence of results, one for each variable binding

9

Features: Summary

• The result is a new XML document

• A query consists of parts that are returned as is

• ... and others that are evaluated (everything in {...})

• Calling the function doc(String) returns an input document

• XPath is used to retrieve node sets and values

• Iteration over node sets: for binds a variable to each node in a node set in turn

• Variables can be used in XPath expressions

• return returns a sequence of results, one for each binding of a variable
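The features summarized above can be imitated outside XQuery. The following is a minimal Python sketch, not an XQuery implementation: the inline RECIPES_XML string is a hypothetical stand-in for recipes.xml (only the two titles come from the slides), and titles_query is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for recipes.xml; only the titles are taken from the slides.
RECIPES_XML = """
<collection>
  <recipe><title>Beef Parmesan with Garlic Angel Hair Pasta</title></recipe>
  <recipe><title>Ricotta Pie</title></recipe>
</collection>
"""

def titles_query(xml_text):
    doc = ET.fromstring(xml_text)       # plays the role of doc("recipes.xml")
    result = ET.Element("titles")       # the literal part, returned as is
    for r in doc.iter("recipe"):        # for $r in ...//recipe
        for t in r.findall("title"):    # $r/title
            result.append(t)            # one result per binding of $r
    return ET.tostring(result, encoding="unicode")

titles_xml = titles_query(RECIPES_XML)
print(titles_xml)
```

The loop corresponds to the for clause, and everything appended to result corresponds to what the return clause contributes inside {...}.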

10

XPath is a Fragment of XQuery

• doc("recipes.xml")//recipe[1]/title

returns an element:

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

• doc("recipes.xml")//recipe[position()<=3]/title

returns a list of elements:

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
<title>Ricotta Pie</title>,
<title>Linguine Pescadoro</title>

11

Beware: XPath Attributes

• doc("recipes.xml")//recipe[1]/ingredient[1]/@name

→ attribute name {"beef cube steak"}

(a constructor for an attribute node)

• string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)

→ "beef cube steak"

(a value of type string)

12

XPath Attributes (cntd.)

• <first-ingredient>{string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}</first-ingredient>

→ <first-ingredient>beef cube steak</first-ingredient>

an element with string content

13

XPath Attributes (cntd.)

• <first-ingredient>{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}</first-ingredient>

→ <first-ingredient name="beef cube steak"/>

an element with an attribute

14

XPath Attributes (cntd.)

• <first-ingredient oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">Beef</first-ingredient>

→ <first-ingredient oldName="beef cube steak">

Beef

</first-ingredient>

An attribute is cast as a string

15

Iteration with the For-Clause

Syntax: for $var in xpath-expr

Example: for $r in doc("recipes.xml")//recipe return string($r)

• The expression creates a list of bindings for a variable $var

If $var occurs in an expression exp,

then exp is evaluated for each binding

• For-clauses can be nested:

for $r in doc("recipes.xml")//recipe
for $v in doc("vegetables.xml")//vegetable
return ...

16

Nested For-clauses: Example

<my-recipes>

{for $r in doc("recipes.xml")//recipe

return

<my-recipe title="{$r/title}">

{for $i in $r//ingredient

return

<my-ingredient>

{string($i/@name)}

</my-ingredient>

}

</my-recipe>

}

</my-recipes>

Returns my-recipes with titles as attributes and my-ingredients with names as text content
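As a rough analogy, the nested for-clauses behave like nested loops. The Python sketch below is an illustration under assumptions: RECIPES_XML is a hypothetical one-recipe document (the title and the two ingredient names appear elsewhere in the slides, but their combination here is made up), and my_recipes is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document standing in for recipes.xml.
RECIPES_XML = """
<collection>
  <recipe>
    <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
    <ingredient name="beef cube steak"/>
    <ingredient name="olive oil"/>
  </recipe>
</collection>
"""

def my_recipes(xml_text):
    doc = ET.fromstring(xml_text)
    out = ET.Element("my-recipes")
    for r in doc.iter("recipe"):                      # outer for: $r
        title = r.findtext("title", default="")
        mr = ET.SubElement(out, "my-recipe", {"title": title})
        for i in r.iter("ingredient"):                # inner for: $i in $r//ingredient
            # string($i/@name) becomes the text content of my-ingredient
            ET.SubElement(mr, "my-ingredient").text = i.get("name", "")
    return out

result = my_recipes(RECIPES_XML)
print(ET.tostring(result, encoding="unicode"))
```

The inner loop runs once per binding of the outer variable, which is why each my-recipe element collects only its own ingredients.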

17

The Let Clause

Syntax: let $var := xpath-expr

• binds variable $var to a list of nodes,

with the nodes in document order

• does not iterate over the list

• allows one to keep intermediate results for reuse

(not possible in SQL)

Example:

let $ooreps := doc("recipes.xml")//recipe

[.//ingredient/@name="olive oil"]

18

Let Clause: Example

<calory-content>

{let $ooreps := doc("recipes.xml")//recipe

[.//ingredient/@name="olive oil"]

for $r in $ooreps return

<calories>

{$r/title/text()}

{": "}

{string($r/nutrition/@calories)}

</calories>}

</calory-content>

Calories of recipes with olive oil

Note the implicit string concatenation

19

Let Clause: Example (cntd.)

The query returns:

<calory-content>

<calories>Beef Parmesan: 1167</calories>

<calories>Linguine Pescadoro: 532</calories>

</calory-content>
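The same let-then-for shape can be sketched in Python: bind the filtered list once, then iterate over it. This is an illustration only. RECIPES_XML is a hypothetical stand-in for recipes.xml; the titles and calorie values come from the slides, while the ingredient lists (in particular "ricotta cheese") are assumptions, and full titles are used rather than the shortened ones shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; ingredient lists are made up for illustration.
RECIPES_XML = """
<collection>
  <recipe>
    <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
    <ingredient name="beef cube steak"/>
    <ingredient name="olive oil"/>
    <nutrition calories="1167"/>
  </recipe>
  <recipe>
    <title>Ricotta Pie</title>
    <ingredient name="ricotta cheese"/>
    <nutrition calories="349"/>
  </recipe>
  <recipe>
    <title>Linguine Pescadoro</title>
    <ingredient name="olive oil"/>
    <nutrition calories="532"/>
  </recipe>
</collection>
"""

doc = ET.fromstring(RECIPES_XML)

# let $ooreps := ...//recipe[.//ingredient/@name="olive oil"]
ooreps = [r for r in doc.iter("recipe")
          if any(i.get("name") == "olive oil" for i in r.iter("ingredient"))]

# for $r in $ooreps return title, ": ", calories
lines = [r.findtext("title") + ": " + r.find("nutrition").get("calories")
         for r in ooreps]
print(lines)
```

Binding ooreps first keeps the intermediate result available for reuse, which is the point the Let Clause slide makes.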

20

The Where Clause

Syntax: where <condition>

• occurs before the return clause

• similar to predicates in XPath

• comparisons on nodes:

– "is" for node identity

– "<<" and ">>" for document order

• Example:

for $r in doc("recipes.xml")//recipe
where $r//ingredient/@name="olive oil"
return ...

21

Quantifiers

• Syntax: some/every $var in <node-set> satisfies <expr>

• $var is bound to each node in <node-set> in turn

• Test succeeds if <expr> is true for some/every binding

• Note: if <node-set> is empty, then “some” is false and “every” is true

22

Quantifiers (Example)

• Recipes that have some compound ingredient

for $r in doc("recipes.xml")//recipe
where some $i in $r/ingredient satisfies $i/ingredient
return $r/title

• Recipes where every ingredient is non-compound

for $r in doc("recipes.xml")//recipe
where every $i in $r/ingredient satisfies not($i/ingredient)
return $r/title
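The two quantifiers correspond closely to Python's any and all, including the behaviour on an empty sequence. The sketch below assumes a hypothetical document with three recipes titled A, B and C: A has a compound ingredient, B has only a simple one, and C has no ingredients at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: A has a compound ingredient, B a simple one, C none.
RECIPES_XML = """
<collection>
  <recipe><title>A</title>
    <ingredient name="filling"><ingredient name="eggs"/></ingredient>
  </recipe>
  <recipe><title>B</title><ingredient name="eggs"/></recipe>
  <recipe><title>C</title></recipe>
</collection>
"""

doc = ET.fromstring(RECIPES_XML)

def has_compound(r):
    # some $i in $r/ingredient satisfies $i/ingredient
    return any(i.find("ingredient") is not None for i in r.findall("ingredient"))

def all_simple(r):
    # every $i in $r/ingredient satisfies not($i/ingredient)
    return all(i.find("ingredient") is None for i in r.findall("ingredient"))

some_titles = [r.findtext("title") for r in doc.iter("recipe") if has_compound(r)]
every_titles = [r.findtext("title") for r in doc.iter("recipe") if all_simple(r)]
print(some_titles, every_titles)
```

Recipe C has no ingredients, so it fails the some test and passes the every test, matching the note on empty node sets.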

23

Element Fusion

“To every recipe, add the attribute calories!”

<result>
{let $rs := doc("recipes.xml")//recipe
for $r in $rs return
<recipe>
{$r/nutrition/@calories}
{$r/title}
</recipe>}
</result>

Here {$r/nutrition/@calories} yields an attribute and {$r/title} yields an element

24

Element Fusion (cntd.)

The query result:

<result>

<recipe calories="1167">

<title>Beef Parmesan with Garlic Angel Hair Pasta</title>

</recipe>

<recipe calories="349">

<title>Ricotta Pie</title>

</recipe>

<recipe calories="532">

<title>Linguine Pescadoro</title>

</recipe>

</result>

25

Eliminating Duplicates

The function distinct-values(Node Set)

– extracts the values of a sequence of nodes

– creates a duplicate-free sequence of values

Note the coercion: nodes are cast as values!

Example:

let $rs := doc("recipes.xml")//recipe
return distinct-values($rs//ingredient/@name)

yields

"beef cube steak

onion, sliced into thin rings

...
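For comparison, duplicate elimination over already-extracted values can be sketched in Python. The function name distinct_values and the input list are illustrative assumptions; the sketch keeps first-occurrence order, whereas an XQuery processor is not required to return the values in any particular order.

```python
def distinct_values(values):
    # dict keys are unique and keep insertion order, so this drops
    # later duplicates while preserving the first occurrence of each value
    return list(dict.fromkeys(values))

# Hypothetical attribute values, already coerced from nodes to strings
names = ["beef cube steak", "olive oil", "onion, sliced into thin rings", "olive oil"]
unique_names = distinct_values(names)
print(unique_names)
```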

26

The Order By Clause

Syntax: order by expr [ ascending | descending ]

for $iname in doc("recipes.xml")//@name
order by $iname descending
return string($iname)

yields

"whole peppercorns",
"whole baby clams",
"white sugar",
...

27

The Order By Clause (cntd.)

The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is the default)

for $r in $rs
order by number($r/nutrition/@calories)
return $r/title

Note:

– The query returns titles ...

– but the ordering is according to calories, which do not appear in the output

Not possible in SQL!
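The difference between string and numeric ordering is easy to reproduce. The Python sketch below uses the three title and calorie pairs from the Element Fusion result; the variable names are illustrative.

```python
# (title, calories-as-string) pairs taken from the Element Fusion result
recipes = [
    ("Beef Parmesan with Garlic Angel Hair Pasta", "1167"),
    ("Ricotta Pie", "349"),
    ("Linguine Pescadoro", "532"),
]

# Default-style ordering: calories compared as strings, so "1167" sorts first
by_string = [title for title, cal in sorted(recipes, key=lambda rc: rc[1])]

# Ordering after an explicit numeric conversion, as with number(...)
by_number = [title for title, cal in sorted(recipes, key=lambda rc: float(rc[1]))]

print(by_string)
print(by_number)
```

In both cases only the titles are returned, while the sort key itself never appears in the output.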

28

Grouping and Aggregation

Aggregation functions count, sum, avg, min, max

Example: The number of simple ingredients per recipe

for $r in doc("recipes.xml")//recipe

return

<number>

{attribute {"title"} {$r/title/text()}}

{count($r//ingredient[not(ingredient)])}

</number>

29

Grouping and Aggregation (cntd.)

The query result:

<number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,

<number title="Ricotta Pie">12</number>,

<number title="Linguine Pescadoro">15</number>,

<number title="Zuppa Inglese">8</number>,

<number title="Cailles en Sarcophages">30</number>

30

Nested Aggregation

“The recipe with the maximal number of calories!”

let $rs := doc("recipes.xml")//recipe
let $maxCal := max($rs//@calories)
for $r in $rs
where $r//@calories = $maxCal
return string($r/title)

returns

"Cailles en Sarcophages"

31

Running Queries with Galax

• Galax is an open-source implementation of XQuery (http://www.galaxquery.org/)

– The main developers have taken part in the definition of XQuery

RDQL

Querying on RDF data

33

Introduction

• RDF Data Query Language

• JDBC/ODBC friendly

• Simple:

SELECT some information
FROM somewhere
WHERE this match
AND these constraints
USING these vocabularies

34

Example

35

Example

• q1 contains a query:

SELECT ?x
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, "John Smith")

• To execute q1 against a model m1.rdf:

java jena.rdfquery --data m1.rdf --query q1

• The outcome is:

x
=============================
<http://somewhere/JohnSmith/>

36

Example

• Return all the resources that have property FN and the associated values:

SELECT ?x, ?fname
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, ?fname)

• The outcome is:

x                              | fname
================================================
<http://somewhere/JohnSmith/>  | "John Smith"
<http://somewhere/SarahJones/> | "Sarah Jones"
<http://somewhere/MattJones/>  | "Matt Jones"

37

Example

• Return the given names of everyone whose family name is Jones:

SELECT ?givenName

WHERE (?y, <http://www.w3.org/2001/vcard-rdf/3.0#Family>, "Jones"),

(?y, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)

• The outcome is:

givenName

=========

"Matthew"

"Sarah"

38

URI Prefixes : USING

• RDQL has a syntactic convenience that allows prefix strings to be defined in the USING clause:

SELECT ?x
WHERE (?x, vCard:FN, "John Smith")
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?givenName
WHERE (?y, vCard:Family, "Smith"),
      (?y, vCard:Given, ?givenName)
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
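How a WHERE clause with several triple patterns is answered can be sketched as repeated pattern matching with shared variable bindings. The Python below is a toy illustration, not Jena's implementation: the TRIPLES list is hypothetical (it attaches Family and Given directly to each resource for simplicity), and is_var, match and query are illustrative names. It runs the two-pattern Jones query shown earlier.

```python
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"   # what the vCard prefix expands to

# Hypothetical, simplified triples (subject, predicate, object)
TRIPLES = [
    ("http://somewhere/MattJones/", VCARD + "Family", "Jones"),
    ("http://somewhere/MattJones/", VCARD + "Given", "Matthew"),
    ("http://somewhere/SarahJones/", VCARD + "Family", "Jones"),
    ("http://somewhere/SarahJones/", VCARD + "Given", "Sarah"),
    ("http://somewhere/JohnSmith/", VCARD + "Family", "Smith"),
    ("http://somewhere/JohnSmith/", VCARD + "Given", "John"),
]

def is_var(term):
    return term.startswith("?")

def match(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, or None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:   # variable already bound to another value
                return None
        elif p != t:                        # constant term must match exactly
            return None
    return new

def query(patterns, triples):
    bindings = [{}]
    for pattern in patterns:                # each pattern narrows or extends the bindings
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

results = query([("?y", VCARD + "Family", "Jones"),
                 ("?y", VCARD + "Given", "?givenName")], TRIPLES)
given_names = [b["?givenName"] for b in results]
print(given_names)
```

Because ?y occurs in both patterns, only triples about the same resource are combined, which is what makes the query a join.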

39

Filters

• RDQL can filter the values bound to variables with constraints in an AND clause:

SELECT ?resource WHERE (?resource, info:age, ?age) AND ?age >= 24 USING info FOR <http://somewhere/peopleInfo#>
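The effect of the AND constraint can be sketched as a filter applied after pattern matching. In this Python illustration the TRIPLES list and both age values are hypothetical; only the namespace and the threshold 24 come from the query above.

```python
INFO = "http://somewhere/peopleInfo#"   # what the info prefix expands to

# Hypothetical triples; the ages are made up for illustration
TRIPLES = [
    ("http://somewhere/JohnSmith/", INFO + "age", 25),
    ("http://somewhere/SarahJones/", INFO + "age", 23),
]

# WHERE (?resource, info:age, ?age) AND ?age >= 24
resources = [s for s, p, age in TRIPLES
             if p == INFO + "age" and age >= 24]
print(resources)
```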

40

Another Example

SELECT ?title ?description ?orbit ?satellite ?sensor ?date

FROM <http://earth.esa.int/showcase/ers/dublin.rdf>

WHERE (?item <dc:title> ?title)
      (?item <dc:description> ?description)
      (?item <isc:orbit> ?orbit)
      (?item <isc:satellite> ?satellite)
      (?item <isc:sensor> ?sensor)
      (?item <dc:date> ?date)

USING isc FOR <http://earth.esa.int/standards/showcase/>
      dc FOR <http://purl.org/dc/elements/1.1/>
      rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      rdfs FOR <http://www.w3.org/2000/01/rdf-schema#>

41

Implementations

• Jena: http://jena.sourceforge.net/

• Sesame: http://sesame.aidministrator.nl/

• RDFStore: http://rdfstore.sourceforge.net/

42

Limitation

• Does not take into account the semantics of RDFS

• For example:

ex:human rdfs:subClassOf ex:animal
ex:student rdfs:subClassOf ex:human
ex:john rdf:type ex:student

Query: “To which class does the resource John belong?”

Expected answer: ex:student, ex:human, ex:animal

However, the query:

SELECT ?x
WHERE (<http://example.org/#john>, rdf:type, ?x)
USING rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

yields only:

<http://example.org/#student>

• Solution: Inference Engines
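What an inference engine adds here can be sketched as following rdfs:subClassOf links transitively from the asserted type. The Python below is a toy illustration of that one rule, not a full RDFS reasoner; the dictionaries encode the three statements from the example, and inferred_types is an illustrative name.

```python
# The three statements from the example, split by predicate
SUBCLASS_OF = {"ex:student": ["ex:human"], "ex:human": ["ex:animal"]}
TYPES = {"ex:john": ["ex:student"]}

def inferred_types(resource):
    """Asserted types plus every superclass reachable via rdfs:subClassOf."""
    result = []
    queue = list(TYPES.get(resource, []))
    while queue:
        cls = queue.pop(0)
        if cls not in result:               # also guards against subclass cycles
            result.append(cls)
            queue.extend(SUBCLASS_OF.get(cls, []))
    return result

john_types = inferred_types("ex:john")
print(john_types)
```

A plain RDQL query sees only the asserted triple, which corresponds to the first entry of this list.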

SPARQL

44

Introduction

• An RDF query language currently under development by the W3C

• Builds on previous RDF query languages such as rdfDB, RDQL, and SeRQL.

45

Example RDF

46

Example

• Simple Query:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?url
FROM <bloggers.rdf>
WHERE {
  ?contributor foaf:name "Jon Foobar" .
  ?contributor foaf:weblog ?url .
}

47

Example (cont.)

• Optional block:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?depiction

WHERE { ?person foaf:name ?name .

OPTIONAL { ?person foaf:depiction ?depiction . }

}
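The OPTIONAL block behaves much like a left outer join: every match of the required pattern is kept, and the optional variable is simply left unbound when nothing matches. The Python sketch below illustrates this on hypothetical triples (the names Alice and Bob and the image URL are made up); None stands for an unbound ?depiction.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical data: only the first person has a foaf:depiction
TRIPLES = [
    ("_:a", FOAF + "name", "Alice"),
    ("_:a", FOAF + "depiction", "http://example.org/alice.png"),
    ("_:b", FOAF + "name", "Bob"),
]

def names_with_optional_depiction(triples):
    rows = []
    for s, p, name in triples:
        if p != FOAF + "name":              # required pattern: ?person foaf:name ?name
            continue
        depictions = [o for s2, p2, o in triples
                      if s2 == s and p2 == FOAF + "depiction"]
        if depictions:
            rows.extend((name, d) for d in depictions)
        else:
            rows.append((name, None))       # OPTIONAL part did not match
    return rows

rows = names_with_optional_depiction(TRIPLES)
print(rows)
```

Without OPTIONAL, the second row would be dropped because the depiction pattern would be mandatory.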

48

Example (cont.)

• Alternative matches:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?name ?mbox
WHERE {
  ?person foaf:name ?name .
  {
    { ?person foaf:mbox ?mbox }
    UNION
    { ?person foaf:mbox_sha1sum ?mbox }
  }
}

• SPARQL has many other features that are out of scope for this class. Refer to the references for more information.

49

References

• http://www.w3.org/TR/xquery/

• A Programmer's Introduction to RDQL– http://jena.sourceforge.net/tutorial/RDQL/

• http://rdfstore.sourceforge.net/

• http://jena.sourceforge.net

• http://sesame.aidministrator.nl/

• http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/

• http://www-128.ibm.com/developerworks/java/library/j-sparql/
