SEMANTIC WEB

IMRAN IHSAN
ASSISTANT PROFESSOR, AIR UNIVERSITY, ISLAMABAD
WWW.IMRANIHSAN.COM

07 SPARQL TUTORIAL BY EXAMPLE: DBPEDIA


VIRTUOSO SERVER DOWNLOAD

• OpenLink Virtuoso Server

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSDownload

• Make sure you download the build that matches your operating system and architecture (32-bit or 64-bit).

VIRTUOSO STARTUP

• Virtuoso is a portable server; therefore, it does not require any installation.

• Extract the zip file into a directory, e.g. D:\virtuoso

• Extraction creates a folder named virtuoso-opensource

• In virtuoso-opensource, go to the database directory and copy the virtuoso.ini file into the bin directory.

• Go to D:\virtuoso\virtuoso-opensource\bin on the command line

• Run the following command

virtuoso-t -f

• The Virtuoso server will start.

• It will be available at http://localhost:8890/sparql or http://your.system.ip.address:8890/sparql

• You can run SPARQL queries directly from this public interface.
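Besides the browser form, the endpoint can also be called over HTTP. The sketch below only builds the request URL for the local endpoint mentioned above; the helper name build_query_url is made up for illustration, and the format parameter is assumed to be accepted by Virtuoso as a shortcut for an Accept header.

```python
from urllib.parse import urlencode

# Local endpoint from the startup steps (assumes the default port 8890).
ENDPOINT = "http://localhost:8890/sparql"


def build_query_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that sends `query` to a SPARQL endpoint."""
    params = {
        "query": query,
        # Assumption: the server accepts a `format` parameter for the result type.
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


url = build_query_url("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(url)
```

Once the server is running, the URL can be opened in a browser or fetched with urllib.request.urlopen.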

LOADING DATA INTO VIRTUOSO SPARQL ENDPOINT

RDF UPLOAD USING VIRTUOSO CONDUCTOR

• This method is useful for uploading small RDF files (e.g. 100 or 200 MB files).

• Only one file can be uploaded at a time with this method.

• Steps:

1. Go to http://localhost:8890/ and click on Conductor on the left side.

2. Type dba as both the login account and the password.

3. Click on the Linked Data tab and then Quad Store Upload.

4. Select your RDF file (only one at a time), give a proper graph name and click Upload.

5. If there is no syntax error, the RDF file will be added to the Virtuoso server.

6. Query using the public interface at http://localhost:8890/sparql.

LOADING DATA INTO VIRTUOSO SPARQL ENDPOINT

USING BULK LOAD

1. Go to the bin folder and run isql (D:\virtuoso\virtuoso-opensource\bin\isql).

2. Run the following command to clear any previous load list of files

SQL> delete from DB.DBA.load_list;

3. Enter the following command, providing the appropriate input values

SQL> ld_dir('<source filename or directory>', '<file name pattern>', '<graph IRI>');

4. Use forward slashes in the path. In our case the parameters are given below.

SQL> ld_dir('D:/virtuoso/DBPediaData', '*.nt', 'http://cbakerlab.unbsj.ca');

5. Next, enter the following command to check the list of files registered for loading

SQL> select * from DB.DBA.load_list;

6. Finally, enter the command to start the bulk load and wait for completion.

SQL> rdf_loader_run();

7. After a successful upload, run the shutdown command; otherwise the files will not be completely uploaded.

SQL> shutdown;
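The bulk-load statements are easy to get wrong because of the quoting and the forward slashes. As a minimal sketch, the hypothetical helper below (bulk_load_statements is not part of Virtuoso) only assembles the statement text from the steps above, so it can be pasted into isql; it does not talk to the server.

```python
from typing import List


def bulk_load_statements(directory: str, pattern: str, graph_iri: str) -> List[str]:
    """Return the isql statements for one bulk load, in the order they are run."""
    for value in (directory, pattern, graph_iri):
        if "'" in value:
            # Keep the sketch simple: refuse values that would break the quoting.
            raise ValueError("single quotes are not supported: " + value)
    # The loader expects forward slashes, also on Windows.
    directory = directory.replace("\\", "/")
    return [
        "delete from DB.DBA.load_list;",
        f"ld_dir('{directory}', '{pattern}', '{graph_iri}');",
        "select * from DB.DBA.load_list;",
        "rdf_loader_run();",
        "shutdown;",
    ]


statements = bulk_load_statements(
    "D:\\virtuoso\\DBPediaData", "*.nt", "http://cbakerlab.unbsj.ca"
)
for statement in statements:
    print(statement)
```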

SPARQL: QUERYING DBPEDIA

DBPEDIA LOGO

• http://147.228.127.146:9220/search/_all

QUERY #1: DBPEDIA'S SPARQL UI – http://dbpedia.org/sparql/

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  ?city rdf:type <http://dbpedia.org/ontology/PopulatedPlace>
}

This query returns the URIs of all resources that are of type "Populated Place".

QUERY #2: POPULATION TOTAL

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace .
  ?city dbp:populationTotal ?popTotal .
}

This query returns the cities as well as their total populations.

QUERY #3: TOTAL AND METRO POPULATION

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        dbp:populationMetro ?popMetro .
}

This query returns the cities that have both a total population and a metro population, together with those two values.

QUERY #4: OPTIONAL CLAUSE

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
}

This query returns the cities, their total population and optionally the metro population, if it exists.
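The difference between requiring a pattern (Query 3) and making it OPTIONAL (Query 4) behaves like an inner join versus a left join. The sketch below mimics this on made-up sample data (the identifiers and numbers are not real DBpedia values); None stands for an unbound variable.

```python
# Made-up sample data: every place has a total, only one has a metro figure.
pop_total = {"ex:A": 900, "ex:B": 500, "ex:C": 100}
pop_metro = {"ex:A": 1500}


def required_both():
    """Query 3 style: a row appears only if both properties are present."""
    return [(c, pop_total[c], pop_metro[c]) for c in pop_total if c in pop_metro]


def with_optional():
    """Query 4 style: every row is kept; a missing metro value stays unbound (None)."""
    return [(c, total, pop_metro.get(c)) for c, total in pop_total.items()]


print(required_both())
print(with_optional())
```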

QUERY #5: ORDER BY CLAUSE

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
}
ORDER BY desc(?popTotal)

This query returns the cities, their total population, and optionally their metro populations. The results are returned in descending order of total population.

QUERY #6: LIMIT AND OFFSET CLAUSES

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
}
ORDER BY desc(?popTotal)
LIMIT 10
OFFSET 5

This query returns the cities, their total population, and optionally their metro populations. The results are returned in descending order of total population.

OFFSET 5 skips the first five results, so at most 10 results are returned, starting with the 6th.
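ORDER BY, OFFSET and LIMIT behave like sorting a list and then slicing it: OFFSET 5 skips five rows, and LIMIT 10 keeps at most the next ten. A minimal sketch on made-up rows (the place names and populations are invented for illustration):

```python
def page(rows, limit, offset):
    """Sort by population (descending), skip `offset` rows, keep at most `limit`."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return ordered[offset:offset + limit]


# Twenty made-up places with populations 1000, 2000, ..., 20000.
rows = [(f"place{i}", i * 1000) for i in range(1, 21)]

result = page(rows, limit=10, offset=5)
print(result[0])    # the 6th largest place
print(len(result))
```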

QUERY #7: FILTER

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
  FILTER (?popTotal > 50000)
}
ORDER BY desc(?popTotal)

This is the same as Query 5, but returns only places that have a total population of more than 50,000.

FILTER expressions can use the following operators and functions:

Logical: &&, ||, !

Mathematical: +, -, *, /

Comparison: =, !=, <, >, <=, >=

SPARQL tests: isURI, isBlank, isLiteral, bound

SPARQL accessors: str, lang, datatype

Other: sameTerm, langMatches, regex

QUERY #8: RDFS PREDICATE - LABEL

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        rdfs:label ?name .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
  FILTER (?popTotal > 50000)
}
ORDER BY desc(?popTotal)

This query is the same as Query 7, but also returns the human-readable name of each place.

QUERY #9: LANGUAGE MATCHING

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        rdfs:label ?name .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
  FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN"))
}
ORDER BY desc(?popTotal)

This is the same as Query 8, but keeps only the English labels.

QUERY #9A: LANGUAGE MATCHING

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        rdfs:label ?name .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
  FILTER (?popTotal > 50000 && lang(?name) = "en")
}
ORDER BY desc(?popTotal)

Query 9 can be rewritten without the langmatches operator, using "=" and "en" (lowercase) instead of "EN" (uppercase). The two forms are not strictly equivalent: langmatches ignores case and also accepts regional tags such as "en-GB", while "=" requires the tag to be exactly "en".
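The difference between langmatches and plain equality on the language tag can be sketched in a few lines. The function below is a simplified stand-in for the basic language-range filtering (RFC 4647) that langMatches is defined in terms of; it is an illustration, not the code of any SPARQL engine.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Simplified basic filtering: case-insensitive match on the tag or a prefix of it."""
    if lang_range == "*":
        # "*" matches any literal that has a language tag at all.
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


for tag in ["en", "en-GB", "fr", ""]:
    print(repr(tag), lang_matches(tag, "EN"), tag == "en")
```

For the tag "en-GB" the first column is True and the second is False, which is where Query 9 and Query 9A can return different rows.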

QUERY #10: REGEX – REGULAR EXPRESSION

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        rdfs:label ?name .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
  FILTER (?popTotal > 50000 &&
          langmatches(lang(?name), "EN") &&
          regex(str(?name), "abad"))
}
ORDER BY desc(?popTotal)

This is the same as Query 9, but matches only cities with "abad" in their names.
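regex(str(?name), "abad") is an unanchored, case-sensitive match: the pattern may occur anywhere in the name. SPARQL also accepts a third argument "i" for case-insensitive matching. The sketch below imitates both behaviours with Python's re module on a few sample names; Python's regex dialect differs from SPARQL's in some details, but not for a plain pattern like this one.

```python
import re

# Sample names chosen for illustration.
names = ["Islamabad", "Hyderabad", "Abadan", "Lahore"]

# Like regex(str(?name), "abad"): substring match, case-sensitive.
case_sensitive = [n for n in names if re.search("abad", n)]

# Like regex(str(?name), "abad", "i"): substring match, ignoring case.
case_insensitive = [n for n in names if re.search("abad", n, re.IGNORECASE)]

print(case_sensitive)
print(case_insensitive)
```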

QUERY #11: NOT OPERATOR

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        rdfs:label ?name .
  OPTIONAL { ?city dbp:populationMetro ?popMetro . }
  FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN"))
  FILTER (!bound(?popMetro))
}
ORDER BY desc(?popTotal)

This query is the same as Query 9, except that it returns only cities that do not have a metro population.

QUERY #11A: NOT OPERATOR

• Find the person entries in Tim Berners-Lee's FOAF file that do not contain a URL for the person's FOAF file.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name ?url
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE {
  ?person a foaf:Person ; foaf:name ?name .
  OPTIONAL { ?person rdfs:seeAlso ?url }
  FILTER (!bound(?url))
}

• Negation in SPARQL 1.0 was done using OPTIONAL, the bound filter, and the logical-not operator.

• The OPTIONAL clause binds a variable in the cases we want to exclude, and the filter removes those cases.

• Try it with ARQ. (http://sparql.org/sparql.html)

QUERY #11B: MINUS

• Find the person entries in Tim Berners-Lee's FOAF file that do not contain a URL for the person's FOAF file.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name ?url
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE {
  ?person a foaf:Person ; foaf:name ?name .
  MINUS { ?person rdfs:seeAlso ?url }
}

• SPARQL 1.1 includes a MINUS graph pattern clause: a binary operator that removes bindings that match the right-hand side.

• Try it with ARQ. (http://sparql.org/sparql.html)

QUERY #11C: NOT EXISTS

• Find the person entries in Tim Berners-Lee's FOAF file that do not contain a URL for the person's FOAF file.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name ?url
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE {
  ?person a foaf:Person ; foaf:name ?name .
  FILTER (NOT EXISTS { ?person rdfs:seeAlso ?url })
}

• SPARQL 1.1 includes a NOT EXISTS filter that uses the bindings from a solution to test whether or not a given graph pattern exists.

• In most cases, negation can be done with either MINUS or NOT EXISTS -- there are some differences in edge cases, though!

• Try it with ARQ. (http://sparql.org/sparql.html)
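The three negation styles of Queries 11A to 11C can be compared on a small in-memory example. The sketch below uses made-up people and URLs (not the contents of the real FOAF file) and shows that, in this simple case, the OPTIONAL plus !bound approach and a direct "no matching pattern" test select the same rows.

```python
# Made-up data: person id -> name, and person id -> rdfs:seeAlso URL (if any).
people = {"p1": "Alice", "p2": "Bob", "p3": "Carol"}
see_also = {"p1": "http://example.org/alice/foaf.rdf"}


def negation_optional_not_bound():
    """Query 11A style: extend each row optionally, then drop rows where ?url is bound."""
    rows = [(p, name, see_also.get(p)) for p, name in people.items()]  # OPTIONAL
    return [name for _, name, url in rows if url is None]              # FILTER(!bound(?url))


def negation_not_exists():
    """Query 11B/11C style: keep a person only if no seeAlso entry exists for them."""
    return [name for p, name in people.items() if p not in see_also]


print(negation_optional_not_bound())
print(negation_not_exists())
```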

QUERY #12: UNION

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

SELECT * WHERE {
  {
    ?city rdf:type dbp:PopulatedPlace ;
          dbp:populationTotal ?popTotal ;
          rdfs:label ?name .
    OPTIONAL { ?city dbp:populationMetro ?popMetro . }
    FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN") && regex(str(?name), "abad"))
  }
  UNION
  {
    ?city rdf:type dbp:PopulatedPlace ;
          dbp:populationTotal ?popTotal ;
          rdfs:label ?name .
    OPTIONAL { ?city dbp:populationMetro ?popMetro . }
    FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN") && regex(str(?name), "pur"))
  }
}
ORDER BY desc(?popTotal)

This query returns the places with a total population above 50,000 whose English name contains "abad", together with those whose English name contains "pur".

QUERY #13: NAMED GRAPHS AND THE GRAPH CLAUSE

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT * WHERE {
  GRAPH ?g {
    ?city rdf:type <http://dbpedia.org/class/yago/CitiesInTexas> .
  }
}

This query returns the cities that are of type "Cities in Texas" and the graph in which each city resource is contained.

QUERY #14: ASK

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/ontology/>

ASK WHERE {
  <http://dbpedia.org/resource/Islamabad_Capital_Territory> rdf:type dbp:PopulatedPlace .
}

ASK queries check whether there is at least one result for a given query pattern.

The result is true or false.

This query asks whether the Islamabad Capital Territory resource is a Populated Place.

QUERY #15: ASK

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/ontology/>

ASK WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        dbp:populationMetro ?popMetro .
  FILTER (?popTotal > 600000 && ?popMetro < 1800000)
}

This query asks whether there exists a populated place that has a total population greater than 600,000 and a metro population less than 1,800,000.
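An ASK query answers only "is there at least one match?", which is the same idea as Python's any() over candidate rows. A minimal sketch with made-up rows (identifier, total population, metro population):

```python
def ask(places):
    """Return True if at least one row satisfies the FILTER of Query 15."""
    return any(total > 600_000 and metro < 1_800_000 for _, total, metro in places)


# Made-up sample rows, not real DBpedia values.
sample = [("ex:A", 650_000, 1_200_000), ("ex:B", 700_000, 2_500_000)]

print(ask(sample))
print(ask(sample[1:]))
```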

QUERY #16: DESCRIBE

DESCRIBE <http://dbpedia.org/resource/Austin,_Texas>

DESCRIBE queries return an RDF graph that describes a resource. The implementation of this return form is up to each query engine.

This query returns an RDF graph that describes Austin.

QUERY #17: DESCRIBE

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/ontology/>

DESCRIBE ?city WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        dbp:populationMetro ?popMetro .
  FILTER (?popTotal > 600000 && ?popMetro < 1800000)
}

This query returns an RDF graph that describes all the cities that have a total population greater than 600,000 and a metro population less than 1,800,000.

QUERY #18: CONSTRUCT

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>

CONSTRUCT {
  ?city rdf:type <http://myvocabulary.com/LargeMetroCities> ;
        <http://myvocabulary.com/cityName> ?name ;
        <http://myvocabulary.com/totalPopulation> ?popTotal ;
        <http://myvocabulary.com/metroPopulation> ?popMetro .
} WHERE {
  ?city rdf:type dbp:PopulatedPlace ;
        dbp:populationTotal ?popTotal ;
        rdfs:label ?name ;
        dbp:populationMetro ?popMetro .
  FILTER (?popTotal > 500000 && langmatches(lang(?name), "EN"))
}

This query constructs a new RDF graph, in a custom vocabulary, for places that have a metro population recorded and a total population greater than 500,000.
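CONSTRUCT fills a triple template once for every solution of the WHERE clause. The sketch below imitates this with plain Python tuples; the input binding is made up, and the vocabulary URIs are the ones used in the CONSTRUCT template of Query 18.

```python
MYVOC = "http://myvocabulary.com/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def construct(bindings):
    """Instantiate the four-triple template once per solution (dict of variable values)."""
    triples = []
    for b in bindings:
        city = b["city"]
        triples.append((city, RDF_TYPE, MYVOC + "LargeMetroCities"))
        triples.append((city, MYVOC + "cityName", b["name"]))
        triples.append((city, MYVOC + "totalPopulation", b["popTotal"]))
        triples.append((city, MYVOC + "metroPopulation", b["popMetro"]))
    return triples


# One made-up solution, standing in for a row returned by the WHERE clause.
solutions = [
    {"city": "http://example.org/CityA", "name": "City A",
     "popTotal": 750000, "popMetro": 1600000},
]

graph = construct(solutions)
for triple in graph:
    print(triple)
```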