SEMANTIC WEB
IMRAN IHSAN
ASSISTANT PROFESSOR, AIR UNIVERSITY, ISLAMABAD
WWW.IMRANIHSAN.COM
07 SPARQL TUTORIAL BY EXAMPLE: DBPEDIA
VIRTUOSO SERVER DOWNLOAD
• OpenLink Virtuoso Server
http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSDownload
• Make sure you download the server build that matches your operating system and architecture (32-bit or 64-bit).
VIRTUOSO STARTUP
• Virtuoso is a portable server; therefore, it does not require any installation.
• Just extract the zip file into a directory such as D:\virtuoso
• Extracting the zip file creates a folder named virtuoso-opensource
• In virtuoso-opensource, go to the database directory and copy the virtuoso.ini file into the bin directory.
• Go to D:\virtuoso\virtuoso-opensource\bin using the command line
• Run the following command
virtuoso-t -f
• The Virtuoso server will start.
• It will be available at http://localhost:8890/sparql or http://your.system.ip.address:8890/sparql
• You can run SPARQL queries directly through this public interface.
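The endpoint can also be queried from a program instead of the browser form. The following is a minimal Python sketch, assuming the server runs at the default local address given above and that JSON results are wanted; the helper names build_request and run_select are hypothetical and not part of Virtuoso.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the local Virtuoso endpoint from the slide above.
ENDPOINT = "http://localhost:8890/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_select(query, endpoint=ENDPOINT):
    """Send the query and return the decoded JSON document.

    This needs a running server, so it is not called at import time.
    """
    with urlopen(build_request(query, endpoint)) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
    print(request.full_url)  # the URL that would be fetched
```

Calling run_select with any query from the following slides should return the same rows as the web form, provided the server is up.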
LOADING DATA INTO VIRTUOSO SPARQL ENDPOINT
RDF UPLOAD USING VIRTUOSO CONDUCTOR
• This method is useful for uploading small RDF files (e.g. 100 or 200 MB files).
• In this method only one file can be uploaded at one time.
• Steps:
1. Go to http://localhost:8890/ and click on Conductor on the left side.
2. Type dba as both the login account and the password.
3. Click on the Linked Data tab and then on Quad Store Upload.
4. Select your RDF file (only one at a time), give a proper graph name, and click Upload.
5. If there is no syntax error, the RDF file will be added to the Virtuoso server.
6. Query the data using the public interface at http://localhost:8890/sparql.
USING BULK LOAD
1. Go to the bin folder and start isql (D:\virtuoso\virtuoso-opensource\bin\isql).
2. Run the following command to clear any previous load list of files
SQL>delete from db.dba.load_list;
3. Enter the following command by providing the appropriate input values
SQL> ld_dir('<source-filename-or-directory>', '<file name pattern>', '<graph iri>');
4. Note the forward slashes in the path. In our case the parameters are given below.
SQL> ld_dir('D:/virtuoso/DBPediaData', '*.nt', 'http://cbakerlab.unbsj.ca');
5. Next, enter the following command to list the files registered for loading
SQL> select * from DB.DBA.load_list;
6. Finally, enter the command to start the bulk load and wait for completion.
SQL>rdf_loader_run();
7. After a successful upload, run the shutdown command; otherwise the data may not be completely written to the database
SQL> shutdown;
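The bulk-load steps above can be collected into one script that is pasted into isql. The following is a minimal Python sketch that only builds the text of that script; the function name bulk_load_script is hypothetical, and the values are inserted without any quoting or escaping, so it assumes they contain no single quotes.

```python
from pathlib import PureWindowsPath


def bulk_load_script(data_dir, pattern, graph_iri):
    """Return the isql statements from the steps above as one script string."""
    # The slide asks for forward slashes, so convert a Windows-style path.
    directory = PureWindowsPath(data_dir).as_posix()
    statements = [
        "delete from db.dba.load_list;",                          # clear old list
        f"ld_dir('{directory}', '{pattern}', '{graph_iri}');",  # register files
        "select * from DB.DBA.load_list;",                        # inspect list
        "rdf_loader_run();",                                      # start the load
        "shutdown;",                                              # final step
    ]
    return "\n".join(statements)


script = bulk_load_script(r"D:\virtuoso\DBPediaData", "*.nt",
                          "http://cbakerlab.unbsj.ca")
print(script)
```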
QUERY #1: DBPEDIA’S SPARQL UI – HTTP://DBPEDIA.ORG/SPARQL/
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?city rdf:type <http://dbpedia.org/ontology/PopulatedPlace>
}
This query returns the URIs of all resources that are of type "Populated Place".
QUERY #2: POPULATION TOTAL
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace .
?city dbp:populationTotal ?popTotal .
}
This query returns the cities as well as their total populations.
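When a SELECT query such as this one is sent to an endpoint with JSON output requested, the answer follows the W3C SPARQL JSON results layout: a list of variable names under "head" and one binding object per row under "results". The following is a minimal Python sketch of reading that layout. The sample document is made up for illustration (the example.org URI and the population figure are not DBpedia data), and the helper name rows is hypothetical.

```python
import json

# Illustrative sample only, not real DBpedia output.
SAMPLE = """
{
  "head": {"vars": ["city", "popTotal"]},
  "results": {"bindings": [
    {"city": {"type": "uri", "value": "http://example.org/CityA"},
     "popTotal": {"type": "literal",
                  "datatype": "http://www.w3.org/2001/XMLSchema#integer",
                  "value": "120000"}}
  ]}
}
"""


def rows(document):
    """Flatten the bindings into plain dicts of variable name -> string value."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in document["results"]["bindings"]
    ]


parsed = rows(json.loads(SAMPLE))
print(parsed)
```

Every value arrives as a string; the "datatype" field tells the client how to convert it if numbers are needed.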
QUERY #3: TOTAL AND METRO POPULATION
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal ;
dbp:populationMetro ?popMetro .
}
This query returns only the cities that have both a total population and a metro population.
QUERY #4: OPTIONAL CLAUSE
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal .
OPTIONAL {?city dbp:populationMetro ?popMetro . }
}
This query returns the cities, their total population and optionally the metro population, if it exists.
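OPTIONAL behaves like a left join: every row from the required pattern is kept, and the optional variable is filled in only where a match exists. The following is a minimal Python sketch of that behaviour over made-up rows; the function name left_join and the sample values are hypothetical.

```python
def left_join(required, optional, key):
    """Keep every required row; add optional fields when the key matches."""
    out = []
    for row in required:
        matches = [o for o in optional if o[key] == row[key]]
        if matches:
            out.extend({**row, **m} for m in matches)
        else:
            out.append(dict(row))  # the optional variable stays unbound
    return out


# Made-up rows standing in for the two patterns of the query.
totals = [{"city": "A", "popTotal": 100}, {"city": "B", "popTotal": 200}]
metros = [{"city": "A", "popMetro": 150}]

result = left_join(totals, metros, "city")
print(result)
```

City B has no metro population, yet it still appears in the result, with popMetro simply missing.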
QUERY #5: ORDER BY CLAUSE
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal .
OPTIONAL {?city dbp:populationMetro ?popMetro . }
}
ORDER BY desc(?popTotal)
This query returns the cities, their total population, and optionally their metro populations. The results are returned in descending order of total population.
QUERY #6: LIMIT AND OFFSET CLAUSES
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal .
OPTIONAL {?city dbp:populationMetro ?popMetro . }
}
ORDER BY desc(?popTotal)
LIMIT 10
OFFSET 5
This query returns the cities, their total population, and optionally their metro populations. The results are returned in descending order of total population.
At most 10 results will be returned. OFFSET 5 skips the first 5 results, so the output starts with the 6th result.
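LIMIT and OFFSET act on the ordered result list in the same way as a list slice: OFFSET drops that many rows from the front, then LIMIT caps how many of the remaining rows are returned. The following is a minimal Python sketch, using the numbers 1 to 20 as stand-ins for 20 ranked rows; the function name page is hypothetical.

```python
def page(rows, limit, offset):
    """Return at most `limit` rows after skipping the first `offset` rows."""
    return rows[offset:offset + limit]


# Stand-in for 20 result rows already sorted by ORDER BY (rank 1 to 20).
ordered = list(range(1, 21))

selected = page(ordered, limit=10, offset=5)
print(selected)  # ranks 6 through 15
```

This is why paging through a large result set is usually done by keeping LIMIT fixed and increasing OFFSET by the page size each time.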
QUERY #7: FILTER
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal .
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000)
}
ORDER BY desc(?popTotal)
This is the same as Query 5, but returns only places that have a total population of more than 50,000.
Operators and functions that can be used in a FILTER expression:
Logical: &&, ||, !
Mathematical: +, -, *, /
Comparison: =, !=, <, >, <=, >=
SPARQL tests: isURI, isBlank, isLiteral, bound
SPARQL accessors: str, lang, datatype
Other: sameTerm, langMatches, regex
QUERY #8: RDFS PREDICATE - LABEL
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000)
}
ORDER BY desc(?popTotal)
This query is the same as Query 7, but also returns the human-readable name of each place.
QUERY #9: LANGUAGE MATCHING
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN"))
}
ORDER BY desc(?popTotal)
This is the same as Query 8, but keeps only the English labels.
QUERY #9A: LANGUAGE MATCHING
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000 && lang(?name) = "en")
}
ORDER BY desc(?popTotal)
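Query 9 can be rewritten without the langmatches operator, using "=" and "en" (lowercase) instead of "EN" (uppercase). The two forms are not strictly equivalent: langmatches ignores case and also accepts regional tags such as "en-GB", whereas "=" matches only the exact tag "en".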
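The difference between the two filters can be seen by writing out the matching rule. langMatches uses the basic filtering scheme of RFC 4647: a tag matches a range if the two are equal ignoring case, or if the tag starts with the range followed by a hyphen. The following is a minimal Python sketch of that rule; the function name lang_matches is hypothetical.

```python
def lang_matches(tag, language_range):
    """Basic language-range filtering in the style of RFC 4647."""
    tag = tag.lower()
    language_range = language_range.lower()
    if language_range == "*":
        return tag != ""  # "*" matches any non-empty language tag
    return tag == language_range or tag.startswith(language_range + "-")


for tag in ["en", "en-GB", "de"]:
    # Left: langmatches(lang(?name), "EN"); right: lang(?name) = "en"
    print(tag, lang_matches(tag, "EN"), tag == "en")
```

A label tagged "en-GB" passes the langmatches filter of Query 9 but is dropped by the "=" filter of Query 9A.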
QUERY #10: REGEX – REGULAR EXPRESSION
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000 &&
langmatches(lang(?name), "EN") &&
regex(str(?name),"abad"))
}
ORDER BY desc(?popTotal)
This is the same as Query 9, but matches only cities with "abad" in their names.
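The regex test is unanchored and case-sensitive by default, so "abad" may appear anywhere in the name but must be in lowercase. In SPARQL a third argument "i", as in regex(str(?name), "abad", "i"), makes the match case-insensitive. The following is a minimal Python sketch of the same two behaviours using the re module; the names are sample strings, not query results.

```python
import re

# Sample strings only; they are not taken from a DBpedia result.
names = ["Islamabad", "Hyderabad", "Abadan", "Lahore"]

# Like regex(str(?name), "abad"): substring match, case-sensitive.
case_sensitive = [n for n in names if re.search("abad", n)]

# Like regex(str(?name), "abad", "i"): substring match, ignoring case.
case_insensitive = [n for n in names if re.search("abad", n, re.IGNORECASE)]

print(case_sensitive)
print(case_insensitive)
```

"Abadan" is found only by the case-insensitive version, because its "Abad" begins with a capital letter.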
QUERY #11: NOT OPERATOR
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal ;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro. }
FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN") )
FILTER(!bound(?popMetro))
}
ORDER BY desc(?popTotal)
This query is the same as Query 9, except that it returns only cities that do not have a metro population.
QUERY #11A: NOT OPERATOR
• Find the person entries in Tim Berners-Lee's FOAF file that do not contain a URL for the person's FOAF file.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?url
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE {
?person a foaf:Person ; foaf:name ?name .
OPTIONAL { ?person rdfs:seeAlso ?url }
FILTER(!bound(?url))
}
• Negation in SPARQL 1.0 was done using OPTIONAL, the bound filter, and the logical-not operator.
• The OPTIONAL clause binds a variable in the cases we want to exclude, and the filter removes those cases.
• Try it with ARQ. (http://sparql.org/sparql.html)
QUERY #11B: MINUS
• Find the person entries in Tim Berners-Lee's FOAF file that do not contain a URL for the person's FOAF file.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?url
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE {
?person a foaf:Person ; foaf:name ?name .
MINUS { ?person rdfs:seeAlso ?url }
}
• SPARQL 1.1 includes a MINUS graph pattern clause: a binary operator that removes bindings that match the right-hand side.
• Try it with ARQ. (http://sparql.org/sparql.html)
QUERY #11C: NOT EXISTS
• Find the person entries in Tim Berners-Lee's FOAF file that do not contain a URL for the person's FOAF file.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?name ?url
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE {
?person a foaf:Person ; foaf:name ?name .
FILTER(NOT EXISTS { ?person rdfs:seeAlso ?url })
}
• SPARQL 1.1 includes a NOT EXISTS filter that uses the bindings from a solution to test whether or not a given graph pattern exists.
• In most cases, negation can be done with either MINUS or NOT EXISTS -- there are some differences in edge cases, though!
• Try it with ARQ. (http://sparql.org/sparql.html)
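One of the edge cases is a right-hand pattern that shares no variable with the left-hand side. MINUS removes a solution only when a compatible right-hand solution shares at least one variable with it, so in that case it removes nothing; NOT EXISTS only asks whether the pattern can match at all, so it removes everything. The following is a simplified Python sketch over solution rows written as dicts. The function names and sample rows are hypothetical, and the NOT EXISTS function assumes the inner pattern is a plain triple pattern.

```python
def compatible(a, b):
    """Two solutions are compatible when they agree on every shared variable."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())


def minus(left, right):
    """MINUS: drop a left row only if a compatible right row shares a variable."""
    return [
        l for l in left
        if not any(compatible(l, r) and (l.keys() & r.keys()) for r in right)
    ]


def filter_not_exists(left, right):
    """FILTER NOT EXISTS (simplified): drop a left row if any right row fits it."""
    return [l for l in left if not any(compatible(l, r) for r in right)]


people = [{"person": "p1"}, {"person": "p2"}]

# Shared variable ?person: both forms give the same answer.
shared = [{"person": "p1", "url": "u1"}]
print(minus(people, shared), filter_not_exists(people, shared))

# No shared variable (pattern written with ?x instead of ?person).
disjoint = [{"x": "p1", "url": "u1"}]
print(minus(people, disjoint), filter_not_exists(people, disjoint))
```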
QUERY #12: UNION
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
SELECT * WHERE {
{
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN") && regex(str(?name),"abad"))
}
UNION
{
?city rdf:type dbp:PopulatedPlace;
dbp:populationTotal ?popTotal;
rdfs:label ?name
OPTIONAL {?city dbp:populationMetro ?popMetro . }
FILTER (?popTotal > 50000 && langmatches(lang(?name), "EN") && regex(str(?name),"pur"))
}
}
ORDER BY desc(?popTotal)
This query returns populated places with a total population of more than 50,000 whose English name contains "abad" or "pur".
QUERY #13: NAMED GRAPHS AND THE GRAPH CLAUSE
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT * WHERE {
GRAPH ?g {
?city rdf:type <http://dbpedia.org/class/yago/CitiesInTexas> .
}
}
This query returns the cities that are of type "Cities in Texas" and the graph in which each city resource is contained.
QUERY #14: ASK
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/ontology/>
ASK WHERE {
<http://dbpedia.org/resource/Islamabad_Capital_Territory> rdf:type dbp:PopulatedPlace .
}
ASK queries check whether there is at least one result for a given query pattern.
The result is true or false.
This query asks whether the Islamabad Capital Territory is a Populated Place.
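When JSON output is requested, the answer to an ASK query is a small document with a single "boolean" field instead of a list of bindings. The following is a minimal Python sketch of reading it; the function name ask_result is hypothetical, and the sample text is an illustration of the format rather than a recorded response.

```python
import json


def ask_result(text):
    """Extract the true/false answer from an ASK result in SPARQL JSON format."""
    return json.loads(text)["boolean"]


# Illustrative document in the SPARQL JSON results format.
sample = '{"head": {}, "boolean": true}'

answer = ask_result(sample)
print(answer)
```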
QUERY #15: ASK
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/ontology/>
ASK WHERE {
?city rdf:type dbp:PopulatedPlace ;
dbp:populationTotal ?popTotal ;
dbp:populationMetro ?popMetro.
FILTER (?popTotal > 600000 && ?popMetro < 1800000)
}
This query asks whether there exists a populated place that has a total population greater than 600,000 and a metro population less than 1,800,000.
QUERY #16: DESCRIBE
DESCRIBE <http://dbpedia.org/resource/Austin,_Texas>
DESCRIBE queries return an RDF graph that describes a resource. The implementation of this return form is up to each query engine.
This query returns an RDF graph that describes Austin.
QUERY #17: DESCRIBE
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/ontology/>
DESCRIBE ?city WHERE {
?city rdf:type dbp:PopulatedPlace ;
dbp:populationTotal ?popTotal ;
dbp:populationMetro ?popMetro.
FILTER (?popTotal > 600000 && ?popMetro < 1800000)
}
This query returns an RDF graph that describes all the cities that have a total population greater than 600,000 and a metro population less than 1,800,000.
QUERY #18: CONSTRUCT
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp: <http://dbpedia.org/ontology/>
CONSTRUCT {
?city rdf:type <http://myvocabulary.com/LargeMetroCities> ;
<http://myvocabulary.com/cityName> ?name ;
<http://myvocabulary.com/totalPopulation> ?popTotal ;
<http://myvocabulary.com/metroPopulation> ?popMetro .
} WHERE {
?city rdf:type dbp:PopulatedPlace ;
dbp:populationTotal ?popTotal ;
rdfs:label ?name ;
dbp:populationMetro ?popMetro .
FILTER (?popTotal > 500000 && langmatches(lang(?name), "EN"))
}
This query constructs a new RDF graph, in a custom vocabulary, for places that have a total population greater than 500,000 and a recorded metro population.
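CONSTRUCT works by filling in the template once for every solution of the WHERE clause and collecting the resulting triples into a graph; a template triple that still contains an unbound variable is left out. The following is a minimal Python sketch of that step. The function name construct, the shortened "my:" and "ex:" names, and the sample solution are hypothetical, and the sample deliberately leaves ?popMetro unbound to show the omission, which cannot happen in the query above because every variable there is required.

```python
def construct(template, solutions):
    """Instantiate each template triple per solution; skip unbound variables."""
    graph = set()
    for solution in solutions:
        for triple in template:
            terms = tuple(
                solution.get(term[1:]) if term.startswith("?") else term
                for term in triple
            )
            if None not in terms:  # a variable had no value: omit the triple
                graph.add(terms)
    return graph


TEMPLATE = [
    ("?city", "rdf:type", "my:LargeMetroCities"),
    ("?city", "my:cityName", "?name"),
    ("?city", "my:metroPopulation", "?popMetro"),
]

# Made-up solution row in which ?popMetro has no value.
solutions = [{"city": "ex:A", "name": "A"}]

graph = construct(TEMPLATE, solutions)
print(sorted(graph))
```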