EKAW - Triple Pattern Fragments

Triple Pattern Fragments

Ruben Taelman - @rubensworks

imec - Ghent University

Upload: ruben-taelman

Post on 10-Feb-2017

Evaluate SPARQL queries client-side with TPF

SELECT ?person ?city WHERE {
  ?person rdf:type dbpedia-owl:Artist.
  ?person dbpedia-owl:birthPlace ?city.
  ?city foaf:name "Waterloo"@en.
}

?person rdf:type dbpedia-owl:Artist.

?person dbpedia-owl:birthPlace ?city.

?city foaf:name "Waterloo"@en.

Publishing with Triple Pattern Fragments

Server

Client

Evaluations

Publishing with Triple Pattern Fragments: Server

Simple triple pattern interface

Triple pattern queries, paged results

Example: s1 p1 o1 at page 2

http://example.org/my-dataset?subject=s1&predicate=p1&object=o1&page=2

Metadata and controls
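
A minimal Python sketch of how a client might build such a fragment URL. The parameter names subject, predicate, object and page are taken from the example URL above; the function name fragment_url is hypothetical, and a real client would typically read the URL pattern from the fragment's controls rather than hard-code it.

```python
from urllib.parse import urlencode


def fragment_url(base, subject=None, predicate=None, obj=None, page=1):
    """Build the URL of one page of a triple pattern fragment.

    Positions left as None are unbound and are omitted from the query string.
    The parameter names are an assumption based on the example URL.
    """
    params = {}
    if subject is not None:
        params["subject"] = subject
    if predicate is not None:
        params["predicate"] = predicate
    if obj is not None:
        params["object"] = obj
    if page > 1:
        params["page"] = page
    return base + ("?" + urlencode(params) if params else "")


print(fragment_url("http://example.org/my-dataset", "s1", "p1", "o1", page=2))
```

Leaving a position unbound, for example fragment_url(base, predicate="p1"), asks for every triple with that predicate.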

Effective caching because of the limited number of URIs

Frequently used TPFs are cached

Cached TPFs can be delivered efficiently

Queries with common data fragments will evaluate faster
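
The effect can be sketched in Python with a cache keyed by fragment URL. This is only an illustration: the class FragmentCache and the stand-in fake_fetch are hypothetical, and in practice the caching would usually be done by ordinary HTTP caches rather than by application code.

```python
class FragmentCache:
    """Cache fragment pages by URL; fetch is any callable mapping URL -> page."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._pages = {}
        self.misses = 0  # number of times the server had to be contacted

    def get(self, url):
        if url not in self._pages:
            self.misses += 1
            self._pages[url] = self._fetch(url)
        return self._pages[url]


def fake_fetch(url):
    # Stand-in for an HTTP GET (assumption: no real network access here).
    return "page for " + url


cache = FragmentCache(fake_fetch)
url = "http://example.org/my-dataset?predicate=p1"
cache.get(url)  # first query: fetched from the server
cache.get(url)  # second query sharing the fragment: served from the cache
print(cache.misses)
```

Because every query is broken down into the same small set of fragment URLs, two different queries that share a triple pattern hit the same cache entry.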

Storage solution must support triple pattern queries

In-memory triplestore for RDF files

HDT (Fernández 2010)

SPARQL endpoints
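
To illustrate what "supporting triple pattern queries" asks of a storage backend, here is a Python sketch of an in-memory store answering one paged triple pattern request. The names match_pattern and fragment, the page_size default and the toy triples are assumptions for illustration, not part of any TPF server.

```python
def match_pattern(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def fragment(triples, s=None, p=None, o=None, page=1, page_size=2):
    """Return one page of matches plus the total count as metadata."""
    matches = match_pattern(triples, s, p, o)
    start = (page - 1) * page_size
    return {"triples": matches[start:start + page_size],
            "total": len(matches)}


# Toy dataset (hypothetical).
DATA = [
    ("s1", "p1", "o1"),
    ("s1", "p1", "o2"),
    ("s1", "p2", "o1"),
    ("s2", "p1", "o1"),
]

print(fragment(DATA, p="p1", page=2))
```

Any backend that can answer match_pattern efficiently, whether an in-memory store, an HDT file or a SPARQL endpoint, can sit behind the interface.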

Server is simple

In order to make publication cheap and easy

Publishing with Triple Pattern Fragments: Client

Clients can evaluate any SPARQL query using the simple TPF interface of one or more servers

Split up SPARQL queries into separate triple pattern queries

Combine results client-side
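
The splitting step can be sketched in Python for the example query used in these slides. This is a deliberately naive splitter, not a SPARQL parser: it assumes a single basic graph pattern whose triple patterns end in "." and whose terms contain neither dots nor whitespace. The function name triple_patterns is hypothetical.

```python
import re

QUERY = '''SELECT ?person ?city WHERE {
  ?person rdf:type dbpedia-owl:Artist.
  ?person dbpedia-owl:birthPlace ?city.
  ?city foaf:name "Waterloo"@en.
}'''


def triple_patterns(query):
    """Split the body of a simple SELECT query into (s, p, o) tuples."""
    body = re.search(r"\{(.*)\}", query, re.DOTALL).group(1)
    patterns = []
    for chunk in body.split("."):
        terms = chunk.split()
        if len(terms) == 3:  # skip the empty remainder after the last "."
            patterns.append(tuple(terms))
    return patterns


for pattern in triple_patterns(QUERY):
    print(pattern)
```

Each resulting tuple corresponds to one triple pattern request to the server; a production client would use a full SPARQL parser instead.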

Joining triple pattern fragments can be tricky

SELECT ?person ?city WHERE {
  ?person rdf:type dbpedia-owl:Artist.
  ?person dbpedia-owl:birthPlace ?city.
  ?city foaf:name "Waterloo"@en.
}

?person rdf:type dbpedia-owl:Artist.

?person dbpedia-owl:birthPlace ?city.

?city foaf:name "Waterloo"@en.

Some query plans are more efficient than others

Number of triples as metadata in TPFs

?person rdf:type dbpedia-owl:Artist.      96 000

?person dbpedia-owl:birthPlace ?city.     12 000 000

?city foaf:name "Waterloo"@en.            26

Select most selective triple pattern

?person rdf:type dbpedia-owl:Artist.      96 000

?person dbpedia-owl:birthPlace ?city.     12 000 000

?city foaf:name "Waterloo"@en.            26    Most selective!
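
In Python, the selection step amounts to taking the pattern with the smallest count. The sketch below assumes the three counts on this slide pair up with the patterns as 96 000 for the Artist pattern, 12 000 000 for the birthPlace pattern and 26 for the name pattern; the dictionary layout is hypothetical.

```python
# Estimated match counts, as a client would read them from fragment metadata.
counts = {
    ("?person", "rdf:type", "dbpedia-owl:Artist"): 96_000,
    ("?person", "dbpedia-owl:birthPlace", "?city"): 12_000_000,
    ("?city", "foaf:name", '"Waterloo"@en'): 26,
}

# The most selective pattern is the one with the fewest matching triples.
most_selective = min(counts, key=counts.get)
print(most_selective, counts[most_selective])
```

Starting from the smallest fragment keeps the number of follow-up requests low: 26 candidate cities instead of 96 000 candidate artists.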

Find all results for most selective pattern

?person rdf:type dbpedia-owl:Artist.

?person dbpedia-owl:birthPlace ?city.

?city foaf:name "Waterloo"@en.

dbp:Waterloo,_Iowa foaf:name "Waterloo"@en
dbp:Waterloo,_London foaf:name "Waterloo"@en
dbp:Waterloo,_Ontario foaf:name "Waterloo"@en
...

Fill in results in other patterns

?person rdf:type dbpedia-owl:Artist.

?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa.

?city foaf:name "Waterloo"@en.

dbp:Waterloo,_Iowa foaf:name "Waterloo"@en
dbp:Waterloo,_London foaf:name "Waterloo"@en
dbp:Waterloo,_Ontario foaf:name "Waterloo"@en
...
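
The fill-in step can be sketched in Python as two small helpers: one that derives variable bindings from a matching triple, and one that substitutes them into another pattern. The names bind and substitute are hypothetical, and terms are compared as plain strings.

```python
def is_var(term):
    return term.startswith("?")


def bind(pattern, triple):
    """Return the bindings that make pattern equal to triple, or None."""
    bindings = {}
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable used twice must receive the same value both times.
            if bindings.get(p_term, t_term) != t_term:
                return None
            bindings[p_term] = t_term
        elif p_term != t_term:
            return None
    return bindings


def substitute(pattern, bindings):
    """Replace bound variables in pattern by their values."""
    return tuple(bindings.get(term, term) for term in pattern)


match = ("dbp:Waterloo,_Iowa", "foaf:name", '"Waterloo"@en')
bindings = bind(("?city", "foaf:name", '"Waterloo"@en'), match)
print(substitute(("?person", "dbpedia-owl:birthPlace", "?city"), bindings))
```

The Artist pattern does not mention ?city, so substitution leaves it unchanged, as on the slide.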

Recursively repeat for remaining patterns

?person rdf:type dbpedia-owl:Artist.                   96 000

?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa.     45

Select most selective pattern

?person rdf:type dbpedia-owl:Artist.                   96 000

?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa.     45

Find all results for most selective pattern

?person rdf:type dbpedia-owl:Artist.                   96 000

?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa.     45

dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...

Fill in results in other patterns

dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist.

?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .

96 000

45

dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...

One solution is found

dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist. 1

Repeat process for all other matches
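
The whole walk-through can be condensed into a short recursive Python sketch. It runs against a toy in-memory list holding only triples shown on these slides, so match stands in for fetching a fragment and len(match(...)) for reading its count metadata. All function names are hypothetical, the dbo: prefix is used throughout because terms are compared as plain strings, and patterns that repeat a variable are not handled.

```python
def is_var(term):
    return term.startswith("?")


def match(triples, pattern):
    """Stand-in for fetching a fragment: all triples matching the pattern."""
    return [t for t in triples
            if all(is_var(p) or p == v for p, v in zip(pattern, t))]


def substitute(pattern, bindings):
    return tuple(bindings.get(term, term) for term in pattern)


def evaluate(triples, patterns, bindings=None):
    """Yield one dict of variable bindings per solution."""
    bindings = bindings or {}
    if not patterns:
        yield dict(bindings)  # every pattern is satisfied: one solution
        return
    # Select the most selective pattern (lowest count).
    best = min(patterns, key=lambda p: len(match(triples, p)))
    rest = [p for p in patterns if p is not best]
    # For each of its matches, fill in the values and recurse on the rest.
    for triple in match(triples, best):
        new = {p: v for p, v in zip(best, triple) if is_var(p)}
        bound_rest = [substitute(p, new) for p in rest]
        yield from evaluate(triples, bound_rest, {**bindings, **new})


# Toy data: a handful of the triples shown on the slides.
DATA = [
    ("dbp:Waterloo,_Iowa", "foaf:name", '"Waterloo"@en'),
    ("dbp:Waterloo,_London", "foaf:name", '"Waterloo"@en'),
    ("dbp:Waterloo,_Ontario", "foaf:name", '"Waterloo"@en'),
    ("dbp:Allan_Carpenter", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Bonnie_Koloc", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Allan_Carpenter", "rdf:type", "dbo:Artist"),
]

PATTERNS = [
    ("?person", "rdf:type", "dbo:Artist"),
    ("?person", "dbo:birthPlace", "?city"),
    ("?city", "foaf:name", '"Waterloo"@en'),
]

solutions = list(evaluate(DATA, PATTERNS))
print(solutions)
```

On this toy data the Artist pattern happens to be the most selective one, because the list contains a single artist; with the counts from the slides the client would start from the name pattern instead. The solutions are the same either way, only the number of requests differs.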

Joining algorithm is not always optimal

Other algorithms are possible

Improved algorithm minimizes the number of requests (Van Herwegen 2015)

Additional metadata may improve query plans

Publishing with Triple Pattern Fragments: Evaluations

Send many client queries to a single server

1 server (TPF, Virtuoso, Fuseki)

1 to 244 simultaneous clients

Different query types from Berlin SPARQL benchmark

(Verborgh 2016)

Query throughput is lower

Server load is lower

TPF is just one possible trade-off

Reduce client load:

Additional metadata for membership search (Vander Sande 2015)

Substring filtering (Van Herwegen 2015)

Dynamic data publication and querying (Taelman 2016)

Reduce server load:

Decentralized caching (Folz 2016)

Conclusions

TPF servers have a simple low-cost interface

TPF clients evaluate SPARQL queries locally, using this interface

Next session

Setting up a TPF server yourself

Querying the server

Sources

R. Verborgh. “Linked Data Publishing.” http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/

R. Verborgh, M. Vander Sande, O. Hartig, et al. “Triple Pattern Fragments: a Low-cost Knowledge Graph Interface for the Web.”