continuously updating query results over real-time linked data

Post on 10-Feb-2017


Ruben Taelman - @rubensworks

iMinds - Ghent University

Continuously Updating Query Results over Real-Time Linked Data

Dynamic Linked Data

E.g. Thermometer measures every minute:

“19,05°C” - 30-05-2016 11:00
“19,06°C” - 30-05-2016 11:01
“19,11°C” - 30-05-2016 11:02
“19,08°C” - 30-05-2016 11:03
…

Typically exposed as an RDF stream = stream of <RDF triple, timestamp>
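A minimal sketch of such a stream, using the thermometer readings above. The names `StreamElement`, `ex:thermometer` and `ex:temperature` are illustrative assumptions, not part of the slides.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class StreamElement:
    """One element of an RDF stream: an RDF triple plus its timestamp."""
    triple: Triple
    timestamp: datetime


def thermometer_stream() -> Iterator[StreamElement]:
    """Yield the four readings from the slide as <triple, timestamp> pairs."""
    readings = [
        ("19.05", datetime(2016, 5, 30, 11, 0)),
        ("19.06", datetime(2016, 5, 30, 11, 1)),
        ("19.11", datetime(2016, 5, 30, 11, 2)),
        ("19.08", datetime(2016, 5, 30, 11, 3)),
    ]
    for value, moment in readings:
        # Subject and predicate IRIs are hypothetical placeholders.
        yield StreamElement(("ex:thermometer", "ex:temperature", value), moment)


def latest(stream: Iterable[StreamElement]) -> StreamElement:
    """Return the element with the most recent timestamp."""
    return max(stream, key=lambda element: element.timestamp)
```

A query such as "what is the current temperature?" then amounts to taking `latest(thermometer_stream())` each time a new element arrives.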

Querying continuous data

Clients send queries to server: e.g. What is the current temperature?

Server continuously evaluates the queries

→ Server does all of the work

Cause of low public endpoint availability!

½ have availability of < 95% (Buil-Aranda 2013)

→ Clients just wait for results

What if we moved continuous query evaluation to the client?

→ to lower server load

Triple Pattern Fragments does this for static data!

Triple pattern fragments (TPF) (Verborgh 2016):

Servers can only respond to triple pattern queries

Clients need to evaluate queries locally

→ Lowers server complexity
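The division of work can be sketched as follows. This is an in-memory illustration only, not the actual TPF HTTP interface: `server_match`, `client_join`, the `ex:title` predicate and the song title are assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]  # None = wildcard

STORE: List[Triple] = [
    ("radio:bbc-radio-1", "m:plays", "radio:jauz-netsky-higher"),
    ("radio:jauz-netsky-higher", "ex:title", "Higher"),  # hypothetical triple
]


def server_match(store: Iterable[Triple], pattern: Pattern) -> Iterator[Triple]:
    """The only operation the server offers: match a single triple pattern."""
    for triple in store:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple


def client_join(store: Iterable[Triple], first: Pattern,
                second_predicate: str) -> List[Tuple[str, str]]:
    """Client-side join: the client combines two triple pattern requests itself."""
    store = list(store)
    results = []
    for subject, _, obj in server_match(store, first):
        # One extra triple pattern request per intermediate binding.
        for _, _, value in server_match(store, (obj, second_predicate, None)):
            results.append((subject, value))
    return results
```

Here the server never sees the full query; it only answers single patterns, and the join logic runs on the client.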

Can we do the same for dynamic data?

Overview

Dynamic data representation

Query streamer engine

Evaluation

Dynamic data representation

Expose dynamic data through the TPF interface

→ Represent dynamic data in RDF

We annotate dynamic data with the time at which they are valid

→ Client can derive the time at which data can change!
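One way a client could derive its next evaluation moment from such annotations is sketched below. The function names are hypothetical, and the sketch assumes each dynamic result carries the time until which it is valid.

```python
from datetime import datetime
from typing import Iterable, Optional


def next_evaluation_time(valid_until: Iterable[datetime]) -> Optional[datetime]:
    """The earliest end of validity is the first moment any result can change."""
    return min(valid_until, default=None)


def seconds_to_wait(valid_until: Iterable[datetime], now: datetime) -> Optional[float]:
    """Seconds until the client should re-evaluate; None if nothing is dynamic."""
    target = next_evaluation_time(valid_until)
    if target is None:
        return None
    # Data that has already expired means: re-evaluate immediately.
    return max(0.0, (target - now).total_seconds())
```

For example, with results valid until 9:20 and 9:25 and a current time of 9:18, the client can sleep for 120 seconds instead of polling the server continuously.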

But how do we annotate data/triples with time?

Annotation methods

Reification: Outdated

Singleton properties (Nguyen 2014): Instantiate predicates

Graphs: Define fourth element in quad

Implicit graphs: TPF makes triples (de)referenceable

Time labeling types

Time interval: Start- and endtime of validity

→ Good for maintaining a history of elements

Expiration time: Endtime of validity

→ When only the latest version is required

Dynamic data example

radio:bbc-radio-1 m:plays radio:jauz-netsky-higher.

GRAPH _:g1 {
  radio:bbc-radio-1 m:plays radio:jauz-netsky-higher.
}
_:g1 tmp:interval _:interval_1.
_:interval_1 tmp:initial "2016-05-30T09:15:00"^^xsd:dateTime.
_:interval_1 tmp:final "2016-05-30T09:20:00"^^xsd:dateTime.

Graph-annotation: [ 9:15, 9:20 ]
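The same annotated fact can be modelled client-side roughly as follows. This is a sketch under assumptions: the class names are made up for illustration, and the interval is treated as closed on both ends, as the bracket notation suggests.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:
    """Validity interval, mirroring tmp:initial and tmp:final."""
    initial: datetime
    final: datetime

    def contains(self, moment: datetime) -> bool:
        # Closed interval: both endpoints count as valid.
        return self.initial <= moment <= self.final


@dataclass(frozen=True)
class AnnotatedTriple:
    """A triple together with the interval attached to its graph."""
    subject: str
    predicate: str
    obj: str
    validity: Interval


EXAMPLE = AnnotatedTriple(
    "radio:bbc-radio-1",
    "m:plays",
    "radio:jauz-netsky-higher",
    Interval(datetime(2016, 5, 30, 9, 15), datetime(2016, 5, 30, 9, 20)),
)


def is_current(fact: AnnotatedTriple, now: datetime) -> bool:
    """True while the fact's validity interval covers the given moment."""
    return fact.validity.contains(now)
```

Once `is_current` turns false for a fact, the client knows that part of the result is stale and has to be fetched again.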

Overview

Dynamic data representation

Query engine

Evaluation

Query streamer engine

Overview

Dynamic data representation

Query streamer engine

Evaluation

Evaluating annotation methods

Measure query execution times for query duration

Query: “All trains with their delay in station X within the next hour”
Frequency: 10 seconds
Clients: 1
Engine: Query streamer

Annotation methods: singleton property, graph, implicit graph
Time labeling types: time interval, expiration time

[Figure panels: Time interval, Expiration time]

Evaluating scalability

Measure server CPU usage for increasing # clients

Query: “All trains with their delay in station X within the next hour”
Frequency: 10 seconds
Clients: 1 → 200
Engines: Query streamer, C-SPARQL (Barbieri 2012) and CQELS (Le-Phuoc 2011)

Annotation method: graph
Time labeling types: expiration time

Query Streamer has better scalability

Query Streamer moves load from server to client

Overview

Dynamic data representation

Annotate dynamic data with time

Query streamer engine

Client-side query engine

Dynamic data at TPF server

Evaluation

Annotation methods

Scalability

Conclusions

Further evaluation: Different query types, …?

Solve the efficiency problem of time intervals?

Promising approach for improved scalability
