continuously updating query results over real-time linked data
TRANSCRIPT
Ruben Taelman - @rubensworksiMinds - Ghent University
Continuously Updating Query Resultsover Real-Time Linked Data
Dynamic Linked DataE.g. Thermometer measures every minute:
“19,05°C” - 30-05-2016 11:00“19,06°C” - 30-05-2016 11:01“19,11°C” - 30-05-2016 11:02“19,08°C” - 30-05-2016 11:03…
Typically exposed as an RDF stream = stream of <RDF triple, timestamp>
Querying continous dataClients send queries to server: e.g. What is the current temperature?
Server continuously evaluates the queries
→ Server does all of the work
Cause of low public endpoint availability!½ have availability of < 95% (Buil-Aranda 2013)
→ Clients just wait for results
What if we moved continuous query evaluation to the client?→ to lower server load
Triple Pattern Fragments does this for static data!
Triple pattern fragments (TPF) (Verborgh 2016):
Servers can only respond to triple pattern queriesClients need to evaluate queries locally→ Lowers server complexity
Can we do the same for dynamic data?
OverviewDynamic data representation
Query streamer engine
Evaluation
OverviewDynamic data representation
Query streamer engine
Evaluation
Dynamic data representationExpose dynamic data through the TPF interface
→ Represent dynamic data in RDF
We annotate dynamic data with the time at which they are valid
→ Client can derive the time at which data can change!
But how do we annotate data/triples with time?
Annotation methodsReification
Singleton properties (Nguyen 2014)
Graphs
Implicit graphs
Outdated
Instantiate predicates
Define fourth element in quad
TPF makes triples (de)referencable
Time labeling typesTime interval
Expiration time
Start- and endtime of validity
Good for maintaining a history of elements
Endtime of validity
When only the latest version is required
Dynamic data example
radio:bbc-radio-1 m:plays radio:jauz-netsky-higher.
GRAPH _:g1 {radio:bbc-radio-1 m:plays radio:jauz-netsky-higher.
}_:g1 tmp:interval _:interval_1._:interval_1 tmp:initial "2016-05-30T09:15:00"^^xsd:dateTime._:interval_1 tmp:final "2016-05-30T09:20:00"^^xsd:dateTime.
Graph-annotation: [ 9:15, 9:20 ]
OverviewDynamic data representation
Query engine
Evaluation
Query streamer engine
OverviewDynamic data representation
Query streamer engine
Evaluation
Measure query execution times for query duration
Query: “All trains with their delay in station X within the next hour”Frequency: 10 secondsClients: 1Engine: Query streamer
Annotation methods: singleton property, graph, implicit graphTime labeling types: time interval, expiration time
Evaluating annotation methods
Evaluating annotation methods
Time interval Expiration time
Evaluating scalabilityMeasure server CPU usage for increasing # clients
Query: “All trains with their delay in station X within the next hour”Frequency: 10 secondsClients: 1 → 200Engines: Query streamer, C-SPARQL (Barbieri 2012) and
CQELS (Le-Phuoc 2011)
Annotation method: graphTime labeling types: expiration time
Query Streamer has better scalability
Query Streamer moves load from server to client
OverviewDynamic data representation
Annotate dynamic data with time
Query streamer engine
Client-side query engineDynamic data at TPF server
Evaluation
Annotation methodsScalability
ConclusionsFurther evaluation: Different query types, …?
Solve efficiency-problem time intervals?
Promising approach for improved scalability