Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M


Page 1: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

J. Unbehauen, M. Martin

IIS // AKSW // BIS // IfI, Leipzig University

SEMANTiCS 2016

J. Unbehauen, M. Martin (Leipzig Univ.) SPARQL over Document Stores: SparqlMap-M SEMANTiCS 2016 1 / 25

Page 2: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Outline

1 Motivation and Scope

2 Approach

3 Evaluation

4 Conclusions and Future Work


Page 3: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Scoping

[1] S. Auer, J. Lehmann, A. Ngonga Ngomo. Introduction to Linked Data and Its Lifecycle on the Web. Reasoning Web. Semantic Technologies for the Web of Data, LNCS 6848, 2011.


Page 4: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Motivation

NoSQL DBMS and document stores are thriving

Document stores used in Rapid Application Development Frameworks

Visit our poster: Adding Semantics to Model-Driven Software Development

Use cases in both research and industry

Current solutions support R2RML and relational databases



Page 9: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

SparqlMap Architecture

[Figure: SparqlMap architecture. Component labels: Query Parsing, Mapping Binding, Query Analysis, Binding Translat., Translat. Exec. The pipeline is illustrated with the example query and result below.]

Query:

SELECT DISTINCT ?name {
  ?person foaf:name ?name.       #(tp1)
  ?person :inDepartment ?dep.    #(tp2)
  ?dep rdfs:label 'Research'     #(tp3)
}

Result:

?name
------------
'Mary R.'
'James T.'

[2] J. Unbehauen, C. Stadler, and S. Auer. Accessing relational data on the web with SparqlMap. In JIST, 2012.

[3] J. Unbehauen, C. Stadler, and S. Auer. Optimizing SPARQL-to-SQL rewriting. In IIWAS, 2013.


Page 10: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

SparqlMap-M Architecture

[Figure: SparqlMap-M architecture. The SparqlMap components (Query Parsing, Mapping Binding, Query Analysis, Binding Translat., Translat. Exec.) are extended in SparqlMap-M with the labels Mapping, Deduplication, Union Decomposition, Selective Materialization, and Materialized Execution. The pipeline is illustrated with the example query and result below.]

Query:

SELECT DISTINCT ?name {
  ?person foaf:name ?name.       #(tp1)
  ?person :inDepartment ?dep.    #(tp2)
  ?dep rdfs:label 'Research'     #(tp3)
}

Result:

?name
------------
'Mary R.'
'James T.'

1 Data Models and Mapping

2 Query Structure

3 Querying Capabilities

4 Data Model Specific Optimization



Page 16: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Data Models and Mapping

Key-Value pairs

Nested documents

Schemaless


Page 17: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Data Models and Mapping

Goal: reuse existing (R2RML) concepts

A relational view on documents is obtained by:

Unnesting documents by joining them with their parent → flat structure

Naming attributes to reflect the hierarchy → key-value pairs treated as tuples

Schema imposed by the mapping

#Department
{ id: 2, name: "Research", emp: [ { id: 1, name: "Mary R." },
                                  { id: 2, name: "James T." } ] }

id | name     | emp.id | emp.name
---+----------+--------+----------
2  | Research | 1      | Mary R.
2  | Research | 2      | James T.
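The unnesting step on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not SparqlMap-M code: the helper name `unnest` is hypothetical, and it handles a single level of nesting. Each sub-document in the array is joined with its parent, and the child attributes are prefixed with the array name so that the hierarchy stays visible in the column names.

```python
def unnest(doc, array_field):
    """Flatten one nested array of sub-documents into relational-style rows.

    Hypothetical helper for illustration; handles one nesting level only.
    """
    # Attributes of the parent document, without the nested array itself.
    parent = {k: v for k, v in doc.items() if k != array_field}
    rows = []
    for child in doc.get(array_field, []):
        row = dict(parent)  # copy the parent attributes into every row
        for key, value in child.items():
            # Dotted names such as "emp.id" reflect the document hierarchy.
            row[f"{array_field}.{key}"] = value
        rows.append(row)
    return rows


department = {
    "id": 2,
    "name": "Research",
    "emp": [
        {"id": 1, "name": "Mary R."},
        {"id": 2, "name": "James T."},
    ],
}

rows = unnest(department, "emp")
for row in rows:
    print(row)
```

In this sketch a document with an empty array yields no rows, which corresponds to an inner join between parent and children. Whether parents without children should still produce a row is a design choice the slide does not settle.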



Page 19: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Query Structure

SparqlMap

Recursive translation yields nested unions

Index hits require careful query design

Complex expressions for joins

SparqlMap-M / MongoDB

No direct equivalents for joins

No complex equivalence expression


Page 21: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Query Structure: Union Decomposition

Nested Unions:

⋈ ?dep=?dep
  σ name=Research
    trm3
  ⋈ ?person=?person
    ∪
      trm1
      trm4
    ∪
      trm2
      trm5

Pushed Union:

∪
  ⋈ ?dep=?dep
    trm3
    ⋈ ?person=?person
      trm1
      trm2
  ⋈ ?dep=?dep
    trm3
    ⋈ ?person=?person
      trm4
      trm5
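The rewriting from nested unions to a pushed union can be sketched as distributing joins over unions. The Python below is a minimal illustration under stated assumptions, not the SparqlMap-M implementation: the tuple-based plan encoding, `push_unions`, and the `compatible` predicate are all hypothetical. Plain distribution produces four branches for this plan, while the slide shows two, so the sketch assumes that join combinations known to be empty (here: term maps that build ?person from different templates, recorded in the made-up `PERSON_TEMPLATE` table) are pruned. The selection on trm3 is omitted for brevity.

```python
from itertools import product


def leaf(name):
    return ("tm", name)


def join(var, left, right):
    return ("join", var, left, right)


def union(left, right):
    return ("union", left, right)


def push_unions(node, compatible=lambda var, left, right: True):
    """Return union-free plans whose union is equivalent to `node`."""
    kind = node[0]
    if kind == "tm":
        return [node]
    if kind == "union":
        _, left, right = node
        return push_unions(left, compatible) + push_unions(right, compatible)
    if kind == "join":
        _, var, left, right = node
        # Distribute the join over every pair of union-free operands.
        return [
            join(var, l, r)
            for l, r in product(push_unions(left, compatible),
                                push_unions(right, compatible))
            if compatible(var, l, r)
        ]
    raise ValueError(f"unknown node kind: {kind}")


# Assumption: which template each term map uses to build ?person.
PERSON_TEMPLATE = {"trm1": "a", "trm2": "a", "trm4": "b", "trm5": "b"}


def compatible(var, left, right):
    """Keep a join unless both sides are leaves with different templates."""
    if var != "?person" or left[0] != "tm" or right[0] != "tm":
        return True
    return PERSON_TEMPLATE[left[1]] == PERSON_TEMPLATE[right[1]]


nested = join("?dep", leaf("trm3"),
              join("?person",
                   union(leaf("trm1"), leaf("trm4")),
                   union(leaf("trm2"), leaf("trm5"))))

all_branches = push_unions(nested)
pruned_branches = push_unions(nested, compatible)
print(len(all_branches), len(pruned_branches))
```

Each resulting branch is free of unions and can be handled on its own, and the final result is the union of the branch results.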



Page 24: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Selective Materialization

Delegate to abstraction layer (Apache MetaModel)

Execute unpushable SPARQL operators in memory

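Π name
  ⋈ id=depid
    σ name="Research"
      department
    employee

[Figure annotation: the operator tree is divided into the part handled by Selective Materialization and the part handled by Materialized Execution]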
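The split between pushed and in-memory operators can be sketched as follows. This is an illustration only: the `STORE` dictionary and the `find` function stand in for the document store and the abstraction layer, and do not reproduce the Apache MetaModel API. The extra department and employee rows are made-up sample data. The selection on the department name is pushed to the store, while the join and the projection, which the store cannot evaluate in this sketch, run in memory on the materialized rows.

```python
# Made-up sample data standing in for two collections in a document store.
STORE = {
    "department": [
        {"id": 1, "name": "Sales"},
        {"id": 2, "name": "Research"},
    ],
    "employee": [
        {"id": 1, "name": "Mary R.", "depid": 2},
        {"id": 2, "name": "James T.", "depid": 2},
        {"id": 3, "name": "Ann K.", "depid": 1},
    ],
}


def find(collection, **equals):
    """Stand-in for a store-side scan with optional equality filters."""
    return [doc for doc in STORE[collection]
            if all(doc.get(key) == value for key, value in equals.items())]


def hash_join(left, right, left_key, right_key):
    """In-memory equi-join: build a hash index on `right`, probe with `left`."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    return [(l, r) for l in left for r in index.get(l[left_key], [])]


# Pushable part: the selection is evaluated by the store.
departments = find("department", name="Research")
# Materialized part: rows are fetched, then joined and projected in memory.
employees = find("employee")
joined = hash_join(departments, employees, "id", "depid")
names = sorted({emp["name"] for _, emp in joined})  # projection with DISTINCT
print(names)
```

Pushing the selection first keeps the materialized department side small. The employee side is still fetched in full, which is consistent with the in-memory join cost reported in the evaluation.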



Page 26: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

De-Duplication

Documents are nested for fast retrieval and filtering

Naive mapping introduces overhead

Declaratively label R2RML TriplesMaps as duplicated

Only use denormalized data in joins
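One possible reading of these points, sketched in Python: when several TriplesMaps can answer a triple pattern, the copies labeled as duplicated are skipped unless they let a join be answered inside the document that the other side of the join already reads. Everything here is an assumption made for illustration, including the `TriplesMap` class, the `duplicate_of` label, and the `choose` function. The slides do not show how the label is expressed in the mapping or how candidates are selected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TriplesMap:
    name: str
    collection: str                     # collection whose documents hold the data
    duplicate_of: Optional[str] = None  # hypothetical "duplicated" label


def choose(candidates, join_collection=None):
    """Pick the triples maps to query for one triple pattern.

    Assumed policy: use a duplicated (denormalized) copy only when the
    pattern is joined with data from the same collection; otherwise use
    the canonical triples maps so that no duplicate results are produced.
    """
    if join_collection is not None:
        local = [tm for tm in candidates
                 if tm.duplicate_of is not None
                 and tm.collection == join_collection]
        if local:
            return local
    return [tm for tm in candidates if tm.duplicate_of is None]


employee = TriplesMap("employee", collection="employee")
dept_emp = TriplesMap("department.emp", collection="department",
                      duplicate_of="employee")
candidates = [employee, dept_emp]

print(choose(candidates))                                # canonical map only
print(choose(candidates, join_collection="department"))  # embedded copy
```

With the department example from the earlier slide, a pattern on employee names alone would read the employee collection, while the same pattern joined with a department pattern would read the employees embedded in the department documents and avoid a cross-collection join.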



Page 29: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Benchmark Setup

BSBM (Berlin SPARQL Benchmark), chosen because both SQL and RDF representations are available

SQL representation translated into MongoDB documents

Additionally performed denormalization


Page 30: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Benchmark Results

BSBM 10 million triples

PostgreSQL: fastest

MongoDB-Naive/-Dup: duplication (Dup) required for performance

SparqlMap-M-Naive/-Dup/-DupAware: overhead from rewriting/materialization


Page 31: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Benchmark Results

BSBM Q4

Medium selectivity

Naive modes touch a lot of data

Performance gain from duplicated data (MongoDB, SparqlMap-M)


Page 32: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Benchmark Results

BSBM Q5

Low selectivity join

SparqlMap-M: expensive self-join in memory dominates cost

MongoDB: self-join in aggregation pipeline, slower than PostgreSQL

BSBM Q9

High selectivity join

SparqlMap-M-Dup(Aware): duplicates increase overhead; unpushable join dominates cost



Page 34: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Future Work

Enable Updates

Integrate Caching

Evaluate join-capable query languages:

MongoDB left outer join ($lookup)

Multi-model databases: ArangoDB, OrientDB

DB virtualization: JBoss Teiid, Apache HAWQ


Page 35: Executing SPARQL Queries over Mapped Document Stores with SparqlMap-M

Conclusion

Architecture for a SPARQL execution layer over document stores

Harness duplicates for increasing performance

Evaluated with BSBM on MongoDB
