ruleml2015: graal - a toolkit for query answering with existential rules

Post on 18-Aug-2015

259 Views

Category:

Science

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

A Toolkit for Query Answering with Existential Rules

Jean-François Baget, Michel Leclère, Marie-Laure Mugnier,Swan Rocher, and Clément Sipieter

RuleML 2015

sipieter@lirmm.fr Graal 1

Ontology-mediated Query Answering

Query

Ontology

DataKnowledge

basesipieter@lirmm.fr Graal 2

Ontology language

Existential rules framework

=Datalog+[Calì et al., 2009]

sipieter@lirmm.fr Graal 3

What is Datalog+?

An extension of positive Datalog:

I Existentially quanti�ed variables in rule heads∀x (human(x)→ ∃y parentOf (y , x))

I Negative constraints∀x (man(x) ∧ woman(x)→ ⊥)

I Equality rules∀x∀y∀z (motherOf (y , x) ∧motherOf (z , x)→ y = z)

Generalizes:

I Horn description logics(e.g. DL-Lite, the 3 tractable pro�les of OWL2),

I database dependencies (TGDs and EGDs).

sipieter@lirmm.fr Graal 4

What is Datalog+?

An extension of positive Datalog:

I Existentially quanti�ed variables in rule heads∀x (human(x)→ ∃y parentOf (y , x))

I Negative constraints∀x (man(x) ∧ woman(x)→ ⊥)

I Equality rules∀x∀y∀z (motherOf (y , x) ∧motherOf (z , x)→ y = z)

Generalizes:

I Horn description logics(e.g. DL-Lite, the 3 tractable pro�les of OWL2),

I database dependencies (TGDs and EGDs).

sipieter@lirmm.fr Graal 4

What is Datalog+?

An extension of positive Datalog:

I Existentially quanti�ed variables in rule heads∀x (human(x)→ ∃y parentOf (y , x))

I Negative constraints∀x (man(x) ∧ woman(x)→ ⊥)

I Equality rules∀x∀y∀z (motherOf (y , x) ∧motherOf (z , x)→ y = z)

Generalizes:

I Horn description logics(e.g. DL-Lite, the 3 tractable pro�les of OWL2),

I database dependencies (TGDs and EGDs).

sipieter@lirmm.fr Graal 4

What is Datalog+?

An extension of positive Datalog:

I Existentially quanti�ed variables in rule heads∀x (human(x)→ ∃y parentOf (y , x))

I Negative constraints∀x (man(x) ∧ woman(x)→ ⊥)

I Equality rules∀x∀y∀z (motherOf (y , x) ∧motherOf (z , x)→ y = z)

Generalizes:

I Horn description logics(e.g. DL-Lite, the 3 tractable pro�les of OWL2),

I database dependencies (TGDs and EGDs).

sipieter@lirmm.fr Graal 4

What is Datalog+?

An extension of positive Datalog:

I Existentially quanti�ed variables in rule heads∀x (human(x)→ ∃y parentOf (y , x))

I Negative constraints∀x (man(x) ∧ woman(x)→ ⊥)

I Equality rules∀x∀y∀z (motherOf (y , x) ∧motherOf (z , x)→ y = z)

Generalizes:

I Horn description logics(e.g. DL-Lite, the 3 tractable pro�les of OWL2),

I database dependencies (TGDs and EGDs).

sipieter@lirmm.fr Graal 4

Graal, an architecture overview

sipieter@lirmm.fr Graal 5

Graal - General architecture

chaining

core

RDBMS

Triple Store

InMemory GraphDBstore

ioDLGP

ruleset-analyser

OWL2fragment

forward-chaining

backward-

homomorphism

RuleML

Rule

AtomSet

Query

ConjunctiveQuery UnionConjunctiveQuery

Atom Substitution

GraphOfRuleDependencies

NegativeConstraint

Predicate Term

1

0..*

2

0..*

0..* 0..*1

0..*

sipieter@lirmm.fr Graal 6

Graal - General architecture

chaining

core

RDBMS

Triple Store

InMemory GraphDBstore

ioDLGP

ruleset-analyser

OWL2fragment

forward-chaining

backward-

homomorphism

RuleML

Rule

AtomSet

Query

ConjunctiveQuery UnionConjunctiveQuery

Atom Substitution

GraphOfRuleDependencies

NegativeConstraint

Predicate Term

1

0..*

2

0..*

0..* 0..*1

0..*

sipieter@lirmm.fr Graal 6

Store: Storing data

RDBMS Triple Store GraphDBstore

core

InMemory

AtomSet

Iterator<Atom> iterator()Set<Term> getTerms()boolean contains(Atom)boolean addAtom(Atom)boolean removeAtom(Atom)

Atom

Predicate

Term

0..*

0..*

1

sipieter@lirmm.fr Graal 7

Homomorphism: Querying data

Query

sipieter@lirmm.fr Graal 8

Homomorphism: Querying data

store

Query

homomorphismRecursive backtrackcoreAtomSet

Iterator<Atom> iterator()Set<Term> getTerms()boolean contains(Atom)boolean addAtom(Atom)boolean removeAtom(Atom)

RDBMS

sipieter@lirmm.fr Graal 8

Homomorphism: Querying data

store

Query

Triple Store

homomorphismRecursive backtrackcoreAtomSet

Iterator<Atom> iterator()Set<Term> getTerms()boolean contains(Atom)boolean addAtom(Atom)boolean removeAtom(Atom)

sipieter@lirmm.fr Graal 8

Homomorphism: Querying data

store

Query

homomorphismRecursive backtrackcoreAtomSet

Iterator<Atom> iterator()Set<Term> getTerms()boolean contains(Atom)boolean addAtom(Atom)boolean removeAtom(Atom)

GraphDB

sipieter@lirmm.fr Graal 8

Homomorphism: Querying data

store

homomorphism

Query

SQL

RDBMS

sipieter@lirmm.fr Graal 8

Homomorphism: Querying data

store

homomorphism

Query

Sparql

Triple Store

sipieter@lirmm.fr Graal 8

Homomorphism: Querying data

store

homomorphism

Query

Cypher

Neo4j

sipieter@lirmm.fr Graal 8

Taking ontology in account

chaining

core

RDBMS

Triple Store

InMemory GraphDBstore

ioDLGP

ruleset-analyser

OWL2fragment

forward-chaining

backward-

homomorphism

RuleML

Rule

AtomSet

Query

ConjunctiveQuery UnionConjunctiveQuery

Atom Substitution

GraphOfRuleDependencies

NegativeConstraint

Predicate Term

1

0..*

2

0..*

0..* 0..*1

0..*

sipieter@lirmm.fr Graal 9

Forward Chaining / Chase

QueryRules

sipieter@lirmm.fr Graal 10

Forward Chaining / Chase

QueryRules

homomorphism

store

SQLforward-chaining

RDBMS

sipieter@lirmm.fr Graal 10

Forward Chaining / Chase

QueryRules

homomorphism

store

SQLforward-chaining

RDBMS

sipieter@lirmm.fr Graal 10

Backward Chaining / Query rewriting

QueryRules

sipieter@lirmm.fr Graal 11

Backward Chaining / Query rewriting

RDBMS

SQL

Q1 v Q2 v Q3 v Q4 v Q5 v Q6 v … v Qn

store

homomorphism

query-rewriting

QueryRules

sipieter@lirmm.fr Graal 11

Pure - A compilation based query rewriter

sipieter@lirmm.fr Graal 12

E�ciency of the Query Rewriting approach in practice?

+ data do not grow− the size of the rewriting set can be prohibitive in practice

A

B1

B2

Bn

B1(x) ∧ A(x)

B2(x) ∧ B1(x)

Bn(x) ∧ Bn−1(x)

A1

A2

A3

An

A2(x)→ A1(x)

A3(x)→ A2(x)

An(x)→ An−1(x)

q = A1(x1) ∧ . . . ∧ A1(xk)

rewriting set: nk CQs

It is not a theoretical worst-case: it happens often in practicebecause hierarchies are at the heart of ontologies.

sipieter@lirmm.fr Graal 13

Compilation-based Query Rewriting

Preprocess some simple rules known as sources of combinatorialexplosion:

atom1 → atom2

without existential variableE.g., subclass, subproperty, domain, range, inverse properties. . .

R = RC ∪RE

1. Compile RC into a preorder over atoms

2. Embed this preorder into the rewrite process

RC PreorderRE

q

query�pivotal�

preorder

O�ine compilation

Query rewriting

sipieter@lirmm.fr Graal 14

Example

R0 : project(x , y , z,w)→ hasArea(x , y)R1 : project(x , y , z,w)→ hasScManager(x , z)R2 : project(x , y , z,w)→ hasAdmManager(x ,w)R3 : sensitiveArea(x)→ area(x)R4 : security(x)→ sensitiveArea(x)R5 : innovation(x)→ sensitiveArea(x)R6 : hasScManager(x , y)→ hasManager(x , y)R7 : hasAdmManager(x , y)→ hasManager(x , y)R8 : isManagerOf (y , x)→ hasManager(x , y)R9 : hasManager(y , x)→ isManagerOf (x , y)

R10 : manager(x)→ isManager(x , y)R11 : isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)→ criticalManager(x)R12 : criticalManager(x)→ isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)R13 : accreditatedManager(x)→ isManagerOf (x , y) ∧ project(y , z, v ,w) ∧ security(z)

sipieter@lirmm.fr Graal 15

Example

R0 : project(x , y , z,w)→ hasArea(x , y)R1 : project(x , y , z,w)→ hasScManager(x , z)R2 : project(x , y , z,w)→ hasAdmManager(x ,w)R3 : sensitiveArea(x)→ area(x)R4 : security(x)→ sensitiveArea(x)R5 : innovation(x)→ sensitiveArea(x)R6 : hasScManager(x , y)→ hasManager(x , y)R7 : hasAdmManager(x , y)→ hasManager(x , y)R8 : isManagerOf (y , x)→ hasManager(x , y)R9 : hasManager(y , x)→ isManagerOf (x , y)

R10 : manager(x)→ isManager(x , y)R11 : isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)→ criticalManager(x)R12 : criticalManager(x)→ isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)R13 : accreditatedManager(x)→ isManagerOf (x , y) ∧ project(y , z, v ,w) ∧ security(z)

sipieter@lirmm.fr Graal 16

Closure of the compilable rules

R0 : project(x , y , z,w)→ hasArea(x , y)R1 : project(x , y , z,w)→ hasScManager(x , z)R2 : project(x , y , z,w)→ hasAdmManager(x ,w)R3 : sensitiveArea(x)→ area(x)R4 : security(x)→ sensitiveArea(x)R5 : innovation(x)→ sensitiveArea(x)R6 : hasScManager(x , y)→ hasManager(x , y)R7 : hasAdmManager(x , y)→ hasManager(x , y)R8 : isManagerOf (y , x)→ hasManager(x , y)R9 : hasManager(y , x)→ isManagerOf (x , y)

We compute all inferred rules obtained by composition:

Ra : R1 · R6 = project(x , y , z,w)→ hasManager(x , z)Rb : R2 · R7 = project(x , y , z,w)→ hasManager(x ,w)Rc : R4 · R3 = security(x)→ area(x)Rd : R5 · R3 = innovation(x)→ area(x)Re : R6 · R9 = hasScManager(x , y)→ isManagerOf (y , x)Rf : R7 · R9 = hasAdmManager(x , y)→ isManagerOf (y , x)Rg : Ra · R9 = project(x , y , z,w)→ isManagerOf (z, x)Rh : Rb · R9 = project(x , y , z,w)→ isManagerOf (w , x)

sipieter@lirmm.fr Graal 17

Preorder

isManagerOf (x , y)

hasScManager(y , x)hasAdmManager(y , x)hasManager(y , x)project(y , z , x ,w)project(y , z ,w , x)

area(x)

security(x)innovation(x)

sensitiveArea(x)

hasManager(x , y)

hasScManager(x , y)hasAdmManager(x , y)isManagerOf (y , x)project(x , z , y ,w)project(x , z ,w , y)

hasAdmManager(x , y)

project(x , z ,w , x)

hasScManager(x , y)

project(x , z , x ,w)

sensitiveArea(x)

security(x)innovation(x)

hasArea(x , y)

project(x , y , z ,w)

sipieter@lirmm.fr Graal 18

4-Homomorphism

Q = isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)F = isManagerOf (M1,P) ∧ project(P,A,M1,M2) ∧ security(A)

hasArea(x , y)

project(x , y , z ,w)

sensitiveArea(x)

security(x)innovation(x)

. . .

COMPILED

sipieter@lirmm.fr Graal 19

4-Homomorphism

Q = isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)F = isManagerOf (M1,P) ∧ project(P,A,M1,M2) ∧ security(A)

h = {{x ,M1}, {y ,P}, {z ,A}}

hasArea(x , y)

project(x , y , z ,w)

sensitiveArea(x)

security(x)innovation(x)

. . .

COMPILED

sipieter@lirmm.fr Graal 19

Query rewriting using 4-Homomorphism

Q = criticalManager(x)

Q ′ = isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)

Q ′′ = accreditatedManager(x)

[Classical rewriting: 38 CQs]

R10 : manager(x)→ isManager(x , y)R11 : isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)→ criticalManager(x)R12 : criticalManager(x)→ isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)R13 : accreditatedManager(x)→ isManagerOf (x , y) ∧ project(y , z, v ,w) ∧ security(z)

hasArea(x , y)

project(x , y , z ,w)

sensitiveArea(x)

security(x)innovation(x)

. . .

COMPILED

sipieter@lirmm.fr Graal 20

Query rewriting using 4-Homomorphism

Q = criticalManager(x)

Q ′ = isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)

Q ′′ = accreditatedManager(x)

[Classical rewriting: 38 CQs]

R10 : manager(x)→ isManager(x , y)R11 : isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)→ criticalManager(x)R12 : criticalManager(x)→ isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)R13 : accreditatedManager(x)→ isManagerOf (x , y) ∧ project(y , z, v ,w) ∧ security(z)

hasArea(x , y)

project(x , y , z ,w)

sensitiveArea(x)

security(x)innovation(x)

. . .

COMPILED

sipieter@lirmm.fr Graal 20

Query rewriting using 4-Homomorphism

Q = criticalManager(x)

Q ′ = isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)

Q ′′ = accreditatedManager(x)

[Classical rewriting: 38 CQs]

R10 : manager(x)→ isManager(x , y)R11 : isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)→ criticalManager(x)R12 : criticalManager(x)→ isManagerOf (x , y) ∧ hasArea(y , z) ∧ sensitiveArea(z)R13 : accreditatedManager(x)→ isManagerOf (x , y) ∧ project(y , z, v ,w) ∧ security(z)

hasArea(x , y)

project(x , y , z ,w)

sensitiveArea(x)

security(x)innovation(x)

. . .

COMPILED

sipieter@lirmm.fr Graal 20

Results - Impact of compilation on rewriting sizes

Benchmark:translation of DL-LiteR ontologies:

I Adolena (102 rules, 75% compilable)

I Vicodi (222 rules, 100% compilable)

I OpenGalen2-Lite (51k rules, 55%compilable)

I OBOprotein (43k rules, 82% compilable)

Each ontology is provided with 5 queries.

UCQ p-UCQ

A Q1 27 2

Q2 50 2

Q3 104 1

Q4 224 2

Q5 624 1

V Q1 15 1

Q2 10 1

Q3 72 1

Q4 185 1

Q5 30 1

G Q1 2 1

Q2 1152 1

Q3 488 5

Q4 147 1

Q5 324 19

O Q1 27 20

Q2 1356 1264

Q3 33887 1

Q4 34733 682

Q5 36612 -

sipieter@lirmm.fr Graal 21

Results - Impact of compilation on rewriting times

Pure PureC PureCto UCQ

A Q1 190 20 140

Q2 100 10 50

Q3 180 20 40

Q4 290 10 140

Q5 1510 10 450

V Q1 20 10 10

Q2 20 10 20

Q3 120 10 80

Q4 130 10 70

Q5 20 10 20

G Q1 10 10 20

Q2 1070 60 630

Q3 1030 80 270

Q4 30 20 20

Q5 900 40 100

O Q1 450 140 150

Q2 1170 1120 1880

Q3 TO 100 558000

Q4 TO 440 TO

Q5 TO TO TOTO = 10min.

sipieter@lirmm.fr Graal 22

Query evaluation (Futur work)

In central memory, compute 4-homomorphisms from Q to F.

Otherwise:

I Unfold Q using the preorder,

I Transform Q + the compilable rules into a Datalog program P,

I Saturate F with Rc (the set of compilable rules).

sipieter@lirmm.fr Graal 23

https://graphik-team.github.io/graal/

sipieter@lirmm.fr Graal 24

Results - Impact of compilation on rewriting timesPure PureC PureC Nyaya Requiem Iqaros tw Rapid

to UCQ

A Q1 190 20 140 1130 270 60 20 20

Q2 100 10 50 870 110 60 20 30

Q3 180 20 40 2370 140 200 10 40

Q4 290 10 140 5560 260 140 20 50

Q5 1510 10 450 33210 470 580 20 100

V Q1 20 10 10 20 20 20 10 10

Q2 20 10 20 60 20 20 10 10

Q3 120 10 80 30 70 30 10 30

Q4 130 10 70 30 140 40 10 40

Q5 20 10 20 30 80 50 20 30

G Q1 10 10 20 - 50 50 10 10

Q2 1070 60 630 - 209050 5870 20 80

Q3 1030 80 270 - 259110 9190 30 60

Q4 30 20 20 - 190260 780 10 20

Q5 900 40 100 - 238460 7410 30 50

O Q1 450 140 150 - 3450 6680 20 30

Q2 1170 1120 1880 - 21790 27820 580 960

Q3 TO 100 558000 - TO TO 80 620

Q4 TO 440 TO - TO 139990 1240 14700

Q5 TO TO TO - TO TO TO 562230

TO = 10min.sipieter@lirmm.fr Graal 25

top related