vldb phd workshop


Page 1: VLDB Phd Workshop

PERSONALIZED SEARCH FOR THE SOCIAL SEMANTIC WEB

VLDB PHD WORKSHOP

Oana Tifrea-Marciuska

Supervisor: Prof. Thomas Lukasiewicz

Department of Computer Science, University of Oxford, UK

August 31, 2015

OANA TIFREA-MARCIUSKA PERSONALIZED SEARCH FOR THE SOCIAL SEMANTIC WEB SLIDE 1 /32

Page 2: VLDB Phd Workshop

MOTIVATION

PRELIMINARIES
    Datalog+/–

LANGUAGES
    Strategies to Answer k-rank Disjunctive Atomic Queries
    Experiments

FUTURE WORK


Page 3: VLDB Phd Workshop

WEB 3.0

Social data

Semantic data

Page 4: VLDB Phd Workshop

WEB 3.0 SEARCH

Social Data

Personalized access

Semantic data

Precise and rich results


Page 5: VLDB Phd Workshop

PERSONALIZED INFORMATION ACCESS

QUERY

ORDER BY user’s preferences

LIMIT k


Page 6: VLDB Phd Workshop

PERSONALIZED INFORMATION ACCESS

QUERY → Datalog+/– queries

ORDER BY user’s preferences → preference model

LIMIT k → top-k

Page 7: VLDB Phd Workshop

DATALOG+/–

hotel
      id    city    conn    class
t1    h1    rome    c       e
t2    h2    rome    w       l
t3    h3    rome    c       e

review
      id    user    feedback
t7    h1    b       n
t8    h2    b       p
t9    h3    j       p

reviewer
      user    age
t10   b       20
t11   j       30

friend
      user    user
t12   b       a
t13   j       a

FIGURE: Database D.

Constraints

Datalog-like: ∀X∀Y Φ(X,Y) → Ψ(X)
    friend(A,B) → friend(B,A)

With existential in the head: ∀X∀Y Φ(X,Y) → ∃Z Ψ(X,Z)
    reviewer(U,A) → ∃F friend(U,F)

Negative constraints: ∀X Φ(X) → ⊥
    friend(D,D) → ⊥

Equality constraints: ∀X Φ(X) → Ai = Aj
    review(A,U,F1) ∧ review(A,U,F2) → F1 = F2
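As a rough illustration, database D and the four kinds of constraints can be sketched in Python. The set-of-tuples encoding and all helper names (`symmetric_closure`, `missing_existential`, `violates_negative`, `violates_equality`) are assumptions made for this sketch and do not come from any Datalog+/– implementation. The existential rule is only checked here, not chased with fresh nulls.

```python
# Database D, encoded as sets of tuples (an assumed encoding).
hotel = {("h1", "rome", "c", "e"), ("h2", "rome", "w", "l"), ("h3", "rome", "c", "e")}
review = {("h1", "b", "n"), ("h2", "b", "p"), ("h3", "j", "p")}
reviewer = {("b", 20), ("j", 30)}
friend = {("b", "a"), ("j", "a")}


def symmetric_closure(pairs):
    """Apply friend(A,B) -> friend(B,A); one pass already reaches the fixpoint."""
    return pairs | {(b, a) for (a, b) in pairs}


def missing_existential(reviewers, friends):
    """Users for which reviewer(U,A) -> exists F friend(U,F) has no witness."""
    with_friend = {u for (u, _) in friends}
    return {u for (u, _) in reviewers if u not in with_friend}


def violates_negative(friends):
    """Negative constraint friend(D,D) -> false."""
    return any(a == b for (a, b) in friends)


def violates_equality(reviews):
    """Equality constraint: at most one feedback value per (hotel, user) pair."""
    seen = {}
    for (h, u, f) in reviews:
        if seen.setdefault((h, u), f) != f:
            return True
    return False


friend_closed = symmetric_closure(friend)
print(sorted(friend_closed))
print(missing_existential(reviewer, friend_closed))
print(violates_negative(friend_closed), violates_equality(review))
```

Under this encoding, D satisfies the negative and equality constraints, and both reviewers already have a friend, so the existential rule would not need to invent a new value.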

Page 8: VLDB Phd Workshop

WHY DATALOG+/–

Generalizes the DL-Lite family

Has implementations

Can use ideas from preference modeling in the database community

Page 9: VLDB Phd Workshop

PERSONALIZED INFORMATION ACCESS

QUERY → Datalog+/– queries

ORDER BY user’s preferences → preference model

LIMIT k → top-k

Page 10: VLDB Phd Workshop

PREFERENCE MODELS

Syntax and semantics

Advantages and disadvantages: formal properties, experiments

Complexity of answering queries: conjunctive queries (CQ) or disjunctions of atomic queries (DAQ)

Solve conflicts between data
    Preferences of a group of users
    Uncertainty (reviews) vs. preferences

Desire: an ontology language that handles the preferences of a user or a group of users and can handle uncertainty (e.g., information integration)

Page 11: VLDB Phd Workshop

CH 1. PP-DATALOG+/–∗

Combines (4 different operators)
    Qualitative preferences: binary relations ≻ ⊆ H_Pref × H_Pref
        hotel(h1, rome, c, e) ≻ hotel(h2, rome, w, l)
    Probabilistic model (e.g., reviews)
        hotel(h3, rome, c, e)    0.9
        hotel(h1, rome, c, e)    0.8
        hotel(h2, rome, w, l)    0.3

Semantic properties

Complexity of answering top-k DAQ: polynomial

Dataset: IMDB movie database, with synthetic preferences

∗ In Journal on Data Semantics, Vol. 4, No. 2, June 2015.

Page 12: VLDB Phd Workshop

CH. 1. GPP-DATALOG+/–∗

Combines for a group of people
    Qualitative preferences: binary relations ≻ ⊆ H_Pref × H_Pref
        food(f1) ≻ food(f2)
    Probabilistic model (e.g., reviews)
        food(m2) 0.9; food(m3) 0.8; food(m2) 0.3

Semantic properties

Complexity of answering top-k DAQ: polynomial

Dataset: Yelp, with real preferences from 50 users
    efficiency
    quality of the results

∗ In ACM Transactions on Internet Technology, Vol. 14, No. 4, December 2014.

Page 13: VLDB Phd Workshop

GROUP PREFERENCE MODEL

DEFINITION
A group preference model U = (U1, ..., Un) for n > 1 users is a collection of n user preference models.

FIGURE: Preference graphs of the users u1, u2, and u3 over the atoms dest(b1), dest(c1), dest(c2), dest(c3), dest(f1), and dest(f2).

Page 14: VLDB Phd Workshop

PROBABILISTIC MODEL

A preference relation ≻ is score-based if it is defined as follows: a1 ≻ a2 iff score(a1) > score(a2).

The model assigns a probability to each atom (using, e.g., Markov logic and Bayesian networks).

FIGURE: Probabilistic model PrM: dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3.
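A minimal sketch of a score-based preference relation, assuming the probabilities of the PrM figure are paired with the atoms as b1 0.8, c1 0.75, f1 0.6, c2 0.4, f2 0.34, c3 0.3 (this pairing is read off the figure and the merging example, so treat it as an assumption). The function name `score_based` is hypothetical.

```python
# Assumed pairing of atoms and probabilities from the PrM figure.
PRM = {"dest(b1)": 0.8, "dest(c1)": 0.75, "dest(f1)": 0.6,
       "dest(c2)": 0.4, "dest(f2)": 0.34, "dest(c3)": 0.3}


def score_based(scores):
    """Return all pairs (a1, a2) with score(a1) > score(a2)."""
    return {(a1, a2) for a1 in scores for a2 in scores
            if scores[a1] > scores[a2]}


prefers_m = score_based(PRM)

# Because the relation is induced by numbers, it is a strict order and
# sorting by score gives a ranking that is compatible with it.
ranking = sorted(PRM, key=PRM.get, reverse=True)
print(ranking)
```

With six pairwise distinct scores, the relation contains one pair for every two atoms, so any two atoms are comparable under the probabilistic model.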

Page 15: VLDB Phd Workshop

CHALLENGES OF THE GIVEN MODEL 1/2

FIGURE: Preference graphs of the users u1, u2, and u3, next to the probabilistic model PrM: dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3.

Challenge 1: the user preference model and the probabilistic model may be in disagreement: preference merging operators

DEFINITION
Let ≻U be a strict partial order (SPO) and ≻M be a score-based preference relation. A preference merging operator ⊗(≻U, ≻M) yields a relation ≻∗ such that
    1 ≻∗ is an SPO
    2 if a1 ≻U a2 and a1 ≻M a2, then a1 ≻∗ a2.
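The two conditions on a merging operator can be checked mechanically. The sketch below assumes relations are stored as sets of pairs (a1, a2) meaning "a1 is preferred to a2"; the names `is_spo` and `respects_agreement` and the toy relations are hypothetical and only illustrate the definition.

```python
from itertools import product


def is_spo(rel):
    """True if the set of pairs `rel` is irreflexive and transitive."""
    if any(a == b for (a, b) in rel):
        return False
    return all((a, d) in rel
               for (a, b), (c, d) in product(rel, rel) if b == c)


def respects_agreement(rel_u, rel_m, rel_star):
    """Condition 2: every pair on which user and model agree is kept."""
    return (rel_u & rel_m) <= rel_star


# Toy input: the user prefers f1 to c1, the model prefers c1 to f1;
# both agree that f1 and c1 are preferred to c3.
rel_u = {("f1", "c1"), ("c1", "c3"), ("f1", "c3")}
rel_m = {("c1", "f1"), ("c1", "c3"), ("f1", "c3")}
rel_star = rel_m  # one admissible outcome: follow the model on the conflict

print(is_spo(rel_star), respects_agreement(rel_u, rel_m, rel_star))
```

Following the user instead (`rel_star = rel_u`) would satisfy both conditions as well; the definition only constrains the pairs on which both inputs agree.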

Page 16: VLDB Phd Workshop

CHALLENGES OF THE GIVEN MODEL 2/2

FIGURE: Preference graphs of the users u1, u2, and u3, next to the probabilistic model PrM: dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3.

Challenge 2: user preference models may be in disagreement with each other: preference aggregation operator

DEFINITION
Let U = (U1, ..., Un) be a group preference model, where every Ui is an SPO. A preference aggregation operator ⊎ on U yields an SPO ≻∗.

Page 17: VLDB Phd Workshop

GPP-Datalog+/– ontology - our model

DEFINITION
A GPP-Datalog+/– ontology has the form KB = (O, U, M, ⊗, ⊎), where
    O is a Datalog+/– ontology
    U = (U1, ..., Un) is a group preference model with n > 1
    M is a probabilistic model
    ⊗ is a preference merging operator
    ⊎ is a preference aggregation operator

We say that KB is guarded iff O is guarded.

Page 18: VLDB Phd Workshop

MERGING OPERATOR

0.4 − 0.6 > 0.1? No ⟹ keep relation

0.75 − 0.6 > 0.1? Yes ⟹ inverse relation

FIGURE: User preference graph (drawn in rows: dest(f1); dest(c1), dest(b1), dest(f2); dest(c3), dest(c2)), the probabilistic model PrM (dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3), and the merged graph (drawn in rows: dest(c1), dest(b1); dest(f1); dest(c3), dest(c2), dest(f2)).
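The two checks on this slide suggest a threshold rule: a user edge a1 ≻ a2 is inverted when the model favours a2 by more than a threshold t, and kept otherwise. The sketch below is one reading of that rule, not necessarily the authors' operator. The function name `merge`, the pairing of atoms with probabilities, and the user edges are assumptions, since the edges of the original figure are not recoverable from the transcript. The per-edge rule alone does not guarantee an SPO; transitive closure and cycle handling are left out.

```python
# Assumed pairing of atoms and probabilities from the PrM figure.
PRM = {"dest(b1)": 0.8, "dest(c1)": 0.75, "dest(f1)": 0.6,
       "dest(c2)": 0.4, "dest(f2)": 0.34, "dest(c3)": 0.3}


def merge(rel_u, scores, t):
    """Invert a user edge a1 > a2 if score(a2) - score(a1) > t, else keep it."""
    merged = set()
    for (a1, a2) in rel_u:
        if scores[a2] - scores[a1] > t:
            merged.add((a2, a1))  # "inverse relation"
        else:
            merged.add((a1, a2))  # "keep relation"
    return merged


# Hypothetical user edges: the user puts dest(f1) above four other atoms.
rel_u = {("dest(f1)", "dest(c1)"), ("dest(f1)", "dest(b1)"),
         ("dest(f1)", "dest(c2)"), ("dest(f1)", "dest(f2)")}

merged = merge(rel_u, PRM, t=0.1)
print(sorted(merged))
```

With t = 0.1, the edges towards dest(c1) and dest(b1) are inverted (0.75 − 0.6 and 0.8 − 0.6 both exceed 0.1), while those towards dest(c2) and dest(f2) are kept, which matches the rows of the merged graph.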

Page 19: VLDB Phd Workshop

SKYLINE AND K-RANK ANSWER

Let KB be a GPP-Datalog+/– ontology and Q(X) = q1(X1) ∨ ··· ∨ qn(Xn) be a DAQ. Then, a skyline answer to Q relative to ≻∗ = ⊎(⊗(≻U1, ≻M), ..., ⊗(≻Un, ≻M)) is any θqi entailed by O such that no θ′ exists with O ⊨ θ′qj and θ′qj ≻∗ θqi, where θ and θ′ are ground substitutions for the variables in Q(X).

A k-rank answer to Q is a sequence S = ⟨θ1, ..., θk′⟩ built by subsequently appending the skyline answers to Q, removing these atoms from consideration, and repeating until either |S| = k or no more skyline answers to Q remain.

FIGURE: Preference graph over the atoms dest(f1), dest(b1), dest(c1), dest(f2), dest(c3), and dest(c2).
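The k-rank construction can be sketched as a layer-by-layer procedure. The code below assumes that the entailed ground atoms are already available as a list and that ≻∗ is given as a set of pairs; ontology entailment itself is not modelled. The names `skyline` and `k_rank` and the example relation are hypothetical, and ties inside a skyline are broken alphabetically only to make the output deterministic.

```python
def skyline(atoms, prefers):
    """Atoms not dominated by any atom that is still under consideration."""
    return [a for a in atoms if not any((b, a) in prefers for b in atoms)]


def k_rank(atoms, prefers, k):
    """Append skyline answers round by round until k answers are collected."""
    remaining = sorted(atoms)  # sorted only to make tie-breaking deterministic
    answer = []
    while remaining and len(answer) < k:
        layer = skyline(remaining, prefers)
        if not layer:  # cannot happen for an SPO; guards against cyclic input
            break
        answer.extend(layer[: k - len(answer)])
        remaining = [a for a in remaining if a not in layer]
    return answer


# Assumed aggregated relation: b1 > c1 > f1 > f2, and f2 > c2, f2 > c3.
atoms = ["dest(b1)", "dest(c1)", "dest(c2)", "dest(c3)", "dest(f1)", "dest(f2)"]
prefers = {("dest(b1)", "dest(c1)"), ("dest(c1)", "dest(f1)"),
           ("dest(f1)", "dest(f2)"), ("dest(f2)", "dest(c2)"),
           ("dest(f2)", "dest(c3)")}

for k in range(1, 6):
    print(k, k_rank(atoms, prefers, k))
```

For k = 5 the last skyline contains both dest(c2) and dest(c3), so either one is a valid fifth answer; the sketch simply picks the alphabetically first.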

Page 20: VLDB Phd Workshop

STRATEGIES TO ANSWER K-RANK DAQ

Collapse to single user
    1 Create virtual user
    2 Calculate k-rank from it

Voting
    1 Calculate k-rank for each of the users
    2 Vote

Page 21: VLDB Phd Workshop

COLLAPSE TO SINGLE USER

FIGURE: Preference graphs of u1 (t = 0), u2 (t = 0.1), and u3 (t = 0.19), collapsed into a single graph with weighted edges for the virtual user. Highlighted case: no relation.

Page 22: VLDB Phd Workshop

COLLAPSE TO SINGLE USER

FIGURE: Preference graphs of u1 (t = 0), u2 (t = 0.1), and u3 (t = 0.19), collapsed into a single graph with weighted edges for the virtual user. Highlighted case: relation with weight 2.

Page 23: VLDB Phd Workshop

COLLAPSE TO SINGLE USER: K-RANK

Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 1

k-rank answer to Q: ⟨dest(b1)⟩.

FIGURE: Collapsed weighted preference graph over the atoms still under consideration: dest(b1), dest(c1), dest(f1), dest(f2), dest(c2), dest(c3).

Page 24: VLDB Phd Workshop

COLLAPSE TO SINGLE USER: K-RANK

Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 2

k-rank answer to Q: ⟨dest(b1), dest(c1)⟩.

FIGURE: Collapsed weighted preference graph over the atoms still under consideration: dest(c1), dest(f1), dest(f2), dest(c2), dest(c3).

Page 25: VLDB Phd Workshop

COLLAPSE TO SINGLE USER: K-RANK

Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 3

k-rank answer to Q: ⟨dest(b1), dest(c1), dest(f1)⟩.

FIGURE: Collapsed weighted preference graph over the atoms still under consideration: dest(f1), dest(f2), dest(c2), dest(c3).

Page 26: VLDB Phd Workshop

COLLAPSE TO SINGLE USER: K-RANK

Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 4

k-rank answer to Q: ⟨dest(b1), dest(c1), dest(f1), dest(f2)⟩.

FIGURE: Collapsed weighted preference graph over the atoms still under consideration: dest(f2), dest(c2), dest(c3).

Page 27: VLDB Phd Workshop

COLLAPSE TO SINGLE USER: K-RANK

Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 5

k-rank answer to Q: ⟨dest(b1), dest(c1), dest(f1), dest(f2), dest(c2)⟩ or ⟨dest(b1), dest(c1), dest(f1), dest(f2), dest(c3)⟩.

FIGURE: Collapsed preference graph over the atoms still under consideration: dest(c2), dest(c3).

Page 28: VLDB Phd Workshop

VOTING - PLURALITY VOTING

Q = dest(X), k = 2, and (t1, t2, t3) = (0, 0.1, 0.19).

         dest(f2)   dest(c1)   dest(c2)   dest(b1)   dest(c3)
u1       1          1          1          1          0
u2       0          1          0          1          0
u3       1          0          0          1          1
Total    2          2          1          3          1

FIGURE: Preference graphs of u1 (t = 0), u2 (t = 0.1), and u3 (t = 0.19).

The k-rank answer to Q using plurality voting is ⟨dest(b1), dest(c1)⟩ or ⟨dest(b1), dest(f2)⟩.
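The tally in the voting table can be reproduced with a short sketch. The ballots below are copied from the 0/1 entries of the table; how each ballot is derived from a user's own k-rank answer is not modelled here. The function name `plurality` and the alphabetical tie-break are assumptions made for the sketch.

```python
from collections import Counter

# One ballot per user: the atoms marked with 1 in the voting table.
ballots = {
    "u1": ["dest(f2)", "dest(c1)", "dest(c2)", "dest(b1)"],
    "u2": ["dest(c1)", "dest(b1)"],
    "u3": ["dest(f2)", "dest(b1)", "dest(c3)"],
}


def plurality(ballots, k):
    """Count one vote per listed atom and return the tally and the top k atoms."""
    tally = Counter(atom for ballot in ballots.values() for atom in ballot)
    # Sort by descending vote count; ties are broken alphabetically (assumed).
    ranked = sorted(tally, key=lambda atom: (-tally[atom], atom))
    return tally, ranked[:k]


tally, top = plurality(ballots, k=2)
print(dict(tally))
print(top)
```

dest(b1) wins with three votes, while dest(c1) and dest(f2) tie with two votes each, which is why the slide lists two admissible answers; the alphabetical tie-break here happens to select dest(c1).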

Page 29: VLDB Phd Workshop

EXPERIMENTS

Yelp dataset: 1,000 businesses, 229,907 reviews, used to find places to eat

Categories in Yelp → Datalog+/– concepts (e.g., Italian, Mediterranean)

50 users inserted their preferences (e.g., the place to eat, food)

Compare methods
    efficiency
    quality of the results

Page 30: VLDB Phd Workshop

RESULTS OF THE EXPERIMENTS

Group size increases → quality decreases

k increases → quality increases

Page 31: VLDB Phd Workshop

CH 2. ONTOLOGICAL CP-THEORIES∗

A Datalog+/– ontology O and a set of statements υ : ξ ≻ ξ′ [W]

Given υ, we prefer ξ to ξ′, irrespective of the value of W, as long as the other values are the same.

reviewer(bob,A) : hotel(I,C,wifi,S) ≻ hotel(I,C,cable,S) [REVIEW], with REVIEW = {review(p), review(n)}

Complexity of answering top-k CQ

∗ In Proc. of the International Joint Conference on Artificial Intelligence, July 2015.

Page 32: VLDB Phd Workshop

ONTOLOGICAL CP-THEORIES

⊤ : review(I,U,p) ≻ review(I′,U,n) [∅]

Page 33: VLDB Phd Workshop

ONTOLOGICAL CP-THEORIES

⊤ : hotel(I,C,w,S) ≻ hotel(I′,C,c,S′) [{R,F}]

Page 34: VLDB Phd Workshop

ONTOLOGICAL CP-THEORIES

hotel(I,C,c,S) ∧ review(I,U,n) : reviewer(j,A) ≻ reviewer(beate,A′) [∅]

Page 35: VLDB Phd Workshop

SUMMARY

Chapter    Formalism∗                 Publications
1          PP-Datalog+/–              [1]
1          GP-Datalog+/–              [5]
1          GPP-Datalog+/–             [2,3,4]
2          Ontological CP-nets        [6,7]
2          Ontological CP-theories    [8]

FIGURE: Diagram of the formalisms PP-Datalog+/–, GP-Datalog+/–, and GPP-Datalog+/– (chapter 1) and ontological CP-nets and ontological CP-theories (chapter 2).

∗ Complexity, Uncertainty, Properties and Implementation

Page 36: VLDB Phd Workshop

PUBLICATIONS

[1] PP-Datalog+/– Preference-Based Query Answering in Probabilistic Datalog+/– Ontologies. In Journal on Data Semantics, Vol. 4, No. 2, pages 81–101, June 2015.∗

[2] Query Answering in Probabilistic Datalog+/– Ontologies under Group Preferences. In Proc. of WI 2013, pages 171–178.∗

[3] Query Answering in Datalog+/– Ontologies under Group Preferences and Probabilistic Uncertainty. In Proc. DMSSW 2013, Vol. 8295 of LNCS, pages 192–206.∗

[4] Ontology-Based Query Answering with Group Preferences. In ACM Transactions on Internet Technology (TOIT), Vol. 14, No. 4, pages 25:1–25:24, December 2014.∗

[5] Group Preferences for Query Answering in Datalog+/– Ontologies. In Proc. of SUM 2013, Vol. 8078 of LNCS, pages 360–373, 2013.∗

Journal, Conference, Workshop or symposium

∗ Authors: T. Lukasiewicz, M. V. Martinez, G. I. Simari, and O. Tifrea-Marciuska

Page 37: VLDB Phd Workshop

PUBLICATIONS

[6] Computing k-Rank Answers with Ontological CP-Nets. In Proc. SEBD 2014, pages 276–283.∗

[7] Computing k-Rank Answers with Ontological CP-Nets. In Proc. PRUV 2014, Vol. 1205 of CEUR Workshop Proceedings, pages 74–87.∗

[8] Combining Existential Rules with the Power of CP-Theories. In Proc. of IJCAI 2015.∗

Conference, Workshop or symposium

∗ Authors: T. Di Noia, T. Lukasiewicz, M. V. Martinez, G. I. Simari, and O. Tifrea-Marciuska

Page 38: VLDB Phd Workshop

THANK YOU

Questions? [email protected]
