vldb phd workshop
TRANSCRIPT
PERSONALIZED SEARCH FOR THE SOCIAL SEMANTIC WEB
VLDB PHD WORKSHOP
Oana Tifrea-Marciuska
Supervisor: Prof. Thomas Lukasiewicz
Department of Computer Science, University of Oxford, UK
August 31, 2015
OANA TIFREA-MARCIUSKA PERSONALIZED SEARCH FOR THE SOCIAL SEMANTIC WEB SLIDE 1 /32
MOTIVATION
PRELIMINARIES
  Datalog+/–
LANGUAGES
  Strategies to Answer k-rank Disjunctive Atomic Queries
  Experiments
FUTURE WORK
WEB 3.0
Social Data Semantic data
WEB 3.0 SEARCH
Social Data
Personalized access
Semantic data
Precise and rich results
PERSONALIZED INFORMATION ACCESS
QUERY
ORDER BY user’s preferences
LIMIT k
PERSONALIZED INFORMATION ACCESS
QUERY Datalog+/– Queries
ORDER BY user’s preferences Preference Model
LIMIT k Top k
DATALOG+/–
hotel
      id   city   conn   class
t1    h1   rome   c      e
t2    h2   rome   w      l
t3    h3   rome   c      e

review
      id   user   feedback
t7    h1   b      n
t8    h2   b      p
t9    h3   j      p

reviewer
      user   age
t10   b      20
t11   j      30

friend
      user   user
t12   b      a
t13   j      a

FIGURE: Database D.
Constraints
Datalog-like: ∀X∀Y Φ(X,Y) → Ψ(X)
  friend(A,B) → friend(B,A)
With existential in the head: ∀X∀Y Φ(X,Y) → ∃Z Ψ(X,Z)
  reviewer(U,A) → ∃F friend(U,F)
Negative constraints: ∀X Φ(X) → ⊥
  friend(D,D) → ⊥
Equality constraints: ∀X Φ(X) → Ai = Aj
  review(A,U,F1) ∧ review(A,U,F2) → F1 = F2
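The four example constraints can be checked against database D with a few lines of Python. This is only a minimal sketch: the relation encodings and all function names are hypothetical, and the existential rule is merely tested for missing witnesses rather than repaired with fresh nulls as a chase procedure would do.

```python
# Toy database D from the slide (relation name -> set of tuples).
friend = {("b", "a"), ("j", "a")}
review = {("h1", "b", "n"), ("h2", "b", "p"), ("h3", "j", "p")}
reviewer = {("b", 20), ("j", 30)}


def symmetric_closure(pairs):
    """Apply friend(A,B) -> friend(B,A); one pass saturates a symmetry rule."""
    return pairs | {(b, a) for (a, b) in pairs}


def violates_negative(pairs):
    """friend(D,D) -> ⊥ : true if some atom friend(d,d) is present."""
    return any(a == b for (a, b) in pairs)


def violates_equality(reviews):
    """review(A,U,F1) ∧ review(A,U,F2) -> F1 = F2."""
    seen = {}
    for hotel, user, feedback in reviews:
        # The first feedback seen for (hotel, user) must match all later ones.
        if seen.setdefault((hotel, user), feedback) != feedback:
            return True
    return False


def unsatisfied_existential(reviewers, friends):
    """Users U with reviewer(U,A) but no witness F such that friend(U,F)."""
    have_friend = {u for (u, _) in friends}
    return {u for (u, _) in reviewers} - have_friend
```

On D as given, neither the negative nor the equality constraint is violated, and both reviewers b and j already have a friend, so the existential rule needs no new witness.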
WHY DATALOG+/–
Generalizes DL-Lite family
Has implementations
Can use ideas from the preference modeling in database community
PERSONALIZED INFORMATION ACCESS
QUERY Datalog+/– Queries
ORDER BY user’s preferences Preference Model
LIMIT k Top k
PREFERENCE MODELS
Syntax and Semantics
Advantages and Disadvantages: formal properties, experiments
Complexity for answering queries: conjunctive queries (CQ) or disjunction of atomic queries (DAQ)
Solve conflict between data
  Preferences of a group of users
  Uncertainty (reviews) vs preferences
Desire: ontology language that handles preferences of a user or a group of users and can handle uncertainty (e.g., information integration)
CH 1. PP-DATALOG+/–∗
Combines (4 different operators)
  Qualitative preferences: binary relations ≻ ⊆ HPref × HPref.
    hotel(h1, rome, c, e) ≻ hotel(h2, rome, w, l)
  Probabilistic model (e.g., reviews)
    hotel(h3, rome, c, e) 0.9
    hotel(h1, rome, c, e) 0.8
    hotel(h2, rome, w, l) 0.3
Semantic properties
Complexity of answering top-k DAQ polynomial
Dataset: IMDB movie database, with synthetic preferences
∗In Journal on Data Semantics. Vol. 4. No. 2. Jun, 2015.
CH. 1. GPP-DATALOG+/–∗
Combines for a group of people
  Qualitative preferences: binary relations ≻ ⊆ HPref × HPref.
    food(f1) ≻ food(f2)
  Probabilistic model (e.g., reviews)
    food(m2) 0.9; food(m3) 0.8; food(m2) 0.3
Semantic properties
Complexity of answering top-k DAQ polynomial
Dataset: YELP, with real preferences from 50 users
  efficiency
  quality of the results
∗In ACM Transactions on Internet Technology. Vol. 14. No. 4. Dec, 2014
GROUP PREFERENCE MODEL
DEFINITION
A group preference model U = (U1, ..., Un) for n>1 users is a collection of n user preference models.
FIGURE: Preference graphs of users u1, u2 and u3 over the atoms dest(b1), dest(c1), dest(c2), dest(c3), dest(f1) and dest(f2).
PROBABILISTIC MODEL
A preference relation ≻ is score-based if it is defined as follows: a1 ≻ a2 iff score(a1) > score(a2).
The model assigns a probability to each atom (using, e.g., Markov logic and Bayesian networks).
FIGURE: Probabilistic model PrM: dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3.
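A score-based relation is fully determined by its scoring function, which a short Python sketch makes concrete. The pairing of atoms with probabilities below is read off the slide's figure and should be treated as an assumption; the names `scores`, `prefers` and `ranking` are hypothetical.

```python
# Probabilities of the model PrM (atom -> score); pairing assumed from the figure.
scores = {
    "dest(b1)": 0.8, "dest(c1)": 0.75, "dest(f1)": 0.6,
    "dest(c2)": 0.4, "dest(f2)": 0.34, "dest(c3)": 0.3,
}


def prefers(a1, a2, score=scores):
    """Score-based preference: a1 ≻ a2 iff score(a1) > score(a2)."""
    return score[a1] > score[a2]


def ranking(score=scores):
    """Atoms ordered from most to least preferred under the score."""
    return sorted(score, key=score.get, reverse=True)
```

Because the relation is induced by a strict comparison of numbers, it is irreflexive and transitive by construction, and atoms with equal scores remain incomparable.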
CHALLENGES OF THE GIVEN MODEL 1/2
FIGURE: Preference graphs of users u1, u2 and u3, and the probabilistic model PrM (dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3).
Challenge 1: the user preference model and the probabilistic model may be in disagreement: preference merging operators
DEFINITION
Let ≻U be an SPO (strict partial order) and ≻M be a score-based preference relation. A preference merging operator ⊗(≻U, ≻M) yields a relation ≻∗ such that
  1 ≻∗ is an SPO
  2 if a1 ≻U a2 and a1 ≻M a2, then a1 ≻∗ a2.
CHALLENGES OF THE GIVEN MODEL 2/2
FIGURE: Preference graphs of users u1, u2 and u3, and the probabilistic model PrM (same as on the previous slide).
Challenge 2: user preference models may be in disagreement with each other: preference aggregation operator
DEFINITION
Let U = (U1, ..., Un) be a group preference model, where every Ui is an SPO. A preference aggregation operator ⊎ on U yields an SPO ≻∗.
GPP-Datalog+/– ontology - our model
DEFINITION
A GPP-Datalog+/– ontology has the form KB = (O, U, M, ⊗, ⊎), where
  O is a Datalog+/– ontology
  U = (U1, ..., Un) is a group preference model with n>1
  M is a probabilistic model
  ⊗ is a preference merging operator
  ⊎ is the preference aggregation operator
We say that KB is guarded iff O is guarded.
MERGING OPERATOR
0.4 − 0.6 > 0.1? No ⟹ keep relation
0.75 − 0.6 > 0.1? Yes ⟹ inverse relation
FIGURE: A user preference graph before and after merging with the probabilistic model PrM (dest(b1) 0.8, dest(c1) 0.75, dest(f1) 0.6, dest(c2) 0.4, dest(f2) 0.34, dest(c3) 0.3).
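The two threshold tests on this slide can be sketched as a per-edge rule in Python. This is an illustrative simplification under stated assumptions: the user edges in the usage example are hypothetical, the threshold t = 0.1 is taken from the slide's arithmetic, and the function does not verify that the result is still an SPO, which a real merging operator must guarantee.

```python
def merge_with_threshold(user_edges, score, t):
    """For each user preference (better, worse), invert it when the
    probabilistic model favours `worse` by more than t; otherwise keep it."""
    merged = set()
    for better, worse in user_edges:
        if score[worse] - score[better] > t:
            merged.add((worse, better))   # model disagrees strongly: invert
        else:
            merged.add((better, worse))   # keep the user's preference
    return merged


# Scores assumed from the slide's figure; edges are hypothetical.
score = {"dest(c1)": 0.75, "dest(f1)": 0.6, "dest(c2)": 0.4}
user_edges = {("dest(f1)", "dest(c1)"), ("dest(f1)", "dest(c2)")}
merged = merge_with_threshold(user_edges, score, t=0.1)
```

With these inputs the edge between dest(f1) and dest(c1) is inverted (0.75 − 0.6 exceeds 0.1), while the edge from dest(f1) to dest(c2) is kept (0.4 − 0.6 does not).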
SKYLINE AND K-RANK ANSWER
Let KB be a GPP-Datalog+/– ontology, Q(X) = q1(X1) ∨ ··· ∨ qn(Xn) be a DAQ. Then, a skyline answer to Q relative to ≻∗ = ⊎(⊗(≻U1, ≻M), ..., ⊗(≻Un, ≻M)) is any θqi entailed by O such that no θ′ exists with O ⊨ θ′qj and θ′qj ≻∗ θqi, where θ and θ′ are ground substitutions for the variables in Q(X).
A k-rank answer to Q is a sequence S = ⟨θ1, ..., θk′⟩ built by subsequently appending the skyline answers to Q, removing these atoms from consideration, and repeating until either |S| = k or no more skyline answers to Q remain.
FIGURE: Preference graph ≻∗ over dest(f1), dest(b1), dest(c1), dest(f2), dest(c3) and dest(c2).
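The layered construction of a k-rank answer can be sketched in Python as follows. The sketch assumes the entailed ground atoms and the relation ≻∗ are already available; the edge set `EDGES` is hypothetical and only chosen for illustration, and ties inside a skyline layer are broken by input order, which is one of several admissible choices.

```python
def skyline(atoms, prefers):
    """Atoms that no other remaining atom is preferred to."""
    return [a for a in atoms if not any(prefers(b, a) for b in atoms if b != a)]


def k_rank(atoms, prefers, k):
    """Append skyline layers, removing them, until k answers are collected."""
    remaining = list(atoms)
    answer = []
    while remaining and len(answer) < k:
        layer = skyline(remaining, prefers)
        if not layer:  # only possible if `prefers` is cyclic, i.e. not an SPO
            break
        answer.extend(layer[: k - len(answer)])
        remaining = [a for a in remaining if a not in layer]
    return answer


# Hypothetical preference edges (better, worse) over the dest atoms.
ATOMS = ["dest(b1)", "dest(c1)", "dest(c2)", "dest(c3)", "dest(f1)", "dest(f2)"]
EDGES = {
    ("dest(b1)", "dest(c1)"), ("dest(c1)", "dest(f1)"), ("dest(f1)", "dest(f2)"),
    ("dest(f2)", "dest(c2)"), ("dest(f2)", "dest(c3)"),
}


def prefers(a, b):
    return (a, b) in EDGES
```

For k = 4 this yields dest(b1), dest(c1), dest(f1), dest(f2); for k = 5 the last position is filled by either dest(c2) or dest(c3), since both sit in the same skyline layer.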
STRATEGIES TO ANSWER K-RANK DAQ
Collapse to single user
  1 Create virtual user
  2 Calculate k-rank from it
Voting
  1 Calculate k-rank for each of the users
  2 Vote
COLLAPSE TO SINGLE USER
no relation
FIGURE: Preference graphs of u1 (t = 0), u2 (t = 0.1) and u3 (t = 0.19) over the dest atoms, and the collapsed graph of the virtual user with weighted edges.
COLLAPSE TO SINGLE USER
relation with weight 2
FIGURE: Preference graphs of u1 (t = 0), u2 (t = 0.1) and u3 (t = 0.19) over the dest atoms, and the collapsed graph of the virtual user with weighted edges.
COLLAPSE TO SINGLE USER: K-RANK
Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 1
k-rank answer to Q: ⟨dest(b1)⟩.
FIGURE: Collapsed weighted graph over dest(b1), dest(c1), dest(f1), dest(f2), dest(c2) and dest(c3).
COLLAPSE TO SINGLE USER: K-RANK
Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 2
k-rank answer to Q: ⟨dest(b1), dest(c1)⟩.
FIGURE: Collapsed weighted graph over the remaining atoms dest(c1), dest(f1), dest(f2), dest(c2) and dest(c3).
COLLAPSE TO SINGLE USER: K-RANK
Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 3
k-rank answer to Q: ⟨dest(b1), dest(c1), dest(f1)⟩.
FIGURE: Collapsed weighted graph over the remaining atoms dest(f1), dest(f2), dest(c2) and dest(c3).
COLLAPSE TO SINGLE USER: K-RANK
Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 4
k-rank answer to Q: ⟨dest(b1), dest(c1), dest(f1), dest(f2)⟩.
FIGURE: Collapsed weighted graph over the remaining atoms dest(f2), dest(c2) and dest(c3).
COLLAPSE TO SINGLE USER: K-RANK
Q = dest(X), (t1, t2, t3) = (0, 0.1, 0.19), k = 5
k-rank answer to Q: ⟨dest(b1), dest(c1), dest(f1), dest(f2), dest(c2)⟩ or ⟨dest(b1), dest(c1), dest(f1), dest(f2), dest(c3)⟩.
FIGURE: Collapsed graph over the remaining atoms dest(c2) and dest(c3).
VOTING - PLURALITY VOTING
Q = dest(X), k = 2, and (t1, t2, t3) = (0, 0.1, 0.19).

        dest(f2)  dest(c1)  dest(c2)  dest(b1)  dest(c3)
u1         1         1         1         1         0
u2         0         1         0         1         0
u3         1         0         0         1         1
Total      2         2         1         3         1

FIGURE: Preference graphs of u1 (t = 0), u2 (t = 0.1) and u3 (t = 0.19) over the dest atoms.
k-rank answer to Q using plurality voting is ⟨dest(b1), dest(c1)⟩ or ⟨dest(b1), dest(f2)⟩.
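The tally in the table can be reproduced with a short Python sketch. It takes the 0/1 votes as given (how each user's votes are derived from their own k-rank answer is not modeled here), assumes k ≥ 1, and enumerates every admissible answer when candidates tie at the cut-off; the function names are hypothetical.

```python
from itertools import combinations

# Votes copied from the table: user -> {atom: 0 or 1}.
votes = {
    "u1": {"dest(f2)": 1, "dest(c1)": 1, "dest(c2)": 1, "dest(b1)": 1, "dest(c3)": 0},
    "u2": {"dest(f2)": 0, "dest(c1)": 1, "dest(c2)": 0, "dest(b1)": 1, "dest(c3)": 0},
    "u3": {"dest(f2)": 1, "dest(c1)": 0, "dest(c2)": 0, "dest(b1)": 1, "dest(c3)": 1},
}


def totals(votes):
    """Sum the votes each atom received across all users."""
    result = {}
    for ballot in votes.values():
        for atom, vote in ballot.items():
            result[atom] = result.get(atom, 0) + vote
    return result


def plurality_k_rank(votes, k):
    """All k-element answers by total votes, one per way of breaking ties."""
    tally = totals(votes)
    ranked = sorted(tally, key=lambda a: -tally[a])
    if k >= len(ranked):
        return [ranked]
    cutoff = tally[ranked[k - 1]]
    sure = [a for a in ranked if tally[a] > cutoff]    # strictly above the cut
    tied = [a for a in ranked if tally[a] == cutoff]   # compete for the rest
    return [sure + list(c) for c in combinations(tied, k - len(sure))]
```

For k = 2, dest(b1) is always included with 3 votes, and the second position goes to either dest(f2) or dest(c1), which tie at 2 votes.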
EXPERIMENTS
Yelp dataset: 1,000 businesses, 229,907 reviews to find places to eat
categories in Yelp → Datalog+/– concepts (e.g., Italian, Mediterranean)
50 users inserted their preferences (e.g., the place to eat, food)
Compare methods
  efficiency
  quality of the results
RESULTS OF THE EXPERIMENTS
group size increases → quality decreases
k increases → quality increases
CH 2. ONTOLOGICAL CP-THEORIES∗
A Datalog+/– ontology O and a set of statements υ : ξ ≻ ξ′ [W]
Given υ we prefer ξ to ξ′, irrespective of the value of W, as long as the other values are the same.
reviewer(bob, A) : hotel(I, C, wifi, S) ≻ hotel(I, C, cable, S) [REVIEW]
with REVIEW = {review(p), review(n)}
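The reading of a statement υ : ξ ≻ ξ′ [W] can be sketched in Python over flat variable assignments. This is a propositional simplification and not the ontological semantics: outcomes are plain dictionaries, atoms are flattened into hypothetical variables such as `conn` and `review`, and the class and function names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class CPStatement:
    context: dict     # υ: variable values that must hold in both outcomes
    variable: str     # the variable whose values ξ and ξ′ are compared
    better: str       # ξ
    worse: str        # ξ′
    free: frozenset   # W: variables whose values may differ freely


def directly_prefers(stmt, o1, o2):
    """True if the statement alone makes outcome o1 preferred to o2."""
    for var, val in stmt.context.items():
        if o1.get(var) != val or o2.get(var) != val:
            return False                      # context υ does not hold
    if o1.get(stmt.variable) != stmt.better or o2.get(stmt.variable) != stmt.worse:
        return False                          # not a ξ versus ξ′ comparison
    others = (set(o1) | set(o2)) - {stmt.variable} - set(stmt.free)
    return all(o1.get(v) == o2.get(v) for v in others)  # all else equal


stmt = CPStatement({"reviewer": "bob"}, "conn", "wifi", "cable", frozenset({"review"}))
o1 = {"reviewer": "bob", "conn": "wifi", "city": "rome", "review": "n"}
o2 = {"reviewer": "bob", "conn": "cable", "city": "rome", "review": "p"}
```

Here `directly_prefers(stmt, o1, o2)` holds even though the reviews differ, because `review` is in W; changing the city in one outcome breaks the "other values are the same" condition.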
Complexity of answering top-k CQ
∗In Proc. of the International Joint Conference on Artificial Intelligence, Jul 2015.
ONTOLOGICAL CP-THEORIES
⊤ : review(I, U, p) ≻ review(I′, U, n) [∅]
ONTOLOGICAL CP-THEORIES
⊤ : hotel(I, C, w, S) ≻ hotel(I′, C, c, S′) [{R, F}]
ONTOLOGICAL CP-THEORIES
hotel(I, C, c, S) ∧ review(I, U, n) : reviewer(j, A) ≻ reviewer(beate, A′) [∅]
SUMMARY
Chapter   Formalism ∗
1         PP-Datalog+/– [1]
1         GP-Datalog+/– [5]
1         GPP-Datalog+/– [2,3,4]
2         Ontological CP-nets [6,7]
2         Ontological CP-theories [8]

FIGURE: Diagram relating the five formalisms across chapters 1 and 2.
∗Complexity, Uncertainty, Properties and Implementation
PUBLICATIONS
1 PP-Datalog+/– Preference-Based Query Answering in Probabilistic Datalog+/– Ontologies. In Journal on Data Semantics. Vol. 4. No. 2. Pages 81–101. June, 2015.∗
2 Query Answering in Probabilistic Datalog+/– Ontologies under Group Preferences. In Proc. of WI 2013. Pages 171–178.∗
3 Query Answering in Datalog+/– Ontologies under Group Preferences and Probabilistic Uncertainty. In Proc. DMSSW 2013. Vol. 8295 of LNCS. Pages 192–206.∗
4 Ontology-Based Query Answering with Group Preferences. In ACM Transactions on Internet Technology (TOIT). Vol. 14. No. 4. Pages 25:1–25:24. December, 2014.∗
5 Group Preferences for Query Answering in Datalog+/– Ontologies. In Proc. of SUM 2013. Vol. 8078 of LNCS. Pages 360–373. 2013.∗
Journal, Conference, Workshop or symposium
∗Authors: T. Lukasiewicz, M.V. Martinez, G. I. Simari and O. Tifrea-Marciuska
PUBLICATIONS
6 Computing k-Rank Answers with Ontological CP-Nets. In Proc. SEBD 2014. Pages 276–283.∗
7 Computing k-Rank Answers with Ontological CP-Nets. In Proc. PRUV 2014. Vol. 1205 of CEUR Workshop Proceedings. Pages 74–87.∗
8 Combining Existential Rules with the Power of CP-Theories. In Proc. of IJCAI 2015.∗
Conference, Workshop or symposium
∗Authors: T. Di Noia, T. Lukasiewicz, M.V. Martinez, G. I. Simari and O. Tifrea-Marciuska
THANK YOU
Questions? [email protected]