Expanding the YAGO knowledge base
Thomas Rebele
Télécom ParisTech
2018-07-19
Sections:
◮ The YAGO knowledge base
◮ Outline
◮ Using YAGO for the humanities
◮ Adding Words to Regexes
◮ Answering Queries with Unix Shell
◮ Conclusion
What is a knowledge base?
[Figure: a knowledge graph with the facts
Albert Einstein, married, Mileva Maric
Albert Einstein, has advisor, Alfred Kleiner
Albert Einstein, won prize, Nobel Prize in Physics]
Applications of knowledge bases:
◮ Question answering
◮ Semantic search
◮ Text analysis
◮ Machine translation
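The facts in the example graph can be written down as subject, predicate, object triples. The following Python sketch is only an illustration of that idea: the `Fact` tuple and the `objects_of` helper are hypothetical names, not part of YAGO's code or data format.

```python
from collections import namedtuple

# Hypothetical in-memory representation of one fact; YAGO's real storage differs.
Fact = namedtuple("Fact", ["subject", "predicate", "obj"])

FACTS = [
    Fact("Albert Einstein", "married", "Mileva Maric"),
    Fact("Albert Einstein", "has advisor", "Alfred Kleiner"),
    Fact("Albert Einstein", "won prize", "Nobel Prize in Physics"),
]


def objects_of(facts, subject, predicate):
    """Return every object o such that (subject, predicate, o) is a known fact."""
    return [f.obj for f in facts if f.subject == subject and f.predicate == predicate]


if __name__ == "__main__":
    # A toy form of question answering: "Which prize did Albert Einstein win?"
    print(objects_of(FACTS, "Albert Einstein", "won prize"))
```

A lookup over such triples is the simplest building block behind applications like question answering and semantic search.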
What is YAGO?
◮ Knowledge base with 10 million entities and >210 million facts
◮ Automatically extracted from Wikipedia, WordNet, and GeoNames
◮ Multilingual facts from 10 languages
◮ Focus on precision
◮ Developed by the Max Planck Institute for Informatics and Télécom ParisTech
What is YAGO?
[Figure: a Wikipedia article on Albert Einstein, available in English, German, and French.
Article text: "Albert Einstein was a physicist. His work influenced the philosophy of science. He developed the theory of relativity."
Infobox: Spouse(s): Mileva Maric; Doctoral advisor(s): Alfred Kleiner
Categories: Nobel laureates in Physics
Automatic extraction turns the article into the facts
Albert Einstein, married, Mileva Maric
Albert Einstein, has advisor, Alfred Kleiner
Albert Einstein, won prize, Nobel Prize in Physics]
Involvement
◮ I joined the project in 2014
◮ Maintenance and development
◮ Contributed to the open source release in 2017 at https://github.com/yago-naga/yago3/
◮ Coordinated / contributed to the evaluation
  ◮ Ground truth: Wikipedia
  ◮ 98% of the facts in the sample were correct
Publication: ISWC 2016 (resource paper)
Thomas Rebele, Fabian Suchanek, Johannes Hoffart, Joanna Biega, Erdal Kuzey, Gerhard Weikum
YAGO is very accurate. But how complete is it?
Contributions:
◮ Extracting more information about residences, gender, birth and death dates
◮ Repairing regular expressions by adding missing words
◮ Preprocessing tabular data by transforming queries to Bash scripts
Using YAGO for the humanities: Problem statement
◮ Every person lives somewhere, but YAGO knows the residence of only 30% of the people
◮ Every person has a gender, but YAGO knows the gender of only 64% of the people
How can we make YAGO more complete?
Using YAGO for the humanities: Place of residence
[Figure: a Wikipedia article on Plato.
Article text: "Plato was a philosopher. He founded the Academy in Athens. He laid the foundation for philosophy."
Infobox: Birthplace: Athens
Categories: 420s BC births | 340s BC deaths | Greek philosopher | Greek male wrestler | Austrian writer
Previous approach: residence
Mapping of 5900 demonyms: Greek philosopher → Greece, Greek male wrestler → Greece, Austrian writer → Austria
Resulting counts: Greece: 2, Austria: 1]
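The counting step in the Plato example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the two-entry `DEMONYMS` table stands in for the mapping of 5900 demonyms, the function names are hypothetical, and picking the country with the most votes is an assumption about how the counts are used.

```python
from collections import Counter

# Tiny illustrative excerpt; the talk mentions a mapping of 5900 demonyms.
DEMONYMS = {"Greek": "Greece", "Austrian": "Austria"}


def residence_votes(categories):
    """Count, per country, how many category words are demonyms of that country."""
    votes = Counter()
    for category in categories:
        for word in category.split():
            country = DEMONYMS.get(word)
            if country is not None:
                votes[country] += 1
    return votes


def most_likely_residence(categories):
    """Return the country with the most votes, or None if no demonym was found."""
    votes = residence_votes(categories)
    return votes.most_common(1)[0][0] if votes else None


PLATO_CATEGORIES = [
    "420s BC births",
    "340s BC deaths",
    "Greek philosopher",
    "Greek male wrestler",
    "Austrian writer",
]

if __name__ == "__main__":
    print(dict(residence_votes(PLATO_CATEGORIES)))
    print(most_likely_residence(PLATO_CATEGORIES))
```

Counting votes instead of trusting a single category makes the result robust against an occasional wrong category such as "Austrian writer".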
Using YAGO for the humanities: Gender
Extract gender:
[Figure: a Wikipedia article on Albert Einstein, available in English, German, and French.
Article text: "Albert Einstein was a physicist. His work influenced the philosophy of science. He developed the theory of relativity."
Categories: Male scientist | Swiss physicists]
From category
From first name:
◮ Count males/females for each first name
◮ Assign names to gender accordingly
From pronoun:
◮ YAGO’s original algorithm
◮ Count pronouns (he, him / she, her)
◮ Assign gender accordingly
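The pronoun and first-name heuristics can be sketched as follows. This is a simplified illustration, not YAGO's extractor: the function names are hypothetical, the tokenization is naive, and the training list is toy data.

```python
import re
from collections import Counter, defaultdict

MALE_PRONOUNS = {"he", "him"}
FEMALE_PRONOUNS = {"she", "her"}


def gender_from_pronouns(text):
    """Count gendered pronouns in the text and return the majority gender, if any."""
    words = re.findall(r"[a-z]+", text.lower())
    male = sum(1 for w in words if w in MALE_PRONOUNS)
    female = sum(1 for w in words if w in FEMALE_PRONOUNS)
    if male > female:
        return "male"
    if female > male:
        return "female"
    return None  # no pronouns, or a tie


def learn_first_name_genders(people):
    """Map each first name to the gender seen most often among people with known gender."""
    counts = defaultdict(Counter)
    for full_name, gender in people:
        counts[full_name.split()[0]][gender] += 1
    return {first: c.most_common(1)[0][0] for first, c in counts.items()}


if __name__ == "__main__":
    article = ("Albert Einstein was a physicist. His work influenced the philosophy "
               "of science. He developed the theory of relativity.")
    print(gender_from_pronouns(article))
    # Toy training data: people whose gender is already known.
    known = [("Albert Einstein", "male"), ("Albert Camus", "male"), ("Marie Curie", "female")]
    print(learn_first_name_genders(known))
```

The first-name table learned from people with a known gender can then be applied to people for whom neither categories nor pronouns give an answer.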
Using YAGO for the humanities: Evaluation
◮ Compare extraction process on Wikipedia dump from 2017-02-20
◮ Extracted on 11 languages
◮ Evaluate precision based on a sample of 100 people

| Extraction | YAGO before | Recall | YAGO now | Recall | Precision | DBpedia (en) |
|---|---|---|---|---|---|---|
| Place of residence | 0.7m | 30% | 2.1m | 91% (+201%) | 97% (*) | 0.7m |
| Gender | 1.5m | 64% | 2.0m | 87% (+35%) | 98% | 4k |
| Birth dates | 1.6m | 69% | 1.7m | 74% (+8%) | 100% | 0.8m |
| Death dates | 0.7m | 33% | 0.8m | 36% (+10%) | 100% | 0.3m |

Table: Coverage and precision of our methods. Recall is relative to the total number of people in YAGO (2.2m).
m = million, k = thousand
(*) 6% of anachronistic residences (e.g., German Empire instead of Germany)
Using YAGO for the humanities: Births per month
[Plot: relative births (7.5% to 9%) per month (0 to 12); series: YAGO - all, National Center for Health Statistics]
Figure: Births per month in the United States between 2003 and 2015 (with the Student’s t confidence interval at α = 95%).
[Inset: Wikipedia article "Relative age effect", available in English and Euskara.
Article text: "The relative age effect describes a bias. People born early in the selection period of sports or academia are more likely to succeed."
Categories: Ageism | Epidemiology]
Using YAGO for the humanities: Births per month
[Plot: relative births (7.5% to 9%) per month (0 to 12); series: YAGO - no sportsmen, National Center for Health Statistics]
Figure: Births per month in the United States between 2003 and 2015 (with the Student’s t confidence interval at α = 95%).
Using YAGO for the humanities: Life span over time
[Plot: median age (45 to 85) by year (100 to 1900); series: male, female]
Figure: Median age over time, by year of birth (with the Student’s t confidence interval at α = 95%).
Using YAGO for the humanities: Relative population size
[Plot: relative population size (0 to 1) by year (−3000 to 2000); series: Egypt, Babylonian-Empire, Syria, China, Greece, Ancient-Rome, Britain, France, Italy, Germany, United-States]
Figure: Relative population size, by century. The y-axis is scaled by a quadratic function.
Using YAGO for the humanities: Summary
◮ Extension of YAGO:
  ◮ More people with residences (+201%, 97% precision)
  ◮ More people with genders (+35%, 98% precision)
  ◮ More birth and death dates (+8%/10%, 100% precision)
◮ Case studies:
  ◮ Births per month
  ◮ Life span over time
  ◮ Relative population size over time
◮ Interdisciplinary project
Publication: ISWC 2017 (workshop paper)
Thomas Rebele, Arash Nekoei, Fabian Suchanek
We often had to repair regular expressions (e.g., for matching dates). Can we automate this step?
Adding Words to Regexes: Introduction
Why does YAGO not know the ISBN numbers of my books?
◮ We want to find ISBN numbers in Wikipedia to include them in YAGO
◮ We try the regex ISBN(978|979)?\d{10}
◮ Why does the regex not find I978-2-1234-5680-3 ?
◮ How can we modify the regex automatically to match the word?
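The mismatch is easy to reproduce with Python's `re` module. In this small sketch the regex and the non-matching word come from the slide; the matching word `ISBN9781234567890` is a made-up example of what the regex was written for.

```python
import re

# The regex from the slide: the literal "ISBN", an optional 978/979 prefix, then ten digits.
ORIGINAL = re.compile(r"ISBN(978|979)?\d{10}")


def matches(pattern, word):
    """Return True if the whole word is in the language of the pattern."""
    return pattern.fullmatch(word) is not None


if __name__ == "__main__":
    print(matches(ORIGINAL, "ISBN9781234567890"))   # made-up word in the expected format
    print(matches(ORIGINAL, "I978-2-1234-5680-3"))  # the word from the slide
```

The slide's word fails for two reasons: it abbreviates the prefix to `I`, and it separates the digit groups with hyphens that the regex does not allow.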
Adding Words to Regexes: Problem statement
Problem statement, first try:
Given
◮ a regular expression r, e.g. ISBN(978|979)?\d{10}
◮ a set of positive examples E+, e.g. { I978-2-1234-5680-3 }
find a regular expression r′ such that
◮ L(r) ⊆ L(r′)
◮ E+ ⊆ L(r′)
Solution:
r′ = .* (trivially satisfies both conditions, so this first statement is too weak)
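A quick check shows why `.*` is a useless answer to the first-try statement. The sketch below only tests a few sample words, which is an assumption made for illustration: language inclusion L(r) ⊆ L(r′) is approximated by checking hand-picked words from L(r), and the unrelated strings are made up.

```python
import re

ORIGINAL = re.compile(r"ISBN(978|979)?\d{10}")
TRIVIAL = re.compile(r".*")

# Hand-picked sample words from L(r); a real inclusion test would compare automata.
OLD_MATCHES = ["ISBN1234567890", "ISBN9781234567890"]
POSITIVES = ["I978-2-1234-5680-3"]
# Strings that are clearly not ISBNs (made up for this illustration).
UNRELATED = ["0612345678", "hello world", ""]


def accepts_all(pattern, words):
    """Return True if the pattern matches every word completely."""
    return all(pattern.fullmatch(w) is not None for w in words)


if __name__ == "__main__":
    print(accepts_all(ORIGINAL, OLD_MATCHES))  # sanity check of the sample words
    print(accepts_all(TRIVIAL, OLD_MATCHES))   # condition L(r) ⊆ L(r'), on the sample
    print(accepts_all(TRIVIAL, POSITIVES))     # condition E+ ⊆ L(r')
    print(accepts_all(TRIVIAL, UNRELATED))     # but it also accepts everything else
```

Because `.*` accepts every string, the problem statement needs an additional constraint that penalizes unwanted matches.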
Adding Words to Regexes: Problem statement
Problem statement:
Given
◮ a regular expression r, e.g. ISBN(978|979)?\d{10}
◮ a set of positive examples E+, e.g. { I978-2-1234-5680-3 }
◮ a set of negative examples E−, e.g. { 0612345678 }
find a regular expression r′ such that
◮ L(r) ⊆ L(r′)
◮ E+ ⊆ L(r′)
◮ L(r′) ∩ E− is small
Evaluation:
◮ Precision of r′ ≥ or ≈ precision of r
◮ Recall of r′ ≥ recall of r (w.r.t. the intended meaning of the regex)
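Precision and recall of a candidate regex can be estimated on labelled words. The sketch below assumes that the intended meaning of the regex is approximated by two small hand-labelled lists; the `evaluate` helper and the word `ISBN9781234567890` are hypothetical, while the other two words come from the slide.

```python
import re

# Hand-labelled sample standing in for the "intended meaning" of the regex.
POSITIVES = ["ISBN9781234567890", "I978-2-1234-5680-3"]  # should be matched
NEGATIVES = ["0612345678"]                               # should not be matched


def evaluate(regex, positives, negatives):
    """Return (precision, recall) of the regex on the labelled words."""
    pattern = re.compile(regex)
    true_pos = sum(1 for w in positives if pattern.fullmatch(w))
    false_pos = sum(1 for w in negatives if pattern.fullmatch(w))
    matched = true_pos + false_pos
    precision = true_pos / matched if matched else 1.0
    recall = true_pos / len(positives) if positives else 1.0
    return precision, recall


if __name__ == "__main__":
    for regex in (r"ISBN(978|979)?\d{10}", r".*"):
        print(regex, evaluate(regex, POSITIVES, NEGATIVES))
```

On this sample the original regex is precise but misses one positive word, whereas `.*` reaches full recall at the cost of precision. A good repair should raise recall without such a loss.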
Adding Words to Regexes: What is new in our approach
Previous approaches:
regex + E+ + E− → regex
Our approach:
regex + E+ + E− → regex
Rationale: creating a large set of positive examples is difficult
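Adding Words to Regexes: Approximate regex matching
Step 1: match string and regex approximately [Myers et al. 1989]
[Figure: syntax tree of the regex ISBN(978|979)?\d{10}, a concatenation of the leaves I S B N, the optional alternation of 9 7 8 and 9 7 9, and the leaves \d \d ... \d \d, aligned with the characters of the string I978-2-1234-5680-3]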
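Adding Words to Regexes: Finding the gaps
Step 2: find the gaps
◮ Between regex leaves
◮ Between characters of the string
[Figure: the syntax tree of ISBN(978|979)?\d{10} aligned with the string I978-2-1234-5680-3. The regex leaves S B N have no counterpart in the string, and the character - of the string has no counterpart in the regex.]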
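The two kinds of gaps can be made visible with a much simpler stand-in for approximate regex matching. The sketch below is not the algorithm cited in Step 1: as a loudly labelled simplification, it aligns the string with a single hand-picked word from L(r), `ISBN9782123456803`, using the heuristic alignment of Python's `difflib`. The function name `find_gaps` is hypothetical.

```python
from difflib import SequenceMatcher


def find_gaps(regex_word, text):
    """Align a word of the regex language with the text and collect unmatched parts.

    Returns (missing_in_text, missing_in_regex):
    - missing_in_text: parts of regex_word without a counterpart in the text
    - missing_in_regex: parts of the text without a counterpart in regex_word
    """
    missing_in_text = []
    missing_in_regex = []
    matcher = SequenceMatcher(None, regex_word, text, autojunk=False)
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing_in_text.append(regex_word[a_start:a_end])
        if tag in ("insert", "replace"):
            missing_in_regex.append(text[b_start:b_end])
    return missing_in_text, missing_in_regex


if __name__ == "__main__":
    # Hand-picked word from the language of ISBN(978|979)?\d{10} (an assumption).
    print(find_gaps("ISBN9782123456803", "I978-2-1234-5680-3"))
```

Aligning against the regex itself, as in Step 1, avoids having to guess a suitable word from the language and also tells which leaves of the syntax tree surround each gap.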
Adding Words to Regexes: Add missing parts
22/41
Step 3 (simple approach): adapt regex, so that it includes the missingparts
.
.
I S B N
?
|
.
9 7 8
.
9 7 9
.
\d \d ... \d \d
I 9 7 8 - 2 - 1 2 3 4 - 5 6 8 0 - 3-
...
.
.
I ?
.
S B N
?
|
.
9 7 8
.
9 7 9
?
-
.
\d ?
-
\d ... \d \d
![Page 38: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/38.jpg)
Expanding theYAGO knowledge
base
Rebele
The YAGOknowledge base
Outline
Using YAGO forthe humanities
Adding Words toRegexes
Introduction
Problem statement
What is new in ourapproach
Approximate regexmatching
Finding the gaps
Add missing parts
Feedback function
Experiments
Summary
AnsweringQueries with UnixShell
Conclusion
Adding Words to Regexes: Add missing parts
Step 3 (simple approach): adapt the regex so that it includes the missing parts
[Figure: the regex tree for (ISBN)?(978|979)\d\d...\d\d and the string I978-2-1234-5680-3, followed by the repaired tree, in which every missing part is made optional: (I(SBN)?)?(978|979)-?\d-?\d...\d-?\d]
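To illustrate the effect of the simple approach, the following Python sketch compares an original and a repaired ISBN regex on the example string. The concrete patterns are assumptions: the slide elides the middle of the digit sequence (\d\d...\d\d), so the sketch assumes ten digits after the prefix and places the optional hyphens so that they fit the example string. It is not the output of the authors' tool.

```python
import re

# Hypothetical concrete instance of the slide's regex; the digit count (10)
# is an assumption, because the slide elides the middle of the digit sequence.
ORIGINAL = r"(ISBN)?(978|979)\d{10}"

# Simple repair: every missing part becomes optional. "SBN" is missing in the
# string, the hyphens are missing in the regex. The hyphen positions are chosen
# for this example only.
REPAIRED = r"(I(SBN)?)?(978|979)-?\d-?\d{4}-?\d{4}-?\d"


def matches(regex: str, word: str) -> bool:
    """Return True if the regex matches the whole word."""
    return re.fullmatch(regex, word) is not None


if __name__ == "__main__":
    for word in ["ISBN9782123456803", "I978-2-1234-5680-3"]:
        print(word, matches(ORIGINAL, word), matches(REPAIRED, word))
```

The repaired regex still accepts everything the original accepted, because parts are only made optional and never removed.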
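Step 3 (adaptive approach): adapt the regex so that it includes the missing parts
Example for a concatenation:
[Figure: a concatenation node with children a b ... c d, annotated with the gaps g1, g2^s, g2^e, g3 and the gap set {g1, g2, g3}, is rewritten (→) into a concatenation that contains an optional (?) subtree, with the gap sets {g1, g2^s} and {g2^e, g3}.]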
Adding Words to Regexes: Feedback function
Now we want to find URLs:
◮ We try regex r = http://[a-zA-Z\.]+
◮ It does not find s = wikipedia.org
◮ Repaired regex r′ = (http://)?[a-zA-Z\.]+
Problem:
◮ r′ finds all words
◮ Precision drops
Solution: use feedback on a set of negative examples E−
◮ Determine the parts of the regex that we can make optional
◮ We use the number of false positives, i.e.,
f(r′) = ( |E− ∩ L(r′)| ≤ α · |E− ∩ L(r)| )
◮ If f(r′) = false, add the word as a disjunction instead: http://[a-zA-Z\.]+|wikipedia.org
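The feedback test can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' implementation: L(r) is approximated by whole-word matching with `re.fullmatch`, and the function names `false_positives`, `feedback`, and `repair_with_feedback` are hypothetical.

```python
import re


def false_positives(regex: str, negatives: list[str]) -> int:
    """Count the negative examples matched completely by the regex."""
    return sum(1 for word in negatives if re.fullmatch(regex, word))


def feedback(original: str, repaired: str, negatives: list[str],
             alpha: float = 1.0) -> bool:
    """f(r'): the repair may match at most alpha times as many negatives."""
    return (false_positives(repaired, negatives)
            <= alpha * false_positives(original, negatives))


def repair_with_feedback(original: str, repaired: str, word: str,
                         negatives: list[str], alpha: float = 1.0) -> str:
    """Keep the repaired regex if the feedback accepts it; otherwise fall
    back to adding the missing word as a disjunction."""
    if feedback(original, repaired, negatives, alpha):
        return repaired
    return original + "|" + re.escape(word)


if __name__ == "__main__":
    r = r"http://[a-zA-Z\.]+"
    r_repaired = r"(http://)?[a-zA-Z\.]+"
    negatives = ["hello", "world"]  # made-up negative examples
    print(repair_with_feedback(r, r_repaired, "wikipedia.org", negatives))
```

With the two made-up negative examples, the repaired regex matches both of them while the original matches none, so the sketch falls back to the disjunction.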
Adding Words to Regexes: Experiments
Input data:
◮ Datasets: ReLIE [Li et al., 2008], Enron [Babbar et al., 2010], and Wikipedia infobox attributes
◮ In total 8 tasks (e.g., phone numbers, software names, dates)
◮ In total 52 regexes
Approaches:
◮ Dis: r|s1|···|sn
◮ Star: .*
◮ B&S: [Babbar et al., 2010] (reimplementation)
◮ Simple
◮ Adaptive
                      baseline                      adaptive
measure    original   dis   star   B&S    simple   α = 1.0   α = 1.1   α = 1.2
F1         55         55    21     40     56       60        60        60
recall     66         67    62     35     69       75        76        77
precision  64         64    14     71     64       63        63        63
length     56         270   2      3929   250      76        80        81
Table: Averaged measures for the different systems. Length is the number of characters of the regex.
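The two simplest baselines and the reported measures can be sketched in Python as follows. This is only an illustration: the helper names `dis_baseline` and `precision_recall_f1` are hypothetical, and matching whole words with `re.fullmatch` is an assumption that may differ from the evaluation protocol of the experiments.

```python
import re

# Star baseline: a regex that accepts everything.
STAR_BASELINE = ".*"


def dis_baseline(regex: str, missing_words: list[str]) -> str:
    """Dis baseline: r|s1|...|sn, with the missing words escaped."""
    return "|".join([regex] + [re.escape(w) for w in missing_words])


def precision_recall_f1(regex: str, positives: list[str],
                        negatives: list[str]) -> tuple[float, float, float]:
    """Compute precision, recall and F1 of a regex on labeled words."""
    tp = sum(1 for w in positives if re.fullmatch(regex, w))
    fp = sum(1 for w in negatives if re.fullmatch(regex, w))
    fn = len(positives) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    positives = ["http://a.org", "wikipedia.org"]  # made-up examples
    negatives = ["hello"]
    dis = dis_baseline(r"http://[a-zA-Z\.]+", ["wikipedia.org"])
    print(precision_recall_f1(dis, positives, negatives))
    print(precision_recall_f1(STAR_BASELINE, positives, negatives))
```

The sketch mirrors the trade-off visible in the table: the disjunction keeps precision but grows with every added word, while `.*` is short but matches the negative examples as well.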
Adding Words to Regexes: Summary
Summary:
◮ Algorithm for adding missing words to regexes
◮ Increases recall, while keeping precision stable
◮ Source code available at https://github.com/thomasrebele/regex-repair
Future work:
◮ Decrease dependency on E−
◮ Add a generalization step as postprocessing
Publications: ISWC 2017 (demo), PAKDD 2018 (full paper)
Thomas Rebele, Katerina Tzompanaki, Fabian Suchanek
Now that we have all this data, how can we process it efficiently?
Contributions:
Extracting more information about residences, gender, birth and death dates
Repairing regular expressions by adding missing words
Preprocessing tabular data by transforming queries to Bash scripts
Answering Queries with Unix Shell: Motivation
How can I find all my academic ancestors?
[Figure: knowledge graph. Albert Einstein teaches Relativity and hasAdvisor Alfred Kleiner; Alfred Kleiner teaches Statistical physics and hasAdvisor Johann Müller; Johann Müller teaches Electromagnetism.]
Answering Queries with Unix Shell: Idea
[Figure: a SPARQL / OWL query, a database, TSV / N-Triples input files, and the query result. The query is translated into a Bash script that runs on the input files.]
Translation from SPARQL / OWL to a Bash script:
1. Datalog
2. Algebra
3. Optimize
4. Translate
Answering Queries with Unix Shell: Approach
Query "Who are Einstein's academic ancestors?" in SPARQL:
SELECT ?Y WHERE {
  <Einstein> <hasAdvisor>+ ?Y
}
Translating the query to Datalog (simplified):
facts(X, Y, Z) :~ read_ntriples input
adv(X, Y) :- facts(X, "hasAdvisor", Y).
result(Y) :- adv("Einstein", Y).
result(Y) :- result(X), adv(X, Y).
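The meaning of the recursive Datalog program can be made concrete with a naive bottom-up evaluation in Python. This is a sketch for illustration only: the function name `academic_ancestors` and the in-memory triples are assumptions, and the actual system generates a Bash script rather than Python.

```python
def academic_ancestors(facts, start):
    """Naive bottom-up evaluation of the simplified Datalog program.

    facts is a collection of (subject, predicate, object) triples.
    """
    # adv(X, Y) :- facts(X, "hasAdvisor", Y).
    adv = {(s, o) for (s, p, o) in facts if p == "hasAdvisor"}
    # result(Y) :- adv(start, Y).
    result = {y for (x, y) in adv if x == start}
    # result(Y) :- result(X), adv(X, Y).
    # Apply the recursive rule until no new fact is derived.
    while True:
        derived = {y for (x, y) in adv if x in result}
        if derived <= result:
            return result
        result |= derived


if __name__ == "__main__":
    triples = [
        ("Einstein", "hasAdvisor", "Alfred Kleiner"),
        ("Alfred Kleiner", "hasAdvisor", "Johann Müller"),
        ("Einstein", "teaches", "Relativity"),
    ]
    print(sorted(academic_ancestors(triples, "Einstein")))
```

Because the result is a set that only grows, the loop terminates even if the advisor relation contains a cycle.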
Translating the Datalog program to relational algebra:
µx( π2(σ1=Einstein(π1,3(σ2=hasAdvisor(input)))) ∪ π3(x ⋈1=1 π1,3(σ2=hasAdvisor(input))) )
◮ σ: selection
◮ π: projection
◮ ⋈: join
◮ µx: least fixed point; x: recursive call
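The algebra plan can be executed directly if each operator is written as a small Python function over sets of tuples. The sketch below is an illustration under assumptions: the helper names `select`, `project`, `join`, `mu`, and `plan` are hypothetical, column indexes are 1-based as on the slide, and the join is a plain nested loop.

```python
def select(rel, col, value):
    """σ col=value: keep the tuples whose column col equals value."""
    return {t for t in rel if t[col - 1] == value}


def project(rel, *cols):
    """π cols: keep only the given columns, in the given order."""
    return {tuple(t[c - 1] for c in cols) for t in rel}


def join(left, right, lcol, rcol):
    """⋈ lcol=rcol: concatenate the tuples that agree on the join columns."""
    return {lt + rt for lt in left for rt in right
            if lt[lcol - 1] == rt[rcol - 1]}


def mu(step):
    """µx: least fixed point of step, starting from the empty relation."""
    x = set()
    while True:
        nxt = step(x)
        if nxt == x:
            return x
        x = nxt


def plan(inp):
    """µx( π2(σ1=Einstein(adv)) ∪ π3(x ⋈1=1 adv) ),
    with adv = π1,3(σ2=hasAdvisor(input))."""
    adv = project(select(inp, 2, "hasAdvisor"), 1, 3)
    base = project(select(adv, 1, "Einstein"), 2)
    return mu(lambda x: base | project(join(x, adv, 1, 1), 3))


if __name__ == "__main__":
    triples = {
        ("Einstein", "hasAdvisor", "Alfred Kleiner"),
        ("Alfred Kleiner", "hasAdvisor", "Johann Müller"),
        ("Einstein", "teaches", "Relativity"),
    }
    print(sorted(plan(triples)))
```

The join of the unary relation x with the binary relation adv yields three columns, which is why the recursive branch projects on column 3.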
Optimizations:
[Figure: the algebra plan before and after optimization]
Before:
µx( π2(σ1=Einstein(π1,3(σ2=hasAdvisor(input)))) ∪ π3(x ⋈1=1 π1,3(σ2=hasAdvisor(input))) )
After:
a = sort1(π1,3(σ2=hasAdvisor(input)))
µx( π3(σ1=Einstein(σ2=hasAdvisor(input))) (1st), sort(π3(sort1(δx) ⋈1=1 a)) )
The optimized plan merges the selection and projection of the non-recursive branch, joins with δx instead of x, adds sort operators on the join inputs (sort1 sorts on the first column), and stores the shared subplan as a.
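One reading of the optimized plan is a semi-naive evaluation: each round joins only the newly found tuples (δx) with a relation a that is sorted once up front. The Python sketch below follows that reading under stated assumptions. The function name `semi_naive_ancestors` is hypothetical, and a binary search on the sorted list stands in for the merge join that the generated script would run on sorted files.

```python
from bisect import bisect_left


def semi_naive_ancestors(triples, start):
    """Semi-naive fixed point: join only the delta with the sorted relation a.

    triples is a list of (subject, predicate, object) tuples.
    """
    # a = sort1(π1,3(σ2=hasAdvisor(input))), computed once.
    a = sorted((s, o) for (s, p, o) in triples if p == "hasAdvisor")
    # First iteration: π3(σ1=start(σ2=hasAdvisor(input))).
    delta = {o for (s, p, o) in triples if s == start and p == "hasAdvisor"}
    result = set(delta)
    while delta:
        new = set()
        for x in sorted(delta):
            # Locate the first pair whose first column is >= x.
            i = bisect_left(a, (x,))
            while i < len(a) and a[i][0] == x:
                new.add(a[i][1])
                i += 1
        # Keep only the tuples that were not known before.
        delta = new - result
        result |= delta
    return result


if __name__ == "__main__":
    triples = [
        ("Einstein", "hasAdvisor", "Alfred Kleiner"),
        ("Alfred Kleiner", "hasAdvisor", "Johann Müller"),
        ("Einstein", "teaches", "Relativity"),
    ]
    print(sorted(semi_naive_ancestors(triples, "Einstein")))
```

Joining only the delta avoids re-deriving known ancestors in every round, and sorting a once keeps each lookup cheap.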
[Figure: the optimized plan, with hasAdvisor abbreviated as adv.]
Translating the plan to a Bash script:
awk '($1 == "Einstein" && $2 == "hasAdvisor") { print $3 >> "b" }
     ($2 == "hasAdvisor") { print $1 FS $3 >> "pre_a" }
' <(read_ntriples input)
# lock a
(
  sort -k 1 pre_a > a
  # unlock a
) &
![Page 74: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/74.jpg)
Expanding theYAGO knowledge
base
Rebele
The YAGOknowledge base
Outline
Using YAGO forthe humanities
Adding Words toRegexes
AnsweringQueries with UnixShell
Motivation
Idea
Approach
Optimization
Experiments
Summary
Conclusion
Answering Queries with Unix Shell: Approach
34/41
QSPARQL / OWL
&Bash script
Q1. Datalog
σπ2. Algebra
÷3. Optimize
^4. Translate
£ a
sort1
π1,3
σ2=adv.
input
µx
π3
σ1=Einstein
σ2=adv.
input
sort
π3
⋊⋉1=1
sort1
δx
a
(1st)
awk ’( $1 == "Einstein"
&& $2 == "hasAdvisor"){ print $3 >> "b" }
($2 == "hasAdvisor"){ print $1 FS $3 >> "pre_a" }
’ <(read_ntriples input)
# lock a(
sort -k 1 pre_a > a# unlock a
) &
![Page 75: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/75.jpg)
Expanding theYAGO knowledge
base
Rebele
The YAGOknowledge base
Outline
Using YAGO forthe humanities
Adding Words toRegexes
AnsweringQueries with UnixShell
Motivation
Idea
Approach
Optimization
Experiments
Summary
Conclusion
Answering Queries with Unix Shell: Approach
34/41
QSPARQL / OWL
&Bash script
Q1. Datalog
σπ2. Algebra
÷3. Optimize
^4. Translate
£ a
sort1
π1,3
σ2=adv.
input
µx
π3
σ1=Einstein
σ2=adv.
input
sort
π3
⋊⋉1=1
sort1
δx
a
(1st)
awk ’( $1 == "Einstein"
&& $2 == "hasAdvisor"){ print $3 >> "b" }
($2 == "hasAdvisor"){ print $1 FS $3 >> "pre_a" }
’ <(read_ntriples input)
# lock a(
sort -k 1 pre_a > a# unlock a
) &
![Page 76: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/76.jpg)
Answering Queries with Unix Shell: Approach
34/41
Translation of the recursive part:
while
  # ...
  sort -k 1 -u \
    <( # wait for a
       join -1 1 -2 1 -o 2.2 \
         <(sort -k 1 -u delta) a
    )
  # ...
  [ -s delta ];
do continue; done
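The two fragments above elide how the loop keeps track of what is already known. A minimal, self-contained sketch of the same idea follows. It is not the script Bashlog generates: the function name `transitive_advisors`, the file names `full`, `new` and `delta`, the toy triples, and the use of tab-separated input in place of `read_ntriples` are all assumptions made for illustration. It also runs sequentially, without the background sort and locking shown on the slide.

```shell
#!/bin/sh
# Hypothetical sketch (not actual Bashlog output): list everyone reachable
# from a start person via hasAdvisor, using semi-naive iteration.
# Reads tab-separated (subject, predicate, object) triples from stdin.
transitive_advisors() (
  start=$1
  export LC_ALL=C            # sort, join and comm must agree on the ordering
  tab=$(printf '\t')
  work=$(mktemp -d) || exit 1
  cd "$work" || exit 1
  : > b; : > pre_a           # make sure both files exist even without matches

  # One pass over the input feeds both the base case (b) and the
  # advisor relation (pre_a), as in the awk step on the slide.
  awk -F '\t' -v start="$start" '
    $1 == start && $2 == "hasAdvisor" { print $3 >> "b" }
    $2 == "hasAdvisor"                { print $1 FS $3 >> "pre_a" }
  '

  sort -t "$tab" -k 1,1 pre_a > a   # a: (student, advisor), sorted for join

  sort -u b > full                  # everything derived so far
  cp full delta                     # facts that were new in the last round
  while [ -s delta ]; do
    # Join only the new facts with a, then drop what is already known.
    join -t "$tab" -1 1 -2 1 -o 2.2 delta a | sort -u > new
    comm -13 full new > delta
    sort -u full delta > full.tmp && mv full.tmp full
  done

  cat full
  cd / && rm -rf "$work"
)

# Usage with made-up toy data (PersonA..PersonD are placeholders).
printf '%s\t%s\t%s\n' \
  Einstein hasAdvisor Kleiner \
  Einstein married    Maric \
  Kleiner  hasAdvisor PersonA \
  PersonA  hasAdvisor PersonB \
  PersonC  hasAdvisor PersonD |
  transitive_advisors Einstein
# prints Kleiner, PersonA, PersonB (one per line)
```

Joining only `delta` rather than the full result keeps each round proportional to the newly found facts, and the `comm -13` step guarantees termination even if the advisor relation contains a cycle.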
![Page 81: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/81.jpg)
Answering Queries with Unix Shell: Optimization
35/41
How can I find all professors?
Professor(X) :- Person(X), teachesCourse(X,Y).
Professor(X) :- advisorOf(X,Y), Professor(Y).
Person(X) :- Employee(X).
Person(X) :- Professor(X).
Combining the first and the last rule leads to
Professor(X) :- Professor(X), teachesCourse(X,Y).
![Page 87: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/87.jpg)
Answering Queries with Unix Shell: Optimization
36/41
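Plan for the Professor query:
µx( π1((employee ∪ x) ⋈1=1 teachesCourse) ∪ π1(advisorOf ⋈2=1 x) )
Tracing a tuple (c1) from x through both branches:
Branch via teachesCourse: the join yields (c1, c1, ?), and π1 yields (c1), which is already in x ⇒ superfluous
Branch via advisorOf: the join yields (?, c1, c1), and π1 yields (?), a possibly new tuple ⇒ necessary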
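The effect of dropping the superfluous branch can be checked on a small example. The sketch below evaluates the Professor rules by naive fixpoint iteration, once with Person = Employee ∪ Professor and once with Person = Employee only. The function name `professors`, the file layout, and the toy data are assumptions for illustration. This is not how Bashlog itself implements the optimization.

```shell
#!/bin/sh
# Hypothetical sketch (not Bashlog output): evaluate the Professor rules by
# naive fixpoint iteration, with or without the superfluous branch.
# $1: directory holding three tab-separated files (names are assumptions):
#       employee (person), teachesCourse (person, course),
#       advisorOf (advisor, student)
# $2: "full" keeps the branch x ⋈ teachesCourse, anything else drops it
professors() (
  dir=$1; mode=$2
  export LC_ALL=C                  # sort and join must agree on the ordering
  tab=$(printf '\t')
  work=$(mktemp -d) || exit 1
  sort -u "$dir/employee" > "$work/employee"
  sort -t "$tab" -k 1,1 "$dir/teachesCourse" > "$work/teaches"
  sort -t "$tab" -k 2,2 "$dir/advisorOf" > "$work/advisorOf"
  cd "$work" || exit 1

  : > x                            # Professor facts derived so far
  while :; do
    if [ "$mode" = full ]; then
      sort -u employee x > person  # Person = Employee ∪ Professor
    else
      cp employee person           # optimized plan: Person = Employee
    fi
    {
      join -t "$tab" -1 1 -2 1 -o 1.1 person teaches   # π1(person ⋈1=1 teachesCourse)
      join -t "$tab" -1 2 -2 1 -o 1.1 advisorOf x      # π1(advisorOf ⋈2=1 x)
      cat x
    } | sort -u > next
    cmp -s x next && break         # fixpoint reached: nothing new derived
    mv next x
  done
  cat x
  cd / && rm -rf "$work"
)

# Usage with made-up toy data.
data=$(mktemp -d)
printf '%s\n' alice bob carol > "$data/employee"
printf '%s\t%s\n' alice db101 dave ml202 > "$data/teachesCourse"
printf '%s\t%s\n' bob alice erin bob > "$data/advisorOf"
professors "$data" full        # prints alice, bob, erin (one per line)
professors "$data" optimized   # prints the same three names
rm -rf "$data"
```

Both modes reach the same fixpoint because the branch through teachesCourse can only re-derive a professor that is already in x, whereas the branch through advisorOf is what adds bob and erin.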
![Page 88: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/88.jpg)
Answering Queries with Unix Shell: Experiments
37/41
◮ Dataset: LUBM university benchmark
◮ 14 different queries
◮ Competitors: Datalog-based systems (DLV, Souffle, RDFox), triple stores (Jena, Stardog, Virtuoso), database management systems (MonetDB, Postgres)
Number of finished queries within time limit
[Bar chart for LUBM 10. x-axis: Bash, DLV, Souffle, RDFox, Jena, Stardog, Virtuoso, MonetDB*, Postgres*; y-axis: # of finished queries, 0 to 14]
* = we folded the TBox into the query
![Page 89: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/89.jpg)
Answering Queries with Unix Shell: Experiments
38/41
[Line chart, log-log axes. x-axis: # of universities (10^1 to 10^3); y-axis: runtime in seconds (10^0 to 10^3); series: Bash, DLV, RDFox, Stardog, Souffle, Jena, Virtuoso, MonetDB, Postgres]
![Page 90: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/90.jpg)
Answering Queries with Unix Shell: Experiments
39/41
| dataset | Bash | RDFox | BigDatalog | Stardog | Virtuoso |
|---|---|---|---|---|---|
| LiveJournal | 117 | 70 | 532 | 941 | - |
| orkut | 225 | 121 | 1838 | 1123 | - |
| friendster | 16306 | - | - | - | - |
Table: Runtime for the reachability query, in seconds.
![Page 91: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/91.jpg)
Answering Queries with Unix Shell: Summary
40/41
Summary:
◮ Preprocess large datasets without installing software
◮ Supports subset of SPARQL / OWL and Datalog as query language
◮ Try it online at https://www.thomasrebele.org/projects/bashlog
◮ Source code available at https://github.com/thomasrebele/bashlog
Future work:
◮ Numerical comparisons
◮ Aggregations (e.g., max, count)
Publication: ISWC 2018 (full paper)
Thomas Rebele, Thomas P. Tanon, Fabian Suchanek
![Page 92: Expanding the YAGO knowledge base - Thomas Rebele · YAGO knowledge base Rebele The YAGO knowledge base What is a knowledge base? What is YAGO? Involvement Outline Using YAGO for](https://reader033.vdocuments.net/reader033/viewer/2022042811/5fa73379567d5a57e929d968/html5/thumbnails/92.jpg)
Conclusion
41/41
This thesis showed how to extend YAGO along several axes:
◮ Improving its completeness w.r.t. people
◮ Automatically repairing its regular expressions
◮ Preprocessing queries using only a Bash shell
Other accomplishments:
◮ Source code of all contributions is available online
◮ Publications at ISWC 2016 (resource paper), ISWC 2017 (demo, workshop), PAKDD 2018 (full paper), ISWC 2018 (full paper)
Future work:
◮ More studies on human society using facts from YAGO (ongoing)
◮ Combine YAGO and Wikidata
◮ Queries with numerical comparisons and aggregations