Instance-Based Ontological Knowledge Acquisition
DESCRIPTION
The Linked Open Data (LOD) cloud contains a tremendous number of interlinked instances, from which we can retrieve abundant knowledge. However, because the ontologies are large and heterogeneous, it is time-consuming to learn all of them manually, and it is difficult to see which properties are important for describing instances of a specific class. To construct an ontology that helps users easily access various data sets, we propose a semi-automatic ontology integration framework that reduces the heterogeneity of ontologies and retrieves frequently used core properties for each class. The framework consists of three main components: graph-based ontology integration, machine-learning-based ontology schema extraction, and an ontology merger. By analyzing the instances of the linked data sets, this framework acquires ontological knowledge and constructs a high-quality integrated ontology, which is easily understandable and effective for knowledge acquisition from various data sets using simple SPARQL queries.
TRANSCRIPT
Instance-Based Ontological Knowledge Acquisition
The Graduate University for Advanced Studies (SOKENDAI)
National Institute of Informatics
Lihua Zhao & Ryutaro Ichise
ESWC2013, Montpellier, France, 28th May, 2013
Introduction Related Work Semi-automatic Ontology Integration Framework Experiments Conclusion and Future Work
Outline
Introduction
Related Work
Semi-automatic Ontology Integration Framework
  Graph-Based Ontology Integration
  Machine-Learning-Based Ontology Schema Extraction
  Ontology Merger
Experiments
Conclusion and Future Work
Lihua Zhao & Ryutaro Ichise | Instance-Based Ontological Knowledge Acquisition | 2
Introduction
Linked Open Data (LOD)
  Machine-readable and interlinked at instance-level.
  295 data sets, 31 billion RDF triples (as of Sep. 2011).
  Around 504 million owl:sameAs links.
  7 domains (cross-domain, geographic, media, life sciences, government, user-generated content, and publications).
Figure: LOD cloud diagram (as of September 2011). Data sets are grouped by domain: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences.
Motivation
Figure: Interlinked Instances of “France”.
Problems when accessing several data sets:
Ontology Heterogeneity Problem
  Map related ontology classes and properties.
  Ontology similarity matching on the SameAs graph patterns.
Difficulty in Identifying Core Ontology Schemas
  Retrieve frequently used core ontology classes and properties.
  Machine learning for core ontology schema extraction.
Related Work
Find useful attributes from frequent graph patterns using a supervised machine learning method. [Le, 2010]
  Only for geographic data, with no discussion of the features.
A debugging method for mapping lightweight ontologies with a machine learning method. [Meilicke, 2008]
  Limited to expressive lightweight ontologies.
Construct an intermediate-layer ontology by analyzing concept coverings. [Parundekar, 2012]
  Only for specific domains and limited to two resources at a time.
Semi-automatic Ontology Integration Framework
Construct a global ontology by integrating heterogeneous ontologies of the Linked Open Data.
Graph-Based Ontology Integration [Zhao, et al., 2012]
  Group related classes and properties.
Machine-Learning-Based Ontology Schema Extraction
  Extract frequent core classes and properties.
Ontology Merger
  Merge extracted ontology classes and properties.
Graph-based Ontology Integration
Extract graph patterns from interlinked instances to discover related ontology classes and predicates.
STEP 1: Graph Pattern Extraction
SameAs Graph: an undirected SameAs graph $SG = (V, E, I)$, where
  $V$: a set of vertices (the labels of data sets).
  $E \subseteq V \times V$: a set of sameAs edges.
  $I$: a set of URIs of the interlinked SameAs instances.
Example: $SG_{France} = (V_{France}, E_{France}, I_{France})$.
  $V_{France} = \{M, D, G, N\}$
  $E_{France} = \{(D, G), (D, N), (G, M), (G, N)\}$
  $I_{France} = \{\text{mdb-country:FR}, \text{db:France}, \text{geo:3017382}, \text{nyt:67...21}\}$.
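The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes sameAs links arrive as pairs of prefixed URIs, that the data-set label can be read off the URI prefix (the `PREFIX_TO_LABEL` mapping is a hypothetical helper), and that each connected component of sameAs links forms one SameAs graph.

```python
from collections import defaultdict

# Hypothetical mapping from URI prefix to the data-set label used on the slide.
PREFIX_TO_LABEL = {"mdb-country": "M", "db": "D", "geo": "G", "nyt": "N"}


def label(uri):
    """Return the data-set label for a prefixed URI such as 'db:France'."""
    return PREFIX_TO_LABEL[uri.split(":", 1)[0]]


def build_sameas_graphs(links):
    """Group sameAs-linked URIs into SameAs graphs SG = (V, E, I)."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, graphs = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search collects one connected component of instances.
        component, stack = set(), [start]
        while stack:
            uri = stack.pop()
            if uri not in component:
                component.add(uri)
                stack.extend(neighbours[uri] - component)
        seen |= component
        vertices = {label(u) for u in component}
        # Undirected edges between data-set labels, stored as frozensets.
        edges = {frozenset((label(a), label(b)))
                 for a, b in links if a in component}
        graphs.append((vertices, edges, component))
    return graphs


# The sameAs links of the "France" example, with URIs copied as shown on the slide.
links = [
    ("db:France", "geo:3017382"),
    ("db:France", "nyt:67...21"),
    ("geo:3017382", "mdb-country:FR"),
    ("geo:3017382", "nyt:67...21"),
]
graphs = build_sameas_graphs(links)
```

Storing each edge as a frozenset keeps the graph undirected, so (D, G) and (G, D) count as the same edge.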
STEP 2: <Predicate, Object> Collection
<Predicate, Object> (PO) pairs and types for SGFrance
Predicate                    | Object                     | Type
rdf:type                     | db-onto:Country            | Class
rdfs:label                   | "France"@en                | String
foaf:name                    | "France"@en                | String
foaf:name                    | "Republique francaise"@en  | String
db-onto:wikiPageExternalLink | http://us.franceguide.com/ | URI
db-prop:populationEstimate   | 65447374                   | Number
...                          | ...                        | ...
geo-onto:name                | France                     | String
geo-onto:alternateName       | "France"@en                | String
geo-onto:featureCode         | geo-onto:A.PCLI            | Class
geo-onto:population          | 64768389                   | Number
...                          | ...                        | ...
rdf:type                     | mdb:country                | Class
mdb:country_name             | France                     | String
mdb:country_population       | 64094000                   | Number
rdfs:label                   | France (Country)           | String
...                          | ...                        | ...
rdf:type                     | skos:Concept               | Class
skos:inScheme                | nyt:nytd_geo               | Class
skos:prefLabel               | "France"@en                | String
nyt-prop:first_use           | 2004-09-01                 | Date
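The Type column can be derived mechanically from each PO pair. The sketch below is only one plausible way to do it: the set of class-valued predicates and the regular expressions are assumptions read off the example table, not rules stated by the authors.

```python
import re

# Assumption: predicates whose objects are treated as classes in the example table.
CLASS_PREDICATES = {"rdf:type", "skos:inScheme", "geo-onto:featureCode"}


def object_type(predicate, obj):
    """Classify the object of a <predicate, object> pair into one of five types."""
    if predicate in CLASS_PREDICATES:
        return "Class"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", obj):
        return "Date"
    if re.fullmatch(r"-?\d+(\.\d+)?", obj):
        return "Number"
    if obj.startswith(("http://", "https://")):
        return "URI"
    return "String"


examples = [
    ("rdf:type", "db-onto:Country"),
    ("rdfs:label", '"France"@en'),
    ("db-onto:wikiPageExternalLink", "http://us.franceguide.com/"),
    ("db-prop:populationEstimate", "65447374"),
    ("nyt-prop:first_use", "2004-09-01"),
]
types = [object_type(p, o) for p, o in examples]
```

The date check runs before the number check so that a value such as 2004-09-01 is not partially matched as a number.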
STEP 3: Related Classes and Properties Grouping
Related Classes Grouping (Leaf nodes)
  Tracking subsumption relations from SameAs graphs.
    <C1 rdfs:subClassOf C2>
    <C1 skos:inScheme C2>
Example: SGFrance
  Related Classes: {db-onto:Country, geo-onto:A.PCLI, mdb:country, nyt:nytd_geo}
Related Properties Grouping
  Exact matching for creating initial sets of PO pairs $S_1, S_2, \ldots, S_k$.
  Similarity matching on the initial sets of PO pairs.

$$Sim(PO_i, PO_j) = \frac{ObjSim(PO_i, PO_j) + PreSim(PO_i, PO_j)}{2}$$

$$ObjSim(PO_i, PO_j) = \begin{cases} 1 - \dfrac{|O_{PO_i} - O_{PO_j}|}{O_{PO_i} + O_{PO_j}} & \text{if } O_{PO} \text{ is Number} \\ StrSim(O_{PO_i}, O_{PO_j}) & \text{if } O_{PO} \text{ is String} \end{cases}$$

$$PreSim(PO_i, PO_j) = WNSim(T_{PO_i}, T_{PO_j})$$

  $StrSim(O_{PO_i}, O_{PO_j})$: average of 3 string-based similarity measures.
  $WNSim(T_{PO_i}, T_{PO_j})$: average of 9 WordNet-based similarity measures.
  Refine sets of PO pairs according to rdfs:domain.
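The similarity formulas above can be sketched as follows. The slide does not name the three string measures or the nine WordNet measures, so this example substitutes stand-ins: `difflib.SequenceMatcher` plays the role of StrSim, and PreSim is passed in as a caller-supplied function (`pre_sim` is a hypothetical parameter). Only the combination logic follows the formulas.

```python
from difflib import SequenceMatcher


def str_sim(a, b):
    """Stand-in for StrSim: a single string measure instead of the average of 3."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def obj_sim(o_i, o_j):
    """ObjSim: relative-difference similarity for numbers, string similarity otherwise."""
    if isinstance(o_i, (int, float)) and isinstance(o_j, (int, float)):
        total = o_i + o_j
        if total == 0:
            # Guard against division by zero (a case the formula does not cover).
            return 1.0 if o_i == o_j else 0.0
        return 1 - abs(o_i - o_j) / total
    return str_sim(str(o_i), str(o_j))


def sim(po_i, po_j, pre_sim):
    """Sim: mean of object similarity and predicate similarity."""
    (pred_i, obj_i), (pred_j, obj_j) = po_i, po_j
    return (obj_sim(obj_i, obj_j) + pre_sim(pred_i, pred_j)) / 2


# Two population values for France taken from the PO-pair example.
dbpedia_po = ("db-prop:populationEstimate", 65447374)
geonames_po = ("geo-onto:population", 64768389)

# Placeholder predicate similarity; a real run would plug in WordNet-based measures.
score = sim(dbpedia_po, geonames_po, pre_sim=lambda t_i, t_j: 1.0)
```

For the two population values, ObjSim is 1 - 678985 / 130215763, roughly 0.995, so the pair would be grouped whenever the predicate terms are also judged similar.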
STEP 4: Aggregation of All Graph Patterns
Aggregate the integrated classes and properties from all the extracted graph patterns.
Select a Term for Each Set
  ex-onto:ClassTerm
  ex-onto:propTerm
Construct Relations
  ex-prop:hasMemberClasses
    <ex-onto:ClassTerm, ex-prop:hasMemberClasses, class>
  ex-prop:hasMemberDataTypes
    <ex-onto:propTerm, ex-prop:hasMemberDataTypes, property>
Construct a Preliminary Integrated Ontology
STEP 5: Manual Revision
Manually revise the preliminary integrated ontology.
Terms of the integrated classes and properties:
  Choose a proper term for each group of classes or properties.
Groups of related classes or properties:
  Correct wrong groupings.
Machine-Learning-Based Ontology Schema Extraction
Top-level classes and core properties are necessary.
Decision Table
  Retrieves core properties in each data set.
    Belongs to rule-based machine learning with a simple hypothesis.
    Retrieves a subset of properties that are important for describing instances in a data set.
Apriori
  Retrieves core properties in the instances of a specific top-level class.
    Belongs to association rule mining.
    Finds a set of properties whose support is greater than the user-defined minimum support.
Decision Table
Retrieve top-level classes and core properties that are important for describing instances in a data set.
Collect top-level classes.
Filter out infrequent properties.
Convert each instance for the Decision Table algorithm.
  weight(prop_1, inst), weight(prop_2, inst), ..., weight(prop_n, inst), class
PF-IIF (Property Frequency - Inverse Instance Frequency)

$$weight(prop, inst) = pf(prop, inst) \times iif(prop, D)$$

  $pf(prop, inst)$ = the frequency of $prop$ in $inst$.

$$iif(prop, D) = \log \frac{|D|}{|inst_{prop}|}$$

  $inst_{prop}$: an instance that contains the property $prop$.
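A minimal sketch of the PF-IIF weighting, assuming each instance is given as a list of property names (with repeats when a property occurs several times) and that the logarithm is the natural log; the slide does not state the base, and the function name is illustrative.

```python
import math
from collections import Counter


def pf_iif_weights(instances):
    """Return one {property: weight} dict per instance using PF-IIF."""
    n = len(instances)  # |D|: number of instances in the data set
    containing = Counter()  # property -> number of instances that contain it
    for props in instances:
        containing.update(set(props))

    weights = []
    for props in instances:
        pf = Counter(props)  # frequency of each property in this instance
        weights.append({p: pf[p] * math.log(n / containing[p]) for p in pf})
    return weights


# Toy data set of three instances (property names are illustrative).
instances = [
    ["rdfs:label", "foaf:name", "foaf:name"],
    ["rdfs:label"],
    ["rdfs:label", "db-onto:birthDate"],
]
weights = pf_iif_weights(instances)
```

As with TF-IDF, a property that appears in every instance (here rdfs:label) gets weight 0 because it cannot help distinguish classes, while a rare property that is repeated within an instance gets the highest weight.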
Apriori
Retrieve top-level classes and frequent core properties that are important for describing instances in a specific class.
Collect top-level classes.
Filter out infrequent properties.
Convert each instance of top-level class c for the Apriori algorithm.
  [prop_1, prop_2, ..., prop_n]
Define minimum support and confidence metric.
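The frequent-set part of Apriori can be sketched as below, treating each instance of a top-level class as a transaction of property names. This is a compact textbook version written for illustration; it is not the tool used in the experiments, it omits the confidence metric, and it treats a set as frequent when its support is at least `min_support`.

```python
from itertools import combinations


def frequent_property_sets(transactions, min_support):
    """Return {frozenset_of_properties: support} for every frequent property set."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of instances that contain every property in the set.
        return sum(itemset <= t for t in transactions) / n

    singles = {frozenset([p]) for t in transactions for p in t}
    current = {s: support(s) for s in singles}
    current = {s: v for s, v in current.items() if v >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unite frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        current = {c: support(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Toy instances of a person-like class (property names are illustrative).
person_instances = [
    ["foaf:givenName", "foaf:surname", "db-onto:birthDate"],
    ["foaf:givenName", "foaf:surname"],
    ["foaf:givenName", "foaf:surname", "db-onto:birthDate"],
    ["db-onto:birthDate"],
]
result = frequent_property_sets(person_instances, min_support=0.6)
```

With a minimum support of 0.6, the three single properties and the pair {foaf:givenName, foaf:surname} survive; pairs involving db-onto:birthDate have support 0.5 and are dropped.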
Ontology Merger
Graph-Based Ontology Integration outputs a Preliminary Integrated Ontology.
For the ontology classes and properties retrieved from the Machine-Learning-Based Approach:
  If Class $c \notin$ Preliminary Integrated Ontology, add <ex-onto:ClassTerm_new, ex-prop:hasMemberClasses, c>.
  For each Property prop retrieved from top-level class c using Apriori, add a triple <prop, rdfs:domain, c>.
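The two merger rules can be sketched over plain (subject, predicate, object) tuples. The representation, the function names, and the way a new class term is coined (`new_term`) are assumptions made for illustration; the slide only specifies which triples get added.

```python
def merge(preliminary, ml_classes, apriori_props, new_term):
    """Add machine-learning results to a preliminary integrated ontology.

    preliminary   -- set of (subject, predicate, object) triples
    ml_classes    -- classes retrieved by the machine-learning step
    apriori_props -- {top_level_class: [properties retrieved with Apriori]}
    new_term      -- hypothetical function that coins an ex-onto term for a class
    """
    merged = set(preliminary)
    known = {o for _, p, o in preliminary if p == "ex-prop:hasMemberClasses"}

    # Rule 1: a class missing from the preliminary ontology gets a new class term.
    for c in ml_classes:
        if c not in known:
            merged.add((new_term(c), "ex-prop:hasMemberClasses", c))

    # Rule 2: each Apriori property receives its top-level class as rdfs:domain.
    for c, props in apriori_props.items():
        for prop in props:
            merged.add((prop, "rdfs:domain", c))
    return merged


preliminary = {("ex-onto:Country", "ex-prop:hasMemberClasses", "db-onto:Country")}
merged = merge(
    preliminary,
    ml_classes=["db-onto:Country", "db:Event"],
    apriori_props={"db:Event": ["db-onto:place", "db-prop:date"]},
    new_term=lambda c: "ex-onto:" + c.split(":", 1)[1],
)
```

In this toy run db-onto:Country is already grouped, so only db:Event receives a new term, and both Apriori properties gain a domain triple.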
Experiments
Data Sets
Graph-Based Ontology Integration
Decision Table
Apriori
Comparison of Integrated Ontology
Case Studies
Data Sets
DBpedia (v3.6): cross-domain, 3.5 million things, 8.9 million URIs.
Geonames (v2.2.1): geographical domain, 7 million URIs.
NYTimes: media domain, 10,467 subject news.
LinkedMDB: media domain, 0.5 million entities.
Data Sets - Machine Learning
Data Set  | Instances | Selected Instances | Class | Top-level Class | Property | Selected Property
DBpedia   | 3,708,696 | 64,460             | 241   | 28              | 1385     | 840
Geonames  | 7,480,462 | 45,000             | 428   | 9               | 31       | 21
NYTimes   | 10,441    | 10,441             | 5     | 4               | 8        | 7
LinkedMDB | 694,400   | 50,000             | 53    | 10              | 107      | 60

Selected Instances
  Randomly select instances per class: DBpedia (5000), Geonames (3000), NYTimes (All), LinkedMDB (3000)
Top-level Classes
  Ontology-based data set: Use subsumption relations.
  Without ontology: Use categories.
Selected Properties
  With frequency threshold $\theta$ as $\sqrt{n}$, where $n$ is the total number of instances in the data set.
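The property filter can be sketched as follows. Two details are assumptions, since the slide gives only the threshold: frequency is counted as the number of instances that use a property, and a property is kept when that count is at least the square root of the number of instances.

```python
import math
from collections import Counter


def select_frequent_properties(instances):
    """Keep properties used by at least sqrt(n) of the n instances."""
    n = len(instances)
    theta = math.sqrt(n)
    # Count each property once per instance that uses it.
    counts = Counter(p for props in instances for p in set(props))
    return {p for p, c in counts.items() if c >= theta}


# 16 toy instances, so the threshold is 4 (property names are illustrative).
toy = [["rdfs:label"] for _ in range(16)]
for inst in toy[:3]:
    inst.append("ex:rareProp")    # used by 3 instances: filtered out
for inst in toy[:4]:
    inst.append("ex:commonProp")  # used by 4 instances: kept
selected = select_frequent_properties(toy)
```

A square-root threshold grows slowly with data-set size: for the 3,708,696 DBpedia instances it is about 1,926.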
Graph-Based Ontology Integration
13 graph patterns
  Frequent graph patterns: GP1, GP2, GP3
  N, G, D: GP4, GP5, GP7, GP8
  N, M, D: GP6
  M, G, D: GP9
  M, D, N, G: GP10, GP11, GP12, GP13
97 classes into 48 groups.
357 properties into 38 groups.
Retrieved related classes and properties by analyzing graph patterns. [Zhao, I-Semantics2012]
Evaluation of Machine Learning Approaches
Evaluate the Decision Table and Apriori algorithms.
Evaluation of Decision Table
  Evaluate whether the retrieved sets of properties are important for describing instances by testing whether they can be used to distinguish different types of instances in the data set.
Evaluation of Apriori
  Analyze the performance of the Apriori algorithm in each data set with examples of retrieved sets of properties.
Decision Table
Data Set  | Precision | Recall | F-Measure | Selected Properties
DBpedia   | 0.892     | 0.821  | 0.837     | 53
Geonames  | 0.472     | 0.4    | 0.324     | 10
NYTimes   | 0.795     | 0.792  | 0.785     | 5
LinkedMDB | 1         | 1      | 1         | 11

Core properties are evaluated by predicting the classes of instances (10-fold cross-validation).
The 11 properties from LinkedMDB correctly identify the class of an instance.
DBpedia and NYTimes perform well with the selected properties.
The 10 properties from Geonames are commonly used across all types of classes.
Examples of retrieved core properties:
  DBpedia: db-prop:city, db-prop:debut, db-onto:formationYear, etc.
  Geonames: geo-onto:alternateName, geo-onto:countryCode, etc.
  NYTimes: nyt:latest_use, nyt:topicPage, wgs84_pos:long, etc.
  LinkedMDB: mdb:director_directorid, mdb:writer_writerid, etc.
Retrieved top-level classes and core properties in each data set.
Apriori
Examples of retrieved core properties with Apriori Algorithm.
Data Set  | Class        | Properties
DBpedia   | db:Event     | db-onto:place, db-prop:date, db-onto:related/geo.
          | db:Species   | db-onto:kingdom, db-onto:class, db-onto:family.
          | db:Person    | foaf:givenName, foaf:surname, db-onto:birthDate.
Geonames  | geo:P        | geo-onto:alternateName, geo-onto:countryCode.
          | geo:R        | wgs84_pos:alt, geo-onto:name, geo-onto:countryCode.
NYTimes   | nyt:nytd_geo | wgs84_pos:long.
          | nyt:nytd_des | skos:scopeNote.
LinkedMDB | mdb:actor    | mdb:performance, mdb:actor_name, mdb:actor_netflix_id.
          | mdb:film     | mdb:director, mdb:performance, mdb:actor, dc:date.

DBpedia and LinkedMDB: Retrieved unique properties.
Geonames and NYTimes: Retrieved commonly used properties only.
Automatically added missing domain information: <prop, rdfs:domain, class_top>.
Retrieved frequent core properties in each top-level class.
Comparison of Integrated Ontology
         | Previous Work           | Machine Learning | Machine Learning | Current Work
         | Graph-Based Integration | Decision Table   | Apriori          | Integrated Ontology
Class    | 97                      | 50 (38 new)      | 50 (38 new)      | 135 (38 new)
Property | 357                     | 79 (49 new)      | 119 (80 new)     | 453 (96 new)
Previous Work: 97 classes in 49 groups, 357 properties in 38 groups.
Current Work: 135 classes in 87 groups, 453 properties in 97 groups.
Apriori retrieves more properties than Decision Table.
33 new properties are found with both Apriori and Decision Table.
Case Studies I
Find Missing Links of Islands with Integrated Ontology
SELECT DISTINCT ?geo ?db ?string
WHERE {
  ?geo geo-onto:featureCode geo-onto:T.ISL .
  ?geo ?gname ?string .
  ex-onto:name ex-prop:hasMemberDataTypes ?gname .
  ?db rdf:type db-onto:Island .
  ex-onto:name ex-prop:hasMemberDataTypes ?dname .
  ?db ?dname ?string .
}
Retrieved 509 links, including 218 existing SameAs links (97 + 211 - 90 = 218):
  97 existing links from DBpedia to Geonames.
  211 links from Geonames to DBpedia.
  90 bidirectional links between DBpedia and Geonames.
Discovered 291 missing links (509 - 218) with the integrated ontology using exact matching on the labels of instances.
Case Studies II
Predicates grouped in ex-prop:birthDate
Property            | Number of Instances | rdfs:domain
db-onto:birthDate   | 287,327             | db-onto:Person
db-prop:datebirth   | 1,675               | N/A
db-prop:dateofbirth | 87,364              | N/A
db-prop:dateOfBirth | 163,876             | N/A
db-prop:born        | 34,832              | N/A
db-prop:birthdate   | 70,630              | N/A
db-prop:birthDate   | 101,121             | N/A

Suggest “db-onto:birthDate” as the standard property because it
  has an rdfs:domain definition, and
  has the highest usage in the DBpedia instances.
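The selection rule can be written as a small ranking. The ordering used here (prefer a property with a defined rdfs:domain, then break ties by usage) is an assumption: the slide lists both reasons without saying which takes priority, and in this example they agree.

```python
def suggest_standard_property(group):
    """Pick a standard property from (property, instance_count, domain) rows.

    A defined rdfs:domain ranks first, then higher usage (assumed priority).
    """
    best = max(group, key=lambda row: (row[2] is not None, row[1]))
    return best[0]


# The birth-date group from the table; None stands for "N/A".
birth_date_group = [
    ("db-onto:birthDate", 287327, "db-onto:Person"),
    ("db-prop:datebirth", 1675, None),
    ("db-prop:dateofbirth", 87364, None),
    ("db-prop:dateOfBirth", 163876, None),
    ("db-prop:born", 34832, None),
    ("db-prop:birthdate", 70630, None),
    ("db-prop:birthDate", 101121, None),
]
suggested = suggest_standard_property(birth_date_group)
```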
Case Studies III
Give me all the cities with more than 10,000,000 inhabitants.
Standard Query
SELECT DISTINCT ?uri ?string
WHERE {
  ?uri rdf:type db-onto:City .
  ?uri db-prop:populationTotal ?inhabitants .
  FILTER (?inhabitants > 10000000) .
  OPTIONAL { ?uri rdfs:label ?string .
             FILTER (lang(?string) = 'en') }
}

Query with the Integrated Ontology
SELECT DISTINCT ?uri ?string
WHERE {
  ?uri rdf:type db-onto:City .
  ex-onto:population ex-prop:hasMemberDataTypes ?prop .
  ?uri ?prop ?inhabitants .
  FILTER (?inhabitants > 10000000) .
  OPTIONAL { ?uri rdfs:label ?string .
             FILTER (lang(?string) = 'en') }
}
A SPARQL example from QALD-1 Open Challenge.
Standard query: 9 cities.
Query with the integrated ontology: 20 cities.
Helps QA systems find more related answers with simple queries.
Conclusion and Future Work
Conclusion
Semi-automatic ontology integration framework
  Graph-Based Ontology Integration
    Ontology similarity matching on SameAs graph patterns.
    Retrieve related ontology classes and properties.
  Machine-Learning-Based Ontology Schema Extraction
    Decision Table and Apriori.
    Extract top-level classes and core properties.
  Ontology Merger
Find missing links, detect misuses of ontologies, and access various data sets with the integrated ontology.
Future Work
Automatically detect and revise mistakes in ontology merger.
Automatically detect ranges and domains of properties.
Test our framework with more LOD data sets.
Thank you! Questions?
Lihua Zhao, [email protected]
Ryutaro Ichise, [email protected]