ontology languages - pierre senellart · 2012. 9. 7. · 3 / 43 télécom paristech pierre...
TRANSCRIPT
7 September 2012, Télécom ParisTech
Ontology Languages
Pierre Senellart
2 / 43 Télécom ParisTech Pierre Senellart
Outline
IntroductionThe Semantic WebOntologies and ReasoningComponents of an Ontology
3 Ontology Languages for the WebRDFRDFSOWL
Querying Data through Ontologies
Reference Material
3 / 43 Télécom ParisTech Pierre Senellart
Outline
IntroductionThe Semantic WebOntologies and ReasoningComponents of an Ontology
3 Ontology Languages for the Web
Querying Data through Ontologies
Reference Material
4 / 43 Télécom ParisTech Pierre Senellart
Outline
IntroductionThe Semantic WebOntologies and ReasoningComponents of an Ontology
3 Ontology Languages for the Web
Querying Data through Ontologies
Reference Material
5 / 43 Télécom ParisTech Pierre Senellart
The Semantic Web
A Web in which the resources are semantically describedannotations give information about a page, explain an expression ina page, etc.
More precisely, a resource is anything that can be referred to by aURI
a web page, identified by a URLa fragment of an XML document, identified by an element node ofthe document,a web service,a thing, an object, a concept, a property, etc.
Semantic annotations: logical assertions that relate resources tosome terms in associated ontologies
6 / 43 Télécom ParisTech Pierre Senellart
Outline
IntroductionThe Semantic WebOntologies and ReasoningComponents of an Ontology
3 Ontology Languages for the Web
Querying Data through Ontologies
Reference Material
7 / 43 Télécom ParisTech Pierre Senellart
Ontologies
Formal descriptions providing human users a sharedunderstanding of a given domain
A controlled vocabulary
Formally defined so that it can also be processed by machines
Logical semantics that enables reasoning
Reasoning is the key for different important tasks of Web datamanagement, in particular:
to answer queries (over possibly distributed data)to relate objects in different data sources enabling their integrationto detect inconsistencies or redundanciesto refine queries with too many answers, or to relax queries with noanswer
8 / 43 Télécom ParisTech Pierre Senellart
Where Do Ontologies Come From?
Manually crafted to represent the knowledge of a specific domain(e.g., life sciences)
Exported from classical Web databases
Through information extraction from the Web, Wikipedia, etc.(e.g., DBpedia, YAGO)
Private to a company or public
Some ontologies focus on instances, others on a schema (seefurther)
Value of the Semantic Web: bits of ontologies can be re-used inanother, and ontologies can be mapped through an owl:sameAslink
As of September 2011
MusicBrainz
(zitgist)
P20
Turismo de
Zaragoza
yovisto
Yahoo! Geo
Planet
YAGO
World Fact-book
El ViajeroTourism
WordNet (W3C)
WordNet (VUA)
VIVO UF
VIVO Indiana
VIVO Cornell
VIAF
URIBurner
Sussex Reading
Lists
Plymouth Reading
Lists
UniRef
UniProt
UMBEL
UK Post-codes
legislationdata.gov.uk
Uberblic
UB Mann-heim
TWC LOGD
Twarql
transportdata.gov.
uk
Traffic Scotland
theses.fr
Thesau-rus W
totl.net
Tele-graphis
TCMGeneDIT
TaxonConcept
Open Library (Talis)
tags2con delicious
t4gminfo
Swedish Open
Cultural Heritage
Surge Radio
Sudoc
STW
RAMEAU SH
statisticsdata.gov.
uk
St. Andrews Resource
Lists
ECS South-ampton EPrints
SSW Thesaur
us
SmartLink
Slideshare2RDF
semanticweb.org
SemanticTweet
Semantic XBRL
SWDog Food
Source Code Ecosystem Linked Data
US SEC (rdfabout)
Sears
Scotland Geo-
graphy
ScotlandPupils &Exams
Scholaro-meter
WordNet (RKB
Explorer)
Wiki
UN/LOCODE
Ulm
ECS (RKB
Explorer)
Roma
RISKS
RESEX
RAE2001
Pisa
OS
OAI
NSF
New-castle
LAASKISTI
JISC
IRIT
IEEE
IBM
Eurécom
ERA
ePrints dotAC
DEPLOY
DBLP (RKB
Explorer)
Crime Reports
UK
Course-ware
CORDIS (RKB
Explorer)CiteSeer
Budapest
ACM
riese
Revyu
researchdata.gov.
ukRen. Energy Genera-
tors
referencedata.gov.
uk
Recht-spraak.
nl
RDFohloh
Last.FM (rdfize)
RDF Book
Mashup
Rådata nå!
PSH
Product Types
Ontology
ProductDB
PBAC
Poké-pédia
patentsdata.go
v.uk
OxPoints
Ord-nance Survey
Openly Local
Open Library
OpenCyc
Open Corpo-rates
OpenCalais
OpenEI
Open Election
Data Project
OpenData
Thesau-rus
Ontos News Portal
OGOLOD
JanusAMP
Ocean Drilling Codices
New York
Times
NVD
ntnusc
NTU Resource
Lists
Norwe-gian
MeSH
NDL subjects
ndlna
myExperi-ment
Italian Museums
medu-cator
MARC Codes List
Man-chester Reading
Lists
Lotico
Weather Stations
London Gazette
LOIUS
Linked Open Colors
lobidResources
lobidOrgani-sations
LEM
LinkedMDB
LinkedLCCN
LinkedGeoData
LinkedCT
LinkedUser
FeedbackLOV
Linked Open
Numbers
LODE
Eurostat (OntologyCentral)
Linked EDGAR
(OntologyCentral)
Linked Crunch-
base
lingvoj
Lichfield Spen-ding
LIBRIS
Lexvo
LCSH
DBLP (L3S)
Linked Sensor Data (Kno.e.sis)
Klapp-stuhl-club
Good-win
Family
National Radio-activity
JP
Jamendo (DBtune)
Italian public
schools
ISTAT Immi-gration
iServe
IdRef Sudoc
NSZL Catalog
Hellenic PD
Hellenic FBD
PiedmontAccomo-dations
GovTrack
GovWILD
GoogleArt
wrapper
gnoss
GESIS
GeoWordNet
GeoSpecies
GeoNames
GeoLinkedData
GEMET
GTAA
STITCH
SIDER
Project Guten-berg
MediCare
Euro-stat
(FUB)
EURES
DrugBank
Disea-some
DBLP (FU
Berlin)
DailyMed
CORDIS(FUB)
Freebase
flickr wrappr
Fishes of Texas
Finnish Munici-palities
ChEMBL
FanHubz
EventMedia
EUTC Produc-
tions
Eurostat
Europeana
EUNIS
EU Insti-
tutions
ESD stan-dards
EARTh
Enipedia
Popula-tion (En-AKTing)
NHS(En-
AKTing) Mortality(En-
AKTing)
Energy (En-
AKTing)
Crime(En-
AKTing)
CO2 Emission
(En-AKTing)
EEA
SISVU
education.data.g
ov.uk
ECS South-ampton
ECCO-TCP
GND
Didactalia
DDC Deutsche Bio-
graphie
datadcs
MusicBrainz
(DBTune)
Magna-tune
John Peel
(DBTune)
Classical (DB
Tune)
AudioScrobbler (DBTune)
Last.FM artists
(DBTune)
DBTropes
Portu-guese
DBpedia
dbpedia lite
Greek DBpedia
DBpedia
data-open-ac-uk
SMCJournals
Pokedex
Airports
NASA (Data Incu-bator)
MusicBrainz(Data
Incubator)
Moseley Folk
Metoffice Weather Forecasts
Discogs (Data
Incubator)
Climbing
data.gov.uk intervals
Data Gov.ie
databnf.fr
Cornetto
reegle
Chronic-ling
America
Chem2Bio2RDF
Calames
businessdata.gov.
uk
Bricklink
Brazilian Poli-
ticians
BNB
UniSTS
UniPathway
UniParc
Taxonomy
UniProt(Bio2RDF)
SGD
Reactome
PubMedPub
Chem
PRO-SITE
ProDom
Pfam
PDB
OMIMMGI
KEGG Reaction
KEGG Pathway
KEGG Glycan
KEGG Enzyme
KEGG Drug
KEGG Com-pound
InterPro
HomoloGene
HGNC
Gene Ontology
GeneID
Affy-metrix
bible ontology
BibBase
FTS
BBC Wildlife Finder
BBC Program
mes BBC Music
Alpine Ski
Austria
LOCAH
Amster-dam
Museum
AGROVOC
AEMET
US Census (rdfabout)
Media
Geographic
Publications
Government
Cross-domain
Life sciences
User-generated content
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
10 / 43 Télécom ParisTech Pierre Senellart
Outline
IntroductionThe Semantic WebOntologies and ReasoningComponents of an Ontology
3 Ontology Languages for the Web
Querying Data through Ontologies
Reference Material
11 / 43 Télécom ParisTech Pierre Senellart
Classes and class hierarchy
Backbone of the ontology
AcademicStaff is a Class (A class will be interpreted as a set ofobjects)
AcademicStaff isa Staff (isa is interpreted as set inclusion)FacultyComponent
Course
MathCourse
ProbabilitiesAlgebra
LogicCSCourse
DBAIJava
Student
UndergraduateStudentMasterStudentPhDStudent
Department
PhysicsDeptMathsDeptCSDept
Staff
AcademicStaff
LecturerResearcherProfessor
AdministrativeStaff
12 / 43 Télécom ParisTech Pierre Senellart
Relations
Declaration of relations with their signature
(Relations will be interpreted as binary relations between objects)TeachesIn(AcademicStaff, Course)
if one states that “X TeachesIn Y ”, then X belongs toAcademicStaff and Y to Course
TeachesTo(AcademicStaff, Student)
Leads(Staff, Department)
13 / 43 Télécom ParisTech Pierre Senellart
Instances
Classes have instances
Dupond is an instance of the class Professor
corresponds to the fact: Professor(Dupond)
Relations also have instances
(Dupond,CS101) is an instance of the relation TeachesIn
corresponds to the fact: TeachesIn(Dupond,CS101)
The instance statements can be seen as (and stored in) a database
14 / 43 Télécom ParisTech Pierre Senellart
Ontology = schema + instance
Schema (TBox)
The set of class and relation namesThe signatures of relations and also constraintsThe constraints are used for two purposes
– checking data consistency (like dependencies in databases)– inferring new facts
Instance (ABox)
The set of factsThe set of base facts together with the inferred facts should satisfythe constraints
Ontology (i.e., Knowledge Base) = Schema + Instance
15 / 43 Télécom ParisTech Pierre Senellart
Outline
Introduction
3 Ontology Languages for the WebRDFRDFSOWL
Querying Data through Ontologies
Reference Material
16 / 43 Télécom ParisTech Pierre Senellart
3 ontology languages for the Web
RDF: a very simple ontology language
RDFS: Schema for RDF
Can be used to define richer ontologies
OWL: a much richer ontology language
We next present them rapidly
17 / 43 Télécom ParisTech Pierre Senellart
Outline
Introduction
3 Ontology Languages for the WebRDFRDFSOWL
Querying Data through Ontologies
Reference Material
18 / 43 Télécom ParisTech Pierre Senellart
Namespaces
An URI most often takes the form of a URL:http://live.dbpedia.org/page/%C3%89lectricit%C3%A9_de_Francehttp://sw.opencyc.org/2012/05/10/concept/en/GameOfChancehttp://www.w3.org/People/Berners-Lee/card#ihttp://www4.wiwiss.fu-berlin.de/dblp/resource/record/conf/www/2006
Some of these URIs may be actual URLs (point to a browsableresource), some may just be abstractFor simplicity, possibility of defining namespace prefixes, e.g.,dbpedia = http://live.dbpedia.org/page/ along with adefault prefix (e.g., mapped tohttp://sw.opencyc.org/2012/05/10/concept/en/):
dbpedia:%C3%89lectricit%C3%A9_de_France:GameOfChance
19 / 43 Télécom ParisTech Pierre Senellart
RDF: Resource Description Framework
RDF facts are triplets
Each triplet is of the form: h Subject Predicate Object i
The subject is a URI, referencing an entity
The predicate is a URI, referencing a relation
The object is either a URI, referencing an entity, or a literalh :Dupond :Leads :CSDept ih :Dupond :HasName “Paul Dupond” ih :Dupond :TeachesIn :UE111 ih :Dupond :TeachesTo :Pierre ih :Pierre :EnrolledIn :CSDept ih :Pierre :RegisteredTo :UE111 ih :UE111 :OfferedBy :CSDept i
The linked data cloud contains dozens of billions of RDF triples
20 / 43 Télécom ParisTech Pierre Senellart
RDF graph
A set of RDF facts definesa set of relations between objectsan RDF graph where the nodes are objects:
:TeachesIn
:Dupond :Pierre
:InfoDept
UE111
:TeachesTo
:EnrolledIn
:RegisteredTo
:OfferedBy
:Leads
21 / 43 Télécom ParisTech Pierre Senellart
RDF semantics
A triplet hs P oi is interpreted in first-order logic (FOL) as a factP(s ; o)
Example:Leads(Dupond, CSDept)TeachesIn(Dupond, UE111)TeachesTo(Dupond,Pierre)EnrolledIn(Pierre, CSDept)RegisteredTo(Pierre, UE111)OfferedBy(UE111, CSDept)
22 / 43 Télécom ParisTech Pierre Senellart
RDF Serialization
Several serialization formats for RDF data, see alternate formats ofhttp://live.dbpedia.org/page/%C3%89lectricit%C3%A9_de_France:
RDF/XML, structured XML representation, allowing for nesting
N3 or Turtle, more compact, readable syntax, with some syntaxicsugar (Turtle is a subset of N3)
N-Triples, a simple text-based format
RDFa to integrate RDF annotations into HTML
23 / 43 Télécom ParisTech Pierre Senellart
Outline
Introduction
3 Ontology Languages for the WebRDFRDFSOWL
Querying Data through Ontologies
Reference Material
24 / 43 Télécom ParisTech Pierre Senellart
RDFS: RDF Schema
Not detailed here: the schema in RDF is super simplistic
An RDF Schema defines the schema of a richer ontology
25 / 43 Télécom ParisTech Pierre Senellart
RDF Schema
Do net get confused: RDFS can use RDF as syntax, i.e., RDFSstatements can be themselves expressed as RDF triplets usingsome specific properties and objects used as RDFS keywords witha particular meaning.
Declaration of classes and subclass relationshipsh Staff rdf:type rdfs:Class i
h Java rdfs:subClassOf CSCourse i
Declaration of instances (beyond the pure schema)h Dupond rdf:type AcademicStaff i
26 / 43 Télécom ParisTech Pierre Senellart
RDF Schema - continued
Declaration of relations (properties in RDFS terminology)
h RegisteredTo rdf:type rdfs:Property i
Declaration of subproperty relationships
h LateRegisteredTo rdfs:subPropertyOf RegisteredTo i
Declaration of domain and range restrictions for predicatesh TeachesIn rdfs:domain AcademicStaff i
h TeachesIn rdfs:range Course i
TeachesIn(AcademicStaff, Course)
27 / 43 Télécom ParisTech Pierre Senellart
RDFS logical semantics
RDF and RDFS statements FOL translation DL notationh i rdf:type C i C (i) i : C or C (i)h i P j i P(i ; j ) i P j or P(i ; j )h C rdfs:subClassOf D i 8X (C (X )) D(X )) C v Dh P rdfs:subPropertyOf R i 8X 8Y (P(X ;Y )) R(X ;Y )) P v Rh P rdfs:domain C i 8X 8Y (P(X ;Y )) C (X )) 9P v Ch P rdfs:range D i 8X 8Y (P(X ;Y )) D(Y )) 9P� v D
DL: Description logics, a specialized logical formalism
28 / 43 Télécom ParisTech Pierre Senellart
Outline
Introduction
3 Ontology Languages for the WebRDFRDFSOWL
Querying Data through Ontologies
Reference Material
29 / 43 Télécom ParisTech Pierre Senellart
OWL: Web Ontology Language
OWL extends RDFS with the possibility to express additionalconstraints
Main OWL constructs
Disjointness between classesConstraints of functionality and symmetry on predicatesIntentional class definitionsClass union and intersection
Corresponds to description logics
30 / 43 Télécom ParisTech Pierre Senellart
OWL constructs
Disjointness between classes:
OWL notation FOL translation DL notationh C owl:disjointWith D i 8X (C (X )) :D(X )) C v :D
Constraints of functionality and symmetry on predicates:OWL notation FOL translation DL notationh P rdf:typeowl:FunctionalProperty i
8X 8Y 8Z (funct P)
(P(X ;Y ) ^ P(X ;Z ) ) Y =
Z )
or 9P v (� 1P)
h P rdf:type 8X 8Y 8Z (funct P�)owl:InverseFunctionalProperty i (P(X ;Y ) ^ P(Z ;Y ) ) X =
Z )
or 9P� v (�
1P�)h P owl:inverseOf Q i 8X 8Y (P(X ;Y ) ,
Q(Y ;X ))
P � Q�
h P rdf:type owl:SymmetricPropertyi
8X 8Y (P(X ;Y ) )
P(Y ;X ))
P v P�
31 / 43 Télécom ParisTech Pierre Senellart
Definition of intentional classes in OWL
Goal: allow expressing complex constraints such as:departments can be lead only by professorsonly professors or lecturers may teach to undergraduate students.
The keyword owl:Restriction is used in association with a blanknode class, and some specific restriction properties:
owl:someValuesFromowl:allValuesFromowl:minCardinalityowl:maxCardinality
32 / 43 Télécom ParisTech Pierre Senellart
OWL Semantics
OWL notation FOL translation DL notation_a owl:onProperty P_a owl:allValuesFrom C 8Y (P(X ;Y )) C (Y )) 8P :C_a owl:onProperty P_a owl:someValuesFrom C 9Y (P(X ;Y ) ^C (Y )) 9P :C_a owl:onProperty P_a owl:minCardinality n 9Y1 : : : 9Yn(P(X ;Y1) ^ : : : ^
P(X ;Yn)^Vi ;j2[1::n ];i 6=j (Yi 6= Yj ))
(� n P)
_a owl:maxCardinality n 8Y1 : : : 8Yn8Yn+1
(P(X ;Y1) ^ : : : ^ P(X ;Yn) ^
P(X ;Yn+1)
(� n P)
)Wi ;j2[1::n+1];i 6=j (Yi = Yj ))
33 / 43 Télécom ParisTech Pierre Senellart
Unnamed new classes by example
Departments can be lead only by professors
Define the set of objects that are lead by professors_a rdfs:subClassOf owl:Restriction_a owl:onProperty Leads_a owl:allValuesFrom Professor
Now specify that all departments are lead by professorsDepartment rdfs:subClassOf _a
34 / 43 Télécom ParisTech Pierre Senellart
Union and Intersection of Classes byexample
Only professors or lecturers may teach to undergraduate students
_a rdfs:subClassOf owl:Restriction_a owl:onProperty TeachesTo_a owl:someValuesFrom Undergrad_b owl:unionOf (Professor, Lecturer)_a rdfs:subClassOf _b
This corresponds to an inclusion axiom in Description Logic:
9 TeachesTo.UndergraduateStudent v Professor t Lecturer
owl:equivalentClass corresponds to double inclusion:
MathStudent � Student u 9 RegisteredTo.MathCourse
35 / 43 Télécom ParisTech Pierre Senellart
Outline
Introduction
3 Ontology Languages for the Web
Querying Data through Ontologies
Reference Material
36 / 43 Télécom ParisTech Pierre Senellart
Querying using RDFS
RDFS statements can be used to infer new triples
Example
Base fact ResponsibleOf (durand ;ue111 )
Use the statement hResponsibleOf rdfs:domain Professorii.e., the logical rule: ResponsibleOf (X ;Y ) ) Professor(X )
With substitution {X/durand, Y/ue111}Infer fact Professor(durand)
Use the statement hProfessor rdfs:subClassOf AcademicStaff ii.e., the rule Professor(X ) ) AcademicStaff (X )
With substitution {X/durand}Infer fact AcademicStaff (durand)
etc.
37 / 43 Télécom ParisTech Pierre Senellart
The saturation algorithm
Keep infering new facts until a fixpoint is reached
Note: Only polynomially many facts can be added
ptime
38 / 43 Télécom ParisTech Pierre Senellart
Querying using DL
RDFS simpler and very used but limited
Query answering is unfeasible for all of OWL (even OWL-DL). . .but restrictions exist that have more reasonable complexity.
39 / 43 Télécom ParisTech Pierre Senellart
SPARQL: query language for RDF data
Graph pattern language
Get all URIs, name, emails of persons with all three pieces ofinformation from a FOAF dataset:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>SELECT *WHERE {
?person foaf:name ?name .?person foaf:mbox ?emails .
}
The SPARQL processor may or may not use reasoning w.r.t. theschema
40 / 43 Télécom ParisTech Pierre Senellart
Outline
Introduction
3 Ontology Languages for the Web
Querying Data through Ontologies
Reference Material
41 / 43 Télécom ParisTech Pierre Senellart
Where can Semantic Content be Found?
In the linked data, through Web-available RDF data:dumps of an entire ontology, in one of the RDF serializationformats (RDF/XML, Turtle, N-Triples)crawlable RDF content, with small fragments pointing to otherfragmentsa SPARQL endpointHTML annotated with RDFa,cf. http://www.w3.org/TR/rdfa-syntax/
Other popular semantic content embedded in Web pages:microformats (hCard, vCard, etc.), microdata(cf. http://www.schemas.org/). Not directly the spirit of theSemantic Web, but heavily used.
RDF content used internally in a company
42 / 43 Télécom ParisTech Pierre Senellart
Systems
RDF stores (triplestores) with relational or native backend,open-source or commercial, seehttp://en.wikipedia.org/wiki/Triplestore. Apache Jena is apopular open-source Java store.
SPARQL engines, usually on top of a triplestore.http://en.wikipedia.org/wiki/SPARQL
Semantic browsers, to navigate RDF content,http://en.wikipedia.org/wiki/Linked_data#Browsers
A semantic Web search engine, http://sig.ma/
43 / 43 Télécom ParisTech Pierre Senellart
To go further. . .
The specifications on http://www.w3.org/
A gentle introduction to the Semantic Web:A Semantic Web Primer, G. Antoniou, F. van Harmelen, MITPress, 2008
A practical approach:Semantic Web for the Working Ontologist: Modeling in RDF,RDFS and OWL, Dean Allemang, James A. Hendler,Morgan-Kaufman, 2008
Contents of this lecture from:Web data management, S. Abiteboul, I. Manolescu, P. Rigaux,M.-C. Rousset, P. Senellart, Cambridge University Press, 2012.Also available at http://webdam.inria.fr/Jorge/