

Page 1: Three Theses of Representation in the Semantic Web Ian Horrocks University of Manchester Manchester, UK horrocks@cs.man.ac.uk Peter F. Patel-Schneider

Three Theses of Representation in the Semantic Web

Ian Horrocks

University of Manchester

Manchester, UK

horrocks@cs.man.ac.uk

Peter F. Patel-Schneider

Bell Labs Research

Murray Hill, NJ, USA

[email protected]

Page 2: Semantic Web Languages

• SemWeb aims to make content accessible to automated processes
  – Add semantic markup (meta-data) describing content/function of resources
• Need a common way of providing meta-data so that:
  – It can be understood and manipulated by automated processes (“agents”)
  – Agents can integrate meta-data from different sources
• Proposed solution is the famous language “layer cake” [figure not preserved in transcript]

Page 3: Language Architecture

• Relationship between adjacent layers not clear
  – XML $\leftrightarrow$ RDF relationship purely syntactic
  – RDF $\leftrightarrow$ Ontology layer relationship should be something more?
• RDF is proposed as base for SemWeb languages
  – Used to add metadata annotations to resources
  – Also used to define syntax and semantics of subsequent layers
• Not clear that RDF is appropriate for all these functions
  – Limited set of syntax constructs (triples)
  – Not possible to extend syntax (as it is, e.g., when using XML)
  – Uniform semantic treatment of triple syntax
  – Non-standard KR thesis and model theory
• Using a more standard KR thesis may facilitate development of SemWeb…

Page 4: Ontology Language Layer

• Ontologies set to play key role in SemWeb
  – Source of shared and precisely defined terms for use in meta-data
• RDF already extended to RDFS
  – Hierarchies of classes and properties
  – Domain and range constraints on properties
• More expressive ontology languages clearly required
  – With logical connectives, quantifiers, transitive properties, etc.
  – E.g., OIL, DAML+OIL, and now OWL
• Possible choices for language layering:
  – Base ontology language layer(s) on RDF(S)
  – Base ontology language layer(s) on “classical” FOL
  – Base ontology language layer(s) on SKIF/Lbase/CL languages
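As a rough illustration of what the RDFS extensions listed above add to plain RDF, the sketch below applies a few RDFS-style rules (subclass transitivity, type inheritance, domain and range) to a small set of triples. The `ex:` vocabulary and the `rdfs_closure` helper are invented for this example, and only a handful of RDFS rules are covered.

```python
# Minimal sketch: a few RDFS-style rules over (subject, predicate, object) triples.
# All ex:* names are hypothetical; this is not a complete RDFS reasoner.

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def rdfs_closure(triples):
    """Apply the rules repeatedly until no new triples are produced."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                if p == SUBCLASS and p2 == SUBCLASS and s2 == o:
                    # Subclass transitivity: s sub o, o sub o2 gives s sub o2.
                    new.add((s, SUBCLASS, o2))
                elif p == SUBCLASS and p2 == TYPE and o2 == s:
                    # Type inheritance: s2 type s, s sub o gives s2 type o.
                    new.add((s2, TYPE, o))
                elif p == DOMAIN and p2 == s:
                    # Domain: s domain o, s2 s o2 gives s2 type o.
                    new.add((s2, TYPE, o))
                elif p == RANGE and p2 == s:
                    # Range: s range o, s2 s o2 gives o2 type o.
                    new.add((o2, TYPE, o))
        if new <= closed:
            return closed
        closed |= new


triples = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:teaches", DOMAIN, "ex:Teacher"),
    ("ex:teaches", RANGE, "ex:Student"),
    ("ex:alice", "ex:teaches", "ex:bob"),
}
inferred = rdfs_closure(triples)
```

Here `inferred` contains, among others, the derived facts that `ex:bob` is an `ex:Agent` and `ex:alice` is an `ex:Teacher`, neither of which was stated explicitly.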

Page 5: Semantics and Model Theories

• Ontology/KR languages aim to model (part of) world
• Constructs in language correspond to entities in world
• Meaning given by mapping to some formal system
  – E.g., a logic such as FOL with its own well-defined semantics
  – or a data model such as XQuery data model for XML
  – or (for more expressive languages) a Model Theory (MT)
• MT defines relationship between syntax and interpretations
  – Can be many interpretations (models) of one piece of syntax
  – Models supposed to be analogue of (part of) world
    • E.g., elements of model correspond to objects in world
  – Formal relationship between syntax and models
    • Structure of models must reflect relationships specified in syntax
  – Inference (e.g., entailment) defined in terms of MT
    • E.g., $A \models B$ iff every model of $A$ is also a model of $B$
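The model-theoretic definition of entailment can be made concrete for the simplest layer, propositional logic, where interpretations can be enumerated exhaustively. This is a minimal sketch under the assumption that formulas are represented as Python predicates over an interpretation; the helper names `models` and `entails` are invented for illustration.

```python
from itertools import product


def models(formula, atoms):
    """Yield every interpretation (dict atom -> bool) that satisfies formula."""
    for values in product([False, True], repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if formula(interpretation):
            yield interpretation


def entails(a, b, atoms):
    """A entails B iff every model of A is also a model of B."""
    return all(b(m) for m in models(a, atoms))


atoms = ["p", "q"]
a = lambda i: i["p"] and ((not i["p"]) or i["q"])  # p and (p -> q)
b = lambda i: i["q"]                               # q

forward = entails(a, b, atoms)   # modus ponens: A entails B
backward = entails(b, a, atoms)  # the converse fails: q alone does not give p
```

Exhaustive enumeration only works because the set of interpretations is finite here; for FOL and richer languages, entailment has to be established by proof procedures rather than by listing models.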

Page 6: FOL Thesis

• Base SW languages on established FO hierarchy
  – Propositional logic
  – Decidable FOL subsets (e.g., DL, Horn)
  – Undecidable FOL subsets
  – Full FOL (and even HOL)
• Higher layers extend syntax
  – Upwards compatibility, i.e., syntax retains same meaning in higher layers
• Semantics via FOL mapping or standard FO model theory
  – Individual $i \rightarrow$ element of domain ($i^I \in D$)
  – Class $C \rightarrow$ set of elements ($C^I \subseteq D$)
  – Property $P \rightarrow$ binary relation on $D$ ($P^I \subseteq D \times D$)
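The three mappings of the standard FO model theory can be sketched as a small finite interpretation. All names below (`alice`, `Student`, `knows`, the element labels) are hypothetical, and the helper functions are one possible reading of the slide rather than a reference implementation.

```python
# A finite interpretation: a domain D plus the mapping .^I for each kind of name.
D = {"d1", "d2", "d3"}

individuals = {"alice": "d1", "bob": "d2"}              # i^I is an element of D
classes = {"Student": {"d1"}, "Person": {"d1", "d2"}}   # C^I is a subset of D
properties = {"knows": {("d1", "d2")}}                  # P^I is a subset of D x D


def is_well_formed():
    """Check that every name denotes the right kind of object."""
    return (
        all(e in D for e in individuals.values())
        and all(ext <= D for ext in classes.values())
        and all(x in D and y in D
                for rel in properties.values() for (x, y) in rel)
    )


def satisfies_class_assertion(ind, cls):
    """C(i) holds iff i^I is a member of C^I."""
    return individuals[ind] in classes[cls]


def satisfies_subclass(sub, sup):
    """forall x. C(x) -> D(x) holds iff C^I is a subset of D^I."""
    return classes[sub] <= classes[sup]
```

Because class extensions are sets of domain elements and are not themselves elements of `D`, a class cannot be an instance of another class in this setting.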

Page 7: (Dis)advantages of FOL Thesis

• Pros
  – Based on well-known and extensively studied formalism
  – Wealth of theoretical knowledge and practical experience
  – Family of sub-languages with well-known formal properties
    • E.g., decidability, complexity
  – Highly optimised reasoners for FOL and many sub-languages
    • E.g., DL reasoners, Horn (rule) reasoners, FOL provers
  – Mapping to FOL provides easy integration, e.g., of DL and Horn languages
  – FO subset of RDFS fits well in this framework
• Cons
  – No classes as instances (unless extended to HOL)
  – Relatively poor fit with full RDFS
    • Can be axiomatised in FOL, but may damage semantic interoperability and computational properties

Page 8: Axiomatisation

• An axiomatisation can be used to embed RDFS in FOL, e.g.:
  – Triple x P y translated as holds2(P,x,y)
  – Axioms capture semantics of language, e.g.: [example axioms not preserved in transcript]
• Problems with axiomatisations include
  – May require large and complex set of axioms
  – Difficult to prove semantics have been correctly captured
  – Axiomatisation may greatly increase computational complexity
    • RDFS $\rightarrow$ undecidable (subset of) FOL
  – No interoperability unless all languages similarly axiomatised
    • E.g., in DAML+OIL, C subClassOf D equivalent to $\forall x.\, C(x) \rightarrow D(x)$
    • But have to axiomatise as holds2(subClass, C, D)
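To give a feel for what such axioms might look like, here are two in the holds2 style, corresponding to the familiar RDFS behaviour of type inheritance and subclass transitivity. They are an illustrative assumption, not necessarily the axioms shown on the original slide, and the constant names `type` and `subClass` simply follow the slide's own usage.

```latex
\begin{align*}
&\forall x, c, d.\;
  \mathit{holds2}(\mathit{type}, x, c) \land \mathit{holds2}(\mathit{subClass}, c, d)
  \rightarrow \mathit{holds2}(\mathit{type}, x, d) \\
&\forall c, d, e.\;
  \mathit{holds2}(\mathit{subClass}, c, d) \land \mathit{holds2}(\mathit{subClass}, d, e)
  \rightarrow \mathit{holds2}(\mathit{subClass}, c, e)
\end{align*}
```

In this encoding classes and properties are ordinary terms, so the subclass relationship gets its meaning only through axioms over holds2. A language that reads C subClassOf D directly as $\forall x.\, C(x) \rightarrow D(x)$ shares no predicates with it, which is the interoperability problem noted above.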

Page 9: SKIF/Lbase/CL Thesis

• Base SW languages on SKIF/Lbase/CL
  – Similar to FOL thesis, but FOL replaced with CL
• Higher layers extend syntax
  – Upwards compatibility, i.e., syntax retains same meaning in higher layers
• Semantics via mapping into CL
• CL provides model theory
  – Individual $i \rightarrow$ element of domain ($i^V \in D$)
  – Class $C \rightarrow$ element of domain ($C^V \in D$)
  – Property $P \rightarrow$ element of domain ($P^V \in D$)
• Second mapping (ext)
  – Class element $w \rightarrow$ set of elements ($\mathrm{ext}(w) \subseteq D$)
  – Property element $k \rightarrow$ binary relation ($\mathrm{ext}(k) \subseteq D \times D$)
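The two-step mapping can be sketched as follows: every name first denotes a domain element, and a separate `ext` mapping then gives class and property elements their extensions. The names and element labels are hypothetical, and the sketch follows the slide's simplified picture (sets and binary relations) rather than the full CL semantics.

```python
# Sketch of a CL-style interpretation: every name denotes an element of D,
# and a second mapping ext gives class/property elements their extensions.
D = {"e_alice", "e_Student", "e_Class", "e_knows"}

V = {  # first mapping: name -> element of the domain
    "alice": "e_alice",
    "Student": "e_Student",
    "Class": "e_Class",
    "knows": "e_knows",
}

ext = {  # second mapping: element -> extension
    "e_Student": {"e_alice"},              # a set of elements
    "e_Class": {"e_Student", "e_Class"},   # classes may contain classes
    "e_knows": {("e_alice", "e_alice")},   # a binary relation on D
}


def holds_unary(cls, ind):
    """cls(ind) is true iff V(ind) is a member of ext(V(cls))."""
    return V[ind] in ext.get(V[cls], set())


def holds_binary(prop, subj, obj):
    """prop(subj, obj) is true iff (V(subj), V(obj)) is in ext(V(prop))."""
    return (V[subj], V[obj]) in ext.get(V[prop], set())
```

Since `Student` denotes an ordinary domain element, `holds_unary("Class", "Student")` is meaningful here, which is how classes can be treated as individuals without moving to HOL.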

Page 10: (Dis)advantages of CL Thesis

• Pros
  – Classes as individuals without HOL extension
  – Can use as a basis for a family of sub-languages
  – Mapping to CL provides easy integration of sub-languages
  – Better fit with RDFS
• Cons
  – Relatively new and untried
  – Little known about CL sub-languages
  – Confusion w.r.t. FOL compatibility
  – RDFS still requires axiomatisation due, e.g., to rdf:type being in domain of discourse
    • Still no direct semantic interoperability with RDFS
  – Computational pathway only via (performance-damaging) FOL mapping

Page 11: Confusion w.r.t. FOL Compatibility

• SKIF/Lbase/CL use same syntax as FOL
  – But allow variables to occur in predicate positions
• Originally asserted that SKIF semantics coincide with FOL for well-formed FOL sentences
• Subsequently shown to be wrong for FOL with equality
  – E.g., [example not preserved in transcript]
• Moral of the story
  – May confuse users more familiar with classical FOL
  – Easy to make mistakes with complex new formalisms
  – Risky to base future of SemWeb on such a new formalism
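One way such a divergence can arise is sketched below. It is offered purely as an illustration and is not claimed to be the example from the original slide. Take the sentences $\forall x, y.\, x = y$, $P(a)$ and $\lnot Q(a)$. The first forces a one-element domain. In classical FOL, P and Q are still interpreted independently, so the set is satisfiable. In a CL-style semantics, P and Q must denote the same single element and therefore share one extension, so it is not. The function names are hypothetical.

```python
from itertools import product

d = "d"  # the single domain element forced by: forall x, y. x = y


def fol_satisfiable():
    """FOL: P and Q are interpreted independently as subsets of D = {d}."""
    for p_ext, q_ext in product([set(), {d}], repeat=2):
        if d in p_ext and d not in q_ext:   # P(a) and not Q(a), with a -> d
            return True
    return False


def cl_style_satisfiable():
    """CL style: P, Q and a all denote d; ext gives d a single extension."""
    V = {"P": d, "Q": d, "a": d}            # every name must denote the only element
    for ext_of_d in [set(), {d}]:
        ext = {d: ext_of_d}
        p_holds = V["a"] in ext[V["P"]]
        q_holds = V["a"] in ext[V["Q"]]
        if p_holds and not q_holds:
            return True
    return False
```

Under these assumptions `fol_satisfiable()` finds a model while `cl_style_satisfiable()` does not, so the same well-formed FOL sentences are treated differently once equality is involved.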

Page 12: RDF Thesis

• All SW languages based on triples
  – Triple-based syntax
  – Semantics compatible with semantics of triples as defined by RDF MT
• Upwards & downwards compatibility
  – Syntax retains same meaning in higher layers
  – Higher layer syntax is valid in lower layers
• Semantics via RDF model theory
  – Similar to CL, but only binary predicates
  – Language syntax also in domain of discourse
  – Higher layers impose additional constraints on models
• Syntax must be encoded as triples
  – Awkward for complex constructs
  – Resulting triples also have meaning
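To see why triple encoding is awkward for complex constructs, consider a class defined as the intersection of two others. The sketch below encodes the operand list with `rdf:first`/`rdf:rest` triples, in the style used for OWL's `owl:intersectionOf`. The `ex:` class names, the blank-node labels and the helper functions are assumptions made for illustration.

```python
from itertools import count

_blank_ids = count(1)


def new_blank_node():
    """Return a fresh blank-node label such as _:b1."""
    return f"_:b{next(_blank_ids)}"


def encode_list(items):
    """Encode a list as rdf:first / rdf:rest triples; return (head, triples)."""
    if not items:
        return "rdf:nil", []
    head = new_blank_node()
    rest_head, rest_triples = encode_list(items[1:])
    triples = [(head, "rdf:first", items[0]), (head, "rdf:rest", rest_head)]
    return head, triples + rest_triples


def encode_intersection(class_name, operands):
    """Encode 'class_name is the intersection of operands' as triples."""
    list_head, triples = encode_list(operands)
    return [(class_name, "owl:intersectionOf", list_head)] + triples


triples = encode_intersection("ex:WorkingStudent", ["ex:Student", "ex:Employee"])
for t in triples:
    print(t)
```

A single two-operand expression already needs five triples and two blank nodes, and each of those list triples is itself an assertion with its own meaning under the RDF model theory.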

Page 13: (Dis)advantages of RDF Thesis

• Pros
  – (Supposed) interoperability between language layers
  – RDF tools can be used to parse all SW languages into triples
  – Large ontologies/KBs can be stored in triple DBs
• Cons
  – Achieving real (semantic) interoperability may be difficult or impossible
    • E.g., efforts to layer OWL on top of RDF(S)
  – Triple encoding of complex languages such as OWL is very clumsy
  – Triples introduced by encodings have semantic consequences
    • E.g., first-rest triples used in list syntax have same consequences as ground facts (even though ordering of list may be arbitrary)
  – Not clear if technique can be extended to more expressive languages
    • E.g., full FOL
  – Computational pathway only via (performance-damaging) FOL mapping

Page 14: Summary

• Formal meaning of SW languages crucial to interoperability
  – Common semantic underpinning facilitates layered architecture
• Widely assumed that RDF will provide this underpinning
  – But layering on top of RDF(S) may be difficult/impossible and does not lead to any direct computational pathway
  – Moreover, benefits are not clear
• Alternative would be to use standard FOL as underpinning
  – Well established and well understood
  – Established family of languages capturing different trade-offs
  – Direct computational pathway for FOL and many sub-languages
  – FO subset of RDF(S) would fit well in this framework
• Third approach is to use CL as underpinning
  – Relatively new and untested
  – May not solve problems with RDF(S)

Page 15

Perhaps we should consider recalling the Semantic Web bandwagon in order to carry out a safety modification on the RDF component!