Semantics & digital data processing
Beirut, 9-10 March 2015
Establishing a framework for scholarly editing and publishing in the 21st century
Mokhtar BEN HENDA
Semantics: Better computational description of science
• Information is given explicit meaning so that machines can process it more intelligently.
• Instead of just creating standard terms for concepts, as is done in XML, the Semantic Web also lets users provide formal definitions for the terms they create, so that machines can use inference algorithms to reason about those terms.
• A crucial component of the Semantic Web is the definition and use of ontologies.
Basics of the Semantic Web
• XML: XML is a language for transmitting structured information. If the goal of the web is to enable communication not only between people but also between machines, then XML seems a good basis not only for documents to be read by people, but also for data to be read by machines.
• RDF: RDF (Resource Description Framework) is a directed, labeled graph data format for representing information on the Web. It is just a data model and does not carry any significant semantics of its own. RDF Schema is used to define a vocabulary for use in RDF models. In particular, it makes it possible to define the classes used to type resources and the properties that resources can have.
• OWL: OWL was designed to provide a common way to process the content of web information instead of merely displaying it. It is primarily concerned with defining terminology that can be used in RDF documents. Syntactically, an OWL ontology serialized as RDF/XML is a valid RDF document and, as such, also a well-formed XML document.
• SPARQL: SPARQL is a set of specifications that provide languages and protocols to query and manipulate RDF graph content on the Web or in an RDF store.
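To make the RDF data model and the idea of a graph query more concrete, here is a minimal sketch in plain Python. It is not RDF tooling and not SPARQL itself: the `match` helper, the `http://example.org/` namespace and the sample triples are all assumptions made for illustration. It only shows that a graph is a set of (subject, predicate, object) triples and that a query is a triple pattern with some positions left open.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

EX = "http://example.org/"  # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A graph is just a set of (subject, predicate, object) triples.
GRAPH: Set[Triple] = {
    (EX + "violin", RDF_TYPE, EX + "StringInstrument"),
    (EX + "fiddle", RDF_TYPE, EX + "StringInstrument"),
    (EX + "fiddler", EX + "plays", EX + "fiddle"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None plays the role of a query variable."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# Roughly the intent of: SELECT ?x WHERE { ?x rdf:type ex:StringInstrument }
instruments = sorted(t[0] for t in match(GRAPH, p=RDF_TYPE, o=EX + "StringInstrument"))
print(instruments)
```

A real application would normally rely on an RDF library and a SPARQL engine rather than a hand-written matcher; the sketch only illustrates the shape of the data and of a query.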
Ontologies
Principle:
• From character strings to word meaning
• From word lists to word relations
• From indexes to ontologies
Layers shown: ontologies, semantic links, vocabularies
Data Models
[Figure: the vocabulary of a language is mapped to concepts, and concepts are linked by relations]
• Vocabulary of a language: bank, fiddle, violin, violist, fiddler, string
• Concepts:
  rec: 12345 - financial institute
  rec: 54321 - side of a river
  rec: 9876 - small string instrument
  rec: 65438 - musician playing violin
  rec: 42654 - musician
  rec: 25876 - string instrument
  rec: 35576 - string of instrument
  rec: 29551 - underwear
• Relations: type-of, type-of, part-of
© Piek Vossen
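The three levels of the figure (words, concept records, relations) can be sketched as three small Python tables. The record numbers and glosses are taken from the figure. Which word points to which record, and which records the type-of and part-of relations connect, cannot be recovered from the transcript, so those links are assumptions, as are the helper names `synonyms` and `related`.

```python
from typing import Dict, List, Set, Tuple

# Concept records as listed in the figure (record id -> gloss).
CONCEPTS: Dict[int, str] = {
    12345: "financial institute",
    54321: "side of a river",
    9876: "small string instrument",
    65438: "musician playing violin",
    42654: "musician",
    25876: "string instrument",
    35576: "string of instrument",
    29551: "underwear",
}

# ASSUMPTION: the word-to-record links below are one plausible reading of the figure.
SENSES: Dict[str, List[int]] = {
    "bank": [12345, 54321],
    "fiddle": [9876],
    "violin": [9876],
    "fiddler": [65438],
    "violist": [65438],
    "string": [35576, 29551],
}

# ASSUMPTION: the endpoints of the three relations are guessed, not given in the source.
RELATIONS: List[Tuple[int, str, int]] = [
    (9876, "type-of", 25876),
    (65438, "type-of", 42654),
    (35576, "part-of", 25876),
]


def synonyms(word: str) -> Set[str]:
    """Words that share at least one concept record with the given word."""
    own = set(SENSES[word])
    return {w for w, ids in SENSES.items() if w != word and own & set(ids)}


def related(concept_id: int) -> List[Tuple[str, str]]:
    """Outgoing relations of a concept, as (relation, gloss of target) pairs."""
    return [(rel, CONCEPTS[target])
            for source, rel, target in RELATIONS if source == concept_id]


print(synonyms("fiddle"))
print([CONCEPTS[i] for i in SENSES["bank"]])
print(related(9876))
```

The point of the layout is that an ambiguous string such as "bank" resolves to several records, while two different strings such as "fiddle" and "violin" can resolve to the same one.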
Capturing Semantics in XML Documents
[Diagram: one XML document is read by two applications]
• app#1: Semantics: code to interpret the data; Action: code to process the data
• app#2: Semantics: code to interpret the data; Action: code to process the data
Meaning (semantics) is applied on a per-XML-application basis
© Katia Sycara and Massimo Paolucci
OWL (Web Ontology Language)
[Diagram: an XML document is read by two applications that share one OWL document of semantic definitions]
• app#1: Action: code to process the data
• app#2: Action: code to process the data
• OWL Document: Semantic Definitions
Separates concept definitions (semantics) from the application
Expresses concept definitions using a standard vocabulary
© Katia Sycara and Massimo Paolucci
OWL and logics
OWL relies on Description Logics. Logics provide automatic:
• Check of consistency of concept definitions
• Completion of concept definitions
• Classification of new instances and concepts
• Extraction of implicit knowledge in the documents
OWL greatly expands the vocabulary of constructs available for defining concepts
XML Schema offers some of these constructs, but only to a limited extent
© Katia Sycara and Massimo Paolucci
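The reasoning services listed above can be hinted at with a toy example. The sketch below is not a Description Logic reasoner and does not use OWL syntax: it assumes, purely for illustration, that each class is defined by a set of required properties and that some class pairs are declared disjoint. All class names, property names and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical concept definitions: class -> properties every member must have.
DEFINITIONS: Dict[str, Set[str]] = {
    "Instrument": {"producesSound"},
    "StringInstrument": {"producesSound", "hasStrings"},
    "Violin": {"producesSound", "hasStrings", "playedWithBow"},
    "Musician": {"playsInstrument"},
}

# Hypothetical disjointness axiom: nothing is both an instrument and a musician.
DISJOINT: List[Tuple[str, str]] = [("Instrument", "Musician")]


def subsumers(cls: str) -> List[str]:
    """Classification of concepts: classes whose requirements cls already satisfies."""
    return sorted(c for c, req in DEFINITIONS.items()
                  if c != cls and req <= DEFINITIONS[cls])


def classify(properties: Set[str]) -> List[str]:
    """Classification of an instance: every class whose requirements it meets."""
    return sorted(c for c, req in DEFINITIONS.items() if req <= properties)


def is_consistent(asserted: Set[str]) -> bool:
    """Consistency check: asserted classes plus their subsumers hit no disjoint pair."""
    implied = set(asserted)
    for cls in asserted:
        implied.update(subsumers(cls))
    return not any(a in implied and b in implied for a, b in DISJOINT)


print(subsumers("Violin"))
print(classify({"producesSound", "hasStrings"}))
print(is_consistent({"Violin", "Musician"}))
```

Here the fact that a violin is an instrument is never stated directly; it is implicit knowledge derived from the definitions, which is also what makes the combined assertion "Violin and Musician" detectably inconsistent.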
RDF (Resource Description Framework)
• Provides the basic syntax for OWL
• Uses URIs for unique identification of concepts, instances and relations
• Expresses relations between objects and concepts (RDF triples)
Problem: no structure
© Katia Sycara and Massimo Paolucci
RDF Schema
Adds basic structure to RDF:
• Class/Subclass declaration
• Instances
• Properties (relations)
• Multiple inheritance
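As a rough illustration of the structure RDF Schema adds, the sketch below models subclass declarations, including multiple inheritance, and derives every type of an instance by following the subclass links transitively. It mimics the intent of rdfs:subClassOf in plain Python; the class names, the instance "alice" and the function names are assumptions made for the example.

```python
from typing import Dict, List, Set

# Hypothetical class hierarchy; "Fiddler" has two direct superclasses.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "Fiddler": {"Violinist", "FolkMusician"},  # multiple inheritance
    "Violinist": {"Musician"},
    "FolkMusician": {"Musician"},
    "Musician": {"Person"},
}

# Hypothetical instance data: instance -> directly asserted classes.
INSTANCE_OF: Dict[str, Set[str]] = {"alice": {"Fiddler"}}


def superclasses(cls: str) -> Set[str]:
    """All direct and indirect superclasses of cls (transitive closure)."""
    found: Set[str] = set()
    stack = list(SUBCLASS_OF.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUBCLASS_OF.get(parent, ()))
    return found


def types_of(instance: str) -> List[str]:
    """Asserted classes of an instance plus everything they inherit from."""
    result: Set[str] = set()
    for cls in INSTANCE_OF.get(instance, ()):
        result.add(cls)
        result |= superclasses(cls)
    return sorted(result)


print(types_of("alice"))
```

Only "alice is a Fiddler" is asserted; the other four types follow from the schema, which is the kind of structure plain RDF triples lack.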
[Diagram: layered architecture built around the GATE APIs]
NOTES
• everything is a replaceable bean
• all communication via fixed APIs
• low coupling, high modularity, high extensibility
NOTES (2)
• e.g. Protégé LR & VR both wrapped in Res. (bean) API
• ontology repositories and inference are the same: KAON + Sesame + Orenge + ?
Layers shown:
• Input documents: HTML docs, RTF docs, XML docs, PDF docs, …
• DocumentFormat Layer (LRs): XMLDocumentFormat, HTMLDocumentFormat, PDFDocumentFormat, …
• DataStore Layer: XML, Oracle, PostgreSql, .ser
• Corpus Layer (LRs): Corpus, Document, DocumentContent, AnnotationSet, Annotation, FeatureMap
• Processing Layer (PRs): NE, Co-ref, TEs, TRs, POS, …
• Language Resource Layer (LRs): Ontology, Protégé Ontology, WordNet, Gazetteers, ...
• Application Layer: ANNIE, OBIE, …
• IDE GUI Layer (VRs): ADiff, OntolVR, DocVR, ...
• WebServices
Sheffield Natural Language Processing Group
Semantic Web: cartographic searching
Conceptual charts
• Represent semantic links as charts (maps)
• One concept can have numerous semantic links
Examples:
• http://www.touchgraph.com/TGGoogleBrowser.html
• http://www.oskope.com
• http://vionto.com/show/
• Topic Maps: http://highwire.stanford.edu/lists/artbytopic.dtl
SKOS: Simple Knowledge Organization System
SKOS: specifications and standards that support, within the framework of the Semantic Web, the use of knowledge organization systems (KOS) such as thesauri, classification schemes, subject heading lists and taxonomies.
http://www.ebusiness-unibw.org/tools/skos2owl/
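A knowledge organization system of the kind SKOS describes can be pictured as concepts with a preferred label, alternative labels and links to broader concepts. The sketch below mirrors the intent of skos:prefLabel, skos:altLabel and skos:broader in plain Python; it does not read or write SKOS files, and the tiny scheme and helper names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)
    broader: List[str] = field(default_factory=list)  # keys of broader concepts


# Hypothetical three-concept thesaurus.
SCHEME: Dict[str, Concept] = {
    "instrument": Concept("musical instrument"),
    "string": Concept("string instrument", broader=["instrument"]),
    "violin": Concept("violin", alt_labels=["fiddle"], broader=["string"]),
}


def narrower(key: str) -> List[str]:
    """Keys of concepts that name the given concept as broader."""
    return sorted(k for k, c in SCHEME.items() if key in c.broader)


def path_to_top(key: str) -> List[str]:
    """Preferred labels from a concept up to its top concept (first broader link only)."""
    path = [SCHEME[key].pref_label]
    while SCHEME[key].broader:
        key = SCHEME[key].broader[0]
        path.append(SCHEME[key].pref_label)
    return path


def find(label: str) -> Optional[str]:
    """Look a concept up by its preferred or alternative label."""
    for key, concept in SCHEME.items():
        if label == concept.pref_label or label in concept.alt_labels:
            return key
    return None


print(find("fiddle"))
print(path_to_top("violin"))
print(narrower("string"))
```

Looking up "fiddle" lands on the same concept as "violin", which is how a thesaurus lets a search on a non-preferred term reach documents indexed under the preferred one.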
SKOS Play
SKOS Play is a visualization service for taxonomies or vocabularies formatted in SKOS.
More generally, it is used to view or print a knowledge organization system expressed in SKOS.
http://labs.sparna.fr/skos-play/upload?lang=en