Ontology, Semantic Web and DBpedia


Upload: richard-kuo

Post on 07-Aug-2015


Category:

Technology



TRANSCRIPT

Page 1: Ontology, Semantic Web and DBpedia

10/8/2009

1

Ontology, Semantic Web and Global Database

10/9/2009 1Creative Commons - BY-NC

Contents

• Ontology? Why?

• Protégé

• Semantic Web

• Linked Open Data


Syntactic Web  

Problems

A typical web page is written in a markup language, HTML, which is designed for rendering presentation and for hyperlinking to related information. Its semantic content is accessible to humans but not to computers.

[Figure: meaning triangle with corners Linguistic Form, Concept and Referent. The form "Tank" activates a concept, the concept relates to a referent, and the form stands for the referent, shown as "?".]

Problems

• Keyword-based Search
• Synonyms and Homonyms
• No Parameter Search
• No Cross-Silo Data Extraction or Comparison
• No Unified View and/or Interpretation of Data
• Limited Ability to Re-use Data
• Difficult to Share Data with Business Partners


Need to Add "Semantics"

• Use an ontology to specify the meaning of annotations.
– An ontology provides a set of vocabulary terms
– New terms can be defined from existing ones
– The meaning of each term can be formally specified
– The relationships between terms can be defined

Web

• Web 1.0 – links documents to documents
• Web 2.0 – provides content from users
• Web 3.0 – links data to data


What is Ontology?
From: http://en.wikipedia.org/wiki/Ontology_%28information_science%29

• In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain.
• An ontology is a formal, explicit specification of a conceptualization.

XML (Extensible Markup Language)

It is a textual data format, with strong support via Unicode for the languages of the world. Although XML's design focuses on documents, it is widely used for the representation of arbitrary data structures.

Well-formedness and error handling

• It contains only properly-encoded legal Unicode characters.
• None of the special syntax characters such as "<" and "&" appear except when performing their markup-delineation roles.
• The begin, end, and empty-element tags which delimit the elements are correctly nested, with none missing and none overlapping.
• The element tags are case-sensitive; the beginning and end tags must match exactly.
• There is a single "root" element which contains all the other elements.
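The well-formedness rules on this slide can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser as the checker (the slides do not name a parser); the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative samples, one per rule on the slide.
samples = {
    "nested correctly": "<book><title>Weaving the Web</title></book>",
    "overlapping tags": "<b><i>text</b></i>",
    "case mismatch": "<Title>text</title>",
    "two roots": "<a/><b/>",
    "bare ampersand": "<a>R & D</a>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample is accepted; each of the others breaks exactly one of the rules listed.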


XSD (XML Schema)

XSD can be used to express a set of rules to which an XML document must conform in order to be considered 'valid' according to that schema. However, unlike most other schema languages, XSD was also designed with the intent that determination of a document's validity would produce a collection of information adhering to specific data types.

XSD datatypes - 1/2
• xsd:string
• xsd:boolean
• xsd:decimal
• xsd:float
• xsd:double
• xsd:dateTime
• xsd:time
• xsd:date
• xsd:gYearMonth
• xsd:gYear
• xsd:gMonthDay
• xsd:gDay
• xsd:gMonth
• xsd:hexBinary
• xsd:base64Binary
• xsd:anyURI
• xsd:normalizedString
• xsd:token

XSD datatypes - 2/2
• xsd:language
• xsd:NMTOKEN
• xsd:Name
• xsd:NCName
• xsd:integer
• xsd:nonPositiveInteger
• xsd:negativeInteger
• xsd:long
• xsd:int
• xsd:short
• xsd:byte
• xsd:nonNegativeInteger
• xsd:unsignedLong
• xsd:unsignedInt
• xsd:unsignedShort
• xsd:unsignedByte
• xsd:positiveInteger
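To illustrate what "information adhering to specific data types" can mean in practice, here is a rough sketch that turns the lexical form of a value into a typed Python value for a handful of the datatypes listed. The PARSERS table and the typed_value helper are hypothetical names, and the parsing is deliberately simplified (no time zones, no whitespace normalization, no range checks).

```python
from datetime import date
from decimal import Decimal


def parse_boolean(text: str) -> bool:
    # xsd:boolean has exactly four lexical forms.
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean: {text!r}")


# Simplified mapping from a few XSD datatypes to Python parsers (an assumption
# of this sketch, not a complete or validating implementation).
PARSERS = {
    "xsd:string": str,
    "xsd:boolean": parse_boolean,
    "xsd:integer": int,
    "xsd:decimal": Decimal,
    "xsd:date": date.fromisoformat,
}


def typed_value(lexical: str, datatype: str):
    """Convert a lexical form to a Python value according to its datatype."""
    return PARSERS[datatype](lexical)


year = typed_value("2009", "xsd:integer")
flag = typed_value("true", "xsd:boolean")
day = typed_value("2009-10-09", "xsd:date")
print(year + 1, flag, day.isoformat())
```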

RDF (Resource Description Framework)

RDF describes statements about resources, in particular Web resources, in the form of subject-predicate-object expressions. These expressions are known as triples in RDF terminology.

RDF vocabulary
• rdf:type
• rdf:Property
• rdf:XMLLiteral
• rdf:nil
• rdf:List
• rdf:Statement
• rdf:subject
• rdf:predicate
• rdf:object
• rdf:first
• rdf:rest
• rdf:Seq
• rdf:Bag
• rdf:Alt
• rdf:_1
• rdf:_2 ...
• rdf:value


Triples and Graph

The base element of the RDF model is the triple:

• a resource (the subject)
• links (the predicate)
• another resource (the object)

A resource <subject> has a property <predicate> valued by <object>.
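A set of triples can be read as a directed, labelled graph: subjects and objects are nodes, predicates are edge labels. The sketch below shows this with plain Python tuples; the "ex:" identifiers and the helper names are made up for illustration and stand in for full URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# Hypothetical statements; "ex:" abbreviates an example namespace.
graph = {
    Triple("ex:Protege", "ex:developedAt", "ex:StanfordUniversity"),
    Triple("ex:Protege", "ex:isA", "ex:OntologyEditor"),
    Triple("ex:StanfordUniversity", "ex:locatedIn", "ex:California"),
}


def objects(graph, subject, predicate):
    """All values of `predicate` for `subject`."""
    return {t.object for t in graph if t.subject == subject and t.predicate == predicate}


def outgoing_edges(graph, subject):
    """Edges leaving a node when the triple set is viewed as a graph."""
    return {(t.predicate, t.object) for t in graph if t.subject == subject}


print(objects(graph, "ex:Protege", "ex:isA"))
print(sorted(outgoing_edges(graph, "ex:Protege")))
```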

<subject> <predicate> <object>

Pros and Cons of RDF

• Pros
– Universal data model (maps to XML, object and relational models)
– Additive, easy to merge multiple RDF graphs
– Predicate logic (like Prolog)
– Uses URIs to identify a resource
• Cons
– Lacks the concept of enumeration
– Lacks data types
– No object-oriented features
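The "additive" property can be sketched in a few lines: if each graph is a set of triples and resources are named by shared URIs, merging is just set union. The data below is illustrative, and the sketch ignores blank nodes, which need extra care in a real merge.

```python
# Two graphs published independently (hypothetical "ex:" identifiers).
graph_a = {
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"),
    ("ex:Berlin", "rdfs:label", "Berlin"),
}
graph_b = {
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"),  # same statement as in graph_a
    ("ex:Germany", "ex:memberOf", "ex:EuropeanUnion"),
}

# Merging is set union: duplicates collapse, nothing has to be restructured.
merged = graph_a | graph_b

about_berlin = sorted(t for t in merged if t[0] == "ex:Berlin")
print(len(merged), about_berlin)
```

Because both graphs use the same identifier for Berlin, statements from either source attach to the same node without any schema mapping step.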


RDF Schema (RDFS)

RDF Schema (RDFS) is an extensible knowledge representation language, providing basic elements for the description of ontologies, otherwise called Resource Description Framework (RDF) vocabularies, intended to structure RDF resources.

Classes
• rdfs:Resource
• rdfs:Literal
• rdfs:Class
• rdfs:Datatype
• rdfs:Container
• rdfs:ContainerMembershipProperty
• rdf:List
• rdf:Statement
• rdf:Bag
• rdf:Seq
• rdf:Alt
• rdf:XMLLiteral
• rdf:Property

Properties
• rdfs:subClassOf
• rdfs:subPropertyOf
• rdfs:domain
• rdfs:range
• rdfs:label
• rdfs:comment
• rdfs:member
• rdfs:seeAlso
• rdfs:isDefinedBy
• rdf:first
• rdf:rest
• rdf:type
• rdf:value
• rdf:subject
• rdf:predicate
• rdf:object
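As an illustration of how RDFS vocabulary structures RDF data, the sketch below applies two standard RDFS entailment rules (rdfs:subClassOf is transitive, and instances inherit the superclasses of their class) to a tiny graph. The class and instance names are invented, and the naive fixed-point loop is only meant for small examples.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical data: a small class hierarchy and one instance.
triples = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", TYPE, "ex:Dog"),
}


def rdfs_closure(triples):
    """Apply the two rules repeatedly until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for (a, p, b) in inferred:
            for (c, q, d) in inferred:
                if b != c or q != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    # a subClassOf b, b subClassOf d  =>  a subClassOf d
                    new.add((a, SUBCLASS, d))
                elif p == TYPE:
                    # a type b, b subClassOf d  =>  a type d
                    new.add((a, TYPE, d))
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
print(sorted(closure - triples))
```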

Web Ontology Language 


Web Ontology Language (OWL)

• Extends RDF/RDFS to support complex knowledge representation.
• An OWL ontology may include descriptions of classes, properties and their instances.
• Open-world assumption – what is not known is not "untrue", it is just "unknown".
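The difference between the open-world assumption and the closed-world assumption of a typical database can be sketched as follows. The function names and data are hypothetical; the point is only that a missing statement yields "unknown" rather than "false".

```python
# Known statements (illustrative data).
facts = {("ex:Alice", "ex:knows", "ex:Bob")}


def closed_world(facts, statement) -> bool:
    """Database-style answer: anything not stored is treated as false."""
    return statement in facts


def open_world(facts, negated_facts, statement) -> str:
    """Open-world answer: false only if the negation is explicitly known."""
    if statement in facts:
        return "true"
    if statement in negated_facts:
        return "false"
    return "unknown"


question = ("ex:Alice", "ex:knows", "ex:Carol")
print(closed_world(facts, question))       # treated as false
print(open_world(facts, set(), question))  # reported as unknown
```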

OWL-1

• OWL-Lite
– Supports simple classification; allows only cardinalities (member counts) of 0 and 1 and only minimal constraints.
• OWL-DL (Description Logic)
– Supports more complex ontologies, but with guarantees such as processing finishing in finite time, at the cost of restrictions such as requiring each element to be of one type.
• OWL-Full
– Full support for the maximum freedom of RDF, with no computational guarantees.


OWL Classes and Properties
Partial list, see http://www.w3.org/TR/owl-guide/ for full list

• Class
– owl:Class
– rdfs:subClassOf
• Property
– owl:ObjectProperty
– owl:DatatypeProperty
– rdfs:subPropertyOf
– rdfs:domain
– rdfs:range
• Property Characteristic
– owl:TransitiveProperty
– owl:FunctionalProperty
– owl:inverseOf
– owl:InverseFunctionalProperty
• Property Restrictions
– owl:allValuesFrom
– owl:someValuesFrom
– owl:cardinality
– owl:hasValue
• Equivalence
– owl:equivalentClass
– owl:equivalentProperty
– owl:sameAs
• Complex Classes
– owl:intersectionOf
– owl:unionOf
– owl:complementOf
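Property characteristics let a reasoner derive statements that were never asserted. The sketch below imitates two of them, owl:TransitiveProperty and owl:inverseOf, on plain tuples. The apply_characteristics helper and the "ex:" names are hypothetical; a real reasoner such as Pellet or FaCT++ covers far more of OWL.

```python
def apply_characteristics(triples, transitive=(), inverse_of=None):
    """Derive triples implied by transitive and inverse property declarations."""
    inverse = dict(inverse_of or {})
    # owl:inverseOf works in both directions.
    inverse.update({v: k for k, v in (inverse_of or {}).items()})
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if p in inverse:
                new.add((o, inverse[p], s))
            if p in transitive:
                for (s2, p2, o2) in inferred:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# Illustrative data.
triples = {
    ("ex:Kreuzberg", "ex:locatedIn", "ex:Berlin"),
    ("ex:Berlin", "ex:locatedIn", "ex:Germany"),
}
result = apply_characteristics(
    triples,
    transitive={"ex:locatedIn"},
    inverse_of={"ex:locatedIn": "ex:contains"},
)
print(sorted(result - triples))
```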

Semantic Web Layer Cake
From: http://www.semanticfocus.com/blog/entry/title/introduction-to-the-semantic-web-vision-and-technologies-part-1-overview/


Tools

• RDF/OWL Editors
– Protégé, TopBraid, …
• RDF Store
– SwiftOWLIM, AllegroGraph, OpenLink Virtuoso, …
• Query
– SPARQL
• Reasoners
– Pellet, FaCT++, …
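At its core, a SPARQL query matches triple patterns containing variables against the stored triples. The sketch below imitates that idea in plain Python; it is not a SPARQL engine, and the graph contents (including the ex:worksWith relation) are made up for illustration.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None if impossible."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def query(graph, patterns):
    """Join the patterns one by one, keeping only consistent variable bindings."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in graph
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return solutions


graph = {
    ("ex:Pellet", "rdf:type", "ex:Reasoner"),
    ("ex:FaCT++", "rdf:type", "ex:Reasoner"),
    ("ex:Protege", "rdf:type", "ex:Editor"),
    ("ex:Protege", "ex:worksWith", "ex:Pellet"),
}

# Roughly: SELECT ?tool ?r WHERE { ?tool ex:worksWith ?r . ?r rdf:type ex:Reasoner }
rows = query(graph, [("?tool", "ex:worksWith", "?r"), ("?r", "rdf:type", "ex:Reasoner")])
print(rows)
```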


Protégé Overview

• Stanford Center for Biomedical Informatics Research
– Stanford University
– University of Manchester
• OWL Editor
• Plugins: Natural Language, Visualization, Rules Engine, Database, …
• Very well documented
• Long history with strong academic support

Protégé – Class View

Protégé – Object Property View

Protégé – Value Property View

Protégé – Visualization

Ontology Development

• Define purpose and scope
• Elicit knowledge
• Collect and organize concepts
• Classify and add axioms
• Reasoning


OWL vs. UML class modeling

• OWL properties vs. UML associations & attributes
– OWL properties have a direction
– OWL properties are binary relations
– OWL properties are "first-class" citizens (global scope)
• OWL classes vs. UML classes
– OWL classes have no operations
– OWL classes can have "sufficient" conditions
• Primitive vs. defined classes
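The distinction between primitive and defined classes can be sketched as follows. Membership in a primitive class has to be asserted, while a defined class carries sufficient conditions, so membership can be inferred from an individual's properties. The class names and the Individual type are hypothetical and only mimic what an OWL reasoner would do.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str
    asserted_types: set = field(default_factory=set)
    has_child: list = field(default_factory=list)


def inferred_types(individual: Individual) -> set:
    """Asserted types plus memberships that follow from sufficient conditions."""
    types = set(individual.asserted_types)
    # "Person" is primitive here: it is only ever asserted, never inferred.
    # "Parent" is defined (assumed definition): a Person with at least one
    # hasChild value, so satisfying the condition is enough to be classified.
    if "Person" in types and individual.has_child:
        types.add("Parent")
    return types


alice = Individual("alice", {"Person"}, ["bob"])
bob = Individual("bob", {"Person"})
print(sorted(inferred_types(alice)), sorted(inferred_types(bob)))
```

Under the open-world assumption, the absence of "Parent" for bob means only that it cannot be inferred from what is known, not that it is false.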

Ontologies and Data Models

• Ontologies live in an open, distributed world; data models in a closed world
• Writing a model in OWL does not make it an ontology
– The ontology should be shared


Semantic Web

Web Technologies
From: http://www.abricocotier.fr/5694-les-trois-grandes-etapes-de-levolution-du-web

Benefits of Semantic Web Applications

• Less coding, more meaningful data structures
• Fewer business rules
• More cross-boundary information
• Embedded logic

Global Database
From: Tim Berners-Lee, Weaving the Web, 1999

• "If HTML and the Web made all the online documents look like one huge book, RDF, schema, and inference languages will make all the data in the world look like one huge database"


Welcome to the Internet

One Global Machine


Dimension of Global Machine
From: http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php

170 quadrillion (170 * 10^15) Transistors
55 trillion (55 * 10^12) Links
2 megahertz Emails
31 kilohertz Text Messages
162 kilohertz Instant Messages
14 kilohertz Search
246 exabyte Storage
9 exabyte (9 * 10^18) RAM
9 terabytes/second Bandwidth
800 billion kWh/year Power consumption


DBpedia

• Structures information from multiple Wikipedias so that it can be queried directly
• Built from scratch: 170 classes, 900 properties
• Serves as a hub for other databases
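"Queried directly" in practice means sending a SPARQL query to a DBpedia endpoint. The sketch below only builds the request URL and does not contact the network. The endpoint address, the "format" parameter (a convention of the Virtuoso server, as far as this sketch assumes) and the example query are assumptions, not taken from the slides.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed public endpoint; adjust if it differs.
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: five cities with their English labels.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?city ?name WHERE {
  ?city a dbo:City ;
        rdfs:label ?name .
  FILTER (lang(?name) = "en")
}
LIMIT 5
"""


def build_request_url(endpoint, query, result_format="application/sparql-results+json"):
    """Encode the query as the `query` parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query, "format": result_format})


url = build_request_url(ENDPOINT, QUERY)
print(url[:60] + "...")
```

Fetching the URL with any HTTP client would then return the result set in the requested format, assuming the endpoint is reachable and accepts these parameters.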


Multilingual Abstracts

– English: 2,613,000
– German: 391,000
– French: 383,000
– Dutch: 284,000
– Polish: 256,000
– Italian: 286,000
– Spanish: 226,000
– Japanese: 199,000
– Portuguese: 246,000
– Swedish: 144,000
– Chinese: 101,000

May 2007: 500 million RDF triples
April 2008: 2 billion RDF triples
Sept 2008


Linked Open Database, March 2009
4.5 billion RDF triples
180 million data links

[Figure legend: Music, Online Activities, Publications, Geographic, Cross-Domain, Life Sciences]

Open Questions

• Architecture Impact
• Device Applications
• Device Management
• Data Structure and Management
• Software Evolution, new requirements
• Competitor's offers
• …


Thank You for Your Attention
