semantic web - lecture 09 - web information systems (4011474fnr)

2 December 2005 - Web Information Systems: Semantic Web - Prof. Beat Signer, Department of Computer Science, Vrije Universiteit Brussel, http://www.beatsigner.com

Upload: beat-signer

Post on 15-Jan-2015

DESCRIPTION

This lecture is part of a Web Information Systems course given at the Vrije Universiteit Brussel.

TRANSCRIPT

Page 1: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

2 December 2005

Web Information Systems

Semantic Web

Prof. Beat Signer

Department of Computer Science

Vrije Universiteit Brussel

http://www.beatsigner.com

Page 2: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Beat Signer - Department of Computer Science - November 28, 2014

The Semantic Web

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A 'Semantic Web', which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The 'intelligent agents' people have touted for ages will finally materialize.

Tim Berners-Lee, Weaving the Web - The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor, Harper San Francisco, September 1999

Page 3: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

The Semantic Web ...

The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications. Metadata provides a means to make statements and create machine-readable statements.

W3C, 2003

Page 4: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

The Semantic Web ...

The meaning of data on the Web can not only be inferred by people but also be discovered by machines without (or with less) human intervention

Web of Data instead of Web of Documents
- the Web as a huge decentralised database (knowledge base)
- machine-accessible data
- data may be interconnected similar to today's webpages
- machine-readable metadata for existing web content
- combination of data from different sources to derive new facts
- machines (agents) may use logical reasoning to infer facts that are not explicitly recorded

Crucial component of Web 3.0 or the Giant Global Graph

Page 5: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Video: The Future Internet

Page 6: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Semantic Web Stack

The Semantic Web Stack (or Semantic Web Cake) describes the architecture of the Semantic Web

URI/IRI
- unique identification of semantic web resources

Unicode
- representing/manipulating text in different languages

XML
- interchange of structured data over the Web

[Figure: Semantic Web Stack with the layers Identifiers: URI/IRI, Character set: Unicode, Syntax: XML and XML Namespaces, Data interchange: RDF, Taxonomies: RDFS, Ontologies: OWL, Querying: SPARQL, Rules: RIF/SWRL, Unifying Logic, Proof, Trust, and User interface and applications, with Cryptography as a vertical layer. Based on http://en.wikipedia.org/wiki/File:Semantic-web-stack.png]

Page 7: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Semantic Web Stack ...

XML Namespaces
- uniquely qualify markup from multiple sources (integration)

Resource Description Framework (RDF)
- define RDF triples and represent resource information in a graph structure

RDF Schema (RDFS)
- create hierarchies of classes and properties

Page 8: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Semantic Web Stack ...

Web Ontology Language (OWL)
- language to define vocabularies
- extends RDFS with more advanced features (e.g. cardinality)
- enables reasoning based on description logic

SPARQL
- query language to query any RDF-based data

Rule Interchange Format (RIF) and Semantic Web Rule Language (SWRL)
- describe relations that cannot be described in OWL

Page 9: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Semantic Web Stack ...

Unifying Logic
- logical reasoning (infer new facts and check consistency)

Proof
- explain logical reasoning steps

Cryptography
- protect RDF data via encryption
- validate the source of facts by digitally signing RDF data

Trust
- authentication of sources and trustworthiness of derived facts

User Interface
- user interfaces for semantic web applications

Page 10: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Resource Description Framework

The Resource Description Framework (RDF) has been designed to describe data and metadata
- about specific subjects
- structure of data sets
- relationships between bits of data

An RDF statement (triple) consists of three parts
- subject
- predicate (property)
- object (value)

{person-1, name, "Niklaus Wirth"}

subject: person-1, predicate: name, object: "Niklaus Wirth"
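
The triple structure above can be illustrated with a minimal Python sketch. The `Triple` type and the `match` helper are hypothetical names introduced for illustration only, and the second statement is an assumed extra fact that does not appear in the slides.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate (property) and object (value)."""
    subject: str
    predicate: str
    object: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples whose parts equal every argument that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


statements = [
    Triple("person-1", "name", "Niklaus Wirth"),             # from the slide
    Triple("person-1", "profession", "computer scientist"),  # assumed, not in the slides
]

# Look up the name of person-1 by fixing subject and predicate.
names = match(statements, subject="person-1", predicate="name")
print(names[0].object)
```

Leaving an argument as `None` acts as a wildcard, so `match(statements, subject="person-1")` returns every statement about `person-1`.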

Page 11: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Resource Description Framework ...

Subjects, predicates and objects are all resources

Resource
- anything that can be referenced by a URI

Literal
- non-structured data (e.g. String, Integer, ...); is also a resource
- a literal cannot be the subject of an RDF statement

Predicate
- relation between two resources or between a resource and a literal

RDF data is often stored in relational databases or so-called triplestores (e.g. Apache Jena)
- up to billions of triples

Page 12: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Graph

A set of RDF statements can be represented as a directed labelled graph
- note that in RDF we can only define statements about specific instances but not about generic concepts
- ontologies have to be used to define statements about generic concepts

[Figure: RDF graph in which the resource http://wise.vub.ac.be/beat-signer has the properties hasGivenName "Beat" and hasFamilyName "Signer"]

Page 13: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Graph ...

Anonymous resources have no explicit identifier
- in the example, the "office" is an anonymous resource
- anonymous resources are also called blank nodes or bnodes
- blank nodes can only be used as subjects or objects

[Figure: RDF graph with the resources http://wise.vub.ac.be, http://wise.vub.ac.be/beat-signer and http://wise.vub.ac.be/lode-hoste; the edge labels hasDirector, isMember, isColleague, hasGivenName, hasFamilyName and hasOffice; the literals "Beat", "Signer", "Lode" and "Hoste"; and an anonymous office node with room "10F733" and phone "026293306"]

Page 14: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Reification

An RDF triple is not a resource and can therefore not become the subject of another statement
- we have to reify the original statement
- make a resource out of the statement

[Figure: the RDF graph from the previous slide, here with the URIs http://wise.vub.ac.be/beat/ and http://wise.vub.ac.be/lode/, in which the isColleague statement is reified. The figure labels are rdf:subject, rdf:object, rdf:statement, rdf:type and rdf:Property, and the reified statement carries the additional property forYears with the value 1]
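
Reification as described above can be sketched in a few lines of Python. The sketch uses the standard RDF reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`). The `reify` helper, the blank node label `_:stmt1` and the direction of the `isColleague` statement are assumptions made for illustration, since the figure does not fix them.

```python
def reify(statement_id, triple):
    """Describe `triple` as a resource using the RDF reification vocabulary."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


BEAT = "http://wise.vub.ac.be/beat/"
LODE = "http://wise.vub.ac.be/lode/"

# Assumed direction: Beat isColleague Lode.
original = (BEAT, "isColleague", LODE)

# "_:stmt1" is a hypothetical blank node label for the reified statement.
graph = [original] + reify("_:stmt1", original)

# The reified statement can now be the subject of another statement.
graph.append(("_:stmt1", "forYears", "1"))

for triple in graph:
    print(triple)
```

One original triple therefore becomes four additional triples before anything can be said about it, which is the main cost of reification.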

Page 15: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Container Nodes

Special container resource types

bag
- number of unordered resources with potential duplicates

sequence
- ordered collection of resources

alternative
- one of the members can be selected

collection
- closed; once it has been defined, the members can no longer be changed

[Figure: the resource http://wise.vub.ac.be/beat-signer is linked via wearsShirt to an alternative container (labelled rdf:alternative) with the members http://shirt.org/shirt1 and http://shirt.org/shirt2 (rdf:_2)]
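
A container node can be written down as ordinary triples. The following sketch assumes the standard RDF container classes (`rdf:Bag`, `rdf:Seq`, `rdf:Alt`) and the membership properties `rdf:_1`, `rdf:_2`, ...; the `container` helper and the blank node label `_:shirts` are hypothetical.

```python
def container(node, kind, members):
    """Return triples describing an RDF container.

    kind is one of "rdf:Bag", "rdf:Seq" or "rdf:Alt"; members are attached
    with the membership properties rdf:_1, rdf:_2, ...
    """
    triples = [(node, "rdf:type", kind)]
    for index, member in enumerate(members, start=1):
        triples.append((node, f"rdf:_{index}", member))
    return triples


# "_:shirts" is a hypothetical blank node label for the container.
shirts = container(
    "_:shirts",
    "rdf:Alt",
    ["http://shirt.org/shirt1", "http://shirt.org/shirt2"],
)

graph = [("http://wise.vub.ac.be/beat-signer", "wearsShirt", "_:shirts")] + shirts

for triple in graph:
    print(triple)
```

Note that the container type only states the intended meaning (here, that one of the alternatives applies); nothing in the triples themselves enforces it.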

Page 16: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Advantages of RDF

Simple

Enables the combination (merging) of data from different data models
- not easily possible in a relational database (different schemas)

The same resource can be annotated by different people
- resource referenced by URI
- separation of data and metadata

Well-defined standard
- many tools available
- triplestores, parsers, editors, frameworks, ...

Page 17: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Schema (RDFS)

Vocabulary description language for RDF
- domain vocabulary and structure

Define common concepts and relationships
- classes (rdfs:Class) and subclasses (rdfs:subClassOf)
- properties and sub-properties (rdfs:subPropertyOf)
- domain (rdfs:domain) and range (rdfs:range) of a property
- rdfs:seeAlso, rdfs:isDefinedBy (utility properties)
- rdfs:label, rdfs:comment
- ...

Provides the basic elements for the definition of ontologies

Page 18: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Schema Example

[Figure: RDF Schema example graph. Instance level: the resources http://wise.vub.ac.be/beat-signer and http://wise.vub.ac.be/lode-hoste with the edge labels hasGivenName, hasFamilyName and isColleague and the literals "Beat", "Signer", "Lode" and "Hoste" (rdf:type rdfs:Literal). Schema level: Researcher, Person (rdfs:Class) and isColleague (rdf:Property), with the edge labels rdf:type, rdfs:subClassOf, rdfs:domain and rdfs:range]

Page 19: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Advantages of RDFS

With RDFS we have a richer expressiveness (e.g. subClassOf) than with RDF

Simple reasoning (e.g. type hierarchy)

Many existing tools to deal with RDFS

However, some things cannot be expressed; for example
- "a person must have a family name"
- "a person can have at most one family name" (cardinality)
- "if Beat is a colleague of Lode then Lode is a colleague of Beat" (symmetry)
- these issues are addressed by the Web Ontology Language (OWL)
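
The simple type-hierarchy reasoning that RDFS enables can be sketched as a small fixed-point computation. This is an illustrative sketch and not how a production reasoner works; the `infer_types` helper is hypothetical, and the statement that Researcher is a subclass of Person is an assumption based on the schema example.

```python
def infer_types(triples):
    """Apply two RDFS entailment rules until no new triples appear.

    A subClassOf B, B subClassOf C  =>  A subClassOf C
    x type A,       A subClassOf B  =>  x type B
    """
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if p not in ("rdf:type", "rdfs:subClassOf"):
                continue
            # Follow every subclass edge that starts at the object.
            for s2, p2, o2 in inferred:
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, p, o2))
        changed = not new <= inferred
        inferred |= new
    return inferred


BEAT = "http://wise.vub.ac.be/beat-signer"

facts = [
    (BEAT, "rdf:type", "Researcher"),
    ("Researcher", "rdfs:subClassOf", "Person"),  # assumed from the schema example
]

closure = infer_types(facts)
print((BEAT, "rdf:type", "Person") in closure)
```

The fact that Beat is a Person is never stated explicitly; it is derived from the class hierarchy. Constraints such as cardinality or symmetry have no counterpart in these rules, which is where OWL comes in.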

Page 20: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF(S) / XML Serialisation

Syntax not so easy to learn
- many different ways to construct the same statement
- long URIs are hard to read

{http://wise.vub.ac.be/beat-signer, isColleague, http://wise.vub.ac.be/lode-hoste}

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://wise.vub.ac.be/beat-signer">
    <isColleague rdf:resource="http://wise.vub.ac.be/lode-hoste"/>
    <hasGivenName>Beat</hasGivenName>
    ...
  </rdf:Description>
  ...
</rdf:RDF>
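
To show how such a document maps back to triples, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. It only handles flat `rdf:Description` blocks and is not a complete RDF/XML parser. The `ex:` namespace (`http://example.org/vocab#`) is a hypothetical vocabulary added so that the property elements are namespace-qualified; the slide leaves them unqualified.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The ex: namespace is an assumption, it does not appear in the slides.
DOCUMENT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/vocab#">
  <rdf:Description rdf:about="http://wise.vub.ac.be/beat-signer">
    <ex:isColleague rdf:resource="http://wise.vub.ac.be/lode-hoste"/>
    <ex:hasGivenName>Beat</ex:hasGivenName>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Yield (subject, predicate, object) for flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local".
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            yield subject, predicate, obj


triples = list(extract_triples(DOCUMENT))
for triple in triples:
    print(triple)
```

In practice an RDF framework would be used for parsing, since RDF/XML allows many equivalent ways of writing the same statement.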

Page 21: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Notation 3 (N3)

Short non-XML serialisation
- separate predicates with a semicolon
- finish subject definition with a full stop

Note that the N3 notation offers more features than are necessary for RDF(S) serialisation
- e.g. support for RDF-based rules

<http://wise.vub.ac.be/beat-signer>
  isColleague <http://wise.vub.ac.be/lode-hoste>;
  ...
  hasGivenName "Beat".
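
The two rules above (semicolon between predicates of the same subject, full stop at the end of a subject) can be sketched as a small serialiser. The `to_n3` function is a hypothetical helper, the `http://example.org/vocab#` predicate URIs are assumptions, and treating every value that starts with `http://` as a URI is a deliberate simplification (literals are not escaped).

```python
def to_n3(triples):
    """Serialise (subject, predicate, object) tuples in N3/Turtle style."""

    def term(value):
        # Simplification: anything that looks like a URI is written in <>,
        # everything else becomes a plain string literal.
        return f"<{value}>" if value.startswith("http://") else f'"{value}"'

    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))

    blocks = []
    for s, pairs in by_subject.items():
        body = " ;\n    ".join(f"<{p}> {term(o)}" for p, o in pairs)
        blocks.append(f"<{s}>\n    {body} .")
    return "\n".join(blocks)


VOCAB = "http://example.org/vocab#"  # hypothetical vocabulary namespace

data = [
    ("http://wise.vub.ac.be/beat-signer", VOCAB + "isColleague", "http://wise.vub.ac.be/lode-hoste"),
    ("http://wise.vub.ac.be/beat-signer", VOCAB + "hasGivenName", "Beat"),
]

n3 = to_n3(data)
print(n3)
```

Grouping by subject is what makes the notation shorter than listing each triple in full.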

Page 22: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Turtle Notation

Terse RDF Triple Language

Subset of the N3 language
- only describes RDF features (RDF graph model)

Syntax looks similar to Notation 3
- http://www.w3.org/TeamSubmission/turtle/

Many RDF frameworks (e.g. Jena) offer Turtle parser and serialisation features

Page 23: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF Applications

Annotea project
- defines an RDF schema for the types of annotations that can be used to annotate webpages

RSS
- some RSS versions use RDF(S) / XML serialisation

Dublin Core
- widely used to describe digital media (also in standard HTML)
- bibliographic metadata such as title, creator, description, ...
- uses RDF(S) / XML serialisation as one possible representation

<head>
  ...
  <meta name="DC.Subject" content="Interactive Paper, Cross-media ..."/>
  <meta name="DC.Description" content="Beat Signer does research on ..."/>
</head>

Page 24: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

SPARQL Query Language

RDF query language which can be used to extract information as URIs, literals, blank nodes or subgraphs
- SPARQL SELECT queries return variable bindings
- SPARQL querying relies on graph pattern matching

Example
- get the name and mbox of all subjects that have both of these properties defined

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name .
        ?x foaf:mbox ?mbox }
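
Graph pattern matching can be made concrete with a small Python sketch that evaluates the two triple patterns of the query against an in-memory list of triples. This is a simplified illustration of the idea, not a SPARQL engine; the `match_pattern` and `select` helpers, the blank node labels and the example mailbox are all hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`.

    Pattern terms starting with "?" are variables. Returns the extended
    binding, or None if the triple does not fit.
    """
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(variables, patterns, triples):
    """Evaluate a basic graph pattern and project the requested variables."""
    bindings = [{}]
    for pattern in patterns:
        extended_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return [tuple(b[v] for v in variables) for b in bindings]


# Hypothetical data: only the first subject has both properties.
data = [
    ("_:a", "foaf:name", "Beat Signer"),
    ("_:a", "foaf:mbox", "mailto:beat@example.org"),
    ("_:b", "foaf:name", "Lode Hoste"),
]

rows = select(
    ["?name", "?mbox"],
    [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")],
    data,
)
print(rows)
```

Because both patterns share the variable `?x`, a subject only contributes a row if it has a name and a mailbox, which is exactly the condition stated in the example.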

Page 25: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Web Ontology Language (OWL)

OWL evolved from DAML+OIL
- DAML is the DARPA Agent Markup Language
- OIL stands for Ontology Inference Layer

There exist 3 different OWL sublanguages (flavours) with different expressiveness

OWL Full
- maximum expressiveness (full language)
- no computational guarantee

OWL DL
- maximal OWL Full subset that is still computationally decidable

OWL Lite
- classification hierarchy and simple constraints (limited cardinality constraints)
- weakest of the three variants

Page 26: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Jena Semantic Web Framework

Open source Semantic Web framework for Java
- create and access data from RDF graphs via an RDF API
- offers an OWL API
- data can be stored in files, databases or accessed via URLs
- http://jena.sourceforge.net

RDF graphs can be serialised into different formats
- RDF/XML
- Notation 3
- Turtle
- relational database

SPARQL query interface

Multiple reasoners

Page 27: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Protégé

Free open source platform to create, manipulate and visualise ontologies

Two modelling tools

Protégé-Frames editor
- build and populate frame-based ontologies
- Java API for plug-ins

Protégé-OWL editor
- build Semantic Web ontologies

Page 28: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Swoogle

Search engine for semantic web data (RDF)
- ontologies
- instance data
- single terms

Ranking of semantic web documents
- inspired by Google's PageRank

Developed at the University of Maryland, Baltimore County (UMBC)
- http://swoogle.umbc.edu

Page 29: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Friend of a Friend (FOAF)

Personal information and connections to friends in RDF
- http://www.foaf-project.org

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Beat Signer</foaf:name>
    <foaf:title>Prof.</foaf:title>
    <foaf:givenname>Beat</foaf:givenname>
    <foaf:family_name>Signer</foaf:family_name>
    <foaf:nick>Beat</foaf:nick>
    <foaf:mbox_sha1sum>ce6d419869307d57839feef6445a9d64f784eb36</foaf:mbox_sha1sum>
    ...
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Moira C. Norrie</foaf:name>
        <foaf:mbox_sha1sum>4cb61b36a6feaa48c78acbb51fcce7cb356afdd6</foaf:mbox_sha1sum>
        <rdfs:seeAlso rdf:resource="http://www.globis.ethz.ch/people/norrie.rdf"/>
      </foaf:Person>
    </foaf:knows>
    ...
  </foaf:Person>
</rdf:RDF>

Page 30: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Friend of a Friend (FOAF) ...

First social Semantic Web application
- Miller and Brickley, 2000

Describe a social network without a central database
- links can be followed by spiders (data mining)
- no unique identifier
- identification by description (predicates and objects)
- "six degrees of separation" or "small world phenomenon"

[Figure: FOAFNaut browser, http://rdfweb.org/images/foaf/foafnaut-screenshot-path.jpg]

Page 31: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Semantic Wikis

Use Semantic Web technologies to provide machine-processable Wiki content
- page content
- link metadata

Ontology reasoning
- much richer query interface

Existing semantic Wikis
- DBPedia
- Semantic MediaWiki
- ...

Page 32: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Linked Data

Link different data sources (URIs) on the Web
- provide metadata about the resources via RDF/XML, N3, etc.
- provide links to resources in other data sets on the Web

Linked Open Data community project
- RDF triples from DBPedia, GeneID, ACM, etc. (>30 billion triples)
- links between those triples (>500 million links)

[Figure: Linked Open Data cloud diagram, http://lod-cloud.net/versions/2014-08-30/lod-cloud.svg]

Page 33: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Semantic Desktops

Apply Semantic Web technologies to personal information management (PIM)
- inter-application data sharing
- enhancement of limited filesystem functionality
- add document metadata

Examples
- Haystack
- Nepomuk

[Figure: Nepomuk integration with Dolphin (KDE 4.0)]

Page 34: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

GoodRelations

Lightweight ontology for expressing product information in e-commerce web applications

Product features
- offers
- prices
- units
- ...

Adopted by various companies
- Yahoo
- BestBuy
- ...

Leads to enhanced product search functionality

Page 35: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Microformats

Add semantics to (X)HTML pages

Makes use of specific (X)HTML tag attributes
- class and rel attributes
- e.g. rel="nofollow" for search engines

Specific microformats
- hCard: contact information
- hCalendar: event information
- hProduct: product information

Alternative solutions
- semantic web (RDFa)

GRDDL
- Gleaning Resource Descriptions from Dialects of Languages
- can convert from microformats to semantic web data (RDF)

Page 36: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

hCard Microformat Example

Some search engines (e.g. Google and Yahoo) have started to pay attention to different types of microformats

<head profile="http://www.w3.org/2006/03/hcard">...</head>
...
<div class="vcard">
  <div class="fn">Lode Hoste</div>
  <div class="org">Vrije Universiteit Brussel</div>
  <div class="tel">32 2629 3306</div>
  <a class="url" href="http://wise.vub.ac.be/members/lode-hoste">http://wise.vub.ac.be/members/lode-hoste</a>
</div>
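
How a crawler might read such an hCard can be sketched with Python's standard `html.parser` module. The `HCardParser` class is a hypothetical helper that only looks at the four class names used in the example (`fn`, `org`, `tel`, `url`); it does not implement the full hCard specification.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is a known hCard property."""

    PROPERTIES = {"fn", "org", "tel", "url"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None  # hCard property of the element being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.PROPERTIES:
                self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.card[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


PAGE = """
<div class="vcard">
  <div class="fn">Lode Hoste</div>
  <div class="org">Vrije Universiteit Brussel</div>
  <div class="tel">32 2629 3306</div>
  <a class="url" href="http://wise.vub.ac.be/members/lode-hoste">
    http://wise.vub.ac.be/members/lode-hoste</a>
</div>
"""

parser = HCardParser()
parser.feed(PAGE)
print(parser.card)
```

The markup stays ordinary (X)HTML for browsers; only the agreed class names carry the extra meaning.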

Page 37: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

RDF in Attributes (RDFa)

Add a set of attribute extensions to XHTML for embedding RDF metadata

Different vocabularies
- FOAF, video, audio, commerce, …

Search engines (e.g. Yahoo and Google) process certain RDFa metadata (e.g. product information)

<p xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://www.amazon.com/...">
  and the will to live. <span property="dc:creator">Simpson</span>
  dedicates the book <cite property="dc:title">Touching the Void</cite> to
  the... The book was published in
  <span property="dc:date" content="1989-12-01">December 1989</span>.
</p>
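
The way the `property` and `content` attributes turn visible text into metadata can be sketched with Python's standard `html.parser` module. The `RDFaPropertyParser` class is a hypothetical helper covering only these two attributes, not the full RDFa processing rules, and the `about` URL in the sample is a placeholder (`http://example.org/book`) because the original URL is elided.

```python
from html.parser import HTMLParser


class RDFaPropertyParser(HTMLParser):
    """Collect (property, value) pairs from elements with a property attribute.

    The value is the content attribute if present, otherwise the element text.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if attributes.get("content") is not None:
            self.pairs.append((prop, attributes["content"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        if self._pending is not None and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None

    def handle_endtag(self, tag):
        self._pending = None


# The about URL is a placeholder; the slide elides the real one.
SNIPPET = """
<p xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://example.org/book">
  <span property="dc:creator">Simpson</span> dedicates the book
  <cite property="dc:title">Touching the Void</cite>.
  The book was published in
  <span property="dc:date" content="1989-12-01">December 1989</span>.
</p>
"""

parser = RDFaPropertyParser()
parser.feed(SNIPPET)
print(parser.pairs)
```

The date shows why the `content` attribute exists: readers see "December 1989" while machines get the unambiguous value 1989-12-01.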

Page 38: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Microdata

Add machine readable metadata (semantics) to HTML5 documents in the form of key/value pairs
- can be used by crawlers, search engines (SEO) and browsers to provide a richer browsing experience
- alternative to Microformats and RDFa
- W3C Working Draft

<section itemscope itemtype="http://data-vocabulary.org/Person">
  Hello, my name is <span itemprop="name">Beat Signer</span> and I am a
  <span itemprop="title">Professor</span> at the
  <span itemprop="affiliation">Vrije Universiteit Brussel. </span>
  <section itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
    My address is:
    <span itemprop="street-address">Pleinlaan 2</span>,
    <span itemprop="postal-code">1050 </span>
    <span itemprop="locality">Brussels</span>,
    <span itemprop="country-name">Belgium</span>.
  </section>
</section>

Page 39: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

Exercise 9

Semantic Web
- working with linked data

Page 40: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

References

Tim Berners-Lee, James Hendler and Ora Lassila, The Semantic Web, Scientific American Magazine, May 2001
- http://www.scientificamerican.com/article.cfm?id=the-semantic-web

The Future Internet: Service Web 3.0
- http://www.youtube.com/watch?v=off08As3siM

Resource Description Framework (RDF)
- http://www.w3.org/RDF/

Thomas B. Passin, Explorer's Guide to the Semantic Web, Manning Publications, March 2004

Page 41: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

References ...

Linked Data http://linkeddata.org

Page 42: Semantic Web - Lecture 09 - Web Information Systems (4011474FNR)

2 December 2005

Next Lecture

Web Search