Semantic Web and Linked Data
TRANSCRIPT
2 | 70
Content
Semantic Web
Ontology
RDF, URI
Linked Data
RDFa, Schema.org
* Social Semantics
00
Semantic Web
4 | 70
Growth of the Web 01 – The amount of information available on the Web is growing rapidly.
– A February 2014 survey found at least 920,120,079 sites (http://news.netcraft.com/archives/category/web-server-survey/).
5 | 70
The.. 02
6 | 70
Semantic 03
7 | 70
Semantic Web 04
8 | 70
Inside the Web 05
9 | 70
Links among Objects 06
10 | 70
Querying on the Web, searching the Web 07 Show me the photos of Peter that Eva likes.
Play me the music Eva listened to yesterday on Last.fm.
11 | 70
Definition of Semantic Web 08 The Semantic Web is an extension of the Web through standards by the World Wide Web Consortium (W3C).
The standards promote common data formats and exchange protocols on the Web, most fundamentally the Resource Description Framework (RDF). According to the W3C, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries". The term was coined by Tim Berners-Lee for a web of data that can be processed by machines. While its critics have questioned its feasibility, proponents argue that applications in industry, biology and human sciences research have already proven the validity of the original concept.
Me.. again!!
https://en.wikipedia.org/wiki/Tim_Berners-Lee
2007
Process-able? Understand-able?
shared and reused ≒ interoperable
W3C standard
12 | 70
Web vs Semantic Web 09 With HTML mark-ups, Web is about displaying the data
Semantic web is about the meaning of the data
13 | 70
Semantic Web Data Representation 10
Me.. again!!
https://en.wikipedia.org/wiki/Tim_Berners-Lee
https://en.wikipedia.org/wiki/World_Wide_Web_Consortium
https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
https://en.wikipedia.org/wiki/RDF
https://en.wikipedia.org/wiki/Semantic_Web
https://en.wikipedia.org/wiki/British_Empire
create
create
entitle
description
Standard
standardize standardize
The Semantic Web is an extension of the Web through standards by the World Wide Web Consortium (W3C). The standards promote common data formats and exchange protocols on the Web, most fundamentally the Resource Description Framework (RDF). According to the W3C, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries". The term was coined by Tim Berners-Lee for a web of data that can be processed by machines. While its critics have questioned its feasibility, proponents argue that applications in industry, biology and human sciences research have already proven the validity of the original concept.
14 | 70
Vision of Semantic Web
The ultimate destination is the realization of artificial intelligence agents
11
Siri
Ok Google
IBM Watson
15 | 70
Graph Data Model with Shared Data schema 12
Car information data in a database
Table Structure (Data Schema)
Data Schema on the Web
Data on the Web or local
16 | 70
Self describing Data with Shared Data Schema 13
Ok, got it. Ok, got it.
Ok, got it.
Ok, got it.
Hyundai Grandeur
17 | 70
Graph Data could be diverse 14
Hyundai Avante
Chevrolet Cruze
18 | 70
Graph Data could be entailed 15 Inference, Reason, Entail unstated statements
Vehicle class: Large
Fuel-efficiency grade: Grade 3
Hyundai Grandeur
Hyundai Grandeur
Hyundai Grandeur
Entailed Statements
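The entailment idea on this slide can be sketched as forward chaining over triples. This is a minimal sketch, not the slide's actual rule set (which is not shown): the class names `ex:LargeCar`, `ex:Car`, `ex:Vehicle` and the two RDFS-style rules are assumptions chosen for illustration.

```python
# Triples are (subject, predicate, object) tuples; all names are illustrative.
facts = {
    ("ex:Hyundai_Grandeur", "rdf:type", "ex:LargeCar"),
    ("ex:LargeCar", "rdfs:subClassOf", "ex:Car"),
    ("ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
}


def entail(triples):
    """Apply two RDFS-style rules until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                if o != s2 or p2 != "rdfs:subClassOf":
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "rdfs:subClassOf":
                    new.add((s, "rdfs:subClassOf", o2))
                # Rule 2: an instance of a class is also an instance of its superclasses.
                if p == "rdf:type":
                    new.add((s, "rdf:type", o2))
        if new <= closure:
            return closure
        closure |= new


entailed = entail(facts) - facts  # statements that were never written down
for triple in sorted(entailed):
    print(triple)
```

The printed triples are the "unstated" statements: nobody asserted that the Grandeur is a vehicle, but it follows from the shared schema.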
19 | 70
Goal of Semantic Web 16
20 | 70
Semantic Web Layer Cake 17
Data Description Structure
Encoding, Addressing
Common Syntax
Vocabularies
Querying Language
Shared Meanings (Ontology)
Rule Description
Establish logical truth, Infer Unstated statements
Trustworthiness of statements
Ontology, RDF, URI
22 | 70
Semantic Web Data Representation 01
23 | 70
HTML, XML 02
In HTML In XML
24 | 70
XML, RDF, Ontology, URI 03
XML describes Data
<fact> <person>Tim_Berners-Lee</person> <behavior>create</behavior> <object>Semantic_Web</object> </fact>
RDF supports Logical Description
<rdf:Description rdf:about="Tim_Berners-Lee"> <create>Semantic Web</create> </rdf:Description>
Ontology explains Meanings
<rdf:Description rdf:about="https://en.wikipedia.org/wiki/Tim_Berners-Lee"> <scientificBehavior:create rdf:resource="https://en.wikipedia.org/wiki/Semantic_Web"/> </rdf:Description>
URI
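To see how the last form yields a machine-readable statement, the sketch below reads a namespace-qualified version of that snippet with Python's standard XML parser and prints the resulting triple. It is only a sketch: the namespace URI bound to `scientificBehavior` is a placeholder, and the function handles just this simple `rdf:Description` pattern, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The namespace URI for scientificBehavior is a placeholder, not a real vocabulary.
doc = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:scientificBehavior="http://www.example.com/scientificBehavior#">
  <rdf:Description rdf:about="https://en.wikipedia.org/wiki/Tim_Berners-Lee">
    <scientificBehavior:create rdf:resource="https://en.wikipedia.org/wiki/Semantic_Web"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples_from_rdfxml(text):
    """Yield (subject, predicate, object) for the simple Description pattern only."""
    root = ET.fromstring(text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            # Use the linked resource if present, otherwise the element text.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            yield subject, namespace + local, obj


triples = list(triples_from_rdfxml(doc))
print(triples)
```

Because subject, predicate, and object are all full URIs, a program that has never seen this page can still tell which person, which relation, and which thing are meant.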
25 | 70
Def. of RDF, Ontology, URI 04
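RDF: the standard data format for expressing data and ontologies on the Semantic Web
Ontology: a set of concepts expressed so that they can be shared
URI: a method, or a name, for referring to a resource or object that can be represented on the Web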
26 | 70
Ontology
Ontology
– In philosophy: ontology is the study of the nature of being, becoming, existence, or reality, as well as the basic categories of being and their relations.
– In computer science: an ontology is an agreement about a shared, formal, explicit and partial account of a conceptualisation.
Heterogeneous/ Interoperability
05
Concept
Symbol Referent
Concept: what is held in the mind (the idea)
Symbol: the concept expressed in language (a word)
Referent: the object the concept points to (the extension)
Meaning
27 | 70
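Components of an Ontology
1. Class - A class is the name we generally give to things or concepts: "keyboard", "monitor", "car"
2. Instance - An instance refers to a concrete thing, event, or other actual form of a thing or concept - "LG Electronics ST-500 Ultra Slim Keyboard", "Samsung SyncMaster Wide LCD Monitor", and "the love of Romeo and Juliet" can generally be regarded as instances. Where the line between class and instance falls depends on the application and the purpose of use
3. Property - A property links a class or instance to a specific value in order to express a particular characteristic or tendency of that class or instance - For example, to express "the size of the Samsung SyncMaster Wide LCD Monitor is XX inches", a property such as Size can be defined
4. Relation (Object Property) - A relation refers to a relationship that exists between a class and an instance, or between two instances - For example, when expressing information such as Sonata-brand-Hyundai Motor, a property whose value is an instance, such as brand, is called a Relation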
06
28 | 70
Ontology Example 1 07
29 | 70
Ontology Example 2 08
Brand
Model Name
Publish Year
Standard Price
Warranty Miles
Service Center
Phone Number
30 | 70
Modeling Ontology 101 09
Ontology Modeling
- “Ontology Development 101”, Natasha Noy, Stanford University
- http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
FOAF (Friend of a friend)
- http://www.foaf-project.org/
31 | 70
Around Ontology, Modeling 10
Ontology Editor - http://protege.stanford.edu/
Querying Language - SPARQL (W3C Recommendation)
Ontology Description Language - RDF, OWL
Online Ontology Editor - http://vowl.visualdataweb.org/webvowl/
32 | 70
RDF 11
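RDF stands for
Resource: anything that has a URI (web pages, images, videos, etc.)
Description: a description of the attributes, characteristics, and relations of resources
Framework: the model, language, and syntax for describing the above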
33 | 70
URI 12
URL : Uniform Resource Locator
URN : Uniform Resource Name
URI (Uniform Resource Identifier): a unique address that identifies a resource on the Internet
34 | 70
RDF Data Model 13
Subject Predicate Object
Subject (Resource): URI or blank node
Predicate (Property, Relation): URI
Object (Resource, Literal): URI, blank node, or literal
RDF has a graph model.
http://dbpedia.org/resource/Billie_Jean has a singer whose value is http://dbpedia.org/resource/Michael_Jackson
• Subject: http://dbpedia.org/resource/Billie_Jean (URI)
• Predicate: http://www.example.com/terms/singer (URI)
• Object: http://dbpedia.org/resource/Michael_Jackson (URI)
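The subject, predicate, and object roles can be sketched as a small data structure. This is an illustrative sketch only: the class names `URI`, `Literal`, `BlankNode`, and `Triple` and the helper `to_ntriples` are made up here, and the serializer does no escaping, so it is not a conforming N-Triples writer.

```python
from typing import NamedTuple, Union


class URI(str):
    """A node or predicate identified by a URI."""


class Literal(str):
    """A plain value such as a name or a date."""


class BlankNode(str):
    """A node without a global identifier."""


class Triple(NamedTuple):
    subject: Union[URI, BlankNode]
    predicate: URI
    obj: Union[URI, BlankNode, Literal]


billie_jean = Triple(
    URI("http://dbpedia.org/resource/Billie_Jean"),
    URI("http://www.example.com/terms/singer"),
    URI("http://dbpedia.org/resource/Michael_Jackson"),
)


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as an N-Triples-like line (no escaping handled)."""
    def term(x):
        if isinstance(x, URI):
            return f"<{x}>"
        if isinstance(x, BlankNode):
            return f"_:{x}"
        return f'"{x}"'
    return f"{term(t.subject)} {term(t.predicate)} {term(t.obj)} ."


print(to_ntriples(billie_jean))
```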
35 | 70
RDF is Graph Model 14
http://dbpedia.org/resource/Billie_Jean
http://www.example.com/terms/singer
1983-01-02
http://www.example.com/terms/released
http://dbpedia.org/resource/Michael_Jackson
36 | 70
Graph Model Expandable 15
http://dbpedia.org/resource/Billie_Jean
http://www.example.com/terms/singer
Michael_Jackson
1983-01-02
http://www.example.com/terms/released
http://dbpedia.org/resource/Michael_Jackson
http://www.example.com/terms/name
http://dbpedia.org/resource/USA
http://www.example.com/terms/bornIn
37 | 70
Timeline of Semantic Web Languages 16
Ontology (information science)
RDF (W3C WD)
1997.08
XML (W3C WD)
RDF Schema (W3C WD)
OWL (W3C WD)
SPARQL (WD)
DAML
OIL (Europe IST Project)
DAML+OIL
1970s
1996.11
1998.04 1999 2000
1999 2002
2004.10
38 | 70
RDF is Database? 17
Ontology = Schema
RDF = Data
URI = Key value
Database ?
39 | 70
RDF as a Database 18
Movie Ontology (Shared Schema)
subClassOf
Artifact
Project Music
Singer
hasSinger
Matrix
Director
Du Hast
Wachowski brothers Rammstein
isA isA
isA
isA
hasDirector
hasDirector hasPlayer RDF Data
Movie Database
URI URI URI URI
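Treating RDF as a database means querying it by triple pattern, which is the basic idea behind SPARQL. The sketch below is a toy matcher, not SPARQL itself; the triples are an assumed reading of the figure (for example that Matrix is a Movie and Du Hast is Music), and plain strings stand in for URIs.

```python
# Schema (ontology) and data live in the same triple store; names are illustrative.
store = [
    ("Movie", "subClassOf", "Artifact"),
    ("Music", "subClassOf", "Artifact"),
    ("Matrix", "isA", "Movie"),
    ("Du Hast", "isA", "Music"),
    ("Wachowski brothers", "isA", "Director"),
    ("Rammstein", "isA", "Singer"),
    ("Matrix", "hasDirector", "Wachowski brothers"),
    ("Du Hast", "hasSinger", "Rammstein"),
]


def match(pattern, triples):
    """Return variable bindings for one triple pattern; variables start with '?'."""
    results = []
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            results.append(binding)
    return results


print(match(("?movie", "hasDirector", "?who"), store))
```

The pattern plays the role of a query, the ontology plays the role of the schema, and the URIs (plain names here) play the role of keys.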
40 | 70
Web as a Database 19
Movie data #3
Person data #1
Standard Schema
Web of Data
Web of Data
Movie data #1
Movie data #2
41 | 70
Web of Data 20
Universally Opened
Semantically Expandable
Easily Re-usable
Database on the Web
Web of Data
Linked Data, RDFa, Schema.org
43 | 70
Linking Open Data Project 01
Building a “Web of Data” to enhance the current Web
The Linking Open Data (LOD) project: http://linkeddata.org/
Translating existing datasets into RDF and linking them together.
For example, DBpedia (Wikipedia), GeoNames, Freebase, BBC programmes, etc.
Government data is also available as Linked Data: DATA.gov, DATA.gov.uk
Shared Resources Portal (http://data.go.kr) by the Internet Informatization Promotion Agency
Seoul Open Data Plaza (http://data.seoul.go.kr) by the Seoul city informatization project group
44 | 70
Linked Data Cloud 02
2008
2007
45 | 70
Linked Data Cloud Expanded 03
2011
2009
2010
Me.. again!!
‘5-star Open Data’
46 | 70
Linked Data Cloud Now 04
47 | 70
Linked Data Data 05
- DBLP Bibliography - provides bibliographic information about scientific papers; it contains about 800,000 articles, 400,000 authors, and approx. 15 million triples
- GeoNames provides RDF descriptions of more than 6,500,000 geographical features worldwide.
- riese - serving statistical data about 500 million Europeans (the first linked dataset deployed with XHTML+RDFa)
- Sensorpedia - A scientific initiative at Oak Ridge National Laboratory using a RESTful web architecture to link to sensor data and related sensing systems.
- BBC Things - Data on the places, people and organisations that appear in BBC programmes and online content.
48 | 70
What they do with LOD
The Bio2RDF Cloud
06
49 | 70
What they do with LOD
DBtune Slashfacet
07 • Visualizes music-related Linked Data
• Uses LastFM, MySpace, and BBC data
50 | 70
What they do with LOD
Where-are-jobs
08 http://www.symsoftsolutions.com/award/2011-where-are-jobs
51 | 70
What they do with LOD
Google Knowledge Graph
09
52 | 70
Origin of Linked Data
Wikipedia to DBPedia
10
2007
2014
53 | 70
Wikipedia.org 11
54 | 70
Wikipedia.org
Knowledge of Crowd
12
55 | 70
DBpedia 13
DBpedia is a community effort to
extract structured (“infobox”) information from Wikipedia
provide a SPARQL endpoint to the dataset
interlink the DBpedia dataset with other datasets on the Web
56 | 70
Extracting structured data from Wikipedia 14
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbterm: <http://dbpedia.org/property/> .
dbpedia:Amsterdam
dbterm:officialName "Amsterdam" ;
dbterm:longd "4" ;
dbterm:longm "53" ;
dbterm:longs "32" ;
...
dbterm:leaderName dbpedia:Job_Cohen ;
...
dbterm:areaTotalKm "219" ;
...
dbpedia:ABN_AMRO
dbterm:location dbpedia:Amsterdam ;
...
as
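The mapping from an infobox to triples can be sketched as follows. This is a simplified illustration and not DBpedia's actual extraction code: the `infobox` dictionary is a hypothetical, already-parsed input, and only the `[[wiki link]]` convention is handled.

```python
DBPEDIA = "http://dbpedia.org/resource/"
DBTERM = "http://dbpedia.org/property/"

# Hypothetical, already-parsed infobox; real extraction from wiki markup is more involved.
infobox = {
    "officialName": "Amsterdam",
    "leaderName": "[[Job Cohen]]",
    "areaTotalKm": "219",
}


def infobox_to_triples(page_title, fields):
    """Map each infobox field to one (subject, predicate, object) triple."""
    subject = DBPEDIA + page_title.replace(" ", "_")
    triples = []
    for key, value in fields.items():
        if value.startswith("[[") and value.endswith("]]"):
            # A wiki link becomes a link to another resource.
            obj = DBPEDIA + value[2:-2].replace(" ", "_")
        else:
            # Anything else is kept as a plain literal string.
            obj = value
        triples.append((subject, DBTERM + key, obj))
    return triples


for triple in infobox_to_triples("Amsterdam", infobox):
    print(triple)
```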
57 | 70
Automatic links among open datasets 15
<http://dbpedia.org/resource/Amsterdam>
owl:sameAs <http://rdf.freebase.com/ns/...> ;
owl:sameAs <http://sws.geonames.org/2759793> ;
...
<http://sws.geonames.org/2759793>
owl:sameAs <http://dbpedia.org/resource/Amsterdam> ;
wgs84_pos:lat "52.3666667" ;
wgs84_pos:long "4.8833333" ;
geo:inCountry <http://www.geonames.org/countries/#NL> ;
...
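One thing an application can do with `owl:sameAs` links is merge what different datasets say about the same thing. The sketch below groups equivalent URIs with a small union-find and collects their properties under one representative. It is an illustration under assumptions: the three facts are taken from the snippets in this deck, and real `owl:sameAs` reasoning involves more than merging property maps.

```python
same_as = [
    ("http://dbpedia.org/resource/Amsterdam", "http://sws.geonames.org/2759793"),
]
facts = [
    ("http://dbpedia.org/resource/Amsterdam", "dbterm:areaTotalKm", "219"),
    ("http://sws.geonames.org/2759793", "wgs84_pos:lat", "52.3666667"),
    ("http://sws.geonames.org/2759793", "wgs84_pos:long", "4.8833333"),
]

parent = {}


def find(uri):
    """Return the representative URI of the group that uri belongs to."""
    parent.setdefault(uri, uri)
    while parent[uri] != uri:
        parent[uri] = parent[parent[uri]]  # path halving keeps chains short
        uri = parent[uri]
    return uri


for a, b in same_as:
    parent[find(b)] = find(a)  # merge the two groups

# Collect every property under the representative of its subject.
merged = {}
for s, p, o in facts:
    merged.setdefault(find(s), {})[p] = o

print(merged)
```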
58 | 70
Inlink to DBpedia 16
Center of Linked Data
DBpedia is linked to from a variety of datasets. The total number of inlinks is 39,007,478.
59 | 70
Linked Data is to Replace the Web? 17
No. It is meant to reorganize a small portion of the Web.
LOD
60 | 70
Rich Snippets
INTO RDF from Text Page
19
[ rdf:type schema:Review ;
schema:name "Oscars 2012: The Artist, review" ;
schema:description "The Artist, an utterly beguiling…" ;
schema:ratingValue "5" ;
…
]
Search engines add text under results to preview what’s on the page and why it’s relevant.
The text is often extracted from structured data embedded on the page.
61 | 70
RDFa / MicroData / Microformat 20
We’d like to add semi-structured knowledge to a conventional HTML document
– Humans can see and understand the regular HTML content (text, images, videos, audio)
– Machines can see and understand the data markup in XML, RDF or some other format
Possibilities include
– Add a link to a separate document with the knowledge
– Embed the knowledge as comments, JavaScript, etc.
– Distribute the knowledge markup throughout the HTML as attributes of existing HTML tags
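The third possibility, knowledge carried as attributes on ordinary HTML tags, is the approach RDFa takes. The sketch below pulls `property` values out of a hypothetical page fragment with Python's standard HTML parser. It is not a conforming RDFa processor: it ignores nesting, `vocab` resolution, and `content` attributes, and the fragment itself is invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the property names follow Schema.org's Review type.
page = """
<div vocab="http://schema.org/" typeof="Review">
  <h1 property="name">Oscars 2012: The Artist, review</h1>
  <span property="ratingValue">5</span> stars
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect the text of every element that carries a property attribute."""

    def __init__(self):
        super().__init__()
        self.current = None  # property whose text is currently being read
        self.found = {}

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self.current = prop
            self.found[prop] = ""

    def handle_data(self, data):
        if self.current:
            self.found[self.current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self.current = None


extractor = PropertyExtractor()
extractor.feed(page)
extractor.close()
print(extractor.found)
```

A human reader sees a headline and a star rating; the parser sees a review name and a rating value, taken from the same markup.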
62 | 70
63 | 70
65 | 70
Use of RDFa 21
66 | 70
Where are they now?
Intelligent Agent
22
IBM Watson, Exobrain Project
EOD
Social Semantics
69 | 70
Web 2.0 is about The Social Web
“Web 2.0 Is Much More About A Change In People and Society Than Technology” - Dion Hinchcliffe, tech blogger
• 1 billion people connect to the Internet
• 100 million web sites
• over a third of adults in the US have contributed content to the public Internet
- 18% of adults over 65
70 | 70
Potential Roles for Semantic Web Technology
• Composing and integrating user-contributed data across applications
– example: tagging data
• Creating aggregate value from a mix of structured and unstructured data
– example: blogging data
• Ontology of Folksonomy: applications that use tag data from multiple systems
– tag search across multiple sites
– collaboratively filtered search
• “find things using tags my buddies say match those tags”
– combine tags with structured query
• “find all hotels in Spain tagged with ‘romantic’”
71 | 70
Core concepts
• Term – a word or phrase that is recognizable by people and computers
• Document – a thing to be tagged, identifiable by a URI or a similar naming service
• Tagger – someone or thing doing the tagging, such as the user of an application
• Tagged – the assertion by Tagger that Document should be tagged with Term
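These four concepts map naturally onto a small record type. The sketch below models a Tagged assertion and answers a "tags my buddies used" query; the users, documents, and the `buddies` table are all invented for illustration, and a real system would take them from vocabularies such as FOAF and SIOC.

```python
from typing import NamedTuple


class Tagged(NamedTuple):
    """The assertion by a tagger that a document should carry a term."""
    tagger: str
    document: str  # URI of the tagged thing
    term: str


# Hypothetical tagging data gathered from several applications.
taggings = [
    Tagged("alice", "http://www.example.com/hotels/1", "romantic"),
    Tagged("bob", "http://www.example.com/hotels/2", "romantic"),
    Tagged("carol", "http://www.example.com/hotels/1", "noisy"),
]
buddies = {"me": {"alice", "carol"}}


def tagged_by_buddies(user, term):
    """Find documents that the user's buddies tagged with the given term."""
    friends = buddies.get(user, set())
    return sorted({t.document for t in taggings
                   if t.term == term and t.tagger in friends})


print(tagged_by_buddies("me", "romantic"))
```

Because documents are identified by URI, taggings collected from different sites can be pooled into the same list without any per-site conversion.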
72 | 70
SKOS : Term
• SKOS – Simple Knowledge Organisation System(s)
– W3C 2nd Public Working Draft – work in progress!
– (probably Recommendation within 1-2 years).
• SKOS is about declaring and publishing taxonomies, thesauri or classification schemes, for use in a distributed, decentralised information system (i.e. a semantic web). – The application of library science to the semantic web.
73 | 70
FOAF : Tagger
• Friend of a Friend Project - an OWL vocabulary for describing social networks
• Some centralized social networking sites, like Ecademy and LiveJournal, and other sites with network data, like the Dean Campaign, output their data in FOAF format
• Currently, there are millions of FOAF files and users. The count has recently risen by orders of magnitude, and estimates put the total over 5,000,000
74 | 70
SIOC : Document
• Semantically-Interlinked Online Communities Project
• SIOC provides methods for interconnecting discussion methods such as blogs, forums and mailing lists to each other.
• It consists of the SIOC ontology, an open-standard machine readable format for expressing the information contained both explicitly and implicitly in Internet discussion methods, of SIOC metadata producers for a number of popular blogging platforms and content management systems, and of storage and browsing/searching systems for leveraging this SIOC data.
75 | 70
SIOC + FOAF + SKOS : Tagged