LOD2 Introduction jordse@gmail.com Seoul National University BIKE Lab
![Page 2: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/2.jpg)
Creating Knowledge out of Interlinked Data
Introduction
FP7 project (2010-2014)
15 partners (technology researchers, companies and service providers) from 11 European countries plus 1 associated partner from Korea
Coordinated by the AKSW research group at the University of Leipzig
![Page 3: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/3.jpg)
The emerging Web of Data: achievements and challenges
• Web - a global, distributed platform for data, information and knowledge integration
• Exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF
Achievements
1. Extension of the Web with a data commons (currently amounting to 25 billion facts)
2. Vibrant, global RTD community
3. Industrial uptake begins (e.g. BBC, Thomson Reuters, Eli Lilly)
4. Emerging governmental adoption in sight
5. Establishing Linked Data as a deployment path for the Semantic Web
Challenges
1. Coherence: relatively few, expensively maintained links
2. Quality: partly low-quality data and inconsistencies
3. Performance: still substantial penalties compared to relational databases
4. Data consumption: large-scale processing, schema mapping and data fusion still in their infancy
5. Usability: missing direct end-user tools and network effect
[Figure: snapshots dated July 2007, April 2008, September 2008 and July 2009]
![Page 4: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/4.jpg)
Why Linked Open Data?
Problem: Try to search for these things on the current Web:
• Apartments near German-Russian bilingual childcare in Leipzig.
• ERP service providers with offices in Vienna and London.
• Researchers working on multimedia topics in Eastern Europe.
Information is available on the Web, but opaque to current Web search.
berlin.de: Has everything about childcare in Berlin.
Immobilienscout.de: Knows all about real estate offers in Germany.
[Figure: diagram with DB, Web server, HTML, RDF and search engine elements]
Solution: complement text on Web pages with structured linked open data and intelligently combine/integrate such structured information from different sources.
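The idea of combining structured data from different sources can be sketched in a few lines of Python. Everything below is hypothetical: the triples, the `ex:` identifiers and the property names are invented for illustration and are not taken from berlin.de or Immobilienscout.de. "Near" is approximated very crudely as "in the same district".

```python
# Hypothetical triples published by two independent sources.
CHILDCARE = [
    ("ex:kita1", "ex:district", "Leipzig-Mitte"),
    ("ex:kita1", "ex:languages", "German-Russian"),
    ("ex:kita2", "ex:district", "Leipzig-West"),
    ("ex:kita2", "ex:languages", "German-English"),
]
REAL_ESTATE = [
    ("ex:apt7", "ex:district", "Leipzig-Mitte"),
    ("ex:apt8", "ex:district", "Leipzig-West"),
    ("ex:apt9", "ex:district", "Leipzig-Mitte"),
]


def subjects_with(triples, predicate, obj):
    """Return all subjects that have the given predicate/object pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def objects_of(triples, subject, predicate):
    """Return all objects for the given subject/predicate pair."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def apartments_near_bilingual_childcare(childcare, real_estate, languages):
    """Join both sources on the shared 'ex:district' value."""
    districts = set()
    for kita in subjects_with(childcare, "ex:languages", languages):
        districts |= objects_of(childcare, kita, "ex:district")
    return sorted(
        s for s, p, o in real_estate if p == "ex:district" and o in districts
    )


print(apartments_near_bilingual_childcare(CHILDCARE, REAL_ESTATE, "German-Russian"))
```

The join only works because both sources use the same identifier for the district, which is the role shared URIs play in Linked Data.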
![Page 5: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/5.jpg)
Objectives of LOD2
![Page 6: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/6.jpg)
Linked Data Lifecycle
Challenges
![Page 7: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/7.jpg)
LOD2 in a Nutshell
Research focus
• Very large RDF data management
• Knowledge Enrichment & Interlinking
• Fusion & Information Quality
• Adaptive, semantic user interfaces
Use Cases
• Media & Publishing
• Enterprise Data Webs
• Open Gov Data
Main Result
• Integrated LOD2-Stack for Linked Data lifecycle management
Partners
Uni Leipzig, CWI, DERI Galway, FU Berlin, Semantic Web Company, OpenLink, Tenforce, Exalead, Wolters Kluwer, OKFN
![Page 8: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/8.jpg)
LOD2 STACK
![Page 9: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/9.jpg)
LOD2 stack as Debian package repository
LOD2 stack repository is a Debian package repository: http://stack.lod2.eu/deb/distributions/dists/
We have chosen a new reference OS: Ubuntu 12.04 LTS
  o This version is supported for the next 5 years.
Changes in repository management system for
  o enabling quality control (development -> test -> stable)
  o enabling architecture dependent distribution support (e.g. Virtuoso RDF store)
  o Public access to documentation
      • http://wiki.lod2.eu
![Page 10: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/10.jpg)
LOD2 stack contribution process
![Page 11: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/11.jpg)
LOD2 stack components
![Page 12: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/12.jpg)
Linked Data publishing capabilities currently offered
Covers most of the LOD publishing cycle
  o Combination of
      • locally installed software,
      • online available software, and
      • online available data sources as well as data packages
      • about page in the LOD demonstrator (http://demo.lod2.eu/lod2demo)
![Page 13: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/13.jpg)
LOD2 STACK – Extraction: Virtuoso Sponger, D2RQ
![Page 14: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/14.jpg)
Virtuoso Sponger
An RDFizer introduced in Virtuoso 5.0
Provides built-in RDF middleware for transforming non-RDF data into RDF "on the fly".
You can use non-RDF data sources as Semantic Web data sources.
Inputs: Wide variety of non-RDF Web data sources, e.g.:
  o (X)HTML Web pages (including hosted microformats)
  o Web services (Google, Del.icio.us, Flickr etc.)
  o Binary files (MS Office, PDF, OpenDocument etc.)
Output: RDF structured data
![Page 15: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/15.jpg)
Inputs: Supported Data Sources
RDF (inc. N3, Turtle)
  o SIOC, SKOS, FOAF, AtomOWL, Annotea …
(X)HTML pages
  o HTML header metadata: Dublin Core
  o Microformats: eRDF, RDFa, hCard, hCalendar, XFN, xFolk …
Syndication formats
  o RSS 2.0, Atom, OPML, OCS, XBEL
GRDDL
Web service APIs: Google Base, Flickr, Del.icio.us, Ning …
Files:
  o Binary files: MS Office, OpenOffice, images, audio, video …
  o Data exchange formats: iCalendar, vCard
3rd party metadata extractors: Aperture, Spotlight, SIMILE RDFizers
or add your own!
![Page 16: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/16.jpg)
Output: Structured Data
In the context of the Semantic Data Web:
"Data organized into semantic chunks or entities, with similar entities grouped together into relations or classes"
Michael Bergman (http://www.mkbergman.com)
Article: "More Structure, More Terminology and (hopefully) More Clarity"
![Page 17: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/17.jpg)
Sponger Benefits
The majority of the world's data currently resides in non-RDF form
Sponger provides a "Swiss army knife" for RDF structured data generation from non-RDF sources
Extracting data from non-RDF Web sources and converting it to RDF
  o helps "bootstrap" the Semantic Web
  o helps drive the transition of the traditional Document-Web into the emerging Semantic Data-Web
  o exposes the data in a canonical form for querying and inference
![Page 18: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/18.jpg)
Sponger Inputs & Outputs
![Page 19: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/19.jpg)
Sponger Architecture
Sponger is composed of Sponger Cartridges
Default cartridge collection is bundled as a Virtuoso VAD
Cartridge = Metadata Extractor + Ontology Mapper
Metadata extracted from non-RDF resources is mapped to a suitable ontology by the Ontology Mapper to produce Structured Data
Sponger is highly customizable
Custom cartridges can be developed
  o Using any language (e.g. Virtuoso PL, C/C++, Java) supported by Virtuoso Server Extensions API
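The "Cartridge = Metadata Extractor + Ontology Mapper" idea can be illustrated with a minimal Python sketch. This is not the Sponger implementation and does not use any Virtuoso API; the class and function names (`MetaExtractor`, `map_to_ontology`, `cartridge`) and the example document URI are assumptions made for illustration. The sketch extracts Dublin Core `<meta>` tags from an HTML header and maps them to N-Triples lines.

```python
from html.parser import HTMLParser

DC_NS = "http://purl.org/dc/elements/1.1/"


class MetaExtractor(HTMLParser):
    """Metadata extractor: collects <meta name="DC.x" content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name, content = attributes.get("name"), attributes.get("content")
        if name and content and name.lower().startswith("dc."):
            self.metadata[name[3:].lower()] = content


def map_to_ontology(doc_uri, metadata):
    """Ontology mapper: turns extracted key/value pairs into N-Triples lines.

    Simplification: only backslashes and double quotes are escaped.
    """
    lines = []
    for key, value in sorted(metadata.items()):
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{doc_uri}> <{DC_NS}{key}> "{literal}" .')
    return lines


def cartridge(doc_uri, html):
    """A toy cartridge: run the extractor, then the mapper."""
    extractor = MetaExtractor()
    extractor.feed(html)
    return map_to_ontology(doc_uri, extractor.metadata)


HTML = (
    '<html><head>'
    '<meta name="DC.title" content="LOD2 Introduction">'
    '<meta name="DC.creator" content="BIKE Lab">'
    '</head></html>'
)

for line in cartridge("http://example.org/doc", HTML):
    print(line)
```

Keeping extraction and mapping as separate steps mirrors the split described above: a new source format only needs a new extractor, while the mapping to a vocabulary can be reused.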
![Page 20: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/20.jpg)
D2RQ Platform
System for accessing relational databases as virtual RDF graphs
Offers RDF-based access to the content of relational databases without having to replicate it into an RDF store
Features:
• query a non-RDF database using SPARQL
• access the content of the database as Linked Data over the Web
• create custom dumps of the database in RDF
• access information using the Apache Jena API
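Querying a database through a SPARQL endpoint boils down to an HTTP request that carries the query in a `query` parameter, as defined by the SPARQL Protocol. The sketch below only builds such a request URL. The endpoint address `http://localhost:2020/sparql` is an assumption about a locally running D2R Server and must be adapted to the actual deployment; the query itself is illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL with the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Assumed address of a local D2R Server; adjust for a real installation.
ENDPOINT = "http://localhost:2020/sparql"
QUERY = "SELECT ?s WHERE { ?s a <http://xmlns.com/foaf/0.1/Person> } LIMIT 10"

url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round trip: the encoded parameter decodes back to the original query text.
print(parse_qs(urlsplit(url).query)["query"][0] == QUERY)
```

To execute the query against a live endpoint, the URL can be passed to `urllib.request.urlopen`, typically with an `Accept: application/sparql-results+json` header.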
![Page 21: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/21.jpg)
D2RQ Platform : Components
The D2RQ Platform consists of:
D2RQ Mapping Language, a declarative mapping language for describing the relation between an ontology and a relational data model.
D2RQ Engine, which uses the mappings to rewrite Jena API calls to SQL queries against the database and passes query results up to the higher layers of the frameworks.
D2R Server, an HTTP server that provides a Linked Data view, an HTML view for debugging and a SPARQL Protocol endpoint over the database.
![Page 22: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/22.jpg)
Mapping Examples
map:MyDatabase a d2rq:Database;
    d2rq:jdbcDSN "jdbc:mysql://localhost/mydb";
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:username "user";
    d2rq:password "password" .

map:People a d2rq:ClassMap;
    d2rq:uriPattern "http://.../people/@@User.ID@@";
    d2rq:condition "User.deleted=0";
    d2rq:class foaf:Person .

map:photo a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:People;
    d2rq:property foaf:made;
    d2rq:join "User.ID = Photo.UserID";
    d2rq:refersToClassMap map:Photos .
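To make the effect of a ClassMap concrete, the following Python sketch imitates what the `map:People` mapping expresses: rows of `User` that satisfy the condition become resources typed as `foaf:Person`, with URIs built from the `@@User.ID@@` placeholder. This is an illustration using an in-memory SQLite table, not the D2RQ engine; the helper names, the sample rows and the `http://example.org/` base URI are assumptions.

```python
import re
import sqlite3


def expand_uri_pattern(pattern, row):
    """Replace each @@Table.Column@@ placeholder with the row's value."""
    return re.sub(r"@@(\w+\.\w+)@@", lambda m: str(row[m.group(1)]), pattern)


def class_map_triples(conn, table, key, uri_pattern, condition, rdf_class):
    """Yield (subject, rdf:type, class) triples for rows matching the condition.

    The SQL is assembled from the mapping itself, so the mapping is assumed
    to be trusted input.
    """
    sql = f"SELECT {key} FROM {table} WHERE {condition} ORDER BY {key}"
    for (value,) in conn.execute(sql):
        subject = expand_uri_pattern(uri_pattern, {f"{table}.{key}": value})
        yield (subject, "rdf:type", rdf_class)


# Hypothetical sample data standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (ID INTEGER PRIMARY KEY, deleted INTEGER)")
conn.executemany("INSERT INTO User VALUES (?, ?)", [(1, 0), (2, 1), (3, 0)])

triples = list(
    class_map_triples(
        conn,
        table="User",
        key="ID",
        uri_pattern="http://example.org/people/@@User.ID@@",  # hypothetical base URI
        condition="User.deleted=0",
        rdf_class="foaf:Person",
    )
)
for triple in triples:
    print(triple)
```

The deleted user (ID 2) produces no triple, which is the role of `d2rq:condition` in the mapping.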
![Page 23: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/23.jpg)
LOD2 STACK - OntoWiki
![Page 24: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/24.jpg)
OntoWiki
OntoWiki enables intuitive authoring of semantic content, with an inline editing mode for editing RDF content, similar to WYSIWYG for text documents.
  o Knowledge Bases (aka graphs, Linked Data optional)
  o Generic list and resource views
  o Versioning
  o Commenting on arbitrary resources
  o User management + access control
  o Inline editing
  o Navigation hierarchies (e.g. class hierarchies)
![Page 25: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/25.jpg)
OntoWiki Screenshots
![Page 26: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/26.jpg)
LOD2 STACK - Interlinking: LIMES, SILK
![Page 27: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/27.jpg)
LIMES
Declarative Link Discovery Framework
Tuned towards efficiency and extensibility
Set-theoretical grammar for specifying links
Time-efficient mappers for single data types
Machine learning for detecting link specs
![Page 28: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/28.jpg)
LIMES: Architecture
![Page 29: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/29.jpg)
LIMES Link Specifications
1. Metadata
2. Source and Target
3. Similarity Measure
4. Acceptance Conditions
5. Review Conditions
6. Execution Mode
7. Output Format
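How a similarity measure interacts with acceptance and review conditions can be sketched as follows. This is not the LIMES configuration format or its algorithms; the `LinkSpec` class, the token-based Jaccard measure, the threshold values and the sample labels are all assumptions chosen for illustration. Pairs scoring at or above the acceptance threshold become links, pairs between the two thresholds are set aside for manual review, and the rest are discarded.

```python
from dataclasses import dataclass


@dataclass
class LinkSpec:
    """Hypothetical, simplified stand-in for a link specification."""
    acceptance_threshold: float  # score >= this value: accept the link
    review_threshold: float      # score >= this value (but below acceptance): review


def jaccard(a, b):
    """Jaccard similarity over lower-cased word tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def classify(spec, source, target):
    """Compare every source label with every target label."""
    accepted, review = [], []
    for s in source:
        for t in target:
            score = jaccard(s, t)
            if score >= spec.acceptance_threshold:
                accepted.append((s, t))
            elif score >= spec.review_threshold:
                review.append((s, t))
    return accepted, review


SOURCE = ["University of Leipzig", "FU Berlin"]
TARGET = ["university of leipzig", "Leipzig University", "TU Berlin"]

accepted, review = classify(LinkSpec(0.9, 0.6), SOURCE, TARGET)
print("accepted:", accepted)
print("review:", review)
```

The nested loop compares all pairs, which is quadratic; avoiding exactly that cost is what the "time-efficient mappers" of a real link discovery framework are for.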
![Page 30: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/30.jpg)
Silk : Link Discovery Framework
Tool for discovering links between data items within different Linked Data sources.
The Silk Link Specification Language (Silk-LSL) allows complex linkage rules to be expressed
Can be used to generate owl:sameAs links as well as other relationships
Scalability and high performance through efficient data handling
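A linkage rule can be thought of as a small pipeline: values are transformed, compared, and the comparison scores are aggregated into one confidence that decides whether a link is emitted. The Python sketch below follows that reading. It is not Silk-LSL and does not use Silk; the function names, the `min` aggregation, the threshold and the sample records are assumptions for illustration. Links are written as `owl:sameAs` statements in N-Triples.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def lower_case(value):
    """Transformation: normalise whitespace and case."""
    return value.strip().lower()


def equality(a, b):
    """Comparison: 1.0 if both values are equal, else 0.0."""
    return 1.0 if a == b else 0.0


def numeric_closeness(a, b, max_distance):
    """Comparison: 1.0 for identical numbers, falling to 0.0 at max_distance."""
    return max(0.0, 1.0 - abs(a - b) / max_distance)


def linkage_rule(src, tgt):
    """Aggregation: take the minimum, so both comparisons must agree."""
    name_score = equality(lower_case(src["label"]), lower_case(tgt["label"]))
    year_score = numeric_closeness(src["founded"], tgt["founded"], max_distance=5)
    return min(name_score, year_score)


def generate_links(sources, targets, threshold=0.8):
    """Emit one owl:sameAs N-Triples line per pair at or above the threshold."""
    return [
        f'<{s["uri"]}> <{OWL_SAME_AS}> <{t["uri"]}> .'
        for s in sources
        for t in targets
        if linkage_rule(s, t) >= threshold
    ]


# Hypothetical records from two data sets.
SOURCES = [
    {"uri": "http://example.org/a/leipzig-uni", "label": "Leipzig University", "founded": 1409},
]
TARGETS = [
    {"uri": "http://example.org/b/1", "label": " leipzig university ", "founded": 1409},
    {"uri": "http://example.org/b/2", "label": "Leipzig University", "founded": 1992},
]

links = generate_links(SOURCES, TARGETS)
for link in links:
    print(link)
```

Replacing `OWL_SAME_AS` with another property URI yields other relationship types from the same rule structure.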
![Page 31: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/31.jpg)
Silk Versions
Silk Single Machine
  o Generate links on a single machine
  o Local or remote data sets
Silk MapReduce
  o Generate RDF links using a cluster of multiple machines
  o Based on Hadoop (usable with Amazon Elastic MapReduce)
Silk Server
  o Provides an HTTP API for matching instances from an incoming stream of RDF data
  o Can be used as an identity resolution component within applications that consume Linked Data from the Web
![Page 32: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/32.jpg)
SILK : Linking Workflow
![Page 33: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/33.jpg)
SILK : Linkage Rule Components
![Page 34: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/34.jpg)
LOD2 STACK - Interlinking: LIMES, SILK
![Page 35: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/35.jpg)
LODRefine
LOD-enabled OpenRefine
Google Refine ==> OpenRefine
LODGrefine ==> LODRefine
  o Supporting DBpedia (and Freebase)
  o Supporting crowdsourcing
  o Exporting RDF
  o Extracting named entities
![Page 36: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/36.jpg)
OpenRefine
![Page 37: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/37.jpg)
The Extensions
Extend functionalities of OpenRefine
  o RDF Refine extension
      • Reconciliation and interlinking
      • Exporting RDF
  o DBpedia extension
      • Extending reconciled data with columns from DBpedia
      • Extracting Named Entities using Zemanta API
  o NER extension
      • Extracts named entities from unstructured text
  o Crowdsourcing extension
Developed by
  o Zemanta: DBpedia extension, Crowdsourcing
  o DERI: RDF Refine
  o Free Your Metadata Group: Named Entity Extraction extension
![Page 38: LOD2 Introduction jordse@gmail.com 서울대학교 BIKE lab](https://reader037.vdocuments.net/reader037/viewer/2022110100/56649e305503460f94b20f65/html5/thumbnails/38.jpg)
References
LOD2 Webinar: The 2nd release of the LOD2 stack
LOD2 Webinar Series: Zemanta / OpenRefine
LOD2 Webinar Series: LIMES
LOD2 Webinar Series: SILK
LOD2 Webinar Series: OntoWiki
Virtuoso Sponger - RDFizer Middleware for creating RDF from non-RDF Data Sources
LOD2 Webinar Series: D2R and Sparqlify
LOD2 HomePage, http://stack.lod2.eu/blog/
LOD2 Prototype, http://demo.lod2.eu/lod2demo