Apache Marmotta - Introduction
TRANSCRIPT
Apache Marmotta: Introduction
Sebastian Schaffert
Who Am I
Dr. Sebastian Schaffert
Senior Researcher at Salzburg Research
Chief Technology Officer at Redlink GmbH
Committer at the Apache Software Foundation
and, starting 12/2014, Software Engineering Manager (SRE) @ Google
http://www.schaffert.eu
http://linkedin.com/in/sebastianschaffert
Agenda
Introduction to Apache Marmotta (Sebastian)
Overview
Installation
Development
Linked Data Platform (Sergio & Jakob)
Overview
Practical Usage
Semantic Media Management (Thomas)
Media Use Case
SPARQL-MM
Overview
What is Apache Marmotta?
Linked Data Server
(implements Content Negotiation and LDP)
SPARQL Server
(public SPARQL 1.1 query and update endpoint)
Linked Data Development Environment
(collection of modules and libraries for building custom Linked
Data applications)
Community of Open Source Linked Data Developers
all under the business-friendly Apache open source license
Linked Data Server
easily offer your data as Linked Data on the Web
human-readable and machine-readable read-write data access based on HTTP content negotiation
reference implementation of the Linked Data Platform (see next presentation block)
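The read-write access via content negotiation can be sketched as plain HTTP requests: the same resource URL returns different representations depending on the Accept header. A minimal sketch in Python; the endpoint path, host, and resource URI below are illustrative assumptions, not guaranteed to match your Marmotta installation:

```python
# Sketch of HTTP content negotiation against a Linked Data server.
# BASE and the resource URI are assumptions for illustration; the
# actual endpoint path depends on how Marmotta is deployed.
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8080/resource"   # assumed local endpoint
resource = "http://example.org/alice"     # hypothetical resource

url = BASE + "?" + urlencode({"uri": resource})

# Same URL, different Accept header -> different representation:
machine = Request(url, headers={"Accept": "text/turtle"})  # RDF for machines
human = Request(url, headers={"Accept": "text/html"})      # HTML for humans

print(machine.get_header("Accept"))  # text/turtle
print(human.get_header("Accept"))    # text/html
```

A server honoring content negotiation would answer the first request with Turtle triples and the second with an HTML view of the same data.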
SPARQL Server
full support of SPARQL 1.1 through HTTP web services
SPARQL 1.1 query and update endpoints
implements the SPARQL 1.1 protocol
(supports any standard SPARQL client)
fast native implementation of SPARQL in KiWi triple store
lightweight Squebi SPARQL explorer UI
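The SPARQL 1.1 protocol requests a standard client sends can be sketched as follows. The endpoint paths /sparql/select and /sparql/update are the ones named on a later slide; the host and the example data are assumptions for illustration:

```python
# Sketch of SPARQL 1.1 protocol requests: query via GET with the
# query as a URL parameter, update via POST with the update text as
# the request body. Host and data are illustrative assumptions.
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8080"  # assumed local Marmotta instance

# Query operation:
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
query_req = Request(
    BASE + "/sparql/select?" + urlencode({"query": query}),
    headers={"Accept": "application/sparql-results+json"},
)

# Update operation:
update = 'INSERT DATA { <http://example.org/s> <http://example.org/p> "o" }'
update_req = Request(
    BASE + "/sparql/update",
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update"},
)

print(query_req.get_method())   # GET
print(update_req.get_method())  # POST
```

Because these are the standard protocol operations, any off-the-shelf SPARQL client library can talk to the endpoints directly.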
Linked Data Development
modular server architecture allows combining exactly the functionality needed for a use case
(no need for reasoning? exclude the reasoner ...)
collection of independent libraries for common Linked Data problems
access Linked Data resources (and even some that are not Linked Data)
simplified Linked Data query language (LDPath)
use only the triple store without the server
Community of Developers
discuss with people interested in getting things done in the Linked Data world
build applications that are useful without reimplementing the whole stack
thorough software engineering process under the roof of the Apache Software Foundation
Installation / Setup
(we help you)
https://github.com/wikier/apache-marmotta-tutorial-iswc2014
Sample Project
Requirements:
JDK 7/8 (https://java.com/de/download/)
Maven 3.x (http://maven.apache.org)
git (http://git-scm.com/)
curl (http://curl.haxx.se/)
https://github.com/wikier/apache-marmotta-tutorial-iswc2014
Sample Project
https://github.com/wikier/apache-marmotta-tutorial-iswc2014
$ git clone git@github.com:wikier/apache-marmotta-tutorial-iswc2014.git
$ cd apache-marmotta-tutorial-iswc2014
$ mvn clean tomcat7:run
then point your browser to http://localhost:8080
Apache Marmotta Platform
Apache Marmotta Platform
implemented as Java web application
(deployed as marmotta.war file)
service oriented architecture using CDI (Java EE 6)
REST web services using JAX-RS (RestEasy)
CDI services found on the classpath are automatically added to the system
Architecture
Marmotta Core (required)
core platform functionalities:
Linked Data access
RDF import and export
Admin UI
platform glue code:
service and dependency injection
triple store
system configuration
logging
Marmotta Backends (one required)
choice of different triple store backends
KiWi (Marmotta default)
based on a relational database (PostgreSQL, MySQL, H2)
highly scalable
Sesame Native
based on the Sesame Native RDF backend
BigData
based on the BigData clustered triple store
Titan
based on the Titan graph database (backed by HBase, Cassandra, or BerkeleyDB)
Marmotta SPARQL (optional)
SPARQL HTTP endpoint
supports the SPARQL 1.1 protocol
query: /sparql/select
update: /sparql/update
SPARQL explorer UI (Squebi)
Marmotta LDCache (optional)
transparently access Linked Data resources from other servers as if they were local
support for wrapping some legacy data sources (e.g. Facebook Graph)
local triple cache, honors HTTP expiry and cache headers
Note: SPARQL does NOT work well with LDCache; use LDPath instead!
Marmotta LDPath (optional)
query language specifically designed for querying the Linked Data Cloud
regular path based navigation starting at a resource and then following links
limited expressivity (compared to SPARQL) but full Linked Data support
@prefix local: ;
@prefix foaf: ;
@prefix mao: ;
likes = local:likes / (foaf:primaryTopic / mao:title | foaf:name) :: xsd:string;
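The navigation idea behind such a path expression (start at a resource, follow a property, with | as alternatives) can be illustrated on a toy graph. A conceptual sketch in plain Python over an in-memory triple set; the data and prefixes are made up for illustration, and this is not Marmotta's actual LDPath engine:

```python
# Conceptual sketch of LDPath-style navigation: start at a resource,
# follow a property path, with "|" as alternatives. Toy in-memory
# triples, not Marmotta's LDPath implementation.
triples = {
    ("ex:sebastian", "local:likes", "ex:page1"),
    ("ex:sebastian", "local:likes", "ex:anna"),
    ("ex:page1", "foaf:primaryTopic", "ex:topic1"),
    ("ex:topic1", "mao:title", "Linked Data"),
    ("ex:anna", "foaf:name", "Anna"),
}

def follow(nodes, prop):
    """One path step: all objects reachable via prop from any node."""
    return {o for (s, p, o) in triples if s in nodes and p == prop}

# likes = local:likes / (foaf:primaryTopic / mao:title | foaf:name)
liked = follow({"ex:sebastian"}, "local:likes")
titles = follow(follow(liked, "foaf:primaryTopic"), "mao:title")
names = follow(liked, "foaf:name")
print(sorted(titles | names))  # ['Anna', 'Linked Data']
```

On a real Linked Data server, each follow step may transparently dereference a remote resource, which is why this style of query suits the Linked Data Cloud better than SPARQL.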
Marmotta Reasoner (optional)
implementation of rule-based sKWRL reasoner
Datalog-style rules over RDF triples, evaluated in forward-chaining procedure
@prefix skos:
($1 skos:broaderTransitive $2) -> ($1 skos:broader $2)
($1 skos:narrowerTransitive $2) -> ($1 skos:narrower $2)
($1 skos:broaderTransitive $2), ($2 skos:broaderTransitive $3) -> ($1 skos:broaderTransitive $3)
($1 skos:narrowerTransitive $2), ($2 skos:narrowerTransitive $3) -> ($1 skos:narrowerTransitive $3)
($1 skos:broader $2) -> ($2 skos:narrower $1)
($1 skos:narrower $2) -> ($2 skos:broader $1)
($1 skos:broader $2) -> ($1 skos:related $2)
($1 skos:narrower $2) -> ($1 skos:related $2)
($1 skos:related $2) -> ($2 skos:related $1)
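Forward chaining means applying such rules repeatedly until no new triples can be derived (a fixpoint). A conceptual sketch with a small subset of the SKOS rules, on toy data, in plain Python; this is not the sKWRL engine itself:

```python
# Conceptual sketch of forward-chaining rule evaluation: apply rules
# to the triple set until no new triples are derived (fixpoint).
# Toy data and hand-coded rules, not the actual sKWRL reasoner.
triples = {
    ("ex:a", "skos:broaderTransitive", "ex:b"),
    ("ex:b", "skos:broaderTransitive", "ex:c"),
}

def apply_rules(ts):
    derived = set()
    for (s, p, o) in ts:
        if p == "skos:broaderTransitive":
            derived.add((s, "skos:broader", o))   # bT -> broader
        if p == "skos:broader":
            derived.add((o, "skos:narrower", s))  # inverse rule
    # transitivity: ($1 bT $2), ($2 bT $3) -> ($1 bT $3)
    for (s1, p1, o1) in ts:
        for (s2, p2, o2) in ts:
            if p1 == p2 == "skos:broaderTransitive" and o1 == s2:
                derived.add((s1, "skos:broaderTransitive", o2))
    return derived

# forward chaining to fixpoint
while True:
    new = apply_rules(triples) - triples
    if not new:
        break
    triples |= new

print(("ex:a", "skos:broaderTransitive", "ex:c") in triples)  # True
print(("ex:c", "skos:narrower", "ex:a") in triples)           # True
```

Truth maintenance, mentioned later for KiWi, records which rule applications produced each derived triple so that deletions can retract exactly the inferences that no longer hold.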
Marmotta Versioning (optional)
transaction-based versioning of all changes to the triple store
implementation of Memento protocol for exploring changes over time
snapshot/wayback functionality (i.e. possibility to query the state of the triple store at a given time in history)
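In the Memento protocol (RFC 7089), a client asks for a past state of a resource by sending the standard Accept-Datetime header to a TimeGate. A minimal sketch; the endpoint path below is a hypothetical placeholder, so check your Marmotta instance for the actual Memento webservice URL:

```python
# Sketch of a Memento "time travel" request: the Accept-Datetime
# header (RFC 7089) asks for the state of a resource at a given
# instant. The endpoint path is a hypothetical placeholder.
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

when = datetime(2014, 10, 1, 12, 0, 0, tzinfo=timezone.utc)
req = Request(
    "http://localhost:8080/memento/timegate",  # hypothetical path
    headers={"Accept-Datetime": format_datetime(when, usegmt=True)},
)
# note: urllib capitalizes header names as "Accept-datetime"
print(req.get_header("Accept-datetime"))  # Wed, 01 Oct 2014 12:00:00 GMT
```

A Memento-aware server would redirect such a request to the version of the resource that was current at the requested datetime.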
Apache Marmotta Walkthrough (Demo)
Apache Marmotta Libraries
Apache Marmotta Libraries
provide implementations for common Linked Data problems (e.g. accessing resources)
standalone lightweight Java libraries that can be used outside the Marmotta platform
LDClient
library for accessing and retrieving Linked Data resources
includes all the standard code written again and again (HTTP retrieval, content negotiation, ...)
extensible (Java ServiceLoader) with custom wrappers for legacy data sources
(included are RDF, RDFa, Facebook, YouTube, Freebase, Wikipedia, as well as base classes for mapping other formats like XML and JSON)
LDCache
library providing local caching functionality for (remote) Linked Data resources
builds on top of LDClient, so offers the same extensibility
Sesame Sail with transparent Linked Data access (i.e. Sesame API for Linked Data Cloud)
LDPath
library offering a standalone implementation of the LDPath query language
large function library for various scenarios (e.g. string, math, ...)
can be used with LDCache and LDClient
can be integrated in your own applications
supports different backends (Sesame, Jena, Clerezza)
Marmotta Loader
command line infrastructure for bulk-loading RDF data in various formats to different triple stores
supports most RDF serializations, directory imports, split-file imports, compressed files (.gz, .bzip2, .xz), archives (tar, zip)
provides progress indicator, statistics
KiWi Triplestore
KiWi Triplestore
Sesame SAIL: can be plugged into any Sesame application
based on relational database (supported: PostgreSQL, MySQL, H2)
integrates easily into existing enterprise infrastructure (database servers, backups, clustering, ...)
reliable transaction management (at the cost of performance)
supports very large datasets (e.g. Freebase with more than 2 billion triples)
KiWi Triplestore: SPARQL
translation of SPARQL queries into native SQL
generally very good performance for typical queries, even on big datasets
query performance can be optimized by proper index and memory configuration in the database
almost complete support for SPARQL 1.1 (except some constructs exceeding the expressivity of SQL and some bugs)
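The core of such a translation can be illustrated on a single basic graph pattern: each triple pattern becomes one alias of a triples table, a constant becomes a filter, and a variable shared between patterns becomes a join condition. A conceptual sketch; the table and column names are illustrative, not KiWi's actual database schema or translator:

```python
# Conceptual sketch of SPARQL-to-SQL translation over a generic
# "triples(subject, predicate, object)" table. Illustrative only,
# not KiWi's actual schema or query generator.

def bgp_to_sql(patterns):
    """patterns: list of (s, p, o); strings starting with '?' are variables."""
    select, where, seen = [], [], {}
    for i, triple in enumerate(patterns):
        alias = f"t{i}"
        for col, term in zip(("subject", "predicate", "object"), triple):
            ref = f"{alias}.{col}"
            if term.startswith("?"):
                if term in seen:                  # shared variable -> join
                    where.append(f"{ref} = {seen[term]}")
                else:                             # first occurrence -> projection
                    seen[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:                                 # constant -> filter
                where.append(f"{ref} = '{term}'")
    tables = ", ".join(f"triples t{i}" for i in range(len(patterns)))
    return f"SELECT {', '.join(select)} FROM {tables} WHERE {' AND '.join(where)}"

# ?person foaf:knows ?friend . ?friend foaf:name ?name .
sql = bgp_to_sql([("?person", "foaf:knows", "?friend"),
                  ("?friend", "foaf:name", "?name")])
print(sql)
```

This also shows why index and memory tuning in the database pays off: every additional triple pattern adds another self-join over the triples table.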
KiWi Triplestore: Reasoner
rule-based sKWRL reasoner (see demo before)
fast forward chaining implementation of rule evaluation
truth maintenance for easy deletes/updates
future: might be implemented as stored procedures in database
KiWi Triplestore: Clustering
cluster-wide caching and synchronization based on Hazelcast or Infinispan
useful for load balancing of several instances of the same application (e.g. Marmotta Platform)
KiWi Triplestore: Versioning
transaction-based versioning of triple updates
undo transactions (applied in reverse order)
get a Sesame repository connection for any point in the triple store's history
Thank You!
Sebastian Schaffert
[email protected]
supported by the European Commission FP7 project MICO (grant no. 610480)