LINKED DATA AS A SERVICE
SEMTECHBIZ Berlin 2012
Peter Haase, Michael Schmidt
fluid Operations AG
fluid Operations (fluidOps)
Linked Data & Semantic Technologies
Enterprise Cloud Computing
Software company founded Q1/2008 by team of serial entrepreneurs, privately held, VC funded
Headquarters in Walldorf / Germany, SAP Partner Port
Currently 40 employees
Named “Cool Vendor for SAP 2010” by Gartner (Mar 2010)
Global reseller agreement with EMC, focused on large enterprise customers (Apr 2010)
NetApp Advantage Alliance Partner (Oct 2010)
The Potential of Linked Data
Linked Data
• Set of standards and principles for publishing, sharing and interrelating structured knowledge
• From data silos to a Web of Data
• RDF as data model, SPARQL for querying
• Ontologies to describe the semantics
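To make "RDF as data model" concrete, here is a minimal sketch of triples and pattern matching in plain Python. It is not a real RDF library: the `example.org` namespace, the property names `headquarteredIn` and `locatedIn`, and the helper functions are all assumptions made for illustration. A `None` in a pattern plays the role of a SPARQL variable.

```python
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object); URIs are plain strings in this sketch.
Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # placeholder namespace (assumption, not from the talk)

TRIPLES: List[Triple] = [
    (EX + "fluidOps", RDF_TYPE, EX + "Company"),
    (EX + "fluidOps", EX + "headquarteredIn", EX + "Walldorf"),
    (EX + "Walldorf", EX + "locatedIn", EX + "Germany"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None is a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def headquarters_country(triples: List[Triple], company: str) -> Optional[str]:
    """Follow two links: company -> city -> country."""
    for _, _, city in match(triples, s=company, p=EX + "headquarteredIn"):
        for _, _, country in match(triples, s=city, p=EX + "locatedIn"):
            return country
    return None
```

Because every statement has the same three-part shape, data from different silos can be merged by simply concatenating triple lists; the second hop in `headquarters_country` could just as well be answered by triples from another source.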
Benefits of Linked Data in the Enterprise
• Enterprise Data Integration: Semantically integrate and interlink data scattered among different information systems
• Simplified publishing and sharing of data: Increase openness and accessibility of Enterprise Data
• Enrichment and contextualization through interlinking: Value add by linking to Linked Open Data
Everything as a Service
• Abstract from physical implementation details and location of resources
• Regardless of geographic or organizational separation of provider and consumer
• “In the cloud”
• Web based
• Virtualized
• On-demand
• Self-service
• Scalable
• Pay as you go
Infrastructure as a Service
Platform as a Service
Software as a Service
Data as a Service
Next generation of XaaS is centered around the power of data.
Data-as-a-Service
• Abstraction layer for data access: abstract the applications from the specific setup of the data management service (such as local vs. remote, federation, and distribution)
• Enabling automation of discovery, composition, and use of datasets
“Like all members of the ‘as a Service’ family, DaaS is based on the concept that the product, data in this case, can be provided on demand to the user regardless of geographic or organizational separation of provider and consumer.”
Source: Wikipedia
Data-as-a-Service – Beyond Data Access
• Data Markets: make it easy to find data from secondary data sources, consume or acquire the data in a usable – and often unified – format
• Online Visualization Services: allow users to upload data, make charts and visualizations and publish these to an online audience
• Data Publishing Solutions: allow data owners to publish their data collections and make them available to an online audience
• Data Aggregators: integrate, cleanse data from different sources to provide the aggregated data as a value added service
• BI / Analytics as a Service: provide higher level analytics functionality (statistical analysis), reporting, predictive analytics
See also: http://blog.datamarket.com/2010/10/24/data-as-a-service-market-definitions/
Information Workbench - Linked Data Platform
Information Workbench: Semantics- & Linked Data-based integration of private and public data sources
Intelligent Data Access and Analytics
• Visual Exploration
• Semantic Search
• Dashboarding and Reporting
Collaboration and knowledge management platform
• Wiki-based curation & authoring of data
• Collaborative workflows
Semantic Web Data
Enabling Data Access: Virtualization of Data Sources
• Linked Data as abstraction layer for virtualized data access across data spaces
• Linked Data principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards: RDF, SPARQL.
4. Include links to other URIs, so that they can discover more things.
• Enables data portability across current data silos
• Platform independent data access
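Principles 2 and 3 amount to an ordinary HTTP lookup in which the client asks for RDF rather than HTML. The sketch below only builds such a request with Python's standard `urllib.request`; it does not send it. The helper name `build_lookup_request` and the URI are assumptions for illustration, and which RDF serializations a given server offers will vary.

```python
import urllib.request


def build_lookup_request(uri: str) -> urllib.request.Request:
    """Build a GET request that asks for an RDF representation of `uri`.

    The Accept header expresses content negotiation: prefer Turtle,
    fall back to RDF/XML.
    """
    return urllib.request.Request(
        uri,
        headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
    )


# Hypothetical resource URI; example.org is a placeholder domain.
request = build_lookup_request("http://example.org/id/Walldorf")
```

Passing `request` to `urllib.request.urlopen` would perform the lookup. A server that follows the principles would answer with triples about the resource, including links to further URIs (principle 4) that a client can dereference in the same way.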
Enabling Data Discovery: Metadata about Data Sets
• Metadata about data sources is essential for dynamic discovery
• Access to data registered at global registries, e.g. ckan.org, data.gov, …
• Based on metadata vocabularies (VoID, DCAT)
• Sort/filter data sets by topic, license, size and many more facets to identify relevant data
• Visually explore data sets
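Faceted discovery over data set metadata can be sketched as follows. The catalog entries, field names, and the function `filter_datasets` are invented for illustration; in a real registry the same facts would be expressed with vocabulary terms such as `dcat:theme`, `dct:license`, and `void:triples`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DatasetDescription:
    title: str
    topic: str      # would correspond to dcat:theme
    license: str    # would correspond to dct:license
    triples: int    # would correspond to void:triples


# Hypothetical catalog entries, not real registry content.
CATALOG: List[DatasetDescription] = [
    DatasetDescription("Geo Names Sample", "geography", "CC-BY", 1_200_000),
    DatasetDescription("Drug Facts Sample", "life-sciences", "CC0", 450_000),
    DatasetDescription("Protein Links Sample", "life-sciences", "CC-BY", 9_800_000),
]


def filter_datasets(catalog: List[DatasetDescription],
                    topic: Optional[str] = None,
                    license: Optional[str] = None,
                    min_triples: int = 0) -> List[DatasetDescription]:
    """Keep data sets matching every given facet, largest first."""
    hits = [d for d in catalog
            if (topic is None or d.topic == topic)
            and (license is None or d.license == license)
            and d.triples >= min_triples]
    return sorted(hits, key=lambda d: d.triples, reverse=True)
```

For example, `filter_datasets(CATALOG, topic="life-sciences")` returns the two life-science entries ordered by size, which is the kind of shortlist a user would then explore visually.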
Enabling Data Composition: Federation of Virtualized Data Sources
Figure: Application Layer on top of a Virtualization Layer on top of the Data Layer; four data sources, each exposed through a SPARQL endpoint, plus a metadata registry
See also: FedX: Optimization Techniques for Federated Query Processing on Linked Data (ISWC2011)
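The idea of federation can be illustrated with a deliberately naive sketch: two in-memory stand-ins for SPARQL endpoints and a join that spans both. This is not FedX's optimizer and makes no network calls; the class `Endpoint`, the function `federated_join`, the `ex:` identifiers, and the data are all assumptions for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


class Endpoint:
    """In-memory stand-in for one SPARQL endpoint."""

    def __init__(self, name: str, triples: List[Triple]) -> None:
        self.name = name
        self.triples = list(triples)

    def has_predicate(self, predicate: str) -> bool:
        """Cheap relevance check, used here for source selection."""
        return any(p == predicate for _, p, _ in self.triples)

    def match(self, s: Optional[str], p: str, o: Optional[str]) -> List[Triple]:
        """Evaluate one triple pattern; None is a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s) and t[1] == p
                and (o is None or t[2] == o)]


def federated_join(endpoints: List[Endpoint],
                   first_pred: str,
                   second_pred: str) -> List[Tuple[str, str, str]]:
    """Answer `?x first_pred ?y . ?y second_pred ?z` across all endpoints."""
    # Source selection: only contact endpoints that can contribute.
    first_sources = [e for e in endpoints if e.has_predicate(first_pred)]
    second_sources = [e for e in endpoints if e.has_predicate(second_pred)]
    results = []
    for src in first_sources:
        for x, _, y in src.match(None, first_pred, None):
            # Bound join: push the binding for ?y into the second pattern.
            for dst in second_sources:
                for _, _, z in dst.match(y, second_pred, None):
                    results.append((x, y, z))
    return results


# Hypothetical private and public sources.
internal = Endpoint("internal", [("ex:drugA", "ex:targets", "ex:proteinP")])
public = Endpoint("public", [("ex:proteinP", "ex:label", "Protein P"),
                             ("ex:proteinQ", "ex:label", "Protein Q")])
```

Calling `federated_join([internal, public], "ex:targets", "ex:label")` combines a fact held only in the internal source with a label held only in the public one, which is the effect the virtualization layer provides to the application layer. A production engine would add query planning, caching, and batching of the per-binding lookups.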
Semantic Wiki + Widgets as Self-service Linked Data Frontend
• Semantic Wiki for linking of unstructured and structured data
• Declarative specification of the UI based on available pool of widgets and declarative wiki-based syntax
• Widgets have direct access to the DB
• Type-based template mechanism
Wiki Page in Edit Mode … … and Displayed Result Page
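A type-based template mechanism can be pictured as a lookup from a resource's type to a page template. The sketch below uses Python's `string.Template`; the markup, the type names, and the function `render` are assumptions for illustration and do not reproduce the Information Workbench's actual wiki or widget syntax.

```python
from string import Template
from typing import Dict

# One template per resource type (hypothetical markup).
TEMPLATES: Dict[str, Template] = {
    "Person": Template("== $label ==\nAffiliation: $affiliation"),
    "Publication": Template("== $label ==\nYear: $year"),
}

# Fallback for resources whose type has no dedicated template.
DEFAULT_TEMPLATE = Template("== $label ==")


def render(resource: Dict[str, str]) -> str:
    """Pick the template by the resource's type and fill in its properties."""
    template = TEMPLATES.get(resource.get("type", ""), DEFAULT_TEMPLATE)
    # safe_substitute leaves unknown placeholders intact instead of raising.
    return template.safe_substitute(resource)


page = render({"type": "Person",
               "label": "Peter Haase",
               "affiliation": "fluid Operations AG"})
```

The benefit of keying templates on type is that every resource of that type gets a consistent page without per-page authoring, while individual pages can still be edited by hand.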
Information Workbench: Data as a Service in a Cloud Platform Architecture
Provisioning, Monitoring and Management
Application Layer (SaaS)
Self-service Deployment
• Self-service deployment of the Information Workbench in the cloud
• Pay-per-use
• Scalability on demand
Self-service UI & Analytics
• Living UI, composed from semantics-aware widgets
• Ad hoc data exploration, visualization, analytics
Data Layer (DaaS): Open Data Sources, Enterprise Data Sources
Data Discovery
• On-demand access to private and public data sources
• Dynamic discovery
Data Integration & Federation
• Virtualized data access
• Dynamic integration & federation of data sources
Infrastructure Layer (IaaS)
Virtualization Layer
Network, Computing Resources, Network-Attached Storage
Information Workbench – Linked Data as a Service
Application Areas
Knowledge Management in the Life Sciences
Digital Libraries, Media and Content Management
Intelligent Data Center Management
Example: Conference Explorer
• “Linked-Data-a-Thon”: build an application that makes use of conference metadata and contextualizes data with external data sources in two weeks
• Realized with the Information Workbench
Data Sources
• Conference Metadata (Linked Data)
• Public bibliographic metadata
• Social Networks: Twitter, Facebook, LinkedIn
• LinkedGeoData
Features
• Conference schedule, timelines, hot topics
• Statistics and reports
• Background information about authors and publications
• Link to social network profiles and statistics
http://semtech2012.fluidops.net/
Example: A Cloud Portal for Access to Open Data with the Information Workbench
Goal
• Collect metadata from global data markets (LOD Cloud, WorldBank, CKAN, …)
• Allow integrated search and ad hoc integration of data sources from different repositories
• Link data with private/internal data sources, if desired
• Support semi-automated linking between data sets
• Provide visualization, exploration, and analytics functionality on top of integrated data sources
Realization
• Currently running project with the Hasso Plattner Institute (Potsdam, Germany)
• Create local repository containing data market metadata
• Use self-service technology to make services publicly available + Information Workbench for analytics
... using the fluid Operations Technology Stack
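One simple way to picture the semi-automated linking named in the goals is label matching: a program proposes candidate links and a person confirms or rejects them. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The function `propose_links`, the threshold, and the `ex:` and `wb:` identifiers are assumptions for illustration, not the method used in the project.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def propose_links(left: Dict[str, str],
                  right: Dict[str, str],
                  threshold: float = 0.85) -> List[Tuple[str, str, float]]:
    """Propose (left_uri, right_uri, score) pairs whose labels look alike.

    `left` and `right` map resource identifiers to human-readable labels.
    The result is a list of candidates for manual review, best first.
    """
    candidates = []
    for l_uri, l_label in left.items():
        for r_uri, r_label in right.items():
            score = SequenceMatcher(None, l_label.lower(), r_label.lower()).ratio()
            if score >= threshold:
                candidates.append((l_uri, r_uri, round(score, 2)))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Hypothetical labels from two data sets.
internal_countries = {"ex:DE": "Germany"}
market_countries = {"wb:deu": "germany", "wb:fra": "France"}
links = propose_links(internal_countries, market_countries)
```

The threshold trades recall for precision: a lower value surfaces more candidates for the reviewer, a higher value keeps the list short. Comparing every pair is quadratic in the number of labels, so larger data sets would need a blocking or indexing step first.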
Example: Linked Data in Pharma
Public Data Sources
Search, Interrogate and Reason
Capture and Augment Knowledge
Visualize, Analyze and Explore
Integrated data graph over all data sources
Private Data Sources
Main Use Cases
• Integrate data from company-internal data silos
• Augment company-internal data with Linked Open Data
• Collaborative knowledge management
• Support of internal processes (drug development)
Example: Dynamic Semantic Publishing
Information Workbench for DSP
• Collaborative authoring and linking of unstructured and structured semantic data
• Ontology and instance data management
• DSP editorial workflows
• Automation of content creation and enrichment
Olympics 2012 requirements
• A lot of output: page per athlete [10,000+], page per country [200+], page per discipline [400-500], time-coded, metadata-annotated, on-demand video, 58,000 hours of content
• Almost real-time statistics and live event pages, with too many web pages for too few journalists
Dynamic Semantic Publishing (DSP) architecture to automate content aggregation
CONTACT:
fluid Operations
Altrottstr. 31
Walldorf, Germany
Email: [email protected]
www.fluidops.com
Tel.: +49 6227 3846-527
Visit us at our booth!
http://semtech2012.fluidops.net/