[webinar] semantic technologies

Post on 11-May-2015

Category:

Technology

DESCRIPTION

Want to get an update on Nuxeo's involvement in semantic search and knowledge extraction? Watch this slideshow to hear all the latest news on this topic and learn how it may impact the future of Enterprise Content Management! If you want to go further, watch the video of a webinar using this slideshow http://www.youtube.com/watch?v=YLgJKx1y6Fk

TRANSCRIPT

Nuxeo & Semantic Technologies

Stefane Fermigier - Nuxeo

The Web - May 2011

Wednesday, May 25, 2011

Agenda

• A pragmatic introduction to the Semantic Web

• Experience report and demos from Nuxeo

1. Introduction to the Semantic Web

Prelude

Source: Mills Davis, “Semantic Social Computing”, Sept. 2007

History

Invented the web in 1989 (yeah!)

Invented the semantic web in 1994 (duh?)

Historical perspective

• From web 1.0: web of sites and pages, aka the World Wide Web

• To web 2.0: web of people and of participation, aka the Social Web (Blogs, RSS, tags, Facebook, Wikipedia, etc.)

• To web 3.0: web of data, of meaning and connected knowledge, aka the Semantic Web

Semantics & Ontologies

Some examples

• FOAF: relationships between people (social network)

• SIOC: relationships between websites, articles, blogs, comments

• Rich Snippets: syndicate RDFa content for SEO by Google, Yahoo

• GoodRelations: e-commerce (eBay...)

• rNews: metadata for news agencies (AFP, Reuters...)

How is it related to the Web?

The traditional Web

• A principle: hypertext

• A protocol: HTTP

• An identification scheme: URNs/URIs

• A language: HTML

“To a computer, then, the web is a flat, boring world devoid of meaning”

Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/

“This is a pity, as in fact documents on the web describe real objects and imaginary concepts, and give particular relationships between them”

Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/

“Adding semantics to the web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values.”

Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/

“The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”

Tim Berners-Lee, James Hendler and Ora Lassila, “The Semantic Web”, Scientific American, May 2001

The semantic Web

• A principle: hypertext

• A protocol: HTTP

• An identification scheme: URNs/URIs

• A language: RDF (replacing HTML)

The W3C “Layer Cake”

Already standardized

URIs and the Web of Things

• URIs (Uniform Resource Identifiers) are used to identify things (also called entities) in the real world

• For instance: people, places, events, companies, products, movies, etc.

The RDF model

Subject → Predicate → Object

RDF is used to describe relationships between objects, identified by their URIs
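
As a minimal sketch of the subject / predicate / object model (not tied to any particular RDF library), triples can be held as plain Python tuples and matched against a pattern in which `None` acts as a wildcard. The example.org URIs and the `match` helper are made up for illustration; the FOAF property URIs are the real ones.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical data: the example.org URIs are placeholders.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"

TRIPLES = [
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", BOB),
    (BOB, FOAF + "name", "Bob"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield the triples that agree with every non-None component."""
    for triple in triples:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


# Everything known about Alice, then whom she knows.
about_alice = list(match(TRIPLES, s=ALICE))
alice_knows = [obj for _, _, obj in match(TRIPLES, s=ALICE, p=FOAF + "knows")]
```

A real triple store adds indexing, typed literals and named graphs on top of this idea, but the data model is the same three-part statement.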

RDF serialization

As XML:

Others, ex: N3:
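
The serialization examples shown on this slide did not survive in the transcript. As a stand-in, here is a hedged sketch that writes one and the same statement as RDF/XML and as N-Triples (a simple line-based format in the same family as N3), using only the Python standard library. The subject URI is a placeholder and the two helper functions are illustrative, not part of any RDF toolkit.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

SUBJECT = "http://example.org/people/alice"  # hypothetical URI
NAME = "Alice"


def to_ntriples(subject: str, predicate: str, literal: str) -> str:
    """One statement per line: <subject> <predicate> "literal" .

    Only backslashes and double quotes are escaped in this sketch.
    """
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


def to_rdfxml(subject: str, name: str) -> str:
    """The same statement as an rdf:Description element."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("foaf", FOAF_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
    )
    ET.SubElement(description, f"{{{FOAF_NS}}}name").text = name
    return ET.tostring(root, encoding="unicode")


ntriples_line = to_ntriples(SUBJECT, FOAF_NS + "name", NAME)
rdfxml_doc = to_rdfxml(SUBJECT, NAME)
```

Both outputs carry exactly the same triple; the choice of syntax is a matter of tooling and readability.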

SPARQL

• Query language for RDF databases

• Several implementations

• OSS: Apache Jena, Sesame, 4Store, Virtuoso, Mulgara, Redland, Open Anzo...

• Proprietary: 5Store, AllegroGraph RDFStore, Stardog, Dydra, OWLIM...

• More expressive than SQL, scalability is still an open question

SPARQL Sample
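
The sample query from this slide is not in the transcript. The sketch below is a substitute, not the original: a small SELECT query written against the DBpedia ontology, wrapped in the GET form of the SPARQL protocol (the query is passed in a `query` parameter). The endpoint URL is assumed to be DBpedia's public one, the choice of Paris is arbitrary, and the code only builds the URL without sending anything.

```python
from urllib.parse import urlencode

# Assumption: DBpedia's public SPARQL endpoint.
ENDPOINT = "http://dbpedia.org/sparql"

# People born in Paris, with their English labels (illustrative query).
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?person ?name WHERE {
  ?person a dbo:Person ;
          dbo:birthPlace <http://dbpedia.org/resource/Paris> ;
          rdfs:label ?name .
  FILTER (lang(?name) = "en")
}
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


query_url = build_query_url(ENDPOINT, QUERY)
```

Each line inside the WHERE block is a triple pattern, so the query reads like the RDF model itself with variables in place of some URIs.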

Where and how to find these data?

Solution 1: “Lift”

• One can use HTML scraping and natural language processing (NLP) techniques to extract semantic information from existing content / sites

• Generic solutions: OpenCalais, Zemanta, Apache Stanbol

• Pro: no need to change existing content

• Con: error prone, needs human checks
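
To make the "lift" idea concrete, here is a deliberately crude sketch: strip the markup with the standard-library HTML parser, then treat runs of capitalized words as candidate entities. This is not how OpenCalais, Zemanta or Apache Stanbol work internally; the regular expression and helper names are assumptions chosen for illustration, and the result shows why human checks are needed.

```python
import re
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse all runs of whitespace into single spaces.
    return " ".join(" ".join(parser.chunks).split())


# Naive heuristic: one or more capitalized words in a row.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")


def candidate_entities(text: str) -> List[str]:
    return CANDIDATE.findall(text)


SAMPLE = "<p><b>Tim Berners Lee</b> invented the web at <i>CERN</i> near Geneva.</p>"
found = candidate_entities(html_to_text(SAMPLE))
```

On this sample the heuristic finds the person and the city but misses the all-caps acronym, a small instance of the error-prone behaviour listed above.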

Example: DBpedia

Solution 2: export

• RDFa and microformats are used to embed semantic information (expressed using the RDF model) into regular web pages

• RDFa does it using existing (rel) and additional (about, property, typeof) attributes

• Microformats only use usual HTML attributes (class)
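
As a rough sketch of how the RDFa attributes carry triples, the parser below reads `about` as the subject and `property` as the predicate, taking the element text as the object. It is far from a conformant RDFa processor (no prefix resolution, no nested scoping, no `rel`, `typeof` or `content` handling), and the page fragment with its example.org URI is made up.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class RDFaSketch(HTMLParser):
    """Very simplified RDFa reader: about = subject, property = predicate."""

    def __init__(self) -> None:
        super().__init__()
        self.subject: Optional[str] = None
        self.pending: Optional[str] = None
        self.triples: List[Tuple[str, str, str]] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "about" in attributes:
            self.subject = attributes["about"]
        # Remember the predicate until the element's text arrives.
        self.pending = attributes.get("property")

    def handle_data(self, data):
        text = data.strip()
        if self.pending and self.subject and text:
            self.triples.append((self.subject, self.pending, text))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None


# Hypothetical page fragment.
PAGE = """<div about="http://example.org/people/alice" typeof="foaf:Person">
  <span property="foaf:name">Alice</span>,
  <span property="foaf:nick">ally</span>
</div>"""

reader = RDFaSketch()
reader.feed(PAGE)
reader.close()
extracted = reader.triples
```

The page still renders as ordinary HTML for people, while a consumer that understands the extra attributes can recover the statements.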

Solution 3: reuse

• Linked Open Data: (usually large) data repositories available on the web (for free or not), expressed using the RDF model

• Interoperability between these repositories (their ontologies) must be defined
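
One common way to declare that two repositories talk about the same thing is an `owl:sameAs` link between their URIs. The sketch below merges statements from two datasets by following such links one hop; the datasets, their example.org URIs and the `merge` helper are all hypothetical, and real alignment between ontologies involves much more than identity links.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Two hypothetical repositories describing the same city under different URIs.
DATASET_A: List[Triple] = [
    ("http://a.example.org/Paris", RDFS_LABEL, "Paris"),
]
DATASET_B: List[Triple] = [
    ("http://b.example.org/place/42", "http://b.example.org/country", "France"),
]
# The link that makes the two repositories interoperable.
LINKS: List[Triple] = [
    ("http://a.example.org/Paris", OWL_SAME_AS, "http://b.example.org/place/42"),
]


def merge(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group statements under one subject, following sameAs links one hop."""
    alias = {o: s for s, p, o in triples if p == OWL_SAME_AS}
    merged: Dict[str, List[Tuple[str, str]]] = {}
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            continue
        merged.setdefault(alias.get(s, s), []).append((p, o))
    return merged


combined = merge(DATASET_A + DATASET_B + LINKS)
```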

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

Linked Open Data in 2007, 2008, 2009 and 2010

Good for Enterprise apps too!

Diagram source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/

Why now?

Key Enablers

Open Data and Linked Online Data

Advances in automatic content analysis (linguistics, image processing) and machine learning

Classical logic and classical AI

Computing power (Moore’s law + MapReduce)

The technologies and data are available, let’s put them to use!

2. Nuxeo & Semantic ECM

Nuxeo

Nuxeo: an open source ECM vendor

Our Focus is Enterprise Content Management

ECM as a Platform for Content Applications

Open Source as Efficient Development Model

Modern architecture for 21st Century business

“Lean, mobile, social, interoperable”

A Social Marketplace in action

Innovation driven by community of customers, partners, and our core developers

Nuxeo ECM - From Platform to Products

• Content Infrastructure: Nuxeo Core, a lightweight, scalable, embeddable content repository

• Platform: Nuxeo Enterprise Platform, a complete set of components covering all aspects of ECM

• Horizontal Packages: Document Management, Digital Asset Management, Case Management Framework, Structured Document Server, Content Aggregator

• Business Solutions: Correspondence Management, Contracts Management, Invoice Processing, Records Management

• Construction, Media, Government, Life Sciences

Major Customers

Goals

Goals for Semantic ECM

Repurpose existing content

Improve search and collaboration

Make information contextual

Extract and use information from your content

Make your content smarter!

Semantic ECM

Content: Text, Image, Sound, Video

Meaning: Metadata, Relations, Entities, Tags, Reasoning

Content Stack vs. Knowledge Cake

Architectural Challenge

Business value from semantic ECM

Efficiency gains: 20% to 90% (ex: in search, collaboration)

Effectiveness gains: better returns from your assets (ex: news and images from AFP)

Strategic edge: growth, value capture, new services, gain unfair strategic advantage (ex: vertical ontologies for CEVAs / CCAs)

Demo

How does it work?

IKS project

• European project under the FP7, with 13 partners (6 SMEs) and an 8.5 MEUR budget

• Goal: create a semantic software “stack” that will be used by CMS vendors to add semantic features to their products

• Started in Jan. 2009, will last until Dec. 2012

• First tangible result: Apache Stanbol, already integrated in a Nuxeo plugin

Stanbol: a semantic engine

• From unstructured content to Knowledge

• Language guessing

• Topic classification (Business, Sports, Media, ...)

• Named Entities extraction and linking

• Relationships and properties extraction

• Pluggable with proprietary engines (ex: Temis)
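
Since Stanbol exposes its engines over HTTP, submitting content for analysis can be sketched as a plain POST of the text. The snippet below only builds the request and does not send it. The endpoint URL is an assumption about a local installation (the path has changed between Stanbol versions), as is the `Accept` value asking for RDF/XML; `build_enhance_request` is a hypothetical helper name.

```python
import urllib.request

# Assumption: a Stanbol instance running locally; adjust host and path
# to match the actual installation.
STANBOL_URL = "http://localhost:8080/enhancer"


def build_enhance_request(text: str, url: str = STANBOL_URL) -> urllib.request.Request:
    """Prepare a POST carrying plain text and asking for RDF/XML back."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


request = build_enhance_request("Paris is the capital of France.")
```

Against a running server, `urllib.request.urlopen(request)` would send it; the response would then be an RDF description of what the engines found (language, topics, entities).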

RESTful is Beautiful

Stanbol = Semantic Engines (Apache OpenNLP) + Fast Linked Data local index (Apache Solr) + Semantic Rule Engine (Apache Jena)

Local IT infrastructure (LAN)

1. Nuxeo DM addon

2. Apache Stanbol (Engine 1, Engine 2, Engine 3)

3. DBpedia, Freebase, Geonames, LDAP

How to try it?

https://connect.nuxeo.com/nuxeo/site/marketplace/category/semantic

Notes

• Nuxeo EP 5.4.2 (next week) will have significant improvements to enable new features of the semantic plugins

• Source code here: http://hg.nuxeo.org/addons/nuxeo-platform-semantic-entities/

• Join us at the IKS Paris Workshop on July 5-6 to learn much more about Nuxeo and semantic technologies!

Resources

• http://iks-project.eu

• http://stanbol.demo.nuxeo.com

• http://incubator.apache.org/stanbol

• http://blogs.nuxeo.com/dev

• http://hadoop.apache.org/

• http://incubator.apache.org/opennlp/

Questions?

Up Next!

Live Demo - Nuxeo Studio

June 1, 2011

Building Packages for the Nuxeo Marketplace

June 8, 2011