sda2012 pundit system


Upload: marco-grassi

Post on 11-May-2015


TRANSCRIPT

Page 1: Sda2012 pundit system

PUNDIT: SEMANTICALLY STRUCTURED ANNOTATIONS FOR WEB CONTENTS AND DIGITAL LIBRARIES

Marco Grassi (1), Christian Morbidoni (2), Michele Nucci (3), Simone Fonda (4), Giovanni Ledda (5)

(1,2,3,5) DII - Department of Information Engineering, Polytechnic University of Le Marche, Ancona, Italy
(4) NET7 srl

SDA 2012, Semantic Digital Archives

Semedia (Semantic Web and Multimedia)
http://semedia.dii.univpm.it
www.netseven.it/

This work is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0)

Page 2: Sda2012 pundit system

Pundit: Semantically Structured Annotations for Web Contents... [email protected] 2012

THE WEB SCENARIO

• Annotating web content has become a common task
  • Comments and tags are widely supported by mainstream applications
  • Many tools let users bookmark, highlight and comment on web page fragments
  • Some tools support collaborative annotation
• Web content annotations are beneficial:
  • A more engaging and productive user experience
  • Social engagement can be exploited to improve resource ranking and classification

Page 3: Sda2012 pundit system

DL SCENARIO

• Digital Libraries (DL) are no longer simple “expositions” of digital objects; they offer users more interaction
• Crowdsourcing experiments enrich DLs by letting users curate content or upload digital material of interest to the DL (BBC WW2 People’s War, …)

[Figure: three Digital Library diagrams, each showing Experts, Users, Create Contents and Consume Contents. Panel labels: Expert model; User Interaction (Tagging, Linking, Commenting, Social Engagement); Crowdsourcing (Tagging, Linking, Commenting, Add Content, Add Annotations).]

Page 4: Sda2012 pundit system

WHAT’S MISSING? ...

• Most existing annotation tools are limited to simple textual tags and comments:
  • they suffer from the ambiguity of natural language
  • their semantics are not machine-interpretable

This limits the efficiency of resource classification and retrieval, and the possibility of reusing these annotations in other contexts.

Orange?

Page 5: Sda2012 pundit system

SEMANTICALLY STRUCTURED ANNOTATIONS

• Semantically structured annotations make smart use of such added knowledge:
  • Unambiguously express semantics so that they can be processed by software agents:
    • annotations can be harvested periodically and published back
    • used by recommender systems or search engines
    • ...
  • Enhance Digital Library capabilities:
    • improving browsing
    • enabling automatic content classification
    • ...
  • Reuse this collaborative knowledge in different contexts and by different applications

Page 6: Sda2012 pundit system

SEMANTICALLY STRUCTURED ANNOTATIONS

Users should be able to create knowledge graphs in which web content fragments, concepts and entities are meaningfully connected.

Page 7: Sda2012 pundit system

SEMANTICALLY STRUCTURED ANNOTATIONS

• Rely on controlled vocabularies and ontologies
  • users share the same terminology and “talk about the same things”
  • annotations can be meaningfully mashed up
• Link to the emerging Web of Data
  • software can automatically retrieve additional, useful semantic data (e.g. date and place of birth, pictures, citations, multi-language data)

This augments the information in the annotation and in the original content to support smarter application behaviors!

Ex.: we have discovered that the two images contain American film actors showing the emotion of anger!

Page 8: Sda2012 pundit system

• Pundit is a novel semantic annotation tool:
  • developed by: Semedia (Semantic Web and Multimedia), http://semedia.dii.univpm.it, with the collaboration of NET7
  • funded by: Semlib Project (EU project), http://www.semlibproject.eu/
  • supported and further developed in: DM2E (EU project), http://dm2e.edu/, and AGORA (EU project), http://project-agora.eu/

Page 9: Sda2012 pundit system

SEMLIB PROJECT

• R&D project supported by EU FP7, Theme: Research for SMEs (no. FP7-SME-2010-01-262301 - SEMLIB)
• 24 months (commenced in January 2011, currently at month 19)

Semlib Project: Semantic Web Tools for DL
http://www.semlibproject.eu/

www.netseven.it/
www.knowledgehives.com/
www.liberologico.com/
www.in-two.com
www.semedia.dii.univpm.it/
www.deri.ie/

Page 12: Sda2012 pundit system

ANNOTATION MODEL

[Figure: annotation model diagram with three parts: Contextual Information, Annotation Content, Semantically Structured Content]

• Based on the Open Annotation Collaboration (OAC) ontology (currently working to provide full compliance with OA)

Page 13: Sda2012 pundit system

ANNOTATION MODEL

[Figure: annotation model diagram: Contextual Information; Annotation Content held in a Named Graph]

• Based on the Open Annotation Collaboration (OAC) ontology (currently working to provide full compliance with OA)
• SPARQL support to query slices of knowledge
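As a rough illustration of querying a "slice of knowledge", the sketch below builds a SPARQL query restricted to a chosen set of named graphs. The helper name `build_slice_query` and the graph URIs are hypothetical and not part of Pundit; only the `VALUES` and `GRAPH` keywords are standard SPARQL 1.1.

```python
def build_slice_query(graph_uris):
    """Return a SPARQL SELECT limited to the given named graphs.

    Hypothetical helper for illustration only; not a Pundit API.
    """
    if not graph_uris:
        raise ValueError("at least one named graph URI is required")
    # Assumption: each annotation body lives in its own named graph,
    # so listing graph URIs selects the annotations to query.
    values = " ".join(f"<{uri}>" for uri in graph_uris)
    return (
        "SELECT ?g ?s ?p ?o\n"
        "WHERE {\n"
        f"  VALUES ?g {{ {values} }}\n"
        "  GRAPH ?g { ?s ?p ?o }\n"
        "}"
    )


query = build_slice_query([
    "http://example.com/ANNOTATION-GRAPH-ID-1",
    "http://example.com/ANNOTATION-GRAPH-ID-2",
])
print(query)
```

The resulting string could be sent to any SPARQL 1.1 endpoint; changing the list of graph URIs changes which annotations take part in the answer.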

Page 15: Sda2012 pundit system

NAMED GRAPHS AS BODIES

An example annotation showing the annotation model

[Figure: Annotation 1 (ex:ANNOTATION-ID-1, an oac:Annotation with rdfs:label “Annotation 1”, dcterms:creator ex:MarcoGrassi, dcterms:created 2011-01-27 10:30:56) has oac:hasTarget http://example.com/mypage.htm#textFragment and http://example.com/img1.jpeg, and oac:hasBody ex:ANNOTATION-GRAPH-ID-1. The body graph uses semlib:mentionsAuthor, semlib:mentionsPeriod and semlib:depicts to connect the targets with semlib:DanteAlighieri and semlib:Renassance; the text fragment is labelled “Fragment: Durante gli Alighieri...”.]

Another annotation whose content can be merged with the former one

[Figure: Annotation 2 (ex:ANNOTATION-ID-2, an oac:Annotation with rdfs:label “Annotation 2”, dcterms:creator ex:MarcoGrassi, dcterms:created 2011-09-27 11:43:12) has oac:hasTarget http://example.com/mypage.htm#textFragment2 and oac:hasBody ex:ANNOTATION-GRAPH-ID-2. The body graph uses semlib:talksAbout, semlib:mentionPeriod and semlib:hasSimilarContent with semlib:Politics and semlib:Renassance; the text fragment is labelled “Fragment: Dante Alighieri life has been..”. A third panel shows the statements of both body graphs merged into a single graph.]

Named graphs keep the statements belonging to different annotations separate...

...but also make it possible to aggregate them into “composite” graphs and query them using standard SPARQL
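The separate-then-aggregate idea can be sketched without a triple store by treating every statement as a quad (graph, subject, predicate, object). This is a minimal illustration, not Pundit's storage code: the helper names are hypothetical, and the exact triples are one plausible reading of the slide's figure rather than a confirmed transcription.

```python
from collections import defaultdict

FRAG1 = "http://example.com/mypage.htm#textFragment"
FRAG2 = "http://example.com/mypage.htm#textFragment2"
IMG1 = "http://example.com/img1.jpeg"

# Assumed statements, one named graph per annotation body.
QUADS = [
    ("ex:ANNOTATION-GRAPH-ID-1", FRAG1, "semlib:mentionsAuthor", "semlib:DanteAlighieri"),
    ("ex:ANNOTATION-GRAPH-ID-1", FRAG1, "semlib:mentionsPeriod", "semlib:Renassance"),
    ("ex:ANNOTATION-GRAPH-ID-1", IMG1, "semlib:depicts", "semlib:DanteAlighieri"),
    ("ex:ANNOTATION-GRAPH-ID-2", FRAG2, "semlib:talksAbout", "semlib:Politics"),
    ("ex:ANNOTATION-GRAPH-ID-2", FRAG2, "semlib:mentionPeriod", "semlib:Renassance"),
]


def split_by_graph(quads):
    """Group triples by named graph, keeping annotations separate."""
    graphs = defaultdict(set)
    for graph, s, p, o in quads:
        graphs[graph].add((s, p, o))
    return dict(graphs)


def merge(graphs, names):
    """Aggregate the chosen named graphs into one composite graph."""
    merged = set()
    for name in names:
        merged |= graphs[name]
    return merged


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


graphs = split_by_graph(QUADS)
composite = merge(graphs, graphs)  # iterating the dict yields all graph names
print(match(composite, o="semlib:Renassance"))
```

Querying the composite graph for `semlib:Renassance` returns statements that originate from both annotations, while `graphs` still records which annotation each statement came from.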

Page 16: Sda2012 pundit system

NOTEBOOKS

• Annotations are collected in notebooks
• Users can organize their annotations
• Annotations can be aggregated to be retrieved and queried
• Different UNIX-style read/write privileges (from private to completely public)*
• A notebook can be activated or deactivated to filter the public annotations, displaying only those of interest
• Each notebook is identified by a (dereferenceable) URI

[Figure: a notebook identified by NotebookURI, with dcterms:creator, dcterms:created (2011-01-27 10:30:56), rdfs:label (“My Example Notebook”) and rdfs:comment (“An Example Notebook used to show the model”)]
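The notebook behaviour described above (privileges plus activation as a filter) can be sketched as a small data structure. This is only an illustration under simplifying assumptions: the class and function names are hypothetical, privileges are reduced to a single public/private flag, and activation is modelled as a flag on the notebook rather than a per-user setting.

```python
from dataclasses import dataclass, field


@dataclass
class Notebook:
    uri: str                  # notebooks are identified by a URI
    label: str
    owner: str
    public: bool = False      # simplified stand-in for read/write privileges
    active: bool = True       # deactivated notebooks are filtered out
    annotations: list = field(default_factory=list)


def visible_annotations(notebooks, user):
    """Collect annotations from active notebooks that the user may read."""
    result = []
    for notebook in notebooks:
        readable = notebook.public or notebook.owner == user
        if notebook.active and readable:
            result.extend(notebook.annotations)
    return result


notebooks = [
    Notebook("http://example.com/notebooks/1", "My Example Notebook", "marco",
             annotations=["Annotation 1"]),
    Notebook("http://example.com/notebooks/2", "Community notes", "anna",
             public=True, annotations=["Annotation 2"]),
    Notebook("http://example.com/notebooks/3", "Noisy notebook", "luca",
             public=True, active=False, annotations=["Annotation 3"]),
]
print(visible_annotations(notebooks, "marco"))
```

Here the third notebook is public but deactivated, so its annotations are hidden, and the first notebook is private, so only its owner sees it.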

Page 17: Sda2012 pundit system

NOTEBOOKS

• Notebooks allow annotation sharing
• Sharing a notebook is as easy as sharing its URL on the web (similarly to popular file sharing platforms)

[Figure: the example notebook (NotebookURI, “My Example Notebook”) shared through its URI with a SINGLE USER, with COMMUNITIES and with the PUBLIC; the diagram also carries the label WIKI]

Page 18: Sda2012 pundit system

NOTEBOOK MANAGEMENT

• Create new notebooks

• Set the current notebook (where the annotations are written)

• Set a notebook as private or public

• Activate/deactivate owned or public notebooks to filter annotations of interest

• Share a notebook by URI

Page 19: Sda2012 pundit system

USER AUTHENTICATION

• Authentication is based on OpenID:
  • No need to store users’ credentials
  • Already implemented by mainstream companies (Google, Yahoo, ...)
  • Can spare users multiple registrations (a waste of time, yet another password)
  • A single identity can be used across different Pundit-enabled Digital Libraries
  • Adding an OpenID provider is easy and transparent to the Pundit server

Page 20: Sda2012 pundit system

PUNDIT ARCHITECTURE

SERVER

• Open source RESTful web service (Java Jersey framework)
• Cross-origin requests
  • CORS (Cross-Origin Resource Sharing)
  • JSONP
• Sesame triple store
  • SPARQL and inference
  • Different SAILs are provided to implement different storage back ends (BigOWLIM, MySQL, PostgreSQL, Virtuoso, ...)
• MySQL for user data
• RESTful API to edit and consume annotations

CLIENT

• Set of JavaScript modules (Dojo framework)
  • Easily extendable
  • Highly customizable
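The two cross-origin options named above differ mainly in how the response is packaged. The sketch below shows that difference in a framework-neutral way; it is written in Python for brevity and is not Pundit's Jersey code. The function name `build_response` and the example origin are assumptions.

```python
import json
import re

# JSONP callbacks are restricted to plain identifiers to avoid script injection.
_CALLBACK_RE = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")


def build_response(payload, origin=None, callback=None):
    """Return (headers, body) for a cross-origin JSON response.

    With a callback name the payload is wrapped as JSONP (a script);
    otherwise plain JSON is returned with a CORS header for the origin.
    """
    body = json.dumps(payload, sort_keys=True)
    if callback is not None:
        if not _CALLBACK_RE.match(callback):
            raise ValueError("invalid JSONP callback name")
        return {"Content-Type": "application/javascript"}, f"{callback}({body});"
    headers = {"Content-Type": "application/json"}
    if origin is not None:
        # CORS: tell the browser which origin may read the response.
        headers["Access-Control-Allow-Origin"] = origin
    return headers, body


cors_headers, cors_body = build_response({"count": 2}, origin="http://dl.example.org")
jsonp_headers, jsonp_body = build_response({"count": 2}, callback="handleAnnotations")
print(cors_headers, cors_body)
print(jsonp_headers, jsonp_body)
```

CORS keeps the response as data and lets the browser enforce access, whereas JSONP works in older browsers by loading the response as a script, which is why the callback name is validated.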

Page 21: Sda2012 pundit system

DIFFERENT ANNOTABLE CONTENTS

• Pundit allows the annotation of different types of content at different levels of granularity:
  • Text fragments
  • Images
  • Image fragments (under development)
  • Videos and video fragments (experimental, in Semtube)

Page 22: Sda2012 pundit system

• Semantic annotation of YouTube videos (alpha state) based on Pundit JavaScript libraries and annotation server

http://semedia.dii.univpm.it/semtube

Page 26: Sda2012 pundit system

DIFFERENT TYPES OF ANNOTATIONS

Annotations with different levels of expressivity and structure

• Textual comments
• Semantic tags
  • Automatically extracted from textual comments (DBpedia Spotlight)
  • Popular Linked Data services (DBpedia, Freebase, WordNet, ...)
  • Define your own source of named entities (SPARQL endpoint, HTTP API)

[Figure: Comment/Tag Panel]

Page 27: Sda2012 pundit system

DIFFERENT TYPES OF ANNOTATIONS

Annotations with different levels of expressivity and structure

• Textual comments
• Semantic tags
• Semantic relations
  • Subject-property-object statements
  • Drag & drop and suggestions
  • Connect different resources (user selections, Linked Data entities, ...) with semantically defined properties

[Figure: Triple Composer]
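One way to picture subject-property-object statements with suggestions is to give each property an allowed subject type and object type, then offer only the properties that fit the selected resources. The sketch below assumes exactly that; the type names, the `PROPERTIES` table and both functions are hypothetical and do not describe how the Triple Composer is actually implemented.

```python
# Assumed property definitions: which resource types each property may connect.
PROPERTIES = {
    "semlib:mentionsAuthor": {"domain": {"text-fragment"}, "range": {"person"}},
    "semlib:depicts": {"domain": {"image"}, "range": {"person", "place"}},
    "semlib:talksAbout": {"domain": {"text-fragment"}, "range": {"topic"}},
}


def suggest_properties(subject_type, object_type=None):
    """List properties usable with the given subject (and optional object) type."""
    names = []
    for name, spec in PROPERTIES.items():
        if subject_type not in spec["domain"]:
            continue
        if object_type is not None and object_type not in spec["range"]:
            continue
        names.append(name)
    return sorted(names)


def compose(subject, prop, obj):
    """Build a (subject, property, object) triple after a type check."""
    spec = PROPERTIES[prop]
    if subject["type"] not in spec["domain"] or obj["type"] not in spec["range"]:
        raise ValueError(f"{prop} cannot connect {subject['type']} to {obj['type']}")
    return (subject["uri"], prop, obj["uri"])


image = {"uri": "http://example.com/img1.jpeg", "type": "image"}
dante = {"uri": "semlib:DanteAlighieri", "type": "person"}
print(suggest_properties("image", "person"))
print(compose(image, "semlib:depicts", dante))
```

Checking types before building the triple keeps user-created statements consistent with the vocabulary, which is what makes them reusable by other software.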

Page 29: Sda2012 pundit system

CUSTOM VOCABULARIES

• Pundit allows the use of custom vocabularies/taxonomies (and relations):
  • Create a JSONP file (manually or automatically from an ontology)
  • Put it online
  • Add its URL to the configuration to import and use it
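The first step above, generating a JSONP file from a list of terms, might look roughly like the following. The slides do not show the file format Pundit expects, so the field names (`name`, `items`, `uri`, `label`) and the callback name `loadVocabulary` are purely illustrative assumptions.

```python
import json


def vocabulary_to_jsonp(name, terms, callback="loadVocabulary"):
    """Serialize (uri, label) pairs as a JSONP document.

    The layout is hypothetical; the real Pundit vocabulary format may differ.
    """
    payload = {
        "name": name,
        "items": [{"uri": uri, "label": label} for uri, label in terms],
    }
    return f"{callback}({json.dumps(payload, indent=2)});"


document = vocabulary_to_jsonp(
    "Example taxonomy",
    [
        ("http://example.org/vocab/Politics", "Politics"),
        ("http://example.org/vocab/Renaissance", "Renaissance"),
    ],
)
print(document)
```

The printed text would then be saved to a file, published at a public URL, and that URL added to the client configuration, as the remaining two steps describe.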

Page 31: Sda2012 pundit system

CROSS PAGE / DOMAIN ANNOTATIONS

• A special bookmarklet allows users to launch Pundit on any web page to perform annotations
• Selected resources (text fragments, images, ...) on different pages and domains can be added to “My Items”, to be stored on the server and reused on different pages

[Figure: screenshots labelled “Add to My Items”, “Use in another page” and “Create cross page semantic relations” (example relation: cites)]

Page 32: Sda2012 pundit system

NAMED CONTENT

• DLs change over time
  • Presentation can be restyled and content can be re-organized
  • The same content can appear in different pages
  • Some parts of the page should not be annotated (menu, ...)
• Specific markup can be added to the pages to allow Pundit to:
  • identify atomic pieces of content (by means of URIs)
  • attach annotations to such content
  • avoid annotating accessory page components

<div class="pundit-content" about="http://example.org/contents/123">
  <!-- HTML goes here. -->
  <p>This is a named content and contains both text and a picture</p>
  <img src="http://example.org/pictires/pictire123.png" />
  <p><em>Caption:</em> this is a caption.</p>
</div>
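To illustrate why naming content by URI makes annotations independent of page layout, the sketch below extracts the `about` URI of every `pundit-content` element and looks annotations up by that URI. It uses Python's standard `html.parser`; the helper names, the two sample pages and the in-memory `ANNOTATIONS` store are hypothetical and stand in for the real client and server.

```python
from html.parser import HTMLParser


class NamedContentFinder(HTMLParser):
    """Collect the `about` URI of every element with class pundit-content."""

    def __init__(self):
        super().__init__()
        self.uris = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if "pundit-content" in classes and attributes.get("about"):
            self.uris.append(attributes["about"])


def named_content_uris(html):
    finder = NamedContentFinder()
    finder.feed(html)
    finder.close()
    return finder.uris


# Two differently structured pages embedding the same named content.
PAGE_A = ('<div id="menu">Menu</div>'
          '<div class="pundit-content" about="http://example.org/contents/123">'
          '<p>Text</p></div>')
PAGE_B = ('<body><section><div class="pundit-content" '
          'about="http://example.org/contents/123"><p>Text</p></div>'
          '</section></body>')

# Stand-in for the annotation server: annotations keyed by content URI.
ANNOTATIONS = {"http://example.org/contents/123": ["Annotation 1"]}


def annotations_for_page(html):
    return [a for uri in named_content_uris(html) for a in ANNOTATIONS.get(uri, [])]


print(annotations_for_page(PAGE_A))
print(annotations_for_page(PAGE_B))
```

Because the lookup key is the content URI rather than the page address, both pages resolve to the same annotations, and the menu, which carries no `pundit-content` markup, is never a target.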

Page 34: Sda2012 pundit system

NAMED CONTENT

The same content in different pages shows the same annotations!


Page 36: Sda2012 pundit system

CONSUMING THE ANNOTATIONS

• The Pundit server provides RESTful APIs to consume annotations.

• (Public) annotations can be consumed by third-party applications.

• Apps to display and reuse Pundit annotations are currently being conceived and developed

Page 37: Sda2012 pundit system

ASK THE PUND

• A social web app that consumes people's annotations and lets groups of people organize them into a shared collection, telling a meaningful story with it.

http://ask.thepund.it/

Page 38: Sda2012 pundit system

EDGEMAPS VISUALIZATION

• An Edgemaps graph populated with Pundit annotations

http://thepund.it/edgemaps_demo/demo.html

Page 39: Sda2012 pundit system

TIMELINE ANNOTATION

http://ask.thepund.it/#/timeline/31951d93

Page 40: Sda2012 pundit system

MORE...

• Find out and suggest more: http://thepund.it/okfest.php

...and don’t forget to leave some feedback :-) !!!

Page 41: Sda2012 pundit system

DEMO TIME!

http://thepund.it

Page 42: Sda2012 pundit system

http://thepund.it

THANK YOU!

Semlib Project (EU project): http://www.semlibproject.eu/
DM2E (EU project): http://dm2e.edu/
AGORA (EU project): http://project-agora.eu/

SDA 2012, Semantic Digital Archives

Semedia (Semantic Web and Multimedia)
http://semedia.dii.univpm.it
www.netseven.it/

This work is licensed under a Creative Commons Attribution 3.0 Unported (CC BY 3.0)