Datalift: A Catalyser for the Web of Data - François Scharffe
DESCRIPTION
Talk at Web Science Montpellier Meetup - 13th May 2011
TRANSCRIPT
Webscience meetup 5/02/2011 1
With the help of the Datalift team
And the support of the French National Research Agency
Datalift: A Catalyser for the Web of Data
François Scharffe
University of Montpellier, LIRMM, [email protected]
@lechatpito
Datalift and Web-Science
Datalift
A large scale Web data publication experiment.
Objectives:
- Publish reference datasets
- Automate the data publication process
- Show the interest of publishing linked data
Datalift
Motivation:
- Two phenomena:
- Society – Open Data
- Technology – Semantic Web
A data revolution is under way: the web of data is exploding as the web of documents exploded in the 1990s
Datalift
Datasets publication
R&D to automate the publication process
A modular architecture to assist data publication
Training, tutorials, data publication camps
Welcome aboard the data lift
Published and interlinked data on the Web
Applications
Interconnection
Publication infrastructure
Data conversion
Vocabulary selection
Raw data
SemWebPro 18/01/2011 7
1st floor - Selection
Vocabulary selection
Vocabularies for linked data
- Are meant to describe resources in RDF
- Are based on one of the standard W3C languages, RDFS or OWL
What makes a good vocabulary?
- A good vocabulary is a used vocabulary
- Other usability criteria: simplicity, visibility, documentation, flexibility, semantic integration, social integration
Types of vocabularies
- Metadata, reference, domain, general
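As an illustration of what "describing resources in RDF" with a widely used vocabulary can look like, here is a minimal sketch that writes three statements about a person using FOAF. The FOAF and RDF namespace URIs are the real ones; the person URI, the name and the homepage are made-up placeholders, and the helper functions are hypothetical names chosen for this example only.

```python
# Namespaces of two widely used vocabularies (real URIs).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"


def triple(subject, predicate, obj, literal=False):
    """Serialise one statement as an N-Triples line."""
    if literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = '"' + escaped + '"'
    else:
        rendered = "<" + obj + ">"
    return "<%s> <%s> %s ." % (subject, predicate, rendered)


def describe_person(uri, name, homepage):
    """Describe a person by reusing FOAF terms instead of inventing new ones."""
    return [
        triple(uri, RDF + "type", FOAF + "Person"),
        triple(uri, FOAF + "name", name, literal=True),
        triple(uri, FOAF + "homepage", homepage),
    ]


# Placeholder data: none of these URIs refer to a real resource.
lines = describe_person(
    "http://example.org/id/person/1",
    "Jane Doe",
    "http://example.org/~jane",
)
for line in lines:
    print(line)
```

Reusing terms that other publishers already use is what lets independent datasets be queried together, which is the sense in which "a good vocabulary is a used vocabulary".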
Vocabulary of a Friend
- http://www.mondeca.com/foaf/voaf
- A simple vocabulary...
- To represent interconnections between vocabularies
- A single entry point to the vocabularies and datasets of the Linked Data cloud
- Ongoing work in Datalift
2nd floor - Conversion
Reference datasets, URI design
- Providing reference datasets for the French ecosystem: geographical, topological, statistical, political. Ex: http://parisemantique.fr
- Providing URI design guidelines
  - Opaque or transparent URIs?
  - Usage of accents in URIs
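The two URI design questions above can be made concrete with a small sketch. It mints three candidate URIs for the same label: a transparent URI with accents stripped, a transparent URI with accents kept and percent-encoded, and an opaque URI. The base namespace is a hypothetical placeholder, and using a hash prefix as the opaque key is an assumption made for the example; a dataset would more likely use an existing identifier.

```python
import hashlib
import unicodedata
from urllib.parse import quote

BASE = "http://example.org/id/commune/"  # hypothetical namespace


def strip_accents(text):
    """Drop combining marks after canonical decomposition (é -> e, ç -> c)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def slugify(text):
    """Lower-case the label and replace spaces with hyphens."""
    return unicodedata.normalize("NFC", text).lower().replace(" ", "-")


def transparent_ascii_uri(label):
    """Readable URI, accents removed so the path is plain ASCII."""
    return BASE + quote(strip_accents(slugify(label)))


def transparent_encoded_uri(label):
    """Readable URI, accents kept and percent-encoded as UTF-8."""
    return BASE + quote(slugify(label))


def opaque_uri(label):
    """Opaque URI: the path carries no meaning (assumed scheme: SHA-1 prefix)."""
    digest = hashlib.sha1(label.encode("utf-8")).hexdigest()[:10]
    return BASE + digest


label = "Besan\u00e7on"
print(transparent_ascii_uri(label))
print(transparent_encoded_uri(label))
print(opaque_uri(label))
```

The trade-off is the usual one: transparent URIs are easier to read and guess but change when the label changes, while opaque URIs stay stable at the cost of readability.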
Conversion tools to RDF
- How is the raw data to be converted?
  - Relational database?
  - (Semi-)structured formats?
  - Programmatic access (API)?
- There are solutions for all cases
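For the (semi-)structured case, a conversion can be sketched in a few lines. The example below is not one of the tools discussed in the talk; it is a minimal illustration, assuming a CSV input whose column names are already safe to use in URIs. The base URI, the vocabulary namespace and the sample rows are all made up.

```python
import csv
import io

BASE = "http://example.org/id/city/"  # hypothetical resource namespace
VOCAB = "http://example.org/vocab#"   # hypothetical vocabulary namespace


def escape_literal(value):
    """Escape characters that are special inside an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def csv_to_ntriples(text, key_column):
    """Turn each CSV row into one resource and each non-empty cell into a triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = "<%s%s>" % (BASE, row[key_column])
        for column, value in row.items():
            if column == key_column or not value:
                continue  # the key becomes the URI; empty cells yield no statement
            triples.append(
                '%s <%s%s> "%s" .' % (subject, VOCAB, column, escape_literal(value))
            )
    return triples


# Made-up sample data; the second row has no population value.
SAMPLE = "code,name,population\nc1,Alphaville,1000\nc2,Betatown,\n"

for line in csv_to_ntriples(SAMPLE, "code"):
    print(line)
```

In a real conversion the column-to-property mapping would point at terms from the vocabularies selected on the first floor rather than at an ad hoc namespace.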
3rd floor - Publication
4th floor - Interconnection
Towards automated interconnection services
- Record linkage, entity reconciliation, instance, ontology, schema matching
  - Using alignments between vocabularies
  - Detection of discriminating properties
  - Indicating comparison methods by attaching metadata to ontologies
- Work in progress in Datalift
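To give a feel for two of the ideas listed above, the sketch below scores how discriminating a property is (share of distinct values) and then links records from two datasets by comparing that property. This is not the Datalift interlinking service: the scoring rule, the string comparison (Python's difflib) and the threshold are assumptions chosen for the example, and the two tiny datasets are made up.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # real OWL property URI


def discriminability(records, prop):
    """Share of distinct values for a property; 1.0 means it identifies records."""
    values = [rec[prop] for rec in records.values() if rec.get(prop)]
    return len(set(values)) / len(values) if values else 0.0


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(source, target, prop, threshold=0.9):
    """Emit an owl:sameAs triple for each source record and its best target match."""
    links = []
    for s_uri, s_rec in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_rec in target.items():
            score = similarity(s_rec[prop], t_rec[prop])
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append("<%s> <%s> <%s> ." % (s_uri, OWL_SAME_AS, best_uri))
    return links


# Made-up datasets keyed by resource URI.
A = {
    "http://a.example/1": {"name": "Montpellier", "country": "France"},
    "http://a.example/2": {"name": "Paris", "country": "France"},
}
B = {
    "http://b.example/x": {"name": "montpellier", "country": "France"},
    "http://b.example/y": {"name": "Lyon", "country": "France"},
}

# Pick the most discriminating property, then link on it.
best_prop = max(["name", "country"], key=lambda p: discriminability(A, p))
for line in link(A, B, best_prop):
    print(line)
```

Here "name" is chosen over "country" because every record has a different name, whereas all records share the same country and so that property cannot tell them apart.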
5th floor - Applications
“It is a time when, even if nets were to guide all consciousness that had been converted to photons and electrons toward coalescing, standalone individuals have not yet been converted into data to the extent that they can form unique components of a larger complex”
Mamoru Oshii, Ghost in the Shell