Digital Archiving 3.0
“My data open on the Web, ok but how ?”
Christophe Guéret (@cgueret)
Open Data on the Web, 23 - 24 April 2013
Data Archiving and Networked Services
DANS is an institute of KNAW and NWO
A bit of context
http://cedar-project.nl
http://easy.dans.knaw.nl
Put your data open on the Web!
“Sharing knowledge: EC-funded projects on scientific information in the digital age”
“E-Data & Research”, October 2011
Where is your research data ?
It is available as an RDF/XML dump on my test server
Just get it from the web site of the research project
I think I have it somewhere on a stick, let me check...
All bad answers, really.
● We need research data to be
– Accessible/readable/usable by anyone
– Available in many (>1) years from now
– With traceable provenance and usages
● Dumping the data on a web site somewhere is not enough
Solution: use a repository
● Data repositories will take over serving the data and have a page for it!
● Repositories hold two types of data
– The data stored
– The meta-data about this data
Which format for meta-data ?
● LOD is a perfect fit for describing data
– Used to refer to and link data items
– Facilitates discovery, easy to crawl/index
– One description per data item stored
– Redirects to actual location of the data
● Remaining question: how much meta-data is needed?
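A minimal sketch of what one such description could look like, written as JSON-LD built from a plain Python dictionary. The choice of the DCAT and Dublin Core terms is an assumption made for illustration, as are the `example.org` URIs and the `describe_item` helper. Here the description points to the actual location of the data through `dcat:downloadURL` rather than through an HTTP redirect.

```python
import json

# Namespaces of the two vocabularies assumed for this sketch.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"


def describe_item(item_uri, title, download_url, media_type):
    """Return a JSON-LD description for one stored data item."""
    return {
        "@context": {"dcat": DCAT, "dct": DCT},
        "@id": item_uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            # Where the bytes actually live (hypothetical location).
            "dcat:downloadURL": {"@id": download_url},
            "dcat:mediaType": media_type,
        },
    }


# All URIs below are hypothetical placeholders.
description = describe_item(
    item_uri="http://repository.example.org/item/1234",
    title="Example research dataset",
    download_url="http://storage.example.org/files/1234.csv",
    media_type="text/csv",
)

print(json.dumps(description, indent=2))
```

One description like this per stored item is enough for a crawler to discover the item and follow the link to the data itself.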
Which format for the data?
● Many formats around: PDF, SDF, DSPL, XLS, RDF, CSV, SHP, JSON-LD, ...
● Translation will imply some extra work for the data owner and will not please everyone
Select vocabularies to describe your resources
Buy a domain name, decide on a URI scheme for your data
Express your data as described resources
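To give an idea of the extra work these three steps imply for a data owner, here is a rough sketch that turns one tabular record into N-Triples. Everything specific in it is an assumption: the `data.example.org` domain, the `/vocab/` namespace, the `<dataset>/<key>` URI scheme, and the `row_to_triples` helper are all hypothetical, and a real conversion would reuse established vocabularies instead of minting one property per column.

```python
from urllib.parse import quote

# Hypothetical domain name and home-made vocabulary namespace.
BASE = "http://data.example.org/"
VOCAB = "http://data.example.org/vocab/"


def escape(value):
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_triples(dataset, key, row):
    """Describe one record as N-Triples lines.

    URI scheme assumed here: BASE + dataset + "/" + value of the key column.
    Every other column becomes a plain string literal.
    """
    subject = f"<{BASE}{quote(dataset, safe='')}/{quote(str(row[key]), safe='')}>"
    triples = []
    for column, value in row.items():
        if column == key:
            continue
        predicate = f"<{VOCAB}{quote(column, safe='')}>"
        triples.append(f'{subject} {predicate} "{escape(str(value))}" .')
    return triples


record = {"id": "42", "city": "Amsterdam"}
for line in row_to_triples("census", "id", record):
    print(line)
```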
Solution: use a repository
● Data repositories will take over serving your data
● Just get the data in the repository
● Repositories will take care of everything
● PS: forget about HTTP URIs for data
Format evolution
● Use content negotiation to translate and serve different data formats
● Ensure everyone gets the format they want
??
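As a rough illustration of the content negotiation idea, the sketch below picks a format from the HTTP `Accept` header sent by a client. It is a simplification rather than a complete implementation: it honours `q` values and the `*/*` wildcard but ignores partial wildcards such as `text/*` and the specificity rules of the HTTP specification. The `AVAILABLE` set and the `negotiate` and `parse_accept` names are hypothetical.

```python
# Formats the archive is assumed to be able to produce (hypothetical list).
AVAILABLE = {"text/csv", "application/ld+json", "application/rdf+xml"}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(header, available, default):
    """Return the media type to serve, or None if nothing acceptable exists."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return None


print(negotiate("text/csv;q=0.5, application/ld+json", AVAILABLE, "text/csv"))
```

A server that receives `None` from `negotiate` would typically answer with status 406 (Not Acceptable) or fall back to a default format.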
Next generation archives
● Provide long term access to data in several formats
● Publish Linked Open Meta-Data about the data stored (DCAT, ...)
● Facilitate moving data around archives (LDP, ...)