LDS3: Applying Preservation Principals to Linked Data Systems
David Tarrant (@davetaz)
Open Planets Foundation / University of Southampton
iPres2012, Toronto, October 2012
This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).
Present Day
Presenting the REF
The Results Evaluation Framework
• 5 Tools (Droid, Fits, file, fido, Tika)
• 65 Versions (from 2008 to now)
• 1 Govdocs Corpus
• 1 Question…
How accurate are file format identification tools historically?
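A minimal sketch of how that question could be scored: compare what each tool version reports against a ground truth and compute accuracy per version. The file names, the results, and the function `accuracy_by_version` are made up for illustration and are not taken from the REF.

```python
from collections import defaultdict

# Hypothetical ground truth: file name -> expected MIME type.
GROUND_TRUTH = {
    "a.pdf": "application/pdf",
    "b.doc": "application/msword",
    "c.zip": "application/zip",
}

# Hypothetical results: (tool, version, file name, reported MIME type).
RESULTS = [
    ("file", "5.03", "a.pdf", "application/pdf"),
    ("file", "5.03", "b.doc", "application/zip"),   # a wrong answer
    ("file", "5.03", "c.zip", "application/zip"),
    ("file", "5.07", "a.pdf", "application/pdf"),
    ("file", "5.07", "b.doc", "application/msword"),
    ("file", "5.07", "c.zip", "application/zip"),
]


def accuracy_by_version(results, truth):
    """Return {(tool, version): fraction of files identified correctly}."""
    counts = defaultdict(lambda: [0, 0])  # (tool, version) -> [correct, total]
    for tool, version, name, reported in results:
        tally = counts[(tool, version)]
        tally[1] += 1
        if truth.get(name) == reported:
            tally[0] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


scores = accuracy_by_version(RESULTS, GROUND_TRUTH)
for (tool, version), score in sorted(scores.items()):
    print(f"{tool} {version}: {score:.0%}")
```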
[Figure: PDF 1.4]
[Figure: DOCX]
9 Months Ago
Why is Data Important?
• Data and Metadata are knowledge.
• Knowledge is power.
• Knowledge enables decision.
• Knowledge enables process.
• Knowledge empowers action.
• Knowledge enables us to say because…
Processes
[Figure: A Classic Flow Chart, with Process and Decision boxes and three DATA inputs]
Data is key to making decisions
Policy
[Figure: A Preservation Flow Chart, with Process and Policy boxes and three DATA inputs]
Data is key to informing policy
Policy Data - Generated
• When?
• Who?
• What does it affect?
• What action is taken?
• Why?
[Figure: Policy]
Why?
• Because something said so?
• When?
• Who?
• What does it affect?
• What action is taken?
• Why?
[Figure: three DATA inputs]
Case Study Example (Opinion)
• Due to format obsolescence, all Flash video files are to be migrated to H.264/AAC.
• Input data: a study on the proliferation of Flash and evidence of lacking support from the rights holder, Adobe.
• File B was created from File A a year ago because File A was identified as a Flash video file.
• Today, File A is identified as an Ogg video file.
• What has changed? Why? Does it affect me? Who generated the wrong information? Did they generate any other wrong information?
I Don’t Know!
6 Months Ago
A Fact?
[Figure: File#1 hasIdentification application/zip]
Provenance
• Tarrant, David and Carr, Leslie (2012) LDS3: Applying Digital Preservation Principals to Linked Data Systems. In, Ninth International Conference on Digital Preservation (iPres2012), Toronto, Canada.
[Figure labels: Tim Berners-Lee, Provides, 5-Star Linked Data Guide]
Data!!!
• One fact.
• One document the fact comes from.
• One citation about the document's place of publication.
• Who, What, When and Where.
• Who they worked for and with.
Named-Graph
• In Linked Data a document is called a named-graph.
• But these also get used for two purposes!!
[Figure: File#1 hasIdentification application/zip]
The two uses of the named-graph
No. 1 – Data Publication
[Figure: three DATA inputs and a Named-Graph containing File#1 hasIdentification application/zip]
The two uses of the named-graph
No. 2 – Data Discovery/Query
[Figure: a Named-Graph containing File#1 hasIdentification application/zip, three DATA inputs, and a second statement File#1 hasIdentification application/msword]
The two uses of the named-graph
No. 2 – Data Discovery/Query
[Figure labels: Works For (twice); Named-Graph: File#1 hasIdentification application/zip; Named-Graph: File#1 hasIdentification application/zip; File#1 hasIdentification application/msword]
Quads
[Figure: Query Graph, Source Graph 1 and Source Graph 2; statements shown: File#1 hasIdentification application/zip and File#1 hasIdentification application/msword]
After all, RDF is a graph model
RDF the spec, not the RDF/XML serialization
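To make the quad idea concrete, here is a small sketch: each statement is a triple plus the name of the graph it came from, so a query can return both answers and say where each one originated. The graph names and the pairing of each identification with a source graph are assumptions for illustration, not LDS3's actual data model.

```python
from typing import List, NamedTuple, Tuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # name of the source graph the triple was published in


# Assumed pairing of statements to source graphs, for illustration only.
QUADS = [
    Quad("File#1", "hasIdentification", "application/zip", "SourceGraph1"),
    Quad("File#1", "hasIdentification", "application/msword", "SourceGraph2"),
]


def query(quads: List[Quad], subject: str, predicate: str) -> List[Tuple[str, str]]:
    """Query across all graphs; return (object, source graph) pairs."""
    return [
        (q.obj, q.graph)
        for q in quads
        if q.subject == subject and q.predicate == predicate
    ]


def statements_from(quads: List[Quad], graph: str) -> List[Quad]:
    """Everything one source said, e.g. to audit a source that was wrong once."""
    return [q for q in quads if q.graph == graph]


answers = query(QUADS, "File#1", "hasIdentification")
for obj, graph in answers:
    print(f"{obj} (from {graph})")
```

Keeping the fourth element is what lets the conflicting identifications coexist: neither overwrites the other, and each can be traced back to its source.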
Quads
[Figure: Query Graph, Source Graph 1 and Source Graph 2; statements shown: File#1 hasIdentification application/zip and File#1 hasIdentification application/msword; tool labels: usesTool File 5.04, usesTool File 5.07]
Still with me…
• OK, so what about versioning?
[Figure: named-graph File1/Identification/tool/file/version/5.03 with labels File#1, hasIdentification, University of Southampton; named-graph File1/Identification/tool/file/version/5.07 with labels File#1, hasIdentification, application/msword]
Latest
[Figure: /File1/Identification/tool/file/ labelled Latest; named-graph File1/Identification/tool/file/version/5.03 with labels File#1, hasIdentification, University of Southampton; named-graph File1/Identification/tool/file/version/5.07 with labels File#1, hasIdentification, application/msword; a "previous version" link between the versions]
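The versioning picture can be sketched as a chain: every published version gets its own URI, points back at the one before it, and a "latest" pointer always names the newest. The class and method names below are hypothetical, and the identification value used for version 5.03 is an assumption; this is not the LDS3 implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GraphVersion:
    uri: str
    identification: str
    previous: Optional[str] = None  # URI of the previous version, if any


class VersionedGraph:
    """Hypothetical container for one named-graph and all of its versions."""

    def __init__(self, base_uri: str) -> None:
        self.base_uri = base_uri
        self.versions: Dict[str, GraphVersion] = {}
        self.latest: Optional[str] = None  # URI the base URI currently resolves to

    def publish(self, version: str, identification: str) -> str:
        """Add a new version and link it back to the current latest one."""
        uri = f"{self.base_uri}version/{version}"
        self.versions[uri] = GraphVersion(uri, identification, previous=self.latest)
        self.latest = uri
        return uri

    def history(self) -> List[str]:
        """Follow the previous-version links from latest back to the first."""
        chain = []
        uri = self.latest
        while uri is not None:
            chain.append(uri)
            uri = self.versions[uri].previous
        return chain


graph = VersionedGraph("File1/Identification/tool/file/")
graph.publish("5.03", "application/zip")     # assumed value, for illustration
graph.publish("5.07", "application/msword")
print(graph.latest)
print(graph.history())
```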
3 Months Ago
www.LDS3.org
• A technical solution to all the complexity, automatic:
• Versioning
• Linking
• Annotation
• Named-Graph Management
• Query Management
Demo
www.LDS3.org
• CRUD
• Based upon SWORDv2
• OAuth authentication
In the paper
• Links between P2-Registry, PRONOM and LDS3
• Description of the LDS3 specification
• Overview of software in the LDS3 stack (hardly any of it is new)
• How LDS3 relates to Amazon S3
• More on named-graph versioning
• More on information and non-information resources
2 Months Ago
DEMO
• http://dev.lds3.org/admin/timemachine.php?uri=http://dev.lds3.org/doc/B1/E3/7F01/8ACE-43BA-9AA9-B708B7A20263
Present Day
Presenting the REF
The Results Evaluation Framework
• 5 Tools (Droid, Fits, file, fido, Tika)
• 65 Versions (from 2008 to now)
• 1 Govdocs Corpus
• 1 Question…
How accurate are file format identification tools historically?
PDF 1.4
http://data.openplanetsfoundation.org/ref/pdf/pdf_1.4/
DOCX
http://data.openplanetsfoundation.org/ref/docx/
Back To The Future
The Future
• Get me the identification for a file as it would have been on 3rd October 2010.
GET /ref/?query="SELECT ?identification where file = X" HTTP/1.1
Accept-Datetime: Sun, 3 Oct 2010 12:00:00 GMT
Accept: text/plain
application/zip
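A rough sketch of what a server might do with that request: parse the `Accept-Datetime` header (the header used for datetime negotiation in the Memento framework) and return the most recent identification recorded at or before that moment. The history entries and the function `identification_at` are hypothetical; this is not how the REF or LDS3 is known to implement it.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import List, Optional, Tuple

# Hypothetical history for one file: (time recorded, identification).
HISTORY: List[Tuple[datetime, str]] = [
    (datetime(2010, 1, 1, tzinfo=timezone.utc), "application/zip"),
    (datetime(2012, 1, 1, tzinfo=timezone.utc), "application/msword"),
]


def identification_at(
    history: List[Tuple[datetime, str]], accept_datetime: str
) -> Optional[str]:
    """Return the identification in force at the requested HTTP date."""
    when = parsedate_to_datetime(accept_datetime)  # timezone-aware for "GMT"
    candidates = [(ts, ident) for ts, ident in history if ts <= when]
    if not candidates:
        return None  # nothing had been recorded yet at that time
    # Pick the entry with the latest timestamp not after the requested time.
    return max(candidates, key=lambda entry: entry[0])[1]


print(identification_at(HISTORY, "Sun, 3 Oct 2010 12:00:00 GMT"))
```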