Digital | Curation | Centre
UK Digital Curation Centre An Introduction
Dr Liz Lyon, Associate Director Outreach
IACMST MED Forum, November 2005
Funded by:
This work is licensed under a Creative Commons License: Attribution-ShareAlike 2.0
2
Repositories and digital curation
• Data preservation: static; for later use?
• Data curation: dynamic; in use now (and the future)?
“maintaining and adding value to a trusted body of digital information for current and future use”
3
Assuring permanent access to the records of science & the humanities?
Long term access to primary data
• Increasing data volumes from eScience and Grid-enabled / cyberinfrastructure applications
• Changing research paradigm: data-driven science, “big science”
• Observational data, simulations, large-scale experimentation
• Multi-media resources, statistical data, surveys, geo-spatial data……
4
Facilitate “post-processing” and knowledge extraction
Enable the acquisition of newly-derived information and knowledge
• Run complex algorithms over primary datasets
• Mining (data, text, structures)
• Modelling (economic, climate, mathematical, biological)
• Analysis (statistical, lexical, pattern matching, gene)
• Presentation (visualisation, rendering)
5
6
Provide additional functionality beyond digital preservation processes: adding value
Annotations
• Gene and protein sequences
• e-Lab books (Smart Tea Project in chemistry)
7
Research & e-Science workflows
Aggregator services: national, commercial
Repositories: institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Validation
Harvesting metadata
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Deposit / self-archiving
Peer-reviewed publications: journals, conference proceedings
Publication
Validation
Data analysis, transformation, mining, modelling
Searching, harvesting, embedding
Presentation services: subject, media-specific, data, commercial portals
Resource discovery, linking, embedding
Linking
The scholarly knowledge cycle: linking research data to publications
eBank UK Project: http://www.ukoln.ac.uk/projects/ebank-uk/
Emerging policy on open access to data
8
Issues: generic data models, metadata schema & terminology
• Validation against other schema
• Complex digital objects and packaging options
– METS
– MPEG-21 DIDL
• Terminologies
– Domain: marine?
– Inter-disciplinary, e.g. wider environment, bio…
– Metadata and vocabularies
– Meaningful resource discovery?
9
Ontologies for discovery in an interdisciplinary world
• Transform the ‘list’ into an ‘ontology’
• Embed ontology into the deposition process
• Aggregators use keywords for linking with the broader literature
• Researchers use keyword ontology in search and discovery services
• Formal vs informal “folksonomies”
• Web 2.0???
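A minimal sketch of the step from keyword 'list' to 'ontology', assuming each keyword simply records one broader term. The marine terms, the `BROADER` table and both function names are hypothetical illustrations, not part of any DCC or eBank UK service.

```python
from collections import deque

# Hypothetical ontology fragment: each keyword points to its broader term.
BROADER = {
    "salinity": "oceanography",
    "sea temperature": "oceanography",
    "plankton": "marine biology",
    "oceanography": "marine science",
    "marine biology": "marine science",
}


def narrower_terms(term, broader=BROADER):
    """Return the term followed by every keyword beneath it in the hierarchy."""
    children = {}
    for child, parent in broader.items():
        children.setdefault(parent, []).append(child)
    found, queue = [term], deque([term])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


def search(records, term):
    """Return ids of records tagged with the term or any narrower keyword."""
    wanted = set(narrower_terms(term))
    return [rid for rid, tags in records.items() if wanted & set(tags)]
```

With a flat list, a search for "oceanography" finds only records tagged with that exact word; with the hierarchy, a record tagged "salinity" is found as well.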
10
Issues: Persistent identifiers for data (image) citation
• Use cases: depositor, author, service provider, reader, publisher, ?
• Schemes: DOI, Handle, ARK, PURL
• Global identification: express as http URIs
• Added value services: CrossRef, resolution service, integration (Globus), look-up service, ?
• Degree of trust or persistence
• Costs
• Future potential: political, ?
• Domain identifiers: e.g. International Chemical Identifier (InChI) codes
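A minimal sketch of "express as http URIs" for the schemes listed on this slide. The resolver prefixes are the public resolvers in common use today and are an assumption here, since the slides name none; `to_http_uri` is a hypothetical helper, and the Handle and ARK values in the usage note are made-up examples.

```python
# Resolver prefixes per scheme (assumed; not taken from the slides).
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}


def to_http_uri(scheme, identifier):
    """Express a persistent identifier as an http(s) URI."""
    scheme = scheme.lower()
    if scheme == "purl":
        # A PURL is already an http URI, so it is returned unchanged.
        return identifier
    if scheme not in RESOLVERS:
        raise ValueError(f"unknown identifier scheme: {scheme}")
    return RESOLVERS[scheme] + identifier
```

For example, `to_http_uri("doi", "10.1000/182")` gives `https://doi.org/10.1000/182`, and a hypothetical ARK `ark:/12345/x1` becomes `https://n2t.net/ark:/12345/x1`. The identifier string itself stays the same; only the resolver in front of it is a deployment choice, which is where the questions of trust, persistence and cost arise.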
11
Issues: Integration into (marine) research workflows
• R4L (Repository for the Laboratory) Project (JISC-funded): automated data capture from instrumentation, registration of results
• SMART TEA: electronic laboratory notebook + annotations
• Publishers??
• Research assessment (RAE) process?
12
UK Digital Curation Centre
• Delivering services
• Development activities
• Research agenda
• Outreach Programme
• http://www.dcc.ac.uk/
13
DCC people (some of them…)
• Management & Co-ordination
– Director Chris Rusbridge (University of Edinburgh)
• Community Support & Outreach
– Led by Dr Liz Lyon (UKOLN, University of Bath)
• Service Definition & Delivery
– Led by Professor Seamus Ross (HATII, University of Glasgow)
• Development
– Led by Dr David Giaretta (Astronomical Software & Services, CCLRC)
• Research
– Led by Professor Peter Buneman (Informatics, University of Edinburgh)
14
User requirements analysis: some sound bytes…
R&D issues: Annotation services, Ontology development, Automating metadata creation, Tools and toolkits, Data Format Description Language, Identifiers, Registries, Economic and cost-benefits studies
Advisory services: “Ask-a-Curator”, FAQs, reports, briefings, awareness-raising materials, best practice guidance, Storage media, “Like Erpanet”, advise Government, Research Councils, funding bodies
Professional development: Short courses, conferences, seminars, workshops, secondments to DCC and to working repository services
Outreach: Leadership for the future, case studies, sharing solutions, collaboration with other partners, international peers, industry links
Taxonomy of “Users”
15
Advisory services
• Responses to queries—from legal to technical guidance [email protected]
• FAQs constructed
• Some useful resources…..
16
Digital Curation Manual
• A world class resource
• Constructed from topic-specific chapters
– written by international experts
– editorial board comprising leading researchers and practitioners
• 45 initial topics including
– Metadata; Appraisal and Selection; Costs; Freedom of Information; Interoperability; the OAIS Reference Model; Preservation Strategies; and Open Source
• Briefing Papers aimed at senior managers
17
Workshops and Information Days
• 2005 Workshop Programme
– Persistent identifiers
– Institutional repositories
– Cost models
– Preservation of medical databases
• Information Days at Bath, Aberystwyth, London, Glasgow, Belfast (1st December)…..???
18
OAIS Reference Model
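[Diagram: OAIS functional entities]
• Actors: PRODUCER, CONSUMER, MANAGEMENT
• Functional entities: Ingest, Archival Storage, Data Management, Access, Administration, Preservation Planning
• Information packages: SIP, AIP, DIP
• Other flow labels: Descriptive Info, queries, result sets, orders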
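A minimal sketch of the package flow named in the OAIS diagram: a PRODUCER submits a SIP to Ingest, which stores an AIP in Archival Storage and passes Descriptive Info to Data Management; a CONSUMER sends queries and orders to Access and receives a DIP. The class layout, field names and id scheme are hypothetical simplifications, and Administration and Preservation Planning are left out.

```python
from dataclasses import dataclass


@dataclass
class SIP:  # Submission Information Package, sent by the producer
    producer: str
    content: bytes
    description: str


@dataclass
class AIP:  # Archival Information Package, held in archival storage
    aip_id: str
    content: bytes
    descriptive_info: dict


@dataclass
class DIP:  # Dissemination Information Package, returned to the consumer
    aip_id: str
    content: bytes


class Archive:
    def __init__(self):
        self.archival_storage = {}  # aip_id -> AIP
        self.data_management = {}   # aip_id -> descriptive info

    def ingest(self, sip):
        """Ingest: turn a SIP into an AIP and record its descriptive info."""
        aip_id = f"aip-{len(self.archival_storage) + 1}"
        info = {"producer": sip.producer, "description": sip.description}
        self.archival_storage[aip_id] = AIP(aip_id, sip.content, info)
        self.data_management[aip_id] = info
        return aip_id

    def query(self, text):
        """Access: answer a consumer query with a result set of AIP ids."""
        return [aip_id for aip_id, info in self.data_management.items()
                if text.lower() in info["description"].lower()]

    def order(self, aip_id):
        """Access: fulfil a consumer order by building a DIP from the AIP."""
        aip = self.archival_storage[aip_id]
        return DIP(aip.aip_id, aip.content)
```

A producer would call `ingest(SIP(...))`; a consumer would call `query(...)` to get a result set and then `order(...)` for one of the returned ids.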
19
DCC: Development
• “DCC Approach to Digital Curation” based on the Reference Model for an Open Archival Information System (OAIS); ISO standard 14721:
– Monitoring international standards
– Development of a Representation Information (RI) registry/repository (DCC-RR)
– Recommendations for tools and methods for generating Representation Information
– Creating test-beds for digital curation tools
Development info: see http://dev.dcc.ac.uk for details of Wiki and email list open to all
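A minimal sketch of what a Representation Information registry lookup might involve, assuming each RI record lists the further RI needed to interpret it. In OAIS terms the chain stops at what the designated community already knows. The registry entries, the `example:` identifiers and the function name are hypothetical and do not reflect the actual DCC-RR interface.

```python
# Hypothetical registry: RI id -> what it describes and which RI it depends on.
REGISTRY = {
    "example:csv-salinity": {
        "describes": "column layout of a salinity CSV file",
        "needs": ["example:csv", "example:units"],
    },
    "example:csv": {
        "describes": "comma-separated values structure",
        "needs": ["example:ascii"],
    },
    "example:units": {"describes": "units of measurement", "needs": []},
    "example:ascii": {"describes": "character encoding", "needs": []},
}


def representation_network(ri_id, registry=REGISTRY, known=frozenset()):
    """List every RI record needed to interpret ri_id, depth first,
    skipping records the designated community is assumed to know."""
    needed, stack = [], [ri_id]
    while stack:
        current = stack.pop()
        if current in known or current in needed:
            continue
        needed.append(current)
        # Reverse so dependencies are visited in their listed order.
        stack.extend(reversed(registry[current]["needs"]))
    return needed
```

Passing `known={"example:ascii"}` shortens the result, which illustrates why the amount of Representation Information to hold depends on the community the archive serves.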
20
Trusted digital repositories
• Audit Checklist for Certification
• Draft Report August 2005
• Research Libraries Group RLG-NARA Taskforce
• Defined criteria under 4 categories
– Organisation
– Functions, processes & procedures
– Designated community & usability
– Technologies & technical infrastructure
21
The database picture
Source data
Curated data: classified, cleaned, annotated, integrated, cross-linked
22
• www.ijdc.net
• Peer-review Editorial Board
• Peter Buneman Editor (research)
• Production editor Richard Waller
• Papers for submission are very welcome!
• 1st issue soon….
23
DCC Conferences
• 1st International DCC Conference, Bath, Sept
• Keynote speakers
– Clifford Lynch, CNI
– Graham Cameron, EBI
• Presentations available
• PV 2005 Edinburgh NOW
• 2nd DCC Conf Nov 2006
24
Associates Network
Goals
Develop understanding, share best practice, advance research, promote recognition, develop consensus
Membership
International groups, national bodies, industry partners, funders, research groups, HEIs, FEIs, individuals……
Benefits
Early access to R&D outputs, advisory services, training, input to definition and design, community participation
Discussion Forum www.dcc.ac.uk Please join us!
Thank you.
Questions?…..