Managing and sharing research data: an afternoon tour Louise Corti Director, Collections Development and Producer Relations UK Data Service University of Essex Workshop on Research Data Management Conference ‘Changing Landscape of Science & Technology Libraries’ Indian Institute of Technology Gandhinagar 2-4 March 2017


Page 1: Managing and sharing research data: an afternoon …events.iitgn.ac.in/2017/CLSTL/wp-content/uploads/2017/04/...Managing and sharing research data: an afternoon tour Louise Corti Director,

Managing and sharing research data:

an afternoon tour

Louise Corti

Director, Collections Development and Producer Relations

UK Data Service

University of Essex

Workshop on Research Data Management

Conference ‘Changing Landscape of Science & Technology

Libraries’

Indian Institute of Technology Gandhinagar

2-4 March 2017

Page 2

Plan for this afternoon

Session 1: Overview of data management: presentation on

data management planning, and key elements for librarians

and data support services

• Quiz and planning exercises in pairs

Session 2: How the UK Data Service does it – speciality in

social and biomedical sciences: presentation

Session 3: Examples of University multi-purpose data

repositories and Indian Data Service:

• Presentations and showcase

Session 4: Skills required – questions you need to consider in

your role as data sharing advocates

• Exercise and discussion

Page 3

Brief break for Gangnam style dance off!

Page 4

Making data available is trending now

Open access and transparency agendas

Huge progress in opening up government data

(gov.data)

Lack of trust in published academic findings –

demands for evidence for claims and verification

Value for money from public funds

Page 5

Why and how is data shared?

• Research data is often exchanged in informal

ways with collaborators and colleagues

• Formally publishing data brings many advantages

– longevity, robust citation and proper attribution

• Publishing data from research has grown rapidly in recent years as a result of funder and journal policies

Page 6

Benefits of data sharing

To funders

• make optimal use of publicly funded research

• maximise return on investment

• avoid duplication of data collection

To the scholarly community

• maintain professional standards of open inquiry

• quality improvement from verification, replication and trust

• develop long time series of data

• promote innovation through unintended, new uses of data

• Study documentation for research design and teaching

Page 7

Benefits of data sharing

To research participants

• allow maximum use of their contributions

• minimise data collection on the hard-to-reach (e.g. ill,

elites)

To the public

• production of high quality findings with social value

• advance science to the benefit of society

• compliance with laws and regulations

• adoption of emerging norms – ‘open access’ publishing

• Seen to be open and accountable

Page 8

Good research should be dependent on

process-visibility

Data Sharing: data made available for secondary analysis

Research Transparency: data used to support an evidence-based claim

Page 9

Transparency Organizations

Page 10

International research funder policies

• Most based on: OECD Principles and Guidelines for Access to Research Data from Public Funding

• UK: variety of models

• Data management plans and recommendation only

• Dedicated data centres

• Institutions taking responsibility

• Europe

• Communication & recommendation on access to / preservation

of scientific information (publications, data)

• USA: NSF and NIH

• Data management plans

Page 11

Journal and Publisher Data Policies

• Many science journals have policies for data sharing

• Science, Nature, BioMed Central, PLOS ONE

“PLOS ONE will not consider a study if the conclusions depend solely on the analysis of proprietary data … the paper must include an analysis of public data that validates the conclusions so others can reproduce the analysis.”

• Data underpinning publication accessible:

• upon request from author

• as supplementary materials alongside publication

• in public repository or specialist data centre

• in mandated repository (e.g. PANGAEA – Elsevier)

• Usually need a Digital Object Identifier (DOI)

Page 12

Social science journals: transparency agenda

• Replication policies exist for psychology, economics and political science journals

The Journal of Development Economics

Quarterly Journal of Economics

Quarterly Journal of Political Science

• Options:

• Authors supply enough information that the exact analysis can be replicated: raw data, survey forms, data collection protocols, computer programs and scripts, etc.

• Authors are required to replicate their study, ideally with a

preregistered design

• Journal may have its own repository

Page 13

Why do we need to plan for data management in research?

• Think how to design and implement research

• Consider how to look after research data safely

• Keep track of research data (e.g. staff leaving)

• Identify support, resources, services needed

• Plan data storage, short & long-term

• Plan data security, ethical aspects

• Plan for current and future data uses (data sharing /

publishing)

Page 14

Why DMP – funder requirements

• Many research funders require a plan for data management as part of research applications

• Expect to cost sustainable data management

and sharing into research

• Overview of requirements:

• Digital Curation Centre, UK: UK Funders’ data plan requirements

• California Digital Library, USA: DMPTool – Funder requirements

Page 15

ESRC research data policy

Research data should be openly available to the maximum extent possible

through long-term preservation and high quality data management.

(ESRC Research Data Policy, 2010)

• ESRC grant applicants planning to create data during their research include a data management plan with their application, as an attachment to the Je-S form

• ESRC award holders offer their research data to the ESRC Data Store (managed by UK Data Service) within three months of the end of their grant, to preserve them and to make them available for new research.

Researchers who collect the data initially should be aware that ESRC

expects that others will also use it, so consent should be obtained on this

basis and the original researcher must take into account the long-term use

and preservation of data. (ESRC Framework for Research Ethics, 2012)

Page 16

ESRC data management plan

Assessment of existing data

Information on new data

Quality assurance of data

Backup and security of data

Expected difficulties in data sharing

Copyright / Intellectual Property Right

Responsibilities

Preparation of data for sharing and archiving

ESRC DMP guidance

Page 17

Why DMP - practical

• Identify key data management decision points in

the research lifecycle

• How to address these?

• Who will address these?

Examples:

• Set up data storage area on server

• Verify institutional back-up policy is in place

• Design database with documented labels, codes

• Identify factors that limit, prohibit data sharing

and re-use before data collection starts

Page 18

Data life cycle intervention

• Sign off consent form

• Agree data & metadata templates/organisation

• Data sharing protocols

• Licensing, terms and conditions for sharing, formal documentation

• Data formats, data migration

Page 20

Roles & responsibilities

• Project director: design, oversee research

• Research staff: design research, collect, process and

analyze data, decide where to keep data & who has access

• Laboratory or technical staff: generate metadata and documentation

• Database designer

• External contractors: data collection, data entry, transcribe,

process, analysis; agree standard protocols

• Support staff: manage and administer research and

funding, ethical review and assess property rights

• Institutional IT services: storage, security, backup services

• External Data Centres: facilitate data sharing

Page 21

Cost research data management

• Cost RDM into research applications and budgets

• List and identify resources needed to make research

data shareable beyond primary research team - above

planned standard research procedures and practices

• Resources

• People and skills

• Equipment

• Infrastructure, storage and access costs

• Tools to manage, document, organise

• Early planning can reduce costs

Page 22

UKDA Data management costing tool

STEP 1: check data management activities in table and tick what applies to your proposed research; we propose 18 essential RDM activities

STEP 2: for each selected activity, estimate / calculate additional time and/or resources needed and cost this

STEP 3: add data management costs to your research application; coordinate resourcing and costing with your institution, research office and institutional IT services
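The three costing steps can be sketched as a small calculation. This is only an illustration, not the UKDA tool itself: the activity names, hours, rates and the field names (`applies`, `extra_hours`, `hourly_rate`, `other_costs`) are all hypothetical.

```python
def activity_cost(activity):
    """Step 2: cost one activity as extra staff time plus any other resources."""
    return activity["extra_hours"] * activity["hourly_rate"] + activity["other_costs"]


def total_rdm_cost(activities):
    """Step 3: add up the cost of every activity ticked in step 1."""
    return sum(activity_cost(a) for a in activities if a["applies"])


# Step 1: tick the activities that apply (hypothetical example values).
activities = [
    {"name": "Anonymisation", "applies": True,
     "extra_hours": 10, "hourly_rate": 20.0, "other_costs": 0.0},
    {"name": "Transcription", "applies": False,
     "extra_hours": 50, "hourly_rate": 15.0, "other_costs": 0.0},
    {"name": "Storage", "applies": True,
     "extra_hours": 0, "hourly_rate": 0.0, "other_costs": 50.0},
]

print(total_rdm_cost(activities))
```

Only ticked activities contribute, so unticking a row removes it from the total without deleting the estimate.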

Page 25

Brief poll and questions

• Who has undertaken research?

• Who has collected / created their own data?

• Who has written a DMP?

• What Data Policies exist in India?

• National Data Sharing and Accessibility Policy (NDSAP) – non sensitive data only?

Page 26

The nuts & bolts of managing and sharing

data

Formatting and organising data

Storing and transferring data, including encryption and security

Legal, ethical issues: consent, anonymisation & access control

Rights relating to research data

Documenting and contextualising data

Publishing and citing research data

Page 27

Can you understand/use these data?

SrvMthdDraft.doc

SrvMthdFinal.doc

SrvMthdLastOne.doc

SrvMthdRealVersion.doc

Page 28

Formatting and organising data

Consistent templates for similar kinds of data

Well organised – consistent folders

Folders/files are suitably named and properly

versioned

Identify the authenticity of master files

Page 29

File formats

• Choice of software format for digital data:

• hardware used e.g. audio capture

• discipline-specific customs and planned data analyses

• software availability/cost

• Digital data are endangered by software/hardware obsolescence

• Best formats for long-term preservation are standard,

interchangeable and open formats:

• tab-delimited, comma-delimited (CSV), ASCII

• SPSS portable, XML

• RTF, OpenDocument format, PDF/A

• Recommended formats

• Beware of errors/losses of data when converting!

Page 30

Format conversion

MS Excel (.XLSX) format using colour highlighting for annotation

Tab-delimited text format, and loss of colour annotation

Loss of

annotation
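One way to avoid this loss is to record each annotation as an explicit column before converting. The sketch below assumes the colour highlights have already been written down as notes keyed by row position; the `annotation` column name and the sample values are hypothetical.

```python
import csv
import io


def to_tab_delimited(rows, annotations):
    """Write rows as tab-delimited text, keeping annotations in their own column.

    rows: list of dicts sharing the same keys.
    annotations: dict mapping a row index to a note that stands in for
    what the colour highlighting meant in the spreadsheet.
    """
    fieldnames = list(rows[0].keys()) + ["annotation"]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for index, row in enumerate(rows):
        writer.writerow({**row, "annotation": annotations.get(index, "")})
    return buffer.getvalue()


rows = [{"id": 1, "income": 200}, {"id": 2, "income": 999}]
notes = {1: "check outlier"}  # row 2 was highlighted in the spreadsheet
print(to_tab_delimited(rows, notes))
```

The note now survives any conversion to a plain-text format because it is part of the data rather than the formatting.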

Page 31

File naming

• File name - principal identifier of file

• Use logical naming i.e. easy to identify, locate,

retrieve, access

• Naming provides organisation, context &

consistency

• Name elements: version number, date, content

description, creator name

• For separation use underscores _

• Avoid very long file names!
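A naming convention like this can be applied by a small helper so every file follows the same pattern. This is a sketch of one possible convention (date, content description, creator, version, joined by underscores); the element order and the `build_file_name` helper are assumptions, not a prescribed standard.

```python
import re
from datetime import date


def _clean(text):
    """Drop spaces and punctuation so each element is safe in a file name."""
    return re.sub(r"[^A-Za-z0-9]+", "", text)


def build_file_name(description, creator, version, created, extension):
    """Join the name elements with underscores and append the extension."""
    parts = [
        created.strftime("%Y%m%d"),    # date, sorts chronologically
        _clean(description.title()),   # content description
        _clean(creator),               # creator name or initials
        f"v{version:02d}",             # zero-padded version number
    ]
    return "_".join(parts) + "." + extension.lstrip(".")


print(build_file_name("survey method", "LC", 3, date(2017, 3, 2), "doc"))
# prints 20170302_SurveyMethod_LC_v03.doc
```

Compared with names such as SrvMthdLastOne.doc, the version and date are explicit, so the latest file can be found by sorting.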

Page 32

Version controlling

Keep track of different copies or versions of data files

Best practice:

• unique identifiers for files (naming convention keeping track)

• record file status/versions

• record relationships between files

e.g. data file and documentation; similar data files

• keep track of file locations

e.g. laptop vs. PC

Tools available for versioning and synchronising files

Page 33

Data capture - it’s all about consistency

Example: Transcription of text from audio

• Use a uniform layout throughout – use a template

• Provide guidance on how you would like the data transcribed

• Indicate speakers

• Capture verbal and non-verbal?

• Implications of various technologies – video, multiple camera, screen

capture, webcams

Example: digitisation of photographs

• Specify the expected output

• Use standard settings on equipment, plus capture of correct metadata from

images

Regardless of who does the work – rules help to make data cleaner

and easier to share!

Page 34

Organising data

• Plan in advance how best to organise data

• Use a logical structure and ensure collaborators understand

Examples

• hierarchical structure of files, grouped in folders, e.g. audio, transcripts and annotated transcripts

• survey data: spreadsheet, SPSS,

relational database

• interview transcripts: individual

well-named files

Page 35

Storing data safely

• Looking after data – protect from damage and loss

• Strategies in place for:

• backing-up

• transmission

• secure storage

• disposal

Page 36

Backing up data

• Why do back-ups? Risk of loss and change - would your

data survive a disaster?

• Protect against: software failure, hardware failure,

malicious attack, natural disasters

• Back-ups are additional copies that can be used to

restore originals

• It’s not backed-up unless backed-up with a strategy

Page 37

Digital back-up strategy

Consider

• what’s backed-up? - all, some, just the bits you change?

• where? - original copy, external local and remote copies

• what media? - CD, DVD, external hard drive, tape, etc.

• how often? – assess frequency and automate the process

• for how long is it kept? Data retention policies that might apply?

• verify and recover - never assume, regularly test a restore

Backing-up need not be expensive

• 1 TB external drives are around £50, with back-up software

Consider non-digital storage too!
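The "verify and recover" point can be illustrated with a copy that is checked against the original straight away. This is a minimal sketch using checksums; the function names are hypothetical, and a real strategy would also cover remote copies, scheduling and periodic test restores.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_and_verify(source, backup_dir):
    """Copy a file into backup_dir and confirm the copy matches the original."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also keeps timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} failed verification")
    return target
```

Storing the checksum alongside the backup also makes it possible to detect later corruption of either copy.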

Page 38

Encryption and security of data

Encrypt personal or sensitive data:

• when moving or storing files

• free software, easy to use: SafeHouse, TrueCrypt, AxCrypt

• encrypt hard drives, partitions, USBs, files and folders

Protect from unauthorised access, change, disclosure, destruction

• control access to computers, buildings, rooms, cabinets

• restrict access to sensitive materials e.g. consent forms

Proper disposal of equipment and media

• even reformatting the hard drive is not sufficient

Page 39

File sharing & collaborative environments

• Too often data sent via email attachments!

• Virtual Research Environments

• MS SharePoint

• Cloud solutions

• Google Drive, Dropbox, OneDrive

• Basecamp

• Locally managed: ownCloud, ZendTo

• File transfer protocol (ftp)

• Physical media

• Data Protection Act for location of data storage

Page 40

Sharing confidential data

Researchers should:

• obtain informed consent from human subjects for data sharing and preservation / curation

• protect identities by not collecting personal data as

‘research data’; or anonymise data

• restrict / regulate access where needed (all or part

of data). UK Data Service uses a spectrum of

access

Consider jointly and in dialogue with participants

Plan early in research

Page 41

Disclosure review

• Direct identifiers: names; addresses; telephone numbers; email

addresses; images (and check file properties!)

• Indirect identifiers: demographics: age, ethnicity,

education/employment details, religion, household size, detailed

income, geography. Could combinations reveal identity?

• Balance confidentiality protection against usability of data. If both can’t be achieved, consider more restrictive access

• Solution: discuss with data creator – data edits (recoding,

banding, aggregation, pseudonyms etc.) or access restriction?
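The data edits named above (banding, pseudonyms) and the question of whether combinations could reveal identity can be sketched as follows. This is an illustration only: the field names, the ten-year bands and the threshold of two are assumptions, and a real disclosure review needs human judgement as well.

```python
from collections import Counter


def band_age(age, width=10):
    """Recode an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def pseudonymise(records, direct_identifier="name"):
    """Replace a direct identifier with a label (P001, P002, ...) and band age."""
    labels = {}
    cleaned = []
    for record in records:
        label = labels.setdefault(
            record[direct_identifier], f"P{len(labels) + 1:03d}"
        )
        new_record = {k: v for k, v in record.items() if k != direct_identifier}
        new_record["participant"] = label
        new_record["age"] = band_age(record["age"])
        cleaned.append(new_record)
    return cleaned


def risky_combinations(records, keys, threshold=2):
    """List combinations of indirect identifiers shared by fewer than threshold people."""
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return [combo for combo, n in counts.items() if n < threshold]


records = [  # hypothetical participants
    {"name": "A. Example", "age": 34, "region": "East"},
    {"name": "B. Example", "age": 37, "region": "East"},
    {"name": "C. Example", "age": 61, "region": "North"},
]
cleaned = pseudonymise(records)
print(risky_combinations(cleaned, ["age", "region"]))
```

A combination that appears only once points to a participant who may still be identifiable, which is the cue to recode further or restrict access.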

Page 42

Copyright and data sharing

• Copyright permissions sought and granted prior to data

sharing / archiving

• Clearing copyright for re-use – reach agreement with

copyright holder

• Copyright holders give permission to repositories to

preserve and publish data and provide user access

• Repositories do not inherit the copyright, but may have some rights e.g. database right

• ‘Fair dealing’ exception in UK Copyright Law for non-

commercial research, private study, teaching,

quotations, criticism or review; then author and source

must be cited

Page 43

Matching data with documentation

• Do the data match the documentation?

• Is any data missing?

• If anything looks wrong go back to the

depositor

• Never amend data without checking with data depositor/owner first!

Do No Harm!

Page 44

Useful documentation for users

• Questionnaire, field work procedures

• Interview schedule or topic guide

• Observation or diary templates

• Stimuli e.g. scenarios, photos, images

• Field notes

• Outputs e.g. reports

• Details of any processing e.g. digitisation or new derived variables / measures

• Information sheet and consent agreement

• Errata

Page 45

The value of the ‘ReadMe’ file

Good practice for each data collection

• For each filename a short description of data is included

• Capture relationships between the data files

• For tabular data definitions of column headings and row

labels, data codes (including missing data) and

measurement units

• For textual data a data list of all interviews, focus

groups, etc.
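A ReadMe with these elements can be generated from a short description of the collection. The sketch below is one possible layout, not a required template; the section headings, field names and example values are hypothetical.

```python
def build_readme(title, files, variables):
    """Return ReadMe text listing each file and each tabular variable.

    files: dict mapping file name to a short description.
    variables: dict mapping column heading to a dict with
    'label', 'units' and 'missing' (the code used for missing data).
    """
    lines = [title, "", "FILES"]
    for name, description in files.items():
        lines.append(f"{name}: {description}")
    lines += ["", "VARIABLES"]
    for name, info in variables.items():
        lines.append(
            f"{name}: {info['label']}; units: {info['units']}; "
            f"missing code: {info['missing']}"
        )
    return "\n".join(lines) + "\n"


readme = build_readme(
    "Example study",
    {"interviews.csv": "Data list of all interviews"},
    {"age": {"label": "Age at interview", "units": "years", "missing": -9}},
)
print(readme)
```

Keeping the descriptions in one structure means the ReadMe can be regenerated whenever files are added or renamed.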

Page 46

Descriptive metadata

Record for discovery – Open license; harvestable via OAI-PMH

Persistent identifier – e.g. Datacite DOI

Controlled vocabularies

Standardised schema for data description

Dublin Core and DataCite Core

Data Documentation Initiative (DDI) – good for research data!

Text Encoding Initiative (TEI) – good for text markup

Use Extensible Markup Language (XML)
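As an illustration of a standardised schema expressed in XML, the sketch below writes a few Dublin Core elements with Python's standard library. The wrapping `record` element, the DOI and the field values are hypothetical; real repositories follow their own profile of Dublin Core, DataCite or DDI.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Build an XML string with one dc: element per entry in fields."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(record, encoding="unicode")


xml_text = dublin_core_record({
    "title": "Example survey data",
    "creator": "Example, A.",
    "date": "2017",
    "identifier": "doi:10.1234/example",  # made-up DOI for illustration
})
print(xml_text)
```

Because the element names come from a shared vocabulary, a catalogue harvesting the record can interpret `dc:title` or `dc:identifier` without knowing anything about the study.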

Page 48

Your turn

• Documentation quiz – discuss briefly with the person

next to you

Page 49

Promoting data

Catalogue metadata record – harvestable into

meta catalogues

Use of DOIs so that data can be cited and citations found easily on the web

Promotion through news, social media and

events

Be creative: co-design and experiment

Advocacy for data citation!

Data papers

Page 50

Key skills that enhance research training

Policy landscape and data sharing

Writing and implementing a data management plan

Documenting and contextualising data

Formatting and organising data

Storing and transferring data, encryption and security

Legal, ethical issues in handling and sharing data –

consent, anonymisation and access control

Rights relating to research data

Publishing and citing research data

Page 51

Data Papers – growing trend

• Encourage data owners to publish a data paper

• Enables formal citation of data - gives credit to data

managers and scientists

• Data housed in a trusted repository with own DOI

• e.g. Nature Scientific Data (http://www.nature.com/scientificdata/)

Page 53

What about ‘big data’?

• Scientific repositories already do this, e.g. astronomy

• Steps for smaller more traditional archives to ‘scale up’

• While principles are the same – it’s only data! – certain

issues are challenging

Unconsented for research

Often commercial rights

Unknown provenance - hard to verify and document

Often no version control

Web-based sources may ‘disappear’ or access

refused

Data might change dramatically

Page 54

More about how we are building

capacity for big data at UKDS later!

Page 55

Effecting a DMP – the researcher

• Discuss data archiving and sharing with research

participants to gain their consent for data sharing

• Anonymise data where needed

• Document and contextualise data for future reuse:

– information embedded in data files, e.g. variable labels,

value labels, codes and descriptions

– final report may contain the majority of contextual and

methodological documentation for data

– publications, working papers, lab books, code books

• Recommended formats for preservation and sharing

• Quality control checks

• Copyright permissions for data ownership

Page 56

Take away planning issues

Know the legal, ethical and other obligations

towards research participants, colleagues, research

funders and institutions

Know the institution’s policies and services: storage

and backup strategy, research integrity framework,

rights policies, institutional data repository

Assign roles and responsibilities to relevant parties

Incorporate data management into research lifecycle

Implement and review management of data during

project meetings and review

Think ahead, plan for the future

Page 57

Effecting a DMP – The Institution

Minimum

• Advice to the researcher on funder requirements and costing for planning & sharing

• Check research applications against a DMP

• Provide secure data storage during research

If there is an institutional repository, provide:

• Visibility - metadata record

• Long-term data preservation

• Data dissemination and access

If not, help refer to a trusted Data Centre

Page 58

Your exercise

Handout: Exercise: Data Management Planning

• You are to help a researcher create a DMP for his project

• In pairs, identify data management aspects that matter for

this research proposal and that should be included in the

DMP. Consider topics:

• File formats and organisation

• Data documentation, standards & metadata

• Quality assurance

• Data storage, security and backup

• Data access and (re)use, incl. restrictions and challenges

• IPR and data ownership

• Roles & responsibilities

Page 59

Contact

Louise Corti

Collections Development and Producer Relations team

UK Data Service

University of Essex

UK CO4 3SQ

[email protected]

Page 60

Resourcing the data lifecycle:

services, skills & infrastructure

Louise Corti

Director, Collections Development and Producer

Relations

UK Data Service, University of Essex

Workshop on Research Data Management

Conference ‘Changing Landscape of Science & Technology

Libraries’

Indian Institute of Technology Gandhinagar

2-4 March 2017


Covering

• Mission statements and scope

• Model data services and staffing

• Questions for your own data service/archive

• Your own assessment of readiness


How to set up a data service

• Mission statement & statement of aspiration

– high-level scientific strategy and competencies

– country and discipline-specific researcher practices in

data sharing and re-use

– relevant legal frameworks

• Policy dialogue, appetite for scope

• Locating and building capacities

• Access to shared knowledge bases is useful


UK Data Archive mission

Promote best practice in data curation

Raise standards in data management

Raise standards in data security

Drive archival innovation

Advance professionalisation of data service

infrastructures (leadership within the

profession)

Attract, develop and maintain

excellent staff


And we undertake R&D

• Core data services

– Curation, management, dissemination

• R&D projects

– Controlled vocabularies

– Infrastructure and tools development, e.g.

self-deposit system, online data browsing

– Safe settings services

– Data sharing practices

– Scaling up for big data


Defining scope of your collections

• Anticipate capacity – space and humans

• Draft a Collections Development Policy – an

evolving document

• Draft an Appraisal and Selection Policy

• Define publishing pathways

• Set up a Data Appraisal Group

• Is your repository FAIR?


FAIR principles for repositories

Findable

Accessible

Interoperable

Re-usable

Persistent identification of collections

https://www.force11.org/group/fairgroup/fairprinciples
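One rough way to ask "Is your repository FAIR?" is to check whether each collection record carries at least one piece of metadata per principle. The sketch below is an assumption-laden simplification: the `CollectionRecord` fields and the `fair_gaps` helper are hypothetical, and the full FAIR principles at the link above cover considerably more than four fields.

```python
from dataclasses import dataclass


@dataclass
class CollectionRecord:
    """Hypothetical minimal catalogue record; field names are assumptions."""
    title: str
    persistent_id: str = ""      # Findable: e.g. a DOI for the collection
    access_url: str = ""         # Accessible: where and how the data can be obtained
    metadata_standard: str = ""  # Interoperable: e.g. "DDI" or "Dublin Core"
    licence: str = ""            # Re-usable: the terms under which data may be reused


def fair_gaps(record: CollectionRecord) -> list[str]:
    """List the FAIR principles for which the record has no supporting field."""
    checks = {
        "Findable": record.persistent_id,
        "Accessible": record.access_url,
        "Interoperable": record.metadata_standard,
        "Re-usable": record.licence,
    }
    return [principle for principle, value in checks.items() if not value.strip()]


# Made-up example record; the identifier and licence are placeholders.
example = CollectionRecord(
    title="Example household survey",
    persistent_id="10.1234/example",
    licence="CC BY 4.0",
)

print(fair_gaps(example))
```

An empty result only means the fields are filled in; whether the identifier resolves or the licence is appropriate still needs human review.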


OAIS adapted for a data service (ISO 14721)

[Figure: OAIS functional model adapted for a data service. Surviving labels: Pre-Ingest; Access (Data); (Support)]


Staffing at UKDA

80+ staff – mostly supported by ESRC

• UK Data Service

• newer Administrative Data Research Network; Big Data

Network

5+2 main sections:

• Resources and Management Services

• Collections Development and Producer Support

• Ingest and Access Services

• Technical Services

• Preservation Services

• Administrative Research Data & Big Data Teams


Multiple skills needed across Service

• Archiving and librarianship

• Data handling/manipulation

• Research expertise - user & producer community

• Metadata - cataloguing standards, controlled

vocabularies

• IT systems - data management and storage

• Programming - maintenance and development

• Legal, ethical, security, rights expertise

• Finance, HR, management

• Digital preservation


Professionalisation

• Individuals

– networks of organisations (IASSIST, RDA, CODATA)

– continuous development and training

– professional qualifications and formal training

• Organisations

– standards adherence

– governance

– audit, assessment, certification

– sit on decision-making forums across academia, government, funders and other data producers


Key questions for you

• What types of data are envisaged to be part

of your archive? Numeric or qualitative data?

Outputs and code?

• Who are the expected data producers

supplying data? Are they willing to provide

data in appropriate formats and with relevant

documentation?


Key questions for you

• Who are the expected end-users of the

data?

• Is there an expectation to provide analysis

tools for users?

• Will data storage need to be carried out at

multiple locations?

• What are the long-term expectations for

preservation of data?


Key questions for you

• Are there legislative requirements for data

protection/freedom of information to take into

account?

• Are there ethical issues about data sharing?

• Are all data likely to be anonymous?

• Are there specific information security

requirements?

• Will there be tiered access methods? (Open,

Safeguarded, Controlled?)
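To make the tiered-access question concrete, a service could write down its decision rule explicitly. The sketch below uses the three tier names from the slide, but the inputs and the rule itself are illustrative assumptions only; they do not represent UK Data Service policy, and a real decision would also weigh consent, licensing and legal advice.

```python
from enum import Enum


class AccessTier(Enum):
    OPEN = "Open"
    SAFEGUARDED = "Safeguarded"
    CONTROLLED = "Controlled"


def suggest_tier(contains_personal_data: bool, disclosure_risk: str) -> AccessTier:
    """Suggest an access tier from two simplified inputs.

    disclosure_risk is one of "none", "low" or "high". Both the inputs and
    the thresholds are hypothetical, chosen only to illustrate the idea.
    """
    if disclosure_risk not in {"none", "low", "high"}:
        raise ValueError(f"unknown disclosure risk: {disclosure_risk!r}")
    if contains_personal_data or disclosure_risk == "high":
        return AccessTier.CONTROLLED
    if disclosure_risk == "low":
        return AccessTier.SAFEGUARDED
    return AccessTier.OPEN


# Example: anonymised data with some residual risk of re-identification.
print(suggest_tier(contains_personal_data=False, disclosure_risk="low").value)
```

Writing the rule down, even this crudely, makes it easier for a Data Appraisal Group to discuss and amend it.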


Capacity to run a data service

• How much capacity do you think you need?

• How much capacity do you have?

• Be realistic - choose RDM and data publishing

activities that are manageable - delegate what

you can to others

• Competition is not helpful here – carve your

niche and federate! Strength in numbers


Aspirations

• Secure funding!

• FAIR, Data Seal of Approval, World Data System

• Lots of beautiful data

• Many happy users

• And smiling bosses!



Final exercise

• Handout: Exercise: Resourcing your institution

for data management and curation

• Write an aspirational statement for your data

service

• Rate your institution’s data sharing readiness


Tools to understand data practices

• Data Asset Framework

– identify which data assets exist in the organisation, their condition

and format, responsibility and long-term custody

• Digital Repository Audit Method Based on Risk Assessment

(DRAMBORA)

– evaluate and manage risks that threaten data or the

infrastructures they may rely upon

• CARDIO

– assess capabilities to support research data management,

and contribute to an institution-wide agenda for change

• Trusted Repositories Audit & Certification (TRAC)

– audit, assess and certify digital repositories


Guidance materials

UK Digital Curation Centre Guides

• How to Develop RDM Services - a guide for HEIs

• 5 Steps to Research Data Readiness: IT managers

• 5 steps to Developing a Research Data Policy

• (Longer version)