DataShare: Collaboration Yields Promising Tool

Julia Kochi, UCSF Library
Angela Rizk-Jackson, UCSF CTSI
Perry Willett, CDL

CNI 2013 Meeting, San Antonio, TX

Page 1: DataShare :  Collaboration Yields Promising Tool

DataShare: Collaboration Yields Promising Tool

Julia Kochi, UCSF Library
Angela Rizk-Jackson, UCSF CTSI
Perry Willett, CDL

CNI 2013 Meeting
San Antonio, TX

Page 2: DataShare :  Collaboration Yields Promising Tool

The Background

Julia Kochi, UCSF Library

Page 3: DataShare :  Collaboration Yields Promising Tool

What is DataShare?

An open data repository for the UCSF researcher

A concept initially envisioned by Michael Weiner, M.D.

A collaboration between UCSF CTSI, UCSF Library, and the California Digital Library

Page 4: DataShare :  Collaboration Yields Promising Tool

The Problem

Increasing requirements to share data
• NIH grants >$500k
• Publisher requirements

Unequal availability of national repositories

Campus priorities

FASTR, White House Directive

Page 5: DataShare :  Collaboration Yields Promising Tool

The Partners

UCSF CTSI
• Knowledge of the researcher, access to the data

UCSF Library
• Metadata expertise, programming resources

UC3
• Preservation tools, services and expertise

Page 6: DataShare :  Collaboration Yields Promising Tool

Technical Infrastructure

Perry Willett, California Digital Library

Page 7: DataShare :  Collaboration Yields Promising Tool

DataShare Components

Merritt: CDL

EZID: CDL

XTF: CDL, UCSF Library

Ingest tool: UCSF Library

Page 8: DataShare :  Collaboration Yields Promising Tool

Merritt Repository Service

Built on “micro-services” principles

Content and format agnostic

Has a UI and RESTful APIs to submit and retrieve content, and check statuses

Can serve as either “dark” or “bright” archive

Added public access, data use agreements, asynchronous downloads as part of DataShare project

Page 9: DataShare :  Collaboration Yields Promising Tool

EZID

Service for creation and management of long-term identifiers

Currently supports ARKs and DOIs; other types in planning stages

Registers DOIs with DataCite

Has a UI and APIs with good documentation
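As a rough illustration of what submitting metadata to an identifier service involves, the sketch below serializes a metadata record into ANVL-style `name: value` lines, the plain-text format EZID's API documentation describes for request bodies. The escaping rules follow that documentation as best understood here (percent-encode `%`, CR and LF everywhere, and also `:` in element names); the helper names `anvl_escape` and `to_anvl`, the sample record, and the landing-page URL are all hypothetical. No network call is made; consult the EZID API documentation for the actual endpoints and authentication.

```python
import re


def anvl_escape(text: str, is_name: bool = False) -> str:
    """Percent-encode characters that would break an ANVL line.

    Assumption: '%', CR and LF are always escaped; ':' is escaped
    only in element names, where it would be read as the separator.
    """
    pattern = r"[%:\r\n]" if is_name else r"[%\r\n]"
    return re.sub(pattern, lambda m: "%%%02X" % ord(m.group(0)), text)


def to_anvl(metadata: dict) -> str:
    """Serialize a flat dict of strings as 'name: value' lines."""
    return "\n".join(
        f"{anvl_escape(name, is_name=True)}: {anvl_escape(value)}"
        for name, value in metadata.items()
    )


# Hypothetical record for illustration only.
record = {
    "_target": "http://example.org/dataset/1",
    "datacite.creator": "Example, Researcher",
    "datacite.title": "Example imaging dataset",
    "datacite.publisher": "UCSF",
    "datacite.publicationyear": "2013",
}
request_body = to_anvl(record)
```

The resulting `request_body` is the text that a client would send when asking the service to mint or update an identifier.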

Page 10: DataShare :  Collaboration Yields Promising Tool

XTF

eXtensible Text Framework

Developed and maintained by CDL

Runs several CDL services:
• eScholarship
• Online Archive of California
• Calisphere

Faceted browsing, full-text search, other desirable features

Page 11: DataShare :  Collaboration Yields Promising Tool
Page 12: DataShare :  Collaboration Yields Promising Tool
Page 13: DataShare :  Collaboration Yields Promising Tool

Ingest tool

Submitting content to a digital repository is hard and costly

An attempt to simplify several aspects:
• Digital object creation
• Metadata creation
• Object submission
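The three aspects listed above can be pictured with a small sketch: gather a researcher's files, attach a metadata record, and produce a single package ready for submission. This is not the UCSF ingest tool's actual code or Merritt's package format; the function name `build_package`, the `data/` prefix, and the `metadata.json` file are assumptions made purely for illustration.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def build_package(data_dir: Path, metadata: dict, out_zip: Path) -> dict:
    """Zip every file under data_dir together with a metadata record.

    Hypothetical layout: files go under 'data/', and 'metadata.json'
    holds the descriptive metadata plus a checksum for each file.
    """
    manifest = {"metadata": metadata, "files": []}
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
            if path.resolve() == out_zip.resolve():
                continue  # never pack the archive into itself
            rel = path.relative_to(data_dir).as_posix()
            # Reads each file fully into memory; fine for a sketch,
            # but large datasets would need chunked hashing.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            zf.write(path, arcname=f"data/{rel}")
            manifest["files"].append(
                {"name": rel, "sha256": digest, "bytes": path.stat().st_size}
            )
        zf.writestr("metadata.json", json.dumps(manifest, indent=2))
    return manifest
```

Recording a checksum per file at packaging time gives the repository something to verify against after upload, which matters more as object size and file counts grow.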

Page 14: DataShare :  Collaboration Yields Promising Tool
Page 15: DataShare :  Collaboration Yields Promising Tool

Interactions for submission

[Flow diagram: Ingest Tool, Merritt, EZID, DataCite, XTF]

Ingest Tool
• Creates metadata
• Assembles dataset
• Submits to Merritt

Merritt
• Requests DOI
• Submits metadata to EZID
• Receives DOI
• Packages object

EZID
• Registers DOI and metadata (DataCite)

XTF
• Requests ATOM feed for collection
• Gets ATOM feed
• Retrieves metadata
• Indexes metadata
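The XTF side of this interaction (request an Atom feed for a collection, then retrieve and index metadata) can be sketched with the Python standard library. The slides do not show the feed itself, so the sample feed, its identifiers, and its URLs are invented for illustration; only the Atom namespace and element names come from the Atom specification. The function name `entries_from_feed` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def entries_from_feed(feed_xml: str) -> list:
    """Return one dict per Atom entry: id, title, updated, link hrefs."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(f"{ATOM}entry"):
        links = [
            link.get("href")
            for link in entry.findall(f"{ATOM}link")
            if link.get("href")
        ]
        entries.append(
            {
                "id": entry.findtext(f"{ATOM}id", default=""),
                "title": entry.findtext(f"{ATOM}title", default=""),
                "updated": entry.findtext(f"{ATOM}updated", default=""),
                "links": links,
            }
        )
    return entries


# Invented sample feed; a real collection feed may be laid out differently.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example collection</title>
  <entry>
    <id>doi:10.5072/FK2EXAMPLE1</id>
    <title>Example dataset one</title>
    <updated>2013-03-01T12:00:00Z</updated>
    <link href="http://example.org/objects/1/metadata.xml"/>
  </entry>
  <entry>
    <id>doi:10.5072/FK2EXAMPLE2</id>
    <title>Example dataset two</title>
    <updated>2013-03-02T12:00:00Z</updated>
    <link href="http://example.org/objects/2/metadata.xml"/>
  </entry>
</feed>"""

new_objects = entries_from_feed(SAMPLE_FEED)
```

An indexer working this way would follow each entry's link to fetch the object's metadata and then add it to the search index; comparing `updated` values against the last harvest is one way to skip unchanged objects.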

Page 16: DataShare :  Collaboration Yields Promising Tool

Process for End Users

1. Search, browse
2. Request dataset download
3. Fill out Data Use Agreement
4. Receive dataset

Page 17: DataShare :  Collaboration Yields Promising Tool
Page 18: DataShare :  Collaboration Yields Promising Tool
Page 19: DataShare :  Collaboration Yields Promising Tool
Page 20: DataShare :  Collaboration Yields Promising Tool

Lessons learned

Partnerships
• Many hands make light work
• Real users uncover hidden assumptions

Scale
• Object size
• Number of files
• Upload and download

Page 21: DataShare :  Collaboration Yields Promising Tool

If you build it, will they come?

Angela Rizk-Jackson, UCSF CTSI

Page 22: DataShare :  Collaboration Yields Promising Tool

What will it take?

Sketch by Juliana Olivera Silva via Flickr


Page 23: DataShare :  Collaboration Yields Promising Tool

Providing Incentives: Requirements

Organization | Data Access Requirement | # UCSF Studies

Funding
NIH | Grants >$500K (2003 on), specific programs | 318 (active projects), 693 (inactive)
NSF | All funded projects (2005 on) | 19
Foundations (e.g. Moore, Gates, Hewlett) | All funded projects | 3, 31, 19

Publishing
Nature Publishing Group (Nature, Science, etc.) | All published studies (2009-2011) | 58
Cell Press (Cell, Neuron, etc.) | All published studies (2009-2011) | 48
PNAS | All published studies (2005-2011) | 26

Page 24: DataShare :  Collaboration Yields Promising Tool

Providing Incentives: Visibility


Enhances collaborative opportunities

69% increase in citation rate for publications associated with shared data (Piwowar, 2007)

Page 25: DataShare :  Collaboration Yields Promising Tool

Providing Incentives: Credit

Page 26: DataShare :  Collaboration Yields Promising Tool

Providing Incentives: Preservation & Access

Page 27: DataShare :  Collaboration Yields Promising Tool

Providing Incentives: Institutional

UCLA Royce Hall photo courtesy of Adam Fagen via Flickr

• Support researcher needs
• Improved archiving efficiency
• Cost savings

Page 28: DataShare :  Collaboration Yields Promising Tool

Eliminating Barriers

1. Time / Effort
- Minimal requirements
- Specific tools (e.g. ingest)
- Integrate into existing workflow

2. Control
- Data Use Agreement
- Centralized service

3. Cultural Paradigm
- Outreach
- Demonstrate value

Page 29: DataShare :  Collaboration Yields Promising Tool

Other Collaborators

Page 30: DataShare :  Collaboration Yields Promising Tool

Lessons Learned

Don’t underestimate technical matters
• Separating data & metadata

Standards are not standard
• Metadata schema (Dublin Core, DataCite)
• Interpretation

Policy issues are ever-present
• Data Ownership & Data Use Agreements
• Privacy & Consent (Human subjects)

Keep in mind the entire lifecycle: ALL users
• Discoverability & interoperability
• README File
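To make the "standards are not standard" point concrete, here is a minimal sketch of a crosswalk from a few Dublin Core elements to DataCite property names, with a check for DataCite's mandatory descriptive properties (creator, title, publisher, publication year; the identifier itself is handled separately). The mapping table, the function name `crosswalk`, and the rule of taking the first four-digit run in `date` as the year are illustrative assumptions, not the mapping the DataShare team used.

```python
import re

# Assumed, deliberately small mapping; real crosswalks cover far more.
DC_TO_DATACITE = {
    "creator": "creator",
    "title": "title",
    "publisher": "publisher",
    "date": "publicationYear",
}

REQUIRED = ("creator", "title", "publisher", "publicationYear")


def crosswalk(dc_record: dict):
    """Map Dublin Core fields to DataCite names and list missing ones."""
    datacite = {}
    for dc_name, value in dc_record.items():
        target = DC_TO_DATACITE.get(dc_name)
        if target is None:
            continue  # no agreed mapping: an "interpretation" problem
        if target == "publicationYear":
            # Dublin Core dates are free-form; keep the first 4-digit run.
            match = re.search(r"\d{4}", value)
            if match is None:
                continue
            value = match.group(0)
        datacite[target] = value
    missing = [name for name in REQUIRED if not datacite.get(name)]
    return datacite, missing
```

For example, a record with a title, a creator, and a date but no publisher maps cleanly yet still comes back with `publisher` listed as missing, which is the kind of gap a depositor has to resolve before a DOI can be registered.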

Page 31: DataShare :  Collaboration Yields Promising Tool

Next Steps

Outreach

System enhancements
• Design overhaul
• Ingest mechanism
• DUA menu

Policy navigation

Proof-of-concept

Page 32: DataShare :  Collaboration Yields Promising Tool

Discussion Topics

What incentives have you found useful to encourage adoption of this type of resource?

Are you using data use agreements? Uniform or individualized?

Where do you see institutional data repositories fitting in the larger ecosystem?

Page 33: DataShare :  Collaboration Yields Promising Tool

More info

DataShare: http://datashare.ucsf.edu

CDL: http://www.cdlib.org
• Merritt: https://merritt.cdlib.org
• EZID: http://n2t.net/ezid
• XTF: http://xtf.cdlib.org

UCSF Library: http://www.library.ucsf.edu/

UCSF CTSI: http://ctsi.ucsf.edu/

NCATS – NIH Grant # UL1 TR000004