A. Sallans. "UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and Services to Support Scientific Data in the Library." Presented at the 2011 International Association for Social Science Information Services and Technology (IASSIST) conference.
UVA LIBRARY SCIENTIFIC DATA
CONSULTING GROUP (SCIDAC): NEW PARTNERSHIPS AND SERVICES TO
SUPPORT SCIENTIFIC DATA IN THE LIBRARY
Andrew Sallans
Head of Strategic Data Initiatives
Sherry Lake
Senior Scientific Data Consultant
IASSIST 2011
1 June 2011
OUTLINE
Phase 1 – Research Computing Lab
Phase 2 – Scientific Data Consulting Group
1. Data assessment interviews
2. Data management planning
3. Integration of processes with IR
Partnerships
Internal
External
Challenges
Future opportunities
BACKGROUND ON THE UNIVERSITY OF VIRGINIA
“Mr. Jefferson’s University”
Size
About 14,000 undergraduate
students
About 6,000 graduate students
About 2,000 faculty
Annual research dollars – FY10: $375 million
DE (Ed) - $10 million
DOE - $10 million
DOD - $15 million
NSF - $29 million
DHHS - $197 million
Source: “Men's Lacrosse NCAA CHAMPS! (by Matt Riley) 5/31/2011” photo gallery, http://www.virginiasports.com/
PHASE 1: RESEARCH COMPUTING LAB
Began planning in 2005.
Central IT: seeking greater
visibility.
Library: seeking new ways
to support scientific research.
Collocation provided mutual
benefits.
Staff combined in 2006,
moved to Library locations
(Research Computing Lab &
Scholars’ Lab), set up new
service points and services.
RESEARCH COMPUTING LAB RESPONSE
Aiming to provide support across the entire
scientific research data lifecycle
Staff with expertise in:
Data
Quantitative data, statistics
Modeling, visualization
Scientific publishing
Emphasis on consulting, not drop-off services
Partnership with traditional librarians to help
ease transition to new support models
SAMPLE RCL CONSULTATIONS
STS Undergrad Environmental Justice (2008)
Development of technology solutions for empowering the citizen scientist
Web 2.0 tools, data collection/management
Data analysis
Economics Graduate Student (2008/2009)
Airline flight price modeling
Screen scraping, data collection/management
Data analysis
Mountain Lake Beetle Project (2009)
Mobile data acquisition/collection solution
Database development/management, programming
Data analysis
Archiving of dissertation data (2009)
EVSC student, ModelMaker 4.0 data
Biology student, IDL, Matlab, R code
TAKE-AWAYS
This is the future
Rapidly growing space, lots of opportunity
Requires big investment and commitment, the
biggest being training and priority alignment
Libraries and institutions need to make decisions
on what to do and what not to do
It’s a culture change for libraries,
institutions, and researchers
PHASE 2 - SCIENTIFIC DATA CONSULTING
GROUP
December 2009/January 2010: rethinking the
model
Budgetary pressures
Changes in organizational priorities
Emerging demands in research community
Spring 2010: decision to focus on data
May 2010: close of RCL, start of SciDaC
WHAT’S HOT IN 2010?
Open data: growing governmental interest in
making publicly-funded research more
transparent and more available (NIH, NSF)
Broader critical review: greater interest in
evaluating original research data (Nature)
Technological advances: sharing of research
results easier and faster (Repositories, Web 2.0)
Reuse/preservation of research data:
increased consideration of the cost and value of
research data and need to ensure its longevity
“SCIENTISTS SEEKING NSF FUNDING WILL SOON BE REQUIRED TO SUBMIT DATA MANAGEMENT PLANS”
Press Release 10-077, May 5, 2010
Current Policy:
o “To advance science by encouraging data sharing among researchers”
o Data obtained with federal funds should be accessible to the general public
o Grantees must develop and submit specific plans to share materials collected with NSF support, except where this is inappropriate or impossible
On or around October 2010:
o All new NSF proposals will be required to include a data management plan in the form of a two-page supplementary document (peer reviewed)
o New policy is meant to be a first step toward a more comprehensive approach to data management
o Exact requirements are still vague
THE CHALLENGE FOR INSTITUTIONS
Data is expensive
Time, instrumentation, inability to reproduce
Increasing regulation
Granting agencies and journals require
submission
Inadequate training
No formal data management curriculum
Preservation is not a priority
For most researchers, preservation takes time
away from the work that is rewarded
(publication, teaching)
SO…WHO’S GOING TO TAKE THIS ON?
Researchers?
VPR?
CIO?
OSP?
UL?
WHY THE LIBRARY?
Neutral: works across the entire institution
Strong in relationship building: has
experience fostering discussion and relationships,
and cultivates an existing support network
Intellectual Property experts: has dealt with
copyright, can translate to data
Service-oriented: uniquely positioned as an
intellectual service unit within the institution
GETTING STARTED…
Take what we learned in the RCL experience and
apply it to the focused demands around data
Steps:
Conduct a stakeholder analysis
Make a short term plan (12 months)
Develop clear priorities
Refine and standardize consulting methods
Communicate heavily
STAKEHOLDER ANALYSIS (ABBREVIATED)
Internal
Researchers
Graduate Students
Grant Administrators
Deans
VP/CIO
VPR
OSP
UL
External
Funding agencies
Broader research
community
“The Public”
SHORT TERM PLAN
Survey OSP to match grant holders with
regulations
Educate/engage subject librarians
Build political awareness/support
Build partnerships with
local/national/international groups
Resource requests:
Staffing commitment
Travel/partnership support
Promotion of initiative to institution
CLEAR PRIORITIES
1. Data interviews/assessments
2. Response to NSF Data Management Plan
(DMP) Mandate
3. Leadership on data for the Institutional
Repository (IR)
CONSULTING ACTIVITIES
Interviews/assessments
Data management planning templates
LOTS of documentation
Constant and continuous refinement of process
Focus on helping researchers improve process
COMMUNICATE HEAVILY
Internal
Inform staff of processes, priorities, and progress
Keep stakeholders engaged
Reach the consumers from many angles
External
Discuss and share experiences with colleagues at other
institutions
Create partnerships to share, build upon resources and
experiences, collaborate on tools
Networking (Twitter, LinkedIn, listservs, conference calls,
conference presentations)
Bottom line: this is a big culture shift, and you do have to
say the same thing many times in different ways
PRIORITY 1 – DATA ASSESSMENT INTERVIEWS
Initially a means of growing awareness of the consulting service and doing assessment, now a means of establishing a baseline for research data management practices with any new “client”
Protocol involves:
60-minute interview discussion (researcher / SciDaC consultants / subject librarian)
Development of a report
SciDaC consultants give researchers recommendations to improve data management
SciDaC consultants work with researchers to implement recommended solutions
Approach has proven to be very effective thus far
PRIORITY 2 – DATA MANAGEMENT
PLANNING
Highest priority: responding to the support
needs created by funding agency requirements
(e.g., NSF, others)
Getting a handle on data management as a
means of institutional risk management
Coordination of effort across institution
NSF DATA MANAGEMENT PLAN MANDATE
Official mandate became active Jan. 18, 2011
New NSF Directorates/Divisions continue to
release and specify guidelines (examples below)
Education and Human Resources (EHR)
Engineering (ENG)
Geosciences (GEO)
Mathematical and Physical Sciences (MPS)
Social, Behavioral, and Economic Sciences (SBE)
Researchers continue to be mostly unaware of the
mandate and how to prepare a DMP
UVA SCIDAC NSF DMP RESPONSE
UVa Library’s Original Request
Develop boilerplate for researchers to use in proposals
SciDaC Group’s Response
No boilerplate, successful proposals need customized plans
Our approach involves:
Knowledge across many communities (“translational” opportunities)
Leadership on policy/infrastructure development
Development of a template that simplifies writing the plan
Principles
Must be easy for researcher
Must be supportable by available UVA resources/infrastructure
Must be feasible to follow through on if the grant is awarded
PRIORITY 3 – INTEGRATION WITH IR
Institutional repository “Libra” (http://libra.virginia.edu)
Built upon Hydra architecture
Three components: open access publications, data, and electronic
theses/dissertations
Working out storage and cost models to
support management of big and small data from across the
institution’s research community
Will provide preservation assurance for data in form of
“blobs” or packages (bit preservation, no format migration)
Currently developing a user
interface/ingestion prototype that addresses the needs of
small data, for release in late July 2011
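Bit preservation, as described for Libra on this slide, means keeping the deposited bytes unchanged rather than migrating file formats. One common way to check that promise is a fixity audit: record a checksum for every file at deposit, then recompute and compare later. The sketch below is a generic illustration under that assumption, not a description of how Libra or Hydra actually implements preservation; the function names (`sha256_of`, `build_manifest`, `verify_manifest`) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum at deposit time."""
    return {
        p.relative_to(package_dir).as_posix(): sha256_of(p)
        for p in sorted(package_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(package_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose bytes have changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = package_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

A repository could store the manifest alongside the data package at ingest and rerun `verify_manifest` on a schedule. An empty list means every file is still bit-for-bit identical to what was deposited.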
COLLABORATIONS
Internal
Library / VPR / CIO / OSP
Institutional Repository Team
Kuali Coeus team
External
DMP Tool
DataONE
Conference/professional networks
CHALLENGES
Involving subject librarians?
Gaining institutional buy-in?
Meeting demand?
HOW TO INVOLVE SUBJECT LIBRARIANS?
UVa Library Staff Model
Scientific Data Consultants
Subject Librarians
Current Training Model
Brown Bag Data Curation Discussions
Data Interviews
Goals and Objectives
Build Data Literacy
Create Collaborative Opportunities
Establish the Library for Data Preservation
HOW TO GAIN INSTITUTIONAL BUY-IN?
Regulations are helpful
Partnerships between key stakeholders:
University libraries (UL)
Central IT (CIO)
Research Office (VP for Research)
Sponsored Programs/Research
Strategic investment: take ownership, allocate
resources, and demonstrate capability
HOW TO MEET DEMAND?
Time: how to best manage staff time
NSF research support alone is going to be very time-consuming (UVA had about 140 proposals over the past year, 44 in November alone)
Funding: work with leaders to find money
Redirection/reallocation of grant overhead dollars
Write-in of library staff on grants
Strategy: decide how to invest
How might units be reorganized?
How do we expand to other disciplines?
How could staff resources and expertise be refocused?
What additional partnerships would add value?
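The proposal counts quoted on this slide show how uneven the workload is. As a rough back-of-the-envelope sketch, assuming "the past year" means twelve months that include the November peak (the slide does not say so explicitly), the figures work out as follows:

```python
# Figures quoted on the slide (both are approximate).
proposals_past_year = 140
proposals_november = 44

# Assumption, not stated in the source: the "past year" spans 12 months
# and November is one of them.
months_in_year = 12

november_share = proposals_november / proposals_past_year
other_months_avg = (proposals_past_year - proposals_november) / (months_in_year - 1)
peak_ratio = proposals_november / other_months_avg

print(f"November share of annual proposals: {november_share:.0%}")
print(f"Average proposals in each other month: {other_months_avg:.1f}")
print(f"November load relative to a typical month: {peak_ratio:.1f}x")
```

Under that assumption, November alone accounts for nearly a third of the year's NSF proposals, about five times a typical month. Staffing therefore has to be planned around the peak, not the annual average.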
FUTURE DIRECTIONS
Addressing data management needs of other disciplines across the institution
Integration into formal research proposal process
Broader data management education
Increased funded research project consulting
Technology consulting
Expansion of virtual organization partners and creation of research advisory board
Guiding of policy revision to address new interests in data management and preservation
THANK YOU!
Andrew Sallans
Head of Strategic Data Initiatives, SciDaC Group
University of Virginia Library
Email: [email protected]
Twitter: asallans
http://www.lib.virginia.edu/brown/data