It’s 2015.
Do You Know Where Your Data Are?
Professional Development Seminar
Psychology Department
Penn State University
2 September 2015
Patricia Hswe | University Libraries
Co-department Head, Publishing and Curation Services
Digital Content Strategist and Head, ScholarSphere User Services
http://www.libraries.psu.edu/psul/[email protected] | 867-3702
This is . . . data?
I’m confused by Brian Moore via Flickr CC BY-SA
1108845-godzilla_facepalm_godzilla_facepalm_face_palm_epic_fail_demotivational_poster_1245384435_super by Patty Marvel via Flickr CC BY-NC-ND
Also, don’t try this filename at home.
What we’ll talk about
• Data – definitions, debates / resistances
• Tips, tools, and services for research data management (RDM)
• Sharing research: values and venues
• Discussion: questions, comments, concerns?
Data – Definitions, Debates / Resistances
What are research data?
Why should they be shared?
Data Center by Bob Micael via Flickr CC BY-NC
Research data management by janneke staack via Flickr CC BY-NC
lab notebook by __. via Flickr CC BY-NC
“THE AVAILABILITY OF RESEARCH DATA DECLINES RAPIDLY WITH ARTICLE AGE.”
Title of a 2014 article by Vines et al.
“The major cause of the reduced data availability for older papers was the rapid increase in the proportion of data sets reported as either lost or on inaccessible storage media.”
Forty years of removable storage by David Smith via Flickr CC BY
“The odds that we were able to find an apparently working e-mail address (either in the paper or by searching online) for any of the contacted authors did decrease by about 7% per year.”
e-mail symbol by Micky Aldridge via Flickr CC BY
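To get a feel for what "about 7% per year" adds up to, here is a rough arithmetic sketch. It assumes the decline compounds at a constant rate each year, which is a simplification for illustration and not a restatement of the article's model. The function name odds_multiplier is made up for this example. Note that these are odds, not probabilities.

```python
def odds_multiplier(years, yearly_decline=0.07):
    """Fraction of the original odds left after `years`,
    assuming the same proportional decline every year (an assumption)."""
    return (1 - yearly_decline) ** years

# Compare papers of different ages against a brand-new paper.
for years in (5, 10, 20):
    print(f"{years} years: odds multiplied by {odds_multiplier(years):.2f}")
```

Under that assumption, the odds of finding a working e-mail address for a 10-year-old paper are a little under half of what they are for a new one.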
The authors of the article were able to obtain only 19.5% of the data sets they requested – and only 11% for articles published before 2000.
“Unfortunately, many of these missing data sets could be retrieved only with considerable effort by the authors, and others are completely lost to science.”
Implications? What can researchers begin doing differently?
TIPS, TOOLS, AND SERVICES FOR RESEARCH DATA MANAGEMENT (RDM)
Use them!
Quick Tips and Best Practices
• Take a lifecycle mindset to research and data
• File-naming conventions
• Standards for description
• File formats
• Storage
From DataONE Best Practices: https://www.dataone.org/best-practices
Reflect on the “during” & the end at the beginning
File-Naming Conventions
• Consistency
– Patterns
• Descriptiveness
– Keywords
– “Aboutness” / content
• Versions
– Which versions need to be saved, tracked?
• Major components (will depend on type of research)
– Project name
– Content of the file
– Date
– Version number
– Location
– Instrument name / number
1108845-godzilla_facepalm_godzilla_facepalm_face_palm_epic_fail_demotivational_poster_1245384435_super by Patty Marvel . . . NOT A USEFUL FILE NAME!
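By contrast, a consistent pattern is easy to automate. The sketch below is one possible convention, not a standard: it joins project name, content, date, and version number in a fixed order. The function build_filename and the sample project are hypothetical, and the component order is an assumption you would adapt to your own lab.

```python
import re
from datetime import date

def build_filename(project, content, day, version, extension):
    """Join the major components into one consistent, sortable file name."""
    def slug(text):
        # Lowercase, then turn runs of other characters into single hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    # ISO-style date (YYYYMMDD) sorts chronologically; zero-padded version sorts too.
    return f"{slug(project)}_{slug(content)}_{day:%Y%m%d}_v{version:02d}.{extension}"

print(build_filename("Stroop Pilot", "reaction times", date(2015, 9, 2), 3, "csv"))
# prints: stroop-pilot_reaction-times_20150902_v03.csv
```

Underscores separate the components and hyphens separate words within a component, so the name can be split back into its parts later.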
Standards for Description
• What does your discipline use to describe information?
– Arts use VRA Core, CDWA (Categories for the Description of Works of Art)
– Biology uses Darwin Core
– Digital collections use Dublin Core
– Ecology has Ecological Metadata Language
– Social sciences have DDI (Data Documentation Initiative)
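As a small illustration of what a description standard looks like in practice, the sketch below writes a few Dublin Core elements as XML using Python's standard library. Dublin Core is used here only because it is the simplest standard on the list; the wrapper element name "metadata", the function dublin_core_record, and all field values are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 basic Dublin Core elements.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

def dublin_core_record(fields):
    """Build a minimal XML record with one dc: element per field."""
    root = ET.Element("metadata")  # wrapper name is an arbitrary choice
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical data set description.
record = dublin_core_record({
    "title": "Stroop pilot reaction times",
    "creator": "Example Lab",
    "date": "2015-09-02",
    "format": "text/csv",
    "description": "Trial-level reaction times from a pilot study.",
})
print(record)
```

A record like this can travel with the data file, so the description is not lost when the file is moved or shared.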
File Formats
• Open rather than proprietary
– Interoperable, usable across platforms
• What’s commonly used in your community / discipline?
• Formats for use vs. formats for archiving
– PNG or JPG vs. TIFF
– Word vs. PDF
Storage
• Distribution and redundancy
– Keep the same files in more than one place
– Local options: internal (computer, laptop) hard drive; external hard drive; college/department servers
– Campus enterprise services: Box, Tivoli Storage Manager, HPC
– Cloud services: Dropbox, Box, SpiderOak, Amazon Web Services
• At least 3 copies
• Have master files from which copies get made
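One way to act on "master files" and "at least 3 copies" is to copy from the master and confirm each copy with a checksum. The sketch below is a minimal illustration using only Python's standard library. The function names and folder names are hypothetical, and the temporary folders stand in for real locations such as an external drive or a department server.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path):
    """Checksum of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def replicate(master, destinations):
    """Copy the master file into each folder and verify every copy."""
    expected = sha256(master)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / Path(master).name
        shutil.copy2(master, target)  # copy2 also keeps timestamps
        if sha256(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies

# Demo: the master plus two verified copies makes three in total.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "master" / "data.csv"
    master.parent.mkdir()
    master.write_text("subject,rt_ms\n1,512\n")
    made = replicate(master, [Path(tmp) / "external_drive", Path(tmp) / "dept_server"])
    print(f"{len(made) + 1} copies, all checksums match")
```

Rerunning the checksum later on each copy is a cheap way to notice silent corruption before the master is the only good copy left.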
Tools / Resources / Services
• Training
– MANTRA: http://datalib.edina.ac.uk/mantra/
– Penn State’s DMP Tutorial: https://www.e-education.psu.edu/dmpt/
• Resources
– DMPTool: https://dmp.cdlib.org/
– re3data - data repository index: http://www.re3data.org/
– PSU resources: Penn State boilerplate language and Penn State DMP local guidance
• Services
– ScholarSphere: https://scholarsphere.psu.edu/
  • Sandbox environment: https://scholarsphere-demo.dlt.psu.edu/
– Libraries also consult, teach, review DMPs
DEMOS OF TOOLS/RESOURCES/SERVICES
Penn-State-based
WHAT KIND OF PRESENCE DO YOU HAVE ONLINE?
Sharing research: values and venues
Social network in a course by Hans Poldolja via Flickr CC BY
How CONNECTED are you, and in what WAYS? Have you GOOGLED yourself lately?
Impact Story: https://impactstory.org/
http://patriciahswe.net/
DISCUSSION (NOT THE LAST SLIDE, BY THE WAY)
Questions? Comments? Concerns?
Patricia Hswe | [email protected]
Thank you!
Goodman, Alyssa, Alberto Pepe, Alexander W. Blocker, Christine L. Borgman, Kyle Cranmer, Merce Crosas, Rosanne Di Stefano, et al. 2014.
“Ten Simple Rules for the Care and Feeding of Scientific Data.”
PLoS Comput Biol 10 (4): e1003542. doi:10.1371/journal.pcbi.1003542.