Data Management Practices for Early Career Scientists: Closing Robert Cook Environmental Sciences Division Oak Ridge National Laboratory Oak Ridge, TN [email protected] February 3, 2013


Page 1

Data Management Practices for Early Career Scientists: Closing

Robert Cook
Environmental Sciences Division
Oak Ridge National Laboratory
Oak Ridge, TN
[email protected]

February 3, 2013

Page 2

NACP Best Data Management Practices, February 3, 2013

Plan for archiving data

“Begin with the end in mind”

• Identified the data center
• Collaborated with the data center during the project
• Communicated:
  – Volume
  – Number of files
  – Special needs
  – Delivery dates
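As a rough illustration of how the volume and number of files might be worked out before contacting the data center, the sketch below walks a directory tree and totals both. The function name `summarize_directory` and the demonstration files are hypothetical; only Python standard-library calls are used.

```python
import os
import tempfile


def summarize_directory(root):
    """Return (file_count, total_bytes) for all files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, total_bytes


# Demonstration on a throwaway directory with two small, made-up files.
with tempfile.TemporaryDirectory() as root:
    for name, payload in [("a.csv", b"x" * 10), ("b.csv", b"y" * 25)]:
        with open(os.path.join(root, name), "wb") as handle:
            handle.write(payload)
    demo_summary = summarize_directory(root)

print(f"{demo_summary[0]} files, {demo_summary[1]} bytes")
```

In practice the function would be pointed at the project's data directory instead of a temporary one.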

Page 3

Followed Fundamental Data Practices

• Define the contents of your data files
• Use consistent data organization
• Use stable file formats
• Assign descriptive file names
• Preserve information
• Perform basic quality assurance
• Provide documentation
• Protect your data
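One way "perform basic quality assurance" might look in practice is a short script that flags fill values and out-of-range values. This is a minimal sketch only: the sample data, the column name `soil_temp_c`, the fill value of -9999, and the valid range are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data; the values are invented for this example.
SAMPLE = """site,date,soil_temp_c
A1,2012-06-01,14.2
A1,2012-06-02,-9999
A2,2012-06-01,141.0
"""

FILL_VALUE = -9999.0          # assumed missing-value code
VALID_RANGE = (-50.0, 60.0)   # assumed plausible range, degrees Celsius


def check_rows(text, column, fill_value=FILL_VALUE, valid_range=VALID_RANGE):
    """Return (row_number, problem) tuples for fill or out-of-range values."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        value = float(row[column])
        if value == fill_value:
            problems.append((row_number, "fill value"))
        elif not valid_range[0] <= value <= valid_range[1]:
            problems.append((row_number, "out of range"))
    return problems


problems = check_rows(SAMPLE, "soil_temp_c")
for row_number, problem in problems:
    print(f"row {row_number}: {problem}")
```

Flagged rows are reported, not altered, so the original values are preserved for review.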

Page 4

What to submit to the archive?

• Well-structured data files, with variables, units, and fill values well defined

• Metadata files (optional)
• Document that describes the data set
• Companion files that describe project, protocols, or field sites (photographs)
  – Material from Project Web site or Wiki
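A simple way to make variables, units, and fill values "well defined" is to ship a small data dictionary alongside the data files. The sketch below writes one as CSV text. The variable names, units, fill values, and the helper `build_data_dictionary` are hypothetical; a data center may expect a different layout.

```python
import csv
import io

# Hypothetical variable definitions; every name, unit, and fill value here
# is an assumption made for illustration, not taken from a real data set.
VARIABLES = [
    {"name": "site", "units": "none", "fill_value": "NA",
     "description": "Site identifier"},
    {"name": "date", "units": "YYYY-MM-DD", "fill_value": "NA",
     "description": "Sampling date"},
    {"name": "soil_temp_c", "units": "degrees Celsius", "fill_value": "-9999",
     "description": "Soil temperature at 10 cm depth"},
]

FIELDS = ["name", "units", "fill_value", "description"]


def build_data_dictionary(variables):
    """Render the variable definitions as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(variables)
    return buffer.getvalue()


dictionary_text = build_data_dictionary(VARIABLES)
print(dictionary_text)
```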

Page 5

Data Center: Stewardship and Archive Functions

Ingest
– perform QA checks
– compile project-provided metadata
– generate additional metadata
– convert to archival file formats

Metadata / Documentation
– prepare final metadata record and documentation

Archive / Publish
– generate citation and DOI (digital object identifier)

Exploration and Distribution
– provide tools to explore, access, and extract data

Post-Project Data Support
– provide long-term secure archiving
– serve as a buffer between end users and PIs
– provide usage statistics

Stewardship
– security, disaster recovery
– migration to new computer systems

Page 6

Workshop Goal

Provide fundamental data management practices that investigators should perform during the course of data collection.

To improve the usability of data sets for:
• You
• Collaborators
• People outside your project

By following the practices taught in this workshop, your data will be:
• less prone to error,
• more efficiently structured for analysis, and
• more readily understandable for any future research.

Page 7

Workshop Sponsors