

GRAD 521, Research Data Management Winter 2014 – Lecture 5

Amanda L. Whitmire, Asst. Professor

DM planning for YOUR research

What are we doing here?

Primary course outcome = your DMP

➜ addresses all RDM competencies

➜ establishes YOUR plan of action

However… More than half of you do not yet have a well-defined research project

What are we doing today?

1. Outlining your research project: major areas to cover

“What are you doing?”

2. Describe your research process & tasks

“How are you doing it?”

3. Identify roles & responsibilities

“Is anyone else involved?”

“What are you doing?”

1. What is your hypothesis/research question?

Set the stage for your work

E.g., “Measurements of the bulk spectral backscattering properties of seawater are directly related to the physical properties of its constituent particles.”

2. Answering: “Who cares?” Why are you doing this? Why is it important?

“How are you doing it?”

What is your approach for addressing your research question?

“How are you doing it?”

What types of data will be produced?

Observational | Experimental | Derived | Compiled | Simulation | Reference/Canonical

Qualitative | Quantitative | Geospatial | Image/audio/video | Scripts/codes

-&-

What formats will be produced?

ASCII | CSV | DOCX | XLSX | TIFF | WAVE | NetCDF | MAT

“How are you doing it?”

How will your data be produced?

When? | Where? | Methods?

-&-

How much data will be produced?

Megabytes – Gigabytes – Terabytes

10’s – 100’s – 1000’s of files?
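A rough back-of-the-envelope estimate is usually enough to answer the "how much data" question. The sketch below multiplies an assumed number of files by an assumed mean file size; the function names and all of the numbers are hypothetical placeholders, not figures from the lecture, so substitute values from your own project.

```python
def estimate_volume(files_per_day, days, mean_file_size_mb):
    """Return (number of files, total size in MB) for a simple, uniform collection schedule."""
    n_files = files_per_day * days
    total_mb = n_files * mean_file_size_mb
    return n_files, total_mb


def human_readable(total_mb):
    """Express a size given in MB as MB, GB, or TB (decimal units, 1 GB = 1000 MB)."""
    for unit in ("MB", "GB", "TB"):
        if total_mb < 1000 or unit == "TB":
            return f"{total_mb:.1f} {unit}"
        total_mb /= 1000


# Hypothetical project: 20 files per day for 90 days, about 25 MB per file.
n_files, total_mb = estimate_volume(files_per_day=20, days=90, mean_file_size_mb=25)
summary = human_readable(total_mb)
print(f"{n_files} files, roughly {summary}")
```

An order-of-magnitude answer (megabytes vs. gigabytes vs. terabytes) is what matters for choosing storage and backup options.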

“Who is involved?”

Do you depend on anyone for data?

If so, what are their roles?

Adviser | Research Technician | Public Database

Data in Real Life: A DMP Example

Project name: Effects of temperature and salinity on population growth of the estuarine copepod, Eurytemora affinis

Project participants and affiliations:

Carly Strasser (University of Alberta and Dalhousie University)

Mark Lewis (University of Alberta)

Claudio DiBacco (Dalhousie University and Bedford Institute of Oceanography)

Funding agency: CAISN (Canadian Aquatic Invasive Species Network)

Description of project aims and purpose:

We will rear populations of E. affinis in the laboratory at three temperatures and three salinities (9 treatments total). We will document the population from hatching to death, noting the proportion of individuals in each stage over time. The data collected will be used to parameterize population models of E. affinis. We will build a model of population growth as a function of temperature and salinity. This will be useful for studies of invasive copepod populations in the Northeast Pacific.

Photo by C. Strasser; all rights reserved

Data in Real Life: A DMP Example

1. Information about data

Every two days, we will subsample E. affinis populations growing at our treatment conditions. We will use a microscope to identify the stage and sex of the subsampled individuals. We will document the information first in a laboratory notebook, then copy the data into an Excel spreadsheet. For quality control, values will be entered separately by two different people to ensure accuracy. The Excel spreadsheet will be saved as a comma-separated value (.csv) file daily and backed up to a server. After all data are collected, the Excel spreadsheet will be saved as a .csv file and imported into the program R for statistical analysis. Strasser will be responsible for all data management during and after data collection.
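The double-entry quality control step described above (two people entering the same values independently) can be checked automatically once both entries are saved as .csv. The following is a minimal sketch, not the authors' actual procedure: the column names, treatment labels, and counts are invented purely for illustration, and `find_mismatches` is a hypothetical helper.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of dict rows keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def find_mismatches(entry_a, entry_b):
    """Return (row_number, column, value_a, value_b) for every cell that differs."""
    if len(entry_a) != len(entry_b):
        raise ValueError("the two entries have different numbers of rows")
    mismatches = []
    for row_number, (row_a, row_b) in enumerate(zip(entry_a, entry_b), start=1):
        for column in row_a:
            value_a = (row_a.get(column) or "").strip()
            value_b = (row_b.get(column) or "").strip()
            if value_a != value_b:
                mismatches.append((row_number, column, value_a, value_b))
    return mismatches


# Invented example data: the same notebook page typed in by two people.
person_1 = """date,treatment,stage,sex,count
2014-01-06,T1S1,nauplius,NA,12
2014-01-06,T1S1,adult,F,3
"""
person_2 = """date,treatment,stage,sex,count
2014-01-06,T1S1,nauplius,NA,12
2014-01-06,T1S1,adult,F,8
"""

mismatches = find_mismatches(read_rows(person_1), read_rows(person_2))
for row_number, column, value_a, value_b in mismatches:
    print(f"row {row_number}, column '{column}': {value_a!r} vs {value_b!r}")
```

Any reported mismatch would then be resolved against the laboratory notebook, which remains the authoritative record.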

Our short-term data storage plan, which will be used during the experiment, will be to save copies of 1) the .txt metadata file and 2) the Excel spreadsheet as .csv files to an external drive, and to take the external drive off site nightly. We will use the Subversion version control system to update our data and metadata files daily on the University of Alberta Mathematics Department server. We will also have the laboratory notebook as a hard copy backup.
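One way to make a nightly copy to an external drive trustworthy is to verify each copied file against its source with a checksum. The sketch below is an assumption about how that could be scripted, not part of the plan quoted above: the file names are placeholders, the "external drive" is a temporary directory standing in for a real mount point, and `back_up` is a hypothetical helper. It does not cover the Subversion step.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def back_up(files, backup_dir):
    """Copy each file into backup_dir and report whether the copy matches its source."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    verified = {}
    for source in map(Path, files):
        target = backup_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves modification times
        verified[source.name] = sha256_of(source) == sha256_of(target)
    return verified


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "working"
    work.mkdir()
    (work / "metadata.txt").write_text("placeholder metadata\n")
    (work / "counts.csv").write_text("date,stage,count\n")
    result = back_up(sorted(work.iterdir()), Path(tmp) / "external_drive")
    print(result)
```

A file whose entry comes back `False` should be copied again before the drive leaves the lab.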

From Strasser et al.

Data in Real Life: A DMP Example

An example from the UK Economic and Social Research Council Department for International Development

“…the research project involves primary data collection: 1) public data; 2) semi-structured interviews; and 3) focus group discussions with people identified through profiling techniques:

1. Public data: Where possible, we will use online and/or electronic archives. This will involve extracting and processing quantitative data, including participants, objectives and outcomes.

2. Semi-structured interviews with individuals: The team anticipates undertaking 25-40 semi-structured interviews in each country from a sample frame to be developed in Phase 2. Data will be collected and stored using digital audio recording (eg MP3) where interviewees permit. In case they do not, interviews will be undertaken in pairs to enable detailed note-taking.

3. Focus group discussions matched to profiles: …Focus groups will involve two researchers, and be conducted in the vernacular. Whether recorded or not, the event will be transcribed or documented using agreed formats and standards for handling the issue of multiple voices, interruptions, labeling of participatory and visual activities, and so on. All transcripts will be in Microsoft Word.”

Data in Real Life: A DMP Example

An example from the UK Economic and Social Research Council Department for International Development

“Responsibilities: The PI will direct the data management process over all, with the UK research assistant responsible for ensuring metadata production, day-to-day cross-checks, back-up and other quality control activities are maintained. The lead country researchers will be responsible for routine supervision of the dataset development. Data extraction, processing and inputting for the dataset will be undertaken by the in-country junior researchers. The UK Institution, lead country and junior researchers will share responsibilities for collecting and transcribing focus group and interview data, with the UK research assistant supporting as necessary. The PI will be finally responsible for dealing with quality and sharing and archiving of data.”

So, you need to:

1. Outline your research project: major areas to cover

“What are you doing?”

2. Describe your research process & tasks

“How are you doing it?”

3. Identify roles & responsibilities

“Is anyone else involved?”

Homework

Address sections 1.1 – 1.6 of a general data management plan. Include the aims and scope of your research project. See detailed assignment on Blackboard later today.

DUE before class on TUESDAY.

Next meeting:

Data Curation Profiles

What are they?

Why are you doing one?

What’s involved?

YOU DECIDE what’s important.