Using Specimen Data in Scientific Workflow Environments to Connect to Metadata Archive and Discovery Services in Environmental Biology CJ Grady, J.H. Beach, A. Stewart, J. Cavner University of Kansas Biodiversity Institute


Post on 15-Dec-2015



Using Specimen Data in Scientific Workflow Environments to Connect to Metadata Archive and Discovery Services in Environmental Biology

CJ Grady, J.H. Beach, A. Stewart, J. Cavner

University of Kansas Biodiversity Institute


Geospatial Metadata

• Describes
  – What it is
  – What it looks like
  – Who assembled it
  – When it was collected
  – Etc.

1960 - 1990


EML

• Ecological Metadata Language
  – XML Schema
  – Open Source
  – Community Driven
  – Describes ecological data
    • Occurrence Data
    • Climate Layers
    • Species Ranges
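
As a rough illustration of the points above, the following sketch builds a small EML-style record for occurrence data with Python's standard library. The element names follow EML's general dataset layout, but the namespace version, the sample values, and the helper name build_eml are assumptions made for this example, and the output is not validated against the EML schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for illustration (EML 2.1.1); adjust to the schema version in use.
EML_NS = "eml://ecoinformatics.org/eml-2.1.1"


def build_eml(title, creator_surname, begin_year, end_year):
    """Build a simplified, unvalidated EML-style document (hypothetical helper)."""
    ET.register_namespace("eml", EML_NS)
    root = ET.Element(
        f"{{{EML_NS}}}eml", {"packageId": "example.1.1", "system": "example"}
    )
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title

    # Who assembled the data.
    creator = ET.SubElement(dataset, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "surName").text = creator_surname

    # When the data were collected.
    coverage = ET.SubElement(dataset, "coverage")
    temporal = ET.SubElement(coverage, "temporalCoverage")
    dates = ET.SubElement(temporal, "rangeOfDates")
    begin = ET.SubElement(ET.SubElement(dates, "beginDate"), "calendarDate")
    begin.text = str(begin_year)
    end = ET.SubElement(ET.SubElement(dates, "endDate"), "calendarDate")
    end.text = str(end_year)
    return root


# Illustrative values only.
doc = build_eml("Example occurrence data", "Example", 1960, 1990)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

A real record would carry many more elements (contacts, geographic coverage, attribute descriptions); this sketch only shows how the "what, who, when" descriptions map onto XML.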


Narratives

• Transformation of metadata into a story that is appropriate for the intended audience

• Same metadata can be used to create narrative for:
  – Scientists
  – Undergrads
  – K-12 students
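
One way to picture this transformation is a set of per-audience templates filled from a single metadata record. The sketch below assumes a hypothetical record layout and template wording; it is not the Lifemapper narrative generator, only an illustration of rendering the same metadata for different readers.

```python
# Hypothetical metadata record; field names and values are illustrative only.
metadata = {
    "species": "Example species",
    "record_count": 120,
    "begin_year": 1960,
    "end_year": 1990,
}

# One template per audience; the wording is an assumption, not actual service output.
TEMPLATES = {
    "scientist": (
        "{record_count} occurrence records for {species}, "
        "collected {begin_year}-{end_year}."
    ),
    "undergrad": (
        "Researchers recorded {species} at {record_count} locations "
        "between {begin_year} and {end_year}."
    ),
    "k12": "People found {species} in {record_count} places!",
}


def narrate(record, audience):
    """Render the same metadata as a narrative for the given audience."""
    return TEMPLATES[audience].format(**record)


for audience in TEMPLATES:
    print(audience, "->", narrate(metadata, audience))
```

Keeping the metadata separate from the templates means a new audience only needs a new template, not a new metadata record.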


Narrative Example


• DataONE
  – Distributed system for:
    • Queries
    • Data replication
  – Initially supports EML


Study of Experiment Reproducibility

Ellison, Aaron. 2010. Repeatability and transparency in ecological research. Ecology 91.


There is a Solution!


Process Metadata

• Data about the process used
• Descriptive and prescriptive
• Documents process used to generate data / metadata
  – Quality control
  – SDM experiments
  – RAD experiments


Capabilities

• Reproducibility
  – Actions are documented

• Transparency
  – Experiments can be evaluated and validated

• Publishing
  – Metadata can be published along with results


What we have done

• EML for all of our Species Distribution Modeling services

• Simple process metadata
  – Documents how an experiment is run through our cluster, including which versions of software are used
  – Also describes which web services would be called to execute the experiment again
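
A minimal sketch of what such a simple process metadata record could hold is shown below. The class names, field names, and the placeholder URL are assumptions for illustration and do not reflect the actual Lifemapper schema; the point is only that software versions and the web service calls needed to repeat a run can be captured in one serializable record.

```python
import json
import platform
from dataclasses import dataclass, field, asdict


@dataclass
class ServiceCall:
    """One web service request needed to repeat a step (hypothetical structure)."""
    method: str
    url: str


@dataclass
class ProcessRecord:
    """Simple process metadata for one experiment run (hypothetical structure)."""
    experiment_id: str
    software_versions: dict = field(default_factory=dict)
    service_calls: list = field(default_factory=list)

    def to_json(self):
        # asdict() also converts the nested ServiceCall entries to plain dicts.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


record = ProcessRecord(
    experiment_id="example-sdm-1",
    software_versions={"python": platform.python_version()},
    service_calls=[
        # Placeholder URL, not a real Lifemapper endpoint.
        ServiceCall("POST", "https://example.org/services/sdm/experiments"),
    ],
)
print(record.to_json())
```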


What we have done

• Clients
  – Python library
  – VisTrails
  – QGIS


What we are doing

• Publishing EML to a repository
• Client extensions
• Extending process metadata
  – HTTP message
  – XPath


Process Metadata Extensions

• HTTP Message
  – Documents any web resource call over HTTP

• XPath processing
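
The two extensions can be sketched together: one structure that documents an HTTP call, and one step that applies an XPath expression to the XML that comes back. The HttpMessage class, the placeholder URL, and the canned response below are assumptions for illustration; no network request is made, and Python's ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class HttpMessage:
    """Documents one HTTP call (hypothetical structure)."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)
    body: str = ""


def apply_xpath(xml_text, path):
    """Return the text of every element matched by a (limited) XPath expression."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(path)]


# Placeholder request and canned response; no network access is performed.
call = HttpMessage(
    "GET", "https://example.org/services/layers", {"Accept": "application/xml"}
)
response = (
    "<layers>"
    "<layer><name>bio1</name></layer>"
    "<layer><name>bio12</name></layer>"
    "</layers>"
)
names = apply_xpath(response, "./layer/name")
print(call.method, call.url, names)
```

Recording the request and the extraction path side by side is what lets a later reader repeat both the call and the processing applied to its result.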


What we will do next

• Use standard APIs to communicate with DataONE

• Continue to search for standard process metadata and include it whenever possible

• Contribute process metadata extensions back to the community

• Add further conditional analysis elements to the schema (JSON, etc.)


Reproducibility

• Simple process metadata
• EML process metadata extensions
• Lifemapper client EML reader


Transparency

• EML for all service objects
• Descriptive process metadata


Publishing Aid

• Client access to public data / metadata catalogs

• Publish buttons


Lessons Learned

• Had success starting with narrow, specific process steps and generalizing them
  – Calls to our web services expanding to any HTTP call
• Easy to get carried away with all of the possibilities


Lifemapper funded by:

U.S. National Science Foundation

NSF EPSCoR 0553722

NSF EPSCoR 0919443

EHR/DRL 0918590

BIO/DBI 0851290

OCI/CI-TEAM 0753336
