Giri Palanisamy Oak Ridge National Laboratory & Lorrie Apple Johnson U.S. Department of Energy October 16, 2013 Data Citation and Linking of Big and Continuous Data An Experience from the U.S. Department of Energy’s Atmospheric Radiation Measurement (ARM) Program


Page 1

Giri Palanisamy, Oak Ridge National Laboratory

& Lorrie Apple Johnson, U.S. Department of Energy

October 16, 2013

Data Citation and Linking of Big and Continuous Data

An Experience from the U.S. Department of Energy’s Atmospheric Radiation Measurement (ARM) Program

Page 2

U.S. Department of Energy’s Atmospheric Radiation Measurement (ARM) Data Center

• Located at Oak Ridge National Laboratory (ORNL)

• Part of Climate Change Science Institute

• ARM – www.arm.gov

Page 3

Office of Scientific and Technical Information (OSTI)

“The Secretary, through the Office of Scientific and Technical Information, shall maintain within the Department publicly available collections of scientific and technical information resulting from research, development, demonstration, and commercial applications activities supported by the Department.”

Energy Policy Act of 2005

OSTI has the corporate responsibility for ensuring appropriate access to the U.S. Department of Energy’s (DOE) R&D results.

• DOE invests over $10 billion/year in basic sciences, clean energy technology, nuclear research.

• The immediate output from this investment is information… knowledge… R&D results in many formats, including digital data.

• OSTI’s mission is to accelerate scientific progress by accelerating access to this information.

Page 4

• Type of Data: Atmospheric processes, cloud dynamics

• Products: > 3,000

• Archive Size: > 300 TB

• Users/year: ~1,500

• Year Started: 1991

Page 5

ARM data collection: Consists of permanent, mobile, and aircraft sites

• Southern Great Plains (1993)

• North Slope of Alaska: Barrow (1998) and Atqasuk (1999)

• Tropical Western Pacific: Manus (1996), Nauru (1998), and Darwin (2002)

• First ARM Mobile Facility (2005); Second ARM Mobile Facility (2010)

• ARM Aerial Facility (2007)

Page 6

Challenges for Scientific Data

Hard to FIND

Hard to NAVIGATE

Hard to CITE

Page 7

ARM Archive - Challenges

• Millions of data files from over 3,000 data products.

• Most of them are continuous data streams.

• Large user community and complex use of data (climate change modeling).

• Data is also published via other portals.

Page 8

Why Cite Data?

Data citation can help by:

• enabling easy reuse and verification of data

• allowing the impact of data to be tracked

• creating a scholarly structure that recognizes and rewards data producers

Data should be cited in just the same way that other sources of information, such as articles and books, are cited.

Page 9

ARM Data Citation Service - Goals

• To allow users to cite the exact ARM data used in their research publications

• To allow future data users, and the project, to easily track the data used in various articles

Strategy:

• DOIs assigned at the ARM data product level, and presented in the ARM data stream pages and field campaign readme files

• DOIs also sent via Archive data notification emails
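
Assigning DOIs at the data product level means that many individual files share one identifier. A minimal sketch of that lookup follows. It assumes file names begin with a `datastream.level` prefix (for example `sgpmetE13.b1.20130101.000000.cdf`) and uses made-up DOI values; neither the naming pattern nor the lookup table comes from the slides.

```python
from typing import Optional

# Hypothetical lookup table: one DOI per data product (datastream).
# The DOI strings are placeholders, not real registered identifiers.
PRODUCT_DOIS = {
    "sgpmetE13.b1": "10.0000/EXAMPLE-MET",
    "nsaskyradC1.b1": "10.0000/EXAMPLE-SKYRAD",
}


def product_of(filename: str) -> str:
    """Return the data product key: the first two dot-separated parts."""
    parts = filename.split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected file name: {filename!r}")
    return ".".join(parts[:2])


def doi_for_file(filename: str) -> Optional[str]:
    """Look up the product-level DOI that covers a single data file."""
    return PRODUCT_DOIS.get(product_of(filename))


# Three daily files, two data products, hence two distinct DOIs.
files = [
    "sgpmetE13.b1.20130101.000000.cdf",
    "sgpmetE13.b1.20130102.000000.cdf",
    "nsaskyradC1.b1.20130101.000000.cdf",
]
dois = {doi_for_file(name) for name in files}
```

A user citing a year of daily files from one data product would therefore cite a single DOI, with the dates used stated alongside it.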

Page 10

One Solution: DataCite

What is DataCite?

A global consortium composed of local institutions focused on improving the scholarly infrastructure around datasets and other non-textual information.

A service for assigning Digital Object Identifiers (DOIs) and metadata to datasets.

DataCite (www.datacite.org) helps researchers find, access and reuse data.

Page 11

DOE Data ID Service

• DOE/OSTI is the only U.S. federal member of DataCite.

• Interagency agreement in place with NIH project; in discussions with eight agencies representing 15 projects.

• OSTI partnered with Oak Ridge National Laboratory to pioneer the procedure.

• First DOI for a DOE dataset was minted and registered with DataCite on 8/10/2011.

• DOE Atmospheric Radiation Measurement (ARM) has now registered over 545 datastreams, each representing hundreds of subordinate data files.

• Currently working with 6 DOE data centers, including ARM. Two are fully integrated; 4 others in testing or planning phases.

Page 12

Improving Access, Citation & Reuse of Data

• Easier identification and access of datasets across the international community of researchers via DataCite’s resolving tools

• Linkage between DOE’s R&D documents and the underlying datasets generated by the research

• Standard format for including data in the accepted bibliographic citation framework

• Aid researchers in locating exact datasets used in previous work, thus allowing verification of results or new uses for the data

Page 13

How Data Citation Works

Data Citation metadata submitted to DOE-OSTI (Web Service API, AN 241.6):

• Dataset Type

• Dataset Title

• Dataset Creator/Author or Principal Investigator

• Dataset Product Number

• DOE Contract/Award Number

• Originating Research Organization

• Publication/Issue Date

• Sponsoring Organization

• URL where the Dataset is posted for access

• Contact information

Steps shown in the workflow diagram:

• DOI assigned by DOE-OSTI

• DOE-OSTI submits nightly feed of new DOIs to DataCite

• DataCite registers DOI

• DataCite validates DOI registration with DOE-OSTI

• DOE-OSTI updates metadata record with DOI, creating a full Data Citation

• Creator/Author, Primary Investigator, or Submitter notified of Data Citation availability

• Data Citation submitted to search engines for indexing

Page 14

Required Metadata Elements

• Dataset Type

• Dataset Title

• Dataset Creator/Author or Principal Investigator

• Dataset Product Number

• DOE Contract/Award Number

• Originating Research Organization

• Publication/Issue Date

• Sponsoring Organization

• URL where the Dataset is posted for access

• Contact information
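
A data center could check a record for the required elements before submitting it. The sketch below is one hypothetical way to do that: the class and field names are illustrative stand-ins for the ten elements listed above, not OSTI's actual submission schema.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DatasetRecord:
    """One field per required metadata element listed on the slide.

    Field names are illustrative; they are not an official schema.
    """
    dataset_type: str = ""
    title: str = ""
    creator: str = ""
    product_number: str = ""
    contract_number: str = ""
    originating_org: str = ""
    publication_date: str = ""
    sponsoring_org: str = ""
    url: str = ""
    contact: str = ""


def missing_elements(record: DatasetRecord) -> List[str]:
    """Return the names of required elements that are empty or blank."""
    return [f.name for f in fields(record)
            if not getattr(record, f.name).strip()]


# A partially filled draft record with placeholder values.
draft = DatasetRecord(
    title="Example surface meteorology datastream",
    creator="Example Principal Investigator",
    url="https://example.org/dataset",
)
still_needed = missing_elements(draft)
```

An empty `still_needed` list would mean the record carries every required element and is ready to submit.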

Page 19

Since science is not bound by agency, organization, or geography…

Facilitating Access to Scientific Data:

Federated Searching

• We integrate or aggregate multiple government R&D-related databases into single-search portals.

• Innovative technology drills down to selected databases and websites in parallel, then presents ranked search results.
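
The parallel query and ranked merge described above can be sketched in a few lines. This is a toy illustration only: the in-memory `SOURCES` dictionary stands in for remote databases, and the word-count score is an assumed placeholder for whatever ranking the real portals use.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in "databases": each maps a document title to its text.
# A real federated search would call remote search services instead.
SOURCES = {
    "db_a": {"Cloud radar notes": "cloud radar cloud dynamics",
             "Soil report": "soil moisture"},
    "db_b": {"Aerosol study": "aerosol cloud interaction"},
}


def search_source(name, query):
    """Search one source; score = number of times the query word appears."""
    hits = []
    for title, text in SOURCES[name].items():
        score = text.split().count(query)
        if score:
            hits.append((score, name, title))
    return hits


def federated_search(query):
    """Query every source in parallel, then merge into one ranked list."""
    with ThreadPoolExecutor() as pool:
        per_source = pool.map(lambda n: search_source(n, query), SOURCES)
        merged = [hit for hits in per_source for hit in hits]
    # Highest score first; ties broken by source name, then title.
    return sorted(merged, key=lambda h: (-h[0], h[1], h[2]))


results = federated_search("cloud")
```

Because each source is queried independently, a slow or unavailable database delays only its own results; the merge step is where a single ranking is imposed across sources.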

Page 20

WorldWideScience.org Enabling Access to Global R&D Results

• Multilingual translation capability for 10 languages.

• More than 400 million pages of scientific and technical information, including:

• Text

• Multimedia

• Data

Research results from 70+ countries are searchable via single-query global science portal.

Page 25

Citing ARM Data

Several citation formats are possible using DOIs. ARM encourages users to include the following information when citing ARM data:

• Author

• Original publication date

• Update period, if applicable (daily, monthly, etc.)

• Dataset name

• Dates used

• Location (latitude/longitude, site name, and facility identifier)

• Editor(s) or compiler(s)

• Place of publication

• Publisher

• Date accessed

• DOI
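
As a rough illustration, the elements above can be assembled into a citation string programmatically. The ordering and punctuation in this sketch are assumptions made for the example, not a format prescribed by ARM, and all values are placeholders.

```python
def format_citation(author, pub_year, dataset_name, dates_used, location,
                    place, publisher, date_accessed, doi,
                    update_period=None, compilers=None):
    """Join the recommended elements into one citation string.

    The order and punctuation here are illustrative only.
    """
    parts = [f"{author} ({pub_year})"]
    if update_period:
        parts.append(f"updated {update_period}")
    parts.append(dataset_name)
    parts.append(dates_used)
    parts.append(location)
    if compilers:
        parts.append(f"Compiled by {compilers}")
    parts.append(f"{place}: {publisher}")
    parts.append(f"Data set accessed {date_accessed}")
    parts.append(f"doi:{doi}")
    return ". ".join(parts) + "."


# Placeholder values; the DOI is not a real registered identifier.
citation = format_citation(
    author="Example Program",
    pub_year=1993,
    update_period="daily",
    dataset_name="Example Datastream",
    dates_used="2010-01-01 to 2010-12-31",
    location="Example Site (X1)",
    compilers="A. Compiler",
    place="Example City",
    publisher="Example Archive",
    date_accessed="2013-10-16",
    doi="10.0000/EXAMPLE",
)
```

The optional elements (update period, compilers) are skipped when not supplied, which matches the "if applicable" wording in the list.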

Page 26

Example of Scientific Impact

ORNL DAAC: Data Products used in literature

ORNL DAAC requests that data be cited in list of references; some authors “refer” to data in text or acknowledgements

Page 27

Thank you!

Giri Palanisamy

Oak Ridge National Laboratory

[email protected]

Lorrie Johnson

U.S. Department of Energy

Office of Scientific and Technical Information

[email protected]