NOAA Data Stewardship

Scott Hausman, Acting Director, NOAA National Climatic Data Center (NCDC), Asheville, North Carolina. June 3, 2010


DESCRIPTION

National Research Council, Third Meeting of the Board on Research Data and Information. NOAA Data Stewardship. Scott Hausman, Acting Director, NOAA National Climatic Data Center (NCDC), Asheville, North Carolina. June 3, 2010.

TRANSCRIPT

Page 1: NOAA Data Stewardship

Scott Hausman, Acting Director, NOAA National Climatic Data Center (NCDC)

Asheville, North Carolina

June 3, 2010

Page 2: NOAA Data Stewardship

Overview

• Leader in Environmental Data Management
• Partner with National Research Council
• Transforming NOAA Data Management
  • Strengthening Policies and Directives
  • Investing in Enterprise IT Infrastructure
  • Leveraging Universal Standards
  • Expanding Data Discovery and Access
  • Redefining Scientific Data Stewardship
  • Developing a Data Management Workforce
• The way forward and how BRDI can help

6/3/2010, NOAA Data Stewardship, Third NRC/BRDI Meeting

Page 3: NOAA Data Stewardship

Leader in Environmental Data Management

Improving data stewardship is among NOAA’s top priorities!

Mission: To understand and predict changes in Earth’s environment and conserve and manage coastal and marine resources to meet our Nation’s economic, social, and environmental needs

Vision: An informed society that uses a comprehensive understanding of the role of the oceans, coasts, and atmosphere in the global ecosystem to make the best social and economic decisions

Page 4: NOAA Data Stewardship

Leader in Environmental Data Management

Broadest Scope of any Agency for Environmental Data Stewardship

Space Observations

Ocean Observations

Land Surface Observations

Atmospheric Observations

• ~150 Research & Operational Observing Systems (http://www.nosa.noaa.gov/observing_systems.html)

• ~4-5 petabytes of data/year (~15 PB total)

Data Management Challenges are Changing

• No longer about data volume

• Data discovery and integration

• Data stewardship and information

Page 5: NOAA Data Stewardship

Partner with National Research Council

Principles for Effective Environmental Data Management

1. Data should be archived and accessible
2. Adequate resources for end-to-end management
3. Management activities should involve users
4. Interagency and international partnerships
5. Metadata are essential
6. Expert stewards required for management
7. Process to decide what data to archive
8. Archive must support discovery, access, and integration
9. Effective management requires a formal, ongoing planning process

National Research Council, Committee on Archiving and Accessing Environmental and Geospatial Data at NOAA

Page 6: NOAA Data Stewardship

Transforming NOAA Data Management

• Strengthening Policies and Directives
• Investing in Enterprise IT Infrastructure
• Leveraging Universal Standards
• Expanding Data Discovery and Access
• Redefining Scientific Data Stewardship
• Developing a Data Management Workforce

Page 7: NOAA Data Stewardship

Strengthening Policies and Directives

Environmental Data Management Committee (EDMC)

Established in Fall of 2009; Helen Wood, current Chair

• Coordinates the development of NOAA’s environmental data management strategy and policy, and provides guidance to ensure consistent implementation across NOAA, on behalf of the NOSC and CIO Council
• Environmental data management is an end-to-end process that includes acquisition, quality control, validation, reprocessing, storage, retrieval, dissemination, and long-term preservation activities
• The goal of the EDMC is to enable NOAA to maximize the value of its environmental data assets through sound and coordinated data management practices
• Leadership: Chair and Deputy Chair appointed by NOSC and CIO Council
• Membership
  • Line Office Representatives
  • NOAA Chief Enterprise Architect
  • NOAA Data Management Architect
• Ex-officio or Advisory
  • NOAA National Data Center Directors
  • Designated Mission-Goal & Sub-Goal Team Representatives
  • NOAA liaisons to key Federal and International initiatives concerning environmental data management

Figure (organization chart), recoverable labels: NOAA; NOSC (NOAA Observing System Council); CIOC (Chief Information Officer Council); EDMC (Environmental Data Management Committee); DMIT (Data Management Integration Team)

Page 8: NOAA Data Stewardship

Strengthening Policies and Directives

NOAA Environmental Data Management Framework

Overarching all Aspects of the Data Management Lifecycle
• Governance, Requirements Management, Architecture Management
• Developing and maintaining rich metadata to accompany the data
• Establishing mechanisms that allow for user requirements and feedback

Observing Operations
• Actual observation
• Transmission/processing QA
• Integration with other data to create products (e.g., models)
• Dissemination to real-time subscribers
• Delivery to archive

Archive
• Ingest (Receipt)
• Archival storage
• Data management (populating catalogs, registries, metadata)
• Preservation planning (migration to new technologies)

Access
• Discovery (catalogs, registries, metadata)
• Dissemination to users (web services, legacy systems, standard formats)

Use
• Integration with other information (NOAA, others)
• Assimilation into models
• Product creation
• Make decisions (policy, emergency, others)
• Scientific discovery
• Feedback to NOAA

Planning of New Observing or Data Management Systems
• Requirements definition
• Analysis of alternatives
• Systems design
• Integration with observing systems (NOAA, interagency, state, international)
• Determining what to archive and associated funding
• Buy/build

Stewardship Overarches Observing Operations, Archive, Access, Use

All ongoing, iterative processes that improve: 1) data and metadata content (including reprocessing data) and 2) access and user understanding
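The lifecycle framework on this slide can be read as an ordered set of phases, with stewardship spanning four of them. The following is a minimal illustrative sketch in Python and is not part of the presentation: the names `Phase`, `LIFECYCLE`, `STEWARDSHIP_SCOPE`, `next_phase`, and `under_stewardship` are hypothetical, and placing planning first is an assumption about how the slide's boxes connect.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Lifecycle phases named on the slide (identifiers are illustrative)."""
    PLANNING = "Planning of New Observing or Data Management Systems"
    OBSERVING_OPERATIONS = "Observing Operations"
    ARCHIVE = "Archive"
    ACCESS = "Access"
    USE = "Use"


# Assumed ordering: planning first, then the four operational phases.
LIFECYCLE = [
    Phase.PLANNING,
    Phase.OBSERVING_OPERATIONS,
    Phase.ARCHIVE,
    Phase.ACCESS,
    Phase.USE,
]

# Per the slide, stewardship overarches these four phases.
STEWARDSHIP_SCOPE = frozenset(
    {Phase.OBSERVING_OPERATIONS, Phase.ARCHIVE, Phase.ACCESS, Phase.USE}
)


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after the last one.

    The slide lists "Feedback to NOAA" under Use, so a fuller model might
    loop back to planning; this sketch simply stops at Use.
    """
    i = LIFECYCLE.index(current)
    return LIFECYCLE[i + 1] if i + 1 < len(LIFECYCLE) else None


def under_stewardship(phase: Phase) -> bool:
    """True if the slide places this phase under the stewardship umbrella."""
    return phase in STEWARDSHIP_SCOPE
```

For example, `next_phase(Phase.ARCHIVE)` yields `Phase.ACCESS`, and `under_stewardship(Phase.PLANNING)` is false because the slide lists only the four operational phases under stewardship.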

Page 9: NOAA Data Stewardship

Strengthening Policies and Directives

Revision of NOAA Administrative Order (NAO) 212-15

Establishes a NOAA policy for acquiring, integrating, managing, disseminating, and archiving environmental and geospatial data and information obtained from worldwide sources to support NOAA's mission.

• Maintains NOAA’s policy of “full and open access” to environmental data
• Provides a mechanism for EDMC to develop procedural directives for more detailed guidance (e.g., the NOAA Procedure for Scientific Records Appraisal and Archive Approval)
• Presents an end-to-end lifecycle framework for the management of environmental data
• Signed by NOAA CIO; awaiting clearance from Chief Administrative Officer and General Counsel

“What to Archive”

Page 10: NOAA Data Stewardship

Investing in Enterprise IT Infrastructure

Comprehensive Large Array-data Stewardship System (CLASS)

• NOAA’s primary enterprise IT system for archive and access
• Employs OAIS-RM
• Enterprise benefits include:
  • Economy of Scale
  • High Quality of Service
  • A System Evolution Approach
• Location of “Nodes”
  • Operational: Asheville, NC (NCDC); Boulder, CO (NGDC)
  • Development and Test: Suitland, MD; Fairmont, WV
• Environmental Data Holdings
  • Current: POES, DMSP, GOES, CFSR (Model Reanalysis)
  • Development: MetOp, EOS MODIS, NPP

Page 11: NOAA Data Stewardship

Leveraging Universal Standards

Open Archival Information System Reference Model (OAIS-RM)

• Reference model is to be applicable to all digital archives, and their Producers and Consumers
• Identifies a minimum set of responsibilities for an archive to claim it is an OAIS
• Establishes common terms and concepts for comparing implementations, but does not specify an implementation
• Provides detailed models of both archival functions and archival information
• Discusses OAIS information migration and interoperability among OAISs

Page 12: NOAA Data Stewardship

Leveraging Universal Standards

Global Earth Observation-Integrated Data Environment (GEO-IDE)

• Scope – NOAA-wide architecture development to integrate legacy systems and guide development of future NOAA environmental data management systems
• Vision – NOAA’s GEO-IDE is envisioned as a “system of systems” – a framework that provides effective and efficient integration of NOAA’s many quasi-independent systems
• Foundation – built upon agreed standards, principles and guidelines
• Approach – evolution of existing systems into a service-oriented architecture
• Result – a single system of systems (user perspective) to access the data sets needed to address significant societal questions

Unified Access Framework for Gridded Data (UAF Grid)

Integrated Ocean Observing System Data Integration Framework (IOOS DIF)

Page 13: NOAA Data Stewardship

Expanding Data Discovery and Access

Figure (data access architecture), recoverable labels:

• Data sources: NOAA National Data Centers; NOAA Centers of Data; NOAA Other Data Sources; External Data Sources
• Source Agnostic Interface / Federated Data Sources for Transparent Access
• Metadata Catalogs for Data Discovery
• Tiers of Access (Customer Sophistication): Portals, Cloud Resources, Search Engines; Online Requests, Subscription Services; Web Services, M2M Interfaces
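The idea of a source-agnostic interface over federated data sources, with metadata catalogs used for discovery, can be illustrated with a small sketch. This is an assumption-laden Python illustration, not NOAA's design: `Record`, `FederatedCatalog`, `register`, and `search` are hypothetical names, and keyword matching stands in for whatever discovery mechanism a real catalog would use.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    """A minimal metadata record; the fields are illustrative assumptions."""
    source: str
    title: str
    keywords: Tuple[str, ...]


class FederatedCatalog:
    """Source-agnostic search over several independently held catalogs."""

    def __init__(self) -> None:
        self._catalogs: Dict[str, List[Record]] = {}

    def register(self, source: str, records: List[Record]) -> None:
        """Add (or replace) the metadata catalog contributed by one source."""
        self._catalogs[source] = list(records)

    def search(self, keyword: str) -> List[Record]:
        """Return matching records from every registered source.

        Matching is case-insensitive on keywords; sources are visited in
        alphabetical order so results are deterministic.
        """
        key = keyword.lower()
        hits: List[Record] = []
        for source in sorted(self._catalogs):
            for rec in self._catalogs[source]:
                if key in (k.lower() for k in rec.keywords):
                    hits.append(rec)
        return hits
```

A caller registers each source's records once and then queries a single `search` method, without needing to know which source holds a given data set; that is the "transparent access" property the slide names.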

Page 14: NOAA Data Stewardship

Redefining Scientific Data Stewardship

New approach for real time data management and production of climate data records

Figure (stewardship workflow), recoverable labels: Original Observations & Metadata; Observing System Operators; Observing System Monitoring; Operational Product Processing; Random & Time Dependent Error Checks; Sentinel Scientific Stewardship Teams; Scientific Stewardship Teams; Intercomparison and Analysis; Reprocessing & Reanalyses; Climate Data Records; Climate Quality Data Records & Products; Metadata; Archives; Feedbacks

• Rapid feedback to observing system
• Scientist/Analysts involved with observations early on
• Enable and facilitate future research
• Safeguard interests of future generations

• End-to-end accountability of data
• Spatial and temporal sampling
• Time dependent biases
• Metadata
• Reprocessing for CDRs

Page 15: NOAA Data Stewardship

Developing a Data Management Workforce

• NOAA/NESDIS Top Priority
• Partnering with Earth Science Information Partners (ESIP) Federation
• One day with afternoon practicum
• Focus on graduate students and junior scientists
• Target Fall AGU Meeting

Page 16: NOAA Data Stewardship

Conclusion

• The Way Forward
  • Translate NAO 212-15 into NOAA Directives
  • Finalize a NOAA-wide CONOPS for Archive
  • Prototype federated architecture
• How BRDI can help
  • Defining archival standards for research/small data sets
  • Improving interdisciplinary integration of data
  • Increasing transparency and discovery to enhance data reuse and avoid redundant research

Page 17: NOAA Data Stewardship

Scott Hausman

Acting Director

NOAA’s National Climatic Data Center (NCDC)

151 Patton Avenue, Room 557

Asheville, NC 28807-5002

828-271-4848

828-271-4246

828-450-9188

[email protected]

www.ncdc.noaa.gov

Page 18: NOAA Data Stewardship

Background Material

Page 19: NOAA Data Stewardship

NOAA Data Management Principles

1. Commitment and leadership: Information is a strategic asset and information management must be a key component of every environmental data and information program. This ethic must be reflected in a corporate culture, embraced throughout the organization, that recognizes data as a corporate resource.

2. Stewardship: People who take observations or produce data and information are stewards of these data, not owners. These data must be collected, produced, documented, transmitted and maintained with the accuracy, timeliness and reliability needed to meet the needs of all users.

3. Long-term preservation: Irreplaceable observations, data products of lasting value, and associated metadata must be preserved. This information must be well-documented and maintained so that it is available to and independently understandable by users, now and in the future.

4. Requirements-driven: It is essential that providers and users of data and products play an active role in defining the constantly evolving requirements that drive the development and evolution of data management systems.

5. Discovery and access: Freedom of access, mechanisms that facilitate discovery, timely delivery, use and interpretation of data and products (directories, browse capabilities, metadata, mapping, visualization, etc.) are essential, recognizing relevant policies and regulations.

6. Standards and practices: Appropriate use of information technologies, widely shared standards, and integration approaches are vital to facilitate collection, management, discovery, dissemination, and access services for environmental data and products. This will ensure interoperability among providers, systems, and users. Effective application of standards and best practices contributes to the development of systems that are interoperable, efficient, reliable, scalable, and adaptable.

7. Quality: Data, products and information should be of quality sufficient to meet the requirements of society and to support sound decision making.

8. Cooperation and coordination: Environmental and scientific data management is a task of global scope – a whole that should be much bigger than the sum of its parts. It is only by participating in a global community of integrated data management that each organization can realize the potential of its data to the betterment of humankind.

9. Security: Data, information, and products must be preserved and protected from unintended or malicious modification, unauthorized use, or inadvertent disclosure.