Study on Data Integration and Sharing Standard and Specification System for Earth System Science

Juanle Wang and Jiulin Sun

Information Sharing Center for Earth System Science

Institute of Geographic Sciences and Natural Resources Research, CAS

2010-05-27, Hong Kong



Outline

• Standard and specification requirements

• Data Sharing Progress in China

• Basic principles of data sharing standards and specifications

• System Architecture and contents

• Implementation and application

• Future work

Standard and specification requirements

From Geosciences to Earth System Science

• The geosciences evolved into the Earth System Science stage in the 21st century (Liu Dongsheng, 2004).

Data feature analysis

• Research data collections

• Reference data collections

• Resource or community data collections

Research data

• Research data collections are the products of one or more focused research projects and typically contain data that are subject to limited processing or curation.

• They may or may not conform to community standards.

• They may vary greatly in size but are intended to serve a specific group.

• There may be no intention to preserve the collection beyond the end of a project, because budgets are usually small.

Example: the Fluxes Over Snow Surfaces project.

Reference data

• Reference data collections are intended to serve large segments of the scientific and education community.

• In these circumstances, conformance to robust, well-established, and comprehensive standards is essential, and the selection of standards often has the effect of creating a universal standard.

• Budgets supporting reference collections are often large.

Example: GenBank, of the National Institutes of Health.

Resource or community data

• Resource or community data collections serve a single science or engineering community.

• These digital collections often establish community-level standards, either by selecting from among preexisting standards or by bringing the community together to develop new standards where they are absent or inadequate.

• The budgets are intermediate in size and are generally provided through direct funding from agencies.

Examples: the DAACs at NASA, NOAA and USGS.

Requirement

• How to integrate and share these data within the Earth System Science (ESS) research community is a challenge for ESS development.

– ESS data belong to the research data category

– Most of these geoscience data are distributed across different research agencies, groups, or even individual scientists' databases

Requirement

• Concretely, three main problems need to be studied at present.

– How to establish the ESS data sharing mechanism?

– How to integrate and share multidisciplinary data?

– How to make data easy for users to find and access?

Data Sharing Progress in China

Scientific Data Sharing in China

[Figure: timeline of scientific data sharing progress in China. Time axis: 1988, 2001.12, 2002.12, 2004. Labels: WDC system of ICSU; WDCs in China; Progress; National Science and Technology Infrastructure.]

Disciplines shown along the timeline, in the order they appear:

• Meteorology, Seismology, Oceanography, Geology, Astronomy, Space Science, Geophysics, Glaciology and Geocryology, Renewable Resources and Environment

• Meteorology, ……

• Agriculture, Forestry, Water resources, Seismology, Mapping and Survey, Earth Science (DSNESS), Sustainable Development, Rural Techniques

• Oceanography, Land & Resources, Medicine and Health, Energy, Environment, ……

Scientific Data Sharing Program (SDSP) in China

• SDSP was launched by MOST at the beginning of 2002

• 9 pilot projects were launched in the first stage

• According to its plan, by the end of 2020 there will be 40 national data centers covering 300 master databases in 6 fields.

Framework of China SDSP

[Figure: a portal, the Gateway to China Scientific Data Sharing Program, with three tiers: Fields/Disciplines, Data Centers / Networks, and Master Databases.]

• Fields/Disciplines: Resource and Environment; Agriculture; Population and Health; Basic and Frontier Sciences; Engineering and Technology; Regional Development

• Data Centers / Networks: Meteorological Scientific Data Center; Rural Development Sci Data Center; Agricultural Scientific Data Center; Basic Medicine Scientific Data Center; Population Control Sci Data Center; Earth System Scientific Data Sharing Net; Basic Sci Data Center; ……

• Master Databases: about 300 master databases in 40 data centers

National Science & Technology Infrastructure (NSTI)

• Two years after SDSP, NSTI was launched in 2004 with support from MOST, the Ministry of Finance of China, the Ministry of Education, and the State Development and Reform Commission.

• Data sharing is the core of NSTI

– SDSP is one of the important parts of NSTI

• Data sharing portal: http://www.escience.gov.cn

Data Sharing Network of Earth System Science (DSNESS)

• One of the 9 pilot projects in SDSP and NSTI

• Long-term project, running since 2002

• Hosted by the Institute of Geographic Sciences and Natural Resources Research (IGSNRR), CAS

• Lead Scientist: Prof. Sun Jiulin, IGSNRR

• http://www.geodata.cn

Objectives

• Integrate and share scientific data distributed across institutes, universities, individuals, government-funded research projects, and international organizations or data centers

• Enhance and support Earth System Science research and science and technology innovation

• At present, the main focus is the “Land surface and human-land relationship” research field

Standard and specification concept model in DSNESS

Basic principles for data sharing standard and specification environment

Corresponding principle (1/4)

• Standards and specifications should be consistent both within their own system and with outside systems.

– First of all, the standards and specifications should not conflict with one another inside the Earth System Science data sharing system.

– At the same time, the standards and specifications of Earth System Science should correspond with the systems outside it.

• For example, Earth System Science data sharing is one of the important components of NSTI in China, so it should stay consistent with the standards and specifications of the NSTI system.

• These standards and specifications should also stay consistent with international standards, such as the ISO 19115 metadata standard.

Easy access principle (2/4)

• The final objective of Earth System Science data sharing is to provide an environment in which data are easy to collect, manage and disseminate.

– Under this principle, Earth System Science data should first of all be presented to users through a complete and easy-to-understand data catalogue.

Reference model instruction principle (3/4)

• Geoscience data go through a complex process from collection to dissemination. A reference model should guide the data sharing standard and specification environment.

– For example, the ISO geographic information reference model is a good example for the Earth System Science standard and specification environment.

Software implementation principle (4/4)

• Data sharing needs information technology support. At present, distributed networks are a popular tool for data sharing.

– This principle helps software developers build data collection and management tools, which is a good way to implement the standards and specifications in the science community.

System Architecture and contents

• The general architecture includes 4 standard and specification classes.

– Mechanism and rules class

– Data management standards and specification class

– Platform development specification class

– Data service specification class

Mechanism and rules class (1/4)

• Includes

– data sharing constitution

– platform implementation measures

– platform operational management rules

– data sharing rules and guides

• These rules are consistent with the national data sharing laws and rules.

• The data sharing constitution is the core mechanism of DSNESS.

[Figure labels: Data Sharing Union; core members; universal members; platform operational management rules]

Data management standard and specification class (2/4)

• Includes

– metadata standard

– metadata editing specification

– data document specification

– data backup specification

– international data collection and exchange specification

– data quality control specification

– database design specification for vector, raster and attribute data, etc.

• The core metadata standard of DSNESS defines 188 metadata elements, 22 of which are core elements
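
As a minimal sketch of how a core-element check might be automated, the Python below reports which required elements a metadata record is missing. The element names are hypothetical placeholders: this deck does not enumerate the 22 DSNESS core elements, so they stand in for whatever the standard actually requires.

```python
# Hypothetical core elements; the actual DSNESS core set is not listed here.
CORE_ELEMENTS = ("title", "abstract", "keywords", "contact", "spatial_extent")


def missing_core_elements(record, core=CORE_ELEMENTS):
    """Return the core elements that are absent or empty in a metadata record."""
    missing = []
    for name in core:
        value = record.get(name)
        is_blank_text = isinstance(value, str) and not value.strip()
        if value is None or is_blank_text or value == []:
            missing.append(name)
    return missing


# Illustrative record only: keywords is empty, contact is blank,
# and spatial_extent is not present at all.
record = {
    "title": "Example land-cover dataset",
    "abstract": "Illustrative record only.",
    "keywords": [],
    "contact": "  ",
}
print(missing_core_elements(record))
```

A check of this kind could run when a provider submits metadata, so that incomplete records are caught before publication.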

Platform development specification class (3/4)

• Includes

– data classification specification

– software development and coding specification

– software interface specification, etc.

Data Catalogue

• Ecosystem background data

• Social & economic development data

• Coastal and delta resources and environment data

• Frozen earth data

• Arctic and Antarctic scientific exploration data

• Typical regional thematic data (e.g. Tibetan Plateau, Loess Plateau, mountain areas, Yellow River delta, Yangtze River delta, etc.)

• Earth observation system data and data products

• Solar-earth environment data
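
The catalogue categories can be held in a simple lookup table so that users can find a category by keyword. The sketch below is illustrative only: the category names follow the slide, while the short codes and the function name are hypothetical and not part of the DSNESS data classification specification.

```python
# Category names follow the slide; the short codes are hypothetical.
CATALOGUE = {
    "ECO": "Ecosystem background data",
    "SOC": "Social and economic development data",
    "COAST": "Coastal and delta resources and environment data",
    "FROZEN": "Frozen earth data",
    "POLAR": "Arctic and Antarctic scientific exploration data",
    "REGION": "Typical regional thematic data",
    "EOS": "Earth observation system data and data products",
    "SOLAR": "Solar-earth environment data",
}


def find_categories(keyword):
    """Return codes of categories whose name contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [code for code, name in CATALOGUE.items() if needle in name.lower()]


print(find_categories("environment"))
```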

User service specification class (4/4)

• It includes the user service specification and data service guides.

– These specifications ensure high-quality user services for data sharing.

Online service

Offline service

Implementation and application

Geodata Sharing Union

14 Institutes in CAS

10 Universities in China

5 WDCs in China

Himalayan area

Arctic and Antarctic Research Center, ……

Mongolian Academy of Sciences

More than 30 partners

Network Platform

[Figure: architecture of the network platform. Recoverable labels: Web Portal; Data Users; Data Providers; Data Manager; Metadata; Data Management System; Data Integrate; Data Submit; Management; Check; Publication; Search, Browse, Download; Background; Requirement; Content Service.]
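
The platform workflow, in which providers submit data, a data manager checks it, and users search and download what has been published, could be modelled as a small state machine. This is a sketch under assumptions: the three stages and their order are inferred from the figure labels, and the class and method names are hypothetical rather than taken from the DSNESS software.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"   # provider has uploaded data and metadata
    CHECKED = "checked"       # data manager has reviewed the submission
    PUBLISHED = "published"   # visible to users for search, browse, download


# Allowed transitions; the assumed order is submit -> check -> publish.
NEXT = {Status.SUBMITTED: Status.CHECKED, Status.CHECKED: Status.PUBLISHED}


class Dataset:
    def __init__(self, name):
        self.name = name
        self.status = Status.SUBMITTED

    def advance(self):
        """Move to the next stage, or raise if already published."""
        if self.status not in NEXT:
            raise ValueError(f"{self.name} is already published")
        self.status = NEXT[self.status]
        return self.status

    def is_downloadable(self):
        """Only published datasets are offered to data users."""
        return self.status is Status.PUBLISHED


dataset = Dataset("example-dataset")
dataset.advance()  # checked by the data manager
dataset.advance()  # published on the web portal
print(dataset.is_downloadable())
```

Keeping the transitions in one table makes it easy to add further stages, such as a revision step, without touching the rest of the class.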

Data Resources in DSNESS

• Under this standard and specification environment, more than 18 TB of data resources have been collected and shared through this network.

• The majority of the integrated data relate to geographic science, resources science, environmental science, ecological science, social science, and remote sensing.

www.geodata.cn

Data Sharing Network of Earth System Science

Data services

• By the end of 2009, almost 46,000 users had registered on the platform and 24.58 TB of data had been downloaded by the public.

Future work

• Earth System Science data sharing is a long-term process, and its standard and specification environment should be revised and completed continually.

• In the DOWN direction, the basic concept model of the standard and specification environment should be studied in depth.

• In the UP direction, more attention should be paid to the concrete standards and specifications for each stage of the data sharing life cycle

– especially the data product specifications, dataset fusion and assimilation specifications, dataset service specifications, etc.

Thanks!

E-mail:[email protected]

Institute of Geographic Sciences and Natural Resources Research, CAS