Global Science Needs Global Data: A Case for Data Sharing
U.S. Department of the Interior
U.S. Geological Survey
Global Science Needs Global Data
A Case for Data Sharing
E. Lynn Usery
http://cegis.usgs.gov [email protected]
The Fifth China - U.S. Roundtable on Scientific Data Cooperation
2
Objectives
I will focus on the need for complete global geospatial datasets at high resolution to support global science modeling and analysis
Example Science Issues and Global Data Needs
Global Climate and Land Cover Change
Global Ecosystem Modeling
Global Hazards
Earthquakes
Sea Level Rise
Volcanism
Research on Semantic Web; platform for data sharing
3
USGS Science Strategy
http://www.usgs.gov/science_strategy/
4
USGS Science
Understanding Ecosystems and Predicting Ecosystem Change
Climate Variability and Change
Energy and Minerals for America’s Future
A National Hazards, Risk, and Resilience Assessment Program
The Role of Environment and Wildlife in Human Health
A Water Census of the United States
5
USGS Science
Data Integration and Beyond
The USGS will use its information resources to create a more integrated and accessible environment for its vast resources of past and future data. It will invest in cyberinfrastructure, nurture and cultivate programs in natural-science informatics, and participate in efforts to build a global integrated science and computing platform.
6
Global Climate and Land Cover Change
Data Needs – One Example
High resolution (30 m or smaller pixels) satellite images for land cover extraction
U.S. has Landsat archive but it does not include all scenes from non-U.S.-based receiving stations
Extracted land cover
Classes must match, i.e., same classification system and same level of detail
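One way to read the matching requirement is as a crosswalk: each product's classes are translated to a shared legend before the products are compared. The sketch below is only an illustration. The two crosswalks, their numeric codes, and the class names are hypothetical and do not come from the presentation.

```python
# Hypothetical crosswalks from two land cover legends to a shared, coarser legend.
# All codes and class names are illustrative assumptions.
CROSSWALK_A = {11: "water", 41: "forest", 42: "forest", 81: "agriculture"}
CROSSWALK_B = {1: "forest", 2: "agriculture", 3: "water"}


def harmonize(pixels, crosswalk):
    """Translate class codes to the shared legend; unmapped codes raise KeyError."""
    return [crosswalk[code] for code in pixels]


def agreement(pixels_a, pixels_b):
    """Fraction of co-located pixels whose harmonized classes match."""
    a = harmonize(pixels_a, CROSSWALK_A)
    b = harmonize(pixels_b, CROSSWALK_B)
    if len(a) != len(b):
        raise ValueError("pixel lists must cover the same cells")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# Four co-located pixels from each product.
scene_a = [11, 41, 42, 81]
scene_b = [3, 1, 2, 2]
print(agreement(scene_a, scene_b))
```

Codes 41 and 42 both collapse to "forest" here, which is the "same level of detail" condition in practice: the finer legend has to be generalized before the two products can be compared.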
7
Worldwide Usage of Landsat Imagery
8
Online Data Search, Browse and Order Tools
Earth Explorer: http://earthexplorer.usgs.gov
GLOVIS: http://glovis.usgs.gov
9
Landsat International Data Usage (FY10)
Top 20 Countries (Excludes United States)
Country | Usage
China | 104734
Russian Federation | 60257
Spain | 41108
Australia | 37487
Indonesia | 29234
Mexico | 25604
Canada | 24850
Germany | 23185
United Kingdom | 22074
Japan | 16525
Kazakhstan | 14146
India | 14042
Korea, Republic of | 11265
France | 10107
Italy | 9167
Congo | 9124
Netherlands | 6962
Argentina | 5729
Colombia | 5091
Malaysia | 4793
10
Landsat Global Archive Consolidation (LGAC)
Goal is to consolidate the entire Landsat archive
5 million scenes held internationally vs. 2 million in the USGS archive
From current stations as well as historical stations
Each station has data that will enhance the USGS archive
Enables scientific analysis of most complete time-series of images for global land change
Facilitates large scale scene selection and data mining capability
Recover data not currently available to users
Some data at risk due to aging media and drive obsolescence
Provides data to global user community as standard product like current Landsat data from US archive
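In set terms, consolidation is a union of the station holdings with the USGS archive, and the scenes "unique" to a station are the set difference. The sketch below is a minimal illustration under that reading. The scene identifiers are hypothetical placeholders, not real Landsat scene IDs, and the function names are assumptions made for this example.

```python
# Hypothetical scene identifiers; real Landsat scene IDs are longer.
usgs_archive = {"p123r032_2001", "p123r033_2001", "p124r032_2001"}
station_holdings = {"p123r032_2001", "p123r032_1995", "p125r040_1998"}


def unique_to_station(station, archive):
    """Scenes a station holds that the central archive lacks."""
    return station - archive


def percent_unique(station, archive):
    """Share of a station's holdings that are unique, as a rounded percentage."""
    if not station:
        return 0
    return round(100 * len(unique_to_station(station, archive)) / len(station))


new_scenes = unique_to_station(station_holdings, usgs_archive)
consolidated = usgs_archive | station_holdings  # union removes duplicates
print(sorted(new_scenes), percent_unique(station_holdings, usgs_archive), len(consolidated))
```

A real consolidation would also need to reconcile differing scene naming and metadata conventions between stations before identifiers can be compared this directly.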
11
LGAC International Data Holdings
International Cooperator (IC) | Scenes (in 1,000s) | Unique to IC | % Unique
Europe (ESA) | 2,089 | 1,883 | 90%
Australia (GA-NEO) | 629 | 463 | 74%
Canada (CCRS) | 532 | 242 | 45%
China (CEODE) | 449 | 409 | 91%
Japan (RESTEC) | 275 | 261 | 95%
Brazil (INPE) | 234 | 210 | 90%
Thailand (GISTDA) | 90 | 77 | 86%
Germany (DLR) | 87 | 61 | 70%
India (NRSC) | 51 | 50 | 98%
Ecuador (CLIRSEN) | 40 | 30 | 75%
Pakistan (SUPARCO) | 24 | 20 | 84%
Japan (HIT) | 21 | 7 | 35%
Indonesia (LAPAN) | 8 | 3 | 63%
Argentina (CONAE) | 253 | 162 | 64%
South Africa (CSIR-SAC) | 137 | 122 | 89%
Subtotal | 4,919 | 3,704 | 75%
Saudi Arabia (KACST) | 69 | |
Puerto Rico (UPR) | 42 | |
Grand Total (in 1,000s) | 5,030 | |
12
Landsat 8
Similar requirement for global data from all receiving stations to be archived and made freely available to support global science
13
Global Ecosystem Modeling – Data Needs
Global species data
Invasive species – cost in U.S. is billions of dollars each year – similar in other countries
Global succession data
Global climate records
How do species adapt as climate changes? This is a global problem and requires global data sharing
14
Global Hazards – Earthquakes
Locations, epicenters, seismic wave data, exchanged in real time
Soil effects data
Infrastructure damage
Relief effort and support depends on data availability
15
Global Hazards – Volcanism
Volcano locations, eruption histories, types, distributed in real time
Ash cloud distribution and models
16
Global Hazards – Sea Level Rise
High-resolution global elevation data
ASTER Global DEM (1 arc-second, about 30 m, postings) is a start
Need lidar/IfSAR along all coasts
Corresponding population data (current highest resolution is 30 arc-sec)
Corresponding land cover data
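The reason these layers must correspond is that a sea level rise estimate is an overlay: elevation cells at or below a water level are intersected with the population in the same cells. The sketch below is a minimal "bathtub" illustration on tiny hypothetical grids. The values and the function name are assumptions, and the grids are assumed to be already co-registered, which is itself a problem when population is at 30 arc-sec (roughly 1 km at the equator) and elevation is far finer.

```python
# Hypothetical co-registered grids: elevation in meters, population per cell.
elevation_m = [
    [0.5, 1.2, 3.0],
    [0.8, 2.5, 6.0],
    [4.0, 7.5, 9.0],
]
population = [
    [120, 300, 50],
    [80, 40, 10],
    [200, 0, 5],
]


def exposed_population(elevation, people, rise_m):
    """Sum population in cells at or below the given water level.

    A simple 'bathtub' overlay: it ignores hydrologic connectivity to the sea,
    tides, and vertical error in the elevation data.
    """
    total = 0
    for elev_row, pop_row in zip(elevation, people):
        for elev, pop in zip(elev_row, pop_row):
            if elev <= rise_m:
                total += pop
    return total


print(exposed_population(elevation_m, population, 1.0))
print(exposed_population(elevation_m, population, 3.0))
```

Adding a land cover grid with the same footprint would let the same loop report exposure by cover class instead of a single total.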
17
DATA SHARING ISSUES
Volume – multiple global datasets at high resolution
Structure – variety of structures, vector and raster, many different formats
Semantics – various attribution and relation schemes, some feature-based, some layers
Integration of multiple datasets – for maximum utility all datasets should be able to be integrated to produce new data and information
18
Dataset | Geometry/Format | Attribution/Scaling | URL
National Hydrography Dataset (NHD) | Vector | Discrete/nominal | http://viewer.nationalmap.gov/viewer/nhd.html?p=nhd
National Transportation Dataset | Vector; tables | Discrete/nominal | http://viewer.nationalmap.gov/viewer/; http://gisdata.usgs.net/website/MRLC/viewer.htm
National Boundaries Dataset | Vector | Discrete/nominal | http://viewer.nationalmap.gov/viewer/
National Structures Dataset | Vector | Discrete/nominal | http://viewer.nationalmap.gov/viewer/
Geographic Names Information System (GNIS) | Vector | Discrete/nominal | http://geonames.usgs.gov/domestic/download_data.htm
National Elevation Dataset (NED) | Raster | Continuous/ratio | http://viewer.nationalmap.gov/viewer/; http://seamless.usgs.gov/website/seamless/viewer.htm
National Digital Orthophotos | Raster | Continuous/interval | http://www.ndop.gov/data.html; http://viewer.nationalmap.gov/viewer/; http://gisdata.usgs.net/website/MRLC/viewer.htm
National Land Cover Dataset (NLCD) | Raster | Discrete/nominal | http://viewer.nationalmap.gov/viewer/; http://gisdata.usgs.net/website/MRLC/viewer.htm
Global Land Cover Dataset | Raster | Discrete/nominal | http://landcover.usgs.gov/landcoverdata.php
LiDAR | Point | Continuous/ratio | http://viewer.nationalmap.gov/viewer/
Satellite images | Raster | Continuous/interval | http://edcsns17.cr.usgs.gov/NewEarthExplorer/; http://glovis.usgs.gov/
Hazards (Earthquakes, Volcanoes) | Graphics | Multiple forms | http://earthquake.usgs.gov/hazards/; http://volcanoes.usgs.gov/activity/status.php
Minerals | Vector; text | Discrete/nominal | http://mrdata.usgs.gov/; http://tin.er.usgs.gov/mrds/; http://tin.er.usgs.gov/geochem/; http://crustal.usgs.gov/geophysics/index.html
Energy | Vector; databases | Multiple forms | http://energy.usgs.gov/search.html
Landscapes and Coasts | Reports | Discrete/nominal | http://geochange.er.usgs.gov/info/holdings.html
Astrogeology | Databases | Discrete/nominal | http://astrogeology.usgs.gov/DataAndInformation/
Geologic Map Database | Vector; maps; text | Discrete/nominal | http://ngmdb.usgs.gov/
Geologic Data Digital Data Series | Maps; tables | Discrete/nominal | http://pubs.usgs.gov/dds/dds-060/
National Water Information System | Graphics; tables | Continuous/ratio | http://wdr.water.usgs.gov/nwisgmap/
Floods and High Flow | Graphics; tables | Continuous/ratio | http://waterwatch.usgs.gov/new/index.php?id=ww
Drought | Graphics; tables | Continuous/ratio | http://waterwatch.usgs.gov/new/index.php?id=ww
Monthly Stream Flow | Graphics; tables | Continuous/ratio | http://waterwatch.usgs.gov/new/index.php?id=ww
Ground Water | Vector; tables | Continuous/ratio | http://waterdata.usgs.gov/nwis/gw/; http://groundwaterwatch.usgs.gov/
Water Quality | Graphics | Continuous/ratio | http://waterdata.usgs.gov/nwis/qw/; http://waterwatch.usgs.gov/wqwatch/
National Biological Information Infrastructure (NBII) | Graphics; vector; geodatabases | Multiple forms | http://www.nbii.gov/portal/server.pt/community/nbii_home/236
Vegetation Characterization | Vector; databases | Multiple forms | http://biology.usgs.gov/npsveg/
Wildlife | Vector; text; video | Multiple forms | http://www.nwhc.usgs.gov/
Invasive Species | Vector; databases; graphics; image | Multiple forms | http://www.nbii.gov/portal/server.pt/community/invasive_species/221
19
Volunteered Geographic Information/User Generated Content
USGS “Did You Feel It?”
Open Street Map (OSM)
USGS now researching use of OSM for our transportation and structures data
VGI/UGC rivals traditional geospatial data sources and provides new basis for data sharing
20
Technical problems
Compatible data models
Resolution, accuracy issues
Attribution issues – need ontology that allows matching across data schema
Data sharing is more than making data available for download over the Web
Requires standards
USGS data meets Federal Geographic Data Committee and Open Geospatial Consortium standards for metadata and packaging
21
Semantics – Intelligence
USGS is exploring Semantic Web for data sharing; globally linked data
Requirements:
Ontology of features, attributes, and relationships: currently being developed.
Semantic Web triple format: Conversion for selected test areas is in progress.
Uniform Resource Identifiers (URIs) for individual features, i.e., each geographic feature has a unique URI
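To make the triple and URI requirements concrete, the sketch below writes one feature as subject-predicate-object statements in N-Triples syntax. The `ogc` and `fid` namespaces are the ones used in the presentation's SPARQL example. The `http://example.org/` namespace, the geometry URI pattern, and the `StreamRiver` type name are placeholders assumed for illustration, since the USGS ontology is described as still in development.

```python
# Namespaces taken from the presentation's SPARQL example; the feature type
# URI and the geometry URI pattern below are hypothetical.
OGC = "http://www.opengis.net/rdf#"
FID = "http://cegis.usgs.gov/rdf/nhd/featureID#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EXAMPLE = "http://example.org/"  # placeholder namespace, an assumption


def feature_triples(feature_id, feature_type):
    """Return (subject, predicate, object) URI triples for one feature."""
    feature = f"{FID}_{feature_id}"
    geometry = f"{EXAMPLE}geometry/{feature_id}"
    return [
        (feature, RDF_TYPE, f"{EXAMPLE}{feature_type}"),
        (feature, f"{OGC}hasGeometry", geometry),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(feature_triples("102217454", "StreamRiver")))
```

Because the feature URI is the subject of every statement, any other dataset can attach its own statements to the same feature simply by reusing that URI.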
22
USGS Semantic Web SPARQL Endpoint for Data Access
http://usgs-ybother.srv.mst.edu:8890/sparql
23
Query – Find the tributaries of West Hunter Creek
Default Graph URI: http://cegis.usgs.gov/rdf/ontologytest/

PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX fid: <http://cegis.usgs.gov/rdf/nhd/featureID#>
SELECT ?feature ?type
WHERE {
  fid:_102217454 ogc:hasGeometry ?geo1 .
  ?geo1 ogc:touches ?geo2 .
  ?feature ogc:hasGeometry ?geo2 .
  ?feature a ?type
}
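A query like this can be submitted programmatically through the standard SPARQL Protocol, which passes the query text in a `query` parameter and the graph in `default-graph-uri`. The sketch below is a minimal illustration using only the Python standard library. It assumes the endpoint accepts GET requests and can return JSON results; whether the endpoint shown in the presentation is reachable or supports that result format is not confirmed here, so `run_query` is defined but not called.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://usgs-ybother.srv.mst.edu:8890/sparql"  # from the presentation
DEFAULT_GRAPH = "http://cegis.usgs.gov/rdf/ontologytest/"  # from the presentation

QUERY = """
PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX fid: <http://cegis.usgs.gov/rdf/nhd/featureID#>
SELECT ?feature ?type
WHERE {
  fid:_102217454 ogc:hasGeometry ?geo1 .
  ?geo1 ogc:touches ?geo2 .
  ?feature ogc:hasGeometry ?geo2 .
  ?feature a ?type
}
"""


def build_url(endpoint, query, default_graph):
    """Encode the query as a SPARQL Protocol GET request URL."""
    params = urllib.parse.urlencode(
        {"default-graph-uri": default_graph, "query": query.strip()}
    )
    return f"{endpoint}?{params}"


def run_query(endpoint, query, default_graph, timeout=30):
    """Send the query and return the rows of a JSON result set.

    Assumes the server honors the Accept header below; not called here
    because the endpoint may be unreachable.
    """
    request = urllib.request.Request(
        build_url(endpoint, query, default_graph),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


print(build_url(ENDPOINT, QUERY, DEFAULT_GRAPH))
```

Keeping URL construction separate from the network call makes the request easy to inspect or paste into a browser before anything is sent.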
24
Query Result
http://cegis.usgs.gov/rdf/nhd/featureID#_102216320
http://cegis.usgs.gov/rdf/nhd/featureID#_102216358
http://cegis.usgs.gov/rdf/nhd/featureID#_102217454
http://cegis.usgs.gov/rdf/nhd/featureID#_102216276
http://cegis.usgs.gov/rdf/nhd/featureID#_102216340
http://cegis.usgs.gov/rdf/nhd/featureID#_102216432
http://cegis.usgs.gov/rdf/nhd/featureID#_102216448
25
Major Challenges for Geospatial Data Sharing with Semantics
Semantic spatial data model
Coordinates on the Semantic Web in RDF
Geospatial feature ontologies
Ontology-driven geospatial operators
Moving multi-GB to TB of data to grid/cloud
Implementing spatial operators on Semantic Web and in grid/cloud environment
Interfacing Semantic Web and grid/cloud capabilities