NSF Meeting on Cyberinfrastructure for Surficial Processes, Jan.18-19, 2006 Slide 1 GEON: The Geosciences Network Chaitan Baru San Diego Supercomputer Center (SDSC) California Institute for Telecommunications and Information Technology (Calit2)


Slide 1

GEON: The Geosciences Network

Chaitan Baru

San Diego Supercomputer Center (SDSC)

California Institute for Telecommunications and Information Technology (Calit2)

Slide 2

Data Management

DATA COLLECTION

DATA PUBLICATION

DATA ACCESS

DATA ANALYSIS

GEON

Slide 3

GEON Background

• See website: www.geongrid.org, and portal
• Began as a collaboration among ~15 institutions
• Goals
  • Provide a Cyberinfrastructure-based Interpretive Environment for Earth Science research, e.g. for data acquired in EarthScope
  • Support for data discovery
  • A platform for data integration
  • Train students and geoscience researchers in state-of-the-art and advanced IT concepts, i.e. technical aspects of geoinformatics
• Two-Tier approach
  • Develop working systems, while also doing research and building advanced prototypes
• The focus this year is on registering content and tools at the portal and providing a number of “reference” datasets
• The end goal is to provide science infrastructure. Support for both “hosted” and “non-hosted” data

Slide 4

Topics for Today

• LIDAR data management and processing in GEON
  • Courtesy: Prof. Ramon Arrowsmith, Arizona State
• Data Registration
• Linkage with other geoinformatics, CI projects
• Won’t cover details of grid computing, visualization, data integration, …

Slide 5

Example Data Set:

• Northern San Andreas fault and associated marine terraces.
• Flown February 2003
• Funded by NASA in collaboration w/ USGS.
• ~418 square kilometers

~1.2 billion data points
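For a sense of scale, the two figures on this slide (~1.2 billion points over ~418 square kilometers) imply an average point density. A minimal sketch, assuming the points are spread evenly over the surveyed area, which the slide does not state; the function name is illustrative only:

```python
# Rough average LiDAR point density from the slide's approximate figures.
# Assumption: points are spread evenly over the surveyed area.


def mean_point_density(num_points: float, area_km2: float) -> float:
    """Return the average number of points per square metre."""
    area_m2 = area_km2 * 1_000_000  # 1 km^2 = 1,000,000 m^2
    return num_points / area_m2


density = mean_point_density(1.2e9, 418)
print(f"{density:.2f} points per square metre")
```

Since both inputs are approximate, the result (a little under 3 points per square metre) is only an order-of-magnitude guide.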

Slide 6

~1.1 million data points to produce this DEM

Slide 19

Lidar Processing Workflow: Using Kepler

[Workflow diagram; labels recovered from the figure]

• Stages: Subset; Analyze (move, process); Visualize (move, render, display)
• Resources: Arizona Cluster (NFS-mounted disk, IBM DB2); Datastar (NFS-mounted disk)
• Data items: d1, d2 (grid file), sd
• Visualization: iView3D/Browser; Create Scene file; Fledermaus (or ASU OpenGL tool LViz)
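The three stages named in the diagram can be sketched as plain functions. This is only an illustration under stated assumptions: it does not use Kepler, DB2, or any GEON code; the function names are hypothetical; and the "analyze" step here is a simple mean-elevation-per-cell binning, a stand-in for whatever gridding the actual workflow applies.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z)
Grid = Dict[Tuple[int, int], float]  # (column, row) -> mean z


def subset(points: Iterable[Point], xmin: float, ymin: float,
           xmax: float, ymax: float) -> List[Point]:
    """Subset stage: keep points inside an inclusive bounding box."""
    return [p for p in points
            if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def analyze(points: Iterable[Point], xmin: float, ymin: float,
            cell: float) -> Grid:
    """Analyze stage (assumed): average z in each square grid cell."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [sum of z, count]
    for x, y, z in points:
        key = (int((x - xmin) // cell), int((y - ymin) // cell))
        sums[key][0] += z
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def visualize(grid: Grid) -> str:
    """Visualize stage: render the grid as text, northernmost row first."""
    if not grid:
        return ""
    cols = max(c for c, _ in grid) + 1
    rows = max(r for _, r in grid) + 1
    lines = []
    for r in reversed(range(rows)):
        cells = [f"{grid[(c, r)]:6.1f}" if (c, r) in grid else "   n/a"
                 for c in range(cols)]
        lines.append(" ".join(cells))
    return "\n".join(lines)


# Made-up sample points; the last one falls outside the bounding box.
points = [(0.5, 0.5, 10.0), (0.7, 0.2, 12.0), (1.5, 0.5, 20.0),
          (0.5, 1.5, 30.0), (5.0, 5.0, 99.0)]
selected = subset(points, 0.0, 0.0, 2.0, 2.0)
dem = analyze(selected, 0.0, 0.0, cell=1.0)
print(visualize(dem))
```

Keeping each stage a separate function mirrors the diagram, where each stage can run on a different resource and hand its output to the next.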

Slide 20

LiDAR DATA SETS COMMITTED (?) TO GEON DISTRIBUTION:

Data Set                     | # of points                     | Schema                         | Source
Northern San Andreas (NSAF)  | 1.2 billion                     | 10 column (x,y,z + attributes) | NASA / USGS
West Rainier                 | ~800 million – 1 billion (est.) | 10 column (x,y,z + attributes) |
Southern SAF Laser Scan      | ?? Likely to be 5+ billion      | ??                             |
NCALM NAPA                   | ~500M                           | ??                             |
E. CA Shear Zone (E. Mohave) | ~500M                           | ??                             |
Antarctic Dry Valleys        | 10-100M?                        | ??                             | Bea Csatho (Ohio State)
Hector Mine EQ               | 10-100M?                        | ??                             | Ken Hudnut (USGS)
Alvord (Tripod)              | 16.6M                           | 4 column (x,y,z + intensity)   | John Oldow (U. Idaho)

Slide 21

Current Activities

• “Release” of GLW (GEON LIDAR Workflow) capability
• Incorporation of ground-based LIDAR data
  • Ground-based Data Collection Workshop, organized by John Oldow, April 6-7, 2006, SDSC/Calit2 Synthesis Center. Sponsored by NSF

Slide 22

Data Registration: GEONsearch

Choose a filetype

Choose subject (from a “base” ontology)

Choose location (from a gazetteer Webservice)

Choose a time (numeric range or from a time ontology Webservice)

Choose concepts from ontologies

www.geongrid.org
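The registration choices listed on this slide can be pictured as a single metadata record. The sketch below is hypothetical: the class, field names, and sample values are assumptions made for illustration and are not taken from the GEON portal.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RegistrationRecord:
    """Hypothetical record mirroring the choices made at registration."""
    filetype: str                    # e.g. a shapefile
    subject: str                     # term from a "base" ontology
    location: str                    # place name resolved by a gazetteer
    time_range: Tuple[float, float]  # numeric (start, end); units assumed
    concepts: List[str] = field(default_factory=list)  # ontology concepts

    def problems(self) -> List[str]:
        """Return human-readable validation problems (empty if none)."""
        issues = []
        for name in ("filetype", "subject", "location"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is required")
        start, end = self.time_range
        if start > end:
            issues.append("time_range start must not exceed end")
        return issues


# All values below are made up for illustration.
record = RegistrationRecord(
    filetype="shapefile",
    subject="structural geology",
    location="Northern California",
    time_range=(2003, 2003),
    concepts=["fault"],
)
print(record.problems())
```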

Slide 23

GEONsearch and myGEON

[Architecture diagram; labels recovered from the figure]

• GEONsearch: search condition(s) (spatial, temporal, concept); log
• GEON Catalog: extracted information/indexes from GEON Datasets
• Web services: Gazetteer, Geologic Age
• myGEON: search results (in Data Integration Cart©); saved session
• Map Service: move data, create map service, create session
  • Selected results (shape file GEON IDs)
  • Handle to interactive map session
  • Save Map Session
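The three kinds of search condition in the diagram (spatial, temporal, concept) can be sketched as filters over catalog entries. This is a minimal illustration, not the GEONsearch implementation: the catalog layout, entry IDs, and function names are all assumptions.

```python
from typing import Any, Dict, List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)
Span = Tuple[float, float]                # (start, end)


def boxes_overlap(a: BBox, b: BBox) -> bool:
    """True if two bounding boxes intersect (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spans_overlap(a: Span, b: Span) -> bool:
    """True if two closed intervals intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(catalog: List[Dict[str, Any]],
           bbox: Optional[BBox] = None,
           time_range: Optional[Span] = None,
           concept: Optional[str] = None) -> List[str]:
    """Return IDs of entries matching every condition that was supplied."""
    hits = []
    for entry in catalog:
        if bbox is not None and not boxes_overlap(entry["bbox"], bbox):
            continue
        if time_range is not None and not spans_overlap(entry["time"], time_range):
            continue
        if concept is not None and concept not in entry["concepts"]:
            continue
        hits.append(entry["id"])
    return hits


# Made-up catalog entries for illustration.
catalog = [
    {"id": "ds-1", "bbox": (-124.0, 38.0, -122.0, 40.0),
     "time": (2003, 2003), "concepts": {"lidar", "fault"}},
    {"id": "ds-2", "bbox": (-118.0, 33.0, -115.0, 35.0),
     "time": (1999, 2000), "concepts": {"earthquake"}},
]
print(search(catalog, bbox=(-123.0, 38.5, -122.5, 39.0), concept="lidar"))
```

Conditions that are left out are simply not applied, so a call with no arguments returns every entry.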

Slide 24

The 1-2-3 of GEON Data Registration

1. Register dataset, tool to index terms
   • Allows users to more easily discover relevant resources
2. Register dataset “schema” to ontology
   • E.g. Age_MA → Geologic Age
   • Could be relational DBMS, shapefile, Excel, netCDF, …
   • Allows discovery of datasets that have information of interest, e.g. “all datasets that have velocity data”
3. Register data values to ontology
   • E.g. “Jur” → Jurassic Age from Geologic Age ontology
   • Allows advanced data integration, e.g. integrate Paleobiology data with Paleostrat, or Neptune, Janus, etc.

• Prerequisite: ontologies need to be defined (by community), represented in OWL, and registered
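Levels 2 and 3 above can be pictured as two lookup tables: one from dataset columns to ontology concepts, one from raw data values to ontology terms. The sketch below is a simplification with hypothetical dataset names, column names, and concept labels (only Age_MA and “Jur” come from the slide); a real system would hold the ontologies in OWL rather than in dictionaries.

```python
from typing import Any, Dict, List

# Level 2 (schema registration): dataset column -> ontology concept.
SCHEMA_MAPS: Dict[str, Dict[str, str]] = {
    "dataset_a": {"Age_MA": "GeologicAge"},
    "dataset_b": {"period": "GeologicAge", "vel": "Velocity"},
}

# Level 3 (value registration): raw value -> ontology term, per concept.
VALUE_MAPS: Dict[str, Dict[Any, str]] = {
    "GeologicAge": {"Jur": "Jurassic"},
}


def datasets_with(concept: str) -> List[str]:
    """Discovery: which datasets have a column registered to this concept?"""
    return sorted(name for name, columns in SCHEMA_MAPS.items()
                  if concept in columns.values())


def normalize(dataset: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Integration: rename columns to concepts and map registered values."""
    out = {}
    for column, value in row.items():
        concept = SCHEMA_MAPS[dataset].get(column)
        if concept is None:
            continue  # column was never registered; skip it
        # Values without a registered term pass through unchanged.
        out[concept] = VALUE_MAPS.get(concept, {}).get(value, value)
    return out


print(datasets_with("Velocity"))
print(normalize("dataset_b", {"period": "Jur", "vel": 3.2, "note": "x"}))
```

Once rows from different datasets are expressed in the same concepts and terms, they can be compared or joined directly, which is what makes the third level useful for integration.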

Slide 25

GEON Data Registration

Ontology Registration

Dataset Registration (hosted)

Data Item (Schema) Registration (hosted / non-hosted)

Data Item Detail Registration (values)

Service Registration

Resource Registration

Slide 26

Data Registration Activities

• GEON “Mini-Workshop on Information Exchange from Distributed Data Systems”, Feb 7th, 2006
  • Co-organized by Chuck Meertens, UNAVCO/GEON and Ben Domenico, Unidata/LEAD

• Goal: Register netCDF/OpenDAP data in GEON portal

Slide 27

GEON IDV

• Courtesy Dr. Chuck Meertens, UNAVCO
• Adapt IDV for earth science datasets
• Incorporate web service calls in IDV to invoke GEONsearch
• and access and manipulate netCDF-based 3D, 4D data sets

Slide 28

Geo-ontologies

• Data Registration and Ontology meetings
  • GEON Data Registration meeting, March 10-11, SDSC
  • Volcano Ontology meeting, sponsored by NASA SESDI project (Semantically-Enabled Scientific Data Integration), Feb 16/17, SDSC
• An opportunity for the community to develop community standards for knowledge representation, e.g.
  • Schemas, controlled vocabularies, ontologies
  • And, choose a common representation system, e.g. OWL

Slide 29

Linkage with Other Geoinformatics, CI Projects

• CUAHSI Hydrologic Information System (HIS)
  • HIS is using GEON data registration and search capability, and mapping services, and GEON PoP node structure and the “GEON Pack” (i.e. a common software stack)
• CHRONOS
  • Database federation
• Hosting Paleo-pollen databases
• Hosting NAVDAT
• IT collaborations with NCMIR/BIRN (NIH), SESDI (NASA), LEAD, GRASP (Grid Benchmarking), Globus (Data Replication Service middleware)

Slide 30

E.g., CHRONOS Federated Databases

• The following databases are all part of the CHRONOS Federated Database at SDSC, based on IBM’s DB2 Information Integrator. The federated database is registered in GEON.
  • Neptune
  • PaleoStrat
  • PaleoBiology
  • Janus
  • TimeScale
  • FAUNMAP
  • MIOMAP
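The idea of federation, one query spanning several independently maintained databases, can be illustrated with SQLite's ATTACH as a small stand-in. This is not DB2 Information Integrator and not the CHRONOS schema: the table names, columns, rows, and age window below are invented for the example.

```python
import sqlite3

# One connection with two separately attached in-memory databases,
# standing in for two member databases of a federation.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS neptune")
conn.execute("ATTACH DATABASE ':memory:' AS paleostrat")

# Hypothetical tables and rows, for illustration only.
conn.execute("CREATE TABLE neptune.occurrences (taxon TEXT, age_ma REAL)")
conn.execute("CREATE TABLE paleostrat.sections (section TEXT, age_ma REAL)")
conn.executemany("INSERT INTO neptune.occurrences VALUES (?, ?)",
                 [("taxon A", 150.0), ("taxon B", 20.0)])
conn.executemany("INSERT INTO paleostrat.sections VALUES (?, ?)",
                 [("section X", 152.0), ("section Y", 300.0)])

# A single query that reads from both member databases.
rows = conn.execute(
    """
    SELECT 'neptune' AS source, taxon AS item, age_ma
      FROM neptune.occurrences
    UNION ALL
    SELECT 'paleostrat', section, age_ma
      FROM paleostrat.sections
    ORDER BY age_ma
    """
).fetchall()

# Keep records whose age falls inside an arbitrary example window.
in_window = [row for row in rows if 140.0 <= row[2] <= 160.0]
for source, item, age_ma in in_window:
    print(source, item, age_ma)
```

The caller sees one result set and does not need to know which member database each row came from, beyond the explicit source column.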

Slide 31

Opportunities

• Leverage CI from existing projects in same or even different disciplines
• Adopt a service-oriented architecture (SOA)
  • i.e. standardize on Web service interfaces for tools, applications, and data
  • E.g. Web Mapping Services for map image services, and WFS, WCS, and other standards, e.g. for accessing geologic maps, gravity data, sensor data, …
  • Need to deal with .NET and Java compatibility
• Develop centralized community services, e.g. for LIDAR processing
• Develop community standards for knowledge representation
  • Schemas, controlled vocabularies, ontologies. Choose common representation system, e.g. OWL
• Organize community meetings, workshops, conferences
• Develop “Meta-workflow” frameworks
  • Support inter-operation among different scientific workflow systems
• There may be an opportunity to work through a proposed new GSA Division on Geoinformatics and AGU working group on IT
• Geoinformatics 2006. See www.geongrid.org/geoinformatics2006
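As an illustration of the standardized Web service interfaces mentioned under SOA above, a map image request to an OGC Web Map Service is just a URL with well-known parameters. The sketch below builds a WMS 1.1.1 GetMap URL; the endpoint and layer name are placeholders (assumptions), not a real GEON service, and no network request is made.

```python
from typing import Sequence
from urllib.parse import urlencode


def wms_getmap_url(endpoint: str, layer: str, bbox: Sequence[float],
                   width: int, height: int, srs: str = "EPSG:4326",
                   image_format: str = "image/png") -> str:
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
url = wms_getmap_url("https://example.org/wms", "geologic_map",
                     (-124.0, 38.0, -122.0, 40.0), width=512, height=512)
print(url)
```

Because the parameters are fixed by the standard, any client that can issue such a request can use any compliant map server, regardless of whether it is built on .NET or Java.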