THREDDS, CDM, OPeNDAP, netCDF and Related Conventions John Caron Unidata/UCAR Sep 2007


THREDDS, CDM, OPeNDAP, netCDF and Related Conventions

John Caron

Unidata/UCAR

Sep 2007

Contents

• Overview

• THREDDS Data Server

• Unidata’s Common Data Model

[Diagram: 0) Client, 1) Request, 2) Server, 3) Response]

0) The Client

What functionality is needed?

1. Scientific User
  – Raw data
  – Drill down to arbitrary detail
2. Decision Support
  – “best effort” Visualization
  – operational

1) The Request

What functionality is possible?

• Analogous to SQL language for RDBMS
• Implies a Data Model
• OGC vs File access APIs
  – NetCDF/OPeNDAP/HDF5 : index space
  – WXS : coordinate space
• Higher semantic level trumps if no significant extra cost.
  – File APIs become implementation, not interface
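The difference between an index-space request and a coordinate-space request can be sketched in a few lines. This is only an illustration: the latitude axis, the data values, and both function names are made up for the example and do not come from any of the libraries named on the slide.

```python
from bisect import bisect_left, bisect_right

# Hypothetical 1-D latitude axis (degrees north) and matching data values.
lats = [30.0, 32.5, 35.0, 37.5, 40.0, 42.5]
temps = [25.1, 24.3, 22.8, 21.0, 19.4, 17.9]


def subset_by_index(values, start, stop):
    """Index-space request: the caller must already know array positions.

    Both start and stop are inclusive.
    """
    return values[start:stop + 1]


def subset_by_coordinate(coords, values, lo, hi):
    """Coordinate-space request: the caller gives a real-world range.

    The mapping from coordinates to indices happens here, on the
    "server" side. Assumes coords is sorted ascending.
    """
    start = bisect_left(coords, lo)   # first index with coords[i] >= lo
    stop = bisect_right(coords, hi)   # one past last index with coords[i] <= hi
    return values[start:stop]


by_index = subset_by_index(temps, 2, 4)
by_coord = subset_by_coordinate(lats, temps, 34.0, 41.0)
print(by_index)
print(by_coord)
```

Both calls select the same three values, but only the second can be written without first inspecting the file layout, which is the sense in which the file API becomes implementation rather than interface.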

1) WCS Request

• Functionality
  – Subsetting (bounding box, time range, variable)
  – Optional reprojection/resample
• Variants: KML/XML/SOAP+XML/REST
• Optional Functionality : 42 flavors
• Bad news for interoperability
• Is there an elephant to dictate a standard?
  – E.g. IBM chose SQL/Relational model (1984)

2) Server

How do I serve my data?

• Do I need specialized personnel?
  – $$, resource consumption, core competency
• What are the common requests?
  – (that I should optimize for)?

3) Response

What comes back?

• Has to be a representation of the “answer” in the Data Model

• WCS allows anything
  – Can't write a generic client
• Communities will form around a small number of variants
  – No elephants in sight

3) Response : XML vs. binary

• Extensibility vs. Efficiency

• Binary: netCDF/GeoTIFF/HDF/etc
  – reflect favorite formats of committee members
  – Different data models : ideally need a formal mapping (but there aren't any yet)
  – Domain experts can make use of them

• GML closely follows the OGC/ISO data model (WFS requires GML)
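The "Extensibility vs. Efficiency" trade-off can be made concrete with a rough size comparison. This is a toy sketch: the `<values>`/`<v>` element names are invented for the example and are not GML, and the binary side is plain packed floats rather than netCDF or HDF.

```python
import struct


def encode_xml(values):
    """Encode floats as a tiny ad hoc XML document (hypothetical element names)."""
    items = "".join(f"<v>{v!r}</v>" for v in values)
    return f"<values>{items}</values>".encode("utf-8")


def encode_binary(values):
    """Encode the same floats as packed big-endian 32-bit IEEE floats."""
    return struct.pack(f">{len(values)}f", *values)


# 1000 made-up temperature values in kelvin.
values = [273.15 + 0.25 * i for i in range(1000)]
xml_bytes = encode_xml(values)
bin_bytes = encode_binary(values)

print("XML bytes:   ", len(xml_bytes))
print("binary bytes:", len(bin_bytes))
```

The binary form is a fixed four bytes per value, while the XML form pays for tags and decimal text on every value. In exchange, the XML form is self-describing and can be extended without breaking readers.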

3) Response : XML vs. binary

• GML is waaaay too complex
  – Ambitious
  – OGC/ISO models are complex
  – Reality is complex
  – XML Schema is a disaster

• Google KML – “visualization format” not “data storage”

THREDDS Data Server

[Diagram: HTTP Tomcat Server (motherlode.ucar.edu) hosting the THREDDS Server application on the NetCDF-Java library. Other labels: Datasets, IDD Data, catalog.xml, configCatalog.xml. Services: HTTPServer, NetcdfSubset, WCS, OPeNDAP.]

THREDDS Catalogs

• XML over HTTP
• Hierarchical listing of online resources (datasets)
• Container for arbitrary search metadata
  – Standard set maps to DC, GCMD, ADN
  – Unidata/NCAR-CDP
• Metadata can be inherited
• Design goal: Make it easy for data providers
• TDS uses extended version for configuration
• Data Access URLs
  – “Crossing the protocol boundary”
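As a rough sketch of how a client might turn a catalog into data access URLs, the following walks a small inline catalog, applies an inherited service name, and joins the service base with each dataset's `urlPath`. The catalog content, the dataset names, and the `access_urls` helper are made up for illustration. The namespace is assumed to be the InvCatalog 1.0 one, and real catalogs carry much more metadata than is handled here.

```python
import xml.etree.ElementTree as ET

# Assumed THREDDS InvCatalog 1.0 namespace.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Hypothetical catalog, written inline so the example is self-contained.
CATALOG_XML = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="Example">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Models">
    <metadata inherited="true">
      <serviceName>odap</serviceName>
    </metadata>
    <dataset name="Run A" urlPath="models/runA.nc"/>
    <dataset name="Run B" urlPath="models/runB.nc"/>
  </dataset>
</catalog>"""


def access_urls(catalog_xml, host):
    """Return (dataset name, access URL) pairs for datasets that have a urlPath."""
    root = ET.fromstring(catalog_xml)
    bases = {s.get("name"): s.get("base") for s in root.findall("t:service", NS)}
    found = []

    def walk(node, service_name):
        # Inherited metadata on this node overrides what the parent passed down.
        for md in node.findall("t:metadata", NS):
            if md.get("inherited") == "true":
                sn = md.find("t:serviceName", NS)
                if sn is not None and sn.text:
                    service_name = sn.text.strip()
        url_path = node.get("urlPath")
        if url_path and service_name in bases:
            found.append((node.get("name"), host + bases[service_name] + url_path))
        for child in node.findall("t:dataset", NS):
            walk(child, service_name)

    for top in root.findall("t:dataset", NS):
        walk(top, None)
    return found


urls = access_urls(CATALOG_XML, "http://hostname.edu")
for name, url in urls:
    print(name, "->", url)
```

The step from the catalog (XML over HTTP) to the resulting URL (here an OPeNDAP endpoint) is where the client crosses from one protocol to another.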

THREDDS OPeNDAP Server

• OPeNDAP is a protocol for remote access to CDM
• Current version 2.0; NASA ESE standard
  – Working on new 4.0 protocol spec
• Based on Java-OPeNDAP library
  – shared development by Unidata/opendap.org
• Any CDM dataset can be served
• Server4 (Hyrax):
  – latest version of opendap.org C++ library
  – THREDDS Catalogs replace dods_dir
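OPeNDAP 2.0 requests are expressed in index space: a client appends a constraint expression to the dataset URL, with one `[start:stride:stop]` hyperslab per dimension and an inclusive `stop`. A minimal sketch of building such a URL follows. The helper names, the host, the dataset path, and the variable name are all placeholders, and a real client would normally percent-encode the brackets.

```python
def dap2_hyperslab(variable, slices):
    """Build a DAP2 projection such as temp[0:1:0][10:2:20].

    slices holds one (start, stride, stop) tuple per dimension; stop is inclusive.
    """
    for start, stride, stop in slices:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"bad hyperslab {(start, stride, stop)}")
    return variable + "".join(f"[{a}:{s}:{b}]" for a, s, b in slices)


def dap2_data_url(dataset_url, projections):
    """Append the binary-data suffix (.dods) and a comma-separated projection list."""
    return dataset_url + ".dods?" + ",".join(projections)


# Hypothetical dataset and variable names.
url = dap2_data_url(
    "http://hostname.edu/thredds/dodsC/models/runA.nc",
    [dap2_hyperslab("temp", [(0, 1, 0), (10, 2, 20)]), "lat", "lon"],
)
print(url)
```

Note that nothing in the request refers to latitude, longitude, or time values; the client has to read the coordinate variables first and work out the indices itself.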

THREDDS WCS service

• CDM files that have Grid coordinate system
  – evenly spaced x,y
• Allows subsetting the dataset by:
  – Lat/lon or projection bounding box
  – time and vertical coordinate range
  – list of Variables
• Return formats
  – GeoTIFF floating point, grayscale
  – NetCDF/CF-1.0
• No reprojections, resamplings
• Uses WCS 1.0, work on WCS 1.1 in progress
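A coordinate-space subset request of this kind might look roughly like the following WCS 1.0 key-value GetCoverage URL. Treat this as a sketch under assumptions: the endpoint path, the coverage name, and the `NetCDF3` format string are placeholders, only a simplified subset of the GetCoverage parameters is shown, and the full WCS 1.0.0 specification defines further parameters (for example CRS) that a given server may require.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wcs_get_coverage_url(endpoint, coverage, bbox, time_range=None, fmt="NetCDF3"):
    """Build a simplified WCS 1.0.0 GetCoverage URL.

    bbox is (minx, miny, maxx, maxy); time_range is an optional (start, end)
    pair of ISO 8601 strings, joined here as start/end.
    """
    params = {
        "service": "WCS",
        "version": "1.0.0",
        "request": "GetCoverage",
        "coverage": coverage,
        "bbox": ",".join(str(v) for v in bbox),
        "format": fmt,  # assumed format name, check the server's capabilities
    }
    if time_range is not None:
        params["time"] = "/".join(time_range)
    # Keep commas, colons and slashes readable instead of percent-encoding them.
    return endpoint + "?" + urlencode(params, safe=",:/")


# Hypothetical endpoint and coverage name.
url = wcs_get_coverage_url(
    "http://hostname.edu/thredds/wcs/models/runA.nc",
    "temp",
    (-110.0, 35.0, -100.0, 45.0),
    ("2007-09-01T00:00:00Z", "2007-09-02T00:00:00Z"),
)
query = parse_qs(urlsplit(url).query)
print(url)
```

In practice a client would first issue GetCapabilities and DescribeCoverage requests to discover the coverage names and supported formats rather than hard-coding them.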

NetCDF Subset Service

• Experiment with REST style web service
• Allows subsetting the dataset by:
  – Lat/lon bounding box
  – time and vertical coordinate range
  – list of Variables
• NetCDF/CF, XML, CSV (spreadsheet)
• Gridded Data
  – Output is a CF-1.0 netCDF file
  – Variation of WCS (simplified request protocol)
• Grid as Point Datasets (experimental)
  – Extract vertical profile, time series from one point in model data
• Station Data: METARs (7 day rolling archive)
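The "Grid as Point" idea, pulling a time series out of gridded model data at one location, can be sketched with a nearest-neighbour lookup. This is not the service's actual algorithm, which the slides do not describe. The axes, the data values, and the function names are invented for the example.

```python
def nearest_index(axis, value):
    """Index of the axis entry closest to value (first one wins on ties)."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))


def time_series_at_point(data, lats, lons, lat, lon):
    """Extract one value per time step from data[time][lat][lon].

    Uses the grid cell nearest to (lat, lon); no interpolation is done.
    """
    j = nearest_index(lats, lat)
    i = nearest_index(lons, lon)
    return [field[j][i] for field in data]


# Hypothetical 3-step, 2 x 3 grid; each value encodes its own (t, j, i) position.
lats = [40.0, 41.0]
lons = [-106.0, -105.0, -104.0]
data = [
    [[t * 100 + j * 10 + i for i in range(len(lons))] for j in range(len(lats))]
    for t in range(3)
]

series = time_series_at_point(data, lats, lons, 40.8, -105.2)
print(series)
```

The same lookup applied along the vertical axis instead of time would give a vertical profile at the point.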

Common Data Model

[Diagram: HTTP Tomcat Server (hostname.edu) hosting the THREDDS Server application on the NetCDF-Java library. Other labels: Datasets, IDD Data, catalog.xml, “Then a miracle happens”. Services: HTTPServer, NetcdfSubset, WCS, OPeNDAP.]

NetCDF-Java architecture

[Diagram labels: Application; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NcML; NetcdfFile; I/O service provider; OPeNDAP; THREDDS; Catalog.xml; NetCDF-3; NetCDF-4; HDF5; GRIB; GINI; NIDS; Nexrad; DMSP; …]

Common Data Model File Formats

• General: NetCDF, HDF5, OPeNDAP
• Gridded: GRIB-1, GRIB-2
• Radar: NEXRAD level II and level III, DORADE, Chinese NEXRAD
• Point: BUFR
• Satellite: DMSP, GINI
• In Progress: NetCDF4, McIDAS AREA, NPOESS, NOAA CLASS legacy files, Barrodale DataBlade, others

Common Data Model Layers

[Diagram labels: Data Access; Coordinate Systems; Scientific Datatypes (Grid, Point, Radial, Trajectory, Swath, Station Profile)]

Common Data Model (Data Access Layer)

[UML diagram]

Coordinate Systems UML

[UML diagram]

NetCDF-4 file format

• NetCDF-4 C library
  – 4.0 Beta implements CDM access layer
  – 4.1: adding Coordinate Systems
    • Optional layer, focus on CF-1 (libcf)
  – 4.?: merge OPeNDAP access
• Persistence format for complete CDM
• NetCDF-Java library will read, maybe write

TDS / NcML aggregation

<dataset name="WEST-CONUS_4km Aggregation" urlPath="satellite/3.9/WEST-CONUS_4km">
  <netcdf>
    <aggregation dimName="time" type="joinNew">
      <scan location="/data/ldm/pub/satellite/3.9/WEST-CONUS_4km/" suffix=".gini" />
    </aggregation>
  </netcdf>
</dataset>
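Conceptually, a `joinNew` aggregation stacks the same variable from many files along a new outer dimension (here `time`), so that N files each holding a 2-D image look like one 3-D dataset. The sketch below imitates that with plain lists. It is not how NetCDF-Java implements aggregation. The file names and grids are made up, and ordering the files by file name is an assumption of this example.

```python
def join_new(files, dim_name="time"):
    """Stack per-file 2-D grids along a new outer dimension.

    files maps a file name to a 2-D grid (list of rows). Files are taken in
    file-name order (an assumption of this sketch), and every grid must have
    the same shape.
    """
    names = sorted(files)
    if not names:
        raise ValueError("no files to aggregate")
    rows, cols = len(files[names[0]]), len(files[names[0]][0])
    for name in names:
        grid = files[name]
        if len(grid) != rows or any(len(r) != cols for r in grid):
            raise ValueError(f"{name} does not match shape {(rows, cols)}")
    return {
        "dims": (dim_name, "y", "x"),
        "shape": (len(names), rows, cols),
        "coord": names,                       # one coordinate value per file
        "data": [files[n] for n in names],    # data[time][y][x]
    }


# Hypothetical scan result: two single-image files found in a directory.
scanned = {
    "img_0100.gini": [[3, 4], [5, 6]],
    "img_0000.gini": [[1, 2], [3, 4]],
}
agg = join_new(scanned)
print(agg["dims"], agg["shape"])
```

A client of the aggregated dataset sees a single variable with a `time` dimension and never needs to know how many files sit behind it.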

Forecast Model Run Collection (FMRC)

Scientific DataTypes

• Based on datasets Unidata is familiar with
  – APIs are evolving
• How are data points connected?
• Intended to scale to large, multifile collections
• Intended to support “specialized queries”
  – Space, Time
• Intend to create “standard” NetCDF file encoding conventions

Scientific DataTypes

• Grids
  – Structured
  – Swath
  – Unstructured
• Point Observation
  – Unconnected
  – Station / Time Series
  – Trajectory
  – Profile
• Radial

Climate and Forecast (CF) Conventions

• Conventions for encoding coordinate systems, other semantics in netCDF

• Working for 10 years
  – Version 1.0 in 2003
  – Good for gridded data
• Current working groups
  – Point/Station/Trajectory/Profile observations
  – CRS (map to OGC)
• Governance in place
• Volunteer: motivated, practical, real
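To illustrate what "encoding coordinate systems in netCDF" means in practice, the sketch below models a file's variables as plain dictionaries and recovers a coordinate system from names and `units` attributes alone. It covers only two simple rules (a coordinate variable shares its name with its single dimension; latitude, longitude and time are recognised by their units) and is not a full CF implementation. The variable names and the helper functions are made up.

```python
# Hypothetical file contents: variable name -> dimensions and attributes.
variables = {
    "time": {"dims": ("time",), "units": "hours since 2007-09-01 00:00:00"},
    "lat": {"dims": ("lat",), "units": "degrees_north"},
    "lon": {"dims": ("lon",), "units": "degrees_east"},
    "temp": {
        "dims": ("time", "lat", "lon"),
        "units": "K",
        "standard_name": "air_temperature",
    },
}


def coordinate_variables(variables):
    """Names of 1-D variables whose only dimension has the same name."""
    return sorted(n for n, v in variables.items() if v["dims"] == (n,))


def axis_type(units):
    """Classify an axis from its units string (simplified rules)."""
    if units == "degrees_north":
        return "latitude"
    if units == "degrees_east":
        return "longitude"
    if " since " in units:
        return "time"
    return None


def coordinate_system(variables, name):
    """Axis type for each dimension of a data variable, or None if unknown."""
    coords = set(coordinate_variables(variables))
    return [
        axis_type(variables[d]["units"]) if d in coords else None
        for d in variables[name]["dims"]
    ]


print(coordinate_variables(variables))
print(coordinate_system(variables, "temp"))
```

Because the information lives in attributes rather than in the file format, a generic client can locate the data in space and time without format-specific knowledge.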

Summary: Unidata’s directions

• Client: both Scientific User and Decision Support

• Request in coordinate space
  – WCS is fine, not a big architectural decision
• Server: TDS
  – Files in native format, augmented by indexing/DB

• Response: netCDF/CF and GeoTIFF/KML or WMS/JPEG