
Post on 23-Dec-2015


Page 1: Serving unstructured grids using OPeNDAP: Using server-side operations to subset and subsample data Christopher Barker NOAA Office of Response & Restoration

Serving unstructured grids using OPeNDAP: Using server-side operations to subset and subsample data

Christopher Barker
NOAA Office of Response & Restoration
Emergency Response Division

James Gallagher
OPeNDAP, Inc.

NOAA’s National Ocean Service • Office of Response and Restoration

Page 2:

NOAA Emergency Response Division

• National Contingency Plan specifies NOAA’s role in supporting the Coast Guard:

“Provide scientific expertise to support an incident response for Oil and Chemical Spills”

Page 3:

Key Role: Trajectory Modeling

• Where is the oil (or chemical) going?

Page 4:

Primary Tool: GNOME (General NOAA Operational Modeling Environment)

• Lagrangian element (particle) model

• Forcing from external sources:
  – Winds
  – Currents

• Currents:
  – In-house model
  – External operational models

Page 5:

GOODS: GNOME Online Operational Data Server

Page 6:

Example: Deepwater Horizon

• Ocean models utilized:
  – NOAA CSDL: NGOM
  – Navy models: NCOM, HYCOM, IASNFS
  – USF: West Florida Shelf ROMS
  – TGLO/TAMU: TX shelf ROMS
  – NC State: SABGOM
  – All structured grid models

Page 7:

Unstructured Grid Models?

• Unstructured Grids:
  – Allow resolution to vary spatially
  – Conform to boundaries

• Nice for oil spills and particle tracking

• Many more UGRID models coming online
  – Many papers at this conference

Page 8:

Some Models of Interest

• FVCOM:
  – nGOMOFS (NOAA CSDL)
  – Gulf of Maine/Mass Bay (UMASS)
  – Salish Sea (PNNL)

• SELFE:
  – Columbia River (OHSU)
  – Texas Estuaries models (UT)

• ADCIRC:
  – Gulf of Mexico / Southern LA and Texas grid: 9,108,128 nodes, 18,061,765 elements

Page 9:

90,310 Nodes 174,550 Elements

V6nGOMOFS (NOAA CSDL)

Page 10:

Mobile Bay, AL detail grid. About 300 m grid resolution along a 13 m deep navigation channel

What if I just need Mobile Bay?

Page 11:

FVCOM-GoM/GB for Mass Bay and Nantucket Sounds/Shoals

Boston Inner Harbor

Page 12:

ADCIRC: Gulf of Mexico / Southern LA and Texas grid (SL18TX)

• Gulf of Mexico / Southern LA and Texas grid: 9,108,128 nodes, 18,061,765 elements

• Just surface currents:
  – 275 MB per time step (plus the grid specs)

Page 13:

Obstacles to using UGRID models:

• No standard for data/results on UGRIDs:
  – Informal working group for (quite!) a few years
  – Recent draft standard (netCDF3)
  – Work on the netCDF-Java library to support it (SURA modeling test bed project)

• Big Grids:
  – Need server-side subsetting

Page 14:

How to get it done?

• NOAA/ORR post-DWH funding:
  – Better able to respond to large spills

• We started talking to folks about server-side subsetting options

• But we’re clients:
  – We’re not going to run a server

• We needed something that would become an accepted standard/tool.

Page 15:

How to get it done?

• NOAA/NESDIS noted assorted issues:
  – netCDF/OPeNDAP development funding limited
  – Multiple diverging implementations: “Unfunded Mandate”

• NESDIS coordinated funding from:
  – Technology, Planning and Integration for Observations (TPIO) Program
  – OR&R
  – National Climatic Data Center (NCDC)

Page 16:

OPeNDAP-Unidata Linked Servers (OPULS)

• NOAA/BAA grant supports this important collaboration between Unidata & OPeNDAP

• First goal: conformance between OPeNDAP & Unidata servers, through which access is gained to growing amounts of NOAA & related data. Other short-term goals include:

– Asynchronous modes, such as are needed for (delayed) access to near-line data (e.g., data stored on tape)

– Improved access (with server-side subsetting) to data organized on non-rectangular meshes, such as in coastal modeling

• Work began in Boulder during October & will be influenced by an advisory committee (yet to be appointed)

Page 17:

OPeNDAP: the Data Access Protocol

• DAP2 combines a simple data model with a general set of operators.
  – Data Model: Atomic types (e.g., ‘Integer’); Arrays; Structures; Grids; and Sequences.
  – Operators: These provide ways to subset all but the atomic types.
  – Domain neutral: By keeping the semantics of the model clean, we ensure that it can be applied to many different types of data.

Page 18:

But how is it used?

• DAP is generally used as a ‘web service’

• DAP requests are made using a URL

• DAP responses are ‘documents’:
  – Text that contains metadata
  – Combination of text/metadata and binary data.

• Applications read these responses and use them in whatever ways they see fit:
  – the netCDF client library makes legacy applications believe they are reading from a local file

Page 19:

About Array and Grid Selection

• In addition to requesting a Grid or Array, the Selection can be used to subset in indicial space.
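
As a rough illustration of subsetting in index space, the sketch below builds a DAP2 array constraint of the kind that appears in the example URLs later in the deck (u[0][3:6][5:8]). The helper names hyperslab and to_python_slice are hypothetical, not part of any OPeNDAP library. The sketch assumes only that DAP2 writes each dimension as [index], [start:stop] or [start:stride:stop], and that stop is inclusive, unlike a Python slice.

```python
def hyperslab(var, *dims):
    """Build a DAP2 index-space constraint for one variable.

    Each entry of dims is an int (single index), a (start, stop) pair,
    or a (start, stride, stop) triple. Stop is inclusive in DAP2.
    """
    parts = []
    for d in dims:
        if isinstance(d, int):
            parts.append(f"[{d}]")
        elif len(d) in (2, 3):
            parts.append("[" + ":".join(str(i) for i in d) + "]")
        else:
            raise ValueError(f"bad dimension spec: {d!r}")
    return var + "".join(parts)


def to_python_slice(d):
    """Equivalent Python slice for one dimension spec (exclusive stop)."""
    if isinstance(d, int):
        return slice(d, d + 1)
    if len(d) == 2:
        start, stop = d
        return slice(start, stop + 1)
    start, stride, stop = d
    return slice(start, stop + 1, stride)


# First time step, rows 3 through 6, columns 5 through 8.
print(hyperslab("u", 0, (3, 6), (5, 8)))
# The same row range expressed as a Python slice covers four indices.
print(len(range(10)[to_python_slice((3, 6))]))
```

The inclusive stop is the detail most likely to trip up a client that mirrors its own array slicing into a request.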

Page 20:

About Functions

• A Constraint Expression can contain functions
• These functions can perform any operation that can be programmed.
• Thus they provide a good way to extend a data server to perform new operations
• These include operations that are not domain neutral
• In Hyrax they are written in C++

Page 21:

Example URLs

• The base URL: “http://test.opendap.org/opendap/data/nc/fnoc1.nc”

• To get metadata:
  – Dataset variables: http://test.opendap.org/opendap/data/nc/fnoc1.nc.dds
  – … attributes: http://test.opendap.org/opendap/data/nc/fnoc1.nc.das
  – Or less readable in XML: http://test.opendap.org/opendap/data/nc/fnoc1.nc.ddx

• To get data:
  – Just the variables u and v: http://test.opendap.org/opendap/data/nc/fnoc1.nc.dods?u,v
  – … in ASCII so it’s easy to read: http://…/opendap/data/nc/fnoc1.nc.asc?u,v

• With subsetting:
  – http://test.opendap.org/opendap/data/nc/fnoc1.nc.asc?u[0][3:6][5:8]

• Here’s a function:
  – http://…/nc/coads_climatology.nc.ascii?geogrid(SST,45,-80,20,-60,"1000<TIME<3000")
  – This is an example of how functions can enable domain-specific behavior; this function will return an error if the Grid is not ‘geospatial’
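
The URL pattern above (base URL, then a response suffix, then an optional constraint after "?") can be sketched in a few lines of Python. The helper name dap_url is hypothetical. The choice of which characters to leave unescaped is an assumption of this sketch, made so the output matches the URLs as written on the slide.

```python
from urllib.parse import quote

BASE = "http://test.opendap.org/opendap/data/nc/fnoc1.nc"


def dap_url(base, suffix, constraint=""):
    """Compose a DAP2 request URL: base + response suffix + optional constraint."""
    url = f"{base}.{suffix}"
    if constraint:
        # Assumption: leave the characters DAP2 constraint syntax relies on
        # ([ ] : , ( )) unescaped and percent-encode everything else unsafe.
        url += "?" + quote(constraint, safe="[]:,()")
    return url


print(dap_url(BASE, "dds"))                    # dataset variables
print(dap_url(BASE, "das"))                    # attributes
print(dap_url(BASE, "dods", "u,v"))            # binary data for u and v
print(dap_url(BASE, "asc", "u[0][3:6][5:8]"))  # ASCII, subset in index space
```

A real client would usually hand these URLs to a DAP-aware library rather than fetch and parse the responses itself.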

Page 22:

Challenges

• Unstructured Grids are not a specific type in DAP
• We must choose a way, or set of ways, to represent these data
• Datasets are often too large to download – subsetting must be done server-side.
• Because the subsetting operations are complex, we will need to use server-side functions to implement them

Page 23:

Requirements

• Must enable subsetting by polygonal regions

• The result must be an unstructured grid itself

• A subset must preserve the topological and geometric relationships present in the whole:
  – we can’t just regrid everything to a more convenient form.
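
To make these requirements concrete, here is a minimal sketch of polygon subsetting on a small triangular mesh. It is not the Gridfields implementation, and the function names and the tiny mesh are made up for illustration. It assumes a simple rule: an element is kept only when all of its nodes lie inside the polygon, and the surviving nodes are renumbered so that the result is itself a valid unstructured grid with the original connectivity preserved.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices, implicitly closed."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does a horizontal ray from (x, y) toward +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def subset_mesh(nodes, elements, poly):
    """Return (sub_nodes, sub_elements, kept) for the part of the mesh inside poly.

    nodes: list of (x, y); elements: list of 3-tuples of node indices.
    kept lists the original index of each surviving node, in new-index order.
    """
    inside = [point_in_polygon(x, y, poly) for x, y in nodes]
    kept_elems = [e for e in elements if all(inside[i] for i in e)]
    kept = sorted({i for e in kept_elems for i in e})
    new_index = {old: new for new, old in enumerate(kept)}
    sub_nodes = [nodes[i] for i in kept]
    sub_elements = [tuple(new_index[i] for i in e) for e in kept_elems]
    return sub_nodes, sub_elements, kept


# Made-up mesh: node 0 lies outside the polygon, so one triangle is dropped
# and the remaining nodes are renumbered from zero.
nodes = [(3, 0.5), (0, 0), (1, 0), (1, 1), (0, 1)]
elements = [(1, 2, 3), (1, 3, 4), (2, 0, 3)]
poly = [(-0.5, -0.5), (1.5, -0.5), (1.5, 1.5), (-0.5, 1.5)]
sub_nodes, sub_elements, kept = subset_mesh(nodes, elements, poly)
print(sub_nodes, sub_elements, kept)
```

The kept list is what a caller would use to pull the matching values out of any node-centered data variable, so the data and the subset mesh stay aligned without regridding.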

Page 24:

Proposed Solution

• Server-side function to add subsetting
• Adopt the proposed unstructured grid encoding using netCDF3
• Result of the function will be a DAP2 response
  – Input is netCDF3 with some additional ‘conventions’: it can be represented in DAP2
  – There are existing clients that can read DAP2
    • If they understand netCDF in the new convention, they will understand the results

Page 25:

The server-side function

• Ugrid(Mesh,<polygon>)
  – <polygon> is a comma-separated list of latitude and longitude points
  – However, there is an arbitrary limit to the number of characters in a URL, so:

• We will also support POST when OPULS makes the transition to DAP4
  – It will likely take more than a year for all of DAP4 to be realized, but POST for constraint expressions will be set in the first year.

Page 26:

Example ugrid() calls

• http://…/model.nc?ugrid(SST,45,-80,20,-60)
  – When ugrid() is called with two points, it will assume the polygon is a box.

• http://…/model.nc?ugrid(SST,45,-80, 45,-60, 20,-60, 20,-80)
  – Here the polygon is the same box as above.
  – There’s an understood edge connecting the first and last points
  – Point order is important – self-intersecting polygons will raise an error.
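
The rules on this slide can be sketched on the client side as follows. This is only an illustration of the proposed call as described here: ugrid() was still a proposal, so the final signature may differ, and every helper name below is hypothetical. The sketch expands two corner points into the four-vertex box shown above, rejects polygons whose non-adjacent edges cross, and formats the constraint string.

```python
def box_from_corners(p1, p2):
    """Expand two (lat, lon) corners into the four vertices of a box."""
    (lat1, lon1), (lat2, lon2) = p1, p2
    return [(lat1, lon1), (lat1, lon2), (lat2, lon2), (lat2, lon1)]


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): -1, 0 or 1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def is_self_intersecting(poly):
    """True if any two non-adjacent edges of the closed polygon properly cross."""
    n = len(poly)
    edges = [(poly[i], poly[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share the first vertex
            (a, b), (c, d) = edges[i], edges[j]
            if (_orient(a, b, c) * _orient(a, b, d) < 0
                    and _orient(c, d, a) * _orient(c, d, b) < 0):
                return True
    return False


def ugrid_constraint(var, points):
    """Build a ugrid() constraint string from a list of (lat, lon) points."""
    if len(points) == 2:
        points = box_from_corners(*points)
    if len(points) < 3:
        raise ValueError("need two corners or at least three vertices")
    if is_self_intersecting(points):
        raise ValueError("polygon is self-intersecting")
    coords = ",".join(str(v) for p in points for v in p)
    return f"ugrid({var},{coords})"


print(ugrid_constraint("SST", [(45, -80), (20, -60)]))
```

Checking for self-intersection before sending the request is a design choice of this sketch; the slide only says that the server will raise an error for such polygons.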

Page 27:

http://…/model.nc?ugrid(SST, -71.03, 42.38, -71.06, 42.37, -71.06, 42.36, -71.06, 42.35, -71.04, 42.33, -71.01, 42.34, -71.01, 42.35, -71.03, 42.38)

Page 28:

Implementation

• We will use the Gridfields library [Howe 05]

• The library will be extended to work with the new netCDF3 file format:

“Deltares CF proposal for Unstructured Grid data model”

• And to work with DAP

[Howe 05] Bill Howe, David Maier, “Algebraic Manipulation of Scientific Datasets,” VLDB Journal, 14(4) 2005

Page 29:

Progress so far

• Gridfields has already been used to build a simpler server-side demonstration function

• The Gridfields code has adopted GNU’s autotools to streamline its build.

• We will factor out the C++ code into its own project, separate from the Python layer

• This will simplify moving gridfields into the Linux community builds

Page 30:

Summary

• Ugrid models are seeing wide deployment
• Subsetting UGrids on the server is critical to the wide use of model results
• UGrids will be encoded in netCDF3
• We will use a widely available open-source library to perform the actual operations
• The results will be valid UGrids, in DAP
• The work has begun

Page 31:

Use for Curvilinear grids, too?

• Capture arbitrary polygon subset.
• Rectangle in geo-coordinates is not a rectangle in grid coordinates
  – We generally oversample.
  – But that’s not always a good solution for highly deformed grids.
  – What would the result look like?
    - A new structured grid?
    - An unstructured grid?
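
The oversampling mentioned above can be illustrated with a small sketch: on a curvilinear grid, the usual workaround is to request the smallest index-space box that contains every grid point inside the geographic rectangle. The function name and the rotated toy grid are made up for this example and are not taken from any particular model.

```python
def index_bounding_box(lon, lat, lon_min, lon_max, lat_min, lat_max):
    """Smallest (i0, i1, j0, j1) index box containing every grid point that
    falls inside the geographic rectangle, or None if no point does.

    lon and lat are 2-D lists indexed [i][j]; all bounds are inclusive.
    """
    hits = [
        (i, j)
        for i, row in enumerate(lon)
        for j, x in enumerate(row)
        if lon_min <= x <= lon_max and lat_min <= lat[i][j] <= lat_max
    ]
    if not hits:
        return None
    i_vals = [i for i, _ in hits]
    j_vals = [j for _, j in hits]
    return min(i_vals), max(i_vals), min(j_vals), max(j_vals)


# Toy 5 x 5 grid rotated 45 degrees relative to the lon/lat axes.
n = 5
lon = [[i + j for j in range(n)] for i in range(n)]
lat = [[j - i for j in range(n)] for i in range(n)]

i0, i1, j0, j1 = index_bounding_box(lon, lat, 3, 5, -1, 1)
in_box = (i1 - i0 + 1) * (j1 - j0 + 1)
in_rect = sum(
    1
    for i in range(n)
    for j in range(n)
    if 3 <= lon[i][j] <= 5 and -1 <= lat[i][j] <= 1
)
print((i0, i1, j0, j1), in_box, in_rect)
```

In this toy grid the index box holds 9 points while only 5 lie inside the rectangle, and the gap grows as the grid becomes more deformed, which is the motivation for asking whether a polygon subset should return something other than a structured box.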

Page 32:

Further Discussion, etc.

• Meet here at ECM:
  – Lunch Wed?

• Discussion on UGRID Google group: https://groups.google.com/group/ugrid-interoperability

• OPeNDAP Wiki: http://docs.opendap.org/index.php/Projects