GCOOS AND GRIIDC: A Third Coast Partnership for Big Ocean Data

Tools and Techniques for Data Transfers

M.K. Howard1, F.C. Gayanilo2, M. Stössel1, and Steve Baum1

1 Dept. of Oceanography, Texas A&M University, College Station, TX 77843, USA

2 Harte Research Institute for Gulf of Mexico Studies, Texas A&M University-Corpus Christi, 6300 Ocean Drive, Unit 5869, Corpus Christi, Texas 78412, USA

2014 Ocean Sciences Meeting: 28 February 2014, Honolulu, Hawaii

Session 113: Big Data for a Big Ocean: Progress in Data Availability and Interoperability



Gulf of Mexico Coastal Ocean Observing System (GCOOS)

Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC)

Authors and Outline

• GRIIDC - GCOOS

• Large File Transport

• DAP Transfers

Matt, Nonong, Steve, Marion

GCOOS Station Map - data.gcoos.org

1555 Near Real-Time Sensors

GCOOS Holdings

• Near real-time collection: 1500+ sensors

• ~1.4M obs. per month, 2008-present

• Historical: Ocean Data 1900-2000

• Circulation Models & Model Resources

• Pursuing: Coastal oxygen and nutrients

• Future: Support ecosystem studies

GRIIDC - data.gomri.org

$500M/10yr

Partial List of Models, Data, Instruments, and Studies

Numerical Models (* multiple versions):
• GCM: ROMS*, HYCOM*, FVCOM*, MITGCM, NCOM, ADCIRC
• Plume Dynamics: LES/Nek5000
• Waves/Surf: UMWM, Delft3D, SWAN
• Near-field: SMIP, CMS, HYDRO3D
• Particle-tracking: LTRANS
• Bay-Models: SUNTANS, Barataria Bay coupled model

Remotely-sensed Data: ENVISAT-SAR, MERIS, MODIS, JASON-SSH, LANDSAT, ASTER, AVHRR-SST, HF-RADAR Currents

In-Situ:
• Autonomous: RAFOS, surface/bottom/profiling drifters, drift-cards, dye, chemical tracers
• AUV: video, multi-beam
• Gliders: with temperature, salinity, velocity, chlorophyll, CDOM, acoustic backscatter
• Personal water craft: shallow-water bathymetry
• Expendables: AXBT, AXCP, AXCTD, GPS_Sondes
• Optical: laser-altimetry, deep-LISST, CDOM, fluorescence, PAR, OBS, transmission
• Video/Stills: polarimetric, infrared, sediment profile & traps, seabed, digital X-rays
• Moorings: with conductivity, temperature, pressure, DO2, turbidity, currents, fluorescence
• Other: water level, single-beam soundings, turbulence
• Sediment samplers: box cores, Ekman grabs, dredges, benthic skimmers, traps
• Sediment measurements: soil stress, cores, respiration, gas fluxes, light compensation
• Chemicals: DOC, DIC, DON, DOP, NO3, NH4, UREA, PO4, SIO2, SRP, H2S, Eh, pH, C14, C13, Th234, Pb210, Hg-isotopes, methane concentration

Selected Laboratory Studies:
• Concentration of compounds of crude oil in sediments, water, and tissue
• Model of canyon flow
• Core photography, resuspension rates of oil
• Grain-size distribution-properties
• Releases of crude and methane in a high-pressure environment in a laboratory setting
• Surfactant effects on boundary layers
• Physical processes of dispersion influencing droplet fate
• Biodegradation over time and toxicity

(Amalgamated) in situ or in vivo biological studies:
• Bacteria: oil encounter rates, colony formation, films, oil interactions, and POC, DIC, DOC, N2 fixation rates
• Microbial: community abundance, community characterization, identification, composition, process rates, degradation of oil, aerobic respiration, cell counts, transcription activity, and ecotoxicity of oil droplets
• Macroinfauna: richness dominance, community structure, reproductive states, diversity, and gene expression
• Mussel: growth, abundance, tissue response, hemocyte density, viability, and phagocytes
• Zooplankton: concentration, abundance, distribution, composition, condition, grazing rates, and oil encounter rates
• Phytoplankton: hydrocarbon signature, and dispersant toxicity
• Fish: toxicology, reproductive potential, immune system function, stomach, muscle, gonad tissue, liver and bile, exposure to PAH, growth rates, and diversity
• Oil consumptions waste products (fecal matter and marine snow)
• Insect species abundance, diversity, turnover, and oil uptake
• Birds: survival and nest success, fecal and stomach composition, and oil uptake
• Marshes: Spartina stem number and diameter, intercellular CO2 concentrations, measurements of above and belowground biomass

[Figure labels: Physical - Chemical; Biological; Models; Field; Lab; Field-Lab]

GRIIDC Data Holdings

~18Tb

Data Exchange Requirements

• Move large files from provider to archive

• Serve data subsets to consumers

• Support human and machine access

Large File Transport

[Figure: 1 Gb/s round-trip transfers; axis label: minutes:seconds]

Servers

GCOOS Data Servers

• ERDDAP

• THREDDS Data Server (TDS)

• OPeNDAP

Where possible we encourage the use of: netCDF, COARDS/CF, NODC netCDF Feature Templates

Full Featured

To be deployed by GRIIDC in 2014-2015 timeframe.

Few Features

NOAA ERDDAP

http://coastwatch.pfeg.noaa.gov/erddap/griddap/jplAmsreSstMon.htmlTable?tos[(2010-12-16T12:00:00Z):1:(2010-12-16T12:00:00Z)][(-89.5):1:(89.5)][(1.0):1:(360.0)],tosNobs[(2010-12-16T12:00:00Z):1:(2010-12-16T12:00:00Z)][(-89.5):1:(89.5)][(1.0):1:(360.0)],tosStderr[(2010-12-16T12:00:00Z):1:(2010-12-16T12:00:00Z)][(-89.5):1:(89.5)][(1.0):1:(360.0)]
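The ERDDAP request above follows a regular pattern: each variable name is followed by one [(start):stride:(stop)] block per dimension, and the variables are joined with commas. A minimal Python sketch of that pattern is given here; the helper name griddap_query and its parameters are illustrative assumptions, not part of ERDDAP itself, and the dimension order (time, latitude, longitude) is taken from the URL shown on the slide.

```python
def griddap_query(variables, time, lat, lon, stride=1):
    """Build the constraint part of an ERDDAP griddap URL.

    time, lat and lon are (start, stop) pairs. Values are wrapped in
    parentheses, which marks them as coordinate values rather than indices.
    (Hypothetical helper, written only to illustrate the URL pattern.)
    """
    dims = "".join(
        f"[({start}):{stride}:({stop})]" for start, stop in (time, lat, lon)
    )
    # Every requested variable carries the same dimension constraints.
    return ",".join(f"{name}{dims}" for name in variables)


base = "http://coastwatch.pfeg.noaa.gov/erddap/griddap/jplAmsreSstMon"
query = griddap_query(
    ["tos", "tosNobs", "tosStderr"],
    time=("2010-12-16T12:00:00Z", "2010-12-16T12:00:00Z"),
    lat=(-89.5, 89.5),
    lon=(1.0, 360.0),
)
url = f"{base}.htmlTable?{query}"
print(url)
```

Building the query from parameters rather than by hand makes it harder to leave the three variables with mismatched constraints.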

DAP Clients

OPeNDAP

• Access remote data using URLs

• Remote subsetting via constraint expressions

• Online browse:

• Dataset Descriptor Structure, .dds (syntactic metadata: shape, 2D? 3D?)

• Dataset Attribute Structure, .das (parameter names and units)

• Or combined, .info
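The browse responses listed above are reached by appending a suffix to the dataset URL. A small Python sketch of that convention follows; the function name and the example.org URL are placeholders invented for illustration, and whether .info is available depends on the server.

```python
def dap_metadata_urls(dataset_url):
    """Return the .dds, .das and .info URLs for a DAP dataset URL.

    Hypothetical helper: any constraint expression after '?' is dropped,
    so the returned URLs describe the whole dataset.
    """
    base = dataset_url.split("?", 1)[0]
    return {suffix: f"{base}.{suffix}" for suffix in ("dds", "das", "info")}


# Placeholder dataset URL, not a real server.
urls = dap_metadata_urls("http://example.org/opendap/sst.nc?sst[0][0:10][0:10]")
for suffix, link in urls.items():
    print(suffix, link)
```

Opening the .dds link in a browser is a quick way to check variable names and array shapes before writing a constraint expression.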

Tested Clients

• MATLAB (MathWorks)

• IDL (Exelis)

• IDV (UCAR)

• EDC (NOAA/ASA)

• Panoply (NASA)

• GrADS (IGES/COLA)

• Ferret (NOAA/PMEL)

Extractor

Viewer

28 pages like this

[Table labels: Client; Server-type]

Example Code

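; IDL: read a subsampled LU_INDEX field from a remote DAP server
; Constraint is [start:stride:stop] per dimension
url = 'http://urltodata/wrfout_0726_d01_2012-07-26.nc?LU_INDEX[0:1:0][0:10:810][0:10:855]'
ncid = ncdf_open(url)                  ; open the remote dataset by URL
dataid = ncdf_varid(ncid, 'LU_INDEX')  ; look up the variable ID
ncdf_varget, ncid, dataid, lu_index    ; read the subset into lu_index
ncdf_close, ncid                       ; release the file handle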
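The constraint in the example URL uses DAP index notation, [start:stride:stop], where stop is inclusive (unlike a Python slice). As a rough check of how much data such a request returns, the sketch below computes the resulting array shape; the function name is an illustrative assumption and it only handles the three-part bracket form used in the example.

```python
import re


def hyperslab_shape(constraint):
    """Return the array shape selected by a DAP index constraint.

    Each bracket is [start:stride:stop] with an inclusive stop, so the
    number of selected points is (stop - start) // stride + 1.
    """
    shape = []
    for start, stride, stop in re.findall(r"\[(\d+):(\d+):(\d+)\]", constraint):
        shape.append((int(stop) - int(start)) // int(stride) + 1)
    return tuple(shape)


print(hyperslab_shape("LU_INDEX[0:1:0][0:10:810][0:10:855]"))  # (1, 82, 86)
```

With a stride of 10 in both horizontal dimensions, the request transfers roughly one hundredth of the full-resolution field.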

In ERDDAP: suffix is file type of response

In TDS: suffix is file type of source
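Because the ERDDAP suffix selects the response format, the same request can be re-issued as a different file type by changing only the suffix. A minimal Python sketch, assuming the URL already carries a suffix such as .htmlTable; the helper name is hypothetical and the short index constraint is made up for the example.

```python
def with_response_type(erddap_url, file_type):
    """Swap the response-type suffix of an ERDDAP data URL.

    Hypothetical helper: assumes the path part of the URL (before '?')
    ends in '.<fileType>', as ERDDAP data URLs do.
    """
    path, sep, query = erddap_url.partition("?")
    dataset, _, _old_type = path.rpartition(".")
    return f"{dataset}.{file_type}{sep}{query}"


html_url = (
    "http://coastwatch.pfeg.noaa.gov/erddap/griddap/"
    "jplAmsreSstMon.htmlTable?tos[0][0:1:10][0:1:10]"
)
print(with_response_type(html_url, "csv"))
print(with_response_type(html_url, "nc"))
```

The same trick does not carry over to TDS, where the suffix names the source file rather than the response format.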

Test Results

Summary

• GCOOS-GRIIDC: Strong Partnership

• More data to appear in TDS/ERDDAP

• Standards: Use them

• RAM - buy lots

• iPython/pyoos - yes

Resources

• http://www.opendap.org

• http://upwell.pfeg.noaa.gov/erddap/index.html

• http://stommel.tamu.edu/~baum/bp.html#_pydap

• http://stommel.tamu.edu/~baum/thredds2.html

• http://stommel.tamu.edu/~baum/cages.html

• http://www.giss.nasa.gov/tools/panoply/

• http://www.unidata.ucar.edu/software/idv/

• http://www.pfeg.noaa.gov/products/edc

• http://www.mathworks.com

• http://www.iges.org/grads/

• http://www.ferret.noaa.gov/Ferret/

• http://www.exelisvis.com/ProductsServices/IDL.aspx