Session 2: Using OPeNDAP-enabled Applications to Access Australian Data Services and Repositories

eResearch Australasia 2011, ½ Day Morning Workshop, Thursday 10th November 2011


GENERAL INFORMATION

• This is a half-day workshop (9am to 12:30pm)

• 9:00am Introductions and Participants Goals

• 9:15am Session 1: Discovering OPeNDAP data access services

• 10:00am Session 2: Applicable use cases of OPeNDAP data services

− 10:30am Tea Break for 15 minutes

• 11:00am Session 3: OPeNDAP service protocols and features

• 11:45am Session 4: Accessing complementary features and services

• 12:30pm End of Workshop

10:00am Session 2: Applicable use cases of OPeNDAP data services

Session 2

• Applicable use cases of OPeNDAP data services for data cataloging and data access using a variety of applications and tools.

• A short tutorial exploring data access using an OPeNDAP-enabled tool within a scripting language such as Python.

• 45 minutes in length + 15 minute Tea Break at 10:30am


Spectrum of Use Cases

Application Data Representation

• OGC data model: domain specific; geospatial, 1-D, 2-D

• DAP2 data model: domain neutral; n-D, time series

• **DAP4 data model: domain neutral; new data types and data structures; streaming, compressed, chunked

• Common Data Model (CDM): domain specific

• Future data model: domain neutral??

Application Types

• Programmatic / Language API: FORTRAN, C/C++, Java, Python, NetCDF, Java NetCDF

• Programmatic / Tools: NetCDF, NCO, PyDAP; custom tools: OPeNDAP crawler, ocean_prep

• Interactive Data Viewer: IDV, Panoply, IDL, MATLAB, iPython (matplotlib), NCL, web browser (metadata)

• Interactive Analysis: MATLAB, IDL, iPython, NCL; custom application: Inundation Modeller

• Web Application: Live Access Server; IMOS Data Portal (WMS); custom Java servlet

Programming

• DAP2 legacy code: existing tools

• DAP2 new code: new tools

• **DAP4 programming: legacy code support

• **DAP4 programming: new data model and protocols; streaming support

• **DAP4 programming: asynchronous access modes, server-side processing

Data Access Protocol

• Metadata Request: das, dds, ddx

• ASCII/Binary Data Request: simple data representation

• DAP Binary Object Request

• NcML Data Request: aggregation, virtual data sets

• **DAP4: server-side operations, async access mode, new data model, posting

Syntax

• Return data set info: file.nc.dds - readable; file.nc.ddx - XML; file.nc.asc - ASCII data return

• Select variables: file.nc.dods?var1,var2,var3

• Subset arrays: file.nc.dods?var1[0:1:10]

• Return file translations: file.nc.netcdf - NetCDF file

• Server-side operations: file.nc?GEOLOC(); async access mode??

Clients

• Programmatic Access: Tsunami inundation modeller, NetCDF, NCO, PyDAP, PyNetCDF, MATLAB, IDL, …

• Interactive Access: web browser (catalog); MATLAB, IDL, Python, Panoply, …

• Data Library & Catalog Service: metadata harvesting; directory listings; remote THREDDS services

• Web Service: Java servlet, Java applet; Geospatial Information Service; OPeNDAP data service

• Analysis Service: Live Access Server

Service Capabilities

• DAP2 response: metadata, dods, ASCII / Binary

• **DAP4 response: async access mode, server-side, streaming

• NcML: aggregation service; virtual data set service; remote data access

• Metadata Conversion and RDF: metadata definitions; translations (-> ISO); semantics, ontology; CF->ISO, CF->WMS, CF->WCS

• Layered Services: catalogue service; WMS, WCS services; authentication; conformance checks; CF metadata check; ISO metadata check

**The DAP4 features listed are my estimate and not the official specification.

Workshop Use-Cases

Application Data Representation

• DAP2 data model: domain neutral; n-D, time series

Application Types

• Programmatic / Language API: FORTRAN, C/C++, Java, Python, NetCDF, Java NetCDF, PyDAP

• Programmatic / Tools: NetCDF, NCO, PyDAP; custom tools: OPeNDAP crawler

• Interactive Data Viewer: Panoply, MATLAB, NCL, web browser

Programming

• DAP2 legacy code: existing tools

• DAP2 new code: new tools

Data Access Protocol

• Metadata Request: das, dds, ddx

• ASCII/Binary Data Request: simple data representation

• DAP Binary Object Request

• NcML Data Request: aggregation

Syntax

• Return metadata info: file.nc.das - readable; file.nc.dds - readable; file.nc.ddx - XML metadata; file.nc.help - help info

• Select variables and return data: file.nc.asc?var1,var2,var3; file.nc.dods?var1,var2,var3

• Subset arrays and return data: file.nc.asc?var1[0:1:10]; file.nc.dods?var1[0:1:10]

• Return file translations: file.nc.netcdf - NetCDF file

• Server-side operations: file.nc?GEOLOC()
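The request syntax can be sketched as a small URL builder. This is only an illustration, not part of any OPeNDAP library: the helper names `hyperslab` and `dap_url` are hypothetical, the array subset uses the square-bracket index form seen in the tutorial URLs, and the example data set is the Pydap test file used later in this session.

```python
def hyperslab(name, *ranges):
    """Format a DAP2 array subset such as var1[0:1:10].

    Each range is a (start, stride, stop) tuple of zero-based indices;
    the stop index is inclusive.
    """
    return name + "".join("[%d:%d:%d]" % r for r in ranges)


def dap_url(dataset_url, response, *projections):
    """Append a DAP2 response suffix and an optional constraint expression.

    response: a suffix such as "das", "dds", "ddx", "asc" or "dods".
    projections: variable names or hyperslab strings, joined by commas.
    """
    url = "%s.%s" % (dataset_url, response)
    if projections:
        url += "?" + ",".join(projections)
    return url


base = "http://test.opendap.org/dap/data/nc/coads_climatology.nc"

print(dap_url(base, "dds"))                       # metadata request
print(dap_url(base, "asc", "COADSX", "COADSY"))   # select variables, ASCII
print(dap_url(base, "dods",                       # subset an array, binary
              hyperslab("SST", (0, 1, 0), (10, 1, 13), (10, 1, 13))))
```

Keeping the suffix and the constraint expression separate makes it easy to request the same subset first as ASCII for inspection and then as binary for processing.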

Clients

• Programmatic Access: NetCDF, NCO, PyDAP, PyNetCDF

• Interactive Access: web browser (catalog); Python, MATLAB, Panoply

Service Capabilities

• DAP2 response: THREDDS data service; Hyrax data service

• NcML: aggregation service

• Layered Services: catalog service; WMS

Use Case limitations

• Time to access data is dependent on the following factors:

• Hardware and network performance

• Selection of variables and dimensions

• Number of data requests to be issued

− Latency inherent in the data request

• Number of concurrent accesses to the server


Performance limitations to data delivery

Network connection | Network bandwidth | Data transfer (MB per second) | *Elapsed time (seconds)

WiFi | 2 – 56 Mbps | 0.2 – 5 MBps | 500+

ADSL modem | 2 – 14 Mbps | 0.2 – 1.4 MBps | 357+

Home LAN | 100 Mbps | 10 MBps | 50

SATA Disk | - | 20 – 40 MBps | 12.5+

Office LAN | 1000 Mbps | 100 MBps | ~5.00

Disk Array | - | 120 – 240 MBps | ~3.00

Backbone Ethernet | 10 Gbps | 1,000 MBps | ~0.50

QDR Infiniband | 40 Gbps | 4,000 MBps | ~0.12

Lustre Parallel FS | - | 10,000 MBps | 0.05+

*Time to transfer a 500 MB data object
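The elapsed times follow from dividing the object size by the sustained transfer rate. A minimal sketch of that arithmetic, assuming (as the table appears to) roughly 10 bits on the wire per byte of payload when converting Mbps to MBps; the function names are illustrative only:

```python
def mbps_to_mbytes_per_s(mbps, bits_per_byte=10):
    """Convert a link speed in Mbps to an approximate payload rate in MBps.

    bits_per_byte=10 is an assumed rule of thumb that allows for protocol
    overhead; use 8 for the theoretical maximum.
    """
    return mbps / bits_per_byte


def transfer_seconds(size_mb, rate_mbytes_per_s):
    """Time to move size_mb megabytes at a sustained rate in MBps."""
    return size_mb / rate_mbytes_per_s


OBJECT_SIZE_MB = 500  # the data object size used in the table

for name, mbps in [("ADSL modem (best case)", 14),
                   ("Home LAN", 100),
                   ("Office LAN", 1000),
                   ("Backbone Ethernet", 10000)]:
    rate = mbps_to_mbytes_per_s(mbps)
    seconds = transfer_seconds(OBJECT_SIZE_MB, rate)
    print("%-24s %8.1f MBps %8.2f s" % (name, rate, seconds))
```

These are best-case figures; latency and concurrent access to the server add to them.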


Performance limitations to data delivery

Data request | Data size | *Elapsed time (seconds) | Improved access

Complete file, 3 x fields (3D), doubles | 250 MB | 178 | 1.0x

One 3D field, 5 vertical levels | 50 MB | 35.7 | 5.0x

3 x 2D fields | 30 MB | 21.4 | 8.3x

One 2D field (1250 x 1000) | 10 MB | 7.14 | 24.9x

Subset 3D field (500 x 500 x 5) | 10 MB | 7.14 | 24.9x

Subset 2D field (500 x 500) | 2 MB | 1.43 | 124x

Vertical column (100 x 100 x 5) | 0.4 MB | 0.28 | 635x

*Time to transfer a data object on an ADSL2 modem = 14 Mbps
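The data sizes can be checked from the array shapes: each value is an 8-byte double, so the request size is the product of the dimensions times 8 bytes. The sketch below works through a few rows under the same assumption as the footnote (14 Mbps, taken as about 1.4 MBps); the helper name `request_size_mb` is hypothetical.

```python
BYTES_PER_DOUBLE = 8
ADSL2_MBYTES_PER_S = 1.4   # 14 Mbps at roughly 10 bits per byte (assumed)
FULL_FILE_MB = 250         # size of the complete file in the table


def request_size_mb(*shape):
    """Size in MB (10**6 bytes) of an array of doubles with this shape."""
    count = 1
    for n in shape:
        count *= n
    return count * BYTES_PER_DOUBLE / 1e6


full_file_seconds = FULL_FILE_MB / ADSL2_MBYTES_PER_S

for label, shape in [("One 2D field", (1250, 1000)),
                     ("Subset 3D field", (500, 500, 5)),
                     ("Subset 2D field", (500, 500)),
                     ("Vertical column", (100, 100, 5))]:
    size = request_size_mb(*shape)
    seconds = size / ADSL2_MBYTES_PER_S
    speedup = full_file_seconds / seconds
    print("%-16s %6.1f MB %7.2f s %7.1fx" % (label, size, seconds, speedup))
```

The improvement factor is simply the ratio of the full-file size to the request size, which is why subsetting on the server pays off most on slow links.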


DAP-enabled client tools/applications

OPeNDAP Clients (partial list)

http://opendap.org/whatClients

To be demonstrated today:

1. Web browser returning ASCII data

2. Pydap: a pure Python library implementing DAP2

3. NetCDF: a set of software libraries and self-describing, machine-independent data formats, with interfaces for Python, FORTRAN, C/C++, and Java

4. NCO: a dozen standalone, command-line programs that take netCDF files as input

5. MATLAB – session 3

6. Panoply – session 4


Web Browser demo

• “.ascii” tells the OPeNDAP service to return the data in ASCII format.

− http://opendap.bom.gov.au:8080/thredds/dodsC/gamssa_4deg/2011/20111106-ABOM-L4LRfnd-GLOB-v01-fv01.nc.ascii?lon

• Try accessing multiple variables, such as longitude and latitude.

− http://opendap.bom.gov.au:8080/thredds/dodsC/gamssa_4deg/2011/20111106-ABOM-L4LRfnd-GLOB-v01-fv01.nc.ascii?lon,lat

• What other variables are available in the file?

− Try accessing “sst” and download it as ASCII.

Tutorial – subsetting continued

http://opendap.bom.gov.au:8080/thredds/dodsC/gamssa_4deg/2011/20111106-ABOM-L4LRfnd-GLOB-v01-fv01.nc.ascii?lon[0:1:1439]

Add a new variable to the above URL, separated by a comma, and request ASCII data in the web browser:

• http://opendap.bom.gov.au:8080/thredds/dodsC/gamssa_4deg/2011/20111106-ABOM-L4LRfnd-GLOB-v01-fv01.nc.ascii?lon[0:1:50],lat[0:1:30]

Now do the same thing in the form and modify the index range.

• Watch out for large index ranges, which return large amounts of data.
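Before sending a request, it helps to estimate how many values an index range will return. In DAP2 an index range is written [start:stride:stop], with zero-based indices and an inclusive stop. The sketch below counts the values selected; the function names are hypothetical.

```python
def hyperslab_length(start, stride, stop):
    """Number of values selected by a DAP2 index range [start:stride:stop].

    Indices are zero-based and the stop index is inclusive.
    """
    if start < 0 or stride < 1 or stop < start:
        raise ValueError("expected 0 <= start <= stop and stride >= 1")
    return (stop - start) // stride + 1


def request_values(*ranges):
    """Total number of values for one variable subset over several dimensions."""
    total = 1
    for start, stride, stop in ranges:
        total *= hyperslab_length(start, stride, stop)
    return total


print(hyperslab_length(0, 1, 1439))   # lon[0:1:1439]
print(hyperslab_length(0, 1, 50))     # lon[0:1:50]
print(hyperslab_length(0, 10, 50))    # every 10th value of the same range

# A two-dimensional variable subset over lat[0:1:30] and lon[0:1:50]
print(request_values((0, 1, 30), (0, 1, 50)))
```

Multiplying the count by the size of each value (for example 4 bytes for a 32-bit float) gives a rough lower bound on the size of the binary response.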


Tutorial: .dods response

Try the binary response “.dods”

• “.dods” tells the OPeNDAP service to return the data in binary format.

− http://opendap.bom.gov.au:8080/thredds/dodsC/gamssa_4deg/2011/20111106-ABOM-L4LRfnd-GLOB-v01-fv01.nc.dods?lon

This is a two-part binary DAP data object that contains 1) metadata and 2) a binary data structure.

This is the typical response for OPeNDAP-enabled client applications.
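To make the two parts concrete, the sketch below splits a .dods payload into its text metadata (the DDS) and its binary section, then decodes a single Float32 array. It assumes the usual DAP2 layout: the DDS text, a "Data:" separator line, then XDR (big-endian) data in which an array is preceded by its length written twice. The payload here is synthetic and built in the script, not real server output, and the function names are hypothetical. Client libraries such as Pydap do this work for you.

```python
import struct


def split_dods(payload):
    """Split a DAP2 .dods response into (DDS text, XDR-encoded binary data)."""
    dds, _, data = payload.partition(b"\nData:\n")
    return dds.decode("ascii"), data


def decode_float32_array(data):
    """Decode one XDR Float32 array: length (sent twice), then big-endian floats."""
    length, repeat = struct.unpack(">II", data[:8])
    if length != repeat:
        raise ValueError("array length fields disagree")
    return list(struct.unpack(">%df" % length, data[8:8 + 4 * length]))


# Synthetic payload standing in for a server response (illustrative values only).
values = [0.125, 0.375, 0.625]
payload = (
    b"Dataset {\n    Float32 lon[lon = 3];\n} example.nc;\n"
    b"\nData:\n"
    + struct.pack(">II", len(values), len(values))
    + struct.pack(">3f", *values)
)

dds, data = split_dods(payload)
print(dds)                         # part 1: metadata
print(decode_float32_array(data))  # part 2: decoded binary data
```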

Pydap

Pydap is a pure Python library implementing the Data Access Protocol, also known as DODS or OPeNDAP. You can use Pydap as a client or server.

• http://pydap.org/

To install Pydap on Windows… see the next slide for Windows

To install Pydap on Mac OS X… see the slide for Mac OS X

To install Pydap on Linux… see the slide for Mac OS X


Pydap installation for Windows

To install Pydap on Windows …

1. Install Python on Windows

2. Install easy_install: ez_setup.py

3. Install Pydap: easy_install Pydap


Pydap installation for Mac OS X

To install Pydap on Mac OS X…

1. Python is installed on Mac OS X 10.5 and 10.6 by default

2. Install easy_install: ez_setup.py

3. Install Pydap: easy_install Pydap


Test Pydap client installation

>>> from pydap.client import open_url
>>> dataset = open_url('http://test.opendap.org/dap/data/nc/coads_climatology.nc')
>>> var = dataset['SST']
>>> var.shape
(12, 90, 180)
>>> var.type
<class 'pydap.model.Float32'>
>>> print(var[0,10:14,10:14])  # this will download data from the server
<class 'pydap.model.GridType'> with data
[[ -1.26285708e+00 -9.99999979e+33 -9.99999979e+33 -9.99999979e+33]
 [ -7.69166648e-01 -7.79999971e-01 -6.75454497e-01 -5.95714271e-01]
 [ 1.28333330e-01 -5.00000156e-02 -6.36363626e-02 -1.41666666e-01]
 [ 6.38000011e-01 8.95384610e-01 7.21666634e-01 8.10000002e-01]]
and axes
366.0
[-69. -67. -65. -63.]
[ 41. 43. 45. 47.]

More Pydap client features

See Pydap client: http://pydap.org/client.html


NetCDF API and Tools

NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

• http://www.unidata.ucar.edu/software/netcdf/

To install, go to

• http://www.unidata.ucar.edu/downloads/netcdf/index.jsp

To use with Python, build netCDF4 and its Python module, or …

• easy_install netCDF4

NetCDF demo

>>> import netCDF4
>>> url = 'http://test.opendap.org/dap/data/nc/coads_climatology.nc'
>>> dataset = netCDF4.Dataset(url)
>>> var = dataset.variables['SST']
>>> var.shape
(12, 90, 180)
>>> print(var[0,10:14,10:14])  # this will download data from the server
[[-1.26285707951 -- -- --]
 [-0.769166648388 -0.77999997139 -0.675454497337 -0.595714271069]
 [0.128333330154 -0.0500000156462 -0.0636363625526 -0.141666665673]
 [0.638000011444 0.895384609699 0.721666634083 0.810000002384]]
>>> print(var)
<type 'netCDF4.Variable'>
float32 SST('TIME', 'COADSY', 'COADSX')

NetCDF demo

Get metadata information about the following data set:

• ncdump -h http://opendap.bom.gov.au:8080/thredds/dodsC/nmoc/oceanmaps2_ofam_fc/latest/ocean_fc_20111108_000_surface.nc

NCO Tools

The netCDF Operators (NCO) comprise a dozen standalone, command-line programs that take netCDF files as input, then operate (e.g., derive new data, average, print, hyperslab, manipulate metadata) and output the results to screen or files in text, binary, or netCDF formats. NCO aids manipulation and analysis of gridded scientific data.

• http://nco.sourceforge.net/

To install NCO tools, go to

• http://nco.sourceforge.net/#Binaries

NCO demo

Download the initial conditions for a regional ocean model, using longitude and latitude ranges for the dimensions:

• ncks -O -F -d xt_ocean,143.55,176.66 -d yt_ocean,-28.35,4.75 http://opendap.bom.gov.au:8080/thredds/dodsC/oceanmaps_access_analysis_ogcm/temp/2010/ocean_an_20100312_temp.nc -o ocean_temp2_2010_03_12.nc

Are the files the same (dimensions and lon/lat range)?

• ncdump -v xt_ocean ocean_temp_2010_03_12.nc

• ncdump -v xt_ocean ocean_temp2_2010_03_12.nc
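When the -d limits contain a decimal point, ncks treats them as coordinate values rather than indices and finds the matching index range itself. Other clients that only accept index ranges need that lookup done by hand. The sketch below shows the idea conceptually; it is not NCO's implementation, the function name is hypothetical, and the 1-degree longitude axis is invented for illustration (the real xt_ocean axis should be read from the data set).

```python
from bisect import bisect_left, bisect_right


def index_range(axis, lower, upper):
    """Return (first, last) zero-based indices of axis values in [lower, upper].

    axis must be sorted in ascending order.
    """
    first = bisect_left(axis, lower)
    last = bisect_right(axis, upper) - 1
    if first > last:
        raise ValueError("no axis values fall inside the requested range")
    return first, last


# Hypothetical 1-degree longitude axis from 140.0 to 180.0 (not the real grid).
xt_ocean = [140.0 + i for i in range(41)]

first, last = index_range(xt_ocean, 143.55, 176.66)
print("xt_ocean[%d:1:%d]" % (first, last))
print(xt_ocean[first], xt_ocean[last])
```

The resulting index range can be placed directly in a DAP2 constraint expression, after first downloading only the coordinate axis.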


Tutorial: Pick a demo to try

Please select from the Pydap, NetCDF, and NCO demos:

1. Install the software on your machine

2. Run a test case and see if the software is installed correctly

3. Access a different file from a TDS or Hyrax data service

4. Get the metadata information

5. Get the coordinate axes data

6. Get a subset of data from an array

Thank you

Authors:

Tim F. Pugh (1), James Gallagher (2), Dave Fulker (3)

(1) Australian Bureau of Meteorology, Melbourne, Australia, [email protected]

(2) OPeNDAP, Butte, Montana, USA, [email protected]

(3) OPeNDAP, Boulder, Colorado, USA, [email protected]