OPeNDAP Developer's Meeting 2007

Implementing the Data Access Protocol in Python Dr. Rob De Almeida

Upload: rob-de-almeida

Post on 14-Oct-2014

DESCRIPTION

Presentation given at the OPeNDAP Developer's Meeting in Boulder, CO, 2007.

TRANSCRIPT

Page 1: OPeNDAP Developer's Meeting 2007

Implementing the Data Access Protocol in Python

Dr. Rob De Almeida

Page 2: OPeNDAP Developer's Meeting 2007

Table of Contents

● History
● Current implementation
    ● Client
    ● Server
    ● Plugins & responses
    ● WSGI & Paste
● Future

Page 3: OPeNDAP Developer's Meeting 2007

History

● pyDAP is a free implementation of the Data Access Protocol written in Python from scratch

● It is the product of naïveté and determination :)

Page 4: OPeNDAP Developer's Meeting 2007

Why Python?

● Object-oriented, high-level programming language that prioritizes saving programmer effort over saving computer effort

● Increasing usage in science (CDAT, MayaVi) and web (Google, YouTube)

● Advantages: interpreter, batteries included, easy prototyping, dynamically typed, concise, fun

Page 5: OPeNDAP Developer's Meeting 2007

pyDAP 1.0

● Started in 2003
● “Afternoon project”: client only, downloaded data from ASCII response and worked only with Grids and Arrays
● Reverse-engineering of the protocol
● Should've really been version 0.0.1

Page 6: OPeNDAP Developer's Meeting 2007

pyDAP 1.x

● Binary data using Python's xdrlib
● Server architecture based on a common core that could run as CGI, Twisted or using Python's BaseHTTPServer
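As an illustration of what the binary encoding involves, here is a minimal sketch of XDR-style packing for an array of 32-bit integers. It uses the standard `struct` module rather than `xdrlib` (which has been removed from recent Python releases), and the function names are hypothetical; this is not pyDAP's own code, and the full DAP framing around the array is left out.

```python
import struct


def xdr_pack_int32_array(values):
    """Encode a sequence of ints as an XDR variable-length array of int32."""
    # XDR is big-endian and every item occupies a multiple of 4 bytes.
    header = struct.pack(">I", len(values))            # element count
    body = struct.pack(">%di" % len(values), *values)  # signed 32-bit items
    return header + body


def xdr_unpack_int32_array(data):
    """Decode the bytes produced by xdr_pack_int32_array."""
    (count,) = struct.unpack_from(">I", data, 0)
    return list(struct.unpack_from(">%di" % count, data, 4))


encoded = xdr_pack_int32_array([0, 1, 2])
decoded = xdr_unpack_int32_array(encoded)
```

Each value takes exactly four bytes in network byte order, which is why a hand-written encoder can stay this small.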

Page 7: OPeNDAP Developer's Meeting 2007

pyDAP 2.0

● Complete rewrite, based on the DAP 2.0 specification draft

● Developed during the Google Summer of Code 2005

● Own implementation of XDR
● Server built based on WSGI specification*
● This should've been version 1.0

Page 8: OPeNDAP Developer's Meeting 2007

pyDAP 2.1

● Fully buffered server, able to handle infinite datasets

● Automatic discovery of plugins
● Automatic installation of dependencies
● Runs with Python Paste*

Page 9: OPeNDAP Developer's Meeting 2007

pyDAP 2.2.5.8

● Released last Friday (2007-02-16)
● Approximately 3k LOC for client and server, including docstrings, comments and its own XDR implementation

● Support for additional plugins (for new data formats) and responses (for new output) that are auto-discoverable

● Stub support for DDX on the client and server

Page 10: OPeNDAP Developer's Meeting 2007

Client

● Based on the httplib2 module
    ● HTTP / HTTPS
    ● Keep Alive
    ● Auth: digest, basic, WSSE, HMAC digest
    ● Caching
    ● Compression: deflate, gzip

● Intuitive interface

Page 11: OPeNDAP Developer's Meeting 2007

Sample client session

>>> # Local netCDF file, read with pynetcdf
>>> from pynetcdf import NetCDFFile
>>> dataset = NetCDFFile("coads.nc")
>>> sst = dataset.variables['SST']
>>> print sst.shape
(12, 90, 180)
>>> print sst.dimensions
('TIME', 'COADSY', 'COADSX')
>>> print sst[0,40,40]
28.0669994354

>>> # Same dataset on a remote server, read with the pyDAP client
>>> from dap.client import open
>>> dataset = open("http://server/coads.nc")
>>> sst = dataset['SST']
>>> print sst.shape
(12, 90, 180)
>>> print sst.dimensions
('TIME', 'COADSY', 'COADSX')
>>> print sst[0,40,40]
[[[ 28.06699944]]]
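Indexing a remote variable has to end up as a DAP constraint expression in the request URL, where each dimension is written as `[start:stride:stop]` with an inclusive stop. The sketch below shows one way that translation could look; `hyperslab` is a hypothetical helper written for this illustration, not the function pyDAP uses internally, and the `.dods` URL is only an example.

```python
def hyperslab(name, *index):
    """Build a DAP 2 projection such as SST[0:1:0][40:1:40][40:1:40]."""
    parts = []
    for item in index:
        if isinstance(item, slice):
            if item.stop is None:
                raise ValueError("this sketch needs an explicit stop")
            start = item.start or 0
            step = item.step or 1
            stop = item.stop - 1  # Python stops are exclusive, DAP stops inclusive
        else:
            # A single integer index selects exactly one element.
            start, step, stop = item, 1, item
        parts.append("[%d:%d:%d]" % (start, step, stop))
    return name + "".join(parts)


# Hypothetical request URL for the element indexed in the session above.
url = "http://server/coads.nc.dods?" + hyperslab("SST", 0, 40, 40)
```

A request built this way returns a one-element, three-dimensional array, which is consistent with the `[[[ 28.06699944]]]` result in the remote session.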

Page 12: OPeNDAP Developer's Meeting 2007

Client usage

● Commonly used in scripts to automate downloading data from OPeNDAP servers and storing it in a different format

● Dapper-compliance validator for testing servers

Page 13: OPeNDAP Developer's Meeting 2007

Server

● “Writing a server is like writing a client backwards”

● Thin layer between plugins and responses (both auto-discoverable)

● Implemented as a WSGI application*
● Deployed using Paste Deploy*

Page 14: OPeNDAP Developer's Meeting 2007

Plugins and responses

Page 15: OPeNDAP Developer's Meeting 2007

Plugins and responses

http://localhost:8080/file.nc.das
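A rough sketch of how such a URL could be routed: the last suffix (`das`) names the response, and the extension of the remaining path (`.nc`) selects the plugin that reads the file. The registries and function names here are hypothetical placeholders; the real server discovers plugins and responses from installed packages.

```python
import posixpath

# Hypothetical registries, hard-coded only for this illustration.
PLUGINS = {".nc": "netcdf plugin", ".csv": "csv plugin"}
RESPONSES = {"dds", "das", "dods"}


def split_request(path):
    """Split '/file.nc.das' into the dataset path and the response name."""
    dataset, ext = posixpath.splitext(path)
    return dataset, ext.lstrip(".")


def dispatch(path):
    """Pick the plugin and response that would handle a request path."""
    dataset, response = split_request(path)
    if response not in RESPONSES:
        raise LookupError("unknown response: %r" % response)
    plugin = PLUGINS[posixpath.splitext(dataset)[1]]
    return plugin, dataset, response


result = dispatch("/file.nc.das")
```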

Page 16: OPeNDAP Developer's Meeting 2007

Installing plugins & responses

● pyDAP uses EasyInstall:
    ● easy_install dap.plugins.netcdf
    ● easy_install dap.responses.html
● Easy to create new plugins (for small values of “easy”):
    ● paster create -t dap_plugin myplugin
    ● Generates template with skeleton code
    ● New plugin can be easily distributed

Page 17: OPeNDAP Developer's Meeting 2007

Available plugins

● CSV
● netCDF (reference implementation)
● SQL (compatible with most databases but generates “flat” dataset)
● Matlab 4/5
● GrADS grib
● HDF5 and GDAL (experimental)
● grib2? (Rob Cermak)

Page 18: OPeNDAP Developer's Meeting 2007

Available responses

● dds, das, dods
● ASCII variant
● HTML form
● JSON
● WMS / KML
● EditGrid / Google Spreadsheets
● netCDF?

Page 19: OPeNDAP Developer's Meeting 2007

JSON

● Lightweight alternative to XML for data exchange

● Based on a subset of JavaScript
● Easy to parse in the browser
● Parsers and generators for C, C++, C#, Java, Lisp, Lua, Objective C, Perl, PHP, Python, Ruby, Squeak and several other languages
● Coincidentally, also (almost) a subset of Python
    ● JSON == valid Python code, except for the literals true, false and null

Page 20: OPeNDAP Developer's Meeting 2007

A JSON response

Content-description: dods_json
XDODS-Server: dods/2.0
Content-type: application/json

{"test": {"attributes": {"NC_GLOBAL": {},
                         "author": "Roberto De Almeida"},
          "type": "Dataset",
          "a": {"type": "Int32",
                "shape": [10],
                "data": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}}}

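To show how little a client needs to consume this response, the sketch below parses the body from the slide with Python's standard `json` module. Only the body is handled; fetching it over HTTP is left out. Because this particular body contains no `true`, `false` or `null`, `ast.literal_eval` happens to read it as a Python literal too.

```python
import ast
import json

# Body of the JSON response shown on the slide.
body = """
{"test": {"attributes": {"NC_GLOBAL": {},
                         "author": "Roberto De Almeida"},
          "type": "Dataset",
          "a": {"type": "Int32",
                "shape": [10],
                "data": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}}}
"""

dataset = json.loads(body)["test"]
variable = dataset["a"]
author = dataset["attributes"]["author"]

# The same text is also a valid Python literal in this case.
same_as_python = ast.literal_eval(body.strip()) == json.loads(body)
```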
Page 21: OPeNDAP Developer's Meeting 2007

WMS

● Returns maps (images) from requested variables and regions

● Works with geo-referenced grids and sequences

● Layers can be composed together
● Data can be constrained:
    ● /coads.nc.wms?SST // annual mean
    ● /coads.nc.wms?SST[0] // January

Page 22: OPeNDAP Developer's Meeting 2007

WMS example request

http://localhost:8080/netcdf/coads.nc.wms?LAYERS=SST&WIDTH=512
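A request like this one can be assembled with the standard library. The sketch below is only an illustration: `wms_url` is a hypothetical helper, and it sets just the two parameters visible in the example URL, without assuming which other WMS parameters the response accepts.

```python
from urllib.parse import urlencode


def wms_url(dataset_url, layers, width):
    """Append the .wms response suffix and a query string to a dataset URL."""
    # WMS lists several layers as comma-separated names; keep the comma literal.
    query = urlencode({"LAYERS": ",".join(layers), "WIDTH": width}, safe=",")
    return "%s.wms?%s" % (dataset_url, query)


url = wms_url("http://localhost:8080/netcdf/coads.nc", ["SST"], 512)
```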

Page 23: OPeNDAP Developer's Meeting 2007

KML

● Generates XML file using the Keyhole Markup Language, pointing to the WMS response

● Nice and simple interface for quickly visualizing data

Page 24: OPeNDAP Developer's Meeting 2007
Page 25: OPeNDAP Developer's Meeting 2007
Page 26: OPeNDAP Developer's Meeting 2007

WSGI

● Python Web Server Gateway Interface

● Simple and universal interface between web servers (like Apache) and web applications (like pyDAP)

● Allows the sharing of middleware between applications (gzip, authentication, caching, etc.)
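For readers who have not seen the interface, here is a minimal sketch of a WSGI application together with a piece of middleware that wraps it. The application is a stand-in, not pyDAP, and the middleware (which only adds a response header) is a hypothetical example of the kind of component that can be shared between applications.

```python
def app(environ, start_response):
    """A WSGI application: a callable taking the request environ and a callback."""
    body = b"Hello from a WSGI application\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def add_header(wrapped, name, value):
    """Middleware: wraps any WSGI application and adds one response header."""
    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            return start_response(status, headers + [(name, value)], exc_info)
        return wrapped(environ, patched_start_response)
    return middleware


application = add_header(app, "XDODS-Server", "dods/2.0")


def serve(host="127.0.0.1", port=8080):
    """Serve the wrapped application with the standard library's WSGI server."""
    from wsgiref.simple_server import make_server
    make_server(host, port, application).serve_forever()
```

Calling `serve()` starts a development server; any other WSGI-capable server could host `application` unchanged, which is the point of the interface.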

Page 27: OPeNDAP Developer's Meeting 2007

Before WSGI

Page 28: OPeNDAP Developer's Meeting 2007

After WSGI

Page 29: OPeNDAP Developer's Meeting 2007

Paste & Paste Deploy

● Python module that facilitates the development and deployment of web applications

● Allows the deployment of pyDAP using a simple INI file that specifies server, middleware and application configuration

Page 30: OPeNDAP Developer's Meeting 2007

Running a server

[server:main]
use = egg:PasteScript#wsgiutils
host = 127.0.0.1
port = 8080

[filter-app:main]
use = egg:Paste#httpexceptions
next = pyDAP

[app:pyDAP]
use = egg:dap
name = Test DAP server
root = %(here)s/data
verbose = 0
template = %(here)s/template
x-wsgiorg.throw_errors = 1
dap.responses.kml.format = image/png
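The `%(here)s` placeholder is filled in by Paste Deploy with the directory that contains the INI file. The sketch below imitates that substitution with the standard `configparser` module on a shortened copy of the configuration, purely to show how the values resolve. The directory `/srv/pydap` is an assumption made for the example, and this is not how Paste Deploy itself loads the file.

```python
import configparser

# Shortened copy of the configuration from the slide.
INI = """
[server:main]
use = egg:PasteScript#wsgiutils
host = 127.0.0.1
port = 8080

[app:pyDAP]
use = egg:dap
name = Test DAP server
root = %(here)s/data
template = %(here)s/template
"""

# Assumed location of the INI file; Paste Deploy supplies this value itself.
parser = configparser.ConfigParser(defaults={"here": "/srv/pydap"})
parser.read_string(INI)

root = parser.get("app:pyDAP", "root")
template = parser.get("app:pyDAP", "template")
port = parser.getint("server:main", "port")
```

In an actual deployment the file would normally be handed to `paster serve`, which reads the server, filter and application sections and wires them together.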

Page 31: OPeNDAP Developer's Meeting 2007

Future

● pyDAP 2.3 almost ready
    ● Dapper compliance
    ● Faster XDR encoding/decoding
    ● Initial support for DDX response and parser

● Build a rich web interface (AJAX) based on JSON + WMS + KML responses

● Not only to pyDAP, but to other OPeNDAP servers using pyDAP as a proxy

Page 32: OPeNDAP Developer's Meeting 2007

Acknowledgments

● OPeNDAP for all the support
● James Gallagher for answering all my questions about the spec on the mailing list
● Everybody who submitted bugs (bonus points for submitting patches!)