Developing a NetCDF-4 Interface to HDF5 Data

Russ Rew (PI), UCAR Unidata; Mike Folk (Co-PI), NCSA/UIUC; Ed Hartnett, UCAR Unidata; Quincey Koziol, NCSA/UIUC; John Caron, UCAR Unidata; Robert E. McGrath, NCSA/UIUC. NASA award AIST-02-0071.

Page 1: Developing a NetCDF-4 Interface to HDF5 Data

Developing a NetCDF-4 Interface to HDF5 Data

Russ Rew (PI), UCAR Unidata
Mike Folk (Co-PI), NCSA/UIUC
Ed Hartnett, UCAR Unidata
Quincey Koziol, NCSA/UIUC
John Caron, UCAR Unidata
Robert E. McGrath, NCSA/UIUC

NASA award AIST-02-0071

Page 2: Developing a NetCDF-4 Interface to HDF5 Data

Unidata: A Community Endeavor

• Community of educators and researchers at 120 universities, 30 other institutions, international in scope

• Managed by the University Corporation for Atmospheric Research

• Mission: providing data, tools, support, and community leadership for enhanced earth-system education and research

• Atmospheric science community, expanding to oceanography, hydrology, other geosciences

• Unidata Program Center: 25 staff, 15 developers

[Diagram: Sources feeding a network of LDM nodes connected over the Internet]

[Diagram labels: Client Application; NetCDF 4 library API; HDF5 File; NetCDF V.1 and 2 File; OpenDAP Dataset (OpenDAP 4.0 protocol); NcML Dataset XML (virtual dataset); local file or HTTP protocol]

Page 3: Developing a NetCDF-4 Interface to HDF5 Data

Overview

• What is netCDF? What is HDF5?

• Why develop a netCDF interface to HDF5?

• What is the current project status?

• What still needs to be done?

• Do we have the necessary resources?

• What are the prospects for success?

Page 4: Developing a NetCDF-4 Interface to HDF5 Data

NetCDF-3 and HDF5

• Standard Data Models for scientific data and data abstractions

• Standard Interfaces between data providers and data users

• Standard Libraries for data access from various languages

• Standard Formats for portable binary data

• Users need not know about the format

Ad hoc standards are useful standards

Page 5: Developing a NetCDF-4 Interface to HDF5 Data

Data Models

netCDF-3            HDF5
Variables           Datasets
Dimensions          Dataspaces
Attributes          Attributes
Coordinates
Element types
                    Datatypes
                    Groups
                    Links
                    References
                    Property Lists
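To make the netCDF-3 side of this comparison concrete, here is a minimal Python sketch of that data model: named dimensions, variables defined over those dimensions, and attributes attached to variables or to the dataset as a whole. The class and field names are illustrative assumptions, not part of any netCDF or HDF5 API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

AttrValue = Union[str, int, float]


@dataclass
class Dimension:
    name: str
    size: Optional[int]  # None marks the unlimited (record) dimension


@dataclass
class Variable:
    name: str
    element_type: str                 # e.g. "float" or "short"
    dimensions: List[Dimension]
    attributes: Dict[str, AttrValue] = field(default_factory=dict)

    def shape(self, num_records: int = 0) -> tuple:
        """Current shape; the unlimited dimension takes the record count."""
        return tuple(
            num_records if d.size is None else d.size for d in self.dimensions
        )


@dataclass
class Dataset:
    dimensions: Dict[str, Dimension] = field(default_factory=dict)
    variables: Dict[str, Variable] = field(default_factory=dict)
    attributes: Dict[str, AttrValue] = field(default_factory=dict)


# Hypothetical example content.
time = Dimension("time", None)
lat = Dimension("lat", 3)
lon = Dimension("lon", 4)
temperature = Variable("temperature", "float", [time, lat, lon], {"units": "K"})
# A coordinate variable shares its name with the dimension it describes.
lat_var = Variable("lat", "float", [lat], {"units": "degrees_north"})
ds = Dataset(
    dimensions={d.name: d for d in (time, lat, lon)},
    variables={v.name: v for v in (temperature, lat_var)},
    attributes={"title": "example"},
)
```

In this sketch the HDF5-only concepts (groups, links, references, property lists) have no counterpart, which is the gap the comparison highlights.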

Page 6: Developing a NetCDF-4 Interface to HDF5 Data

Libraries

netCDF-3               HDF5
one interface level    high- and low-level interfaces
serial I/O             serial, parallel (MPI) I/O
C, C++                 C, C++
Fortran-77, -90        Fortran-90
Java (pure)            Java (native)
Perl
Python                 Python
Ruby
IDL                    IDL
Matlab                 Matlab
...

Page 7: Developing a NetCDF-4 Interface to HDF5 Data

Formats

netCDF-3                  HDF5
XDR                       XDR and native
direct access             direct access
efficiently extendible    efficiently extendible
32-bit file offsets       64-bit file offsets
                          chunked access
                          compound structures
                          nested structures
                          compression
                          efficient schema changes
                          virtual file I/O layer
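The file-offset difference matters because the width of a stored offset bounds how far into a file the format can address. The short arithmetic sketch below assumes offsets are held as signed integers, which is an assumption made here for illustration; the slide does not state signedness.

```python
def max_offset(bits: int, signed: bool = True) -> int:
    """Largest byte offset representable in an integer of the given width."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


GIB = 2 ** 30  # bytes per gibibyte

limit32 = max_offset(32)
limit64 = max_offset(64)

print(f"32-bit offsets reach about {limit32 / GIB:.2f} GiB")
print(f"64-bit offsets reach about {limit64 / GIB:.0f} GiB")
```

Under that assumption a 32-bit offset tops out just below 2 GiB, which is why large file support appears among the features that 64-bit offsets enable.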

Page 8: Developing a NetCDF-4 Interface to HDF5 Data

Other Characteristics

• Availability
  • NetCDF-3: free
  • HDF5: free

• Development and maintenance
  • NetCDF-3: UCAR Unidata
  • HDF5: NCSA HDF Group

• Primary funding
  • NetCDF-3: NSF
  • HDF5: NASA, DOE ASCI

• Advantages
  • NetCDF-3: popular, simple, lots of tools, multiple implementations
  • HDF5: powerful, high-performance, storage efficiency, extensibility

• Primary uses
  • NetCDF-3: climate, forecast, ocean models, data archives
  • HDF5: satellite data, computational fluid dynamics, parallel computing

Page 9: Developing a NetCDF-4 Interface to HDF5 Data

Goals of NetCDF/HDF Combination

• Create netCDF-4, combining desirable characteristics of netCDF-3 and HDF5, while taking advantage of their separate strengths
  • Widespread use and simplicity of netCDF-3
  • Generality and performance of HDF5

• Make netCDF more suitable for high-performance computing

• Provide simple high-level interface for HDF5

• Demonstrate benefits of combination in advanced Earth science modeling efforts

Page 10: Developing a NetCDF-4 Interface to HDF5 Data

NetCDF-4 Features Enabled by HDF5

• Large file support

• Parallel I/O

• Multiple dynamic dimensions

• Packed data, compression

• New data types

• Dynamic schema modifications

• Other possibilities: groups, user-defined types, better coordinate support, …
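As one way to picture the "packed data, compression" item: packing stores floating-point values as small integers plus a scale and an offset, and compression then shrinks the resulting bytes further. The sketch below uses only Python's standard `struct` and `zlib` modules. The names `scale_factor` and `add_offset` follow a widely used netCDF attribute convention, but `pack` and `unpack` are hypothetical helpers, not netCDF or HDF5 functions, and the byte layout is not the HDF5 storage format.

```python
import struct
import zlib


def pack(values, scale_factor, add_offset):
    """Pack floats into signed 16-bit integers: stored = round((v - offset) / scale)."""
    ints = [round((v - add_offset) / scale_factor) for v in values]
    for i in ints:
        if not -32768 <= i <= 32767:
            raise OverflowError(f"packed value {i} does not fit in 16 bits")
    return struct.pack(f"<{len(ints)}h", *ints)


def unpack(raw, scale_factor, add_offset):
    """Recover approximate floats: v = stored * scale + offset."""
    count = len(raw) // 2
    ints = struct.unpack(f"<{count}h", raw)
    return [i * scale_factor + add_offset for i in ints]


# Hypothetical temperature-like data with 0.01 resolution.
temps = [273.15 + 0.01 * (i % 50) for i in range(1000)]

packed = pack(temps, scale_factor=0.01, add_offset=273.15)
compressed = zlib.compress(packed, 6)
restored = unpack(zlib.decompress(compressed), 0.01, 273.15)

print(len(temps) * 8, "bytes as doubles")
print(len(packed), "bytes packed")
print(len(compressed), "bytes packed and compressed")
```

Packing is lossy down to the chosen resolution, while the compression step is lossless, so the two can be combined freely.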

Page 11: Developing a NetCDF-4 Interface to HDF5 Data

Approach

• Implement netCDF-3 over HDF5, to demonstrate backward compatibility with
  • Programming interface
  • Format

• Design netCDF-4 interface

• Implement netCDF-4 over HDF5 to add enhancements made possible with HDF5

• Foster continued collaboration between Unidata and NCSA in design, development, testing, and support

Page 12: Developing a NetCDF-4 Interface to HDF5 Data

NetCDF-4 Architecture

• Access to netCDF-3, netCDF-4, and HDF5 data created through netCDF-4 interface

[Diagram labels: netCDF-3 Interface; netCDF-4 Library; HDF5 Library]

Page 13: Developing a NetCDF-4 Interface to HDF5 Data

User View of NetCDF-4

• NetCDF-4 library accesses either the netCDF-3 or HDF5 library to read or write data

• NetCDF-4 user can write netCDF or HDF5 files

[Diagram: the netCDF-4 Prototype sits above the HDF5 Library and the netCDF-3 Library, which write an HDF5 file ("HDF5 is fun!") and a netCDF file ("netCDF is fun!") respectively]
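One plausible way for a library to choose between the two underlying libraries when reading is to inspect the first bytes of the file. The sketch below relies on two published facts: classic netCDF files begin with the bytes `CDF` followed by a version byte, and HDF5 files carry an 8-byte signature. It checks only offset 0 (HDF5 also permits the signature at later offsets), and the function names and returned backend labels are hypothetical, not how the netCDF-4 prototype is stated to work.

```python
import os
import tempfile

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_format(path):
    """Classify a file by its leading bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(b"CDF\x01"):
        return "netcdf-classic"
    if head.startswith(b"CDF\x02"):
        return "netcdf-64bit-offset"
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    return "unknown"


def choose_backend(path):
    """Hypothetical dispatcher: name the library that should handle the file."""
    backends = {
        "netcdf-classic": "netCDF-3 library",
        "netcdf-64bit-offset": "netCDF-3 library",
        "hdf5": "HDF5 library",
    }
    kind = detect_format(path)
    if kind not in backends:
        raise ValueError(f"unrecognized file format: {path}")
    return backends[kind]


# Demonstration with stub files that contain only a signature and padding.
with tempfile.TemporaryDirectory() as tmp:
    samples = {"old.nc": b"CDF\x01", "new.h5": HDF5_SIGNATURE}
    for name, magic in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(magic + b"\x00" * 24)
        print(name, "->", choose_backend(path))
```

Because the decision depends only on the file contents, a caller would not need to know which format a given file uses.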

Page 14: Developing a NetCDF-4 Interface to HDF5 Data

Current Technical Status

• Implement netCDF-3 over HDF5, to demonstrate backward compatibility with API and format: done

• Determine needed HDF5 enhancements: done

• Prepare netCDF-3 for incorporation with netCDF-4: nearly done

• Design netCDF-4 interface to add enhancements made possible with HDF5: in progress

• Implement needed HDF5 enhancements: in progress

• Implement netCDF-4 over enhanced HDF5: not started yet

Page 15: Developing a NetCDF-4 Interface to HDF5 Data

NetCDF-3 Interface Using HDF5

• 13,000 lines of C code

• Passes all netCDF-3 tests

• Demonstrates HDF5 practical for netCDF-4

• Identifies HDF5 enhancements needed

• Shows read/write times and file sizes satisfactory

• Validates approach to backward compatibility
  • API compatibility: only recompilation and relinking needed for existing netCDF-3 programs
  • Format compatibility: accesses all current netCDF files as well as new HDF5 files transparently

Page 16: Developing a NetCDF-4 Interface to HDF5 Data

NetCDF-3 Enhancements for NetCDF-4

• To provide
  • stable foundation for incorporating netCDF-4
  • smooth transition for current users

• Automated multi-platform testing

• Documentation converted to maintainable form, new language-independent Users Guide

• Added large file support with backward compatibility

• Added default format interfaces

• Better Windows and .Net support

Page 17: Developing a NetCDF-4 Interface to HDF5 Data

HDF5 Additions for Supporting NetCDF-4

• HDF5 enhancements

• numeric type conversions

• zero-dimensional datasets

• overflow handling improvements

• flexible parallel I/O

• HDF5 design specifications

• dimension scales for coordinate systems

• shared object proposal
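Numeric type conversion and overflow handling go together: when values supplied as one type are stored as a narrower type, out-of-range values need to be detected rather than silently wrapped. The netCDF-3 C library signals this condition with its `NC_ERANGE` error code. The sketch below is a hypothetical Python analogue with invented names (`convert`, `RangeError`); it is not library code and does not describe how the HDF5 enhancement is implemented.

```python
INT_RANGES = {
    "byte": (-128, 127),
    "short": (-32768, 32767),
    "int": (-2147483648, 2147483647),
}


class RangeError(ValueError):
    """Raised when a value cannot be represented in the target type."""


def convert(values, target):
    """Convert numbers to an integer type, truncating toward zero as a C cast would."""
    low, high = INT_RANGES[target]
    converted = []
    bad_indexes = []
    for index, value in enumerate(values):
        number = int(value)  # truncation toward zero
        if low <= number <= high:
            converted.append(number)
        else:
            bad_indexes.append(index)
    if bad_indexes:
        raise RangeError(f"values at indexes {bad_indexes} do not fit in {target}")
    return converted


print(convert([1.9, -2.5, 100.0], "byte"))
try:
    convert([12.0, 40000.0], "short")
except RangeError as err:
    print("range error:", err)
```

Reporting every offending index at once, rather than stopping at the first, is a design choice of this sketch.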

Page 18: Developing a NetCDF-4 Interface to HDF5 Data

Project Schedule

• July 2004: version 3.6.0 - revised documentation, 64-bit file offsets, default format functions

• October 2004: version 3.7.0 - use of autotools

• January 2005: version 3.7.1 - netCDF-4 prototype included, support for multiple unlimited dimensions

• March 2005: version 4.0.0_beta - test release

• July 2005: version 4.0.0 - first netCDF-4 production release

Currently on schedule for a July 2005 release

Page 19: Developing a NetCDF-4 Interface to HDF5 Data

NetCDF-4 Design Issues

• Issue: support for coordinate systems in netCDF and HDF5 data models? Under consideration.

• Issue: addition of HDF5 Groups abstraction to netCDF data model? Yes, tentatively:
  • subset of HDF5 Group features
  • constrained by backward compatibility with netCDF-3
  • no Group aliases, but try to support Variable aliases and Dimension scoping?

• Issue: can we just adopt Northwestern/Argonne pnetCDF interface for adding parallel I/O?
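One possible reading of "Dimension scoping" is that a dimension defined in a group is visible in that group and in all of its descendants. That reading is an assumption made here for illustration, since the slide leaves the design open. Under that assumption, lookup could walk from a group up through its parents, as in this sketch with a hypothetical `Group` class:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Group:
    name: str
    # Exclude the back-reference from repr and comparison to avoid cycles.
    parent: Optional["Group"] = field(default=None, repr=False, compare=False)
    dimensions: Dict[str, int] = field(default_factory=dict)
    children: Dict[str, "Group"] = field(default_factory=dict)

    def add_group(self, name: str) -> "Group":
        child = Group(name, parent=self)
        self.children[name] = child
        return child

    def find_dimension(self, name: str) -> int:
        """Return the size of the nearest enclosing definition of a dimension."""
        group = self
        while group is not None:
            if name in group.dimensions:
                return group.dimensions[name]
            group = group.parent
        raise KeyError(name)


# Hypothetical layout: a root-level "time" shared by a nested model group.
root = Group("/")
root.dimensions["time"] = 12
model = root.add_group("model_a")
model.dimensions["level"] = 5

print(model.find_dimension("time"))   # found in the root group
print(model.find_dimension("level"))  # found locally
```

A scheme like this would keep flat netCDF-3 files valid as a single root group, which fits the backward-compatibility constraint listed above.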

Page 20: Developing a NetCDF-4 Interface to HDF5 Data

What remains to be done?

• Next for netCDF-4: interface additions for multiple unlimited dimensions, group interfaces, dynamic schema modification, new data types, packed data, parallel I/O, compression

• HDF5 enhancements

• zero-length attributes

• shared dimensions

• creation order access for objects

• Testing in models (CCSM, WRF, ESMF, ...)

Page 21: Developing a NetCDF-4 Interface to HDF5 Data

Papers, Posters, Presentations

1. R. Rew, M. Folk, E. Hartnett, and R. McGrath: Plans for an Enhanced NetCDF-4 Interface to HDF5 Data. HDF/HDF-EOS Workshop VII, Silver Spring, September 2003. Poster and presentation.

2. M. Folk, R. Rew, K. Yang, R. McGrath: NetCDF-4: Combining netCDF and HDF5 Data. AGU Fall Meeting, San Francisco, December 2003. Poster.

3. R. Rew and E. Hartnett: Merging NetCDF and HDF5. 20th International Conference on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, Seattle, January 2004. Paper and poster.

4. E. Hartnett: Merging the NetCDF and HDF5 Libraries to Achieve Gains in Performance and Interoperability. 2004 Earth Science Technology Conference, Palo Alto, June 2004. Paper and presentation.

Page 22: Developing a NetCDF-4 Interface to HDF5 Data

Excellent Prospects for Success

• More software engineering than research

• NetCDF-4 web site just announced:
  • www.unidata.ucar.edu/packages/netcdf/netcdf-4/

• Unidata and NCSA developers collaborating via email, teleconferences

• On schedule for July 2005 release:
  • www.unidata.ucar.edu/packages/netcdf/release_schedule.html

• Great interest in status of project! Ultimate goal to make earth science researchers more productive ...

Page 23: Developing a NetCDF-4 Interface to HDF5 Data

Questions?