Life Cycle of HDF5 Data
Boeing
September 19, 2006
Overview
• "Life cycle" of HDF5 data
• I/O operations for datasets with different storage layouts
  • Compact dataset
  • Contiguous dataset
    • Datatype conversion
    • Partial I/O for contiguous datasets
  • Chunked dataset
    • I/O for chunked datasets
• Variable-length datasets and I/O
"Life cycle" of HDF5 data
• Life cycle: what happens to data when it is transferred from the application buffer to the HDF5 file?

[Diagram: application data buffer → H5Dwrite → the "magic box" (object API → library internals → virtual file I/O) → unbuffered I/O → data in a file or other "storage"]
"Life cycle" of HDF5 data: inside the magic box
• Operations on data inside the magic box
  • Datatype conversion
  • Scattering/gathering
  • Data transformation (filters, compression)
  • Copying to/from internal buffers
• Concepts involved
  • HDF5 metadata, metadata cache
  • Chunking, chunk cache
• Data structures used
  • B-trees (groups, dataset chunks)
  • Hash tables
  • Local and global heaps (variable-length data: link names, strings, etc.)
"Life cycle" of HDF5 data: inside the magic box
• Understanding what happens to data inside the magic box helps in writing efficient applications
• The HDF5 library has mechanisms to control behavior inside the magic box
• Goals of this talk and the next:
  • Introduce the basic concepts and internal data structures and explain how they affect performance and storage sizes
  • Give some "recipes" for improving performance
Operations on data inside the magic box
• Datatype conversion
  • Examples:
    • float ↔ integer
    • little-endian ↔ big-endian
    • 64-bit integer to 16-bit integer (overflow may occur!)
• Scattering/gathering
  • Data is scattered/gathered between the user's buffers and internal buffers for datatype conversion and partial I/O
• Data transformation (filters, compression)
  • Checksum on raw data and metadata (in 1.8.0)
  • Algebraic transform
  • GZIP and SZIP compression
  • User-defined filters
• Copying to/from internal buffers
"Life cycle" of HDF5 data: inside the magic box
• HDF5 metadata
  • Information about HDF5 objects used by the library
  • Examples: object headers, B-tree nodes for groups, B-tree nodes for chunks, heaps, superblock, etc.
  • Usually small compared to raw data sizes (KB vs. MB-GB)
• Metadata cache
  • Space allocated to hold pieces of the HDF5 metadata
  • Allocated by the HDF5 library in the application's memory space
  • Cache behavior affects overall performance
  • Will be covered in the next talk
"Life cycle" of HDF5 data: inside the magic box
• Chunking mechanism
  • Chunking: a storage layout in which a dataset is partitioned into fixed-size multi-dimensional tiles, or chunks
  • Used for extendible datasets and for datasets with filters applied (checksum, compression)
  • The HDF5 library treats each chunk as an atomic object
  • Greatly affects performance and file sizes
• Chunk cache
  • Created for each chunked dataset
  • Default size 1 MB
HDF5 file structure

[Diagram: user block; file header (version number, etc.); root group with its symbol table; groups with symbol tables; objects; local heaps; global heap]
Writing a contiguous dataset of atomic type

[Diagram: dataset = metadata + data. The metadata holds the dataspace (rank 3; dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7), the datatype (IEEE 32-bit float), attributes (Time = 32.4, Pressure = 987, Temp = 56), and storage info (chunked, compressed)]
I/O operations for HDF5 datasets with different storage layouts
• Storage layouts
  • Compact
  • Contiguous
  • Chunked
• I/O performance depends on
  • Dataset storage properties
  • Chunking strategy
  • Metadata cache performance
  • Etc.
Writing a compact dataset

[Diagram: in application memory, the dataset header (datatype, dataspace, attributes) together with the data passes through the metadata cache to the file]

Raw data is stored within the dataset header.
Writing a contiguous dataset with no datatype conversion

[Diagram: the user buffer (a 5x4x7 matrix) in application memory is written directly to the dataset raw data in the file; the dataset header (datatype, dataspace, attributes) goes through the metadata cache]
Writing a contiguous dataset with datatype conversion

[Diagram: raw data passes from application memory through a 1 MB conversion buffer before being written to the dataset raw data in the file; the dataset header goes through the metadata cache]
Sub-setting of contiguous dataset: series of adjacent rows

[Diagram: M adjacent rows of an M x N dataset selected in application memory map to one contiguous region in the file]

• Data is contiguous in the file
• One I/O operation
Sub-setting of contiguous dataset: adjacent, partial rows

[Diagram: M partial rows of N elements each in application memory map to M separate locations in the file]

• Data is scattered in the file in M contiguous blocks
• Several small I/O operations
Sub-setting of contiguous dataset: extreme case, writing a column

[Diagram: a column selection in application memory maps to M single-element blocks scattered in the file]

• Data is scattered in the file in M contiguous blocks of 1 element each
• Several small I/O operations
Sub-setting of contiguous dataset: data sieve buffer

[Diagram: the M scattered 1-element blocks are gathered via memcpy into a 64 KB sieve buffer in memory, reducing the number of I/O operations]

• Data is gathered in a sieve buffer in memory (64 KB by default)
Performance tuning for contiguous datasets
• Datatype conversion
  • Avoid it for better performance
  • Use the H5Pset_buffer function to customize the conversion buffer size
• Partial I/O
  • Write/read in big contiguous blocks (at least the size of a file-system block)
  • Use H5Pset_sieve_buf_size to improve performance for complex subsetting
Possible tuning work
• Datatype conversion
  • Use of multiple threads for datatype conversion
• Partial I/O
  • OS vector I/O
  • Asynchronous I/O
Writing a chunked dataset

[Diagram: a dataset with dimension sizes X x Y x Z is partitioned into fixed-size multi-dimensional chunks of size X/4 x Y/2 x Z]
Extending a chunked dataset in any dimension
• Data can be added in any dimension
• Compression is applied to each chunk
• Datatype conversion is applied to each chunk
Writing a chunked dataset

[Diagram: chunks A, B, C of a chunked dataset pass through the chunk cache and the filter pipeline, landing scattered in the file]

• Each chunk is written as a contiguous blob
• Chunks may be scattered all over the file
• Compression is performed when a chunk is evicted from the chunk cache
• Other filters are applied when data goes through the filter pipeline (e.g. encryption)
Writing a chunked dataset

[Diagram: in application memory, each dataset header with its chunking B-tree nodes sits in the metadata cache; each chunked dataset (Dataset_1 … Dataset_N) has its own chunk cache]

• The size of the chunk cache is set per file (default size is 1 MB)
• Each chunked dataset has its own chunk cache
• A chunk may be too big to fit into the cache
• Memory use may grow if the application keeps opening datasets
Partial I/O for chunked dataset

[Diagram: a selection spanning chunks 1, 2, 3, and 4]

• Build a list of chunks and loop through the list; for each chunk:
  • Bring the chunk into memory
  • Map the selection in memory to the selection in the file
  • Gather elements into the conversion buffer and perform the conversion
  • Scatter elements back to the chunk
  • Perform conversion when the chunk is flushed from the chunk cache
• For each element, 3 memcpys are performed
Partial I/O for chunked dataset

[Diagram: chunk 3; elements participating in the I/O are gathered from the application buffer into the corresponding chunk via memcpy]
Partial I/O for chunked dataset

[Diagram: for chunk 3, data is gathered from the application buffer into the conversion buffer, then scattered into the chunk in the chunk cache; on eviction from the cache the chunk is compressed and written to the file]
Variable length datasets and I/O
• Examples of variable-length data
  • Strings
    A[0]   "the first string we want to write"
    ...
    A[N-1] "the N-th string we want to write"
  • Each element is a record of variable length
    A[0] (1,1,0,0,0,5,6,7,8,9)              length of the first record is 10
    A[1] (0,0,110,2005)
    ...
    A[N] (1,2,3,4,5,6,7,8,9,10,11,12,...,M)  length of the (N+1)-th record is M
Variable length datasets and I/O
• Variable-length data is described in an HDF5 application by:

typedef struct {
    size_t len;  /* length of the sequence */
    void  *p;    /* pointer to the sequence data */
} hvl_t;

• The base type can be any HDF5 type: H5Tvlen_create(base_type)
• ~20 bytes of overhead for each element
• Raw data cannot be compressed
Variable length datasets and I/O

[Diagram: elements in the application buffer point to global heaps in the file, where the actual raw data is stored]
Writing VL datasets

[Diagram: for a VL chunked dataset with a selected region, raw data is gathered through the conversion buffer into a global heap in the file; the dataset header and chunking B-tree nodes go through the metadata cache, and chunks go through the chunk cache and filter pipeline]
VL chunked dataset in a file

[Diagram: the file contains the dataset header, the chunking B-tree, the dataset chunks, and the raw data in global heaps]
Variable length datasets and I/O
• Hints
  • Avoid closing/opening a file while writing VL datasets
    • Global heap information is lost
    • Global heaps may have unused space
  • Avoid writing VL datasets interchangeably
    • Data from different datasets will be written to the same heap
  • If the maximum length of the records is known, use fixed-length records and compression
Example: Boeing time-segment library application
• Multiple extendible 1-dimensional arrays of variable-length records
• Uses HDF5 Packet Table APIs (H5PT)
• HDF5 features used
  • Chunked storage
  • Chunk cache
  • Compound and VL datatypes
  • Datatype conversion
  • Partial I/O
• Complexity affects performance
  • Performance tuning is needed
Thank you!
Questions?