The MathWorks introduced MATLAB support for HDF5 in 2002 via three high-level functions: HDF5INFO, HDF5READ, and HDF5WRITE. These functions worked well for their purpose, providing simple interfaces to a complicated file format, but MATLAB users requested finer control over their HDF5 files and the HDF5 library. MATLAB 7.3 (R2006b) adds this precise level of support for version 1.6.5 of the HDF5 library via a close mapping of the HDF5 C API to MATLAB function calls.
This presentation will briefly introduce the earlier, high-level HDF5 interface (and its limitations) before showing in detail the low-level HDF5 functions. It will show how to interact with the HDF5 library and files using the thirteen classes of functions in MATLAB, which encapsulate groupings of functionality found in the HDF5 C API. But because MATLAB is itself a higher-level language than C, we will also present MATLAB's extensions and modifications of the HDF5 C API that make it more MATLAB-like, work with defined values, and perform ID and memory management.
Wrapping a library like HDF5 requires a great deal of effort and design, and we will briefly present a general-purpose mechanism for creating close mappings between library interfaces and an application like MATLAB. One of our goals in this presentation is to facilitate communication with The HDF Group about how The MathWorks builds our HDF5 interfaces in order to ease adoption of future versions of the HDF5 library in large, general-purpose applications.
4. The World of HDF Applications
[Diagram labels: HDF4 / HDF5 APIs; API supported by MATLAB; high-level access functions; four customer applications.]
5. HDF5READ
DATA = HDF5READ(FILENAME,DATASETNAME) returns
in the variable DATA all data from the file FILENAME for the
data set named DATASETNAME.
DATA has to be extremely general because of the wide variety of datatypes that HDF5 accommodates.
More control is needed to match the uniqueness of customer datasets and files.
Simple access only:
● No subsetting.
● Limited datatype control.
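As a minimal sketch of the simple access described above, the following writes a small array to a temporary file and reads it back in full. The file name and the data set name /dataset1 are hypothetical, chosen only for illustration.

```matlab
% Hypothetical temporary file and data set name, for illustration only.
filename = [tempname '.h5'];
original = uint8(magic(5));
hdf5write(filename, '/dataset1', original);

% HDF5READ returns the entire data set; a subset cannot be requested.
data = hdf5read(filename, '/dataset1');

% Any subsetting therefore happens in MATLAB, after the full read.
firstColumn = data(:, 1);

delete(filename);
```

For a small array this is convenient, but for a large data set the full read has to fit in memory before any part of it can be discarded.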
6. HDF5INFO
FILEINFO = HDF5INFO(FILENAME) returns a structure
whose fields contain information about the contents of an
HDF5 file. FILENAME is a string that specifies the name of
the HDF file.
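A short sketch of interrogating a file with HDF5INFO follows. It assumes a file with one data set in the root group; the file name and the data set name /dataset1 are hypothetical.

```matlab
% Hypothetical temporary file with a single data set in the root group.
filename = [tempname '.h5'];
hdf5write(filename, '/dataset1', magic(4));

fileinfo = hdf5info(filename);

% The GroupHierarchy field describes the root group; each data set in a
% group has an entry in that group's Datasets array.
root = fileinfo.GroupHierarchy;
for k = 1:numel(root.Datasets)
    ds = root.Datasets(k);
    fprintf('%s: dims = %s\n', ds.Name, mat2str(ds.Dims));
end

delete(filename);
```

Nested groups appear in the Groups field of each group structure, so a full listing would walk that field recursively.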
7. HDF5WRITE
HDF5WRITE(FILENAME, LOCATION, DATASET) adds
the data in DATASET to the HDF5 file named FILENAME.
LOCATION defines where to write the DATASET in the file
and resembles a Unix-style path. The data in DATASET is
mapped to HDF5 datatypes using the rules below. . . .
HDF5WRITE is completely
symmetric with HDF5READ.
Objects disambiguate datatypes.
The values in DATASET are cumbersome
for non-native MATLAB types (e.g., arrays,
compound, and references).
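The sketch below illustrates the write/read symmetry and the use of objects for types that MATLAB does not represent natively. The file name and the data set names /temperatures and /units are hypothetical; the 'WriteMode' option is used so that the second call adds to the file rather than replacing it.

```matlab
% Hypothetical temporary file and data set names, for illustration only.
filename = [tempname '.h5'];

% The first call creates the file.
hdf5write(filename, '/temperatures', [20.5 21.0 19.8]);

% 'WriteMode','append' adds a second data set instead of overwriting.
hdf5write(filename, '/units', 'celsius', 'WriteMode', 'append');

temps = hdf5read(filename, '/temperatures');

% A string comes back as an hdf5.h5string object, not a char array;
% the character data is stored in its Data property.
units = hdf5read(filename, '/units');
unitText = units.Data;

delete(filename);
```

Numeric arrays round-trip as ordinary MATLAB arrays, while strings, compound types, and references need the extra step through an object, which is the cumbersome part noted above.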
8. Customer HDF5 Requests
Library upgrades (1.4.5, 1.6.4, 1.6.5, 1.8)
Better support for large data
Hyperselection, chunking
New platform support (Solaris 64, MacIntel)
GZIP, SZIP compression
HDF5 file interrogation
Bitfield, date/time datatypes
Data translators: HDF5 --> MATLAB
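Several of these requests (hyperslab selection, chunking, GZIP compression) map onto calls in the low-level interface. The following is a sketch only, assuming the H5F, H5S, H5P, and H5D function classes behave as in the HDF5 C API; the file name, the data set name /big, and the chunk and selection sizes are hypothetical, and details may differ between MATLAB releases.

```matlab
% Hypothetical file, data set name, and sizes, for illustration only.
filename = [tempname '.h5'];
data = reshape(1:60, 10, 6);            % 10-by-6 double array

% HDF5 uses C (row-major) dimension order, so MATLAB sizes are flipped.
h5dims = fliplr(size(data));

% Create a chunked, GZIP-compressed data set and write the whole array.
fid   = H5F.create(filename, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');
space = H5S.create_simple(2, h5dims, h5dims);
dcpl  = H5P.create('H5P_DATASET_CREATE');
H5P.set_chunk(dcpl, fliplr([5 3]));     % 5-by-3 chunks in MATLAB order
H5P.set_deflate(dcpl, 6);               % GZIP compression level 6
dset  = H5D.create(fid, '/big', 'H5T_NATIVE_DOUBLE', space, dcpl);
H5D.write(dset, 'H5ML_DEFAULT', 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', data);

% Select rows 3:6 and columns 2:4 in the file. Offsets are zero-based
% and, like the sizes, given in flipped order.
start = fliplr([2 1]);
count = fliplr([4 3]);
fileSpace = H5D.get_space(dset);
H5S.select_hyperslab(fileSpace, 'H5S_SELECT_SET', start, [], count, []);

% Read only the selected block into a matching memory space.
memSpace = H5S.create_simple(2, count, count);
subset = H5D.read(dset, 'H5ML_DEFAULT', memSpace, fileSpace, 'H5P_DEFAULT');

% Release identifiers in roughly the reverse order of creation.
H5S.close(memSpace);
H5S.close(fileSpace);
H5P.close(dcpl);
H5S.close(space);
H5D.close(dset);
H5F.close(fid);
delete(filename);
```

Only the 4-by-3 block is transferred from the file, which is the kind of control the high-level functions do not offer.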
12. Use Cases
Be able to drop in new versions of the HDF5 library when
they become available.
HDF5 1.8
13. Use Cases
Use a variety of esoteric HDF5 features at once:
“I'm trying to use HDF5 files [with] grouping
features like compound data types, group links,
and reference data types.”
24. The HDF Group and The MathWorks
Continue to communicate future directions.
Don't change the existing API functions.
Communicate API functionality changes.
Produce a machine-parsable version of hdf5.h.