[Numpy-discussion] ANN: HDF5 for Python (h5py) 2.2 BETA

Andrew Collette andrew.collette at gmail.com
Thu Jul 18 09:23:59 EDT 2013

Announcing HDF5 for Python (h5py) 2.2.0 BETA

We are proud to announce that HDF5 for Python 2.2.0 (beta) is now available.
Because this release contains a large number of new features, we are actively
seeking community feedback during the two-week beta period.

The h5py package is a Pythonic interface to the HDF5 binary data format.

It lets you store huge amounts of numerical data, and easily manipulate that
data from NumPy. For example, you can slice into multi-terabyte datasets
stored on disk, as if they were real NumPy arrays. Thousands of datasets can
be stored in a single file, categorized and tagged however you want.
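
As a rough illustration of the slicing described above, here is a minimal
sketch. The file name, dataset name, shape and the "units" attribute are all
made up for the example, and the dataset is kept tiny so it runs quickly; the
same syntax applies to datasets far larger than memory.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.h5")  # hypothetical file name

    # Create a dataset on disk and tag it with an attribute.
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("measurements", shape=(1000, 100), dtype="f8")
        dset[0:10, :] = np.arange(1000).reshape(10, 100)
        dset.attrs["units"] = "volts"

    # Reopen the file and read only the slice we ask for.
    with h5py.File(path, "r") as f:
        dset = f["measurements"]
        column = dset[0:10, 5]       # comes back as an ordinary NumPy array
        units = dset.attrs["units"]
```

Only the requested rows and column are read from the file; the rest of the
dataset stays on disk.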

H5py uses straightforward NumPy and Python metaphors, like dictionary and
NumPy array syntax. For example, you can iterate over datasets in a file, or
check out the .shape or .dtype attributes of datasets. You don't need to know
anything special about HDF5 to get started.
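
The dictionary-style access mentioned above might look like the following
sketch. The dataset names are invented for the example, and the file is kept
purely in memory (the "core" driver with backing_store=False) so nothing is
written to disk.

```python
import h5py
import numpy as np

with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    # Assigning an array to a name creates a dataset, as with a dict.
    f["temperature"] = np.zeros((4, 3), dtype="f4")
    f["counts"] = np.arange(10, dtype="i8")

    # Iterating over the file yields the names of its members.
    summary = {name: (f[name].shape, f[name].dtype) for name in f}
```

Here summary maps each dataset name to its (shape, dtype) pair, using the same
attribute names as a NumPy array.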

Documentation and download links are available at:


Parallel HDF5

This version of h5py introduces support for MPI/Parallel HDF5, using the
mpi4py package.  Parallel HDF5 is the native method for sharing
files and objects across multiple processes.  Unlike "multiprocessing"
based solutions, all processes in an MPI-based program can read
from and write to the same shared HDF5 file.
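
A minimal sketch of what this can look like follows. It assumes an h5py build
with MPI support and an installed mpi4py; the function name, file name and
dataset name are hypothetical.

```python
import h5py


def write_rank_ids(path="parallel_test.hdf5"):
    """Have every MPI process write its rank into one shared dataset."""
    # Imported here so that merely loading this module does not require MPI.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    with h5py.File(path, "w", driver="mpio", comm=comm) as f:
        # Creating a dataset changes the file structure, so every process
        # must make this call with the same arguments.
        dset = f.create_dataset("rank_ids", (comm.size,), dtype="i")
        # Each process then writes only its own element.
        dset[comm.rank] = comm.rank
```

A script that calls write_rank_ids() would be launched through the MPI
launcher, for example "mpiexec -n 4 python script.py"; the launcher name
varies between MPI installations.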

There is a guide to using Parallel HDF5 at the h5py web site:


Other new features

* Support for Python 3.3
* Support for 16-bit "mini" floats
* Access to the HDF5 scale-offset filter
* Field names are now allowed when writing to a dataset
* Region references now preserve the shape of their selections
* File-resident "committed" types can be linked to datasets and attributes
* A new "move" method on Group objects
* Many new options for Group.copy
