[Numpy-discussion] fast numpy i/o
Derek Homeier
derek at astro.physik.uni-goettingen.de
Mon Jun 27 12:17:45 EDT 2011
On 21.06.2011, at 8:35PM, Christopher Barker wrote:
> Robert Kern wrote:
>> https://raw.github.com/numpy/numpy/master/doc/neps/npy-format.txt
>
> Just a note. From that doc:
>
> """
> HDF5 is a complicated format that more or less implements
> a hierarchical filesystem-in-a-file. This fact makes satisfying
> some of the Requirements difficult. To the author's knowledge, as
> of this writing, there is no application or library that reads or
> writes even a subset of HDF5 files that does not use the canonical
> libhdf5 implementation.
> """
>
> I'm pretty sure that the NetcdfJava libs, developed by Unidata, use
> their own home-grown code. netcdf4 is built on HDF5, so that qualifies
> as "a library that reads or writes a subset of HDF5 files". Perhaps
> there are lessons to be learned there. (too bad it's Java)
>
> """
> Furthermore, by
> providing the first non-libhdf5 implementation of HDF5, we would
> be able to encourage more adoption of simple HDF5 in applications
> where it was previously infeasible because of the size of the
> library.
> """
>
> I suppose this point is still true -- a C lib that supported a subset of
> hdf would be nice.
>
> That being said, I like the simplicity of the .npy format, and I don't
> know that anyone wants to take any of this on anyway.
Some late comments on the note. I was a bit surprised that installing HDF5 seems to be a serious hurdle for many; maybe I have just been profiting from the fink build system for OS X here. I was also not aware that the current netCDF is built to be backward compatible with the HDF5 standard, so something useful learnt again :-)
Some more confusion arose when I found that the Unidata netCDF distribution also includes C and Fortran versions:
http://www.unidata.ucar.edu/software/netcdf/
but these in fact also depend on HDF5 for netCDF 4 access. The Java version appears not to, but it only provides *read* access to those formats, so it would probably not be of much help anyway.
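To make the dependency a bit more concrete: as far as I know, a netCDF 4 file is an HDF5 container on disk, while netCDF 3 uses its own classic layout, and the two can be told apart by the first bytes of the file. A minimal sketch in Python, standard library only. The function name sniff_netcdf_flavour is made up for illustration, and the check assumes the HDF5 signature sits at offset 0 (no user block in front of it):

```python
# Assumption: signatures are checked at offset 0 only.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"   # 8-byte HDF5 superblock signature


def sniff_netcdf_flavour(path):
    """Guess the on-disk container of a netCDF file from its leading bytes.

    Returns "hdf5" (netCDF 4 style), "netcdf3" (classic or 64-bit offset)
    or "unknown". Hypothetical helper, not part of any netCDF library.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    # Classic netCDF starts with "CDF" plus a version byte (1 or 2).
    if head[:3] == b"CDF" and head[3:4] in (b"\x01", b"\x02"):
        return "netcdf3"
    return "unknown"
```

A file reported as "hdf5" here would need an HDF5-capable reader, which is exactly where the libhdf5 dependency comes back in.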
The netCDF4-Python package mentioned before
http://code.google.com/p/netcdf4-python/
unfortunately builds on HDF5 again, as does the PyNIO module
http://www.pyngl.ucar.edu/Nio.shtml
which is probably explained by the above dependencies.
Finally, the former Scientific.IO NetCDF interface is now part of scipy.io, but I assume it only supports netCDF 3 (the documentation is not specific about that). This might be the easiest option for a portable data format (if Matlab supports it).
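For what it is worth, a round trip through the scipy.io interface might look like the sketch below. It assumes scipy.io.netcdf_file is available and that its default is the classic netCDF 3 format; the helper name roundtrip_netcdf3, the dimension name "n" and the variable name "values" are made up for illustration:

```python
import numpy as np
from scipy.io import netcdf_file


def roundtrip_netcdf3(values, path):
    """Write a 1-D float array to a netCDF 3 file and read it back.

    Hypothetical helper for illustration only.
    """
    data = np.asarray(values, dtype=np.float64)

    # Mode "w" creates a new file; assumed to be classic netCDF 3 by default.
    out = netcdf_file(path, "w")
    out.createDimension("n", int(data.size))
    var = out.createVariable("values", "d", ("n",))
    var[:] = data
    out.close()

    # mmap=False so the returned array does not depend on the open file.
    src = netcdf_file(path, "r", mmap=False)
    result = src.variables["values"][:].copy()
    src.close()
    return result
```

Calling roundtrip_netcdf3([1.0, 2.5, -3.0], "demo.nc") should hand back the same three values, and the resulting file needs no HDF5 library to read.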
Cheers,
Derek