[SciPy-User] Convert hdf5 file content to numpy array

Johann Rohwer jr at sun.ac.za
Wed Jul 20 04:23:35 EDT 2011


On Wednesday 20 July 2011, Thomas Königstein wrote:
> Hi all, attached to this email (or if the attachment doesn't show
> up, alternatively at
> http://dl.dropbox.com/u/15199/vs001_3d_particles.h5 ), you find a
> 400kb hdf5 file with a number of nodes, which I would like to
> "import" as numpy array. The code that I use so far (also
> attached, and here http://dl.dropbox.com/u/15199/mwe.py ) is this:
> 
> import tables
> 
> hdf5=tables.openFile("vs001_3d_particles.h5")
> root=hdf5.root
> 
> ptcls_names=[_ for _ in dir(root) if _.startswith("Electrons_at_PE_")]
> ptcls=[eval("root."+_) for _ in ptcls_names]
> for ptcl in ptcls:
>     ptcl_data=ptcl.read()
>     print type(ptcl_data)   # <type 'numpy.ndarray'>
>     print ptcl_data.dtype   # [('cell', '<i4'), ('x', '<f4'), ('y', '<f4'),
>                             #  ('z', '<f4'), ('px', '<f4'), ('py', '<f4'),
>                             #  ('pz', '<f4'), ('weight', '<f4'), ('q2m', '<f4')]
>     ptcl_data*=2       # TypeError: unsupported operand type(s) for *=:
>                        # 'numpy.ndarray' and 'int'
>     x=ptcl_data[:,1]   # I'd like to do stuff like that
> 
> Now type(ptcl_data) == numpy.ndarray, but the contents of the
> array are some kind of lists.
> How can I transform this weird, "custom/proprietary" data format
> into an ordinary numpy array, so that I can use operations such
> as node1*=2, slicing, and so on?

I use the h5py module for this, not the tables module. In short,

import numpy, h5py
f = h5py.File('myhdf5file.h5','r')
data = f.get('path/to/my/dataset')
data_as_array = numpy.array(data)

Then you have a normal numpy array with which you can work further.
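
One caveat: for a dataset with a compound dtype like the one printed in the question, the array you get back is still a structured (record) array, so ptcl_data*=2 and ptcl_data[:,1] will keep failing until the fields are converted to a common type. The following is a minimal sketch of that conversion, not a tested recipe for the attached file. It assumes a NumPy recent enough to ship numpy.lib.recfunctions.structured_to_unstructured (1.16 or later), and it uses a small made-up array in place of the real data; the names particle_dtype, plain and x_col are only illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same field layout as the dtype printed in the question.
particle_dtype = np.dtype([
    ('cell', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
    ('px', '<f4'), ('py', '<f4'), ('pz', '<f4'),
    ('weight', '<f4'), ('q2m', '<f4'),
])

# Stand-in for the result of ptcl.read() or numpy.array(data).
# The values are made up for illustration.
ptcl_data = np.zeros(3, dtype=particle_dtype)
ptcl_data['cell'] = [0, 1, 2]
ptcl_data['x'] = [0.5, 1.5, 2.5]

# In a structured array, columns are addressed by field name.
x = ptcl_data['x']

# Cast every field to float64 and stack the fields as columns,
# giving an ordinary 2-D array of shape (n_particles, n_fields).
plain = rfn.structured_to_unstructured(ptcl_data, dtype=np.float64, copy=True)

plain *= 2           # arithmetic now works on the whole array
x_col = plain[:, 1]  # ordinary slicing; column 1 is the 'x' field
```

If only a few columns are needed, working with ptcl_data['x'] directly avoids the copy and keeps the integer 'cell' field as an integer.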

HTH,
Johann


