[SciPy-user] Fast saving/loading of huge matrices
Francesc Altet
faltet at carabos.com
Fri Apr 20 03:30:15 EDT 2007
On Fri, 20 Apr 2007 at 08:24 +0200, Gael Varoquaux wrote:
> I agree that PyTables lacks a really simple interface. Say, something that
> dumps a dict to an HDF5 file, and vice versa (although HDF5 -> dict is a
> bit harder, as not all the HDF5 types may convert nicely to Python types).
As I said before, get used to recarrays. If you have reasons for sticking
with dictionaries, it is straightforward to convert a dict into a
recarray. For example:
>>> v1=numpy.random.rand(10,)
>>> v2=numpy.random.randint(10, size=10)
>>> mydict={'v1':v1,'v2':v2}
# Conversion to a recarray begins
>>> cols = [col for col in mydict.itervalues()]
>>> ratype = [(name, col.dtype) for (name, col) in mydict.iteritems()]
>>> ra=numpy.rec.fromarrays(cols, dtype=ratype)
# Now you can proceed to saving (and reading) the data
# (f is a PyTables file handle that is already open for writing)
>>> tra3=f.createTable('/', 'ra3', ra)
>>> tra3[:]
array([(0.71896141583591389, 3), (0.6147395923362261, 8),
       (0.74390300993242819, 8), (0.85740583591803832, 8),
       (0.058988577053635471, 4), (0.33839332688847212, 9),
       (0.3847836118934358, 2), (0.0072535131033339972, 5),
       (0.42023038711482563, 5), (0.26398728887523382, 6)],
      dtype=[('v1', '<f8'), ('v2', '<i4')])
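
For readers on Python 3, where itervalues() and iteritems() no longer exist, the dict-to-recarray conversion above might be sketched roughly as follows. This is only an illustration under assumptions: the seeded random generator and the sample values are made up for the example, and the PyTables saving step is left out so that the block needs nothing but NumPy.

```python
import numpy as np

# Hypothetical sample data standing in for v1 and v2 above.
rng = np.random.default_rng(0)
mydict = {
    "v1": rng.random(10),
    "v2": rng.integers(0, 10, size=10),
}

# Take names and columns from a single pass over the dict so that
# each column is guaranteed to be paired with its own name.
names = list(mydict)
cols = [mydict[name] for name in names]
ratype = [(name, col.dtype) for name, col in zip(names, cols)]

# Build the recarray; each dict key becomes a named field.
ra = np.rec.fromarrays(cols, dtype=ratype)
print(ra.dtype.names)
```

The resulting recarray can be handed to PyTables in the same way as in the session above. Note that PyTables 3.0 and later spell the method create_table rather than createTable.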
[snip]
> Similarly, I have some Python code to dump a dict of arrays to an HDF5
> file:
>
> """
> def dic_to_h5(filename, dic):
>     """ Saves all the arrays in a dictionary to an hdf5 file.
>     """
>     out_file = tables.openFile(filename, mode = "w")
>     for key, value in dic.iteritems():
>         if isinstance( value, ndarray):
>             out_file.createArray('/', str(key), value)
>     out_file.close()
> """
>
> This code is not general enough to go in pytables, but if the list wants
> to improve it a bit, then we could propose it for inclusion, or at least
> put it on the cookbook.
Yeah, there are infinite possibilities in that regard. However, I think
that there is a beauty in keeping the values of a dictionary (or fields
in a recarray) tied together in a table. This approach has proven to be
very powerful in many situations (but, of course, users have to decide
on the best way to arrange their own data).
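
One way to see what keeping the fields tied together buys is row selection. The following is a minimal NumPy-only sketch, with made-up sample values, and is not meant as the way PyTables itself runs queries: a single condition on one field of a recarray returns whole rows, whereas with a plain dict the same mask has to be applied to every array by hand.

```python
import numpy as np

# Hypothetical sample data: two fields that describe the same ten rows.
v1 = np.linspace(0.0, 0.9, 10)
v2 = np.arange(10)
ra = np.rec.fromarrays([v1, v2], names="v1,v2")

# One boolean mask on v2 selects whole rows, so v1 stays aligned with v2.
selected = ra[ra.v2 >= 7]

# With a plain dict, the same mask must be applied to each array in turn.
mydict = {"v1": v1, "v2": v2}
mask = mydict["v2"] >= 7
selected_dict = {name: col[mask] for name, col in mydict.items()}

print(selected.v1, selected.v2)
```

Both approaches give the same values here; the difference is that the recarray cannot get out of step, while the dict relies on every array being filtered consistently.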
Cheers,
--
Francesc Altet | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com | I haven't tested it. -- Donald Knuth