[SciPy-user] Fast saving/loading of huge matrices
Pauli Virtanen
pav at iki.fi
Fri Apr 20 12:43:36 EDT 2007
Fri, 20 Apr 2007 08:24:20 +0200, Gael Varoquaux wrote:
>
> I agree that PyTables lacks a really simple interface. Say, something that
> dumps a dict to an HDF5 file, and vice versa (although hdf5 -> dict is a
> bit harder, as all the HDF5 types may not convert nicely to Python types).
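The dict -> HDF5 direction is mostly a matter of mapping nested keys onto
'/'-separated group paths. A minimal sketch of that flattening in pure
Python (no HDF5 involved; the function names are mine, not from any library):

```python
def flatten(d, prefix=''):
    """Map a nested dict onto flat, HDF5-style '/'-separated paths."""
    out = {}
    for key, value in d.items():
        path = prefix + '/' + str(key)
        if isinstance(value, dict):
            # Nested dicts become sub-groups, i.e. deeper path components.
            out.update(flatten(value, path))
        else:
            out[path] = value
    return out

def unflatten(flat):
    """Inverse: rebuild the nested dict from '/'-separated paths."""
    out = {}
    for path, value in flat.items():
        parts = path.strip('/').split('/')
        node = out
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return out

d = {'a': 1, 'b': {'c': 2}}
flat = flatten(d)       # {'/a': 1, '/b/c': 2}
```

The hard half is, as noted above, the way back: leaf values read from HDF5
may come out as HDF5/NumPy types rather than the original Python ones.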
In a different attempt to make storing stuff in PyTables easier,
I wrote a library to dump and load arbitrary objects directly to HDF5 files:
http://www.iki.fi/pav/software/hdf5pickle/index.html
It uses the pickle protocol to interface with Python, but unrolls
objects so that they are stored in "native" PyTables formats where
possible, instead of as pickled strings.
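The "unrolling" builds on what the pickle protocol already exposes:
__reduce_ex__ decomposes an object into a reconstructor, its arguments, and
a state dict, which a storer can walk and write field by field instead of
serializing into one opaque string. A rough illustration of the idea (not
hdf5pickle's actual code):

```python
class Foo(object):
    def __init__(self):
        self.a = [1.0, 2.0, 3.0]
        self.b = 12345

foo = Foo()

# The pickle protocol hands back the pieces of the object: a
# reconstructor callable, its arguments, and the instance state.
rec = foo.__reduce_ex__(2)
reconstructor, args, state = rec[0], rec[1], rec[2]

# 'state' is the instance __dict__, one entry per attribute -- each
# entry could be written as a native HDF5 dataset.
print(state)

# Rebuilding from the decomposed parts:
clone = reconstructor(*args)
clone.__dict__.update(state)
```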
It's a bit rough around the edges and a bit slow, but it works. (Also, the
usual security issues associated with pickling still apply...)
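That caveat is worth taking seriously: unpickling invokes whatever callable
the stream names, so loading untrusted data can execute arbitrary code. A
small, harmless demonstration using only the standard library (a hostile
stream could name os.system here instead of os.getcwd):

```python
import os
import pickle

class Payload(object):
    # __reduce__ tells pickle how to rebuild the object; pickle will
    # call whatever callable it names at load time.
    def __reduce__(self):
        return (os.getcwd, ())

data = pickle.dumps(Payload())
result = pickle.loads(data)  # invokes os.getcwd() during loading
print(result)                # not a Payload at all: the callable's return value
```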
For example:
    import numpy as N
    import hdf5pickle, tables

    class Foo(object):
        def __init__(self, c):
            self.a = N.array([1, 2, 3, 4, 5], float)
            self.b = 12345
            self.c = c

    foo = Foo(N.array([[1+2j, 3+4j]]))

    f = tables.openFile('test.h5', 'w')
    hdf5pickle.dump(foo, f, '/foo')
    f.close()

    f = tables.openFile('test.h5', 'r')
    foo2 = hdf5pickle.load(f, '/foo')
    f.close()

    assert N.all(foo.a == foo2.a)
    assert N.all(foo.b == foo2.b)
... meanwhile, in the shell ...
$ h5ls -dvr test.h5
/foo Group
/foo/__ Group
/foo/__/args Dataset {1}
Data:
(0) 0
/foo/__/cls Dataset {12}
Data:
(0) 95, 95, 109, 97, 105, 110, 95, 95, 10, 70, 111, 111
/foo/a Dataset {5}
Data:
(0) 1, 2, 3, 4, 5
/foo/b Dataset {SCALAR}
Data:
(0) 12345
/foo/c Dataset {1, 2}
Data:
(0,0) {1, 2}, {3, 4}
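Note that even the /foo/__/cls bookkeeping dataset is stored natively: the
twelve byte values in the listing above are just the module and class name.
Decoding them (stdlib only):

```python
# Byte values copied from the /foo/__/cls dataset in the h5ls listing.
cls_bytes = [95, 95, 109, 97, 105, 110, 95, 95, 10, 70, 111, 111]

# Each value is an ASCII code; joined up they name the class.
decoded = ''.join(map(chr, cls_bytes))
print(repr(decoded))  # -> '__main__\nFoo'
```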
--
Pauli Virtanen