[Numpy-discussion] saving incrementally numpy arrays
David Warde-Farley
dwf at cs.toronto.edu
Wed Aug 12 19:32:17 EDT 2009
On 12-Aug-09, at 7:11 PM, Juan Fiol wrote:
> Hi, I finally decided on the PyTables approach because it will be
> easier to work with the data later. I know this is not the right
> place, but maybe I can get some quick pointers. At each step I
> calculate a NumPy array of about 20 columns and a few thousand rows.
> I'd like to append all the rows without iterating over the
> NumPy array. Does anyone know what the "right" approach would be? I am
> looking for something simple; I do not need to keep the piece of the
> table after I put it into the h5file. Thanks in advance and regards, Juan
You'll probably want an EArray: call createEArray() on a new h5file, then
append to it.
http://www.pytables.org/docs/manual/ch04.html#EArrayMethodsDescr
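To make that concrete, here is a minimal sketch of the create-then-append pattern. It assumes PyTables 3.x, where the calls are spelled open_file() and create_earray() (the createEArray() name above is the older 2.x spelling). The node name "data", the float64 atom, the chunk sizes and the append_chunks() helper are all made up for illustration; only the 20 columns come from the question.

```python
import os
import tempfile

import numpy as np
import tables  # PyTables; assumes the 3.x API names (open_file, create_earray)

N_COLS = 20  # column count from the question; all other values are illustrative


def append_chunks(path, chunks):
    """Append each 2-D chunk (rows x N_COLS) to an extendable array on disk.

    Hypothetical helper: returns the total number of rows written.
    """
    with tables.open_file(path, mode="w") as h5file:
        # A 0 in the shape marks the dimension that is allowed to grow.
        earray = h5file.create_earray(
            h5file.root, "data", atom=tables.Float64Atom(), shape=(0, N_COLS)
        )
        for chunk in chunks:
            # One call appends every row of the chunk; no per-row loop needed.
            earray.append(chunk)
        return int(earray.nrows)


# Usage: four chunks of a few thousand rows each, written to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "results.h5")
chunks = [np.random.rand(3000, N_COLS) for _ in range(4)]
total_rows = append_chunks(path, chunks)
```

Each chunk passed to append() has to match the EArray's shape in every dimension except the extendable one, so here every chunk needs exactly 20 columns.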
If your chunks are always the same size, it might be best to do your
work in place rather than allocate a new NumPy array each time. In
theory, del-ing the object when you're done with it should work, but
the garbage collector may not act quickly enough for your liking, or the
allocation step may start slowing you down.
What do I mean? Well, you could clear the array when you're done with
it using foo[:] = 0 (or nan, or whatever), and while you're "building it
up", use the in-place augmented assignment operators as much as possible
(+=, /=, -=, *=, %=, etc.).
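A rough sketch of that reuse pattern, with made-up sizes and a placeholder computation (fill_chunk() and its arithmetic are hypothetical, standing in for whatever you actually compute):

```python
import numpy as np

N_ROWS, N_COLS = 3000, 20  # illustrative sizes, not taken from real data

# Allocate the working buffer once, before the loop.
buf = np.zeros((N_ROWS, N_COLS))


def fill_chunk(buf, step):
    """Rebuild the chunk inside the existing buffer (placeholder arithmetic)."""
    buf[:] = 0     # clear the old contents without allocating a new array
    buf += step    # augmented assignment modifies buf in place
    buf *= 2.0     # still the same block of memory
    return buf


results = []
for step in range(1, 4):
    out = fill_chunk(buf, step)
    # out is the very same array object as buf; this is where you would
    # append it to the EArray before the next iteration overwrites it.
    results.append((out is buf, float(out[0, 0])))
```

By contrast, writing buf = buf + step would build a brand-new array on every pass and rebind the name, which is exactly the allocation the in-place operators avoid.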
David