[Numpy-discussion] Loading a > GB file into array

Ivan Vilata i Balaguer ivilata at carabos.com
Fri Nov 30 13:19:38 EST 2007

Martin Spacek (on 2007-11-30 at 00:47:41 -0800) said::

> I find that if I load the file in two pieces into two arrays, say 1GB
> and 0.3GB respectively, I can avoid the memory error. So it seems that
> it's not that windows can't allocate the memory, just that it can't
> allocate enough contiguous memory. I'm OK with this, but for indexing
> convenience, I'd like to be able to treat the two arrays as if they were
> one. Specifically, this file is movie data, and the array I'd like to
> get out of this is of shape (nframes, height, width).

Well, one thing you could do is dump your data into a PyTables_
``CArray`` dataset, which you can afterwards access as if it were a
NumPy array; the slices you get back are actual NumPy arrays.  PyTables
has no problem working with datasets that exceed memory size, since the
data lives on disk and only the slices you ask for are read into memory.
For instance::

  import tables

  # TOTAL_NROWS is the total number of rows across all partial arrays.
  h5f = tables.openFile('foo.h5', 'w')
  carray = h5f.createCArray(
      '/', 'bar', atom=tables.UInt8Atom(), shape=(TOTAL_NROWS, 3) )
  base = 0
  for array in your_list_of_partial_arrays:
      # Copy each partial array into its slot in the on-disk dataset.
      carray[base:base+len(array)] = array
      base += len(array)

  # Now you can access ``carray`` as a NumPy array.
  carray[42]     # a (3,) uint8 NumPy array
  carray[10:20]  # a (10, 3) uint8 NumPy array
  carray[42,2]   # a NumPy uint8 scalar, "width" for row 42
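
As a rough sketch of the same idea applied to the movie data from the
question, the following stores frames of shape (nframes, height, width)
chunk by chunk.  It assumes a PyTables 3.x installation, where the calls
above are spelled ``open_file`` and ``create_carray``.  The function names,
the ``movie`` node name and the parameters are made up for illustration
and do not come from the original message.

```python
import numpy as np
import tables  # assumes PyTables 3.x (PEP 8 style method names)


def store_frames(path, chunks, nframes, height, width):
    """Copy uint8 chunks of shape (n, height, width) into one on-disk CArray.

    ``chunks`` may be any iterable, so only one chunk needs to be in
    memory at a time.  ``nframes`` is the total frame count over all chunks.
    """
    with tables.open_file(path, mode="w") as h5f:
        movie = h5f.create_carray(
            "/", "movie", atom=tables.UInt8Atom(),
            shape=(nframes, height, width))
        base = 0
        for chunk in chunks:
            # Write this chunk into its range of frames on disk.
            movie[base:base + len(chunk)] = chunk
            base += len(chunk)


def read_frames(path, start, stop):
    """Return frames [start, stop) as an in-memory NumPy array."""
    with tables.open_file(path, mode="r") as h5f:
        # Slicing a CArray reads only the requested frames from disk.
        return h5f.root.movie[start:stop]
```

With this, something like ``read_frames('movie.h5', 100, 110)`` would give
a (10, height, width) array, regardless of which chunk the frames
originally came from.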

(You may use an ``EArray`` dataset if you want to enlarge it with new
rows afterwards, or a ``Table`` if you want a different type for each

.. _PyTables: http://www.pytables.org/



	Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
	       Cárabos Coop. V.  V  V   Enjoy Data