[Numpy-discussion] RE: memory-mapped Numeric arrays: arrayfrombuffer version 2

Eric Nodwell nodwell at physics.ubc.ca
Fri Jan 25 10:41:08 EST 2002

Since I have a 2.4GB data file handy, I thought I'd try this package
with it.  (Normally I process this data file by reading it in a chunk
at a time, which is perfectly adequate.)  Not surprisingly, it chokes:

  File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15,
  in maparray
    m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE])
OverflowError: memory mapped size is too large (limited by C int)

(details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux
kernel 2.4.13, gcc 2.95.4)

I'm not a big C programmer, but I wonder if there is some way for this
package to overcome the 2GB limit on 32-bit systems.  That could be
useful in some situations.
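
One possible workaround, sketched here under assumptions rather than tested against the package: map only a window of the file instead of the whole thing.  This assumes a Python whose mmap.mmap accepts an offset argument (the mmap module in the setup described above does not; later Python versions do).  The names map_window and demo are made up for illustration and are not part of arrayfrombuffer.

```python
import mmap
import os
import struct
import tempfile


def map_window(path, start, length):
    """Map `length` bytes of `path` beginning at byte `start`.

    Returns (m, delta): the mmap object and the index inside it that
    corresponds to byte `start` of the file.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    base = (start // gran) * gran   # the offset must be a multiple of gran
    delta = start - base
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), delta + length,
                      access=mmap.ACCESS_READ, offset=base)
    return m, delta


def demo():
    # Small stand-in for a large data file: 100000 little-endian doubles.
    values = [float(i) for i in range(100000)]
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack("<%dd" % len(values), *values))
        start = 70000 * 8           # byte position of element 70000
        m, delta = map_window(path, start, 3 * 8)
        try:
            return struct.unpack_from("<3d", m, delta)
        finally:
            m.close()
    finally:
        os.remove(path)
```

Each window stays well under the 2GB limit, so the address space is only asked for the part being worked on.  It is essentially the chunk-at-a-time approach again, with the kernel doing the reads.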


On Fri, Jan 25, 2002 at 09:40:21AM -0800, Paul F. Dubois wrote:
> I have verified that this package seems to work on Windows. I say "seems"
> only because I didn't try enough to uncover anything subtle.
> Unless or until we are convinced as a community that this is (a) the
> right way to do this and (b) that the package is portable, it would not
> be wise to put it in the main distribution. 
> I would like to hear from the community about this so that I will know
> whether or not to add this package as a separate SourceForge 'package'
> within the Numerical Python area. Meantime I will add a link to the web
> page.
> From: python-announce-list-admin at python.org
> [mailto:python-announce-list-admin at python.org] On Behalf Of Kragen
> Sitaker
> Sent: Wednesday, January 23, 2002 9:40 PM
> To: python-announce-list at python.org
> Subject: memory-mapped Numeric arrays: arrayfrombuffer version 2
> The 'arrayfrombuffer' package features support for Numerical Python
> arrays whose contents are stored in buffer objects, including
> memory-mapped files.  This has the following advantages:
> - loading your array from a file is easy --- a module import and a
>   single function call --- and doesn't use excessive amounts of
>   memory.
> - loading your array is quick; it doesn't need to be copied from one
>   part of memory to another in order to be loaded.
> - your array gets demand-loaded; parts you aren't using don't need to
>   be in memory or in swap.
> - under memory-pressure conditions, your array doesn't use up swap,
>   and parts of it you haven't modified can be evicted from RAM without
>   the need for a disk write
> - your arrays can be bigger than your physical memory
> - when you modify your array, only the parts you modify get written
>   back out to disk
> This is something that's been requested on the Numpy list a few times a
> year since 1999.
> arrayfrombuffer lives at http://pobox.com/~kragen/sw/arrayfrombuffer/
> The current version is version 2; it is released under the X11 license
> (the BSD license without the advertising clause).
> <kragen at pobox.com>
> arrayfrombuffer 2 (http://pobox.com/~kragen/sw/arrayfrombuffer/) -
> creates Numeric arrays from memory-mapped files.  (23-Jan-02)
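
For readers who want to see the idea from the quoted announcement in miniature (an array whose storage is a memory-mapped file, so that a modification lands in the file without an explicit write), here is a rough sketch.  It is not the arrayfrombuffer API: it assumes a later Python and uses only the standard library's mmap and memoryview, and demo is a made-up name.

```python
import mmap
import os
import struct
import tempfile


def demo():
    fd, path = tempfile.mkstemp()
    try:
        # Write four native-endian doubles to a scratch file.
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack("=4d", 1.0, 2.0, 3.0, 4.0))

        with open(path, "r+b") as f:
            m = mmap.mmap(f.fileno(), 0)    # length 0 maps the whole file

        raw = memoryview(m)                 # byte view of the mapping
        view = raw.cast("d")                # typed view of doubles, no copy
        view[2] = 30.0                      # modifies the mapped page in place

        # Release the views before closing the mapping.
        view.release()
        raw.release()
        m.flush()
        m.close()

        # Read the file back the ordinary way to confirm the change.
        with open(path, "rb") as f:
            return struct.unpack("=4d", f.read())
    finally:
        os.remove(path)
```

Only the element that was assigned is changed on disk; nothing was read into a separate array and written back.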

Eric Nodwell
Ph.D. candidate
Department of Physics
University of British Columbia

tel: 604-822-5425
fax: 604-822-4750
nodwell at physics.ubc.ca
