[Numpy-discussion] Bug in memmap/python allocation code?

Karol Langner karol.langner at kn.pl
Tue Jul 25 02:59:11 EDT 2006


On Tuesday 25 July 2006 02:36, Mike Ressler wrote:
> I'm trying to work with memmaps on very large files, i.e. > 2 GB, up to 10
> GB. The files are data cubes of images (my largest is
> 1290(x)x1024(y)x2011(z)) and my immediate task is to strip the data from
> 32-bits down to 16, and to rearrange some of the data on a per-xy-plane
> basis. I'm running this on a Fedora Core 5 64-bit system, with
> python-2.5b2(that I believe I compiled in 64-bit mode) and
> numpy-1.0b1. The disk has 324 GB free space.
>
> The log from a minimal case is as follows:
>
> ressler > python2.5
> Python 2.5b2 (r25b2:50512, Jul 18 2006, 12:58:29)
> [GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
> >>> import numpy as np
> >>> data=np.memmap('temp_file',mode='w+',shape=(2011,1280,1032),dtype='h')
>
> size = 2656450560
> bytes = 5312901120
> len(mm) = 5312901120
> (2011, 1280, 1032) h 0 0
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python2.5/site-packages/numpy/core/memmap.py", line
> 75, in __new__
>     offset=offset, order=order)
> TypeError: buffer is too small for requested array
>
>
> If I have a small number of frames (z=800 rather than 2011), this all works
> fine. I've added a few lines to memmap.py to print some diagnostic
> information - the error occurs on line 71 in the original memmap.py file,
> not 75. The "size =" and "bytes =" lines show that memmap.py is calculating
> the correct size for the buffer, and the len(mm) shows that the python
> mmap.mmap call on line 67 is returning a buffer of the correct size. The
> "(2011, 1280, 1032) h 0 0" bit is from a print statement that was left in
> the source file by the authors, and indicates what the following "self =
> ndarray.__new__" call is trying to do. However, it is the ndarray.__new__
> call that is breaking down, and I don't really have enough skill to
> continue chasing it down. I took a quick look at the C source, but I
> couldn't figure out where the ndarray.__new__ is actually defined.
>
> Any suggestions to help me past this? Thanks.
>
> Mike

I know Travis has answered in a different thread. Let me just add where the 
actual error is raised; maybe it will be of some use. It is around line 5490 
of arrayobject.c (function array_new):

        else {  /* buffer given -- use it */
                if (dims.len == 1 && dims.ptr[0] == -1) {
                        dims.ptr[0] = (buffer.len-(intp)offset) / itemsize;
                }
                else if ((strides.ptr == NULL) && \
                         buffer.len < itemsize * \
                         PyArray_MultiplyList(dims.ptr, dims.len)) {
                        PyErr_SetString(PyExc_TypeError,
                                        "buffer is too small for "      \
                                        "requested array");
                        goto fail;
                }

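So it does look like an overflow to me: the requested size (5312901120 bytes) 
no longer fits in a signed 32-bit integer, whereas the z=800 case that works 
(2113536000 bytes) still does.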
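
To illustrate the arithmetic, here is a minimal sketch in plain Python. It
assumes that one of the lengths in the comparison ends up in a signed 32-bit
integer somewhere along the way; I have not verified which variable that is.
The helper names (required_bytes, wrap_int32) are made up for this example
and are not part of numpy.

```python
INT32_MAX = 2**31 - 1


def required_bytes(shape, itemsize):
    """Bytes needed for an array of the given shape and item size."""
    total = itemsize
    for dim in shape:
        total *= dim
    return total


def wrap_int32(n):
    """Value n would take if stored in a signed 32-bit integer
    (two's complement wraparound). Hypothetical helper, for illustration."""
    return (n + 2**31) % 2**32 - 2**31


# Shapes from Mike's report; dtype 'h' is a 2-byte integer.
full = required_bytes((2011, 1280, 1032), 2)   # the failing case
small = required_bytes((800, 1280, 1032), 2)   # the case that works

for name, nbytes in (("z=2011", full), ("z=800", small)):
    wrapped = wrap_int32(nbytes)
    print(name, "needs", nbytes, "bytes; as int32:", wrapped,
          "(fits)" if nbytes <= INT32_MAX else "(overflows)")

# ASSUMPTION: if buffer.len were truncated like this, the check
# "buffer.len < itemsize * PyArray_MultiplyList(...)" would be true.
print("truncated length < required size:", wrap_int32(full) < full)
```

With z=2011 the truncated length would be much smaller than the size the
array needs, which would trigger exactly the "buffer is too small" branch
above, while the z=800 case passes unchanged.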

Karol

-- 
written by Karol Langner
Tue Jul 25 08:56:42 CEST 2006



