[Numpy-discussion] why std() eats much memory in multidimensional case?

Charles R Harris charlesr.harris at gmail.com
Fri Apr 20 12:06:11 EDT 2007


On 4/20/07, Emanuele Olivetti <emanuele at relativita.com> wrote:
>
> Hi,
> I'm working with 4D integer matrices and need to compute std() on a
> given axis but I experience problems with excessive memory consumption.
> Example:
> ---
> import numpy
> a = numpy.random.randint(100,size=(50,50,50,200)) # 4D array of random ints
> b = a.std(3)
> ---
> It seems that this code requires 100-200 MB to allocate 'a'
> as an array of integers, but needs >500 MB more just to
> compute std(3). Is it possible to compute std(3) on integer
> arrays without using so much memory?
>
> I work with 4D arrays that are not much bigger than the one in this
> example, and they require >1.2 GB of RAM just to compute std(3).
> Note that almost all of this memory is released immediately after
> std() returns, so it seems to be used only internally, not to
> represent/store the result. Unfortunately I don't have that much RAM...
>
> Could someone explain/correct this problem?


I suspect that a temporary double precision copy of the input array is made
before the std is computed; if the original array is int32, that temporary
alone is twice the size of the input. Since your array has 25,000,000
elements, each full-size float64 temporary is 200 MB, and the
subtract-the-mean and squaring steps would each produce another one, which
would account for the >500 MB you see. That would certainly be the easiest
way to do the computation.
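
One workaround, if you can tolerate a Python-level loop, is to reduce over
the array one slab at a time, so that only a small slice is promoted to
float64 at any moment. A minimal sketch (untested on your exact setup):
---
import numpy

a = numpy.random.randint(100, size=(50, 50, 50, 200))

# Fill the result one slab at a time: each a[i] has shape (50, 50, 200),
# so the float64 temporaries are ~50x smaller than those for a.std(3).
b = numpy.empty(a.shape[:3], dtype=numpy.float64)
for i in range(a.shape[0]):
    b[i] = a[i].std(axis=-1)
---
The peak temporary memory then scales with the slab size rather than with
the whole array, at the cost of doing 50 small reductions instead of one
big one.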

Chuck