[Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?

Nathaniel Smith njs at pobox.com
Fri Sep 7 12:05:25 EDT 2012

On 7 Sep 2012 14:38, "Benjamin Root" <ben.root at ou.edu> wrote:
> An issue just reported on the matplotlib-users list involved a user who
> ran out of memory while attempting to do an imshow() on a large array.
> While this wouldn't be totally unexpected, the user's traceback shows that
> they ran out of memory before any actual building of the image occurred.
> Memory usage sky-rocketed when imshow() attempted to determine the min and
> max of the image.  The input data was a masked array, and it appears that
> the implementation of min() for masked arrays goes something like this
> (paraphrasing here):
> obj.filled(inf).min()
> The idea is that any masked element is set to the largest possible value
> for its dtype in a copy of the array, and then a min() is performed
> on that copy.  I am assuming that max() does the same thing.
> Can this be done differently/more efficiently?  If the "filled" approach
> has to be done, maybe it would be a good idea to make the copy in chunks
> instead of all at once?  Ideally, it would be nice to avoid the copying
> altogether and utilize some of the special iterators that Mark Wiebe
> created last year.

I think what you're looking for is where= support for ufunc.reduce. This
isn't implemented yet but at least it's straightforward in principle...
otherwise I don't know anything better than reimplementing .min() by hand.
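
For what it's worth, here is a rough sketch of what "reimplementing .min()
by hand" could look like if the goal is only to bound the size of the
temporary. This is not how numpy.ma implements min(). The function name
chunked_masked_min and the chunk_size parameter are made up for
illustration, and the sketch only handles a reduction over the whole array
(no axis argument, no structured dtypes).

```python
import numpy as np
import numpy.ma as ma


def chunked_masked_min(marr, chunk_size=1_000_000):
    """Minimum over the unmasked elements of `marr`, one chunk at a time.

    Hypothetical helper: at most `chunk_size` elements are copied at once,
    instead of a filled copy of the entire array.
    """
    # reshape(-1) gives a view for contiguous data; it copies otherwise.
    data = ma.getdata(marr).reshape(-1)
    mask = ma.getmask(marr)

    if mask is ma.nomask:
        # Nothing is masked, so the plain ndarray reduction needs no copy.
        return data.min() if data.size else ma.masked

    mask = mask.reshape(-1)
    result = None
    for start in range(0, data.size, chunk_size):
        stop = start + chunk_size
        # Boolean indexing copies only the unmasked values of this chunk.
        valid = data[start:stop][~mask[start:stop]]
        if valid.size:
            chunk_min = valid.min()
            # np.minimum propagates NaN the same way ndarray.min() does.
            result = chunk_min if result is None else np.minimum(result, chunk_min)

    # Mirror numpy.ma's convention of returning `masked` when nothing is valid.
    return ma.masked if result is None else result


a = ma.masked_array([5.0, 1.0, 3.0, -2.0, 7.0], mask=[0, 0, 0, 1, 0])
print(chunked_masked_min(a, chunk_size=2))
```

The chunk size trades peak memory against Python-level loop overhead, so
it would need tuning for real image sizes. max() could be handled the same
way with np.maximum.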
