[Numpy-discussion] Setting contents of buffer for array object

Anne Archibald peridot.faceted at gmail.com
Mon Feb 11 00:27:56 EST 2008


On 10/02/2008, Matthew Brett <matthew.brett at gmail.com> wrote:
> > Ah, I see. You definitely do not want to reassign the .data buffer in
> > this case. An out= parameter does not reassign the memory location
> > that the array object points to. It should use the allocated memory
> > that was already there. It shouldn't "copy" anything at all;
> > otherwise, "median(x, out=out)" is no better than "out[:] =
> > median(x)". Personally, I don't think that a function should expose an
> > out= parameter unless it can make good on that promise of memory
> > efficiency.
>
> I agree - but there are more efficient median algorithms out there
> which can make use of the memory efficiently.  I wanted to establish
> the call signature to allow that.  I don't feel strongly about it
> though.

This is a startling claim! Are there really median algorithms that are
faster for having the use of a single float as storage space? If it
were permissible to mutilate the original array in-place, I can
certainly see a good median algorithm (based on quicksort, perhaps)
being faster, but modifying the input array is a different question
from using an output array.
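
To make the distinction concrete, here is a minimal sketch of a
median that is allowed to reorder its input, assuming a quickselect
(the selection variant of quicksort) rather than a full sort. The
names median_inplace and _select are hypothetical and are not part of
NumPy; the point is only that the savings come from scrambling the
input, not from any output buffer.

```python
def _select(a, k):
    """Reorder a in place so that a[k] is the k-th smallest element.

    Hypothetical helper: quickselect with a Hoare-style partition.
    On return, every element left of index k is <= a[k].
    """
    lo, hi = 0, len(a) - 1
    while lo < hi:
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if k <= j:        # k-th element lies in the left part
            hi = j
        elif k >= i:      # k-th element lies in the right part
            lo = i
        else:             # k sits between the parts, so a[k] == pivot
            break
    return a[k]


def median_inplace(a):
    """Median of a 1-D mutable sequence; the input is reordered."""
    n = len(a)
    if n == 0:
        raise ValueError("median of an empty sequence")
    upper = _select(a, n // 2)
    if n % 2:
        return upper
    # For even n, the lower middle value is the largest element
    # to the left of index n // 2 after selection.
    lower = max(a[i] for i in range(n // 2))
    return 0.5 * (lower + upper)


data = [4, 1, 3, 2]
m = median_inplace(data)   # data is now in a different order
```

No temporary copy of the data is made, but the caller's array no
longer has its original ordering afterwards.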

I can also see that the implementation could be improved by using a
for loop to iterate over the output elements, so that there is no need
to duplicate the large input array; a "blocked" iteration that
duplicates arrays of modest size might be better still. But how can a
single float per data set whose median is being taken help?
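
A rough sketch of the blocked idea, assuming a 2-D input whose
row-wise medians are wanted. The function name blocked_median and the
block_rows parameter are made up for illustration; only np.median,
np.asarray and np.empty are real NumPy calls. Any temporary copy made
while computing the median is limited to one block of rows, and the
results land in memory that already exists.

```python
import numpy as np


def blocked_median(x, out=None, block_rows=1024):
    """Row-wise median of a 2-D array, computed a block of rows at a time.

    Hypothetical helper: writes results into out (allocated here if
    not supplied) and never copies more than block_rows rows at once.
    """
    x = np.asarray(x)
    n_rows = x.shape[0]
    if out is None:
        out = np.empty(n_rows, dtype=np.float64)
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        # np.median works on a temporary copy of this block only,
        # leaving x itself untouched.
        out[start:stop] = np.median(x[start:stop], axis=1)
    return out


x = np.arange(12.0).reshape(4, 3)[:, ::-1]
out = np.empty(4)
result = blocked_median(x, out=out, block_rows=3)
```

Note that the output array here holds one float per row and plays no
part in speeding up the median itself; the memory saving comes
entirely from the blocking.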

Anne


