[Numpy-discussion] Release of 1.0b5 this weekend

Charles R Harris charlesr.harris at gmail.com
Tue Aug 29 17:20:24 EDT 2006


Hi Tim,

On 8/29/06, Tim Hochberg <tim.hochberg at ieee.org> wrote:
>
> Charles R Harris wrote:
> >
> >
> > On 8/29/06, Tim Hochberg <tim.hochberg at ieee.org> wrote:
> >
> >     Charles R Harris wrote:
> >     > Hi,
> >     >
> >     > On 8/29/06, Tim Hochberg <tim.hochberg at ieee.org> wrote:
> >     >
> >     >
> >     >     -0.5 from me if what we're talking about here is having
> >     >     mutating methods return self rather than None. Chaining
> >     >     stuff is pretty, but having methods that mutate self and
> >     >     return self looks like a source of elusive bugs to me.
> >     >
> >     >     -tim
> >     >
> >     >
> >     > But how is that any worse than the current mutating operators? I
> >     > think the operating principle is that methods generally work in
> >     > place, functions make copies. The exceptions to this rule need to
> >     > be noted.
> >     Is that really the case? I was more under the impression that there
> >     wasn't much rhyme nor reason to this. Let's do a quick dir(somearray)
> >     and see what we get (I'll strip out the __XXX__ names):
> >
> >     'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base',
> >     'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate',
> >     'copy', 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dtype',
> >     'dump', 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield',
> >     'imag', 'item', 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim',
> >     'newbyteorder', 'nonzero', 'prod', 'ptp', 'put', 'putmask', 'ravel',
> >     'real', 'repeat', 'reshape', 'resize', 'round', 'searchsorted',
> >     'setfield', 'setflags', 'shape', 'size', 'sort', 'squeeze', 'std',
> >     'strides', 'sum', 'swapaxes', 'take', 'tofile', 'tolist', 'tostring',
> >     'trace', 'transpose', 'var', 'view'
> >
> >
> > There are certainly many methods where inplace operations make no
> > sense. But for such things as conjugate and clip I think it should be
> > preferred. Think of them as analogs of the "+=" operators that allow
> > memory efficient inplace operations. At the moment there are too few
> > such operators, IMHO, and that makes it hard to write memory efficient
> > code when you want to do so. If you need a copy, the functional form
> > should be the preferred way to go and can easily be implemented by
> > constructions like a.copy().sort().
> So let's make this clear; what you are proposing is more than just
> returning self for more operations. You are proposing changing the
> meaning of the existing methods to operate in place rather than return
> new objects. It seems awfully late in the day to be considering this,
> given that we're on the edge of 1.0 and this could break any
> existing numpy code that is out there.
>
> Just for grins let's look at the operations that could potentially
> benefit from being done in place. I think they are:
>    byteswap
>    clip
>    conjugate
>    round
>    sort
>
> Of these, clip, conjugate and round support an 'out' argument like that
> supported by ufuncs; byteswap has a boolean argument telling it
> whether to perform operations in place; and sort always operates in
> place. Noting that the ufunc-like methods (max, argmax, etc) appear to
> support the 'out' argument as well, although it's not documented for most
> of them, it looks to me as if the two odd methods are byteswap and sort.
> The method situation could be made more consistent by swapping the
> boolean inplace flag in byteswap with another 'out' argument and also
> having sort not operate in place by default, but also supply an out
> argument there. Thus:
>
> b = a.sort()   # Returns a copy
> a.sort(out=a)  # Sorts a in place
> a.sort(out=c)  # Sorts a into c (probably just equivalent to c = a.sort()
>                # in this case since we don't want to rewrite the sort routines)
>
> On the whole I think that this would be an improvement, but it may be
> too late in the day to actually implement it since 1.0 is coming up.
> There would still be a few methods (fill, put, etc) that modify the
> array in place and return None, but I haven't heard any complaints about
> those.
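
For reference, the behaviour described above can be checked with a short
session. This is only a sketch and assumes a recent NumPy, which may differ
in detail from the 1.0 betas under discussion; the array values are made up
for illustration.

```python
import numpy as np

a = np.array([3.7, -1.2, 5.5, 0.4])

# clip accepts an 'out' argument; passing the array itself clips in place.
a.clip(0.0, 4.0, out=a)      # a is now [3.7, 0.0, 4.0, 0.4]

# round accepts 'out' in the same way.
a.round(out=a)               # a is now [4.0, 0.0, 4.0, 0.0]

# sort always works in place and returns None.
result = a.sort()            # a is now [0.0, 0.0, 4.0, 4.0], result is None

# The functional form np.sort returns a sorted copy and leaves b alone.
b = np.array([3, 1, 2])
c = np.sort(b)

# Because sort returns None, chaining off a copy yields None, not an array.
chained = b.copy().sort()

# byteswap uses a boolean flag instead of 'out'.
d = np.array([1, 256], dtype=np.int16)
e = d.byteswap()             # swapped copy, d unchanged
d.byteswap(inplace=True)     # d itself is swapped
```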


That sounds like a good idea. One could keep the present behaviour in most
cases by supplying a default value, although the out keyword might need a
None value to indicate "copy" and a 'Self' value that means in place, or
something like that, and then have all reasonable methods return values.
That way the change would be transparent. The changes to the sort method
would all be upper level, the low level sorting routines would remain
unchanged.
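
A rough sketch of what such an upper-level wrapper might look like, written
as a free function rather than a method. The name sort_with_out is
hypothetical and not part of NumPy, and it follows the out=None / out=a
convention from Tim's example rather than a separate 'Self' value; the low
level routines (np.sort and ndarray.sort) are used unchanged.

```python
import numpy as np

def sort_with_out(a, out=None):
    """Hypothetical wrapper, not a NumPy API: sort with an optional 'out'."""
    if out is None:
        # Default: leave a untouched and return a sorted copy.
        return np.sort(a)
    if out is not a:
        # Copy the data into the caller's array before sorting it.
        out[...] = a
    # Sort the destination in place using the existing routine.
    out.sort()
    return out

a = np.array([3, 1, 2])
b = sort_with_out(a)             # sorted copy, a unchanged
c = np.empty_like(a)
sort_with_out(a, out=c)          # sorted data written into c
sort_with_out(a, out=a)          # a sorted in place
```

Returning the destination array in every case is a design choice made for
the sketch; whether in-place calls should return the array or None is exactly
the point under debate above.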

Methods are new, so code that needs to be changed is code specifically
written for Numpy, and now is the time to make these sorts of decisions.



Chuck