[Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?
ben.root at ou.edu
Sat May 9 15:53:31 EDT 2015
Absolutely, it should be writable. As for subclassing, that might be messy.
Consider the following:
inds = np.where(data > 5)
In that case, I'd expect a normal, bog-standard ndarray because that is
what you use for indexing (although pandas might have a good argument for
having it return one of their special indexing types if "data" was a pandas
foobar = np.where(data > 5, 1, 2)
Again, I'd expect a normal, bog-standard ndarray because the scalar
elements are very simple. This question gets very complicated when
considering array arguments. Consider:
merged_data = np.where(data > 5, data, data2)
So, what should "merged_data" be? If both "data" and "data2" are the same
types, then it would be reasonable to return the same type, if possible.
But what if they aren't the same? Maybe use array_priority to determine the
return type? Or, perhaps it does make sense to say "sod it all" and always
return an ndarray?
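
To make the array_priority idea concrete, here is a minimal sketch. It is only an illustration of one possible rule, not how np.where behaves: the helper name where_by_priority and the classes Low and High are hypothetical, and the rule "view the result as the input type with the highest __array_priority__" is an assumption.

```python
import numpy as np

class Low(np.ndarray):
    __array_priority__ = 1.0   # hypothetical subclass with low priority

class High(np.ndarray):
    __array_priority__ = 10.0  # hypothetical subclass with high priority

def where_by_priority(cond, x, y):
    """Hypothetical helper: np.where on plain ndarrays, then view the
    result as the type of whichever input has the higher priority."""
    x = np.asanyarray(x)  # keep subclasses so their priority can be read
    y = np.asanyarray(y)
    # Strip subclasses before computing so the result type is chosen below.
    out = np.where(np.asarray(cond), np.asarray(x), np.asarray(y))
    # On a tie, max() keeps the first candidate, so x wins.
    winner = max((x, y), key=lambda a: getattr(a, "__array_priority__", 0.0))
    return out.view(type(winner))

data = np.arange(10).view(Low)
data2 = (-np.arange(10)).view(High)
merged_data = where_by_priority(data > 5, data, data2)
print(type(merged_data).__name__, merged_data.tolist())
```

The "sod it all" alternative would simply return out without the final view.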
I don't know the answer. I do find it interesting that the result from a
multi-dimensional array is not writable. I don't know why I have never
On Sat, May 9, 2015 at 2:42 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On May 9, 2015 10:48 AM, "Jaime Fernández del Río" <jaime.frio at gmail.com> wrote:
> > There is a reported bug (issue #5837) regarding different returns from
> > np.nonzero with 1-D vs higher dimensional arrays. A full summary of the
> > differences can be seen from the following output:
> > >>> class C(np.ndarray): pass
> > ...
> > >>> a = np.arange(6).view(C)
> > >>> b = np.arange(6).reshape(2, 3).view(C)
> > >>> anz = a.nonzero()[0]
> > >>> bnz = b.nonzero()[0]
> > >>> type(anz)
> > <type 'numpy.ndarray'>
> > >>> anz.flags
> > C_CONTIGUOUS : True
> > F_CONTIGUOUS : True
> > OWNDATA : True
> > WRITEABLE : True
> > ALIGNED : True
> > UPDATEIFCOPY : False
> > >>> anz.base
> > >>> type(bnz)
> > <class '__main__.C'>
> > >>> bnz.flags
> > C_CONTIGUOUS : False
> > F_CONTIGUOUS : False
> > OWNDATA : False
> > WRITEABLE : False
> > ALIGNED : True
> > UPDATEIFCOPY : False
> > >>> bnz.base
> > array([[0, 1],
> > [0, 2],
> > [1, 0],
> > [1, 1],
> > [1, 2]])
> > The original bug report was only concerned with the non-writeability of
> > higher dimensional array returns, but there are more differences: 1-D
> > always returns an ndarray that owns its memory and is writeable, but higher
> > dimensional arrays return views, of the type of the original array, that
> > are non-writeable.
> > I have a branch that attempts to fix this by making both 1-D and n-D:
> > 1. return a view, never the base array,
> This doesn't matter, does it? "View" isn't a thing, only "view of" is
> meaningful. And in this case, none of the returned arrays share any memory
> with any other arrays that the user has access to... so whether they were
> created as a view or not should be an implementation detail that's
> transparent to the user?
> > 2. return an ndarray, never a subclass, and
> > 3. return a writeable view.
> > I guess the most controversial choice is #2, and in fact making that
> > change breaks a few tests. I nevertheless think that all of the index
> > returning functions (nonzero, argsort, argmin, argmax, argpartition) should
> > always return a bare ndarray, not a subclass. I'd be happy to be corrected,
> > but I can't think of any situation in which preserving the subclass would
> > be needed for these functions.
> I also can't see any logical reason why the return type of these functions
> has anything to do with the type of the inputs. You can index me with my
> phone number but my phone number is not a person. OTOH logic and ndarray
> subclassing don't have much to do with each other; the practical effect is
> probably more important. Looking at the subclasses I know about (masked
> arrays, np.matrix, and astropy quantities), though, I also can't see much
> benefit in copying the subclass of the input, and the fact that we were
> never consistent about this suggests that people probably aren't depending
> on it too much.
> So in summary my feeling is: +1 to making them writable, no objection to
> the view thing (though I don't see how it matters), and provisional +1 to
> consistently returning ndarray (to be revised if the people who use the
> subclassing functionality disagree).
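
For anyone who needs the agreed-on behaviour regardless of NumPy version, a small wrapper can normalise the result. This is a sketch under stated assumptions: plain_nonzero is a hypothetical name, and it copies each index array rather than returning a view as the branch proposes, which costs an extra allocation but guarantees the outcome.

```python
import numpy as np

class C(np.ndarray):
    pass

def plain_nonzero(a):
    """Hypothetical wrapper: return the indices of nonzero elements as
    base-class ndarrays that are writeable and own their data."""
    # copy=True gives each index array its own writeable buffer;
    # subok=False drops any ndarray subclass.
    return tuple(np.array(idx, copy=True, subok=False) for idx in np.nonzero(a))

b = np.arange(6).reshape(2, 3).view(C)
rows, cols = plain_nonzero(b)
print(type(rows).__name__, rows.flags.writeable, rows.flags.owndata)
print(rows.tolist(), cols.tolist())
# The index arrays are fresh allocations, so they share no memory with b.
print(np.shares_memory(b, rows))
```

The row and column indices correspond to the two columns of the bnz.base array shown in the quoted output.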