[Numpy-discussion] Changes to np.digitize since NumPy 1.9?

Wed Aug 12 17:03:13 EDT 2015

Hi all,

I've been testing the package I spend most of my time on, yt, under numpy
1.10b1 since the announcement went out.

I think I've narrowed down and fixed all of the test failures that cropped
up except for one last issue. It seems that the behavior of np.digitize
with respect to ndarray subclasses has changed since the NumPy 1.9 series.
Consider the following test script:

```python
import numpy as np

class MyArray(np.ndarray):
    def __new__(cls, *args, **kwargs):
        return np.ndarray.__new__(cls, *args, **kwargs)

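data = np.arange(100)
bins = np.arange(100) + 0.5

# View both inputs as the ndarray subclass
data = data.view(MyArray)
bins = bins.view(MyArray)

digits = np.digitize(data, bins)

# print() with parentheses runs on both Python 2 and Python 3
print(type(digits))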
```

Under NumPy 1.9.2, this prints "<type 'numpy.ndarray'>", but under the 1.10
beta, it prints "<class '__main__.MyArray'>".

I'm curious why this change was made. Since digitize outputs index arrays,
I don't see why it should return anything but a plain ndarray. I see in the
release notes that digitize now uses searchsorted under the hood. Is this
related?
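
For reference, here is a small sketch of how the two functions relate at the
index level. For monotonically increasing bins, digitize with the default
right=False should give the same indices as searchsorted with side='right',
and right=True should match side='left'. This only checks the indices on
plain arrays; it is not a statement about how 1.10 implements digitize or
handles subclasses internally.

```python
import numpy as np

x = np.arange(100)
bins = np.arange(100) + 0.5  # monotonically increasing bin edges

# Default digitize (right=False) vs. searchsorted with side='right'
via_digitize = np.digitize(x, bins)
via_searchsorted = np.searchsorted(bins, x, side='right')
same_default = np.array_equal(via_digitize, via_searchsorted)

# digitize with right=True vs. searchsorted with side='left'
same_right = np.array_equal(
    np.digitize(x, bins, right=True),
    np.searchsorted(bins, x, side='left'),
)

print(same_default, same_right)
```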

We can "fix" this in our codebase by wrapping digitize or by adding numpy
version checks in places where the output type matters. Is it also possible
for me to customize the return type here by exploiting the ufunc machinery
and the __array_wrap__ and __array_finalize__ functions?
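
To make the wrapping option concrete, here is a minimal sketch of what I have
in mind. The helper name digitize_plain is hypothetical (it is not in yt or
NumPy), and it assumes that dropping the subclass on the inputs and on the
result is acceptable wherever it is called.

```python
import numpy as np

class MyArray(np.ndarray):
    """Trivial ndarray subclass, standing in for a real one."""
    pass

def digitize_plain(x, bins, right=False):
    """Hypothetical wrapper: return bin indices as a base-class ndarray."""
    # np.asarray drops the subclass, so digitize only sees plain ndarrays
    x = np.asarray(x)
    bins = np.asarray(bins)
    # Convert the result as well, in case a subclass is still returned
    return np.asarray(np.digitize(x, bins, right=right))

data = np.arange(100).view(MyArray)
bins = (np.arange(100) + 0.5).view(MyArray)

digits = digitize_plain(data, bins)
print(type(digits))
```

This keeps the version-dependent behavior in one place, at the cost of every
caller having to go through the wrapper instead of calling np.digitize
directly.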

Thanks for any help or advice you might have,

Nathan