[Numpy-discussion] is __array_ufunc__ ready for prime-time?

Nathan Goldbaum nathan12343 at gmail.com
Thu Nov 2 15:40:26 EDT 2017


On Thu, Nov 2, 2017 at 2:37 PM, Stephan Hoyer <shoyer at gmail.com> wrote:

> On Thu, Nov 2, 2017 at 9:45 AM <josef.pktd at gmail.com> wrote:
>
>> similar, scipy.special has ufuncs
>> what units are those?
>>
>> Most code that I know (i.e. scipy.stats and statsmodels) does not use only
>> "normal mathematical operations with ufuncs"
>> I guess there are a lot of "abnormal" mathematical operations
>> where just simply propagating the units will not work.
>>
>>
>> Aside: The problem is more general also for other datastructures.
>> E.g. statsmodels for most parts uses only numpy ndarrays inside the
>> algorithm and computations because that provides well defined
>> behavior. (e.g. pandas behaved too differently in many cases)
>> I don't have much idea yet about how to change the infrastructure to
>> allow the use of dask arrays, sparse matrices and similar and possibly
>> automatic differentiation.
>>
>
> This is the exact same reason why pandas and xarray do not support
> wrapping arbitrary ndarray subclasses or duck array types. The operations
> we use internally (on numpy.ndarray objects) may not be what you would
> expect externally, and may even be implementation details not considered
> part of the public API. For example, in xarray we use numpy.nanmean() or
> bottleneck.nanmean() instead of numpy.mean().
>
> For NumPy and xarray, I think we could (and should) define an interface to
> support subclasses and duck types for generic operations for core
> use-cases. My main concern with subclasses / duck-arrays is
> undefined/untested behavior, especially where we might silently give the
> wrong answer or trigger some undesired operation (e.g., loading a lazily
> computed array into memory) rather than raising an informative error. Leaking
> implementation details is another concern: we have already had several
> cases in NumPy where a function only worked on a subclass if a particular
> method was called internally, and broke when that was changed.
>

Would this issue be ameliorated given Nathaniel's proposal to try to move
away from subclasses and towards storing data in dtypes? Or would that just
mean that xarray would need to ban dtypes it doesn't know about?
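
To make the two concerns quoted above concrete, here is a minimal sketch. It
assumes only NumPy itself; OpaqueArray and ufunc_is_rejected are hypothetical
names made up for illustration, not part of NumPy, pandas, or xarray. The first
half shows how an internal choice of nanmean() over mean() changes the answer,
and the second half shows a duck type that opts out of ufuncs so that NumPy
raises an error instead of silently coercing it.

```python
import numpy as np

# A library that reduces with nanmean() internally skips missing values,
# which a caller expecting numpy.mean() semantics may not anticipate.
data = np.array([1.0, np.nan, 3.0])
mean_result = np.mean(data)        # NaN propagates into the result
nanmean_result = np.nanmean(data)  # NaN entries are ignored


class OpaqueArray:
    """Hypothetical duck array that opts out of NumPy ufuncs."""

    # Setting __array_ufunc__ to None tells NumPy that ufuncs must raise
    # TypeError on instances rather than attempt any coercion.
    __array_ufunc__ = None


def ufunc_is_rejected(obj):
    """Return True if np.add refuses to operate on obj (hypothetical helper)."""
    try:
        np.add(obj, 1)
    except TypeError:
        return True
    return False


print(mean_result, nanmean_result, ufunc_is_rejected(OpaqueArray()))
```

Note that opting out this way only covers ufuncs; functions such as
numpy.mean() that are not ufuncs would need a separate mechanism.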

