[Numpy-discussion] is __array_ufunc__ ready for prime-time?

Matthew Harrigan harrigan.matthew at gmail.com
Thu Nov 2 16:39:08 EDT 2017


NumPy already supports one specific kind of unit, datetime64 and timedelta64,
I think through that very mechanism (dtypes).  It's also probably the most
complicated unit, since at least there is no such thing as leap meters.  And
it works well and is very useful, IMHO.
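
A quick sketch of what I mean by "unit", using only the built-in datetime64
and timedelta64 dtypes. The variable names are just for illustration, and
this only shows the user-visible behavior, not how NumPy implements it
internally:

```python
import numpy as np

# Three consecutive days stored with a day-resolution datetime64 dtype.
days = np.arange("2017-11-01", "2017-11-04", dtype="datetime64[D]")

# Subtracting datetimes gives timedelta64, i.e. the "unit" changes.
steps = np.diff(days)

# Timedeltas with different resolutions compare by physical duration.
same_duration = np.timedelta64(1, "h") == np.timedelta64(60, "m")

# Dividing two timedeltas gives a plain, unitless float.
hours_per_day = np.timedelta64(1, "D") / np.timedelta64(1, "h")

# Adding two datetimes is meaningless, and the ufunc refuses to do it.
try:
    np.add(days, days)
    add_failed = False
except TypeError:
    add_failed = True
```

The last case is the part I find most useful: an operation that makes no
sense for the unit raises an error instead of silently returning numbers.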

On Thu, Nov 2, 2017 at 3:40 PM, Nathan Goldbaum <nathan12343 at gmail.com>
wrote:

>
>
> On Thu, Nov 2, 2017 at 2:37 PM, Stephan Hoyer <shoyer at gmail.com> wrote:
>
>> On Thu, Nov 2, 2017 at 9:45 AM <josef.pktd at gmail.com> wrote:
>>
>>> similar, scipy.special has ufuncs
>>> what units are those?
>>>
>>> Most code that I know (i.e. scipy.stats and statsmodels) does not use
>>> only "normal mathematical operations with ufuncs"
>>> I guess there are a lot of "abnormal" mathematical operations
>>> where just simply propagating the units will not work.
>>>
>>
>>> Aside: The problem is more general also for other datastructures.
>>> E.g. statsmodels for most parts uses only numpy ndarrays inside the
>>> algorithm and computations because that provides well defined
>>> behavior. (e.g. pandas behaved too differently in many cases)
>>> I don't have much idea yet about how to change the infrastructure to
>>> allow the use of dask arrays, sparse matrices and similar and possibly
>>> automatic differentiation.
>>>
>>
>> This is the exact same reason why pandas and xarray do not support
>> wrapping arbitrary ndarray subclasses or duck array types. The operations
>> we use internally (on numpy.ndarray objects) may not be what you would
>> expect externally, and may even be implementation details not considered
>> part of the public API. For example, in xarray we use numpy.nanmean() or
>> bottleneck.nanmean() instead of numpy.mean().
>>
>> For NumPy and xarray, I think we could (and should) define an interface
>> to support subclasses and duck types for generic operations for core
>> use-cases. My main concern with subclasses / duck-arrays is
>> undefined/untested behavior, especially where we might silently give the
>> wrong answer or trigger some undesired operation (e.g., loading a lazily
>> computed array into memory) rather than raising an informative error. Leaking
>> implementation details is another concern: we have already had several
>> cases in NumPy where a function only worked on a subclass if a particular
>> method was called internally, and broke when that was changed.
>>
>
> Would this issue be ameliorated given Nathaniel's proposal to try to move
> away from subclasses and towards storing data in dtypes? Or would that just
> mean that xarray would need to ban dtypes it doesn't know about?
>
>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
>>
>