[Numpy-discussion] Issues with adding new dtypes - customizing ndarray attributes

Mark Wiebe mwwiebe at gmail.com
Fri Jul 29 10:30:37 EDT 2011


On Thu, Jul 28, 2011 at 8:54 AM, Martin Ling <martin-numpy at earth.li> wrote:

> Hi,
>
> I'd like to kick off some discussion on general issues I've encountered
> while developing the quaternion dtype (see other thread, and the code
> at: https://github.com/martinling/numpy_quaternion)
>
> The basic issue is that the attributes of ndarray cannot be adapted
> to the dtype of a given array. Indeed, their behaviour can't be changed
> at all without patching numpy itself.
>
> There are essentially four cases of the problem:
>
> 1. Attributes which do the wrong thing even though there is a mechanism
>   that should let them do the right thing, e.g:
>
>   >>> a = array([quaternion(1,2,3,4), quaternion(5,6,7,8)])
>
>   >>> conjugate(a) # correct, calls conjugate ufunc I defined
>   array([quaternion(1, -2, -3, -4), quaternion(5, -6, -7, -8)],
>         dtype=quaternion)
>
>   >>> a.conjugate() # incorrect, why doesn't it do the same?
>   array([quaternion(1, 2, 3, 4), quaternion(5, 6, 7, 8)], dtype=quaternion)
>
>   >>> min(a) # works, calls min ufunc I defined
>   quaternion(1, 2, 3, 4)
>
>   >>> a.min() # fails, why doesn't it do the same?
>   ValueError: No cast function available.
>
> 2. Attributes that do the wrong thing with no mechanism to override them:
>
>   >>> array([q.real for q in a])
>   array([ 1.,  5.])
>
>   >>> a.real # would like this to return the same, can't make it do so
>   array([quaternion(1, 2, 3, 4), quaternion(5, 6, 7, 8)], dtype=quaternion)
>
> 3. Attributes that don't exist and could be added to suit the dtype:
>
>   >>> array([q.y for q in a])
>   array([ 3.,  7.])
>
>   >>> a.y # would like this to return the same, can't make it do so
>   AttributeError: 'numpy.ndarray' object has no attribute 'y'
>
> 4. Attributes that already exist and make no sense for some dtypes:
>
>   >>> sa = array(['foo', 'bar', 'baz'])
>
>   >>> sa.imag # why can I even do this?
>   array(['', '', ''], dtype='|S3')
>
> We had some discussion about this at the SciPy conference sprints and
> the consensus seemed to be that allowing dtypes to customize the
> attributes of ndarrays would be a good thing. This would also be useful
> for struct arrays, datetime arrays, etc.
>
> What do people think?
>

I was part of this discussion at SciPy, and while I was initially skeptical
of giving dtypes the ability to add properties and functions to arrays built
with them, the discussion at the SciPy sprint convinced me otherwise. Since
the author of a dtype is fully aware of what properties and functions an
array already has, they can avoid name collisions in a straightforward way.
This is different from the recarray case, where assigning field names can be
a lot more haphazard, and it's perfectly sane to want a field called 'sum',
which conflicts with the arr.sum() array method.
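
To make the recarray collision concrete, here is a small sketch. The field
names and values are made up for the example; it only shows that attribute
access on a recarray reaches an ordinary field like 'x', while a field named
'sum' is hidden behind the existing array method and has to be fetched by
indexing instead.

```python
import numpy as np

# Illustrative record array with a field that happens to be called 'sum'.
rec = np.rec.array([(1, 2.0), (3, 4.0)],
                   dtype=[("sum", "i8"), ("x", "f8")])

x_field = rec.x          # attribute access reaches the 'x' field
sum_attr = rec.sum       # the ndarray method wins over the 'sum' field
sum_field = rec["sum"]   # the field is still reachable by indexing

print(x_field, callable(sum_attr), sum_field)
```

A dtype author, by contrast, picks the attribute names once and can check
them against the ndarray namespace up front.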

One example where this would help is with the datetime64 type. I suggested
that it might be good to automatically convert Python's datetime objects
into datetime64 arrays. Here's a pull request Ben Walsh did towards that:

https://github.com/numpy/numpy/pull/111

The point he raises, that np.array([datetime.date(2000, 1, 1)])[0].year
would fail, could be addressed through this mechanism.
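
As a rough illustration of what such a mechanism might look like from the
user's side, here is a sketch that emulates it with an ndarray subclass.
Everything named here (DTYPE_ATTRS, AttrArray, the 'year' entry) is
hypothetical and exists only for this example; NumPy has no such hook, and a
real design would presumably live in the dtype itself rather than in a
subclass.

```python
import datetime
import numpy as np

# Hypothetical registry: dtype kind -> {attribute name: function of the array}.
# This is an assumption for illustration, not an existing NumPy API.
DTYPE_ATTRS = {
    "M": {  # kind "M" is datetime64
        "year": lambda a: a.astype("datetime64[Y]").astype(np.int64) + 1970,
    },
}


class AttrArray(np.ndarray):
    """ndarray subclass that looks up missing attributes by dtype kind."""

    def __getattr__(self, name):
        # __getattr__ runs only after normal lookup fails, so existing
        # ndarray attributes and methods are never shadowed.
        try:
            func = DTYPE_ATTRS[self.dtype.kind][name]
        except KeyError:
            raise AttributeError(name) from None
        return func(self.view(np.ndarray))


dates = np.array([datetime.date(2000, 1, 1), datetime.date(2011, 7, 29)],
                 dtype="datetime64[D]")

# A plain datetime64 scalar has no .year, which is the failure Ben points out.
plain_scalar_has_year = hasattr(dates[0], "year")

# With the emulated hook, the array exposes .year for datetime64 data.
years = dates.view(AttrArray).year
print(plain_scalar_has_year, years)
```

Because the lookup is a fallback, a dtype could only add names, never
override existing ones, so cases like a.conjugate() or a.real from Martin's
list would need something stronger than this sketch.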

-Mark


>
>
> Martin
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>