[Numpy-discussion] Objects exposing the array interface

Thu Feb 26 00:22:23 EST 2015

On Wed, Feb 25, 2015 at 2:48 PM, Jaime Fernández del Río <
jaime.frio at gmail.com> wrote:

> I am not really sure what the behavior of __array__ should be. The link
> to the subclassing docs I gave before indicates that it should be possible
> to write to it if it is writeable (and probably pandas should set the
> writeable flag to False if it cannot be reliably written to), but the
> obscure comment I mentioned seems to point to the opposite, that it should
> never be written to. This is probably a good moment in time to figure out
> what the proper behavior should be and document it.
>

It's one thing to rely on the result of __array__ being writeable. It's
another thing to rely on writing to that array to modify the original
array-like object.

Presuming the latter would be a mistake. Let me give three categories of
examples where I know this would fail:
- pandas: for DataFrame objects with inhomogeneous dtype
- netCDF4 and other IO libraries: The array's data may be readonly on disk
or require a network call to access. The memory model may not even be able
to be cleanly mapped to numpy's (e.g., it may use chunked storage)
- blaze.Data: Blaze arrays use lazy evaluation and don't support mutation
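
To make the first category concrete, here is a rough sketch. The
MixedColumns class is hypothetical and only stands in for a container with
inhomogeneous dtype; it is not how pandas implements __array__. Because no
single buffer backs both columns, __array__ has to build a new array, and
writes to that array never reach the source:

```python
import numpy as np


class MixedColumns:
    """Hypothetical container holding columns of different dtypes."""

    def __init__(self):
        self.ints = np.array([1, 2, 3], dtype=np.int64)
        self.floats = np.array([0.5, 1.5, 2.5], dtype=np.float64)

    def __array__(self, dtype=None, copy=None):
        # No single buffer backs both columns, so a fresh array is built
        # on every call (the int column is upcast to float64).
        out = np.column_stack([self.ints, self.floats])
        return out if dtype is None else out.astype(dtype)


table = MixedColumns()
arr = np.asarray(table)

# The result is an ordinary writeable ndarray...
arr[0, 0] = 99.0

# ...but the write lands in the temporary copy, not in the container.
print(arr.flags.writeable, arr[0, 0], table.ints[0])
```

The writeable flag alone says nothing about whether a write propagates
back to the original object.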

As far as I know, none of these libraries produce readonly ndarray objects
from __array__. It can actually be highly convenient to return normal,
writeable ndarrays even if they don't modify the original source, because
this lets you do all the normal numpy stuff to the returned array,
including operations that mutate it.
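
A small sketch of that convenience, with a hypothetical LazySource class
standing in for a source that cannot be mutated (it is not modeled on any
particular library). The writeable result supports in-place work, while
the same result marked read-only refuses it:

```python
import numpy as np


class LazySource:
    """Hypothetical immutable source; __array__ always materializes a copy."""

    def __init__(self, values):
        self._values = tuple(values)

    def __array__(self, dtype=None, copy=None):
        return np.array(self._values, dtype=dtype)


src = LazySource([3.0, 1.0, 2.0])

# A normal writeable result: in-place operations work on the copy.
work = np.asarray(src)
work.sort()
work *= 10

# The same result marked read-only rejects item assignment.
locked = np.asarray(src)
locked.setflags(write=False)
try:
    locked[0] = 0.0
    raised = False
except ValueError:
    raised = True

print(work, raised, src._values)
```

In both cases the source itself is left untouched; the only difference is
whether the caller can reuse the returned buffer for in-place work.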