[Numpy-discussion] Behaviour of copy for structured dtypes with gaps

Travis Oliphant teoliphant at gmail.com
Thu Apr 11 21:04:31 EDT 2019


I agree with Stefan that option 2 is what NumPy should go with for .copy().

If you want an identical memory copy, you should get the .data attribute
and work with that buffer directly.
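
A minimal sketch of one way to get a byte-for-byte copy, independent of what .copy() does with the gaps. It goes through a uint8 view of the array's memory rather than through .data itself; the dtype, field names, and variable names are made up for illustration, and it assumes a contiguous 1-D array.

```python
import numpy as np

# Hypothetical structured dtype with a 3-byte gap between fields 'a' and 'b'.
dt = np.dtype({'names': ['a', 'b'],
               'formats': ['u1', '<u4'],
               'offsets': [0, 4],
               'itemsize': 8})

# Two records backed by 16 known bytes, so the gap bytes are non-zero.
raw = bytearray(range(16))
orig = np.frombuffer(raw, dtype=dt)

# Reinterpret the memory as plain bytes, copy those, and view the
# copy with the original dtype again. Every byte, gaps included,
# is carried over because uint8 has no gaps to skip.
byte_copy = orig.view(np.uint8).copy()
copied = byte_copy.view(dt)
```

The copy owns its memory, so writing to `copied` leaves `orig` untouched. For a non-contiguous array this approach would need more care, since a plain uint8 view is not always possible there.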

My $0.02

-Travis


On Thu, Apr 11, 2019 at 6:01 PM Stefan van der Walt <stefanv at berkeley.edu>
wrote:

> Hi Marten,
>
> On Thu, 11 Apr 2019 09:51:10 -0400, Marten van Kerkwijk wrote:
> > From the discussion so far, it
> > seems the logic has boiled down to a choice between:
> >
> > (1) Copy is a contract that the dtype will not vary (e.g., we also do not
> > change endianness);
> >
> > (2) Copy is a contract that any access to the data in the array will return
> > exactly the same result, without wasting memory and possibly optimized for
> > access with different strides. E.g., `array[::10].copy()` also compacts the
> > result.
>
> I think you'll get different answers, depending on whom you ask—those
> interested in low-level memory layout, vs those who use the higher-level
> API.  Given that higher-level API use is much more common, I would lean
> in the direction of option (2).
>
> From that perspective, we already don't make consistency guarantees about
> memory layout and other flags.  E.g.,
>
> In [16]: x = np.arange(12).reshape((3, 4))
>
> In [17]: x.strides
> Out[17]: (32, 8)
>
> In [18]: x[::2, 1::2].strides
> Out[18]: (64, 16)
>
> In [19]: np.copy(x[::2, 1::2]).strides
> Out[19]: (16, 8)
>
> Not to mention this odd copy contract:
>
> >>> x = np.array([[1,2,3],[4,5,6]], order='F')
> >>> print(np.copy(x).flags['C_CONTIGUOUS'])
> False
> >>> print(x.copy().flags['C_CONTIGUOUS'])
> True
>
>
> The objection about arrays that don't behave identically in [0] feels
> somewhat arbitrary to me.  As shown above, you can always find attributes
> that differ between a copied array and the original.
>
> The user's expectation is that they'll get an array that behaves the
> same way as the original, not one that is byte-for-byte compatible.  The
> most common use case is to make sure that the original array doesn't get
> overwritten.
>
> Just to play devil's advocate with myself: if you do choose option (2),
> how would you go about making an identical memory copy of the original
> array?
>
> Best regards,
> Stéfan
>
>
> [0] https://github.com/numpy/numpy/issues/13299#issuecomment-481912827
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
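
The np.copy versus x.copy() difference quoted above follows from the two functions having different defaults for the order argument: np.copy uses order='K' (keep the source layout as closely as possible), while ndarray.copy uses order='C'. A short sketch, with variable names chosen only for illustration, showing that passing order explicitly makes the two agree:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]], order='F')

# np.copy defaults to order='K', so the Fortran layout is kept.
kept = np.copy(x)

# ndarray.copy defaults to order='C', so the result is C-contiguous.
c_ordered = x.copy()

# Asking the method for order='K' reproduces what np.copy does.
kept_explicit = x.copy(order='K')

# All three hold the same values; only the memory layout differs.
same_values = (np.array_equal(kept, x)
               and np.array_equal(c_ordered, x)
               and np.array_equal(kept_explicit, x))
```

In every case element access gives the same results as on the original, which is the guarantee option (2) describes.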