[Numpy-discussion] Bytes vs. Unicode in Python3
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Thu Dec 3 08:05:51 EST 2009
Dag Sverre Seljebotn wrote:
> Pauli Virtanen wrote:
>
>> Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote:
>> [clip]
>>
>>
>>> One thing to keep in mind here is that PEP 3118 actually defines a
>>> standard dtype format string, which is (mostly) incompatible with
>>> NumPy's. It should probably be supported as well when PEP 3118 is
>>> implemented.
>>>
>>>
>> PEP 3118 is for the most part implemented in my Py3K branch now -- it was
>> not actually much work, as I could steal most of the format string
>> converter from numpy.pxd.
>>
>>
> Great! Are you storing the format string in the dtype types as well? (So
> that no release is needed and acquisitions are cheap...)
>
> As far as numpy.pxd goes -- well, for the simplest dtypes.
>
>> Some questions:
>>
>> How hard do we want to try supplying a buffer? Eg. if the consumer does
>> not specify strided but specifies suboffsets, should we try to compute
>> suitable suboffsets? Should we try making contiguous copies of the data
>> (I guess this would break buffer semantics?)?
>>
>>
> Actually per the PEP, suboffsets imply strided:
>
> #define PyBUF_INDIRECT (0x0100 | PyBUF_STRIDES)
>
> :-) So there's no real way for a consumer to specify only suboffsets,
> 0x0100 is not a possible flag I think. Suboffsets can't really work
> without the strides anyway IIUC, and in the case of NumPy the field can
> always be left at 0.
>
That is, NULL!
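
To spell out the flag arithmetic, here is a minimal Python sketch. The
constants are copied by hand from the CPython headers (an assumption; check
them against your Python version), and the two helper names are hypothetical,
not part of any API.

```python
# Buffer request flags, hard-coded to mirror the #defines in the CPython
# headers. These literal values are an assumption of this sketch.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES


def requests_strides(flags):
    """Hypothetical helper: True if the consumer asked for stride info."""
    return (flags & PyBUF_STRIDES) == PyBUF_STRIDES


def requests_suboffsets(flags):
    """Hypothetical helper: True if the consumer asked for suboffsets."""
    return (flags & PyBUF_INDIRECT) == PyBUF_INDIRECT


# A consumer passing PyBUF_INDIRECT has necessarily asked for strides too.
assert requests_strides(PyBUF_INDIRECT)
# The bare 0x0100 bit on its own does not amount to a suboffsets request.
assert not requests_suboffsets(0x0100)
```

An exporter that never uses suboffsets only has to honour the strides part
of the request and leave the suboffsets field NULL.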
> IMO one should very much stay clear of making contiguous copies,
> especially considering the existence of PyBuffer_ToContiguous, which
> makes it trivial for client code to get a pointer to a contiguous buffer
> anyway. The intention of the PEP seems to be to export the buffer in as
> raw form as possible.
>
> Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS can be too
> conservative with NumPy arrays. If a contiguous buffer is requested,
> then looping through the strides and checking that the strides are
> monotonically decreasing/increasing could eventually save copying in
> some cases. I think that could be worth it -- I actually have my own
>
And, of course, that the innermost stride is 1.
> code for IS_F_CONTIGUOUS rather than relying on the flags personally
> because of this issue, so it does come up in practice.
>
> Dag Sverre
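
A stride-based contiguity check along the lines described above could look
roughly like the following. This is a sketch, not the code referred to in the
message: the function names are hypothetical, strides are taken in bytes (so
the innermost stride must equal the item size rather than 1), and it uses an
exact comparison against the expected stride instead of a plain monotonicity
test. Dimensions of length 1 are skipped, since their stride never affects
the memory layout.

```python
def is_c_contiguous(shape, strides, itemsize):
    """Return True if (shape, strides) describes a C-contiguous layout.

    Strides are in bytes. Axes of length 1 are ignored because their
    stride is never used to step between elements.
    """
    if 0 in shape:
        return True  # an empty array has no elements to misplace
    expected = itemsize
    # Walk from the innermost (last) axis outwards.
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim != 1 and stride != expected:
            return False
        expected *= dim
    return True


def is_f_contiguous(shape, strides, itemsize):
    """Fortran order is C order with the axes reversed."""
    return is_c_contiguous(tuple(shape)[::-1], tuple(strides)[::-1], itemsize)


# A 3x4 array of 8-byte items in C order.
assert is_c_contiguous((3, 4), (32, 8), 8)
assert not is_f_contiguous((3, 4), (32, 8), 8)
# An odd stride on a length-1 axis does not break contiguity.
assert is_c_contiguous((1, 4), (999, 8), 8)
```

With NumPy, the arguments would be a.shape, a.strides and a.itemsize; a True
result means the data pointer can be handed out as contiguous without a copy,
whatever the array flags say.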