[Numpy-discussion] Bytes vs. Unicode in Python3

Dag Sverre Seljebotn dagss at student.matnat.uio.no
Thu Dec 3 08:05:51 EST 2009


Dag Sverre Seljebotn wrote:
> Pauli Virtanen wrote:
>   
>> Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote:
>> [clip]
>>   
>>     
>>> One thing to keep in mind here is that PEP 3118 actually defines a
>>> standard dtype format string, which is (mostly) incompatible with
>>> NumPy's. It should probably be supported as well when PEP 3118 is
>>> implemented.
>>>     
>>>       
>> PEP 3118 is for the most part implemented in my Py3K branch now -- it was 
>> not actually much work, as I could steal most of the format string 
>> converter from numpy.pxd.
>>   
>>     
> Great! Are you storing the format string in the dtype types as well? (So 
> that no release is needed and acquisitions are cheap...)
>
> As far as numpy.pxd goes -- well, for the simplest dtypes.
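
For anyone following along who has not looked at the PEP 3118 format strings: they extend the struct-module syntax, so the standard library can show what an exporter reports. This is only a small sketch using `array` and `memoryview` from a modern Python 3, not anything taken from numpy.pxd or the Py3K branch.

```python
import struct
from array import array

# A stdlib exporter of C doubles; memoryview exposes its buffer metadata.
doubles = array("d", [1.0, 2.0, 3.0])
view = memoryview(doubles)

fmt = view.format          # struct-style type code reported by the exporter
itemsize = view.itemsize   # size in bytes of one element

# NumPy would spell the same element type "f8" (kind plus byte count),
# which is one place where the two notations differ.
assert struct.calcsize(fmt) == itemsize
print(fmt, itemsize)
```

A converter between the two notations has to map each NumPy kind and size onto the matching struct code, which is straightforward for simple scalar dtypes.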
>   
>> Some questions:
>>
>> How hard do we want to try supplying a buffer? Eg. if the consumer does 
>> not specify strided but specifies suboffsets, should we try to compute 
>> suitable suboffsets? Should we try making contiguous copies of the data 
>> (I guess this would break buffer semantics?)?
>>   
>>     
> Actually per the PEP, suboffsets imply strided:
>
> #define PyBUF_INDIRECT (0x0100 | PyBUF_STRIDES)
>
> :-) So there's no real way for a consumer to specify only suboffsets, 
> 0x0100 is not a possible flag I think. Suboffsets can't really work 
> without the strides anyway IIUC, and in the case of NumPy the field can 
> always be left at 0.
>   
That is, NULL!
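
To spell out the flag arithmetic, here is a rough sketch in Python. The constants mirror the definitions in the PEP; the helper names `wants_strides` and `wants_suboffsets` are made up for illustration and are not part of any API.

```python
# Flag values as defined by PEP 3118.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES


def wants_strides(flags):
    """True if every bit of PyBUF_STRIDES is set in the request."""
    return (flags & PyBUF_STRIDES) == PyBUF_STRIDES


def wants_suboffsets(flags):
    """True if every bit of PyBUF_INDIRECT is set in the request."""
    return (flags & PyBUF_INDIRECT) == PyBUF_INDIRECT


# Asking for suboffsets always carries the strides bits with it.
assert wants_suboffsets(PyBUF_INDIRECT) and wants_strides(PyBUF_INDIRECT)
# The bare 0x0100 bit alone does not amount to a suboffsets request.
assert not wants_suboffsets(0x0100)
```

So an exporter that tests against the full PyBUF_INDIRECT mask never sees "suboffsets without strides".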
> IMO one should very much stay clear of making contiguous copies, 
> especially considering the existence of PyBuffer_ToContiguous, which 
> makes it trivial for client code to get a pointer to a contiguous buffer 
> anyway. The intention of the PEP seems to be to export the buffer in as 
> raw form as possible.
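
The same division of labour can be seen from Python with a plain memoryview: the exporter hands out the strided view untouched, and the consumer makes a contiguous copy only if it needs one. This is a sketch assuming Python 3.3 or later, where memoryview supports strided slicing; `tobytes()` is used here as a rough Python-level analogue of the C call, not as a statement about how NumPy should implement anything.

```python
data = bytearray(range(10))

# Every second byte: a strided view onto the same memory, no copy made.
strided = memoryview(data)[::2]
is_contig = strided.c_contiguous   # the raw view is not contiguous
view_strides = strided.strides     # stride in bytes along the only axis

# The consumer asks for a packed copy only when it actually needs one.
packed = strided.tobytes()
```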
>
> Do keep in mind that IS_C_CONTIGUOUS and IS_F_CONTIGUOUS can be too 
> conservative with NumPy arrays. If a contiguous buffer is requested, 
> then looping through the strides and checking that the strides are 
> monotonically decreasing/increasing could eventually save copying in 
> some cases. I think that could be worth it -- I actually have my own 
>   
And, of course, that the innermost stride is 1.
> code for IS_F_CONTIGUOUS rather than relying on the flags personally 
> because of this issue, so it does come up in practice.
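
Roughly, such a check could look like the sketch below. It is not my actual code and not what NumPy does internally; the function names and the choice to pass shape, byte strides and itemsize explicitly are assumptions for illustration. It is slightly stricter than testing monotonicity alone: each stride must equal the itemsize times the product of the inner extents. An axis of length 1 is skipped since its stride never matters, which is one case where a stride-based check can succeed even if a flag is unset.

```python
def is_c_contiguous(shape, strides, itemsize):
    """Check C (row-major) contiguity from shape and byte strides."""
    if 0 in shape:
        return True                  # an empty array is trivially contiguous
    expected = itemsize              # innermost stride must be one element
    for extent, stride in zip(reversed(shape), reversed(strides)):
        if extent != 1 and stride != expected:
            return False             # a length-1 axis may have any stride
        expected *= extent
    return True


def is_f_contiguous(shape, strides, itemsize):
    """Check Fortran (column-major) contiguity by reversing the axes."""
    return is_c_contiguous(shape[::-1], strides[::-1], itemsize)
```

For example, shape (3, 4) with byte strides (32, 8) and itemsize 8 passes the C check, while strides (64, 8), as for a slice of a wider array, does not.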
>
> Dag Sverre



