[Python-3000] PEP Draft: Enhancing the buffer protocol
Travis Oliphant
oliphant.travis at ieee.org
Wed Feb 28 18:56:16 CET 2007
Thomas Heller wrote:
>
>>Additions to the struct string-syntax
>>
>> The struct string-syntax is missing some characters to fully
>> implement data-format descriptions already available elsewhere (in
>> ctypes and NumPy for example). Here are the proposed additions:
>>
>> Character Description
>> ==================================
>> '1' bit (number before states how many bits)
>> '?' platform _Bool type
>
>
> In SVN trunk (2.6), the struct module already supports _Bool, but the
> format character used is 't'. Not a big issue, though, and I like '?'
> better.
>
I think 't' should be used for the bit type also (because '1' is
confusing when you have something like '71b' which looks like 71 signed
chars but is actually 7 bits + 1 signed char).
I've changed this in the current PEP.
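To illustrate the ambiguity: in the existing struct syntax a leading number is always a repeat count. A minimal sketch using only the current struct module (the variable names are just for illustration):

```python
import struct

# A leading number is a repeat count in the existing syntax, so '71b'
# is read as 71 signed chars, never as "7 bits followed by 1 signed char".
size_71b = struct.calcsize('71b')

# The same rule on a smaller case: '3b' unpacks three signed chars.
values = struct.unpack('3b', bytes([1, 2, 255]))

print(size_71b)
print(values)
```

With '1' as the bit code, a reader could not tell a bit field from a repeat count without knowing the new rule, which is the reason for moving to a letter code.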
>
>> 'g' long double
>> 'F' complex float
>> 'D' complex double
>> 'G' complex long double
>
>
> IIUC, in the latest PEP draft you have apparently changed to two-letter codes
> for complex types, which is inconsistent with previous conventions in struct.
Yeah, I've introduced two-letter codes for pointers as well. But, there
is a certain logic to it because 'Zd' would be similar to 'dd' except
you would know that the two are supposed to be treated as a complex number.
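A rough sketch of that equivalence, using only the existing 'dd' format since 'Zd' is only a proposal here. The helper names pack_complex and unpack_complex are hypothetical, not part of struct:

```python
import struct

def pack_complex(z):
    """Store a complex number as two native doubles (real, then imaginary)."""
    return struct.pack('dd', z.real, z.imag)

def unpack_complex(data):
    """Read two native doubles back and combine them into a complex number."""
    real, imag = struct.unpack('dd', data)
    return complex(real, imag)

packed = pack_complex(1.5 - 2j)
restored = unpack_complex(packed)
print(len(packed), restored)
```

The bytes are the same either way; what 'Zd' would add is the knowledge that the pair belongs together, so the unpacking side would not need a helper like the one above.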
>
>
>> 'c' ucs-1 (latin-1) encoding
>> 'u' ucs-2
>> 'w' ucs-4
>> 'O' pointer to Python Object
>> 'T{}' structure (detailed layout inside {})
>> '(k1,k2,...,kn)' multi-dimensional array of whatever follows
>> ':name:' optional name of the preceding element
>> '&' specific pointer (prefix before another character)
>> 'X{}' pointer to a function (optional function
>> signature inside {})
>>
>> The struct module will be changed to understand these as well and
>> return appropriate Python objects on unpacking. Un-packing a
>> long-double will return a c-types long_double.
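For comparison, a small sketch of how such descriptions can be inspected from Python. It assumes a present-day CPython in which ctypes objects export the buffer interface and memoryview exposes the format string and shape; the Record class and its field names are made up for the example:

```python
import ctypes

class Record(ctypes.Structure):
    # Hypothetical two-field structure, used only to look at its description.
    _fields_ = [('count', ctypes.c_int), ('value', ctypes.c_double)]

# The exported format string uses the 'T{...}' and ':name:' notation.
struct_format = memoryview(Record()).format

# A 2 x 3 array of ints: the dimensions are reported as a shape tuple.
grid_shape = memoryview((ctypes.c_int * 3 * 2)()).shape

print(struct_format)
print(grid_shape)
```

The exact format string depends on the platform (byte-order prefixes and padding), so it is printed here rather than assumed.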
>
>
> This is probably because there is no way for current Python to support
> the long double datatype.
Right. On some platforms there is no difference between double and
long double. I guess returning a decimal object might actually be the
easiest solution.
> The question for ctypes is: How should ctypes
> support that? Should the .value attribute of a c_longdouble have two
> components, should it expose the value as decimal, should Python itself
> switch to using long double internally, or are there other possibilities?
>
I think I like the decimal object solution better.
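A minimal sketch of what is involved, assuming a present-day ctypes where c_longdouble exists and its .value comes back as an ordinary Python float. It shows the size comparison and the exact float-to-Decimal conversion, not a real long-double-to-decimal path:

```python
import ctypes
from decimal import Decimal

double_size = ctypes.sizeof(ctypes.c_double)
long_double_size = ctypes.sizeof(ctypes.c_longdouble)

# On some platforms the two types are the same size.
same_as_double = long_double_size == double_size

# .value is a Python float here, so any extra long-double precision is
# already gone by the time the Decimal is built.
x = ctypes.c_longdouble(0.1)
as_decimal = Decimal(x.value)

print(double_size, long_double_size, same_as_double)
print(as_decimal)
```

Decimal(float) is an exact conversion, so a decimal result loses nothing relative to the float it is given. Keeping the extra long double bits would need a conversion done at the C level.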
>
>> Unpacking 'u' or
>> 'w' will return Python unicode. Unpacking a multi-dimensional
>> array will return a list of lists. Un-packing a pointer will
>> return a ctypes pointer object.
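The "list of lists" result can be sketched with tools available in a present-day CPython, assuming memoryview.cast with a shape argument. This only stands in for the proposed struct behaviour and is not the struct module itself:

```python
import struct

# Six native ints laid out contiguously, viewed as a 2 x 3 array.
raw = struct.pack('6i', 0, 1, 2, 3, 4, 5)
view = memoryview(raw).cast('i', shape=[2, 3])

# A multi-dimensional view converts to nested lists, one level per dimension.
nested = view.tolist()
print(nested)
```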
>
>
> ctypes does not support pointer objects of non-native byte order;
> should they be forbidden?
Yes, I'm fine with them being forbidden.
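The ctypes restriction can be observed directly. A small sketch, assuming the usual ctypes behaviour of refusing pointer fields in a structure of non-native byte order; the function and class names are made up for the example:

```python
import ctypes
import sys

# Pick whichever structure base is NOT the native byte order here.
if sys.byteorder == 'little':
    SwappedStructure = ctypes.BigEndianStructure
else:
    SwappedStructure = ctypes.LittleEndianStructure

def try_swapped_pointer_field():
    """Return the error message from declaring a pointer field, or None."""
    try:
        class Node(SwappedStructure):
            _fields_ = [('next', ctypes.POINTER(ctypes.c_int))]
    except TypeError as exc:
        return str(exc)
    return None

print(try_swapped_pointer_field())
```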
>
>
>>
>> Functions should be added to ctypes to create a ctypes object from
>> a struct description, and add long-double, and ucs-2 to ctypes.
>
>
> Well, ucs-4 should probably be added to ctypes as well. The current ctypes.c_wchar
> type corresponds to the C WCHAR type; its size is configuration dependent.
I think you are right. In the discussions for unifying string/unicode I
really like the proposals that are leaning toward having a unicode
object be an immutable string of either ucs-1, ucs-2, or ucs-4 depending
on what is in the string.
This does create some conversion issues that must be handled, but I
think it is the best option. In the Python 3.0 version of NumPy, I
think that's what we are going to have (three different string types
ucs-1, ucs-2, ucs-4).
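A rough sketch of the "narrowest width that fits" idea. The helper names are hypothetical, and the UTF-16 and UTF-32 codecs stand in for ucs-2 and ucs-4, which assumes the text contains no lone surrogates:

```python
# Codec used for each fixed width; 'utf-16-le' matches ucs-2 only because
# width 2 is chosen solely when every code point is below 0x10000.
ENCODINGS = {1: 'latin-1', 2: 'utf-16-le', 4: 'utf-32-le'}

def narrowest_width(text):
    """Return 1, 2, or 4: the bytes per character needed for this text."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1
    if highest < 0x10000:
        return 2
    return 4

def encode_fixed_width(text):
    """Encode text with the narrowest fixed-width encoding that holds it."""
    width = narrowest_width(text)
    return width, text.encode(ENCODINGS[width])

for sample in ('abc', 'caf\u00e9', '\u20ac', '\U0001F600'):
    width, data = encode_fixed_width(sample)
    print(repr(sample), width, len(data))
```

The conversion issue shows up when strings of different widths are combined: the result has to be widened to the larger of the two before the data can be copied.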
-Travis