[Python-Dev] PEP: Adding data-type objects to Python
Travis E. Oliphant
oliphant.travis at ieee.org
Sat Oct 28 21:10:49 CEST 2006
M.-A. Lemburg wrote:
> Travis E. Oliphant wrote:
>> M.-A. Lemburg wrote:
>>> Travis E. Oliphant wrote:
>>>> ------------------------------------------------------------------------
>>>>
>>>> PEP: <unassigned>
>>>> Title: Adding data-type objects to the standard library
>>>> Attributes
>>>>
>>>> kind -- returns the basic "kind" of the data-type. The basic kinds
>>>> are:
>>>> 't' - bit,
>>>> 'b' - bool,
>>>> 'i' - signed integer,
>>>> 'u' - unsigned integer,
>>>> 'f' - floating point,
>>>> 'c' - complex floating point,
>>>> 'S' - string (fixed-length sequence of char),
>>>> 'U' - fixed length sequence of UCS4,
>>> Shouldn't this read "fixed length sequence of Unicode" ?!
>>> The underlying code unit format (UCS2 and UCS4) depends on the
>>> Python version.
>> Well, in NumPy 'U' always means UCS4. So, I just copied that over. See
>> my questions at the bottom which talk about how to handle this. A
>> data-format does not necessarily have to correspond to something Python
>> represents with an Object.
>
> Ok, but why are you being specific about UCS4 (which is an internal
> storage format), while you are not specific about e.g. the
> internal bit size of the integers (which could be 32 or 64 bit) ?
>
The 'kind' does not specify how "big" the data-type (data-format) is. A
separate number is needed to give the size in bytes, and that holds for
the integers too: you can have 'u1', 'u2', 'u4', etc.
The same is true of Unicode. You can have 10-character Unicode
elements, 20-character elements, etc. But we have to be clear about what
a "character" is in the data-format, which is why the description says
UCS4.
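
As a rough illustration of the point above, here is a minimal sketch of how
a kind letter and a number combine into a size. The helper name describe()
and the constant UCS4_BYTES_PER_CHAR are hypothetical, not part of the
proposal or of NumPy. It assumes the number counts characters for 'U' and
bytes for every other kind.

```python
# Hypothetical helper for illustration only; not part of the PEP or NumPy.
UCS4_BYTES_PER_CHAR = 4  # one UCS4 code unit occupies 4 bytes


def describe(typestr):
    """Split a type string such as 'u4' or 'U10' into (kind, count, nbytes)."""
    kind, count = typestr[0], int(typestr[1:])
    # Assumption: for 'U' the number counts characters; otherwise it counts bytes.
    nbytes = count * UCS4_BYTES_PER_CHAR if kind == "U" else count
    return kind, count, nbytes


for ts in ("u1", "u2", "u4", "U10", "U20"):
    print(ts, describe(ts))

# Ten characters encoded as UTF-32 (no BOM) take the same number of bytes
# that the sketch computes for 'U10'.
assert len("0123456789".encode("utf-32-le")) == describe("U10")[2]
```

The kind letter stays the same across sizes. Only the number changes, and
for 'U' that number is only meaningful once the width of a "character" is
fixed.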
-Travis