[Python-Dev] PEP: Adding data-type objects to Python

Travis E. Oliphant oliphant.travis at ieee.org
Tue Oct 31 07:32:47 CET 2006


M.-A. Lemburg wrote:
> Travis E. Oliphant wrote:
> 
> I understand and that's why I'm asking why you made the range
> explicit in the definition.
> 

In the case of NumPy, it was so that String and Unicode arrays would both 
look like arrays of multi-character strings and not arrays of arrays of 
single characters.

But this can change in the data-format object.  I can see that the 
Unicode description needs to be improved.

> The definition should talk about Unicode code points.
> The number of bytes then determines whether you can only
> represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only)
> or UCS4 (4 bytes, all currently assigned code points).

Yes, you are correct.  In a data-format object, a string of Unicode 
characters should really be represented in the same way that an array 
of integers is represented.
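
As a rough illustration of that idea (a sketch only, not part of the PEP 
or of NumPy), a string can be treated as a sequence of integer code 
points, with the item size deciding which code points fit.  The names 
MAX_CODE_POINT, to_code_points and pack are hypothetical, and the limits 
follow the 1/2/4-byte breakdown quoted above.

```python
# Hypothetical sketch: a Unicode string viewed as an array of integers.
# Largest code point representable for each item size (in bytes),
# following the breakdown quoted above: ASCII, UCS2 (BMP only), UCS4.
MAX_CODE_POINT = {1: 0x7F, 2: 0xFFFF, 4: 0x10FFFF}


def to_code_points(text, itemsize):
    """Return the code points of text, checking they fit in itemsize bytes."""
    limit = MAX_CODE_POINT[itemsize]
    points = [ord(ch) for ch in text]
    for cp in points:
        if cp > limit:
            raise ValueError(
                "code point U+%04X does not fit in %d byte(s)" % (cp, itemsize)
            )
    return points


def pack(text, itemsize):
    """Pack text as fixed-width little-endian integers, one per code point."""
    return b"".join(
        cp.to_bytes(itemsize, "little") for cp in to_code_points(text, itemsize)
    )
```

With these definitions, to_code_points("abc", 1) gives [97, 98, 99], 
while a character outside the ASCII range raises ValueError at item 
size 1 and needs 2 or 4 bytes per item instead.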

-Travis


