[Python-Dev] New Py_UNICODE doc

Nicholas Bastin nbastin at opnet.com
Sat May 7 06:11:33 CEST 2005


On May 6, 2005, at 8:11 PM, Martin v. Löwis wrote:

> Nicholas Bastin wrote:
>> Well, this is a completely separate issue/problem. The internal
>> representation is UTF-16, and should be stated as such.  If the
>> built-in methods actually don't work with surrogate pairs, then that
>> should be fixed.
>
> Yes to the former, no to the latter. PEP 261 specifies what should
> and shouldn't work.

This PEP has several textual errors and ambiguities (which, admittedly,
may have been unavoidable given the state of the Unicode standard in
2001). However, putting that aside, I would recommend that:

--enable-unicode=ucs2

be replaced with:

--enable-unicode=utf16

and the docs be updated to reflect more accurately how the internal
storage type varies with this option.
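
To make the ucs2/utf16 distinction concrete, here is a minimal sketch.
It is written for current Python 3 rather than the 2005 interpreter
under discussion, and the helper name utf16_code_units is hypothetical,
used only for illustration. It shows that a code point outside the
Basic Multilingual Plane occupies two 16-bit code units (a surrogate
pair) in UTF-16, which is something a strict UCS-2 description cannot
express:

```python
def utf16_code_units(text):
    """Return the 16-bit code units of `text` when encoded as UTF-16.

    Hypothetical helper for illustration; not part of any Python API.
    """
    data = text.encode("utf-16-le")  # little-endian, no BOM
    return [
        int.from_bytes(data[i:i + 2], "little")
        for i in range(0, len(data), 2)
    ]


# A BMP character fits in a single code unit.
bmp_units = utf16_code_units("\u00e9")

# U+10000 lies outside the BMP and needs a surrogate pair.
astral_units = utf16_code_units("\U00010000")

print([hex(u) for u in bmp_units])     # ['0xe9']
print([hex(u) for u in astral_units])  # ['0xd800', '0xdc00']
```

Storage that holds such pairs is UTF-16 in practice, whatever the
configure option happens to be called.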

I would also like the community to strongly consider standardizing on a 
single internal representation, but I will leave that fight for another 
day.

--
Nick


