[Python-Dev] New Py_UNICODE doc

Nicholas Bastin nbastin at opnet.com
Sat May 7 22:13:55 CEST 2005


On May 7, 2005, at 9:29 AM, Martin v. Löwis wrote:

> Nicholas Bastin wrote:
>> --enable-unicode=ucs2
>>
>> be replaced with:
>>
>> --enable-unicode=utf16
>>
>> and the docs be updated to reflect more accurately the variance of the
>> internal storage type.
>
> -1. This breaks existing documentation and usage, and provides only
> minimum value.

Have you been missing this conversation?  UTF-16 is *WHAT PYTHON 
CURRENTLY IMPLEMENTS*.  The current documentation is flat out wrong.  
Breaking that isn't a big problem in my book.

It provides more than minimum value - it provides the truth.


> With --enable-unicode=ucs2, Python's Py_UNICODE does *not* start
> supporting the full Unicode ccs the same way it supports UCS-2.
> Individual surrogate values remain accessible, and supporting
> non-BMP characters is left to the application (with the exception
> of the UTF-8 codec).

I can't understand what you mean by this.  My point is that if you 
configure Python to support UCS-2, then it SHOULD NOT support surrogate 
pairs.  Supporting surrogate pairs is the purview of variable-width 
encodings, and UCS-2 is not among them.
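
To make the distinction concrete, here is a minimal sketch.  It assumes a 
modern Python 3 interpreter rather than the 2005 build under discussion, 
and the helper names are made up for illustration.  It shows that UTF-16 
needs two code units (a surrogate pair) for a character outside the BMP, 
which a fixed-width 16-bit encoding like UCS-2 has no way to represent:

```python
import struct

def utf16_code_units(ch):
    """Return the list of 16-bit UTF-16 code units for the string ch."""
    data = ch.encode("utf-16-be")  # big-endian, no BOM
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

def is_surrogate(unit):
    """True if a 16-bit code unit lies in the surrogate range."""
    return 0xD800 <= unit <= 0xDFFF

# U+00E9 is inside the BMP: one code unit, same value in UCS-2 and UTF-16.
bmp = utf16_code_units("\u00e9")

# U+1D11E is outside the BMP: UTF-16 encodes it as a surrogate pair.
non_bmp = utf16_code_units("\U0001D11E")

print([hex(u) for u in bmp])
print([hex(u) for u in non_bmp])
print(all(is_surrogate(u) for u in non_bmp))
```

The second character comes out as two code units, both in the surrogate 
range, which is exactly the variable-width behavior at issue.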

--
Nick


