[Python-Dev] New Py_UNICODE doc
Nicholas Bastin
nbastin at opnet.com
Sat May 7 22:13:55 CEST 2005
On May 7, 2005, at 9:29 AM, Martin v. Löwis wrote:
> Nicholas Bastin wrote:
>> --enable-unicode=ucs2
>>
>> be replaced with:
>>
>> --enable-unicode=utf16
>>
>> and the docs be updated to reflect more accurately the variance of the
>> internal storage type.
>
> -1. This breaks existing documentation and usage, and provides only
> minimum value.
Have you been missing this conversation? UTF-16 is *WHAT PYTHON
CURRENTLY IMPLEMENTS*. The current documentation is flat out wrong.
Breaking that isn't a big problem in my book.
It provides more than minimum value - it provides the truth.
> With --enable-unicode=ucs2, Python's Py_UNICODE does *not* start
> supporting the full Unicode ccs the same way it supports UCS-2.
> Individual surrogate values remain accessible, and supporting
> non-BMP characters is left to the application (with the exception
> of the UTF-8 codec).
I can't understand what you mean by this. My point is that if you
configure Python to support UCS-2, then it SHOULD NOT support surrogate
pairs. Supporting surrogate pairs is the purview of variable-width
encodings, and UCS-2 is not among them.
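
To illustrate the distinction, here is a minimal sketch. It assumes a
modern Python 3 interpreter rather than the 2005 build under discussion,
and the helper names utf16_code_units and fits_in_ucs2 are hypothetical,
chosen only for this example. It shows that a character outside the BMP
needs two 16-bit code units (a surrogate pair) in UTF-16, which a
fixed-width UCS-2 representation has no way to express.

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units (as ints) that encode one character."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_in_ucs2(ch):
    """True if the character is in the BMP, i.e. a single 16-bit unit suffices."""
    return ord(ch) <= 0xFFFF


bmp_char = "\u00e9"          # U+00E9, inside the BMP
non_bmp_char = "\U0001D11E"  # U+1D11E, outside the BMP

for ch in (bmp_char, non_bmp_char):
    units = [hex(u) for u in utf16_code_units(ch)]
    print(hex(ord(ch)), units, "fits in UCS-2:", fits_in_ucs2(ch))
```

The second character comes out as a high surrogate followed by a low
surrogate. That is variable-width behavior, so a build that stores and
interprets such pairs is doing UTF-16, not UCS-2.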
--
Nick