[Python-Dev] New Py_UNICODE doc
Shane Hathaway
shane at hathawaymix.org
Thu May 5 00:20:45 CEST 2005
Martin v. Löwis wrote:
> Nicholas Bastin wrote:
>
>>"This type represents the storage type which is used by Python
>>internally as the basis for holding Unicode ordinals. Extension module
>>developers should make no assumptions about the size of this type on
>>any given platform."
>
>
> But people want to know "Is Python's Unicode 16-bit or 32-bit?"
> So the documentation should explicitly say "it depends".
On a related note, it would be helpful if the documentation provided a
little more background on Unicode encodings. Specifically, that UCS-2 is
not the same as UTF-16, even though both use 16-bit code units and encode
the characters of the Basic Multilingual Plane identically. UTF-16 can
encode characters beyond U+FFFF as a pair of 16-bit surrogates (4 bytes),
while UCS-2 can't. A Py_UNICODE is either UCS-2 or UCS-4, depending on
how Python was built. It took me quite some time to figure that out so I
could produce a patch [1]_ for PyXPCOM that fixes its Unicode support.
.. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=281156
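
To make the UCS-2 versus UTF-16 distinction concrete, here is a minimal
sketch in Python. The helper names utf16_code_units and fits_in_ucs2 are
made up for illustration and are not part of Python or PyXPCOM. It only
shows the standard surrogate-pair arithmetic, not how any particular
Python build stores strings.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units (as ints) for one Unicode code point.

    Hypothetical helper, for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Inside the Basic Multilingual Plane: one 16-bit unit,
        # the same value UCS-2 would store.
        return [code_point]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


def fits_in_ucs2(code_point):
    """UCS-2 has no surrogate mechanism, so it stops at U+FFFF."""
    return 0 <= code_point <= 0xFFFF


if __name__ == "__main__":
    for cp in (0x41, 0x20AC, 0x1D11E):
        units = " ".join("%04X" % u for u in utf16_code_units(cp))
        print("U+%04X  UTF-16: %s  fits in UCS-2: %s"
              % (cp, units, fits_in_ucs2(cp)))
```

U+0041 and U+20AC come out as a single 16-bit unit, identical in both
encodings. U+1D11E needs two units in UTF-16 and has no UCS-2
representation at all.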
Shane