[Python-Dev] New Py_UNICODE doc
"Martin v. Löwis"
martin at v.loewis.de
Sat May 7 01:43:15 CEST 2005
Nicholas Bastin wrote:
> If this is the case, then we're clearly misleading users. If the
> configure script says UCS-2, then as a user I would assume that
> surrogate pairs would *not* be encoded, because I chose UCS-2, and it
> doesn't support that.
What do you mean by that? That the interpreter crashes if you try
to store a low surrogate into a Py_UNICODE?
> I would assume that any UTF-16 string I would
> read would be transcoded into the internal type (UCS-2), and information
> would be lost. If this is not the case, then what does the configure
> option mean?
It tells you whether you have the two-octet form of the Universal
Character Set, or the four-octet form.
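To illustrate the point, here is a minimal sketch (not CPython's actual code; the helper name to_utf16_units is hypothetical) of how a code point beyond U+FFFF is split into two 16-bit units. Each half of the pair fits in a two-octet storage unit, so a two-octet Py_UNICODE can hold surrogates without any loss.

```python
def to_utf16_units(code_point):
    """Return the list of 16-bit code units for one code point.

    Hypothetical helper for illustration only; it is not part of
    CPython or of any configure option.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if code_point < 0x10000:
        # Fits in one 16-bit unit (this includes lone surrogate values).
        return [code_point]
    # Split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


# U+1D11E (MUSICAL SYMBOL G CLEF) needs a surrogate pair.
units = to_utf16_units(0x1D11E)
print([hex(u) for u in units])
```

Both resulting values are below 0x10000, which is all that a two-octet storage unit requires.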
Regards,
Martin