[issue4474] PyUnicode_FromWideChar incorrect for characters outside the BMP (unix only)
report at bugs.python.org
Wed Jan 21 03:05:11 CET 2009
STINNER Victor <victor.stinner at haypocalc.com> added the comment:
> Also note that on platforms with 16-bit wchar_t, the comparison
> (0xffff < *w) will always be false, so an additional check for
> (Py_UNICODE_SIZE > 2) is needed.
Yes, but the right test is (SIZEOF_WCHAR_T > 2). I wrote a new test:
#if (Py_UNICODE_SIZE == 2) && (SIZEOF_WCHAR_T > 2)
const wchar_t *orig_w;
> BTW: Please always use upper-case hex literals, or at least don't
> mix the case within the same function.
I try, but it would be easier if the rule were already respected: there
are many tabs and many lower-case hex literals. I copied and pasted from
existing code in unicodeobject.c...
Patch version 3:
- disable the UTF-16 surrogate code for 16-bit wchar_t, so my patch is
only used for 16-bit Py_UNICODE and 32-bit wchar_t, which is the
default case for Python 2.6 and 3.0 on Linux
- replace tabs with spaces (in existing code)
- use upper case literals
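
For readers who have not opened the patch, here is a minimal sketch of the
surrogate conversion it relies on. This is not the patch code: the helper
names utf16_length and encode_utf16 are hypothetical, and uint32_t/uint16_t
stand in for a 32-bit wchar_t and a 16-bit Py_UNICODE so that the example
compiles the same way on every platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of 16-bit units needed to store `size`
   32-bit code points. Each code point above 0xFFFF needs two units. */
static size_t
utf16_length(const uint32_t *w, size_t size)
{
    size_t n = size;
    size_t i;
    for (i = 0; i < size; i++) {
        if (w[i] > 0xFFFF)
            n++;
    }
    return n;
}

/* Hypothetical helper: write one code point to `out` as UTF-16.
   Returns the number of 16-bit units written (1 or 2). */
static size_t
encode_utf16(uint32_t ch, uint16_t *out)
{
    if (ch > 0xFFFF) {
        ch -= 0x10000;
        out[0] = (uint16_t)(0xD800 | (ch >> 10));    /* high surrogate */
        out[1] = (uint16_t)(0xDC00 | (ch & 0x3FF));  /* low surrogate */
        return 2;
    }
    out[0] = (uint16_t)ch;
    return 1;
}
```

With these helpers, U+1D11E is stored as the pair 0xD834 0xDD1E, and a
string containing it needs one more 16-bit unit than it has code points.
On a platform with a 16-bit wchar_t the (ch > 0xFFFF) branch can never be
taken, which is why the real code is guarded by the preprocessor test on
SIZEOF_WCHAR_T.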
Added file: http://bugs.python.org/file12822/unicode_fromwidechar_surrogate-3.patch
Python tracker <report at bugs.python.org>