[Python-Dev] Re: [Python-checkins] python/dist/src/Objects unicodeobject.c, 2.197, 2.198
Martin v. Löwis
martin at v.loewis.de
Thu Sep 18 11:13:12 EDT 2003
"M.-A. Lemburg" <mal at lemburg.com> writes:
> Because that's what was used as basis in the type implementation
> as well as the codecs (internal and external). Comparisons simply
> work differently when you're using a signed type which is also
> why most compilers warn about this -- but you know that.
Yes and no. It appears that you are assuming that Py_UNICODE is always
unsigned; however, that is (AFAICT) documented nowhere, and I don't see
any strong reason for it ("it used to be that way" is not a strong
reason - code does change over time).
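
To make the comparison issue concrete, here is a minimal sketch of how
the same range check behaves under a signed and an unsigned code unit.
The typedefs and function names are hypothetical stand-ins, not taken
from unicodeobject.c, and the 16-bit width is an assumption chosen only
to keep the example small.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a signed and an unsigned 16-bit code unit
   (illustration only; not the real Py_UNICODE definition). */
typedef int16_t  signed_unit;
typedef uint16_t unsigned_unit;

/* A range check of the kind a codec might use: is this unit non-ASCII? */
static int is_non_ascii_signed(signed_unit ch)
{
    return ch >= 0x80;   /* negative values compare as "small" */
}

static int is_non_ascii_unsigned(unsigned_unit ch)
{
    return ch >= 0x80;   /* every value above 0x7F passes */
}
```

With these definitions, U+FFFD stored in unsigned_unit passes the
check, while the same bit pattern in signed_unit is the value -3 and
fails it. Code that relies on such checks has to cast to an unsigned
type first if the underlying type may be signed.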
> A signed type also doesn't make much sense for things like
> character storage -- the sign information is useless and you
> lose a bit for each character.
OTOH, using wchar_t where possible is also valuable. On all systems
with a signed wchar_t that we know of, that wchar_t has 32 bits, so
losing the sign bit does not lose any character code points: Unicode
code points stop at 0x10FFFF, well below 2**21, anyway.
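
As a rough check of that claim, the sketch below stores a code point in
a signed 32-bit type and verifies that it comes back unchanged and
non-negative. It assumes int32_t as a stand-in for a signed 32-bit
wchar_t; the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Highest Unicode code point (U+10FFFF). */
#define MAX_CODE_POINT 0x10FFFF

/* Assumption: int32_t models a signed 32-bit wchar_t. */
typedef int32_t signed_wide_unit;

/* Returns 1 if cp is within the Unicode range and survives a round
   trip through the signed type without turning negative. */
static int fits_in_signed_unit(uint32_t cp)
{
    if (cp > MAX_CODE_POINT)
        return 0;
    /* cp <= 0x10FFFF < INT32_MAX, so this conversion is exact. */
    signed_wide_unit stored = (signed_wide_unit)cp;
    return stored >= 0 && (uint32_t)stored == cp;
}
```

Since MAX_CODE_POINT is far below INT32_MAX, every valid code point
takes the exact-conversion path and the sign bit is never needed.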
Regards,
Martin