[Python-Dev] Re: [Python-checkins] python/dist/src/Objects unicodeobject.c, 2.197, 2.198

Martin v. Löwis martin at v.loewis.de
Thu Sep 18 11:13:12 EDT 2003


"M.-A. Lemburg" <mal at lemburg.com> writes:

> Because that's what was used as basis in the type implementation
> as well as the codecs (internal and external). Comparisons simply
> work differently when you're using a signed type which is also
> why most compilers warn about this -- but you know that.

Yes and no. You appear to assume that Py_UNICODE is always unsigned.
However, that is (AFAICT) documented nowhere, and I don't see any
strong reason to require it ("it used to be that way" is not a strong
reason - code does change over time).
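
To make the comparison issue concrete, here is a minimal sketch. The
typedef and function names are hypothetical stand-ins for the two
possible 16-bit definitions; they are not taken from unicodeobject.c.

```c
#include <stdint.h>

/* Hypothetical stand-ins for an unsigned and a signed 16-bit code unit. */
typedef uint16_t unsigned_unit;
typedef int16_t  signed_unit;

/* Returns 1 if the code unit lies above the ASCII range. */
static int above_ascii_unsigned(unsigned_unit ch)
{
    return ch > 0x7F;
}

/* Same test on a signed unit: a stored value with the top bit set is
   negative, so it compares as smaller than 0x7F. */
static int above_ascii_signed(signed_unit ch)
{
    return ch > 0x7F;
}
```

On a typical two's-complement system, the bit pattern 0xFFFD held in
the signed type reads back as -3, so the same range check gives a
different answer. This only bites when the signed type is too narrow
for the values stored in it.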

> A signed type also doesn't make much sense for things like
> character storage -- the sign information is useless and you
> lose a bit for each character.

OTOH, using wchar_t where possible is also valuable. On all systems
with a signed wchar_t that we know of, that wchar_t has 32 bits, so
losing the sign bit does not lose any character code points: Unicode
has fewer than 2**21 code points (the highest is U+10FFFF), anyway.
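
The arithmetic can be checked with a small sketch. It assumes a 32-bit
signed type (int32_t stands in for such a wchar_t); the macro and
function names are made up for illustration.

```c
#include <stdint.h>

#define MAX_CODE_POINT 0x10FFFF  /* highest Unicode code point */

/* Returns 1 if every code point fits in the non-negative range of a
   32-bit signed integer, i.e. the sign bit is never needed. */
static int fits_in_signed_32(void)
{
    return MAX_CODE_POINT <= INT32_MAX;
}

/* Counts how many value bits are needed to represent v. */
static int bits_needed(uint32_t v)
{
    int n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}
```

The highest code point needs 21 bits, which leaves 10 spare value bits
below the sign bit of a 32-bit signed type.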

Regards,
Martin



