[Python-Dev] Re: [Python-checkins] python/dist/src/Objects unicodeobject.c, 2.197, 2.198
M.-A. Lemburg
mal at lemburg.com
Mon Sep 22 15:13:26 EDT 2003
Tim Peters wrote:
> [Tim]
>
>>>At the moment, it appears there's no identified reason to care about
>>>signedness of a greater-than 16-bit type,
>
>
> [M.-A. Lemburg]
>
>>Sure there is: first of all, having a single type that can
>>be signed on some platforms and unsigned on others is a bad
>>thing per se
>
>
> We inherit that from C, though -- it's fine by C if wchar_t is signed or
> unsigned, just as it refused to define the signedness of char.
It may be fine for C... it is not for the Unicode implementation
since that has always assumed Py_UNICODE to be unsigned. This
is fixed now.
>>and second the 32-bit signed wchar_t value was what triggered this
>>thread in the first place.
>
> What triggered the thread originally was a segfault due to the code making a
> branch based on the content of uninitialized memory. The code clearly
> didn't *think* it was reading up random heap bits, so that was a bug
> regardless of wchar_t's signedness.
True, but the test (unicode->str[0] < 256) is what revealed a
second bug and that's what we've been discussing all along.
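To illustrate why a comparison like (unicode->str[0] < 256) depends on signedness, here is a minimal sketch. The type and function names are hypothetical stand-ins, not the ones used in unicodeobject.c; it only assumes a 32-bit code unit that is signed on one platform and unsigned on another.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a 32-bit code unit that is signed on one
   platform and unsigned on another (not the real Py_UNICODE typedef). */
typedef int32_t  unit_signed;
typedef uint32_t unit_unsigned;

/* Mirrors a range test of the form (ch < 256) with a signed unit:
   any negative value also passes the test. */
static int is_latin1_signed(unit_signed ch)
{
    return ch < 256;
}

/* The same test with an unsigned unit: a value with the high bit set
   is a large positive number and fails the test. */
static int is_latin1_unsigned(unit_unsigned ch)
{
    return ch < 256;
}
```

With a signed unit, a value such as -1 passes the test and could then be used as an index into a 256-entry table; with an unsigned unit the same bit pattern is 0xFFFFFFFF and is rejected.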
> That wchar_t happened to be a signed
> 32-bit type on Jeremy's box is what uncovered the read-uninitialized-memory
> bug.
>
> If there's no other code vulnerable to bad behavior if wchar_t is a signed
> 32-bit type (nobody has identified another case), objections to it being
> signed anyway seem technically groundless.
There are more comparisons of the above type in the code and
even worse: it is documented that Py_UNICODE is unsigned,
so it's very likely that code external to the Python distribution
such as codec packages or applications talking to libraries
relies on that assumption as well.
> Martin did give a technical
> reason (efficiency) for wanting to continue to use wchar_t on Jeremy's
> system.
Python won't be using wchar_t on those systems anymore, so
the problem is solved and the original intent restored. If
efficiency matters, programmers are always free to cast Py_UNICODE
to wchar_t on these systems for fast read-only access.
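A rough sketch of what such a read-only cast could look like follows. The typedef my_unicode and the helper unit_strlen are hypothetical, standing in for an unsigned 32-bit Py_UNICODE; the sketch assumes the cast is only used where wchar_t has the same width, and falls back to a plain loop otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical stand-in for an unsigned 32-bit Py_UNICODE. */
typedef uint32_t my_unicode;

/* Length of a NUL-terminated buffer of code units.
   When wchar_t has the same width, the buffer is viewed read-only as
   wchar_t so the C library routine can be used; otherwise a simple
   loop counts the units. */
static size_t unit_strlen(const my_unicode *s)
{
    size_t n = 0;

    if (sizeof(wchar_t) == sizeof(my_unicode)) {
        return wcslen((const wchar_t *)s);
    }
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

The cast is limited to reading; writing through the wchar_t pointer would reintroduce the signedness assumptions the unsigned type is meant to avoid.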
--
Marc-Andre Lemburg
eGenix.com