[Python-Dev] Re: sre warnings

Tim Peters tim.one at comcast.net
Sat Jan 10 13:59:05 EST 2004

[Martin v. Loewis]
>> Please change all uses of sizes/positions to "size_t", and
>> change the special -1 marker to (size_t)-1.

[Sjoerd Mullender]
> If sizeof(int) < sizeof(size_t), is it *guaranteed* that (size_t)-1
> expands to a bit pattern of all 1's?

Guaranteed (not to mention *guaranteed* <wink>) is a very strong thing, and
C explicitly allows that an integer type other than unsigned char may
contain bits that don't contribute to the value (like parity bits, or a
not-an-integer bit).  This makes "a bit pattern of all 1's" hard to relate
to what the standard says.  For casts from signed to unsigned, when the
original value can't be represented in the new type (which includes all
negative original values):

    the value is converted by repeatedly adding or subtracting one
    more than the maximum value that can be represented in the new
    type until the value is in the range of the new type.

If we assume a binary box and that int and size_t don't have "extra" bits
(Python assumes all this in more than one place already), then yes, the
result is guaranteed to be a string of 1 bits:  it's mathematically 2**i-1,
where i is the number of bits in a size_t.  That resulting mathematical
*value* is well-defined even if there are extra bits, although C doesn't
define what the extra bits may contain.
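
A minimal sketch of the conversion rule quoted above, assuming a C99
compiler (for SIZE_MAX from <stdint.h>).  The names NO_POSITION, to_size
and check_marker are made up for illustration and don't appear in Python's
source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX; assumes a C99 compiler */

/* Hypothetical marker name, for illustration only. */
#define NO_POSITION ((size_t)-1)

/* Convert a signed value to size_t.  For a negative v, the result is
 * v plus (SIZE_MAX + 1), added as many times as needed to land in range. */
static size_t to_size(long v)
{
    return (size_t)v;
}

/* Sanity checks; each one follows from the conversion rule alone,
 * whether or not the types have padding bits. */
static void check_marker(void)
{
    assert(NO_POSITION == SIZE_MAX);      /* -1 + (SIZE_MAX + 1) */
    assert(to_size(-1) == SIZE_MAX);
    assert(to_size(-2) == SIZE_MAX - 1);  /* -2 + (SIZE_MAX + 1) */
}
```

The checks compare values, not bit patterns, so they should hold even on a
box where "all 1 bits" isn't a meaningful description.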

> Also, is it *guaranteed* that you don't get more warnings
> (converting a negative quantity to unsigned)?

Well, the standard never guarantees that you won't get a warning.  It's
traditional for C compilers not to whine about explicit casts, no matter how
goofy they are.  Casting from signed to unsigned is well-defined, so there's
no reason at all to whine about that.  Note that there are instances of
(size_t)-1 in Python already (cPickle.c and longobject.c), and no reports of
warnings from those.

> I've been using ~(size_t)1 for things like this where these *are*
> guaranteed.

I think you mean ~(size_t)0.  There are also two instances of that in
Python's source now.  I think either way is fine.
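
For comparison, a small sketch of the three spellings discussed here.  The
function names are hypothetical, and the equality of the first two assumes
size_t has no "extra" bits, as discussed above:

```c
#include <assert.h>
#include <stddef.h>

/* Complement of zero: every bit of a size_t set. */
static size_t all_ones(void)
{
    return ~(size_t)0;
}

/* Conversion of -1: the largest size_t value. */
static size_t minus_one(void)
{
    return (size_t)-1;
}

/* Complement of one: every bit set except the lowest. */
static size_t not_one(void)
{
    return ~(size_t)1;
}

/* Shows why ~(size_t)1 is not the marker that was intended. */
static void check_spellings(void)
{
    assert(all_ones() == minus_one());
    assert(not_one() == minus_one() - 1);
    assert(not_one() != minus_one());
}
```

So ~(size_t)1 comes out one less than the other two, which is why it
doesn't work as the "all 1 bits" marker.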
