[Python-Dev] hex constants, bit patterns, PEP 237 warnings and gettext

Oren Tirosh oren-py-d@hishome.net
Wed, 14 Aug 2002 08:24:58 -0400

On Wed, Aug 14, 2002 at 08:02:30AM -0400, Barry A. Warsaw wrote:
> >>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:
>     TP> [Barry A. Warsaw]
>     >> ...  So if "0x950412de" isn't the right way to write a 32 bit
>     >> pattern,
>     TP> It isn't today, but will be in 2.4.
> But isn't that wasteful?  Today I have to add the L to my hex
> constants, but in a year from now, I can just turn around and remove
> them again.  What's the point?
> The deeper question is: what's wrong with "0x950412de"?  What bits
> have I lost by writing my hex constant this way?  I'm trying to
> understand why hex constants > sys.maxint have to be deprecated.

Unifying ints and longs means that there is no predefined bit width for
numbers. Conceptually they are all infinitely wide. Positive numbers have an
infinite number of leading '0's and negative numbers have an infinite number
of leading 'F's. Numbers that have fewer than 8/16 digits to the right of
this infinite sequence of '0's or 'F's happen to get a more efficient
internal representation and a different ob_type, but other than that it
should be impossible to tell the difference between an int and a long.
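
To make the "infinite leading digits" picture concrete, here is a small
sketch. It assumes an interpreter in which ints and longs are already
unified (as in current Python), and low_bits is a name made up for this
illustration, not an existing function.

```python
def low_bits(value, width):
    """Return the lowest `width` bits of value as a non-negative int."""
    return value & ((1 << width) - 1)

# However wide the window, -1 shows nothing but set bits ('F' digits in hex).
minus_one_32 = low_bits(-1, 32)
minus_one_64 = low_bits(-1, 64)

# A positive number shows only leading zeros, whatever the window width.
five_64 = low_bits(5, 64)

# Shifting right never runs out of sign bits either.
still_minus_one = -1 >> 100

print(hex(minus_one_32), hex(minus_one_64), hex(five_64), still_minus_one)
```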

What's wrong with 0x950412de is that with a word width of 32 bits it is 
negative and therefore the invisible bits to the left are all set. With a 
word width of 64 bits or with an infinite width they are cleared.
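
A rough sketch of that dependence on word width, again assuming unified
integers. The helper as_signed is hypothetical and only mimics how a
two's-complement machine word of a given width would read the pattern.

```python
def as_signed(pattern, width):
    """Read the low `width` bits of pattern as a two's-complement number."""
    pattern &= (1 << width) - 1
    sign_bit = 1 << (width - 1)
    # If the top bit of the word is set, the value is negative.
    return pattern - (1 << width) if pattern & sign_bit else pattern

PATTERN = 0x950412de

signed_32 = as_signed(PATTERN, 32)  # bit 31 is set, so this is negative
signed_64 = as_signed(PATTERN, 64)  # bit 63 is clear, so this is positive

print(signed_32, signed_64)
```

The same 32-bit pattern yields two different numbers, which is exactly the
ambiguity in question.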

That's why I propose borrowing the 'U' suffix from C. 0x950412deU would
mean that the bits to the left are cleared. This way you could change your
code only once, document your intentions clearly and get a number that is
guaranteed to be equivalent on Python and C compilers with different native
word sizes.
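
Python has no 'U' suffix, so the following is only a stand-in for the
intent, using the standard struct module: the "I" format code corresponds
to C's unsigned int and "i" to C's signed int. The names MAGIC, raw,
as_unsigned and as_signed are chosen for this sketch.

```python
import struct

MAGIC = 0x950412de

# Pack as a 4-byte little-endian C "unsigned int".
raw = struct.pack("<I", MAGIC)

# Reading the same four bytes back as unsigned recovers the constant...
as_unsigned = struct.unpack("<I", raw)[0]

# ...while reading them as a signed 32-bit int gives the negative value.
as_signed = struct.unpack("<i", raw)[0]

print(raw.hex(), as_unsigned == MAGIC, as_signed)
```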