[Python-Dev] Deprecation warning on integer shifts and such

Jack Jansen Jack.Jansen@oratrix.com
Tue, 13 Aug 2002 23:51:29 +0200


On dinsdag, augustus 13, 2002, at 10:29 , Guido van Rossum wrote:
>> Huh??! Now you've confused me. If "k" means "32 bit mask", why 
>> would it
>> be changed in 2.4 not to accept negative values? "-1" is a perfectly
>> normal way to specify "0xffffffff" in C usage...
>
> Hm, in Python I'd hope that people would write 0xffffffff if they want
> 32 one bits -- -1L has an infinite number of one bits, and on 64-bit
> systems, -1 has 64 one-bits instead of 32.  Most masks are formed by
> taking a small positive constant (e.g. 1 or 0xff) and shifting it
> left.  In Python 2.4 that will always return a positive value.

That is all fine if you're on an island. But if you transcribe 
existing C code to Python, or use examples or manuals written 
for C, then I would think there's no reason not to be lenient.

But (you hear Jack's reasoning collapsing in the distance) I 
haven't checked that Apple still uses -1 to mean "all ones" in 
their sample code. They used to do that a lot, but they may have 
stopped that. I don't know.

> In the end, I see two possibilities: lenient, taking the lower N bits,
> or strict, requiring [0 .. 2**32-1].  The proposal I made above was an
> intermediate move on the way to the strict approach (given the reality
> that in 2.3, 1<<31 is negative).

I would say strict on positive values, lenient on negative ones. 
Not too lenient, of course: -0x100000000L should not be a passed 
as 0 but give an exception.

This would correspond to everyday use in languages such as C.

>>> We'll have to niggle about the C type corresponding to 'k'.  
>>> Should it
>>> be 'int' or 'long'?  It may not matter for you, since you 
>>> expect to be
>>> running on 32-bit hardware forever; but it matters for other 
>>> potential
>>> users of 'k'.  We could also have both 'k' and 'K', where 'k' stores
>>> into a C int and 'K' into a C long.
>>
>> How about k1 for a byte, k2 for a short, k4 for a long and k8 
>> for a long
>> long?
>
> Hm, the format characters typically correspond to a specific C type.
> We already have 'b' for unsigned char and 'B' for signed/unsigned
> char, 'h' for unsigned short and 'H' for signed/unsigned short.  These
> are unfortunately inconsistent with 'i' for signed int and 'l' for
> signed long.
>
> So I'd rather you pick a C type for 'k' (and a policy about range
> checks).

ok. How about uint32_t? And, while we're at it, add Q for uint64_t?

>> In my proposal these would then probably become PyInt_As1Byte,
>> PyInt_As2Bytes, PyInt_As4Bytes and PyInt_As8Bytes.
>
> And what would their return types be?

uint8_t, uint16_t, uint32_t and uint64_t.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -