[Tutor] operators >> and &
Steven D'Aprano
steve at pearwood.info
Sun Feb 14 02:21:49 CET 2010
On Sun, 14 Feb 2010 10:58:10 am Alan Gauld wrote:
> "spir" <denis.spir at free.fr> wrote
>
> > PS: in "l>>24 & 255", the & operation is useless, since all 24
> > higher bits are already thrown away by the shift:
>
> They are not gone however there are still 32 bits in an integer so
> the top bits *should* be set to zero.
No, Python ints are not 32 bit native ints. They're not even 64 bit
ints. Python has unified the old "int" type with "long", so that ints
automatically grow as needed. This is in Python 3.1:
>>> (0).bit_length()
0
>>> (1).bit_length()
1
>>> (2).bit_length()
2
>>> (3).bit_length()
2
>>> (10**100).bit_length()
333
Consequently, if you are handed an arbitrary int and don't know where
it came from, you can't make any assumptions about the number of bits
it uses.
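To make that concrete, here is a minimal sketch (the names n and widths
are only for illustration) showing that repeated shifting never runs
into a fixed width:

```python
# Python ints have no fixed width: each left shift simply adds bits.
n = 1
widths = []
for _ in range(4):
    widths.append(n.bit_length())
    n <<= 32  # would overflow a fixed-width 32-bit or 64-bit int

print(widths)  # [1, 33, 65, 97]
```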
> But glitches can occur from time to time...
If Python had a glitch of the magnitude of right-shifting non-zero bits
into a number, that would be not just a bug but a HUGE bug. That would
be as serious as having 1+1 return 374 instead of 2. Guarding against
(say) 8 >> 1 returning anything other than 4 makes as much sense as
guarding against 8//2 returning something other than 4: if you can't
trust Python to get simple integer arithmetic right, then you can't
trust it to do *anything*, and your guard (ANDing it with 255) can't be
trusted either.
> It is good practice to restrict the range to the 8 bits needed by
> and'ing with 255
> even when you think you should be safe.
It is certainly good practice if you are dealing with numbers which
might be more than 24 bits to start with:
>>> n = 5**25
>>> n >> 24
17763568394
>>> n >> 24 & 255
10
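The same idea as a small helper, as a sketch only: byte_at is a
made-up name, not anything from the standard library. For a value that
fits in 32 bits the mask changes nothing; for a wider value it is what
keeps the result down to one byte.

```python
def byte_at(n, shift):
    """Return the 8 bits of n that start at bit position `shift`."""
    return (n >> shift) & 0xFF

small = 0xDEADBEEF  # fits in 32 bits
big = 5**25         # needs 59 bits

print(small >> 24, byte_at(small, 24))  # 222 222
print(big >> 24, byte_at(big, 24))      # 17763568394 10
```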
But *if* you know the int is no more than 32 bits, then adding in a
guard to protect against bugs in the >> operator is just wasting CPU
cycles and needlessly complicating the code. The right way to guard
against "this will never happen" scenarios is with assert:
assert n.bit_length() <= 32  # stricter: "assert 0 <= n < 2**32"
print(n >> 24)
This has two additional advantages:
(1) It clearly signals to the reader what your intention is ("I'm
absolutely 100% sure that n will not be more than 32 bits, but since
I'm a fallible human, I'd rather find out about an error in my logic as
soon as possible").
(2) If the caller cares enough about speed to object to the tiny little
cost of the assertion, he or she can disable it by passing the -O (O
for Optimise) switch to Python.
(More realistically, each assert is very cheap on its own, but a big
application might have many, many asserts, and the cost adds up.)
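If you want to see the effect of -O for yourself, here is a rough
sketch that runs the same one-line assert in a fresh interpreter, with
and without the switch. SNIPPET and run are names invented for this
example, and it assumes the PYTHONOPTIMIZE environment variable is not
set.

```python
import subprocess
import sys

# A one-liner whose assertion is false: 5**25 needs 59 bits.
SNIPPET = "assert (5**25).bit_length() <= 32, 'too wide'"

def run(flags):
    """Run SNIPPET in a fresh interpreter and return its exit status."""
    cmd = [sys.executable, *flags, "-c", SNIPPET]
    return subprocess.run(cmd, stderr=subprocess.DEVNULL).returncode

normal = run([])         # the assert fires: non-zero exit status
optimised = run(["-O"])  # the assert is skipped: exit status 0
print(normal, optimised)
```

Because the check disappears completely under -O, assert is only
suitable for "this will never happen" conditions, not for validating
input you actually expect to be bad sometimes.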
--
Steven D'Aprano