[Tutor] operators >> and &

Mon Feb 15 00:34:00 CET 2010

On Sun, 14 Feb 2010 08:16:18 pm Alan Gauld wrote:

> >> But glitches can occur from time to time...
> >
> > If Python had a glitch of the magnitude of right-shifting non-zero
> > bits into a number, that would be not just a bug but a HUGE bug.
>
> Bit shifting is machine specific. 

Pardon me, but that's incorrect. Python is not assembly, or C, and the 
behaviour of bit shifting in Python is NOT machine specific. Python 
doesn't merely expose the native bit shift operations on native ints; 
each shift is a high-level, object-oriented operation with carefully 
defined semantics.

http://docs.python.org/library/stdtypes.html#bit-string-operations-on-integer-types

In Python, a left shift of n MUST return the equivalent of 
multiplication by 2**n, and a right shift MUST return the equivalent of 
floor division (//) by 2**n. Any other result is a SERIOUS bug in Python 
of the same magnitude (and the same likelihood) as 10/2 returning 18.
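
If you want to convince yourself, here is a quick sketch that checks both 
rules on a handful of ints, including negative values and values far wider 
than any machine word. The list of test values is my own arbitrary choice, 
nothing more:

```python
# Check the documented shift semantics on a range of ints, including
# negatives and values much wider than a native machine word.
values = [0, 1, 5, 255, -1, -5, -256, 2**100 + 12345, -(2**100) - 12345]

for x in values:
    for n in range(70):
        assert x << n == x * 2**n    # left shift: multiply by 2**n
        assert x >> n == x // 2**n   # right shift: floor-divide by 2**n

# Floor division rounds toward negative infinity, so this is -3, not -2.
print(-5 >> 1)
```

The loop should finish silently on any platform; a failing assert there 
would be exactly the kind of serious bug described above.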

So while I bow to your knowledge of bit operations in assembler on 
obscure four bit processors, Python does not do that. (I'm not even 
sure if Python runs on any four bit CPUs!) Python is a high-level 
language, not an assembler, and the behaviour of the bit operators >> 
and << is guaranteed to be the same no matter what CPU you are using.

(The only low-level ops that Python exposes are floating point ops: 
Python mostly does whatever the C library on your platform does.)

> > It is certainly good practice if you are dealing with numbers which
> > might be more than 24 bits to start with:
>
> Its more than good practice there, its essential.

Hardly. There are other ways of truncating a number to 8 bits, e.g. by 
using n % 256. If you're dealing with signed numbers, using & 255 will 
throw away the sign bit, which may be undesirable. And of course, it 
isn't desirable (let alone essential) to truncate the number if you 
don't need an 8 bit number in the first place!
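
A small sketch of the two truncations side by side. The helper names 
low_byte_mask and low_byte_mod are made up for illustration. For Python 
ints the two give the same answer, and in both cases a negative input comes 
back as a non-negative value from 0 to 255, so the sign is gone:

```python
def low_byte_mask(n):
    """Keep the low 8 bits using a bit mask (hypothetical helper)."""
    return n & 255


def low_byte_mod(n):
    """Keep the low 8 bits using modulo (hypothetical helper)."""
    return n % 256


for n in [0, 1, 255, 256, 1000, -1, -3, -256, -1000, 2**40 + 7]:
    # Both approaches agree for every Python int...
    assert low_byte_mask(n) == low_byte_mod(n)
    # ...and the result is never negative, whatever the sign of n.
    assert 0 <= low_byte_mask(n) <= 255

print(low_byte_mask(-3), low_byte_mod(-3))
```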

[and discussing the case where you know your input is already 8 bits]
> In the case in point the & 255 keeps the coding style consistent
> and provides an extra measure of protection against unexpected
> oddities so I would keep it in there.

So you add unnecessary operations to be consistent? That's terrible 
practice.

So if you have an operation like this:

n = 12*i**3 + 7

and later on, you then want n = i+1, do you write:

n = 1*i**1 + 1

instead to be "consistent"? I would hope not!

> > cycles and needlessly complicating the code. The right way to guard
> > against "this will never happen" scenarios is with assert:
> >
> > assert n.bit_length() <= 32  # or "assert 0 <= n < 2**32"
>
> I would accept the second condition but the mask is much faster.

Premature (micro) optimization is the root of all evil. An assert that 
can be turned off and not executed is infinitely faster than a bit 
mask which is always executed whether you want it or not.

And either way, the 20 seconds I lose trying to interpret the bit ops 
when I read the code is far more important than the 0.000001 seconds I 
lose executing the assert :)
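
To make that concrete, here is a rough sketch. The function split_bytes is 
a hypothetical helper of mine, not code from earlier in this thread, and it 
assumes the caller is supposed to pass a value that fits in 32 bits. The 
assert documents that assumption, and running Python with the -O switch 
strips it out entirely:

```python
def split_bytes(n):
    """Split a 32-bit unsigned value into four bytes, high byte first.

    Hypothetical helper, for illustration only.
    """
    # Guard against the "this will never happen" case. Skipped under -O.
    assert 0 <= n < 2**32, "n does not fit in 32 bits"
    # The top byte needs no mask: the assert already bounds n.
    return (n >> 24, (n >> 16) & 255, (n >> 8) & 255, n & 255)


print(split_bytes(0x12345678) == (0x12, 0x34, 0x56, 0x78))
```

Note that the masks that remain are the ones doing real work, namely 
discarding the higher bytes after each shift.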

> bit_length doesn't seem to work on any of my Pythons (2.5,2.6 and
> 3.1)

It won't work in 2.5 or 2.6: int.bit_length() is new in Python 2.7 and 
3.1. In 3.1, you're probably trying this:

123.bit_length()

and getting a syntax error. That's because the Python parser sees the . 
straight after the digits and tries to read a float literal, and 
123.bit_length is not a valid float literal.

You need to either group the int, or refer to it by name:

(123).bit_length()

n = 123
n.bit_length()
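
Here is a short sketch pulling those together, for a Python that has 
int.bit_length(). It also shows a third spelling, a space before the dot, 
and uses compile() to demonstrate that the bare literal form is rejected 
by the parser. The variable name literal_is_rejected is just for 
illustration:

```python
n = 123
print(n.bit_length())       # refer to the int by name
print((123).bit_length())   # group the literal in parentheses
print(123 .bit_length())    # a space before the dot also works

# The bare form never gets as far as running: the parser rejects it.
try:
    compile("123.bit_length()", "<example>", "eval")
except SyntaxError:
    literal_is_rejected = True
else:
    literal_is_rejected = False

print(literal_is_rejected)
```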

-- 
Steven D'Aprano