[Tutor] operators >> and &
Alan Gauld
alan.gauld at btinternet.com
Sun Feb 14 10:16:18 CET 2010
"Steven D'Aprano" <steve at pearwood.info> wrote
>> They are not gone however there are still 32 bits in an integer so
>> the top bits *should* be set to zero.
>
> No, Python ints are not 32 bit native ints. They're not even 64 bit
> ints. Python has unified the old "int" type with "long", so that ints
> automatically grow as needed. This is in Python 3.1:
Valid point, but irrelevant to the one I was making, which
is that the number after shifting is longer than 8 bits.
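
A minimal sketch of that point, assuming a made-up 32-bit value (0x12345678 is only an illustration, not a value from the earlier thread): after the shift there are still 24 significant bits, and only the mask brings the result down to 8.

```python
# Hypothetical 32-bit value, chosen only for illustration.
n = 0x12345678

shifted = n >> 8         # drops the low 8 bits, 24 significant bits remain
masked = shifted & 255   # keeps only the low 8 bits of the shifted value

print(hex(shifted))  # 0x123456
print(hex(masked))   # 0x56
```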
>> But glitches can occur from time to time...
>
> If Python had a glitch of the magnitude of right-shifting non-zero bits
> into a number, that would be not just a bug but a HUGE bug.
Bit shifting is machine specific. Some CPUs (the DEC PDP
range, from memory, is an example) will add the carry bit,
for example; most will not. But you can never be sure unless you
know exactly which architecture the program will run on.
And of course data can always be corrupted at any time, so it's
always wise to take as many precautions as possible to keep
it clean (although corruption within the CPU itself is, I agree,
extremely unlikely).
> be as serious as having 1+1 return 374 instead of 2. Guarding against
> (say) 8 >> 1 returning anything other than 4
Not if you have a 4 bit processor and the previous operation
set the carry flag. In that case returning 12 would be eminently
sensible... and used to be a common assembler trick for
recovering from overflow errors.
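
To make the 4 bit case concrete, here is a rough simulation in Python of a rotate-right-through-carry. The function name and the register model are my own assumptions for illustration; it is not a model of any particular CPU, and Python's own >> does not behave this way.

```python
def rotate_right_through_carry(value, carry, width=4):
    """Simulate rotate-right-through-carry on a width-bit register.

    Hypothetical helper for illustration only. `carry` must be 0 or 1.
    Returns (new_value, new_carry).
    """
    mask = (1 << width) - 1
    value &= mask                      # confine the value to the register width
    new_carry = value & 1              # the bit shifted out becomes the new carry
    new_value = (value >> 1) | (carry << (width - 1))  # old carry enters at the top
    return new_value & mask, new_carry

print(rotate_right_through_carry(8, carry=1))  # (12, 0)
print(rotate_right_through_carry(8, carry=0))  # (4, 0)
print(8 >> 1)                                  # 4, Python's plain shift
```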
> guarding against 8//2 returning something other than 4: if you can't
> trust Python to get simple integer arithmetic right,
But this is not simple integer arithmetic, it is bit manipulation.
You can use bit manipulation to fake arithmetic but they are
fundamentally different operations and may not always
produce the same results depending on how the designer
built it!
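
A small sketch of how the two can diverge, using a negative number as the assumed test case. In Python the shift and floor division happen to agree, but a division that truncates toward zero (as C-style integer division does) gives a different answer.

```python
n = -7

shifted = n >> 1        # arithmetic shift, rounds toward minus infinity
floored = n // 2        # Python floor division, also rounds toward minus infinity
truncated = int(n / 2)  # truncates toward zero, like C-style integer division

print(shifted, floored, truncated)  # -4 -4 -3
```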
> trust it to do *anything*, and your guard (ANDing it with 255) can't be
> trusted either.
Nothing can be trusted 100% on a computer because, as you
say, the guard might itself be corrupted. It's all about risk management.
But when it comes to bit operations I'd always have at least one
extra level of check, whether it be a mask or a checksum.
> It is certainly good practice if you are dealing with numbers which
> might be more than 24 bits to start with:
It's more than good practice there, it's essential.
> But *if* you know the int is no more than 32 bits, then adding in a
> guard to protect against bugs in the >> operator is just wasting CPU
It may not be a bug; it may be a design feature.
Now all modern CPUs behave as you would expect, but if
you are running on older equipment (or specialised
hardware - but that's less likely to have Python onboard!)
you can never be quite sure how bitwise operations will react
at boundary cases. If you know for certain what the runtime
environment will be then you can afford to take a chance.
In the case in point the & 255 keeps the coding style consistent
and provides an extra measure of protection against unexpected
oddities so I would keep it in there.
> cycles and needlessly complicating the code. The right way to guard
> against "this will never happen" scenarios is with assert:
>
> assert n.bit_length() <= 32 # or "assert 0 <= n < 2**32"
I would accept the second condition, but the mask is much faster.
bit_length doesn't seem to work on any of my Pythons (2.5, 2.6 and 3.1)
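
For comparison, a sketch of the two guards side by side. The function names are hypothetical and the values are made up. As far as I can tell from the Python documentation, int.bit_length() was added in 2.7 and 3.1, and on a literal it needs parentheses, e.g. (255).bit_length(). Note also that assert statements are skipped when Python runs with -O.

```python
def low_byte_after_shift(n, shift):
    """Mask approach: always returns a value in 0..255, whatever n is."""
    return (n >> shift) & 255

def checked_shift(n, shift):
    """Assert approach: fails loudly if n does not fit in 32 bits."""
    assert 0 <= n < 2**32, "n does not fit in 32 bits"
    return n >> shift

ok = 0xFFFFFFFF        # fits in 32 bits
too_big = 2**40 - 1    # hypothetical out-of-range value

print(low_byte_after_shift(ok, 24))       # 255
print(checked_shift(ok, 24))              # 255
print(low_byte_after_shift(too_big, 24))  # 255, the mask silently trims it

try:
    checked_shift(too_big, 24)
except AssertionError as err:
    print("assert caught it:", err)
```

The mask quietly forces the result into range; the assert reports that the assumption was wrong. Which one you want depends on whether an oversized input is something to tolerate or something to be told about.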
> This has two additional advantages:
>
> (1) It clearly signals to the reader what your intention is ("I'm
> absolutely 100% sure that n will not be more than 32 bits, but since
> I'm a fallible human, I'd rather find out about an error in my logic as
> soon as possible").
The assert approach is perfectly valid, but since the mask is
more consistent I'd still prefer to use it in this case.
Alan G.