int/long unification hides bugs

Andrew Dalke adalke at mindspring.com
Tue Oct 26 08:29:11 CEST 2004


kartik wrote:
> i suggest u base your comments on real code, rather than reasoning in
> an abstract manner from your ivory tower.

My experience, based on 10+ years of making a living as a
professional programmer, says that you are wrong and the
comments made by others have been spot on.

Real code?  Here's one used for generating the canonical
SMILES representation of a chemical compound.  It comes
from the FROWNS package.

             try:
                 val = 1
                 for offset, bondtype in offsets[index]:
                     val *= symclasses[offset] * bondtype
             except OverflowError:
                 # Hmm, how often does this occur?
                 val = 1L
                 for offset, bondtype in offsets[index]:
                     val *= symclasses[offset] * bondtype


The algorithm uses the fundamental theorem of arithmetic
as part of computing a unique characteristic value for
every atom in the molecule, up to symmetry.

It's an iterative algorithm, and the new value for
a given atom is the product of the old values of its
neighbor atoms in the graph:

    V'(atom1) = V(atom1.neighbor[0]) * V(atom1.neighbor[1]) * ...
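
A minimal sketch of one such update step, written for current Python. The
three-atom chain, the starting primes, and the names `neighbors`, `values`
and `refine` are all made up for illustration and are not taken from FROWNS.

```python
# Hypothetical three-atom chain (think C-C-C); not taken from FROWNS.
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# Assumed starting values: a prime chosen by the number of neighbors.
values = {0: 2, 1: 3, 2: 2}

def refine(values, neighbors):
    """Give each atom the product of its neighbors' old values."""
    new_values = {}
    for atom, nbrs in neighbors.items():
        product = 1
        for nbr in nbrs:
            product *= values[nbr]
        new_values[atom] = product
    return new_values

new_values = refine(values, neighbors)
print(new_values)
```

The two end atoms are equivalent by symmetry and so come out with the same
value, while the middle atom gets a different one.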

In very rare cases this can overflow 32 bits.  Rare
enough that it's faster to do everything using 32 bit
numbers and just redo the full calculation if there's
an overflow.

Because Python no longer raises this overflow error,
we get both the performance and simpler code: the
try/except and the second loop can simply go away.
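
As a rough sketch of what the simplified loop could look like in current
Python, where there is a single integer type: the input numbers and the
name `product_of` are hypothetical, chosen only so that the product passes
2**31.

```python
# Hypothetical inputs chosen so the running product exceeds 2**31.
factors = [65537, 65537, 3]

def product_of(factors):
    """Multiply the factors with no overflow handling at all."""
    val = 1
    for factor in factors:
        # Integers grow as needed, so no OverflowError is raised here.
        val *= factor
    return val

result = product_of(factors)
print(result, result > 2**31)
```

Small products stay fast, and the rare large one is handled by the same code.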

Relatively speaking, 2**31 is tiny.  My little laptop
can count that high in Python in about 7 minutes, and
my hard drive has about 2**35 bits of space.  I deal
with single files bigger than 2**32 bits.
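
For scale, the arithmetic behind those figures can be checked in a few
lines. The variable names are made up for illustration, and the counting
rate is only what the "about 7 minutes" figure implies, not a measurement.

```python
BITS_PER_BYTE = 8

signed_32_limit = 2**31   # where the old OverflowError kicked in
disk_bits = 2**35         # the hard drive mentioned above
file_bits = 2**32         # the single-file size mentioned above

disk_gib = disk_bits // BITS_PER_BYTE // 2**30
file_mib = file_bits // BITS_PER_BYTE // 2**20

# Counting rate implied by reaching 2**31 in about 7 minutes.
counts_per_second = signed_32_limit / (7 * 60)

print(disk_gib, "GiB disk;", file_mib, "MiB file")
print(round(counts_per_second), "counts per second")
```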

Why then should I have to put all sorts of workarounds
into *my* code because *you* don't know how to write
good code, useful test cases, and appropriate internal
sanity checks?

Your examples, btw, are hypothetical.  Having an
OverflowError at 2**31 doesn't fix your 500-year-old
person problem, and you'll get an IndexError / KeyError
well before you reach that limit, assuming your string /
dictionary doesn't have that much data.

Why not give an example of real code that shows
1) that raising an OverflowError is the right behaviour
(excluding talking to hardware or another system that
requires a fixed size number), 2) that there is a
fixed number N that is always appropriate, and 3)
that N is < sys.maxint.

For bonus points, use proper spelling and capitalization.


  				Andrew
				dalke at dalkescientific.com


