[Python-Dev] Re: Unifying Long Integers and Integers

Moshe Zadka moshez@zadka.site.co.il
Mon, 12 Mar 2001 02:25:23 +0200 (IST)


On Sun, 11 Mar 2001, Guido van Rossum <guido@digicool.com> wrote:

> Actually, since you have SF checkin permissions, Barry can just give
> you a PEP number and you can check it in yourself!

Technically yes. I'd rather Barry would change PEP-0000 himself ---
if he's ready to do that and let me check in the PEPs it's fine, but
I just imagined he'd like to keep the state consistent.

[re: numerical PEPs mailing list] 
> Please help yourself.  I recommend using SF since it requires less
> overhead for the poor python.org sysadmins.

Err...I can't. Requesting an SF mailing list is an admin operation.

[re: portability of literals]
> I'm not sure if the portability of .pyc's is much worse than that of
> .py files.

Of course, .py's and .pyc's are just as portable. I do think that this
helps programs be more portable when they have literals inside them,
especially since (I believe) soon the world will be a mixture of
32 bit and 64 bit machines.

> There's more to it than that.  What about sys.maxint?  What should it
> be set to?

I think I'd like to stuff this one into "open issues" and ask people to
grep through code searching for sys.maxint before I decide.

Grepping through the standard library shows that this is most often
used as a maximum size for sequences. So I think it should probably be
the maximum value of an integer type large enough to hold a pointer.
(The only exception is mhlib.py, which uses it when int(string) raises an
OverflowError; after the change that would no longer happen, so the code
would be unreachable.)
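
To make the "large enough to hold a pointer" idea concrete, here is a
minimal sketch in present-day Python. It assumes the struct format code
"P" (a C void *) as the way to ask for the pointer width; the names
pointer_bits and max_sequence_size are made up for illustration and are
not part of any proposal here.

```python
import struct

# Width of a C pointer on this build, in bits.
# "P" is struct's format code for void *.
pointer_bits = struct.calcsize("P") * 8

# Largest value of a signed integer type of that width
# (hypothetical candidate for a "maximum sequence size" constant).
max_sequence_size = 2 ** (pointer_bits - 1) - 1

print(pointer_bits, max_sequence_size)
```

The printed value depends on the machine, which is exactly the point: it
tracks the address space rather than the width of a C long.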

> Other areas where we need to decide what to do: there are a few
> operations that treat plain ints as unsigned: hex() and oct(), and the
> format operators "%u", "%o" and "%x".  These have different semantics
> for bignums!  (There they ignore the request for unsignedness and
> return a signed representation anyway.)

This would probably be solved by the fact that after the change 1<<31
will be positive. The real problem is that << stops having 32 bit semantics --
but it never really had those anyway, it had machine-long-size semantics,
which were unportable, so we can just tell people with unportable code
to fix it.

What do you think? Should I issue a warning when shifting an integer
would have cut it off or changed its sign under the old semantics?
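
A rough sketch of what such a warning could look like, written in
present-day Python where integers are already unbounded. Everything here
is an assumption for illustration: wrap_signed and shift_left are
hypothetical helpers, and the 32-bit default stands in for whatever the
machine long size happens to be.

```python
import warnings


def wrap_signed(value, bits=32):
    """Reduce value to a signed two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def shift_left(value, count, bits=32):
    """Shift without a width limit; warn if a fixed-width shift would differ."""
    result = value << count
    if wrap_signed(result, bits) != result:
        warnings.warn(
            "shift result would have been cut or changed sign "
            "with %d-bit integers" % bits,
            stacklevel=2,
        )
    return result


print(shift_left(1, 31))       # 2147483648 (and a warning is emitted)
print(wrap_signed(1 << 31))    # -2147483648, the old 32-bit answer
print(hex(shift_left(1, 31)))  # 0x80000000
```

Note that hex() gives the same digits for the positive result as the old
"unsigned" formatting gave for the wrapped negative one, which is why
the hex()/oct() question mostly goes away.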

> Many C APIs for other datatypes currently take int or long arguments,
> e.g. list indexing and slicing.  I suppose these could stay the same,
> or should we provide ways to use longer integers from C as well?

Hmmmm....I'd probably add PyInt_AS_LONG_LONG under an #ifdef HAVE_LONG_LONG

> Also, what will you do about PyInt_AS_LONG()?  If PyInt_Check()
> returns true for bignums, C code that uses PyInt_Check() and then
> assumes that PyInt_AS_LONG() will return a valid outcome is in for a
> big surprise!

Yes, that's a problem. I have no immediate solution to that -- I'll
add it to the list of open issues.

> Note that the implementation suggested below implies that the overflow
> boundary is at a different value than currently -- you take one bit
> away from the long.  For backwards compatibility I think that may be
> bad...

It also means overflow raises a different exception. Again, I suspect
it will be used only in cases where the algorithm is supposed to maintain
that internal results are not bigger than the inputs or things like that,
and there only as a debugging aid -- so I don't think that this would be that
bad. And if people want to avoid using the longs for performance reasons,
then the implementation should definitely *not* lie to them.

> Almost.  The current bignum implementation actually has a length field
> first.

My bad. ;-)

> I have an alternative implementation in mind where the type field is
> actually different for machine ints and bignums.  Then the existing
> int representation can stay, and we lose no bits.  This may have other
> implications though, since uses of type(x) == type(1) will be broken.
> Once the type/class unification is complete, this could be solved by
> making long a subtype of int.

OK, so what's the concrete advice? How about if I just said "integer operations
that previously raised OverflowError now return long integers, and literals
in programs that are too big to be integers are long integers"? I started
leaning this way when I started writing the PEP and decided that true
unification may not be the low hanging fruit we always assumed it would be.
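
A small sketch of the difference between the two behaviours, emulated in
present-day Python. The bounds and function names are assumptions for
illustration only: MAX_INT uses sys.maxsize as a stand-in for the old
sys.maxint, and old_add / new_add are hypothetical helpers, not real
interpreter internals.

```python
import sys

MAX_INT = sys.maxsize      # stand-in for the old sys.maxint (assumption)
MIN_INT = -MAX_INT - 1


def old_add(a, b):
    """Emulate the old rule: fail when the result leaves the machine range."""
    result = a + b
    if not MIN_INT <= result <= MAX_INT:
        raise OverflowError("integer addition")
    return result


def new_add(a, b):
    """Emulate the proposed rule: the result simply becomes a long integer."""
    return a + b


print(new_add(MAX_INT, 1) == MAX_INT + 1)   # True
try:
    old_add(MAX_INT, 1)
except OverflowError as exc:
    print("old behaviour:", exc)
```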

-- 
"I'll be ex-DPL soon anyway so I'm        |LUKE: Is Perl better than Python?
looking for someplace else to grab power."|YODA: No...no... no. Quicker,
   -- Wichert Akkerman (on debian-private)|      easier, more seductive.
For public key, finger moshez@debian.org  |http://www.{python,debian,gnu}.org