[Python-Dev] shouldn't we be considering all pending numeric proposals together?

Guido van Rossum guido@digicool.com
Tue, 24 Jul 2001 15:29:12 -0400


> Skip Montanaro wrote:
> > 
> > There are several active or could-be-active PEPs related to Python's numeric
> > behavior:
> > 
> >      S   211  pep-0211.txt  Adding A New Outer Product Operator    Wilson
> >      S   228  pep-0228.txt  Reworking Python's Numeric Model       Zadka
> >      S   237  pep-0237.txt  Unifying Long Integers and Integers    Zadka
> >      S   238  pep-0238.txt  Non-integer Division                   Zadka
> >      S   239  pep-0239.txt  Adding a Rational Type to Python       Zadka
> >      S   240  pep-0240.txt  Adding a Rational Literal to Python    Zadka
> >      S   242  pep-0242.txt  Numeric Kinds                          Dubois
> > 
> > Instead of implementing them piecemeal, shouldn't we be
> > considering them as a related group?  For example, implementing
> > any or all of PEPs 237, 239 and 240 might well have an effect on
> > what needs to be done for PEP 238.  With slight modifications, the
> > proposals in PEP 242 might well subsume PEP 238's functionality in
> > a different way.
> > 
> > If the semantics of arithmetic are going to change, I think they should
> > change in the context of expanded capability in the language.

I think PEP 211 and PEP 242 don't belong in this list.  PEP 211
doesn't affect Python's number system at all, and PEP 242 proposes
a set of storage choices, not choices in semantics.  PEP 242 is valid
regardless of what we decide about int division.

The others, however, indeed are connected.  In fact the one that's
currently generating so much heat, PEP 238, is an essential
prerequisite for PEP 228, and so is PEP 237: if the different numeric
types are to be made fully interchangeable, as PEP 228 requires, a
different answer for 1/2 and 1.0/2.0 is impossible, and likewise, 1L
will be treated the same as 1 (in fact, the 'L' suffix will probably
be ignored eventually, with the representation chosen solely on the
basis of the numerical value).

But it's different the other way around: PEP 238 can easily stand on
its own.  It addresses a problem that exists even without a unified
numeric model.

Conversely, if PEP 238 is unacceptable, PEP 228 also has no hope, and
PEP 239 is much less attractive.  PEP 238 is the only one that cannot
avoid breaking existing code, so I want to introduce it as soon as I
can: the others can only be introduced after the full compatibility
waiting period for PEP 238, which is at least two years.

The relationship between PEP 238 and PEP 239 is interesting.  PEP 238
currently proposes to let int division return a float, because that's
the only available type.  But I believe that if we decide down the
road that int division should return a rational number instead, this
will break little or no code, as long as we embed the rationals in the
floats.  That is, the coercion rules would use this ordering:

  int -> long -> rational -> float -> complex

This is in spite of the fact that floating point numbers are actually
representable exactly as rationals!  (Using unbounded precision, which
Python rationals should definitely have.)  When I add a float to a
rational number, I want the result to be a float, not a rational,
because the float (most likely) represents an approximated value, and
turning it into an exact rational seems a mistake in that case.
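
As a rough illustration of this ordering: the sketch below uses
fractions.Fraction from the standard library of later Python versions
as a stand-in for the proposed rational type.  That module did not
exist when this was written, so treat it as an assumption made for the
example and not as the PEP 239 design.  There is only one int type in
those versions, so the int -> long step does not appear.

```python
from fractions import Fraction

half = Fraction(1, 2)

# int combined with rational: the result stays a rational.
a = half + 1

# rational combined with float: the result is a float, not a rational.
b = half + 0.25

# float combined with complex: the result is complex.
c = 0.25 + 1j

# A float *can* be converted to an exact rational, but the exact value is
# the binary approximation stored in the float, not the decimal 1/10.
exact = Fraction(0.1)

print(type(a).__name__, type(b).__name__, type(c).__name__)
print(exact == Fraction(1, 10))
```

The last line prints False, which is one way to see why silently
turning a float operand into an exact rational would be misleading.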

The property which current division lacks, and which I think is an
important step towards PEP 228, is the following:

    In an expression involving only numeric variables and operators,
    the *mathematical value* of the result (except for accuracy issues
    due to the fallibility of floating point hardware) should only
    depend on the mathematical value of the inputs.  The *type* of the
    result should be the first type in the above coercion list that
    does not come before any of the input types, and that can
    represent the mathematical value of the result.
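
The type rule above can be sketched as a small function.  This is only
a reading of the rule as stated, under several assumptions: the names
LADDER, can_represent and result_type are hypothetical, long is
omitted because later Python versions have a single int type,
fractions.Fraction stands in for the rational type, and the exact
mathematical result is passed in as a Fraction (or as a complex for
non-real results).

```python
from fractions import Fraction

# Coercion ladder from the message, without "long" (assumption: one int type).
LADDER = [int, Fraction, float, complex]


def can_represent(t, value):
    """Can type t hold the mathematical value (a Fraction or a complex)?"""
    is_real = isinstance(value, Fraction)
    if t is int:
        return is_real and value.denominator == 1
    if t is Fraction:
        return is_real
    if t is float:
        # Floats are treated as holding any real value, possibly approximated.
        return is_real
    return True  # complex can hold everything considered here


def result_type(input_types, value):
    """First ladder type not before any input type that can hold value."""
    start = max(LADDER.index(t) for t in input_types)
    for t in LADDER[start:]:
        if can_represent(t, value):
            return t
    return complex


print(result_type([int, int], Fraction(1, 2)).__name__)    # 1/2
print(result_type([int, int], Fraction(4, 2)).__name__)    # 4/2
print(result_type([int, float], Fraction(2)).__name__)     # 1 + 1.0
```

Without a rational type in the ladder, the same rule sends 1/2
straight to float, which matches what PEP 238 currently proposes.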

With "mathematical value" I mean the abstraction of numbers generally
used in mathematics, where the integers are embedded in the rationals
which are embedded in the reals, etc.  Mathematicians may talk about
the type of a variable ("let i be an integer", etc.) but they never
talk about the type of a *value*: integer literals are used without
prejudice in formulas yielding real results.

If we introduce rationals, and we redefine int division as returning a
rational instead of a float, this will not affect the mathematical
value.

(BTW, float is a misnomer.  I should have called it real, but alas, I
was a little *too* much under the influence of C at the time.  This is
not worth fixing.)

[MAL]
> May I suggest that these rather controversial changes be carried
> out on a separate branch of the Python source tree before adding 
> them to the trunk ?!

Definitely.  I am currently maintaining the PEP 238 implementation as
a patch; I don't want to start any new branches before we've merged
the descr-branch into the trunk.

> The reasoning here is that numerics are so low-level that porting
> applications to a new release implementing these changes will
> cause a lot of work (mostly due to the dynamic nature of Python).

I am aware of the amount of work; that's why I want to allow a very
generous waiting period before making it law.

> Another suggestion I would like to make is that the new semantics
> are first implemented using alternative subclassed numeric 
> objects (e.g. newint()) which can then live side-by-side with the
> old semantics types for a few releases until they replace the
> old types.

Hm, I don't think that that will be very useful.  A new-division-aware
module could create integer values of the new type, but in order to
protect itself against old-style integers passed in as arguments, it
would have to force a conversion of all arguments -- in which case the
code becomes even uglier than if we added explicit float() coercions
to all arguments.

Have you looked at my PEP-238 patch at all?  It solves the
side-by-side problem with a future statement and a new // operator,
giving three division behaviors: one that forces int results, for //;
one that forces float results, for / under the influence of the
appropriate future statement; and one that implements the old
behavior, for / without the future statement.
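
A short sketch of the two new behaviors, written for later Python
versions (3.x), where the float-returning / is the default and no
future statement is needed.  In the releases this patch targets, the
same / behavior would be selected per module with a future statement
at the top of the file; it is shown here only as a comment.

```python
# In a module that opts in under the patch, the first statement would be:
#     from __future__ import division

# "/" with the new semantics: int operands give a float result.
true_quotient = 1 / 2

# "//" forces an integer result for int operands, rounding toward
# negative infinity.
floor_quotient = 1 // 2
negative_floor = -7 // 2

# With the new "/", the int and float spellings agree in value.
same_value = (1 / 2) == (1.0 / 2.0)

print(true_quotient, floor_quotient, negative_floor, same_value)
```

The old behavior, where 1/2 evaluates to 0, is what a module gets from
/ when it does not use the future statement.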

--Guido van Rossum (home page: http://www.python.org/~guido/)