PEP 31XX: A Type Hierarchy for Numbers (and other algebraic entities)
Here's a draft of the numbers ABCs PEP. The most up to date version
will live in the darcs repository at
http://jeffrey.yasskin.info/darcs/PEPs/pep-3141.txt (unless the number
changes) for now. Naming a PEP about numbers 3.141 seems cute, but of
course, I don't get to pick the number. :) This is my first PEP, so
apologies for any obvious mistakes.
I'd particularly like the numpy people's input on whether I've gotten
floating-point support right.
Thanks,
Jeffrey Yasskin
-------------------------------
PEP: 3141
Title: A Type Hierarchy for Numbers (and other algebraic entities)
Version: $Revision: 54928 $
Last-Modified: $Date: 2007-04-23 16:37:29 -0700 (Mon, 23 Apr 2007) $
Author: Jeffrey Yasskin
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Apr-2007
Post-History: Not yet posted
Abstract
========
This proposal defines a hierarchy of Abstract Base Classes (ABCs) [#pep3119] to represent numbers and other algebraic entities similar to numbers. It proposes:
[snip]
Are the base number operations in Python all that difficult to understand? Do we really need to add mathematical formalism into Python's type system before people understand the meaning of X * Y? Because all I really see this doing is confusing the hell out of people who aren't mathematicians; I'm a theoretical computer scientist and I had to try to think for a moment to verify that all of those were really necessary to separate the cases.
I really do understand wanting to be able to ask "is operation X supported for values Y and Z", but is this really necessary for numbers?
- Josiah
[bcc'ing numpy-discussion. Comments should probably try to stay on the
python-3000 list.]
On 4/25/07, Josiah Carlson wrote:
Are the base number operations in Python all that difficult to understand? Do we really need to add mathematical formalism into Python's type system before people understand the meaning of X * Y? Because all I really see this doing is confusing the hell out of people who aren't mathematicians; I'm a theoretical computer scientist and I had to try to think for a moment to verify that all of those were really necessary to separate the cases.
I really do understand wanting to be able to ask "is operation X supported for values Y and Z", but is this really necessary for numbers?
If you don't want to use the full power of the system, nothing forces you to. If someone doesn't know what a Ring or a complex number is, they can just check for Real and rely on the same operations, at the small cost of overconstraining their inputs. One reason I'm suggesting splitting the classes into the "algebra" and "numbers" modules is so that people who don't know or care about the algebra can rely on just "numbers" and still get nearly all of the benefits.
However, I think it's important to provide a framework for people who want to do more powerful things. If you write a function that can take an arbitrary ring, it should Just Work on a Python int, or any other Complex subclass, without you having to check for a lot of operations separately. The Haskell98 numerical hierarchy actually does start at Num, ignoring the algebraic structures underneath, and it draws a lot of criticism for it.
That does mean that the names and documentation in the numbers module need to be as clear as we can make them for non-mathematical audiences. I'm certainly open to suggestions on that front.
-- 
Namasté,
Jeffrey Yasskin
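A minimal sketch of the "just check for Real" usage described above, assuming the ABCs end up in a module named numbers with a Real class as the draft proposes. The helper name mean is hypothetical and only for illustration.

```python
import numbers
from fractions import Fraction


def mean(values):
    """Average any mix of real numbers ('mean' is a hypothetical helper)."""
    values = list(values)
    for v in values:
        # One check against Real stands in for testing +, /, etc. separately.
        if not isinstance(v, numbers.Real):
            raise TypeError(f"expected a real number, got {type(v).__name__}")
    return sum(values) / len(values)


# int, float and Fraction all pass the same check.
print(mean([1, 2.5, Fraction(1, 2)]))
```

A caller who never hears the word "Ring" still gets a usable check, at the cost of rejecting complex inputs that the arithmetic itself would have handled.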
Are the base number operations in Python all that difficult to understand?
Well, they're a little tricky.
But the basic problem with "number" support, in almost every programming language, is that they are too low-level an abstraction. A number in a program is never *just* an integer or a float. It's always an integer or float (or complex, OK) deployed in support of a very specific purpose. It's a loop index, or a weight in kilos, or the number of calendar entries for a particular day. It often has units, and a fixed range of values or fixed set of values, neither of which are taken into account in most programming languages.
Strings really suffer from the same lack of focus. A string is never just a string. It's a book title, or a filename, or a person's surname, or an email address. It's usually got a particular human language (or computer OS) associated with it. Each of these usages has particular limitations and patterns which limit both its range of values, and the operations which can be successfully applied to it.
Bill
On Thu, 26 Apr 2007 10:11:27 PDT, Bill Janssen wrote:
Are the base number operations in Python all that difficult to understand?
Well, they're a little tricky.
But the basic problem with "number" support, in almost every programming language, is that they are too low-level an abstraction. A number in a program is never *just* an integer or a float. It's always an integer or float (or complex, OK) deployed in support of a very specific purpose. It's a loop index, or a weight in kilos, or the number of calendar entries for a particular day. It often has units, and a fixed range of values or fixed set of values, neither of which are taken into account in most programming languages.
Strings really suffer from the same lack of focus. A string is never just a string. It's a book title, or a filename, or a person's surname, or an email address. It's usually got a particular human language (or computer OS) associated with it. Each of these usages has particular limitations and patterns which limit both its range of values, and the operations which can be successfully applied to it.
I definitely agree with this, and resolving it would probably address a large class of mistakes made when doing any programming involving numbers and strings (anyone done much of that lately? :).
Does the introduction of a lot of base classes for `int' and `float' do anything to address this, though? What you really want is the opposite: a bunch of new subclasses of them.
Jean-Paul
The following things mean absolutely nothing to me:
- Monoid
- MonoidUnderPlus
- Group
- Ring
- Semiring
- Field
So, most of the terminology in the PEP.
I can see use-cases for this level of formalism, but I'm a strong -1 on making any part of the stdlib effectively off-limits for people without advanced math degrees. Why can't this be shipped as a third-party module?
Collin Winter
On 4/25/07, Collin Winter wrote:
The following things mean absolutely nothing to me:
- Monoid
- MonoidUnderPlus
- Group
- Ring
- Semiring
- Field
So, most of the terminology in the PEP.
I can see use-cases for this level of formalism, but I'm a strong -1 on making any part of the stdlib effectively off-limits for people without advanced math degrees. Why can't this be shipped as a third-party module?
As a math major I have no excuse, but I have to confess I'm also really rusty on these things. (Nothing a quick look at Wikipedia can't refresh though.)
Jeffrey, is there any way you can drop the top of the tree and go straight from Number to Complex -> Real -> Rational -> Integer? These are the things that everyone with high school math will know.
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
On 4/25/07, Guido van Rossum wrote:
Jeffrey, is there any way you can drop the top of the tree and go straight from Number to Complex -> Real -> Rational -> Integer? These are the things that everyone with high school math will know.
I think yes, if you can confirm that the
    import numbers
    class Ring(AdditiveGroup): ...
    numbers.Complex.__bases__ = (Ring,) + numbers.Complex.__bases__
hack I mention under IntegralDomain will make isinstance(int, Ring) return True.
Do you want "Number" to be equivalent to Ring+Hashable (or a Haskell-like Ring+Hashable+__abs__+__str__)? I don't think a "Number" class is strictly necessary since people can check for Complex or Real directly, and one of those two is probably what they'll expect after knowing that something's a number.
Also, I don't see much point in putting Rational between Real and Integer. The current hierarchy is Real (int, float, Decimal, rational) :> Integer (int) and Real :> FractionalReal (float, Decimal, rational), but Integer and FractionalReal are just siblings.
I'm wondering how many people are writing new numeric types who don't know the abstract algebra. I think we should be able to insulate people who are just using the types from the complexities underneath, and the people writing new types will benefit. Or will seeing "Ring" in a list of superclasses be enough to confuse people?
-- 
Namasté,
Jeffrey Yasskin
On 4/25/07, Jeffrey Yasskin wrote:
On 4/25/07, Guido van Rossum wrote:
Jeffrey, is there any way you can drop the top of the tree and go straight from Number to Complex -> Real -> Rational -> Integer? These are the things that everyone with high school math will know.
I think yes, if you can confirm that the
    import numbers
    class Ring(AdditiveGroup): ...
    numbers.Complex.__bases__ = (Ring,) + numbers.Complex.__bases__
hack I mention under IntegralDomain will make isinstance(int, Ring) return True.
Hm... That would require built-in types to be mutable (by mere mortals) which would kill the idea of multiple interpreters. AFAIK the only user of multiple interpreters is mod_python, which is a pretty popular Apache module... (Google claims 1.8M hits, vs. 5.66M for mod_perl, the market leader.)
I guess the alternative would be to keep the mathematical ones in but make it so that mere mortals (among which I count myself :) never need to know about them. Sort of like metaclasses, for most people.
But this still presents a bit of a problem. If we insist that built-in types are immutable, whenever a mathematician needs a new kind of algebraic category (you mention a few in the PEP), they have no way to declare that int or float are members of that category. They could write things like
    class MyFloat(float, DivisionRing): pass
but this would be fairly painful.
I would like to let users modify the built-ins, but such changes ought to be isolated from other interpreters running in the same address space. I don't want to be the one to tell the mod_python community they won't be able to upgrade to 3.0...
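For comparison, here is a sketch of how PEP 3119-style registration might let someone declare int a member of a new category without mutating the built-in: the record of membership lives on the new ABC, not on int. This is not the approach discussed above. It assumes an abc module whose ABCs offer a register() method as in that PEP, and Ring is a hypothetical user-defined class.

```python
from abc import ABC


class Ring(ABC):
    """Hypothetical user-defined algebraic category."""


# Declare membership on the ABC; int itself is left untouched.
Ring.register(int)

print(isinstance(3, Ring))    # membership is visible through isinstance
print(issubclass(int, Ring))  # and through issubclass
print(int.__bases__)          # the built-in's bases are unchanged
```

No MyFloat-style wrapper subclass is needed here, though whether this isolates cleanly between interpreters depends on where the ABC itself lives.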
Do you want "Number" to be equivalent to Ring+Hashable (or a Haskell-like Ring+Hashable+__abs__+__str__)? I don't think a "Number" class is strictly necessary since people can check for Complex or Real directly, and one of those two is probably what they'll expect after knowing that something's a number.
Fair enough, you could start with Complex.
Also, I don't see much point in putting Rational between Real and Integer. The current hierarchy is Real (int, float, Decimal, rational) :> Integer (int) and Real :> FractionalReal (float, Decimal, rational), but Integer and FractionalReal are just siblings.
Why? Why not have a properfraction() on Integer that always returns 0 for the fraction? The current math.modf() works just fine on ints (returning a tuple of floats). And why distinguish between Real and FractionalReal? What's the use case for reals that don't know how to compute their fractional part?
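To make the comparison concrete, a small sketch: math.modf() is the existing standard-library call, while properfraction() below is only a hypothetical stand-in for the method being discussed, with a return shape (whole part, fraction) chosen for illustration.

```python
import math

# math.modf() accepts an int today, but both results come back as floats.
frac, whole = math.modf(7)
print(frac, whole)


def properfraction(n):
    """Hypothetical Integer version: the fractional part is always 0."""
    return n, 0


print(properfraction(7))
```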
I'm wondering how many people are writing new numeric types who don't know the abstract algebra.
It's easy to know the algebra without recalling the terminologies. Lots of programmers (myself included!) are largely self-taught.
I think we should be able to insulate people who are just using the types from the complexities underneath, and the people writing new types will benefit. Or will seeing "Ring" in a list of superclasses be enough to confuse people?
Shouldn't only mathematicians be tempted to write classes deriving from Ring? Everyone else is likely to subclass one of the more familiar types, from Complex onwards.
PS. I've checked your PEP in, as PEP 3141. We might as well keep a fairly complete record in svn, even if we decide to cut out the algebraic foundational classes.
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Apr 26, 6:58 am, "Guido van Rossum" wrote:
I would like to let users modify the built-ins, but such changes ought to be isolated from other interpreters running in the same address space. I don't want to be the one to tell the mod_python community they won't be able to upgrade to 3.0...
As someone from the mod_python community, the discussion has been noted though. :-)
FWIW, mod_wsgi, an alternate way of hosting Python in conjunction with Apache which is being developed and is already available for early experimentation, will also optionally be able to isolate applications into separate processes, potentially run as different users. This will still all be configured through and managed by Apache, and it will be pretty transparent to an application and a user whether a specific application is running in the Apache child process or in a distinct daemon process spawned by Apache.
Since the use of a separate daemon process will probably end up being the preferred mode of operation, especially for commercial web hosting, losing the ability to also run parallel applications in distinct interpreters within the same process would only be a minor loss. Using multiple sub interpreters already causes lots of headaches with the simple GIL state API, the inability to have different versions of extension modules loaded, etc., so much so that separate processes may have to be the norm.
Graham
Jeffrey, is there any way you can drop the top of the tree and go straight from Number to Complex -> Real -> Rational -> Integer? These are the things that everyone with high school math will know.
I think knowledge of the concepts of group, ring, and field is supposed to be standard knowledge for any high-school senior -- isn't this what the "new math" was all about? But they're pretty fundamental to computer science; anyone trying to do serious work in the field (that is, anyone with a reason to read the PEP) should have a nodding acquaintance with them.
Bill
Guido van Rossum wrote:
Jeffrey, is there any way you can drop the top of the tree and go straight from Number to Complex -> Real -> Rational -> Integer? These are the things that everyone with high school math will know.
Having taught mathematics at a community college for a little while (hell, having taught mathematics at a few Universities - I won't name names), I would suggest that this is not a valid premise.
I was going to qualify that by adding that hopefully it is a valid premise for would-be mathematics-package-using programmers, but I recall a fellow student in an Intro to C class I was once in (not teaching, however). The class was not supposed to be an Intro to Programming course: this was back in the day when those were still taught in Pascal, and those taking the C class were supposed to have successfully completed the Pascal course first. Even so, this student had to be handheld not simply through the implementation of algorithms in C, but through the very development of algorithms from a problem statement.
Even without having to resort to such an extreme anecdote, I can envision situations where knowledgeable, skilled, experienced, but "deficient"-in-math programmers are called upon (implicitly or explicitly) by management to use numpy/scipy.
All this said, I'm certainly of the opinion that we can't "dumb down" numpy *too* much (after all, by some definition, it is an "advanced" mathematics package, at least IMO), and to my mind the sequence above, i.e., X->C->R->Q->Z, is a more than reasonable "line in the sand".
DG
On 4/25/07, Jeffrey Yasskin wrote:
class MonoidUnderPlus(Abstract):
Is this useful? Just because two things are both Monoid instances doesn't mean I can add them -- they have to be part of the same Monoid. By the time you do
    assert isinstance(a, MonoidUnderPlus)
    assert isinstance(b, MonoidUnderPlus)
    assert isinstance(a, b.__class__) or isinstance(b, a.__class__)
I'm not sure how much you've really saved.
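The point can be seen with two built-in types that each support + on their own. In this sketch, can_add is a hypothetical helper, and str and list stand in for two different monoids under plus.

```python
def can_add(a, b):
    """Report whether a + b works; 'can_add' is a hypothetical helper."""
    try:
        a + b
    except TypeError:
        return False
    return True


# Each type is closed under + by itself...
print(can_add("ab", "cd"))
print(can_add([1], [2]))
# ...but knowing both "support +" says nothing about mixing them.
print(can_add("ab", [1]))
```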
**Open issue:** Do we want to give people a choice of which of the following to define, or should we pick one arbitrarily?::
    def __neg__(self):
        """Must define this or __sub__()."""
        return self.zero() - self
    def __sub__(self, other):
        """Must define this or __neg__()."""
        return self + -other
Probably better to pick one; then it is clear that they have to override something, and there won't be as many accidental infinite loops. If they really want to define the other, they can just do
    def __neg__(self):
        super(_this_class__, self).__neg__()
The decorator doesn't know it isn't a real override.
-jJ
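A sketch of the accidental infinite loop mentioned above, assuming both defaults are inherited and a subclass overrides neither. AdditiveGroup here is a simplified hypothetical stand-in, not the PEP's class, and Forgetful is an invented subclass.

```python
class AdditiveGroup:
    """Simplified stand-in carrying both mutually dependent defaults."""

    def zero(self):
        raise NotImplementedError

    def __add__(self, other):
        raise NotImplementedError

    def __neg__(self):
        return self.zero() - self   # relies on __sub__

    def __sub__(self, other):
        return self + -other        # relies on __neg__


class Forgetful(AdditiveGroup):
    """Defines zero() and +, but overrides neither __neg__ nor __sub__."""

    def __init__(self, value):
        self.value = value

    def zero(self):
        return Forgetful(0)

    def __add__(self, other):
        return Forgetful(self.value + other.value)


try:
    -Forgetful(3)
    looped = False
except RecursionError:
    # __neg__ calls __sub__, which calls __neg__ again, without end.
    looped = True

print(looped)
```

Picking one method as the required override removes the cycle, since the other default then bottoms out in real code.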
participants (9)
- Bill Janssen
- Collin Winter
- David Goldsmith
- Graham Dumpleton
- Guido van Rossum
- Jean-Paul Calderone
- Jeffrey Yasskin
- Jim Jewett
- Josiah Carlson