PEP: Adding new math operators

Huaiyu Zhu huaiyu_zhu at yahoo.com
Thu Aug 3 05:56:18 EDT 2000


Here's the first draft of the PEP for adding new math operators.  There are
still a lot of ideas and details not included, but I think it already touches
on all the main aspects, so I'd better put it here for comments.


        Python Extension Proposal: Adding new math operators 
                Huaiyu Zhu <hzhu at users.sourceforge.net>
                         2000-08-03, draft 1


Introduction
------------

This PEP describes a proposal to add new math operators to Python.  It is
largely a summary of discussions in the comp.lang.python newsgroup.  Issues
discussed here include:

1. Background.
2. Description of proposed operators and implementation issues.
3. Analysis of alternatives to new operators.
4. Analysis of alternative forms.
5. Compatibility issues
6. Description of wider extensions and other related ideas.

A substantial portion of this PEP describes ideas that do not go into the
proposed implementation.  They are presented because the extension is
essentially syntactic sugar, so its adoption must be weighed against various
possible alternatives.  While many alternatives may be better in some
aspects, the current proposal appears to be advantageous overall.



Background
----------

Python provides five basic math operators: + - * / ** (hereafter
generically represented by "op").  They can be overloaded with new semantics
for user-defined classes.  However, for objects composed of homogeneous
elements, such as arrays, vectors and matrices in numerical computation,
there are two essentially distinct flavors of semantics.  Objectwise
operations treat these objects as points in multidimensional spaces.
Elementwise operations treat them as collections of ordinary numbers.  The
two kinds of operations are often intermixed in the same formulas, thereby
requiring ways to distinguish between them.

Many numerical computation languages provide two sets of math operators.
For example, in Matlab the ordinary op is used for objectwise operation
while .op is used for elementwise operation.  In R, op stands for
elementwise operation while %op% stands for objectwise operation.

In Python, there are other methods of representation, some of which are
already used by available numerical packages, such as

1. function:   mul(a,b)
2. method:     a.mul(b)
3. casting:    a.E*b 

In several respects these do not provide an adequate solution compared with
distinct infix operators.  This will be analyzed in more detail later; the
key points are

1. Readability: Even for moderately complicated formulas, infix operators
   are much cleaner than the alternatives.
2. Familiarity: Users are familiar with ordinary math operators.
3. Implementation: New infix operators will not unduly clutter Python
   syntax.  They will greatly ease the implementation of numerical packages.

While it is possible to assign the current math operators to one flavor of
operation, there are simply not enough infix operators left to overload for
the other flavor.  It is also impossible to maintain visual symmetry between
the two flavors of operators if one of them does not contain symbols for the
ordinary math operators.



Proposed extension
------------------

1.  New operators ~+ ~- ~* ~/ ~** ~+= ~-= ~*= ~/= ~**= are added to core
    Python.  They parallel the existing operators + - * / ** and the (soon
    to be added) += -= *= /= **= operators.

2.  Operator ~op retains the syntactical properties of operator op,
    including precedence.

3.  Operator ~op retains the semantical properties of operator op on
    built-in number types.  It raises a syntax error on other types.

4.  These operators are overloadable in classes with names that prepend
    "alt" to names of ordinary math operators.  For example, __altadd__ and
    __raltadd__ work for ~+ just as __add__ and __radd__ work for +.

5.  As with standard math operators, the __r*__() methods are invoked when
    the left operand does not provide the appropriate method.
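
For concreteness, here is a minimal sketch (not the prototype patch) of how
an application class might overload the new operators according to items 4
and 5 above.  Defining methods named __altmul__ and __raltmul__ is legal
today; they would only be invoked by ~* once the proposed syntax exists.

    class Matrix:
        def __init__(self, rows):
            self.rows = rows                  # list of lists of numbers

        def __mul__(self, other):             # a * b  : objectwise (matrix) product
            cols = list(zip(*other.rows))     # columns of the right operand
            prods = [[sum([x * y for x, y in zip(row, col)]) for col in cols]
                     for row in self.rows]
            return Matrix(prods)

        def __altmul__(self, other):          # a ~* b : elementwise product (proposed)
            return Matrix([[x * y for x, y in zip(r1, r2)]
                           for r1, r2 in zip(self.rows, other.rows)])

        def __raltmul__(self, other):         # called when the left operand lacks
            return self.__altmul__(other)     # __altmul__ (item 5); ~* is symmetric here

The sketch follows the "op for object, ~op for element" convention; as noted
below, the actual choice of flavors is left to the packages.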

The symbol ~ is already used in Python as the unary "bitwise not" operator.
Currently it is not allowed for binary operators.  So using it as a prefix
to binary operators will not create incompatibility.

The proposed implementation is to patch several files related to the parser
and compiler so as to duplicate the functionality of the existing math
operators as necessary.  All new semantics are to be implemented in the
applications that overload the operators, but they are recommended to be
conceptually similar to the existing math operators.

It is not specified which version of the operators stands for elementwise or
objectwise operations, leaving the decision to applications.

A prototype implementation already exists.



Alternatives to adding new operators
------------------------------------

Some of the leading alternatives, using multiplication as an example:

1. Use function mul(a,b).

   Advantage:
   -  No need for new operators.
  
   Disadvantage: 
   - Prefix forms are cumbersome for composite formulas.
   - Unfamiliar to the intended users.
   - Too verbose for the intended users.
   - Unable to use natural precedence rules.
 
2. Use method call a.mul(b)

   Advantage:
   - No need for new operators.
   
   Disadvantage:
   - Asymmetric for both operands.
   - Unfamiliar to the intended users.
   - Too verbose for the intended users.
   - Unable to use natural precedence rules.


3. Implement a shadowing "elementwise class" and use casting to indicate the
   operators.  For example a*b for matrix multiply, and a.E*b for
   elementwise multiply.

   Advantage:
   - No need for new operators.
   - Benefits of infix operators with correct precedence rules.
   - Clean formulas in applications.
   
   Disadvantage:
   - Hard to maintain in current Python because ordinary numbers cannot have
     class methods.  (a.E*b will fail if a is a pure number.)
   - Difficult to implement, as this will interfere with existing method
     calls (like .T for transpose, etc.).
   - Runtime overhead of method lookup.
   - The shadowing class cannot replace a true class, because it does not
     return its own type.  So there needs to be an M class with a shadow E
     class, and an E class with a shadow M class.  (A rough sketch of this
     approach appears at the end of this section.)
   - Unnatural to mathematicians.

4. Use a mini-parser to parse formulas written in an arbitrary extension
   syntax placed inside quoted strings.

   Advantage:
   - Pure Python, without new operators.

   Disadvantage:
   - The actual syntax is confined to the quoted strings, which does not
     resolve the problem itself.
   - Introduces zones of special syntax.

Among these alternatives, the first and second are used to some extent in
current applications, but have been found inadequate.  The third is the most
favored by applications, but it would incur huge implementation complexity.
The fourth creates more problems than it solves.
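
To make the third alternative concrete, here is a rough sketch (not taken
from NumPy or MatPy) of the casting approach, using __getattr__ to provide
the .E attribute; it also shows why the shadow class cannot simply return
its own type.

    class E:                                  # shadow class: elementwise view of an M
        def __init__(self, m):
            self.m = m
        def __mul__(self, other):             # a.E * b  ->  elementwise product
            return M([[x * y for x, y in zip(r1, r2)]
                      for r1, r2 in zip(self.m.rows, other.rows)])

    class M:                                  # the "true" matrix class
        def __init__(self, rows):
            self.rows = rows
        def __getattr__(self, name):          # a.E yields the shadow object
            if name == 'E':
                return E(self)
            raise AttributeError(name)

    a = M([[1, 2], [3, 4]])
    b = M([[5, 6], [7, 8]])
    (a.E * b).rows                            # [[5, 12], [21, 32]]

Note that E.__mul__ returns an M rather than an E, which is why a matching
pair of shadow classes is needed, and a.E*b still fails when a is a plain
number.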



Alternative forms of infix operators
------------------------------------

There are essentially two types of representations of infix operators:

1. Bracketed form

   (op)
   [op]
   {op}
   <op>
   :op:
   ~op~
   %op%

2. Meta character form

   .op
   @op
   ~op
   
   There are proposals for putting the meta character after the operator as
   well.

3. Less consistent variations on these themes.  These are viewed
   unfavorably.  For completeness, some are listed here:
   - Use @/ and /@ for left and right division
   - Use [*] and (*) for outer and inner products

4. Use __call__ to simulate multiplication.
   a(b)  or (a)(b)

There are several criteria for choosing among them:

1. No syntactical ambiguities with existing operators.  

2. Produces higher readability in actual formulas.  This makes the bracketed
   forms unfavorable.  See the examples below.

3. Produce as much visual similarity to existing math operators as possible.

4. Be syntactically simple without blocking possible future extensions.


With these criteria, the overall winner among the bracketed forms appears to
be {op}.  The clear winner among the meta character forms is ~op.  Comparing
the two, ~op appears to be the favorite among them all.

(MORE DETAILS!!!)

.op is eliminated because 1.+a would be parsed differently from 1 .+a.
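
A small demonstration of the ambiguity, runnable today:

    1.+2         # == 3.0 today: the tokenizer reads "1." as a float literal
    # 1 .+2      # under a hypothetical .+ operator this would be an elementwise
    #            # add, so the meaning would depend on the whitespace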

The bracket type operators are most favorable when standing alone, but in
formulas they interfere with the visual parsing of parentheses used for
precedence and function arguments.  This holds true for (op) and [op].  The
{op} and <op> forms have similar problems, but the effect is not as severe.

@op is rejected because @ is visually heavy and is more readily associated
with the preceding identifier than the operator itself.

Most of the existing ASCII symbols have already been used.  The only three
that remain unused are @, $ and ?.


Semantics of new operators
--------------------------

There are strong opinions as to which set of operators should be objectwise
and which elementwise.  Here is a list of some of the arguments:

1. op for element, ~op for object
   - Consistent with current multiarray interface of Numeric package
   - Consistent with some other languages
   - Perception that elementwise operations are more natural
   - Perception that elementwise operations are used more frequently

2. op for object, ~op for element
   - Consistent with current linear algebra interface of MatPy package
   - Consistent with some other languages
   - Perception that objectwise operations are more natural
   - Perception that objectwise operations are used more frequently
   - Consistent with the current behavior of math operators on lists

It is generally agreed that
   - there is no absolute reason to favor one or the other;
   - it is easy to cast from one representation to the other within a sizable
     chunk of code, so the other flavor of operators is always in the
     minority;
   - there are other semantic differences that favor the existence of both
     array-oriented and matrix-oriented packages, even if their operators
     are unified;
   - whatever decision is taken, code using the existing interfaces should
     not be broken for a very long time.

Therefore not much is lost, and much flexibility is retained, if the
semantic flavors of these two sets of operators are not dictated by the core
language.  The application packages are responsible for making the most
suitable choice.  This is already the case for NumPy and MatPy, which use
opposite semantics.  Adding new operators will not break this.

The issue of numerical precision was raised, but if the semantics are left to
the applications, decisions about precision should also be left to them.



Examples
--------

Below we list the actual formulas that would appear using the various
operators or other representations described above.

1. The matrix inversion formula:

   - Using op for object and ~op for element:
     
     b = a.I - a.I * u / (c.I + v/a*u) * v / a


   - Using op for element and ~op for object:
   
     b = a.I @- a.I @* u @/ (c.I @+ v@/a@*u) @* v @/ a

     b = a.I ~- a.I ~* u ~/ (c.I ~+ v~/a~*u) ~* v ~/ a

     b = a.I (-) a.I (*) u (/) (c.I (+) v(/)a(*)u) (*) v (/) a

     b = a.I [-] a.I [*] u [/] (c.I [+] v[/]a[*]u) [*] v [/] a

     b = a.I <-> a.I <*> u </> (c.I <+> v</>a<*>u) <*> v </> a

     b = a.I {-} a.I {*} u {/} (c.I {+} v{/}a{*}u) {*} v {/} a

   Observation: For linear algebra, using op for objects is preferable.  The
   ~op type operators look better than the (op) type in complicated formulas.

   - using named operators

     b = a.I @sub a.I @mul u @div (c.I @add v @div a @mul u) @mul v @div a

     b = a.I ~sub a.I ~mul u ~div (c.I ~add v ~div a ~mul u) ~mul v ~div a

   Observation: Named operators are not suitable for math formulas.


2. Plotting a 3d graph

   - Using op for object and ~op for element:

     z = sin(x~**2 ~+ y~**2)
     plot(x,y,z)

   - Using op for element and ~op for object:

     z = sin(x**2 + y**2)
     plot(x,y,z)

    Observation: Elementwise operations with broadcasting can be used to
    give a more efficient implementation than Matlab's.  In these cases
    using the plain op for elementwise operations is preferable.


3. Using + and - with automatic broadcasting

     a = b - c;  d = a.T*a

   Observation: This would silently produce hard-to-trace bugs if one of b
   and c is a row vector while the other is a column vector.


Miscellaneous issues
--------------------

1. Need for the ~+ ~- operators.  The objectwise + - are needed because they
   provide an important sanity check, as in linear algebra.  The elementwise
   + - are important because they allow broadcasting rules that are very
   efficient in applications.

2. Left division (solve).  For matrices, a*x is not necessarily equal to
   x*a.  The solution of a*x==b, denoted x=solve(a,b), is therefore different
   from the solution of x*a==b, denoted x=div(b,a).  There have been
   discussions about finding a new symbol for solve.  [Background: Matlab
   uses b/a for div(b,a) and a\b for solve(a,b).]

   It is recognized that Python provides a better solution without requiring
   a new symbol: the inverse method .I can be made lazy, so that a.I*b and
   b*a.I are equivalent to Matlab's a\b and b/a.  The implementation is quite
   simple and the resulting application code is clean.  (A sketch of this
   delayed .I idea follows at the end of this list.)

3. Power operator.  Python's use of a**b for pow(a,b) has two perceived
   disadvantages:
   - Most mathematicians are more familiar with a^b for this purpose.
   - It results in the long augmented assignment operator ~**=.
   However, this issue is distinct from the main issue here.

4. Additional multiplication operators.  Several forms of multiplication
   are used in (multi-)linear algebra.  Setting aside the variations of
   multiplication in the linear algebra sense, two general forms remain for
   multiarrays (tensors): the outer product and the inner product.  They
   need to specify indices, which can be either
   (1) associated with the operator, or
   (2) associated with the objects.
   The latter is used extensively on paper and is also the easier one to
   implement.  By implementing a tensor class with indices, a general form
   of multiplication would cover both outer and inner products, and would
   specialize to linear algebra multiplication as well.  The index rules are
   defined as class methods.  For example,

     a = b.i(1,2,-1,-2) * c.i(4,-2,3,-1)   # a_ijkl = b_ijmn c_lnkm

   Therefore one objectwise multiplication is sufficient.


5. Bitwise operators.  Currently Python assigns six operators to bitwise
   operations: and (&), or (|), xor (^), complement (~), left shift (<<) and
   right shift (>>), with their own precedence levels.  This is related to
   the new math operators in several ways:

   (1) It assigns special syntactical and semantical structures to a field
       that is rather restricted.  It appears that most of these could be
       replaced by a bitwise module with named functions.  This issue is
       separate from that of the new math operators.
   (2) The proposed new math operators use the symbol ~, which is the
       "bitwise not" operator.  This poses no compatibility problem.
   (3) The symbol ^ might be better used for pow than for bitwise xor, but
       this depends on the future of the bitwise operators.  It does not
       immediately impact the proposed math operators.
   (4) The symbol | was suggested for matrix solve, but the new solution of
       using a delayed .I is better in several ways.

6. Lattice operators.  It has been suggested that similar operators be
   combined with the bitwise operators to represent lattice operations.  For
   example, ~| and ~& could represent "lattice or" and "lattice and".  But
   these can already be achieved by overloading the existing logical or
   bitwise operators.  Such operations might be more suitable uses of infix
   operators than the built-in bitwise operations, but see also below.

7. There is a suggestion that operators may not need special method names,
   using
       def "+"(a, b)      in place of       def __add__(a, b)
   This appears to require a larger syntactic change, but it would only be
   useful when arbitrary additional operators are allowed.

8. There is a suggestion to provide a copy operator :=, but this can already
   be done with a = b.copy().
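
Returning to item 2, here is a minimal sketch of the delayed .I idea, shrunk
to plain numbers so the machinery stays visible.  For a real matrix class,
__mul__ and __rmul__ below would call a linear solver instead of dividing,
so an explicit inverse is never formed.

    class DelayedInverse:
        def __init__(self, a):
            self.a = a                        # the object whose inverse was requested
        def __mul__(self, b):                 # a.I * b : solve a*x == b  (Matlab a\b)
            return Mat(b.v / self.a.v)
        def __rmul__(self, b):                # b * a.I : solve x*a == b  (Matlab b/a)
            return Mat(b.v / self.a.v)

    class Mat:                                # scalar stand-in for a matrix class
        def __init__(self, v):
            self.v = v
        def __getattr__(self, name):
            if name == 'I':                   # nothing is inverted at this point
                return DelayedInverse(self)
            raise AttributeError(name)

    a, b = Mat(4.0), Mat(2.0)
    (a.I * b).v                               # 0.5, obtained by "solving", not as (1/a)*b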



Impact on possible future extensions
------------------------------------

There are several more general possible extensions leading on from the
current proposal.  Although they are distinct proposals, they may have
syntactical or semantical implications for each other.  It is prudent to
ensure that the current extension does not restrict future possibilities in
any way.

1. Named operators. 

During the discussion it was generally recognized that infix operators are a
scarce resource in Python, not only in numerical computation but in other
fields as well.  Several proposals and ideas were put forward that would
allow infix operators to be extended more or less the same way as named
functions.

The idea for a large supply of operators is essentially this: Choose a meta
character, say @, so that for any identifier "opname", the combination
"@opname" would be a binary operator, and

a @opname b == opname(a,b)

Other representations mentioned include .name ~name~ :name: (.name) %name%
and similar variations.  The pure bracket based operators cannot be extended
this way.

This requires a change in the parser to recognize @opname and parse it into
the same structure as a function call.  The precedence of all these
operators would have to be fixed at a single level, so the implementation
would differ from that of the additional math operators, which keep the
precedence of the existing math operators.

The current proposal does not appear to limit possible future extensions of
such form in any way.
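
For illustration only (this is not part of the proposal), named infix
operators can already be approximated by overloading an ordinary operator on
a small wrapper class; the precedence is then fixed by the carrier operator,
much like the single fixed precedence level described above.

    class NamedOp:
        def __init__(self, func, left=None):
            self.func = func
            self.left = left
        def __rmod__(self, left):             # a %op       : remember the left operand
            return NamedOp(self.func, left)
        def __mod__(self, right):             # (a %op)% b  : apply func(a, b)
            return self.func(self.left, right)

    def dot(a, b):                            # stand-in for an objectwise multiply
        total = 0
        for x, y in zip(a, b):
            total = total + x * y
        return total

    mul = NamedOp(dot)
    [1, 2, 3] %mul% [4, 5, 6]                 # 32 == dot([1, 2, 3], [4, 5, 6])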


2. More general symbolic operators.

One additional form of future extension is to use meta character and
operator symbols (symbols that cannot be used in syntactical structures
other than operators).  Suppose @ is the meta character.  Then

      a + b,    a @+ b,    a @@+ b,  a @+- b

would all be operators with a hierarchy of precedence, defined by

   def "+"(a, b)
   def "@+"(a, b)
   def "@@+"(a, b)
   def "@+-"(a, b)

One advantage compared with named operators is greater flexibility in
assigning precedences based on either the meta character or the ordinary
operator symbols.  This also allows operator composition.  The disadvantage
is that they look more like "line noise".  In any case the current proposal
does not impact this future possibility.

These kinds of future extensions may not be necessary when Unicode is
generally adopted. 


3. Object/element dichotomy for other types of objects.

This makes sense for any object considered as a collection of homogeneous
elements.  Several examples are listed here:

1. List arithmetic
   [1, 2] + [3, 4]  ===>  [1, 2, 3, 4]
   [1, 2] ~+ [3, 4] ===> [4, 6]

   ['a', 'b'] * 2   ===> ['a', 'b', 'a', 'b']
   'ab' * 2         ===> 'abab'
   ['a', 'b'] ~* 2  ===> ['aa', 'bb']
   [1, 2] ~* 2      ===> [2, 4]

2. Tuple generation

   [1, 2, 3], [4, 5, 6]   ===>  ([1, 2, 3], [4, 5, 6])
   [1, 2, 3]~,[4, 5, 6]   ===>  [(1, 4), (2, 5), (3, 6)]

   This has the same effect as the proposed zip function.

3. Bitwise operations (regarding an integer as a collection of bits)

   2 and 3  ===> 1
   2 ~and 3 ===> 2
   2 or 3   ===> 1
   2 ~or 3  ===> 3

4. Elementwise format operator (with broadcasting)

   a = [1,2,3,4,5]
   print ["%5i"] ~% a  
   
   Currently: print ("%5i "*len(a)) % tuple(a)
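
For comparison, the elementwise list results above can already be written
with today's tools (map and the operator module), though the broadcasting of
the scalar 2 has to be spelled out by hand:

     import operator

     map(operator.add, [1, 2], [3, 4])      # [4, 6]        cf.  [1, 2] ~+ [3, 4]
     map(operator.mul, ['a', 'b'], [2, 2])  # ['aa', 'bb']  cf.  ['a', 'b'] ~* 2
     map(None, [1, 2, 3], [4, 5, 6])        # [(1, 4), (2, 5), (3, 6)], the proposed zip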


There are probably many other similar situations.  This general approach
seems well suited to most of them.  In any case, the current proposal will
not negatively impact future possibilities.



Huaiyu Zhu                       hzhu at users.sourceforge.net
Matrix for Python Project        http://MatPy.sourceforge.net 






