Operators for matrix: current choices (Was Matlab vs Python ...)

Tue Jul 18 14:30:26 EDT 2000

This is a summary of what has been discussed so far.

On Tue, 18 Jul 2000, Moshe Zadka wrote:
> ... we really meant that we'll help you.

Thanks!  I see where compile() is called.  But how is the parser itself
called in python?  Is it only used as a C function?  Now that we have a patch
to add the new operators, is it possible to assemble these into a module?

Before we plunge into this thing which may turn out to be more laborious
than MatPy itself, we need to get a handle on what's involved.  Let me make
a list in order of my preference.  Please jump in at any point that's both
feasible and interesting enough for someone to actually do it:

1. Make a pure python way to introduce additional operators, like
   import NewOperator
   NewOperator.define(".+", "__dotadd__")
   NewOperator.define("+=", "__add_ab__")
   NewOperator.define(":=", "__copy__")

   Advantage: Pure python.  Doesn't interfere with existing syntax.  May be
   used for other operators.  Not in the core; only affects the module that
   imports it.
   Disadvantage: Said to be quite difficult to implement (Why?).

2. Incorporate additional operators in python, marked with either . or @
   (e.g. .* or @*).  Five to eight would be enough for linear algebra.

   Advantage: Pure python.  Doesn't interfere with existing syntax.
   Disadvantage: Fixed choice for everybody.  Unlikely to happen for quite
   some time until everyone agrees on everything.

3. Maintain a patch for additional operators. 

   Advantage: once built, everything is pure python.  Already have patch.
   Disadvantage: Can't guarantee most users can patch python, so in practice
   may need to maintain a Windows executable for download.

4. Temporarily use a.mmul(b) and a.emul(b) before things settle down.  Then
   define __dotmul__ = emul, or the like, when they are available.

   Advantage: immediately implementable.  Won't break anyone's code.  Can
   also switch to  __atmul__ = mmul or other choices later.
   Disadvantage: code somewhat cluttered.

5. Use a customized parser in python, with some compile tools?

   Advantage: can change anything.  Can fine-tune before things settle down.
   Disadvantage: can change anything.  Slower than 1.  I don't know how to
   use it.  (But maybe they can be turned into option 1?)

6. Maintain two classes: E and M.  

   Advantage: no need for new operators.
   Disadvantage: artificial casting likely to be maintenance nightmare,
   because we change objects while we really want to change operators.

   A variant: only use .E for elementwise.  Always return a matrixwise
   object.  This simply uses casting to help select between two methods
   denoted by the same binary symbol.  Does not look much better than 4 but
   worth considering.

7. Use prefix function call syntax, such as emul(a, b).

   Advantage: very simple to implement
   Disadvantage: very difficult-to-read user code.  Won't be used in practice.

8. Use preprocessor on another file extension like .mpy

   Advantage: user code looks good
   Disadvantage: two sets of files to maintain.  No implementation yet.  The
   preprocessor needs to be able to see the operators, their scopes and
   precedence.  If this is possible, why not turn it into 5 or even 1?
   Unlikely to be used in practice unless the preprocessing step is invisible.

9. Use a miniparser for expressions in strings.

   Advantage: superficially looks like pure python.
   Disadvantage: the whole syntax is inside the string, which becomes a
   special syntax zone with severe limitations.  These half-line programs
   completely lack the elegance of python code.  Yuck!

10. Use list comprehension.

    Advantage: apparently none.
    Disadvantage: admits that elementwise operations cannot be
    encapsulated.  Needs a new syntactic structure with brackets, loops,
    dummy variables, scope delimiters (: or ;), implicitly parallel for
    loops and implicit casting.  Yuck!!!
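
To make options 4 and 6 (the .E variant) concrete, here is a minimal sketch
in plain python.  The class name Mat, the list-of-rows storage and the
helper class _Elementwise are assumptions made up for illustration; this is
not MatPy's actual code, and it uses present-day python features.

```python
class Mat:
    """Toy matrix stored as a list of rows (illustration only)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def emul(self, other):
        # Option 4: elementwise product, entries at matching positions.
        return Mat([[x * y for x, y in zip(ra, rb)]
                    for ra, rb in zip(self.rows, other.rows)])

    def mmul(self, other):
        # Option 4: matrix product, each row of self dotted with each
        # column of other.
        cols = list(zip(*other.rows))
        return Mat([[sum(x * y for x, y in zip(row, col)) for col in cols]
                    for row in self.rows])

    # The plain '*' symbol is bound to the matrixwise meaning.
    __mul__ = mmul

    @property
    def E(self):
        # Option 6 variant: a temporary elementwise view of the same data.
        return _Elementwise(self)

    def __eq__(self, other):
        return isinstance(other, Mat) and self.rows == other.rows

    def __repr__(self):
        return "Mat(%r)" % (self.rows,)


class _Elementwise:
    """View whose '*' is elementwise but whose result is again a Mat."""

    def __init__(self, mat):
        self._mat = mat

    def __mul__(self, other):
        return self._mat.emul(other)


a = Mat([[1, 2], [3, 4]])
b = Mat([[5, 6], [7, 8]])
print(a.emul(b))   # elementwise, method-call spelling (option 4)
print(a * b)       # matrixwise
print(a.E * b)     # elementwise through the .E view (option 6 variant)
```

Note that a.E * b returns an ordinary Mat again, so the cast only selects
which method the * symbol means for that one operation.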

Note: I dislike some of the proposals because they look like ugly hacks that
use the abundance of class names, method names, function names and string
literals to work around what is clearly just a shortage of a few operator
symbols.  But I don't think we can reach consensus on such an issue anyhow.

If no more proposals are forthcoming, I will adopt a combination of
options 3 and 4, and hope for someone to come up with 1.  The largest
problem with this approach is this: If we can't guarantee the patch is
usable everywhere, do we want everybody using these operators?  Will it run
on another machine, or even another version of python?  If such a status
persists for a long time I guess all code will settle into a local optimum,
something like

(a.emul(b))*(c.emul(d))

and an opportunity would be lost, even if in reality most people actually
like (a.*b)*(c.*d).  This incidentally also removes an opportunity to see if
elementwise operators might be useful elsewhere, like

["%5.2f", "%-8s", "%.2g"] .% [pi, 'short', 1]

["Alice", "Bob", "Charlie"] .+ "is" .+ ["girl", "boy", "boy"]
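
For reference, what these two hypothetical expressions would compute can be
spelled out without any new operators.  This is only a sketch: the helper
name elementwise, its rule of repeating a non-list operand to match the
list, and the left-to-right grouping of the chained .+ are all assumptions,
not part of any existing patch.

```python
import operator
from math import pi


def elementwise(op, left, right):
    """Apply op pairwise; a non-list operand is repeated to match the list."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return op(left, right)          # two scalars: the ordinary operation
    if not left_is_list:
        left = [left] * len(right)
    if not right_is_list:
        right = [right] * len(left)
    if len(left) != len(right):
        raise ValueError("operands differ in length")
    return [op(x, y) for x, y in zip(left, right)]


# ["%5.2f", "%-8s", "%.2g"] .% [pi, 'short', 1]
formatted = elementwise(operator.mod,
                        ["%5.2f", "%-8s", "%.2g"], [pi, 'short', 1])

# ["Alice", "Bob", "Charlie"] .+ "is" .+ ["girl", "boy", "boy"]
names = elementwise(operator.add, ["Alice", "Bob", "Charlie"], "is")
phrases = elementwise(operator.add, names, ["girl", "boy", "boy"])

print(formatted)
print(phrases)
```

The nested helper calls are exactly the kind of clutter that the infix
spelling would remove.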

The argument "if it finds wide-spread use we may put it in the language" has
a severe limitation - if it is technically very difficult for users to use a
feature that's not in the language, it is unlikely to find wide-spread use
and may never get into the language.  That's why I really hope something
like 1 can come up.

Input greatly appreciated.

Huaiyu