There should be one-- and preferably only one --obvious way to do it.

There should be one-- and preferably only one --obvious way to do it.

Send NumPy-Discussion mailing list submissions to
	numpy-discussion@scipy.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://mail.scipy.org/mailman/listinfo/numpy-discussion
or, via email, send a message with subject or body 'help' to
	numpy-discussion-request@scipy.org

You can reach the person managing the list at
	numpy-discussion-owner@scipy.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of NumPy-Discussion digest..."

Today's Topics:

   1. Re: [help needed] associativity and precedence	of '@'
      (josef.pktd@gmail.com)
   2. Re: [RFC] should we argue for a matrix power operator, @@?
      (josef.pktd@gmail.com)

----------------------------------------------------------------------

Message: 1
Date: Sat, 15 Mar 2014 21:20:40 -0400
From: josef.pktd@gmail.com
Subject: Re: [Numpy-discussion] [help needed] associativity and
	precedence	of '@'
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Message-ID:
	<CAMMTP+Ahag9fN3XPtS4uDRThBknVXzudc0G8TtJ7G3w3dWbBWw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith <njs@pobox.com> wrote:

Hi all,

Here's the main blocker for adding a matrix multiply operator '@' to
Python: we need to decide what we think its precedence and associativity
should be. I'll explain what that means so we're on the same page, and what
the choices are, and then we can all argue about it. But even better would
be if we could get some data to guide our decision, and this would be a lot
easier if some of you all can help; I'll suggest some ways you might be
able to do that.

So! Precedence and left- versus right-associativity. If you already know
what these are you can skim down until you see CAPITAL LETTERS.

We all know what precedence is. Code like this:
  a + b * c
gets evaluated as:
  a + (b * c)
because * has higher precedence than +. It "binds more tightly", as they
say. Python's complete precedence able is here:
  http://docs.python.org/3/reference/expressions.html#operator-precedence

Associativity, in the parsing sense, is less well known, though it's just
as important. It's about deciding how to evaluate code like this:
  a * b * c
Do we use
  a * (b * c)    # * is "right associative"
or
  (a * b) * c    # * is "left associative"
? Here all the operators have the same precedence (because, uh... they're
the same operator), so precedence doesn't help. And mostly we can ignore
this in day-to-day life, because both versions give the same answer, so who
cares. But a programming language has to pick one (consider what happens if
one of those objects has a non-default __mul__ implementation). And of
course it matters a lot for non-associative operations like
  a - b - c
or
  a / b / c
So when figuring out order of evaluations, what you do first is check the
precedence, and then if you have multiple operators next to each other with
the same precedence, you check their associativity. Notice that this means
that if you have different operators that share the same precedence level
(like + and -, or * and /), then they have to all have the same
associativity. All else being equal, it's generally considered nice to have
fewer precedence levels, because these have to be memorized by users.

Right now in Python, every precedence level is left-associative, except
for '**'. If you write these formulas without any parentheses, then what
the interpreter will actually execute is:
  (a * b) * c
  (a - b) - c
  (a / b) / c
but
  a ** (b ** c)

Okay, that's the background. Here's the question. We need to decide on
precedence and associativity for '@'. In particular, there are three
different options that are interesting:

OPTION 1 FOR @:
Precedence: same as *
Associativity: left
My shorthand name for it: "same-left" (yes, very creative)

This means that if you don't use parentheses, you get:
   a @ b @ c  ->  (a @ b) @ c
   a * b @ c  ->  (a * b) @ c
   a @ b * c  ->  (a @ b) * c

OPTION 2 FOR @:
Precedence: more-weakly-binding than *
Associativity: right
My shorthand name for it: "weak-right"

This means that if you don't use parentheses, you get:
   a @ b @ c  ->  a @ (b @ c)
   a * b @ c  ->  (a * b) @ c
   a @ b * c  ->  a @ (b * c)

OPTION 3 FOR @:
Precedence: more-tightly-binding than *
Associativity: right
My shorthand name for it: "tight-right"

This means that if you don't use parentheses, you get:
   a @ b @ c  ->  a @ (b @ c)
   a * b @ c  ->  a * (b @ c)
   a @ b * c  ->  (a @ b) * c

We need to pick which of which options we think is best, based on whatever
reasons we can think of, ideally more than "hmm, weak-right gives me warm
fuzzy feelings" ;-). (In principle the other 2 possible options are
tight-left and weak-left, but there doesn't seem to be any argument in
favor of either, so we'll leave them out of the discussion.)

Some things to consider:

* and @ are actually not associative (in the math sense) with respect to
each other, i.e., (a * b) @ c and a * (b @ c) in general give different
results when 'a' is not a scalar. So considering the two expressions 'a * b
@ c' and 'a @ b * c', we can see that each of these three options gives
produces different results in some cases.

"Same-left" is the easiest to explain and remember, because it's just, "@
acts like * and /". So we already have to know the rule in order to
understand other non-associative expressions like a / b / c or a - b - c,
and it'd be nice if the same rule applied to things like a * b @ c so we
only had to memorize *one* rule. (Of course there's ** which uses the
opposite rule, but I guess everyone internalized that one in secondary
school; that's not true for * versus @.) This is definitely the default we
should choose unless we have a good reason to do otherwise.

BUT: there might indeed be a good reason to do otherwise, which is the
whole reason this has come up. Consider:
    Mat1 @ Mat2 @ vec
Obviously this will execute much more quickly if we do
    Mat1 @ (Mat2 @ vec)
because that results in two cheap matrix-vector multiplies, while
    (Mat1 @ Mat2) @ vec
starts out by doing an expensive matrix-matrix multiply. So: maybe @
should be right associative, so that we get the fast behaviour without
having to use explicit parentheses! /If/ these kinds of expressions are
common enough that having to remember to put explicit parentheses in all
the time is more of a programmer burden than having to memorize a special
associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec
@ Mat @ Mat, but maybe they're both so rare that it doesn't matter in
practice -- I don't know.

Also, if we do want @ to be right associative, then I can't think of any
clever reasons to prefer weak-right over tight-right, or vice-versa. For
the scalar multiplication case, I believe both options produce the same
result in the same amount of time. For the non-scalar case, they give
different answers. Do people have strong intuitions about what expressions
like
  a * b @ c
  a @ b * c
should do actually? (I'm guessing not, but hey, you never know.)

And, while intuition is useful, it would be really *really* nice to be
basing these decisions on more than *just* intuition, since whatever we
decide will be subtly influencing the experience of writing linear algebra
code in Python for the rest of time. So here's where I could use some help.
First, of course, if you have any other reasons why one or the other of
these options is better, then please share! But second, I think we need to
know something about how often the Mat @ Mat @ vec type cases arise in
practice. How often do non-scalar * and np.dot show up in the same
expression? How often does it look like a * np.dot(b, c), and how often
does it look like np.dot(a * b, c)? How often do we see expressions like
np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a,
np.dot(b, c))? This would really help guide the debate. I don't have this
data, and I'm not sure the best way to get it. A super-fancy approach would
be to write a little script that uses the 'ast' module to count things
automatically. A less fancy approach would be to just pick some code you've
written, or a well-known package, grep through for calls to 'dot', and make
notes on what you see. (An advantage of the less-fancy approach is that as
a human you might be able to tell the difference between scalar and
non-scalar *, or check whether it actually matters what order the 'dot'
calls are done in.)

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

I'm in favor of same-left because it's the easiest to remember.
with scalar factors it is how I read formulas.

Both calculating dot @ first or calculating elementwise * first sound
logical, but I wouldn't know which should go first. (My "feeling" would be
@ first.)

two cases I remembered in statsmodels
H = np.dot(results.model.pinv_wexog, scale[:,None] *
results.model.pinv_wexog.T)
se = (exog * np.dot(covb, exog.T).T).sum(1)

we are mixing * and dot pretty freely in all combinations AFAIR

my guess is that I wouldn't trust any sequence without parenthesis for a
long time.
(and I don't trust a sequence of dots @ without parenthesis either, in our
applications.)

x @ (W.T @ W) @ x      ( W.shape = (10000, 5) )
or
x * (W.T @ W) * x

(w * x) @ x    weighted sum of squares

Josef
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140315/d4126289/attachment-0001.html 

------------------------------

Message: 2
Date: Sat, 15 Mar 2014 21:31:22 -0400
From: josef.pktd@gmail.com
Subject: Re: [Numpy-discussion] [RFC] should we argue for a matrix
	power operator, @@?
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Message-ID:
	<CAMMTP+DA-xhZNTekdK2fUxifyZjJOomHtPKr1eAvVfuK-WpODw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

On Sat, Mar 15, 2014 at 8:47 PM, Warren Weckesser <
warren.weckesser@gmail.com> wrote:

On Sat, Mar 15, 2014 at 8:38 PM, <josef.pktd@gmail.com> wrote:

I think I wouldn't use anything like @@ often enough to remember it's
meaning. I'd rather see english names for anything that is not **very**
common.

I find A@@-1 pretty ugly compared to inv(A)
A@@(-0.5)  might be nice   (do we have matrix_sqrt ?)

scipy.linalg.sqrtm:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html

maybe a good example: I could never figured that one out

M = sqrtm(A)

A = M @ M

but what we use in stats is

A = R.T @ R
(eigenvectors dot diag(sqrt of eigenvalues)

which sqrt is A@@(0.5) ?

Josef

Warren

Josef

On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer <shoyer@gmail.com> wrote:

Speaking only for myself (and as someone who has regularly used matrix
powers), I would not expect matrix power as @@ to follow from matrix
multiplication as @. I do agree that matrix power is the only reasonable
use for @@ (given @), but it's still not something I would be confident
enough to know without looking up.

We should keep in mind that each new operator imposes some (small)
cognitive burden on everyone who encounters them for the first time, and,
in this case, this will include a large fraction of all Python users,
whether they do numerical computation or not.

Guido has given us a tremendous gift in the form of @. Let's not insist
on @@, when it is unclear if the burden of figuring out what @@ means it
would be worth using, even for heavily numeric code. I would certainly
prefer to encounter norm(A), inv(A), matrix_power(A, n),
fractional_matrix_power(A, n) and expm(A) rather than their infix
equivalents. It will certainly not be obvious which of these @@ will
support for objects from any given library.

One useful data point might be to consider whether matrix power is
available as an infix operator in other languages commonly used for
numerical work. AFAICT from some quick searches:
MATLAB: Yes
R: No
IDL: No

All of these languages do, of course, implement infix matrix
multiplication, but it is apparently not clear at all whether the matrix
power is useful.

Best,
Stephan

On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau <shish@keba.be>wrote:

2014-03-15 11:18 GMT-04:00 Charles R Harris <charlesr.harris@gmail.com>
:

On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith <njs@pobox.com>wrote:

Hi all,

Here's the second thread for discussion about Guido's concerns about
PEP 465. The issue here is that PEP 465 as currently written proposes
two new operators, @ for matrix multiplication and @@ for matrix power
(analogous to * and **):
  http://legacy.python.org/dev/peps/pep-0465/

The main thing we care about of course is @; I pushed for including @@
because I thought it was nicer to have than not, and I thought the
analogy between * and ** might make the overall package more appealing
to Guido's aesthetic sense.

It turns out I was wrong :-). Guido is -0 on @@, but willing to be
swayed if we think it's worth the trouble to make a solid case.

Note that question now is *not*, how will @@ affect the reception of
@. @ itself is AFAICT a done deal, regardless of what happens with @@.
For this discussion let's assume @ can be taken for granted, and that
we can freely choose to either add @@ or not add @@ to the language.
The question is: which do we think makes Python a better language (for
us and in general)?

Some thoughts to start us off:

Here are the interesting use cases for @@ that I can think of:
- 'vector @@ 2' gives the squared Euclidean length (because it's the
same as vector @ vector). Kind of handy.
- 'matrix @@ n' of course gives the matrix power, which is of marginal
use but does come in handy sometimes, e.g., when looking at graph
connectivity.
- 'matrix @@ -1' provides a very transparent notation for translating
textbook formulas (with all their inverses) into code. It's a bit
unhelpful in practice, because (a) usually you should use solve(), and
(b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But
sometimes transparent notation may be important. (And in some cases,
like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be
compiled into a call to solve() anyway.)

(Did I miss any?)

In practice it seems to me that the last use case is the one that's
might matter a lot practice, but then again, it might not -- I'm not
sure. For example, does anyone who teaches programming with numpy have
a feeling about whether the existence of '@@ -1' would make a big
difference to you and your students? (Alan? I know you were worried
about losing the .I attribute on matrices if switching to ndarrays for
teaching -- given that ndarray will probably not get a .I attribute,
how much would the existence of @@ -1 affect you?)

On a more technical level, Guido is worried about how @@'s precedence
should work (and this is somewhat related to the other thread about
@'s precedence and associativity, because he feels that if we end up
giving @ and * different precedence, then that makes it much less
clear what to do with @@, and reduces the strength of the */**/@/@@
analogy). In particular, if we want to argue for @@ then we'll need to
figure out what expressions like
   a @@ b @@ c
and
   a ** b @@ c
and
   a @@ b ** c
should do.

A related question is what @@ should do if given an array as its right
argument. In the current PEP, only integers are accepted, which rules
out a bunch of the more complicated cases like a @@ b @@ c (at least
assuming @@ is right-associative, like **, and I can't see why you'd
want anything else). OTOH, in the brave new gufunc world, it
technically would make sense to define @@ as being a gufunc with
signature (m,m),()->(m,m), and the way gufuncs work this *would* allow
the "power" to be an array -- for example, we'd have:

   mat = randn(m, m)
   pow = range(n)
   result = gufunc_matrix_power(mat, pow)
   assert result.shape == (n, m, m)
   for i in xrange(n):
       assert np.all(result[i, :, :] == mat ** i)

In this case, a @@ b @@ c would at least be a meaningful expression to
write. OTOH it would be incredibly bizarre and useless, so probably
no-one would ever write it.

As far as these technical issues go, my guess is that the correct rule
is that @@ should just have the same precedence and the same (right)
associativity as **, and in practice no-one will ever write stuff like
a @@ b @@ c. But if we want to argue for @@ we need to come to some
consensus or another here.

It's also possible the answer is "ugh, these issues are too
complicated, we should defer this until later when we have more
experience with @ and gufuncs and stuff". After all, I doubt anyone
else will swoop in and steal @@ to mean something else! OTOH, if e.g.
there's a strong feeling that '@@ -1' will make a big difference in
pedagogical contexts, then putting that off for years might be a
mistake.

I don't have a strong feeling either way on '@@' . Matrix inverses are
pretty common in matrix expressions, but I don't know that the new operator
offers much advantage over a function call. The positive integer powers
might be useful in some domains, as others have pointed out, but
computational practice one would tend to factor the evaluation.

Chuck

Personally I think it should go in, because:
- it's useful (although marginally), as in the examples previously
mentioned
 - it's what people will expect
- it's the only reasonable use of @@ once @ makes it in

As far as the details about precedence rules and what not... Yes,
someone should think about them and come up with rules that make sense, but
since it will be pretty much only be used in unambiguous situations, this
shouldn't be a blocker.

-=- Olivier

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140315/ddc1812f/attachment.html 

------------------------------

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

End of NumPy-Discussion Digest, Vol 90, Issue 45
************************************************