[Numpy-discussion] [help needed] associativity and precedence of '@'

Sat Mar 15 07:44:49 EDT 2014

I tend to favor tight-right. The general scheme of precedence more or
less puts "heavier" operations higher than "lighter" operations (+ < *
< **) and @ is "heavier" than * in my mind. I think tight (either
-right or -left) has a good correspondence with current dot()
expressions, so it will make translation a bit more straightforward,
IMO.

  s * dot(A, b) == s * A @ b
  dot(s * A, b) == (s * A) @ b


On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith <njs at pobox.com> wrote:
> Hi all,
>
> Here's the main blocker for adding a matrix multiply operator '@' to Python:
> we need to decide what we think its precedence and associativity should be.
> I'll explain what that means so we're on the same page, and what the choices
> are, and then we can all argue about it. But even better would be if we
> could get some data to guide our decision, and this would be a lot easier if
> some of you all can help; I'll suggest some ways you might be able to do
> that.
>
> So! Precedence and left- versus right-associativity. If you already know
> what these are you can skim down until you see CAPITAL LETTERS.
>
> We all know what precedence is. Code like this:
>   a + b * c
> gets evaluated as:
>   a + (b * c)
> because * has higher precedence than +. It "binds more tightly", as they
> say. Python's complete precedence table is here:
>   http://docs.python.org/3/reference/expressions.html#operator-precedence
>
> Associativity, in the parsing sense, is less well known, though it's just as
> important. It's about deciding how to evaluate code like this:
>   a * b * c
> Do we use
>   a * (b * c)    # * is "right associative"
> or
>   (a * b) * c    # * is "left associative"
> ? Here all the operators have the same precedence (because, uh... they're
> the same operator), so precedence doesn't help. And mostly we can ignore
> this in day-to-day life, because both versions give the same answer, so who
> cares. But a programming language has to pick one (consider what happens if
> one of those objects has a non-default __mul__ implementation). And of
> course it matters a lot for non-associative operations like
>   a - b - c
> or
>   a / b / c
> So when figuring out order of evaluation, what you do first is check the
> precedence, and then if you have multiple operators next to each other with
> the same precedence, you check their associativity. Notice that this means
> that if you have different operators that share the same precedence level
> (like + and -, or * and /), then they have to all have the same
> associativity. All else being equal, it's generally considered nice to have
> fewer precedence levels, because these have to be memorized by users.
>
> Right now in Python, every precedence level is left-associative, except for
> '**'. If you write these formulas without any parentheses, then what the
> interpreter will actually execute is:
>   (a * b) * c
>   (a - b) - c
>   (a / b) / c
> but
>   a ** (b ** c)
>
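
The grouping described above can be checked directly. What follows is a minimal sketch, assuming a hypothetical helper class `Sym` (not part of Python or NumPy) whose operators return a fully parenthesized string instead of computing anything, so the printed text shows how the interpreter grouped each expression.

```python
class Sym:
    """Tiny symbolic value that records how Python groups its operators.

    Hypothetical helper for illustration only.
    """

    def __init__(self, text):
        self.text = text

    def _binop(self, symbol, other):
        # Wrap every operation in parentheses so the grouping is visible.
        return Sym(f"({self.text} {symbol} {other.text})")

    def __add__(self, other):
        return self._binop("+", other)

    def __sub__(self, other):
        return self._binop("-", other)

    def __mul__(self, other):
        return self._binop("*", other)

    def __truediv__(self, other):
        return self._binop("/", other)

    def __pow__(self, other):
        return self._binop("**", other)


a, b, c = Sym("a"), Sym("b"), Sym("c")

grouped = {
    "a + b * c": (a + b * c).text,
    "a * b * c": (a * b * c).text,
    "a - b - c": (a - b - c).text,
    "a / b / c": (a / b / c).text,
    "a ** b ** c": (a ** b ** c).text,
}

for expression, grouping in grouped.items():
    print(f"{expression:12} -> {grouping}")
```

Only `**` should come out grouped from the right; every other line should show the left operand pair in the inner parentheses.
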
> Okay, that's the background. Here's the question. We need to decide on
> precedence and associativity for '@'. In particular, there are three
> different options that are interesting:
>
> OPTION 1 FOR @:
> Precedence: same as *
> Associativity: left
> My shorthand name for it: "same-left" (yes, very creative)
>
> This means that if you don't use parentheses, you get:
>    a @ b @ c  ->  (a @ b) @ c
>    a * b @ c  ->  (a * b) @ c
>    a @ b * c  ->  (a @ b) * c
>
> OPTION 2 FOR @:
> Precedence: more-weakly-binding than *
> Associativity: right
> My shorthand name for it: "weak-right"
>
> This means that if you don't use parentheses, you get:
>    a @ b @ c  ->  a @ (b @ c)
>    a * b @ c  ->  (a * b) @ c
>    a @ b * c  ->  a @ (b * c)
>
> OPTION 3 FOR @:
> Precedence: more-tightly-binding than *
> Associativity: right
> My shorthand name for it: "tight-right"
>
> This means that if you don't use parentheses, you get:
>    a @ b @ c  ->  a @ (b @ c)
>    a * b @ c  ->  a * (b @ c)
>    a @ b * c  ->  (a @ b) * c
>
> We need to pick which of these options we think is best, based on whatever
> reasons we can think of, ideally more than "hmm, weak-right gives me warm
> fuzzy feelings" ;-). (In principle the other 2 possible options are
> tight-left and weak-left, but there doesn't seem to be any argument in favor
> of either, so we'll leave them out of the discussion.)
>
> Some things to consider:
>
> * and @ are actually not associative (in the math sense) with respect to
> each other, i.e., (a * b) @ c and a * (b @ c) in general give different
> results when 'a' is not a scalar. So considering the two expressions 'a * b
> @ c' and 'a @ b * c', we can see that each of these three options
> produces different results in some cases.
>
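
To make the non-associativity concrete, here is a small sketch using plain nested lists rather than NumPy arrays. The function names `elementwise`, `scale`, and `matmul` are made up for this illustration; `elementwise` stands in for non-scalar `*`, and `matmul` stands in for the proposed `@`. The groupings are written out explicitly, so nothing here depends on which precedence is eventually chosen.

```python
def elementwise(x, y):
    """Elementwise product of two equally shaped matrices (non-scalar '*')."""
    return [[p * q for p, q in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def scale(s, x):
    """Product of a scalar and a matrix (scalar '*')."""
    return [[s * p for p in row] for row in x]


def matmul(x, y):
    """Naive matrix product of nested lists (the operation proposed for '@')."""
    columns = list(zip(*y))
    return [[sum(p * q for p, q in zip(row, col)) for col in columns] for row in x]


a = [[1, 2], [3, 4]]
b = [[1, 0], [0, 1]]
c = [[1, 1], [1, 1]]

# Non-scalar 'a': the two groupings disagree.
left_grouping = matmul(elementwise(a, b), c)    # (a * b) @ c
right_grouping = elementwise(a, matmul(b, c))   # a * (b @ c)
print(left_grouping)
print(right_grouping)

# Scalar 's': the two groupings agree.
s = 2
scalar_left = matmul(scale(s, b), c)            # (s * b) @ c
scalar_right = scale(s, matmul(b, c))           # s * (b @ c)
print(scalar_left == scalar_right)
```
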
> "Same-left" is the easiest to explain and remember, because it's just, "@
> acts like * and /". So we already have to know the rule in order to
> understand other non-associative expressions like a / b / c or a - b - c,
> and it'd be nice if the same rule applied to things like a * b @ c so we
> only had to memorize *one* rule. (Of course there's ** which uses the
> opposite rule, but I guess everyone internalized that one in secondary
> school; that's not true for * versus @.) This is definitely the default we
> should choose unless we have a good reason to do otherwise.
>
> BUT: there might indeed be a good reason to do otherwise, which is the whole
> reason this has come up. Consider:
>     Mat1 @ Mat2 @ vec
> Obviously this will execute much more quickly if we do
>     Mat1 @ (Mat2 @ vec)
> because that results in two cheap matrix-vector multiplies, while
>     (Mat1 @ Mat2) @ vec
> starts out by doing an expensive matrix-matrix multiply. So: maybe @ should
> be right associative, so that we get the fast behaviour without having to
> use explicit parentheses! /If/ these kinds of expressions are common enough
> that having to remember to put explicit parentheses in all the time is more
> of a programmer burden than having to memorize a special associativity rule
> for @. Obviously Mat @ Mat @ vec is more common than vec @ Mat @ Mat, but
> maybe they're both so rare that it doesn't matter in practice -- I don't
> know.
>
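
A rough way to see the size of the gap is to count scalar multiplications. This sketch assumes the textbook cost model, where an (m, k) by (k, n) product takes m*k*n multiplications, and it treats the vector as an (n, 1) column or (1, n) row. Real BLAS-backed timings will differ, and the helper names are hypothetical.

```python
def matmul_cost(shape_a, shape_b):
    """Return (result_shape, multiplications) for a naive (m, k) @ (k, n) product."""
    m, k = shape_a
    k_b, n = shape_b
    if k != k_b:
        raise ValueError(f"shape mismatch: {shape_a} @ {shape_b}")
    return (m, n), m * k * n


def grouping_costs(shape_a, shape_b, shape_c):
    """Multiplication counts for (a @ b) @ c versus a @ (b @ c)."""
    shape_ab, cost_ab = matmul_cost(shape_a, shape_b)
    _, cost_ab_c = matmul_cost(shape_ab, shape_c)

    shape_bc, cost_bc = matmul_cost(shape_b, shape_c)
    _, cost_a_bc = matmul_cost(shape_a, shape_bc)

    return {"left": cost_ab + cost_ab_c, "right": cost_bc + cost_a_bc}


n = 1000
mat, col_vec, row_vec = (n, n), (n, 1), (1, n)

mat_mat_vec = grouping_costs(mat, mat, col_vec)   # Mat1 @ Mat2 @ vec
vec_mat_mat = grouping_costs(row_vec, mat, mat)   # vec @ Mat1 @ Mat2

print("Mat @ Mat @ vec:", mat_mat_vec)
print("vec @ Mat @ Mat:", vec_mat_mat)
```

Under this model the two cases are mirror images: right grouping is the cheap one for Mat @ Mat @ vec, and left grouping is the cheap one for vec @ Mat @ Mat.
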
> Also, if we do want @ to be right associative, then I can't think of any
> clever reasons to prefer weak-right over tight-right, or vice-versa. For the
> scalar multiplication case, I believe both options produce the same result
> in the same amount of time. For the non-scalar case, they give different
> answers. Do people have strong intuitions about what expressions like
>   a * b @ c
>   a @ b * c
> should do actually? (I'm guessing not, but hey, you never know.)
>
> And, while intuition is useful, it would be really *really* nice to be
> basing these decisions on more than *just* intuition, since whatever we
> decide will be subtly influencing the experience of writing linear algebra
> code in Python for the rest of time. So here's where I could use some help.
> First, of course, if you have any other reasons why one or the other of
> these options is better, then please share! But second, I think we need to
> know something about how often the Mat @ Mat @ vec type cases arise in
> practice. How often do non-scalar * and np.dot show up in the same
> expression? How often does it look like a * np.dot(b, c), and how often does
> it look like np.dot(a * b, c)? How often do we see expressions like
> np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a,
> np.dot(b, c))? This would really help guide the debate. I don't have this
> data, and I'm not sure the best way to get it. A super-fancy approach would
> be to write a little script that uses the 'ast' module to count things
> automatically. A less fancy approach would be to just pick some code you've
> written, or a well-known package, grep through for calls to 'dot', and make
> notes on what you see. (An advantage of the less-fancy approach is that as a
> human you might be able to tell the difference between scalar and non-scalar
> *, or check whether it actually matters what order the 'dot' calls are done
> in.)
>
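
One possible shape for the 'ast' script is sketched below. It is only a heuristic: it assumes that any two-argument call to a name or attribute called `dot` is a matrix product, and it cannot tell scalar `*` from non-scalar `*`. The function names and the `SAMPLE` source string are invented for the illustration.

```python
import ast


def is_dot_call(node):
    """True for calls that look like dot(x, y) or <something>.dot(x, y)."""
    if not isinstance(node, ast.Call) or len(node.args) != 2:
        return False
    func = node.func
    if isinstance(func, ast.Name):
        return func.id == "dot"
    if isinstance(func, ast.Attribute):
        return func.attr == "dot"
    return False


def is_mult(node):
    """True for a binary '*' expression."""
    return isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult)


def count_dot_patterns(source):
    """Count the dot/'*' nesting patterns found in a string of Python source."""
    counts = {
        "left_nested": 0,       # dot(dot(a, b), c)
        "right_nested": 0,      # dot(a, dot(b, c))
        "mult_outside_dot": 0,  # a * dot(b, c) or dot(a, b) * c
        "mult_inside_dot": 0,   # dot(a * b, c) or dot(a, b * c)
    }
    for node in ast.walk(ast.parse(source)):
        if is_dot_call(node):
            first, second = node.args
            if is_dot_call(first):
                counts["left_nested"] += 1
            if is_dot_call(second):
                counts["right_nested"] += 1
            if is_mult(first) or is_mult(second):
                counts["mult_inside_dot"] += 1
        elif is_mult(node):
            if is_dot_call(node.left) or is_dot_call(node.right):
                counts["mult_outside_dot"] += 1
    return counts


SAMPLE = """
x = np.dot(np.dot(A, B), v)
y = np.dot(A, np.dot(B, v))
z = s * np.dot(A, v)
w = np.dot(s * A, v)
"""

sample_counts = count_dot_patterns(SAMPLE)
print(sample_counts)
```

To survey a real package, one could read each `.py` file and sum the dictionaries returned by `count_dot_patterns`; the flagged lines would still need checking by hand.
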
> -n
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org


-- 
Robert Kern