Hi all,
In the process of cleaning out old design... peculiarities... in numpy, I happened to look into the history of attempts to add syntax for matrix multiplication to Python, since the lack of this is (as you'll see) at the root of various intractable problems we have. I was pretty surprised; it turns out that even though numerical folks have been whinging about missing this operator for ~15 years, the only two attempts that have been made to add it were:
PEP 211, which instead adds an operator for itertools.product, aka, "maybe we can sneak matrix multiply past Guido in some sort of... large, wooden rabbit..."
and
PEP 225, aka "let's add 12 new operators and figure out what to do with them later"
I'd have rejected these too! So I thought, maybe we should try the radical tactic of writing down what we actually want, carefully explaining why we want it, and then asking for it. And at least this way, if it gets rejected, we'll know that it was rejected for the right reasons...
You'll notice that this draft is rather more developed than the average first-round PEP posting, because it's already done the rounds of all the various numerical package mailing lists to build consensus; no point in asking for the wrong thing. Don't let that slow you down, though. I think what we have here is fairly convincing and covers a lot of the design space (at least it convinced me, which I wasn't sure of at the start), but I'm still totally open to changing anything here based on comments and feedback. AFAICT the numerical community would walk over hot coals if there were an infix matrix multiplication operator on the other side. (BTW, since this is python-ideas -- have you considered adding hot coals to python 3? It might do wonders for uptake.) ...Anyway, the point is, I'm sure I can wrangle them into accepting any useful suggestions or other changes deemed necessary by the broader Python community.
-n
--- [begin draft PEP -- monospace font recommended] ---
PEP: XXXX
Title: Dedicated infix operators for matrix multiplication and matrix power
Version: $Revision$
Last-Modified: $Date$
Author: Nathaniel J. Smith <njs@pobox.com>
Status: Draft
Type: Standards Track
Python-Version: 3.5
Content-Type: text/x-rst
Created: 20-Feb-2014
Post-History:
Abstract
========
This PEP proposes two new binary operators dedicated to matrix multiplication and matrix power, spelled ``@`` and ``@@`` respectively. (Mnemonic: ``@`` is ``*`` for mATrices.)
Specification
=============
Two new binary operators are added to the Python language, together with corresponding in-place versions:
=======  =========================  ===============================
Op       Precedence/associativity   Methods
=======  =========================  ===============================
``@``    Same as ``*``              ``__matmul__``, ``__rmatmul__``
``@@``   Same as ``**``             ``__matpow__``, ``__rmatpow__``
``@=``   n/a                        ``__imatmul__``
``@@=``  n/a                        ``__imatpow__``
=======  =========================  ===============================
No implementations of these methods are added to the builtin or standard library types. However, a number of projects have reached consensus on the recommended semantics for these operations; see `Intended usage details`_ below.
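As an informal illustration (not part of the specification), here is a minimal sketch of how a library might wire up these hooks, once the grammar supports the new tokens. The toy 2d ``Matrix`` class is invented for this sketch and is not proposed for the stdlib:

```python
# Toy sketch only -- this Matrix class is invented for illustration and
# is not proposed for the stdlib or any particular library.

class Matrix:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):               # called for: self @ other
        if not isinstance(other, Matrix):
            return NotImplemented
        a, b = self.rows, other.rows
        # conventional matrix product: out[i][j] = sum_k a[i][k] * b[k][j]
        return Matrix([[sum(a[i][k] * b[k][j] for k in range(len(b)))
                        for j in range(len(b[0]))]
                       for i in range(len(a))])

    def __imatmul__(self, other):              # called for: self @= other
        result = self @ other
        self.rows = result.rows                # update in place
        return self

m = Matrix([[1, 2], [3, 4]])
n = Matrix([[11, 12], [13, 14]])
(m @ n).rows   # -> [[37, 40], [85, 92]]
```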
Motivation
==========
Executive summary
-----------------
In numerical code, there are two important operations which compete for use of Python's ``*`` operator: elementwise multiplication, and matrix multiplication. In the nearly twenty years since the Numeric library was first proposed, there have been many attempts to resolve this tension [#hugunin]_; none have been really satisfactory. Currently, most numerical Python code uses ``*`` for elementwise multiplication, and function/method syntax for matrix multiplication; however, this leads to ugly and unreadable code in common circumstances. The problem is bad enough that significant amounts of code continue to use the opposite convention (which has the virtue of producing ugly and unreadable code in *different* circumstances), and this API fragmentation across codebases then creates yet more problems. There does not seem to be any *good* solution to the problem of designing a numerical API within current Python syntax -- only a landscape of options that are bad in different ways. The minimal change to Python syntax which is sufficient to resolve these problems is the addition of a single new infix operator for matrix multiplication.
Matrix multiplication has a singular combination of features which distinguish it from other binary operations, which together provide a uniquely compelling case for the addition of a dedicated infix operator:
* Just as for the existing numerical operators, there exists a vast body of prior art supporting the use of infix notation for matrix multiplication across all fields of mathematics, science, and engineering; ``@`` harmoniously fills a hole in Python's existing operator system.
* ``@`` greatly clarifies real-world code.
* ``@`` provides a smoother onramp for less experienced users, who are particularly harmed by hard-to-read code and API fragmentation.
* ``@`` benefits a substantial and growing portion of the Python user community.
* ``@`` will be used frequently -- in fact, evidence suggests it may be used more frequently than ``//`` or the bitwise operators.
* ``@`` allows the Python numerical community to reduce fragmentation, and finally standardize on a single consensus duck type for all numerical array objects.
And, given the existence of ``@``, it makes more sense than not to have ``@@``, ``@=``, and ``@@=``, so they are added as well.
Background: What's wrong with the status quo?
---------------------------------------------
When we crunch numbers on a computer, we usually have lots and lots of numbers to deal with. Trying to deal with them one at a time is cumbersome and slow -- especially when using an interpreted language. Instead, we want the ability to write down simple operations that apply to large collections of numbers all at once. The *n-dimensional array* is the basic object that all popular numeric computing environments use to make this possible. Python has several libraries that provide such arrays, with numpy being at present the most prominent.
When working with n-dimensional arrays, there are two different ways we might want to define multiplication. One is elementwise multiplication::
    [[1, 2],     [[11, 12],     [[1 * 11, 2 * 12],
     [3, 4]]  x   [13, 14]]  =   [3 * 13, 4 * 14]]
and the other is `matrix multiplication`_:
.. _matrix multiplication: https://en.wikipedia.org/wiki/Matrix_multiplication
::
    [[1, 2],     [[11, 12],     [[1 * 11 + 2 * 13, 1 * 12 + 2 * 14],
     [3, 4]]  x   [13, 14]]  =   [3 * 11 + 4 * 13, 3 * 12 + 4 * 14]]
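For concreteness, here are the two example products computed with plain Python loops, so the entries can be checked against the arrays above:

```python
a = [[1, 2], [3, 4]]
b = [[11, 12], [13, 14]]

# Elementwise multiplication: multiply matching entries.
elementwise = [[a[i][j] * b[i][j] for j in range(2)] for i in range(2)]
# -> [[11, 24], [39, 56]]

# Matrix multiplication: out[i][j] = sum_k a[i][k] * b[k][j].
matmul = [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
# -> [[37, 40], [85, 92]]
```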
Elementwise multiplication is useful because it lets us easily and quickly perform many multiplications on a large collection of values, without writing a slow and cumbersome ``for`` loop. And this works as part of a very general schema: when using the array objects provided by numpy or other numerical libraries, all Python operators work elementwise on arrays of all dimensionalities. The result is that one can write functions using straightforward code like ``a * b + c / d``, treating the variables as if they were simple values, but then immediately use this function to efficiently perform this calculation on large collections of values, while keeping them organized using whatever arbitrarily complex array layout works best for the problem at hand.
Matrix multiplication is more of a special case. It's only defined on 2d arrays (also known as "matrices"), and multiplication is the only operation that has a meaningful "matrix" version -- "matrix addition" is the same as elementwise addition; there is no such thing as "matrix bitwise-or" or "matrix floordiv"; "matrix division" can be defined but is not very useful, etc. However, matrix multiplication is still used very heavily across all numerical application areas; mathematically, it's one of the most fundamental operations there is.
Because Python syntax currently allows for only a single multiplication operator ``*``, libraries providing array-like objects must decide: either use ``*`` for elementwise multiplication, or use ``*`` for matrix multiplication. And, unfortunately, it turns out that when doing general-purpose number crunching, both operations are used frequently, and there are major advantages to using infix rather than function call syntax in both cases. Thus it is not at all clear which convention is optimal, or even acceptable; often it varies on a case-by-case basis.
Nonetheless, network effects mean that it is very important that we pick *just one* convention. In numpy, for example, it is technically possible to switch between the conventions, because numpy provides two different types with different ``__mul__`` methods. For ``numpy.ndarray`` objects, ``*`` performs elementwise multiplication, and matrix multiplication must use a function call (``numpy.dot``). For ``numpy.matrix`` objects, ``*`` performs matrix multiplication, and elementwise multiplication requires function syntax. Writing code using ``numpy.ndarray`` works fine. Writing code using ``numpy.matrix`` also works fine. But trouble begins as soon as we try to integrate these two pieces of code together. Code that expects an ``ndarray`` and gets a ``matrix``, or vice-versa, may crash or return incorrect results. Keeping track of which functions expect which types as inputs, and return which types as outputs, and then converting back and forth all the time, is incredibly cumbersome and impossible to get right at any scale. Functions that defensively try to handle both types as input and DTRT find themselves floundering into a swamp of ``isinstance`` and ``if`` statements.
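To make the swamp concrete, here is a sketch of the kind of defensive code this fragmentation forces. The class names are invented stand-ins for ndarray-style and matrix-style objects whose ``*`` methods disagree:

```python
# Invented toy classes, standing in for ndarray-style and matrix-style
# objects. They hold 2d nested lists; only * differs between them.

class ElemArray:                      # '*' means elementwise multiply
    def __init__(self, rows):
        self.rows = rows
    def __mul__(self, other):
        return ElemArray([[x * y for x, y in zip(r, s)]
                          for r, s in zip(self.rows, other.rows)])

class MatArray:                       # '*' means matrix multiply
    def __init__(self, rows):
        self.rows = rows
    def __mul__(self, other):
        n = len(other.rows)
        return MatArray([[sum(r[k] * other.rows[k][j] for k in range(n))
                          for j in range(len(other.rows[0]))]
                         for r in self.rows])

def matrix_product(x, y):
    # Defensive code must sniff types before every multiplication:
    if isinstance(x, MatArray):
        return (x * y).rows                     # '*' already does what we want
    elif isinstance(x, ElemArray):
        n = len(y.rows)                         # spell it out by hand
        return [[sum(r[k] * y.rows[k][j] for k in range(n))
                 for j in range(len(y.rows[0]))]
                for r in x.rows]
    raise TypeError("unknown array type")
```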
PEP 238 split ``/`` into two operators: ``/`` and ``//``. Imagine the chaos that would have resulted if it had instead split ``int`` into two types: ``classic_int``, whose ``__div__`` implemented floor division, and ``new_int``, whose ``__div__`` implemented true division. This, in a more limited way, is the situation that Python number-crunchers currently find themselves in.
In practice, the vast majority of projects have settled on the convention of using ``*`` for elementwise multiplication, and function call syntax for matrix multiplication (e.g., using ``numpy.ndarray`` instead of ``numpy.matrix``). This reduces the problems caused by API fragmentation, but it doesn't eliminate them. The strong desire to use infix notation for matrix multiplication has caused a number of specialized array libraries to continue to use the opposing convention (e.g., scipy.sparse, pyoperators, pyviennacl) despite the problems this causes, and ``numpy.matrix`` itself still gets used in introductory programming courses, often appears in StackOverflow answers, and so forth. Well-written libraries thus must continue to be prepared to deal with both types of objects, and, of course, are also stuck using unpleasant funcall syntax for matrix multiplication. After nearly two decades of trying, the numerical community has still not found any way to resolve these problems within the constraints of current Python syntax (see `Rejected alternatives to adding a new operator`_ below).
This PEP proposes the minimum effective change to Python syntax that will allow us to drain this swamp. It splits ``*`` into two operators, just as was done for ``/``: ``*`` for elementwise multiplication, and ``@`` for matrix multiplication. (Why not the reverse? Because this way is compatible with the existing consensus, and because it gives us a consistent rule that all the built-in numeric operators also apply in an elementwise manner to arrays; the reverse convention would lead to more special cases.)
So that's why matrix multiplication doesn't and can't just use ``*``. Now, in the rest of this section, we'll explain why it nonetheless meets the high bar for adding a new operator.
Why should matrix multiplication be infix?
------------------------------------------
Right now, most numerical code in Python uses syntax like ``numpy.dot(a, b)`` or ``a.dot(b)`` to perform matrix multiplication. This obviously works, so why do people make such a fuss about it, even to the point of creating API fragmentation and compatibility swamps?
Matrix multiplication shares two features with ordinary arithmetic operations like addition and multiplication on numbers: (a) it is used very heavily in numerical programs -- often multiple times per line of code -- and (b) it has an ancient and universally adopted tradition of being written using infix syntax. This is because, for typical formulas, this notation is dramatically more readable than any function call syntax. Here's an example to demonstrate:
One of the most useful tools for testing a statistical hypothesis is the linear hypothesis test for OLS regression models. It doesn't really matter what all those words I just said mean; if we find ourselves having to implement this thing, what we'll do is look up some textbook or paper on it, and encounter many mathematical formulas that look like:
.. math::
S = (H \beta - r)^T (H V H^T)^{-1} (H \beta - r)
Here the various variables are all vectors or matrices (details for the curious: [#lht]_).
Now we need to write code to perform this calculation. In current numpy, matrix multiplication can be performed using either the function or method call syntax. Neither provides a particularly readable translation of the formula::
    import numpy as np
    from numpy.linalg import inv, solve
    # Using dot function:
    S = np.dot((np.dot(H, beta) - r).T,
               np.dot(inv(np.dot(np.dot(H, V), H.T)),
                      np.dot(H, beta) - r))
    # Using dot method:
    S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)
With the ``@`` operator, the direct translation of the above formula becomes::
    S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
Notice that there is now a transparent, 1-to-1 mapping between the symbols in the original formula and the code that implements it.
Of course, an experienced programmer will probably notice that this is not the best way to compute this expression. The repeated computation of :math:`H \beta - r` should perhaps be factored out; and, expressions of the form ``dot(inv(A), B)`` should almost always be replaced by the more numerically stable ``solve(A, B)``. When using ``@``, performing these two refactorings gives us::
    # Version 1 (as above)
    S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
    # Version 2
    trans_coef = H @ beta - r
    S = trans_coef.T @ inv(H @ V @ H.T) @ trans_coef
    # Version 3
    S = trans_coef.T @ solve(H @ V @ H.T, trans_coef)
Notice that when comparing between each pair of steps, it's very easy to see exactly what was changed. If we apply the equivalent transformations to the code using the .dot method, then the changes are much harder to read out or verify for correctness::
    # Version 1 (as above)
    S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)
    # Version 2
    trans_coef = H.dot(beta) - r
    S = trans_coef.T.dot(inv(H.dot(V).dot(H.T))).dot(trans_coef)
    # Version 3
    S = trans_coef.T.dot(solve(H.dot(V).dot(H.T)), trans_coef)
Readability counts! The statements using ``@`` are shorter, contain more whitespace, can be directly and easily compared both to each other and to the textbook formula, and contain only meaningful parentheses. This last point is particularly important for readability: when using function-call syntax, the required parentheses on every operation create visual clutter that makes it very difficult to parse out the overall structure of the formula by eye, even for a relatively simple formula like this one. Eyes are terrible at parsing non-regular languages. I made and caught many errors while trying to write out the 'dot' formulas above. I know they still contain at least one error, maybe more. (Exercise: find it. Or them.) The ``@`` examples, by contrast, are not only correct, they're obviously correct at a glance.
If we are even more sophisticated programmers, and writing code that we expect to be reused, then considerations of speed or numerical accuracy might lead us to prefer some particular order of evaluation. Because ``@`` makes it possible to omit irrelevant parentheses, we can be certain that if we *do* write something like ``(H @ V) @ H.T``, then our readers will know that the parentheses must have been added intentionally to accomplish some meaningful purpose. In the ``dot`` examples, it's impossible to know which nesting decisions are important, and which are arbitrary.
Infix ``@`` dramatically improves matrix code usability at all stages of programmer interaction.
Transparent syntax is especially crucial for non-expert programmers
-------------------------------------------------------------------
A large proportion of scientific code is written by people who are experts in their domain, but are not experts in programming. And there are many university courses run each year with titles like "Data analysis for social scientists" which assume no programming background, and teach some combination of mathematical techniques, introduction to programming, and the use of programming to implement these mathematical techniques, all within a 10-15 week period. These courses are more and more often being taught in Python rather than special-purpose languages like R or Matlab.
For these kinds of users, whose programming knowledge is fragile, the existence of a transparent mapping between formulas and code often means the difference between succeeding and failing to write that code at all. This is so important that such classes often use the ``numpy.matrix`` type which defines ``*`` to mean matrix multiplication, even though this type is buggy and heavily disrecommended by the rest of the numpy community for the fragmentation that it causes. This pedagogical use case is, in fact, the *only* reason ``numpy.matrix`` remains a supported part of numpy. Adding ``@`` will benefit both beginning and advanced users with better syntax; and furthermore, it will allow both groups to standardize on the same notation from the start, providing a smoother on-ramp to expertise.
But isn't matrix multiplication a pretty niche requirement?
-----------------------------------------------------------
The world is full of continuous data, and computers are increasingly called upon to work with it in sophisticated ways. Arrays are the lingua franca of finance, machine learning, 3d graphics, computer vision, robotics, operations research, econometrics, meteorology, computational linguistics, recommendation systems, neuroscience, astronomy, bioinformatics (including genetics, cancer research, drug discovery, etc.), physics engines, quantum mechanics, geophysics, network analysis, and many other application areas. In most or all of these areas, Python is rapidly becoming a dominant player, in large part because of its ability to elegantly mix traditional discrete data structures (hash tables, strings, etc.) on an equal footing with modern numerical data types and algorithms.
We all live in our own little sub-communities, so some Python users may be surprised to realize the sheer extent to which Python is used for number crunching -- especially since much of this particular sub-community's activity occurs outside of traditional Python/FOSS channels. So, to give some rough idea of just how many numerical Python programmers are actually out there, here are two numbers: In 2013, there were 7 international conferences organized specifically on numerical Python [#scipy-conf]_ [#pydata-conf]_. At PyCon 2014, ~20% of the tutorials appear to involve the use of matrices [#pycon-tutorials]_.
To quantify this further, we used Github's "search" function to look at what modules are actually imported across a wide range of real-world code (i.e., all the code on Github). We checked for imports of several popular stdlib modules, a variety of numerically oriented modules, and various other extremely high-profile modules like django and lxml (the latter of which is the #1 most downloaded package on PyPI). Starred lines indicate packages which export array- or matrix-like objects which will adopt ``@`` if this PEP is approved::
      Count of Python source files on Github matching given search terms
                       (as of 2014-04-10, ~21:00 UTC)

    ===============  ==========  ===============  =======  ===========
    module           "import X"  "from X import"    total  total/numpy
    ===============  ==========  ===============  =======  ===========
    sys                 2374638            63301  2437939         5.85
    os                  1971515            37571  2009086         4.82
    re                  1294651             8358  1303009         3.12
    numpy ************** 337916 ********** 79065 * 416981 ******* 1.00
    warnings             298195            73150   371345         0.89
    subprocess           281290            63644   344934         0.83
    django                62795           219302   282097         0.68
    math                 200084            81903   281987         0.68
    threading            212302            45423   257725         0.62
    pickle+cPickle       215349            22672   238021         0.57
    matplotlib           119054            27859   146913         0.35
    sqlalchemy            29842            82850   112692         0.27
    pylab *************** 36754 ********** 41063 ** 77817 ******* 0.19
    scipy *************** 40829 ********** 28263 ** 69092 ******* 0.17
    lxml                  19026            38061    57087         0.14
    zlib                  40486             6623    47109         0.11
    multiprocessing       25247            19850    45097         0.11
    requests              30896              560    31456         0.08
    jinja2                 8057            24047    32104         0.08
    twisted               13858             6404    20262         0.05
    gevent                11309             8529    19838         0.05
    pandas ************** 14923 *********** 4005 ** 18928 ******* 0.05
    sympy                  2779             9537    12316         0.03
    theano *************** 3654 *********** 1828 *** 5482 ******* 0.01
    ===============  ==========  ===============  =======  ===========
These numbers should be taken with several grains of salt (see footnote for discussion: [#github-details]_), but, to the extent they can be trusted, they suggest that ``numpy`` might be the single most-imported non-stdlib module in the entire Pythonverse; it's even more-imported than such stdlib stalwarts as ``subprocess``, ``math``, ``pickle``, and ``threading``. And numpy users represent only a subset of the broader numerical community that will benefit from the ``@`` operator. Matrices may once have been a niche data type restricted to Fortran programs running in university labs and military clusters, but those days are long gone. Number crunching is a mainstream part of modern Python usage.
In addition, there is some precedent for adding an infix operator to handle a more-specialized arithmetic operation: the floor division operator ``//``, like the bitwise operators, is very useful under certain circumstances when performing exact calculations on discrete values. But it seems likely that there are many Python programmers who have never had reason to use ``//`` (or, for that matter, the bitwise operators). ``@`` is no more niche than ``//``.
So ``@`` is good for matrix formulas, but how common are those really?
----------------------------------------------------------------------
We've seen that ``@`` makes matrix formulas dramatically easier to work with for both experts and non-experts, that matrix formulas appear in many important applications, and that numerical libraries like numpy are used by a substantial proportion of Python's user base. But numerical libraries aren't just about matrix formulas, and being important doesn't necessarily mean taking up a lot of code: if matrix formulas only occurred in one or two places in the average numerically-oriented project, then it still wouldn't be worth adding a new operator. So how common is matrix multiplication, really?
When the going gets tough, the tough get empirical. To get a rough estimate of how useful the ``@`` operator will be, the table below shows the rate at which different Python operators are actually used in the stdlib, and also in two high-profile numerical packages -- the scikit-learn machine learning library, and the nipy neuroimaging library -- normalized by source lines of code (SLOC). Rows are sorted by the 'combined' column, which pools all three code bases together. The combined column is thus strongly weighted towards the stdlib, which is much larger than both projects put together (stdlib: 411575 SLOC, scikit-learn: 50924 SLOC, nipy: 37078 SLOC). [#sloc-details]_
The ``dot`` row (marked ``******``) counts how common matrix multiply operations are in each codebase.
::
    ===  ======  ============  ====  ========
     op  stdlib  scikit-learn  nipy  combined
    ===  ======  ============  ====  ========
      =    2969          5536  4932      3376 / 10,000 SLOC
      -     218           444   496       261
      +     224           201   348       231
     ==     177           248   334       196
      *     156           284   465       192
      %     121           114   107       119
     **      59           111   118        68
     !=      40            56    74        44
      /      18           121   183        41
      >      29            70   110        39
     +=      34            61    67        39
      <      32            62    76        38
     >=      19            17    17        18
     <=      18            27    12        18
    dot ***** 0 ********** 99 ** 74 ****** 16
      |      18             1     2        15
      &      14             0     6        12
     <<      10             1     1         8
     //       9             9     1         8
     -=       5            21    14         8
     *=       2            19    22         5
     /=       0            23    16         4
     >>       4             0     0         3
      ^       3             0     0         3
      ~       2             4     5         2
     |=       3             0     0         2
     &=       1             0     0         1
    //=       1             0     0         1
     ^=       1             0     0         0
    **=       0             2     0         0
     %=       0             0     0         0
    <<=       0             0     0         0
    >>=       0             0     0         0
    ===  ======  ============  ====  ========
These two numerical packages alone contain ~780 uses of matrix multiplication. Within these packages, matrix multiplication is used more heavily than most comparison operators (``<`` ``!=`` ``<=`` ``>=``). Even when we dilute these counts by including the stdlib into our comparisons, matrix multiplication is still used more often in total than any of the bitwise operators, and 2x as often as ``//``. This is true even though the stdlib, which contains a fair amount of integer arithmetic and no matrix operations, makes up more than 80% of the combined code base.
By coincidence, the numeric libraries make up approximately the same proportion of the 'combined' codebase as numeric tutorials make up of PyCon 2014's tutorial schedule, which suggests that the 'combined' column may not be *wildly* unrepresentative of new Python code in general. While it's impossible to know for certain, from this data it seems entirely possible that across all Python code currently being written, matrix multiplication is already used more often than ``//`` and the bitwise operations.
But isn't it weird to add an operator with no stdlib uses?
----------------------------------------------------------
It's certainly unusual (though ``Ellipsis`` was also added without any stdlib uses). But the important thing is whether a change will benefit users, not where the software is being downloaded from. It's clear from the above that ``@`` will be used, and used heavily. And this PEP provides the critical piece that will allow the Python numerical community to finally reach consensus on a standard duck type for all array-like objects, which is a necessary precondition to ever adding a numerical array type to the stdlib.
Matrix power and in-place operators
-----------------------------------
The primary motivation for this PEP is ``@``; the other proposed operators don't have nearly as much impact. The matrix power operator ``@@`` is useful and well-defined, but not really necessary. It is still included, though, for consistency: if we have an ``@`` that is analogous to ``*``, then it would be weird and surprising to *not* have an ``@@`` that is analogous to ``**``. Similarly, the in-place operators ``@=`` and ``@@=`` provide limited value -- it's more common to write ``a = (b @ a)`` than it is to write ``a = (a @ b)``, and in-place matrix operations still generally have to allocate substantial temporary storage -- but they are included for completeness and symmetry.
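For concreteness, the intended meaning of ``M @@ n`` for a square matrix ``M`` and non-negative integer ``n`` is iterated matrix multiplication, with ``M @@ 0`` giving the identity. A plain-function sketch of these semantics (the operator does not exist yet, and the helper name here is invented):

```python
def matpow(m, n):
    # Sketch of the proposed ``m @@ n`` semantics for a square matrix m
    # (nested lists) and non-negative integer n: repeated matrix
    # multiplication, with the n == 0 case giving the identity matrix.
    size = len(m)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = [[sum(result[i][k] * m[k][j] for k in range(size))
                   for j in range(size)]
                  for i in range(size)]
    return result

matpow([[1, 1], [0, 1]], 3)   # -> [[1, 3], [0, 1]]
```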
Compatibility considerations
============================
Currently, the only legal use of the ``@`` token in Python code is at statement beginning in decorators. The new operators are all infix; the one place they can never occur is at statement beginning. Therefore, no existing code will be broken by the addition of these operators, and there is no possible parsing ambiguity between decorator-@ and the new operators.
Another important kind of compatibility is the mental cost paid by users to update their understanding of the Python language after this change, particularly for users who do not work with matrices and thus do not benefit. Here again, ``@`` has minimal impact: even comprehensive tutorials and references will only need to add a sentence or two to fully document this PEP's changes for a non-numerical audience.
Intended usage details
======================
This section is informative, rather than normative -- it documents the consensus of a number of libraries that provide array- or matrix-like objects on how the ``@`` and ``@@`` operators will be implemented.
This section uses the numpy terminology for describing arbitrary multidimensional arrays of data, because it is a superset of all other commonly used models. In this model, the *shape* of any array is represented by a tuple of integers. Because matrices are two-dimensional, they have len(shape) == 2, while 1d vectors have len(shape) == 1, and scalars have shape == (), i.e., they are "0 dimensional". Any array contains prod(shape) total entries. Notice that `prod(()) == 1`_ (for the same reason that sum(()) == 0); scalars are just an ordinary kind of array, not a special case. Notice also that we distinguish between a single scalar value (shape == (), analogous to ``1``), a vector containing only a single entry (shape == (1,), analogous to ``[1]``), a matrix containing only a single entry (shape == (1, 1), analogous to ``[[1]]``), etc., so the dimensionality of any array is always well-defined. Other libraries with more restricted representations (e.g., those that support 2d arrays only) might implement only a subset of the functionality described here.
.. _prod(()) == 1: https://en.wikipedia.org/wiki/Empty_product
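The entry-count rule can be checked directly (``math.prod`` is available in Python 3.8+):

```python
from math import prod   # stdlib, Python 3.8+

# Number of entries in arrays of various shapes, per the model above:
prod((2, 3))   # 2x3 matrix -> 6 entries
prod((3,))     # 1d vector  -> 3 entries
prod(())       # scalar: the empty product is 1, so exactly one entry
```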
Semantics
---------
The recommended semantics for ``@`` for different inputs are:
* 2d inputs are conventional matrices, and so the semantics are obvious: we apply conventional matrix multiplication. If we write ``arr(2, 3)`` to represent an arbitrary 2x3 array, then ``arr(3, 4) @ arr(4, 5)`` returns an array with shape (3, 5).
* 1d vector inputs are promoted to 2d by prepending or appending a '1' to the shape, the operation is performed, and then the added dimension is removed from the output. The 1 is always added on the "outside" of the shape: prepended for left arguments, and appended for right arguments. The result is that matrix @ vector and vector @ matrix are both legal (assuming compatible shapes), and both return 1d vectors; vector @ vector returns a scalar. This is clearer with examples.
* ``arr(2, 3) @ arr(3, 1)`` is a regular matrix product, and returns an array with shape (2, 1), i.e., a column vector.
* ``arr(2, 3) @ arr(3)`` performs the same computation as the previous (i.e., treats the 1d vector as a matrix containing a single *column*, shape = (3, 1)), but returns the result with shape (2,), i.e., a 1d vector.
* ``arr(1, 3) @ arr(3, 2)`` is a regular matrix product, and returns an array with shape (1, 2), i.e., a row vector.
* ``arr(3) @ arr(3, 2)`` performs the same computation as the previous (i.e., treats the 1d vector as a matrix containing a single *row*, shape = (1, 3)), but returns the result with shape (2,), i.e., a 1d vector.
* ``arr(1, 3) @ arr(3, 1)`` is a regular matrix product, and returns an array with shape (1, 1), i.e., a single value in matrix form.
* ``arr(3) @ arr(3)`` performs the same computation as the previous, but returns the result with shape (), i.e., a single scalar value, not in matrix form. So this is the standard inner product on vectors.
An infelicity of this definition for 1d vectors is that it makes ``@`` non-associative in some cases (``(Mat1 @ vec) @ Mat2`` != ``Mat1 @ (vec @ Mat2)``). But this seems to be a case where practicality beats purity: non-associativity only arises for strange expressions that would never be written in practice; if they are written anyway then there is a consistent rule for understanding what will happen (``Mat1 @ vec @ Mat2`` is parsed as ``(Mat1 @ vec) @ Mat2``, just like ``a - b - c``); and, not supporting 1d vectors would rule out many important use cases that do arise very commonly in practice. No-one wants to explain to new users why to solve the simplest linear system in the obvious way, they have to type ``(inv(A) @ b[:, np.newaxis]).flatten()`` instead of ``inv(A) @ b``, or perform an ordinary least-squares regression by typing ``solve(X.T @ X, X.T @ y[:, np.newaxis]).flatten()`` instead of ``solve(X.T @ X, X.T @ y)``. No-one wants to type ``(a[np.newaxis, :] @ b[:, np.newaxis])[0, 0]`` instead of ``a @ b`` every time they compute an inner product, or ``(a[np.newaxis, :] @ Mat @ b[:, np.newaxis])[0, 0]`` for general quadratic forms instead of ``a @ Mat @ b``. In addition, sage and sympy (see below) use these non-associative semantics with an infix matrix multiplication operator (they use ``*``), and they report that they haven't experienced any problems caused by it.
* For inputs with more than 2 dimensions, we treat the last two dimensions as being the dimensions of the matrices to multiply, and 'broadcast' across the other dimensions. This provides a convenient way to quickly compute many matrix products in a single operation. For example, ``arr(10, 2, 3) @ arr(10, 3, 4)`` performs 10 separate matrix multiplies, each of which multiplies a 2x3 and a 3x4 matrix to produce a 2x4 matrix, and then returns the 10 resulting matrices together in an array with shape (10, 2, 4). The intuition here is that we treat these 3d arrays of numbers as if they were 1d arrays *of matrices*, and then apply matrix multiplication in an elementwise manner, where now each 'element' is a whole matrix. Note that broadcasting is not limited to perfectly aligned arrays; in more complicated cases, it allows several simple but powerful tricks for controlling how arrays are aligned with each other; see [#broadcasting]_ for details. (In particular, it turns out that when broadcasting is taken into account, the standard scalar * matrix product is a special case of the elementwise multiplication operator ``*``.)
If one operand is >2d, and another operand is 1d, then the above rules apply unchanged, with 1d->2d promotion performed before broadcasting. E.g., ``arr(10, 2, 3) @ arr(3)`` first promotes to ``arr(10, 2, 3) @ arr(3, 1)``, then broadcasts the right argument to create the aligned operation ``arr(10, 2, 3) @ arr(10, 3, 1)``, multiplies to get an array with shape (10, 2, 1), and finally removes the added dimension, returning an array with shape (10, 2). Similarly, ``arr(2) @ arr(10, 2, 3)`` produces an intermediate array with shape (10, 1, 3), and a final array with shape (10, 3).
* 0d (scalar) inputs raise an error. Scalar * matrix multiplication is a mathematically and algorithmically distinct operation from matrix @ matrix multiplication, and is already covered by the elementwise ``*`` operator. Allowing scalar @ matrix would thus both require an unnecessary special case, and violate TOOWTDI.
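The shape rules above can be condensed into a short, pure-Python shape-inference sketch (no numpy required; the function name and error messages are invented for illustration):

```python
from itertools import zip_longest

def matmul_shape(a, b):
    """Infer the result shape of ``a @ b`` from two shape tuples,
    following the rules above: 0d is an error, 1d operands are
    promoted to a row (left) or a column (right), and leading
    'batch' dimensions broadcast against each other."""
    if len(a) == 0 or len(b) == 0:
        raise TypeError("@ is not defined for 0d (scalar) operands")
    left_promoted, right_promoted = len(a) == 1, len(b) == 1
    if left_promoted:
        a = (1,) + a          # 1d on the left acts as a single row
    if right_promoted:
        b = b + (1,)          # 1d on the right acts as a single column
    if a[-1] != b[-2]:
        raise ValueError("mismatched core dimensions")
    # Broadcast the leading dimensions, right-aligned, with 1s stretching.
    batch = []
    for x, y in zip_longest(reversed(a[:-2]), reversed(b[:-2]), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError("batch dimensions do not broadcast")
        batch.append(max(x, y))
    shape = tuple(reversed(batch)) + (a[-2], b[-1])
    # Remove the dimensions that 1d->2d promotion temporarily added.
    if left_promoted:
        shape = shape[:-2] + shape[-1:]
    if right_promoted:
        shape = shape[:-1]
    return shape
```

For example, ``matmul_shape((10, 2, 3), (3,))`` gives ``(10, 2)``, matching the promote-broadcast-multiply-strip sequence described above.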
The recommended semantics for ``@@`` are::
    def __matpow__(self, n):
        if not isinstance(n, numbers.Integral):
            raise TypeError("@@ not implemented for fractional powers")
        if n == 0:
            return identity_matrix_with_shape(self.shape)
        elif n < 0:
            return inverse(self) @ (self @@ (n + 1))
        else:
            return self @ (self @@ (n - 1))
(Of course we expect that much more efficient implementations will be used in practice.) Notice that if given an appropriate definition of ``identity_matrix_with_shape``, then this definition will automatically handle >2d arrays appropriately. Notice also that with this definition, ``vector @@ 2`` gives the squared Euclidean length of the vector, a commonly used value. Also, while it is rarely useful to explicitly compute inverses or other negative powers in standard immediate-mode dense matrix code, these computations are natural when doing symbolic or deferred-mode computations (as in e.g. sympy, theano, numba, numexpr); therefore, negative powers are fully supported. Fractional powers, though, bring in a variety of `mathematical complications`_, so we leave it to individual projects to decide whether they want to try to define some reasonable semantics for fractional inputs.
.. _`mathematical complications`: https://en.wikipedia.org/wiki/Square_root_of_a_matrix
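As a concrete (and deliberately naive) illustration of the non-negative-power part of this recipe, here is an iterative pure-Python sketch; ``inverse`` is omitted, and the nested-loop multiply is exactly the kind of slow implementation no real library would ship:

```python
from numbers import Integral

def matmul(A, B):
    # Schoolbook multiply on lists-of-lists, for illustration only.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matpow(A, n):
    # The @@ recipe above, restricted to square matrices and n >= 0.
    if not isinstance(n, Integral):
        raise TypeError("@@ not implemented for fractional powers")
    if n < 0:
        raise NotImplementedError("negative powers need inverse()")
    result = identity(len(A))
    for _ in range(n):
        result = matmul(result, A)
    return result
```

With this, ``matpow(A, 0)`` is the identity and ``matpow(A, 3)`` equals ``A @ A @ A``, as the recursive definition requires.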
Adoption
--------
We group existing Python projects which provide array- or matrix-like types based on what API they currently use for elementwise and matrix multiplication.
**Projects which currently use * for *elementwise* multiplication, and function/method calls for *matrix* multiplication:**
The developers of the following projects have expressed an intention to implement ``@`` and ``@@`` on their array-like types using the above semantics:
* numpy
* pandas
* blaze
* theano
The following projects have been alerted to the existence of the PEP, but it's not yet known what they plan to do if it's accepted. We don't anticipate that they'll have any objections, though, since everything proposed here is consistent with how they already do things:
* pycuda
* panda3d
**Projects which currently use * for *matrix* multiplication, and function/method calls for *elementwise* multiplication:**
The following projects have expressed an intention, if this PEP is accepted, to migrate from their current API to the elementwise-``*``, matmul-``@`` convention (i.e., this is a list of projects whose API fragmentation will probably be eliminated if this PEP is accepted):
* numpy (``numpy.matrix``)
* scipy.sparse
* pyoperators
* pyviennacl
The following projects have been alerted to the existence of the PEP, but it's not known what they plan to do if it's accepted (i.e., this is a list of projects whose API fragmentation may or may not be eliminated if this PEP is accepted):
* cvxopt
**Projects which currently use * for *matrix* multiplication, and which do not implement elementwise multiplication at all:**
There are several projects which implement matrix types, but from a very different perspective than the numerical libraries discussed above. These projects focus on computational methods for analyzing matrices in the sense of abstract mathematical objects (i.e., linear maps over free modules over rings), rather than as big bags full of numbers that need crunching. And it turns out that from the abstract math point of view, there isn't much use for elementwise operations in the first place; as discussed in the Background section above, elementwise operations are motivated by the bag-of-numbers approach. The different goals of these projects mean that they don't encounter the basic problem that this PEP exists to address, making it mostly irrelevant to them; while they appear superficially similar, they're actually doing something quite different. They use ``*`` for matrix multiplication (and for group actions, and so forth), and if this PEP is accepted, their expressed intention is to continue doing so, while perhaps adding ``@`` and ``@@`` on matrices as aliases for ``*`` and ``**``:
* sympy
* sage
If you know of any actively maintained Python libraries which provide an interface for working with numerical arrays or matrices, and which are not listed above, then please let the PEP author know: njs@pobox.com
Rationale for specification details
===================================
Choice of operator
------------------
Why ``@`` instead of some other punctuation symbol? It doesn't matter much, and there isn't any consensus across other programming languages about how this operator should be named [#matmul-other-langs]_, but ``@`` has a few advantages:
* ``@`` is a friendly character that Pythoneers are already used to typing in decorators, and its use in email addresses means it is more likely to be easily accessible across keyboard layouts than some other characters (e.g. ``$`` or non-ASCII characters).
* The mATrices mnemonic is cute.
* It's round like ``*`` and :math:`\cdot`.
* The use of a single-character token makes ``@@`` possible, which is a nice bonus.
* The swirly shape is reminiscent of the simultaneous sweeps over rows and columns that define matrix multiplication; its asymmetry is evocative of its non-commutative nature.
(Non)-Definitions for built-in types
------------------------------------
No ``__matmul__`` or ``__matpow__`` are defined for builtin numeric types (``float``, ``int``, etc.) or for the ``numbers.Number`` hierarchy, because these types represent scalars, and the consensus semantics for ``@`` are that it should raise an error on scalars.
We do not -- for now -- define a ``__matmul__`` method on the standard ``memoryview`` or ``array.array`` objects, for several reasons. First, there is currently no way to create multidimensional memoryview objects using only the stdlib, and array objects cannot represent multidimensional data at all, which makes ``__matmul__`` much less useful. Second, providing a quality implementation of matrix multiplication is highly non-trivial. Naive nested loop implementations are very slow and providing one in CPython would just create a trap for users. But the alternative -- providing a modern, competitive matrix multiply -- would require that CPython link to a BLAS library, which brings a set of new complications. In particular, several popular BLAS libraries (including the one that ships by default on OS X) currently break the use of ``multiprocessing`` [#blas-fork]_. And finally, we'd have to add quite a bit beyond ``__matmul__`` before ``memoryview`` or ``array.array`` would be useful for numeric work -- like elementwise versions of the other arithmetic operators, just to start. Put together, these considerations mean that the cost/benefit of adding ``__matmul__`` to these types just isn't there, so for now we'll continue to delegate these problems to numpy and friends, and defer a more systematic solution to a future proposal.
There are also non-numeric Python builtins which define ``__mul__`` (``str``, ``list``, ...). We do not define ``__matmul__`` for these types either, because why would we even do that.
Unresolved issues
-----------------
Associativity of ``@``
''''''''''''''''''''''
It's been suggested that ``@`` should be right-associative, on the grounds that for expressions like ``Mat @ Mat @ vec``, the two different evaluation orders produce the same result, but the right-associative order ``Mat @ (Mat @ vec)`` will be faster and use less memory than the left-associative order ``(Mat @ Mat) @ vec``. (Matrix-vector multiplication is much cheaper than matrix-matrix multiplication). It would be a shame if users found themselves required to use an overabundance of parentheses to achieve acceptable speed/memory usage in common situations, but, it's not currently clear whether such cases actually are common enough to override Python's general rule of left-associativity, or even whether they're more common than the symmetric cases where left-associativity would be faster (though this does seem intuitively plausible). The only way to answer this is probably to do an audit of some real-world uses and check how often the associativity matters in practice; if this PEP is accepted in principle, then we should probably do this check before finalizing it.
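To make the cost asymmetry concrete, count scalar multiplications under the schoolbook algorithm (a sketch; real BLAS constant factors differ, but the asymptotics do not):

```python
def matmat_cost(n, m, p):
    # Multiplying an (n x m) matrix by an (m x p) matrix takes n*m*p
    # scalar multiplications with the schoolbook algorithm.
    return n * m * p

n = 1000  # square n x n matrices, length-n vector

# Mat @ Mat @ vec under the two associativity choices:
left_assoc = matmat_cost(n, n, n) + matmat_cost(n, n, 1)   # (Mat @ Mat) @ vec
right_assoc = matmat_cost(n, n, 1) + matmat_cost(n, n, 1)  # Mat @ (Mat @ vec)
# n^3 + n^2 versus 2 * n^2: at this size, roughly a 500x difference.
```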
Rejected alternatives to adding a new operator
==============================================
Over the past few decades, the Python numeric community has explored a variety of ways to resolve the tension between matrix and elementwise multiplication operations. PEP 211 and PEP 225, both proposed in 2000 and last seriously discussed in 2008 [#threads-2008]_, were early attempts to add new operators to solve this problem, but suffered from serious flaws; in particular, at that time the Python numerical community had not yet reached consensus on the proper API for array objects, or on what operators might be needed or useful (e.g., PEP 225 proposes 6 new operators with unspecified semantics). Experience since then has now led to consensus that the best solution, for both numeric Python and core Python, is to add a single infix operator for matrix multiply (together with the other new operators this implies, like ``@=``).
We review some of the rejected alternatives here.
**Use a second type that defines __mul__ as matrix multiplication:** As discussed above (`Background: What's wrong with the status quo?`_), this has been tried for many years via the ``numpy.matrix`` type (and its predecessors in Numeric and numarray). The result is a strong consensus among both numpy developers and developers of downstream packages that ``numpy.matrix`` should essentially never be used, because of the problems caused by having conflicting duck types for arrays. (Of course one could then argue we should *only* define ``__mul__`` to be matrix multiplication, but then we'd have the same problem with elementwise multiplication.) There have been several pushes to remove ``numpy.matrix`` entirely; the only counter-arguments have come from educators who find that its problems are outweighed by the need to provide a simple and clear mapping between mathematical notation and code for novices (see `Transparent syntax is especially crucial for non-expert programmers`_). But, of course, starting out newbies with a dispreferred syntax and then expecting them to transition later causes its own problems. The two-type solution is worse than the disease.
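A toy sketch of why conflicting duck types bite: two "array-like" classes that give ``*`` different meanings, so generic code silently changes behavior depending on which one it receives. (Both classes and the helper function are invented for illustration; they stand in for ``numpy.ndarray`` and ``numpy.matrix``.)

```python
class ElementwiseArr:
    # Stand-in for ndarray-style semantics: * is elementwise.
    def __init__(self, rows):
        self.rows = rows
    def __mul__(self, other):
        return ElementwiseArr([[x * y for x, y in zip(r1, r2)]
                               for r1, r2 in zip(self.rows, other.rows)])

class MatrixArr:
    # Stand-in for numpy.matrix-style semantics: * is matrix multiply.
    def __init__(self, rows):
        self.rows = rows
    def __mul__(self, other):
        cols = range(len(other.rows[0]))
        return MatrixArr([[sum(r[k] * other.rows[k][j]
                               for k in range(len(other.rows)))
                           for j in cols] for r in self.rows])

def square(a):
    # "Generic" library code that works on anything supporting *:
    return a * a

m = [[1, 2], [3, 4]]
ew = square(ElementwiseArr(m)).rows  # elementwise: [[1, 4], [9, 16]]
mm = square(MatrixArr(m)).rows       # matrix product: [[7, 10], [15, 22]]
```

The same call returns mathematically different answers depending on which type wandered in, which is exactly the trap that bites library authors.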
**Add lots of new operators, or add a new generic syntax for defining infix operators:** In addition to being generally un-Pythonic and repeatedly rejected by BDFL fiat, this would be using a sledgehammer to smash a fly. The scientific python community has consensus that adding one operator for matrix multiplication is enough to fix the one otherwise unfixable pain point. (In retrospect, we all think PEP 225 was a bad idea too -- or at least far more complex than it needed to be.)
**Add a new @ (or whatever) operator that has some other meaning in general Python, and then overload it in numeric code:** This was the approach taken by PEP 211, which proposed defining ``@`` to be the equivalent of ``itertools.product``. The problem with this is that when taken on its own terms, adding an infix operator for ``itertools.product`` is just silly. (During discussions of this PEP, a similar suggestion was made to define ``@`` as a general purpose function composition operator, and this suffers from the same problem; ``functools.compose`` isn't even useful enough to exist.) Matrix multiplication has a uniquely strong rationale for inclusion as an infix operator. There almost certainly don't exist any other binary operations that will ever justify adding any other infix operators to Python.
**Add a .dot method to array types so as to allow "pseudo-infix" A.dot(B) syntax:** This has been in numpy for some years, and in many cases it's better than dot(A, B). But it's still much less readable than real infix notation, and in particular still suffers from an extreme overabundance of parentheses. See `Why should matrix multiplication be infix?`_ above.
**Use a 'with' block to toggle the meaning of * within a single code block**: E.g., numpy could define a special context object so that we'd have::
    c = a * b   # element-wise multiplication
    with numpy.mul_as_dot:
        c = a * b   # matrix multiplication
However, this has two serious problems: first, it requires that every array-like type's ``__mul__`` method know how to check some global state (``numpy.mul_is_currently_dot`` or whatever). This is fine if ``a`` and ``b`` are numpy objects, but the world contains many non-numpy array-like objects. So this either requires non-local coupling -- every numpy competitor library has to import numpy and then check ``numpy.mul_is_currently_dot`` on every operation -- or else it breaks duck-typing, with the above code doing radically different things depending on whether ``a`` and ``b`` are numpy objects or some other sort of object. Second, and worse, ``with`` blocks are dynamically scoped, not lexically scoped; i.e., any function that gets called inside the ``with`` block will suddenly find itself executing inside the mul_as_dot world, and crash and burn horribly -- if you're lucky. So this is a construct that could only be used safely in rather limited cases (no function calls), and which would make it very easy to shoot yourself in the foot without warning.
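A toy model of the (hypothetical) ``mul_as_dot`` design makes the dynamic-scoping problem concrete -- every name below is invented:

```python
import contextlib

MUL_IS_DOT = False  # the global flag every __mul__ would have to consult

@contextlib.contextmanager
def mul_as_dot():
    # Toy version of the proposed context object: flips a global.
    global MUL_IS_DOT
    MUL_IS_DOT = True
    try:
        yield
    finally:
        MUL_IS_DOT = False

def current_mul_meaning():
    return "matmul" if MUL_IS_DOT else "elementwise"

def third_party_helper():
    # Written elsewhere, with ordinary elementwise * in mind -- but it
    # inherits whatever mode its *caller* happens to be running in.
    return current_mul_meaning()

meaning_outside = third_party_helper()
with mul_as_dot():
    meaning_inside = third_party_helper()  # surprise: the mode leaked in
```

The helper is lexically outside the ``with`` block, yet dynamically inside it, so its arithmetic silently changes meaning.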
**Use a language preprocessor that adds extra numerically-oriented operators and perhaps other syntax:** (As per recent BDFL suggestion: [#preprocessor]_) This suggestion seems based on the idea that numerical code needs a wide variety of syntax additions. In fact, given ``@``, most numerical users don't need any other operators or syntax; it solves the one really painful problem that cannot be solved by other means, and that causes painful reverberations through the larger ecosystem. Defining a new language (presumably with its own parser which would have to be kept in sync with Python's, etc.), just to support a single binary operator, is neither practical nor desirable. In the numerical context, Python's competition is special-purpose numerical languages (Matlab, R, IDL, etc.). Compared to these, Python's killer feature is exactly that one can mix specialized numerical code with code for XML parsing, web page generation, database access, network programming, GUI libraries, and so forth, and we also gain major benefits from the huge variety of tutorials, reference material, introductory classes, etc., which use Python. Fragmenting "numerical Python" from "real Python" would be a major source of confusion. A major motivation for this PEP is to *reduce* fragmentation. Having to set up a preprocessor would be an especially prohibitive complication for unsophisticated users. And we use Python because we like Python! We don't want almost-but-not-quite-Python.
**Use overloading hacks to define a "new infix operator" like *dot*, as in a well-known Python recipe:** (See: [#infix-hack]_) Beautiful is better than ugly. This is... not beautiful. And not Pythonic. And especially unfriendly to beginners, who are just trying to wrap their heads around the idea that there's a coherent underlying system behind these magic incantations that they're learning, when along comes an evil hack like this that violates that system, creates bizarre error messages when accidentally misused, and whose underlying mechanisms can't be understood without deep knowledge of how object oriented systems work. We've considered promoting this as a general solution, and perhaps if the PEP is rejected we'll revisit this option, but so far the numeric community has mostly elected to leave this one on the shelf.
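For reference, the recipe in question works along these lines (sketched from memory, not a verbatim copy of the ActiveState code):

```python
class Infix:
    """Fake infix operator via | overloading: left |op| right."""
    def __init__(self, fn):
        self.fn = fn
    def __ror__(self, left):           # handles  left |op
        return Infix(lambda right: self.fn(left, right))
    def __or__(self, right):           # handles  ...| right
        return self.fn(right)

# An "infix" inner product, for lists standing in for vectors:
dot = Infix(lambda a, b: sum(x * y for x, y in zip(a, b)))

result = [1, 2, 3] |dot| [4, 5, 6]
```

Note how ``|dot|`` only works because ``|`` happens to be overloadable and lists don't define it; misuse (e.g. ``dot | [4, 5, 6]`` alone) produces exactly the kind of baffling ``TypeError`` the paragraph above complains about.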
References
==========
.. [#preprocessor] From a comment by GvR on a G+ post by GvR; the comment itself does not seem to be directly linkable: https://plus.google.com/115212051037621986145/posts/hZVVtJ9bK3u

.. [#infix-hack] http://code.activestate.com/recipes/384122-infix-operators/
   http://www.sagemath.org/doc/reference/misc/sage/misc/decorators.html#sage.mi...

.. [#scipy-conf] http://conference.scipy.org/past.html

.. [#pydata-conf] http://pydata.org/events/

.. [#lht] In this formula, :math:`\beta` is a vector or matrix of regression coefficients, :math:`V` is the estimated variance/covariance matrix for these coefficients, and we want to test the null hypothesis that :math:`H\beta = r`; a large :math:`S` then indicates that this hypothesis is unlikely to be true. For example, in an analysis of human height, the vector :math:`\beta` might contain one value which was the average height of the measured men, and another value which was the average height of the measured women, and then setting :math:`H = [1, -1], r = 0` would let us test whether men and women are the same height on average. Compare to eq. 2.139 in http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/tutorials/xegbohtmlnode...

   Example code is adapted from https://github.com/rerpy/rerpy/blob/0d274f85e14c3b1625acb22aed1efa85d122ecb7...
.. [#pycon-tutorials] Out of the 36 tutorials scheduled for PyCon 2014 (https://us.pycon.org/2014/schedule/tutorials/), we guess that the 8 below will almost certainly deal with matrices:
* Dynamics and control with Python
* Exploring machine learning with Scikit-learn
* How to formulate a (science) problem and analyze it using Python code
* Diving deeper into Machine Learning with Scikit-learn
* Data Wrangling for Kaggle Data Science Competitions – An etude
* Hands-on with Pydata: how to build a minimal recommendation engine.
* Python for Social Scientists
* Bayesian statistics made simple
In addition, the following tutorials could easily involve matrices:
* Introduction to game programming
* mrjob: Snakes on a Hadoop *("We'll introduce some data science concepts, such as user-user similarity, and show how to calculate these metrics...")*
* Mining Social Web APIs with IPython Notebook
* Beyond Defaults: Creating Polished Visualizations Using Matplotlib
This gives an estimated range of 8 to 12 / 36 = 22% to 33% of tutorials dealing with matrices; saying ~20% then gives us some wiggle room in case our estimates are high.
.. [#sloc-details] SLOCs were defined as physical lines which contain at least one token that is not a COMMENT, NEWLINE, ENCODING, INDENT, or DEDENT. Counts were made by using the ``tokenize`` module from Python 3.2.3 to examine the tokens in all files ending ``.py`` underneath some directory. Only tokens which occur at least once in the source trees are included in the table. The counting script will be available as an auxiliary file once this PEP is submitted; until then, it can be found here: https://gist.github.com/njsmith/9157645
Matrix multiply counts were estimated by counting how often certain tokens which are used as matrix multiply function names occurred in each package. In principle this could create false positives, but as far as I know the counts are exact; it's unlikely that anyone is using ``dot`` as a variable name when it's also the name of one of the most widely-used numpy functions.
All counts were made using the latest development version of each project as of 21 Feb 2014.
'stdlib' is the contents of the Lib/ directory in commit d6aa3fa646e2 to the cpython hg repository, and treats the following tokens as indicating matrix multiply: n/a.
'scikit-learn' is the contents of the sklearn/ directory in commit 69b71623273ccfc1181ea83d8fb9e05ae96f57c7 to the scikit-learn repository (https://github.com/scikit-learn/scikit-learn), and treats the following tokens as indicating matrix multiply: ``dot``, ``fast_dot``, ``safe_sparse_dot``.
'nipy' is the contents of the nipy/ directory in commit 5419911e99546401b5a13bd8ccc3ad97f0d31037 to the nipy repository (https://github.com/nipy/nipy/), and treats the following tokens as indicating matrix multiply: ``dot``.
.. [#blas-fork] BLAS libraries have a habit of secretly spawning threads, even when used from single-threaded programs. And threads play very poorly with ``fork()``; the usual symptom is that attempting to perform linear algebra in a child process causes an immediate deadlock.
.. [#threads-2008] http://fperez.org/py4science/numpy-pep225/numpy-pep225.html
.. [#broadcasting] http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
.. [#matmul-other-langs] http://mail.scipy.org/pipermail/scipy-user/2014-February/035499.html
.. [#github-details] Counts were produced by manually entering the string ``"import foo"`` or ``"from foo import"`` (with quotes) into the Github code search page, e.g.: https://github.com/search?q=%22import+numpy%22&ref=simplesearch&type... on 2014-04-10 at ~21:00 UTC. The reported values are the numbers given in the "Languages" box on the lower-left corner, next to "Python". This also causes some undercounting (e.g., leaving out Cython code, and possibly one should also count HTML docs and so forth), but these effects are negligible (e.g., only ~1% of numpy usage appears to occur in Cython code, and probably even less for the other modules listed). The use of this box is crucial, however, because these counts appear to be stable, while the "overall" counts listed at the top of the page ("We've found ___ code results") are highly variable even for a single search -- simply reloading the page can cause this number to vary by a factor of 2 (!!). (They do seem to settle down if one reloads the page repeatedly, but nonetheless this is spooky enough that it seemed better to avoid these numbers.)
These numbers should of course be taken with multiple grains of salt; it's not clear how representative Github is of Python code in general, and limitations of the search tool make it impossible to get precise counts. AFAIK this is the best data set currently available, but it'd be nice if it were better. In particular:
* Lines like ``import sys, os`` will only be counted in the ``sys`` row.
* A file containing both ``import X`` and ``from X import`` will be counted twice.
* Imports of the form ``from X.foo import ...`` are missed. We could catch these by instead searching for "from X", but this is a common phrase in English prose, so we'd end up with false positives from comments, strings, etc. For many of the modules considered this shouldn't matter too much -- for example, the stdlib modules have flat namespaces -- but it might especially lead to undercounting of django, scipy, and twisted.
Also, it's possible there exist other non-stdlib modules we didn't think to test that are even more-imported than numpy -- though we tried quite a few of the obvious suspects. If you find one, let us know! The modules tested here were chosen based on a combination of intuition and the top-100 list at pypi-ranking.info.
Fortunately, it doesn't really matter if it turns out that numpy is, say, merely the *third* most-imported non-stdlib module, since the point is just that numeric programming is a common and mainstream activity.
Finally, we should point out the obvious: whether a package is import\ **ed** is rather different from whether it's import\ **ant**. No-one's claiming numpy is "the most important package" or anything like that. Certainly more packages depend on distutils, e.g., than depend on numpy -- and far fewer source files import distutils than import numpy. But this is fine for our present purposes. Most source files don't import distutils because most source files don't care how they're distributed, so long as they are; these source files thus don't care about details of how distutils' API works. This PEP is in some sense about changing how numpy's and related packages' APIs work, so the relevant metric is to look at source files that are choosing to directly interact with that API, which is sort of like what we get by looking at import statements.
.. [#hugunin] The first such proposal occurs in Jim Hugunin's very first email to the matrix SIG in 1995, which lays out the first draft of what became Numeric. He suggests using ``*`` for elementwise multiplication, and ``%`` for matrix multiplication: https://mail.python.org/pipermail/matrix-sig/1995-August/000002.html
Copyright
=========
This document has been placed in the public domain.
--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
On Fri, Mar 14, 2014 at 1:59 AM, Nathaniel Smith njs@pobox.com wrote:
[...]
Hmm, not sure how that "Fwd:" snuck onto the subject line. (Or really, I do know, but am embarrassed to say.) Oh well, sorry, hope it isn't too distracting!
-n
This (or various related ideas) actually has come up somewhat regularly. I think the reason it hasn't gone anywhere is that there was never enough input from the numerical community; someone says, "I'll bet this would be useful for numerics," and someone else says "Then why haven't the numarray/numeric/numpy guys asked for it since 2000?", and nobody has an answer to that. This obviously is the answer to that.
One thing that always comes up is a suggestion for using Unicode. There are obvious downsides--the Unicode multiplication character isn't easy to type; even if Python and major code editors are fully Unicode friendly, code often has to go through channels that may not be; etc. But it's worth asking whether the numeric community has considered this and rejected it for these reasons, or for other reasons, or if they'd be happy with it but just didn't think Python was ready for it, or whatever.
Also, how do other general purpose programming languages solve this? Surely people do matrix math in Haskell and C++, at least? Do they just use separate types for matrices and arrays because they don't have to worry about duck typing (being statically-typed languages)? Do they just avoid mixing libraries together to avoid the problem? Or have people attempted to reuse % or other operators? Or (doesn't apply to C++, but does to Haskell) do they use spelled-out infix functions like `mmul` instead of trying to come up with symbolic operators? A lot of those answers wouldn't tell us anything more than "their experience isn't relevant to Python", but it would be nice to know that at least.
On Fri, Mar 14, 2014 at 12:03 AM, Andrew Barnert abarnert@yahoo.com wrote:
One thing that always comes up is a suggestion for using Unicode. There are obvious downsides--the Unicode multiplication character isn't easy to type; even if Python and major code editors are fully Unicode friendly, code often has to go through channels that may not be; etc. But it's worth asking whether the numeric community has considered this and rejected it for these reasons, or for other reasons, or if they'd be happy with it but just didn't think Python was ready for it, or whatever.
It did. Some 60 years ago [1]. Mostly rejected by now [2].
[1] http://en.wikipedia.org/wiki/APL_(programming_language)
[2] http://www.jsoftware.com/start.htm
On Fri, Mar 14, 2014 at 4:03 AM, Andrew Barnert abarnert@yahoo.com wrote:
This (or various related ideas) actually has come up somewhat regularly. I think the reason it hasn't gone anywhere is that there was never enough input from the numerical community; someone says, "I'll bet this would be useful for numerics," and someone else says "Then why haven't the numarray/numeric/numpy guys asked for it since 2000?", and nobody has an answer to that. This obviously is the answer to that.
The numeric community has many talents, but crossing the cultural divide with upstream is not really a speciality...
One thing that always comes up is a suggestion for using Unicode. There are obvious downsides--the Unicode multiplication character isn't easy to type; even if Python and major code editors are fully Unicode friendly, code often has to go through channels that may not be; etc. But it's worth asking whether the numeric community has considered this and rejected it for these reasons, or for other reasons, or if they'd be happy with it but just didn't think Python was ready for it, or whatever.
We don't really have a strong opinion on which character is used. It is nice if it's easy to type, though -- esp. because in many ways it's already beginners who suffer the brunt of the elmul/matmul issues, and beginners are exactly the people for whom figuring out how to type some stupid character is a prohibitive speed bump. (Scientific python has a very large ongoing stream of newbie programmers.) And I don't know that any of the Unicode characters are actually better. In real math the two operations are distinguished by context, so there's no existing character with the right meaning that we can steal. In some ways having an unusual character for this is better, because it indicates a special-case operation. If the two multiplication characters are * and ×, then how do you remember which is which? But if they're * and @, then it's easy to remember that * is the general-use one, and @ is the special non-commutative matrix multiplication one.
It's not like new operators are being added to Python every week and we need to start scraping the bottom of the barrel for new characters. ASCII gets the job done, has some minor upsides, and no real downsides.
Also, how do other general purpose programming languages solve this? Surely people do matrix math in Haskell and C++, at least? Do they just use separate types for matrices and arrays because they don't have to worry about duck typing (being statically-typed languages)? Do they just avoid mixing libraries together to avoid the problem? Or have people attempted to reuse % or other operators? Or (doesn't apply to C++, but does to Haskell) do they use spelled-out infix functions like `mmul` instead of trying to come up with symbolic operators? A lot of those answers wouldn't tell us anything more than "their experience isn't relevant to Python", but it would be nice to know that at least.
I'm not an expert on Haskell and C++ matrix libraries -- perhaps someone else will speak up -- but Haskell's extensible infix system and static typing do seem like they put them in a pretty different design space. And I'm skeptical that there exist Haskell libraries with anything like numpy's maturity, though I'll be delighted to be proven wrong.
Eigen is maybe the most popular C++ matrix/array library right now, and AFAICT from their docs they use a system where there's one "array" type that *only* supports elementwise multiplication, and one "matrix" type that *only* supports matrix multiplication, and if you want both operations you have to cast back and forth. This seems a bit annoying to me, but not as deadly as it is in Python -- in C++ the static type system means they can at least fob off the work of keeping track of which variables are which type, and doing conversions at function boundaries, onto the compiler. In Python these conversions have to happen by hand, and the two-type solution is just not workable.
In general, I doubt there's huge amounts of experience to steal from other languages, because Python is in a unique place: AFAICT it's the only general purpose language that has ever made serious inroads against the specialized numeric languages (Matlab, R, GAUSS, IDL, etc.) on their home territory. So we're breaking new ground here. (All those languages do have separate infix operators for elementwise and matrix multiplication; they all involve horrible names like ".*" or "%*%" and all kinds of weird inconsistencies.)
-n
On Mar 13, 2014, at 19:08, Nathaniel Smith njs@pobox.com wrote:
On Fri, Mar 14, 2014 at 1:59 AM, Nathaniel Smith njs@pobox.com wrote:
[...]
Hmm, not sure how that "Fwd:" snuck onto the subject line. (Or really, I do know, but am embarrassed to say.) Oh well, sorry, hope it isn't too distracting!
-n
-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
From: Nathaniel Smith njs@pobox.com Sent: Thursday, March 13, 2014 10:05 PM
On Fri, Mar 14, 2014 at 4:03 AM, Andrew Barnert abarnert@yahoo.com wrote:
[snip, here and elsewhere]
The numeric community has many talents, but crossing the cultural divide with upstream is not really a speciality...
Well, you've fixed that here.
One thing that always comes up is a suggestion for using Unicode.
We don't really have a strong opinion on which character is used. It is nice if it's easy to type, though -- esp. because in many ways it's already beginners who suffer the brunt of the elmul/matmul issues, and beginners are exactly the people for whom figuring out how to type some stupid character is a prohibitive speed bump. (Scientific python has a very large ongoing stream of newbie programmers.) And I don't know that any of the Unicode characters are actually better. In real math the two operations are distinguished by context, so there's no existing character with the right meaning that we can steal. In some ways having an unusual character for this is better, because it indicates a special-case operation. If the two multiplication characters are * and ×, then how do you remember which is which? But if they're * and @, then it's easy to remember that * is the general-use one, and @ is the special non-commutative matrix multiplication one.
Honestly, that sounds like a strong negative opinion, with good motivations, not no opinion.
The argument for Unicode is generally "readability is more important than writability," all things being equal. The traditional counter to that is that all things aren't equal, as it's still way too hard for novices to write Unicode characters. But adding the additional rationale that there is no obvious Unicode character for this operator certainly strengthens the case for @.
It's not like new operators are being added to Python every week and we need to start scraping the bottom of the barrel for new characters. ASCII gets the job done, has some minor upsides, and no real downsides.
Well, we have just about two ASCII characters left, @ and ?, so we really are scraping the bottom of the barrel. But you made a good argument why this is worthy of one of those two characters.
Also, how do other general purpose programming languages solve this? Surely people do matrix math in Haskell and C++, at least? Do they just use separate types for matrices and arrays because they don't have to worry about duck typing (being statically-typed languages)? Do they just avoid mixing libraries together to avoid the problem? Or have people attempted to reuse % or other operators? Or (doesn't apply to C++, but does to Haskell) do they use spelled-out infix functions like `mmul` instead of trying to come up with symbolic operators? A lot of those answers wouldn't tell us anything more than "their experience isn't relevant to Python", but it would be nice to know that at least.
I'm not an expert on Haskell and C++ matrix libraries -- perhaps someone else will speak up -- but Haskell's extensible infix system and static typing do seem like they put them in a pretty different design space. And I'm skeptical that there exist Haskell libraries with anything like numpy's maturity, though I'll be delighted to be proven wrong.
From what I can tell, it's more common to implement matrix libraries in Haskell than to use them. Probably because Haskell is more of a research and/or teaching language than a language you'd expect, say, physicists to use. But I was hoping someone would actually know that, rather than just guess…
Eigen is maybe the most popular C++ matrix/array library right now, and AFAICT from their docs they use a system where there's one "array" type that *only* supports elementwise multiplication, and one "matrix" type that *only* supports matrix multiplication, and if you want both operations you have to cast back and forth. This seems a bit annoying to me, but not as deadly as it is in Python -- in C++ the static type system means they can at least fob off the work of keeping track of which variables are which type, and doing conversions at function boundaries, onto the compiler. In Python these conversions have to happen by hand, and the two-type solution is just not workable.
Yes, that's why I suspected that C++ experience would not be applicable to Python. Assigning a matrix value to an array variable, or passing it to an array function, can be an implicit (but still obvious) conversion. And the way they'd solve the same problems Python solves with duck typing would be with generic functions, which is pretty different from duck-typed functions. And I'd expect that most other static languages are similar. But it might be worth adding to the PEP if you actually know the answer rather than just suspecting.
In general, I doubt there's huge amounts of experience to steal from other languages, because Python is in a unique place: AFAICT it's the only general purpose language that has ever made serious inroads against the specialized numeric languages (Matlab, R, GAUSS, IDL, etc.) on their home territory. So we're breaking new ground here. (All those languages do have separate infix operators for elementwise and matrix multiplication; they all involve horrible names like ".*" or "%*%" and all kinds of weird inconsistencies.)
You could be right. Most of the core low-level libraries are in C, and C++ programmers have a habit of just using C libraries with their C APIs rather than wrapping them in generic/OO C++ APIs… (Honestly, I think automatic C compatibility is one of the biggest problems with C++, not its biggest strength. In Python, it's easy to wrap a C library, and you have to do it, so you do it and end up with a great API; in C++, it's easy to wrap a C library, but you don't have to, so you end up using it C-style and losing all the benefits of C++.)
On 3/13/2014 9:59 PM, Nathaniel Smith wrote:
I'd have rejected these too! So I thought, maybe we should try the radical tactic of writing down what we actually want, carefully explaining why we want it, and then asking for it.
Great idea ;-). Until someone comes up with reasons to reject this, I am +1.
On Fri, Mar 14, 2014 at 3:59 AM, Nathaniel Smith njs@pobox.com wrote:
PEP: XXXX Title: Dedicated infix operators for matrix multiplication and matrix power
A wholehearted +1!
- Tal Einat
On 14.03.2014 02:59, Nathaniel Smith wrote:
PEP: XXXX Title: Dedicated infix operators for matrix multiplication and matrix power
...
Specification
Two new binary operators are added to the Python language, together with corresponding in-place versions:
=======  =========================  ===============================
 Op      Precedence/associativity   Methods
=======  =========================  ===============================
``@``    Same as ``*``              ``__matmul__``, ``__rmatmul__``
``@@``   Same as ``**``             ``__matpow__``, ``__rmatpow__``
``@=``   n/a                        ``__imatmul__``
``@@=``  n/a                        ``__imatpow__``
=======  =========================  ===============================
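For concreteness, here is a sketch of how the proposed ``__matmul__`` hook would slot into the existing binary-operator protocol (a hypothetical toy ``Mat`` class; on an interpreter implementing the proposal, ``a @ b`` would try ``a.__matmul__(b)`` and then ``b.__rmatmul__(a)``, just as ``*`` does with ``__mul__``/``__rmul__``):

```python
class Mat:
    """Toy 2x2 matrix illustrating the proposed __matmul__ hook."""
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        if not isinstance(other, Mat):
            return NotImplemented  # lets other.__rmatmul__ take over
        a, b = self.rows, other.rows
        return Mat([[sum(a[i][k] * b[k][j] for k in range(2))
                     for j in range(2)] for i in range(2)])

A = Mat([[1, 2], [3, 4]])
B = Mat([[11, 12], [13, 14]])
# Explicit method call works on any Python; the infix spelling `A @ B`
# needs an interpreter with the proposed grammar change.
C = A.__matmul__(B)
print(C.rows)  # [[37, 40], [85, 92]]
```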
...
When working with n-dimensional arrays, there are two different ways we might want to define multiplication. One is elementwise multiplication::
    [[1, 2],     [[11, 12],     [[1 * 11, 2 * 12],
     [3, 4]]  x   [13, 14]]  =   [3 * 13, 4 * 14]]
and the other is `matrix multiplication`_:
.. _matrix multiplication: https://en.wikipedia.org/wiki/Matrix_multiplication
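Both products from the quoted example can be checked in plain Python (nested lists standing in for arrays):

```python
A = [[1, 2], [3, 4]]
B = [[11, 12], [13, 14]]
n = len(A)

# Elementwise (Hadamard) product: multiply matching entries.
elmul = [[A[i][j] * B[i][j] for j in range(n)] for i in range(n)]

# Matrix product: row-by-column dot products.
matmul = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]

print(elmul)   # [[11, 24], [39, 56]]
print(matmul)  # [[37, 40], [85, 92]]
```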
I have some questions:
1. Since in math, the operator is usually spelt "·" (the center dot, or "." but that's already reserved for methods and attributes in Python), why not try to use that instead of "@" (which in Python already identifies decorators) ?
2. The PEP should include a section on how other programming languages solve this, i.e. what syntax they use for matrix multiplications.
3. Since matrix multiplication is only one type of product you find in math, albeit a very frequently used one, how would those other products fit into the picture ? Would we then have to use methods again ? E.g. think of cross product, inner product, outer/tensor product.
4. Another very common operation needed in vector/matrix calculation is transposition. This is usually written as superscript "T" or "t" ("ᵀ" in Unicode). Wouldn't this operator be needed as well, to make the picture complete ? OTOH, we currently don't have postfix operators in Python, so I guess writing this as A.transpose() comes close enough ;-)
Now since this is all about syntactic sugar, we also need to look at some code examples:
I == A @@ -1 @ A
vs. I == A ·· -1 · A
vs. I == A.inverse().dot(A)

(A @ B).transpose() == A.transpose() @ B.transpose()
vs. (A · B).transpose() == A.transpose() · B.transpose()
vs. A.dot(B).transpose() == A.transpose().dot(B.transpose())

c = A @ v
vs. c = A · v
vs. c = A.dot(v)
Hmm, even though I'd love to see matrix operators in Python, I don't think they really add clarity to the syntax of matrix calculations - a bit disappointing, I must say :-(
On Fri, 14 Mar 2014 11:16:34 +0100 "M.-A. Lemburg" mal@egenix.com wrote:
- Another very common operation needed in vector/matrix calculation is transposition. This is usually written as superscript "T" or "t" ("ᵀ" in Unicode). Wouldn't this operator be needed as well, to make the picture complete ? OTOH, we currently don't have postfix operators in Python, so I guess writing this as A.transpose() comes close enough ;-)
Or simply implement __invert__, so you can write it "~ A". (__invert__ doesn't really invert a number, it takes the bitwise complement :-))
Hmm, even though I'd love to see matrix operators in Python, I don't think they really add clarity to the syntax of matrix calculations - a bit disappointing, I must say :-(
I think they do when you have several of these operations combined in a single formula, with parentheses and the like.
Regards
Antoine.
Antoine Pitrou wrote:
Or simply implement __invert__, so you can write it "~ A". (__invert__ doesn't really invert a number, it takes the bitwise complement :-))
But that's already taken to mean elementwise bitwise complement.
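For concreteness, a toy version of the ``__invert__`` suggestion (a hypothetical ``Mat`` wrapper, with the caveat just raised: for existing array types ``~`` already means elementwise bitwise complement, so this only works for a dedicated matrix class):

```python
class Mat:
    """Toy matrix whose ~ operator means transpose (illustration only)."""
    def __init__(self, rows):
        self.rows = rows

    def __invert__(self):
        # Transpose by regrouping columns into rows.
        return Mat([list(col) for col in zip(*self.rows)])

A = Mat([[1, 2, 3], [4, 5, 6]])
print((~A).rows)  # [[1, 4], [2, 5], [3, 6]]
```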
On 2014-03-14 10:16, M.-A. Lemburg wrote:
I have some questions:
- Since in math, the operator is usually spelt "·" (the center dot, or "." but that's already reserved for methods and attributes in Python), why not try to use that instead of "@" (which in Python already identifies decorators) ?
I think the current feeling of the Python core team is against including non-ASCII characters in the language's keywords or operators. Even if that were not so, I would still recommend against it because it would be quite difficult to type. I don't know off-hand the key combination to do it on my native system, and it would change from system to system.
The PEP should include a section on how other programming languages solve this, i.e. what syntax they use for matrix multiplications.
Since matrix multiplication is only one type of product you find in math, albeit a very frequently used one, how would those other products fit into the picture ? Would we then have to use methods again ? E.g. think of cross product, inner product, outer/tensor product.
Our experience is that these have come up much less regularly than matrix multiplication. The two products in common use in our code are the Hadamard product (elementwise multiplication, currently assigned to * in numpy) and matrix multiplication (currently done with the function numpy.dot()).
- Another very common operation needed in vector/matrix calculation is transposition. This is usually written as superscript "T" or "t" ("ᵀ" in Unicode). Wouldn't this operator be needed as well, to make the picture complete ? OTOH, we currently don't have postfix operators in Python, so I guess writing this as A.transpose() comes close enough ;-)
Indeed. Numpy already uses a .T property for this.
Now since this is all about syntactic sugar, we also need to look at some code examples:
I == A @@ -1 @ A
vs. I == A ·· -1 · A
vs. I == A.inverse().dot(A)

(A @ B).transpose() == A.transpose() @ B.transpose()
vs. (A · B).transpose() == A.transpose() · B.transpose()
vs. A.dot(B).transpose() == A.transpose().dot(B.transpose())

(A @ B).T == B.T @ A.T
(A · B).T == B.T · A.T
A.dot(B).T == B.T.dot(A.T)
(FWIW, I didn't notice the math error until I wrote out the @ version.)
c = A @ v vs. c = A · v vs. c = A.dot(v)
Hmm, even though I'd love to see matrix operators in Python, I don't think they really add clarity to the syntax of matrix calculations - a bit disappointing, I must say :-(
Some more from real code:
RSR = R.dot(var_beta.dot(R.T))
RSR = R @ var_beta @ R.T

xx_inv.dot(xeps.dot(xx_inv))
xx_inv @ xeps @ xx_inv

dF2lower_dper.dot(F2lower.T) + F2lower.dot(dF2lower_dper.T) - 4/period*F2lower.dot(F2lower.T)
dF2lower_dper @ F2lower.T + F2lower @ dF2lower_dper.T - 4/period*(F2lower @ F2lower.T)

dFX_dper.dot(Gi.dot(FX2.T)) - FX.dot(Gi.dot(dG_dper.dot(Gi.dot(FX2.T)))) + FX.dot(Gi.dot(dFX2_dper.T))
(dFX_dper @ Gi @ FX2.T) - (FX @ Gi @ dG_dper @ Gi @ FX2.T) + (FX @ Gi @ dFX2_dper.T)

torient_inv.dot(tdof).dot(torient).dot(self.vertices[parent].meta['key'])
(((torient_inv @ tdof) @ torient) @ self.vertices[parent].meta['key'])
On 14.03.2014 12:25, Robert Kern wrote:
On 2014-03-14 10:16, M.-A. Lemburg wrote:
- Another very common operation needed in vector/matrix calculation is transposition. This is usually written as superscript "T" or "t" ("ᵀ" in Unicode). Wouldn't this operator be needed as well, to make the picture complete ? OTOH, we currently don't have postfix operators in Python, so I guess writing this as A.transpose() comes close enough ;-)
Indeed. Numpy already uses a .T property for this.
Ah, good trick :-)
Now since this is all about syntactic sugar, we also need to look at some code examples:
I == A @@ -1 @ A
vs. I == A ·· -1 · A
vs. I == A.inverse().dot(A)

(A @ B).transpose() == B.transpose() @ A.transpose()
vs. (A · B).transpose() == B.transpose() · A.transpose()
vs. A.dot(B).transpose() == B.transpose().dot(A.transpose())

(A @ B).T == B.T @ A.T
(A · B).T == B.T · A.T
A.dot(B).T == B.T.dot(A.T)
(FWIW, I didn't notice the math error until I wrote out the @ version.)
Thanks; I should have proofread the email before hitting the send button.
I've corrected the quoted version above to have the comparisons return True for all A and B instead of just for a select few :-)
c = A @ v vs. c = A · v vs. c = A.dot(v)
Hmm, even though I'd love to see matrix operators in Python, I don't think they really add clarity to the syntax of matrix calculations - a bit disappointing, I must say :-(
Some more from real code:
RSR = R.dot(var_beta.dot(R.T))
RSR = R @ var_beta @ R.T

xx_inv.dot(xeps.dot(xx_inv))
xx_inv @ xeps @ xx_inv

dF2lower_dper.dot(F2lower.T) + F2lower.dot(dF2lower_dper.T) - 4/period*F2lower.dot(F2lower.T)
dF2lower_dper @ F2lower.T + F2lower @ dF2lower_dper.T - 4/period*(F2lower @ F2lower.T)

dFX_dper.dot(Gi.dot(FX2.T)) - FX.dot(Gi.dot(dG_dper.dot(Gi.dot(FX2.T)))) + FX.dot(Gi.dot(dFX2_dper.T))
(dFX_dper @ Gi @ FX2.T) - (FX @ Gi @ dG_dper @ Gi @ FX2.T) + (FX @ Gi @ dFX2_dper.T)

torient_inv.dot(tdof).dot(torient).dot(self.vertices[parent].meta['key'])
(((torient_inv @ tdof) @ torient) @ self.vertices[parent].meta['key'])
This doesn't look very readable to me - the operator saves you a few parens in some situations, but as in the last example, it can also require adding new ones.
On Fri, 14 Mar 2014 12:48:13 +0100 "M.-A. Lemburg" mal@egenix.com wrote:
Some more from real code:
RSR = R.dot(var_beta.dot(R.T))
RSR = R @ var_beta @ R.T

xx_inv.dot(xeps.dot(xx_inv))
xx_inv @ xeps @ xx_inv

dF2lower_dper.dot(F2lower.T) + F2lower.dot(dF2lower_dper.T) - 4/period*F2lower.dot(F2lower.T)
dF2lower_dper @ F2lower.T + F2lower @ dF2lower_dper.T - 4/period*(F2lower @ F2lower.T)

dFX_dper.dot(Gi.dot(FX2.T)) - FX.dot(Gi.dot(dG_dper.dot(Gi.dot(FX2.T)))) + FX.dot(Gi.dot(dFX2_dper.T))
(dFX_dper @ Gi @ FX2.T) - (FX @ Gi @ dG_dper @ Gi @ FX2.T) + (FX @ Gi @ dFX2_dper.T)

torient_inv.dot(tdof).dot(torient).dot(self.vertices[parent].meta['key'])
(((torient_inv @ tdof) @ torient) @ self.vertices[parent].meta['key'])
This doesn't look very readable to me - the operator saves you a few parens in some situations, but as in the last example, it can also require adding new ones.
The parentheses mirror those necessary in the equivalent mathematical formula, though, so they are "natural" in a sense.
I do find the "@" examples much more readable myself - except that I don't understand what they are about, of course :-)
Regards
Antoine.
On 03/14/2014 04:48 AM, M.-A. Lemburg wrote:
On 14.03.2014 12:25, Robert Kern wrote:
RSR = R.dot(var_beta.dot(R.T))
RSR = R @ var_beta @ R.T

xx_inv.dot(xeps.dot(xx_inv))
xx_inv @ xeps @ xx_inv

dF2lower_dper.dot(F2lower.T) + F2lower.dot(dF2lower_dper.T) - 4/period*F2lower.dot(F2lower.T)
dF2lower_dper @ F2lower.T + F2lower @ dF2lower_dper.T - 4/period*(F2lower @ F2lower.T)

dFX_dper.dot(Gi.dot(FX2.T)) - FX.dot(Gi.dot(dG_dper.dot(Gi.dot(FX2.T)))) + FX.dot(Gi.dot(dFX2_dper.T))
(dFX_dper @ Gi @ FX2.T) - (FX @ Gi @ dG_dper @ Gi @ FX2.T) + (FX @ Gi @ dFX2_dper.T)

torient_inv.dot(tdof).dot(torient).dot(self.vertices[parent].meta['key'])
(((torient_inv @ tdof) @ torient) @ self.vertices[parent].meta['key'])
This doesn't look very readable to me - the operator saves you a few parens in some situations, but as in the last example, it can also require adding new ones.
The difference being that grouping parens are easier to read and are less visual clutter than calling parens.
-- ~Ethan~
On 14 March 2014 14:13, Ethan Furman ethan@stoneleaf.us wrote:
torient_inv.dot(tdof).dot(torient).dot(self.vertices[parent].meta['key'])
(((torient_inv @ tdof) @ torient) @ self.vertices[parent].meta['key'])
This doesn't look very readable to me - the operator saves you a few parens in some situations, but as in the last example, it can also require adding new ones.
The difference being that grouping parens are easier to read and are less visual clutter than calling parens.
Personally, my biggest problem with all of these is that the @ sign is a bit too big and bulky, so it's visually jarring.
But:
1. As the PEP mentions, there aren't many options available.
2. I trust the scientific community to choose something that they are comfortable with (I'm only an interested outsider).
3. Ultimately, this is just bikeshedding.
One genuine question though - when the PEP was developed, were multi-character operators like .* or <*> considered? A "rejected alternative operator symbols" would be a useful addition to the PEP (although it'd rob all us non-experts of the opportunity to bikeshed :-))
On a related note, the @@ operator is visually dreadful (far too heavy). While I see the */** analogy, and I appreciate that there's few good options, I'd definitely like to see some evidence that it's "the best of a bad lot" in the PEP.
Paul.
Disclaimer: I am not a numpy (or similar) user and have only skimmed this thread. If I've simply missed something big, I can be safely ignored :)
On Fri, Mar 14, 2014 at 9:53 AM, Paul Moore p.f.moore@gmail.com wrote:
Personally, my biggest problem with all of these is that the @ sign is a bit too big and bulky, so it's visually jarring.
But:
- As the PEP mentions, there aren't many options available.
- I trust the scientific community to choose something that they are comfortable with (I'm only an interested outsider).
- Ultimately, this is just bikeshedding.
One genuine question though - when the PEP was developed, were multi-character operators like .* or <*> considered? A "rejected alternative operator symbols" would be a useful addition to the PEP (although it'd rob all us non-experts of the opportunity to bikeshed :-))
On a related note, the @@ operator is visually dreadful (far too heavy). While I see the */** analogy, and I appreciate that there's few good options, I'd definitely like to see some evidence that it's "the best of a bad lot" in the PEP.
I agree with Paul, @/@@ just look scary as operators. Here's a few multi-character options I've come up with that I would like to see shot down in flames before @/@@ are added to Python:
>< (for multiplication, not sure about exponentiation. I only like it because it's the shortest thing I've come up with that looks somewhat like multiplication)
[*] / [**] ([] make me think 'matrix', but this might be confusing to the parser)
|*| / |**| (pretty close to [], shouldn't confuse the parser)
The downside is that the inplace version of matrix exponentiation would be a 5 character operator ([**]=), which I will freely admit is not appealing, but it *looks* quite a lot nicer to me than "@@=" does.
On 14 March 2014 15:46, Zachary Ware zachary.ware+pyideas@gmail.com wrote:
I agree with Paul, @/@@ just look scary as operators. Here's a few multi-character options I've come up with that I would like to see shot down in flames before @/@@ are added to Python:
Just as a contrasting point, I've been reading this thread on gmail with a proportional font. I went and looked at the PEP in a fixed width font earlier, and the @ sign doesn't look anywhere near as bad there.
Paul
On Fri, Mar 14, 2014 at 12:16 PM, Paul Moore p.f.moore@gmail.com wrote:
On 14 March 2014 15:46, Zachary Ware zachary.ware+pyideas@gmail.com wrote:
I agree with Paul, @/@@ just look scary as operators. Here's a few multi-character options I've come up with that I would like to see shot down in flames before @/@@ are added to Python:
Just as a contrasting point, I've been reading this thread on gmail with a proportional font. I went and looked at the PEP in a fixed width font earlier, and the @ sign doesn't look anywhere near as bad there.
In Courier New:
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
Still looks kind of bulky to me, because @ is the height and width of a capital letter. How about prefixing * with an innocuous backtick?
S = (H `* beta - r).T `* inv(H `* V `* H.T) `* (H `* beta - r)
That way no part of the operator extends to the baseline, so identifiers and parentheses/brackets are visually well-separated from this as they are with most other binary operators.
Nathan
Paul
On 2014-03-14 16:39, Nathan Schneider wrote:
In Courier New:
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
Still looks kind of bulky to me, because @ is the height and width of a capital letter. How about prefixing * with an innocuous backtick?
S = (H `* beta - r).T `* inv(H `* V `* H.T) `* (H `* beta - r)
That way no part of the operator extends to the baseline, so identifiers and parentheses/brackets are visually well-separated from this as they are with most other binary operators.
Fails the grit-on-Tim's-monitor test, or at least the grit-on-Robert's-monitor test.
On Fri, 14 Mar 2014 16:46:27 +0000 Robert Kern robert.kern@gmail.com wrote:
On 2014-03-14 16:39, Nathan Schneider wrote:
In Courier New:
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
Still looks kind of bulky to me, because @ is the height and width of a capital letter. How about prefixing * with an innocuous backtick?
S = (H `* beta - r).T `* inv(H `* V `* H.T) `* (H `* beta - r)
That way no part of the operator extends to the baseline, so identifiers and parentheses/brackets are visually well-separated from this as they are with most other binary operators.
Fails the grit-on-Tim's-monitor test, or at least the grit-on-Robert's-monitor test.
Not only grit, but the problem with the backtick is that it can look very close to a straight apostrophe.
I am personally not fond of @, but I find it ok in that it is distinctive enough without being terribly ugly.
Regards
Antoine.
On Fri, Mar 14, 2014 at 5:08 PM, Antoine Pitrou solipsis@pitrou.net wrote:
On Fri, 14 Mar 2014 16:46:27 +0000 Robert Kern robert.kern@gmail.com wrote:
On 2014-03-14 16:39, Nathan Schneider wrote:
In Courier New:
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
Still looks kind of bulky to me, because @ is the height and width of a capital letter. How about prefixing * with an innocuous backtick?
S = (H `* beta - r).T `* inv(H `* V `* H.T) `* (H `* beta - r)
That way no part of the operator extends to the baseline, so identifiers and parentheses/brackets are visually well-separated from this as they are with most other binary operators.
Fails the grit-on-Tim's-monitor test, or at least the grit-on-Robert's-monitor test.
Not only grit, but the problem with the backtick is that it can look very close to a straight apostrophe.
Backtick has in fact been formally banned from py3 by BDFL pronouncement (for this reason): https://mail.python.org/pipermail/python-ideas/2007-January/000054.html http://legacy.python.org/dev/peps/pep-3099/
Antoine Pitrou writes:
I am personally not fond of @, but I find it ok in that it is distinctive enough without being terribly ugly.
"Not terribly ugly" is the best we're going to do, since Guido has already ruled out non-ASCII characters. Well, without another PEP, and I don't think we should delay PEP 465 for a "Unicode operators" PEP.
I think making @ right associative would make this less suitable for other uses. Example
someclassobject @ somevalue @ somevalue
won't work with right-associativity. Plus precedence of mixing right and left can be confusing.
--- Bruce
On Fri, Mar 14, 2014 at 09:09:02PM -0700, Bruce Leban wrote:
I think making @ right associative would make this less suitable for other uses. Example
someclassobject @ somevalue @ somevalue
won't work with right-associativity.
Why not? It works with the other right-associative operator:
x**y**z
"Works" depends on what you expect it to do. Unless you tell us what these "other uses" are, how can we know that right-associativity won't work?
Plus precedence of mixing right and left can be confusing.
Perhaps. Perhaps not. This works as expected:
3*2**5
Are there any other right-associative operators that are confusing?
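For reference, a quick check of how the one existing right-associative operator behaves, including when mixed with left-associative ``*`` (plain current-Python syntax, no assumptions):

```python
# ** groups right-to-left:
assert 2 ** 3 ** 2 == 2 ** (3 ** 2) == 512
assert (2 ** 3) ** 2 == 64  # the left-associative reading differs

# Mixing with * is unambiguous because ** binds tighter:
assert 3 * 2 ** 5 == 3 * (2 ** 5) == 96
```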
On Sat, 15 Mar 2014 18:35:09 +1100 Steven D'Aprano steve@pearwood.info wrote:
On Fri, Mar 14, 2014 at 09:09:02PM -0700, Bruce Leban wrote:
I think making @ right associative would make this less suitable for other uses. Example
someclassobject @ somevalue @ somevalue
won't work with right-associativity.
Why not? It works with the other right-associative operator:
x**y**z
"Works" depends on what you expect it to do.
The real question is why @ would be right-associative. "**" is very rarely used in a chained manner as the above, so its associativity isn't really important (I don't think I have ever written "x**y**z"). @ will be used routinely in a chained manner, so the question is more important here.
The possible reason given in the PEP is very weak and amounts to premature optimization:
"""It's been suggested that @ should be right-associative, on the grounds that for expressions like Mat @ Mat @ vec, the two different evaluation orders produce the same result, but the right-associative order Mat @ (Mat @ vec) will be faster and use less memory than the left-associative order (Mat @ Mat) @ vec. (Matrix-vector multiplication is much cheaper than matrix-matrix multiplication)."""
If that's the only reason, then I'd like @ to be left-associative.
Regards
Antoine.
Antoine Pitrou wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's just a matter of optimization. Often, matrix @ vector represents a linear operator acting on an element of a vector space. When you chain them,
A @ B @ C @ v
conceptually represents acting on v with C, then B, then A.
On Sun, 16 Mar 2014 00:55:09 +1300 Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Antoine Pitrou wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's just a matter of optimization. Often, matrix @ vector represents a linear operator acting on an element of a vector space. When you chain them,
A @ B @ C @ v
conceptually represents acting on v with C, then B, then A.
It can just as well represent "acting" on v with (A @ B @ C).
Of course, mathematically it shouldn't make a difference, but in computer programming right-associative operators are always a special case, and therefore an additional cognitive burden.
Regards
Antoine.
On 15 March 2014 12:06, Antoine Pitrou solipsis@pitrou.net wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's just a matter of optimization. Often, matrix @ vector represents a linear operator acting on an element of a vector space. When you chain them,
A @ B @ C @ v
conceptually represents acting on v with C, then B, then A.
It can just as well represent "acting" on v with (A @ B @ C).
Of course, mathematically it shouldn't make a difference, but in computer programming right-associative operators are always a special case, and therefore an additional cognitive burden.
I don't think it's a premature optimisation. It's a significant algorithmic optimisation.
(A @ (B @ (C @ v)))  # 3 matrix-vector multiplies
(((A @ B) @ C) @ v)  # 2 matrix-matrix multiplies and one matrix-vector multiply
If the matrices are NxN and the vector of length N then the matrix-vector multiplications can be performed with the asymptotically optimal N**2 operations but the matrix-matrix multiplications require something like N**2.5 or worse.
It is possible but unusual for people to write this the other way round (in which case the optimisation would favour left-associativity). In any case many of the users of these operators will not know the difference between left and right associativity and will either use brackets or just write it out and hope that Python/numpy know what to do.
Oscar
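The operation counts above can be made concrete with a naive pure-Python implementation (toy code, counting scalar multiplies by formula rather than measuring):

```python
def matmat(A, B):
    """Naive matrix-matrix product; N x N inputs cost N**3 multiplies."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, v):
    """Matrix-vector product; an N x N input costs N**2 multiplies."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

N = 50
A = B = [[1.0] * N for _ in range(N)]
v = [1.0] * N

# Right-associative: A @ (B @ v) -> two matrix-vector products.
right_cost = 2 * N**2
# Left-associative: (A @ B) @ v -> one matrix-matrix, one matrix-vector.
left_cost = N**3 + N**2

print(right_cost, left_cost)  # prints: 5000 127500

# Both evaluation orders give the same answer:
assert matvec(A, matvec(B, v)) == matvec(matmat(A, B), v)
```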
On Sat, 15 Mar 2014 12:20:42 +0000 Oscar Benjamin oscar.j.benjamin@gmail.com wrote:
I don't think it's a premature optimisation. It's a significant algorithmic optimisation.
You could also make the multiplications lazy (and enforce the optimal order when computing the final result) rather than enforce that optimization at the language parsing level. It would be more flexible, and would also avoid potentially pessimizing other use cases.
Regards
Antoine.
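Antoine's lazy-evaluation idea can be sketched in a few lines of pure Python. The class below (invented here for illustration; it is not from blaze or numpypy) merely records the chain of factors, and only when evaluated folds from the right, so a chain ending in a vector stays a sequence of cheap matrix-vector products regardless of how the parser associated the @ operators:

```python
# Toy sketch of deferring a multiplication chain so the library, not the
# parser, picks the cheapest association. Only the vector-on-the-right
# case is handled; a full implementation would do general matrix-chain
# ordering.

def matvec(m, v):
    # Naive matrix-vector product over nested lists.
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

class LazyChain:
    def __init__(self, *factors):
        self.factors = list(factors)

    def __matmul__(self, other):
        # Just record the operand; no arithmetic happens yet.
        if isinstance(other, LazyChain):
            return LazyChain(*self.factors, *other.factors)
        return LazyChain(*self.factors, other)

    def evaluate(self):
        # Fold from the right: when the tail is a vector, every step
        # along the way is a matrix-vector multiply.
        *heads, result = self.factors
        for m in reversed(heads):
            result = matvec(m, result)
        return result

A = LazyChain([[1, 0], [0, 2]])
B = LazyChain([[0, 1], [1, 0]])
v = [3, 4]
print((A @ B @ v).evaluate())  # [4, 6]
```

This is the flexibility Antoine is pointing at: the association chosen by the grammar becomes irrelevant, at the cost of every matrix type having to implement the deferral machinery.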
On 15 March 2014 12:28, Antoine Pitrou solipsis@pitrou.net wrote:
On Sat, 15 Mar 2014 12:20:42 +0000 Oscar Benjamin oscar.j.benjamin@gmail.com wrote:
I don't think it's a premature optimisation. It's a significant algorithmic optimisation.
You could also make the multiplications lazy (and enforce the optimal order when computing the final result) rather than enforce that optimization at the language parsing level. It would be more flexible, and would also avoid potentially pessimizing other use cases.
That's true. I believe both blaze and numpypy intend to introduce optimisations in this style.
Oscar
On 15 March 2014 12:36, Oscar Benjamin oscar.j.benjamin@gmail.com wrote:
On 15 March 2014 12:28, Antoine Pitrou solipsis@pitrou.net wrote:
On Sat, 15 Mar 2014 12:20:42 +0000 Oscar Benjamin oscar.j.benjamin@gmail.com wrote:
I don't think it's a premature optimisation. It's a significant algorithmic optimisation.
You could also make the multiplications lazy (and enforce the optimal order when computing the final result) rather than enforce that optimization at the language parsing level. It would be more flexible, and would also avoid potentially pessimizing other use cases.
That's true. I believe both blaze and numpypy intend to introduce optimisations in this style.
Just to add to that: I personally would almost always use brackets rather than rely on left- or right- associativity for something like this. A similar way that it can come up is with scalar-scalar vs scalar-array multiplication e.g.:
2 * pi * x / L * A # A is a big array
I would rewrite that as
(2 * pi * x / L) * A
rather than rely on precedence/associativity.
Oscar
On Sat, Mar 15, 2014 at 12:42:51PM +0000, Oscar Benjamin wrote:
Just to add to that: I personally would almost always use brackets rather than rely on left- or right- associativity for something like this.
A similar way that it can come up is with scalar-scalar vs scalar-array multiplication e.g.:
2 * pi * x / L * A # A is a big array
I would rewrite that as
(2 * pi * x / L) * A
rather than rely on precedence/associativity.
It seems to me that you actually are relying on precedence/ associativity, otherwise you would have written it in fully-bracketed form like this:
(((2 * pi) * x) / L) * A
It's your choice to include redundant brackets in an expression, but I try hard to avoid the extra visual noise.
On 15 March 2014 13:47, Steven D'Aprano steve@pearwood.info wrote:
On Sat, Mar 15, 2014 at 12:42:51PM +0000, Oscar Benjamin wrote:
Just to add to that: I personally would almost always use brackets rather than rely on left- or right- associativity for something like this.
A similar way that it can come up is with scalar-scalar vs scalar-array multiplication e.g.:
2 * pi * x / L * A # A is a big array
I would rewrite that as
(2 * pi * x / L) * A
rather than rely on precedence/associativity.
It seems to me that you actually are relying on precedence/ associativity, otherwise you would have written it in fully-bracketed form like this:
(((2 * pi) * x) / L) * A
The point is that I don't care about the order of evaluation for the scalar part. Wherever you put the brackets works for me:
(2 * pi) * (x / L)
((2 * pi) * x) / L
(2 * (pi * x)) / L
2 * ((pi * x) / L)
2 * (pi * (x / L))
The reason I care about it with A is because A is a big array. Every operation on A involves an expensive pass over all the elements of the array as well as a large temporary allocation.
It's your choice to include redundant brackets in an expression, but I try hard to avoid the extra visual noise.
I don't see those brackets as noise. To me it clearly shows that I'm only doing one pass over the array which is useful information.
Oscar
On Sat, Mar 15, 2014 at 01:28:11PM +0100, Antoine Pitrou wrote:
On Sat, 15 Mar 2014 12:20:42 +0000 Oscar Benjamin oscar.j.benjamin@gmail.com wrote:
I don't think it's a premature optimisation. It's a significant algorithmic optimisation.
You could also make the multiplications lazy (and enforce the optimal order when computing the final result) rather than enforce that optimization at the language parsing level. It would be more flexible, and would also avoid potentially pessimizing other use cases.
That sounds to me like you are suggesting that every single implementation of a matrix type that supports matrix multiplication needs to spend the considerable extra effort to make the multiplication lazy, in order to "potentially" avoid hurting hypothetical use-cases which don't exist now and may never exist.
That's as clear a case of YAGNI as I've seen for a long time.
We have one solid use-case for a matrix multiplication operator, and that's matrix multiplication. If matrix multiplication is most usefully treated as right-associative, then we ought to make it right- associative, and not burden the matrix multiplication operator with restrictions for the sake of hypothetical non-matrix-mult uses.
In the same way that there is one good use-case for the numeric exponentiation operator ** , namely numeric exponentiation, and consequently it is right-associative because that's how the operator is most usefully treated.
On Sun, 16 Mar 2014 01:04:51 +1100 Steven D'Aprano steve@pearwood.info wrote:
That sounds to me that you are suggesting that every single implementation of a matrix type that supports matrix multiplication
No, only implementations that are actually concerned with that particular optimization.
For an analogy, bytes objects don't implement cheap slicing: if you want that, you need to use a different type (memoryview).
Regards
Antoine.
I don't think it's a premature optimisation. It's a significant algorithmic optimisation.
Just an idea: any library could optimize this itself by delaying evaluation of the expression until the result is used or a vector is multiplied on the right. So the optimization wouldn't be a real issue here.
On 2014-03-15 12:06, Antoine Pitrou wrote:
On Sun, 16 Mar 2014 00:55:09 +1300 Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Antoine Pitrou wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's just a matter of optimization. Often, matrix @ vector represents a linear operator acting on an element of a vector space. When you chain them,
A @ B @ C @ v
conceptually represents acting on v with C, then B, then A.
It can just as well represent "acting" on v with (A @ B @ C).
Of course, mathematically it shouldn't make a difference, but in computer programming right-associative operators are always a special case, and therefore an additional cognitive burden.
I think his point was that people doing linear algebra tend to read these expressions "right-to-left" anyways because of that conceptual model. I'm not entirely sure how supportable that is in the general population, but it's certainly the way I think about these expressions. For me, left-associative causes an additional cognitive burden, but it's a minor one, either way.
Robert Kern wrote:
For me, left-associative causes an additional cognitive burden, but it's a minor one, either way.
Wild idea -- provide a way to spell both left and right associative versions.
It's disappointing that there isn't a character that's the mirror image of an @. Failing that:
A @ B     left associative
A @: B    right associative
(Ducks hail of tomatoes from the no-colons-in-expressions crowd...)
Antoine Pitrou wrote:
It can just as well represent "acting" on v with (A @ B @ C).
Yes, but if you ask what A @ B @ C means as an operator, it means "the same thing as acting with C, then B, then A."
The fact that evaluating it right-to-left also happens to be more efficient in some cases means that there are both conceptual *and* practical reasons for making @ right associative.
I was just trying to point out that efficiency is not the *only* reason to consider this.
There's a counter-argument as well -- it's only more efficient if the rightmost operand is just a single vector or a small enough array of vectors. Otherwise, it's more efficient to calculate a combined operator and then apply that. So the question would be whether the small case is common enough in practice to be worth having an unusual associativity.
A data point from another language, for what it's worth: In APL, *all* operators are right-associative. I'm not sure what the reason for that choice was, but it may result from a similar line of reasoning around how mathematicians think about things.
On Sat, Mar 15, 2014 at 6:31 PM, Greg Ewing greg.ewing@canterbury.ac.nzwrote:
In APL, *all* operators are right-associative. I'm not sure what the reason for that choice was, but it may result from a similar line of reasoning around how mathematicians think about things.
In APL, there is no operator precedence, so right associativity is the only way to make expressions that mix unary and binary operations look natural. For example, a + -b.
On Sat, Mar 15, 2014 at 12:27:33PM +0100, Antoine Pitrou wrote:
The real question is why @ would be right-associative.
That's a good question, but I don't think that should be up to us to decide. Guido clearly stated that if it is more useful to define @ as right-associative, we shouldn't let the fact that most operators are left-associative get in the way of doing the right thing here. The only people who are in a position to decide that are the users of matrix multiplication.
[...]
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's premature optimization. Sometimes we do know ahead of time that a calculation done one way will be faster than doing it another way: you don't have to "try it and see" to realise that repeatedly adding strings in a for-loop can be O(N**2) versus O(N) for using str.join(). [Aside: if you do try it, the string-concat reference-counting optimization of CPython may fool you into thinking that concatenation is generally fast. It isn't.]
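Steven's string analogy in code: repeated += copies the accumulated string at each step (quadratic in total length in the general case, modulo CPython's in-place optimization, which is not guaranteed), while str.join makes a single pass:

```python
# Quadratic pattern: each += may copy everything accumulated so far.
parts = [str(i) for i in range(1000)]

s1 = ""
for p in parts:
    s1 = s1 + p  # O(len(s1)) copy per step in the general case

# Linear pattern: one allocation, one pass.
s2 = "".join(parts)

assert s1 == s2
```

The point is that this asymptotic difference is knowable in advance, without benchmarking, just as the matvec-vs-matmat difference is.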
Likewise there is nothing premature about the fact that "Matrix-vector multiplication is much cheaper than matrix-matrix multiplication". The only question is whether it is more common to write:
Matrix @ Matrix @ Column_Vector
or
Row_Vector @ Matrix @ Matrix
I'll leave it to those who do matrix maths to decide which they use more often, but personally I've never come across the second case except in schoolbook exercises.
"""It's been suggested that @ should be right-associative, on the grounds that for expressions like Mat @ Mat @ vec, the two different evaluation orders produce the same result, but the right-associative order Mat @ (Mat @ vec) will be faster and use less memory than the left-associative order (Mat @ Mat) @ vec. (Matrix-vector multiplication is much cheaper than matrix-matrix multiplication)."""
If that's the only reason, then I'd like @ to be left-associative.
I'm not sure that premature pessimization is much of an improvement over premature optimization.
*wink*
On Sun, 16 Mar 2014 02:11:43 +1100 Steven D'Aprano steve@pearwood.info wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's premature optimization. Sometimes we do know ahead of time that a calculation done one way will be faster than doing it another way: you don't have to "try it and see" to realise that repeatedly adding strings in a for-loop can be O(N**2) versus O(N) for using str.join().
It's premature optimization because the PEP is proposing to enforce it at the language level. We didn't change *the language* so that "str +=" allows for O(N) repeated concatenation; instead we tell people that "".join() should be used for repeated concatenation. Why would our course of action be different for matrix multiplication?
Regards
Antoine.
On 16 March 2014 01:22, Antoine Pitrou solipsis@pitrou.net wrote:
On Sun, 16 Mar 2014 02:11:43 +1100 Steven D'Aprano steve@pearwood.info wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's premature optimization. Sometimes we do know ahead of time that a calculation done one way will be faster than doing it another way: you don't have to "try it and see" to realise that repeatedly adding strings in a for-loop can be O(N**2) versus O(N) for using str.join().
It's premature optimization because the PEP is proposing to enforce it at the language level. We didn't change *the language* so that "str +=" allows for O(N) repeated concatenation; instead we tell people that "".join() should be used for repeated concatenation. Why would our course of action be different for matrix multiplication?
That's why Guido's request was for the numeric community to go back and try to come up with a better rationale one way or the other for that question - he wanted to make it clear to them that right associativity *was* potentially acceptable, so they should seriously consider the question and make the case for right associativity if they decided they really wanted it, rather than assume that we would reject the idea out of hand. Left associativity still remains the default, and this isn't the right list to argue about the choice (as most of the relevant people aren't here). If the numeric computing community decide they *do* want right associativity and add that to the PEP, *then* Guido will need to make the call as to whether or not he considers their rationale for requesting it convincing (and we will have the opportunity to chime in with our own perspectives to help him make up his mind).
Cheers, Nick.
On Sat, Mar 15, 2014 at 04:22:33PM +0100, Antoine Pitrou wrote:
On Sun, 16 Mar 2014 02:11:43 +1100 Steven D'Aprano steve@pearwood.info wrote:
The possible reason given in the PEP is very weak and amounts to premature optimization:
I don't think it's premature optimization. Sometimes we do know ahead of time that a calculation done one way will be faster than doing it another way: you don't have to "try it and see" to realise that repeatedly adding strings in a for-loop can be O(N**2) versus O(N) for using str.join().
It's premature optimization because the PEP is proposing to enforce it at the language level.
The PEP leaves the question of left- versus right-associativity open. It has to be decided one way or the other. What evidence would you want to see before saying "It's not premature optimization, it's a justified optimization"?
We didn't change *the language* so that "str +=" allows for O(N) repeated concatenation; instead we tell people that "".join() should be used for repeated concatenation. Why would our course of action be different for matrix multiplication?
Exactly the same argument applies to left-associativity. Why should the language enforce optimizing the first case over the second?
row_vector @ matrix @ matrix
vs.
matrix @ matrix @ column_vector
Since infix operators have to have an associativity, the language implicitly has to optimize one case or the other. We can't sit on the fence and refuse to favour one over the other.
If the matrix-using community says that both cases are equally common, (or uncommon, perhaps) and they are indifferent between left- and right-associativity, then it makes sense to stick to what the majority of other operators do. I agree with that.
But if the matrix-using community comes down in favour of right- associativity, because Practicality Beats Purity and the second case is sufficiently more common than the first case, then I think it is a *broken design* to force left-associativity on them just to satisfy the expectations of people who aren't going to use the @ operator.
I hope I've made my position clear here: Guido has given his blessing to the possibility of making @ right-associative, and I think that the only people who should care what the associativity of @ ends up being is the community of matrix-multiplication users. They should make that decision, based on what is useful to them. I don't care what colour this bikeshed ends up being, I just want to see that it is the users of that shed who make the decision.
On Sun, 16 Mar 2014 03:26:49 +1100 Steven D'Aprano steve@pearwood.info wrote:
Exactly the same argument applies to left-associativity. Why should the language enforce optimizing the first case over the second?
It has nothing to do with optimizing, it's just that left-associativity is the common expectation wrt. operators. Left-associativity doesn't need any particular justification.
Regards
Antoine.
On Sat, Mar 15, 2014 at 11:11 AM, Steven D'Aprano steve@pearwood.infowrote:
The only question is whether it is more common to write:
Matrix @ Matrix @ Column_Vector
or
Row_Vector @ Matrix @ Matrix
I'll leave it to those who do matrix maths to decide which they use more often, but personally I've never come across the second case except in schoolbook exercises.
Abstractly, 1-dimensional arrays are neither columns nor rows, but Python's horizontal notation makes them more row-like than column-like. In the 2-dimensional case, [[1,2]] is a row-vector and [[1],[2]] is a column-vector. Which one is more "natural"?
When you have a matrix
A = [[1, 2], [3, 4]]
A[1] is [3, 4], which is a row. To get a column, [2, 4], one has to write A[:,1] in numpy.
When it comes to matrix - vector multiplication,
[1, 2] @ [[1, 2], [3, 4]] -> [7, 10]
has a text-book appearance, while
[[1, 2], [3, 4]] @ [1, 2] -> [5, 11]
has to be mentally cast into
([[1, 2], [3, 4]] @ [[1], [2]])[:, 0] -> [5, 11]
While it is more common in math literature to see Mat @ vec than vec @ Mat, I don't think anyone who has completed an introductory linear algebra course would have trouble understanding what [1, 2, 3] @ Mat means. On the other hand, novice programmers may find it puzzling why Mat @ [Mat1, Mat2] is the same as [Mat @ Mat1, Mat @ Mat2], but Mat @ [vec1, vec2] is not [Mat @ vec1, Mat @ vec2].
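The two products Alexander quotes can be checked in pure Python (NumPy's @ produces the same numbers for 1-d operands, treating the vector as a row on the left and a column on the right, then dropping the extra axis):

```python
# Pure-Python versions of the 1-d products quoted above.

def vec_mat(v, m):
    # Row-vector times matrix: result[j] = sum_i v[i] * m[i][j]
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def mat_vec(m, v):
    # Matrix times column-vector: result[i] = sum_j m[i][j] * v[j]
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

print(vec_mat([1, 2], [[1, 2], [3, 4]]))  # [7, 10]
print(mat_vec([[1, 2], [3, 4]], [1, 2]))  # [5, 11]
```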
On 16 March 2014 06:50, Alexander Belopolsky alexander.belopolsky@gmail.com wrote:
Abstractly, 1-dimensional arrays are neither columns nor rows, but Python's horizontal notation makes them more row-like than column-like. In 2-dimensional case, [[1,2]] is a row-vector and [[1],[2]] is a column-vector. Which one is more "natural"?
Folks, please stop trying to argue this one in the abstract. The decision will likely be made primarily based on the feedback of folks that have *actually been using and teaching* Python as a tool for matrix manipulation (etc) for more than a decade.
Guido has asked them to have that discussion and summarise their conclusions in the PEP - we can lob our armchair opinions into the mix after the experts have spoken.
This won't be the first time features have been added to the language core specifically for the benefit of the numeric community - the ellipsis notation and extended slicing in general were added for their benefit years ago (I started getting involved in Python core development around Python 2.3, just as the core sequence types were being updated to support extended slicing, and being able to use "s[::-1]" with lists, strings, etc was a novel concept).
Regards, Nick.
Also complex numbers were Jim Hugunin's request for numeric work.
On Saturday, March 15, 2014, Nick Coghlan ncoghlan@gmail.com wrote:
On 16 March 2014 06:50, Alexander Belopolsky alexander.belopolsky@gmail.com wrote:
Abstractly, 1-dimensional arrays are neither columns nor rows, but Python's horizontal notation makes them more row-like than column-like. In the 2-dimensional case, [[1,2]] is a row-vector and [[1],[2]] is a column-vector. Which one is more "natural"?
Folks, please stop trying to argue this one in the abstract. The decision will likely be made primarily based on the feedback of folks that have *actually been using and teaching* Python as a tool for matrix manipulation (etc) for more than a decade.
Guido has asked them to have that discussion and summarise their conclusions in the PEP - we can lob our armchair opinions into the mix after the experts have spoken.
This won't be the first time features have been added to the language core specifically for the benefit of the numeric community - the ellipsis notation and extended slicing in general were added for their benefit years ago (I started getting involved in Python core development around Python 2.3, just as the core sequence types were being updated to support extended slicing, and being able to use "s[::-1]" with lists, strings, etc was a novel concept).
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
On Sat, Mar 15, 2014 at 10:57 PM, Nick Coghlan ncoghlan@gmail.com wrote:
Folks, please stop trying to argue this one in the abstract. The decision will likely be made primarily based on the feedback of folks that have *actually been using and teaching* Python as a tool for matrix manipulation (etc) for more than a decade.
I would think I qualify having been using Numeric/NumPy since 2003.
On 16 March 2014 13:12, Alexander Belopolsky alexander.belopolsky@gmail.com wrote:
On Sat, Mar 15, 2014 at 10:57 PM, Nick Coghlan ncoghlan@gmail.com wrote:
Folks, please stop trying to argue this one in the abstract. The decision will likely be made primarily based on the feedback of folks that have *actually been using and teaching* Python as a tool for matrix manipulation (etc) for more than a decade.
I would think I qualify having been using Numeric/NumPy since 2003.
OK, thanks for the additional context - I didn't know that. However, I still suggest finding the relevant discussion on the NumPy lists and contributing there would be a better way to go, rather than getting Nathaniel to track the discussion in two places at once.
We'll get another go around here and on python-dev after the PEP has been updated with answers to Guido's questions :)
Regards, Nick.
On Sun, 16 Mar 2014 12:57:06 +1000 Nick Coghlan ncoghlan@gmail.com wrote:
Guido has asked them to have that discussion and summarise their conclusions in the PEP - we can lob our armchair opinions into the mix after the experts have spoken.
Debunking armchair opinions is one of the points of the PEP process :-)
Regards
Antoine.
On 16 March 2014 22:26, Antoine Pitrou solipsis@pitrou.net wrote:
On Sun, 16 Mar 2014 12:57:06 +1000 Nick Coghlan ncoghlan@gmail.com wrote:
Guido has asked them to have that discussion and summarise their conclusions in the PEP - we can lob our armchair opinions into the mix after the experts have spoken.
Debunking armchair opinions is one of the points of the PEP process :-)
Aye, but at the moment it makes sense to wait and see if there is even an argument to be had - Nathaniel may decide that even after reviewing the question seriously, he doesn't want to propose a right associative operator :)
Cheers, Nick.
On Sun, Mar 16, 2014 at 1:05 PM, Nick Coghlan ncoghlan@gmail.com wrote:
On 16 March 2014 22:26, Antoine Pitrou solipsis@pitrou.net wrote:
On Sun, 16 Mar 2014 12:57:06 +1000 Nick Coghlan ncoghlan@gmail.com wrote:
Guido has asked them to have that discussion and summarise their conclusions in the PEP - we can lob our armchair opinions into the mix after the experts have spoken.
Debunking armchair opinions is one of the points of the PEP process :-)
Aye, but at the moment it makes sense to wait and see if there is even an argument to be had - Nathaniel may decide that even after reviewing the question seriously, he doesn't want to propose a right associative operator :)
And indeed, that does seem to be the way things have worked out :-). http://mail.scipy.org/pipermail/numpy-discussion/2014-April/069834.html
On Sun, Apr 06, 2014 at 11:02:00PM +0100, Nathaniel Smith wrote:
On Sun, Mar 16, 2014 at 1:05 PM, Nick Coghlan ncoghlan@gmail.com wrote:
Aye, but at the moment it makes sense to wait and see if there is even an argument to be had - Nathaniel may decide that even after reviewing the question seriously, he doesn't want to propose a right associative operator :)
And indeed, that does seem to be the way things have worked out :-). http://mail.scipy.org/pipermail/numpy-discussion/2014-April/069834.html
That's a shame in one way -- for anyone using operator overloading to define their own DSL, there's only ** if you want a right-associative operator to overload. Still, that's an extremely marginal, and hypothetical, use-case. Left-associative it is.
Was that the last blocker for the PEP?
On Mon, Apr 7, 2014 at 12:20 AM, Steven D'Aprano steve@pearwood.info wrote:
On Sun, Apr 06, 2014 at 11:02:00PM +0100, Nathaniel Smith wrote:
On Sun, Mar 16, 2014 at 1:05 PM, Nick Coghlan ncoghlan@gmail.com wrote:
Aye, but at the moment it makes sense to wait and see if there is even an argument to be had - Nathaniel may decide that even after reviewing the question seriously, he doesn't want to propose a right associative operator :)
And indeed, that does seem to be the way things have worked out :-). http://mail.scipy.org/pipermail/numpy-discussion/2014-April/069834.html
That's a shame in one way -- for anyone using operator overloading to define their own DSL, there's only ** if you want a right-associative operator to overload. Still, that's an extremely marginal, and hypothetical, use-case. Left-associative it is.
Was that the last blocker for the PEP?
Pretty much. Just posted to python-dev: https://mail.python.org/pipermail/python-dev/2014-April/133791.html
-n
I haven't been following this thread too closely, so please stop me if this has been covered, but has overloading the @ operator as function composition been considered yet?
An example would be
filter(a @ b, lst)
as opposed to
filter(lambda x: a(b(x)), lst)
On Sun, Apr 6, 2014 at 6:51 PM, Nathaniel Smith njs@pobox.com wrote:
On Mon, Apr 7, 2014 at 12:20 AM, Steven D'Aprano steve@pearwood.info wrote:
On Sun, Apr 06, 2014 at 11:02:00PM +0100, Nathaniel Smith wrote:
On Sun, Mar 16, 2014 at 1:05 PM, Nick Coghlan ncoghlan@gmail.com
wrote:
Aye, but at the moment it makes sense to wait and see if there is even an argument to be had - Nathaniel may decide that even after reviewing the question seriously, he doesn't want to propose a right associative operator :)
And indeed, that does seem to be the way things have worked out :-).
http://mail.scipy.org/pipermail/numpy-discussion/2014-April/069834.html
That's a shame in one way -- for anyone using operator overloading to define their own DSL, there's only ** if you want a right-associative operator to overload. Still, that's an extremely marginal, and hypothetical, use-case. Left-associative it is.
Was that the last blocker for the PEP?
Pretty much. Just posted to python-dev: https://mail.python.org/pipermail/python-dev/2014-April/133791.html
-n
-- Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
It's been raised a few times, but the problem is that there's no evidence that anyone actually needs a short way to perform composition - notice the composition operation's never even been added to functools. If you do want to make that case though then there's nothing stopping you :-) On 8 Apr 2014 22:10, "Michael Mitchell" epsilonmichael@gmail.com wrote:
I haven't been following this thread too closely, so please stop me if this has been covered, but has overloading the @ operator as function composition been considered yet?
An example would be
filter(a @ b, lst)
as opposed to
filter(lambda x: a(b(x)), lst)
On Sun, Apr 6, 2014 at 6:51 PM, Nathaniel Smith njs@pobox.com wrote:
On Mon, Apr 7, 2014 at 12:20 AM, Steven D'Aprano steve@pearwood.info wrote:
On Sun, Apr 06, 2014 at 11:02:00PM +0100, Nathaniel Smith wrote:
On Sun, Mar 16, 2014 at 1:05 PM, Nick Coghlan ncoghlan@gmail.com
wrote:
Aye, but at the moment it makes sense to wait and see if there is
even
an argument to be had - Nathaniel may decide that even after
reviewing
the question seriously, he doesn't want to propose a right
associative
operator :)
And indeed, that does seem to be the way things have worked out :-).
http://mail.scipy.org/pipermail/numpy-discussion/2014-April/069834.html
That's a shame in one way -- for anyone using operator overloading to define their own DSL, there's only ** if you want a right-associative operator to overload. Still, that's an extremely marginal, and hypothetical, use-case. Left-associative it is.
Was that the last blocker for the PEP?
Pretty much. Just posted to python-dev: https://mail.python.org/pipermail/python-dev/2014-April/133791.html
-n
I'm not confident I can make that case very well, but I suppose I can try :).
There do seem to be some hand-rolled composition functions in various Github repositories: https://github.com/search?l=python&q=%22def+compose%22&ref=searchres...
I'm not sure why a composition operator has never been added to the functools library, but I'd observe that a composition function wouldn't be much less verbose than my earlier lambda example. An infix binary operator, however, would be considerably more concise, especially when chaining multiple functions.
There's also precedent to the @ symbol being associated with functions, i.e. decorator syntax. Decorators are callables that accept callables and return callables, which parallels the infix definition of accepting two callables and returning a callable.
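Michael's suggestion can be sketched with a small wrapper class (the name "composable" is invented here for illustration, not an existing library API) whose __matmul__ composes callables, so (a @ b)(x) == a(b(x)). Used as a decorator, it also echoes the decorator precedent he mentions:

```python
# Sketch: function composition via an overloaded @ operator.

class composable:
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __matmul__(self, other):
        # other only needs to be callable; the result is composable too,
        # so chains like a @ b @ c keep working.
        return composable(lambda *a, **k: self.func(other(*a, **k)))

@composable
def double(x):
    return 2 * x

@composable
def inc(x):
    return x + 1

print((double @ inc)(10))                                # 22, i.e. double(inc(10))
print(list(filter(composable(bool) @ abs, [-1, 0, 2])))  # [-1, 2]
```

Note this only composes cleanly when each function takes and returns a single value, which is one of the objections raised later in the thread.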
On Tue, Apr 8, 2014 at 2:17 PM, Nathaniel Smith njs@pobox.com wrote:
It's been raised a few times, but the problem is that there's no evidence that anyone actually needs a short way to perform composition - notice the composition operation's never even been added to functools. If you do want to make that case though then there's nothing stopping you :-) On 8 Apr 2014 22:10, "Michael Mitchell" epsilonmichael@gmail.com wrote:
I haven't been following this thread too closely, so please stop me if this has been covered, but has overloading the @ operator as function composition been considered yet?
An example would be
filter(a @ b, lst)
as opposed to
filter(lambda x: a(b(x)), lst)
On Sun, Apr 6, 2014 at 6:51 PM, Nathaniel Smith njs@pobox.com wrote:
On Mon, Apr 7, 2014 at 12:20 AM, Steven D'Aprano steve@pearwood.info wrote:
On Sun, Apr 06, 2014 at 11:02:00PM +0100, Nathaniel Smith wrote:
On Sun, Mar 16, 2014 at 1:05 PM, Nick Coghlan ncoghlan@gmail.com
wrote:
Aye, but at the moment it makes sense to wait and see if there is
even
an argument to be had - Nathaniel may decide that even after
reviewing
the question seriously, he doesn't want to propose a right
associative
operator :)
And indeed, that does seem to be the way things have worked out :-).
http://mail.scipy.org/pipermail/numpy-discussion/2014-April/069834.html
That's a shame in one way -- for anyone using operator overloading to define their own DSL, there's only ** if you want a right-associative operator to overload. Still, that's an extremely marginal, and hypothetical, use-case. Left-associative it is.
Was that the last blocker for the PEP?
Pretty much. Just posted to python-dev: https://mail.python.org/pipermail/python-dev/2014-April/133791.html
-n
On Apr 8, 2014, at 16:00, Michael Mitchell epsilonmichael@gmail.com wrote:
I'm not confident I can make that case very well, but I suppose I can try :).
There do seem to be some hand-rolled composition functions in various Github repositories: https://github.com/search?l=python&q=%22def+compose%22&ref=searchres...
I'm not sure why a composition operator has never been added to the functools library, but I'd make the observation that a composition function wouldn't be much less verbose than in my earlier lambda example. An infix binary operator, however, would be considerably more concise, especially when chaining multiple functions.
A few problems have been raised with this.
The big one is that it's not clear what it means to compose functions unless they take and return a single value. While you could make it call g(*f(x)) to pass multiple values, that makes it no longer work with single-value functions.
Functional languages deal with that by having equally compact ways to do all kinds of other general higher-order tasks, like partialing functions and sectioning operators (or they just use currying to make all functions partial), flipping arguments, raising functions (like a map that returns a function over an iterable, instead of taking an iterable in the map call), etc. Python has nothing but partial, it's not that compact, and it's buried in functools.
Also, some people find that code using higher-order functions more than necessary usually doesn't look very pythonic. All kinds of things that are done by building up a higher-order function and then calling it in a language like Haskell are usually instead done by chaining or imperatively calling functions in Python. This could lead to a "more than one way to do it" situation, or just to code that's harder to understand without thinking it through. (I personally don't find this argument very compelling, but then it's hard for me to put myself in the shoes of people who never use functional languages, so that may not mean much. And I think Guido is one of the people who finds it compelling.)
Both left-compose and right-compose are perfectly valid operations, and the one that looks obviously right to novices is the opposite of the one that looks right (or at least more often right) to experts.
Finally, compose and matrix multiplication are false cognates. Matrices are operations on vectors under multiplication--the same operator used for composing those operations. Functions are operations on values under calling--a completely different operator than @.
It would also help to have a real use case instead of just filter(a @ b, x). Presumably b is some function from elements of x to some other type, and a is a predicate function from that type to bool... but it's hard to think of useful functions (as opposed to expressions you'd have to wrap in lambda anyway) that fit that bill off the top of my head.
None of these means the idea is impossible or useless, just that it's less obvious a win than it looks at first.
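For concreteness, the helper under discussion might be sketched like this (`compose` is hypothetical here — as noted above, no such function exists in functools — and it assumes single-argument functions):

```python
def compose(f, g):
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    def composed(x):
        return f(g(x))
    return composed

double = lambda x: 2 * x
inc = lambda x: x + 1

# The ambiguity raised above: both argument orders are defensible.
assert compose(double, inc)(10) == 22   # double(inc(10))
assert compose(inc, double)(10) == 21   # inc(double(10))
```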
There's also precedent for the @ symbol being associated with functions, i.e. decorator syntax. Decorators are callables that accept callables and return callables, which parallels the infix definition of accepting two callables and returning a callable.
On Wed, Apr 9, 2014 at 9:00 AM, Michael Mitchell epsilonmichael@gmail.com wrote:
I'm not sure why a composition operator has never been added to the functools library, but I'd make the observation that a composition function wouldn't be much less verbose than in my earlier lambda example. An infix binary operator, however, would be considerably more concise, especially when chaining multiple functions.
Is "a @ b" equivalent to "lambda x: a(b(x))" or to "lambda x: b(a(x))"? Is there a sufficiently-obvious way to make it clear that the one on the right gets called first?
ChrisA
The question is academic. Regardless of which answer you choose, that is not going to happen.
On Wed, Apr 9, 2014 at 1:33 PM, Guido van Rossum guido@python.org wrote:
The question is academic. Regardless of which answer you choose that is not going to happen.
On Tue, Apr 8, 2014 at 11:30 PM, Chris Angelico rosuav@gmail.com wrote:
Is "a @ b" equivalent to "lambda x: a(b(x))" or to "lambda x: b(a(x))"? Is there a sufficiently-obvious way to make it clear that the one on the right gets called first?
Well there you go, that settles it :)
ChrisA
Steven D'Aprano wrote:
or
Row_Vector @ Matrix @ Matrix
personally I've never come across the second case except in schoolbook exercises.
It turns up sometimes in quantum mechanics, at least when doing the algebra. I don't know whether people who do QM calculations numerically ever do it that way, though. Flipping it around would be easy enough if it were more efficient.
On Sat, Mar 15, 2014 at 6:27 AM, Antoine Pitrou solipsis@pitrou.net wrote:
On Sat, 15 Mar 2014 18:35:09 +1100 Steven D'Aprano steve@pearwood.info wrote:
On Fri, Mar 14, 2014 at 09:09:02PM -0700, Bruce Leban wrote:
I think making @ right associative would make this less suitable for other uses. Example
someclassobject @ somevalue @ somevalue
won't work with right-associativity.
Why not? It works with the other right-associative operator:
x**y**z
"Works" depends on what you expect it to do.
The real question is why @ would be right-associative. "**" is very rarely used in a chained manner such as the above, so its associativity isn't really important (I don't think I have ever written "x**y**z"). @ will be used routinely in a chained manner, so the question is more important here.
I write that all the time when working with SymPy. And the associativity of ** is *very* important. If ** were left associative it would not be useful, because, at least for positive x, y, and z, (x**y)**z is the same as x**(y*z). If you really want the stacked power it has to be right associative.
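Aaron's point about ``**`` can be checked directly:

```python
# ** is right-associative: x**y**z means x**(y**z).
assert 2 ** 3 ** 2 == 2 ** (3 ** 2) == 512

# A left-associative reading would collapse to a plain product of
# exponents, since (x**y)**z == x**(y*z) for positive x:
assert (2 ** 3) ** 2 == 2 ** (3 * 2) == 64
```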
Aaron Meurer
The possible reason given in the PEP is very weak and amounts to premature optimization:
"""It's been suggested that @ should be right-associative, on the grounds that for expressions like Mat @ Mat @ vec, the two different evaluation orders produce the same result, but the right-associative order Mat @ (Mat @ vec) will be faster and use less memory than the left-associative order (Mat @ Mat) @ vec. (Matrix-vector multiplication is much cheaper than matrix-matrix multiplication)."""
If that's the only reason, then I'd like @ to be left-associative.
Regards
Antoine.
On Sat, Mar 15, 2014 at 12:35 AM, Steven D'Aprano steve@pearwood.info wrote:
On Fri, Mar 14, 2014 at 09:09:02PM -0700, Bruce Leban wrote:
I think making @ right associative would make this less suitable for other uses. Example
someclassobject @ somevalue @ somevalue
won't work with right-associativity.
Why not? It works with the other right-associative operator:
x**y**z
It won't work because right-associativity means that the two values on the right are combined first, and they won't be combined properly, given that the type of someclassobject is never consulted.
"Works" depends on what you expect it to do. Unless you tell us what these "other uses" are, how can we know that right-associativity won't work?
Here's a simple example to illustrate the concept. Suppose I have a class that is roughly equivalent to a dict mapping values to lists of values, e.g.,
a = { 1: [10, 11, 12] }
b = { 1: [11, 13] }

I might have

a + b = { 1: [10, 11, 12, 11, 13] }  # appends
a @ b = { 1: [10, 11, 12, 13] }      # appends non-duplicate values
but I also want to be able to merge in standard dicts. If @ is right-associative, a @ x @ y, where x and y are dicts, will try to compute x @ y first, and built-in dict doesn't support the @ operator. Even if it did, it's not going to do what I'd want here.
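Python 3.5's ``__matmul__`` hook postdates this thread, but Bruce's point can be sketched with it. ``MergeDict`` below is a made-up illustration class: with left-associativity, ``a @ x @ y`` evaluates as ``(a @ x) @ y``, so MergeDict's type drives every step even when x and y are plain dicts.

```python
class MergeDict:
    """Hypothetical wrapper: @ merges in non-duplicate values (sketch)."""
    def __init__(self, data):
        self.data = {k: list(v) for k, v in data.items()}

    def __matmul__(self, other):
        # 'other' may be a plain dict; our type drives the operation.
        merged = {k: list(v) for k, v in self.data.items()}
        for k, vals in other.items():
            bucket = merged.setdefault(k, [])
            bucket.extend(v for v in vals if v not in bucket)
        return MergeDict(merged)

a = MergeDict({1: [10, 11, 12]})
x = {1: [11, 13]}
y = {1: [14]}

# Left-associative: (a @ x) @ y -- MergeDict.__matmul__ runs both times.
result = a @ x @ y
assert result.data == {1: [10, 11, 12, 13, 14]}

# Right-associative would first try x @ y, and plain dicts
# don't define @ at all:
try:
    x @ y
except TypeError:
    pass
```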
I also want to give a larger example to illustrate the kind of thing I think this operator would be useful for. SQLAlchemy has query objects where you can write something like
some_query.filter(...).order_by(...).join(...)
I imagine that this could use the @ operator:
some_query @ filter(...) @ order_by(...) @ join(...)
The benefit here is that I have filter objects, order_by objects, etc. that can be passed around rather than only having API calls. In the current API, it's clumsy to support that, as I either need both filter objects and a query.filter method, or I need a clumsy query.apply method. Having an @ operator, I think, works really well. Left vs. right associativity matters if I wanted to support other kinds of objects on the right-hand side of the @, e.g., allowing:
some_query @ { column1: value1, column2: value2, ... }
as another spelling of
some_query @ filter(column1=value1, column2=value2, ...)
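Bruce's query idea could be sketched in Python 3.5+ syntax (``Query`` and ``Filter`` here are hypothetical illustration classes, not the SQLAlchemy API):

```python
class Filter:
    """A passable pipeline step holding filter criteria."""
    def __init__(self, **criteria):
        self.criteria = criteria

class Query:
    """Immutable query: @ appends a step and returns a new Query."""
    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def __matmul__(self, op):
        # A plain dict on the right-hand side is sugar for a Filter.
        if isinstance(op, dict):
            op = Filter(**op)
        return Query(self.steps + (op,))

q = Query() @ Filter(column1=1) @ {"column2": 2}
assert len(q.steps) == 2
assert q.steps[1].criteria == {"column2": 2}
```

Left-associativity is what makes the dict spelling work: ``(Query() @ Filter(...)) @ {...}`` always consults the Query object's ``__matmul__``.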
--- Bruce Learn how hackers think: http://j.mp/gruyere-security https://www.linkedin.com/in/bruceleban
On 17 March 2014 13:46, Bruce Leban bruce@leapyear.org wrote:
I also want to give a larger example to illustrate the kind of thing I think this operator would be useful for. SQLAlchemy has query objects where you can write something like
some_query.filter(...).order_by(...).join(...)
I imagine that this could use the @ operator:
some_query @ filter(...) @ order_by(...) @ join(...)
There are plenty of existing left-associative operators that could be used for that if Mike Bayer wanted (or could be convinced) to do so, so I don't see how that observation is relevant to the question of whether or not this particular proposal should be for a right-associative operator.
Cheers, Nick.
On 2014-03-17 05:03, Nick Coghlan wrote:
On 17 March 2014 13:46, Bruce Leban bruce@leapyear.org wrote:
I also want to give a larger example to illustrate the kind of thing I think this operator would be useful for. SQLAlchemy has query objects where you can write something like
some_query.filter(...).order_by(...).join(...)
I imagine that this could use the @ operator:
some_query @ filter(...) @ order_by(...) @ join(...)
There are plenty of existing left-associative operators that could be used for that if Mike Bayer wanted (or could be convinced) to do so, so I don't see how that observation is relevant to the question of whether or not this particular proposal should be for a right-associative operator.
Although this does suggest that having a right-associative operator might provide some useful variety in the operator-overloading-as-DSL ecosystem.
On Mon, Mar 17, 2014 at 10:35:37AM +0000, Robert Kern wrote:
On 2014-03-17 05:03, Nick Coghlan wrote:
There are plenty of existing left-associative operators that could be used for that if Mike Bayer wanted (or could be convinced) to do so, so I don't see how that observation is relevant to the question of whether or not this particular proposal should be for a right-associative operator.
Although this does suggest that having a right-associative operator might provide some useful variety in the operator-overloading-as-DSL ecosystem.
That's a good point. That suggests that adding a second right-associative operator may be a good idea even if it's not necessary for numpy.
On Sun, Mar 16, 2014 at 10:03 PM, Nick Coghlan ncoghlan@gmail.com wrote:
There are plenty of existing left-associative operators that could be used for that if Mike Bayer wanted (or could be convinced) to do so, so I don't see how that observation is relevant to the question of whether or not this particular proposal should be for a right-associative operator.
I knew I might get in trouble by giving such a specific example. My view is that a new @ operator is "baggage free" in terms of implied semantics -- since it has defined semantics only for matrices.
To the extent that it has implied semantics of being multiplication-like, I think it would be surprising that one multiplication operator is left-associative and one is right-associative. Does @ have higher or lower precedence than *? The draft PEP says @ is the same as *, but it can't be if it has opposite associativity. The interpretation of these two expressions depends on the precedence:
A @ B * C
A * B @ C
Someone mentioned that APL is right-associative and has no operator precedence. Those two things are related: there's no operator precedence because there are too many operators for everyone to remember the exact order, so all operators need to be uniformly left- or right-associative. I don't know the reason for sure, but I have heard that, since monadic (unary) operators are right-associative, making all binary operators right-associative as well means that you can simply interpret an APL statement reading from right to left. I don't think APL is the right model to look at.
On Fri, Mar 14, 2014 at 3:46 PM, Zachary Ware zachary.ware+pyideas@gmail.com wrote:
I agree with Paul, @/@@ just look scary as operators. Here's a few multi-character options I've come up with that I would like to see shot down in flames before @/@@ are added to Python:
< (for multiplication, not sure about exponentiation. I only like it because it's the shortest thing I've come up with that looks somewhat like multiplication)
[*] / [**] ([] make me think 'matrix', but this might be confusing to the parser)
|*| / |**| (pretty close to [], shouldn't confuse the parser)
I really liked the [*] or |*| "footnote operator" idea, and got all excited, until I tried it :-). Using the PEP's example of a linear hypothesis test, compare:
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)
versus
S = (H [*] beta - r).T [*] inv(H [*] V [*] H.T) [*] (H [*] beta - r)
S = (H |*| beta - r).T |*| inv(H |*| V |*| H.T) |*| (H |*| beta - r)
A big part of the motivation for wanting an infix operator for this kind of expression is that when using function syntax, the visual clutter from all the required parentheses makes expressions hard to parse by eye. [*] has this same problem of introducing visual nesting and making it hard to pick out which vertical shapes are actually grouping parens, and which are the operators. E.g., try counting how many sub-expressions there are in each of these expressions -- I find it much easier to pick out the 3 groups in the top example than in the bottom two. Sort of a variant of leaning toothpick syndrome...
-n
On Fri, Mar 14, 2014 at 12:20 PM, Nathaniel Smith njs@pobox.com wrote:
< (for multiplication, not sure about exponentiation. I only like it
because it's the shortest thing I've come up with that looks somewhat like multiplication)
[*] / [**] ([] make me think 'matrix', but this might be confusing to the parser)
|*| / |**| (pretty close to [], shouldn't confuse the parser)
I really liked the [*] or |*| "footnote operator" idea, and got all excited, until I tried it :-).
+1
Consider this:
A[*](B+C)
vs.
A[0]*(B+C)
the former looks like the latter fat-fingered.
Given that in numpy it is not uncommon to have ":", ":,:", "...", or even "()" inside [], [*] is more likely to be interpreted by those who first see it as some new form of fancy indexing rather than matrix multiplication.
On Fri, Mar 14, 2014 at 11:20 AM, Nathaniel Smith njs@pobox.com wrote:
On Fri, Mar 14, 2014 at 3:46 PM, Zachary Ware zachary.ware+pyideas@gmail.com wrote:
Here's a few multi-character options I've come up with that I would like to see shot down in flames before @/@@ are added to Python:
<snip>
A big part of the motivation for wanting an infix operator for this kind of expression is that when using function syntax, the visual clutter from all the required parentheses makes expressions hard to parse by eye. [*] has this same problem of introducing visual nesting and making it hard to pick out which vertical shapes are actually grouping parens, and which are the operators. E.g., try counting how many sub-expressions there are in each of these expressions -- I find it much easier to pick out the 3 groups in the top example than in the bottom two. Sort of a variant of leaning toothpick syndrome...
This, plus added text for the PEP in a later message, is enough of a fire for me :). Thank you.
I'll also admit that, again like Paul, I was reading in a proportional font. @/@@ is significantly less bad in fixed width, but I still don't love it. I certainly won't stand in the way if it's determined to be the least bad option, though.
On Fri, Mar 14, 2014 at 2:53 PM, Paul Moore p.f.moore@gmail.com wrote:
One genuine question though - when the PEP was developed, were multi-character operators like .* or <*> considered? A "rejected alternative operator symbols" would be a useful addition to the PEP (although it'd rob all us non-experts of the opportunity to bikeshed :-))
On a related note, the @@ operator is visually dreadful (far too heavy). While I see the */** analogy, and I appreciate that there's few good options, I'd definitely like to see some evidence that it's "the best of a bad lot" in the PEP.
It's worth noting that @@ will probably see marginal use -- matrix power and matrix inversion are not common operations in number crunching code. Matrix inversion is very common in math formulas, but on a computer you almost always want to use a fused invert+multiply operation instead of inversion itself. My original draft didn't even allow '@@ -1'; I added it at the request of the symbolic math guys. In practice the main use case for @@ will probably be as a crutch for beginners before they learn about better tools like numpy.linalg.solve (which implements fused invert+multiply).
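A minimal numpy sketch of the fused invert+multiply point (the matrix and vector here are just illustrative values):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Mathematically this is inv(A) @ b, but solve() factorizes A directly,
# which is both faster and numerically better behaved than forming
# the explicit inverse.
x = np.linalg.solve(A, b)

assert np.allclose(A @ x, b)
assert np.allclose(x, np.linalg.inv(A) @ b)
```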
Anyway, I wrote some more text for the PEP, see what you think:
Rationale for specification details ===================================
Choice of operator ------------------
Why ``@`` instead of some other spelling? There isn't any consensus across other programming languages about how this operator should be named [#matmul-other-langs]_; here we discuss the various options.
Restricting ourselves only to symbols present on US English keyboards, the punctuation characters that don't already have a meaning in Python expression context are: ``@``, backtick, ``$``, ``!``, and ``?``. Of these options, ``@`` is clearly the best; ``!`` and ``?`` are already heavily freighted with inapplicable meanings in the programming context, backtick has been banned from Python by BDFL pronouncement (see PEP 3099), and ``$`` is uglier, even more dissimilar to ``*`` and :math:`\cdot`, and has Perl/PHP baggage. ``$`` is probably the second-best option of these, though.
Symbols which are not present on US English keyboards start at a significant disadvantage (having to spend 5 minutes at the beginning of every numeric Python tutorial just going over keyboard layouts is not a hassle anyone really wants). Plus, even if we somehow overcame the typing problem, it's not clear there are any that are actually better than ``@``. Some options that have been suggested include:
* U+00D7 MULTIPLICATION SIGN: ``A × B``
* U+22C5 DOT OPERATOR: ``A ⋅ B``
* U+2297 CIRCLED TIMES: ``A ⊗ B``
* U+00B0 DEGREE: ``A ° B``
What we need, though, is an operator that means "matrix multiplication, as opposed to scalar/elementwise multiplication". There is no conventional symbol for these in mathematics or programming, where these operations are usually distinguished by context. (And U+2297 CIRCLED TIMES is actually used conventionally to mean exactly the opposite: elementwise multiplication -- the "Hadamard product" -- as opposed to matrix multiplication). ``@`` at least has the virtue that it *looks* like a funny non-commutative operator; a naive user who knows maths but not programming couldn't look at ``A * B`` versus ``A × B``, or ``A * B`` versus ``A ⋅ B``, or ``A * B`` versus ``A ° B`` and guess which one is the usual multiplication, and which one is the special case.
Finally, there is the option of using multi-character tokens. Some options:
* Matlab uses a ``.*`` operator. Aside from being visually confusable with ``*``, this would be a terrible choice for us because in Matlab, ``*`` means matrix multiplication and ``.*`` means elementwise multiplication, so using ``.*`` for matrix multiplication would make us exactly backwards from what Matlab users expect.
* APL apparently used ``+.×``, which by combining a multi-character token, confusing attribute-access-like . syntax, and a unicode character, ranks somewhere below U+2603 SNOWMAN on our candidate list. If we like the idea of combining addition and multiplication operators as being evocative of how matrix multiplication actually works, then something like ``+*`` could be used -- though this may be too easy to confuse with ``*+``, which is just multiplication combined with the unary ``+`` operator.
* PEP 211 suggested ``~*`` and ``~**``. This has the downside that it sort of suggests that there is a unary ``*`` operator that is being combined with unary ``~``, but it could work.
* R uses ``%*%`` for matrix multiplication. In R this forms part of a general extensible infix system in which all tokens of the form ``%foo%`` are user-defined binary operators. We could steal the token without stealing the system.
* Some other plausible candidates that have been suggested: ``><`` (= ascii drawing of the multiplication sign ×); the footnote operators ``[*]`` and ``[**]`` or ``|*|`` and ``|**|`` (but when used in context, the use of vertical grouping symbols tends to recreate the nested parentheses visual clutter that was noted as one of the major downsides of the function syntax we're trying to get away from); ``^*`` and ``^^``.
So, it doesn't matter much, but ``@`` seems as good or better than any of the alternatives:
* It's a friendly character that Pythoneers are already used to typing in decorators, but the decorator usage and the math expression usage are sufficiently dissimilar that it would be hard to confuse them in practice.
* It's widely accessible across keyboard layouts (and thanks to its use in email addresses, this is true even of weird keyboards like those in phones).
* It's round like ``*`` and :math:`\cdot`.
* The mATrices mnemonic is cute.
* The use of a single-character token reduces the line-noise effect, and makes ``@@`` possible, which is a nice bonus.
* The swirly shape is reminiscent of the simultaneous sweeps over rows and columns that define matrix multiplication.
* Its asymmetry is evocative of its non-commutative nature.
On 14 March 2014 16:41, Nathaniel Smith njs@pobox.com wrote:
Anyway, I wrote some more text for the PEP, see what you think:
From my POV, this pretty much covers it. Thanks :-)
Paul
On Fri, Mar 14, 2014 at 12:41 PM, Nathaniel Smith njs@pobox.com wrote:
(And U+2297 CIRCLED TIMES is actually used conventionally to mean exactly the opposite: elementwise multiplication -- the "Hadamard product" -- as opposed to matrix multiplication).
It's actually worse: CIRCLED TIMES is commonly used for the tensor or outer product, which is the opposite of the proposed inner-product meaning of vec @ vec.
On 2014-03-14 16:41, Nathaniel Smith wrote:
- The mATrices mnemonic is cute.
I recommend dropping this mnemonic from the PEP. It has been pointed out that the mnemonic does not cross languages[1], and even to my Anglo ears is not especially evocative.
[1] Python, like most programming languages, is unabashedly Anglocentric in its keywords, standard library and documentation, so some amount of Anglocentrism in new language features is unavoidable. My point is just that an Anglocentric mnemonic isn't really helping the argument for the PEP. If the PEP needs a mnemonic to survive, this is not going to cut it. If the PEP doesn't need a mnemonic, as I believe, it's best left out.
"M.-A. Lemburg" mal@egenix.com wrote:
Indeed. Numpy already uses a .T property for this.
Ah, good trick :-)
It is a better trick than you would expect.
Matlab and Fortran matrix transpose has complexity O(M*N) for an M x N matrix.
NumPy's .T attribute has complexity O(1)
With Numpy, we can put .T into matrix expressions with impunity. Now THAT is a good trick :-)
I am not aware of any other system where memory access is so flexible that matrix transposition can be done in O(1) time.
Regards, Sturla
On 14.03.2014 12:25, Robert Kern wrote:
On 2014-03-14 10:16, M.-A. Lemburg wrote:
I have some questions:
- Since in math, the operator is usually spelt "·" (the center dot, or "." but that's already reserved for methods and attributes in Python), why not try to use that instead of "@" (which in Python already identifies decorators) ?
I think the current feeling of the Python core team is against including non-ASCII characters in the language's keywords or operators. Even if that were not so, I would still recommend against it because it would be quite difficult to type. I don't know off-hand the key combination to do it on my native system, and it would change from system to system.
That's a fair argument. How about using the degree symbol instead: "°" ?
(A ° B).T == B.T ° A.T
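The identity Marc-Andre alludes to holds for matrix multiplication however it is spelt; with numpy's eventual @ it reads (matrices here are arbitrary illustrative values):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])      # 2x3
B = np.arange(12.0).reshape(3, 4)    # 3x4

# The transpose of a product reverses the factors.
assert np.allclose((A @ B).T, B.T @ A.T)
```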
On 2014-03-14 13:20, M.-A. Lemburg wrote:
On 14.03.2014 12:25, Robert Kern wrote:
On 2014-03-14 10:16, M.-A. Lemburg wrote:
I have some questions:
- Since in math, the operator is usually spelt "·" (the center dot, or "." but that's already reserved for methods and attributes in Python), why not try to use that instead of "@" (which in Python already identifies decorators) ?
I think the current feeling of the Python core team is against including non-ASCII characters in the language's keywords or operators. Even if that were not so, I would still recommend against it because it would be quite difficult to type. I don't know off-hand the key combination to do it on my native system, and it would change from system to system.
That's a fair argument. How about using the degree symbol instead: "°" ?
(A ° B).T == B.T ° A.T
That's still not ASCII, and I still don't know how to type it off the top of my head (US keyboard, OS X), though experimentation shows that Alt-0 does it easily enough. I don't know if it's universally that easy.
Your point is taken, though. I do find these smaller symbols more readable and similar to standard mathematical notation than an @ sign, which is as big or bigger than most uppercase characters. Unfortunately, ASCII leaves us few single-character options.
Where on earth is the degree sign on my keyboard? (Don't answer. It's a rhetorical question.)
On 2014-03-14 14:49, Guido van Rossum wrote:
Where on earth is the degree sign on my keyboard? (Don't answer. It's a rhetorical question.)
To be fair to Marc-Andre, there is a key for this right on his keyboard. It is we benighted Americans who lack it.
http://en.wikipedia.org/wiki/German_keyboard_layout
On 14.03.2014 15:56, Robert Kern wrote:
On 2014-03-14 14:49, Guido van Rossum wrote:
Where on earth is the degree sign on my keyboard? (Don't answer. It's a rhetorical question.)
To be fair to Marc-Andre, there is a key for this right on his keyboard. It is we benighted Americans who lack it.
The keyboard is indeed where I got the idea from :-)
I didn't know that it's not available on US keyboards; only German and French keyboard appear to have a key for it:
http://en.wikipedia.org/wiki/Degree_symbol#Keyboard_entry
Ok, enough bike shedding for today :-)
well, on linux, everyone has “°”.
on american layouts, it’s AltGr+Shift+0, I think.
also i think it’s one reason why you americans love your funny imperial units so much: you can’t even *type* “23°C” if you’re not on linux.
On Fri, Mar 14, 2014 at 04:29:02PM +0100, "Philipp A." flying-sheep@web.de wrote:
well, on linux, everyone has “°”.
Well, not on my linux.
PS. I am -10**6 on any non-ascii character.
Oleg.
On Fri, 14 Mar 2014 14:56:31 +0000 Robert Kern robert.kern@gmail.com wrote:
On 2014-03-14 14:49, Guido van Rossum wrote:
Where on earth is the degree sign on my keyboard? (Don't answer. It's a rhetorical question.)
To be fair to Marc-Andre, there is a key for this right on his keyboard. It is we benighted Americans who lack it.
Just for the record, on a French keyboard the combo is Shift+).
Regards
Antoine.
Robert Kern <robert.kern@...> writes:
[...] Unfortunately, ASCII leaves us few single-character options.
Putting aside the ASCII problem, ° is easily written using an AZERTY keyboard. It is smaller and less convoluted than @ and looks like the mathematical notation for function composition, which is similar to matrix multiplication. Still, not ASCII and not displayed on every keyboard...
I think the fundamental issue with the degree sign is quite simply that it is *not-ASCII*. If we are willing to expand the syntax of Python to include other characters, then we should just use Unicode Character 'DOT OPERATOR' (U+22C5), which is actually the *exact* correct thing, not just "something that looks a bit similar."
If we are worried about "stuff that's easy to enter on the keyboard", there's no reason AZERTY is necessarily more relevant than Dubeolsik or JCUKEN. And if we aim for "something that looks similar" we probably have lots of options on some keyboard layout in the world.
On Fri, Mar 14, 2014 at 7:54 AM, Joseph Martinot-Lagarde <joseph.martinot-lagarde@m4x.org> wrote:
[...] Still, not ASCII and not displayed on every keyboard...
On Fri, 14 Mar 2014 14:54:02 +0000 (UTC) Joseph Martinot-Lagarde joseph.martinot-lagarde@m4x.org wrote:
Putting aside tha ascii problem, ° is easily written using a AZERTY keyboard. It is smaller and less convoluted than @ and looks like the mathematical notation for function composition, which is similar to matrix multiplication.
Perhaps we should keep it for the temperature literals that will appear in Python 4, though :-)
Regards
Antoine.
On Fri, Mar 14, 2014 at 1:20 PM, M.-A. Lemburg mal@egenix.com wrote:
[...]
That's a fair argument. How about using the degree symbol instead: "°" ?
(A ° B).T == B.T ° A.T
Well, obviously we can bikeshed this all day :-). For reasons the draft PEP goes into in more detail, what we need is a symbol that means "matrix rather than scalar/elementwise multiplication", and there is no existing conventional symbol (either in math or programming) that has that meaning -- \cdot, \degree, etc. don't have that meaning any more than @ does. So we have to make something up.
My feeling is that @ is the least-bad option, and -- as the person who's maybe been staring at these code samples the longest -- I found that I got used to it pretty quickly. But I don't think debating is going to lead to any "obviously best" answer, at some point we'll just have to pick something. So maybe it's more useful to focus more on the other parts of the proposal? :-)
-n
On 03/14/2014 04:25 AM, Robert Kern wrote:
Some more from real code: [...]
Arrgggghhhhhhh!! Uncle! Uncle!
+1!
-- ~Ethan~
On Fri, Mar 14, 2014 at 11:25 AM, Robert Kern robert.kern@gmail.com wrote:
On 2014-03-14 10:16, M.-A. Lemburg wrote:
Now since this is all about syntactic sugar, we also need to look at some code examples:
[...]
c = A @ v vs. c = A · v vs. c = A.dot(v)
Hmm, even though I'd love to see matrix operators in Python, I don't think they really add clarity to the syntax of matrix calculations - a bit disappointing, I must say :-(
Some more from real code:

RSR = R.dot(var_beta.dot(R.T))
RSR = R @ var_beta @ R.T

xx_inv.dot(xeps.dot(xx_inv))
xx_inv @ xeps @ xx_inv

dF2lower_dper.dot(F2lower.T) + F2lower.dot(dF2lower_dper.T) - 4/period*F2lower.dot(F2lower.T)
dF2lower_dper @ F2lower.T + F2lower @ dF2lower_dper.T - 4/period*(F2lower @ F2lower.T)

dFX_dper.dot(Gi.dot(FX2.T)) - FX.dot(Gi.dot(dG_dper.dot(Gi.dot(FX2.T)))) + FX.dot(Gi.dot(dFX2_dper.T))
(dFX_dper @ Gi @ FX2.T) - (FX @ Gi @ dG_dper @ Gi @ FX2.T) + (FX @ Gi @ dFX2_dper.T)

torient_inv.dot(tdof).dot(torient).dot(self.vertices[parent].meta['key'])
((torient_inv @ tdof) @ torient) @ self.vertices[parent].meta['key']
For those skimming along, note that there's also a more detailed example in the draft PEP, looking at a few different aspects of code usability: https://github.com/numpy/numpy/pull/4351/files#diff-dd521035cec59cd5fb2e040f...
-n
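For readers trying the @ spelling at home: the operator dispatches to a special method, which PEP 465 ultimately named __matmul__ (it landed in Python 3.5). A minimal sketch with a toy Mat class -- the class name and its rows attribute are made up purely for illustration:

```python
# Toy illustration of how @ dispatches to __matmul__ (Python 3.5+).
# "Mat" is a hypothetical class, not part of any library.

class Mat:
    def __init__(self, rows):
        self.rows = rows  # list of lists

    def __matmul__(self, other):
        # plain triple-loop matrix multiply
        n, k, m = len(self.rows), len(other.rows), len(other.rows[0])
        return Mat([[sum(self.rows[i][j] * other.rows[j][c]
                         for j in range(k))
                     for c in range(m)] for i in range(n)])

    @property
    def T(self):
        # transpose, matching the .T spelling used in the examples above
        return Mat([list(col) for col in zip(*self.rows)])

A = Mat([[1, 2], [3, 4]])
B = Mat([[5, 6], [7, 8]])
C = (A @ B).rows   # [[19, 22], [43, 50]]
```

With this in place, chains like R @ var_beta @ R.T read left to right with no nested method calls, which is exactly the point of the examples above.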
Hi,
On Fri, 14 Mar 2014 01:59:14 +0000 Nathaniel Smith njs@pobox.com wrote:
PEP 211, which instead adds an operator for itertools.product, aka, "maybe we can sneak matrix multiply past Guido in some sort of... large, wooden rabbit..."
A chocolate rabbit might be tastier for some core developers ;-)
Ok, so first I'm gonna state upfront that I don't have any personal interest in this proposal (i.e. I don't do matrix multiplications often enough for it to matter to me). That said:
This PEP proposes two new binary operators dedicated to matrix multiplication and matrix power, spelled ``@`` and ``@@`` respectively. (Mnemonic: ``@`` is ``*`` for mATrices.)
That sounds like a strange choice to me. There's nothing in "@" or "@@" that suggests "matrices" or "multiplication". Also, are there precedents in other languages?
AFAIK, other languages sometimes use ".*" for matrix multiplication (as opposed to element-wise multiplication). Why not re-use that convention? Would that lead to parsing difficulties?
That said, I'm rather sympathetic to the general idea, and if "@" is the least contentious option, then I'm ok with it too ("@@" doesn't sound like a good idea at all, though).
Thanks for writing this PEP!
Regards
Antoine.
On Fri, Mar 14, 2014 at 11:24:26AM +0100, Antoine Pitrou wrote:
This PEP proposes two new binary operators dedicated to matrix multiplication and matrix power, spelled ``@`` and ``@@`` respectively. (Mnemonic: ``@`` is ``*`` for mATrices.)
That sounds like a strange choice to me. There's nothing in "@" or "@@" that suggests "matrices" or "multiplication".
The PEP even gives a mnemonic for it: @ is for mATrix multiplication.
There's nothing in @ that suggests "decorator", nor is there anything about * that suggests scalar multiplication.
Out of the common symbols used in Python, & is possibly the only operator symbol with a non-arbitrary connection to its meaning. & was originally a ligature of "et", the Latin word for "and". Everything else is pretty much arbitrary, and it's only habit and long use that makes them seem normal.
AFAIK, other languages sometimes use ".*" for matrix multiplication (as opposed to element-wise multiplication). Why not re-use that convention? Would that lead to parsing difficulties?
"Syntax should not look like grit on Tim Peters' monitor."
.* is, in my opinion, pretty ugly, and leads to even ugly extensions:
a .* b    # matrix multiplication
a .*= b   # in-place matrix multiplication
a .** b   # matrix exponentiation
a .**= b  # in-place matrix exponentiation
And it suffers from exactly the same lack of connection to matrix multiplication as @ does: there is nothing about . that spells "matrix".
It also has the problem that it could cause confusion with vector dot product. Read out A .* B aloud and you get something that sounds like it might mean dot product, "A dot times B".
That said, I'm rather sympathetic to the general idea, and if "@" is the least contentious option, then I'm ok with it too ("@@" doesn't sound like a good idea at all, though).
I don't see why. Since Python uses * and ** for scalar multiplication and exponentiation, and given that @ is accepted as matrix multiplication, then it is completely logical to follow the same pattern, which gives us @@ for matrix exponentiation. That's no sillier than using ** for exponentiation, instead of ^ as the Gods intended :-)
On Fri, 14 Mar 2014 23:32:39 +1100 Steven D'Aprano steve@pearwood.info wrote:
On Fri, Mar 14, 2014 at 11:24:26AM +0100, Antoine Pitrou wrote:
This PEP proposes two new binary operators dedicated to matrix multiplication and matrix power, spelled ``@`` and ``@@`` respectively. (Mnemonic: ``@`` is ``*`` for mATrices.)
That sounds like a strange choice to me. There's nothing in "@" or "@@" that suggests "matrices" or "multiplication".
The PEP even gives a mnemonic for it: @ is for mATrix multiplication.
Well, "@" is not pronounced "at" in every country, so your mnemonic will only work for a subset of Python users.
There's nothing in @ that suggests "decorator", nor is there anything about * that suggests scalar multiplication.
Apart from the fact that "*" is commonly used for multiplication in most programming languages, you mean?
("@" either spells "decorator" or "in reply to" or "e-mail address" to me, depending on the context)
It also has the problem that it could cause confusion with vector dot product. Read out A .* B aloud and you get something that sounds like it might mean dot product, "A dot times B".
That is true, and I realize I might have got the convention reversed :-)
But, again, "dot product" is an English term which has no direct equivalent in other languages. In French, "dot product" doesn't exist, we only say "scalar product" and therefore the confusion doesn't exist.
That said, I'm rather sympathetic to the general idea, and if "@" is the least contentious option, then I'm ok with it too ("@@" doesn't sound like a good idea at all, though).
I don't see why. Since Python uses * and ** for scalar multiplication and exponentiation, and given that @ is accepted as matrix multiplication, then it is completely logical to follow the same pattern, which gives us @@ for matrix exponentiation.
It's logical, but it's much less useful, and it's also starting to look really weird: since the operator space is scarce, it makes sense to only provide those that have common use.
Regards
Antoine.
The reason @ is used for reply is because the mnemonic is "at soandso". Probably if people can handle that they can handle it here too.
On Mar 14, 2014, at 8:57 AM, Antoine Pitrou solipsis@pitrou.net wrote:
("@" either spells "decorator" or "in reply to" or "e-mail address" to me, depending on the context)
On Fri, 14 Mar 2014 09:01:38 -0400 Donald Stufft donald@stufft.io wrote:
The reason @ is used for reply is because the mnemonic is "at soandso". Probably if people can handle that they can handle it here too.
It's not about "handling it", it's about finding the best candidate.
Regards
Antoine.
On 2014-03-14, at 14:01 , Donald Stufft donald@stufft.io wrote:
The reason @ is used for reply is because the mnemonic is "at soandso". Probably if people can handle that they can handle it here too.
People don't "handle" anything, they learn the semantics of the @ symbol in-context, whatever name they give it. That does not mean non-english-speakers associate it with the sound "at" (they don't, in my experience), even ignoring the dodginess of using "at" as a mnemonic for "matrix", and @ as a shortcut for that (doubly so in Python where @ already means "decorator")
On 03/14/2014 07:32 AM, Steven D'Aprano wrote:
AFAIK, other languages sometimes use ".*" for matrix multiplication (as opposed to element-wise multiplication). Why not re-use that convention? Would that lead to parsing difficulties?
[...] And it suffers from exactly the same lack of connection to matrix multiplication as @ does: there is nothing about . that spells "matrix".
The best I can come up with is to use '^' instead of dot.
a ^* b  # matrix multiplication
a ^^ b  # matrix exponentiation
Also two ^ together looks like a M. It's not a strong association, but it may help some people remember what it's for. Or the second one could be spelled... ^**, but I think the ^^ is cleaner.
Question? Is it possible for python to tell these apart at compile time?
a = 6
b = 4
c = 8
a^b
2
a ^b c
File "<stdin>", line 1
  a ^b c
       ^
SyntaxError: invalid syntax
So that this later example does...
a.__use__(self, "b", c)
The ^b is unrelated to the name b in the second case. Or is this just not possible?
If it could be done, then ...
a ^* b  -->  a.__use__(self, "*", b)
a ^^ b  -->  a.__use__(self, "^", b)
Where the __use__ method is defined on a module level object. (and not defined on builtin's as a general rule.)
def __use__(self, s, other):
    if s == "*":
        ...  # do matrix multiply
        return result
    elif s == "^":
        ...  # do matrix exponentiation
        return result
    raise TypeError
Anyway to make this work nicely?
Cheers, Ron
On 03/14/2014 10:41 AM, Ron Adam wrote:
[...] The ^b is unrelated to the name b in the second case. Or is this just not possible?

One more comment, if this idea is limited to only allow symbols that can't be seen in identifiers, then the conflict goes away also. That's probably not an unreasonable restriction.

If it could be done, then ... [...]
I would have liked it to be more general, but this could work for just symbols. It could also be extended to allow ^'string' at some later date.
Cheers, Ron
Ron Adam writes:
The best I can come up with is to use '^' instead of dot.
This symbol is used to denote the "wedge product" of vectors, so it's pretty clearly inappropriate. I forget the exact definition, but it's related to the area (volume) of the parallelepiped spanned by the vectors.
On Thu, Mar 13, 2014 at 6:59 PM, Nathaniel Smith njs@pobox.com wrote: [...]
PEP: XXXX
Title: Dedicated infix operators for matrix multiplication and matrix power
Version: $Revision$
Last-Modified: $Date$
Author: Nathaniel J. Smith <njs@pobox.com>
Status: Draft
Type: Standards Track
Python-Version: 3.5
Content-Type: text/x-rst
Created: 20-Feb-2014
Post-History:
I still have to read this but I've assigned a PEP number. Henceforth this will be known as PEP 465. So far I really like what I've read.
On the bikeshedding, if someone wants to start introducing Unicode (non-ASCII) characters as operators they would have to propose a separate PEP arguing the benefits of Unicode -- I am heavily opposed to doing that ATM, for reasons that have already been brought up in the bikeshed.
I have now read the PEP, and I think it's good. I think it's a waste of time to keep bikeshedding on the choice of operator -- @ is the best compromise. I do have a few specific notes:
- Right associativity is not unheard of in Python. E.g. **. If you think that for other reasons @ should be right associative, don't let Python's tradition stop you. But then you need to decide which of * and @ binds more tightly -- e.g. does a*b@c mean a*(b@c) or (a*b)@c? And if you choose the latter, it follows that a@b*c means a@(b*c) -- is that okay? (And similar examples exist for the other choice.)
- Did you consider a duck-typing (is that the word?) attribute? E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication. (Your use of .T as "transpose" made me think of this.) Of course the question is, can you get those packages that currently use * for matrix multiply to comply? (I don't consider this a serious counter-proposal. But you list a bunch of rejected alternatives; this could be in that list.)
- Is @@ really necessary? It seems you are adding it mostly because it's cute and because of the parallel with **, not because it is actually important enough to add new syntax. And then later you use it as an argument for @, which seems a bit circular. Also, if we were to make @ right-associative, the parallel with ** is already imperfect.
- For better counts of usages, perhaps Sourcegraph.com might help? It is a source code query engine that has a Python parser and (limited) type inference built in (also separately available as pysonar on github IIRC). To be clear, I don't need more numbers to be convinced.
Once we've decided on associativity and @@, I'm ready to accept.
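Guido's grouping question can be probed empirically with instrumented operands. Under the precedence Python 3.5 eventually adopted (@ has the same precedence as * and is left-associative), a*b@c groups as (a*b)@c. A small sketch -- the Op class is invented purely to log the order of operator applications:

```python
# Probe how a*b@c groups by logging each operator application.
# In Python 3.5+, @ binds exactly like * and is left-associative,
# so a * b @ c evaluates as (a * b) @ c.

class Op:
    def __init__(self, name, log):
        self.name, self.log = name, log

    def __mul__(self, other):
        self.log.append(f"({self.name}*{other.name})")
        return Op(self.log[-1], self.log)

    def __matmul__(self, other):
        self.log.append(f"({self.name}@{other.name})")
        return Op(self.log[-1], self.log)

log = []
a, b, c = Op("a", log), Op("b", log), Op("c", log)
result = a * b @ c
# log is now ["(a*b)", "((a*b)@c)"] -- the * applied first
```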
A simple suggestion: what about defining the special methods as __at__, __rat__, __iat__ (@) and __atat__, __ratat__, __iatat__ (@@)? This makes the operator more "neutral" -- it makes it sound less "wrong" to overload it for something else than matrix multiplication (which is certainly an important topic, but even though I use numpy quite a lot, I don't actually use it at all for linear algebra (in fact I use elementwise multiplications much more often) -- so counting imports of numpy is a somewhat biased metric for counting users of matrix multiplication).
Antony
On Fri, Mar 14, 2014 at 11:41 AM, Antony Lee antony.lee@berkeley.eduwrote:
A simple suggestion: what about defining the special methods as __at__, __rat__, __iat__ (@) and __atat__, __ratat__, __iatat__ (@@)? [...]
Ratatatatatat! :-)
On Fri, Mar 14, 2014 at 11:53:47AM -0700, Guido van Rossum guido@python.org wrote:
Ratatatatatat! :-)
Is this the sound made by matrices being multiplied? ;-)
Oleg.
Oleg Broytman wrote:
On Fri, Mar 14, 2014 at 11:53:47AM -0700, Guido van Rossum guido@python.org wrote:
Ratatatatatat! :-)
Is this the sound made by matrices being multiplied? ;-)
I think it's the sound of a PEP idea being lined up against the wall and shot. :-(
On 15 Mar 2014 04:42, "Antony Lee" antony.lee@berkeley.edu wrote:
A simple suggestion: what about defining the special methods as __at__, __rat__, __iat__ (@) and __atat__, __ratat__, __iatat__ (@@)? [...]
The method name for "*" is "__mul__" rather than "__star__", and so on for the other types. Naming the magic methods for their intended semantics rather than their syntax is an established pattern.
A few other miscellaneous comments:
- nice work on the PEP Nathaniel!
- as with others, "@" as the operator doesn't thrill me, but I also think it crosses the threshold of "good enough given the constraints"
- the PEP should probably recommend adding an "operator.matmul" function, a "PyObject_MatrixMultiply" C API, and consider whether or not the new special method should be given a C level type slot.
Cheers, Nick.
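Nick's suggestions were in fact adopted: Python 3.5 ships operator.matmul and operator.imatmul, the PyObject_MatrixMultiply C API, and an nb_matrix_multiply type slot. A quick illustration of the functional form -- the M class here is a made-up stand-in for any type implementing __matmul__:

```python
# The operator module mirrors each infix operator with a function;
# operator.matmul and operator.imatmul were added alongside the @
# operator itself in Python 3.5.
import operator

class M:
    def __matmul__(self, other):
        return "matmul called"

operator.matmul(M(), M())   # "matmul called"
operator.mul(6, 7)          # 42, for comparison
```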
Antony
2014-03-14 10:53 GMT-07:00 Guido van Rossum guido@python.org:
I have now read the PEP, and I think it's good. I think it's a waste of
time to keep bikeshedding on the choice of operator -- @ is the best compromise. I do have a few specific notes:
Right associativity is not unheard of in Python. E.g. **. If you think that for other reasons @ should be right associative, don't let Python's tradition stop you. But then you need to decide which of * and @ binds more tightly -- e.g. does a*b@c mean a*(b@c) or (a*b)@c? And if you choose the latter, it follows that a@b*c means a@(b*c) -- is that okay? (And similar examples exist for the other choice.)
Did you consider a duck-typing (is that the word?) attribute? E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication. (Your use of .T as "transpose" made me think of this.) Of course the question is, can you get those packages that currently use * for matrix multiply to comply? (I don't consider this a serious counter-proposal. But you list a bunch of rejected alternatives; this could be in that list.
Is @@ really necessary? It seems you are adding it mostly because it's cute and because of the parallel with **, not because it is actually important enough to add new syntax. And then later you use it as an argument for @, which seems a bit circular. Also, if we were to make @ right-associative, the parallel with ** is already imperfect.
For better counts of usages, perhaps Sourcegraph.com might help? It is a source code query engine that has a Python parser and (limited) type inference built in (also separately available as pysonar on github IIRC). To be clear, I don't need more numbers to be convinced.
Once we've decided on associativity and @@, I'm ready to accept.
-- --Guido van Rossum (python.org/~guido)
Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Fri, Mar 14, 2014 at 3:15 PM, Nick Coghlan ncoghlan@gmail.com wrote:
The method name for "*" is "__mul__" rather than "__star__"
Yes, but just think. With a couple extra "@"s we could start defining __ratatatat__ methods (http://www.merriam-webster.com/audio.php?file=ratata02&word=rat-a-tat-tat&text=%5C%CB%8Cra-t%C9%99-%CB%8Cta(t)-%CB%88tat%5C). :-)
Skip
On Fri, Mar 14, 2014 at 8:15 PM, Nick Coghlan ncoghlan@gmail.com wrote:
A few other miscellaneous comments:
- nice work on the PEP Nathaniel!
Thanks!
- as with others, "@" as the operator doesn't thrill me, but I also think it
crosses the threshold of "good enough given the constraints"
- the PEP should probably recommend adding an "operator.matmul" function, a
"PyObject_MatrixMultiply" C API and consider whether or not the new special method should be given a C level type slot.
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
On 18 Mar 2014 09:08, "Nathaniel Smith" njs@pobox.com wrote:
On Fri, Mar 14, 2014 at 8:15 PM, Nick Coghlan ncoghlan@gmail.com wrote:
A few other miscellaneous comments:
- nice work on the PEP Nathaniel!
Thanks!
- as with others, "@" as the operator doesn't thrill me, but I also think it crosses the threshold of "good enough given the constraints"
- the PEP should probably recommend adding an "operator.matmul" function, a "PyObject_MatrixMultiply" C API and consider whether or not the new special method should be given a C level type slot.
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
I suspect you're going to want one, as without it, the implementation method ends up in the class dict instead (the context management protocol works that way).
I suspect the design we will want is a new struct for Py_Matrix slots (akin to those for numbers, etc). The alternative would be to just add more "Number" slots, but that isn't really accurate.
I'll come up with a more specific proposal after refreshing my memory of the exact details of the current layout.
Cheers, Nick.
-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org
On Mar 18, 2014, at 1:02, Nick Coghlan ncoghlan@gmail.com wrote:
On 18 Mar 2014 09:08, "Nathaniel Smith" njs@pobox.com wrote:
On Fri, Mar 14, 2014 at 8:15 PM, Nick Coghlan ncoghlan@gmail.com wrote:
A few other miscellaneous comments:
- nice work on the PEP Nathaniel!
Thanks!
- as with others, "@" as the operator doesn't thrill me, but I also think it
crosses the threshold of "good enough given the constraints"
- the PEP should probably recommend adding an "operator.matmul" function, a
"PyObject_MatrixMultiply" C API and consider whether or not the new special method should be given a C level type slot.
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
I suspect you're going to want one, as without it, the implementation method ends up in the class dict instead (the context management protocol works that way).
I suspect the design we will want is a new struct for Py_Matrix slots (akin to those for numbers, etc). The alternative would be to just add more "Number" slots, but that isn't really accurate.
Someone needs to work out how these changes will affect the numpy C API (the Py_Array dotfunc slot, the PyArray interface, etc.). Whoever is doing that--and the numpy C API users who are interested in it--will probably care more about how matmul fits into the python C API than anyone else does.
As for whether this is another numeric operation or a new kind of operation, I think that depends on how you intuitively interpret matrix operations, or rather which intuition numpy and similar libraries should be encouraging. The fact that the proposal suggests that >2d arrays should handle matmul by broadcasting it over the final 2d subarrays seems like it's treating it fundamentally as a numeric operation on (vectors and) matrices. On the other hand, I don't think anyone implementing a split-complex or quaternion library on top of numpy would want to use @ for multiplying their numbers, even if they used @ on the matrices as the implementation.
On 2014-03-18 08:02, Nick Coghlan wrote:
On 18 Mar 2014 09:08, "Nathaniel Smith" njs@pobox.com wrote:
On Fri, Mar 14, 2014 at 8:15 PM, Nick Coghlan ncoghlan@gmail.com wrote:
A few other miscellaneous comments:
- nice work on the PEP Nathaniel!
Thanks!
- as with others, "@" as the operator doesn't thrill me, but I also think it
crosses the threshold of "good enough given the constraints"
- the PEP should probably recommend adding an "operator.matmul" function, a
"PyObject_MatrixMultiply" C API and consider whether or not the new special method should be given a C level type slot.
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
I suspect you're going to want one, as without it, the implementation method ends up in the class dict instead (the context management protocol works that way).
I suspect the design we will want is a new struct for Py_Matrix slots (akin to those for numbers, etc). The alternative would be to just add more "Number" slots, but that isn't really accurate.
Would it be more palatable if the name were something like __altmul__ or __auxmul__ rather than __matmul__? Really, it's just a second multiplication-like operator. The leading use case for a second multiplication-like operator happens to be matrix multiplication, but I strongly suspect it will get used for other mathematical things like symbolic function composition or operator application (as in "linear operator", not +-*/) and maybe some secondary multiplication types in the weirder groups and fields (you can bet I will resurrect my Clifford algebra module to use this operator for one of the several types of multiplication they support). Granted, there is still some awkwardness in that *none* of the builtin number types will support it.
And hey, look! It makes the @ux or @lt multiplication operator a sensible, language-independent mnemonic! (not that "auxiliary" or "alternate" work in all languages, but at least the 'a' is baked into the special method name)
On 18 March 2014 20:47, Robert Kern robert.kern@gmail.com wrote:
On 2014-03-18 08:02, Nick Coghlan wrote:
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
I suspect you're going to want one, as without it, the implementation method ends up in the class dict instead (the context management protocol works that way).
I suspect the design we will want is a new struct for Py_Matrix slots (akin to those for numbers, etc). The alternative would be to just add more "Number" slots, but that isn't really accurate.
So, here's the change to PyHeapTypeObject that makes the most sense to me (assuming both "@" and "@@" are added - drop the new "power" methods if "@@" is dropped from the PEP):
- add "PyMatrixMethods as_matrix;" as a new field in PyHeapTypeObject
- define PyMatrixMethods as:

typedef struct {
    binaryfunc mt_multiply;
    binaryfunc mt_power;
    binaryfunc mt_inplace_multiply;
    binaryfunc mt_inplace_power;
} PyMatrixMethods;
This approach increases the size of all type objects by one pointer.
The other way to do it would be to just add four new slots to PyNumberMethods:
binaryfunc nb_matrix_multiply;
binaryfunc nb_matrix_power;
binaryfunc nb_inplace_matrix_multiply;
binaryfunc nb_inplace_matrix_power;
This approach increases the size of all type objects that define one or more of the numeric functions by four pointers, and doesn't really make sense at a conceptual level. The latter is the main reason I prefer the separate PyMatrixMethods struct.
Other "should probably be listed in the PEP for completeness" change is that this will need new opcodes and AST nodes. Reviewing the current opcode list and node names, I would suggest:
BINARY_MATRIX_MULTIPLY
BINARY_MATRIX_POWER
INPLACE_MATRIX_MULTIPLY
INPLACE_MATRIX_POWER
MatMult | MatPow
Would it be more palatable if the name were something like __altmul__ or __auxmul__ rather than __matmul__? Really, it's just a second multiplication-like operator. The leading use case for a second multiplication-like operator happens to be matrix multiplication, but I strongly suspect it will get used for other mathematical things like symbolic function composition or operator application (as in "linear operator", not +-*/) and maybe some secondary multiplication types in the weirder groups and fields (you can bet I will resurrect my Clifford algebra module to use this operator for one of the several types of multiplication they support). Granted, there is still some awkwardness in that *none* of the builtin number types will support it.
I think "matmul" is fine. That makes the primary intended use case clear, without preventing its use for other purposes (like vector dot products or more exotic things). The magic method names are "add", "mul", "div", "mod", etc, even though we occasionally use them for other purposes (e.g. concatenation, sequence repetition, path joining, interpolation).
Cheers, Nick.
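For the record, this recommendation did stick: Python 3.5 shipped operator.matmul alongside the __matmul__/__rmatmul__/__imatmul__ special methods. A minimal illustration with a hypothetical class (the Mat class and its label attribute are invented here purely for demonstration):

```python
import operator

class Mat:
    """Hypothetical class implementing __matmul__, just to show the hook."""
    def __init__(self, label):
        self.label = label

    def __matmul__(self, other):
        # Record how the operands were combined rather than doing real math.
        return Mat("(%s @ %s)" % (self.label, other.label))

x, y = Mat("x"), Mat("y")
print((x @ y).label)                # the infix operator dispatches to __matmul__
print(operator.matmul(x, y).label)  # the operator-module function does the same
```

Both lines print "(x @ y)": operator.matmul is simply the functional spelling of the infix operator, matching operator.mul and friends.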
On 18.03.2014 12:27, Nick Coghlan wrote:
On 18 March 2014 20:47, Robert Kern robert.kern@gmail.com wrote:
On 2014-03-18 08:02, Nick Coghlan wrote:
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
I suspect you're going to want one, as without it, the implementation method ends up in the class dict instead (the context management protocol works that way).
I suspect the design we will want is a new struct for Py_Matrix slots (akin to those for numbers, etc). The alternative would be to just add more "Number" slots, but that isn't really accurate.
So, here's the change to PyHeapType object that makes the most sense to me (assuming both "@" and "@@" are added - drop the new "power" methods if "@@" is dropped from the PEP):
add "PyMatrixMethods as_matrix;" as a new field in PyHeapTypeObject
define PyMatrixMethods as:
typedef struct {
    binaryfunc mt_multiply;
    binaryfunc mt_power;
    binaryfunc mt_inplace_multiply;
    binaryfunc mt_inplace_power;
} PyMatrixMethods;
This approach increases the size of all type objects by one pointer.
The other way to do it would be to just add four new slots to PyNumberMethods:
binaryfunc nb_matrix_multiply;
binaryfunc nb_matrix_power;
binaryfunc nb_inplace_matrix_multiply;
binaryfunc nb_inplace_matrix_power;
This approach increases the size of all type objects that define one or more of the numeric functions by four pointers, and doesn't really make sense at a conceptual level. The latter is the main reason I prefer the separate PyMatrixMethods struct.
I don't think that it's a good idea to make all type objects larger just to address matrix multiplications which none of the builtin types will actually use, so +1 on adding to the number methods, even if a matrix isn't a number (they still use numbers, so it's not that far off :-)), but -1 on adding another slot struct.
Aside: Mechanisms such as namedtuple add lots of new type objects to the runtime environment, so it's no longer safe to assume that the size of type objects doesn't really matter much in real-life applications.
From: M.-A. Lemburg mal@egenix.com
Sent: Tuesday, March 18, 2014 5:18 AM
On 18.03.2014 12:27, Nick Coghlan wrote:
The other way to do it would be to just add four new slots to PyNumberMethods:
binaryfunc nb_matrix_multiply;
binaryfunc nb_matrix_power;
binaryfunc nb_inplace_matrix_multiply;
binaryfunc nb_inplace_matrix_power;
This approach increases the size of all type objects that define one or more of the numeric functions by four pointers, and doesn't really make sense at a conceptual level. The latter is the main reason I prefer the separate PyMatrixMethods struct.
I don't think that it's a good idea to make all type objects larger just to address matrix multiplications which none of the builtin types will actually use, so +1 on adding to the number methods, even if a matrix isn't a number (they still use numbers, so it's not that far off :-)), but -1 on adding another slot struct.
I don't see how putting them in PyNumberMethods doesn't make sense at a conceptual level. (Matrix, +, @) is a ring, and matrices form an associative algebra. Sure, matrices are collections of simpler numbers, and their operations are defined in terms of the operations of those simpler numbers, but the same thing is true for complex numbers, or even rationals. From a mathematical point of view, there's no principled definition of "number" (or, rather, if there is, it's only going to include the naturals or maybe the ordinals). If anything, the float type has a lot more problems as a "number" than a matrix type does. So if there is a good practical reason to call matrix multiplication a numeric operation, why not?
On 18 Mar 2014 22:18, "M.-A. Lemburg" mal@egenix.com wrote:
On 18.03.2014 12:27, Nick Coghlan wrote:
On 18 March 2014 20:47, Robert Kern robert.kern@gmail.com wrote:
On 2014-03-18 08:02, Nick Coghlan wrote:
operator.matmul and PyObject_MatrixMultiply are obvious enough, but I'm afraid I'm not too clear on the tradeoffs about adding a C level type slot, or even entirely sure what the alternative is. (I guess I just assumed that all special methods used C level type slots and there was nothing to think about.) Do you (or anyone) have any thoughts?
I suspect you're going to want one, as without it, the implementation method ends up in the class dict instead (the context management protocol works that way).
I suspect the design we will want is a new struct for Py_Matrix slots (akin to those for numbers, etc). The alternative would be to just add more "Number" slots, but that isn't really accurate.
So, here's the change to PyHeapType object that makes the most sense to me (assuming both "@" and "@@" are added - drop the new "power" methods if "@@" is dropped from the PEP):
add "PyMatrixMethods as_matrix;" as a new field in PyHeapTypeObject
define PyMatrixMethods as:
typedef struct {
    binaryfunc mt_multiply;
    binaryfunc mt_power;
    binaryfunc mt_inplace_multiply;
    binaryfunc mt_inplace_power;
} PyMatrixMethods;
This approach increases the size of all type objects by one pointer.
The other way to do it would be to just add four new slots to PyNumberMethods:

binaryfunc nb_matrix_multiply;
binaryfunc nb_matrix_power;
binaryfunc nb_inplace_matrix_multiply;
binaryfunc nb_inplace_matrix_power;
This approach increases the size of all type objects that define one or more of the numeric functions by four pointers, and doesn't really make sense at a conceptual level. The latter is the main reason I prefer the separate PyMatrixMethods struct.
I don't think that it's a good idea to make all type objects larger just to address matrix multiplications which none of the builtin types will actually use, so +1 on adding to the number methods, even if a matrix isn't a number (they still use numbers, so it's not that far off :-)), but -1 on adding another slot struct.
Aside: Mechanisms such as namedtuple add lots of new type objects to the runtime environment, so it's no longer safe to assume that the size of type objects doesn't really matter much in real-life applications.
So, here's the problem: operand precedence for sequences implemented in C is currently broken in certain relatively obscure cases (and has been since forever - we only found out about it when an SQLAlchemy failure while porting to PyPy revealed that PyPy was right and CPython was wrong). This is why returning NotImplemented from sq_concat and sq_repeat doesn't work right. The problem doesn't arise for sequences implemented in Python, because we always populate the nb_add and nb_multiply slots for those.
I've tried fixing that directly, and it turned abstract.c into an unmaintainable mess, so I abandoned that approach. My current preferred solution is now similar to the one we use from Python: drop the direct calls to sq_concat and sq_repeat from abstract.c, and instead automatically add nb_add and nb_multiply implementations during type creation that delegate dynamically to the sequence slots.
So a *lot* of types already have their PyNumberMethods slot allocated (including sequence types implemented in Python), and I'd like to expand that to include all sequence types, even those implemented in C.
That makes the space trade-off here substantially less clear, particularly if matrix power ends up being added in the future (it has been dropped from the current PEP).
I accept that adding the new slots to PyNumberMethods is more conceptually coherent than I thought, though. There's also the fact that we already mess about populating different slots for pragmatic reasons that have nothing to do with conceptual integrity :)
Cheers, Nick.
-- Marc-Andre Lemburg eGenix.com
Professional Python Services directly from the Source (#1, Mar 18 2014)
On 18.03.2014 21:58, Nick Coghlan wrote:
On 18 Mar 2014 22:18, "M.-A. Lemburg" mal@egenix.com wrote:
[...matrix method slots space tradeoff...]
[... need for implementing number methods even on non-number types ...] So a *lot* of types already have their PyNumberMethods slot allocated (including sequence types implemented in Python), and I'd like to expand that to include all sequence types, even those implemented in C.
That makes the space trade-off here substantially less clear, particularly if matrix power ends up being added in the future (it has been dropped from the current PEP).
Ok, point taken :-)
I accept that adding the new slots to PyNumberMethods is more conceptually coherent than I thought, though. There's also the fact that we already mess about populating different slots for pragmatic reasons that have nothing to do with conceptual integrity :)
Perhaps something to revisit for Python 4 ?!
The refactoring could lead to some space savings, untangle code and make the C API easier to understand.
Antony Lee wrote:
A simple suggestion: what about defining the special methods as __at__, __rat__, __iat__
Please, no -- the last thing we need in Python is iyats!
http://www.paravia.com/wiki/index.php5?title=Iyat
On 2014-03-14 17:53, Guido van Rossum wrote:
I have now read the PEP, and I think it's good. I think it's a waste of time to keep bikeshedding on the choice of operator -- @ is the best compromise. I do have a few specific notes:
- Right associativity is not unheard of in Python. E.g. **. If you think that for other reasons @ should be right associative, don't let Python's tradition stop you. But then you need to decide which of * and @ binds more tightly -- e.g. does a*b@c mean a*(b@c) or (a*b)@c? And if you choose the latter, it follows that a@b*c means a@(b*c) -- is that okay? (And similar examples exist for the other choice.)
I *think* either works out fine in practice, but I have a preference for @ binding tighter than *. `scalar * matrix @ vector` does fewer flops that way, and most current expressions are written with this binding anyways: `scalar * np.dot(matrix, vector)`. It just feels right to me.
- Did you consider a duck-typing (is that the word?) attribute?
Facade?
E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication. (Your use of .T as "transpose" made me think of this.) Of course the question is, can you get those packages that currently use * for matrix multiply to comply? (I don't consider this a serious counter-proposal. But you list a bunch of rejected alternatives; this could be in that list.)
It seems to me that the left-associativity of * makes this less useful than a dedicated operator if we consider chains of matrix multiplications.
A.M * B.M * C == (A.M * B.M) * C
We probably could make the rule that if the right-operand is one of these matmul-facades, then the result should also be a matmul-facade, but otherwise would be a plain numpy array. That would take care of this case, but it's not obvious to me that it's unambiguously the right thing to do in all cases.
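As a sketch of what such a facade might look like, here is a minimal pure-Python version (the MatView class and its behavior are hypothetical; in real numpy the multiplication would call np.dot and return an ndarray). The key property is that the result of a multiplication is a plain value, not another facade, so it does not propagate through one's code the way numpy.matrix does:

```python
class MatView:
    """Hypothetical '.M'-style facade: * means matrix multiplication.

    Only __mul__ is implemented; the result is a plain nested list,
    not another MatView, so the wrapper never leaks into later code.
    """
    def __init__(self, rows):
        self.rows = rows

    def __mul__(self, other):
        b = other.rows if isinstance(other, MatView) else other
        # Plain textbook matrix product over nested lists.
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                for row in self.rows]

A = MatView([[1, 2], [3, 4]])
B = [[5, 6], [7, 8]]
print(A * B)  # [[19, 22], [43, 50]]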
- Is @@ really necessary? It seems you are adding it mostly because it's cute and because of the parallel with **, not because it is actually important enough to add new syntax. And then later you use it as an argument for @, which seems a bit circular. Also, if we were to make @ right-associative, the parallel with ** is already imperfect.
In my personal experience, I would use a matrix power operator much less than a matrix multiplication operator. I, at least, am content to continue to use a function call for that.
On Fri, Mar 14, 2014 at 5:53 PM, Guido van Rossum guido@python.org wrote:
I have now read the PEP, and I think it's good. I think it's a waste of time to keep bikeshedding on the choice of operator -- @ is the best compromise. I do have a few specific notes:
- Right associativity is not unheard of in Python. E.g. **. If you think that for other reasons @ should be right associative, don't let Python's tradition stop you. But then you need to decide which of * and @ binds more tightly -- e.g. does a*b@c mean a*(b@c) or (a*b)@c? And if you choose the latter, it follows that a@b*c means a@(b*c) -- is that okay? (And similar examples exist for the other choice.)
Like Robert I have the suspicion that the best option is to make @ right-associative and place it just above (more tightly binding) than *. But I'll poll numpy-discussion and friends and see if anyone has ideas for more objective measures.
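(In the grammar that was eventually accepted for Python 3.5, @ ended up left-associative with the same precedence as *. Once the grammar is fixed, operator overloading makes it easy to probe how an expression parses; the Probe class below is just an illustrative helper, not anything from numpy:)

```python
class Probe:
    """Tiny helper that records how an expression was grouped by the parser."""
    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        return Probe("(%s * %s)" % (self.name, other.name))

    def __matmul__(self, other):
        return Probe("(%s @ %s)" % (self.name, other.name))

a, b, c = Probe("a"), Probe("b"), Probe("c")
print((a * b @ c).name)  # ((a * b) @ c) -- same precedence, left-associative
print((a @ b * c).name)  # ((a @ b) * c)
```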
- Did you consider a duck-typing (is that the word?) attribute? E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication. (Your use of .T as "transpose" made me think of this.) Of course the question is, can you get those packages that currently use * for matrix multiply to comply? (I don't consider this a serious counter-proposal. But you list a bunch of rejected alternatives; this could be in that list.)
This is an interesting suggestion! I think it hasn't received full consideration before because the tangle between numpy.ndarray and numpy.matrix means that the most obvious implementation for an ndarray.M attribute would be to return a full-fledged numpy.matrix object... and no-one wants to encourage proliferation of numpy.matrix objects. But returning a tiny special-purpose class that only implements __mul__ would avoid that objection. At that point it's basically a nicer version of the "user-defined infix" '*dot*' operator idea. It has similar problems with needing an unaesthetically magical implementation, producing a proliferation of special classes (one extra one for each array type), requiring an allocation on every call, etc., but this might well be the next-best thing to a real operator.
All in all, I think we'd rather have a real operator, so if you're happy to go with @ then I won't put lots of effort into finding horrible problems that force us to reject .M ;-), but I'll certainly add it to the PEP in any case.
- Is @@ really necessary? It seems you are adding it mostly because it's cute and because of the parallel with **, not because it is actually important enough to add new syntax. And then later you use it as an argument for @, which seems a bit circular. Also, if we were to make @ right-associative, the parallel with ** is already imperfect.
@@ hasn't received as much attention as the rest of the proposal, so I'll check in with numpy-discussion etc. on this as well. But my personal feeling is +0 on @@ -- all else being equal, it's nicer to have it than not. Is all else equal? Given the existence of @, the increase in language complexity is small (or arguably negative, since people who know *, **, @ may be surprised to find @@ missing, and have to memorize its non-existence as an extra rule); the opportunity cost is low (given the existence of @ I can't imagine we'll want to use the token @@ for something else later); and @@ will be used in real life, if not as often as @ itself -- 'vec @@ 2' for squared Euclidean length probably won't be too uncommon, 'matrix @@ n' gets used for things like simulating random walks on graphs or other Markov chains, and the 'matrix @@ -1' notation will probably be a nice win for beginners and other less-sophisticated programmers who just want to get some computation done in a way that doesn't require them to keep a more complicated mapping between math and code notation in their head or to care about numerical details. I think of @@ as being like, say, '%=' -- nothing one would ever add syntax for in isolation, but if designing an overall system of operators then it makes more sense.
Probably in the end it just comes down to your aesthetic judgement :-). Numeric folk will be fine either way.
(And I wouldn't consider @@ a *strong* argument for spelling '@' as '@'; it's just mentioned in that section because there aren't really *any* strong arguments for preferring one spelling versus another, we have to make a decision, and so a weak argument is better than nothing. Two useful operators for the complexity cost of one is a good deal? :shrug: Obviously if @@ gets dropped I'll drop that bullet point as well.)
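For reference, the 'matrix @@ n' semantics under discussion for positive integer n are just repeated matrix multiplication, conventionally implemented by repeated squaring. This pure-Python sketch on nested lists is illustrative only; in numpy the real work would be done by numpy.linalg.matrix_power:

```python
def mat_mul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def mat_power(m, n):
    """Matrix power for positive integer n, by repeated squaring."""
    assert n >= 1
    result = None
    base = m
    while n:
        if n & 1:
            # Fold the current power of two into the running product.
            result = base if result is None else mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result

# The Fibonacci matrix raised to the 5th power: [[F6, F5], [F5, F4]].
print(mat_power([[1, 1], [1, 0]], 5))  # [[8, 5], [5, 3]]
```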
- For better counts of usages, perhaps Sourcegraph.com might help? It is a source code query engine that has a Python parser and (limited) type inference built in (also separately available as pysonar on github IIRC). To be clear, I don't need more numbers to be convinced.
Oo, that is shiny, thanks for the tip.
Once we've decided on associativity and @@, I'm ready to accept.
Wonderful to hear! Thanks for giving this so much time and attention (on no warning)!
-n
On 2014-03-14 17:53, Guido van Rossum wrote:
- Did you consider a duck-typing (is that the word?) attribute? E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication. (Your use of .T as "transpose" made me think of this.) Of course the question is, can you get those packages that currently use * for matrix multiply to comply? (I don't consider this a serious counter-proposal. But you list a bunch of rejected alternatives; this could be in that list.)
Apparently this *was* considered in PEP 225 under "Alternatives to adding new operators", numbers 3 and 4.
http://legacy.python.org/dev/peps/pep-0225/
Of course, number 6 is "Introducing a single operator, such as @, for matrix multiplication." :-)
On 14/03/14 18:53, Guido van Rossum wrote:
- Did you consider a duck-typing (is that the word?) attribute? E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication.
This is exactly what has been tried with the types numpy.matrix and numpy.array. For numpy.matrix the * operator means matrix multiplication and for numpy.array it means Hadamard product.
In practice, it does not work as we want. That is the raison d'etre for this PEP.
First, we often need both operators in the same expression. This makes it very hard to read; in fact, a function call is always more readable.
Second, NumPy often has to return temporary arrays from binary operators, and the duck-typing of these must be controlled too. If we are unlucky a temporary array will have the wrong type, and we end up with errors that are very hard to track down.
So for any serious code, matrix multiplication is written with a function call: numpy.dot.
numpy.matrix is for this reason mostly useful for teaching students, not for serious numerical hacking with Python. It is due for retirement from NumPy.
A special operator will work, however, because languages like Matlab and R have proven that it does. Many in the NumPy community have extensive experience with Matlab and R, and we know it works with two multiplication operators. In contrast, we have proven that duck-typing is not useful in this context.
Though this PEP is written by Nathaniel, the whole NumPy dev team is behind it. It has been planned and discussed for a long time, both on the NumPy mailing list and on GitHub, but Nathaniel took on the job of writing it down. So the other options have been considered very carefully.
- Is @@ really necessary?
It seems the consensus on the NumPy list is that asking for @@ can wait.
- An expression like matrix @@ 0.5 is ambiguous because it could mean the matrix square root or the Cholesky factorization.
- An expression like Z = (X @@ -1) @ Y should, for numerical stability and speed, be written as solving a set of equations: X @ Z = Y. Symbolically it is the same, but numerically it is not.
So for most practical purposes, matrix exponentiation will require a function call anyway. The unambiguous case of positive integer powers might not be common enough.
Sturla
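The point about (X @@ -1) @ Y can be made concrete even in a toy 2x2 setting: one solves X @ z = y directly rather than ever forming the inverse. The solver below uses Cramer's rule purely for illustration (and only the well-conditioned 2x2 case); real code would call numpy.linalg.solve, which handles the general, numerically harder cases:

```python
def solve2(X, y):
    """Solve X @ z = y for a 2x2 matrix X via Cramer's rule.

    Toy illustration of 'solve, don't invert'; assumes det(X) != 0.
    """
    (a, b), (c, d) = X
    det = a * d - b * c
    return [(d * y[0] - b * y[1]) / det,
            (a * y[1] - c * y[0]) / det]

print(solve2([[2, 0], [0, 4]], [2, 8]))  # [1.0, 2.0]
```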
On 2014-03-26 18:59, Sturla Molden wrote:
On 14/03/14 18:53, Guido van Rossum wrote:
- Did you consider a duck-typing (is that the word?) attribute? E.g. a*b is elementwise multiplication; a.M*b must be used for matrix multiplication.
This is exactly what has been tried with the types numpy.matrix and numpy.array.
For the record, no it hasn't. The proposal here is different than the numpy.matrix status quo. The proposed result of `a.M*b` would be an ndarray, not whatever the .M type is. The .M type would not be intended to be used outside of a multiplication expression. It is unlikely to propagate through one's code the way that numpy.matrix does (at least, it would be considered a bug to let a free .M type loose). One could safely write their functions expecting only numpy.ndarrays.
While this proposal has been considered and rejected, it is not the same as numpy.matrix and has been rejected for different reasons.
I have a few remarks, in random order:
* The PEP is not just about the multiplication of matrices. The matrix-vector and vector-vector semantics are required in order to get nice expressions. This should probably be mentioned more prominently.
* The complicated semantics for higher-dimensional arrays are effectively incompatible with the matrix-vector and vector-vector cases: for instance, writing the product of a list of matrices by a list of vectors requires something like "mats @ vecs[..., np.newaxis]". In other words, A @ B is AFAICT nearly equivalent to numpy.dot(A, B), therefore people who cannot use numpy.dot() today will not be able to use "@" either.
* A big part of the problem with np.matrix is that it subclasses np.ndarray but has a completely different __mul__(). IMHO, the problems with it are a good argument for the importance of the Liskov substitution principle, but say rather little about the viability of a correctly implemented matrix type.
On Thu, Mar 13, 2014 at 8:59 PM, Nathaniel Smith njs@pobox.com wrote:
You'll notice that this draft is rather more developed than the average first-round PEP posting
Has this acquired a PEP number yet? It seems far enough along that it really ought to be more broadly visible.
Skip
On Fri, Mar 14, 2014 at 02:42:26PM -0500, Skip Montanaro skip@pobox.com wrote:
On Thu, Mar 13, 2014 at 8:59 PM, Nathaniel Smith njs@pobox.com wrote:
You'll notice that this draft is rather more developed than the average first-round PEP posting
Has this acquired a PEP number yet? It seems far enough along that it really ought to be more broadly visible.
https://mail.python.org/pipermail/python-ideas/2014-March/027099.html
Oleg.
On Fri, Mar 14, 2014 at 2:42 PM, Skip Montanaro skip@pobox.com wrote:
On Thu, Mar 13, 2014 at 8:59 PM, Nathaniel Smith njs@pobox.com wrote:
You'll notice that this draft is rather more developed than the average first-round PEP posting
Has this acquired a PEP number yet? It seems far enough along that it really ought to be more broadly visible.
Guido committed it to peps earlier: http://hg.python.org/peps/file/tip/pep-0465.txt
It should be available at http://www.python.org/dev/peps/pep-0465/ soon.
-- Zach
On Fri, Mar 14, 2014 at 2:54 PM, Zachary Ware zachary.ware+pyideas@gmail.com wrote:
Guido committed it to peps earlier: http://hg.python.org/peps/file/tip/pep-0465.txt
It should be available at http://www.python.org/dev/peps/pep-0465/ soon.
Thanks. I was just searching the PEP 0 page and didn't see it.
Skip
14.03.14 03:59, Nathaniel Smith написав(ла):
PEP: XXXX Title: Dedicated infix operators for matrix multiplication and matrix power
This is about matrix multiplication and matrix power. But what about matrix division? It is needed to distinguish elementwise division from left and right matrix divisions. And what about concatenation? Python lists and tuples use ``+`` for this, but NumPy arrays use ``+`` for elementwise addition.
Matrix multiplication is only defined on 2d arrays (matrices). But why not on arbitrary tensors (except scalars)?
On 15 Mar 2014 06:18, "Serhiy Storchaka" storchaka@gmail.com wrote:
14.03.14 03:59, Nathaniel Smith написав(ла):
PEP: XXXX Title: Dedicated infix operators for matrix multiplication and matrix
power
This is about matrix multiplication and matrix power. But what about matrix division? It is needed to distinguish elementwise division from left and right matrix divisions. And what about concatenation? Python lists and tuples use ``+`` for this, but NumPy arrays use ``+`` for elementwise addition.
Matrix multiplication is only defined on 2d arrays (matrices). But why
not on arbitrary tensors (except scalars)?
These are covered in the PEP - the relatively simple "just a new matrix multiplication operator" proposal is due to the fact that, after long experience, matrix multiplication is the only operator the numeric community really feel they miss in Python compared to working in more special purpose languages like Matlab or R.
Everything else can continue to use appropriately named methods and functions, just as it does today.
That does raise a question for me though: does Julia have syntax for matrix multiplication? If so, what does that look like?
Cheers, Nick.
Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Le 14/03/14 20:17, Serhiy Storchaka a écrit :
Matrix multiplication is only defined on 2d arrays (matrices). But why not on arbitrary tensors (except scalars)?
The PEP does include a (complicated) definition of "matrix multiplication" for arrays of all ranks. In particular, @-multiplication of rank-1 arrays is the dot product.
On 2014-03-14 20:17, Serhiy Storchaka wrote:
14.03.14 03:59, Nathaniel Smith написав(ла):
PEP: XXXX Title: Dedicated infix operators for matrix multiplication and matrix power
This is about matrix multiplication and matrix power. But what about matrix division? It is needed to distinguish elementwise division from left and right matrix divisions.
In our experience, matrix division comes up rather more rarely. For numerical problems, these should typically be implemented with a linear solve of some type. The best kind of solve varies based on the problem, so that's a good reason to continue to implement it with a function call rather than an operator. For symbolic problems (a la Sympy), both forms can be spelled reasonably well given matrix multiplication and matrix power.
And what about concatenation? Python lists and tuples use ``+`` for this, but NumPy arrays use ``+`` for elementwise addition.
There are *lots* of ways to concatenate an n-dimensional array, and we have a whole complement of functions to do that, vstack(), hstack(), dstack(), concatenate(), r_[], c_[]. A single operator would not suffice.
Matrix multiplication is only defined on 2d arrays (matrices). But why not on arbitrary tensors (except scalars)?
The PEP defines how numpy intends to interpret n-dimensional operands and the reasons why. For tensors qua tensors you tend to want to do arbitrary Einstein-summation, which is hard to do in a single operator. numpy.einsum() handles that for us.
On Fri, Mar 14, 2014 at 01:59:14AM +0000, Nathaniel Smith wrote:
This PEP proposes two new binary operators dedicated to matrix multiplication and matrix power, spelled ``@`` and ``@@`` respectively. (Mnemonic: ``@`` is ``*`` for mATrices.)
When I first started reading the PEP, I was rather dubious about the choice of @ as operator, but it surprised me at how quickly I got used to it. I suppose because unconsciously I associated it with the common shorthand implying (scalar) multiplication:
5kg of apples @ $2 per kg costs $10.00
so it didn't take me very long to warm to it.
+1
Steven D'Aprano wrote:
I suppose because unconsciously I associated it with the common shorthand implying (scalar) multiplication:
5kg of apples @ $2 per kg costs $10.00
That analogy actually extends to the matrix case as well. E.g. if you have a vector q of quantities and a vector p of prices, then q @ p is the total price.
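A minimal sketch of that reading (the `dot` helper is hypothetical; with the PEP one would just write `q @ p`):

```python
def dot(q, p):
    """Inner product of two 1D sequences: total price of
    quantities q bought at unit prices p."""
    if len(q) != len(p):
        raise ValueError("length mismatch")
    return sum(qi * pi for qi, pi in zip(q, p))

dot([5, 2], [2.00, 1.50])  # 5 kg @ $2 + 2 kg @ $1.50 = $13.00
```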
On Sat, 15 Mar 2014 12:41:23 +1300 Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Steven D'Aprano wrote:
I suppose because unconsciously I associated it with the common shorthand implying (scalar) multiplication:
5kg of apples @ $2 per kg costs $10.00
That analogy actually extends to the matrix case as well. E.g. if you have a vector q of quantities and a vector p of prices, then q @ p is the total price.
It depends on how the vectors are laid out (horizontal @ vertical or vertical @ horizontal).
Regards
Antoine.
Antoine Pitrou wrote:
It depends on how the vectors are laid out (horizontal @ vertical or vertical @ horizontal).
The definition of @ in the proposal is such that the product of two 1D arrays is interpreted as horizontal @ vertical.
On 15 Mar 2014 09:56, "Greg Ewing" greg.ewing@canterbury.ac.nz wrote:
Antoine Pitrou wrote:
It depends on how the vectors are laid out (horizontal @ vertical or vertical @ horizontal).
The definition of @ in the proposal is such that the product of two 1D arrays is interpreted as horizontal @ vertical.
Oh, nice! Maybe the PEP should include implementing that for lists and our other 1D sequence types?
Also, I just remembered another minor error in the PEP - as of Python 3.3, memoryview.cast() allows the creation of multidimensional views, so the PEP is incorrect in saying that can't be done with the builtins/stdlib.
Cheers, Nick.
-- Greg
On 15 March 2014 00:04, Nick Coghlan ncoghlan@gmail.com wrote:
On 15 Mar 2014 09:56, "Greg Ewing" greg.ewing@canterbury.ac.nz wrote:
Antoine Pitrou wrote:
It depends on how the vectors are laid out (horizontal @ vertical or vertical @ horizontal).
The definition of @ in the proposal is such that the product of two 1D arrays is interpreted as horizontal @ vertical.
Oh, nice! Maybe the PEP should include implementing that for lists and our other 1D sequence types?
I don't think that that should be included in the PEP. It's unlikely that many users will want that and if they do it will be subject to bike-shedding of issues that aren't directly relevant to the problem that this PEP tries to solve. It'll get brought up on python-ideas some time in the next couple of years and then interested parties can bike-shed over it separately (with the benefit of an existing implementation of the operators in numpy for comparison).
Also, I just remembered another minor error in the PEP - as of Python 3.3, memoryview.cast() allows the creation of multidimensional views, so the PEP is incorrect in saying that can't be done with the builtins/stdlib.
I'm getting a bit of déjà vu here: https://mail.python.org/pipermail/python-dev/2013-September/128463.html
Has this changed since then? I'd like to know how to use memoryview to create a 2D matrix of integers or floats.
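For reference, shape-changing casts have existed since Python 3.3, but indexing the resulting view with a tuple of integers only arrived in Python 3.5, which may explain the confusion. A small sketch of what the stdlib-only 2D view looks like:

```python
import array

# A 2x3 "matrix" of doubles over a flat buffer via memoryview.cast().
# Casting through 'B' first is required: shape-changing casts must go
# to or from the unsigned-byte format.
buf = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
m = memoryview(buf).cast('B').cast('d', shape=[2, 3])
m[1, 2]        # 6.0: row 1, column 2 of the 2x3 view
m[0, 1] = 9.0  # writes through to the underlying array
buf[1]         # 9.0
```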
Also, even if it is possible to create a 2D view in the stdlib, I don't think the stdlib should try to support matrix multiplication unless it is prepared to do so efficiently. Numpy delegates this task to the underlying BLAS library, which will usually be not just asymptotically more efficient than a naïve algorithm but also heavily micro-optimised. I doubt that the Python core wants to grow a dependency on a BLAS library, so unless someone wants to reimplement this part of one, it's best to leave it out.
A big +1 on the PEP from me.
I think this brings numpy on a par with Matlab for linear algebra. Matlab also has other matrix-related operators, such as matrix left and right division with \ and /, but I think these trivialise what should be understood as non-trivial operations, and they are also overloaded to mean too many different things (given different array shapes). Similarly, Matlab made a big mistake in having the default multiplication operator be matrix multiplication, which creates a significant pitfall for new users and is a common source of problems that people have to work through every time they start debugging their code (it's just a time waster, really).
Some people have complained that the @ symbol looks clunky. Personally I think it's important that this operator should be very distinctive. Using an innocent-looking operator for matrix multiplication has been tried before and based on ~7 years of teaching Matlab I would say that it's a bad idea.
Oscar
On Sat, 15 Mar 2014 12:55:38 +1300 Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Antoine Pitrou wrote:
It depends on how the vectors are laid out (horizontal @ vertical or vertical @ horizontal).
The definition of @ in the proposal is such that the product of two 1D arrays is interpreted as horizontal @ vertical.
Really? That should be up to the third-party library implementing the @ operator for its types, not to the language itself: Python _suggests_ a use case for @, it doesn't mandate it (especially as there's no appropriate data type in the stdlib).
Regards
Antoine.
On Sat, Mar 15, 2014 at 12:09 AM, Antoine Pitrou solipsis@pitrou.net wrote:
On Sat, 15 Mar 2014 12:55:38 +1300 Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Antoine Pitrou wrote:
It depends on how the vectors are laid out (horizontal @ vertical or vertical @ horizontal).
The definition of @ in the proposal is such that the product of two 1D arrays is interpreted as horizontal @ vertical.
Really? That should be up to the third-party library implementing the @ operator for its types, not to the language itself: Python _suggests_ a use case for @, it doesn't mandate it (especially as there's no appropriate data type in the stdlib).
See: http://legacy.python.org/dev/peps/pep-0465/#intended-usage-details
Which begins: "This section is informative, rather than normative -- it documents the consensus of a number of libraries that provide array- or matrix-like objects on how the @ and @@ operators will be implemented. ..."
Antoine Pitrou wrote:
On Sat, 15 Mar 2014 12:55:38 +1300 Greg Ewing greg.ewing@canterbury.ac.nz wrote:
The definition of @ in the proposal is such that the product of two 1D arrays is interpreted as horizontal @ vertical.
Really? That should be up to the third-party library implementing the @ operator for its types,
It is. It's described as "recommended semantics" in the PEP, not something defined by the language.
On 15/03/14 01:09, Antoine Pitrou wrote:
Really? That should be up to the third-party library implementing the @ operator for its types, not to the language itself: Python _suggests_ a use case for @, it doesn't mandate it (especially as there's no appropriate data type in the stdlib).
array.array is an appropriate type for supporting @ for matrix multiplication.
Sturla
On Sun, Mar 23, 2014 at 3:40 AM, Sturla Molden sturla.molden@gmail.com wrote:
On 15/03/14 01:09, Antoine Pitrou wrote:
Really? That should be up to the third-party library implementing the @ operator for its types, not to the language itself: Python _suggests_ a use case for @, it doesn't mandate it (especially as there's no appropriate data type in the stdlib).
array.array is an appropriate type for supporting @ for matrix multiplication.
I was thinking the function type, as a composition operator. Matrix multiplication corresponds exactly with composition of linear functions, so it makes some sort of sense. And it has some kind of neat similarity to decorators, which generalize function composition.
If you rename "matmul" to "compose" and "matpow" to something like "iterate" (except less confusing), then you've generalized the operators a little and given them more potential use cases.
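With the hook proposed here (available since Python 3.5 as `__matmul__`), Devin's composition idea could be sketched with a hypothetical wrapper class:

```python
class Composable:
    """Hypothetical wrapper giving single-argument functions
    a ``@`` composition operator."""
    def __init__(self, func):
        self.func = func
    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)
    def __matmul__(self, other):
        # (f @ g)(x) == f(g(x)) -- the mathematical convention
        return Composable(lambda *a, **kw: self.func(other(*a, **kw)))

double = Composable(lambda x: 2 * x)
inc = Composable(lambda x: x + 1)
(double @ inc)(10)  # 22: apply inc first, then double
(inc @ double)(10)  # 21: apply double first, then inc
```

The two orderings at the end are exactly the left-compose/right-compose ambiguity raised later in the thread.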
-- Devin
On Mar 23, 2014, at 3:51, Devin Jeanpierre jeanpierreda@gmail.com wrote:
On Sun, Mar 23, 2014 at 3:40 AM, Sturla Molden sturla.molden@gmail.com wrote:
On 15/03/14 01:09, Antoine Pitrou wrote:
Really? That should be up to the third-party library implementing the @ operator for its types, not to the language itself: Python _suggests_ a use case for @, it doesn't mandate it (especially as there's no appropriate data type in the stdlib).
array.array is an appropriate type for supporting @ for matrix multiplication.
I was thinking the function type, as a composition operator. Matrix multiplication corresponds exactly with composition of linear functions, so it makes some sort of sense. And it has some kind of neat similarity to decorators, which generalize function composition.
If you rename "matmul" to "compose" and "matpow" to something like "iterate" (except less confusing), then you've generalized the operators a little and given them more potential use cases.
This sounds nice from a purity point of view, but I'm not sure it's a good idea.
First, matrix multiplication only corresponds exactly with function composition if matrix*vector multiplication corresponds exactly with function calling, which is generally not true in programming. You can't assign a matrix to f, then write f(v) to apply the matrix as an operation; you write f @ v.
Meanwhile, Python doesn't even have compose in functools, much less in builtins, and I don't think anyone has seriously proposed it in decades. If we don't even need it in the stdlib, do we really need it as an operator?
Also, is it right-compose or left-compose? (That is, does (f@g)(2) call f then g, or g then f?) One answer is obvious to people who think mathematically, the other to novices. And even if you argue that the right answer to that is to teach the novices the right way to think about it clearly, reverse composition is still often useful; that's why many languages that have a compose function or operator also have rcompose and/or make it trivial to write as flip(compose).
Also, how exactly do you define compose? This is easy in a language like Haskell, where every function takes one argument and returns one value (which, because of currying, may actually be a function that takes the next argument...), but in Python, where functions can take any number of arguments, with default values and *args and keyword-only arguments and so on, it's a lot less clear. Does the second function need to be a single-argument function, or do tuple returns match to *args, or something different? Different third-party function libraries use different answers to that, sometimes offering more than one under different names, but obviously the operator has to be either compose1 or composestar, and then the other one won't even be in the stdlib?
Meanwhile, looking at languages that care about mathematical purity and about functional composition: you don't write . for matrix multiplication in Haskell, you write `multStd` or, more likely, define *, **, `mult`, or something else more readable.
On Sun, Mar 23, 2014 at 03:51:24AM -0700, Devin Jeanpierre wrote:
If you rename "matmul" to "compose" and "matpow" to something like "iterate" (except less confusing), then you've generalized the operators a little and given them more potential use cases.
-1
The dunder methods are typically named for the most common or typical use-case, not generalised or marginal ones e.g.:
+  __add__ not __concat__
*  __mul__ not __repeat__ or __repetition__
<  __lt__ not __subset__
&  __and__ not __ampersand__ or __conjunction__
On Sun, Mar 23, 2014 at 6:08 PM, Steven D'Aprano steve@pearwood.info wrote:
-1
The dunder methods are typically named for the most common or typical use-case, not generalised or marginal ones e.g.:
- __add__ not __concat__
- __mul__ not __repeat__ or __repetition__
- __lt__ not __subset__
- __and__ not __ampersand__ or __conjunction__
Thanks for taking issue with the not-silly point.
Ironically enough, all of those but "and" are actually general names for operators, but I agree that probably wasn't intentional. Anyway, in practice people seem to get what it means for something to be "and"ed with another thing. There's an intuition that many things using the operator treat it in a roughly analogous way, and the same is true for many other overloadable operators in Python. These conventions help make code more understandable and are great.
I'm not sure such a thing would happen with an operator for which the stdlib and documentation and everything insists is only for matrix multiplication. It'd be like if we had "__twos_complement_bitwise_and__". It's one thing to generalize an operator name, and another to not make it very specific.
-- Devin
On Sun, Mar 23, 2014 at 06:47:40PM -0700, Devin Jeanpierre wrote:
I'm not sure such a thing would happen with an operator for which the stdlib and documentation and everything insists is only for matrix multiplication.
They would *insist* would they? The documentation would come with a great warning "Thou Shalt Not Use __matmul__ For Anything Except Matrix Multiplication"? I suppose that will be like the similar (imaginary) warning that __pow__ is only permitted to be used for numerical exponentiation and that any other use is forbidden on pain of being slapped with a fish.
I think it is highly unlikely that people will be frightened off from overloading @ by the name. If people happily use __lt__ for subset checking, which is *nothing* like less-than, I'm sure they'll use @ as well. People do far weirder things -- there's even a recipe out there on the Internet for implementing arbitrarily named pseudo-operators using __or__:
vector |dot| vector
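The recipe Steven alludes to works by chaining `__ror__` and `__or__`; a rough sketch (the `Infix` and `dot` names follow the floating recipe, nothing here is stdlib):

```python
class Infix:
    """Pseudo-operator: ``a |op| b`` ends up calling ``op.func(a, b)``."""
    def __init__(self, func):
        self.func = func
    def __ror__(self, left):    # handles ``a | op``: capture the left operand
        return Infix(lambda right: self.func(left, right))
    def __or__(self, right):    # handles ``... | b``: apply to the right operand
        return self.func(right)

dot = Infix(lambda a, b: sum(x * y for x, y in zip(a, b)))
[1, 2, 3] |dot| [4, 5, 6]  # 32
```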
But suppose you're right, and people don't use it for anything but matrix multiplication. So what? Who cares? It's not like anyone is made worse off by the change. None of my code currently uses the @ operator. If numpy starts to use it, and I don't, I'll be no worse off than I am now.
It'd be like if we had "__twos_complement_bitwise_and__".
Having __matmul__ would be like if we had __xor__ __pow__ __sub__ __div__ __mod__ __pos__ or __neg__
On Sun, Mar 23, 2014 at 8:12 PM, Steven D'Aprano steve@pearwood.info wrote:
I think it is highly unlikely that people will be frightened off from overloading @ by the name. If people happily use __lt__ for subset checking, which is *nothing* like less-than,
Actually, no. <, or "less than", is the exact way it's spelled for any partial order, and subset relations are probably the most famous non-numeric example of a partial order.
In fact, if you look at the most common construction of the natural numbers (0 = frozenset(), succ(x) = x | frozenset([x])), it is more complicated than it needs to be only so that the < operator means the same thing for natural numbers whether treated as numbers or as sets.
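Spelled out as a toy sketch (frozensets standing in for the von Neumann naturals), the proper-subset `<` really does coincide with the numeric order:

```python
# Von Neumann naturals as frozensets: 0 = {} and succ(x) = x | {x}.
zero = frozenset()

def succ(x):
    return x | frozenset([x])

one = succ(zero)    # {0}
two = succ(one)     # {0, 1}
three = succ(two)   # {0, 1, 2}

# frozenset's < is proper subset, which here matches numeric <:
one < two < three  # True
```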
It'd be like if we had "__twos_complement_bitwise_and__".
Having __matmul__ would be like if we had __xor__ __pow__ __sub__ __div__ __mod__ __pos__ or __neg__
Then we disagree.
-- Devin
On Sun, Mar 23, 2014 at 08:48:49PM -0700, Devin Jeanpierre wrote:
On Sun, Mar 23, 2014 at 8:12 PM, Steven D'Aprano steve@pearwood.info wrote:
I think it is highly unlikely that people will be frightened off from overloading @ by the name. If people happily use __lt__ for subset checking, which is *nothing* like less-than,
Actually, no. <, or "less than", is the exact way it's spelled for any partial order, and subset relations are probably the most famous non-numeric example of a partial order.
I will accept that subsets are an example of a partial order, and so I was wrong to say that it has nothing to do with __lt__ and __gt__.
But I disagree that < is the usual symbol for subset. Neither Wikipedia nor Wolfram Mathworld mention it:
https://en.wikipedia.org/wiki/Subset http://mathworld.wolfram.com/Subset.html
I was taught to use the symbols ⊂ and ⊆ for proper subset and subset-or-equal. I've also seen ⊊ and ⊂ used instead. But prior to Python, I've never seen < used. If it's used in mathematics, it's a niche use.
On 24 March 2014 22:06, Steven D'Aprano steve@pearwood.info wrote:
On Sun, Mar 23, 2014 at 08:48:49PM -0700, Devin Jeanpierre wrote:
On Sun, Mar 23, 2014 at 8:12 PM, Steven D'Aprano steve@pearwood.info wrote:
I think it is highly unlikely that people will be frightened off from overloading @ by the name. If people happily use __lt__ for subset checking, which is *nothing* like less-than,
Actually, no. <, or "less than", is the exact way it's spelled for any partial order, and subset relations are probably the most famous non-numeric example of a partial order.
I will accept that subsets are an example of a partial order, and so I was wrong to say that it has nothing to do with __lt__ and __gt__.
I normally let python-ideas subthreads run indefinitely without commenting, but in this case... *really* not a productive tangent :)
Ellipsis, extended slicing and memory views were added to the core language and C API definitions for the numeric computing folks without a compelling stdlib use case (at least at the time). Adding __matmul__ for their benefit really isn't much of a stretch, and it makes the distinction *they* care about clear (__mul__ = element-wise multiplication, __matmul__ = matrix multiplication). Previous proposals along these lines failed because they overgeneralised without compelling benefits from the extra complexity, while this one hits the sweet spot of solving the *actual* problem to be solved with just some minor technical details to work out.
Cheers, Nick.
On Sun, Mar 23, 2014 at 11:40:10AM +0100, Sturla Molden wrote:
On 15/03/14 01:09, Antoine Pitrou wrote:
Really? That should be up to the third-party library implementing the @ operator for its types, not to the language itself: Python _suggests_ a use case for @, it doesn't mandate it (especially as there's no appropriate data type in the stdlib).
array.array is an appropriate type for supporting @ for matrix multiplication.
No it isn't. The PEP even discusses it:
[quote] array objects cannot represent multidimensional data at all, which makes ``__matmul__`` much less useful. Second, providing a quality implementation of matrix multiplication is highly non-trivial. Naive nested loop implementations are very slow and providing one in CPython would just create a trap for users. But the alternative -- providing a modern, competitive matrix multiply -- would require that CPython link to a BLAS library, which brings a set of new complications. In particular, several popular BLAS libraries (including the one that ships by default on OS X) currently break the use of ``multiprocessing`` [#blas-fork]_. And finally, we'd have to add quite a bit beyond ``__matmul__`` before ``memoryview`` or ``array.array`` would be useful for numeric work -- like elementwise versions of the other arithmetic operators, just to start. [end quote]
I don't think we should be looking for additional use-cases for @. Either the PEP stands on its own, and @ is approved for matrix multiplication (and possibly @@ for exponentiation), or it isn't. If the PEP is accepted -- and I think it should be -- then people will invent new uses for @. Or they won't. Either way, it doesn't matter.
On 24/03/14 01:49, Steven D'Aprano wrote:
No it isn't. The PEP even discusses it:
(...)
Well, even with a single dimension, matrix multiplication has a useful interpretation (the inner product). At least the intention in numpy is to interpret "vector @ vector" as the inner product. Thus it is useful for both list and array in the standard library. A possible use case for it in the standard library would be to simplify the code in the statistics module.
But the main consideration is to make @ benign to the rest of the Python community.
I don't think we should be looking for additional use-cases for @. Either the PEP stands on its own, and @ is approved for matrix multiplication (and possibly @@ for exponentiation),
The consensus on the NumPy list seems to be that @@ can wait. It is also ambiguous. For example, matrix @@ 0.5 could be interpreted as the Cholesky factorization or the matrix square root, depending on context.
Also an expression like Z = (X @@ -1) @ Y should for numerical stability (and speed) be computed as "solve X @ Z = Y" e.g. with LU, SVD or QR factorization. This would require calling a function like solve(X,Y). Matlab actually has a special back-divide operator for this reason: X \ Y.
This means that @@ is ambiguous except for positive integer powers or -1. But the common case of matrix @@ -1 is not as useful as we might think. Then we are left with exponentiation to positive integer powers, which is not nearly common enough to justify a special operator.
Sturla