[Numpy-discussion] [help needed] associativity and precedence of '@'

josef.pktd at gmail.com josef.pktd at gmail.com
Mon Mar 17 14:55:21 EDT 2014


On Mon, Mar 17, 2014 at 1:18 PM, <josef.pktd at gmail.com> wrote:

>
>
>
> On Mon, Mar 17, 2014 at 12:50 PM, Alexander Belopolsky <ndarray at mac.com> wrote:
>
>>
>> On Mon, Mar 17, 2014 at 12:13 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>>> In practice all
>>> well-behaved classes have to make sure that they implement __special__
>>> methods in such a way that all the different variations work, no
>>> matter which class ends up actually handling the operation.
>>>
>>
>> "Well-behaved classes" are hard to come by in practice.  The @ operator
>> may fix the situation with np.matrix, so take a look at MaskedArray with
>> its 40-line __array_wrap__ and no end of bugs.
>>
>> Requiring superclass __method__ to handle creation of subclass results
>> correctly is turning Liskov principle on its head.  With enough clever
>> tricks and tight control over the full class hierarchy you can make it work
>> in some cases, but it is not a good design.
>>
>> I am afraid that making @ special among other binary operators that
>> implement mathematically associative operations will create a lot of
>> confusion.  (The pow operator is special because the corresponding
>> mathematical operation is non-associative.)
>>
>> Imagine teaching someone that a % b % c = (a % b) % c, but a @ b @ c = a
>> @ (b @ c).  What are the chances that they will correctly figure out what a
>> // b // c means after this?
>>
>
> One case where we need to keep track of left or right is type promotion
>
> >>> a.shape
> (100,)
> >>> 1. * a.dot(a)
> -98.0
> >>> (1.*a).dot(a)
> 328350.0
> >>> a.dtype
> dtype('int8')
>
> >>> 1. * a @ a
> ???
>
> similar to
> >>> 1. * 2 / 3
> 0.6666666666666666
> >>> 1. * (2 / 3)   # I'm not in the `future`
> 0.0
>
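
The dot example quoted above can be reproduced with a short script. This is
only a sketch under assumptions: the original `a` is not shown, so
np.arange(100, dtype=np.int8) is used here because it matches the quoted shape,
dtype and float result. It needs Python 3.5+, where @ ended up with the same
precedence as * and groups left to right.

```python
import numpy as np

# Assumption: the original `a` is not shown; arange(100) as int8 matches
# the shape, dtype and float result quoted above.
a = np.arange(100, dtype=np.int8)

int_first = 1.0 * (a @ a)      # int8 dot product first, then promoted to float
float_first = (1.0 * a) @ a    # promoted to float64 first, then dot product
unparenthesized = 1.0 * a @ a  # grouping depends on the associativity of @

print((a @ a).dtype)   # the integer product stays int8 and can wrap around
print(float_first)     # sum of squares 0**2 + ... + 99**2 in float64
# True on Python 3.5+, where @ groups left to right like *
print(unparenthesized == float_first)
```

With "right" grouping the unparenthesized expression would instead follow the
int_first path, which is the case to watch for with small integer dtypes.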

I thought of sending a message saying I'm +-1 on either, but I'm not.

I'm again in favor of "left", because it's the simplest to understand:
A.dot(B).dot(C)  with some * mixed in

I understand now the computational argument in favor of right

x @ inv(x.T @ x) @ x.T @ y   ( with shapes T,k   k,k   k,T  T,1  )
or
x @ pinv(x) @ y    (with shapes T,k k,T  T,1 )

with T>>k      (last 1 could be an m>1 with T>>m)
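
A rough sketch of why grouping matters for the pinv case above. The sizes T and
k are made up for illustration, and the variable names are not from any
library; the point is only the shape of the intermediate array under each
grouping.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 500, 3                      # hypothetical sizes with T >> k
x = rng.standard_normal((T, k))
y = rng.standard_normal((T, 1))
px = np.linalg.pinv(x)             # shape (k, T)

# "left" grouping: (x @ pinv(x)) @ y builds a T-by-T intermediate
left_tmp = x @ px
left = left_tmp @ y

# "right" grouping: x @ (pinv(x) @ y) only builds a k-by-1 intermediate
right_tmp = px @ y
right = x @ right_tmp

print(left_tmp.shape, right_tmp.shape)
print(np.allclose(left, right))    # same result up to rounding
```

Both groupings give the same answer up to floating point error; only the size
of the temporary differs, T*T entries versus k entries.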

However, we don't write code like that most of the time.
Alan's students won't care much if some intermediate arrays blow up.
In library code like statsmodels it's almost always a conscious choice
of where to set the parentheses and, more often, which part of a long array
expression is taken out as a temporary or permanent variable.

I think almost the only use of chain_dot(A, B, C) (which is "right") is
for quadratic forms:

            xtxi = pinv(np.dot(exog.T, exog))       # k,k
            xtdx = np.dot(exog.T * d[np.newaxis, :], exog)   # k,k
            vcov = chain_dot(xtxi, xtdx, xtxi)      # k,k  k,k  k,k
(from Quantreg)
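
For readers who have not seen chain_dot, here is a minimal sketch of a
right-to-left chained product. The name chain_dot_right and the random k,k
inputs are hypothetical and only for illustration; this is not the statsmodels
source.

```python
from functools import reduce

import numpy as np


def chain_dot_right(*arrays):
    """Multiply arrays right to left: A, B, C -> A.dot(B.dot(C))."""
    # Hypothetical helper for illustration; not the statsmodels source.
    # Start from the last array and left-multiply by each earlier one.
    return reduce(lambda acc, m: np.dot(m, acc), reversed(arrays))


rng = np.random.default_rng(0)
k = 4                                  # made-up size
xtxi = rng.standard_normal((k, k))     # stand-in for pinv(exog.T exog)
xtdx = rng.standard_normal((k, k))     # stand-in for the weighted cross product

vcov = chain_dot_right(xtxi, xtdx, xtxi)
print(np.allclose(vcov, xtxi.dot(xtdx.dot(xtxi))))
```

For square k,k factors like these the grouping makes no difference to cost,
which is why the quadratic-form case is easy to handle either way.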

I think optimizing this way is relatively easy


On the other hand, I worry a lot more about messy cases with different
dtypes or different classes involved as Alexander has pointed out. Cases
that might trip up medium to medium-advanced numpy users.

(Let's see, I have to read @ back to front, and * front to back, and why
did I put a sparse matrix in the middle and a masked array at the end. Oh
no, that's not a masked array it's a panda.)
compared to
(Somewhere there is a mistake, let's go through all terms from the
beginning to the end)

Josef


