[Numpy-discussion] how is y += x computed when y.strides = (0, 8) and x.strides=(16, 8) ?

Friedrich Romstedt friedrichromstedt at gmail.com
Wed Sep 5 19:55:20 EDT 2012


Poor Sebastian, you make the mistake of asking difficult questions.

I noticed that the expected element-wise result should be [6, 10], not [6, 12], and that is in fact what numpy-1.4.1 gives; with numpy-1.6.1 I observe the [4, 6] result.  Logs follow:

The first setup is numpy-1.4.1 in Python-2.6.5 on a Mac (Intel, 64 bit), with Python and numpy built from source as dual-arch binaries; the output does not depend on which architecture is chosen at run time.  The second is numpy-1.6.1 with Python-2.7.2.

numpy-1.4.1 (64bit Python 2.6.5):

Python 2.6.5 (r265:79063, Jul 18 2010, 12:14:53) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> print numpy.__version__
1.4.1
>>> import numpy
>>> 
>>> x = numpy.arange(6).reshape((3,2))
>>> y = numpy.arange(2)
>>> 
>>> print 'x=\n', x
x=
[[0 1]
 [2 3]
 [4 5]]
>>> print 'y=\n', y
y=
[0 1]
>>> 
>>> u,v = numpy.broadcast_arrays(x, y)
>>> 
>>> print 'u=\n', u
u=
[[0 1]
 [2 3]
 [4 5]]
>>> print 'v=\n', v
v=
[[0 1]
 [0 1]
 [0 1]]
>>> print 'v.strides=\n', v.strides
v.strides=
(0, 8)
>>> 
>>> v += u
>>> 
>>> print 'v=\n', v  # expectation: v = [[6,12], [6,12], [6,12]]
v=
[[ 6 10]
 [ 6 10]
 [ 6 10]]
>>> print 'u=\n', u
u=
[[0 1]
 [2 3]
 [4 5]]
>>> print 'y=\n', y  # expectation: y = [6,12]
y=
[ 6 10]

And numpy-1.6.1 (64bit Python-2.7.2):

Python 2.7.2 (default, Mar 15 2012, 15:42:23) 
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
[fiddling with sys.maxint edited out]
>>> import numpy
>>> 
>>> x = numpy.arange(6).reshape((3,2))
>>> y = numpy.arange(2)
>>> 
>>> print 'x=\n', x
x=
[[0 1]
 [2 3]
 [4 5]]
>>> print 'y=\n', y
y=
[0 1]
>>> 
>>> u,v = numpy.broadcast_arrays(x, y)
>>> 
>>> print 'u=\n', u
u=
[[0 1]
 [2 3]
 [4 5]]
>>> print 'v=\n', v
v=
[[0 1]
 [0 1]
 [0 1]]
>>> print 'v.strides=\n', v.strides
v.strides=
(0, 8)
>>> 
>>> v += u
>>> 
>>> print 'v=\n', v  # expectation: v = [[6,12], [6,12], [6,12]]
v=
[[4 6]
 [4 6]
 [4 6]]
>>> print 'u=\n', u
u=
[[0 1]
 [2 3]
 [4 5]]
>>> print 'y=\n', y  # expectation: y = [6,12]
y=
[4 6]

Maybe this helps with bisecting it.

Friedrich.

P.S.: I took the time to scroll through the tickets (the first three pages sorted by date or so) and came up empty, which does not mean such a ticket does not exist.  The docs are also rather quiet about this (e.g. http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.__iadd__.html#numpy.ndarray.__iadd__, or http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#arithmetic-and-comparison-operations).

My guess is that v is duplicated during the in-place addition (including its strides), and the results of the calculation (using the original v) are written to that copy.  After that, the copy's data would be [4, 6], with the same strides as v.  Copying this back over v's "original" data, or letting v point to the copy and freeing the other buffer, would then explain the observed behaviour.  Maybe there is a sensible explanation for why it should be done that way.
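A minimal sketch of what this guess amounts to, written out in plain numpy (the name "scratch" and the explicit write-back are only my illustration, not numpy internals; it also assumes that v is still a writable view of y, as it is in the versions discussed here):

import numpy

x = numpy.arange(6).reshape((3, 2))
y = numpy.arange(2)
u, v = numpy.broadcast_arrays(x, y)

scratch = v + u    # full (3, 2) element-wise result: [[0, 2], [2, 4], [4, 6]]
v[...] = scratch   # written back through v's zero-strided layout: every row
                   # lands on the same two values of y, so the last row wins
# y is now [4, 6] -- the numpy-1.6.1 result (assuming a C-order write-back)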

It is and remains a guess after all.

I find the problem difficult to name.  Maybe "In-place operation not in-place for zero-strided first operand"?  It is not easy to pin down a good notion of an "in-place operation" for a zero-strided first operand; it might simply be that your definition does not match numpy's.  But I think an in-place operation should be expected to be independent of the order in which the element-wise operations are executed, even when those elements share data (as they do in v, the first operand, here).  This criterion is fulfilled by your expectation and by numpy-1.4.1, but not by numpy-1.6.1.
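To make that criterion concrete (my own check, not numpy code): every ordering of the six scalar updates behind "v += u" ends at the same value of y, because the additions into the shared targets commute:

import itertools
import numpy

x = numpy.arange(6).reshape((3, 2))

for order in itertools.permutations([(i, j) for i in range(3) for j in range(2)]):
    y = numpy.arange(2)
    for i, j in order:
        y[j] += x[i, j]        # the scalar update behind v[i, j] += u[i, j]
    assert list(y) == [6, 10]  # the same for all 720 orderings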

I noticed that in the hypothesis above it is not even necessary to duplicate v's strides; the neglect of the other element operations would then happen when the element-wise results are copied back over v's storage.

A remedy might be to let the back-copy respect the operation it is used for.  If the back-copying happens because an addition is to be performed in place, it could use in-place addition instead of assignment.  The element-wise results in the "repeat()'ed" copy would then be "chained together" by the operation in use.  The non-commutative operations, subtraction and division, would chain with addition and multiplication respectively, because the subtracting or dividing character can be viewed as an inversion of the second operand alone, applied before the whole in-place operation takes place, leaving a commutative in-place operation.  Hence the "chained" operands in the back-copy form an associative and commutative expression.  A duplication of the first operand would nevertheless occur in our case, since the same data from the first operand appears in several elements' expressions in the copy buffer.

In our example the result would then be neither [6, 10] nor [4, 6] but [6, 12], which actually makes more sense if you observe the symmetry in the ratio of the second column to the first (each column seen as a series, as it is collapsed onto the same datum).  It comes down to giving the assignment a sensible meaning when the first operand (i.e., the target) is partially zero-strided, or more generally contains fewer data elements than its view exposes; the latter case can also arise from stride tricks.

In that case it seems to come down to deciding how to collapse the values of the second operand (the source of the assignment) onto the values of the first operand (the target).  From this consideration it also seems that the substitution of "a += b" by "a = a + b" is not generally valid when the substitute requires broadcasting the first operand "a" to match the shape of "b": this repetition is what makes the substitute produce a different result than the breakdown into element-wise in-place operations.  It seems that in-place operations and parallelisation over arrays do not commute.
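A two-line illustration of where the substitute breaks down in Sebastian's example (again only my own spelling-out of the shape problem):

import numpy

x = numpy.arange(6).reshape((3, 2))
y = numpy.arange(2)

z = y + x                 # the r.h.s. of the substitute: y is broadcast
assert z.shape == (3, 2)  # six values, but y has only two elements -- some
                          # collapsing rule is needed to assign z back into y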

To illustrate this non-commutativity of in-place operations and parallelism, note that the implementation of "a += b" which yielded the "expected" result (as in numpy-1.4.1) goes first by elements and only then applies the in-place operation: the iteration happens at a higher code level than the operation.  In the other case, which numpy-1.6.1 implements in at least an equivalent way, the operation is carried out first on the array operands and an iteration is employed afterwards.  Unfortunately, in the second approach an "execution of an operation on arrays" is abstract and still needs a concrete implementation at the element level, so that the "a = a + b" substitute ends in an element-wise assignment which the algorithmic idea does not account for; this causes the divergence between the two approaches' results.

I like the equivalence criterion "a += b" <=> "a = a + b", leaving dtype downcasts aside.  These downcasts are really a Python idiosyncrasy and lead to the well-known paradox only when the character of Python assignment is neglected.  The important point of the substitutional equivalence "a += b" <=> "a = a + b" seems to me to be that the r.h.s. refers to addition, i.e. it explains the in-place operation in terms of addition.  Unfortunately, in the case of a zero-strided first operand the assignment in between becomes ill-defined.  But if we stand by "explaining in-place addition by pure addition", it would seem natural to me to chain the element-wise results of the substitute "a + b" together by just that "+" operation.  I don't know what symbol to put for that, maybe "=+", however strange that may look.  So the usual Python teaching of replacing "+=" by "=" and "+" seems to need a more precise definition of the "=" used when arrays are the operands; for scalars an ordinary assignment does perfectly well.  For numpy I would hence condense the preceding sentences into the formula: "a += b  <=>  a =+ a + b".

It would be possible to implement this by filling the first operand with the identity of the respective operation (0 for addition, 1 for multiplication), carrying out the element-wise operation on the broadcast arrays, and feeding the result back into the original first (left) operand by chaining with the same operation, which in turn can be done as scalar in-place operations starting from the identity data.
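A minimal sketch of that recipe for the example at hand (plain Python, meant only to illustrate the proposed semantics, not a proposed patch; the loop stands in for the scalar in-place chaining):

import numpy

x = numpy.arange(6).reshape((3, 2))
y = numpy.arange(2)
u, v = numpy.broadcast_arrays(x, y)

w = v + u      # element-wise operation on the broadcast operands, shape (3, 2)
y[...] = 0     # fill the target with the identity of "+"
for row in w:
    y += row   # chain every element-wise result back in with "+"
# y is now [6, 12]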

A funny consequence is, e.g., the following hypothetical Python session (I turn the prompt arrows upwards to mark it as hypothetical):

^^^ a = numpy.ones(())
^^^ b = numpy.zeros((10, 1))
^^^ a
array(1.0)
^^^ b
array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]])
^^^ u, v = numpy.broadcast_arrays(a, b)
^^^ u
array([[ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.]])
^^^ v
array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]])
^^^ u += v
^^^ u
array([[10.],
       [10.],
       [10.],
       [10.],
       [10.],
       [10.],
       [10.],
       [10.],
       [10.],
       [10.]])
^^^ v
array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]])
^^^ a
array(10.0)
^^^ b
array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]])
^^^ 

Funnily, too, the operations "a += b" (in my case) and "y += x" (in Sebastian's case) would then be perfectly well defined, even though "a + b" and "y + x" require broadcasting of the first operand.  Currently they yield the following results:

(numpy-1.4.1, as above:)
>>> a += b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid return array shape

(numpy-1.6.1, as above:)
>>> y += x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: non-broadcastable output operand with shape (2) doesn't match the broadcast shape (3,2)

The result for Sebastian would then be [6, 12] for the y variable, as already noted above.
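Spelled out with operations that exist today (again only to illustrate the intended semantics, not the implementation):

import numpy

x = numpy.arange(6).reshape((3, 2))
y = numpy.arange(2)

y[...] = (y + x).sum(axis=0)   # collapse the broadcast sum back onto y
# y is now [6, 12] -- the value the proposed "y =+ y + x" would give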

Actually it is interesting that any additional parallelisation introduced by forcing the first operand to be broadcast changes it, even when the second operand is the zero element w.r.t. the operation considered.  But I think that would then be by design.  No one was asking for invariance under arbitrary repetition, no?

I don't know how this would work with multi-threading.  Probably not very well, as it would require locking on the then-shared target operand on the l.h.s. of, e.g., "a += b".  It is not easy to tell apart the cases where all elements of "a" are independent of each other, and doing without locks is possible at all only when the dtype is restricted to an atomic data type (which is always the case if object arrays are regarded as pointer arrays).  But I really know too little about this multi-threading optimisation; it is not my main point of interest and rather an additional idea, which can probably be made consistent with the proposal here if some work is put into it.

This was the post-scriptum, apparently.  F.


On 31.08.2012, at 11:31, Sebastian Walter wrote:

> Hi,
> 
> I'm using numpy 1.6.1 on Ubuntu 12.04.1 LTS.
> A code that used to work with an older version of numpy now fails with an error.
> 
> Were there any changes in the way inplace operations like +=, *=, etc.
> work on arrays with non-standard strides?
> 
> For the script:
> 
> ------- start of code -------
> 
> import numpy
> 
> x = numpy.arange(6).reshape((3,2))
> y = numpy.arange(2)
> 
> print 'x=\n', x
> print 'y=\n', y
> 
> u,v = numpy.broadcast_arrays(x, y)
> 
> print 'u=\n', u
> print 'v=\n', v
> print 'v.strides=\n', v.strides
> 
> v += u
> 
> print 'v=\n', v  # expectation: v = [[6,12], [6,12], [6,12]]
> print 'u=\n', u
> print 'y=\n', y  # expectation: y = [6,12]
> 
> ------- end of code -------
> 
> I get the output
> 
> -------- start of output ---------
> x=
> [[0 1]
> [2 3]
> [4 5]]
> y=
> [0 1]
> u=
> [[0 1]
> [2 3]
> [4 5]]
> v=
> [[0 1]
> [0 1]
> [0 1]]
> v.strides=
> (0, 8)
> v=
> [[4 6]
> [4 6]
> [4 6]]
> u=
> [[0 1]
> [2 3]
> [4 5]]
> y=
> [4 6]
> 
> -------- end of output --------
> 
> I would have expected that
> 
> v += u
> 
> performs an element-by-element +=
> 
> v[0,0] += u[0,0]  # increments y[0]
> v[0,1] += u[0,1]  # increments y[1]
> v[1,0] += u[1,0]  # increments y[0]
> v[1,1] += u[1,1]  # increments y[1]
> v[2,0] += u[2,0]  # increments y[0]
> v[2,1] += u[2,1]  # increments y[1]
> 
> yielding the result
> 
> y = [6,12]
> 
> but instead one obtains
> 
> y = [4, 6]
> 
> which could be the result of
> 
> v[2,0] += u[2,0]  # increments y[0]
> v[2,1] += u[2,1]  # increments y[1]
> 
> 
> Is this the intended behavior?
> 
> regards,
> Sebastian
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion



