np.asfortranarray: unnecessary copying?

What are the rules for when 'np.asarray' and 'np.asfortranarray' make a copy?

This makes sense to me:

In [3]: carr = np.arange(3)
In [6]: carr2 = np.asarray(carr)
In [8]: carr2[0] = 1
In [9]: carr
Out[9]: array([1, 1, 2])

No copy is made. But doing the same with a Fortran array makes a copy:

In [10]: farr = np.arange(3).copy('F')
In [12]: farr2 = np.asfortranarray(farr)
In [13]: farr2[0] = 1
In [14]: farr
Out[14]: array([0, 1, 2])

Could it be a 1D thing, since it's both C contiguous & F contiguous? Here's a 2D example:

In [15]: f2D = np.arange(10).reshape((2,5), order='F')
In [17]: f2D2 = np.asfortranarray(f2D)
In [19]: f2D2[0,0] = 10
In [20]: f2D
Out[20]:
array([[10,  2,  4,  6,  8],
       [ 1,  3,  5,  7,  9]])

So it looks like np.asfortranarray makes an unnecessary copy if the array is simultaneously 1D, C contiguous and F contiguous. Coercing the array with np.atleast_2d() makes asfortranarray behave.

Looking further, np.isfortran always returns False if the array is 1D, even if it's Fortran contiguous (and np.isfortran is documented as such).

What is the rationale here? Is it a 'column' vs. 'row' thing?

Kurt
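The copy checks above can also be done without mutating anything. The following is only a sketch: it assumes a NumPy recent enough to provide np.shares_memory (which postdates this thread), and the helper name made_copy is hypothetical. The 1D result depends on the NumPy version, so it is printed rather than assumed.

```python
import numpy as np

def made_copy(original, result):
    """Return True if `result` does not share memory with `original`."""
    return not np.shares_memory(original, result)

carr = np.arange(3)
farr = np.arange(3).copy('F')
# A 2D array whose data really is laid out in Fortran order.
f2d = np.arange(10).reshape(2, 5).copy('F')

copied_c = made_copy(carr, np.asarray(carr))
copied_f1d = made_copy(farr, np.asfortranarray(farr))
copied_f2d = made_copy(f2d, np.asfortranarray(f2d))

print("asarray on a C array copied:         ", copied_c)
print("asfortranarray on a 1D array copied: ", copied_f1d)  # version-dependent
print("asfortranarray on a 2D F array copied:", copied_f2d)
```

Comparing memory this way avoids having to write into the result and inspect the original afterwards.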

This seems to me to be a bug, or rather, two bugs. 1D arrays are automatically Fortran-ordered, so isfortran should return True for them (incidentally, the documentation should be edited to indicate that the data must also be contiguous in memory). Whether or not this change is made, there's no point in asfortranarray making a copy of a 1D array, since the copy isn't any more Fortran-ordered than the input array.

Another kind of iffy case is axes of length one. These should not affect C/Fortran order, since the length of their strides doesn't matter, but they do; if you use newaxis to add an axis to an array, it's still just as C/Fortran ordered as it was, but np.isfortran reports False. (Is there a np.isc or equivalent function?)

Incidentally, there is a subtle misconception in your example code: when reshaping an array, the order='F' has a different meaning. It has nothing direct to do with the memory layout; what it does is define the logical arrangement of elements used while reshaping the array. The array returned will be in C order if a copy must be made, or in whatever arbitrarily-strided order is necessary if the reshape can be done without a copy. As it happens, in your example, the latter case occurs and works out to Fortran order.

Anne

On 30 July 2010 13:50, Kurt Smith <kwmsmith@gmail.com> wrote:
> What are the rules for when 'np.asarray' and 'np.asfortranarray' make a copy?
> [...]
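The point about reshape can be seen on a small array. This is a minimal sketch, assuming a NumPy recent enough to provide np.shares_memory; the variable names are made up for illustration. It contrasts reshape(..., order='F'), which changes the logical arrangement of elements, with reshape followed by copy('F'), which keeps the default arrangement and only changes the memory layout.

```python
import numpy as np

flat = np.arange(6)

# order='F' here decides how elements are placed logically:
# the first index varies fastest while filling the new shape.
r = flat.reshape((2, 3), order='F')

# Default (C) logical arrangement, then an explicit Fortran-layout copy.
c = flat.reshape(2, 3).copy('F')

print(r.tolist())  # [[0, 2, 4], [1, 3, 5]]
print(c.tolist())  # [[0, 1, 2], [3, 4, 5]]

# Both happen to be Fortran contiguous, but only `c` is a fresh copy.
print(r.flags.f_contiguous, c.flags.f_contiguous)
print(np.shares_memory(flat, r), np.shares_memory(flat, c))
```

The two results hold different values at the same indices, which is why the two spellings are not interchangeable even when both end up Fortran contiguous.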

On Fri, Jul 30, 2010 at 1:33 PM, Anne Archibald <aarchiba@physics.mcgill.ca> wrote:
> This seems to me to be a bug, or rather, two bugs. 1D arrays are automatically Fortran-ordered, so isfortran should return True for them (incidentally, the documentation should be edited to indicate that the data must also be contiguous in memory). Whether or not this change is made, there's no point in asfortranarray making a copy of a 1D array, since the copy isn't any more Fortran-ordered than the input array.
Yep, these seem like bugs to me too. And they're related to the same thing in the numpy source: the FORTRAN flag always tests for at least 2-dimensional arrays, although some comments contradict this; see numpy/include/ndarrayobject.h:592:

/* Note: all 0-d arrays are CONTIGUOUS and FORTRAN contiguous. If a 1-d array is CONTIGUOUS it is also FORTRAN contiguous */

There is an array flag, 'fnc', that stands for something like "fortran-not-contiguous"; this is what isfortran checks. Non-intuitively, an array can have a.flags.f_contiguous == True but a.flags.fnc == False (0-D and 1-D contiguous arrays, for example).

Is there some rationale for this behavior in the code? It's enforced everywhere (ignoring the comments to the contrary), so it's intentional, but it makes no sense to me.

This Fortran stuff is very important to have working correctly for my project, 'fwrap'. I'm thinking of creating a wrapper module that has fixed versions of these functions. (I'd also like this fixed in numpy, but that might break old code that depends on the current semantics.) Something like:

from fwrap import fnp
fnp.isfortran(a)       # handles 0- and 1-D arrays correctly
fnp.asfortranarray(a)  # won't make unnecessary copies, etc.
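To make the idea concrete, here is a rough sketch of what such wrappers might look like. It is not fwrap's actual code: the function names are hypothetical, and it simply relies on the f_contiguous flag instead of the 'fnc' flag that np.isfortran consults.

```python
import numpy as np

def isfortran_any_dim(a):
    """Hypothetical helper: True whenever the data is Fortran contiguous,
    including 0-D and 1-D contiguous arrays."""
    return bool(np.asarray(a).flags.f_contiguous)

def asfortranarray_nocopy(a, dtype=None):
    """Hypothetical helper: return `a` itself when it is already Fortran
    contiguous, otherwise fall back to np.asfortranarray.
    Note: unlike np.asfortranarray, a 0-D input stays 0-D here."""
    a = np.asarray(a, dtype=dtype)
    if a.flags.f_contiguous:
        return a
    return np.asfortranarray(a)

v = np.arange(3)        # 1D: both C and F contiguous
m = np.ones((2, 3))     # 2D: C contiguous only

print(np.isfortran(v), isfortran_any_dim(v))
print(asfortranarray_nocopy(v) is v)

fm = asfortranarray_nocopy(m)
print(fm.flags.f_contiguous, fm is m)
```

The design choice is to treat "already Fortran contiguous" as the only condition for skipping the copy, regardless of the number of dimensions.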
> Another kind of iffy case is axes of length one. These should not affect C/Fortran order, since the length of their strides doesn't matter, but they do; if you use newaxis to add an axis to an array, it's still just as C/Fortran ordered as it was, but np.isfortran reports False. (Is there a np.isc or equivalent function?)
Good point; taking a C- or F-contiguous array and using np.newaxis sets a.flags.contiguous == False and a.flags.f_contiguous == False. So this is more general than just a Fortran issue. Example:

In [17]: a = np.arange(10)
In [18]: a.flags.contiguous
Out[18]: True
In [19]: anew = a[np.newaxis, :]
In [20]: anew.flags.contiguous
Out[20]: False
In [21]: anew.shape
Out[21]: (1, 10)
In [22]: other = np.empty((1,10))
In [23]: other.flags.contiguous
Out[23]: True
In [24]: other.shape
Out[24]: (1, 10)
In [25]: other.strides
Out[25]: (80, 8)
In [26]: anew.strides
Out[26]: (0, 8)
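The argument that length-one axes should not matter can be written down as a stride check that skips them. This is only a sketch with a hypothetical function name, not how NumPy computes its flags; the flag values NumPy reports for such arrays have changed between versions, so they are printed here rather than assumed.

```python
import numpy as np

def c_contiguous_ignoring_unit_axes(a):
    """Hypothetical check: C contiguity where axes of length 1 are skipped,
    since their stride is never used to step between elements."""
    if a.size == 0:
        return True
    expected = a.itemsize
    # Walk axes from last to first, as C order requires.
    for dim, stride in zip(reversed(a.shape), reversed(a.strides)):
        if dim == 1:
            continue  # stride of a length-1 axis is irrelevant
        if stride != expected:
            return False
        expected *= dim
    return True

a = np.arange(10)
anew = a[np.newaxis, :]
other = np.empty((1, 10))

print(anew.strides, other.strides)
print(anew.flags.c_contiguous, other.flags.c_contiguous)  # version-dependent
print(c_contiguous_ignoring_unit_axes(anew), c_contiguous_ignoring_unit_axes(other))
print(c_contiguous_ignoring_unit_axes(a[::2]))
```

Under this check, anew and other are treated the same even though their strides along the length-one axis differ.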
> Incidentally, there is a subtle misconception in your example code: when reshaping an array, the order='F' has a different meaning. It has nothing direct to do with the memory layout; what it does is define the logical arrangement of elements used while reshaping the array. The array returned will be in C order if a copy must be made, or in whatever arbitrarily-strided order is necessary if the reshape can be done without a copy. As it happens, in your example, the latter case occurs and works out to Fortran order.
Good catch; I should have done 'arr.reshape(2,5).copy('F')'.
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
participants (2)
- Anne Archibald
- Kurt Smith