[Numpy-discussion] Change default order to Fortran order

Matthew Brett matthew.brett at gmail.com
Mon Aug 3 04:49:35 EDT 2015


Hi,

On Mon, Aug 3, 2015 at 8:09 AM, Nathaniel Smith <njs at pobox.com> wrote:
> On Aug 2, 2015 11:06 PM, "Kang Wang" <kwang24 at wisc.edu> wrote:
>>
>> This is very good discussion. Thank you all for replying.
>>
>> I can see the fundamental difference is that I always
>> think/talk/read/write a 3D image as I(x, y, z), not (plane, row, column) . I
>> am coming from MRI (magnetic resonance imaging) research, and I can assure
>> you that the entire MRI community is using (x, y, z), including books,
>> journal papers, conference abstracts, presentations, everything. We even
>> talk about what we called "logical x/y/z" and "physical x/y/z", and the
>> rotation matrix that converts the two coordinate systems. The radiologists
>> are also used to (x, y, z). For example, we always say "my image is 256 by
>> 256 by 20 slices", and we never say "20 by 256 by 256".
>>
>> So, basically, at least in MRI, we always talk about an image as I(x, y,
>> z), and we always assume that "x" is the fastest changing index. That's why
>> I prefer column-major (because it is more natural).
>>
>> Of course, I can totally get my work done by using row-major, I just have
>> to always remind myself "write last dimension index first" when coding. I
>> actually have done this before, and I found it would be so much easier if
>> just using column-major.
>
> Why not just use I[x, y, z] like you're used to, and let the computer worry
> about the physical layout in memory? Sometimes this will be Fortran order
> and sometimes C order and sometimes something else, but you don't have to
> know or care; 99% of the time it doesn't matter. The worst case is that when
> you use a python wrapper to call into a library that can only handle Fortran
> order, then the wrapper will quietly have to convert the memory order around
> and it will be slightly slower than if you had used Fortran order in the
> first place. But in practice you'll barely ever notice this, and when you
> do, *then* you can tell numpy explicitly what memory layout you want in the
> situation where it matters.

Yes - if you are using numpy, you really have to look numpy in the eye and say:

"I will let you worry about the array element order in memory, and in
return, you promise to make indexing work as I would expect"

Just for example, let's say you loaded an MRI image into memory:

In [1]: import nibabel
In [2]: img = nibabel.load('my_mri.nii')
In [3]: data = img.get_data()

Because NIfTI images are Fortran memory layout, this happens to be the
memory layout you get for your array:

In [4]: data.flags
Out[4]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

But now - in Python - all I care about is what data I have on the
first, second, third axes.  For example, I could do this:

In [5]: data_copy = data.copy()

This has exactly the same values as the original array, and at the
same index positions:

In [7]: import numpy as np
In [8]: np.all(data == data)
Out[8]: memmap(True, dtype=bool)

but I now have a C memory layout array.

In [9]: data_copy.flags
Out[9]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

Worse than that, if I slice my original data array, then I get an
array that is neither C- or Fortran- compatible in memory:

In [10]: data_view = data[:, :, ::2]
In [11]: data_view.flags
Out[11]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

So - if you want every array to be Fortran-contiguous in memory, I
would not start with numpy at all, I would write your own array
library.

The alternative - or "the numpy way" - is to give up on enforcing a
particular layout in memory, until you need to pass an array to some C
or C++ or Fortran code that needs some particular layout, in which
case you get your extension code to copy the array into the required
layout on entry.   Of course this is what numpy itself has to do when
interfacing with external libraries like BLAS or LAPACK.

Cheers,

Matthew



More information about the NumPy-Discussion mailing list