Mailman 3 Matlab/Numpy index order - NumPy-Discussion

Matlab/Numpy index order

brechmos

12 May 2009 12 May '09

7:51 p.m.

I am very new to Numpy and relatively new to Python. I have used Matlab for 15+ years now. But, I am starting to lean toward using Numpy for all my work. One thing that I am not understanding is the order of data when read in from a file. Let's say I have a 256x256x150 uint16 dataset (MRI, 150 slices). In Matlab I would read it in as:

...

...
fp=fopen(<binary file>); d = fread(fp,256*256*150, 'int16'); fclose(fp); c = reshape(d, [256 256 150]); imagesc(c(:,:,1));

I am very used to having it read it in and doing the reshaping such that it is 256 rows by 256 columns by 150 slices. Now, Numpy, I can read in the binary data using fromfile (or open, read, close): In [85]: a = fromfile(<binary file>, dtype='int16') In [86]: b = array(struct.unpack('<%dH'%(256*256*150), a)).reshape(150,256,256) So, in Numpy I have to reshape it so the "slices" are in the first dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to look like Matlab, but... I don't understand why the index ordering is different between Matlab and Numpy. (It isn't a C/Fortran ordering thing, I don' think). Is the data access faster if I have b without the tranpose, or can I transpose it so it "looks" like Matlab without taking a hit when I do imshow( b[:,:,0] ). Any help for a Numpy newbie would be appreciated. -- View this message in context: http://www.nabble.com/Matlab-Numpy-index-order-tp23509178p23509178.html Sent from the Numpy-discussion mailing list archive at Nabble.com.

Show replies by thread

Ryan May

12 May 12 May

7:55 p.m.

On Tue, May 12, 2009 at 2:51 PM, brechmos <craig@brechmos.org> wrote:

...

So, in Numpy I have to reshape it so the "slices" are in the first dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to look like Matlab, but...

I don't understand why the index ordering is different between Matlab and Numpy. (It isn't a C/Fortran ordering thing, I don' think).

Actually, that's precisely the reason.

...

Is the data access faster if I have b without the tranpose, or can I transpose it so it "looks" like Matlab without taking a hit when I do imshow( b[:,:,0] ).

It's going to be faster to do it without the transpose. Besides, for numpy, that imshow becomes: imshow(b[0]) Which, IMHO, looks better than Matlab. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma

brechmos

8:02 p.m.

Ah, hah. In [3]: c = b.reshape((256,256,150), order='F') Ok, I needed more coffee. If I do it this way (without the transpose), it should be as fast as c=b.reshape((150,256,256)), right? It is just changing the stride (or something like that)? Or is it going to be faster without changing the order? Thanks for the help. Ryan May-3 wrote:

...

On Tue, May 12, 2009 at 2:51 PM, brechmos <craig@brechmos.org> wrote:

...
So, in Numpy I have to reshape it so the "slices" are in the first dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to look like Matlab, but...

I don't understand why the index ordering is different between Matlab and Numpy. (It isn't a C/Fortran ordering thing, I don' think).

Actually, that's precisely the reason.

...
Is the data access faster if I have b without the tranpose, or can I transpose it so it "looks" like Matlab without taking a hit when I do imshow( b[:,:,0] ).

It's going to be faster to do it without the transpose. Besides, for numpy, that imshow becomes:

imshow(b[0])

Which, IMHO, looks better than Matlab.

Ryan

-- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma

_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

-- View this message in context: http://www.nabble.com/Matlab-Numpy-index-order-tp23509178p23509345.html Sent from the Numpy-discussion mailing list archive at Nabble.com.

Robert Kern

8:02 p.m.

On Tue, May 12, 2009 at 14:55, Ryan May <rmay31@gmail.com> wrote:

...

On Tue, May 12, 2009 at 2:51 PM, brechmos <craig@brechmos.org> wrote:

...
So, in Numpy I have to reshape it so the "slices" are in the first dimension. Obviously, I can do a b.transpose( (1,2,0) ) to get it to look like Matlab, but...

I don't understand why the index ordering is different between Matlab and Numpy. (It isn't a C/Fortran ordering thing, I don' think).

Actually, that's precisely the reason.

To expand on this comment, when Matlab was first released, it was basically just an interactive shell on top of FORTRAN routines from LAPACK and other linear algebra *PACKs. Consequently, it standardized on FORTRAN's column-major format. While numpy isn't really beholden to C's ordering for multidimensional arrays (numpy arrays are just blocks of strided memory, not x[i][j][k] arrays of pointers to arrays of pointers to arrays), we do want consistency with the equivalent nested Python lists, and that does imply row-major formatting by default. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

David Warde-Farley

8:14 p.m.

On 12-May-09, at 3:55 PM, Ryan May wrote:

...

It's going to be faster to do it without the transpose. Besides, for numpy, that imshow becomes:

imshow(b[0])

Which, IMHO, looks better than Matlab.

You're right, that is better, odd how I never thought of doing it like that. I've been stuck in my Matlab-esque world with dstack() as my default mental model of how images/matrices ought to be stacked. Am I right in thinking that b[0] is stored in a big contiguous block of memory, thus making the read marginally faster than slicing on the third? David

Chris Colbert

8:32 p.m.

This is interesting. I have always done RGB imaging with numpy using arrays of shape (height, width, 3). In fact, this is the form that PIL gives when calling np.asarray() on a PIL image. It does seem more efficient to be able to do a[0],a[1],a[2] to get the R, G, and B channels respectively. This, obviously is not currently the case. Would it be better for me to switch to this way of doing things and/or work a patch for PIL so that the array is built in the form (3, height, width)? Chris On Tue, May 12, 2009 at 4:14 PM, David Warde-Farley <dwf@cs.toronto.edu>wrote:

...

On 12-May-09, at 3:55 PM, Ryan May wrote:

...
It's going to be faster to do it without the transpose. Besides, for numpy, that imshow becomes:

imshow(b[0])

Which, IMHO, looks better than Matlab.

You're right, that is better, odd how I never thought of doing it like that. I've been stuck in my Matlab-esque world with dstack() as my default mental model of how images/matrices ought to be stacked.

Am I right in thinking that b[0] is stored in a big contiguous block of memory, thus making the read marginally faster than slicing on the third?

David _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

Robert Kern

8:47 p.m.

On Tue, May 12, 2009 at 15:32, Chris Colbert <sccolbert@gmail.com> wrote:

...

This is interesting.

I have always done RGB imaging with numpy using arrays of shape (height, width, 3). In fact, this is the form that PIL gives when calling np.asarray() on a PIL image.

It does seem more efficient to be able to do a[0],a[1],a[2] to get the R, G, and B channels respectively. This, obviously is not currently the case.

It's not *that* much more efficient.

...

Would it be better for me to switch to this way of doing things and/or work a patch for PIL so that the array is built in the form (3, height, width)?

Submitting a patch for PIL would neither be successful, nor worth your time. Not to mention breaking existing code. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

5626

Age (days ago)

5626

Last active (days ago)

List overview

Download

6 comments

5 participants

participants (5)

brechmos
Chris Colbert
David Warde-Farley
Robert Kern
Ryan May

Matlab/Numpy index order

brechmos

Ryan May

brechmos

Robert Kern

David Warde-Farley

Chris Colbert

Robert Kern

tags

participants (5)