Hi,
I am confused. Here's the reason:
The following structure is a representation of N points in 3D space:
U = numpy.array([[x1,y1,z1], [x1,y1,z1],...,[xn,yn,zn]])
So the array U has shape (N,3). This order makes sense to me since U[i]
will give you the i'th point in the set. Now, I want to pass this array
to a C++ function that does some stuff with the points. Here's how I do
that
void Foo::doStuff(int n, PyObject * numpy_data)
{
// Get pointer to data
double * const positions = (double *) PyArray_DATA(numpy_data);
// Print positions
for (int i=0; i
Not exactly an answer to your question, but I can highly recommend using Boost.python, PyUblas and Ublas for your C++ vectors and matrices. It gives you a really good interface on the C++ side to numpy arrays and matrices, which can be passed in both directions over the language threshold with no copying. If I had to guess I'd say sometimes when transposing numpy simply sets a flag internally to avoid copying the data, but in some cases (such as perhaps when multiplication needs to take place) the data has to be placed in a new object. Accessing the data via raw pointers in C++ may not be checking for the 'transpose' flag and therefore you see an unexpected result. Disclaimer: this is just a guess, someone more familiar with Numpy internals will no doubt be able to correct me. Malcolm
On Tue, Jan 31, 2012 at 6:14 AM, Malcolm Reynolds
Not exactly an answer to your question, but I can highly recommend using Boost.python, PyUblas and Ublas for your C++ vectors and matrices. It gives you a really good interface on the C++ side to numpy arrays and matrices, which can be passed in both directions over the language threshold with no copying.
or use Cython...
If I had to guess I'd say sometimes when transposing numpy simply sets a flag internally to avoid copying the data, but in some cases (such as perhaps when multiplication needs to take place) the data has to be placed in a new object.
good guess:
V = numpy.dot(R, U.transpose()).transpose()
a array([[1, 2], [3, 4], [5, 6]]) a.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
b = a.transpose() b.flags C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
so the transpose() simple re-arranges the strides to Fortran order, rather than changing anything in memory. np.dot() produces a new array, so it is C-contiguous, then you transpose it, so you get a fortran-ordered array.
Now when I call my C++ function from the Python side, all the data in V is printed, but it has been transposed.
as mentioned, if you are working with arrays in C++ (or fortran, orC, or...) and need to count on the ordering of the data, you need to check it in your extension code. There are utilities for this.
However, if I do:
V = numpy.array(U.transpose()).transpose()
right: In [7]: a.flags Out[7]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [8]: a.transpose().flags Out[8]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False In [9]: np.array( a.transpose() ).flags Out[9]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False so the np.array call doesn't re-arrange the order if it doesn't need to. If you want to force it, you can specify the order: In [10]: np.array( a.transpose(), order='C' ).flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False (note: this does surprise me a bit, as it is making a copy, but there you go -- if order matters, specify it) In general, numpy does a lot of things for the sake of efficiency -- avoiding copies when it can, for instance -- this give efficiency and flexibility, but you do need to be careful, particularly when interfacing with the binary data directly. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On 31/01/2012 18:23, Chris Barker wrote:
On Tue, Jan 31, 2012 at 6:14 AM, Malcolm Reynolds
wrote: Not exactly an answer to your question, but I can highly recommend using Boost.python, PyUblas and Ublas for your C++ vectors and matrices. It gives you a really good interface on the C++ side to numpy arrays and matrices, which can be passed in both directions over the language threshold with no copying. or use Cython...
If I had to guess I'd say sometimes when transposing numpy simply sets a flag internally to avoid copying the data, but in some cases (such as perhaps when multiplication needs to take place) the data has to be placed in a new object. good guess:
V = numpy.dot(R, U.transpose()).transpose()
a array([[1, 2], [3, 4], [5, 6]]) a.flags C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
b = a.transpose() b.flags C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
so the transpose() simple re-arranges the strides to Fortran order, rather than changing anything in memory.
np.dot() produces a new array, so it is C-contiguous, then you transpose it, so you get a fortran-ordered array.
Now when I call my C++ function from the Python side, all the data in V is printed, but it has been transposed. as mentioned, if you are working with arrays in C++ (or fortran, orC, or...) and need to count on the ordering of the data, you need to check it in your extension code. There are utilities for this.
However, if I do: V = numpy.array(U.transpose()).transpose() right:
In [7]: a.flags Out[7]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
In [8]: a.transpose().flags Out[8]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
In [9]: np.array( a.transpose() ).flags Out[9]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
so the np.array call doesn't re-arrange the order if it doesn't need to. If you want to force it, you can specify the order:
In [10]: np.array( a.transpose(), order='C' ).flags Out[10]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False
(note: this does surprise me a bit, as it is making a copy, but there you go -- if order matters, specify it)
In general, numpy does a lot of things for the sake of efficiency -- avoiding copies when it can, for instance -- this give efficiency and flexibility, but you do need to be careful, particularly when interfacing with the binary data directly.
-Chris
Thanks for all the answers to my question. Helped a lot. Mads -- +-----------------------------------------------------+ | Mads Ipsen | +----------------------+------------------------------+ | Gåsebæksvej 7, 4. tv | | | DK-2500 Valby | phone: +45-29716388 | | Denmark | email: mads.ipsen@gmail.com | +----------------------+------------------------------+
Hi,
On Tue, Jan 31, 2012 at 8:29 AM, Mads Ipsen
Hi,
I am confused. Here's the reason:
The following structure is a representation of N points in 3D space:
U = numpy.array([[x1,y1,z1], [x1,y1,z1],...,[xn,yn,zn]])
So the array U has shape (N,3). This order makes sense to me since U[i] will give you the i'th point in the set. Now, I want to pass this array to a C++ function that does some stuff with the points. Here's how I do that
void Foo::doStuff(int n, PyObject * numpy_data) { // Get pointer to data double * const positions = (double *) PyArray_DATA(numpy_data);
// Print positions for (int i=0; i
printf("Pos[%d] = %f %f %f\n", x, y, z); } }
When I call this routine, using a swig wrapped Python interface to the C++ class, everything prints out nice.
Now, I want to apply a rotation to all the positions. So I set up some rotation matrix R like this:
R = numpy.array([[r11,r12,r13], [r21,r22,r23], [r31,r32,r33]])
To apply the matrix to the data in one crunch, I do
V = numpy.dot(R, U.transpose()).transpose()
Now when I call my C++ function from the Python side, all the data in V is printed, but it has been transposed. So apparently the internal data structure handled by numpy has been reorganized, even though I called transpose() twice, which I would expect to cancel out each other.
However, if I do:
V = numpy.array(U.transpose()).transpose()
and call the C++ routine, everything is perfectly fine, ie. the data structure is as expected.
What went wrong?
The numpy array reserves the right to organize its data internally. For example, a numpy array can be in Fortran order in memory, or C order in memory, and many more complicated schemes. You might want to have a look at: http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#internal-memor... If you depend on a particular order for your array memory, you might want to look at: http://docs.scipy.org/doc/numpy/reference/generated/numpy.ascontiguousarray.... Best, Matthew
participants (4)
-
Chris Barker
-
Mads Ipsen
-
Malcolm Reynolds
-
Matthew Brett