On Fri, Feb 8, 2013 at 11:27 PM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
On Fri, Feb 8, 2013 at 6:44 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
My money is on 'five year old bug'.
 
A bug indeed it seems to be. I have cloned the source code, and in item_selection.c, in function PyArray_TakeFrom, when 'out' is an argument in the call, the code is actually trying to cast 'out' to the type of 'self' (the first array in the call to take):

int flags = NPY_ARRAY_CARRAY | NPY_ARRAY_UPDATEIFCOPY;
dtype = PyArray_DESCR(self);
obj = (PyArrayObject *)PyArray_FromArray(out, dtype, flags);

I have also been looking at PyArray_FromArray in ctors.c, and the broken behaviour looks very easy to fix: adding NPY_ARRAY_FORCECAST to the flags in the call to PyArray_FromArray would change the casting mode to NPY_UNSAFE_CASTING, and that should do away with the error.
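
To illustrate the casting check from the Python side, here is a minimal sketch. It assumes, as my reading of ctors.c suggests, that PyArray_FromArray uses safe casting unless NPY_ARRAY_FORCECAST is set. np.can_cast is only a stand-in for the C-level check, and the two dtypes are picked purely as an example:

```python
import numpy as np

# Hypothetical example dtypes: 'self' is the array passed to take,
# 'out' is the array passed as out=.
self_dtype = np.dtype(np.int64)
out_dtype = np.dtype(np.float64)

# The quoted C code converts 'out' to the dtype of 'self', so the
# cast being checked runs from out_dtype to self_dtype.
safe_ok = np.can_cast(out_dtype, self_dtype, casting="safe")
unsafe_ok = np.can_cast(out_dtype, self_dtype, casting="unsafe")

print("safe:", safe_ok)      # float64 -> int64 is not a safe cast
print("unsafe:", unsafe_ok)  # unsafe casting permits any conversion
```

Note the direction: the check runs from the type of 'out' to the type of 'self', which is the reverse of the direction in which the taken values end up being written.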

I'm not sure whether smarter type checking is in order here; that would require a more in-depth redoing of how PyArray_TakeFrom operates. I think ufuncs let you happily cast unsafely, so maybe take should just do the same? Or should 'self' be cast to the type of 'out'? Would that break anything else?

But if nothing else, the above fix should just make the current possibly dysfunctional typecasting a consistent feature of numpy, which would be better than what's going on right now.

So, where do I go to file a bug report? Should I try to send the above proposed change as a patch? I am not sure how to do either; is there a reference explaining it in a little more depth that you can point me to?
 
Many basic numpy functions are not well tested. Writing tests is a tedious job but doesn't require any C foo, just Python and patience, so if anyone would like to get involved...

How does one get involved?


Just as you have done, by starting with a look; the next step is a PR. There are guidelines for developers, but if you are on github, just fork numpy, make a branch on your fork with the changes, and hit the PR button while you have the branch checked out. It might take some trial and error, but you are unlikely to cause any damage in the process. For testing, the next step would be to write a test that exercises the take function with all combinations of types, etc. The current tests look to be in numpy/core/tests/test_item_selection.py, with some bits in test_regressions.py. Fixes to the C code need to come with a test, so you will end up writing tests anyway. I find writing tests takes more time and work than fixing the bugs.
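
As a rough starting point for such a test, something along these lines might do. This is only a sketch: the dtype list and the helper names take_with_out and survey are made up for illustration, and it merely records which combinations succeed or raise rather than asserting what the right behaviour should be, since that is still to be decided.

```python
import itertools

import numpy as np

# Example dtypes only; a real test would cover many more.
DTYPES = [np.int32, np.int64, np.float32, np.float64]


def take_with_out(src_dtype, out_dtype):
    """Call np.take with an 'out' array and report what happened.

    Returns ("ok", out_array) on success, or ("error", exception name).
    """
    src = np.arange(10).astype(src_dtype)
    indices = np.array([1, 3, 5])
    out = np.zeros(indices.shape, dtype=out_dtype)
    try:
        np.take(src, indices, out=out)
    except (TypeError, ValueError) as exc:
        return "error", type(exc).__name__
    return "ok", out


def survey():
    """Run every (source dtype, out dtype) combination once."""
    results = {}
    for src_dtype, out_dtype in itertools.product(DTYPES, repeat=2):
        results[(src_dtype, out_dtype)] = take_with_out(src_dtype, out_dtype)
    return results


if __name__ == "__main__":
    for (src_dtype, out_dtype), (status, _) in survey().items():
        print(src_dtype.__name__, "->", out_dtype.__name__, status)
```

Once the intended casting rule is settled, the recorded statuses can be turned into assertions and the whole thing moved into the existing test files.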

Chuck