Type casting problems with numpy.take

Hi,
I see there is a lot of ongoing discussion on casting rules, but I couldn't find any reference to the following issue I am facing. I am trying to 'take' from an array of uint8's, using an array of uint16's as indices. Even though the return dtype would be uint8, I want to direct the output back into the array of uint16's:
import numpy as np
lut = np.random.randint(256, size=(65536,)).astype('uint8')
arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')
np.take(lut, arr, out=arr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 103, in take
    return take(indices, axis, out, mode)
TypeError: array cannot be safely cast to required type
This is puzzling, since the only casting that should be happening is from uint8's to uint16's, which is as safe as it gets:
np.can_cast('uint8', 'uint16')
True
To make things even weirder, I can get the above code to work if the type of lut is uint16, uint32, uint64, int32 or int64, but not if it is uint8, int8 or int16. Without looking at the source, it almost looks as if the type checking in numpy.take were reversed... Am I missing something, or is this broken?
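The "reversed check" hypothesis can be tested without touching the source. The sketch below is an illustration added to the thread, not the check numpy.take itself performs. It asks np.can_cast in both directions for each lut dtype mentioned above; the names out_dtype, lut_dtypes, forward_ok and reverse_ok are made up for the example.

```python
import numpy as np

# dtype of the index/output array in the example above
out_dtype = np.dtype('uint16')

# lut dtypes tried in the message
lut_dtypes = ['uint8', 'int8', 'int16', 'uint16', 'uint32', 'uint64', 'int32', 'int64']

forward_ok = {}  # lut -> out: the cast one would expect take to need
reverse_ok = {}  # out -> lut: the cast a reversed check would test
for name in lut_dtypes:
    lut_dtype = np.dtype(name)
    forward_ok[name] = np.can_cast(lut_dtype, out_dtype)
    reverse_ok[name] = np.can_cast(out_dtype, lut_dtype)
    print(name, forward_ok[name], reverse_ok[name])
```

The reverse direction is safe exactly for uint16, uint32, uint64, int32 and int64, which matches the set of lut types reported to work.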
My numpy's version is:
np.__version__
'1.6.2'
which is the one packaged in Python(x,y) 2.7.3.1, running on a 64-bit Windows 7 system.
Thanks,
Jaime
P.S. I have posted the same question on Stack Overflow: http://stackoverflow.com/questions/14782135/type-casting-error-with-numpy-ta...

On 9 February 2013 00:12, Jaime Fernández del Río jaime.frio@gmail.com wrote:
TypeError: array cannot be safely cast to required type
With version 1.7.rc1
TypeError: Cannot cast array data from dtype('uint16') to dtype('uint8') according to the rule 'safe'.
I have also tried a lut with bigger values (ones that would need uint32), so that they are modified when cast to uint16, and it does so without complaining:
lut = np.random.randint(256000, size=(65536,)).astype('uint16')
arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')
np.take(lut, arr, out=arr)
arr.dtype
dtype('uint16')

On Fri, Feb 8, 2013 at 3:54 PM, Daπid davidmenhur@gmail.com wrote:
TypeError: Cannot cast array data from dtype('uint16') to dtype('uint8') according to the rule 'safe'.
That really makes it sound like the check is being done the other way around!
But I'd be surprised if something so obvious hadn't been seen and reported earlier, especially since I have tried it on a Linux box with older versions, and things were the same in 1.2.1. So that means this would be a 5 year old bug.
np.__version__
'1.2.1'
lut = np.random.randint(256, size=(65536,)).astype('uint8')
arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')
np.take(lut, arr)
array([[ 56, 131, 248, ..., 233,  34, 191],
       [229, 217, 233, ..., 183,   8,  86],
       [249, 238,  79, ...,  38,  17,  72],
       ...,
       [ 19,  95, 199, ..., 236, 148,  39],
       [178, 129, 208, ...,  76,  46, 125],
       [ 66, 196,  71, ..., 227, 252,  94]], dtype=uint8)
np.take(lut, arr, out=arr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/dist-packages/numpy/core/fromnumeric.py", line 97, in take
    return take(indices, axis, out, mode)
TypeError: array cannot be safely cast to required type
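Since take without out succeeds, one possible workaround is to let take allocate its own uint8 result and then assign it back into the uint16 array. This is only a sketch, not something proposed in the thread. It costs a temporary array the size of arr, and the name expected is introduced just to check the result.

```python
import numpy as np

lut = np.random.randint(256, size=(65536,)).astype('uint8')
arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')

# Reference result computed with plain fancy indexing (dtype uint8)
expected = lut[arr]

# take() builds a fresh uint8 array first; the slice assignment then
# casts uint8 -> uint16 while writing into the existing arr buffer.
arr[...] = np.take(lut, arr)

print(arr.dtype)
```

The right-hand side is fully evaluated before the assignment starts, so reading indices from arr and writing results into arr do not interfere.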

On Fri, Feb 8, 2013 at 5:34 PM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
But I'd be surprised if something so obvious hadn't been seen and reported earlier, especially since I have tried it on a Linux box with older versions, and things were the same in 1.2.1. So that means this would be a 5 year old bug.
My money is on 'five year old bug'. Many basic numpy functions are not well tested. Writing tests is a tedious job but doesn't require any C foo, just Python and patience, so if anyone would like to get involved...
Chuck

On Fri, Feb 8, 2013 at 6:44 PM, Charles R Harris charlesr.harris@gmail.com wrote:
My money is on 'five year old bug'.
A bug indeed it seems to be. I have cloned the source code, and in item_selection.c, in function PyArray_TakeFrom, when 'out' is an argument in the call, the code is actually trying to cast 'out' to the type of 'self' (the first array in the call to take):
int flags = NPY_ARRAY_CARRAY | NPY_ARRAY_UPDATEIFCOPY;
dtype = PyArray_DESCR(self);
obj = (PyArrayObject *)PyArray_FromArray(out, dtype, flags);
I have also been looking at PyArray_FromArray in ctors.c, and it would be very easy to fix the broken behaviour by adding NPY_ARRAY_FORCECAST to the flags in the call to PyArray_FromArray: the casting mode would then be changed to NPY_UNSAFE_CASTING, and that should do away with the error.
I'm not sure if smarter type checking is in order here; that would require a more in-depth redoing of how PyArray_TakeFrom operates. I think ufuncs let you happily cast unsafely, so maybe take should just do the same? Or should 'self' be cast to the type of 'out'? Would that break anything else?
But if nothing else, the above fix should just make the current possibly dysfunctional typecasting a consistent feature of numpy, which would be better than what's going on right now.
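The cast described above can be mimicked from Python as a rough analogy. The sketch below is not the C code path. It assumes ndarray.astype with its casting keyword (available from NumPy 1.7) as a stand-in for PyArray_FromArray, with casting='safe' playing the current behaviour and casting='unsafe' playing the proposed NPY_ARRAY_FORCECAST. The variable names are illustrative.

```python
import numpy as np

lut = np.zeros(4, dtype='uint8')    # plays the role of 'self'
out = np.zeros(4, dtype='uint16')   # plays the role of 'out'

# Current behaviour: 'out' is converted to the dtype of 'self' under the
# safe rule, and uint16 -> uint8 is not a safe cast.
try:
    out.astype(lut.dtype, casting='safe')
    safe_cast_ok = True
except TypeError as exc:
    safe_cast_ok = False
    print(exc)

# Proposed behaviour: forcing the cast makes the same conversion succeed.
forced = out.astype(lut.dtype, casting='unsafe')
print(safe_cast_ok, forced.dtype)
```

The direction of the failing cast (uint16 to uint8) is the same one named in the 1.7.rc1 error message quoted earlier in the thread.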
So, where do I go to file a bug report? Should I try to send the above proposed change as a patch? I am not sure how to do either. Is there a reference explaining it in a little more depth that you can point me to?
Many basic numpy functions are not well tested. Writing tests is a tedious job but doesn't require any C foo, just Python and patience, so if anyone would like to get involved...
How does one get involved?
Jaime

On Fri, Feb 8, 2013 at 11:27 PM, Jaime Fernández del Río <jaime.frio@gmail.com> wrote:
So, where do I go to file a bug report? Should I try to send the above proposed change as a patch? I am not sure how to do either. Is there a reference explaining it in a little more depth that you can point me to?
How does one get involved?
Just as you have done: start by taking a look; the next step is a PR. There are guidelines for developers (http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#development-workflow), but if you are on github, just fork numpy, make a branch on your fork with the changes, and hit the PR button while you have the branch checked out. It might take some trial and error, but you are unlikely to cause any damage in the process.
For testing, the next step would be to write a test that exercises the take function with all combinations of types, etc. The current tests look to be in numpy/core/tests/test_item_selection.py, with some bits in test_regressions.py. Fixes to the C code need to come with a test, so you will end up writing tests anyway. I find writing tests takes more time and work than fixing the bugs.
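A test over combinations of types might start out along these lines. This is a hedged sketch, not the contents of test_item_selection.py: the function name check_take_all_dtypes and the dtype lists are hypothetical. It only covers calls expected to pass regardless of the bug discussed here (no out argument, or an out array whose dtype matches the source array).

```python
import itertools
import numpy as np

# Hypothetical dtype lists for the sketch; a real test would likely cover more.
value_dtypes = ['uint8', 'int8', 'int16', 'uint16', 'int32', 'uint32', 'int64', 'uint64']
index_dtypes = ['uint8', 'uint16', 'int32', 'int64']


def check_take_all_dtypes():
    """Compare np.take against fancy indexing for each dtype pair."""
    checked = 0
    for vt, it in itertools.product(value_dtypes, index_dtypes):
        lut = np.arange(100).astype(vt)
        idx = np.array([[0, 5, 99], [42, 7, 1]]).astype(it)

        # Without 'out': result has the dtype of the source array.
        result = np.take(lut, idx)
        assert result.dtype == lut.dtype
        assert result.shape == idx.shape
        assert np.array_equal(result, lut[idx])

        # With 'out' of the same dtype as the source: no cast is needed.
        out = np.empty(idx.shape, dtype=vt)
        np.take(lut, idx, out=out)
        assert np.array_equal(out, result)
        checked += 1
    return checked


n_checked = check_take_all_dtypes()
print(n_checked)
```

A regression test for the bug itself would add the mixed-dtype out case from the original report, once the intended casting behaviour has been decided.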
Chuck