Is numpy's argsort lying about its numpy.int32 types?
Hi,

I'm having a problem comparing some types when using numpy's argsort. I'm using numpy 1.0.2. I can reproduce it simply:

Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)]
<< snip >>
In [1]: from numpy import argsort, sort, int32, array
In [2]: x=array([1,3,2])
In [3]: aa=argsort(x)
In [4]: as=sort(x)
In [5]: type(aa[0])
Out[5]: <type 'numpy.int32'>
In [6]: type(as[0])
Out[6]: <type 'numpy.int32'>
In [7]: int32
Out[7]: <type 'numpy.int32'>
In [8]: type(as[0])==int32
Out[8]: True
In [9]: type(aa[0])==int32
Out[9]: False

Any of the three indices in aa give me the same problem. Can someone explain if I should be doing this a different way, and if this is a bug?!

Thanks,
Rob
Rob Clewley wrote:
Hi,
I'm having a problem comparing some types when using numpy's argsort. I'm using numpy 1.0.2. I can reproduce it simply:
In [1]: from numpy import argsort, sort, int32, array
In [2]: x=array([1,3,2])
In [3]: aa=argsort(x)
In [4]: as=sort(x)
In [5]: type(aa[0])
Out[5]: <type 'numpy.int32'>
In [6]: type(as[0])
Out[6]: <type 'numpy.int32'>
In [7]: int32
Out[7]: <type 'numpy.int32'>
In [8]: type(as[0])==int32
Out[8]: True
In [9]: type(aa[0])==int32
Out[9]: False
Any of the three indices in aa give me the same problem. Can someone explain if I should be doing this a different way, and if this is a bug?!
The "problem" is that on some platforms there are two types that display as "int32" (e.g. c_long and c_int are both numpy.int32 on my machine). The two are equivalent, which can be seen from the output of aa.dtype == as.dtype

-Travis
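A minimal sketch of the dtype comparison suggested above, written against a current NumPy rather than the 1.0.2 release used in the thread. The helper name `same_dtype` and the variable names are assumptions made for illustration only; `as` is avoided as a variable name because it is a keyword in later Python versions.

```python
import numpy as np

x = np.array([1, 3, 2])
order = np.argsort(x)   # index array; NumPy picks its integer type
values = np.sort(x)     # sorted copy, same dtype as x


def same_dtype(a, b):
    """Hypothetical helper: True if a and b have equivalent dtypes.

    Works for arrays and NumPy scalars, since np.asarray gives both a dtype.
    """
    return np.asarray(a).dtype == np.asarray(b).dtype


# A dtype built from the bit-width name compares equal to the C-named one,
# even if the scalar classes behind them are different Python objects.
c_named = np.dtype(np.intc)
bit_named = np.dtype(c_named.name)  # e.g. 'int32' where C int is 32 bits
names_agree = (c_named == bit_named)
```

Comparing dtypes asks whether two types describe the same data layout, which is the question the `type(...) == int32` test was trying to answer.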
Fair enough, but it does cause a *real* problem when I extract the values from aa and pass them on to other functions which try to compare their types to the integer types int and int32 that I can import from numpy. Since the values I'm testing could equally have been generated by functions that return the regular int type I can't guarantee that those values will have a dtype attribute!

I have some initialization code for a big class that has to set up some state differently depending on the type of the input. So, I was trying to do something like this

if type(x) in [int, int32]:
    ## do stuff specific to integer x

but now it seems like I'll need

try:
    isint = x.dtype == dtype('int32')
except AttributeError:
    isint = type(x) == int
if isint:
    ## do stuff specific to integer x

-- which is a mess! Is there a better way to do this test cleanly and robustly? And why couldn't c_long always correspond to a unique numpy name (i.e., not shared with int32) regardless of how it's implemented?

Either way it would be helpful to have a name for this "other" int32 that I can test against using the all-purpose type() ... so that I could test something like

type(x) in [int, int32_c_long, int32_c_int]

Thanks in advance for the clarification!
Rob
Rob Clewley wrote:
Fair enough, but it does cause a *real* problem when I extract the values from aa and pass them on to other functions which try to compare their types to the integer types int and int32 that I can import from numpy. Since the values I'm testing could equally have been generated by functions that return the regular int type I can't guarantee that those values will have a dtype attribute!
You don't have to use the bit-width names (which can be confusing) in such cases. There is a regular name for every C-like type. You can use the names byte, short, intc, int_, longlong (and the corresponding unsigned names prefixed with u).
I have some initialization code for a big class that has to set up some state differently depending on the type of the input. So, I was trying to do something like this
if type(x) in [int, int32]:
    ## do stuff specific to integer x
but now it seems like I'll need
try:
    isint = x.dtype == dtype('int32')
except AttributeError:
    isint = type(x) == int
if isint:
    ## do stuff specific to integer x
Try:
    if isinstance(x, (int, integer)):
integer is the superclass of all C-like integer types.
-- which is a mess! Is there a better way to do this test cleanly and robustly? And why couldn't c_long always correspond to a unique numpy name (i.e., not shared with int32) regardless of how it's implemented?
There is a unique numpy name for all of them. The bit-width names just can't be unique.
Either way it would be helpful to have a name for this "other" int32 that I can test against using the all-purpose type() ... so that I could test something like
type(x) in [int, int32_c_long, int32_c_int]
isinstance(x, (int, intc, int_))
is what you want.

-Travis
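The isinstance test recommended above can be wrapped in a small helper. This is only a sketch: the function name `is_integer_scalar` is hypothetical, and it uses the `np.integer` superclass mentioned earlier in the reply rather than listing individual C-named types.

```python
import numpy as np


def is_integer_scalar(x):
    """Hypothetical helper: True for Python ints and NumPy integer scalars."""
    # np.integer is the abstract base class of every NumPy integer scalar
    # type, so the result does not depend on which C type backs a bit width.
    return isinstance(x, (int, np.integer))


aa = np.argsort(np.array([1, 3, 2]))
index_is_int = is_integer_scalar(aa[0])
```

One caveat: Python's `bool` is a subclass of `int`, so `True` and `False` also pass this check; exclude them explicitly if that matters for the initialization code.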
Excellent. I didn't know isinstance could be used with a tuple in the second argument! That helps a lot. Cheers!
On Wed, Apr 18, 2007 at 10:00:06PM -0600, Travis Oliphant wrote:
Try:
    if isinstance(x, (int, integer)):
integer is the superclass of all C-like integer types.
Is issubdtype(x,int) also a safe bet?

Cheers
Stéfan
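The thread does not answer this question. As a hedged sketch only: passing the abstract class `np.integer` as the second argument avoids depending on which concrete type the builtin `int` is mapped to, which may differ between NumPy versions and platforms. The helper name `has_integer_dtype` is hypothetical.

```python
import numpy as np


def has_integer_dtype(x):
    """Hypothetical helper: True if x converts to an integer-typed array."""
    # np.asarray gives plain Python values a dtype as well, so there is no
    # AttributeError to catch for inputs that are not NumPy objects.
    return np.issubdtype(np.asarray(x).dtype, np.integer)


aa = np.argsort(np.array([1, 3, 2]))
index_has_int_dtype = has_integer_dtype(aa[0])
```

Unlike the isinstance test, this version also accepts lists and arrays of integers, and it rejects booleans because the boolean dtype is not a sub-dtype of `np.integer`.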
In a slightly different context, we have found a situation with the type comparison of two *general* objects (i.e., we don't know if they are numpy objects or something else) that confused us mightily.

For example, in a recursive general object comparison function we have:

def compare(A, B):
    if type(A) is not type(B):
        return False
    <test for various types of object, split and compare components>

where the <...> code may call compare() recursively after splitting complex objects into less complex objects.

The confusion comes when we debug a comparison that should return *equal* but doesn't, because the type comparison says the objects have different types. Printing the type() of A and B shows both as numpy.int32, but they are *not* the same type (the id(type()) values differ). That is confusing.

Wouldn't it be better if numpy types that have the same underlying bit representation (integer, 32 bit) used the same type object? Or, if that can't be done, could the different type objects display different representation strings? That would remove the confusion we experience when we see the same type string for objects that aren't the same type.

Ross
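One possible workaround for the type check in compare(), offered as a sketch rather than as anything proposed in the thread: when both objects are NumPy scalars, compare their dtypes, and fall back to identity of the type objects otherwise. The function name `same_type` is hypothetical.

```python
import numpy as np


def same_type(a, b):
    """Hypothetical stand-in for the `type(A) is type(B)` test in compare()."""
    if isinstance(a, np.generic) and isinstance(b, np.generic):
        # NumPy scalars: equivalent dtypes count as the same type, even when
        # the scalar classes are distinct objects that print the same name.
        return a.dtype == b.dtype
    # Everything else keeps the original strict identity check.
    return type(a) is type(b)
```

With this check, two scalars that both print as the same bit-width type are treated as matching, while a NumPy integer and a plain Python int are still reported as different types.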
participants (5)
- Rob Clewley
- Ross Wilson
- Stefan van der Walt
- Travis Oliphant
- Travis Oliphant