[Numpy-discussion] comparison operators (e.g. ==) on array with dtype object do not work

Warren Weckesser warren.weckesser at enthought.com
Thu Jan 14 17:49:09 EST 2010


Yaroslav Halchenko wrote:
> Dear NumPy People,
>
> First I want to apologize if I misbehaved on NumPy Trac by reopening the
> closed ticket
> http://projects.scipy.org/numpy/ticket/1362
> but I still feel strongly that there is misunderstanding
> and the bug/defect is valid.   I would appreciate if someone would waste
> more of his time to persuade me that I am wrong but please first read
> till the end:
>
> The issue, as originally reported, is demonstrated with:
>
> ,---
> | > python -c 'import numpy as N; print N.__version__; a=N.array([1, (0,1)],dtype=object); print a==1; print a == (0,1),  a[1] == (0,1)'
> | 1.5.0.dev
> | [ True False]
> | [False False] True
> `---
>
> whereas I expected the last line to be
>
> [False True] True
>
> charris (thanks for all the efforts to enlighten me) summarized it as 
>
> """the result was correct given that the tuple (0,1) was converted to an
> object array with elements 0 and 1. It is *not* converted to an array
> containing a tuple. """
>
> and I was trying to argue that it is not the case in my example.  It is
> the case in charris's example though whenever both elements are of
> the same length, or there is just a single tuple, i.e.
>
>   

The "problem" is that the tuple is converted to an array in the statement
that does the comparison, not in the construction of the array.  NumPy
attempts to convert the right-hand side of the == operator into an array,
and then does the comparison using the two arrays.
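
To make this concrete, here is a small sketch of what the comparison
effectively sees.  It assumes the conversion of the right-hand side behaves
like np.asarray (the names 'rhs' and 'result' are only for illustration):

```python
import numpy as np

a = np.array([1, (0, 1)], dtype=object)

# Convert the tuple explicitly: the result is a 2-element array holding
# 0 and 1, not a 1-element array holding a tuple.
rhs = np.asarray((0, 1), dtype=object)
print(rhs.shape)         # (2,)

# The comparison is then elementwise: a[0] == 0 and a[1] == 1.
result = a == rhs
print(result.tolist())   # [False, False]
```

Neither 1 == 0 nor (0,1) == 1 is true, which matches the [False False]
in the original report.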

One way to get what you want is to create your own array and then do
the comparison:

In [1]: import numpy as np

In [2]: a = np.array([1, (0,1)], dtype='O')

In [3]: t = np.empty(1, dtype='O')

In [4]: t[0] = (0,1)

In [5]: a == t
Out[5]: array([False,  True], dtype=bool)


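In the above code, a numpy array 't' of objects with shape (1,) is created,
and the single element is assigned the value (0,1).  Because 't' holds the
tuple itself as its only element, it is broadcast against 'a' and each
element of 'a' is compared with (0,1), so the comparison works as expected.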
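
If you need this often, the same steps can be put into a small helper.  This
is only a sketch; 'wrap_object' is a made-up name, not a NumPy function:

```python
import numpy as np

def wrap_object(obj):
    """Return a shape-(1,) object array whose only element is obj itself."""
    t = np.empty(1, dtype=object)
    t[0] = obj   # item assignment stores the object as-is, no unpacking
    return t

a = np.array([1, (0, 1)], dtype=object)
mask = a == wrap_object((0, 1))
print(mask.tolist())   # [False, True]

# A plain Python loop gives the same answer without any array conversion.
mask_loop = [x == (0, 1) for x in a]
print(mask_loop)       # [False, True]
```

The loop version is slower for large arrays but makes it obvious that each
element is compared with the tuple as a whole.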

More food for thought:

In [6]: b = np.array([1, (0,1), "foo"], dtype='O')

In [7]: b == 1
Out[7]: array([ True, False, False], dtype=bool)

In [8]: b == (0,1)
Out[8]: False

In [9]: b == "foo"
Out[9]: array([False, False,  True], dtype=bool)
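
My reading of Out[8] (an interpretation, not something I have checked in the
NumPy source): the tuple becomes a 2-element array, which cannot be broadcast
against the 3-element 'b', so no elementwise comparison takes place.  Other
NumPy versions may report this case differently, so the sketch below checks
the broadcasting directly instead of relying on the result of b == (0,1):

```python
import numpy as np

b = np.array([1, (0, 1), "foo"], dtype=object)
rhs = np.asarray((0, 1), dtype=object)   # shape (2,)

# Shapes (3,) and (2,) are not broadcast-compatible.
try:
    np.broadcast(b, rhs)
    compatible = True
except ValueError:
    compatible = False
print(compatible)          # False

# Wrapping the tuple in a 1-element object array sidesteps the problem.
t = np.empty(1, dtype=object)
t[0] = (0, 1)
print((b == t).tolist())   # [False, True, False]
```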


Warren

> ,---
> | In [1]: array((0,1), dtype=object)
> | Out[1]: array([0, 1], dtype=object)
> |
> | In [2]: array((0,1), dtype=object).shape
> | Out[2]: (2,)
> `---
>
> There I would not expect my comparison to be valid indeed.  But let's see what
> happens in my case:
>
> ,---
> | In [2]: array([1, (0,1)],dtype=object)
> | Out[2]: array([1, (0, 1)], dtype=object)
> |
> | *In [3]: array([1, (0,1)],dtype=object).shape
> | Out[3]: (2,)
> |
> | *In [4]: array([1, (0,1)],dtype=object)[1].shape
> | ---------------------------------------------------------------------------
> | AttributeError                            Traceback (most recent call last)
> |
> | /home/yoh/proj/<ipython console> in <module>()
> |
> | AttributeError: 'tuple' object has no attribute 'shape'
> `---
>
> So, as far as I see it, the array does contain an object of type tuple,
> which does not get correctly compared upon __eq__ operation.  Am I
> wrong?  Or does numpy internally somehow convert the 1st item (i.e. the
> tuple) into an array, but cast it back to a tuple upon __repr__ or
> __getitem__?
>
> Thanks in advance for feedback
>
> On Thu, 14 Jan 2010, NumPy Trac wrote:
>
>   
>> #1362: comparison operators (e.g. ==) on array with dtype object do not work
>> -------------------------+--------------------------------------------------
>>   Reporter:  yarikoptic  |       Owner:  somebody
>>       Type:  defect      |      Status:  closed  
>>   Priority:  normal      |   Milestone:          
>>  Component:  Other       |     Version:          
>> Resolution:  invalid     |    Keywords:          
>> -------------------------+--------------------------------------------------
>> Changes (by charris):
>>
>>   * status:  reopened => closed
>>   * resolution:  => invalid
>>
>> Old description:
>>
>>> You can see this better with the '*' operator:
>>>       
>
>
>   
>>> {{{
>>> In [8]: a * (0,2)
>>> Out[8]: array([0, (0, 1, 0, 1)], dtype=object)
>>> }}}
>>>       
>
>
>   
>>> Note how the tuple is concatenated with itself. The reason the original
>>> instance of a worked was that 1 and (0,1) are of different lengths, so
>>> the descent into the nested sequence types stopped at one level and a
>>> tuple is one of the elements. When you do something like ((0,1),(0,1))
>>> the descent goes down two levels and you end up with a 2x2 array of
>>> integer objects. The rule of thumb for object arrays is that you get an
>>> array with as many indices as possible, which is why object arrays are
>>> hard to create. Another example:
>>>       
>
>
>   
>>> {{{
>>> In [10]: array([(1,2,3),(1,2)], dtype=object)
>>> Out[10]: array([(1, 2, 3), (1, 2)], dtype=object)
>>>       
>
>   
>>> In [11]: array([(1,2),(1,2)], dtype=object)
>>> Out[11]:
>>> array([[1, 2],
>>>        [1, 2]], dtype=object)
>>> }}}
>>>       
>
>   
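
The rule of thumb quoted above can be checked directly, and it also suggests
a way to keep equal-length tuples as elements: allocate the object array
first and fill it by index.  A minimal sketch (the variable names are only
for illustration):

```python
import numpy as np

# Unequal lengths: the descent stops after one level, so the tuples survive.
ragged = np.array([(1, 2, 3), (1, 2)], dtype=object)
print(ragged.shape)     # (2,)

# Equal lengths: the descent goes two levels deep and the tuples are unpacked.
unpacked = np.array([(1, 2), (1, 2)], dtype=object)
print(unpacked.shape)   # (2, 2)

# To keep equal-length tuples as elements, allocate first and fill by index.
items = [(1, 2), (1, 2)]
kept = np.empty(len(items), dtype=object)
for i, item in enumerate(items):
    kept[i] = item
print(kept.shape, kept[0])   # (2,) (1, 2)
```
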
>> New description:
>>     
>
>   
>>  {{{
>>  python -c 'import numpy as N; print N.__version__; a=N.array([1,
>>  (0,1)],dtype=object); print a==1; print a == (0,1),  a[1] == (0,1)'
>>  }}}
>>  results in
>>  {{{
>>  1.5.0.dev
>>  [ True False]
>>  [False False] True
>>  }}}
>>  I expected last line to be
>>  {{{
>>  [False True] True
>>  }}}
>>  So, it works for int but doesn't work for tuple... I guess it doesn't try
>>  to compare element by element but does something else.
>>     



