[Numpy-discussion] Multiplicity of an entry
Michael Droettboom
mdroe at stsci.edu
Tue Oct 27 17:07:33 EDT 2009
Travis Oliphant wrote:
> On Oct 27, 2009, at 2:31 PM, Michael Droettboom wrote:
>
>
>> Christopher Barker wrote:
>>
>>> Nadav Horesh wrote:
>>>
>>>
>>>> np.equal(a,a).sum(0)
>>>>
>>>> but, for unknown reason, np.equal operates only on "normal" arrays.
>>>>
>>>>
>>> true:
>>>
>>> In [25]: a
>>> Out[25]:
>>> array(['abc', 'def', 'abc', 'ghij'],
>>> dtype='|S4')
>>>
>>> In [27]: np.equal(a,a)
>>> Out[27]: NotImplemented
>>>
>>> however:
>>>
>>> In [28]: a == a
>>> Out[28]: array([ True, True, True, True], dtype=bool)
>>>
>>> don't they use the same code? or is "==" reverting to plain old generic
>>> python sequence comparison, which would partly explain why it is so slow.
>>>
>>>
>> It looks as if "a == a" (that is array_richcompare) is triggering
>> special case code for strings, so it is fast. However, IMHO np.equal
>> should be made to work as well. Can you file a bug and assign it to me
>> (I'm dealing with a number of other string-related things, so I might as
>> well take this too).
>>
>
> The array_richcompare special-cased strings not for speed but for
> actual functionality.
>
> Making np.equal work with strings requires changes to the ufunc code
> itself, which was never written to work with "variable-length"
> data-types (like strings, unicode, and records). There are several
> things that would have to be fixed. Some of the changes we made to
> allow for date-time data-types also made it possible to support
> variable-length strings, but this is non-trivial to implement. It's
> certainly possible, but I would want to look at any changes you make
> before committing them to make sure all the issues are being understood.
>
Yeah -- I'm realizing this is a bigger project than I initially
suspected. I'll keep you posted if I find the time to do this right.
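
In the meantime, the multiplicity of each entry can be computed without
np.equal by going through the "==" path discussed above, which does handle
string arrays. The following is only a minimal sketch: the broadcasting form
and the np.unique call are assumptions about what is wanted rather than
anything proposed in this thread, and the return_counts argument requires a
NumPy release newer than this message.

```python
import numpy as np

# Same example array as in the quoted session above.
a = np.array(['abc', 'def', 'abc', 'ghij'], dtype='S4')

# Compare every entry against every other entry with "==" (not np.equal),
# then count the matches in each column. The intermediate array is N x N,
# so this is only reasonable for small inputs.
multiplicity = (a[:, None] == a).sum(0)

# Alternative that avoids the N x N intermediate: one count per distinct value.
values, counts = np.unique(a, return_counts=True)
```

Here multiplicity is aligned with a (one count per entry), while counts is
aligned with the sorted distinct values.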
Mike
--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA