[Numpy-discussion] improving arraysetops
Robert Cimrman
cimrman3 at ntc.zcu.cz
Wed Jun 17 09:06:39 EDT 2009
Hi Neil,
Neil Crighton wrote:
>>> What about merging unique and unique1d? They're essentially identical for an
>>> array input, but unique uses the builtin set() for non-array inputs and so is
>>> around 2x faster in this case - see below. Is it worth accepting a speed
>>> regression for unique to get rid of the function duplication? (Or can they be
>>> combined?)
>> unique1d can return the indices - can this be achieved by using set(), too?
>>
>
> No, set() can't return the indices as far as I know.
>
>> The implementation for arrays is the same already, IMHO, so I would
>> prefer adding return_index, return_inverse to unique (automatically
>> converting input to array, if necessary), and deprecate unique1d.
>>
>> We can view it also as adding the set() approach to unique1d, when the
>> return_index, return_inverse arguments are not set, and renaming
>> unique1d -> unique.
>>
>
> This sounds good. If you don't have time to do it, I don't mind having a go
> at writing a patch to implement these changes (deprecate the existing
> unique1d, rename unique1d to unique and add the set approach from the old
> unique, and the other changes mentioned in
> http://projects.scipy.org/numpy/ticket/1133).
That would be really great - I will be offline from tomorrow until about
the end of next week, so I can only really look at the issue after I
return.
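
For illustration, here is a minimal sketch of what the merged function discussed above might look like. The name unique_sketch is hypothetical, and this is not the patch from the ticket or NumPy's actual implementation. It assumes the set() shortcut is used only for non-array inputs when no indices are requested, and that the sort-based path handles everything else.

```python
import numpy as np


def unique_sketch(ar, return_index=False, return_inverse=False):
    """Hypothetical merged unique/unique1d (illustration only)."""
    if not (return_index or return_inverse) and not isinstance(ar, np.ndarray):
        # set() path: cheap for lists and tuples, but it cannot supply indices.
        return np.array(sorted(set(ar)))

    ar = np.asarray(ar).ravel()
    # A stable sort keeps equal items in input order, so the first item of
    # each run in the sorted array is the first occurrence in the input.
    perm = ar.argsort(kind="mergesort")
    aux = ar[perm]

    # flag[i] is True where a new value starts in the sorted array.
    flag = np.ones(aux.shape, dtype=bool)
    flag[1:] = aux[1:] != aux[:-1]

    result = [aux[flag]]
    if return_index:
        # Positions in the input of the first occurrence of each unique value.
        result.append(perm[flag])
    if return_inverse:
        # For each input item, its position in the unique array.
        inverse = np.empty(ar.shape, dtype=np.intp)
        inverse[perm] = np.cumsum(flag) - 1
        result.append(inverse)
    return result[0] if len(result) == 1 else tuple(result)


u, idx, inv = unique_sketch([3, 1, 3, 2, 1], return_index=True, return_inverse=True)
```

With both flags set, u[inv] reconstructs the flattened input and the input indexed by idx gives u, which is the information the set() path cannot provide.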
[...]
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in
>> position 28: ordinal not in range(128)
>>
>> It disappears after increasing the array size, or the integer size.
>> In [39]: np.__version__
>> Out[39]: '1.4.0.dev7047'
>>
>> r.
>
> Weird! From the error message, it looks like a problem with ipython's timeit
> function rather than unique. I can't reproduce it on my machine
> (numpy 1.4.0.dev, r7059; IPython 0.10.bzr.r1163 ).
True, I have IPython 0.9.1; that might be causing the problem.
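
The character in the traceback, u'\xb5', is the micro sign. The snippet below only shows that this character cannot be encoded as ASCII, which is the error class reported above. The label string is a made-up example and not output from the thread, and whether the character really came from a timing unit is an assumption, not something confirmed here.

```python
# U+00B5 (micro sign) is the character named in the traceback above.
micro = "\u00b5"
# Hypothetical label for illustration; not taken from the thread.
label = "10 " + micro + "s per loop"

try:
    label.encode("ascii")
    encoded_ok = True
except UnicodeEncodeError as exc:
    # ASCII only covers code points 0..127, and U+00B5 is 181.
    encoded_ok = False
    message = str(exc)

# An encoding that covers the character works without error.
utf8_bytes = label.encode("utf-8")
```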
cheers,
r.