[Numpy-discussion] Does a `mergesorted` function make sense?

Jaime Fernández del Río jaime.frio at gmail.com
Tue Sep 2 22:07:52 EDT 2014


On Tue, Sep 2, 2014 at 5:40 PM, Charles R Harris <charlesr.harris at gmail.com>
wrote:
>
>
> What do you think about the suggestion of timsort? One would need to
> concatenate the arrays before sorting, but it should be fairly efficient.
>

Timsort is very cool, and it would definitely be fun to implement in numpy.
It is also a lot more work that merging two sorted arrays! I think +1 if
someone else does it, but although I would love to be part of it, I am not
sure I will be able to find time to look into it seriously in the next
couple of months.

>From a setops point of view, merging two sorted arrays makes it very
straightforward to compute, together with (or instead of) the result of the
merge, index arrays that let you calculate things like `in1d` faster.
Although perhaps an `argtimsort` could provide the same functionality, not
sure. I will probably wrap up what I have, put a lace on it, and submit it
as a PR. Even if it is not destined to be merged, it may serve as a warning
to others.

I have also been thinking lately that one of the problems with all these
unique-related computations may be a case of having a hammer and seeing
everything as nails. Perhaps the algorithm that needs to be ported from
Python is not the sorting one, but the hash table...

Jaime

-- 
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140902/9e263e38/attachment.html>


More information about the NumPy-Discussion mailing list