[Numpy-discussion] Does a `mergesorted` function make sense?

Charles R Harris charlesr.harris at gmail.com
Mon Sep 1 08:05:24 EDT 2014


On Mon, Sep 1, 2014 at 1:49 AM, Eelco Hoogendoorn <
hoogendoorn.eelco at gmail.com> wrote:

> Sure, id like to do the hashing things out, but I would also like some
> preliminary feedback as to whether this is going in a direction anyone else
> sees the point of, if it conflicts with other plans, and indeed if we can
> agree that numpy is the right place for it; a point which I would very much
> like to defend. If there is some obvious no-go that im missing, I can do
> without the drudgery of writing proper documentation ;).
>
> As for whether this belongs in numpy: yes, I would say so. There are the
> extension of functionality to functions already in numpy, which are a
> no-brainer (it need not cost anything performance wise, and ive needed
> unique graph edges many many times), and there is the grouping
> functionality, which is the main novelty.
>
> However, note that the grouping functionality itself is a very small
> addition, just a few 100 lines of pure python, given that the indexing
> logic has been factored out of the classic arraysetops. At least from a
> developers perspective, it very much feels like a logical extension of the
> same 'thing'.
>
> But also from a conceptual numpy perspective, grouping is really more an
> 'elementary manipulation of an ndarray' than a 'special purpose algorithm'.
> It is useful for literally all kinds of programming; hence there is similar
> functionality in the python standard library (itertools.groupby); so why
> not have an efficient vectorized equivalent in numpy? It belongs there more
> than the linalg module, arguably.
>
> Also, from a community perspective, a significant fraction of all
> stackoverflow numpy questions are (unknowingly) exactly about 'how to do
> grouping in numpy'.
>

What I'm trying to say is that numpy is a community project. We don't have
a central planning committee, the only difference between "developers" and
everyone else is activity and commit rights. Which is to say if you develop
and push this topic it is likely to go in. There certainly seems to be
interest in this functionality. The reason that I brought up scipy is that
there are some graph algorithms there that went in a couple of years ago.

Note that the convention on the list is bottom posting.

<snip>

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140901/a35a0335/attachment.html>


More information about the NumPy-Discussion mailing list