[Numpy-discussion] Numpy Enhancement Proposal: group_by functionality

Eelco Hoogendoorn hoogendoorn.eelco at gmail.com
Sun Jan 26 15:16:47 EST 2014


Not off topic at all; there are several matters of naming that I am not at
all settled on yet, and I don't think they are unimportant.

Indeed, those are closely related functions, and I wasn't aware of them
yet, so that's some welcome additional perspective. The Mathematica
function differs in that the keys are always a function of the values, as in
your example. My proposed interface does not have that
constraint, but that behavior is of course easily obtained by something
like group_by(mapping(values), values).
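
To make the difference concrete, here is a minimal sketch. The `group_by` function below is a hypothetical stand-in written only for illustration (it is not the proposed implementation, and its return format is an assumption); `mapping` is likewise a made-up example function. It shows that keys may be supplied independently of the values, and that GatherBy-like behavior follows when the keys are computed from the values.

```python
import numpy as np

def group_by(keys, values):
    """Hypothetical helper: split `values` into groups that share a key.

    Returns (unique_keys, groups), where groups[i] holds the values whose
    key equals unique_keys[i]. Illustration only, not the proposed API.
    """
    keys = np.asarray(keys)
    values = np.asarray(values)
    if keys.size == 0:
        return keys, []
    # Stable sort so values keep their original order within each group.
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]
    # Positions where the sorted key changes mark the group boundaries.
    boundaries = np.flatnonzero(sorted_keys[1:] != sorted_keys[:-1]) + 1
    unique_keys = sorted_keys[np.concatenate(([0], boundaries))]
    groups = np.split(values[order], boundaries)
    return unique_keys, groups

# Keys supplied independently of the values.
labels = np.array(["a", "b", "a"])
amounts = np.array([10, 20, 30])
label_keys, label_groups = group_by(labels, amounts)

# GatherBy-like usage: the keys are a function of the values.
values = np.array([1, 2, 3, 4, 5, 6])
mapping = lambda v: v % 3  # assumed example mapping
mod_keys, mod_groups = group_by(mapping(values), values)
```

In the second call the values are grouped by their remainder modulo 3, which is the special case where the keys are derived from the values themselves.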

Indeed, grpstats also has a lot of overlap, though it does not have the same
generality as my proposal.

It's interesting to wonder where one gets one's ideas about what to call things.
I've never worked with SQL much; I suppose I picked up this naming by
working with LINQ. I rather like group_by; it is more suitable to the
generality of the operations supported by the group_by object than
something like grpstats. The majority of my applications for grouping have
nothing whatsoever to do with statistics.


On Sun, Jan 26, 2014 at 8:44 PM, Alan G Isaac <alan.isaac at gmail.com> wrote:

> My comment is just on the name.
> I'd expect something named `groupby`
> to behave essentially like Mathematica's `GatherBy` command.
> http://reference.wolfram.com/mathematica/ref/GatherBy.html
>
> I think you are after something more like Matlab's grpstats:
> http://www.mathworks.com/help/stats/grpstats.html
>
> Perhaps the implicit reference to SQL justifies the name...
>
> Sorry if this seems off topic,
> Alan Isaac