[Numpy-discussion] Numpy Enhancement Proposal: group_by functionality

Eelco Hoogendoorn hoogendoorn.eelco at gmail.com
Sun Jan 26 12:50:04 EST 2014

To follow up with an example as to why it is useful that a temporary object
is created, consider the following (taken from the radial reduction

    g = group_by(np.round(radius, 5).flatten())
        g.std(sample.flatten())[1] / np.sqrt(g.count))

Creating the GroupBy object encapsulates the expense of 'indexing' the
keys, which is the most expensive part of these operations. We would have
to redo that four times here, if we didn't have access to the GroupBy object.
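
To make the cost argument concrete, here is a minimal sketch, assuming
one-dimensional, non-empty keys. The GroupBy class below is a hypothetical
stand-in written for illustration, not the implementation proposed in this
thread; its attribute names (sorter, start, index) are assumptions, and the
(unique, result) return tuples only mirror the [1] indexing used above. The
keys are sorted once in the constructor, and every later reduction reuses
that cached ordering.

```python
import numpy as np


class GroupBy:
    """Hypothetical minimal stand-in: index the keys once, reduce many times."""

    def __init__(self, keys):
        keys = np.asarray(keys)
        # The expensive step: a single stable sort of the keys, cached for reuse.
        self.sorter = np.argsort(keys, kind="stable")
        sorted_keys = keys[self.sorter]
        # True wherever a new run of equal keys begins in sorted order.
        flag = np.concatenate(([True], sorted_keys[1:] != sorted_keys[:-1]))
        self.start = np.flatnonzero(flag)      # first position of each group
        self.index = np.cumsum(flag) - 1       # group id of each sorted element
        self.unique = sorted_keys[self.start]  # one representative key per group
        self.count = np.diff(np.append(self.start, keys.size))

    def _sorted(self, values):
        # Bring the values into the same order as the sorted keys.
        return np.asarray(values, dtype=float)[self.sorter]

    def mean(self, values):
        sums = np.add.reduceat(self._sorted(values), self.start)
        return self.unique, sums / self.count

    def std(self, values):
        # Population standard deviation per group (ddof=0).
        v = self._sorted(values)
        _, mean = self.mean(values)
        dev2 = (v - mean[self.index]) ** 2
        return self.unique, np.sqrt(np.add.reduceat(dev2, self.start) / self.count)


keys = np.array([1.0, 2.0, 1.0, 2.0, 1.0])
values = np.array([1.0, 10.0, 3.0, 20.0, 5.0])

g = GroupBy(keys)                      # keys are sorted here, and only here
group_mean = g.mean(values)[1]
standard_error = g.std(values)[1] / np.sqrt(g.count)
```

Without the intermediate object, each of g.unique, g.mean, g.std and g.count
would need its own sort of the keys.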

From looking at the numpy source, I get the impression that it is
considered good practice not to overuse OOP. And I agree, but I think it is
called for here.

On Sun, Jan 26, 2014 at 6:02 PM, Stéfan van der Walt <stefan at sun.ac.za> wrote:

> Hi Eelco
> On Sun, 26 Jan 2014 12:20:04 +0100, Eelco Hoogendoorn wrote:
> > key1 = list('abaabb')
> > key2 = np.random.randint(0,2,(6,2))
> > values = np.random.rand(6,3)
> > print group_by((key1, key2)).median(values)
> I agree that group_by functionality could be handy in numpy.
> In the above example, what would the output of
> ``group_by((key1, key2))``
> be?
> Stéfan