[Numpy-discussion] [Suggestion] Labelled Array

Allan Haldane allanhaldane at gmail.com
Fri Feb 19 12:08:33 EST 2016


I also want to add a historical note here: 'groupby' has been
discussed a couple of times before.

Travis Oliphant even made an NEP for it, and Wes McKinney lightly hinted
at adding it to numpy.

http://thread.gmane.org/gmane.comp.python.numeric.general/37480/focus=37480
http://thread.gmane.org/gmane.comp.python.numeric.general/38272/focus=38299
http://docs.scipy.org/doc/numpy-1.10.1/neps/groupby_additions.html

Travis's idea for a ufunc method 'reduceby' is more along the lines of
what I was originally thinking. Just musing about it, it might cover a
few small cases that pandas groupby might not: it could work on
arbitrary ufuncs, and over particular axes of multidimensional data,
e.g., to sum over pixels from NxNx3 image data. But maybe pandas can
cover the multidimensional case through additional index columns or
with Panel.
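
As a rough illustration of the 'reduceby' idea, here is a minimal sketch
in plain numpy. The helper name `reduce_by_label` and its signature are
hypothetical (they are not the API from the NEP); it sorts by label and
uses the existing `ufunc.reduceat` method, so it works for any binary
ufunc and keeps trailing axes such as colour channels. It assumes
non-empty input and that `labels` matches the leading axes of `data`.

```python
import numpy as np

def reduce_by_label(ufunc, data, labels):
    """Hypothetical helper: reduce `data` with `ufunc` within each label group.

    `labels` must have the shape of the leading axes of `data`; any
    trailing axes of `data` (e.g. the 3 channels of an NxNx3 image)
    are preserved in the result. Assumes non-empty input.
    """
    data = np.asarray(data)
    labels = np.asarray(labels)
    # Flatten the labelled leading axes, keep the trailing ones.
    flat = data.reshape((labels.size,) + data.shape[labels.ndim:])
    flat_labels = labels.ravel()
    # Sort so that equal labels are contiguous.
    order = np.argsort(flat_labels, kind="stable")
    sorted_labels = flat_labels[order]
    # Start index of each run of equal labels in the sorted array.
    starts = np.flatnonzero(
        np.r_[True, sorted_labels[1:] != sorted_labels[:-1]])
    return sorted_labels[starts], ufunc.reduceat(flat[order], starts, axis=0)

# The example from the original post: one label per column.
x = np.arange(9).reshape(3, 3)
label = np.tile([0, 1, 2], (3, 1))
keys, sums = reduce_by_label(np.add, x, label)
print(keys, sums)   # [0 1 2] [ 9 12 15]

# Per-segment sum over pixels of image data with a 2-D label map.
image = np.ones((2, 2, 3))
segments = np.array([[0, 0], [1, 0]])
seg_keys, seg_sums = reduce_by_label(np.add, image, segments)
```

Sorting costs O(n log n); for `np.add` specifically, `np.bincount` with
weights or `np.add.at` would be alternatives, but `reduceat` is what lets
the same code run with ufuncs like `np.maximum`.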

Cheers,
Allan

On 02/15/2016 05:31 PM, Paul Hobson wrote:
> Just for posterity -- any future readers of this thread who need to do
> pandas-like operations on record arrays should look at matplotlib's
> mlab submodule.
> 
> I've been in situations (::cough:: Esri production ::cough::) where I've
> had one hand tied behind my back and was unable to install pandas. mlab
> was a big help there.
> 
> https://goo.gl/M7Mi8B
> 
> -paul
> 
> 
> 
> On Mon, Feb 15, 2016 at 1:28 PM, Lluís Vilanova
> <vilanova at ac.upc.edu> wrote:
> 
>     Benjamin Root writes:
> 
>     > Seems like you are talking about xarray: https://github.com/pydata/xarray
> 
>     Oh, I wasn't aware of xarray, but there's also this:
> 
>      
>     https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#basic-indexing
>      
>     https://people.gso.ac.upc.edu/vilanova/doc/sciexp2/user_guide/data.html#dimension-oblivious-indexing
> 
> 
>     Cheers,
>       Lluis
> 
> 
> 
>     > Cheers!
>     > Ben Root
> 
>     > On Fri, Feb 12, 2016 at 9:40 AM, Sérgio <filaboia at gmail.com> wrote:
> 
>     >     Hello,
> 
> 
>     >     This is my first e-mail, I will try to make the idea simple.
> 
> 
>     >     Similar to a masked array, it would be interesting to use a
>     >     label array to guide operations.
> 
> 
>     >     Ex.:
>     >     >>> x
>     >     labelled_array(data =
> 
>     >     [[0 1 2]
>     >     [3 4 5]
>     >     [6 7 8]],
>     >     label =
>     >     [[0 1 2]
>     >     [0 1 2]
>     >     [0 1 2]])
> 
> 
>     >     >>> sum(x)
>     >     array([9, 12, 15])
> 
> 
>     >     The operations would create a new axis for label indexing.
> 
> 
>     >     You could think of it as a collection of masks, one for each
>     >     label.
> 
> 
>     >     I don't know a way to make something like this efficiently
>     >     without a loop. Just wondering...
> 
> 
>     >     Sérgio.
> 
>     >     _______________________________________________
>     >     NumPy-Discussion mailing list
>     >     NumPy-Discussion at scipy.org <mailto:NumPy-Discussion at scipy.org>
>     >     https://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
> 
> 
> 



