[Numpy-discussion] Proposal for new ufunc functionality

Stephen Simmons mail at stevesimmons.com
Wed Apr 14 17:48:47 EDT 2010


I would really like to see this become a core part of numpy...

For groupby-like summing over arrays, I use a modified version of
numpy.bincount() which has optional arguments that greatly enhance its 
flexibility:
   bincount(bins, weights=, max_bin=, out=)
where:
   *  bins    - numpy array of bin numbers (uint8, int16 or int32).
                Negative bin numbers indicate weights to be ignored [1]
   *  weights - (opt) numpy array of weights (float or double)
   *  max_bin - (opt) bin numbers greater than this are ignored when
                counting [2]
   *  out     - (opt) numpy output array (int32 or double)

[1]  This is how I support Robert Kern's comment below "If there are some
areas you want to ignore, that's difficult to do with reduceat()."

[2]  Specifying the number of bins up front has two benefits: (i) saves
scanning the bins array to see how big the output needs to be;
and (ii) allows you to control the size of the output array, as you may
want it bigger than the number of bins would suggest.
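
For illustration only, the behaviour described above can be roughly
sketched on top of the stock numpy.bincount(). This is not my modified
version; the name bincount_sketch is made up for this example, and two
details are assumptions: when max_bin is omitted it falls back to the
largest bin number present, and results overwrite the leading entries
of out rather than accumulating into it.

```python
import numpy as np

def bincount_sketch(bins, weights=None, max_bin=None, out=None):
    """Hypothetical sketch: bincount that skips negative and oversized bins."""
    bins = np.asarray(bins)
    if max_bin is None:
        # Assumption: fall back to scanning the bins array for its maximum.
        max_bin = int(bins.max()) if bins.size else -1
    n_out = max(max_bin + 1, 0)

    # Negative bin numbers and bin numbers above max_bin are ignored.
    keep = (bins >= 0) & (bins <= max_bin)
    kept_weights = None
    if weights is not None:
        kept_weights = np.asarray(weights, dtype=float)[keep]

    counts = np.bincount(bins[keep], weights=kept_weights, minlength=n_out)

    if out is None:
        return counts
    # Assumption: overwrite the first n_out entries of a caller-supplied
    # array, which may be longer than the number of bins.
    out[:n_out] = counts
    return out

bins = np.array([0, 1, -1, 1, 5, 2], dtype=np.int32)
weights = np.array([1.0, 2.0, 100.0, 3.0, 50.0, 4.0])

summed = bincount_sketch(bins, weights=weights, max_bin=3)
counted = bincount_sketch(bins, max_bin=3)
padded = bincount_sketch(bins, weights=weights, max_bin=3, out=np.zeros(6))
```

Here the weight 100.0 is dropped because its bin is -1, and 50.0 is
dropped because bin 5 exceeds max_bin=3.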


I look forward to the draft NEP!

Best regards
Stephen Simmons



On 13/04/2010 10:34 PM, Robert Kern wrote:
> On Sat, Apr 10, 2010 at 17:59, Robert Kern<robert.kern at gmail.com>  wrote:
>    
>> On Sat, Apr 10, 2010 at 12:45, Pauli Virtanen<pav at iki.fi>  wrote:
>>      
>>> On Sat, 2010-04-10 at 12:23 -0500, Travis Oliphant wrote:
>>> [clip]
>>>        
>>>> Here are my suggested additions to NumPy:
>>>> ufunc methods:
>>>>          
>>> [clip]
>>>        
>>>>        * reducein (array, indices, axis=0)
>>>>                 similar to reduce-at, but the indices provide both the
>>>> start and end points (rather than being fence-posts like reduceat).
>>>>          
>>> Is the `reducein` important to have, as compared to `reduceat`?
>>>        
>> Yes, I think so. If there are some areas you want to ignore, that's
>> difficult to do with reduceat().
>>      
> And conversely overlapping areas are highly useful but completely
> impossible to do with reduceat.
>
>    
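
To make the difference in the quoted exchange concrete, here is a
minimal 1-D sketch. np.add.reduceat() is the existing numpy method;
reducein_sketch is a hypothetical stand-in for the proposed reducein,
and passing the indices as (start, stop) pairs is an assumption about
how the proposal might look, not its agreed signature.

```python
import numpy as np

def reducein_sketch(ufunc, a, spans):
    """Hypothetical 1-D stand-in: reduce a[start:stop] for each span."""
    # An empty span only works for ufuncs that have an identity (e.g. add).
    return np.array([ufunc.reduce(a[start:stop]) for start, stop in spans])

a = np.arange(10)

# reduceat treats the indices as fence-posts: a[0:3], a[3:8], a[8:].
# Every element lands in exactly one segment, so nothing can be
# skipped and segments cannot overlap.
fenced = np.add.reduceat(a, [0, 3, 8])

# Explicit start/stop pairs can overlap (elements 2 appears twice)
# and can leave gaps (elements 6 and 7 are ignored).
spanned = reducein_sketch(np.add, a, [(0, 3), (2, 6), (8, 10)])
```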