[Numpy-discussion] Proposal for new ufunc functionality

Travis Oliphant oliphant at enthought.com
Mon Apr 12 18:26:52 EDT 2010


On Apr 11, 2010, at 2:56 PM, Anne Archibald wrote:

> 2010/4/10 Stéfan van der Walt <stefan at sun.ac.za>:
>> On 10 April 2010 19:45, Pauli Virtanen <pav at iki.fi> wrote:
>>> Another addition to ufuncs that should be thought about is
>>> specifying the Python-side interface to generalized ufuncs.
>>
>> This is an interesting idea; what do you have in mind?
>
> I can see two different kinds of answer to this question: one is a
> tool like vectorize/frompyfunc that allows construction of generalized
> ufuncs from python functions, and the other is thinking out what
> methods and support functions generalized ufuncs need.
>
> The former would be very handy for prototyping gufunc-based libraries
> before delving into the templated C required to make them actually
> efficient.
>
> The latter is more essential in the long run: it'd be nice to have a
> reduce-like function, but obviously only when the arity and dimensions
> work out right (which I think means (shape1,shape2)->(shape2) ). This
> could be applied along an axis or over a whole array. reduceat and the
> other, more sophisticated, schemes might also be worth supporting. At
> a more elementary level, gufunc objects should have good introspection
> - docstrings, shape specification accessible from python, named formal
> arguments, et cetera. (So should ufuncs, for that matter.)

We should collect all of these proposals into a NEP. Let me clarify
what I mean by "group-by" behavior.

Suppose I have an array of floats and an array of integers of the same
length. Each element of the integer array labels the corresponding
element of the float array as being of a certain "kind". The reduction
should take place over like-kind values:

Example:

add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,2,0,0,2,2])

results in the calculations:

1 + 3 + 6 + 7
2 + 4
5 + 8 + 9

and therefore the output (notice the two arrays --- perhaps a  
structured array should be returned instead...)

[0,1,2],
[17, 6, 22]
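
For comparison, the same result can be emulated with functions NumPy already has. This is only a sketch: `reduceby` here is a hypothetical helper written for illustration, not an existing ufunc method, and it assumes 1-D inputs of equal length with at least one element.

```python
import numpy as np

def reduceby(ufunc, array, by):
    """Hypothetical stand-in for the proposed ufunc.reduceby (1-D only)."""
    array = np.asarray(array)
    by = np.asarray(by)
    # keys: sorted distinct labels; inverse: position of each label in keys
    keys, inverse = np.unique(by, return_inverse=True)
    # A stable sort brings like-kind elements together, preserving their order
    order = np.argsort(inverse, kind="stable")
    # Offset at which each group starts within the sorted values
    starts = np.searchsorted(inverse[order], np.arange(len(keys)))
    # reduceat reduces each slice [starts[i]:starts[i+1]] with the given ufunc
    return keys, ufunc.reduceat(array[order], starts)

keys, sums = reduceby(np.add, [1, 2, 3, 4, 5, 6, 7, 8, 9],
                      [0, 1, 0, 1, 2, 0, 0, 2, 2])
print(keys)  # [0 1 2]
print(sums)  # [17  6 22]
```

Because the grouping is done by sorting and `reduceat`, the same helper works for other binary ufuncs, e.g. `reduceby(np.maximum, ...)` gives the per-kind maximum.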


The real value is when you have tabular data and you want to do  
reductions in one field based on values in another field.   This  
happens all the time in relational algebra and would be a relatively  
straightforward thing to support in ufuncs.
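
To make the tabular case concrete, here is a rough sketch using a structured array and the existing `np.add.at`. The field names (`region`, `sales`, `total`) and the data are made up for illustration; the result is returned as a structured array, which is one possible answer to the output-format question above.

```python
import numpy as np

# Hypothetical table: reduce the "sales" field grouped by the "region" field
table = np.array(
    [("east", 10.0), ("west", 4.0), ("east", 2.5), ("north", 1.0), ("west", 6.0)],
    dtype=[("region", "U5"), ("sales", "f8")],
)

# Distinct regions (sorted) and, for each row, the index of its region
regions, codes = np.unique(table["region"], return_inverse=True)

# Unbuffered in-place add: rows sharing a code accumulate into the same slot
totals = np.zeros(len(regions))
np.add.at(totals, codes, table["sales"])

# Pack keys and reduced values into a single structured array
result = np.zeros(len(regions), dtype=[("region", "U5"), ("total", "f8")])
result["region"] = regions
result["total"] = totals
print(result)
```

This is roughly the equivalent of SELECT region, SUM(sales) ... GROUP BY region in SQL.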

-Travis





