Mailman 3 suggestion for generalizing numpy functions - NumPy-Discussion

suggestion for generalizing numpy functions

Darren Dale

9 Mar 2009

9:50 a.m.

I spent some time over the weekend fixing a few bugs in numpy that were exposed when attempting to use ufuncs with ndarray subclasses. It got me thinking that, with relatively little work, numpy's functions could be made more general. For example, the numpy.ma module redefines many of the standard ufuncs in order to do some preprocessing before the builtin ufunc is called. Likewise, in the units/quantities package I have been working on, I would like to perform a dimensional analysis to make sure an operation is allowed before I call a ufunc that might change data in place.

Imagine an ndarray subclass with methods like __gfunc_pre__ and __gfunc_post__. __gfunc_pre__ could accept the context that is currently provided to __array_wrap__ (the inputs and the function called), perform whatever preprocessing is desired, and maybe return a dictionary containing metadata. Numpy functions could then be wrapped with a decorator that 1) calls __gfunc_pre__ and obtains any metadata that is returned, 2) calls the wrapped function, and then 3) calls __gfunc_post__, which might be very similar to __array_wrap__ except that it would also accept the metadata created by __gfunc_pre__.

In cases where the routines to be called by __gfunc_pre__ and __gfunc_post__ depend on what function is called, the subclass could implement routines and store them in a dictionary-like object that is keyed using the function called. I have been exploring this approach with Quantities and it seems to work well. For example:

    def __gfunc_pre__(self, gfunc, *args):
        try:
            return gfunc_pre_registry[gfunc](*args)
        except KeyError:
            # No routine is registered for this function: no metadata.
            return {}

I think such an approach for generalizing numpy's functions could be implemented without being disruptive to the existing __array_wrap__ framework. The decorator would attempt to identify an input or output array to use to call __gfunc_pre__ and __gfunc_post__. If it finds them, it uses them. If it doesn't find them, no harm done: the existing __array_wrap__ mechanisms are still in place if the wrapped function is a ufunc.

One other nice feature: the metadata that is returned by __gfunc_pre__ could contain an optional flag that the decorator attempts to pass to the wrapped function so that __gfunc_pre__ and __gfunc_post__ are not called for any decorated internal functions. That way the subclass could specify that __gfunc_pre__ and __gfunc_post__ should be called only for the outer-most function.

Comments?

Darren
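
The three decorator steps described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proof-of-concept script mentioned later in the thread: the decorator name `gfunc`, the `Tagged` class (a plain Python stand-in for an ndarray subclass), the helper functions, and the exact `__gfunc_post__` signature are all hypothetical. Only `__gfunc_pre__`, `__gfunc_post__` and `gfunc_pre_registry` come from the message itself.

```python
import functools

# Registry from the message: maps a wrapped function to its preprocessing routine.
gfunc_pre_registry = {}


def gfunc(func):
    """Hypothetical decorator: call the pre/post hooks if an argument defines them."""

    @functools.wraps(func)
    def wrapper(*args):
        # Use the first argument that defines both hooks, if there is one.
        owner = next(
            (a for a in args
             if hasattr(a, "__gfunc_pre__") and hasattr(a, "__gfunc_post__")),
            None,
        )
        if owner is None:
            return func(*args)  # no hooks found: behave exactly as before
        metadata = owner.__gfunc_pre__(func, *args)          # 1) preprocess
        result = func(*args)                                 # 2) call the function
        return owner.__gfunc_post__(func, result, metadata)  # 3) postprocess

    return wrapper


class Tagged:
    """Stand-in for an ndarray subclass: a number carrying a unit string."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __gfunc_pre__(self, gfunc, *args):
        try:
            return gfunc_pre_registry[gfunc](*args)
        except KeyError:
            return {}

    def __gfunc_post__(self, gfunc, result, metadata):
        # Assumed signature: the result plus the metadata from __gfunc_pre__.
        return Tagged(result, metadata.get("unit", self.unit))


def _raw_add(x, y):
    return x.value + y.value


def _check_same_unit(x, y):
    # Dimensional check performed before the wrapped function runs.
    if x.unit != y.unit:
        raise ValueError("cannot add %s and %s" % (x.unit, y.unit))
    return {"unit": x.unit}


gfunc_pre_registry[_raw_add] = _check_same_unit
add = gfunc(_raw_add)
```

With these definitions, `add(Tagged(1.0, "m"), Tagged(2.0, "m"))` returns a `Tagged` result in metres, while mixing "m" and "s" raises `ValueError` before `_raw_add` is ever called. Arguments without the hooks pass through the decorator untouched.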


Darren Dale

9 Mar 2009

5:37 p.m.

On Mon, Mar 9, 2009 at 9:50 AM, Darren Dale <dsdale24@gmail.com> wrote:

...

I'm attaching a proof-of-concept script; maybe it will better illustrate what I am talking about.

Darren
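
The attached script is not reproduced in this archive. As a rough stand-in, the "outer-most function only" idea from the original message might look like the sketch below. Note the swapped-in technique: instead of passing a flag into the wrapped function, as the proposal describes, this sketch keeps a module-level switch, which is simpler to show but not thread-safe. The names `gfunc`, `Traced`, `outermost_only`, `double` and `quadruple` are hypothetical.

```python
import functools

_suppressed = False  # True while inside a call that asked for outer-most-only hooks
calls = []           # records which hooks ran, for demonstration only


def gfunc(func):
    """Hypothetical decorator that honours an 'outermost_only' metadata flag."""

    @functools.wraps(func)
    def wrapper(*args):
        global _suppressed
        owner = next((a for a in args if hasattr(a, "__gfunc_pre__")), None)
        if owner is None or _suppressed:
            return func(*args)  # hooks are skipped for inner decorated calls
        metadata = owner.__gfunc_pre__(func, *args)
        previous = _suppressed
        _suppressed = bool(metadata.get("outermost_only", False))
        try:
            result = func(*args)
        finally:
            _suppressed = previous  # restore even if func raises
        return owner.__gfunc_post__(func, result, metadata)

    return wrapper


class Traced:
    """Stand-in for an ndarray subclass that logs its hook calls."""

    def __init__(self, value):
        self.value = value

    def __gfunc_pre__(self, gfunc, *args):
        calls.append(("pre", gfunc.__name__))
        return {"outermost_only": True}

    def __gfunc_post__(self, gfunc, result, metadata):
        calls.append(("post", gfunc.__name__))
        return Traced(result)


@gfunc
def double(x):
    return x.value * 2


@gfunc
def quadruple(x):
    # Calls a decorated function internally; its hooks should stay silent.
    return double(Traced(double(x)))
```

Calling `quadruple(Traced(3))` runs the hooks once, for `quadruple` only, even though `double` is invoked twice inside it. Calling `double` directly still triggers its own hooks.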

Travis E. Oliphant

6:08 p.m.

Darren Dale wrote:

...

The suggestions behind this idea are interesting. It seems to me to be related to the concept of "contexts" that Eric presented at SciPy a couple of years ago and that keeps coming up at Enthought. It may be of benefit to solve the problem from that perspective rather than the "sub-class" perspective.

Unfortunately, I don't have time to engage this discussion as it deserves, but I wanted to encourage you because I think there are good ideas in what you are doing. The sub-class route may be a decent solution, but it also might be worthwhile to think from the perspective of contexts as well.

Basically, the context idea is that rather than "sub-class" the ndarray, you create a more powerful name-space for code that uses arrays to live in. Because python code can execute using a namespace that is any dictionary-like thing, you can create a "namespace" object with more powerful getters and setters that intercepts the getting and setting of names as the Python code is executing.

This allows every variable to be "adapted" in a manner analogous to "type-maps" in SWIG --- but in a more powerful way. We have been taking advantage of this basic but powerful idea quite a bit. Unit-handling is a case where "contexts" and generic functions, rather than sub-classes, appear to be an approach to solving the problem.

The other important idea about contexts is that you can layer adapters onto the getting and setting of variables in the namespace, which provides more hooks for doing some powerful things in easy-to-remember ways.

I apologize if it sounds like I'm hijacking your question to promote an agenda. I really like the generality you are trying to reach with your suggestions and just wanted to voice the opinion that it might be better to look for a solution using the two dimensions of "objects" and "namespaces" (o.k., generic functions are probably another dimension in my metaphor) rather than just sub-classes of objects.

--
Travis Oliphant
Enthought, Inc.
(512) 536-1057 (office)
(512) 536-1059 (fax)
http://www.enthought.com
oliphant@enthought.com
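
The namespace idea described above can be illustrated with plain Python, since `exec` accepts any mapping as its local namespace. The sketch below is not Enthought's context implementation; `AdaptingNamespace`, its `on_set` and `on_get` arguments, and the feet-to-metres adapter are hypothetical names chosen only to show names being intercepted as code runs.

```python
FEET_TO_METRES = 0.3048


class AdaptingNamespace(dict):
    """Hypothetical namespace that adapts values as names are set or read."""

    def __init__(self, on_set=None, on_get=None):
        super().__init__()
        self.on_set = on_set or {}  # name -> adapter applied on assignment
        self.on_get = on_get or {}  # name -> adapter applied on lookup

    def __setitem__(self, name, value):
        adapter = self.on_set.get(name)
        super().__setitem__(name, adapter(value) if adapter else value)

    def __getitem__(self, name):
        # A KeyError here lets exec fall back to globals and builtins.
        value = super().__getitem__(name)
        adapter = self.on_get.get(name)
        return adapter(value) if adapter else value


# The code below is written in feet; the namespace stores "height" in metres.
ns = AdaptingNamespace(on_set={"height": lambda feet: feet * FEET_TO_METRES})
exec("height = 10.0\ndoubled = height * 2", {}, ns)
```

After the `exec` call, `ns["height"]` holds the adapted value (about 3.048) and `doubled` is computed from it, without the executed code or the values themselves knowing about the conversion. The adaptation lives in the namespace, not in a subclass of the data type.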

Darren Dale

15 Mar 2009

10:46 a.m.

Hi Travis,

On Mon, Mar 9, 2009 at 6:08 PM, Travis E. Oliphant <oliphant@enthought.com> wrote:

...

Contexts may be an alternative approach, but I do not understand the vision or how they would be applied to the problem of unit handling. The Quantities package is already in a useful and working state, based on an ndarray subclass. My goal at this point is to make Quantities more useful with numpy/scipy. There is already a mechanism for doing so; it just needs to be tweaked to be more generally applicable. Hopefully I can interest some of the current numpy developers in engaging in the discussion after 1.3 is released.

Darren

Darren Dale

27 May 2009

11:30 a.m.

Now that numpy-1.3 has been released, I was hoping I could engage the numpy developers and community concerning my suggestion to improve the ufunc wrapping mechanism. Currently, ufuncs call, on the way out, the __array_wrap__ method of the input array with the highest __array_priority__.

There are use cases, like masked arrays or arrays with units, where it is imperative to run some code on the way in to the ufunc as well. MaskedArrays do this by reimplementing or wrapping ufuncs, but this approach puts some pretty severe constraints on subclassing. For example, in my Quantities package I have a Quantity object that derives from ndarray. It has been suggested that in order to make ufuncs work with Quantity, I should wrap numpy's built-in ufuncs. But I intend to make a MaskedQuantity object as well, deriving from MaskedArray, and would therefore have to wrap the MaskedArray ufuncs as well.

If ufuncs would simply call a method both on the way in and on the way out, I think this would go a long way to improving this situation. I whipped up a simple proof of concept and posted it in this thread a while back. For example, a MaskedQuantity would implement a method like __gfunc_pre__ to check the validity of the units operation etc., and would then call MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. __gfunc_pre__ would return a dict containing any metadata the subclasses wish to provide based on the inputs, and that dict would be passed along with the inputs, output and context to __gfunc_post__, so postprocessing can be done (__gfunc_post__ replacing __array_wrap__).

Of course, packages like MaskedArray may still wish to reimplement ufuncs, as Eric Firing is investigating right now. The point is that classes that don't care about the implementation of ufuncs, and only need to provide metadata based on the inputs and the output, can do so using this mechanism and can build upon other specialized arrays.

I would really appreciate input from numpy developers and other interested parties. I would like to continue developing the Quantities package this summer, and have been approached by numerous people interested in using Quantities with sage, sympy and matplotlib. But I would prefer to improve the ufunc mechanism (or establish that there is no interest among the community to do so) so I can improve the package (or limit its scope) before making an official announcement.

Thank you,
Darren

On Mon, Mar 9, 2009 at 5:37 PM, Darren Dale <dsdale24@gmail.com> wrote:

...
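
The chaining described in this message, where MaskedQuantity checks units and then defers to MaskedArray's own preprocessing, could be sketched with cooperative `super()` calls. The classes below are plain Python stand-ins rather than real ndarray or MaskedArray subclasses, and the hook signatures and the `elementwise_add` helper are assumptions made for illustration.

```python
class Array:
    """Stand-in for ndarray: contributes no metadata of its own."""

    def __gfunc_pre__(self, gfunc, *args):
        return {}


class Masked(Array):
    """Stand-in for MaskedArray: works out the combined mask on the way in."""

    def __init__(self, data, mask):
        self.data = data
        self.mask = mask

    def __gfunc_pre__(self, gfunc, *args):
        metadata = super().__gfunc_pre__(gfunc, *args)
        # An output element is masked if any corresponding input element is.
        metadata["mask"] = [any(flags) for flags in zip(*(a.mask for a in args))]
        return metadata


class MaskedQuantity(Masked):
    """Stand-in for the proposed MaskedQuantity: adds a units check."""

    def __init__(self, data, mask, units):
        super().__init__(data, mask)
        self.units = units

    def __gfunc_pre__(self, gfunc, *args):
        units = {a.units for a in args}
        if len(units) != 1:
            raise ValueError("incompatible units: %s" % sorted(units))
        # Units are valid, so let the parent class add its own metadata.
        metadata = super().__gfunc_pre__(gfunc, *args)
        metadata["units"] = units.pop()
        return metadata

    def __gfunc_post__(self, gfunc, result, metadata):
        return MaskedQuantity(result, metadata["mask"], metadata["units"])


def elementwise_add(x, y):
    return [p + q for p, q in zip(x.data, y.data)]


a = MaskedQuantity([1, 2, 3], [False, True, False], "m")
b = MaskedQuantity([10, 20, 30], [False, False, True], "m")
metadata = a.__gfunc_pre__(elementwise_add, a, b)
out = a.__gfunc_post__(elementwise_add, elementwise_add(a, b), metadata)
```

Here `metadata` ends up holding both the combined mask and the units, each contributed by a different class in the hierarchy, and neither class has to rewrap `elementwise_add` itself.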


Darren Dale

24 Jun 2009

9:08 a.m.

On Wed, May 27, 2009 at 11:30 AM, Darren Dale <dsdale24@gmail.com> wrote:

...

There was some discussion of this proposal to allow better interaction of ufuncs with ndarray subclasses in another thread (Plans for numpy-1.4.0 and scipy-0.8.0) and the comments were encouraging. I have been trying to gather feedback as to whether the numpy devs were receptive to the idea, and it seems the answer is tentatively yes, although there were questions about who would actually write the code. I guess I have not made clear that I intend to write the implementation and tests.

I gained some familiarity with the relevant code while squashing a few bugs for numpy-1.3, but it would be helpful if someone else who is familiar with the existing __array_wrap__ machinery would be willing to discuss this proposal in more detail and offer constructive criticism along the way. Is anyone willing? What is the timeframe being considered for the numpy-1.4 release?

Thanks,
Darren

Charles R Harris

9:42 a.m.

On Wed, Jun 24, 2009 at 7:08 AM, Darren Dale <dsdale24@gmail.com> wrote:

...

I think Travis would be the only one familiar with that code, and that would be from a couple of years back when he wrote it. Most of us have followed the same route as yourself, finding our way into the code by squashing bugs.

Chuck

Darren Dale

10:52 a.m.

On Wed, Jun 24, 2009 at 9:42 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:

...

Do you mean that you would require Travis to sign off on the implementation (assuming he would agree to review my work)? I would really like to avoid a situation where I invest the time and then the code bitrots because I can't find a route to committing it to svn.

Darren

Charles R Harris

3:37 p.m.

On Wed, Jun 24, 2009 at 8:52 AM, Darren Dale <dsdale24@gmail.com> wrote:

...

No, just that Travis would know the most about that subsystem if you are looking for help. I and others here can look over the code and commit it without Travis signing off on it. You could ask for commit privileges yourself. The important thing is having some tests and an agreement that the interface is appropriate. Pierre also seems interested in the functionality, so it would be useful for him to say whether it serves his needs too.

Chuck

Darren Dale

3:49 p.m.


Ok, I'll start working on it then. Any idea what you are targeting for numpy-1.4? Scipy-2009, or much earlier? I'd like to gauge how to budget my time. Darren

Charles R Harris

4:08 p.m.

On Wed, Jun 24, 2009 at 1:49 PM, Darren Dale<dsdale24@gmail.com> wrote:

...

On Wed, Jun 24, 2009 at 3:37 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:

...
On Wed, Jun 24, 2009 at 8:52 AM, Darren Dale<dsdale24@gmail.com> wrote:

...
On Wed, Jun 24, 2009 at 9:42 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:

...
On Wed, Jun 24, 2009 at 7:08 AM, Darren Dale<dsdale24@gmail.com> wrote:

...
On Wed, May 27, 2009 at 11:30 AM, Darren Dale <dsdale24@gmail.com> wrote:

...
Now that numpy-1.3 has been released, I was hoping I could engage the numpy developers and community concerning my suggestion to improve the ufunc wrapping mechanism. Currently, ufuncs call, on the way out, the __array_wrap__ method of the input array with the highest __array_priority__.

There are use cases, like masked arrays or arrays with units, where it is imperative to run some code on the way in to the ufunc as well. MaskedArrays do this by reimplementing or wrapping ufuncs, but this approach puts some pretty severe constraints on subclassing. For example, in my Quantities package I have a Quantity object that derives from ndarray. It has been suggested that in order to make ufuncs work with Quantity, I should wrap numpy's built-in ufuncs. But I intend to make a MaskedQuantity object as well, deriving from MaskedArray, and would therefore have to wrap the MaskedArray ufuncs as well.

If ufuncs would simply call a method both on the way in and on the way out, I think this would go a long way to improving this situation. I whipped up a simple proof of concept and posted it in this thread a while back. For example, a MaskedQuantity would implement a method like __gfunc_pre__ to check the validity of the units operation etc, and would then call MaskedArray.__gfunc_pre__ (if defined) to determine the domain etc. __gfunc_pre__ would return a dict containing any metadata the subclasses wish to provide based on the inputs, and that dict would be passed along with the inputs, output and context to __gfunc_post__, so postprocessing can be done (__gfunc_post__ replacing __array_wrap__).

Of course, packages like MaskedArray may still wish to reimplement ufuncs, like Eric Firing is investigating right now. The point is that classes that don't care about the implementation of ufuncs, that only need to provide metadata based on the inputs and the output, can do so using this mechanism and can build upon other specialized arrays.

I would really appreciate input from numpy developers and other interested parties. I would like to continue developing the Quantities package this summer, and have been approached by numerous people interested in using Quantities with sage, sympy, matplotlib. But I would prefer to improve the ufunc mechanism (or establish that there is no interest among the community to do so) so I can improve the package (or limit its scope) before making an official announcement.

There was some discussion of this proposal to allow better interaction of ufuncs with ndarray subclasses in another thread (Plans for numpy-1.4.0 and scipy-0.8.0) and the comments were encouraging. I have been trying to gather feedback as to whether the numpy devs were receptive to the idea, and it seems the answer is tentatively yes, although there were questions about who would actually write the code. I guess I have not made clear that I intend to write the implementation and tests. I gained some familiarity with the relevant code while squashing a few bugs for numpy-1.3, but it would be helpful if someone else who is familiar with the existing __array_wrap__ machinery would be willing to discuss this proposal in more detail and offer constructive criticism along the way. Is anyone willing?

I think Travis would be the only one familiar with that code and that would be from a couple of years back when he wrote it. Most of us have followed the same route as yourself, finding our way into the code by squashing bugs.

Do you mean that you would require Travis to sign off on the implementation (assuming he would agree to review my work)? I would really like to avoid a situation where I invest the time and then the code bitrots because I can't find a route to committing it to svn.

No, just that Travis would know the most about that subsystem if you are looking for help. I and others here can look over the code and commit it without Travis signing off on it. You could ask for commit privileges yourself. The important thing is having some tests and an agreement that the interface is appropriate. Pierre also seems interested in the functionality so it would be useful for him to say that it serves his needs also.

Ok, I'll start working on it then. Any idea what you are targeting for numpy-1.4? Scipy-2009, or much earlier? I'd like to gauge how to budget my time.

The timeline is open for discussion. A six month timeline would put it sometime in November but David might want it earlier for scipy 0.8. My guess would be sometime after Scipy-2009, late September at the earliest. But as I say, it is open for discussion. What schedule would you prefer? Chuck

Darren Dale

4:37 p.m.


I guess I'd like a shot at submitting this in time for 1.4, but I wouldn't want to hold up the release. Late September should provide plenty of time. Darren

Darren Dale

13 Jul

5:18 p.m.

I've put together a first cut at implementing __array_prepare__, which appears to work, and I would like to request feedback. Here is an overview of the approach:

Once the ufunc machinery has created the output arrays, it is time to offer subclasses a chance to initialize the output arrays and determine metadata and/or perform whatever other operations may be desired before the ufunc actually performs the computation. In the construct_arrays function in umath.c, I added a function _find_array_prepare, which attempts to find an __array_prepare__ method to call from the inputs, almost identical to the existing _find_array_wrap. The default implementation of __array_prepare__ is currently identical to the default implementation of __array_wrap__, in methods.c. I think that bit of code fits better in __array_prepare__, but maybe there is a good reason to keep it in __array_wrap__ and make the default __array_prepare__ just pass through the output array.

So now that the output arrays have been created by the ufunc machinery, and those arrays have been initialized by __array_prepare__ (which has the same call signature as __array_wrap__), the ufunc can continue as usual. Classes that already rely on __array_wrap__ can continue to do so; implementing __array_prepare__ is entirely optional. But other classes like MA and Quantity can set the output array type, determine a mask, perform units analysis checks, and update some metadata in advance of the ufunc, and they can still update metadata after the ufunc using __array_wrap__.

The implementation is included in the attached patch. I ran np.test and got 1 known failure and 11 skipped tests. I am using a quantities branch for testing (bzr branch lp:~dsdale24/python-quantities/quantities-array-prepare), and after simply moving my units analysis out of __array_wrap__ and into __array_prepare__, quantities.test() does not yield any errors.

Darren
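For readers who want to see the order of the calls, the sequence described in this message can be sketched in pure Python. This is only a toy model under stated assumptions, not the patch itself: ToyArray, InfoArray, find_hook and toy_subtract are hypothetical names, the prepare/wrap methods stand in for __array_prepare__/__array_wrap__, and the priority attribute stands in for __array_priority__. The real logic lives in C in the ufunc machinery and may differ in detail.

```python
class ToyArray:
    """Plain container standing in for an ndarray (hypothetical)."""

    priority = 0.0  # plays the role of __array_priority__

    def __init__(self, values, info=None):
        self.values = list(values)
        self.info = info

    def prepare(self, out, context=None):
        # Default hook: pass the freshly allocated output through untouched.
        return out

    def wrap(self, out, context=None):
        # Default hook: return the computed output untouched.
        return out


class InfoArray(ToyArray):
    """Subclass that updates metadata before and after the computation."""

    priority = 10.0

    def prepare(self, out, context=None):
        # Runs before any result exists: set the output type and metadata.
        return InfoArray(out.values, info="prepared")

    def wrap(self, out, context=None):
        # Runs after the computation: metadata can be updated again.
        out.info = "wrapped after " + out.info
        return out


def find_hook(inputs, name):
    """Return the named hook of the input with the highest priority."""
    best = max(inputs, key=lambda a: a.priority)
    return getattr(best, name)


def toy_subtract(a, b):
    """Elementwise a - b, calling prepare before and wrap after."""
    inputs = [x for x in (a, b) if isinstance(x, ToyArray)]
    b_vals = b.values if isinstance(b, ToyArray) else [b] * len(a.values)
    context = (toy_subtract, (a, b), 0)

    # 1) The machinery allocates an output whose values are not yet set.
    out = ToyArray([None] * len(a.values))
    # 2) The subclass gets a chance to initialize the output.
    out = find_hook(inputs, "prepare")(out, context)
    # 3) The computation fills the prepared output in place.
    for i, (x, y) in enumerate(zip(a.values, b_vals)):
        out.values[i] = x - y
    # 4) The subclass post-processes the result.
    return find_hook(inputs, "wrap")(out, context)


if __name__ == "__main__":
    a = InfoArray(range(5), info="information")
    result = toy_subtract(a, 3)
    print(type(result).__name__, result.values, result.info)
```

The point of the sketch is only that the prepare hook sees the output before any values are computed, while the wrap hook sees it afterwards, and that a class defining neither hook behaves as before.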

Stéfan van der Walt

6:09 p.m.

Hi Darren 2009/7/13 Darren Dale <dsdale24@gmail.com>:

...

I've put together a first cut at implementing __array_prepare__, which appears to work, and I would like to request feedback. Here is an overview of the approach:

This is pretty neat! Do you have a quick snippet at hand illustrating its use? Regards Stéfan

Darren Dale

7:12 p.m.

2009/7/13 Stéfan van der Walt <stefan@sun.ac.za>

...

Hi Darren

2009/7/13 Darren Dale <dsdale24@gmail.com>:

...
I've put together a first cut at implementing __array_prepare__, which appears to work, and I would like to request feedback. Here is an overview of the approach:

This is pretty neat! Do you have a quick snippet at hand illustrating its use?

That would be helpful, wouldn't it? The attached script is a modified version of RealisticInfoArray from http://docs.scipy.org/doc/numpy/user/basics.subclassing.html . It should yield the following output:

starting with [0 1 2 3 4]
which is of type <class '__main__.MyArray'>
and has info attribute = "information"
subtracting 3 from [0 1 2 3 4]
subtract calling __array_prepare__ on [0 1 2 3 4] input
output array is now of type <class '__main__.MyArray'>
output array values are still uninitialized: [139911601789568 39578752 139911614885536 39254560 48]
__array_prepare__ is updating info attribute on output
__array_prepare__ finished, subtract ufunc is taking over
subtract calling __array_wrap__ on [0 1 2 3 4] input
output array has initial value: [-3 -2 -1 0 1]
__array_wrap__ is setting output endpoints to 0
yielding [ 0 -2 -1 0 0]
which is of type <class '__main__.MyArray'>
and has info attribute = "new_information"

Darren

Darren Dale

17 Jul

10:03 a.m.


This is a gentle ping, hoping to get some feedback so this feature has a chance of being included in the next release. Darren

Darren Dale

11:44 a.m.


I have a question about the C-api. If I want to make the default implementation of __array_prepare__ (or __array_wrap__, is anyone out there?) simply pass through the output array:

static PyObject *
array_preparearray(PyArrayObject *self, PyObject *args)
{
    PyObject *arr;

    if (PyTuple_Size(args) < 1) {
        PyErr_SetString(PyExc_TypeError,
                        "only accepts 1 argument");
        return NULL;
    }
    arr = PyTuple_GET_ITEM(args, 0);
    if (!PyArray_Check(arr)) {
        PyErr_SetString(PyExc_TypeError,
                        "can only be called with ndarray object");
        return NULL;
    }
    return arr;
}

Is this sufficient, or do I need to worry about calling Py_INCREF?

Thanks,
Darren

Charles R Harris

12:04 p.m.


PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos)
Return value: Borrowed reference.
Return the object at position pos in the tuple pointed to by p. If pos is out of bounds, return NULL and sets an IndexError exception.

It's a borrowed reference so you need to call Py_INCREF on it. I find this Python C-API documentation <http://www.python.org/doc/2.5/api/api.html> useful.

Chuck

Stéfan van der Walt

20 Jul

5:33 p.m.

Hi Chuck 2009/7/17 Charles R Harris <charlesr.harris@gmail.com>:

...

PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos) Return value: Borrowed reference. Return the object at position pos in the tuple pointed to by p. If pos is out of bounds, return NULL and sets an IndexError exception. It's a borrowed reference so you need to call Py_INCREF on it. I find this Python C-API documentation useful.

Have you had a look over the rest of the code? I think this would make a good addition. Travis mentioned Contexts for doing something similar, but I don't know enough about that concept to compare the two. Regards Stefan

Darren Dale

21 Jul

7:44 a.m.


I think contexts would be very different from what is already in place. For now, it would be nice to make this one small improvement to the existing ufunc infrastructure, and maybe consider contexts (which I still don't understand) at a later time. I have improved the code slightly and added a few tests, and will post a new patch later this morning. I just need to add some documentation. Darren

Darren Dale

10:11 a.m.


Here is a better patch, which includes a few additional tests and adds some documentation. It also attempts to improve the docstring and sphinx docs for __array_wrap__, which may have been a little bit misleading. There is also some whitespace cleanup in a few places.

Would someone please review my work and commit the patch if it is acceptable? Pierre or Travis, would either of you have a chance to look over the implementation and the documentation changes, since you two seem to be most familiar with ufuncs and subclassing ndarray?

(off topic: it would be nice if numpy had a mechanism in place for merge requests and code reviews. I've been following bzr-dev for a while now and their development model is pretty impressive.)

Thank you,
Darren

Robert Kern

2:12 p.m.

On Tue, Jul 21, 2009 at 09:11, Darren Dale<dsdale24@gmail.com> wrote:

...

(off topic: it would be nice if numpy had a mechanism in place for merge requests and code reviews. I've been following bzr-dev for a while now and their development model is pretty impressive.)

You can use Rietveld. numpy is already a registered repository. http://codereview.appspot.com -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

Darren Dale

23 Jul

12:54 p.m.


It looks like part of my patch has been clobbered by changes introduced in svn 7184-7191. What else should I be doing so a patch like this can be committed relatively quickly? Darren

Darren Dale

25 Jul

8:33 p.m.


Could I please obtain commit privileges so I can commit this feature to svn myself?

Darren

Darren Dale

13 Sep

1:01 p.m.

I guess I forgot to follow up here: I committed the patch during the SciPy conference. Thank you to the devs for granting me commit privileges; I'll use them with care.

Are the numpy developers familiar with predicate dispatch in general, and PEP 3124 (generic functions) in particular? I've been reading about them all weekend. They seem particularly applicable to numpy, and not just where we currently use __array_prepare__ and __array_wrap__. The PEP seems to have stalled, but there is currently some discussion about it at python-dev. If anyone is interested in how generic functions could be useful to numpy, commenting in the thread at python-dev could help establish which features would be desirable in generic functions and provide motivation for including them in the standard library.

Here are some links for anyone who is interested:

The PEP: http://ftp.python.org/dev/peps/pep-3124/
A presentation of Eby's implementation in PEAK: http://peak.telecommunity.com/PyCon05Talk/img0.html
One of the papers Eby cites in the presentation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.1167
A Charming Python article on Eby's implementation: http://www.ibm.com/developerworks/library/l-cppeak2/
Guido's musings on the topic: http://www.artima.com/weblogs/viewpost.jsp?thread=155123

Darren
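For readers unfamiliar with generic functions, here is a minimal sketch of the general idea. It is not PEP 3124 or the PEAK implementation: it assumes functools.singledispatch from the Python standard library, which dispatches only on the type of the first argument and is therefore much narrower than predicate dispatch. The Quantity class and the square function are hypothetical names made up for this illustration.

```python
from functools import singledispatch


class Quantity:
    """Hypothetical stand-in for an array-like type that carries units."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units


@singledispatch
def square(x):
    """Fallback implementation, used when nothing more specific is registered."""
    return x * x


@square.register(Quantity)
def _square_quantity(q):
    # Type-specific implementation: square the magnitude and track the units.
    return Quantity(q.magnitude * q.magnitude, f"({q.units})**2")


@square.register(list)
def _square_list(values):
    # Element-wise implementation; each element is dispatched again.
    return [square(v) for v in values]


plain = square(3)                        # uses the fallback
area = square(Quantity(2.0, "m"))        # uses the Quantity implementation
mixed = square([1, Quantity(3.0, "s")])  # uses the list implementation
```

The point of the sketch is that the unit handling lives next to the Quantity type and is registered from outside, so the generic square function does not need to know about every type it may be called with.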

participants (5)

Charles R Harris
Darren Dale
Robert Kern
Stéfan van der Walt
Travis E. Oliphant
