[Numpy-discussion] Warnings in numpy.ma.test()

Wed Mar 17 21:39:12 EDT 2010

On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale <dsdale24 at gmail.com> wrote:
>>
>> On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale <dsdale24 at gmail.com> wrote:
>> >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>> >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote:
>> >> >>
>> >> >> I started thinking about a third method called __input_prepare__
>> >> >> that would be called on the way into the ufunc, which would allow
>> >> >> you to intercept the input and pass a somehow modified copy back
>> >> >> to the ufunc. The total flow would be:
>> >> >>
>> >> >> 1) Call myufunc(x, y[, z])
>> >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns
>> >> >>    x', y' (or simply passes through x, y by default)
>> >> >> 3) myufunc creates the output array z (if not specified) and calls
>> >> >>    ?.__array_prepare__(z, (myufunc, x, y, ...))
>> >> >> 4) myufunc finally gets around to performing the calculation
>> >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and
>> >> >>    returns the result to the caller
>> >> >>
>> >> >> Is this general enough for your use case? I haven't tried to think
>> >> >> about how to change some global state at one point and change it
>> >> >> back at another, that seems like a bad idea and difficult to
>> >> >> support.
>> >> >
>> >> >
>> >> > Sounds like a good plan. If we could find a way to merge the first
>> >> > two (__input_prepare__ and __array_prepare__), that'd be ideal.
>> >>
>> >> I think it is better to keep them separate, so we don't have one
>> >> method that is trying to do too much. It would be easier to explain in
>> >> the documentation.
>> >>
>> >> I may not have much time to look into this until after Monday. Is
>> >> there a deadline we need to consider?
>> >>
>> >
>> > I don't think this should go into 2.0, I think it needs more thought.
>>
>> Now that you mention it, I agree that it would be too rushed to try to
>> get it in for 2.0. Concerning a later release, is there anything in
>> particular that you think needs to be clarified or reconsidered?
>>
>> > And 2.0 already has significant code churn. Is there any reason beyond
>> > a big hassle not to set/restore the error state around all the ufunc
>> > calls in ma? Beyond that, the PEP that you pointed to looks
>> > interesting. Maybe some sort of decorator around ufunc calls could
>> > also be made to work.
>>
>> I think the PEP is interesting, but it is languishing. There were some
>> questions and criticisms on the mailing list that I do not think were
>> satisfactorily addressed, and as far as I know the author of the PEP
>> has not pursued the matter further. There was some interest on the
>> python-dev mailing list in the numpy community's use case, but I think
>> we need to consider what can be done now to meet the needs of ndarray
>> subclasses. I don't see PEP 3124 happening in the near future.
>>
>> What I am proposing is a simple extension to our existing framework to
>> let subclasses hook into ufuncs and customize their behavior based on
>> the context of the operation (using the __array_priority__ of the
>> inputs and/or outputs, and the identity of the ufunc). The steps I
>> listed allow customization at the critical steps: prepare the input,
>> prepare the output, populate the output (currently no proposal for
>> customization here), and finalize the output. The only additional step
>> proposed is to prepare the input.
>>
>
> What bothers me here is the opposing desire to separate ufuncs from their
> ndarray dependency, having them operate on buffer objects instead. As I see
> it ufuncs would be split into layers, with a lower layer operating on buffer
> objects, and an upper layer tying them together with ndarrays where the
> "business" logic -- kinds, casting, etc. -- resides. It is in that upper
> layer that what you are proposing would reside. Mind, I'm not sure that
> having matrices and masked arrays subclassing ndarray was the way to go, but
> given that they do, one possible solution is to dump the whole mess onto the
> subtype with the highest priority. That subtype would then be responsible
> for casts and all the other stuff needed for the call and wrapping the
> result. There could be library routines to help with that. It seems to me
> that that would be the most general way to go. In that sense ndarrays
> themselves would just be another subtype with especially low priority.

I'm sorry, I didn't understand your point. What you described sounds
identical to how things are currently done. What distinction are you
making, aside from operating on the buffer object? How would adding a
method to modify the input to a ufunc complicate the situation?
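
To make the proposed flow concrete, here is a rough pure-Python sketch.
It is only an illustration under stated assumptions: __input_prepare__
is the hypothetical hook under discussion and is not part of numpy, and
the names NanToZero and call_with_hooks are made up for this example.
Steps 3 to 5 are left to the real ufunc operating on plain ndarrays, and
the result is simply viewed back as the subclass to stand in for
__array_wrap__.

```python
import numpy as np


class NanToZero(np.ndarray):
    """Hypothetical subclass that cleans its inputs before a ufunc runs."""

    __array_priority__ = 10.0

    def __input_prepare__(self, ufunc, *inputs):
        # Proposed hook (an assumption, not existing numpy API): return
        # modified copies of the inputs, here with NaN replaced by 0.
        return tuple(np.nan_to_num(np.asarray(x)) for x in inputs)


def call_with_hooks(ufunc, *inputs):
    """Emulate the proposed flow outside of numpy's ufunc machinery."""
    # Step 2: the input with the highest __array_priority__ gets to
    # prepare the inputs; by default they pass through unchanged.
    owner = max(inputs, key=lambda x: getattr(x, "__array_priority__", 0.0))
    prepare = getattr(owner, "__input_prepare__", None)
    if prepare is not None:
        inputs = prepare(ufunc, *inputs)

    # Steps 3-4: the real ufunc allocates the output and computes it,
    # here on plain ndarrays so no subclass logic interferes.
    raw = ufunc(*[np.asarray(x) for x in inputs])

    # Step 5: stand-in for __array_wrap__, viewing the result as the
    # type of the highest-priority input.
    if isinstance(owner, np.ndarray):
        return np.asarray(raw).view(type(owner))
    return raw


x = np.array([1.0, np.nan]).view(NanToZero)
y = np.array([2.0, 3.0])
result = call_with_hooks(np.add, x, y)
```

The point of the sketch is that the extra step only touches the inputs
before the existing prepare/compute/wrap sequence starts; everything
after that is unchanged.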

Darren