[Numpy-discussion] deprecate updateifcopy in nditer operand, flags?

Sun Nov 12 14:13:00 EST 2017

On 10/11/17 12:25, numpy-discussion-request at python.org wrote:
> Date: Fri, 10 Nov 2017 02:25:19 -0800
> From: Nathaniel Smith <njs at pobox.com>
> To: Discussion of Numerical Python <numpy-discussion at python.org>
> Subject: Re: [Numpy-discussion] deprecate updateifcopy in nditer
> 	operand, flags?
>
> On Wed, Nov 8, 2017 at 2:13 PM, Allan Haldane <allanhaldane at gmail.com> wrote:
>> On 11/08/2017 03:12 PM, Nathaniel Smith wrote:
>>> - We could adjust the API so that there's some explicit operation to
>>> trigger the final writeback. At the Python level this would probably
>>> mean that we start supporting the use of nditer as a context manager,
>>> and eventually start raising an error if you're in one of the "unsafe"
>>> cases and not using the context manager form. At the C level we
>>> probably need some explicit "I'm done with this iterator now" call.
>>>
>>> One question is which cases exactly should produce warnings/eventually
>>> errors. At the Python level, I guess the simplest rule would be that
>>> if you have any write/readwrite arrays in your iterator, then you have
>>> to use a 'with' block. At the C level, it's a little trickier, because
>>> it's hard to tell up-front whether someone has updated their code to
>>> call a final cleanup function, and it's hard to emit a warning/error
>>> on something that *doesn't* happen. (You could print a warning when
>>> the nditer object is GCed if the cleanup function wasn't called, but
>>> you can't raise an error there.) I guess the only reasonable option is
>>> to deprecate NPY_ITER_READWRITE and NPY_ITER_WRITEONLY, and make people
>>> switch to passing new flags that have the same semantics but also
>>> promise that the user has updated their code to call the new cleanup
>>> function.
>> Seems reasonable.
>>
>> When people use the nditer C-API, they (almost?) always call
>> NpyIter_Deallocate when they're done. Maybe that's a place to put a
>> warning for C-API users. I think you can emit a warning there since
>> that function calls the GC, not the other way around.
>>
>> It looks like you've already discussed the possibilities of putting
>> things in NpyIter_Deallocate though, and it could be tricky, but if we
>> only need a warning maybe there's a way.
>> https://github.com/numpy/numpy/pull/9269/files/6dc0c65e4b2ea67688d6b617da3a175cd603fc18#r127707149
> Oh, hmm, yeah, on further examination there are some more options here.
>
> I had missed that for some reason NpyIter isn't actually a Python
> object, so actually it's never subject to GC and you always need to
> call NpyIter_Deallocate when you are finished with it. So that's a
> natural place to perform writebacks. We don't even need a warning.
> (Which is good, because warnings can be set to raise errors, and while
> the docs say that NpyIter_Deallocate can fail, in fact it never has
> been able to in the past and none of the code in numpy or the examples
> in the docs actually check the return value. Though I guess in theory
> writeback can also fail so I suppose we need to start returning
> NPY_FAIL in that case. But it should be vanishingly rare in practice,
> and it's not clear if anyone is even using this API outside of numpy.)
>
> And for the Python-level API, there is the option of performing the
> final writeback when the iterator is exhausted. The downside to this
> is that if someone only goes half-way through the iteration and then
> aborts (e.g. by raising an exception), then the last round of
> writeback won't happen. But maybe that's fine, or at least better than
> forcing the use of 'with' blocks everywhere? If we do this then I
> think we'd at least want to make sure that the writeback really never
> happens, as opposed to happening at some random later point when the
> Python iterator object is GCed. But I'd appreciate if anyone would
> express a preference between these :-)
>
> -n
>
> -- Nathaniel J. Smith -- https://vorpus.org
We cannot assume that the call to NpyIter_Deallocate() can resolve
writebackifcopy semantics. NpyIter_Copy() returns a new iterator
(after Py_INCREF'ing the operands), so when either the original or the
copy is deallocated, the operand's writeback buffer may still be needed
by the other. So at the C level the user must resolve the writeback when
the last copy of the iterator is deallocated.
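
To make the timing constraint concrete, here is a minimal pure-Python sketch. It is a toy model only, not NumPy code: SharedOperand, ToyIter, live_iterators and resolve() are all hypothetical names invented for illustration, and the automatic counting is just one assumed way of tracking "the last copy". It shows why deallocating one iterator cannot safely trigger the writeback while a copy still uses the same buffer.

```python
class SharedOperand:
    """Toy stand-in for an operand that has a temporary writeback buffer."""

    def __init__(self, base):
        self.base = base            # the caller's original data
        self.buffer = list(base)    # temporary copy the iterators write into
        self.live_iterators = 0     # iterators that still use the buffer
        self.resolved = False

    def resolve(self):
        # Copy the temporary buffer back into the original data, once.
        if not self.resolved:
            self.base[:] = self.buffer
            self.resolved = True


class ToyIter:
    """Hypothetical iterator; copies share the same operand and buffer."""

    def __init__(self, operand):
        self.operand = operand
        self.closed = False
        operand.live_iterators += 1

    def copy(self):
        # A new iterator over the same operand, as with an iterator copy.
        return ToyIter(self.operand)

    def write(self, index, value):
        self.operand.buffer[index] = value

    def deallocate(self):
        if self.closed:
            return
        self.closed = True
        self.operand.live_iterators -= 1
        # Only the last iterator to go away may resolve the writeback.
        if self.operand.live_iterators == 0:
            self.operand.resolve()


data = [0, 0, 0]
it = ToyIter(SharedOperand(data))
dup = it.copy()

it.write(0, 1)
it.deallocate()            # dup is still alive: nothing is written back yet
after_first = list(data)

dup.write(1, 2)
dup.deallocate()           # last iterator gone: buffer is copied into data
after_last = list(data)
```

In this model the original data stays untouched after the first deallocate() and only receives both writes after the second one. Whether that bookkeeping belongs to the iterator or to the caller is exactly the open question.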

At the Python level we can force the use of a context manager and
prohibit use of a suspicious nditer (one with writebackifcopy semantics)
outside of a context manager. As for non-exhausted nditers, IMO using a
context manager makes it very clear when the writeback resolution is
meant to happen. Do we really want to support a use case where someone
creates an iterator, uses it partially, then needs to think carefully
about whether the operand changes will be resolved?

Matti
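
A rough pure-Python sketch of the context-manager rule discussed above, for illustration only. ToyNditer, needs_writeback and buffer are hypothetical names, not the real nditer API, and resolving the writeback on every exit (including after an exception) is an assumption of this sketch, not a settled design.

```python
class ToyNditer:
    """Hypothetical iterator that requires 'with' when a writeback is pending."""

    def __init__(self, base, needs_writeback=True):
        self.base = base
        self.needs_writeback = needs_writeback
        # With writeback semantics, writes go to a temporary copy first.
        self.buffer = list(base) if needs_writeback else base
        self.inside_with = False

    def __enter__(self):
        self.inside_with = True
        return self

    def __exit__(self, exc_type, exc, tb):
        # The writeback is resolved here, at a well-defined point, whether
        # or not the iteration ran to completion.
        if self.needs_writeback:
            self.base[:] = self.buffer
        self.inside_with = False
        return False  # never swallow exceptions

    def __iter__(self):
        if self.needs_writeback and not self.inside_with:
            raise RuntimeError(
                "iterator with pending writeback must be used in a 'with' block"
            )
        return iter(range(len(self.buffer)))


# Partial iteration inside a 'with' block: the changes still land on exit.
data = [1, 2, 3]
with ToyNditer(data) as it:
    for i in it:
        it.buffer[i] *= 10
        if i == 1:
            break

# Using the same kind of iterator outside a 'with' block is rejected.
rejected = False
try:
    for i in ToyNditer(data):
        pass
except RuntimeError:
    rejected = True
```

Here the partially consumed iterator leaves data as [10, 20, 3] once the block exits, so the user never has to reason about whether the operand changes were resolved.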