On 10/11/17 12:25, numpy-discussion-request@python.org wrote:
> Date: Fri, 10 Nov 2017 02:25:19 -0800
> From: Nathaniel Smith <njs@pobox.com>
> To: Discussion of Numerical Python <numpy-discussion@python.org>
> Subject: Re: [Numpy-discussion] deprecate updateifcopy in nditer
> operand flags?
> On Wed, Nov 8, 2017 at 2:13 PM, Allan Haldane <allanhaldane@gmail.com> wrote:
>> On 11/08/2017 03:12 PM, Nathaniel Smith wrote:
>>> - We could adjust the API so that there's some explicit operation to
>>> trigger the final writeback. At the Python level this would probably
>>> mean that we start supporting the use of nditer as a context manager,
>>> and eventually start raising an error if you're in one of the "unsafe"
>>> case and not using the context manager form. At the C level we
>>> probably need some explicit "I'm done with this iterator now" call.
>>> One question is which cases exactly should produce warnings/eventually
>>> errors. At the Python level, I guess the simplest rule would be that
>>> if you have any write/readwrite arrays in your iterator, then you have
>>> to use a 'with' block. At the C level, it's a little trickier, because
>>> it's hard to tell up-front whether someone has updated their code to
>>> call a final cleanup function, and it's hard to emit a warning/error
>>> on something that *doesn't* happen. (You could print a warning when
>>> the nditer object is GCed if the cleanup function wasn't called, but
>>> you can't raise an error there.) I guess the only reasonable option is
>>> to deprecate NPY_ITER_READWRITE and NPY_ITER_WRITEONLY, and make people
>>> switch to passing new flags that have the same semantics but also
>>> promise that the user has updated their code to call the new cleanup
>>> function.
>> Seems reasonable.
>> When people use the NpyIter C-API, they (almost?) always call
>> NpyIter_Deallocate when they're done. Maybe that's a place to put a warning
>> for C-API users. I think you can emit a warning there since that
>> function calls the GC, not the other way around.
>> It looks like you've already discussed the possibilities of putting
>> things in NpyIter_Dealloc though, and it could be tricky, but if we only
>> need a warning maybe there's a way.
> Oh, hmm, yeah, on further examination there are some more options here.
> I had missed that for some reason NpyIter isn't actually a Python
> object, so actually it's never subject to GC and you always need to
> call NpyIter_Deallocate when you are finished with it. So that's a
> natural place to perform writebacks. We don't even need a warning.
> (Which is good, because warnings can be set to raise errors, and while
> the docs say that NpyIter_Deallocate can fail, in fact it never has
> been able to in the past and none of the code in numpy or the examples
> in the docs actually check the return value. Though I guess in theory
> writeback can also fail so I suppose we need to start returning
> NPY_FAIL in that case. But it should be vanishingly rare in practice,
> and it's not clear if anyone is even using this API outside of numpy.)
> And for the Python-level API, there is the option of performing the
> final writeback when the iterator is exhausted. The downside to this
> is that if someone only goes half-way through the iteration and then
> aborts (e.g. by raising an exception), then the last round of
> writeback won't happen. But maybe that's fine, or at least better than
> forcing the use of 'with' blocks everywhere? If we do this then I
> think we'd at least want to make sure that the writeback really never
> happens, as opposed to happening at some random later point when the
> Python iterator object is GCed. But I'd appreciate if anyone would
> express a preference between these:-)
> -- Nathaniel J. Smith -- https://vorpus.org
We cannot assume that the call to NpyIter_Deallocate() can resolve
writebackifcopy semantics. NpyIter_Copy() will return a new iterator
(after Py_INCREF'ing the operands), so when either the original or the
copy is deallocated the operand's writeback buffer may still be needed.
So at the C level the user must resolve the writeback when the last copy
of the iterator is deallocated.
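To make the constraint concrete, here is a toy Python sketch (not the numpy C-API, and the class name is purely hypothetical) of why the writeback can only be resolved by whichever deallocation happens last, using a simple shared reference count:

```python
# Illustrative sketch, not the numpy C-API: the writeback can only be
# resolved by whichever owner is deallocated last, tracked here with a
# simple shared reference count.
class WritebackHandle:
    def __init__(self, resolve):
        self._resolve = resolve
        self._refs = 1

    def copy(self):
        # analogous to NpyIter_Copy: another owner of the same buffer
        self._refs += 1
        return self

    def deallocate(self):
        # analogous to NpyIter_Deallocate on one of the owners
        self._refs -= 1
        if self._refs == 0:
            self._resolve()  # safe: no owner can still need the buffer

resolved = []
h = WritebackHandle(lambda: resolved.append(True))
h2 = h.copy()
h.deallocate()
assert not resolved   # the copy may still use the operand's buffer
h2.deallocate()
assert resolved       # last deallocation triggers the writeback
```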
At the Python level we can force the use of a context manager and
prohibit use of a suspicious nditer (one with writebackifcopy semantics)
outside of a context manager. As for non-exhausted nditers, IMO using a
context manager makes it very clear when the writeback resolution is
meant to happen. Do we really want to support a use case where someone
creates an iterator, uses it partially, then needs to think carefully
about whether the operand changes will be resolved?
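For illustration, here is a minimal sketch of the context-manager form being proposed (hypothetical until the `with` support lands): the writeback resolves at the end of the with-block even when iteration is abandoned half-way, so there is nothing to think carefully about.

```python
import numpy as np

a = np.arange(6, dtype='f8')
# 'updateifcopy' forces a casted work buffer, so writes need a writeback
with np.nditer(a, [], [['readwrite', 'updateifcopy']],
               casting='same_kind', op_dtypes=[np.dtype('f4')]) as it:
    for x in it:
        x[...] = x + 1
        break  # abandon the iteration half-way
# __exit__ resolved the writeback, exhausted or not:
# only the first element was changed, the rest round-tripped unchanged
assert a[0] == 1.0 and a[1] == 1.0
```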
> On 11/09/2017 04:30 AM, Joe wrote:
> > Hello,
> > I have a question and hope that you can help me.
> > The doc for vstack mentions that "this function continues to be
> > supported for backward compatibility, but you should prefer
> > np.concatenate or np.stack."
> > Using vstack was convenient because "the arrays must have the same shape
> > along all but the first axis."
> > So it was possible to stack an array (3,) and (2, 3) to a (3, 3) array
> > without using e.g. atleast_2d on the (3,) array.
> > Is there a possibility to mimic that behavior with np.concatenate or
> > np.stack?
> > Joe
Can anybody explain why vstack is going the way of the dodo?
Why are stack / concatenate better? What is 'bad' about vstack?
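To answer the practical question above: the vstack behavior can be reproduced by promoting the 1-D array to 2-D first, e.g. with np.atleast_2d (a sketch, one of several equivalent spellings):

```python
import numpy as np

v = np.arange(3)                 # shape (3,)
m = np.arange(6).reshape(2, 3)   # shape (2, 3)

stacked = np.vstack([v, m])      # shape (3, 3)

# equivalent with concatenate: promote the 1-D array to 2-D first
r = np.concatenate([np.atleast_2d(v), m], axis=0)

assert r.shape == (3, 3)
assert (r == stacked).all()
```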
I filed issue 9714 https://github.com/numpy/numpy/issues/9714 and wrote
a mail in September trying to get some feedback on what to do with
updateifcopy semantics and user-exposed nditer.
It garnered no response, so I am trying again.
For those who are unfamiliar with the issue see below for a short
summary and issue 7054 for a lengthy discussion.
Note that pull request 9639, which should be merged very soon, changes the
magical UPDATEIFCOPY into WRITEBACKIFCOPY, and hopefully will appear in
an upcoming release.
As I mention in the issue, there is a magical update done in this
snippet in the next-to-the-last line:
    a = np.arange(24, dtype='f8').reshape(2, 3, 4).T
    i = np.nditer(a, [], [['readwrite', 'updateifcopy']],
                  casting='same_kind', op_dtypes=[np.dtype('f4')])
    # Check that UPDATEIFCOPY is activated
    i.operands[0][2, 1, 1] = -12.5
    assert a[2, 1, 1] != -12.5
    i = None  # magic!!!
    assert a[2, 1, 1] == -12.5
Not only is this magic very implicit, it relies on refcount semantics
and thus does not work on PyPy.
Possible solutions:
1. nditer is rarely used, just deprecate updateifcopy use on operands
2. make nditer into a context manager, so the code would become explicit:

    a = np.arange(24, dtype='f8').reshape(2, 3, 4).T
    with np.nditer(a, [], [['readwrite', 'updateifcopy']],
                   casting='same_kind', op_dtypes=[np.dtype('f4')]) as i:
        # Check that WRITEBACKIFCOPY is activated
        i.operands[0][2, 1, 1] = -12.5
        assert a[2, 1, 1] != -12.5
    assert a[2, 1, 1] == -12.5  # a is modified in i.__exit__
3. something else?
Any opinions? Does anyone use nditer in production code?
What are updateifcopy semantics? When a temporary copy or work buffer is
required, NumPy can (ab)use the base attribute of an ndarray by
- creating a copy of the data from the base array
- marking the base array read-only
Then, when the temporary buffer is "no longer needed",
- the data is copied back
- the original base array is marked read-write
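The lifecycle above can be sketched with plain numpy operations (purely illustrative; the real mechanism lives in the C layer and ties the copy to the ndarray base attribute):

```python
import numpy as np

base = np.arange(6, dtype='f8')

# 1. create a temporary work copy in the requested dtype
work = base.astype('f4')
# 2. mark the base array read-only while the copy is live
base.flags.writeable = False

work[0] = -1.0  # all writes go to the work buffer

# 3. "no longer needed": copy the data back, restore writability
base.flags.writeable = True
base[...] = work
assert base[0] == -1.0
```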
The trigger for the "no longer needed" decision before pull request 9639
is in the dealloc function.
That is not generally a place to do useful work, especially on PyPy
which can call dealloc much later.
Pull request 9639 adds an explicit PyArray_ResolveWritebackIfCopy API
function, and recommends calling it explicitly before dealloc.
The only place this change is visible to the Python-level user is in
nditer. C-API users will need to adapt their code to use the new API
function, with a deprecation cycle that is backward compatible on CPython.
Thank you all kindly for your responses! Based on your encouragement, I
will pursue an ndarray subclass / __array_ufunc__ implementation. I had
been toying with np.set_numeric_ops, which is less than ideal (for example,
np.ndarray.around segfaults if I use set_numeric_ops in any way).
A second question: very broadly speaking, how much 'pain' can I expect
trying to use an np.ndarray subclass in the broader python scientific
computing ecosystem, and is there general consensus that projects 'should'
support ndarray subclasses?
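For reference, a minimal __array_ufunc__ override on an ndarray subclass looks roughly like this (the LoggedArray class is hypothetical, used only to illustrate the dispatch mechanics):

```python
import numpy as np

class LoggedArray(np.ndarray):
    """Minimal ndarray subclass that intercepts all ufunc calls."""
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # unwrap LoggedArray inputs to plain ndarrays, then delegate
        args = [np.asarray(x) if isinstance(x, LoggedArray) else x
                for x in inputs]
        result = getattr(ufunc, method)(*args, **kwargs)
        # rewrap array results so the subclass propagates
        if isinstance(result, np.ndarray):
            return result.view(LoggedArray)
        return result

a = np.arange(3).view(LoggedArray)
b = a + 1  # routed through LoggedArray.__array_ufunc__
assert isinstance(b, LoggedArray)
assert np.asarray(b).tolist() == [1, 2, 3]
```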
> We spent a *long time* sorting out the messy details of __array_ufunc__,
> especially for handling interactions between different types, e.g.,
> between numpy arrays, non-numpy array-like objects, builtin Python objects,
> objects that override arithmetic to act in non-numpy-like ways, and of
> course subclasses of all the above.
> We hope that we have it right this time, but as we wrote in the NumPy
> 1.13 release notes "The API is provisional, we do not yet guarantee
> backward compatibility as modifications may be made pending feedback." That
> said, let's give it a try!
> If any changes are necessary, I expect it would likely relate to how we
> handle interactions between different types. That's where we spent the
> majority of the design effort, but debate is a poor substitute for
> experience. I would be very surprised if the basic cases (one argument or
> two arguments of the same type) need any changes.
I'd like to branch NumPy 1.14 soon. Before doing so, I'd like to make sure
at a minimum that
1) Changes in array print formatting are done.
2) Proposed deprecations have been made.
If there are other things that folks see as essential, now is the time to
speak up.
Dear SciPythonists and NumPythonists,
FOSDEM is a free event for software developers to meet, share ideas and
collaborate. Every year, 6,500+ developers of free and open source software
from all over the world gather at the event in Brussels.
For FOSDEM 2018, we will try the new concept of a virtual Python-devroom: there
is no dedicated Python room but instead, we promote the presence of Python in
all devrooms. We hope to have at least one Python talk in every devroom (Yes,
even in Perl, Ada, Go and Rust devrooms ;-) ).
How can you help to highlight the Python community at Python-FOSDEM 2018?
Propose your talk in the closest related devroom:
Not all devrooms are language-specific and a number of topics come to mind for
data and science participants:
"Monitoring & Cloud devroom" https://lists.fosdem.org/pipermail/fosdem/2017-October/002631.html
"HPC, Big Data, and Data Science"
Most calls for contributions end around the 24th of November.
Send a copy of your proposal to python-devroom AT lists.fosdem DOT org. We will
publish a dedicated schedule for Python on https://python-fosdem.org/.
A dinner will also be organized, stay tuned.
We are waiting for your talk proposals.
The Python-FOSDEM committee