[Numpy-discussion] failure to register ufunc loops for user defined types

mark florisson markflorisson88 at gmail.com
Mon Dec 5 12:59:49 EST 2011


On 5 December 2011 17:57, mark florisson <markflorisson88 at gmail.com> wrote:
> On 5 December 2011 17:48, Mark Wiebe <mwwiebe at gmail.com> wrote:
>> On Mon, Dec 5, 2011 at 9:37 AM, mark florisson <markflorisson88 at gmail.com> wrote:
>>>
>>> On 5 December 2011 17:25, Mark Wiebe <mwwiebe at gmail.com> wrote:
>>> > On Sun, Dec 4, 2011 at 11:37 PM, Geoffrey Irving <irving at naml.us> wrote:
>>> >>
>>> >> <snip>
>>> >>
>>> >>
>>> >> Back to the bugs: here's a branch with all the changes I needed to get
>>> >> rational arithmetic to work:
>>> >>
>>> >>    https://github.com/girving/numpy
>>> >>
>>> >> I discovered two more after the last email.  One is another simple 0
>>> >> vs. 1 bug, and another is somewhat optional:
>>> >>
>>> >> commit 730b05a892371d6f18d9317e5ae6dc306c0211b0
>>> >> Author: Geoffrey Irving <irving at naml.us>
>>> >> Date:   Sun Dec 4 20:03:46 2011 -0800
>>> >>
>>> >>    After loops, check for PyErr_Occurred() even if needs_api is 0
>>> >>
>>> >>    For certain types of user defined classes, casting and ufunc loops
>>> >>    normally run without the Python API, but occasionally need to throw
>>> >>    an error.  Currently we assume that !needs_api means no error occurs.
>>> >>    However, the fastest way to implement such loops is to run without
>>> >>    the GIL normally and use PyGILState_Ensure/Release if an error
>>> >>    occurs.
>>> >>
>>> >>    In order to support this usage pattern, change all post-loop checks
>>> >>    from
>>> >>
>>> >>        needs_api && PyErr_Occurred()
>>> >>
>>> >>    to simply
>>> >>
>>> >>        PyErr_Occurred()
>>> >
>>> >
>>> > To support this properly, I think we would need to convert needs_api
>>> > into an enum with this hybrid mode as another case. While it isn't done
>>> > currently, I was imagining using a thread pool to multithread the
>>> > trivially data-parallel operations when needs_api is false, and I
>>> > suspect that PyGILState_Ensure/Release would trigger undefined behavior
>>> > in a thread created entirely outside of the Python system.
>>>
>>> PyGILState_Ensure/Release can be safely used by non-python threads
>>> with the only requirement that the GIL has been initialized previously
>>> in the main thread (PyEval_InitThreads).
>>
>>
>> Is there a way this could efficiently be used to propagate any errors back
>> to the main thread, for example using TBB as the thread pool? The innermost
>> task code which calls the inner loop can't call PyErr_Occurred() without
>> first calling PyGILState_Ensure itself, which would kill utilization.
>
> No, there is no way these things can be efficient, as the GIL is
> likely contended anyway (I wasn't making a case for these functions,
> just wanted to clarify). There is in fact an additional problem:
> PyGILState_Ensure initializes a threadstate, you set an exception,
> and when you call PyGILState_Release the threadstate is deleted along
> with the exception, before you even have a chance to check with
> PyErr_Occurred().

To clarify, this case will only happen if you're doing this from a
non-Python thread that doesn't have a threadstate to begin with.

> For cython.parallel I worked around this by calling PyGILState_Ensure
> (to initialize the thread state), followed immediately by
> Py_BEGIN_ALLOW_THREADS before starting any work. You then have to
> fetch the exception and restore it in another thread when you want to
> propagate it. It's a total mess, it's inefficient and if you can avoid
> it you should.
>
>> Maybe this is an ABI problem in NumPy that needs to be fixed, to mandate
>> that inner loops always return an error code and disallow them from setting
>> the Python exception state without returning failure.
>
> That would likely be the best thing.
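
As a rough illustration of that idea (inner loops that report failure
through a return value instead of the Python exception state), here is a
minimal sketch in plain C. It is not NumPy's actual ufunc loop signature:
the names checked_loop_fn, int32_add_checked and run_add are hypothetical,
and int32 addition with an overflow check merely stands in for something
like rational arithmetic. The loop never touches the Python API, so it can
run without the GIL; the caller, once it holds the GIL again, would be the
one to turn the message into a Python exception.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inner-loop contract (an assumption, not NumPy's ABI):
 * return 0 on success, -1 on failure. On failure *errmsg points to a
 * static string. No Python API is called, so no GIL is needed. */
typedef int (*checked_loop_fn)(char **args, const ptrdiff_t *steps,
                               size_t n, const char **errmsg);

/* Strided int32 addition that reports overflow via its return value. */
static int int32_add_checked(char **args, const ptrdiff_t *steps,
                             size_t n, const char **errmsg)
{
    char *a = args[0], *b = args[1], *out = args[2];
    for (size_t i = 0; i < n; i++) {
        /* Widen to 64 bits so the sum itself cannot overflow. */
        int64_t sum = (int64_t)*(int32_t *)a + (int64_t)*(int32_t *)b;
        if (sum > INT32_MAX || sum < INT32_MIN) {
            *errmsg = "overflow in int32 add";
            return -1; /* stop early; the caller decides how to raise */
        }
        *(int32_t *)out = (int32_t)sum;
        a += steps[0];
        b += steps[1];
        out += steps[2];
    }
    return 0;
}

/* Hypothetical driver standing in for the code that calls the loop.
 * It only forwards the status; raising a Python exception would happen
 * later, in a thread that holds the GIL. */
static int run_add(int32_t *a, int32_t *b, int32_t *out, size_t n,
                   const char **errmsg)
{
    checked_loop_fn loop = int32_add_checked;
    char *args[3] = {(char *)a, (char *)b, (char *)out};
    const ptrdiff_t steps[3] = {(ptrdiff_t)sizeof *a, (ptrdiff_t)sizeof *b,
                                (ptrdiff_t)sizeof *out};
    *errmsg = NULL;
    return loop(args, steps, n, errmsg);
}
```

With this shape, a thread-pool task only has to check an int and pass a
pointer back to the main thread, so no worker ever needs
PyGILState_Ensure or PyErr_Occurred().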
>
>> -Mark
>>
>>>
>>>
>>> > For comparison, I created a special mechanism for simplified
>>> > multi-threaded exceptions in the nditer, in the 'errmsg' parameter:
>>> >
>>> >
>>> > http://docs.scipy.org/doc/numpy/reference/c-api.iterator.html#NpyIter_GetIterNext
>>> >
>>> > Also worth considering is the fact that the PyGILState API is
>>> > incompatible with multiple embedded interpreters. Maybe that's not
>>> > something anyone does with NumPy, though.
>>> >
>>> > -Mark
>>> >
>>> >>
>>> >>
>>> >> Geoffrey
>>> >> _______________________________________________
>>> >> NumPy-Discussion mailing list
>>> >> NumPy-Discussion at scipy.org
>>> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion


