Error handling in a ufunc inner loop.
This is a new thread to address the question of error handling in a ufunc loop that was brought up in the thread on handling core dimensions of length zero. I'm attempting to answer my own question about the idiomatic way to handle an error in an inner loop. The use of the GIL with a ufunc loop is documented at https://numpy.org/devdocs/reference/internals.code-explanations.html#functio... So an inner loop is running without the GIL if the macro NPY_ALLOW_THREADS is defined and the loop is not an object-type loop. If the inner loop is running without the GIL, it must acquire the GIL before calling, say, PyErr_SetString to set an exception. The NumPy macros for acquiring the GIL are documented at https://docs.scipy.org/doc/numpy/reference/c-api.array.html#group-2 These macros are defined in numpy/core/include/numpy/ndarraytypes.h. If NPY_ALLOW_THREADS is defined, these macros wrap calls to PyGILState_Ensure() and PyGILState_Release() ( https://docs.python.org/3/c-api/init.html#non-python-created-threads): ``` #define NPY_ALLOW_C_API_DEF PyGILState_STATE __save__; #define NPY_ALLOW_C_API do {__save__ = PyGILState_Ensure();} while (0); #define NPY_DISABLE_C_API do {PyGILState_Release(__save__);} while (0); ``` If NPY_ALLOW_THREADS is not defined, those macros are defined with empty values. Now suppose I want to change the following inner loop to set an exception instead of returning nan when the input is negative: ``` static void logfactorial_loop(char **args, npy_intp *dimensions, npy_intp* steps, void* data) { char *in = args[0]; char *out = args[1]; npy_intp in_step = steps[0]; npy_intp out_step = steps[1]; for (npy_intp i = 0; i < dimensions[0]; ++i, in += in_step, out += out_step) { int64_t x = *(int64_t *)in; if (x < 0) { *((double *)out) = NAN; } else { *((double *)out) = logfactorial(x); } } } ``` Based on the documentation linked above, the changed inner loop is simply: ``` static void logfactorial_loop(char **args, npy_intp *dimensions, npy_intp* steps, void* data) { char *in = args[0]; char *out = args[1]; npy_intp in_step = steps[0]; npy_intp out_step = steps[1]; for (npy_intp i = 0; i < dimensions[0]; ++i, in += in_step, out += out_step) { int64_t x = *(int64_t *)in; if (x < 0) { NPY_ALLOW_C_API_DEF NPY_ALLOW_C_API PyErr_SetString(PyExc_ValueError, "math domain error in logfactorial: x < 0"); NPY_DISABLE_C_API return; } else { *((double *)out) = logfactorial(x); } } } ``` That worked as expected, but I haven't tried it yet with a NumPy installation where NPY_ALLOW_THREADS is not defined. Is that change correct? Would that be considered the (or an) idiomatic way to handle errors in an inner loop? Are there any potential problems that I'm missing? Warren
On 9/29/19, Warren Weckesser <warren.weckesser@gmail.com> wrote:
This is a new thread to address the question of error handling in a ufunc loop that was brought up in the thread on handling core dimensions of length zero. I'm attempting to answer my own question about the idiomatic way to handle an error in an inner loop.
The use of the GIL with a ufunc loop is documented at
https://numpy.org/devdocs/reference/internals.code-explanations.html#functio...
So an inner loop is running without the GIL if the macro NPY_ALLOW_THREADS is defined and the loop is not an object-type loop.
If the inner loop is running without the GIL, it must acquire the GIL before calling, say, PyErr_SetString to set an exception. The NumPy macros for acquiring the GIL are documented at
https://docs.scipy.org/doc/numpy/reference/c-api.array.html#group-2
These macros are defined in numpy/core/include/numpy/ndarraytypes.h. If NPY_ALLOW_THREADS is defined, these macros wrap calls to PyGILState_Ensure() and PyGILState_Release() ( https://docs.python.org/3/c-api/init.html#non-python-created-threads):
``` #define NPY_ALLOW_C_API_DEF PyGILState_STATE __save__; #define NPY_ALLOW_C_API do {__save__ = PyGILState_Ensure();} while (0); #define NPY_DISABLE_C_API do {PyGILState_Release(__save__);} while (0); ```
If NPY_ALLOW_THREADS is not defined, those macros are defined with empty values.
Now suppose I want to change the following inner loop to set an exception instead of returning nan when the input is negative:
``` static void logfactorial_loop(char **args, npy_intp *dimensions, npy_intp* steps, void* data) { char *in = args[0]; char *out = args[1]; npy_intp in_step = steps[0]; npy_intp out_step = steps[1];
for (npy_intp i = 0; i < dimensions[0]; ++i, in += in_step, out += out_step) { int64_t x = *(int64_t *)in; if (x < 0) { *((double *)out) = NAN; } else { *((double *)out) = logfactorial(x); } } } ```
Based on the documentation linked above, the changed inner loop is simply:
``` static void logfactorial_loop(char **args, npy_intp *dimensions, npy_intp* steps, void* data) { char *in = args[0]; char *out = args[1]; npy_intp in_step = steps[0]; npy_intp out_step = steps[1];
for (npy_intp i = 0; i < dimensions[0]; ++i, in += in_step, out += out_step) { int64_t x = *(int64_t *)in; if (x < 0) { NPY_ALLOW_C_API_DEF NPY_ALLOW_C_API PyErr_SetString(PyExc_ValueError, "math domain error in logfactorial: x < 0"); NPY_DISABLE_C_API return; } else { *((double *)out) = logfactorial(x); } } } ```
That worked as expected, but I haven't tried it yet with a NumPy installation where NPY_ALLOW_THREADS is not defined.
Is that change correct? Would that be considered the (or an) idiomatic way to handle errors in an inner loop? Are there any potential problems that I'm missing?
Sebastian Berg pointed out to me that exactly this pattern is used in NumPy, for example, https://github.com/numpy/numpy/blob/68bd6e359a6b0863acf39cad637e1444d78eabd0... So I'll take that as a yes, that's the way (or at least a way) to do it. Warren
Warren
participants (1)
-
Warren Weckesser