Releasing the GIL in ufuncs dealing with object arrays
Hi all,
PyGEOS (https://github.com/caspervdw/pygeos) is an experimental package implementing a set of numpy ufuncs to provide vectorized geometry functionality (wrapping the C++ GEOS library).
The way it does this is by implementing a Python extension type (pygeos.Geometry) that wraps an actual GEOSGeometry object by storing a pointer to it in the PyObject struct of the extension type. This way, we can store those objects in an object dtype array in numpy, but still access the pointer in the ufunc inner loop without needing the Python interpreter.
The single-threaded performance of the ufuncs with the approach above is very good. There doesn't seem to be an overhead of using the object array approach. However, as far as I can find in the docs, the GIL is only released in ufuncs for non-object dtypes.
So the question here is: is there a way to let numpy release the GIL in such a case nonetheless? Although the array holds Python objects, the ufunc inner loop only accesses a static attribute of the object struct, not needing any explicit Python interaction.
Best,
Joris
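To make the "pointer stored in the PyObject struct" idea concrete, here is a minimal sketch that mirrors such a struct with ctypes. The names GeometryObject and read_ptr are hypothetical, and the two header fields assume the object header of a default (non-debug, non-free-threaded) CPython build; this is not the actual PyGEOS definition.

```python
import ctypes


class GeometryObject(ctypes.Structure):
    """Hypothetical mirror of an extension-type struct: object header plus one pointer."""

    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # reference count (assumed default CPython header)
        ("ob_type", ctypes.c_void_p),     # pointer to the type object (assumed header field)
        ("ptr", ctypes.c_void_p),         # stand-in for the wrapped GEOSGeometry pointer
    ]


def read_ptr(address):
    """Read the pointer field of the struct at `address` with a plain memory access."""
    return GeometryObject.from_address(address).ptr


# Build a fake struct in memory and read the pointer back by address only.
geom = GeometryObject(1, None, 0xDEADBEEF)
assert read_ptr(ctypes.addressof(geom)) == 0xDEADBEEF
```

The point of the sketch is that reading `ptr` is a fixed-offset memory load: no attribute lookup or other interpreter machinery is involved once the address of the object is known.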
Hi Joris,
On Mon, 2019-08-19 at 17:35 +0200, Joris Van den Bossche wrote:
> Hi all,
> PyGEOS (https://github.com/caspervdw/pygeos) is an experimental package implementing a set of numpy ufuncs to provide vectorized geometry functionality (wrapping the C++ GEOS library).
> The way it does this is by implementing a Python extension type (pygeos.Geometry) that wraps an actual GEOSGeometry object by storing a pointer to it in the PyObject struct of the extension type. This way, we can store those objects in an object dtype array in numpy, but still access the pointer in the ufunc inner loop without needing the Python interpreter.
> The single-threaded performance of the ufuncs with the approach above is very good. There doesn't seem to be an overhead of using the object array approach. However, as far as I can find in the docs, the GIL is only released in ufuncs for non-object dtypes.
> So the question here is: is there a way to let numpy release the GIL in such a case nonetheless? Although the array holds Python objects, the ufunc inner loop only accesses a static attribute of the object struct, not needing any explicit Python interaction.
Hmmm, interesting use case. No, I do not think there is currently a reasonable way to do this (I think there may be ways to hack it). Even when all access to the objects is safe by itself, you still have the problem that the object stored inside the array could be replaced (and invalidated) at any time if you run multithreaded.
We would like to type such objects in the future, but even then, I am not sure how to make things safe against race conditions if elements are replaced (and deleted).
This is an interesting use case, since arrays of pointers (or specific pyobjects) will always have this type of issue, and I am not sure how you would avoid it (a cheap lock on the object itself would probably work, but even if it is cheap, it is probably fairly expensive?).
Best,
Sebastian
> Best,
> Joris
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
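The replacement hazard described in the reply above can be illustrated in pure Python. This is only an analogy: a weak reference plays the role of a borrowed C pointer, a list plays the role of the object array, and the Geometry class and function name are hypothetical. The final result relies on CPython's immediate reference counting.

```python
import weakref


class Geometry:
    """Hypothetical stand-in for pygeos.Geometry."""

    def __init__(self, name):
        self.name = name


def borrowed_pointer_survives_replacement():
    arr = [Geometry("a")]           # plays the role of an object array
    borrowed = weakref.ref(arr[0])  # like a borrowed pointer: does not keep the object alive
    arr[0] = Geometry("b")          # "another thread" replaces the element
    # On CPython the old object is freed as soon as its last reference is gone,
    # so the borrowed handle is now dead.
    return borrowed() is not None
```

On CPython this function returns False. In a C inner loop running without the GIL, the equivalent situation would be a dangling pointer rather than a clean None.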
Hi Sebastian,
Thanks for the answer!
On Mon, 19 Aug 2019 at 17:57, Sebastian Berg <sebastian@sipsolutions.net> wrote:
> ...
> Hmmm, interesting use case. No, I do not think there is currently a reasonable way to do this (I think there may be ways to hack it). Even when all access to the objects is safe by itself, you still have the problem that the object stored inside the array could be replaced (and invalidated) at any time if you run multithreaded.
Would it help to have a custom dtype that ensures that all objects in the array are of this specific extension type? (I don't know whether a custom dtype, done in C like the quaternion examples, is possible for storing Python objects.)
> We would like to type such objects in the future, but even then, I am not sure how to make things safe against race conditions if elements are replaced (and deleted).
> This is an interesting use case, since arrays of pointers (or specific pyobjects) will always have this type of issue, and I am not sure how you would avoid it (a cheap lock on the object itself would probably work, but even if it is cheap, it is probably fairly expensive?).
Currently, we are thinking of doing two loops in the ufunc: the first one to get all the pointers into a C array, and the second one, with the GIL manually released, to do the actual operation on the array of pointers. See https://github.com/caspervdw/pygeos/issues/7 for example code. From a quick experiment, that seems to give only a small overhead (in the single-threaded case).
That of course still has the same problems as you mentioned (although in our case, we are, in principle, the holders of the array and know what it contains, and the individual extension type objects are not mutable), but then at least it is our own responsibility to make sure that the array contains the correct objects and is not mutated.
Joris
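The two-loop idea above can be sketched as a pure-Python analogy. Everything here is an assumption made for illustration: a threading.Lock named gil stands in for the real GIL, Geometry is a hypothetical immutable wrapper whose ptr attribute stands in for the GEOSGeometry pointer, and two_pass is not the code from the linked issue.

```python
import threading

gil = threading.Lock()  # hypothetical stand-in for the real GIL


class Geometry:
    """Hypothetical immutable wrapper; `ptr` stands in for the GEOSGeometry pointer."""

    __slots__ = ("ptr",)

    def __init__(self, ptr):
        object.__setattr__(self, "ptr", ptr)

    def __setattr__(self, name, value):
        raise AttributeError("Geometry is immutable")


def two_pass(objects, op):
    # Pass 1 (lock held): check the element types and copy the pointers out.
    with gil:
        held = list(objects)  # keeps a reference to every element for the whole call
        for obj in held:
            if not isinstance(obj, Geometry):
                raise TypeError("expected only Geometry objects")
        ptrs = [obj.ptr for obj in held]
    # Pass 2 (lock released): operate on the plain pointer values only.
    results = [op(p) for p in ptrs]
    del held  # drop the references once the work is done
    return results
```

For example, `two_pass([Geometry(1), Geometry(2)], lambda p: p * 10)` yields `[10, 20]`. The design choice is that pass 2 never touches the original container, so later replacement of its elements cannot affect the values being processed.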
On Tue, 2019-08-20 at 22:30 +0200, Joris Van den Bossche wrote:
> Hi Sebastian,
> Thanks for the answer!
> On Mon, 19 Aug 2019 at 17:57, Sebastian Berg <sebastian@sipsolutions.net> wrote:
>> ...
>> Hmmm, interesting use case. No, I do not think there is currently a reasonable way to do this (I think there may be ways to hack it). Even when all access to the objects is safe by itself, you still have the problem that the object stored inside the array could be replaced (and invalidated) at any time if you run multithreaded.
> Would it help to have a custom dtype that ensures that all objects in the array are of this specific extension type? (I don't know whether a custom dtype, done in C like the quaternion examples, is possible for storing Python objects.)
You can do a custom dtype like the quaternion one. We are working on creating new custom dtypes, but it will be a while until that lands. One thing I am not quite sure about is whether it is currently possible to do an object-backed dtype (the issue is whether the reference counting is done, especially without adding other issues). I could have a look if you like. Making that easy is very high up on the "what I want in the future" list.
>> We would like to type such objects in the future, but even then, I am not sure how to make things safe against race conditions if elements are replaced (and deleted).
>> This is an interesting use case, since arrays of pointers (or specific pyobjects) will always have this type of issue, and I am not sure how you would avoid it (a cheap lock on the object itself would probably work, but even if it is cheap, it is probably fairly expensive?).
> Currently, we are thinking of doing two loops in the ufunc: the first one to get all the pointers into a C array, and the second one, with the GIL manually released, to do the actual operation on the array of pointers. See https://github.com/caspervdw/pygeos/issues/7 for example code. From a quick experiment, that seems to give only a small overhead (in the single-threaded case).
I suppose that should work. If you are within an inner loop, you have only limited control over the chunking/buffersize though, so in the worst case you might be releasing the GIL very often.
I suppose in the event that the array is not writeable, you could actually release the GIL. This is something that we are thinking about enabling full control over, although it is not high on the list of priorities right now (basically, my plan/thought is to start off without allowing such things, but to keep it open for later addition).
In practice, I suppose that such objects (and ufuncs) can be fairly heavy, so that even indiscriminately copying the full input arrays really is not a big issue as such.
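As a rough illustration of the chunking concern above: if the GIL is released once per inner-loop call, the number of release/acquire pairs is the number of chunks the input is split into. The helper below is hypothetical. The default of 8192 elements is NumPy's default buffer size and is used here only as an illustrative value; whether a given ufunc call is chunked at all depends on how NumPy iterates over the operands.

```python
def gil_release_count(n_elements, chunk_size=8192):
    """Release/acquire pairs if the GIL is released once per inner-loop call.

    Assumes the input is processed in chunks of at most `chunk_size` elements.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-n_elements // chunk_size)


assert gil_release_count(1_000_000) == 123          # 122 full chunks plus one partial chunk
assert gil_release_count(1_000_000, 1) == 1_000_000  # worst case: one element per call
```

The second case is the worst-case scenario mentioned above: with one element per inner-loop call, the cost of releasing and re-acquiring the GIL is paid for every single element.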
> That of course still has the same problems as you mentioned (although in our case, we are, in principle, the holders of the array and know what it contains, and the individual extension type objects are not mutable), but then at least it is our own responsibility to make sure that the array contains the correct objects and is not mutated.
I understood it as: you copy + incref. After that, all seems OK to me, unless your object itself is mutable (in a non-threadsafe way).
Best,
Sebastian
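A small pure-Python sketch of why "copy + incref" is enough to keep the elements valid: copying the container takes a new reference to every element, so replacing a slot in the original no longer frees the old object. The Geometry class and the function name are hypothetical, and the second half of the result relies on CPython's immediate reference counting.

```python
import weakref


class Geometry:
    """Hypothetical stand-in for an immutable pygeos.Geometry."""

    def __init__(self, name):
        self.name = name


def liveness_with_snapshot():
    arr = [Geometry("a")]
    watcher = weakref.ref(arr[0])  # observes the object without keeping it alive
    snapshot = list(arr)           # copy + incref: one extra reference per element
    arr[0] = Geometry("b")         # the original slot is replaced
    alive_while_held = watcher() is not None     # the snapshot still holds the old object
    del snapshot                                 # decref: release the held references
    alive_after_release = watcher() is not None  # freed at this point on CPython
    return alive_while_held, alive_after_release
```

On CPython this returns `(True, False)`: the object stays valid exactly as long as the snapshot holds its reference, which is the window in which the GIL-free work would run.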
> Joris
Participants (2): Joris Van den Bossche, Sebastian Berg