PyWeakref_GetObject() borrows its reference from... whom?

The documentation for PyWeakref_GetObject() states:
Return value: Borrowed reference.
Return the referenced object from a weak reference, ref. If the referent is no longer live, returns Py_None.
Given that the weakref doesn't have a reference to the object--merely a weak reference, different thing--whose reference is it borrowing?
FWIW, yes, this is playing merry hell with the Gilectomy. If there are two threads, and one calls PyWeakref_GetObject(obj), and there's only one reference to obj, and the other thread blows it away... now what? It's my contention that this API is simply untenable under the Gilectomy, and that it needs to change to returning a new (strong) reference.
//arry/

It's my contention that this API is simply untenable under the Gilectomy
Yeah, I agree with your contention. The only circumstance PyWeakref_GetObject would still be valid without a GIL is if you happened to know you had a reference elsewhere that was keeping it alive. But if you did, you probably wouldn't need to use the weakref. If you don't, then with GILectomy you have no thread-safe options.
Trivial bikeshedding: rather than change the existing API, wouldn't it be better to add a new API, and not support the old one in GILectomized CPython? Introducing memory leaks would be a little unfortunate, and it's hard to audit C code for correct use of refcounted pointers even *before* you change the ownership of returned pointers across versions.
w.r.t. the exact question:
whose reference is it borrowing?
I think this is a red herring sort of question. "borrowed" only means "unowned". But anyway, we borrowed from the weakref. (Borrowing from somebody doesn't imply they own a reference -- we can borrow from a borrowed reference, for example, and this is common.)
The term "borrowed" is supposed to imply a sensible scope during which you're free to use the object, and weakrefs don't have that (except for what is granted by the GIL), so this does sound wacky. I bet it was for performance.
-- Devin

On 10/10/2016 11:05 AM, Devin Jeanpierre wrote:
whose reference is it borrowing?
I think this is a red herring sort of question. "borrowed" only means "unowned". But anyway, we borrowed from the weakref. (Borrowing from somebody doesn't imply they own a reference -- we can borrow from a borrowed reference, for example, and this is common.)
Huh? In all other circumstances, a "borrowed" reference is exactly that: X has a reference, and you are relying on X's reference to keep the object alive. Borrowing from a borrowed reference is simply a chain of these; Z borrows from Y, Y borrows from X, and X is the original person who did the incref. But you're borrowing from something specific, somebody who the API guarantees has a legitimate reference on the object and won't drop it while you're using it. I bet for every other spot in the API I can tell you from whom you're borrowing the reference.
In contrast, the "borrowed" reference returned by PyWeakRef_GetObject() seems to be "borrowed" from some unspecified entity. The fact that the object is live in Python directly implies that, yes, *somebody* must have a reference, somewhere. But ISTM (and apparently you) that this is relying on the GIL preventing that unknown other actor from dropping their reference while you've borrow it. A guarantee that the post-Gilectomy Python interpreter can no longer make!
In any case, I see nothing in the documentation that suggests "borrowed only means unowned" as you suggest. In contrast, the documentation seems to suggest that the metaphor is how I understood it; that when you "borrow" a reference, there is another object who has a reference and you're relying on their reference to keep the object alive.
https://docs.python.org/3.6/extending/extending.html#reference-counts
//arry/

On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings larry@hastings.org wrote:
Huh? In all other circumstances, a "borrowed" reference is exactly that: X has a reference, and you are relying on X's reference to keep the object alive. Borrowing from a borrowed reference is simply a chain of these; Z borrows from Y, Y borrows from X, and X is the original person who did the incref. But you're borrowing from something specific, somebody who the API guarantees has a legitimate reference on the object and won't drop it while you're using it. I bet for every other spot in the API I can tell you from whom you're borrowing the reference.
Okay. Here's a test:
PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index) Return value: Borrowed reference.
Presumably you own a reference to the list itself before you call this, and the list has a reference to its items. But suppose another thread clear()s the list immediately after you call this; whose reference are you borrowing now? The list's is gone.
Or is this another API that will have to change post gilectomy?
ChrisA

On 2016-10-10 10:45, Chris Angelico wrote:
On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings larry@hastings.org wrote:
Huh? In all other circumstances, a "borrowed" reference is exactly that: X has a reference, and you are relying on X's reference to keep the object alive. Borrowing from a borrowed reference is simply a chain of these; Z borrows from Y, Y borrows from X, and X is the original person who did the incref. But you're borrowing from something specific, somebody who the API guarantees has a legitimate reference on the object and won't drop it while you're using it. I bet for every other spot in the API I can tell you from whom you're borrowing the reference.
Okay. Here's a test:
PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index) Return value: Borrowed reference.
Presumably you own a reference to the list itself before you call this, and the list has a reference to its items. But suppose another thread clear()s the list immediately after you call this; whose reference are you borrowing now? The list's is gone.
Or is this another API that will have to change post gilectomy?
Couldn't the reference to the list also be borrowed?
If you lookup something in a dict, it'll be a borrowed reference.
If the dict is globals() and there's no GIL, another thread could delete the item while your code had the borrowed reference.
It looks like there might be a lot that will need to changed post gilectomy!

On 10 October 2016 at 17:49, MRAB python@mrabarnett.plus.com wrote:
If you lookup something in a dict, it'll be a borrowed reference.
If the dict is globals() and there's no GIL, another thread could delete the item while your code had the borrowed reference.
It looks like there might be a lot that will need to changed post gilectomy!
It seems to me that the whole concept of a borrowed reference may be risky in a post-GIL world. There may be occasional cases where it's possible to prove that borrowing is safe, but I suspect they'll be pretty rare.
Paul

On Tue, Oct 11, 2016 at 4:03 AM, Paul Moore p.f.moore@gmail.com wrote:
On 10 October 2016 at 17:49, MRAB python@mrabarnett.plus.com wrote:
If you lookup something in a dict, it'll be a borrowed reference.
If the dict is globals() and there's no GIL, another thread could delete the item while your code had the borrowed reference.
It looks like there might be a lot that will need to changed post gilectomy!
It seems to me that the whole concept of a borrowed reference may be risky in a post-GIL world. There may be occasional cases where it's possible to prove that borrowing is safe, but I suspect they'll be pretty rare.
Not really. If you have an object that forever owns a reference to another object, it's safe to return borrowed references. A quick search for 'borrowed' showed up these perfectly safe[1] examples:
PyObject* PyFunction_GetGlobals(PyObject *op) Return value: Borrowed reference. -- a function's __globals__ attribute is read-only -- though its __code__ is not, so possibly PyFunction_GetCode may be dangerous??
PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos) Return value: Borrowed reference. -- as mentioned previously, tuples are immutable and thus safe
PyObject* PyMethod_Function(PyObject *meth) Return value: Borrowed reference. PyObject* PyMethod_Self(PyObject *meth) Return value: Borrowed reference. -- a bound method's __func__ and __self__ attributes are read-only
In these cases, it's easy to see what reference you're borrowing, and it's not a problem.
ChrisA
[1] As far as I know!

On Tue, Oct 11, 2016 at 4:03 AM, Paul Moore p.f.moore@gmail.com wrote:
If you have an object that forever owns a reference to another object, it's safe to return borrowed references.
Even if there are such cases, it seems we're going to have to be a lot more careful about identifying them.
It might be safer not to have any APIs that return borrowed references, or at least name them in a way that make it clear they're dangerous and only to be used with extreme care.
Having things like PyDict_GetItem where the *normal* version of the API returns a borrowed reference seems like a footgun.

On Mon, Oct 10, 2016 at 10:03 AM, Paul Moore p.f.moore@gmail.com wrote:
On 10 October 2016 at 17:49, MRAB python@mrabarnett.plus.com wrote:
If you lookup something in a dict, it'll be a borrowed reference.
If the dict is globals() and there's no GIL, another thread could delete the item while your code had the borrowed reference.
It looks like there might be a lot that will need to changed post gilectomy!
It seems to me that the whole concept of a borrowed reference may be risky in a post-GIL world. There may be occasional cases where it's possible to prove that borrowing is safe, but I suspect they'll be pretty rare.
I don't think it's that bad...
In a post-GIL world, the problem cases we're talking about like deleting items from containers all require some sort of locking, even before we start thinking about borrowed references. If two threads start mucking around resizing the internals of a list or dict at the same time, then you are unlikely to go to space today. IIRC to handle this gilectomy adds per-object mutexes that you have to hold whenever you're mucking around with that object's internals.
If we say that borrowing reference from a dict is one of the things that counts as mucking about with that dict, and thus requires you to hold the dict lock for as long as you hold the borrowed reference, then all should be well.
I assume Larry is way ahead of us on this, which is why he's suddenly confusing us all by emphasizing that when you borrow a reference you should know who you're borrowing it from, so you know which lock to hold.
-n

On 10 October 2016 at 18:50, Nathaniel Smith njs@pobox.com wrote:
I assume Larry is way ahead of us on this,
Yeah, I'd imagine you're right on that :-) Paul

On 2016-10-10 18:50, Nathaniel Smith wrote:
On Mon, Oct 10, 2016 at 10:03 AM, Paul Moore p.f.moore@gmail.com wrote:
On 10 October 2016 at 17:49, MRAB python@mrabarnett.plus.com wrote:
If you lookup something in a dict, it'll be a borrowed reference.
If the dict is globals() and there's no GIL, another thread could delete the item while your code had the borrowed reference.
It looks like there might be a lot that will need to changed post gilectomy!
It seems to me that the whole concept of a borrowed reference may be risky in a post-GIL world. There may be occasional cases where it's possible to prove that borrowing is safe, but I suspect they'll be pretty rare.
I don't think it's that bad...
In a post-GIL world, the problem cases we're talking about like deleting items from containers all require some sort of locking, even before we start thinking about borrowed references. If two threads start mucking around resizing the internals of a list or dict at the same time, then you are unlikely to go to space today. IIRC to handle this gilectomy adds per-object mutexes that you have to hold whenever you're mucking around with that object's internals.
If we say that borrowing reference from a dict is one of the things that counts as mucking about with that dict, and thus requires you to hold the dict lock for as long as you hold the borrowed reference, then all should be well.
I assume Larry is way ahead of us on this, which is why he's suddenly confusing us all by emphasizing that when you borrow a reference you should know who you're borrowing it from, so you know which lock to hold.
Instead of locking the object, could we keep the GIL, but have it normally released?
A thread could then still call a function such as PyWeakref_GetObject() that returns a borrowed reference, but only if it's holding the GIL. It would be able to INCREF the reference before releasing the GIL again.
There could still be a new function that returned a strong reference.

On Mon, Oct 10, 2016, at 14:04, MRAB wrote:
Instead of locking the object, could we keep the GIL, but have it normally released?
A thread could then still call a function such as PyWeakref_GetObject() that returns a borrowed reference, but only if it's holding the GIL. It would be able to INCREF the reference before releasing the GIL again.
So, what stops the other thread which never asks for the GIL from blowing away the reference? Or is this a special kind of lock that you can "assert isn't locked" without locking it for yourself, and INCREF/DECREF does so?

On Tue, Oct 11, 2016 at 5:24 AM, Random832 random832@fastmail.com wrote:
On Mon, Oct 10, 2016, at 14:04, MRAB wrote:
Instead of locking the object, could we keep the GIL, but have it normally released?
A thread could then still call a function such as PyWeakref_GetObject() that returns a borrowed reference, but only if it's holding the GIL. It would be able to INCREF the reference before releasing the GIL again.
So, what stops the other thread which never asks for the GIL from blowing away the reference? Or is this a special kind of lock that you can "assert isn't locked" without locking it for yourself, and INCREF/DECREF does so?
"assert isn't locked" is pretty cheap (race conditions wouldn't be applicable here, as you would have to completely obtain the GIL before even attempting a dangerous operation), but what would INCREF/DECREF do if the GIL is locked by another thread?
Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory deallocation lock". While it is held (by any thread - it's a guard, more than a lock), Py_DECREF will not actually deallocate memory - objects can fall to zero references without being wiped. Once the lock/guard is freed/cleared, anything that had fallen to zero is now deallocated. This probably would mean stuffing them onto a list of "doomed objects", and upon release of the guard, any doomed objects that still have no refs would get deallocated.
That would, by definition, make *all* borrowed refs legal; you can either use them and finish, or INCREF them before releasing the deallocation lock, and either way, you have a guarantee that (a) you're not messing with junk memory, and (b) it really is the object you think it is. There should be no risk of race conditions, because you would completely claim the deallocation lock before calling something that gives you a borrowed ref, meaning that the whole time the ref is borrowed, you must have that lock.
But I'm guessing other people have thought far more about this than I have.
ChrisA

On 2016-10-10 20:36, Chris Angelico wrote:
On Tue, Oct 11, 2016 at 5:24 AM, Random832 random832@fastmail.com wrote:
On Mon, Oct 10, 2016, at 14:04, MRAB wrote:
Instead of locking the object, could we keep the GIL, but have it normally released?
A thread could then still call a function such as PyWeakref_GetObject() that returns a borrowed reference, but only if it's holding the GIL. It would be able to INCREF the reference before releasing the GIL again.
So, what stops the other thread which never asks for the GIL from blowing away the reference? Or is this a special kind of lock that you can "assert isn't locked" without locking it for yourself, and INCREF/DECREF does so?
"assert isn't locked" is pretty cheap (race conditions wouldn't be applicable here, as you would have to completely obtain the GIL before even attempting a dangerous operation), but what would INCREF/DECREF do if the GIL is locked by another thread?
Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory deallocation lock". While it is held (by any thread - it's a guard, more than a lock), Py_DECREF will not actually deallocate memory - objects can fall to zero references without being wiped. Once the lock/guard is freed/cleared, anything that had fallen to zero is now deallocated. This probably would mean stuffing them onto a list of "doomed objects", and upon release of the guard, any doomed objects that still have no refs would get deallocated.
That would, by definition, make *all* borrowed refs legal; you can either use them and finish, or INCREF them before releasing the deallocation lock, and either way, you have a guarantee that (a) you're not messing with junk memory, and (b) it really is the object you think it is. There should be no risk of race conditions, because you would completely claim the deallocation lock before calling something that gives you a borrowed ref, meaning that the whole time the ref is borrowed, you must have that lock.
But I'm guessing other people have thought far more about this than I have.
I have to admit that the GIL thing was just a vague feeling I had, and I hadn't thought too deeply about it, but I think that you've crystallised it. :-)

On 10/10/2016 09:36 PM, Chris Angelico wrote:
Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory deallocation lock". While it is held (by any thread - it's a guard, more than a lock), Py_DECREF will not actually deallocate memory - objects can fall to zero references without being wiped. Once the lock/guard is freed/cleared, anything that had fallen to zero is now deallocated. This probably would mean stuffing them onto a list of "doomed objects", and upon release of the guard, any doomed objects that still have no refs would get deallocated.
If this worked, this would actually be really easy with my current "buffered reference counting" approach. I literally already have a list of doomed objects, which a separate thread queues up for each thread to process later.
(This was a necessary part of "buffered reference counting", in which reference count changes are stored in a transaction log and later committed by a single thread. Since there's only one thread making reference count changes, there's no contention, so it doesn't have to use atomic incr/decr, which is a big performance win.)
But I don't think this fixes the problem. Consider:
1. Thread A calls Q = PyList_GetItem(L, 0), which returns a borrowed reference. Thread A then gets suspended, before it has a chance to Py_INCREF(Q). 2. Thread B does L.clear(), the reference count of Q goes to 0, Q is added to the doomed objects list. 3. Thread A gets to run again. It does Py_INCREF(Q); Q's refcount is now back up to 1, as if Q had been resurrected. 4. At some point in the future the "memory deallocation lock" is released and Q is deleted.
If you say "well, just look at the reference count, and if it's 1 throw it off the doomed objects list", keep in mind that I no longer have accurate real-time reference counts. These hacks where we play games with the reference count are mostly removed in my branch.
Also, I don't know when it would ever be safe to release the "memory deallocation lock". Just because it's safe for your thread doesn't mean it's safe for another thread. And if you do it on a thread-by-thread basis, in the above example it might be safe from thread B's perspective to release its "memory deallocation lock", but as illustrated that can have an effect on thread A.
I appreciate the brainstorming but I'm not currently sanguine about this idea,
//arry/

On Tue, Oct 11, 2016 at 8:14 AM, Larry Hastings larry@hastings.org wrote:
But I don't think this fixes the problem. Consider:
Thread A calls Q = PyList_GetItem(L, 0), which returns a borrowed reference. Thread A then gets suspended, before it has a chance to Py_INCREF(Q). Thread B does L.clear(), the reference count of Q goes to 0, Q is added to the doomed objects list. Thread A gets to run again. It does Py_INCREF(Q); Q's refcount is now back up to 1, as if Q had been resurrected. At some point in the future the "memory deallocation lock" is released and Q is deleted.
If you say "well, just look at the reference count, and if it's 1 throw it off the doomed objects list", keep in mind that I no longer have accurate real-time reference counts. These hacks where we play games with the reference count are mostly removed in my branch.
That's exactly what I would have said, because I was assuming that refcounts would be accurate. I'm not sure what you mean by "play games with", but it sounds like you have a much more sophisticated system going on than I had in my head, and you'll need a correspondingly sophisticated solution to this problem.
Like I said, others have thought a LOT more about this than I have.
ChrisA

On 10/10/2016 10:38 PM, Chris Angelico wrote:
On Tue, Oct 11, 2016 at 8:14 AM, Larry Hastings larry@hastings.org wrote:
These hacks where we play games with the reference count are mostly removed in my branch.
That's exactly what I would have said, because I was assuming that refcounts would be accurate. I'm not sure what you mean by "play games with",
By "playing games with reference counts", I mean code that purposely doesn't follow the rules of reference counting. Sadly, there are special cases that apparently *are* special enough to break the rules. Which made implementing "buffered reference counting" that much harder.
I currently know of two examples of this in CPython. In both instances, an object has a reference to another object, but *deliberately* does not increase the reference count of the object, in order to prevent keeping the other object alive. The implementation relies on the GIL to preserve correctness; without a GIL, it was much harder to ensure this code was correct. (And I'm still not 100% I've done it. More thinking needed.)
Those two examples are:
1. PyWeakReference objects. The wr_object pointer--the "reference" held by the weak reference object--points to an object, but does not increment the reference count. Worse yet, as already observed, PyWeakref_GetObject() and PyWeakref_GET_OBJECT() don't increment the reference count, an inconvenient API decision from my perspective. 2. "Interned mortal" strings. When a string is both interned *and* mortal, it's stored in the static "interned" dict in unicodeobject.c--as both key and value--and then its's DECREF'd twice so those two references don't count. When the string is destroyed, unicode_dealloc resurrects the string, reinstating those two references, then removes it from the "interned" dict, then destroys the string as normal.
To support these, I've implemented what is effectively a secondary, atomic-only reference count. It seems to work. (And yes that means all objects are now 8 bytes bigger. Let me worry about memory consumption later, m'kay?)
Resurrecting object also gave me a headache in the Gilectomy with this buffered reference counting scheme, but I think I have that figured out too. When you resurrect an object, it's generally because you're going to expose it to other subsystems that may incr / decr / otherwise inspect the reference count. Which means that code may buffer reference count changes. Which means you can't immediately destroy the object anymore. So: when you resurrect, you set the new reference count, you also set a flag saying "I've already been resurrected", you pass it in to that other code, you then drop your references with Py_DECREF, and you exit. Your dealloc function will get called again later; you then see you've already done that first resurrection, and you destroy as normal. Curiously enough, the typeobject actually needs to do this twice: once for tp_finalize, once for tp_del. (Assuming I didn't completely misunderstand what the code was doing.)
My struggles continue,
//arry/

On Thu, Oct 13, 2016 at 4:43 AM Larry Hastings larry@hastings.org wrote:
On 10/10/2016 10:38 PM, Chris Angelico wrote:
On Tue, Oct 11, 2016 at 8:14 AM, Larry Hastings larry@hastings.org larry@hastings.org wrote:
These hacks where we play games with the reference count are mostly removed in my branch.
That's exactly what I would have said, because I was assuming that refcounts would be accurate. I'm not sure what you mean by "play games with",
By "playing games with reference counts", I mean code that purposely doesn't follow the rules of reference counting. Sadly, there are special cases that apparently *are* special enough to break the rules. Which made implementing "buffered reference counting" that much harder.
I currently know of two examples of this in CPython. In both instances, an object has a reference to another object, but *deliberately* does not increase the reference count of the object, in order to prevent keeping the other object alive. The implementation relies on the GIL to preserve correctness; without a GIL, it was much harder to ensure this code was correct. (And I'm still not 100% I've done it. More thinking needed.)
Those two examples are:
- PyWeakReference objects. The wr_object pointer--the "reference"
held by the weak reference object--points to an object, but does not increment the reference count. Worse yet, as already observed, PyWeakref_GetObject() and PyWeakref_GET_OBJECT() don't increment the reference count, an inconvenient API decision from my perspective.
That a PyWeakReference object does not increment the reference count is
the entire point of a weakref. The object wouldn't be destroyed and break the weak reference otherwise. Weak references could be implemented in a different manner - coordinate with the garbage collector to consider things who's only references come from weakrefs as collectable. That'd be an internal overhaul of the weakref implementation and potentially the gc.
- "Interned mortal" strings. When a string is both interned *and*
mortal, it's stored in the static "interned" dict in unicodeobject.c--as both key and value--and then its's DECREF'd twice so those two references don't count. When the string is destroyed, unicode_dealloc resurrects the string, reinstating those two references, then removes it from the "interned" dict, then destroys the string as normal.
yow. i don't even want to know the history of that one...
Resurrecting object also gave me a headache in the Gilectomy with this
buffered reference counting scheme, but I think I have that figured out too. When you resurrect an object, it's generally because you're going to expose it to other subsystems that may incr / decr / otherwise inspect the reference count. Which means that code may buffer reference count changes. Which means you can't immediately destroy the object anymore. So: when you resurrect, you set the new reference count, you also set a flag saying "I've already been resurrected", you pass it in to that other code, you then drop your references with Py_DECREF, and you exit. Your dealloc function will get called again later; you then see you've already done that first resurrection, and you destroy as normal. Curiously enough, the typeobject actually needs to do this twice: once for tp_finalize, once for tp_del. (Assuming I didn't completely misunderstand what the code was doing.)
kudos for trying to understand this. resurrection during destruction or
finalization hurts my brain even though in many ways it makes sense.
-gps

On 2016-10-10 22:14, Larry Hastings wrote:
On 10/10/2016 09:36 PM, Chris Angelico wrote:
Hmm. Here's a naughty, and maybe dangerous, theory. Obtain a "memory deallocation lock". While it is held (by any thread - it's a guard, more than a lock), Py_DECREF will not actually deallocate memory - objects can fall to zero references without being wiped. Once the lock/guard is freed/cleared, anything that had fallen to zero is now deallocated. This probably would mean stuffing them onto a list of "doomed objects", and upon release of the guard, any doomed objects that still have no refs would get deallocated.
If this worked, this would actually be really easy with my current "buffered reference counting" approach. I literally already have a list of doomed objects, which a separate thread queues up for each thread to process later.
(This was a necessary part of "buffered reference counting", in which reference count changes are stored in a transaction log and later committed by a single thread. Since there's only one thread making reference count changes, there's no contention, so it doesn't have to use atomic incr/decr, which is a big performance win.)
But I don't think this fixes the problem. Consider:
- Thread A calls Q = PyList_GetItem(L, 0), which returns a borrowed reference. Thread A then gets suspended, before it has a chance to Py_INCREF(Q).
- Thread B does L.clear(), the reference count of Q goes to 0, Q is added to the doomed objects list.
- Thread A gets to run again. It does Py_INCREF(Q); Q's refcount is now back up to 1, as if Q had been resurrected.
- At some point in the future the "memory deallocation lock" is released and Q is deleted.
If you say "well, just look at the reference count, and if it's 1 throw it off the doomed objects list", keep in mind that I no longer have accurate real-time reference counts. These hacks where we play games with the reference count are mostly removed in my branch.
Also, I don't know when it would ever be safe to release the "memory deallocation lock". Just because it's safe for your thread doesn't mean it's safe for another thread. And if you do it on a thread-by-thread basis, in the above example it might be safe from thread B's perspective to release its "memory deallocation lock", but as illustrated that can have an effect on thread A.
The deallocation lock could be a counter. Doomed objects would be collectable when it's zero.
I appreciate the brainstorming but I'm not currently sanguine about this idea,

On Tue, Oct 11, 2016 at 9:52 AM, MRAB python@mrabarnett.plus.com wrote:
Also, I don't know when it would ever be safe to release the "memory deallocation lock". Just because it's safe for your thread doesn't mean it's safe for another thread. And if you do it on a thread-by-thread basis, in the above example it might be safe from thread B's perspective to release its "memory deallocation lock", but as illustrated that can have an effect on thread A.
The deallocation lock could be a counter. Doomed objects would be collectable when it's zero.
Yeah, which is why I described it as a "guard". As long as you have atomic increment/decrement (ie if two threads simultaneously try to increment it, it WILL go up by two), it should be fine. Well, that part should, anyway.
ChrisA

On Mon, Oct 10, 2016, at 14:04, MRAB wrote:
Instead of locking the object, could we keep the GIL, but have it normally released?
A thread could then still call a function such as PyWeakref_GetObject() that returns a borrowed reference, but only if it's holding the GIL. It would be able to INCREF the reference before releasing the GIL again.
So you need to get/release the GIL just to run a slightly faster function that doesn't bother with an extra incref/defcref pair? I think anyone willing to make those changes would be willing to switch to a non-borrowing version of that same function, and do an explicit DECREF if that is really what they wanted.
On Tue, Oct 11, 2016 at 5:24 AM, Random832 <random832 at fastmail.com> wrote:
So, what stops the other thread which never asks for the GIL from blowing away the reference? Or is this a special kind of lock that you can "assert isn't locked" without locking it for yourself, and INCREF/DECREF does so?
On Mon Oct 10 15:36:59 EDT 2016, Chris Angelico wrote:
"assert isn't locked" is pretty cheap
Yeah, but so is INCREF/DECREF on memory that is almost certainly in cache anyhow, because you're using the object right next to it.
The write part hurts, particularly when trying to use multiple cores with shared memory, but any sort of indirection (even separating the refcount from the object, to allow per-core counters) ... well, it doesn't take much at all to be worse than INCREF/DECREF in even the normal case, let alone amortized across the the "drat, this object now has to be handled specially" cases.
Imagine two memory pools, one for "immortal" objects (such as None) that won't be collected, and so don't need their memory dirtied when you INCREF/DECREF. Alas, now *every* INCREF and DECREF has to branch on the address to tell whether or not it should be a no-op.
-jJ
--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ

Random832 wrote:
Or is this a special kind of lock that you can "assert isn't locked" without locking it for yourself, and INCREF/DECREF does so?
I don't think that would work. It might be unlocked at the moment you test it, but someone might lock it between then and the following INCREF/DECREF.
Locking is like pregnancy -- you can't have just a little bit of it. :-)

Nathaniel Smith wrote:
IIRC to handle this gilectomy adds per-object mutexes that you have to hold whenever you're mucking around with that object's internals.
What counts as "mucking around with the object's internals", though?
If I do the C equivalent of:
x = somedict[5] x.dosomething()
am I mucking around with the internals of somedict? Intuitively, I would not think so. But the way PyDict_GetItem currently works, this would be dangerous.

On Mon, Oct 10, 2016 at 3:27 PM, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Nathaniel Smith wrote:
IIRC to handle this gilectomy adds per-object mutexes that you have to hold whenever you're mucking around with that object's internals.
What counts as "mucking around with the object's internals", though?
If I do the C equivalent of:
x = somedict[5] x.dosomething()
am I mucking around with the internals of somedict? Intuitively, I would not think so. But the way PyDict_GetItem currently works, this would be dangerous.
I guess you already have to be aware of this to some extent -- e.g., the C version of this code is dangerous, even though in Python it would be fine:
x = somedict[5] del somedict x.dosomething()
But Larry pointed out to me separately that while the per-object locks exist and can be used, the they don't help as much as you'd hope because the goal is to *not* require C API users to change their code. So it's safe to have two simultaneous calls to PyDict_Clear, or to any other single API call, because each PyDict_Clear call wil *internally* takes the lock, so they won't stomp on each other. But any code sequence that makes multiple C API calls and assumes that the world won't shift under its feet becomes problematic. E.g. pre-GILectomy you can do
Py_ssize_t count = PyList_Size(list); for (int i = 0; i < count; ++i) { PyObject *obj = PyList_Get_Item(list, i); /* code which blithely proceeds without checking for obj == NULL */ ... }
but post-GILectomy, even putting aside the borrowing issues, this code might segfault if someone shrinks the list when you aren't looking. I can't see any way to avoid breaking the API here -- it seems like callers will just have to take a lock or otherwise update their code.
I guess you can think of borrowed references as a particularly common and insidious form of this problem -- so the tactical question is whether it's better to look for a general solution like teaching everyone to take locks, or try to attack borrowed references specifically.
(Full API compatibility is never going to be preserved with the GILectomy anyway -- if nothing else C extensions currently use the GIL to protect internal globals, and that will have to be replaced by some kind of more specific locking. So the open question is more like, can the porting burden be kept low enough to make it viable.)
-n

On Tue, Oct 11, 2016 at 3:49 AM, MRAB python@mrabarnett.plus.com wrote:
On 2016-10-10 10:45, Chris Angelico wrote:
On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings larry@hastings.org wrote:
Huh? In all other circumstances, a "borrowed" reference is exactly that: X has a reference, and you are relying on X's reference to keep the object alive. Borrowing from a borrowed reference is simply a chain of these; Z borrows from Y, Y borrows from X, and X is the original person who did the incref. But you're borrowing from something specific, somebody who the API guarantees has a legitimate reference on the object and won't drop it while you're using it. I bet for every other spot in the API I can tell you from whom you're borrowing the reference.
Okay. Here's a test:
PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index) Return value: Borrowed reference.
Presumably you own a reference to the list itself before you call this, and the list has a reference to its items. But suppose another thread clear()s the list immediately after you call this; whose reference are you borrowing now? The list's is gone.
Or is this another API that will have to change post gilectomy?
Couldn't the reference to the list also be borrowed?
If you lookup something in a dict, it'll be a borrowed reference.
If the dict is globals() and there's no GIL, another thread could delete the item while your code had the borrowed reference.
It looks like there might be a lot that will need to changed post gilectomy!
If you're trying to manipulate a list, you should probably own a reference to it. Otherwise, you're taking great risks. The trouble is that there's time enough for a context switch between PyList_GetItem and even an immediately-following incref, so you could have a junk pointer despite your best efforts.
Contrast:
PyObject* PySequence_GetItem(PyObject *o, Py_ssize_t i) Return value: New reference.
If you use sequence protocol, you are guaranteed to have a valid reference to the object. To be sure, the item might no longer be in the sequence by the time you look at it... but you know you're not looking at junk memory. With PyList_GetItem, it would be possible for you to have a dud pointer.
Also contrast:
PyObject* PyTuple_GetItem(PyObject *p, Py_ssize_t pos) Return value: Borrowed reference.
As long as you own a reference to the tuple itself, you can be confident that the borrowed reference will still be valid. There's no way that the object's refcount can validly hit zero so long as the tuple exists. (I'm ignoring PyTuple_SetItem here, which basically is for initialization.) It's only with lists that you have this risk.
PyObject* PyDict_GetItem(PyObject *p, PyObject *key) Return value: Borrowed reference.
... okay, it's only with lists and dictionaries. Here come the other lot called 'Python', I guess.
ChrisA

By the way, just to finish playing this new fun game "Who'd I Borrow This Reference From?":
On 10/10/2016 11:45 AM, Chris Angelico wrote:
On Mon, Oct 10, 2016 at 8:35 PM, Larry Hastings larry@hastings.org wrote:
I bet for every other spot in the API I can tell you from whom you're borrowing the reference.
Okay. Here's a test:
PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index) Return value: Borrowed reference.
Presumably you own a reference to the list itself before you call this, and the list has a reference to its items. But suppose another thread clear()s the list immediately after you call this; whose reference are you borrowing now? The list's is gone.
As you say: you've borrowed the reference from the list. With the GIL this is safe. After the Gilectomy it's not always safe. It *can* be: for example, if you allocated the list locally in the current thread, and it hasn't escaped the thread yet, you can assert that in this case reference will never go away on you. But in general this is an API that has to change for the Gilectomy.
//arry/

Huh? In all other circumstances, a "borrowed" reference is exactly that:
X has a reference, and you are relying on X's reference to keep the object alive. Borrowing from a borrowed reference is simply a chain of these; Z borrows from Y, Y borrows from X, and X is the original person who did the incref. But you're borrowing from something specific, somebody who the API guarantees has a legitimate reference on the object and won't drop it while you're using it. I bet for every other spot in the API I can tell you from whom you're borrowing the reference.
In contrast, the "borrowed" reference returned by PyWeakRef_GetObject()
seems to be "borrowed" from some unspecified entity. The fact that the object is live in Python directly implies that, yes, *somebody* must have a reference, somewhere. But ISTM (and apparently you) that this is relying on the GIL preventing that unknown other actor from dropping their reference while you've borrow it. A guarantee that the post-Gilectomy Python interpreter can no longer make!
Let me try again. The only rule for borrowing: x (borrowed reference) is only guaranteed to be alive for as long as y (source) is guaranteed to be alive. At least if you phrase it carefully enough, weakrefs still fit the bill -- your borrowed reference is alive for as long as the weakref is alive. Of course, the lifetime for a weakref is completely undefined in GILectomized Python, and barely controllable even in regular CPython. So this is not a great API.
At the very least we can contort definitions into making it plausible that we borrowed from the weakref. If we can only analyze code through the lens of borrowing, we have no choice except to look at it this way.
In any case, I see nothing in the documentation that suggests "borrowed
only means unowned" as you suggest. In contrast, the documentation seems to suggest that the metaphor is how I understood it; that when you "borrow" a reference, there is another object who has a reference and you're relying on their reference to keep the object alive.
This is its own big tangent. Sorry,
Your link is all I have too. It doesn't spell it out. AFAIK there are exactly two kinds of references discussed anywhere in the docs: owned references, where you are obligated to call Py_DECREF, and everything else. Python exclusively uses the term "borrowed references" for that "everything else". I don't know why. It's a good way to encourage reasonable practices, I guess.
As the docs themselves note, this metaphor is useful but flawed: e.g. you can "borrow" the same thing and the original is still usable. But I'd go in another direction -- the metaphor is sufficient to show safety of some code, but some safe code exists where we can't as easily use "borrowing" to talk about why it's safe, and might need to introduce new concepts or switch tactics.
PyObject* x = PyList_New(0); // x is a new owned reference. PyObject* y = x; // y is a borrowed reference. PyObject* z = x; // z is also a borrowed reference. Py_INCREF(y); // y has been upgraded to an owned reference. Py_CLEAR(x); // the original reference z borrowed from disappeared. // This is provably safe, but you can't use the "borrowing" metaphor for that proof without introducing new concepts.
do_stuff(z);
// You might wonder, "why would anyone ever do that?!?" // One reason is because some functions "steal" references, so you need to borrow before handing off ownership:
// y is an owned reference. my_struct->foo = y // borrowed a reference to y. PyTuple_SetItem(my_strict->some_tuple, 0, y); // stole y. // Now, in a super-strict interpretation, y is no longer "valid", and the original borrowing relationship has been broken. // We should ideally reason "as if" it borrowed from my_struct->some_tuple, even though it didn't. // (Obviously, this example is still a little overcomplicated, but you get how this might happen IRL, yeah? // e.g. maybe z already existed and the API stealing a reference doesn't have a GET_ITEM macro.) // [Note that this has a similar problem to weakref: another thread could mutate the tuple and delete the object. Yikes!]
I hope those examples made sense.
weakref is playing with fire in a similar way: PyWeakref_GetObject is safe because someone still exists, and there is a lock that guarantees they exist as long as you don't release that lock and don't run any code that might delete a reference. (In particular, it's guaranteed safe to immediately upgrade the borrowed reference -- or is without gilectomy.) Unlike the stealing tuple example, it isn't clear where the owned reference is.
So what I mean by saying it's a red herring, is that the correctness of code doesn't hinge on how easy it is to apply the concept of borrowing, but exclusively on lifetimes and whether your borrowed reference can be proven to lie within the object lifetime. If it could not at all be explained sensibly as a "borrow" from anyone, it can still be right -- that would only make it confusing, and dangerous. (But that's a foregone conclusion at this point.)
-- Devin

On 2016-10-10, at 11:05 , Devin Jeanpierre jeanpierreda@gmail.com wrote: The term "borrowed" is supposed to imply a sensible scope during which you're free to use the object, and weakrefs don't have that (except for what is granted by the GIL), so this does sound wacky. I bet it was for performance.
Especially as it handles both getting an object from a weakref and checking whether the weakref is still alive.
OTOH it could be an enshrined bug, http://bugs.python.org/issue520087 fixed a discrepancy between the doc and the implementation by matching the doc to the implementation (of returning a borrowed ref').
Also of note, pypy developers have been reporting issues with that specific API since ~2010[0][1], and IIRC they have added a PyWeakref_LockObject to cpyext.
[0] http://bugs.python.org/issue8578 [1] http://bugs.python.org/issue16602#msg177272

On Mon, Oct 10, 2016 at 4:17 AM, Larry Hastings larry@hastings.org wrote:
Given that the weakref doesn't have a reference to the object--merely a weak reference, different thing--whose reference is it borrowing?
As others have said, it doesn't really matter who's reference it was; just that there was another at the time it was returned. Clearly it can't be considered valid once additional Python code might be run.
FWIW, yes, this is playing merry hell with the Gilectomy. If there are two threads, and one calls PyWeakref_GetObject(obj), and there's only one reference to obj, and the other thread blows it away... now what? It's my contention that this API is simply untenable under the Gilectomy, and that it needs to change to returning a new (strong) reference.
+1 for this change.
-Fred

Modified +1: you can't change the behavior of the existing API, but you can deprecate it and introduce a better one with a different name. We'll have until Python 4.0 to carry through the deprecation anyways. And I doubt this is the only C API change needed for happy gil-free coding...
On Mon, Oct 10, 2016 at 5:29 AM, Fred Drake fred@fdrake.net wrote:
On Mon, Oct 10, 2016 at 4:17 AM, Larry Hastings larry@hastings.org wrote:
Given that the weakref doesn't have a reference to the object--merely a weak reference, different thing--whose reference is it borrowing?
As others have said, it doesn't really matter who's reference it was; just that there was another at the time it was returned. Clearly it can't be considered valid once additional Python code might be run.
FWIW, yes, this is playing merry hell with the Gilectomy. If there are two threads, and one calls PyWeakref_GetObject(obj), and there's only one reference to obj, and the other thread blows it away... now what? It's my contention that this API is simply untenable under the Gilectomy, and that it needs to change to returning a new (strong) reference.
+1 for this change.
-Fred
-- Fred L. Drake, Jr. <fred at fdrake.net> "A storm broke loose in my mind." --Albert Einstein _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org

On 10/10/2016 06:37 PM, Guido van Rossum wrote:
Modified +1: you can't change the behavior of the existing API, but you can deprecate it and introduce a better one with a different name. We'll have until Python 4.0 to carry through the deprecation anyways. And I doubt this is the only C API change needed for happy gil-free coding...
First, "deprecate" won't work for these semantics for the Gilectomy branch. I simply cannot safely support the semantics of the existing function. All call sites need to change.
Second, consider that every function that returns a borrowed reference--PyDict_GetItem(), PyList_GetItem()--has to change too, to return an actual (non-borrowed) reference. It's going to be a major upheaval in the C API.
For now I'm going to leave the names as-is and just change the semantics everywhere. If this approach really works, and if we decide we want to merge it back into trunk--and those are both still big if's--we can revisit this decision.
I haven't yet ruled out abandoning reference counting completely and going to 100% tracing garbage collecting,
//arry/

On Mon, Oct 10, 2016 at 1:57 PM, Larry Hastings larry@hastings.org wrote:
On 10/10/2016 06:37 PM, Guido van Rossum wrote:
Modified +1: you can't change the behavior of the existing API, but you can deprecate it and introduce a better one with a different name. We'll have until Python 4.0 to carry through the deprecation anyways. And I doubt this is the only C API change needed for happy gil-free coding...
First, "deprecate" won't work for these semantics for the Gilectomy branch. I simply cannot safely support the semantics of the existing function. All call sites need to change.
Oh, in the gilectomy you can fix all call sites. Even in CPython. But 3rd party extensions should not have this API removed in Python 3.7.
Second, consider that every function that returns a borrowed reference--PyDict_GetItem(), PyList_GetItem()--has to change too, to return an actual (non-borrowed) reference. It's going to be a major upheaval in the C API.
Yeah, this has been discussed to death in the rest of the thread.
For now I'm going to leave the names as-is and just change the semantics everywhere. If this approach really works, and if we decide we want to merge it back into trunk--and those are both still big if's--we can revisit this decision.
Changing something as subtle as this to an API sounds really bad even in an experimental branch. But it's your branch and you can do what you want (then again, why did you bring it up here? :-).
I haven't yet ruled out abandoning reference counting completely and going to 100% tracing garbage collecting,
I guess that would have its own set of pros and cons.

On 11 October 2016 at 08:49, Guido van Rossum guido@python.org wrote:
On Mon, Oct 10, 2016 at 1:57 PM, Larry Hastings larry@hastings.org wrote:
For now I'm going to leave the names as-is and just change the semantics everywhere. If this approach really works, and if we decide we want to merge it back into trunk--and those are both still big if's--we can revisit this decision.
Changing something as subtle as this to an API sounds really bad even in an experimental branch. But it's your branch and you can do what you want (then again, why did you bring it up here? :-).
I'm guessing Larry was hoping someone else would see a possible solution that he'd missed.
On that front, while I don't see an alternative to making borrowed references real references in a GIL-free Python, I do think we may need to keep the borrowed/new distinction for the sake of folks writing cross-version compatible code. To explain why, consider these scenarios:
* CPython with a GIL: - new references MUST be decref'ed - borrowed references MUST NOT be decref'ed
* CPython without a GIL: - new references MUST be decref'ed - borrowed references MUST be decref'ed
Assuming that the GILectomy does end up needing to implicitly treat all borrowed references as new references, that suggests we're going to need APIs like "PyBorrowed_DECREF" to provide the "only decref when built without the GIL" semantics, and the rules for GIL-independent source code would become:
- new references must be decref'ed using the normal APIs - borrowed references must be decref'ed using the PyBorrowed* APIs (with a no-op compatibility shim for older GIL-only CPython versions)
Regards, Nick.
participants (14)
-
Chris Angelico
-
Devin Jeanpierre
-
Fred Drake
-
Greg Ewing
-
Gregory P. Smith
-
Guido van Rossum
-
Jim J. Jewett
-
Larry Hastings
-
MRAB
-
Nathaniel Smith
-
Nick Coghlan
-
Paul Moore
-
Random832
-
Xavier Morel