
2016-04-15 23:45 GMT+02:00 Jim J. Jewett <jimjjewett@gmail.com>:
It's a useful property. For example, let's say that you have a guard on globals()['value']. The guard is created with value=3. A unit test replaces the value with 50, but then restores the value to its previous value (3). Later, the guard is checked to decide if an optimization can be used.
If the dict version is increased, you need a lookup. If the dict version is not increased, the guard is cheap.
I would expect the version to be increased twice, and therefore to require a lookup. Are you suggesting that unittest should provide an example of resetting the version back to the original value when it cleans up after itself?
Sorry, as I wrote in another email, I was wrong. If you modify the value, the version is increased. The discussed case is really a corner case: the version does not change if the key is set again to exactly the same value:
    d[key] = value
    d[key] = value
It's just that it's cheap to implement :-)
In C, it's very cheap to implement the test "new_value == old_value": it just compares two pointers.
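To make the corner case concrete, here is a minimal pure-Python model of the behaviour described above. It is only a sketch: VersionedDict, its version attribute and the module-level counter are hypothetical names invented for illustration, and the model covers only item assignment and deletion, not the C implementation.

```python
import itertools

_MISSING = object()               # sentinel: "key was not present"
_global_version = itertools.count(1)  # hypothetical global counter


class VersionedDict(dict):
    """Hypothetical model of a dict that carries a version tag."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = next(_global_version)

    def __setitem__(self, key, value):
        old = self.get(key, _MISSING)
        # Identity test ("is"), the Python analogue of comparing two
        # pointers in C. Equality (==) is deliberately not consulted.
        if old is not value:
            self.version = next(_global_version)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version = next(_global_version)


d = VersionedDict()
a = [1, 2]
b = [1, 2]                        # equal to a, but a distinct object

d["k"] = a
v_start = d.version

d["k"] = a                        # same object again: no bump
unchanged_on_same_object = (d.version == v_start)

d["k"] = b                        # a == b but a is not b: bump
bumped_on_equal_object = (d.version > v_start)

v_before = d.version
d["k"] = 50                       # replace ...
d["k"] = b                        # ... then restore: two bumps in this model
bumped_twice = (d.version == v_before + 2)
```

In this model, replacing a value and then restoring it moves the version forward twice, so a guard would have to fall back to a lookup; only re-assigning the very same object leaves the version alone.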
Yeah, I understand that it is likely a win in terms of performance, and a good way to start off (given that you're willing to do the work).
I just worry that you may end up closing off even better optimizations later, if you make too many promises about exactly how you will do which ones.
Today, dict only cares about ==, and you (reasonably) think that full == isn't always worth running ... but when it comes to which tests *are* worth running, I'm not confident that the answers won't change over the years.
I checked: currently there is no unit test for a==b, only for a is b. I will add a test for a==b but a is not b, and ensure that the version is increased.
[2A] Do you want to promise that replacing a value with a non-identical object *will* trigger a version_tag update *even* if the objects are equal?
It's already written in the PEP:
I read that as a description of what the code does, rather than a spec for what it should do... so it isn't clear whether I could count on that remaining true.
For example, if I know that my dict values are all 4-digit integers, can I write:
d[k] = d[k] + 0
and be assured that the version_tag will bump? Or is that something that a future optimizer might optimize out?
Hum, I will try to clarify that.
(4) Please be explicit about the locking around version++; it is enough to say that the relevant methods already need to hold the GIL (assuming that is true).
I don't think that it's important to mention it in the PEP. It's more of an implementation detail. The version can be protected by atomic operations.
Now I'm the one arguing from a specific implementation. :D
My thought was that any sort of locking (including atomic operations) is slow, but if the GIL is already held, then there is no *extra* locking cost. (Well, a slightly longer hold on the lock, but...)
Hum, since the PEP clearly targets CPython, I will simply describe its implementation and explain that the GIL ensures that version++ is atomic.
On the one hand, you never need a strong reference to the value; if it has been collected, then it has obviously been removed from the dict and should trigger a change even with per-dict.
Let's say that you watch key1 of a dict. key2 is modified, which increases the version. Later, you test the guard: to check whether key1 was modified, you need to look up the key and compare the value. You need the value to compare it.
And the value for key1 is still there, so you can.
Sorry, how do you want to check that the dict[key1] value didn't change? Using the value identifier? dict[key1] is old_value_id? The problem with storing an identifier (a pointer in C) with no strong reference is that when the object is destroyed, a new object can get the same identifier. So it's likely that "dict[key] is old_value_id" can be true even if dict[key] is now a new object.
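The guard logic discussed here can be sketched in pure Python. This is an illustration under assumptions, not the real implementation: VersionedDict, Guard and the version attribute are hypothetical names, and the dict model only tracks item assignment. The point is that the guard keeps a strong reference to the watched value, so the identity comparison on the slow path cannot be fooled by a new object reusing the address of a destroyed one.

```python
import itertools

_MISSING = object()
_global_version = itertools.count(1)   # hypothetical global counter


class VersionedDict(dict):
    """Hypothetical model: the version changes when a value is replaced."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = next(_global_version)

    def __setitem__(self, key, value):
        if self.get(key, _MISSING) is not value:
            self.version = next(_global_version)
        super().__setitem__(key, value)


class Guard:
    """Hypothetical guard watching a single key of a VersionedDict."""

    def __init__(self, d, key):
        self.d = d
        self.key = key
        self.value = d[key]        # strong reference: keeps the object alive
        self.version = d.version

    def check(self):
        # Fast path: nothing in the dict changed since the last check.
        if self.d.version == self.version:
            return True
        # Slow path: some key changed, maybe not ours. Look it up and
        # compare by identity against the object we still hold.
        if self.d.get(self.key, _MISSING) is self.value:
            self.version = self.d.version   # re-arm the fast path
            return True
        return False


d = VersionedDict(key1="a", key2="b")
guard = Guard(d, "key1")

ok_initially = guard.check()       # fast path

d["key2"] = "c"                    # unrelated key: version moves
ok_after_key2 = guard.check()      # slow path, key1 still the same object

d["key1"] = "z"                    # watched key replaced
ok_after_key1 = guard.check()      # guard fails
```

Storing only id(value) instead of the value itself would make the slow path unreliable, because the identifier says nothing once the original object has been destroyed.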
The only reason you would notice that the key2 value had gone away is if you also care about key2 -- in which case the cached value is out of date, regardless of what specific value it used to hold.
I don't understand, technically, what you mean by "out of date" for an object.
If the dictionary values are modified during the loop, the dict version is increased. But it's allowed to modify values when you iterate on *keys*.
Sure. So?
I see three cases:
(A) I don't care that the collection changed. The python implementation might, but I don't. (So no bug even today.)
I'm sorry, I don't understand your description. What do you mean by "collection"? It's different if you modify dict *keys*, or dict *values*, or both. Serhiy opened an issue because he wants to raise an exception if keys are modified while you iterate on keys: https://bugs.python.org/issue19332 But modifying only values must *not* raise an exception.
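The difference between modifying values and modifying keys during iteration can be shown with a plain dict. This small sketch only demonstrates what CPython already does today for a change of size; the function names are made up for the example.

```python
def replace_values_while_iterating(d):
    """Replace the value of existing keys while iterating on keys: allowed."""
    for key in d:
        d[key] = d[key] * 2
    return d


def add_key_while_iterating(d):
    """Insert a new key while iterating on keys: the dict size changes."""
    try:
        for key in d:
            d[key + "_copy"] = d[key]
    except RuntimeError as exc:
        return str(exc)
    return None


doubled = replace_values_while_iterating({"a": 1, "b": 2})
error_message = add_key_while_iterating({"a": 1, "b": 2})
```

A version tag that is bumped on every value replacement would change in both functions, which is why it cannot, on its own, tell these two situations apart.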
(B) I want to process exactly the collection that I started with. If some of the values get replaced, then I want to complain, even if python doesn't. version_tag is what I want.
This is not issue #19332.
(C) I want to process exactly the original keys, but go ahead and use updated values. The bug still bites, but ... I don't think this case is any more common than B.
I don't exactly understand your definition either. Maybe you need to provide an example of code.
Sorry, I don't understand why you want to discuss issue #19332 here. I only mentioned the issue in "Prior Work" because the implementation is *similar*, but PEP 509 is different and so it doesn't help to fix this issue.
Do you want to modify PEP 509 to fix this issue? Or do you not understand why PEP 509 cannot be used to fix the issue? I'm lost...
Victor