
2016-04-15 19:54 GMT+02:00 Jim J. Jewett <jimjjewett@gmail.com>:
(1) Meta Question: If this is really only for CPython, then is "Standards Track" the right classification?
Yes, I think so. It doesn't seem to be an Informational or a Process PEP: https://www.python.org/dev/peps/pep-0001/#pep-types
(2) Why *promise* not to update the version_tag when replacing a value with itself?
It's a useful property. For example, let's say that you have a guard on globals()['value']. The guard is created with value=3. A unit test replaces the value with 50, but then restores the value to its previous value (3). Later, the guard is checked to decide if an optimization can be used. If the dict version was increased, the guard needs a dict lookup. If the dict version was not increased, the guard is cheap. In C, it's very cheap to implement the test "new_value == old_value": it just compares two pointers (the equivalent of "new_value is old_value" in Python). If the overhead turns out to be visible, I can drop this from the PEP and implement the check in the guard.
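A rough pure-Python sketch of this property, for illustration only. The real version tag lives in the C dict structure; the VersionedDict class and its version attribute are hypothetical stand-ins, not an existing API.

```python
class VersionedDict(dict):
    """Hypothetical stand-in for a dict that carries a version tag."""

    _MISSING = object()  # sentinel meaning "key is not in the dict"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, new_value):
        old_value = self.get(key, self._MISSING)
        # Identity test only: in C this is a comparison of two pointers.
        if new_value is not old_value:
            self.version += 1
        super().__setitem__(key, new_value)


namespace = VersionedDict(value=3)
start = namespace.version

# Storing the very same object again leaves the version untouched,
# so a guard comparing versions can stay on its cheap path.
namespace["value"] = namespace["value"]
assert namespace.version == start

# Storing a different object changes the version.
namespace["value"] = 50
assert namespace.version == start + 1
```

The sketch only covers item assignment; a complete stand-in would also have to bump the version in the other mutating methods.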
Isn't that the sort of quality-of-implementation issue that got pushed to a note for objects that happen to be represented as singletons, such as small integers or ASCII chars?
I prefer to require this property.
[2A] Do you want to promise that replacing a value with a non-identical object *will* trigger a version_tag update *even* if the objects are equal?
It's already written in the PEP: "The version is not incremented if an existing key is set to the same value. For efficiency, values are compared by their identity: new_value is old_value, not by their content: new_value == old_value."
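To make the difference between identity and equality concrete, here is a small sketch. The helper bumps_version is hypothetical and only restates the rule quoted from the PEP; it is not part of any API.

```python
def bumps_version(old_value, new_value):
    """Hypothetical helper: would storing new_value over old_value
    change the version under the identity rule?"""
    return new_value is not old_value


old = [1, 2, 3]
equal_copy = list(old)  # equal content, but a distinct object

assert old == equal_copy
print(bumps_version(old, equal_copy))  # True: equal but not identical
print(bumps_version(old, old))         # False: the very same object
```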
(3) It is worth being explicit on whether empty dicts can share a version_tag of 0. If this PEP is about dict content, then that seems fine, and it may well be worth optimizing dict creation.
This is not part of the PEP yet. I'm not sure that I will modify the PEP to use the version 0 for empty dictionaries. Antoine doesn't seem to be convinced :-)
(4) Please be explicit about the locking around version++; it is enough to say that the relevant methods already need to hold the GIL (assuming that is true).
I don't think that it's important to mention it in the PEP. It's more of an implementation detail. The version can be protected by atomic operations.
(5) I'm not sure I understand the arguments around a per-entry version.
It doesn't matter since I don't want this option :-)
On the one hand, you never need a strong reference to the value; if it has been collected, then it has obviously been removed from the dict and should trigger a change even with per-dict.
Let's say that you watch key1 of a dict. Then key2 is modified, which increases the version. Later, you test the guard: to check whether key1 was modified, you need to look up the key and compare the value. So the guard must keep the value in order to compare it.
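A minimal sketch of such a guard with a per-dict version, written in pure Python for illustration. VersionedDict and GuardDictKey are hypothetical names, and the version attribute stands in for the C-level tag, which is not exposed to Python code.

```python
UNSET = object()  # sentinel meaning "key is absent"


class VersionedDict(dict):
    """Hypothetical stand-in: bumps `version` when an item really changes."""

    version = 0

    def __setitem__(self, key, value):
        if self.get(key, UNSET) is not value:
            self.version += 1
        super().__setitem__(key, value)


class GuardDictKey:
    """Hypothetical guard watching a single key of a VersionedDict."""

    def __init__(self, mapping, key):
        self.mapping = mapping
        self.key = key
        # Strong reference to the watched value, needed by the slow path.
        self.value = mapping.get(key, UNSET)
        self.version = mapping.version

    def check(self):
        version = self.mapping.version
        if version == self.version:
            return True  # fast path: no lookup at all
        # Slow path: look up the key and compare the value by identity.
        if self.mapping.get(self.key, UNSET) is self.value:
            self.version = version  # another key changed; cache the new version
            return True
        return False  # the watched key was modified


d = VersionedDict(key1="a", key2="b")
guard = GuardDictKey(d, "key1")

d["key2"] = "c"          # bumps the version, key1 is untouched
assert guard.check()     # slow path once, then the new version is cached

d["key1"] = "z"          # the watched key changes
assert not guard.check()
```

Without the stored value, the slow path would have nothing to compare the looked-up entry against, which is why the guard holds a strong reference.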
On the other hand, I'm not sure per-entry would really allow finer-grained guards to avoid lookups; just because an entry hasn't been modified doesn't prove it hasn't been moved to another location, perhaps by replacing a dummy in a slot it would have preferred.
The main advantage of a per-entry version is to avoid the strong reference to values. According to my tests, the drawbacks are too significant to choose this option. I prefer a simple version per dictionary.
(6) I'm also not sure why version_tag *doesn't* solve the problem of dicts that fool the iteration guards by mutating without changing size ( https://bugs.python.org/issue19332 ) ... are you just saying that the iterator views aren't allowed to rely on the version-tag remaining stable, because replacing a value (as opposed to a key-value pair) is allowed?
If the dictionary values are modified during the loop, the dict version is increased. But it's allowed to modify values when you iterate over *keys*.

Victor