[Python-Dev] RFC: PEP 509: Add a private version to dict

Fri Apr 15 16:41:44 EDT 2016

2016-04-15 19:54 GMT+02:00 Jim J. Jewett <jimjjewett at gmail.com>:
> (1)  Meta Question:  If this is really only for CPython, then is
> "Standards Track" the right classification?

Yes, I think so. It doesn't seem to be an Informational or a Process PEP:
https://www.python.org/dev/peps/pep-0001/#pep-types

> (2)  Why *promise* not to update the version_tag when replacing a
> value with itself?

It's a useful property. For example, let's say that you have a guard
on globals()['value']. The guard is created with value=3. A unit test
replaces the value with 50, but then restores it to its previous
value (3). Later, the guard is checked to decide if an optimization
can be used.

If the dict version is increased, you need a lookup. If the dict
version is not increased, the guard is cheap.

In C, it's very cheap to implement the test "new_value == old_value":
it just compares two pointers (the equivalent of "is" in Python).

If the overhead turns out to be measurable, I can drop this from the
PEP and implement the check in the guard instead.
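
To make the fast path and the slow path concrete, here is a minimal
pure-Python sketch. It is not the PEP's C implementation: VersionedDict,
its public "version" attribute and the Guard class are hypothetical
names invented for illustration, and only __setitem__ is instrumented
(del, update(), pop() and so on are ignored).

```python
class VersionedDict(dict):
    """Pure-Python stand-in for a dict with a private version (hypothetical)."""

    _missing = object()  # sentinel meaning "key not present"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        # Bump the version unless the key already maps to this exact object.
        if self.get(key, self._missing) is not value:
            self.version += 1
        super().__setitem__(key, value)


class Guard:
    """Watches one key and keeps a strong reference to the expected value."""

    def __init__(self, d, key):
        self.d, self.key = d, key
        self.value = d[key]
        self.version = d.version
        self.lookups = 0

    def check(self):
        if self.d.version == self.version:
            return True                    # fast path: the dict did not change
        self.lookups += 1                  # slow path: look the key up again
        if self.d.get(self.key, VersionedDict._missing) is self.value:
            self.version = self.d.version  # key unchanged: re-arm the fast path
            return True
        return False


ns = VersionedDict(value=3, other=0)
guard = Guard(ns, "value")

ns["value"] = ns["value"]    # same object stored again: version unchanged
fast = guard.check(), guard.lookups

ns["other"] = 1              # unrelated key modified: version bumped
slow = guard.check(), guard.lookups
```

In this sketch the first check costs no lookup, while the second one
needs a single lookup before the guard can trust the dict again.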

>  Isn't that the sort of quality-of-implementation
> issue that got pushed to a note for objects that happen to be
> represented as singletons, such as small integers or ASCII chars?

I prefer to require this property.

> [2A] Do you want to promise that replacing a value with a
> non-identical object *will* trigger a version_tag update *even*
> if the objects are equal?

It's already written in the PEP:

"The version is not incremented if an existing key is set to the same
value. For efficiency, values are compared by their identity:
new_value is old_value , not by their content: new_value == old_value
."
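
A short illustration of the difference, assuming a hypothetical helper
bumps_version() that mirrors the rule quoted above (the name is not part
of the PEP):

```python
def bumps_version(old_value, new_value):
    # Hypothetical helper: the quoted rule compares identity, not content.
    return new_value is not old_value


a = [1, 2]
b = [1, 2]          # equal to a, but a distinct object

same_object = bumps_version(a, a)   # storing the very same object again
equal_copy = bumps_version(a, b)    # storing an equal but different object
```

Here same_object is False and equal_copy is True: replacing a value
with an equal but non-identical object does count as a modification.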

> (3)  It is worth being explicit on whether empty dicts can share
> a version_tag of 0.  If this PEP is about dict content, then that
> seems fine, and it may well be worth optimizing dict creation.

This is not part of the PEP yet. I'm not sure that I will modify the
PEP to use the version 0 for empty dictionaries. Antoine doesn't seem
to be convinced :-)

> (4)  Please be explicit about the locking around version++; it
> is enough to say that the relevant methods already need to hold
> the GIL (assuming that is true).

I don't think that it's important to mention it in the PEP. It's more
of an implementation detail. The version can be protected by atomic
operations.

> (5)  I'm not sure I understand the arguments around a per-entry
> version.

It doesn't matter since I don't want this option :-)

> On the one hand, you never need a strong reference to the value;
> if it has been collected, then it has obviously been removed from
> the dict and should trigger a change even with per-dict.

Let's say that you watch key1 of a dict. key2 is modified, which
increases the version. Later, you test the guard: to check whether key1
was modified, you need to look up the key and compare the value. You
need to keep the value to compare it.

> On the other hand, I'm not sure per-entry would really allow
> finer-grained guards to avoid lookups; just because an entry hasn't
> been modified doesn't prove it hasn't been moved to another location,
> perhaps by replacing a dummy in a slot it would have preferred.

The main advantage of a per-entry version is to avoid the strong
reference to values.

According to my tests, the drawbacks are too significant to choose this
option. I prefer a simple version per dictionary.

> (6)  I'm also not sure why version_tag *doesn't* solve the problem
> of dicts that fool the iteration guards by mutating without changing
> size ( https://bugs.python.org/issue19332 ) ... are you just saying
> that the iterator views aren't allowed to rely on the version-tag
> remaining stable, because replacing a value (as opposed to a
> key-value pair) is allowed?

If the dictionary values are modified during the loop, the dict
version is increased. But it's allowed to modify values when you
iterate on *keys*, so the iterator cannot treat a version change as an
error.
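
The snippet below shows the existing dict behaviour that this relies on.
It uses only plain dicts and does not depend on the PEP; the variable
names are arbitrary.

```python
d = {"a": 1, "b": 2}

# Replacing values while iterating on keys is legal: the key set is untouched.
for key in d:
    d[key] += 10

# Adding a key while iterating changes the size, which the iterator detects.
try:
    for key in d:
        d[key + "_copy"] = d[key]
    size_error = None
except RuntimeError as exc:
    size_error = str(exc)
```

The first loop completes normally; the second one stops with a
RuntimeError. A guard based on the version would reject both loops.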

Victor