[Python-ideas] RFC: PEP: Add dict.__version__
Terry Reedy
tjreedy at udel.edu
Mon Jan 11 01:04:02 EST 2016
On 1/10/2016 12:23 AM, Chris Angelico wrote:
(in response to Steven's response to my post)
> There's more to it than that. Yes, a dict maps values to values; but
> the keys MUST be immutable
Keys just have to be hashable; only hashes need to be immutable. By
default, hashes depend on ids, which are immutable for a particular
object within a run.
> (otherwise hashing has problems),
only if the hash depends on values that mutate. Some do.
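A minimal sketch of the distinction. Node and BadKey are hypothetical classes made up for illustration; neither comes from the thread or from astoptimizer. Node keeps the default identity-based hash, so mutating it is harmless. BadKey hashes on a mutable attribute, which is the case where hashing really does have problems.

```python
class Node:
    """Mutable object that keeps the default identity-based __hash__/__eq__."""

    def __init__(self, value):
        self.value = value


class BadKey:
    """Hypothetical key whose hash depends on a value that can mutate."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.value == other.value


node = Node(1)
d = {node: "found"}
node.value = 2             # the key mutates, but its hash does not
node_ok = d.get(node)      # lookup still succeeds

bad = BadKey(1)
e = {bad: "found"}
bad.value = 2              # the hash changes after insertion
bad_lost = bad in e        # the entry is still filed under the old hash
```

Here node_ok is still "found", while bad_lost is False even though the same object is sitting in the dict.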
> and this optimization
> doesn't actually care about the immutability of the value.
astoptimizer has multiple optimizations. One is not repeating name
lookups. This is safe as long as the relevant dicts have not changed. I
am guessing that you were pointing to this one.
Another is not repeating the call of a function with a particular value.
This optimization, in general, is not safe even if dicts have not
changed. It *does* care about the nature of dict values -- in
particular the nature of functions that are dict values. It is the one
*I* discussed, and the reason I claimed that using __version__ is tricky.
His toy example is conditionally replacing 'len('abc')' (at runtime)
with '3', where '3' is computed *when the code is compiled*.
For this, it is crucial that builtin len is pure and immutable.
Viktor is being super careful to not break code. In response to my
question, Viktor said astoptimizer uses a whitelist of pure builtins to
supplement the information supplied by .__version__. A dict's history,
as summarized by __version__, is not always enough to answer 'is this
optimization safe?'. The nature of values is sometimes crucially important.
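A rough sketch of why both pieces of information are needed. This is not astoptimizer's code: VersionedDict, PURE_BUILTINS and make_guarded_call are hypothetical names, and the version counter is emulated in a dict subclass because plain dicts expose no such attribute. The precomputed result is reused only while the namespace is unchanged *and* the function was on the whitelist to begin with.

```python
import builtins


class VersionedDict(dict):
    """Hypothetical stand-in for a dict that carries a version counter.

    Only __setitem__ and __delitem__ bump the counter here; a real
    implementation would have to cover every mutating method.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


PURE_BUILTINS = {"len"}  # assumed whitelist, for illustration only


def make_guarded_call(namespace, name, arg):
    """Precompute name(arg) once and reuse it while the guard holds."""
    func = namespace[name]
    # The version says nothing about purity, so check the whitelist too.
    foldable = name in PURE_BUILTINS and func is getattr(builtins, name, None)
    cached = func(arg) if foldable else None
    seen_version = namespace.version

    def call():
        if foldable and namespace.version == seen_version:
            return cached                # fast path: nothing has changed
        return namespace[name](arg)      # slow path: look up and call again

    return call


ns = VersionedDict(len=len)
fast_len = make_guarded_call(ns, "len", "abc")
first = fast_len()                  # served from the precomputed value
ns["len"] = lambda obj: "mock"      # rebinding the name bumps the version
second = fast_len()                 # guard fails, so the new function runs
```

Dropping the whitelist check and relying on the version alone would make the fast path "safe" for any function, including ones with side effects, which is exactly the misuse I am worried about.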
However, others might use __version__ *without* thinking through what
other information is needed. This is why I think its exposure is a bit
dangerous. 19 years of experience suggests to me that misuse *will*
happen. Viktor just reported that CPython's type already has a
*private* version count. The issue of exposing a new internal feature
is somewhat separate and comes after the decision to add it.
As you know, and even alluded to later in your post, CPython already
replaces '1 + 1' with '2' at compile time. Method int.__add__ is pure
and immutable. Since it (unlike len) also cannot be replaced or
shadowed, the replacement can be complete, with '2' put in the code
object (and .pyc if written), as if the programmer had actually written '2'.
>>> from dis import dis
>>> dis('1 + 1')
  1           0 LOAD_CONST               1 (2)
              3 RETURN_VALUE
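The same check can be made programmatically. This is a sketch that assumes CPython; the exact opcodes and offsets differ between versions, so it only tests that no addition instruction survives compilation rather than matching the listing above.

```python
import dis

# Compile the expression and list the opcode names that were emitted.
code = compile("1 + 1", "<example>", "eval")
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Addition shows up as BINARY_ADD in older CPython and BINARY_OP in newer ones.
has_add = any("ADD" in name or name == "BINARY_OP" for name in opnames)
result = eval(code)
```

With the folding in place, has_add is False and result is 2.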
JIT compilers depend on the same properties of int, float, and str
operations, for instance, as well as the fact that unbox(Py object) and
box(machine value) are inverses, so that unbox(box(temp_machine_value))
can be replaced by temp_machine_value.
--
Terry Jan Reedy