[Python-ideas] RFC: PEP: Add dict.__version__

Chris Angelico rosuav at gmail.com
Mon Jan 11 03:07:32 EST 2016


On Mon, Jan 11, 2016 at 5:04 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 1/10/2016 12:23 AM, Chris Angelico wrote:
>
> (in response to Steven's response to my post)
>
>> There's more to it than that. Yes, a dict maps values to values; but
>> the keys MUST be immutable
>
>
> Keys just have to be hashable; only hashes need to be immutable.  By
> default, hashes depend on ids, which are immutable for a particular object
> within a run.

Yes, but if you're using the ID as the hash and identity as equality,
then *by definition* the only way to look up that key is with that
object. That means it doesn't matter to the lookup optimization if the
object itself has changed:

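class Puddle(object): pass
d = {}
key, val = Puddle(), Puddle()
key.foo = "foo"; val.foo = "bar"
d[key] = val

print(d[key])               # finds val via key's identity
snapshotted_d_key = d[key]  # keep a local reference to the result
key.foo = "not foo"         # mutate the key; hash and equality unaffected
print(d[key])               # still the same val object
print(snapshotted_d_key)    # the saved reference is still correct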

The optimization in question is effectively using a local reference
like snapshotted_d_key rather than doing the actual lookup again. It
can safely do this even if the attributes of that key have changed,
because there is no way for that to affect the result of the lookup.
So in terms of dict lookups, whatever affects hash and equality *is*
the object's value; if that's its identity, then identity is the sole
value that object has.
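
For contrast, here is a rough sketch of a key whose hash and equality
depend on a mutable attribute. ValueKey and lookup_survives_mutation are
names made up for this illustration, and the attribute is an int only to
keep the hashes deterministic. Mutating such a key after insertion can
leave the entry unreachable through it, which is exactly what cannot
happen with an identity-hashed object:

```python
class ValueKey:
    """Hypothetical key whose hash and equality depend on mutable state."""

    def __init__(self, foo):
        self.foo = foo

    def __hash__(self):
        return hash(self.foo)

    def __eq__(self, other):
        return isinstance(other, ValueKey) and self.foo == other.foo


class Puddle:
    """Default hash and equality, both based on identity."""


def lookup_survives_mutation(key):
    """Insert key, mutate it, and report whether the dict still finds it."""
    d = {key: "payload"}
    key.foo = 2  # mutate after insertion
    return d.get(key) == "payload"


identity_key_ok = lookup_survives_mutation(Puddle())
value_key_ok = lookup_survives_mutation(ValueKey(1))
```

With the identity-based Puddle the lookup is unaffected by the mutation;
with ValueKey the hash changes from hash(1) to hash(2), so the lookup
probes a different slot and misses.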

>> and this optimization
>> doesn't actually care about the immutability of the value.
>
> astoptimizer has multiple optimizations. One is not repeating name lookups.
> This is safe as long as the relevant dicts have not changed. I am guessing
> that you were pointing to this one.

Yes, that's the one I was talking about.
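
A minimal sketch of that guard, assuming a dict that exposes a version
counter. Real dicts have no such public attribute, so VersionedDict and
CachedLookup below are hypothetical stand-ins, and only two mutating
methods are covered:

```python
class VersionedDict(dict):
    """Hypothetical dict that bumps a counter on item assignment/deletion.

    A real implementation would have to cover every mutating method
    (update, pop, clear, setdefault, ...), not just these two.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class CachedLookup:
    """Reuse namespace[name] for as long as the version is unchanged."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.lookups = 0  # number of real dict lookups, for demonstration
        self._refresh()

    def _refresh(self):
        self.value = self.namespace[self.name]
        self.seen_version = self.namespace.version
        self.lookups += 1

    def get(self):
        # Guard: fall back to a real lookup only if the dict has changed.
        if self.namespace.version != self.seen_version:
            self._refresh()
        return self.value
```

Usage would look something like this: build ns = VersionedDict(x=1),
wrap it in CachedLookup(ns, "x"), and repeated get() calls skip the dict
until something like ns["x"] = 2 bumps the version.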

> Another is not repeating the call of a function with a particular value.
> This optimization, in general, is not safe even if dicts have not changed.
> It *does* care about the nature of dict values -- in particular the nature
> of functions that are dict values.  It is the one *I* discussed, and the
> reason I claimed that using __version__ is tricky.

Okay. In that case, yes, it takes a lot more checks.
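
To illustrate why an unchanged dict is not enough here, a small sketch
with a deliberately impure function (next_ticket is an invented name).
The namespace never changes, yet two calls with the same argument give
different results, so reusing the first result would be wrong:

```python
import itertools

_counter = itertools.count(1)


def next_ticket(prefix):
    """Impure: the same argument gives a different result on each call."""
    return f"{prefix}-{next(_counter)}"


namespace = {"next_ticket": next_ticket}
snapshot = dict(namespace)  # copy taken before any calls

first = namespace["next_ticket"]("A")
second = namespace["next_ticket"]("A")

namespace_unchanged = namespace == snapshot  # the dict did not change
results_differ = first != second             # but the call results did
```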

> His toy example is conditionally replacing 'len('abc')' (at runtime)
> with '3', where '3' is computed *when the code is compiled*. For
> this, it is crucial that builtin len is pure and immutable.

Correct. I'm getting this mental picture of angelic grace, with a
chosen few most beautiful functions being commended for their purity,
immutability, and reverence.

> Victor is being super careful to not break code.  In response to my
> question, Victor said astoptimizer uses a whitelist of pure builtins to
> supplement the information supplied by .__version__.  Dict history,
> summarized by __version__, is not always enough to answer 'is this
> optimization safe?'  The nature of values is sometimes crucially important.

There would be very few operations that can be optimized like this. In
practical terms, the only ones that I can think of are what you might
call "computed literals" - like (2+3j), they aren't technically
literals, but the programmer thinks of them that way. Things like
module-level constants (the 'stat' module comes to mind), a small
handful of simple transformations, and maybe some text<->bytes
transformations (e.g. "abc".encode("ascii") could be replaced at
compile-time with b"abc"). There won't be very many others, I suspect.
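
As a rough sketch of what folding the encode case could look like at
the AST level (this is not how astoptimizer does it; FoldConstantEncode
and fold are names invented here, and it assumes the codec behaves the
same at compile time as at run time):

```python
import ast


class FoldConstantEncode(ast.NodeTransformer):
    """Hypothetical pass: fold "literal".encode("codec") into bytes."""

    def visit_Call(self, node):
        self.generic_visit(node)
        func = node.func
        if (isinstance(func, ast.Attribute)
                and func.attr == "encode"
                and isinstance(func.value, ast.Constant)
                and isinstance(func.value.value, str)
                and not node.keywords
                and all(isinstance(a, ast.Constant)
                        and isinstance(a.value, str)
                        for a in node.args)):
            try:
                folded = func.value.value.encode(
                    *(a.value for a in node.args))
            except (LookupError, UnicodeError, TypeError):
                # Leave the call alone so the error still occurs at runtime.
                return node
            return ast.copy_location(ast.Constant(value=folded), node)
        return node


def fold(source):
    """Parse an expression and apply the folding pass to it."""
    tree = FoldConstantEncode().visit(ast.parse(source, mode="eval"))
    ast.fix_missing_locations(tree)
    return tree
```

This only fires when the receiver is a str literal and every argument
is a str literal, which is what makes it safe without any dict guard:
no name lookup is involved. Something like name.encode("ascii") is left
untouched.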

ChrisA

