[Python-ideas] RFC: PEP: Add dict.__version__
Terry Reedy
tjreedy at udel.edu
Sat Jan 9 17:18:40 EST 2016
On 1/8/2016 4:27 PM, Victor Stinner wrote:
> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.
I agree with Neil Girdhar that this looks to me like a CPython-specific
implementation detail that should not be imposed on other
implementations. For testing, perhaps we could add a dict_version
function in test.support that uses ctypes to access the internals.
Another reason to hide __version__ from the Python level is that its use
seems to me rather tricky and bug-prone.
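To make the discussion concrete, the proposed semantics can be emulated
in pure Python with a dict subclass (a sketch only; the PEP proposes a
C-level field on dict itself, and the class name here is hypothetical):

```python
class VersionedDict(dict):
    """Sketch of PEP 509 semantics: bump a counter on each change.

    Simplified: the real proposal also covers update(), clear(),
    pop(), etc., and lives in the C implementation of dict.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self._version += 1

    @property
    def __version__(self):
        return self._version

d = VersionedDict()
d['x'] = 1            # version 1
d['x'] = 2            # version 2: rebinding a key also counts
del d['x']            # version 3
print(d.__version__)  # -> 3
```

A guard would then be a cheap integer comparison against a remembered
version, rather than a lookup and comparison of the value itself.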
> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime.
I believe that C-coded functions are immutable. But I believe that
mutability otherwise undercuts what you are trying to do.
> Implementing optimizations respecting the Python
> semantic requires to detect when "something changes":
But as near as I can tell, your proposal cannot detect all relevant
changes unless one is *very* careful. A dict maps hashable objects to
objects. Objects represent values. So a dict represents a mapping of
values to values. If an object is mutated, the object to object mapping
is not changed, but the semantic value to value mapping *is* changed.
In the following example, __version__ twice gives the 'wrong' answer
from a value perspective.
d = {'f': [int]}
d['f'][0] = float # object mapping unchanged, value mapping changed
d['f'] = [float] # object mapping changed, value mapping unchanged
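The distinction can be demonstrated in plain Python today, without any
__version__, by checking object identity versus equality:

```python
# Object-mapping vs value-mapping, as in the example above.
d = {'f': [int]}
inner = d['f']

d['f'][0] = float          # in-place mutation: the dict still maps
assert d['f'] is inner     # 'f' to the *same* list object,
assert d['f'] == [float]   # but the list's *value* has changed.

old = d['f']
d['f'] = [float]           # rebinding: the dict now maps 'f' to a
assert d['f'] is not old   # *new* list object
assert d['f'] == old       # whose value equals the old one.
```

A per-dict version counter would tick only in the second case, even
though it is the first case that changes the semantic value.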
> The astoptimizer of the FAT Python project implements many optimizations
> which require guards on namespaces. Examples:
>
> * Call pure builtins: to replace ``len("abc")`` with ``3``,
Replacing a call with a return value assumes that the function is
immutable, deterministic, and without side-effect. Perhaps this is what
you meant by 'pure'. Are you proposing to provide astoptimizer with
either a whitelist or blacklist of builtins that qualify or not?
Aside from this, I don't find this example motivational. I would either
write '3' in the first place or write something like "slen =
len('akjslkjgkjsdkfjsldjkfs')" outside of any loop. I would more likely
write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len =
len(key)" to keep a reference to both the string and its length. Will
astoptimizer 'propagate the constant' (in this case 'key')?
The question in my mind is whether real code has enough pure builtin
calls of constants to justify the overhead.
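For illustration, this kind of folding can be sketched with the ast
module (this is not the actual astoptimizer; a real optimizer would
also need the runtime guard that the PEP's __version__ is meant to
make cheap):

```python
import ast

class FoldLenOfStrConstant(ast.NodeTransformer):
    """Replace len(<string literal>) with its computed length.

    Sketch only: valid solely under the assumption that 'len' still
    names the builtin when the code runs, which is exactly what a
    namespace guard would have to verify.
    """

    def visit_Call(self, node):
        self.generic_visit(node)
        if (isinstance(node.func, ast.Name) and node.func.id == 'len'
                and len(node.args) == 1 and not node.keywords
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            return ast.copy_location(
                ast.Constant(value=len(node.args[0].value)), node)
        return node

tree = ast.parse("n = len('abc')")
tree = FoldLenOfStrConstant().visit(tree)
ast.fix_missing_locations(tree)
print(ast.unparse(tree))  # -> n = 3
```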
> * Loop unrolling: to unroll the loop ``for i in range(...): ...``,
How often is this useful in modern real-world Python code? Many old
uses of range have been or could be replaced with enumerate or a
collection iterator, making it less common than it once was.
How often is N small enough that one wants complete versus partial
unrolling? Wouldn't it be simpler to only use a (specialized)
loop-unroller where range is known to be the builtin?
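A guarded specialization along those lines might look like the
following sketch (function names are invented for illustration; the
PEP would replace the identity check with a version comparison on the
relevant namespace dicts):

```python
import builtins

def sum3_unrolled(data):
    # Specialized version: the loop over range(3) unrolled by hand.
    return data[0] + data[1] + data[2]

def sum3_generic(data):
    # Generic fallback, correct even if 'range' has been shadowed.
    total = 0
    for i in range(3):
        total += data[i]
    return total

def sum3(data):
    # Guard: take the unrolled fast path only while 'range' still
    # resolves to the builtin.  With dict.__version__, this check
    # would be a cheap integer compare on globals()/builtins instead
    # of a name lookup and identity test.
    if range is builtins.range:
        return sum3_unrolled(data)
    return sum3_generic(data)

print(sum3([1, 2, 3]))  # -> 6
```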
--
Terry Jan Reedy