[Python-ideas] RFC: PEP: Add dict.__version__

Terry Reedy tjreedy at udel.edu
Sat Jan 9 17:18:40 EST 2016


On 1/8/2016 4:27 PM, Victor Stinner wrote:

> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.

I agree with Neil Girdhar that this looks to me like a CPython-specific 
implementation detail that should not be imposed on other 
implementations.  For testing, perhaps we could add a dict_version 
function in test.support that uses ctypes to access the internals.
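Such a test.support helper could look roughly like this. This is only a sketch: it assumes a 64-bit non-debug CPython build and the proposed struct layout in which a ma_version_tag field follows ma_used in PyDictObject; the name dict_version and the field offsets are assumptions, not an existing API, and would break on any interpreter with a different layout.

```python
import ctypes

class PyDictObject(ctypes.Structure):
    # Mirrors the assumed CPython dict layout on a 64-bit build:
    # PyObject_HEAD (refcount + type pointer), then ma_used, then the
    # proposed ma_version_tag.  Pure implementation detail.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ma_used", ctypes.c_ssize_t),
        ("ma_version_tag", ctypes.c_uint64),
    ]

def dict_version(d):
    # id(d) is the object's address in CPython, so we can overlay the
    # Structure on the live dict and read the tag without exposing it
    # at the Python level.
    return PyDictObject.from_address(id(d)).ma_version_tag
```

With such a helper, tests could check the tag without the dict type itself growing a public __version__ property.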

Another reason to hide __version__ from the Python level is that its use 
seems to me rather tricky and bug-prone.

> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime.

I believe that C-coded functions are immutable.  But I believe that 
mutability otherwise undercuts what you are trying to do.

> Implementing optimizations respecting the Python
> semantic requires to detect when "something changes":

But as near as I can tell, your proposal cannot detect all relevant 
changes unless one is *very* careful.  A dict maps hashable objects to 
objects.  Objects represent values.  So a dict represents a mapping of 
values to values.  If an object is mutated, the object to object mapping 
is not changed, but the semantic value to value mapping *is* changed. 
In the following example, __version__ twice gives the 'wrong' answer 
from a value perspective.

d = {'f': [int]}
d['f'][0] = float # object mapping unchanged, value mapping changed
d['f'] = [float]  # object mapping changed, value mapping unchanged
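The gap between the two notions of change can be demonstrated today without any version tag, by snapshotting the dict's *values* with a deep copy; an in-place mutation that no per-dict version counter would notice still changes the value-level mapping:

```python
import copy

d = {'f': [int]}
snapshot = copy.deepcopy(d)   # value-level snapshot of the mapping

d['f'][0] = float             # in-place mutation: a per-dict version
                              # tag would NOT be incremented here...
assert d != snapshot          # ...yet the value mapping has changed

d['f'] = [float]              # rebinding: the tag WOULD be bumped...
assert d == {'f': [float]}    # ...yet the value mapping is the same
```

So a guard built only on the version tag is sound for object-identity questions, but not for value-level ones.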

> The astoptimizer of the FAT Python project implements many optimizations
> which require guards on namespaces. Examples:
>
> * Call pure builtins: to replace ``len("abc")`` with ``3``,

Replacing a call with a return value assumes that the function is 
immutable, deterministic, and without side-effect.  Perhaps this is what 
you meant by 'pure'.  Are you proposing to provide astoptimizer with 
either a whitelist or blacklist of builtins that qualify or not?
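As I understand the proposal, the folded constant is only valid while the relevant namespaces still bind ``len`` to the builtin. A rough sketch of that guard pattern, with all names hypothetical (this is not FAT Python's actual code):

```python
import builtins

def make_guard(namespace, name, expected):
    # Return a cheap check that `name` still maps to the same object.
    # A dict version tag would let this check short-circuit even faster.
    def check():
        return namespace.get(name) is expected
    return check

len_guard = make_guard(vars(builtins), 'len', len)

def fast_len_abc():
    if len_guard():        # builtins.len untouched: constant is safe
        return 3           # folded result of len("abc")
    return len("abc")      # deoptimized fall-back: do the real call
```

The whitelist/blacklist question is then which builtins are pure enough that returning the folded value under the guard preserves semantics.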

Aside from this, I don't find this example motivational.  I would either 
write '3' in the first place or write something like "slen = 
len('akjslkjgkjsdkfjsldjkfs')" outside of any loop.  I would more likely 
write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len = 
len(key)" to keep a reference to both the string and its length.  Will 
astoptimizer 'propagate the constant' (in this case 'key')?

The question in my mind is whether real code has enough pure builtin 
calls of constants to justify the overhead.

> * Loop unrolling: to unroll the loop ``for i in range(...): ...``,

How often is this useful in modern real-world Python code?  Many old 
uses of range have been or could be replaced with enumerate or a 
collection iterator, making it less common than it once was.

How often is N small enough that one wants complete versus partial 
unrolling?  Wouldn't it be simpler to only use a (specialized) 
loop-unroller where range is known to be the builtin?
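For concreteness, complete unrolling of a small constant range amounts to the following rewrite, which is only valid if ``range`` at that call site is known (via a guard) to be the builtin:

```python
def rolled(body):
    # Original loop as written by the programmer.
    for i in range(3):
        body(i)

def unrolled(body):
    # What complete unrolling would emit for range(3):
    # the loop machinery disappears, the bodies are inlined in order.
    body(0)
    body(1)
    body(2)

out1, out2 = [], []
rolled(out1.append)
unrolled(out2.append)
assert out1 == out2 == [0, 1, 2]
```

For a large N, emitting N copies of the body is a code-size loss, which is why complete unrolling only pays off when N is small.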

-- 
Terry Jan Reedy


