
On 1/8/2016 4:27 PM, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
I agree with Neil Girdhar that this looks to me like a CPython-specific implementation detail that should not be imposed on other implementations. For testing, perhaps we could add a dict_version function in test.support that uses ctypes to access the internals. Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.
Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, ... can be modified at runtime.
I believe that C-coded functions are immutable. But I believe that mutability otherwise undercuts what you are trying to do.
Implementing optimizations respecting the Python semantic requires to detect when "something changes":
But as near as I can tell, your proposal cannot detect all relevant changes unless one is *very* careful. A dict maps hashable objects to objects. Objects represent values. So a dict represents a mapping of values to values. If an object is mutated, the object to object mapping is not changed, but the semantic value to value mapping *is* changed. In the following example, __version__ twice gives the 'wrong' answer from a value perspective.

    d = {'f': [int]}
    d['f'][0] = float  # object mapping unchanged, value mapping changed
    d['f'] = [float]   # object mapping changed, value mapping unchanged
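The example above can be run against a stand-in for the proposal. The sketch below is only an illustration: VersionedDict and its version attribute are hypothetical names, and it assumes the simplest possible rule (bump a counter on every item assignment or deletion), which may not match the exact rule the PEP ends up with.

```python
from collections import UserDict


class VersionedDict(UserDict):
    """Hypothetical stand-in for a dict with a change counter."""

    def __init__(self, *args, **kwargs):
        # Must exist before UserDict.__init__ calls update() -> __setitem__.
        self.version = 0
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1  # assumed rule: every assignment counts as a change

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


d = VersionedDict({'f': [int]})
v0 = d.version

d['f'][0] = float            # mutates the list held by the dict
v1 = d.version               # counter has not moved
value_after_mutation = list(d['f'])

d['f'] = [float]             # rebinds the key to an equal value
v2 = d.version               # counter has moved
value_after_rebind = list(d['f'])
```

Here v1 == v0 although the value seen through d['f'] changed, and v2 != v1 although the value seen through d['f'] is equal before and after. A guard that compares only the counter would miss the first change and be invalidated needlessly by the second.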
The astoptimizer of the FAT Python project implements many optimizations which require guards on namespaces. Examples:
* Call pure builtins: to replace ``len("abc")`` with ``3``,
Replacing a call with a return value assumes that the function is immutable, deterministic, and without side effects. Perhaps this is what you meant by 'pure'. Are you proposing to provide astoptimizer with either a whitelist or blacklist of builtins that qualify or not? Aside from this, I don't find this example motivational. I would either write '3' in the first place or write something like "slen = len('akjslkjgkjsdkfjsldjkfs')" outside of any loop. I would more likely write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len = len(key)" to keep a reference to both the string and its length. Will astoptimizer 'propagate the constant' (in this case 'key')? The question in my mind is whether real code has enough pure builtin calls of constants to justify the overhead.
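To make the overhead question concrete, here is roughly what a guarded replacement of len("abc") has to do at run time. This is a hand-written sketch, not astoptimizer output; len_guard_ok and folded_len_abc are hypothetical names, and the guard checks the namespaces directly rather than using a dict version.

```python
import builtins

# Captured once, at the time the hypothetical optimizer specializes the code.
_ORIGINAL_LEN = builtins.len


def len_guard_ok(namespace):
    """True if the name 'len' still resolves to the original builtin."""
    return "len" not in namespace and builtins.len is _ORIGINAL_LEN


def folded_len_abc(namespace):
    """Evaluate len("abc") as seen from `namespace`, folded when safe."""
    if len_guard_ok(namespace):
        return 3  # fast path: the constant the optimizer substituted
    # Slow path: the guard failed, so do the ordinary name lookup and call.
    fn = namespace.get("len", builtins.len)
    return fn("abc")


module_globals = {}
fast_result = folded_len_abc(module_globals)

module_globals["len"] = lambda s: 42   # someone shadows the builtin
slow_result = folded_len_abc(module_globals)
```

Every call pays for the guard, so the folding only wins if the guard is cheaper than the lookup and call it replaces, and if such calls occur often enough in real code.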
* Loop unrolling: to unroll the loop ``for i in range(...): ...``,
How often is this useful in modern real-world Python code? Many old uses of range have been or could be replaced with enumerate or a collection iterator, making it less common than it once was. How often is N small enough that one wants complete versus partial unrolling? Wouldn't it be simpler to only use a (specialized) loop-unroller where range is known to be the builtin?

--
Terry Jan Reedy