RFC: PEP: Add dict.__version__
Hi,

Here is a first PEP, part of a series of 3 PEPs to add an API to implement a
static Python optimizer specializing functions with guards.

HTML version:
https://faster-cpython.readthedocs.org/pep_dict_version.html#pep-dict-versio...

PEP: xxx
Title: Add dict.__version__
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6

Abstract
========

Add a new read-only ``__version__`` property to ``dict`` and
``collections.UserDict`` types, incremented at each change.

Rationale
=========

In Python, the builtin ``dict`` type is used by many instructions. For
example, the ``LOAD_GLOBAL`` instruction searches for a variable in the
global namespace, or in the builtins namespace (two dict lookups). Python
uses ``dict`` for the builtins namespace, globals namespace, type namespaces,
instance namespaces, etc. The local namespace (namespace of a function) is
usually optimized to an array, but it can be a dict too.

Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implementing optimizations respecting the Python
semantics requires detecting when "something changes": we will call these
checks "guards".

The speedup of optimizations depends on the speed of guard checks. This PEP
proposes to add a version to dictionaries to implement efficient guards on
namespaces.

Example of optimization: replace loading a global variable with a constant.
This optimization requires a guard on the global variable to check if it was
modified. If the variable is modified, the variable must be loaded at
runtime, instead of using the constant.

Guard example
=============

Pseudo-code of an efficient guard to check if a dictionary key was modified
(created, updated or deleted)::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.value = dict.get(key, UNSET)
            self.version = dict.__version__

        def check(self):
            """Return True if the dictionary value did not change."""
            version = self.dict.__version__
            if version == self.version:
                # Fast-path: avoid the dictionary lookup
                return True

            value = self.dict.get(self.key, UNSET)
            if value == self.value:
                # another key was modified:
                # cache the new dictionary version
                self.version = version
                return True

            return False

Changes
=======

Add a read-only ``__version__`` property to the builtin ``dict`` type and to
the ``collections.UserDict`` type. New empty dictionaries are initialized to
version ``0``. The version is incremented at each change:

* ``clear()`` if the dict was non-empty
* ``pop(key)`` if the key exists
* ``popitem()`` if the dict is non-empty
* ``setdefault(key, value)`` if the `key` does not exist
* ``__delitem__(key)`` if the key exists
* ``__setitem__(key, value)`` if the `key` doesn't exist or if the value is
  different
* ``update(...)`` if new values are different from existing values (the
  version can be incremented multiple times)

Example::

    >>> d = {}
    >>> d.__version__
    0
    >>> d['key'] = 'value'
    >>> d.__version__
    1
    >>> d['key'] = 'new value'
    >>> d.__version__
    2
    >>> del d['key']
    >>> d.__version__
    3

If a dictionary is created with items, the version is also incremented at
each dictionary insertion.
Example::

    >>> d = dict(x=7, y=33)
    >>> d.__version__
    2

The version is not incremented if an existing key is set to the same value,
but only the identifier of the value is tested, not the content of the value.
Example::

    >>> d = {}
    >>> value = object()
    >>> d['key'] = value
    >>> d.__version__
    1
    >>> d['key'] = value
    >>> d.__version__
    1

.. note::
   CPython uses some singletons like integers in the range [-5; 257], the
   empty tuple, empty strings, Unicode strings of a single character in the
   range [U+0000; U+00FF], etc. When a key is set twice to the same
   singleton, the version is not modified.

The PEP is designed to implement guards on namespaces; in practice, only the
``dict`` type can be used for namespaces. ``collections.UserDict`` is
modified because it must mimic ``dict``. ``collections.Mapping`` is
unchanged.

Integer overflow
================

The implementation uses the C unsigned integer type ``size_t`` to store the
version. On 32-bit systems, the maximum version is ``2**32-1`` (more than
``4.2 * 10 ** 9``, 4 billion). On 64-bit systems, the maximum version is
``2**64-1`` (more than ``1.8 * 10**19``).

The C code uses ``version++``. The behaviour on integer overflow of the
version is undefined. The minimum guarantee is that the version always
changes when the dictionary is modified.

The check ``dict.__version__ == old_version`` can be true after an integer
overflow, so a guard can return false even if the value changed, which is
wrong. The bug occurs if the dict is modified at least ``2**64`` times (on a
64-bit system) between two checks of the guard.

Using a more complex type (ex: ``PyLongObject``) to avoid the overflow would
slow down operations on the ``dict`` type. Even if there is a theoretical
risk of missing a value change, the risk is considered too low compared to
the slowdown of using a more complex type.

Alternatives
============

Add a version to each dict entry
--------------------------------

A single version per dictionary requires keeping a strong reference to the
value, which can keep the value alive longer than expected. If we also add a
version per dictionary entry, the guard can rely on the entry version and so
avoid the strong reference to the value (only strong references to a
dictionary and to a key are needed).

Changes: add a ``getversion(key)`` method to dictionaries which returns
``None`` if the key doesn't exist. When a key is created or modified, the
entry version is set to the dictionary version, which is incremented at each
change (create, modify, delete).

Pseudo-code of an efficient guard to check if a dict key was modified using
``getversion()``::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.dict_version = dict.__version__
            self.entry_version = dict.getversion(key)

        def check(self):
            """Return True if the dictionary value did not change."""
            dict_version = self.dict.__version__
            if dict_version == self.dict_version:
                # Fast-path: avoid the dictionary lookup
                return True

            # lookup in the dictionary, but get the entry version,
            # not the value
            entry_version = self.dict.getversion(self.key)
            if entry_version == self.entry_version:
                # another key was modified:
                # cache the new dictionary version
                self.dict_version = dict_version
                return True

            return False

The main drawback of this option is its impact on the memory footprint. It
increases the size of each dictionary entry, so the overhead depends on the
number of buckets (dictionary entries, used or unused yet).
For example, it increases the size of each dictionary entry by 8 bytes on
64-bit systems if we use ``size_t``.

In Python, the memory footprint matters and the trend is to reduce it.
Examples:

* `PEP 393 -- Flexible String Representation
  <https://www.python.org/dev/peps/pep-0393/>`_
* `PEP 412 -- Key-Sharing Dictionary
  <https://www.python.org/dev/peps/pep-0412/>`_

Add a new dict subtype
----------------------

Add a new ``verdict`` type, subtype of ``dict``. When guards are needed, use
the ``verdict`` type for namespaces (module namespace, type namespace,
instance namespace, etc.) instead of ``dict``.

Leave the ``dict`` type unchanged to not add any overhead (memory footprint)
when guards are not needed.

Technical issue: a lot of C code in the wild, including CPython core, expects
the exact ``dict`` type. Issues:

* ``exec()`` requires a ``dict`` for globals and locals. A lot of code uses
  ``globals={}``. It is not possible to cast the ``dict`` to a ``dict``
  subtype because the caller expects the ``globals`` parameter to be modified
  (``dict`` is mutable).
* Functions call ``PyDict_xxx()`` functions directly, instead of calling
  ``PyObject_xxx()`` if the object is a ``dict`` subtype.
* The ``PyDict_CheckExact()`` check fails on ``dict`` subtypes, whereas some
  functions require the exact ``dict`` type.
* ``Python/ceval.c`` does not completely support dict subtypes for
  namespaces.

The ``exec()`` issue is a blocker issue.

Other issues:

* The garbage collector has special code to "untrack" ``dict`` instances. If
  a ``dict`` subtype is used for namespaces, the garbage collector may be
  unable to break some reference cycles.
* Some functions have a fast-path for ``dict`` which would not be taken for
  ``dict`` subtypes, and so it would make Python a little bit slower.

Usage of dict.__version__
=========================

astoptimizer of FAT Python
--------------------------

The astoptimizer of the FAT Python project implements many optimizations
which require guards on namespaces. Examples:

* Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
  ``builtins.__dict__['len']`` and ``globals()['len']`` are required
* Loop unrolling: to unroll the loop ``for i in range(...): ...``, guards on
  ``builtins.__dict__['range']`` and ``globals()['range']`` are required

The `FAT Python <http://faster-cpython.readthedocs.org/fat_python.html>`_
project is a static optimizer for Python 3.6.

Pyjion
------

According to Brett Cannon, one of the two main developers of Pyjion, Pyjion
can also benefit from the dictionary version to implement optimizations.
Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET Core
runtime).

Unladen Swallow
---------------

Even though the dictionary version was not explicitly mentioned, optimizing
globals and builtins lookup was part of the Unladen Swallow plan: "Implement
one of the several proposed schemes for speeding lookups of globals and
builtins." Source: `Unladen Swallow ProjectPlan
<https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_.

Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler implemented
with LLVM. The project stopped in 2011: `Unladen Swallow Retrospective
<http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html>`_.

Prior Art
=========

Cached globals+builtins lookup
------------------------------

In 2006, Andrea Griffini proposed a patch implementing a `Cached
globals+builtins lookup optimization
<https://bugs.python.org/issue1616125>`_. The patch adds a private
``timestamp`` field to dict.
See the thread on python-dev: `About dictionary lookup caching
<https://mail.python.org/pipermail/python-dev/2006-December/070348.html>`_.

Globals / builtins cache
------------------------

In 2010, Antoine Pitrou proposed a `Globals / builtins cache
<http://bugs.python.org/issue10401>`_ which adds a private ``ma_version``
field to the ``dict`` type. The patch adds a "global and builtin cache" to
functions and frames, and changes the ``LOAD_GLOBAL`` and ``STORE_GLOBAL``
instructions to use the cache.

PySizer
-------

`PySizer <http://pysizer.8325.org/>`_: a memory profiler for Python, a Google
Summer of Code 2005 project by Nick Smallbone.

This project has a patch for CPython 2.4 which adds ``key_time`` and
``value_time`` fields to dictionary entries. It uses a global process-wide
counter for dictionaries, incremented each time that a dictionary is
modified. The times are used to decide when child objects first appeared in
their parent objects.

Copyright
=========

This document has been placed in the public domain.

-- Victor
On Sat, Jan 9, 2016 at 8:27 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Here is a first PEP, part of a series of 3 PEPs to add an API to implement a static Python optimizer specializing functions with guards.
Are you intending for these features to become part of the Python core language, or are you discussing this as something that your alternate implementation will do? If the former, send your PEP drafts to peps@python.org and we can get them assigned numbers; if the latter, is there some specific subset of this which *is* for the language core? (For example, MyPy has type checking, but PEP 484 isn't proposing to include that in the core; all it asks is for a 'typing.py' to allow the code to run unchanged.) ChrisA
2016-01-09 2:00 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
Are you intending for these features to become part of the Python core language
Yes.
If the former, send your PEP drafts to peps@python.org and we can get them assigned numbers
My plan is to start a first round of discussion on python-ideas, then get a PEP number for my PEPs before moving the discussion to python-dev. Victor
On Sat, Jan 9, 2016 at 12:09 PM, Victor Stinner <victor.stinner@gmail.com> wrote:
2016-01-09 2:00 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
Are you intending for these features to become part of the Python core language
Yes.
If the former, send your PEP drafts to peps@python.org and we can get them assigned numbers
My plan is to start a first round of discussion on python-ideas, then get a PEP number for my PEPs before moving the discussion to python-dev.
The discussion on python-ideas can benefit from PEP numbers too, particularly since you're putting three separate proposals up. ("Wait, I know I saw a comment about that. Oh right, that was in PEP 142857, not 142856.") But it's up to you. ChrisA
2016-01-09 2:38 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
The discussion on python-ideas can benefit from PEP numbers too, particularly since you're putting three separate proposals up. ("Wait, I know I saw a comment about that. Oh right, that was in PEP 142857, not 142856.") But it's up to you.
Hum, I forgot to mention that I'm not 100% sure yet that I correctly split my work on the FAT Python project into the right number of PEPs. Maybe we could merge two PEPs, or a PEP should be split into sub-PEPs because it requires too many changes (I'm thinking of the third PEP, not published yet; it's still a "private" draft). Victor
On 9 January 2016 at 12:01, Victor Stinner <victor.stinner@gmail.com> wrote:
2016-01-09 2:38 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
The discussion on python-ideas can benefit from PEP numbers too, particularly since you're putting three separate proposals up. ("Wait, I know I saw a comment about that. Oh right, that was in PEP 142857, not 142856.") But it's up to you.
Hum, I forgot to mention that I'm not 100% sure yet that I correctly split my work on the FAT Python project into the right number of PEPs. Maybe we could merge two PEPs, or a PEP should be split into sub-PEPs because it requires too many changes (I'm thinking of the third PEP, not published yet; it's still a "private" draft).
The first two proposals you've posted make sense to consider as standalone changes, so it seems reasonable to assign them PEP numbers now rather than waiting. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2016-01-09 5:47 GMT+01:00 Nick Coghlan <ncoghlan@gmail.com>:
The first two proposals you've posted make sense to consider as standalone changes, so it seems reasonable to assign them PEP numbers now rather than waiting.
Ok fine, I requested 3 numbers for my first draft PEPs. Victor
2016-01-09 2:00 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
(...) send your PEP drafts to peps@python.org and we can get them assigned numbers
Ok, this PEP got the number 509: "PEP 0509 -- Add dict.__version__" https://www.python.org/dev/peps/pep-0509/ FYI the second PEP got the number 510: "PEP 0510 -- Specialized functions with guards" https://www.python.org/dev/peps/pep-0510/ Victor
On 08.01.16 23:27, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
This may not be the best name for the property. Many modules already have a __version__ attribute, so this may cause confusion.
The C code uses ``version++``. The behaviour on integer overflow of the version is undefined. The minimum guarantee is that the version always changes when the dictionary is modified.
For clarification, this code has defined behavior in C (we should avoid introducing new undefined behaviors). Maybe you mean that the behavior is not specified from the Python side (since it is platform- and implementation-defined).
Usage of dict.__version__ =========================
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332.
On 9 January 2016 at 16:03, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 08.01.16 23:27, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
This may not be the best name for the property. Many modules already have a __version__ attribute, so this may cause confusion.
The equivalent API for the global ABC object graph is abc.get_cache_token: https://docs.python.org/3/library/abc.html#abc.get_cache_token One of the reasons we chose that name is that even though it's a number, the only operation with semantic significance is equality testing, with the intended use case being cache invalidation when the token changes value. If we followed the same reasoning for Victor's proposal, then a suitable attribute name would be "__cache_token__".
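For readers unfamiliar with that API, a minimal illustration of the equality-only usage pattern described above, using only the documented abc module (nothing here depends on the dict proposal):

    import abc
    from collections.abc import Sized

    token = abc.get_cache_token()   # opaque token; only == / != are meaningful

    class MySized:
        def __len__(self):
            return 0

    Sized.register(MySized)         # any ABCMeta.register() call...
    assert abc.get_cache_token() != token   # ...invalidates the old token

Callers never inspect the token's value; they just re-check it later and rebuild their caches when it no longer compares equal.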
The C code uses ``version++``. The behaviour on integer overflow of the version is undefined. The minimum guarantee is that the version always changes when the dictionary is modified.
For clarification, this code has defined behavior in C (we should avoid introducing new undefined behaviors). Maybe you mean that the behavior is not specified from the Python side (since it is platform- and implementation-defined).
At least in recent versions of the standard*, overflow is defined on unsigned types as wrapping modulo-N. It only remains formally undefined for signed types. *(I'm not sure about C89, but with MSVC getting their standards compliance act together, we could potentially update our minimum C version expectation in PEP 7 to C99 or even C11).
Usage of dict.__version__ =========================
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332.
I initially thought the same thing, but the cache token will be updated even if the keys all stay the same, and one of the values is modified, while the mutation-during-iteration check is aimed at detecting changes to the keys, rather than the values. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 09.01.16 09:43, Nick Coghlan wrote:
If we followed the same reasoning for Victor's proposal, then a suitable attribute name would be "__cache_token__".
LGTM.
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332.
I initially thought the same thing, but the cache token will be updated even if the keys all stay the same, and one of the values is modified, while the mutation-during-iteration check is aimed at detecting changes to the keys, rather than the values.
This makes Raymond's objections even stronger.
2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka@gmail.com>:
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332. (...)
This makes Raymond's objections even stronger.
Raymond has two major objections: memory footprint and performance. I opened an issue with a patch implementing dict.__version__ and I ran pybench: https://bugs.python.org/issue26058#msg257810

pybench doesn't seem reliable: microbenchmarks on dict seem faster with the patch, which doesn't make sense. I expect worse or the same performance.

With my own timeit microbenchmarks, I don't see any slowdown with the patch. For an unknown reason (it's really strange), dict operations seem even faster with the patch.

For the memory footprint, it's clearly stated in the PEP that it adds 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype" section which explains why I proposed to modify the dict type directly.

IMHO adding 8 bytes per dict is worth it. See for example the microbenchmarks on func.specialize(), which rely on dict.__version__ to implement efficient guards on namespaces: https://faster-cpython.readthedocs.org/pep_specialize.html#example

"1.6 times faster" (155 ns => 95 ns) is better than the few percent usually seen when optimizing dict operations.

Victor
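To make the link between the guard and that speedup concrete, here is a rough, hypothetical sketch of what a specialized function could look like. The names are invented for illustration, the real func.specialize() machinery is simplified away, and ``__version__`` is the attribute proposed by the PEP, so this only runs on a patched interpreter:

    import builtins

    _ns = builtins.__dict__
    _version = _ns.__version__     # proposed attribute: snapshot taken at specialization time
    _len = _ns['len']              # the watched value

    def f():                       # original function: return len("abc")
        return len("abc")

    def f_specialized():
        # fast path: builtins namespace unchanged since specialization
        if _ns.__version__ == _version:
            return 3               # constant folded from len("abc")
        # slow path: something in builtins changed; check the watched key itself
        # (a real guard would also re-cache the version here)
        if _ns.get('len') is _len:
            return 3
        return f()                 # deoptimize: fall back to the original code

The fast path is a struct field read plus an integer comparison, which is the nanosecond-level saving the benchmark above measures.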
On 09.01.2016 10:58, Victor Stinner wrote:
2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka@gmail.com>:
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332. (...)
This makes Raymond's objections even more strong.
Raymond has two major objections: memory footprint and performance. I opened an issue with a patch implementing dict__version__ and I ran pybench: https://bugs.python.org/issue26058#msg257810
pybench doesn't seem reliable: microbenchmarks on dict seems faster with the patch, it doesn't make sense. I expect worse or same performance.
With my own timeit microbenchmarks, I don't see any slowdown with the patch. For an unknown reason (it's really strange), dict operations seem even faster with the patch.
This can well be caused by a better memory alignment, which depends on the CPU you're using.
For the memory footprint, it's clearly stated in the PEP that it adds 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype" section which explains why I proposed to modify directly the dict type.
Some questions:

* How would the implementation deal with wrap-around of the version number for fast-changing dicts (esp. on 32-bit platforms)?

* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information?

  AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked?". For an optimization it's good enough to get an answer "yes" for slow-changing dicts and "no" for all other cases. False negatives don't really hurt. False positives are not allowed.

  What you'd need to answer the question is a way for the code in need of the information to remember the dict state and then later compare its remembered state with the now current state of the dict.

  dicts could do this with a 16-bit index into an array of state object slots which are set by the code tracking the dict. When it's time to check, the code would simply ask for the current index value and compare the state object in the array with the one it had set.

* Wouldn't it be possible to use the hash array itself to store the state index? We could store the state object as a regular key in the dict and filter this out when accessing the dict. Alternatively, we could try to use the free slots for storing these state objects by e.g. declaring a free slot as being NULL or a pointer to a state object.

-- Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Experts (#1, Jan 09 2016)
On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked ?".
For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases.
I don't understand this. The question has nothing to do with how quickly or slowly the dict has changed, but only on whether or not it actually has changed. Maybe your dict has been stable for three hours, except for one change; or it changes a thousand times a second. Either way, it has still changed.
False negatives don't really hurt. False positives are not allowed.
I think you have this backwards. False negatives potentially will introduce horrible bugs. A false negative means that you fail to notice when the dict has changed, when it actually has. ("Has the dict changed?" "No.") The result of that will be to apply the optimization when you shouldn't, and that is potentially catastrophic (the entirely wrong function is mysteriously called). A false positive means you wrongly think the dict has changed when it hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you miss out on the possibility of applying the optimization when you actually could have, but it's not so bad. So false positives (wrongly thinking the dict has changed when it hasn't) can be permitted, but false negatives shouldn't be.
What you'd need to answer the question is a way for the code in need of the information to remember the dict state and then later compare its remembered state with the now current state of the dict.
dicts could do this with a 16-bit index into an array of state object slots which are set by the code tracking the dict.
When it's time to check, the code would simply ask for the current index value and compare the state object in the array with the one it had set.
If I've understood that correctly, and I may not have, that will only detect (some?) insertions and deletions to the dict, but fail to detect when an existing key has a new value bound. -- Steve
On Sun, Jan 10, 2016 at 12:21 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked ?".
For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases.
I don't understand this. The question has nothing to do with how quickly or slowly the dict has changed, but only on whether or not it actually has changed. Maybe your dict has been stable for three hours, except for one change; or it changes a thousand times a second. Either way, it has still changed.
False negatives don't really hurt. False positives are not allowed.
I think you have this backwards. False negatives potentially will introduce horrible bugs. A false negative means that you fail to notice when the dict has changed, when it actually has. ("Has the dict changed?" "No.") The result of that will be to apply the optimization when you shouldn't, and that is potentially catastrophic (the entirely wrong function is mysteriously called).
A false positive means you wrongly think the dict has changed when it hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you miss out on the possibility of applying the optimization when you actually could have, but it's not so bad. So false positives (wrongly thinking the dict has changed when it hasn't) can be permitted, but false negatives shouldn't be.
I think we're getting caught in terminology a bit. The original question was "why a 64-bit counter". Here's my take on it:

* If the dict has changed but we say it hasn't, this is a critical failure. M-A L called this a "false positive", which works if the question is "may we use the optimized version".

* If the dict has changed exactly N times since it was last checked, where N is the integer wrap-around period of the counter, a naive counter comparison will show that it has not changed.

Consequently, a small counter is more problematic than a large one. If the counter has 2**8 states, then collisions will be frequent, and that would be bad. If it has 2**32 states, then a slow-changing dict will last longer than any typical run of a program (if it changes, say, once per second, you get over a century of uptime before it's a problem), but a fast-changing dict could run into issues (change every millisecond and you'll run into trouble after a couple of months). A 64-bit counter could handle ridiculously fast mutation (say, every nanosecond) for a ridiculously long time (hundreds of years). That's the only way that fast-changing and slow-changing have any meaning.

ChrisA
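For what it's worth, the arithmetic behind those figures is easy to reproduce; nothing CPython-specific here, just counting states of a wrapping counter:

    # time until an N-bit wrapping counter can repeat, at a given mutation rate
    def wrap_seconds(bits, mutations_per_second):
        return 2**bits / mutations_per_second

    SECONDS_PER_YEAR = 3600 * 24 * 365

    print(wrap_seconds(32, 1) / SECONDS_PER_YEAR)      # ~136 years at one change/second
    print(wrap_seconds(32, 1000) / (3600 * 24))        # ~50 days at one change/millisecond
    print(wrap_seconds(64, 10**9) / SECONDS_PER_YEAR)  # ~585 years at one change/nanosecond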
On 09.01.2016 14:21, Steven D'Aprano wrote:
On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked ?".
For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases.
I don't understand this. The question has nothing to do with how quickly or slowly the dict has changed, but only on whether or not it actually has changed. Maybe your dict has been stable for three hours, except for one change; or it changes a thousand times a second. Either way, it has still changed.
I was referring to how many versions will likely have passed since the code querying the dict last looked. Most algorithms won't be interested in the version number itself, but simply want to know whether the dict has changed or not.
False negatives don't really hurt. False positives are not allowed.
I think you have this backwards.
With "false negatives" I meant: the code says the dict has changed, even though it has not. With "false positives" I meant the code says the dict has not changed, even though it has. But you're right: I should have used more explicit definitions :-)
False negatives potentially will introduce horrible bugs. A false negative means that you fail to notice when the dict has changed, when it actually has. ("Has the dict changed?" "No.") The result of that will be to apply the optimization when you shouldn't, and that is potentially catastrophic (the entirely wrong function is mysteriously called).
A false positive means you wrongly think the dict has changed when it hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you miss out on the possibility of applying the optimization when you actually could have, but it's not so bad. So false positives (wrongly thinking the dict has changed when it hasn't) can be permitted, but false negatives shouldn't be.
What you'd need to answer the question is a way for the code in need of the information to remember the dict state and then later compare its remembered state with the now current state of the dict.
dicts could do this with a 16-bit index into an array of state object slots which are set by the code tracking the dict.
When it's time to check, the code would simply ask for the current index value and compare the state object in the array with the one it had set.
If I've understood that correctly, and I may not have, that will only detect (some?) insertions and deletions to the dict, but fail to detect when an existing key has a new value bound.
This depends on how the state object is managed by the dictionary implementation. It's currently just a rough idea. Thinking about this some more, I guess having external code set the state object would result in potential race conditions, so not a good plan. The idea was to add a level of indirection to reduce the memory overhead, under the assumption that only a few dictionaries will actually need to report changes.

-- Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Experts (#1, Jan 09 2016)
2016-01-09 13:09 GMT+01:00 M.-A. Lemburg <mal@egenix.com>:
* How would the implementation deal with wrap around of the version number for fast changing dicts (esp. on 32-bit platforms) ?
Let me try to do some maths.

haypo@selma$ python3 -m timeit 'd={}' 'for i in range(2**16): d[i]=i'
100 loops, best of 3: 7.01 msec per loop
haypo@selma$ python3
Python 3.4.3 (default, Jun 29 2015, 12:16:01)
>>> t = 7.01e-3 / 2**16
>>> t * 1e9
106.964111328125
It looks like __setitem__() takes 107 ns on average. I guess that the number depends a lot on the dictionary size, the number of required resizes (rehashing all keys), etc. But well, it's just to get an estimation.
>>> print(datetime.timedelta(seconds=2**32 * t))
0:07:39.407360
With a 32-bit version, less than 8 minutes are enough to hit the integer overflow if each dict operation changes the dict version and you modify a dict in a loop.
>>> print(2016 + datetime.timedelta(seconds=2**64 * t) / datetime.timedelta(days=365.25))
64541.02049400773
With a 64-bit version, the situation is very different: the next overflow will not occur before the year 64 541 :-) Maybe it's worth using a 64-bit version on 32-bit platforms? Python 3.5 already uses a 64-bit integer on 32-bit platforms to store a timestamp in the private "pytime" API. A guard only has a bug on integer overflow if the new version modulo 2^32 (or modulo 2^64) is equal to the old version. The bet is also that it's "unlikely".
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
If a guard says that nothing changed when something did change, it is a real issue for me. It means that the optimization changes the Python semantics.
For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases. False negatives don't really hurt. False positives are not allowed.
A false negative means that you lose the optimization. It would be annoying to see server performance degrade after N days because of an integer overflow :-/ It can be a big issue: how do you choose the number of servers if performance is not stable? Victor
On Sat, Jan 9, 2016 at 4:09 AM M.-A. Lemburg <mal@egenix.com> wrote:
On 09.01.2016 10:58, Victor Stinner wrote:
2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka@gmail.com>:
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332. (...)
This makes Raymond's objections even more strong.
Raymond has two major objections: memory footprint and performance. I opened an issue with a patch implementing dict__version__ and I ran pybench: https://bugs.python.org/issue26058#msg257810
pybench doesn't seem reliable: microbenchmarks on dict seems faster with the patch, it doesn't make sense. I expect worse or same performance.
With my own timeit microbenchmarks, I don't see any slowdown with the patch. For an unknown reason (it's really strange), dict operations seem even faster with the patch.
This can well be caused by a better memory alignment, which depends on the CPU you're using.
For the memory footprint, it's clearly stated in the PEP that it adds 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype" section which explains why I proposed to modify directly the dict type.
Some questions:
* How would the implementation deal with wrap around of the version number for fast changing dicts (esp. on 32-bit platforms) ?
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked ?".
For an optimization it's good enough to get an answer "yes" for slow changing dicts and "no" for all other cases. False negatives don't really hurt. False positives are not allowed.
What you'd need to answer the question is a way for the code in need of the information to remember the dict state and then later compare it's remembered state with the now current state of the dict.
dicts could do this with a 16-bit index into an array of state object slots which are set by the code tracking the dict.
When it's time to check, the code would simply ask for the current index value and compare the state object in the array with the one it had set.
Given it is for optimization only with the fallback slow path being to do an actual dict lookup, we could implement this using a single bit. Every modification sets the bit. There exists an API to clear the bit and to query the bit. Nothing else is needed. The bit could be stored creatively to avoid increasing the struct size, though ABI compatibility may prevent that...
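For illustration only, a rough Python sketch of that single-bit idea (the names are made up here; a real version would live inside the C dict implementation and cover every mutating method):

    class FlaggedDict(dict):
        """dict that records whether it was modified since the flag was cleared."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._modified = True

        def __setitem__(self, key, value):
            super().__setitem__(key, value)
            self._modified = True

        def __delitem__(self, key):
            super().__delitem__(key)
            self._modified = True

        # clear(), pop(), popitem(), update(), setdefault() omitted for brevity

        def check_and_clear(self):
            """Return True if the dict changed since the last call, then reset the flag."""
            was_modified, self._modified = self._modified, False
            return was_modified

One limitation of a single shared bit is that the first consumer to clear it hides the change from every other consumer, so it only really works when there is exactly one watcher per dict.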
* Wouldn't it be possible to use the hash array itself to store the state index ?
We could store the state object as regular key in the dict and filter this out when accessing the dict.
Alternatively, we could try to use the free slots for storing these state objects by e.g. declaring a free slot as being NULL or a pointer to a state object.
Marc-Andre Lemburg:
* Given that this is an optimization and not meant to be exact science, why would we need 64 bits worth of version information ?
AFAIK, you only need the version information to be able to answer the question "did anything change compared to last time I looked ?". (...)
Gregory P. Smith <greg@krypto.org>:
Given it is for optimization only with the fallback slow path being to do an actual dict lookup, we could implement this using a single bit.
You misunderstood the purpose of the PEP. The purpose is to implement fast guards by avoiding dict lookups in the common case (when watched keys are not modified), because dict lookups are fast, but still slower than reading a field of a C structure and doing an integer comparison. See the result of my microbenchmark: https://www.python.org/dev/peps/pep-0509/#implementation We are talking about nanoseconds.

For the optimizations that I implemented in FAT Python, I bet that watched keys are rarely modified. But it's common to modify the watched namespaces. For example, a global namespace can be modified by the "lazy module import" pattern: "global module; if module is None: import module". Or a global variable can be a counter used to generate identifiers, a counter modified regularly with "global counter; counter = counter + 1", which changes the dictionary version.
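Spelled out, those two namespace-mutation patterns look like this (nothing here is specific to the PEP; the point is simply that the lazy import rebinds its global once, while the counter rebinds a global on every call, so the version of globals() keeps moving even though a watched key like 'len' never changes):

    json = None

    def get_json():
        # "lazy module import": the first call rebinds the module-level name
        global json
        if json is None:
            import json
        return json

    counter = 0

    def next_identifier():
        # global counter: every call rebinds 'counter' in the module namespace
        global counter
        counter = counter + 1
        return counter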
* Wouldn't it be possible to use the hash array itself to store the state index ?
We could store the state object as regular key in the dict and filter this out when accessing the dict.
Alternatively, we could try to use the free slots for storing these state objects by e.g. declaring a free slot as being NULL or a pointer to a state object.
I'm sorry, I don't understand this idea. Victor
On Jan 09, 2016, at 10:58 AM, Victor Stinner wrote:
IMHO adding 8 bytes per dict is worth it.
I'm not so sure. There are already platforms where Python is unfeasible to generally use (e.g. some mobile devices) at least in part because of memory footprint. Dicts are used everywhere so think about the kind of impact adding 8 bytes to every dict in an application running on such systems will have. Cheers, -Barry
On 12 January 2016 at 13:04, Barry Warsaw <barry@python.org> wrote:
On Jan 09, 2016, at 10:58 AM, Victor Stinner wrote:
IMHO adding 8 bytes per dict is worth it.
I'm not so sure. There are already platforms where Python is unfeasible to generally use (e.g. some mobile devices) at least in part because of memory footprint. Dicts are used everywhere so think about the kind of impact adding 8 bytes to every dict in an application running on such systems will have.
This is another advantage of making this a CPython specific internal implementation detail - embedded focused variants like MicroPython won't need to implement it. The question then becomes "Are we willing to let CPython cede high memory pressure environments to more specialised Python variants?", and I think the answer to that is "yes". Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jan 12, 2016, at 01:37 PM, Nick Coghlan wrote:
The question then becomes "Are we willing to let CPython cede high memory pressure environments to more specialised Python variants?", and I think the answer to that is "yes".
I'm not so willing to cede that space to alternative implementations, at least not yet. If this suite of ideas yields *significant* performance improvements, it might be a worthwhile trade-off. But I'm not in favor of adding dict.__version__ in the hopes that we'll see that improvement; I think we need proof. That makes me think that 1) it should not be exposed to Python yet; 2) it should be conditionally compiled in, and not by default. This would allow experimentation without committing us to long-term maintenance or an across-the-board increase in memory pressures for speculative gains. Cheers, -Barry
On 1/12/2016 12:11 PM, Barry Warsaw wrote:
On Jan 12, 2016, at 01:37 PM, Nick Coghlan wrote:
The question then becomes "Are we willing to let CPython cede high memory pressure environments to more specialised Python variants?", and I think the answer to that is "yes".
I'm not so willing to cede that space to alternative implementations, at least not yet. If this suite of ideas yields *significant* performance improvements, it might be a worthwhile trade-off. But I'm not in favor of adding dict.__version__ in the hopes that we'll see that improvement; I think we need proof.
That makes me think that 1) it should not be exposed to Python yet; 2) it should be conditionally compiled in, and not by default. This would allow experimentation without committing us to long-term maintenance or an across-the-board increase in memory pressures for speculative gains.
New modules can be labelled 'provisional', whose meaning includes 'might be removed'. Can we do the same with new internal features? -- Terry Jan Reedy
Hi Nick, 2016-01-09 8:43 GMT+01:00 Nick Coghlan <ncoghlan@gmail.com>:
For clarification, this code has defined behavior in C (we should avoid introducing new undefined behaviors). Maybe you mean that the behavior is not specified from the Python side (since it is platform- and implementation-defined).
At least in recent versions of the standard*, overflow is defined on unsigned types as wrapping modulo-N. It only remains formally undefined for signed types.
*(I'm not sure about C89, but with MSVC getting their standards compliance act together, we could potentially update our minimum C version expectation in PEP 7 to C99 or even C11).
Great.
Usage of dict.__version__ =========================
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332.
I initially thought the same thing, but the cache token will be updated even if the keys all stay the same, and one of the values is modified, while the mutation-during-iteration check is aimed at detecting changes to the keys, rather than the values.
Serhiy's unit test ensures that creating a new key and deleting a key during an iteration is detected as a dict mutation, even if the dict size doesn't change. This use case works well with dict.__version__. Any __setitem__() changes the version (except if the key already exists and the value is exactly the same, id(old_value) == id(new_value)). Example:
>>> d = {1: 1}
>>> len(d)
1
>>> d.__version__, len(d)
(1, 1)
>>> d[2] = 2
>>> del d[1]
>>> d.__version__, len(d)
(3, 1)
Changing the value can be detected as well during iteration using dict.__version__:
>>> d = {1: 1}
>>> d.__version__, len(d)
(1, 1)
>>> d[1] = 2
>>> d.__version__, len(d)
(2, 1)
It would be nice to detect key mutation while iterating on dict.keys(), but it would also be nice to detect value mutation while iterating on dict.values() and dict.items(). No? Victor
On 9 January 2016 at 19:18, Victor Stinner <victor.stinner@gmail.com> wrote:
It would be nice to detect key mutation while iterating on dict.keys(), but it would also be nice to detect value mutation while iterating on dict.values() and dict.items(). No?
No, because mutating values as you go while iterating over a dictionary is perfectly legal:
>>> data = dict.fromkeys(range(5))
>>> for k in data:
...     data[k] = k
...
>>> for k, v in data.items():
...     data[k] = v ** 2
...
>>> data
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
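By contrast, changing the set of keys is what the interpreter already tries to catch, at least when the size changes:

    d = dict.fromkeys(range(5))
    try:
        for k in d:
            del d[k]            # removing a key shrinks the dict mid-iteration
    except RuntimeError as exc:
        print(exc)              # dictionary changed size during iteration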
It's only changing the key in the dict that's problematic, as that's the one that can affect the iteration order, regardless of whether you're emitting keys, values, or both. Raymond did mention that when closing the issue, but it was as an aside in one of his bullet points, rather than as a full example. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2016-01-09 14:12 GMT+01:00 Nick Coghlan <ncoghlan@gmail.com>:
On 9 January 2016 at 19:18, Victor Stinner <victor.stinner@gmail.com> wrote:
It would be nice to detect key mutation while iterating on dict.keys(), but it would also be nice to detect value mutation while iterating on dict.values() and dict.items(). No?
No, because mutating values as you go while iterating over a dictionary is perfectly legal: (...)
Oh you're right. I removed the reference to the issue #19332 from the PEP, since the PEP doesn't help. Too bad. Victor
On Fri, Jan 8, 2016 at 11:44 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
On 9 January 2016 at 16:03, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 08.01.16 23:27, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
This may not be the best name for the property. Many modules already have a __version__ attribute, so this may cause confusion.
The equivalent API for the global ABC object graph is abc.get_cache_token: https://docs.python.org/3/library/abc.html#abc.get_cache_token
One of the reasons we chose that name is that even though it's a number, the only operation with semantic significance is equality testing, with the intended use case being cache invalidation when the token changes value.
If we followed the same reasoning for Victor's proposal, then a suitable attribute name would be "__cache_token__".
+1 for consistency. For most imaginable uses the actual value and type of the value doesn't matter; you just care if it is different than the value you recorded earlier. How the token/version gets mutated should be up to the implementation, within defined parameters such as "the same value is never re-used twice for the lifetime of a process" (which pretty much guarantees some form of unsigned 64-bit counter increment - but an implementation could choose to use 256-bit random numbers for all we really care). Calling it __version__ implies numeric, but that isn't a requirement. We _really_ don't want someone to write code depending upon it being a number and expecting it to change in a given manner, so that they do something conditional on math performed on that number rather than a simple == vs !=. -gps
Le samedi 9 janvier 2016, Serhiy Storchaka <storchaka@gmail.com> a écrit :
On 08.01.16 23:27, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
This may not be the best name for the property. Many modules already have a __version__ attribute, so this may cause confusion.
It's fine to have a __version__ property and a __version__ key in the same dict. They are different. For a module, it's something like this, with moddict = globals():

- moddict.__version__ is the dict version
- moddict['__version__'] is the module version

Using the same name for different things is not new in Python. An example, still in the module namespace:

- moddict.__class__.__name__ is the dict class name
- moddict['__name__'] is the module name (or '__main__')

"Version" is really my favorite name for the new feature. Sometimes I saw "timestamp", but in my opinion it's more confusing because it's not related to a clock.
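A quick way to see the distinction with attributes that exist today (only the final comment assumes the proposed ``__version__``):

    import sys

    moddict = vars(sys.modules[__name__])    # the module namespace dict, i.e. globals() here

    print(moddict['__name__'])               # the module name, stored as a key
    print(moddict.__class__.__name__)        # 'dict': an attribute of the dict object itself
    # With the PEP, moddict.__version__ would similarly be an attribute of the
    # dict object, unrelated to any '__version__' key the module might define.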
The C code uses ``version++``. The behaviour on integer overflow of the
version is undefined. The minimum guarantee is that the version always changes when the dictionary is modified.
For clarification, this code has defined behavior in C (we should avoid introducing new undefined behaviors). Maybe you mean that the behavior is not specified from the Python side (since it is platform- and implementation-defined).
The C type for the version is unsigned (size_t). I hope that version++ is defined, but I was too lazy to check the C specs for that :-) Does it wrap to 0 on overflow on all architectures (supported by Python)? If not, it's easy to wrap manually: version = (version == size_max) ? 0 : version+1;
Usage of dict.__version__
=========================
This also can be used for better detecting dict mutating during iterating: https://bugs.python.org/issue19332.
Oh, cool. Victor
On 09.01.16 09:57, Victor Stinner wrote:
Le samedi 9 janvier 2016, Serhiy Storchaka <storchaka@gmail.com> a écrit :
This may not be the best name for the property. Many modules already have a __version__ attribute, so this may cause confusion.
It's fine to have a __version__ property and a __version__ key in the same dict. They are different.
Oh, I meant not a confusion between a property and a key, but a confusion between properties of two related objects. Perhaps one day we'll want to add a property with the same meaning directly to the module object, but the name is already in use.
"Version" is really my favorite name for the name feature. Sometimes I saw "timestamp", but in my opinion it's more confusing because it's not related to a clock.
Nick's "__cache_token__" LGTM.
On 9 January 2016 at 18:55, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 09.01.16 09:57, Victor Stinner wrote:
Le samedi 9 janvier 2016, Serhiy Storchaka <storchaka@gmail.com> a écrit :
This may not be the best name for the property. Many modules already have a __version__ attribute, so this may cause confusion.
It's fine to have a __version__ property and a __version__ key in the same dict. They are different.
Oh, I meant not a confusion between a property and a key, but between properties of two related objects. Perhaps one time we'll want to add the property with the same meaning directly to module object, but it is already in use.
The confusion I was referring to was yet a third variant of possible confusion: when people read "version", they're inevitably going to think "module version" or "package version" (since dealing with those kinds of versions is a day to day programming activity, regardless of domain), not "cache validity token" (as "version" in that sense is a technical term of art most programmers won't have encountered before). Yes, technically, "version" and "cache validity token" refer to the same thing in the context of data versioning, but the latter emphasises what the additional piece of information is primarily *for* in practical terms (checking if your caches are still valid), rather than what it *is* in formal terms (the current version of the stored data). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
How is this not just a poorer version of PyPy's optimizations? If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff. Best, Neil On Friday, January 8, 2016 at 4:27:53 PM UTC-5, Victor Stinner wrote:
Hi,
Here is a first PEP, part of a serie of 3 PEP to add an API to implement a static Python optimizer specializing functions with guards.
HTML version:
https://faster-cpython.readthedocs.org/pep_dict_version.html#pep-dict-versio...
PEP: xxx Title: Add dict.__version__ Version: $Revision$ Last-Modified: $Date$ Author: Victor Stinner <victor....@gmail.com <javascript:>> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 4-January-2016 Python-Version: 3.6
Abstract ========
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
Rationale =========
In Python, the builtin ``dict`` type is used by many instructions. For example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the global namespace, or in the builtins namespace (two dict lookups). Python uses ``dict`` for the builtins namespace, globals namespace, type namespaces, instance namespaces, etc. The local namespace (namespace of a function) is usually optimized to an array, but it can be a dict too.
Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, ... can be modified at runtime. Implementing optimizations respecting the Python semantic requires to detect when "something changes": we will call these checks "guards".
The speedup of optimizations depends on the speed of guard checks. This PEP proposes to add a version to dictionaries to implement efficient guards on namespaces.
Example of optimization: replace loading a global variable with a constant. This optimization requires a guard on the global variable to check if it was modified. If the variable is modified, the variable must be loaded at runtime, instead of using the constant.
Guard example =============
Pseudo-code of an efficient guard to check if a dictionary key was modified (created, updated or deleted)::
UNSET = object()
class Guard: def __init__(self, dict, key): self.dict = dict self.key = key self.value = dict.get(key, UNSET) self.version = dict.__version__
def check(self): """Return True if the dictionary value did not changed.""" version = self.dict.__version__ if version == self.version: # Fast-path: avoid the dictionary lookup return True
value = self.dict.get(self.key, UNSET) if value == self.value: # another key was modified: # cache the new dictionary version self.version = version return True
return False
Changes =======
Add a read-only ``__version__`` property to builtin ``dict`` type and to the ``collections.UserDict`` type. New empty dictionaries are initilized to version ``0``. The version is incremented at each change:
* ``clear()`` if the dict was non-empty * ``pop(key)`` if the key exists * ``popitem()`` if the dict is non-empty * ``setdefault(key, value)`` if the `key` does not exist * ``__detitem__(key)`` if the key exists * ``__setitem__(key, value)`` if the `key` doesn't exist or if the value is different * ``update(...)`` if new values are different than existing values (the version can be incremented multiple times)
Example::
>>> d = {} >>> d.__version__ 0 >>> d['key'] = 'value' >>> d.__version__ 1 >>> d['key'] = 'new value' >>> d.__version__ 2 >>> del d['key'] >>> d.__version__ 3
If a dictionary is created with items, the version is also incremented at each dictionary insertion. Example::
>>> d=dict(x=7, y=33) >>> d.__version__ 2
The version is not incremented is an existing key is modified to the same value, but only the identifier of the value is tested, not the content of the value. Example::
>>> d={} >>> value = object() >>> d['key'] = value >>> d.__version__ 2 >>> d['key'] = value >>> d.__version__ 2
.. note:: CPython uses some singleton like integers in the range [-5; 257], empty tuple, empty strings, Unicode strings of a single character in the range [U+0000; U+00FF], etc. When a key is set twice to the same singleton, the version is not modified.
The PEP is designed to implement guards on namespaces, only the ``dict`` type can be used for namespaces in practice. ``collections.UserDict`` is modified because it must mimicks ``dict``. ``collections.Mapping`` is unchanged.
Integer overflow ================
The implementation uses the C unsigned integer type ``size_t`` to store the version. On 32-bit systems, the maximum version is ``2**32-1`` (more than ``4.2 * 10 ** 9``, 4 billions). On 64-bit systems, the maximum version is ``2**64-1`` (more than ``1.8 * 10**19``).
The C code uses ``version++``. The behaviour on integer overflow of the version is undefined. The minimum guarantee is that the version always changes when the dictionary is modified.
The check ``dict.__version__ == old_version`` can succeed after an integer overflow, so a guard may wrongly report that the value did not change. The bug can only occur if the dict is modified at least ``2**64`` times (on a 64-bit system) between two checks of the guard.
Using a more complex type (e.g. ``PyLongObject``) to avoid the overflow would slow down operations on the ``dict`` type. Even if there is a theoretical risk of missing a value change, the risk is considered too low compared to the slowdown of using a more complex type.
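To make the overflow scenario concrete, here is a small sketch (my own illustration, not part of the specification) modelling the version as a fixed-width unsigned counter::

    WIDTH = 2 ** 64          # size_t on a 64-bit system
    old_version = 7          # version cached by a guard

    # If the dict is modified exactly 2**64 times between two checks,
    # the wrapped counter is equal to the cached version again:
    new_version = (old_version + 2 ** 64) % WIDTH
    print(new_version == old_version)   # True: the guard would wrongly pass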
Alternatives
============
Add a version to each dict entry
--------------------------------
A single version per dictionary requires keeping a strong reference to the value, which can keep the value alive longer than expected. If we also add a version per dictionary entry, the guard can rely on the entry version and so avoid the strong reference to the value (only strong references to the dictionary and to the key are needed).
Changes: add a ``getversion(key)`` method to the dictionary type which returns ``None`` if the key doesn't exist. When a key is created or modified, the entry version is set to the dictionary version, which is incremented at each change (create, modify, delete).
Pseudo-code of an efficient guard to check if a dict key was modified using ``getversion()``::
    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.dict_version = dict.__version__
            self.entry_version = dict.getversion(key)

        def check(self):
            """Return True if the dictionary value did not change."""
            dict_version = self.dict.__version__
            if dict_version == self.dict_version:
                # Fast-path: avoid the dictionary lookup
                return True

            # lookup in the dictionary, but get the entry version,
            # not the value
            entry_version = self.dict.getversion(self.key)
            if entry_version == self.entry_version:
                # another key was modified:
                # cache the new dictionary version
                self.dict_version = dict_version
                return True

            return False
The main drawback of this option is the impact on the memory footprint. It increases the size of each dictionary entry, so the overhead depends on the number of buckets (dictionary entries, used or not used yet). For example, it increases the size of each dictionary entry by 8 bytes on 64-bit systems if ``size_t`` is used.
In Python, the memory footprint matters and the trend is to reduce it. Examples:
* `PEP 393 -- Flexible String Representation <https://www.python.org/dev/peps/pep-0393/>`_
* `PEP 412 -- Key-Sharing Dictionary <https://www.python.org/dev/peps/pep-0412/>`_
Add a new dict subtype
----------------------
Add a new ``verdict`` type, a subtype of ``dict``. When guards are needed, use ``verdict`` for namespaces (module namespace, type namespace, instance namespace, etc.) instead of ``dict``.
Leave the ``dict`` type unchanged to not add any overhead (memory footprint) when guards are not needed.
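A rough pure-Python sketch of the idea (my own illustration; a real ``verdict`` would be implemented in C and cover all mutating methods) could look like this::

    class verdict(dict):
        """dict subtype with a version, incremented at each change."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._version = len(self)   # one increment per initial insertion

        @property
        def __version__(self):
            return self._version

        def __setitem__(self, key, value):
            # Only count real changes: a new key, or a different value object
            if key not in self or self[key] is not value:
                self._version += 1
            super().__setitem__(key, value)

        def __delitem__(self, key):
            super().__delitem__(key)    # raises KeyError if the key is missing
            self._version += 1

        def clear(self):
            if self:
                self._version += 1
            super().clear()

    d = verdict(x=7, y=33)
    print(d.__version__)   # 2
    d['z'] = 1
    print(d.__version__)   # 3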
Technical issue: a lot of C code in the wild, including the CPython core, expects the exact ``dict`` type. Issues:
* ``exec()`` requires a ``dict`` for globals and locals. A lot of code uses ``globals={}``. It is not possible to cast the ``dict`` to a ``dict`` subtype because the caller expects the ``globals`` parameter to be modified (``dict`` is mutable).
* Functions call ``PyDict_xxx()`` functions directly, instead of calling ``PyObject_xxx()`` if the object is a ``dict`` subtype.
* ``PyDict_CheckExact()`` fails on a ``dict`` subtype, whereas some functions require the exact ``dict`` type.
* ``Python/ceval.c`` does not completely support dict subtypes for namespaces.
The ``exec()`` issue is a blocker issue.
Other issues:
* The garbage collector has special code to "untrack" ``dict`` instances. If a ``dict`` subtype is used for namespaces, the garbage collector may be unable to break some reference cycles.
* Some functions have a fast-path for ``dict`` which would not be taken for ``dict`` subtypes, and so it would make Python a little bit slower.
Usage of dict.__version__
=========================
astoptimizer of FAT Python
--------------------------
The astoptimizer of the FAT Python project implements many optimizations which require guards on namespaces. Examples:
* Call pure builtins: to replace ``len("abc")`` with ``3``, guards on ``builtins.__dict__['len']`` and ``globals()['len']`` are required.
* Loop unrolling: to unroll the loop ``for i in range(...): ...``, guards on ``builtins.__dict__['range']`` and ``globals()['range']`` are required.
The `FAT Python <http://faster-cpython.readthedocs.org/fat_python.html>`_ project is a static optimizer for Python 3.6.
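As a rough illustration of the first optimization above (my own sketch, not the actual astoptimizer output), a specialized function guarded on the ``len`` builtin could look like this; the simplified guard below falls back to a plain lookup, whereas with ``dict.__version__`` the lookup would be skipped while the namespace version is unchanged::

    import builtins

    class LookupGuard:
        """Simplified guard: watch one key of one namespace."""

        def __init__(self, namespace, key):
            self.namespace = namespace
            self.key = key
            self.value = namespace[key]

        def check(self):
            # With dict.__version__, this lookup could be skipped when
            # the namespace version did not change.
            return self.namespace.get(self.key) is self.value

    guard_len = LookupGuard(builtins.__dict__, 'len')

    def func():                # original code
        return len("abc")

    def func_specialized():    # constant-folded version
        return 3

    def call_func():
        # Use the specialized code only while the guard still holds
        if guard_len.check():
            return func_specialized()
        return func()

    print(call_func())   # 3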
Pyjion
------
According to Brett Cannon, one of the two main developers of Pyjion, Pyjion can also benefit from the dictionary version to implement optimizations.
Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET Core runtime).
Unladen Swallow
---------------
Even if a dictionary version was not explicitly mentioned, optimizing globals and builtins lookups was part of the Unladen Swallow plan: "Implement one of the several proposed schemes for speeding lookups of globals and builtins." Source: `Unladen Swallow ProjectPlan <https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_.
Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler implemented with LLVM. The project stopped in 2011: `Unladen Swallow Retrospective <http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html>`_.
Prior Art
=========
Cached globals+builtins lookup
------------------------------
In 2006, Andrea Griffini proposed a patch implementing a `Cached globals+builtins lookup optimization <https://bugs.python.org/issue1616125>`_.
The patch adds a private ``timestamp`` field to dict.
See the thread on python-dev: `About dictionary lookup caching <https://mail.python.org/pipermail/python-dev/2006-December/070348.html>`_.
Globals / builtins cache
------------------------
In 2010, Antoine Pitrou proposed a `Globals / builtins cache <http://bugs.python.org/issue10401>`_ which adds a private ``ma_version`` field to the ``dict`` type. The patch adds a "global and builtin cache" to functions and frames, and changes ``LOAD_GLOBAL`` and ``STORE_GLOBAL`` instructions to use the cache.
PySizer
-------
`PySizer <http://pysizer.8325.org/>`_: a memory profiler for Python, Google Summer of Code 2005 project by Nick Smallbone.
This project has a patch for CPython 2.4 which adds ``key_time`` and ``value_time`` fields to dictionary entries. It uses a global process-wide counter for dictionaries, incremented each time that a dictionary is modified. The times are used to decide when child objects first appeared in their parent objects.
Copyright
=========
This document has been placed in the public domain.
Victor
Hi, 2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik@gmail.com>:
How is this not just a poorer version of PyPy's optimizations?
This is a very good question :-) There are a lot of optimizers in the wild, mostly JIT compilers. The problem is that most of them are specific to numerical computations, and the remaining ones are generic but not widely used. The most advanced and complete fast implementation of Python is obviously PyPy. I haven't heard of a lot of deployments of PyPy. For example, PyPy is not used to install OpenStack (a very large project which has a big number of dependencies). I'm not even sure that PyPy is the favorite implementation of Python used to run Django, to give another example of a popular Python application.

PyPy is just amazing in terms of performance, but for an unknown reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it only supports Python 2.7 and 3.2 (CPython is at version 3.5), it has bad performance on the C API, and I heard that performance is not as amazing as expected on some applications. PyPy also has a worse startup time and uses more memory. IMHO the major issue of Python is the backward compatibility of the C API.

In short, almost all users are stuck with CPython, and CPython implements close to 0 optimizations (come on, constant folding and dead code elimination is not what I would call an "optimization" ;-)).

My goal is to fill the hole between CPython (0 optimizations) and PyPy (the reference for best performance).

I wrote a whole website to explain the status of the Python optimizers and why I want to write my own optimizer: https://faster-cpython.readthedocs.org/index.html
If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff.
I disagree that my proposed changes increase the "language complexity". According to early benchmarks, my changes have a negligible impact on performance. I don't see how adding a read-only __version__ property to dict makes the Python *language* more complex?

My whole design is based on the idea that my optimizer will be optional. You will be free to not use it ;-)

And sorry, I'm not interested in contributing to PyPy.

Victor
On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik@gmail.com>:
How is this not just a poorer version of PyPy's optimizations?
This is a very good question :-) There are a lot of optimizers in the wild, mostly JIT compilers. The problem is that most of them are specific to numerical computations, and the remaining ones are generic but not widely used. The most advanced and complete fast implementation of Python is obviously PyPy. I haven't heard of a lot of deployments of PyPy. For example, PyPy is not used to install OpenStack (a very large project which has a big number of dependencies). I'm not even sure that PyPy is the favorite implementation of Python used to run Django, to give another example of a popular Python application.
PyPy is just amazing in terms of performance, but for an unknown reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it only supports Python 2.7 and 3.2 (CPython is at version 3.5), it has bad performance on the C API, and I heard that performance is not as amazing as expected on some applications. PyPy also has a worse startup time and uses more memory. IMHO the major issue of Python is the backward compatibility of the C API.
In short, almost all users are stuck with CPython, and CPython implements close to 0 optimizations (come on, constant folding and dead code elimination is not what I would call an "optimization" ;-)).
My goal is to fill the hole between CPython (0 optimizations) and PyPy (the reference for best performance).
I wrote a whole website to explain the status of the Python optimizers and why I want to write my own optimizer: https://faster-cpython.readthedocs.org/index.html
I think this is admirable. I also dream of faster Python. However, we have a fundamental disagreement about how to get there. You can spend your whole life adding one or two optimizations a year and Python may only end up twice as fast as it is now, which would still be dog slow. A meaningful speedup requires a JIT. So, I question the value of this kind of change.
If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff.
I disagree that my proposed changes increase the "language complexity". According to early benchmarks, my changes have a negligible impact on performance. I don't see how adding a read-only __version__ property to dict makes the Python *language* more complex?
It makes it more complex because you're adding a user-facing property. Every little property adds up in the cognitive load of a language. It also means that all of the other Python implementations need to follow suit even if their optimizations work differently. What is the point of making __version__ an exposed property? Why can't it be a hidden variable in CPython's underlying implementation of dict? If some code needs to query __version__ to see if it's changed then CPython should be the one trying to discover this pattern and automatically generate the right code. Ultimately, this is just a piece of a JIT, which is the way this is going to end up. My whole design is based on the idea that my optimizer will be
optional. You will be free to not use it ;-)
And sorry, I'm not interested in contributing to PyPy.
That's fine, but I think you are probably wasting your time then :) The "hole between CPython and PyPy" disappears as soon as PyPy catches up to CPython 3.5 with numpy, and then all of this work goes with it.
Victor
On Saturday, January 9, 2016, Neil Girdhar <mistersheik@gmail.com> wrote:
On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
I wrote a whole website to explain the status of the Python optimizers and why I want to write my own optimizer: https://faster-cpython.readthedocs.org/index.html
I think this is admirable. I also dream of faster Python. However, we have a fundamental disagreement about how to get there. You can spend your whole life adding one or two optimizations a year and Python may only end up twice as fast as it is now, which would still be dog slow. A meaningful speedup requires a JIT. So, I question the value of this kind of change.
There are multiple JIT compilers for Python actively developed: PyPy, Pyston, Pyjion, Numba (numerical computation), etc.

I don't think that my work will slow down these projects. I hope that it will create more competition and that we will cooperate. For example, I am in contact with a Pythran developer who told me that my PEPs will help his project. As I wrote in the dict.__version__ PEP, the dictionary version will also be useful for Pyjion according to Brett Cannon.

But Antoine Pitrou told me that the dictionary version will not help Numba. Numba doesn't use dictionaries and already has its own efficient implementation for guards.
What is the point of making __version__ an exposed property?
Hum, technically I don't need it at the Python level. Guards are implemented in C and directly access the field from the structure. Having the property in Python helps to write unit tests, to write prototypes (to experiment with new things), etc.
That's fine, but I think you are probably wasting your time then :) The "hole between CPython and PyPy" disappears as soon as PyPy catches up to CPython 3.5 with numpy, and then all of this work goes with it.
PyPy has existed for many years, but it's still not widely used. Maybe PyPy has drawbacks and the speedup is not enough to convince users to use it? I'm not sure that Python 3.5 support will make PyPy immediately more popular. Users still widely use Python 2 in practice.

Yes, better and faster numpy will help PyPy.

Victor
On Sat, Jan 9, 2016 at 11:27 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
On Saturday, January 9, 2016, Neil Girdhar <mistersheik@gmail.com> wrote:
On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
I wrote a whole website to explain the status of the Python optimizers and why I want to write my own optimizer: https://faster-cpython.readthedocs.org/index.html
I think this is admirable. I also dream of faster Python. However, we have a fundamental disagreement about how to get there. You can spend your whole life adding one or two optimizations a year and Python may only end up twice as fast as it is now, which would still be dog slow. A meaningful speedup requires a JIT. So, I question the value of this kind of change.
There are multiple JIT compilers for Python actively developed: PyPy, Pyston, Pyjion, Numba (numerical computation), etc.
I don't think that my work will slow down these projects. I hope that it will create more competition and that we will cooperate. For example, I am in contact with a Pythran developer who told me that my PEPs will help his project. As I wrote in the dict.__version__ PEP, the dictionary version will also be useful for Pyjion according to Brett Cannon.
But Antoine Pitrou told me that the dictionary version will not help Numba. Numba doesn't use dictionaries and already has its own efficient implementation for guards.
What is the point of making __version__ an exposed property?
Hum, technically I don't need it at the Python level. Guards are implemented in C and directly access the field from the structure.
Having the property in Python helps to write unit tests, to write prototypes (to experiment with new things), etc.
I understand what you mean, but if you can do this without changing the language, I think that would be better. Isn't it still possible to write your unit tests against the same C interface that you expose "version" with? Then the language will stay the same, but CPython would be faster, which is what you wanted.
That's fine, but I think you are probably wasting your time then :) The "hole between CPython and PyPy" disappears as soon as PyPy catches up to CPython 3.5 with numpy, and then all of this work goes with it.
PyPy has existed for many years, but it's still not widely used. Maybe PyPy has drawbacks and the speedup is not enough to convince users to use it? I'm not sure that Python 3.5 support will make PyPy immediately more popular. Users still widely use Python 2 in practice.
Yes, better and faster numpy will help PyPy.
Victor
On Sat, 9 Jan 2016 at 07:04 Neil Girdhar <mistersheik@gmail.com> wrote:
On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
Hi,
2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik@gmail.com>:
How is this not just a poorer version of PyPy's optimizations?
This is a very good question :-) There are a lot of optimizers in the wild, mostly JIT compilers. The problem is that most of them are specific to numerical computations, and the remaining ones are generic but not widely used. The most advanced and complete fast implementation of Python is obviously PyPy. I haven't heard of a lot of deployments of PyPy. For example, PyPy is not used to install OpenStack (a very large project which has a big number of dependencies). I'm not even sure that PyPy is the favorite implementation of Python used to run Django, to give another example of a popular Python application.
PyPy is just amazing in terms of performance, but for an unknown reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it only supports Python 2.7 and 3.2 (CPython is at version 3.5), it has bad performance on the C API, and I heard that performance is not as amazing as expected on some applications. PyPy also has a worse startup time and uses more memory. IMHO the major issue of Python is the backward compatibility of the C API.
In short, almost all users are stuck with CPython, and CPython implements close to 0 optimizations (come on, constant folding and dead code elimination is not what I would call an "optimization" ;-)).
My goal is to fill the hole between CPython (0 optimizations) and PyPy (the reference for best performance).
I wrote a whole website to explain the status of the Python optimizers and why I want to write my own optimizer: https://faster-cpython.readthedocs.org/index.html
I think this is admirable. I also dream of faster Python. However, we have a fundamental disagreement about how to get there. You can spend your whole life adding one or two optimizations a year and Python may only end up twice as fast as it is now, which would still be dog slow. A meaningful speedup requires a JIT. So, I question the value of this kind of change.
Obviously a JIT can help, but even they can benefit from this. For instance, Pyjion could rely on this instead of creating our own guards for built-in and global namespaces if we wanted to inline calls to certain built-ins.
If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff.
I disagree that my proposed changes increase the "language complexity". According to early benchmarks, my changes have a negligible impact on performance. I don't see how adding a read-only __version__ property to dict makes the Python *language* more complex?
It makes it more complex because you're adding a user-facing property. Every little property adds up in the cognitive load of a language. It also means that all of the other Python implementations need to follow suit even if their optimizations work differently.
What is the point of making __version__ an exposed property? Why can't it be a hidden variable in CPython's underlying implementation of dict? If some code needs to query __version__ to see if it's changed then CPython should be the one trying to discover this pattern and automatically generate the right code. Ultimately, this is just a piece of a JIT, which is the way this is going to end up.
My whole design is based on the idea that my optimizer will be
optional. You will be free to not use it ;-)
And sorry, I'm not interested in contributing to PyPy.
That's fine, but I think you are probably wasting your time then :) The "hole between CPython and PyPy" disappears as soon as PyPy catches up to CPython 3.5 with numpy, and then all of this work goes with it.
That doesn't solve the C API compatibility problem, nor other issues some people have with PyPy deployments (e.g., inconsistent performance that can't necessarily be relied upon).
On Sat, Jan 09, 2016 at 09:55:08AM -0500, Neil Girdhar wrote:
I think this is admirable. I also dream of faster Python. However, we have a fundamental disagreement about how to get there. You can spend your whole life adding one or two optimizations a year and Python may only end up twice as fast as it is now, which would still be dog slow. A meaningful speedup requires a JIT. So, I question the value of this kind of change.
I think that's pessimistic and unrealistic. If Python were twice as fast as it is now, it would mean that scripts could process twice as much data in the same time as they do now. How is that not meaningful?

Sometimes I work hard to get a 5% or 10% improvement in the speed of a function, because it's worth it. Doubling the speed is something I can only dream about.

As for JITs, they have limited value for code that isn't long-running. As the PyPy FAQ says: "Note also that our JIT has a very high warm-up cost, meaning that any program is slow at the beginning. If you want to compare the timings with CPython, even relatively simple programs need to run at least one second, preferably at least a few seconds." This means that PyPy is going to have little or no benefit for short-lived programs and scripts. But if you call those scripts thousands or tens of thousands of times (say, from the shell), the total amount of time can be considerable. Halving that time would be a good thing.

There is plenty of room in the Python ecosystem for many different approaches to optimization.

[...]
It makes it more complex because you're adding a user-facing property. Every little property adds up in the cognitive load of a language. It also means that all of the other Python implementations need to follow suit even if their optimizations work differently.
That second point is a reasonable criticism of Victor's idea.
What is the point of making __version__ an exposed property? Why can't it be a hidden variable in CPython's underlying implementation of dict?
Making it public means that anyone can make use of it. Just because Victor wants to use it for CPython optimizations doesn't mean that others can't or shouldn't make use of it for their own code.

Victor wants to detect changes to globals() and builtins, but I might want to use it to detect changes to some other dict:

    mydict = {'many': 1, 'keys': 2, 'with': 3, 'an': 4, 'invariant': 5}
    v = mydict.__version__
    modify(mydict)
    if v != mydict.__version__:
        expensive_invariant_check(mydict)

If Victor is right that tracking this version flag is cheap, then there's no reason not to expose it. Some people will find a good use for it, and others can ignore it.

-- Steve
Steven D'Aprano Saturday, January 09, 2016 19:32:

I think that's pessimistic and unrealistic. If Python were twice as fast as it is now, it would mean that scripts could process twice as much data in the same time as they do now. How is that not meaningful?

Sometimes I work hard to get a 5% or 10% improvement in speed of a function, because it's worth it. Doubling the speed is something I can only dream about.

Often when I hear people complain about "tiny" improvements, I change the context: "Ok, I'm going to raise your salary 5%, or is that too small and you don't want it?" Suddenly that 5% looks pretty good.
On Mon, Jan 11, 2016 at 2:28 AM, Eric Fahlgren <ericfahlgren@gmail.com> wrote:
Steven D'Aprano Saturday, January 09, 2016 19:32:

I think that's pessimistic and unrealistic. If Python were twice as fast as it is now, it would mean that scripts could process twice as much data in the same time as they do now. How is that not meaningful?

Sometimes I work hard to get a 5% or 10% improvement in speed of a function, because it's worth it. Doubling the speed is something I can only dream about.

Often when I hear people complain about "tiny" improvements, I change the context: "Ok, I'm going to raise your salary 5%, or is that too small and you don't want it?" Suddenly that 5% looks pretty good.
Although realistically, it's more like saying "If you put in enough overtime, I'll raise by 5% the rate you get paid for one of the many types of work you do". Evaluating that depends on what proportion of your salary comes from that type of work. 5% across the board is pretty good. 5% to one function is only worth serious effort if that's a hot spot. But broadly I do agree - 5% is significant. ChrisA
On Sun, Jan 10, 2016 at 10:28 AM, Eric Fahlgren <ericfahlgren@gmail.com> wrote:
Often when I hear people complain about "tiny" improvements, I change the context: "Ok, I'm going to raise your salary 5%, or is that too small and you don't want it?" Suddenly that 5% looks pretty good.
To extend this analogy a bit, I think Neil's objection was more along the lines of "Why work an extra 5 hours a week for only a 5% raise?" I don't think anyone's going to pooh-pooh a performance improvement. Neil's concern is just about whether the benefit justifies the cost. Nick
On Jan 10, 2016, 4:39 PM, "Nicholas Chammas" <nicholas.chammas@gmail.com> wrote:

To extend this analogy a bit, I think Neil's objection was more along the lines of "Why work an extra 5 hours a week for only a 5% raise?"

Your analogy is wrong. I am working and you get the salary.

Victor
I read through this thread and just want to quickly address some good points. First of all, I didn't mean to suggest that this kind of optimization is not useful. Of course, I will be thankful for any optimization that makes it into CPython. Making CPython faster is good, useful work. It's just that my dream of the future of Python is one where Python is faster than C thanks to a very clever JIT.
I agree with Neil Girdhar that this looks to me like a CPython-specific implementation detail that should not be imposed on other implementations. For testing, perhaps we could add a dict_version function in test.support that uses ctypes to access the internals.
Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.

What makes you say that? Isn't it a simple matter of:

    v = mydict.__version__
    maybe_modify(mydict)
    if v != mydict.__version__:
        print("dict has changed")
This is exactly what I want to avoid. If you want to do something like this, I think you should do it in regular Python by subclassing dict and overriding the mutating methods. What happens if someone uses a custom Mapping? Do all custom Mappings need to implement __version__? Do they need a __version__ that indicates that no key-value pairs have changed, and another version that indicates that nothing has changed (for example, OrderedDict has an order and sorteddict has a sort function; changing either of those doesn't change the key-value pairs)? This is not supposed to be user-facing; this is an interpreter optimization.
Obviously a JIT can help, but even they can benefit from this. For instance, Pyjion could rely on this instead of creating our own guards for built-in and global namespaces if we wanted to inline calls to certain built-ins.
I understand that, but what if another JIT decides that instead of __version__ being an attribute on the object, the version is a global mapping from objects to version numbers? What if someone else wants to implement it instead as a set of changed objects at every sequence point? There are many ways to do this optimization. It's not obvious to me that everyone will want to do it this way.
C compilers often have optimization levels that can potentially alter the program's operation
Some of those optimizations lead to bugs that are very hard to track down. One of the advantages of Python is that what you pay for in runtime, you save ten-fold in development time. In summary, I would be 100% behind this idea if it were hidden from the user. Best, Neil
On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote: [...]
    v = mydict.__version__
    maybe_modify(mydict)
    if v != mydict.__version__:
        print("dict has changed")
This is exactly what I want to avoid. If you want to do something like this, I think you should do it in regular Python by subclassing dict and overriding the mutating methods.
That doesn't help Victor, because exec needs an actual dict, not a subclass. Victor's PEP says this is a blocker.

I can already subclass dict to do that now. But if Victor's suggestion is accepted, then I don't need to. The functionality will already exist. Why shouldn't I use it?
What happens if someone uses a custom Mapping?
If they inherit from dict or UserDict, they get this functionality for free. If they don't, they're responsible for implementing it if they want it.
Do all custom Mappings need to implement __version__?
I believe the answer to that is No, but the PEP probably should clarify that. -- Steve
On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
[...]
    v = mydict.__version__
    maybe_modify(mydict)
    if v != mydict.__version__:
        print("dict has changed")
This is exactly what I want to avoid. If you want to do something like this, I think you should do it in regular Python by subclassing dict and overriding the mutating methods.
That doesn't help Victor, because exec needs an actual dict, not a subclass. Victor's PEP says this is a blocker.
No, he can still do what he wants transparently in the interpreter. What I want to avoid is Python users using __version__ in their own code.
I can already subclass dict to do that now. But if Victor's suggestion is accepted, then I don't need to. The functionality will already exist. Why shouldn't I use it?
Because people write code for the abc "Mapping". What you are suggesting is then to add "__version__" to the abc Mapping, which I am against. Mapping provides the minimum interface to be a mapping; there is no reason that every Mapping should have a "__version__".
What happens if someone uses a custom Mapping?
If they inherit from dict or UserDict, they get this functionality for free. If they don't, they're responsible for implementing it if they want it.
But they shouldn't have to implement it just so that code written for Mappings works — as it does now.
Do all custom Mappings need to implement __version__?
I believe the answer to that is No, but the PEP probably should clarify that.
If the answer is "no" then honestly no user should write code counting on the existence of __version__.
-- Steve
2016-01-10 19:35 GMT+01:00 Neil Girdhar <mistersheik@gmail.com>:
If the answer is "no" then honestly no user should write code counting on the existence of __version__.
For my use case, I don't need a public (read-only) property at the Python level. When I wrote the PEP, I proposed a public property to try to find more use cases and make the PEP more interesting. I'm not sure anymore that it's worth it, since legitimate and good counterarguments were listed:

* It gives more work to other Python implementations, whereas they may not use or benefit from the overall API for static optimizers (discussed in the following PEPs). Except for guards used by static optimizers, I don't see any use case for dictionary versioning.

* The behaviour on integer overflow is an implementation detail; it's sad to have to describe it in the specification of a *Python* property. Users expect Python to abstract the hardware.

Victor
On Mon, Jan 11, 2016 at 11:15 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
* The behaviour on integer overflow is an implementation detail; it's sad to have to describe it in the specification of a *Python* property. Users expect Python to abstract the hardware.
Compromise: Document that it's an integer that changes every time the dictionary is changed, and has a "vanishingly small chance" of ever reusing a number. It'll trap the same people who try to use id(obj) as a memory address, but at least it'll be documented as false. ChrisA
On Mon, Jan 11, 2016 at 01:15:47AM +0100, Victor Stinner wrote:
* The behaviour on integer overflow is an implementation detail; it's sad to have to describe it in the specification of a *Python* property. Users expect Python to abstract the hardware.
Is that a real possibility? A 32-bit counter will overflow, sure, but a 64-bit counter starting from zero should never overflow in a human lifetime. Even if we assume a billion increments per second (one per nanosecond), it would take over 584 years of continuous operation for the counter to overflow. What am I missing? So I would be inclined to just document that the counter may overflow, and you should always compare it using == or != and not >. I think anything else is overkill. -- Steve
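A quick back-of-the-envelope check of the 584-year figure (my own arithmetic, added for illustration):

    increments_per_second = 10 ** 9           # one increment per nanosecond
    seconds_per_year = 3600 * 24 * 365
    years = 2 ** 64 / increments_per_second / seconds_per_year
    print(round(years))                        # ~585 years to overflow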
On Mon, Jan 11, 2016 at 12:39 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Jan 11, 2016 at 01:15:47AM +0100, Victor Stinner wrote:
* the behaviour on integer overflow is an implementation detail, it's sad to have to describe it in the specification of a *Python* property. Users expect Python to abtract the hardware
Is that a real possibility? A 32-bit counter will overflow, sure, but a 64-bit counter starting from zero should never overflow in a human lifetime.
Even if we assume a billion increments per second (one per nanosecond), it would take over 584 years of continuous operation for the counter to overflow. What am I missing?
You're missing that a 32-bit build of Python would then be allowed to use a 32-bit counter. But if the spec says "64-bit counter", then yeah, we can pretty much assume that it won't overflow. Reasonable usage wouldn't include nanosecondly updates; I doubt you could even achieve 1000 updates a second, sustained over a long period of time, and that would only overflow every 50ish days. Unless there's some bizarre lockstep system that forces you to run into the rollover, it's going to be basically one chance in four billion that you hit the exact equal counter. So even a 32-bit counter is unlikely to cause problems in real-world situations; and anyone who's paranoid can just insist on using a 64-bit build of Python. (Most of us probably are anyway.) ChrisA
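The same kind of back-of-the-envelope check for the 32-bit case (my own arithmetic, added for illustration):

    seconds_to_wrap = 2 ** 32 / 1000    # 1000 updates per second
    print(seconds_to_wrap / 86400)      # ~49.7 days between wrap-arounds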
On Jan 10, 2016, at 17:55, Chris Angelico <rosuav@gmail.com> wrote:
You're missing that a 32-bit build of Python would then be allowed to use a 32-bit counter. But if the spec says "64-bit counter", then yeah, we can pretty much assume that it won't overflow.
As I understand it from Victor's PEP, the added cost of maintaining this counter is literally so small as to be unmeasurable against the cost of normal dict operations in microbenchmarks. If that's true, surely the cost of requiring a 64-bit counter is going to be acceptable? I realize that some MicroPython projects will be targeting platforms where there's no fast way to do an inc64 (or where the available compilers are too dumb to do it the fast way), but those projects are probably not going to want FAT Python anyway. On a P3 or later x86 or an ARM 7 or something like that, the cost should be more than acceptable. Or at least it's worth testing.
On Jan 10, 2016, at 10:35, Neil Girdhar <mistersheik@gmail.com> wrote:
On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
[...]
    v = mydict.__version__
    maybe_modify(mydict)
    if v != mydict.__version__:
        print("dict has changed")
This is exactly what I want to avoid. If you want to do something like this, I think you should do it in regular Python by subclassing dict and overriding the mutating methods.
That doesn't help Victor, because exec need an actual dict, not subclasses. Victor's PEP says this is a blocker.
No, he can still do what he wants transparently in the interpreter. What I want to avoid is Python users using __version__ in their own code.
Well, he could change exec so it can use arbitrary mappings (or at least dict subclasses), but I assume that's much harder and more disruptive than his proposed change.

Anyway, if I understand your point, it's this: __version__ should either be a private implementation-specific property of dicts, or it should be a property of all mappings; anything in between gets all the disadvantages of both.

If so, I agree with you. Encouraging people to use __version__ for other purposes besides namespace guards, but not doing anything to guarantee it actually exists anywhere besides namespaces, seems like a bad idea.

But there is still something in between public and totally internal to FAT Python. Making it a documented property of PyDict objects at the C API level is a different story--there are already plenty of ways that C code can use those objects that won't work with arbitrary mappings, so adding another doesn't seem like a problem.

And even making it public but implementation-specific at the Python level may be useful for other CPython-specific optimizers (even if partially written in Python); if so, the danger that someone could abuse it for code that should work with arbitrary mappings or with another Python implementation is best addressed by clearly documenting its non-portability and discouraging its abuse in the docs, not by hiding it.
On Sun, Jan 10, 2016 at 7:37 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 10, 2016, at 10:35, Neil Girdhar <mistersheik@gmail.com> wrote:
On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
[...]
    v = mydict.__version__
    maybe_modify(mydict)
    if v != mydict.__version__:
        print("dict has changed")
This is exactly what I want to avoid. If you want to do something like this, I think you should do it in regular Python by subclassing dict and overriding the mutating methods.
That doesn't help Victor, because exec need an actual dict, not subclasses. Victor's PEP says this is a blocker.
No, he can still do what he wants transparently in the interpreter. What I want to avoid is Python users using __version__ in their own code.
Well, he could change exec so it can use arbitrary mappings (or at least dict subclasses), but I assume that's much harder and more disruptive than his proposed change.
Anyway, if I understand your point, it's this: __version__ should either be a private implementation-specific property of dicts, or it should be a property of all mappings; anything in between gets all the disadvantages of both.
Right. I prefer the former since making it a property of mappings bloats Mapping beyond a minimum interface.
If so, I agree with you. Encouraging people to use __version__ for other purposes besides namespace guards, but not doing anything to guarantee it actually exists anywhere besides namespaces, seems like a bad idea.
But there is still something in between public and totally internal to FAT Python. Making it a documented property of PyDict objects at the C API level is a different story--there are already plenty of ways that C code can use those objects that won't work with arbitrary mappings, so adding another doesn't seem like a problem.
Adding it to PyDict and exposing it in the C API is totally reasonable to me.
And even making it public but implementation-specific at the Python level may be useful for other CPython-specific optimizers (even if partially written in Python); if so, the danger that someone could abuse it for code that should work with arbitrary mappings or with another Python implementation is best addressed by clearly documenting its non-portability and discouraging its abuse in the docs, not by hiding it.
Here is where I have to disagree. I hate it when experts say "we'll just document it and then it's the user's fault for misusing it". Yeah, you're right, but as a user, it is very frustrating to have to read other people's documentation. You know that some elite Python programmer is going to optimize his code using this, and someone years later is going to scratch his head wondering where __version__ is coming from. Is it provided by the caller? Was it added to the object at some earlier point? Finally, he'll search the web and arrive at a stackoverflow question with 95 upvotes that finally clears things up. And for what? Some minor optimization. (Not Victor's optimization, but a Python user's optimization in Python code.)

Python should make it easy to write clear code. It's my opinion that documentation is not a substitute for good language design, just as comments are not a substitute for good code design.

Also, using this __version__ in source code is going to complicate switching from CPython to any of the other Python implementations, so those implementations will probably end up implementing it just to simplify "porting", which would otherwise be painless.

Why don't we leave exposing __version__ in Python to another PEP? Once it's in the C API (as you proposed) you will be able to use it from Python by writing an extension, and then someone can demonstrate the value of exposing it in Python by writing tests.
On Mon, Jan 11, 2016 at 05:18:59AM -0500, Neil Girdhar wrote:
Here is where I have to disagree. I hate it when experts say "we'll just document it and then it's the user's fault for misusing it". Yeah, you're right, but as a user, it is very frustrating to have to read other people's documentation. You know that some elite Python programmer is going to optimize his code using this and someone years later is going to scratch his head wondering where __version__ is coming from. Is it provided by the caller? Was it added to the object at some earlier point?
Neil, don't you think you're being overly dramatic here? "Programmer needs to look up API feature, news at 11!" The same could be said about class.__name__, instance.__class__, obj.__doc__, module.__dict__ and indeed every single Python feature. Sufficiently inexperienced or naive programmers could be scratching their head over literally *anything*. (I remember being perplexed by None the first time I read Python code. What was it and where did it come from? I had no idea.) All those words for such a simple, and minor, point: every new API feature is one more thing for programmers to learn. We get that. But the following is a good, strong argument:
Also, using this __version__ in source code is going to complicate switching from CPython to any of the other Python implementations, so those implementations will probably end up implementing it just to simplify "porting", which would otherwise be painless.
Why don't we leave exposing __version__ in Python to another PEP? Once it's in the C API (as you proposed) you will be able to use it from Python by writing an extension and then someone can demonstrate the value of exposing it in Python by writing tests.
I can't really argue against this. As much as I would love to play around with __version__, I think you're right. It needs to prove itself before being exposed as a public API. -- Steve
On Mon, Jan 11, 2016 at 6:20 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Jan 11, 2016 at 05:18:59AM -0500, Neil Girdhar wrote:
Here is where I have to disagree. I hate it when experts say "we'll just document it and then it's the user's fault for misusing it". Yeah, you're right, but as a user, it is very frustrating to have to read other people's documentation. You know that some elite Python programmer is going to optimize his code using this and someone years later is going to scratch his head wondering where __version__ is coming from. Is it provided by the caller? Was it added to the object at some earlier point?
Neil, don't you think you're being overly dramatic here? "Programmer needs to look up API feature, news at 11!" The same could be said about class.__name__, instance.__class__, obj.__doc__, module.__dict__ and indeed every single Python feature. Sufficiently inexperienced or naive programmers could be scratching their head over literally *anything*.
All those words for such a simple, and minor, point: every new API feature is one more thing for programmers to learn. We get that.
I don't think Neil is being overly dramatic, nor is it a minor point. Simple, yes, but important. If Python wants to maintain its enviable position as the majority language for intro computer science at top schools, it needs to stay an easily teachable language. The more junk showing up in ``dir()``, the harder it is to learn. When it's unclear what purpose a feature would have for an expert, why not err on the side of caution and keep the language as usable for a newbie as possible?

But the following is a good, strong argument:
Also, using this __version__ in source code is going to complicate switching from CPython to any of the other Python implementations, so those implementations will probably end up implementing it just to simplify "porting", which would otherwise be painless.
Why don't we leave exposing __version__ in Python to another PEP? Once it's in the C API (as you proposed) you will be able to use it from Python by writing an extension and then someone can demonstrate the value of exposing it in Python by writing tests.
I can't really argue against this. As much as I would love to play around with __version__, I think you're right. It needs to prove itself before being exposed as a public API.
2016-01-11 11:18 GMT+01:00 Neil Girdhar <mistersheik@gmail.com>:
No, he can still do what he wants transparently in the interpreter. What I want to avoid is Python users using __version__ in their own code.
Well, he could change exec so it can use arbitrary mappings (or at least dict subclasses), but I assume that's much harder and more disruptive than his proposed change.
Anyway, if I understand your point, it's this: __version__ should either be a private implementation-specific property of dicts, or it should be a property of all mappings; anything in between gets all the disadvantages of both.
Right. I prefer the the former since making it a property of mappings bloats Mapping beyond a minimum interface.
The discussion on adding a __version__ property to all mapping types is interesting. I now agree that it's a boolean choice: either no mapping type has a __version__ property, or all of them have it. It would be annoying to get a cryptic issue when passing a dict subtype or a dict-like type to a function expecting a "mapping".

I *don't* want to require all mapping types to implement a __version__ property. Even if it's simple to implement, some types can be a simple wrapper on top of an existing efficient mapping type which doesn't implement such a property (or worse, has a similar *but different* property). For example, Jython and IronPython probably reuse existing mapping types of Java and .NET, and I don't think that they have such a version property.

The Mapping ABC already requires a lot of methods; having to implement yet another property would make the implementation even more complex and difficult to maintain. My PEP 509 requires 8 methods (including the constructor) to update the __version__.
Here is where I have to disagree. I hate it when experts say "we'll just document it and then it's the user's fault for misusing it". Yeah, you're right, but as a user, it is very frustrating to have to read other people's documentation. You know that some elite Python programmer is going to optimize his code using this and someone years later is going to scratch his head wondering where __version__ is coming from. Is it the provided by the caller? Was it added to the object at some earlier point? Finally, he'll search the web, arrive at a stackoverflow question with 95 upvotes that finally clears things up. And for what? Some minor optimization. (Not Victor's optimization, but a Python user's optimization in Python code.)
I agree that it would be a bad practice to widely use __version__ in a project to manually micro-optimize an application. Well, micro-optimizations are a bad practice in most cases ;-) Remember that dict lookups have a complexity of O(1), that's why they are used for namespaces ;-)

It's a bad idea because at the Python level, a dict lookup and checking the version have... the same cost! (48.7 ns vs 47.5 ns... a difference of 1 nanosecond)

    haypo@smithers$ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd["33"] == 33'
    10000000 loops, best of 3: 0.0487 usec per loop
    haypo@smithers$ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd.__version__ == 100'
    10000000 loops, best of 3: 0.0475 usec per loop

The difference is only visible at the C level:

* PyObject_GetItem: 16.5 ns
* PyDict_GetItem: 14.8 ns
* fat.GuardDict: 3.8 ns (check dict.__version__)

Well, 3.8 ns (guard) vs 14.8 ns (dict lookup) is nice but not so amazing, a dict lookup is already *fast*. The difference between guards and dict lookups is that a guard check has a complexity of O(1) in the common case (if the dict was not modified). For example, for an optimization using 10 global variables in a function, the check costs 148 ns for 10 dict lookups, whereas the guard still only costs 3.8 ns (39x as fast). The guards must be as cheap as possible; otherwise, the optimizer has to implement more efficient optimizations to compensate for their cost :-D

Note: the performance of a dict lookup also depends on whether the key is "interned" (in short, a kind of singleton which allows comparing strings by their address instead of character by character). For code objects, Python interns strings which are made of the characters a-z, A-Z and "_".

Well, it's just to confirm that yes, the PEP is designed to implement fast guards in C, but it would be a bad idea to start using it widely at the Python level.
Also, using this __version__ in source code is going to complicate switching from CPython to any of the other Python implementations, so those implementations will probably end up implementing it just to simplify "porting", which would otherwise be painless.
IMHO *if* we add __version__ to dict (or even to all mapping types), it must be done for all Python implementations. It would be really annoying to have to start putting #ifdef-like checks in the code for a feature of a core builtin type (dict). But again, I now agree not to expose the version at the Python level...
Why don't we leave exposing __version__ in Python to another PEP?
According to this thread and my benchmark above, the __version__ property at the Python level is a *bad* idea. So I'm no longer interested in exposing it. Victor
2016-01-10 18:57 GMT+01:00 Steven D'Aprano <steve@pearwood.info>:
Do all custom Mappings need to implement __version__?
I believe the answer to that is No, but the PEP probably should clarify that.
In the PEP, I wrote "The PEP is designed to implement guards on namespaces, only the dict type can be used for namespaces in practice. collections.UserDict is modified because it must mimic dict. collections.Mapping is unchanged." https://www.python.org/dev/peps/pep-0509/#changes Is it enough? If not, what do you suggest to be more explicit? Victor
On 01/10/2016 12:02 PM, Victor Stinner wrote:
2016-01-10 18:57 GMT+01:00 Steven D'Aprano <steve@pearwood.info>:
Do all custom Mappings need to implement __version__?
I believe the answer to that is No, but the PEP probably should clarify that.
In the PEP, I wrote "The PEP is designed to implement guards on namespaces, only the dict type can be used for namespaces in practice. collections.UserDict is modified because it must mimicks dict. collections.Mapping is unchanged." https://www.python.org/dev/peps/pep-0509/#changes
Is it enough? If no, what do you suggest to be more explicit?
It is enough. -- ~Ethan~
On Sun, Jan 10, 2016 at 09:02:47PM +0100, Victor Stinner wrote:
2016-01-10 18:57 GMT+01:00 Steven D'Aprano <steve@pearwood.info>:
Do all custom Mappings need to implement __version__?
I believe the answer to that is No, but the PEP probably should clarify that.
In the PEP, I wrote "The PEP is designed to implement guards on namespaces, only the dict type can be used for namespaces in practice. collections.UserDict is modified because it must mimicks dict. collections.Mapping is unchanged." https://www.python.org/dev/peps/pep-0509/#changes
Is it enough? If no, what do you suggest to be more explicit?
You also should argue whether or not __version__ should be visible to users from pure Python, or only from C code (as Neil wants). In other words, should __version__ be part of the public API of dict, or an implementation detail?

(1) Make __version__ part of the public API.

Pros:
- Simpler implementation?
- Allows easier debugging.
- Users can make use of it for their own purposes.

Cons:
- Neil wants to avoid users making use of this feature. (Why, he hasn't explained, or if he did, I missed it.)
- All implementations (PyPy, Jython, etc.) must copy it.
- You lock in one specific implementation for guards and cannot change to another one.

(2) Keep __version__ private.

Pros:
- Other implementations can ignore it.
- You can change the implementation for guards.

Cons:
- Users may resort to ctypes to make use of it. (If they can.)

-- Steve
2016-01-11 1:12 GMT+01:00 Steven D'Aprano <steve@pearwood.info>:
Cons:
- Users may resort to ctypes to make use of it. (If they can.)
It's not something new. It's already possible to access any C private attribute using ctypes. I don't think that it's a real issue. "We are all consenting adults here" ;-) Victor
On 1/10/2016 3:02 PM, Victor Stinner wrote:
In the PEP, I wrote "The PEP is designed to implement guards on namespaces, only the dict type can be used for namespaces in practice. collections.UserDict is modified because it must mimicks dict.
collections.UserDict mimics the public interface of dict, not internal implementation details. It uses an actual dict to do this. If __version__ is not exposed at the python level, it will not be and should not be visible via UserDict.
collections.Mapping is unchanged." https://www.python.org/dev/peps/pep-0509/#changes
Is it enough? If no, what do you suggest to be more explicit?
Your minimal core proposal is or should be to add a possibly private .__version__ attribute to CPython dicts, so as to enable astoptimizer. Stick with that. Stop inviting peripheral discussion and distractions. Modifying UserDict and exposing __version__ to Python code are separate issues, and can be done later if deemed desirable. -- Terry Jan Reedy
On Jan 9, 2016, at 04:48, Neil Girdhar <mistersheik@gmail.com> wrote:
How is this not just a poorer version of PyPy's optimizations? If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff.
I think he's already answered this twice between the two threads, plus at least once in the thread last year, not to mention similar questions from slightly different angles. Which implies to me that the PEPs really need to anticipate and answer these questions.
Hi, Andrew Barnert:
Which implies to me that the PEPs really need to anticipate and answer these questions.
The dict.__version__ PEP mentions FAT Python as a use case. In fact, I should point to the func.specialize() PEP, which already partially explains the motivation for static optimizers: https://www.python.org/dev/peps/pep-0510/#rationale But ok, I will enhance the PEP 510 rationale to explain why static optimizers make sense in Python, maybe even more sense than a JIT compiler in some cases (short-lived programs). By the way, I think that Mercurial is a good example of a short-lived program. (There is a project for a local "server" to keep a process in the background; that one would benefit from a JIT compiler.) Victor
On Jan 10, 2016, at 05:01, Victor Stinner <victor.stinner@gmail.com> wrote:
Andrew Barnert:
Which implies to me that the PEPs really need to anticipate and answer these questions.
The dict.__version__ PEP mentions FAT Python as a use case. In fact, I should point to the func.specialize() PEP, which already partially explains the motivation for static optimizers: https://www.python.org/dev/peps/pep-0510/#rationale
Sure, linking to PEP 510 instead of repeating its whole rationale seems perfectly reasonable to me.
But ok, I will enhance the PEP 510 rationale to explain why static optimizers make sense in Python, maybe even more sense than a JIT compiler in some cases (short-lived programs). By the way, I think that Mercurial is a good example of a short-lived program.
If CPython is already faster than PyPy for hg, and your optimization makes it faster, then you've got a great answer for "why should anyone care about making CPython a little faster?" Can you benchmark that, or at least a toy app that simulates the same kind of work? Anyway, my point is just that it would be nice if, the next time someone raises the same kind of objection (because I'll bet it comes up when you post to -dev on the next pass, from people who don't read -ideas), you could just say "read this section of PEP 509 and that section of PEP 510 and then tell me what objections you still have", instead of needing to repeat the arguments you've already made.
2016-01-11 1:53 GMT+01:00 Andrew Barnert <abarnert@yahoo.com>:
If CPython is already faster than PyPy for hg, and your optimization makes it faster, then you've got a great answer for "why should anyone care about making CPython a little faster?" Can you benchmark that, or at least a toy app that simulates the same kind of work?
My optimizer now has a good library to implement optimizations, but I didn't start to implement optimizations which will provide real speedup on real applications. I expect a speedup with function inlining, detecting pure functions, elimination of "unused" variables (after constant propagation), etc. In short, since the optimizer is "incomplete", I don't even want to start playing with benchmarks. You can play with microbenchmarks if you want. Try FAT Python, it's a working Python 3.6: https://faster-cpython.readthedocs.org/fat_python.html Currently, you have to run it with "-X fat" to enable the optimizer. But the command line argument may change, I'm still working on the API. Victor
On 1/8/2016 4:27 PM, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
I agree with Neil Girdhar that this looks to me like a CPython-specific implementation detail that should not be imposed on other implementations. For testing, perhaps we could add a dict_version function in test.support that uses ctypes to access the internals. Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.
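For what it's worth, a rough sketch of what such a helper could look like, assuming a CPython build whose PyDictObject actually stores a 64-bit version counter right after ma_used (the layout the reference implementation uses); the struct layout and the ma_version_tag name here are assumptions, and on any other build this would read garbage:

import ctypes

class _PyDictHead(ctypes.Structure):
    # Assumed layout on a non-debug 64-bit build: refcount, type pointer,
    # ma_used, then the hypothetical 64-bit version counter.
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ma_used", ctypes.c_ssize_t),
        ("ma_version_tag", ctypes.c_uint64),
    ]

def dict_version(d):
    """Hypothetical test.support helper: read the counter via ctypes."""
    if not isinstance(d, dict):
        raise TypeError("dict expected")
    return _PyDictHead.from_address(id(d)).ma_version_tag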
Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, ... can be modified at runtime.
I believe that C-coded functions are immutable. But I believe that mutability otherwise undercuts what you are trying to do.
Implementing optimizations that respect the Python semantics requires detecting when "something changes":
But as near as I can tell, your proposal cannot detect all relevant changes unless one is *very* careful. A dict maps hashable objects to objects. Objects represent values. So a dict represents a mapping of values to values. If an object is mutated, the object to object mapping is not changed, but the semantic value to value mapping *is* changed. In the following example, __version__ twice gives the 'wrong' answer from a value perspective.

d = {'f': [int]}
d['f'][0] = float  # object mapping unchanged, value mapping changed
d['f'] = [float]   # object mapping changed, value mapping unchanged
The astoptimizer of the FAT Python project implements many optimizations which require guards on namespaces. Examples:
* Call pure builtins: to replace ``len("abc")`` with ``3``,
Replacing a call with a return value assumes that the function is immutable, deterministic, and without side-effect. Perhaps this is what you meant by 'pure'. Are you proposing to provide astoptimizer with either a whitelist or blacklist of builtins that qualify or not? Aside from this, I don't find this example motivational. I would either write '3' in the first place or write something like "slen = len('akjslkjgkjsdkfjsldjkfs')" outside of any loop. I would more likely write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len = len(key)" to keep a reference to both the string and its length. Will astoptimizer 'propagate the constant' (in this case 'key')? The question in my mind is whether real code has enough pure builtin calls of constants to justify the overhead.
* Loop unrolling: to unroll the loop ``for i in range(...): ...``,
How often is this useful in modern real-world Python code? Many old uses of range have been or could be replaced with enumerate or a collection iterator, making it less common than it once was. How often is N small enough that one wants complete versus partial unrolling? Wouldn't it be simpler to only use a (specialized) loop-unroller where range is known to be the builtin? -- Terry Jan Reedy
2016-01-09 23:18 GMT+01:00 Terry Reedy <tjreedy@udel.edu>:
But as near as I can tell, your proposal cannot detect all relevant changes unless one is *very* careful. A dict maps hashable objects to objects. Objects represent values. So a dict represents a mapping of values to values. If an object is mutated, the object to object mapping is not changed, but the semantic value to value mapping *is* changed. In the following example, __version__ twice gives the 'wrong' answer from a value perspective.
dict.__version__ is a technical solution to implement efficient guards on namespaces. You are right that it's not enough to detect every kind of change. For example, to inline a function inside the same module, we need a guard on the global variable, but also a guard on the function itself. We need to disable the optimization if the function code (func.__code__) is modified. Maybe a guard is also needed on default values of function parameters. But guards on functions don't need to modify CPython internals. It's already possible to implement efficient guards on functions.
Replacing a call with a return value assumes that the function is immutable, deterministic, and without side-effect.
that the function is deterministic and has no side-effect, yep.
Perhaps this is what you meant by 'pure'. Are you proposing to provide astoptimizer with either a whitelist or blacklist of builtins that qualify or not?
Currently, I'm using a whitelist of builtin functions which are known to be pure. Later, I plan to automatically detect pure functions by analyzing the (AST) code.
Aside from this, I don't find this example motivational. I would either write '3' in the first place or write something like "slen = len('akjslkjgkjsdkfjsldjkfs')" outside of any loop. I would more likely write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len = len(key)" to keep a reference to both the string and its length. Will astoptimizer 'propagate the constant' (in this case 'key')?
FYI I already have a working implementation of the astoptimizer: it's possible to run the full Python test suite with the optimizer. Implemented optimizations: https://faster-cpython.readthedocs.org/fat_python.html#optimizations Constant propagation and constant folding optimizations are implemented. A single optimization is not interesting on its own; it's more interesting when you combine optimizations, like constant propagation + constant folding + loop unrolling.
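To make the "combined optimizations" point concrete, here is a hand-written before/after sketch (not actual astoptimizer output), assuming guards confirm that the builtins range and print were not overridden:

def f():
    n = 2               # constant propagation substitutes this value below
    for i in range(n):  # loop unrolling expands the loop once n is known
        print(i)

def f_optimized():
    # Hypothetical specialized version, valid only while the namespace
    # guards on 'range' and 'print' hold.
    print(0)
    print(1)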
The question in my mind is whether real code has enough pure builtin calls of constants to justify the overhead.
Replacing len("abc") with 3 is not the goal of FAT Python. It's only a simple example that is easy to understand.
How often is this useful in modern real-world Python code? Many old uses of range have been or could be replaced with enumerate or a collection iterator, making it less common than it once was.
IMHO the optimizations currently implemented will not provide any major speedup. It will become more interesting with function inlining. The idea is more to create an API to support pluggable static optimizations.
How often is N small enough that one wants complete versus partial unrolling? Wouldn't it be simpler to only use a (specialized) loop-unroller where range is known to be the builtin?
What is the link between your question and dict.__version__? Victor
On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
On 1/8/2016 4:27 PM, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
I agree with Neil Girdhar that this looks to me like a CPython-specific implementation detail that should not be imposed on other implementations. For testing, perhaps we could add a dict_version function in test.support that uses ctypes to access the internals.
Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.
What makes you say that? Isn't it a simple matter of:

v = mydict.__version__
maybe_modify(mydict)
if v != mydict.__version__:
    print("dict has changed")

which doesn't seem tricky or bug-prone to me. The only thing I would consider is the risk that people will write v > mydict.__version__ instead of not equal, which is wrong if the flag overflows back to zero. But with a 64-bit flag, and one modification to the dict every nanosecond (i.e. a billion changes per second), it will take approximately 584 years before the counter overflows. I don't think this is a realistic scenario. How many computers do you know with an uptime of more than a decade? (A 32-bit counter, on the other hand, will only take four seconds to overflow at that rate.)
Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, ... can be modified at runtime.
I believe that C-coded functions are immutable. But I believe that mutability otherwise undercuts what you are trying to do.
If I have understood Victor's intention correctly, what he's looking for is a way to quickly detect the shadowing or monkey-patching of builtins, so that if they *haven't* been shadowed/monkey-patched, functions can bypass the (slow) lookup process with a fast inline version. Here's a sketch of the idea:

def demo(arg):
    return len(arg)

This has to do a time-consuming lookup of len in the globals, and if not found, then a second lookup in builtins. But 99.99% of the time, we haven't shadowed or monkey-patched len, so the compiler ought to be able to inline the function and skip the search. This is how static programming languages typically operate, and is one of the reasons why they're so fast. In Python, you will often see functions like this:

def demo(arg, len=len):
    return len(arg)

which replace the slow global lookup with a fast local lookup, but at the cost of adding an extra parameter to the function call. Ugly and confusing. And, it has the side-effect that if you do shadow or monkey-patch len, the demo function won't see the new version, which may not be what you want. Victor wants to be able to make that idiom obsolete by allowing the compiler to automatically translate this:

def demo(arg):
    return len(arg)

into something like this:

def demo(arg):
    if len has been shadowed or monkey-patched:
        return len(arg)  # calls the new version
    else:
        return inlined or cached version of len(arg)

(I stress that you, the code's author, don't have to write the code like that, the compiler will automatically do this. And it won't just operate on len, it could potentially operate on any function that has no side-effects.) This relies on the test for shadowing etc. to be cheap, which Victor's tests suggest it is. But he needs a way to detect when the globals() and builtins.__dict__ dictionaries have been changed, hence his proposal.
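For concreteness, here is a minimal pure-Python sketch of such a guard, assuming the proposed dict.__version__ attribute exists (so it will not run on current CPython); the class name and structure are made up for illustration, not taken from FAT Python:

import builtins

UNSET = object()

class GlobalNameGuard:
    # Guards one name looked up in the globals and builtins namespaces.
    def __init__(self, globals_ns, name):
        self.globals_ns = globals_ns
        self.name = name
        self.cached = globals_ns.get(name, builtins.__dict__.get(name, UNSET))
        self.g_version = globals_ns.__version__
        self.b_version = builtins.__dict__.__version__

    def check(self):
        # Fast path: neither namespace changed at all since specialization.
        if (self.globals_ns.__version__ == self.g_version
                and builtins.__dict__.__version__ == self.b_version):
            return True
        # Slow path: some key changed; re-check the one name we care about.
        current = self.globals_ns.get(self.name,
                                      builtins.__dict__.get(self.name, UNSET))
        if current is self.cached:
            # An unrelated key changed: re-cache the versions and keep going.
            self.g_version = self.globals_ns.__version__
            self.b_version = builtins.__dict__.__version__
            return True
        return False

A specialized demo() would then call check() on its guard and fall back to the normal global/builtin lookup whenever it returns False.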
Implementing optimizations that respect the Python semantics requires detecting when "something changes":
But as near as I can tell, your proposal cannot detect all relevant changes unless one is *very* careful. A dict maps hashable objects to objects. Objects represent values. So a dict represents a mapping of values to values. If an object is mutated, the object to object mapping is not changed, but the semantic value to value mapping *is* changed. In the following example, __version__ twice gives the 'wrong' answer from a value perspective.
d = {'f': [int]}
d['f'][0] = float  # object mapping unchanged, value mapping changed
d['f'] = [float]   # object mapping changed, value mapping unchanged
I don't think that matters for Victor's use-case. Going back to the toy example above, Victor doesn't need to detect internal modifications to the len built-in, because as you say it's immutable:

py> len.foo = "spam"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute 'foo'

He just needs to know if globals()['len'] and/or builtins.len are different (in any way) from how they were when the function "demo" was compiled. I'm sure that there are complications that I haven't thought of, but these sorts of static compiler optimizations aren't cutting edge computer science, they've been around for decades and are well-studied and well-understood.
The astoptimizer of the FAT Python project implements many optimizations which require guards on namespaces. Examples:
* Call pure builtins: to replace ``len("abc")`` with ``3``,
Replacing a call with a return value assumes that the function is immutable, deterministic, and without side-effect. Perhaps this is what you meant by 'pure'.
Yes, "pure function" is the term of art for a function which is deterministic and free of side-effects. https://en.wikipedia.org/wiki/Pure_function Immutability is only important in the sense that if a function *is* pure now, you know it will be pure in the future as well.
Are you proposing to provide astoptimizer with either a whitelist or blacklist of builtins that qualify or not?
I don't think the implementation details of astoptimizer are important for this proposal. [...]
The question in my mind is whether real code has enough pure builtin calls of constants to justify the overhead.
It's not just builtin calls of constants, this technique has much wider application. If I understand Victor correctly, he thinks he can get function inlining, where instead of having to make a full function call to the built-in (which is slow), the compiler can jump directly to the function's implementation as if it were written inline. https://en.wikipedia.org/wiki/Inline_expansion Obviously you can't do this optimization if len has changed from the inlined version, hence Victor needs to detect changes to globals() and builtins.__dict__. This shouldn't turn into a general critique of optimization techniques, but I think that Victor's PEP should justify why he is confident that these optimizations have a good chance to be worthwhile. It's not enough to end up with "well, we applied all the optimizations we could, and the good news is that Python is no slower". We want some evidence that it will actually be faster. -- Steve
On Sun, Jan 10, 2016 at 3:24 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
On 1/8/2016 4:27 PM, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
I agree with Neil Girdhar that this looks to me like a CPython-specific implementation detail that should not be imposed on other implementations. For testing, perhaps we could add a dict_version function in test.support that uses ctypes to access the internals.
Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.
What makes you say that? Isn't it a simple matter of:
v = mydict.__version__
maybe_modify(mydict)
if v != mydict.__version__:
    print("dict has changed")
which doesn't seem tricky or bug-prone to me.
That doesn't. I would, however, expect that __version__ is a read-only attribute. I can't imagine any justifiable excuse for changing it; if you want to increment it, just mutate the dict in some unnecessary way.
But as near as I can tell, your proposal cannot detect all relevant changes unless one is *very* careful. A dict maps hashable objects to objects. Objects represent values. So a dict represents a mapping of values to values. If an object is mutated, the object to object mapping is not changed, but the semantic value to value mapping *is* changed. In the following example, __version__ twice gives the 'wrong' answer from a value perspective.
d = {'f': [int]}
d['f'][0] = float  # object mapping unchanged, value mapping changed
d['f'] = [float]   # object mapping changed, value mapping unchanged
I don't think that matters for Victor's use-case. Going back to the toy example above, Victor doesn't need to detect internal modifications to the len built-in, because as you say it's immutable:
py> len.foo = "spam"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute 'foo'
He just needs to know if globals()['len'] and/or builtins.len are different (in any way) from how they were when the function "demo" was compiled.
There's more to it than that. Yes, a dict maps values to values; but the keys MUST be immutable (otherwise hashing has problems), and this optimization doesn't actually care about the immutability of the value. When you use the name "len" in a Python function, somewhere along the way, that will resolve to some object. Currently, CPython knows in advance that it isn't in the function-locals, but checks at run-time for a global and then a built-in; all FAT Python is doing differently is snapshotting the object referred to, and then having a quick check to prove that globals and builtins haven't been mutated. Consider:

def enumerate_classes():
    return (cls.__name__ for cls in object.__subclasses__())

As long as nobody has *rebound* the name 'object', this will continue to work - and it'll pick up new subclasses, which means that something's mutable or non-pure in there. FAT Python should be able to handle this just as easily as it handles an immutable. The only part that has to be immutable is the string "len" or "object" that is used as the key. The significance of len being immutable and pure comes from the other optimization, which is actually orthogonal to the non-rebound names optimization, except that CPython already does this where it doesn't depend on names. CPython already constant-folds in situations where no names are involved. That's how we maintain the illusion that there is such a thing as a "complex literal":
>>> dis.dis(lambda: 1+2j)
  1           0 LOAD_CONST               3 ((1+2j))
              3 RETURN_VALUE
FAT Python proposes to do the same here:
>>> dis.dis(lambda: len("abc"))
  1           0 LOAD_GLOBAL              0 (len)
              3 LOAD_CONST               1 ('abc')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE
And that's where it might be important to check more than just the identity of the object. If len were implemented in Python:
>>> def len(x):
...     l = 0
...     for l, _ in enumerate(x, 1): pass
...     return l
...
>>> len("abc")
3
>>> len
<function len at 0x7fc6111769d8>
then it would be possible to keep the same len object but change its behaviour.
>>> len.__code__ = (lambda x: 5).__code__
>>> len
<function len at 0x7fc6111769d8>
>>> len("abc")
5
Does anyone EVER do this? C compilers often have optimization levels that can potentially alter the program's operation (eg replacing division with multiplication by the reciprocal); if FAT Python has an optimization flag that says "Assume no __code__ objects are ever replaced", most programs would have no problem with it. (Having it trigger an immediate exception would mean there's no "what the bleep is going on" moment, and I still doubt it'll ever happen.) I think there are some interesting possibilities here. Whether they actually result in real improvement I don't know; but if FAT Python is aiming to be fast at the "start program, do a tiny bit of work, and then terminate" execution model (where JIT compilation can't help), then it could potentially make Mercurial a *lot* faster to fiddle with. ChrisA
On 01/09/2016 09:23 PM, Chris Angelico wrote:
On Sun, Jan 10, 2016 at 3:24 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
On 1/8/2016 4:27 PM, Victor Stinner wrote:
Add a new read-only ``__version__`` property to ``dict`` and ``collections.UserDict`` types, incremented at each change.
Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.
What makes you say that? Isn't it a simple matter of:
[snip]
which doesn't seem tricky or bug-prone to me.
That doesn't. I would, however, expect that __version__ is a read-only attribute.
You mean like it says in the first quote of this message? ;) -- ~Ethan~
On Sun, Jan 10, 2016 at 5:25 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
That doesn't. I would, however, expect that __version__ is a read-only attribute.
You mean like it says in the first quote of this message? ;)
D'oh. Yep. Reminder to self: read through things twice. You never know what you missed the first time. ChrisA
Hi, 2016-01-10 6:23 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
Consider:
def enumerate_classes():
    return (cls.__name__ for cls in object.__subclasses__())
As long as nobody has *rebound* the name 'object', this will continue to work - and it'll pick up new subclasses, which means that something's mutable or non-pure in there. FAT Python should be able to handle this just as easily as it handles an immutable. The only part that has to be immutable is the string "len" or "object" that is used as the key.
FYI I implemented a "copy builtin to constant" optimization which replaces the "LOAD_GLOBAL object" instruction with "LOAD_CONST <object function>": https://faster-cpython.readthedocs.org/fat_python.html#fat-copy-builtin-to-c... It uses a guard on the builtin and global namespaces to disable the optimization if object is replaced. If you want to make object.__subclasses__ constant, we need more guards:

* guard on the object.__subclasses__ attribute
* guard on the private tp_version_tag attribute of the object type
* ...

and it looks like object.__subclasses__ uses weak references, so I'm not sure that it's really possible to make object.__subclasses__() constant with guards. Is it really worth it? Is it a common case? Oh... I just remembered that the "type" type already implements a version as I propose for dict. It's called "tp_version_tag" and it's private. It has the C type "unsigned int" and it's incremented at each modification.
And that's where it might be important to check more than just the identity of the object. If len were implemented in Python:
>>> def len(x):
...     l = 0
...     for l, _ in enumerate(x, 1): pass
...     return l
...
>>> len("abc")
3
>>> len
<function len at 0x7fc6111769d8>
then it would be possible to keep the same len object but change its behaviour.
>>> len.__code__ = (lambda x: 5).__code__
>>> len
<function len at 0x7fc6111769d8>
>>> len("abc")
5
Does anyone EVER do this?
FAT Python implements a fat.GuardFunc which checks if func.__code__ was replaced or not. It doesn't matter if replacing func.__code__ is unlikely. An optimizer must not change the Python semantics, otherwise it will break some applications and cannot be used widely.
if FAT Python has an optimization flag that says "Assume no __code__ objects are ever replaced", most programs would have no problem with it. (Having it trigger an immediate exception would mean there's no "what the bleep is going on" moment, and I still doubt it'll ever happen.)
In my plan, I will add an option to skip guards if you are 100% sure that some things will never change. For example, if you control all code of your application (not only the app itself, all modules) and you know that func.__code__ is never replaced, you can skip fat.GuardFunc (not emit them).
I think there are some interesting possibilities here. Whether they actually result in real improvement I don't know; but if FAT Python is aiming to be fast at the "start program, do a tiny bit of work, and then terminate" execution model (where JIT compilation can't help), then it could potentially make Mercurial a *lot* faster to fiddle with.
FAT Python is designed to compile the code ahead of time. The installation can be pre-optimized in a package, or optimized at installation time, but it's not optimized when the program is started. If the optimizations are effective, the program will run faster, even for short-lived programs (yes, like Mercurial). Victor
On 1/10/2016 12:23 AM, Chris Angelico wrote: (in response to Steven's response to my post)
There's more to it than that. Yes, a dict maps values to values; but the keys MUST be immutable
Keys just have to be hashable; only hashes need to be immutable. By default, hashes depend on ids, which are immutable for a particular object within a run.

(otherwise hashing has problems),

only if the hash depends on values that mutate. Some do.
and this optimization
doesn't actually care about the immutability of the value.
astoptimizer has multiple optimizations. One is not repeating name lookups. This is safe as long as the relevant dicts have not changed. I am guessing that you were pointing to this one. Another is not repeating the call of a function with a particular value. This optimization, in general, is not safe even if dicts have not changed. It *does* care about the nature of dict values -- in particular the nature of functions that are dict values. It is the one *I* discussed, and the reason I claimed that using __version__ is tricky.

His toy example is conditionally replacing len('abc') (at runtime) with '3', where '3' is computed *when the code is compiled*. For this, it is crucial that builtin len is pure and immutable. Victor is being super careful to not break code. In response to my question, Victor said astoptimizer uses a whitelist of pure builtins to supplement the information supplied by .__version__. Dict history, summarized by __version__, is not always enough to answer 'is this optimization safe?'. The nature of values is sometimes crucially important. However, others might use __version__ *without* thinking through what other information is needed. This is why I think its exposure is a bit dangerous. 19 years of experience suggests to me that misuse *will* happen.

Victor just reported that CPython's type already has a *private* version count. The issue of exposing a new internal feature is somewhat separate and comes after the decision to add it. As you know, and even alluded to later in your post, CPython already replaces '1 + 1' with '2' at compile time. Method int.__add__ is pure and immutable. Since it (unlike len) also cannot be replaced or shadowed, the replacement can be complete, with '2' put in the code object (and .pyc if written), as if the programmer had actually written '2'.
>>> from dis import dis
>>> dis('1 + 1')
  1           0 LOAD_CONST               1 (2)
              3 RETURN_VALUE
JIT compilers depend on the same properties of int, float, and str operations, for instance, as well as the fact that unbox(Py object) and box(machine value) are inverses, so that unbox(box(temp_machine_value)) can be replaced by temp_machine_value. -- Terry Jan Reedy
On Jan 10, 2016, at 22:04, Terry Reedy <tjreedy@udel.edu> wrote:
On 1/10/2016 12:23 AM, Chris Angelico wrote:
(in response to Steven's response to my post)
There's more to it than that. Yes, a dict maps values to values; but the keys MUST be immutable
Keys just have to be hashable; only hashes need to be immutable.
By default, hashes depend on ids, which are immutable for a particular object within a run.
(otherwise hashing has problems),
only if the hash depends on values that mutate. Some do.
But if equality depends on values, the hash has to depend on those same values. (Because two values that are equal have to hash equal.) Which means that if equality depends on any mutable values, the type can't be hashable. Which is why none of the built-in mutable types are hashable. Of course Python doesn't stop you from writing your own types that can provide different hashes for equal values, or that can change hashes as they're mutated. It's even possible to use them as dict keys as long as you're very careful (the keys don't mutate in a way that changes either their hash or their equivalence while they're in the dict, and you never look up or add a key that's equal to an existing key but has a different hash). But it's not _that_ much of an oversimplification to say that keys have to be immutable. And any dict-based optimizations can safely rely on the same thing basic dict usage relies on: if the keys _aren't_ actually immutable, they're coded and used carefully (as described above) so that you can't tell they're mutable.
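A small made-up example of the careless case described above, showing why a key whose hash follows mutable state breaks lookups even though Python lets you write it:

class Point:
    # Deliberately ill-behaved: hash and equality follow mutable state.
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x
    def __hash__(self):
        return hash(self.x)

p = Point(1)
d = {p: "value"}
p.x = 2                          # hash changes while the key is in the dict
print(p in d)                    # False: the lookup probes the wrong bucket
print(any(k is p for k in d))    # True: the entry is still physically there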
On 1/11/2016 1:48 AM, Andrew Barnert via Python-ideas wrote:
On Jan 10, 2016, at 22:04, Terry Reedy <tjreedy@udel.edu> wrote:
On 1/10/2016 12:23 AM, Chris Angelico wrote:
(in response to Steven's response to my post)
There's more to it than that. Yes, a dict maps values to values; but the keys MUST be immutable
Keys just have to be hashable; only hashes need to be immutable.
By default, hashes depend on ids, which are immutable for a particular object within a run.
(otherwise hashing has problems),
A '>' quote mark is missing here. This line is from Chris.
only if the hash depends on values that mutate. Some do.
In other words, hashes should not depend on values that mutate. We all agree on that.
But
We all three agree on the following.
if equality depends on values, the hash has to depend on those same values. (Because two values that are equal have to hash equal.) Which means that if equality depends on any mutable values, the type can't be hashable. Which is why none of the built-in mutable types are hashable.
By default, object equality is based on ids.
But it's not _that_ much of an oversimplification to say that keys have to be immutable.
By default, an instance of a subclass of object is mutable, hashable (by id, making the hash immutable), and usable as a dict key. The majority of both builtin and user-defined classes follow this pattern and are quite usable as keys, contrary to the claim. Classes with immutable instances (tuples, numbers, strings, frozen sets, some extension classes, and user classes that take special measures) are exceptions. So are classes with mutable hashes (lists, sets, dicts, some extension classes, and user classes that override __eq__ and __hash__). -- Terry Jan Reedy
On Jan 11, 2016, at 14:56, Terry Reedy <tjreedy@udel.edu> wrote:
Classes with immutable instances (tuples, numbers, strings, frozen sets, some extension classes, and user classes that take special measures) are exceptions. So are classes with mutable hashes (lists, sets, dicts, some extension classes, and user classes that override __eq__ and __hash__).
I don't understand your terminology here. What are "classes with mutable hashes"? Your examples of lists, sets, and dicts don't have mutable hashes; they have no hashes. If you write "hash([])", you get a TypeError("unhashable type: 'list'"). And well-behaved extensions classes and user classes that override __eq__ and __hash__ provide immutable hashes and immutable equality to match, or they use __hash__=None if they need mutable equality. Python can't actually stop you from creating a class with mutable hashes, and even putting instances of such a class in a dict, but that dict won't actually work right. So, there's nothing for a version-guarded dict to worry about there.
On 1/11/2016 6:30 PM, Andrew Barnert via Python-ideas wrote:
On Jan 11, 2016, at 14:56, Terry Reedy <tjreedy@udel.edu> wrote:
Classes with immutable instances (tuples, numbers, strings, frozen sets, some extension classes, and user classes that take special measures) are exceptions. So are classes with mutable hashes (lists, sets, dicts, some extension classes, and user classes that override __eq__ and __hash__).
I don't understand your terminology here.
Yes, the term, as a negation, is wrong. It should be 'classes that don't have immutable hashes'. The list is right, except that 'override' should really be 'disable'. Anyway, Victor changed the PEP and has moved on, so I will too. -- Terry Jan Reedy
On Mon, Jan 11, 2016 at 5:04 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 1/10/2016 12:23 AM, Chris Angelico wrote:
(in response to Steven's response to my post)
There's more to it than that. Yes, a dict maps values to values; but the keys MUST be immutable
Keys just have to be hashable; only hashes need to be immutable. By default, hashes depend on ids, which are immutable for a particular object within a run.
Yes, but if you're using the ID as the hash and identity as equality, then *by definition* the only way to look up that key is with that object. That means it doesn't matter to the lookup optimization if the object itself has changed:

class Puddle(object): pass

d = {}
key, val = Puddle(), Puddle()
key.foo = "foo"; val.foo = "bar"
d[key] = val
print(d[key])
snapshotted_d_key = d[key]
key.foo = "not foo"
print(d[key])
print(snapshotted_d_key)

The optimization in question is effectively using a local reference like snapshotted_d_key rather than doing the actual lookup again. It can safely do this even if the attributes of that key have changed, because there is no way for that to affect the result of the lookup. So in terms of dict lookups, whatever affects hash and equality *is* the object's value; if that's its identity, then identity is the sole value that object has.
and this optimization doesn't actually care about the immutability of the value.
astoptimizer has multiple optimizations. One is not repeating name lookups. This is safe as long as the relevant dicts have not changed. I am guessing that you were pointing to this one.
Yes, that's the one I was talking about.
Another is not repeating the call of a function with a particular value. This optimization, in general, is not safe even if dicts have not changed. It *does* care about the nature of dict values -- in particular the nature of functions that are dict values. It is the one *I* discussed, and the reason I claimed that using __version__ is tricky.
Okay. In that case, yes, it takes a lot more checks.
His toy example is conditionally replacing len('abc') (at runtime) with '3', where '3' is computed *when the code is compiled*. For this, it is crucial that builtin len is pure and immutable.
Correct. I'm getting this mental picture of angelic grace, with a chosen few most beautiful functions being commended for their purity, immutability, and reverence.
Victor is being super careful to not break code. In response to my question, Victor said astoptimizer uses a whitelist of pure builtins to supplement the information supplied by .__version__. Dict history, summarized by __version__, is not always enough to answer 'is this optimization safe?'. The nature of values is sometimes crucially important.
There would be very few operations that can be optimized like this. In practical terms, the only ones that I can think of are what you might call "computed literals" - like (2+3j), they aren't technically literals, but the programmer thinks of them that way. Things like module-level constants (the 'stat' module comes to mind), a small handful of simple transformations, and maybe some text<->bytes transformations (eg "abc".encode("ascii") could be replaced at compile-time with b"abc"). There won't be very many others, I suspect. ChrisA
Hi, 2016-01-11 9:07 GMT+01:00 Chris Angelico <rosuav@gmail.com>:
Yes, but if you're using the ID as the hash and identity as equality, then *by definition* the only way to look up that key is with that object. That means it doesn't matter to the lookup optimization if the object itself has changed:
class Puddle(object): pass

d = {}
key, val = Puddle(), Puddle()
key.foo = "foo"; val.foo = "bar"
d[key] = val
IMHO the discussion has gone too far. See the PEP: the goal is to implement efficient guards on namespaces. In namespaces, keys are short immutable strings, not fancy objects. Keys come from the Python source code, like "x" from "x=1". Again, if the dict value is mutable (like functions implemented in pure Python), there are dedicated guards for that, but no PEP is required to implement these guards ;-) See the PEP 510: to specialize a function, you have to pass a *list* of guards. There is no arbitrary limit on the number of guards :-) (But I expect fewer than 10 guards for the common case, or more likely just a few.)
There would be very few operations that can be optimized like this. In practical terms, the only ones that I can think of are what you might call "computed literals" - like (2+3j), they aren't technically literals, but the programmer thinks of them that way.
FYI the Python 2 peephole optimizer is not able to optimize all operations like that because of technical issues; it's limited :-/ The Python 3 peephole optimizer is better ;-) http://faster-cpython.readthedocs.org/bytecode.html#cpython-peephole-optimiz... More generally, the optimizer is limited because it works on the bytecode, which is difficult to manipulate. It's difficult to implement even simple optimizations. For example, the peephole optimizer of Python 3 maintains a "stack of constants" to implement constant folding. Implementing constant folding on the AST is much easier: you can browse the subtree of a node with nice Python objects. If you are curious, you can take a look at the constant folding optimization step of astoptimizer: https://hg.python.org/sandbox/fatpython/file/tip/Lib/astoptimizer/const_fold... It implements more optimizations than the peephole optimizer: http://faster-cpython.readthedocs.org/fat_python.html#comparison-with-the-pe...
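As a toy illustration of why AST-level folding is straightforward (this is not astoptimizer's code, and it uses the ast.Constant nodes of recent Python versions rather than the ast.Num nodes of the 3.6 era):

import ast

class AddFolder(ast.NodeTransformer):
    # Fold "constant + constant" into a single Constant node.
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first, bottom-up
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            return ast.copy_location(
                ast.Constant(node.left.value + node.right.value), node)
        return node

tree = ast.parse("x = 1 + 2 + 3")
tree = ast.fix_missing_locations(AddFolder().visit(tree))
print(ast.dump(tree.body[0].value))  # the right-hand side folds to one Constant node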
Things like module-level constants (the 'stat' module comes to mind),
In Python, it's rare to manipulate constants directly. But it's common to access constants coming from a different namespace, like constants at the module level. To implement constant propagation on these constants, we also need guards on the namespace to disable the optimization when a "constant" is modified (which can be done in unit tests, for example). For example, the base64 module defines:

_b32alphabet = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'

and later in a function, it uses:

{v: k for k, v in enumerate(_b32alphabet)}

This "complex" dict-comprehension calling enumerate() can be replaced with a simpler dict literal: {65: 0, 66: 1, ...} or dict(((65, 0), (66, 1), ...)). I don't know if it's the best example, and I don't know if it's really much faster; it's just to explain the general idea. Another simple example: pickle.py defines many constants like MARK = b'(' and TUPLE = b'(', defined at module level. Later it uses, for example, MARK + TUPLE. Using guards on the global namespace, it's possible to replace MARK + TUPLE with b'((' to avoid two dict lookups and a call to byte string concatenation. Again, it's a simple example to explain the principle. Usually, a single optimization alone is not interesting. It's when you combine optimizations that it becomes interesting. For example, constant propagation + constant folding + simplify iterable + loop unrolling + elimination of unused variables really makes the code simpler (and more efficient).
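A hand-written before/after sketch of the pickle-style case; the byte values simply follow the ones quoted above (they may not match the real pickle.py), and the optimized function shows the intent of a guarded specialization rather than generated output:

MARK = b'('
TUPLE = b'('

def emit_tuple_header(write):
    # As written: two namespace lookups plus a bytes concatenation per call.
    write(MARK + TUPLE)

def emit_tuple_header_optimized(write):
    # What constant propagation + folding could emit, valid only while the
    # module-namespace guard confirms MARK and TUPLE were not rebound.
    write(b'((')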
a small handful of simple transformations, and maybe some text<->bytes transformations (eg "abc".encode("ascii") could be replaced at compile-time with b"abc"). There won't be very many others, I suspect.
It's possible to optimize some method calls on builtin types without guards since it's not possible to replace methods of builtin types. My old AST optimizer implements such optimizations (I didn't reimplement them in my new AST optimizer yet), but alone they are not really interesting in terms of performance. Victor
2016-01-10 5:24 GMT+01:00 Steven D'Aprano <steve@pearwood.info>:
Here's a sketch of the idea:
def demo(arg):
    return len(arg)
(...)
For examples of guards and how they can be used, please see the PEP 510: https://www.python.org/dev/peps/pep-0510/#example Victor
On 1/9/2016 11:24 PM, Steven D'Aprano wrote:
On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
Another reason to hide __version__ from the Python level is that its use seems to me rather tricky and bug-prone.
What makes you say that?
We would like to replace slow tortoise steps with quick rabbit jumps. Is it safe? For avoiding name lookups in dicts, careful dict guards using __version__ should be enough. For avoiding function calls, they help but are not enough. Optimization is empirically tricky and bug-prone. CPython has many private implementation details that have not been exposed at the Python level because the expected gain is not worth the expected pain. If __version__ is added, I think exposing it should be given separate consideration. -- Terry Jan Reedy
Thank you very much for the first round of comments on this PEP 509 (dict version). I posted a second version to python-dev. The main changes since the first version are that the dictionary version is no longer exposed at the Python level and the field now has a size of 64 bits even on 32-bit platforms. Please continue the discussion there; this thread is now closed ;-) It's now time to review my second PEP 510 (func.specialize), also posted on the python-ideas list 3 days ago: "RFC: PEP: Specialized functions with guards"! Victor
participants (16)
- Andrew Barnert
- Barry Warsaw
- Brett Cannon
- Chris Angelico
- Eric Fahlgren
- Ethan Furman
- Gregory P. Smith
- M.-A. Lemburg
- Michael Selik
- Neil Girdhar
- Nicholas Chammas
- Nick Coghlan
- Serhiy Storchaka
- Steven D'Aprano
- Terry Reedy
- Victor Stinner