14 Apr
2016
14 Apr
'16
4:28 p.m.
+1 from me! A couple of grammar/typo suggestions below. On Thu, 14 Apr 2016 at 08:20 Victor Stinner <victor.stinner@gmail.com> wrote: > Hi, > > I updated my PEP 509 to make the dictionary version globally unique. > With *two* use cases of this PEP (Yury's method call patch and my FAT > Python project), I think that the PEP is now ready to be accepted. > > Globally unique identifier is a requirement for Yury's patch > optimizing method calls ( https://bugs.python.org/issue26110 ). It > allows to check for free if the dictionary was replaced. > > I also renamed the ma_version field to ma_version_tag. > > HTML version: > https://www.python.org/dev/peps/pep-0509/ > > Victor > > > PEP: 509 > Title: Add a private version to dict > Version: $Revision$ > Last-Modified: $Date$ > Author: Victor Stinner <victor.stinner@gmail.com> > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 4-January-2016 > Python-Version: 3.6 > > > Abstract > ======== > > Add a new private version to the builtin ``dict`` type, incremented at > each dictionary creation and at each dictionary change, to implement > fast guards on namespaces. > > > Rationale > ========= > > In Python, the builtin ``dict`` type is used by many instructions. For > example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the > global namespace, or in the builtins namespace (two dict lookups). > Python uses ``dict`` for the builtins namespace, globals namespace, type > namespaces, instance namespaces, etc. The local namespace (namespace of > a function) is usually optimized to an array, but it can be a dict too. > > Python is hard to optimize because almost everything is mutable: builtin > functions, function code, global variables, local variables, ... can be > modified at runtime. Implementing optimizations respecting the Python > semantics requires to detect when "something changes": we will call > these checks "guards". > > The speedup of optimizations depends on the speed of guard checks. This > PEP proposes to add a version to dictionaries to implement fast guards > on namespaces. > > Dictionary lookups can be skipped if the version does not change which > is the common case for most namespaces. Since the version is globally > unique, the version is also enough to check if the namespace dictionary > was not replaced with a new dictionary. The performance of a guard does > not depend on the number of watched dictionary entries, complexity of > O(1), if the dictionary version does not change. > > Example of optimization: copy the value of a global variable to function > constants. This optimization requires a guard on the global variable to > check if it was modified. If the variable is modified, the variable must > be loaded at runtime when the function is called, instead of using the > constant. > > See the `PEP 510 -- Specialized functions with guards > <https://www.python.org/dev/peps/pep-0510/>`_ for the concrete usage of > guards to specialize functions and for the rationale on Python static > optimizers. > > > Guard example > ============= > > Pseudo-code of an fast guard to check if a dictionary entry was modified > (created, updated or deleted) using an hypothetical > ``dict_get_version(dict)`` function:: > > UNSET = object() > > class GuardDictKey: > def __init__(self, dict, key): > self.dict = dict > self.key = key > self.value = dict.get(key, UNSET) > self.version = dict_get_version(dict) > > def check(self): > """Return True if the dictionary entry did not changed > and the dictionary was not replaced.""" > "did not change" > > # read the version of the dict structure > version = dict_get_version(self.dict) > if version == self.version: > # Fast-path: dictionary lookup avoided > return True > > # lookup in the dictionary > value = self.dict.get(self.key, UNSET) > if value is self.value: > # another key was modified: > # cache the new dictionary version > self.version = version > return True > > # the key was modified > return False > > > Usage of the dict version > ========================= > > Speedup method calls 1.2x > ------------------------- > > Yury Selivanov wrote a `patch to optimize method calls > <https://bugs.python.org/issue26110>`_. The patch depends on the > `implement per-opcode cache in ceval > <https://bugs.python.org/issue26219>`_ patch which requires dictionary > versions to invalidate the cache if the globals dictionary or the > builtins dictionary has been modified. > > The cache also requires that the dictionary version is globally unique. > It is possible to define a function in a namespace and call it > in a different namespace: using ``exec()`` with the *globals* parameter > for example. In this case, the globals dictionary was changed and the > cache must be invalidated. > > > Specialized functions using guards > ---------------------------------- > > The `PEP 510 -- Specialized functions with guards > <https://www.python.org/dev/peps/pep-0510/>`_ proposes an API to support > specialized functions with guards. It allows to implement static > optimizers for Python without breaking the Python semantics. > > Example of a static Python optimizer: the `fatoptimizer > <http://fatoptimizer.readthedocs.org/>`_ of the `FAT Python > <http://faster-cpython.readthedocs.org/fat_python.html>`_ project > implements many optimizations which require guards on namespaces. > Examples: > > * Call pure builtins: to replace ``len("abc")`` with ``3``, guards on > ``builtins.__dict__['len']`` and ``globals()['len']`` are required > * Loop unrolling: to unroll the loop ``for i in range(...): ...``, > guards on ``builtins.__dict__['range']`` and ``globals()['range']`` > are required > > > Pyjion > ------ > > According of Brett Cannon, one of the two main developers of Pyjion, > Pyjion can also benefit from dictionary version to implement > optimizations. > > Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET > Core runtime). > > > Unladen Swallow > --------------- > > Even if dictionary version was not explicitly mentioned, optimizing > globals and builtins lookup was part of the Unladen Swallow plan: > "Implement one of the several proposed schemes for speeding lookups of > globals and builtins." Source: `Unladen Swallow ProjectPlan > <https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_. > > Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler > implemented with LLVM. The project stopped in 2011: `Unladen Swallow > Retrospective > <http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html > >`_. > > > Changes > ======= > > Add a ``ma_version_tag`` field to the ``PyDictObject`` structure with > the C type ``PY_INT64_T``, 64-bit unsigned integer. Don't you mean ``PY_UINT64_T``? > Add also a global > dictionary version. Each time a dictionary is created, the global > version is incremented and the dictionary version is initialized to the > global version. The global version is also incremented and copied to the > dictionary version at each dictionary change: > > * ``clear()`` if the dict was non-empty > * ``pop(key)`` if the key exists > * ``popitem()`` if the dict is non-empty > * ``setdefault(key, value)`` if the `key` does not exist > * ``__detitem__(key)`` if the key exists > * ``__setitem__(key, value)`` if the `key` doesn't exist or if the value > is not ``dict[key]`` > * ``update(...)`` if new values are different than existing values: > values are compared by identity, not by their content; the version can > be incremented multiple times > > The ``PyDictObject`` structure is not part of the stable ABI. > > The field is called ``ma_version_tag`` rather than ``ma_version`` to > suggest to compare it using ``version_tag == old_version_tag`` rather > than ``version <= old_version`` which makes the integer overflow much > likely. > > Example using an hypothetical ``dict_get_version(dict)`` function:: > > >>> d = {} > >>> dict_get_version(d) > 100 > >>> d['key'] = 'value' > >>> dict_get_version(d) > 101 > >>> d['key'] = 'new value' > >>> dict_get_version(d) > 102 > >>> del d['key'] > >>> dict_get_version(d) > 103 > > The version is not incremented if an existing key is set to the same > value. For efficiency, values are compared by their identity: > ``new_value is old_value``, not by their content: > ``new_value == old_value``. Example:: > > >>> d = {} > >>> value = object() > >>> d['key'] = value > >>> dict_get_version(d) > 40 > >>> d['key'] = value > >>> dict_get_version(d) > 40 > > .. note:: > CPython uses some singleton like integers in the range [-5; 257], > empty tuple, empty strings, Unicode strings of a single character in > the range [U+0000; U+00FF], etc. When a key is set twice to the same > singleton, the version is not modified. > > > Implementation and Performance > ============================== > > The `issue #26058: PEP 509: Add ma_version_tag to PyDictObject > <https://bugs.python.org/issue26058>`_ contains a patch implementing > this PEP. > > On pybench and timeit microbenchmarks, the patch does not seem to add > any overhead on dictionary operations. > > When the version does not change, ``PyDict_GetItem()`` takes 14.8 ns for > a dictionary lookup, whereas a guard check only takes 3.8 ns. Moreover, > a guard can watch for multiple keys. For example, for an optimization > using 10 global variables in a function, 10 dictionary lookups costs 148 > ns, whereas the guard still only costs 3.8 ns when the version does not > change (39x as fast). > > The `fat module > <http://fatoptimizer.readthedocs.org/en/latest/fat.html>`_ implements > such guards: ``fat.GuardDict`` is based on the dictionary version. > > > Integer overflow > ================ > > The implementation uses the C type ``PY_UINT64_T`` to store the version: > a 64 bits unsigned integer. The C code uses ``version++``. On integer > overflow, the version is wrapped to ``0`` (and then continue to be > incremented) according to the C standard. > > After an integer overflow, a guard can succeed whereas the watched > dictionary key was modified. The bug only occurs at a guard check if > there are exaclty ``2 ** 64`` dictionary creations or modifications > since the previous guard check. > > If a dictionary is modified every nanosecond, ``2 ** 64`` modifications > takes longer than 584 years. Using a 32-bit version, it only takes 4 > seconds. That's why a 64-bit unsigned type is also used on 32-bit > systems. A dictionary lookup at the C level takes 14.8 ns. > > A risk of a bug every 584 years is acceptable. > > > Alternatives > ============ > > Expose the version at Python level as a read-only __version__ property > ---------------------------------------------------------------------- > > The first version of the PEP proposed to expose the dictionary version > as a read-only ``__version__`` property at Python level, and also to add > the property to ``collections.UserDict`` (since this type must mimick > the ``dict`` API). > > There are multiple issues: > > * To be consistent and avoid bad surprises, the version must be added to > all mapping types. Implementing a new mapping type would require extra > work for no benefit, since the version is only required on the > ``dict`` type in practice. > * All Python implementations must implement this new property, it gives > more work to other implementations, whereas they may not use the > dictionary version at all. > * Exposing the dictionary version at Python level can lead the > false assumption on performances. Checking ``dict.__version__`` at > the Python level is not faster than a dictionary lookup. A dictionary > lookup has a cost of 48.7 ns and checking a guard has a cost of 47.5 > ns, the difference is only 1.2 ns (3%):: > > > $ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd["33"] > == 33' > 10000000 loops, best of 3: 0.0487 usec per loop > $ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' > 'd.__version__ == 100' > 10000000 loops, best of 3: 0.0475 usec per loop > > * The ``__version__`` can be wrapped on integer overflow. It is error > prone: using ``dict.__version__ <= guard_version`` is wrong, > ``dict.__version__ == guard_version`` must be used instead to reduce > the risk of bug on integer overflow (even if the integer overflow is > unlikely in practice). > > Mandatory bikeshedding on the property name: > > * ``__cache_token__``: name proposed by Nick Coghlan, name coming from > `abc.get_cache_token() > <https://docs.python.org/3/library/abc.html#abc.get_cache_token>`_. > * ``__version__`` > * ``__timestamp__`` > > > Add a version to each dict entry > -------------------------------- > > A single version per dictionary requires to keep a strong reference to > the value which can keep the value alive longer than expected. If we add > also a version per dictionary entry, the guard can only store the entry > version to avoid the strong reference to the value (only strong > references to the dictionary and to the key are needed). > > Changes: add a ``me_version`` field to the ``PyDictKeyEntry`` structure, > the field has the C type ``PY_INT64_T``. When a key is created or > modified, the entry version is set to the dictionary version which is > incremented at any change (create, modify, delete). > > Pseudo-code of an fast guard to check if a dictionary key was modified > using hypothetical ``dict_get_version(dict)`` > ``dict_get_entry_version(dict)`` functions:: > > UNSET = object() > > class GuardDictKey: > def __init__(self, dict, key): > self.dict = dict > self.key = key > self.dict_version = dict_get_version(dict) > self.entry_version = dict_get_entry_version(dict, key) > > def check(self): > """Return True if the dictionary entry did not changed > and the dictionary was not replaced.""" > > # read the version of the dict structure > dict_version = dict_get_version(self.dict) > if dict_version == self.version: > # Fast-path: dictionary lookup avoided > return True > > # lookup in the dictionary > entry_version = get_dict_key_version(dict, key) > if entry_version == self.entry_version: > # another key was modified: > # cache the new dictionary version > self.dict_version = dict_version > return True > > # the key was modified > return False > > The main drawback of this option is the impact on the memory footprint. > It increases the size of each dictionary entry, so the overhead depends > on the number of buckets (dictionary entries, used or unused yet). For > example, it increases the size of each dictionary entry by 8 bytes on > 64-bit system. > > In Python, the memory footprint matters and the trend is to reduce it. > Examples: > > * `PEP 393 -- Flexible String Representation > <https://www.python.org/dev/peps/pep-0393/>`_ > * `PEP 412 -- Key-Sharing Dictionary > <https://www.python.org/dev/peps/pep-0412/>`_ > > > Add a new dict subtype > ---------------------- > > Add a new ``verdict`` type, subtype of ``dict``. When guards are needed, > use the ``verdict`` for namespaces (module namespace, type namespace, > instance namespace, etc.) instead of ``dict``. > > Leave the ``dict`` type unchanged to not add any overhead (memory > footprint) when guards are not needed. > > Technical issue: a lot of C code in the wild, including CPython core, > expecting the exact ``dict`` type. Issues: > > * ``exec()`` requires a ``dict`` for globals and locals. A lot of code > use ``globals={}``. It is not possible to cast the ``dict`` to a > ``dict`` subtype because the caller expects the ``globals`` parameter > to be modified (``dict`` is mutable). > * Functions call directly ``PyDict_xxx()`` functions, instead of calling > ``PyObject_xxx()`` if the object is a ``dict`` subtype > * ``PyDict_CheckExact()`` check fails on ``dict`` subtype, whereas some > functions require the exact ``dict`` type. > * ``Python/ceval.c`` does not completely supports dict subtypes for > namespaces > > > The ``exec()`` issue is a blocker issue. > > Other issues: > > * The garbage collector has a special code to "untrack" ``dict`` > instances. If a ``dict`` subtype is used for namespaces, the garbage > collector can be unable to break some reference cycles. > * Some functions have a fast-path for ``dict`` which would not be taken > for ``dict`` subtypes, and so it would make Python a little bit > slower. > > > Prior Art > ========= > > Method cache and type version tag > --------------------------------- > > In 2007, Armin Rigo wrote a patch to to implement a cache of methods. It > was merged into Python 2.6. The patch adds a "type attribute cache > version tag" (``tp_version_tag``) and a "valid version tag" flag to > types (the ``PyTypeObject`` structure). > > The type version tag is not available at the Python level. > > The version tag has the C type ``unsigned int``. The cache is a global > hash table of 4096 entries, shared by all types. The cache is global to > "make it fast, have a deterministic and low memory footprint, and be > easy to invalidate". Each cache entry has a version tag. A global > version tag is used to create the next version tag, it also has the C > type ``unsigned int``. > > By default, a type has its "valid version tag" flag cleared to indicate > that the version tag is invalid. When the first method of the type is > cached, the version tag and the "valid version tag" flag are set. When a > type is modified, the "valid version tag" flag of the type and its > subclasses is cleared. Later, when a cache entry of these types is used, > the entry is removed because its version tag is outdated. > > On integer overflow, the whole cache is cleared and the global version > tag is reset to ``0``. > > See `Method cache (issue #1685986) > <https://bugs.python.org/issue1685986>`_ and `Armin's method cache > optimization updated for Python 2.6 (issue #1700288) > <https://bugs.python.org/issue1700288>`_. > > > Globals / builtins cache > ------------------------ > > In 2010, Antoine Pitrou proposed a `Globals / builtins cache (issue > #10401) <http://bugs.python.org/issue10401>`_ which adds a private > ``ma_version`` field to the ``PyDictObject`` structure (``dict`` type), > the field has the C type ``Py_ssize_t``. > > The patch adds a "global and builtin cache" to functions and frames, and > changes ``LOAD_GLOBAL`` and ``STORE_GLOBAL`` instructions to use the > cache. > > The change on the ``PyDictObject`` structure is very similar to this > PEP. > > > Cached globals+builtins lookup > ------------------------------ > > In 2006, Andrea Griffini proposed a patch implementing a `Cached > globals+builtins lookup optimization > <https://bugs.python.org/issue1616125>`_. The patch adds a private > ``timestamp`` field to the ``PyDictObject`` structure (``dict`` type), > the field has the C type ``size_t``. > > Thread on python-dev: `About dictionary lookup caching > <https://mail.python.org/pipermail/python-dev/2006-December/070348.html > >`_. > > > Guard against changing dict during iteration > -------------------------------------------- > > In 2013, Serhiy Storchaka proposed `Guard against changing dict during > iteration (issue #19332) <https://bugs.python.org/issue19332>`_ which > adds a ``ma_count`` field to the ``PyDictObject`` structure (``dict`` > type), the field has the C type ``size_t``. This field is incremented > when the dictionary is modified, and so is very similar to the proposed > dictionary version. > > Sadly, the dictionary version proposed in this PEP doesn't help to > detect dictionary mutation. The dictionary version changes when values > are replaced, whereas modifying dictionary values while iterating on > dictionary keys is legit in Python. > > > PySizer > ------- > > `PySizer <http://pysizer.8325.org/>`_: a memory profiler for Python, > Google Summer of Code 2005 project by Nick Smallbone. > > This project has a patch for CPython 2.4 which adds ``key_time`` and > ``value_time`` fields to dictionary entries. It uses a global > process-wide counter for dictionaries, incremented each time that a > dictionary is modified. The times are used to decide when child objects > first appeared in their parent objects. > > > Discussion > ========== > > Thread on the mailing lists: > > * python-dev: `PEP 509: Add a private version to dict > <https://mail.python.org/pipermail/python-dev/2016-January/142685.html > >`_ > (january 2016) > * python-ideas: `RFC: PEP: Add dict.__version__ > <https://mail.python.org/pipermail/python-ideas/2016-January/037702.html > >`_ > (january 2016) > > > Copyright > ========= > > This document has been placed in the public domain. > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org >