[Python-Dev] PEP 509: Add a private version to dict

Victor Stinner victor.stinner at gmail.com
Mon Jan 11 14:56:26 EST 2016


Le 11 janv. 2016 8:09 PM, "Maciej Fijalkowski" <fijall at gmail.com> a écrit :
> Hi Victor.
>
> You know that pypy does this stuff without changing and exposing
> python semantics right? We have a version dict that does not leak
> abstractions to the user.

The PEP adds a field to the C structure PyDictObject. Are you asking me to
hide it from the C structure?

The first version of my PEP added a public read-only property at Python
level, but I changed the PEP. See the alternatives section for more detail.

Victor

> In general, doing stuff like that where there is a public API that
> leaks details of certain optimizations makes it harder and harder for
> optimizing compilers to do their job properly, if you want to do
> something slightly different.
>
> Can we make this happen (as you noted in the prior art) WITHOUT
> changing ANY of the things exposed to the user?
>
> On Mon, Jan 11, 2016 at 6:49 PM, Victor Stinner
> <victor.stinner at gmail.com> wrote:
> > Hi,
> >
> > After a first round on python-ideas, here is the second version of my
> > PEP. The main changes since the first version are that the dictionary
> > version is no more exposed at the Python level and the field type now
> > also has a size of 64-bit on 32-bit platforms.
> >
> > The PEP is part of a serie of 3 PEP adding an API to implement a
> > static Python optimizer specializing functions with guards. The second
> > PEP is currently discussed on python-ideas and I'm still working on
> > the third PEP.
> >
> > Thanks to Red Hat for giving me time to experiment on this.
> >
> >
> > HTML version:
> > https://www.python.org/dev/peps/pep-0509/
> >
> >
> > PEP: 509
> > Title: Add a private version to dict
> > Version: $Revision$
> > Last-Modified: $Date$
> > Author: Victor Stinner <victor.stinner at gmail.com>
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
> > Created: 4-January-2016
> > Python-Version: 3.6
> >
> >
> > Abstract
> > ========
> >
> > Add a new private version to builtin ``dict`` type, incremented at each
> > change, to implement fast guards on namespaces.
> >
> >
> > Rationale
> > =========
> >
> > In Python, the builtin ``dict`` type is used by many instructions. For
> > example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
> > global namespace, or in the builtins namespace (two dict lookups).
> > Python uses ``dict`` for the builtins namespace, globals namespace, type
> > namespaces, instance namespaces, etc. The local namespace (namespace of
> > a function) is usually optimized to an array, but it can be a dict too.
> >
> > Python is hard to optimize because almost everything is mutable: builtin
> > functions, function code, global variables, local variables, ... can be
> > modified at runtime. Implementing optimizations respecting the Python
> > semantics requires to detect when "something changes": we will call
> > these checks "guards".
> >
> > The speedup of optimizations depends on the speed of guard checks. This
> > PEP proposes to add a version to dictionaries to implement fast guards
> > on namespaces.
> >
> > Dictionary lookups can be skipped if the version does not change which
> > is the common case for most namespaces. The performance of a guard does
> > not depend on the number of watched dictionary entries, complexity of
> > O(1), if the dictionary version does not change.
> >
> > Example of optimization: copy the value of a global variable to function
> > constants.  This optimization requires a guard on the global variable to
> > check if it was modified. If the variable is modified, the variable must
> > be loaded at runtime when the function is called, instead of using the
> > constant.
> >
> > See the `PEP 510 -- Specialized functions with guards
> > <https://www.python.org/dev/peps/pep-0510/>`_ for the concrete usage of
> > guards to specialize functions and for the rationale on Python static
> > optimizers.
> >
> >
> > Guard example
> > =============
> >
> > Pseudo-code of an fast guard to check if a dictionary entry was modified
> > (created, updated or deleted) using an hypothetical
> > ``dict_get_version(dict)`` function::
> >
> >     UNSET = object()
> >
> >     class GuardDictKey:
> >         def __init__(self, dict, key):
> >             self.dict = dict
> >             self.key = key
> >             self.value = dict.get(key, UNSET)
> >             self.version = dict_get_version(dict)
> >
> >         def check(self):
> >             """Return True if the dictionary entry did not changed."""
> >
> >             # read the version field of the dict structure
> >             version = dict_get_version(self.dict)
> >             if version == self.version:
> >                 # Fast-path: dictionary lookup avoided
> >                 return True
> >
> >             # lookup in the dictionary
> >             value = self.dict.get(self.key, UNSET)
> >             if value is self.value:
> >                 # another key was modified:
> >                 # cache the new dictionary version
> >                 self.version = version
> >                 return True
> >
> >             # the key was modified
> >             return False
> >
> >
> > Usage of the dict version
> > =========================
> >
> > Specialized functions using guards
> > ----------------------------------
> >
> > The `PEP 510 -- Specialized functions with guards
> > <https://www.python.org/dev/peps/pep-0510/>`_ proposes an API to support
> > specialized functions with guards. It allows to implement static
> > optimizers for Python without breaking the Python semantics.
> >
> > Example of a static Python optimizer: the astoptimizer of the `FAT
> > Python <http://faster-cpython.readthedocs.org/fat_python.html>`_ project
> > implements many optimizations which require guards on namespaces.
> > Examples:
> >
> > * Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
> >   ``builtins.__dict__['len']`` and ``globals()['len']`` are required
> > * Loop unrolling: to unroll the loop ``for i in range(...): ...``,
> >   guards on ``builtins.__dict__['range']`` and ``globals()['range']``
> >   are required
> >
> >
> > Pyjion
> > ------
> >
> > According of Brett Cannon, one of the two main developers of Pyjion,
> > Pyjion can also benefit from dictionary version to implement
> > optimizations.
> >
> > Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET
> > Core runtime).
> >
> >
> > Unladen Swallow
> > ---------------
> >
> > Even if dictionary version was not explicitly mentionned, optimization
> > globals and builtins lookup was part of the Unladen Swallow plan:
> > "Implement one of the several proposed schemes for speeding lookups of
> > globals and builtins." Source: `Unladen Swallow ProjectPlan
> > <https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_.
> >
> > Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler
> > implemented with LLVM. The project stopped in 2011: `Unladen Swallow
> > Retrospective
> > <http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html
>`_.
> >
> >
> > Changes
> > =======
> >
> > Add a ``ma_version`` field to the ``PyDictObject`` structure with the C
> > type ``PY_INT64_T``, 64-bit unsigned integer. New empty dictionaries are
> > initilized to version ``0``. The version is incremented at each change:
> >
> > * ``clear()`` if the dict was non-empty
> > * ``pop(key)`` if the key exists
> > * ``popitem()`` if the dict is non-empty
> > * ``setdefault(key, value)`` if the `key` does not exist
> > * ``__detitem__(key)`` if the key exists
> > * ``__setitem__(key, value)`` if the `key` doesn't exist or if the value
> >   is different
> > * ``update(...)`` if new values are different than existing values (the
> >   version can be incremented multiple times)
> >
> > Example using an hypothetical ``dict_get_version(dict)`` function::
> >
> >     >>> d = {}
> >     >>> dict_get_version(d)
> >     0
> >     >>> d['key'] = 'value'
> >     >>> dict_get_version(d)
> >     1
> >     >>> d['key'] = 'new value'
> >     >>> dict_get_version(d)
> >     2
> >     >>> del d['key']
> >     >>> dict_get_version(d)
> >     3
> >
> > If a dictionary is created with items, the version is also incremented
> > at each dictionary insertion. Example::
> >
> >     >>> d=dict(x=7, y=33)
> >     >>> dict_get_version(d)
> >     2
> >
> > The version is not incremented if an existing key is set to the same
> > value. For efficiency, values are compared by their identity:
> > ``new_value is old_value``, not by their content:
> > ``new_value == old_value``. Example::
> >
> >     >>> d={}
> >     >>> value = object()
> >     >>> d['key'] = value
> >     >>> dict_get_version(d)
> >     2
> >     >>> d['key'] = value
> >     >>> dict_get_version(d)
> >     2
> >
> > .. note::
> >    CPython uses some singleton like integers in the range [-5; 257],
> >    empty tuple, empty strings, Unicode strings of a single character in
> >    the range [U+0000; U+00FF], etc. When a key is set twice to the same
> >    singleton, the version is not modified.
> >
> >
> > Implementation
> > ==============
> >
> > The `issue #26058: PEP 509: Add ma_version to PyDictObject
> > <https://bugs.python.org/issue26058>`_ contains a patch implementing
> > this PEP.
> >
> > On pybench and timeit microbenchmarks, the patch does not seem to add
> > any overhead on dictionary operations.
> >
> > When the version does not change, ``PyDict_GetItem()`` takes 14.8 ns for
> > a dictioanry lookup, whereas a guard check only takes 3.8 ns. Moreover,
> > a guard can watch for multiple keys. For example, for an optimization
> > using 10 global variables in a function, 10 dictionary lookups costs 148
> > ns, whereas the guard still only costs 3.8 ns when the version does not
> > change (39x as fast).
> >
> >
> > Integer overflow
> > ================
> >
> > The implementation uses the C unsigned integer type ``PY_INT64_T`` to
> > store the version, a 64 bits unsigned integer. The C code uses
> > ``version++``. On integer overflow, the version is wrapped to ``0`` (and
> > then continue to be incremented) according to the C standard.
> >
> > After an integer overflow, a guard can succeed whereas the watched
> > dictionary key was modified. The bug occurs if the dictionary is
> > modified at least ``2 ** 64`` times between two checks of the guard and
> > if the new version (theorical value with no integer overflow) is equal
> > to the old version modulo ``2 ** 64``.
> >
> > If a dictionary is modified each nanosecond, an overflow takes longer
> > than 584 years. Using a 32-bit version, the overflow occurs only after 4
> > seconds. That's why a 64-bit unsigned type is also used on 32-bit
> > systems. A dictionary lookup at the C level takes 14.8 ns.
> >
> > A risk of a bug every 584 years is acceptable.
> >
> >
> > Alternatives
> > ============
> >
> > Expose the version at Python level as a read-only __version__ property
> > ----------------------------------------------------------------------
> >
> > The first version of the PEP proposed to expose the dictionary version
> > as a read-only ``__version__`` property at Python level, and also to add
> > the property to ``collections.UserDict`` (since this type must mimick
> > the ``dict`` API).
> >
> > There are multiple issues:
> >
> > * To be consistent and avoid bad surprises, the version must be added to
> >   all mapping types. Implementing a new mapping type would require extra
> >   work for no benefit, since the version is only required on the
> >   ``dict`` type in practice.
> > * All Python implementations must implement this new property, it gives
> >   more work to other implementations, whereas they may not use the
> >   dictionary version at all.
> > * The ``__version__`` can be wrapped on integer overflow. It is error
> >   prone: using ``dict.__version__ <= guard_version`` is wrong,
> >   ``dict.__version__ == guard_version`` must be used instead to reduce
> >   the risk of bug on integer overflow (even if the integer overflow is
> >   unlikely in practice).
> > * Exposing the dictionary version at Python level can lead the
> >   false assumption on performances. Checking ``dict.__version__`` at
> >   the Python level is not faster than a dictionary lookup. A dictionary
> >   lookup has a cost of 48.7 ns and checking a guard has a cost of 47.5
> >   ns, the difference is only 1.2 ns (3%)::
> >
> >
> >     $ ./python -m timeit -s 'd = {str(i):i for i in range(100)}'
'd["33"] == 33'
> >     10000000 loops, best of 3: 0.0487 usec per loop
> >     $ ./python -m timeit -s 'd = {str(i):i for i in range(100)}'
> > 'd.__version__ == 100'
> >     10000000 loops, best of 3: 0.0475 usec per loop
> >
> > Bikeshedding on the property name:
> >
> > * ``__cache_token__``: name proposed by Nick Coghlan, name coming from
> >   `abc.get_cache_token()
> >   <https://docs.python.org/3/library/abc.html#abc.get_cache_token>`_.
> > * ``__version__``
> > * ``__timestamp__``
> >
> >
> > Add a version to each dict entry
> > --------------------------------
> >
> > A single version per dictionary requires to keep a strong reference to
> > the value which can keep the value alive longer than expected. If we add
> > also a version per dictionary entry, the guard can only store the entry
> > version to avoid the strong reference to the value (only strong
> > references to the dictionary and to the key are needed).
> >
> > Changes: add a ``me_version`` field to the ``PyDictKeyEntry`` structure,
> > the field has the C type ``PY_INT64_T``. When a key is created or
> > modified, the entry version is set to the dictionary version which is
> > incremented at any change (create, modify, delete).
> >
> > Pseudo-code of an fast guard to check if a dictionary key was modified
> > using hypothetical ``dict_get_version(dict)``
> > ``dict_get_entry_version(dict)`` functions::
> >
> >     UNSET = object()
> >
> >     class GuardDictKey:
> >         def __init__(self, dict, key):
> >             self.dict = dict
> >             self.key = key
> >             self.dict_version = dict_get_version(dict)
> >             self.entry_version = dict_get_entry_version(dict, key)
> >
> >         def check(self):
> >             """Return True if the dictionary entry did not changed."""
> >
> >             # read the version field of the dict structure
> >             dict_version = dict_get_version(self.dict)
> >             if dict_version == self.version:
> >                 # Fast-path: dictionary lookup avoided
> >                 return True
> >
> >             # lookup in the dictionary
> >             entry_version = get_dict_key_version(dict, key)
> >             if entry_version == self.entry_version:
> >                 # another key was modified:
> >                 # cache the new dictionary version
> >                 self.dict_version = dict_version
> >                 return True
> >
> >             # the key was modified
> >             return False
> >
> > The main drawback of this option is the impact on the memory footprint.
> > It increases the size of each dictionary entry, so the overhead depends
> > on the number of buckets (dictionary entries, used or unused yet). For
> > example, it increases the size of each dictionary entry by 8 bytes on
> > 64-bit system.
> >
> > In Python, the memory footprint matters and the trend is to reduce it.
> > Examples:
> >
> > * `PEP 393 -- Flexible String Representation
> >   <https://www.python.org/dev/peps/pep-0393/>`_
> > * `PEP 412 -- Key-Sharing Dictionary
> >   <https://www.python.org/dev/peps/pep-0412/>`_
> >
> >
> > Add a new dict subtype
> > ----------------------
> >
> > Add a new ``verdict`` type, subtype of ``dict``. When guards are needed,
> > use the ``verdict`` for namespaces (module namespace, type namespace,
> > instance namespace, etc.) instead of ``dict``.
> >
> > Leave the ``dict`` type unchanged to not add any overhead (memory
> > footprint) when guards are not needed.
> >
> > Technical issue: a lot of C code in the wild, including CPython core,
> > expecting the exact ``dict`` type. Issues:
> >
> > * ``exec()`` requires a ``dict`` for globals and locals. A lot of code
> >   use ``globals={}``. It is not possible to cast the ``dict`` to a
> >   ``dict`` subtype because the caller expects the ``globals`` parameter
> >   to be modified (``dict`` is mutable).
> > * Functions call directly ``PyDict_xxx()`` functions, instead of calling
> >   ``PyObject_xxx()`` if the object is a ``dict`` subtype
> > * ``PyDict_CheckExact()`` check fails on ``dict`` subtype, whereas some
> >   functions require the exact ``dict`` type.
> > * ``Python/ceval.c`` does not completly supports dict subtypes for
> >   namespaces
> >
> >
> > The ``exec()`` issue is a blocker issue.
> >
> > Other issues:
> >
> > * The garbage collector has a special code to "untrack" ``dict``
> >   instances. If a ``dict`` subtype is used for namespaces, the garbage
> >   collector can be unable to break some reference cycles.
> > * Some functions have a fast-path for ``dict`` which would not be taken
> >   for ``dict`` subtypes, and so it would make Python a little bit
> >   slower.
> >
> >
> > Prior Art
> > =========
> >
> > Method cache and type version tag
> > ---------------------------------
> >
> > In 2007, Armin Rigo wrote a patch to to implement a cache of methods. It
> > was merged into Python 2.6.  The patch adds a "type attribute cache
> > version tag" (``tp_version_tag``) and a "valid version tag" flag to
> > types (the ``PyTypeObject`` structure).
> >
> > The type version tag is not available at the Python level.
> >
> > The version tag has the C type ``unsigned int``. The cache is a global
> > hash table of 4096 entries, shared by all types. The cache is global to
> > "make it fast, have a deterministic and low memory footprint, and be
> > easy to invalidate". Each cache entry has a version tag. A global
> > version tag is used to create the next version tag, it also has the C
> > type ``unsigned int``.
> >
> > By default, a type has its "valid version tag" flag cleared to indicate
> > that the version tag is invalid. When the first method of the type is
> > cached, the version tag and the "valid version tag" flag are set. When a
> > type is modified, the "valid version tag" flag of the type and its
> > subclasses is cleared. Later, when a cache entry of these types is used,
> > the entry is removed because its version tag is outdated.
> >
> > On integer overflow, the whole cache is cleared and the global version
> > tag is reset to ``0``.
> >
> > See `Method cache (issue #1685986)
> > <https://bugs.python.org/issue1685986>`_ and `Armin's method cache
> > optimization updated for Python 2.6 (issue #1700288)
> > <https://bugs.python.org/issue1700288>`_.
> >
> >
> > Globals / builtins cache
> > ------------------------
> >
> > In 2010, Antoine Pitrou proposed a `Globals / builtins cache (issue
> > #10401) <http://bugs.python.org/issue10401>`_ which adds a private
> > ``ma_version`` field to the ``PyDictObject`` structure (``dict`` type),
> > the field has the C type ``Py_ssize_t``.
> >
> > The patch adds a "global and builtin cache" to functions and frames, and
> > changes ``LOAD_GLOBAL`` and ``STORE_GLOBAL`` instructions to use the
> > cache.
> >
> > The change on the ``PyDictObject`` structure is very similar to this
> > PEP.
> >
> >
> > Cached globals+builtins lookup
> > ------------------------------
> >
> > In 2006, Andrea Griffini proposed a patch implementing a `Cached
> > globals+builtins lookup optimization
> > <https://bugs.python.org/issue1616125>`_.  The patch adds a private
> > ``timestamp`` field to the ``PyDictObject`` structure (``dict`` type),
> > the field has the C type ``size_t``.
> >
> > Thread on python-dev: `About dictionary lookup caching
> > <https://mail.python.org/pipermail/python-dev/2006-December/070348.html
>`_.
> >
> >
> > Guard against changing dict during iteration
> > --------------------------------------------
> >
> > In 2013, Serhiy Storchaka proposed `Guard against changing dict during
> > iteration (issue #19332) <https://bugs.python.org/issue19332>`_ which
> > adds a ``ma_count`` field to the ``PyDictObject`` structure (``dict``
> > type), the field has the C type ``size_t``.  This field is incremented
> > when the dictionary is modified, and so is very similar to the proposed
> > dictionary version.
> >
> > Sadly, the dictionary version proposed in this PEP doesn't help to
> > detect dictionary mutation. The dictionary version changes when values
> > are replaced, whereas modifying dictionary values while iterating on
> > dictionary keys is legit in Python.
> >
> >
> > PySizer
> > -------
> >
> > `PySizer <http://pysizer.8325.org/>`_: a memory profiler for Python,
> > Google Summer of Code 2005 project by Nick Smallbone.
> >
> > This project has a patch for CPython 2.4 which adds ``key_time`` and
> > ``value_time`` fields to dictionary entries. It uses a global
> > process-wide counter for dictionaries, incremented each time that a
> > dictionary is modified. The times are used to decide when child objects
> > first appeared in their parent objects.
> >
> >
> > Discussion
> > ==========
> >
> > Thread on the python-ideas mailing list: `RFC: PEP: Add dict.__version__
> > <https://mail.python.org/pipermail/python-ideas/2016-January/037702.html
>`_.
> >
> >
> > Copyright
> > =========
> >
> > This document has been placed in the public domain.
> > _______________________________________________
> > Python-Dev mailing list
> > Python-Dev at python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20160111/76061889/attachment-0001.html>


More information about the Python-Dev mailing list