[Python-checkins] peps: Add PEP 509 (dict.version) and 510 (func.specialize)

Sat Jan 9 17:28:50 EST 2016

https://hg.python.org/peps/rev/5ad62448c882
changeset:   6153:5ad62448c882
user:        Victor Stinner <victor.stinner at gmail.com>
date:        Sat Jan 09 23:28:43 2016 +0100
summary:
  Add PEP 509 (dict.__version__) and 510 (func.specialize)

files:
  pep-0509.txt |  387 +++++++++++++++++++++++++++++++++++++++
  pep-0510.txt |  213 +++++++++++++++++++++
  2 files changed, 600 insertions(+), 0 deletions(-)

diff --git a/pep-0509.txt b/pep-0509.txt
new file mode 100644
--- /dev/null
+++ b/pep-0509.txt
@@ -0,0 +1,387 @@
+PEP: 509
+Title: Add dict.__version__
+Version: $Revision$
+Last-Modified: $Date$
+Author: Victor Stinner <victor.stinner at gmail.com>
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 4-January-2016
+Python-Version: 3.6
+
+
+Abstract
+========
+
+Add a new read-only ``__version__`` property to ``dict`` and
+``collections.UserDict`` types, incremented at each change.
+
+
+Rationale
+=========
+
+In Python, the builtin ``dict`` type is used by many instructions. For
+example, the ``LOAD_GLOBAL`` instruction searchs for a variable in the
+global namespace, or in the builtins namespace (two dict lookups).
+Python uses ``dict`` for the builtins namespace, globals namespace, type
+namespaces, instance namespaces, etc. The local namespace (namespace of
+a function) is usually optimized to an array, but it can be a dict too.
+
+Python is hard to optimize because almost everything is mutable: builtin
+functions, function code, global variables, local variables, ... can be
+modified at runtime. Implementing optimizations respecting the Python
+semantic requires to detect when "something changes": we will call these
+checks "guards".
+
+The speedup of optimizations depends on the speed of guard checks. This
+PEP proposes to add a version to dictionaries to implement efficient
+guards on namespaces.
+
+Example of optimization: replace loading a global variable with a
+constant.  This optimization requires a guard on the global variable to
+check if it was modified. If the variable is modified, the variable must
+be loaded at runtime, instead of using the constant.
+
+
+Guard example
+=============
+
+Pseudo-code of an efficient guard to check if a dictionary key was
+modified (created, updated or deleted)::
+
+    UNSET = object()
+
+    class Guard:
+        def __init__(self, dict, key):
+            self.dict = dict
+            self.key = key
+            self.value = dict.get(key, UNSET)
+            self.version = dict.__version__
+
+        def check(self):
+            """Return True if the dictionary value did not changed."""
+            version = self.dict.__version__
+            if version == self.version:
+                # Fast-path: avoid the dictionary lookup
+                return True
+
+            value = self.dict.get(self.key, UNSET)
+            if value == self.value:
+                # another key was modified:
+                # cache the new dictionary version
+                self.version = version
+                return True
+
+            return False
+
+
+Changes
+=======
+
+Add a read-only ``__version__`` property to builtin ``dict`` type and to
+the ``collections.UserDict`` type. New empty dictionaries are initilized
+to version ``0``. The version is incremented at each change:
+
+* ``clear()`` if the dict was non-empty
+* ``pop(key)`` if the key exists
+* ``popitem()`` if the dict is non-empty
+* ``setdefault(key, value)`` if the `key` does not exist
+* ``__detitem__(key)`` if the key exists
+* ``__setitem__(key, value)`` if the `key` doesn't exist or if the value
+  is different
+* ``update(...)`` if new values are different than existing values (the
+  version can be incremented multiple times)
+
+Example::
+
+    >>> d = {}
+    >>> d.__version__
+    0
+    >>> d['key'] = 'value'
+    >>> d.__version__
+    1
+    >>> d['key'] = 'new value'
+    >>> d.__version__
+    2
+    >>> del d['key']
+    >>> d.__version__
+    3
+
+If a dictionary is created with items, the version is also incremented
+at each dictionary insertion. Example::
+
+    >>> d=dict(x=7, y=33)
+    >>> d.__version__
+    2
+
+The version is not incremented is an existing key is modified to the
+same value, but only the identifier of the value is tested, not the
+content of the value. Example::
+
+    >>> d={}
+    >>> value = object()
+    >>> d['key'] = value
+    >>> d.__version__
+    2
+    >>> d['key'] = value
+    >>> d.__version__
+    2
+
+.. note::
+   CPython uses some singleton like integers in the range [-5; 257],
+   empty tuple, empty strings, Unicode strings of a single character in
+   the range [U+0000; U+00FF], etc. When a key is set twice to the same
+   singleton, the version is not modified.
+
+The PEP is designed to implement guards on namespaces, only the ``dict``
+type can be used for namespaces in practice.  ``collections.UserDict``
+is modified because it must mimicks ``dict``. ``collections.Mapping`` is
+unchanged.
+
+
+Integer overflow
+================
+
+The implementation uses the C unsigned integer type ``size_t`` to store
+the version.  On 32-bit systems, the maximum version is ``2**32-1``
+(more than ``4.2 * 10 ** 9``, 4 billions). On 64-bit systems, the maximum
+version is ``2**64-1`` (more than ``1.8 * 10**19``).
+
+The C code uses ``version++``. On integer overflow, the version is
+wrapped to ``0`` (and then continue to be incremented).
+
+The check ``dict.__version__ == old_version`` can be true after an
+integer overflow, so a guard can return false even if the value changed,
+which is wrong. The bug occurs if the dict is modified at least ``2**64``
+times (on 64-bit system) between two checks of the guard.
+
+Using a more complex type (ex: ``PyLongObject``) to avoid the overflow
+would slow down operations on the ``dict`` type. Even if there is a
+theorical risk of missing a value change, the risk is considered too low
+compared to the slow down of using a more complex type.
+
+
+Usage of dict.__version__
+=========================
+
+Detect dictionary mutation during iteration
+-------------------------------------------
+
+Currently, iterating on a dictionary only detects when the dictionary
+size changes, but not when keys or values are modified. Using the
+dictionary version, it would be possible to detect when keys and values
+are modified.
+
+See the `issue #19332: Guard against changing dict during iteration
+<https://bugs.python.org/issue19332>`_.
+
+
+astoptimizer of FAT Python
+--------------------------
+
+The astoptimizer of the FAT Python project implements many optimizations
+which require guards on namespaces. Examples:
+
+* Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
+  ``builtins.__dict__['len']`` and ``globals()['len']`` are required
+* Loop unrolling: to unroll the loop ``for i in range(...): ...``,
+  guards on ``builtins.__dict__['range']`` and ``globals()['range']``
+  are required
+
+The `FAT Python
+<http://faster-cpython.readthedocs.org/fat_python.html>`_ project is a
+static optimizer for Python 3.6.
+
+
+Pyjion
+------
+
+According of Brett Cannon, one of the two main developers of Pyjion, Pyjion can
+also benefit from dictionary version to implement optimizations.
+
+Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET Core
+runtime).
+
+
+Unladen Swallow
+---------------
+
+Even if dictionary version was not explicitly mentionned, optimization globals
+and builtins lookup was part of the Unladen Swallow plan: "Implement one of the
+several proposed schemes for speeding lookups of globals and builtins."
+Source: `Unladen Swallow ProjectPlan
+<https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_.
+
+Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler implemented
+with LLVM. The project stopped in 2011: `Unladen Swallow Retrospective
+<http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html>`_.
+
+
+Implementation
+==============
+
+See the `issue #26058: Add dict.__version__ read-only property
+<https://bugs.python.org/issue26058>`_.
+
+On pybench and timeit microbenchmarks, the patch does not seem to add
+any overhead on dictionary operations.
+
+
+Alternatives
+============
+
+Add a version to each dict entry
+--------------------------------
+
+A single version per dictionary requires to keep a strong reference to
+the value which can keep the value alive longer than expected. If we add
+also a version per dictionary entry, the guard can rely on the entry
+version and so avoid the strong reference to the value (only strong
+references to a dictionary and key are needed).
+
+Changes: add a ``getversion(key)`` method to dictionary which returns
+``None`` if the key doesn't exist. When a key is created or modified,
+the entry version is set to the dictionary version which is incremented
+at each change (create, modify, delete).
+
+Pseudo-code of an efficient guard to check if a dict key was modified
+using ``getversion()``::
+
+    UNSET = object()
+
+    class Guard:
+        def __init__(self, dict, key):
+            self.dict = dict
+            self.key = key
+            self.dict_version = dict.__version__
+            self.entry_version = dict.getversion(key)
+
+        def check(self):
+            """Return True if the dictionary value did not changed."""
+            dict_version = self.dict.__version__
+            if dict_version == self.version:
+                # Fast-path: avoid the dictionary lookup
+                return True
+
+            # lookup in the dictionary, but get the entry version,
+            #not the value
+            entry_version = self.dict.getversion(self.key)
+            if entry_version == self.entry_version:
+                # another key was modified:
+                # cache the new dictionary version
+                self.dict_version = dict_version
+                return True
+
+            return False
+
+This main drawback of this option is the impact on the memory footprint.
+It increases the size of each dictionary entry, so the overhead depends
+on the number of buckets (dictionary entries, used or unused yet). For
+example, it increases the size of each dictionary entry by 8 bytes on
+64-bit system if we use ``size_t``.
+
+In Python, the memory footprint matters and the trend is more to reduce
+it. Examples:
+
+* `PEP 393 -- Flexible String Representation
+  <https://www.python.org/dev/peps/pep-0393/>`_
+* `PEP 412 -- Key-Sharing Dictionary
+  <https://www.python.org/dev/peps/pep-0412/>`_
+
+
+Add a new dict subtype
+----------------------
+
+Add a new ``verdict`` type, subtype of ``dict``. When guards are needed,
+use the ``verdict`` for namespaces (module namespace, type namespace,
+instance namespace, etc.) instead of ``dict``.
+
+Leave the ``dict`` type unchanged to not add any overhead (memory
+footprint) when guards are not needed.
+
+Technical issue: a lot of C code in the wild, including CPython core,
+expect the exact ``dict`` type. Issues:
+
+* ``exec()`` requires a ``dict`` for globals and locals. A lot of code
+  use ``globals={}``. It is not possible to cast the ``dict`` to a
+  ``dict`` subtype because the caller expects the ``globals`` parameter
+  to be modified (``dict`` is mutable).
+* Functions call directly ``PyDict_xxx()`` functions, instead of calling
+  ``PyObject_xxx()`` if the object is a ``dict`` subtype
+* ``PyDict_CheckExact()`` check fails on ``dict`` subtype, whereas some
+  functions require the exact ``dict`` type.
+* ``Python/ceval.c`` does not completly supports dict subtypes for
+  namespaces
+
+
+The ``exec()`` issue is a blocker issue.
+
+Other issues:
+
+* The garbage collector has a special code to "untrack" ``dict``
+  instances. If a ``dict`` subtype is used for namespaces, the garbage
+  collector may be unable to break some reference cycles.
+* Some functions have a fast-path for ``dict`` which would not be taken
+  for ``dict`` subtypes, and so it would make Python a little bit
+  slower.
+
+
+Prior Art
+=========
+
+Guard against changing dict during iteration
+--------------------------------------------
+
+In 2013, Serhiy Storchaka proposed a patch for the `issue #19332: Guard
+against changing dict during iteration
+<https://bugs.python.org/issue19332>`_ (mentioned above) which adds a
+``size_t ma_count`` field to the ``PyDictObject`` structure. This field
+is incremented when the dictionary is modified, and so is very similar
+to the proposed ``dict.__version__``.
+
+
+Cached globals+builtins lookup
+------------------------------
+
+In 2006, Andrea Griffini proposes a patch implementing a `Cached
+globals+builtins lookup optimization <https://bugs.python.org/issue1616125>`_.
+The patch adds a private ``timestamp`` field to dict.
+
+See the thread on python-dev: `About dictionary lookup caching
+<https://mail.python.org/pipermail/python-dev/2006-December/070348.html>`_.
+
+
+Globals / builtins cache
+------------------------
+
+In 2010, Antoine Pitrou proposed a `Globals / builtins cache
+<http://bugs.python.org/issue10401>`_ which adds a private
+``ma_version`` field to the ``dict`` type. The patch adds a "global and
+builtin cache" to functions and frames, and changes ``LOAD_GLOBAL`` and
+``STORE_GLOBAL`` instructions to use the cache.
+
+
+PySizer
+-------
+
+`PySizer <http://pysizer.8325.org/>`_: a memory profiler for Python,
+Google Summer of Code 2005 project by Nick Smallbone.
+
+This project has a patch for CPython 2.4 which adds ``key_time`` and
+``value_time`` fields to dictionary entries. It uses a global
+process-wide counter for dictionaries, incremented each time that a
+dictionary is modified. The times are used to decide when child objects
+first appeared in their parent objects.
+
+
+Discussion
+==========
+
+Thread on the python-ideas mailing list: `RFC: PEP: Add dict.__version__
+<https://mail.python.org/pipermail/python-ideas/2016-January/037702.html>`_.
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
diff --git a/pep-0510.txt b/pep-0510.txt
new file mode 100644
--- /dev/null
+++ b/pep-0510.txt
@@ -0,0 +1,213 @@
+PEP: 510
+Title: Specialized functions with guards
+Version: $Revision$
+Last-Modified: $Date$
+Author: Victor Stinner <victor.stinner at gmail.com>
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 4-January-2016
+Python-Version: 3.6
+
+
+Abstract
+========
+
+Add an API to add specialized functions with guards to functions, to
+support static optimizers respecting the Python semantic.
+
+
+Rationale
+=========
+
+Python is hard to optimize because almost everything is mutable: builtin
+functions, function code, global variables, local variables, ... can be
+modified at runtime. Implement optimizations respecting the Python
+semantic requires to detect when "something changes", we will call these
+checks "guards".
+
+This PEP proposes to add a ``specialize()`` method to functions to add a
+specialized functions with guards. When the function is called, the
+specialized function is used if nothing changed, otherwise use the
+original bytecode.
+
+Writing an optimizer is out of the scope of this PEP.
+
+
+Example
+=======
+
+Using bytecode
+--------------
+
+Replace ``chr(65)`` with ``"A"``::
+
+    import myoptimizer
+
+    def func():
+        return chr(65)
+
+    def fast_func():
+        return "A"
+
+    func.specialize(fast_func.__code__, [myoptimizer.GuardBuiltins("chr")])
+    del fast_func
+
+    print("func(): %s" % func())
+    print("#specialized: %s" % len(func.get_specialized()))
+    print()
+
+    import builtins
+    builtins.chr = lambda obj: "mock"
+
+    print("func(): %s" % func())
+    print("#specialized: %s" % len(func.get_specialized()))
+
+Output::
+
+    func(): A
+    #specialized: 1
+
+    func(): mock
+    #specialized: 0
+
+The hypothetical ``myoptimizer.GuardBuiltins("len")`` is a guard on the
+builtin ``len()`` function and the ``len`` name in the global namespace.
+The guard fails if the builtin function is replaced or if a ``len`` name
+is defined in the global namespace.
+
+The first call returns directly the string ``"A"``. The second call
+removes the specialized function because the builtin ``chr()`` function
+was replaced, and executes the original bytecode
+
+On a microbenchmark, calling the specialized function takes 88 ns,
+whereas the original bytecode takes 145 ns (+57 ns): 1.6 times as fast.
+
+
+Using builtin function
+----------------------
+
+Replace a slow Python function calling ``chr(obj)`` with a direct call
+to the builtin ``chr()`` function::
+
+    import myoptimizer
+
+    def func(arg):
+        return chr(arg)
+
+    func.specialize(chr, [myoptimizer.GuardBuiltins("chr")])
+
+    print("func(65): %s" % func(65))
+    print("#specialized: %s" % len(func.get_specialized()))
+    print()
+
+    import builtins
+    builtins.chr = lambda obj: "mock"
+
+    print("func(65): %s" % func(65))
+    print("#specialized: %s" % len(func.get_specialized()))
+
+Output::
+
+    func(): A
+    #specialized: 1
+
+    func(): mock
+    #specialized: 0
+
+The first call returns directly the builtin ``chr()`` function (without
+creating a Python frame). The second call removes the specialized
+function because the builtin ``chr()`` function was replaced, and
+executes the original bytecode.
+
+On a microbenchmark, calling the specialized function takes 95 ns,
+whereas the original bytecode takes 155 ns (+60 ns): 1.6 times as fast.
+Calling directly ``chr(65)`` takes 76 ns.
+
+
+Python Function Call
+====================
+
+Pseudo-code to call a Python function having specialized functions with
+guards::
+
+    def call_func(func, *args, **kwargs):
+        # by default, call the regular bytecode
+        code = func.__code__.co_code
+        specialized = func.get_specialized()
+        nspecialized = len(specialized)
+
+        index = 0
+        while index < nspecialized:
+            guard = specialized[index].guard
+            # pass arguments, some guards need them
+            check = guard(args, kwargs)
+            if check == 1:
+                # guard succeeded: we can use the specialized function
+                code = specialized[index].code
+                break
+            elif check == -1:
+                # guard will always fail: remove the specialized function
+                del specialized[index]
+            elif check == 0:
+                # guard failed temporarely
+                index += 1
+
+        # code can be a code object or any callable object
+        execute_code(code, args, kwargs)
+
+
+Changes
+=======
+
+* Add two new methods to functions:
+
+  - ``specialize(code, guards: list)``: add specialized
+    function with guard. `code` is a code object (ex:
+    ``func2.__code__``) or any callable object (ex: ``len``).
+    The specialization can be ignored if a guard already fails.
+  - ``get_specialized()``: get the list of specialized functions with
+    guards
+
+* Base ``Guard`` type which can be used as parent type to implement
+  guards. It requires to implement a ``check()`` function, with an
+  optional ``first_check()`` function. API:
+
+  * ``int check(PyObject *guard, PyObject **stack)``: return 1 on
+    success, 0 if the guard failed temporarely, -1 if the guard will
+    always fail
+  * ``int first_check(PyObject *guard, PyObject *func)``: return 0 on
+    success, -1 if the guard will always fail
+
+Microbenchmark on ``python3.6 -m timeit -s 'def f(): pass' 'f()'`` (best
+of 3 runs):
+
+* Original Python: 79 ns
+* Patched Python: 79 ns
+
+According to this microbenchmark, the changes has no overhead on calling
+a Python function without specialization.
+
+
+Behaviour
+=========
+
+When a function code is replaced (``func.__code__ = new_code``), all
+specialized functions are removed.
+
+When a function is serialized (by ``marshal`` or ``pickle`` for
+example), specialized functions and guards are ignored (not serialized).
+
+
+Discussion
+==========
+
+Thread on the python-ideas mailing list: `RFC: PEP: Specialized
+functions with guards
+<https://mail.python.org/pipermail/python-ideas/2016-January/037703.html>`_.
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.

-- 
Repository URL: https://hg.python.org/peps


[Python-checkins] peps: Add PEP 509 (dict.__version__) and 510 (func.specialize)

[Python-checkins] peps: Add PEP 509 (dict.version) and 510 (func.specialize)