Hi,
Here is the second PEP, part of a series of 3 PEPs to add an API to
implement a static Python optimizer specializing functions with
guards.
HTML version:
https://faster-cpython.readthedocs.org/pep_specialize.html
PEP: xxx
Title: Specialized functions with guards
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6
Abstract
========
Add an API to attach specialized functions with guards to functions, to
support static optimizers respecting the Python semantics.
Rationale
=========
Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implementing optimizations respecting the Python
semantics requires detecting when "something changes"; we will call these
checks "guards".
This PEP proposes to add a ``specialize()`` method to functions to add a
specialized function with guards. When the function is called, the
specialized function is used if nothing changed; otherwise the original
bytecode is used.
Writing an optimizer is out of the scope of this PEP.
Example
=======
Using bytecode
--------------
Replace ``chr(65)`` with ``"A"``::

    import myoptimizer

    def func():
        return chr(65)

    def fast_func():
        return "A"

    func.specialize(fast_func.__code__, [myoptimizer.GuardBuiltins("chr")])
    del fast_func

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))
    print()

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))
Output::

    func(): A
    #specialized: 1

    func(): mock
    #specialized: 0
The hypothetical ``myoptimizer.GuardBuiltins("chr")`` is a guard on the
builtin ``chr()`` function and the ``chr`` name in the global namespace.
The guard fails if the builtin function is replaced or if a ``chr`` name
is defined in the global namespace.
The first call returns directly the string ``"A"``. The second call
removes the specialized function because the builtin ``chr()`` function
was replaced, and executes the original bytecode.
On a microbenchmark, calling the specialized function takes 88 ns,
whereas the original bytecode takes 145 ns (+57 ns): 1.6 times as fast.
Using builtin function
----------------------
Replace a slow Python function calling ``chr(obj)`` with a direct call
to the builtin ``chr()`` function::

    import myoptimizer

    def func(arg):
        return chr(arg)

    func.specialize(chr, [myoptimizer.GuardBuiltins("chr")])

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))
    print()

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))
Output::

    func(65): A
    #specialized: 1

    func(65): mock
    #specialized: 0
The first call calls the builtin ``chr()`` function directly (without
creating a Python frame). The second call removes the specialized
function because the builtin ``chr()`` function was replaced, and
executes the original bytecode.
On a microbenchmark, calling the specialized function takes 95 ns,
whereas the original bytecode takes 155 ns (+60 ns): 1.6 times as fast.
Calling ``chr(65)`` directly takes 76 ns.
Python Function Call
====================
Pseudo-code to call a Python function having specialized functions with
guards::

    def call_func(func, *args, **kwargs):
        # by default, call the regular bytecode
        code = func.__code__.co_code

        specialized = func.get_specialized()
        nspecialized = len(specialized)

        index = 0
        while index < nspecialized:
            guard = specialized[index].guard
            # pass arguments, some guards need them
            check = guard(args, kwargs)
            if check == 1:
                # guard succeeded: we can use the specialized function
                code = specialized[index].code
                break
            elif check == -1:
                # guard will always fail: remove the specialized function
                del specialized[index]
                nspecialized -= 1
            elif check == 0:
                # guard failed temporarily: try the next specialization
                index += 1

        # code can be a code object or any callable object
        execute_code(code, args, kwargs)
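The dispatch above can be exercised with a pure-Python simulation (a sketch only: ``SpecializedFunction`` and the callable-guard protocol are invented here for illustration; the real API would operate on code objects inside the eval loop):

```python
class SpecializedFunction:
    """Pure-Python simulation of the specialized-call dispatch."""

    def __init__(self, func):
        self.func = func          # original function (the "bytecode")
        self.specialized = []     # list of (replacement, guards) pairs

    def specialize(self, replacement, guards):
        self.specialized.append((replacement, guards))

    def __call__(self, *args, **kwargs):
        index = 0
        while index < len(self.specialized):
            replacement, guards = self.specialized[index]
            checks = [guard(args, kwargs) for guard in guards]
            if any(check == -1 for check in checks):
                # a guard will always fail: drop the specialization
                del self.specialized[index]
            elif all(check == 1 for check in checks):
                # all guards succeeded: call the specialized version
                return replacement(*args, **kwargs)
            else:
                # a guard failed temporarily: keep it, try the next one
                index += 1
        return self.func(*args, **kwargs)
```

Here guards are plain callables returning 1, 0 or -1, mirroring the ``check()`` return values described below.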
Changes
=======
* Add two new methods to functions:

  - ``specialize(code, guards: list)``: add a specialized
    function with guards. ``code`` is a code object (ex:
    ``func2.__code__``) or any callable object (ex: ``len``).
    The specialization can be ignored if a guard already fails.
  - ``get_specialized()``: get the list of specialized functions with
    guards.

* Add a base ``Guard`` type which can be used as a parent type to
  implement guards. It requires implementing a ``check()`` function,
  with an optional ``first_check()`` function. API:

  * ``int check(PyObject *guard, PyObject **stack)``: return 1 on
    success, 0 if the guard failed temporarily, -1 if the guard will
    always fail
  * ``int first_check(PyObject *guard, PyObject *func)``: return 0 on
    success, -1 if the guard will always fail
Microbenchmark on ``python3.6 -m timeit -s 'def f(): pass' 'f()'`` (best
of 3 runs):
* Original Python: 79 ns
* Patched Python: 79 ns
According to this microbenchmark, the changes have no overhead on
calling a Python function without specialization.
Behaviour
=========
When a function code is replaced (``func.__code__ = new_code``), all
specialized functions are removed.
When a function is serialized (by ``marshal`` or ``pickle`` for
example), specialized functions and guards are ignored (not serialized).
Copyright
=========
This document has been placed in the public domain.
--
Victor
Hi,
Here is a first PEP, part of a series of 3 PEPs to add an API to
implement a static Python optimizer specializing functions with
guards.
HTML version:
https://faster-cpython.readthedocs.org/pep_dict_version.html#pep-dict-versi…
PEP: xxx
Title: Add dict.__version__
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6
Abstract
========
Add a new read-only ``__version__`` property to ``dict`` and
``collections.UserDict`` types, incremented at each change.
Rationale
=========
In Python, the builtin ``dict`` type is used by many instructions. For
example, the ``LOAD_GLOBAL`` instruction searches for a variable in the
global namespace, or in the builtins namespace (two dict lookups).
Python uses ``dict`` for the builtins namespace, globals namespace, type
namespaces, instance namespaces, etc. The local namespace (namespace of
a function) is usually optimized to an array, but it can be a dict too.
Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implementing optimizations respecting the Python
semantics requires detecting when "something changes"; we will call these
checks "guards".
The speedup of optimizations depends on the speed of guard checks. This
PEP proposes to add a version to dictionaries to implement efficient
guards on namespaces.
Example of optimization: replace loading a global variable with a
constant. This optimization requires a guard on the global variable to
check if it was modified. If the variable is modified, the variable must
be loaded at runtime, instead of using the constant.
Guard example
=============
Pseudo-code of an efficient guard to check if a dictionary key was
modified (created, updated or deleted)::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.value = dict.get(key, UNSET)
            self.version = dict.__version__

        def check(self):
            """Return True if the dictionary value did not change."""
            version = self.dict.__version__
            if version == self.version:
                # Fast-path: avoid the dictionary lookup
                return True

            value = self.dict.get(self.key, UNSET)
            if value == self.value:
                # another key was modified:
                # cache the new dictionary version
                self.version = version
                return True

            return False
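Since ``dict.__version__`` does not exist yet, the guard semantics can be tried out against a small pure-Python stand-in that counts mutations (a sketch for experimentation only; ``VersionedDict`` is invented here and is not the proposed C implementation):

```python
class VersionedDict(dict):
    """Pure-Python stand-in mimicking the proposed dict.__version__."""

    def __init__(self, *args, **kwargs):
        self._version = 0
        super().__init__()
        # count initial insertions, as the PEP specifies
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    @property
    def __version__(self):
        return self._version

    def __setitem__(self, key, value):
        # increment only if the key is new or bound to a different object
        if key not in self or self[key] is not value:
            self._version += 1
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._version += 1
```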
Changes
=======
Add a read-only ``__version__`` property to the builtin ``dict`` type and
to the ``collections.UserDict`` type. New empty dictionaries are initialized
to version ``0``. The version is incremented at each change:

* ``clear()`` if the dict was non-empty
* ``pop(key)`` if the key exists
* ``popitem()`` if the dict is non-empty
* ``setdefault(key, value)`` if the ``key`` does not exist
* ``__delitem__(key)`` if the key exists
* ``__setitem__(key, value)`` if the ``key`` doesn't exist or if the value
  is different
* ``update(...)`` if new values are different than existing values (the
  version can be incremented multiple times)
Example::

    >>> d = {}
    >>> d.__version__
    0
    >>> d['key'] = 'value'
    >>> d.__version__
    1
    >>> d['key'] = 'new value'
    >>> d.__version__
    2
    >>> del d['key']
    >>> d.__version__
    3
If a dictionary is created with items, the version is also incremented
at each dictionary insertion. Example::

    >>> d = dict(x=7, y=33)
    >>> d.__version__
    2
The version is not incremented if an existing key is set to the same
value. Only the identity of the value is tested, not its content.
Example::

    >>> d = {}
    >>> value = object()
    >>> d['key'] = value
    >>> d.__version__
    1
    >>> d['key'] = value
    >>> d.__version__
    1
.. note::
   CPython uses some singletons, like integers in the range [-5; 256],
   the empty tuple, empty strings, Unicode strings of a single character
   in the range [U+0000; U+00FF], etc. When a key is set twice to the
   same singleton, the version is not modified.
The PEP is designed to implement guards on namespaces; in practice, only
the ``dict`` type can be used for namespaces. ``collections.UserDict``
is modified because it must mimic ``dict``. ``collections.Mapping`` is
unchanged.
Integer overflow
================
The implementation uses the C unsigned integer type ``size_t`` to store
the version. On 32-bit systems, the maximum version is ``2**32-1``
(more than ``4.2 * 10**9``, 4 billion). On 64-bit systems, the maximum
version is ``2**64-1`` (more than ``1.8 * 10**19``).
The C code uses ``version++``. The behaviour on integer overflow of the
version is undefined. The minimum guarantee is that the version always
changes when the dictionary is modified.
The check ``dict.__version__ == old_version`` can be true after an
integer overflow, so a guard can succeed even if the watched value
changed, which is wrong. The bug only occurs if the dict is modified at
least ``2**64`` times (on a 64-bit system) between two checks of the
guard.
Using a more complex type (ex: ``PyLongObject``) to avoid the overflow
would slow down operations on the ``dict`` type. Even if there is a
theoretical risk of missing a value change, the risk is considered too
low compared to the slowdown of using a more complex type.
Alternatives
============
Add a version to each dict entry
--------------------------------
A single version per dictionary requires keeping a strong reference to
the value, which can keep the value alive longer than expected. If we
also add a version per dictionary entry, the guard can rely on the entry
version and so avoid the strong reference to the value (only strong
references to the dictionary and the key are needed).

Changes: add a ``getversion(key)`` method to dictionaries which returns
``None`` if the key doesn't exist. When a key is created or modified,
the entry version is set to the dictionary version, which is incremented
at each change (create, modify, delete).
Pseudo-code of an efficient guard to check if a dict key was modified
using ``getversion()``::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.dict_version = dict.__version__
            self.entry_version = dict.getversion(key)

        def check(self):
            """Return True if the dictionary value did not change."""
            dict_version = self.dict.__version__
            if dict_version == self.dict_version:
                # Fast-path: avoid the dictionary lookup
                return True

            # lookup in the dictionary, but get the entry version,
            # not the value
            entry_version = self.dict.getversion(self.key)
            if entry_version == self.entry_version:
                # another key was modified:
                # cache the new dictionary version
                self.dict_version = dict_version
                return True

            return False
The main drawback of this option is the impact on the memory footprint.
It increases the size of each dictionary entry, so the overhead depends
on the number of buckets (dictionary entries, used or unused). For
example, it increases the size of each dictionary entry by 8 bytes on
64-bit systems if ``size_t`` is used.
In Python, the memory footprint matters and the trend is more to reduce
it. Examples:
* `PEP 393 -- Flexible String Representation
<https://www.python.org/dev/peps/pep-0393/>`_
* `PEP 412 -- Key-Sharing Dictionary
<https://www.python.org/dev/peps/pep-0412/>`_
Add a new dict subtype
----------------------
Add a new ``verdict`` type, subtype of ``dict``. When guards are needed,
use the ``verdict`` for namespaces (module namespace, type namespace,
instance namespace, etc.) instead of ``dict``.
Leave the ``dict`` type unchanged to not add any overhead (memory
footprint) when guards are not needed.
Technical issue: a lot of C code in the wild, including the CPython
core, expects the exact ``dict`` type. Issues:

* ``exec()`` requires a ``dict`` for globals and locals. A lot of code
  uses ``globals={}``. It is not possible to cast the ``dict`` to a
  ``dict`` subtype because the caller expects the ``globals`` parameter
  to be modified (``dict`` is mutable).
* Functions call ``PyDict_xxx()`` functions directly, instead of calling
  ``PyObject_xxx()`` functions when the object is a ``dict`` subtype.
* The ``PyDict_CheckExact()`` check fails on ``dict`` subtypes, whereas
  some functions require the exact ``dict`` type.
* ``Python/ceval.c`` does not completely support dict subtypes for
  namespaces.
The ``exec()`` issue is a blocker issue.
Other issues:
* The garbage collector has special code to "untrack" ``dict``
  instances. If a ``dict`` subtype is used for namespaces, the garbage
  collector may be unable to break some reference cycles.
* Some functions have a fast-path for ``dict`` which would not be taken
for ``dict`` subtypes, and so it would make Python a little bit
slower.
Usage of dict.__version__
=========================
astoptimizer of FAT Python
--------------------------
The astoptimizer of the FAT Python project implements many optimizations
which require guards on namespaces. Examples:
* Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
``builtins.__dict__['len']`` and ``globals()['len']`` are required
* Loop unrolling: to unroll the loop ``for i in range(...): ...``,
guards on ``builtins.__dict__['range']`` and ``globals()['range']``
are required
The `FAT Python
<http://faster-cpython.readthedocs.org/fat_python.html>`_ project is a
static optimizer for Python 3.6.
Pyjion
------
According to Brett Cannon, one of the two main developers of Pyjion, Pyjion
can also benefit from dictionary versions to implement optimizations.
Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET Core
runtime).
Unladen Swallow
---------------
Even if the dictionary version was not explicitly mentioned, optimizing
globals and builtins lookup was part of the Unladen Swallow plan: "Implement
one of the several proposed schemes for speeding lookups of globals and
builtins."
Source: `Unladen Swallow ProjectPlan
<https://code.google.com/p/unladen-swallow/wiki/ProjectPlan>`_.
Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler implemented
with LLVM. The project stopped in 2011: `Unladen Swallow Retrospective
<http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html>`_.
Prior Art
=========
Cached globals+builtins lookup
------------------------------
In 2006, Andrea Griffini proposed a patch implementing a `Cached
globals+builtins lookup optimization <https://bugs.python.org/issue1616125>`_.
The patch adds a private ``timestamp`` field to dict.
See the thread on python-dev: `About dictionary lookup caching
<https://mail.python.org/pipermail/python-dev/2006-December/070348.html>`_.
Globals / builtins cache
------------------------
In 2010, Antoine Pitrou proposed a `Globals / builtins cache
<http://bugs.python.org/issue10401>`_ which adds a private
``ma_version`` field to the ``dict`` type. The patch adds a "global and
builtin cache" to functions and frames, and changes ``LOAD_GLOBAL`` and
``STORE_GLOBAL`` instructions to use the cache.
PySizer
-------
`PySizer <http://pysizer.8325.org/>`_: a memory profiler for Python,
Google Summer of Code 2005 project by Nick Smallbone.
This project has a patch for CPython 2.4 which adds ``key_time`` and
``value_time`` fields to dictionary entries. It uses a global
process-wide counter for dictionaries, incremented each time that a
dictionary is modified. The times are used to decide when child objects
first appeared in their parent objects.
Copyright
=========
This document has been placed in the public domain.
--
Victor
Hi everyone,
What do you think about enabling a more friendly interface to chmod
information in Python? I believe that currently if I want to get chmod
information from a file, I need to do this:
my_path.stat().st_mode & 0o777
(I'm using `pathlib`.)
(If there's a nicer way than this, please let me know.)
This sucks. And then the result is a number, like 511, which you then
have to call `oct` on to get 0o777. I'm not even happy with getting the
octal number. For some of us who live and breathe Linux, seeing a number
like 0o440 might be crystal-clear, since your mind automatically translates
that to the permissions that user/group/others have, but I haven't reached
that level.
I would really like an object-oriented approach to chmod, like an object
which I can ask "Does group have execute permissions?" and say "Please add
read permissions to everyone" etc. Just because Linux speaks in code
doesn't mean that we need to.
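Something along these lines could be built on the `stat` module today (a sketch only; the `Permissions` class and its method names are invented here, not an existing or proposed API):

```python
import stat

class Permissions:
    """Friendlier, object-oriented view on a chmod-style mode."""

    _BITS = {
        ('user', 'read'): stat.S_IRUSR, ('user', 'write'): stat.S_IWUSR,
        ('user', 'execute'): stat.S_IXUSR,
        ('group', 'read'): stat.S_IRGRP, ('group', 'write'): stat.S_IWGRP,
        ('group', 'execute'): stat.S_IXGRP,
        ('others', 'read'): stat.S_IROTH, ('others', 'write'): stat.S_IWOTH,
        ('others', 'execute'): stat.S_IXOTH,
    }

    def __init__(self, mode):
        # keep only the permission bits of st_mode
        self.mode = stat.S_IMODE(mode)

    def can(self, who, what):
        """Answer "does group have execute permissions?" style questions."""
        return bool(self.mode & self._BITS[who, what])

    def allow(self, who, what):
        """Add a permission bit, e.g. allow('others', 'read')."""
        self.mode |= self._BITS[who, what]

    def __repr__(self):
        # strip the file-type character, e.g. 'rw-r-----'
        return stat.filemode(self.mode)[1:]
```

One could imagine `Path.stat().st_mode` being wrapped this way, e.g. `Permissions(my_path.stat().st_mode)`.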
And of course, I'd want that on the `pathlib` module so I could do it all
on the path object without referencing another module.
What do you think?
Ram.
What do you think about implementing functionality similar to the `find`
utility in Linux in the Pathlib module? I wanted this today, I had a script
to write to archive a bunch of files from a folder, and I decided to try
writing it in Python rather than in Bash. But I needed something stronger
than `Path.glob` in order to select the files. I wanted a regular
expression. (In this particular case, I wanted to get a list of all the
files excluding the `.git` folder and all files inside of it.)
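For what it's worth, a `find`-like helper can already be sketched on top of `Path.rglob` (the `find` name and signature here are invented for illustration, not a proposed pathlib API):

```python
import re
from pathlib import Path

def find(root, pattern, exclude=None):
    """Yield paths under root whose relative POSIX path matches `pattern`,
    skipping anything whose relative path matches `exclude`."""
    regex = re.compile(pattern)
    exclude_re = re.compile(exclude) if exclude else None
    for path in Path(root).rglob('*'):
        rel = path.relative_to(root).as_posix()
        if exclude_re is not None and exclude_re.search(rel):
            continue
        if regex.search(rel):
            yield path
```

For the archiving use case above, something like `find(folder, r'.*', exclude=r'(^|/)\.git(/|$)')` would skip `.git` and everything inside it.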
Thanks,
Ram.
On Jan 11, 2016, at 03:25 PM, anatoly techtonik wrote:
>On Wed, Jan 6, 2016 at 2:49 AM, Barry Warsaw <barry(a)python.org> wrote:
>
>> reStructuredText is clearly a better format
>
>Can you expand on that? I use markdown everywhere
reST is better than plain text. Markdown is not a PEP format option.
>> all recent PEP submissions have been in reST for a while now anyway.
>
>Is it possible to query exact numbers automatically?
Feel free to grep the PEPs hg repo.
>What is the tooling support for handling PEP 9 and PEP 12?
UTSL. Everything is in the PEPs hg repo.
Cheers,
-Barry
Hi,
I hope python-ideas is the right place to post this, I'm very new to
this and appreciate a pointer in the right direction if this is not it.
The requests project is getting multiple bug reports about a problem in
the stdlib http.client, so I thought I'd raise an issue about it here.
The bug reports concern people posting http requests with unicode
strings when they should be using utf-8 encoded strings.
Since RFC 2616 says latin-1 is the default encoding, http.client tries
that and fails with a UnicodeEncodeError.
My idea is NOT to change from latin-1 to something else, that would
break compliance with the spec, but instead catch that exception, and
try encoding with utf-8 instead. That would avoid breaking backward
compatibility, unless someone specifically relied on that exception,
which I think is very unlikely.
This is also how other languages http libraries seem to deal with this,
sending in unicode just works:
In cURL (works fine):
curl http://example.com -d "Celebrate 🎉"
In Ruby with http.rb (works fine):
require 'http'
r = HTTP.post("http://example.com", :body => "Celebrate 🎉")
In Node with request (works fine):
var request = require('request');
request.post({url: 'http://example.com', body: "Celebrate 🎉"}, function
(error, response, body) {
console.log(body)
})
But Python 3 with requests crashes instead:
import requests
r = requests.post("http://localhost:8000/tag", data="Celebrate 🎉")
...with the following stacktrace:
...
File "../lib/python3.4/http/client.py", line 1127, in _send_request
body = body.encode('iso-8859-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position
14-15: ordinal not in range(256)
----
So the rationale for this idea is:
* http.client doesn't work the way beginners expect for very basic
use cases (posting unicode strings)
* Libraries in other languages behave like beginners expect, which
magnifies the problem.
* Changing the default latin-1 encoding probably isn't possible, because
it would break the spec...
* But catching the exception and trying to encode in utf-8 instead
wouldn't break the spec and solves the problem.
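The proposed fallback boils down to something like this (a sketch of the idea, not the actual http.client patch; `encode_body` is an invented helper name):

```python
def encode_body(body):
    """Encode a str request body: latin-1 first (the RFC 2616 default),
    falling back to UTF-8 when the body is not representable in latin-1."""
    try:
        return body.encode('iso-8859-1')
    except UnicodeEncodeError:
        return body.encode('utf-8')
```

Bodies that encode fine today keep their bytes unchanged; only the case that currently raises would take the UTF-8 path.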
----
Here's a couple of issues where people expect things to work differently:
https://github.com/kennethreitz/requests/issues/1926
https://github.com/kennethreitz/requests/issues/2838
https://github.com/kennethreitz/requests/issues/1822
----
Does this make sense?
/Emil
Hi,
timedelta handling always felt cumbersome to me:
from datetime import timedelta
short_period = timedelta(seconds=10)
long_period = timedelta(hours=4, seconds=37)
Today, I came across this one https://github.com/lxc/lxd/pull/1471/files
and I found the creation of a 10 seconds timeout extremely intuitive.
Would this represent a valuable addition to Python?
from datetime import second, hour
short_period = 10*second
long_period = 4*hour + 37*second
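Worth noting: `timedelta` already supports integer multiplication, so these units can be defined today as plain constants (a sketch; `second`, `minute` and `hour` are not currently exported by `datetime`):

```python
from datetime import timedelta

# Hypothetical unit constants; datetime does not provide these today.
second = timedelta(seconds=1)
minute = timedelta(minutes=1)
hour = timedelta(hours=1)

short_period = 10 * second
long_period = 4 * hour + 37 * second
```

So the proposal is mostly about blessing these names in the stdlib rather than new arithmetic.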
Best,
Sven
I don't think this will be at all controversial. Brett suggested, and there
was no disagreement from the PEP editors, that plain text PEPs be deprecated.
reStructuredText is clearly a better format, and all recent PEP submissions
have been in reST for a while now anyway.
I am therefore withdrawing[*] PEP 9 and have made other appropriate changes to
make it clear that only PEP 12 format is acceptable going forward. The PEP
editors will not be converting the legacy PEPs to reST, nor will we currently
be renaming the relevant PEP source files to end with ".rst" since there's too
much tooling that would have to change to do so. However, if either task
really interests you, please get in touch with the PEP editors.
it-only-took-15-years-ly y'rs,
-Barry (on behalf of the PEP editors)
[*] Status: Withdrawn being about the only currently appropriate resolution
status for process PEPs.
[Adding python-ideas back -- I'm not sure why you dropped it but it looks
like an oversight, not intentional]
On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert <abarnert(a)yahoo.com> wrote:
> On Dec 27, 2015, at 09:04, Guido van Rossum <guido(a)python.org> wrote:
>
> > If we want some way to turn something that just defines __getitem__ and
> __len__ into a proper sequence, it should just be made to inherit from
> Sequence, which supplies the default __iter__ and __reversed__.
> (Registration is *not* good enough here.)
>
> So, if I understand correctly, you're hoping that we can first make the
> old-style sequence protocol unnecessary, except for backward compatibility,
> and then maybe change the docs to only mention it for backward
> compatibility, and only then deprecate it?
>
That sounds about right.
> I think it's worth doing those first two steps, but not actually
> deprecating it, at least while Python 2.7 is still around; otherwise, for
> dual-version code, something like Steven D'Aprano's "Squares" type would
> have to copy Indexable from the 3.x stdlib or get it from some third-party
> module like six or backports.collections.
>
Yes, that's fine. Deprecation sometimes just has to take a really long time.
> > If we really want a way to turn something that just supports __getitem__
> into an Iterable maybe we can provide an additional ABC for that purpose;
> let's call it a HalfSequence until we've come up with a better name. (We
> can't use Iterable for this because Iterable should not reference
> __getitem__.)
>
> #25988 (using Nick's name Indexable, and the details from that post).
>
Oh, interesting. Though I have misgivings about that name.
> > I also think it's fine to introduce Reversible as another ABC and
> carefully fit it into the existing hierarchy. It should be a one-trick pony
> and be another base class for Sequence; it should not have a default
> implementation. (But this has been beaten to death in other threads -- it's
> time to just file an issue with a patch.)
>
> #25987.
Thanks!
--
--Guido van Rossum (python.org/~guido)
Suppose I have a file called randomFile.py which reads like this:
class A:
    def __init__(self, foo):
        self.foo = foo
        self.bar = bar(foo)

class B(A):
    pass

class C(B):
    pass

def bar(foo):
    return foo + 1
Suppose in another file in the same directory, I have another python
program.
from randomFile import C
# some code
When C has to be imported, B also has to be imported because it is the
parent. Therefore, A also has to be imported. This also results in the
function bar being imported. When from ... import ... is called, does
Python follow all the references and import everything that is needed, or
does it just import the whole namespace (making wildcard imports acceptable
:O)?
--
-Surya Subbarao