[Python-ideas] RFC: PEP: Specialized functions with guards

Victor Stinner victor.stinner at gmail.com
Fri Jan 8 16:31:40 EST 2016


Hi,

Here is the second PEP, part of a serie of 3 PEP to add an API to
implement a static Python optimizer specializing functions with
guards.

HTML version:
https://faster-cpython.readthedocs.org/pep_specialize.html

PEP: xxx
Title: Specialized functions with guards
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6


Abstract
========

Add an API to add specialized functions with guards to functions, to
support static optimizers respecting the Python semantic.


Rationale
=========

Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implement optimizations respecting the Python
semantic requires to detect when "something changes", we will call these
checks "guards".

This PEP proposes to add a ``specialize()`` method to functions to add a
specialized functions with guards. When the function is called, the
specialized function is used if nothing changed, otherwise use the
original bytecode.

Writing an optimizer is out of the scope of this PEP.


Example
=======

Using bytecode
--------------

Replace ``chr(65)`` with ``"A"``::

    import myoptimizer

    def func():
        return chr(65)

    def fast_func():
        return "A"

    func.specialize(fast_func.__code__, [myoptimizer.GuardBuiltins("chr")])
    del fast_func

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))
    print()

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))

Output::

    func(): A
    #specialized: 1

    func(): mock
    #specialized: 0

The hypothetical ``myoptimizer.GuardBuiltins("len")`` is a guard on the
builtin ``len()`` function and the ``len`` name in the global namespace.
The guard fails if the builtin function is replaced or if a ``len`` name
is defined in the global namespace.

The first call returns directly the string ``"A"``. The second call
removes the specialized function because the builtin ``chr()`` function
was replaced, and executes the original bytecode

On a microbenchmark, calling the specialized function takes 88 ns,
whereas the original bytecode takes 145 ns (+57 ns): 1.6 times as fast.


Using builtin function
----------------------

Replace a slow Python function calling ``chr(obj)`` with a direct call
to the builtin ``chr()`` function::

    import myoptimizer

    def func(arg):
        return chr(arg)

    func.specialize(chr, [myoptimizer.GuardBuiltins("chr")])

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))
    print()

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))

Output::

    func(): A
    #specialized: 1

    func(): mock
    #specialized: 0

The first call returns directly the builtin ``chr()`` function (without
creating a Python frame). The second call removes the specialized
function because the builtin ``chr()`` function was replaced, and
executes the original bytecode.

On a microbenchmark, calling the specialized function takes 95 ns,
whereas the original bytecode takes 155 ns (+60 ns): 1.6 times as fast.
Calling directly ``chr(65)`` takes 76 ns.


Python Function Call
====================

Pseudo-code to call a Python function having specialized functions with
guards::

    def call_func(func, *args, **kwargs):
        # by default, call the regular bytecode
        code = func.__code__.co_code
        specialized = func.get_specialized()
        nspecialized = len(specialized)

        index = 0
        while index < nspecialized:
            guard = specialized[index].guard
            # pass arguments, some guards need them
            check = guard(args, kwargs)
            if check == 1:
                # guard succeeded: we can use the specialized function
                code = specialized[index].code
                break
            elif check == -1:
                # guard will always fail: remove the specialized function
                del specialized[index]
            elif check == 0:
                # guard failed temporarely
                index += 1

        # code can be a code object or any callable object
        execute_code(code, args, kwargs)


Changes
=======

* Add two new methods to functions:

  - ``specialize(code, guards: list)``: add specialized
    function with guard. `code` is a code object (ex:
    ``func2.__code__``) or any callable object (ex: ``len``).
    The specialization can be ignored if a guard already fails.
  - ``get_specialized()``: get the list of specialized functions with
    guards

* Base ``Guard`` type which can be used as parent type to implement
  guards. It requires to implement a ``check()`` function, with an
  optional ``first_check()`` function. API:

  * ``int check(PyObject *guard, PyObject **stack)``: return 1 on
    success, 0 if the guard failed temporarely, -1 if the guard will
    always fail
  * ``int first_check(PyObject *guard, PyObject *func)``: return 0 on
    success, -1 if the guard will always fail

Microbenchmark on ``python3.6 -m timeit -s 'def f(): pass' 'f()'`` (best
of 3 runs):

* Original Python: 79 ns
* Patched Python: 79 ns

According to this microbenchmark, the changes has no overhead on calling
a Python function without specialization.


Behaviour
=========

When a function code is replaced (``func.__code__ = new_code``), all
specialized functions are removed.

When a function is serialized (by ``marshal`` or ``pickle`` for
example), specialized functions and guards are ignored (not serialized).


Copyright
=========

This document has been placed in the public domain.

--
Victor


More information about the Python-ideas mailing list