[Python-ideas] PEP 550 v2

Barry barry at barrys-emacs.org
Sun Aug 20 17:01:06 EDT 2017


I'm not clear why there is a new_context_key, which seems not to be a key.
It seems that the object is a container for a single value.

Key.set( value ) does not feel right.

Container.set( value ) is fine.

Barry


> On 16 Aug 2017, at 00:55, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> 
> Hi,
> 
> Here's the PEP 550 version 2.  Thanks to a very active and insightful
> discussion here on Python-ideas, we've discovered a number of
> problems with the first version of the PEP.  This version is a complete
> rewrite (only Abstract, Rationale, and Goals sections were not updated).
> 
> The updated PEP is live on python.org:
> https://www.python.org/dev/peps/pep-0550/
> 
> There is no reference implementation at this point, but I'm confident
> that this version of the spec will have the same extremely low
> runtime overhead as the first version.  Thanks to the new ContextItem
> design, accessing values in the context is even faster now.
> 
> Thank you!
> 
> 
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov <yury at magic.io>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017, 15-Aug-2017
> 
> 
> Abstract
> ========
> 
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes.
> 
> A few examples of where reliable state storage is required:
> 
> * Context managers like decimal contexts, ``numpy.errstate``,
>  and ``warnings.catch_warnings``;
> 
> * Storing request-related data such as security tokens and request
>  data in web applications, implementing i18n;
> 
> * Profiling, tracing, and logging in complex and large code bases.
> 
> The usual solution for storing state is Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not provide state isolation
> for generators or asynchronous code, because such code executes
> concurrently in a single thread.
> 
> 
> Rationale
> =========
> 
> Traditionally, Thread-local Storage (TLS) is used for storing
> state.  However, the major flaw of TLS is that it works only
> for multi-threaded code: it is not possible to reliably contain
> state within a generator or a coroutine.  For example, consider
> the following generator::
> 
>    def calculate(precision, ...):
>        with decimal.localcontext() as ctx:
>            # Set the precision for decimal calculations
>            # inside this block
>            ctx.prec = precision
> 
>            yield calculate_something()
>            yield calculate_something_else()
> 
> The decimal context uses TLS to store its state, and because TLS is
> not aware of generators, the state can leak.  If a user iterates over
> two ``calculate()`` generators created with different precisions in
> lockstep using the ``zip()`` built-in, the above code will not work
> correctly.
> For example::
> 
>    g1 = calculate(precision=100)
>    g2 = calculate(precision=50)
> 
>    items = list(zip(g1, g2))
> 
>    # items[0] will be a tuple of:
>    #   first value from g1 calculated with 100 precision,
>    #   first value from g2 calculated with 50 precision.
>    #
>    # items[1] will be a tuple of:
>    #   second value from g1 calculated with 50 precision (!!!),
>    #   second value from g2 calculated with 50 precision.
> 
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
> 
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
> 
>    async def handle_http_request(request):
>        context.current_http_request = request
> 
>        await ...
>        # Invoke your framework code, render templates,
>        # make DB queries, etc, and use the global
>        # 'current_http_request' in that code.
> 
>        # This isn't currently possible to do reliably
>        # in asyncio out of the box.
> 
> These examples are just a few out of many, where a reliable way to
> store context data is absolutely needed.
> 
> The inability to use TLS for asynchronous code has led to a
> proliferation of ad-hoc solutions, which are limited in scope and
> do not support all required use cases.
> 
> The status quo is that any library, including the standard
> library, that uses TLS will likely not work as expected in
> asynchronous code or with generators (see [3]_ for an example issue).
> 
> Some languages that have coroutines or generators recommend
> manually passing a ``context`` object to every function; see [1]_
> describing the pattern for Go.  This approach, however, has limited
> use for Python, where we have a huge ecosystem that was built to work
> with a TLS-like context.  Moreover, passing the context explicitly
> does not work at all for libraries like ``decimal`` or ``numpy``,
> which use operator overloading.
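> 
> For instance, in the following snippet there is no way to pass a
> context object to the division operator; ``Decimal.__truediv__`` has
> to look up the current context implicitly (this is a plain
> illustration using today's ``decimal`` module)::
> 
>    from decimal import Decimal, localcontext
> 
>    with localcontext() as ctx:
>        ctx.prec = 5
>        # The '/' operator has no parameter through which a context
>        # could be passed explicitly; it is found implicitly.
>        print(Decimal(1) / Decimal(3))   # 0.33333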
> 
> The .NET runtime, which supports async/await, has a generic
> solution to this problem, called ``ExecutionContext`` (see [2]_).
> On the surface, working with it is very similar to working with TLS,
> but the former explicitly supports asynchronous code.
> 
> 
> Goals
> =====
> 
> The goal of this PEP is to provide a more reliable alternative to
> ``threading.local()``.  It should be explicitly designed to work with
> the Python execution model, equally supporting threads, generators, and
> coroutines.
> 
> An acceptable solution for Python should meet the following
> requirements:
> 
> * Transparent support for code executing in threads, coroutines,
>  and generators, with an easy-to-use API.
> 
> * Negligible impact on the performance of the existing code or the
>  code that will be using the new mechanism.
> 
> * Fast C API for packages like ``decimal`` and ``numpy``.
> 
> Explicit is still better than implicit, hence the new APIs should only
> be used when there is no acceptable way of passing the state
> explicitly.
> 
> 
> Specification
> =============
> 
> Execution Context is a mechanism for storing and accessing data specific
> to a logical thread of execution.  We consider OS threads,
> generators, and chains of coroutines (such as ``asyncio.Task``)
> to be variants of a logical thread.
> 
> In this specification, we will use the following terminology:
> 
> * **Local Context**, or LC, is a key/value mapping that stores the
>  context of a logical thread.
> 
> * **Execution Context**, or EC, is an OS-thread-specific dynamic
>  stack of Local Contexts.
> 
> * **Context Item**, or CI, is an object used to set and get values
>  from the Execution Context.
> 
> Please note that throughout the specification we use simple
> pseudo-code to illustrate how the EC machinery works.  The actual
> algorithms and data structures that we will use to implement the PEP
> are discussed in the `Implementation Strategy`_ section.
> 
> 
> Context Item Object
> -------------------
> 
> The ``sys.new_context_item(description)`` function creates a
> new ``ContextItem`` object.  The ``description`` parameter is a
> ``str``, explaining the nature of the context item for introspection
> and debugging purposes.
> 
> ``ContextItem`` objects have the following methods and attributes:
> 
> * ``.description``: read-only description;
> 
> * ``.set(o)`` method: set the value to ``o`` for the context item
>  in the execution context.
> 
> * ``.get()`` method: return the current EC value for the context item.
>  Context items are initialized with ``None`` when created, so
>  this method call never fails.
> 
> Below is an example of how context items can be used::
> 
>    my_context = sys.new_context_item(description='mylib.context')
>    my_context.set('spam')
> 
>    # Later, to access the value of my_context:
>    print(my_context.get())
> 
> 
> Thread State and Multi-threaded code
> ------------------------------------
> 
> Execution Context is implemented on top of Thread-local Storage.
> For every thread there is a separate stack of Local Contexts --
> mappings of ``ContextItem`` objects to their values in the LC.
> New threads always start with an empty EC.
> 
> For CPython::
> 
>    PyThreadState:
>        execution_context: ExecutionContext([
>            LocalContext({ci1: val1, ci2: val2, ...}),
>            ...
>        ])
> 
> The ``ContextItem.get()`` and ``.set()`` methods are defined as
> follows (in pseudo-code)::
> 
>    class ContextItem:
> 
>        def get(self):
>            tstate = PyThreadState_Get()
> 
>            for local_context in reversed(tstate.execution_context):
>                if self in local_context:
>                    return local_context[self]
> 
>        def set(self, value):
>            tstate = PyThreadState_Get()
> 
>            if not tstate.execution_context:
>                tstate.execution_context = [LocalContext()]
> 
>            tstate.execution_context[-1][self] = value
> 
> With the semantics defined so far, the Execution Context can already
> be used as an alternative to ``threading.local()``::
> 
>    def print_foo():
>        print(ci.get() or 'nothing')
> 
>    ci = sys.new_context_item(description='test')
>    ci.set('foo')
> 
>    # Will print "foo":
>    print_foo()
> 
>    # Will print "nothing":
>    threading.Thread(target=print_foo).start()
> 
> 
> Manual Context Management
> -------------------------
> 
> Execution Context is generally managed by the Python interpreter,
> but sometimes it is desirable for the user to take control
> over it.  A few examples of when this is needed:
> 
> * running a computation in ``concurrent.futures.ThreadPoolExecutor``
>  with the current EC;
> 
> * reimplementing generators with iterators (more on that later);
> 
> * managing contexts in asynchronous frameworks (implementing proper
>  EC support in ``asyncio.Task`` and ``asyncio.loop.call_soon``).
> 
> For these purposes we add a set of new APIs (they will be used in
> later sections of this specification):
> 
> * ``sys.new_local_context()``: create an empty ``LocalContext``
>  object.
> 
> * ``sys.new_execution_context()``: create an empty
>  ``ExecutionContext`` object.
> 
> * Both ``LocalContext`` and ``ExecutionContext`` objects are opaque
>  to Python code, and there are no APIs to modify them.
> 
> * ``sys.get_execution_context()`` function.  The function returns a
>  copy of the current EC: an ``ExecutionContext`` instance.
> 
>  The runtime complexity of the actual implementation of this function
>  can be O(1), but for the purposes of this section it is equivalent
>  to::
> 
>    def get_execution_context():
>        tstate = PyThreadState_Get()
>        return copy(tstate.execution_context)
> 
> * ``sys.run_with_execution_context(ec: ExecutionContext, func, *args,
>  **kwargs)`` runs ``func(*args, **kwargs)`` in the provided execution
>  context::
> 
>    def run_with_execution_context(ec, func, *args, **kwargs):
>        tstate = PyThreadState_Get()
> 
>        old_ec = tstate.execution_context
> 
>        tstate.execution_context = ExecutionContext(
>            ec.local_contexts + [LocalContext()]
>        )
> 
>        try:
>            return func(*args, **kwargs)
>        finally:
>            tstate.execution_context = old_ec
> 
>  Any changes that ``func`` makes to its Local Context will be discarded.
>  This allows one ``ExecutionContext`` object to be reused for multiple
>  invocations of different functions, without them being able to
>  affect each other's environment::
> 
>      ci = sys.new_context_item('example')
>      ci.set('spam')
> 
>      def func():
>          print(ci.get())
>          ci.set('ham')
> 
>      ec = sys.get_execution_context()
> 
>      sys.run_with_execution_context(ec, func)
>      sys.run_with_execution_context(ec, func)
> 
>      # Will print:
>      #   spam
>      #   spam
> 
> * ``sys.run_with_local_context(lc: LocalContext, func, *args,
>  **kwargs)`` runs ``func(*args, **kwargs)`` in the current execution
>  context using the specified local context.
> 
>  Any changes that ``func`` makes to the local context will be
>  persisted in ``lc``.  This behaviour is different from the
>  ``run_with_execution_context()`` function, which always creates
>  a new throw-away local context.
> 
>  In pseudo-code::
> 
>    def run_with_local_context(lc, func, *args, **kwargs):
>        tstate = PyThreadState_Get()
> 
>        old_ec = tstate.execution_context
> 
>        tstate.execution_context = ExecutionContext(
>            old_ec.local_contexts + [lc]
>        )
> 
>        try:
>            return func(*args, **kwargs)
>        finally:
>            tstate.execution_context = old_ec
> 
>  Using the previous example::
> 
>      ci = sys.new_context_item('example')
>      ci.set('spam')
> 
>      def func():
>          print(ci.get())
>          ci.set('ham')
> 
>      ec = sys.get_execution_context()
>      lc = sys.new_local_context()
> 
>      sys.run_with_local_context(lc, func)
>      sys.run_with_local_context(lc, func)
> 
>      # Will print:
>      #   spam
>      #   ham
> 
> As an example, let's make a subclass of
> ``concurrent.futures.ThreadPoolExecutor`` that preserves the execution
> context for scheduled functions::
> 
>    class Executor(concurrent.futures.ThreadPoolExecutor):
> 
>        def submit(self, fn, *args, **kwargs):
>            context = sys.get_execution_context()
> 
>            fn = functools.partial(
>                sys.run_with_execution_context, context,
>                fn, *args, **kwargs)
> 
>            return super().submit(fn)
> 
> 
> EC Semantics for Coroutines
> ---------------------------
> 
> Python :pep:`492` coroutines are used to implement cooperative
> multitasking.  For a Python end-user they are similar to threads,
> especially when it comes to sharing resources or modifying
> the global state.
> 
> An event loop is needed to schedule coroutines.  Coroutines that
> are explicitly scheduled by the user are usually called Tasks.
> When a coroutine is scheduled, it can run other coroutines using
> an ``await`` expression.  In the async/await world, awaiting a coroutine
> is equivalent to a regular function call in synchronous code.  Thus,
> Tasks are similar to threads.
> 
> By drawing a parallel between regular multithreaded code and
> async/await, it becomes apparent that any modification of the
> execution context within one Task should be visible to all coroutines
> scheduled within it.  Any execution context modifications, however,
> must not be visible to other Tasks executing within the same OS
> thread.
> 
> 
> Coroutine Object Modifications
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> To achieve this, a small set of modifications to the coroutine object
> is needed:
> 
> * New ``cr_local_context`` attribute.  This attribute is readable
>  and writable for Python code.
> 
> * When a coroutine object is instantiated, its ``cr_local_context``
>  is initialized with an empty Local Context.
> 
> * Coroutine's ``.send()`` and ``.throw()`` methods are modified as
>  follows (in pseudo-C)::
> 
>    if coro.cr_local_context is not None:
>        tstate = PyThreadState_Get()
> 
>        tstate.execution_context.push(coro.cr_local_context)
> 
>        try:
>            # Perform the actual `Coroutine.send()` or
>            # `Coroutine.throw()` call.
>            return coro.send(...)
>        finally:
>            coro.cr_local_context = tstate.execution_context.pop()
>    else:
>        # Perform the actual `Coroutine.send()` or
>        # `Coroutine.throw()` call.
>        return coro.send(...)
> 
> * When the Python interpreter sees an ``await`` instruction, it inspects
>  the ``cr_local_context`` attribute of the coroutine that is about
>  to be awaited.  For ``await coro``:
> 
>  * If ``coro.cr_local_context`` is an empty ``LocalContext`` object
>    that ``coro`` was created with, the interpreter will set
>    ``coro.cr_local_context`` to ``None``.
> 
>  * If ``coro.cr_local_context`` was modified by Python code, the
>    interpreter will leave it as is.
> 
>  This makes any changes to the execution context made by nested coroutine
>  calls within a Task visible throughout the Task::
> 
>      ci = sys.new_context_item('example')
> 
>      async def nested():
>          ci.set('nested')
> 
>      async def main():
>          ci.set('main')
>          print('before:', ci.get())
>          await nested()
>          print('after:', ci.get())
> 
>      # Will print:
>      #   before: main
>      #   after: nested
> 
>  Essentially, coroutines work with Execution Context items similarly
>  to threads, and the ``await`` expression acts like a function call.
> 
>  This mechanism also works for ``yield from`` in generators decorated
>  with ``@types.coroutine`` or ``@asyncio.coroutine``, which are
>  called "generator-based coroutines" according to :pep:`492`,
>  and should be fully compatible with native async/await coroutines.
> 
> 
> Tasks
> ^^^^^
> 
> In asynchronous frameworks like asyncio, coroutines are run by
> an event loop, and need to be explicitly scheduled (in asyncio,
> coroutines are run by ``asyncio.Task``).
> 
> With the currently defined semantics, the interpreter makes
> coroutines linked by an ``await`` expression share the same
> Local Context.
> 
> The interpreter, however, is not aware of the Task concept, and
> cannot ensure that new Tasks started in coroutines
> use the correct EC::
> 
>    current_request = sys.new_context_item(description='request')
> 
>    async def child():
>        print('current request:', repr(current_request.get()))
> 
>    async def handle_request(request):
>        current_request.set(request)
>        event_loop.create_task(child())
> 
>    run(top_coro())
> 
>    # Will print:
>    #   current request: None
> 
> To enable correct Execution Context propagation into Tasks, the
> asynchronous framework needs to assist the interpreter:
> 
> * When ``create_task`` is called, it should capture the current
>  execution context with ``sys.get_execution_context()`` and save it
>  on the Task object.
> 
> * When the Task object runs its coroutine object, it should execute
>  ``.send()`` and ``.throw()`` methods within the captured
>  execution context, using the ``sys.run_with_execution_context()``
>  function.
> 
> With help from the asynchronous framework, the above snippet will
> run correctly, and the ``child()`` coroutine will be able to access
> the current request object through the ``current_request``
> Context Item.
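> 
> As an illustration, here is a minimal sketch of such a Task-like
> wrapper (this is not asyncio's actual implementation; the event loop
> wiring for resuming the coroutine is omitted)::
> 
>    class SimpleTask:
> 
>        def __init__(self, coro, loop):
>            self.coro = coro
>            self.loop = loop
>            # Capture the EC of the code that created the Task.
>            self.ec = sys.get_execution_context()
>            loop.call_soon(self._step)
> 
>        def _step(self, value=None):
>            try:
>                # Run one step of the wrapped coroutine inside the
>                # captured execution context.
>                sys.run_with_execution_context(
>                    self.ec, self.coro.send, value)
>            except StopIteration:
>                return
>            # A real Task would arrange for _step() to be called
>            # again when the awaited operation completes.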
> 
> 
> Event Loop Callbacks
> ^^^^^^^^^^^^^^^^^^^^
> 
> Similarly to Tasks, functions like asyncio's ``loop.call_soon()``
> should capture the current execution context with
> ``sys.get_execution_context()`` and execute callbacks
> within it with ``sys.run_with_execution_context()``.
> 
> This way the following code will work::
> 
>    current_request = sys.new_context_item(description='request')
> 
>    def log():
>        request = current_request.get()
>        print(request)
> 
>    async def request_handler(request):
>        current_request.set(request)
>        get_event_loop().call_soon(log)
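> 
> In pseudo-code, an EC-aware ``call_soon()`` is a thin wrapper around
> the loop's existing scheduling machinery (``call_soon_impl()`` is a
> placeholder name for that machinery)::
> 
>    def call_soon(self, callback, *args):
>        # Capture the EC at the time the callback is scheduled...
>        ec = sys.get_execution_context()
> 
>        # ...and run the callback inside that EC when the loop
>        # eventually invokes it.
>        self.call_soon_impl(
>            functools.partial(
>                sys.run_with_execution_context, ec, callback, *args))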
> 
> 
> Generators
> ----------
> 
> Generators in Python, while similar to coroutines, are used in a
> fundamentally different way.  They are producers of data, and
> they use the ``yield`` expression to suspend/resume their execution.
> 
> A crucial difference between ``await coro`` and ``yield value`` is
> that the former expression guarantees that the ``coro`` will be
> executed fully, while the latter produces ``value`` and
> suspends the generator until it gets iterated again.
> 
> Generators, similarly to coroutines, have a ``gi_local_context``
> attribute, which is set to an empty Local Context when created.
> 
> Contrary to coroutines, though, the ``yield from o`` expression in
> generators (that are not generator-based coroutines) is semantically
> equivalent to ``for v in o: yield v``; therefore the interpreter does
> not attempt to control their ``gi_local_context``.
> 
> 
> EC Semantics for Generators
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Every generator object has its own Local Context that stores
> only its own local modifications of the context.  When a generator
> is being iterated, its local context will be put in the EC stack
> of the current thread.  This means that the generator will be able
> to access items from the surrounding context::
> 
>    local = sys.new_context_item("local")
>    glob = sys.new_context_item("global")
> 
>    def generator():
>        local.set('inside gen:')
>        while True:
>            print(local.get(), glob.get())
>            yield
> 
>    g = generator()
> 
>    local.set('hello')
>    glob.set('spam')
>    next(g)
> 
>    local.set('world')
>    glob.set('ham')
>    next(g)
> 
>    # Will print:
>    #   inside gen: spam
>    #   inside gen: ham
> 
> Any changes to the EC in nested generators are invisible to the outer
> generator::
> 
>    local = sys.new_context_item("local")
> 
>    def inner_gen():
>        local.set('spam')
>        yield
> 
>    def outer_gen():
>        local.set('ham')
>        yield from inner_gen()
>        print(local.get())
> 
>    list(outer_gen())
> 
>    # Will print:
>    #   ham
> 
> 
> Running generators without LC
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Similarly to coroutines, generators with ``gi_local_context``
> set to ``None`` simply use the outer Local Context.
> 
> The ``@contextlib.contextmanager`` decorator uses this mechanism to
> allow its generator to affect the EC::
> 
>    item = sys.new_context_item('test')
> 
>    @contextmanager
>    def context(x):
>        old = item.get()
>        item.set(x)
>        try:
>            yield
>        finally:
>            item.set(old)
> 
>    with context('spam'):
> 
>        with context('ham'):
>            print(1, item.get())
> 
>        print(2, item.get())
> 
>    # Will print:
>    #   1 ham
>    #   2 spam
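> 
> In pseudo-code, the decorator only needs to reset that attribute on
> the generator it wraps (``_make_context_manager()`` stands in for the
> rest of ``contextlib``'s machinery)::
> 
>    def contextmanager(func):
>        @functools.wraps(func)
>        def helper(*args, **kwargs):
>            gen = func(*args, **kwargs)
>            # With gi_local_context set to None, item.set() calls
>            # made inside the generator affect the caller's EC.
>            gen.gi_local_context = None
>            return _make_context_manager(gen)
>        return helper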
> 
> 
> Implementing Generators with Iterators
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> The Execution Context API makes it possible to fully replicate the EC
> behaviour imposed on generators with a regular Python iterator class::
> 
>    class Gen:
> 
>        def __init__(self):
>            self.local_context = sys.new_local_context()
> 
>        def __iter__(self):
>            return self
> 
>        def __next__(self):
>            return sys.run_with_local_context(
>                self.local_context, self._next_impl)
> 
>        def _next_impl(self):
>            # Actual __next__ implementation.
>            ...
> 
> 
> Asynchronous Generators
> -----------------------
> 
> Asynchronous Generators (AG) interact with the Execution Context
> similarly to regular generators.
> 
> They have an ``ag_local_context`` attribute, which, similarly to
> regular generators, can be set to ``None`` to make them use the outer
> Local Context.  This is used by the new
> ``contextlib.asynccontextmanager`` decorator.
> 
> The EC support of ``await`` expression is implemented using the same
> approach as in coroutines, see the `Coroutine Object Modifications`_
> section.
> 
> 
> Greenlets
> ---------
> 
> Greenlet is an alternative implementation of cooperative
> scheduling for Python.  Although the greenlet package is not part of
> CPython, popular frameworks like gevent rely on it, and it is
> important that greenlet can be modified to support execution
> contexts.
> 
> In a nutshell, the greenlet design is very similar to the design of
> generators.  The main difference is that for generators, the stack
> is managed by the Python interpreter.  Greenlet works outside of the
> Python interpreter, and manually saves some ``PyThreadState``
> fields and pushes/pops the C-stack.  Thus the ``greenlet`` package
> can be easily updated to use the new low-level `C API`_ to enable
> full support of EC.
> 
> 
> New APIs
> ========
> 
> Python
> ------
> 
> The Python APIs are designed to completely hide the internal
> implementation details, while at the same time providing enough control
> over EC and LC to re-implement all of Python's built-in objects
> in pure Python.
> 
> 1. ``sys.new_context_item(description='...')``: create a
>   ``ContextItem`` object used to access/set values in EC.
> 
> 2. ``ContextItem``:
> 
>   * ``.description``: read-only attribute.
>   * ``.get()``: return the current value for the item.
>   * ``.set(o)``: set the current value in the EC for the item.
> 
> 3. ``sys.get_execution_context()``: return the current
>   ``ExecutionContext``.
> 
> 4. ``sys.new_execution_context()``: create a new empty
>   ``ExecutionContext``.
> 
> 5. ``sys.new_local_context()``: create a new empty ``LocalContext``.
> 
> 6. ``sys.run_with_execution_context(ec: ExecutionContext,
>   func, *args, **kwargs)``.
> 
> 7. ``sys.run_with_local_context(lc: LocalContext,
>   func, *args, **kwargs)``.
> 
> 
> C API
> -----
> 
> 1. ``PyContextItem * PyContext_NewItem(char *desc)``: create a
>   ``PyContextItem`` object.
> 
> 2. ``PyObject * PyContext_GetItem(PyContextItem *)``: get the
>   current value for the context item.
> 
> 3. ``int PyContext_SetItem(PyContextItem *, PyObject *)``: set
>   the current value for the context item.
> 
> 4. ``PyLocalContext * PyLocalContext_New()``: create a new empty
>   ``PyLocalContext``.
> 
> 5. ``PyExecutionContext * PyExecutionContext_New()``: create a new empty
>   ``PyExecutionContext``.
> 
> 6. ``PyExecutionContext * PyExecutionContext_Get()``: get the
>   EC for the active thread state.
> 
> 7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the
>   passed EC object as the current for the active thread state.
> 
> 8. ``int PyExecutionContext_SetWithLocalContext(PyExecutionContext *,
>   PyLocalContext *)``: allows implementing the
>   ``sys.run_with_local_context`` Python API.
> 
> 
> Implementation Strategy
> =======================
> 
> LocalContext is a Weak Key Mapping
> ----------------------------------
> 
> Using a weak key mapping for the ``LocalContext`` implementation
> enables the following properties with regard to garbage
> collection:
> 
> * ``ContextItem`` objects are strongly-referenced only from the
>  application code, not from any of the Execution Context
>  machinery or values they point to.  This means that there
>  are no reference cycles that could extend their lifespan
>  longer than necessary, or prevent their garbage collection.
> 
> * Values put in the Execution Context are guaranteed to be kept
>  alive while there is a ``ContextItem`` key referencing them in
>  the thread.
> 
> * If a ``ContextItem`` is garbage collected, all of its values will
>  be removed from all contexts, allowing them to be GCed if needed.
> 
> * If a thread has ended its execution, its thread state will be
>  cleaned up along with its ``ExecutionContext``, cleaning
>  up all values bound to all Context Items in the thread.
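> 
> The effect can be illustrated in pure Python with
> ``weakref.WeakKeyDictionary`` standing in for a Local Context (a
> simplified model, not the proposed implementation)::
> 
>    import weakref
> 
>    class Item:
>        pass
> 
>    lc = weakref.WeakKeyDictionary()
> 
>    key = Item()
>    lc[key] = 'some value'
>    print(len(lc))    # 1
> 
>    # Dropping the last strong reference to the key removes its
>    # entry from the mapping once the key is garbage collected:
>    del key
>    print(len(lc))    # 0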
> 
> 
> ContextItem.get() Cache
> -----------------------
> 
> We can add three new fields to ``PyThreadState`` and
> ``PyInterpreterState`` structs:
> 
> * ``uint64_t PyThreadState->unique_id``: a globally unique
>  thread state identifier (we can add a counter to
>  ``PyInterpreterState`` and increment it when a new thread state is
>  created.)
> 
> * ``uint64_t PyInterpreterState->context_item_deallocs``: every time
>  a ``ContextItem`` is GCed, all Execution Contexts in all threads
>  will lose track of it.  ``context_item_deallocs`` will simply
>  count all ``ContextItem`` deallocations.
> 
> * ``uint64_t PyThreadState->execution_context_ver``: every time
>  a new item is set, or an existing item is updated, or the stack
>  of execution contexts is changed in the thread, we increment this
>  counter.
> 
> The above fields allow implementing a fast cache path in
> ``ContextItem.get()``, in pseudo-code::
> 
>    class ContextItem:
> 
>        def get(self):
>            tstate = PyThreadState_Get()
> 
>            if (self.last_tstate_id == tstate.unique_id and
>                self.last_ver == tstate.execution_context_ver and
>                self.last_deallocs ==
>                    tstate.interp.context_item_deallocs):
>                return self.last_value
> 
>            value = None
>            for mapping in reversed(tstate.execution_context):
>                if self in mapping:
>                    value = mapping[self]
>                    break
> 
>            self.last_value = value
>            self.last_tstate_id = tstate.unique_id
>            self.last_ver = tstate.execution_context_ver
>            self.last_deallocs = tstate.interp.context_item_deallocs
> 
>            return value
> 
> This is similar to the trick that the C implementation of ``decimal``
> uses for caching the current decimal context; it will have the same
> performance characteristics, but will be available to all
> Execution Context users.
> 
> 
> Approach #1: Use a dict for LocalContext
> ----------------------------------------
> 
> The straightforward way of implementing the proposed EC
> mechanisms is to create a ``WeakKeyDict`` on top of the Python
> ``dict`` type.
> 
> To implement the ``ExecutionContext`` type we can use Python
> ``list`` (or a custom stack implementation with some
> pre-allocation optimizations).
> 
> This approach will have the following runtime complexity:
> 
> * O(M) for ``ContextItem.get()``, where ``M`` is the number of
>  Local Contexts in the stack.
> 
>  It is important to note that ``ContextItem.get()`` will implement
>  a cache making the operation O(1) for packages like ``decimal``
>  and ``numpy``.
> 
> * O(1) for ``ContextItem.set()``.
> 
> * O(N) for ``sys.get_execution_context()``, where ``N`` is the
>  total number of items in the current **execution** context.
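> 
> For illustration, a ``sys.get_execution_context()`` consistent with
> the earlier pseudo-code would have to copy every dict-based Local
> Context in the stack, which is what makes it O(N) (a sketch only)::
> 
>    def get_execution_context():
>        tstate = PyThreadState_Get()
> 
>        # Each dict-based Local Context must be copied so that later
>        # modifications in this thread do not leak into the captured
>        # EC; copying every item makes this O(N).
>        return ExecutionContext(
>            [copy(lc) for lc in tstate.execution_context])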
> 
> 
> Approach #2: Use HAMT for LocalContext
> --------------------------------------
> 
> Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT)
> to implement high performance immutable collections [5]_, [6]_.
> 
> Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N)
> performance for ``set()``, ``get()``, and ``merge()`` operations,
> which is essentially O(1) for relatively small mappings
> (read about HAMT performance in CPython in the
> `Appendix: HAMT Performance`_ section.)
> 
> In this approach we use the same design of the ``ExecutionContext``
> as in Approach #1, but with a HAMT-backed weak key Local Context
> implementation.  With that we will have the following runtime
> complexity:
> 
> * O(M * log\ :sub:`32`\ N) for ``ContextItem.get()``,
>  where ``M`` is the number of Local Contexts in the stack,
>  and ``N`` is the number of items in the EC.  The operation will
>  essentially be O(M), because execution contexts are normally not
>  expected to have more than a few dozen items.
> 
>  (``ContextItem.get()`` will have the same caching mechanism as in
>  Approach #1.)
> 
> * O(log\ :sub:`32`\ N) for ``ContextItem.set()`` where ``N`` is the
>  number of items in the current **local** context.  This will
>  essentially be an O(1) operation most of the time.
> 
> * O(log\ :sub:`32`\ N) for ``sys.get_execution_context()``, where
>  ``N`` is the total number of items in the current **execution**
>  context.
> 
> Essentially, using HAMT for Local Contexts instead of Python dicts
> brings down the complexity of ``sys.get_execution_context()``
> from O(N) to O(log\ :sub:`32`\ N), thanks to a more efficient
> merge algorithm.
> 
> 
> Approach #3: Use HAMT and Immutable Linked List
> -----------------------------------------------
> 
> We can make an alternative ``ExecutionContext`` design by using
> a linked list.  Each ``LocalContext`` in the ``ExecutionContext``
> object will be wrapped in a linked-list node.
> 
> ``LocalContext`` objects will use the HAMT-backed weak key
> implementation described in Approach #2.
> 
> Every modification to the current ``LocalContext`` will produce a
> new version of it, which will be wrapped in a **new linked list
> node**.  Essentially this means that ``ExecutionContext`` is an
> immutable forest of ``LocalContext`` objects, and can be safely
> copied by reference in ``sys.get_execution_context()`` (eliminating
> the expensive "merge" operation).
> 
> With this approach, ``sys.get_execution_context()`` will be an
> **O(1) operation**.
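> 
> A rough sketch of the resulting data layout (the names and the
> immutable-mapping API below are illustrative assumptions, not the
> actual implementation)::
> 
>    class Node:
>        __slots__ = ('local_context', 'prev')
> 
>        def __init__(self, local_context, prev=None):
>            self.local_context = local_context  # immutable HAMT-backed LC
>            self.prev = prev                    # the rest of the EC stack
> 
>    def set_item(top, item, value):
>        # Writing a value produces a new top node; the tail of the
>        # stack is shared and never mutated.
>        return Node(top.local_context.set(item, value), top.prev)
> 
>    def get_execution_context(top):
>        # The whole chain is immutable, so capturing the EC is just
>        # taking a reference to the top node -- an O(1) operation.
>        return top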
> 
> 
> Summary
> -------
> 
> We believe that approach #3 enables an efficient and complete
> Execution Context implementation, with excellent runtime performance.
> 
> `ContextItem.get() Cache`_ enables fast retrieval of context items
> for performance critical libraries like decimal and numpy.
> 
> Fast ``sys.get_execution_context()`` enables efficient management
> of execution contexts in asynchronous libraries like asyncio.
> 
> 
> Design Considerations
> =====================
> 
> Can we fix ``PyThreadState_GetDict()``?
> ---------------------------------------
> 
> ``PyThreadState_GetDict()`` is a TLS mechanism, and some of its existing
> users might depend on it being just that.  Changing its behaviour to follow
> the Execution Context semantics would break backwards compatibility.
> 
> 
> PEP 521
> -------
> 
> :pep:`521` proposes an alternative solution to the problem:
> enhance the Context Manager Protocol with two new methods: ``__suspend__``
> and ``__resume__``.  To make it compatible with async/await,
> the Asynchronous Context Manager Protocol will also need to be
> extended with ``__asuspend__`` and ``__aresume__``.
> 
> This makes it possible to implement context managers like the decimal
> context and ``numpy.errstate`` for generators and coroutines.
> 
> The following code::
> 
>    item = sys.new_context_item('x')
> 
>    class Context:
> 
>        def __enter__(self):
>            self.old_x = item.get()
>            item.set('something')
> 
>        def __exit__(self, *err):
>            item.set(self.old_x)
> 
> would become this::
> 
>    local = threading.local()
> 
>    class Context:
> 
>        def __enter__(self):
>            self.old_x = getattr(local, 'x', None)
>            local.x = 'something'
> 
>        def __suspend__(self):
>            local.x = self.old_x
> 
>        def __resume__(self):
>            local.x = 'something'
> 
>        def __exit__(self, *err):
>            local.x = self.old_x
> 
> Besides complicating the protocol, the implementation will likely
> negatively impact the performance of coroutines, generators, and any code
> that uses context managers, and will notably complicate the
> interpreter implementation.
> 
> :pep:`521` also does not provide any mechanism to propagate state
> in a local context, like storing a request object in an HTTP request
> handler to have better logging.  Nor does it solve the leaking state
> problem for greenlet/gevent.
> 
> 
> Can Execution Context be implemented outside of CPython?
> --------------------------------------------------------
> 
> Because async/await code needs an event loop to run it, an EC-like
> solution can be implemented in a limited way for coroutines.
> 
> Generators, on the other hand, do not have an event loop or
> trampoline, making it impossible to intercept their ``yield`` points
> outside of the Python interpreter.
> 
> 
> Backwards Compatibility
> =======================
> 
> This proposal preserves 100% backwards compatibility.
> 
> 
> Appendix: HAMT Performance
> ==========================
> 
> To assess if HAMT can be used for Execution Context, we implemented
> it in CPython [7]_.
> 
> .. figure:: pep-0550-hamt_vs_dict.png
>   :align: center
>   :width: 100%
> 
>   Figure 1.  Benchmark code can be found here: [9]_.
> 
> Figure 1 shows that HAMT indeed displays O(1) performance for all
> benchmarked dictionary sizes.  For dictionaries with fewer than 100
> items, HAMT is a bit slower than Python dict/shallow copy.
> 
> .. figure:: pep-0550-lookup_hamt.png
>   :align: center
>   :width: 100%
> 
>   Figure 2.  Benchmark code can be found here: [10]_.
> 
> Figure 2 shows a comparison of lookup costs between a Python dict
> and an HAMT immutable mapping.  HAMT lookups are 30-40% slower
> than Python dict lookups on average, which is a very good result,
> considering how well Python dicts are optimized.
> 
> Note that, according to [8]_, the HAMT design can be further improved.
> 
> 
> Acknowledgments
> ===============
> 
> I thank Elvis Pranskevichus and Victor Petrovykh for countless
> discussions around the topic, and for PEP proofreading and edits.
> 
> Thanks to Nathaniel Smith for proposing the ``ContextItem`` design
> [17]_ [18]_, for pushing the PEP towards a more complete design, and
> for coming up with the idea of having a stack of contexts in the thread
> state.
> 
> Thanks to Nick Coghlan for numerous suggestions and ideas on the
> mailing list, and for coming up with a case that caused the complete
> rewrite of the initial PEP version [19]_.
> 
> 
> References
> ==========
> 
> .. [1] https://blog.golang.org/context
> 
> .. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.aspx
> 
> .. [3] https://github.com/numpy/numpy/issues/9444
> 
> .. [4] http://bugs.python.org/issue31179
> 
> .. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
> 
> .. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashmap-part-ii.html
> 
> .. [7] https://github.com/1st1/cpython/tree/hamt
> 
> .. [8] https://michael.steindorfer.name/publications/oopsla15.pdf
> 
> .. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd
> 
> .. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e
> 
> .. [11] https://github.com/1st1/cpython/tree/pep550
> 
> .. [12] https://www.python.org/dev/peps/pep-0492/#async-await
> 
> .. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.py
> 
> .. [14] https://github.com/MagicStack/pgbench
> 
> .. [15] https://github.com/python/performance
> 
> .. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c
> 
> .. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html
> 
> .. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html
> 
> .. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046780.html
> 
> 
> Copyright
> =========
> 
> This document has been placed in the public domain.