[Python-ideas] PEP 550 v2
Jelle Zijlstra
jelle.zijlstra at gmail.com
Wed Aug 16 02:53:04 EDT 2017
2017-08-16 1:55 GMT+02:00 Yury Selivanov <yselivanov.ml at gmail.com>:
> Hi,
>
> Here's the PEP 550 version 2. Thanks to a very active and insightful
> discussion here on Python-ideas, we've discovered a number of
> problems with the first version of the PEP. This version is a complete
> rewrite (only Abstract, Rationale, and Goals sections were not updated).
>
> The updated PEP is live on python.org:
> https://www.python.org/dev/peps/pep-0550/
>
> There is no reference implementation at this point, but I'm confident
> that this version of the spec will have the same extremely low
> runtime overhead as the first version. Thanks to the new ContextItem
> design, accessing values in the context is even faster now.
>
> Thank you!
>
>
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov <yury at magic.io>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017, 15-Aug-2017
>
>
> Abstract
> ========
>
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes in.
>
> A few examples of where having a reliable state storage is required:
>
> * Context managers like decimal contexts, ``numpy.errstate``,
> and ``warnings.catch_warnings``;
>
> * Storing request-related data such as security tokens and request
> data in web applications, implementing i18n;
>
> * Profiling, tracing, and logging in complex and large code bases.
>
> The usual solution for storing state is to use a Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for the purpose of state isolation
> for generators or asynchronous code, because such code executes
> concurrently in a single thread.
>
>
> Rationale
> =========
>
> Traditionally, a Thread-local Storage (TLS) is used for storing the
> state. However, the major flaw of using the TLS is that it works only
> for multi-threaded code. It is not possible to reliably contain the
> state within a generator or a coroutine. For example, consider
> the following generator::
>
> def calculate(precision, ...):
> with decimal.localcontext() as ctx:
> # Set the precision for decimal calculations
> # inside this block
> ctx.prec = precision
>
> yield calculate_something()
> yield calculate_something_else()
>
> Decimal context is using a TLS to store the state, and because TLS is
> not aware of generators, the state can leak. If a user iterates over
> the ``calculate()`` generator with different precisions one by one
> using a ``zip()`` built-in, the above code will not work correctly.
> For example::
>
> g1 = calculate(precision=100)
> g2 = calculate(precision=50)
>
> items = list(zip(g1, g2))
>
> # items[0] will be a tuple of:
> # first value from g1 calculated with 100 precision,
> # first value from g2 calculated with 50 precision.
> #
> # items[1] will be a tuple of:
> # second value from g1 calculated with 50 precision (!!!),
> # second value from g2 calculated with 50 precision.
>
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request. Currently,
> bugs like this are extremely hard to find and fix.
>
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
>
> async def handle_http_request(request):
> context.current_http_request = request
>
> await ...
> # Invoke your framework code, render templates,
> # make DB queries, etc, and use the global
> # 'current_http_request' in that code.
>
> # This isn't currently possible to do reliably
> # in asyncio out of the box.
>
> These examples are just a few out of many, where a reliable way to
> store context data is absolutely needed.
>
> The inability to use TLS for asynchronous code has lead to
> proliferation of ad-hoc solutions, which are limited in scope and
> do not support all required use cases.
>
> Current status quo is that any library, including the standard
> library, that uses a TLS, will likely not work as expected in
> asynchronous code or with generators (see [3]_ as an example issue.)
>
> Some languages that have coroutines or generators recommend to
> manually pass a ``context`` object to every function, see [1]_
> describing the pattern for Go. This approach, however, has limited
> use for Python, where we have a huge ecosystem that was built to work
> with a TLS-like context. Moreover, passing the context explicitly
> does not work at all for libraries like ``decimal`` or ``numpy``,
> which use operator overloading.
>
> .NET runtime, which has support for async/await, has a generic
> solution of this problem, called ``ExecutionContext`` (see [2]_).
> On the surface, working with it is very similar to working with a TLS,
> but the former explicitly supports asynchronous code.
>
>
> Goals
> =====
>
> The goal of this PEP is to provide a more reliable alternative to
> ``threading.local()``. It should be explicitly designed to work with
> Python execution model, equally supporting threads, generators, and
> coroutines.
>
> An acceptable solution for Python should meet the following
> requirements:
>
> * Transparent support for code executing in threads, coroutines,
> and generators with an easy to use API.
>
> * Negligible impact on the performance of the existing code or the
> code that will be using the new mechanism.
>
> * Fast C API for packages like ``decimal`` and ``numpy``.
>
> Explicit is still better than implicit, hence the new APIs should only
> be used when there is no acceptable way of passing the state
> explicitly.
>
>
> Specification
> =============
>
> Execution Context is a mechanism of storing and accessing data specific
> to a logical thread of execution. We consider OS threads,
> generators, and chains of coroutines (such as ``asyncio.Task``)
> to be variants of a logical thread.
>
> In this specification, we will use the following terminology:
>
> * **Local Context**, or LC, is a key/value mapping that stores the
> context of a logical thread.
>
> * **Execution Context**, or EC, is an OS-thread-specific dynamic
> stack of Local Contexts.
>
> * **Context Item**, or CI, is an object used to set and get values
> from the Execution Context.
>
> Please note that throughout the specification we use simple
> pseudo-code to illustrate how the EC machinery works. The actual
> algorithms and data structures that we will use to implement the PEP
> are discussed in the `Implementation Strategy`_ section.
>
>
> Context Item Object
> -------------------
>
> The ``sys.new_context_item(description)`` function creates a
> new ``ContextItem`` object. The ``description`` parameter is a
> ``str``, explaining the nature of the context key for introspection
> and debugging purposes.
>
> ``ContextItem`` objects have the following methods and attributes:
>
> * ``.description``: read-only description;
>
> * ``.set(o)`` method: set the value to ``o`` for the context item
> in the execution context.
>
> * ``.get()`` method: return the current EC value for the context item.
> Context items are initialized with ``None`` when created, so
> this method call never fails.
>
> The below is an example of how context items can be used::
>
> my_context = sys.new_context_item(description='mylib.context')
> my_context.set('spam')
>
Minor suggestion: Could we allow something like
`sys.set_new_context_item(description='mylib.context',
initial_value='spam')`? That would make it easier for type checkers to
infer the type of a ContextItem, and it would save a line of code in the
common case.
With this modification, the type of new_context_item would be
@overload
def new_context_item(*, description: str, initial_value: T) ->
ContextItem[T]: ...
@overload
def new_context_item(*, description: str) -> ContextItem[Any]: ...
If we only allow the second variant, type checkers would need some sort of
special casing to figure out that after .set(), .get() will return the same
type.
> # Later, to access the value of my_context:
> print(my_context.get())
>
>
> Thread State and Multi-threaded code
> ------------------------------------
>
> Execution Context is implemented on top of Thread-local Storage.
> For every thread there is a separate stack of Local Contexts --
> mappings of ``ContextItem`` objects to their values in the LC.
> New threads always start with an empty EC.
>
> For CPython::
>
> PyThreadState:
> execution_context: ExecutionContext([
> LocalContext({ci1: val1, ci2: val2, ...}),
> ...
> ])
>
> The ``ContextItem.get()`` and ``.set()`` methods are defined as
> follows (in pseudo-code)::
>
> class ContextItem:
>
> def get(self):
> tstate = PyThreadState_Get()
>
> for local_context in reversed(tstate.execution_context):
> if self in local_context:
> return local_context[self]
>
> def set(self, value):
> tstate = PyThreadState_Get()
>
> if not tstate.execution_context:
> tstate.execution_context = [LocalContext()]
>
> tstate.execution_context[-1][self] = value
>
> With the semantics defined so far, the Execution Context can already
> be used as an alternative to ``threading.local()``::
>
> def print_foo():
> print(ci.get() or 'nothing')
>
> ci = sys.new_context_item(description='test')
> ci.set('foo')
>
> # Will print "foo":
> print_foo()
>
> # Will print "nothing":
> threading.Thread(target=print_foo).start()
>
>
> Manual Context Management
> -------------------------
>
> Execution Context is generally managed by the Python interpreter,
> but sometimes it is desirable for the user to take the control
> over it. A few examples when this is needed:
>
> * running a computation in ``concurrent.futures.ThreadPoolExecutor``
> with the current EC;
>
> * reimplementing generators with iterators (more on that later);
>
> * managing contexts in asynchronous frameworks (implement proper
> EC support in ``asyncio.Task`` and ``asyncio.loop.call_soon``.)
>
> For these purposes we add a set of new APIs (they will be used in
> later sections of this specification):
>
> * ``sys.new_local_context()``: create an empty ``LocalContext``
> object.
>
> * ``sys.new_execution_context()``: create an empty
> ``ExecutionContext`` object.
>
> * Both ``LocalContext`` and ``ExecutionContext`` objects are opaque
> to Python code, and there are no APIs to modify them.
>
> * ``sys.get_execution_context()`` function. The function returns a
> copy of the current EC: an ``ExecutionContext`` instance.
>
> The runtime complexity of the actual implementation of this function
> can be O(1), but for the purposes of this section it is equivalent
> to::
>
> def get_execution_context():
> tstate = PyThreadState_Get()
> return copy(tstate.execution_context)
>
> * ``sys.run_with_execution_context(ec: ExecutionContext, func, *args,
> **kwargs)`` runs ``func(*args, **kwargs)`` in the provided execution
> context::
>
> def run_with_execution_context(ec, func, *args, **kwargs):
> tstate = PyThreadState_Get()
>
> old_ec = tstate.execution_context
>
> tstate.execution_context = ExecutionContext(
> ec.local_contexts + [LocalContext()]
> )
>
> try:
> return func(*args, **kwargs)
> finally:
> tstate.execution_context = old_ec
>
> Any changes to Local Context by ``func`` will be ignored.
> This allows to reuse one ``ExecutionContext`` object for multiple
> invocations of different functions, without them being able to
> affect each other's environment::
>
> ci = sys.new_context_item('example')
> ci.set('spam')
>
> def func():
> print(ci.get())
> ci.set('ham')
>
> ec = sys.get_execution_context()
>
> sys.run_with_execution_context(ec, func)
> sys.run_with_execution_context(ec, func)
>
> # Will print:
> # spam
> # spam
>
> * ``sys.run_with_local_context(lc: LocalContext, func, *args,
> **kwargs)`` runs ``func(*args, **kwargs)`` in the current execution
> context using the specified local context.
>
> Any changes that ``func`` does to the local context will be
> persisted in ``lc``. This behaviour is different from the
> ``run_with_execution_context()`` function, which always creates
> a new throw-away local context.
>
> In pseudo-code::
>
> def run_with_local_context(lc, func, *args, **kwargs):
> tstate = PyThreadState_Get()
>
> old_ec = tstate.execution_context
>
> tstate.execution_context = ExecutionContext(
> old_ec.local_contexts + [lc]
> )
>
> try:
> return func(*args, **kwargs)
> finally:
> tstate.execution_context = old_ec
>
> Using the previous example::
>
> ci = sys.new_context_item('example')
> ci.set('spam')
>
> def func():
> print(ci.get())
> ci.set('ham')
>
> ec = sys.get_execution_context()
> lc = sys.new_local_context()
>
> sys.run_with_local_context(lc, func)
> sys.run_with_local_context(lc, func)
>
> # Will print:
> # spam
> # ham
>
> As an example, let's make a subclass of
> ``concurrent.futures.ThreadPoolExecutor`` that preserves the execution
> context for scheduled functions::
>
> class Executor(concurrent.futures.ThreadPoolExecutor):
>
> def submit(self, fn, *args, **kwargs):
> context = sys.get_execution_context()
>
> fn = functools.partial(
> sys.run_with_execution_context, context,
> fn, *args, **kwargs)
>
> return super().submit(fn)
>
>
> EC Semantics for Coroutines
> ---------------------------
>
> Python :pep:`492` coroutines are used to implement cooperative
> multitasking. For a Python end-user they are similar to threads,
> especially when it comes to sharing resources or modifying
> the global state.
>
> An event loop is needed to schedule coroutines. Coroutines that
> are explicitly scheduled by the user are usually called Tasks.
> When a coroutine is scheduled, it can schedule other coroutines using
> an ``await`` expression. In async/await world, awaiting a coroutine
> is equivalent to a regular function call in synchronous code. Thus,
> Tasks are similar to threads.
>
> By drawing a parallel between regular multithreaded code and
> async/await, it becomes apparent that any modification of the
> execution context within one Task should be visible to all coroutines
> scheduled within it. Any execution context modifications, however,
> must not be visible to other Tasks executing within the same OS
> thread.
>
>
> Coroutine Object Modifications
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> To achieve this, a small set of modifications to the coroutine object
> is needed:
>
> * New ``cr_local_context`` attribute. This attribute is readable
> and writable for Python code.
>
> * When a coroutine object is instantiated, its ``cr_local_context``
> is initialized with an empty Local Context.
>
> * Coroutine's ``.send()`` and ``.throw()`` methods are modified as
> follows (in pseudo-C)::
>
> if coro.cr_local_context is not None:
> tstate = PyThreadState_Get()
>
> tstate.execution_context.push(coro.cr_local_context)
>
> try:
> # Perform the actual `Coroutine.send()` or
> # `Coroutine.throw()` call.
> return coro.send(...)
> finally:
> coro.cr_local_context = tstate.execution_context.pop()
> else:
> # Perform the actual `Coroutine.send()` or
> # `Coroutine.throw()` call.
> return coro.send(...)
>
> * When Python interpreter sees an ``await`` instruction, it inspects
> the ``cr_local_context`` attribute of the coroutine that is about
> to be awaited. For ``await coro``:
>
> * If ``coro.cr_local_context`` is an empty ``LocalContext`` object
> that ``coro`` was created with, the interpreter will set
> ``coro.cr_local_context`` to ``None``.
>
> * If ``coro.cr_local_context`` was modified by Python code, the
> interpreter will leave it as is.
>
> This makes any changes to execution context made by nested coroutine
> calls within a Task to be visible throughout the Task::
>
> ci = sys.new_context_item('example')
>
> async def nested():
> ci.set('nested')
>
> asynd def main():
> ci.set('main')
> print('before:', ci.get())
> await nested()
> print('after:', ci.get())
>
> # Will print:
> # before: main
> # after: nested
>
> Essentially, coroutines work with Execution Context items similarly
> to threads, and ``await`` expression acts like a function call.
>
> This mechanism also works for ``yield from`` in generators decorated
> with ``@types.coroutine`` or ``@asyncio.coroutine``, which are
> called "generator-based coroutines" according to :pep:`492`,
> and should be fully compatible with native async/await coroutines.
>
>
> Tasks
> ^^^^^
>
> In asynchronous frameworks like asyncio, coroutines are run by
> an event loop, and need to be explicitly scheduled (in asyncio
> coroutines are run by ``asyncio.Task``.)
>
> With the currently defined semantics, the interpreter makes
> coroutines linked by an ``await`` expression share the same
> Local Context.
>
> The interpreter, however, is not aware of the Task concept, and
> cannot help with ensuring that new Tasks started in coroutines,
> use the correct EC::
>
> current_request = sys.new_context_item(description='request')
>
> async def child():
> print('current request:', repr(current_request.get()))
>
> async def handle_request(request):
> current_request.set(request)
> event_loop.create_task(child)
>
> run(top_coro())
>
> # Will print:
> # current_request: None
>
> To enable correct Execution Context propagation into Tasks, the
> asynchronous framework needs to assist the interpreter:
>
> * When ``create_task`` is called, it should capture the current
> execution context with ``sys.get_execution_context()`` and save it
> on the Task object.
>
> * When the Task object runs its coroutine object, it should execute
> ``.send()`` and ``.throw()`` methods within the captured
> execution context, using the ``sys.run_with_execution_context()``
> function.
>
> With help from the asynchronous framework, the above snippet will
> run correctly, and the ``child()`` coroutine will be able to access
> the current request object through the ``current_request``
> Context Item.
>
>
> Event Loop Callbacks
> ^^^^^^^^^^^^^^^^^^^^
>
> Similarly to Tasks, functions like asyncio's ``loop.call_soon()``
> should capture the current execution context with
> ``sys.get_execution_context()`` and execute callbacks
> within it with ``sys.run_with_execution_context()``.
>
> This way the following code will work::
>
> current_request = sys.new_context_item(description='request')
>
> def log():
> request = current_request.get()
> print(request)
>
> async def request_handler(request):
> current_request.set(request)
> get_event_loop.call_soon(log)
>
>
> Generators
> ----------
>
> Generators in Python, while similar to Coroutines, are used in a
> fundamentally different way. They are producers of data, and
> they use ``yield`` expression to suspend/resume their execution.
>
> A crucial difference between ``await coro`` and ``yield value`` is
> that the former expression guarantees that the ``coro`` will be
> executed fully, while the latter is producing ``value`` and
> suspending the generator until it gets iterated again.
>
> Generators, similarly to coroutines, have a ``gi_local_context``
> attribute, which is set to an empty Local Context when created.
>
> Contrary to coroutines though, ``yield from o`` expression in
> generators (that are not generator-based coroutines) is semantically
> equivalent to ``for v in o: yield v``, therefore the interpreter does
> not attempt to control their ``gi_local_context``.
>
>
> EC Semantics for Generators
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Every generator object has its own Local Context that stores
> only its own local modifications of the context. When a generator
> is being iterated, its local context will be put in the EC stack
> of the current thread. This means that the generator will be able
> to see access items from the surrounding context::
>
> local = sys.new_context_item("local")
> global = sys.new_context_item("global")
>
> def generator():
> local.set('inside gen:')
> while True:
> print(local.get(), global.get())
> yield
>
> g = gen()
>
> local.set('hello')
> global.set('spam')
> next(g)
>
> local.set('world')
> global.set('ham')
> next(g)
>
> # Will print:
> # inside gen: spam
> # inside gen: ham
>
> Any changes to the EC in nested generators are invisible to the outer
> generator::
>
> local = sys.new_context_item("local")
>
> def inner_gen():
> local.set('spam')
> yield
>
> def outer_gen():
> local.set('ham')
> yield from gen()
> print(local.get())
>
> list(outer_gen())
>
> # Will print:
> # ham
>
>
> Running generators without LC
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Similarly to coroutines, generators with ``gi_local_context``
> set to ``None`` simply use the outer Local Context.
>
> The ``@contextlib.contextmanager`` decorator uses this mechanism to
> allow its generator to affect the EC::
>
> item = sys.new_context_item('test')
>
> @contextmanager
> def context(x):
> old = item.get()
> item.set('x')
> try:
> yield
> finally:
> item.set(old)
>
> with context('spam'):
>
> with context('ham'):
> print(1, item.get())
>
> print(2, item.get())
>
> # Will print:
> # 1 ham
> # 2 spam
>
>
> Implementing Generators with Iterators
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> The Execution Context API allows to fully replicate EC behaviour
> imposed on generators with a regular Python iterator class::
>
> class Gen:
>
> def __init__(self):
> self.local_context = sys.new_local_context()
>
> def __iter__(self):
> return self
>
> def __next__(self):
> return sys.run_with_local_context(
> self.local_context, self._next_impl)
>
> def _next_impl(self):
> # Actual __next__ implementation.
> ...
>
>
> Asynchronous Generators
> -----------------------
>
> Asynchronous Generators (AG) interact with the Execution Context
> similarly to regular generators.
>
> They have an ``ag_local_context`` attribute, which, similarly to
> regular generators, can be set to ``None`` to make them use the outer
> Local Context. This is used by the new
> ``contextlib.asynccontextmanager`` decorator.
>
> The EC support of ``await`` expression is implemented using the same
> approach as in coroutines, see the `Coroutine Object Modifications`_
> section.
>
>
> Greenlets
> ---------
>
> Greenlet is an alternative implementation of cooperative
> scheduling for Python. Although greenlet package is not part of
> CPython, popular frameworks like gevent rely on it, and it is
> important that greenlet can be modified to support execution
> contexts.
>
> In a nutshell, greenlet design is very similar to design of
> generators. The main difference is that for generators, the stack
> is managed by the Python interpreter. Greenlet works outside of the
> Python interpreter, and manually saves some ``PyThreadState``
> fields and pushes/pops the C-stack. Thus the ``greenlet`` package
> can be easily updated to use the new low-level `C API`_ to enable
> full support of EC.
>
>
> New APIs
> ========
>
> Python
> ------
>
> Python APIs were designed to completely hide the internal
> implementation details, but at the same time provide enough control
> over EC and LC to re-implement all of Python built-in objects
> in pure Python.
>
> 1. ``sys.new_context_item(description='...')``: create a
> ``ContextItem`` object used to access/set values in EC.
>
> 2. ``ContextItem``:
>
> * ``.description``: read-only attribute.
> * ``.get()``: return the current value for the item.
> * ``.set(o)``: set the current value in the EC for the item.
>
> 3. ``sys.get_execution_context()``: return the current
> ``ExecutionContext``.
>
> 4. ``sys.new_execution_context()``: create a new empty
> ``ExecutionContext``.
>
> 5. ``sys.new_local_context()``: create a new empty ``LocalContext``.
>
> 6. ``sys.run_with_execution_context(ec: ExecutionContext,
> func, *args, **kwargs)``.
>
> 7. ``sys.run_with_local_context(lc:LocalContext,
> func, *args, **kwargs)``.
>
>
> C API
> -----
>
> 1. ``PyContextItem * PyContext_NewItem(char *desc)``: create a
> ``PyContextItem`` object.
>
> 2. ``PyObject * PyContext_GetItem(PyContextItem *)``: get the
> current value for the context item.
>
> 3. ``int PyContext_SetItem(PyContextItem *, PyObject *)``: set
> the current value for the context item.
>
> 4. ``PyLocalContext * PyLocalContext_New()``: create a new empty
> ``PyLocalContext``.
>
> 5. ``PyLocalContext * PyExecutionContext_New()``: create a new empty
> ``PyExecutionContext``.
>
> 6. ``PyExecutionContext * PyExecutionContext_Get()``: get the
> EC for the active thread state.
>
> 7. ``int PyExecutionContext_Set(PyExecutionContext *)``: set the
> passed EC object as the current for the active thread state.
>
> 8. ``int PyExecutionContext_SetWithLocalContext(PyExecutionContext *,
> PyLocalContext *)``: allows to implement
> ``sys.run_with_local_context`` Python API.
>
>
> Implementation Strategy
> =======================
>
> LocalContext is a Weak Key Mapping
> ----------------------------------
>
> Using a weak key mapping for ``LocalContext`` implementation
> enables the following properties with regards to garbage
> collection:
>
> * ``ContextItem`` objects are strongly-referenced only from the
> application code, not from any of the Execution Context
> machinery or values they point to. This means that there
> are no reference cycles that could extend their lifespan
> longer than necessary, or prevent their garbage collection.
>
> * Values put in the Execution Context are guaranteed to be kept
> alive while there is a ``ContextItem`` key referencing them in
> the thread.
>
> * If a ``ContextItem`` is garbage collected, all of its values will
> be removed from all contexts, allowing them to be GCed if needed.
>
> * If a thread has ended its execution, its thread state will be
> cleaned up along with its ``ExecutionContext``, cleaning
> up all values bound to all Context Items in the thread.
>
>
> ContextItem.get() Cache
> -----------------------
>
> We can add three new fields to ``PyThreadState`` and
> ``PyInterpreterState`` structs:
>
> * ``uint64_t PyThreadState->unique_id``: a globally unique
> thread state identifier (we can add a counter to
> ``PyInterpreterState`` and increment it when a new thread state is
> created.)
>
> * ``uint64_t PyInterpreterState->context_item_deallocs``: every time
> a ``ContextItem`` is GCed, all Execution Contexts in all threads
> will lose track of it. ``context_item_deallocs`` will simply
> count all ``ContextItem`` deallocations.
>
> * ``uint64_t PyThreadState->execution_context_ver``: every time
> a new item is set, or an existing item is updated, or the stack
> of execution contexts is changed in the thread, we increment this
> counter.
>
> The above two fields allow implementing a fast cache path in
> ``ContextItem.get()``, in pseudo-code::
>
> class ContextItem:
>
> def get(self):
> tstate = PyThreadState_Get()
>
> if (self.last_tstate_id == tstate.unique_id and
> self.last_ver == tstate.execution_context_ver
> self.last_deallocs ==
> tstate.iterp.context_item_deallocs):
> return self.last_value
>
> value = None
> for mapping in reversed(tstate.execution_context):
> if self in mapping:
> value = mapping[self]
> break
>
> self.last_value = value
> self.last_tstate_id = tstate.unique_id
> self.last_ver = tstate.execution_context_ver
> self.last_deallocs = tstate.interp.context_item_deallocs
>
> return value
>
> This is similar to the trick that decimal C implementation uses
> for caching the current decimal context, and will have the same
> performance characteristics, but available to all
> Execution Context users.
>
>
> Approach #1: Use a dict for LocalContext
> ----------------------------------------
>
> The straightforward way of implementing the proposed EC
> mechanisms is to create a ``WeakKeyDict`` on top of Python
> ``dict`` type.
>
> To implement the ``ExecutionContext`` type we can use Python
> ``list`` (or a custom stack implementation with some
> pre-allocation optimizations).
>
> This approach will have the following runtime complexity:
>
> * O(M) for ``ContextItem.get()``, where ``M`` is the number of
> Local Contexts in the stack.
>
> It is important to note that ``ContextItem.get()`` will implement
> a cache making the operation O(1) for packages like ``decimal``
> and ``numpy``.
>
> * O(1) for ``ContextItem.set()``.
>
> * O(N) for ``sys.get_execution_context()``, where ``N`` is the
> total number of items in the current **execution** context.
>
>
> Approach #2: Use HAMT for LocalContext
> --------------------------------------
>
> Languages like Clojure and Scala use Hash Array Mapped Tries (HAMT)
> to implement high performance immutable collections [5]_, [6]_.
>
> Immutable mappings implemented with HAMT have O(log\ :sub:`32`\ N)
> performance for both ``set()``, ``get()``, and ``merge()`` operations,
> which is essentially O(1) for relatively small mappings
> (read about HAMT performance in CPython in the
> `Appendix: HAMT Performance`_ section.)
>
> In this approach we use the same design of the ``ExecutionContext``
> as in Approach #1, but we will use HAMT backed weak key Local Context
> implementation. With that we will have the following runtime
> complexity:
>
> * O(M * log\ :sub:`32`\ N) for ``ContextItem.get()``,
> where ``M`` is the number of Local Contexts in the stack,
> and ``N`` is the number of items in the EC. The operation will
> essentially be O(M), because execution contexts are normally not
> expected to have more than a few dozen of items.
>
> (``ContextItem.get()`` will have the same caching mechanism as in
> Approach #1.)
>
> * O(log\ :sub:`32`\ N) for ``ContextItem.set()`` where ``N`` is the
> number of items in the current **local** context. This will
> essentially be an O(1) operation most of the time.
>
> * O(log\ :sub:`32`\ N) for ``sys.get_execution_context()``, where
> ``N`` is the total number of items in the current **execution**
> context.
>
> Essentially, using HAMT for Local Contexts instead of Python dicts,
> allows to bring down the complexity of ``sys.get_execution_context()``
> from O(N) to O(log\ :sub:`32`\ N) because of the more efficient
> merge algorithm.
>
>
> Approach #3: Use HAMT and Immutable Linked List
> -----------------------------------------------
>
> We can make an alternative ``ExecutionContext`` design by using
> a linked list. Each ``LocalContext`` in the ``ExecutionContext``
> object will be wrapped in a linked-list node.
>
> ``LocalContext`` objects will use an HAMT backed weak key
> implementation described in the Approach #2.
>
> Every modification to the current ``LocalContext`` will produce a
> new version of it, which will be wrapped in a **new linked list
> node**. Essentially this means, that ``ExecutionContext`` is an
> immutable forest of ``LocalContext`` objects, and can be safely
> copied by reference in ``sys.get_execution_context()`` (eliminating
> the expensive "merge" operation.)
>
> With this approach, ``sys.get_execution_context()`` will be an
> **O(1) operation**.
>
>
> Summary
> -------
>
> We believe that approach #3 enables an efficient and complete
> Execution Context implementation, with excellent runtime performance.
>
> `ContextItem.get() Cache`_ enables fast retrieval of context items
> for performance critical libraries like decimal and numpy.
>
> Fast ``sys.get_execution_context()`` enables efficient management
> of execution contexts in asynchronous libraries like asyncio.
>
>
> Design Considerations
> =====================
>
> Can we fix ``PyThreadState_GetDict()``?
> ---------------------------------------
>
> ``PyThreadState_GetDict`` is a TLS, and some of its existing users
> might depend on it being just a TLS. Changing its behaviour to follow
> the Execution Context semantics would break backwards compatibility.
>
>
> PEP 521
> -------
>
> :pep:`521` proposes an alternative solution to the problem:
> enhance Context Manager Protocol with two new methods: ``__suspend__``
> and ``__resume__``. To make it compatible with async/await,
> the Asynchronous Context Manager Protocol will also need to be
> extended with ``__asuspend__`` and ``__aresume__``.
>
> This allows to implement context managers like decimal context and
> ``numpy.errstate`` for generators and coroutines.
>
> The following code::
>
> class Context:
>
> def __enter__(self):
> self.old_x = get_execution_context_item('x')
> set_execution_context_item('x', 'something')
>
> def __exit__(self, *err):
> set_execution_context_item('x', self.old_x)
>
> would become this::
>
> local = threading.local()
>
> class Context:
>
> def __enter__(self):
> self.old_x = getattr(local, 'x', None)
> local.x = 'something'
>
> def __suspend__(self):
> local.x = self.old_x
>
> def __resume__(self):
> local.x = 'something'
>
> def __exit__(self, *err):
> local.x = self.old_x
>
> Besides complicating the protocol, the implementation will likely
> negatively impact performance of coroutines, generators, and any code
> that uses context managers, and will notably complicate the
> interpreter implementation.
>
> :pep:`521` also does not provide any mechanism to propagate state
> in a local context, like storing a request object in an HTTP request
> handler to have better logging. Nor does it solve the leaking state
> problem for greenlet/gevent.
>
>
> Can Execution Context be implemented outside of CPython?
> --------------------------------------------------------
>
> Because async/await code needs an event loop to run it, an EC-like
> solution can be implemented in a limited way for coroutines.
>
> Generators, on the other hand, do not have an event loop or
> trampoline, making it impossible to intercept their ``yield`` points
> outside of the Python interpreter.
>
>
> Backwards Compatibility
> =======================
>
> This proposal preserves 100% backwards compatibility.
>
>
> Appendix: HAMT Performance
> ==========================
>
> To assess if HAMT can be used for Execution Context, we implemented
> it in CPython [7]_.
>
> .. figure:: pep-0550-hamt_vs_dict.png
> :align: center
> :width: 100%
>
> Figure 1. Benchmark code can be found here: [9]_.
>
> Figure 1 shows that HAMT indeed displays O(1) performance for all
> benchmarked dictionary sizes. For dictionaries with less than 100
> items, HAMT is a bit slower than Python dict/shallow copy.
>
> .. figure:: pep-0550-lookup_hamt.png
> :align: center
> :width: 100%
>
> Figure 2. Benchmark code can be found here: [10]_.
>
> Figure 2 shows comparison of lookup costs between Python dict
> and an HAMT immutable mapping. HAMT lookup time is 30-40% worse
> than Python dict lookups on average, which is a very good result,
> considering how well Python dicts are optimized.
>
> Note, that according to [8]_, HAMT design can be further improved.
>
>
> Acknowledgments
> ===============
>
> I thank Elvis Pranskevichus and Victor Petrovykh for countless
> discussions around the topic and PEP proof reading and edits.
>
> Thanks to Nathaniel Smith for proposing the ``ContextItem`` design
> [17]_ [18]_, for pushing the PEP towards a more complete design, and
> coming up with the idea of having a stack of contexts in the thread
> state.
>
> Thanks to Nick Coghlan for numerous suggestions and ideas on the
> mailing list, and for coming up with a case that cause the complete
> rewrite of the initial PEP version [19]_.
>
>
> References
> ==========
>
> .. [1] https://blog.golang.org/context
>
> .. [2] https://msdn.microsoft.com/en-us/library/system.threading.
> executioncontext.aspx
>
> .. [3] https://github.com/numpy/numpy/issues/9444
>
> .. [4] http://bugs.python.org/issue31179
>
> .. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
>
> .. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-
> persistenthashmap-part-ii.html
>
> .. [7] https://github.com/1st1/cpython/tree/hamt
>
> .. [8] https://michael.steindorfer.name/publications/oopsla15.pdf
>
> .. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd
>
> .. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e
>
> .. [11] https://github.com/1st1/cpython/tree/pep550
>
> .. [12] https://www.python.org/dev/peps/pep-0492/#async-await
>
> .. [13] https://github.com/MagicStack/uvloop/blob/master/examples/
> bench/echoserver.py
>
> .. [14] https://github.com/MagicStack/pgbench
>
> .. [15] https://github.com/python/performance
>
> .. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c
>
> .. [17] https://mail.python.org/pipermail/python-ideas/2017-
> August/046752.html
>
> .. [18] https://mail.python.org/pipermail/python-ideas/2017-
> August/046772.html
>
> .. [19] https://mail.python.org/pipermail/python-ideas/2017-
> August/046780.html
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20170816/573ea992/attachment-0001.html>
More information about the Python-ideas
mailing list