Python-Dev
September 2017
- 95 participants
- 66 discussions
This module implements a cplx class (complex numbers) independently of
the built-in class.
The main goal of this module is to propose some improvements to complex
numbers in Python and to deal with them from a mathematical approach.
The cplx class intentionally does not interoperate with the built-in
class, as the idea was to give an alternative to it.
With the hope that I managed to succeed, here is the module:
Since IRIX was EOLed in 2013, I propose support for it be removed in
Python 3.7. I will add it to PEP 11.
Hi,
Below is the fifth iteration of the PEP. The summary of changes is
in the "Version History" section, but I'll list them here too:
* Coroutines have no logical context by default (a revert to the V3
semantics). Read about the motivation in the
`Coroutines not leaking context changes by default`_ section.
The `High-Level Specification`_ section was also updated
(specifically Generators and Coroutines subsections).
* All APIs have been placed in the ``contextvars`` module, and
the factory functions were changed to class constructors
(``ContextVar``, ``ExecutionContext``, and ``LogicalContext``).
Thanks to Nick for the idea.
* ``ContextVar.lookup()`` got renamed back to ``ContextVar.get()``
and gained the ``topmost`` and ``default`` keyword arguments.
Added ``ContextVar.delete()``.
* Fixed ``ContextVar.get()`` cache bug (thanks Nathaniel!).
* New `Rejected Ideas`_,
`Should "yield from" leak context changes?`_,
`Alternative Designs for ContextVar API`_,
`Setting and restoring context variables`_, and
`Context manager as the interface for modifications`_ sections.
Thanks!
PEP: 550
Title: Execution Context
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury@magic.io>,
Elvis Pranskevichus <elvis@magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2017
Python-Version: 3.7
Post-History: 11-Aug-2017, 15-Aug-2017, 18-Aug-2017, 25-Aug-2017,
01-Sep-2017
Abstract
========
This PEP adds a new generic mechanism of ensuring consistent access
to non-local state in the context of out-of-order execution, such
as in Python generators and coroutines.
Thread-local storage, such as ``threading.local()``, is inadequate for
programs that execute concurrently in the same OS thread. This PEP
proposes a solution to this problem.
Rationale
=========
Prior to the advent of asynchronous programming in Python, programs
used OS threads to achieve concurrency. The need for thread-specific
state was solved by ``threading.local()`` and its C-API equivalent,
``PyThreadState_GetDict()``.
A few examples of where thread-local storage (TLS) is commonly
relied upon:
* Context managers like decimal contexts, ``numpy.errstate``,
and ``warnings.catch_warnings``.
* Request-related data, such as security tokens and request
data in web applications, language context for ``gettext`` etc.
* Profiling, tracing, and logging in large code bases.
Unfortunately, TLS does not work well for programs which execute
concurrently in a single thread. A Python generator is the simplest
example of a concurrent program. Consider the following::
def fractions(precision, x, y):
with decimal.localcontext() as ctx:
ctx.prec = precision
yield Decimal(x) / Decimal(y)
yield Decimal(x) / Decimal(y ** 2)
g1 = fractions(precision=2, x=1, y=3)
g2 = fractions(precision=6, x=2, y=3)
items = list(zip(g1, g2))
The intuitively expected value of ``items`` is::
[(Decimal('0.33'), Decimal('0.666667')),
(Decimal('0.11'), Decimal('0.222222'))]
Rather surprisingly, the actual result is::
[(Decimal('0.33'), Decimal('0.666667')),
(Decimal('0.111111'), Decimal('0.222222'))]
This is because the implicit Decimal context is stored as a thread-local,
so concurrent iteration of the ``fractions()`` generator would
corrupt the state. For Decimal, specifically, the only current
workaround is to use explicit context method calls for all arithmetic
operations [28]_. Arguably, this defeats the usefulness of overloaded
operators and makes even simple formulas hard to read and write.
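The explicit-context workaround mentioned above is possible with today's ``decimal`` module: each generator holds its own ``decimal.Context`` object and calls its arithmetic methods directly, so no thread-local state is touched. A minimal sketch:

```python
import decimal
from decimal import Decimal

def fractions_explicit(precision, x, y):
    # Each generator owns a private context object and calls its
    # arithmetic methods explicitly; no thread-local state is read
    # or written, so concurrent iteration is safe.
    ctx = decimal.Context(prec=precision)
    yield ctx.divide(Decimal(x), Decimal(y))
    yield ctx.divide(Decimal(x), Decimal(y ** 2))

g1 = fractions_explicit(precision=2, x=1, y=3)
g2 = fractions_explicit(precision=6, x=2, y=3)
items = list(zip(g1, g2))
# items == [(Decimal('0.33'), Decimal('0.666667')),
#           (Decimal('0.11'), Decimal('0.222222'))]
```

As the PEP argues, this works but gives up overloaded operators, which is exactly the readability cost being criticized.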
Coroutines are another class of Python code where TLS unreliability
is a significant issue.
The inadequacy of TLS in asynchronous code has led to the
proliferation of ad-hoc solutions, which are limited in scope and
do not support all required use cases.
The current status quo is that any library (including the standard
library), which relies on TLS, is likely to be broken when used in
asynchronous code or with generators (see [3]_ for an example issue).
Some languages that support coroutines or generators recommend
passing the context manually as an argument to every function, see
[1]_ for an example. This approach, however, has limited use for
Python, where there is a large ecosystem that was built to work with
a TLS-like context. Furthermore, libraries like ``decimal`` or
``numpy`` rely on context implicitly in overloaded operator
implementations.
The .NET runtime, which has support for async/await, has a generic
solution for this problem, called ``ExecutionContext`` (see [2]_).
Goals
=====
The goal of this PEP is to provide a more reliable
``threading.local()`` alternative, which:
* provides the mechanism and the API to fix non-local state issues
with coroutines and generators;
* implements TLS-like semantics for synchronous code, so that
users like ``decimal`` and ``numpy`` can switch to the new
mechanism with minimal risk of breaking backwards compatibility;
* has no or negligible performance impact on the existing code or
the code that will be using the new mechanism, including
C extensions.
High-Level Specification
========================
The full specification of this PEP is broken down into three parts:
* High-Level Specification (this section): the description of the
overall solution. We show how it applies to generators and
coroutines in user code, without delving into implementation
details.
* Detailed Specification: the complete description of new concepts,
APIs, and related changes to the standard library.
* Implementation Details: the description and analysis of data
structures and algorithms used to implement this PEP, as well as
the necessary changes to CPython.
For the purpose of this section, we define *execution context* as an
opaque container of non-local state that allows consistent access to
its contents in the concurrent execution environment.
A *context variable* is an object representing a value in the
execution context. A call to ``contextvars.ContextVar(name)``
creates a new context variable object. A context variable object has
three methods:
* ``get()``: returns the value of the variable in the current
execution context;
* ``set(value)``: sets the value of the variable in the current
execution context;
* ``delete()``: can be used to restore variable state; its
purpose and semantics are explained in
`Setting and restoring context variables`_.
Regular Single-threaded Code
----------------------------
In regular, single-threaded code that doesn't involve generators or
coroutines, context variables behave like globals::
var = contextvars.ContextVar('var')
def sub():
assert var.get() == 'main'
var.set('sub')
def main():
var.set('main')
sub()
assert var.get() == 'sub'
Multithreaded Code
------------------
In multithreaded code, context variables behave like thread locals::
var = contextvars.ContextVar('var')
def sub():
assert var.get() is None # The execution context is empty
# for each new thread.
var.set('sub')
def main():
var.set('main')
thread = threading.Thread(target=sub)
thread.start()
thread.join()
assert var.get() == 'main'
Generators
----------
Unlike regular function calls, generators can cooperatively yield
their control of execution to the caller. Furthermore, a generator
does not control *where* the execution would continue after it yields.
It may be resumed from an arbitrary code location.
For these reasons, the least surprising behaviour of generators is
as follows:
* changes to context variables are always local and are not visible
in the outer context, but are visible to the code called by the
generator;
* once set in the generator, the context variable is guaranteed not
to change between iterations;
* changes to context variables in outer context (where the generator
is being iterated) are visible to the generator, unless these
variables were also modified inside the generator.
Let's review::
var1 = contextvars.ContextVar('var1')
var2 = contextvars.ContextVar('var2')
def gen():
var1.set('gen')
assert var1.get() == 'gen'
assert var2.get() == 'main'
yield 1
# Modification to var1 in main() is shielded by
# gen()'s local modification.
assert var1.get() == 'gen'
# But modifications to var2 are visible
assert var2.get() == 'main modified'
yield 2
def main():
g = gen()
var1.set('main')
var2.set('main')
next(g)
# Modification of var1 in gen() is not visible.
assert var1.get() == 'main'
var1.set('main modified')
var2.set('main modified')
next(g)
Now, let's revisit the decimal precision example from the `Rationale`_
section, and see how the execution context can improve the situation::
import decimal
# create a new context var
decimal_ctx = contextvars.ContextVar('decimal context')
# Pre-PEP 550 Decimal relies on TLS for its context.
# For illustration purposes, we monkey-patch the decimal
# context functions to use the execution context.
# A real working fix would need to properly update the
# C implementation as well.
def patched_setcontext(context):
decimal_ctx.set(context)
def patched_getcontext():
ctx = decimal_ctx.get()
if ctx is None:
ctx = decimal.Context()
decimal_ctx.set(ctx)
return ctx
decimal.setcontext = patched_setcontext
decimal.getcontext = patched_getcontext
def fractions(precision, x, y):
with decimal.localcontext() as ctx:
ctx.prec = precision
yield MyDecimal(x) / MyDecimal(y)
yield MyDecimal(x) / MyDecimal(y ** 2)
g1 = fractions(precision=2, x=1, y=3)
g2 = fractions(precision=6, x=2, y=3)
items = list(zip(g1, g2))
The value of ``items`` is::
[(Decimal('0.33'), Decimal('0.666667')),
(Decimal('0.11'), Decimal('0.222222'))]
which matches the expected result.
Coroutines and Asynchronous Tasks
---------------------------------
Like generators, coroutines can yield and regain control. The major
difference from generators is that coroutines do not yield to the
immediate caller. Instead, the entire coroutine call stack
(coroutines chained by ``await``) switches to another coroutine call
stack. In this regard, ``await``-ing on a coroutine is conceptually
similar to a regular function call, and a coroutine chain
(or a "task", e.g. an ``asyncio.Task``) is conceptually similar to a
thread.
From this similarity we conclude that context variables in coroutines
should behave like "task locals":
* changes to context variables in a coroutine are visible to the
coroutine that awaits on it;
* changes to context variables made in the caller prior to awaiting
are visible to the awaited coroutine;
* changes to context variables made in one task are not visible in
other tasks;
* tasks spawned by other tasks inherit the execution context from the
parent task, but any changes to context variables made in the
parent task *after* the child task was spawned are *not* visible.
The last point shows behaviour that is different from OS threads.
OS threads do not inherit the execution context by default.
There are two reasons for this: *common usage intent* and backwards
compatibility.
The main reason for why tasks inherit the context, and threads do
not, is the common usage intent. Tasks are often used for relatively
short-running operations which are logically tied to the code that
spawned the tasks (like running a coroutine with a timeout in
asyncio). OS threads, on the other hand, are normally used for
long-running, logically separate code.
With respect to backwards compatibility, we want the execution context
to behave like ``threading.local()``. This is so that libraries can
start using the execution context in place of TLS with a lesser risk
of breaking compatibility with existing code.
Let's review a few examples to illustrate the semantics we have just
defined.
Context variable propagation in a single task::
import asyncio
var = contextvars.ContextVar('var')
async def main():
var.set('main')
await sub()
# The effect of sub() is visible.
assert var.get() == 'sub'
async def sub():
assert var.get() == 'main'
var.set('sub')
assert var.get() == 'sub'
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Context variable propagation between tasks::
import asyncio
var = contextvars.ContextVar('var')
async def main():
var.set('main')
loop.create_task(sub()) # schedules asynchronous execution
# of sub().
assert var.get() == 'main'
var.set('main changed')
async def sub():
# Sleeping will make sub() run after
# `var` is modified in main().
await asyncio.sleep(1)
# The value of "var" is inherited from main(), but any
# changes to "var" made in main() after the task
# was created are *not* visible.
assert var.get() == 'main'
# This change is local to sub() and will not be visible
# to other tasks, including main().
var.set('sub')
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
As shown above, changes to the execution context are local to the
task, and tasks get a snapshot of the execution context at the point
of creation.
There is one narrow edge case when this can lead to surprising
behaviour. Consider the following example where we modify the
context variable in a nested coroutine::
async def sub(var_value):
await asyncio.sleep(1)
var.set(var_value)
async def main():
var.set('main')
# waiting for sub() directly
await sub('sub-1')
# var change is visible
assert var.get() == 'sub-1'
# waiting for sub() with a timeout;
await asyncio.wait_for(sub('sub-2'), timeout=2)
# wait_for() creates an implicit task, which isolates
# context changes, which means that the below assertion
# will fail.
assert var.get() == 'sub-2' # AssertionError!
However, relying on context changes leaking to the caller is
ultimately a bad pattern. For this reason, the behaviour shown in
the above example is not considered a major issue and can be
addressed with proper documentation.
Detailed Specification
======================
Conceptually, an *execution context* (EC) is a stack of logical
contexts. There is always exactly one active EC per Python thread.
A *logical context* (LC) is a mapping of context variables to their
values in that particular LC.
A *context variable* is an object representing a value in the
execution context. A new context variable object is created by
calling ``contextvars.ContextVar(name: str)``. The value of the
required ``name`` argument is not used by the EC machinery, but may
be used for debugging and introspection.
The context variable object has the following methods and attributes:
* ``name``: the value passed to ``ContextVar()``.
* ``get(*, topmost=False, default=None)``: if *topmost* is ``False``
(the default), traverses the execution context top-to-bottom until
the variable value is found; if *topmost* is ``True``, returns
the value of the variable in the topmost logical context.
If the variable value was not found, returns the value of *default*.
* ``set(value)``: sets the value of the variable in the topmost
logical context.
* ``delete()``: removes the variable from the topmost logical context.
Useful when restoring the logical context to the state prior to the
``set()`` call, for example, in a context manager, see
`Setting and restoring context variables`_ for more information.
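The semantics above can be modeled in pure Python with a list of dicts standing in for the EC stack. This is a toy sketch for illustration only (all names are hypothetical); the proposal itself uses immutable HAMT-based mappings and C-level thread state:

```python
# Toy model: the EC is a stack of logical contexts (plain dicts
# here), and a context variable reads top-to-bottom but writes
# only to the topmost LC.
_MISSING = object()

class ToyContextVar:
    def __init__(self, name, ec):
        self.name = name   # used only for debugging/introspection
        self._ec = ec      # the EC stack: a bottom-to-top list of dicts

    def get(self, *, topmost=False, default=None):
        lcs = self._ec[-1:] if topmost else reversed(self._ec)
        for lc in lcs:
            value = lc.get(self, _MISSING)
            if value is not _MISSING:
                return value
        return default

    def set(self, value):
        self._ec[-1][self] = value

    def delete(self):
        # Removes the variable from the topmost LC only.
        self._ec[-1].pop(self, None)

ec = [{}, {}]                        # outer LC, topmost LC
var = ToyContextVar('var', ec)
ec[0][var] = 'outer'                 # pretend outer code set it
assert var.get() == 'outer'          # found by top-to-bottom search
assert var.get(topmost=True, default='?') == '?'
var.set('inner')                     # writes to the topmost LC
assert var.get() == 'inner'
var.delete()                         # outer value is visible again
assert var.get() == 'outer'
```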
Generators
----------
When created, each generator object has an empty logical context
object stored in its ``__logical_context__`` attribute. This logical
context is pushed onto the execution context at the beginning of each
generator iteration and popped at the end::
var1 = contextvars.ContextVar('var1')
var2 = contextvars.ContextVar('var2')
def gen():
var1.set('var1-gen')
var2.set('var2-gen')
# EC = [
# outer_LC(),
# gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
# ]
n = nested_gen() # nested_gen_LC is created
next(n)
# EC = [
# outer_LC(),
# gen_LC({var1: 'var1-gen', var2: 'var2-gen'})
# ]
var1.set('var1-gen-mod')
var2.set('var2-gen-mod')
# EC = [
# outer_LC(),
# gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'})
# ]
next(n)
def nested_gen():
# EC = [
# outer_LC(),
# gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
# nested_gen_LC()
# ]
assert var1.get() == 'var1-gen'
assert var2.get() == 'var2-gen'
var1.set('var1-nested-gen')
# EC = [
# outer_LC(),
# gen_LC({var1: 'var1-gen', var2: 'var2-gen'}),
# nested_gen_LC({var1: 'var1-nested-gen'})
# ]
yield
# EC = [
# outer_LC(),
# gen_LC({var1: 'var1-gen-mod', var2: 'var2-gen-mod'}),
# nested_gen_LC({var1: 'var1-nested-gen'})
# ]
assert var1.get() == 'var1-nested-gen'
assert var2.get() == 'var2-gen-mod'
yield
# EC = [outer_LC()]
g = gen() # gen_LC is created for the generator object `g`
list(g)
# EC = [outer_LC()]
The snippet above shows the state of the execution context stack
throughout the generator lifespan.
contextlib.contextmanager
-------------------------
The ``contextlib.contextmanager()`` decorator can be used to turn
a generator into a context manager. A context manager that
temporarily modifies the value of a context variable could be defined
like this::
var = contextvars.ContextVar('var')
@contextlib.contextmanager
def var_context(value):
original_value = var.get()
try:
var.set(value)
yield
finally:
var.set(original_value)
Unfortunately, this would not work straight away, as the modification
to the ``var`` variable is contained to the ``var_context()``
generator, and therefore will not be visible inside the ``with``
block::
def func():
# EC = [{}, {}]
with var_context(10):
# EC becomes [{}, {}, {var: 10}] in the
# *var_context()* generator,
# but here the EC is still [{}, {}]
assert var.get() == 10 # AssertionError!
The way to fix this is to set the generator's ``__logical_context__``
attribute to ``None``. This will cause the generator to avoid
modifying the execution context stack.
We modify the ``contextlib.contextmanager()`` decorator to
set ``genobj.__logical_context__`` to ``None`` to produce
well-behaved context managers::
def func():
# EC = [{}, {}]
with var_context(10):
# EC = [{}, {var: 10}]
assert var.get() == 10
# EC becomes [{}, {var: None}]
Enumerating context vars
------------------------
The ``ExecutionContext.vars()`` method returns a list of
``ContextVar`` objects that have values in the execution context.
This method is mostly useful for introspection and logging.
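What such enumeration amounts to can be sketched by modeling the EC as a list of mappings keyed by variable objects (illustration only; ``ExecutionContext.vars()`` is the proposed API, and ``ec_vars`` and ``Var`` below are hypothetical helpers):

```python
# Toy illustration: enumerate the context variables that have
# a value anywhere in an EC modeled as a list of dicts.
def ec_vars(ec):
    seen = []
    for lc in ec:               # bottom-to-top
        for var in lc:
            if var not in seen:
                seen.append(var)
    return seen

class Var:
    def __init__(self, name):
        self.name = name

a, b = Var('a'), Var('b')
ec = [{a: 1}, {a: 2, b: 3}]     # `a` is shadowed in the top LC
assert {v.name for v in ec_vars(ec)} == {'a', 'b'}
assert len(ec_vars(ec)) == 2    # each variable is listed once
```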
Coroutines
----------
In CPython, coroutines share the implementation with generators.
The difference is that in coroutines ``__logical_context__`` defaults
to ``None``. This affects both the ``async def`` coroutines and the
old-style generator-based coroutines (generators decorated with
``@types.coroutine``).
Asynchronous Generators
-----------------------
The execution context semantics of asynchronous generators do not
differ from those of regular generators.
asyncio
-------
``asyncio`` uses ``Loop.call_soon``, ``Loop.call_later``,
and ``Loop.call_at`` to schedule the asynchronous execution of a
function. ``asyncio.Task`` uses ``call_soon()`` to run the
wrapped coroutine.
We modify ``Loop.call_{at,later,soon}`` to accept the new
optional *execution_context* keyword argument, which defaults to
the copy of the current execution context::
def call_soon(self, callback, *args, execution_context=None):
if execution_context is None:
execution_context = contextvars.get_execution_context()
# ... some time later
contextvars.run_with_execution_context(
execution_context, callback, args)
The ``contextvars.get_execution_context()`` function returns a
shallow copy of the current execution context. By shallow copy here
we mean such a new execution context that:
* lookups in the copy provide the same results as in the original
execution context, and
* any changes in the original execution context do not affect the
copy, and
* any changes to the copy do not affect the original execution
context.
Either of the following satisfy the copy requirements:
* a new stack with shallow copies of logical contexts;
* a new stack with one squashed logical context.
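The second option, a single squashed logical context, can be sketched with plain dicts. Topmost values must win, so merging in bottom-to-top stack order lets later LCs shadow earlier ones (a toy sketch; the Implementation section's pseudocode achieves the same result by walking top-down with a merge that skips existing keys):

```python
def squash(ec):
    # `ec` is a bottom-to-top list of logical contexts (dicts here).
    # Later (topmost) LCs shadow earlier ones, so updating in stack
    # order gives each variable its currently visible value.
    squashed = {}
    for lc in ec:
        squashed.update(lc)
    return [squashed]

ec = [{'var1': 'outer'}, {}, {'var1': 'inner', 'var2': 'x'}]
copy = squash(ec)
assert copy[0] == {'var1': 'inner', 'var2': 'x'}

# Mutating the copy does not affect the original stack:
copy[0]['var1'] = 'changed'
assert ec[-1]['var1'] == 'inner'
```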
The ``contextvars.run_with_execution_context(ec, func, *args,
**kwargs)`` function runs ``func(*args, **kwargs)`` with *ec* as the
execution context. The function performs the following steps:
1. Set *ec* as the current execution context stack in the current
thread.
2. Push an empty logical context onto the stack.
3. Run ``func(*args, **kwargs)``.
4. Pop the logical context from the stack.
5. Restore the original execution context stack.
6. Return or raise the ``func()`` result.
These steps ensure that *ec* cannot be modified by *func*,
which makes ``run_with_execution_context()`` idempotent.
``asyncio.Task`` is modified as follows::
class Task:
def __init__(self, coro):
...
# Get the current execution context snapshot.
self._exec_context = contextvars.get_execution_context()
# Create an empty Logical Context that will be
# used by coroutines run in the task.
coro.__logical_context__ = contextvars.LogicalContext()
self._loop.call_soon(
self._step,
execution_context=self._exec_context)
def _step(self, exc=None):
...
self._loop.call_soon(
self._step,
execution_context=self._exec_context)
...
Generators Transformed into Iterators
-------------------------------------
Any Python generator can be represented as an equivalent iterator.
Compilers like Cython rely on this axiom. With respect to the
execution context, such an iterator should behave the same way as the
generator it represents.
This means that there needs to be a Python API to create new logical
contexts and run code with a given logical context.
The ``contextvars.LogicalContext()`` function creates a new empty
logical context.
The ``contextvars.run_with_logical_context(lc, func, *args,
**kwargs)`` function can be used to run functions in the specified
logical context. The *lc* can be modified as a result of the call.
The ``contextvars.run_with_logical_context()`` function performs the
following steps:
1. Push *lc* onto the current execution context stack.
2. Run ``func(*args, **kwargs)``.
3. Pop *lc* from the execution context stack.
4. Return or raise the ``func()`` result.
By using ``LogicalContext()`` and ``run_with_logical_context()``,
we can replicate the generator behaviour like this::
class Generator:
def __init__(self):
self.logical_context = contextvars.LogicalContext()
def __iter__(self):
return self
def __next__(self):
return contextvars.run_with_logical_context(
self.logical_context, self._next_impl)
def _next_impl(self):
# Actual __next__ implementation.
...
Let's see how this pattern can be applied to an example generator::
# create a new context variable
var = contextvars.ContextVar('var')
def gen_series(n):
var.set(10)
for i in range(1, n):
yield var.get() * i
# gen_series is equivalent to the following iterator:
class CompiledGenSeries:
# This class is what the `gen_series()` generator can
# be transformed to by a compiler like Cython.
def __init__(self, n):
# Create a new empty logical context,
# like the generators do.
self.logical_context = contextvars.LogicalContext()
# Initialize the generator in its LC.
# Otherwise `var.set(10)` in the `_init` method
# would leak.
contextvars.run_with_logical_context(
self.logical_context, self._init, n)
def _init(self, n):
self.i = 1
self.n = n
var.set(10)
def __iter__(self):
return self
def __next__(self):
# Run the actual implementation of __next__ in our LC.
return contextvars.run_with_logical_context(
self.logical_context, self._next_impl)
def _next_impl(self):
if self.i == self.n:
raise StopIteration
result = var.get() * self.i
self.i += 1
return result
For hand-written iterators, such an approach to context management
is normally not necessary, and it is easier to set and restore
context variables directly in ``__next__``::
class MyIterator:
# ...
def __next__(self):
old_val = var.get()
try:
var.set(new_val)
# ...
finally:
var.set(old_val)
Implementation
==============
Execution context is implemented as an immutable linked list of
logical contexts, where each logical context is an immutable weak key
mapping. A pointer to the currently active execution context is
stored in the OS thread state::
+-----------------+
| | ec
| PyThreadState +-------------+
| | |
+-----------------+ |
|
ec_node ec_node ec_node v
+------+------+ +------+------+ +------+------+
| NULL | lc |<----| prev | lc |<----| prev | lc |
+------+--+---+ +------+--+---+ +------+--+---+
| | |
LC v LC v LC v
+-------------+ +-------------+ +-------------+
| var1: obj1 | | EMPTY | | var1: obj4 |
| var2: obj2 | +-------------+ +-------------+
| var3: obj3 |
+-------------+
The choice of the immutable list of immutable mappings as a
fundamental data structure is motivated by the need to efficiently
implement ``contextvars.get_execution_context()``, which is to be
frequently used by asynchronous tasks and callbacks. When the EC is
immutable, ``get_execution_context()`` can simply copy the current
execution context *by reference*::
def get_execution_context(self):
return PyThreadState_Get().ec
Let's review all possible context modification scenarios:
* The ``ContextVariable.set()`` method is called::
def ContextVar_set(self, val):
# See a more complete set() definition
# in the `Context Variables` section.
tstate = PyThreadState_Get()
top_ec_node = tstate.ec
top_lc = top_ec_node.lc
new_top_lc = top_lc.set(self, val)
tstate.ec = ec_node(
prev=top_ec_node.prev,
lc=new_top_lc)
* The ``contextvars.run_with_logical_context()`` is called, in which
case the passed logical context object is appended to the execution
context::
def run_with_logical_context(lc, func, *args, **kwargs):
tstate = PyThreadState_Get()
old_top_ec_node = tstate.ec
new_top_ec_node = ec_node(prev=old_top_ec_node, lc=lc)
try:
tstate.ec = new_top_ec_node
return func(*args, **kwargs)
finally:
tstate.ec = old_top_ec_node
* The ``contextvars.run_with_execution_context()`` is called, in which
case the current execution context is set to the passed execution
context with a new empty logical context appended to it::
def run_with_execution_context(ec, func, *args, **kwargs):
tstate = PyThreadState_Get()
old_top_ec_node = tstate.ec
new_lc = contextvars.LogicalContext()
new_top_ec_node = ec_node(prev=ec, lc=new_lc)
try:
tstate.ec = new_top_ec_node
return func(*args, **kwargs)
finally:
tstate.ec = old_top_ec_node
* One of ``genobj.send()``, ``genobj.throw()``, or ``genobj.close()``
is called on a ``genobj`` generator, in which case the logical
context recorded in ``genobj`` is pushed onto the stack::
PyGen_New(PyGenObject *gen):
if (gen.gi_code.co_flags &
(CO_COROUTINE | CO_ITERABLE_COROUTINE)):
# gen is an 'async def' coroutine, or a generator
# decorated with @types.coroutine.
gen.__logical_context__ = None
else:
# Non-coroutine generator
gen.__logical_context__ = contextvars.LogicalContext()
gen_send(PyGenObject *gen, ...):
tstate = PyThreadState_Get()
if gen.__logical_context__ is not None:
old_top_ec_node = tstate.ec
new_top_ec_node = ec_node(
prev=old_top_ec_node,
lc=gen.__logical_context__)
try:
tstate.ec = new_top_ec_node
return _gen_send_impl(gen, ...)
finally:
gen.__logical_context__ = tstate.ec.lc
tstate.ec = old_top_ec_node
else:
return _gen_send_impl(gen, ...)
* Coroutines and asynchronous generators share the implementation
with generators, and the above changes apply to them as well.
In certain scenarios the EC may need to be squashed to limit the
size of the chain. For example, consider the following corner case::
async def repeat(coro, delay):
await coro()
await asyncio.sleep(delay)
loop.create_task(repeat(coro, delay))
async def ping():
print('ping')
loop = asyncio.get_event_loop()
loop.create_task(repeat(ping, 1))
loop.run_forever()
In the above code, the EC chain will grow as long as ``repeat()`` is
called. Each new task will call
``contextvars.run_with_execution_context()``, which will append a new
logical context to the chain. To prevent unbounded growth,
``contextvars.get_execution_context()`` checks if the chain
is longer than a predetermined maximum, and if it is, squashes the
chain into a single LC::
def get_execution_context():
tstate = PyThreadState_Get()
if tstate.ec_len > EC_LEN_MAX:
squashed_lc = contextvars.LogicalContext()
ec_node = tstate.ec
while ec_node:
# The LC.merge() method does not replace
# existing keys.
squashed_lc = squashed_lc.merge(ec_node.lc)
ec_node = ec_node.prev
return ec_node(prev=NULL, lc=squashed_lc)
else:
return tstate.ec
Logical Context
---------------
Logical context is an immutable weak-key mapping which has the
following properties with respect to garbage collection:
* ``ContextVar`` objects are strongly-referenced only from the
application code, not from any of the execution context machinery
or values they point to. This means that there are no reference
cycles that could extend their lifespan longer than necessary, or
prevent their collection by the GC.
* Values put in the execution context are guaranteed to be kept
alive while there is a ``ContextVar`` key referencing them in
the thread.
* If a ``ContextVar`` is garbage collected, all of its values will
be removed from all contexts, allowing them to be GCed if needed.
* If an OS thread has ended its execution, its thread state will be
cleaned up along with its execution context, cleaning
up all values bound to all context variables in the thread.
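The weak-key behaviour described by these bullets can be sketched with
the standard library's ``weakref.WeakKeyDictionary`` (a mutable
stand-in, used here only to demonstrate the garbage-collection
property; the real logical context is immutable):

```python
import gc
import weakref

class ContextKey:
    # Stand-in for a ContextVar used purely as a weak mapping key.
    pass

# A logical context with weak keys: a value stays alive only while
# application code holds a strong reference to its key.
lc = weakref.WeakKeyDictionary()

key = ContextKey()
lc[key] = 'some value'
assert len(lc) == 1

# Drop the last strong reference to the key; the entry (and thus the
# value) becomes collectable, as the bullets above require.
del key
gc.collect()
assert len(lc) == 0
```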
As discussed earlier, we need ``contextvars.get_execution_context()``
to be consistently fast regardless of the size of the execution
context, so logical context is necessarily an immutable mapping.
Choosing ``dict`` for the underlying implementation is suboptimal,
because ``LC.set()`` will cause ``dict.copy()``, which is an O(N)
operation, where *N* is the number of items in the LC.
``get_execution_context()``, when squashing the EC, is an O(M)
operation, where *M* is the total number of context variable values
in the EC.
So, instead of ``dict``, we choose Hash Array Mapped Trie (HAMT)
as the underlying implementation of logical contexts. (Scala and
Clojure use HAMT to implement high performance immutable collections
[5]_, [6]_.)
With HAMT ``.set()`` becomes an O(log N) operation, and
``get_execution_context()`` squashing is more efficient on average due
to structural sharing in HAMT.
See `Appendix: HAMT Performance Analysis`_ for a more elaborate
analysis of HAMT performance compared to ``dict``.
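To make the cost model concrete, here is a toy immutable mapping with
the LC-style interface assumed above (``set()``/``delete()`` return
*new* maps); it is backed by ``dict.copy()``, i.e. O(N) per update,
which is precisely the cost a HAMT reduces to O(log N) through
structural sharing (the class and its names are illustrative, not part
of the proposal):

```python
class ImmutableMap:
    # Toy immutable mapping with the LC interface assumed by the PEP.
    def __init__(self, data=None):
        self._data = dict(data or {})

    def set(self, key, value):
        new = self._data.copy()  # O(N) copy; a HAMT would share structure
        new[key] = value
        return ImmutableMap(new)

    def delete(self, key):
        new = self._data.copy()
        del new[key]
        return ImmutableMap(new)

    def __contains__(self, key):
        return key in self._data

    def __getitem__(self, key):
        return self._data[key]

m1 = ImmutableMap()
m2 = m1.set('spam', 1)
assert 'spam' not in m1  # m1 is unchanged by the update
assert m2['spam'] == 1
```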
Context Variables
-----------------
The ``ContextVar.get()`` and ``ContextVar.set()`` methods are
implemented as follows (in pseudo-code)::
    class ContextVar:

        def get(self, *, default=None, topmost=False):
            tstate = PyThreadState_Get()

            ec_node = tstate.ec
            while ec_node:
                if self in ec_node.lc:
                    return ec_node.lc[self]
                if topmost:
                    break
                ec_node = ec_node.prev

            return default

        def set(self, value):
            tstate = PyThreadState_Get()
            top_ec_node = tstate.ec

            if top_ec_node is not None:
                top_lc = top_ec_node.lc
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=top_ec_node.prev,
                    lc=new_top_lc)
            else:
                # First ContextVar.set() in this OS thread.
                top_lc = contextvars.LogicalContext()
                new_top_lc = top_lc.set(self, value)
                tstate.ec = ec_node(
                    prev=NULL,
                    lc=new_top_lc)

        def delete(self):
            tstate = PyThreadState_Get()
            top_ec_node = tstate.ec

            if top_ec_node is None:
                raise LookupError

            top_lc = top_ec_node.lc
            if self not in top_lc:
                raise LookupError

            new_top_lc = top_lc.delete(self)
            tstate.ec = ec_node(
                prev=top_ec_node.prev,
                lc=new_top_lc)
For efficient access in performance-sensitive code paths, such as in
``numpy`` and ``decimal``, we cache lookups in ``ContextVar.get()``,
making it an O(1) operation when the cache is hit. The cache key is
composed of the following:
* The new ``uint64_t PyThreadState->unique_id``, which is a globally
unique thread state identifier. It is computed from the new
``uint64_t PyInterpreterState->ts_counter``, which is incremented
whenever a new thread state is created.
* The new ``uint64_t PyThreadState->stack_version``, which is a
thread-specific counter, which is incremented whenever a non-empty
logical context is pushed onto the stack or popped from the stack.
* The ``uint64_t ContextVar->version`` counter, which is incremented
whenever the context variable value is changed in any logical
context in any OS thread.
The cache is then implemented as follows::
    class ContextVar:

        def set(self, value):
            ...  # implementation
            self.version += 1

        def get(self, *, default=None, topmost=False):
            if topmost:
                return self._get_uncached(
                    default=default, topmost=topmost)

            tstate = PyThreadState_Get()
            if (self.last_tstate_id == tstate.unique_id and
                    self.last_stack_version == tstate.stack_version and
                    self.last_version == self.version):
                return self.last_value

            value = self._get_uncached(default=default)

            self.last_value = value  # borrowed ref
            self.last_tstate_id = tstate.unique_id
            self.last_stack_version = tstate.stack_version
            self.last_version = self.version
            return value
Note that ``last_value`` is a borrowed reference. We assume that
if the version checks are fine, the value object will be alive.
This allows the values of context variables to be properly garbage
collected.
This generic caching approach is similar to what the current C
implementation of ``decimal`` does to cache the current decimal
context, and has similar performance characteristics.
Performance Considerations
==========================
Tests of the reference implementation based on the prior
revisions of this PEP have shown 1-2% slowdown on generator
microbenchmarks and no noticeable difference in macrobenchmarks.
The performance of non-generator and non-async code is not
affected by this PEP.
Summary of the New APIs
=======================
Python
------
The following new Python APIs are introduced by this PEP:
1. The new ``contextvars.ContextVar(name: str='...')`` class,
instances of which have the following:
* the read-only ``.name`` attribute,
* the ``.get()`` method, which returns the value of the variable
in the current execution context;
* the ``.set()`` method, which sets the value of the variable in
the current logical context;
* the ``.delete()`` method, which removes the value of the variable
from the current logical context.
2. The new ``contextvars.ExecutionContext()`` class, which represents
an execution context.
3. The new ``contextvars.LogicalContext()`` class, which represents
a logical context.
4. The new ``contextvars.get_execution_context()`` function, which
returns an ``ExecutionContext`` instance representing a copy of
the current execution context.
5. The ``contextvars.run_with_execution_context(ec: ExecutionContext,
func, *args, **kwargs)`` function, which runs *func* with the
provided execution context.
6. The ``contextvars.run_with_logical_context(lc: LogicalContext,
func, *args, **kwargs)`` function, which runs *func* with the
provided logical context on top of the current execution context.
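As a rough illustration of how these APIs compose, the following
single-threaded toy model (plain dicts standing in for logical
contexts and a list for the execution context; none of this is the
real API or implementation, and real LCs are immutable HAMTs) shows a
``run_with_logical_context``-style call isolating a ``set()``:

```python
# Toy execution context: a stack of dict-based "logical contexts".
_ec = [{}]

class ContextVar:
    def __init__(self, name):
        self.name = name

    def get(self, *, default=None, topmost=False):
        # Traverse from the topmost logical context down.
        for lc in reversed(_ec):
            if self in lc:
                return lc[self]
            if topmost:
                break
        return default

    def set(self, value):
        _ec[-1][self] = value  # only the topmost LC is affected

    def delete(self):
        if self not in _ec[-1]:
            raise LookupError(self.name)
        del _ec[-1][self]

def run_with_logical_context(func, *args):
    # Simplified: push a fresh LC, run func, pop it afterwards,
    # so context changes made by func stay isolated.
    _ec.append({})
    try:
        return func(*args)
    finally:
        _ec.pop()

var = ContextVar('var')
var.set('outer')

def inner():
    var.set('inner')
    return var.get()

assert run_with_logical_context(inner) == 'inner'
assert var.get() == 'outer'  # inner's change did not leak
```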
C API
-----
1. ``PyContextVar * PyContext_NewVar(char *desc)``: create a
``PyContextVar`` object.
2. ``PyObject * PyContext_GetValue(PyContextVar *, int topmost)``:
return the value of the variable in the current execution context.
3. ``int PyContext_SetValue(PyContextVar *, PyObject *)``: set
the value of the variable in the current logical context.
4. ``int PyContext_DelValue(PyContextVar *)``: delete the value of
the variable from the current logical context.
5. ``PyLogicalContext * PyLogicalContext_New()``: create a new empty
``PyLogicalContext``.
6. ``PyExecutionContext * PyExecutionContext_New()``: create a new
empty ``PyExecutionContext``.
7. ``PyExecutionContext * PyExecutionContext_Get()``: return the
current execution context.
8. ``int PyContext_SetCurrent(
PyExecutionContext *, PyLogicalContext *)``: set the
passed EC object as the current execution context for the active
thread state, and/or set the passed LC object as the current
logical context.
Design Considerations
=====================
Should "yield from" leak context changes?
-----------------------------------------
No. It may be argued that ``yield from`` is semantically
equivalent to calling a function, and should leak context changes.
However, it is not possible to satisfy the following at the same time:
* ``next(gen)`` *does not* leak context changes made in ``gen``, and
* ``yield from gen`` *leaks* context changes made in ``gen``.
The reason is that ``yield from`` can be used with a partially
iterated generator, which already has local context changes::
    var = contextvars.ContextVar('var')

    def gen():
        for i in range(10):
            var.set('gen')
            yield i

    def outer_gen():
        var.set('outer_gen')

        g = gen()
        yield next(g)
        # Changes not visible during partial iteration,
        # the goal of this PEP:
        assert var.get() == 'outer_gen'

        yield from g
        assert var.get() == 'outer_gen'  # or 'gen'?
Another example would be refactoring of an explicit ``for..in yield``
construct to a ``yield from`` expression. Consider the following
code::
    def outer_gen():
        var.set('outer_gen')
        for i in gen():
            yield i
        assert var.get() == 'outer_gen'

which we want to refactor to use ``yield from``::

    def outer_gen():
        var.set('outer_gen')
        yield from gen()
        assert var.get() == 'outer_gen'  # or 'gen'?
The above examples illustrate that it is unsafe to refactor
generator code using ``yield from`` when it can leak context changes.
Thus, the only well-defined and consistent behaviour is to
**always** isolate context changes in generators, regardless of
how they are being iterated.
Should ``PyThreadState_GetDict()`` use the execution context?
-------------------------------------------------------------
No. ``PyThreadState_GetDict`` is based on TLS, and changing its
semantics will break backwards compatibility.
PEP 521
-------
:pep:`521` proposes an alternative solution to the problem, which
extends the context manager protocol with two new methods:
``__suspend__()`` and ``__resume__()``. Similarly, the asynchronous
context manager protocol is also extended with ``__asuspend__()`` and
``__aresume__()``.
This allows implementing context managers that manage non-local state,
which behave correctly in generators and coroutines.
For example, consider the following context manager, which uses
execution state::
    class Context:

        def __init__(self):
            self.var = contextvars.ContextVar('var')

        def __enter__(self):
            self.old_x = self.var.get()
            self.var.set('something')

        def __exit__(self, *err):
            self.var.set(self.old_x)
An equivalent implementation with PEP 521::
    local = threading.local()

    class Context:

        def __enter__(self):
            self.old_x = getattr(local, 'x', None)
            local.x = 'something'

        def __suspend__(self):
            local.x = self.old_x

        def __resume__(self):
            local.x = 'something'

        def __exit__(self, *err):
            local.x = self.old_x
The downside of this approach is the addition of significant new
complexity to the context manager protocol and the interpreter
implementation. This approach is also likely to negatively impact
the performance of generators and coroutines.
Additionally, the solution in :pep:`521` is limited to context
managers, and does not provide any mechanism to propagate state in
asynchronous tasks and callbacks.
Can Execution Context be implemented without modifying CPython?
---------------------------------------------------------------
No.
It is true that the concept of "task-locals" can be implemented
for coroutines in libraries (see, for example, [29]_ and [30]_).
On the other hand, generators are managed by the Python interpreter
directly, and so their context must also be managed by the
interpreter.
Furthermore, execution context cannot be implemented in a third-party
module at all; otherwise the standard library, including ``decimal``,
would not be able to rely on it.
Should we update sys.displayhook and other APIs to use EC?
----------------------------------------------------------
APIs like redirecting stdout by overwriting ``sys.stdout``, or
specifying new exception display hooks by overwriting the
``sys.displayhook`` function, affect the whole Python process
**by design**.  Their users assume that the effect of changing
them will be visible across OS threads.  Therefore we cannot
simply make these APIs use the new Execution Context.
That said, we think it is possible to design new APIs that are
context-aware, but that is outside the scope of this PEP.
Greenlets
---------
Greenlet is an alternative implementation of cooperative
scheduling for Python.  Although the greenlet package is not part of
CPython, popular frameworks like gevent rely on it, so it is
important that greenlet can be modified to support execution
contexts.
Conceptually, the behaviour of greenlets is very similar to that of
generators, which means that similar changes around greenlet entry
and exit can be done to add support for execution context. This
PEP provides the necessary C APIs to do that.
Context manager as the interface for modifications
--------------------------------------------------
This PEP concentrates on the low-level mechanics and the minimal
API that enables fundamental operations with execution context.
For developer convenience, a high-level context manager interface
may be added to the ``contextvars`` module. For example::
    with contextvars.set_var(var, 'foo'):
        # ...
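One way such a helper could be built on top of the proposed ``set()``
and ``delete()`` methods (a sketch only: ``set_var`` is hypothetical,
and the tiny stand-in variable below exists just to make the sketch
runnable):

```python
import contextlib

class _ToyVar:
    # Minimal stand-in with the proposed set()/delete() methods;
    # it exists only so this sketch is runnable.
    _MISSING = object()

    def __init__(self):
        self._value = self._MISSING

    def get(self, default=None):
        return default if self._value is self._MISSING else self._value

    def set(self, value):
        self._value = value

    def delete(self):
        if self._value is self._MISSING:
            raise LookupError
        self._value = self._MISSING

@contextlib.contextmanager
def set_var(var, value):
    # Hypothetical helper: set on entry, delete on exit, so the
    # temporary value is removed rather than overwritten.
    var.set(value)
    try:
        yield
    finally:
        var.delete()

var = _ToyVar()
with set_var(var, 'foo'):
    assert var.get() == 'foo'
assert var.get() is None  # the temporary value is gone
```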
Setting and restoring context variables
---------------------------------------
The ``ContextVar.delete()`` method removes the context variable from
the topmost logical context.
If the variable is not found in the topmost logical context, a
``LookupError`` is raised, similarly to ``del var`` raising
``NameError`` when ``var`` is not in scope.
This method is useful when there is a (rare) need to correctly restore
the state of a logical context, such as when a nested generator
wants to modify the logical context *temporarily*::
    var = contextvars.ContextVar('var')

    def gen():
        with some_var_context_manager('gen'):
            # EC = [{var: 'main'}, {var: 'gen'}]
            assert var.get() == 'gen'
            yield

        # EC = [{var: 'main modified'}, {}]
        assert var.get() == 'main modified'
        yield

    def main():
        var.set('main')

        g = gen()
        next(g)

        var.set('main modified')
        next(g)
The above example would work correctly only if there is a way to
delete ``var`` from the logical context in ``gen()``. Setting it
to a "previous value" in ``__exit__()`` would mask changes made
in ``main()`` between the iterations.
Alternative Designs for ContextVar API
--------------------------------------
Logical Context with stacked values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By the design presented in this PEP, logical context is a simple
``LC({ContextVar: value, ...})`` mapping. An alternative
representation is to store a stack of values for each context
variable: ``LC({ContextVar: [val1, val2, ...], ...})``.
The ``ContextVar`` methods would then be:
* ``get(*, default=None)`` -- traverses the stack
of logical contexts, and returns the top value from the
first non-empty logical context;
* ``push(val)`` -- pushes *val* onto the stack of values in the
current logical context;
* ``pop()`` -- pops the top value from the stack of values in
the current logical context.
Compared to the single-value design with the ``set()`` and
``delete()`` methods, the stack-based approach allows for a simpler
implementation of the set/restore pattern. However, the mental
burden of this approach is considered to be higher, since there
would be *two* stacks to consider: a stack of LCs and a stack of
values in each LC.
(This idea was suggested by Nathaniel Smith.)
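To make the alternative concrete, here is a toy model of the
stacked-values interface (illustrative only; a single mutable LC is
assumed for brevity, whereas the real design would stack values inside
each immutable LC):

```python
class StackedVar:
    # Toy model of the stacked-values alternative: each variable
    # keeps a stack of values.
    def __init__(self):
        self._stack = []

    def push(self, val):
        self._stack.append(val)

    def pop(self):
        return self._stack.pop()

    def get(self, *, default=None):
        return self._stack[-1] if self._stack else default

v = StackedVar()
v.push('outer')
v.push('inner')  # temporary override; no delete() bookkeeping needed
assert v.get() == 'inner'
v.pop()
assert v.get() == 'outer'  # the set/restore pattern in two calls
```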
ContextVar "set/reset"
^^^^^^^^^^^^^^^^^^^^^^
Yet another approach is to return a special object from
``ContextVar.set()``, which would represent the modification of
the context variable in the current logical context::
    var = contextvars.ContextVar('var')

    def foo():
        mod = var.set('spam')

        # ... perform work

        mod.reset()  # Reset the value of var to the original value
                     # or remove it from the context.
The critical flaw in this approach is that it becomes possible to
pass context var "modification objects" into code running in a
different execution context, which leads to undefined side effects.
Backwards Compatibility
=======================
This proposal preserves 100% backwards compatibility.
Rejected Ideas
==============
Replication of threading.local() interface
------------------------------------------
Choosing the ``threading.local()``-like interface for context
variables was considered and rejected for the following reasons:
* A survey of the standard library and Django has shown that the
  vast majority of ``threading.local()`` uses involve a single
  attribute, which indicates that the namespace approach is not
  as helpful in the field.
* Using ``__getattr__()`` instead of ``.get()`` for value lookup
does not provide any way to specify the depth of the lookup
(i.e. search only the top logical context).
* Single-value ``ContextVar`` is easier to reason about in terms
  of visibility.  Suppose ``ContextVar()`` is a namespace,
  and consider the following::

      ns = contextvars.ContextVar('ns')

      def gen():
          ns.a = 2
          yield
          assert ns.b == 'bar'  # ??

      def main():
          ns.a = 1
          ns.b = 'foo'

          g = gen()
          next(g)

          # should not see the ns.a modification in gen()
          assert ns.a == 1

          # but should gen() see the ns.b modification made here?
          ns.b = 'bar'
          next(g)
The above example demonstrates that reasoning about the visibility
of different attributes of the same context var is not trivial.
* Single-value ``ContextVar`` allows straightforward implementation
of the lookup cache;
* Single-value ``ContextVar`` interface allows the C-API to be
simple and essentially the same as the Python API.
See also the mailing list discussion: [26]_, [27]_.
Coroutines not leaking context changes by default
-------------------------------------------------
In V4 (`Version History`_) of this PEP, coroutines were considered to
behave exactly like generators with respect to the execution context:
changes in awaited coroutines were not visible in the outer coroutine.
This idea was rejected on the grounds that it breaks the semantic
similarity of the task and thread models and, more specifically,
makes it impossible to reliably implement asynchronous context
managers that modify context vars, since ``__aenter__`` is a
coroutine.
Appendix: HAMT Performance Analysis
===================================
.. figure:: pep-0550-hamt_vs_dict-v2.png
:align: center
:width: 100%
Figure 1. Benchmark code can be found here: [9]_.
The above chart demonstrates that:
* HAMT displays near O(1) performance for all benchmarked
dictionary sizes.
* ``dict.copy()`` becomes very slow around 100 items.
.. figure:: pep-0550-lookup_hamt.png
:align: center
:width: 100%
Figure 2. Benchmark code can be found here: [10]_.
Figure 2 compares the lookup costs of ``dict`` versus a HAMT-based
immutable mapping. HAMT lookup time is 30-40% slower than Python dict
lookups on average, which is a very good result, considering that the
latter is very well optimized.
There is research [8]_ showing that there are further possible
improvements to the performance of HAMT.
The reference implementation of HAMT for CPython can be found here:
[7]_.
Acknowledgments
===============
Thanks to Victor Petrovykh for countless discussions around the topic
and PEP proofreading and edits.
Thanks to Nathaniel Smith for proposing the ``ContextVar`` design
[17]_ [18]_, for pushing the PEP towards a more complete design, and
coming up with the idea of having a stack of contexts in the thread
state.
Thanks to Nick Coghlan for numerous suggestions and ideas on the
mailing list, and for coming up with a case that caused the complete
rewrite of the initial PEP version [19]_.
Version History
===============
1. Initial revision, posted on 11-Aug-2017 [20]_.
2. V2 posted on 15-Aug-2017 [21]_.
The fundamental limitation that caused a complete redesign of the
first version was that it was not possible to implement an iterator
that would interact with the EC in the same way as generators
(see [19]_.)
Version 2 was a complete rewrite, introducing new terminology
(Local Context, Execution Context, Context Item) and new APIs.
3. V3 posted on 18-Aug-2017 [22]_.
Updates:
* Local Context was renamed to Logical Context. The term "local"
was ambiguous and conflicted with local name scopes.
* Context Item was renamed to Context Key, see the thread with Nick
Coghlan, Stefan Krah, and Yury Selivanov [23]_ for details.
* Context Item get cache design was adjusted, per Nathaniel Smith's
idea in [25]_.
* Coroutines are created without a Logical Context; ceval loop
no longer needs to special case the ``await`` expression
(proposed by Nick Coghlan in [24]_.)
4. V4 posted on 25-Aug-2017 [31]_.
* The specification section has been completely rewritten.
* Coroutines now have their own Logical Context. This means
there is no difference between coroutines, generators, and
asynchronous generators w.r.t. interaction with the Execution
Context.
* Context Key renamed to Context Var.
* Removed the distinction between generators and coroutines with
respect to logical context isolation.
5. V5 posted on 01-Sep-2017: the current version.
* Coroutines have no logical context by default (a revert to the V3
semantics). Read about the motivation in the
`Coroutines not leaking context changes by default`_ section.
The `High-Level Specification`_ section was also updated
(specifically Generators and Coroutines subsections).
* All APIs have been placed in the ``contextvars`` module, and
  the factory functions were changed to class constructors
  (``ContextVar``, ``ExecutionContext``, and ``LogicalContext``).
Thanks to Nick for the idea [33]_.
* ``ContextVar.lookup()`` got renamed back to ``ContextVar.get()``
and gained the ``topmost`` and ``default`` keyword arguments.
Added ``ContextVar.delete()``.
See Guido's comment in [32]_.
* Fixed ``ContextVar.get()`` cache bug (thanks Nathaniel!).
* New `Rejected Ideas`_,
`Should "yield from" leak context changes?`_,
`Alternative Designs for ContextVar API`_,
`Setting and restoring context variables`_, and
`Context manager as the interface for modifications`_ sections.
References
==========
.. [1] https://blog.golang.org/context
.. [2] https://msdn.microsoft.com/en-us/library/system.threading.executioncontext.…
.. [3] https://github.com/numpy/numpy/issues/9444
.. [4] http://bugs.python.org/issue31179
.. [5] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
.. [6] http://blog.higher-order.net/2010/08/16/assoc-and-clojures-persistenthashma…
.. [7] https://github.com/1st1/cpython/tree/hamt
.. [8] https://michael.steindorfer.name/publications/oopsla15.pdf
.. [9] https://gist.github.com/1st1/9004813d5576c96529527d44c5457dcd
.. [10] https://gist.github.com/1st1/dbe27f2e14c30cce6f0b5fddfc8c437e
.. [11] https://github.com/1st1/cpython/tree/pep550
.. [12] https://www.python.org/dev/peps/pep-0492/#async-await
.. [13] https://github.com/MagicStack/uvloop/blob/master/examples/bench/echoserver.…
.. [14] https://github.com/MagicStack/pgbench
.. [15] https://github.com/python/performance
.. [16] https://gist.github.com/1st1/6b7a614643f91ead3edf37c4451a6b4c
.. [17] https://mail.python.org/pipermail/python-ideas/2017-August/046752.html
.. [18] https://mail.python.org/pipermail/python-ideas/2017-August/046772.html
.. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046775.html
.. [20] https://github.com/python/peps/blob/e8a06c9a790f39451d9e99e203b13b3ad73a1d0…
.. [21] https://github.com/python/peps/blob/e3aa3b2b4e4e9967d28a10827eed1e9e5960c17…
.. [22] https://github.com/python/peps/blob/287ed87bb475a7da657f950b353c71c1248f67e…
.. [23] https://mail.python.org/pipermail/python-ideas/2017-August/046801.html
.. [24] https://mail.python.org/pipermail/python-ideas/2017-August/046790.html
.. [25] https://mail.python.org/pipermail/python-ideas/2017-August/046786.html
.. [26] https://mail.python.org/pipermail/python-ideas/2017-August/046888.html
.. [27] https://mail.python.org/pipermail/python-ideas/2017-August/046889.html
.. [28] https://docs.python.org/3/library/decimal.html#decimal.Context.abs
.. [29] https://curio.readthedocs.io/en/latest/reference.html#task-local-storage
.. [30] https://docs.atlassian.com/aiolocals/latest/usage.html
.. [31] https://github.com/python/peps/blob/1b8728ded7cde9df0f9a24268574907fafec6d5…
.. [32] https://mail.python.org/pipermail/python-dev/2017-August/149020.html
.. [33] https://mail.python.org/pipermail/python-dev/2017-August/149043.html
Copyright
=========
This document has been placed in the public domain.
Hi everyone,
While looking over the PyLong source code in Objects/longobject.c I came
across the fact that the PyLong object doesn't include implementations of
basic in-place operations such as addition or multiplication:
    [...]
    long_long,  /*nb_int*/
    0,          /*nb_reserved*/
    long_float, /*nb_float*/
    0,          /* nb_inplace_add */
    0,          /* nb_inplace_subtract */
    0,          /* nb_inplace_multiply */
    0,          /* nb_inplace_remainder */
    [...]
While I understand that the immutable nature of this type of object justifies
this approach, I wanted to experiment and see how much performance an inplace
add would bring.
My in-place add will revert to calling the default long_add function when:

- the refcount of the first operand indicates that it's being shared, or
- that operand is one of the preallocated 'small ints',

which should mitigate the effects of not conforming to the PyLong
immutability specification. It also allocates a new PyLong _only_ in case
of a potential overflow.
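Both guard conditions can be observed from Python itself (this is
CPython-specific behaviour, and the exact refcount threshold used by
the patch is not shown here):

```python
import sys

# CPython preallocates small ints (currently -5..256), so many names
# can share one object; mutating such an object in place would
# corrupt unrelated code.
a = 5
b = 5
assert a is b  # both names refer to the same preallocated object

# A shared operand is also detectable through its reference count,
# which is the other guard described above.
x = 10 ** 30              # a big int, not preallocated
refs = sys.getrefcount(x)
assert refs >= 2          # 'x' itself plus the getrefcount() argument
```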
The workload I used to evaluate this is a simple script that does a lot of
inplace adding:
import time
import sys

def write_progress(prev_percentage, value, limit):
    percentage = (100 * value) // limit
    if percentage != prev_percentage:
        sys.stdout.write("%d%%\r" % (percentage))
        sys.stdout.flush()
    return percentage

progress = -1
the_value = 0
the_increment = ((1 << 30) - 1)
crt_iter = 0
total_iters = 10 ** 9

start = time.time()
while crt_iter < total_iters:
    the_value += the_increment
    crt_iter += 1
    progress = write_progress(progress, crt_iter, total_iters)
end = time.time()

print("\n%.3fs" % (end - start))
print("the_value: %d" % (the_value))
Running the baseline version outputs:
./python inplace.py
100%
356.633s
the_value: 1073741823000000000
Running the modified version outputs:
./python inplace.py
100%
308.606s
the_value: 1073741823000000000
In summary, the modified version reduces the running time by 13.47%.
The CPython revision I'm using is 7f066844a79ea201a28b9555baf4bceded90484f
from the master branch, and I'm running on an i7-6700K CPU with Turbo Boost
disabled (frequency pinned at 4GHz).
Do you think such an optimization would be a good approach?
Thank you,
Catalin
ACTIVITY SUMMARY (2017-08-25 - 2017-09-01)
Python tracker at http://bugs.python.org/
To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas:
open 6166 (+17)
closed 36910 (+30)
total 43076 (+47)
Open issues with patches: 2339
Issues opened (35)
==================
#30581: os.cpu_count() returns wrong number of processors on system wi
http://bugs.python.org/issue30581 reopened by haypo
#30776: regrtest: change -R/--huntrleaks rule to decide if a test leak
http://bugs.python.org/issue30776 reopened by pitrou
#31250: test_asyncio leaks dangling threads
http://bugs.python.org/issue31250 reopened by haypo
#31281: fileinput inplace does not work with pathlib.Path
http://bugs.python.org/issue31281 opened by zmwangx
#31282: C APIs called without GIL in PyOS_Readline
http://bugs.python.org/issue31282 opened by xiang.zhang
#31284: IDLE: Make GUI test teardown less fragile
http://bugs.python.org/issue31284 opened by csabella
#31285: a SystemError and an assertion failure in warnings.warn_explic
http://bugs.python.org/issue31285 opened by Oren Milman
#31288: IDLE tests: don't modify tkinter.messagebox.
http://bugs.python.org/issue31288 opened by terry.reedy
#31289: File paths in exception traceback resolve symlinks
http://bugs.python.org/issue31289 opened by Paul Pinterits
#31290: segfault on missing library symbol
http://bugs.python.org/issue31290 opened by immortalplants
#31292: `python setup.py check --restructuredtext` fails when a includ
http://bugs.python.org/issue31292 opened by flying sheep
#31293: crashes in multiply_float_timedelta() and in truedivide_timede
http://bugs.python.org/issue31293 opened by Oren Milman
#31294: ZeroMQSocketListener and ZeroMQSocketHandler examples in the L
http://bugs.python.org/issue31294 opened by pablogsal
#31296: support pty.fork and os.forkpty actions in posix subprocess mo
http://bugs.python.org/issue31296 opened by gregory.p.smith
#31297: Unpickleable ModuleImportError in unittest patch not backporte
http://bugs.python.org/issue31297 opened by Rachel Tobin
#31298: Error when calling numpy.astype
http://bugs.python.org/issue31298 opened by droth
#31299: Add "ignore_modules" option to TracebackException.format()
http://bugs.python.org/issue31299 opened by ncoghlan
#31301: Python 2.7 SIGSEGV
http://bugs.python.org/issue31301 opened by cody
#31302: smtplib on linux fails to log in correctly
http://bugs.python.org/issue31302 opened by murphdasurf
#31304: Update doc for starmap_async error_back kwarg
http://bugs.python.org/issue31304 opened by tamas
#31305: 'pydoc -w import' report "no Python documentation found for 'i
http://bugs.python.org/issue31305 opened by limuyuan
#31306: IDLE, configdialog, General tab: validate user entries
http://bugs.python.org/issue31306 opened by terry.reedy
#31307: ConfigParser.read silently fails if filenames argument is a by
http://bugs.python.org/issue31307 opened by vxgmichel
#31308: forkserver process isn't re-launched if it died
http://bugs.python.org/issue31308 opened by pitrou
#31310: semaphore tracker isn't protected against crashes
http://bugs.python.org/issue31310 opened by pitrou
#31311: a SystemError and a crash in PyCData_setstate() when __dict__
http://bugs.python.org/issue31311 opened by Oren Milman
#31313: Feature Add support of os.chflags() on Linux platform
http://bugs.python.org/issue31313 opened by socketpair
#31314: email throws exception with oversized header input
http://bugs.python.org/issue31314 opened by doko
#31315: assertion failure in imp.create_dynamic(), when spec.name is n
http://bugs.python.org/issue31315 opened by Oren Milman
#31319: Rename idlelib to just idle
http://bugs.python.org/issue31319 opened by rhettinger
#31320: test_ssl logs a traceback
http://bugs.python.org/issue31320 opened by haypo
#31321: traceback.clear_frames() doesn't clear *all* frames
http://bugs.python.org/issue31321 opened by haypo
#31323: test_ssl: reference cycle between ThreadedEchoServer and its C
http://bugs.python.org/issue31323 opened by haypo
#31324: support._match_test() used by test.bisect is very inefficient
http://bugs.python.org/issue31324 opened by haypo
#31325: req_rate is a namedtuple type rather than instance
http://bugs.python.org/issue31325 opened by gvx
Most recent 15 issues with no replies (15)
==========================================
#31325: req_rate is a namedtuple type rather than instance
http://bugs.python.org/issue31325
#31323: test_ssl: reference cycle between ThreadedEchoServer and its C
http://bugs.python.org/issue31323
#31314: email throws exception with oversized header input
http://bugs.python.org/issue31314
#31310: semaphore tracker isn't protected against crashes
http://bugs.python.org/issue31310
#31308: forkserver process isn't re-launched if it died
http://bugs.python.org/issue31308
#31307: ConfigParser.read silently fails if filenames argument is a by
http://bugs.python.org/issue31307
#31306: IDLE, configdialog, General tab: validate user entries
http://bugs.python.org/issue31306
#31305: 'pydoc -w import' report "no Python documentation found for 'i
http://bugs.python.org/issue31305
#31304: Update doc for starmap_async error_back kwarg
http://bugs.python.org/issue31304
#31301: Python 2.7 SIGSEGV
http://bugs.python.org/issue31301
#31299: Add "ignore_modules" option to TracebackException.format()
http://bugs.python.org/issue31299
#31298: Error when calling numpy.astype
http://bugs.python.org/issue31298
#31296: support pty.fork and os.forkpty actions in posix subprocess mo
http://bugs.python.org/issue31296
#31294: ZeroMQSocketListener and ZeroMQSocketHandler examples in the L
http://bugs.python.org/issue31294
#31290: segfault on missing library symbol
http://bugs.python.org/issue31290
Most recent 15 issues waiting for review (15)
=============================================
#31310: semaphore tracker isn't protected against crashes
http://bugs.python.org/issue31310
#31308: forkserver process isn't re-launched if it died
http://bugs.python.org/issue31308
#31270: Simplify documentation of itertools.zip_longest
http://bugs.python.org/issue31270
#31185: Miscellaneous errors in asyncio speedup module
http://bugs.python.org/issue31185
#31184: Fix data descriptor detection in inspect.getattr_static
http://bugs.python.org/issue31184
#31179: Speed-up dict.copy() up to 5.5 times.
http://bugs.python.org/issue31179
#31178: [EASY] subprocess: TypeError: can't concat str to bytes, in _e
http://bugs.python.org/issue31178
#31175: Exception while extracting file from ZIP with non-matching fil
http://bugs.python.org/issue31175
#31151: socketserver.ForkingMixIn.server_close() leaks zombie processe
http://bugs.python.org/issue31151
#31120: [2.7] Python 64 bit _ssl compile fails due missing buildinf_am
http://bugs.python.org/issue31120
#31113: Stack overflow with large program
http://bugs.python.org/issue31113
#31106: os.posix_fallocate() generate exception with errno 0
http://bugs.python.org/issue31106
#31065: Documentation for Popen.poll is unclear
http://bugs.python.org/issue31065
#31051: IDLE, configdialog, General tab: re-arrange, test user entries
http://bugs.python.org/issue31051
#31046: ensurepip does not honour the value of $(prefix)
http://bugs.python.org/issue31046
Top 10 most discussed issues (10)
=================================
#10746: ctypes c_long & c_bool have incorrect PEP-3118 type codes
http://bugs.python.org/issue10746 5 msgs
#30776: regrtest: change -R/--huntrleaks rule to decide if a test leak
http://bugs.python.org/issue30776 5 msgs
#31284: IDLE: Make GUI test teardown less fragile
http://bugs.python.org/issue31284 5 msgs
#31051: IDLE, configdialog, General tab: re-arrange, test user entries
http://bugs.python.org/issue31051 4 msgs
#31250: test_asyncio leaks dangling threads
http://bugs.python.org/issue31250 4 msgs
#31271: an assertion failure in io.TextIOWrapper.write
http://bugs.python.org/issue31271 4 msgs
#31282: C APIs called without GIL in PyOS_Readline
http://bugs.python.org/issue31282 4 msgs
#31293: crashes in multiply_float_timedelta() and in truedivide_timede
http://bugs.python.org/issue31293 4 msgs
#31313: Feature Add support of os.chflags() on Linux platform
http://bugs.python.org/issue31313 4 msgs
#31315: assertion failure in imp.create_dynamic(), when spec.name is n
http://bugs.python.org/issue31315 4 msgs
Issues closed (30)
==================
#5001: Remove assertion-based checking in multiprocessing
http://bugs.python.org/issue5001 closed by pitrou
#23835: configparser does not convert defaults to strings
http://bugs.python.org/issue23835 closed by lukasz.langa
#28261: wrong error messages when using PyArg_ParseTuple to parse norm
http://bugs.python.org/issue28261 closed by serhiy.storchaka
#29741: BytesIO methods don't accept integer types, while StringIO cou
http://bugs.python.org/issue29741 closed by steve.dower
#30617: IDLE: Add docstrings and unittests to outwin.py
http://bugs.python.org/issue30617 closed by terry.reedy
#30781: IDLE: configdialog -- switch to ttk widgets.
http://bugs.python.org/issue30781 closed by terry.reedy
#31108: add __contains__ for list_iterator (and others) for better per
http://bugs.python.org/issue31108 closed by serhiy.storchaka
#31191: Fix grammar in threading.Barrier docs
http://bugs.python.org/issue31191 closed by Mariatta
#31209: MappingProxyType can not be pickled
http://bugs.python.org/issue31209 closed by Alex Hayes
#31217: test_code leaked [1, 1, 1] memory blocks on x86 Gentoo Refleak
http://bugs.python.org/issue31217 closed by haypo
#31237: test_gdb disables 25% of tests in optimized builds
http://bugs.python.org/issue31237 closed by lukasz.langa
#31243: checks whether PyArg_ParseTuple returned a negative int
http://bugs.python.org/issue31243 closed by serhiy.storchaka
#31249: test_concurrent_futures leaks dangling threads
http://bugs.python.org/issue31249 closed by haypo
#31272: typing module conflicts with __slots__-classes
http://bugs.python.org/issue31272 closed by levkivskyi
#31275: Check fall-through in _codecs_iso2022.c
http://bugs.python.org/issue31275 closed by skrah
#31279: Squash new gcc warning (-Wstringop-overflow)
http://bugs.python.org/issue31279 closed by skrah
#31280: Namespace packages in directories added to path aren't importa
http://bugs.python.org/issue31280 closed by j1m
#31283: Inconsistent behaviours with explicit and implicit inheritance
http://bugs.python.org/issue31283 closed by r.david.murray
#31286: import in finally results in SystemError
http://bugs.python.org/issue31286 closed by serhiy.storchaka
#31287: IDLE configdialog tests: don't modify tkinter.messagebox.
http://bugs.python.org/issue31287 closed by terry.reedy
#31291: zipimport.zipimporter.get_data() crashes when path.replace() r
http://bugs.python.org/issue31291 closed by brett.cannon
#31295: typo in __hash__ docs
http://bugs.python.org/issue31295 closed by r.david.murray
#31300: Traceback prints different code than the running module
http://bugs.python.org/issue31300 closed by r.david.murray
#31303: xml.etree.ElementTree fails to parse a document (regression)
http://bugs.python.org/issue31303 closed by serhiy.storchaka
#31309: Tkinter root window does not close if used with matplotlib.pyp
http://bugs.python.org/issue31309 closed by terry.reedy
#31312: Build differences caused by the time stamps
http://bugs.python.org/issue31312 closed by r.david.murray
#31316: Frequent *** stack smashing detected *** with Python 3.6.2/mei
http://bugs.python.org/issue31316 closed by pitrou
#31317: Memory leak in dict with shared keys
http://bugs.python.org/issue31317 closed by haypo
#31318: On Windows importlib.util.find_spec("re") results in Attribute
http://bugs.python.org/issue31318 closed by steve.dower
#31322: SimpleNamespace deep copy
http://bugs.python.org/issue31322 closed by Pritish Patil

PEP 539 (second round): A new C API for Thread-Local Storage in CPython
by Masayuki YAMAMOTO Sept. 1, 2017
Hi python-dev,
Since Erik started the PEP 539 thread on python-ideas, I have collected the
feedback from that discussion and from the pull request, and have tried to
improve the API specification and the reference implementation; as a result,
I believe the issues pointed out in the feedback have been resolved.
It is probably not finished yet, though; one point still bothers me. I am not
sure about the CPython startup sequence design (PEP 432, "Restructuring the
CPython startup sequence", might conflict with the draft specification [1]).
Please let me know what you think about the new API specification. In any
case, I am starting a new thread for the updated draft.
Summary of technical changes:
- The two functions corresponding to PyThread_delete_key_value and
PyThread_ReInitTLS are omitted, because they existed only for CPython's
own (now removed) TLS implementation.
- Added an internal field "_is_initialized" and a constant default value
"Py_tss_NEEDS_INIT" to the Py_tss_t type, to indicate a thread key's
initialization state independently of the underlying implementation.
- Defined the behavior of the functions that use the "_is_initialized"
field.
- Changed the key argument to be passed as a pointer, allowing use in the
limited API, which does not know the size of the key type.
- Added three functions for dynamic (de-)allocation and for checking a
key's initialization state, since the struct is opaque.
- Changed platform support: when thread support is enabled, all platforms
are required to provide at least one native thread implementation.
The draft also adds explanations and rationales for the above changes, as
well as additional annotations for information.
Regards,
Masayuki
[1]: The specifications for thread key creation and deletion are based on
how the API clients use them (Modules/_tracemalloc.c and Python/pystate.c).
For one of those callers, the Py_Initialize function, which is the original
caller of PyThread_tss_create, the flow was "no-op when called for a second
time" up to CPython 3.6 [2]. However, the internal function
_Py_InitializeCore, newly added on the current master branch, follows the
flow "fatal error when called for a second time" [3].
[2]: https://docs.python.org/3.6/c-api/init.html#c.Py_Initialize
[3]: https://github.com/python/cpython/blob/master/Python/pylifecycle.c#L508
First round for PEP 539:
https://mail.python.org/pipermail/python-ideas/2016-December/043983.html
Discussion for the issue:
https://bugs.python.org/issue25658
HTML version for PEP 539 draft:
https://www.python.org/dev/peps/pep-0539/
Diff between first round and second round:
https://gist.github.com/ma8ma/624f9e4435ebdb26230130b11ce12d20/revisions
And the pull-request for reference implementation (work in progress):
https://github.com/python/cpython/pull/1362
========================================
PEP: 539
Title: A New C-API for Thread-Local Storage in CPython
Version: $Revision$
Last-Modified: $Date$
Author: Erik M. Bray, Masayuki Yamamoto
BDFL-Delegate: Nick Coghlan
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 20-Dec-2016
Post-History: 16-Dec-2016
Abstract
========
The proposal is to add a new Thread Local Storage (TLS) API to CPython which
would supersede use of the existing TLS API within the CPython interpreter,
while deprecating the existing API. The new API is named "Thread Specific
Storage (TSS) API" (see `Rationale for Proposed Solution`_ for the origin of
the name).
Because the existing TLS API is only used internally (it is not mentioned in
the documentation, and the header that defines it, ``pythread.h``, is not
included in ``Python.h`` either directly or indirectly), this proposal
probably only affects CPython, but might also affect other interpreter
implementations (PyPy?) that implement parts of the CPython API.
This is motivated primarily by the fact that the old API uses ``int`` to
represent TLS keys across all platforms, which is neither POSIX-compliant,
nor portable in any practical sense [1]_.
.. note::
Throughout this document the acronym "TLS" refers to Thread Local
Storage and should not be confused with "Transportation Layer Security"
protocols.
Specification
=============
The current API for TLS used inside the CPython interpreter consists of 6
functions::
PyAPI_FUNC(int) PyThread_create_key(void)
PyAPI_FUNC(void) PyThread_delete_key(int key)
PyAPI_FUNC(int) PyThread_set_key_value(int key, void *value)
PyAPI_FUNC(void *) PyThread_get_key_value(int key)
PyAPI_FUNC(void) PyThread_delete_key_value(int key)
PyAPI_FUNC(void) PyThread_ReInitTLS(void)
These would be superseded by a new set of analogous functions::
PyAPI_FUNC(int) PyThread_tss_create(Py_tss_t *key)
PyAPI_FUNC(void) PyThread_tss_delete(Py_tss_t *key)
PyAPI_FUNC(int) PyThread_tss_set(Py_tss_t *key, void *value)
PyAPI_FUNC(void *) PyThread_tss_get(Py_tss_t *key)
The specification also adds a few new features:
* A new type ``Py_tss_t``--an opaque type the definition of which may
depend on the underlying TLS implementation. It is defined::
typedef struct {
bool _is_initialized;
NATIVE_TSS_KEY_T _key;
} Py_tss_t;
where ``NATIVE_TSS_KEY_T`` is a macro whose value depends on the
underlying native TLS implementation (e.g. ``pthread_key_t``).
* A constant default value for ``Py_tss_t`` variables,
``Py_tss_NEEDS_INIT``.
* Three new functions::
PyAPI_FUNC(Py_tss_t *) PyThread_tss_alloc(void)
PyAPI_FUNC(void) PyThread_tss_free(Py_tss_t *key)
PyAPI_FUNC(bool) PyThread_tss_is_created(Py_tss_t *key)
The first two are needed for dynamic (de-)allocation of a ``Py_tss_t``,
particularly in extension modules built with ``Py_LIMITED_API``, where
static allocation of this type is not possible due to its implementation
being opaque at build time. A value returned by ``PyThread_tss_alloc`` is
in the same state as a value initialized with ``Py_tss_NEEDS_INIT``, or is
``NULL`` if dynamic allocation fails. ``PyThread_tss_free`` preventively
calls ``PyThread_tss_delete`` before freeing, and is a no-op if the value
pointed to by the ``key`` argument is ``NULL``. ``PyThread_tss_is_created``
returns ``true`` if the given ``Py_tss_t`` has been initialized (i.e. by
``PyThread_tss_create``).
The new TSS API does not provide functions corresponding to
``PyThread_delete_key_value`` and ``PyThread_ReInitTLS``, because these
functions existed only for CPython's own, now removed, TLS implementation;
accordingly, the existing API's behavior has become:
``PyThread_delete_key_value(key)``
is equivalent to ``PyThread_set_key_value(key, NULL)``, and
``PyThread_ReInitTLS()``
is a no-op [8]_.
The new ``PyThread_tss_`` functions are almost exactly analogous to their
original counterparts with a few minor differences: Whereas
``PyThread_create_key`` takes no arguments and returns a TLS key as an
``int``, ``PyThread_tss_create`` takes a ``Py_tss_t*`` as an argument and
returns an ``int`` status code. The behavior of ``PyThread_tss_create`` is
undefined if the value pointed to by the ``key`` argument is not initialized
by ``Py_tss_NEEDS_INIT``. The returned status code is zero on success
and non-zero on failure. The meanings of non-zero status codes are not
otherwise defined by this specification.
Similarly the other ``PyThread_tss_`` functions are passed a ``Py_tss_t*``
whereas previously the key was passed by value. This change is necessary, as
being an opaque type, the ``Py_tss_t`` type could hypothetically be almost
any size. This is especially necessary for extension modules built with
``Py_LIMITED_API``, where the size of the type is not known. Except for
``PyThread_tss_free``, the behaviors of ``PyThread_tss_`` are undefined if
the value pointed to by the ``key`` argument is ``NULL``.
Moreover, because of the use of ``Py_tss_t`` instead of ``int``, some
behaviors of the existing API design are carried over into the new API:
TSS key creation and deletion follow a "do-if-needed" flow and are silently
skipped if already done. Calling ``PyThread_tss_create`` with an
already-initialized key does nothing and immediately returns success;
likewise, calling ``PyThread_tss_delete`` with an uninitialized key is a
no-op.
The behavior of ``PyThread_tss_delete`` is defined to change the key's
initialization state to "uninitialized" in order to restart the CPython
interpreter without terminating the process (e.g. embedding Python in an
application) [12]_.
The old ``PyThread_*_key*`` functions will be marked as deprecated in the
documentation, but will not generate runtime deprecation warnings.
Additionally, on platforms where ``sizeof(pthread_key_t) != sizeof(int)``,
``PyThread_create_key`` will return immediately with a failure status, and
the other TLS functions will all be no-ops on such platforms.
Comparison of API Specification
-------------------------------
================= ============================= =============================
API               Thread Local Storage (TLS)    Thread Specific Storage (TSS)
================= ============================= =============================
Version           Existing                      New
Key Type          ``int``                       ``Py_tss_t`` (opaque type)
Handle Native Key cast to ``int``               conceal into internal field
Function Argument ``int``                       ``Py_tss_t *``
Features          - create key                  - create key
                  - delete key                  - delete key
                  - set value                   - set value
                  - get value                   - get value
                  - delete value                - (set ``NULL`` instead) [8]_
                  - reinitialize keys (for      - (unnecessary) [8]_
                    after fork)                 - dynamically (de-)allocate
                                                  key
                                                - check key's initialization
                                                  state
Default Value     (``-1`` as key creation       ``Py_tss_NEEDS_INIT``
                  failure)
Requirement       native thread                 native thread
                                                (since CPython 3.7 [9]_)
Restriction       Not support platform where    Unable to statically allocate
                  native TLS key is defined in  key when ``Py_LIMITED_API``
                  a way that cannot be safely   is defined.
                  cast to ``int``.
================= ============================= =============================
Example
-------
With the proposed changes, a TSS key is initialized like::
static Py_tss_t tss_key = Py_tss_NEEDS_INIT;
if (PyThread_tss_create(&tss_key)) {
/* ... handle key creation failure ... */
}
The initialization state of the key can then be checked like::
assert(PyThread_tss_is_created(&tss_key));
The rest of the API is used analogously to the old API::
int the_value = 1;
if (PyThread_tss_get(&tss_key) == NULL) {
PyThread_tss_set(&tss_key, (void *)&the_value);
assert(PyThread_tss_get(&tss_key) != NULL);
}
/* ... once done with the key ... */
PyThread_tss_delete(&tss_key);
assert(!PyThread_tss_is_created(&tss_key));
When ``Py_LIMITED_API`` is defined, a TSS key must be dynamically
allocated::
static Py_tss_t *ptr_key = NULL;  /* a static pointer needs a constant initializer */
ptr_key = PyThread_tss_alloc();
if (ptr_key == NULL) {
/* ... handle key allocation failure ... */
}
assert(!PyThread_tss_is_created(ptr_key));
/* ... once done with the key ... */
PyThread_tss_free(ptr_key);
ptr_key = NULL;
Platform Support Changes
========================
A new "Native Thread Implementation" section will be added to PEP 11 that
states:
* As of CPython 3.7, when thread support is enabled, all platforms are
required to provide at least one native thread implementation (such as
pthreads or Windows threads) to implement the TSS API. Any TSS API problems
that occur on an implementation without native threads will be closed as
"won't fix".
Motivation
==========
The primary problem at issue here is the type of the keys (``int``) used for
TLS values, as defined by the original PyThread TLS API.
The original TLS API was added to Python by GvR back in 1997; at the time
the key used to represent a TLS value was an ``int``, and so it has remained
to the time of writing. The API was originally backed by CPython's own TLS
implementation, which remained, largely unchanged, in Python/thread.c.
Support for implementing the API on top of native thread implementations
(pthreads and Windows) was added much later, and CPython's own
implementation, being no longer necessary, was removed [9]_.
The problem with the choice of ``int`` to represent a TLS key, is that while
it was fine for CPython's own TLS implementation, and happens to be
compatible with Windows (which uses ``DWORD`` for the analogous data), it is
not compatible with the POSIX standard for the pthreads API, which defines
``pthread_key_t`` as an opaque type not further defined by the standard (as
with ``Py_tss_t`` described above) [14]_. This leaves it up to the
underlying
implementation how a ``pthread_key_t`` value is used to look up
thread-specific data.
This has not generally been a problem for Python's API, as it just happens
that on Linux ``pthread_key_t`` is defined as an ``unsigned int``, and so is
fully compatible with Python's TLS API--``pthread_key_t`` values created by
``pthread_key_create`` can be freely cast to ``int`` and back (well, not
exactly; even this has some limitations, as pointed out by issue #22206).
However, as issue #25658 points out, there are at least some platforms
(namely Cygwin, CloudABI, but likely others as well) which have otherwise
modern and POSIX-compliant pthreads implementations, but are not compatible
with Python's API because their ``pthread_key_t`` is defined in a way that
cannot be safely cast to ``int``. In fact, the possibility of running into
this problem was raised by MvL at the time pthreads TLS was added [2]_.
It could be argued that PEP-11 makes specific requirements for supporting a
new, not otherwise officially-supported platform (such as CloudABI), and that
the status of Cygwin support is currently dubious. However, this creates a
very high barrier to supporting platforms that are otherwise Linux- and/or
POSIX-compatible and where CPython might otherwise "just work" except for
this one hurdle. CPython itself imposes this implementation barrier by way
of an API that is not compatible with POSIX (and in fact makes invalid
assumptions about pthreads).
Rationale for Proposed Solution
===============================
The use of an opaque type (``Py_tss_t``) to key TLS values allows the API to
be compatible with all present (POSIX and Windows) and future (C11?)
native TLS implementations supported by CPython, as it allows the definition
of ``Py_tss_t`` to depend on the underlying implementation.
Since the existing TLS API has been available in *the limited API* [13]_ on
some platforms (e.g. Linux), CPython makes an effort to likewise provide the
new TSS API at that level. Note, however, that the ``Py_tss_t`` definition
becomes an opaque struct when ``Py_LIMITED_API`` is defined, because exposing
``NATIVE_TSS_KEY_T`` as part of the limited API would prevent us from
switching native thread implementations without rebuilding extension
modules.
A new API must be introduced, rather than changing the function signatures
of the current API, in order to maintain backwards compatibility. The new
API also more clearly groups together these related functions under a single
name prefix, ``PyThread_tss_``. The "tss" in the name stands for
"thread-specific storage", and was influenced by the naming and design of
the "tss" API that is part of the C11 threads API [15]_. However, this is
in no way meant to imply compatibility with or support for the C11 threads
API, or signal any future intention of supporting C11--it's just the
influence for the naming and design.
The inclusion of the special default value ``Py_tss_NEEDS_INIT`` is required
by the fact that not all native TLS implementations define a sentinel value
for uninitialized TLS keys. For example, on Windows a TLS key is
represented by a ``DWORD`` (``unsigned int``) and its value must be treated
as opaque [3]_. So there is no unsigned integer value that can be safely
used to represent an uninitialized TLS key on Windows. Likewise, POSIX
does not specify a sentinel for an uninitialized ``pthread_key_t``, instead
relying on the ``pthread_once`` interface to ensure that a given TLS key is
initialized only once per-process. Therefore, the ``Py_tss_t`` type
contains an explicit ``._is_initialized`` that can indicate the key's
initialization state independent of the underlying implementation.
Changing ``PyThread_create_key`` to immediately return a failure status on
systems using pthreads where ``sizeof(int) != sizeof(pthread_key_t)`` is
intended as a sanity check: Currently, ``PyThread_create_key`` may report
initial success on such systems, but attempts to use the returned key are
likely to fail. Although in practice this failure occurs earlier in the
interpreter initialization, it's better to fail immediately at the source of
problem (``PyThread_create_key``) rather than sometime later when use of an
invalid key is attempted. In other words, this indicates clearly that the
old API is not supported on platforms where it cannot be used reliably, and
that no effort will be made to add such support.
Rejected Ideas
==============
* Do nothing: The status quo is fine because it works on Linux, and
platforms
wishing to be supported by CPython should follow the requirements of
PEP-11. As explained above, while this would be a fair argument if
CPython were being asked to make changes to support particular quirks
or features of a specific platform, in this case it is a quirk of CPython
that prevents it from being used to its full potential on otherwise
POSIX-compliant platforms. The fact that the current implementation
happens to work on Linux is a happy accident, and there's no guarantee
that this will never change.
* Affected platforms should just configure Python ``--without-threads``:
This is a possible temporary workaround to the issue, but only that.
Python should not be hobbled on affected platforms despite them being
otherwise perfectly capable of running multi-threaded Python.
* Affected platforms should use CPython's own TLS implementation instead of
a native TLS implementation: This is a more acceptable alternative to the
previous idea, and in fact there had been a patch to do just that [4]_.
However, CPython's own implementation being "slower and clunkier" in
general than native implementations, it would still needlessly hobble
performance on affected platforms. At least one other module
(``tracemalloc``) is also broken if Python is built without a native
implementation. Moreover, this idea can no longer be adopted, because
CPython's own implementation has been removed.
* Keep the existing API, but work around the issue by providing a mapping
from ``pthread_key_t`` values to ``int`` values. A couple of attempts were
made at this ([5]_, [6]_), but this only injects needless complexity and
overhead into performance-critical code on platforms that are not currently
affected by this issue (such as Linux). Even if use of this workaround were
made conditional on platform compatibility, it introduces platform-specific
code to maintain, and still has the problem of the previous rejected ideas
of needlessly hobbling performance on affected platforms.
Implementation
==============
An initial version of a patch [7]_ is available on the bug tracker for this
issue. Since the migration to Github, it's being developed in the
``pep539-tss-api`` feature branch [10]_ in Masayuki Yamamoto's fork of the
CPython repository on Github. A work-in-progress PR is available at [11]_.
This reference implementation covers not only the API enhancement request
itself, but also the client-code fixes needed to replace the existing TLS
API with the new TSS API.
Copyright
=========
This document has been placed in the public domain.
References and Footnotes
========================
.. [1] http://bugs.python.org/issue25658
.. [2] https://bugs.python.org/msg116292
.. [3]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms686801(v=vs.85).…
.. [4] http://bugs.python.org/file45548/configure-pthread_key_t.patch
.. [5] http://bugs.python.org/file44269/issue25658-1.patch
.. [6] http://bugs.python.org/file44303/key-constant-time.diff
.. [7] http://bugs.python.org/file46379/pythread-tss-3.patch
.. [8] https://bugs.python.org/msg298342
.. [9] http://bugs.python.org/issue30832
.. [10]
https://github.com/python/cpython/compare/master...ma8ma:pep539-tss-api
.. [11] https://github.com/python/cpython/pull/1362
.. [12] https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx
.. [13] It is also called the "stable ABI"
(https://www.python.org/dev/peps/pep-0384/)
.. [14]
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create…
.. [15] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#page=404