webmaster has already heard from 4 people who cannot install it.
I sent them to the bug tracker or to python-list, but they seem
not to have gone to either place. Is there some guide, 'how to debug
installation problems', that I should be sending them to?
Laura
Hi,
On Twitter, Raymond Hettinger wrote:
"The decision making process on Python-dev is an anti-pattern,
governed by anecdotal data and ambiguity over what problem is solved."
https://twitter.com/raymondh/status/887069454693158912
About "anecdotal data", I would like to discuss the Python startup time.
== Python 3.7 compared to 2.7 ==
First of all, on speed.python.org, we have:
* Python 2.7: 6.4 ms with site, 3.0 ms without site (-S)
* master (3.7): 14.5 ms with site, 8.4 ms without site (-S)
Python 3.7 startup time is 2.3x slower with site (default mode), or
2.8x slower without site (-S command line option).
(I will skip Python 3.4, 3.5 and 3.6, which are much worse than Python 3.7...)
So if a user complained about Python 2.7 startup time, be prepared
for a user who is 2x - 3x angrier when "forced" to upgrade to Python 3!
== Mercurial vs Git, Python vs C, startup time ==
Startup time matters a lot for Mercurial since Mercurial is compared
to Git. Git and Mercurial have similar features, but Git is written in
C whereas Mercurial is written in Python. Quick benchmark on the
speed.python.org server:
* hg version: 44.6 ms +- 0.2 ms
* git --version: 974 us +- 7 us
Mercurial's startup time is already 45.8x slower than Git's, and that
is with Mercurial running on Python 2.7.12. Now try to sell Python 3
to Mercurial developers, with a startup time 2x - 3x slower...
I tested Mercurial 3.7.3 and Git 2.7.4 on Ubuntu 16.04.1 using "python3
-m perf command -- ...".
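If you don't have perf installed, a rough equivalent of such a measurement can be sketched with just the standard library (expect noisier numbers than perf gives):

```python
import statistics
import subprocess
import time

def startup_time(cmd, runs=20):
    """Measure the wall-clock time to run `cmd` to completion, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

mean, dev = startup_time(["python3", "-c", "pass"])
print("python3 -c pass: %.1f ms +- %.1f ms" % (mean * 1e3, dev * 1e3))
```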
== CPython core developers don't care? No, they do care ==
Christian Heimes, Naoki INADA, Serhiy Storchaka, Yury Selivanov, me
(Victor Stinner) and other core developers have made multiple changes
in recent years to reduce the number of imports at startup, optimize
importlib, etc.
IMHO all these core developers are well aware of the competition
between programming languages, and honestly Python's startup time isn't "good".
So let's compare it to other programming languages similar to Python.
== PHP, Ruby, Perl ==
I measured the startup time of other programming languages which are
similar to Python, still on the speed.python.org server using "python3
-m perf command -- ...":
* perl -e ' ': 1.18 ms +- 0.01 ms
* php -r ' ': 8.57 ms +- 0.05 ms
* ruby -e ' ': 32.8 ms +- 0.1 ms
Wow, Perl is quite good! PHP seems as good as Python 2 (but Python 3
is worse). Ruby startup time seems less optimized than other
languages.
Tested versions:
* perl 5, version 22, subversion 1 (v5.22.1)
* PHP 7.0.18-0ubuntu0.16.04.1 (cli) ( NTS )
* ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]
== Quick Google search ==
I also searched for "python startup time" and "python slow startup
time" on Google and found many articles. Some examples:
"Reducing the Python startup time"
http://www.draketo.de/book/export/html/498
=> "The python startup time always nagged me (17-30ms) and I just
searched again for a way to reduce it, when I found this: The
Python-Launcher caches GTK imports and forks new processes to reduce
the startup time of python GUI programs."
https://nelsonslog.wordpress.com/2013/04/08/python-startup-time/
=> "Wow, Python startup time is worse than I thought."
"How to speed up python starting up and/or reduce file search while
loading libraries?"
https://stackoverflow.com/questions/15474160/how-to-speed-up-python-startin…
=> "The first time I log to the system and start one command it takes
6 seconds just to show a few line of help. If I immediately issue the
same command again it takes 0.1s. After a couple of minutes it gets
back to 6s. (proof of short-lived cache)"
"How does one optimise the startup of a Python script/program?"
https://www.quora.com/How-does-one-optimise-the-startup-of-a-Python-script-…
=> "I wrote a Python program that would be used very often (imagine
'cd' or 'ls') for very short runtimes, how would I make it start up as
fast as possible?"
"Python Interpreter Startup time"
https://bytes.com/topic/python/answers/34469-pyhton-interpreter-startup-time
"Python is very slow to start on Windows 7"
https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-o…
=> "Python takes 17 times longer to load on my Windows 7 machine than
Ubuntu 14.04 running on a VM"
=> "returns in 0.614s on Windows and 0.036s on Linux"
"How to make a fast command line tool in Python" (old article Python 2.5.2)
https://files.bemusement.org/talks/OSDC2008-FastPython/
=> "(...) some techniques Bazaar uses to start quickly, such as lazy imports."
--
So please continue the efforts to make Python startup even faster, beat
all the other programming languages, and finally convince Mercurial to
upgrade ;-)
Victor
Hi folks,
As some people here know I've been working off and on for a while to
improve CPython's support of Cygwin. I'm motivated in part by a need
to have software working on Python 3.x on Cygwin for the foreseeable
future, preferably with minimal graft. (As an incidental side-effect
Python's test suite--especially of system-level functionality--serves
as an interesting test suite for Cygwin itself too.)
This is partly what motivated PEP 539 [1], although that PEP had the
advantage of benefiting other POSIX-compatible platforms as well (and
in fact was fixing an aspect of CPython that made it unfriendly to
supporting other platforms).
As far as I can tell, the first commit to Python to add any kind of
support for Cygwin was made by Guido (committing a contributed patch)
back in 1999 [2]. Since then, bits and pieces have been added for
Cygwin's benefit over time, with varying degrees of impact in terms of
#ifdefs and the like (for the most part Cygwin does not require *much*
in the way of special support, but it does have some differences from
a "normal" POSIX-compliant platform, such as the possibility for
case-insensitive filesystems and executables that end in .exe). I
don't know whether it's ever been "officially supported", but someone
with a longer memory of the project may be able to comment on that. I'm
not sure whether it was ever discussed in the context of PEP 11.
I have personally put in a fair amount of effort already in either
fixing issues on Cygwin (many of these issues also impact MinGW), or
more often than not fixing issues in the CPython test suite on
Cygwin--these are mostly tests that are broken due to invalid
assumptions about the platform (for example, that there is always a
"root" user with uid=0; this is not the case on Cygwin). In other
cases some tests need to be skipped or worked around due to
platform-specific bugs, and Cygwin is hardly the only case of this in
the test suite.
I also have an experimental AppVeyor configuration for running the
tests on Cygwin [3], as well as an experimental buildbot (not
available on the internet, but working). These currently rely on a
custom branch that includes fixes needed for the test suite to run to
completion without crashing or hanging (e.g.
https://bugs.python.org/issue31885). It would be nice to add this as
an official buildbot, but I'm not sure if it makes sense to do that
until it's "green", or at least not crashing. I have several other
patches to the tests toward this goal, and am currently down to ~22
tests failing.
Before I do any more work on this, however, it would be best to once
and for all clarify the support for Cygwin in CPython, as it has never
been "officially supported" nor unsupported--this way we can avoid
having this discussion every time a patch related to Cygwin comes up.
I could provide some arguments for why I believe Cygwin should be
supported, but before this gets too long I'd just like to float the
idea of having the discussion in the first place. It's also not
exactly clear to me how to meet the standards in PEP 11 for supporting
a platform--in particular it's not clear when a buildbot is considered
"stable", or how to achieve that without getting necessary fixes
merged into the main branch in the first place.
Thanks,
Erik
[1] https://www.python.org/dev/peps/pep-0539/
[2] https://github.com/python/cpython/commit/717d1fdf2acbef5e6b47d9b4dcf48ef182…
[3] https://ci.appveyor.com/project/embray/cpython
The make_dataclass() factory function in the dataclasses module currently requires type declarations. It would be nice if the type declarations were optional.
With typing (currently works):
Point = NamedTuple('Point', [('x', float), ('y', float), ('z', float)])
Point = make_dataclass('Point', [('x', float), ('y', float), ('z', float)])
Without typing (only the first currently works):
Point = namedtuple('Point', ['x', 'y', 'z']) # underlying store is a tuple
Point = make_dataclass('Point', ['x', 'y', 'z']) # underlying store is an instance dict
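As a stopgap, the closest spelling available today is to pass an explicit placeholder annotation for each field; my choice of typing.Any as the placeholder is illustrative, not part of the proposal:

```python
from dataclasses import make_dataclass, fields
from typing import Any

# Explicit placeholder annotations approximate the proposed untyped form.
Point = make_dataclass('Point', [('x', Any), ('y', Any), ('z', Any)])

p = Point(1, 2, 3)
print(p)                            # Point(x=1, y=2, z=3)
print([f.name for f in fields(p)])  # ['x', 'y', 'z']
```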
This proposal would make it easy to cleanly switch between the immutable tuple-based container and the instance-dict-based, optionally-frozen container. The proposal would make it possible for instructors to teach dataclasses without having to teach typing as a prerequisite. And it would make dataclasses usable for projects that have elected not to use static typing.
Raymond
Hi,
Serhiy Storchaka seems to be worried by the high number of commits in
https://bugs.python.org/issue32030 "PEP 432: Rewrite Py_Main()", so
let me explain the context of this work :-)
To prepare CPython to implement my UTF-8 Mode PEP (PEP 540), I worked
on the implementation of Nick Coghlan's PEP 432:
PEP 432 -- Restructuring the CPython startup sequence
https://www.python.org/dev/peps/pep-0432/
The startup sequence is a big pile of code made of multiple functions:
main(), Py_Main(), Py_Initialize(), Py_Finalize()... and a lot of tiny
"configuration" functions like Py_SetPath().
Over the years, many configuration options were added in the middle of
the code. The priority of configuration options is not always correct
between command line options, environment variables, configuration
files (like "pyvenv.cfg"), etc. For technical reasons, it's hard to
implement the -E option (ignore PYTHON* environment variables)
properly.
For example, the new PYTHONCOERCECLOCALE environment variable (of PEP
538) doesn't handle properly -E (it ignores -E), because it was too
complex to support -E. -- I'm working on fixing this.
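To illustrate the intended priority order, here is a toy sketch in Python (the names are mine, not the actual C structures or functions):

```python
def resolve_option(name, cmdline, env, config_file, default,
                   ignore_env=False):
    """Resolve one configuration option with a fixed priority:
    command line > environment variables (unless -E) > config file > default.
    """
    if name in cmdline:
        return cmdline[name]
    if not ignore_env and name in env:
        return env[name]
    if name in config_file:
        return config_file[name]
    return default

# -E (ignore_env=True) must skip PYTHON* environment variables entirely.
value = resolve_option("utf8_mode",
                       cmdline={},
                       env={"utf8_mode": "1"},
                       config_file={},
                       default="0",
                       ignore_env=True)
print(value)  # "0": the environment variable is ignored
```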
In recent weeks, I mostly worked on the Py_Main() function,
Modules/getpath.c and PC/getpathp.c, to "refactor" the code:
* Split big functions (300 to 500 lines) into multiple small functions
(50 lines or less), to make the control flow easier to follow and to
make it easier to move code around.
* Replace static and global variables with memory allocated on the heap.
* Reorganize how the configuration is read: populate a first temporary
structure (_PyMain using wchar_t*), then create Python objects
(_PyMainInterpreterConfig) to finish with the real configuration (like
setting attributes of the sys module). The goal is to centralize all
code reading configuration to fix the priority and to simplify the
code.
My motivation was to write a correct implementation of the UTF-8 Mode (PEP 540).
Nick's motivation is to make CPython easier to embed. His plan for
Python 3.8 is to give access to the new _PyCoreConfig and
_PyMainInterpreterConfig structures to:
* easily give access to most (if not all?) configuration options to "embedders"
* allow configuring Python without environment variables, command
line options or configuration files, using only these structures
* allow configuring Python using Python objects (PyObject*) rather
than C types (like wchar_t*)
(I'm not sure that I understood correctly, so please read the PEP 432 ;-))
IMHO the most visible change of the PEP 432 is to split Python
initialization in two parts:
* Core: strict minimum to use the Python C API
* Main: everything else
The goal is to introduce the opportunity to configure Python between
Core and Main.
The implementation is currently a work in progress. Neither Nick nor I
will have the bandwidth to update his PEP and finish the implementation
before Python 3.7, so this work will remain private until at least
Python 3.8.
Another part of the work is to enhance the documentation. You can for
example now find an explicit list of C functions which can be called
before Py_Initialize():
https://docs.python.org/dev/c-api/init.html#before-python-initialization
And also a list of functions that must not be called before
Py_Initialize(), even though you might want to call them :-)
Victor
The following note is a proposal to add the support of the Android platform.
The note is easier to read with clickable links at
https://github.com/xdegaye/cagibi/blob/master/doc/android_support.rst
Motivations
===========
* Android is ubiquitous.
* This would be the first platform supported by Python that is cross-compiled,
thanks to many contributors.
* Although the Android operating system is based on Linux, it differs from
most Linux platforms: for example, it does not use the GNU libc and runs
SELinux in enforcing mode. Therefore supporting this platform would make
Python more robust and would also allow testing it on arm 64-bit processors.
* Python running on Android is also a handheld calculator, a successor of the
slide rule and the `HP 41`_.
Current status
==============
* The Python test suite succeeds when run on Android emulators using the
strenuous buildbot settings with the following architectures on API 24:
x86, x86_64, armv7 and arm64.
* The `Android build system`_ is described in another section.
* The `buildmaster-config PR 26`_ proposes to update ``master.cfg`` to enable
buildbots to run a given Android API and architecture on the emulators.
* The Android emulator is actually ``qemu``, so the test suites for x86 and
x86_64 take about the same time as a native run when the processor of the
build system is of the x86 family. The test suites for the arm
architectures take much longer: about 8 hours for arm64 and 10 hours for
armv7 on a four-year-old laptop.
* The changes that have been made to achieve this status are listed in
`bpo-26865`_, the Android meta-issue.
* Given the CPU resources required to run the test suite on the arm emulators,
it may be difficult to find a contributed buildbot worker. So the hardware
to run these buildbots remains to be found.
Proposal
========
Support the Android platform on API 24 [1]_ for the x86_64, armv7 and arm64
architectures built with NDK 14b.
*API 24*
* API 21 is the first version to provide usable support for wide characters
and where SELinux is run in enforcing mode.
* API 22 introduces an annoying bug in the linker that prints something like
this when python is started::

    WARNING: linker: libpython3.6m.so.1.0: unused DT entry: type 0x6ffffffe arg 0x14554
The `termux`_ Android terminal emulator describes this problem at the end
of its `termux-packages`_ gitlab page and has implemented a
``termux-elf-cleaner`` tool to strip the useless entries from the ELF
header of executables.
* API 24 is the first version where the `adb`_ shell is run on the emulator
as a ``shell`` user instead of the ``root`` user as in previous versions,
and the first version that supports arm64.
*x86_64*
It seems that no handheld device exists using that architecture. It is
supported because the x86_64 Android emulator runs fast and therefore is a
good candidate as a buildbot worker.
*NDK 14b*
This release of the NDK is the first one to use `Unified headers`_, which fix
numerous problems that previously had to be worked around by patching the
Python configure script (those workarounds have since been reverted).
Android idiosyncrasies
======================
* The default shell is ``/system/bin/sh``.
* The file system layout is not a traditional unix layout; for example, there
is no ``/tmp``. Most directories have user-restricted access and ``/sdcard``
is mounted as ``noexec``.
* The (java) applications are allocated a unix user id and a subdirectory on
``/data/data``.
* SELinux is run in enforcing mode.
* Shared memory and semaphores are not supported.
* The default encoding is UTF-8.
Android build system
====================
The Android build system is implemented at `bpo-30386`_ with `PR 1629`_ and
is documented by its `README`_. It provides the following features:
* Build a distribution for a device or an emulator with a given API level
and a given architecture.
* Start the emulator and
+ install the distribution
+ start a remote interactive shell
+ or run a python command remotely
+ or run the buildbottest remotely
* Run gdb on the python process that is running on the emulator, with python
pretty-printing.
The build system adds the ``Android/`` directory and the ``configure-android``
script to the root of the Python source directory on the master branch without
modifying any other file. The build system can be installed, upgraded (i.e. the
SDK and NDK) and run remotely, through ssh for example.
The following external libraries, when they are configured in the build system,
are downloaded from the internet and cross-compiled (only once, on the first
run of the build system) before the cross-compilation of the extension modules:
* ``ncurses``
* ``readline``
* ``sqlite``
* ``libffi``
* ``openssl``; the cross-compilation of openssl fails on x86_64 and arm64, so
this step is skipped on those architectures.
The following extension modules are disabled by adding them to the
``*disabled*`` section of ``Modules/Setup``:
* ``_uuid``, Android has no uuid/uuid.h header.
* ``grp``, some grp.h functions are not declared.
* ``_crypt``, Android does not have crypt.h.
* ``_ctypes`` on x86_64 where all long double tests fail (`bpo-32202`_) and on
arm64 (see `bpo-32203`_).
.. [1] On Wikipedia `Android version history`_ lists the correspondence between
API level, commercial name and version for each release. It also provides
information on the global Android version distribution, see the two charts
on top.
.. _`README`: https://github.com/xdegaye/cpython/blob/bpo-30386/Android/README.rst
.. _`bpo-26865`: https://bugs.python.org/issue26865
.. _`bpo-30386`: https://bugs.python.org/issue30386
.. _`bpo-32202`: https://bugs.python.org/issue32202
.. _`bpo-32203`: https://bugs.python.org/issue32203
.. _`PR 1629`: https://github.com/python/cpython/pull/1629
.. _`buildmaster-config PR 26`: https://github.com/python/buildmaster-config/pull/26
.. _`Android version history`: https://en.wikipedia.org/wiki/Android_version_history
.. _`termux`: https://termux.com/
.. _`termux-packages`: https://gitlab.com/jbwhips883/termux-packages
.. _`adb`: https://developer.android.com/studio/command-line/adb.html
.. _`Unified headers`: https://android.googlesource.com/platform/ndk.git/+/ndk-r14-release/docs/Un…
.. _`HP 41`: https://en.wikipedia.org/wiki/HP-41C
.. vim:filetype=rst:tw=78:ts=8:sts=2:sw=2:et:
Hello all,
I've recently been experimenting with dataclasses. They totally rock! A lot
of the boilerplate for the AST I've designed in Python is automatically
taken care of, it's really great! However, I have a few concerns about the
implementation.
In a few cases I want to override the repr of the AST nodes. I wrote a
__repr__ and ran the code, but lo and behold, I got a type error. I couldn't
override it. I quickly learned that one needs to pass a keyword to the
dataclass decorator to tell it *not* to auto-generate methods you override.
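For the record, the workaround I mean looks like this (repr=False is the keyword that disables the generated method; the Node class is just an example):

```python
from dataclasses import dataclass

@dataclass(repr=False)   # tell the decorator not to generate __repr__
class Node:
    left: int
    right: int

    def __repr__(self):
        return f"Node<{self.left}, {self.right}>"

print(repr(Node(1, 2)))  # Node<1, 2>
```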
I have two usability concerns with the current implementation. I emailed
Eric about the first, and he said I should ask for thoughts here. The
second I found after a couple of days sitting on this message.
The first is that needing both a keyword and method is duplicative and
unnecessary. Eric agreed it was a hassle, but felt it was justified
considering someone may accidentally override a dataclass method. I
disagree with this point of view as dataclasses are billed as providing
automatic methods. Overriding via method definition is very natural and
idiomatic. I don't really see how someone could accidentally override a
dataclass method if the decorator simply skipped generating any method
that is already defined in the class at definition time.
The second concern, which I came across more recently, is that if I have a
base class, and dataclasses inherit from this base class, the inherited
__repr__ & co are silently overridden by dataclass. This is both unexpected,
and also means I need to pass repr=False to each subclass's decorator to get
correct behavior, which somewhat defeats the utility of subclassing. I'm not
sure a whole lot can be done about this, though.
I appreciate any thoughts folks have related to this.
Cheers,
~>Ethan Smith
This is a second version of PEP 567.
A few things have changed:
1. I now have a reference implementation:
https://github.com/python/cpython/pull/5027
2. The C API was updated to match the implementation.
3. The get_context() function was renamed to copy_context() to better
reflect what it is really doing.
4. A few clarifications/edits here and there to address earlier feedback.
Yury
PEP: 567
Title: Context Variables
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov <yury(a)magic.io>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2017
Python-Version: 3.7
Post-History: 12-Dec-2017, 28-Dec-2017
Abstract
========
This PEP proposes a new ``contextvars`` module and a set of new
CPython C APIs to support context variables. This concept is
similar to thread-local storage (TLS), but, unlike TLS, it also allows
correctly keeping track of values per asynchronous task, e.g.
``asyncio.Task``.
This proposal is a simplified version of :pep:`550`. The key
difference is that this PEP is concerned only with solving the case
for asynchronous tasks, not for generators. There are no proposed
modifications to any built-in types or to the interpreter.
This proposal is not strictly related to Python Context Managers,
although it does provide a mechanism that can be used by Context
Managers to store their state.
Rationale
=========
Thread-local variables are insufficient for asynchronous tasks that
execute concurrently in the same OS thread. Any context manager that
saves and restores a context value using ``threading.local()`` will
have its context values bleed to other code unexpectedly when used
in async/await code.
A few examples where having a working context local storage for
asynchronous code is desirable:
* Context managers like ``decimal`` contexts and ``numpy.errstate``.
* Request-related data, such as security tokens and request
data in web applications, language context for ``gettext``, etc.
* Profiling, tracing, and logging in large code bases.
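The problem the PEP solves can be demonstrated with the reference implementation: each asyncio task sees only its own value, with no bleed between concurrently running tasks (the names ``request_id`` and ``handler`` here are illustrative; the example uses Python 3.7's ``asyncio.run()``)::

```python
import asyncio
from contextvars import ContextVar

request_id = ContextVar('request_id', default=None)

async def handler(rid):
    request_id.set(rid)       # each task sets its own value
    await asyncio.sleep(0)    # yield, so the two tasks interleave
    return request_id.get()

async def main():
    # Each task created by gather() gets its own copy of the context.
    return await asyncio.gather(handler('a'), handler('b'))

print(asyncio.run(main()))  # ['a', 'b']: no bleed between the tasks
```

With ``threading.local()`` instead of a ``ContextVar``, both handlers would share one value, since they run in the same OS thread.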
Introduction
============
The PEP proposes a new mechanism for managing context variables.
The key classes involved in this mechanism are ``contextvars.Context``
and ``contextvars.ContextVar``. The PEP also proposes some policies
for using the mechanism around asynchronous tasks.
The proposed mechanism for accessing context variables uses the
``ContextVar`` class. A module (such as ``decimal``) that wishes to
store a context variable should:
* declare a module-global variable holding a ``ContextVar`` to
serve as a key;
* access the current value via the ``get()`` method on the
key variable;
* modify the current value via the ``set()`` method on the
key variable.
The notion of "current value" deserves special consideration:
different asynchronous tasks that exist and execute concurrently
may have different values for the same key. This idea is well-known
from thread-local storage but in this case the locality of the value is
not necessarily bound to a thread. Instead, there is the notion of the
"current ``Context``" which is stored in thread-local storage, and
is accessed via ``contextvars.copy_context()`` function.
Manipulation of the current ``Context`` is the responsibility of the
task framework, e.g. asyncio.
A ``Context`` is conceptually a read-only mapping, implemented using
an immutable dictionary. The ``ContextVar.get()`` method does a
lookup in the current ``Context`` with ``self`` as a key, raising a
``LookupError`` or returning a default value specified in
the constructor.
The ``ContextVar.set(value)`` method clones the current ``Context``,
assigns the ``value`` to it with ``self`` as a key, and sets the
new ``Context`` as the new current ``Context``.
Specification
=============
A new standard library module ``contextvars`` is added with the
following APIs:
1. ``copy_context() -> Context`` function is used to get a copy of
the current ``Context`` object for the current OS thread.
2. ``ContextVar`` class to declare and access context variables.
3. ``Context`` class encapsulates context state. Every OS thread
stores a reference to its current ``Context`` instance.
It is not possible to control that reference manually.
Instead, the ``Context.run(callable, *args, **kwargs)`` method is
used to run Python code in another context.
contextvars.ContextVar
----------------------
The ``ContextVar`` class has the following constructor signature:
``ContextVar(name, *, default=_NO_DEFAULT)``. The ``name`` parameter
is used only for introspection and debug purposes, and is exposed
as a read-only ``ContextVar.name`` attribute. The ``default``
parameter is optional. Example::
    # Declare a context variable 'var' with the default value 42.
    var = ContextVar('var', default=42)
(``_NO_DEFAULT`` is an internal sentinel object used to detect
whether the default value was provided.)
``ContextVar.get()`` returns the value of the context variable in the
current ``Context``::
    # Get the value of `var`.
    var.get()
``ContextVar.set(value) -> Token`` is used to set a new value for
the context variable in the current ``Context``::
    # Set the variable 'var' to 1 in the current context.
    var.set(1)
``ContextVar.reset(token)`` is used to reset the variable in the
current context to the value it had before the ``set()`` operation
that created the ``token``::
    assert var.get(None) is None

    token = var.set(1)
    try:
        ...
    finally:
        var.reset(token)

    assert var.get(None) is None
The ``ContextVar.reset()`` method is idempotent and can be called
multiple times on the same ``Token`` object: the second and later calls
will be no-ops.
contextvars.Token
-----------------
``contextvars.Token`` is an opaque object that should be used to
restore the ``ContextVar`` to its previous value, or remove it from
the context if the variable was not set before. It can be created
only by calling ``ContextVar.set()``.
For debug and introspection purposes it has:
* a read-only attribute ``Token.var`` pointing to the variable
that created the token;
* a read-only attribute ``Token.old_value`` set to the value the
variable had before the ``set()`` call, or to ``Token.MISSING``
if the variable wasn't set before.
Having ``ContextVar.set()`` return a ``Token`` object, together with
the ``ContextVar.reset(token)`` method, allows context variables to be
removed from the context if they were not in it before the ``set()``
call.
contextvars.Context
-------------------
A ``Context`` object is a mapping of context variables to values.
``Context()`` creates an empty context. To get a copy of the current
``Context`` for the current OS thread, use the
``contextvars.copy_context()`` function::
    ctx = contextvars.copy_context()
To run Python code in some ``Context``, use the ``Context.run()``
method::
    ctx.run(function)
Any changes to any context variables that ``function`` causes will
be contained in the ``ctx`` context::
    var = ContextVar('var')
    var.set('spam')

    def function():
        assert var.get() == 'spam'
        var.set('ham')
        assert var.get() == 'ham'

    ctx = copy_context()

    # Any changes that 'function' makes to 'var' will stay
    # isolated in the 'ctx'.
    ctx.run(function)

    assert var.get() == 'spam'
Any changes to the context will be contained in the ``Context``
object on which ``run()`` is called.
``Context.run()`` is used to control in which context asyncio
callbacks and Tasks are executed. It can also be used to run some
code in a different thread in the context of the current thread::
    executor = ThreadPoolExecutor()
    current_context = contextvars.copy_context()

    executor.submit(
        lambda: current_context.run(some_function))
``Context`` objects implement the ``collections.abc.Mapping`` ABC.
This can be used to introspect context objects::
    ctx = contextvars.copy_context()

    # Print all context variables and their values in 'ctx':
    print(ctx.items())

    # Print the value of 'some_variable' in context 'ctx':
    print(ctx[some_variable])
asyncio
-------
``asyncio`` uses ``Loop.call_soon()``, ``Loop.call_later()``,
and ``Loop.call_at()`` to schedule the asynchronous execution of a
function. ``asyncio.Task`` uses ``call_soon()`` to run the
wrapped coroutine.
We modify ``Loop.call_{at,later,soon}`` and
``Future.add_done_callback()`` to accept the new optional *context*
keyword-only argument, which defaults to the current context::
    def call_soon(self, callback, *args, context=None):
        if context is None:
            context = contextvars.copy_context()

        # ... some time later
        context.run(callback, *args)
Tasks in asyncio need to maintain their own context that they inherit
from the point they were created at. ``asyncio.Task`` is modified
as follows::
    class Task:
        def __init__(self, coro):
            ...
            # Get the current context snapshot.
            self._context = contextvars.copy_context()
            self._loop.call_soon(self._step, context=self._context)

        def _step(self, exc=None):
            ...
            # Every advance of the wrapped coroutine is done in
            # the task's context.
            self._loop.call_soon(self._step, context=self._context)
            ...
C API
-----
1. ``PyContextVar * PyContextVar_New(char *name, PyObject *default)``:
create a ``ContextVar`` object.
2. ``int PyContextVar_Get(PyContextVar *, PyObject *default_value,
PyObject **value)``:
return ``-1`` if an error occurs during the lookup, ``0`` otherwise.
If a value for the context variable is found, it will be set to the
``value`` pointer. Otherwise, ``value`` will be set to
``default_value`` when it is not ``NULL``. If ``default_value`` is
``NULL``, ``value`` will be set to the default value of the
variable, which can be ``NULL`` too. ``value`` is always a borrowed
reference.
3. ``PyContextToken * PyContextVar_Set(PyContextVar *, PyObject *)``:
set the value of the variable in the current context.
4. ``PyContextVar_Reset(PyContextVar *, PyContextToken *)``:
reset the value of the context variable.
5. ``PyContext * PyContext_New()``: create a new empty context.
6. ``PyContext * PyContext_Copy()``: get a copy of the current context.
7. ``int PyContext_Enter(PyContext *)`` and
``int PyContext_Exit(PyContext *)`` allow setting and restoring
the context for the current OS thread. It is required to always
restore the previous context::
    PyContext *old_ctx = PyContext_Copy();
    if (old_ctx == NULL) goto error;

    if (PyContext_Enter(new_ctx)) goto error;

    // run some code

    if (PyContext_Exit(old_ctx)) goto error;
Implementation
==============
This section explains high-level implementation details in
pseudo-code. Some optimizations are omitted to keep this section
short and clear.
For the purposes of this section, we implement an immutable dictionary
using ``dict.copy()``::
    class _ContextData:

        def __init__(self):
            self._mapping = dict()

        def get(self, key):
            return self._mapping[key]

        def set(self, key, value):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            copy._mapping[key] = value
            return copy

        def delete(self, key):
            copy = _ContextData()
            copy._mapping = self._mapping.copy()
            del copy._mapping[key]
            return copy
Every OS thread has a reference to the current ``_ContextData``.
``PyThreadState`` is updated with a new ``context_data`` field that
points to a ``_ContextData`` object::
    class PyThreadState:
        context_data: _ContextData
``contextvars.copy_context()`` is implemented as follows::
    def copy_context():
        ts : PyThreadState = PyThreadState_Get()
        if ts.context_data is None:
            ts.context_data = _ContextData()
        ctx = Context()
        ctx._data = ts.context_data
        return ctx
``contextvars.Context`` is a wrapper around ``_ContextData``::

    class Context(collections.abc.Mapping):

        def __init__(self):
            self._data = _ContextData()

        def run(self, callable, *args, **kwargs):
            ts : PyThreadState = PyThreadState_Get()
            saved_data : _ContextData = ts.context_data

            try:
                ts.context_data = self._data
                return callable(*args, **kwargs)
            finally:
                self._data = ts.context_data
                ts.context_data = saved_data

        # Mapping API methods are implemented by delegating
        # `get()` and other Mapping calls to `self._data`.
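The capture-the-changes behavior of ``run()`` can be observed with the real ``contextvars`` module shipped with Python 3.7+::

```python
# Changes made by callables invoked via Context.run() are recorded in
# the Context object and do not leak into the caller's context.
import contextvars

var = contextvars.ContextVar('var', default='spam')

ctx = contextvars.copy_context()
ctx.run(var.set, 'ham')

print(var.get())   # 'spam' -- the caller's context is untouched
print(ctx[var])    # 'ham'  -- the change was captured by ctx
```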
``contextvars.ContextVar`` interacts with
``PyThreadState.context_data`` directly::

    class ContextVar:

        def __init__(self, name, *, default=_NO_DEFAULT):
            self._name = name
            self._default = default

        @property
        def name(self):
            return self._name

        def get(self, default=_NO_DEFAULT):
            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            try:
                return data.get(self)
            except KeyError:
                pass

            if default is not _NO_DEFAULT:
                return default

            if self._default is not _NO_DEFAULT:
                return self._default

            raise LookupError

        def set(self, value):
            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            try:
                old_value = data.get(self)
            except KeyError:
                old_value = Token.MISSING

            ts.context_data = data.set(self, value)
            return Token(self, old_value)

        def reset(self, token):
            if token._used:
                return

            ts : PyThreadState = PyThreadState_Get()
            data : _ContextData = ts.context_data

            if token._old_value is Token.MISSING:
                ts.context_data = data.delete(token._var)
            else:
                ts.context_data = data.set(token._var,
                                           token._old_value)

            token._used = True
    class Token:

        MISSING = object()

        def __init__(self, var, old_value):
            self._var = var
            self._old_value = old_value
            self._used = False

        @property
        def var(self):
            return self._var

        @property
        def old_value(self):
            return self._old_value
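The ``set()``/``reset()`` round-trip sketched above can be exercised with the real ``contextvars`` module (Python 3.7+)::

```python
import contextvars

var = contextvars.ContextVar('var')

token = var.set('new value')
print(var.get())           # 'new value'

var.reset(token)
print(var.get('missing'))  # 'missing' -- the variable is unset again

# The token records that the variable had no prior value:
print(token.old_value is contextvars.Token.MISSING)  # True
```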
Implementation Notes
====================
* The internal immutable dictionary for ``Context`` is implemented
  using Hash Array Mapped Tries (HAMT).  They allow for an O(log N)
  ``set`` operation and an O(1) ``copy_context()`` function, where
  *N* is the number of items in the dictionary.  For a detailed
  analysis of HAMT performance please refer to :pep:`550` [1]_.

* ``ContextVar.get()`` has an internal cache for the most recent
  value, which allows bypassing a hash lookup.  This is similar
  to the optimization the ``decimal`` module implements to
  retrieve its context from ``PyThreadState_GetDict()``.
  See :pep:`550`, which explains the implementation of the cache
  in great detail.
Summary of the New APIs
=======================
* A new ``contextvars`` module with ``ContextVar``, ``Context``,
and ``Token`` classes, and a ``copy_context()`` function.
* ``asyncio.Loop.call_at()``, ``asyncio.Loop.call_later()``,
``asyncio.Loop.call_soon()``, and
``asyncio.Future.add_done_callback()`` run callback functions in
the context they were called in. A new *context* keyword-only
parameter can be used to specify a custom context.
* ``asyncio.Task`` is modified internally to maintain its own
context.
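The asyncio changes listed above can be sketched as follows: ``call_soon()`` runs its callback in a copy of the context that was current when it was invoked, unless an explicit *context* argument is passed (Python 3.7+)::

```python
# call_soon() without a context argument captures the caller's context;
# with context=..., the callback runs in the supplied snapshot instead.
import asyncio
import contextvars

var = contextvars.ContextVar('var', default='default')
results = []

async def main():
    loop = asyncio.get_running_loop()

    var.set('inherited')
    # No explicit context: the callback sees the caller's context.
    loop.call_soon(lambda: results.append(var.get()))

    # Explicit context: the callback runs in the supplied snapshot.
    ctx = contextvars.copy_context()
    ctx.run(var.set, 'explicit')
    loop.call_soon(lambda: results.append(var.get()), context=ctx)

    # Yield once so the scheduled callbacks get a chance to run.
    await asyncio.sleep(0)

asyncio.run(main())
print(results)  # ['inherited', 'explicit']
```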
Design Considerations
=====================
Why contextvars.Token and not ContextVar.unset()?
-------------------------------------------------
The Token API avoids the need for a ``ContextVar.unset()`` method,
which would be incompatible with the chained-contexts design of
:pep:`550`.  Future compatibility with :pep:`550` is desired
(at least for Python 3.7) in case there is demand to support
context variables in generators and asynchronous generators.

The Token API also offers better usability: the user does not have
to special-case the absence of a value.  Compare::
    token = cv.set(blah)
    try:
        # code
    finally:
        cv.reset(token)
with::
    _deleted = object()

    old = cv.get(default=_deleted)
    try:
        cv.set(blah)
        # code
    finally:
        if old is _deleted:
            cv.unset()
        else:
            cv.set(old)
Rejected Ideas
==============
Replication of threading.local() interface
------------------------------------------
Please refer to :pep:`550` where this topic is covered in detail: [2]_.
Backwards Compatibility
=======================
This proposal preserves 100% backwards compatibility.
Libraries that use ``threading.local()`` to store context-related
values currently work correctly only for synchronous code.  Switching
them to use the proposed API will keep their behavior for synchronous
code unmodified, but will automatically enable support for
asynchronous code.
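The migration described above can be sketched as follows; the names (``_request_var``, ``set_request``, ``get_request``) are illustrative, not taken from any real library::

```python
# Hypothetical migration of threading.local()-based storage to a
# ContextVar.  All names here are made up for illustration.
import contextvars

# Before:
#   _state = threading.local()
#   _state.request = ...
# After:
_request_var = contextvars.ContextVar('request', default=None)

def set_request(request):
    _request_var.set(request)

def get_request():
    return _request_var.get()

set_request({'path': '/index'})
print(get_request())  # {'path': '/index'}
```

Synchronous callers see the same behavior as before, while asyncio tasks now each get their own value automatically.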
Reference Implementation
========================
The reference implementation can be found here: [3]_.
References
==========
.. [1] https://www.python.org/dev/peps/pep-0550/#appendix-hamt-performance-analysis
.. [2] https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-in…
.. [3] https://github.com/python/cpython/pull/5027
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
Hi,
tl;dr
This mail is about internationalized domain names and TLS/SSL. It
doesn't concern you if you live in ASCII-land.  A couple of other
developers and I would like to change the ssl module in a
backwards-incompatible way to fix IDN support for TLS/SSL.
Simply speaking the IDNA standards (internationalized domain names for
applications) describe how to encode non-ASCII domain names. The DNS
system and X.509 certificates cannot handle non-ASCII host names. Any
non-ASCII part of a hostname is punyencoded. For example the host name
'www.bücher.de' (books) is translated into 'www.xn--bcher-kva.de'. In
IDNA terms, 'www.bücher.de' is called an IDN U-label (unicode) and
'www.xn--bcher-kva.de' an IDN A-label (ASCII). Please refer to the TR64
document [1] for more information.
In a perfect world, it would be very simple: we'd only have one IDNA
standard.  However, there are multiple standards that are incompatible
with each other. The German TLD .de demands IDNA-2008 with UTS#46
compatibility mapping. The hostname 'www.straße.de' maps to
'www.xn--strae-oqa.de'. However in the older IDNA 2003 standard,
'www.straße.de' maps to 'www.strasse.de', but 'strasse.de' is a totally
different domain!
CPython only supports IDNA 2003.
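The mismatch shows up with the standard library's built-in 'idna' codec, which implements the 2003 rules (the IDNA 2008 result in the comment is what the third-party idna package would produce):

```python
# The stdlib "idna" codec implements IDNA 2003, so the German sharp s
# is mapped to "ss" instead of being punycode-encoded as IDNA 2008
# requires (which would yield b'xn--strae-oqa.de').
print('straße.de'.encode('idna'))   # b'strasse.de'

# Names without such mappings encode the same under both standards:
print('bücher.de'.encode('idna'))   # b'xn--bcher-kva.de'
```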
It's less of an issue for the socket module.  It only converts text to
IDNA bytes on the way in, and all functions support bytes and text.
Since IDNA encoding doesn't change ASCII and IDNA-encoded data is
ASCII, it is also no problem to pass IDNA2008-encoded text or bytes to
all socket functions.
Example:

  >>> import socket
  >>> import idna  # from PyPI
  >>> names = ['straße.de', b'strasse.de', idna.encode('straße.de'),
  ...          idna.encode('straße.de').decode('ascii')]
  >>> for name in names:
  ...     print(name, socket.getaddrinfo(name, None, socket.AF_INET,
  ...           socket.SOCK_STREAM, 0, socket.AI_CANONNAME)[0][3:5])
  ...
  straße.de ('strasse.de', ('89.31.143.1', 0))
  b'strasse.de' ('strasse.de', ('89.31.143.1', 0))
  b'xn--strae-oqa.de' ('xn--strae-oqa.de', ('81.169.145.78', 0))
  xn--strae-oqa.de ('xn--strae-oqa.de', ('81.169.145.78', 0))
As you can see, 'straße.de' is canonicalized as 'strasse.de'. The IDNA
2008 encoded hostname maps to a different IP address.
On the other hand, the ssl module is currently completely broken.  It
converts hostnames from bytes to text with the 'idna' codec in some
places, but not in all.  The SSLSocket.server_hostname attribute and
the hostname passed to the SSLContext.set_servername_callback()
callback are decoded as U-labels.  A certificate's common name and
subject alternative name fields are not decoded and are therefore
A-labels.  They *must* stay A-labels, because hostname verification is
only defined in terms of A-labels.  We even had a security issue once,
because a partial wildcard like 'xn*.example.org' must not match IDN
hosts like 'xn--bcher-kva.example.org'.
In issue [2] and PR [3], we all agreed that the only sensible fix is to
make SSLSocket.server_hostname an ASCII text A-label.  But this is a
backwards-incompatible fix.  On the other hand, IDNA is totally broken
without the fix.  Also, in my opinion, PR [3] does not go far enough.
Since we have to break backwards compatibility anyway, I'd like to
modify SSLContext.set_servername_callback() at the same time.
Questions:
- Is everybody OK with breaking backwards compatibility?  The risk is
  small: ASCII-only domains are not affected, and IDNA users are broken
  anyway.
- Should I only fix 3.7, or should we consider a backport to 3.6, too?
Regards,
Christian
[1] https://www.unicode.org/reports/tr46/
[2] https://bugs.python.org/issue28414
[3] https://github.com/python/cpython/pull/3010
On 30 Dec. 2017 11:01 am, "Ethan Smith" <ethan(a)ethanhs.me> wrote:
On Fri, Dec 29, 2017 at 4:52 PM, Guido van Rossum <guido(a)python.org> wrote:
> I still think it should override anything that's just inherited but
> nothing that's defined in the class being decorated.
>
>
Could you explain why you are of this opinion? Is it a concern about
complexity of implementation?
Adding a new method to a base class shouldn't risk breaking existing
subclasses.
If folks want to retain the base class implementation, they can request
that explicitly (and doing so isn't redundant at the point of subclass
definition the way it is for methods defined in the class body).
Cheers,
Nick.