There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:
class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...
Essentially the permutations are, I think:
{unadorned | abc.abstract} x {normal | static | class} x {method | property | non-callable attribute}.
concreteness | implicit first arg | type                   | name                                      | comments
-------------+--------------------+------------------------+-------------------------------------------+------------
unadorned    | unadorned          | method                 | def foo():                                | exists now
unadorned    | unadorned          | property               | @property                                 | exists now
unadorned    | unadorned          | non-callable attribute | x = 2                                     | exists now
unadorned    | static             | method                 | @staticmethod                             | exists now
unadorned    | static             | property               | @staticproperty                           | proposing
unadorned    | static             | non-callable attribute | {degenerate case - variables              | unnecessary
             |                    |                        |  don't have arguments}                    |
unadorned    | class              | method                 | @classmethod                              | exists now
unadorned    | class              | property               | @classproperty or @classmethod;@property  | proposing
unadorned    | class              | non-callable attribute | {degenerate case - variables              | unnecessary
             |                    |                        |  don't have arguments}                    |
abc.abstract | unadorned          | method                 | @abc.abstractmethod                       | exists now
abc.abstract | unadorned          | property               | @abc.abstractproperty                     | exists now
abc.abstract | unadorned          | non-callable attribute | @abc.abstractattribute or                 | proposing
             |                    |                        |  @abc.abstract;@attribute                 |
abc.abstract | static             | method                 | @abc.abstractstaticmethod                 | exists now
abc.abstract | static             | property               | @abc.abstractstaticproperty               | proposing
abc.abstract | static             | non-callable attribute | {degenerate case - variables              | unnecessary
             |                    |                        |  don't have arguments}                    |
abc.abstract | class              | method                 | @abc.abstractclassmethod                  | exists now
abc.abstract | class              | property               | @abc.abstractclassproperty                | proposing
abc.abstract | class              | non-callable attribute | {degenerate case - variables              | unnecessary
             |                    |                        |  don't have arguments}                    |
I think the meanings of the new ones are pretty straightforward, but in
case they are not...
@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.
@classproperty - like @property, only the implicit first argument to the
method is the class. Allows the property to be called directly from the
class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses
@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty
@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty
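In case it helps, here's a rough pure-Python sketch of the read-only half of
@classproperty using the descriptor protocol (illustrative only, not a
proposed implementation):

class classproperty:
    """Read-only class-level property (illustrative sketch)."""
    def __init__(self, fget):
        self.fget = fget
    def __get__(self, instance, owner=None):
        # Pass the class, not the instance, as the implicit first argument.
        if owner is None:
            owner = type(instance)
        return self.fget(owner)

class Foo:
    _registry = {"a": 1, "b": 2}
    @classproperty
    def size(cls):
        return len(cls._registry)

assert Foo.size == 2     # no throw-away instance needed
assert Foo().size == 2   # also works via an instance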
--rich
At the moment, the array module of the standard library allows to
create arrays of different numeric types and to initialize them from
an iterable (eg, another array).
What's missing is the possibility to specify the final size of the
array (number of items), especially for large arrays.
I'm thinking of suffix arrays (a text indexing data structure) for
large texts, eg the human genome and its reverse complement (about 6
billion characters from the alphabet ACGT).
The suffix array is a long int array of the same size (8 bytes per
number, so it occupies about 48 GB memory).
At the moment I am extending an array in chunks of several million
items at a time, which is slow and not elegant.
The function below also initializes each item in the array to a given
value (0 by default).
Is there a reason why the array.array constructor does not allow
you to simply specify the number of items that should be allocated? (I do
not really care about the contents.)
Would this be a worthwhile addition to / modification of the array module?
My suggestion is to modify array construction in such a way that you
could pass an iterator (as now) as second argument, but if you pass a
single integer value, it should be treated as the number of items to
allocate.
Here is my current workaround (which is slow):
import array

def filled_array(typecode, n, value=0, bsize=(1 << 22)):
    """returns a new array with given typecode
    (eg, "l" for long int, as in the array module)
    with n entries, initialized to the given value (default 0)
    """
    a = array.array(typecode, [value] * bsize)
    x = array.array(typecode)
    r = n
    while r >= bsize:
        x.extend(a)
        r -= bsize
    x.extend([value] * r)
    return x
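(In the meantime, two workarounds that I believe allocate the whole array in
one step, in case they help:)

import array

n = 10**6  # number of items (example size)

# Sequence repetition builds the full array in one step:
a = array.array("l", [0]) * n
assert len(a) == n

# A zero-filled array can also be created from a bytes object of the right size:
z = array.array("l", bytes(n * array.array("l").itemsize))
assert len(z) == n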
Hi folks,
I normally wouldn't bring something like this up here, except I think
that there is a possibility of something to be done--a language
documentation clarification if nothing else, though possibly an actual
code change as well.
I've been having an argument with a colleague over the last couple of
days over the proper order of statements when setting up a
try/finally to perform cleanup of some action. On some level we're
both being stubborn I think, and I'm not looking for resolution as to
who's right/wrong or I wouldn't bring it to this list in the first
place. The original argument was over setting and later restoring
os.environ, but we ended up arguing over
threading.Lock.acquire/release which I think is a more interesting
example of the problem, and he did raise a good point that I do want
to bring up.
</prologue>
My colleague's contention is that given
lock = threading.Lock()
this is simply *wrong*:
lock.acquire()
try:
    do_something()
finally:
    lock.release()
whereas this is okay:
with lock:
    do_something()
Ignoring other details of how threading.Lock is actually implemented, and
assuming that Lock.__enter__ calls acquire() and Lock.__exit__ calls
release(), then as far as I've known ever since Python 2.5 first came
out these two examples are semantically *equivalent*, and I can't find
any way of reading PEP 343 or the Python language reference that would
suggest otherwise.
However, there *is* a difference, and has to do with how signals are
handled, particularly w.r.t. context managers implemented in C (hence
we are talking CPython specifically):
If Lock.__enter__ is a pure Python method (even if it maybe calls some
C methods), and a SIGINT is handled during execution of that method,
then in almost all cases a KeyboardInterrupt exception will be raised
from within Lock.__enter__--this means the suite under the with:
statement is never evaluated, and Lock.__exit__ is never called. You
can be fairly sure the KeyboardInterrupt will be raised from somewhere
within a pure Python Lock.__enter__ because there will usually be at
least one remaining opcode to be evaluated, such as RETURN_VALUE.
Because of how delayed execution of signal handlers is implemented in
the pyeval main loop, this means the signal handler for SIGINT will be
called *before* RETURN_VALUE, resulting in the KeyboardInterrupt
exception being raised. Standard stuff.
However, if Lock.__enter__ is a PyCFunction things are quite
different. If you look at how the SETUP_WITH opcode is implemented,
it first calls the __enter__ method with _PyObject_CallNoArg. If this
returns NULL (i.e. an exception occurred in __enter__) then "goto
error" is executed and the exception is raised. However, if it returns
non-NULL, the finally block is set up with PyFrame_BlockSetup and
execution proceeds to the next opcode. At this point a potentially
waiting SIGINT is handled, resulting in KeyboardInterrupt being raised
while inside the with statement's suite, so the finally block, and hence
Lock.__exit__, is entered.
Long story short, because Lock.__enter__ is a C function, assuming
that it succeeds normally then
with lock:
    do_something()
always guarantees that Lock.__exit__ will be called if a SIGINT was
handled inside Lock.__enter__, whereas with
lock.acquire()
try:
    ...
finally:
    lock.release()
there is at least a small possibility that the SIGINT handler is called
after the CALL_FUNCTION op but before the try/finally block is entered
(e.g. before executing POP_TOP or SETUP_FINALLY). So the end result
is that the lock is held and never released after the
KeyboardInterrupt (whether or not it's handled somehow).
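For the curious, the window is easy to eyeball in the bytecode; the exact
opcodes vary by CPython version, but on 3.6/3.7 I'd expect to see the
acquire() call, a POP_TOP and then SETUP_FINALLY before the protected block:

import dis

def do_something():
    pass

def locked(lock):
    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

dis.dis(locked)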
Whereas, again, if Lock.__enter__ is a pure Python function there's
less likely to be any difference (though I don't think the possibility
can be ruled out entirely).
At the very least I think this quirk of CPython should be mentioned
somewhere (since in all other cases the semantic meaning of the
"with:" statement is clear). However, I think it might be possible to
gain more consistency between these cases if pending signals are
checked/handled after any direct call to PyCFunction from within the
ceval loop.
Sorry for the tl;dr; any thoughts?
Hi,
For technical reasons, many functions of the Python standard library
implemented in C have positional-only parameters. Example:
-------
$ ./python
Python 3.7.0a0 (default, Feb 25 2017, 04:30:32)
>>> help(str.replace)
replace(self, old, new, count=-1, /) # <== notice "/" at the end
...
>>> "a".replace("x", "y") # ok
'a'
>>> "a".replace(old="x", new="y") # ERR!
TypeError: replace() takes at least 2 arguments (0 given)
-------
When converting the methods of the builtin str type to the internal
"Argument Clinic" tool (tool to generate the function signature,
function docstring and the code to parse arguments in C), I asked if
we should add support for keyword arguments in str.replace(). The
answer was quick: no! It's a deliberate design choice.
Quote of Yury Selivanov's message:
"""
I think Guido explicitly stated that he doesn't like the idea to
always allow keyword arguments for all methods. I.e. `str.find('aaa')`
just reads better than `str.find(needle='aaa')`. Essentially, the idea
is that for most of the builtins that accept one or two arguments,
positional-only parameters are better.
"""
http://bugs.python.org/issue29286#msg285578
I just noticed a module on PyPI to implement this behaviour on Python functions:
https://pypi.python.org/pypi/positional
My question is: would it make sense to implement this feature in
Python directly? If yes, what should be the syntax? Use "/" marker?
Use the @positional() decorator?
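For illustration, a pure-Python decorator along these lines is easy to sketch
(this is not the API of the PyPI module above, just a rough idea of what
@positional-style enforcement could look like):

import functools
import inspect

def positional_only(n):
    """Reject keyword use of the first n parameters (illustrative sketch)."""
    def decorator(func):
        param_names = list(inspect.signature(func).parameters)[:n]
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bad = [name for name in param_names if name in kwargs]
            if bad:
                raise TypeError(
                    f"{func.__name__}() got positional-only arguments "
                    f"passed as keywords: {', '.join(bad)}")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@positional_only(2)
def replace(old, new, count=-1):
    return (old, new, count)

replace("x", "y")            # ok
# replace(old="x", new="y")  # would raise TypeError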
Do you see concrete cases where it's a deliberate choice to deny
passing arguments as keywords?
Don't you like writing int(x="123") instead of int("123")? :-) (I know
that Serhiy Storchaka hates the name of the "x" parameter of the int
constructor ;-))
By the way, I read that the "/" marker is unknown to almost all Python
developers, and that the [...] syntax should be preferred, but
inspect.signature() doesn't support this syntax. Maybe we should fix
signature() and use [...] format instead?
Replace "replace(self, old, new, count=-1, /)" with "replace(self,
old, new[, count=-1])" (or maybe even not document the default
value?).
Python 3.5 help (docstring) uses "S.replace(old, new[, count])".
Victor
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).
Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
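(If I'm not mistaken, on current CPython both spellings already end up as a
single constant, so there's no runtime cost to requiring the explicit '+':)

import dis

dis.dis(compile("x = 'a' 'b'", "<demo>", "exec"))    # LOAD_CONST 'ab'
dis.dis(compile("x = 'a' + 'b'", "<demo>", "exec"))  # LOAD_CONST 'ab' (constant-folded)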
Would it be reasonable to start deprecating this and eventually remove
it from the language?
--
--Guido van Rossum (python.org/~guido)
Previously I posted PEP 560 two weeks ago, while several other PEPs were
also posted, so it didn't get much of attention. Here I post the PEP 560
again, now including the full text for convenience of commenting.
--
Ivan
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PEP: 560
Title: Core support for generic types
Author: Ivan Levkivskyi <levkivskyi(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 03-Sep-2017
Python-Version: 3.7
Post-History: 09-Sep-2017
Abstract
========
Initially, PEP 484 was designed in such a way that it would not introduce
*any* changes to the core CPython interpreter. Now type hints and
the ``typing`` module are extensively used by the community, e.g. PEP 526
and PEP 557 extend the usage of type hints, and the backport of ``typing``
on PyPI has 1M downloads/month. Therefore, this restriction can be removed.
It is proposed to add two special methods ``__class_getitem__`` and
``__subclass_base__`` to the core CPython for better support of
generic types.
Rationale
=========
The restriction to not modify the core CPython interpreter led to some
design decisions that became questionable when the ``typing`` module started
to be widely used. There are three main points of concerns:
performance of the ``typing`` module, metaclass conflicts, and the large
number of hacks currently used in ``typing``.
Performance:
------------
The ``typing`` module is one of the heaviest and slowest modules in
the standard library even with all the optimizations made. Mainly this is
because subscripted generic types (see PEP 484 for definition of terms
used in this PEP) are class objects (see also [1]_). There are three main
ways the performance can be improved with the help of the proposed special
methods:
- Creation of generic classes is slow since the ``GenericMeta.__new__`` is
very slow; we will not need it anymore.
- Very long MROs for generic classes will be half as long; they are present
because we duplicate the ``collections.abc`` inheritance chain
in ``typing``.
- Time of instantiation of generic classes will be improved
(this is minor however).
Metaclass conflicts:
--------------------
All generic types are instances of ``GenericMeta``, so if a user uses
a custom metaclass, then it is hard to make a corresponding class generic.
This is particularly hard for library classes that a user doesn't control.
A workaround is to always mix-in ``GenericMeta``::
    class AdHocMeta(GenericMeta, LibraryMeta):
        pass

    class UserClass(LibraryBase, Generic[T], metaclass=AdHocMeta):
        ...
but this is not always practical or even possible. With the help of the
proposed special attributes the ``GenericMeta`` metaclass will not be
needed.
Hacks and bugs that will be removed by this proposal:
-----------------------------------------------------
- The ``_generic_new`` hack, which exists because ``__init__`` is not called
  on instances whose type differs from the type whose ``__new__`` was called,
  for example ``C[int]().__class__ is C``.
- The ``_next_in_mro`` speed hack will not be necessary since subscription will
not create new classes.
- The ugly ``sys._getframe`` hack; this one is particularly nasty, since it
looks like we can't remove it without changes outside ``typing``.
- Currently generics do dangerous things with private ABC caches
to fix large memory consumption that grows at least as O(N\ :sup:`2`),
see [2]_. This point is also important because it was recently proposed to
re-implement ``ABCMeta`` in C.
- Problems with sharing attributes between subscripted generics,
see [3]_. The current solution already uses ``__getattr__`` and
``__setattr__``, but it is still incomplete, and solving this without the
current proposal will be hard and will need ``__getattribute__``.
- ``_no_slots_copy`` hack, where we clean-up the class dictionary on every
subscription thus allowing generics with ``__slots__``.
- General complexity of the ``typing`` module: the new proposal will not
only allow removal of the above-mentioned hacks/bugs, but also simplify
the implementation, so that it will be easier to maintain.
Specification
=============
The idea of ``__class_getitem__`` is simple: it is an exact analog of
``__getitem__``, with the exception that it is called on the class that
defines it, not on its instances; this allows us to avoid
``GenericMeta.__getitem__`` for things like ``Iterable[int]``.
``__class_getitem__`` is automatically a class method, does not require
the ``@classmethod`` decorator (similar to ``__init_subclass__``), and is
inherited like normal attributes.
For example::
    class MyList:
        def __getitem__(self, index):
            return index + 1

        def __class_getitem__(cls, item):
            return f"{cls.__name__}[{item.__name__}]"

    class MyOtherList(MyList):
        pass

    assert MyList()[0] == 1
    assert MyList[int] == "MyList[int]"

    assert MyOtherList()[0] == 1
    assert MyOtherList[int] == "MyOtherList[int]"
Note that this method is used as a fallback, so if a metaclass defines
``__getitem__``, then that will have the priority.
If an object that is not a class object appears in the bases of a class
definition, the ``__subclass_base__`` is searched on it. If found,
it is called with the original tuple of bases as an argument. If the result
of the call is not ``None``, then it is substituted instead of this object.
Otherwise (if the result is ``None``), the base is just removed. This is
necessary to avoid inconsistent MRO errors that are currently prevented by
manipulations in ``GenericMeta.__new__``. After creating the class,
the original bases are saved in ``__orig_bases__`` (currently this is also
done by the metaclass).
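For illustration, a sketch of an object participating in this protocol could
look like the following (the names here are illustrative, and the
commented-out class statement requires the interpreter support proposed in
this PEP)::

    class ListAlias:
        """Stands in for a subscripted generic such as List[int]."""

        def __init__(self, origin, item):
            self.__origin__ = origin
            self.__item__ = item

        def __subclass_base__(self, bases):
            # Called during class creation when this (non-class) object
            # appears among the bases; the returned class is substituted
            # for it in the bases tuple, avoiding inconsistent MRO errors.
            return self.__origin__

    # With this PEP:
    #
    #     class MyList(ListAlias(list, int)):
    #         ...
    #
    # would create a class whose actual base is ``list`` and whose
    # ``__orig_bases__`` would contain the ``ListAlias`` instance.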
NOTE: These two method names are reserved for exclusive use by
the ``typing`` module and the generic types machinery, and any other use is
strongly discouraged. The reference implementation (with tests) can be found
in [4]_, the proposal was originally posted and discussed on
the ``typing`` tracker, see [5]_.
Backwards compatibility and impact on users who don't use ``typing``:
=====================================================================
This proposal may break code that currently uses the names
``__class_getitem__`` and ``__subclass_base__``.
This proposal will support almost complete backwards compatibility with
the current public generic types API; moreover, the ``typing`` module is
still provisional. The only two exceptions are that currently
``issubclass(List[int], List)`` returns ``True``, whereas with this proposal
it will raise ``TypeError``, and that ``issubclass(collections.abc.Iterable,
typing.Iterable)`` will return ``False``, which is probably desirable, since
currently we have a (virtual) inheritance cycle between these two classes.
With the reference implementation I measured negligible performance effects
(under 1% on a micro-benchmark) for regular (non-generic) classes.
References
==========
.. [1] Discussion following Mark Shannon's presentation at Language Summit
(https://github.com/python/typing/issues/432)
.. [2] Pull Request to implement shared generic ABC caches
(https://github.com/python/typing/pull/383)
.. [3] An old bug with setting/accessing attributes on generic types
(https://github.com/python/typing/issues/392)
.. [4] The reference implementation
(https://github.com/ilevkivskyi/cpython/pull/2/files)
.. [5] Original proposal
(https://github.com/python/typing/issues/468)
Copyright
=========
This document has been placed in the public domain.
I have written a short PEP as a complement/alternative to PEP 549.
I will be grateful for comments and suggestions. The PEP should
appear online soon.
--
Ivan
***********************************************************
PEP: 562
Title: Module __getattr__
Author: Ivan Levkivskyi <levkivskyi(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 09-Sep-2017
Python-Version: 3.7
Post-History: 09-Sep-2017
Abstract
========
It is proposed to support ``__getattr__`` function defined on modules to
provide basic customization of module attribute access.
Rationale
=========
It is sometimes convenient to customize or otherwise have control over
access to module attributes. A typical example is managing deprecation
warnings. Typical workarounds are assigning ``__class__`` of a module object
to a custom subclass of ``types.ModuleType`` or substituting ``sys.modules``
item with a custom wrapper instance. It would be convenient to simplify this
procedure by recognizing ``__getattr__`` defined directly in a module that
would act like a normal ``__getattr__`` method, except that it will be
defined
on module *instances*. For example::
    # lib.py

    from warnings import warn

    deprecated_names = ["old_function", ...]

    def _deprecated_old_function(arg, other):
        ...

    def __getattr__(name):
        if name in deprecated_names:
            warn(f"{name} is deprecated", DeprecationWarning)
            return globals()[f"_deprecated_{name}"]
        raise AttributeError(f"module {__name__} has no attribute {name}")

    # main.py

    from lib import old_function  # Works, but emits the warning
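For comparison, the ``__class__`` assignment workaround mentioned in the
Rationale looks roughly like this (a sketch, assuming the same ``lib.py``
module contents as above)::

    # lib.py (current workaround: replace the module's class)

    import sys
    import types
    from warnings import warn

    deprecated_names = ["old_function", ...]

    def _deprecated_old_function(arg, other):
        ...

    class _DeprecationModule(types.ModuleType):
        def __getattr__(self, name):
            if name in deprecated_names:
                warn(f"{name} is deprecated", DeprecationWarning)
                return globals()[f"_deprecated_{name}"]
            raise AttributeError(
                f"module {__name__} has no attribute {name}")

    sys.modules[__name__].__class__ = _DeprecationModule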
There is a related proposal, PEP 549, that proposes to support instance
properties for similar functionality. The difference is that this PEP proposes
a faster and simpler mechanism, but provides more basic customization.
An additional motivation for this proposal is that PEP 484 already defines
the use of module ``__getattr__`` for this purpose in Python stub files,
see [1]_.
Specification
=============
The ``__getattr__`` function at the module level should accept one argument
which is the name of an attribute, and return the computed value or raise
an ``AttributeError``::
    def __getattr__(name: str) -> Any: ...
This function will be called only if ``name`` is not found in the module
through the normal attribute lookup.
The reference implementation for this PEP can be found in [2]_.
Backwards compatibility and impact on performance
=================================================
This PEP may break code that uses module level (global) name
``__getattr__``.
The performance implications of this PEP are minimal, since ``__getattr__``
is called only for missing attributes.
References
==========
.. [1] PEP 484 section about ``__getattr__`` in stub files
(https://www.python.org/dev/peps/pep-0484/#stub-files)
.. [2] The reference implementation
(https://github.com/ilevkivskyi/cpython/pull/3/files)
Copyright
=========
This document has been placed in the public domain.
Hi all,
as promised, here is a draft PEP for context variable semantics and
implementation. Apologies for the slight delay; I had a not-so-minor
autosave accident and had to retype the majority of this first draft.
During the past years, there has been growing interest in something like
task-local storage or async-local storage. This PEP proposes an alternative
approach to solving the problems that are typically stated as motivation
for such concepts.
This proposal is based on sketches of solutions since spring 2015, with
some minor influences from the recent discussion related to PEP 550. I can
also see some potential implementation synergy between this PEP and PEP
550, even if the proposed semantics are quite different.
So, here it is. This is the first draft and some things are still missing,
but the essential things should be there.
-- Koos
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PEP: 999
Title: Context-local variables (contextvars)
Version: $Revision$
Last-Modified: $Date$
Author: Koos Zevenhoven
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: DD-Mmm-YYYY
Post-History: DD-Mmm-YYYY
Abstract
========
Sometimes, in special cases, it is desired that code can pass information
down the function call chain to the callees without having to explicitly
pass the information as arguments to each function in the call chain. This
proposal describes a construct which allows code to explicitly switch in
and out of a context where a certain context variable has a given value
assigned to it. This is a modern alternative to some uses of things like
global variables in traditional single-threaded (or thread-unsafe) code and
of thread-local storage in traditional *concurrency-unsafe* code (single-
or multi-threaded). In particular, the proposed mechanism can also be used
with more modern concurrent execution mechanisms such as asynchronously
executed coroutines, without the concurrently executed call chains
interfering with each other's contexts.
The "call chain" can consist of normal functions, awaited coroutines, or
generators. The semantics of context variable scope are equivalent in all
cases, allowing code to be refactored freely into *subroutines* (which here
refers to functions, sub-generators or sub-coroutines) without affecting
the semantics of context variables. Regarding implementation, this proposal
aims at simplicity and minimum changes to the CPython interpreter and to
other Python interpreters.
Rationale
=========
Consider a modern Python *call chain* (or call tree), which in this
proposal refers to any chained (nested) execution of *subroutines*, using
any possible combinations of normal function calls, or expressions using
``await`` or ``yield from``. In some cases, passing necessary *information*
down the call chain as arguments can substantially complicate the required
function signatures, or it can even be impossible to achieve in practice.
In these cases, one may search for another place to store this information.
Let us look at some historical examples.
The most naive option is to assign the value to a global variable or
similar, where the code down the call chain can access it. However, this
immediately makes the code thread-unsafe, because with multiple threads,
all threads assign to the same global variable, and another thread can
interfere at any point in the call chain.
A somewhat less naive option is to store the information as per-thread
information in thread-local storage, where each thread has its own "copy"
of the variable which other threads cannot interfere with. Although
non-ideal, this has been the best solution in many cases. However, thanks
to generators and coroutines, the execution of the call chain can be
suspended and resumed, allowing code in other contexts to run concurrently.
Therefore, using thread-local storage is *concurrency-unsafe*, because
other call chains in other contexts may interfere with the thread-local
variable.
Note that in the above two historical approaches, the stored information
has the *widest* available scope without causing problems. For a third
solution along the same path, one would first define an equivalent of a
"thread" for asynchronous execution and concurrency. This could be seen as
the largest amount of code and nested calls that is guaranteed to be
executed sequentially without ambiguity in execution order. This might be
referred to as concurrency-local or task-local storage. In this meaning of
"task", there is no ambiguity in the order of execution of the code within
one task. (This concept of a task is close to equivalent to a ``Task`` in
``asyncio``, but not exactly.) In such concurrency-locals, it is possible
to pass information down the call chain to callees without another code
path interfering with the value in the background.
Common to the above approaches is that they indeed use variables with a
wide but just-narrow-enough scope. Thread-locals could also be called
thread-wide globals---in single-threaded code, they are indeed truly
global. And task-locals could be called task-wide globals, because tasks
can be very big.
The issue here is that neither global variables, thread-locals nor
task-locals are really meant to be used for this purpose of passing
information of the execution context down the call chain. Instead of the
widest possible variable scope, the scope of the variables should be
controlled by the programmer, typically of a library, to have the desired
scope---not wider. In other words, task-local variables (and globals and
thread-locals) have nothing to do with the kind of context-bound
information passing that this proposal intends to enable, even if
task-locals can be used to emulate the desired semantics. Therefore, in the
following, this proposal describes the semantics and the outlines of an
implementation for *context-local variables* (or context variables,
contextvars). In fact, as a side effect of this PEP, an async framework can
use the proposed feature to implement task-local variables.
Proposal
========
Because the proposed semantics are not a direct extension to anything
already available in Python, this proposal is first described in terms of
semantics and API at a fairly high level. In particular, Python ``with``
statements are heavily used in the description, as they are a good match
with the proposed semantics. However, the underlying ``__enter__`` and
``__exit__`` methods correspond to functions in the lower-level
speed-optimized (C) API. For clarity of this document, the lower-level
functions are not explicitly named in the definition of the semantics.
After describing the semantics and high-level API, the implementation is
described, going to a lower level.
Semantics and higher-level API
------------------------------
Core concept
''''''''''''
A context-local variable is represented by a single instance of
``contextvars.Var``, say ``cvar``. Any code that has access to the ``cvar``
object can ask for its value with respect to the current context. In the
high-level API, this value is given by the ``cvar.value`` property::
    cvar = contextvars.Var(default="the default value",
                           description="example context variable")

    assert cvar.value == "the default value"  # default still applies

    # In code examples, all ``assert`` statements should
    # succeed according to the proposed semantics.
No assignments to ``cvar`` have been applied for this context, so
``cvar.value`` gives the default value. Assigning new values to contextvars
is done in a highly scope-aware manner::
    with cvar.assign(new_value):
        assert cvar.value is new_value
        # Any code here, or down the call chain from here, sees:
        #     cvar.value is new_value
        # unless another value has been assigned in a
        # nested context
        assert cvar.value is new_value
    # the assignment of ``cvar`` to ``new_value`` is no longer visible
    assert cvar.value == "the default value"
Here, ``cvar.assign(value)`` returns another object, namely
``contextvars.Assignment(cvar, new_value)``. The essential part here is
that applying a context variable assignment (``Assignment.__enter__``) is
paired with a de-assignment (``Assignment.__exit__``). These operations set
the bounds for the scope of the assigned value.
Assignments to the same context variable can be nested to override the
outer assignment in a narrower context::
    assert cvar.value == "the default value"
    with cvar.assign("outer"):
        assert cvar.value == "outer"
        with cvar.assign("inner"):
            assert cvar.value == "inner"
        assert cvar.value == "outer"
    assert cvar.value == "the default value"
Also multiple variables can be assigned to in a nested manner without
affecting each other::
    cvar1 = contextvars.Var()
    cvar2 = contextvars.Var()

    assert cvar1.value is None  # default is None by default
    assert cvar2.value is None

    with cvar1.assign(value1):
        assert cvar1.value is value1
        assert cvar2.value is None
        with cvar2.assign(value2):
            assert cvar1.value is value1
            assert cvar2.value is value2
        assert cvar1.value is value1
        assert cvar2.value is None
    assert cvar1.value is None
    assert cvar2.value is None
Or with more convenient Python syntax::
    with cvar1.assign(value1), cvar2.assign(value2):
        assert cvar1.value is value1
        assert cvar2.value is value2
In another *context*, in another thread or otherwise concurrently executed
task or code path, the context variables can have a completely different
state. The programmer thus only needs to worry about the context at hand.
Refactoring into subroutines
''''''''''''''''''''''''''''
Code using contextvars can be refactored into subroutines without affecting
the semantics. For instance::
    assi = cvar.assign(new_value)

    def apply():
        assi.__enter__()

    assert cvar.value == "the default value"
    apply()
    assert cvar.value is new_value
    assi.__exit__()
    assert cvar.value == "the default value"
Or similarly in an asynchronous context where ``await`` expressions are
used. The subroutine can now be a coroutine::
    assi = cvar.assign(new_value)

    async def apply():
        assi.__enter__()

    assert cvar.value == "the default value"
    await apply()
    assert cvar.value is new_value
    assi.__exit__()
    assert cvar.value == "the default value"
Or when the subroutine is a generator::
    def apply():
        yield
        assi.__enter__()
which is called using ``yield from apply()`` or with calls to ``next`` or
``.send``. This is discussed further in later sections.
Semantics for generators and generator-based coroutines
'''''''''''''''''''''''''''''''''''''''''''''''''''''''
Generators, coroutines and async generators act as subroutines in much the
same way that normal functions do. However, they have the additional
possibility of being suspended by ``yield`` expressions. Assignment
contexts entered inside a generator are normally preserved across yields::
    def genfunc():
        with cvar.assign(new_value):
            assert cvar.value is new_value
            yield
            assert cvar.value is new_value

    g = genfunc()
    next(g)
    assert cvar.value == "the default value"
    with cvar.assign(another_value):
        next(g)
However, the outer context visible to the generator may change state across
yields::
    def genfunc():
        assert cvar.value is value2
        yield
        assert cvar.value is value1
        yield
        with cvar.assign(value3):
            assert cvar.value is value3

    with cvar.assign(value1):
        g = genfunc()
        with cvar.assign(value2):
            next(g)
        next(g)
        next(g)
        assert cvar.value is value1
Similar semantics apply to async generators (defined by ``async def ...
yield ...``).
By default, values assigned inside a generator do not leak through yields
to the code that drives the generator. However, the assignment contexts
entered and left open inside the generator *do* become visible outside the
generator after the generator has finished with a ``StopIteration`` or
another exception::
    assi = cvar.assign(new_value)

    def genfunc():
        yield
        assi.__enter__()
        yield

    g = genfunc()
    assert cvar.value == "the default value"
    next(g)
    assert cvar.value == "the default value"
    next(g)  # assi.__enter__() is called here
    assert cvar.value == "the default value"
    next(g)
    assert cvar.value is new_value
    assi.__exit__()
Special functionality for framework authors
-------------------------------------------
Frameworks, such as ``asyncio`` or third-party libraries, can use
additional functionality in ``contextvars`` to achieve the desired
semantics in cases which are not determined by the Python interpreter. Some
of the semantics described in this section are also afterwards used to
describe the internal implementation.
Leaking yields
''''''''''''''
Using the ``contextvars.leaking_yields`` decorator, one can choose to leak
the context through ``yield`` expressions into the outer context that
drives the generator::
    @contextvars.leaking_yields
    def genfunc():
        assert cvar.value == "outer"
        with cvar.assign("inner"):
            yield
            assert cvar.value == "inner"
        assert cvar.value == "outer"

    g = genfunc()
    with cvar.assign("outer"):
        assert cvar.value == "outer"
        next(g)
        assert cvar.value == "inner"
        next(g)
        assert cvar.value == "outer"
Capturing contextvar assignments
''''''''''''''''''''''''''''''''
Using ``contextvars.capture()``, one can capture the assignment contexts
that are entered by a block of code. The changes applied by the block of
code can then be reverted and subsequently reapplied, even in another
context::
    assert cvar1.value is None  # default
    assert cvar2.value is None  # default

    assi1 = cvar1.assign(value1)
    assi2 = cvar1.assign(value2)
    with contextvars.capture() as delta:
        assi1.__enter__()
        with cvar2.assign("not captured"):
            assert cvar2.value == "not captured"
        assi2.__enter__()
    assert cvar1.value is value2
    delta.revert()
    assert cvar1.value is None
    assert cvar2.value is None

    ...

    with cvar1.assign(1), cvar2.assign(2):
        delta.reapply()
        assert cvar1.value is value2
        assert cvar2.value == 2
However, reapplying the "delta" if its net contents include deassignments
may not be possible (see also Implementation and Open Issues).
Getting a snapshot of context state
'''''''''''''''''''''''''''''''''''
The function ``contextvars.get_local_state()`` returns an object
representing the applied assignments to all context-local variables in the
context where the function is called. This can be seen as equivalent to
using ``contextvars.capture()`` to capture all context changes from the
beginning of execution. The returned object supports methods ``.revert()``
and ``reapply()`` as above.
Running code in a clean state
'''''''''''''''''''''''''''''
Although it is possible to revert all applied context changes using the
above primitives, a more convenient way to run a block of code in a clean
context is provided::
    with contextvars.clean_context():
        # here, all context vars start off with their default values
        ...

    # here, the state is back to what it was before the with block
Implementation
--------------
This section describes, at a varying level of detail, how the described
semantics can be implemented. At present, an implementation aimed at
simplicity but with sufficient features is described. More details will be
added later.
Alternatively, a somewhat more complicated implementation offers minor
additional features while adding some performance overhead and requiring
more code in the implementation.
Data structures and implementation of the core concept
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Each thread of the Python interpreter keeps its own stack of
``contextvars.Assignment`` objects, each having a pointer to the previous
(outer) assignment like in a linked list. The local state (also returned by
``contextvars.get_local_state()``) then consists of a reference to the top
of the stack and a pointer/weak reference to the bottom of the stack. This
allows efficient stack manipulations. An object produced by
``contextvars.capture()`` is similar, but refers to only a part of the
stack with the bottom reference pointing to the top of the stack as it was
in the beginning of the capture block.
Now, the stack evolves according to the assignment ``__enter__`` and
``__exit__`` methods. For example::
    cvar1 = contextvars.Var()
    cvar2 = contextvars.Var()

    # stack: []
    assert cvar1.value is None
    assert cvar2.value is None

    with cvar1.assign("outer"):
        # stack: [Assignment(cvar1, "outer")]
        assert cvar1.value == "outer"
        with cvar1.assign("inner"):
            # stack: [Assignment(cvar1, "outer"),
            #         Assignment(cvar1, "inner")]
            assert cvar1.value == "inner"
            with cvar2.assign("hello"):
                # stack: [Assignment(cvar1, "outer"),
                #         Assignment(cvar1, "inner"),
                #         Assignment(cvar2, "hello")]
                assert cvar2.value == "hello"
            # stack: [Assignment(cvar1, "outer"),
            #         Assignment(cvar1, "inner")]
            assert cvar1.value == "inner"
            assert cvar2.value is None
        # stack: [Assignment(cvar1, "outer")]
        assert cvar1.value == "outer"
    # stack: []
    assert cvar1.value is None
    assert cvar2.value is None
Getting a value from the context using ``cvar1.value`` can be implemented
as finding the topmost occurrence of a ``cvar1`` assignment on the stack
and returning the value there, or the default value if no assignment is
found on the stack. However, this can be optimized to instead be an O(1)
operation in most cases. Still, even searching through the stack may be
reasonably fast since these stacks are not intended to grow very large.
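For illustration, a minimal per-thread toy model of this naive lookup
(ignoring generators, ``capture()`` and the O(1) optimization; not the
proposed implementation) could look like this::

    import threading

    _state = threading.local()

    def _assignment_stack():
        # Per-thread stack of Assignment objects (simplified).
        if not hasattr(_state, "stack"):
            _state.stack = []
        return _state.stack

    class Assignment:
        def __init__(self, var, value):
            self.var, self.value = var, value

        def __enter__(self):
            _assignment_stack().append(self)

        def __exit__(self, *exc):
            top = _assignment_stack().pop()
            assert top is self  # deassignments happen in reverse order

    class Var:
        def __init__(self, default=None, description=""):
            self.default = default
            self.description = description

        def assign(self, value):
            return Assignment(self, value)

        @property
        def value(self):
            # Topmost matching assignment wins; otherwise the default.
            for assignment in reversed(_assignment_stack()):
                if assignment.var is self:
                    return assignment.value
            return self.default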
The above description is already sufficient for implementing the core
concept. Suspendable frames require some additional attention, as explained
in the following.
Implementation of generator and coroutine semantics
'''''''''''''''''''''''''''''''''''''''''''''''''''
Within generators, coroutines and async generators, assignments and
deassignments are handled in exactly the same way as anywhere else.
However, some changes are needed in the builtin generator methods ``send``,
``__next__``, ``throw`` and ``close``. Here is the Python equivalent of the
changes needed in ``send`` for a generator (here ``_old_send`` refers to
the behavior in Python 3.6)::
    def send(self, value):
        # if decorated with contextvars.leaking_yields
        if self.gi_contextvars is LEAK:
            # nothing needs to be done to leak context through yields :)
            return self._old_send(value)
        try:
            with contextvars.capture() as delta:
                if self.gi_contextvars:
                    # non-zero captured content from previous iteration
                    self.gi_contextvars.reapply()
                ret = self._old_send(value)
        except Exception:
            raise
        else:
            # suspending, revert context changes but save them for
            # when the generator is resumed
            delta.revert()
            self.gi_contextvars = delta
            return ret
The corresponding modifications to the other methods are essentially
identical. The same applies to coroutines and async generators.
For code that does not use ``contextvars``, the additions are O(1) and
essentially reduce to a couple of pointer comparisons. For code that does
use ``contextvars``, the additions are still O(1) in most cases.
More on implementation
''''''''''''''''''''''
The rest of the functionality, including ``contextvars.leaking_yields``,
``contextvars.capture()``, ``contextvars.get_local_state()`` and
``contextvars.clean_context()``, is in fact quite straightforward to
implement, but the implementation will be discussed further in later
versions of this proposal. Caching of assigned values is somewhat more
complicated, and will be discussed later, but it seems that most cases
should achieve O(1) complexity.
Backwards compatibility
=======================
There are no *direct* backwards-compatibility concerns, since a completely
new feature is proposed.
However, various traditional uses of thread-local storage may need a smooth
transition to ``contextvars`` so they can be concurrency-safe. There are
several approaches to this, including emulating task-local storage with a
little bit of help from async frameworks. A fully general implementation
cannot be provided, because the desired semantics may depend on the design
of the framework.
Another way to deal with the transition is for code to first look for a
context created using ``contextvars``. If that fails because a new-style
context has not been set or because the code runs on an older Python
version, a fallback to thread-local storage is used.
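For example, a library could use a pattern roughly like the following during
the transition (an illustrative sketch; ``get_current_request`` and
``set_request_context`` are hypothetical library functions, and the
``contextvars`` names follow this proposal)::

    try:
        import contextvars
        _request_var = contextvars.Var(default=None,
                                       description="current request")

        def get_current_request():
            return _request_var.value

        def set_request_context(request):
            # Used as a context manager by the caller.
            return _request_var.assign(request)

    except ImportError:
        import contextlib
        import threading

        _tls = threading.local()

        def get_current_request():
            return getattr(_tls, "request", None)

        @contextlib.contextmanager
        def set_request_context(request):
            old = get_current_request()
            _tls.request = request
            try:
                yield
            finally:
                _tls.request = old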
Open Issues
===========
Out-of-order de-assignments
---------------------------
In this proposal, all variable deassignments are made in the opposite order
compared to the preceding assignments. This has two useful properties: it
encourages using ``with`` statements to define assignment scope, and it has a
tendency to catch errors early (forgetting a ``.__exit__()`` call often
results in a meaningful error). Making this a requirement is also beneficial
in terms of implementation simplicity and performance.
Nevertheless, allowing out-of-order context exits is not completely out of
the question, and reasonable implementation strategies for that do exist.
Rejected Ideas
==============
Dynamic scoping linked to subroutine scopes
-------------------------------------------
The scope of value visibility should not be determined by the way the code
is refactored into subroutines. It is necessary to have per-variable
control of the assignment scope.
Acknowledgements
================
To be added.
References
==========
To be added.
--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
Forwarding my reply, since Google Groups still can't get the Reply-To
headers for the mailing list right, and we still don't know how to
categorically prohibit posting from there.
---------- Forwarded message ----------
From: Nick Coghlan <ncoghlan(a)gmail.com>
Date: 26 September 2017 at 12:51
Subject: Re: [Python-ideas] Fwd: A PEP to define basical metric which
allows to guarantee minimal code quality
To: Alexandre GALODE <alexandre.galode(a)gmail.com>
Cc: python-ideas <python-ideas(a)googlegroups.com>
On 25 September 2017 at 21:49, <alexandre.galode(a)gmail.com> wrote:
> Hi,
>
> Sorry for being late, I was on a professional trip to PyCon FR.
>
> I see that the subject divides opinions.
>
> Reading the responses, I have the impression that my proposal has been seen
> as mandatory, which of course I don't want. As previously said, I see this "PEP"
> as an informational PEP. So it's a guideline, not mandatory. Each
> developer will have the right to ignore it, as each developer can choose to
> ignore PEP 8 or PEP 20.
>
> A perfect solution does not exist, I know, but I think this "PEP" could,
> partially, be a good guideline.
Your question is essentially "Are python-dev prepared to offer generic
code quality assessment advice to Python developers?"
The answer is "No, we're not". It's not our role, and it's not a role
we're the least bit interested in taking on. Just because we're the
ones making the software equivalent of hammers and saws doesn't mean
we're also the ones that should be drafting or signing off on people's
building codes :)
Python's use cases are too broad, and what's appropriate for my ad hoc
script to download desktop wallpaper backgrounds, isn't going to be
what's appropriate for writing an Ansible module, which in turn isn't
going to be the same as what's appropriate for writing a highly
scalable web service or a complex data analysis job.
So the question of "What does 'good enough for my purposes' actually
mean?" is something for end users to tackle for themselves, either
individually or collaboratively, without seeking specific language
designer endorsement of their chosen criteria.
However, as mentioned earlier in the thread, it would be *entirely*
appropriate for the folks participating in PyCQA to decide to either
take on this work themselves, or else endorse somebody else taking it
on. I'd see such an effort as being similar to the way that
packaging.python.org originally started as an independent PyPA project
hosted at python-packaging-user-guide.readthedocs.io, with a fair bit
of content already being added before we later requested and received
the python.org subdomain.
Cheers,
Nick.
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
Hi folks:
I was recently looking for an entry-level cpython task to work on in
my spare time and plucked this off of someone's TODO list.
"Make optimizations more fine-grained than just -O and -OO"
There are currently three supported optimization levels (0, 1, and 2).
Briefly summarized, they do the following.
0: no optimizations
1: remove assert statements and __debug__ blocks
2: remove docstrings, assert statements, and __debug__ blocks
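(For anyone who wants to see the existing behavior for themselves, a quick
check along these lines works; run it with no flag, with -O, and with -OO:)

# demo.py
"""Module docstring (stripped at -OO)."""
import sys

print("optimize level:", sys.flags.optimize)
print("module docstring:", __doc__)
print("__debug__ is", __debug__)
try:
    assert False, "assert statements are active"
    print("assert was stripped")  # only reached under -O / -OO
except AssertionError as exc:
    print("assert raised:", exc)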
From what I gather, their use-case is assert statements in production
code. More specifically, they want to be able to optimize away
docstrings, but keep the assert statements, which currently isn't
possible with the existing optimization levels.
As a first baby-step, I considered just adding a new optimization
level 3 that keeps asserts but continues to remove docstrings and
__debug__ blocks.
3: remove docstrings and __debug__ blocks
From a command-line perspective, there is already support for
additional optimization levels. That is, without making any changes,
the optimization level will increase with the number of 0s provided.
$ python -c "import sys; print(sys.flags.optimize)"
0
$ python -OO -c "import sys; print(sys.flags.optimize)"
2
$ python -OOOOOOO -c "import sys; print(sys.flags.optimize)"
7
And the PYTHONOPTIMIZE environment variable will happily assign
something like 42 to sys.flags.optimize.
$ unset PYTHONOPTIMIZE
$ python -c "import sys; print(sys.flags.optimize)"
0
$ export PYTHONOPTIMIZE=2
$ python -c "import sys; print(sys.flags.optimize)"
2
$ export PYTHONOPTIMIZE=42
$ python -c "import sys; print(sys.flags.optimize)"
42
Finally, the resulting __pycache__ folder also already contains the
expected bytecode files for the new optimization levels (
__init__.cpython-37.opt-42.pyc was created for optimization level 42,
for example).
$ tree
.
└── test
├── __init__.py
└── __pycache__
├── __init__.cpython-37.opt-1.pyc
├── __init__.cpython-37.opt-2.pyc
├── __init__.cpython-37.opt-42.pyc
├── __init__.cpython-37.opt-7.pyc
└── __init__.cpython-37.pyc
Adding optimization level 3 is an easy change to make. Here's that
quick proof of concept (minus changes to the docs, etc). I've also
attached that diff as 3.diff.
https://github.com/dianaclarke/cpython/commit/4bd7278d87bd762b2989178e5bfed…
I was initially looking for a more elegant solution that allowed you
to specify exactly which optimizations you wanted, and when I floated
this naive ("level 3") approach off-list to a few core developers,
their feedback confirmed my hunch (too hacky).
So for my second pass at this task, I started with the following two
pronged approach.
1) Changed the various compile signatures to accept a set of
string optimization flags rather than an int value.
2) Added a new command line option N that allows you to specify
any number of individual optimization flags.
For example:
python -N nodebug -N noassert -N nodocstring
The existing optimization options (-O and -OO) still exist in this
approach, but they are mapped to the new optimization flags
("nodebug", "noassert", "nodocstring").
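Concretely, that mapping would presumably look something like this (a sketch;
the flag spellings are just the ones used above):

OPTIMIZATION_FLAG_MAP = {
    0: frozenset(),                                        # (no flag)
    1: frozenset({"nodebug", "noassert"}),                 # -O
    2: frozenset({"nodebug", "noassert", "nodocstring"}),  # -OO
}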
With the exception of the builtin compile() function, all underlying
compile functions would only accept optimization flags going forward,
and the builtin compile() function would accept either an integer
optimize value or a set of optimization flags for backwards
compatibility.
You can find that work-in-progress approach here on github (also
attached as N.diff).
https://github.com/dianaclarke/cpython/commit/3e36cea1fc8ee6f4cdc584851e4c1…
All in all, that approach is going fairly well, but there's a lot of
work remaining, and that diff is already getting quite large (for my
new-contributor status).
Note for example, that I haven't yet tackled adding bytecode files to
__pycache__ that reflect these new optimization flags. Something like:
$ tree
.
└── test
├── __init__.py
└── __pycache__
├── __init__.cpython-37.opt-nodebug-noassert.pyc
├── __init__.cpython-37.opt-nodebug-nodocstring.pyc
├── __init__.cpython-37.opt-nodebug-noassert-nodocstring.pyc
└── __init__.cpython-37.pyc
I'm also not certain if the various compile signatures are even open
for change (int optimize => PyObject *optimizations), or if that's a
no-no.
And there are still a ton of references to "-O", "-OO",
"sys.flags.optimize", "Py_OptimizeFlag", "PYTHONOPTIMIZE", "optimize",
etc that all need to be audited and their implications considered.
I've really enjoyed this task and I'm learning a lot about the c api,
but I think this is a good place to stop and solicit feedback and
direction.
My gut says that the amount of churn and resulting risk is too high to
continue down this path, but I would love to hear thoughts from others
(alternate approaches, ways to limit scope, confirmation that the
existing approach is too entrenched for change, etc).
Regardless, I think the following subset change could merge without
any bigger picture changes, as it just adds test coverage for a case
not yet covered. I can reopen that pull request once I clean up the
commit message a bit (I closed it in the mean time).
https://github.com/python/cpython/pull/3450/commits/bfdab955a94a7fef431548f…
Thanks for your time!
Cheers,
--diana