[Python-ideas] PEP draft: context variables
Paul Moore
p.f.moore at gmail.com
Sun Oct 15 08:15:34 EDT 2017
On 13 October 2017 at 23:30, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> At this point of time, there's just one place which describes one well
> defined semantics: PEP 550 latest version. Paul, if you have
> time/interest, please take a look at it, and say what's confusing
> there.
Hi Yury,
The following are my impressions from a read-through of the initial
part of the PEP. tl;dr - you say "concurrent" too much and it makes
my head hurt :-)
1. The abstract feels like it's talking about async. The phrase
"consistent access to non-local state in the context of out-of-order
execution, such as in Python generators and coroutines" said async to
me, even though it mentioned generators. Probably because any time I
see generators mentioned alongside coroutines (a term I really don't
grasp yet in the context of Python) I immediately assume the reference
is to the weird extensions of generators when send() and yield
expressions are used. It quite genuinely took me two or three attempts
to get past the abstract and actually read the next section, because
the "this is async" idea came across so strongly.
2. The rationale says that "Prior to the advent of asynchronous
programming in Python" threads and TLS were used - and it implies this
was fine. But the section goes on to say "TLS does not work well for
programs which execute concurrently in a single thread". But it uses a
*generator* as the example. I'm sorry, but to me a generator is pure
and simple standard Python, and definitely not "executing concurrently
in a single thread" (see below). So again, the clash between what the
description said and the actual example left me confused (and confused
enough to equate all of this in my mind with "all that async stuff I
don't follow").
3. "This is because implicit Decimal context is stored as a
thread-local, so concurrent iteration of the fractions() generator
would corrupt the state." This makes no sense to me. The example isn't
concurrent. There's only one thread, and no async. So no concurrency.
It's interleaved iteration through two generators, which I understand
is *technically* considered concurrency in the async sense, but
doesn't *feel* like concurrency. At its core, this is the problem I'm
hitting throughout the whole document - the conceptual clash between
examples that don't feel concurrent, and discussions that talk almost
totally in terms of concurrency, means that understanding every
section is a significant mental effort.
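For concreteness, the interleaving in point 3 can be sketched roughly as below. The body of fractions() is an assumption built around the name the PEP uses, and interleaved() is a hypothetical driver added purely for illustration. The outer localcontext() is only there so the sketch doesn't disturb the caller's own decimal context.

```python
import decimal
from decimal import Decimal


def fractions(precision, x, y):
    # Assumed shape of the PEP's example: set a precision, then yield
    # while still inside the "with" block.
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        yield Decimal(x) / Decimal(y)
        yield Decimal(x) / Decimal(y * 2)


def interleaved():
    # Hypothetical driver: one thread, no async, just alternating next() calls.
    with decimal.localcontext():  # keeps any leak away from the caller
        g1 = fractions(2, 1, 3)
        g2 = fractions(6, 2, 3)
        first = (next(g1), next(g2))
        second = (next(g1), next(g2))  # g1 resumes under g2's precision
        g2.close()  # unwind in reverse order of entry
        g1.close()
    return [first, second]


alone = list(fractions(2, 1, 3))  # iterated on its own
mixed = interleaved()             # iterated alongside a second generator
print(alone)
print(mixed)
```

Iterated on its own, the first generator's second value is computed at 2 digits. Interleaved, the same expression is computed at 6 digits, because the second generator's context is still the active one when the first generator is resumed.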
4. By the end of the rationale, what I'd got from the document was:
"There's a bug in decimal.context, caused by the fact that it uses
TLS. It's basically a limitation of TLS. To fix it they need a new
mechanism, which this PEP provides." So unless I'm using (or would
expect to use) TLS in my own code, this doesn't affect me. Which
really isn't the point (if I now understand correctly) - the PEP is
actually providing a safe (and hopefully easy to use/understand!)
mechanism for handling a specific class of programming problem:
maintaining dynamic state that follows the execution order of the
code, rather than the lexical structure. (I didn't state that well,
but I hope I got the idea across.) Basically, the problem that Lisp
dynamic variables are designed to solve (although I don't think that
describing the feature in terms of Lisp is a good idea either).
4a. I'd much prefer this part of the PEP to be structured as follows:
* There's a class of programming problems that need to allow code
to access "state" in a way that follows the runtime path the code
takes. Prior art in this area includes Lisp's dynamic scope, ... (more
examples would be good - IIRC, Perl has this type of variable too).
* Normal local variables can't do this as they are lexically
scoped. Global variables can be used, but they don't work in the
presence of threads.
* TLS works for threads, but hits problems when code execution paths
aren't nested subroutine-style. Examples where this happens are
generators (which suspend execution and yield back to their parent),
and async (which simulates multiple threads by interleaving execution
of generators). [Note - I know this explanation is probably
inaccurate]
* This PEP proposes a general mechanism that will allow
programmers to simply write code that manages state like this, which
will work in all of the above cases.
That's it. Barely any mention of async, no need to focus on the
Decimal bug except as a motivating example of why TLS isn't
sufficient, and so no risk that people think "why not just fix
decimal.context" - so no need to go into detail as to why you can't
"just fix it". And it frames the PEP as providing a new piece of
functionality that *anyone* might find a use for, rather than as a fix
for a corner case of async/TLS interaction.
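As a small illustration of the bullets about globals and TLS, here is a minimal sketch contrasting a plain global with threading.local() under two threads. The names shared, local, worker and run are made up for the example, and the barrier is only there to make the outcome deterministic.

```python
import threading

shared = {}                 # plain global state, visible to every thread
local = threading.local()   # thread-local state, one slot per thread


def worker(name, barrier, seen_global, seen_local):
    shared["user"] = name
    local.user = name
    barrier.wait()  # both threads have written before either reads
    seen_global[name] = shared["user"]
    seen_local[name] = local.user


def run():
    barrier = threading.Barrier(2)
    seen_global, seen_local = {}, {}
    threads = [
        threading.Thread(target=worker, args=(n, barrier, seen_global, seen_local))
        for n in ("a", "b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen_global, seen_local


seen_global, seen_local = run()
print(seen_global)  # both threads read whichever value was written last
print(seen_local)   # each thread reads back its own value
```

With the global, one thread's write is overwritten by the other's. With TLS, each thread keeps its own value, which is the case TLS handles well before generators enter the picture.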
5. The "Goals" section says "provide a more reliable threading.local()
alternative" which is fine. But the bullet points do exactly the same
as before, using terms that I associate with async to describe the
benefits, and so they aren't compelling to me. I'd say something like:
* Is a reliable replacement for TLS that doesn't have the issue
that was described in the rationale
* Is closely modeled on the TLS API, to minimise the impact of
switching on code that currently uses TLS
* Performance yada yada yada - I don't think this is important,
there's been no mention yet that any of this is performance critical
(but see 4a above, this could probably be improved further if the
rationale were structured the way I propose there).
6. The high level spec, under generators, says:
"""
Unlike regular function calls, generators can cooperatively yield
their control of execution to the caller. Furthermore, a generator
does not control where the execution would continue after it yields.
It may be resumed from an arbitrary code location.
"""
That's not how I understand generators. To me, a generator can
*suspend its execution* to be resumed later. On suspension, control
*always* returns to the caller. Generators can be resumed from
anywhere, although the most common pattern is to resume them
repeatedly in a loop.
To me, this implies that context variables should follow that
execution path. If the caller sets a value, the generator sees it. If
the generator sets a value then yields, the caller will see that. If
code changes the value between two resumptions of the generator, the
generator will see those changes. The PEP at this point, though,
states the behaviour of context variables in a way that I simply don't
follow - it's using the idea of an "outer context" - which as far as I
can see, has never been defined at this point (and doesn't have any
obvious meaning in terms of the execution flow, which is not nested in
any obvious sense - that's the *point* of generators, to not have a
purely nested execution path).
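The behaviour described above (state simply following the execution path) can be sketched with nothing more than an ordinary module-level dict. This is not the PEP's mechanism, just an illustration of the semantics I'd naively expect; current, reader and resume_elsewhere are hypothetical names.

```python
current = {"value": "initial"}  # stand-in for a piece of dynamic state
log = []


def reader():
    log.append(("generator", current["value"]))
    current["value"] = "set by generator"
    yield  # control goes back to whoever called next()
    log.append(("generator", current["value"]))
    yield


def resume_elsewhere(gen):
    # A different code location resuming the same generator.
    next(gen)


g = reader()
current["value"] = "set by caller"
next(g)                                   # generator sees the caller's value
log.append(("caller", current["value"]))  # caller sees the generator's value
current["value"] = "changed between resumptions"
resume_elsewhere(g)                       # generator sees the later change

for entry in log:
    print(entry)
```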
The problem with the decimal context isn't about any of that - it's
about how "yield" interacts with "with", and specifically that
yielding out of the with *doesn't* run the exit part of the context
manager, as the code inside the with statement hasn't finished running
yet. Having stated the problem like that, I'm wondering why the
solution isn't to add some sort of "suspend/resume" mechanism to the
context manager protocol, rather than introducing context variables?
That may be worth adding to the "Rejected ideas" section if it's not a
viable solution.
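The "yield" versus "with" interaction is easy to see with a toy context manager. This is a minimal sketch; Tracker and events are hypothetical names that just record when the enter and exit parts run.

```python
events = []


class Tracker:
    # Toy context manager that records when it is entered and exited.
    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append("exit")
        return False


def gen():
    with Tracker():
        yield 1  # suspends here; the with block has not finished
        yield 2


g = gen()
next(g)
after_first_yield = list(events)  # exit has not run at this point
remaining = list(g)               # run the generator to completion
after_exhaustion = list(events)

print(after_first_yield)
print(after_exhaustion)
```

While the generator is suspended, whatever the context manager set up on entry stays in effect for the caller, which is exactly what bites the decimal context.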
The next section of the high level spec is coroutines and async, which
I'll skip. I don't use them, and I firmly believe that if there's
anything of relevance to me in that section, it should be moved
somewhere that isn't about async.
I'm not going to comment on anything further. At this point, I'm far
too overwhelmed with concepts and ideas that are at odds with my
understanding of the problem to really take in detail-level
information. I'd assume that the detail is about how the overall
functionality as described is implemented, but as I don't really have
a working mental model of the high-level functionality, I doubt I'd
get much from the detail.
I hope this is of some use. I appreciate I'm talking about a pretty
wholesale rewrite, and it's quite far down the line to be suggesting
such a thing. I'll understand if you don't feel it's worthwhile to
take that route.
Paul