[Python-ideas] (PEP 555 subtopic) Propagation of context in async code

Yury Selivanov yselivanov.ml at gmail.com
Fri Oct 13 15:08:06 EDT 2017


On Fri, Oct 13, 2017 at 1:46 PM, Koos Zevenhoven <k7hoven at gmail.com> wrote:
> On Fri, Oct 13, 2017 at 7:38 PM, Yury Selivanov <yselivanov.ml at gmail.com>
> wrote:
>>
>> On Fri, Oct 13, 2017 at 11:49 AM, Koos Zevenhoven <k7hoven at gmail.com>
>> wrote:
>> [..]
>> > This was my starting point 2.5 years ago, when Yury was drafting this
>> > status quo (PEP 492). It looked a lot like PEP 492 was inevitable, but
>> > that there would be a problem, where each API that uses "blocking IO"
>> > somewhere under the hood would need a duplicate version for asyncio
>> > (and one for each third-party async framework!). I felt it was
>> > necessary to think about a solution before PEP 492 was accepted, and
>> > this became a fairly short-lived thread here on python-ideas:
>>
>> Well, it's obvious why the thread was "short-lived".  Don't mix
>> non-blocking and blocking code and don't nest asyncio loops.  But I
>> believe this new subtopic is a distraction.
>
>
> Nesting is not the only way to have interaction between two event loops.
> But whenever anyone *does* want to nest two loops, they are perhaps more
> likely to be loops of different frameworks.
>
> You believe that the semantics in async code is a distraction?

Discussing blocking calls and/or nested event loops in async code is
certainly a distraction :)


[..]
>> The real big question is how people usually write code.  And the
>> answer is that they *don't write it like that* at all.  Many context
>> managers in many frameworks (aiohttp, tornado, and even asyncio)
>> require you to wrap your await expressions in them.  Not coroutine
>> instantiation.
>
>
> You know very well that I've been talking about how people usually write
> code etc. But we still need to handle the corner cases too.
[..]
> The code is to illustrate semantics, not an example of real code. The point
> is to highlight that the context has changed between when the coroutine
> function was called and when it is awaited. That's certainly a thing that
> can happen in real code, even if it is not the most typical case. I did
> mention this in my previous email.

I understand the point and what you're trying to illustrate.  I'm
saying that people don't write 'with smth: c = coro()' because it's
currently pointless.  And unless you tell them they should, they
won't.
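
To spell out the two shapes I mean (a rough sketch; make_cm and coro
are placeholders, not real framework code):

  import asyncio
  import contextlib

  @contextlib.contextmanager
  def make_cm():      # placeholder for a framework's context manager
      yield

  async def coro():   # placeholder coroutine
      pass

  # What people actually write: the context manager wraps the await.
  async def typical():
      with make_cm():
          await coro()

  # The corner case being debated: the coroutine object is created
  # inside the context but awaited after the context has exited.
  async def corner_case():
      with make_cm():
          c = coro()
      await c

  loop = asyncio.get_event_loop()
  loop.run_until_complete(typical())
  loop.run_until_complete(corner_case())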

>
>>
>> [..]
>> > Both of these would have their own stack of (argument, value)
>> > assignment pairs, explained in the implementation part of the first
>> > PEP 555 draft. While this is a complication, the performance overhead
>> > of these is so small that doubling the overhead should not be a
>> > performance concern.
>>
>> Please stop handwaving performance.  Using big O notation:
>>
>
> There is discussion on performance elsewhere, now also in this other
> subthread:
>
> https://mail.python.org/pipermail/python-ideas/2017-October/047327.html
>
>> PEP 555, worst complexity for uncached lookup: O(N), where 'N' is the
>> total number of all context values for all context keys for the
>> current frame stack.

Quoting you from that link:

"Indeed I do mention performance here and there in the PEP 555 draft.

Lookups can be made fast and O(1) in most cases. Even with the simplest
unoptimized implementation, the worst-case lookup complexity would be O(n),
where n is the number of assignment contexts entered after the one which is
being looked up from (or in other words, nested inside the one that is
being looked up from). This means that for use cases where the relevant
context is entered as the innermost context level, the lookups are O(1)
even without any optimizations.

It is perfectly reasonable to make an implementation where lookups are
*always* O(1). Still, it might make more sense to implement a half-way
solution with "often O(1)", because that has somewhat less overhead in case
the callees end up not doing any lookups."
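
For reference, the "simplest unoptimized implementation" you describe
above would be something like this, if I read the draft right (a
sketch with made-up names; one flat stack of assignment pairs):

  _stack = []  # (key, value) assignment pairs, innermost last

  def lookup(key):
      # Scan from the innermost assignment outwards.  Worst case
      # this visits every context value entered so far, i.e. O(N)
      # for an uncached lookup.
      for k, v in reversed(_stack):
          if k is key:
              return v
      raise LookupError(key)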

So where's the actual explanation of how you can make *uncached*
lookups O(1) in your best implementation?  I only see you claiming
that you know how to do that. And since you're using a stack of values
instead of hash tables, your explanation can make a big impact on the
CS field :)

It's perfectly reasonable to say that "cached lookups in my
optimization are O(1)".  Saying that "in most cases it's O(1)" isn't
how big O notation should be used.

BTW, what's the big O for capturing the entire context in PEP 555
(get_execution_context() in PEP 550)?  How will that operation be
implemented?  A shallow copy of the stack?

Also, if I had this:

  with c.assign(o1):
    with c.assign(o2):
      with c.assign(o3):
        ctx = capture_context()

will ctx have references to o1, o2, and o3?
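
If the answer is "a shallow copy of the stack" (my guess, in terms of
the sketch above), then capturing is O(N) and the captured context
keeps all three alive:

  def capture_context():
      # A shallow copy of the assignment stack: O(N) in the number
      # of values, and the copy holds a reference to every assigned
      # object -- o1, o2 and o3 in the example above.
      return list(_stack)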

> Not true. See the above link. Lookups are fast (*and* O(1), if we want them
> to be).
>
> PEP 555 stacks are independent of frames, BTW.
>
>
>>
>> For a recursive function you can easily have a
>> situation where the cache is invalidated often, and code starts to run
>> slower and slower.
>
>
> Not true either. The lookups are O(1) in a recursive function, with and
> without nested contexts.

See the above.  I claim that you can't say that *uncached* lookups can
be O(1) in PEP 555 with the current choice of data structures.
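
To make that concrete with the flat-stack sketch from above: a value
assigned outside a chain of nested contexts forces every uncached
lookup to skip all the inner assignments first, so the cost grows
with the nesting depth:

  key0 = object()
  _stack.append((key0, "outermost value"))

  def recurse(n):
      # Each level enters one more assignment context on top of
      # key0's, so the uncached lookup has to skip n pairs: O(depth).
      _stack.append((object(), n))
      try:
          return lookup(key0) if n == 0 else recurse(n - 1)
      finally:
          _stack.pop()

  assert recurse(500) == "outermost value"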

Yury

