On Tue, Oct 10, 2017 at 4:22 AM, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Mon, Oct 9, 2017 at 8:37 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
> You can cause unbound growth in PEP 550 too. All you have to do is nest an
> unbounded number of generators.

You can only nest up to sys.getrecursionlimit() generators.

With PEP 555 you can do:

  while True:
    context_var.assign(42).__enter__()


Well, in PEP 550 you can explicitly stack an unbounded number of LogicalContexts in a while True loop, too. Or you can run out of memory even faster using plain lists:

l = [42]

while True:
    l *= 2  # doubles each iteration => exponential blow-up

I don't see why your example with context_var.assign(42).__enter__() would be any more likely.

Sure, we could limit the number of allowed nested contexts in PEP 555. I don't really care. Just don't enter an unbounded number of context managers without exiting them.

Really, it was my mistake to ever make you think that context_var.assign(42).__enter__() can be compared to .set(42) in PEP 550. I'll say it once more: PEP 555 context arguments have no equivalent of PEP 550's .set(..).
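To make that difference concrete, here is a toy, pure-Python model of a PEP 555-style context argument as I understand the draft. The immutable-chain representation and the names (value(), _Assignment) are my own simplifications, not the proposed implementation:

```python
import threading

_local = threading.local()  # stand-in for per-thread interpreter state


def _stack():
    return getattr(_local, "stack", ())


class ContextArg:
    """Toy model of a PEP 555-style context argument (simplified)."""

    def assign(self, value):
        # There is no .set(): assigning only produces a context manager.
        return _Assignment(self, value)

    def value(self, default=None):
        # Find the most recent assignment to this argument, if any.
        for var, val in _stack():
            if var is self:
                return val
        return default


class _Assignment:
    def __init__(self, var, value):
        self._var, self._value = var, value

    def __enter__(self):
        # Push onto an immutable chain; the old chain stays reachable.
        _local.stack = ((self._var, self._value),) + _stack()
        return self

    def __exit__(self, *exc):
        _local.stack = _stack()[1:]
```

Entering assignments without ever exiting them grows the chain without bound, which is the scenario discussed above; used with a normal `with` statement, each assignment is popped again on exit.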
 
> In PEP 555, nesting generators doesn't do
> anything really, unless you actually assign to context arguments in the
> generators. Only those who use it will pay.

Same for 550.  If a generator doesn't set context variables, its LC
will be an empty mapping (or NULL if you want to micro-optimize
things).  Nodes for the chain will come from a freelist. The effective
overhead for generators is a couple operations on pointers, and thus
visible only in microbenchmarks.

Sure, you can implement push and pop and maintain a freelist by just doing operations on pointers. But that would be a handful of operations. Maybe you'd even manage to avoid INCREFs and DECREFs by not exposing things as Python objects.

But I guarantee you, PEP 555 is simpler in this regard. In (pseudo-)C, the per-generator and per-send overhead would come from something like:

/* On generator creation */

PyObject *stack = PyThreadState_Get()->carg_stack;
Py_INCREF(stack);
self->carg_stack = stack;

----------

/* On each next / send */

PyObject **stack_ptr = &PyThreadState_Get()->carg_stack;
if (*stack_ptr == self->carg_stack) {
    /* no assignments made => do nothing */
}

/* ... then after next yield */

if (*stack_ptr == self->carg_stack) {
    /* once more, do nothing */
}



And there will of course be a Py_DECREF of the captured stack after the generator has finished or when it is deallocated.

If the generators *do* make context argument assignments, then the real work would happen in the else branches of the if statements above (or equivalently, with != instead of == and the branches swapped).


> But seriously, you will always end up in a weird situation if you call an
> unbounded number of contextmanager.__enter__() methods without calling
> __exit__(). Nothing new about that. But entering a handful of assignment
> contexts and leaving them open until a script ends is not the end of the
> world. I don't think anyone should do that though.
>
>
>>
>> You'll say that it's not how the API is supposed to be used,
>> and we say that we want to convert things like decimal and numpy to
>> use the new mechanism.  That question was also hand-waved by you:
>> numpy and decimal will have to come up with new/better APIs to use PEP
>> 555.  Well, that's just not good enough.
>
>
> What part of my explanation of this are you unhappy with? For instance, the
> 12th (I think) email in this thread, which is my response to Nathaniel.
> Could you reply to that and tell us your concern?

I'm sorry, I'm not going to find some 12th email in some thread.  I
stated in this thread the following: not being able to use PEP 555 to
fix *existing* decimal & numpy APIs is not good enough.  And decimal &
numpy are only one example; there's tons of code out there that could
benefit from having its APIs fixed to support async code in Python
3.7.


Well, anyone interested can read that 12th email in this thread. In short, my recommendation for libraries would be as follows:

* If the library does not yet provide a context manager, it should add one, implemented using PEP 555. That will then work nicely in coroutines and generators.

* If the library does have a context manager, reimplement it using PEP 555. Or, to be safe, add a new API function, so that the behavior of existing async code won't change.

* If the library needs to support some kind of set_state(..) operation, implement it by getting the state via a PEP 555 context argument and mutating its contents.

* Fall back to thread-local storage if no context argument is present or if the Python version does not support context arguments.
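As a minimal sketch of the last two points, here is what the mutable-state-plus-fallback pattern could look like for a hypothetical library with a numeric `precision` setting (all names here are made up for illustration; a real library would first look for a PEP 555 context argument and only then fall back to thread-local storage):

```python
import contextlib
import threading

# Thread-local fallback storage for when PEP 555 context arguments
# are unavailable (e.g. on older Python versions).
_state = threading.local()

DEFAULT_PRECISION = 28  # hypothetical library default


def get_precision():
    return getattr(_state, "precision", DEFAULT_PRECISION)


@contextlib.contextmanager
def precision(value):
    # Context-manager API, per the recommendations above: the new
    # value is visible only inside the with block, and the old value
    # is restored on exit even if an exception is raised.
    old = get_precision()
    _state.precision = value
    try:
        yield
    finally:
        _state.precision = old
```

A set_state(..)-style operation would then mutate the object that the context lookup returns, rather than rebinding the context argument itself.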


[...]
 
>> > Some kind of
>> > chained-lookup-like thing is inevitable if you want the state not to
>> > leak
>> > though yields out of the generator:
>>
>> No, it's not "inevitable".  In PEP 550 v1, generators captured the
>> context when they are created and there was always only one level of
>> context.  This means that:
>>
>> 1. Context changes in generators aren't visible to the outside world.
>> 2. Changes to the context in the outside world are not visible to
>> running generators.
>
>
> Sure, if you make generators completely isolated from the outside world,
> then you can avoid chaining-like things too. But that would just sweep it
> under the carpet.

What do you mean by "just sweep it under the carpet"?  Capturing the
context at the moment of generator creation is a design choice with
some consequences (which I illustrated in my previous email).  There
are pros and cons to doing that.


"Capturing the context at generator creation" and "isolating generators completely" are two different things.

I've described the pros of the former. The latter has no pros that I'm aware of, except if sweeping things under the carpet is considered one.

Yes, the latter works in some use cases, but not in others. For instance, an async framework may want to make some information available throughout an async task. If you isolate generators, async programmers will have to avoid generators, because they would not have access to the information the framework is trying to provide. Also, if you refactor your generator into subgenerators using `yield from`, the subgenerators will not see the context set by the outer generator.

––Koos


--
+ Koos Zevenhoven + http://twitter.com/k7hoven +