On 11 October 2017 at 21:58, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 11, 2017 at 7:46 AM, Steve Dower <steve.dower@python.org> wrote:

Nick: “I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

 

    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass”

 

I’m following this discussion from a distance, but cared enough about this point to chime in without even reading what comes later in the thread. (Hopefully it’s not twenty people making the same point…)

 

I HATE this example! Looking solely at the code we can see, you are refactoring a function call from inside an *explicit* context manager to outside of it, and assuming the behavior will not change. There’s *absolutely no* logical or semantic reason that these should be equivalent, especially given the obvious alternative of leaving the call within the explicit context. Even moving the function call before the setattr can’t be assumed to not change its behavior – how is moving it outside a with block ever supposed to be safe?

 


Exactly. You did say it less politely than I did, but this is exactly how I thought about it. And I'm not sure people got it the first time.

Refactoring isn't why I like the example, as I agree there's no logical reason why the two forms should be semantically equivalent in a greenfield context management design.

The reason I like the example is that, in current Python, with the way generators and decimal contexts currently work, it *doesn't matter* which of these two forms you use - they'll both behave the same way, since no code in the generator body runs when the generator iterator is created; it only starts running on the first call to next().
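A minimal sketch of that equivalence, assuming a hypothetical one-line gen() that applies unary plus (which rounds to the current decimal context) to a fixed value. The precision of 3 is chosen only to make the rounding visible:

```python
import decimal
from decimal import Decimal

def gen():
    # Hypothetical generator: nothing here runs until the first next() call,
    # so the rounding uses whatever context is active during iteration.
    yield +Decimal("1.23456")

# Form 1: create and iterate the generator inside the with block.
with decimal.localcontext() as ctx:
    ctx.prec = 3
    inside = list(gen())

# Form 2: create the generator outside, iterate it inside.
g = gen()
with decimal.localcontext() as ctx:
    ctx.prec = 3
    outside = list(g)

print(inside, outside)
```

Both lists hold the value rounded to three significant digits, because in each form the generator body only executes while the modified context is active.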

That means we have a choice to make, and that choice will affect how risky it is for a library like decimal to switch from using thread local storage to context local storage: is switching from thread locals to context variables in a synchronous context manager going to be a compatibility break for end user code that uses the second form, where generator creation happens outside a with statement, but use happens inside it?

Personally, I want folks maintaining context managers to feel comfortable switching from thread local storage to context variables (when the latter are available), and in particular, I want the decimal module to be able to make such a switch and have it be an entirely backwards compatible change for synchronous single-threaded code.

That means it doesn't matter to me whether we see separating generator (or context manager) creation from subsequent use as good style or not. What matters is that decimal contexts work a certain way today, and hence we're faced with a choice between:

1. Preserve the current behaviour, since we don't have a compelling reason to change its semantics
2. Change the behaviour, in order to gain <end user benefit>

"I think it's more correct, but don't have any specific examples where the status quo subtly does the wrong thing" isn't an end user benefit, as:
- of necessity, any existing tested code won't be written that way (since it would be doing the wrong thing, and will hence have been changed)
- future code that does want creation time context capture can be handled via an explicit wrapper (as is proposed for coroutines, with event loops supplying the wrapper in that case)
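To illustrate the second point, here is a rough sketch of what such an explicit wrapper could look like. It is written against the contextvars module as it later shipped in Python 3.7, which is an assumption on my part: the API proposed in PEP 550 differs. The names capture_creation_context, precision and read_precision are hypothetical, and precision is just a stand-in for something like the decimal context:

```python
import contextvars

# Hypothetical context variable standing in for a library's context state.
precision = contextvars.ContextVar("precision", default=28)

def capture_creation_context(genfunc):
    """Hypothetical decorator: run every step of the generator in a copy
    of the context that was active when the generator was created."""
    def wrapper(*args, **kwargs):
        ctx = contextvars.copy_context()  # snapshot taken at creation time
        gen = genfunc(*args, **kwargs)

        def run():
            while True:
                try:
                    # Each step executes inside the captured context.
                    value = ctx.run(next, gen)
                except StopIteration:
                    return
                yield value

        return run()
    return wrapper

def read_precision():
    for _ in range(2):
        yield precision.get()

plain = read_precision()                                 # default behaviour
captured = capture_creation_context(read_precision)()    # opt-in capture

token = precision.set(2)
plain_values = list(plain)        # sees the value in effect during iteration
captured_values = list(captured)  # sees the value in effect at creation
precision.reset(token)

print(plain_values, captured_values)
```

The undecorated generator sees the value set at iteration time, while the wrapped one keeps seeing the default that was in effect when it was created. The point is only that the capturing variant can be opted into explicitly without changing the default.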

"It will be easier to implement & maintain" isn't an end user benefit either, but still a consideration that carries weight when true. In this case though, it's pretty much a wash - whichever form we make the default, we'll need to provide some way of switching to the other behaviour, since we need both behavioural variants ourselves to handle different use cases.

That puts the burden squarely on the folks arguing for a semantic change: "We should break currently working code because ...".

PEP 479 (the change to StopIteration semantics) is an example of doing that well, and so is the proposal in PEP 550 to keep context changes from implicitly leaking *out* of generators when yield or await is used in a with statement body.

The challenge for folks arguing for generators capturing their creation context is to explain the pay-off that end users will gain from our implicitly changing the behaviour of code like the following:

    >>> import decimal
    >>> from decimal import Decimal
    >>> data = [sum(Decimal(10)**-r for r in range(max_r+1)) for max_r in range(5)]
    >>> data
    [Decimal('1'), Decimal('1.1'), Decimal('1.11'), Decimal('1.111'), Decimal('1.1111')]
    >>> def lazily_round_to_current_context(data):
    ...     for d in data: yield +d
    ...
    >>> g = lazily_round_to_current_context(data)
    >>> with decimal.localcontext() as ctx:
    ...     ctx.prec = 2
    ...     rounded_data = list(g)
    ... 
    >>> rounded_data
    [Decimal('1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1')]

Yes, it's a contrived example, but it's also code that will work all the way back to when the decimal module was first introduced. Because of the way I've named the rounding generator, it's also clear to readers that the code is aware of the existing semantics, and is intentionally relying on them.

The current version of PEP 550 means that the decimal module can switch to using context variables instead of thread local storage, and the above code won't even notice the difference.

However, if generators were to start implicitly capturing their creation context, then the above code would break, since the rounding would start using a decimal context other than the one that's in effect in the current thread when the rounding takes place - the generator would implicitly reset it back to an earlier state.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia