On Thu, Oct 12, 2017 at 6:54 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 11 October 2017 at 21:58, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 11, 2017 at 7:46 AM, Steve Dower <steve.dower@python.org> wrote:

Nick: “I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:

 

    with decimal.localcontext() as ctx:

        ctc.prex = 30

        for i in gen():
           pass

    g = gen()

    with decimal.localcontext() as ctx:

        ctc.prex = 30

        for i in g:
          pass”

 

I’m following this discussion from a distance, but cared enough about this point to chime in without even reading what comes later in the thread. (Hopefully it’s not twenty people making the same point…)

 

I HATE this example! Looking solely at the code we can see, you are refactoring a function call from inside an *explicit* context manager to outside of it, and assuming the behavior will not change. There’s *absolutely no* logical or semantic reason that these should be equivalent, especially given the obvious alternative of leaving the call within the explicit context. Even moving the function call before the setattr can’t be assumed to not change its behavior – how is moving it outside a with block ever supposed to be safe?

 


​Exactly. You did say it less politely than I did, but this is exactly how I thought about it. And I'm not sure people got it the first time.

Refactoring isn't why I like the example, as I agree there's no logical reason why the two forms should be semantically equivalent in a greenfield context management design.

The reason I like the example is because, in current Python, with the way generators and decimal contexts currently work, it *doesn't matter* which of these two forms you use - they'll both behave the same way, since no actual code execution takes place in the generator iterator at the time the generator is created.


The latter version does not feel like a good way to write the code. People will hate it, because they can't tell what happens by looking at the code locally.

What I think is that the current behavior of decimal contexts only satisfies some contrived examples. IMO, everything about decimal contexts together with generators is currently a misfeature. Of course, you can also make use of a misfeature, like in the above example, where the subtleties of decimal rounding are hidden underneath the iterator protocol and a `for` statement.​ 

That means we have a choice to make, and that choice will affect how risky it is for a library like decimal to switch from using thread local storage to context local storage: is switching from thread locals to context variables in a synchronous context manager going to be a compatibility break for end user code that uses the second form, where generator creation happens outside a with statement, but use happens inside it?


​AFAICT, the number of users of  `decimal` could be anywhere between 3 and 3**19. ​Anything you do might break someone's code. Personally, I think the current behavior, which you explain using the example above, is counter-intuitive. But I can't tell you how much code would break by fixing it with direct PEP 555 semantics. I also can't tell how much would break when using PEP 550 to "fix" it, but I don't even like the semantics that that would give. 

I strongly believe that the most "normal" use case for a generator function is that it's a function that returns an iterable of values. Sadly, decimal contexts don't currently work correctly for this use case.

Indeed, I would introduce a new context manager that behaves intuitively and then slowly deprecate the old one. [*]


Personally, I want folks maintaining context managers to feel comfortable switching from thread local storage to context variables (when the latter are available), and in particular, I want the decimal module to be able to make such a switch and have it be an entirely backwards compatible change for synchronous single-threaded code.

​Sure. But PEP 550 won't give you that, though. Being inside a generator affects the scope of changing the decimal context. Yes, as a side effect, the behavior of a decimal context manager inside a generator becomes more intuitive. But it's still a breaking change, even for synchronous code. 

That means it doesn't matter to me whether we see separating generator (or context manager) creation from subsequent use is good style or not, what matters is that decimal contexts work a certain way today and hence we're faced with a choice between:

1. Preserve the current behaviour, since we don't have a compelling reason to change its semantics
2. Change the behaviour, in order to gain <end user benefit>


​3. Introduce​ a new context manager that behaves intuitively. ​My guess is that the two context managers could even be made to interact with each other in a fairly reasonable manner, even if you nest them in different orders. I'm not sure how necessary that is.

 
"I think it's more correct, but don't have any specific examples where the status quo subtly does the wrong thing" isn't an end user benefit, as:
- of necessity, any existing tested code won't be written that way (since it would be doing the wrong thing, and will hence have been changed)

​Then it's actually better to not change the semantics of the existing functionality, but add new ones instead.​
 
- future code that does want creation time context capture can be handled via an explicit wrapper (as is proposed for coroutines, with event loops supplying the wrapper in that case)


Handling the normal case with wrappers (that might even harm performance)
​ just because decimal does not handle the normal case?

 
"It will be easier to implement & maintain" isn't an end user benefit either, but still a consideration that carries weight when true. In this case though, it's pretty much a wash - whichever form we make the default, we'll need to provide some way of switching to the other behaviour, since we need both behavioural variants ourselves to handle different use cases.

True, in PEP 555 there is not really much difference in complexity regarding leaking in from the side (next/send) and leaking in from the top (genfunc() call). Just a matter of some if statements.
 

That puts the burden squarely on the folks arguing for a semantic change: "We should break currently working code because ...".

And on the folks that end up having to argue against it, or to come up with a better solution. And those that feel that it's a distraction from the discussion.
 
PEP 479 (the change to StopIteration semantics) is an example of doing that well, and so is the proposal in PEP 550 to keep context changes from implicitly leaking *out* of generators when yield or await is used in a with statement body.

The former is a great example. The latter has good parts but is complicated and didn't end up getting all the way there.

The challenge for folks arguing for generators capturing their creation context is to explain the pay-off that end users will gain from our implicitly changing the behaviour of code like the following:

    >>> data = [sum(Decimal(10)**-r for r in range(max_r+1)) for max_r in range(5)]
    >>> data
    [Decimal('1'), Decimal('1.1'), Decimal('1.11'), Decimal('1.111'), Decimal('1.1111')]
    >>> def lazily_round_to_current_context(data):
    ...     for d in data: yield +d
    ...
    >>> g = lazily_round_to_current_context(data)
    >>> with decimal.localcontext() as ctx:
    ...     ctx.prec = 2
    ...     rounded_data = list(g)
    ... 
    >>> rounded_data
    [Decimal('1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1')]

Yes, it's a contrived example, but it's also code that will work all the way back to when the decimal module was first introduced. Because of the way I've named the rounding generator, it's also clear to readers that the code is aware of the existing semantics, and is intentionally relying on them.


​The way you've named the function (lazily_round_to_current_context) does not correspond to the behavior in the code example. "Current" means "current", not "the context of the caller of next at lazy evaluation time". Maybe you could make it:

g = rounded_according_to_decimal_context_of_whoever_calls_next(data)

Really, I think that, to get this behavior, the function should be defined with a decorator to mark that context should leak in through next(). But probably the programmer will realize––there must be a better way:

with decimal.localcontext() as ctx:
    ctx.prec = 2
    rounded_data = [round_in_context(d) for d in data]

That one would already work and be equivalent in any of the proposed semantics.

But there could be more improvements, perhaps:

with decimal.context(prec=2):
    rounded_data = [round_in_context(d) for d in data]


––Koos​



​​[*] Maybe somehow make the existing functionality a phantom easter egg––a blast from the past which you can import and use, but which is otherwise invisible :-). Then later give warnings and finally remove it completely.

But we need better smooth upgrade paths anyway, maybe something like:

from __compat__ import unintuitive_decimal_contexts

with unintuitive_decimal_contexts:
    do_stuff()

​Now code bases can more quickly switch to new python versions and make the occasional compatibility adjustments more lazily, while already benefiting from other new language features. 


––Koos​


--
+ Koos Zevenhoven + http://twitter.com/k7hoven +