[Python-ideas] PEP draft: context variables

Thu Oct 12 08:59:31 EDT 2017

On Thu, Oct 12, 2017 at 6:54 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 11 October 2017 at 21:58, Koos Zevenhoven <k7hoven at gmail.com> wrote:
>
>> On Wed, Oct 11, 2017 at 7:46 AM, Steve Dower <steve.dower at python.org>
>> wrote:
>>
>>> Nick: “I like Yury's example for this, which is that the following two
>>> examples are currently semantically equivalent, and we want to preserve
>>> that equivalence:
>>>
>>>
>>>
>>>     with decimal.localcontext() as ctx:
>>>
>>>         ctc.prex = 30
>>>
>>>         for i in gen():
>>>            pass
>>>
>>>     g = gen()
>>>
>>>     with decimal.localcontext() as ctx:
>>>
>>>         ctc.prex = 30
>>>
>>>         for i in g:
>>>           pass”
>>>
>>>
>>>
>>> I’m following this discussion from a distance, but cared enough about
>>> this point to chime in without even reading what comes later in the thread.
>>> (Hopefully it’s not twenty people making the same point…)
>>>
>>>
>>>
>>> I HATE this example! Looking solely at the code we can see, you are
>>> refactoring a function call from inside an **explicit** context manager
>>> to outside of it, and assuming the behavior will not change. There’s **absolutely
>>> no** logical or semantic reason that these should be equivalent,
>>> especially given the obvious alternative of leaving the call within the
>>> explicit context. Even moving the function call before the setattr can’t be
>>> assumed to not change its behavior – how is moving it outside a with block
>>> ever supposed to be safe?
>>>
>>>
>>>
>>
>> Exactly. You did say it less politely than I did, but this is exactly
>> how I thought about it. And I'm not sure people got it the first time.
>>
>
> Refactoring isn't why I like the example, as I agree there's no logical
> reason why the two forms should be semantically equivalent in a greenfield
> context management design.
>
> The reason I like the example is because, in current Python, with the way
> generators and decimal contexts currently work, it *doesn't matter* which
> of these two forms you use - they'll both behave the same way, since no
> actual code execution takes place in the generator iterator at the time the
> generator is created.
>
>
The latter version does not feel like a good way to write the code. People
will hate it, because they can't tell what happens by looking at the code
locally.

What I think is that the current behavior of decimal contexts only
satisfies some contrived examples. IMO, everything about decimal contexts
together with generators is currently a misfeature. Of course, you can also
make use of a misfeature, like in the above example, where the subtleties
of decimal rounding are hidden underneath the iterator protocol and a `for`
statement.

That means we have a choice to make, and that choice will affect how risky
> it is for a library like decimal to switch from using thread local storage
> to context local storage: is switching from thread locals to context
> variables in a synchronous context manager going to be a compatibility
> break for end user code that uses the second form, where generator creation
> happens outside a with statement, but use happens inside it?
>
>
AFAICT, the number of users of  `decimal` could be anywhere between 3 and
3**19. Anything you do might break someone's code. Personally, I think the
current behavior, which you explain using the example above, is
counter-intuitive. But I can't tell you how much code would break by fixing
it with direct PEP 555 semantics. I also can't tell how much would break
when using PEP 550 to "fix" it, but I don't even like the semantics that
that would give.

I strongly believe that the most "normal" use case for a generator function
is that it's a function that returns an iterable of values. Sadly, decimal
contexts don't currently work correctly for this use case.

Indeed, I would introduce a new context manager that behaves intuitively
and then slowly deprecate the old one. [*]

Personally, I want folks maintaining context managers to feel comfortable
> switching from thread local storage to context variables (when the latter
> are available), and in particular, I want the decimal module to be able to
> make such a switch and have it be an entirely backwards compatible change
> for synchronous single-threaded code.
>

Sure. But PEP 550 won't give you that, though. Being inside a generator
affects the scope of changing the decimal context. Yes, as a side effect,
the behavior of a decimal context manager inside a generator becomes more
intuitive. But it's still a breaking change, even for synchronous code.

That means it doesn't matter to me whether we see separating generator (or
> context manager) creation from subsequent use is good style or not, what
> matters is that decimal contexts work a certain way today and hence we're
> faced with a choice between:
>
> 1. Preserve the current behaviour, since we don't have a compelling reason
> to change its semantics
> 2. Change the behaviour, in order to gain <end user benefit>
>
>
3. Introduce a new context manager that behaves intuitively. My guess is
that the two context managers could even be made to interact with each
other in a fairly reasonable manner, even if you nest them in different
orders. I'm not sure how necessary that is.

> "I think it's more correct, but don't have any specific examples where the
> status quo subtly does the wrong thing" isn't an end user benefit, as:
> - of necessity, any existing tested code won't be written that way (since
> it would be doing the wrong thing, and will hence have been changed)
>

Then it's actually better to not change the semantics of the existing
functionality, but add new ones instead.

> - future code that does want creation time context capture can be handled
> via an explicit wrapper (as is proposed for coroutines, with event loops
> supplying the wrapper in that case)
>
>
Handling the normal case with wrappers (that might even harm performance)
 just because decimal does not handle the normal case?

> "It will be easier to implement & maintain" isn't an end user benefit
> either, but still a consideration that carries weight when true. In this
> case though, it's pretty much a wash - whichever form we make the default,
> we'll need to provide some way of switching to the other behaviour, since
> we need both behavioural variants ourselves to handle different use cases.
>

True, in PEP 555 there is not really much difference in complexity
regarding leaking in from the side (next/send) and leaking in from the top
(genfunc() call). Just a matter of some if statements.

>
> That puts the burden squarely on the folks arguing for a semantic change:
> "We should break currently working code because ...".
>
> 
And on the folks that end up having to argue against it, or to come up with
a better solution. And those that feel that it's a distraction from the
discussion.

> PEP 479 (the change to StopIteration semantics) is an example of doing
> that well, and so is the proposal in PEP 550 to keep context changes from
> implicitly leaking *out* of generators when yield or await is used in a
> with statement body.
>

The former is a great example. The latter has good parts but is complicated
and didn't end up getting all the way there.

The challenge for folks arguing for generators capturing their creation
> context is to explain the pay-off that end users will gain from our
> implicitly changing the behaviour of code like the following:
>
>     >>> data = [sum(Decimal(10)**-r for r in range(max_r+1)) for max_r in
> range(5)]
>     >>> data
>     [Decimal('1'), Decimal('1.1'), Decimal('1.11'), Decimal('1.111'),
> Decimal('1.1111')]
>     >>> def lazily_round_to_current_context(data):
>     ...     for d in data: yield +d
>     ...
>     >>> g = lazily_round_to_current_context(data)
>     >>> with decimal.localcontext() as ctx:
>     ...     ctx.prec = 2
>     ...     rounded_data = list(g)
>     ...
>     >>> rounded_data
>     [Decimal('1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1'),
> Decimal('1.1')]
>
> Yes, it's a contrived example, but it's also code that will work all the
> way back to when the decimal module was first introduced. Because of the
> way I've named the rounding generator, it's also clear to readers that the
> code is aware of the existing semantics, and is intentionally relying on
> them.
>
>
The way you've named the function (lazily_round_to_current_context) does
not correspond to the behavior in the code example. "Current" means
"current", not "the context of the caller of next at lazy evaluation time".
Maybe you could make it:

g = rounded_according_to_decimal_context_of_whoever_calls_next(data)

Really, I think that, to get this behavior, the function should be defined
with a decorator to mark that context should leak in through next(). But
probably the programmer will realize––there must be a better way:

with decimal.localcontext() as ctx:
    ctx.prec = 2
    rounded_data = [round_in_context(d) for d in data]

That one would already work and be equivalent in any of the proposed
semantics.

But there could be more improvements, perhaps:

with decimal.context(prec=2):
    rounded_data = [round_in_context(d) for d in data]

––Koos

[*] Maybe somehow make the existing functionality a phantom easter egg––a
blast from the past which you can import and use, but which is otherwise
invisible :-). Then later give warnings and finally remove it completely.

But we need better smooth upgrade paths anyway, maybe something like:

from __compat__ import unintuitive_decimal_contexts

with unintuitive_decimal_contexts:
    do_stuff()

Now code bases can more quickly switch to new python versions and make the
occasional compatibility adjustments more lazily, while already benefiting
from other new language features.

––Koos

-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20171012/7e9c1805/attachment-0001.html>