On Thu, Oct 12, 2017 at 6:54 AM, Nick Coghlan firstname.lastname@example.org wrote:
On 11 October 2017 at 21:58, Koos Zevenhoven email@example.com wrote:
On Wed, Oct 11, 2017 at 7:46 AM, Steve Dower firstname.lastname@example.org wrote:
Nick: “I like Yury's example for this, which is that the following two examples are currently semantically equivalent, and we want to preserve that equivalence:
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in gen():
            pass

    g = gen()
    with decimal.localcontext() as ctx:
        ctx.prec = 30
        for i in g:
            pass
”
I’m following this discussion from a distance, but cared enough about this point to chime in without even reading what comes later in the thread. (Hopefully it’s not twenty people making the same point…)
I HATE this example! Looking solely at the code we can see, you are refactoring a function call from inside an explicit context manager to outside of it, and assuming the behavior will not change. There’s absolutely no logical or semantic reason that these should be equivalent, especially given the obvious alternative of leaving the call within the explicit context. Even moving the function call before the setattr can’t be assumed to not change its behavior – how is moving it outside a with block ever supposed to be safe?
Exactly. You did say it less politely than I did, but this is exactly how I thought about it. And I'm not sure people got it the first time.
Refactoring isn't why I like the example, as I agree there's no logical reason why the two forms should be semantically equivalent in a greenfield context management design.
The reason I like the example is because, in current Python, with the way generators and decimal contexts currently work, it doesn't matter which of these two forms you use - they'll both behave the same way, since no actual code execution takes place in the generator iterator at the time the generator is created.
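To make that concrete, here is a minimal illustration (my own, not from the example above) of why the two forms currently coincide: a generator's body does not start executing until the first next() call, so nothing in the body can observe the ambient decimal context at creation time:

```python
ran = []

def gen():
    # The body of a generator function does not run when it is called.
    ran.append("body started")
    yield 1

g = gen()        # creates the generator; executes none of the body
assert ran == []
next(g)          # only now does the body run, up to the first yield
assert ran == ["body started"]
```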
The latter version does not feel like a good way to write the code. People will hate it, because they can't tell what happens by looking at the code locally.
What I think is that the current behavior of decimal contexts only satisfies some contrived examples. IMO, everything about decimal contexts together with generators is currently a misfeature. Of course, you can also make use of a misfeature, like in the above example, where the subtleties of decimal rounding are hidden underneath the iterator protocol and a for loop.
That means we have a choice to make, and that choice will affect how risky it is for a library like decimal to switch from using thread local storage to context local storage: is switching from thread locals to context variables in a synchronous context manager going to be a compatibility break for end user code that uses the second form, where generator creation happens outside a with statement, but use happens inside it?
AFAICT, the number of users of decimal could be anywhere between 3 and 3**19. Anything you do might break someone's code. Personally, I think the current behavior, which you explain using the example above, is counter-intuitive. But I can't tell you how much code would break by fixing it with direct PEP 555 semantics. I also can't tell how much would break when using PEP 550 to "fix" it, but I don't even like the semantics that that would give.
I strongly believe that the most "normal" use case for a generator function is that it's a function that returns an iterable of values. Sadly, decimal contexts don't currently work correctly for this use case.
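As a sketch of what I mean (my own example, not one from the thread): a generator that computes decimal values lazily ignores the context that was active where it was created, and instead uses whatever context happens to be active when it is consumed:

```python
import decimal
from decimal import Decimal

def one_seventh():
    # The division happens lazily, at iteration time.
    yield Decimal(1) / Decimal(7)

with decimal.localcontext() as ctx:
    ctx.prec = 3
    g = one_seventh()   # no code in the generator body runs here

# Iteration happens under the ambient (default, 28-digit) context,
# not the 3-digit context the author of the with block intended.
values = list(g)
print(values[0])  # 0.1428571428571428571428571429
```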
Indeed, I would introduce a new context manager that behaves intuitively and then slowly deprecate the old one. [*]
Personally, I want folks maintaining context managers to feel comfortable switching from thread local storage to context variables (when the latter are available), and in particular, I want the decimal module to be able to make such a switch and have it be an entirely backwards compatible change for synchronous single-threaded code.
Sure. But PEP 550 won't give you that, though. Being inside a generator affects the scope of changing the decimal context. Yes, as a side effect, the behavior of a decimal context manager inside a generator becomes more intuitive. But it's still a breaking change, even for synchronous code.
That means it doesn't matter to me whether we consider separating generator (or context manager) creation from subsequent use to be good style or not; what matters is that decimal contexts work a certain way today, and hence we're faced with a choice between:
3. Introduce a new context manager that behaves intuitively. My guess is that the two context managers could even be made to interact with each other in a fairly reasonable manner, even if you nest them in different orders. I'm not sure how necessary that is.
"I think it's more correct, but don't have any specific examples where the status quo subtly does the wrong thing" isn't an end user benefit, as:
Then it's actually better to not change the semantics of the existing functionality, but add new ones instead.
Why handle the normal case with wrappers (which might even harm performance) just because decimal does not handle the normal case?
"It will be easier to implement & maintain" isn't an end user benefit either, but still a consideration that carries weight when true. In this case though, it's pretty much a wash - whichever form we make the default, we'll need to provide some way of switching to the other behaviour, since we need both behavioural variants ourselves to handle different use cases.
True, in PEP 555 there is not really much difference in complexity regarding leaking in from the side (next/send) and leaking in from the top (genfunc() call). Just a matter of some if statements.
That puts the burden squarely on the folks arguing for a semantic change: "We should break currently working code because ...".
And on the folks that end up having to argue against it, or to come up with a better solution. And those that feel that it's a distraction from the discussion.
PEP 479 (the change to StopIteration semantics) is an example of doing that well, and so is the proposal in PEP 550 to keep context changes from implicitly leaking out of generators when yield or await is used in a with statement body.
The former is a great example. The latter has good parts but is complicated and didn't end up getting all the way there.
The challenge for folks arguing for generators capturing their creation context is to explain the pay-off that end users will gain from our implicitly changing the behaviour of code like the following:
    >>> data = [sum(Decimal(10)**-r for r in range(max_r+1)) for max_r in range(5)]
    >>> data
    [Decimal('1'), Decimal('1.1'), Decimal('1.11'), Decimal('1.111'), Decimal('1.1111')]
    >>> def lazily_round_to_current_context(data):
    ...     for d in data:
    ...         yield +d
    ...
    >>> g = lazily_round_to_current_context(data)
    >>> with decimal.localcontext() as ctx:
    ...     ctx.prec = 2
    ...     rounded_data = list(g)
    ...
    >>> rounded_data
    [Decimal('1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1'), Decimal('1.1')]
Yes, it's a contrived example, but it's also code that will work all the way back to when the decimal module was first introduced. Because of the way I've named the rounding generator, it's also clear to readers that the code is aware of the existing semantics, and is intentionally relying on them.
The way you've named the function (lazily_round_to_current_context) does not correspond to the behavior in the code example. "Current" means "current", not "the context of the caller of next at lazy evaluation time". Maybe you could make it:
g = rounded_according_to_decimal_context_of_whoever_calls_next(data)
Really, I think that, to get this behavior, the function should be defined with a decorator to mark that context should leak in through next(). But probably the programmer will realize that there must be a better way:
    with decimal.localcontext() as ctx:
        ctx.prec = 2
        rounded_data = [round_in_context(d) for d in data]
That one would already work and be equivalent in any of the proposed semantics.
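(Here round_in_context is hypothetical; with today's decimal API it could be as thin as unary plus, which rounds its operand in the current context:)

```python
import decimal
from decimal import Decimal

def round_in_context(d):
    # Unary plus re-applies the current decimal context
    # (precision and rounding mode) to d.
    return +d

with decimal.localcontext() as ctx:
    ctx.prec = 2
    rounded_data = [round_in_context(d) for d in [Decimal("1.111"), Decimal("1")]]

print(rounded_data)  # [Decimal('1.1'), Decimal('1')]
```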
But there could be more improvements, perhaps:
    with decimal.context(prec=2):
        rounded_data = [round_in_context(d) for d in data]
[*] Maybe somehow make the existing functionality a phantom easter egg, a blast from the past which you can import and use, but which is otherwise invisible :-). Then later give warnings and finally remove it completely.
But we need better smooth upgrade paths anyway, maybe something like:
from __compat__ import unintuitive_decimal_contexts
    with unintuitive_decimal_contexts:
        do_stuff()
Now code bases can more quickly switch to new python versions and make the occasional compatibility adjustments more lazily, while already benefiting from other new language features.