[Python-ideas] PEP draft: context variables
Amit Green
amit.mixie at gmail.com
Fri Oct 13 23:53:39 EDT 2017
I really like what Paul Moore wrote here, as it matches a *LOT* of what I
have been feeling as I have been reading this whole discussion;
specifically:
- I find the example, and discussion, really hard to follow.
- I also don't understand async, but I do understand generators very
well (like Paul Moore)
- A lot of this doesn't seem natural (generators & context variable
syntax)
- And in particular: "If the implementation is hard to explain, it's a bad
idea."
I've spent a lot of time thinking about this, and what the issues are.
I think they are multi-fold:
- I really use generators a lot -- I find them wonderful & they are one of
the joys of Python. They are super useful. However, as I am going to,
hopefully, demonstrate here, they are not initially intuitive (to a
beginner).
- Generators are not really functions, but they appear to be functions;
this was very confusing to me when I started working with generators.
- Now, I'm used to it -- BUT, we really need to consider new people
- and I suggest making this easier.
- I find the proposed context syntax very confusing (and slow). I think
contexts are super-important & instead need to be better integrated into
the language (like nonlocal is)
- People keep writing they want a real example -- so this is a very real
example from real code I am writing (a python parser) and how I use
contexts (obviously they are not part of the language yet, so I have
emulated them) & how they interact with generators.
The full example, which took me a few hours to write, is available here (it's
a very, very reduced example from a real parser of the Python language
written in Python):
- https://github.com/AmitGreen/Gem/blob/emerald_6/work/demo.py
Here is the result of running the code -- which reads & executes demo1.py
(importing & executing demo2.py twice). [Note: by executing, I mean the code
is running its own parser to execute it & its own code to emulate an
'import' -- thus showing nested contexts.]
It creates two input files for testing -- demo1.py:
    print 1
    print 8 - 2 * 3
    import demo2
    print 9 - sqrt(16)
    print 10 / (8 - 2 * 3)
    import demo2
    print 2 * 2 * 2 + 3 - 4
And it also creates demo2.py:
    print 3 * (2 - 1)
    error
    print 4
There are two syntax errors (on purpose) in the files, but since demo2.py
is imported twice, this will show three syntax errors.
Running the code produces the following:
    demo1.py#1: expression '1' evaluates to 1
    demo1.py#2: expression '8 - 2 * 3' evaluates to 2
    demo1.py#3: importing module demo2
    demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
    demo2.py#2: UNKNOWN STATEMENT: 'error'
    demo2.py#3: expression '4' evaluates to 4
    demo1.py#4: UNKNOWN ATOM: ' sqrt(16)'
    demo1.py#5: expression '10 / (8 - 2 * 3)' evaluates to 5
    demo1.py#6: importing module demo2
    demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
    demo2.py#2: UNKNOWN STATEMENT: 'error'
    demo2.py#3: expression '4' evaluates to 4
    demo1.py#7: expression '2 * 2 * 2 + 3 - 4' evaluates to 7
This code demonstrates all of the following:
- Nested contexts
- Using contexts 'naturally' -- i.e.: directly as variables; without a
'context.' prefix -- which I would find harder to read & also slower.
- Using a generator that is deliberately broken up into three parts,
start, next & stop.
- Handling errors & how it interacts with both the generator & 'context'
- Actually parsing the input -- which creates a deeply nested stack (due
to recursive calls during expression parsing) -- thus a perfect example for
contexts.
So given all of the above, I'd first like to focus on the generator:
- Currently we can write generators as either: (1) functions; or (2)
classes with a __next__ method. However this is very confusing to a
beginner.
- Given a generator like the following (actually in the code):
      def __iter__(self):
          while not self.finished:
              self.loop += 1

              yield self
- What I found so surprising when I started working with generators is
that calling the generator does *NOT* actually start the function.
- Instead, the body of the function does not run until the __next__
method is first called.
- This is quite counter-intuitive.
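
A minimal sketch of the behaviour described above. The names `events` and `count_up` are made up for illustration and are not taken from demo.py. Calling the generator function only creates a generator object; the body starts on the first `next()` call.

```python
events = []  # records the order in which code actually runs


def count_up(limit):
    # Nothing in this body runs until next() is first called.
    events.append("body started")
    for value in range(limit):
        yield value


gen = count_up(2)              # only creates the generator object
events.append("after call")
first = next(gen)              # the body starts running here
print(events)
```

The "after call" entry is recorded before "body started", which is the surprise for someone who expects a call to run the function.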
I therefore suggest the following:
- Give generators their own first class language syntax.
- This syntax would have good entry points, to allow their interaction
with context variables.
Here is the generator in my code sample:
    #
    #   Here is our generator to walk over a file.
    #
    #   This generator has three sections:
    #
    #       generator_startup - Always run when the generator is started.
    #                           This opens the file & reads it.
    #
    #       generator_next    - Run each time the generator needs to retrieve
    #                           the next value.
    #
    #       generator_stop    - Called when the generator is going to stop.
    #
    #   The context variables (current_path, current_line, line_number &
    #   line_position) are assumed to be defined in an enclosing function.
    #
    def iterate_lines(path):
        data_lines = None

        def generator_startup(path):
            nonlocal current_path, data_lines

            with open(path) as f:
                current_path = path
                data = f.read()

            data_lines = tuple(data.splitlines())

        def generator_next():
            nonlocal current_line, line_number, line_position

            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

            generator_stop()

        def generator_stop():
            # Without 'nonlocal' these assignments would only create locals.
            nonlocal current_path, line_number, line_position

            current_path = None
            line_number = 0
            line_position = 0

        generator_startup(path)

        return generator_next()
This generator demonstrates the following:
- It immediately starts up when called (and in fact opens the file when
called -- so if the file doesn't exist, an exception is thrown then, not
later when the __next__ method is first called)
- It's half way between a function generator & a class generator; thus
(1) efficient; and (2) more understandable than a class generator.
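
A self-contained sketch of the "starts immediately" point, under simplifying assumptions: the name `iterate_lines_eager` is hypothetical, the context variables are left out, and the files live in a temporary directory. The start section runs at call time, so a missing file fails before any `next()` call.

```python
import os
import tempfile


def iterate_lines_eager(path):
    # "Start" section: runs immediately, so a missing file fails here.
    with open(path) as f:
        data_lines = tuple(f.read().splitlines())

    def generator_next():
        # "Next" section: runs lazily, one line per next() call.
        for line in data_lines:
            yield line

    return generator_next()


with tempfile.TemporaryDirectory() as folder:
    # A missing file raises at call time, not on the first next().
    try:
        iterate_lines_eager(os.path.join(folder, "missing.py"))
        failed_early = False
    except FileNotFoundError:
        failed_early = True

    # An existing file is read up front and then yielded line by line.
    present = os.path.join(folder, "demo.py")
    with open(present, "w") as f:
        f.write("print 1\nprint 4\n")
    lines = list(iterate_lines_eager(present))

print(failed_early, lines)
```

With a plain function generator, the same `open()` call would only fail on the first `next()`.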
Here is a (first draft) proposal for how I would like to rewrite the above
generator, so it would have its own first class syntax:
    generator iterate_lines(path):
        local   data_lines = None
        context current_path, current_line, line_number, line_position

        start:
            with open(path) as f:
                current_path = path
                data = f.read()

            data_lines = tuple(data.splitlines())

        next:
            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

        stop:
            current_path = None
            line_number = 0
            line_position = 0
This:
1. Adds a keyword 'generator' so it's obvious this is a generator, not a
function.
2. Declares its local variables (data_lines)
3. Declares which context variables it wants to use (current_path,
current_line, line_number, & line_position)
4. Has a start section that immediately gets executed.
5. Has a next section that executes on each call to __next__ (and this
is where the yield keyword must appear)
6. Has a stop section that executes when the generator receives a
StopIteration.
7. The compiler could generate equally efficient code for generators as
it does for current generators; while making the syntax clearer to the user.
8. The syntax is chosen so the user can edit it & convert it to a class
generator.
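
As a rough illustration of point 8, here is one way the three sections could map onto a class generator in today's Python. This is a sketch under assumptions: the class name `LineIterator` and its attributes are hypothetical, it takes text instead of a path to stay self-contained, and it leaves out the context variables.

```python
class LineIterator:
    """Hypothetical class-based equivalent of the start/next/stop sections."""

    def __init__(self, text):
        # "start": runs immediately when the iterator is created.
        self.data_lines = tuple(text.splitlines())
        self.index = 0
        self.stopped = False

    def __iter__(self):
        return self

    def __next__(self):
        # "next": runs on every next() call.
        if self.stopped:
            raise StopIteration
        if self.index >= len(self.data_lines):
            self.stop()
            raise StopIteration
        line = self.data_lines[self.index]
        self.index += 1
        return line

    def stop(self):
        # "stop": runs once, when the lines are exhausted.
        self.stopped = True
        self.data_lines = ()


iterator = LineIterator("print 1\nprint 4")
collected = list(iterator)
print(collected, iterator.stopped)
```

Each section of the proposed syntax lands in exactly one method, which is what would make the conversion mechanical.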
Given the above:
- I could now add special code to either the 'start' or 'next' section,
saying which context I wanted to use (once we have that syntax implemented).
The reason for its own syntax is to allow us to think more clearly about
the different parts of a generator & then make it easier for the user to
choose which part of the generator interacts with contexts & which
context. In particular the user could interact with multiple contexts (one
in the start section & a different one in the next section).
[Also, for other generators I think the syntax needs to be extended to
something like:

    next(context):
        use context:
            ....

allowing two new features -- requesting that __next__ receive the
context of the caller & secondly being able to use that context itself.]
Next, moving on to contexts:
- I love how nonlocal works & how you can access variables declared in
your surrounding function.
- I really think that contexts should work the same way.
- You would simply declare 'context' (like nonlocal) & just be able to
use the variables directly.
- Way easier to understand & use.
The sample code I have actually emulates contexts using nonlocal, so as to
demonstrate the idea I am explaining.
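
A small sketch of that emulation idea, not the code from demo.py: the function names, the `sources` dictionary and the save/restore step are assumptions made for illustration. The "context variables" are plain locals of an enclosing function, and nested functions reach them with `nonlocal`.

```python
def run_parser(sources):
    # Emulated "context variables": plain locals of the enclosing function.
    current_path = None
    line_number = 0
    messages = []

    def report(text):
        # Reads the context directly; no 'context.' prefix is needed.
        messages.append(f"{current_path}#{line_number}: {text}")

    def parse(path):
        nonlocal current_path, line_number

        # Save the caller's context so a nested parse can restore it.
        saved = (current_path, line_number)
        current_path, line_number = path, 0
        try:
            for line in sources[path]:
                line_number += 1
                if line.startswith("import "):
                    module = line[len("import "):]
                    report("importing module " + module)
                    parse(module + ".py")
                else:
                    report(line)
        finally:
            current_path, line_number = saved

    parse("demo1.py")
    return messages


messages = run_parser({
    "demo1.py": ["print 1", "import demo2", "print 2"],
    "demo2.py": ["print 4"],
})
print("\n".join(messages))
```

After the nested parse of demo2.py returns, the outer parse continues at demo1.py#3, which is the nested-context behaviour shown in the output earlier.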
Thanks,
Amit
P.S.: As I'm very new to python-ideas, I'm not sure if I should start a
separate thread to discuss this or use the current thread. Also I'm not
sure if I should attach the sample code here or not ... So I just
provided the link above.
On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 13 October 2017 at 19:32, Yury Selivanov <yselivanov.ml at gmail.com>
> wrote:
> >>> It seems simpler to have one specially named and specially called
> function
> >>> be special, rather than make the semantics
> >>> more complicated for all functions.
> >>
> >
> > It's not possible to special case __aenter__ and __aexit__ reliably
> > (supporting wrappers, decorators, and possible side effects).
> >
> >> +1. I think that would make it much more usable by those of us who are
> not
> >> experts.
> >
> > I still don't understand what Steve means by "more usable", to be honest.
>
> I'd consider myself a "non-expert" in async. Essentially, I ignore it
> - I don't write the sort of applications that would benefit
> significantly from it, and I don't see any way to just do "a little
> bit" of async, so I never use it.
>
> But I *do* see value in the context variable proposals here - if only
> in terms of them being a way to write my code to respond to external
> settings in an async-friendly way. I don't follow the underlying
> justification (which is based in "we need this to let things work with
> async/coroutines") at all, but I'm completely OK with the basic idea
> (if I want to have a setting that behaves "naturally", like I'd expect
> decimal contexts to do, it needs a certain amount of language support,
> so the proposal is to add that). I'd expect to be able to write
> context variables that my code could respond to using a relatively
> simple pattern, and have things "just work". Much like I can write a
> context manager using @contextmanager and yield, and not need to
> understand all the intricacies of __enter__ and __exit__. (BTW,
> apologies if I'm mangling the terminology here - write it off as part
> of me being "not an expert" :-))
>
> What I'm getting from this discussion is that even if I *do* have a
> simple way of writing context variables, they'll still behave in ways
> that seem mildly weird to me (as a non-async user). Specifically, my
> head hurts when I try to understand what that decimal context example
> "should do". My instincts say that the current behaviour is wrong -
> but I'm not sure I can explain why. So on that example, I'd ask the
> following of any proposal:
>
> 1. Users trying to write a context variable[1] shouldn't have to jump
> through hoops to get "natural" behaviour. That means that suggestions
> that the complexity be pushed onto decimal.context aren't OK unless
> it's also accepted that the current behaviour is wrong, and the only
> reason decimal.context needs to replicate it is for backward
> compatibility (and new code can ignore the problem).
> 2. The proposal should clearly establish what it views as "natural"
> behaviour, and why. I'm not happy with "it's how decimal.context has
> always behaved" as an explanation. Sure, people asking to break
> backward compatibility should have a good justification, but equally,
> people arguing to *preserve* an unintuitive current behaviour in new
> code should be prepared to explain why it's not a bug. To put it
> another way, context variables aren't required to be bug-compatible
> with thread local storage.
>
> [1] I'm assuming here that "settings that affect how a library behaves"
> is a common requirement, and the PEP is intended as the "one obvious
> way" to implement them.
>
> Nick's other async refactoring example is different. If the two forms
> he showed don't behave identically in all contexts, then I'd consider
> that to be a major problem. Saying that "coroutines are special" just
> reads to me as "coroutines/async are sufficiently weird that I can't
> expect my normal patterns of reasoning to work with them". (Apologies
> if I'm conflating coroutines and async incorrectly - as a non-expert,
> they are essentially indistinguishable to me). I sincerely hope that
> isn't the message I should be getting - async is already more
> inaccessible than I'd like for the average user.
>
> The fact that Nick's async example immediately devolved into a
> discussion that I can't follow at all is fine - to an extent. I don't
> mind the experts debating implementation details that I don't need to
> know about. But if you make writing context variables harder, just to
> fix Nick's example, or if you make *using* async code like (either of)
> Nick's forms harder, then I do object, because that's affecting the
> end user experience.
>
> In that context, I take Steve's comment as meaning "fiddling about
> with how __aenter__ and __aexit__ work is fine, as that's internals
> that non-experts like me don't care about - but making context
> variables behave oddly because of this is *not* fine".
>
> Apologies if the above is unhelpful. I've been lurking but not
> commenting here, precisely because I *am* a non-expert, and I trust
> the experts to build something that works. But when non-experts were
> explicitly mentioned, I thought my input might be useful.
>
> The following quote from the Zen seems particularly relevant here:
>
> If the implementation is hard to explain, it's a bad idea.
>
> (although the one about needing to be Dutch to understand why
> something is obvious might well trump it ;-))
>
> Paul