
I really like what Paul Moore wrote here as it matches a *LOT* of what I have been feeling as I have been reading this whole discussion; specifically:

  - I find the example, and discussion, really hard to follow.

  - I also don't understand async, but I do understand generators very well (like Paul Moore).

  - A lot of this doesn't seem natural (generators & context variable syntax).

  - And in particular: "If the implementation is hard to explain, it's a bad idea."

I've spent a lot of time thinking about this, and what the issues are.

I think they are multi-fold:

  - I really use generators a lot -- I find them wonderful & one of the joys of Python. They are super useful. However, as I am going to, hopefully, demonstrate here, they are not initially intuitive (to a beginner).

  - Generators are not really functions, but they appear to be functions; this was very confusing to me when I started working with generators.
      - Now, I'm used to it -- BUT, we really need to consider new people -- and I suggest making this easier.

  - I find the proposed context syntax very confusing (and slow). I think contexts are super-important & instead need to be better integrated into the language (like nonlocal is).

  - People keep writing that they want a real example -- so this is a very real example from real code I am writing (a Python parser), showing how I use contexts (obviously they are not part of the language yet, so I have emulated them) & how they interact with generators.
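To make the "generators are not really functions" point concrete, here is a minimal sketch (the names count_up and events are made up for illustration; they are not from demo.py). It records the order in which things happen, showing that calling the generator function does not run any of its body:

```python
events = []

def count_up(limit):
    # Nothing in this body runs when count_up() is called;
    # it only starts on the first next().
    events.append("body started")
    for n in range(limit):
        yield n

gen = count_up(2)              # creates a generator object, body not started
events.append("after call")
first = next(gen)              # only now does the body start running
events.append(f"got {first}")

print(events)
```

The "after call" entry is recorded before "body started", which is the surprise for a beginner who reads count_up(2) as an ordinary function call.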
The full example, which took me a few hours to write, is available here (it's a very, very reduced example from a real parser of the Python language written in Python):

  - https://github.com/AmitGreen/Gem/blob/emerald_6/work/demo.py

Here is the result of running the code -- which reads & executes demo1.py (importing & executing demo2.py twice):

[Note: by executing, I mean the code is running its own parser to execute it & its own code to emulate an 'import' -- thus showing nested contexts]

It creates two input files for testing -- demo1.py:

    print 1
    print 8 - 2 * 3
    import demo2
    print 9 - sqrt(16)
    print 10 / (8 - 2 * 3)
    import demo2
    print 2 * 2 * 2 + 3 - 4

And it also creates demo2.py:

    print 3 * (2 - 1)
    error
    print 4

There are two syntax errors (on purpose) in the files, but since demo2.py is imported twice, this will show three syntax errors.

Running the code produces the following:

    demo1.py#1: expression '1' evaluates to 1
    demo1.py#2: expression '8 - 2 * 3' evaluates to 2
    demo1.py#3: importing module demo2
    demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
    demo2.py#2: UNKNOWN STATEMENT: 'error'
    demo2.py#3: expression '4' evaluates to 4
    demo1.py#4: UNKNOWN ATOM: ' sqrt(16)'
    demo1.py#5: expression '10 / (8 - 2 * 3)' evaluates to 5
    demo1.py#6: importing module demo2
    demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
    demo2.py#2: UNKNOWN STATEMENT: 'error'
    demo2.py#3: expression '4' evaluates to 4
    demo1.py#7: expression '2 * 2 * 2 + 3 - 4' evaluates to 7

This code demonstrates all of the following:

  - Nested contexts.

  - Using contexts 'naturally' -- i.e., directly as variables, without a 'context.' prefix -- which I would find harder to read & also slower.

  - Using a generator that is deliberately broken up into three parts: start, next & stop.

  - Handling errors & how that interacts with both the generator & 'context'.

  - Actually parsing the input -- which creates a deeply nested stack (due to recursive calls during expression parsing) -- thus a perfect example for contexts.
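As a rough illustration of the "nested contexts as plain variables" idea (this is a much smaller sketch than demo.py and is not taken from it; run_module, sources and log are hypothetical names), nonlocal can emulate a per-module context, and an emulated import simply creates a fresh nested one:

```python
def run_module(name, sources, log):
    # Emulated 'context' variables for this module; each nested
    # call to run_module gets its own independent pair.
    current_path = name
    line_number = 0

    def report(message):
        # Reads the context directly, with no 'context.' prefix.
        log.append(f"{current_path}#{line_number}: {message}")

    def execute(line):
        nonlocal line_number
        line_number += 1
        if line.startswith("import "):
            module = line.split()[1]
            report(f"importing module {module}")
            run_module(module + ".py", sources, log)   # nested context
        else:
            report(f"statement {line!r}")

    for line in sources[name]:
        execute(line)
    return log


# Hypothetical in-memory 'files' standing in for demo1.py / demo2.py.
sources = {
    "demo1.py": ["print 1", "import demo2", "print 2"],
    "demo2.py": ["print 3"],
}
log = run_module("demo1.py", sources, [])
print("\n".join(log))
```

After the nested module finishes, the outer module's line_number carries on from where it was, because each call holds its own variables.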
So given all of the above, I'd first like to focus on the generator:

  - Currently we can write generators as either: (1) functions; or (2) classes with a __next__ method. However, this is very confusing to a beginner.

  - Given a generator like the following (actually in the code):

        def __iter__(self):
            while not self.finished:
                self.loop += 1

                yield self

  - What I found so surprising when I started working with generators is that calling the generator does *NOT* actually start the function.

  - Instead, the body does not actually run until the __next__ method is first called.

  - This is quite counter-intuitive.

I therefore suggest the following:

  - Give generators their own first class language syntax.

  - This syntax would have good entry points, to allow their interaction with context variables.

Here is the generator in my code sample:

    #
    #   Here is our generator to walk over a file.
    #
    #   This generator has three sections:
    #
    #       generator_start     - Always run when the generator is started.
    #                             This opens the file & reads it.
    #
    #       generator_next      - Run each time the generator needs to retrieve
    #                             the next value.
    #
    #       generator_stop      - Called when the generator is going to stop.
    #
    #   NOTE: current_path, current_line, line_number & line_position are the
    #         emulated 'context' variables; 'nonlocal' requires that they be
    #         defined in the function that encloses iterate_lines.
    #
    def iterate_lines(path):
        data_lines = None

        def generator_startup(path):
            nonlocal current_path, data_lines

            with open(path) as f:
                current_path = path
                data = f.read()

            data_lines = tuple(data.splitlines())

        def generator_next():
            nonlocal current_line, line_number, line_position

            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

            generator_stop()

        def generator_stop():
            # Without this declaration the assignments below would only
            # create locals & leave the context untouched.
            nonlocal current_path, line_number, line_position

            current_path = None
            line_number = 0
            line_position = 0

        generator_startup(path)

        return generator_next()

This generator demonstrates the following:

  - It immediately starts up when called (and in fact opens the file when called -- so if the file doesn't exist, an exception is thrown then, not later when the __next__ method is first called).

  - It's halfway between a function generator & a class generator; thus (1) efficient; and (2) more understandable than a class generator.

Here is (a first draft) proposal and how I would like to re-write the above generator, so it would have its own first class syntax:

    generator iterate_lines(path):
        local   data_lines = None
        context current_path, current_line, line_number, line_position

        start:
            with open(path) as f:
                current_path = path
                data = f.read()

            data_lines = tuple(data.splitlines())

        next:
            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

        stop:
            current_path = None
            line_number = 0
            line_position = 0

This:

  1. Adds a keyword 'generator' so it's obvious this is a generator, not a function.

  2. Declares its variables (data_lines).

  3. Declares which context variables it wants to use (current_path, current_line, line_number, & line_position).

  4. Has a start section that immediately gets executed.

  5. Has a next section that executes on each call to __next__ (and this is where the yield keyword must appear).

  6. Has a stop section that executes when the generator receives a StopIteration.

  7. The compiler could generate equally efficient code for generators as it does for current generators, while making the syntax clearer to the user.
  8. The syntax is chosen so the user can edit it & convert it to a class generator.

Given the above:

  - I could now add special code to either the 'start' or 'next' section, saying which context I wanted to use (once we have that syntax implemented).

The reason for its own syntax is to allow us to think more clearly about the different parts of a generator; it then becomes easier for the user to choose which part of the generator interacts with contexts & with which context.

In particular, the user could interact with multiple contexts (one in the start section & a different one in the next section).

[Also, for other generators I think the syntax needs to be extended to something like:

    next(context):
        use context:
            ....

Allowing two new features -- requesting that __next__ receive the context of the caller & secondly being able to use that context itself.]

Next, moving on to contexts:

  - I love how nonlocal works & how you can access variables declared in your surrounding function.

  - I really think that contexts should work the same way.

  - You would simply declare 'context' (like nonlocal) & just be able to use the variables directly.

  - Way easier to understand & use.

The sample code I have actually emulates contexts using nonlocal, so as to demonstrate the idea I am explaining.

Thanks,

Amit

P.S.: As I'm very new to python-ideas, I'm not sure if I should start a separate thread to discuss this or use the current thread. Also I'm not sure if I should attach the sample code here or not ... So I just provided the link above.

On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 13 October 2017 at 19:32, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
It seems simpler to have one specially named and specially called function be special, rather than make the semantics more complicated for all functions.
It's not possible to special case __aenter__ and __aexit__ reliably (supporting wrappers, decorators, and possible side effects).
+1. I think that would make it much more usable by those of us who are not experts.
I still don't understand what Steve means by "more usable", to be honest.
I'd consider myself a "non-expert" in async. Essentially, I ignore it - I don't write the sort of applications that would benefit significantly from it, and I don't see any way to just do "a little bit" of async, so I never use it.
But I *do* see value in the context variable proposals here - if only in terms of them being a way to write my code to respond to external settings in an async-friendly way. I don't follow the underlying justification (which is based in "we need this to let things work with async/coroutines") at all, but I'm completely OK with the basic idea (if I want to have a setting that behaves "naturally", like I'd expect decimal contexts to do, it needs a certain amount of language support, so the proposal is to add that). I'd expect to be able to write context variables that my code could respond to using a relatively simple pattern, and have things "just work". Much like I can write a context manager using @contextmanager and yield, and not need to understand all the intricacies of __enter__ and __exit__. (BTW, apologies if I'm mangling the terminology here - write it off as part of me being "not an expert" :-))
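For comparison, here is a minimal sketch of the @contextmanager pattern mentioned above. The settings dict and the precision function are hypothetical names invented for this illustration; a plain module-level dict like this is only meant to show how little machinery the decorator needs, and it does nothing to address the async behaviour being debated in this thread.

```python
from contextlib import contextmanager

# Hypothetical module-level setting, used only for illustration.
settings = {"precision": 28}

@contextmanager
def precision(value):
    # Code before the yield plays the role of __enter__.
    previous = settings["precision"]
    settings["precision"] = value
    try:
        yield settings
    finally:
        # Code after the yield plays the role of __exit__ and
        # restores the old value even if the body raised.
        settings["precision"] = previous

with precision(10):
    inside = settings["precision"]    # temporarily overridden
outside = settings["precision"]       # restored on exit
```

The author of precision never writes __enter__ or __exit__ by hand, which is the kind of simplicity being asked for with context variables.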
What I'm getting from this discussion is that even if I *do* have a simple way of writing context variables, they'll still behave in ways that seem mildly weird to me (as a non-async user). Specifically, my head hurts when I try to understand what that decimal context example "should do". My instincts say that the current behaviour is wrong - but I'm not sure I can explain why. So on that example, I'd ask the following of any proposal:
1. Users trying to write a context variable[1] shouldn't have to jump through hoops to get "natural" behaviour. That means that suggestions that the complexity be pushed onto decimal.context aren't OK unless it's also accepted that the current behaviour is wrong, and the only reason decimal.context needs to be replicated is for backward compatibility (and new code can ignore the problem).

2. The proposal should clearly establish what it views as "natural" behaviour, and why. I'm not happy with "it's how decimal.context has always behaved" as an explanation. Sure, people asking to break backward compatibility should have a good justification, but equally, people arguing to *preserve* an unintuitive current behaviour in new code should be prepared to explain why it's not a bug. To put it another way, context variables aren't required to be bug-compatible with thread local storage.
[1] I'm assuming here that "settings that affect how a library behaves" is a common requirement, and the PEP is intended as the "one obvious way" to implement them.
Nick's other async refactoring example is different. If the two forms he showed don't behave identically in all contexts, then I'd consider that to be a major problem. Saying that "coroutines are special" just reads to me as "coroutines/async are sufficiently weird that I can't expect my normal patterns of reasoning to work with them". (Apologies if I'm conflating coroutines and async incorrectly - as a non-expert, they are essentially indistinguishable to me). I sincerely hope that isn't the message I should be getting - async is already more inaccessible than I'd like for the average user.
The fact that Nick's async example immediately devolved into a discussion that I can't follow at all is fine - to an extent. I don't mind the experts debating implementation details that I don't need to know about. But if you make writing context variables harder, just to fix Nick's example, or if you make *using* async code like (either of) Nick's forms harder, then I do object, because that's affecting the end user experience.
In that context, I take Steve's comment as meaning "fiddling about with how __aenter__ and __aexit__ work is fine, as that's internals that non-experts like me don't care about - but making context variables behave oddly because of this is *not* fine".
Apologies if the above is unhelpful. I've been lurking but not commenting here, precisely because I *am* a non-expert, and I trust the experts to build something that works. But when non-experts were explicitly mentioned, I thought my input might be useful.
The following quote from the Zen seems particularly relevant here:
If the implementation is hard to explain, it's a bad idea.
(although the one about needing to be Dutch to understand why something is obvious might well trump it ;-))
Paul
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/