Re: [Python-ideas] PEP draft: context variables

Oct. 14, 2017

I really like what Paul Moore wrote here, as it matches a *LOT* of what I
have been feeling as I have been reading this whole discussion;
specifically:

   - I find the example, and discussion, really hard to follow.

   - I also don't understand async, but I do understand generators very
   well (like Paul Moore)

   - A lot of this doesn't seem natural (generators & context variable
   syntax)

   - And in particular: "If the implementation is hard to explain, it's a
   bad idea."

I've spent a lot of time thinking about this, and what the issues are.

I think they are multi-fold:

   - I use generators a lot -- I find them wonderful, and they are one of
   the joys of Python.  They are super useful.  However, as I am going to,
   hopefully, demonstrate here, they are not initially intuitive (to a
   beginner).

   - Generators are not really functions, but they appear to be functions;
   this was very confusing to me when I started working with generators.
   Now I'm used to it -- BUT we really need to consider new people, and I
   suggest making this easier.

   - I find the proposed context syntax very confusing (and slow).  I think
   contexts are super-important & instead need to be better integrated into
   the language (like nonlocal is)

   - People keep writing they want a real example -- so this is a very real
   example from real code I am writing (a python parser) and how I use
   contexts (obviously they are not part of the language yet, so I have
   emulated them) & how they interact with generators.

The full example, which took me a few hours to write, is available here
(it's a very, very reduced example from a real parser of the Python
language, written in Python):

   - https://github.com/AmitGreen/Gem/blob/emerald_6/work/demo.py

Here is the result of running the code -- which reads & executes demo1.py
(importing & executing demo2.py twice).  [Note: by 'executing', I mean the
code is running its own parser to execute it & its own code to emulate an
'import' -- thus showing nested contexts.]

It creates two input files for testing -- demo1.py:

print 1
print 8 - 2 * 3
import demo2
print 9 - sqrt(16)
print 10 / (8 - 2 * 3)
import demo2
print 2 * 2 * 2 + 3 - 4

And it also creates demo2.py:

print 3 * (2 - 1)
error
print 4

There are two syntax errors (on purpose) in the files, but since demo2.py
is imported twice, this will show three syntax errors.

Running the code produces the following:

demo1.py#1: expression '1' evaluates to 1
demo1.py#2: expression '8 - 2 * 3' evaluates to 2
demo1.py#3: importing module demo2
demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
demo2.py#2: UNKNOWN STATEMENT: 'error'
demo2.py#3: expression '4' evaluates to 4
demo1.py#4: UNKNOWN ATOM: ' sqrt(16)'
demo1.py#5: expression '10 / (8 - 2 * 3)' evaluates to 5
demo1.py#6: importing module demo2
demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
demo2.py#2: UNKNOWN STATEMENT: 'error'
demo2.py#3: expression '4' evaluates to 4
demo1.py#7: expression '2 * 2 * 2 + 3 - 4' evaluates to 7

This code demonstrates all of the following:

   - Nested contexts
   - Using contexts 'naturally' -- i.e.: directly as variables; without a
   'context.' prefix -- which I would find much harder to read & also slower.
   - Using a generator that is deliberately broken up into three parts,
   start, next & stop.
   - Handling errors & how it interacts with both the generator & 'context'
   - Actually parsing the input -- which creates a deeply nested stack (due
   to recursive calls during expression parsing) -- thus a perfect example for
   contexts.

So given all of the above, I'd first like to focus on the generator:

   - Currently we can write generators as either: (1) functions; or (2)
   classes with a __next__ method.  However this is very confusing to a
   beginner.
   - Given a generator like the following (actually in the code):

        def __iter__(self):
            while not self.finished:
                self.loop += 1

                yield self

   - What I found so surprising when I started working with generators is
   that calling the generator does *NOT* actually start the function.
   - Instead, the body of the function does not run until __next__ is
   called for the first time.
   - This is quite counter-intuitive.
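
To make the surprise concrete, here is a minimal sketch (count_up and
events are made-up names for illustration, not taken from demo.py):

```python
events = []

def count_up(limit):
    # Nothing in this body runs when count_up() is called.
    events.append("body started")
    for n in range(limit):
        yield n

gen = count_up(2)                # only creates the generator object
started_on_call = list(events)   # still empty: the body has not run

first = next(gen)                # the body runs now, up to the first yield
started_on_next = list(events)   # now contains "body started"
```

Any error in the first few lines of the body would likewise only show up
at the first next(), not at the call.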

I therefore suggest the following:

   - Give generators their own first class language syntax.
   - This syntax would have good entry points, to allow their interaction
   with context variables.

Here is the generator in my code sample:

    #
    #   Here is our generator to walk over a file.
    #
    #   This generator has three sections:
    #
    #       generator_startup   - Always run when the generator is started.
    #                             This opens the file & reads it.
    #
    #       generator_next      - Run each time the generator needs to
    #                             retrieve the next value.
    #
    #       generator_stop      - Called when the generator is going to stop.
    #
    #   NOTE: current_path, current_line, line_number & line_position are
    #   the emulated context variables; they are assumed to be defined in
    #   an enclosing function, which is what makes 'nonlocal' legal here.
    #
    def iterate_lines(path):
        data_lines = None

        def generator_startup(path):
            nonlocal current_path, data_lines

            with open(path) as f:
                current_path = path
                data         = f.read()

            data_lines = tuple(data.splitlines())

        def generator_next():
            nonlocal current_line, line_number, line_position

            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

            generator_stop()

        def generator_stop():
            # Without 'nonlocal' these assignments would only create locals.
            nonlocal current_path, line_number, line_position

            current_path  = None
            line_number   = 0
            line_position = 0

        generator_startup(path)

        return generator_next()

This generator demonstrates the following:

   - It immediately starts up when called (and in fact opens the file when
   called -- so if the file doesn't exist, an exception is thrown then, not
   later when the __next__ method is first called)
   - It's halfway between a function generator & a class generator; thus
   (1) efficient; and (2) more understandable than a class generator.
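
As a rough illustration of the first point, this sketch compares a plain
generator function with the "start immediately" pattern (lazy_lines and
eager_lines are hypothetical names, and the temporary directory is only
there to get a path that is guaranteed not to exist):

```python
import os
import tempfile

def lazy_lines(path):
    # Plain generator function: open() is not reached until the first next().
    with open(path) as f:
        for line in f.read().splitlines():
            yield line

def eager_lines(path):
    # Ordinary function: the file is opened & read at call time.
    with open(path) as f:
        data_lines = tuple(f.read().splitlines())

    def generator_next():
        for line in data_lines:
            yield line

    return generator_next()

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing.txt")

    lazy = lazy_lines(missing)       # no error yet
    try:
        next(lazy)
        lazy_error_at = "never"
    except FileNotFoundError:
        lazy_error_at = "first next()"

    try:
        eager_lines(missing)
        eager_error_at = "never"
    except FileNotFoundError:
        eager_error_at = "call"
```

With the eager form the error is reported at the call site, which is
usually where the bad path was supplied.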

Here is a (first draft) proposal for how I would like to rewrite the above
generator, so it would have its own first class syntax:

    generator iterate_lines(path):
        local   data_lines = None
        context current_path, current_line, line_number, line_position

        start:
            with open(path) as f:
                current_path = path
                data         = f.read()

            data_lines = tuple(data.splitlines())

        next:
            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

        stop:
            current_path  = None
            line_number   = 0
            line_position = 0

This:

   1. Adds a keyword 'generator' so it's obvious this is a generator, not a
   function.
   2. Declares its variables (data_lines)
   3. Declares which context variables it wants to use (current_path,
   current_line, line_number, & line_position)
   4. Has a start section that immediately gets executed.
   5. Has a next section that executes on each call to __next__ (and this
   is where the yield keyword must appear)
   6. Has a stop section that executes when the generator receives a
   StopIteration.
   7. The compiler could generate equally efficient code for generators as
   it does for current generators; while making the syntax clearer to the user.
   8. The syntax is chosen so the user can edit it & convert it to a class
   generator.
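
For comparison with point 8, here is a sketch of how the three sections
might map onto a class generator in today's Python.  LineIterator and its
stop() method are hypothetical, and it takes the text directly rather
than a path so that it is self-contained:

```python
class LineIterator:
    """Hypothetical class form of the start / next / stop sections."""

    def __init__(self, text):
        # 'start' section: runs immediately when the object is created.
        self.data_lines = tuple(text.splitlines())
        self.line_number = 0
        self.stopped = False

    def __iter__(self):
        return self

    def __next__(self):
        # 'next' section: runs once per requested value.
        if self.stopped:
            raise StopIteration
        if self.line_number >= len(self.data_lines):
            self.stop()
            raise StopIteration
        line = self.data_lines[self.line_number]
        self.line_number += 1
        return line

    def stop(self):
        # 'stop' section: reset state once the lines are exhausted.
        self.line_number = 0
        self.stopped = True


lines = LineIterator("print 1\nprint 4")
collected = list(lines)      # ['print 1', 'print 4']
```

The bookkeeping that the proposed syntax would hide (the stopped flag, the
explicit StopIteration) is exactly what makes the class form harder to
read.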

Given the above:

   - I could now add special code to either the 'start' or 'next' section,
   saying which context I wanted to use (once we have that syntax implemented).

The reason for its own syntax is to allow us to think more clearly about
the different parts of a generator, which in turn makes it easier for the
user to choose which part of the generator interacts with contexts & with
which context.  In particular, the user could interact with multiple
contexts (one in the start section & a different one in the next section).

[Also, for other generators I think the syntax needs to be extended to
something like:

    next(context):

       use context:

           ....

This would allow two new features -- requesting that __next__ receive the
context of the caller & secondly being able to use that context itself.]

Next, moving on to contexts:

   - I love how nonlocal works & how you can access variables declared in
   your surrounding function.
   - I really think that contexts should work the same way.
   - You would simply declare 'context' (like nonlocal) & just be able to
   use the variables directly.
   - Way easier to understand & use.

The sample code I have actually emulates contexts using nonlocal, so as to
demonstrate the idea I am explaining.
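
A much reduced sketch of that emulation, assuming a dict of sources instead
of real files (run, execute & report are illustrative names, not the ones
used in demo.py):

```python
def run(sources, main):
    # The 'context' variables live in this enclosing function; the nested
    # functions use them directly, without a 'context.' prefix.
    current_path = None
    line_number = 0
    log = []

    def execute(path):
        nonlocal current_path, line_number

        # Save the caller's context, then install ours (a nested context).
        saved = (current_path, line_number)
        current_path, line_number = path, 0
        try:
            for line in sources[path].splitlines():
                line_number += 1
                if line.startswith("import "):
                    execute(line.split()[1] + ".py")
                else:
                    report(line)
        finally:
            # Restore the caller's context on the way out.
            current_path, line_number = saved

    def report(line):
        # Only reads the context variables, so no nonlocal is needed here.
        log.append("%s#%d: %s" % (current_path, line_number, line))

    execute(main)
    return log


sources = {
    "demo1.py": "print 1\nimport demo2\nprint 2",
    "demo2.py": "print 3",
}
log = run(sources, "demo1.py")
# log == ['demo1.py#1: print 1', 'demo2.py#1: print 3', 'demo1.py#3: print 2']
```

The explicit save & restore in execute() is the part a real 'context'
declaration would be expected to handle automatically.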

Thanks,

Amit
P.S.: As I'm very new to python-ideas, I'm not sure if I should start a
separate thread to discuss this or use the current thread.  Also I'm not
sure if I should attach the sample code here or not ... So I just
provided the link above.

On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore <p.f.moore@gmail.com> wrote:
...
On 13 October 2017 at 19:32, Yury Selivanov <yselivanov.ml@gmail.com>
wrote:
...
...
...
It seems simpler to have one specially named and specially called function
be special, rather than make the semantics more complicated for all
functions.
It's not possible to special case __aenter__ and __aexit__ reliably
(supporting wrappers, decorators, and possible side effects).
...
+1.  I think that would make it much more usable by those of us who are not
experts.
I still don't understand what Steve means by "more usable", to be honest.
I'd consider myself a "non-expert" in async. Essentially, I ignore it
- I don't write the sort of applications that would benefit
significantly from it, and I don't see any way to just do "a little
bit" of async, so I never use it.
But I *do* see value in the context variable proposals here - if only
in terms of them being a way to write my code to respond to external
settings in an async-friendly way. I don't follow the underlying
justification (which is based in "we need this to let things work with
async/coroutines") at all, but I'm completely OK with the basic idea
(if I want to have a setting that behaves "naturally", like I'd expect
decimal contexts to do, it needs a certain amount of language support,
so the proposal is to add that). I'd expect to be able to write
context variables that my code could respond to using a relatively
simple pattern, and have things "just work". Much like I can write a
context manager using @contextmanager and yield, and not need to
understand all the intricacies of __enter__ and __exit__. (BTW,
apologies if I'm mangling the terminology here - write it off as part
of me being "not an expert" :-))
What I'm getting from this discussion is that even if I *do* have a
simple way of writing context variables, they'll still behave in ways
that seem mildly weird to me (as a non-async user). Specifically, my
head hurts when I try to understand what that decimal context example
"should do". My instincts say that the current behaviour is wrong -
but I'm not sure I can explain why. So on that example, I'd ask the
following of any proposal:
1. Users trying to write a context variable[1] shouldn't have to jump
through hoops to get "natural" behaviour. That means that suggestions
that the complexity be pushed onto decimal.context aren't OK unless
it's also accepted that the current behaviour is wrong, and the only
reason decimal.context needs to be replicated is for backward
compatibility (and new code can ignore the problem).
2. The proposal should clearly establish what it views as "natural"
behaviour, and why. I'm not happy with "it's how decimal.context has
always behaved" as an explanation. Sure, people asking to break
backward compatibility should have a good justification, but equally,
people arguing to *preserve* an unintuitive current behaviour in new
code should be prepared to explain why it's not a bug. To put it
another way, context variables aren't required to be bug-compatible
with thread local storage.
[1] I'm assuming here that "settings that affect how a library behaves"
is a common requirement, and the PEP is intended as the "one obvious
way" to implement them.
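
The decimal example from the thread is not reproduced in this message.  As
a hedged illustration of the kind of behaviour being debated, the sketch
below (the generator name fractions is made up) shows a decimal context
set inside a generator remaining active in the caller while the generator
is suspended:

```python
import decimal

def fractions(precision, x, y):
    # The local context stays installed for as long as the generator is
    # suspended inside this 'with' block.
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        yield decimal.Decimal(x) / decimal.Decimal(y)
        yield decimal.Decimal(x) / decimal.Decimal(y * 2)

default_prec = decimal.getcontext().prec

g = fractions(2, 1, 3)
first = next(g)                          # computed with 2 digits
leaked_prec = decimal.getcontext().prec  # the caller now sees prec == 2

g.close()                                # leaves the 'with' block
restored_prec = decimal.getcontext().prec
```

Whether the caller "should" see the generator's precision between the two
yields is the sort of question points 1 and 2 above ask a proposal to
answer.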
Nick's other async refactoring example is different. If the two forms
he showed don't behave identically in all contexts, then I'd consider
that to be a major problem. Saying that "coroutines are special" just
reads to me as "coroutines/async are sufficiently weird that I can't
expect my normal patterns of reasoning to work with them". (Apologies
if I'm conflating coroutines and async incorrectly - as a non-expert,
they are essentially indistinguishable to me). I sincerely hope that
isn't the message I should be getting - async is already more
inaccessible than I'd like for the average user.
The fact that Nick's async example immediately devolved into a
discussion that I can't follow at all is fine - to an extent. I don't
mind the experts debating implementation details that I don't need to
know about. But if you make writing context variables harder, just to
fix Nick's example, or if you make *using* async code like (either of)
Nick's forms harder, then I do object, because that's affecting the
end user experience.
In that context, I take Steve's comment as meaning "fiddling about
with how __aenter__ and __aexit__ work is fine, as that's internals
that non-experts like me don't care about - but making context
variables behave oddly because of this is *not* fine".
Apologies if the above is unhelpful. I've been lurking but not
commenting here, precisely because I *am* a non-expert, and I trust
the experts to build something that works. But when non-experts were
explicitly mentioned, I thought my input might be useful.
The following quote from the Zen seems particularly relevant here:
If the implementation is hard to explain, it's a bad idea.
(although the one about needing to be Dutch to understand why
something is obvious might well trump it ;-))
Paul

Amit Green