[Python-ideas] PEP draft: context variables

Amit Green amit.mixie at gmail.com
Fri Oct 13 23:53:39 EDT 2017


I really like what Paul Moore wrote here, as it matches a *LOT* of what I
have been feeling while reading this whole discussion; specifically:

   - I find the example, and discussion, really hard to follow.


   - I also don't understand async, but I do understand generators very
   well (like Paul Moore).


   - A lot of this doesn't seem natural (generators & context variable
   syntax)


   - And in particular: "If the implementation is hard to explain, it's a
   bad idea."

I've spent a lot of time thinking about this, and what the issues are.

I think they are multi-fold:

   - I really use generators a lot -- and find them wonderful; they are one
   of the joys of Python.  They are super useful.  However, as I am going
   to, hopefully, demonstrate here, they are not initially intuitive (to a
   beginner).


   - Generators are not really functions, but they appear to be functions;
   this was very confusing to me when I started working with generators.
      - Now I'm used to it -- BUT we really need to consider new people,
      and I suggest making this easier.


   - I find the proposed context syntax very confusing (and slow).  I think
   contexts are super-important & instead need to be better integrated into
   the language (like nonlocal is).


   - People keep writing that they want a real example -- so this is a very
   real example, from real code I am writing (a Python parser), showing how
   I use contexts (obviously they are not part of the language yet, so I
   have emulated them) & how they interact with generators.

The full example, which took me a few hours to write, is available here
(it's a very, very reduced example from a real parser of the Python
language, written in Python):

   - https://github.com/AmitGreen/Gem/blob/emerald_6/work/demo.py

The code reads & executes demo1.py (importing & executing demo2.py twice).
[Note: by 'executing' I mean that the code runs its own parser to execute
it, & its own code to emulate an 'import' -- thus showing nested contexts.]

It creates two input files for testing -- demo1.py:

print 1
print 8 - 2 * 3
import demo2
print 9 - sqrt(16)
print 10 / (8 - 2 * 3)
import demo2
print 2 * 2 * 2 + 3 - 4

And it also creates demo2.py:

print 3 * (2 - 1)
error
print 4

There are two syntax errors (on purpose) in the files, but since demo2.py
is imported twice, this will show three syntax errors.

Running the code produces the following:

demo1.py#1: expression '1' evaluates to 1
demo1.py#2: expression '8 - 2 * 3' evaluates to 2
demo1.py#3: importing module demo2
demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
demo2.py#2: UNKNOWN STATEMENT: 'error'
demo2.py#3: expression '4' evaluates to 4
demo1.py#4: UNKNOWN ATOM: ' sqrt(16)'
demo1.py#5: expression '10 / (8 - 2 * 3)' evaluates to 5
demo1.py#6: importing module demo2
demo2.py#1: expression '3 * (3 - 2)' evaluates to 3
demo2.py#2: UNKNOWN STATEMENT: 'error'
demo2.py#3: expression '4' evaluates to 4
demo1.py#7: expression '2 * 2 * 2 + 3 - 4' evaluates to 7

This code demonstrates all of the following:

   - Nested contexts
   - Using contexts 'naturally' -- i.e. directly as variables, without a
   'context.' prefix -- which I would find harder to read & also slower.
   - Using a generator that is deliberately broken up into three parts:
   start, next & stop.
   - Handling errors & how they interact with both the generator & the
   'context'.
   - Actually parsing the input -- which creates a deeply nested stack (due
   to recursive calls during expression parsing) -- thus a perfect example
   for contexts (see the small sketch just below).
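
To make the last point concrete, here is a tiny self-contained sketch (my
own illustration, not code from demo.py; all names are made up) of why
recursion makes contexts attractive: the error reporter, deep inside the
recursive parse, can see the current line number without it being passed
through every call.

    def parse_file(lines):
        line_number = 0                         # emulated "context" variable

        def report_error(message):
            #   Deep inside the recursion we can still see which line we are
            #   on, without threading line_number through every call.
            print('line %d: %s' % (line_number, message))

        def parse_atom(token):
            if token.isdigit():
                return int(token)
            report_error('UNKNOWN ATOM: %r' % token)
            return 0

        def parse_sum(tokens):
            #   Recursive descent over sums of atoms, e.g. '1 + 2 + 3'.
            total = parse_atom(tokens[0])
            if len(tokens) > 2 and tokens[1] == '+':
                total += parse_sum(tokens[2:])
            return total

        for line_number, line in enumerate(lines, 1):
            print('line %d evaluates to %d'
                  % (line_number, parse_sum(line.split())))

    parse_file(['1 + 2', '3 + x + 4'])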

So given all of the above, I'd first like to focus on the generator:

   - Currently we can write generators as either: (1) functions; or (2)
   classes with a __next__ method.  However, this is very confusing to a
   beginner.
   - Given a generator like the following (actually in the code):

        def __iter__(self):
            while not self.finished:
                self.loop += 1

                yield self

   - What I found so surprising when I started working with generators is
   that calling the generator does *NOT* actually start the function.
   - Instead, the body of the generator does not run until the first
   __next__ method is called.
   - This is quite counter-intuitive (see the small example below).
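
Here is a tiny example of that surprise (my own illustration, not from the
demo code):

    def gen():
        print('started')            # does NOT run when gen() is called
        yield 1

    g = gen()                       # nothing is printed here
    print(next(g))                  # only now prints 'started', and then 1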

I therefore suggest the following:

   - Give generators their own first-class language syntax.
   - This syntax would have good entry points to allow their interaction
   with context variables.

Here is the generator in my code sample:

    #
    #   Here is our generator to walk over a file.
    #
    #   This generator has three sections:
    #
    #       generator_start     - Always run when the generator is started.
    #                             This opens the file & reads it.
    #
    #       generator_next      - Run each time the generator needs to
    #                             retrieve the next value.
    #
    #       generator_stop      - Called when the generator is going to stop.
    #
    def iterate_lines(path):
        #
        #   NOTE:   The emulated context variables (current_path, current_line,
        #           line_number & line_position) are assumed to be defined in
        #           an enclosing scope in the full demo.py; this excerpt relies
        #           on that scope existing.
        #
        data_lines = None


        def generator_startup(path):
            nonlocal current_path, data_lines

            with open(path) as f:
                current_path = path
                data         = f.read()

            data_lines = tuple(data.splitlines())


        def generator_next():
            #   NOTE: 'line_position' is included in the nonlocal list so the
            #   assignment below updates the emulated context variable, rather
            #   than creating a new local.
            nonlocal current_line, line_number, line_position

            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

            generator_stop()


        def generator_stop():
            #   NOTE: without this nonlocal declaration, the assignments below
            #   would only create new locals, instead of resetting the emulated
            #   context variables.
            nonlocal current_path, line_number, line_position

            current_path  = None
            line_number   = 0
            line_position = 0


        generator_startup(path)

        return generator_next()

This generator demonstrates the following:

   - It immediately starts up when called (and in fact opens the file when
   called -- so if the file doesn't exist, an exception is thrown then, not
   later when the __next__ method is first called; see the short usage
   sketch below).
   - It's halfway between a function generator & a class generator; thus it
   is (1) efficient; and (2) more understandable than a class generator.
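
And here is a short usage sketch (assuming the emulated context variables
from the full demo.py are in scope); note that the open & read happen at
the call itself, before the first __next__:

    lines = iterate_lines('demo1.py')   # the open & read happen right here;
                                        # a missing file raises immediately

    for line in lines:                  # each iteration runs the 'next' part
        print(line)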

Here is a (first draft) proposal for how I would like to re-write the above
generator, so that it has its own first-class syntax:

    generator iterate_lines(path):
        local   data_lines = None
        context current_path, current_line, line_number, line_position

        start:
            with open(path) as f:
                current_path = path
                data         = f.read()

            data_lines = tuple(data.splitlines())

        next:
            for current_line in data_lines:
                line_number += 1
                line_position = 0

                yield current_line

        stop:
            current_path  = None
            line_number   = 0
            line_position = 0

This:

   1. Adds a keyword 'generator', so it's obvious this is a generator, not
   a function.
   2. Declares its local variables (data_lines).
   3. Declares which context variables it wants to use (current_path,
   current_line, line_number, & line_position).
   4. Has a start section that is executed immediately.
   5. Has a next section that executes on each call to __next__ (and this
   is where the yield keyword must appear).
   6. Has a stop section that executes when the generator finishes (i.e.
   when it is about to raise StopIteration).
   7. The compiler could generate code for these generators that is just as
   efficient as for current generators, while making the syntax clearer to
   the user.
   8. The syntax is chosen so the user can edit it & convert it to a class
   generator (a rough class-based sketch follows this list).
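
As a rough illustration of point 8, here is one way the start / next / stop
sections might be hand-converted to a class generator today (my own sketch;
the context-variable handling is omitted & the names are illustrative only):

    class IterateLines:
        def __init__(self, path):
            #   'start' section: runs immediately, when the object is created.
            with open(path) as f:
                self.data_lines = tuple(f.read().splitlines())
            self.index = 0

        def __iter__(self):
            return self

        def __next__(self):
            #   'next' section: runs on each call to __next__.
            if self.index >= len(self.data_lines):
                #   'stop' section: runs when the lines are exhausted.
                self.data_lines = ()
                raise StopIteration
            line = self.data_lines[self.index]
            self.index += 1
            return line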

Given the above:

   - I could now add special code to either the 'start' or 'next' section,
   saying which context I wanted to use (once we have that syntax implemented).

The reason for giving generators their own syntax is to allow us to think
more clearly about the different parts of a generator, and it then makes it
easier for the user to choose which part of the generator interacts with
contexts, & with which context.  In particular, the user could interact with
multiple contexts (one in the start section & a different one in the next
section).

[Also, for other generators, I think the syntax needs to be extended to
something like:

    next(context):

       use context:

           ....

allowing two new features: requesting that __next__ receive the context of
the caller, & secondly being able to use that context itself.]

Next, moving on to contexts:

   - I love how nonlocal works & how you can access variables declared in
   your surrounding function.
   - I really think that contexts should work the same way.
   - You would simply declare 'context' (like nonlocal) & just be able to
   use the variables directly.
   - Way easier to understand & use.

The sample code I have actually emulates contexts using nonlocal, so as to
demonstrate the idea I am explaining; here is roughly what that emulation
looks like:
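
(This is a minimal sketch of the emulation idea, not the actual demo.py
code; run() & its names are just for illustration.)

    def run(path):
        current_path = None                 # emulated "context" variables
        line_number  = 0

        def report_error(message):
            #   Reads the emulated context variables directly.
            print('%s#%d: %s' % (current_path, line_number, message))

        def iterate_lines(path):
            nonlocal current_path, line_number

            current_path = path
            with open(path) as f:
                data_lines = tuple(f.read().splitlines())

            line_number = 0
            for line in data_lines:
                line_number += 1
                yield line

        for line in iterate_lines(path):
            if line == 'error':
                report_error('UNKNOWN STATEMENT: %r' % line)

    run('demo2.py')                         # assumes demo2.py exists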

Thanks,

Amit
P.S.: As I'm very new to python ideas, I'm not sure if I should start a
separate thread to discuss this or use the current thread.  Also I'm not
sure if I should attach the sample code here or not ... so I just
provided the link above.

On Fri, Oct 13, 2017 at 4:29 PM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 13 October 2017 at 19:32, Yury Selivanov <yselivanov.ml at gmail.com>
> wrote:
> >>> It seems simpler to have one specially named and specially called
> function
> >>> be special, rather than make the semantics
> >>> more complicated for all functions.
> >>
> >
> > It's not possible to special case __aenter__ and __aexit__ reliably
> > (supporting wrappers, decorators, and possible side effects).
> >
> >> +1.  I think that would make it much more usable by those of us who are
> not
> >> experts.
> >
> > I still don't understand what Steve means by "more usable", to be honest.
>
> I'd consider myself a "non-expert" in async. Essentially, I ignore it
> - I don't write the sort of applications that would benefit
> significantly from it, and I don't see any way to just do "a little
> bit" of async, so I never use it.
>
> But I *do* see value in the context variable proposals here - if only
> in terms of them being a way to write my code to respond to external
> settings in an async-friendly way. I don't follow the underlying
> justification (which is based in "we need this to let things work with
> async/coroutines) at all, but I'm completely OK with the basic idea
> (if I want to have a setting that behaves "naturally", like I'd expect
> decimal contexts to do, it needs a certain amount of language support,
> so the proposal is to add that). I'd expect to be able to write
> context variables that my code could respond to using a relatively
> simple pattern, and have things "just work". Much like I can write a
> context manager using @contextmanager and yield, and not need to
> understand all the intricacies of __enter__ and __exit__. (BTW,
> apologies if I'm mangling the terminology here - write it off as part
> of me being "not an expert" :-))
>
> What I'm getting from this discussion is that even if I *do* have a
> simple way of writing context variables, they'll still behave in ways
> that seem mildly weird to me (as a non-async user). Specifically, my
> head hurts when I try to understand what that decimal context example
> "should do". My instincts say that the current behaviour is wrong -
> but I'm not sure I can explain why. So on that example, I'd ask the
> following of any proposal:
>
> 1. Users trying to write a context variable[1] shouldn't have to jump
> through hoops to get "natural" behaviour. That means that suggestions
> that the complexity be pushed onto decimal.context aren't OK unless
> it's also accepted that the current behaviour is wrong, and the only
> reason decimal.context needs to replicated is for backward
> compatibility (and new code can ignore the problem).
> 2. The proposal should clearly establish what it views as "natural"
> behaviour, and why. I'm not happy with "it's how decimal.context has
> always behaved" as an explanation. Sure, people asking to break
> backward compatibility should have a good justification, but equally,
> people arguing to *preserve* an unintuitive current behaviour in new
> code should be prepared to explain why it's not a bug. To put it
> another way, context variables aren't required to be bug-compatible
> with thread local storage.
>
> [1] I'm assuming here that "settings that affect how a library behave"
> is a common requirement, and the PEP is intended as the "one obvious
> way" to implement them.
>
> Nick's other async refactoring example is different. If the two forms
> he showed don't behave identically in all contexts, then I'd consider
> that to be a major problem. Saying that "coroutines are special" just
> reads to me as "coroutines/async are sufficiently weird that I can't
> expect my normal patterns of reasoning to work with them". (Apologies
> if I'm conflating coroutines and async incorrectly - as a non-expert,
> they are essentially indistinguishable to me). I sincerely hope that
> isn't the message I should be getting - async is already more
> inaccessible than I'd like for the average user.
>
> The fact that Nick's async example immediately devolved into a
> discussion that I can't follow at all is fine - to an extent. I don't
> mind the experts debating implementation details that I don't need to
> know about. But if you make writing context variables harder, just to
> fix Nick's example, or if you make *using* async code like (either of)
> Nick's forms harder, then I do object, because that's affecting the
> end user experience.
>
> In that context, I take Steve's comment as meaning "fiddling about
> with how __aenter__ and __aexit__ work is fine, as that's internals
> that non-experts like me don't care about - but making context
> variables behave oddly because of this is *not* fine".
>
> Apologies if the above is unhelpful. I've been lurking but not
> commenting here, precisely because I *am* a non-expert, and I trust
> the experts to build something that works. But when non-experts were
> explicitly mentioned, I thought my input might be useful.
>
> The following quote from the Zen seems particularly relevant here:
>
>     If the implementation is hard to explain, it's a bad idea.
>
> (although the one about needing to be Dutch to understand why
> something is obvious might well trump it ;-))
>
> Paul
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

