[Python-ideas] Tweaking closures and lexical scoping to include the function being defined

Sun Oct 2 04:11:46 CEST 2011

On Sat, Oct 1, 2011 at 2:57 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Ron Adam writes:
>
>  > Supposedly the @ decorator syntax is supposed to be like a pre-compile
>  > substitution where..
>  >
>  >     @decorator
>  >     def func(x):x
>  >
>  > Is equivalent to...
>  >
>  >     def func(x):x
>  >     func = decorator(func)
>  >
>  > IF that is true,
>
> It is.

It isn't quite - the name binding doesn't happen until *after* the
decorator chain has been invoked, so the function is anonymous while
the decorators are executing. In addition, the decorator expressions
themselves are evaluated before the function is defined. That's not a
problem in practice, since each decorator gets passed the result of
the previous one (or the original function for the innermost
decorator).

The real equivalent code is more like:

    <anon1> = decorator
    def <anon2>(x):
        return x
    func = <anon1>(<anon2>)

(IIRC, the anonymous references happen to be stored on the frame stack
in CPython, but that's an implementation detail)
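
The ordering described above can be observed directly. This is only a
sketch: the names `tracer`, `target` and `log` are made up for the
illustration, and the check against globals() assumes the code runs at
module level.

```python
log = []

def tracer(label):
    # Runs while the decorator expression is evaluated, before the
    # decorated function object has been created.
    log.append(f"evaluate {label}")

    def decorator(func):
        # Runs after the function object exists, but before the name
        # 'target' has been bound in the enclosing namespace.
        log.append(f"apply {label}, name bound: {'target' in globals()}")
        return func

    return decorator

@tracer("outer")
@tracer("inner")
def target(x):
    return x

for entry in log:
    print(entry)
```

Run as a top-level script, the log should show both decorator
expressions being evaluated (outermost first), then the decorators being
applied (innermost first), with the name still unbound each time.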

As far as the proposed semantics for any new syntax to eliminate the
desire to use the default argument hack goes, I haven't actually heard
any complaints about making such an addition syntactic sugar for the
following closure idiom:

    def <anon1>():
        NAME = EXPR
        def FUNC(ARGLIST):
            """DOC"""
            nonlocal NAME
            BODY
        return FUNC
    FUNC = <anon1>()
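
As a concrete instance of that template, here is a minimal sketch with
NAME = n, EXPR = 0 and FUNC = counter. The helper name `_make_counter`
stands in for the anonymous outer function and is purely illustrative.

```python
def _make_counter():
    n = 0  # NAME = EXPR, evaluated once at definition time

    def counter():
        """Return how many times this function has been called."""
        nonlocal n  # rebind the state variable rather than a new local
        n += 1
        return n

    return counter

counter = _make_counter()
del _make_counter  # approximate the "anonymous" outer function

print(counter(), counter(), counter())
```

The state lives in the closure cell rather than in the parameter list,
so callers have no way to override it by passing an extra argument.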

The debate focuses on whether or not there is any possible shorthand
spelling for those semantics that successfully negotiates the Zen of
Python:

Beautiful is better than ugly.
    - the default argument hack is actually quite neat and tidy if you
know what it means. Whatever we do should be at least as attractive as
that approach.

Explicit is better than implicit.
    - the line between the default argument hack and normal default
arguments is blurry. New syntax would fix that.

Simple is better than complex.
    - lexical scoping left simple behind years ago ;)

Complex is better than complicated.
    - IMO, the default argument hack is complicated, since it abuses a
tool meant for something else, whereas function state variables would
be just another tier in the already complex scoping spectrum from
locals through lexical scoping to module globals and builtins (with
function state variables slotting in neatly between ordinary locals
and lexically scoped nonlocals).

Flat is better than nested.
   - There's a lot of visual nesting going on if you spell out these
semantics as a closure or as a class. The appeal of the default
argument hack largely lies in its ability to flatten that out into
state storage on the function object itself.

Sparse is better than dense.
  - This would be the main argument for having something before the
header line (decorator style) rather than cramming yet more
information into the header line itself. However, it's also an
argument against decorator-style syntax, since that is quite heavy on
the page (due to the busyness of the '@' symbol in most fonts).

Readability counts.
  - The class and closure solutions are not readable - that's the big
reason people opt for the default argument hack when it applies. It
remains to be seen if we can come up with dedicated syntax that is at
least as readable as the default argument hack itself.

Special cases aren't special enough to break the rules.
  - I think this is the heart of what killed the "inside the function
scope" variants for me. They're *too* magical and different from the
way other code at function scope works to be a good fit.

Although practicality beats purity.
  - Using the default argument hack in the first place is the epitome of this :)

Errors should never pass silently.
Unless explicitly silenced.
  - This is why 'nonlocal x', where x is not defined in an enclosing
function scope, is, and will remain, a SyntaxError, and why nonlocal and
global declarations that conflict with the parameter list are also errors.
Similar constraints would be placed on any new syntax dedicated to
function state variables.

In the face of ambiguity, refuse the temptation to guess.
  - In the case of name bindings, the compiler doesn't actually
*guess* anything - name bindings create local variables, unless
overridden by some other piece of syntax (i.e. a nonlocal or global
declaration). This may, of course, look like guessing to developers
that don't understand the scoping rules yet. The challenge for
function state variables is coming up with a similarly unambiguous
syntax that still allows them to be given an initial state at function
definition time.

There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
  - For me, these two are about coming up with a syntax that is easy
to *remember* once you know it, even if you have to look up what it
means the first time you encounter it. Others set the bar higher and want
developers to have a reasonable chance of *guessing* what it means
without actually reading the documentation for the new feature. I
think the latter goal is unattainable and hence not a useful standard.
However, I'll also note that the default argument hack itself does
meet *my* interpretation of this guideline (if you know it,
recognising it and remembering it aren't particularly difficult).

Now is better than never.
Although never is often better than *right* now.
  - The status quo has served us well for a long time. If someone can
come up with an elegant syntax, great, let's pursue it. Otherwise,
this whole issue really isn't that important in the grand scheme of
things (although a PEP to capture the current 'state of the art'
thinking on the topic would still be nice - I believe Jan and Eric
still plan to get to that once the discussion dies down again).

If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
  - this is where I think the specific proposal to just add syntactic
sugar for a particular usage of an ordinary closure is a significant
improvement on past attempts. Anyone that understands how closures
work will understand the meaning of the new syntax, just as anyone
that fully understands 'yield' can understand PEP 380's 'yield from'.

Namespaces are one honking great idea -- let's do more of those!
  - closures are just another form of namespace, even though people
typically think of classes, modules and packages when contemplating
this precept. "Function state variables" would be formalising the
namespace where default argument values live (albeit anonymously) and
making it available for programmatic use.
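
For readers who haven't met it, the default argument hack referred to
throughout looks roughly like the following. This is a sketch only; the
names `late`, `early`, `counter` and `_state` are invented for the
example.

```python
# Ordinary closures look names up when called, so every lambda here
# sees the final value of i.
late = [lambda: i for i in range(3)]

# The default argument hack: i=i captures the current value at
# definition time by storing it as a default.
early = [lambda i=i: i for i in range(3)]

def counter(_state=[0]):
    # The mutable default is created once and acts as per-function state.
    _state[0] += 1
    return _state[0]

late_results = [f() for f in late]
early_results = [f() for f in early]
first, second = counter(), counter()

# The blurry line with normal defaults: a caller can replace the
# "state" simply by passing an argument.
overridden = counter([41])

# The anonymous namespace where default values live.
stored = counter.__defaults__
```

The `overridden` call is the weakness mentioned under "Explicit is
better than implicit": nothing in the signature distinguishes state
storage from a genuine optional parameter.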

Despite its flaws, the simple bracket-enclosed list after the
function parameter list is still my current favourite:

    def global_counter(x) [n=0, lock=Lock()]:
        with lock:
            n += 1
            yield n

It just composes more nicely with decorators than the main alternative
still standing and is less prone to overwhelming the function name
with extraneous implementation details. For example, this:

    @contextmanager
    def counted() [active=collections.Counter(), lock=threading.RLock()]:
        with lock:
            active[threading.current_thread().ident] += 1
        yield active
        with lock:
            active[threading.current_thread().ident] -= 1

far more clearly conveys "this defines a context manager named
'counted'" than the following does:

    @contextmanager
    @(active=collections.Counter(), lock=threading.RLock())
    def counted():
        with lock:
            active[threading.current_thread().ident] += 1
        yield active
        with lock:
            active[threading.current_thread().ident] -= 1
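
Neither spelling is valid Python today. For comparison, here is a sketch
of how the same context manager might be written with the closure idiom
from earlier in this message. The helper name `_make_counted` is
hypothetical, and the try/finally is an addition that is not in the
examples above.

```python
import collections
import threading
from contextlib import contextmanager

def _make_counted():
    # State variables, initialised once at definition time.
    active = collections.Counter()
    lock = threading.RLock()

    @contextmanager
    def counted():
        ident = threading.current_thread().ident
        with lock:
            active[ident] += 1
        try:
            yield active
        finally:
            # Added for the sketch: decrement even if the body raises.
            with lock:
                active[ident] -= 1

    return counted

counted = _make_counted()

with counted() as active:
    print(active[threading.current_thread().ident])
```

The extra level of nesting and the separate factory call are exactly the
visual overhead that the proposed syntax is trying to remove.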

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia