[Python-ideas] Tweaking closures and lexical scoping to include the function being defined

Nick Coghlan ncoghlan at gmail.com
Sun Oct 2 15:38:07 CEST 2011


On Sun, Oct 2, 2011 at 2:29 AM, Ron Adam <ron3200 at gmail.com> wrote:
> On Sat, 2011-10-01 at 22:11 -0400, Nick Coghlan wrote:
>
> +1 on all of the zen statements of course.
>
> I think you made a fine case for being careful and mindful about this
> stuff.  :-)

Heh, even if nothing else comes out of these threads, I can be happy
with helping others to learn how to look at this kind of question from
multiple angles without getting too locked in to one point of view
(and getting more practice at doing so, myself, of course!)

> One way to think of this is as Private, Shared, and Public name spaces.
> Private and Public are locals and globals, and are pretty well
> supported, but Shared name spaces (closures or otherwise) are not well
> supported.
>
> I think the whole concept of explicit shared name spaces, separate from
> globals and locals is quite important and should be done carefully.  I
> don't think it is just about one or two use-cases that a small tweak
> will cover.

"not well supported" seems a little too harsh in the post PEP 3104
'nonlocal' declaration era. If we look at the full suite of typical
namespaces in Python, we currently have the following (note that
read/write and read-only refer to the name bindings themselves -
mutable objects can obviously still be modified for a reference that
can't be rebound):

- Locals: naturally read/write
- Function state variables (aka default argument values): naturally
  read-only, very hard to rebind since this namespace is completely
  anonymous in normal usage
- Lexically scoped non-locals: naturally read-only, writable with
  nonlocal declaration
- Module globals: within functions in module, naturally read-only,
  writable with global declaration. At module level, naturally
  read/write. From outside the module, naturally read/write via module
  object
- Process builtins: naturally read-only, writable via "import builtins"
  and attribute assignment
- Instance variables: in methods, naturally read/write via 'self' object
- Class variables: in instance methods, naturally read-only, writable
  via 'type(self)' or 'self.__class__'. Naturally read/write in class
  methods via 'cls', 'klass' or 'class_' object.

Of those, I would put lexical scoping, function state variables and
class variables in the 'shared' category - they aren't as contained as
locals and instance variables, but they aren't as easy to access as
module globals and process builtins, either.
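
To make those read/write distinctions concrete, here's a minimal sketch
(the names and values are purely illustrative):

    counter = 0              # module global: read-only inside functions by default

    def make_incrementer():
        count = 0            # local here, lexically visible to the nested function
        def increment():
            nonlocal count   # PEP 3104: rebind a lexically scoped name
            global counter   # rebind a module global
            count += 1
            counter += 1
            return count
        return increment

    inc = make_incrementer()
    print(inc(), inc())      # -> 1 2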

The current discussion is about finding a syntax to bring function
state variables on par with lexical scoping, such that default
argument values are no longer such a unique case.
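
For reference, the existing default argument hack that the new syntax is
intended to replace looks something like this (a deliberately simplified
example using a mutable default, not code from the thread):

    def next_id(_counter=[0]):
        # the state lives in the otherwise anonymous default argument namespace
        _counter[0] += 1
        return _counter[0]

    print(next_id(), next_id())   # -> 1 2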

> How about a name space literal? ie.. a dictionary.
>
>    def global_counter(x) {n: 0, lock: lock}:
>        with lock:
>            n += 1
>            yield n
>
> I think that looks better than dict(n=0, lock=lock).  And when used as a
> repr for name spaces, it is more readable.

The "but it looks like a list" argument doesn't really hold any water
for me. Parameter and argument lists look like tuples, too, but people
figure out from context that they mean something different and permit
different content.

I have some specific objections to the braces syntax, too:
  - I believe the association with func.__dict__ would be too strong
(since it's actually unrelated)
  - braces and colons are a PITA to type compared to brackets and equals signs
  - the LHS in a dictionary is an ordinary expression, whereas here it
would be an unquoted name

[NAME=EXPR, NAME2=EXPR2] is clearly illegal as a list, so it must mean
something else, perhaps something akin to what (NAME=EXPR,
NAME2=EXPR2) would have meant in the immediately preceding parameter
list (this intuition would be correct, since the two are closely
related, differing only in the scope of any rebindings of the names in
the function body). {NAME=EXPR, NAME2=EXPR2}, on the other hand, looks
an awful lot like {NAME:EXPR, NAME2:EXPR2}, which would be an ordinary
dict literal, and *not* particularly related to what the new syntax
would mean.

> A literal would cover the default values use case quite nicely.  A
> reference to a pre-defined dictionary would cover values shared between
> different functions independent of scope.

No, that can never work (it's akin to the old "from module import *"
at function level, which used to disable fast locals but is now simply
not allowed). The names for any shared state *must* be explicit in the
syntax so that the compiler knows what they are. When that isn't
adequate, it's a sign that it's time to upgrade to a full class or
closure.
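
The reason the names have to be explicit is that local and closure
references are resolved when the function is compiled. A small sketch
(illustrative only) shows where they end up:

    def outer():
        shared = 0
        def inner():
            return shared
        return inner

    f = outer()
    print(outer.__code__.co_cellvars)   # ('shared',) - a cell created in outer
    print(f.__code__.co_freevars)       # ('shared',) - resolved when inner was compiled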

>>     @contextmanager
>>     def counted()  [active=collections.Counter(),
>>                     lock=threading.RLock()]:
>
>> far more clearly conveys "this defines a context manager named
>> 'counted'" than the following does:
>
>>     @contextmanager
>>     @(active=collections.Counter(), lock=threading.RLock())
>>     def counted():
>
>
> Putting them after the function signature will result in more wrapped
> function signatures.

Agreed, but even there I think I prefer that outcome, since the more
important information (name and signature) precedes the less important
(the state variable initialisation). Worst case, someone can put their
state in a named tuple or class instance to reduce the noise in the
header line - state variables are about approaching a problem in a
different way (i.e. algorithm more prominent than state) rather than
about avoiding the use of structured data altogether.
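
As a sketch of that "worst case" workaround (hypothetical code, using
today's default argument hack rather than the proposed syntax, and
skipping the context manager part), the state can be bundled into a
single object to keep the header short:

    import collections
    import threading

    class _CountedState:
        def __init__(self):
            self.active = collections.Counter()
            self.lock = threading.RLock()

    def counted(name, _state=_CountedState()):
        with _state.lock:
            _state.active[name] += 1
            return _state.active[name]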

> While it's very interesting to try to find a solution, I am also
> concerned about what this might mean in the long term.  In particular,
> we will see more meta-programming.  Being able to initialise an object
> from one or more other objects can be very nice.  Python does that
> sort of thing all over the place.

I'm not sure I understand what you mean in your use of the term
'meta-programming' here. The biggest danger to my mind is that we'll
see more true process-level globals as state on top-level functions,
and those genuinely *can* be problematic (but also very useful, which
is why C has them). It's really no worse than class variables, though.

The other objection to further enhancing the power of functions to
maintain state is that functions aren't naturally decomposable the way
classes are - if an algorithm is written cleanly as methods on a
class, then you can override just the pieces you need to modify while
leaving the overall structure intact. For functions, it's much harder
to do the same thing (hence generators, coroutines and things like the
visitor pattern when walking data structures).
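
A quick illustration of that decomposability point (a made-up example,
not from the thread): with a class, one step of the algorithm can be
overridden while the overall walking logic stays intact:

    class Node:
        def __init__(self, value, *children):
            self.value = value
            self.children = children

    class Walker:
        def walk(self, node):
            self.visit(node)
            for child in node.children:
                self.walk(child)
        def visit(self, node):           # the piece a subclass can replace
            print(node.value)

    class DoubledWalker(Walker):
        def visit(self, node):           # override just this step
            print(node.value * 2)

    tree = Node(1, Node(2), Node(3))
    Walker().walk(tree)                  # prints 1, 2, 3
    DoubledWalker().walk(tree)           # prints 2, 4, 6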

My main counters to those objections are:
1. Any feature of this new proposal can already be done with explicit
closures or the default argument hack. While usage may increase
slightly with an officially blessed syntax, I don't expect that to
happen to any great extent - I'm more hoping that, over time, existing
uses of the default argument hack would get replaced.
2. When an algorithm inevitably runs up against the practical limits
of any new syntax, the full wealth of Python remains available for
refactoring (e.g. by upgrading to a full class or closure)
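
As a final sketch of that escape hatch (again hypothetical, reusing the
next_id example from earlier in this message): once the state outgrows
the hack, it can be promoted to a full class:

    class NextId:
        def __init__(self):
            self._counter = 0
        def __call__(self):
            self._counter += 1
            return self._counter

    next_id = NextId()
    print(next_id(), next_id())   # -> 1 2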

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


