[Python-ideas] Tweaking closures and lexical scoping to include the function being defined

Steven D'Aprano steve at pearwood.info
Mon Sep 26 17:16:56 CEST 2011


Masklinn wrote:
> On 2011-09-26, at 14:26 , Alex Gaynor wrote:
>> Nick Coghlan <ncoghlan at ...> writes:
>>>    i = 88
>>>
>>>    def f():
>>>        nonlocal i from 17
>>>        print(i)
>>>        i += 1
>>>
>>>  def outer():
>>>    i = 17
>>>    def f():
>>>        nonlocal i
>>>        print(i)
>>>        i += 1
>>>    return f
>>>
>>>>>> f = outer()
[...]
> An other thing which strikes me as weird is that the proposal is basically the
> creation of private instance attribute on functions. Could you not get the same
> by actually setting an attribute on the function (this can not be used in
> lambdas in any case)?
> 
>     def f():
>         print(f.i)
>         f.i += 1
>     f.i = 17


The advantages of the attribute solution are:

- you can do it right now (functions have supported public writable 
attributes since v2.1);
- the attribute is writable by the caller.

If you want a public, writable attribute on a function, and I frequently 
do, this works fine. But:

- the attribute is exposed to the caller even if it shouldn't be;
- it's slow compared to local lookup;
- lookup is by name, so if the function is renamed, it breaks;
- the initial value for the attribute is assigned *after* the function 
is created -- the same problem that the @ decorator syntax was designed 
to fix.

You can "fix" that last issue by moving the assignment inside the 
function, but this is even worse:

def f():
     try:
         f.i
     except AttributeError:
         f.i = 17
     print(f.i)
     f.i += 1


or with a decorator, as you suggested. But still, it's pretty crappy to 
have slow public attribute access for something which should be fast and 
private.
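
For concreteness, here is a rough sketch of the decorator variant and of 
two of the problems listed above (public exposure, and breakage on 
rename). The helper name with_attrs is made up for illustration; it is 
not something from the stdlib or from Masklinn's message.

```python
def with_attrs(**attrs):
    """Hypothetical helper: set attributes on the decorated function."""
    def decorate(func):
        for name, value in attrs.items():
            setattr(func, name, value)
        return func
    return decorate

@with_attrs(i=17)
def f():
    current = f.i      # global lookup of the name "f", then attribute lookup
    f.i += 1
    return current

first = f()            # the decorator set the initial value before any call
f.i = 100              # any caller can overwrite the supposedly private state
second = f()

g = f                  # bind the same function object to a second name
del f                  # ...and drop the original name
try:
    g()                # the body still looks up "f", which no longer exists
    renamed_ok = True
except NameError:
    renamed_ok = False
```

The decorator does solve the "assigned after the function is created" 
ordering problem, but the state is still public and the body still 
depends on the function's global name.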


I have gradually warmed to Nick's suggestion. I'm not completely sold on 
the "nonlocal var from expr" syntax. Paul Moore's criticism makes a lot 
of sense to me. At the risk of dooming the proposal, "static" seems to 
me to be a more sensible keyword than "nonlocal". But for the sake of 
the argument, I'll stick to nonlocal for now.

Some use-cases:

(1) Early-binding: you use some constant value in a function, and 
nowhere else, so you don't want to make it a global, but it's expensive 
to calculate so you only want to do it once:

# old way:
_var = some_expensive_calculation()
def spam():
     do_stuff_with(_var)

# or:
def spam(_var=some_expensive_calculation()):
     do_stuff_with(_var)



# proposed way:
def spam():
     nonlocal _var from some_expensive_calculation()
     do_stuff_with(_var)


This puts the calculation inside the function where it belongs and is a 
win for encapsulation, without the ugly "looks like an argument, quacks 
like an argument, swims like an argument, but please don't try treating 
it as an argument" hack.
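
For comparison, the closest thing available today without a global or a 
fake argument is probably a factory function. This is only a sketch: 
_make_spam and the body of some_expensive_calculation are stand-ins I've 
invented so the example runs.

```python
calls = []

def some_expensive_calculation():
    # Stand-in for the real work; records each call so we can count them.
    calls.append(1)
    return sum(range(1000))

def _make_spam():
    # Evaluated exactly once, when spam is created.
    _var = some_expensive_calculation()
    def spam():
        # _var is a closure variable: not a global, not an argument.
        return _var + 1
    return spam

spam = _make_spam()
del _make_spam         # the factory is no longer needed

results = [spam(), spam(), spam()]
```

It works, but it needs an extra function and an extra level of nesting 
just to hold one value, which is the boilerplate the proposal would 
remove.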


(2) Persistent non-global storage: you have some value which needs to 
persist between calls to the function, but shouldn't be exposed as a 
global. A neat example comes from Guido's essay on graphs:

def find_path(graph, start, end, path=[]):
     path = path + [start]
     if start == end:
         return path
     if not graph.has_key(start):
         return None
     for node in graph[start]:
         if node not in path:
             newpath = find_path(graph, node, end, path)
             if newpath: return newpath
     return None

http://www.python.org/doc/essays/graphs.html
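
The essay's code is written for Python 2. Since dict.has_key is gone in 
Python 3, a version that should run there might look like the following 
(my adaptation, not the essay's text; the sample graph is the one I 
recall from the essay):

```python
def find_path(graph, start, end, path=[]):
    # path + [start] builds a new list, so the shared default is never mutated
    path = path + [start]
    if start == end:
        return path
    if start not in graph:         # Python 3 spelling of has_key
        return None
    for node in graph[start]:
        if node not in path:
            newpath = find_path(graph, node, end, path)
            if newpath:
                return newpath
    return None

graph = {'A': ['B', 'C'],
         'B': ['C', 'D'],
         'C': ['D'],
         'D': ['C'],
         'E': ['F'],
         'F': ['C']}

found = find_path(graph, 'A', 'D')
missing = find_path(graph, 'A', 'E')   # no route from A to E
```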


I expect that could be re-written as:

def find_path(graph, start, end):
     nonlocal path from []
     path = path + [start]
     if start == end:
         return path
     if start not in graph:
         return None
     for node in graph[start]:
         if node not in path:
             newpath = find_path(graph, node, end)
             if newpath: return newpath
     return None


The downside of this would be that the caller can no longer seed the 
path argument with nodes. But for some applications, maybe that's a plus 
rather than a minus.


(3) Micro-optimizations. An example from the random module:


     def randrange(self, start, stop=None, step=1, int=int):
         """Choose a random item from ...

         Do not supply the 'int' argument.
         """

If we're not supposed to supply the int argument, why is it an argument?

Even uglier:


     def _randbelow(self, n, int=int, maxsize=1<<BPF, type=type,
                Method=_MethodType, BuiltinMethod=_BuiltinMethodType):


These would become:

     def randrange(self, start, stop=None, step=1):
         nonlocal int from int
         ...

etc. Much nicer and less confusing for the reader.
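
To show why the default-argument hack is fragile, here is a small sketch 
with made-up function names (to_int_default, _make_to_int, to_int), next 
to a closure-based alternative that works today:

```python
def to_int_default(x, int=int):
    # "Do not supply the 'int' argument" ... but nothing stops the caller.
    return int(x)

hijacked = to_int_default("7", int=float)   # caller swaps in float

def _make_to_int():
    _int = int                     # bound once, when to_int is created
    def to_int(x):
        # _int is a free variable: no global or builtin lookup at call time
        return _int(x)
    return to_int

to_int = _make_to_int()

try:
    to_int("7", int=float)         # the private binding is not a parameter
    overridable = True
except TypeError:
    overridable = False
```

The closure version keeps the binding out of the signature, at the cost 
of the same factory boilerplate as before.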


(4) Recursion in Python is implemented by name lookup, so if you rename 
a function, or if it is anonymous, you're out of luck. But:


def recurse(x):
     nonlocal recurse from recurse
     if x > 0:
         return recurse(x-1)+1
     return 1

func = recurse
del recurse


Note: this use-case implies that the initial binding can't happen until 
*after* the function exists, otherwise recurse won't exist.
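
The breakage is easy to demonstrate in current Python. Below, the first 
half shows the failure; the second half is one possible workaround using 
an enclosing scope (_make_recurse and func2 are names I've invented for 
the sketch):

```python
def recurse(x):
    if x > 0:
        return recurse(x - 1) + 1   # global lookup of the name "recurse"
    return 1

func = recurse
del recurse
try:
    func(3)                         # first recursive call cannot find "recurse"
    plain_ok = True
except NameError:
    plain_ok = False

def _make_recurse():
    def recurse(x):
        if x > 0:
            return recurse(x - 1) + 1   # resolved in the enclosing scope
        return 1
    return recurse

func2 = _make_recurse()             # the name "recurse" never becomes global
answer = func2(3)
```

Here the inner function refers to itself through the enclosing scope, so 
it does not matter what name the caller binds it to.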



-- 
Steven


