[Tutor] Why define a function inside a function?

Fri Jan 29 22:31:09 EST 2016

On Fri, Jan 29, 2016 at 04:29:14PM -0500, Ryan Smith wrote:
> Hi all,
> 
> I am new to programming and python and had a question. I hope I
> articulate this well enough so here it goes... I was following along
> on a thread on this mailing list discussing how to test file I/O
> operations. In one of the suggested solutions the following code was
> used:
> 
> 
> def _open_as_stringio(self, filename):
>         filelike = StringIO(self.filecontents[filename])
>         real_close = filelike.close
>         def wrapped_close():
>             self.filecontents[filename] = filelike.getvalue()
>             real_close()
>         filelike.close = wrapped_close
>         return filelike

This function (actually a method) creates and returns a new object which 
behaves like an actual file. When that fake file's "close" method is 
called, it has to run some code which accesses the instance which 
created it (self). How does this fake file know which instance created 
it? It has to "remember" the value of self, somehow.

There are two usual ways for a function to be given access to a variable 
(in this case, self):

- pass it as a parameter to the function;
- access it as a global variable.

The first can't work, because fake_file.close() takes no arguments.

The second won't work safely, unless you can guarantee that there is 
only ever one fake file at a time. If there are two, each needs its own 
value of self, but both would read it from the same global, and bad 
things will happen. ("Global Variables Considered Harmful.")
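
Here is a rough sketch of how the global approach goes wrong. The names 
(owner, open_fake, close_fake) are made up for illustration and are not 
from the original thread:

```python
owner = None  # one shared global, standing in for "self"

def open_fake(name):
    """Pretend to open a fake file belonging to `name`."""
    global owner
    owner = name

def close_fake():
    """Report which owner this close thinks it belongs to."""
    return owner

open_fake("first")
open_fake("second")      # overwrites the global
result = close_fake()    # we meant to close the *first* file...
print(result)
# => prints second
```

The second "open" silently overwrote the only copy of the value, so the 
first file has lost track of who created it.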

But there's a third way, and that's called "a closure". And this is 
exactly what happens here. Let's look at the inner function more 
carefully:

def wrapped_close():
    self.filecontents[filename] = filelike.getvalue()
    real_close()

"wrapped_close" takes no arguments, but it references four variables 
inside the body:

- self
- filename
- filelike
- real_close

Each of those variables is defined in the surrounding function. So when 
Python creates the inner function, it records a reference to those outer 
variables inside the inner function. These are called "closures", and we 
say "the inner function closes over these four variables". (I don't know 
why the word "close" is used.) So the word "closure" can be used either 
for the inner function itself, or the internal hidden links between it 
and the variables.
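
If you are curious, CPython lets you peek at those hidden links. This is 
only a sketch with made-up names (outer, inner, a, b); the __closure__ 
and __code__.co_freevars attributes are real, but you rarely need them 
in everyday code:

```python
def outer():
    a = 1
    b = 2
    def inner():
        # a and b are not arguments or globals; they come from outer().
        return a + b
    return inner

fn = outer()

# Names of the outer variables that inner closes over.
names = sorted(fn.__code__.co_freevars)
print(names)
# => prints ['a', 'b']

# Each closed-over variable is stored in a "cell" object.
values = sorted(cell.cell_contents for cell in fn.__closure__)
print(values)
# => prints [1, 2]
```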

In other words:

The inner function "wrapped_close" remembers those four variables, from 
the particular call to the outer method that created it.

So by the time you eventually get hold of the fake file, and call its 
close method, that close method will remember which instance created the 
fake file, etc, and use those variables as needed.
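
To see this in action, here is a minimal runnable sketch. The class name 
FakeFileSystem and its contents are my own invention (the class from the 
other thread isn't shown here), and I'm assuming Python 3's io.StringIO:

```python
from io import StringIO

class FakeFileSystem:
    """Hypothetical stand-in for the class in the original thread."""
    def __init__(self):
        self.filecontents = {"a.txt": "hello"}

    def _open_as_stringio(self, filename):
        filelike = StringIO(self.filecontents[filename])
        real_close = filelike.close
        def wrapped_close():
            # Save the final contents back to the instance that created us.
            self.filecontents[filename] = filelike.getvalue()
            real_close()
        filelike.close = wrapped_close
        return filelike

fs1 = FakeFileSystem()
fs2 = FakeFileSystem()

f = fs1._open_as_stringio("a.txt")
f.seek(0, 2)        # move to the end of the fake file
f.write(" world")
f.close()           # calls wrapped_close, which remembers fs1

print(fs1.filecontents["a.txt"])
# => prints hello world

print(fs2.filecontents["a.txt"])
# => prints hello
```

Only fs1 was updated, because the close method of f closed over the self 
that was fs1, not fs2.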

Complicated? Well, a bit. Here's a simpler demonstration of a closure:

def factory(num):
    """Factory function that returns a new function that adds 
    num to whatever it is given."""
    def adder(x):
        return x + num
    return adder

add1 = factory(1)
add2 = factory(2)
add17_25 = factory(17.25)

print(add1(100))
# => prints 101

print(add2(50))
# => prints 52

print(add17_25(5000))
# => prints 5017.25

Each of the "add" functions remembers its own "num", the one passed to 
factory *at the time of creation*. So as far as add1 is concerned, 
num=1, even though add2 considers that num=2.

(Warning: be careful with mutable values like lists, since they can 
change value if you modify them. If this makes no sense to you, please 
ask, and I'll demonstrate what I mean.)
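
A small sketch of that warning, using made-up names (make_reporter, 
shopping). The closure holds on to the list itself, not a snapshot of 
it, so later changes to the list show through:

```python
def make_reporter(items):
    """Return a function that reports how many items there are."""
    def report():
        return len(items)
    return report

shopping = ["milk"]
count = make_reporter(shopping)

before = count()
print(before)
# => prints 1

shopping.append("eggs")   # modify the same list after the closure was made

after = count()
print(after)
# => prints 2
```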

> In trying to understand the logic behind the code for my own
> edification, I was wondering what is the reasoning of defining a
> function inside another function (for lack of a better phrase)?

There are two common reasons for defining inner functions.

The first is a boring reason: you want to hide that function from 
everything except the one function that uses it. So you put it inside 
that function:

def example(n):
    results = []
    def calc(x):
        # Pretend that calc is actually more complex.
        return x*100
    for i in range(1, n+1):
        results.append(calc(i))
    return results

example(3)
# => returns [100, 200, 300]

That's a fairly boring reason, and people rarely do that. Normally, you 
would just make calc an external function, perhaps named "_calc" so 
people know it is a private function that shouldn't be used.

The second reason is to make the inner function a closure, as shown 
above.

-- 
Steve