[Tutor] Why define a function inside a function?
Steven D'Aprano
steve at pearwood.info
Fri Jan 29 22:31:09 EST 2016
On Fri, Jan 29, 2016 at 04:29:14PM -0500, Ryan Smith wrote:
> Hi all,
>
> I am new to programming and python and had a question. I hope I
> articulate this well enough so here it goes... I was following along
> on a thread on this mailing list discussing how to test file I/O
> operations. In one of the suggested solutions the following code was
> used:
>
>
> def _open_as_stringio(self, filename):
>     filelike = StringIO(self.filecontents[filename])
>     real_close = filelike.close
>     def wrapped_close():
>         self.filecontents[filename] = filelike.getvalue()
>         real_close()
>     filelike.close = wrapped_close
>     return filelike
This function (actually a method) creates and returns a new object which
behaves like an actual file. When that fake file's "close" method is
called, it has to run some code which accesses the instance which
created it (self). How does this fake file know which instance created
it? It has to "remember" the value of self, somehow.
There are two usual ways for a function to be given access to a variable
(in this case, self):
- pass it as a parameter to the function;
- access it as a global variable.
The first can't work, because fake_file.close() takes no arguments.
The second won't work safely, unless you can guarantee that there is
only ever one fake file at a time. If there are two, they will both try
to access different values of self in the same global, and bad things
will happen. ("Global Variables Considered Harmful.")
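To see why the global approach breaks down, here is a minimal sketch. The names current_owner, open_fake and close_fake are made up for illustration and are not from the original thread. Two "files" share one global slot, so the second silently overwrites the first:

```python
current_owner = None  # one module-level slot shared by every fake file

def open_fake(owner):
    # Record who opened the file in the shared global.
    global current_owner
    current_owner = owner

def close_fake():
    # Takes no arguments, so it can only look at the global.
    return "saved to " + current_owner

open_fake("first")
open_fake("second")   # overwrites the value stored for "first"
print(close_fake())
# => prints saved to second, even if we meant to close the first file
```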
But there's a third way, and that's called "a closure". And this is
exactly what happens here. Let's look at the inner function more
carefully:
    def wrapped_close():
        self.filecontents[filename] = filelike.getvalue()
        real_close()
"wrapped_close" takes no arguments, but it references four variables
inside the body:
- self
- filename
- filelike
- real_close
Each of those variables is defined in the surrounding function. So when
Python creates the inner function, it records a reference to those outer
variables inside the inner function. The result is called a "closure", and we
say "the inner function closes over these four variables". (I don't know
why the word "close" is used.) The word "closure" can be used either
for the inner function itself, or for the internal hidden links between it
and the variables.
In other words:
The inner function "wrapped_close" remembers those four variables from
the call that created it, and can still use them long after the outer
function has returned.
So by the time you eventually get hold of the fake file, and call its
close method, that close method will remember which instance created the
fake file, etc, and use those variables as needed.
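Here is a rough, runnable sketch of that method in action. The class name FakeFiles, its constructor and the file name "notes.txt" are my own assumptions for the sake of the demonstration; only _open_as_stringio comes from the quoted code. It assumes Python 3, where StringIO lives in the io module:

```python
from io import StringIO

class FakeFiles:
    """Hypothetical holder of in-memory 'files' (not from the original thread)."""
    def __init__(self, filecontents):
        self.filecontents = dict(filecontents)

    def _open_as_stringio(self, filename):
        filelike = StringIO(self.filecontents[filename])
        real_close = filelike.close
        def wrapped_close():
            # self, filename, filelike and real_close all come from the closure.
            self.filecontents[filename] = filelike.getvalue()
            real_close()
        filelike.close = wrapped_close
        return filelike

a = FakeFiles({"notes.txt": "hello"})
b = FakeFiles({"notes.txt": "other"})

f = a._open_as_stringio("notes.txt")
f.seek(0, 2)        # jump to the end of the existing text
f.write(" world")
f.close()           # calls wrapped_close, which remembers instance a

print(a.filecontents["notes.txt"])
# => prints hello world
print(b.filecontents["notes.txt"])
# => prints other
```

Notice that f.close() is called with no arguments, yet it writes back to the right instance and the right file name.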
Complicated? Well, a bit. Here's a simpler demonstration of a closure:
    def factory(num):
        """Factory function that returns a new function that adds
        num to whatever it is given."""
        def adder(x):
            return x + num
        return adder  # hand the inner function back to the caller

    add1 = factory(1)
    add2 = factory(2)
    add17_25 = factory(17.25)
    print(add1(100))
    # => prints 101
    print(add2(50))
    # => prints 52
    print(add17_25(5000))
    # => prints 5017.25
Each of the "add" functions remembers the value that "num" had, *at the
time of creation*. So as far as add1 is concerned, num=1, even though
add2 considers that num=2.
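If you are curious, you can peek at what a function has closed over. A small sketch, repeating the factory function so that it runs on its own; __closure__ and __code__.co_freevars are standard attributes of Python function objects, though you will rarely need them in everyday code:

```python
def factory(num):
    def adder(x):
        return x + num
    return adder

add1 = factory(1)
add2 = factory(2)

# Names of the outer variables that adder refers to.
print(add1.__code__.co_freevars)
# => prints ('num',)

# Each function carries its own "cell" holding its own num.
print(add1.__closure__[0].cell_contents)
# => prints 1
print(add2.__closure__[0].cell_contents)
# => prints 2
```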
(Warning: be careful with mutable values like lists. The closure keeps a
reference to the list itself, not a copy, so the inner function will see
any later modifications. If this makes no sense to you, please
ask, and I'll demonstrate what I mean.)
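A minimal sketch of that caveat, using made-up names (make_total and data are just for illustration). The inner function closes over a list, and changing the list afterwards changes what the inner function computes:

```python
def make_total(values):
    def total():
        # values refers to the caller's list, not a snapshot of it.
        return sum(values)
    return total

data = [1, 2, 3]
total = make_total(data)
print(total())
# => prints 6

data.append(10)   # modify the same list after the closure was created
print(total())
# => prints 16
```

If you want the inner function to be immune to later changes, one option is to copy the list inside the outer function first, for example with values = list(values).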
> In trying to understand the logic behind the code for my own
> edification, I was wondering what is the reasoning of defining a
> function in side another function (for lack of a better phrase)?
There are two common reasons for defining inner functions.
The first is fairly mundane: you want to hide that function from
everyone else, so that only one specific function can use it. So you
put it inside that function:
    def example(n):
        results = []
        def calc(x):
            # Pretend that calc is actually more complex.
            return x*100
        for i in range(1, n+1):
            results.append(calc(i))
        return results

    example(3)
    # => returns [100, 200, 300]
People rarely do that in practice. Normally, you would just make calc
an external function, perhaps named "_calc" so people know it is a
private function that shouldn't be used.
The second reason is to make the inner function a closure, as shown
above.
--
Steve