I am currently working on implementing an algorithm with the following properties:
- It is an algorithm, not a data structure; that is, you run it, it returns an answer, and it doesn't leave any persistent state afterwards.
- It is sufficiently complex that I prefer to break it into several different functions or methods.
- These functions or methods need to share various state variables.
If I implement it as a collection of separate functions, then there's a lot of unnecessary code complexity involved in passing the state variables from one function to the next, returning the changes to the variables, etc. Also, it doesn't present a modular interface to the rest of the project -- code outside this algorithm is not prevented from calling the internal subroutines of the algorithm.
If I implement it as a collection of methods of an object, I then have to include a separate function which creates an instance of the object and immediately destroys it. This seems clumsy and also doesn't fit with my intuition about what objects are for (representing persistent structure). Also, again, modularity is violated -- outside code should not be making instances of this object or accessing its methods.
What I would like to do is to make an outer function, which sets up the state variables, defines inner functions, and then calls those functions. Currently, this sort of works: most of the state variables consist of mutable objects, so I can mutate them without rebinding them. But some of the state is immutable (in this case, an int) so I need to somehow encapsulate it in mutable objects, which is again clumsy. Write access to outer scopes would let me avoid this encapsulation problem.
I know the problem, I've dealt with this many times. Personally I would much rather define a class than a bunch of nested functions. I'd have a separate master function that creates the instance, calls the main computation, and then extracts and returns the result. Yes, the class may be accessible at the toplevel in the module. I don't care: I just add a comment explaining that it's not part of the API, or give it a name starting with "_".
My problem with the nested functions is that it is much harder to get a grasp of what the shared state is -- any local variable in the outer function *could* be part of the shared state, and the only way to tell for sure is by inspecting all the subfunctions. With the class, there's a strong convention that all state is initialized in __init__(), so __init__() is self-documenting.
--Guido van Rossum (home page: http://www.python.org/%7Eguido/)