[Python-ideas] Deterministic iterator cleanup

Steven D'Aprano steve at pearwood.info
Fri Oct 21 03:12:19 EDT 2016


On Thu, Oct 20, 2016 at 11:03:11PM -0700, Nathaniel Smith wrote:

> The motivation here is that prompt (non-GC-dependent) cleanup is a
> good thing for a variety of reasons: determinism, portability across
> Python implementations, proper exception propagation, etc. async does
> add yet another entry to this list, but I don't think the basic principle is
> controversial.

Perhaps it should be.

The very first thing you say is "determinism". Hmmm. As we (or at least, 
some of us) move towards more async code, more threads or 
multi-processing, even another attempt to remove the GIL from CPython which 
will allow people to use threads with less cost, how much should we 
really value determinism? That's not a rhetorical question -- I don't 
know the answer.

Portability across Pythons... if all Pythons performed exactly the same, 
why would we need multiple implementations? The way I see it, 
non-deterministic cleanup is the cost you pay for a non-reference 
counting implementation, for those who care about the garbage collection 
implementation. (And yes, ref counting is garbage collection.)


[...]
> 'with' blocks are a whole chunk of extra syntax that
> were added to the language just for this use case. In fact 'with'
> blocks weren't even needed for the functionality -- we already had
> 'try/finally', they just weren't ergonomic enough. This use case is so
> important that it's had multiple rounds of syntax directed at it
> before async/await was even a glimmer in C#'s eye :-).
> 
> BUT, currently, 'with' and 'try/finally' have a gap: if you use them
> inside a generator (async or not, doesn't matter), then they often
> fail at accomplishing their core purpose. Sure, they'll execute their
> cleanup code whenever the generator is cleaned up, but there's no
> ergonomic way to clean up the generator. Oops.
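The gap being described can be illustrated with a short sketch (the generator and its cleanup list here are hypothetical, standing in for a real resource such as an open file):

```python
# Sketch: a try/finally inside a generator does not run its cleanup when
# iteration is abandoned early; it only runs once the generator itself is
# closed (or eventually garbage-collected).
cleaned_up = []

def read_lines():
    try:
        yield "header"
        yield "body"
    finally:
        # Stands in for closing a file or releasing a lock.
        cleaned_up.append(True)

gen = read_lines()
next(gen)                  # consume only the first item, then abandon it
assert cleaned_up == []    # the finally block has NOT run yet

gen.close()                # explicit, non-ergonomic cleanup
assert cleaned_up == [True]  # only now does the finally block execute
```

The same applies to a `with` block inside the generator: its `__exit__` fires only when the generator is resumed or closed, not when the caller stops iterating.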

How often is this *actually* a problem in practice?

On my system, I can open 1000+ files as a regular user. I can't even 
comprehend opening a tenth of that as an ordinary application, although 
I can imagine that if I were writing a server application things would 
be different. But then I don't expect to write server applications in 
quite the same way as I do quick scripts or regular user applications.

So it seems to me that a leaked file handle or two normally shouldn't 
be a problem in practice. They'll be freed when the script or 
application closes, and in the meantime, you have hundreds more 
available. 90% of the time, using `with file` does exactly what we want, 
and the times it doesn't (because we're writing a generator that isn't 
closed promptly), 90% of those times it doesn't matter. So it seems to 
me that you're talking about changing the behaviour of for-loops to 
suit only a small proportion of cases: maybe 10% of 10%.

It is not uncommon to pass an iterator (such as a generator) through a 
series of filters, each processing only part of the iterator:

it = generator()
header = collect_header(it)
body = collect_body(it)
tail = collect_tail(it)

Is it worth disrupting this standard idiom? I don't think so.
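For concreteness, a minimal runnable version of that idiom might look like this (the collector function and sample data are hypothetical stand-ins for whatever parsing the real code does):

```python
# Sketch of the shared-iterator idiom: several consumers each advance
# the SAME iterator part-way, relying on for-loops leaving it open.
def generator():
    yield from ["H1", "H2", "", "line 1", "line 2", "", "T1"]

def collect_until_blank(it):
    # Consume items up to (and including) the next blank line,
    # deliberately leaving the rest of the iterator for later consumers.
    out = []
    for line in it:
        if line == "":
            break
        out.append(line)
    return out

it = generator()
header = collect_until_blank(it)  # ["H1", "H2"]
body = collect_until_blank(it)    # ["line 1", "line 2"]
tail = collect_until_blank(it)    # ["T1"]
```

If a for-loop closed its iterator on exit, the first collector's `break` would close `it` and the later collectors would see an empty iterator, which is the disruption being objected to here.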



-- 
Steve
