[Python-ideas] Deterministic iterator cleanup
Steven D'Aprano
steve at pearwood.info
Fri Oct 21 03:12:19 EDT 2016
On Thu, Oct 20, 2016 at 11:03:11PM -0700, Nathaniel Smith wrote:
> The motivation here is that prompt (non-GC-dependent) cleanup is a
> good thing for a variety of reasons: determinism, portability across
> Python implementations, proper exception propagation, etc. async does
> add yet another entry to this list, but I don't think the basic
> principle is controversial.
Perhaps it should be.
The very first thing you say is "determinism". Hmmm. As we (or at least,
some of us) move towards more async code, more threads or multi-
processing, even another attempt to remove the GIL from CPython which
will allow people to use threads with less cost, how much should we
really value determinism? That's not a rhetorical question -- I don't
know the answer.
Portability across Pythons... if all Pythons performed exactly the same,
why would we need multiple implementations? The way I see it,
non-deterministic cleanup is the cost you pay for a non-reference
counting implementation, for those who care about the garbage collection
implementation. (And yes, ref counting is garbage collection.)
[...]
> 'with' blocks are a whole chunk of extra syntax that
> were added to the language just for this use case. In fact 'with'
> blocks weren't even needed for the functionality -- we already had
> 'try/finally', they just weren't ergonomic enough. This use case is so
> important that it's had multiple rounds of syntax directed at it
> before async/await was even a glimmer in C#'s eye :-).
>
> BUT, currently, 'with' and 'try/finally' have a gap: if you use them
> inside a generator (async or not, doesn't matter), then they often
> fail at accomplishing their core purpose. Sure, they'll execute their
> cleanup code whenever the generator is cleaned up, but there's no
> ergonomic way to clean up the generator. Oops.
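[For readers skimming the archive: a minimal sketch, not code from the
thread, of the gap being described. The generator's try/finally only
runs when the generator itself is closed, not when the consumer simply
stops iterating.]

```python
cleaned_up = []

def gen():
    try:
        yield 1
        yield 2
    finally:
        cleaned_up.append(True)   # the generator's cleanup code

g = gen()
next(g)        # generator is now suspended at the first yield
# At this point the 'finally' block has NOT run, even if the caller
# never touches g again.
g.close()      # only an explicit close() (or, eventually, GC) runs it
```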
How often is this *actually* a problem in practice?
On my system, I can open 1000+ files as a regular user. I can't even
comprehend opening a tenth of that as an ordinary application, although
I can imagine that if I were writing a server application things would
be different. But then I don't expect to write server applications in
quite the same way as I do quick scripts or regular user applications.
So it seems to me that a leaked file handle or two normally shouldn't
be a problem in practice. They'll be freed when the script or
application closes, and in the meantime, you have hundreds more
available. 90% of the time, using `with file` does exactly what we want,
and the times it doesn't (because we're writing a generator that isn't
closed promptly) 90% of those times it doesn't matter. So (it seems to
me) that you're talking about changing the behaviour of for-loops to
suit only a small proportion of cases: maybe 10% of 10%.
It is not uncommon to pass an iterator (such as a generator) through a
series of filters, each processing only part of the iterator:
it = generator()
header = collect_header(it)
body = collect_body(it)
tail = collect_tail(it)
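[A concrete sketch of that idiom, with the collect_* helpers filled in
as illustrative stand-ins rather than code from the thread. Each stage
consumes only part of the shared iterator and relies on the partial
for-loop leaving it open for the next stage:]

```python
def generator():
    # illustrative stream: one header item, two body items, one tail
    yield from ["HEADER", "body 1", "body 2", "TAIL"]

def collect_header(it):
    return next(it)            # consumes only the first item

def collect_body(it):
    body = []
    for item in it:            # a partial for-loop over the shared iterator
        body.append(item)
        if len(body) == 2:
            break              # stop early; 'it' must survive for collect_tail
    return body

def collect_tail(it):
    return next(it)            # consumes what remains

it = generator()
header = collect_header(it)
body = collect_body(it)
tail = collect_tail(it)
```

If for-loops closed their iterator on exit, the break in collect_body
would kill `it` before collect_tail ever saw it.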
Is it worth disrupting this standard idiom? I don't think so.
--
Steve