[Python-ideas] Deterministic iterator cleanup
Steven D'Aprano
steve at pearwood.info
Fri Oct 21 06:29:01 EDT 2016
On Wed, Oct 19, 2016 at 05:52:34PM -0400, Yury Selivanov wrote:
> IOW I'm not convinced that if we implement your proposal we'll fix 90%
> (or even 30%) of cases where non-deterministic and postponed cleanup is
> harmful.
Just because something doesn't solve ALL problems doesn't mean it isn't
worth doing. Reference counting doesn't solve the problem of cycles, but
Python worked really well for many years even though cycles weren't
automatically broken. Then a second GC was added, but it didn't solve
the problem of cycles with __del__ finalizers. And recently (a year or
two ago) there was an improvement that made the GC better able to deal
with such cases -- but I expect that there are still edge cases where
objects aren't collected.
Had people said "garbage collection doesn't solve all the edge cases,
therefore it's not worth doing", where would we be?
I don't know how big a problem the current lack of deterministic GC
of resources opened in generators actually is. I guess that users of
CPython will have *no idea*, because most of the time the ref counter
will clean up quite early. But not all Pythons are CPython, and despite
my earlier post, I've changed my mind and now support this
proposal.
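To make the problem concrete, here is a minimal sketch (the generator
name `numbers` and the `log` list are made up for illustration). Breaking
out of a for-loop leaves the generator suspended, so its `finally` block
has not run yet. CPython's ref counter would typically run it as soon as
the last reference goes away; other implementations may run it much
later. An explicit close() is deterministic everywhere:

```python
log = []

def numbers():
    # Stand-in for a generator that holds a resource (file, socket, ...).
    try:
        yield 1
        yield 2
        yield 3
    finally:
        log.append("cleanup")

gen = numbers()
for n in gen:
    if n == 2:
        break

# Breaking out of the loop did not run the finally block:
pending = list(log)

# Explicit close raises GeneratorExit inside the generator,
# which runs the finally block immediately on any implementation.
gen.close()
```

After the loop `pending` is still empty; only after `gen.close()` does
`log` contain the cleanup entry.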
One reason for this is that I thought hard about my own code where I use
the double-for-loop idiom:
    for x in iterator:
        if cond: break
        ...
    # later
    for y in iterator:  # same iterator
        ...
and I realised:
(1) I don't do this *that* often;
(2) when I do, it really wouldn't be that big a problem for me to
guard against auto-closing:
    for x in protect(iterator):
        if cond: break
        ...
(3) if I need to write hybrid code that runs over multiple versions,
that's easy too:
    try:
        from itertools import protect
    except ImportError:
        def protect(it):
            return it
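As a rough check that the shim is harmless on current Pythons, here is
the double-for-loop idiom run end to end. `itertools.protect` is only the
hypothetical name used in this post, so on any released Python the import
fails and the identity fallback is used; the range and the break
condition are made up for the example:

```python
try:
    from itertools import protect  # hypothetical; not in any released Python
except ImportError:
    def protect(it):
        # Nothing auto-closes iterators here, so there is nothing to guard.
        return it

iterator = iter(range(10))

head = []
for x in protect(iterator):
    if x == 3:
        break
    head.append(x)

# later: resume the same iterator where the first loop stopped
tail = []
for y in iterator:
    tail.append(y)
```

The first loop consumes 0 through 3 (the 3 is consumed by the iteration
that breaks), and the second loop picks up from 4.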
> Yes, mainly iterator wrappers. You'll also will need to educate users
> to refactor (more on that below) their __del__ methods to
> __(a)iterclose__ in 3.6.
Couldn't __(a)iterclose__ automatically call __del__ if it exists? Seems
like a reasonable thing to inherit from object.
> A lot of code that you find on stackoverflow etc will be broken.
"A lot"? Or a little? Are you guessing, or did you actually count it?
If we are worried about code like this:
    it = iter([1, 2, 3])
    a = list(it)
    # currently b will be [], with this proposal it will raise RuntimeError
    b = list(it)
we can soften the proposal's recommendation that iterators raise
RuntimeError on calling next() when they are closed. I've suggested that
"whatever exception makes sense" should be the rule. Iterators with no
resources to close can simply raise StopIteration instead. That will
preserve the current behaviour.
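A sketch of what that could look like. The first part is simply today's
behaviour. The `Countdown` class is made up, and `__iterclose__` is the
hook name from the proposal; no released Python calls it, so it is
invoked by hand here:

```python
# Today's behaviour: an exhausted list iterator keeps raising StopIteration.
it = iter([1, 2, 3])
a = list(it)
b = list(it)

class Countdown:
    """Hypothetical iterator with no resources to release."""

    def __init__(self, n):
        self.n = n
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        # A closed iterator behaves exactly like an exhausted one.
        if self.closed or self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

    def __iterclose__(self):
        # Assumed hook from the proposal; nothing to release, just mark closed.
        self.closed = True

c = Countdown(3)
first = next(c)
c.__iterclose__()   # what the proposed for-loop would do on exit
rest = list(c)      # no RuntimeError, just an empty result
```

With this rule, code that re-iterates a closed, resource-free iterator
sees an empty result, the same as it does now with an exhausted one.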
> Porting
> code from Python2/<3.6 will be challenging. People are still struggling
> to understand 'dict.keys()'-like views in Python 3.
I spend a lot of time on the tutor and python-list mailing lists, and a
little bit of time on Reddit /python, and I don't think I've ever seen
anyone struggle with those. I'm sure it happens, but I don't think it
happens often. After all, for the most common use-case, there's no real
difference between Python 2 and 3:
    for key, value in mydict.items():
        ...
[...]
> With you proposal, to achieve the same (and make the code compatible
> with new for-loop semantics), users will have to implement both
> __iterclose__ and __del__.
As I ask above, couldn't we just inherit a default __(a)iterclose__ from
object that looks like this?
    def __iterclose__(self):
        finalizer = getattr(type(self), '__del__', None)
        if finalizer:
            finalizer(self)
I know it looks a bit funny for non-iterables to have an iterclose
method, but it'll never actually be called on them.
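Here is a runnable sketch of that default, under some loud assumptions:
`Base` stands in for object (the real object has no `__iterclose__`),
`Lines` is a made-up iterator, and `closing_for` is a hand-written
emulation of the proposed for-loop, not anything Python provides:

```python
class Base:
    """Stand-in for object, carrying the suggested default."""

    def __iterclose__(self):
        finalizer = getattr(type(self), '__del__', None)
        if finalizer:
            finalizer(self)

class Lines(Base):
    """Hypothetical iterator that only defines __del__ for cleanup."""

    def __init__(self, data):
        self._it = iter(data)
        self.cleanups = 0

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __del__(self):
        # Counts calls so the example can show when cleanup happened.
        self.cleanups += 1

def closing_for(iterable, body):
    """Emulate the proposed loop: always call __iterclose__ on exit.

    body(item) returns True to break out of the loop.
    """
    it = iter(iterable)
    try:
        for item in it:
            if body(item):
                break
    finally:
        it.__iterclose__()

seen = []
lines = Lines(["a", "b", "c"])
closing_for(lines, lambda s: seen.append(s) or s == "b")
count_after_loop = lines.cleanups
```

One thing the sketch makes visible: `__del__` runs once via
`__iterclose__` and would run again when the object is collected, so a
finalizer reused this way would need to be safe to call twice.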
[...]
> The __(a)iterclose__ semantics is clear. What's not clear is how much
> harm changing the semantics of for-loops will do (and how to quantify
> the amount of good :))
The "easy" way to find out (easy for those who aren't volunteering to do
the work) is to fork Python, make the change, and see what breaks. I
suspect not much, and most of the breakage will be easy to fix.
As for the amount of good, this proposal originally came from PyPy. I
expect that CPython users won't appreciate it as much as PyPy users, and
Jython/IronPython users when they eventually support Python 3.x.
--
Steve