[Python-ideas] Deterministic iterator cleanup
Terry Reedy
tjreedy at udel.edu
Wed Oct 19 22:07:18 EDT 2016
On 10/19/2016 12:38 AM, Nathaniel Smith wrote:
> I'd like to propose that Python's iterator protocol be enhanced to add
> a first-class notion of completion / cleanup.
With respect to the standard iterator protocol, a very solid -1 from
me. (I leave commenting specifically on __aiterclose__ to Yury.)
1. I consider the introduction of iterables and the new iterator
protocol in 2.2 and their gradual replacement of lists in many
situations to be the greatest enhancement to Python since 1.3 (my first
version). They are, to me, one of Python's greatest features, and
the minimal nature of the protocol is an essential part of what makes them
great.
2. I think you greatly underestimate the negative impact, just as we did
with changing 'str is bytes' to 'str is unicode'. The change itself,
embodied in for loops, will break most non-trivial programs. You
yourself note that there will have to be pervasive changes in the stdlib
just to begin fixing the breakage.
3. Though perhaps common for what you do, the need for the change is
extremely rare in the overall Python world. Iterators depending on an
external resource are rare (< 1%, I would think). Incomplete iteration
is also rare (also < 1%, I think). And resources do not always need to
be released immediately.
4. Previous proposals to officially augment the iterator protocol, even
with optional methods, have been rejected, and I think this one should
be too.
a. Add .__len__ as an option. We added __length_hint__, which an
iterator may implement, but which is not part of the iterator protocol.
It is also ignored by bool().
b., c. Add __bool__ and/or peek(). I posted a LookAhead wrapper class
that implements both for most any iterable. I suspect that it is
rarely used.
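For illustration, a look-ahead wrapper of this kind might look like the
sketch below. This is not the class from the original post; the names
LookAhead, peek, and _EMPTY and the one-item buffering strategy are
assumptions made for the example.

```python
class LookAhead:
    """Hypothetical wrapper: adds bool() and peek() to any iterable
    by buffering a single item ahead of the consumer."""

    _EMPTY = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._next = self._EMPTY
        self._fill()  # note: eagerly consumes one item from the source

    def _fill(self):
        # Pull one item ahead; the sentinel stays in place when exhausted.
        self._next = next(self._it, self._EMPTY)

    def __iter__(self):
        return self

    def __next__(self):
        if self._next is self._EMPTY:
            raise StopIteration
        item = self._next
        self._fill()
        return item

    def __bool__(self):
        # True while at least one more item is available.
        return self._next is not self._EMPTY

    def peek(self, default=None):
        # Return the next item without consuming it.
        return default if self._next is self._EMPTY else self._next
```

Because the wrapper lives outside the protocol, plain iterators and
'for' loops are unaffected; only code that wants bool() or peek() pays
for the buffering.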
> def read_newline_separated_json(path):
>     with open(path) as file_handle:      # <-- with block
>         for line in file_handle:
>             yield json.loads(line)
One problem with passing paths around is that it makes the receiving
function hard to test. I think functions should at least optionally
take an iterable of lines, and make the open part optional. But then
closing should also be conditional.
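One way to make the open step optional is sketched below, assuming that
a str argument is a path and anything else is an iterable of lines. The
'source' parameter and the 'opened_here' flag are hypothetical names, not
part of the quoted proposal.

```python
import json


def read_newline_separated_json(source):
    """Yield one decoded JSON document per line.

    Assumption for this sketch: a str is treated as a path to open;
    any other object is treated as an iterable of lines.
    """
    if isinstance(source, str):
        lines = open(source)
        opened_here = True
    else:
        lines = source
        opened_here = False
    try:
        for line in lines:
            yield json.loads(line)
    finally:
        # Close only what this function opened; the caller owns the rest.
        if opened_here:
            lines.close()
```

A test can then pass a plain list of strings and never touch the file
system. The finally clause still runs only when the generator is
exhausted, closed, or collected, so this helps testability rather than
prompt cleanup.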
If the combination of 'with', 'for', and 'yield' does not work together,
then do something else, rather than changing the meaning of 'for'.
Moving responsibility for closing the file from 'with' to 'for' makes
'with' pretty useless, while overloading 'for' with something that is
rarely needed. This does not strike me as the right solution to the
problem.
> for document in read_newline_separated_json(path):  # <-- outer for loop
>     ...
If the outer loop determines when the file should be closed, then why
not open it there? What fails with
    lines = open(path)
    try:
        gen = read_newline_separated_json(lines)
        for doc in gen:
            do_something(doc)
    finally:
        lines.close()
        # and/or gen.throw(...) to stop the generator.
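The same idea can be written with context managers, as a rough sketch.
It assumes read_newline_separated_json takes an iterable of lines rather
than a path, uses contextlib.closing to call gen.close() on exit, and
lets io.StringIO stand in for an open file so the example runs without
touching disk.

```python
import contextlib
import io
import json


def read_newline_separated_json(lines):
    # Variant that takes an iterable of lines instead of a path.
    for line in lines:
        yield json.loads(line)


# io.StringIO is a stand-in for open(path) in this sketch.
lines = io.StringIO('{"a": 1}\n{"b": 2}\n')
seen = []

# On exit, closing() finalizes the generator first, then the
# file-like object is closed, even if the loop stops early.
with lines, contextlib.closing(read_newline_separated_json(lines)) as gen:
    for doc in gen:
        seen.append(doc)
        break  # incomplete iteration
```

Here the outer code that decides when iteration stops also owns both
resources, and 'for' keeps its current meaning.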
--
Terry Jan Reedy