[Python-ideas] Aid reiteration with new class: gfic

Terry Reedy tjreedy at udel.edu
Sat Jun 20 21:54:59 CEST 2009


Paul Moore wrote:
> 2009/6/18 Terry Reedy <tjreedy at udel.edu>:
>> Problem:
>> Some functions iterate through the input more than once. If the input is not an
>> iterator, there is no problem: call iter(input) again. If the input is an
>> iterator, that just returns the now-empty iterator.  Bad.
> 
> So: The problem is to detect which case you have. In the bad case,
> call list(input) and iterate over that twice.

> Which leaves two problems:
> 1. Detecting the 2 cases.
> 2. The cost of keeping all of the generated values in a list.
> 
> For 1, you actually don't need to detect the case of a container, just
> call list() anyway. Unless the input is so huge that having a second
> copy is a substantial cost, who cares?
> 
> For 2, you have 2 choices:
> 
> 1. Document for callers that you'll be duplicating the input. Then
> your API isn't usable by people with infinite (or practically
> infinite) streams - but how much of a problem is this anyway?
> 
> 2. Design a different API - maybe take explicit callable/arguments
> parameters. Then people passing a list have to do my_api(iter, L), but
> you're designing for huge iterators as input, so presumably you don't
> care about their convenience so much.
> 
> Actually, I think you *can* detect that you were passed an iterator -
> if "iter(it) is iter(it)" returns True, then that's when you need to
> copy[1]. So you could document your API as copying the input if that
> condition is true. Then, if some user wants to use your function on a
> huge stream, they know what they are doing, and they can wrap their
> iterator in a class like your gfic (you could even document this
> technique in your documentation).
> 
> So all of this is entirely manageable within your documentation. You
> could put a sample up as cookbook code, as well, I guess. And

You are looking at the issue from the callee's standpoint.
And you have made some good suggestions from that viewpoint.
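
The callee-side check quoted above can be sketched as follows. This is
only an illustration: is_one_shot and max_deviation are hypothetical
names, and the copy-on-demand policy is one possible reading of the
suggestion, not code from either poster.

```python
def is_one_shot(obj):
    """Return True if obj is its own iterator, so a second pass would see nothing."""
    # iter() on an iterator returns the iterator itself; iter() on a
    # container returns a fresh iterator object on every call.
    return iter(obj) is iter(obj)


def max_deviation(data):
    """Hypothetical two-pass function: largest distance of any value from the mean."""
    if is_one_shot(data):
        # An iterator can only be walked once, so keep a copy of its values.
        data = list(data)
    count = 0
    total = 0
    for x in data:  # first pass: compute the mean
        total += x
        count += 1
    mean = total / count
    return max(abs(x - mean) for x in data)  # second pass
```

The check itself consumes nothing, but the copy costs memory in
proportion to the length of the input, which is the trade-off raised in
the quoted text.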

I am looking at the issue from the caller's viewpoint.
I have an iterator constructor and I want to use an existing function 
that requires a re-iterable.
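
From the caller's side, one possible shape for such a wrapper is
sketched below. The name Reiterable and its signature are assumptions
made for illustration; this is not the gfic class from the original
proposal, only a minimal version of the same idea: store the iterator
constructor and its arguments, and call it again on each iter().

```python
class Reiterable:
    """Wrap an iterator constructor so that each iter() call starts a fresh pass."""

    def __init__(self, func, *args, **kwargs):
        # func is any callable that returns an iterable when called,
        # for example map, filter, or a generator function.
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __iter__(self):
        # Re-run the constructor instead of reusing an exhausted iterator.
        return iter(self.func(*self.args, **self.kwargs))


upper = Reiterable(map, str.upper, ["a", "b"])
first_pass = list(upper)
second_pass = list(upper)  # same values again, not an empty list
```

The sketch only helps when the stored arguments are themselves
re-iterable. If one of them is an iterator, the second pass is empty
again.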

> honestly, I don't believe that the problem is common enough to warrant
> anything more - certainly not standard library support.

The issue is more frequent in Py3, which largely shifts from lists to
iterators as the common sequence interchange format. Py3 is what I use
and is the target of my suggestion; I should have said so. Map and
filter, for instance, are now iterator classes, which is to say,
iterator constructors, rather than list-returning functions. A call such as
   somefunc(map(f,l))
which works fine in 2.x will not work in 3.x if somefunc requires a
re-iterable: the second pass gets an already-exhausted iterator.
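
A small Py3 illustration of that failure mode follows. The function
count_and_total is a hypothetical stand-in for somefunc, chosen only
because it makes two passes over its argument.

```python
def count_and_total(items):
    """Hypothetical two-pass function standing in for somefunc."""
    count = sum(1 for _ in items)  # first pass
    total = sum(items)             # second pass
    return count, total


numbers = [1, 2, 3]

from_list = count_and_total(numbers)                  # a list can be re-iterated
from_map = count_and_total(map(abs, numbers))         # a map object is exhausted after one pass
from_copy = count_and_total(list(map(abs, numbers)))  # copying first restores the 2.x behaviour
```

Here from_list and from_copy are both (3, 6), while from_map is (3, 0).
No exception is raised; the second pass simply sees no items.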

But I agree that it still seems too rare for the stdlib. I will,
however, put the class in the code part of my book-in-progress as
'reiterable'. I am sure that the discussion here will help improve the
book's treatment of this issue and this class.

Terry Jan Reedy



