Paul Moore wrote:
2009/6/18 Terry Reedy firstname.lastname@example.org:
Problem: Some functions iterate through the input more than once. If the input is not an iterator, there is no problem: call iter(input) again. If the input is an iterator, that just returns the same, now-exhausted iterator. Bad.
So: The problem is to detect which case you have. In the bad case, call list(input) and iterate over that twice.
Which leaves two problems:
1. Detecting the 2 cases.
2. The cost of keeping all of the generated values in a list.
For 1, you actually don't need to detect the case of a container, just call list() anyway. Unless the input is so huge that having a second copy is a substantial cost, who cares?
For 2, you have 2 choices:
- Document for callers that you'll be duplicating the input. Then your API isn't usable by people with infinite (or practically infinite) streams - but how much of a problem is this anyway?
- Design a different API - maybe take explicit callable/arguments parameters. Then people passing a list have to do my_api(iter, L), but you're designing for huge iterators as input, so presumably you don't care about their convenience so much.
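A minimal sketch of the second choice, assuming a made-up two-pass function named mean_and_spread (not from this thread). It takes a callable plus its arguments and calls it once per pass, so nothing is ever copied:

```python
def mean_and_spread(make_iter, *args):
    """Hypothetical two-pass function (illustration only).

    make_iter(*args) must return a fresh iterable on every call.
    """
    total = 0
    count = 0
    for x in make_iter(*args):  # first pass: compute the mean
        total += x
        count += 1
    mean = total / count
    # second pass: largest distance from the mean
    spread = max(abs(x - mean) for x in make_iter(*args))
    return mean, spread


# A caller holding a list has to spell out iter explicitly.
L = [1, 2, 3, 4]
print(mean_and_spread(iter, L))       # (2.5, 1.5)

# A caller with an iterator constructor passes it directly.
print(mean_and_spread(range, 1, 5))   # (2.5, 1.5)
```

The cost is the one already mentioned: the list case gets clumsier so that the stream case never holds a second copy.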
Actually, I think you *can* detect that you were passed an iterator - if "iter(it) is iter(it)" returns True, then that's when you need to copy. So you could document your API as copying the input if that condition is true. Then, if some user wants to use your function on a huge stream, they know what they are doing, and they can wrap their iterator in a class like your gfic (you could even describe this technique in your documentation).
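The test can be sketched roughly as follows. The names is_one_shot and two_pass_sum are invented for illustration; the only thing taken from the text above is the "iter(it) is iter(it)" check, which relies on iterators returning themselves from iter():

```python
def is_one_shot(obj):
    # An iterator returns itself from iter(), so both calls yield the
    # same object. A container builds a new iterator on each call.
    # Calling iter() does not consume anything.
    return iter(obj) is iter(obj)


def two_pass_sum(data):
    # Hypothetical function that needs to walk its input twice.
    if is_one_shot(data):
        data = list(data)  # copy only when the input cannot be re-iterated
    first = sum(data)
    second = sum(data)
    return first, second


print(is_one_shot([1, 2, 3]))                    # False
print(is_one_shot(iter([1, 2, 3])))              # True
print(two_pass_sum(x * x for x in range(4)))     # (14, 14)
```

An object whose __iter__ returns a fresh iterator each time passes through uncopied, which is what makes the wrapper-class escape hatch work.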
So all of this is entirely manageable within your documentation. You could put a sample up as cookbook code, as well, I guess. And
You are looking at the issue from the callee's standpoint. And you have made some good suggestions from that viewpoint.
I am looking at the issue from the caller's viewpoint. I have an iterator constructor and I want to use an existing function that requires a re-iterable.
honestly, I don't believe that the problem is common enough to warrant anything more - certainly not standard library support.
The issue is more frequent in Py3, which largely shifts from lists to iterators as the common sequence interchange format. Py3 is what I use and the target of my suggestion; I should have said that. Map and filter, for instance, are now iterator classes, which is to say, iterator constructors, rather than list-returning functions. A call such as somefunc(map(f,l)), which works fine in 2.x, will not work in 3.x if somefunc requires a re-iterable.
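A small Py3 illustration of the failure, with somefunc as a stand-in for any function that walks its argument twice (its body here is an assumption, chosen only to make the second pass visible):

```python
def somefunc(data):
    # Stand-in two-pass function: sums its input twice.
    return sum(data), sum(data)


l = [-1, 2, -3]
f = abs

print(somefunc(l))                # (-2, -2): a list can be re-iterated
print(somefunc(map(f, l)))        # (6, 0): the map object is empty on pass two
print(somefunc(list(map(f, l))))  # (6, 6): copying restores 2.x behavior
```

Note that the failure is silent: the second pass simply sees no items rather than raising an error.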
But I agree that it seems to still be too rare for stdlib. I will, however, put the class in the code part of my book-in-progress as 'reiterable'. I am sure that the discussion here will help improve the text discussion of this issue and this class.
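For readers who have not seen the class, here is one way such a wrapper might look. This is a guess at the general shape, not the class posted earlier in the thread: it stores an iterator constructor with its arguments and calls it afresh on every iter().

```python
class Reiterable:
    """Sketch of a re-iterable wrapper (assumed design, for illustration).

    Stores a callable and its arguments; each iter() call invokes the
    callable again, so every pass starts from the beginning.
    """

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __iter__(self):
        # iter() makes this work even if func returns a plain container.
        return iter(self.func(*self.args, **self.kwds))


squares = Reiterable(map, lambda x: x * x, [1, 2, 3])
print(list(squares))  # [1, 4, 9]
print(list(squares))  # [1, 4, 9] again: a fresh map object per pass
```

The values are recomputed on each pass instead of being stored, so this trades time for memory and only makes sense when the constructor is repeatable.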
Terry Jan Reedy