[Python-ideas] Aid reiteration with new class: gfic
p.f.moore at gmail.com
Fri Jun 19 22:37:59 CEST 2009
2009/6/18 Terry Reedy <tjreedy at udel.edu>:
> Some functions iterate thru the input more than once. If the input is not an
> iterator, there is no problem: call iter(input) again. If the input is an
> iterator, that just returns the now-empty iterator. Bad.
So: The problem is to detect which case you have. In the bad case,
call list(input) and iterate over that twice.
Which leaves two problems:
1. Detecting the 2 cases.
2. The cost of keeping all of the generated values in a list.
For 1, you actually don't need to detect the case of a container, just
call list() anyway. Unless the input is so huge that having a second
copy is a substantial cost, who cares?
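
To make the "just call list() anyway" option concrete, here is a minimal sketch. The function name and what it computes are made up for illustration; the only point is the unconditional copy on the first line of the body.

```python
def mean_and_spread(values):
    """Hypothetical two-pass function: needs to walk the input twice."""
    # Copy unconditionally, so iterators and containers are treated alike.
    data = list(values)
    if not data:
        raise ValueError("empty input")
    # First pass over the copy.
    mean = sum(data) / len(data)
    # Second pass over the same copy.
    spread = max(abs(x - mean) for x in data)
    return mean, spread
```

Both mean_and_spread([1, 2, 3]) and mean_and_spread(iter([1, 2, 3])) behave the same way here, at the price of one extra copy of the input.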
For 2, you have 2 choices:
  1. Document for callers that you'll be duplicating the input. Then
     your API isn't usable by people with infinite (or practically
     infinite) streams - but how much of a problem is this anyway?
  2. Design a different API - maybe take explicit callable/arguments
     parameters. Then people passing a list have to do my_api(iter, L), but
     you're designing for huge iterators as input, so presumably you don't
     care about their convenience so much.
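
A rough sketch of what the second choice might look like, assuming the function needs two passes. The name my_api is the one used above; its body (counting the items above the mean) is invented purely as an example of something that has to iterate twice.

```python
def my_api(make_iter, *args):
    """Hypothetical two-pass API: the caller passes a callable plus its
    arguments, and a fresh iterator is created for each pass."""
    # First pass: compute the mean without storing the values.
    total = 0
    count = 0
    for x in make_iter(*args):
        total += x
        count += 1
    if count == 0:
        raise ValueError("empty input")
    mean = total / count
    # Second pass: call the factory again to get a new iterator.
    return sum(1 for x in make_iter(*args) if x > mean)
```

With this shape, a list L is passed as my_api(iter, L), and a generator function gen taking an argument n is passed as my_api(gen, n). Nothing is copied, but the callable must produce the same sequence each time it is called.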
Actually, I think you *can* detect that you were passed an iterator:
an iterator returns itself from iter(), while a container returns a
fresh iterator on every call. So if "iter(it) is iter(it)" returns True,
that's when you need to copy. You could document your API as copying
the input if that
condition is true. Then, if some user wants to use your function on a
huge stream, they know what they are doing, and they can wrap their
iterator in a class like your gfic (you could even document this
technique in your documentation).
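
As a sketch of that check (the helper names is_one_shot and normalize are hypothetical, not from any library), something along these lines should work for ordinary iterators and containers:

```python
def is_one_shot(obj):
    """True if iter() keeps handing back the same object, which is what
    iterators do. Calling iter() does not consume any items."""
    return iter(obj) is iter(obj)


def normalize(values):
    """Hypothetical two-pass function: copies the input only when it
    looks like an iterator."""
    if is_one_shot(values):
        values = list(values)
    total = sum(values)                  # first pass
    return [x / total for x in values]   # second pass
```

A list passes through without being copied, while a generator or iter(L) is copied once. Objects with an unusual __iter__ may not fit this test, so treat it as a heuristic to document rather than a guarantee.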
So all of this is entirely manageable within your documentation. You
could put a sample up as cookbook code, as well, I guess. And
honestly, I don't believe that the problem is common enough to warrant
anything more - certainly not standard library support.
There may be some edge cases I'm ignoring here.