Re: [Python-ideas] Aid reiteration with new class: gfic

20 Jun 2009

2009/6/18 Terry Reedy:
...
Problem:
Some functions iterate through the input more than once. If the input is
not an iterator, there is no problem: call iter(input) again. If the input
is an iterator, that just returns the now-empty iterator.  Bad.
So: The problem is to detect which case you have. In the bad case,
call list(input) and iterate over that twice.

Which leaves two problems:
1. Detecting the 2 cases.
2. The cost of keeping all of the generated values in a list.

For 1, you actually don't need to detect the case of a container, just
call list() anyway. Unless the input is so huge that having a second
copy is a substantial cost, who cares?
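
As a minimal sketch of the "just call list() anyway" approach (the
function name and what it computes are made up for illustration, not
taken from the proposal):

```python
def mean_and_spread(values):
    """Hypothetical two-pass function that copies its input up front."""
    data = list(values)  # one copy; works for containers and iterators alike
    mean = sum(data) / len(data)               # first pass
    spread = max(abs(x - mean) for x in data)  # second pass
    return mean, spread
```

Callers can pass a list, a generator or any other iterable without the
function needing to know which one it got.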

For 2, you have 2 choices:

1. Document for callers that you'll be duplicating the input. Then
your API isn't usable by people with infinite (or practically
infinite) streams - but how much of a problem is this anyway?

2. Design a different API - maybe take explicit callable/arguments
parameters. Then people passing a list have to do my_api(iter, L), but
you're designing for huge iterators as input, so presumably you don't
care about their convenience so much.
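
A rough sketch of what option 2 might look like, assuming the API takes
a callable plus its arguments and calls it once per pass (my_api and
its averaging body are only placeholders):

```python
def my_api(make_iter, *args):
    """Hypothetical two-pass API: build a fresh iterator for each pass."""
    total = sum(make_iter(*args))             # first pass
    count = sum(1 for _ in make_iter(*args))  # second pass, new iterator
    return total / count


L = [1, 2, 3, 4]
print(my_api(iter, L))      # list caller: pass iter and the list
print(my_api(range, 1, 5))  # lazy caller: no values are stored
```

Nothing is ever copied here, which is the point of the design, at the
price of the slightly awkward my_api(iter, L) spelling for containers.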

Actually, I think you *can* detect that you were passed an iterator -
if "iter(it) is iter(it)" returns True, then that's when you need to
copy[1]. So you could document your API as copying the input if that
condition is true. Then, if some user wants to use your function on a
huge stream, they know what they are doing, and they can wrap their
iterator in a class like your gfic (you could even document this
technique in your documentation).
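
A sketch of how that might look, treating the identity test as a
heuristic rather than a guarantee. The names reiterable, Restartable
and countdown are invented for this example; Restartable is only a
stand-in for a gfic-style wrapper, not the actual class from the
proposal:

```python
def reiterable(it):
    """Return an object that is safe to iterate over more than once.

    Copies only when iter(it) is iter(it), i.e. when `it` looks like
    a one-shot iterator.
    """
    if iter(it) is iter(it):
        return list(it)
    return it


class Restartable:
    """Hypothetical wrapper: calls func(*args) afresh on every iter()."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __iter__(self):
        return self.func(*self.args)


def countdown(n):
    # Example generator function used to feed the wrapper.
    while n > 0:
        yield n
        n -= 1
```

With this, reiterable([1, 2, 3]) hands back the same list,
reiterable(iter([1, 2, 3])) makes a copy, and
reiterable(Restartable(countdown, 3)) returns the wrapper untouched
because each iter() call on it produces a new generator.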

So all of this is entirely manageable within your documentation. You
could put a sample up as cookbook code, as well, I guess. And
honestly, I don't believe that the problem is common enough to warrant
anything more - certainly not standard library support.

Paul.

[1] There may be some edge cases I'm ignoring here.

Paul Moore