
On Sun, Dec 08, 2019 at 01:45:08PM +0000, Oscar Benjamin wrote:
On Sat, 7 Dec 2019 at 00:43, Steven D'Aprano <steve@pearwood.info> wrote:
[...]
But there's a major difference in behaviour depending on your input, and one which is surely going to lead to bugs from people who didn't realise that iterator arguments and iterable arguments will behave differently:

    # non-iterator iterable
    py> obj = [1, 2, 3, 4]
    py> [first(obj) for __ in range(5)]
    [1, 1, 1, 1, 1]

    # iterator
    py> obj = iter([1, 2, 3, 4])
    py> [first(obj) for __ in range(5)]
    [1, 2, 3, 4, None]

We could document the difference in behaviour, but it will still bite people and surprise them.
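
To make the difference reproducible, here is a minimal sketch. It assumes first() is implemented roughly as next(iter(iterable), default) with a default of None; that definition is an assumption for illustration, not something spelled out in this message.

```python
def first(iterable, default=None):
    # Hypothetical stand-in for the proposed function: take the first item
    # of whatever iter() returns, or the default if there is none.
    return next(iter(iterable), default)

# A list gives a fresh iterator on every call, so the answer never changes.
as_list = [1, 2, 3, 4]
from_list = [first(as_list) for _ in range(5)]

# An iterator returns itself from iter(), so every call consumes one item.
as_iter = iter([1, 2, 3, 4])
from_iter = [first(as_iter) for _ in range(5)]

print(from_list)
print(from_iter)
```

The only thing that differs between the two runs is whether iter(obj) returns a new iterator or obj itself.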
This kind of confusion can come with iterators and iterables all the time.
Do you have some concrete examples of where this is common, because I don't recall seeing anything like this ever.

Since next() doesn't accept a non-iterator, this confusion doesn't come up for next. I suppose it could come up with itertools islice:

    py> s = "abcdefghijklmn"
    py> [list(itertools.islice(s, 0, 1)) for __ in range(5)]
    [['a'], ['a'], ['a'], ['a'], ['a']]

but I've never seen that in real code, so I wouldn't say it happens all the time, or even a lot of the time.

YMMV I guess, but I like to think I have a reasonable grasp of the kinds of gotchas people often trip over, and this isn't one of them. The closest I can think of is the perennial gotcha that you can iterate over a sequence as often as you like, but an iterator only once.
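
For anyone who wants to check these three points, a short sketch follows. The variable names are mine, chosen only for illustration.

```python
from itertools import islice

seq = [1, 2, 3]

# next() rejects a non-iterator outright, so the confusion cannot arise there.
try:
    next(seq)
    next_accepts_list = True
except TypeError:
    next_accepts_list = False

# islice() accepts any iterable, so the result depends on what it is given.
s = "abcdefghijklmn"
from_string = [list(islice(s, 0, 1)) for _ in range(3)]

it = iter(s)
from_iterator = [list(islice(it, 0, 1)) for _ in range(3)]

# The perennial gotcha: a sequence can be iterated repeatedly,
# an iterator only once.
one_shot = iter(seq)
first_pass = list(one_shot)
second_pass = list(one_shot)
```

Here from_string repeats the same item each time, while from_iterator advances through the string, and second_pass comes back empty.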
I can see that the name "first" is potentially confusing. Another possible name is "take" which might make more sense in the context of partially consumed iterators.
There's already a take() in the itertools recipes. Rather than add a new "first" itertools function, I'd rather promote `take` out of the recipes and give it an optional default, something like this:

    from itertools import chain, islice, repeat

    def take(n, iterable, /, *default):
        if default:
            # At most one default is allowed; unpack it from the tuple.
            (default,) = default
            # Pad the iterable with the default so we always get n items.
            iterable = chain(iterable, repeat(default))
        return list(islice(iterable, n))

"Get the first item" with a default then becomes:

    a = take(1, iterable, default)[0]

"Get the first 5 items" becomes:

    a, b, c, d, e = take(5, iterable, default)

If you want to distinguish the case where the iterable is empty or shorter than you expect, you can pass a known sentinel and check for that, or pass no default at all and test the length of the resulting list.

--
Steven
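
A short sketch of the two ways to detect a short or empty iterable, using the take() outlined above. The sentinel name MISSING is hypothetical, and the positional-only `/` marker means this needs Python 3.8 or later.

```python
from itertools import chain, islice, repeat

def take(n, iterable, /, *default):
    if default:
        (default,) = default
        iterable = chain(iterable, repeat(default))
    return list(islice(iterable, n))

# Option 1: pass a known sentinel and look for it in the result.
MISSING = object()  # hypothetical sentinel, not part of any library
padded = take(5, [1, 2], MISSING)
was_short = MISSING in padded

# Option 2: pass no default and compare the length with what was requested.
unpadded = take(5, [1, 2])
got_all = len(unpadded) == 5

# "Get the first item" of an empty iterable falls back to the default.
first_item = take(1, [], None)[0]
```

Passing more than one default fails at the tuple unpacking with a ValueError, which seems a reasonable way to reject a bad call.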