[Python-ideas] + operator on generators
Steven D'Aprano
steve at pearwood.info
Sat Jul 1 02:13:41 EDT 2017
On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote:
> But implementation of the OP's proposal does not need to be based on
> __add__ at all. It could be based on extending the current behaviour of
> the `+` operator itself.
>
> Now this behavior is (roughly): try left side's __add__, if failed try
> right side's __radd__, if failed raise TypeError.
>
> New behavior could be (again: roughly): try left side's __add__, if
> failed try right side's __radd__, if failed try __iter__ of both sides
> and chain them (creating a new iterator¹), if failed raise TypeError.
That's what I suggested earlier, except using & instead of + as the
operator.
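To make the proposed fallback concrete, here is a rough sketch that emulates it as an ordinary function instead of changing the operator itself. The name add_or_chain is hypothetical, and catching TypeError from operator.add only approximates "both __add__ and __radd__ failed". Swapping operator.add for operator.and_ would give the & variant.

```python
import operator
from itertools import chain


def add_or_chain(x, y):
    """Hypothetical helper emulating the proposed fallback for `x + y`."""
    try:
        # Current behaviour: left side's __add__, then right side's __radd__.
        return operator.add(x, y)
    except TypeError:
        # Proposed extra step: chain the two iterables instead.
        # iter() itself raises TypeError if either side is not iterable.
        return chain(iter(x), iter(y))


gen = (n for n in range(2))
chained = add_or_chain(gen, "ab")                 # generator + str: falls back to chain
concatenated = add_or_chain([0, 1], ["a", "b"])   # list + list: ordinary concatenation
```

Note that the two calls return different kinds of object: the first an iterator, the second a list.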
The reason I suggested & instead of + is that there will be fewer
clashes between iterables that already support the operator and hence
fewer surprises.
Using + will be a bug magnet. Consider:
    it = x + y  # chain two iterables
    first = next(it, "default")
That code looks pretty safe, but it's actually a minefield waiting to
blow you up. It works fine if you pass (let's say) a generator object
and a string, or a list and an iterator, but if x and y happen to both
be strings, or both lists, or both tuples, the + operator will
concatenate them instead of chaining them, and the call to next will
raise TypeError, because the result is a sequence and not an iterator.
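The failing half of this can be reproduced today, since list + list already concatenates. A minimal sketch follows; the function name first_of_sum is made up for illustration, and the mixed generator-plus-string case cannot be run because it depends on the proposed behaviour.

```python
def first_of_sum(x, y):
    it = x + y  # intended as "chain two iterables"
    return next(it, "default")


try:
    first_of_sum([1, 2], [3, 4])
    error = None
except TypeError as exc:
    # x + y produced a list, and next() only accepts iterators.
    error = exc
```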
So you would have to write:
    it = iter(x + y)  # chain two iterables, and ensure the result is an iterator
to be sure. Which is not a huge burden, but it does take away the
benefit of having an operator. In that case, you might as well do:
    it = chain(x, y)
and be done with it.
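For comparison, a short sketch showing that itertools.chain always hands back an iterator, whatever mix of iterables it is given. The helper name first_of is only for illustration.

```python
from itertools import chain


def first_of(x, y):
    it = chain(x, y)  # always an iterator, regardless of the argument types
    return next(it, "default")


gen = (c for c in "ab")
results = [
    first_of("ab", "cd"),   # two strings
    first_of([1], [2]),     # two lists
    first_of(gen, "xy"),    # generator and string
    first_of([], ()),       # both empty, so the default is returned
]
```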
It's true that exactly the same potential problem occurs with & but it's
less likely. Strings, tuples, lists and other sequences don't typically
support __(r)and__ and the & operator, so you're less likely to be
burned. Still, the possibility is there. Maybe we should use a different
operator. ++ is out because it already has a meaning (x ++ y parses as
x + (+y)), so that leaves either && or inventing some arbitrary symbol.
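A quick check of which built-in types already define &, as a sketch of where the clash would and would not occur. The particular selection of types is just an illustration.

```python
candidates = (str, list, tuple, set, frozenset, int)

# True where the type already implements the & operator itself.
supports_and = {t.__name__: hasattr(t, "__and__") for t in candidates}

# Sets are iterable and already give & a meaning: intersection, not chaining.
intersection = {1, 2, 3} & {2, 3, 4}
```

So the common sequences are safe, but two sets passed to a chain-by-& expression would be intersected instead.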
But the more I think about it the more I agree with Nick. Let's start
by moving itertools.chain into built-ins, with zip and map, and only
consider giving it an operator after we've had a few years of experience
with chain as a built-in. We might even find that an operator doesn't
add any real value.
> ¹ Preferably using the existing `yield from` mechanism -- because, in
> case of generators, it would provide a way to combine ("concatenate")
> *generators*, preserving semantics of all that their __next__(), send(),
> throw() nice stuff...
I don't think that would be generally useful. If you're sending values
into an arbitrary generator, who knows what you're getting? chain()
operates on arbitrary iterables, so you can't expect to send values into
chain([1, 2, 3], my_generator(), "xyz") and have anything sensible
occur.
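A small sketch of what happens in practice. Here my_generator and chain_with_yield_from are hypothetical stand-ins: the latter is a guess at what a `yield from` based chain might look like, not anything from itertools.

```python
from itertools import chain


def my_generator():
    # Hypothetical generator that reacts to values sent into it.
    received = yield "ready"
    yield f"got {received}"


def chain_with_yield_from(*iterables):
    # Delegates to each iterable in turn, forwarding send() and throw().
    for iterable in iterables:
        yield from iterable


# itertools.chain objects have no send() method at all.
has_send = hasattr(chain([1, 2, 3], my_generator(), "xyz"), "send")

combined = chain_with_yield_from([1, 2, 3], my_generator(), "xyz")
first = next(combined)  # currently delegating to the list's iterator
try:
    combined.send("hello")
    send_error = None
except AttributeError as exc:
    # The list iterator being delegated to has no send() method.
    send_error = exc
```

Whether send() works depends entirely on which sub-iterable happens to be active at that moment, which the caller generally cannot know.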
--
Steve