
Hello! I often use generators, and itertools.chain on them. What about providing something like the following:

    a = (n for n in range(2))
    b = (n for n in range(2, 4))
    tuple(a + b)  # -> (0, 1, 2, 3)

From the user's point of view, this is just how the __add__ operator works on lists and tuples. Making generators work the same way could be a great way to avoid calls to itertools.chain everywhere, and to limit the differences between generators and other "linear" collections. I do not know exactly how to implement that (I'm not that good at C, nor at the CPython source itself), but from looking at the sources, I imagine I could do something like the list_concat function at Objects/listobject.c:473, but in the Objects/genobject.c file, where instead of copying elements I would create and initialize a new chainobject as described at Modules/itertoolsmodule.c:1792. (In pure Python, the implementation would be something like `def __add__(self, other): return itertools.chain(self, other)`.) Best regards, --lucas
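For comparison, here is today's spelling of that example with itertools.chain -- exactly what the proposed `+` would be sugar for:

    from itertools import chain

    a = (n for n in range(2))
    b = (n for n in range(2, 4))
    print(tuple(chain(a, b)))  # (0, 1, 2, 3)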

25.06.17 15:06, lucas via Python-ideas wrote:
It would be weird if the addition is only supported for instances of the generator class, but not for other iterators. Why should (n for n in range(2)) + (n for n in range(2, 4)) work, but iter(range(2)) + iter(range(2, 4)) and iter([0, 1]) + iter((2, 3)) not? itertools.chain() supports arbitrary iterators. Therefore you will need to implement the __add__ method for *all* iterators in the world. However, itertools.chain() accepts not just *iterators*; it works with *iterables*. Therefore you will need to implement the __add__ method for all iterables in the world as well. But __add__ is already implemented for list and tuple, and many other sequences, and your definition conflicts with this.

I would like to add that, for example, numpy ndarrays are iterables, but they have an __add__ with completely different semantics, namely element-wise (numerical) addition. So this proposal would conflict with existing libraries with iterable objects. Stephan

On 25 Jun. 2017 2:51 p.m., "Serhiy Storchaka" <storchaka@gmail.com> wrote:

Personally, I find syntactic sugar for concatenating iterators would come in handy. The purpose of iterators and generators is performance and efficiency. So, lowering the bar of using them is a good idea IMO. Also, hopping back and forth between a generator/iterator-based solution and a, say, list-based/materialized solution would become a lot easier. On 25.06.2017 16:04, Stephan Houben wrote:
I don't see a conflict.
I don't think it's necessary to start with *all* iterators in the world. So, adding iterators and/or generators should be possible without any problems. It's a start and could already help a lot, if I remember my use-cases correctly.
As above, I don't see a conflict. Regards, Sven

2017-06-25 Serhiy Storchaka <storchaka@gmail.com> wrote:

> 25.06.17 15:06, lucas via Python-ideas wrote:
> > I often use generators, and itertools.chain on them.
> > What about providing something like the following:
> >
> > a = (n for n in range(2))
> > b = (n for n in range(2, 4))
> > tuple(a + b) # -> 0 1 2 3
[...]
> It would be weird if the addition is only supported for instances of
> the generator class, but not for other iterators. Why (n for n in
> range(2)) + (n for n in range(2, 4)) works, but iter(range(2)) +
> iter(range(2, 4)) and iter([0, 1]) + iter((2, 3)) don't?
> itertools.chain() supports arbitrary iterators. Therefore you will
> need to implement the __add__ method for *all* iterators in the world.
>
> However itertools.chain() accepts not just *iterators*.
[...]

But implementation of the OP's proposal does not need to be based on __add__ at all. It could be based on extending the current behaviour of the `+` operator itself.

Now this behaviour is (roughly): try the left side's __add__; if that failed, try the right side's __radd__; if that failed, raise TypeError.

New behaviour could be (again: roughly): try the left side's __add__; if that failed, try the right side's __radd__; if that failed, try __iter__ of both sides and chain them (creating a new iterator¹); if that failed, raise TypeError.

And similarly for `+=`: try __iadd__..., try __add__..., try __iter__..., raise TypeError.

Cheers. *j

¹ Preferably using the existing `yield from` mechanism -- because, in the case of generators, it would provide a way to combine ("concatenate") *generators*, preserving the semantics of all their __next__(), send(), throw() nice stuff...
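A minimal sketch of what the footnote's chaining step could look like (a hypothetical helper, not a language change; `yield from` delegates __next__(), send() and throw() to whichever sub-generator is currently active):

    def concat_gen(g1, g2):
        # Unlike itertools.chain(), values passed via send() and
        # exceptions raised via throw() reach the underlying generators.
        yield from g1
        yield from g2

With the OP's a and b, tuple(concat_gen(a, b)) gives (0, 1, 2, 3).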

On Fri, Jun 30, 2017 at 1:09 AM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote:
> 2017-06-25 Serhiy Storchaka <storchaka@gmail.com> wrote:
>
>> 25.06.17 15:06, lucas via Python-ideas wrote:
>> > I often use generators, and itertools.chain on them.
>> > What about providing something like the following:
>> >
>> > a = (n for n in range(2))
>> > b = (n for n in range(2, 4))
>> > tuple(a + b) # -> 0 1 2 3
> [...]
>> It would be weird if the addition is only supported for instances of
>> the generator class, but not for other iterators. Why (n for n in
>> range(2)) + (n for n in range(2, 4)) works, but iter(range(2)) +
>> iter(range(2, 4)) and iter([0, 1]) + iter((2, 3)) don't?
>> itertools.chain() supports arbitrary iterators. Therefore you will
>> need to implement the __add__ method for *all* iterators in the world.
>>
>> However itertools.chain() accepts not just *iterators*.
> [...]
>
> But implementation of the OP's proposal does not need to be based on
> __add__ at all. It could be based on extending the current behaviour of
> the `+` operator itself.
>
> Now this behavior is (roughly): try left side's __add__, if failed try
> right side's __radd__, if failed raise TypeError.
>
> New behavior could be (again: roughly): try left side's __add__, if
> failed try right side's __radd__, if failed try __iter__ of both sides
> and chain them (creating a new iterator), if failed raise TypeError.
>
> And similarly, for `+=`: try __iadd__..., try __add__..., try
> __iter__..., raise TypeError.

I actually really like this proposal, in addition to the original proposal of using '+' to chain generators--I don't think it necessarily needs to be extended to *all* iterables. But this proposal goes one better. I just have to wonder what kind of strange unexpected bugs would result. For example, now you could add a list to a string:

    >>> list(['a', 'b', 'c'] + 'def')
    ['a', 'b', 'c', 'd', 'e', 'f']

Personally, I really like this and find it natural. But it will break anything expecting this to be a TypeError.

On Jun 30, 2017 2:23 PM, "Erik Bray" <erik.m.bray@gmail.com> wrote:

> I actually really like this proposal, in addition to the original
> proposal of using '+' to chain generators--I don't think it necessarily
> needs to be extended to *all* iterables. But this proposal goes one
> better. I just have to wonder what kind of strange unexpected bugs
> would result. For example now you could add a list to a string:
>
> >>> list(['a', 'b', 'c'] + 'def')
> ['a', 'b', 'c', 'd', 'e', 'f']
>
> Personally, I really like this and find it natural. But it will break
> anything expecting this to be a TypeError.

Note that you can already do:

    [*iterable1, *iterable2]

Or like in your example:
    >>> [*['a', 'b', 'c'], *'def']
    ['a', 'b', 'c', 'd', 'e', 'f']
At least I think you can do that in 3.6 ;) -- Koos (mobile)

On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote:
That's what I suggested earlier, except using & instead of + as the operator. The reason I suggested & instead of + is that there will be fewer clashes between iterables that already support the operator, and hence fewer surprises. Using + will be a bug magnet. Consider:

    it = x + y  # chain two iterables
    first = next(it, "default")

That code looks pretty safe, but it's actually a minefield waiting to blow you up. It works fine if you pass (let's say) a generator object and a string, or a list and an iterator, but if x and y happen to both be strings, or both lists, or both tuples, the + operator will concatenate them instead of chaining them, and the call to next will blow up. So you would have to write:

    it = iter(x + y)  # chain two iterables, and ensure the result is an iterator

to be sure. Which is not a huge burden, but it does take away the benefit of having an operator. In that case, you might as well do:

    it = chain(x, y)

and be done with it. It's true that exactly the same potential problem occurs with &, but it's less likely. Strings, tuples, lists and other sequences don't typically support __(r)and__ and the & operator, so you're less likely to be burned. Still, the possibility is there. Maybe we should use a different operator. ++ is out because that already has meaning, so that leaves either && or inventing some arbitrary symbol. But the more I think about it, the more I agree with Nick. Let's start by moving itertools.chain into built-ins, with zip and map, and only consider giving it an operator after we've had a few years of experience with chain as a built-in. We might even find that an operator doesn't add any real value.
I don't think that would be generally useful. If you're sending values into an arbitrary generator, who knows what you're getting? chain() will operate on arbitrary iterables, you can't expect to send values into chain([1, 2, 3], my_generator(), "xyz") and have anything sensible occur. -- Steve

On Saturday, July 1, 2017, Steven D'Aprano <steve@pearwood.info> wrote:
- Would that include chain.from_iterable?
- So there's then a new conditional import (e.g. in a compat package)? What does this add?
Flatten one level?
- is my_generator() mutable (e.g. before or during iteration)?
- https://docs.python.org/2/reference/expressions.html#generator.send

On Sat, Jul 01, 2017 at 01:35:29AM -0500, Wes Turner wrote:
Yes.
- So there's then a new conditional import (e.g. in a compat package)? What does this add?
    try: chain
    except NameError: from itertools import chain

Two lines, if and only if you both need chain and want to support versions of Python older than 3.7. There's no need to import it if you aren't going to use it.
Flattening typically applies to lists and sequences. I'm not saying that chain shouldn't support generators. That would be silly: a generator is an iterable and chaining supports iterables. I'm saying that it wouldn't be helpful to require chain objects to support send(), throw() etc.
It doesn't matter. Sending into a chain of arbitrary iterators isn't a useful thing to do.
- https://docs.python.org/2/reference/expressions.html#generator.send
Why are you linking to the 2 version of the docs? We're discussing a hypothetical new feature which must go into 3, not 2. -- Steve

On Sat, Jul 1, 2017 at 6:11 PM, Steven D'Aprano <steve@pearwood.info> wrote:
It'd be even simpler. If you want to support <3.7 and 3.7+, you write:

    from itertools import chain

At least, I presume it isn't going to be *removed* from itertools. Promotion to builtin shouldn't break pre-existing code, so the way to be compatible with pre-promotion Pythons is simply to code for those and not take advantage of the new builtin. ChrisA

On Saturday, July 1, 2017, Steven D'Aprano <steve@pearwood.info> wrote:
Or, can I just continue to import the same function from the same place:

    from itertools import chain

Nice, simple, easy. There's even (for all you functional lovers):

    from itertools import *

And, again, this works today:

    from fn import Stream
    itr = Stream() << my_generator() << (8, 9, 0)

- https://github.com/kachayev/fn.py/blob/master/README.rst#streams-and-infinit...
- https://github.com/kachayev/fn.py/blob/master/fn/stream.py
- AFAIU, + doesn't work because e.g. numpy already defines + and & for Iterable arrays.
So the argspec is/should be Iterables with __iter__ (but not necessarily __len__)?
So, with a generator function, I get a traceback at the current yield statement. With chain() I get whatever line the chain call is on.
In your opinion, has the send() functionality changed at all?

On 1 July 2017 at 07:13, Steven D'Aprano <steve@pearwood.info> wrote:
I'm struck here by the contrast between this and the "let's slim down the stdlib" debates we've had in the past. How difficult is it really to add "from itertools import chain" at the start of a file? It's not even as if itertools is a 3rd party dependency. Paul

On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas <python-ideas@python.org> wrote:
I think a convenient syntax for chaining iterables and sequences would be very useful in Python 3, because there has been a shift from using lists by default to using views of dict keys and values, range objects, etc. Having to add an import for a basic operation that used to just work with the + operator feels like a regression to many. It's not really clear if you will be able to implement this, but if you can find a syntax that gets accepted, I think using the same type as itertools.chain might be a good starting point, although the docs should not promise to return that exact type, so that support for __getitem__ etc. could be added in the future for cases where the chained iterables are Sequences. -- Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas <python-ideas@python.org> wrote:
AudioLazy does that: https://github.com/danilobellini/audiolazy -- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)

On Sunday, June 25, 2017, Danilo J. S. Bellini <danilo.bellini@gmail.com> wrote:
- http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.concat and concatv
- https://github.com/kachayev/fn.py#streams-and-infinite-sequences-declaration
- Stream() << obj

On Sunday, June 25, 2017, Wes Turner <wes.turner@gmail.com> wrote:
<< is __lshift__()
<<= is __ilshift__()

https://docs.python.org/2/library/operator.html

Do Stream() and __lshift__() from fn.py not solve this?

On Mon, Jun 26, 2017 at 4:53 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
And you can also do

    def a_and_b():
        yield from a
        yield from b

    c = a_and_b()  # iterable that yields 0, 1, 2, 3

I sometimes wish there was something like

    c from:
        yield from a
        yield from b

...or to get a list:

    c as list from:
        yield from a
        yield from b

...or a sum:

    c as sum from:
        yield from a
        yield from b

These would be great for avoiding crazy one-liner generator expressions. They would also be equivalent to things like:

    @list
    @from
    def c():
        yield from a
        yield from b

    @sum
    @from
    def c():
        yield from a
        yield from b

the above, given:

    def from(genfunc):
        return genfunc()

Except of course `from` is a keyword and it should probably just be `call`. But this still doesn't naturally extend to allow indexing and slicing, like c[2] and c[1:3], for the case where the concatenated iterables are Sequences. -- Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +

On 26Jun2017 23:26, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Nice.
Also nice, but for me a nonstarter, because it breaks the existing Python idiom that "... as foo" means to bind the name "foo" to the expression on the left, such as with import and except. So +1 for the form, -1 for the particular keyword. Cheers, Cameron Simpson <cs@zip.com.au> Trust the computer industry to shorten Year 2000 to Y2K. It was this thinking that caused the problem in the first place. - Mark Ovens <marko@uk.radan.com>

We HAVE spellings for these things:

> c from:
>     yield from a
>     yield from b

    c = chain(a, b)
    c = list(chain(a, b))
    c = sum(chain(a, b))

Those really are not "crazy generator expressions."

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote:
As Serhiy points out, this is going to conflict with existing use of the + operator for string and sequence concatenation. I have a counter-proposal: introduce the iterator chaining operator "&":

    iterable & iterable --> itertools.chain(iterable, iterable)

The reason I choose & rather than + is that & is less likely to conflict with any existing string/sequence types. None of the built-in or std lib sequences that I can think of support the & operator. Also, & is used for (string?) concatenation in some languages, such as VB.Net, some BASIC dialects, Hypertalk, AppleScript, and Ada. Iterator chaining is more like concatenation than (numeric) addition.

However, the & operator is already used for bitwise-AND. Under my proposal that behaviour will continue, and will take priority over chaining. Currently, the & operator does something similar to (but significantly more complex than) this:

    # simplified pseudo-code of existing behaviour
    if hasattr(x, '__and__'):
        return x.__and__(y)
    elif hasattr(y, '__rand__'):
        return y.__rand__(x)
    else:
        raise TypeError

The key is to insert the new behaviour after the existing __(r)and__ code, just before TypeError is raised:

    attempt existing __(r)and__ behaviour
    if and only if that fails to apply:
        return itertools.chain(iter(x), iter(y))

So classes that define a __(r)and__ method will keep their existing behaviour. This implies that we cannot use & to chain sets and frozensets, since they already define __(r)and__. This has an easy work-around: just call iter() on the set first. Applying & to objects which don't define __(r)and__ and aren't iterable will continue to raise TypeError, just as it does now. The only backwards incompatibility this proposal introduces is for any code which relies on `iterable & iterable` raising TypeError. Frankly, I can't imagine that there is any such code outside of the Python test suite, but if there is, and people think it is worth it, we could make this a __future__ import. But I think that's overkill.

The downside to this proposal is that it adds some conceptual complexity to Python operators. Apart from `is` and `is not`, all Python operators call one or more dunder methods. This is (as far as I know) the first operator which would have fall-back functionality if the dunder methods aren't defined.

Up to now, I've talked about & chaining being equivalent to the itertools.chain function. That glosses over one difference which needs to be mentioned. The chain function currently doesn't attempt to iterate over its arguments until needed:

    py> x = itertools.chain("a", 1, "c")
    py> next(x)
    'a'
    py> next(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable

Any proposal to change this behaviour for the itertools.chain function should be kept separate from this one. But for the & chaining operator, I think that behaviour must change: if we have an operand that is neither iterable nor defines __(r)and__, the & operator should fail early:

    [1, 2, 3] & None

should raise TypeError immediately, unlike itertools.chain(). -- Steve
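To make the proposed lookup order concrete, here is a rough pure-Python emulation (a hypothetical helper function, not the actual interpreter change):

    from itertools import chain

    def amp(x, y):
        # Existing behaviour first: __and__, then __rand__.
        try:
            result = x.__and__(y)
            if result is not NotImplemented:
                return result
        except AttributeError:
            pass
        try:
            result = y.__rand__(x)
            if result is not NotImplemented:
                return result
        except AttributeError:
            pass
        # Proposed fallback: chain the operands. Calling iter() up front
        # makes non-iterable operands fail early, unlike itertools.chain().
        return chain(iter(x), iter(y))

So amp(iter({1, 2}), [3]) yields the chained values, while amp([1, 2, 3], None) raises TypeError immediately, as required above.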

On Sun, Jun 25, 2017 at 8:23 PM, Steven D'Aprano <steve@pearwood.info> wrote:
In [1]: import numpy as np

In [2]: import itertools

In [3]: a, b = np.array([1,2,3]), np.array([4,5,6])

In [4]: a & b
Out[4]: array([0, 0, 2])

In [5]: a + b
Out[5]: array([5, 7, 9])

In [6]: list(itertools.chain(a, b))
Out[6]: [1, 2, 3, 4, 5, 6]

These are all distinct, useful, and well-defined behaviors.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Mon, Jun 26, 2017 at 09:55:19AM -0700, David Mertz wrote:
Um... yes? I don't understand what point you are making. Did you read all of my post? I know it was long, but if you stopped reading at the point you replied, you might not realise that my proposal keeps the existing bitwise-AND behaviour of &, and so the numpy array behaviour won't change. TL;DR:

- keep the existing __and__ and __rand__ behaviour;
- if they are not defined, and both operands x, y are iterable, return chain(x, y);
- raise TypeError for operands which neither support __(r)and__ nor are iterable.

I think that chaining iterators is common enough and important enough in Python 3 to deserve an operator. While lists are still important, a lot of things which were lists are now lazily generated iterators, and we often need to concatenate them. itertools.chain() is less convenient than it should be. If we decide that chaining deserves an operator, it shouldn't be + because that clashes with existing sequence addition. & has the advantage that it means "concatenation" in some other languages, it means "and" in English which can be read as "add or concatenate", and it is probably unsupported by most iterables. I didn't think of numpy arrays as an exception (I was mostly thinking of sets), but I don't think people chain numpy arrays together very often. If they do, it's easy enough to call iter() first. -- Steve

On Tue, Jun 27, 2017 at 12:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I understand. But it invites confusion about just what the `&` operator will do for a given iterable. For NumPy itself, you don't really want to spell `chain(a, b)` so much. But you CAN, they are iterables. The idiomatic way is:

    >>> np.concatenate((a, b))
    array([1, 2, 3, 4, 5, 6])

However, for "any old iterable" it feels very strange to need to inspect the .__and__() and .__rand__() methods of the things on both sides before you know WHAT operator it is. Maybe if you are confident a and b are exactly NumPy arrays it is obvious, but what about:

    from some_array_library import a
    from other_array_library import b

What do you think `a & b` will do under your proposal? Yes, I understand it's deterministic... but it's far from *obvious*. This isn't even doing something pathological like defining both `.__iter__()` and `.__and__()` on the same class... which honestly, isn't even all that pathological; I can imagine real-world use cases.

> I think that chaining iterators is common enough and important enough in [...]
I actually completely agree! I just wish I could think of a good character that doesn't have some very different meaning in other well-known contexts (even among iterables). Some straw men:

    both = a ⊕ b
    both = a ⇢ b

Either of those look pretty nice to me, but neither is easy to enter on most keyboards. I think I wouldn't mind `&` if it only worked on iteraTORS. But then it loses many of the use cases. I'd like this, after all:

    for i in range(10) ⇢ [20, 19, 18] ⇢ itertools.count(100):
        if i > N:
            break
        ...

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

Hi all, Is "itertools.chain" actually that common? Sufficiently common to warrant its own syntax? In my experience, "enumerate" is far more common among the iterable operations. And that doesn't have special syntax either. A minimal proposal would be to promote "chain" to builtins. Stephan 2017-06-27 10:40 GMT+02:00 Greg Ewing <greg.ewing@canterbury.ac.nz>:

On Tue, Jun 27, 2017 at 11:01:32AM +0200, Stephan Houben wrote:
I think it's much more common than (say) sequence repetition:

    a = [None]*5

which has had an operator for a long, long time.
True. But enumerate is a built-in, and nearly always used in a single context:

    for i, x in enumerate(sequence):
        ...

A stranger to Python could almost be forgiven for thinking that enumerate is part of the for-loop syntax. In contrast, chaining (while not as common as, say, numeric addition) happens in variable contexts: in expressions, as arguments to function calls, etc. It is absolutely true that this proposal brings nothing new to the language that cannot already be done. It's syntactic sugar. So I guess the value of it depends on whether or not you chain iterables enough that you would rather use an operator than a function.
A minimal proposal would be to promote "chain" to builtins.
Better than nothing, I suppose. -- Steve

Hi Steven, To put this into perspective, I did some greps on Sagemath, being the largest Python project I have installed on this machine (1955 .py files). Occurrences:

    enumerate: 922
    zip: 585
    itertools.product: 67
    itertools.combinations: 18
    itertools.islice: 17
    chain: 14 (with or without itertools. prefix)

This seems to confirm my gut feeling that "chain" just isn't that common an operation; even among itertools functions, product, combinations and islice are more common. Based on this I would say there is little justification to even put "chain" in builtins, let alone to give it dedicated syntax. I also note that * for repetition is only supported for a few iterables (list, tuple), incidentally the same ones which support + for sequence chaining. Stephan 2017-06-27 12:38 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:

On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote:
And one which is especially focused on numerical processing, not really the sort of thing that does much iterator chaining. That's hardly a fair test -- we know there are applications where chaining is not important at all. It's the applications where it *is* important that we should be looking at. -- Steve

It's the applications where it *is* important that we should be looking at.
Um, yes, but given our relative positions in this debate, the onus is not really on *me* to demonstrate such an application, right? That would just confuse everybody ;-) (FWIW, Sagemath is not mostly "numerical processing"; it is mostly *symbolic* calculation and involves a lot of complex algorithms and data structures, including sequences.) Stephan 2017-06-27 13:48 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:

Just another syntactical suggestion: the binary ++ operator is used as concat in various contexts in various languages, and is probably less likely to confuse people as being either a logical or binary &. On Tue, Jun 27, 2017 at 6:53 AM Stephan Houben <stephanh42@gmail.com> wrote:

Unfortunately this is existing syntax:

    a++b  # is parsed as a+(+b)

Stephan

On 27 Jun. 2017 6:03 p.m., "Joshua Morton" <joshua.morton13@gmail.com> wrote:
Just another syntactical suggestion: the binary ++ operator is used as concat in various contexts in various languages, and is probably less likely to confuse people as being either a logical or binary &. On Tue, Jun 27, 2017 at 6:53 AM Stephan Houben <stephanh42@gmail.com> wrote:

2017-06-27 Stephan Houben <stephanh42@gmail.com> wrote:
Is "itertools.chain" actually that common? Sufficiently common to warrant its own syntax?
Please note that this can be turned around: maybe it is not as common as it could be because of all the burden of importing from a separate module -- after all, we are talking about a very simple operation, so using lists and `+` just wins because of our (programmers') laziness. :-) Cheers. *j

On Tue, Jun 27, 2017 at 08:40:23PM +1200, Greg Ewing wrote:
Except to the human reader, who can be forgiven for thinking "What the fudge is that semicolon doing there???" (or even less polite). I don't know of any language that uses semi-colon as an operator. That looks like a bug magnet to me. Consider what happens when (not if) you write (a,b) instead, or when you write items = (x; y) and it happens to succeed because x and y are iterable. To be perfectly honest, and no offence is intended, this suggestion seems so wacky to me that I'm not sure if you intended for it to be taken seriously or not. -- Steve

TL;DR If people really object to & doing double-duty as bitwise-AND and chaining, there's always && as a possibility. On Tue, Jun 27, 2017 at 12:47:40AM -0700, David Mertz wrote:
With operator overloading, that's a risk for any operator, and true for everything except literals. What will `x * y` do? How about `y * x`? They're not even guaranteed to call the same dunder method. But this is a problem in theory far more than in practice. In practice, you know what types you are expecting, and if you don't get them, you either explicitly raise an exception, or wait for something to fail. "Consenting adults" applies, and we often put responsibility back on the caller to do the right thing. If you expect to use & on iterable arguments, it is reasonable to put the responsibility on the caller to only provide "sensible" iterables, not something wacky like an infinite generator (unless your code is explicitly documented as able to handle them) or one that overrides __(r)and__. Or you could check for it yourself, if you don't trust the argument:

    if hasattr(arg, '__and__') or hasattr(arg, '__rand__'):
        raise Something

But that strikes me as overkill. You don't normally check for dunders before using an operator, and we already have operators that can return different types depending on the operands:

    % can mean modulo division or string interpolation
    * can mean sequence repetition or numeric multiplication
    + can mean numeric addition or sequence concatenation

Why is "& can mean iterable chaining or bitwise-AND" uniquely confusing?
That's not really chaining, per se -- it is concatenating two arrays to create a third array (probably by copying the array elements). If you wanted to chain a numpy array, you would either use itertools.chain directly, or call iter(myarray) before using the & operator.
Do you inspect the dunder methods of objects before calling + or * or & currently? Why would you need to start now? Since there's no way of peering into an object's dunder methods and seeing what they do (short of reading the source code), you always have an element of trust and hope whenever you call an operator on anything but a literal.
It's not "obvious" now, since a and b can do anything they like in their dunder methods. They could even inspect the call stack from __next__ and if they see "chain" in the stack, erase your hard drive. Does that mean we don't dare call chain(a, b)? I don't think this proposal adds any new risk. All the risk already exists as soon as you allow method overloading. All we're doing is saying is "try the method overloading first, and if that isn't supported, try iterable chaining second before raising TypeError".
Indeed -- numpy arrays probably do that, as do sets and frozensets. I didn't say it was pathological, I said it was uncommon.
There's always && for iterator chaining. -- Steve

On Tue, Jun 27, 2017 at 4:44 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I don't think it's "uniquely confusing." Just more so than the other examples you give. For example, I might write functions like these (untested): def modulo1(i: int, j: int) -> int: return i % j def modulo2(s: str, t: tuple) -> str: return s % t And similar examples for `*` and `+`. When I try to write this: def ampersand(x: Iterable, y: Iterable) -> Iterable: return x & y More ambiguity exists. The type signature works for both Numpy arrays and generators (under the proposed language feature), but the function does something different... in a way that is "more different" than I'd expect. That said, I like the idea of having iterators that act magically to fold in general iterables after an .__and__() or .__add__() as proposed by Brendan down-thread. Without any language change we could have: chainable(x for x in blah) + [1, 2, 3] + "hello" And I would like a language change that made a number of common iterable objects "chainable" without the wrapper. This wrapper could of course be used as a decorator too. E.g. generator comprehensions, things returned by itertools functions, range(), enumerate(), zip(), etc. This wouldn't promise that EVERY iterable or iterator had that "chainable" behavior, but it would cover 90% of the use cases. And I wouldn't find it confusing because the leftmost object would be the one determining the behavior, which feels more intuitive and predictable. I don't hate `&&`, but I think this approach makes more sense. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On 27.06.2017 21:27, David Mertz wrote:
I think that most people in favor of this proposal agree with you here. Let's start with something simple which can be extended bit by bit to cover more and more use-cases. I for one would also include right-handed iterators/generators, because I definitely know of real-world usage (just recently), but that can wait until you feel comfortable with it as well. If it's a language change, I would like the plus operator to be it, as it integrates well with lists, such as

    generator + (list1 + list2)

It can sometimes be necessary to group things up like this. Using a different operator symbol here, I would find confusing. Plus looks like concat to me. Regards, Sven

On 2017-06-27 00:47, David Mertz wrote:
Hmmm, is the proposal really meant to include behavior that is global and non-overridable? My understanding was that the proposal would effectively be like defining a default __and__ (or whatever) on some basic iterator types. Individual iterables (or iterators) could still define their own magic methods to define their own behavior. Because of that, I wouldn't expect it to be obvious what would happen in your case. If I import types from two random libraries, I can't expect to know what any operator does on them without reading their docs. Also because of that, I think it might be a bit much to expect this new concat operator to work on ALL iterables/iterators. Nothing else really works that way; types have to define their own operator behavior. Iterators can be instances of any class that defines a __next__, so I don't see how we could support the magic-concat-everything operator without interfering there.

So. . . wouldn't it be somewhat more reasonable to define this concat operator only on actual generators, and perhaps instances of the common iterator types returned from zip, enumerate, etc.? Someone earlier in the thread said that would be "weird" because it would be less generic than itertools.chain, but it seems to me it would cover most of the needed use cases (including the one that was initially given as a motivating example). Also, this generator.__add__ could be smart about handling other iterables on the right of the operator, so you could do

    (x for x in blah) + [1, 2, 3] + "hello"

. . . and, as long as you started with a regular generator, it could work, by having the magic method return a new instance of a type that also has this behavior. This would be a bit odd because I think most existing Python types don't try to do this kind of thing (subsuming many different types of right-operands). But I think it would be useful. In the interim, it could be played with by just making a class that implements the __add__ (or whatever), so you could do

    cool_iterator(x for x in blah) + [1, 2, 3] + "hello"

. . . and just wrapping the leftmost operand would be enough to give you nice syntax for chaining all the rest.

-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
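Following that last suggestion, a minimal runnable sketch of such a wrapper (the cool_iterator name is hypothetical, borrowed from above; an illustration, not a library API):

    from itertools import chain

    class cool_iterator:
        """Wrap any iterable so that + chains it with other iterables."""
        def __init__(self, iterable):
            self._it = iter(iterable)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def __add__(self, other):
            # Return a new wrapper, so further + operations keep working.
            return cool_iterator(chain(self._it, other))

        def __radd__(self, other):
            return cool_iterator(chain(other, self._it))

With that, list(cool_iterator(x for x in [0]) + [1, 2] + "ab") gives [0, 1, 2, 'a', 'b'], and only the leftmost operand needs wrapping.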

On Mon, Jun 26, 2017 at 01:23:36PM +1000, Steven D'Aprano wrote:
I remembered there is a precedent here. The == operator tries __eq__ before falling back on object identity, at least in Python 2.

    py> getattr(object(), '__eq__')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'object' object has no attribute '__eq__'
    py> object() == object()
    False

-- Steve

On 2017-06-25 20:23, Steven D'Aprano wrote:
I have a counter-proposal: introduce the iterator chaining operator "&":
iterable & iterable --> itertools.chain(iterable, iterable)
I like this suggestion. Here's another color that might be less controversial:

    iterable3 = iterable1.chain(iterable2)

Perhaps more obvious than &, and easier to use than "from itertools import chain...". -Mike

On Tue, Jun 27, 2017 at 1:57 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
How do you chain it1, it2, it3, etc.? I guess `it1.chain(it2.chain(it3))` ... but that starts to become distinctly less readable IMO. I'd much rather spell `chain(it1, it2, it3)`.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On 2017-06-27 14:02, David Mertz wrote:
Even if this "chain" only took one argument, you could do it1.chain(it2).chain(it3). But I don't see why it couldn't take multiple arguments as you suggest. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On 28 June 2017 at 07:13, Mike Miller <python-ideas@mgmiller.net> wrote:
While I haven't been following this thread closely, I'd like to note that arguing for a "chain()" builtin has the virtue that it would just be arguing for the promotion of the existing itertools.chain function into the builtin namespace. Such an approach has a lot to recommend it:

1. It has precedent, in that Python 3's map(), filter(), and zip() are essentially Python 2's itertools.imap(), ifilter(), and izip()
2. There's no need for a naming or semantics debate, as we'd just be promoting an established standard library API into the builtin namespace
3. Preserving compatibility with older versions is straightforward: just do an unconditional "from itertools import chain"
4. As an added bonus, we'd also get "chain.from_iterable" as a builtin API

So it would be good to have a short PEP arguing that, since chaining arbitrary iterables is at least as important as mapping, filtering, and zipping them, itertools.chain should be added to the builtin namespace in 3.7+ (but no, I'm not volunteering to write that myself). As a *separate* discussion, folks could then also argue for the addition of a `__lshift__` operator implementation specifically on iterator chains that lets you write:

    full_chain = chain(it1) << it2 << it3  # Incrementally create new chains
    full_chain <<= it4  # Extend an existing chain

I'd be surprised if such a proposal got accepted for 3.7, but it would make a good follow-up discussion for 3.8 (assuming chain() made it into the 3.7 builtins). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
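For that follow-up discussion, a rough pure-Python sketch of what such a chain-with-<< type could look like (a hypothetical wrapper class; itertools.chain itself defines no __lshift__ today):

    from itertools import chain

    class Chain:
        """Hypothetical iterator chain supporting << and <<=."""
        def __init__(self, *iterables):
            self._it = chain(*iterables)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def __lshift__(self, other):
            # Chain(it1) << it2 incrementally creates a new chain.
            return Chain(self._it, other)

        def __ilshift__(self, other):
            # full_chain <<= it4 extends the existing chain in place.
            self._it = chain(self._it, other)
            return self

For example, list(Chain([1]) << (2, 3) << "ab") gives [1, 2, 3, 'a', 'b'].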

On 6/27/2017 10:47 PM, Nick Coghlan wrote:
A counter-argument is that there are other itertools that deserve promotion, by usage, even more. But we need to see comparisons from more than one limited corpus. On the other hand, there might be a theory argument that chain is somehow more basic, akin to map, etc., in a way that others are not.
-- Terry Jan Reedy

On 28 June 2017 at 14:30, Terry Reedy <tjreedy@udel.edu> wrote:
The main rationale I see is the one that kicked off the most recent discussion, which is that in Python 2, you could readily chain the output of map(), filter(), zip(), range(), dict.keys(), dict.values(), dict.items(), etc. together with "+", simply because they all returned concrete lists. In Python 3, you can't do that as easily anymore, since they all return either iterators or computed containers that don't support "+". While there are good reasons not to implement "+" on those iterators and custom containers, we *can* fairly easily restore builtin concatenation support for their outputs, and we can do it in a way that's friendly to all implementations that already provide the itertools.chain API. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 28 June 2017 at 05:30, Terry Reedy <tjreedy@udel.edu> wrote:
Indeed. I don't recall *ever* using itertools.chain myself. I'd be interested in seeing some usage stats to support this proposal. As an example, I see 8 uses of itertools.chain in pip and its various vendored packages, as opposed to around 30 uses of map (plus however many list comprehensions are used in place of maps). On a very brief scan, it looks like the various other itertools are used less than chain, but with only 8 uses of chain, it's not really possible to read anything more into the relative frequencies. Paul

On 28 June 2017 at 18:54, Paul Moore <p.f.moore@gmail.com> wrote:
The other thing to look for would be list() and list.extend() calls. I know I use those quite a bit in combination with str.join, where I don't actually *need* a list, it's just currently the most convenient way to accumulate all the components I'm planning to combine. And if you're converting from Python 2 code, then adding a few list() calls in critical places in order to keep the obj.extend() calls working is likely to be easier in many cases than switching over to using itertools.chain. For simple cases, that's fine (since a list of direct references will be lower overhead than accumulating a chain of short iterables), but without builtin support for iterator concatenation, it's currently a nonlocal refactoring (to add the "from itertools import chain" at the top of the file) to switch completely to the "pervasive iterators" model when folks actually do want to do that. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
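To illustrate the two styles side by side (illustrative code only, with a made-up helper name):

    from itertools import chain

    def lines():
        yield "alpha"
        yield "beta"

    # Accumulator style: materialise a list just to feed str.join.
    parts = ["header"]
    parts.extend(lines())
    text = "\n".join(parts)

    # Pervasive-iterator style: no intermediate list, but it needs the
    # nonlocal "from itertools import chain" refactoring mentioned above.
    text = "\n".join(chain(["header"], lines()))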

On 28.06.2017 11:09, Nick Coghlan wrote:
This is exactly the reason why I also doubt that Stephan's Sagemath stats tell us anything beyond "chain isn't used that much". Iterators are only a nice-to-have if you work with simple lists of up to 1000 items; current hardware is able to fix that for you. There are simply more readable ways of doing "chaining of sequences" in many cases, even if you are already on Python 3. In the end, list and the "+" operator are the best way of "doing sequences". Regards, Sven

On Wed, Jun 28, 2017 at 11:54 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Indeed. I don't recall *ever* using itertools.chain myself.
In fact, me neither. Or maybe a couple of times. For such a basic task, it feels more natural to write a generator function, or even turn it into a list, if you can be sure that the 'unnecessary' lists will be small and that the code won't be a performance bottleneck. To reiterate on this some more: One of the nice things of Python 3 is (or could be) the efficiency of not making unnecessary lists by default. But for the programmer/beginner it's not nearly as convenient with the views as it is with lists. Beginners quickly need to learn about wrapping things in list(). Also, generators are really nice, and chaining them is just as useful/necessary as extending or concatenating lists. Chaining generators with other iterables is certainly useful, but when all the parts of the chained object are iterable but not sequences, that seems like an invitation to use list() at some point in the code. So whatever the outcome of this discussion (I hope there is one, whether it is by adding iterator-related builtins or something more sophisticated), it should probably take into account possible future ways of dealing with some kind of "lazy lists". However, I'm actually not sure the syntax for chaining generators/iterables should necessarily be the same as for chaining arbitrary sequences. The programmer needs to be well aware of whether the resulting object is a Sequence or 'just' a generator. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 28.06.2017 14:00, Koos Zevenhoven wrote:
The programmer needs to be well aware of whether the resulting object is a Sequence or 'just' a generator.
Could you elaborate more on **why**? Regards, Sven PS: I consider this proposal to be like allowing adding floats and ints together. If I don't know whether there was a float in the sum, I don't know whether my result will be a float.

On Wed, Jun 28, 2017 at 3:18 PM, Sven R. Kunze <srkunze@mail.de> wrote:
For a moment, I was wondering what the double emphasis was for, but then I realized you are simply calling `statement.__why__()` directly instead of the recommended `spoiler(statement)`. But sure, I just got on vacation and I even found a power extension cord to use my laptop at the pool, so what else would I do ;). It all depends on what you need to do with the result of the concatenation. When all you need is something to iterate over, a generator-like thingy is fine. But when you need something for indexing and slicing or len etc., you want to be sure that that is what you're getting. But maybe someone passed you an argument that is not a sequence, or you forgot if a function returns a sequence or a generator. In that case, you want the error right away, instead of from some completely different piece of code somewhere that thinks it's getting a sequence. I don't think Python should depend on a static type checker to catch the error early. After all, type hints are optional. PS: I consider this proposal to be like allowing adding floats and ints
together. If I don't know if there was a float in the sum, don't know if my result will be a float.
Not to say that the float/int case is never problematic, but the situation is still different. Often when a float makes any sense, you can work with either floats or ints and it doesn't really matter. But if you specifically *need* an int, you usually don't call functions that return floats. But if you do use division etc., you probably need to think about floor/ceil/closest anyway. And yes, there have probably been Python 2->3 porting bugs where / division was not appropriately replaced with //. But regarding containers, it often makes just as much sense for a function to return a generator as it does to return a sequence. The name or purpose of a function may give no hint about whether an iterable or sequence is returned, and you can't expect everyone to prefix their function names with iter_ and seq_ etc. And it's not just function return values, it's also arguments passed into your function. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +

On 28.06.2017 16:01, Koos Zevenhoven wrote:
Doing this for years now. Sometimes, when 'statement.__why__()' returns None, 'spoiler(statement)' returns some thought-terminating cliché. ;)
But sure, I just got on vacation and I even found a power extension cord to use my laptop at the pool, so what else would I do ;).
Good man. Today, a colleague of mine showed me a mobile mini-keyboard with a phone bracket (not even a dock). So, having his 7'' smartphone, he can work from his vacations and answer emails as well. ;) Cheap notebook replacement, if you don't prefer large screens and keyboards. :D
I understand that. In the end, I remember people on this mailing list recommending that I use "list(...)" to make sure I've got one in my hands. I remember this being necessary in the conversion process from Python 2 to 3. The pattern is already here.
Division is one thing, numeric input parameters from unknown sources is another. In this regard, calling "int(...)" or "list(...)" follows the same scheme IMO.
Exactly. Neither do I want those prefixes, and I can tell you they aren't necessary in practice at all. Just my 2 cents on this: At work, we heavily rely on Django. Django provides a so-called QuerySet type, its db-result abstraction. Among those querysets, our functions return lists and sets with no indication whatsoever of which type it may be. It works quite well and we didn't have any issues with that. If we need a list, we wrap it with "list(...)". It's as simple as that. The valid concern, that it could be confusing which type the return value might have, is usually an abstract one. I can tell you that in practice it's not really an issue to talk about. Regards, Sven

On Wed, Jun 28, 2017 at 9:01 PM, Sven R. Kunze <srkunze@mail.de> wrote:
Oh, I've been very close to getting one of those. But then I should probably get a pair of glasses too ;).
That pattern annoys people and negates the benefits of views and generators.
Sure, but you may want to turn your unknown sources into something predictable as soon as possible. You'll need to deal with the errors in the input anyway. Returning sequences vs generators is a different matter. You don't want to turn generators into lists if you don't have to.
[...] Just my 2 cents on this:
Very often one doesn't really need a list, but just something that has indexing, slicing and/or len(). Wrapping things with list() can be ok, but it uses memory and is O(n). Generating lists from all kinds of iterables all the time is just a whole bunch of unnecessary overhead. But yes, it happens, because that's the convenient way of doing it now. That's like going back to Python 2, but with additional calls to list() required. Maybe you're lucky that your iterables are small and not a bottleneck, and/or you just don't feel guilty every time you call list() where you shouldn't have to ;).

On 28.06.2017 20:37, Koos Zevenhoven wrote:
Oh, I've been very close to getting one of those. But then I should probably get a pair of glasses too ;).
:D
That pattern annoys people and negates the benefits of views and generators.
Sure, that's why I am in favor of this proposal. It would remove the necessity to do that in various places. :)
That's a good point.
Yep, exactly. That's why I like an easier way of concatenating them with no bells and whistles. Preferably like lists today. ;) Cheers, Sven

On 28Jun2017 09:54, Paul Moore <p.f.moore@gmail.com> wrote:
I don't use it often, but when I do it is very handy. While I'm not arguing for making it a builtin on the basis of my own use (though I've no objections either), a quick grep shows: My maildb kit uses chain to assemble multiple related header values:

    *chain( msg.get_all(hdr, [])
            for hdr in ('to', 'cc', 'bcc', 'resent-to', 'resent-cc') )

Two examples where I use it to insert items in front of an iterable:

    chunks = chain( [data], chunks )
    blocks = indirect_blocks(chain( ( topblock, nexttopblock ), blocks ))

Neither of these is amenable to list rephrasings because the tail iterables ("chunks" and "blocks") are of unknown and potentially large size. And a few other cases whose uses are harder to succinctly describe, but generally "iterable flattening". So it is uncommon for me, but very useful when I want it. Just some (small) data points. Cheers, Cameron Simpson <cs@zip.com.au>

25.06.17 15:06, lucas via Python-ideas пише:
It would be weird if the addition is only supported for instances of the generator class, but not for other iterators. Why (n for n in range(2)) + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports arbitrary iterators. Therefore you will need to implement the __add__ method for *all* iterators in the world. However itertools.chain() accepts not just *iterators*. It works with *iterables*. Therefore you will need to implement the __add__ method also for all iterables in the world. But __add__ already is implemented for list and tuple, and many other sequences, and your definition conflicts with this.

I would like to add that for example numpy ndarrays are iterables, but they have an __add__ with completely different semantics, namely element-wise ( numerical) addition. So this proposal would conflict with existing libraries with iterable objects. Stephan Op 25 jun. 2017 2:51 p.m. schreef "Serhiy Storchaka" <storchaka@gmail.com>:

Personally, I find syntactic sugar for concating interators would come in handy. The purpose of iterators and generators is performance and efficiency. So, lowering the bar of using them is a good idea IMO. Also hoping back and forth a generator/iterator-based solution and a, say, list-based/materialized solution would become a lot easier. On 25.06.2017 16:04, Stephan Houben wrote:
I don't see a conflict.
I don't think it's necessary to start with *all* iterators in the world. So, adding iterators and/or generators, should be possible without any problems. It's a start and could already help a lot if I have my use-cases correctly.
As above, I don't see a conflict. Regards, Sven

2017-06-25 Serhiy Storchaka <storchaka@gmail.com> dixit: > 25.06.17 15:06, lucas via Python-ideas пише: > > I often use generators, and itertools.chain on them. > > What about providing something like the following: > > > > a = (n for n in range(2)) > > b = (n for n in range(2, 4)) > > tuple(a + b) # -> 0 1 2 3 [...] > It would be weird if the addition is only supported for instances of > the generator class, but not for other iterators. Why (n for n in > range(2)) > + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, > 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports > arbitrary iterators. Therefore you will need to implement the __add__ > method for *all* iterators in the world. > > However itertools.chain() accepts not just *iterators*. [...] But implementation of the OP's proposal does not need to be based on __add__ at all. It could be based on extending the current behaviour of the `+` operator itself. Now this behavior is (roughly): try left side's __add__, if failed try right side's __radd__, if failed raise TypeError. New behavior could be (again: roughly): try left side's __add__, if failed try right side's __radd__, if failed try __iter__ of both sides and chain them (creating a new iterator¹), if failed raise TypeError. And similarly, for `+=`: try __iadd__..., try __add__..., try __iter__..., raise TypeError. Cheers. *j ¹ Preferably using the existing `yield from` mechanism -- because, in case of generators, it would provide a way to combine ("concatenate") *generators*, preserving semantics of all that their __next__(), send(), throw() nice stuff...

On Fri, Jun 30, 2017 at 1:09 AM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote: > 2017-06-25 Serhiy Storchaka <storchaka@gmail.com> dixit: > >> 25.06.17 15:06, lucas via Python-ideas пише: > >> > I often use generators, and itertools.chain on them. >> > What about providing something like the following: >> > >> > a = (n for n in range(2)) >> > b = (n for n in range(2, 4)) >> > tuple(a + b) # -> 0 1 2 3 > [...] >> It would be weird if the addition is only supported for instances of >> the generator class, but not for other iterators. Why (n for n in >> range(2)) >> + (n for n in range(2, 4)) works, but iter(range(2)) + iter(range(2, >> 4)) and iter([0, 1]) + iter((2, 3)) don't? itertools.chain() supports >> arbitrary iterators. Therefore you will need to implement the __add__ >> method for *all* iterators in the world. >> >> However itertools.chain() accepts not just *iterators*. > [...] > > But implementation of the OP's proposal does not need to be based on > __add__ at all. It could be based on extending the current behaviour of > the `+` operator itself. > > Now this behavior is (roughly): try left side's __add__, if failed try > right side's __radd__, if failed raise TypeError. > > New behavior could be (again: roughly): try left side's __add__, if > failed try right side's __radd__, if failed try __iter__ of both sides > and chain them (creating a new iterator¹), if failed raise TypeError. > > And similarly, for `+=`: try __iadd__..., try __add__..., try > __iter__..., raise TypeError. I actually really like this proposal, in additional to the original proposal of using '+' to chain generators--I don't think it necessarily needs to be extended to *all* iterables. But this proposal goes one better. I just have to wonder what kind of strange unexpected bugs would result. For example now you could add a list to a string: >>> list(['a', 'b', 'c'] + 'def') ['a', 'b', 'c', 'd', 'e', 'f'] Personally, I really like this and find it natural. But it will break anything expecting this to be a TypeError.

On Jun 30, 2017 2:23 PM, "Erik Bray" <erik.m.bray@gmail.com> wrote: I actually really like this proposal, in additional to the original proposal of using '+' to chain generators--I don't think it necessarily needs to be extended to *all* iterables. But this proposal goes one better. I just have to wonder what kind of strange unexpected bugs would result. For example now you could add a list to a string: >>> list(['a', 'b', 'c'] + 'def') ['a', 'b', 'c', 'd', 'e', 'f'] Personally, I really like this and find it natural. But it will break anything expecting this to be a TypeError. Note that you can already do: [*iterable1, *iterable2] Or like in your example:
[*['a', 'b', 'c'], *'def'] ['a', 'b', 'c', 'd', 'e', 'f']
At least I think you can do that in 3.6 ;) -- Koos (mobile)

On Fri, Jun 30, 2017 at 01:09:51AM +0200, Jan Kaliszewski wrote:
That's what I suggested earlier, except using & instead of + as the operator. The reason I suggested & instead of + is that there will be fewer clashes between iterables that already support the operator and hence fewer surprises. Using + will be a bug magnet. Consider: it = x + y # chain two iterables first = next(it, "default") That code looks pretty safe, but it's actually a minefield waiting to blow you up. It works fine if you pass (let's say) a generator object and a string, or a list and an iterator, but if x and y happen to both be strings, or both lists, or both tuples, the + operator will concatenate them instead of chaining them, and the call to next will blow up. So you would have to write: it = iter(x + y) # chain two iterables, and ensure the result is an iterator to be sure. Which is not a huge burden, but it does take away the benefit of having an operator. In that case, you might as well do: it = chain(x, y) and be done with it. It's true that exactly the same potential problem occurs with & but its less likely. Strings, tuples, lists and other sequences don't typically support __(r)and__ and the & operator, so you're less likely to be burned. Still, the possibility is there. Maybe we should use a different operator. ++ is out because that already has meaning, so that leaves either && or inventing some arbitrary symbol. But the more I think about it the more I agree with Nick. Let's start by moving itertools.chain into built-ins, with zip and map, and only consider giving it an operator after we've had a few years of experience with chain as a built-in. We might even find that an operator doesn't add any real value.
I don't think that would be generally useful. If you're sending values
into an arbitrary generator, who knows what you're getting? chain()
operates on arbitrary iterables; you can't expect to send values into
chain([1, 2, 3], my_generator(), "xyz") and have anything sensible
occur.

-- Steve
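The bug magnet described earlier in this message can be demonstrated
with plain Python today, since under the proposed fallback list + list
would still concatenate before any chaining is tried. A minimal demo:

    x, y = [1, 2], [3, 4]
    it = x + y                  # list + list concatenates: [1, 2, 3, 4]
    try:
        first = next(it, "default")
    except TypeError as exc:
        print(exc)              # 'list' object is not an iterator

    # The defensive spelling works for any mix of operands, but gives
    # back most of the brevity the operator was meant to buy:
    it = iter(x + y)
    print(next(it, "default"))  # 1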

On Saturday, July 1, 2017, Steven D'Aprano <steve@pearwood.info> wrote:
- Would that include chain.from_iterable?
- So there's then a new conditional import (e.g. in a compat package)?
  What does this add?
Flatten one level?
- Is my_generator() mutable (e.g. before or during iteration)?
- https://docs.python.org/2/reference/expressions.html#generator.send

On Sat, Jul 01, 2017 at 01:35:29AM -0500, Wes Turner wrote:
Yes.
- So there's then a new conditional import (e.g. in a compat package)? What does this add?
    try: chain
    except NameError: from itertools import chain

Two lines, if and only if you both need chain and want to support
versions of Python older than 3.7. There's no need to import it if you
aren't going to use it.
Flattening typically applies to lists and sequences. I'm not saying that chain shouldn't support generators. That would be silly: a generator is an iterable and chaining supports iterables. I'm saying that it wouldn't be helpful to require chain objects to support send(), throw() etc.
It doesn't matter. Sending into a chain of arbitrary iterators isn't a useful thing to do.
- https://docs.python.org/2/reference/expressions.html#generator.send
Why are you linking to the 2 version of the docs? We're discussing a
hypothetical new feature which must go into 3, not 2.

-- Steve

On Sat, Jul 1, 2017 at 6:11 PM, Steven D'Aprano <steve@pearwood.info> wrote:
It'd be even simpler. If you want to support <3.7 and 3.7+, you write:

    from itertools import chain

At least, I presume it isn't going to be *removed* from itertools.
Promotion to builtin shouldn't break pre-existing code, so the way to be
compatible with pre-promotion Pythons is simply to code for those and
not take advantage of the new builtin.

ChrisA

On Saturday, July 1, 2017, Steven D'Aprano <steve@pearwood.info> wrote:
Or, can I just continue to import the same function from the same place:

    from itertools import chain

Nice, simple, easy. There's even (for all you functional lovers):

    from itertools import *

And, again, this works today:

    from fn import Stream

    itr = Stream() << my_generator() << (8, 9, 0)

- https://github.com/kachayev/fn.py/blob/master/README.rst#streams-and-infinit...
- https://github.com/kachayev/fn.py/blob/master/fn/stream.py
- AFAIU, + doesn't work because e.g. numpy already defines + and & for
  Iterable arrays.
So the argspec is/should be Iterables with __iter__ (but not necessarily
__len__)?
So, with a generator function, I get a traceback at the current yield statement. With chain() I get whatever line the chain call is on.
In your opinion, has the send() functionality changed at all?

On 1 July 2017 at 07:13, Steven D'Aprano <steve@pearwood.info> wrote:
I'm struck here by the contrast between this and the "let's slim down
the stdlib" debates we've had in the past. How difficult is it really to
add "from itertools import chain" at the start of a file? It's not even
as if itertools is a 3rd party dependency.

Paul

On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < python-ideas@python.org> wrote:
I think a convenient syntax for chaining iterables and sequences would
be very useful in Python 3, because there has been a shift from using
lists by default to using views onto dict keys and values, range
objects, etc. Having to add an import for a basic operation that used to
just work with the + operator feels like a regression to many.

It's not really clear if you will be able to implement this, but if you
can find a syntax that gets accepted, I think using the same type as
itertools.chain might be a good starting point, although the docs should
not promise to return that exact type, so that support for __getitem__
etc. could be added in the future for cases where the chained iterables
are Sequences.

-- Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven +

On Sun, Jun 25, 2017 at 3:06 PM, lucas via Python-ideas < python-ideas@python.org> wrote:
AudioLazy does that: https://github.com/danilobellini/audiolazy -- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)

On Sunday, June 25, 2017, Danilo J. S. Bellini <danilo.bellini@gmail.com> wrote:
- http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.concat
  and concatv
- https://github.com/kachayev/fn.py#streams-and-infinite-sequences-declaration
  - Stream() << obj

On Sunday, June 25, 2017, Wes Turner <wes.turner@gmail.com> wrote:
<< is __lshift__()
<<= is __ilshift__()

https://docs.python.org/2/library/operator.html

Do Stream() and __lshift__() from fn.py not solve this?

On Mon, Jun 26, 2017 at 4:53 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
And you can also do

    def a_and_b():
        yield from a
        yield from b

    c = a_and_b()  # iterable that yields 0, 1, 2, 3

I sometimes wish there was something like

    c from:
        yield from a
        yield from b

...or, to get a list:

    c as list from:
        yield from a
        yield from b

...or a sum:

    c as sum from:
        yield from a
        yield from b

These would be great for avoiding crazy one-liner generator expressions.
They would also be equivalent to things like:

    @list
    @from
    def c():
        yield from a
        yield from b

    @sum
    @from
    def c():
        yield from a
        yield from b

the above, given:

    def from(genfunc):
        return genfunc()

Except of course `from` is a keyword and it should probably just be
`call`. But this still doesn't naturally extend to allow indexing and
slicing, like c[2] and c[1:3], for the case where the concatenated
iterables are Sequences.

-- Koos
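For what it's worth, the decorator trick above already runs today once
the keyword clash is resolved. A runnable sketch with the helper named
`call`, as suggested, and the generators a and b from the original
example:

    def call(genfunc):
        # Immediately call a generator function, yielding its generator.
        return genfunc()

    a = (n for n in range(2))
    b = (n for n in range(2, 4))

    @list
    @call
    def c():
        yield from a
        yield from b

    print(c)  # [0, 1, 2, 3] -- `c` is bound to the materialized list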

On 26Jun2017 23:26, Koos Zevenhoven <k7hoven@gmail.com> wrote:
Nice.
Also nice, but for me a nonstarter because it breaks the existing Python
idiom that "... as foo" means to bind the name "foo" to the expression
on the left. Such as with import, except. So +1 for the form, -1 for the
particular keyword.

Cheers,
Cameron Simpson <cs@zip.com.au>

Trust the computer industry to shorten Year 2000 to Y2K. It was this
thinking that caused the problem in the first place.
- Mark Ovens <marko@uk.radan.com>

We HAVE spellings for these things:

> c from:
>     yield from a
>     yield from b

    c = chain(a, b)

    c = list(chain(a, b))

    c = sum(chain(a, b))

Those really are not "crazy generator expressions."

--
Keeping medicines from the bloodstreams of the sick; food from the
bellies of the hungry; books from the hands of the uneducated;
technology from the underdeveloped; and putting advocates of freedom in
prisons. Intellectual property is to the 21st century what the slave
trade was to the 16th.

On Sun, Jun 25, 2017 at 02:06:54PM +0200, lucas via Python-ideas wrote:
As Serhiy points out, this is going to conflict with existing use of the
+ operator for string and sequence concatenation.

I have a counter-proposal: introduce the iterator chaining operator "&":

    iterable & iterable --> itertools.chain(iterable, iterable)

The reason I choose & rather than + is that & is less likely to conflict
with any existing string/sequence types. None of the built-in or std lib
sequences that I can think of support the & operator. Also, & is used
for (string?) concatenation in some languages, such as VB.Net, some
BASIC dialects, Hypertalk, AppleScript, and Ada. Iterator chaining is
more like concatenation than (numeric) addition.

However, the & operator is already used for bitwise-AND. Under my
proposal that behaviour will continue, and will take priority over
chaining. Currently, the & operator does something similar to (but
significantly more complex than) this:

    # simplified pseudo-code of existing behaviour
    if hasattr(x, '__and__'):
        return x.__and__(y)
    elif hasattr(y, '__rand__'):
        return y.__rand__(x)
    else:
        raise TypeError

The key is to insert the new behaviour after the existing __(r)and__
code, just before TypeError is raised:

    attempt existing __(r)and__ behaviour
    if and only if that fails to apply:
        return itertools.chain(iter(x), iter(y))

So classes that define a __(r)and__ method will keep their existing
behaviour. This implies that we cannot use & to chain sets and frozen
sets, since they already define __(r)and__. This has an easy
work-around: just call iter() on the set first.

Applying & to objects which don't define __(r)and__ and aren't iterable
will continue to raise TypeError, just as it does now. The only
backwards incompatibility this proposal introduces is for any code which
relies on `iterable & iterable` to raise TypeError. Frankly, I can't
imagine that there is any such code outside of the Python test suite,
but if there is, and people think it is worth it, we could make this a
__future__ import. But I think that's overkill.

The downside to this proposal is that it adds some conceptual complexity
to Python operators. Apart from `is` and `is not`, all Python operators
call one or more dunder methods. This is (as far as I know) the first
operator which would have fall-back functionality if the dunder methods
aren't defined.

Up to now, I've talked about & chaining being equivalent to the
itertools.chain function. That glosses over one difference which needs
to be mentioned: the chain function currently doesn't attempt to iterate
over its arguments until needed:

    py> x = itertools.chain("a", 1, "c")
    py> next(x)
    'a'
    py> next(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable

Any proposal to change this behaviour for the itertools.chain function
should be kept separate from this one. But for the & chaining operator,
I think that behaviour must change: if we have an operand that is
neither iterable nor defines __(r)and__, the & operator should fail
early:

    [1, 2, 3] & None

should raise TypeError immediately, unlike itertools.chain().

-- Steve
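A pure-Python approximation of the proposed semantics, for concreteness.
The helper name `amp` is invented for illustration; the real change
would sit in the interpreter's binary-& dispatch, between __(r)and__ and
the TypeError:

    from itertools import chain

    def amp(x, y):
        # Existing behaviour first: the __and__/__rand__ protocol.
        try:
            return x & y
        except TypeError:
            pass
        # New fallback: chain both sides, failing early (unlike
        # itertools.chain) if either operand is not iterable.
        return chain(iter(x), iter(y))

    print(list(amp(iter([0, 1]), iter((2, 3)))))  # [0, 1, 2, 3]
    print(amp({1, 2}, {2, 3}))                    # {2} -- set intersection wins
    try:
        amp([1, 2, 3], None)
    except TypeError as exc:
        print(exc)          # fails immediately, unlike itertools.chain()

One rough edge of this sketch: the try/except also swallows TypeErrors
raised inside a custom __and__, which the real dispatch would not.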

On Sun, Jun 25, 2017 at 8:23 PM, Steven D'Aprano <steve@pearwood.info> wrote:
    In [1]: import numpy as np
    In [2]: import itertools
    In [3]: a, b = np.array([1, 2, 3]), np.array([4, 5, 6])
    In [4]: a & b
    Out[4]: array([0, 0, 2])
    In [5]: a + b
    Out[5]: array([5, 7, 9])
    In [6]: list(itertools.chain(a, b))
    Out[6]: [1, 2, 3, 4, 5, 6]

These are all distinct, useful, and well-defined behaviors.

On Mon, Jun 26, 2017 at 09:55:19AM -0700, David Mertz wrote:
Um... yes? I don't understand what point you are making. Did you read
all of my post? I know it was long, but if you stopped reading at the
point you replied, you might not realise that my proposal keeps the
existing bitwise-AND behaviour of &, and so the numpy array behaviour
won't change.

TL;DR:

- keep the existing __and__ and __rand__ behaviour;
- if they are not defined, and both operands x, y are iterable, return
  chain(x, y);
- raise TypeError for operands which neither support __(r)and__ nor are
  iterable.

I think that chaining iterators is common enough and important enough in
Python 3 to deserve an operator. While lists are still important, a lot
of things which were lists are now lazily generated iterators, and we
often need to concatenate them. itertools.chain() is less convenient
than it should be.

If we decide that chaining deserves an operator, it shouldn't be +
because that clashes with existing sequence addition. & has the
advantage that it means "concatenation" in some other languages, it
means "and" in English which can be read as "add or concatenate", and it
is probably unsupported by most iterables. I didn't think of numpy
arrays as an exception (I was mostly thinking of sets), but I don't
think people chain numpy arrays together very often. If they do, it's
easy enough to call iter() first.

-- Steve

On Tue, Jun 27, 2017 at 12:12 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I understand. But it invites confusion about just what the `&` operator
will do for a given iterable. For NumPy itself, you don't really want to
spell `chain(a, b)` so much. But you CAN, they are iterables. The
idiomatic way is:

    >>> np.concatenate((a, b))
    array([1, 2, 3, 4, 5, 6])

However, for "any old iterable" it feels very strange to need to inspect
the .__and__() and .__rand__() methods of the things on both sides
before you know WHAT operator it is. Maybe if you are confident a and b
are exactly NumPy arrays it is obvious, but what about:

    from some_array_library import a
    from other_array_library import b

What do you think `a & b` will do under your proposal? Yes, I understand
it's deterministic... but it's far from *obvious*. This isn't even doing
something pathological like defining both `.__iter__()` and `.__and__()`
on the same class... which honestly, isn't even all that pathological; I
can imagine real-world use cases.

> I think that chaining iterators is common enough and important enough
> in Python 3 to deserve an operator.

I actually completely agree! I just wish I could think of a good
character that doesn't have some very different meaning in other
well-known contexts (even among iterables). Some straw men:

    both = a ⊕ b
    both = a ⇢ b

Either of those look pretty nice to me, but neither is easy to enter on
most keyboards. I think I wouldn't mind `&` if it only worked on
iteraTORS. But then it loses many of the use cases. I'd like this, after
all:

    for i in range(10) ⇢ [20, 19, 18] ⇢ itertools.count(100):
        if i > N:
            break
        ...

Hi all, Is "itertools.chain" actually that common? Sufficiently common to warrant its own syntax? In my experience, "enumerate" is far more common among the iterable operations. And that doesn't have special syntax either. A minimal proposal would be to promote "chain" to builtins. Stephan 2017-06-27 10:40 GMT+02:00 Greg Ewing <greg.ewing@canterbury.ac.nz>:

On Tue, Jun 27, 2017 at 11:01:32AM +0200, Stephan Houben wrote:
I think it's much more common than (say) sequence repetition:

    a = [None]*5

which has had an operator for a long, long time.
True. But enumerate is a built-in, and nearly always used in a single
context:

    for i, x in enumerate(sequence):
        ...

A stranger to Python could almost be forgiven for thinking that
enumerate is part of the for-loop syntax. In contrast, chaining (while
not as common as, say, numeric addition) happens in variable contexts:
in expressions, as arguments to function calls, etc.

It is absolutely true that this proposal brings nothing new to the
language that cannot already be done. It's syntactic sugar. So I guess
the value of it depends on whether or not you chain iterables enough
that you would rather use an operator than a function.
A minimal proposal would be to promote "chain" to builtins.
Better than nothing, I suppose. -- Steve

Hi Steven,

To put this into perspective, I did some greps on Sagemath, being the
largest Python project I have installed on this machine (1955 .py
files).

Occurrences:

    enumerate:               922
    zip:                     585
    itertools.product:        67
    itertools.combinations:   18
    itertools.islice:         17
    chain:                    14  (with or without itertools. prefix)

This seems to confirm my gut feeling that "chain" just isn't that common
an operation; even among itertools functions, product, combinations and
islice are more common. Based on this I would say there is little
justification to even put "chain" in builtins, let alone to give it
dedicated syntax.

I also note that * for repetition is only supported for a few iterables
(list, tuple), incidentally the same ones which support + for sequence
chaining.

Stephan

2017-06-27 12:38 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:

On Tue, Jun 27, 2017 at 01:32:05PM +0200, Stephan Houben wrote:
And one which is especially focused on numerical processing, not really
the sort of thing that does much iterator chaining. That's hardly a fair
test -- we know there are applications where chaining is not important
at all. It's the applications where it *is* important that we should be
looking at.

-- Steve

It's the applications where it *is* important that we should be looking at.
Um, yes, but given our relative positions in this debate, the onus is
not really on *me* to demonstrate such an application, right? That would
just confuse everybody ;-)

(FWIW, Sagemath is not mostly "numerical processing", it is mostly
*symbolic* calculation and involves a lot of complex algorithms and
data structures, including sequences.)

Stephan

2017-06-27 13:48 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:

Just another syntactical suggestion: the binary ++ operator is used as concat in various contexts in various languages, and is probably less likely to confuse people as being either a logical or binary &. On Tue, Jun 27, 2017 at 6:53 AM Stephan Houben <stephanh42@gmail.com> wrote:

Unfortunately this is existing syntax:

    a++b

is parsed as

    a+(+b)

Stephan

On 27 Jun 2017 6:03 p.m., "Joshua Morton" <joshua.morton13@gmail.com> wrote:
[...]

2017-06-27 Stephan Houben <stephanh42@gmail.com> dixit:
Is "itertools.chain" actually that common? Sufficiently common to warrant its own syntax?
Please note that this can be turned around: maybe they are not as common
as they could be *because* of all the burden of importing from a
separate module -- after all, we are talking about a rather simple
operation, so using lists and `+` just wins because of our
(programmers') laziness. :-)

Cheers.
*j

On Tue, Jun 27, 2017 at 08:40:23PM +1200, Greg Ewing wrote:
Except to the human reader, who can be forgiven for thinking "What the
fudge is that semicolon doing there???" (or even less polite). I don't
know of any language that uses semicolon as an operator. That looks like
a bug magnet to me. Consider what happens when (not if) you write (a, b)
instead, or when you write

    items = (x; y)

and it happens to succeed because x and y are iterable. To be perfectly
honest, and no offence is intended, this suggestion seems so wacky to me
that I'm not sure whether you intended for it to be taken seriously or
not.

-- Steve

TL;DR: if people really object to & doing double-duty as bitwise-AND and
chaining, there's always && as a possibility.

On Tue, Jun 27, 2017 at 12:47:40AM -0700, David Mertz wrote:
With operator overloading, that's a risk for any operator, and true for
everything except literals. What will `x * y` do? How about `y * x`?
They're not even guaranteed to call the same dunder method. But this is
a problem in theory far more than in practice. In practice, you know
what types you are expecting, and if you don't get them, you either
explicitly raise an exception, or wait for something to fail.
"Consenting adults" applies, and we often put responsibility back on the
caller to do the right thing.

If you expect to use & on iterable arguments, it is reasonable to put
the responsibility on the caller to only provide "sensible" iterables,
not something wacky like an infinite generator (unless your code is
explicitly documented as able to handle them) or one that overrides
__(r)and__. Or you could check for it yourself, if you don't trust the
argument:

    if hasattr(arg, '__and__') or hasattr(arg, '__rand__'):
        raise Something

But that strikes me as overkill. You don't normally check for dunders
before using an operator, and we already have operators that can return
different types depending on the operands:

    % can mean modulo division or string interpolation
    * can mean sequence repetition or numeric multiplication
    + can mean numeric addition or sequence concatenation

Why is "& can mean iterable chaining or bitwise-AND" uniquely confusing?
That's not really chaining, per se -- it is concatenating two arrays to
create a third array (probably by copying the array elements). If you
wanted to chain a numpy array, you would either use itertools.chain
directly, or call iter(myarray) before using the & operator.
Do you inspect the dunder methods of objects before calling + or * or & currently? Why would you need to start now? Since there's no way of peering into an object's dunder methods and seeing what they do (short of reading the source code), you always have an element of trust and hope whenever you call an operator on anything but a literal.
It's not "obvious" now, since a and b can do anything they like in their dunder methods. They could even inspect the call stack from __next__ and if they see "chain" in the stack, erase your hard drive. Does that mean we don't dare call chain(a, b)? I don't think this proposal adds any new risk. All the risk already exists as soon as you allow method overloading. All we're doing is saying is "try the method overloading first, and if that isn't supported, try iterable chaining second before raising TypeError".
Indeed -- numpy arrays probably do that, as do sets and frozensets. I didn't say it was pathological, I said it was uncommon.
There's always && for iterator chaining. -- Steve

On Tue, Jun 27, 2017 at 4:44 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I don't think it's "uniquely confusing." Just more so than the other
examples you give. For example, I might write functions like these
(untested):

    def modulo1(i: int, j: int) -> int:
        return i % j

    def modulo2(s: str, t: tuple) -> str:
        return s % t

And similar examples for `*` and `+`. When I try to write this:

    def ampersand(x: Iterable, y: Iterable) -> Iterable:
        return x & y

More ambiguity exists. The type signature works for both Numpy arrays
and generators (under the proposed language feature), but the function
does something different... in a way that is "more different" than I'd
expect.

That said, I like the idea of having iterators that act magically to
fold in general iterables after an .__and__() or .__add__(), as proposed
by Brendan down-thread. Without any language change we could have:

    chainable(x for x in blah) + [1, 2, 3] + "hello"

And I would like a language change that made a number of common iterable
objects "chainable" without the wrapper. This wrapper could of course be
used as a decorator too. E.g. generator comprehensions, things returned
by itertools functions, range(), enumerate(), zip(), etc. This wouldn't
promise that EVERY iterable or iterator had that "chainable" behavior,
but it would cover 90% of the use cases. And I wouldn't find it
confusing, because the leftmost object would be the one determining the
behavior, which feels more intuitive and predictable.

I don't hate `&&`, but I think this approach makes more sense.
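A minimal sketch of such a wrapper, using the name `chainable` from
above (this implementation is illustrative, not something from the
thread or from any library):

    from itertools import chain

    class chainable:
        """Wrap an iterable so that + means chaining, not concatenation."""

        def __init__(self, iterable):
            self._it = iter(iterable)

        def __iter__(self):
            return self._it

        def __add__(self, other):
            # Return another chainable, so + calls keep stacking.
            return chainable(chain(self._it, other))

        def __radd__(self, other):
            return chainable(chain(other, self._it))

    print(list(chainable(x for x in [0, 1]) + [2, 3] + "ab"))
    # -> [0, 1, 2, 3, 'a', 'b']

Only the leftmost operand needs to be wrapped; each + returns a new
wrapper, so the chain keeps growing to the right.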

On 27.06.2017 21:27, David Mertz wrote:
I think that most people in favor of this proposal agree with you here.
Let's start with something simple which can be extended bit by bit to
cover more and more use-cases. I for one would also include right-handed
iterators/generators, because I definitely know of real-world usage
(just recently), but that can wait until you feel comfortable with it as
well.

If it's a language change, I would like the plus operator to be it, as
it integrates well with lists, such as:

    generator + (list1 + list2)

It can sometimes be necessary to group things up like this. Using a
different operator symbol here I would find confusing. Plus looks like
concat to me.

Regards,
Sven

On 2017-06-27 00:47, David Mertz wrote:
Hmmm, is the proposal really meant to include behavior that is global
and non-overridable? My understanding was that the proposal would
effectively be like defining a default __and__ (or whatever) on some
basic iterator types. Individual iterables (or iterators) could still
define their own magic methods to define their own behavior. Because of
that, I wouldn't expect it to be obvious what would happen in your case.
If I import types from two random libraries, I can't expect to know what
any operator does on them without reading their docs.

Also because of that, I think it might be a bit much to expect this new
concat operator to work on ALL iterables/iterators. Nothing else really
works that way; types have to define their own operator behavior.
Iterators can be instances of any class that defines a __next__, so I
don't see how we could support the magic-concat-everything operator
without interfering there.

So. . . wouldn't it be somewhat more reasonable to define this concat
operator only on actual generators, and perhaps on instances of the
common iterator types returned from zip, enumerate, etc.? Someone
earlier in the thread said that would be "weird" because it would be
less generic than itertools.chain, but it seems to me it would cover
most of the needed use cases (including the one that was initially given
as a motivating example).

Also, this generator.__add__ could be smart about handling other
iterables on the right of the operator, so you could do

    (x for x in blah) + [1, 2, 3] + "hello"

. . . and, as long as you started with a regular generator, it could
work, by having the magic method return a new instance of a type that
also has this behavior. This would be a bit odd, because I think most
existing Python types don't try to do this kind of thing (subsuming many
different types of right-operands). But I think it would be useful.

In the interim, it could be played with by just making a class that
implements the __add__ (or whatever), so you could do

    cool_iterator(x for x in blah) + [1, 2, 3] + "hello"

. . . and just wrapping the leftmost operand would be enough to give you
nice syntax for chaining all the rest.

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no
path, and leave a trail."
--author unknown

On Mon, Jun 26, 2017 at 01:23:36PM +1000, Steven D'Aprano wrote:
I remembered there is a precedent here. The == operator tries __eq__
before falling back on object identity, at least in Python 2:

    py> getattr(object(), '__eq__')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'object' object has no attribute '__eq__'
    py> object() == object()
    False

-- Steve

On 2017-06-25 20:23, Steven D'Aprano wrote:
I have a counter-proposal: introduce the iterator chaining operator "&":
iterable & iterable --> itertools.chain(iterable, iterable)
I like this suggestion. Here's another color that might be less
controversial:

    iterable3 = iterable1.chain(iterable2)

Perhaps more obvious than &, and easier to use than "from itertools
import chain...".

-Mike

On Tue, Jun 27, 2017 at 1:57 PM, Mike Miller <python-ideas@mgmiller.net> wrote:
How do you chain it1, it2, it3, etc.? I guess `it1.chain(it2.chain(it3))`
... but that starts to become distinctly less readable IMO. I'd much
rather spell `chain(it1, it2, it3)`.

On 2017-06-27 14:02, David Mertz wrote:
Even if this "chain" only took one argument, you could do it1.chain(it2).chain(it3). But I don't see why it couldn't take multiple arguments as you suggest. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On 28 June 2017 at 07:13, Mike Miller <python-ideas@mgmiller.net> wrote:
While I haven't been following this thread closely, I'd like to note
that arguing for a "chain()" builtin has the virtue that it would just
be arguing for the promotion of the existing itertools.chain function
into the builtin namespace.

Such an approach has a lot to recommend it:

1. It has precedent, in that Python 3's map(), filter(), and zip() are
   essentially Python 2's itertools.imap(), ifilter(), and izip()
2. There's no need for a naming or semantics debate, as we'd just be
   promoting an established standard library API into the builtin
   namespace
3. Preserving compatibility with older versions is straightforward:
   just do an unconditional "from itertools import chain"
4. As an added bonus, we'd also get "chain.from_iterable" as a builtin
   API

So it would be good to have a short PEP that argued that since chaining
arbitrary iterables is at least as important as mapping, filtering, and
zipping them, itertools.chain should be added to the builtin namespace
in 3.7+ (but no, I'm not volunteering to write that myself).

As a *separate* discussion, folks could then also argue for the addition
of a `__lshift__` operator implementation specifically on iterator
chains that lets you write:

    full_chain = chain(it1) << it2 << it3  # Incrementally create new chains
    full_chain <<= it4                     # Extend an existing chain

I'd be surprised if such a proposal got accepted for 3.7, but it would
make a good follow-up discussion for 3.8 (assuming chain() made it into
the 3.7 builtins).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
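A rough sketch of how the << half could be prototyped today, by
subclassing itertools.chain (which current CPython permits); the name
`lchain` is invented for illustration:

    from itertools import chain

    class lchain(chain):
        def __lshift__(self, other):
            # Chain this (possibly already-started) chain with one more
            # iterable, returning a new lchain so << keeps working.
            return lchain(self, other)
        # No __ilshift__ needed: <<= falls back to __lshift__ and
        # simply rebinds the name.

    full_chain = lchain([1, 2]) << (3, 4) << "ab"
    print(list(full_chain))  # [1, 2, 3, 4, 'a', 'b']

    full_chain = lchain([1])
    full_chain <<= [2, 3]    # extend an existing chain
    print(list(full_chain))  # [1, 2, 3]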

On 6/27/2017 10:47 PM, Nick Coghlan wrote:
A counter-argument is that there are other itertools that deserve
promotion, by usage, even more. But we need to see comparisons from more
than one limited corpus. On the other hand, there might be a theory
argument that chain is somehow more basic, akin to map, etc., in a way
that others are not.
-- Terry Jan Reedy

On 28 June 2017 at 14:30, Terry Reedy <tjreedy@udel.edu> wrote:
The main rationale I see is the one that kicked off the most recent
discussion, which is that in Python 2, you could readily chain the
output of map(), filter(), zip(), range(), dict.keys(), dict.values(),
dict.items(), etc. together with "+", simply because they all returned
concrete lists.

In Python 3, you can't do that as easily anymore, since they all return
either iterators or computed containers that don't support "+". While
there are good reasons not to implement "+" on those iterators and
custom containers, we *can* fairly easily restore builtin concatenation
support for their outputs, and we can do it in a way that's friendly to
all implementations that already provide the itertools.chain API.

Cheers,
Nick.

On 28 June 2017 at 05:30, Terry Reedy <tjreedy@udel.edu> wrote:
Indeed. I don't recall *ever* using itertools.chain myself. I'd be
interested in seeing some usage stats to support this proposal. As an
example, I see 8 uses of itertools.chain in pip and its various vendored
packages, as opposed to around 30 uses of map (plus however many list
comprehensions are used in place of maps). On a very brief scan, it
looks like the various other itertools are used less than chain, but
with only 8 uses of chain, it's not really possible to read anything
more into the relative frequencies.

Paul

On 28 June 2017 at 18:54, Paul Moore <p.f.moore@gmail.com> wrote:
The other thing to look for would be list() and list.extend() calls. I
know I use those quite a bit in combination with str.join, where I don't
actually *need* a list, it's just currently the most convenient way to
accumulate all the components I'm planning to combine. And if you're
converting from Python 2 code, then adding a few list() calls in
critical places in order to keep the obj.extend() calls working is
likely to be easier in many cases than switching over to using
itertools.chain.

For simple cases, that's fine (since a list of direct references will be
lower overhead than accumulating a chain of short iterables), but
without builtin support for iterator concatenation, it's currently a
nonlocal refactoring (adding "from itertools import chain" at the top of
the file) to switch completely to the "pervasive iterators" model when
folks actually do want to do that.

Cheers,
Nick.
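An illustration of the accumulate-then-join pattern described above, and
the chain-based rewrite (the data and function names here are made up
for the demo):

    from itertools import chain
    from types import SimpleNamespace

    sections = [SimpleNamespace(title="A", lines=["a1", "a2"]),
                SimpleNamespace(title="B", lines=["b1"])]

    def render_with_lists(sections):
        parts = []
        for sec in sections:
            parts.append(sec.title)
            parts.extend(sec.lines)   # a list built only to be joined
        return "\n".join(parts)

    def render_with_chain(sections):
        # Same output; the manual accumulator disappears. (str.join
        # still materializes its argument internally, so this is about
        # readability rather than memory here.)
        return "\n".join(chain.from_iterable(
            chain([sec.title], sec.lines) for sec in sections))

    assert render_with_lists(sections) == render_with_chain(sections)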

On 28.06.2017 11:09, Nick Coghlan wrote:
This is exactly the reason why I also doubt that Stephan's Sagemath
stats tell us anything beyond "chain isn't used that much". Iterators
are only a nice-to-have if you work with simple lists of up to 1000
items; current hardware is able to fix that for you. There are simply
more readable ways of doing "chaining of sequences" for many cases, even
if you are already on Python 3. In the end, list and the "+" operator
are the best way of "doing sequences".

Regards,
Sven

On Wed, Jun 28, 2017 at 11:54 AM, Paul Moore <p.f.moore@gmail.com> wrote:
Indeed. I don't recall *ever* using itertools.chain myself.
In fact, me neither. Or maybe a couple of times. For such a basic task,
it feels more natural to write a generator function, or even to turn it
into a list, if you can be sure that the 'unnecessary' lists will be
small and that the code won't be a performance bottleneck.

To reiterate on this some more: one of the nice things of Python 3 is
(or could be) the efficiency of not making unnecessary lists by default.
But for the programmer/beginner it's not nearly as convenient with the
views as it is with lists. Beginners quickly need to learn about
wrapping things in list() to get around that. Also, generators are
really nice, and chaining them is just as useful/necessary as extending
or concatenating lists.

Chaining generators with other iterables is certainly useful, but when
all the parts of the chained object are iterable but not sequences, that
seems like an invitation to use list() at some point in the code. So
whatever the outcome of this discussion (I hope there is one, whether it
is by adding iterator-related builtins or something more sophisticated),
it should probably take into account possible future ways of dealing
with some kind of "lazy lists".

However, I'm actually not sure the syntax for chaining
generators/iterables should necessarily be the same as for chaining
arbitrary sequences. The programmer needs to be well aware of whether
the resulting object is a Sequence or 'just' a generator.

-- Koos

On 28.06.2017 14:00, Koos Zevenhoven wrote:
The programmer needs to be well aware of whether the resulting object is a Sequence or 'just' a generator.
Could you elaborate more on **why**?

Regards,
Sven

PS: I consider this proposal to be like allowing the addition of floats
and ints together. If I don't know whether there was a float in the sum,
I don't know whether my result will be a float.

On Wed, Jun 28, 2017 at 3:18 PM, Sven R. Kunze <srkunze@mail.de> wrote:
For a moment, I was wondering what the double emphasis was for, but then
I realized you are simply calling `statement.__why__()` directly instead
of the recommended `spoiler(statement)`. But sure, I just got on
vacation and I even found a power extension cord to use my laptop at the
pool, so what else would I do ;).

It all depends on what you need to do with the result of the
concatenation. When all you need is something to iterate over, a
generator-like thingy is fine. But when you need something for indexing
and slicing or len() etc., you want to be sure that that is what you're
getting. But maybe someone passed you an argument that is not a
sequence, or you forgot whether a function returns a sequence or a
generator. In that case, you want the error right away, instead of from
some completely different piece of code somewhere that thinks it's
getting a sequence. I don't think Python should depend on a static type
checker to catch the error early. After all, type hints are optional.

> PS: I consider this proposal to be like allowing the addition of
> floats and ints together. If I don't know whether there was a float in
> the sum, I don't know whether my result will be a float.

Not to say that the float/int case is never problematic, but the
situation is still different. Often when a float makes any sense, you
can work with either floats or ints and it doesn't really matter. But if
you specifically *need* an int, you usually don't call functions that
return floats. And if you do use division etc., you probably need to
think about floor/ceil/closest anyway. And yes, there have probably been
Python 2->3 porting bugs where / division was not appropriately replaced
with //.

But regarding containers, it often makes just as much sense for a
function to return a generator as it does to return a sequence. The name
or purpose of a function may give no hint about whether an iterable or a
sequence is returned, and you can't expect everyone to prefix their
function names with iter_ and seq_ etc. And it's not just function
return values, it's also arguments passed into your functions.

-- Koos

On 28.06.2017 16:01, Koos Zevenhoven wrote:
Doing this for years now. Sometimes, when 'statement.__why__()' returns None, 'spoiler(statement)' returns some thought-terminating cliché. ;)
But sure, I just got on vacation and I even found a power extension cord to use my laptop at the pool, so what else would I do ;).
Good man. Today, a colleague of mine showed me a mobile mini-keyboard with a phone bracket (not even a dock). So, having his 7'' smartphone, he can work from his vacations and answer emails as well. ;) Cheap notebook replacement, if you don't prefer large screens and keyboards. :D
I understand that. In the end, I remember people on this mailing list
recommending that I use "list(...)" to make sure I've got one in my
hands. I remember this being necessary in the conversion process from
Python 2 to 3. The pattern is already here.
Division is one thing, numeric input parameters from unknown sources is another. In this regard, calling "int(...)" or "list(...)" follows the same scheme IMO.
Exactly. Nor do I want those prefixes. And I can tell you they aren't
necessary in practice at all.

Just my 2 cents on this: at work, we heavily rely on Django. Django
provides a so-called QuerySet type, its db-result abstraction. Among
those querysets, our functions return lists and sets with no indication
whatsoever of which type it may be. It works quite well and we haven't
had any issues with that. If we need a list, we wrap it with
"list(...)". It's as simple as that.

The valid concern that it could be confusing which type the return value
might have is usually an abstract one. I can tell you that in practice
it's not really an issue to talk about.

Regards,
Sven

On Wed, Jun 28, 2017 at 9:01 PM, Sven R. Kunze <srkunze@mail.de> wrote:
Oh, I've been very close to getting one of those. But then I should probably get a pair of glasses too ;).
That pattern annoys people and negates the benefits of views and generators.
Sure, but you may want to turn your unknown sources into something predictable as soon as possible. You'll need to deal with the errors in the input anyway. Returning sequences vs generators is a different matter. You don't want to turn generators into lists if you don't have to.
[...] Just my 2 cents on this:
Very often one doesn't really need a list, but just something that has
indexing, slicing and/or len(). Wrapping things with list() can be OK,
but it uses memory and is O(n). Generating lists from all kinds of
iterables all the time is just a whole bunch of unnecessary overhead.
But yes, it happens, because that's the convenient way of doing it now.
That's like going back to Python 2, but with additional calls to list()
required. Maybe you're lucky that your iterables are small and not a
bottleneck, and/or you just don't feel guilty every time you call list()
where you shouldn't have to ;).

-- Koos

On 28.06.2017 20:37, Koos Zevenhoven wrote:
Oh, I've been very close to getting one of those. But then I should probably get a pair of glasses too ;).
:D
That pattern annoys people and negates the benefits of views and generators.
Sure, that's why I am in favor of this proposal. It would remove the necessity to do that in various places. :)
That's a good point.
Yep, exactly. That's why I like an easier way of concatenating them with
no bells and whistles -- preferably like lists today. ;)

Cheers,
Sven

On 28Jun2017 09:54, Paul Moore <p.f.moore@gmail.com> wrote:
I don't use it often, but when I do it is very handy. While I'm not
arguing for making it a builtin on the basis of my own use (though I've
no objections either), a quick grep shows:

My maildb kit uses chain to assemble multiple related header values:

    *chain( msg.get_all(hdr, [])
            for hdr in ('to', 'cc', 'bcc', 'resent-to', 'resent-cc') )

Two examples where I use it to insert items in front of an iterable:

    chunks = chain( [data], chunks )

    blocks = indirect_blocks(chain( ( topblock, nexttopblock ), blocks ))

Neither of these is amenable to list rephrasings, because the tail
iterables ("chunks" and "blocks") are of unknown and potentially large
size. And there are a few other cases whose uses are harder to
succinctly describe, but generally "iterable flattening". So it is
uncommon for me, but very useful when I want it.

Just some (small) data points.

Cheers,
Cameron Simpson <cs@zip.com.au>
participants (21):
- Brendan Barnwell
- Cameron Simpson
- Chris Angelico
- Danilo J. S. Bellini
- David Mertz
- Erik Bray
- Greg Ewing
- Jan Kaliszewski
- Joao S. O. Bueno
- Joshua Morton
- Koos Zevenhoven
- lucas
- Mike Miller
- Nick Coghlan
- Paul Moore
- Serhiy Storchaka
- Stephan Houben
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy
- Wes Turner