
I would like to propose that in Python 3 it be easy to chain and slice iterators, just as you can easily add and slice lists and tuples. I think there are two ways to do this:

1. Implement '+' and slicing ([1:2:3]) for iterators and generators by default, resulting in itertools' chain(a, b) and islice(a, b, c) respectively.
2. Have itertools' chain and islice imported by default.

Since Python 3 returns zip, map, dict.items and so on as iterators, working with those kinds of objects is more difficult without these methods around. And since these operations are important enough to have for lists, they seem important enough for iterators too.

On Sat, Jan 03, 2015 at 03:37:03PM +0200, yotam vaknin wrote:
The iterator protocol is intentionally simple. To create an iterator, you only need to define two methods:

(1) __iter__, which returns the instance itself;
(2) __next__, which returns the next item and raises StopIteration when done.

With your proposal, you would have to define two or three more methods:

(3) __add__ and possibly __radd__, to chain iterators;
(4) __getitem__, for slicing.

There is nothing stopping you from adding these methods to your own iterator classes, but with your proposal that would be compulsory for all iterators.
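To make the two-method protocol concrete, here is a minimal sketch of a hand-written iterator. The class name Countdown is made up for illustration and is not part of the thread.

```python
class Countdown:
    """Hypothetical iterator yielding n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal exhaustion by raising StopIteration.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


print(list(Countdown(3)))  # [3, 2, 1]
```

Nothing else is required for the object to work in for loops, list(), or the itertools functions.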
It only takes a single line of code to get iterator chaining and slicing:

    from itertools import chain, islice

And now you can chain and slice any iterator, regardless of where it came from. I don't think that is difficult.

*If* iterators were instances of a concrete base class, then it would make sense to add chain and slice methods to that base class so all other iterators could inherit from it. But they aren't: iterators in Python are values which obey a protocol, not values related by inheritance. That means that a functional approach, like itertools, is more appropriate.

I can see the appeal of using + and [a:b:c] syntax instead of the function syntax chain() and islice(), but I don't think the advantage is enough to outweigh the costs.

-- Steve
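As a small illustration of the functional spelling, the sketch below chains a map iterator with a zip iterator and then slices the result. The sample data and variable names are invented for the example.

```python
from itertools import chain, islice

squares = map(lambda x: x * x, range(5))  # 0, 1, 4, 9, 16
pairs = zip("ab", [1, 2])                 # ('a', 1), ('b', 2)

# chain() plays the role of the proposed '+'.
combined = chain(squares, pairs)

# islice(it, 1, 6, 2) plays the role of the proposed it[1:6:2]:
# it keeps the items at positions 1, 3 and 5.
picked = list(islice(combined, 1, 6, 2))
print(picked)  # [1, 9, ('a', 1)]
```

Both functions accept any iterable, so the same two calls work on files, generators, or user-defined iterators.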

Sorry, I probably wasn't clear enough. My idea was to add these methods (__add__, __getitem__) to the already available iterators (map, zip, dict.items) and generators by default, not to make these methods part of the iterator protocol.

2015-01-03 17:28 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:

On Jan 3, 2015, at 16:53, yotam vaknin <tomirendo@gmail.com> wrote:
Sorry, I probably wasn't clear enough. My idea was to add these methods (__add__, __getitem__) to the already available iterators (map, zip, dict.items) and generators by default, not to make these methods part of the iterator protocol.
Well, dict.items doesn't return an iterator, it returns an iterable view.

More importantly, if you add this just to generators and map and zip, it won't apply to (among other things) filter; the built-in iterators for list, tuple, set, dict, and the dict views; the special type used to iterate types that implement __getitem__ but not __iter__; any of the iterators from itertools--including the result of slicing or chaining two map iterators; files; csv readers; XML iterparse; …

I think that would make the language a lot worse. In general, code doesn't care whether it has gotten an iterator from mapping a function over a file, chaining two iterators together, or iterating a sequence. If you wrote some code that expected to get a generator, used the + operator, and then changed the calling code to filter that generator, it would break, for no good reason.

If you really wanted to, you could make this general pretty easily: change the meaning of the + operator so that, after checking __add__ and __radd__, before raising a TypeError, it checks whether both operands have __iter__ methods and, if so, returns a chain of the two.

Slicing has another problem, however. First, it's a bit odd to be able to do i[5:7] but not i[5]. But, more seriously, I wouldn't expect i[5:7] to give me an iterable that, when accessed, discards the first 5 elements of i. And imagine how confusing this code would be:

    >>> i = iter(range(10))
    >>> a = i[2:4]
    >>> b = i[4:6]
    >>> list(a)
    [2, 3]
    >>> list(b)
    [8, 9]

Hiding the fact that i is an iterator rather than a sequence will just confuse your readers.
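The slicing session in this message is hypothetical, since iterators do not support subscripting, but the same surprise can be reproduced today with itertools.islice. The sketch below assumes the proposed i[a:b] would behave like islice(i, a, b), as the original proposal suggests.

```python
from itertools import islice

i = iter(range(10))
a = islice(i, 2, 4)  # stands in for the hypothetical i[2:4]
b = islice(i, 4, 6)  # stands in for the hypothetical i[4:6]

first = list(a)   # consumes 0..3 from i, keeps 2 and 3
second = list(b)  # skips the next four items (4..7), keeps 8 and 9
print(first, second)  # [2, 3] [8, 9]

# A real sequence gives the result most readers would expect.
seq = list(range(10))
print(seq[2:4], seq[4:6])  # [2, 3] [4, 5]
```

Both slices share and advance the same underlying iterator, which is why the second result depends on when the first one was consumed.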

On Jan 3, 2015, at 16:53, yotam vaknin <tomirendo@gmail.com> wrote:
Sorry, I probably wasn't clear enough. My idea was to add these methods (__add__, __getitem__) to the already available iterators (map, zip, dict.items) and generators by default, not to make these methods part of the iterator protocol.
Reading this again, I think what you _really_ want is probably a generalized view protocol, where views of sequences act like (in fact, _are_) sequences, but with the main advantages of iterators (no storage or upfront computation). Like dictionary views (which is why I chose the name "view", of course).

And there's no reason map _couldn't_ return a view instead of an iterator. When called on (all) sequences it would be a sequence view; when called on (any) iterators it would be an iterator view; otherwise it would be a non-sequence non-iterator (that is, reusable, but not indexable) iterable view. But in any case, it would never call the mapped function on a given mapped value until the result is requested. In other words:

    >>> def spam(x):
    ...     print(x)
    ...     return x+1
    >>> m = map(spam, range(5))
    >>> m
    <MapSequenceView at 0x12345678>
    >>> m[2]
    2
    3
    >>> ms = m[2:4]
    >>> ms[1]
    3
    4

You can build view-based zip similarly, and filter, and most of the functions in itertools, and so on (although note that in some cases--like filter--a view of a sequence wouldn't be a sequence). You could also make slices into lazy views instead of copying (as NumPy already does with its arrays).

In fact, Python could have designed all of its collections and higher-order functions around views instead of iterators. That's what Swift did. (Swift goes further, with a hierarchy of kinds of indexing, based on C++, instead of just iterables and sequences.) I wrote a blog post last year (based on the first beta of Swift, which had some oddities in its implementation) examining how this could work in Python terms (http://stupidpythonideas.blogspot.com/2014/07/swift-style-map-and-filter-vie...). I think I've also got a partial implementation somewhere on GitHub of map, filter, and maybe a few more functions implemented as views.

But it would be a pretty major change to Python to move from iterators to views. And iterators are much simpler to create than views, so the tradeoff probably wouldn't be worth it, even if it weren't for the historical/compatibility issue. (It's much the same with Haskell-style lazy lists; Python iterables can only substitute for lazy lists 90% of the time, but that doesn't mean lazy lists are a better language choice.)
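A rough sketch of the sequence-view idea, for the sequence case only, might look like the following. This is not the implementation mentioned in the message; the class is written from scratch for illustration, borrows the name MapSequenceView from the session above, and assumes the underlying object supports len(), integer indexing, and slicing.

```python
from collections.abc import Sequence


class MapSequenceView(Sequence):
    """Hypothetical lazy map over a sequence; not a real builtin."""

    def __init__(self, func, seq):
        self._func = func
        self._seq = seq

    def __len__(self):
        return len(self._seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slice the underlying sequence; the function is not called yet.
            return MapSequenceView(self._func, self._seq[index])
        # Apply the function only when a single item is requested.
        return self._func(self._seq[index])


calls = []


def spam(x):
    calls.append(x)  # record each call instead of printing
    return x + 1


m = MapSequenceView(spam, range(5))
print(m[2])   # 3
ms = m[2:4]   # no call to spam happens here
print(ms[1])  # 4
print(calls)  # [2, 3]
```

Inheriting from collections.abc.Sequence supplies iteration, containment tests, and reversed() on top of the two methods defined here.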

On 4 January 2015 at 07:14, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
But it would be a pretty major change to Python to move from iterators to views. And iterators are much simpler to create than views, so the tradeoff probably wouldn't be worth it, even if it weren't for the historical/compatibility issue. (It's much the same with Haskell-style lazy lists; Python iterables can only substitute for lazy lists 90% of the time, but that doesn't mean lazy lists are a better language choice.)
It's worth noting that many types implement Mapping.(keys,values,items) as iterators in Python 3 rather than as views, and generally don't receive any complaints from users. Iterators are very simple to implement, cover 90% of the use cases, and in those cases where they don't, you can usually write a custom wrapper around the original containers with view-like behaviour.

Providing an operator-based spelling for itertools.chain, itertools.islice and itertools.repeat is tempting enough on the surface to be suggested every few years (e.g. [1]), but it creates so many new complications on the *implementation* side that the benefits just aren't worth the cost in additional complexity.

Cheers,
Nick.

[1] https://mail.python.org/pipermail/python-ideas/2010-April/006983.html

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
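For code that does want the operator spelling, an opt-in wrapper is one possible route that leaves the iterator protocol untouched. The sketch below is an assumption-laden illustration, not something proposed in the thread: the class name Chainable is invented, and only slices (not integer indexes) are supported.

```python
from itertools import chain, islice


class Chainable:
    """Hypothetical opt-in wrapper adding + and slice syntax to an iterable."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __add__(self, other):
        return Chainable(chain(self._it, other))

    def __radd__(self, other):
        return Chainable(chain(other, self._it))

    def __getitem__(self, index):
        if not isinstance(index, slice):
            raise TypeError("only slices are supported")
        # islice accepts None for start, stop and step.
        return Chainable(islice(self._it, index.start, index.stop, index.step))


result = list((Chainable(map(str, range(3))) + zip("ab", "cd"))[1:4])
print(result)  # ['1', '2', ('a', 'c')]
```

The wrapper still consumes the underlying iterator, so the surprises with repeated slicing described earlier in the thread apply to it as well.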

On 4 January 2015 at 17:45, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
What people "should" do and what they actually do in practice often differ wildly. In this particular case, folks migrating from Python 2 will frequently rename existing iter* methods to be the Python 3 implementations of the base methods. It's technically a non-compliant implementation of the Python mapping protocol, but you'll only encounter problems if you attempt to use that container implementation with code that relies on the Python 3 view behaviour.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
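The difference only shows up when calling code relies on view behaviour. As a sketch, with a made-up container class standing in for a Python 2 port, compare an items() that returns a one-shot iterator with the view returned by a real dict:

```python
class LegacyMapping:
    """Hypothetical container ported from Python 2 by renaming iteritems()."""

    def __init__(self, data):
        self._data = dict(data)

    def items(self):
        # Returns a one-shot generator, not a view.
        return ((k, v) for k, v in self._data.items())


legacy_items = LegacyMapping({"a": 1}).items()
view_items = {"a": 1}.items()

# A view can be iterated repeatedly and supports len().
print(list(view_items), list(view_items))      # [('a', 1)] [('a', 1)]
print(len(view_items))                         # 1

# The iterator is exhausted after the first pass.
print(list(legacy_items), list(legacy_items))  # [('a', 1)] []
```

Code that loops over items() exactly once never notices the difference, which matches the observation that such containers rarely draw complaints.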

In addition to itertools.chain and itertools.islice, this is possible with a number of third-party packages:

* http://toolz.readthedocs.org/en/latest/api.html#itertoolz (concat, pluck, first, last, take)
* https://github.com/kachayev/fn.py#streams-and-infinite-sequences-declaration (<< 'Stream' operator)
* http://funcy.readthedocs.org/en/latest/seqs.html

Stdlib docs for this:

* https://docs.python.org/2/howto/functional.html#iterators
* https://docs.python.org/2/tutorial/classes.html#iterators
* https://docs.python.org/2/library/stdtypes.html#iterator-types
* https://docs.python.org/2/reference/datamodel.html#object.__iter__

On Sat, Jan 3, 2015 at 7:37 AM, yotam vaknin <tomirendo@gmail.com> wrote:

participants (7)
- Andrew Barnert
- Devin Jeanpierre
- Nick Coghlan
- Steven D'Aprano
- Sturla Molden
- Wes Turner
- yotam vaknin