[Python-ideas] Slicing and Chaining iterables.

Andrew Barnert abarnert at yahoo.com
Sat Jan 3 21:40:59 CET 2015

On Jan 3, 2015, at 16:53, yotam vaknin <tomirendo at gmail.com> wrote:

> Sorry, I probably wasn't clear enough.
> My idea was to add these methods (__add__, __getitem__) to the already available iterators (map, zip, dict.items) and generators by default, not to make these methods part of the iterator protocol.

Well, dict.items doesn't return an iterator, it returns an iterable view.
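To make the view/iterator distinction concrete, here is a minimal sketch (the variable names are mine, chosen only for illustration). A view can be traversed as often as you like and has no __next__; only iter() on it gives you a one-shot iterator:

```python
from collections.abc import Iterator

d = {"a": 1, "b": 2}
view = d.items()

# The view is iterable, but it is not itself an iterator.
is_iterator = isinstance(view, Iterator)
has_next = hasattr(view, "__next__")

# It can be traversed repeatedly without being exhausted.
first_pass = list(view)
second_pass = list(view)

# iter() on the view produces a separate, one-shot iterator.
it = iter(view)
consumed = list(it)
leftover = list(it)  # the iterator is now exhausted
```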

More importantly, if you add this just to generators and map and zip, it won't apply to (among other things) filter; the built-in iterators for list, tuple, set, dict, and the dict views; the special type used to iterate types that implement __getitem__ but not __iter__; any of the iterators from itertools--including the result of slicing or chaining two map iterators; files; csv readers; XML iterparse; …

I think that would make the language a lot worse. In general, code doesn't care whether it's gotten an iterator from mapping a function over a file, chaining two iterators together, or iterating a sequence. If you wrote some code that expected to get a generator, used the + operator, and then changed the calling code to filter that generator, it would break, for no good reason.
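As a rough illustration of that uniformity today (the sample sources are made up for the example), itertools.chain treats a generator, a filter, a list iterator, and a map over a file-like object exactly the same way:

```python
import io
from itertools import chain


def gen():
    # A plain generator.
    yield 1
    yield 2


# Four iterators of four different types; the consuming code does not care.
sources = [
    gen(),
    filter(None, [0, 3]),               # drops the falsy 0
    iter([4]),                          # built-in list iterator
    map(int, io.StringIO("5\n6\n")),    # mapping a function over file-like lines
]

combined = list(chain.from_iterable(sources))
```

If + worked only on the generator and the map, the same code would fail as soon as one of those sources was wrapped in filter.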

If you really wanted to, you could make this general pretty easily: change the meaning of the + operator so that, after checking __add__ and __radd__, before raising a TypeError, it checks whether both operands have __iter__ methods and, if so, returns a chain of the two.
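A pure-Python approximation of that fallback, just to show the shape of the idea. The helper name add_with_chain_fallback is hypothetical, and a real change would live in the interpreter's handling of +, not in a function like this:

```python
from itertools import chain


def add_with_chain_fallback(left, right):
    """Hypothetical helper: try normal addition, else chain two iterables."""
    try:
        # The built-in + already tries left.__add__ and then right.__radd__.
        return left + right
    except TypeError:
        # Caveat of this sketch: a TypeError raised inside a user-defined
        # __add__ is also caught here, which the real operator would not do.
        if hasattr(left, "__iter__") and hasattr(right, "__iter__"):
            return chain(left, right)
        raise


# Works for any pair of iterables, not just generators, map and zip.
mixed = list(add_with_chain_fallback(map(str, [1, 2]), filter(None, [0, 3])))
# Types that already define + keep their normal behavior.
plain = add_with_chain_fallback([1], [2])
```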

Slicing has other problems, however. First, it's a bit odd to be able to do i[5:7] but not i[5]. But, more seriously, I wouldn't expect i[5:7] to give me an iterable that, when accessed, discards the first 5 elements of i. And imagine how confusing this code would be:

    >>> i = iter(range(10))
    >>> a = i[2:4]
    >>> b = i[4:6]
    >>> list(a)
    [2, 3]
    >>> list(b)
    [8, 9]

Hiding the fact that i is an iterator rather than a sequence will just confuse your readers.
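You can reproduce that session today by spelling the hypothetical slices as itertools.islice calls (assuming, as the proposal does, that i[a:b] would mean islice(i, a, b)). Both slices share the one underlying iterator, so the second picks up wherever the first left off:

```python
from itertools import islice

i = iter(range(10))
a = islice(i, 2, 4)  # stand-in for the hypothetical i[2:4]
b = islice(i, 4, 6)  # stand-in for the hypothetical i[4:6]

first = list(a)   # consumes 0..3 from i, keeping 2 and 3
second = list(b)  # skips the next four items (4..7), keeping 8 and 9

# The same slices on a real sequence are independent of each other.
seq = list(range(10))
seq_first = seq[2:4]
seq_second = seq[4:6]
```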

> 2015-01-03 17:28 GMT+02:00 Steven D'Aprano <steve at pearwood.info>:
>> On Sat, Jan 03, 2015 at 03:37:03PM +0200, yotam vaknin wrote:
>> > I would like to propose that in Python 3 it be easy to chain and slice
>> > iterators, just like you can add and slice lists and tuples easily.
>> >
>> > I think there could be 2 methods to do this:
>> > 1. Implementing '+' and slicing ([1:2:3]) for iterators and generators by
>> > default, resulting in itertools' chain(a, b) and islice(a, b, c) respectively.
>> > 2. Having itertools' chain and islice imported by default.
>> The iterator protocol is intentionally simple. To create an iterator,
>> you need to only define two methods:
>> (1) __iter__, which returns the instance itself;
>> (2) __next__, which returns the next item and raises StopIteration when
>> done.
>> With your proposal, you would have to define two or three more methods:
>> (3) __add__ and possibly __radd__, to chain iterators;
>> (4) __getitem__, for slicing.
>> There is nothing stopping you from adding these methods to your own
>> iterator classes, but with your proposal that would be compulsory for
>> all iterators.
>> > I think since Python 3 returns zip, map, dict.items and so on as iterators,
>> > it makes working with those kinds of objects more difficult without having
>> > these methods around. And since those methods are important enough to have
>> > for lists, it seems important enough for iterators too.
>> It only takes a single line of code to get iterator chaining and
>> slicing:
>> from itertools import chain, islice
>> And now you can chain and slice any iterator, regardless of where it
>> came from. I don't think that is difficult.
>> *If* iterators were instances of a concrete base class, then it would
>> make sense to add chain and slice methods to that base class so all
>> other iterators could inherit from it. But they aren't: iterators in
>> Python are values which obey a protocol, not inheritance. That means
>> that a functional approach, like itertools, is more appropriate.
>> I can see the appeal of using + and [a:b:c] syntax instead of function
>> syntax chain() and islice(), but I don't think the advantage is enough
>> to outweigh the costs.
>> --
>> Steve
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/