[Python-ideas] Integrate some itertools into the Python syntax

Michel Desmoulin desmoulinmichel at gmail.com
Mon Mar 21 19:06:32 EDT 2016


Itertools is great, and some functions in it are more used than others:

- islice;
- chain;
- dropwhile, takewhile;

Unfortunately, many people don't use them because they don't know they
exist, and are also unaware of the importance of generators in Python
and, all in all, of the central place iteration has in the language.

But I must confess that after 12 years of Python, I always delay the use
of it as well:

- I have to type the import in every single module and shell session (OK,
I have PYTHONSTARTUP set up to auto-import, but most people don't).
- All the functions feel verbose for such a common use case.
- If I start to use it, I can say goodbye to simpler syntaxes such as []
and +.
- It always takes me a minute to get dropwhile/takewhile right. They
work the opposite way from my brain.
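
For readers who hit the same dropwhile/takewhile confusion, here is a small illustration of what the two functions do today. The sample list and the helper name `small` are made up for this example.

```python
from itertools import dropwhile, takewhile

numbers = [1, 3, 5, 2, 7]  # illustrative data only


def small(x):
    """Predicate: True while the element is at most 4."""
    return x <= 4


# takewhile yields elements until the predicate first fails, then stops
# for good (the later 2 is not yielded even though it is small).
head = list(takewhile(small, numbers))

# dropwhile skips elements while the predicate holds, then yields
# everything that remains, including later small elements.
tail = list(dropwhile(small, numbers))

print(head)  # [1, 3]
print(tail)  # [5, 2, 7]
```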

The changes I'm going to propose do not add new syntax to Python, but
would streamline the use of this nice tool and blend it into the
language core.

Make slicing accept callables
=============================

One day someone asked me something similar to:

"I got a list of numbers, how do I filter this list so that I stop when
numbers are bigger than 4."

So I said:

print([x for x in numbers if x > 4])

But then he said:

"No, I want to stop reading any number I encounter after the first x > 4."

"Oh".

Then:

import itertools
def stop(element):
    return not element > 4
print(list(itertools.takewhile(stop, numbers)))

I actually got it wrong twice: first I forgot the "not", then I mixed
up the parameters in takewhile. I was then going to introduce lambda, but
my colleagues looked at me sadly after glancing at the code, and I
backed off.

So my first proposal is to be able to do:

def stop(element):
    return element > 4
print(numbers[:stop])

It's quite Pythonic and easy to understand: the end of the slice is when
this condition is met. And not the strange way takewhile works, which is
"carry on as long as this condition is met".
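
The behaviour can be approximated today with a list subclass. This is only a rough sketch: the class name `PredicateList` is hypothetical, it handles nothing but the `[:callable]` form, and a real implementation would live in list itself.

```python
from itertools import takewhile


class PredicateList(list):
    """Hypothetical list subclass emulating the proposed numbers[:stop]."""

    def __getitem__(self, index):
        if isinstance(index, slice) and callable(index.stop):
            if index.start is not None or index.step is not None:
                # Assumption of this sketch: only [:callable] is supported.
                raise NotImplementedError("only [:callable] is sketched here")
            # Keep elements until the stop condition is first met.
            return PredicateList(takewhile(lambda x: not index.stop(x), self))
        # Integer indexes and ordinary slices behave as usual.
        return super().__getitem__(index)


def stop(element):
    return element > 4


numbers = PredicateList([1, 3, 5, 2, 7])
print(numbers[:stop])  # [1, 3]
print(numbers[1:3])    # [3, 5]
```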

We could also extend itertools.islice to accept such a parameter.
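
One possible shape for that, as a sketch only: `islice_until` is a hypothetical helper name, not an existing itertools function, and it covers just the "stop" argument. Because it stays lazy, it also works on infinite iterators.

```python
from itertools import count, islice, takewhile


def islice_until(iterable, stop):
    """Hypothetical helper: like islice(iterable, stop), but stop may be a callable."""
    if callable(stop):
        # Stop at the first element for which stop(element) is true.
        return takewhile(lambda x: not stop(x), iterable)
    # Plain integer (or None) stop: defer to the real islice.
    return islice(iterable, stop)


first_small = list(islice_until(count(1), lambda x: x > 4))
first_three = list(islice_until(count(1), 3))
print(first_small)  # [1, 2, 3, 4]
print(first_three)  # [1, 2, 3]
```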


Slicing any iterable
====================

Now, while I do like islice, I miss the straightforwardness of [:]:


from itertools import islice

def func_accepting_any_iterable(foo):
    return bar(islice(foo, 3, 7))

It's verbose, and renders the [3:7] syntax almost useless if you don't
control the code creating the iterable you are going to process, since
you don't know what type it's going to be.

So the second proposal is to allow:

def func_accepting_any_iterable(foo):
    return bar(foo[3:7])

The slicing would then return a list if it's a list, a tuple if it's a
tuple, and an islice(generator) if it's a generator. If somebody uses a
negative index, it would then raise a ValueError like islice.
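
Until something like this exists, the dispatch can be written as a helper function. The name `slice_any` is hypothetical, and using collections.abc.Sequence to decide what counts as "sliceable" is an assumption of this sketch.

```python
from collections.abc import Sequence
from itertools import islice


def slice_any(iterable, start, stop):
    """Hypothetical helper approximating the proposed iterable[start:stop]."""
    if isinstance(iterable, Sequence):
        # list -> list, tuple -> tuple, str -> str, and so on.
        return iterable[start:stop]
    # Anything else (generators, iterators) gets a lazy islice.
    # islice raises ValueError for negative indexes.
    return islice(iterable, start, stop)


from_list = slice_any(list(range(10)), 3, 7)
from_tuple = slice_any(tuple(range(10)), 3, 7)
from_gen = list(slice_any((x for x in range(10)), 3, 7))
print(from_list)   # [3, 4, 5, 6]
print(from_tuple)  # (3, 4, 5, 6)
print(from_gen)    # [3, 4, 5, 6]
```

Note that in this sketch sequences still accept negative indexes, because they are sliced natively; only the islice branch rejects them.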

This would make duck typing and iteration even easier in Python.


Chaining iterables
==================

Iterating over heterogeneous iterables is not straightforward.

You can add lists with lists and tuples with tuples, but if you need
more, then you need itertools.chain. Few people know about it, so I
usually see duplicate loops and conversion to lists/tuples.

So my third proposal is to overload the "&" operator so that anything
defining __iter__ can be used with it.

Then you can just do:

chaining = "abc" & [True, False] & (x * x for x in range(10))
for element in chaining:
    print(element)

Instead of:

from itertools import chain
chaining = chain("abc", [True, False], (x * x for x in range(10)))
for element in chaining:
    print(element)
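
The "&" behaviour can be tried out today with a small wrapper. This is a sketch under assumptions: the class name `Chainable` is hypothetical, and only the leftmost (or rightmost) operand needs wrapping because __and__ and __rand__ are both defined.

```python
from itertools import chain


class Chainable:
    """Hypothetical wrapper giving any iterable a chaining "&" operator."""

    def __init__(self, iterable):
        self._iterable = iterable

    def __iter__(self):
        return iter(self._iterable)

    def __and__(self, other):
        # Chainable & iterable
        return Chainable(chain(self, other))

    def __rand__(self, other):
        # iterable & Chainable, used when the left operand has no suitable "&"
        return Chainable(chain(other, self))


chaining = Chainable("abc") & [True, False] & (x * x for x in range(3))
result = list(chaining)
print(result)  # ['a', 'b', 'c', True, False, 0, 1, 4]
```

Like chain itself, the result is single-pass whenever one of the operands is an iterator or generator.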


