[Python-ideas] Integrate some itertools into the Python syntax

Michel Desmoulin desmoulinmichel at gmail.com
Tue Mar 22 13:51:50 EDT 2016


On 22/03/2016 17:18, Paul Moore wrote:
> On 22 March 2016 at 16:01, Michel Desmoulin <desmoulinmichel at gmail.com> wrote:
>> It's still an import in every file where you want to use it, and in the
>> shell. We can already have a 3rd-party lib do this, so no need to change
>> Python for that.
> 
> Note that sys, os, re, math, datetime are all "an import in every file
> you want to use it". The bar for changing Python is higher than just
> avoiding an import.
> 
> Paul
> 

This is an appeal to consider islice & Co as important as normal slicing.

Indeed, you will almost certainly have iterables in any code that uses
datetime, re or math. You have no certainty of having datetime, re or
math imported in code that deals with iterables.

Think about how annoying it would be to do:

>>> from builtins import slice
>>> [1, 2, 3, 4, 5, 6][slice(2, 4)]
[3, 4]

For every slice. We don't, because slicing is part of our standard data
processing toolkit. And I think we can make it even better.

We already do in some places. E.g. range(10)[3:5] works even though range()
generates values on the fly. Why? Because it's convenient, expressive
and Pythonic.
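
To make the range() point concrete, here is a small sketch of what works today and what does not. The names r, gen, error and sliced are only illustrative; the generator case is the one that currently needs itertools.islice.

```python
from itertools import islice

# Slicing a range returns another lazy range; nothing is materialized.
r = range(10)[3:5]
print(r)        # range(3, 5)
print(list(r))  # [3, 4]

# A generator, by contrast, cannot be sliced with [] today.
gen = (n * n for n in range(10))
error = None
try:
    gen[3:5]
except TypeError as exc:
    error = exc  # generators do not support subscripting

# islice does the same job lazily, at the cost of an import.
sliced = list(islice(gen, 3, 5))
print(sliced)   # [9, 16]
```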

Well, if you process a file and you want to limit it to all lines after
the first "BEGIN SECTION" (there can be others) and before the first
"STOP" (there can be others), but only 10000 lines max, and without loading
the whole file in memory, you could do:

def foo(p):
    with open(p) as f:
        def begin(x):
            return x.strip() == "BEGIN SECTION"
        def end(x):
            return x.strip() == "STOP"
        # proposed syntax, not valid Python today
        yield from f[begin, end][:10000]

It's very clean, very convenient, very natural, and memory efficient.
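
For anyone who wants to play with the idea, the syntax above can be roughly emulated today with a wrapper class. This is only a sketch: Sliceable is a hypothetical name, not an existing library, and the semantics I picked (keep the line matching begin, drop the line matching end, no negative indices) are assumptions rather than part of the proposal.

```python
from itertools import dropwhile, islice, takewhile


class Sliceable:
    """Hypothetical wrapper giving any iterable lazy, itertools-backed slicing."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # obj[begin, end]: two predicates marking where to start and stop
            begin, end = key
            it = dropwhile(lambda x: not begin(x), self._it)
            it = takewhile(lambda x: not end(x), it)
            return Sliceable(it)
        if isinstance(key, slice):
            # obj[a:b:c]: delegate to islice (negative values raise ValueError)
            return Sliceable(islice(self._it, key.start, key.stop, key.step))
        raise TypeError("expected a slice or a (begin, end) pair of predicates")


def begin(x):
    return x == "BEGIN SECTION"


def end(x):
    return x == "STOP"


lines = ["header", "BEGIN SECTION", "a", "b", "STOP", "c", "STOP"]
result = list(Sliceable(lines)[begin, end][:2])
print(result)  # ['BEGIN SECTION', 'a']
```

Note that this still needs an import and a wrapping call at every use site, which is exactly the friction the proposal wants to remove.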

Now compare it with itertools:

from itertools import takewhile, dropwhile, islice

def foo(p):
    with open(p) as f:
        def begin(x):
            return x.strip() != "BEGIN SECTION"
        def end(x):
            return x.strip() != "STOP"
        # yield from keeps the file open while the caller consumes the lines
        yield from islice(takewhile(end, dropwhile(begin, f)), 0, 10000)

It's ugly, hard to read, hard to write.

In Python, you are always iterating over something, so it makes sense to
make sure we have the best tooling for it at our fingertips.




