[Python-ideas] Integrate some itertools into the Python syntax
Michel Desmoulin
desmoulinmichel at gmail.com
Mon Mar 21 19:59:58 EDT 2016
The solution I'm currently using involves having a class called g, and
every time I want to manipulate an iterable, I just wrap it in g().
Then I get an object with all those semantics (and actually a lot more).
Maybe we can make only those things apply to the objects returned by
iter()?
(Also, about slicing accepting callables: g() actually goes a bit
overboard and accepts any object. If the object is not an int or a
callable, then it's used as a sentinel value. Not sure if I should speak
about that here.)
On 22/03/2016 00:36, Chris Angelico wrote:
> On Tue, Mar 22, 2016 at 10:06 AM, Michel Desmoulin
> <desmoulinmichel at gmail.com> wrote:
>> Itertools is great, and some functions in it are more used than others:
>>
>> - islice;
>> - chain;
>> - dropwhile, takewhile;
>> ...
>>
>> The changes I'm going to propose do not add new syntax to Python, but
>> would still streamline the use of this nice tool and blend it into the
>> language core.
>
> You're not the first to ask for something like this :) Let's get
> *really* specific about semantics, though - and particularly about the
> difference between iterables, iterators, and generators.
>
>> Make slicing accept callables
>> =============================
>>
>> So my first proposal is to be able to do:
>>
>>     def stop(element):
>>         return element > 4
>>     print(numbers[:stop])
>>
>> It's quite Pythonic and easy to understand: the slice ends when this
>> condition is met. That avoids the odd way takewhile works, which is
>> "carry on as long as this condition is met".
>>
>> We could also extend itertools.islice to accept such a parameter.
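For comparison, here is a minimal sketch of what the same thing takes
with takewhile today. The sample values in numbers are made up for
illustration; the only point is that the stop condition has to be
negated.

```python
from itertools import takewhile

def stop(element):
    return element > 4

# Illustrative data, not from the original message.
numbers = [1, 2, 3, 4, 5, 6, 1]

# takewhile keeps items while its predicate is true, so the "stop"
# condition must be negated to get "everything before the first match".
before_stop = list(takewhile(lambda element: not stop(element), numbers))
print(before_stop)  # [1, 2, 3, 4]
```

Note that the trailing 1 is not included: iteration ends at the first
element for which stop() is true.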
>
> This cannot be defined for arbitrary iterables, unless you're
> proposing to mandate it in some way. (It conflicts with the way a list
> handles slicing, for instance.) Even for arbitrary iterators, it may
> be quite tricky (since iterators are based on a protocol, not a type);
> but maybe it would be worth proposing an "iterator mixin" that handles
> this for you, eg:
>
>     class IteratorOperations:
>         def __getitem__(self, thing):
>             if isinstance(thing, slice):
>                 if has_function_in_criteria(thing):
>                     return self.takeuntil(thing.start, thing.stop)
>                 return itertools.islice(...)
>         def takeuntil(self, start, stop):
>             val = next(self)
>             while start is not None and not start(val):
>                 val = next(self)
>             while stop is None or not stop(val):
>                 yield val
>                 val = next(self)
>
> As long as you inherit from that, you get these operations made
> available to you.
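To make that concrete, here is one way the mixin could be filled in so
that it runs as written. This is only a sketch under a few assumptions:
"has a function in the criteria" is taken to mean that start or stop is
callable, non-slice subscripts are simply rejected, and Countdown,
below_8 and below_4 are hypothetical names used for the demonstration.
The loops use "for" instead of bare next() calls because, since PEP 479
(Python 3.7 and later), a StopIteration escaping from next() inside a
generator is turned into a RuntimeError.

```python
import itertools

class IteratorOperations:
    """Mixin for iterator classes; assumes the subclass defines __next__."""

    def __iter__(self):
        return self

    def __getitem__(self, thing):
        if not isinstance(thing, slice):
            raise TypeError("only slices are supported in this sketch")
        # Assumption: a callable start or stop selects predicate slicing.
        if callable(thing.start) or callable(thing.stop):
            return self.takeuntil(thing.start, thing.stop)
        return itertools.islice(self, thing.start, thing.stop, thing.step)

    def takeuntil(self, start, stop):
        # Skip items until start(item) is true (no skipping if start is
        # None), then yield items until stop(item) is true (never stops
        # early if stop is None).
        started = start is None
        for val in self:
            if not started:
                if not start(val):
                    continue
                started = True
            if stop is not None and stop(val):
                return
            yield val


class Countdown(IteratorOperations):
    """Hypothetical example iterator: yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


def below_8(value):
    return value < 8

def below_4(value):
    return value < 4

print(list(Countdown(10)[below_8:below_4]))  # [7, 6, 5, 4]
print(list(Countdown(10)[2:5]))              # [8, 7, 6]
```

Both forms consume the underlying iterator, exactly like islice does.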
>
> Now, if you're asking this about generators specifically, then it
> might be possible to add this (since all generators are of the same
> type). It wouldn't be as broad as the itertools functions (which can
> operate on any iterable), but could be handy if you do a lot with
> gens, plus it's hard to subclass them.
>
>> Slicing any iterable
>> ======================
>>
>> So the second proposal is to allow:
>>
>>     def func_accepting_any_iterable(foo):
>>         return bar(foo[3:7])
>>
>> The slicing would then return a list if it's a list, a tuple if it's a
>> tuple, and an islice(generator) if it's a generator. If somebody uses a
>> negative index, it would then raise a ValueError like islice.
>>
>> This would make duck typing and iteration even easier in Python.
>
> Again, while I am sympathetic to the problem, it's actually very hard;
> islice always returns the same kind of thing, but slicing syntax can
> return all manner of different things, because it's up to the object
> on the left:
>
> >>> range(10)[3:7]
> range(3, 7)
> >>> "Hello, world!"[3:7]
> 'lo, '
> >>> [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7][3:7]
> [8, 5, 7, 1]
> >>> memoryview(b"Hello, world!")[3:7]
> <memory at 0x7fb98dab3f48>
>
> You don't want these to start returning islice objects. You mentioned
> lists, but other types will also return themselves when sliced.
>
> Possibly the solution here is actually to redefine object.__getitem__?
> Currently, it simply raises TypeError - not subscriptable. Instead, it
> could call iter() on itself, and then attempt to islice it. That would
> mean that the TypeError would change to "is not iterable"
> (insignificant difference), anything that already defines __getitem__
> will be unaffected (good), and anything that's iterable but not
> subscriptable would automatically islice itself (potentially a trap,
> if people don't know what they're doing).
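The fallback described above can be approximated today with a plain
helper function, without touching object. This is a sketch only:
slice_any is a hypothetical name, and "already defines __getitem__" is
checked with hasattr on the type, which is an assumption about how such
a fallback would decide.

```python
from itertools import islice

def slice_any(obj, start=None, stop=None, step=None):
    """Slice obj natively if its type defines __getitem__, else islice it."""
    if hasattr(type(obj), "__getitem__"):
        # Subscriptable objects keep their own slicing behaviour.
        return obj[start:stop:step]
    # Iterable but not subscriptable: iter() raises TypeError
    # ("... is not iterable") if obj is not iterable either.
    return islice(iter(obj), start, stop, step)


print(slice_any([1, 4, 2, 8, 5, 7], 3, 5))                  # [8, 5]
print(slice_any(range(10), 3, 7))                           # range(3, 7)
print(list(slice_any((x * x for x in range(10)), 3, 7)))    # [9, 16, 25, 36]
```

The trap is visible here too: the generator case consumes the first
seven items of the generator, and a negative index raises ValueError
from islice instead of counting from the end.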
>
>> Chaining iterable
>> ==================
>>
>> Iterating over heterogeneous iterables is not straightforward.
>>
>> You can add lists with lists and tuples with tuples, but if you need
>> more, then you need itertools.chain. Few people know about it, so I
>> usually see duplicate loops and conversion to lists/tuples.
>>
>> So my first proposal is to overload the "&" operator so that anything
>> defining __iter__ can be used with it.
>>
>> Then you can just do:
>>
>>     chaining = "abc" & [True, False] & (x * x for x in range(10))
>>     for element in chaining:
>>         print(element)
>>
>> Instead of:
>>
>>     from itertools import chain
>>     chaining = chain("abc", [True, False], (x * x for x in range(10)))
>>     for element in chaining:
>>         print(element)
>
> Again, anything involving operators is tricky, since anything can
> override its handling. But if you require that the first one be a
> specific iterator class, you can simply add __and__ to it to do what
> you want:
>
>     class iter:
>         iter = iter # snapshot the default 'iter'
>         def __init__(self, *args):
>             self.iter = self.iter(*args) # break people's minds
>         def __iter__(self): return self
>         def __next__(self): return next(self.iter)
>         def __and__(self, other):
>             yield from self.iter
>             yield from other
>
> Okay, so you'd probably do it without the naughty bits, but still :)
> As long as you call iter() on the first thing in the chain, everything
> else will work.
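A version without the naughty bits might look like the sketch below.
Chainable is a hypothetical name, not an existing class. One difference
from the quoted version: a generator-based __and__ returns a plain
generator, which has no & operator of its own, so a second & in the
same expression would fail. Here __and__ returns another wrapper
instead, so longer chains keep working.

```python
from itertools import chain

class Chainable:
    """Hypothetical wrapper: an iterator whose & operator chains iterables."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __and__(self, other):
        # Return another Chainable so that a & b & c keeps working.
        return Chainable(chain(self._it, other))


chaining = Chainable("abc") & [True, False] & (x * x for x in range(3))
print(list(chaining))  # ['a', 'b', 'c', True, False, 0, 1, 4]
```

Only the first operand needs to be wrapped; everything to the right of
it can be any iterable.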
>
> ChrisA