The solution I'm currently using involves having a class called g, and every time I want to manipulate an iterable, I just wrap it in g(). Then I get an object with all those semantics (and actually a lot more). Maybe we can make only those things apply to the objects returned by iter()?

(Also, about slicing accepting callables: g() actually goes a bit overboard and accepts any object. If the object is not an int or a callable, then it's used as a sentinel value. Not sure if I should speak about that here.)

On 22/03/2016 00:36, Chris Angelico wrote:
On Tue, Mar 22, 2016 at 10:06 AM, Michel Desmoulin wrote:
Itertools is great, and some functions in it are more used than others:
- islice;
- chain;
- dropwhile, takewhile;
...
The changes I'm going to propose do not add new syntax to Python, yet they would streamline the use of this nice tool and blend it into the language core.
You're not the first to ask for something like this :) Let's get *really* specific about semantics, though - and particularly about the difference between iterables, iterators, and generators.
Make slicing accept callables
=============================
So my first proposal is to be able to do:
def stop(element):
    return element > 4

print(numbers[:stop])
It's quite pythonic and easy to understand: the end of the slice is when this condition is met. And not the strange way takewhile works, which is "carry on as long as this condition is met".
We could also extend itertools.islice to accept such a parameter.
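As a point of comparison, the "stop when this condition is met" behaviour can already be sketched on top of takewhile by negating the predicate. The helper name take_until and the contents of numbers are assumptions made for illustration; neither exists in itertools.

```python
from itertools import takewhile


def take_until(iterable, stop):
    """Yield items until stop(item) is true; the stopping item is excluded."""
    # takewhile carries on while its predicate holds, so negate `stop`.
    return takewhile(lambda item: not stop(item), iterable)


def stop(element):
    return element > 4


numbers = [1, 2, 3, 4, 5, 6, 2]  # assumed sample data
print(list(take_until(numbers, stop)))
```

Note that the trailing 2 is not returned: like takewhile, the helper stops at the first item that meets the condition.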
This cannot be defined for arbitrary iterables, unless you're proposing to mandate it in some way. (It conflicts with the way a list handles slicing, for instance.) Even for arbitrary iterators, it may be quite tricky (since iterators are based on a protocol, not a type); but maybe it would be worth proposing an "iterator mixin" that handles this for you, eg:
class IteratorOperations:
    def __getitem__(self, thing):
        if isinstance(thing, slice):
            if has_function_in_criteria(thing):
                return self.takeuntil(thing.start, thing.stop)
            return itertools.islice(...)
    def takeuntil(self, start, stop):
        # Skip values until start(val) is true, then yield until stop(val) is true.
        val = next(self)
        while start is not None and not start(val):
            val = next(self)
        while stop is None or not stop(val):
            yield val
            val = next(self)
As long as you inherit from that, you get these operations made available to you.
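A runnable variant of that mixin might look like the sketch below. It makes a few assumptions that are not in the original: callable() stands in for the unspecified criteria check, the islice call is given self and the slice fields, and Wrapped is a hypothetical helper class used only to demonstrate the mixin. It also catches StopIteration explicitly, because since Python 3.7 (PEP 479) a StopIteration escaping a generator body is turned into a RuntimeError.

```python
import itertools


class IteratorOperations:
    """Hypothetical mixin: slice support for classes that define __next__."""

    def __getitem__(self, thing):
        if not isinstance(thing, slice):
            raise TypeError("only slices are supported")
        # Assumption: a callable bound selects the predicate-based path.
        if callable(thing.start) or callable(thing.stop):
            return self.takeuntil(thing.start, thing.stop)
        return itertools.islice(self, thing.start, thing.stop, thing.step)

    def takeuntil(self, start, stop):
        started = start is None
        while True:
            try:
                val = next(self)
            except StopIteration:
                return  # end the generator cleanly (PEP 479)
            if not started:
                if not start(val):
                    continue  # still skipping the prefix
                started = True
            if stop is not None and stop(val):
                return
            yield val


class Wrapped(IteratorOperations):
    """Hypothetical wrapper turning any iterable into a sliceable iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


def over_four(x):
    return x > 4


print(list(Wrapped(range(10))[:over_four]))
print(list(Wrapped(range(10))[2:5]))
```

In this sketch a slice that mixes an integer bound with a callable bound is not handled; both bounds of a predicate slice must be callables or None.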
Now, if you're asking this about generators specifically, then it might be possible to add this (since all generators are of the same type). It wouldn't be as broad as the itertools functions (which can operate on any iterable), but could be handy if you do a lot with gens, plus it's hard to subclass them.
Slicing any iterable
======================
So the second proposal is to allow:
def func_accepting_any_iterable(foo):
    return bar(foo[3:7])
The slicing would then return a list if it's a list, a tuple if it's a tuple, and an islice(generator) if it's a generator. If somebody uses a negative index, it would then raise a ValueError like islice.
This would make duck typing and iteration even easier in Python.
Again, while I am sympathetic to the problem, it's actually very hard; islice always returns the same kind of thing, but slicing syntax can return all manner of different things, because it's up to the object on the left:
>>> range(10)[3:7]
range(3, 7)
>>> "Hello, world!"[3:7]
'lo, '
>>> [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7][3:7]
[8, 5, 7, 1]
>>> memoryview(b"Hello, world!")[3:7]
You don't want these to start returning islice objects. You mentioned lists, but other types will also return themselves when sliced.
Possibly the solution here is actually to redefine object.__getitem__? Currently, it simply raises TypeError - not subscriptable. Instead, it could call iter() on itself, and then attempt to islice it. That would mean that the TypeError would change to "is not iterable" (insignificant difference), anything that already defines __getitem__ will be unaffected (good), and anything that's iterable but not subscriptable would automatically islice itself (potentially a trap, if people don't know what they're doing).
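The fallback described here can be approximated today with a plain helper function, without touching object.__getitem__. This is only a sketch: slice_any is a hypothetical name, and it checks for __getitem__ on the type as a stand-in for "already defines its own subscripting".

```python
from itertools import islice


def slice_any(obj, start, stop):
    """Slice natively when the type supports it; otherwise islice it."""
    if hasattr(type(obj), "__getitem__"):
        # Lists, tuples, strings, ranges, etc. keep their own behaviour.
        return obj[start:stop]
    # Iterable but not subscriptable: fall back to a lazy islice.
    # islice raises ValueError for negative indices.
    return islice(iter(obj), start, stop)


print(slice_any("Hello, world!", 3, 7))
print(list(slice_any((x * x for x in range(10)), 3, 7)))
```

The trap mentioned above still applies: slicing a generator this way consumes it, so the same call made twice on one generator returns different items.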
Chaining iterable
==================
Iterating over heterogeneous iterables is not straightforward.
You can add lists with lists and tuples with tuples, but if you need more, then you need itertools.chain. Few people know about it, so I usually see duplicate loops and conversion to lists/tuples.
So my third proposal is to overload the "&" operator so that anything defining __iter__ can be used with it.
Then you can just do:
chaining = "abc" & [True, False] & (x * x for x in range(10))
for element in chaining:
    print(element)
Instead of:
from itertools import chain
chaining = chain("abc", [True, False], (x * x for x in range(10)))
for element in chaining:
    print(element)
Again, anything involving operators is tricky, since anything can override its handling. But if you require that the first one be a specific iterator class, you can simply add __and__ to it to do what you want:
class iter:
    iter = iter # snapshot the default 'iter'
    def __init__(self, *args):
        self.iter = self.iter(*args) # break people's minds
    def __iter__(self): return self
    def __next__(self): return next(self.iter)
    def __and__(self, other):
        yield from self.iter
        yield from other
Okay, so you'd probably do it without the naughty bits, but still :) As long as you call iter() on the first thing in the chain, everything else will work.
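A version without the naughty bits could look like the sketch below. Chainable is a hypothetical name, not an existing class. One difference from the class above is that __and__ returns another Chainable instead of a bare generator; a plain generator has no __and__, so a second & in the same expression would otherwise raise TypeError.

```python
from itertools import chain


class Chainable:
    """Hypothetical iterator wrapper whose & operator chains iterables."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __and__(self, other):
        # Wrap the result again so that a & b & c keeps working.
        return Chainable(chain(self._it, other))


chaining = Chainable("abc") & [True, False] & (x * x for x in range(3))
print(list(chaining))
```

Only the first operand needs to be wrapped; everything to the right of it just has to be iterable.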
ChrisA