Re: [Python-ideas] Integrate some itertools into the Python syntax

I haven't seen enough in this conversation to persuade me that we need more built-ins, but: I recently taught Intro to Python using py3 for the first time, and really noticed that while the most obvious difference (at the beginning level) is that print is a function now, that is only annoying, and didn't confuse anyone at all. (And Unicode became a non-issue -- yay!)

However, where I found things changed in a conceptual way is the movement from sequences to iterables in many places: zip(), dict.keys(), etc. I found that my approach of focusing a lot on sequences -- you can iterate over them, you can slice them, you can... -- seems out of date now, and I think led to confusion as folks got a bit more advanced. Py2 was all about sequences; py3 is all about iterables, and I think I need to start teaching it that way: start with iterables, and introduce sequences as a special case of iterables, rather than starting from sequences and introducing iterables as a more advanced topic.

I understand the focus on iterables -- for performance reasons if nothing else -- but I think it does, in fact, make the language more complex, and I think we should think about making the language more iterable-focused. This certainly comes up in this discussion: if Python is all about iterables, then some of itertools should be more discoverable and obvious.

And it came up in another recent thread about a mechanism for doing something after a for loop that didn't loop. In that case, for sequences, the idiom is obvious:

if seq:
    do_the_for_loop
else:
    do_something_else  # since the loop won't have run

But when you plug an arbitrary iterable into that, it doesn't work, and there is no easy, obvious, and robust idiom to replace it with. I don't know that that particular issue needs to be solved, but it makes my point: there is much to be done to make Python really about iterables, rather than sequences.
For my part, I would like to see iterables look more like sequences. I know it's going to be inefficient in many cases to index an iterable, but is it really less efficient than wrapping list() around it?

One tiny example: I often need to parse text files, which I used to do with code like:

for line in the_file.readlines():
    ...

Then, particularly when debugging, I could do:

for line in the_file.readlines()[:10]:
    ...

and just work with the first ten lines. But now that file objects are iterable, I can do:

for line in the_file:
    ...

Much nicer! But it breaks when I want to debug and try to do:

for line in the_file[:10]:
    ...

Arrgg! Files are not indexable! (And by the way, not only for testing, but also when you really do want the next ten lines from the file -- maybe it's a header, or...)

Sure, I have plenty of ways to work around this, but frankly, they are all a bit ugly, even if trivial. So maybe there should be some ability to index / slice iterables? But aside from that -- just the idea of looking at how to make iterables a more "natural" part of the language is a good thing.

-CHB

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
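A sketch for reference (not from the thread itself): the "first ten lines" case already has a clean spelling with itertools.islice, which works on any iterable, including file objects. io.StringIO stands in for an open file here.

```python
import io
from itertools import islice

# io.StringIO stands in for an open text file
the_file = io.StringIO("".join(f"line {i}\n" for i in range(100)))

# islice gives the first n items of any iterable -- no list() needed
first_ten = list(islice(the_file, 10))
print(first_ten[0])    # line 0
print(len(first_ten))  # 10
```

The trade-off, discussed later in this thread, is that islice consumes the underlying iterator, so this is not quite the same as `the_file.readlines()[:10]`.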

On Mar 24 2016, Chris Barker <chris.barker-32lpuo7BZBA@public.gmane.org> wrote:
Hmm.

empty = True
for stuff in seq:
    empty = False
    do_stuff
if empty:
    do_something_else

is two lines longer than the above (if you expand the do_the_for_loop) and seems pretty obvious and robust. What do you dislike about it?

Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

On 25 March 2016 at 07:20, Chris Barker <chris.barker@noaa.gov> wrote:
I like the concept of focusing on working with file iterators and file processing concepts like head, tail and cat as a concrete example - starting by designing a clean solution to a specific problem and generalising from there is almost always a much better approach than trying to design the general case without reference to specific examples. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Le 25/03/2016 03:41, Nick Coghlan a écrit :
Duly noted. I'll do that next time.

The ability to accept callables in slicing, and to allow slicing on more iterables, would actually fit well in the file processing use case:

# get all lines between the first comment and the first blank line,
# then limit that result to 100
with open(p) as f:

    def is_comment(line):
        return line.startswith('#')

    def is_blank_line(line):
        return line.strip()

    for line in f[is_comment, is_blank_line][:100]:
        print(line)

It's also very convenient for generator expressions:

# get random numbers between 0 and 100000, and square them;
# remove all numbers you can't divide by 3;
# then sample 100 of them
numbers = (x * x for x in random.randint(100000) if x % 3 == 0)
for x in numbers[:100]:
    print(x)
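For comparison, the proposed `f[is_comment, is_blank_line][:100]` maps roughly onto itertools.dropwhile/takewhile/islice as they exist today; a sketch under that reading (the predicates are the same hypothetical helpers, io.StringIO stands in for the file):

```python
import io
from itertools import dropwhile, takewhile, islice

def is_comment(line):
    return line.startswith('#')

def is_blank_line(line):
    return line.strip()  # truthy for non-blank lines

f = io.StringIO("prelude\n# header\nalpha\nbeta\n\ntrailer\n")

# drop lines until the first comment, take lines while non-blank,
# then cap the result at 100 lines
result = list(islice(
    takewhile(is_blank_line, dropwhile(lambda l: not is_comment(l), f)),
    100))
print(result)  # ['# header\n', 'alpha\n', 'beta\n']
```

Whether the slice syntax is clearer than the nested calls is, of course, the question the thread is debating.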

On Mar 25 2016, Michel Desmoulin <desmoulinmichel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
If there has ever been an example that I hope to never, ever see in actual code, then it's this one. If you have to make up stuff like that to justify a proposed feature, then that does not bode well.

Best, -Nikolaus

Le 25/03/2016 17:53, Nikolaus Rath a écrit :
The (x * x for x in random.randint(100000) if x % 3 == 0) is just there to stand for "rich generators". The example is artificial, but every advanced Python codebase out there uses generators. The feature is on the for loop, which is simple and elegant. What you're doing here is trolling, really.

On Sat, Mar 26, 2016 at 9:16 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
No, it's not trolling. But there's a difficult balance to strike when giving examples of a new feature; too simple and there's no justification for the feature, too complex and it's a poor justification ("if you have to make up stuff like that"). Trying to see past the example, generalizing to code that you might yourself write, isn't always easy. ChrisA

On 03/25/2016 03:16 PM, Michel Desmoulin wrote:
True, but without a useful comment it was easy to miss that this line was not the feature.
The feature is on the for loop, which is simple and elegant.
True.
What you're doing here is trolling, really.
Just because you don't like it doesn't make it trolling. -- ~Ethan~

Chris Barker writes:
The *language* has always been iterable-focused. Python doesn't have any counting loops! range() is a builtin (and a necessary one), but not part of the language. for and while are part of the language, and they are both "do until done" loops, not "do n times" loops. I think the problem is more on the user (and teacher) side. We learn that a square has four sides, but perhaps it's more Pythonic(?) to think of a square as the process "do until here == start: forward 1 furlong; right 90". But *neither* (or *both*) is how the children I've observed think of it. They draw four sides, the first three quite straight, and the last one *warped as necessary* to join with the first. (And adult practice is even more varied: drawing two parallel sides first, then joining them at the ends. In the case of the Japanese and Chinese, a square has *three* sides: the left vertical is drawn, then the top and right are drawn as one stroke, and finally the horizontal base.[2]) People don't think like the algorithms that are convenient for us to teach computers.
But does it? That thread never did present a real use case for "empty:" with an iterator. Evidently the OP has one, but we didn't get to see it. All of the realistic iterator cases I can think of are *dynamic*: eg RSS or Twitter feeds. In those cases, it's not that the iterator is empty, it's that it's in a wait state. Even an empty database cursor can be interpreted that way. (If you didn't expect updates, why are you using a database?) I am inclined to think this is a general point, that is, the problem is not *empty* iterators vs. *non-empty* ones. It's iterators that have produced values recently vs. those that haven't. The empty vs. non-empty distinction is a property of *sequences*, including buffers (which are associated with iterators).
No, they are *enumerable*:

for i, line in enumerate(the_file):
    if i >= 10:
        break
    ...

I guess your request to make iterators more friendly is a good part of why enumerate() got promoted to builtin.
So maybe there should be some ability to index / slice iterables?
There's no way to index an iterator, except to enumerate it. Then, "to memoize or not to memoize, that is the question." Slicing makes more sense to me, but again the fact that your discarded data may or may not be valuable means that you need to make a choice between memoizing and not doing so. Putting that in the API is complexity, or perhaps even complication. If you just want head or tail, then takewhile or dropwhile from itertools is your friend. (I have no opinion -- not even -0 -- on whether promoting those functions to builtin is a good idea.)
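A sketch of the takewhile/dropwhile "head or tail" point just made: takewhile yields the leading run of items that satisfy a predicate, and dropwhile skips that run and yields everything after it.

```python
from itertools import takewhile, dropwhile

data = [1, 3, 5, 8, 9, 11]

head = list(takewhile(lambda x: x % 2, data))  # leading odd numbers
rest = list(dropwhile(lambda x: x % 2, data))  # everything from 8 on
print(head)  # [1, 3, 5]
print(rest)  # [8, 9, 11]
```

Note that on an actual iterator (rather than a list), takewhile also consumes the first failing item while deciding to stop, which is part of the memoization problem described above.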
But aside from that -- just the idea that looking at how to make iterable a more "natural" part of the language is a good thing.
I think it's from Zen and the Art of Motorcycle Maintenance (though Pirsig may have been quoting), but I once read the advice: "If you want to paint a perfect painting, make yourself perfect and then paint naturally." I think iterators are just "unnatural" to a lot of people, and will remain unnatural until people evolve. Which they may not! Real life "do until done" tasks are careers ("do ... until dead") or have timeouts ("do n times: break if done else ..."). In computation that would be analogous to scrolling a Twitter feed vs. grabbing a pageful. In the context of this discussion, a feed is something you wait for (and maybe timeout and complain to the operator if it blocks too long), while you can apply len() to pages. And you know which is which in all applications I've dealt with -- except for design of abstract programming language facilities like "for ... in". The point being that "real life" examples don't seem to help people's intuition on the Python versions.

Sure.
range() is a builtin (and a necessary one),
But range used to produce a list. So for loops were, in the beginning, about iterating, yes, but iterating through a sequence, not arbitrary iterables. I may be wrong here, but I started using Python at version 1.5, and I'm pretty sure the iterator protocol did not exist then -- I know for sure I learned about it far later. And aside from range, there are dict.keys, zip, etc.
I think the problem is more on the user (and teacher) side.
Well sure - mostly this is about how we present the language, and I at least am making that switch.
That thread never did present a real use case for "empty:" with an iterator.
I agree here, actually.
Sure, but you have to admit that the slicing notation is a lot cleaner. And I don't WANT to enumerate -- I want

Oops, hit send by accident.
I want to iterate only the first n items -- that may be a common enough use case for a concise notation.
Well, this thread started with those ideas. My point is still: making working with iterables as easy and natural as sequences is a good direction to go. -CHB

I'm still very uneasy about how slicing is usually random access, and doesn't change how it indexes its elements from repeated use. It means that you have something very different for the same syntax. (No interest in `my_iterator(:)`? It might actually be compatible with existing syntax.) What about IterTool(foo).islice(dropuntil_f, take_while_g, filter_h) Keeps it as a single call but with explicit call syntax. Won't most uses stick something complicated (= long) in the slice? On Mar 27, 2016 11:31 PM, "Chris Barker - NOAA Federal" < chris.barker@noaa.gov> wrote:

On Tue, Mar 29, 2016 at 1:45 AM, Franklin? Lee <leewangzhong+python@gmail.com> wrote:
>>> range(10, 100)[25:35]
range(35, 45)
It's a slice. Whether it's random access or not is pretty much unrelated to slicing; you get a slice of the underlying object, whatever that is. A slice of a sliceable iterator should be a sliced iterator. ChrisA

On Mon, Mar 28, 2016 at 7:45 AM, Franklin? Lee < leewangzhong+python@gmail.com> wrote:
I think only slightly different :-) On Mon, Mar 28, 2016 at 8:04 AM, Chris Angelico <rosuav@gmail.com> wrote:
Sure -- but that works great for range, because it is a lazily evaluated sequence, which of course makes it iterable. But I think the trick is:

arbitrary_iterable[10:20]

should return an iterable that will return the 10th to the 20th item when iterated -- doable, but:

for i in arbitrary_iterable[10:20]:
    pass

will then have to call the underlying iterable 20 times -- if it's an iterator that isn't a sequence, its state will have been altered. So I'm not sure how to go about this -- but it would be nice. Also,

arbitrary_iterable[-10:]

would be essentially impossible.

-CHB

On Tue, Mar 29, 2016 at 3:13 AM, Chris Barker <chris.barker@noaa.gov> wrote:
I don't think arbitrary iterables should be sliceable. Arbitrary *iterators* can be, because that operation has well-defined semantics (just tie in with itertools.islice); the state of the underlying iterator *will* be changed.
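A sketch of the islice semantics being referred to: slicing an iterator this way is well-defined but destructive -- the skipped items are consumed along with the yielded ones.

```python
from itertools import islice

it = iter(range(100))

# consumes items 0..19 from the iterator, yielding only 10..19
middle = list(islice(it, 10, 20))
print(middle[0], middle[-1])  # 10 19

# the iterator resumes right after the slice -- 0..19 are gone for good
print(next(it))  # 20
```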
Also, arbitrary_iterable[-10:]
would be essentially impossible.
I suppose it could be something like:

iter(collections.deque(arbitrary_iterable, maxlen=10))

but that's not what people will normally expect. Much better for various iterable types to define their own slice handling. ChrisA
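A runnable version of that deque trick (the helper name `tail` is illustrative): the only general way to get the last n items of an arbitrary iterator is to consume it entirely, buffering at most n items.

```python
from collections import deque

def tail(iterable, n):
    # deque with maxlen=n keeps only the last n items seen;
    # this consumes the whole iterable
    return iter(deque(iterable, maxlen=n))

last_ten = list(tail(iter(range(1000)), 10))
print(last_ten)  # [990, 991, 992, 993, 994, 995, 996, 997, 998, 999]
```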

Le 28/03/2016 18:13, Chris Barker a écrit :
It would be possible, but would basically be equivalent to:

list(arbitrary_iterable)[-10:]

and would load everything in memory, at best rendering generators useless, at worst exploding on an infinite data stream. But I think it would be OK to raise ValueError for any negative indices by default, and then let containers such as list/tuple define custom behavior for them. Note that arbitrary_iterable[:-10] or arbitrary_iterable[-10] would be possible: with collections.deque you buffer at most 10 elements in memory. But I'm not sure whether it would be preferable to forbid all negative values, or to have a special case. It's handy but confusing. Also, what do you think about the callable passed as an index from the first examples?

On 03/28/2016 09:23 AM, Michel Desmoulin wrote:
Agreed. And already the default behaviour.
Also, what do you think about the callable passed as an index from the first examples ?
I like it, but it will be a pain to implement. (At least, it would be for me. ;) -- ~Ethan~

On 03/28/2016 09:13 AM, Chris Barker wrote:
Iter*ables* are easy to slice (think list, tuple, range, etc.). Iter*ators* are the tricky ones. And yes, there is no way to slice an iterator without changing its state. -- ~Ethan~

On Tue, Mar 29, 2016 at 3:29 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
*Some* iterables are easy to slice. Some are not. A dictionary is perfectly iterable, but it doesn't make a lot of sense to slice it. Are you slicing based on keys, or sequence (like paginating the dict), or something else? That's why iterable slicing needs to be handled by the type itself - a range knows how to produce a subrange, a list knows how to produce a sublist. ChrisA
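One concrete reading of "slicing" a dict -- paginating its items -- can at least be written explicitly with islice today, sidestepping the ambiguity about what dict slice syntax would mean. (Relies on dicts preserving insertion order, Python 3.7+.)

```python
from itertools import islice

d = {f"key{i}": i for i in range(10)}

# "page" 2 of the dict's items, 3 per page
page = dict(islice(d.items(), 3, 6))
print(page)  # {'key3': 3, 'key4': 4, 'key5': 5}
```

The explicit call makes the choice (slicing by iteration order, not by key) visible, which is exactly the point that the type itself should decide.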

On 03/28/2016 09:34 AM, Chris Angelico wrote:
From the glossary [https://docs.python.org/3/glossary.html] iterable An object capable of returning its members one at a time. Very good point, Chris, thanks for the correction. So the key point then is that objects with a `__getitem__` that is based on offsets `0` to `len(obj)-1` are easily sequenced, all others may as well be wrapped in `iter()` calls. -- ~Ethan~
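A sketch of the `__getitem__`-based fallback just described (the class name `Squares` is illustrative): iter() will call `__getitem__` with 0, 1, 2, ... until IndexError, so this class is iterable despite defining no `__iter__` at all.

```python
class Squares:
    # legacy "sequence protocol": iterable via __getitem__ alone
    def __getitem__(self, i):
        if i >= 5:
            raise IndexError(i)
        return i * i

print(list(Squares()))  # [0, 1, 4, 9, 16]
```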

On 3/27/2016 11:26 PM, Chris Barker - NOAA Federal wrote:
The current iterator protocol, based on __iter__, __next__, and StopIteration, was added in 2.2. The original iteration protocol, now called the 'sequence protocol', was based on __getitem__(count) and IndexError. Except for not requiring a __len__ method, it required that what we now call an 'iterable' pretend to be a sequence. It is still supported by iter(). https://docs.python.org/3/library/functions.html#iter

With the additional changes to builtins in 3.0, there has definitely been a shift of emphasis from lists and sequences to iterables and iterators.

---

General response to thread: In spite of this, I am not in favor of trying to force-fit itertools, in particular islice, into syntax. Iterables for which slicing makes sense are free to directly support non-destructive slicing in a __getitem__ method. It does not bother me that destructive slicing of iterators requires a different syntax.

Because of the difference between non-destructive and destructive slicing, the goal of making the same code work for sequences and iterators cannot work beyond a single slice. Suppose one wants to process the first 5 and then the next 5 items of an iterable. The necessary code is different for sequences and iterators.

from itertools import islice
rs = range(10)
ri = iter(rs)
print(list(islice(rs, 0, 5)))
print(list(islice(rs, 0, 5)))
print()
print(list(islice(ri, 0, 5)))
print(list(islice(ri, 0, 5)))  # works for iterators
print()
ri = iter(rs)
print(list(islice(rs, 0, 5)))
print(list(islice(rs, 5, 10)))  # works for sequences
print()
print(list(islice(ri, 0, 5)))
print(list(islice(ri, 5, 10)))

# output:
# [0, 1, 2, 3, 4]
# [0, 1, 2, 3, 4]
#
# [0, 1, 2, 3, 4]
# [5, 6, 7, 8, 9]
#
# [0, 1, 2, 3, 4]
# [5, 6, 7, 8, 9]
#
# [0, 1, 2, 3, 4]
# []

Has anyone written and experimented with a 'syntax-iterator' class to test the idea? Has such been uploaded to PyPI?
import itertools

class Iter:
    # syntax-integrated iterator
    def __init__(self, iterable):
        self.it = iter(iterable)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.it)
    def __getitem__(self, what):
        if isinstance(what, slice):
            return itertools.islice(
                self.it, what.start or 0, what.stop, what.step or 1)
    def __add__(self, other):
        return itertools.chain(self, other)
        # for more than two iterables, chain(a, b, c, ...) will run faster
    ...

Then, in the current idiom for a function of an iterable:

def f(iterable, *args):
    it = iter(iterable)

one could replace 'iter' with 'Iter' and proceed to use the proposed iterator syntax within the function.

-- Terry Jan Reedy

On Mon, Mar 28, 2016 at 2:46 PM, Terry Reedy <tjreedy@udel.edu> wrote:
With the additional changes to builtins in 3.0, there has definitely been a shift of emphasis from lists and sequences to iterables and iterators.
Exactly my point -- and I think there ARE places in the language where we could make that more obvious and natural -- maybe not this place, though.
Because of the difference between non-destructive and destructive slicing,
This is a great way to describe the difference. Note that there is also a distinction between destructive and non-destructive looping:

for item in an_iterable:
    do_something(item)
    if some_condition:
        break

may leave an_iterable alone, or may leave it in a different state, depending on whether an_iterable is a sequence or an actual iterator. Even so -- I agree that it's probably not a good idea to re-use the slicing syntax for "destructive slicing."

-- Christopher Barker, Ph.D., NOAA/NOS/OR&R, Chris.Barker@noaa.gov
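A sketch of that destructive-vs-non-destructive looping point (the helper name `scan` is illustrative): breaking out of a for loop leaves a list untouched, but permanently advances an iterator.

```python
def scan(iterable):
    # loop that may break early, as in the example above
    for item in iterable:
        if item >= 3:
            break

seq = [1, 2, 3, 4, 5]
scan(seq)
print(seq)  # [1, 2, 3, 4, 5] -- unchanged

it = iter([1, 2, 3, 4, 5])
scan(it)
remaining = list(it)
print(remaining)  # [4, 5] -- the loop consumed 1, 2, 3
```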

On Mar 24 2016, Chris Barker <chris.barker-32lpuo7BZBA@public.gmane.org> wrote:
Hmm. empty = True for stuff in seq: empty = False do_stuff if empty: do_something_else is two lines longer than the above (if you expand the do_the_for_loop) and seems pretty obvious and robust. What do you dislike about it? Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

On 25 March 2016 at 07:20, Chris Barker <chris.barker@noaa.gov> wrote:
I like the concept of focusing on working with file iterators and file processing concepts like head, tail and cat as a concrete example - starting by designing a clean solution to a specific problem and generalising from there is almost always a much better approach than trying to design the general case without reference to specific examples. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Le 25/03/2016 03:41, Nick Coghlan a écrit :
Dully noted. I'll do that next time. The ability to accept callables in slicing, and allow slicing on more iterables would actually fit well in the file processing use case: # get all lines between the first command and the first blank line # then limit that result to 100 with open(p) as f: def is_comment(line): return line.startswith('#') def is_blank_line(line): return line.strip() for line in f[is_comment, is_blank_line][:100]: print(line) It's also very convenient for generator expressions: # get random numbers between 0 and 100000, and square them # remove all numbers you can't devide by 3 # then sample 100 of them numbers = (x * x for x in random.randint(100000) if x % 3 == 0) for x in numbers[:100]: print(x)
Cheers, Nick.

On Mar 25 2016, Michel Desmoulin <desmoulinmichel-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
If there has ever been an example that I hope to never, ever see in actual code than it's this one. If you have to make up stuff like that to justify a proposed feature, then that does not bode well. Best, -Nikolaus -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«

Le 25/03/2016 17:53, Nikolaus Rath a écrit :
The (x * x for x in random.randint(100000) if x % 3 == 0) is just here to stand for "rich generators". The example is artificial, but every advanced python code out there use generator. The feature is on the for loop, which is simple and elegant. What you're doing here is trolling, really.
Best, -Nikolaus

On Sat, Mar 26, 2016 at 9:16 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
No, it's not trolling. But there's a difficult balance to strike when giving examples of a new feature; too simple and there's no justification for the feature, too complex and it's a poor justification ("if you have to make up stuff like that"). Trying to see past the example, generalizing to code that you might yourself write, isn't always easy. ChrisA

On 03/25/2016 03:16 PM, Michel Desmoulin wrote:
True, but without a useful comment it was easy to miss that this line was not the feature.
The feature is on the for loop, which is simple and elegant.
True.
What you're doing here is trolling, really.
Just because you don't like it doesn't make it trolling. -- ~Ethan~

Chris Barker writes:
The *language* has always been iterable-focused. Python doesn't have any counting loops! range() is a builtin (and a necessary one), but not part of the language. for and while are part of the language, and they are both "do until done" loops, not "do n times" loops. I think the problem is more on the user (and teacher) side. We learn that a square has four sides, but perhaps it's more Pythonic(?) to think of a square as the process "do until here == start: forward 1 furlong; right 90". But *neither* (or *both*) is how the children I've observed think of it. They draw four sides, the first three quite straight, and the last one *warped as necessary* to join with the first. (And adult practice is even more varied: drawing two parallel sides first, then joining them at the ends. In the case of the Japanese and Chinese, a square has *three* sides: the left vertical is drawn, then the top and right are drawn as one stroke, and finally the horizontal base.[2]) People don't think like the algorithms that are convenient for us to teach computers.
But does it? That thread never did present a real use case for "empty:" with an iterator. Evidently the OP has one, but we didn't get to see it. All of the realistic iterator cases I can think of are *dynamic*: eg RSS or Twitter feeds. In those cases, it's not that the iterator is empty, it's that it's in a wait state. Even an empty database cursor can be interpreted that way. (If you didn't expect updates, why are you using a database?) I am inclined to think this is a general point, that is, the problem is not *empty* iterators vs. *non-empty* ones. It's iterators that have produced values recently vs. those that haven't. The empty vs. non-empty distinction is a property of *sequences*, including buffers (which are associated with iterators).
No, they are *enumerable*: for i, line in enumerate(the_file): if i >= 10: break ... I guess your request to make iterators more friendly is a good part of why enumerate() got promoted to builtin.
So maybe there should be some ability to index / slice iterables?
There's no way to index an iterator, except to enumerate it. Then, "to memoize or not to memoize, that is the question." Slicing makes more sense to me, but again the fact that your discarded data may or may not be valuable means that you need to make a choice between memoizing and not doing so. Putting that in the API is complexity, or perhaps even complication. If you just want head or tail, then takewhile or dropwhile from itertools is your friend. (I have no opinion -- not even -0 -- on whether promoting those functions to builtin is a good idea.)
But aside from that -- just the idea that looking at how to make iterable a more "natural" part of the language is a good thing.
I think it's from Zen and the Art of Motorcycle Maintenance (though Pirsig may have been quoting), but I once read the advice: "If you want to paint a perfect painting, make yourself perfect and then paint naturally." I think iterators are just "unnatural" to a lot of people, and will remain unnatural until people evolve. Which they may not! Real life "do until done" tasks are careers ("do ... until dead") or have timeouts ("do n times: break if done else ..."). In computation that would be analogous to scrolling a Twitter feed vs. grabbing a pageful. In the context of this discussion, a feed is something you wait for (and maybe timeout and complain to the operator if it blocks too long), while you can apply len() to pages. And you know which is which in all applications I've dealt with -- except for design of abstract programming language facilities like "for ... in". The point being that "real life" examples don't seem to help people's intuition on the Python versions.

Sure.
range() is a builtin (and a necessary one),
But range used to produce a list. So for lips were, in the beginning, about iterating, yes, but iterating through a sequence, not an arbitrary iterables. I may be wrong here, but I started using Python in version 1.5, and I'm pretty sure the iterator protocol did not exist then -- I know for sure I learned about it far later. And aside from range, there is dict.keys, zip, etc....
I think the problem is more on the user (and teacher) side.
Well sure - mostly this is about how we present the language, and I at least am making that switch.
That thread never did present a real use case for "empty:" with an iterator.
I agree here, actually.
Sure, but you have to admit that the slicing notation is a lot cleaner. And I don't WANT to enumerate -/ I want

Oops, hit send by accident.
I want to iterate only the first n items -- that may be a common enough use case for a concise notation.
Well this thread started with those ideas .... My point is still : making working with iteratables as easy and natural as sequences is a good direction to go. -CHB

I'm still very uneasy about how slicing is usually random access, and doesn't change how it indexes its elements from repeated use. It means that you have something very different for the same syntax. (No interest in `my_iterator(:)`? It might actually be compatible with existing syntax.) What about IterTool(foo).islice(dropuntil_f, take_while_g, filter_h) Keeps it as a single call but with explicit call syntax. Won't most uses stick something complicated (= long) in the slice? On Mar 27, 2016 11:31 PM, "Chris Barker - NOAA Federal" < chris.barker@noaa.gov> wrote:

On Tue, Mar 29, 2016 at 1:45 AM, Franklin? Lee <leewangzhong+python@gmail.com> wrote:
range(10, 100)[25:35] range(35, 45)
It's a slice. Whether it's random access or not is pretty much unrelated to slicing; you get a slice of the underlying object, whatever that is. A slice of a sliceable iterator should be a sliced iterator. ChrisA

On Mon, Mar 28, 2016 at 7:45 AM, Franklin? Lee < leewangzhong+python@gmail.com> wrote:
I think only slightly different :-) On Mon, Mar 28, 2016 at 8:04 AM, Chris Angelico <rosuav@gmail.com> wrote:
sure -- but that works great for range, because is is an lazy evaluated sequence, which of course makes it iterable, but I think the trick is: arbitrary_iterable[10:20] should return an iterable that will return the 10th to the 20th item when iterated -- doable, but: for i in arbitrary_iterable[10:20]: pass will then have to call the underlying iterable 20 times -- if it's an iterator that isn't a sequence, its state will have been altered. so I'm not sure how to go about this -- but it would be nice. Also, arbitrary_iterable[-10:] would be essentially impossible. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Tue, Mar 29, 2016 at 3:13 AM, Chris Barker <chris.barker@noaa.gov> wrote:
I don't think arbitrary iterables should be sliceable. Arbitrary *iterators* can be, because that operation has well-defined semantics (just tie in with itertools.islice); the state of the underlying iterator *will* be changed.
Also, arbitrary_iterable[-10:]
would be essentially impossible.
I suppose it could be something like: iter(collections.deque(arbitrary_iterable, maxlen=10)) but that's not what people will normally expect. Much better for various iterable types to define their own slice handling. ChrisA

Le 28/03/2016 18:13, Chris Barker a écrit :
It would be possible but would basically be equivalent to list(arbitrary_iterable)[-10:] and would load everything in memory, at best rendering generators useless, at worst exploding on a infinite data stream. But I think it would be ok to raise ValueError any negative indices by default, and then let containers such as list/tuple define custom behavior for them. Note that arbitrary_iterable[:-10] or arbitrary_iterable[-10] would be possibl, with collections.deque you buffer at most 10 elements in memory, but I'm not sure if it would be preferable to fordid all negative values, or have an special case. It's handy but confusing. Also, what do you think about the callable passed as an index from the first examples ?

On 03/28/2016 09:23 AM, Michel Desmoulin wrote:
Agreed. And already the default behaviour.
Also, what do you think about the callable passed as an index from the first examples ?
I like it, but it will be a pain to implement. (At least, it would be for me. ;) -- ~Ethan~

On 03/28/2016 09:13 AM, Chris Barker wrote:
Iter*ables* are easy to slice (think list, tuple, range, etc.). Iter*ators* are the tricky ones. And yes, there is no way to slice an iterator without changing its state. -- ~Ethan~

On Tue, Mar 29, 2016 at 3:29 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
*Some* iterables are easy to slice. Some are not. A dictionary is perfectly iterable, but it doesn't make a lot of sense to slice it. Are you slicing based on keys, or sequence (like paginating the dict), or something else? That's why iterable slicing needs to be handled by the type itself - a range knows how to produce a subrange, a list knows how to produce a sublist. ChrisA

On 03/28/2016 09:34 AM, Chris Angelico wrote:
From the glossary [https://docs.python.org/3/glossary.html]:

    iterable: An object capable of returning its members one at a time.

Very good point, Chris, thanks for the correction. So the key point then is that objects with a `__getitem__` that is based on offsets `0` to `len(obj)-1` are easily sequenced; all others may as well be wrapped in `iter()` calls. -- ~Ethan~
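The offset-based `__getitem__` point above is exactly the fallback that `iter()` still supports -- a small illustration (class name is my own):

```python
class Squares:
    # no __iter__: iteration falls back to the legacy "sequence protocol",
    # which calls __getitem__(0), __getitem__(1), ... until IndexError
    def __getitem__(self, i):
        if i >= 5:
            raise IndexError(i)
        return i * i

result = list(Squares())
print(result)
```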

On 3/27/2016 11:26 PM, Chris Barker - NOAA Federal wrote:
The current iterator protocol, based on __iter__, __next__, and StopIteration, was added in 2.2. The original iteration protocol, now called the 'sequence protocol', was based on __getitem__(count) and IndexError. Except for not requiring a __len__ method, it required that what we now call an 'iterable' pretend to be a sequence. It is still supported by iter(). https://docs.python.org/3/library/functions.html#iter With the additional changes to builtins in 3.0, there has definitely been a shift of emphasis from lists and sequences to iterables and iterators.

--- General response to thread:

In spite of this, I am not in favor of trying to force-fit itertools, in particular islice, into syntax. Iterables for which slicing makes sense are free to directly support non-destructive slicing in a __getitem__ method. It does not bother me that destructive slicing of iterators requires a different syntax. Because of the difference between non-destructive and destructive slicing, the goal of making the same code work for sequences and iterators cannot work beyond a single slice. Suppose one wants to process the first 5 and then the next 5 items of an iterable. The necessary code is different for sequences and iterators.

    from itertools import islice

    rs = range(10)
    ri = iter(rs)
    print(list(islice(rs, 0, 5)))
    print(list(islice(rs, 0, 5)))
    print()
    print(list(islice(ri, 0, 5)))
    print(list(islice(ri, 0, 5)))   # works for iterators
    print()
    ri = iter(rs)
    print(list(islice(rs, 0, 5)))
    print(list(islice(rs, 5, 10)))  # works for sequences
    print()
    print(list(islice(ri, 0, 5)))
    print(list(islice(ri, 5, 10)))

    # [0, 1, 2, 3, 4]
    # [0, 1, 2, 3, 4]
    #
    # [0, 1, 2, 3, 4]
    # [5, 6, 7, 8, 9]
    #
    # [0, 1, 2, 3, 4]
    # [5, 6, 7, 8, 9]
    #
    # [0, 1, 2, 3, 4]
    # []

Has anyone written and experimented with a 'syntax-iterator' class to test the idea? Has such been uploaded to PyPI?
    import itertools

    class Iter:  # syntax-integrated iterator
        def __init__(self, iterable):
            self.it = iter(iterable)
        def __iter__(self):
            return self
        def __next__(self):
            return next(self.it)
        def __getitem__(self, what):
            if isinstance(what, slice):
                return itertools.islice(
                    self.it, what.start, what.stop, what.step)
        def __add__(self, other):
            return itertools.chain(self, other)
            # for more than two iterables, chain(a, b, c, ...) will run faster
        ...

Then, in the current idiom for a function of an iterable:

    def f(iterable, *args):
        it = iter(iterable)

one could replace 'iter' with 'Iter' and proceed to use the proposed iterator syntax within the function. -- Terry Jan Reedy
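A self-contained, runnable variant of the sketch above (the TypeError for non-slice indices and the demo names are my additions), showing that such a wrapper's slices are destructive -- they advance the shared underlying iterator:

```python
import itertools

class Iter:
    # a wrapper giving any iterator slice syntax via itertools.islice
    def __init__(self, iterable):
        self.it = iter(iterable)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.it)
    def __getitem__(self, what):
        if isinstance(what, slice):
            # destructive slice: consuming it advances self.it
            return Iter(itertools.islice(
                self.it, what.start, what.stop, what.step))
        raise TypeError("arbitrary iterators support only slice indexing")

lines = Iter("line %d" % n for n in range(100))
head = list(lines[:3])   # first three items
nxt = next(lines)        # the wrapper has advanced past the slice
print(head, nxt)
```

This matches the thread's file-parsing example: `lines[:10]` for a header would work, but only once, and only moving forward.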

On Mon, Mar 28, 2016 at 2:46 PM, Terry Reedy <tjreedy@udel.edu> wrote:
With the additional changes to builtins in 3.0, there has definitely been a shift of emphasis from lists and sequences to iterables and iterators.
Exactly my point -- and I think there ARE places in the language where we could make that more obvious and natural -- maybe not this place, though.
Because of the difference between non-destructive and destructive slicing,
This is a great way to describe the difference. Note that there is also a distinction between destructive and nondestructive looping:

    for item in an_iterable:
        do_something(item)
        if some_condition:
            break

may leave an_iterable alone, or may leave it in a different state, depending on whether an_iterable is a sequence or an actual iterator. Even so -- I agree that it's probably not a good idea to re-use the slicing syntax for "destructive slicing". -CHB
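The destructive-looping point above can be seen directly (a minimal demo of my own): breaking out of a loop leaves a list untouched, but leaves an iterator partly consumed.

```python
seq = [1, 2, 3, 4, 5]
it = iter(seq)

for item in seq:        # looping over the sequence
    if item == 2:
        break
print(seq)              # the list itself is unchanged

for item in it:         # looping over the iterator
    if item == 2:
        break
remaining = list(it)    # the break left the iterator partly consumed
print(remaining)
```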
participants (10)
- Chris Angelico
- Chris Barker
- Chris Barker - NOAA Federal
- Ethan Furman
- Franklin? Lee
- Michel Desmoulin
- Nick Coghlan
- Nikolaus Rath
- Stephen J. Turnbull
- Terry Reedy