Add lookahead iterator (peeker) to itertools
An iterator iter represents the remainder of some collection, concrete or not, finite or not. If the remainder is not empty, its .__next__ method selects one object, mutates iter to represent the reduced remainder (now possibly empty), and returns the one item. At various times, people have asked for the ability to determine whether an iterator is exhausted, whether a next() call will return or raise. If it will return, people have also asked for the ability to peek at what the return would be, without disturbing what the return will be. For instance, on the 'iterable.__unpack__ method' Alex Stewart today wrote:
The problem is that there is no standard way to query an iterator to find out if it has more data available without automatically consuming that next data element in the process.
It turns out that there is a solution that gives the ability to both test emptiness (in the standard way) and peek ahead, without modifying the iterator protocol. It merely requires a wrapper iterator, much like the ones in itertools. I have posted one before and give my current version below.

Does anyone else think this should be added to itertools? It seems to not be completely obvious to everyone, is more complex than some of the existing itertools, and cannot be composed from them either. (Nor can it be written as a generator function.)

Any of the names can be changed. Perhaps the class should be 'peek' and the lookahead object something else. The sentinel should be read-only if possible. I considered whether the peek object should be read-only, but someone would say that they *want* to be able to replace the next object to be yielded. Peeking into an exhausted iterable could raise instead of returning the sentinel, but I don't know if that would be more useful.

----------------
class lookahead():
    "Wrap iterator with lookahead to both peek and test exhausted"

    _NONE = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._set_peek()

    def __iter__(self):
        return self

    def __next__(self):
        ret = self.peek
        self._set_peek()
        return ret

    def _set_peek(self):
        try:
            self.peek = next(self._it)
        except StopIteration:
            self.peek = self._NONE

    def __bool__(self):
        return self.peek is not self._NONE

def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

test_lookahead()
--------------------
next: a ; is peek: True
next: b ; is peek: True
next: c ; is peek: True
-- Terry Jan Reedy
On 25 February 2013 09:51, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
Terry Reedy <tjreedy@...> writes:
class lookahead():
    "Wrap iterator with lookahead to both peek and test exhausted"
    ...
+1 That's a nice tool that I'd love to have in itertools.
It's not a bad idea, but I don't like the fact that as written it turns a finite iterator into an infinite one (returning an endless sequence of sentinels after the underlying iterator is exhausted). That seems to me to be very prone to errors if you don't keep it very clear whether you're dealing with iterators or lookahead-wrapped iterators:
>>> from lookahead import lookahead
>>> it = lookahead(range(3))
>>> list(it)
<hangs>
The problem is that once you wrap an iterator with lookahead() you can't use the underlying iterator directly any more without losing data, as you've consumed a value from it. So you have to use the wrapped version, and the differing behaviour means you need to write your code differently.

I'd prefer it if lookahead(it) behaved exactly like it, except that it had a new peek() method for getting the lookahead value, and maybe a finished() method to tell if next() will raise StopIteration or not. Bikeshed away over what should happen if peek() is called when finished() is true :-)

(Disclaimer: I have no real-world use cases for this feature, so the comments above are largely theoretical objections...)

Paul.
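The interface Paul describes can be sketched as follows. This is only an illustration under stated assumptions: the class name `Peekable` and its internals are hypothetical, and having peek() raise StopIteration on an exhausted iterator is just one possible answer to the question he leaves open.

```python
class Peekable:
    """Hypothetical wrapper that iterates exactly like the wrapped iterator."""

    _EMPTY = object()  # marks "no value buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered value first, if peek() fetched one.
        if self._cache is not self._EMPTY:
            value, self._cache = self._cache, self._EMPTY
            return value
        return next(self._it)  # StopIteration propagates as usual

    def peek(self):
        """Return the next item without consuming it.

        Assumption: raises StopIteration when nothing is left.
        """
        if self._cache is self._EMPTY:
            self._cache = next(self._it)
        return self._cache

    def finished(self):
        """Return True if next() would raise StopIteration."""
        try:
            self.peek()
        except StopIteration:
            return True
        return False
```

Because nothing is fetched until peek() or finished() is called, list(Peekable(range(3))) terminates like list(range(3)) does.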
On 2013-02-25, at 11:08 , Paul Moore wrote:
>>> from lookahead import lookahead
>>> it = lookahead(range(3))
>>> list(it)
<hangs>
The problem is that once you wrap an iterator with lookahead() you can't use the underlying iterator directly any more without losing data, as you've consumed a value from it. So you have to use the wrapped version
That's a drawback to most itertools wrappers though, and one which makes sense since Python iterators and generators are not replayable/restartable.
On 2/25/2013 5:08 AM, Paul Moore wrote:
On 25 February 2013 09:51, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:
Terry Reedy <tjreedy@...> writes:
class lookahead():
    "Wrap iterator with lookahead to both peek and test exhausted"
    ...
+1 That's a nice tool that I'd love to have in itertools.
It's not a bad idea, but I don't like the fact that as written it turns a finite iterator into an infinite one (returning an endless sequence of sentinels after the underlying iterator is exhausted).
This is a bug in this re-write. The corrected .__next__

    def __next__(self):
        if self:
            ret = self.peek
            self._set_peek()
            return ret
        else:
            raise StopIteration()

passes the test with this addition

    try:
        next(it)
        assert False, "Next should have raised StopIteration"
    except StopIteration:
        pass
>>> list(lookahead('abc')) == list('abc')
True
-- Terry Jan Reedy
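For reference, here is the class with the corrected .__next__ folded in. It is assembled from the two posts rather than taken from a single message, so treat it as a sketch of the revised version, not an official one.

```python
class lookahead:
    "Wrap iterator with lookahead to both peek and test exhausted"

    _NONE = object()  # sentinel stored in .peek once the source is exhausted

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._set_peek()

    def __iter__(self):
        return self

    def __next__(self):
        if self:  # uses __bool__: true while a real item is buffered
            ret = self.peek
            self._set_peek()
            return ret
        raise StopIteration

    def _set_peek(self):
        try:
            self.peek = next(self._it)
        except StopIteration:
            self.peek = self._NONE

    def __bool__(self):
        return self.peek is not self._NONE


# Usage: peek at each item before consuming it.
it = lookahead('abc')
pairs = []
while it:
    pairs.append((it.peek, next(it)))
```

With this version the wrapper is finite again, so it can be passed to list() or used in a for loop.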
Hi all! What is the problem with itertools.tee even? It is almost the same thing as what is proposed on this thread, although much more flexible.
>>> from itertools import tee
>>> my_iter, peek = tee(iter(range(3)), 2)
>>> next(peek)
0
>>> next(my_iter)
0
js -><-
On 2013-02-25, at 13:37 , Joao S. O. Bueno wrote:
Hi all!
What is the problem with itertools.tee even? It is almost the same thing as what is proposed on this thread, although much more flexible.
>>> from itertools import tee
>>> my_iter, peek = tee(iter(range(3)), 2)
>>> next(peek)
0
>>> next(my_iter)
0
Verbose way to do the same thing: you have to tee() + next() once per "peek()". And splitting to subroutines means these subroutines will have to return an iterator alongside any potential value as they'll have to tee() themselves.
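A rough sketch of what that looks like in practice. The helper name `peek_with_tee` is hypothetical; the point is that a tee-based peek has to hand a replacement iterator back to its caller.

```python
from itertools import tee


def peek_with_tee(iterator):
    """Return (next_value, replacement_iterator).

    Hypothetical helper: the caller must rebind its iterator to the
    returned one. StopIteration propagates if the iterator is empty.
    """
    iterator, ahead = tee(iterator)
    return next(ahead), iterator


numbers = iter(range(3))
first, numbers = peek_with_tee(numbers)        # one tee() + next() per peek
second_look, numbers = peek_with_tee(numbers)  # and again for the next peek
remaining = list(numbers)                      # nothing was consumed
```

Every function that wants to peek has to follow the same "return the iterator alongside the value" convention, which is the verbosity being pointed out.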
On Sun, 24 Feb 2013 22:41:48 -0500, Terry Reedy <tjreedy@udel.edu> wrote:
def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

import itertools

def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
        try:
            a = next(peeking)
        except StopIteration:
            break
        b = next(it)
        print('next:', b, '; is peek:', a is b)

Regards

Antoine.
On 25.02.13 12:07, Antoine Pitrou wrote:
On Sun, 24 Feb 2013 22:41:48 -0500, Terry Reedy <tjreedy@udel.edu> wrote:
def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
        try:
            a = next(peeking)
        except StopIteration:
            break
        b = next(it)
        print('next:', b, '; is peek:', a is b)
On Mon, 25 Feb 2013 13:58:59 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 25.02.13 12:07, Antoine Pitrou wrote:
On Sun, 24 Feb 2013 22:41:48 -0500, Terry Reedy <tjreedy@udel.edu> wrote:
def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
Only if you restrict yourself to access peeking each time you access it. (which, I suppose, is not the general use case for the lookahead proposal)

Regards

Antoine.
On 25.02.13 20:53, Antoine Pitrou wrote:
def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
Only if you restrict yourself to access peeking each time you access it. (which, I suppose, is not the general use case for the lookahead proposal)
Only if you do not want to consume O(N) memory and spend O(N**2) time.
On Mon, 25 Feb 2013 22:21:03 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 25.02.13 20:53, Antoine Pitrou wrote:
def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
Only if you restrict yourself to access peeking each time you access it. (which, I suppose, is not the general use case for the lookahead proposal)
Only if you do not want to consume O(N) memory and spend O(N**2) time.
No, that's beside the point. If you don't consume "peeking" in lock-step with "it", then "peeking" and "it" become desynchronized and therefore the semantics are wrong w.r.t. the original feature request (where "peeking" is supposed to be some proxy to "it", not an independently-running iterator).

Regards

Antoine.
On 25.02.13 22:27, Antoine Pitrou wrote:
On Mon, 25 Feb 2013 22:21:03 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 25.02.13 20:53, Antoine Pitrou wrote:
def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
Only if you restrict yourself to access peeking each time you access it. (which, I suppose, is not the general use case for the lookahead proposal)
Only if you do not want to consume O(N) memory and spend O(N**2) time.
No, that's beside the point. If you don't consume "peeking" in lock-step with "it", then "peeking" and "it" become desynchronized and therefore the semantics are wrong w.r.t. the original feature request (where "peeking" is supposed to be some proxy to "it", not an independently-running iterator).
Yes, of course, you should consume "peeking" in lock-step with "it". My note is that if you create a tee every iteration, this will lead to a linear increase in memory consumption and degradation of speed.
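The "tee once, outside the loop" variant might look like the sketch below. The function name `pairs_with_peek` is made up for illustration, and it only covers the restricted case where the look-ahead copy is advanced in lock-step with the main one.

```python
from itertools import tee


def pairs_with_peek(iterable):
    """Yield (peeked, consumed) pairs using a single tee() call."""
    it, peeking = tee(iterable)  # created once, outside the loop
    for peeked in peeking:       # advance the look-ahead copy first...
        consumed = next(it)      # ...then the main copy, in lock-step
        yield peeked, consumed


result = list(pairs_with_peek('abc'))
```

Because both copies move together, tee() never has to buffer more than one item, and no new tee objects are stacked up per iteration.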
On 2013-02-26, at 10:19 , Serhiy Storchaka wrote:
On 25.02.13 22:27, Antoine Pitrou wrote:
On Mon, 25 Feb 2013 22:21:03 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 25.02.13 20:53, Antoine Pitrou wrote:
def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
Only if you restrict yourself to access peeking each time you access it. (which, I suppose, is not the general use case for the lookahead proposal)
Only if you do not want to consume O(N) memory and spend O(N**2) time.
No, that's beside the point. If you don't consume "peeking" in lock-step with "it", then "peeking" and "it" become desynchronized and therefore the semantics are wrong w.r.t. the original feature request (where "peeking" is supposed to be some proxy to "it", not an independently-running iterator).
Yes, of course, you should consume "peeking" in lock-step with "it".
My note is that if you create a tee every iteration, this will lead to a linear increase in memory consumption and degradation of speed.
I believe I also saw situations where I blew through the recursion limit by stacking too many tees, but I can't reproduce it right now.
On 26.02.13 11:23, Masklinn wrote:
On 2013-02-26, at 10:19 , Serhiy Storchaka wrote:
My note is that if you create a tee every iteration, this will lead to a linear increase in memory consumption and degradation of speed.
I believe I also saw situations where I blew through the recursion limit by stacking too many tees, but I can't reproduce it right now.
On 2013-02-26, at 10:44 , Serhiy Storchaka wrote:
On 26.02.13 11:23, Masklinn wrote:
On 2013-02-26, at 10:19 , Serhiy Storchaka wrote:
My note is that if you create a tee every iteration, this will lead to a linear increase in memory consumption and degradation of speed.
I believe I also saw situations where I blew through the recursion limit by stacking too many tees, but I can't reproduce it right now.
No, it was a recursion limit. I was trying to use iterators in a situation where I needed multiline lookaheads and skipping, so I stacked at least 1 tee + 1 islice per item and it blew up at one point. I can't reproduce it anymore and I can't find the original code (replacing it with straight lists ended up being simpler).

Although I can trivially get a segfault with islice:

    from itertools import islice, count

    it = count()
    while True:
        it = islice(it, 0)

Run this, interrupt it with Ctrl-C at some point, and every CPython version on my machine segfaults (pypy does not).
On 26.02.13 12:06, Masklinn wrote:
Although I can trivially get a segfault with islice:
    from itertools import islice, count

    it = count()
    while True:
        it = islice(it, 0)

Run this, C-c at some point, every CPython version on my machine segfaults (pypy does not)
Thank you. This looks like pretty much the same issue as with tee(). I knew that there must be more such catches, but could not find them.

http://bugs.python.org/issue17300
On 2013-02-26, at 11:37 , Serhiy Storchaka wrote:
On 26.02.13 12:06, Masklinn wrote:
Although I can trivially get a segfault with islice:
    from itertools import islice, count

    it = count()
    while True:
        it = islice(it, 0)

Run this, C-c at some point, every CPython version on my machine segfaults (pypy does not)
Thank you. This looks like pretty much the same issue as with tee(). I knew that there must be more such catches, but could not find them.
Cool. And I looked back in the VCS: it turns out the code hasn't been lost, but the issue was not in the stdlib. It was a custom iterator (used as a wrapper for a bunch of operations) which needed to be reapplied very often (code basically went Iterator -> mix of tee, chain and dropwhile -> Iterator -> same mix), essentially doing the following:

    import itertools

    _placeholder = object()

    class It(object):
        def __init__(self, stream):
            self.stream = iter(stream)
            self.stopped = False

        def __iter__(self):
            return self

        def __next__(self):
            if self.stopped:
                raise StopIteration()

            val = next(self.stream, _placeholder)
            if val is _placeholder:
                self.stopped = True
                raise StopIteration()

            return val

    it = itertools.count()
    while True:
        it = It(it)
        next(it)

I'm not sure if there's any way to implement such a wrapping iterator in a way which does not ultimately blow the stack (save in C, taking after the itertools implementations I guess, as they don't seem to have the issue).
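To make the stack growth visible without actually hitting the recursion limit, here is a small instrumented sketch. The `CountingWrapper` class and its counters are hypothetical and exist only for illustration; it shows that one next() call on a stack of N Python-level wrappers nests N calls deep.

```python
import itertools


class CountingWrapper:
    """Hypothetical pass-through iterator that records call nesting."""

    active = 0   # number of __next__ calls currently on the stack
    deepest = 0  # maximum nesting observed so far

    def __init__(self, stream):
        self.stream = iter(stream)

    def __iter__(self):
        return self

    def __next__(self):
        cls = CountingWrapper
        cls.active += 1
        cls.deepest = max(cls.deepest, cls.active)
        try:
            return next(self.stream)  # delegates to the wrapper underneath
        finally:
            cls.active -= 1


it = itertools.count()
for _ in range(50):  # 50 layers, far below the default recursion limit
    it = CountingWrapper(it)

value = next(it)
depth_reached = CountingWrapper.deepest
```

Since the nesting depth equals the number of layers, rewrapping inside an unbounded loop must eventually exceed whatever recursion limit is in force.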
On Tue, 26 Feb 2013 11:19:00 +0200, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 25.02.13 22:27, Antoine Pitrou wrote:
On Mon, 25 Feb 2013 22:21:03 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
On 25.02.13 20:53, Antoine Pitrou wrote:
def test_lookahead():
    it = iter('abc')
    while True:
        it, peeking = itertools.tee(it)
This should be outside a loop.
Only if you restrict yourself to access peeking each time you access it. (which, I suppose, is not the general use case for the lookahead proposal)
Only if you do not want to consume O(N) memory and spend O(N**2) time.
No, that's beside the point. If you don't consume "peeking" in lock-step with "it", then "peeking" and "it" become desynchronized and therefore the semantics are wrong w.r.t. the original feature request (where "peeking" is supposed to be some proxy to "it", not an independently-running iterator).
Yes, of course, you should consume "peeking" in lock-step with "it".
My note is that if you create a tee every iteration, this will lead to a linear increase in memory consumption and degradation of speed.
Apparently, itertools.tee() is optimized for tee'ing a tee:
>>> it = iter("abc")
>>> id(it)
31927760
>>> it, peeking = itertools.tee(it)
>>> id(it)
31977128
>>> it, peeking = itertools.tee(it)
>>> id(it)
31977128
Regards Antoine.
On 02/24/2013 09:41 PM, Terry Reedy wrote:
Does anyone else think this should be added to itertools? It seems to not be completely obvious to everyone, is more complex than some of the existing itertools, and cannot be composed from them either. (Nor can it be written as a generator function.)
Any of the names can be changed. Perhaps the class should be 'peek' and the lookahead object something else. The sentinel should be read-only if possible. I considered whether the peek object should be read-only, but someone would say that they *want* to be able to replace the next object to be yielded. Peeking into an exhausted iterable could raise instead of returning the sentinel, but I don't know if that would be more useful.
----------------
class lookahead():
    "Wrap iterator with lookahead to both peek and test exhausted"

    _NONE = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._set_peek()

    def __iter__(self):
        return self

    def __next__(self):
        ret = self.peek
        self._set_peek()
        return ret

    def _set_peek(self):
        try:
            self.peek = next(self._it)
        except StopIteration:
            self.peek = self._NONE

    def __bool__(self):
        return self.peek is not self._NONE

def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

test_lookahead()

I think with a few small changes I would find it useful.

The key feature here is that the result is precalculated and held until it's needed, rather than calculated when it's asked for.

You should catch any exception and hold that as well. On the next .next() call, it should raise the exception if there was one, or emit the value.

I'm not sure if using the __bool__ attribute is the best choice.

I would prefer a .error flag, along with a .next_value attribute. It would make the code using it easier to follow.

    it.error       <-- True if next(it) will raise an exception.
    it.next_value  <-- The next value, or the exception to raise.

Note that iterating a list of exceptions will still work.

About it.error. If it was a concurrent version, then it.error could have three values.

    it.error == True    # Will raise an exception
    it.error == False   # Will not raise an exception
    it.error == None    # Still calculating

I wonder how this type of generator will behave with "yield from". And if there would be any advantages for writing concurrent (or concurrent acting) code.

Of course you really need to think about whether or not this fits the problem being solved.

Cheers, Ron
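A minimal sketch of the .error / .next_value idea, for the non-concurrent case only. The class name `eager_lookahead` and the helper `_advance` are hypothetical, and catching `Exception` (which includes StopIteration) is one reading of "catch any exception".

```python
class eager_lookahead:
    """Hypothetical variant that precomputes the next result.

    .error is True if next() will raise; .next_value holds either the
    next value or the exception instance that will be raised.
    """

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._advance()

    def _advance(self):
        try:
            self.next_value = next(self._it)
            self.error = False
        except Exception as exc:  # includes StopIteration
            self.next_value = exc
            self.error = True

    def __iter__(self):
        return self

    def __next__(self):
        if self.error:
            raise self.next_value  # held exception, raised only now
        value = self.next_value
        self._advance()
        return value


# Exceptions that are merely *yielded* are ordinary values here,
# because the separate .error flag says whether to raise.
it = eager_lookahead([ValueError('just a value'), 1])
first_is_error = it.error
first = next(it)
rest = list(it)
```

Keeping the flag separate from the stored object is what lets an iterator of exception instances pass through unchanged.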
On Mon, Feb 25, 2013 at 5:29 PM, Ron Adam <ron3200@gmail.com> wrote:
On 02/24/2013 09:41 PM, Terry Reedy wrote:
Does anyone else think this should be added to itertools? It seems to not be completely obvious to everyone, is more complex than some of the existing itertools, and cannot be composed from them either. (Nor can it be written as a generator function.)
Any of the names can be changed. Perhaps the class should be 'peek' and the lookahead object something else. The sentinel should be read-only if possible. I considered whether the peek object should be read-only, but someone would say that they *want* to be able to replace the next object to be yielded. Peeking into an exhausted iterable could raise instead of returning the sentinel, but I don't know if that would be more useful.
----------------
class lookahead():
    "Wrap iterator with lookahead to both peek and test exhausted"

    _NONE = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._set_peek()

    def __iter__(self):
        return self

    def __next__(self):
        ret = self.peek
        self._set_peek()
        return ret

    def _set_peek(self):
        try:
            self.peek = next(self._it)
        except StopIteration:
            self.peek = self._NONE

    def __bool__(self):
        return self.peek is not self._NONE

def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

test_lookahead()
I think with a few small changes I would find it useful.
The key feature here is that the result is pre calculated and held until it's needed, rather than calculated when it's asked for.
It seems like a buggy feature: think of a DB cursor, where next has side effects (like row locks and such).
On 2/25/2013 10:29 AM, Ron Adam wrote:
On 02/24/2013 09:41 PM, Terry Reedy wrote:
class lookahead():
    "Wrap iterator with lookahead to both peek and test exhausted"

    _NONE = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._set_peek()

    def __iter__(self):
        return self

    def __next__(self):
        ret = self.peek
        self._set_peek()
        return ret

    def _set_peek(self):
        try:
            self.peek = next(self._it)
        except StopIteration:
            self.peek = self._NONE

    def __bool__(self):
        return self.peek is not self._NONE

def test_lookahead():
    it = lookahead('abc')
    while it:
        a = it.peek
        b = next(it)
        print('next:', b, '; is peek:', a is b)

test_lookahead()
Since revised.
I think with a few small changes I would find it useful.
The key feature here is that the result is pre calculated and held until it's needed, rather than calculated when it's asked for.
You should catch any exception and hold that as well. On the next .next() call, it should raise the exception if there was one, or emit the value.
My revised version has

    def _set_peek(self):
        try:
            self.peek = next(self._it)
        except StopIteration:
            self.peek = self._NONE

That could easily be changed to

        except Exception as e:
            self.peek = self._NONE
            self._error = e

__next__ would then raise self._error instead of explicitly StopIteration. I do not especially like the redundancy of two 'exhausted' indicators and thought of storing e as self.peek, but an iterator can legitimately yield exception classes and instances.

I think some argument can be made that if the iterator is broken, the exception should be raised immediately even if it means not returning the last item. No user should be expecting anything other than StopIteration.
I'm not sure if using the __bool__ attribute is the best choice.
I am ;-). It hides the test for exhaustion, which could change, without complication.
I would prefer a .error flag, along with a .next_value attribute. It would make the code using it easier to follow.
Not clear to me, but a minor detail.
    it.error       <-- True if next(it) will raise an exception.
    it.next_value  <-- The next value, or the exception to raise.
About it.error. If it was a concurrent version, then it.error could have three values.
    it.error == True    # Will raise an exception
    it.error == False   # Will not raise an exception
    it.error == None    # Still calculating
I wonder how this type of generator will behave with "yield from".
lookaheads are iterators, but not generators and 'yield from' requires a generator. A generator function could recompute items to yield, but it cannot add methods or attributes to the generator instances it will produce.

-- Terry Jan Reedy
On 02/26/2013 03:57 PM, Terry Reedy wrote:
I would prefer a .error flag, along with a .next_value attribute. It would make the code using it easier to follow.
Not clear to me, but a minor detail.
    it.error       <-- True if next(it) will raise an exception.
    it.next_value  <-- The next value, or the exception to raise.
About it.error. If it was a concurrent version, then it.error could have three values.
    it.error == True    # Will raise an exception
    it.error == False   # Will not raise an exception
    it.error == None    # Still calculating
I wonder how this type of generator will behave with "yield from".
lookaheads are iterators, but not generators and 'yield from' requires a generator. A generator function could recompute items to yield, but it cannot add methods or attributes to the generator instances it will produce.
Yep, I missed that point. I was thinking it may be more useful in the case of generators along with yield from, but it isn't as straightforward as the iterator case, as you correctly pointed out.

(More or less just thinking out loud at this point.)

My intuition was that a look-ahead iterator and a calc-ahead generator (if we could manage that) should act the same in as many ways as possible. And a calc-ahead generator should act like a concurrent instance. The reason for thinking that is, it would allow the calc-ahead generator instance to be replaced with a concurrent instance without changing anything.

So if (in theory) a concurrent generator instance wouldn't hold exceptions, then your look-ahead iterator shouldn't either. But if everything happens on the next and send calls, then it makes handling exceptions a bit easier, as you don't have to special-case the instance creation part.

But iterators don't necessarily need to match all generator behaviours. It's just my preference that they do as much as possible. :-)

Of course Guido may have something in what he's doing now that would be very much like this. A kind of "futures" generator. I've been away from these boards for a while and haven't caught up with everything yet.

Cheers, Ron
On 25 February 2013 03:41, Terry Reedy <tjreedy@udel.edu> wrote:
An iterator iter represents the remainder of some collection, concrete or not, finite or not. If the remainder is not empty, its .__next__ method selects one object, mutates iter to represent the reduced remainder (now possibly empty), and returns the one item.
At various times, people have asked for the ability to determine whether an iterator is exhausted, whether a next() call will return or raise. If it will return, people have also asked for the ability to peek at what the return would be, without disturbing what the return will be. For instance, on the 'iterable.__unpack__ method' Alex Stewart today wrote:
The problem is that there is no standard way to query an iterator to find out if it has more data available without automatically consuming that next data element in the process. [SNIP]
At times I thought I wanted the ability to query an iterator without necessarily popping an element. I've generally found that I ended up solving the problem in a different way. My own solution is to have a pushable iterator so that after inspecting a value I can push it back onto the iterator ready for a subsequent next() call. Example code for this looks like:

def _pushable(iterable):
    '''Helper for pushable'''
    iterator = iter(iterable)
    stack = []
    yield lambda x: stack.append(x)
    while True:
        while stack:
            yield stack.pop()
        # Since PEP 479 (Python 3.7), a StopIteration escaping a generator
        # becomes RuntimeError, so end the generator explicitly.
        try:
            item = next(iterator)
        except StopIteration:
            return
        yield item

def pushable(iterable):
    '''Make an iterable pushable.

    >>> iterable, push = pushable(range(9))
    >>> next(iterable)
    0
    >>> next(iterable)
    1
    >>> push(1)  # Push the 1 back on to the iterable
    >>> list(iterable)
    [1, 2, 3, 4, 5, 6, 7, 8]
    '''
    gen = _pushable(iterable)
    push = next(gen)
    return gen, push

Oscar
On Feb 25, 2013, at 04:20 PM, Oscar Benjamin wrote:
At times I thought I wanted the ability to query an iterator without necessarily popping an element. I've generally found that I ended up solving the problem in a different way. My own solution is to have a pushable iterator so that after inspecting a value I can push it back onto the iterator ready for a subsequent next() call.
The email package has one of these in its feedparser module, called BufferedSubFile. It needs to be able to push lines of text back onto the stack that the normal iteration pops off.

Cheers,
-Barry
Barry Warsaw <barry@...> writes:
On Feb 25, 2013, at 04:20 PM, Oscar Benjamin wrote:
At times I thought I wanted the ability to query an iterator without necessarily popping an element. I've generally found that I ended up solving the problem in a different way. My own solution is to have a pushable iterator so that after inspecting a value I can push it back onto the iterator ready for a subsequent next() call.
The email package has one of these in its feedparser module, called BufferedSubFile. It needs to be able to push lines of text back onto the stack that the normal iteration pops off.
It's also a common pattern in lexers/parsers/compilers where you need to push back characters/tokens while doing lookahead for e.g. disambiguation.

Regards,
Vinay Sajip
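As a toy illustration of that lexer pattern, here is a sketch that distinguishes '=' from '==' by reading one character ahead and pushing it back when it does not belong to the token. Both `PushbackIterator` and `tokenize` are hypothetical names invented for this example, not taken from any library.

```python
class PushbackIterator:
    """Hypothetical iterator with a push() method for un-reading items."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._stack = []

    def __iter__(self):
        return self

    def __next__(self):
        if self._stack:
            return self._stack.pop()  # pushed-back items come first
        return next(self._it)

    def push(self, item):
        self._stack.append(item)


def tokenize(text):
    """Split text into '=', '==' and single-character tokens, skipping spaces."""
    chars = PushbackIterator(text)
    tokens = []
    for ch in chars:
        if ch == ' ':
            continue
        if ch == '=':
            following = next(chars, None)  # look one character ahead
            if following == '=':
                tokens.append('==')
                continue
            if following is not None:
                chars.push(following)  # not part of this token: un-read it
        tokens.append(ch)
    return tokens


tokens = tokenize('a == b = c')
```

The push-back keeps the main loop simple: every character still goes through the same dispatch, whether it was read fresh or un-read by the lookahead.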
participants (12)
- Antoine Pitrou
- Barry Warsaw
- Joao S. O. Bueno
- Masklinn
- Oscar Benjamin
- Paul Moore
- Ron Adam
- Serhiy Storchaka
- Terry Reedy
- Vinay Sajip
- Wolfgang Maier
- yoav glazner