
On Sun, 15 Apr 2018 00:05:58 -0500 Tim Peters <tim.peters@gmail.com> wrote:
Just for fun - no complaint, no suggestion, just sharing a bit of code that tickled me.
The docs for `itertools.tee()` contain a Python work-alike, which is easy to follow. It gives each derived generator its own deque, and when a new value is obtained from the original iterator it pushes that value onto each of those deques.
Of course it's possible for them to share a single deque, but the code gets more complicated. Is it possible to make it simpler instead?
What it "really" needs is a shared singly-linked list of values, pointing from oldest value to newest. Then each derived generator can just follow the links, and yield its next result in time independent of the number of derived generators. But how do you know when a new value needs to be obtained from the original iterator, and how do you know when space for an older value can be recycled (because all of the derived generators have yielded it)?
I ended up with almost a full page of code to do that, storing with each value and link a count of the number of derived generators that had yet to yield the value, effectively coding my own reference-count scheme by hand, along with "head" and "tail" pointers to the ends of the linked list that proved remarkably tricky to keep correct in all cases.
Then I thought "this is stupid! Python already does reference counting." Voila! Vast swaths of tedious code vanished, giving this remarkably simple implementation:
This implementation doesn't work with Python 3.7 or 3.8. I've tried it here: https://gist.github.com/pitrou/b3991f638300edb6d06b5be23a4c66d6 and get: Traceback (most recent call last): File "mytee.py", line 14, in gen mylast = last[1] = last = [next(it), None] StopIteration The above exception was the direct cause of the following exception: Traceback (most recent call last): File "mytee.py", line 47, in <module> run(mytee1) File "mytee.py", line 36, in run lists[i].append(next(iters[i])) RuntimeError: generator raised StopIteration (Yuck!) In short, you want the following instead: try: mylast = last[1] = last = [next(it), None] except StopIteration: return
def mytee(xs, n): last = [None, None]
def gen(it, mylast): nonlocal last while True: mylast = mylast[1] if not mylast: mylast = last[1] = last = [next(it), None]
That's smart and obscure :-o The way it works is that the `last` assignment changes the `last` value seen by all derived generators, while the `last[1]` assignment updates the bindings made in the other generators' `mylast` lists... It's difficult to find the words to explain it. The chained assignment makes it more difficult to parse as well (when I read this I don't know if `last[i]` or `last` gets assigned first; apparently the answer is `last[i]`, otherwise the recipe wouldn't work correctly). Perhaps like this: while True: mylast = mylast[1] if not mylast: try: # Create new list link mylast = [next(it), None] except StopIteration: return else: # Append to other generators `mylast` linked lists last[1] = mylast # Update shared list link last = last[1] yield mylast[0] Regards Antoine.