[Python-Dev] Re: Reiterability

Alex Martelli aleaxit at yahoo.com
Sun Oct 19 16:16:44 EDT 2003


On Sunday 19 October 2003 06:30 pm, Guido van Rossum wrote:
   ...
> > I have an iterator it whose items, after an arbitrary prefix terminated
> > by the first empty item, are supposed to be each 'yes' or 'no'.
>
> This is a made-up toy example, right?  Does it correspond with
> something you've had to do in real life?

Yes, but I signed an NDA, and thus made irrelevant changes sufficient
to completely mask the application area &c (how the prefix's end is found,
how the rest of the stream is analyzed to determine how to process it).

> But I'm not sure that abstracting this away all the way to an iterator

Perhaps I over-abstracted it, but I just love abstracting streams as
iterators whenever I can get away with it -- I love the clean, reusable
program structure I often get that way, I love the reusable functions
it promotes.  I guess I'll just build my iterators by suitable factory
functions (including "optimized tee-ability" when feasible), tweak
Raymond's "tee" to use "optimized tee-ability" when supplied, and
tell my clients to build the iterators with my factories if they need
memory-optimal tee-ing.  As long as I can't share that code more
widely, having to use e.g. richiters.iter instead of the built-in iter isn't
too bad, anyway.
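
A rough sketch of that arrangement, in present-day Python rather than
anything posted in this thread. It assumes a hypothetical __clone__ hook
(the name is taken from the quoted text); ListIter and smart_tee are
made-up names for illustration, and itertools.tee stands in for the
generic buffering fallback:

```python
import itertools


class ListIter:
    """Hypothetical factory-built iterator over an in-memory sequence.

    Cloning costs O(1) memory: a clone shares the sequence and only
    copies the current position.
    """

    def __init__(self, seq, pos=0):
        self._seq = seq
        self._pos = pos

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item

    def __clone__(self):
        # Assumed protocol: return an independent iterator at the same position.
        return ListIter(self._seq, self._pos)


def smart_tee(it):
    """Return two independent iterators over the remaining items of `it`.

    Uses the (assumed) __clone__ hook when the iterator supplies one,
    and falls back to itertools.tee, which buffers items in memory.
    """
    clone = getattr(it, "__clone__", None)
    if clone is not None:
        return it, clone()
    return itertools.tee(it)
```

Clients that build their iterators through such a factory get the cheap
path; anything else still works, at the cost of the in-memory buffer.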

> makes sense.  For one, the generic approach to cloning if the iterator
> doesn't have __clone__ would be to make a memory copy, but in this app
> a disk copy is desirable (I can invent something that overflows to

An iterator that knows it's coming from disk or pipe can provide that
disk copy (or reuse the existing file) as part of its "optimized tee-ability".
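
For illustration only, one way the "reuse the existing file" case might
look, assuming a seekable, read-only file opened in binary mode and the
same hypothetical __clone__ hook. FileLineIter is a made-up name; the
caller is assumed to own the file object and close it:

```python
class FileLineIter:
    """Hypothetical line iterator over a seekable, read-only binary file.

    Every clone shares the one file object and remembers its own offset,
    so cloning needs no extra copy of the data.
    """

    def __init__(self, fileobj, offset=0):
        self._file = fileobj
        self._offset = offset

    def __iter__(self):
        return self

    def __next__(self):
        # Seek to this iterator's own position before each read. When only
        # one clone is running, this is always the file's current offset.
        self._file.seek(self._offset)
        line = self._file.readline()
        if not line:
            raise StopIteration
        self._offset = self._file.tell()
        return line

    def __clone__(self):
        # Assumed protocol: an independent iterator starting at the same offset.
        return FileLineIter(self._file, self._offset)
```

Like any iterator, this sketch is not thread-safe: two clones driven from
different threads would interleave their seek and readline calls.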

> offset), or each clone must keep a file offset, but now you lose the
> performance effect of a streaming buffer unless you code up something
> extremely hairy with locks etc.

??? when one clone iterates to the end, on a read-only disk file, its seeks
(which happen always to be to the current offset) don't remove the
benefits of read-ahead done on its behalf by the OS.  Maybe you mean
something else by "lose the performance effect"?

As for locks, why?  An iterator in general is not thread-safe: if two threads
iterate on the same iterator, without providing their own locking, boom.  So
why should clones imply stricter thread-safety?
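
A small sketch of what "providing their own locking" could mean for
threads that share one iterator, whether or not it is a clone.
LockedIter is a hypothetical wrapper name, not part of any library:

```python
import threading


class LockedIter:
    """Hypothetical wrapper that serializes next() calls on one iterator."""

    def __init__(self, it):
        self._it = iter(it)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator;
        # the lock is released even when StopIteration propagates.
        with self._lock:
            return next(self._it)
```

The locking lives with the code that shares the iterator across threads,
so neither the iterator nor its clones need to know about threads at all.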


Alex



