proposal: another file iterator

Jean-Paul Calderone exarkun at
Sun Jan 15 21:20:59 EST 2006

On 15 Jan 2006 16:44:24 -0800, Paul Rubin <""@nospam.invalid> wrote:
>I find pretty often that I want to loop through characters in a file:
>  while True:
>     c = f.read(1)
>     if not c: break
>     ...
>or sometimes of some other blocksize instead of 1.  It would sure
>be easier to say something like:
>   for c in f.iterbytes(): ...
>   for c in f.iterbytes(blocksize): ...
>this isn't anything terribly advanced but just seems like a matter of
>having the built-in types keep up with language features.  The current
>built-in iterator (for line in file: ...) is useful for text files but
>can potentially read strings of unbounded size, so it's inadvisable for
>arbitrary files.
>Does anyone else like this idea?

It's a pretty useful thing to do, but the edge cases are somewhat complex.  When I just want the dumb version, I tend to write this:

    for chunk in iter(lambda: f.read(blocksize), ''):

Which is only very slightly longer than your version.  I would like it even more if iter() had been written with the impending doom of lambda in mind, so that this would work:

    for chunk in iter('', f.read, blocksize):
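In the meantime, the lambda spelling does work today.  A runnable sketch of that pattern (using io.StringIO as a stand-in for a real open file, which is an assumption of this demo):

```python
from io import StringIO  # stand-in for an open file in this demo

def chunks_of(f, blocksize):
    # iter(callable, sentinel): call f.read(blocksize) repeatedly
    # until it returns the sentinel '' (EOF), then stop.
    return iter(lambda: f.read(blocksize), '')

f = StringIO('abcdefghij')
result = list(chunks_of(f, 4))
print(result)  # ['abcd', 'efgh', 'ij']
```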

But it's a bit late now.  Anyhow, here are some questions about your iterbytes():

  * Would it guarantee the chunks returned were read using a single read?  If blocksize were a multiple of the filesystem block size, would it guarantee reads on block-boundaries (where possible)?

  * How would it handle EOF?  Would it stop iterating immediately after the first short read or would it wait for an empty return?

  * What would the buffering behavior be?  Could one interleave calls to .next() on whatever iterbytes() returns with calls to .read() on the file?
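For concreteness on the EOF question, here is a minimal generator sketch -- not the proposed API, just one possible behavior -- that keeps iterating across short reads and stops only on an empty return:

```python
from io import BytesIO  # stand-in for a real binary file in this demo

def iterbytes(f, blocksize=1):
    # Hypothetical sketch: one f.read() per chunk.  A short read
    # (possible on pipes and sockets) does not end iteration;
    # only an empty read, i.e. EOF, does.
    while True:
        chunk = f.read(blocksize)
        if not chunk:
            return
        yield chunk

f = BytesIO(b'hello world')
result = list(iterbytes(f, 4))
print(result)  # [b'hell', b'o wo', b'rld']
```

Note that this version answers the buffering question implicitly: it holds no buffer of its own, so interleaving .next() with direct .read() calls on the file would simply consume the underlying stream in whatever order the calls arrive.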

