Lazy iteration of words in a file (was Re: Code block literals)

Dave Benjamin ramen at lackingtalent.com
Fri Oct 10 18:57:15 EDT 2003


In article <qCFhb.262547$R32.8510844 at news2.tin.it>, Alex Martelli wrote:
> 
> I think that using methods for such things is not a particularly good idea.
> 
> A generator that takes a sequence (typically an iterator) of strings and
> returns as the items the single bytes or words is more general:
> 
> def eachbyte(seq):
>     for s in seq:
>         for c in s:
>             yield c
> 
> def eachword(seq):
>     for s in seq:
>         for w in s.split():
>             yield w
> 
> and now you can loop "for b in eachbyte(file("input.txt")):" etc -- AND you
> have also gained the ability to loop per-byte or per-word on any other
> sequence of strings.  Actually eachbyte is much more general than its
> name suggests -- feed it e.g. a list of files, and it will return the lines 
> of each file -- one after the other -- as a single sequence.

eachbyte is in fact so general that I'd be tempted to call it "iflatten",
though I can never decide whether the name "flatten" belongs to a shallow
flatten or a recursive one. Here's another way to loop through words
lazily, this time using itertools:

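import string
from itertools import imap

def iflatten(seq):
    # Shallow flatten: yield the items of each subsequence in turn.
    for subseq in seq:
        for item in subseq:
            yield item

# imap applies string.split to each line lazily, so the file is read
# one line at a time; iflatten then yields the words one by one.
for word in iflatten(imap(string.split, file('input.txt'))):
    print word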
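
To make the shallow-versus-recursive question concrete, here is a rough
sketch of both, written so that it also runs on Python 3, where imap and
file() no longer exist. The names iflatten_deep and iterwords are made up
for illustration, io.StringIO merely stands in for an open text file, and
treating strings as atoms in the recursive version is just one possible
choice, not the only reasonable one.

```python
import io
from itertools import chain

def iflatten(seq):
    # Shallow flatten: yields the items of each sub-iterable, one level deep.
    for subseq in seq:
        for item in subseq:
            yield item

def iflatten_deep(seq):
    # Recursive flatten (hypothetical name): descends into nested iterables.
    # Strings are treated as atoms so they are not split into characters.
    for item in seq:
        if isinstance(item, str) or not hasattr(item, '__iter__'):
            yield item
        else:
            for sub in iflatten_deep(item):
                yield sub

def iterwords(lines):
    # Lazily yield whitespace-separated words from any iterable of strings.
    # chain.from_iterable does the same shallow flatten as iflatten above.
    return chain.from_iterable(line.split() for line in lines)

# io.StringIO is an assumption standing in for an open text file.
sample = io.StringIO(u"the quick brown\nfox jumps\n")
words = list(iterwords(sample))
```

The shallow version leaves inner nesting alone, so [[1, [2]], [3]] comes
out as 1, [2], 3, while the recursive one keeps descending and gives
1, 2, 3. For the words-in-a-file case only one level is needed, which is
why the shallow flatten is enough there.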

-- 
.:[ dave benjamin (ramenboy) -:- www.ramenfest.com -:- www.3dex.com ]:.
: d r i n k i n g   l i f e   o u t   o f   t h e   c o n t a i n e r :



