xreadlines (was Re: while true: !!!)

Emile van Sebille emile at fenx.com
Fri Dec 15 10:41:24 EST 2000


It's a C module that (correct me if I'm wrong) Jeff posted a couple of 
days ago.  I compiled it in, and can now:

from xreadlines import xreadlines
fp = open(filename)
for line in xreadlines(fp): process(line)

for very large files at speeds that rival normal readlines().

Emile van Sebille
emile at fenx.com

Alex Martelli wrote:

> "Jeff Epler" <jepler at inetnebr.com> wrote in message
> news:slrn93k8sn.lb.jepler at potty.housenet...
> 
>> On Thu, 14 Dec 2000 11:14:38 +0100, Alex Martelli
>>  <aleaxit at yahoo.com> wrote:
>> 
>>> ...if it was able to iterate over already-opened file-like objects, rather
>>> than having to open actual files itself, it WOULD conceptually solve it
>>> all.  I wonder how hard it would be to generalize fileinput in this way --
>>> probably not very (the in-place-rewriting option would be incompatible
>>> with having already-opened file-like objects in the list, of course).
>> 
>> xreadlines takes a file object, not a filename, as its argument.
> 
> 
> Good for it (what IS it?-), but I was talking about the fileinput
> module; I thought it would be nice for the user to generalize the
> functionality (and enhance the speed) of fileinput, rather than
> introducing a completely new and separate module.
> 
> 
> 
>>> If [using readline] was a performance problem, it could of course
>>> also be fixed in a future fileinput version without changing code that
>>> uses it (again, in-place-rewriting would probably have to inhibit the
>>> optimization, although that isn't entirely clear).
>> 
>> xreadlines internally uses fp.readlines(sizehint), so
>> for line in xreadlines.xreadlines(fp): pass
>> is just about as fast as
>> for line in fp.readlines(): pass
>> except that it will not require storage for the whole file.
> 
> 
> Now *THAT* is interesting indeed!  The closest I got was about
> a 2:1 ratio, in the LinesOf class which I posted today -- using
> the same technique.  I wonder what optimizations I missed that
> would allow twice-as-good performance as that!  Do tell, it's
> always nice to learn little (or big) tricks like that...!
> 
> 
> Alex
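
The readlines(sizehint) technique Jeff describes in the quoted text can be sketched in pure Python roughly as follows. This is only an illustration in present-day Python, not the C module's actual code: the name chunked_lines and the default sizehint of 8192 are assumptions made for the example.

```python
import io


def chunked_lines(fp, sizehint=8192):
    """Yield lines from fp, fetching them in batches of about sizehint bytes.

    Hypothetical helper: only one batch of lines is held in memory at a
    time, instead of the whole file as with a bare fp.readlines().
    """
    while True:
        batch = fp.readlines(sizehint)  # stops once the batch reaches ~sizehint
        if not batch:                   # empty list means end of file
            break
        for line in batch:
            yield line


# Usage with an in-memory file-like object; a real file from open() works too.
fp = io.StringIO("alpha\nbeta\ngamma\n")
for line in chunked_lines(fp, sizehint=8):
    print(line, end="")
```

Because the helper only calls readlines() on whatever it is given, it accepts any already-opened file-like object, which is the property discussed for fileinput in the quoted text.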



