bz2.readline() slow?

Soeren Sonnenburg python-ml at nn7.de
Fri Nov 24 11:11:06 CET 2006


Dear all,

I am a bit puzzled, as

-----snip-----
import bz2
f = bz2.BZ2File('data/data.bz2')

# read line by line until EOF, discarding each line
while f.readline():
    pass
-----snip-----

takes twice the time (10 seconds) to read/decode a bz2 file
compared to

-----snip-----
import bz2
f = bz2.BZ2File('data/data.bz2')
# read all lines into a list in a single call
x = f.readlines()
-----snip-----

(5 seconds). This is even stranger, as help(bz2) says:

     |  readlines(...)
     |      readlines([size]) -> list
     |      
     |      Call readline() repeatedly and return a list of lines read.
     |      The optional size argument, if given, is an approximate bound on the
     |      total number of bytes in the lines returned.
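
Taken literally, that docstring describes something like the sketch below. It is only an illustration of the documented behaviour, not the actual bz2 implementation (which is written in C and may work differently); the name readlines_via_readline is made up here.

```python
def readlines_via_readline(f, size=-1):
    """Collect lines by calling f.readline() repeatedly.

    Hypothetical helper: mirrors the docstring wording, not the real
    bz2 internals. 'size' is treated as an approximate byte bound.
    """
    lines = []
    total = 0
    while True:
        line = f.readline()
        if not line:          # empty result signals end of file
            break
        lines.append(line)
        total += len(line)
        if 0 < size <= total:  # stop once the approximate bound is reached
            break
    return lines
```

If readlines() really worked this way, it could not be faster than the plain readline() loop, which is what makes the timings above surprising.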

This happens on Python 2.3 through 2.5, and specifying a maximum line
size does not help.
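
For anyone who wants to reproduce the comparison without my data file, here is a minimal timing sketch. It assumes a generated sample file in place of data/data.bz2; the helper names (make_sample, time_readline, time_readlines, compare) and the sample contents are made up for illustration, and the timings will depend on the Python version and the data.

```python
import bz2
import os
import tempfile
import time


def make_sample(path, n_lines=20000):
    # Write hypothetical test data standing in for data/data.bz2.
    with bz2.BZ2File(path, 'w') as f:
        for i in range(n_lines):
            f.write(('line %d of sample data\n' % i).encode('ascii'))


def time_readline(path):
    # Time the readline() loop; returns (line count, seconds).
    start = time.time()
    count = 0
    with bz2.BZ2File(path) as f:
        while f.readline():
            count += 1
    return count, time.time() - start


def time_readlines(path):
    # Time a single readlines() call; returns (line count, seconds).
    start = time.time()
    with bz2.BZ2File(path) as f:
        lines = f.readlines()
    return len(lines), time.time() - start


def compare(n_lines=20000):
    # Build a temporary sample file, time both approaches, clean up.
    fd, path = tempfile.mkstemp(suffix='.bz2')
    os.close(fd)
    try:
        make_sample(path, n_lines)
        return time_readline(path), time_readlines(path)
    finally:
        os.remove(path)


if __name__ == '__main__':
    (n1, t1), (n2, t2) = compare()
    print('readline loop: %d lines in %.3f s' % (n1, t1))
    print('readlines():   %d lines in %.3f s' % (n2, t2))
```

Both functions should report the same line count, so any difference in the timings comes from how the lines are read rather than from the amount of data.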

Any ideas?
Soeren


